3.2 General Linear Model
3.2.2 Assumptions
3.2.2.1 Homoskedasticity
For the general linear model, the assumption of normally distributed residuals in Equation (3.5) with E(ε) = 0 and Var(ε_i) = σ² is often added to derive properties of the estimated coefficients (see, e. g., Steyer, 2003):
\[
\varepsilon \sim N(0, \sigma^{2} I) \tag{3.7}
\]
The diagonal elements of the N × N covariance matrix Σ_εε = σ²I are assumed to be equal, which is known as the assumption of homoskedasticity of the error variances. All off-diagonal elements in Σ_εε are assumed to be equal to zero, i. e., the residuals are assumed to be uncorrelated. As described in detail in section 3.1.2, the homoskedasticity assumption is likely to be violated for applications of the general linear model for inferences about the estimated average total effect, because we expect heterogeneity of between-group residual variances as a consequence of individual treatment effects.
A considerable body of literature deals with the robustness of the general linear model against heteroskedasticity, and misleading interpretations of the assumption of homogeneity of residual variance can be found in the literature (for a summary, see Bobko & Russell, 1990). It is known that even in the presence of heteroskedasticity, estimates of the regression coefficients obtained by ordinary least-squares remain unbiased (as they are derived without assumptions about the variances of the residuals ε_i; see, for example, Hayes & Cai, 2007). Furthermore, the F-test used in regression analysis and analysis of variance is known to be robust against heterogeneity of residual variance and non-normality, in particular when group sizes are equal (see, e. g., Berry, 1993, p. 78). Violation of homoskedasticity is critical with respect to the standard errors of estimated regression coefficients when group sizes are unequal, a common situation in the analysis of unbalanced quasi-experimental designs (see, e. g., Ito, 1980).⁸ Furthermore, there is a known effect of heteroskedasticity on the statistical power of detecting unequal slopes in analysis of covariance (see, for example, Alexander & DeShon, 1994). This effect also applies to the statistical power of the detection of moderation effects (e. g., Aguinis, Petersen, & Pierce, 1999).
⁸ Some authors suggest avoiding unbalanced designs by throwing out data (see, for instance, Scheiner & Gurevitch, 2001, p. 119).
We have identified three important approaches to account for heteroskedasticity: weighted least-squares, the transformation of variables, and the calculation of robust standard errors. To correct the standard errors of ordinary least-squares estimates for heteroskedasticity, alternative estimation methods like generalized and weighted least-squares have been suggested. By the same derivation we used to demonstrate that the theory of stochastic causality implies heterogeneity of residual variances (see above), the structure of the variance-covariance matrix of residuals can be predicted and taken into account in the estimation process (see, for example, Cox & McCullagh, 1982). Weighted least-squares can be used to obtain an estimator which gives unbiased standard errors by specifying the structure of the residual matrix in terms of weights (see, e. g., Kutner et al., 2005, ch. 11, or S. Weisberg, 2005, for an introduction to weighted least-squares; Wilcox & Keselman, 2004, for a general summary of robust methods dealing with heterogeneity of residual variances; and Cai & Hayes, 2008, for an up-to-date overview of how to handle heteroskedasticity of an unknown form). We do not consider weighted least-squares within this thesis because we will focus on structural equation models under a multivariate normality assumption. Some authors suggest the transformation of the variables to handle heteroskedasticity of residual variances (e. g., Carroll & Ruppert, 1988). We do not follow this approach either because, as Long and Ervin (2000) point out:
“If there are theoretical reasons to believe that errors are heteroscedastic around the correct functional form, transforming the dependent variable is inappropriate.”
Furthermore, it should at least be mentioned that many diagnostic techniques for the detection of heteroskedasticity are discussed in the literature (see Darken, 2004, for an overview, and, e. g., D. R. Cook & Weisberg, 1983, for a test statistic based on the score statistic).
Liang and Zeger (1986) suggest the application of robust standard errors based on White (1980a) to draw valid inferences about the estimated parameters, even if the assumption of homoskedasticity is not fulfilled. In order to make the ordinary least-squares estimators (and the general linear model) more robust to the violation of assumptions, a plethora of different corrections has been developed. Two of them are described in more detail in the following paragraphs: heteroskedasticity consistent estimators and adjusted standard errors for regression estimates.
Heteroskedasticity Consistent Estimators
Standard errors for regression coefficients obtained from the general linear model are expected to be biased due to heterogeneity of residual variance (heteroskedasticity) for unequal group sizes. A very general approach to deal with this bias is the application of robust standard errors (developed by White, 1980a). These so-called sandwich estimators are very popular in economics (see, e. g., Kleiber & Zeileis, 2008) and are nowadays implemented, for example, in various R packages (e. g., Zeileis, 2006), or as additional macro code (see, for example, Hayes & Cai, 2007). A detailed description of White’s heteroskedasticity consistent estimator can be found, e. g., in Greene (2007, ch. 10). The procedures are based on heteroskedasticity consistent covariance matrix (HCCM) estimation; different versions of the correction exist (see J. G. MacKinnon & White, 1985, and Zeileis, 2004, for a summary of their properties and their implementation). In a Monte-Carlo simulation, Long and Ervin (2000) found that the HC3 estimator should be applied to small samples (N ≤ 250). As Zeileis (2004) points out, the HC4 estimator recently suggested by Cribari-Neto (2004) further improves the small sample performance, especially in the presence of influential observations. Because of the general nature of the sandwich estimator, the correction can be applied to the centering approach as well as to the general linear hypothesis. In line with Zeileis (2004), we will study the performance of HC3 and HC4 in the simulation study presented in chapter 4.
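To make the difference between the model-based and the sandwich-type standard errors concrete, the following minimal sketch computes the classical, HC0, HC3 and HC4 variance estimates for the slope of a simple regression on a treatment indicator. It is an illustration only and not the implementation used in this thesis: the function name slope_variances and the small data set are hypothetical, and the closed-form expressions are valid only for a model with an intercept and a single regressor. The HC3 weights divide the squared residuals by (1 − h_i)², the HC4 weights by (1 − h_i) raised to the power min(4, N·h_i/p), with h_i denoting the leverage of observation i and p the number of coefficients.

```python
def slope_variances(x, y):
    """Slope estimate and variance estimates for y = b0 + b1 * x.

    Returns (b1, variances), where variances maps 'classical', 'HC0',
    'HC3' and 'HC4' to the estimated variance of b1. Assumes that no
    observation has a leverage of exactly 1.
    """
    n = len(x)
    p = 2  # number of coefficients: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # Leverage of observation i in a simple regression with intercept.
    lev = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]

    def sandwich(omega):
        # Slope element of (X'X)^-1 X' diag(omega) X (X'X)^-1.
        return sum((xi - x_bar) ** 2 * w for xi, w in zip(x, omega)) / sxx ** 2

    sigma2 = sum(e ** 2 for e in resid) / (n - p)
    variances = {
        "classical": sigma2 / sxx,
        "HC0": sandwich([e ** 2 for e in resid]),
        "HC3": sandwich([e ** 2 / (1 - h) ** 2 for e, h in zip(resid, lev)]),
        "HC4": sandwich([e ** 2 / (1 - h) ** min(4.0, n * h / p)
                         for e, h in zip(resid, lev)]),
    }
    return b1, variances


# Hypothetical unbalanced data: the small group has the larger variance.
x = [0] * 8 + [1] * 3
y = [1, 2, 1, 2, 1, 2, 1, 2, 0, 5, 10]
b1, variances = slope_variances(x, y)
for name, value in variances.items():
    print(name, round(value ** 0.5, 3))  # standard errors of the slope
```

In this constructed example the smaller group has the larger residual variance, so the model-based standard error is smaller than each of the heteroskedasticity consistent versions.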
Standard Error for Regression Estimate
A very promising approach for the estimation of the average total effect based on predicted values was described by Schafer and Kang (2008). We have already described this approach in subsection 2.2.4. For the simple linear regression with one covariate, this approach is algebraically equivalent to the sample estimator of the expectation of the effect function (for a group-specific specification of the covariate-treatment regression):
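\[
\begin{aligned}
\widehat{ATE}_{10} &= \frac{1}{N}\sum_{i}\left[\hat{y}_{i1} - \hat{y}_{i0}\right] \\
&= \frac{1}{N}\sum_{i}\left[\left(\hat{\beta}_{10} + \hat{\beta}_{11}\cdot z_i\right) - \left(\hat{\beta}_{00} + \hat{\beta}_{01}\cdot z_i\right)\right] \\
&= \frac{1}{N}\sum_{i}\left[\left(\hat{\beta}_{10} - \hat{\beta}_{00}\right) + \left(\hat{\beta}_{11} - \hat{\beta}_{01}\right) z_i\right] \\
&= \left(\hat{\beta}_{10} - \hat{\beta}_{00}\right) + \left(\hat{\beta}_{11} - \hat{\beta}_{01}\right)\hat{\mu}_Z .
\end{aligned}
\tag{3.8}
\]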
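The equivalence stated in Equation (3.8) can be checked numerically. The following sketch is an illustration under assumed inputs, not part of the original analyses: the helper names fit_line and ate_two_ways and the seven data points are hypothetical. It fits the group-specific regressions of the outcome on the covariate, averages the differences of the predicted values over all units, and compares the result with the closed form in the last line of Equation (3.8).

```python
def fit_line(z, y):
    """Ordinary least-squares intercept and slope of y on z."""
    n = len(z)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    szz = sum((zi - z_bar) ** 2 for zi in z)
    slope = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y)) / szz
    return y_bar - slope * z_bar, slope


def ate_two_ways(x, z, y):
    """Return (average predicted difference, closed form) from Eq. (3.8)."""
    z0 = [zi for xi, zi in zip(x, z) if xi == 0]
    y0 = [yi for xi, yi in zip(x, y) if xi == 0]
    z1 = [zi for xi, zi in zip(x, z) if xi == 1]
    y1 = [yi for xi, yi in zip(x, y) if xi == 1]
    b00, b01 = fit_line(z0, y0)  # control group regression
    b10, b11 = fit_line(z1, y1)  # treatment group regression
    n = len(z)
    # First line of Eq. (3.8): mean difference of the predicted values.
    avg_pred_diff = sum((b10 + b11 * zi) - (b00 + b01 * zi) for zi in z) / n
    # Last line of Eq. (3.8): coefficients and the sample mean of Z.
    closed_form = (b10 - b00) + (b11 - b01) * (sum(z) / n)
    return avg_pred_diff, closed_form


# Hypothetical data for illustration only.
x = [0, 0, 0, 0, 1, 1, 1]
z = [1.0, 2.0, 3.0, 4.0, 2.0, 3.0, 5.0]
y = [2.0, 2.9, 4.1, 5.0, 6.1, 8.0, 11.9]
avg_pred_diff, closed_form = ate_two_ways(x, z, y)
print(avg_pred_diff, closed_form)
```

Both numbers agree up to floating-point error, because the predicted difference is linear in z_i and its average therefore depends on the covariate only through the sample mean.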
Schafer and Kang (2008, p. 293) provide formulas for standard errors that do not assume correctly specified implied variances for the outcome model, and should therefore be robust with respect to heterogeneity of residual variances: “Our standard errors are robust to misspecification of mean-variance relationships, whereas the so-called model-based standard errors typically provided by linear regression software are not.”
Hence, we mention this approach in this subsection again. For the derivation of the standard errors for the regression estimate see Schafer and Kang (2008, p. 311).
We conclude that the general linear model without an additional adjustment for heteroskedasticity is not suitable for statistical inference about the average total effect when groups are of unequal size (see also Hartenstein, 2005). For equal group sizes, we expect the methods based on the general linear model to be robust against heteroskedasticity, at least for conditions without covariate-treatment interactions.
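The conclusion above can be illustrated with a small Monte-Carlo sketch. The code is an assumption-laden illustration and not the simulation study reported in chapter 4: the group sizes, standard deviations, number of replications, seed and the function name rejection_rate are all hypothetical choices. Data are generated without any treatment effect, and the model-based test of the coefficient of a treatment indicator (which is equivalent to the pooled-variance t-test) is applied once to a balanced and once to an unbalanced design with heterogeneous residual variances.

```python
import math
import random

# Approximate two-sided 5% critical value of the t-distribution with 48
# degrees of freedom (both designs below have N = 50 and two parameters).
T_CRIT = 2.0106


def rejection_rate(n0, n1, sd0, sd1, reps=2000, seed=1):
    """Empirical type-I-error rate of the model-based test of b1 = 0.

    The true group means are equal, so every rejection is a false alarm.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        y0 = [rng.gauss(0.0, sd0) for _ in range(n0)]
        y1 = [rng.gauss(0.0, sd1) for _ in range(n1)]
        m0 = sum(y0) / n0
        m1 = sum(y1) / n1
        sse = sum((v - m0) ** 2 for v in y0) + sum((v - m1) ** 2 for v in y1)
        sigma2 = sse / (n0 + n1 - 2)  # pooled residual variance
        se = math.sqrt(sigma2 * (1.0 / n0 + 1.0 / n1))
        if abs((m1 - m0) / se) > T_CRIT:
            rejections += 1
    return rejections / reps


balanced = rejection_rate(25, 25, sd0=1.0, sd1=3.0)
unbalanced = rejection_rate(40, 10, sd0=1.0, sd1=3.0)
print("balanced:", balanced, "unbalanced:", unbalanced)
```

Under these assumed conditions the balanced design stays close to the nominal level, whereas the unbalanced design, in which the smaller group has the larger variance, rejects far too often.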
3.2.2.2 Fixed Regressors
Within the general linear model, the values of the design matrix are assumed to be fixed and non-stochastic quantities. This conflicts with the basic requirement for the implementation of a generalized analysis of covariance presented in section 3.1.3. If we consider the simple covariate-treatment regression again, for a univariate covariate Z with a linear parameterization of the intercept function and the effect function [see Equation (1.29) on page 17], we can write the interesting part of the design matrix x = (1, X_fixed) out as
\[
X_{\text{fixed}} =
\begin{pmatrix}
x_1 & \dots & x_N \\
z_1 & \dots & z_N \\
x_1 \cdot z_1 & \dots & x_N \cdot z_N
\end{pmatrix}^{T},
\tag{3.9}
\]
with x_i and z_i as known constants (fixed values of the regressors), and x_i · z_i as the simple products necessary to obtain a regression coefficient for the interaction term. Hence, with respect to the observed random variables and in contrast to the theory of stochastic causality, only the elements of the vector y of the general linear model y = xβ + ε are assumed to change with repeated sampling.
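As a minimal sketch of the structure in Equation (3.9), the following code builds the rows of the design matrix from given treatment and covariate values. The function names fixed_part and design_matrix and the example values are hypothetical and serve only as an illustration.

```python
def fixed_part(x, z):
    """Rows (x_i, z_i, x_i * z_i) of X_fixed as in Equation (3.9)."""
    return [(xi, zi, xi * zi) for xi, zi in zip(x, z)]


def design_matrix(x, z):
    """Rows of the full design matrix (1, X_fixed) with a leading constant."""
    return [(1.0,) + row for row in fixed_part(x, z)]


# Hypothetical values of the treatment indicator and the covariate.
x = [0, 1, 1]
z = [2.0, 3.0, 0.5]
for row in design_matrix(x, z):
    print(row)
```

Under the fixed-X assumption these rows would be identical in every repeated sample; only the outcome vector y would change.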
Different arguments in support of this so-called fixed-X assumption for the traditional ANCOVA were summarized in section 3.1.3. Neter, Wasserman, and Kutner (1983, p. 83 f.) describe an interplay between the stochasticity of regressors and the homoskedasticity assumption mentioned above. They argue that as long as the conditional distributions of y_i given the regressors x_i are normal and independent with variance σ², and as long as the x_i are independent random variables whose probability distribution does not involve the regression parameters, the randomness of the regressors can be ignored (see also Ryan, 1996, p. 34).
Flory (2004) presented results of a simulation study for a test of the hypothesis ATE_10 = 0 within the general linear model for data generated with homogeneous residual variances, equal group sizes and different interaction effects. For strong interaction effects he found a heavily inflated type-I-error rate for a sample size of N = 1000. Hence, at least according to the results of this simulation study, the robustness of the linear regression model against violations of the fixed-X assumption does not hold for regression models with covariate-treatment interactions.
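The mechanism behind such an inflation can be sketched with a small simulation. The code below is not a replication of Flory (2004): the sample size, effect sizes, number of replications, seed and function names are hypothetical assumptions. The covariate is drawn anew in every replication (stochastic regressor), the true average total effect is zero, and the hypothesis is tested with a standard error that treats the sample mean of the covariate as a fixed constant, as the general linear model does.

```python
import math
import random


def fit_group(z, y):
    """OLS of y on z: intercept, slope, residual sum of squares, mean(z), Szz."""
    n = len(z)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    szz = sum((zi - z_bar) ** 2 for zi in z)
    slope = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y)) / szz
    intercept = y_bar - slope * z_bar
    sse = sum((yi - intercept - slope * zi) ** 2 for zi, yi in zip(z, y))
    return intercept, slope, sse, z_bar, szz


def type1_rate(gamma1, n_per_group=100, reps=1000, seed=7, crit=1.972):
    """Rejection rate of the fixed-X test of ATE = 0 when the true ATE is 0.

    gamma1 is the covariate-treatment interaction; crit approximates the
    two-sided 5% critical value of t with 196 degrees of freedom.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        fits, z_all = [], []
        for x in (0, 1):
            z = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]  # E(Z) = 0
            y = [zi + x * gamma1 * zi + rng.gauss(0.0, 1.0) for zi in z]
            fits.append(fit_group(z, y))
            z_all.extend(z)
        (a0, b0, sse0, zb0, szz0), (a1, b1, sse1, zb1, szz1) = fits
        z_bar = sum(z_all) / len(z_all)
        ate_hat = (a1 - a0) + (b1 - b0) * z_bar
        sigma2 = (sse0 + sse1) / (2 * n_per_group - 4)  # pooled residual variance
        # Variance conditional on the observed covariate values (fixed X).
        var_fixed = sigma2 * (1.0 / n_per_group + (z_bar - zb0) ** 2 / szz0
                              + 1.0 / n_per_group + (z_bar - zb1) ** 2 / szz1)
        if abs(ate_hat) / math.sqrt(var_fixed) > crit:
            rejections += 1
    return rejections / reps


rate_no_interaction = type1_rate(gamma1=0.0)
rate_interaction = type1_rate(gamma1=2.0)
print(rate_no_interaction, rate_interaction)
```

Without an interaction the test keeps roughly the nominal level in this sketch. With an interaction the estimate additionally varies with the sample mean of the covariate, a source of variance that the fixed-X standard error ignores, and the rejection rate rises clearly above 5%.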
Although Henderson (1982) presents an approach to the analysis of covariance within the framework of mixed models, and Milliken and Johnson (2002) describe a random effects model with covariates considered as random within the framework of the generalized linear mixed model (GLMM), we are not aware of further approaches⁹ dealing with the robustness of the general linear model with respect to violations of the fixed-X assumption.