
CHAPTER CONCEPTS

SEM Modeling Steps
    Model specification
    Model identification
    Model estimation
    Model testing
    Model modification
Model-fit Criteria
    Saturated models and independence models
    Measurement versus structural models
Types of Fit Criteria
    Model fit
    Model comparison
    Model parsimony
    Parameter fit
Hypothesis Testing
    Power
    Sample size
    Model comparison
    Effect size
Parameter Significance
Two-Step Versus Four-Step Approach to Modeling

SEM Basics

Structural equation modeling (SEM) tests whether a sample variance–covariance matrix is similar to the variance–covariance matrix implied by a theoretical model. The theoretical model produces a variance–covariance matrix based on the paths specified in the model (recall path models). The chi-square statistic tests whether the two matrices are similar or different. A significant chi-square value indicates that the two matrices are different, so the sample data do not support the theoretical model, whereas a non-significant chi-square value indicates that the theoretical model produces a matrix similar to the sample data matrix. This has been referred to as a “badness of fit” test because we seek a non-significant chi-square value.

There are several approaches to testing whether a sample variance–covariance matrix is supported by a theoretical model. In the first, confirmatory approach, a CFA measurement model could be tested to determine whether the variables share common variance in defining a latent variable; the researcher is attempting to establish the key variables that relate to a construct. The theoretical model is either confirmed or disconfirmed, based on the chi-square test of statistical significance and/or on meeting acceptable model-fit criteria. A second approach specifies alternative models, in which the researcher creates different theoretical models for the same data set. When one model can be obtained by constraining parameters of another, the models are nested, and a chi-square difference test is conducted to compare them. In this situation, a significant chi-square difference indicates that the two models being compared differ in fit. A third approach is model generating: an initial theoretical model is specified, but when the data do not fit this initial model, modification indices are used to add or delete paths to arrive at a final best-fitting model. The goal in model generating is to find a model that fits the data well and has practical and substantive theoretical meaning. The process of finding the best-fitting model is also referred to as a specification search, implying that if an initially specified model does not fit the data, the model is modified in an effort to improve the fit (Marcoulides & Drezner, 2001, 2003). Advances in Tabu search algorithms have permitted the generation of a set of models that fit the data equally well, with the final determination of which model to accept left to the researcher (Marcoulides, Drezner, & Schumacker, 1998). AMOS includes a specification search procedure to find the best-fitting model.
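The chi-square difference test for two nested models can be sketched in a few lines of Python. This is a minimal illustration, not output from any SEM program: the function names and the chi-square and df values are hypothetical, and the small table holds the standard .05 critical values of the chi-square distribution for 1 to 5 degrees of freedom.

```python
# Standard .05 critical values of the chi-square distribution (df 1 to 5).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def chi_square_difference(chi2_restricted, df_restricted, chi2_full, df_full):
    """Return (delta chi-square, delta df) for two nested models.

    The restricted model estimates fewer parameters, so it has more
    degrees of freedom and a chi-square at least as large as the full model.
    """
    delta_chi2 = chi2_restricted - chi2_full
    delta_df = df_restricted - df_full
    if delta_df <= 0 or delta_chi2 < 0:
        raise ValueError("models do not look nested in the expected order")
    return delta_chi2, delta_df


def models_differ(chi2_restricted, df_restricted, chi2_full, df_full):
    """True when the chi-square difference is significant at the .05 level."""
    delta_chi2, delta_df = chi_square_difference(
        chi2_restricted, df_restricted, chi2_full, df_full
    )
    if delta_df not in CRITICAL_05:
        raise ValueError("no tabled critical value for this delta df")
    return delta_chi2 > CRITICAL_05[delta_df]


# Hypothetical fit results for a restricted model and a fuller model.
delta_chi2, delta_df = chi_square_difference(45.2, 12, 38.1, 10)
print(f"delta chi-square = {delta_chi2:.2f} on {delta_df} df")
print("models differ at .05:", models_differ(45.2, 12, 38.1, 10))
```

A significant difference favors the fuller model; a non-significant difference suggests the more parsimonious restricted model fits about as well.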

SEM MODELING STEPS

The steps a researcher takes in conducting SEM involve model specification, model identification, model estimation, model testing, and model modification.

A Beginner’s Guide to Structural Equation Modeling

SEM combines measurement models (CFA) with structural models (path models) using latent variables. Once the measurement models for both the latent independent and the latent dependent variables yield a good data-to-model fit, the relations amongst the latent variables are tested in the structural model. A researcher spends most of their time selecting observed variables as indicators of latent variables, and thus testing the CFA models. The structural model tests the parameter estimates in the structural equations for statistical significance. The SEM modeling steps generally occur as outlined in the following five basic steps:

Model specification: A measurement model and/or a structural model is specified based on prior research and theory. This comprises the review of literature which substantiates selection of observed variables as indicators of latent variables, and the theory behind testing the relations amongst the latent variables in a structural model.

Model identification: A model is over-identified when the degrees of freedom are equal to or greater than 1. A df = 0 indicates a saturated model, in which as many parameters are estimated as there are distinct values in the covariance matrix; this is also called a just-identified model. An under-identified model has negative degrees of freedom because more parameters are being estimated than there are distinct values in the covariance matrix. We are interested in an over-identified model, which specifies fewer paths or variable relations, yet yields a model-implied (reproduced) covariance matrix that is close to the sample covariance matrix. The order and rank conditions are additional matrix criteria that indicate whether a model is identified, which implies that its parameters can be estimated.

Model estimation: A hypothesized theoretical model can have its parameters estimated using several different estimation methods. The unweighted least squares estimation method works well when the assumptions of the Pearson correlation coefficient are met, that is, when the normal distribution assumption and other parametric assumptions hold. Other estimation methods, such as maximum likelihood, were developed to handle data that do not meet the parametric assumptions yet still yield robust estimates of parameters. A key issue in estimating model parameters is the associated standard error. If the standard error is biased or inflated, then the test of statistical significance of a parameter is affected. A model parameter is tested for statistical significance by dividing the parameter estimate by its standard error.

Model testing: A model is tested for fit based on the non-significance of the chi-square statistic and on other, subjective fit indices. A non-significant chi-square indicates that the sample variance–covariance matrix and the model-implied variance–covariance matrix are similar. This implies that the model is a good representation of the relations amongst the observed variables, that is, their variances and covariances.

Model modification: When hypothesis testing, a model may not fit the data. A researcher is guided by residual values in the residual matrix, modification indices, or theory to make changes. It is not recommended that the structural model be changed by adding or deleting a path unless additional theory substantiates the structural model modification. Generally, the measurement model will require adding an error covariance term between observed variables, which is sufficient to provide a better data-to-model fit. When an error covariance term is added, justification should be provided, which usually includes similar observed variables on a factor, the same measurement scale, or similar instrumentation. It is also plausible that the model simply does not fit the data, which implies that the theory is not supported or perhaps that another random sample of data will work better.
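The degrees-of-freedom rule described under model identification can be sketched as a simple count. The code below is an illustrative assumption rather than a procedure given in the text: it uses the fact that a covariance matrix for p observed variables has p(p + 1)/2 distinct values, and the function names and the example counts are hypothetical. A non-negative df is a necessary condition only; the order and rank conditions still have to hold.

```python
def degrees_of_freedom(n_observed, n_free_parameters):
    """df = distinct values in the covariance matrix minus free parameters."""
    distinct_values = n_observed * (n_observed + 1) // 2
    return distinct_values - n_free_parameters


def identification_status(df):
    """Classify a model by the sign of its degrees of freedom."""
    if df < 0:
        return "under-identified"
    if df == 0:
        return "just-identified (saturated)"
    return "over-identified"


# Hypothetical example: 6 observed variables and 13 free parameters.
df = degrees_of_freedom(n_observed=6, n_free_parameters=13)
print(df, identification_status(df))
```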

MODEL-FIT CRITERIA

Structural equation modeling (SEM) tests a hypothesized theoretical model, which has its basis in testing theory. Theory is defined as relations amongst constructs. Constructs are first established by testing confirmatory factor models, which create latent variables. The latent variables are then used in a structural equation model, which forms the basis of the theory.

A researcher typically uses the following three criteria in judging the statistical significance and substantive meaning of a theoretical model:

1. The first criterion is the statistical non-significance of the chi-square test, which is considered a global fit measure. A chi-square value that is not statistically significant indicates that the sample covariance matrix and the reproduced model-implied covariance matrix are similar.

2. The second criterion is the statistical significance of individual parameter estimates for the paths in the model, tested with values computed by dividing the parameter estimates by their respective standard errors. This ratio is referred to as a t or z value, and is typically compared to a tabled t or z value of 1.96 at the .05 level of significance.

3. The third criterion is the magnitude and direction of the parameter estimates, paying particular attention to whether a positive or negative coefficient makes sense for the parameter estimate. For example, it would not be theoretically meaningful to have a negative parameter estimate for the relation between number of hours spent studying and grade point average.
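The second criterion can be illustrated with a short Python sketch. The function names and the estimate and standard error values are hypothetical; the 1.96 cutoff is the two-tailed .05 critical value mentioned above.

```python
def z_value(estimate, standard_error):
    """Ratio of a parameter estimate to its standard error."""
    if standard_error <= 0:
        raise ValueError("standard error must be positive")
    return estimate / standard_error


def is_significant(estimate, standard_error, critical_value=1.96):
    """Two-tailed test: compare the absolute z value to the critical value."""
    return abs(z_value(estimate, standard_error)) > critical_value


# Hypothetical path coefficient and standard error.
print(z_value(0.50, 0.20), is_significant(0.50, 0.20))
```

The sign of the estimate does not affect the significance test, so direction and magnitude still have to be judged against theory, as the third criterion requires.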

There are numerous subjective model-fit criteria for assessing model fit.

Determining model fit is complicated because several model-fit criteria have been developed to assist in interpreting structural equation models under different model-building assumptions. In addition, the determination of model fit in structural equation modeling is not as straightforward as it is in other multivariable statistical procedures, such as the analysis of variance, multiple regression, discriminant analysis, multivariate analysis of variance, and canonical correlation analysis. These multivariable methods use observed variables that are assumed to be measured without error and have statistical tests with known distributions. Many SEM model-fit indices have no single statistical test of significance that identifies a correct model, given the sample data, especially as equivalent models or alternative models can exist that yield exactly the same data-to-model fit (Hershberger & Marcoulides, 2013).

Chi-square (χ²) is considered the only statistical test of significance for testing the theoretical model. The chi-square value ranges from zero for a saturated model with all paths included to a maximum value for the independence model with no paths included. Your theoretical model chi-square value will be somewhere between these two extreme values. This can be visualized as follows:

Saturated model (all paths in model): χ² = 0
Independence model (no paths in model): χ² = maximum value

A chi-square value of zero indicates a perfect fit, that is, no difference between the values in the sample covariance matrix (S) and the model-implied covariance matrix (Σ) created from the specified theoretical model. A theoretical model with all paths specified is of limited interest because it is always a saturated model (recall regression equations). The goal in structural equation modeling is to achieve a parsimonious model with a few substantively meaningful paths and a non-significant chi-square value close to the saturated model value of zero, indicating little difference between the sample covariance matrix and the model-implied covariance matrix. The difference between these two covariance matrices is output in a residual matrix. When the chi-square value is non-significant (close to zero), the values in the residual matrix are also close to zero, indicating that the sample data fit the theoretical model.
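The relation between the residual matrix and the chi-square value can be sketched in plain Python. The sketch assumes maximum likelihood estimation, where the chi-square statistic is commonly computed as (N − 1) times the ML discrepancy function F = ln|Σ| + tr(SΣ⁻¹) − ln|S| − p; that formula is a standard result and is not given in the text above. The function names and both example matrices are hypothetical.

```python
import math


def invert_with_det(matrix):
    """Gauss-Jordan elimination with partial pivoting: (inverse, determinant)."""
    n = len(matrix)
    # Augment a float copy of the matrix with the identity matrix.
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det  # a row swap flips the sign of the determinant
        pivot_value = aug[col][col]
        det *= pivot_value
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug], det


def residual_matrix(S, Sigma):
    """Element-wise difference between sample and model-implied matrices."""
    return [[s - m for s, m in zip(s_row, m_row)] for s_row, m_row in zip(S, Sigma)]


def ml_chi_square(S, Sigma, n):
    """Chi-square as (N - 1) times the ML discrepancy function (assumed form)."""
    p = len(S)
    sigma_inv, det_sigma = invert_with_det(Sigma)
    _, det_s = invert_with_det(S)
    trace = sum(S[i][k] * sigma_inv[k][i] for i in range(p) for k in range(p))
    f_ml = math.log(det_sigma) + trace - math.log(det_s) - p
    return (n - 1) * f_ml


# Hypothetical sample matrix S and model-implied matrix Sigma.
S = [[1.00, 0.50, 0.40], [0.50, 1.00, 0.30], [0.40, 0.30, 1.00]]
Sigma = [[1.00, 0.45, 0.42], [0.45, 1.00, 0.33], [0.42, 0.33, 1.00]]
print(residual_matrix(S, Sigma))
print(ml_chi_square(S, Sigma, n=200))
```

When Σ reproduces S exactly, every residual is zero and the discrepancy function, and therefore the chi-square value, is zero.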

Many of the model-fit criteria are computed based on knowledge of the saturated model, independence model, sample size, degrees of freedom, and/or the chi-square values, which are used to formulate an index of model fit that ranges in value from 0 (no fit) to 1 (perfect fit). These various model-fit indices, however, are subjectively interpreted when determining an acceptable model fit. Some researchers have suggested that a structural equation model with a model-fit value of .95 or higher is acceptable (Baldwin, 1989; Bentler & Bonett, 1980), while others have suggested judging fit by a non-centrality parameter close to zero (Browne & Cudeck, 1993; Steiger, 1990) or by the root-mean-square error of approximation (Steiger, 1990).

Consequently, the various structural equation modeling programs report a variety of model-fit criteria. We distinguish the model-fit criteria according to whether they assess model fit, model comparison, or model parsimony as global fit measures (Hair, Anderson, Tatham, & Black, 1992).

Many of the subjective fit indices are computed given knowledge of the null model χ² (independence model, where the covariance terms are assumed to be zero in the model), null model df, hypothesized model χ², hypothesized model df, number of observed variables in the model, number of free parameters in the model, and sample size. The formulas for the goodness-of-fit index (GFI), normed fit index (NFI), relative fit index (RFI), incremental fit index (IFI), Tucker–Lewis index (TLI), comparative fit index (CFI), model AIC, null AIC, and RMSEA using these values are as follows:

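\[ \mathrm{GFI} = 1 - \frac{\chi^2_{\text{model}}}{\chi^2_{\text{null}}} \]

\[ \mathrm{NFI} = \frac{\chi^2_{\text{null}} - \chi^2_{\text{model}}}{\chi^2_{\text{null}}} \]

\[ \mathrm{RFI} = 1 - \frac{\chi^2_{\text{model}} / df_{\text{model}}}{\chi^2_{\text{null}} / df_{\text{null}}} \]

\[ \mathrm{IFI} = \frac{\chi^2_{\text{null}} - \chi^2_{\text{model}}}{\chi^2_{\text{null}} - df_{\text{model}}} \]

\[ \mathrm{TLI} = \frac{(\chi^2_{\text{null}} / df_{\text{null}}) - (\chi^2_{\text{model}} / df_{\text{model}})}{(\chi^2_{\text{null}} / df_{\text{null}}) - 1} \]

\[ \mathrm{CFI} = 1 - \frac{\chi^2_{\text{model}} - df_{\text{model}}}{\chi^2_{\text{null}} - df_{\text{null}}} \]

\[ \text{Model AIC} = \text{Normal theory } \chi^2_{\text{model}} + 2q \quad (\text{Note: } q = df - 1) \]

\[ \text{Null AIC} = \chi^2_{\text{null}} + 2q \quad (\text{Note: } q = df - 1) \]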
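As a worked illustration, the chi-square-based indices listed above can be computed with a small Python function. This is a sketch only: the function name and the chi-square and df values are hypothetical, and the NFI and IFI numerators are taken as the difference χ²null − χ²model.

```python
def fit_indices(chi2_model, df_model, chi2_null, df_null):
    """Compute GFI, NFI, RFI, IFI, TLI, and CFI from chi-square and df values."""
    ratio_model = chi2_model / df_model
    ratio_null = chi2_null / df_null
    return {
        "GFI": 1 - chi2_model / chi2_null,
        "NFI": (chi2_null - chi2_model) / chi2_null,
        "RFI": 1 - ratio_model / ratio_null,
        "IFI": (chi2_null - chi2_model) / (chi2_null - df_model),
        "TLI": (ratio_null - ratio_model) / (ratio_null - 1),
        "CFI": 1 - (chi2_model - df_model) / (chi2_null - df_null),
    }


# Hypothetical values for a hypothesized model and its null (independence) model.
indices = fit_indices(chi2_model=20.0, df_model=10, chi2_null=200.0, df_null=15)
for name, value in indices.items():
    print(f"{name} = {value:.3f}")
```

Each index moves toward 1 as the hypothesized model χ² shrinks relative to the null model χ², which is why values near 1 are read as better fit.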