CHAPTER CONCEPTS
Measurement model group difference
Measurement invariance
Structural model group difference
Multiple sample model
Testing for parameter differences
Multiple group models were developed to test whether groups had the same or different theoretical model (Lomax, 1985). This approach can be applied in comparing groups on a measurement model, which is called a test of measurement invariance. Measurement invariance implies that two or more groups have the same measurement model, that is, the same construct or meaning for the latent variables in the measurement model. Byrne and Sunita (2006) provided a step-by-step approach for examining measurement invariance in SEM.
Once a researcher establishes that the latent variable constructs are the same, a comparison of the groups in the structural model can be conducted. A researcher would desire a non-significant chi-square difference between the groups in the measurement model invariance comparison. However, a researcher might want the structural models to be different, and would then desire a significant chi-square difference. For example, adolescent drug use might differ between high- and low-GPA high school students in the structural model, while the self-concept and attitude toward drugs constructs in the measurement models are the same.
Multiple Group (Sample) Models
A researcher could also compare multiple samples of data on a measurement model, thus establishing the validity of a construct. In this case, a single specified model is applied to two or more samples of data. Samples of data can also be applied to a structural model to establish model validity. A unified approach to multi-group modeling is explained in Marcoulides and Schumacker (2001).
A misconception in multiple group modeling is that all paths in the model have to be the same. In fact, the chi-square statistic is a test of global fit, that is, the overall fit of a model. It is possible for a measurement model to be slightly different for each group.
Some parameters can therefore be different between the groups in the theoretical model. We can allow some parameters to differ between the groups while testing other paths for equality. You might even consider dropping a single indicator variable that is different between the groups in a measurement model. This type of SEM modeling permits testing for group differences in the specified model without all parameters being tested for equality (Jöreskog & Sörbom, 1993). It is also possible to test for differences in specific parameter estimates.
MEASUREMENT MODEL GROUP DIFFERENCE
The global model fit is assessed using the chi-square test, which is dependent on sample size. The chi-square statistic tends to reject models if the sample size is large, that is, yield a significant chi-square value, and tends to fail to reject models if the sample size is too small, that is, yield a non-significant chi-square value. So, other types of fit indices were created to assess the fit of a model when making multiple group comparisons. Values greater than .95 on the comparative fit index (CFI) or the Tucker–Lewis index (TLI) are considered acceptable. Values less than .05 on the root-mean-square error of approximation (RMSEA) are considered acceptable; the RMSEA is relatively insensitive to sample size, although it is affected by model complexity (larger degrees of freedom). The Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) are both used to compare models using the –2LL chi-square fit value; however, they are sensitive to the number of parameters in a model (model complexity). The rule of thumb is to choose models with lower AIC or BIC values.
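To make the complexity penalty concrete, the following minimal sketch uses the standard textbook definitions AIC = –2LL + 2k and BIC = –2LL + k ln(N), where k is the number of estimated parameters. The function names and all numeric inputs are hypothetical and are not taken from this chapter; SEM programs may also report variants of these indices based on the model chi-square, so treat this only as an illustration of the rule of thumb.

```python
import math

def aic(neg2ll: float, n_params: int) -> float:
    """Akaike Information Criterion from a -2 log-likelihood value."""
    return neg2ll + 2 * n_params

def bic(neg2ll: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion; the penalty grows with ln(N)."""
    return neg2ll + n_params * math.log(n_obs)

# Hypothetical fit values for two competing models (illustration only).
model_a = {"neg2ll": 4520.0, "n_params": 13}
model_b = {"neg2ll": 4515.0, "n_params": 19}
n_obs = 301

for name, m in (("A", model_a), ("B", model_b)):
    print(name, round(aic(**m), 1), round(bic(**m, n_obs=n_obs), 1))
```

In this made-up comparison, model B has the smaller –2LL value but estimates six more parameters, so both indices favor model A.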
Our initial test of a theoretical measurement model is based on the Holzinger and Swineford (1939) study; a confirmatory model of spatial and verbal ability from this study was presented in Chapter 6. However, this time the raw data set (HS.data) from the MBESS package in R will be used, with N = 301 observations and 32 variables [library(MBESS); data(HS.data)]. We need the raw data set to extract separate variance–covariance matrices for each level of the demographic variable of interest, school, to be able to test for group differences in the measurement model. We assessed the global model fit in Chapter 6 for the indicator variables of the constructs in the theoretical measurement model diagrammed in Figure 8.1 (χ² = 13.68, df = 7, p = .06). We concluded that the sample data fit the theoretical measurement model.
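The step of extracting one variance–covariance matrix per group can be sketched as follows. This is an illustration in Python rather than the R workflow mentioned above; the helper names and the tiny records are invented for the example and are not the Holzinger and Swineford data.

```python
from collections import defaultdict

def covariance_matrix(rows):
    """Sample variance-covariance matrix (divisor n - 1) for equal-length rows."""
    n = len(rows)
    k = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(k)]
        for i in range(k)
    ]

def covariance_by_group(records, group_key, variables):
    """Split records (dicts) on group_key and return one matrix per group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append([rec[v] for v in variables])
    return {g: covariance_matrix(rows) for g, rows in groups.items()}

# Made-up records for illustration only.
records = [
    {"school": "Pasteur", "visperc": 1.0, "cubes": 2.0},
    {"school": "Pasteur", "visperc": 2.0, "cubes": 4.0},
    {"school": "Pasteur", "visperc": 3.0, "cubes": 6.0},
    {"school": "Grant-White", "visperc": 1.0, "cubes": 3.0},
    {"school": "Grant-White", "visperc": 3.0, "cubes": 1.0},
]
matrices = covariance_by_group(records, "school", ["visperc", "cubes"])
for school, matrix in matrices.items():
    print(school, matrix)
```

Each group's matrix, together with its sample size, is what a separate-group SEM program would read as input.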
A Beginner's Guide to Structural Equation Modeling
Measurement Invariance
The multiple group model analysis generally establishes that the data for each group fit the same measurement model. This is considered a test of measurement invariance for the groups, which means that the constructs are the same for each group. We would proceed by running separate models for each group to assess the data to model fit. It is possible that the two separate group models may not be exactly the same, that is, a few modification indices may be indicated for the error covariances in one group, but not in the other group. Differences in random error are acceptable, but we desire the factor loadings and factor correlations to be similar in the measurement model. The chi-square statistics for the two groups on the theoretical measurement model are compared. This results in an omnibus chi-square test of model fit for group differences, which we desire to be non-significant to indicate that the constructs are the same for both groups.
The individual SEM measurement model programs would contain different sample sizes and covariance matrices for each group, but have the same specified measurement model. The separate covariance matrices for each group are given in Table 8.1.
Figure 8.1: THEORETICAL MEASUREMENT MODEL (AMOS) [path diagram: spatial → visperc, cubes, lozenges; verbal → paragrap, sentence, wordmean; error terms err_v, err_c, err_l, err_p, err_s, err_w]

Table 8.1: Pasteur and Grant-White Covariance Matrices

Pasteur (n = 156)
            VISPERC     CUBES       LOZENGES    PARCOMP     SENCOMP     WORDMEAN
VISPERC     50.552316
CUBES        9.712738   24.215219
LOZENGES    29.957155   15.354673   86.686187
PARCOMP     10.288627    1.068900    3.977337   11.953805
SENCOMP     11.399628    2.295533    2.139950   13.041315   27.502854
WORDMEAN    21.032465    5.468073   12.517949   15.959843   26.318486   48.056038

Grant-White (n = 145)
            VISPERC     CUBES       LOZENGES    PARCOMP     SENCOMP     WORDMEAN
VISPERC     47.800958
CUBES       10.012500   19.758333
LOZENGES    25.797893   15.416667   69.172414
PARCOMP      7.972605    3.420833    9.206657   11.393487
SENCOMP      9.935728    3.295833   11.091954   11.277347   21.615709
WORDMEAN    17.425335    6.876389   22.954262   19.166523   25.320977   63.162548

Table 8.2: Comparison of Global and Individual School Measurement Models

                              Global            Pasteur (1)       Grant-White
Measurement Model             Spatial  Verbal   Spatial  Verbal   Spatial  Verbal
Factor loadings:
  Visual Perception             .96               .91               .68
  Cubes                         .31               .33               .46
  Lozenges                      .46               .50               .66
  Paragraph Comprehension                .85               .82               .87
  Sentence Completion                    .86               .86               .83
  Word Meaning                           .84               .83               .83
(Cubes, Lozenges)               .20               .25               .11
(Spatial, Verbal)               .42               .38               .56
χ²                            13.68             14.25              2.22
df                             7                 7                 7
p value                         .057              .046              .947
CFI                                               .98              1.0
RMSEA                                             .086             0
GFI                                               .97               .99

(1) Pasteur results indicated a negative error variance for VISPERC.

The results in Table 8.2 indicate that the Pasteur school and the Grant-White school fit the measurement model differently, as noted by the chi-square values and the RMSEA, CFI, and GFI indices for each group. The Grant-White school data fit the measurement model better. The chi-square difference (14.25 – 2.22 = 12.03) is greater than the critical chi-square of 3.84 at the .05 level of significance, so the two schools do show a difference in the constructs.
Upon further inspection, we see that the factor loadings on the spatial ability latent variable are different between the Pasteur and Grant-White schools. This factor loading difference on spatial ability most likely reflects the group difference between the two schools.
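The p values attached to the chi-square statistics in Table 8.2 can be checked with a short sketch. The function chi2_sf below is a hypothetical helper written for this illustration (it is not part of any SEM package); it uses the closed-form upper-tail probability for integer degrees of freedom and needs only the Python standard library. The results should land close to the tabled p values, although small rounding differences are possible because the tabled chi-square values are themselves rounded. The last lines simply repeat the chapter's comparison of the chi-square difference with the critical value of 3.84.

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: finite sum of Poisson-type terms.
        total = sum(half ** j / math.factorial(j) for j in range(df // 2))
        return math.exp(-half) * total
    # Odd df: complementary error function plus a finite sum.
    total = sum(half ** (j + 0.5) / math.gamma(j + 1.5)
                for j in range((df - 1) // 2))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total

# Chi-square values from Table 8.2, each with df = 7.
fits = {"Global": 13.68, "Pasteur": 14.25, "Grant-White": 2.22}
for name, chi2 in fits.items():
    print(f"{name}: chi-square = {chi2}, df = 7, p = {chi2_sf(chi2, 7):.3f}")

# The comparison made in the text: difference versus the critical value 3.84.
difference = fits["Pasteur"] - fits["Grant-White"]
print(round(difference, 2), difference > 3.84)
```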
STRUCTURAL MODEL GROUP DIFFERENCE
The multiple group method can also test for group differences in a structural model. An example is presented based on data from Arbuckle and Wothke (2003). The multiple group model is specified to examine the perceived attractiveness and perceived academic ability differences between a sample of 209 girls and 207 boys.
Before testing for group differences, we would first establish that the total sample fits the theoretical structural model. The theoretical structural model is diagrammed in Figure 8.2. The global chi-square was 1.02, df = 2, p = .60, so the data fit the theoretical model (Table 8.3).
Figure 8.2: STRUCTURAL MODEL (AMOS) [path diagram with the variables gpa, height, weight, rating, academic, and attract]
The SEM software programs will need to include two separate groups of data. The observed variables, sample size, and then either the covariance matrix for each group, or the correlation matrix with means and standard deviations, will need to be provided for each group. The programs in LISREL are run separately for each group with the same model statements. The SEM program provides separate analysis estimates for the girls and boys, but uses the same model statements.
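When only a correlation matrix and standard deviations are available for a group, the covariance matrix can be recovered because each covariance equals the correlation multiplied by the two standard deviations. The sketch below illustrates this; the helper names and the two-variable summary statistics are hypothetical and are not taken from the Arbuckle and Wothke data.

```python
def fill_symmetric(lower):
    """Expand a lower-triangular list of rows into a full symmetric matrix."""
    k = len(lower)
    return [[lower[max(i, j)][min(i, j)] for j in range(k)] for i in range(k)]

def corr_to_cov(corr, sds):
    """Covariance matrix from a full correlation matrix and standard deviations."""
    k = len(sds)
    return [[corr[i][j] * sds[i] * sds[j] for j in range(k)] for i in range(k)]

# Hypothetical summary statistics for one group (illustration only).
lower = [[1.0],
         [0.5, 1.0]]
sds = [2.0, 3.0]

cov = corr_to_cov(fill_symmetric(lower), sds)
print(cov)
```

The same two steps would be repeated for each group, so that every group contributes its own matrix and sample size to the analysis.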
The results are in Table 8.3. The model chi-square values for each group were non-significant, indicating a good data to model fit. The chi-square difference was also non-significant (2.76 – .42 = 2.34), indicating that the groups did not differ on their academic and attractiveness constructs. Consequently, the global model fit correctly reflects the relations in the sample data.
MULTIPLE SAMPLE MODEL
Table 8.3: Comparison of Global and Individual Group Structural Models

                        Global               Girls                Boys
Structural Model        academic  attract    academic  attract    academic  attract
Structure Coefficients:
  gpa                     .51                  .49                  .52
  height                            .06                  .00                  .13
  weight                           –.09                 –.08                 –.17
  rating                            .30                  .36                  .19
  academic                          .48                  .52                  .45
  attract                 .07                 –.01                  .19
(academic, attract)      –.52                 –.05                 –.13
χ²                       1.02                 2.76                  .42
df                       2                    2                    2
p value                   .60                  .252                 .809

The multiple sample approach can be used to test a theoretical model for differences in parameter estimates across samples of data. A theoretical regression model is diagrammed in Figure 8.3 where vehicle weight (weight) and engine horsepower (power) predict miles per gallon (mpg). The data set (cars.sav) for the multiple sample approach can be found in SPSS, and read into most SEM software programs. The cars.sav data set estimated miles per gallon (mpg) based on various vehicle characteristics (weight, horsepower, engine displacement, year of vehicle, etc.). For our purposes we selected miles per gallon as the dependent variable with vehicle weight and horsepower as independent predictor variables. The original data set contained N = 406 observations; however, only N = 394 were useable because of 12 missing cases (6 due to dependent variable missingness and 6 due to independent variable missingness). The multiple sample approach compares the model fit and parameter estimates of each sample to determine whether they differ significantly. We therefore took two random samples without replacement from the cars.sav data.
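Drawing two samples without replacement amounts to shuffling the usable cases once and cutting the shuffled list in two, so that no case appears in both samples. The sketch below assumes 394 usable cases split into samples of 206 and 188, the sizes reported in Table 8.4. The function name, the placeholder case IDs, and the seed are hypothetical; the split actually used for the chapter cannot be reproduced from this code.

```python
import random

def split_without_replacement(case_ids, n_first, seed=None):
    """Randomly assign n_first cases to sample 1 and the remainder to sample 2."""
    rng = random.Random(seed)
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_first], shuffled[n_first:]

# Placeholder case IDs 0..393 stand in for the usable rows of the data set.
sample1, sample2 = split_without_replacement(range(394), 206, seed=8)
print(len(sample1), len(sample2))
```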
The descriptive data for the two samples are in Table 8.4. A visual inspection shows that the correlations, means, and standard deviations look similar.
Figure 8.3: REGRESSION MODEL (AMOS) [path diagram: weight and power predict mpg]

Table 8.4: Sample Descriptive Statistics

Sample 1 (N = 206)
Correlation Matrix       mpg       weight     power
  mpg                    1.0
  weight                 –.82      1.0
  power                  –.778      .865      1.0
Means                   23.94   2921.67     104.23
Standard Deviations      8.140   835.421     41.129

Sample 2 (N = 188)
Correlation Matrix       mpg       weight     power
  mpg                    1.0
  weight                 –.823     1.0
  power                  –.760      .855      1.0
Means                   23.59   2952.02     102.72
Standard Deviations      7.395   805.372     36.234

The global model fit, as well as the results for each sample from the analysis, are reported in Table 8.5 for the regression equation: mpg = weight + power. The multiple regression results for the complete data (N = 394) indicated what we expected for results in terms of R² values and regression coefficients. We can visually compare our two individual sample parameter estimates. The results appear to be very similar. Structural equation modeling software, however, provides the capability of testing whether our results (parameter estimates) are statistically different.
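With two predictors, the standardized regression coefficients and R² follow directly from the three correlations in Table 8.4, using the standard two-predictor formulas. The sketch below applies them to each sample; the function name is hypothetical. Because the tabled correlations are rounded, the output should agree with the standardized coefficients and R² values reported in Table 8.5 only to within rounding.

```python
def standardized_regression(r_y1, r_y2, r_12):
    """Standardized coefficients and R-squared for y on two predictors."""
    denom = 1.0 - r_12 ** 2
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    r_squared = beta1 * r_y1 + beta2 * r_y2
    return beta1, beta2, r_squared

# Correlations from Table 8.4: (mpg, weight), (mpg, power), (weight, power).
samples = {
    "Sample 1": (-0.82, -0.778, 0.865),
    "Sample 2": (-0.823, -0.760, 0.855),
}
for name, (r_yw, r_yp, r_wp) in samples.items():
    b_w, b_p, r2 = standardized_regression(r_yw, r_yp, r_wp)
    print(f"{name}: weight = {b_w:.3f}, power = {b_p:.3f}, R2 = {r2:.3f}")
```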
SEM software provides the ability to compare both samples rather than having to run separate multiple regression programs on each sample and hand calculate a t test or z test for differences in the regression coefficients. The sample comparison yielded a non-significant chi-square (χ² = 2.01, df = 3, p = .57), which indicates that the two samples do not have statistically different parameter estimates in the regression model. So, it seems reasonable to have a total sample standardized regression coefficient of –.551 for vehicle weight, compared to the individual sample regression coefficients of –.585 and –.642, respectively. Looking at the standardized regression coefficient for horsepower predicting mpg, we find individual regression coefficients of –.272 and –.212, respectively. So, it seems reasonable to have a common regression coefficient of –.299 for horsepower. Also, notice the R² values for each sample. We find that for each individual sample, the R² values were .692 and .689, respectively.
So, once again, the total sample R² value of .675 is reasonable. Our interpretation would suggest that two-thirds of the miles per gallon variation for a car can be explained by a vehicle's weight and horsepower. The negative regression coefficients are expected because as weight and horsepower increase, miles per gallon should decrease.

Table 8.5: Multiple Sample Regression Model

Total Sample (N = 394), R² = .675
                   Unstandardized Coefficients   Standardized Coefficients
                   B          Std Error          Beta                        t          p
Constant           44.777      .825                                          54.307     .0001
Vehicle Weight      –.005      .001              –.551                       –9.818     .0001
Horsepower          –.061      .011              –.299                       –5.335     .0001

Sample 1 (N = 206), R² = .692
                   Unstandardized Coefficients   Standardized Coefficients
                   B          Std Error          Beta                        t          p
Constant           46.214     1.193                                          38.723     .0001
Vehicle Weight      –.006      .001              –.585                       –7.550     .0001
Horsepower          –.054      .015              –.272                       –3.509     .0001

Sample 2 (N = 188), R² = .689
                   Unstandardized Coefficients   Standardized Coefficients
                   B          Std Error          Beta                        t          p
Constant           45.412     1.166                                          38.957     .0001
Vehicle Weight      –.006      .001              –.642                       –8.114     .0001
Horsepower          –.043      .016              –.212                       –2.675     .0001
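For comparison with the SEM chi-square test, the hand calculation mentioned in the text can be sketched as a z test on the difference between two unstandardized coefficients, using the horsepower estimates and standard errors from Table 8.5. This assumes the two samples are independent, which is plausible here because they were drawn without replacement from the same data; the function name is hypothetical, and this is a rough check rather than the procedure the SEM software applies.

```python
import math

def coefficient_difference_z(b1, se1, b2, se2):
    """z statistic and two-tailed p value for b1 - b2 from independent samples."""
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-tailed normal probability
    return z, p

# Horsepower coefficients (B) and standard errors for Samples 1 and 2.
z, p = coefficient_difference_z(-0.054, 0.015, -0.043, 0.016)
print(f"z = {z:.2f}, p = {p:.2f}")
```

A z value this far inside ±1.96 agrees with the non-significant chi-square reported for the sample comparison.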
SUMMARY
The SEM multiple group approach is useful for testing whether group differences exist for a theoretically specified model. The group comparisons can be made for regression, path, factor, or structural models. When the multiple group method is used in factor analysis, the purpose might be to establish measurement invariance, that is, the groups have the same construct or latent variable meaning. However, a researcher can hypothesize that group differences exist in factor models, as well as the other model types if the interest is in hypothesizing group differences. Multiple sample testing uses the same SEM approach, but now the focus is on whether the samples of data yield similar or different parameter estimates, or overall model fit, when comparing multiple regression, path models, confirmatory factor models, or structural equation models.
This chapter presented only one example for each of the applications because a more in-depth coverage is beyond the scope of this book. However, your SEM software program should provide other examples. An Internet search will also yield articles and software examples. We have placed the data and programs used in this chapter on the book website.
EXERCISES
1. Multiple Group Model
Create an SEM program that produces output to determine if path coefficients are statistically significantly different. You will need the separate data set information provided below to perform this task. Also provide a path diagram with interpretation of results.
The path model tests whether job satisfaction (satis) is indicated by boss attitude (boss) and the number of hours worked (hrs). The boss attitude (boss) is in turn indicated by the employee satisfaction (satis). The boss attitude (boss) is also indicated by the type of work performed (type), level of assistance provided (assist), and evaluation of the work (eval). There would be two model statements or equations specified as follows:
satis = boss hrs
boss = type assist eval satis
Note: Because a reciprocal relation exists between boss and satis, the errors would need to be correlated to obtain the correct path coefficients.
The data set information to be used to test hypotheses of equal or unequal parameter estimates in a path model between Germany and the United States are listed below.
Germany
Observed Variables: satis boss hrs type assist eval
Sample Size: 400
Means: 1.12 2.42 10.34 4.00 54.13 12.65
Standard Deviations: 1.25 2.50 3.94 2.91 9.32 2.01
Correlation Matrix:
1.00
 .55 1.00
 .49  .42 1.00
 .10  .35  .08 1.00
 .04  .46  .18  .14 1.00
 .01  .43  .05  .19  .17 1.00

United States
Observed Variables: satis boss hrs type assist eval
Sample Size: 400
Means: 1.10 2.44 8.65 5.00 61.91 12.59
Standard Deviations: 1.16 2.49 4.04 4.41 4.32 1.97
Correlation Matrix:
1.00
 .69 1.00
 .48  .35 1.00
 .02  .24  .11 1.00
 .11  .19  .16  .31 1.00
 .10  .28  .13  .26  .18 1.00
2. Multiple Sample Model
Nursing programs are interested in knowing if their outcomes are similar from one semester to the next. Two semesters of data were obtained on how student effort (effort) and learning environment (learn) predicted clinical competence (comp) in nursing. The regression model is: comp = effort + learn.
Create an SEM program to test whether the regression coefficients in the model are the same or statistically significantly different for the two semester samples of data. Semester 1 had 250 nurses and Semester 2 had 205 nurses. Note: The means and standard deviations were not available, so assume the data are in standardized form and only use the correlation matrix in your analysis. The correlation matrices for the two semesters are in Table 8.6.
Table 8.6: Nursing Program Correlation Matrices

Semester 1 (N = 250)
            Clinical   Effort   Learn
Clinical    1.0
Effort       .28       1.0
Learn        .23        .25     1.0

Semester 2 (N = 205)
            Clinical   Effort   Learn
Clinical    1.0
Effort       .21       1.0
Learn        .16        .15     1.0

SUGGESTED READINGS
Multiple Group Models
Conner, B. T., Stein, J. A., & Longshore, D. (2005). Are cognitive AIDS risk-reduction models equally applicable among high- and low-risk seekers? Personality & Individual Differences, 38, 379–393.
Long, B. (1998). Coping with workplace stress: A multiple-group comparison of female managers and clerical workers. Journal of Counseling Psychology, 45, 65–78.
Unrau, N., & Schlackman, J. (2006, November/December). Motivation and its relationship with reading achievement in an urban middle school. The Journal of Educational Research, 100(2), 81–101.
Multiple Samples
Geary, D. C., & Whitworth, R. H. (1988). Dimensional structure of the WAIS-R: A simultaneous multi-sample analysis. Educational and Psychological Measurement, 48(4), 945–956.
Poon, W. Y., & Tang, F. C. (2002). Multisample analysis of multivariate ordinal categorical variables. Multivariate Behavioral Research, 37, 479–500.
Tschanz, B. T., Morf, C. C., & Turner, C. W. (1998). Gender differences in the structure of narcissism: A multi-sample analysis of the Narcissistic Personality Inventory. Sex Roles: A Journal of Research, 38, 863–868.
REFERENCES
Arbuckle, J. L., & Wothke, W. (2003). Amos 5.0 user's guide. Chicago, IL: Smallwaters Corporation.
Byrne, B., & Sunita, M. S. (2006). The MACS approach to testing for multigroup invariance of a second-order structure: A walk through the process. Structural Equation Modeling: A Multidisciplinary Journal, 13(2), 287–321.
Holzinger, K. J., & Swineford, F. A. (1939). A study in factor analysis: The stability of a bi-factor solution. (Supplementary Educational Monographs, No. 48.) Chicago, IL: University of Chicago, Department of Education.
Jöreskog, K., & Sörbom, D. (1993). LISREL 8: Structural equation modeling with the SIMPLIS command language. Chicago, IL: Scientific Software International.
Lomax, R. G. (1985). A structural model of public and private schools. Journal of Experimental Education, 53, 216–226.
Marcoulides, G., & Schumacker, R. E. (Eds.). (2001). New developments and techniques in structural equation modeling: Issues and techniques. Mahwah, NJ: Lawrence Erlbaum.