Statistical criteria for assessing the validity of measurement models

7. INSTRUMENT VALIDATION AND MEASUREMENT MODEL

7.5. Assessing Construct Validity through Confirmatory Factor Analysis

7.5.2. Statistical criteria for assessing the validity of measurement models

Once the measurement model had been specified and developed in Stages 1–3, the data collected and the key estimation decisions made, the model was ready for Stage 4 of Hair et al.’s (2006) process: the assessment of measurement model validity. The main purpose of using SEM to assess the measurement model is to find the most parsimonious model that fits well and is valid. This section details the necessary tests and acceptance levels for goodness of fit, convergent validity, discriminant validity and second order confirmatory factor analysis measurement models. The discussion is generic and serves as the foundation for the actual tests conducted and reported in the subsequent sections.

Goodness of Fit

Whether a measurement model is considered valid depends on goodness of fit (GOF) indices. GOF indices indicate how well the model reflects the data, in other words, how well the specified model reproduces the covariance matrix among the indicator items (Hair et al. 2006). There are various GOF indicators, although usually only a few are reported. Generally, GOF indicators can be grouped into three categories: absolute measures, incremental measures and parsimonious fit measures. To ensure rigour in the empirical assessment, multiple GOF indices are used, as suggested in the literature (Ho 2006; Kline 2005). The literature is divided over the number of fit indices that should be reported (e.g. Kline (2005) suggests at least four), which fit indices are most appropriate, and the acceptable cut-off thresholds (Hair et al. 2006; Kline 2005). Table 7-10 summarises the basics of the GOF indices used in this study, which are further detailed in Appendix O.

Appendix O also acknowledges other popular fit indices which were not used in this study.

This study follows the advice of Weston and Gore (2006), MacCallum and Austin (2000), Hu and Bentler (1998) and McDonald and Ho (2002) and presents the following fit indices: chi-square, normed chi-square, RMSEA, RMR and CFI. In addition, Hu and Bentler (1998) recommended against the use of GFI and AGFI because they are not only insufficiently and inconsistently sensitive to model misspecification but also strongly influenced by sample size (MacCallum & Austin 2000). Hence, GFI and AGFI were not used in this study. The chosen GOF indicators and their acceptance levels are summarised in Table 7-10.

Table 7-10: Summary of Goodness of Fit Indices

| Category and definition | Indicator | Description | Acceptance level (traditional) |
|---|---|---|---|
| Absolute fit measures: indicate the degree to which the proposed model fits/predicts the observed covariance matrix | Chi-square | Tests if the proposed model fits the collected empirical data | p > 0.05 |
| Absolute fit measures | Normed chi-square | Handles the sensitivity of chi-square in complex models and can be used to estimate the parsimony of the model | 1.0 - 2.0 |
| Absolute fit measures | Root Mean-Square Error of Approximation (RMSEA) | Addresses the issue of error in the approximation of the population via a sample survey. In contrast to the exact fit test of the chi-square, the RMSEA is a measure of discrepancy per degree of freedom | < 0.10 |
| Absolute fit measures | RMR | RMR is the mean absolute value of the covariance residuals | < 0.10 |
| Incremental fit measures: compare the proposed model to some baseline model. Hence, they are also often called comparative fit indices | Comparative Fit Index (CFI) | CFI avoids the underestimation of fit often noted in small samples and is the improved version of the often used NFI | > 0.90 |

Source: (Ho 2006); (Jöreskog & Sörbom 1993); (Holmes-Smith 2007).
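The indices that can be derived from chi-square values alone can be illustrated with a short sketch. It is not the procedure used in this study, only a minimal Python illustration using the commonly cited textbook formulas for the normed chi-square, RMSEA and CFI. The function names and the input values are hypothetical, the N - 1 divisor in the RMSEA is an assumption (some programs use N), and the chi-square p-value and the RMR are omitted because they require the chi-square distribution and the residual covariance matrix respectively.

```python
import math


def normed_chi_square(chi2, df):
    """Chi-square divided by its degrees of freedom."""
    return chi2 / df


def rmsea(chi2, df, n):
    """RMSEA point estimate; the n - 1 divisor is an assumption (some programs use n)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    """Comparative Fit Index relative to a baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_baseline = max(chi2_baseline - df_baseline, d_model)
    if d_baseline == 0.0:
        return 1.0
    return 1.0 - d_model / d_baseline


def check_traditional_levels(chi2, df, n, chi2_baseline, df_baseline):
    """Compare the indices with the traditional acceptance levels of Table 7-10."""
    values = {
        "normed_chi_square": normed_chi_square(chi2, df),
        "rmsea": rmsea(chi2, df, n),
        "cfi": cfi(chi2, df, chi2_baseline, df_baseline),
    }
    passed = {
        "normed_chi_square": 1.0 <= values["normed_chi_square"] <= 2.0,
        "rmsea": values["rmsea"] < 0.10,
        "cfi": values["cfi"] > 0.90,
    }
    return values, passed


# Hypothetical inputs, not results from this study.
values, passed = check_traditional_levels(
    chi2=36.0, df=24, n=200, chi2_baseline=624.0, df_baseline=36
)
print(values)
print(passed)
```

Stricter cut-offs, such as levels adjusted for sample size and model complexity, could be substituted for the traditional levels in `check_traditional_levels` without changing the index calculations.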

In summary, several GOF indices were presented, together with their interpretation and acceptance levels. The chosen GOF indicators are based on the recommendation of Hair et al. (2006) to account for sample size and model complexity. These GOF indicators are considerably more stringent than the classic ones employed by Weston and Gore (2006) and more relevant than those used by Holmes-Smith (2007), as they account for complexity and sample size. Table 7-11 below provides an overview of the GOF measures and the acceptance levels used for this research, given a sample size below 250.

Table 7-11: Goodness of Fit Measures

Lo90 = 0 (not necessary, but indicates that the test of exact fit is supported)

| Name | Abbreviation | Traditional acceptance level (Weston & Gore 2006) | Adjusted levels of this CFA, by complexity of model (no. …) |
|---|---|---|---|
| RMR | RMR | < 0.10 | n.a.; < 0.08; < 0.09 |
| Comparative Fit Index | CFI | > 0.90 | > 0.97; > 0.95; > 0.92 |

Convergent validity

Convergent validity measures whether items of the same variable or construct measure the same thing and, therefore, correlate with one another. In CFA, convergent validity measures whether items of the same latent factor share a high proportion of variance (Hair et al. 2006). Convergent validity is, therefore, a direct measure of the extent of the relationship between an observed variable and a latent construct. According to Holmes-Smith (2007), convergent validity is achieved when this relationship, represented by factor loadings, is significantly different from zero. To assess the statistical significance of the factor loadings, critical ratios and p-values were calculated for each factor loading. Critical ratios outside the -1.96 to +1.96 z-value range and p-values below 0.05 indicate factor loadings that are significantly different from zero. This statistical test of the significance of the factor loadings is the key criterion in assessing factor validity (Holmes-Smith 2007).
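The significance test described above can be sketched as follows. This is a minimal illustration rather than the software output used in this study: it assumes that the critical ratio is the unstandardised estimate divided by its standard error and that it is evaluated against the standard normal distribution. The function names and the example estimates are hypothetical.

```python
import math


def critical_ratio(estimate, standard_error):
    """Unstandardised estimate divided by its standard error (treated as a z-value)."""
    return estimate / standard_error


def two_tailed_p(z):
    """Two-tailed p-value of a z-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def loading_is_significant(estimate, standard_error, alpha=0.05):
    """Return the critical ratio, its p-value and whether p falls below alpha."""
    cr = critical_ratio(estimate, standard_error)
    p = two_tailed_p(cr)
    return cr, p, p < alpha


# Hypothetical loadings and standard errors, not results from this study.
print(loading_is_significant(0.82, 0.10))  # critical ratio well outside +/-1.96
print(loading_is_significant(0.15, 0.10))  # critical ratio inside +/-1.96
```

A critical ratio of 1.96 corresponds to a two-tailed p-value of approximately 0.05, which is why the two criteria lead to the same decision.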

Furthermore, regression weights, standardised regression weights and squared multiple correlations (SMC) can be calculated to assess convergent validity. Standardised regression weights should be above 0.5, with values above 0.7 being optimal (Hair et al. 2006, pp. 776–7). Squared multiple correlations are squared standardised factor loadings and represent the extent to which a measured variable’s variance is explained by a latent factor (Hair et al. 2006). SMC can also be used to assess item reliability. To identify a concrete value for an acceptable level of SMC, a literature review was conducted, which yielded no definite threshold level. Hair et al. (2006, pp. 776–7) explicitly comment on the vague handling of SMC values: ‘We do not provide specific rules for interpreting these values here because in a congeneric measurement model they are a function of the factor loading estimates. Recall that a congeneric measurement model is one in which no measured variable loads on more than one construct. The rules for the factor loading estimates tend to produce the same results’. Although all authors agree that the higher the SMC, the better the item reflects the latent variable (Anderson & Gerbing 1988; Byrne 2001; Kahn 2006; Kaplan 2000; Kline 2005; Straub, Boudreau & Gefen 2004; Tabachnick & Fidell 2007; Weston & Gore 2006), very few have provided concrete values for an acceptance level. Values below 0.3 indicate that the item is a poor measure of the construct and should be dropped (Holmes-Smith 2007). An SMC between 0.3 and 0.5 indicates that the item is a weak but adequate measure of the construct (Holmes-Smith 2007). An SMC of 0.5 corresponds to a standardised loading of approximately 0.7 (0.7 squared is 0.49), which indicates that the item reflects the construct very well (Hair et al. 2006; Holmes-Smith 2007).

In sum, convergent validity is assessed through a variety of measures: firstly, standardised regression loadings higher than 0.5 (Hair et al. 2006); secondly, significant p-values (at the 95% confidence level) (Anderson & Gerbing 1988; Hair et al. 2006) and critical ratios outside the -1.96 to +1.96 z-range; and finally, SMC values. Items with SMC values below 0.4 are considered not to hold convergent validity. SMC values between 0.4 and 0.5 were scrutinised and accepted if all other convergent validity measures were well above the recommended thresholds. SMC values above 0.5 were accepted. The standardised factor loadings, critical ratio, p-value and SMC of each item are displayed for each construct.
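The SMC rule summarised above can be sketched in a few lines. The sketch assumes a congeneric model, in which the SMC equals the squared standardised loading, and it assumes that the boundary values 0.4 and 0.5 themselves fall into the 'scrutinise' band, which the text does not specify. The function names and the example loadings are hypothetical.

```python
def squared_multiple_correlation(standardised_loading):
    """SMC of an item in a congeneric model: the squared standardised loading."""
    return standardised_loading ** 2


def classify_smc(smc):
    """Apply the SMC bands of this section (handling of exact boundaries is assumed)."""
    if smc < 0.4:
        return "reject"
    if smc <= 0.5:
        return "scrutinise"
    return "accept"


# Hypothetical standardised loadings, not results from this study.
for loading in (0.6, 0.7, 0.8):
    smc = squared_multiple_correlation(loading)
    print(loading, round(smc, 2), classify_smc(smc))
```

Note that a standardised loading of exactly 0.7 yields an SMC of 0.49 and therefore falls just inside the band that is scrutinised rather than accepted outright.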

Discriminant validity

Discriminant validity measures the extent to which latent variables differ from each other. In contrast to convergent validity, which is a measure within latent variables, discriminant validity is a measure between latent variables. Discriminant validity is especially important if latent variables and constructs are interrelated. It can be assessed in two ways. Firstly, correlations between different constructs can be calculated. High correlations (above 0.8 or 0.9) between constructs indicate a lack of discriminant validity (Holmes-Smith 2007).

Secondly, the average variance extracted for constructs should exceed the square of the correlations between the constructs (Holmes-Smith 2007). In addition to model fit statistics, both discriminant validity measures will be presented for each construct.
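Both discriminant validity checks can be sketched together. The sketch assumes the common definition of the average variance extracted (AVE) as the mean of the squared standardised loadings of a construct, which this section does not spell out, and it uses 0.8 as the correlation cut-off. The function names and the example values are hypothetical.

```python
def average_variance_extracted(standardised_loadings):
    """AVE, assumed here to be the mean of the squared standardised loadings."""
    return sum(l * l for l in standardised_loadings) / len(standardised_loadings)


def discriminant_validity(loadings_a, loadings_b, correlation, max_correlation=0.8):
    """Check the correlation cut-off and the AVE criterion for two constructs."""
    ave_a = average_variance_extracted(loadings_a)
    ave_b = average_variance_extracted(loadings_b)
    squared_correlation = correlation ** 2
    return {
        "ave_a": ave_a,
        "ave_b": ave_b,
        "squared_correlation": squared_correlation,
        # First check: the inter-construct correlation is not too high.
        "correlation_ok": abs(correlation) < max_correlation,
        # Second check: both AVEs exceed the squared correlation.
        "ave_ok": min(ave_a, ave_b) > squared_correlation,
    }


# Hypothetical loadings and correlation, not results from this study.
result = discriminant_validity([0.8, 0.7, 0.9], [0.6, 0.7, 0.6], correlation=0.7)
print(result)
```

In this hypothetical case the correlation of 0.7 passes the first check, but the second construct's AVE (about 0.40) is below the squared correlation (0.49), so the two criteria can disagree and are best reported together.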

Second order confirmatory factor analysis measurement models

The section above discussed the first order measurement models of the research model. As the main research questions of this study are at a higher order concept level, second order confirmatory factor analysis was employed. One advantage of higher order confirmatory factor analysis is that it involves fewer parameters to be estimated, so that the model represents the underlying structure of the sample data in a more parsimonious way (Byrne 2001). A further advantage is that it enables the estimation of the relationships between higher order constructs, rather than only the relationships between variables or lower order constructs. In a second order confirmatory factor analysis, the first order variables are regarded as though they were items.

As with one factor, first order confirmatory factor analysis, second order confirmatory factor analysis models can be estimated either as a congeneric version with freed error variances and regression weights, or as parallel versions. In the following sections, second order confirmatory factor analysis measurement models of the research model will be estimated and presented. Sections 7.6.3 and 7.7.3 transfer the one factor confirmatory factor analysis models from 7.7.1 and 7.7.2 into second order, one factor confirmatory factor analysis measurement models. Model fit statistics and convergent validity will be presented for each second order construct. Subsequently, a full second order confirmatory factor analysis model will be presented together with model fit, convergent validity and discriminant validity statistics.

In document Adaptive IT capability and its impact on the competitiveness of firms: a dynamic capability perspective (Page 144-148)