Testing for Internal Construct Validity using Single-group CFA

estimate critical psychometric information for a measurement device. Single-group CFA can determine whether significant amounts of random error are in the data, as well as whether the measure accurately assesses the hypothesised trait construct (Schumacker & Lomax, 2010). To examine these questions, a single-group CFA model is specified, as per the model depicted in Figure 4.1. Provided the model is deemed a good fit for the data (using the fit indices described in Chapter Five), the factor loadings for the latent trait and error components are examined. Factor loadings greater than .32 are typically deemed a minimum standard for acceptability in applied psychological research because they indicate at least 10 percent shared variance between the variable and the factor (Gorsuch, 1983). However, from a statistical point of view the choice of threshold for a meaningful loading is often arbitrary, and higher factor loadings, around .5 or .6, are preferred (Gorsuch, 1983).
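The link between the .32 threshold and the 10 percent figure comes from squaring a standardised loading, which gives the proportion of an indicator's variance shared with the factor. The following is a minimal Python sketch of that arithmetic; the function name `shared_variance` is hypothetical and the loadings are assumed to be standardised.

```python
def shared_variance(loading: float) -> float:
    """Proportion of an indicator's variance shared with the factor.

    Assumes a standardised loading, so the squared loading is the
    share of indicator variance explained by the latent factor.
    """
    return loading ** 2


# Thresholds mentioned in the text: the .32 minimum and the preferred .5 / .6
for loading in (0.32, 0.50, 0.60):
    print(f"loading {loading:.2f} -> shared variance {shared_variance(loading):.1%}")
```

A loading of .32 thus corresponds to roughly 10 percent shared variance, whereas loadings of .5 and .6 correspond to 25 and 36 percent respectively.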

The factor loadings of the indicators onto the latent attitude factor (represented by λ11-41 in Figure 4.1) show whether the latent trait construct is significantly and substantively assessed by the observed scores. Good internal construct validity for a measure would be demonstrated if all indicator loadings were significant and greater than .32, although much higher factor loadings would provide stronger evidence of good construct validity (Gorsuch, 1983). Good internal construct validity for the political questionnaire depicted in Figure 4.1 would imply the items (Y1-4) were all consistently adequate measures of the same underlying construct of political

onto the error factors (ε1-4) imply that random error variance comprises a significant portion of the observed scores.

For the implicit attitude measures, single-group CFA could be applied to each IAT or APT individually to determine whether the implicit attitude scores provide an adequate and consistent estimate of the latent implicit attitude construct being investigated. This would reveal whether each IAT or APT possessed adequate internal construct validity, a vital prerequisite for any measure. Single-group CFA could also determine whether significant proportions of random error variance comprise the implicit attitude scores, as hypothesised. This is a key advantage of CFA for the current dissertation.

Application 2: Estimating Reliability using Composite Reliability and Average Variance Extracted

The other crucial prerequisite for a task, reliability, can also be estimated using the same one-factor single-group CFA model outlined above. Reliability is the degree to which a test consistently measures that which it measures (Nunnally, 1978).

Reliability is often assessed by examining how well all the items of a test relate to each other, an estimate referred to as internal consistency. A popular internal consistency estimator is Cronbach’s (1951) coefficient alpha (α), which treats greater inter-correlations as indicative of more consistency amongst test items and thus better stability/reliability for the test. Despite being widely used in behavioural and social research for more than 60 years, Cronbach’s (1951) coefficient alpha has been shown to provide a sub-optimal indicator of reliability due to not accounting for error variance, as well as issues of under- and over-representation (Novick & Lewis, 1967; Raykov, 1997; Zimmerman, 1972). Unless all the scale items have equivalent factor loadings, a model type referred to as tau-equivalent (Graham, 2006), coefficient alpha has been found to underestimate composite reliability at the population level, at times by quite substantial amounts (Novick & Lewis, 1967; Raykov, 1997). Such inaccuracies have led to the conclusion that coefficient alpha cannot be considered a dependable estimator of measure reliability (Raykov, 1997). Rather, it has been argued that reliability be reported using Composite Reliability (CR) and Average Variance Extracted (AVE) instead (Fornell & Larcker, 1981). CR and AVE can both be easily applied to single-group CFA. This pairs the stronger reliability estimate afforded by CR and AVE with the statistical rigour of CFA, delivering a significant advantage over Cronbach’s alpha.

In SEM, Composite Reliability provides an estimate of the internal consistency of a task by assessing the extent to which a set of indicators share in the measurement of a latent construct (Hair, Black, Babin, Anderson, & Tatham, 2006). CR estimates thus differ from Cronbach’s alpha estimates by examining the reliability of the latent construct, after random error variance is removed, rather than examining the reliability of the individual test items. CR delivers an estimate similar to the reliability of the summated scale and will typically reveal stronger reliability estimates than Cronbach’s α, unless items are tau-equivalent (Raykov, 1997). Fornell and Larcker (1981) propose CR be calculated using the formula depicted in Equation 4.3. Adequate reliability for a measure is revealed if CR estimates are greater than .60 (Tseng, Dörnyei, & Schmitt, 2006) or .70 (Hair et al., 2006).

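$$\rho_{\eta} = \frac{\left(\sum_{i} \lambda_i\right)^2}{\left(\sum_{i} \lambda_i\right)^2 + \sum_{i} \mathrm{Var}(\varepsilon_i)} \tag{4.3}$$

where $\rho_{\eta}$ is the composite reliability, $\lambda_i$ is the factor loading of indicator $i$, and $\mathrm{Var}(\varepsilon_i)$ is the error variance of indicator $i$.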
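As an illustration of the Fornell and Larcker (1981) composite reliability formula (the squared sum of the loadings divided by that squared sum plus the summed error variances), the Python sketch below compares CR with coefficient alpha. It is a sketch under stated assumptions only: the loadings are hypothetical, the factor variance is fixed to 1, error variances are taken as one minus the squared loading, and alpha is computed from the covariance matrix implied by the one-factor model rather than from raw data. The function names are illustrative.

```python
from typing import Sequence


def composite_reliability(loadings: Sequence[float],
                          error_vars: Sequence[float]) -> float:
    """CR: squared sum of loadings over itself plus summed error variances."""
    squared_sum = sum(loadings) ** 2
    return squared_sum / (squared_sum + sum(error_vars))


def alpha_from_model(loadings: Sequence[float],
                     error_vars: Sequence[float]) -> float:
    """Coefficient alpha from the covariance matrix implied by a
    one-factor model (factor variance assumed fixed to 1)."""
    k = len(loadings)
    # Implied item variances (diagonal): lambda_i^2 + Var(eps_i)
    trace = sum(l ** 2 + e for l, e in zip(loadings, error_vars))
    # Sum of all elements of the implied covariance matrix
    total = sum(loadings) ** 2 + sum(error_vars)
    return k / (k - 1) * (1 - trace / total)


# Hypothetical standardised loadings (illustrative values only)
unequal = [0.9, 0.7, 0.5, 0.3]   # loadings differ across items
equal = [0.6, 0.6, 0.6, 0.6]     # tau-equivalent: identical loadings

for label, lam in (("unequal", unequal), ("equal", equal)):
    theta = [1 - l ** 2 for l in lam]  # standardised error variances
    cr = composite_reliability(lam, theta)
    alpha = alpha_from_model(lam, theta)
    print(f"{label}: CR = {cr:.3f}, alpha = {alpha:.3f}")
```

With the unequal loadings, alpha (about .68) falls below CR (about .71); with equal loadings the two estimates coincide, consistent with the tau-equivalence condition described above.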

Average Variance Extracted (AVE) complements Composite Reliability (CR) by providing an estimate of how much variance within the indicators is explained by the common factor (Hair et al., 2006). In other words, AVE measures how much of the trait construct is accounted for by the attitude scores. This is essentially a test of internal convergent validity. AVE can be calculated using the formula depicted in Equation 4.4, similarly specified by Fornell and Larcker (1981). AVE results represent the ratio of total variance due to the latent variable and can vary between 0 and 1. AVE values greater than .50 are considered satisfactory because they indicate that at least 50% of the variance in a measure is due to the hypothesised underlying trait (Bagozzi, 1991; Dillon & Goldstein, 1984; Hair et al., 2006; Tseng et al., 2006). This score is thus particularly important as it indicates what proportion of trait versus error variance is accounted for by the observed scores. A result of greater than .50 implies good validity for both the construct and the individual variables, revealing acceptable internal convergent validity for the measure.

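$$\rho_{vc(\eta)} = \frac{\sum_{i} \lambda_i^2}{\sum_{i} \lambda_i^2 + \sum_{i} \mathrm{Var}(\varepsilon_i)} \tag{4.4}$$

where $\rho_{vc(\eta)}$ is the average variance extracted, $\lambda_i$ is the factor loading of indicator $i$, and $\mathrm{Var}(\varepsilon_i)$ is the error variance of indicator $i$.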
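The AVE calculation of Fornell and Larcker (1981), the sum of squared loadings divided by that sum plus the summed error variances, can be sketched in Python as follows. The loadings are hypothetical, standardised values chosen for illustration, error variances are assumed to equal one minus the squared loading, and the function name is illustrative rather than drawn from any particular SEM package.

```python
from typing import Sequence


def average_variance_extracted(loadings: Sequence[float],
                               error_vars: Sequence[float]) -> float:
    """AVE: sum of squared loadings over itself plus summed error variances."""
    explained = sum(l ** 2 for l in loadings)
    return explained / (explained + sum(error_vars))


# Two hypothetical four-indicator measures (illustrative values only)
measures = {
    "weaker": [0.9, 0.7, 0.5, 0.3],
    "stronger": [0.8, 0.75, 0.7, 0.65],
}

for name, lam in measures.items():
    theta = [1 - l ** 2 for l in lam]  # standardised error variances
    ave = average_variance_extracted(lam, theta)
    verdict = "satisfactory" if ave > 0.50 else "below the .50 criterion"
    print(f"{name}: AVE = {ave:.3f} ({verdict})")
```

In this illustration the first set of loadings yields an AVE of about .41, meaning more than half of the indicator variance is error, whereas the second set yields about .53 and meets the .50 criterion.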

Together, CR and AVE provide a psychometrically robust estimate of the internal consistency and internal convergent validity of a task, such as the IAT or APT. Given the concerns outlined in Chapter Three regarding the reliability of these implicit attitude measures, such statistical processes provide a crucial application for CFA in the psychometric evaluation of these tasks. CR and AVE can be applied to assess whether the IAT and APT provide a consistent and adequate measure of the implicit attitude constructs of interest. The AVE assessment can also be used to estimate what proportion of random error variance is confounding the implicit attitudinal data.

Application 3: Testing for Construct Validity using Single-group CFA

In document Evaluating the construct validity of Implicit Association Tests using Confirmatory Factor Analytic Models (Page 122-126)
