Theoretical Introduction to Factor Analysis and Structural Equation Modelling
7.2 What is Path Analysis & Factor Analysis?
Structural equation modelling (SEM) combines two distinct types of analysis: path analysis and factor analysis. Path analysis uses a path diagram to represent the relationships between variables and is closely linked to multiple regression, as it involves the simultaneous estimation of several regression models. It offers a direct and effective way of modelling complex relationships among variables, including indirect effects, which can be difficult to measure using other methods. In basic terms, path analysis can be described as causal modelling, because it is made up of modelled structural relations among latent and observed variables that reflect the researcher’s hypotheses about how the independent variables might affect the dependent variables (Lei & Wu, 2007). Path analysis can also be referred to as covariance structure analysis, because structural equation models centre on the analysis of interrelationships and associations among variables, which researchers hypothesise in order to account for particular correlations among those variables (Lei & Wu, 2007). Within SEM, variables can play a number of roles: exogenous (independent) variables are source variables, endogenous (dependent) variables are result variables, and mediator variables act as both source and result variables (Fox, 2006; Lei & Wu, 2007). In path analysis, observed variables are treated as if they were measured without error, which is problematic because this assumption is unlikely to hold in practice (Tabachnick & Fidell, 2007). This is where factor analysis comes into play.
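To make the idea of an indirect effect concrete, the following minimal sketch estimates a three-variable path model (X affects Y directly and also through a mediator M) as two ordinary least squares regressions. The data, variable names, and helper functions are hypothetical and are not taken from this project; dedicated SEM software would normally estimate all paths simultaneously rather than one regression at a time.

```python
from statistics import mean


def cov(u, v):
    """Sample covariance of two equal-length sequences (divisor n - 1)."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def simple_slope(x, y):
    """OLS slope from regressing y on a single predictor x (with intercept)."""
    return cov(x, y) / cov(x, x)


def two_predictor_slopes(x1, x2, y):
    """OLS slopes from regressing y on x1 and x2 (with intercept)."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2  # zero only if x1 and x2 are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


# Hypothetical toy data: exogenous X, mediator M, endogenous Y.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
m = [1.2, 1.9, 3.4, 3.8, 5.1, 6.3, 6.6, 8.2]
y = [2.1, 2.8, 4.9, 5.2, 7.4, 8.1, 9.0, 11.3]

a = simple_slope(x, m)                      # path X -> M
c_prime, b = two_predictor_slopes(x, m, y)  # direct path X -> Y, path M -> Y
indirect = a * b                            # effect of X on Y through M
total = simple_slope(x, y)                  # total effect of X on Y

print(f"direct={c_prime:.3f}, indirect={indirect:.3f}, total={total:.3f}")
```

With OLS estimates, the total effect equals the direct effect plus the indirect effect, which is the decomposition a path diagram expresses visually.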
Through the use of a factor analysis-based measurement model, the variance of the observed variables can be separated from error variance, correcting for unreliability in the model. Factor analysis assumes that every directly observed measurement contains a proportion of unique measurement error. Using factor analysis, researchers are able to determine what is distinctive to each variable under consideration and what is shared amongst them, through the use of a few factors that are not directly observed (Suen, Lei, & Li, 2011). In factor analysis, the emphasis is on how the latent factors relate to the observed variables. In basic terms, factor analysis is a statistical approach to identifying a limited number of unobservable factors that account for the relationships among sets of interconnected observable variables. For example, Drug Use by Youth is a broad construct that can comprise a number of factors, such as solitary drug use, drug use with friends, and drug use while drinking. The central purpose of factor analysis is to identify the structure that exists amongst the variables being analysed.
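The separation of shared variance from error variance can be written compactly for a single-factor measurement model. The symbols below follow a common LISREL-style convention and are an assumption made here for illustration, not notation used elsewhere in this chapter: x_i is an observed variable, ξ is the latent factor with variance φ, λ_i is the factor loading, and δ_i is the unique (error) term with variance θ_i, assumed to be uncorrelated with ξ.

```latex
x_i = \lambda_i \xi + \delta_i,
\qquad
\mathrm{Var}(x_i) = \lambda_i^{2}\,\phi + \theta_i
```

The first term of the variance, λ_i²φ, is the part of the observed variable explained by the factor, and θ_i is the part treated as measurement error or uniqueness.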
There are two main types of factor analysis: exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) (Byrne, 2009). The most salient difference between EFA and CFA lies in the way communalities are treated. Field (2005) explains that in CFA the communalities are initially assumed to be one, which implies that there is no error variance, because the total variance amongst the variables can be accounted for by the factors. Conversely, EFA assumes error variance and estimates the communalities.
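As a brief illustration of what a communality is, the sketch below computes it as the sum of a variable's squared loadings, with the remainder of the standardised variance treated as unique variance. This assumes standardised variables and uncorrelated (orthogonal) factors; the item names and loadings are hypothetical.

```python
def communality(row):
    """Sum of squared loadings for one standardised variable.

    Assumes the factors are uncorrelated (orthogonal).
    """
    return sum(value ** 2 for value in row)


# Hypothetical standardised loadings: one row per observed variable,
# one column per factor.
loadings = {
    "item_1": [0.80, 0.10],
    "item_2": [0.70, 0.20],
    "item_3": [0.15, 0.75],
}

for name, row in loadings.items():
    h2 = communality(row)
    print(f"{name}: communality={h2:.3f}, unique variance={1 - h2:.3f}")
```

A communality of one would leave no unique variance at all, which is the situation described above when no error variance is assumed.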
Exploratory factor analysis is primarily concerned with the identification of a structure amongst variables and is often used as a method to reduce data. Hair et al. (2006) summarise EFA quite well:
EFA explores the data and provides the researcher with information about how many factors are needed to best represent the data. With EFA, all measured variables are related to every factor by a factor loading estimate. Simple structure results when each measured variable loads highly on only one factor and has smaller loadings on other factors (i.e., loadings<.4) (p. 773).
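The simple-structure criterion in the quotation above can be checked mechanically. The following sketch uses the .4 cut-off mentioned by Hair et al. (2006) and treats a loading as high when its absolute value reaches that cut-off; the function name and both loading matrices are hypothetical.

```python
def has_simple_structure(loadings, cutoff=0.4):
    """Return True if every variable loads highly on exactly one factor.

    A loading counts as high when its absolute value is >= cutoff.
    """
    for row in loadings:
        salient = [value for value in row if abs(value) >= cutoff]
        if len(salient) != 1:
            return False
    return True


# Hypothetical loading matrices: rows are variables, columns are factors.
clean = [
    [0.82, 0.05],
    [0.76, 0.12],
    [0.10, 0.71],
    [0.08, 0.66],
]
cross_loaded = [
    [0.82, 0.05],
    [0.55, 0.48],  # loads highly on both factors
    [0.10, 0.71],
]

print(has_simple_structure(clean))
print(has_simple_structure(cross_loaded))
```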
No constraints are placed on the data when EFA is employed, unlike CFA, which often involves a priori constraints. In other words, CFA rests on preconceived notions about the data structure, derived from theory, previous research, or often both, while EFA rests solely on statistically derived factors. EFA will not be discussed further, as it was not used in this project, but the following paragraphs discuss CFA in more detail.
Confirmatory factor analysis (CFA) was performed in this research to test models of the factor structure of a scale that were proposed in advance. CFA is a powerful statistical technique for evaluating whether a self-report measure is doing what it is meant to do. The purpose of determining the factor structure of a model is to use this understanding to predict outcomes or to understand how they arise in the first place. Ultimately, determining the factor structure allows the researcher to assess whether the factors making up a scale measure what they are meant to measure.
CFA was also chosen as an appropriate statistical methodology for this research because it allows construct validity to be tested. Construct validity is one of the most important properties to investigate when assessing the reliability and validity of a scale, because it provides empirical evidence that the factors are measuring what they are meant to measure (Brown, 2006). Construct validity deals with measurement accuracy in that it determines “the extent to which a set of measured items actually reflects the theoretical latent construct those items are designed to measure” and once construct validity is established there is confidence that “item measures taken from a sample represent the actual true score that exists in the population” (Hair et al., 2006, p. 776).
Composite reliability was also established. The consistency amongst responses indicated that the response patterns were reliable, which was a further indication that the scale was doing what it was meant to do.
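This chapter does not state how composite reliability was calculated. One widely used formula, assumed here purely for illustration, takes the standardised loadings of the items on a single factor, treats each item's error variance as one minus its squared loading, and assumes uncorrelated errors. The function name and example loadings are hypothetical and are not results from this research.

```python
def composite_reliability(loadings):
    """Composite reliability from standardised loadings on one factor.

    Assumes uncorrelated errors, with each item's error variance taken
    as 1 - loading**2.
    """
    if any(abs(value) > 1 for value in loadings):
        raise ValueError("standardised loadings must lie between -1 and 1")
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - value ** 2 for value in loadings)
    return squared_sum / (squared_sum + error_variance)


# Hypothetical standardised loadings for a three-item factor.
example_loadings = [0.7, 0.7, 0.7]
print(f"composite reliability = {composite_reliability(example_loadings):.3f}")
```

Under this formula, higher and more uniform loadings yield a value closer to one.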
In conclusion, a scale was developed to investigate three types of victimisation experiences that were related but distinct. CFA was then used to investigate whether the scale was in fact doing what it was intended to do within this sample. The CFA results supported the chosen design.