Exploratory Factor Analysis - Quantitative Study

Chapter 2.0 Literature Review

3.5 Quantitative Study – Methods

3.5.1 Exploratory Factor Analysis

3.5.1.1 Sample Size

Although sample size is important in factor analysis, opinions vary, and several guiding rules of thumb are cited in the literature (Hogarty et al., 2005, pp. 8-10). The lack of agreement was noted by Hogarty et al. (p. 203), who observed that the disparity in sample size recommendations was not helpful to those carrying out research. Among the general guides, Tabachnick’s rule of thumb (Tabachnick & Fidell, 2007, pp. 86-89) suggests that at least 300 cases are needed for factor analysis. Hair, Anderson, Tatham, and Black (1999, p. 156) recommend that sample sizes be 100 or greater. A number of textbooks (Hogarty et al., 2005, pp. 8-10) cite Comrey and Lee's guide to sample sizes: 100 as poor, 200 as fair, 300 as good, 500 as very good, and 1000 or more as excellent. According to MacCallum et al. (1999, p. 88), such rules of thumb can at times be misleading and often do not take into account many of the complex dynamics of a factor analysis. Henson and Roberts (2006, p. 405) provide an example of this complexity: when communalities are high (> .60) and each factor is defined by several items, sample sizes can actually be relatively small. However, Reise, Comrey, and Waller (2000, p. 290) found that when communalities are low (e.g. when analysing items), the number of factors is large, and the number of indicators per factor is small, even a sample size of 500 may not be adequate.
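The Comrey and Lee bands quoted above can be expressed as a simple lookup. The following is a minimal sketch only: the function name is hypothetical, and treating each quoted figure as the lower bound of its band is an assumption made for illustration, since the guide as cited lists single values rather than ranges.

```python
def comrey_lee_rating(n: int) -> str:
    """Label a sample size using the Comrey and Lee bands quoted in the text.

    Assumption: each quoted figure is treated as the lower bound of its band.
    """
    bands = [
        (1000, "excellent"),
        (500, "very good"),
        (300, "good"),
        (200, "fair"),
        (100, "poor"),
    ]
    for threshold, label in bands:
        if n >= threshold:
            return label
    return "below the lowest band"


if __name__ == "__main__":
    for n in (80, 150, 320, 500, 1200):
        print(n, comrey_lee_rating(n))
```

As the surrounding discussion makes clear, a label of this kind ignores communalities and the number of indicators per factor, so it is at best a first orientation.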

3.5.1.2 Sample to Variable Ratio (N:p ratio)

There are also recommendations to guide researchers on how many participants are required for each variable. The sample-to-variable ratio is often denoted as the N:p ratio, where N refers to the number of participants and p refers to the number of variables (Hogarty et al., 2005, p. 224). Rules of thumb for this ratio include 3:1, 6:1, 10:1, 15:1, and 20:1 (Pett et al., 2003, pp. 8-10). Highlighting this ambiguity, Hogarty et al. (2005, p. 222) observed that their results showed no minimum level of N or N:p ratio that achieved good factor recovery across the conditions they examined. As can be seen, the suggested sample size required to complete a factor analysis of a group of items that participants have responded to varies greatly. MacCallum et al. (2002, p. 634), using factor analytic theory (MacCallum & Tucker, 1991), were able to show that it is impossible to derive a minimum sample size that is appropriate in all situations and that it may be more appropriate to limit the number of variables when exploring factorability. With regard to the extant literature on sample size, the researcher took a pragmatic view in keeping with the epistemological framework of the research design and aimed for at least 500 participants, thus reaching an acceptable level for most sample size criteria.
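The N:p arithmetic can be illustrated with a short sketch. The function names are hypothetical, and the figure of 40 variables is an assumed value chosen purely for illustration; it is not taken from the study.

```python
def n_to_p_ratio(n_participants: int, n_variables: int) -> float:
    """Return the sample-to-variable (N:p) ratio as participants per variable."""
    if n_variables <= 0:
        raise ValueError("n_variables must be positive")
    return n_participants / n_variables


def rules_met(n_participants: int, n_variables: int,
              ratios=(3, 6, 10, 15, 20)) -> dict:
    """Report which of the commonly cited N:p rules of thumb are satisfied."""
    ratio = n_to_p_ratio(n_participants, n_variables)
    return {f"{r}:1": ratio >= r for r in ratios}


if __name__ == "__main__":
    # 500 participants (the target stated in the text) and an assumed 40 items.
    print(n_to_p_ratio(500, 40))
    print(rules_met(500, 40))
```

With these assumed figures the ratio is 12.5 participants per variable, which satisfies the 3:1, 6:1, and 10:1 rules but not the 15:1 or 20:1 rules, showing how the verdict depends on which rule of thumb is adopted.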

3.5.1.3 Factorability of the Correlation Matrix

A correlation matrix, which displays the relationships between individual variables, should be used in the EFA process. Henson and Roberts (2006, p. 406) pointed out that the correlation matrix is the most popular choice among investigators. Tabachnick and Fidell recommended inspecting the correlation matrix (often termed the factorability of R) for correlation coefficients > .30. Hair et al. (1995, p. 88) categorised these loadings using another rule of thumb: ≥ .30 = minimal, ≥ .40 = important, and ≥ .50 = practically significant. If no correlations have values of ≥ .30, then the researcher should reconsider whether factor analysis is the appropriate statistical method to utilise (Hair et al., 1995, p. 88; Tabachnick & Fidell, 2005, p. 203).

The factorability of the data can also be assessed from the determinant of the correlation matrix. The determinant is a single value calculated from the values within a square matrix, revealing the presence or absence of possible linear combinations within the matrix. Except where the determinant is zero, the values can be arranged into linear combinations. In factor analysis, these linear combinations are considered factors, and a non-zero determinant indicates that a factor or component is mathematically possible. It does not, however, offer any indication of the practical meaning or significance of the factors. The values for the determinant of a correlation matrix range from 0 to 1 and are most often very small, suggesting that a few linear combinations exist (Pett et al., 2003).
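The two screening checks described above (the share of correlations at or above .30 and the determinant of the correlation matrix) can be sketched with NumPy. This is an illustrative sketch only: the function name is hypothetical, and the data are simulated from a single latent variable rather than drawn from the study.

```python
import numpy as np


def screen_correlation_matrix(data, cutoff=0.30):
    """Return the correlation matrix, the share of off-diagonal |r| >= cutoff,
    and the determinant of the correlation matrix.

    `data` is an (n_cases, n_variables) array.
    """
    r = np.corrcoef(data, rowvar=False)
    off_diagonal = r[~np.eye(r.shape[0], dtype=bool)]
    share = float(np.mean(np.abs(off_diagonal) >= cutoff))
    determinant = float(np.linalg.det(r))
    return r, share, determinant


if __name__ == "__main__":
    # Simulated example: six items driven by one latent variable plus noise.
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(500, 1))
    items = latent + 0.8 * rng.normal(size=(500, 6))
    r, share, determinant = screen_correlation_matrix(items)
    print(f"share of |r| >= .30: {share:.2f}")
    print(f"determinant: {determinant:.4f}")
```

A share of zero would correspond to the situation in which factor analysis should be reconsidered, while a determinant close to (but not equal to) zero is the typical case for substantially intercorrelated items.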

3.5.1.4 Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy/Bartlett's Test of Sphericity

Prior to the extraction of the factors, several tests should be used to assess the suitability of the respondent data for factor analysis. These tests include the Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy (Kaiser, 1970, pp. 401-415; Kaiser, 1974, pp. 111-117) and Bartlett's Test of Sphericity (Bartlett, 1950, pp. 177-185). The KMO index, in particular, is recommended when the cases-to-variable ratio is less than 1:5. The KMO index ranges from 0 to 1, with .50 or above considered suitable for factor analysis (Kaiser, 1974, pp. 111-117). Bartlett's Test of Sphericity should be significant (p < .05) for factor analysis to be suitable (Bartlett, 1950, pp. 177-185).
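Both statistics can be computed directly from a correlation matrix. The sketch below uses the standard textbook formulas (the overall KMO index as the ratio of squared correlations to squared correlations plus squared partial correlations, and Bartlett's chi-square approximation); the function names are hypothetical, and the equicorrelated matrix and sample size of 200 are assumed values for illustration.

```python
import numpy as np


def kmo_index(r):
    """Overall KMO: sum of squared correlations divided by the sum of squared
    correlations plus squared partial correlations (off-diagonal only)."""
    inverse = np.linalg.inv(r)
    scale = np.sqrt(np.outer(np.diag(inverse), np.diag(inverse)))
    partial = -inverse / scale  # partial (anti-image) correlations
    mask = ~np.eye(r.shape[0], dtype=bool)
    r_squared = np.sum(r[mask] ** 2)
    partial_squared = np.sum(partial[mask] ** 2)
    return float(r_squared / (r_squared + partial_squared))


def bartlett_sphericity(r, n_cases):
    """Bartlett's chi-square statistic and degrees of freedom.

    The p-value is the upper tail of a chi-square distribution with `df`
    degrees of freedom (e.g. scipy.stats.chi2.sf, not imported here).
    """
    p = r.shape[0]
    _, log_determinant = np.linalg.slogdet(r)
    statistic = -(n_cases - 1 - (2 * p + 5) / 6) * log_determinant
    df = p * (p - 1) // 2
    return float(statistic), df


if __name__ == "__main__":
    # Assumed example: four variables, all pairwise correlations equal to .50.
    r = np.full((4, 4), 0.5)
    np.fill_diagonal(r, 1.0)
    print(kmo_index(r))
    print(bartlett_sphericity(r, n_cases=200))
```

The p-value is deliberately left out so that the sketch depends only on NumPy; in practice a statistics package would report it alongside the statistic.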

3.5.1.5 Initial Factor Extraction

literature (Costello & Osborne, 2005, p. 1; Pett et al., 2003, p. 9). Fabrigar et al. (1999, p. 277) argued that if data are relatively normally distributed, PCA is the best choice because it allows for the computation of a wide range of indexes of the goodness of fit of the model. The benefit of PAF, however, is that it does not require distributional assumptions to be met and can therefore be used to analyse data that are not normally distributed. PCA is also recommended when no a priori theory or model exists (Williams et al., 2010, p. 6). Pett et al. (2003, p. 129) suggested using PCA in establishing preliminary solutions in EFA.

The aim of data extraction is to reduce a large number of items into factors. Several criteria are available to researchers to produce scale unidimensionality and simplify the factor solutions. However, given the range of choices and the sometimes confusing nature of factor analysis, no single criterion should be assumed to determine factor extraction (Costello & Osborne, 2005, p. 2). Whilst many rules can be used to determine the number of factors to retain, the two most commonly used are the eigenvalue > 1 rule (Kaiser, 1960) and the scree test (Cattell, 1966). According to Thompson and Daniel (1996, p. 200), the most frequently used method is the eigenvalue (EV) > 1 rule, as it is the default option in most statistics packages; however, Costello and Osborne (2005, p. 3) recommend the scree test. Costello and Osborne (2005) found that the EV > 1 rule overestimated the number of factors, but this was contrary to the findings of Fabrigar et al. (1999, p. 278), Henson and Roberts (2006, p. 398), and Schonrock-Adema et al. (2009, p. 227), who noted that the EV > 1 rule may underestimate the number of factors. Because the factor retention decision directly affects the EFA results obtained, Henson and Roberts (2006, p. 399) and Schonrock-Adema et al. (2009, p. 228) advise researchers to use both multiple criteria and reasoned reflection. Researchers should also explicitly inform readers about the strategies used in making factor retention decisions. In light of the above research and recommendations, both the EV > 1 rule and the scree test were selected for the current study and will be used for factor extraction and for comparison between the two methods.
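The two retention criteria can be illustrated from the eigenvalues of a correlation matrix. In this minimal sketch the function names are hypothetical and the correlation matrix is an assumed example (two independent blocks of three items, each with within-block correlations of .60), not data from the study.

```python
import numpy as np


def eigenvalues_descending(r):
    """Eigenvalues of a symmetric correlation matrix, largest first."""
    return np.sort(np.linalg.eigvalsh(r))[::-1]


def kaiser_count(r):
    """Number of factors retained under the eigenvalue > 1 rule."""
    return int(np.sum(eigenvalues_descending(r) > 1.0))


def scree_drops(r):
    """Differences between successive eigenvalues; a scree plot is inspected
    for the point at which these drops level off."""
    values = eigenvalues_descending(r)
    return values[:-1] - values[1:]


if __name__ == "__main__":
    block = np.full((3, 3), 0.6)
    np.fill_diagonal(block, 1.0)
    zeros = np.zeros((3, 3))
    r = np.block([[block, zeros], [zeros, block]])
    print(eigenvalues_descending(r))
    print(kaiser_count(r))
    print(scree_drops(r))
```

In this constructed case both criteria point to two factors, since two eigenvalues stand well above the rest. With real data the two criteria can disagree, which is why the use of multiple criteria and reasoned reflection is advised.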

3.5.1.6 Selection of Rotational Methods

Rotation maximises high item loadings and minimises low item loadings, thereby producing a more interpretable and simplified solution. There are two common rotation methods: orthogonal rotation and oblique rotation (Beavers et al., 2013, p. 10).

Regardless of which rotation method is used, the main objectives are to provide easier interpretation of results and to produce a more parsimonious solution (Hair et al., 1995; Kieffer, 1999, p. 78). There are several specific types to choose from for both rotation options, for example, varimax or quartimax (orthogonal) and oblimin or promax (oblique). Orthogonal rotations produce factors that are uncorrelated; oblique methods allow the factors to correlate (Henson & Roberts, 2006, p. 400). Costello and Osborne (2005, p. 3) observe that conventional wisdom steers researchers towards orthogonal rotation because it produces more easily interpretable results; however, this may be a flawed argument. According to Fabrigar et al. (1999, p. 282), there is a general expectation that research carried out in the social sciences will produce correlation among factors. Orthogonal rotation can result in a loss of valuable information if the factors are correlated, and oblique rotation should therefore theoretically render a more accurate, and perhaps more reproducible, solution (p. 283). Oblique rotation allows factors to be correlated, which is often seen as producing more accurate results for research involving human behaviours, or when data do not meet a priori assumptions (Costello & Osborne, 2005, p. 3). For this reason, an oblique rotation was selected for this study.

3.5.1.7 Interpretation

variables must load on a factor so it can be given a meaningful interpretation (Beavers et al., 2013, p. 11). Appropriate interpretation, then, must invoke both the factor pattern and factor structure matrices (Henson & Roberts, 2006, p. 400). Henson and Roberts also note that the meaningfulness of latent factors is ultimately dependent on researcher definition (p. 396). Pett et al. (2003, p. 207) agree with this point and suggest that the labelling of factors can be a subjective, theoretical, and inductive process. A thorough and systematic factor analysis must be undertaken in order to isolate items with high loadings in the resultant pattern matrices. This produces those factors that, taken together, explain the majority of the responses. When the researcher is content with these factors, they should be operationalised by explaining what each factor represents and then descriptively labelled. It is important that these labels or constructs reflect the theoretical and conceptual intent (Beavers et al., 2013, p. 11). EFA is a complex multivariate statistical approach involving many linear and sequential steps. In addition, many options and rules of thumb apply to EFA, which emphasises that clear decision sequencing and protocols are paramount in each investigation. The resultant factor structure may represent a truly exploratory investigation of the data, or it may represent an a priori model.
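The reason both the pattern and structure matrices matter under oblique rotation can be shown with a small numerical sketch. It relies on the standard relationship that the structure matrix equals the pattern matrix multiplied by the factor correlation matrix. The function name is hypothetical, and the loadings and the factor correlation of .40 are assumed values for illustration, not results from the study.

```python
import numpy as np


def structure_matrix(pattern, factor_correlations):
    """Structure matrix (item-factor correlations) obtained as the pattern
    matrix post-multiplied by the factor correlation matrix."""
    return np.asarray(pattern) @ np.asarray(factor_correlations)


if __name__ == "__main__":
    # Assumed pattern loadings for four items on two factors.
    pattern = np.array([
        [0.8, 0.0],
        [0.7, 0.1],
        [0.0, 0.9],
        [0.1, 0.6],
    ])
    # Assumed correlation of .40 between the two factors.
    phi = np.array([[1.0, 0.4],
                    [0.4, 1.0]])
    print(structure_matrix(pattern, phi))
```

In this example the first item has a pattern loading of zero on the second factor yet still correlates with it (.32) because the factors themselves are correlated. When the factors are uncorrelated, the factor correlation matrix is the identity and the two matrices coincide, which is why the distinction only arises for oblique solutions.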

3.5.2 Hierarchical Multiple Regression

In order to test whether the a priori model can predict perceived community efficacy, it is necessary to use multiple regression analysis. That is to say, the factors representing the model should predict the criterion factor. In multiple regression analysis, two or more variables are used to predict one other variable. For instance, two independent variables may be selected to predict a dependent variable. It is called multiple regression because the analysis simultaneously uses multiple predictor variables (Dewberry, 2004, p. 247). Multiple regression can be used in a number of

In document The utility of perceived community efficacy in emergency preparedness (Page 75-80)