Data collection and analysis - Creation and validation of the SLCS

Chapter 4 Construction and validation of the survey instrument

4.3 Creation and validation of the SLCS

4.3.5 Data collection and analysis

Acting on the positive feedback from the second pilot, the new survey with three measurement scales (the SLCS, BFI and SSEIT) was uploaded online using Qualtrics. In addition to the three scales, questions which allowed participants to provide demographic information such as gender, age, and years of experience were listed. School size and location were also part of the demographic information requested. This information would allow the researcher to compare survey data across multiple demographic groups.

Participant recruitment and responses

Participants were recruited via email from a current list of 2,000 NSW schools obtained from the website of the NSW Department of Education. Target participants included classroom teachers, team leaders (e.g. year advisors), assistant principals and head teachers, deputy principals and principals from primary and secondary schools in the NSW public school system from urban, regional and rural areas. School leaders from SSPs were also invited. Around 250–300 emails were sent each weekend to different schools in NSW, totalling 2,000 invitations in 2 months. As responses from high schools were low at the beginning of recruitment, 1,000 emails were re-sent, making a total of 3,000 emails sent in 15 weeks. Participation rates increased toward the end of the last school term of 2016, as many teachers were less busy during this time.

A total of 305 responses were collected by the end of 2016, and two were found to have missing data. Therefore, the total number of useful responses was N=303. This sample was used to validate the survey instrument. A second round of recruitment was subsequently conducted between May and July 2017 to increase the sample size to 400, because a sample size of 400 is commonly assumed to give a confidence interval of ±5% (Simon & Goes, 2012). In the second round of recruitment, another 100 responses were collected. Participant data and results from the survey are detailed in Chapter 6.
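The ±5% figure can be checked with the usual normal-approximation margin of error for a proportion. The sketch below is illustrative only: it assumes a 95% confidence level (z = 1.96), the most conservative proportion (p = 0.5), simple random sampling and a large population. None of these assumptions is stated in the source, and the function name is hypothetical.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a normal-approximation confidence interval for a proportion.

    p=0.5 gives the widest (most conservative) interval;
    z=1.96 corresponds to a 95% confidence level (assumed here).
    """
    return z * math.sqrt(p * (1 - p) / n)

# Compare the validation sample (N=303) with the target sample size of 400.
for n in (303, 400):
    print(n, round(margin_of_error(n), 3))
```

Under these assumptions, a sample of 400 gives a margin of roughly ±4.9 percentage points, while the validation sample of 303 gives roughly ±5.6 percentage points.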

Statistical analysis of data obtained from the online survey

Having established face and content validity in the previous processes, step five of the process involved the establishment of the construct validity of this scale. Construct validity refers to the degree to which an investigation or measurement assesses the fundamental theoretical construct it aimed to measure. This section discusses the aim, hypothesis, steps and results of the data analysis process to confirm the construct validity of the online survey to measure the self-perceived strengths of school leaders.

The aim of this work was to demonstrate that the scale that had been designed offers a valid measurement of the constructs it purports to measure. The use of SEM, a confirmatory technique, is one of the most common ways to test the adequacy of a model. This involves the generation of an a priori hypothesised factorial structure (i.e. the configuration of factor loadings, variances, covariances and unique errors) of the instrument. To support this work, two hypotheses were generated:

Hypothesis 1: CFA will demonstrate the SLCS as a model a priori with a five-factor structure which measures five levels of leadership.

Hypothesis 2: A test of reliability will demonstrate reliability scores for each of the five sub-scales in the SLCS (Leading Self, Leading Others, Leading Other Leaders, Leading the Organisation, and Leading the Community).

EFA was the first step taken to explore and summarise the underlying correlational structure of the dataset collected from the online survey. This was followed by the use of CFA to test the correlational structure of the dataset against a hypothesised structure and to rate the measurement scale’s goodness of fit. An assessment of reliability was then conducted, along with a test of internal consistency using Cronbach’s alpha and correlation coefficient. The final step was an assessment of construct validity using convergent and discriminant validity testing. Each of these analyses is described in turn in the following sections.


Factor analysis

Factor analysis is an interdependence technique used to define the underlying structure among the variables being analysed. As previously discussed, the two common approaches to factor analysis are EFA and CFA.

The factor analysis process involved:

1. Screening data to ensure no missing values, resulting in 303 useful data sets out of 305 collected data sets (N=303).

2. Using Kaiser-Meyer-Olkin (KMO) to measure how suited the data from the survey was for factor analysis; this test showed 0.956 sample adequacy (indicating high adequacy).

3. Using Bartlett’s Test of Sphericity to show the suitability of the data for factor analysis.

4. Calculating an item-total correlation or inter-item correlation for each item, for reliability. The inter-item correlation matrix found a positive correlation in all sets.

5. Confirming that each item shared certain common variances with other items, via communalities that showed all items above 0.30.

All these steps are illustrated in Appendices 2.3, 2.4 and 2.5.
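To make step 3 concrete, the following is a minimal sketch of how Bartlett's test statistic can be computed from a correlation matrix. It uses only the Python standard library. The function names and the small 3 × 3 matrix are hypothetical illustrations and are not taken from the survey data or from the software used in this study.

```python
import math

def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def bartlett_sphericity(corr, n_obs):
    """Return (chi_square, degrees_of_freedom) for Bartlett's test.

    Tests whether the correlation matrix differs from an identity matrix.
    Assumes the matrix is non-singular (determinant > 0).
    """
    p = len(corr)
    chi_square = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(determinant(corr))
    df = p * (p - 1) // 2
    return chi_square, df

# Hypothetical correlations among three items, with N=303 respondents.
example = [[1.0, 0.6, 0.5],
           [0.6, 1.0, 0.4],
           [0.5, 0.4, 1.0]]
print(bartlett_sphericity(example, 303))
```

A large chi-square relative to its degrees of freedom indicates that the items are sufficiently inter-correlated for factor analysis to be meaningful.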

Given these comprehensive indicators, a factor analysis was conducted, resulting in five extracted factors, as illustrated in the scree plot in Figure 4.2 below. Factor 1 stands out, as it has an eigenvalue of over 20; the four following factors have eigenvalues higher than 1. Beyond these five factors, the line is almost flat, showing that each successive factor accounts for a decreasing amount of the total variance. The exceptionally high value found for Factor 1 may be due to the high rating of strengths in Capability Set 1: Leading Self, indicating a general perception of high competency in the capability items listed in this set.
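The values on a scree plot can be read as shares of total variance. The sketch below assumes a principal component extraction on the correlation matrix of the 40 SLCS items, so that total variance equals the number of items. The function names and the list of eigenvalues are hypothetical and are not the values plotted in Figure 4.2.

```python
def variance_explained(eigenvalue, n_items):
    """Proportion of total variance attributable to one component.

    Assumes standardised items, so total variance equals n_items.
    """
    return eigenvalue / n_items

def kaiser_retained(eigenvalues):
    """Components retained under the eigenvalue-greater-than-one rule."""
    return [ev for ev in eigenvalues if ev > 1.0]

# A value of 20 on a 40-item scale corresponds to half of the total variance.
print(variance_explained(20.0, 40))

# Illustrative eigenvalues only (NOT taken from Figure 4.2).
illustrative = [20.5, 2.1, 1.6, 1.3, 1.1, 0.8, 0.7]
print(len(kaiser_retained(illustrative)))
```

On this reading, a first factor with a value above 20 would account for more than half of the variance in the 40 items, which is consistent with it standing out so clearly on the plot.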


Factor rotation

Factor rotation is a function that shows the pattern of loadings where each item loads strongly on one of the factors, and much more weakly on the other factors. In this study, a Promax rotation (oblique) was used as it gives a simple structure which increases the interpretability of factors. It was also used because this method computes faster with large datasets (Field, 2011). The pattern matrix generated from the factor rotation indicated that Sets 1, 2, and 3 all loaded well as individual factors, while in Set 4 (Leading the Organisation), items 1, 2, 3 and 4, and in Set 5 (Leading the Community) items 1, 7 and 8, loaded as one factor. At the same time, for Set 5, items 2, 3, 4, 5 and 6 loaded strongly with items 5, 6, 7 and 8 in Set 4 as another factor (Table 4.6).

Table 4.6 Pattern Matrix

Extraction Method: Principal Component Analysis. Rotation Method: Promax with Kaiser Normalisation.

With the factor loadings extracted from EFA, the next step taken was to use CFA to test the correlational structure of the data set against the hypothesised structure, and test the goodness of fit of the scale.

Confirmatory factor analysis

CFA is a statistical tool for examining the nature of and relations among latent constructs (e.g. leadership capability in leading self; leading others). It tests the correlational structure of a data set against a hypothesised structure. With the use of SEM and CFA, this structure can be statistically tested.

In CFA, the researcher postulates an a priori model which outlines a set of relations between the observed indicators (survey items) and the underlying unobserved construct, referred to as a latent factor in CFA (Byrne, 2010). Figure 4.3 is a path model which depicts the hypothesised structure of the SLCS to be tested. This model hypothesised that collected data will generate a 5-factor model in which factors are inter-correlated, and will produce a good model fit.

As can be seen from Figure 4.3, the SLCS was hypothesised to be a multidimensional measure of five levels of leadership capabilities. There were forty observed indicators, shown as the small rectangles on the right of each group of factors. Each of the five latent constructs (factors) is enclosed in an oval (LS: Leading Self; LO: Leading Others; LOL: Leading Other Leaders; LORG: Leading the Organisation; and LC: Leading the Community). A straight line from the factor to the indicator marks the effect of a latent factor on an observed indicator (i.e. a factor loading). Each of the five latent factors has eight indicators; therefore, there are forty such loadings. The covariances among latent factors are marked with curved arrows. The model represents the configuration of factor loadings, factor variances, and unique errors in the measured variables. In CFA it is assumed that variation in the observed item scores is due to the underlying latent factor plus unique measurement error.
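The size of this model can be summarised by counting its parameters. The sketch below is a hedged illustration: it assumes each item loads on exactly one factor, that there are no correlated errors, and that factor variances are fixed to 1 for identification. The source does not state these details, and the function name is hypothetical.

```python
def cfa_degrees_of_freedom(n_factors, items_per_factor):
    """Count observed moments, free parameters and degrees of freedom.

    Assumptions (not stated in the source): simple structure, no
    correlated errors, factor variances fixed to 1 for identification.
    """
    n_items = n_factors * items_per_factor
    # Unique variances and covariances among the observed items.
    observed_moments = n_items * (n_items + 1) // 2
    loadings = n_items            # one loading per item
    error_variances = n_items     # one unique error per item
    factor_covariances = n_factors * (n_factors - 1) // 2
    free_parameters = loadings + error_variances + factor_covariances
    return observed_moments, free_parameters, observed_moments - free_parameters

# Five factors with eight indicators each, as in the hypothesised SLCS model.
print(cfa_degrees_of_freedom(5, 8))
```

Under these assumptions the model estimates 90 parameters from 820 observed variances and covariances, leaving 730 degrees of freedom. Fixing one loading per factor instead of the factor variance gives the same count.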

Figure 4.3 Path model of a hypothesised multidimensional structure of the SLCS

Having specified the model, the next step was to evaluate how closely the model represents the relations observed in the collected data, a process referred to as model fitting. Model fit refers to the ability of a model to reproduce the data, generally the variance-covariance matrix. A good-fitting model is one that is relatively consistent with the data and does not necessarily require respecification (Schumacker & Lomax, 2004). A chi-square test is usually used in the model fit assessment. If the chi-square is not significant, the model is regarded as acceptable. However, some researchers, such as Brown (2006) and Kline (1998), have found that while the chi-square test is generally a reasonable measure of fit for models with about 75 to 200 cases, large sample sizes have consistently posed problems for significance tests based on chi-square statistics. For models with more cases (400 or more), the chi-square is almost always statistically significant. Chi-square is also affected by the size of the correlations in the model: the larger the correlations, the poorer the fit. To test model fit, statisticians therefore recommend that a range of indices be used and point out that fit indices vary greatly in their sensitivity to sample size and reliability of estimation (Brown, 2006; Browne & Cudeck, 1993; Kline, 1998).

Consistent with current practice in model fit assessment, the root mean square error of approximation (RMSEA) was also used to evaluate fit. A value of 0.05 or less indicates a good fit, and values higher than 0.08 would be considered not acceptable (Browne & Cudeck, 1993). Another model fit index, the Comparative Fit Index or CFI (Bentler, 1990), was also used to evaluate the model fit for this study. A CFI value greater than .90 is indicative of a good model fit (Byrne, 2010).
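Both indices are derived from the model chi-square, which is why they are less sensitive to sample size than the chi-square test itself. The following sketch shows the commonly cited formulas. It is an illustration rather than the computation performed by the SEM software used in this study; some programs divide by N rather than N − 1 in the RMSEA. The function names and the chi-square values are hypothetical.

```python
import math

def rmsea(chi_square, df, n):
    """Root mean square error of approximation.

    Uses the (N - 1) form of the formula; software implementations vary.
    """
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))

def cfi(chi_model, df_model, chi_baseline, df_baseline):
    """Comparative Fit Index: improvement over a baseline (null) model."""
    d_model = max(chi_model - df_model, 0.0)
    d_baseline = max(chi_baseline - df_baseline, 0.0)
    denominator = max(d_model, d_baseline)
    if denominator == 0.0:
        return 1.0  # neither model shows any misfit
    return 1.0 - d_model / denominator

# Hypothetical chi-square values, NOT the results of this study.
print(round(rmsea(chi_square=200.0, df=100, n=101), 3))
print(round(cfi(200.0, 100, 1100.0, 100), 3))
```

The RMSEA expresses misfit per degree of freedom and per respondent, while the CFI expresses how much of the baseline model's misfit the hypothesised model removes.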

Using the data collected for all five leadership capability sets from the SLCS, a five-factor model was generated by CFA. This model shows strong correlations amongst all five factors (Figure 4.4). A U-shaped visual display of the model is used instead of the vertical one shown in the a priori model (Figure 4.3) to enhance the clarity of the connections between the five factors. This five-factor model provided an acceptable fit to the data, with RMSEA = 0.075 and CFI = 0.95. The factor loadings (Table 4.7) indicated that all five factors were well defined. Each factor loading was statistically significant and substantial in size (range = .56 to .93; mean = .78; median = .78).

As illustrated in the pattern matrix (Table 4.6), the factor rotation indicated that Set 4 and Set 5 did not each group into a single distinct factor. A second model was therefore created to test for model fit using data from the EFA, which detected a split in factors for Capability Set 4 (LORG) and Set 5 (LC).

Factors 1–3 remained the same as LS, LO and LOL, while Factor 4 in the new model comprised LORG 1, LORG 2, LORG 3, LORG 4, LC 1, LC 7 and LC 8, a total of seven items. Factor 5 comprised LORG 5, LORG 6, LORG 7, LORG 8, LC 2, LC 3, LC 4, LC 5 and LC 6, a total of nine items.

This second model (Model 2) also showed strong correlations amongst all five factors, as illustrated in Figure 4.5. It also provided an acceptable fit for the data, with RMSEA = 0.075 and CFI = 0.96. The factor loadings (Table 4.8) indicate that all five factors are well defined, despite having an unequal number of items in Factors 4 and 5. Each factor loading was also statistically significant and substantial in size (range = .68 to .95; mean = .76; median = .77). However, as there was very limited difference in model fit between the two models, Model 1, the original model, was retained for use in future studies.

Table 4.8 CFA results for Model 2

Results from factor analysis

EFA was successfully used to explore and summarise the underlying correlational structure for the data set collected from the online survey on how school leaders perceived their strengths using the SLCS. Through CFA, the correlational structure of the data set was shown to support Hypothesis 1 set for this test, demonstrating a five-factor structure for the SLCS and providing an acceptable model fit.

4.3.5.3 Reliability

Reliability is a measure of the internal consistency of the construct indicators, depicting the degree to which they reflect the same latent variable (Hair, Anderson, Tatham, & Black, 1998). Reliability makes researchers more confident that the individual indicators are all consistent in their measurements. Cronbach’s alpha, which estimates the reliability of test scores, is commonly used as a measure of internal consistency (Field, 2011).
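As an illustration of what Cronbach's alpha measures, the sketch below computes it from raw item scores using the standard formula based on item variances and the variance of the total score. The function name and the small data set are hypothetical and do not come from the survey responses.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents.

    scores: one list of item ratings per respondent, all of equal length.
    """
    k = len(scores[0])  # number of items
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical ratings from five respondents on three items.
ratings = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]
print(round(cronbach_alpha(ratings), 3))
```

Alpha approaches 1 when respondents who rate one item highly also rate the other items highly, and falls toward 0 when the items vary independently of one another.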

Reliability statistics in this survey showed a Cronbach’s alpha of .973, a very high score, for the SLCS, which indicated strong internal consistency and reliability. Likewise, the combined scales of SLCS and BFI showed a Cronbach’s alpha of 0.943, and the combined SLCS and SSEIT scales showed a Cronbach’s alpha of 0.942, indicating strong reliability for the data collected.

Convergent and divergent (discriminant) validity

Convergent validity and discriminant validity are commonly viewed as subsets of construct validity. Convergent validity tests whether constructs that are expected to be related are actually related, while discriminant validity (or divergent validity) is commonly used to determine the external consistency of the measurement model, and tests that constructs that should have no relationship do not have any relationship (Domino & Domino, 2006). In statistics, the correlation coefficient ‘r’ measures the strength and direction of a linear relationship between two variables on a scatter plot. A score of 0.3 indicates a weak positive linear relationship, 0.5 indicates a moderate linear relationship and 0.7 indicates a strong linear relationship, while +1 indicates a perfect relationship.
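The correlation coefficient and the verbal labels above can be expressed in a few lines of code. In this sketch the thresholds 0.3, 0.5 and 0.7 are taken from the text, but treating them as lower bounds of each band, and labelling anything below 0.3 as "negligible", are assumptions made for illustration. The function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / math.sqrt(ss_x * ss_y)

def describe_strength(r):
    """Label the size of r using the 0.3 / 0.5 / 0.7 bands (assumed lower bounds)."""
    size = abs(r)
    if size >= 0.7:
        return "strong"
    if size >= 0.5:
        return "moderate"
    if size >= 0.3:
        return "weak"
    return "negligible"  # label assumed; the text does not name this band

# The two coefficients reported in this section.
for r in (0.321, 0.201):
    print(r, describe_strength(r))
```

Applied to the coefficients reported in this section, r = .321 falls in the weak band and r = .201 falls below it.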

In this study, the correlation coefficient between the SLCS and the BFI personality scale was calculated to test convergent validity. Results show r = .321 and p < 0.01. This indicates that there is a weak but acceptable linear relationship between the SLCS and the BFI (John & Srivastava, 1999). The scatter plot (Figure 4.6) shows a moderate convergence of the factors with numerous outliers. It shows there are some similarities between the two scales, but that they do not measure the same things.

Conversely, the correlation coefficient between the SLCS and SSEIT scales showed the results r = .201 and p < 0.01. This supports divergent validity: that the two scales have no or little relationship and do not measure the same constructs. The scatter plot (Figure 4.7) shows that some factors do converge, but the majority are scattered.

Results from the correlation tests and the significance of the scales, together with the two scatter plots, support the convergent and divergent validity of the survey scale constructed for this study. The results support Hypothesis 2, which states that:

H2: A test of reliability will demonstrate reliability scores for each of the five sub-scales in the SLCS (Leading Self, Leading Others, Leading Other Leaders, Leading the Organisation, and Leading the Community).

In document A School Leadership Pipeline Model : a systemic and holistic model for school leadership development (Page 115-129)