Analytic steps in the development of the measurement and structural models

CHAPTER 3: RESEARCH DESIGN AND METHODS

3.6 Analytic Techniques

3.6.4 Analytic steps in the development of the measurement and structural models

Recommended SEM steps for primary researchers as described by Kline (2011) were reviewed and adjusted to reflect the nature of the current research as secondary data analysis. These steps are outlined here in general terms, and the results of each are reported in the following results chapters. This process applies to the CFAs, SEMs and path analyses that were conducted. The analysis steps were: thorough data screening; model specification; estimation of the model with evaluation of fit indices and parameter estimates; and re-specification of the model as required.

Data screening included the examination of included variables in relation to outliers, normality assumptions and multicollinearity issues. General guidelines identified in the literature are that kurtosis scores should have an absolute value no greater than ten and that skewness scores should have an absolute value no greater than three (Kline, 2011). With very few exceptions, the variables included in this study met these criteria. To further ensure the validity of the results and to also account for the ordinal categorical nature of self-regulation indicators, the WLSMV estimator was chosen. This method of estimation provides “weighted least square parameter estimates using a diagonal weight matrix with standard errors and mean- and variance-adjusted chi-square test statistic that use a full weight matrix” (Muthén & Muthén, 1998-2012, p. 533). This estimator has been recommended where analyses use data that are categorical or ordinal in nature (Brown, 2006).
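The skewness and kurtosis guidelines above can be illustrated with a minimal sketch. It assumes moment-based (population) formulas and an excess-kurtosis scale on which a normal distribution scores zero; the function names and example data are hypothetical and are not taken from the study or from any statistical package.

```python
def skewness(values):
    """Third standardised moment (population formula); assumes non-constant data."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def excess_kurtosis(values):
    """Fourth standardised moment minus 3, so a normal distribution scores 0."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


def screen(variables, max_abs_skew=3.0, max_abs_kurtosis=10.0):
    """Return the names of variables that exceed either guideline."""
    flagged = []
    for name, values in variables.items():
        too_skewed = abs(skewness(values)) > max_abs_skew
        too_peaked = abs(excess_kurtosis(values)) > max_abs_kurtosis
        if too_skewed or too_peaked:
            flagged.append(name)
    return flagged


# Hypothetical data: one symmetric variable and one dominated by a single extreme score.
example = {"symmetric": [1, 2, 3, 4, 5], "extreme": [0] * 99 + [100]}
flagged_variables = screen(example)
```

Statistical packages often apply small-sample corrections to these indices, so values from such software may differ slightly from the ones computed here.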

Models were specified based on the research to date and theoretical considerations discussed in Chapter 2, and were further constrained by the variables available in the dataset. A priori decisions on how model fit would be evaluated were made following extensive reading of the SEM literature including model fit discussion papers (McDonald & Ho, 2002; Schreiber et al., 2006) and recently published longitudinal developmental research in leading journals.

The chi-square statistic provides a null hypothesis significance test of exact fit. That is, it tests the discrepancy between the relationships implied by the hypothesised model and those observed in the real-world data to which it is fitted. If the chi-square test is non-significant (i.e., p > .05), the model is considered a good fit for the data. However, it has been noted that the conditions for this test statistic to follow the precise chi-square distribution will rarely be met in real-world research (Bentler, 2007) and that with large sample sizes, it is unlikely that a non-significant chi-square test will be achieved (Byrne, 2012).

Therefore, a range of other fit indices have been developed and were used to assess the model fit, along with the other information provided by the model estimation output (Bentler, 2007).

Model fit was considered using the Tucker-Lewis Index (TLI), Comparative Fit Index (CFI), root mean square error of approximation (RMSEA), and weighted root mean square residual (WRMR). Both the TLI (Tucker & Lewis, 1973) and the CFI (Bentler, 1990) are incremental fit indices, in which the fit of the specified model is compared with that of a null model in which there are no structural relationships between the variables tested. Cut-off criteria of values close to or higher than 0.95 have been suggested for both the TLI and CFI when using continuous data (Hu & Bentler, 1999). More recently, a cut-off value of higher than 0.96 for the CFI with sample sizes over 250 and for categorical data has been recommended (Yu, 2002).
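To make the comparison with the null model concrete, the following sketch applies the usual definitional formulas for the CFI and TLI to a pair of chi-square values. The chi-square values and degrees of freedom are hypothetical. In practice these indices are reported directly by the estimation software, and under WLSMV the chi-square values are adjusted, so the sketch illustrates the logic of the indices and not the software's exact computation.

```python
def cfi(chi2_model, df_model, chi2_null, df_null):
    """Comparative Fit Index from model and null-model chi-square values."""
    # Non-centrality of the specified model, floored at zero.
    d_model = max(chi2_model - df_model, 0.0)
    # The denominator is the largest of the two non-centralities and zero.
    d_null = max(chi2_null - df_null, d_model)
    if d_null == 0.0:
        return 1.0
    return 1.0 - d_model / d_null


def tli(chi2_model, df_model, chi2_null, df_null):
    """Tucker-Lewis Index: compares chi-square/df ratios of the two models."""
    ratio_null = chi2_null / df_null
    ratio_model = chi2_model / df_model
    return (ratio_null - ratio_model) / (ratio_null - 1.0)


# Hypothetical values, not results from the study.
example_cfi = cfi(120.0, 80, 2080.0, 100)
example_tli = tli(120.0, 80, 2080.0, 100)
```

Unlike the CFI, the TLI is not bounded at 1 and penalises model complexity through the chi-square/df ratios, which is why the two indices can diverge for the same model.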

The RMSEA is an absolute fit index which is sensitive to the number of parameters estimated in the model (Steiger, 1998). It has a known distribution that permits the calculation of confidence intervals. Hu and Bentler (1999) have recommended a cut-off value for RMSEA of close to or lower than .06. The WRMR measures the (weighted) average differences between the sample and estimated population variances and covariances.
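The sensitivity of the RMSEA to model complexity can be seen in its common point-estimate formula, sketched below. The sketch assumes the single-group form with N - 1 in the denominator (some programs use N instead), and the input values are hypothetical.

```python
import math


def rmsea(chi2_model, df_model, n):
    """RMSEA point estimate: excess chi-square per degree of freedom per case."""
    excess = max(chi2_model - df_model, 0.0)
    return math.sqrt(excess / (df_model * (n - 1)))


# Hypothetical values, not results from the study.
example_rmsea = rmsea(120.0, 80, 501)
```

Because the degrees of freedom appear in the denominator, a more parsimonious model with the same excess chi-square obtains a lower RMSEA.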

The WRMR has been proposed as useful when sample statistics are on different scales, such as in the current study, and is also suitable for non-normal data. The recommended cut-off value for WRMR of close to or lower than 1.0 has been found to perform well in CFA models, although using this value makes it more likely that models with trivial misspecification of factor covariance may be rejected (Yu, 2002). WRMR values that were close to 1.0 but at times greater than 1.0 were therefore accepted in the current study.
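The cut-off criteria described above could be applied together as in the following sketch. It treats "close to" as a hard threshold for the CFI, TLI and RMSEA, which is stricter than the wording of the guidelines, and it introduces a hypothetical `wrmr_tolerance` parameter to represent the acceptance of WRMR values slightly above 1.0. The tolerance of 0.1 is an illustrative assumption and is not a value stated in this study.

```python
def meets_fit_criteria(cfi, tli, rmsea, wrmr, wrmr_tolerance=0.1):
    """Compare each fit index with its cut-off and report the outcome per index."""
    return {
        "CFI": cfi >= 0.95,
        "TLI": tli >= 0.95,
        "RMSEA": rmsea <= 0.06,
        # Assumed tolerance: WRMR values a little above 1.0 are still accepted.
        "WRMR": wrmr <= 1.0 + wrmr_tolerance,
    }


# Hypothetical fit indices, not results from the study.
acceptable = meets_fit_criteria(cfi=0.97, tli=0.96, rmsea=0.04, wrmr=1.05)
doubtful = meets_fit_criteria(cfi=0.97, tli=0.96, rmsea=0.04, wrmr=1.30)
```

Returning one result per index, and not a single pass or fail, reflects the practice of weighing the indices together with the other estimation output.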

Re-specification of the models was considered where the baseline model showed poor fit to the data. This was done by examining model estimates and modification indices produced through the initial estimation process. Modification indices were examined to identify parameter constraints which, if freely estimated, would contribute to a significant drop in chi-square, hence, potentially improving overall model fit (Byrne, 2012). These issues were used to guide decisions regarding model re-specification. Correlation residual estimates quantify the extent to which correlations implied by models and observed correlations differ. Model outputs were screened for correlation residuals with absolute values of over .10 as SEM rules of thumb suggest (Kline, 2011).
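The screening of correlation residuals can be sketched as follows, assuming the observed and model-implied correlation matrices are available as nested lists. The function name, variable names and matrices are hypothetical and serve only to illustrate the .10 rule of thumb.

```python
def large_correlation_residuals(observed, implied, names, threshold=0.10):
    """List variable pairs whose observed-minus-implied correlation exceeds the threshold."""
    flagged = []
    for i in range(len(names)):
        for j in range(i):  # lower triangle only; the matrices are symmetric
            residual = observed[i][j] - implied[i][j]
            if abs(residual) > threshold:
                flagged.append((names[i], names[j], round(residual, 3)))
    return flagged


# Hypothetical matrices for three variables, not results from the study.
observed = [[1.00, 0.50, 0.30],
            [0.50, 1.00, 0.40],
            [0.30, 0.40, 1.00]]
implied = [[1.00, 0.48, 0.45],
           [0.48, 1.00, 0.38],
           [0.45, 0.38, 1.00]]
flagged_pairs = large_correlation_residuals(observed, implied, ["a", "b", "c"])
```

A negative residual, as for the flagged pair in this example, indicates that the model implies a stronger correlation than was observed.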

Once a final model was accepted the estimates were interpreted. The model estimates included the path coefficients and R-square values for the items. R-square values represent the proportion of variance in each dependent variable accounted for by the model. The path coefficients correspond to traditional regression estimates or effect sizes and are shown as arrows in figures. Where the arrowhead points to a continuous variable (the dependent variable), the path coefficient refers to a linear regression coefficient. When the arrowhead points to a categorical or ordinal variable, the path coefficient refers to a probit regression coefficient.

Standardised coefficients are useful in comparing the relative contribution of each of the variables to the dependent variables and are provided throughout the results chapters. Where covariates are continuous, estimates are standardised in relation to both the independent and dependent variable and provided as StdYX values in Mplus output. These standardised estimates can then be interpreted as the standard deviation change in the dependent variable with a one standard deviation change in the independent variable. Where the covariate is binary (such as for the control variables of gender and history of maternal depression in this study), coefficients were standardised in relation to the dependent variable only. Mplus generally provides these as StdY values in output; however, where the WLSMV estimator is selected, as it was in these analyses, these are not provided and must be calculated by hand. The equation for this is the standardised value (Std in Mplus) divided by the standard deviation of the independent variable. Standard deviations for the independent variables were calculated by finding the square root of the variable variances, which are provided in Mplus output on the diagonal of the covariance matrix.
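The hand calculation described above amounts to a single division, sketched below with hypothetical values. The function name is illustrative. The arithmetic rests on the fact that a coefficient scaled by SD(x)/SD(y) becomes a coefficient scaled by 1/SD(y) alone once it is divided by SD(x), so the result is standardised on the dependent variable only when the input estimate carries the SD(x) scaling.

```python
import math


def standardise_on_y_only(standardised_estimate, variance_x):
    """Divide a standardised estimate by the SD of the independent variable."""
    # The SD is the square root of the variance read from the covariance matrix diagonal.
    sd_x = math.sqrt(variance_x)
    return standardised_estimate / sd_x


# Hypothetical values: a binary covariate with variance 0.25 (SD 0.5)
# and a standardised estimate of 0.20.
example_std_y = standardise_on_y_only(0.20, 0.25)
```

The resulting value can be read as the standard deviation change in the dependent variable associated with moving from one category of the binary covariate to the other.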

In document Self-regulation from birth to age seven : associations with maternal mental health, parenting, and social, emotional and behavioural outcomes for children (Page 104-107)