Data Analysis - Executive Functions and the Interaction Between Category Learning Systems

2.5 Method

2.5.3 Data Analysis

Data Trimming and Outlier Analyses

To improve the distributions of the variables used in the study, the data were trimmed and extreme values were removed using the methods described below, in line with previous research (Friedman et al., 2008; Miyake et al., 2000). A small number (1.29%) of participants were dropped from some tasks because of technical errors or because the participant did not understand the task. One participant in the keep track task, four in the spatial 2-back task, and six in the second categorization task were eliminated because of technical problems (e.g., power outage). Seven participants in the spatial 2-back task, four in the number letter task, one in the category switch task and two in the Stroop task were dropped because they did not do the task properly (e.g., responded to the colour name rather than text colour in the Stroop task). See Table 2.2 for the number of remaining participants for each task.

For the executive function tasks for which reaction time was the dependent variable (excluding the stop signal task), only correct responses were analyzed. For the set shifting tasks, responses following errors were excluded to ensure that set shifting really had occurred. Similar to Miyake and colleagues (Miyake et al., 2000; Friedman et al., 2008), responses faster than 200 ms were not analyzed. Extreme reaction times were trimmed using the median absolute deviation, a method which is robust to non-normality (Wu, Zhao, & Wang, 2002). Median absolute deviation is a measure of variability calculated as the median of the absolute deviations of a participant's scores from that participant's median. For each participant, responses that were more than 3.32 median absolute deviations away from the individual's median for that task were excluded from analysis. All RT measures were calculated so that higher values indicated better performance. Descriptive statistics for RT measures can be found in Table 2.2. Proportion correct on these tasks was high (number letter, M = .94, SD = .12; colour shape, M = .94, SD = .05; category switch, M = .89, SD = .09).
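The reaction time trimming rule can be illustrated with a minimal Python sketch. The function name, the example data, the order of the two exclusion steps and the use of the unscaled median absolute deviation (no normal-consistency constant) are assumptions made for illustration, not details taken from the analysis scripts.

```python
from statistics import median


def trim_reaction_times(rts_ms, min_rt=200.0, n_mads=3.32):
    """Return the RTs kept after the two exclusion steps described in the text."""
    # Step 1: drop responses faster than min_rt (200 ms in the text).
    plausible = [rt for rt in rts_ms if rt >= min_rt]
    if not plausible:
        return []
    # Step 2: the participant's median and the unscaled MAD (an assumption).
    centre = median(plausible)
    mad = median(abs(rt - centre) for rt in plausible)
    # Step 3: keep responses within n_mads MADs of the participant's median.
    return [rt for rt in plausible if abs(rt - centre) <= n_mads * mad]


# Hypothetical RTs (ms) for one participant on one task.
example = [150, 480, 500, 510, 520, 530, 550, 1900]
print(trim_reaction_times(example))
```

In this hypothetical example the 150 ms response fails the 200 ms criterion, and the 1900 ms response lies more than 3.32 median absolute deviations from the participant's median.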

For executive function tasks for which proportion correct was the dependent variable, the performance of participants whose mean was more than 3 standard deviations from the group mean was replaced by a value 3 standard deviations from the mean in order to improve normality and to minimize the influence of outliers on the parameter estimates. This affected data from no participants in the antisaccade task, 4 participants (2.30%) in the letter memory task, 2 participants (1.16%) in the keep track task, 2 participants (1.22%) in the spatial 2-back task and 2 participants (1.73%) in the Stroop task. Next, performance in each task was arcsine transformed to disperse very low and very high values and improve normality (Miyake et al., 2000; Kline, 2010). Untransformed and transformed means for executive function tasks measured using proportion correct can be found in Table 2.2.
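The two steps applied to proportion-correct scores might look as follows. This is a sketch under stated assumptions: the text does not give the exact form of the arcsine transformation, so the common variant asin(sqrt(p)) is assumed here, and the function names and example scores are hypothetical.

```python
import math
from statistics import mean, stdev


def cap_at_three_sd(scores, n_sd=3.0):
    """Replace scores beyond n_sd SDs from the group mean with the boundary value."""
    m, s = mean(scores), stdev(scores)
    lower, upper = m - n_sd * s, m + n_sd * s
    return [min(max(x, lower), upper) for x in scores]


def arcsine_transform(p):
    """Arcsine transform of a proportion; asin(sqrt(p)) is an assumed variant."""
    return math.asin(math.sqrt(p))


# Hypothetical group of 20 participants with one extreme low score.
scores = [0.9] * 19 + [0.1]
capped = cap_at_three_sd(scores)
transformed = [arcsine_transform(p) for p in capped]
print(capped[-1], transformed[-1])
```

Capping rather than deleting extreme scores keeps every participant in the sample while limiting the influence of any single score on the parameter estimates.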

For the stop signal task, stop-signal reaction time (SSRT) was estimated for short, medium and long durations for each participant (Logan, 1994). For each participant, the probability of responding on stop trials was calculated for each duration. Next, go trials were put in ascending order according to reaction time. Then the RT of the nth trial was determined for each participant at each duration. The nth trial was the one whose rank equaled the product of the number of go trials and the probability of responding at the given duration. The SSRT for the duration was the RT of the nth trial minus the stop-signal delay for that duration. For each participant, SSRTs for the short, medium and long durations were averaged to form the measure of inhibition. If, for any participant, the probability of responding at a given duration did not fall between .15 and .85, the SSRT for that duration was not included in the average SSRT. Finally, SSRT was multiplied by negative 1 so that larger values indicated better performance.
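The SSRT estimation steps can be sketched in Python as follows. Several details are assumptions made for illustration: stop trials are coded 1 when the participant responded and 0 when the response was withheld, a fractional rank is rounded up, the .15 and .85 bounds are treated as inclusive, and all names and data are hypothetical.

```python
def ssrt_for_delay(go_rts, stop_responses, delay_ms):
    """SSRT at one stop-signal delay, or None if P(respond) is outside [.15, .85]."""
    n_stop = len(stop_responses)
    n_responded = sum(stop_responses)  # 1 = responded on a stop trial
    p_respond = n_responded / n_stop
    if not 0.15 <= p_respond <= 0.85:
        return None
    ordered = sorted(go_rts)  # go trials in ascending order of RT
    # Rank of the nth trial = number of go trials x P(respond),
    # rounded up with integer ceiling division (rounding rule is an assumption).
    rank = max(1, -(-len(ordered) * n_responded // n_stop))
    return ordered[rank - 1] - delay_ms


def inhibition_score(go_rts, stop_trials_by_delay):
    """Mean SSRT over usable delays, multiplied by -1 so higher is better."""
    estimates = [ssrt_for_delay(go_rts, responses, delay)
                 for delay, responses in stop_trials_by_delay.items()]
    usable = [e for e in estimates if e is not None]
    if not usable:
        return None
    return -sum(usable) / len(usable)


# Hypothetical participant: 10 go RTs (ms) and 4 stop trials per delay.
go = [400, 420, 450, 470, 500, 520, 560, 600, 650, 700]
stops = {100: [1, 0, 0, 0], 200: [1, 1, 0, 0], 300: [1, 1, 1, 1]}
print(inhibition_score(go, stops))
```

In this hypothetical data set the longest delay is dropped because the participant responded on every stop trial, so the inhibition score is based on the two remaining delays.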

No transformation or outlier analyses were conducted on categorization performance. Categorization performance for each category type (II and RD) at each block (1 to 4) was calculated by averaging performance across participants. The mean and standard deviation of categorization performance for each category set at each block are presented in Table 2.2.

Because SEM requires multivariate normal distributions, the skew index and kurtosis index were calculated for each variable. The sign of the skew index indicates the direction of the skew (i.e., negative or positive) and the magnitude of the skew index indicates the size of the skew. In general, absolute values greater than 3.0 indicate extreme skew. Similarly, the sign of the kurtosis index indicates the direction of the kurtosis (i.e., negative for platykurtic distributions and positive for leptokurtic distributions) and the magnitude indicates the size of the kurtosis, with absolute values greater than 8.0 indicating extreme kurtosis (Kline, 2010). Table 2.2 shows that after data trimming and transformation, all variables had acceptable levels of skew and kurtosis.
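As an illustration of the normality screen, the sketch below computes moment-based skew and excess kurtosis indices and compares them with the 3.0 and 8.0 cutoffs given in the text. The moment-based formulas (without small-sample bias correction) are an assumption, since statistical packages differ in the exact estimator, and the function names and data are hypothetical.

```python
def skew_and_kurtosis(values):
    """Moment-based skew index and excess kurtosis index (no bias correction)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - m) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - m) ** 4 for x in values) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return skew, kurtosis


def is_acceptable(values, max_skew=3.0, max_kurtosis=8.0):
    """True if neither index exceeds the cutoffs cited in the text."""
    skew, kurtosis = skew_and_kurtosis(values)
    return abs(skew) <= max_skew and abs(kurtosis) <= max_kurtosis


# Hypothetical scores on one measure.
scores = [1.0, 2.0, 3.0, 4.0, 5.0]
print(skew_and_kurtosis(scores), is_acceptable(scores))
```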

Large differences in variance across measures can be problematic for SEM because they make it difficult to converge on a solution. To correct for the ill-scaled variance among measures, some measures were multiplied by a constant so that variances were roughly comparable. The rescaled measures were used for all confirmatory factor analysis and structural equation models. Note that rescaling the variables had no effect on the final outcome of the models because the relationship between variables is unaffected when variables are multiplied by a constant. The unscaled standard deviation, the scaled standard deviation and the constant for each variable are reported in Table 2.2.
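The claim that rescaling leaves relationships between variables unchanged can be checked with a short sketch: multiplying a variable by a constant c multiplies its variance by c squared but does not change its correlation with another variable. The data, the constant and the function names below are hypothetical.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cross / (ssx * ssy) ** 0.5


def rescale(values, constant):
    """Multiply every score by a positive constant."""
    return [v * constant for v in values]


# Hypothetical scores: an RT measure in ms and a proportion-correct measure.
rt_ms = [520.0, 610.0, 480.0, 700.0, 555.0]
accuracy = [0.82, 0.74, 0.91, 0.66, 0.80]
rt_scaled = rescale(rt_ms, 0.01)

print(variance(rt_ms), variance(rt_scaled))  # variance shrinks by 0.01 ** 2
print(pearson_r(rt_ms, accuracy), pearson_r(rt_scaled, accuracy))  # same correlation
```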

Model Estimation

A two-step modeling procedure was used in which an acceptable measurement model was first identified and then a structural equation model was based on the measurement model. Measurement models use confirmatory factor analysis (CFA) to model the relationships between latent variables and the observed variables that define them. As such, variables are related using correlational rather than causal pathways. Prior to testing predictions using a structural equation model, which uses causal pathways, it is advisable first to ensure that the corresponding measurement model has an acceptable fit to the data. If the measurement model fits the data poorly, then the structural equation model should not be used to test hypotheses (Kline, 2010).

All confirmatory factor analyses and structural equation models were carried out using the lavaan package in R (Rosseel, 2012). For the confirmatory factor analyses and structural equation models, a maximum likelihood estimator was used. Missing values were handled using ML estimation for incomplete data, which has been shown to be an effective method (Meade & Bauer, 2007; Tomarken & Waller, 2005). All models were overidentified and positive definite, and all estimation processes converged. Latent variables were scaled by setting the factor loading of the first indicator to equal 1.
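Whether a measurement model is overidentified can be checked by counting degrees of freedom: the number of unique observed variances and covariances minus the number of freely estimated parameters must be positive. The sketch below applies this counting rule to a hypothetical CFA. It assumes that each indicator loads on one factor, that residuals are uncorrelated, that all factors are allowed to covary, and that the first loading of each factor is fixed to 1. The three-factor, nine-indicator layout is illustrative only and is not taken from the models reported here.

```python
def cfa_degrees_of_freedom(indicators_per_factor):
    """Degrees of freedom for a simple-structure CFA with correlated factors."""
    p = sum(indicators_per_factor)   # number of observed variables
    k = len(indicators_per_factor)   # number of latent variables
    observed_moments = p * (p + 1) // 2   # unique variances and covariances
    free_loadings = p - k                 # first loading per factor fixed to 1
    residual_variances = p
    factor_variances = k
    factor_covariances = k * (k - 1) // 2
    free_parameters = (free_loadings + residual_variances
                       + factor_variances + factor_covariances)
    return observed_moments - free_parameters


# Hypothetical model: three latent variables with three indicators each.
df = cfa_degrees_of_freedom([3, 3, 3])
print(df, "overidentified" if df > 0 else "not overidentified")
```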

The fit of the models was measured using a number of fit indices because no single fit index is sufficient to accept or reject a model. The root mean square error of approximation (RMSEA) is an absolute measure of fit that adjusts for model complexity and indicates the model discrepancy per degree of freedom. Values of .01, .05 and .08 indicate excellent, good and mediocre fit, respectively, and the 90% confidence interval of the RMSEA ideally includes 0 (MacCallum, Browne, & Sugawara, 1996). The Standardized Root Mean Square Residual (SRMR) is a measure of the mean absolute residual between the observed and reproduced correlation matrices, with smaller values indicating better fit, and values below .08 indicating acceptable fit (Hu & Bentler, 1999). The Comparative Fit Index (CFI) and Tucker Lewis Index (TLI) are both incremental fit indices that measure the relative improvement of the model compared to a null model. For both CFI and TLI, values of .95 or higher indicate a good fit of the model to the data (Hu & Bentler, 1999). The model chi-squared indicates the deviation of the reproduced correlation matrix from the observed correlation matrix, and a significant model chi-squared indicates a poor fit. The above fit indices were used to measure the overall fit of each model, and chi-squared difference tests were used to compare the relative fit of nested models. For each model, the pattern of residuals and coefficients was also examined. Modification indices, which indicate changes to the model that would improve its fit, were also examined for each model.
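To make the fit indices concrete, the sketch below computes RMSEA, CFI and TLI from the chi-squared statistics of a fitted model and its null (baseline) model using the standard textbook formulas, along with the statistic for a chi-squared difference test between nested models. These are not the lavaan internals: software packages differ, for example, on whether N or N − 1 appears in the RMSEA denominator, and N − 1 is assumed here. All input values and function names are hypothetical.

```python
import math


def rmsea(chi2, df, n):
    """Root mean square error of approximation (N - 1 version, an assumption)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_m, df_m, chi2_b, df_b):
    """Comparative Fit Index relative to the baseline (null) model."""
    numerator = max(chi2_m - df_m, 0.0)
    denominator = max(chi2_m - df_m, chi2_b - df_b, 0.0)
    return 1.0 if denominator == 0.0 else 1.0 - numerator / denominator


def tli(chi2_m, df_m, chi2_b, df_b):
    """Tucker Lewis Index relative to the baseline (null) model."""
    ratio_b, ratio_m = chi2_b / df_b, chi2_m / df_m
    return (ratio_b - ratio_m) / (ratio_b - 1.0)


def chi2_difference(chi2_restricted, df_restricted, chi2_full, df_full):
    """Difference statistic and its df for two nested models."""
    return chi2_restricted - chi2_full, df_restricted - df_full


# Hypothetical values: model chi2 = 30 on 24 df, null chi2 = 400 on 36 df, N = 200.
print(round(rmsea(30.0, 24, 200), 3))
print(round(cfi(30.0, 24, 400.0, 36), 3))
print(round(tli(30.0, 24, 400.0, 36), 3))
print(chi2_difference(36.5, 25, 30.0, 24))
```

The difference statistic returned by the last function would be referred to a chi-squared distribution with the returned degrees of freedom to obtain a p-value.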
