Chapter 3 The developmental relations between spatial cognition and
3.2.3 Mathematics ability measures
3.2.3.1 NFER Progress in Mathematics
The NFER PiM was administered as a measure of standardised mathematics performance. As outlined in Chapter 2, the NFER PiM test is an assessment of mathematics achievement designed to address the National Mathematics Curriculum in England, Wales and Northern Ireland (NFER, 2004). The test series includes items assessing number, algebra, data handling, shape, space and measures. Age-appropriate NFER PiM tests were administered to each age group of participants as per the test guidelines (NFER, 2004). Age-based standardised scores, with a mean of 100 and a standard deviation of 15, were used in all analyses (Min: 69; Max: 141).
3.2.3.2 Approximate Number System Task
A dot comparison task was used to measure ANS skills in this study. This computerised task was taken from Gilmore, Attridge, De Smedt, and Inglis (2014). In each trial, participants were required to compare two dot arrays and identify the more numerous (see Figure 3.7). Each pair of dot arrays was presented for 1500 ms (or until a key press) and was followed by a fixation dot. Participants responded using labelled keys on the left and right of the computer keyboard. Only participants who achieved at least 50% on the eight practice trials continued to the 64 randomly presented experimental trials. The number of dots in each array ranged from 5 to 22. The ratio between the numbers of dots in the two arrays was 0.5, 0.6, 0.7 or 0.8, with approximately equal numbers of trials at each ratio. A ratio effect is characteristic of performance on ANS tasks: performance typically declines as the ratio between item sets approaches 1. For example, participants are expected to have higher performance when comparing 5 to 10 dots (a ratio of 0.5) than when comparing 5 to 6 dots (a ratio of approximately 0.8) (Barth, Kanwisher, & Spelke, 2003; Gilmore et al., 2014). The colour of the more numerous array (red or blue), the size of the dots and the density of dot presentation were counterbalanced across trials. Task performance was measured as percentage accuracy (Min: 0%; Max: 100%).
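The scoring described above can be illustrated with a minimal sketch. It is not the scoring script used in this study; the trial format (dictionaries with the keys `n_left`, `n_right` and `correct`) and the function names are assumptions made for illustration only. Ratios are rounded to one decimal place so that, for example, 5 versus 6 dots is grouped with the 0.8 trials.

```python
from collections import defaultdict


def trial_ratio(n_left, n_right):
    """Ratio of the smaller to the larger array (e.g. 5 vs 10 dots gives 0.5)."""
    return min(n_left, n_right) / max(n_left, n_right)


def percentage_accuracy(trials):
    """Percentage of trials on which the more numerous array was identified."""
    n_correct = sum(1 for trial in trials if trial["correct"])
    return 100.0 * n_correct / len(trials)


def accuracy_by_ratio(trials):
    """Percentage accuracy for each ratio bin (ratio rounded to one decimal)."""
    grouped = defaultdict(list)
    for trial in trials:
        ratio = round(trial_ratio(trial["n_left"], trial["n_right"]), 1)
        grouped[ratio].append(trial)
    return {ratio: percentage_accuracy(group)
            for ratio, group in sorted(grouped.items())}


# Hypothetical responses from one participant (illustrative values only).
example_trials = [
    {"n_left": 5, "n_right": 10, "correct": True},
    {"n_left": 10, "n_right": 5, "correct": True},
    {"n_left": 8, "n_right": 10, "correct": True},
    {"n_left": 10, "n_right": 8, "correct": False},
]

overall = percentage_accuracy(example_trials)
per_ratio = accuracy_by_ratio(example_trials)
```

In this invented example the overall score is 75%, with all 0.5-ratio trials correct and half of the 0.8-ratio trials correct, which is the direction of the ratio effect described above.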
It is noteworthy that performance on ANS tasks can be measured using several different metrics, including performance accuracy, Weber fractions and the numerical ratio effect (NRE) for accuracy or reaction time (Inglis & Gilmore, 2014). Measuring ANS performance using the Weber fraction (w) assumes that when an individual is presented with an array of n dots, they form a representation of the dots that follows a normal distribution (with mean n and standard deviation wn) (Inglis & Gilmore, 2014). However, there is evidence that the use of the Weber fraction leads to highly skewed distributions and that this metric has low test-retest reliability (Inglis & Gilmore, 2014). Additionally, this metric is highly sensitive to context and differs with task and stimulus properties (DeWind & Brannon, 2016). Furthermore, there is evidence that the Weber fraction is highly correlated with performance accuracy on ANS measures, which raises the question of what additional information the Weber fraction provides beyond performance accuracy scores. For the NRE, scores are calculated as the slope of the line created by plotting an individual’s accuracy against the ratio of dots being compared (or, alternatively, plotting response times against the ratio of dots being compared) (Gilmore, Attridge, & Inglis, 2011). However, there is also evidence that the NRE has poor test-retest reliability and that this outcome does not correlate with either accuracy or Weber fraction measures of ANS performance (Inglis & Gilmore, 2014). In light of these findings, and as advocated in several other papers (e.g., Inglis & Gilmore, 2014; Guillaume & Van Rinsveld, 2018), performance accuracy was used as the outcome measure in this study.
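For readers unfamiliar with the alternative metrics, the following sketch shows one way they could be computed. It was not part of the analysis in this study. It assumes the commonly used scalar-variability form of the Weber fraction model, in which the representation of n dots is normally distributed with mean n and a standard deviation that scales with n (w multiplied by n); the function names and example values are hypothetical.

```python
import math


def expected_accuracy(n1, n2, w):
    """Predicted proportion correct when comparing n1 and n2 dots.

    Assumes each numerosity n is represented as Normal(n, (w * n) ** 2),
    so the difference between the two representations is also normal.
    """
    z = abs(n1 - n2) / (w * math.sqrt(n1 ** 2 + n2 ** 2))
    # Standard normal cumulative distribution function evaluated at z.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def nre_slope(ratios, accuracies):
    """Least-squares slope of accuracy plotted against ratio (the NRE)."""
    n = len(ratios)
    mean_x = sum(ratios) / n
    mean_y = sum(accuracies) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ratios, accuracies))
    sxx = sum((x - mean_x) ** 2 for x in ratios)
    return sxy / sxx


# Illustrative values only: accuracy falling as the ratio approaches 1.
example_slope = nre_slope([0.5, 0.6, 0.7, 0.8], [0.95, 0.90, 0.85, 0.80])
```

Under these assumptions, a smaller w predicts higher accuracy at every ratio, and a more negative NRE slope indicates a stronger ratio effect. A response-time NRE could be obtained by passing mean response times in place of accuracies.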
Figure 3.7. Sample dot arrays from the ANS Task
3.2.3.3 Number line estimation
The paper-based Number Line Estimation Task used to assess symbolic numerical representation in this study was adapted from Siegler and Opfer (2003). Two trial types were included: number-to-position (NP) and position-to-number (PN) trials. As shown in Figure 3.8, for NP trials, participants were presented with a target number and were asked to estimate its location on a number line by drawing a straight line (hatch mark) through the number line at their selected location. For PN trials, participants were presented with a vertical hatch mark on a number line and were asked to estimate the number represented by the mark. The task comprised three blocks. Within each block, participants completed two practice trials (one NP and one PN) followed by eight experimental trials (equal numbers of NP and PN trials presented alternately). Performance on NP and PN trials was collapsed across blocks. Blocks differed in the number line range presented. As per the Siegler and Opfer (2003) method, the number line in Block B ranged from 0-100 (numbers included 2, 3, 6, 18, 20, 24, 42, 50, 67, 71) and the number line in Block C ranged from 0-1000 (numbers included 2, 6, 18, 24, 71, 230, 250, 390, 500, 810). In this study, Block A, with a range of 0-10, was added to reduce floor effects in younger children who may be less familiar with larger numbers (numbers included 1, 2, 3, 4, 5, 6, 7, 8, 9, 10).
Trial order was fixed and increased in difficulty: participants began with Block A, followed by Block B and then Block C. The numbers included in each block were chosen to enhance the identification of children’s use of logarithmic and linear models and to minimise the impact of content knowledge (e.g., 50 is one half of 100). As in other studies, numbers below 20 were over-sampled (Friso-van den Bos et al., 2015; Laski & Siegler, 2007). As outlined in section 1.3.2, performance was measured using PAE scores (Min: 0%; Max: 100%) and curve estimation (R²LIN scores; Min: 0; Max: 1). As the results were broadly similar for these measures, only R²LIN scores are reported in this chapter. Similar patterns of performance, with smaller effects, were found for PAE scores (see Appendix B). Participants were given the opportunity to complete all blocks. However, the 0-10 block was considered an age-specific measure and was analysed at 6 and 7 years only. Participants were excluded from the analysis of a given block if their mean PAE score on the practice trials for that block was greater than 15%, or if they failed to answer at least 80% of the items. For the 0-1000 block, only four participants at 6 years were eligible for inclusion; hence, this age group was excluded from the analysis of this block.
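The two outcome measures and the exclusion rule can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code used in this study: PAE is assumed to follow its usual definition (absolute difference between estimate and target, divided by the range of the number line, multiplied by 100), the linear fit is assumed to be an ordinary least-squares line predicting estimates from targets, and all function and parameter names are hypothetical. Whether "items" in the 80% criterion refers to all trials or to experimental trials only is left to the caller.

```python
def pae(estimate, target, line_max, line_min=0):
    """Percentage absolute error for one trial (assumed standard definition)."""
    return 100.0 * abs(estimate - target) / (line_max - line_min)


def r2_linear(targets, estimates):
    """R squared of a least-squares line predicting estimates from targets."""
    n = len(targets)
    mean_x = sum(targets) / n
    mean_y = sum(estimates) / n
    sxx = sum((x - mean_x) ** 2 for x in targets)
    syy = sum((y - mean_y) ** 2 for y in estimates)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(targets, estimates))
    if sxx == 0 or syy == 0:
        # No variance in targets or estimates: a linear fit explains nothing.
        return 0.0
    # For simple linear regression, R squared equals the squared correlation.
    return (sxy ** 2) / (sxx * syy)


def block_is_eligible(practice_paes, n_answered, n_items):
    """Apply the exclusion rule: mean practice PAE must not exceed 15% and
    at least 80% of items must have been answered."""
    mean_practice_pae = sum(practice_paes) / len(practice_paes)
    return mean_practice_pae <= 15.0 and n_answered / n_items >= 0.8
```

For example, an estimate of 60 for a target of 50 on the 0-100 line gives a PAE of 10%, and a participant whose estimates fall exactly on a straight line against the targets obtains an R²LIN of 1.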
Figure 3.8. Sample items from the Number Line Estimation Task. Number-to-position trials are shown above and position-to-number trials are shown below.