
III. STUDY 1: THE LATIN SQUARE TASK

3.13. Experiment 4: Results

3.13.1. Approach to the analysis

Analyses used participant gaze data within areas of interest (AOIs) as the measure of interest. As seen in the example item in Figure 3.5, the cells of the matrix and the response options were used to create an AOI template applied to all items. The value of each AOI was the total gaze time spent within that AOI on that item for that participant. These values were then transferred to a set of dynamic AOIs, created for each item using an index of item attributes. For instance, for the example in Figure 3.5, item 03, cell 2B is the final target cell (the cell with the ‘?’ that must be solved), and so the gaze time value of the final target cell is equal to the gaze time value of the cell 2B for that item (whereas for item 04, the ‘?’ cell is in cell 4B, so the final target cell for that item is taken from the gaze time for cell 4B). In addition to final target cell, the other dynamic AOIs within the matrix were distractor cells (filled cells which have no impact on the solvability of the item), final relation cells (filled cells involved in the relation of the final step), interim target cell (the cell that must be solved in the first step of a 2-step item, i.e., the interim step), and interim relation cells (filled cells involved in the relation of the interim step). In addition, two dynamic AOIs corresponding to the response options were calculated: final-answer-RO (the response option with the answer to the item) and interim-answer-RO (the response option with the answer to the interim step). For distractor cells, interim relation cells, and final relation cells (all of which may have more than one cell per item), the value was a sum of all the cells that corresponded to that attribute for that item.
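The conversion from template AOIs to dynamic AOIs described above can be sketched roughly as follows. This is a minimal illustration rather than the software actually used: the `ITEM_ATTRIBUTES` index, the function name `dynamic_aois`, and every cell assignment other than the final target cells of items 03 and 04 (cells 2B and 4B, as stated in the text) are hypothetical placeholders.

```python
from typing import Dict, List

# Hypothetical index of item attributes: which template cells play which role.
# Only the final target cells (2B for item 03, 4B for item 04) come from the text;
# the relation and distractor cells are made up for illustration.
ITEM_ATTRIBUTES: Dict[str, Dict[str, List[str]]] = {
    "item03": {
        "final_target": ["2B"],
        "final_relation": ["2A", "2C"],
        "distractor": ["4D"],
    },
    "item04": {
        "final_target": ["4B"],
        "final_relation": ["1B", "3B"],
        "distractor": ["1A", "2D"],
    },
}


def dynamic_aois(item_id: str, template_gaze: Dict[str, float]) -> Dict[str, float]:
    """Map gaze time per template cell (seconds) onto dynamic AOIs for one item.

    Roles that cover several cells (e.g. relation or distractor cells) receive
    the sum of the gaze time over all cells with that role.
    """
    roles = ITEM_ATTRIBUTES[item_id]
    return {
        role: sum(template_gaze.get(cell, 0.0) for cell in cells)
        for role, cells in roles.items()
    }


# Example: one participant's gaze time per template cell on item 03.
gaze_item03 = {"2B": 1.5, "2A": 0.5, "2C": 1.0, "4D": 0.25, "1A": 2.0}
result_item03 = dynamic_aois("item03", gaze_item03)
```

Keeping the role assignments in a single lookup table means the same template AOI data can be reused for every item, with only the index changing.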

The target, relational, and interim cells were derived from Birney et al.’s (2006) RC analysis of the LST, though an additional item analysis was then conducted for each item to determine whether there were alternate solution pathways. For some items, there were indeed multiple solution pathways, which made calculating distractor and relation cells difficult. For these, we first assumed that the most relationally simple pathway was taken. In the event of a tie (e.g., an item where two binary solution pathways were available), each solution pathway was calculated separately, and the final value of the relation cells was taken from the pathway with the highest gaze duration for each participant. For distractor cells, the cells were only summed if they did not contribute to any potential solution pathway. In other words, distractor cells were filled cells that, if removed and turned to empty cells, would not affect the solvability of the item regardless of the pathway taken. Although this approach may result in some loss of gaze data if participants switch solution pathways partway through the problem, it was the most straightforward way of ensuring there was only one set of relation and distractor cells per item for use in the analyses.
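The tie-breaking rule for items with more than one equally simple solution pathway can be illustrated with a short sketch. The function names and cell labels below are hypothetical, and treating "does not contribute to any pathway" as "does not appear among the relation cells of any pathway" is a simplifying assumption made for this example.

```python
from typing import Dict, List


def relation_gaze(pathways: List[List[str]], cell_gaze: Dict[str, float]) -> float:
    """Return the relation-cell gaze time for one participant on one item.

    Each pathway is a list of relation cells. Gaze time is summed within each
    pathway, and the pathway with the highest total is taken as the value.
    """
    return max(sum(cell_gaze.get(cell, 0.0) for cell in path) for path in pathways)


def distractor_cells(filled_cells: List[str], pathways: List[List[str]]) -> List[str]:
    """Return filled cells that belong to none of the candidate pathways."""
    used = set().union(*pathways)
    return [cell for cell in filled_cells if cell not in used]


# Example with two tied pathways (cell labels are illustrative only).
tied_pathways = [["1A", "1B"], ["2C", "3C"]]
gaze = {"1A": 0.5, "1B": 0.25, "2C": 1.0, "3C": 1.0, "4D": 0.75}

relation_value = relation_gaze(tied_pathways, gaze)
distractors = distractor_cells(["1A", "1B", "2C", "3C", "4D"], tied_pathways)
```

Because the maximum is taken per participant, two participants can have their relation-cell value drawn from different pathways on the same item, while the distractor set stays fixed for the item.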

Figure 3.5. Areas of interest (AOIs) for the LST. The matrix on the left displays the AOI template analogously applied to all items. These template AOIs are converted to dynamic AOIs for each item, as demonstrated by the example matrix on the right. For 1-step items, there is no interim target cell, interim relation cells, or interim answer RO (response option). Distractor cells are shape-filled cells that have no impact on the solvability of the item (i.e., they could be turned to empty cells and the item would have the same solution pathway). For items with multiple paths to solution (e.g., two sets of relation cells per target cell), the set of relation cells with the highest gaze duration (per participant) is recorded as the relation cells for that item.

Finally, we also included RO-revisits, a measure of the number of times a participant returned to the response options on each item. Although not a measure of gaze duration, revisits is nonetheless a gaze metric, one which Laurence et al. (2018) found was the best predictor of test scores on a similar, matrix-style reasoning task.
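One way a revisit count of this kind could be derived from an ordered sequence of fixated AOIs is sketched below. The exact operational definition is not given here, so this example assumes that consecutive fixations within the response-option region count as a single visit and that the first visit is not a revisit; the function name and AOI labels are hypothetical.

```python
from typing import Iterable, Set


def count_ro_revisits(aoi_sequence: Iterable[str], ro_labels: Set[str]) -> int:
    """Count returns to the response-option region in a fixation sequence.

    A visit starts whenever gaze enters the response-option region from
    outside it. Revisits are all visits after the first (an assumption).
    """
    visits = 0
    inside = False
    for aoi in aoi_sequence:
        now_inside = aoi in ro_labels
        if now_inside and not inside:
            visits += 1  # gaze has just entered the response-option region
        inside = now_inside
    return max(visits - 1, 0)


# Example: three separate visits to the response options, hence two revisits.
RO_LABELS = {"RO1", "RO2", "RO3", "RO4"}
sequence = ["2B", "RO1", "RO2", "1A", "RO3", "2B", "RO1"]
revisits = count_ro_revisits(sequence, RO_LABELS)
```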

The program recorded gaze duration data in milliseconds, but values are reported in seconds for interpretability. Hypotheses were tested using binary logistic regression on item-level data, with item metrics (RC, steps) and gaze metrics (e.g., final target cell, final relation cells, RO-revisits) for each item predicting success on that item (0 for incorrect, 1 for correct). [...] change in log-odds. Confidence intervals for odds ratios are reported, for ease of interpretability (CIs containing 1 indicate non-significance).

3.13.2. Gaze time descriptives and logistic regressions

Overall, performance was similar to that described in the earlier experiments for RC (2D M = .94, SD = .24; 3D M = .83, SD = .37; 4D M = .58, SD = .50) and Steps (1S M = .82, SD = .38; 2S M = .75, SD = .43). Descriptives for gaze metrics are provided in Table 3.8. These mean values demonstrate that, on average, about 3.5 seconds were spent on final relation cells of each item, while 4.9 seconds were spent on interim relation cells. The high variance in these descriptives is to be expected, considering they average across item types.

Table 3.8. Gaze time metric descriptives.

Metric                               Mean      SD
Final answer RO                      0.97s     0.74s
Interim answer RO (2S only)          0.67s     0.85s
Final target cell                    2.21s     3.22s
Interim target cell (2S only)        1.75s     2.19s
Final relation cells                 3.54s     4.00s
Interim relation cells (2S only)     4.90s     5.29s
Distractor cells                     1.87s     2.96s
RO Revisits                          4.22      5.32

N = 510 (15 x 36) item responses (255 for 2S only metrics)

For the first regression, item success was predicted using RC, Steps, final answer RO, final target cell, final relation cells, distractor cells, and RO revisits. As hypothesized, RC was a significant predictor of item success (CI95% = [0.172, 0.400], p < .001), as was Steps (CI95% = [0.216, 0.802], p = .009), with increases in either lowering the odds of success. For the gaze metrics, final answer RO was a significant and very strong positive predictor of success (CI95% = [17.77, 90.95], p < .001), though this was unsurprising, as participants needed to input their answer by clicking the corresponding response option. Final target cell was also significant (CI95% = [0.801, 0.983], p = .022), though in a negative direction: for every 1 second spent looking at the final target cell, there was, on average, an 11.3% reduction in the odds of correctly answering the item. The distractor cells were also significant (CI95% = [0.750, 0.935], p = .002) in a negative direction: for every 1 second spent looking at distractor cells, there was, on average, a 16.3% reduction in the odds of correctly answering the item. The number of RO revisits (toggling rate) was also significant (CI95% = [0.736, 0.867], p < .001) in a negative direction: for every additional revisit to the response options, there was, on average, a 20.1% reduction in the odds of solving the item correctly. Contrary to the hypothesis, the final relation cells were not a significant predictor of item success (CI95% = [0.966, 1.112], p = .323). Table 3.9 displays the full output of this regression.

Table 3.9. Output of Binary Logistic Regression with Item Characteristics, Gaze Time on Areas of Interest (AOIs), and Revisit Rates predicting Item Success (1S and 2S items).

Predictor                              Exp(B)     95% CI for Exp(B)    Sig.
Relational Complexity                  0.263      0.172, 0.400         < 0.001
Steps                                  0.416      0.216, 0.802         0.009
Final-answer Response Option (sec)     40.207     17.774, 90.952       < 0.001
Final-answer Target Cell (sec)         0.887      0.801, 0.983         0.022
Final relation cells (sec)             1.036      0.966, 1.112         0.323
Distractor cells (sec)                 0.837      0.750, 0.935         0.002
Response Option Revisits (#)           0.799      0.736, 0.867         < 0.001
Constant                               329.65                          < 0.001

χ² = 240.17, df = 7, p < .001
Classification Accuracy = 88.8%; Nagelkerke R² = .582
N = 510 items
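As a reading aid for Table 3.9, the sketch below shows how an odds ratio, Exp(B), translates into a percentage change in the odds of success per one-unit increase in a predictor, and how the underlying log-odds coefficient B can be recovered. The Exp(B) values are copied from the table; the function and variable names are introduced here for illustration only.

```python
import math


def percent_change_in_odds(odds_ratio: float) -> float:
    """Percentage change in the odds of success per one-unit increase."""
    return (odds_ratio - 1.0) * 100.0


def log_odds_coefficient(odds_ratio: float) -> float:
    """Recover the logistic regression coefficient B from Exp(B)."""
    return math.log(odds_ratio)


# Exp(B) values for the gaze predictors, taken from Table 3.9.
exp_b = {
    "final_target_cell": 0.887,
    "final_relation_cells": 1.036,
    "distractor_cells": 0.837,
    "ro_revisits": 0.799,
}

# Percentage change in odds for each predictor, rounded to one decimal place.
changes = {name: round(percent_change_in_odds(value), 1) for name, value in exp_b.items()}
```

An Exp(B) below 1 gives a negative percentage (lower odds of success per unit increase), and an Exp(B) above 1 gives a positive one. Note that these are changes in odds, not in the probability of a correct response.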

The second regression included the same predictors as above, but also added interim gaze metrics as additional predictors (interim answer RO, interim target cell, interim relation cells). Because interim gaze metrics were only calculated for 2S items, only 2S items were included. This regression was conducted over two models. The first model aimed to replicate the results of the first regression (i.e., interim AOIs were not included), while the second model added the interim AOI metrics. The first model mostly replicated the previous regression. However, this time, the final-answer target cell was not a significant predictor of item success (CI95% = [0.748, 1.090], p = .288), but the final relation cells were (CI95% = [1.005, 1.416], p = .044), such that for every 1 additional second spent looking at the final relation cells, there was, on average, a 19.3% increase in the odds of solving the item correctly. In the second model, the pattern of predictions for the previous predictors remained the same. Of the three new predictors, only interim answer RO was a significant predictor, in a negative direction (CI95% = [0.108, 0.813], p = .018). However, as with the other response option AOIs, this should be interpreted with caution, since those looking to input their answer look towards the response options (in this case, inputting the interim response would result in an incorrect answer, so the chance of success decreases). Contrary to hypotheses, the other two predictors, interim target cell and interim relation cells, were not significant predictors, ps > .05.