2 Literature Review
2.4 Eye tracking reading
2.4.2 Algorithms for detecting reading from eye tracking data
The reading detection algorithms published to date use one of three methods to identify patterns. Some (Campbell and Maglio, 2001) consider the direction and distance travelled between fixations (sometimes referred to as saccadic amplitude), whilst others (Kollmorgen and Holmqvist, 2009; Simola, Salojarvi and Kojo, 2008) use hidden Markov models to analyse eye tracking data generated by reading and detect patterns in it. The research data is then examined for similar patterns so that sections of data which exhibit the same characteristics can be labelled. A third type (Biedert et al., 2012) uses the speed of the movement between fixations to indicate the type of reading (based on the fact that the eye moves much faster over long, scanning-type saccades).
The researcher experimented with those algorithms whose workings could be ascertained in fine detail from the literature and found that the approach used by Campbell and Maglio (2001) could be successfully adapted to suit the demands of this project. The following sections set out the characteristics of careful reading reported in the literature, which were consequently incorporated into the researcher’s own method of detecting careful reading.
To distinguish between careful reading and other types of selective reading it was not enough to consider whether fixations represented a move forward through the text or represented a regression. The researcher needed to distinguish between forward moving fixations which might represent careful reading and fixations which might represent other types of selective reading behaviour. The researcher decided to label fixations which occurred on the same line and at a distance of between 1 and 16 characters to the right of the previous fixation as short forward fixations. This decision was based on two factors. Firstly, 16 characters represents double the average forward saccade reported in the literature and secondly, this 16-character limit accounts for 95 per cent of the saccades of eight college-age readers reported by Rayner et al. (2012:95).
Fixations which move from near the end of one line to the beginning of the line below are referred to as Return Sweeps (Rayner et al. 2012:91). Although different from a short forward fixation in terms of the physical movement, Return Sweeps are, in essence, a short forward movement from the end of one line to the beginning of the next. Fixations which represented a move to the line below in conjunction with a long movement to the left (more than 50 characters, which represented over half a line) were labelled as Return Sweeps.
The only remaining type of forward moving fixation (once short forward fixations and Return Sweeps have been excluded) the researcher labelled as long forward. Simola et al. (2008:5) reported that when participants changed from ‘rauding’ (which they define as normal reading in which the reader is looking at each consecutive word of a text to comprehend the content) to ‘skimming’, the length of forward saccades increased. For this reason, long (more than 16 characters) forward moving saccades are unlikely to form part of careful reading.
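The three forward moving labels described above can be summarised in a short sketch. This is an illustration under stated assumptions, not the researcher's actual implementation: the `Fixation` fields, the function name and the label strings are hypothetical, fixations are assumed to be already mapped to a line index and a character position, and a Return Sweep is assumed to land exactly one line below. Only the 16-character and 50-character thresholds come from the text.

```python
from dataclasses import dataclass

SHORT_FORWARD_MAX_CHARS = 16      # upper limit of a short forward move (from the text)
RETURN_SWEEP_MIN_LEFT_CHARS = 50  # leftward move must exceed this (from the text)


@dataclass(frozen=True)
class Fixation:
    line: int  # hypothetical field: index of the text line being fixated
    char: int  # hypothetical field: character position within that line


def label_forward_movement(prev: Fixation, curr: Fixation) -> str:
    """Label the move from prev to curr as one of the forward moving types."""
    dx = curr.char - prev.char  # positive means a move to the right
    dy = curr.line - prev.line  # positive means a move down the page

    # Same line, 1 to 16 characters to the right of the previous fixation.
    if dy == 0 and 1 <= dx <= SHORT_FORWARD_MAX_CHARS:
        return "short_forward"
    # Line below (assumed: exactly one line) with a long move to the left.
    if dy == 1 and -dx > RETURN_SWEEP_MIN_LEFT_CHARS:
        return "return_sweep"
    # Any other move to a later point in the text.
    if dy > 0 or (dy == 0 and dx > SHORT_FORWARD_MAX_CHARS):
        return "long_forward"
    # Moves back through the text, or no move at all, are not forward moving.
    return "not_forward"
```

For example, a move from character 70 on one line to character 5 on the next line is labelled `return_sweep`, whereas a move of 17 characters to the right on the same line is labelled `long_forward`.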
During reading the eyes do not move relentlessly forward through the text. On average 10-15 per cent of fixations are a return to an area of text already read (Holmqvist et al., 2011; Rayner, Juhasz and Pollatsek, 2005; Rayner et al., 2012). These backward movements are called regressions. Rayner et al. (2005) suggest that short regressions within the current sentence represent word recognition problems, whilst longer regressions back to previous sentences are likely to represent comprehension difficulties. Rayner (2009) reports that such short, sentence level regressions account for the majority of regressions. Holmqvist et al. (2011) showed that the number of regressions made decreased as a function of improved reading skill. For the purpose of analysis, any fixation which moved to an earlier part of the text, whether within the sentence or to an earlier sentence, was labelled as a regression. Although the literature reported above suggests that regressive fixations account for 10-15 per cent of all fixations, the researcher was surprised to note a much higher proportion of regressive fixations in the data collected. This is discussed later in chapter four.
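The regression rule, and the comparison with the 10-15 per cent figure from the literature, can be illustrated with a minimal sketch. It assumes that each fixation has already been mapped to a hypothetical `(line, char)` position and that the proportion is taken over moves between consecutive fixations; neither detail is specified in the text, and the function name is illustrative only.

```python
def regression_share(positions):
    """Return the proportion of moves that go back to an earlier part of the text.

    positions is a list of (line, char) tuples in the order the fixations
    occurred. Python compares tuples element by element, so curr < prev means
    an earlier line, or the same line and an earlier character position.
    """
    if len(positions) < 2:
        return 0.0  # no move between fixations, so nothing to classify
    moves = list(zip(positions, positions[1:]))
    regressions = sum(1 for prev, curr in moves if curr < prev)
    return regressions / len(moves)


# Hypothetical sequence: one of the five moves goes back along the same line.
example = [(0, 0), (0, 8), (0, 15), (0, 10), (0, 20), (1, 2)]
print(regression_share(example))  # 0.2
```

A value well above 0.15 for a participant's data would correspond to the higher than expected proportion of regressive fixations noted above.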
Unsurprisingly, the literature on reading fixations reports that better readers have, on average, shorter fixations and make longer saccades than less skilled readers (Rayner, 1998). Word frequency has also been shown to influence fixation duration with fixation duration increasing as words become less familiar. Inhoff and Rayner (1986) showed that even after controlling for word length (infrequent words tend to contain more characters than frequent words), infrequent words were fixated for longer than frequently occurring words. More recent research has moved on to consider how reading skill and other factors such as word frequency and predictability may interact. Ashby, Rayner and Clifton (2005) conducted research into the effects of both word frequency and word predictability in relation to reading skill. They concluded that the low predictability of a word does interact with reading skill, extending the fixation times of average readers disproportionately compared to the increase in fixation times of skilled readers. The same research was less clear cut in relation to how skill and word frequency interact. However, Kuperman and Van Dyke (2011) demonstrated that differences in fixation durations between better readers and poorer readers remained constant regardless of word frequency.
Whilst careful reading may be widespread, it may be only one of several types of reading that are required to accomplish academic reading-into-writing tasks. Chan (2013, see section 2.3.5 of this chapter for a full discussion of Chan’s work) suggests that during the meaning and discourse construction phase, participants engage in high-level reading processes. Khalifa and Weir (2009) suggest that skilled readers engage in different types of reading. Khalifa and Weir (ibid) differentiate between careful and expeditious reading (see section 2.3.5.3 of this chapter for a full discussion). Therefore, this research seeks not only to identify when reading is likely to be taking place, but also to establish whether it fits the patterns of careful reading suggested in the literature, or whether the eye-movements of the participants suggest that some other type of reading (selective reading) is taking place.
Rayner et al. (2012:377) acknowledge that the type of reading engaged in is likely to vary according to the reader's purpose for reading. They limit the discussion of other types of reading to skimming: ‘the type of reading activity in which you skim over the text without really deeply comprehending it’ (Rayner et al., 2012:377). However, Rayner et al. do not elaborate on the type of eye-movements which might be characteristic of skim reading.
The skimming discussed by Rayner et al. would seem to fit into what Khalifa and Weir (2009) describe as expeditious reading. Khalifa and Weir describe expeditious reading as ‘quick, selective and efficient reading to access desired information in a text’ (2009:46). Khalifa and Weir suggest that expeditious reading involves targeted reading that does not aim to extract a complete understanding of the text. Instead, expeditious reading incorporates skimming, scanning and searching in which, the writers suggest, the emphasis is on the reader’s purpose for reading, rather than an attempt to comprehend everything that the writer of the text wishes to communicate. In the absence of literature defining the eye-movements that characterise expeditious or selective reading, section 2.5 includes a discussion of how the types of reading suggested by Khalifa and Weir might be characterised by various types of eye-movements in reading. Before that, however, the researcher wishes to briefly review the literature relating to eye tracking studies which have focused on text level processing, rather than individual cognitive processes within reading.