3.2 The core theory

3.2.4 Direct control and serial attention: current debates

Two major debates in the current literature concern the allocation of attention and the directness of oculomotor control in reading. The thesis model is a serial-attention model: samples from only one word are processed at a time. It is also a direct-oculomotor-control model, meaning that saccade decisions are conditioned on the lexical processing stream rather than on some other autonomous process. Justification for these choices is provided below.

3.2.4.1 Serial Attention

Inheriting a debate from the attention community, models of eye movement control are split based on whether they view attention as essentially serial (i.e., focused on one word at a time) or parallel. Serial models are defended on several grounds: they are easy to formulate and interpret, because they always make it clear what is being processed; they are more strongly constrained than the parallel alternative, in the sense that a parallel attention model reduces to serial attention as a special case; and they provide word-order information to the sentence processing system 'for free'.

Reichle, Liversedge, Pollatsek & Rayner (2009) raise additional challenges to parallel models on the grounds that mental representations of words as separate units might be difficult to create if words are not processed one at a time, and that word order might be difficult to compute if the perceptual input does not arrive in word-size units. These latter arguments are, however, not compelling: Bayesian theories that assume a generative model of sentences and the availability of letter information (Legge et al., 1997, 2002; Bicknell & Levy, 2010b) can rationally update word-level beliefs from partial information about multiple words. In addition, as Reichle and colleagues note, the focus on word-level units is if anything a disadvantage for understanding reading in non-alphabetic languages like Chinese, or languages like Thai where word boundaries are not marked.
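The point about updating word-level beliefs from partial letter information can be illustrated with a minimal sketch. It is not the model of Legge et al. or Bicknell & Levy: the tiny lexicons, the prior probabilities, the per-letter accuracy values, and the treatment of the two word positions as independent are all assumptions made for illustration.

```python
def letter_likelihood(word, observed, p_correct, alphabet_size=26):
    """P(observed letters | word). None marks a position with no evidence yet."""
    p_wrong = (1.0 - p_correct) / (alphabet_size - 1)
    likelihood = 1.0
    for true_letter, seen in zip(word, observed):
        if seen is None:
            continue  # unsampled position contributes no evidence
        likelihood *= p_correct if seen == true_letter else p_wrong
    return likelihood


def posterior(prior, observed, p_correct):
    """Bayes' rule over a discrete set of candidate words."""
    unnormalized = {
        word: p * letter_likelihood(word, observed, p_correct)
        for word, p in prior.items()
    }
    total = sum(unnormalized.values())
    return {word: p / total for word, p in unnormalized.items()}


# Hypothetical priors for the fixated word and the word to its right.
prior_fixated = {"cat": 0.5, "cot": 0.3, "cut": 0.2}
prior_parafoveal = {"sat": 0.6, "sit": 0.4}

# Partial evidence about BOTH words from one sample: two letters of each.
# Letter identification is assumed to be less reliable in the parafovea.
post_fixated = posterior(prior_fixated, ["c", "a", None], p_correct=0.8)
post_parafoveal = posterior(prior_parafoveal, ["s", "i", None], p_correct=0.5)

print(post_fixated)
print(post_parafoveal)
```

Each word position keeps its own belief distribution, so word identity and word order remain well defined even though the evidence arrives as letter fragments spanning two words.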

The challenge to purely serial models like E-Z Reader is accounting for spillover, preview, and parafoveal-on-foveal effects: all effects where the reading time on the current word is affected by properties of words that precede or follow it. E-Z Reader addresses this challenge by imposing parallelism between oculomotor and lexical-attentional processing rather than within attention. By decoupling attention from eye position, the theory allows attention to (and therefore processing of) one word to extend into fixations on words that precede or follow. This explanation is not sufficient to cover parafoveal-on-foveal effects (when words forward of the fixated word affect fixation durations on the fixated word), but there is still an ongoing debate in the literature on the existence and robustness of these effects (Kennedy et al., 2002; Kennedy & Pynte, 2005; Kliegl et al., 2007; Drieghe et al., 2008).

The empirical advantage of parallel attention models is their ability to account for the aforementioned parafoveal-on-foveal effects, as well as the full set of empirical facts that serial models cover. The challenge to them is on grounds of predictive flexibility: the space of parallel models can in principle span the range from full parallelism, through word-level serialism, down to phoneme- or grapheme-level serial models. Such models therefore require additional a priori estimates or constraints as to the extent of parallelism.

When it comes to the LLDT, this distinction may not be important, however. Given the short words and wide spaces used in the LLDT, parafoveal words may well be at far enough eccentricity for their processing to be minimal even in a fully-parallel model. A serial model is simpler to build, faster to simulate, and easier to interpret because it is easy to understand which words and fixations trigger each action in the model's near-optimal policies. It also greatly simplifies the choice of policy space for the model: a model that has information coming from multiple words at a time might also condition its behavior on multiple words at a time, whereas it is natural for a model that only takes in information from one word position to also condition its behavior on a single position. Therefore, a serial attention scheme is used for the dissertation model.

3.2.4.2 Direct oculomotor control

Another debate in the literature on eye movement control in reading is on the directness of the link between lexical and oculomotor processing. At one extreme of the spectrum are models that assume saccade targets and timings are primarily governed by visual features like the location of letters and spaces rather than ongoing lexical processing (e.g. Reilly & O'Regan, 1998). At the other are models that assume both saccade targets and timings are directly controlled by the lexical processing stream, though with the delays noted above (e.g. Reichle et al., 2009). In between are models that make intermediate choices, for example keeping saccade targets controlled by lexical processing but letting timings be largely autonomous (e.g. Engbert et al., 2005). See Reichle et al. (2003, Table 1) for a more detailed review and taxonomy of models on this dimension.

This debate is driven by two fundamental facts about eye movements. On the one hand, standard estimates of saccade planning durations for sequential eye movements range as high as 200 ms (Rayner et al., 1983), and typical fixations in reading are only slightly longer. Even lower-bound estimates of how long a saccade takes to plan in a simple psychophysics task are about 100 ms (Becker & Jürgens, 1979). On the other hand, word-level properties do affect reading times (as reviewed in 2.2). The time left for cognitive processing to affect a direct eye movement control decision and yield these word-level effects is therefore only about 50-150 ms, with at least about 50 ms of this time taken up by eye-brain lag.
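The arithmetic behind the 50-150 ms window can be made explicit with a short sketch. The planning durations and the eye-brain lag come from the estimates cited above; the 250 ms fixation duration is an assumed stand-in for "only slightly longer" than 200 ms and is not a figure given in the text.

```python
def decision_window_ms(fixation_ms, planning_ms, eye_brain_lag_ms):
    """Return (time before planning must start, time left after eye-brain lag)."""
    window = fixation_ms - planning_ms
    lexical = max(0, window - eye_brain_lag_ms)
    return window, lexical


ASSUMED_FIXATION_MS = 250  # assumption: "slightly longer" than 200 ms
EYE_BRAIN_LAG_MS = 50      # lower bound cited in the text

# Upper estimate (Rayner et al., 1983) and lower bound (Becker & Jürgens, 1979).
slow_planning = decision_window_ms(ASSUMED_FIXATION_MS, 200, EYE_BRAIN_LAG_MS)
fast_planning = decision_window_ms(ASSUMED_FIXATION_MS, 100, EYE_BRAIN_LAG_MS)

print(slow_planning)  # (50, 0)
print(fast_planning)  # (150, 100)
```

Under these assumptions, once eye-brain lag is subtracted the time available for lexical processing to influence the decision ranges from essentially nothing to roughly 100 ms, which is what makes direct control appear puzzling.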

In light of these duration estimates, the reader appears to spend more time viewing a word after deciding to move on to the next one than he or she takes to make the decision. There are two primary ways of understanding this fact. The first is taken by so-called primary oculomotor-control models. These models argue that the above facts mean that decisions of when and where to saccade are driven primarily by the same low-level perceptual considerations as the decisions in the sequential cued saccade tasks. Some example proposals are that the reader always attempts to fixate the so-called optimal viewing position, just left of center (O'Regan & Lévy-Schoen, 1987), or the longest word within some reasonable range to the right (Reilly & O'Regan, 1998), and to do so after a fixed amount of time. Other proposals claim that fixation targets are determined by the lexical processing system, but the timing is nonetheless controlled by an autonomous saccade clock (Engbert et al., 2005). In practice, all of these models must allow some higher-level information to make it indirectly into the oculomotor stream; otherwise they will fail to capture the known effects of word- and sentence-level properties on reading times.

The second way of handling this empirical puzzle comes from a key insight by Morrison (1984): if saccade planning can occur in parallel with lexical processing, and saccades can be canceled after planning starts but before execution (Becker & Jürgens, 1979), then lexical processing can play a significant role in the where and when decisions in eye movements. Among the most prominent models of this type is E-Z Reader (Reichle et al., 2009), discussed in greater detail in chapter 5.

The thesis model assumes that saccade timing is directly controlled by lexical processing (as delayed by saccade planning). When the lexical processing threshold is reached, saccade planning begins. The theory remains agnostic on the question of saccade targeting by assuming that in the LLDT both direct and indirect control of saccade targets will look quite similar and can be approximated by assuming saccades sequentially target each word. The choice is made in part in the interest of simplicity: understanding and interpreting what the model is doing is far easier when the actions of theoretical interest are under active control. But direct control was also used in ideal-observer approaches to eye movements similar to the thesis work (e.g. Bicknell & Levy, 2010a,b) so it has some empirical support.
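The timing assumption above (planning begins when the lexical threshold is reached, and each word is targeted in sequence) can be sketched as a small simulation. This is an illustrative sketch, not the thesis model itself: the noisy evidence accumulator, the parameter values, and the function names `simulate_fixation` and `read_sequence` are all hypothetical.

```python
import random


def simulate_fixation(threshold, drift, noise_sd, planning_ms, step_ms=10, rng=None):
    """Fixation duration under direct control of saccade timing.

    Evidence about the fixated word accumulates in noisy steps (an assumed
    stand-in for lexical processing). Saccade planning starts only once the
    threshold is reached, so the fixation ends planning_ms after that point.
    """
    if drift <= 0:
        raise ValueError("drift must be positive so the threshold is reached")
    rng = rng or random.Random()
    evidence = 0.0
    elapsed_ms = 0
    while evidence < threshold:
        evidence += drift + rng.gauss(0.0, noise_sd)
        elapsed_ms += step_ms
    return elapsed_ms + planning_ms


def read_sequence(drifts, threshold=1.0, noise_sd=0.05, planning_ms=150, seed=0):
    """Fixate each word in order; a lower drift stands for a harder word."""
    rng = random.Random(seed)
    return [
        simulate_fixation(threshold, drift, noise_sd, planning_ms, rng=rng)
        for drift in drifts
    ]


# Hypothetical three-word trial: the middle word is harder to process.
durations = read_sequence([0.25, 0.05, 0.25])
print(durations)
```

Because the planning delay is added only after the threshold crossing, any difference in lexical processing time carries over one-for-one into fixation duration, which is the sense in which timing is under direct lexical control in this sketch.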

In document Adaptive Eye Movement Control in a Simple Linguistic Task. (Page 43-46)