3.0 Chapter Introduction

3.0.2 What does the n-back task measure?

The n-back procedure has received some criticism over its validity as a working memory measure. Though the n-back task has face validity as a working memory task, it has shown little correlation with complex span (Jaeggi, Buschkuehl, et al., 2010; Redick & Lindsey, 2013; Simmons, 2000), a task often considered the gold standard for measuring working memory capacity (Shelton, Elliott, Matthews, Hill, & Gouvier, 2010). Complex span tasks are commonly used to measure working memory ability because they strongly predict performance on tests of higher-order cognition, such as fluid intelligence measured through reasoning tasks (Barrouillet & Lecas, 1999; Conway, Cowan, Bunting, Therriault, & Minkoff, 2002; Conway et al., 2003; Kyllonen & Christal, 1990; Unsworth & Engle, 2007). This lack of concurrent validity is therefore problematic for models that suggest both n-back and complex span measure the same working memory construct (see Kane, Conway, Miura, & Colflesh, 2007).

Paradoxically, however, despite the apparent disparity between complex span and n-back measures, a relationship between n-back performance and fluid intelligence has also been observed (Jaeggi, Buschkuehl, Jonides, & Perrig, 2008; Kane et al., 2007; Schmiedek, Lövdén, & Lindenberger, 2014). Whether the ability of both complex span and n-back performance to predict higher-order cognition reflects a shared mechanism across the tasks, or independent contributions to intelligence, remains unclear.

Kane et al. (2007) found both tasks predicted independent variance in measures of fluid intelligence, and a meta-analysis by Redick and Lindsey (2013) supports the view that the two tasks cannot be used interchangeably as working memory measures. They suggest that this discrepancy instead supports a multi-faceted working memory system that includes non-unitary executive functions such as shifting, updating, and inhibition (Miyake et al., 2000; Oberauer, 2009). Indeed, a key difference between complex span and n-back tasks, and a possible reason for a weak relationship between the two, is the reliance on retrieval through recall in the former and through recognition processes in the latter (Harbison, Atkins, & Dougherty, 2011; Jaeggi, Buschkuehl, et al., 2010; Oberauer, 2005; Redick & Lindsey, 2013; Shelton et al., 2010). Specifically, the recognition process required for the n-back task is influenced by both familiarity and recollection processes (Jaeggi, Buschkuehl, et al., 2010; Kane et al., 2007). A familiarity signal (for example, through elevated activation in long-term memory; Oberauer, 2009) is found in recently-presented items, which on its own may be sufficient for accepting target items (Harbison et al., 2011; Kane et al., 2007). However, the inclusion of ‘recent’ lures (i.e. lures that have previously appeared in positions n-1, n+1, or n+2) produces a familiarity signal comparable to that of target items, so this signal alone is not sufficient to distinguish targets from non-targets. Participants are therefore required to engage in a control process over a familiarity signal to determine the correct position of the stimulus (Harbison et al., 2011; Szmalec, Verbruggen, Vandierendonck, & Kemps, 2011).
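The distinction between targets, recent lures, and other non-targets can be made concrete with a minimal Python sketch. The function name, label strings, and example sequence are illustrative assumptions and are not taken from any cited procedure. Lure offsets follow the definition given above (n-1, n+1, n+2), and the n-1 offset is dropped for a 1-back task, where it would refer to the current item itself.

```python
from typing import List


def classify_trials(sequence: List[str], n: int) -> List[str]:
    """Label every trial in an n-back sequence (hypothetical labelling scheme).

    'target'      : the item matches the item presented n positions earlier.
    'recent_lure' : not a target, but matches the item n-1, n+1, or n+2 back.
    'other'       : neither of the above (includes non-recent lures and new items).
    """
    # Offsets at which a repeated item counts as a recent lure; an offset of 0
    # (n-1 when n == 1) would point at the current item, so it is excluded.
    lure_offsets = [k for k in (n - 1, n + 1, n + 2) if k >= 1]

    labels = []
    for i, item in enumerate(sequence):
        if i >= n and sequence[i - n] == item:
            labels.append("target")
        elif any(i >= k and sequence[i - k] == item for k in lure_offsets):
            labels.append("recent_lure")
        else:
            labels.append("other")
    return labels


# Example: a 2-back task over a short letter sequence.
example_labels = classify_trials(list("ABABCAC"), n=2)
print(example_labels)
# ['other', 'other', 'target', 'target', 'other', 'recent_lure', 'target']
```

In this example the sixth item (A) last appeared three positions earlier, so it is familiar but is not a 2-back match. This is the kind of trial on which familiarity alone would produce a false alarm.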

This process of control over familiarity may obscure the relationship between the n-back task and recall-based working memory measures (Kane et al., 2007). This has been supported by implementing a recall modification to the n-back (Shelton, Metzger, & Elliott, 2007; Wilhelm et al., 2013), which has shown a stronger relationship with complex span performance (e.g. r = .32 with operation span and r = .41 with listening span, Shelton et al., 2010). In this procedure, participants are presented with multiple sequentially-presented lists of items that vary in length, and at the end of each list are instructed to report the 1-, 2-, or 3-back item. Recall performance for 2- and 3-back items is then used to index working memory ability. Consequently, the task tests maintenance and manipulation without requiring a process of matching a probe to the n-back position.

Alternatively, the discrepancy between these working memory measures may be due to the design of the n-back procedures. Harbison et al. (2011) suggest n-back and complex span tasks show a weak relationship because in most demonstrations of the n-back task, approximately 50% of trial items appear as ‘non-recent’ lures. This means that a large proportion of an n-back score, based on an index of hits and false alarms, is made up of lure items where the participant need not attempt recollection. However, it should be noted that Kane et al. (2007) compared n-back scores based on only recent-lure performance with operation span, and found only weak correlations between the two.

That is, for n-back lure items where control over familiarity-based responding was challenged, a weak relationship with complex span tasks remained. To be clear, though the reliance on item familiarity for these lure types may complicate the assessment of the relationship between the n-back and complex span, the inclusion of these trials does not appear to be the primary reason for any disparity.
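The scoring argument above can be illustrated with a short Python sketch. The source describes the n-back score only as "an index of hits and false alarms", so the index used here (hit rate minus false-alarm rate) is an assumption, as are the function name, the trial labels, and the made-up responses. The sketch shows only how a score computed over all non-targets can differ from one restricted to recent lures.

```python
from typing import Dict, List


def score_nback(labels: List[str], said_match: List[bool]) -> Dict[str, float]:
    """Compute hit and false-alarm rates from labelled trials (hypothetical scheme).

    labels     : 'target', 'recent_lure', or 'other' for each trial.
    said_match : True where the participant responded "match" on that trial.
    """
    def rate(kinds: set) -> float:
        # Proportion of "match" responses among trials of the given kinds.
        responses = [r for lab, r in zip(labels, said_match) if lab in kinds]
        return sum(responses) / len(responses) if responses else float("nan")

    hit_rate = rate({"target"})
    fa_all = rate({"recent_lure", "other"})   # false alarms over all non-targets
    fa_recent = rate({"recent_lure"})         # false alarms on recent lures only
    return {
        "hit_rate": hit_rate,
        "fa_all": fa_all,
        "fa_recent": fa_recent,
        "index_all": hit_rate - fa_all,        # assumed index: hits minus false alarms
        "index_recent": hit_rate - fa_recent,  # same index, recent lures only
    }


# Made-up data: the participant accepts every target, falsely accepts one of
# two recent lures, and correctly rejects every other non-target.
labels = ["target", "target", "recent_lure", "recent_lure",
          "other", "other", "other", "other"]
said_match = [True, True, True, False, False, False, False, False]
scores = score_nback(labels, said_match)
print(scores["index_all"], scores["index_recent"])
```

With these invented responses the overall index is noticeably higher than the recent-lure index, because the easily rejected non-target trials dilute the false-alarm rate.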

Though there are clearly issues with treating n-back performance as working memory capacity analogous to that measured in complex span tasks, it is unclear whether they reflect unrelated constructs in working memory. Indeed, the differences across measures might simply reflect paradigm-specific variance which systematically reduces the observed relationships between the two tasks (Schmiedek, Hildebrandt, et al., 2009).

This mismatch of memory-test methods, in addition to content-specific variance and measurement error, is proposed to be responsible for the low correlations between n-back and other tasks. The underlying working memory constructs may be better examined by assessing the relationship between latent variables based on multiple versions of both n-back and complex span (Schmiedek et al., 2014). This method has consequently revealed near-perfect relationships between the two categories of tasks (Schmiedek, Hildebrandt, et al., 2009; Schmiedek et al., 2014; Wilhelm et al., 2013).

Whilst correlations between individual tasks did vary considerably, latent variables of complex span (from measures of reading span, counting span, and rotation span) and the n-back task (using numerical and spatial n-back) correlated substantially (e.g. r = .69, Schmiedek et al., 2014). Furthermore, these variables loaded highly onto a working memory factor, which in turn was predictive of reasoning ability. The authors suggest a better measure of working memory can be achieved by using multiple heterogeneous tasks such as the complex span and n-back to produce a latent working memory factor.
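The logic by which task-specific variance lowers observed correlations can be shown with a simplified arithmetic sketch. It assumes standardised scores, a single indicator per construct, and uncorrelated residuals, in which case the expected correlation between two observed tasks is the product of their factor loadings and the latent correlation. The loadings below are hypothetical, and the sketch is not the structural equation modelling reported by the cited authors.

```python
def expected_observed_r(loading_x: float, loading_y: float, latent_r: float) -> float:
    """Expected correlation between two observed task scores.

    Assumes a standardised factor model with uncorrelated residuals:
    r_observed = loading_x * loading_y * latent_r.
    """
    for name, value in (("loading_x", loading_x), ("loading_y", loading_y)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    if not -1.0 <= latent_r <= 1.0:
        raise ValueError("latent_r must lie between -1 and 1")
    return loading_x * loading_y * latent_r


# Hypothetical loadings: even if the two latent constructs were perfectly
# correlated, the single-task correlation would be modest.
r = expected_observed_r(loading_x=0.6, loading_y=0.5, latent_r=1.0)
print(round(r, 2))  # 0.3
```

Under these assumptions, a weak correlation between one n-back task and one complex span task is compatible with a strong relationship at the latent level.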

The shared explained variance from complex span and n-back tasks indicates a crucial component of working memory capacity utilised in both procedures, and in measures of reasoning ability for which both tasks have predictive utility. Controlled attention, proposed as a domain-general process for maintaining and manipulating working memory items (Cowan, 1999; Engle et al., 1999), may be the source of this shared variance. Transfer effects from training in the n-back procedure to performance in reasoning tests are proposed to occur because attentional control is essential for both tasks (Jaeggi, Studer-Luethi, et al., 2010; Jaeggi et al., 2008). Similarly, attentional control is also considered essential for complex span tasks, as information about the stimulus must be accessible whilst attention shifts to the processing task (Engle & Kane, 2004; Kane et al., 2004). Alternatively, the common source of variance between complex span and n-back tasks may be the ability to create, maintain, and update bindings (Oberauer, 2005, 2009; Wilhelm et al., 2013). That is, the n-back task requires constant creation and updating of bindings between an item and its context, whilst complex span tasks have similar binding requirements where the item must be bound to its serial position (Wilhelm et al., 2013). In Wilhelm et al., a binding factor accounted for 100% of an updating factor’s variance that included the n-back task, and 90% of complex span variance.

In summary, when variance from retrieval differences between the n-back and complex span tasks is accounted for, there is a strong relationship between the two tasks that suggests they share a common working memory function (Schmiedek et al., 2014; Wilhelm et al., 2013). Consequently, the n-back task appears to be a valid measure of working memory, though the influence of familiarity-based recognition on n-back performance should be considered when designing this working memory measure (e.g. Kane et al., 2007).
