3.1 Experiment 3: Separating decay and interference in continuous recognition
3.1.1 Introduction
Experiments 1 and 2 made extensive use of the continuous recognition paradigm and plotted conventional curves of recognition performance as a function of lag. Whilst lag nominally denotes the number of items intervening between study and test presentations, because trials are presented at a constant rate, the time between study and test is also proportional to lag. In explaining the results obtained it is not possible to distinguish between the effect of time and the effect of intervening stimuli. Traditional accounts of loss of information from memory utilise two concepts: decay and interference. Decay-based theories propose a loss that is directly related to the time elapsed since exposure to the material to be memorised, whereas interference theory proposes that forgetting is caused by interference from other information. In order to discriminate between the two, a continuous recognition experiment was carried out in which interstimulus interval was manipulated.
The standard model of short-term memory (STM) is based upon the idea of activation, a mnemonic property that keeps information in an accessible form (Cowan, 2001; Shiffrin, 1999). Information may be activated from long-term stores to become accessible in STM. Two counter-acting processes are proposed to determine whether information is retained or forgotten: rehearsal and decay. The activated items are presumed to rapidly decay with time unless they are consciously rehearsed. This rehearsal is sufficient to ‘refresh’ the memory, and counteract decay of activated information.
Evidence for decay as the mechanism of forgetting from STM came from experiments which prevented rehearsal by requiring participants to count backwards, or read a list of digits, after presentation of the material (lists of letters) to be retained (J. Brown, 1958; Peterson & Peterson, 1959). The rapid loss of the to-be-recalled information was considered striking because the distracting material (digits) was sufficiently different from the study material (letters) that interference could be considered to have been avoided (Wickens, Born, & Allen, 1963). The authors concluded that, during the period in which rehearsal was prevented, the memory trace decayed over time.
Support for this conclusion came from later findings that forgetting occurred in the absence of interference from previous trials (i.e. on the very first trial) when ceiling effects were avoided (A. D. Baddeley & Scott, 1971). It has been suggested that decay is an adaptation to the statistical structure of the environment (Anderson & Milson, 1989) as it allows “overwriting” of the memory with the most recent and relevant information about the individual’s environment.
Some authors (e.g. Nairne, 2002), however, have dispensed with the rehearsal mechanism in their theories, pointing to evidence that item-based differences in memory remain when rehearsal is blocked and articulation rates are held constant. For example, high-frequency words have a greater memory span than low-frequency words when the articulation rate is held constant (Roodenrys, Hulme, Alban, Ellis, & Brown, 1994), as do words over nonwords (Hulme, Maughan, & Brown, 1991), and concrete words over abstract words (I. Walker & Hulme, 1999). Theories relying solely on decay ignore the effects of the retrieval environment and the type of activity between study and retrieval. In contrast, many authors have attempted to create models of STM without appealing to the process of decay, explaining forgetting by the various forms of interference that can occur (e.g. G. D. A. Brown & Hulme, 1995; G. D. A. Brown, Preece, & Hulme, 2000; Neath, 1993).
The idea that the interference of existing memories with new information is responsible for forgetting has a long history dating back to the work of McGeoch (1932). Interference can be either proactive (when associations learned prior to current learning interfere with current learning) or retroactive (when associations learned after the learning of material to be retained interfere with that learning). Keppel and Underwood (1962) proposed that proactive interference could account for the findings of Peterson and Peterson (1959).
Rather than the memory trace for study material decaying whilst the participants were engaged in the distractor task, Keppel and Underwood argued that forgetting occurred due to the build-up of interference from previously learnt material. Waugh and Norman (1965) found that varying the rate of presentation in a list-based memory task had little effect compared to the serial position of an item, suggesting that time was not as important as the number of intervening, interfering items between study and test. However, whilst interference-based explanations for forgetting are attractive, it should be borne in mind that any putative effect of time can be explained by various combinations of retroactive and proactive interference (Cowan, 2001).
Whilst interference and decay have historically been contrasted with one another as opposing explanations, a recent theory of Altmann and Gray (2002) combines decay- and interference-based explanations of forgetting. The functional decay model proposes that, when an attribute has to be constantly updated in memory, its value decays to prevent proactive interference. This rate of decay is theorised to adapt to the rate at which memory must be updated.
Such a system would avoid memory becoming rapidly filled with items that would hinder recall of the most recent information (proactive interference). The authors found evidence for this in the gradual decline of performance on the current task set in a task-switching experiment. This decline was attributed to memory decay. Single-digit number stimuli were presented, and participants were required to label them either as odd or even, or greater or less than 5. The less frequently the task was updated, the more gradual the decline in performance became, suggesting an adaptive rate of forgetting.
Shepard and Teghtsoonian designed the continuous recognition paradigm with the intention of studying recognition under conditions in which “the possibility of rehearsal is minimized while the interference of preceding material is maximized” (Shepard & Teghtsoonian, 1961, p. 303). Their aim was to create an experimental situation in which forgetting occurred at a constant rate, although their results demonstrate that the false alarm rate was still gradually increasing at the end of the experiment, suggesting that this steady state was never actually achieved. This continual increase in the false alarm rate suggested that memory stores became overloaded with previous stimuli, making novel stimuli progressively more difficult to distinguish from an increasingly large set in memory. However, this effect was not found in Experiments 1 and 2; rather, the false alarm rate initially increased and then reached a plateau. This finding complements the findings of other researchers (Doty & Savakis, 1997; Hockley, 1982) who noted that reaction times for correct responses in a continuous recognition procedure did not increase as a function of list position.
Whether or not overall list position of a stimulus affects old/new discrimination, a universally reported effect is the effect of lag. For example, Hockley (1982) noted that correct reaction times increased with increasing lag, and that the function was approximately logarithmic. Doty and Savakis (1997) have studied old/new discrimination of completely novel stimuli from items tested in continuous recognition 1 or 2 weeks previously. Participants carried out continuous recognition of one set of complex, abstract images on week 1 (mean d’=1.82), and a second on week 2 (mean d’=1.26). On week 3 they carried out a
similar continuous recognition experiment where half of the stimuli had been seen previously in either the first or second weeks. The effect was to greatly decrease discriminative ability to a mean d’ of 0.52. The same participants carried out a similar procedure with common 4-letter words on weeks 4 and 5 and the resulting decrease in mean d’ was from 2.97 to 0.89. Significant increases in reaction times were also observed. Nonetheless, the authors concluded that the “robust behavioral retention over a period of 1 week or more offers the likelihood that time per se is not critical” (Doty & Savakis, 1997, p. 292).
To the author’s knowledge no study has attempted to dissociate the effects of number of intervening items and time under conditions of continuous recognition, with the same participants, within the same experimental session. In the current experiment the intertrial interval (the time elapsed between successive trials) varied systematically across three blocks of either “long” (8 sec), “medium” (3.5 sec) or “short” (1.25 sec) values. Within each block the frequencies of lags were controlled. As such the effect of intertrial interval and number of intervening items could be separated.
Most of the classic literature regarding forgetting is based on experiments using words and other verbalisable stimuli, and often refers to subvocal rehearsal as a mechanism for maintaining information (e.g. B. B. J. Murdock, 1961; Peterson & Peterson, 1959; Waugh & Norman, 1965). Given the assumption set out in Chapter 2 that certain visual stimuli (e.g. fractals) are not amenable to verbal labels, it was important to dissociate the effects of intertrial interval and number of intervening items for both verbalisable and non-verbalisable stimuli.
Each participant, therefore, carried out two sessions: one where the stimuli were fractals and one where they were trigrams.
3.1.2 Methods
3.1.2.1 Design
The experiment had a three-way within-subjects design. The independent variables were the type of stimulus tested, the time interval between successive stimuli (intertrial interval), and the number of intervening trials (lag). Stimulus type had 2 levels: fractals and trigrams. Intertrial interval had 3 levels: short (1.25 sec), medium (3.5 sec), and long (8 sec), giving total trial durations of 2.25 sec, 4.5 sec, and 9 sec, respectively. Lag had 3 levels: 1, 4 and 9 intervening items. The dependent variables were the participants’ d’ scores and their reaction times for correct recognition.
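The way in which this design separates elapsed time from the number of intervening items can be illustrated with a short calculation. The sketch below is illustrative only: it takes the study-test interval to be the time from the offset of the study presentation to the onset of the test presentation, assumes a 1 sec stimulus presentation on every trial, and uses function and constant names that are hypothetical rather than taken from the experimental software.

```python
# Illustrative sketch: study-test interval for each lag x intertrial interval cell.
# Assumption: every trial is a 1 sec stimulus followed by a blank intertrial interval.
STIMULUS_DURATION = 1.0  # sec
INTERTRIAL_INTERVALS = {"short": 1.25, "medium": 3.5, "long": 8.0}  # sec
LAGS = (1, 4, 9)  # number of intervening items


def study_test_interval(lag, iti):
    """Seconds from study offset to test onset.

    The blank period after the study item, plus one complete trial
    (stimulus + blank period) for each intervening item.
    """
    return iti + lag * (STIMULUS_DURATION + iti)


for name, iti in INTERTRIAL_INTERVALS.items():
    cells = ", ".join(
        "lag {}: {:.2f} sec".format(lag, study_test_interval(lag, iti)) for lag in LAGS
    )
    print("{:>6}: {}".format(name, cells))
```

Under these assumptions, lag 9 at the short interval and lag 4 at the medium interval share the same study-test interval (21.5 sec), as do lag 9 at the medium interval and lag 4 at the long interval (44 sec). Cells of this kind hold time constant whilst the number of intervening items differs.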
3.1.2.2 Participants
A total of 21 participants were tested, of whom 16 were female and 5 male. All were students at the University of Nottingham with a mean age of 20±0.3 years. All participants had normal or corrected-to-normal vision.
In order to exclude data from participants who ‘gave up’ on the tasks during one or other of the relatively long 35 min sessions, criteria were set for inclusion of the data in analyses. Each participant’s data were entered into a one-tailed chi-square test to determine whether the number of correct responses was significantly above that expected by chance.
3.1.2.3 Stimuli
Two different categories of stimulus were employed: fractals and trigrams.
These stimulus sets were the same as those described in the Methods section of Experiment 1.
3.1.2.4 Presentation
All stimuli were presented using an Apple Macintosh G3 computer (300 MHz, 384 MB RAM) with an ATI Radeon 7000 (32 MB) graphics card, on a 21” Mitsubishi colour display monitor (size: 1024 x 768 pixels, resolution: 72 x 72 dpi, refresh rate: 75 Hz). Stimuli were presented in the centre of the screen using Matlab v5.2.1 (Mathworks UK), running the psychophysics toolbox. All bitmap images (fractals) were converted to a standard size of 59 x 59 pixels, and had an area of 2 cm x 2 cm when displayed at the screen resolution described above.
Participants were seated at a distance of 57.5 cm from the screen so stimuli subtended an area of approximately 2° x 2° of visual angle.
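The stated visual angle can be checked with the standard formula, angle = 2·arctan(size / (2·distance)). The following is a minimal sketch of that check; the function name is hypothetical and the calculation is not taken from the presentation software.

```python
import math


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (degrees) subtended by an object of a given size at a given distance."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


# A 2 cm stimulus viewed from 57.5 cm
angle = visual_angle_deg(2.0, 57.5)
print("{:.2f} degrees".format(angle))
```

At this viewing distance 1 cm on the screen corresponds to almost exactly 1° of visual angle, which is why the 2 cm stimuli subtend approximately 2°.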
3.1.2.5 Procedure
The experiment took the form of two sessions, identical except for the type of stimuli (fractals or trigrams) and the order of intertrial interval blocks (see below). The order in which the participants carried out the fractals session and the trigrams session was counterbalanced across participants.
For each session a pseudorandomly determined frame of 120 trials was generated, providing 16 study and 16 test trials at each of the 3 lags. These lags (1, 4, and 9) were selected as early, mid, and late points from the retention curves obtained in Chapter 2. Twelve study-test pairs of unscored filler trials made up the spaces in the frame.
This frame, or block, was repeated three times, with novel stimuli each time. For each block the intertrial interval duration was different (1.25 sec, 3.5 sec, or 8 sec), with the order in which these occurred counterbalanced across participants. In addition, the experiment began with a buffer of 40 unscored filler trials (20 study-test pairs) to prevent the occurrence of primacy effects. Intertrial interval for the buffer was always medium length (3.5 sec). This yielded a total of 400 trials.
Once the order of trials for the entirety of a session had been generated, the 200 stimuli were randomly assigned to the 200 pairs of trials, so that each participant experienced the stimuli in a different order. Once begun, the session progressed through the buffer, and the three blocks continuously without breaks in between, the aim being to maintain the steady state of performance encountered previously (see Chapter 2).
For each trial, stimuli appeared on the screen for 1 sec. During this period participants were instructed to respond either ‘old’ or ‘new’ with the left or right mouse button respectively, ‘old’ to previously seen and ‘new’ to previously unseen items. So that participants did not forget which button corresponded to which response during the experiment, a label showing the button assignments was placed in clear view of the participant. After 0.5 sec the stimulus was replaced with a blank screen. One second later feedback was given for incorrect responses, or where no response was given. Feedback consisted of a low-pitched beep. No feedback was given if the response was correct. The total length of time that the screen remained blank varied according to the intertrial interval. For the short intertrial interval the total period was 1.25 sec (total trial length of 2.25 sec), for medium it was 3.5 sec (trial length = 4.5 sec), and for long 8 sec (trial length = 9 sec). The experiment lasted a total of 34.5 min.
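The trial counts and session duration given above can be cross-checked arithmetically. The sketch below assumes that the buffer contributes 40 trials in total (20 study-test pairs), which is the reading that agrees with the stated totals of 400 trials and 200 stimuli; all variable names are hypothetical.

```python
# Illustrative check of the session structure described in the Procedure.
STIMULUS_DURATION = 1.0        # sec per stimulus presentation
LAGS = (1, 4, 9)
SCORED_PAIRS_PER_LAG = 16      # study-test pairs at each lag, per block
FILLER_PAIRS_PER_BLOCK = 12    # unscored filler pairs, per block
BLOCK_ITIS = (1.25, 3.5, 8.0)  # one block per intertrial interval (sec)
BUFFER_TRIALS = 40             # assumption: 40 trials, i.e. 20 pairs
BUFFER_ITI = 3.5               # sec

pairs_per_block = SCORED_PAIRS_PER_LAG * len(LAGS) + FILLER_PAIRS_PER_BLOCK
trials_per_block = 2 * pairs_per_block

total_pairs = BUFFER_TRIALS // 2 + pairs_per_block * len(BLOCK_ITIS)
total_trials = BUFFER_TRIALS + trials_per_block * len(BLOCK_ITIS)

total_seconds = BUFFER_TRIALS * (STIMULUS_DURATION + BUFFER_ITI) + sum(
    trials_per_block * (STIMULUS_DURATION + iti) for iti in BLOCK_ITIS
)

print(trials_per_block, total_pairs, total_trials, total_seconds / 60)
```

On this reading the frame contains 120 trials, the session contains 200 pairs (400 trials), and the total duration is 34.5 min, in agreement with the figures reported in the text.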
3.1.2.6 Scoring
As mentioned previously, participants’ responses were only detected during the 1 sec presentation period of each stimulus. Once a response had been made it was final, and no opportunity for the correction of responses was allowed.
Reaction time was measured as the latency from the start of the trial until the detection of the response.
3.1.3 Results
‘No responses’, when participants failed to respond within the timeframe, were not scored. The number of correct responses to test presentations for each lag, for each interstimulus interval, for each stimulus type, was divided by the total number of scored test presentations for that category, to obtain the hit rate, or proportion of correct responses.
From these results a one-tailed chi square test was conducted on each participant’s responses, to determine that the number of ‘old’ responses to repeated stimuli was significantly above that expected at chance. For four participants there was no significant difference, and those participants’ data were omitted from further analysis.
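The exact form of the inclusion test is not specified here, so the following is only one plausible reading: a chi-square goodness-of-fit test with 1 degree of freedom comparing ‘old’ and ‘new’ responses to repeated stimuli against a 50% chance level, made one-tailed by halving the p value when the observed hit count exceeds the expected count. The chance level, the function name, and the example hit count are all assumptions introduced for illustration.

```python
import math


def one_tailed_chi_square(hits, n_tests, chance=0.5):
    """Chi-square (1 df) against chance, with a one-tailed p for 'above chance'.

    For 1 df the upper-tail probability of chi-square equals
    erfc(sqrt(chi2 / 2)), so no external statistics library is needed.
    """
    misses = n_tests - hits
    expected_hits = n_tests * chance
    expected_misses = n_tests * (1 - chance)
    chi2 = ((hits - expected_hits) ** 2 / expected_hits
            + (misses - expected_misses) ** 2 / expected_misses)
    p_two_tailed = math.erfc(math.sqrt(chi2 / 2))
    if hits > expected_hits:
        p_one_tailed = p_two_tailed / 2
    else:
        p_one_tailed = 1 - p_two_tailed / 2
    return chi2, p_one_tailed


# Hypothetical participant: 90 'old' responses to 144 scored test presentations
# (16 test trials x 3 lags x 3 blocks, if a response was made on every trial).
chi2, p = one_tailed_chi_square(hits=90, n_tests=144)
print(chi2, p, p < 0.05)
```

A participant would be retained under this sketch when the one-tailed p value falls below the chosen criterion; the criterion itself is not stated in the text.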
To obtain a measure of recognition sensitivity independent of the participant’s bias to respond ‘old’, d’ was calculated from the hit rates and false alarm rates. The values for fractals and trigrams are presented as a function of the number of intervening items between study and test in Figure 3.1 and Figure 3.3, and as a function of study-test interval in Figure 3.2 and Figure 3.4. Study-test interval is the period of time, in seconds, elapsing between the end of the initial study presentation of a stimulus, and the onset of the second, test presentation.
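For reference, the conventional equal-variance signal detection definition is d’ = z(hit rate) − z(false alarm rate), where z is the inverse of the standard normal cumulative distribution function. The sketch below implements that standard definition only; whether any correction was applied to hit or false alarm rates of exactly 0 or 1 is not stated here, so none is applied, and the function name and example rates are hypothetical.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Equal-variance d': z(hit rate) - z(false alarm rate).

    Rates of exactly 0 or 1 have no finite z score; inv_cdf raises
    StatisticsError for them, and no correction is attempted here.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


# Hypothetical rates for one lag x interval cell
print(round(d_prime(0.80, 0.20), 2))
```

Because the false alarm rate enters the calculation, two participants with the same hit rate can obtain different d’ values if one of them is more inclined to respond ‘old’ overall.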
Figure 3.1: d’ values for fractal recognition, as a function of the number of items intervening between study and test. Series: 1.25 sec ISI, 3.5 sec ISI, 8 sec ISI. ISI = interstimulus interval. Data = Mean ± SEM.
Figure 3.2: d’ values for fractal recognition, as a function of study-test interval (sec). Series: 1.25 sec ISI, 3.5 sec ISI, 8 sec ISI. ISI = interstimulus interval. Data = Mean ± SEM.
Figure 3.3: d’ values for trigram recognition, as a function of the number of items intervening between study and test. Series: 1.25 sec ISI, 3.5 sec ISI, 8 sec ISI. ISI = interstimulus interval. Data = Mean ± SEM.
Figure 3.4: d’ values for trigram recognition, as a function of study-test interval (sec). Series: 1.25 sec ISI, 3.5 sec ISI, 8 sec ISI. ISI = interstimulus interval. Data = Mean ± SEM.
3.1.3.1 D-prime scores
Mauchly’s test of sphericity was significant for lag, and all subsequent results for lag are Greenhouse-Geisser epsilon corrected. A 2 (stimulus type) x 3 (interstimulus interval) x 3 (lag) repeated-measures ANOVA was performed on d’ data and revealed a significant main effect of stimulus type (F(1,16)=77.5, MSe=0.558, p<0.001). Fractal d’ scores were higher than those for trigrams, a finding consistent with the findings of Experiment 1. There was also a significant main effect of lag (F(1.33,21.2)=23.8, MSe=0.353, p<0.001), and Tukey’s post-hoc tests revealed significantly higher scores at lag 1 than lag 4 (p<0.01), at lag 1 than lag 9 (p<0.001), and at lag 4 than lag 9 (p<0.01). This finding is unsurprising and consistent with the effects of lag reported in Chapter 2. There was no main effect of interstimulus interval. No interactions between any of the factors reached significance. The amount of time elapsing between study and test had no effect on the participants’ performance, which was instead affected by the number of items intervening between study and test.
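The fractional degrees of freedom reported for lag follow from the Greenhouse-Geisser correction, which multiplies both the effect and error degrees of freedom by an estimate of epsilon. The sketch below shows that relationship; the epsilon value used is inferred from the reported degrees of freedom rather than reported directly, and the function name is hypothetical.

```python
def greenhouse_geisser_df(n_levels, n_participants, epsilon):
    """Corrected (effect, error) degrees of freedom for a one-factor
    repeated-measures effect, given an estimate of epsilon."""
    df_effect = (n_levels - 1) * epsilon
    df_error = (n_levels - 1) * (n_participants - 1) * epsilon
    return df_effect, df_error


# 3 lags, 17 participants retained (21 tested, 4 excluded).
# Assumption: epsilon of about 0.66, inferred from the reported F(1.33, 21.2).
df_effect, df_error = greenhouse_geisser_df(3, 17, 0.6625)
print(df_effect, df_error)
```

Without the correction the lag effect would have been tested on (2, 32) degrees of freedom, as is the case for the uncorrected effects reported elsewhere in the Results.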
Figure 3.5: Reaction time values for correct recognition of fractals (ms). Series: 1.25 sec ISI, 3.5 sec ISI, 8 sec ISI. ISI = interstimulus interval. Data = Mean ± SEM.
Figure 3.6: Reaction time values for correct recognition of trigrams (ms). Series: 1.25 sec ISI, 3.5 sec ISI, 8 sec ISI. ISI = interstimulus interval. Data = Mean ± SEM.
3.1.3.2 Reaction times
The reaction time data for hits (correct recognition) are shown in Figure 3.5 (fractals) and Figure 3.6 (trigrams). Mauchly’s test of sphericity was significant for the stimulus type x intertrial interval interaction, the stimulus type x lag interaction, the intertrial interval x lag interaction, and the stimulus type x intertrial interval x lag interaction. All subsequent results pertaining to these interactions are quoted as Greenhouse-Geisser epsilon corrected values. A 2 x 3 x 3 repeated-measures ANOVA similar to that described for the d’ results was carried out on these data, and revealed a significant main effect of stimulus type (F(1,16)=16.5, MSe=9640, p<0.01). Trigrams were recognised more rapidly than fractals, as was also the case in Experiment 1. There was also a significant main effect of intertrial interval (F(2,32)=64.9, MSe=3620, p<0.001). Short intertrial intervals were associated with faster reaction times for correct recognition than both medium and long intervals (p<0.001), and medium intervals were associated with
shorter reaction times than long intervals (p<0.001). There was no effect of lag on reaction times, in a departure from the results of Experiment 1, which showed significantly faster reaction times at lower lags for both stimulus sets. However, the reaction times in the current experiment were much faster on average than those recorded for Experiment 1, and this may have resulted in a ceiling effect.
No significant interactions were observed.
3.1.3.3 Serial position
Data for p(hit) and p(false) were again calculated for 10-trial epochs and are shown in Figure 3.7. As was found in Chapter 2, both p(hit) and p(false) were initially low and then increased throughout the course of the experiment before reaching a steady state, and p(false) was higher for trigrams than fractals.
Interestingly, the steady state for p(false) during trigram recognition appeared to have been achieved later in the current experiment, suggesting that trigram recognition was more difficult when interstimulus interval varied. This same effect was not present in fractal recognition. The data were complicated by the counterbalancing of the order in which short, medium, and long interstimulus intervals occurred, and no consistent trends associated with the start of experimental blocks were observable.
Figure 3.7: Hit and false alarm rates for recognition data by epoch. Series: Fractals p(hit), Fractals p(false), Trigrams p(hit), Trigrams p(false). Epoch = 10 trials. Arrows indicate the start of each experimental block. Data = mean ± SEM.
3.1.3.4 Summary
Increasing lag decreased discriminative ability, as measured by the d’ scores, but it had no effect on reaction times for hits. Conversely, whilst increasing interstimulus interval had no effect on d’ scores, it led to significantly slower reaction times for correct recognition. That is, whilst accuracy of recognition was only affected by the number of items intervening between study and test, speed of recognition was affected only by the rate of presentation. No significant interactions were found between factors for any of the data, suggesting that recognition of both the fractals and the trigrams was similarly affected by interstimulus interval and lag. However, the serial position data suggest that trigram recognition was significantly adversely affected by the
manipulation of interstimulus interval, resulting in higher false alarm rates, and a