
CHAPTER 1. GENERAL INTRODUCTION

1.3.4. Initial theoretical framework

Over the past couple of decades, converging behavioural and neuroimaging evidence has supported the idea that multisensory integration and selective attention can influence each other. As described in Section 1.3.2, there are instances in which temporally coincident signals from different modalities are integrated pre-attentively and effortlessly into a salient emergent multimodal object that has an increased ability to attract exogenous visual attention (Olivers et al., 2008). In contrast, the literature discussed in Section 1.3.3 strongly suggests that certain forms of cross-modal interaction depend on endogenous attention. How can these disparate results concerning the direction of interaction between selective attention and multisensory integration be reconciled?

Recently, Talsma, Senkowski, Soto-Faraco, and Woldorff (2010) proposed an initial theoretical framework to describe and explain the factors that play a crucial role in this interplay. According to their model, the ‘complexity’ of the multisensory environment determines whether multisensory integration requires attentional resources or occurs automatically. Talsma et al. (2010) defined ‘environment complexity’ as the degree of ongoing competition between stimuli within each modality. In particular, the probability of effortless, automatic multisensory integration is higher in task contexts in which the competition between stimuli in the other modality is low, e.g., in contexts where these events are rare. The infrequency of these stimuli should increase their bottom-up salience and trigger a neural response strong enough to be automatically associated with the response to a concurrently presented object or event in the primary modality. For example, this account is supported by the results of Olivers et al. (2008), who demonstrated that sparse auditory stimulation can increase the ability of a concurrent visual object to be selected from a rapidly changing array.

In their framework, Talsma et al. (2010) contrasted this context with one in which multiple stimuli are presented in close succession in each modality. The competition for processing resources that occurs in such circumstances decreases the bottom-up salience of stimuli in the other modality, so that endogenous attention is needed to integrate the appropriate, task-relevant signals and to process the resulting multimodal object effectively. Support for this account is provided by the study of Talsma et al. (2007), in which the suppression of the neural response to unattended co-occurring visual and auditory stimuli could be explained by the depletion of processing resources caused by focusing attention on the concurrent RSVP task. In other words, in contexts where competition decreases the perceptual salience of the stimuli presented in the other modality, unimodal signals might first have to be selected separately by the respective within-modality attentional mechanisms in order to be integrated into a multimodal object. Notably, the notion that attention acts as the ‘glue’ that integrates the appropriate features in multi-stimulus environments is one of the core assumptions of FIT (see Section 1.1.2).

The framework proposed by Talsma and colleagues was the first attempt to integrate the plethora of findings that represent two very different facets of the interplay between multisensory integration and selective attention. Although the proposed hypotheses would benefit from being tested through systematic manipulation of the degree of competition between stimuli within the other modality, their plausibility appears to be supported by their resemblance to the tenets of the ‘perceptual load theory’ of visual attention (Lavie, Hirst, Fockert, & Viding, 2004; Lavie, 2005, 2010; Lavie & Tsal, 1994). According to this theory, if the competition for processing resources among visual stimuli is low, task-irrelevant stimuli will be processed automatically. However, if processing resources are depleted by a perceptually demanding task, the task-irrelevant distractors will be filtered out at early stages of information processing. Extrapolating to multisensory contexts, it can be assumed that when the task at hand does not impose high demands on perceptual resources, these resources will be automatically diverted to processing the co-occurrence of signals from different modalities, producing a salient multimodal object that is preferentially selected when presented among purely visual objects. The idea of Talsma and colleagues that the stimulus delivery rate (i.e., the length of the interval between the onset of one event and the onset of the next) is critical in determining whether signals are integrated across modalities in a pre-attentive manner is also consistent with perceptual load theory (see Fujisaki, Koene, Arnold, Johnston, & Nishida, 2006, for results showing that stimuli presented faster than one every 250 ms are not integrated in this way).

Overall, Talsma and colleagues (2010) were the first to attempt to identify a single factor that might be of critical importance for the direction of influence between selective attention and multisensory integration. The most important contribution of this model to the research presented in this thesis is that it highlights the need to investigate the interplay between visual attention and cross-modal integrative processes in ecologically valid contexts where multiple stimuli compete with each other for selection.

1.4. Methodological approach

A method well suited to investigating the influence of multisensory integration on visual attention combines behavioural performance measures with event-related potential (ERP) measures. The ERP technique involves recording brain activity at the scalp (i.e., the electroencephalogram, EEG) and separating the signal associated with the processing of specific events from the background noise. Together with behavioural indices of attentional selection, ERPs can provide important insight into whether audiovisually induced enhancements of the spatial selection of visual objects accompanied by irrelevant tones are already visible at stages concerned with selective perceptual processing. In the context of search for objects defined as conjunctions of visual and auditory features, ERPs, as a direct measure of the neural processing of items that do not require an overt response, can reveal whether and, if so, via which mechanisms (i.e., onset latency effects or only amplitude effects) the attentional selection of task-irrelevant cues changes depending on how many features they share with the target. Importantly, in this area, ERPs can reveal the stage at which selective processing is already controlled by integrated audiovisual object templates. The aim of the following sections is to describe the biophysical basis of the ERP technique and to provide an overview of ERP components and ERP effects associated with