
1. Introduction

1.2. Cross-modal correspondences

1.2.3. Explaining correspondences

Given that people can match across such a wide variety of sensory properties, any common mechanism underlying these matches must abstract the properties onto a shared dimension for comparison. The most commonly cited explanation is that stimuli are matched on intensity. At its simplest, the presence of stimulation in one modality is more similar to stimulation in a second modality than it is to an absence of stimulation. This works well for explaining loudness-luminance associations, as both the auditory and visual cortices increase their neural activation with increasing stimulus intensity (Goodyear & Menon, 1998; Jäncke, Shah, Posse, Grosse-Ryuken & Müller-Gärtner, 1998). However, it is less clear for pitch-luminance associations, since pitch is processed tonotopically in the auditory cortex rather than through greater activation for high pitch relative to low pitch (Talavage et al., 2004). One influential account is Walsh's (2003) 'A Theory of Magnitude' (AToM), which proposes that sensory features can be abstracted to 'higher' or 'lower' along a sensory dimension in the parietal cortex (Bueti & Walsh, 2009; Cohen Kadosh, Lammertyn & Izard, 2008; Cohen Kadosh, Cohen Kadosh & Henik, 2008; Pinel, Piazza, Le Bihan & Dehaene, 2004). This would mean that sensory features deemed 'higher' on each sensory dimension would be treated as more equivalent to one another, owing to their matching magnitudes, than to their 'lower' counterparts. One difference from 'intensity-matching' is that being higher on a magnitude scale need not involve the greatest neural intensity. This can bring the two explanations into conflict. For example, a relatively low-pitched tone might be 'low' on an abstracted dimension of magnitude, but increasing its volume would also make it high in terms of neural intensity. The neural intensity explanation would expect a loudness-brightness correspondence to occur, whereas the magnitude explanation might expect a colour of low luminance to be chosen.
These competing predictions may help to determine which explanation best accounts for these correspondences and, furthermore, whether certain explanations are more common or dominant than others. Individual differences in how correspondences are reached may help to explain why some individuals express loudness-brightness correspondences while others express loudness-darkness correspondences (Marks, 1974). Abstracted theories such as AToM are also more flexible regarding which dimension the individual focuses on for cross-sensory matching. For instance, a complex tactile object could be processed primarily using shape, texture or hardness information, dimensions which each have their own cross-sensory matches, rather than only by how stimulating the object is (Ludwig & Simner, 2013; Servos, Lederman, Wilson & Gati, 2001). Support for applying Walsh's AToM to correspondences comes from the finding that disruption to parietal regions also disrupts interference from incongruent correspondences (Bien, ten Oever, Goebel & Sack, 2011). The finding that abstracted numerosity is topographically arranged in the parietal cortex leaves the door open for specific neurological predictions regarding which correspondences are treated as more or less similar (Harvey, Klein, Petridou & Dumoulin, 2013). For instance, if a specific pitch is associated with an abstract value (and location) in the parietal cortex, its perceived similarity to a specific luminance (which has also been abstracted to the parietal cortex) might be based on the neural proximity of these two abstractions. That said, why certain features (such as pitch) would be treated as 'greater' than others is less well understood.
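
The conflict between the intensity-matching and magnitude accounts for a loud, low-pitched tone can be made concrete with a toy sketch. The code below is purely illustrative and is not a model taken from any of the cited work. The names `Tone`, `intensity_account`, `magnitude_account` and `accounts_conflict`, the 0 to 1 scales and the 0.5 midpoint are all hypothetical. The sketch further assumes that neural intensity tracks loudness alone and that the magnitude account is applied to pitch as the attended dimension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tone:
    """A tone on hypothetical normalised scales (0.0 = lowest, 1.0 = highest)."""
    pitch: float
    loudness: float


def intensity_account(tone: Tone) -> float:
    """Predicted luminance match if matching follows neural intensity.

    Assumption: neural intensity is taken to track loudness only.
    """
    return tone.loudness


def magnitude_account(tone: Tone) -> float:
    """Predicted luminance match if matching follows abstracted magnitude.

    Assumption: the listener attends to pitch, so magnitude tracks pitch only.
    """
    return tone.pitch


def accounts_conflict(tone: Tone, midpoint: float = 0.5) -> bool:
    """True when one account predicts a bright match and the other a dark one."""
    bright_by_intensity = intensity_account(tone) > midpoint
    bright_by_magnitude = magnitude_account(tone) > midpoint
    return bright_by_intensity != bright_by_magnitude


# Illustrative values only, not data.
quiet_low_tone = Tone(pitch=0.2, loudness=0.2)
loud_low_tone = Tone(pitch=0.2, loudness=0.9)

for label, tone in [("quiet, low pitch", quiet_low_tone),
                    ("loud, low pitch", loud_low_tone)]:
    print(label, "-> accounts conflict:", accounts_conflict(tone))
```

Under these assumptions only the loud, low-pitched tone separates the two accounts, which is why a stimulus of that kind is informative about which explanation a given individual relies on.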

Spence's (2011) theoretical framework for understanding correspondences separates them into three distinct categories with predictable effects on information processing. The first of these is the 'structural' account, in which neural connectivity between the corresponding regions leads to enhanced processing of congruent percepts. These regions may be directly connected, proximal, or connected via intermediary regions such as the parietal lobe in magnitude evaluations (Walsh, 2003). One example would be loudness-brightness correspondences (Marks, 1974). As this occurs at an early stage of processing, prior to conscious evaluation, it would be expected to affect early perceptual processing and to influence cognitive decisions (Marks, 1987). The second account is that of 'statistical' correspondences, which are learned associations arising from regularities in the environment. For example, the tendency for smaller animals to have higher-pitched vocalisations than larger animals leads to a pitch-size correspondence which aids multisensory integration (Parise & Spence, 2009). Statistical associations can also be induced artificially through training, as in attempts to train synaesthesia (Colizoli et al., 2012; Elias et al., 2004; Meier & Rothen, 2009; Rothen et al., 2011). Like structural correspondences, statistical correspondences affect both low-level perceptual processing and higher-level decisions (Evans & Treisman, 2010; Gallace & Spence, 2006). The third account is through shared higher-level meanings, such as shared words in language or shared emotional affect. Using the terms "low" and "high" to describe both auditory pitch and spatial location would lead to a correspondence between the two (Martino & Marks, 1999), while for emotional valence, music in a major mode with a fast tempo was associated with bright yellows, the two being linked by their shared positive emotional affect (Palmer et al., 2013).
These higher-level links occur at a decisional rather than perceptual level, and so would not be predicted to affect behaviour at a pre-cognitive level, such as in speeded classification tasks. Contrary to these strict distinctions, however, some correspondences appear to have evidence for occurring at all of these levels. Correspondences between pitch and height occur in pre-linguistic infants under four months of age (Dolscheid, Hunnius, Casasanto & Majid, 2012; Walker et al., 2010). This suggests that minimal (if any) environmental experience is required, which is indicative of a structural explanation. However, Parise, Knorre and Ernst (2014) found that natural auditory scenes in both urban and rural areas featured more high-frequency content located higher in space, an effect that is further emphasised by the frequency-filtering properties of the human ear. Pitch-height correspondences are therefore further reinforced through our environment and anatomy, lending support to statistical explanations. These predispositions are likely to influence language, with shared terms representing each, resulting in further reinforcement through higher-level semantic correspondences (Martino & Marks, 1999). The potential for one correspondence to occur at multiple levels has many additional implications for how these levels affect one another. Firstly, it is unclear whether occurring at more than one level has an impact on the expression of a correspondence: for instance, is a structural correspondence stronger if it is also supported by statistical and higher-level influences? Conversely, what occurs if a structural and a statistical correspondence are in conflict: is there a hierarchy of influence on perceptual processing tasks? Is it possible to elicit one type of correspondence independently of another? While a variety of explanations can be given for a correspondence's origin, the relationship between correspondences has not been explored theoretically.