1.7 The theoretical incoherence of the multiple semantics hypothesis
1.7.4 The Modality-Specific Context Hypothesis
Another possible interpretation of the multiple semantics hypothesis is that information in the visual semantic subsystem consists of knowledge that has been acquired in a visual context, while information in the verbal semantic system consists of knowledge acquired in a linguistic context. Thus, for example, visual information such as the fact that footballs are round might well be stored in the verbal system if this knowledge has been acquired in a linguistic context. Caramazza et al. (1990) refer to this interpretation as the modality-specific context hypothesis. One problem with this version of the hypothesis is that it leaves entirely open the question of whether there would be extensive duplication of information across semantic subsystems. Caramazza et al. appear to take the view that the modality in which a particular piece of information was first acquired would be the relevant factor, but it would be equally possible to argue that information could be stored in both the visual and verbal systems if that information had been encountered in the context of both modalities. A second problem with this version of the hypothesis is that, as Caramazza et al. point out, it would, in practice, be impossible to determine what information a particular individual had acquired in a visual context and what information they had acquired in a verbal context. This version of the hypothesis does not, therefore, appear to be empirically testable.
Within each of these interpretations of the multiple semantics hypothesis the assumption remains that there are bi-directional links between the visual and verbal semantic subsystems. The last three versions also assume that stimuli presented in a particular modality always access information in the modality-congruent semantic subsystem first, and that access to the phonological output lexicon and the orthographic output lexicon is always mediated by the verbal semantic subsystem.
It is by no means clear which of these four interpretations is supported by advocates of the multiple semantics hypothesis. Different authors appear to support slightly different forms of the theory.
For example, Silveri and Gainotti (1988) attributed their patient's greater ability to name animals from descriptions stressing functional or metaphorical characteristics rather than visual characteristics to a greater preservation of verbal semantics than visual semantics. In other words, these authors take the view that the visual and verbal semantic systems contain visual and functional information respectively. Hart and Gordon (1992) also take the view that the visual and verbal semantic subsystems are distinguished from one another in terms of their content, stating, for example, that "knowledge of ... [functional] properties resides only in the language-based system" (p. 63).
By contrast, there is some evidence that McCarthy and Warrington favour the Modality-Specific Context hypothesis. These authors have suggested that processes involved in the acquisition of language play an important part in the development of visual and verbal semantic subsystems (McCarthy and Warrington 1988). Caramazza et al. (1990) have interpreted McCarthy and Warrington's account in this way, but it should be noted that Riddoch et al. (1988) have suggested that this model corresponds to the Input account.
As noted above, Shallice has explicitly rejected both the input account and the modality-specific format hypothesis as interpretations of the multiple semantics model (Shallice 1993). However, while some aspects of his characterisation of the model appear to correspond to the modality-specific content hypothesis, others appear to be more similar to the modality-specific context hypothesis. For example, he has stated that "If... a question such as "is it larger or smaller than a cat" together with, say, a picture of a wolf can be used to test for the visual knowledge of wolf (Warrington 1975) then this implies that at least the visual representation of the more familiar item cat can be accessed not only from visual input but also from auditory-verbal input via a verbal semantic system" (1988a, p. 137; my italics). This statement implies that for visual information to be accessed from verbal stimuli, the visual semantic subsystem must be accessed, and that this system must be accessed via the verbal semantic subsystem. This means that each subsystem contains different types of semantic information, as stated by the modality-specific content hypothesis. However, he also suggests that some (but not all) types of information may be stored in more than one system. He considers, as an example, where knowledge about the habitat of a hedgehog might be stored. He suggests that in the verbal system <hedgehog> may be associated with <England>, whereas in the visual system the representation of this animal may be associated with the type of scene in which it is typically observed. It is not clear from this example why some types of information may be stored in more than one system: whether this is because some information may have been acquired in more than one context, or because representations which differ slightly in content may be tapped to answer the same question.
More recently, Shallice (1988b; 1993) has suggested that the semantic system may be considered in terms of a large distributed net with regions that are specialized for particular processes. Such processes would include the representation of the sensory and functional properties of objects, relevant action routines and so on. This specialization would come about because of the different pattern of connections with systems outside the semantic store that would be required by each process. Most significant for the multiple semantics hypothesis is that pre-semantic recognition systems would have stronger and more direct links with some regions of the semantic net than others. Shallice (1993) suggests that the original distinction between visual and verbal semantic subsystems provides a rough approximation to this view if it is posited that there are two clusters of semantic processes that occur predominantly with visual and verbal input. However, this version of the hypothesis still fails to address the fundamental problem of what information "visual" and "verbal" semantic subsystems would contain.
In fact, Shallice (1993) has suggested that asking such questions as whether visual semantics is best described in terms of visual content or context of acquisition is "both indeterminate and irrelevant", on the grounds that it is extremely difficult to decide which of these characterisations would best fit any particular piece of semantic knowledge. However, while it is certainly the case that it is difficult to classify some pieces of knowledge in terms of these categories, it is no more difficult than attempting to decide whether there is empirical evidence to support the hypothesis that there are functionally separable semantic subsystems if the contents of these systems are left unspecified.
Some authors have failed to specify which of the basic assumptions of the hypothesis they wish to retain (for example, whether they wish to preserve the assumption that direct access to each modality-specific semantic subsystem is only possible from modality-congruent input). Because of this ambiguity, it is possible to account for different sets of data using slightly different versions of the hypothesis, whilst making it extremely difficult to identify any ensuing theoretical inconsistencies. Consequently, it is difficult to determine whether the multiple semantics hypothesis can, in fact, account for the full range of semantic deficits that have been observed.
Overall, therefore, it would appear that the charge that the multiple semantics hypothesis lacks theoretical coherence is, to some extent, justified. However, in order to test the hypothesis empirically, it is necessary to formulate a concrete characterization of the model. As stated above, Shallice has explicitly rejected the Input account and the Modality-Specific Format hypothesis as interpretations of the model. This leaves the Modality-Specific Content hypothesis and the Modality-Specific Context hypothesis.
Within Shallice's later model, the most accurate description of the information that is subsumed under the title of "visual semantics" would appear to be that information which is most frequently accessed from visual input. Various types of information would differ in terms of the frequency with which they would be accessed from visual and verbal input. However, it would seem reasonable to assume that there would be a rough correspondence between the modality of the stored information and the input modality from which it would be most commonly accessed. In other words, visual attributes would be more strongly associated with visual rather than verbal input, while the opposite would be the case for more abstract properties. This is not to say that the correspondence would be perfect. For example, visual properties that are used in common phrases, such as "As big as an elephant", might be represented in an area with strong links to verbal input. However, the general notion that the visual and verbal semantic subsystems contain, for the most part, modality-congruent information would provide a useful heuristic for determining the contents of these systems.
The Modality-Specific Content hypothesis will therefore be taken here to represent what is meant by the multiple semantics hypothesis. Although this means that little weight is being given to the fact that the multiple semantics hypothesis has sometimes been described in a way which incorporates characteristics of the Modality-Specific Context hypothesis, there are a number of reasons for adopting this approach. First, as argued above, it is not clear how the Modality-Specific Context hypothesis could be tested empirically. Second, there is a good deal of overlap between the Modality-Specific Content hypothesis and the Modality-Specific Context hypothesis in terms of the predictions that they make with regard to where different types of information will be stored. For example, information about the visual properties of objects would usually be acquired in a visual context, and hence both hypotheses would predict that such information would be stored in the visual semantic subsystem. Finally, the most important point about the Modality-Specific Content hypothesis is that it captures the notion that different input modalities have privileged access to different types of semantic information, which is one of the strongest predictions of the multiple semantics hypothesis.