Research project ideas and frameworks
7.7 A project investigating the relationship between gesture and speech processing using fMRI scanning techniques
Quote 7.8 From Straube et al. (2010) ‘Social cues, mentalizing and the neural processing of speech accompanied by gestures’
Body orientation and eye gaze influence how information is conveyed during face-to-face communication. However, the neural pathways underpinning the comprehension of social cues in everyday interaction are not known. In this study we investigated the influence of addressing vs. non-addressing body orientation on the neural processing of speech accompanied by gestures. . . . Our findings indicate that social cues influence the neural processing of speech–gesture utterances. Mentalizing (the process of inferring the mental state of another individual) could be responsible for these effects. In particular, socially relevant cues seem to activate regions of the anterior temporal lobes if abstract person-related content is communicated by speech and gesture. These new findings illustrate the complexity of interpersonal communication, as our data demonstrate that multisensory information pathways interact at both perceptual and semantic levels.
(Straube et al., 2010: 382)
7.7.1 Commentary and ideas for further work
This paper presents work on speech communication that is at the cutting edge at the time of writing and is included for this reason. The technique used, fMRI scanning, is described on page 38; it allows the researcher to see which parts of the brain are activated by different stimuli, in this case a combination of speech, stance and gesture. The study shows that listeners’ brains respond quite differently when someone speaks to them directly as opposed to standing as if speaking to someone else.
They also respond differently when a speaker is describing an object with an illustrative gesture (such as ‘The bowl in the kitchen is round’ spoken in combination with a circular motion of the hands), or describing a person and using a commonly understood (‘emblematic’) gesture (‘The actor did a good job in the play’ in combination with a thumbs-up sign). These variables – stance and gaze, person- versus object-related message and descriptive versus culturally known gesture – were set up as four conditions, each combined with the gestures: Person-related content + Frontal stance, Person-related content + Lateral stance, Object-related content + Frontal stance and Object-related content + Lateral stance. The conditions were recorded as short video clips, and 30 clips in each condition were shown to 18 subjects. The subjects were carefully chosen, as would be expected in the experimental paradigm being used: all were male, right-handed, native German speakers with no impairments to vision or hearing, aged between 20 and 30 with an average age of around 24. The flow of blood to various parts of the brain was analysed and conclusions reached as to the differing effects of speech, content, stance and gesture in the combinations outlined.
It is not assumed that the classroom practitioner will wish to undertake this kind of neuro-linguistic research directly or that they would have easy access to fMRI equipment if they wished to. However, it is included to suggest that our understanding of spoken communication may soon be very different and that this will have an impact on how the skill is regarded and taught. As new insights such as those reported here are gained about the complex interplay between different modes and signals – speech, gesture, social cues – that need to be taken into consideration together when understanding speaking, our understanding of spoken communication will begin to change quite radically over the next few years. In particular, the idea that speaking can be treated as a simple linear process that is similar to writing but carried to the world on breath rather than paper or a screen will become untenable. The authors of the current study conclude: ‘Our findings illustrate the complexity of natural communication, in which multiple channels of information interact at both the perceptual and semantic level’ (Straube et al., 2010: 393). In terms of applications in teaching, knowing that person-related information is processed quite differently from impersonal information, that believing a person is speaking to you affects how the brain is ‘primed’ to speak, and understanding the subtle effects of gesture in relation to speech generally will all have clear relevance to both face-to-face classroom teaching and, perhaps more importantly, the ability to move the teaching of speaking online.
7.7.2 Potential reader project: raising awareness of speaking as a multi-sensory skill
The findings reported in Straube et al. (2010) imply that many cues other than simply the stream of spoken sounds help us to communicate via speech. As noted elsewhere in this book, speaking is very often taught as if it is written language delivered through oral/aural channels. The cutting-edge work reported here suggests that our understanding of speaking may soon be very different. This project investigates the potential differences between listener comprehension of explanations with and without the benefit of visual cues.
Stage 1: Preliminary decisions
This project could be based on the data gathered for the study on speech rates in presentations described in section 7.3.2. The material could then be played to listeners via a sound recording only or via a video to include visual cues. A decision would be needed whether particular sections of the presentation would be the focus, for example, where a student is explaining a technical term, or whether the whole presentation would be used.
The benefit of using a particular functional category such as explanation or giving examples would be that some patterns of gaze, stance and gesture could potentially be linked to the function. Another approach would be to begin from sections of talk where the speaker uses gesture to enhance meaning and extract these for the viewer/listener. Evaluating listener comprehension is a particularly tricky process and thought would need to be given to the background knowledge of the listeners on a given topic and their current listening ability in the target language.
Stage 2: Experimental phase
Two groups of listeners, matched for age, language ability and educational background, would be played extracts from the presentations under two conditions: (a) via video showing speaker plus gaze, stance and gesture; (b) via a sound recording. The hypothesis would be that comprehension levels are higher under condition (a). Ideally, there would be enough extracts to give each listener a set of samples large enough for statistical analysis, and the same extracts would be played to different listeners to allow direct comparison by extract as well as general analysis. For each extract, the subjects would be required to indicate their level of comprehension. This could be a simple Likert scale (from 0 = could not comprehend to 5 = fully comprehend) or some other technique such as testing recall via a written reformulation or notes. The advantage of the former approach is that it does not depend on the subjects’ written language ability, which might otherwise interfere with their capacity to explain clearly what they have really understood.
Stage 3: Analysis of results
The hypothesis would be supported if there were higher levels of comprehension in listeners under condition (a). Further analysis of any trends might show correlations between language function, gesture and other visual stimuli and ease of comprehension.
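For readers who want to try a first comparison of the two conditions, the following is a minimal sketch in Python, not a prescribed method. It assumes the comprehension ratings have been collected on the 0 to 5 scale suggested in Stage 2 and grouped by extract and condition. The ratings, the extract labels and the function names are hypothetical and are there only for illustration. Because Likert ratings are ordinal, the sketch compares medians and computes the Mann-Whitney U statistic, a rank-based measure swapped in here as one reasonable choice rather than one taken from the study. A significance value would need a statistics package or published tables.

```python
from statistics import median


def mann_whitney_u(a, b):
    """U statistic for sample a: the number of (x, y) pairs with x > y,
    counting each tie as half a pair."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def summarise(video, audio):
    """Compare ratings under condition (a) video and condition (b) audio."""
    u = mann_whitney_u(video, audio)
    return {
        "median_video": median(video),
        "median_audio": median(audio),
        "u_video": u,
        # Share of video/audio listener pairs in which the video listener
        # gave the higher rating (0.5 would mean no difference).
        "prob_video_higher": u / (len(video) * len(audio)),
    }


# Hypothetical ratings (0 = could not comprehend, 5 = fully comprehend).
ratings = {
    "extract_1": {"video": [4, 5, 3, 4], "audio": [3, 2, 4, 3]},
    "extract_2": {"video": [5, 4, 4, 5], "audio": [4, 4, 3, 2]},
}

for name, groups in ratings.items():
    print(name, summarise(groups["video"], groups["audio"]))
```

A value of prob_video_higher well above 0.5 for most extracts would be in line with the hypothesis. With groups as small as those in this illustration, it should be read as a descriptive trend only.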