Naturalistic language comprehension : a fMRI study on semantics in a narrative context

(1)

Naturalistic language comprehension:

a fMRI study on semantics in a narrative context

Maria Karoliina Kurkinen

Master’s thesis in Neuroscience, Molecular and integrative biosciences, Faculty of Biological and Environmental Sciences, University of Helsinki Supervisors: Jussi Alho and Satu Saalasti

(2)

Tiedekunta – Fakultet – Faculty Bio- ja ympäristötieteellinen

Koulutusohjelma – Utbildingsprogram – Degree Programme Neurotieteiden maisteriohjelma

Tekijä – Författare – Author Karoliina Kurkinen

Työn nimi – Arbetets titel – Title

Naturalistic language comprehension: a fMRI study on semantics in a narrative context Oppiaine/Opintosuunta – Läroämne/Studieinriktning – Subject/Study track

Neurotiede

Työn laji – Arbetets art – Level

Maisterintutkielma Aika – Datum – Month and year 31.12.2019 Sivumäärä – Sidoantal – Number of pages 38

Tiivistelmä – Referat – Abstract

Semantiikka tutkii kieleen sisältyviä merkityksiä, joita tarvitaan kielen ymmärryksessä. Kuinka aivomme käsittelevät semantiikkaa ja kuinka ymmärrämme erityisesti luonnollisessa muodossa olevaa kieltä, on vielä aivotutkijoille epäselvää. Tässä tutkimuksessa kysyttiin, miten laajemmassa kontekstissa, narratiivissa, olevan kielen ymmärrys ja semanttinen prosessointi heijastuu aivojen aktiivisuuteen. Koehenkilöt kuulivat narratiivin toiminnallisen magneettiresonanssikuvantamisen (fMRI) aikana. Narratiivin semanttinen sisältö mallinnettiin laskennallisesti word2vec algoritmin avulla, ja tätä mallia verrattiin veren happitasosta riippuvaiseen (BOLD) aivosignaaliin ridge regression avulla vokseli kerrallaan. Lähestymistavalla saatiin eristettyä yksityiskohtaisempaa tietoa jatkuvan stimuluksen aivodatasta perustuen kielen semanttiseen sisältöön. Subjektien välinen BOLD-signaalin korrelaatio (ISC) itsessään paljasti molempien aivopuoliskojen osallistuvan kielen ymmärrykseen laajasti. Alueellista päällekkäisyyttä löytyi muiden aivoverkostojen kanssa, jotka vastaavat mm. mentalisaatiosta, muistista ja keskittymiskyvystä, mikä viittaa kielen ymmärryksen vaativan myös muiden kognition osien toimintaa. Ridge regression tulokset viittaavat bilateraalisten pikkuaivojen, superiorisen, keskimmäisen sekä mediaalisen etuaivokuoren poimujen, inferiorisen ja mediaalisen parietaalikuoren sekä visuaalikuoren, sekä oikean temporaalikuoren osallistuvan narratiivin semanttiseen prosessointiin aivoissa. Aiempi semantiikan tutkimus on tuottanut samankaltaisia tuloksia, joten word2vec vaikuttaisi tämän tutkimuksen perusteella mallintavan semantiikkaa riittävän hyvin aivotutkimuksen tarpeisiin. Tutkimuksen perusteella molemmat aivopuoliskot osallistuvat kielen laajemman kontekstin käsittelyyn, ja semantiikka nähdään aktivaationa eri puolilla aivokuorta. Nämä aktiivisuudet ovat mahdollisesti riippuvaisia kielen sisällöstä, mutta miten paljon kielen sisältö vaikuttaa eri aivoalueiden osallistumiseen kielen semanttisessa prosessoinnissa, on vielä avoin tutkimuskysymys.

Semantics is a study of meaning in language and basis for language comprehension. How these phenomena are processed in the brain is still unclear especially in naturalistic context. In this study, naturalistic language comprehension, and how semantic processing in a narrative context is reflected in brain activity were investigated. Subjects were measured with functional magnetic resonance imaging (fMRI) while listening to a narrative. The semantic content of the narrative was modelled computationally with word2vec and compared to voxel-wise blood-oxygen-level dependent (BOLD) brain signal time courses using ridge regression. This approach provides a novel way to extract more detailed information from the brain data based on semantic content of the stimulus. Inter-subject correlation (ISC) of voxel-wise BOLD signals alone showed both hemispheres taking part in language comprehension. Areas involved in this task overlapped with networks of mentalisation, memory and attention suggesting comprehension requiring other modalities of cognition for its function. Ridge regression suggested cerebellum, superior, middle and medial frontal, inferior and medial parietal and visual cortices bilaterally and temporal cortex on right hemisphere having a role in semantic processing of the narrative. As similar results have been found in previous research on semantics, word2vec appears to model semantics sufficiently and is an applicable tool in brain research. This study suggests contextual language recruiting brain areas in both hemispheres and semantic processing showing as distributed activity on the cortex. This activity is likely dependent on the content of language, but further studies are required to distinguish how strongly brain activity is affected by different semantic contents.

Avainsanat – Nyckelord – Keywords

luonnollinen kieli, kielen ymmärrys, semanttinen prosessointi, narratiivi, sanasto, konsepti, kognitio, konteksti, laskennallinen lingvistiikka, word2vec, toiminnallinen magneettiresonanssikuvantaminen, subjektien välinen korrelaatio (ISC), ridge regressio, aivokuori

Ohjaaja tai ohjaajat – Handledare – Supervisor or supervisors Satu Saalasti ja Jussi Alho

Säilytyspaikka – Förvaringställe – Where deposited Helsingin yliopiston digitaalinen arkisto HELDA Muita tietoja – Övriga uppgifter – Additional information

(3)

Naturalistic language comprehension:

a fMRI study on semantics in a narrative context

Kurkinen, K.1,2

1 Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University School of Science 2 Molecular and Integrative Biosciences, Faculty of Biological and Environmental sciences, University of Helsinki

Abstract: Semantics is a study of meaning in language and basis for language comprehension. How these phenomena are processed in the brain is still unclear especially in naturalistic context. In this study, naturalistic language comprehension, and how semantic processing in a narrative context is reflected in brain activity were investigated. Subjects were measured with functional magnetic resonance imaging (fMRI) while listening to a narrative. The semantic content of the narrative was modelled computationally with word2vec and compared to voxel-wise blood-oxygen-level dependent (BOLD) brain signal time courses using ridge regression. This approach provides a novel way to extract more detailed information from the brain data based on semantic content of the stimulus. Inter-subject correlation (ISC) of voxel-wise BOLD signals alone showed both hemispheres taking part in language comprehension. Areas involved in this task overlapped with networks of mentalisation, memory and attention suggesting comprehension requiring other modalities of cognition for its function. Ridge regression suggested cerebellum, superior, middle and medial frontal, inferior and medial parietal and visual cortices bilaterally and temporal cortex on right hemisphere having a role in semantic processing of the narrative. As similar results have been found in previous research on semantics, word2vec appears to model semantics sufficiently and is an applicable tool in brain research. This study suggests contextual language recruiting brain areas in both hemispheres and semantic processing showing as distributed activity on the cortex. This activity is likely dependent on the content of language, but further studies are required to distinguish how strongly brain activity is affected by different semantic contents.

Keywords:

naturalistic language, comprehension, semantic processing, narrative, lexicon, concept, cognition, context, computational linguistics, word2vec, functional magnetic resonance imaging, inter-subject correlation, ridge regression, cortex

(5)

Introduction

As language is the basis of how we humans interact but also partly what we construct our inner worlds with, there has been many attempts to explain the mechanisms behind it. Some theories in philosophy have tried but could not explain our ability to creatively combine information and to talk about entirely abstract ideas that do not physically exist in our world (Lakoff, 1988). One example of such a theory was Hilary Putnam’s theory of semantic externalism (Putnam, 1973, 2013), also known as objectivistic semantics – meaning lies in the objects around the subject. Putnam’s and other philosophers such as Noam Chomsky’s work at the time was followed by a theory called experientialist cognition, supported by philosophers in the late 1980. As opposed to semantic externalism, experientialist cognition focused more on internal events when talking about semantics, suggesting meaning of language being more psychological in nature (Lakoff, 1988). This view has also been called constructivist semantics - the meaning is constructed in the subject’s mind.

Moreover, some have suggested a need for entirely new philosophical view instead of objectivist or constructivist approaches (Jonassen, 1991). One could argue, that meaning of language must lie somewhere in between the external world and consciousness of an individual, shaped by social interaction between people. Later on, with the advances in brain imaging many of the linguistic phenomena discussed in philosophy have been studied from an experimental point of view (Mashal, Faust and Hendler, 2005; Ahrens et al., 2007; Rapp et al., 2011; Saban-Bezalel et al., 2017). For this field of research, the use of naturalistic stimuli in functional brain imaging studies might provide new insights on how human brain processes meaningful information during more natural phenomena and not only during simple stimuli and tasks.

Semantic processing of language

Semantic processing refers to understanding the meaning of words and sentences, and larger texts these comprise (Cruse, 2011). Therefore, within this framework, semantics is thought to be part of the process of comprehending language. The meaning in words is based on the knowledge we acquire through experience (Bréal, 1897). Since there is meaning encoded into words, we humans are able to meaningfully communicate with each other via language.

The importance of semantic processing in language comprehension is reflected in acquired disorders, such as semantic variant of primary progressive aphasia (svPPA), and developmental disorders, such as semantic-pragmatic language disorder (SPD). SvPPA, also known as semantic dementia (SD), is a degenerative nervous system disorder that affects semantic memory of the patient and causes anomia, difficulty in

retrieving words from memory (Mesulam et al., 2003). Semantic-pragmatic language

disorder (SPD) is a language impairment that has an effect on semantic and pragmatic, i.e. context related, processes of an individual. Its core symptom is delayed language development. Research on semantics could potentially benefit patients with deficiencies in language comprehension.

(6)

Research on genetics has shown the importance of conservation of and missense mutations on FOXP2 gene in the development of human language to its current form (Enard et al., 2002; Atkinson et al., 2018). It has been proposed that this gene has also a role in our symbolic thought and abstraction (Atkinson et al., 2018). FOXP2 encodes a transcription factor forkhead box protein P2 that has a role in the development and plasticity of the brain among its other effects (Fisher and Scharff, 2009; Atkinson et al., 2018). Mice that were transfected with human version of FOXP2 showed altered dopaminergic concentrations. Dopaminergic cells target striatal D1 receptor expressing medium spiny neurons that express also FOXP2 (Wijchers et al., 2006; Enard et al., 2009). Most of the dopaminergic cells react to stimuli with a prediction error response by bursting with an unexpected reward and silencing their firing with an expected but a failed one (Bayer and Glimcher, 2005; Schultz, 2016). Some of these dopaminergic cell bodies in substantia nigra and ventral tegmental area project to dorsal striatum. Contextual information from cortex is suggested to merge with the information from reward system in striatum (Enard et al., 2009; Lieberman, 2009). Mutations on FOXP2 gene correlate with deficits in linguistic skills (Fisher and Scharff, 2009) but some argue these to be related to motoric language production, such as in developmental verbal dyspraxia (Ocklenburg et al., 2013). Other genetic influences on language functions have been suggested. For example, missense mutation on progranulin (GRN) coding gene has been found in a patient with svPPA (Cerami et al., 2013). Then again, other sources suspect svPPA to be sporadic in its pathology (Landin-Romero et al., 2016). Genetic screening of patients with language disorders combined with behavioral and genetic studies on animals might provide an additional layer in the field of language research from the evolutionary perspective.

Communication context evolutionally important

Communication has been the core drive for complex language development (Arbib, 2005), from us being able to refer to concrete items in the surroundings all the way to the ability to make abstractions, name them and share those with others. Evolution of language has evidently had an effect on the semantic capabilities of humans, by expanding these over the course of time. Semantic abilities of other species such as primates have also been studied. Primate calls contain different combinations of sounds that specify the nature of danger, thus having semantic variability in them (Arnold and Zuberbühler, 2006). Studies on animal behavior and genetics could further lead to understanding at least the basis of human language, although the ability to think abstractly is thought to be unique for humans.

Context is important in language comprehension and is the most important cue for predicting what is about to come next in language. Meaning of a single word can vary even drastically depending on the referent context (Pulvermüller, 2019). A sentence taken out of its context can be understood very differently from what was meant by it in the first place. This path is dual: the brain makes attempts to bring together the context from the language already used, and from this context it makes predictions for

(7)

the future. In the field of semantics context of language is especially important and research using linguistic stimuli without wider context might not represent semantics fully.

During recent years, language research has shifted from using parts of language taken out of their context towards more naturalistic settings (Verga and Kotz, 2019). While block or event-related study designs with simple stimuli have been traditionally used in BOLD fMRI studies (Pan et al., 2011), naturalistic stimuli are continuous and offer an alternative approach to study language with fMRI. As communication is the most natural form of language, it should be taken into account as the context when studying meaning. Use of communicationally informative stimuli should thus be considered in brain research – studying brain with narratives or conversations might reveal more about semantics than single words.

Language as a part of cognition

Understanding naturalistic language is a complex cognitive event often recruiting other modalities of cognition for its function (Zwaan et al., 2004; Jung-beeman, 2005; Bastiaansen and Hagoort, 2006). Networks overlapping with language processing are for example networks of attention, working memory, mentalization and consciousness (Chafe, 1974; Baddeley, 1992; Garagnani, Wennekers and Pulvermüller, 2008; Vanlangendonck, Willems and Hagoort, 2018). In addition, emotional aspects of our mind are playing a role in understanding language fully (Ferstl, Rinck and Cramon, 2005). Complete separation of these networks might not always be the most functional approach for studying language and cognition, since they seem to be interconnected to each other (Mill, Ito and Cole, 2018). In addition, brain areas have multiple functions (Kanwisher, 2010), thus having a role in multiple networks.

Attention in language comprehension

Literature suggests varying theories over how these networks are related to language processing. For example, the more attentive the individual is the better comprehension of language the person has (Kristensen et al., 2013). Attentive processes direct our focus on incoming sensory information or alternatively to internal processes of the mind (Lepsien and Nobre, 2006). This might not be directly part of the linguistic network in its core but have a role in what information in language we are putting our focus on. Some attention deficits might also disturb language comprehension even if there are not direct problems in the traditional linguistic areas or their connectivity (Mcinnes et al., 2003).

Attention network has been thought to divide into ventral and dorsal streams in the fronto-parietal networks (Scolari, Seidl-Rathkopf and Kastner, 2015). Areas such as ventral precuneus (vPCUN) (Zhang and Li, 2012), middle frontal gyrus (MFG) and temporal-parietal junction (TPJ) are associated with attentive processes (Ptak and Schnider, 2011). Fronto-parietal network functions have also been found to contribute to enhanced speech comprehension in narrative context (Smirnov et al., 2014). These

(8)

brain areas are involved both in spatial attention tasks and sentence comprehension, suggesting attention having a role in language comprehension. Clear language pitch used in the stimuli recruit attention areas more robustly than non-articulated sentences suggesting direction of attention favoring sounds typical to spoken language (Kristensen et al., 2013).

Language comprehension recruit memory

Networks of memory are closely related to language (MacKay, Stewart and Burke, 1998; Bastiaansen and Hagoort, 2006). Both short- and long-term memory are important for language comprehension (Hagoort, 2005). Enough working memory capacity is required to maintain the context in mind (Baddeley, 1992; Cain, Oakhill and Lemmon, 2004). Explicit (declarative) memory is required for the storage, and later retrieval of words and their meanings from this long-term memory. Memories are thought to be stored as engram cells that are controlled over hippocampal activity (Semon, 1921; Blank et al., 2016). Some research puts more emphasis on the connectivity of the brain, and the strength of these connections between nodes in the networks (MacKay, Stewart and Burke, 1998). In spite of the framework or theory discussed, memory functions are important in language processing.

Maintaining context in mind is needed for comprehending language, and working memory capacity has been suggested to have an effect on word comprehension (Daneman and Merikle, 1996). Working memory has been suggested to take place in dorsolateral prefrontal cortex (DLPFC; Barbey, Koenigs and Grafman, 2013). Activity in inferior frontal gyri (IFG) and some inferior parietal lobular (IPL) areas have been found to correlate with working memory tasks (Newman, Just and Carpenter, 2002; Barbey, Koenigs and Grafman, 2013). DLPFC has been found to be involved in sentence comprehension (Klaus and Schutter, 2018). IFG and IPL contribution in language comprehension have been found in studies with words and naturalistic stimuli (Binder et al., 2009; AbdulSabur et al., 2014). Overlapping in these areas suggest that working memory is recruited in language processing and its capacity has an effect on language comprehension.

Semantic memory – storage for meaning

Semantic memory is the part of declarative memory that consists the linguistic knowledge of a person (Tulving, 1972). This collection of knowledge in words, that changes over time, is referred as lexicon (Bréal, 1897). Semantic memory is recruited in language comprehension (Kutas and Federmeier, 2000). Even though it does not represent our memory as a whole, it contains associations we make in this world, such as names for people and objects, but also for more abstract phenomena and concepts (Binder and Desai, 2011). Some have argued that as well as lexicon is related to semantic memory, it associates to episodic memories especially in children whose semantic memory is not as structured as that of adults (Petrey, 1977). Important is that we are able to make associations between words and experiences, and eventually retrieve these associations again in future situations.

(9)

Literature suggests that knowledge within the lexicon contributes more to meaning than for example syntax, the structure of language (Fedorenko, Nieto-castañon and Kanwisher, 2012). This statement contains a presumption that language structure and semantic meaning are separate phenomena. Other sources claim that lexical knowledge contains both semantic and syntactic properties of language (Bastiaansen and Hagoort, 2006; Müller and Hagoort, 2006). The distinction between these two is not clear and they are often intertwined in neuroscientific research (Fedorenko, Nieto-castañon and Kanwisher, 2012). Another debate is whether lexical semantic processing differs from semantic processing of non-linguistic visual cues (Bright, Moss and Tyler, 2004).

Conceptual categories, such as tools and animals, are our way to generalize, store and organize information (Pustejovsky, 1991; Pulvermüller, 2019). As we use categorization to simplify our world, our brain uses similar approach to reduce its computational load. Conceptual knowledge divided into these categories can be attained with visual or verbal cues (Bright, Moss and Tyler, 2004). Thus, we can think lexicon to be divided into semantic conceptual categories (Müller and Hagoort, 2006), and each word in the lexicon seems to fall into these categories based on the features they have. The core semantic network is activated in a similar manner when processing conceptual similarity between items such as an apple and a pear and the associative links between dissimilar items such as apple and a tree, that often occur in a similar context but are not under the same semantic category (Jackson et al., 2015). Feature semantics refer to our ability to associate certain features to words and concepts. Such features can for example be perceptual properties related to our sensory systems, such as visual shapes, somatosensory cues, sounds and smells (Jackson et al., 2015; Pulvermüller, 2019). Some literature in cognitive science refers the group of features related to concepts as conceptual spaces. Each space is formed by quality dimensions, that are mathematically measurable. For example, color of an object can have dimensions of hue, saturation and brightness (Gärdenfors, 1996). We can also associate emotions (Binder and Desai, 2011), other objects and environments (Jackson et al., 2015), movement (Hauk, Johnsrude and Pulvermüller, 2004) and episodic memories (Takashima et al., 2014) to a concept or an item. These all might not be considered as features but can be thought as associative relationships between different words, items and phenomena (Jackson et al., 2015). Various associations are thought to be represented in the brain as connections (Pulvermüller, 2019).

One example of a theory on memory and language is called MUC (Memory, Unification, Control) that discusses the storage and retrieval of memories, unification processes on syntactic, semantic and phonological levels and top-down prefrontal control of linguistic processes (Hagoort, 2005). Phonology is related to sounds in language and is thought to be lower level language process when semantics occur later in time in higher brain areas (Vigneau et al., 2006). Some research discuss the dynamic nature of cognitive functions in general (Mesulam, 1990). As natural language is often complicated, progresses in time and is context dependent, brain recruits different

(10)

modalities of cognition dynamically for proper comprehension (Medaglia, Lynall and Bassett, 2015).

In computational linguistics, the research in cognitive sciences and linguistics are used in the development of computational models for language processing, analysis and research (Pustejovsky, 1991). The research on memory and language has inspired computational models such as Dynamic Memory Networks (DMN) for analysis and processing of natural language (Kumar et al., 2016). Computational cognitive science has generated models such as word2vec and latent semantic analysis (LSA) that are used to model semantic content of language in numeric form (Landauer, 1997; Mikolov et al., 2013). These tools base their computation on linguistic content using for example parts of internet as a corpus to analyze word context, i.e. their relation to each other. Some of these tools are being adapted in experimental studies in cognitive brain research.

Language perception affected by previous experiences

We do not necessarily take incoming information as it is, but our perception is subjectively affected by previous experiences that again reshape the associations we make. If we create a prior of the concept, and predict the world based on this information, our way of understanding the world is already biased towards our previous knowledge (Hari, 2018). This knowledge is, to our fortune, prone to change through error. As prediction error in the dopaminergic reward system directs behavior of mice and monkeys (Bayer and Glimcher, 2005; Schultz, 2013), predictive coding is thought to apply to sensory systems in a hierarchical generative manner (Iglesias et al., 2013), and even further, to language perception (Hickok, 2013; Pickering and Garrod, 2013; Lupyan and Clark, 2015).

Subjective perception of an object, a word or a concept is the combination of action potentials within the inter-connected cells (Pulvermüller, 2019), and the formation of these connections is following the Hebbian rule – cells that fire together, wire together (Hebb, 1949). Thus, semantic meaning lies in the patterns of activation over these cells, in the neural networks they comprise. Since these connections keep being molded by each experience, reshaping the associations we make, also the way we understand and make meanings in this world is being shaped throughout a lifetime.

Networks of consciousness might define what sort of sensory information, or more specifically, what sort of information in language we are actually conscious of (Chafe, 1974; Bimmel, van den Bergh and Oostdam, 2001). We are naturally biased due to our previous experiences (Gilovich, Griffin and Kahneman, 2002) and these biases may be largely unconscious (Perry, Murphy and Dovidio, 2015). This can mold our language comprehension to be very different from the way another person with different experiences and biases understands language.

Depending on the content of the language, different networks might be recruited in its processing in a dynamic manner (Sporns, 2014). With social content mentalization

(11)

networks are more involved than with non-social content (Vanlangendonck, Willems and Hagoort, 2018). With contexts and tasks that might not appear so interesting to us, larger contribution of attention is required (Langer and Eickhoff, 2013). Also, the level of how complex the used language is, has an effect on how heavily memory functions are recruited in the task (Bastiaansen and Hagoort, 2006).

Brain research of semantics

Brain uses multiple areas to process language. Traditionally Broca’s and Wernicke’s areas are addressed. Wernicke’s area has been thought to play a key role in processing semantic content of the language. Anyhow, much wider areas are recruited in this task than previously has been thought (Ardila, Bernal and Rosselli, 2016). Focusing on merely anatomical areas when considering complex cognitive functions, such as language production or perception, is debatable. Opposing ideas suggest that more distributed neural processing is taking place in cognition, and the focus should be more on the functional specificity rather than regional emphasizing the engagement of a certain region into certain task (Kanwisher, 2010).

As semantic literature suggests wide brain areas to be involved, some have introduced hub-based models, where hubs are the important crossroads for connectivity of the brain. It has been suggested that words related to different functions, such as actions, are connected to certain areas, such as pre-central gyrus, and these different semantic centers or hubs are activated based on the semantic categories within the language (Garagnani and Pulvermüller, 2016). A theory called hub-and-spoke considers the concept formation being based on verbal and non-verbal input, and that these concepts are represented by the engagement and activation of modality-specific hubs distributed across the cortex (Mesulam, 1990; Ralph et al., 2017). This theory pinpoints bilateral contribution of anterior temporal lobe (ATL) in this process, and suggest ATL to be a mediator of the semantic network (Ralph et al., 2017). It does seem that ATL has a role in semantic functions, as for example neurodegeneration of these areas disturb both verbal and non-verbal semantic comprehension (Gorno-Tempini et al., 2011), but the nature of this role remains unclear. Hub based models have an emphasis on brain connectivity which is in line with theories on lexical categories and features.

Literature suggests language as a distinct network from these other modalities of cognition. Research with words and sentences implies that most of the linguistic areas in inferior frontal and posterior temporal gyri are shared between lexico-semantic and syntactic processes. The more meaningful content there is in the stimulus, the wider brain areas seem to take part in semantic processing (Fedorenko, Nieto-castañon and Kanwisher, 2012). Once brain makes more associations with comprehensible content in contrast with incomprehensible, wider areas are recruited (Fedorenko et al., 2010; Saalasti et al., 2017). Further on, research has found evidence for more detailed recruitment of brain areas: in a meta study on language processing, areas anteriorly from precentral gyrus along IFG and areas in posterior temporal lobe were found to correlate with phonological processing. Syntactic processes recruited similar areas

(12)

extending posteriorly to parietal areas including angular gyrus (AG) and to areas anteriorly further in IFG (Vigneau et al., 2006).

In the same meta-analysis, semantic studies were investigated. Semantics recruited similar but wider areas in temporal lobe extending to anterior parts and posteriorly in parietal AG when compared to phonological and syntactic areas. Also, frontal areas were recruited similarly but more widely, especially in IFG, but less towards precentral gyrus that is responsible of motor functions (Vigneau et al., 2006). These results suggest semantic processing recruiting brain widely. In addition to frontal areas, parietal activity is often found in brain research of semantics. In a study on semantic aphasia, disturbed connectivity or lesions in TPJ were found in patients (Dragoy, Akinina and Dronkers, 2017) suggesting it to have a role in semantics. In EEG studies, negative N400 response in centro-parietal electrodes and has been liked to language comprehension tasks (Bambini et al., 2016). Also, positive parietal P600 responses have been found to correlate with language comprehension, even though it has been previously associated with syntactic tasks.

Study on svPPA patients found atrophies in temporal pole and orbitofrontal areas bilaterally, left ventral temporal areas, fusiform gyrus and amygdala (Mummery et al., 2000). More specifically, left posterior middle and anterior superior temporal gyri (MTG, STG), superior temporal sulcus (STS) posteriorly, AG, MFG and parts of IFG were found to relate language comprehension in other lesion studies (Dronkers et al., 2004). In addition, connectivity between these areas has been studied. They are interconnected by multiple white matter tracts: fascicles of inferior occipito-frontal, arcuate and middle and inferior longitudinal play a role in this connectiveness. Disturbances also in these tracts might result in impaired comprehension (Turken and Dronkers, 2011).

While structural processing of language takes generally place on the left hemisphere, it seems that more complex semantic processing of language takes also place widely on the right hemisphere (Jung-beeman, 2005). Right lateralization of semantic processes could be the case at least in parts where other language related functions are taking place on the left hemisphere. Some functional brain imaging studies using metaphorical linguistic stimuli have found tendency towards right hemispheric activation with novel combinations of figurative language (Mashal, Faust and Hendler, 2005; Ahrens et al., 2007).

Research suggests that wide areas in temporal and frontal cortex, as well as parietal AG and supramarginal gyrus (SMG; Binder and Desai, 2011) are playing a role in word and sentence comprehension. These areas might be related to various aspects of language processing and might not be linked merely to semantics but contribute to the overall task of comprehension. In addition to cerebrum, cerebellar areas are suggested to play a role in cognition in general, but also in language processing (Leiner, Leiner and Dow, 1993; Schmahmann, 2004). The role of cerebellum has only recently been acknowledged in linguistic studies, even though lesions in cerebellar areas crus I and crus II has been associated with language disturbances (Richter et al., 2007; Stoodley et

(13)

al., 2016). Cerebellum is highly connected to cerebrum. Some research suggests lobule IV and vermis to be taking part in this connectivity (Stoodley and Schmahmann, 2011). Closed-loop connectivity between frontal cortical areas and cerebellum have been suggested (Watson et al., 2014), and this could play a role also in semantic processing. Combined PET and fMRI study with auditory stories and nursery rhymes found involvement of PCUN, inferior parietal and dorsomedial prefrontal cortices and premotor areas in language comprehension. They also found that language production in a narrative form showed mostly left hemisphere correlation when

comprehension tasks recruited also right hemisphere (AbdulSabur et al., 2014). In a

fMRI study on narrative comprehension in a developing brain, bilateral activation of superior temporal areas was found to correlate positively with age (Szaflarski et al., 2012) suggesting its role in semantic memory. A fMRI study with over two hours long narrative stimulus showed similar results: relatively symmetrical bilateral correlation was observed in lateral and ventral temporal cortices (LTC, VTC), lateral and medial parietal cortices (LPC, MPC) and medial, superior and inferior PFC. Patterns of how these areas were recruited were category specific – social content elicited different patterns of brain activity than for example numerical or locational contents (Huth et al., 2016).

Furthermore, semantic processing of words and pictures differ only partly in PET studies (positron emission tomography; Vandenberghe et al., 1996; Bright, Moss and Tyler, 2004). Most of the semantic system seems to be shared between these modalities of input, and both seeing a picture of an apple and reading the word apple elicit somewhat similar responses in higher brain areas. Differences seem to be in the areas processing sensory and structural linguistic information, but the so called semantic network is hypothesized to be activated similarly in both cases (Vandenberghe et al., 1996). Supporting results have been found in fMRI studies comparing words to objects (Devereux et al., 2013). This distributed activation of the cortex is thought to vary depending on the properties and functionality of the word or the object: non-living tools are categorized differently than living animals, and this is also reflected in the patterns of brain activity (Mahon et al., 2007; Devereux et al., 2013).

As discussed in the beginning, semantics most often refer to the meaning of language. Studies on natural language processing across subjects might reveal more about the underlying brain processes of language comprehension. On the other hand, results from research using naturalistic stimulus are more difficult to interpret than simpler stimuli such as words or sentences. As stimulus gets more complicated, there are also more variables to be taken into account during the analysis. Therefore, it is important to model the used stimulus as accurately as possible. In the current study, semantic comprehension of language was studied by measuring fMRI-BOLD responses while participants listened to an auditory narrative. Semantic content of the narrative was modelled computationally with word2vec and compared to voxel-wise BOLD time courses using ridge regression. The aim was to identify brain areas underlying narrative comprehension and further discover more specifically how semantic processing during naturalistic narrative is represented in the brain.

(14)

Methods

Subjects and MRI acquisition

31 healthy right-handed Finnish-speaking females volunteered as a subject. None of the subjects had psychiatric or neurological disorders, and all reported normal hearing. Two of these subjects were excluded from the study due to artefacts in the data and lack of attention, resulting in total 29 subjects.

Subjects were scanned with a 3T MRI scanner (Magnetom Skyra, Siemens Healthcare, Erlangen, Germany) with a 20-channel coil at Advanced Magnetic Imaging (AMI) Centre at Aalto University School of Science. Anatomical T1-weighted images were obtained using MPRAGE pulse sequence (repetition time, TR=2,530 ms; echo time, TE=3.3 ms; flip angle 7°, 256 x 256 matrix, 1x1x1 mm3_{resolution, 176 sagittal slices).}

Echo planar imaging (EPI) pulse sequence was used to attain functional T2-weighed images (TR=1700 ms, TE=24 ms, flip angle 70°, 202 x 202 matrix, in plane resolution 3x3 mm2_{, each volume comprising 33 slices of 4 mm thickness, 295 volumes).}

Stimuli and experimental design

An auditory narrative was presented to the subjects during fMRI with MRI-compatible headphones (Sensimetrics S14 insert earphones) on top of which earmuffs were placed for safety and canceling the noise in the MRI scanner. The narrative was told from a female first-person perspective and it described everyday life situations such as social interaction at home and work. Some parts focused more on the narrator’s mental processes, while others were more descriptive in nature. The auditory stimulus was a 7 minutes and 54 seconds long. Following is a short sample of the narrative translated from Finnish to English (Saalasti et al., 2017).

“-- On ridges grew pines and in valleys dense spruce. In other places the road crossed over small rapids. Nature already started to turn green, much to the influence of the spring sun. I reflected on the behavior of Jarkko this morning: his sudden disconnection of the phone call and blushing as if guilty. I wonder whether Jarkko had something inappropriate going on with someone--.”

Preprocessing of the data

Data were preprocessed with an in-house MATLAB toolbox (BraMiLa; available at https://version.aalto.fi/gilab/BML/bramila). Preprocessing included slice-timing and movement correction, and co-registration of functional and structural images. Also, blood-oxygen-level dependent (BOLD) signal time series detrending, nuisance signals and noise regression, temporal high-pass filtering (cut-off at 0.01 Hz) and spatial smoothing with a Gaussian kernel (8 mm) were applied.

(15)

Inter-subject correlation (ISC)

Similarities of BOLD signal time courses between the subjects were calculated using inter-subject correlation (ISC; Hasson et al., 2004) implemented in ISCtoolbox (Kauppi et al., 2010). ISC analysis compares the time courses of each voxel between the subjects by predicting the activations in the following subject based on the previous without having to specify certain regions of interests in advance (Hasson et al., 2004). In the analysis, 406 pairwise Pearson’s correlation coefficients were acquired with ISC for each voxel’s time course.

Computational linguistic model

Semantic content of the narrative was modelled with a computational linguistic model Word2vec (Mikolov et al., 2013). Word2vec creates a vector space based on co-occurrences of words in a large corpus (here the Finnish language internet). Following literature, we created a 300-dimensional vector space (dos Santos and Gatti, 2014; Pereira et al., 2018; Van Uden et al., 2018; Kivisaari et al., 2019). The less dimensionality in the model, the easier it is to analyze and interpret as it also reduces the computational demands (Ordentlich et al., 2016). Although, reducing dimensionality comes with the expense of how descriptive the model is (Mikolov et al., 2013).

These vectors are used to describe semantics of language numerically by placing the words that share similar context near to each other. For example, in Figure 1, word pine is spatially closer to spruce than to phone because they share their context in the corpus. For clarification, Figure 1 is merely an illustration of the principle idea of the word2vec model. In the actual model, a 300-dimensional numerical vector describing a single word is more similar between words sharing the same context than between words that rarely occur in similar contexts. Thus, in the word2vec model, semantics of the language is encoded in the similarity of these 300-dimensional vectors.

In practice, the language model was a 282x300 matrix of numerical values. The 282 rows equated the repetition times in fMRI, thus representing the functional images collected every 1.7 second. Each of these rows were the 300-dimensional vectors describing the content of the narrative during each repetition time. The vectors for each lemmatized word were acquired with word2vec (Mikolov et al., 2013). The content of the narrative was divided to 1.7 second items from the audio file using Transcribe! software (1998-2019 Seventh String Software), and the vectors of words occurring in each repetition time were summed accordingly. As follows, the columns of the matrix were the 300 numerical values describing the sum of the semantic content of the narrative during each 1.7-second-long timeslots.

(16)

Figure 1, Illustration of the principal idea of word2vec, where contextually similar words, such as pine and spruce, are located near to each other while words that occur in a different context within the used corpus, such as road or phone, are located further away. The relation of these words is represented with 300-dimensional vectors – each word is described with one, and the more similar these vectors are the nearer these words are to each other’s contextually.

Canonical correlation

Regularized kernel canonical correlation analysis (rKCCA) was used to compare the semantic content of the stimulus with the voxel-wise blood-oxygen-level dependent (BOLD) brain signal time course. rKCCA was chosen as an analysis method for its ability to estimate correlation between different-sized multidimensional matrices and because it allows data regularization (Bilenko and Gallant, 2016). L2 regularization was used to minimize the size of the fitted parameters (Ng, 2004). L2 is a derivative of Tikhonov regularization with the addition of using identity matrices (Wager, Wang and Liang, 2013).

rKCCA creates a feature space where the maximum correlation, !", between word2vec model and fMRI data was evaluated across subjects. The datasets X and Y were combined linearly with canonical weight vectors (support vectors) aj and bj. Dot

(17)

product of the dataset and these coefficients resulted as canonical components uj and vj in the feature space, from which the canonical correlation was calculated as in eq.1.

$" = < (", ) > ,

+" = < ,", - > !" = max

< $", +" > ‖$"‖‖+"‖

(1)

Equation 1, maximum correlation !" between canonical components uj and vj, that are inner products of the weights aj and bj, and datasets X and Y, in this study word2vec model and BOLD time course.

Regularization parameter was validated with Monte Carlo cross-validation. Because validating the parameter was computationally expensive, linear kernelization was used to reduce the dimensionality of the data. This simplified the computing process by making an inner product of the pairs of data in feature space (Akaho, 2006). In the analysis, 70 % of the data was used in training and 30 % for testing. One canonical component was used. An open-source toolbox for Python was applied in the analysis (Bilenko and Gallant, 2016).

Ridge regression

We then analyzed the correlation between word2vec model and brain data with another commonly used multivariate analysis method, ridge regression. The used ridge parameter k was 106_.

Coefficients ( 34 )of a ridge regression between the word2vec model and BOLD time

course were calculated as in eq.2. (Hoerl and Kennard, 1970). This was done with ridge regression, ridge(y,W,k), in which the definable arguments were observed response y, predictor data X and ridge parameter k. In this case, y was the voxel time course from

fMRI, W the word2vec model and k was predefined based on previous knowledge.

Usually, k is cross-validated from the data used, but in this study, we did not have enough data for cross-validation as observed also in the canonical correlation analysis. Voxel wise time course was obtained with taking a z-score as in eq.3. from each voxel of the fMRI data from which every dimension with length 1 were removed.

34 = ()5_{) + 78)}:;₎5_< ₍₂₎

Equation 2, Ridge regression coefficients, 3,= are products between the observed response y, transpose of the design matrix X, and ()5) + 78) to the power of -1. In the latter, product of design matrix X and XT_{is summed with the product of ridge parameter}_k_{and identity matrix}

I. In ridge regression, the data matrix will be transformed to invertible with coefficients for it to be computable.

(18)

> = (? − A)

B (3)

Equation 3, z-score z, in which mean of the data µ is subtracted from the data point x, in this case voxel, which is then divided with the standard deviation s of the dataset, resulting as the voxel time course S.

A dot product between the coefficients ( 34 ) and the word2vec model was computed

to obtain voxel-wise word2vec model time course as in eq.4. Time courses of BOLD

signal and word2vec model were finally correlated with each other to obtain a brain map of Pearson coefficients as in eq.5. Custom MATLAB script was implemented for the analysis.

C = < 34, D > (4)

Equation 4, word2vec model time course T, in which dot product of Ridge regression coefficients and word2vec model was computed.

EℎG((, ,) = ∑ (CI,J− CI)(KL,J− K̅L) N JO; P∑N (C_I,J − CQ_I)R JO; ∑NSO;(KL,S− K̅L)RT ;/R (5)

Equation 5, where linear correlation between word2vec model time course T and voxel time course S from fMRI, was calculated with Pearson’s Correlation Coefficient rho(a,b), in which a and b are columns in related matrices. (C_I,J − C_I)(K_L,J− K̅_L) is summed from time point 1 to n, in which n is the number of timepoints in this experiment, 281, Ta is a column in matrix T, Sb is a column in matrix S, C_Iis the mean ∑N (

JO; CI,J)/V and K̅L is the mean ∑NSO;(KL,S)/V. This is

then divided with P∑N (C_I,J− CQ_I)R

JO; ∑NSO;(KL,S− K̅L)RT ;/R

, where the sums of squares of means of the matrix columns a and b subtraction from each datasets columns Ta and Sb are multiplied with each other’s, and the result the sums of is raised to the power of ½.

Parametric cluster correction with a threshold t-value of 1.7 and minimal cluster size of 5x5x5 voxels was done with FSL. Thus, remaining clusters after the correction were at least 125 voxels large with a t-value of 1.7, equaling to p- value of 0.05. Results were finally visualized with CARET software (Van Essen, 2005).

(19)

Results

Inter-subject correlation

Inter-subject correlation (ISC) suggests that neural activation while listening to a narrative is similar across subjects in extensive cortical areas (Figure 1, p< 0.05, cluster corrected). Temporal cortical areas in superior temporal gyrus (STG) and middle temporal gyrus (MTG) correlated strongly between subjects. Bilateral similarity of activity was observed with a slight left hemispheric dominance.

Figure 2 Inter-subject correlation (ISC) of cortical activities related to auditory narrative (p £ 0.05, cluster corrected) from lateral views of A) left hemisphere, B) right hemisphere, from medial views C) right hemisphere, D) left hemisphere and E) dorsal view of cerebellum. Bilaterally correlating brain areas found were calcarine fissure (CF), superior and middle temporal gyri (STG, MTG), middle frontal gyrus (MFG), precuneus and cuneus (PCUN, CUN), lingual gyrus (LG) and cerebellar lobules VI and crus II. Inferior frontal gyrus (IFG), supramarginal gyrus (SMG) and paracingulate gyrus (PCG) correlated mostly on left and fusiform gyrus (FuG) on right hemisphere. Threshold-free cluster enhancement (as implemented in FSL randomise) was used as cluster correction (Woolrich et al., 2009). Results are visualized with CARET software (Van Essen, 2005).

Some variation was observed in which brain areas showed correlation across subjects between the two hemispheres. For example, activation on inferior frontal gyrus (IFG), medial superior frontal gyrus and supramarginal gyrus (SMG) correlated in the left hemisphere but not as widely in the right. On the contrary, correlation in middle

A

B

C

D

E

0.0 0.1 CF Crus II VI STG IFG MTG FuG _LG CUN PCUN PCUN MTG STG MFG PCG mSFG SMG AG CF HF

(20)

frontal gyrus (MFG), paracingulate gyrus (PCG) and fusiform gyrus (FuG) was more similar between the subjects in right than in left hemisphere.

In addition to areas of temporal and frontal lobes, angular gyrus (AG) in inferior parietal lobule and areas around calcarine fissure (CF) in occipital lobe showed bilateral correlation of activation. Correlation of brain activation across subjects was also similar between the hemispheres in some parts of middle frontal gyrus (MFG) and lingual gyrus (LG). Same was observed in cuneal (CUN) and precuneal (PCUN) areas in medial occipital lobe.

ISC was bilaterally significant also in the cerebellum. Correlation was found between the subjects in cerebellar cortical areas in lobules VI and crus II. Also, horizonal fissure (HF), the largest one in cerebellum, showed correlation bilaterally. The inter-subject correlation was dominated in the right cerebellar hemisphere in crus II and horizontal fissure, but not in VI.

Combining fMRI time courses with word2vec model

Next, word2vec model of semantic content of the narrative was compared to the group level similarities in brain activation patterns. Regularized canonical correlation analysis showed even stronger correlation in more widespread areas than seen in the ISC (Figure 1). It seemed irrational to gain higher correlation as a result after including the semantic content of the stimulus in the analysis. This suggested that the results were incorrect and thus not reliable, and that this particular analysis method was not suitable for these data.

Ridge regression analysis suggested some correlation across subjects between BOLD signal and word2vec model of the semantic content of the stimulus (Table 1 and Figure 3). The clusters of correlating activity were spread widely on the cortical surface. Activity in cerebellum, especially in cerebellar crus II, was correlated more on the right hemisphere while correlation in cerebrum was spread bilaterally. Although, some spatial variation in the correlation between the hemispheres was observed. Supramarginal gyrus and middle frontal gyrus on right hemisphere were the largest clusters. Also, maximal t-values were highest in these areas.

Frontal corticalareas seen in Figure 3 had most correlation in the orbital middle frontal gyrus (oMFG), triangular inferior frontal gyrus (triangIFG) and precentral gyrus (PreCG) on left hemisphere and medial orbitofrontal cortex (mOFC) bilaterally. On the contrary, medial part of superior frontal gyrus (SFG), middle frontal gyrus (MFG) and orbital inferior frontal gyrus (oIFG) showed correlation on right hemisphere. Also, insular cortex (INS) on right and olfactory cortex (OLF) on left hemisphere showed significant correlation between the subjects and the linguistic model.

(21)

Table 1, peak voxels from ridge regression analysis

Cluster anatomical region Hemisphere Size of cluster

MNI coordinates

(x,y,z) max t-value

Supramarginal gyrus R 4814 (66, -32, 28) 5,4638 Middle frontal gyrus R 2551 (36, 62, 6) 5,4965 Cerebellar crus II R 1652 (38, -74, -48) 4,4214 Angular gyrus L 657 (-44, -62, 46) 5,1008 Orbital middle frontal gyrus L 356 (-34, 66, -2) 3,9859 Cerebellar crus II L 318 (-28, -68, -42) 4,3767 Orbital inferior frontal gyrus R 300 (56, 32, -6) 3,3183 Rolandic operculum L 232 (-64, -4, 16) 4,0188 Postcentral gyrus R 220 (64, -6, 26) 3,6835 Supramarginal gyrus L 212 (-56, -28, 40) 3,8418 Olfactory cortex L 194 (-4, 12, 0) 4,3376 Middle temporal gyrus R 185 (66, -34, -6) 4,2516 Precentral gyrus R 167 (50, -10, 56) 3,8454

Insula R 137 (38, 24, 2) 3,8456

Medial superior frontal gyrus L 133 (-12, -28, 28) 3,2527 Triangular inferior frontal gyrus L 129 (-48, 38, 4) 2,9379 Inferior temporal gyrus R 127 (56, -60, -18) 3,3035

Correlation was also observed in middle temporal gyrus (MTG) in right hemispheric temporal cortical areas. In right temporal lobe, some correlation was observed near temporo-parieto-occipital junction (TPOJ), extending below from the supramarginal gyrus (SMG). Clearly largest cluster in temporal cortical areas was anyhow MTG (Table 1). Similar correlation was not found on the left hemispheric temporal cortex.

In left parietal cortex, correlation spread anteriorly from angular gyrus (AG) towards central sulcus, covering some areas of SMG and postcentral gyrus (PostCG). Similarly, supramarginal gyrus (SMG) on the right parietal cortex showed correlation with the linguistic model while this was not seen in right PostCG or AG. More correlation was seen on right precuneus (PCUN) than on left. SMG cluster in right parietal cortex was the largest one found in the analysis (Table 1). In parietal cortex, following largest clusters in size were left AG, right PreCG and finally left SMG. Clusters in PCUN were quite scattered, so they did not reach other clusters in size to fit the peak table. Also, Rolandic operculum in left opercular cortical areas around sulcus lateralis showed significant correlation.

(22)

Figure 3, Correlation with ridge regression between BOLD signal and the semantic model visualized on cortical surfaces (cluster corrected, p £ 0.05) from lateral views of A) left hemisphere, B) right hemisphere, from medial views C) right hemisphere, D) left hemisphere and E) dorsal view of cerebellum. Figure shows correlations that reached significancy (p£ 0.05) on yellow and orange color scale. These areas of similarity include clusters in left medial superior frontal gyrus (SFG), left orbital middle frontal gyrus (oMFG) and triangular inferior frontal gyrus (triangIFG), left pre- and postcentral gyri (PreCG, PostCG), left rolandic operculum (ROL), left angular gyrus (AG), supramarginal gyrus (SMG) bilaterally, right medial temporal gyrus (MTG), right insula (INS), right medial frontal gyrus (MFG), right orbital inferior frontal gyrus (oIFG), right lingual gyrus (LG), right precuneus (PCUN), right temporo-parieto-occipital junction (TPOJ), left cuneus (CUN), medial orbitofrontal cortex (mOFC) bilaterally, left olfactory cortex (OLF), right cerebellar VI and crus I and cerebellar crus II bilaterally.

Other significantly correlated areas were found in occipital lobe where lingual gyrus on right and cuneus on left hemisphere seemed to correlate the most between subjects and the linguistic model. On cerebellar cortex, significance was found in lobules VI, crus I and crus II. Although correlation was lateralized clearly more on the right hemisphere, also left crus II activity correlated significantly with the model.

C

A

B

D

E

0.0 4.3 SMG MFG Crus II AG PreCG PostCG MTG mOFC Crus IVI LG CUN oMFG oIFG ROL OLF INS triangIFG SFG PCUN TPOJ

(23)

Discussion

The current study addressed questions of how brain contributes to language comprehension during narrative listening and what are the neural correlates of semantic processing in the brain. Brain research has implemented computational methods to study concepts and semantic representations in the functional brain (Binder et al., 2009; Fedorenko et al., 2010; Pereira et al., 2018). The approach was to implement methods of cognitive neuroscience and computational linguistics to investigate further this complex question using a continuous naturalistic stimulus taking the context of language into account.

Semantics in narrative elicit correlation on distributed cortical areas

Inter-subject correlation analysis of auditory intelligible narrative suggested wide correlation in the BOLD signals between the subjects as seen in figure 2. Not all of the correlation can be related with semantic understanding. In fact, these results describe the whole process occurring in the brain while listening to the narrative, showing also lower level activation in for example sensory auditory areas. Because the task was to listen to the stimulus, especially auditory areas in temporal cortices correlated between the subjects. As the stimulus is naturalistic and it activates brain on wider areas than for example a simpler and more controlled one, it is difficult to make conclusions from the ISC analysis to separate these events. The task does not only produce semantic understanding of the narrative but also the processes prior to this. Thus, it is not possible to separate which of the responses are actually related to semantic, phonologic or auditory processes based on ISC alone.

In ISC, the correlation in temporal cortical areas extended from auditory areas in lateral fissure and superior temporal gyrus to anterior parts of temporal lobe on left hemisphere and this result supports the previous findings (Fedorenko et al., 2010; Binder and Desai, 2011). This does not necessarily relate only to auditory and phonological processing. Some semantic processing has also been thought to occur in these brain areas (Démonet et al., 1992; Mummery et al., 2000; Friederici et al., 2003; Visser and Lambon Ralph, 2011).

Correlation of brain data with the semantic model in ridge regression (Figure 3) was in line with previous research using narrative stimuli. Inferior parts of parietal lobe and on medial plane in PCUN are involved in semantic processing of narrative and similar results were found in previous studies (AbdulSabur et al., 2014; Huth et al., 2016). In addition, involvement of medial, superior and inferior PFC in narrative comprehension was found in ridge regression supported by research. In previous studies, temporal areas were found to take more part in semantic processing compared to these results. Only right MTG in temporal areas was involved with semantic processing in ridge regression. MTG is usually left lateralized in language tasks with simple stimuli, and with narrative context, bilateral activity of MTG has been found (Huth et al., 2016). It could be that once word2vec maps the semantic content of words based on their context, these results describe the content that is clearly associated to

(24)

some distinct feature. Maybe left temporal areas more traditionally associated to semantics are processing some more basic, procedural aspects of semantics that these algorithms cannot mimic. Thus, the semantic model used in this study could have such limitations or properties that these areas are not shown so strongly in the analysis. Another possibility is that semantics are not processed as much in temporal lobes with wider context. It is possible, that when the contextual information increases, detailed linguistic processing is in less importance in relative to larger semantic meaning elicited by larger context.

In ridge regression, some motor areas seemed to be involved in semantic processing of language. Left rolandic operculum, ventral part of the premotor cortex correlated with word2vec, suggesting them having a role in semantics. Some research has associated rolandic operculum with linguistic functions, but mostly with articulatory and other speech related aspects. Rolandic operculum is connected to SMG with a white matter tract (Maldonado, Moritz-Gasser and Duffau, 2011), and also SMG is involved in semantics in ridge regression. In addition, correlation was shown on left preCG. Similar result was found in another fMRI study using narratives (AbdulSabur et al., 2014). Across the central sulcus, somatosensory areas in postCG showed correlation as well. Motor areas might have a role in understanding the movement described in the narrative and somatosensory the somatic sensations. Olfactory cortex was involved indicating olfaction taking part in comprehending the narrative content, as visual areas around calcarine fissure might have a role in imagining the scenery while listening to the narrative. Involvement of these sensory and motor areas could be related to features, such as smell or texture, of the concepts used in the text. Following, combination of these features would construct the meaning of a concept, semantics showing as a distributed correlation in the brain many brain areas being involved in its processing.

Involvement of insular cortex in ridge regression during narrative listening indicates some emotional processes of the brain taking part. In the narrative, there was for example surprisal of co-workers handing a gift, some confusing events such as phone in the refrigerator and backpack on the car roof, a car accident, mentalization eliciting paragraphs, fear of the narrator being cheated by partner and then revealing of the narrators own cheating to the listener. These are all events that should elicit some sort of emotions in the listener, and depending on the individual experiences, they might elicit very different kind of emotions with varying intensities. Insular cortex has been traditionally associated with disgust and its involvement in this data could indicates narrative eliciting disgust in subjects. It is known, that memories and concepts that are related to some emotions stay better in our memory (Kensinger and Corkin, 2003), and that words in our lexicon also associate to emotions (Binder and Desai, 2011). Emotional content of narrative has been found to elicit different patterns of activity than for example numeric suggesting these semantic categories being represented in the brain differently (Huth et al., 2016). When a certain concept is associated to an emotion, correlation is seen in the areas processing the emotion. Same correlation is not seen with a neutral concept, which is why they are also felt differently in the light of this specific aspect.

(25)

ISC included all processes related to naturalistic narrative listening, but ridge regression took into account the semantic content of the narrative. These two analyses showed areas overlapping in their correlation. Many of these areas were bilateral in ISC but unilateral after applying word2vec. Correlation shown in ISC was naturally much stronger and covered wider areas. In ISC, frontal cortical areas correlated more on the left hemisphere than right. This is in line with previous studies (Vigneau et al., 2006; Binder and Desai, 2011). In ridge regression, semantic processing of narrative recruited frontal cortical areas on right hemisphere. Bilateral parietal cortices and cerebellum were found to be involved in both ISC and ridge regression. Also, parts of visual and temporal cortices were involved in semantic processing in both analyses. Correlation was shown in distributed manner also on medial plane of cerebrum. Activation of frontal and parietal areas is found in previous studies with words and sentences (Vigneau et al., 2006; Binder et al., 2009), but these results lacked correlation in left temporal areas that is often shown in research. Semantics of language are encoded in distributed brain areas rather than one fixed area.

Left inferior frontal gyrus (LIFG) activation has been associated with sentence comprehension (Friederici et al., 2003; Vigneau et al., 2006). LIFG has been propose to play a role in unification, combining bits of linguistic information to larger entities (Hagoort, 2005). Correlation in left triangIFG was shown in both of the analyses. This suggests LIFG having a role also in narrative level comprehension, but the cluster was rather small in ridge regression. It might have a smaller role in relative to the rest of the brain when contextual information recruit other areas more strongly.

Research on genetics has shown the importance of FOXP2 in language but its role in language comprehension and semantics is not that clear. This gene encodes a protein that has an effect on brain plasticity. Research has suggested dopaminergic system and basal ganglia involvement in learning to be basis on how FOXP2 has had an effect on language development over the course of evolution. These areas are not shown in functional brain imaging studies of semantics, but they might anyhow have their role in learning process and formation of the semantic system via plasticity. This task does not necessarily trigger learning but merely comprehension that is based on already existing knowledge. Different kind of study design might be needed to study if basal ganglia have a role in language to further reveal the missing link between FOXP2 and language. Instead of nuclei in the basal ganglia, cerebellum was involved in both results. Cerebellum is one of the memory formation sites in the brain in addition to basal ganglia, hippocampus and amygdala (Gao, Van Beugen and De Zeeuw, 2012). Cerebellum role in semantics is still unclear.

Narrative context recruits both hemispheres for semantic processing

Quantitatively the difference in correlation was not large between left and right hemisphere in ISC. Previously, it was thought that language processing takes place on the left hemisphere (Krashen, 1973; Mori, Yamadori and Furumoto, 1989), and later studies claimed language being left lateralized in right-handed people (Binder et al.,

Naturalistic language comprehension : a fMRI study on semantics in a narrative context

Naturalistic language comprehension:

a fMRI study on semantics in a narrative context

Table of contents

Naturalistic language comprehension:

a fMRI study on semantics in a narrative context

Keywords:

Introduction

Semantic processing of language

Communication context evolutionally important

Language as a part of cognition

Brain research of semantics

Methods

Subjects and MRI acquisition

Stimuli and experimental design

Preprocessing of the data

Inter-subject correlation (ISC)

Computational linguistic model

Canonical correlation

Ridge regression

Results

Inter-subject correlation

A

B

C

D

E

Combining fMRI time courses with word2vec model

C

A

B

D

E

Discussion

Semantics in narrative elicit correlation on distributed cortical areas

Narrative context recruits both hemispheres for semantic processing