Chapter 3: Lexical Decision Experiment

3.1 Rationale

Further to Chao's (1948) research on the dialects of Hubei province, this research project examines the influence of these dialects on the lingua franca, Mandarin Chinese. Many of the dialects in Chao's (1948) compendium exhibit forms of assimilation whereby an [l] becomes an [n] or vice versa. This phenomenon has been observed in the surrounding regions in present-day China, and people who have moved to these areas from other parts of China notice it in their conversations with local speakers. For example, people living in the city of Changsha seem to show random variation in their production of these two phonemes: sometimes an [l] in Mandarin will be spoken as an [n] and vice versa. This study investigates the representation and production of these phonemes in experienced non-native speakers. As the two phones had already been determined by observation, the initial experiments completed by Zhang et al. (2012) were omitted.

These initial experiments included a discrimination test and a Garner test to verify that the particular phonemes chosen were appropriate for study. The final experiment used a medium-term priming paradigm to examine lexical processing of Mandarin contrasts in native Cantonese speakers. This study uses a similar methodology to investigate the lexical representations that are stored in the brain, as well as their associated production patterns (see Chapter 4), following a similar line of thought to Hayes-Harb & Masuda (2008), who investigated Japanese geminate consonant discrimination and production in L2 speakers. Further experiments conducted by Weber & Cutler (2004) and Cutler et al. (2006) show an asymmetric mapping phenomenon whereby one particular order of the stimuli showed priming whilst the other order showed inhibition. Priming is “a change in the ability to identify or produce an item as a result of a specific prior encounter with the item” (Schacter & Buckner, 1998). This paradigm makes it possible to examine whether a similar asymmetrical representation exists in the particular case of Southern Mandarin speakers. The experimental design considers the possibility that the internal mapping of phonemes is interconnected and overlapping. As a result, it is possible that two different sounds map to the same representation in the lexical inventory, as is predicted by the Perceptual Assimilation Model (PAM; Best, 1995).

This experiment examines the possibility of a similar phenomenon in Standard Chinese when it is a shared L1 with a local dialect. It may be that [n] and [l] become mapped to the same lexical representation similar to what occurs in the above example. An implicit priming task was used to attempt to unmask this effect.

3.2 Methodology

3.2.1 Participants

Fifteen native speakers of Mandarin Chinese (12 females and 3 males) participated in the study. None of the participants reported any vision or hearing problems. The average age of the participants was 21 years (range: 20-23). Participants were all students at Leiden University, either in an undergraduate program studying for a Bachelor's degree (n=10) or in a graduate program studying for a Master's degree (n=5). Participants' English language abilities were not assessed, as all of the documentation had been translated into Mandarin Chinese prior to the experimental sessions.

Most of the participants had lived the majority of their lives in Beijing (n=9). The remaining participants came from parts of northern China (n=3) and Taiwan (n=3). All speakers reported that their L1 was Mandarin Chinese (中文 [Mandarin Chinese] n=1, 普通话 [Mandarin Chinese] n=9, 北京话 [Beijing dialect] n=2, 華語 [Mandarin Chinese] n=1, 國語 [Mandarin Chinese] n=1, 漢語 [Mandarin Chinese] n=1). This was verbally confirmed by the researcher because of the multitude of different written forms that have similar meanings. Two participants reported that a language other than Mandarin Chinese was spoken at home (陕西话 [Shǎnxī (also Shaanxi) dialect] n=1, and 闽南话 [Mǐnnán dialect] n=1).

The participants' parents spoke a wider variety of languages, but when asked to elaborate on the languages used at home, all of the participants except two said that their parents would only speak Mandarin Chinese and would rarely use their native tongue.

3.2.2 Stimuli

Three types of monosyllabic words were used for the experiment. The three categories were (1) target, (2) control, and (3) filler. Words categorised as target words had an initial that was either [n] or [l], for example nǚ and lǎo. Words in the control category had an initial that was either [tʰ] or [t], for example tiē and duì. Words in the filler category had any initial that was not a part of the interest or control group, for example [ʃ] or [m] in shēng and mèng. A total of forty monosyllabic words were selected for the interest and control groups of words. A further eighty monosyllabic words were selected for the filler words, and eighty monosyllabic nonce words were created using the same constraints as the real words in the filler category.

Each word that was selected for inclusion in the experiment had an associated minimal pair. These minimal pairs differed only in the initial consonant in the interest and control groups; all other phonetic components were controlled. For example, the word tài was selected for the control group. The associated minimal pair item is dài. Written in IPA, these two words are [tʰai˥] and [tai˥] respectively, identical save for the initial consonant. Pairings for the filler words and nonce words did not follow this minimal pairing scheme as they were not key lexical items.

Four different conditions were developed in accordance with Zhang et al. (2012). These conditions varied across the four different lists so that each possible pairing only occurred once throughout the four lists. Each list was made up of 288 words. One word of the pairing was considered to be an “A” and the other the “B.” In the first condition, two “As” would be given. In the second condition, two “Bs” would be given. In the third condition, an “A” would be given followed by a “B.” In the fourth condition, a “B” would be given, followed by an “A.” A sample distribution of conditions across the four lists is given in Table 2. This kind of configuration allows us to examine all of the potential relationships between the two words.

Pair           List 1               List 2               List 3               List 4
/tài/-/dài/    Con 1: /tài/-/tài/   Con 2: /dài/-/dài/   Con 3: /tài/-/dài/   Con 4: /dài/-/tài/
/nín/-/lín/    Con 2: /lín/-/lín/   Con 3: /nín/-/lín/   Con 4: /lín/-/nín/   Con 1: /nín/-/nín/

Table 2: Conditions across Lists
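The rotation shown in Table 2 can be sketched as a simple Latin-square assignment. The following is only an illustrative sketch and not the procedure used to build the lists: the function names and the zero-based indexing are assumptions, and the only facts taken from the text are the four conditions (A-A, B-B, A-B, B-A) and the two sample pairs.

```python
# Hypothetical sketch: rotate the four priming conditions across four lists
# so that each pair appears in a different condition on every list.
CONDITIONS = {
    1: ("A", "A"),  # first word repeated
    2: ("B", "B"),  # second word repeated
    3: ("A", "B"),  # first word, then second word
    4: ("B", "A"),  # second word, then first word
}


def condition_for(pair_index, list_index):
    """Return the condition number (1-4); both indices are zero-based."""
    return (pair_index + list_index) % 4 + 1


def prime_and_target(pair, pair_index, list_index):
    """Return the (first, second) presentation of a pair on a given list."""
    a, b = pair
    roles = CONDITIONS[condition_for(pair_index, list_index)]
    first, second = (a if role == "A" else b for role in roles)
    return first, second


pairs = [("tài", "dài"), ("nín", "lín")]
for i, pair in enumerate(pairs):
    row = [prime_and_target(pair, i, j) for j in range(4)]
    print(pair, row)
```

Under this assumed rotation, the two sample pairs reproduce the rows of Table 2, and no pair receives the same condition on two different lists.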

The pairs were entered into a spreadsheet and then repeatedly randomised using a feature of the spreadsheet program. In order to obtain a result comparable to Zhang et al. (2012), the distance between the pairs also needed to be randomised. To this end, another column was created in the spreadsheet and an integer between 8 and 20 was generated using the RANDBETWEEN(8; 20) function. Once this was completed for the first list, all of the other lists were generated using the same ordering as the first list. The first list was manually checked so that similar segments did not occur close together, which could have created an unintended effect. The conditions on each list were then changed so that they matched Table 2. Each pair was present in only one condition on each list, and no pair was presented in the same condition on more than one list.
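The randomisation step can be approximated outside a spreadsheet. The sketch below is an assumption-laden illustration, not the original procedure: it reads the random integer as a per-pair distance, uses Python's random.randint (which, like RANDBETWEEN, includes both endpoints), and the function name draw_lags and the fixed seed are hypothetical. It does not place the items into a final 288-item list or perform the manual check described above.

```python
import random


def draw_lags(pairs, low=8, high=20, seed=None):
    """Shuffle the pairs and draw one integer in [low, high] for each.

    Assumption: the integer is a per-pair distance, as with RANDBETWEEN(8; 20).
    """
    rng = random.Random(seed)
    order = list(pairs)
    rng.shuffle(order)
    return [(pair, rng.randint(low, high)) for pair in order]


pairs = [("tài", "dài"), ("nín", "lín")]
schedule = draw_lags(pairs, seed=1)
for pair, lag in schedule:
    print(pair, lag)
```

Fixing the seed makes the ordering reproducible, which mirrors the decision to reuse the ordering of the first list for the remaining lists.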

A characteristic of Mandarin Chinese is that there are many homophones, identical in terms of the segment and tone at the lexical level. For example, 是 “to be,” 市 “city,” and 示 “to show” are all pronounced shì. As a result, words that did not have any homophones were selected first, and only when there were no non-homophonic words remaining were words with homophones selected. In this case, the most frequently occurring of the homophones was chosen.

Moreover, the relative frequencies of the selected words were controlled. This was done to avoid creating an artificial effect whereby more frequently occurring words would be responded to faster and more accurately than similar words with a lower frequency. The minimal pairs chosen had similar frequencies to each other, and the overall frequencies of the words across all of the lists were kept as similar as possible. Word frequencies were first taken from the Academia Sinica Balanced Corpus of Modern Chinese, and then checked against the larger Jun Da (2004) corpus. This was done for two reasons: (1) to guard against anomalies between corpora, and (2) to compensate for any skewing of the word frequencies and differences in pronunciation caused by corpora from two different regions of the Mandarin-speaking world. For example, if a word had a relatively low frequency in both the Academia Sinica corpus and the Jun Da corpus, then that word would be classified as low frequency. However, if the Academia Sinica corpus classified a word as high frequency when Jun Da did not, then the word would be discarded. Where Jun Da listed a word as high frequency and the Academia Sinica corpus did not, the Jun Da frequency would continue to be used, as the interest group and control group are both from mainland China. Furthermore, some words in the Academia Sinica corpus use the Taiwan pronunciation, which differs from the pronunciation in mainland China. A widely known example of this is the word for “rubbish,” 垃圾. In Taiwan, this word is pronounced lèsè; in mainland China, however, it is pronounced lājī. In this case, the Academia Sinica frequency would not be counted.
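The cross-corpus rule described above can be written out as a small decision function. This is a sketch under stated assumptions: the function name is hypothetical, the inputs are simple high/low judgements because the text gives no numeric cut-off, and the case in which both corpora list a word as high frequency is not spelled out in the text, so treating it as "high" is an assumption.

```python
from typing import Optional


def classify_frequency(sinica_high: bool, junda_high: bool) -> Optional[str]:
    """Return "high", "low", or None when the word would be discarded."""
    if sinica_high and not junda_high:
        # Academia Sinica high, Jun Da not: the word is discarded.
        return None
    if junda_high:
        # Jun Da high: the Jun Da frequency is used.
        # (Both corpora high is an assumed case, not stated in the text.)
        return "high"
    # Both corpora low: classified as low frequency.
    return "low"


for sinica_high in (False, True):
    for junda_high in (False, True):
        print(sinica_high, junda_high, classify_frequency(sinica_high, junda_high))
```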

Nonce words were also used in the experiment. These are all pronounceable words that could theoretically be written in pīnyīn, but have no meaning. A pronounceable word in this context means that the combination of sounds is phonotactically legal, but is not present in the actual language. These words were all checked in the Jun Da corpus to verify that there were no words that appeared when a particular pīnyīn was queried in the database. For example, the nonce word rǐ was coined for the purposes of the experiment. When this is entered into the Jun Da database, there are no results given, meaning that this word was verified as being non-existent, and therefore classifiable as a nonce word.

3.2.3 Recording

A female native Mandarin speaker from Beijing read all 288 stimulus syllables (both words and nonce words) into a Sennheiser MKH416T short shotgun interference tube microphone connected to a Roland UA-55 Quad Capture USB audio interface, which was connected to a desktop PC running Microsoft Windows 7. The recording was done in a soundproof chamber at the Leiden University Phonetics Laboratory using Audacity 2.0.2 (Audacity Team, 2012). Before the recording, the speaker and the researcher went through all of the stimuli together to ensure that there would be no issues producing the stimuli during the recording session. The speaker read through the list of stimulus items one at a time. The audio tokens were recorded at a sampling rate of 44,100 Hz. In order to account for natural variations in speech, each of the stimulus items was produced three times. The three tokens were then evaluated by the researcher and the clearest token was selected for the experiment. Each individual token was down-sampled to 22,050 Hz and converted to the Waveform Audio File Format (WAV) for compatibility with E-Prime psychological experiment software (Psychology Software Tools, 2012).

3.2.4 Procedure

Participants were tested in the Leiden University Phonetics Laboratory. Before the experiment started, the participants were asked to fill out a questionnaire about their language background. The questionnaire collected information such as age, gender, education level, birthplace, languages spoken, languages and birthplace of the parents, and hearing or vision problems. The questionnaire was developed based on Li et al.'s (2014) LHQ 2.0 language questionnaire. As all of the participants were international students from China, it was assumed that their level of Mandarin was sufficient to collect data. Participants were also asked to read through a page outlining the ethical constraints and requirements of the experiment in accordance with Leiden University Centre for Linguistics guidelines.

Participants were seated in front of a computer screen in the phonetics laboratory and given headphones to wear over their ears. They were instructed to press two keys on the keyboard in order to complete the experiment.

A practice session preceded the test. The words used in the practice session were not repeated in the remainder of the experiment. The practice session was intended to help all of the participants become familiar with the format of the experiment before data collection began. The practice session included both real words and nonce words to simulate the experiment. Participants were told to respond based on whether or not they thought the utterance that they heard could be classified as a word. The participant would see a fixation point in the centre of the screen before every sound was played.

Responses were entered through a keyboard. Participants were instructed to press the “k” key for a word response and the “j” key for a nonce word response. Participants were asked to place their right-hand index finger and right-hand middle finger over the “j” and “k” keys respectively. This hand placement follows proper QWERTY keyboarding technique and also prevented any possible bias that may have been caused by using two different hands to respond. All participants were asked to respond as quickly and as accurately as possible. The stimuli were presented using E-Prime psychological experiment software version 2.0.10 (Psychology Software Tools, 2012). Speed and accuracy of the responses were measured and recorded by E-Prime. Each item-pair was only presented once to the participants. No feedback was given.

3.3 Results

The experiment was originally intended to be conducted with two different groups of participants: Mandarin speakers from Beijing and northern China forming the control group, and Mandarin speakers from Hubei and other surrounding areas forming the interest group. However, due to technical difficulties with E-Prime, data from the interest group could not be collected.

Before conducting statistical analysis on the results, all of the filler and nonce words were removed from the analysis. Any incorrect responses were also removed from the analysis. Overall accuracy for the experiment was high, at 89.77%. There was a significant effect when subjects were considered as factors against response accuracy (F(1,14) = 3.098, p < .001). However, further analysis suggests that the variation across the participants never exceeded six percentage points (max = 0.05903) and therefore the significant result can be disregarded.
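The filtering steps described above can be illustrated with a few lines of code. Everything in this sketch is hypothetical: the trial records, field names and values are invented for illustration and are not the experimental data, and computing accuracy over the target and control items only is an assumption, since the text does not state which items entered the accuracy figure.

```python
CRITICAL_TYPES = {"target", "control"}


def critical_trials(trials):
    """Drop filler and nonce items, keeping target and control items."""
    return [t for t in trials if t["type"] in CRITICAL_TYPES]


def accuracy(trials):
    """Proportion of correct responses in the given trials."""
    return sum(t["correct"] for t in trials) / len(trials)


def rt_trials(trials):
    """Critical trials with correct responses, as used for response times."""
    return [t for t in critical_trials(trials) if t["correct"]]


# Invented example records (not data from the experiment).
trials = [
    {"subject": "S01", "type": "target", "correct": True, "rt": 612},
    {"subject": "S01", "type": "target", "correct": False, "rt": 840},
    {"subject": "S01", "type": "control", "correct": True, "rt": 655},
    {"subject": "S01", "type": "filler", "correct": True, "rt": 590},
    {"subject": "S01", "type": "nonce", "correct": False, "rt": 910},
]

print(len(rt_trials(trials)), round(accuracy(critical_trials(trials)), 4))
```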

Figure 3.2 shows the effects of the different conditions on response times. A one-way ANOVA did not yield a significant overall effect of condition on response time (F(1,806) = 1.908, p > .05). This means that there is little statistical variation in response times between the different conditions. A one-way ANOVA did give a significant effect when comparing the response times of the prime and target conditions (F(1,806) = 5.278, p < .05). This effect was expected, as the priming paradigm presupposes that the conditions, if any effect is present, will create a difference between the prime and target conditions across the experiment.
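For readers unfamiliar with the statistic, the F ratio of a one-way ANOVA can be computed as follows. This is a generic textbook sketch, not the analysis pipeline used for this experiment; the function name and the toy numbers are illustrative assumptions and do not come from the data. With k groups and N observations, the degrees of freedom are k - 1 and N - k.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Toy numbers for illustration only (not response times from the experiment).
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_ratio, df_between, df_within = one_way_anova(toy_groups)
print(f_ratio, df_between, df_within)
```

The p-value is then obtained from the F distribution with these degrees of freedom, which a statistics package would normally report directly.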

The x-axis of Figure 3.2 expresses the different word types and the conditions. In order to save space, conditions are abbreviated to “C” followed by the number as given in Table 2. “Con” stands for the control set of items (i.e. /t/-/d/) and “Int” stands for the target set of items (i.e. /n/-/l/).

3.4 Discussion

The results suggested a priming effect in conditions 1 and 2. This was expected because both of the conditions are simply repeated presentations of the same stimulus. While the result was not statistically significant, a trend can be seen in Figure 3.2. This indicates that the priming model may not be as reliable in a study like this as a more prominent priming model, such as that used in the experiments conducted by Lee (2004) with an ISI of 250 ms between pairs. As this experiment used a less salient priming paradigm, the results may not be as clear as they may have been had a different form priming paradigm been used.

Continuing with the discussion of conditions 1 and 2, the noticeable changes in response times across the control and interest word pairs may be due to random variation. All of the error bars for these columns overlap, meaning that there are no significant differences between them.

3.4.1 Control Pairings

The third condition (/t/-/d/) and fourth condition (/d/-/t/) in the control pairing show opposite effects between the prime and target. The third condition result suggests that a [tʰ] initial is inhibiting the processing of a [t] initial, whilst a priming effect can be seen when the order of presentation is reversed in the fourth condition: a [t] initial primes a [tʰ] initial. The difference between the two phonemes is the presence of aspiration, i.e. the difference between the “t” in “two” and the “t” in “computer.” In English, an initial “t” is always aspirated, whereas when it occurs within a word segment, there is generally no aspiration and it sounds more like a “d.” Apart from the aspiration, the place of articulation is identical between the two phonemes. This suggests that the aspirated phoneme acts as a subset of the non-aspirated phoneme. That is to say that the brain first processes the [t] and subsequently categorises the sound as aspirated [tʰ]. Once this has occurred, the next sound must skip over the initial processing and try to match it with the previous aspirated sound, and when the match does not occur, it begins from the beginning, returning to the [t] sound and, as a consequence, taking much longer to process.

This kind of lexical representation is further supported by the significant priming effect found in condition four. In this condition, a [t] is presented first. The brain has not gone any further with processing because [t] is a higher-level phoneme in the phonemic inventory. When a [tʰ] is then subsequently heard, a comparison is made between [t] and [tʰ] and the correct phoneme is selected. In this case, the narrowing of the phonemic candidates has already taken place, and since [tʰ] appears to be categorised as a subset of [t], the selection occurs very quickly. A visual representation of what this effect could look like is given in Figure 3.3.

This phenomenon may be indicative of a larger effect on phonemic activation in Mandarin Chinese where aspirated consonants are compared with their non-aspirated counterparts. A further study could be conducted that examines this effect in more detail.

3.4.2 Interest Pairings

The interest pairings in conditions three and four did not seem to generate much variation beyond what was reasonably expected. In condition three, there was a slight priming effect shown that by and large mimics the effects shown in conditions one
