
RESEARCH METHODOLOGY

3.2 Data for the Study

This section includes data collection and data processing.

3.2.1 Data Collection

The following subsections describe the subjects and the instrument used for data collection.

3.2.1.1 Subjects

The subjects are 117 native speakers of Mandarin Chinese. They are all university students at Tongji University, aged 18 to 20, comprising 80 males and 37 females, all majoring in science. They come from different parts of mainland China and are about to finish their first academic year of university study after graduating from high school.

Mandarin Chinese, used officially in China by the government, the media and in education, is their native language. The students have learned Mandarin Chinese since they entered primary school, and it is one of the compulsory courses taught throughout their schooling.

All the subjects started learning English in primary school. As the curriculum is unified, they have a similar number of years of English learning experience from primary school through senior high school in China. They have also had the same experience of English during their first year of study at the present university. They have all received classroom instruction in EFL for a period of thirteen years. None of them has any experience of studying abroad. They all passed the College English Test (CET) Band 4 with scores higher than the national average score of 590. Among them, the highest score is 660 and the lowest is 597.

3.2.1.2 Instrument

The data were collected from one writing task administered during classroom hours. The choice of genre, the argumentative essay, was based on two criteria. The first follows Granger (1990), who uses learners' written argumentative essays as a learner corpus. The International Corpus of Learner English (ICLE) constructed by Granger (1990) consists mainly of argumentative essays written in English by university undergraduates who are advanced EFL learners with different mother tongues. Each essay in ICLE is accompanied by a 'learner profile', which gives information about the essay (topic, writing conditions, etc.) and the learner (native language, age, sex, educational background, etc.) (Granger, 1990). The second criterion is that the argumentative essay is usually given to students in examinations and assignments throughout their academic years.

The subjects were asked to write under timed conditions on the topic “Success is 1% inspiration and 99% perspiration”. They were required to give their views on this in 200 words, rather than the 120 words required in a standard test across China, so that as large a body of error data as possible would be obtained. Dictionaries were not provided, and no discussion was permitted in class during the test. The administrator did not offer any information about the content of the essay. The time allowed was 60 minutes, since essay writing in a standardized test in China is allotted half an hour of the total testing time (120 minutes).

3.2.2 Data Processing

The researcher set about identifying the errors after processing the students’ essays. At this preliminary stage, the written texts were scrutinized to detect the errors for the present study. This process of detecting errors involved reconstructing what the learner was attempting to say by inferring the learner’s intentions from the interpretation of the whole context of situation (Corder, 1973:274).

The present study defines “error” according to the principle of typicality, under which typical errors were identified and processed. It is generally acknowledged that patterns of collocation which have a history of recurrence in a language become part of the language’s standard linguistic repertoire, and users do not stop to think about them when they encounter them in a text. It must be pointed out, however, that unlike grammatical statements, statements about collocations are made in terms of what is typical or untypical rather than what is admissible or inadmissible (Baker, 2011:55).

Meanwhile, inappropriate uses of a word in a type of collocation are also taken into consideration in the present study. Baker (2011:55) pointed out that there exists a middle ground between completely acceptable collocations (especially lexical collocations) and erroneous collocations, which may be judged as ‘non-nativelike or stylistically non-appropriate’. Such ‘non-nativelike’ or ‘non-appropriate’ collocations were also counted as erroneous collocations in the error analysis of the present study. After all, collocation lays emphasis on the semantic restriction rule at the lexical level and on the restrictive rule in morphology and syntax at the grammatical level. A non-appropriate collocation is likely to violate the restrictive collocation rule at either the lexical level or the grammatical level.

According to Cook (1993:22), “The recognition of an error and its reconstruction are subjective processes; the error is not a clear-cut objective ‘fact’ but is established by a process of analysis and deduction.” In order to establish validity and reliability in the performance data (Mahammad, 1998), the errors of English collocation made by the subjects were determined using three resources: a learner corpus, an English native speaker corpus and dictionaries.

Error identification required manual searching and manual annotation once the researcher had extracted all the examples of English collocation errors present in the data. Manual searching was seen as the most appropriate strategy for error identification.

Before the manual identification of errors, the 117 student essays were coded at random. Each essay was given a code beginning with T, which stands for Text: T1, T2, … T117.
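The random coding step can be illustrated with a minimal sketch. The study does not say how the random order was produced, so the shuffle, the optional seed, the function name `assign_codes` and the placeholder essay identifiers are all assumptions made for illustration, not the researcher's actual procedure.

```python
import random


def assign_codes(essays, seed=None):
    """Return a dict mapping codes T1..Tn to essays in random order.

    Hypothetical helper: the source only states that essays were
    coded at random as T1, T2, ... T117.
    """
    rng = random.Random(seed)   # a seed makes the assignment reproducible
    shuffled = list(essays)     # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    return {f"T{i}": essay for i, essay in enumerate(shuffled, start=1)}


# Placeholder identifiers standing in for the 117 student essays.
essays = [f"essay_{n:03d}" for n in range(1, 118)]
codes = assign_codes(essays, seed=0)
```

Fixing a seed is one way to make such an assignment repeatable, so that the same essay keeps the same code if the step is rerun.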

The English native speaker corpus used to establish the correct forms of errors in this study is the British National Corpus (BNC). As the LOB and BROWN corpora, which were established in the 1960s, may contain outdated collocations, the BNC, set up in the 1990s, was a better alternative. Dictionaries such as the BBI Dictionary by Benson et al. (1997), the Oxford Collocations Dictionary for Learners of English (Lea et al., 2002) and the Oxford Advanced English Learner’s English-Chinese Dictionary (2004) were used to check the correct forms of collocations. Two college English teachers in China were also invited to identify errors.

In total, approximately 24,130 English words were collected from the 117 student essays. Although identifying and tallying the collocation errors and their correct forms was a time-consuming task, it was done manually and with great care.
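As a quick consistency check, the reported totals imply an average essay length close to the 200-word requirement. Both figures in the sketch below are taken from the text. Because the word total is approximate, the resulting average is approximate as well, and the variable names are illustrative.

```python
total_words = 24_130   # approximate total reported for the data set
num_essays = 117       # number of student essays collected

# Mean essay length implied by the two reported figures.
average_length = total_words / num_essays
print(f"Average essay length: {average_length:.1f} words")
```

The result is roughly 206 words per essay, which is in line with the 200-word task requirement.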

Two EFL instructors helped to identify and underline all the possible collocation errors in the subjects’ essays. The researcher then checked all the underlined errors in the data and corrected them by consulting the English native speaker corpus and the dictionaries.

In document A study of collocation errors among Chinese learners of English (with reference to Chinese college students of Tongji university in China) / Ye Hong (Page 120-124)