2.7 The scoring element of validity
2.7.2 Complexity, Accuracy, Fluency
2.7.2.2 Accuracy
Accuracy is defined as ‘the ability to produce target-like and error free language’ (Housen et al., 2012, p. 2). Measures of accuracy assess the extent to which language learners produce speech that adheres to the grammatical conventions of the target language. There are two broad categories of accuracy measures: measures of global accuracy and measures of accuracy of specific language forms. This review begins by discussing measures of global accuracy.
Measures of global accuracy are designed to provide an overall indication of the proportion of language produced during a task that is grammatically accurate. In the pre-task planning literature, global accuracy has been assessed by counting error free clauses (Skehan and Foster, 1997, Foster and Skehan, 1999) or by calculating the percentage of error free speech units (T-units in Crookes, 1989, C-units in Elder and Iwashita, 2005, AS-units in Elder and Wigglesworth, 2006). To calculate the percentage of error free AS-units, first count the total number of AS-units, then count the number of AS-units that contain grammatical errors and subtract this from the total to give the number of error free units. The result is the number of error free units divided by the total number of units, expressed as a percentage. For example, if ten AS-units are produced, and six of these units contain errors, four units are error free and the percentage of error free AS-units is 40 per cent. In the pre-task planning literature, planning has not consistently affected the percentage of error free speech units (Crookes, 1989, Elder and Iwashita, 2005, Elder and Wigglesworth, 2006). However, Skehan and Foster (1997) and Foster and Skehan (1999) demonstrate increases in the percentage of error free clauses after planning. For example, Skehan and Foster (1997) report that planning increased the percentage of error free clauses on a picture-based narrative task from 53 per cent to 69 per cent.
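The calculation of the percentage of error free AS-units described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and the representation of a speech sample as a list of error counts (one integer per AS-unit) are assumptions made for the example and are not taken from the studies cited.

```python
def percent_error_free(errors_per_unit):
    """Return the percentage of AS-units that contain no errors.

    errors_per_unit: list of integers, one per AS-unit, giving the
    number of grammatical errors coded in that unit (hypothetical format).
    """
    if not errors_per_unit:
        raise ValueError("at least one AS-unit is required")
    error_free = sum(1 for n in errors_per_unit if n == 0)
    return 100 * error_free / len(errors_per_unit)


# Ten hypothetical AS-units, six of which contain at least one error
sample = [0, 2, 1, 0, 3, 1, 0, 1, 0, 1]
print(percent_error_free(sample))  # 40.0
```

The sample reproduces the worked example in the text: four of the ten units are error free, giving 40 per cent.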
The calculation of error free AS-units provides an indication of the overall grammatical accuracy of a speech sample. However, this measure has a shortcoming: an AS-unit that contains multiple grammatical errors returns the same result as an AS-unit that contains only one (i.e. the AS-unit either is or is not error free). An additional measure of global accuracy is therefore needed to account for this shortcoming. Li et al. (2014) propose the mean number of errors per AS-unit. Used in combination with the percentage of error free AS-units, the mean number of errors per AS-unit describes how common grammatical errors are within AS-units. Li et al. (2014) demonstrated that planning enhanced grammatical accuracy by reducing the mean number of errors per AS-unit from .68 after no pre-task planning to .42 under their five-minute pre-task planning condition.
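The difference between the two global measures can be illustrated with a short Python sketch. The function names and the two invented speech samples are assumptions made for illustration; each sample is represented as a list of error counts, one per AS-unit.

```python
def percent_error_free_units(errors_per_unit):
    """Percentage of AS-units with zero coded errors."""
    if not errors_per_unit:
        raise ValueError("at least one AS-unit is required")
    error_free = sum(1 for n in errors_per_unit if n == 0)
    return 100 * error_free / len(errors_per_unit)


def mean_errors_per_unit(errors_per_unit):
    """Total number of errors divided by the number of AS-units."""
    if not errors_per_unit:
        raise ValueError("at least one AS-unit is required")
    return sum(errors_per_unit) / len(errors_per_unit)


# Two hypothetical samples: the same units are error free in both,
# but the erroneous units in sample_b contain more errors each.
sample_a = [0, 0, 1, 1]
sample_b = [0, 0, 3, 3]

print(percent_error_free_units(sample_a), mean_errors_per_unit(sample_a))  # 50.0 0.5
print(percent_error_free_units(sample_b), mean_errors_per_unit(sample_b))  # 50.0 1.5
```

Both samples return 50 per cent error free AS-units, yet the mean number of errors per AS-unit separates them, which is why the two measures are more informative when reported together.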
Measures of the accuracy of specific language forms focus on a particular area of the second language such as the article system. This level of analysis is specific and provides detailed information about the kind of inaccuracies that appear in a spoken sample. For example, to measure the accuracy of article usage, the number of ‘obligatory occasions’ (Ellis and Barkhuizen, 2005, p. 80) for articles is first counted, followed by the number of correctly supplied articles. The number of correctly supplied articles is then divided by the number of obligatory occasions, and the result is reported as the percentage correctly supplied. Research that uses specific measures includes Yuan and Ellis (2003), who report no difference in correct verb forms after pre-task planning. Wigglesworth (1997) reports that planning increased the accuracy of articles but had no effect on plural usage and verbal morphology. Neilson (2013) found that planning had no impact on subject-verb agreement. Crookes (1989) found no difference in the accuracy of article use and plural noun endings. In sum, the findings indicate that pre-task planning makes little difference to the results of measures of the accuracy of specific language forms.
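The obligatory occasion calculation described above can be sketched as follows. The function name and the counts in the example are hypothetical and are used only to show the arithmetic; they are not drawn from any of the studies cited.

```python
def obligatory_occasion_accuracy(correct, occasions):
    """Percentage of obligatory occasions on which the form was correctly supplied.

    correct:   number of correctly supplied forms (e.g. articles)
    occasions: number of obligatory occasions for that form
    """
    if occasions <= 0:
        raise ValueError("there must be at least one obligatory occasion")
    if not 0 <= correct <= occasions:
        raise ValueError("correct must lie between 0 and occasions")
    return 100 * correct / occasions


# Hypothetical sample: 20 obligatory occasions for articles, 15 supplied correctly
print(obligatory_occasion_accuracy(15, 20))  # 75.0
```

The sketch raises an error when there are no obligatory occasions, because the percentage is undefined when a speaker produces no contexts that require the form.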
One drawback of measuring the accuracy of specific language forms is that the outcome is heavily reliant on the individual test taker, who may overuse or avoid a given structure. For example, in Crookes’ (1989) study, the proportion of accurately supplied articles was heavily dependent upon the language proficiency of the participants. Crookes argues that the definite article is a language form that is generally acquired at later stages of the acquisition process by Japanese learners of English. Seeking to generalize about the level of accuracy with such a measure may therefore produce a questionable outcome. As such, relying on specific measures alone to assess accuracy may prove problematic and is not recommended. A combination of global and specific measures allows for a richer and fuller description of the grammatical accuracy of test performance.
Researchers have commented upon the limited effect that planning seems to have on the grammatical accuracy of speech (Ellis, 2005, 2009, Foster and Skehan, 1996, 1999, Skehan, 2009, Skehan and Foster, 1997, Yuan and Ellis, 2003). It is widely acknowledged that of the three areas of speech production (CAF), accuracy is least consistently increased by pre-task planning (Ellis, 2009). Yuan and Ellis (2003) suggest that accuracy is more related to opportunities to engage in online planning (i.e. planning during the task) than pre-task planning. In their research, the findings indicate that when a task involves time pressure, levels of accuracy diminish.