Chapter 6: Results – Analysis Of Writing Scripts

6.8 Reader-Writer Interaction

Markers of reader-writer interaction were grouped into four categories: markers of writer identity, hedges, boosters and attempted passive voice.

Over half of the scripts made no use of markers of writer identity. Even so, the mean was just under 2.5 markers per script, which shows that some writers used a large number of these markers.

Figure 42: Distribution of features of writer identity over overall sample and DELNA sub-levels

The box plots in Figure 42 and the descriptive statistics in Table 53 show the distribution over the different DELNA levels. This variable clearly did not differentiate between the proficiency levels.

Table 53: Descriptive statistics - Writer identity

DELNA level   Mean   SD     Minimum   Maximum
4             2.83   1.95   0         6
5             2.32   2.72   0         19
6             2.64   3.71   0         29
7             2.49   3.01   0         17
8             2.48   3.25   0         7

The analysis of variance showed no statistically significant difference between the different levels of writing, F (4, 577) = 1.07, p = .368.
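The F statistic reported for a one-way analysis of variance can be sketched as follows. This is a minimal illustration rather than the procedure actually run for the study: the function name `one_way_anova` and the toy scores are hypothetical and are not the DELNA data. With five DELNA levels, the between-groups degrees of freedom are 5 - 1 = 4, which matches the first value in the parentheses above.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of scores per level.
    """
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k = len(groups)            # number of groups (levels)
    n_total = len(all_scores)  # total number of observations

    # Between-groups sum of squares: group size times the squared
    # deviation of the group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: squared deviations of each score
    # from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical marker counts for three small groups (NOT the DELNA data).
toy_groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_b, df_w = one_way_anova(toy_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")  # F(2, 6) = 3.00
```

The p value is then read from the F distribution with these two degrees of freedom; that step is omitted here to keep the sketch within the standard library.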

The second variable under investigation was the number of hedging devices. On average, writers used just under six of these structures per script.

Figure 43: Distribution of hedging devices over overall sample and DELNA sublevels

When broken up into the different DELNA band levels, the use of hedging devices can be seen to have quite clearly distinguished between different levels of writing. This is revealed in the box plot (Figure 43 above) as well as in the table summarising the descriptive statistics (Table 54 below). The table shows that whilst writers at lower levels used on average about five hedging devices in their writing, higher level writers used more than eight of these devices.

Table 54: Descriptive statistics – Hedges

DELNA level   Mean   SD     Minimum   Maximum
4             5.00   3.10   1         12
5             4.70   2.85   0         15
6             5.84   3.88   0         20
7             6.38   3.68   0         17
8             8.42   2.91   4         14

The analysis of variance revealed a statistically significant difference between the groups, F (4, 596) = 7.39, p = .000. The Games-Howell procedure showed that levels 5 and 6 as well as levels 7 and 8 were statistically distinct from each other.

The hedging variable was also investigated when script length was controlled.

This showed an even stronger difference between the different proficiency levels.
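The chapter does not say how script length was controlled. One common option, assumed here purely for illustration, is to convert each raw count into a rate per 100 words before comparing levels; another would be to enter script length as a covariate. The function name `rate_per_100_words` and the example scripts are hypothetical.

```python
def rate_per_100_words(count, n_words):
    """Express a raw feature count as a rate per 100 words."""
    if n_words <= 0:
        raise ValueError("script length must be positive")
    return 100.0 * count / n_words


# Hypothetical scripts as (hedge count, word count). NOT the DELNA data.
short_script = (5, 200)
long_script = (6, 400)

short_rate = rate_per_100_words(*short_script)
long_rate = rate_per_100_words(*long_script)
print(short_rate, long_rate)  # 2.5 1.5
```

In this toy case the longer script contains more hedges in absolute terms but fewer per 100 words, which is the kind of shift that a length-controlled analysis can reveal.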

The final variable investigated in this category was boosters. The distribution per band level (box plots in Figure 44) and the table indicating descriptive statistics (Table 55) show that this category failed to distinguish between different levels of writing because writers of all levels used on average about 2.5 boosters in their writing.

Figure 44: Distribution of boosters over overall sample and DELNA sublevels

An analysis of variance revealed no statistically significant differences between the different levels of writing, F (4, 596) = .157, p = .960.

Table 55: Descriptive statistics - Boosters

DELNA level   Mean   SD     Minimum   Maximum
4             2.5    1.45   0         9
5             2.5    2.14   0         16
6             2.4    2.05   0         12
7             2.44   2.06   0         11
8             2.46   1.88   0         14

Finally, the use of the attempted passive voice was investigated.

Inter-rater reliability was established by a Pearson correlation between the coding of two raters on a sample of fifty scripts. The correlation coefficient shows a strong relationship, r = .898, n = 50, p = .000.
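A Pearson correlation between two raters' codings can be computed as in the sketch below. The function name `pearson_r` and the two lists of counts are hypothetical and stand in for the fifty double-coded scripts, which are not reproduced in the chapter.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical passive counts assigned by two raters to five scripts.
rater_a = [0, 1, 2, 3, 4]
rater_b = [0, 1, 3, 3, 5]
r = pearson_r(rater_a, rater_b)
print(f"r = {r:.3f}")
```

Note that Pearson's r measures how closely the two sets of codings rise and fall together; a rater who consistently counts one more item than the other would still produce r = 1.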

Figure 45: Distribution of passives over overall sample and DELNA sublevels

The box plots (Figure 45) and the descriptive statistics (Table 56) show that higher level writers used the passive more frequently, whilst hardly any writers at level 4 attempted this structure; however, the average differences between the levels of writing proficiency were very small.

Table 56: Descriptive statistics – Passives

DELNA level   Mean   SD     Minimum   Maximum
4             .33    .89    0         3
5             .80    1.09   0         5
6             1.03   1.25   0         6
7             1.05   1.25   0         8
8             1.38   1.70   0         5

An analysis of variance revealed no statistically significant differences between the different levels of writing, F (4, 596) = 2.37, p = .052.

Finally, it was of interest whether there was a relationship between the use of markers of writer identity and the passive voice. It is conceivable, for instance, that writers who use markers of writer identity (by projecting their own voice into the text) use fewer passives. A correlation analysis was conducted which showed a positive relationship between these two variables, r = .304, n = 583, p = .000.

This means that writers who used more passives also tended to use more markers of writer identity.

6.9 Content

The final category investigated was content. Content was divided into three sections, closely following the three sections of the prompts: data description, data interpretation and Content Part three.

Part one, the description of data, was calculated as the percentage of information described.

Inter-rater reliability was established by having a second rater double code a subset of fifty scripts. The relationship was significant, r = .821, n = 50, p = .000.

The mean for all scripts was 0.59, indicating that, on average, the writers included just under 60% of the possible data. Whilst some writers did not attempt this section of the task and therefore scored 0%, about 70 writers described all the pieces of information deemed important by the expert writers and therefore scored 100%.

The largest number of writers (more than 150) described 50% of the data.
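The scoring of data description as a proportion can be illustrated with a short sketch. The item labels, the number of key items and the function name `coverage` are assumptions made for the example; the chapter does not list the pieces of information identified by the expert writers.

```python
def coverage(described, key_items):
    """Proportion of the key information items that a script describes."""
    key = set(key_items)
    if not key:
        raise ValueError("at least one key item is required")
    # Only items on the key list count; anything else is ignored.
    return len(key & set(described)) / len(key)


# Hypothetical key list of eight items and one hypothetical script.
key_items = {"item1", "item2", "item3", "item4",
             "item5", "item6", "item7", "item8"}
script_items = {"item1", "item3", "item4", "item8", "off_topic_detail"}

score = coverage(script_items, key_items)
print(score)  # 0.5
```

A script that mentions none of the key items scores 0 and one that mentions all of them scores 1, which corresponds to the 0% and 100% cases described above.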

Figure 46: Distribution of proportion of data description over overall sample and DELNA sublevels

The box plots in Figure 46 and Table 57 indicate that this variable, based on the mean scores, splits the data set into clearly separate levels.

Table 57: Descriptive statistics - Data description

DELNA level   Mean   SD     Minimum   Maximum
4             .19    .07    .13       .33
5             .47    .25    .00       1.00
6             .58    .23    .00       1.00
7             .61    .21    .00       1.00
8             .81    .20    .13       1.00

A Welch test was performed to investigate differences between the groups. The analysis revealed statistically significant differences between the groups, F (4, 55.58), p = .032. However, no adjacent pairs were found to be statistically distinct by the Games-Howell procedure.
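The Welch test is used in place of the standard analysis of variance when group variances cannot be assumed equal. It weights each group by its size divided by its variance and adjusts the second degrees of freedom, which is why that value is usually not a whole number. The sketch below follows the standard Welch formula; the function name `welch_anova` and the toy scores are hypothetical and are not the DELNA data.

```python
from statistics import mean, variance


def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way test of equal means."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Weight each group by n / s^2 (sample variance, n - 1 denominator).
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    w_total = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / w_total

    # Weighted between-groups variability.
    numerator = sum(
        w * (m - weighted_mean) ** 2 for w, m in zip(weights, means)
    ) / (k - 1)
    # Correction term that also drives the adjusted degrees of freedom.
    lam = sum(
        (1 - w / w_total) ** 2 / (n - 1) for w, n in zip(weights, sizes)
    )
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam

    f_stat = numerator / denominator
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, df1, df2


# Hypothetical scores for three small groups (NOT the DELNA data).
toy_groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
welch_f, df1, df2 = welch_anova(toy_groups)
print(f"F({df1}, {df2:.2f}) = {welch_f:.2f}")  # F(2, 4.00) = 2.57
```

The toy groups have equal sizes and variances, so the second degrees of freedom happen to come out as a whole number; with unequal variances or group sizes they are typically fractional.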

The second part of each writing task, the interpretation of data, was scored in terms of the number of reasons given for the facts described in the data.

An inter-rater reliability test established a strong correlation between the two coders, r = .811, n = 50, p = .000.

The data distribution over the five DELNA proficiency levels was investigated using side-by-side box plots (Figure 47). The means (as seen in Table 58 below) ranged from 1.6 to 4.5 reasons, showing a clear differentiation according to level.

A Welch test revealed statistically significant differences between the groups involved, F (4, 57.26) = 5.78, p = .001. The Games-Howell procedure showed that no adjacent band levels were statistically distinct from each other.

Figure 47: Distribution of interpretation of data over overall sample and DELNA sublevels

Table 58: Descriptive statistics - Interpretation of data

DELNA level   Mean   SD     Minimum   Maximum
4             1.64   .50    1.00      2.00
5             2.67   1.39   0.00      6.00
6             3.23   1.51   0.00      8.00
7             3.54   1.38   0.00      8.00
8             4.52   1.23   2.00      7.00

The final part of each prompt, content Part 3, required the writer to either describe how the current situation will or can be changed in the future or describe a similar situation in their own country. This part was again scored by giving a point for each proposition.

An inter-rater reliability check was undertaken to ensure reliability in the coding of the variable. The resulting coefficient indicates strong reliability, r = .807, n = 50, p = .000.

When the number of propositions in part three of the prompt was plotted against the overall DELNA score (Figure 48), it became clear that the data was separated well by this variable. Descriptive statistics can be seen in Table 59 below.

Figure 48: Distribution of Content Part 3 over overall sample and DELNA sublevels

The analysis of variance revealed statistically significant differences between the groups involved in the analysis, F (4, 576) = 9.61, p = .000. The Games-Howell procedure showed that the only adjoining band levels statistically distinct from one another were levels 5 and 6.

Table 59: Descriptive statistics - Content Part 3

DELNA level   Mean   SD     Minimum   Maximum
4             .73    .90    0.00      2.00
5             1.68   1.31   0.00      6.00
6             2.28   1.54   0.00      7.00
7             2.62   1.55   0.00      9.00
8             4.00   1.22   2.00      6.00

6.10 Conclusion

The results in this chapter are based on the analysis of 601 writing scripts produced as part of the 2004 administration of DELNA. Each measure was plotted against the different proficiency levels, providing a clear visual overview of the distribution. Inferential statistics were presented for each structure. The analysis revealed that the variables in Table 60 below successfully differentiated between the different levels.

Table 60: Variables successful in differentiating between levels

Construct                   Measure
Accuracy                    Percentage error-free t-units
Fluency                     Number of self-corrections
Complexity                  Average word length
                            Sophisticated lexical words / total lexical words
                            Number of AWL words
Mechanics                   Paragraphing
Coherence                   Parallel progression
                            Direct sequential progression
                            Superstructure
                            Indirect progression
                            Unrelated progression
                            Coherence break
Cohesion                    Anaphoric pronominals – these, this
                            Linking devices – qualitative analysis
Reader/writer interaction   Number of hedges
Content                     Percentage data supplied
                            Number of propositions (Part 2 and 3)

The next chapter discusses the findings presented above. Here, relevant previous research is related to the current data. Based on the findings in this chapter, the new rating scale is developed.

--- Notes:

1 The n-size in each histogram differs, because of missing values resulting from the analysis.

2 No writing scripts scored at level 9 were included in this analysis, because overall as part of the more than two thousand scripts there were only three writing samples that received a score of 9 by both raters. These three scripts were excluded from any further analysis. Scripts that were scored at level 9 by only one rater were rounded down as part of the calculation of the average of two raters.

3 Percentages are represented as proportions of 1 in the data below.

4 A further analysis showed that if the variable is controlled for the number of words per script, the variable is even more discriminating between levels. For reasons of space it was not reproduced in this chapter.

5 See Note 4

6 See Note 4

Source: Ute Knoch, Diagnostic Writing Assessment: The Development and Validation of a Rating Scale. Peter Lang, LTE 17, pp. 162-171.