
This chapter addressed the first subsidiary research question of this thesis: “To what extent can the automated rhetorical parser XIP be used to identify indicators of good academic writing in undergraduate student essays from different disciplines, as judged by the essay grade?”

XIP was designed to work on peer-reviewed academic research writing, by a team with no training in education and no intention that it be used in education. However, it connects with education to the degree that the hallmarks of research articles overlap with the kinds of writing that academics seek to nurture in undergraduate students and reward through grading criteria, which is what this study has dealt with. There was therefore a need to understand whether XIP can be used to identify indicators of good undergraduate student writing. The quantitative data analysis chapter described evaluation studies carried out to test XIP’s performance on undergraduate student essays from various disciplines and levels, using the mark awarded as a measure of the quality of the writing. The studies presented in this chapter sought to assess the quality of XIP through correlational studies and regression analysis.
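As an illustration of the kind of correlational check described above, the following minimal sketch relates per-essay counts of one XIP category to the essay marks. It assumes a Pearson coefficient, which is an assumption made here rather than a statement of the coefficient used in the studies. The names `pearson_r`, `correlate_category` and `toy_essays` are hypothetical, and the data are invented for illustration; they are not drawn from the S000, E000, L000 or BAWE datasets.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented toy data (NOT from the thesis datasets): each essay has a mark
# and a count of sentences that a parser labelled with each category.
toy_essays = [
    {"mark": 48, "counts": {"CONTRAST": 0, "SUMMARY": 1}},
    {"mark": 55, "counts": {"CONTRAST": 2, "SUMMARY": 0}},
    {"mark": 62, "counts": {"CONTRAST": 1, "SUMMARY": 2}},
    {"mark": 70, "counts": {"CONTRAST": 4, "SUMMARY": 3}},
]


def correlate_category(essays, category):
    """Correlate the per-essay count of one category with the essay mark."""
    counts = [essay["counts"].get(category, 0) for essay in essays]
    marks = [essay["mark"] for essay in essays]
    return pearson_r(counts, marks)


if __name__ == "__main__":
    for category in ("CONTRAST", "SUMMARY"):
        print(category, round(correlate_category(toy_essays, category), 3))
```

A coefficient computed this way only indicates association within one dataset; as the conclusions below note, any such association would need to be examined separately for each discipline and level.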

The research question can be answered as follows: To a significant extent, depending on the discipline, level and tutors’ expectations, the automated rhetorical parser XIP can be used to identify indicators of good academic writing in undergraduate student essays, where these indicators are tested by the essay mark. The following conclusions can be drawn from the studies with different datasets (S000, E000, and L000) and on the BAWE corpus:

• From a learning analytics point of view, it has been found that some of the XIP categories were good predictors of final marks. However, these categories were discipline and level specific.

• Not all of XIP’s existing categories were found to have a significant impact on the essay mark. The categories TENDENCY, SURPRISE, NOVELTY and OPEN QUESTION, which are found in the journal writing of experienced researchers, did not appear necessary for undergraduate students to get better grades.

• The categories BACKGROUND, EMPHASIS, CONTRAST and SUMMARY, on the other hand, were associated with higher marks.

• XIP was less likely to work well with student writing from hard knowledge fields, whereas it performed well with student writing from soft disciplines such as Arts and Humanities.

• XIP did not work for Level 1 student writing, but it was more likely to work at higher levels, Level 2 and Level 3.

• Where tutors’ marking guidelines were available to inform the selection of datasets, this served as a better validation of XIP, since it was known that students were being required to produce argumentative writing. When the marking rubric aligned with XIP categories, it was more likely that the presence of some categories correlated with grade. Therefore, it can be argued that XIP was able to detect features of a good advanced student essay automatically in the discipline of the Arts and Humanities.

These promising outcomes suggest that XIP could be used for training undergraduate students and making them aware of these types of categories in order to improve their writing skills as well as to get better grades. However, some outliers that occurred during the studies have to be acknowledged. Specifically, in the E000 dataset, whereas in the great majority of the essays the grade was correlated with the number of salient sentences, some high grades were given to essays with very few salient sentences, and conversely, low grades were given to essays with a relatively greater number of salient sentences.

High-graded essays with few salient sentences have a strikingly vivid and literary style, which does not strictly follow the patterns of concise scholarly communication that are used in XIP’s algorithm. These essays convey a personal approach, show deep knowledge, and use unconventional expressions, which is why the salient sentences were not picked up. Alternative explanations required by the marking grid are provided; however, they are embedded into a particular narrative flow, in which the expression of contrast is distributed throughout several sentences. Instead of referring to the alternative arguments through expressions such as ‘contrasting analyses’ or ‘critical debates’, the author of this example essay lays them out in several sentences.

Low-graded essays containing a relatively high number of salient sentences, in contrast, have a simple and schematic style, and sometimes their syntactic structure is not clear. The fact, however, that the number of salient sentences shows a correlation with the marks indicates that the more scholarly meta-discourse is present in a student essay, the more likely it is to receive a better mark in the evaluation. Nevertheless, these outliers signal that XIP requires some alterations, which need to be explored in the studies presented in the following chapters. Based on this chapter, for instance, sentence-based analysis could be extended to paragraph level, so that when an author spreads an expression across several sentences, it could be captured.
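One possible reading of the paragraph-level suggestion can be sketched as follows. The sketch is purely illustrative and does not reproduce XIP’s algorithm: it assumes, hypothetically, that a category is detected when all of its required cue tags are present, and it compares checking the cues within a single sentence against pooling them over a whole paragraph. The rule table `RULES`, the cue tags and both function names are invented for this example.

```python
# Hypothetical rule table: a category fires when ALL of its cue tags are
# present. These tags are invented for illustration and are not XIP's rules.
RULES = {
    "CONTRAST": {"opposition", "reference_to_prior_view"},
}


def sentence_level_hits(paragraph, rules=RULES):
    """Categories whose cues all occur together within a single sentence.

    `paragraph` is a list of sets, one set of cue tags per sentence.
    """
    hits = set()
    for cues in paragraph:
        for category, required in rules.items():
            if required <= cues:  # every required cue is in this sentence
                hits.add(category)
    return hits


def paragraph_level_hits(paragraph, rules=RULES):
    """Categories whose cues all occur somewhere within the paragraph."""
    pooled = set().union(*paragraph)  # pool the cue tags of all sentences
    return {cat for cat, required in rules.items() if required <= pooled}


if __name__ == "__main__":
    # The two cues of the hypothetical CONTRAST rule fall in different
    # sentences, as in the literary-style essays described above.
    paragraph = [{"reference_to_prior_view"}, set(), {"opposition"}]
    print(sentence_level_hits(paragraph))
    print(paragraph_level_hits(paragraph))
```

Pooling cues in this way would be expected to raise recall at the cost of precision, since unrelated cues in the same paragraph could combine by accident; whether that trade-off is acceptable would itself need to be evaluated.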

The quantitative data analysis studies advance the understanding of the relevance of XIP’s rhetorical parsing for undergraduate writing. There are better answers to the research question: “To what extent can the automated rhetorical parser XIP be used to identify indicators of good academic writing in undergraduate student essays from different disciplines, as judged by the essay grade?” On the other hand, it cannot be said that these

play. Therefore, while for many educators the statistical correlation with grade is an important question to answer, before such a parser can be considered as a practical tool, it requires validation by tutors themselves. The next chapter describes the qualitative data gathered by consulting tutors to gain a better understanding of their views on what makes good student writing.

6 ONE-TO-ONE INTERVIEWS WITH MARKERS

6.1 Introduction

This chapter addresses the second subsidiary research question: “How do educators define the attributes of good student writing, and to what degree can the automated rhetorical parser, XIP, identify the presence of these attributes?” Answering this question required an investigation into how educators define the quality of student writing, what they give credit for when marking a student essay, and to what extent the XIP analysis can capture these elements.

The XIP analysis of student writing, explained in Chapter 5, suggested that promising results could be obtained from relating the categories used in the XIP analysis to the essay marks for student texts from various disciplines requiring argumentative critical writing, with the exception of hard disciplines, despite the fact that the XIP tool had not been developed for this particular purpose and context. Since it is important to know whether this XIP analysis is in line with what educators expect to see in good student writing, it is essential to understand in depth what educators value in writing, and how closely the XIP analysis matches their judgement of quality. The next section of this chapter reports the design of the study and how the data were collected, followed by a description of the participants. An account of how the data were transcribed and analysed is then given, and, finally, the findings are reported.