The growing number of data sets, and the opportunity to study them using computational techniques, have led to the development of analytics. The term analytics refers to the processes of studying such data sets and analysing them to measure, improve, and compare the performance of individuals, programmes, departments, or institutions (Norris, Baer, & Offerman, 2009). Analytics technology helps decision makers to find the best course of action by evaluating large data sets (Brown, 2011). “Analytics is the process of developing actionable insights” (Cooper, 2012, p. 3), and action analytics refers to “analytics capabilities and practices that are powerful, immediate, and lead to outcomes that are useful to a wide variety of stakeholders” (Norris et al., 2009, p. 1).
In business and science, the term analytics describes computational support for capturing digital data trails in order to provide rapid feedback, enable timely interventions, and inform decision-making processes. Learning analytics brings this concept into an educational context and considers how learning data should be analysed to improve learning and the environments in which it occurs, based on the assumption that big data and analytics can add value to education by shaping its future (Siemens & Long, 2011).
The Society for Learning Analytics Research defines learning analytics as “the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs”. Digital data trails produced by learners, such as log-in information, rates of participation in specific activities, and the amount of time students spend interacting with online resources, can be used to understand what happens during learning processes and can help educators to identify which improvements to make (Siemens & Long, 2011). Additionally, analysis of learner-produced data can provide detailed information about the potential problems experienced by students who might need additional support (Siemens & Long, 2011). Such analysis can help learners, by reflecting their own progress and learning habits back to them, and educators, by showing the impact of changing various elements of learning processes.
The type of data gathered varies by institution and by application, but in general it includes information about how frequently students access online materials or the results of assessments from student exercises and activities conducted online. Since this thesis focuses on the automatic identification of discourse elements in students’ writing, learning analytics based on discourse elements is one of its main themes and is explained in the next section.
1.5.1 Discourse-centric learning analytics
Most learning analytics applications provide quantitative information about learners, e.g. how many times they have logged in to learning platforms, viewed a forum post, or replied to it. However, learning analytics can move beyond reporting these simple quantitative logs and provide information on the quality of the contributions students make (Buckingham Shum, Knight, & Littleton, 2012). One area of interest for learning analytics is its potential for the analysis of discourse data (Buckingham Shum & Ferguson, 2012).
Researchers are beginning to draw on extensive prior work on how tutors mark essays and discussion posts, how spoken and written dialogue shape learning and how computers can recognize good argumentation, in order to design analytics that can assess the quality of text, with the ultimate goal of scaffolding the higher order thinking and writing that we seek to instil in students (Buckingham Shum et al., 2012, p. 6).
Discourse-centric learning analytics is a term first defined by De Liddo, Buckingham Shum, Quinto, Bachler, and Cannavacciuolo (2011) at the first Learning Analytics and Knowledge (LAK) conference. De Liddo et al. (2011, p. 6) argued for a form of learning analytics that focuses on “learners’ discourse as a promising site to identify patterns of meaningful learning”. Their work identifies the rhetorical attitude of learners towards discourse contributions, such as the arguments learners supported or rejected, the evidence they used for those arguments, and the questions that emerged.
Following this, the first discourse-centric learning analytics (DCLA) workshop (Buckingham Shum et al., 2013), held at the third LAK conference, proposed a mission statement for DCLA: “to devise and validate analytics that look beyond surface measures in order to quantify linguistic proxies for deeper learning” (Ferguson, De Liddo, Whitelock, De Laat, & Buckingham Shum, 2014, p. 1). In 2014, the second DCLA workshop was held as part of the fourth LAK conference, with a focus on the intersection of learning analytics research, theory and practice: “once researchers have developed and validated discourse-centric analytics, how can these be successfully deployed at scale to support learning?” (Ferguson et al., 2014, p. 1).
Learning analytics that focus on the use of discourse to support learning and teaching are being developed at the intersection of fields such as automated assessment, learning dynamics, deliberation platforms, and computational linguistics. Ferguson et al. (2014) asked what places such developments in the category of learning analytics rather than in any of these other fields. Their answer is the use, or potential, of these developments to generate actionable intelligence specifically in the context of learning, such as helping educators to understand significant discourse patterns.
A definition of this addition to learning analytics was provided by Knight and Littleton (2015, p. 17): “DCLA focuses on analytics to support high quality discourse for learning contexts; it consists of analysis of discourse data, creation of effective feedback to learners and educators, and the validation and theorising of our analytic techniques”.
The ‘D’ in DCLA stands for discourse, which comes not only from student writing but also from social interactions, online discussions, forum posts, and exploratory dialogue. As the DCLA workshops had already produced a couple of papers on extended student writing, a new workshop focusing specifically on discourse in student writing, called ‘Critical perspectives on writing analytics’, was held at the sixth LAK conference (Buckingham Shum, Knight, et al., 2016). “Broadly defined, writing analytics involves the measurement and analysis of written texts for the purpose of understanding writing processes and products, in their educational contexts” (Buckingham Shum, Knight, et al., 2016). This workshop therefore focused on analytics that can help to gain a better understanding of both the writing process and the final product, and of the pedagogical context in which writing analytics should take place, i.e. how to embed writing analytics meaningfully within that context.
DCLA, as a sub-area of learning analytics, takes an interest not only in computational-analytic techniques for discourse but also in the explicit learning implications of those techniques (Knight & Littleton, 2015); this is why this thesis is part of the field of discourse-centric learning analytics. The next section sets out the thesis structure for the remaining chapters.