
2.4 Educational Data Mining (EDM) and Learning Analytics (LA)

The creation of an OLM requires distilling, from a huge amount of raw data, information and knowledge about one or more characteristics of the learner, such as preferences, interaction habits, knowledge, skills, and experiences. This task could be supported by data-intensive techniques, such as those developed in the Data Mining (DM) field.

DM techniques come from fields such as economics or marketing, where they are used to find the most common patterns or co-occurrences in the buying habits of consumers. This approach generates rules of the form precondition => postcondition. It is not really devoted to investigating the causal relationships that generate a rule, but is more interested in the rule's support (what fraction of the sampled buyers follow that specific pattern) and confidence (what percentage of the buyers that have the precondition also have the postcondition) (Agrawal, Imieliński, and Swami 1993).
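The two measures described above can be computed directly from a list of transactions. In the standard association-rule terminology of Agrawal, Imieliński, and Swami (1993), the share of sampled buyers that follow the whole pattern is called the support of the rule, and the share of buyers with the precondition who also show the postcondition is called its confidence. The following is a minimal sketch only: the function names and the toy shopping baskets are hypothetical and are not taken from any of the cited works.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(transactions: List[Transaction], items: Transaction) -> float:
    """Fraction of all transactions that contain every item in `items`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[Transaction],
    precondition: Transaction,
    postcondition: Transaction,
) -> float:
    """Among transactions containing the precondition, the fraction that
    also contain the postcondition."""
    n_pre = sum(1 for t in transactions if precondition <= t)
    if n_pre == 0:
        return 0.0
    both = precondition | postcondition
    n_both = sum(1 for t in transactions if both <= t)
    return n_both / n_pre


# Hypothetical toy data: four shopping baskets.
baskets = [
    frozenset({"bread", "butter"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread"}),
    frozenset({"milk"}),
]

# Rule under test: bread => butter
rule_support = support(baskets, frozenset({"bread", "butter"}))
rule_confidence = confidence(baskets, frozenset({"bread"}), frozenset({"butter"}))
```

In this toy data two of the four baskets contain both bread and butter, and two of the three baskets containing bread also contain butter. Neither number says anything about why the items co-occur, which is the point made above about causal relationships.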

When specifically applied to the field of education, this approach takes the name of Educational Data Mining (Romero 2011). It has been defined as a separate field because of the specific kinds of rule involved and the particular attention paid to the learning domain, which has peculiarities of its own. This focus is especially relevant for achieving a better understanding of learners, but also for exploring and offering an in-depth interpretation of the learning context.



The main objective of EDM is to develop new tools for discovering relevant rules or patterns in the raw data. When attention shifts to large-scale applications of these techniques, some researchers speak of Learning Analytics, indicating that it is more geared towards the broad applicability of the rules and findings obtained in the field (Bienkowski, Feng, and Means 2012). This matters because it allows the results of a single experiment, or of a small number of experiments, that confirm a hypothesis to be extended to courses or institutions other than the ones directly analysed.

Nevertheless, other researchers draw the distinction between EDM and LA on the basis of the methods used for analysis. In their view, LA is more general, as it also takes into account qualitative methods and human judgment – such as sentiment, influence and discourse analysis, sense-making models, and Social Network Analysis – whereas EDM appears to be interested only in relationships within the quantitative data about the educational experience.

By combining EDM and LA, it is possible to support the tasks of defining learner profiles, tracking behaviors and finding relevant dimensions for classifying and interpreting online user activities in TEL experiences. The main aim of research in this field is to build predictive models of student performance, with a view to recommending improvements to current educational practices. Two tasks currently classified within EDM are particularly relevant to the present research: “Causal data mining” and “Distillation of data for human judgment” (Baker 2010). The former focuses on eliciting the causal relationships that generate some of the rules found; the latter focuses on distilling higher-level knowledge from a huge amount of information, in order to better support human judgment and decision making.

Based on a recent report prepared by Bienkowski, Feng, and Means (2012) for the U.S. Department of Education, Office of Educational Technology, entitled “Enhancing Teaching and Learning Through Educational Data Mining and Learning Analytics”, it is possible to identify some research directions in which these approaches could help TEL to offer better experiences and deliver educational results more in line with the requirements of modern education. The information and knowledge extracted could be re-framed for different time scales and directed at distinct roles, as indicated in Table 2.1.

Role                  Time frame             Scopes
Learner               immediate/real-time    – selection of the next problem
                                             – feedback on subject completed
                                             – strong and weak/deficient personal knowledge
                      weekly                 – improvements in the last week
                                             – strong and weak personal knowledge areas
                      semester               – improvements in the last semester
                                             – courses passed and not passed
                                             – suggestion for the next semester plan
Tutor                 some hours             – monitoring of learner activities
                                             – near-immediate scaffolding intervention
                                             – providing feedback on the current activities
Teacher               daily                  – next day’s teaching adaptation
                      weekly                 – didactic plan advancement
Teacher coordinator   monthly/semester       – judging educational progress
                                             – realigning the didactic load
                                             – identifying possible difficulties in learners
School Administrator  yearly                 – overall school improvements
                                             – adaptations for the next school year
                                             – identification of best and problematic cases

Table 2.1: Possible applications of EDM and LA, distinguished by objective, optimal time frame and the role concerned. Adapted from Bienkowski, Feng, and Means (2012), ‘Enhancing Teaching and Learning through Educational Data Mining and Learning Analytics’.

An aspect that is normally neglected or underestimated is the IT cost associated with the application of these techniques, which is both economic and organisational. The application of EDM requires a vast amount of data to be stored for the relevant time frame in a quick and reliable fashion, so as to guarantee prompt access. It must also respond to the need to offer continuous education for the heterogeneous, yet very specialised, professionals that it will support, and to improve and validate the algorithms underpinning the procedures. Nor should the security and privacy constraints be neglected, together with the ethical obligations related to the treatment of student data.

For an effective and fully meaningful use of EDM and LA, the authors of the report suggest some basic directions to follow:

• Cultural change => Using data to make instructional decisions is a process that requires time and effort, and Teachers and Instructional Designers have to be supported in understanding it and making the best of it

• Consider IT => Involve the IT departments in the design phase of the educational experiences as well. Use their suggestions to structure the experiments in the best way possible, also with regard to the collection and further reuse of the data of interest

• Information Usage => Support all the users of the information provided (in visual or graphical form, if possible) so that they become smart data consumers, i.e. able to explore the information and to obtain the most useful knowledge

• Pilot => Start with pilot areas where the support of these tools seems most promising and concentrate the effort on those. Afterwards, progressively extend the successful cases to cover broader areas

• Communicate => Involve students (and even parents, where appropriate, as in compulsory education such as K-12), reporting to them where, when and how the data is captured and how it will be used

• Conform => Try to conform to already existing standards and reuse well-known approaches when feasible, always respecting the technical limitations and the policies regulating the treatment of the data in the specific institution, environment or state

The authors of this report (Bienkowski, Feng, and Means) also indicate some research directions whose results could ensure the real future adoption and positive impact of EDM and LA in the education sector, as one of the pillars of its innovation:

• Usability => Improve the usability of the design and interface of the tools that provide information to the users

• Effectiveness => Monitor the effectiveness of the tool, with both internal drivers (i.e. didactic results) and external drivers (i.e. satisfaction and willingness to use)

• DSS => Use the extracted information to develop tools able to support the human judgment and decision process (Decision Support Systems, DSS)

• Extend => Understand how predictive models that have already been developed could be extended to domains/contexts other than the current one

The development of GVIS – and of the semantic data layers (i.e. the formalized description of sources, operations and encoding steps to be used, which will be explained in detail in the next chapter) – tried to respect these indications and quality measures as much as possible. Despite these analogies, EDM (and LA) approaches are normally based on computationally intensive and fully automated data analysis, whereas GVIS is mainly based on the application of semantic layers of data that the Instructional Designers provide. The main difference is therefore the presence of layers that give an interpretation of the data retrieved, as well as of the aggregation operations (semantic approach), upon which the process can rely to extract interesting information.

This rather different approach guarantees a meaningful didactic interpretation of the data without the need for a further extensive and time-consuming validation phase, as would be the case with a pure EDM approach. Obviously, defining the semantic layers for data extraction, identification, validation, fusion, distillation and representation is quite a challenging task for Instructional Designers: it requires a good level of experience, a clear pre-identification of objectives and constraints, and a critical thinking approach.

Nevertheless, thanks to these challenging tasks, the results automatically comply, for the most part, with the must-have objectives of a supportive educational tool, i.e. ‘Analysis and visualization of data’, ‘Providing feedback for supporting instructors’ and ‘Detecting undesirable student behaviors’, as defined by Romero and Ventura (2010).