Chapter 3 Methodology

3.5 Procedure for data analysis

3.5.1 First level of analysis: Identifying the nature of feedback in the MOOC

3.5.1.1 Inter-rater reliability process

As previously mentioned, the inter-rater reliability process started during the first round of coding and included several steps.

The first step of the process was to search for and select two external evaluators with experience in qualitative research and online education who were familiar with the method of content analysis. The first evaluator was the supervisor of this study, a lecturer at Lancaster University (England) whose research interests focus on online higher education theories and practices. The second evaluator was a senior lecturer at the Autonomous University of Barcelona (Spain) and a consultant instructor at the Open University of Catalonia (Spain). Her research interests focus on teaching and learning strategies.

Based on the discussion with the first evaluator, the sample used for piloting the coding system was randomly selected and included 30 messages from the fourth learning phase. Considering the three assessment criteria (Relevance, Substance, and Clarity and coherence), 10 messages were selected for each criterion.
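The sampling described above (30 messages, 10 per assessment criterion, drawn at random from the fourth learning phase) can be illustrated with a minimal sketch. The thesis does not state which tool was used for the selection; the function name `stratified_sample`, the message fields `id` and `criterion`, the pool size and the seed are all hypothetical and serve only as an illustration.

```python
import random

# The three assessment criteria named in the text.
CRITERIA = ["Relevance", "Substance", "Clarity and coherence"]


def stratified_sample(messages, per_criterion=10, seed=0):
    """Draw `per_criterion` messages at random from each criterion.

    `messages` is a list of dicts with a "criterion" key (hypothetical
    structure). A fixed seed makes the draw reproducible.
    """
    rng = random.Random(seed)

    # Group the messages by the criterion they were posted under.
    by_criterion = {}
    for msg in messages:
        by_criterion.setdefault(msg["criterion"], []).append(msg)

    # Sample without replacement inside each group.
    sample = []
    for criterion in sorted(by_criterion):
        sample.extend(rng.sample(by_criterion[criterion], per_criterion))
    return sample


# Hypothetical pool: 50 placeholder messages per criterion.
pool = [
    {"id": f"{criterion}-{i}", "criterion": criterion}
    for criterion in CRITERIA
    for i in range(50)
]

pilot = stratified_sample(pool)
print(len(pilot))
```

Sampling within each criterion, rather than from the pooled messages, is what guarantees the equal split of 10 messages per criterion.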

In the second step, a guideline for the evaluators was created and the data were prepared for coding.

The guideline was created to brief evaluators on three topics: contextual information of the case, presentation of the coding system, and training on the use of the coding system. The contextual information included the name of the learning phase, its learning objective, the description of the suggested assignment, the keywords that were likely to be encountered in the messages when coding, and finally the type of output expected from the assignment. This information would help evaluators make sense of the messages they were going to codify.

The coding system was presented with descriptions and examples for each category. Finally, the last section of the guideline explained how the coding was expected to be carried out. All categories were covered by means of varied examples, each accompanied by an explanation justifying how it was coded.

In parallel, the data were prepared so that all evaluators would have the same starting conditions. Each of the 30 messages comprised a single thought unit (TU) or idea, so segmentation into TUs was not necessary. Both the guideline and the data were shared with the evaluators.

The third step of the process was carried out after each evaluator had followed the guideline and codified the suggested sample. Synchronous discussions were planned with each evaluator individually to compare the codification and understand the rationale behind it.

In the meeting with the first evaluator, the two sets of codes were first compared. The agreement rate was 8 TUs out of 30 (27%). It became apparent that the descriptions differentiating the categories within the content aspect from those in the cognitive function were not clear, and that most of the TUs in the sample fell into those unclear categories. It also became evident that a new category needed to be included. During the meeting, a flow diagram was created to better support evaluators during the coding process. The diagram was tested directly during the meeting with the same sample of 30 TUs, and a common understanding of the meaning of each category was reached for each of the TUs.

As a next step, it was agreed to randomly select a new sample of messages from the same learning phase, in order to pilot the coding system. A new set of data and the flow diagram were sent to both evaluators (refer to Appendix Four for the flow diagram).

The first meeting with the second evaluator focused on discussing the information in the flow diagram and how to use it. The coding results for the first dataset were compared, and the sample was used to test the flow diagram. Although the agreement rate after comparing our initial coding was 9 TUs out of 30 (30%), going through the sample with the flow diagram and discussing the categories was productive, and a shared understanding of the intention of each category was constructed.

The diversity in how elaborated the feedback was, and the effect of this diversity on coding, were also discussed. Similarly, building on the understanding that feedback has a formative function, agreement was reached regarding the characteristics of the TUs to be coded into any of the categories defined under the cognitive function.

The results of the discussions in the separate meetings were shared with both evaluators to ensure common ground when coding. A second meeting was planned for after all evaluators had coded the second sample.

A week after the first meeting, the second online meeting with each of the evaluators took place. The focus of the meeting was to compare the agreement on the coded data. The percentage agreement with both evaluators was higher than in the first round. With the first evaluator the agreement was 70% (21 TUs out of 30), with the second evaluator it was 73% (22 TUs out of 30), and between the two evaluators it was 77% (23 TUs out of 30). TUs on which consensus was not clear were discussed until agreement was reached. The percentage agreement reached after discussion was 80%. This value was considered high enough to proceed with the study.
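The agreement figures above are simple percentage agreement: the number of TUs that two coders assigned to the same category, divided by the number of TUs coded. A minimal sketch of that calculation follows. The code lists are placeholder data built only to reproduce the reported counts (21, 22 and 23 out of 30); the labels "A" and "B" are hypothetical and do not correspond to categories of the actual coding system, and the function name `percent_agreement` is likewise an assumption.

```python
from itertools import combinations


def percent_agreement(codes_a, codes_b):
    """Return (matching TUs, percentage agreement) for two coders."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same number of TUs")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches, 100.0 * matches / len(codes_a)


# Placeholder codes for 30 TUs (hypothetical, not the study data).
coders = {
    "researcher": ["A"] * 30,
    "evaluator_1": ["B"] * 9 + ["A"] * 21,
    "evaluator_2": ["B"] * 5 + ["A"] * 4 + ["B"] * 3 + ["A"] * 18,
}

# Compare every pair of coders.
results = {}
for name_a, name_b in combinations(coders, 2):
    results[(name_a, name_b)] = percent_agreement(coders[name_a], coders[name_b])

for (name_a, name_b), (matches, pct) in results.items():
    print(f"{name_a} vs {name_b}: {matches}/30 = {pct:.1f}%")
```

Keeping the raw match count alongside the percentage makes it easy to check a reported figure against the number of TUs behind it.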

3.5.2 Second level of analysis: Describing the evolution of feedback in terms of

In document The nature of peer feedback in a MOOC:a case study (Page 90-93)
