3. The impact of peer solution quality on peer feedback provision on geometry

3.2.4.3. Basic geometric knowledge

Students’ basic geometric knowledge was measured with a test used in previous research (e.g., Ufer et al., 2008). The test consists of 49 true/false items, each scored dichotomously (0 = incorrect, 1 = correct; Cronbach’s α = .79). The items cover several areas of basic geometric knowledge, including properties of triangles, properties of parallelograms, transversals, and quadrangles. The sum score across the 49 items served as the measure of participants’ basic geometric knowledge and was used as a covariate in the analyses.
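The scoring described above can be sketched in a few lines of Python. This is a minimal illustration using the standard formula for Cronbach’s α, not the authors’ analysis script; the function names and the toy data (5 participants, 4 items instead of 49) are hypothetical.

```python
from statistics import pvariance


def sum_scores(responses):
    """Total score per participant: the number of items scored 1."""
    return [sum(row) for row in responses]


def cronbach_alpha(responses):
    """Cronbach's alpha for a participants-by-items matrix of 0/1 scores."""
    k = len(responses[0])  # number of items
    # Variance of each item across participants.
    item_vars = [pvariance([row[i] for row in responses]) for i in range(k)]
    # Variance of the participants' total scores.
    total_var = pvariance(sum_scores(responses))
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: rows are participants, columns are items.
toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

scores = sum_scores(toy)      # [4, 3, 2, 1, 0]
alpha = cronbach_alpha(toy)   # approximately 0.80
```

With the full data, each row would hold 49 item scores and the sum score would range from 0 to 49.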

3.2.4.4. Peer feedback content: type and accuracy

PF content was analyzed with regard to (a) the type and (b) the accuracy of the provided PF.

The PF type was operationalized along two dimensions, the purpose and the style of PF, based on a coding scheme for analyzing PF messages developed by Strijbos et al. (2012). The coding scheme, which was created for bachelor’s thesis proposals, was adapted for use with PF on peer solutions to geometry proofs. According to the coding scheme, the purpose of PF can have (a) a cognitive focus on content knowledge, including knowledge about the overall proof or parts of the proof, (b) a metacognitive focus on general learning strategies related to the learning task as well as on monitoring and evaluating the learning process, (c) an affective focus on motivating and encouraging the (fictional) peer for future performance, or (d) a self-efficacy focus on the providers’ confidence in their ability to provide PF or to solve the proof. Cognitive and metacognitive PF can have two styles: verification or elaboration. Verification refers to statements that confirm or disconfirm the correctness of the entire peer solution, of parts of the peer solution, or of the learning approaches used to deal with the task. Elaboration refers to more detailed comments about parts of the peer solution that follow verifications or other elaborations and take the form of corrections, confirmations, justifications, questions, or suggestions. PF comments can also include general statements that are neither verifications nor elaborations, or comments on surface features of the peer solution (e.g., presentation style, spelling or grammar mistakes). General statements normally serve as an anchor for a subsequent verification or elaboration (Strijbos et al., 2012). They were not included in the statistical analyses of PF in the current study because they do not contain any evaluative information about the peer solution.

Coding procedure

Two student assistants transcribed the verbal PF. The transcripts were then segmented following the procedure by Strijbos et al. (2006), with the smallest meaningful segment as the unit of analysis. Two independent coders (student assistants) segmented parts of the data over several independent rounds, 10% of the data each round, until an acceptable percentage agreement was reached (lower bound: 92.5%; upper bound: 93%). Afterwards, one of the student assistants segmented the remaining data.

The segmented PF data were subsequently coded using the coding scheme adapted from Strijbos et al. (2012). Two independent coders (student assistants) coded parts of the data over several rounds, 10% of the data each round, until an acceptable inter-rater reliability was reached (Krippendorff’s α = .81). No metacognitive PF was observed in the current data. The reliability of the coding was calculated for different levels of the coding scheme: main categories (i.e., cognitive, affective, and self-efficacy; Krippendorff’s α = .68), sub-categories (e.g., cognitive verification, cognitive elaboration, general cognitive statement; Krippendorff’s α = .81), and the lowest level of sub-categories (e.g., cognitive confirmation, cognitive justification, content self-efficacy, PF self-efficacy; Krippendorff’s α = .76).

Main categories typically yield higher reliability values than finer-grained categories because they are easier to distinguish, but this was not the case in the current sample. The majority of the PF belonged to the main category ‘cognitive’, so the very few disagreements on the other main categories (self-efficacy and affective) might have led to a severe correction of Krippendorff’s alpha. The chance-correction level is higher for three categories (compared to the five subcategories within the cognitive category). Thus, the data might not have contained enough cases in the other main categories for a more accurate estimate of the inter-rater reliability. After an acceptable inter-rater reliability was reached for the PF levels used for analysis in this study (Krippendorff’s α = .81), the same student assistants each coded half of the remaining data.
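The argument that a skewed category distribution can depress Krippendorff’s alpha can be illustrated with a small sketch. It assumes the standard formula for nominal data with two coders and no missing values; the software used in the study is not stated, and the function name and toy codings below are hypothetical.

```python
from collections import Counter


def krippendorff_alpha_nominal(coder_a, coder_b):
    """Krippendorff's alpha for nominal codes, two coders, no missing data."""
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must code the same units.")
    n = 2 * len(coder_a)  # total number of assigned codes
    totals = Counter(coder_a) + Counter(coder_b)
    # Each disagreeing unit enters the coincidence matrix twice (a-b and b-a).
    mismatches = sum(2 for a, b in zip(coder_a, coder_b) if a != b)
    observed = mismatches / n
    expected = sum(
        totals[c] * totals[k] for c in totals for k in totals if c != k
    ) / (n * (n - 1))
    if expected == 0:
        raise ValueError("Alpha is undefined when only one category occurs.")
    return 1 - observed / expected


# Hypothetical codings: both pairs agree on 19 of 20 segments (95%).
balanced_a = ["cognitive"] * 10 + ["affective"] * 10
balanced_b = ["cognitive"] * 10 + ["affective"] * 9 + ["cognitive"]

skewed_a = ["cognitive"] * 18 + ["affective", "affective"]
skewed_b = ["cognitive"] * 18 + ["cognitive", "affective"]

alpha_balanced = krippendorff_alpha_nominal(balanced_a, balanced_b)  # about 0.90
alpha_skewed = krippendorff_alpha_nominal(skewed_a, skewed_b)        # about 0.65
```

In this toy case a single disagreement costs far more when almost all segments fall into one category, because the expected disagreement by chance is then very small.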

Peer feedback accuracy

For PF accuracy, we adopted the definition of PF quality from Van Steendam et al. (2010) but limited PF accuracy to the number of detected errors or correct statements. Accordingly, the provided PF was assessed in terms of the number of accurate comments about the correct, incorrect, or missing aspects of the peer solution on which the participants provided PF. The same student assistants who coded the PF type also coded the accuracy of the comments. Over several rounds, 10% of the data was coded each time until an acceptable level of inter-rater reliability was reached (Krippendorff’s α = .76). Afterwards, each student assistant coded half of the remaining data.

The raw PF accuracy scores were not directly comparable across conditions because the two peer solutions had comparable and non-comparable parts. The peer solution in condition A was longer, whereas the peer solution in condition B contained more errors. The maximum PF accuracy score was 14 points for the peer solution in condition A and 16 points for the peer solution in condition B. To compare the two conditions, proportional PF accuracy scores were computed by dividing the number of accurate comments a participant made by the maximum number of possible correct comments.
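The proportional score can be written out as a short sketch. The maxima of 14 and 16 points come from the text; the function and variable names are hypothetical, and the range check is an added assumption.

```python
# Maximum number of possible accurate comments per condition (from the text).
MAX_ACCURACY = {"A": 14, "B": 16}


def proportional_accuracy(accurate_comments, condition):
    """Share of the possible accurate comments that a participant made."""
    maximum = MAX_ACCURACY[condition]
    if not 0 <= accurate_comments <= maximum:
        raise ValueError("Count must lie between 0 and the condition maximum.")
    return accurate_comments / maximum


# Hypothetical participants: different raw counts, same proportional score.
score_a = proportional_accuracy(7, "A")  # 7 of 14
score_b = proportional_accuracy(8, "B")  # 8 of 16
```

Both hypothetical participants receive a proportional score of 0.5, although their raw counts differ.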