3. CHAPTER 3: DEVELOPING SELF-EVALUATION SKILLS THROUGH REPETITION:
3.5 Analysis
3.5.2 What explains differences in over and under-confidence?
The results from Section 3.5.1 draw a few conclusions about the improvement in self- evaluation scores across students, generally, that they do improve over repeated writing assignments, though the observable factors associated with that improvement are weak. In this section, we are interested in the direction of self-evaluation inaccuracy, specifically in understanding what groups of students tend to be over/under confident with respect to the quality of their work, and if that over/under confidence fades with repeated self-evaluation procedures.
To describe differences in over- and under-confidence between students, we construct a dummy measure of “overestimation”, which equals 1 if the student’s self-evaluation score was higher than the score assigned by the grader, and 0 otherwise. Overall, we find an increasing trend of over-confidence across the writing assignments. Specifically, on the first assignment, 17.3% of the sample was over-confident, in that their self-evaluation score was higher than the actual grade assigned. By the second and third assignment, that level of over-confidence increased to 43.5% and 69.6%, respectively.
To better understand what type of students are over-confident versus under-confident, Table 4 above presents a set of correlation coefficients between our measure of overestimation and a set of student level-characteristics, by writing assignment. On the first assignment, we find that students who earned a higher grade on the assignment, those of higher GPA, and females are less likely to be over-confident in their self-evaluation score. On the second assignment, the statistical significance on both current GPA and female disappears, though there remains the negative correlation with grade. On the final assignment, we actually find a statistically significant reversal on the effect of actual assigned grades, in that now higher assigned grades are correlated with over- confidence.
The above result, that overestimation is largely based on the assigned grade, resembles closely that of a “Dunning-Kruger” effect, as established in the psychology literature and which has been replicated in countless studies across discipline. Here, it is thought that low ability individuals tend to overestimate their abilities, as, they argue, those skills that are required to perform well on an assignment are also those skills needed to evaluate your actual performance on that assignment. (Kruger and Dunning, 1999) To better understand if these similar effects exist within this study sample, we examine differences in the actual versus perceived percentile rank of students in the course. Here, we assigned each student a percentile rank based on the extent to which their self-evaluation scores correlated with the actual scores received. On average, students self-evaluated their own work in the 49th percentile, as opposed to the actual mean percentile (50, by definition). This suggests that, on average, there was no difference between the students actual and perceived performance, when measured relative to the rest of the class.
Given the results of the descriptive analysis above, simply examining average percentile ranks alone might be masking heterogeneous abilities of students to self-evaluate, and thus our main focus is on the self-evaluation abilities of different groups of students. Panel A in Figure 3.2 groups students into quartiles by actual performance on the writing assignment. That is, the students who scored the worst on the writing assignment were grouped in the bottom quartile while students scoring the highest on the assignment were grouped in the highest quartile. In Figure 3.2, we plot the actual performance (by quartile) against the self-evaluated performance of the students in each of those groupings. If the students are able to self-evaluate accurately, we would expect then those two lines to closely follow each other. As Figure 3.2 depicts, we find that students in the bottom quartiles grossly overestimated their ability relative to the rest of the class and that students in the highest quartiles grossly underestimated their abilities. That is, students in the
lowest quartile ranked by actual performance ranked themselves 65th percentile, whereas their actual grades fell in the 18th percentile, which is statistically different at the 95% level (see Appendix Table 3.A1 for the full percentile rankings). This suggests that students in the bottom quarter of the distribution tended to feel they were better than average in terms of their writing abilities.
Students in the upper half of the actual grade distribution were also subject to misperceiving the quality of their work, except these students (i.e. those students in the third and fourth quartile) tended to grossly underestimate the quality of their work. That is, those in the third quartile tended to self-evaluate themselves in the 35th percentile, while their actual grades fell in the 70th percentile, on average, and those in the highest quartile rated the quality of their work in the 37th percentile, whereas it actually fell in the 90th percentile. These differences are each significant at the 95% level, suggesting a nearly symmetric misperception in ability between low- performing and high-performing students. This follows the result of Dunning and Kruger (1999), though they do not find that individuals in the upper quartiles of ability misperceived the quality of their work to the same degree.
Panel B depicts this same percentile ranking for the second assignment. Here, we find a similar pattern of results for those of lower actual abilities, though we find that all groups of abilities are now more accurate with respect to their perceived percentile rankings. Specifically, students in the lowest quartile of actual performance perceived their work in the 40th percentile, representing an overestimation on average of about 25.7 percentile points, down from a 47.1 percentile point difference on the first assignment. Further, for those in the highest quartile of actual abilities, we find that there is no statistical difference between that actual and perceived percentile rankings.
Finally, Panel C depicts another increase in the abilities of all groups of students to accurately self-evaluate. On this third assignment, for those of lower ability we find another decrease in the gap between perceived and actual percentile rankings, falling to an overestimate of 12.5 percentile points, whereas there still exists no difference between perceived and actual across all other groups of students. In Panel C, we find that actual and self-evaluation lines closely follow each other, suggesting that a repeated self-evaluation procedure is associated with an increase in self-evaluation accuracy across student abilities.
3.6 Discussion and Conclusions
The main hypothesis of this study was that repeated self-evaluation would enhance a students’ ability to accurately assess the quality of their own work. In other words, forcing students to iteratively evaluate their own work would enhance their abilities as self-regulated learners. This hypothesis was tested in the context of repeated writing assignments from an upper- level environmental economics course. Here, our data suggest that self-evaluation accuracy increased by 32% over the three writing assignments, where most of those gains were experienced through from females and those with prior self-evaluation experience.
Further, from the first self-evaluation exercise, the data from this study mimicked closely the results of Kruger and Dunning (1999), in which students of lower abilities (i.e. those that scored the lowest on the writing assignment) tended to overestimate the score they felt their work deserved, and students with higher abilities on average underestimated their perceived score. These systematic over- and underestimations by ability level disappeared by the third and final writing assignment, suggesting that experience with self-evaluation increased the self-evaluation skills of both high and low ability students.
A natural follow-up question to these results would be to ask if this enhanced ability to self-evaluate had any effect on learning outcomes. That is, do greater self-evaluation abilities lead to increased performance in the class? To answer this question, we examine class averages on both writing assignments and exams in the course, as well the difference between the first and third of these assignments, to assess the overall change. These results are presented in Table 3.5 and show that, on average, there was no statistically significant improvement in either blog scores or exam scores over the three assignments. Thus, the data do not show that learning outcomes increased over the course of the semester and can possibly be explained by two different, nonexclusive explanations. First, as self-evaluation represents only one component of self-regulated learning, the type of learning measure used may have had a ceiling effect, so it may not have been sensitive enough to identify possible treatment effects of repeated self-evaluation. Second, it may be necessary for a larger amount of self-evaluation practice to occur before its effects on learning can meaningfully be measured.
Given the positive effects of repeated self-evaluation procedures on the accuracy of student abilities, the lack of results with respect to increased learning outcomes deserves further attention. As this study was carried out in the context of a specialized economics course containing a rather small sample of students, this opens up many avenues for future research directions. One notable direction to take is to assess the learning outcomes of students in this sample as they progress through future classes. Here, it is possible that more salient effects of increased self-evaluation abilities could be experienced outside of the context in which those abilities were developed. However, we can conclude that our study
supports that self-evaluation is a practiced skill, suggesting that it must be implemented in the classroom as a mechanism to enhance the abilities of students as self-regulated learners.
References
Bangert-Drowns, R. L., Kulik, C. L. C., Kulik, J. A., & Morgan, M. (1991). The instructional effect of feedback in test-like events. Review of educational research, 61(2), 213-238.
Bazerman, C., & Russell, D. (1994). Landmark Essays on Writing Across the Curriculum. Davis, CA: Hermagoras Press.
Bazerman, C., Little, J., Bethel, L., Chavkin, T., Fouquette, D., & Garufis, J. (2005). Reference Guide to Writing Across the Curriculum. West Lafayette, IN: Parlor Press.
Boud, D. and Falchikov, N., 1989. Quantitative studies of student self-assessment in higher education: A critical analysis of findings. Higher Education, 18(5), 529-549.
Boud, D., Lawson, R. and Thompson. D. G., 2013. Does student engagement in self-assessment calibrate their judgement over time?. Assessment & Evaluation in Higher Education, 38(8), pp.941-956.
Bloxham, S., 2009. Marking and moderation in the UK: false assumptions and wasted resources. Assessment & Evaluation in Higher Education, 34(2), pp. 209-220.
Bransford, J., Brown, A., & Cocking, R. (1999). How People Learn: Brain, Mind, Experience ,and School. Washington, D.C.: National Academy Press.
Brewer, S. M., & Jozefowicz, J. J. (2006). Making economic principles personal: Student journals and reflection papers. The Journal of Economic Education, 37(2), 202-216.
Butler, D. L., & Winne, P. H. (1995). Feedback and self-regulated learning: A theoretical synthesis. Review of educational research, 65(3), 245-281.
Cameron, M. P. (2012). ‘Economics with training wheels’: Using blogs in teaching and assessing introductory economics. The Journal of Economic Education, 43(4), 397-407.
Crowe, D., & Youga, J. (1986). Using writing as a tool for learning economics. The Journal of Economic Education, 17(3), 218-222.
Dynan, L., & Cate, T. (2005). Does writing matter? An empirical test in a college of business course meeting AACSB’s new international standard. Proceedings of the Association for Global Business, USA, 12.
Dynan, L., & Cate, T. (2009). The impact of writing assignments on student learning: Should writing assignments be structured or unstructured?. International Review of Economics Education, 8(1), 64-86.
Dynan, L. & Cate, T. (2010). Structured Writing Assignments as Pedagogy: Teaching Content, Enhancing Skills. Journal of Applied Economics and Policy , 29, 47-60.
Embrey, T.R. 2002. You blog, we blog: A guide to how teacher-librarians can use weblogs to build communication and research skills, Teacher Librarian 30: 7-9.
Ferdig, R.E., and Trammell, K.D. 2004. Content delivery in the “blogosphere”, Technological Horizons in Education 31: 12-20.
Ferraro, P.J., 2010. Know thyself: competence and self-awareness. Atlantic Economic Journal, 38(2) pp.183-196.
Fulwiler, T. (1994). How Well Does Writing Across the Curriculum Work? In C. Bazerman, & D. Russell, Landmark Essays on Writing Across the Curriculum (pp. 51-64). Davis, CA: Hermagoras Press.
Galbraith, D., & Rijlaarsdam, G. (1995). Effective Strategies for the Teaching and Learning of Writing. Learning and Writing , 9 (2), 93-108.
Greene, J., & Azevedo, R. (2007). A theoretical review of Winne and Hadwin’s model of self- regualted learning: New perspectives and directions. Review of Educational Research, 77(3), 334- 372.
Goffe, W.L., and Sosin, K. 2005. Teaching with technology: May you live in interesting times. Journal of Economic Education, 36: 278-291.
Greenlaw, S. A. (2003). Using writing to enhance student learning in undergraduate economics. International Review of Economics Education, 1(1), 61-70.
Grimes, P.W., 2002. The overconfident principles of the economics student: An examination of metacognitive skill. Journal of Economic Education, 33(1), pp. 15-30.
Guest, J., & Riegler, R. (2017). Learning by doing: Do economics students self-evaluation skills improve?. International Review of Economics Education, 24, 50-64.
Hayes, J. R. (2000). A New Framework for Understanding Cognition and. Perspectives on writing: Research, theory, and practice, 6.
Hayes, J. R., & Flower, L. S. (1980). Identifying the organization of writing processes. I LW Gregg & ER Steinberg (Eds.), Cognitive Processes in Writing (pp. 3-30).
Jonassen, D.H. 1999. Designing constructivist learning environments. In C. Reigeluth (ed.), Instructional-design theories and models, Volume II, pp. 215-239. Mahwah, N.J.: Erlbaum. Knowles, M. (1975). Self-directed Learning: A Guide for Learners and Teachers. Englewood Cliffs, NJ. Prentice-Hall/Cambridge.
Kruger, J., & Dunning, D. (1999). Unskilled and unaware of it: how difficulties in recognizing one's own incompetence lead to inflated self-assessments. Journal of personality and social psychology, 77(6), 1121.
Land, S.M. 2000. Cognitive requirements for learning with open-ended learning environments, Educational Technology Research and Development 48: 61-78.
Langer, J. A., & Applebee, A. N. (1987). How Writing Shapes Thinking: A Study of Teaching and Learning. NCTE Research Report No. 22. National Council of Teachers of English, 1111 Kenyon Rd., Urbana, IL 61801.
Leidner, D., and Jarvenpaa, S. 1995. The use of information technology to enhance management school education, MIS Quarterly 19: 265-291.
Lew, M.D., Alwis, W.A.M. and Schmidt, H.G., 2010. Accuracy of students' self-assessment and their beliefs about its utility. Assessment & Evaluation in Higher Education, 35(2), pp. 135-156. Lew. M.D. and Schmidt, H.G., 2007. Measuring students’ beliefs about self-assessment. Rotterdam; Erasmus University.
Meadows, M. and Billington, L., 2005. A review of the literature on marking reliability, Commissioned report to the National Assessment Agency, London: AQA.
Newstead, S., 2004. Time to make our mark. The Psychologist, 17(1), pp. 20-23.
Nicol, D.J. and MacFarlane-Dick, D. (2006). Formative assessment and self-regulated learning: A model and seven principles of good feedback practice. Studies in Higher Education, 31(2), 199- 218.
Nowell, C. and Alston, R., 2007. I thought I got an A! Overconfidence across the Economics Curriculum. Journal of Economic Education, 38(2), pp. 131-142.
Oravec, J.A. 2002. Bookmarking the world: Weblog applications in education, Journal of Adolescent & Adult Literacy 45: 616-621.
Sadler, R.D. (1989). Formative assessment and the design of instructional systems. Instructional Science, 18(2), 119-144.
Simpson, M. S., & Carroll, S. E. (1999). Assignments for a writing-intensive economics course. The Journal of Economic Education, 30(4), 402-410.
Stowe, K. (2010). A Quick Argument for Active Learning: The Effectiveness of One-Minute Papers. Journal for Economic Educators , 10 (1), 33-39.
Ward, M., Gruppen, L. and Regehr, G., 2002. Measuring self-assessment: Current state of the art. Advances in Health Sciences Education, 7(1), pp. 63-80.
Watts, M., & Schaur, G. (2011). Teaching and Assessment Methods in Undergraduate Economics: A Fourth National Quinquennial Survey. The Journal of Economic Education , 42 (3), 294-309. Winne, P. H. (2001). Self-regulated learning viewed from models of information processing. Self- regulated learning and academic achievement: Theoretical perspectives, 2, 153-189.
Zimmerman, B.J., & Campillo, M. (2003). Motivating self-regulated problem solvers. In J. E. Davidson & R. J. Sternberg (Eds.), The nature of problem solving (pp. 233-262). New York: Cambridge.
Figures
Figure 3.1. Absolute Difference between Actual Performance and Self-Evaluated Performance, by Writing Assignment
Notes: *, **, *** indicate statistical significance at the 0.10, 0.05, and 0.01 levels, respectively.
8.8*** 6.2*** 6.0*** 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0
Blog 1 Blog 2 Blog 3
Pe rc ent age Poi nt s
Figure 3.2. Perceived Versus Actual Blog Post Performance, by Quartile
Note: The horizontal axis represents the performance of students based on actual performance of blog post, binning students into four individual quartiles. The self-evaluation measures represent the average percentile the students in each of the bins perceived their work, relative to the rest of the class. Convergence of the actual and self-evaluation lines across the three writing assignments suggests that Dunning-Kruger effects are being mitigated as students increase self-evaluation abilities. 0 20 40 60 80 100
Lowest Quartile Second Quartile Third Quartile Highest Quartile
Pe
rc
ent
ile
Blog 1 Actual Self-Evaluation 1
0 20 40 60 80 100
Lowest Quartile Second Quartile Third Quartile Highest Quartile
Pe
rc
ent
ile
Blog 2 Actual Self-Evaluation 2
0 20 40 60 80 100
Lowest Quartile Second Quartile Third Quartile Highest Quartile
Pe
rc
ent
ile
Blog 3 Actual Self-Evaluation 3
P ane l A P ane l B P ane l C
Tables
Table 3.1 Summary statistics
Variable Mean Female (Yes = 1) 0.478 Year Senior 0.608 Junior 0.261 Sophomore 0.131 Major Economics 0.391
Natural Resource Economics 0.217
Other 0.391
Current GPA (SD) (.497) 3.06
ESL (Yes = 1) 0.261
Table 3.2. Mean Absolute Value of Inaccuracy, by Student Group and Writing Assignment
Blog Post 1 Blog Post 2 Blog Post 3 (=BP3-BP1) Change
Female 9.50*** (7.573) 4.82*** (4.106) 5.82*** (4.395) (5.654) -3.86* Male 8.25*** (5.914) 7.49*** (5.643) 6.17*** (5.706) (7.618) -2.08 Senior 8.68*** (7.344) 6.03*** (4.092) 5.64*** (4.737) (7.313) -3.03 Junior 9.75*** (5.646) (7.146) 7.17* (6.354) 7.25** (6.181) -2.50 Sophomore (7.006) 7.83 (6.144) 5.00 (4.752) 5.17 (6.526) -2.67 Major: Econ 11.00*** (9.290) 6.83*** (6.000) (7.305) 7.11** (8.495) -3.89 Major: Environmental 10.25*** (4.401) (3.778) 5.75** 4.25*** (1.405) -6.00** (5.099) Major: Other 5.37*** (2.06) 5.81** (5.25) 6.06*** (3.499) (3.703) 0.69 GPA: Low 9.19*** (6.488) 6.19*** (5.691) 5.88*** (5.228) (8.403) -2.75 GPA: High 8.40*** (7.136) 6.20*** (4.347) 6.15*** (4.983) (4.424) -2.25 ESL (9.497) 8.00 (5.776) 5.33 (5.836) 6.83** (5.202) -1.17 Past Self-Eval Experience 10.64*** (6.575) 6.14*** (4.079) (3.133) 4.28** (8.029) -6.35* No Self-Eval Experience 8.06*** (6.710) 6.22*** (5.531) 6.75*** (5.558) (5.549) -1.31
Notes: *, **, *** indicate statistical significance at the 0.10, 0.05, and 0.01 levels, respectively. Standard deviation in parentheses.
Table 3.3. Results of regression model on absolute self-evaluation inaccuracy for each writing assignment and the change in inaccuracy between writing assignment 1 and 3
Blog1 Blog2 Blog3 (=B1-B3) Change
(1) (2) (3) (4) Senior -1.115 -0.015 -1.386 0.114 (3.394) (2.599) (2.608) (3.190) Econ 5.613 1.845 1.168 -3.891 (3.793) (2.978) (2.959) (3.605) Env 5.270 1.146 -1.373 -5.959 (4.612) (3.630) (3.631) (4.403) Current GPA -3.655 -1.388 1.041 3.919 (3.817) (2.807) (3.043) (3.683) ESL 0.124 0.043 1.486 1.431 (4.797) (3.715) (3.749) (4.584) Female 1.922 -0.005 0.673 -1.024 (4.012) (3.312) (3.242) (4.058) Blog 1 Grade 0.214 (0.260) Blog 2 Grade -0.258 (0.188) Blog 3 Grade -0.250 (0.185) Average Blog Grade -0.434 (0.303) Constant -3.550 32.304 25.760 27.692 (22.265) (16.782) (15.797) (24.860) R-Squared 0.2439 0.2039 0.1795 0.3042 N 23 23 23 23
Notes: Inaccuracy is measured as the absolute difference between the student’s self-evaluation score and their actual grade. *, **, *** indicate statistical significance at the 0.10, 0.05, and 0.01 levels, respectively. Standard errors in parentheses.
Table 3.4. Correlations between Student Demographics and Dummy Measure of Over- Confidence, by Writing Assignment
Blog Post 1 Blog Post 2 Blog Post 3
Actual Grade -0.750** -0.480** 0.674** Senior 0.133 -0.016 -0.143 Econ -0.133 -0.344 -0.051 Env -0.011 0.078 -0.037 Current GPA -0.447* 0.198 0.129 ESL -0.273 0.078 -0.037 Female -0.439* -0.137 0.255 Past Self-Eval -0.054 -0.008 0.438
Table 3.5 Average Grades on Three (3) Writing Assignments and Exams
Assignment Blog Exam
1 92.3 79.4
2 88.8 74.3
3 91.5 83.6
Diff [(3)-(1)] -0.8 4.1
APPENDICES APPENDIX 3.A
SUPPLEMENTARY TABLES AND FIGURES
Table 3.A.1. Average Percentile Rankings for Actual and Self-Evaluated Performance, by Blog Post and Actual Performance Quartile
Actual Self-Evaluation Diff Blog Post 1 Lowest Quartile 18.6 65.7 47.1** Second Quartile 40.0 48.8 8.8 Third Quartile 70.0 35.0 -35.0** Highest Quartile 90.0 37.5 -52.5**