
The Interplay Between Student Evaluation and Instruction

Grading and Feedback in Mathematics Classrooms

Published Online: https://doi.org/10.1027/0044-3409.216.2.111

Teachers’ practices of student evaluation can be considered crucial to the implementation of embedded assessment systems. This article reports on two studies investigating these practices in detail. The first study examines teachers’ judgments of student achievement as reflected in the grades they award. It examines whether the grades awarded reflect two dimensions of students’ achievement as well as learning behavior, and it also explores whether teachers’ grading is aligned with their instruction. In the second study, we analyze how teacher evaluation affects students’ subsequent learning processes. This study uses feedback given to students by the teacher during classroom interaction as an indicator of student evaluation, and investigates the impact of two types of feedback, evaluative and informational, on student learning and motivation. The results of Study 1 show that both dimensions of student achievement, as well as involvement, contributed substantially to students’ grades. Moreover, these contributions depended on teacher beliefs and instructional quality. The findings of Study 2 show that positive evaluative feedback in the classroom was associated with increased intrinsic motivation, whereas negative evaluative feedback was not related to motivation. Informational feedback was shown to foster motivation via emotional experience and cognitive processing. None of the feedback types examined had a significant impact on students’ achievement development. Finally, implications of the two studies for the implementation of embedded classroom assessment and the investigation of its effects are discussed.
