Consequences and outcomes of policies governing medium-stakes large-scale exit exams

Educational Assessment, Evaluation and Accountability

Abstract

Large-scale, externally mandated exit exams are common in many education systems. Exit exams are considered medium stakes when exam scores are blended with school-awarded marks to determine a final course grade. This study examined the effects of policy decisions regarding the weighting of exam and school-awarded marks when calculating a student’s final blended grade. Grade 12 teachers (n = 343) in the Sciences and Humanities were surveyed about the effects of Alberta’s exit exam program on their planning and assessment practices. The study found that these exit exams had a profound impact on teacher practice at both the 50% and 30% weightings, leading to a narrowing of planning and assessment practices. While a recent change in weighting contributed to a minor shift away from this narrowing, the perceived narrowness of the exit exam’s content sample and perceived pressure related to classroom assessment outcomes limited this potential shift.
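To make the blending arithmetic concrete (an illustrative formula based on the two weights named in the abstract, not an excerpt from the article), the final blended grade can be written as

\[
\text{Final grade} = w \cdot \text{Exam score} + (1 - w) \cdot \text{School-awarded mark}, \qquad w = 0.5 \text{ or } 0.3 .
\]

For example, a student with an exam score of 60% and a school-awarded mark of 80% would receive 0.5(60) + 0.5(80) = 70% under the 50% weighting, but 0.3(60) + 0.7(80) = 74% under the 30% weighting; reducing the weight shifts the blend toward the school-awarded mark.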


Notes

  1. Language Arts for French immersion students. This is different from French Language Arts, which is not for immersion students.

  2. In Alberta, all grade 12 courses are designated as 30-level courses (e.g., Chemistry 30). In Social Studies, English Language Arts, and Mathematics, there are multiple streams of 30-level courses (e.g., Math 30-1, Math 30-2). As a general rule, dash-one courses are more academic in focus than dash-two courses.

  3. Alberta’s newest-generation large-scale assessment program.


Author information


Corresponding author

Correspondence to David Slomp.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Survey questions and scoring systems

  1. Will you be teaching a diploma examination course during the 2015–2016 school year?
     Scoring: Yes; no

  2. How many years of teaching experience do you have?
     Scoring: Less than 5 years; 5–10 years; 11–15 years; 16–20 years; over 20 years

  3. Which of the following will you be teaching during the 2015–2016 school year? (Please select all that apply)
     Scoring: Mathematics 30-1; Mathematics 30-2; English Language Arts 30-1; English Language Arts 30-2; Social Studies 30-1; Social Studies 30-2; Biology 30; Chemistry 30; Physics 30; Science 30; Français 30

  4. How many years have you been teaching diploma examination courses?
     Scoring: This is my first; 2–5 years; 6–10 years; over 10 years

  5. Which diploma examination courses have you taught prior to the 2015–2016 school year? (Please select all that apply)
     Scoring: Mathematics 30-1; Mathematics 30-2; English Language Arts 30-1; English Language Arts 30-2; Social Studies 30-1; Social Studies 30-2; Biology 30; Chemistry 30; Physics 30; Science 30; Français 30

  6. Have you participated in any of the following? (Please select all that apply)
     Scoring: Marking diploma examinations; item writing for diploma examinations; reviewing diploma examinations on technical review committees; field testing diploma examinations; none of the above

  7. From your perspective, how much importance does your district administration put on diploma exam scores?
     Scoring: 10-level scale with the following labeled: not important (far left); moderately important (middle); very important (far right)*

  8. From your perspective, how much importance does your school administration put on diploma exam scores?
     Scoring: 10-level scale with the following labeled: not important (far left); moderately important (middle); very important (far right)*

  9. From your perspective, how much importance does your parent community put on diploma exam scores?
     Scoring: 10-level scale with the following labeled: not important (far left); moderately important (middle); very important (far right)*

  10. From your perspective, how much importance do your students put on diploma exam scores?
      Scoring: 10-level scale with the following labeled: not important (far left); moderately important (middle); very important (far right)*

  11. How much importance do you put on diploma exam scores?
      Scoring: 10-level scale with the following labeled: not important (far left); moderately important (middle); very important (far right)*

  12. To what extent do you feel pressure to ensure that the marks that students earn in class are relatively close to their diploma exam scores?
      Scoring: 10-level scale with the following labeled: no pressure (far left); moderate pressure (middle); significant pressure (far right)*

  13. What are the key sources of the pressure that you feel to ensure that the marks students earn in class are relatively close to their diploma exam scores? (Please select all that apply)
      Scoring: Self; colleagues; parents; students; school administration; district administration; media; none of the above; other (please specify)

  14. To what extent was your planning for diploma examination courses influenced by the presence of a diploma examination when the exams were weighted at 50%?
      Scoring: 10-level scale with the following labeled: not influenced (far left); moderately influenced (middle); significantly influenced (far right)

  15. Please describe the specific ways the diploma exam, at 50% weighting, influenced your planning.
      Scoring: Open-ended response

  16. To what extent were your classroom assessment practices influenced by the presence of a diploma examination when the exams were weighted at 50%?
      Scoring: 10-level scale with the following labeled: not influenced (far left); moderately influenced (middle); significantly influenced (far right)

  17. Please describe the specific ways the diploma exam, at 50% weighting, influenced your classroom assessment practices.
      Scoring: Open-ended response

  18. To what extent has your planning for diploma exam courses changed (or will it change) this year now that the weighting of the examinations has been reduced to 30%?
      Scoring: 10-level scale with the following labeled: no change (far left); moderate change (middle); significant change (far right)

  19. Please describe the specific ways your planning has changed (or will change) this year.
      Scoring: Open-ended response

  20. To what extent have your classroom assessment practices changed (or will they change) this year now that the weighting of the examinations has been reduced to 30%?
      Scoring: 10-level scale with the following labeled: no change (far left); moderate change (middle); significant change (far right)

  21. Please describe the specific ways your classroom assessment practices have changed (or will change) this year.
      Scoring: Open-ended response

  22. To what extent do you believe the diploma exam measures the breadth of knowledge and skills required of students by the curriculum?
      Scoring: 10-level scale with the following labeled: does not measure breadth (far left); moderate measure of breadth (middle); significant measurement of breadth (far right)*

Thank you for your submission. We are interested in hearing more about your experiences. A second component of this study consists of a voluntary interview. If you agree to participate in this voluntary second component, your participation will include a 30-min interview either in person or via technology (phone call, Skype, Google Hangout). By entering your name and e-mail address and by clicking “Submit Responses,” you indicate that you understand and agree to be contacted by the researchers for a follow-up interview.

*Space provided for further comments

About this article

Cite this article

Slomp, D., Marynowski, R., Holec, V. et al. Consequences and outcomes of policies governing medium-stakes large-scale exit exams. Educ Asse Eval Acc 32, 431–460 (2020). https://doi.org/10.1007/s11092-020-09334-8

