Abstract
The use of experts to judge performance assessments is desirable because ratings of performances, carried out by experts in the content domain of the examination, are often considered to be the "gold standard." However, one drawback of using experts to rate performances is the high cost involved. A more economical alternative for scoring performance assessments is analytic scoring, which typically involves assigning points to individual traits present in the performance and summing them to arrive at a single score. This strategy is less costly, but may lack the richness of holistic scoring. This study investigates the use of regression-based techniques to predict expert judgments on a written performance task from a combination of analytic scores. Potentially, this yields scores that approximate the richness of holistic ratings while maintaining the cost-effectiveness of analytic scoring. Results show that a substantial proportion of variance in expert judgments can be explained by the analytic scores, but that decisions based on actual expert judgments and on the predicted expert judgments were not sufficiently consistent to warrant substituting one score for the other.
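The regression-based approach described above can be sketched as an ordinary least-squares fit of holistic expert ratings on a set of analytic subscores, with the proportion of explained variance (R-squared) as the key summary. The data below are purely illustrative placeholders, not values from the study, and the three-trait structure is an assumption made for the example.

```python
import numpy as np

# Hypothetical data: each row holds an examinee's analytic subscores
# (points awarded for three individual traits in a written exercise);
# 'holistic' holds the corresponding expert holistic rating.
# All numbers are illustrative, not taken from the study.
analytic = np.array([
    [3, 2, 4],
    [1, 1, 2],
    [4, 3, 5],
    [2, 2, 3],
    [5, 4, 5],
    [2, 1, 1],
], dtype=float)
holistic = np.array([7.0, 3.0, 9.0, 5.0, 10.0, 3.0])

# Fit OLS regression: holistic ~ b0 + b1*trait1 + b2*trait2 + b3*trait3
X = np.column_stack([np.ones(len(analytic)), analytic])
coef, _, _, _ = np.linalg.lstsq(X, holistic, rcond=None)

# Predicted holistic ratings from the analytic scores
predicted = X @ coef

# Proportion of variance in expert judgments explained by the
# analytic scores (R-squared), the quantity the abstract calls
# "substantial"
ss_res = np.sum((holistic - predicted) ** 2)
ss_tot = np.sum((holistic - holistic.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot
print(f"R-squared: {r_squared:.3f}")
```

A high R-squared alone does not settle the substitution question: the study's negative conclusion rests on decision consistency (whether pass/fail classifications agree across the actual and predicted scores), which must be checked separately from variance explained.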
Slater, S.C., Boulet, J.R. Predicting Holistic Ratings of Written Performance Assessments from Analytic Scoring. Adv Health Sci Educ Theory Pract 6, 103–119 (2001). https://doi.org/10.1023/A:1011478224834