Consequences of Test Interpretation and Use: The Fusion of Validity and Values in Psychological Assessment

Problems and Solutions in Human Assessment

Abstract

This paper addresses the role of social values in educational and psychological measurement, with special attention to the consequences of testing as validity evidence, which is an inherently value-dependent enterprise. The primary measurement standards that must be met to legitimize a proposed test use are those of reliability, validity, and fairness, which are also value-laden concepts. Evidence of reliability signifies that something is being measured; the major concern is score consistency or stability. Evidence of validity circumscribes the nature of that something; the major concern is score meaning. Evidence of fairness indicates that score meaning does not differ consequentially across individuals, groups, or settings; the major concern is comparability.

Reprinted by permission of Educational Testing Service, the copyright holder.



Copyright information

© 2000 Springer Science+Business Media New York

About this chapter

Cite this chapter

Messick, S. (2000). Consequences of Test Interpretation and Use: The Fusion of Validity and Values in Psychological Assessment. In: Goffin, R.D., Helmes, E. (eds) Problems and Solutions in Human Assessment. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-4397-8_1

  • DOI: https://doi.org/10.1007/978-1-4615-4397-8_1

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-6978-3

  • Online ISBN: 978-1-4615-4397-8

  • eBook Packages: Springer Book Archive
