Elsevier

Intelligence

Volume 75, July–August 2019, Pages 9-18

First publication of subtests in the Stanford-Binet 5, WAIS-IV, WISC-V, and WPPSI-IV

https://doi.org/10.1016/j.intell.2019.02.005

Highlights

  • Many subtests have long histories that support their use in measuring intelligence.

  • The history of many subtests is older and more complex than many may realize.

  • Many subtests have histories that are much older than theories of test creation.

Abstract

In this article we describe the origins of the subtests that appear on the modern Stanford-Binet Intelligence Scales (SB5), Wechsler Preschool and Primary Scale of Intelligence (WPPSI-IV), Wechsler Intelligence Scale for Children (WISC-V), and Wechsler Adult Intelligence Scale (WAIS-IV). We found that the majority of these subtest formats were first created in 1908 or earlier and that only three have been created since 1980. We discuss the implications of these findings, which are that (1) many subtests have lengthy research histories that support their use in measuring intelligence; (2) many subtests have formats that predate modern theories of test creation, cognitive psychology, and intelligence; and (3) the history of many subtests is more complex than psychologists probably realize.

Introduction

One of the first successes in applied psychology was the development of intelligence tests. Early tests in the 1910s and 1920s found rapid, widespread acceptance, with millions of American examinees tested every year (Cronbach, 1975; Thorndike, 1975; Yerkes, 1921). The use of these tests persists today, and in the 21st century the most popular individually administered intelligence tests are the Stanford-Binet Intelligence Scale (SB5) and the Wechsler Intelligence Scales, the latter comprising the Wechsler Adult Intelligence Scale (WAIS-IV), the Wechsler Intelligence Scale for Children (WISC-V), and the Wechsler Preschool and Primary Scale of Intelligence (WPPSI-IV). These instruments have dominated intelligence testing for decades. The original version of the Stanford-Binet scale was first published over 100 years ago (Terman, 1916), though many of the items were direct translations or close adaptations of items from Binet’s 1905, 1908, and 1911 intelligence scales. Ironically, Binet and Terman had opposite goals in their work on intelligence testing. Binet aimed to identify children who were struggling academically (Wolf, 1973), while Terman had an interest in identifying gifted children, an interest which started with his dissertation (Terman, 1905) and lasted until his death. Indeed, Terman's research on gifted children is the work for which he is best remembered today (Warne, 2019). The Stanford-Binet has been revised several times since 1916, with the fifth edition, published in 2003, being the most recent.

The first Wechsler scale appeared in 1939 as the Wechsler-Bellevue, an intelligence test designed for adult examinees (see description in Wechsler, 1944), as opposed to the child examinees for whom Terman designed the Stanford-Binet. Wechsler disapproved of the heavily verbal content of the early versions of the Stanford-Binet and of the test's ability to produce a global IQ as the only measure of a person's intellectual level (Wechsler, 1944). Therefore, he designed his test to produce a verbal and performance (i.e., non-verbal) IQ score. To create the Wechsler-Bellevue, Wechsler evaluated item formats that appeared on prior scales and selected the ones which he thought were the best measures of intelligence, based on his research (Boake, 2002; Wechsler, 1944) and his experience administering the Army Alpha and Army Beta in Texas during World War I (Yerkes, 1921, pp. 40, 80). As he wrote, “Our aim was not to produce a set of brand new tests but to select, from whatever source available, such a combination of them as would best meet the requirements of an effective adult scale” (Wechsler, 1944, p. 76). Wechsler favored test formats and items that (a) showed high discrimination in intelligence across much of the continuum of ability, (b) produced scores with high reliability, (c) correlated strongly with other widely accepted measures of intelligence, and (d) correlated with “pragmatic” subjective ratings of intelligence from people who knew the examinee, such as a work supervisor (Wechsler, 1944). These criteria led Wechsler to believe that, for example, an information subtest was effective but that the Army Beta's cube analysis subtest was not (because the latter was incapable of discriminating among people with intellectual disabilities). The success of the scale led Wechsler to create a separate test for children (the WISC) in 1949 and another for preschool children (the WPPSI) in 1967. All Wechsler tests have been revised several times since their creation (Kaplan & Saccuzzo, 2018).

Over the years, psychologists have updated these tests with new analyses and norm samples, while also adding or removing subtests. Despite these revisions, the revisers of the Wechsler scales and the Stanford-Binet have never completely replaced every subtest when updating an intelligence scale. The result is that contemporary versions of these tests are an amalgamation of old subtest formats and modern test construction methods.

It is the legacy of these old subtests on modern tests that intrigued us. Knowing that many subtests on the Stanford-Binet or the Wechsler scales long predate the current versions of these tests, we investigated the origins of these subtests, hoping to find the earliest publication of each subtest format in the scholarly literature. Despite the many changes to the subtests over time, there has never been a compilation of the origins of the subtests on popular intelligence scales. Considering that many of the subtests that have long been part of the SB or the Wechsler scales are still in use today, it is important to understand where they came from. The origin of these subtests provides valuable information about the creation of the SB and Wechsler scales and may shed light on test theory and test score interpretation. We believed that understanding the history of subtests would lead intelligence test users to have a greater appreciation of these subtests.

Moreover, we have engaged in this historical research with the goal of correcting misconceptions that psychologists have about the origin of frequently used intelligence subtests. For example, in one article the authors claimed that Corsi invented the block tapping task in 1972 (Wongupparaj, Wongupparaj, Kumari, & Morris, 2017, p. 72). In reality, we show below that the task was invented in 1913. Likewise, we found multiple sources (e.g., Boake, 2002; Frank, 1983) that stated that the picture completion subtest (found on the WAIS-IV) originated with Healy (1914), but we discovered that Healy's task is different from the modern subtest, which originated with Binet (see below). We believe that such misconceptions are probably common. An incorrect understanding of the origin of a subtest may limit the thoroughness of literature searches about psychometric validity or the Flynn effect. Finally, research about the subtests' psychometric properties outside of the context of the SB or Wechsler scales can strengthen scientists' interpretations of what these tests measure.

Section snippets

Search procedures

The task of identifying the origin of subtests may seem easy at first glance, but there are circumstances that make the task difficult. When the SB or Wechsler scales were first created or later updated, the test creators or revisers often did not state any origins of the subtests on their scales, let alone provide any citations for the first description of subtests. Modern test manuals for these tests are silent on the issue of the origin of their subtests, probably because many readers do not

Subtests

Table 1 lists all of the subtests found in the Stanford-Binet 5, the WAIS-IV, the WISC-V, and the WPPSI-IV. Subtests with very similar formats are combined into a single row. For example, the WISC-V Digit Span, WAIS-IV Digit Span, WISC-V Picture Span, and WISC-V Letter-Number Sequencing all require examinees to repeat in order a sequence of stimuli that have been presented. Although the stimuli and/or difficulty differ, the required tasks are all sufficiently similar that we saw the later

Discussion

Tracing the origins of all subtests found on the SB5, WAIS-IV, WISC-V, and WPPSI-IV is a project that has not been undertaken before. We wrote this article to give intelligence test users an appreciation for the history of these subtests and also explain more about the creation of the original scales. Perhaps with a knowledge that most subtests have been in use for many years, practitioners can have more confidence in their use of these item formats because they can know that these subtests

Acknowledgements

We appreciate Elijah L. Armstrong, who provided us with a source for earlier versions of the Picture Absurdities and Position and Direction subtests that we had been unaware of when we posted the original version of this manuscript as a pre-print on PsyArXiv.

References (67)

  • F.X. Blair

    A study of the visual memory of deaf and hearing children

    American Annals of the Deaf

    (1957)
  • E. Blin

    Les débilités mentales

    Revue de Psychiatrie

    (1902)
  • C. Boake

    From the Binet-Simon to the Wechsler-Bellevue: Tracing the history of intelligence testing

    Journal of Clinical and Experimental Neuropsychology

    (2002)
  • F.G. Bonser

    The reasoning ability of children of the fourth, fifth, and sixth school grades (Teachers College Contributions to Education, no. 37)

    (1910)
  • J.B. Carroll

    Human cognitive abilities: A survey of factor-analytic studies

    (1993)
  • K. Coaley

    An introduction to psychological assessment & psychometrics

    (2014)
  • L.J. Cronbach

    Five decades of public controversy over mental testing

    American Psychologist

    (1975)
  • H. Damaye

    Essai de diagnostic entre les états de débilités mentales

    (1903)
  • W.F. Dearborn

    Experiments in learning

    Journal of Educational Psychology

    (1910)
  • G. Frank

    The Wechsler enterprise: An assessment of the development, structure, and use of the Wechsler tests of intelligence

    (1983)
  • S.J. Gould

    The mismeasure of man

    (1981)
  • G.S. Hall

    The contents of children's minds on entering school

    (1893)
  • W. Healy

    A pictorial completion test

    Psychological Review

    (1914)
  • J. Jacobs

    Experiments on “prehension”

    Mind

    (1887)
  • R.M. Kaplan et al.

    Psychological testing: Principles, applications, and issues

    (2018)
  • H.A. Knox

    The differentiation between moronism and ignorance

    New York Medical Journal

    (1913)
  • S.C. Kohs

    The block-design tests

    Journal of Experimental Psychology

    (1920)
  • R.L. Linn

    Educational testing and assessment: Research needs and policy issues

    American Psychologist

    (1986)
  • D.F. Lohman et al.

    Cognitive Abilities Test (form 8)

    (2017)
  • J.D. Matarazzo

    Wechsler's measurement and appraisal of adult intelligence

    (1972)
  • D.W. Maurer

    The argot of the three-shell game

    American Speech

    (1947)
  • J.A. Naglieri

    Traditional IQ: 100 years of misconception and its relationship to minority representation in gifted programs

  • M. Norgate

    Cutting borders: Dissected maps and the origins of the jigsaw puzzle

    The Cartographic Journal

    (2007)