Introduction

Learning assessment takes on new meaning in open, distance, and digital education (ODDE). When education moves from classrooms to online settings, many traditional signals of what is valued are eliminated or transformed. Furthermore, as Conrad and Openo (2018) point out in the introduction to Assessment Strategies for Online Learning, assessment has become increasingly critical as education (and particularly postsecondary education) moves away from credit hours and toward learning outcomes. Similarly, the introduction of open distance learning led to a paradigm shift away from grading and certification and towards performance-based assessment and learning outcomes. These developments have helped online designers, educators, and administrators appreciate how challenging and laborious it is to enact high-quality assessment in online settings.

In transitioning to online instruction, many educators and designers struggle to translate their informal assessment of classroom discourse to the discussion forums that are the primary form of interaction in many online classes. Likewise, many educators find that when offering formative assessment (i.e., assessment “for” learning), their relatively efficient whole-class feedback sessions are supplanted by laborious individualized private feedback. When administering online summative assessments (i.e., assessment “of” prior learning), many educators struggle to replace secure “closed-book” tests. Even when using expensive and intrusive digital proctors or requiring students to come to campus or a testing center, the nature of the Internet and the proliferation of so-called “homework help” websites raise suspicions about test scores in many online educational contexts.

This chapter is intended to help readers understand these and related issues and begin to address them in specific educational contexts. The chapter builds on two dimensions of assessment first discussed in Hickey and Pellegrino (2005). The first dimension is assessment purposes/functions (e.g., formative vs. summative), while the second is learning theory (i.e., differential, cognitive-associationist, cognitive-constructivist, and situative/sociocultural). Particular attention is paid to how the two dimensions interact with one another in online and open contexts; the chapter also briefly considers how these dimensions interact with item format (e.g., selected response vs. constructed response) and assessment level (i.e., immediate, close, proximal, distal, and remote) in online settings, as elaborated in Hickey, Harris, and Lee (in review). This consideration of assessment covers the entire range of fully online and open learning environments, including massive open online courses (MOOCs) and other open courses as well as conventional for-credit online courses, and including synchronous online formats, cohorted “semi-synchronous” formats, and fully asynchronous self-paced formats. This chapter does not consider assessment in traditional non-digital “correspondence courses” using mail, broadcast radio, or television. And while developments such as computer-adaptive testing and new measurement models (e.g., Mislevy, 2018) are certainly relevant, they are entire topics unto themselves that quickly move beyond the scope of the chapter. The chapter also does not directly consider online testing independent of online education (e.g., commercial achievement tests and tests for college admissions and professional licensure).

Given space limitations, this chapter does not attempt to exhaustively review the relevant prior research literature (sometimes characterized as “e-assessment”). Readers may wish to consult Mawhinney’s (2013) systematic review, which uncovered four main themes across 10 articles: perceptions, validity and reliability, student support, and benefits of e-assessment. Covering some of the same terrain, Wei, Saab, and Admiraal’s (2020) systematic review of the assessment of cognitive, behavioral, and affective learning outcomes across 65 studies uncovered 25 different approaches. Xiong and Suen (2018) reviewed the research literature on assessment in MOOCs, with a particular focus on the differences between the associationist “xMOOCs” and the connectivist “cMOOCs” described below.

Assessment Purposes/Functions

Assessment scholars have traditionally focused on the intended purposes of assessment. For example, the 2001 expert consensus report from the US National Research Council distinguished between the familiar formative (“assessment for learning”), summative (“assessment of learning”), and evaluative (“assessment of programs”) purposes and cautioned against using assessments for multiple purposes (primarily because summative and evaluative purposes undermine formative purposes). Most considerations of assessment in online and open learning embrace these distinctions, and many share this concern.

The research literature on online formative assessment (sometimes “OFA”) is particularly vast. The integrated narrative review of higher education research by Gikandi, Morrow, and Davis (2011) uncovered themes such as Vygotsky’s zone of proximal development (ZPD) underpinning authentic OFA, OFA for individuals, peers, and teachers, threats to validity and reliability, and interactive formative feedback. Explicitly building on Gikandi et al., McLaughlin and Yan’s (2017) narrative review of 32 more recent studies documented expanded delivery formats, detailed cognitive and emotional benefits of OFA, and extended the work into K-12 contexts. The systematic review by Mahanan, Talib, and Ibrahim (2021) included 10 studies in higher STEM education and uncovered evidence about the tools used, the themes addressed, the outcomes assessed, practical skills, and assessment formats. As discussed in Arnold (2016), one important issue in formative assessment is the likelihood and consequences of cheating; Hickey and Harris (2021) argued that carefully “aligning” formative and summative assessments can discourage such practices by convincing students that completing formative assessments as intended is an ideal way to prepare for summative assessments.

Naturally, summative purposes are central to many of the considerations of assessment in online and open learning. Russell (2019) discussed the role of digital technologies in summative assessment in general while Russell (2018) discussed crucial issues of accessibility in this context. As elaborated in Hickey and Harris (2021) and Stadler, Kolb, and Sailer (2021), time limits are an important issue in online summative assessments. To reiterate, online test proctors are expensive and intrusive; they can also be bypassed by workarounds that proliferate online.

Meanwhile, numerous studies have shown that, despite honor codes, many students will cheat on online assessments (e.g., LoSchiavo & Shatz, 2011); cheating is particularly likely when students assume that their classmates are doing so (Lang, 2013). Furthermore, the profusion of online “homework help” sites means that students can directly locate the answers to many items drawn from textbook publishers’ item banks (sometimes, thanks to the power of modern search engines, even after the item stems and answers are reworded). With all assessment formats, using time limits and ensuring that items are not directly searchable can maximize the validity of summative assessment scores as estimates of the likely transfer of that knowledge to subsequent educational, professional, and personal contexts. With selected-response formats, including challenging “best answer” items or ensuring that students would have to search starting from the item responses (rather than the item stem) can further enhance the trustworthiness of scores on time-limited summative assessments.
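To make these design principles concrete, the following is a minimal sketch (in Python) of assembling a per-student, time-limited form from an item bank with shuffled response options; the item bank, the assemble_form function, and the 20-minute limit are hypothetical illustrations of the suggestions above, not a description of any particular platform.

```python
import random

# Hypothetical item bank. Each item stores a stem and its options, with the best
# answer listed first; stems are assumed to be locally reworded so they are not
# directly searchable on "homework help" sites.
ITEM_BANK = [
    {"stem": "Which statement best describes formative assessment?",
     "options": ["Option A (best answer)", "Option B", "Option C", "Option D"]},
    # ... more items ...
]

TIME_LIMIT_MINUTES = 20  # illustrative; tight enough to discourage searching for answers


def assemble_form(bank, n_items, seed):
    """Draw a per-student random sample of items and shuffle each item's options."""
    rng = random.Random(seed)  # seed per student so the form can be reproduced for review
    form = []
    for item in rng.sample(bank, min(n_items, len(bank))):
        options = item["options"][:]
        rng.shuffle(options)
        # The key stays valid because the best answer was listed first in the bank.
        form.append({"stem": item["stem"], "options": options, "key": item["options"][0]})
    return form


# Example: assemble a one-item form for a (hypothetical) student identifier.
print(assemble_form(ITEM_BANK, n_items=1, seed=20230142))
```

Per-student randomization of items and option order does not prevent collusion or searching, but it makes verbatim answer sharing less useful and complements the time limits discussed above.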

While the evaluation of courses and programs is mostly outside of the scope of this chapter, summative assessment of learning certainly plays a role in doing so. Notable consideration of using assessment in evaluations is included in many of the chapters in Azevedo and Azevedo (2018).

Conformative, Deformative, and Transformative Assessment Functions

Rather than assessment purposes, Hickey and Pellegrino (2005) argued instead for a sociocultural focus on assessment functions and a broader range of “learning” beyond the familiar individual behavioral or cognitive outcomes. As elaborated below, this makes it possible for a single assessment to serve multiple complementary functions. This also directs additional attention to the consequences of assessment practices.

To reiterate, the lack of classroom interaction means that assessments typically have a greater influence on the culture of education delivered online. Torrance (2012) demonstrated how a sociocultural perspective draws additional attention to the unintended consequences of assessment. Torrance insisted that “all assessment is formative, for student dispositions and self-identities as learners, as well as knowledge and understanding, but not necessarily in a positive way” (p. 325). In addition to the three conventional functions above, Torrance identified three additional functions, each of which takes on added meaning in online and open education. Conformative functions occur when instruction is overly aligned to narrow curricular aims that can be readily assessed. This concern is captured by Preston’s (2017) argument that competency-based approaches and the associated “mastery learning” movement of the 1980s were suppressed by the widespread embrace of constructivist theory in the 1990s, only to later resurge as “an existential threat to human learning” in the context of online education and training. Relatedly, Torrance cautioned about deformative functions, whereby assessment feedback, particularly low marks or scores, undermines students’ affinity for and identity with the assessed knowledge.

Finally, Torrance (2012) encouraged the recognition of transformative functions, whereby the entire assessment practice and the social construction of judgment is made transparent and used to serve broader educational goals. Arguably, Torrance’s extensions shed new light on considerations of transformative functions. For example, Chaudhary and Dey (2013) discussed a “paradigm shift” away from content-based tests for grading and certification and towards a range of problem-based assessments following a broader governmental shift; for some, this is precisely the concern over conformative functions raised by Preston (2017). Alternatively, Ehlers (2013) discussed how new assessment practices might support “open learning cultures” via self-assessment, peer-assessment, “social information retrieval,” e-portfolios, and rubrics. Arguably, such discussions must be informed by explicit consideration of one’s underlying theory of learning, as discussed below.

Prior Learning Assessment

Within summative functions, another prominent assessment practice in online and open education is prior learning assessment and recognition (PLAR, and sometimes just PLA), whereby assessments, work samples, or other evidence are used to award course credit. The practice is particularly prominent in continuing education contexts and is especially relevant for older students. Conrad’s significant contributions here should be noted, including two handbook chapters (2008a, 2008b), a special issue (2011), and an exploration in the context of MOOCs (2013). Other noteworthy considerations of PLA/PLAR are represented by the various chapters in Stevenson (2021). While not specifically about online and open learning, the journal PLA Inside Out was launched in 2012 and includes many relevant contributions.

Learning Theory

Theories of learning are really theories of knowing, as one’s theory of learning must account for the nature of the knowledge that is learned. Together, assumptions about knowing and learning have profound implications for assessment. It is worth noting that different considerations of learning theory use different labels and categories. The influential 1996 handbook chapter by Greeno, Collins, and Resnick contrasted behavioral/empiricist, cognitive/rationalist, and situative/pragmatist-sociohistoric perspectives; the US National Research Council (2001) contrasted differential, behaviorist, cognitive, and situative perspectives; and Hickey and Pellegrino (2005) contrasted empiricist, rationalist, and socioculturalist perspectives. These categories refer to “grand theories” (or “perspectives”), with each including more specific theories. Expanding beyond theory, Conrad and Openo (2018) summarized seven “philosophical orientations” in assessment, including liberalism, progressivism, behaviorism, humanism, radicalism, cognitivism, and constructivism, though without directly linking those philosophies to assessment practices. For reasons elaborated below, we have chosen to organize our discussion around differential, cognitive-associationist, cognitive-constructivist, and situative/sociocultural theories.

It is also worth noting that many practitioners and scholars pragmatically combine the second and third categories into an encompassing framework of “cognitive science” (e.g., Mayer & Wittrock, 1996) and that others similarly combine the last three (e.g., NRC, 2001). As elaborated below, we contend that working with different theories in the context of assessment calls for caution and careful consideration. Space limitations preclude elaboration beyond our points of departure from prevailing considerations; readers are referred to Conrad and Openo’s (2018, Chap. 4) and the NRC’s (2001, Chap. 3) extended discussions of learning theory and educational assessment.

Differential Theories

Differential theories emerged in the early twentieth century within efforts to uncover stable intellectual traits like IQ. Differential theories eventually came to be seen as theories of measurement rather than theories of learning, because they assumed that knowledge is whatever tests measure. These theories were gradually supplanted by behaviorism (mostly in the USA) and Gestalt theory (mostly in Europe) and now sometimes go unacknowledged (e.g., Greeno, Collins, & Resnick, 1996; Hickey & Pellegrino, 2005). While differential theories of learning and the associated theory of knowledge transfer (i.e., general transfer of general skills) live on in “classical” education, such approaches are usually delivered in traditional classroom or homeschool settings (though the curricula and assessments are increasingly distributed and accessed online). However, the elaborate statistical machinery that the development of differential theories left behind lives on in modern standardized tests (NRC, 2001). While such tests have greater consequences for K-12 and professional education than for higher education and open education, they are still quite influential.

We contend that differential theories live on in an additional way that may have even larger consequences for assessment in online and open education. Bruner (1996) convinced many that a great deal of teaching was driven by folk pedagogy: educators’ lay theories and tacit assumptions about how students learn. While Bruner identified four distinct folk pedagogies, he was referring mostly to K-12 educators (who have almost always had some formal preparation in pedagogy and usually some exposure to scientific theories of learning). Arguably, there are many educators and designers in online and open higher education who came to their roles via disciplinary expertise and who have little or no training in scientific theories of learning. In our experience, many such educators embrace a theory of learning that is loosely consistent with differential theories. This is represented in tacit assumptions that (a) their assessments capture meaningful knowledge, (b) higher scores are better, and (c) raising scores by any means short of cheating is desirable.

Our sentiments in this regard are captured nicely in the title of the computing education study by Sanders, Boustedt, Eckerdal, McCartney, and Zander (2017), “Folk Pedagogy: Nobody Doesn’t Like Active Learning.” We share their concern that higher education has broadly and enthusiastically embraced “active learning” (as well as “student-centered learning”) as a description of an instructional technique rather than a characterization of student learning. We also share their concern that many believe all active learning techniques are equally effective. As online and open learning are increasingly oriented to specific, measurable competencies, the relationship between how those competencies are gained and how they are assessed becomes increasingly important.

To illustrate this nuanced difference, we invite readers to imagine two students who earn equivalent scores on performance assessments in an introductory online course. One student was taught by a part-time instructor whose evaluations (and continued employment) were based entirely on scores on assessments (whose coverage is known to the instructor) and student course evaluations. Such an instructor is likely to focus primarily on the content on those assessments to support high marks (and presumably stronger evaluations) while skimming or bypassing other content. In contrast, the other student was taught by a tenured faculty member who was more concerned with preparing students for subsequent courses and was not terribly concerned with student course evaluations. Such a faculty member would be inclined to cover all course topics equally and treat the performance assessments as “snapshots” of what the students learned. The second student likely learned more (and possibly a lot more), but that knowledge is not captured in the assessment scores. Our point here is that educators and assessors whose practice is not grounded in a viable scientific theory may tacitly embrace a “folk-differential” theory and assume that “learning” is whatever their assessments capture.

Cognitive-Associationist Theories

Cognitive-associationist theories are rooted in and sometimes equated with behaviorism. But outside of K-12 education of students with special needs and the education of adults with profound disabilities, behaviorist theories have relatively little influence in contemporary education. Cognitive-associationist theories emerged when some leaders of the “cognitive revolution” (e.g., Anderson, 1980) retained the core assumption of behaviorism that knowledge consists of organized structures of many small associations. This assumption and the corresponding concern with cognitive load support traditional “mastery learning” approaches and more contemporary “expository” approaches (i.e., expose students to content, give them practice, and test that knowledge). These approaches are widely used in MOOCs (typically with video and automated quizzes) and are sometimes referred to in that context as “instructivist” theories (e.g., Falkner & Sheard, 2019). Indeed, the term “xMOOC” (after the popular edX MOOC platform) was coined to distinguish instructivist MOOCs from the “cMOOCs” described below. Associationist theories underpin most (but not all) “competency-based” approaches that are widely used in online and open education, and which have significant implications for assessment in these contexts (e.g., Aram, Mödritscher, Neumann, & Andergassen, 2019).

Significantly for open and online learning, cognitive-associationist theories underpin most intelligent tutoring systems (ITSs). As illustrated by du Boulay (Chap. 7, “Artificial Intelligence in Education and Ethics,” this volume) and Drachsler (Chap. 60, “The Rise of Multimodal Tutors in Education,” this volume), ITSs and artificial intelligence more generally have a prominent role in open and online learning. Associationist assumptions allow ITSs to use assessment evidence to maintain and constantly update a model of what each learner knows at a given time. When paired with a model of how learning about the topic typically or optimally progresses, ITSs are able to deliver the instructional content that learners are presumably most ready to learn.
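One common way such a learner model is maintained is Bayesian Knowledge Tracing (BKT), which updates the estimated probability that a learner has mastered a skill after each observed response. The sketch below shows that update logic in Python; the parameter values and the update_mastery function are illustrative assumptions rather than a description of any particular tutoring system.

```python
# Minimal Bayesian Knowledge Tracing (BKT) sketch; parameter values are illustrative.
P_INIT = 0.20   # prior probability that the learner already knows the skill
P_LEARN = 0.15  # probability of acquiring the skill at each practice opportunity
P_SLIP = 0.10   # probability of answering incorrectly despite knowing the skill
P_GUESS = 0.25  # probability of answering correctly without knowing the skill


def update_mastery(p_known: float, correct: bool) -> float:
    """Update the estimated probability of mastery after one observed response."""
    if correct:
        evidence = p_known * (1 - P_SLIP)
        total = evidence + (1 - p_known) * P_GUESS
    else:
        evidence = p_known * P_SLIP
        total = evidence + (1 - p_known) * (1 - P_GUESS)
    posterior = evidence / total
    # Allow for the chance that the learner acquired the skill during this opportunity.
    return posterior + (1 - posterior) * P_LEARN


# Example: a learner answers correct, correct, then incorrect on three practice items.
p = P_INIT
for response in (True, True, False):
    p = update_mastery(p, response)
    print(f"Estimated mastery: {p:.2f}")
```

When the mastery estimate crosses a threshold, the system can advance the learner to new content; when it remains low, the system can supply the additional practice or hints the learner is presumably most ready for.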

Because these theories assume that specific associations transfer relatively easily to new settings where they might be used, assessment of associationist learning is relatively unproblematic. In particular, selected-response items can be used to quickly and automatically assess whether students have formed those associations. Many assume that such item formats can only capture evidence of these more specific associations (e.g., Hirumi, 2014). However, selected-response items, particularly when developed and vetted by professionals, can require relatively sophisticated understanding and reasoning to consistently answer correctly. This issue quickly exceeds the scope of this chapter (but see Mislevy, 2018). The key arguments for the purposes of this chapter are that (a) the relationship between theories of learning and assessment format is not as straightforward as many assume, (b) the concerns over selected-response formats primarily reflect cognitive-constructivist theories of learning, and (c) the efficiency and automation afforded by selected-response formats offer advantages that should not be ignored.

Cognitive-Constructivist Theories

As argued in Greeno et al. (1996) and others, cognitive-constructivist theories are largely antithetical to associationist theories. Rather than specific associations, constructivist theories assume that knowledge consists of higher-order conceptual “schema” that the human mind (uniquely among animals) constructs when making sense of the world. Constructivist theories (a) became prominent in the 1980s, (b) are still widely embraced by many cognitive scientists and educational psychologists, (c) encompass numerous more specific theories including socio-constructivist theories, (d) have long been a driving force in calls for “alternative” assessments and assessment reforms (e.g., Wolf, Bixby, Glenn III, & Gardner, 1991), and (e) motivated much of the explosion of interest in formative assessment ignited by Black and Wiliam (1998/2000). Arguably, this class of theories was tacitly embraced and taken for granted by many until situative/sociocultural theories started becoming prominent around 2000.

A great deal of the discussion of assessment in open and online education embraces cognitive-constructivist and/or socio-constructivist theories. In particular, the influential community of inquiry (CoI) framework “embraces deep approaches rather than surface approaches to learning and aims to create conditions to encourage higher order cognitive processing” and “represents a process of creating deep and meaningful (collaborative-constructivist) learning experience through three interdependent elements—social presence, cognitive presence, and teaching presence” (Garrison & Akyol, 2013, p. 106). Drawing directly from CoI and constructivist theory, Conrad and Openo (2018) devoted an entire chapter to defining constructivist “authentic” assessment. They speak for many when they assert that:

Authentic assessments, especially in blended and online learning contexts, encourage students to take a deep approach to learning, provide necessary alignment for faculty to better determine the quantity and quality of student learning, and provide institutions with the evidence necessary to respond to external pressures regarding their ability to measure student learning outcomes. (p. 55)

Furthermore, many agree with Conrad and Openo’s characterization of all selected response formats as “inauthentic” and likely to encourage cheating (p. 101).

It is important to note that measurement theorists (e.g., Messick, 1994) have long pointed out that authentic and alternative assessments are “task-driven” (rather than “construct-driven”). This means that they may introduce “construct-irrelevant easiness” and “construct-irrelevant variance,” which pose significant threats to the validity of the resulting evidence as support for claims of achievement and expertise. Such assessments may capture evidence of what students “did” rather than what they will be able to “do” in subsequent contexts. Put differently, such assessments may inadvertently capture evidence of “near-transfer” or even “zero-transfer” rather than actual transfer of problem-solving skills or “far-transfer.” In terms of the assessment “levels” described in Hickey and Pellegrino (2005) and Hickey et al. (in review), special interpretive care is needed to ensure that performance assessments are functioning at the proximal or distal level rather than the immediate or close level and that portfolio assessments are assigned, completed, and scored in a manner that provides valid evidence of future performance.

Pragmatically speaking, so-called “authentic” online assessments (both formative and summative) often call for relatively extensive individualized private feedback. This is in part because it is challenging to replace the traditional “whole class” feedback session that can be quite efficient in physical settings. Furthermore, the subjective nature of scoring such assessments can lead to corrosive arguments with students over grades and marks. This feedback and these arguments can take precious instructor time away from more efficient public interaction and are sources of online instructor “burnout” (see Conceição & Lehman, 2011). As illustrated by the computer-adaptive assessments developed by the Smarter Balanced Assessment Consortium (SBAC, 2021), new psychometric models and technologies (e.g., Mislevy, 2018) now allow automated multi-part performance assessments. Nonetheless, such assessments require specialized expertise and are extremely expensive to create. While such investment may be manageable at the massive scale of MOOCs, these approaches are likely beyond the reach of most online educators for the foreseeable future.
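To illustrate part of what such computer-adaptive systems automate, the following is a minimal sketch of adaptive item selection, assuming a two-parameter logistic (2PL) item response model and a simple maximum-information rule; the item bank and parameter values are hypothetical, and operational systems such as those referenced above add far more elaborate scoring models and content constraints.

```python
import math

# Hypothetical item bank: (item_id, discrimination a, difficulty b) under a 2PL model.
ITEM_BANK = [
    ("item_1", 1.2, -1.0),
    ("item_2", 0.8,  0.0),
    ("item_3", 1.5,  0.5),
    ("item_4", 1.0,  1.2),
]


def prob_correct(theta: float, a: float, b: float) -> float:
    """2PL probability of a correct response at ability level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information the item provides at ability level theta."""
    p = prob_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def select_next_item(theta: float, administered: set):
    """Choose the unadministered item that is most informative at the current ability estimate."""
    candidates = [item for item in ITEM_BANK if item[0] not in administered]
    return max(candidates, key=lambda item: item_information(theta, item[1], item[2]))


# Example: with a provisional ability estimate of 0.3 and item_1 already given,
# the selector returns the remaining item that best reduces uncertainty.
print(select_next_item(theta=0.3, administered={"item_1"}))
```

Even this toy version hints at why such systems demand specialized expertise: the item parameters must be calibrated on large samples before adaptive selection is trustworthy.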

Situative/Sociocultural Theories

This fourth category of theories is rooted in the work of the early Soviet theorist Vygotsky (1934/1987). We emphasize the situative strand of this broader class of sociocultural theories to highlight the perspective that emerged from the Institute for Research on Learning in Palo Alto, CA, from 1986 to 2000 (e.g., Greeno, 1998). We do so to distinguish this category of theories from the work of many socio-constructivist assessment theorists who also reference Vygotsky (e.g., Conrad & Openo, 2018) and have helped popularize situative/sociocultural theories among proponents of open and online education. While not explicitly citing the influence of situative theories, Siemens’s (2005) theory of connectivism embraces many of the same assumptions while also addressing the massive influence of the Internet on the very nature of knowing and learning. The large influence of connectivism in open learning was signaled by the introduction of the term “cMOOC” to distinguish this approach from the more expository xMOOCs described above.

According to Greeno et al. (1996), assessment within this category of theories means “assessing participation in inquiry and social practices of learning,” “student participation in assessment,” and “design of assessment systems” (p. 39). Some considerations of e-portfolio assessment explicitly embrace situative theories (e.g., Batson, 2011; Habib & Wittek, 2007). From our perspective, the most important implication of situative theory is the way Greeno’s (1998, p. 17) “situative synthesis” reconciles the difference between individual activity and social activity. Cognitive-associationist and cognitive-constructivist theories reconcile this difference by characterizing social activity as aggregations of individual activity. However, this results in two incompatible characterizations of social activity, neither of which is capable of capturing the manner in which situative theories assume that knowledge is fundamentally “distributed” (i.e., “situated”) in social, cultural, and material contexts. In contrast, the situative synthesis uses a “dialectical” approach to resolve the difference between individual and social activity. From this perspective, the way that the human mind processes information (as in associationist theories) and the way that humans make sense of the world around them (as in constructivist theories) are both “special cases” of socially situated activity.

As argued in Hickey and Pellegrino (2005) and Hickey (2015), applying the situative synthesis to assessment and testing similarly makes it possible to characterize the entire range of assessment practices as special cases of socially situated activity. Doing so makes it possible to frame formative and summative educational assessment as specialized forms of discourse between educators and students and to frame external achievement tests as a specialized form of discourse between disciplinary experts and test takers. While these forms of discourse are certainly peculiar (if not downright bizarre), they serve narrow and potentially necessary functions in many if not most educational ecosystems. As outlined in Hickey and Harris (2021), this also provides a coherent framework for “aligning” formative and summative functions across increasingly formal levels of assessment and makes it possible to coherently assign formative and summative functions to the same assessment. This means, for example, that close-level assessments can serve a summative function for prior engagement while also serving a formative function for the same learner’s understanding of targeted concepts.

Conclusion

In summary, this chapter organized selected research relevant to assessment in open and online education around the dimensions of purpose/function and learning theory. We acknowledge that this is a novel way of organizing research and insights about assessment. We further acknowledge that this organization is rooted in our underlying embrace of situative theories of knowing and learning. We contend that this organization reveals crucial interactions between these dimensions that may undermine more specific goals of assessment practices as well as the broader enterprise of education. While situative theories of knowing and learning are widely appreciated in open and online learning, they receive relatively little attention in discussions of assessment beyond the work summarized above. Our arguments about the situative synthesis are not widely known or appreciated in the assessment literature more broadly.

The primary implication of our position is one of caution regarding constructivist arguments in support of “authentic” summative assessment formats and against “traditional” selected-response formats. Summative performance and portfolio assessments can generate unsustainable demands for private instructor-student interaction and take time away from more effective formative assessment and more efficient public instructor interaction. We suggest that selected-response assessments that are well-constructed, time-limited, non-searchable, uncompromised, and automatically scored can efficiently provide valid estimates of the extent to which learning in online and open courses is likely to transfer to subsequent educational, personal, and professional contexts. We close by suggesting that this argument presents a particularly promising direction for future research on assessment in open and online education.