Extending problem-solving procedures through reflection☆
Introduction
While some instruction has as its goal that the learner become skilled at just what is being taught, in many cases the goal is for the learner to be able to transfer what is learned to new situations. The literature abounds with demonstrations of both failed transfer (e.g., Bassok, 1990; Detterman, 1993; Gick and Holyoak, 1980) and near-total transfer (e.g., Bovair et al., 1990; Singley and Anderson, 1989). Educators properly anguish over the implications of these apparently contradictory results (e.g., Bransford and Schwartz, 1999; Carraher and Schliemann, 2002).
One of the reasons for the different perspectives on transfer is the wide variety of things that can transfer. These range from the transfer of highly proceduralized skills, such as moving from one kind of manual transmission to another, to what might better be called discovery, such as the connection made between the structure of the solar system and the structure of the atom. This paper will focus on a particular type of transfer: deriving new solution procedures by extending problem-solving procedures that one already knows. This type of transfer is particularly important in mathematics learning, which is the content focus of this paper. To take a modest example, children who learn the basic principles for solving equations need to apply them successfully to an infinite space of equations. To take a more ambitious example, mathematics education hopes that students will transfer what they learn in the classroom to being successful workers and informed citizens.
More specifically, this paper will consider situations where participants need to reflect on a known procedure and modify and replace parts of it. For instance, people often face such a situation when a favorite piece of software is upgraded. It is an explicit goal of the National Council of Teachers of Mathematics (NCTM) standards (Romberg, 1992) that students should be able to “generate new procedures and extend or modify familiar ones.”
This paper will develop a theory of procedural extension within the ACT-R theory (Anderson, 2007; Anderson et al., 2004; Salvucci, 2013; Taatgen et al., 2008) of procedure following. The ACT-R theory holds that both verbal procedural instructions and examples of procedures are initially encoded as declarative representations of problem-solving steps, which are retrieved and interpreted in solving a problem. Note that declarative encodings of procedures are not the sort of unconscious “procedures” that occupy much of the discussion about the procedural–declarative distinction in psychology (e.g., Cohen et al., 1997; Willingham et al., 1989). With enough practice, such declarative knowledge can be compiled into production rules in ACT-R, which are one form of unconscious procedures.
Recently, Taatgen (2013) has produced an ACT-R theory of transfer in which steps from one procedure automatically transfer to another procedure. This is not the reflective transfer considered here. This paper is concerned with situations where one consciously reflects on what one knows and how to extend that knowledge. A classic example would be Wertheimer’s (1945/1959) study of how children could use what they know of the area of rectangles to find the area of a parallelogram.
Section snippets
ACT-R, procedure following, and fMRI
As background for the current research, we will briefly review the ACT-R theory, how procedure following is modeled, and how the activity of components in the ACT-R theory has been related to fMRI measures. ACT-R 6.0 (Anderson, 2007) consists of a set of different modules whose interactions are controlled by a production system. Different modules are specialized to achieve specific goals. Of relevance to this paper, the Manual module programs the hands, the Visual module encodes visual input,
The challenge of modeling procedural extension
There has been a considerable history of ACT-R models successfully predicting activity in the regions of Fig. 1 (other than the RLPFC; see Anderson, 2007; Anderson et al., 2008, for reviews of the work). This comfortable picture of research success was upset when we decided to explore what happens when participants are asked to extend what they have been taught to do. One such task involves what are called pyramid problems, which are presented with a dollar symbol as the operator (e.g., 4$3 = X).
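The passage above does not spell out what the dollar operator computes. As an illustration only, the sketch below assumes the definition used in the authors' related pyramid studies (Anderson and Fincham, 2014), in which base$height denotes the sum of `height` integers that start at `base` and decrease by one, so that 4$3 = 4 + 3 + 2. The function name `pyramid` and its parameter names are hypothetical and do not come from the paper.

```python
def pyramid(base: int, height: int) -> int:
    """Evaluate base$height under the ASSUMED definition:
    the sum of `height` terms starting at `base`, each one less than the last."""
    return sum(base - i for i in range(height))


# 4$3 = 4 + 3 + 2
print(pyramid(4, 3))  # 9
```

Under this assumed definition, a regular problem asks for X given the base and height; problems that present the unknown in another position would require the learner to extend the procedure rather than simply run it.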
Pyramid experiments
We have collected data from 40 adults (ages 19–35) and 35 young adolescents (ages 12–14) solving these problems. Although the adults were more successful and somewhat faster than children (see Table 1), the two populations overlap (see Fig. 20). Their data are pooled but the results do not substantially change if the two populations are analyzed separately. A data set this large provides a basis for application of the state discovery procedures. The end of this paper will address the question
Step 1. Discovering mental states
This section describes an updated version of the model discovery process described in Anderson and Fincham (2014). This procedure is purely data-driven and is in no way specific to the ACT-R theory. Fig. 4 provides an overview of the state discovery procedure. The inputs to this state discovery procedure are the 20 PCA scores for each scan. The outputs are a set of parameters that describe the states and a description of these trials in terms of their state occupancy (probability of being in a
Step 2. Guiding an ACT-R Model
Treating each of 128 problems separately leaves open the question of what can be concluded generally about what participants are doing. To address this question, we developed an ACT-R model, guided by the differences in state durations for individual problems. Like other ACT-R models, this is a “full-task” model that addresses the visual encoding and motor processing as well as the cognitive aspects of the task. This is critical for explaining whole brain patterns of activation, because
Step 3. Refining the mental states
From the ACT-R model, we can obtain time estimates for each state for each problem by noting when the goal associated with that state was active. Fig. 16 presents a comparison of these predictions of the ACT-R model and the estimated state times from the 128-condition HMM. There is 0 correlation for the Encoding State because the ACT-R model predicts no variation in its duration across problems. Correspondingly, the variation is least for this state in the HMM times (standard deviations of .55 s
Step 4. Interpreting the fMRI data
Now we turn to what this imaging analysis can say about two interesting aspects of the experiment that we have ignored to this point. First, 40 of the participants were adults at Carnegie Mellon and the other 35 were children between the ages of 12 and 14. What were the differences between these two populations? Second, we have only considered correct responses. What were the differences between the state characteristics of correct and incorrect answers? To address these questions, we fit the
Conclusions
Our past experimental studies (e.g., Anderson et al., 2011; Wintermute et al., 2012) had shown that a wide network of regions becomes active when participants are challenged to extend their knowledge to solve a problem. The current research has shown that this pattern is not constant throughout problem solving but is concentrated around the period of time when participants retrieve a solution strategy and plan their procedure based on that. The ACT-R model has codified one sense of
Acknowledgments
This work was supported by the National Science Foundation Grant DRL-1007945, ONR Grant N000140910098, and a James S. McDonnell Scholar Award. We would like to thank Jelmer Borst and Aryn Pyke for their comments on the paper. We would also like to thank Jelmer Borst for his help in constructing Fig. 17.
References (58)
- et al. Left ventrolateral prefrontal cortex and the cognitive control of memory. Neuropsychologia (2007)
- et al. Approximations and consistency of Bayes factors as model dimension grows. Journal of Statistical Planning and Inference (2003)
- et al. Left, but not right, rostrolateral prefrontal cortex meets a stringent test of the relational integration hypothesis. Neuroimage (2009)
- et al. Assembling and encoding word representations: fMRI subsequent memory effects implicate a role for phonological control. Neuropsychologia (2003)
- AFNI: Software for analysis and visualization of functional magnetic resonance neuroimages. Computers and Biomedical Research, An International Journal (1996)
- et al. Analogical problem solving. Cognitive Psychology (1980)
- Deconvolution of impulse response in event-related BOLD fMRI. Neuroimage (1999)
- et al. Does learning of a complex task have to be complex? A study in learning decomposition. Cognitive Psychology (2001)
- et al. Neural networks underlying endogenous and exogenous visual–spatial orienting. Neuroimage (2004)
- et al. The visual word form area: Expertise for reading in the fusiform gyrus. Trends in Cognitive Sciences (2003)
- The neural basis of strategy and skill in sentence–picture verification. Cognitive Psychology
- Neural correlates of retrieval processing in the prefrontal cortex during recognition and exclusion tasks. Neuropsychologia
- Effects of repetition and competition on activity in left prefrontal cortex during word generation. Neuron
- Functional-anatomic correlates of remembering and knowing. Neuroimage
- An integrated model of cognitive control in task switching. Psychological Review
- Human symbol manipulation within an integrated cognitive architecture. Cognitive Science
- How can the human mind occur in the physical universe?
- Discovering the sequential structure of thought. Cognitive Science
- Cognitive and metacognitive activity in mathematical problem solving: Prefrontal and parietal patterns. Cognitive, Affective, and Behavioral Neuroscience
- An integrated theory of the mind. Psychological Review
- Using fMRI to test models of complex cognition. Cognitive Science
- Children’s knowledge of simple arithmetic: A developmental model and simulation
- Transfer of domain-specific problem solving procedures. Journal of Experimental Psychology: Learning, Memory, and Cognition
- Rostral prefrontal cortex and the focus of attention in prospective memory. Cerebral Cortex
- The acquisition and performance of text-editing skill: A cognitive complexity analysis. Human-Computer Interaction
- Rethinking transfer: A simple proposal with multiple implications. Review of Research in Education
- The brain's default network: Anatomy, function, and relevance to disease. Annals of the New York Academy of Sciences
- Analogical reasoning and prefrontal cortex: Evidence for separable retrieval and integration mechanisms. Cerebral Cortex
- The transfer dilemma. The Journal of the Learning Sciences
☆ The analyses and models in this paper can be obtained at http://act-r.psy.cmu.edu/?post_type=publications&p=16145.