Research on contextual control of behavior indicates that extinction is more context-specific than is initial acquisition (for a review, see Rosas, Todd, & Bouton, 2013). For instance, in a conditioned suppression experiment with rats reported by Bouton and King (1983), conditioned fear was first established by repeatedly pairing a tone (conditioned stimulus, CS) with shock (unconditioned stimulus, US) in one context (context A). In a second phase, fear responding was extinguished by presentations of the CS without the US. For one group of rats (Group Ext-A), this extinction treatment was conducted in the same context as the initial acquisition, whereas another group (Ext-B) received extinction in a second context (context B). Bouton and King observed no difference in the rates of extinction between the groups, indicating that acquisition performance generalized perfectly across contexts. However, when the animals in Group Ext-B were shifted from context B back to context A in a final test phase, the initially learned fear recovered (ABA renewal), indicating that the context switch disrupted extinction performance (for similar results in human predictive learning, see Paredes-Olay & Rosas, 1999).

Besides ABA renewal, the recovery of an extinguished response has also been observed when acquisition, extinction, and testing were conducted in different contexts (ABC renewal; e.g., Bouton & Bolles, 1979; Üngör & Lachnit, 2008) or when acquisition and extinction took place in the same context but testing occurred in a different one (AAB renewal; e.g., Bouton & Ricker, 1994). Importantly, the existence of ABC and AAB renewal indicates that initial acquisition generalizes more readily to novel contexts than does extinction. This difference in context dependency between acquisition and extinction has rather troubling implications for behavior therapy. It suggests that if someone acquires pathological fear in a particular context, this fear will follow the person into a variety of other contexts, whereas therapeutic successes in overcoming the fear will remain tied to the therapeutic environment.

According to Bouton (1993, 1994, 1997, 2004), extinction is especially context-specific because it is the second information learned about a stimulus. He assumed that the excitatory association established during initial acquisition remains intact during extinction, but is counteracted by the growth of a second, inhibitory link between the CS and the US. Thus, the significance of a CS becomes ambiguous in the sense that different associations are related to the stimulus. In order to resolve this ambiguity, Bouton proposed that organisms utilize contextual information to gate behavioral output. More precisely, he assumed that retrieval of the second-learned association requires gating by the context of extinction, whereas the first-learned association is coded independently of contextual stimuli. To explain the difference in context processing during acquisition and extinction, Bouton (1997, 2004) suggested that organisms might ignore contextual information as long as the significance of a CS is unambiguous, which is the case during initial acquisition, because in this situation the context provides no information for solving the task. However, the ambiguity in the significance of a CS caused by extinction might encourage organisms to pay attention to the actual context, leading to context-specific processing of ambiguous information (for a similar idea, see Darby & Pearce, 1995).

The memory-retrieval account proposed by Bouton (1993, 1994, 1997, 2004) is able to deal with all three different types of renewal (ABA, ABC, and AAB). However, the model is challenged by demonstrations of the context specificity of simple excitatory learning (e.g., Hall & Honey, 1989, 1990; Lucke, Lachnit, Koenig, & Uengoer, 2013; Rosas, García-Gutiérrez, & Callejas-Aguilera, 2006; Üngör & Lachnit, 2006) and the context dependency of latent inhibition (e.g., Channell & Hall, 1983).

Bouton’s idea that the context specificity of extinction results from an increased processing of contextual stimuli during extinction learning was extended by Rosas and colleagues (Rosas, Callejas-Aguilera, Ramos-Álvarez, & Abad, 2006) in their attentional theory of context processing (ATCP). They pointed out that once an organism starts to pay attention to the context, any information learned within the context should be processed in a way that makes it context-specific, regardless of whether first-learned or second-learned associations are concerned. Thus, according to Rosas, Callejas-Aguilera, et al., contextual control of behavior is not a matter of the type of learning involved (acquisition or extinction), but rather depends on factors determining the amount of attention directed toward contextual stimuli.

Besides CS ambiguity, ATCP (Rosas, Callejas-Aguilera, et al., 2006) assumes that the informational value of contexts also plays a central role in the modulation of attention to contexts. For instance, in an experiment reported by Preston, Dickinson, and Mackintosh (1986), one group of rats (Group Cond) acquired a conditional discrimination for which contexts were relevant to solve the task. Animals in this group were trained to respond (lever-press) during tone presentation (S1+) and not to respond during clicker presentation (S2–) in context A, whereas in context B the contingencies for S1 and S2 were reversed (S1–, S2+). For a second group (Group Disc), initial training consisted of a simple discrimination for which contexts were irrelevant—these rats received S1+ and S2– trials in both contexts. In addition, animals in both groups were reinforced for responding during light presentation (S3+) in context A. Subsequent to successful acquisition, responding to S3 was extinguished for all animals. For half of the animals in each group, extinction took place in context A, and for the other half, in context B. The authors observed that extinction of S3 proceeded faster in context B than in context A for Group Cond, for which contexts were relevant to solve the discrimination between S1 and S2 during the acquisition stage. However, the rate of extinction was independent of the contexts in Group Disc, for which contextual stimuli were irrelevant. This finding indicates that first-learned associations are less context-specific when acquired in a situation in which contextual stimuli are irrelevant rather than relevant.

The impact of context relevance on context-specific learning was also demonstrated in humans using different procedures, including instrumental (León, Abad, & Rosas, 2010; León, Gámez, & Rosas, 2012) and predictive learning tasks (León, Abad, & Rosas, 2008; Lucke et al., 2013). In one of these experiments (Lucke et al., 2013, Exp. 2), eye-gaze positions of the participants were recorded to assess overt attention to contexts during learning. Lucke et al. observed longer dwell times on contexts that were relevant than on those trained as irrelevant, supporting the interpretation proposed by Rosas, Callejas-Aguilera, et al. (2006) in terms of attentional processes.

Thus, the literature provides strong support for the idea that relevant contexts receive more attention than do contexts that are irrelevant, leading to stronger context-specific encoding of the information learned in the former than in the latter type of contexts (Rosas, Callejas-Aguilera, et al., 2006). All of these studies, however, focused on the context specificity of acquisition learning (León et al., 2008, 2010; León et al., 2012; Lucke et al., 2013; Preston et al., 1986). Therefore, the purpose of the present experiments was to investigate the role of the informational value of contexts for the formation of context-specific extinction learning. According to Rosas, Callejas-Aguilera, et al., context specificity of acquisition and extinction are governed by the same mechanisms. Thus, extinction also should be more context-dependent when conducted in relevant rather than irrelevant contexts. Alternatively, however, the main function of context processing during extinction might be to enable a disambiguation of the significance of the CS. Therefore, extinction might encourage the processing of context stimuli regardless of whether the context was previously trained as relevant or as irrelevant.

In each of the present experiments, we used a predictive-learning scenario in which participants were instructed to assume the role of a medical doctor whose patient often suffers from stomach trouble after the consumption of meals in restaurants. The task was to predict the occurrence of this stomach trouble. On each trial, participants were presented with one of several cues (food types) in one of several contexts (restaurants) and were asked to predict the patient’s reaction. Each context was composed of two elements: a color spot and a picture of an animal (see Fig. 1). During the learning phases of each experiment (Phases 1–3), each trial ended with information about whether stomach trouble had occurred (+) or not (–).

Fig. 1

Example of a trial during the experiments. The pictures in the upper corners of the screen (from left to right) show the color spot and the picture of an animal. In the middle of the screen, the food cue was displayed. The two buttons, labeled “Yes, I expect stomach trouble” and “No, I do not expect stomach trouble” (in German), were presented below the food cue.

Experiment 1

Table 1 illustrates the design for the two groups of Experiment 1. In this experiment, the context color (dimension A) was either blue (A1) or pink (A2), and the context animal (dimension B) was either a whale (B1) or a bear (B2). During Phase 1 (acquisition), all participants received training with a target cue Z+ in context A1B1. In a subsequent phase (context training), half of the participants (Group Col-Rel/Ani-Irrel, which is short for Color-Relevant/Animal-Irrelevant) were trained with a conditional discrimination that was solvable on the basis of the colors of the contexts (A1 and A2), but not on the basis of the animals related to the contexts (B1 and B2). Participants in this group received X+ and Y– trials in contexts A1B1 and A1B2, together with X– and Y+ trials in contexts A2B1 and A2B2. For the other half (Group Ani-Rel/Col-Irrel, short for Animal-Relevant/Color-Irrelevant), training in Phase 2 comprised a conditional discrimination for which dimension B of the contexts was relevant and dimension A irrelevant. The participants in Group Ani-Rel/Col-Irrel received X+ and Y– trials in contexts A1B1 and A2B1, together with X– and Y+ trials in contexts A1B2 and A2B2.

Table 1 Design of Experiment 1

During Phase 3 (extinction), all participants were presented with Z– trials in context A2B2. Thus, the extinction of Z took place in a context differing on both dimensions from the context of initial acquisition. In a final test, Z was presented in its extinction context A2B2 and in a context that differed from the extinction context only on dimension A, but not on dimension B: context A1B2.

The context training conducted in Phase 2 should encourage the participants in Group Col-Rel/Ani-Irrel to pay more attention to dimension A than to dimension B, whereas in Group Ani-Rel/Col-Irrel, dimension B should capture more attention than dimension A. According to Rosas, Callejas-Aguilera, et al. (2006), this difference between the groups in the distribution of attention across the context elements should lead to differences in the context specificity of extinction learning about Z in Phase 3. In Group Col-Rel/Ani-Irrel, extinction of Z should be linked more strongly to context element A2 than to the element B2, whereas in Group Ani-Rel/Col-Irrel, extinction should be more dependent on B2 than on A2. As a consequence, changing the context of extinction only on dimension A during the test phase should lead to a stronger disruption of extinction performance in Group Col-Rel/Ani-Irrel than in Group Ani-Rel/Col-Irrel.

Our experimental design also included control trials, to assess whether responding to Z during Phase 3 indeed reflected extinction learning. It might be possible that the context training conducted in Phase 2 would lead to forgetting of the cue–outcome relations trained in Phase 1. As a consequence, participants might treat cue Z as a novel cue when it was presented in Phase 3. To assess this possibility, the training schedule in Phase 3 included trials with a novel cue N– in context A1B2. If our participants remembered acquisition training with Z from Phase 1 when presented with this cue at the outset of Phase 3, then responding to Z should be stronger than that to the novel cue N.

Method

Participants

A group of 72 students from Philipps-Universität Marburg (50 women, 22 men; mean age = 21.9 years, age range 18–42) participated voluntarily in the experiment and received either course credit or payment (€2.30 [US $3]). They gave their informed consent before starting the experiment. The participants were randomly allocated to two different experimental groups of equal size as they arrived in the experimental room. All were tested individually and needed approximately 20 min to complete the experiment.

Apparatus and stimuli

The stimuli, instructions, and further necessary information were presented on a computer screen. Participants interacted with the computer by using the mouse. The following food types were used as cues N, X, Y, Z, and F1 to F5: avocado, banana, broccoli, grapes, orange, pear, pepper, pineapple, strawberry, tomato, and zucchini. The assignments of the different food types to cues N, X, Y, Z, and F1 to F5 were implemented randomly for each participant. Four pairs of stimuli served as the contexts. Each pair consisted of a color spot and a picture of an animal. The four pairs were a blue spot and the picture of a whale (context A1B1), a blue spot and the picture of a bear (context A1B2), a pink spot and the picture of a whale (context A2B1), and a pink spot and the picture of a bear (context A2B2). Participants were instructed that each pair represents the name of a restaurant. The two different outcomes were the occurrence (+) or nonoccurrence (–) of stomach trouble.

Procedure

Each participant was initially asked to read the following instructions on the screen (in German):

Our study is concerned with the questions of how people learn about relationships between different events. Imagine that you are a medical doctor and that one of your patients often suffers from stomach trouble after meals. Your task is to discover what causes this stomach trouble that your patient is suffering from.

Your patient likes to go out for meals. Blue Whale, Pink Whale, Blue Bear, and Pink Bear are your patient’s favorite places. You will be told which restaurant your patient has visited each day and which foods your patient has eaten there. Please look carefully at the foods and the respective restaurant. Thereafter, you will be asked to predict whether the patient suffers from stomach trouble. For this prediction, please click on the appropriate response button. After you have made your prediction, you will be informed whether your patient actually suffers from stomach trouble or not.

Use this feedback to find out what causes the stomach trouble that your patient is suffering from. Obviously, at first you will have to guess because you do not know anything about your patient. But eventually you will learn which causes lead to stomach trouble in this patient and you will be able to make correct predictions.

For all of your answers, accuracy rather than speed is essential. Please do not take any notes during the experiment. If you have any more questions, please ask them now. If you do not have any questions, please start the experiment by clicking on the Next button.

When a participant asked a question, it was answered by the experimenter. After the participant clicked on the Next button, learning Phase 1 commenced.

On each learning trial (for an example, see Fig. 1), two context elements (a color spot and a picture of an animal) were presented side by side on the top half of the screen. The color spot was shown on the left and the picture of the animal on the right. The phrases “The patient ate at the restaurant” and “the following food type” were presented above and below the context elements, respectively. At the center of the screen, a picture of a single food type was shown. Participants were asked to predict whether or not their patient would suffer from stomach trouble after eating the particular food. They made their predictions by clicking on one of two buttons, labeled “Yes, I expect stomach trouble” or “No, I do not expect stomach trouble.” Immediately after participants had responded, a feedback window appeared, telling whether or not the patient had suffered from stomach trouble. The feedback that the patient had no stomach trouble appeared in green font, the feedback that stomach trouble had occurred was written in red font. Participants had to confirm that they had read the feedback by clicking on an “OK” button. Thereafter, the next trial started.

Each of the two groups worked on three learning phases and a test phase (see Table 1). In Phase 1 (acquisition), all participants were presented with Z+ and F1– trials in context A1B1, together with F2+ and F3– trials in context A2B2. During Phase 2 (context training), the participants in Group Col-Rel/Ani-Irrel were trained with a conditional discrimination, with X+ and Y– trials in contexts A1B1 and A1B2, together with X– and Y+ trials in contexts A2B1 and A2B2. In contrast, the participants in Group Ani-Rel/Col-Irrel received a conditional discrimination with X+ and Y– trials in contexts A1B1 and A2B1, together with X– and Y+ trials in contexts A1B2 and A2B2. The training of the conditional discrimination in either group was continued in Phase 3 (extinction). In addition, all participants were trained in Phase 3 with N– and F4+ trials in context A1B2, together with Z– and F5+ trials in context A2B2. In each of the three learning phases, each of the cue/context combinations was presented ten times. Thus, Phase 1 consisted of 40 trials in total, Phase 2 of 80 trials, and Phase 3 of 120 trials. The three learning phases followed each other seamlessly, and the transition was not signaled to the participants. Training of the cues F1 to F5 ensured that each context was associated with the occurrence of stomach trouble on 50% of the trials.
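The trial counts and the 50% outcome rate per context follow directly from the design. The sketch below encodes the cue/context combinations named in the text and recomputes both quantities; the data layout and the function names (`phase_designs`, `trial_count`, `reinforced_share`) are illustrative assumptions, not part of the original experimental software.

```python
# Cue/outcome labels follow the text: "+" = stomach trouble, "-" = no stomach trouble.
PHASE_1 = {"A1B1": ["Z+", "F1-"], "A2B2": ["F2+", "F3-"]}

# Conditional discrimination trained in Phases 2 and 3, separately for each group.
DISCRIMINATION = {
    "Col-Rel/Ani-Irrel": {
        "A1B1": ["X+", "Y-"], "A1B2": ["X+", "Y-"],
        "A2B1": ["X-", "Y+"], "A2B2": ["X-", "Y+"],
    },
    "Ani-Rel/Col-Irrel": {
        "A1B1": ["X+", "Y-"], "A2B1": ["X+", "Y-"],
        "A1B2": ["X-", "Y+"], "A2B2": ["X-", "Y+"],
    },
}

# Additional Phase 3 trial types, identical for both groups.
PHASE_3_EXTRA = {"A1B2": ["N-", "F4+"], "A2B2": ["Z-", "F5+"]}

PRESENTATIONS = 10  # each cue/context combination is shown ten times per phase


def phase_designs(group):
    """Return the cue/context combinations of Phases 1 to 3 for one group."""
    discrimination = DISCRIMINATION[group]
    phase_3 = {ctx: list(cues) for ctx, cues in discrimination.items()}
    for ctx, cues in PHASE_3_EXTRA.items():
        phase_3[ctx] = phase_3[ctx] + cues
    return {1: PHASE_1, 2: discrimination, 3: phase_3}


def trial_count(design):
    """Total number of trials in a phase."""
    return PRESENTATIONS * sum(len(cues) for cues in design.values())


def reinforced_share(design):
    """Proportion of trial types followed by stomach trouble, per context."""
    return {
        ctx: sum(cue.endswith("+") for cue in cues) / len(cues)
        for ctx, cues in design.items()
    }


if __name__ == "__main__":
    for group in DISCRIMINATION:
        designs = phase_designs(group)
        print(group, [trial_count(designs[p]) for p in (1, 2, 3)])
        print(reinforced_share(designs[3]))
```

Under these assumptions, both groups yield 40, 80, and 120 trials, and every context used in a phase is paired with stomach trouble on half of its trial types.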

The extinction phase was followed by a series of test trials. This test was introduced by the following instructions: “From now on the feedback of whether your patient actually suffers from stomach trouble will be omitted. Nevertheless, please try to predict the occurrence or nonoccurrence of stomach trouble as accurately as possible.” The test trials were identical to the learning trials, with the exception that the feedback window was omitted. Participants in both groups received four presentations of Z in context A2B2 and four presentations of Z in context A1B2, so that the test phase consisted of eight trials in total.

For both groups, each learning phase was divided into five blocks, and the test phase into two blocks. The order of presentation of the trials within each block was determined randomly for each block and each participant. Within each block, each of the trial types trained in a phase was presented on two occasions, except for the first two blocks of the context training conducted in Phase 2.

In order to facilitate acquisition of the conditional discrimination in Phase 2, each of the first two blocks in this phase only included a subset of the trial types. One of these blocks only comprised the trial types including X (X in contexts A1B1, A1B2, A2B1, and A2B2), and the other block only the trial types including Y (Y in contexts A1B1, A1B2, A2B1, and A2B2). The sequence of training with X and Y across the two blocks was counterbalanced across participants. Within each of the first two blocks of Phase 2, each trial type was presented on four occasions. In each of the remaining blocks, the eight cue–context combinations of the conditional discrimination were trained in an intermixed fashion.
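The block structure of Phase 2 can be sketched as follows. This is a minimal illustration assuming a simple shuffle within each block; the function name `phase2_blocks` and its parameters are hypothetical, and the original software may have randomized trials differently.

```python
import random
from collections import Counter

CONTEXTS = ["A1B1", "A1B2", "A2B1", "A2B2"]


def phase2_blocks(first_cue="X", rng=None):
    """Build the five Phase 2 blocks as lists of (cue, context) trials.

    Blocks 1 and 2 each contain a single cue in all four contexts, four
    presentations per trial type. Blocks 3 to 5 intermix all eight
    cue/context combinations, two presentations per trial type.
    """
    if rng is None:
        rng = random.Random()
    second_cue = "Y" if first_cue == "X" else "X"
    blocks = []
    for cue in (first_cue, second_cue):
        block = [(cue, ctx) for ctx in CONTEXTS] * 4
        rng.shuffle(block)
        blocks.append(block)
    for _ in range(3):
        block = [(cue, ctx) for cue in ("X", "Y") for ctx in CONTEXTS] * 2
        rng.shuffle(block)
        blocks.append(block)
    return blocks


if __name__ == "__main__":
    blocks = phase2_blocks("X", random.Random(0))
    print([len(block) for block in blocks])
    print(Counter(trial for block in blocks for trial in block))
```

Each block contains 16 trials, and each of the eight trial types occurs ten times across the phase (4 + 3 × 2), which matches the 80 trials stated for Phase 2. Passing `first_cue="Y"` gives the counterbalanced order.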

Results and discussion

For this and the subsequent experiment, the .05 level of significance was used in all statistical tests, and the degrees of freedom were corrected with the Box (1954) method where appropriate. Unless stated otherwise, we used partial eta-squared as the measure of effect size.
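For readers who wish to check the reported effect sizes, partial eta-squared can be recovered from an F ratio and its degrees of freedom through the standard identity SS_effect / (SS_effect + SS_error) = (F × df_effect) / (F × df_effect + df_error). The helper below is an illustrative sketch with a hypothetical function name; it assumes that the effect is tested against a single error term.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared implied by an F ratio and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # Main effect of time in Phase 1 acquisition: F(1, 70) = 56.85
    print(round(partial_eta_squared(56.85, 1, 70), 2))
```

Applied to the main effect of time in Phase 1 acquisition, F(1, 70) = 56.85, the function returns approximately .45.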

Acquisition in Phase 1

The left-hand panel of Fig. 2 presents the percentages of participants who predicted stomach trouble for Z+ in context A1B1 across the ten trials of the acquisition phase for each group. Black squares represent the data from Group Col-Rel/Ani-Irrel, and white squares the data from Group Ani-Rel/Col-Irrel. As can be seen, no differences in responding to Z occurred between the groups.

Fig. 2

The left-hand panel shows the percentages of participants who predicted stomach trouble for Z+ in context A1B1 across the ten trials of the acquisition phase of Experiment 1, separately for Group Col-Rel/Ani-Irrel (black squares) and Group Ani-Rel/Col-Irrel (white squares). The right-hand panel shows the percentages of participants who predicted stomach trouble for Z– and N– across the ten trials of the extinction phase, separately for Groups Col-Rel/Ani-Irrel and Ani-Rel/Col-Irrel. Error bars denote standard errors of the means.

To assess acquisition performance in Phase 1, we calculated for each participant the mean percentage of stomach trouble predictions collapsed across the first four trials with stimulus Z (beginning) and the mean percentage of stomach trouble predictions collapsed across the last four trials with Z (end). A 2 × 2 repeated measures analysis of variance (ANOVA) was conducted, including the within-subjects factor Time (beginning vs. end) and the between-subjects factor Group (Col-Rel/Ani-Irrel vs. Ani-Rel/Col-Irrel). The analysis revealed a main effect of time, F(1, 70) = 56.85, p < .001, $\eta_p^2$ = .45, indicating an increase of stomach trouble predictions to Z over the course of training. The main effect of group and all related interactions were not significant (all Fs < 1.32, ps > .254), confirming no differences between the two groups in performance to Z.
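The beginning and end scores entering this analysis can be computed as sketched below. The function name, the 0/1 coding of responses, and the example response sequence are assumptions made for illustration only; they are not data from the experiment.

```python
def beginning_end_scores(responses, window=4):
    """Percentage of stomach trouble predictions on the first and last trials.

    responses: predictions for one cue in trial order, coded 1 for
    "stomach trouble" and 0 for "no stomach trouble".
    """
    if len(responses) < 2 * window:
        raise ValueError("need at least 2 * window trials for separate windows")
    beginning = 100 * sum(responses[:window]) / window
    end = 100 * sum(responses[-window:]) / window
    return beginning, end


if __name__ == "__main__":
    # Hypothetical participant, ten acquisition trials with Z+
    hypothetical = [0, 1, 0, 1, 1, 1, 1, 1, 1, 1]
    print(beginning_end_scores(hypothetical))
```

With ten trials per cue and a window of four, the two windows do not overlap, so the two scores are based on separate trials.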

Context training in Phase 2 and Phase 3

Table 2 depicts the results for the training of the conditional discrimination across the five blocks of Phase 2 and the five blocks of Phase 3. For each block, we calculated the mean percentage of correct predictions across the 16 consecutive trials of the conditional discrimination. A 2 × 5 × 2 (Phase [2, 3] × Block [1, 2, 3, 4, 5] × Group [Col-Rel/Ani-Irrel, Ani-Rel/Col-Irrel]) ANOVA revealed a main effect of phase, F(1, 70) = 4.74, p = .033, $\eta_p^2$ = .06, as well as a main effect of block, F(4, 280) = 4.83, p = .001, $\eta_p^2$ = .07, indicating that the mean percentages of correct predictions increased in the course of training. Neither the main effect of group, F(1, 70) = 3.42, p = .069, $\eta_p^2$ = .05, nor the interactions including this factor were significant, Fs < 1.

Table 2 Mean percentages of correct predictions across 16 consecutive trials of the conditional discrimination for each of the five blocks of Phase 2 and each of the five blocks of Phase 3 in Experiment 1 (standard errors within brackets)

Extinction in Phase 3

The right-hand panel of Fig. 2 presents the percentages of participants who predicted stomach trouble for Z– and N– across the ten trials of the extinction phase for each group. As the figure demonstrates, responding to Z– decreased over the course of training in each group, and no differences in responding occurred between the groups.

To assess extinction performance to Z in Phase 3, we calculated for each participant the mean percentage of stomach trouble predictions collapsed across the first four trials with Z (beginning) and the mean percentage of stomach trouble predictions collapsed across the last four trials with Z (end). A 2 × 2 (Time [beginning, end] × Group [Col-Rel/Ani-Irrel, Ani-Rel/Col-Irrel]) ANOVA yielded a significant main effect of time, F(1, 70) = 113.65, p < .001, $\eta_p^2$ = .62, showing a decrease in stomach trouble predictions over the course of Phase 3. Neither the main effect of group nor any of the interactions including this factor were significant (all Fs < 1.24, ps > .269), reflecting no evidence of differences in responding to Z between the groups.

The right-hand panel of Fig. 2 also shows that responding during the first trial of Phase 3 was stronger for Z– than for N– in each group. To compare the performance to Z– and N– at the outset of Phase 3, we analyzed responding to the first presentation of each trial type using a McNemar test with prediction (stomach trouble vs. no stomach trouble) and stimulus (Z vs. N) as the dichotomous variables. The results showed that more participants predicted stomach trouble to Z than to N, $\chi^2$(1, N = 72) = 4.11, p = .041, Φ = .24 (effect size Phi), confirming that participants successfully recalled acquisition learning about Z at the outset of the extinction phase.
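The McNemar statistic depends only on the discordant pairs, that is, on participants who predicted stomach trouble for one cue but not for the other. The sketch below shows the uncorrected statistic and the conversion of a chi-square value to Phi. The function names and the example counts are hypothetical, because the cell counts are not reported, and whether a continuity correction was applied in the reported analysis is not stated.

```python
import math


def mcnemar_chi2(b, c):
    """Uncorrected McNemar chi-square from the two discordant cell counts.

    b: participants predicting stomach trouble for Z but not for N
    c: participants predicting stomach trouble for N but not for Z
    """
    if b + c == 0:
        raise ValueError("at least one discordant pair is required")
    return (b - c) ** 2 / (b + c)


def phi_from_chi2(chi2, n):
    """Effect size Phi for a chi-square value with 1 df and sample size n."""
    return math.sqrt(chi2 / n)


if __name__ == "__main__":
    # Hypothetical discordant counts, for illustration only
    print(mcnemar_chi2(15, 5))
    # Reported chi-square value and sample size from the text
    print(round(phi_from_chi2(4.11, 72), 2))
```

With the reported chi-square value of 4.11 and N = 72, the conversion gives Phi of approximately .24, in line with the reported effect size.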

Contextual control during the test phase

Figure 3 depicts responding to Z in contexts A1B2 and A2B2 during the test phase in terms of the mean percentages of stomach trouble predictions. The left-hand bars show the data for Group Col-Rel/Ani-Irrel, and the right-hand bars present the data for Group Ani-Rel/Col-Irrel. Within each group, the black bar depicts responding in context A1B2, and the white bar, responding in context A2B2.

Fig. 3

Mean percentages of stomach trouble predictions for Z in context A1B2 and context A2B2 during the test phase of Experiment 1, collapsed across the four presentations of each trial type, separately for Group Col-Rel/Ani-Irrel and Group Ani-Rel/Col-Irrel. Error bars denote standard errors of the means.

As can be seen, the participants in Group Col-Rel/Ani-Irrel responded more strongly to Z in context A1B2 than in context A2B2, whereas no difference in responding to Z was apparent between the contexts for Group Ani-Rel/Col-Irrel. This was confirmed by a 2 × 2 (Context [A1B2, A2B2] × Group [Col-Rel/Ani-Irrel, Ani-Rel/Col-Irrel]) ANOVA, revealing a significant main effect of context, F(1, 70) = 5.69, p = .020, $\eta_p^2$ = .08, indicating that responding to Z was stronger in context A1B2 than in A2B2, and a significant Context × Group interaction, F(1, 70) = 4.28, p = .042, $\eta_p^2$ = .06.

Simple main effects of context at each level of the Group factor were calculated in order to further analyze the Context × Group interaction, revealing a significant main effect of context for Group Col-Rel/Ani-Irrel, F(1, 70) = 9.92, p = .002, $\eta_p^2$ = .12, but not for Group Ani-Rel/Col-Irrel, F < 1. This result shows that only participants in Group Col-Rel/Ani-Irrel responded significantly more to Z in context A1B2 than in context A2B2.

Overall, following the extinction of Z in context A2B2, extinction performance was disrupted by a partial change of the extinction context when the context change was based on the context dimension that was trained as being relevant for the conditional discrimination. However, when the extinction context was partially changed on the irrelevant context dimension of the conditional discrimination, extinction performance was not affected. These findings are consistent with the idea that the relevant context elements received more attention than those that were irrelevant, and that this difference in attention made it easier for the relevant context elements to gain behavioral control than for the irrelevant context elements.

Note, however, that the present experiment is silent about the specific way in which differences in attention between relevant and irrelevant contexts might arise. The training of the conditional discrimination in the present experiment might have increased attention to relevant context elements, decreased attention to irrelevant context elements, or both. In a similar vein, the experiment cannot reveal whether the difference in performance during the test phase between the groups was a consequence of enhanced renewal in Group Col-Rel/Ani-Irrel, suppressed renewal in Group Ani-Rel/Col-Irrel, or both (for a more detailed discussion, see General Discussion). Nonetheless, our results are consistent with the principles proposed by ATCP (Rosas, Callejas-Aguilera, et al., 2006).

However, an alternative explanation for the results of Experiment 1 in terms of rule induction would require no recourse to the assumption that attentional processes modulated the context specificity of extinction learning in the present experiment. Assume, for instance, that the participants in each group derived from the training of the conditional discrimination specific rules about the relationship between changes in the significance of cues and contextual variations. In Group Col-Rel/Ani-Irrel, participants might have extrapolated the rule that the meaning of a cue changes when there is a shift in the color of the context. According to this rule, a stimulus, for instance, that is associated with no stomach trouble in the presence of context element A2 will be followed by stomach trouble when presented together with context element A1. Correspondingly, participants in Group Ani-Rel/Col-Irrel might have derived the rule that the meaning of a cue varies together with the context animal. Thus, a stimulus followed by no stomach trouble in the presence of context element B2 will cause stomach trouble when accompanied by context element B1. Having learned that stimulus Z was followed by no stomach trouble in context A2B2 in Phase 3, an application of these rules to the test trials with Z in context A1B2 would lead to the prediction of stomach trouble in Group Col-Rel/Ani-Irrel, whereas the participants in Group Ani-Rel/Col-Irrel should continue to predict the absence of stomach trouble. Given this alternative explanation of the present results, we conducted a second experiment in order to test the rule account against the attentional account proposed by Rosas, Callejas-Aguilera, et al. (2006).

Experiment 2

The design of Experiment 2 is summarized in Table 3. As in Experiment 1, each context was composed of a color spot and a picture of an animal. However, in contrast to the previous experiment, the assignment of color and animal to the context dimensions A and B was counterbalanced across participants (for details, see the Method section below).

Table 3 Design of Experiment 2

Following acquisition training with Z+ in context A1B1, participants received a conditional discrimination between X and Y for which context dimension A was relevant, and context dimension B irrelevant. Participants were trained with X+ and Y– trials when these cues were accompanied by context element A1, whereas in the presence of either context element A2 or A3, the training consisted of trials with X– and Y+. Thus, in contrast to the conditional discrimination trained in Experiment 1, not every change in the value of the relevant context dimension A was accompanied by shifts in the cue–outcome contingencies. Only contextual manipulations involving a shift between the context elements A1 and A2 or between A1 and A3 were associated with changes in the significance of the cues. However, when the context change involved a shift between the context elements A2 and A3, the meaning of the cues remained unchanged.

After the training of the conditional discrimination, participants received extinction trials with Z– in context A2B2. Finally, Z was tested for response recovery in a context that differed from the extinction context only on the irrelevant context dimension B (A2B1) and in a context differing from the extinction context only on the relevant context dimension A (A3B2).

If extinction of Z is linked more strongly to the context element A2 than to the context element B2, as is predicted by the attentional account proposed by Rosas, Callejas-Aguilera, et al. (2006), extinction performance during the test phase should be disrupted more strongly when Z is presented in context A3B2 than when it is shown in context A2B1. Alternatively, if participants respond during the test phase according to the knowledge that they derived from the training of the conditional discrimination, responding to Z should not differ across the test contexts. Having learned that Z is followed by no stomach trouble in context A2B2, participants should continue to predict the absence of stomach trouble, regardless of whether Z is presented in context A2B1 or in context A3B2.
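The prediction of the rule account can be spelled out in a minimal sketch. The encoding below is our own illustrative assumption (the names PHASE2, shift_changes_meaning, and rule_prediction are hypothetical). It records the Phase 2 contingencies described above and applies a rule of the kind considered for Experiment 1, namely that the meaning of a cue flips only when a context shift was accompanied by a change in cue–outcome contingencies during training.

```python
# Phase 2 contingencies of Experiment 2 (True = stomach trouble).
# Only the element on dimension A matters; dimension B was trained as irrelevant.
PHASE2 = {
    "A1": {"X": True, "Y": False},
    "A2": {"X": False, "Y": True},
    "A3": {"X": False, "Y": True},
}


def shift_changes_meaning(a_from, a_to):
    """Hypothetical rule: a shift matters only if trained cue meanings differ."""
    return PHASE2[a_from] != PHASE2[a_to]


def rule_prediction(extinction_ctx, test_ctx, extinction_outcome=False):
    """Outcome predicted for Z at test under the (assumed) rule account.

    Contexts are (A element, B element) tuples. Shifts on dimension B are
    ignored because B never signalled a change in cue meaning.
    """
    a_ext, _ = extinction_ctx
    a_test, _ = test_ctx
    if shift_changes_meaning(a_ext, a_test):
        return not extinction_outcome
    return extinction_outcome
```

Under these assumptions, rule_prediction(("A2", "B2"), ("A3", "B2")) and rule_prediction(("A2", "B2"), ("A2", "B1")) both return False (no stomach trouble). The rule account therefore predicts no difference between the two test contexts, whereas the attentional account predicts stronger recovery in A3B2.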

Method

Participants

A group of 34 students from Philipps-Universität Marburg (26 women, eight men; M age = 21.4 years, age range 18–28) voluntarily participated in the experiment and received either course credit or payment (€2.30 [US $3]). They gave their informed consent before starting the experiment. All of the participants were tested individually and needed approximately 20 min to complete the experiment.

Apparatus, stimuli, and procedure

The stimuli and instructions were the same as those of Experiment 1, unless stated otherwise. Nine food pictures from Experiment 1 (avocado, banana, broccoli, orange, pear, pepper, pineapple, strawberry, and tomato) were assigned randomly to the different cues (N, X, Y, Z, and F1 to F5) for each participant of Experiment 2. In addition to the colors and animals that had served as context elements in Experiment 1, yellow and bird were used as the context color and context animal, respectively. For half of the participants, the context color was assigned to context dimension A and the context animal to context dimension B (Balance 1), whereas for the other half, animals were assigned to dimension A and colors to dimension B (Balance 2). For each dimension, the assignment of particular colors or animals to the values 1–3 was implemented randomly for each participant.

In Phase 1 (acquisition), participants received training with Z+ and F1– trials in context A1B1, together with F2+ and F3– trials in context A2B2 (each cue/context combination was presented 16 times). During Phase 2 (context training), participants were trained with a conditional discrimination with X+ and Y– trials in contexts A1B1 and A1B2, together with X– and Y+ trials in contexts A2B1, A2B2, A3B1, and A3B2 (each cue/context combination including context element A1 was presented 24 times, whereas each of the remaining cue/context combinations was presented 15 times). In Phase 3 (extinction), they were trained with N– and F4+ trials in context A1B1, together with Z– and F5+ trials in context A2B2 (each cue/context combination was presented ten times). In the test phase, participants were presented with Z in each of the contexts A2B1, A2B2, and A3B2 (with each cue/context combination being presented four times).

Phase 1 was divided into eight blocks, Phase 2 into nine blocks, Phase 3 into five blocks, and the test phase into two blocks. The order of presentation of the trials within each block was determined randomly for each block and each participant. Within each block, each of the trial types trained in the phase was presented on two occasions, except for the first six blocks of the context training, conducted in Phase 2.

In order to facilitate acquisition of the conditional discrimination in Phase 2, each of the first six blocks in this phase only included a subset of the trial types. Three of these blocks only comprised the trial types including contexts A1B1, A1B2, A2B1, and A2B2, and the other three only the trial types including contexts A1B1, A1B2, A3B1, and A3B2. In each of the remaining blocks, all 12 cue–context combinations of the conditional discrimination were trained in an intermixed fashion.
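The presentation counts reported for Phase 2 follow from this block structure if each trial type included in one of the first six blocks was presented three times per block. That repetition number is our inference from the 24 trials per block reported for Phase 2, not a value stated explicitly. The following sketch (with hypothetical function names) checks the arithmetic under this assumption.

```python
from collections import Counter
from itertools import product

CUES = ("X", "Y")
B_ELEMENTS = ("B1", "B2")


def trial_types(a_elements):
    """All cue/context combinations for the given elements of dimension A."""
    return list(product(CUES, a_elements, B_ELEMENTS))


def phase2_counts(reps_subset=3, reps_full=2):
    """Count presentations per cue/context combination across nine blocks.

    reps_subset=3 is an assumption inferred from 24 trials per block;
    the order of the six subset blocks does not affect the totals.
    """
    counts = Counter()
    subset_blocks = [("A1", "A2")] * 3 + [("A1", "A3")] * 3
    for a_subset in subset_blocks:
        for trial in trial_types(a_subset):
            counts[trial] += reps_subset
    for _ in range(3):  # remaining blocks with all 12 combinations
        for trial in trial_types(("A1", "A2", "A3")):
            counts[trial] += reps_full
    return counts
```

With the default values, every combination that includes A1 is counted 24 times and every other combination 15 times, and the total is 9 × 24 = 216 trials.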

Results and Discussion

Acquisition in Phase 1

The left-hand panel of Fig. 4 presents the percentages of participants who predicted stomach trouble for Z+ in context A1B1 across the 16 trials of the acquisition phase. As the figure demonstrates, stomach trouble predictions to Z increased over the course of training.

Fig. 4 The left-hand panel shows the percentages of participants who predicted stomach trouble for Z+ in context A1B1 across the 16 trials of the acquisition phase of Experiment 2. The right-hand panel shows the percentages of participants who predicted stomach trouble for Z– and N– across the ten trials of the extinction phase. Error bars denote standard errors of the means.

To assess acquisition performance in Phase 1, we calculated for each participant the mean percentage of stomach trouble predictions collapsed across the first four trials with Z (beginning) and the mean percentage of stomach trouble predictions collapsed across the last four trials with Z (end). A 2 × 2 (Time [beginning, end] × Balance [1, 2]) ANOVA revealed a main effect of time, F(1, 32) = 57.02, p < .001, ηp² = .64, confirming an increase of stomach trouble predictions to Z over the course of training. The main effect of balance and all related interactions were not significant, all Fs < 2.15, ps > .152.
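For readers who wish to check the reported effect sizes, partial eta squared can be recovered from an F value and its degrees of freedom as (F × df_effect) / (F × df_effect + df_error). The helper below is an illustrative sketch with a hypothetical function name, not part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    weighted_f = f_value * df_effect
    return weighted_f / (weighted_f + df_error)


# Main effect of time in Phase 1: F(1, 32) = 57.02
phase1_time = round(partial_eta_squared(57.02, 1, 32), 2)
```

Here phase1_time evaluates to 0.64, in line with the value reported for the main effect of time.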

Context training in Phase 2

Table 4 depicts the mean percentages of correct predictions across 24 consecutive trials for each block of Phase 2. A 9 × 2 (Block [1, 2, 3, 4, 5, 6, 7, 8, 9] × Balance [1, 2]) ANOVA revealed a main effect of block, F(8, 256) = 7.17, p < .001, ηp² = .18, indicating that the mean percentages of correct predictions increased in the course of training. The main effect of balance and all interactions including this factor were not significant, all Fs < 1.

Table 4 Mean percentages of correct predictions across 24 consecutive trials for each of the nine blocks of Phase 2 in Experiment 2 (standard errors within brackets)

Extinction in Phase 3

The right-hand panel of Fig. 4 presents the percentages of participants who predicted stomach trouble for Z– and N– across the ten trials of the extinction phase. As can be seen, responding to Z– decreased over the course of training. To assess extinction performance in Phase 3, we calculated for each participant the mean percentage of stomach trouble predictions collapsed across the first four trials with stimulus Z (beginning) and the mean percentage of stomach trouble predictions collapsed across the last four trials with Z (end). A 2 × 2 (Time [beginning, end] × Balance [1, 2]) ANOVA yielded a significant main effect of time, F(1, 32) = 40.02, p < .001, ηp² = .56, confirming a decrease in stomach trouble predictions in the course of Phase 3. Neither the main effect of balance nor any of the interactions including this factor were significant (all Fs < 1), showing no evidence of differences in responding to Z between the balance conditions.

The right-hand panel of Fig. 4 also shows that responding during the first trial of Phase 3 was stronger for Z than for N. To compare responding to Z– and N– at the outset of Phase 3, we analyzed the first presentation of each cue in this phase using a McNemar test with prediction (stomach trouble vs. no stomach trouble) and stimulus (Z vs. N) as the dichotomous variables. The results showed that more participants predicted stomach trouble for Z than for N, χ²(1, N = 34) = 4.08, p = .039, Φ = .35.
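The reported effect size is consistent with the relation phi = sqrt(chi-square / N). The sketch below shows this relation together with a continuity-corrected McNemar statistic and an exact two-sided binomial p value. The discordant counts used in the example (10 and 2) are hypothetical: they are not reported in the article and are chosen only because they would be compatible with the reported statistics. Whether a continuity correction or an exact p value was actually used is likewise an assumption.

```python
from math import comb, sqrt


def phi_from_chi2(chi2, n):
    """Phi coefficient recovered from a 1-df chi-square statistic."""
    return sqrt(chi2 / n)


def mcnemar_chi2(b, c, continuity=True):
    """McNemar statistic from the two discordant cell counts b and c."""
    diff = abs(b - c) - (1 if continuity else 0)
    return max(diff, 0) ** 2 / (b + c)


def mcnemar_exact_p(b, c):
    """Two-sided exact binomial p value for the discordant pairs."""
    n, k = b + c, min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


phi = round(phi_from_chi2(4.08, 34), 2)
# Hypothetical discordant counts (not reported in the article):
example_chi2 = round(mcnemar_chi2(10, 2), 2)
example_p = round(mcnemar_exact_p(10, 2), 3)
```

With these inputs, phi is 0.35, and the hypothetical counts give a statistic of 4.08 and an exact p of 0.039. This shows one way in which the reported values could fit together; it does not establish the actual cell counts.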

Contextual control during the test phase

Figure 5 depicts responding to Z in contexts A2B1, A2B2, and A3B2 during the test phase in terms of the mean percentages of stomach trouble predictions. As can be seen, participants responded more strongly to Z in context A3B2 than in either of the contexts A2B2 or A2B1, whereas we found no difference in responding to Z between the contexts A2B2 and A2B1.

Fig. 5 Mean percentages of stomach trouble predictions for Z in contexts A2B1, A2B2, and A3B2 during the test phase of Experiment 2, collapsed across the four presentations of each trial type. Error bars denote standard errors of the means.

To assess performance during the test phase, we calculated for each participant the mean percentage of stomach trouble predictions collapsed across the four presentations of stimulus Z in each context. A 3 × 2 (Context [A2B1, A2B2, A3B2] × Balance [1, 2]) ANOVA revealed a significant main effect of context, F(2, 64) = 10.16, p = .001, ηp² = .24, indicating that responding to Z was different in the three test contexts. The main effect of balance and all interactions including this factor were not significant, all Fs < 1.

Post-hoc tests using the Bonferroni correction revealed that the mean percentage of stomach trouble predictions for Z was higher in context A3B2 than in context A2B2, p = .001, d = 0.742, and also higher than in context A2B1, p = .029, d = 0.681. We found no difference between responding to Z in contexts A2B1 and A2B2, p = .525, d = 0.249. Thus, participants responded more strongly to Z in context A3B2 than in either context A2B1 or context A2B2.

Consistent with the results of Experiment 1, extinction performance to Z was disrupted when the extinction context was changed on the context dimension that was trained as being relevant for the conditional discrimination. However, when the extinction context was changed on the irrelevant context dimension, extinction performance was not affected.

Performance during the test phase of Experiment 2 cannot be explained by the assumption that participants responded during the test according to rules that they had derived from the training of the conditional discrimination in Phase 2. In this conditional discrimination, the cue–outcome contingencies for X and Y remained unchanged when the context change involved a shift between the context elements A2 and A3. Thus, having learned that Z was followed by no stomach trouble in context A2B2 in Phase 3, participants should have continued to predict the absence of stomach trouble when Z was presented for testing in contexts A3B2 and A2B1. Instead, our results are in line with the idea of Rosas, Callejas-Aguilera, et al. (2006) that the amount of attention dedicated to contextual stimuli determines the strength of context specificity of learning.

General discussion

In two experiments, we investigated the role of the informational value of contexts for the formation of context-specific extinction learning. In each experiment, participants initially received acquisition training with a target cue Z in a context composed of two elements taken from two distinct dimensions (A1B1). Subsequently, participants were trained to discriminate two other cues X and Y for which only one of the two context dimensions was relevant. Training of the conditional discrimination was followed by extinction of the target cue Z, which was conducted in a context that differed on both dimensions from the context of initial acquisition (A2B2). During a final test phase, we observed in both experiments that a partial change of the extinction context disrupted extinction performance when the extinction context was changed on the context dimension that was trained as relevant for the conditional discrimination. However, when the extinction context was changed on the context dimension that was trained as irrelevant, extinction performance was not affected. Our results are consistent with the idea that more attention is paid to relevant than irrelevant context elements, and that this difference in attention makes it easier for the relevant context elements to gain behavioral control than for the irrelevant context elements.

Our findings demonstrate the generality of the conclusion drawn from previous studies that the informational value of contexts affects the strength of context-specific learning (León et al., 2008, 2010; León et al., 2012; Lucke et al., 2013; Preston et al., 1986). The present experiments extend and complement these previous studies, which focused on the context specificity of acquisition learning, by demonstrating that the context specificity of extinction learning is also influenced by the informational value of contexts.

The results of the present experiments strongly support the account of Rosas, Callejas-Aguilera, et al. (2006) that, regardless of the type of learning (acquisition or extinction), the strength of context-specific encoding of information depends on the amount of attention captured by contextual stimuli. Even though Rosas, Callejas-Aguilera, et al. did not provide a formalized specification of their account, it is possible to draw conclusions on the basis of the present and the previous studies (León et al., 2008, 2010; León et al., 2012; Lucke et al., 2013; Preston et al., 1986) about the mechanisms that regulate attention to the contexts.

The idea that organisms pay more attention to relevant than to irrelevant stimuli can be found in several theories of learning and attention (e.g., Kruschke, 1992, 2001, 2006; Mackintosh, 1975; Pearce, George, & Redhead, 1998). Some of these theories (Kruschke, 2001, 2006; Mackintosh, 1975) adopt an elemental stimulus representation, assuming that each element of a stimulus compound acquires its own direct excitatory or inhibitory association with the outcome. Performance on a trial is then controlled by the algebraic sum of the associative strengths of the currently present stimuli—including context elements. Hence, these models cannot account for the acquisition of a conditional discrimination, as we observed in the present experiments (see also Üngör & Lachnit, 2006). One way to overcome this problem is to extend elemental theories by the assumption of a unique cue (e.g., Rescorla, 1973; see also Lachnit & Kimmel, 2000; Lachnit, Lober, Reinhard, & Kinder, 2001). According to this hypothesis, any combination of two or more stimuli creates a unique element that can gain associative strength in the same way as conventional stimuli.
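The limitation of purely elemental summation, and the way a unique cue removes it, can be illustrated with a small simulation. The sketch below is a simplified delta-rule learner in the spirit of Rescorla and Wagner (1972), with a single learning rate and no attentional component, so it is not an implementation of any of the attentional models cited above. The learning rate, the number of epochs, the fixed trial order, and the function names are arbitrary assumptions. The learner is trained on a conditional discrimination of the form AX+, BX–, AY–, BY+, in which A and B stand for context elements.

```python
# Conditional discrimination: (compound, outcome), outcome 1.0 = reinforced.
TRIALS = [
    (("A", "X"), 1.0),
    (("B", "X"), 0.0),
    (("A", "Y"), 0.0),
    (("B", "Y"), 1.0),
]


def elements(compound, unique_cue):
    """Elements active on a trial, optionally plus one unique cue per compound."""
    active = list(compound)
    if unique_cue:
        active.append("".join(compound))  # e.g. "AX" as an extra element
    return active


def train(unique_cue, epochs=2000, rate=0.1):
    """Delta-rule training; returns the summed prediction for each compound."""
    weights = {}
    for _ in range(epochs):
        for compound, outcome in TRIALS:
            active = elements(compound, unique_cue)
            prediction = sum(weights.get(e, 0.0) for e in active)
            error = outcome - prediction
            for e in active:
                weights[e] = weights.get(e, 0.0) + rate * error
    return {
        "".join(c): sum(weights.get(e, 0.0) for e in elements(c, unique_cue))
        for c, _ in TRIALS
    }


elemental = train(unique_cue=False)
with_unique_cue = train(unique_cue=True)
```

With elemental weights only, the summed predictions for all four compounds stay close to .5, because A, B, X, and Y are each followed by the outcome on half of their trials. Adding one unique element per compound lets the learner approach the trained outcomes.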

An alternative way to explain the acquisition of a conditional discrimination is provided by models assuming a configural stimulus representation (Kruschke, 1992; Pearce et al., 1998). According to this view, the entire pattern of stimulation provoked by a specific stimulus compound results in one unitary representation developing a connection to the outcome. The response-eliciting property of a stimulus configuration is then determined by its direct association to the outcome, as well as by the generalized associative strengths of other configurations, whereby the amount of generalization is based on similarity (Pearce, 1987, 1994; see also Kinder & Lachnit, 2003; Lober & Lachnit, 2002).
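The configural idea can be sketched in the same way. The code below is a strongly simplified reading of a Pearce-style configural network: it assumes equally salient elements, computes similarity as the squared number of common elements divided by the product of the two pattern sizes, and updates only the configural unit of the presented pattern. The parameter values and function names are our own assumptions, and the sketch omits any attentional mechanism.

```python
def similarity(p, q):
    """Similarity of two patterns, assuming equally salient elements."""
    common = len(set(p) & set(q))
    return common ** 2 / (len(p) * len(q))


def train_configural(trials, epochs=500, rate=0.2):
    """Train one configural unit per pattern; return net predictions."""
    strength = {pattern: 0.0 for pattern, _ in trials}

    def net(pattern):
        # Own strength plus similarity-weighted strength of other patterns.
        return sum(similarity(pattern, other) * v for other, v in strength.items())

    for _ in range(epochs):
        for pattern, outcome in trials:
            strength[pattern] += rate * (outcome - net(pattern))
    return {"".join(p): net(p) for p, _ in trials}


CONDITIONAL = [
    (("A", "X"), 1.0),
    (("B", "X"), 0.0),
    (("A", "Y"), 0.0),
    (("B", "Y"), 1.0),
]
configural = train_configural(CONDITIONAL)
```

Under these assumptions, the net predictions approach the trained outcomes for all four compounds, even though compounds that share an element generalize to one another (similarity of .25 between, for example, AX and BX).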

The present experiments also allow the evaluation of an alternative explanation for the results of the previous studies on context relevance and context-specific learning (León et al., 2008, 2010; León et al., 2012; Lucke et al., 2013; Preston et al., 1986) that would require no recourse to attentional processes. In each of these previous studies, one group was trained with a simple discrimination of the form AX+, BX–, AY+, BY–, whereas a second group received a conditional discrimination of the form AX+, BX–, AY–, BY+. Ample evidence in the literature indicates that these different kinds of discrimination problems encourage different forms of stimulus representations (e.g., Melchers, Lachnit, & Shanks, 2004; Melchers, Lachnit, Üngör, & Shanks, 2005; Melchers, Üngör, & Lachnit, 2005; Williams & Braker, 1999; for a review, see Melchers, Shanks, & Lachnit, 2008). For instance, training of a simple discrimination was found to encourage organisms to process stimulus compounds from another discrimination in a way predicted by elemental theories (e.g., Rescorla & Wagner, 1972), whereas conditional discriminations encouraged organisms to process other stimulus compounds as unitary configurations (e.g., Pearce, 1994). Given these findings, it is hard to rule out the possibility that differences in the kind of stimulus representation contributed to the context-shift effects reported in the previous studies on context relevance and context-specific learning. However, in the present experiments all participants received a conditional discrimination. Therefore, the present results cannot be explained in terms of differences in the form of stimulus representation.

Our results indicate that relevant contexts receive more attention than those that are irrelevant. However, each of our experiments is silent about the way in which the difference in attention between relevant and irrelevant context stimuli arises. Such a difference in attention might arise from (a) increases of attention to relevant context stimuli, (b) decreases of attention to irrelevant context stimuli, or (c) both an increase of attention to relevant and a decrease of attention to irrelevant context stimuli. A related issue is that the present experiments cannot discern whether our manipulation of the informational value of the contexts caused an increase in context-specific extinction learning involving the relevant context elements, a decrease in context-specific extinction with respect to the irrelevant context elements, or both. For the contextual control of acquisition, at least one experiment with human participants, by León et al. (2010), suggests that training contexts as being relevant results in an increase of the context specificity of learning. Similar to Preston et al. (1986), two groups of participants were trained either with a conditional discrimination for which contexts were relevant (X+, Y– in context A; X–, Y+ in context B) or with a simple discrimination with irrelevant contexts (X+, Y– in contexts A and B). In addition, however, the experiment by León et al. comprised a third group in which contexts were explicitly trained as neither relevant nor irrelevant for the cue–outcome contingencies. Participants in this group received X+ and Y– trials in context A, but were trained with a different pair of cues (V+, W–) in context B. León et al. observed that responding to a target cue Z, which possessed a consistent reinforcement history, was context-specific when contexts were trained as relevant for the discrimination between X and Y.
However, performance to Z was not disrupted by a context change when contexts were trained as irrelevant or when contexts were not explicitly experienced as relevant or irrelevant. Based on these findings, it seems reasonable to assume that in the present experiments training specific context elements as relevant increased their processing during extinction learning, leading to enhanced renewal when the extinction context was changed on the relevant context dimension. However, future research will be required in order to examine this conclusion and to further specify the dynamics of attentional changes to contextual stimuli.

Evidence in the literature suggests that human performance can be based on abstract rules derived from prior learning experience. For example, in a study by Shanks and Darby (1998) participants were trained concurrently with two patterning problems, a positive patterning discrimination in which two stimuli were followed by an outcome when presented as a compound (AB+) but not when presented individually (A–, B–), and a negative patterning problem in which each of two stimuli was paired with an outcome (C+, D+) but not when the two stimuli appeared together on a trial (CD–). In order to assess whether the training of the patterning problems induced the formation of a rule of the form “a compound and its elements predict opposite outcomes,” the training stage also included two elements that were each followed by an outcome, I+ and J+, and a nonreinforced compound stimulus, KL–. During a final test phase, the two separately trained elements I and J were presented together as a compound and the two elements of the KL compound were presented individually. The application of a patterning rule to the test trials would lead participants to predict the outcome on trials with K or with L, but not for the IJ compound. Alternatively, however, if performance during the test phase were based on similarity (e.g., Pearce, 1987, 1994; Rescorla & Wagner, 1972), the IJ compound should be more strongly associated with the outcome than K or L. Remarkably, Shanks and Darby observed both patterns of generalization during the test phase, depending on the level of accuracy reached by the participants at the end of the training stage, with “better learners” responding to the test trials more in accordance with a patterning rule.

In each of the present experiments, the training of the conditional discriminations between X and Y might have encouraged our participants to form abstract rules about the relationship between changes in specific context elements and changes in cue–outcome contingencies, and they might have used these rules to adjust their performance during later stages of the experiments. However, the results from the test phase of Experiment 2 were not consistent with this idea. None of the context changes during the test of Experiment 2 were previously related to changes in the significance of cues. In contrast, the rule-based generalization account provides a straightforward explanation for the results of Experiment 1. To further evaluate the possibility that rule induction contributed to the performance in the present experiments, we correlated the accuracy achieved by our participants at the end of the conditional discrimination with their performance outside of the extinction context during the test phase. For Experiment 1, we found a negative correlation between the mastery of the conditional discrimination and the percentages of stomach trouble predictions outside of the extinction context (A1B2) in the test phase for the group in which the extinction context was changed on the irrelevant context dimension (Group Ani-Rel/Col-Irrel), r = –.62, p < .001, which is consistent with the assumption of rule induction. However, the analysis revealed no significant positive correlation for the group in which the extinction context was changed on the relevant context dimension during the test phase (Group Col-Rel/Ani-Irrel), r = .19, p = .26. Moreover, for Experiment 2, we found no relationship between the mastery of the conditional discrimination and performance outside of the extinction context during the test phase, either when the extinction context was changed on the relevant dimension (A3B2), r = –.03, p = .86, or when it was changed on the irrelevant dimension (A2B1), r = –.18, p = .31.
In conclusion, a rule-based generalization account cannot explain the results of both of the present experiments. However, it must be acknowledged that rule induction could explain aspects of the data of our study. Therefore, future research should further investigate the nature of abstract rules formed by participants during the kind of learning tasks used in the present experiments and should examine how rule learning interacts with other learning mechanisms and attentional changes in determining human behavior.

Our experimental designs ensured that each context element was associated with a specific outcome on half of the trials. However, in order to keep the design of Experiment 2 as simple as possible, we accepted other mismatches across the different context elements. For instance, both the number of presentations and the number of associated cues were higher for context element B1 than for context element A3. Therefore, we cannot exclude the possibility that these mismatches contributed to the difference in responding across the contexts A2B1 and A3B2 during the final phase of Experiment 2. Although these features of the design of Experiment 2 are not entirely satisfactory, it seems unlikely that they provide a full account of the present results. In Experiment 1, in which the context elements A1 and A2 were matched in their training histories, we observed context switch effects that were consistent with the results of our second experiment. Nevertheless, it is important to be aware of such weaknesses of the present designs, which should be addressed in future research.

Overall, the results of our experiments support the idea that the informational value of contexts affects the strength of context-dependent learning. The experiments add further evidence to the hypothesis that relevant contexts receive more attention than do those that are irrelevant, and that this difference in attention leads to differences in the strength of context-specific processing of the information acquired in these contexts. Furthermore, we extended previous studies by demonstrating that the informational value of the context of extinction influences the strength of renewal.