Visual scenes typically contain multiple objects of varying complexity that need to be processed selectively in order to achieve one’s behavioural goals. When searching for a specific target object in a given scene, visual selection can be supported by a variety of cues directing attention towards relevant, and away from irrelevant, parts of a scene. Thus, for example, search may be guided bottom-up, by visual cues that attract attention on the basis of perceptual salience, as well as top-down, by a working memory “template” specifying features of the searched-for target (see Wolfe & Horowitz, 2004, for a review). In addition, selection may be aided by learned contingencies within a given environment. Real-world scenes usually consist of a relatively stable collection of co-occurring objects, permitting search for one object to be facilitated via its associations with other objects (see Oliva & Torralba, 2007, for a review). For example, visual search for a toaster might be quicker when it is presented in a kitchen rather than a garage scene. Thus, context information can offer valuable cues to the location of a target object (see Bar, 2004, for a review; Biederman, Mezzanotte, & Rabinowitz, 1982; Hollingworth, 2006).

The role of such invariant context information in attentional guidance has also been investigated in a number of studies under controlled laboratory conditions (see Chun, 2000, for a review; Chun & Jiang, 1998). In a typical experiment, search displays consist of 12 items, one T-shaped target and eleven L-shaped nontargets (for an example, see Fig. 1). The task is to find the “T” and indicate its orientation (left or right). Importantly, and unknown to the observers, a set of displays is repeated throughout the experiment with preserved spatial configurations of the target and nontargets. Search performance for these “old” displays is better than performance for displays that are newly generated on every trial, an effect known as contextual cueing (Chun & Jiang, 1998). Moreover, a recognition test at the end of the experiment typically reveals that participants cannot reliably discriminate between old and new configurations, suggesting that they have no explicit memory of the spatial relations between the target location and its invariant context. Contextual cueing is therefore considered an implicit memory mechanism for spatial context, which facilitates visual search by guiding attention more efficiently (or directly) towards the target location. Guidance by this form of contextual memory may thus provide useful support for attentional orienting in complex environments, as demonstrated in visual search. Such a mechanism should also be flexible and adaptive, to compensate for the variability and possible changes that can occur in the environment. Flexibility could, for instance, mean that one invariant context is associated with multiple target objects. In real environments, such as a kitchen, search might benefit from the stable kitchen layout not only when it comes to finding a toaster, but also when it comes to finding other potentially relevant items, such as a coffee machine.

Fig. 1. Example search displays with an old (invariant) context paired with two different target locations

Thus far, studies investigating the adaptivity of contextual cueing to multiple target locations have yielded ambiguous results. Partial support for an adaptive nature of contextual cueing was already provided by Chun and Jiang (1998). In a variant of the contextual-cueing paradigm, a given search display was repeatedly presented with two distinct target locations. Thus, on some trials, the invariant context was presented with one target location, whereas on other trials it was presented with a second target location (for an example, see the left- and right-hand panels of Fig. 1). The results of this experiment showed a somewhat reduced, but nevertheless reliable, contextual-cueing effect for contexts with two target locations (see also Conci, Sun, & Müller, 2011, for comparable results with simultaneously presented targets). By contrast, invariant contexts paired with three or four repeated targets have been reported as not eliciting contextual cueing (Kunar, Michod, & Wolfe, 2005; Wolfe, Klempen, & Dahlen, 2000). Other studies revealed that sudden (unpredictable) changes of the target location disrupted contextual cueing (Chun & Jiang, 1998; Conci et al., 2011; Fiske & Sanocki, 2010; Makovski & Jiang, 2010; Manginelli & Pollmann, 2009). More specifically, when a target that was learned in an invariant context was suddenly moved to a new, previously empty location, contextual cueing was impaired and did not recover with repeated presentation of the new target location (Manginelli & Pollmann, 2009). Recently, Makovski and Jiang (2010) further qualified this lack of adaptivity by showing that contextual cueing was transferred to a new target located in close proximity to the original target location. Thus, adaptation of contextual cueing seems to occur only within a fairly limited spatial range.

In sum, while some studies have reported evidence for adaptation to multiple target locations in contextual cueing (Chun & Jiang, 1998; Conci et al., 2011; Kunar et al., 2005), others have clearly failed to provide evidence of flexible compensation for environmental changes (Chun & Jiang, 1998; Conci et al., 2011; Makovski & Jiang, 2010; Manginelli & Pollmann, 2009), or have reported adaptation as occurring only within a limited spatial region (see also Chua & Chun, 2003; Makovski & Jiang, 2010).

The present study was designed to reconcile the contradictory findings on contextual cueing for multiple target locations, and to distinguish between alternative explanations of how contextual cueing is modified by multiple target locations. On the one hand, according to Brady and Chun’s (2007) computational model of contextual cueing, multiple target locations can be (learned to be) associated with one invariant context; that is, contextual learning is adaptive. In this view, the overall reduced magnitude found for the contextual-cueing effect (Chun & Jiang, 1998; Kunar et al., 2005) simply results from the number of potential target locations that have to be inspected (multiple-target learning). On the other hand, the clear lack of adaptation in other recent studies (Conci et al., 2011; Fiske & Sanocki, 2010; Makovski & Jiang, 2010; Manginelli & Pollmann, 2009) suggests that contextual cueing is restricted to single target locations or their narrow surrounds. That is, only one of two (or more) target locations may be reliably cued by an invariant context (single-target learning). If only one target location benefits from contextual cueing, averaging across the cued and uncued target locations (when the invariant context is paired with two target locations) would result in an overall reduced contextual-cueing effect. Adding a third (or fourth, etc.) repeated target location would further reduce the overall effect, because the contextual-cueing effect would be averaged across one cued and two (or three, etc.) uncued target locations.
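The averaging argument for single-target learning can be illustrated with a short sketch. The 100-ms cued effect and the default of zero for uncued locations are illustrative assumptions, not values from the experiments, and `averaged_cueing` is a hypothetical helper.

```python
def averaged_cueing(cued_effect_ms, n_locations, uncued_effect_ms=0.0):
    """Mean cueing effect when only one of n repeated target locations is cued.

    Assumption: the remaining (n - 1) locations each show `uncued_effect_ms`.
    """
    if n_locations < 1:
        raise ValueError("n_locations must be at least 1")
    total = cued_effect_ms + (n_locations - 1) * uncued_effect_ms
    return total / n_locations


# The averaged effect shrinks as more uncued locations enter the mean.
for n in (1, 2, 3, 4):
    print(n, averaged_cueing(100.0, n))
```

Setting `uncued_effect_ms` to a negative value would model costs, rather than a null effect, at the uncued locations.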

To determine the degree of adaptivity in contextual learning, three contextual-cueing experiments with multiple target locations were conducted. Contextual-cueing effects were observed with two target locations (Experiments 1 and 2), but the effect was significantly reduced when directly compared to the effect in displays with one target location (Experiment 2). Moreover, no contextual-cueing effect was observed for displays that were paired with three possible target locations (Experiment 3). While, overall, this pattern of results replicated previous studies (see above), additional post-hoc analyses of all three experiments confirmed that one (dominant) target location consistently showed significantly more contextual cueing than did the other (minor) locations. Furthermore, proximity between targets enabled contextual cueing for two, or even all three, target locations. Taken together, these findings show that contextual cueing does not integrate multiple target locations evenly, but, in fact, the successful predictive association between an invariant context and a target location is limited to only one target location and its immediate surround.

Experiment 1

Experiment 1 was designed to replicate the results of Chun and Jiang (1998), who reported that contextual cueing occurred for invariant contexts paired with two possible target locations. Each search display was paired with two distinct target locations (see Fig. 1 for an example). To ensure that both target locations could be associated equally well with the invariant (old) context, the two targets were always presented in separate, alternating blocks of trials. This variation was used to avoid primacy of one target over the other owing to the order of presentation. If contextual cueing can operate for two different target locations, a facilitatory effect should occur for repeated displays with two target locations.

Method

Participants

A group of 16 participants took part in the experiment (10 women, 6 men; mean age = 26 years, age range = 22–49 years). All participants had normal or corrected-to-normal visual acuity, and all but 1 were right-handed. They received either payment (€8) or one course credit.

Apparatus and stimuli

Stimulus presentation and response collection were controlled by a PC-compatible computer using MATLAB routines and Psychophysics Toolbox extensions (Brainard, 1997; Pelli, 1997). The stimuli subtended 0.7° × 0.7° of visual angle and were presented in grey (8.5 cd/m²) against a black background (0.02 cd/m²) on a 17-in. CRT monitor. Search displays consisted of 12 items, one of which was a T-shaped target rotated randomly by 90° to either the left or the right. The 11 remaining items were L-shaped nontargets rotated randomly in one of the four orthogonal orientations. Search displays were generated by placing the target and nontargets randomly in the cells of a 6 × 8 matrix, with an individual cell size of 2.5° × 2.5°. Nontargets were jittered horizontally and vertically in steps of 0.1°, within a range of ±0.6°. Example search displays are shown in Fig. 1. Participants were seated in a dimly lit room with an unrestrained viewing distance of approximately 57 cm from the computer screen.
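The display-generation procedure can be sketched as follows. This is a Python illustration, not the MATLAB routines used in the study. It assumes that the 6 × 8 matrix means six rows by eight columns and that coordinates are expressed relative to the grid centre; `make_display` and its field names are hypothetical.

```python
import random

ROWS, COLS = 6, 8        # assumed layout: 6 rows by 8 columns
CELL_DEG = 2.5           # cell size in degrees of visual angle
N_ITEMS = 12             # 1 target + 11 nontargets
JITTER = [k / 10 for k in range(-6, 7)]  # -0.6 ... 0.6 in 0.1 steps


def make_display(rng):
    """Return 12 items on distinct grid cells; the first item is the target."""
    grid = [(r, c) for r in range(ROWS) for c in range(COLS)]
    items = []
    for i, (row, col) in enumerate(rng.sample(grid, N_ITEMS)):
        is_target = i == 0
        # Cell centre relative to the centre of the grid (assumed origin).
        x = (col - (COLS - 1) / 2) * CELL_DEG
        y = (row - (ROWS - 1) / 2) * CELL_DEG
        if is_target:
            orientation = rng.choice(["left", "right"])
        else:
            # Only nontargets are jittered, as described in the text.
            x += rng.choice(JITTER)
            y += rng.choice(JITTER)
            orientation = rng.choice([0, 90, 180, 270])
        items.append({"shape": "T" if is_target else "L", "cell": (row, col),
                      "x": round(x, 2), "y": round(y, 2),
                      "orientation": orientation})
    return items


display = make_display(random.Random(0))
```

For an old context, the nontarget entries of one such display would be stored and reused on every repetition, with only the target orientation redrawn.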

Trial sequence

At the beginning of each trial, a fixation cross was presented for 500 ms at the centre of the screen. Then, a search display appeared and remained visible until participants made a speeded response by pressing one of two mouse buttons (with the left- and the right-hand index finger, respectively). Participants were instructed to search for the target “T” and decide as quickly and accurately as possible whether the stem was pointing to the left or the right. In case of a response error, a minus sign appeared on the screen for 1,000 ms. An interstimulus interval of 1,000 ms separated one trial from the next.

Design and procedure

In Experiment 1, we implemented a 2 × 8 repeated measures design, with the (within-subjects) factors Context (old, new) and Epoch (1–8). With respect to context, for old contexts, a set of 12 displays was generated for each participant and repeated throughout the experiment (with an invariant arrangement of nontarget items on every presentation). For new contexts, the configuration of nontarget items was generated randomly on each trial. Each display was paired with two target locations. In order to rule out location probability effects, different sets of target locations were selected for old and new contexts, such that, overall, 48 possible target locations were assigned to the displays. The orientation of the target was random on each trial, whereas those of the nontargets were held constant for old contexts. Figure 1 depicts an example search display with an invariant configuration of nontargets paired with two different target locations. The second factor Epoch divided the experiment into eight equally sized consecutive bins (each bin consisted of 120 trials), which permitted the examination of possible learning effects over the course of the experiment by using aggregated, more robust values.

The experiment started with a practice block of 24 randomly generated displays, to familiarise participants with the task. All subsequent (40) experimental blocks consisted of 24 trials, 12 with old and 12 with new context displays, presented in random order. The two possible target locations for each (old and new) display were always presented in alternating order (i.e., one of the two possible target locations was presented in all odd blocks, the other target location was presented in all even blocks), such that each target location was presented 20 times. After each block, participants took a short break and continued with the experiment at their own pace. Overall, participants completed 984 trials.
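The trial counts follow directly from the block structure, as the following sketch shows. The constants are taken from the text; the function names are hypothetical.

```python
N_BLOCKS = 40            # experimental blocks
TRIALS_PER_BLOCK = 24    # 12 old + 12 new context displays
PRACTICE_TRIALS = 24     # one practice block
N_EPOCHS = 8


def target_index(block):
    """Which of the two target locations is shown in a 1-based block:
    0 in odd blocks, 1 in even blocks."""
    return (block - 1) % 2


def design_summary():
    experimental = N_BLOCKS * TRIALS_PER_BLOCK
    return {
        "total_trials": PRACTICE_TRIALS + experimental,
        "trials_per_epoch": experimental // N_EPOCHS,
        "presentations_per_location": sum(
            target_index(b) == 0 for b in range(1, N_BLOCKS + 1)
        ),
    }


print(design_summary())
```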

Recognition test

After the last search trial, an instruction was presented on the screen informing participants about the repetition of some of the search displays throughout the experiment. Participants started the presentation of another 24 trials and decided via mouse button responses whether a particular display had been shown previously (= old) or not (= new). All displays were presented with target locations corresponding to the odd blocks only (i.e., with the targets presented in Block 1), since the explicit recognition of a given repeated context would not depend on the location of the target, but rather on the arrangement of the nontargets. The response was nonspeeded, and no error feedback was given.

Results

Search task

Individual mean error rates were calculated for each variable combination. The overall error rate was low (2.9%) and a repeated measures ANOVA with the factors Context (old, new) and Epoch (1–8) revealed no significant effects (all ps > .1).

Next, individual mean response times (RTs) were calculated for old and new contexts, separately for each epoch. Error trials and RTs deviating from the individual’s mean RT by more than ±2.5 standard deviations were excluded from the analysis. This outlier criterion led to the removal of 2.3% of the data; the same outlier procedure was applied in all subsequent experiments, with comparable exclusion rates. Further inspection of the RT data revealed normally distributed RTs, as verified by Kolmogorov–Smirnov tests (all ps > .1; similar results were obtained in all subsequent experiments). Greenhouse–Geisser corrected values are reported in cases in which Mauchly’s test of sphericity was significant (p < .05).
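A minimal sketch of the exclusion rule is given below. It assumes that the mean and standard deviation are computed over each participant's correct trials as a whole; the text does not specify whether they were computed overall or per condition, and `trim_rts` is a hypothetical helper.

```python
from statistics import mean, stdev


def trim_rts(rts, correct, criterion=2.5):
    """Drop error trials, then drop RTs beyond mean +/- criterion * SD.

    Assumption: mean and SD are taken over all correct trials of one
    participant (the source does not state the exact grouping).
    """
    valid = [rt for rt, ok in zip(rts, correct) if ok]
    m, sd = mean(valid), stdev(valid)
    lower, upper = m - criterion * sd, m + criterion * sd
    return [rt for rt in valid if lower <= rt <= upper]


# Twenty ordinary RTs plus one extreme value, all correct responses.
rts = [500] * 20 + [5000]
kept = trim_rts(rts, [True] * len(rts))
print(len(kept))
```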

Figure 2 shows mean RTs for old and new contexts as a function of epoch. A repeated measures ANOVA with the factors Context (old, new) and Epoch (1–8) yielded a significant main effect of context, F(1, 15) = 10.36, p < .01, and a marginally significant main effect of epoch, F(1.34, 20.13) = 3.27, p = .075. RTs were on average 57 ms faster for old than for new contexts, and they decreased by about 166 ms from the first to the last epoch. The interaction between context and epoch was not significant, F(7, 105) = 1.31, p > .2. When Target Location (location in odd or in even blocks) was entered as a third factor into the analysis, the Context × Target Location interaction did not reach significance (p > .3; all other effects were as described above); that is, the magnitude of contextual cueing for the two target locations was not systematically influenced by the order of presentation (similar results were obtained in Experiment 2). An additional analysis performed on individual blocks (rather than epochs) revealed the first significant difference between old and new contexts to occur in Block 5, t(15) = −2.86, p = .01, which is comparable to findings of fast contextual learning in previous studies (e.g., Conci et al., 2011) and to all subsequent experiments reported here.

Fig. 2. Mean RTs (in milliseconds, with associated standard error bars) for old and new contexts (filled and unfilled symbols, respectively) as a function of epoch in Experiment 1

Recognition test

Overall, old and new contexts were classified as old and new, respectively, in 51% of all trials. Participants correctly identified old contexts in 45.8% of the trials (hit rate), and their false-alarm rate of reporting new contexts as old (46.9%) was comparable to the hit rate, t(15) = −0.21, p = .84. This suggests that participants were unaware of the repeated contexts during the experiment.

Discussion

The results of Experiment 1 replicated previous findings of Chun and Jiang (1998), showing that contextual cueing can occur for invariant contexts paired with two distinct target locations. Targets in old-context displays were detected 57 ms faster than targets in new-context displays. Moreover, the recognition test scores suggested that participants learned the associations between the invariant context and the target locations implicitly.

In comparison to Chun and Jiang (1998), who reported only a marginally significant contextual-cueing effect of 35 ms for two target locations, the 57-ms effect observed here was more robust and statistically reliable. This may suggest that both the alternating order of target presentations—which would facilitate associating both target locations equally well with the context—and the larger number of trials contributed to the formation of stronger context–target associations. However, contextual cueing for two-target displays was still substantially reduced as compared to similar experiments with only one target location for each display (e.g., Conci & von Mühlenen, 2009, reported contextual-cueing effects greater than 200 ms). This overall reduction in the magnitude of contextual cueing could be the result of multiple-target learning (as suggested by Brady & Chun, 2007). Alternatively, observers may learn only one of two target locations effectively (single-target learning), in which case contextual cueing would be reduced because positive contextual-cueing effects (for one location) would be averaged with near-zero effects (for the other location).

Experiment 2

In order to examine the effectiveness of contextual cueing for displays with different numbers of target locations, in Experiment 2 we implemented a within-subjects design to enable a direct comparison of contextual cueing between one-target displays (baseline) and two-target displays. Half of the search displays were paired with one target location, and the other half with two target locations. On the basis of Experiment 1 and previous findings (e.g., Chun & Jiang, 1998), we expected to find a reduction of contextual cueing when there were two target locations, rather than one target location, paired with a given contextual layout.

Method

The apparatus, stimuli, design, and procedure were similar to those of Experiment 1, except that half of the old and new displays were paired with one target location (baseline) and the other half with two target locations. Overall, 36 target locations were used in Experiment 2. One-target and two-target displays were randomly intermixed within blocks (40 in total). Again, two-target displays contained one of two possible target locations in alternating order across blocks; that is, each of the two target locations was shown 20 times.

A group of 21 participants took part in the experiment (15 women, 6 men; mean age = 26.9 years, age range = 19–50 years). All participants had normal or corrected-to-normal visual acuity and were right-handed. They received either payment (€8) or one course credit.

Results

Search task

The overall error rate was relatively low (2.1%), and a repeated measures ANOVA with the factors Context (old, new), Targets (one, two), and Epoch (1–8) only revealed a significant interaction between targets and epoch, F(3.72, 74.39) = 3.42, p < .05. Errors increased slightly from Epoch 1 (2.3%) to Epoch 8 (2.7%) for one-target displays, as compared to a slight decrease in errors (from 2% to 1.5%) for two-target displays.

Individual mean RTs were calculated for each variable combination, excluding error trials and outliers. Figure 3 shows mean RTs for old and new contexts as a function of epoch, separately for displays paired with one (left panel) and two (right panel) target locations. A repeated measures ANOVA with the factors Context (old, new), Targets (one, two), and Epoch (1–8) revealed significant main effects of context, F(1, 20) = 14.05, p < .01, and epoch, F(3.44, 68.75) = 18.48, p < .001. RTs were on average 67 ms faster for old relative to new contexts, and they decreased by about 169 ms from the first to the last epoch. Importantly, the interaction between context and targets was also significant, F(1, 20) = 6.19, p < .05, due to larger contextual-cueing effects for one-target displays (101 ms) as compared to two-target displays (33 ms). As can be seen in Fig. 3 (right panel), contextual cueing for two-target displays only emerged from Epoch 3 onwards, reaching sizes comparable to those in Experiment 1 only in the last two epochs [57 and 55 ms, t(20) = −2.22, p = .04, and t(20) = −1.93, p = .07, respectively].

Fig. 3. Mean RTs (in milliseconds, with associated standard error bars) for old and new contexts (filled and unfilled symbols, respectively) as a function of epoch in Experiment 2, for displays paired with one (left panel) and with two (right panel) target locations

Recognition test

The overall accuracy of recognising old and new contexts was 45.2%. For one-target displays, participants correctly identified old contexts on 56.4% of trials (hit rate), but this did not differ from the false alarm rate of 49.6%, t(20) = 1.21, p = .24. Similarly, the numbers of hits (57.9%) and false alarms (49.6%) were statistically comparable for two-target displays, t(20) = 1.84, p = .08, suggesting that participants were mostly unable to explicitly discern between old and new contexts.

Analysis by separate target locations

The results of both Experiments 1 and 2 revealed a contextual-cueing effect for displays with two target locations, but the effect was considerably reduced relative to the baseline condition with one target location. To examine whether this reduction was due to learning of only one of the two target locations, the data of all two-target displays from Experiments 1 and 2 were collapsed. For each participant, the mean contextual-cueing effect was computed separately for each display and target location. Subsequently, for each display, the target location with the larger contextual-cueing effect was assigned to a “dominant target” category, while the target location with the smaller contextual-cueing effect was assigned to a “minor target” category. As can be seen from Fig. 4, the averaged contextual-cueing effect was positive and large only for the dominant target location (204 ms), while being negative for the minor target (−124 ms) [comparison of dominant vs. minor targets: t(36) = 18.16, p < .001]. Contextual cueing for both the dominant and minor target locations differed reliably from zero, as revealed by one-sample t tests, t(36) = 11.14, p < .001, and t(36) = −8.09, p < .001, respectively. This pattern of positive and negative cueing effects indicates that only one of two target locations was effectively cued by a repeated context, whereas there were significant costs for the other location.
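The sorting step can be sketched as below. The input is assumed to be one participant's contextual-cueing effects (in ms) for the two target locations of each old display; how the per-display effect was derived from the new-context RTs is not detailed in the text, so the effects are taken as given. `split_dominant_minor` is a hypothetical helper and the example values are made up.

```python
from statistics import mean


def split_dominant_minor(effects_by_display):
    """effects_by_display: list of (effect_location_1, effect_location_2) in ms.

    For each display, the larger effect is labelled 'dominant' and the
    smaller one 'minor'; the function returns the two category means.
    """
    dominant = [max(pair) for pair in effects_by_display]
    minor = [min(pair) for pair in effects_by_display]
    return mean(dominant), mean(minor)


# Made-up effects for two displays of one hypothetical participant.
print(split_dominant_minor([(200, -100), (-50, 150)]))
```

Because the label is assigned after the fact, the dominant mean can never fall below the minor mean, which is why a control for the sorting procedure is needed.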

Fig. 4. Mean contextual cueing (in milliseconds, with associated standard error bars) for dominant and minor target locations (collapsed data for all two-target displays from Experiments 1 and 2)

In order to demonstrate that the difference in contextual cueing between the dominant and minor target locations was not simply an artefact of our sorting procedure, one-target displays (baseline) were also examined for equivalent effects (Experiment 2 only). This was done by applying a sorting procedure analogous to the one with two-target displays: For each participant, pairs of one-target (baseline) displays were randomly selected (which can be considered equivalent to a random pairing of target locations for two-target displays), and for each pair, displays that generated a larger and a smaller contextual-cueing effect were assigned to a “dominant” and a “minor” category, respectively, exactly as in the procedure described above. The resulting mean dominant contextual-cueing effect was large and positive (251 ms), and the mean minor effect negative (−49 ms) [comparison of dominant vs. minor cueing effects: t(20) = 12.56, p < .001]; note, though, that only the dominant effect differed significantly from zero, t(20) = 9.25, p < .001 [minor, t(20) = −1.85, p = .08]. In a subsequent step, dominant and minor contextual-cueing effects in the baseline condition (one-target displays) were compared with contextual cueing of dominant and minor target locations in two-target displays (Experiment 2 only). The results revealed the dominant contextual-cueing effects to be comparable between the one- and two-target displays (251 vs. 205 ms), t(20) = 1.30, p = .28. By contrast, the effect for the minor target location in two-target displays was significantly smaller (i.e., in a more negative direction) compared to the minor effect in the baseline [−139 vs. −49 ms; t(20) = 3.02, p = .01], indicating considerable costs, of 90 ms, for the minor target location in two-target displays relative to the baseline condition. Thus, while dominant contextual cueing was comparable between both types of displays, there were pronounced contextual costs for minor target locations in two-target displays.
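Why such a baseline is needed can be seen in a small simulation: sorting randomly paired effects into a larger and a smaller member produces a dominant-minor difference even when the effects are pure noise. The noise parameters and the count of six one-target old displays are illustrative assumptions, and `random_pair_split` is a hypothetical helper, not the authors' procedure.

```python
import random
from statistics import mean


def random_pair_split(effects, rng):
    """Randomly pair per-display effects, then average the larger and the
    smaller member of each pair (the 'dominant' and 'minor' categories)."""
    shuffled = list(effects)
    rng.shuffle(shuffled)
    pairs = list(zip(shuffled[0::2], shuffled[1::2]))
    dominant = mean([max(p) for p in pairs])
    minor = mean([min(p) for p in pairs])
    return dominant, minor


rng = random.Random(1)
# Assumed: six one-target old displays with zero true effect and noisy estimates.
noise = [rng.gauss(0.0, 100.0) for _ in range(6)]
dominant, minor = random_pair_split(noise, rng)
print(dominant >= minor)
```

The informative comparison is therefore not dominant versus minor within one display type, but each category across display types, as in the analysis above.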

Between-target distance analysis

Additional analyses for all two-target displays were performed on the combined data from Experiments 1 and 2 in order to examine the influence of spatial distance between the dominant and minor target locations (range = 2.5°–20.2° of visual angle) on contextual cueing for the latter location. First, a correlation analysis revealed contextual cueing for the minor target location to decrease with increasing distance from the dominant target location, r = −.318, p < .01. In a further step, we examined whether spatial distance between the two locations facilitated positive contextual cueing for one target location or for both target locations. Displays were sorted according to whether there was a positive (i.e., above zero) contextual-cueing effect for both target locations (30.5%) or for only one target location (46.5%); the remaining displays showed a positive effect for neither location. Note that 3 observers had to be excluded from this analysis because they did not show contextual cueing for more than one target location. When both target locations were cued, the mean distance between them was significantly smaller than when only one location was cued, 7.4° versus 9.7°, respectively [t(33) = 4.27, p < .001]. This finding implies that smaller distances facilitated contextual cueing of two target locations more reliably than did larger distances. Still, with two cued target locations, the dominant location exhibited more contextual cueing than did the minor location, 362 versus 172 ms, respectively [t(33) = 10.33, p < .001]. It should be noted that the numerically large contextual-cueing effects obtained in these (and subsequent) analyses resulted from the procedure of selecting only relatively extreme cases with large contextual-cueing effects (while excluding smaller or negative values).
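The classification of displays by cueing pattern can be sketched as follows. The record layout (`dominant`, `minor`, `distance`) and the example values are hypothetical; only the above-zero criterion comes from the text.

```python
from statistics import mean


def distance_by_cueing_pattern(displays):
    """displays: list of dicts with 'dominant' and 'minor' effects (ms) and
    the between-target 'distance' (degrees of visual angle).

    Returns the mean distance for displays in which both locations show an
    above-zero effect and for displays in which only the dominant one does.
    """
    both = [d["distance"] for d in displays
            if d["dominant"] > 0 and d["minor"] > 0]
    one = [d["distance"] for d in displays
           if d["dominant"] > 0 and d["minor"] <= 0]
    return (mean(both) if both else None, mean(one) if one else None)


# Made-up records for illustration only.
example = [
    {"dominant": 300, "minor": 100, "distance": 5.0},
    {"dominant": 300, "minor": 50, "distance": 7.5},
    {"dominant": 200, "minor": -100, "distance": 10.0},
    {"dominant": -20, "minor": -80, "distance": 12.5},  # neither location cued
]
print(distance_by_cueing_pattern(example))
```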

Discussion

In agreement with previous studies (e.g., Chun & Jiang, 1998), Experiment 2 demonstrated a contextual-cueing effect for both one-target and two-target displays. But, at the same time, contextual cueing was significantly reduced for two-target displays relative to one-target displays (33 vs. 101 ms).

According to Brady and Chun (2007), a reduction in contextual cueing for two-target displays originates from the increase in inspection times due to multiple-target learning. However, close scrutiny of the collapsed data from Experiments 1 and 2 supports an alternative explanation based on single-target learning. When displays were ranked according to the size of contextual cueing for each target location, only one (the dominant) target location showed strong contextual cueing comparable to learning with one-target displays. By contrast, the other (minor) target location was associated with contextual costs, and these costs significantly exceeded negative contextual-cueing effects in baseline displays. This pattern of results suggests that contextual cueing is much less flexible than proposed. Rather, a given invariant context can reliably cue search to only one repeated target location, but (mostly) fails to facilitate search for a target presented at a second repeated location (for comparable results, see also Conci et al., 2011; Makovski & Jiang, 2010; Manginelli & Pollmann, 2009). The fact that minor target locations in two-target displays elicited larger contextual costs than those in any baseline displays indicates that the learned association with the dominant location misdirects spatial attention to that location when the target is actually presented at the other (minor) location.

In addition, contextual cueing decreased for minor target locations with increasing distance from the dominant target location, and reliable (i.e., above-zero) contextual-cueing effects for both target locations were only found when these were (relatively) close to each other (see also Brady & Chun, 2007; Makovski & Jiang, 2010). Nevertheless, even if both targets were cued successfully, one dominant target location could still be identified as exhibiting more contextual cueing than the other (minor) location (362 vs. 172 ms). Taken together, this pattern of results demonstrates that contextual cueing is not well adaptive to multiple target locations, because it effectively facilitates guidance to one target location (and its immediate surround) only.

Experiment 3

The results obtained thus far showed that contextual cueing was reduced for two-target relative to one-target displays, and this reduction occurred because only one of two targets was reliably cued. To examine whether single-target learning transfers to multiple repeated target locations in general, in Experiment 3, half of the search displays were paired with three different target locations, and the other half with one (baseline). We expected to observe contextual cueing for only one of the three alternative target locations. If reliable contextual cueing only occurred for one out of three target locations, the averaged contextual-cueing benefit for three-target displays should be even more reduced than that for two-target displays.

Method

The methodological details were similar to those of Experiment 2, except that now half of the old and new displays were paired with three distinct target locations, and the other half again with only one target location (baseline). Overall, 48 possible target locations were used in Experiment 3. Three-target displays presented all possible target locations in a systematically alternating order across blocks; that is, within a sequence of three blocks, the three target locations were presented in random order. In each block, one-target and three-target displays were presented in random order. Each target of the three-target displays was presented 14 times. Altogether, participants completed 42 experimental blocks of trials (1,032 trials). Bins of 6 blocks were aggregated into seven epochs for analysis purposes.

A group of 22 participants took part in the experiment (16 women, 6 men; mean age = 26 years; age range = 18–34 years). All participants had normal or corrected-to-normal visual acuity and were right-handed. They received either payment (€8) or one course credit.

Results

Search task

The overall error rate was relatively low (2.4%), and a repeated measures ANOVA with the factors Context (old, new), Targets (one, three), and Epoch (1–7) revealed no significant effects (ps > .3).

Individual mean RTs were calculated for each variable combination after exclusion of error trials and outliers. Figure 5 depicts the mean RTs for old and new contexts as a function of epoch, separately for one-target (left panel) and three-target (right panel) displays. A repeated measures ANOVA with the factors Context (old, new), Targets (one, three), and Epoch (1–7) yielded significant main effects of context, F(1, 21) = 4.57, p < .05, targets, F(1, 21) = 16.35, p < .01, and epoch, F(2.65, 55.63) = 14.29, p < .001. RTs were faster for the old- as compared to the new-context displays (by 46 ms), and for one-target as compared to three-target displays (by 71 ms). The main effect of epoch was reflected in a decrease in RTs, by 160 ms, from the first to the last epoch. Furthermore, the Targets × Context interaction was significant, F(1, 21) = 7.52, p < .05, due to a strong contextual-cueing effect for one-target displays (95 ms) but not for three-target displays (−3 ms). The factors Context and Epoch also interacted significantly, F(2.95, 61.93) = 3.64, p < .05, with contextual-cueing effects increasing from −19 ms in Epoch 1 to 63 ms in Epoch 7. The interaction between Targets and Epoch was significant, F(3.91, 82.03) = 3.11, p < .05, with RTs decreasing more across epochs for one-target displays (by 195 ms) than for three-target displays (by 123 ms).

Fig. 5

Mean RTs (in milliseconds, with associated standard error bars) for old and new contexts (filled and unfilled symbols, respectively) as a function of epoch in Experiment 3, for displays paired with one (left panel) and with three (right panel) target locations

Recognition test

Overall, the mean accuracy in the recognition test was 55.1%. For one-target displays, participants correctly identified old contexts on 60.6% of trials (hit rate), and this differed significantly from the false-alarm rate of 46.6%, t(21) = 2.59, p = .02, suggesting that participants were to some extent aware of the repeated contexts. For three-target displays, the rates of hits (53%) and false alarms (46.6%) were comparable and showed no evidence of explicit recognition, t(21) = 1.06, p = .30. To further qualify the explicit recognition performance in one-target displays, we examined whether the participants’ ability to recognise repeated layouts was related to the size of the contextual-cueing effect. Individual sensitivity scores d' [z(hits) – z(false alarms)] were computed as a measure of explicit recognition and correlated with the contextual-cueing effect for one-target displays. This analysis produced no evidence of a correlation, r = −.03, p = .89; that is, recognition performance was not systematically related to the size of contextual cueing.
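The sensitivity measure d' = z(hits) − z(false alarms) can be computed as in the following minimal sketch. The function name is hypothetical, and the group-level hit and false-alarm rates are used purely for illustration; the analysis reported above computed d' separately for each participant.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity d' = z(hits) - z(false alarms).

    z is the inverse of the standard normal cumulative distribution function.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


# Group-level rates from the recognition test (illustration only).
print(round(d_prime(0.606, 0.466), 2))  # one-target displays
print(round(d_prime(0.53, 0.466), 2))   # three-target displays
```

Note that the inverse normal transform is undefined for rates of exactly 0 or 1, so individual rates at these extremes would need a correction before d' can be computed; which correction, if any, was applied here is not stated.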

Analysis by separate target locations

In a subsequent step, contextual cueing for all three-target displays was analysed separately for the dominant target location and the two minor target locations (see Experiment 2 above for details of the analysis procedure). Figure 6 illustrates that the mean contextual-cueing effect for the dominant target location (271 ms) was significantly larger than the effects for the two minor target locations (−17 and −263 ms), t(21) = 11.28, p < .01, and t(21) = 14.64, p < .01, respectively. The contextual-cueing effects for the two minor target locations also differed significantly from each other, t(21) = 11.62, p < .01. Mean contextual cueing of the dominant target location was significantly greater than zero, t(21) = 8.02, p < .01, whereas contextual cueing of the minor target locations either did not differ from zero, t(21) = −0.61, p = .55, or was significantly below zero, t(21) = −9.15, p < .01.

Fig. 6

Mean contextual cueing (in milliseconds, with associated standard error bars) for dominant and minor target locations in displays with three possible target locations (Experiment 3)

To compare the contextual-cueing effects for the dominant and minor target locations (three-target displays) with the corresponding effects in the baseline condition (one-target displays), we followed the analysis in Experiment 2: Triplets of one-target displays were randomly selected (for each participant), and each triplet was then sorted by the largest (dominant), the second largest (Minor 1), and the smallest contextual-cueing effect (Minor 2) to obtain a baseline ranking for the three-target displays. Not surprisingly, in the baseline, the dominant contextual-cueing effect (331 ms) was greater than both minor effects (109 and −139 ms, respectively) [t(21) = 7.73, p < .01, and t(21) = 9.91, p < .01, for the two comparisons], and the latter two effects also differed reliably from each other, t(21) = 7.70, p < .01. Less trivially, each dominant and minor baseline contextual-cueing effect also differed significantly from zero [t(21) = 7.5, p < .01; t(21) = 3.2, p < .01; and t(21) = −4.07, p < .01, respectively].
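The baseline ranking can be sketched as follows. This is an illustration of the procedure as described, not the original analysis code: the function names, the fixed seed, and the example effects are hypothetical, and the number of one-target displays per participant is assumed to be a multiple of three.

```python
import random


def rank_baseline_triplets(effects, seed=0):
    """Group one-target display effects into random triplets and rank each triplet.

    effects: contextual-cueing effects (ms), one per one-target display.
    Returns a list of (dominant, minor_1, minor_2) tuples, largest effect first.
    """
    if len(effects) % 3 != 0:
        raise ValueError("number of displays must be a multiple of three")
    rng = random.Random(seed)
    shuffled = list(effects)
    rng.shuffle(shuffled)  # random assignment of displays to triplets
    triplets = [shuffled[i:i + 3] for i in range(0, len(shuffled), 3)]
    return [tuple(sorted(t, reverse=True)) for t in triplets]


def mean_by_rank(ranked):
    """Average the dominant, Minor 1, and Minor 2 effects across triplets."""
    n = len(ranked)
    return tuple(sum(t[k] for t in ranked) / n for k in range(3))


# Hypothetical effects (ms) for six one-target displays of one participant.
example = [310, 120, -150, 90, 400, -60]
ranked = rank_baseline_triplets(example)
print(ranked)
print(mean_by_rank(ranked))
```

Because sorting within random triplets necessarily orders the effects, differences between the ranked baseline categories are expected by construction; the informative comparison is between these ranked baseline effects and the ranked effects in three-target displays.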

Next, dominant and minor contextual cueing in the baseline condition were compared to contextual cueing of dominant and minor target locations in the three-target displays. As in Experiment 2, dominant contextual cueing was comparable between one-target displays (331 ms) and three-target displays (271 ms), t(21) = −1.56, p = .13. However, contextual cueing of minor target locations (−17 and −263 ms) was significantly smaller than the minor contextual-cueing effects in the baseline (109 and −139 ms) [t(21) = −2.82, p = .01, and t(21) = −3.22, p < .01, respectively]. In sum, the dominant contextual-cueing effect in the baseline was similar to that for dominant target locations in three-target displays. By contrast, minor target locations in three-target displays showed no contextual-cueing effect, or even a contextual cost, whereas minor effects in the baseline still reflected a reliable contextual benefit (at least for the Minor 1 category). Thus, minor target locations in three-target displays were associated with significant contextual costs beyond the smallest effects in the baseline.

Between-target distance analysis

Again, we analysed how the spatial distance between the dominant and minor target locations in three-target displays (range = 2.5°–21.5° of visual angle) influenced contextual cueing for the minor target locations. Overall, contextual cueing for the minor target locations decreased with greater distance from the dominant location, r = −.335, p < .01, and r = −.331, p < .01, respectively (partial correlations, controlling for the distance between the minor target locations). In a further step, RTs for three-target displays were sorted into four groups, according to whether (above-zero) contextual-cueing effects were obtained for all three target locations (13.6% of the data), for two target locations (30.3%), for one target location (40.9%), or for none of the target locations. A one-way ANOVA revealed that the mean distance differed significantly between groups, F(3, 128) = 6.84, p < .001 [with a significant linear trend: F(1, 128) = 11.10, p < .01]. Mean distances were 10.7°, 9°, and 7.6° for contextual cueing of one, two, and three target locations, respectively, suggesting that the integration of multiple target locations into a learned context was only possible with smaller between-target distances. When two target locations were successfully cued, the average effect was 394 ms for the dominant target location and 163 ms for the minor target location (−238 ms for the “uncued” location; all ps < .001). When three target locations exhibited contextual cueing, the average effect was 441 ms for the dominant target location, and 319 and 155 ms for the first and the second minor target locations, respectively (all ps < .001).
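The grouping step can be sketched as follows. This is a hedged illustration only: the function names and the example displays are hypothetical, and how a single between-target distance was derived per display (here, one value per display) is an assumption not specified above.

```python
def count_cued_locations(effects):
    """Number of target locations with an above-zero contextual-cueing effect."""
    return sum(1 for e in effects if e > 0)


def mean_distance_by_group(displays):
    """Average between-target distance per number of cued locations.

    displays: list of (effects, distance) pairs, where effects holds the
    contextual-cueing effects (ms) of the three target locations and distance
    is the assumed per-display between-target distance (degrees of visual angle).
    """
    groups = {}
    for effects, distance in displays:
        groups.setdefault(count_cued_locations(effects), []).append(distance)
    return {k: sum(v) / len(v) for k, v in sorted(groups.items())}


# Hypothetical displays, not data from the experiment.
displays = [
    ((441, 319, 155), 7.0),
    ((394, 163, -238), 9.0),
    ((271, -17, -263), 11.0),
    ((300, -40, -200), 10.0),
    ((-20, -80, -150), 12.0),
]
print(mean_distance_by_group(displays))
# {0: 12.0, 1: 10.5, 2: 9.0, 3: 7.0}
```

In this made-up example, the mean distance shrinks as more locations are cued, which is the pattern the ANOVA above tests for in the actual data.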

Discussion

In Experiment 3, we compared contextual cueing between one-target displays (baseline) and three-target displays. Overall, only one-target displays, but not three-target displays, generated reliable contextual cueing (95 and −3 ms, respectively). In addition, search in three-target displays was slowed relative to one-target displays, which might point to extended inspection times due to the resolution of multiple associations between an invariant context and various target locations (see Brady & Chun, 2007). However, further analyses revealed that only a single, dominant target location was successfully cued by an invariant context, with effects comparable to those in baseline, one-target displays. By contrast, the two remaining (minor) target locations did not show reliable contextual cueing and were associated with significant contextual costs when compared to the smallest effects in baseline displays. Of course, targets of three-target displays were presented fewer times than targets of one-target displays, which could have affected the speed of learning but had no influence on the overall contextual-cueing effect of the dominant target location. Therefore, the lack of observable contextual cueing for three-target displays can be attributed to single-target learning.

Moreover, as with two-target displays (Experiment 2), relative proximity between target locations facilitated contextual cueing for minor target locations and enhanced, to a certain extent, contextual cueing of two, or even all three, target locations by one and the same, invariant context. However, the size of contextual cueing for one or two proximal target locations never reached the same level as that for the dominant target location. This pattern of results again demonstrates that contextual cueing can index only a single target location (and its immediate surround) reliably, but fails to represent multiple target locations within an invariant context.

General discussion

The repeated presentation of invariant spatial item layouts facilitates visual search by guiding attention more directly to a learned target location. In the present study, invariant contexts were paired with multiple target locations (each presented on different trials) to investigate the adaptive properties of contextual cueing. Altogether, our results revealed that contextual cueing integrated only one target location successfully, but failed to reliably facilitate search for a second or third target location.

In line with previous results by Chun and Jiang (1998), contextual-cueing effects were obtained for repeated search displays paired with two target locations (Experiments 1 and 2). However, in comparison to one-target displays, contextual cueing for two-target displays was significantly reduced (101 vs. 33 ms, respectively; Experiment 2). Subsequent analyses showed that this reduction was caused by reliable learning of only one of two target locations (i.e., the dominant target; which was, however, not determined by order of presentation). Search for the remaining minor target locations did not benefit from the invariant context, but rather, in fact, showed contextual costs that were greater than the costs observed for inefficiently learned baseline displays. Furthermore, when a third target location was paired with a given, invariant context, there was no observable contextual cueing (−3 ms) overall, while there was reliable contextual cueing, of 95 ms, for one-target displays (Experiment 3). Again, closer inspection of the result pattern showed that the substantial reduction was caused by reliable cueing of only one of three target locations. By contrast, search for targets appearing at minor locations was again characterised by contextual costs that exceeded the costs observed for inefficiently learned baseline displays. However, additional analyses of all three experiments indicated that, in a subset of the repeated displays, larger distances between the dominant and the minor target locations were related to reduced contextual-cueing effects (or, in other words, increased contextual costs) for minor target locations. Conversely, proximity between target locations seemed to enable contextual cueing of two or even three locations. Nevertheless, the dominant target location still exhibited more contextual cueing than the proximal location(s).

In sum, the present study showed that contextual cueing does not adjust to multiple target locations; rather, it is limited to enhancing a single repeated target location, and possibly its immediate surround. Accordingly, the overall reduction of contextual cueing by multiple target locations was caused by averaging across cued and uncued target locations. For two-target displays, averaging occurred at a ratio of 1:1, at least halving the overall effect. For three-target displays, this ratio was reduced to 1:2, which explains why contextual cueing for three possible target locations appeared to be ineffective overall. Therefore, our results do not converge with models that proposed a reduction in contextual cueing due to multiple-target learning (see Brady & Chun, 2007).
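The averaging account can be checked against the reported means with simple arithmetic. In the sketch below, the function names are hypothetical; the numbers are the effects reported for Experiments 2 and 3. The "ceiling" assumes, for illustration, that one location is cued at the full one-target strength while the remaining locations show a zero effect.

```python
def averaged_effect(location_effects):
    """Overall contextual cueing as the mean across target locations (ms)."""
    return sum(location_effects) / len(location_effects)


def ceiling_effect(single_target_effect, n_locations):
    """Overall effect if one location is fully cued and all others show no effect."""
    return single_target_effect / n_locations


# Experiment 3: dominant, Minor 1, and Minor 2 effects in three-target displays.
print(averaged_effect([271, -17, -263]))  # -3.0

# Experiment 2: ceiling for two-target displays given 101 ms in one-target displays.
print(ceiling_effect(101, 2))  # 50.5
```

Averaging the three location-specific effects in Experiment 3 reproduces the overall effect of −3 ms. For two-target displays, the observed effect (33 ms) lies below the 50.5-ms ceiling, consistent with the minor location contributing a cost rather than a zero effect.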

Previous studies had already reported that, following the learning of a first target location, the introduction of a second target location disrupted contextual cueing (Conci et al., 2011; Makovski & Jiang, 2010; Manginelli & Pollmann, 2009). These findings implied that the learned association between a given target location and a given invariant context hinders adaptation to a second target location. The present pattern of results replicated these findings even for displays that presented the possible target locations in alternating order (across blocks of trials), which was expected to provide optimal conditions for learning more than one target location. Consequently, changes in the context–target relation cannot be sufficiently adapted to or compensated for in contextual learning.

Nevertheless, within a relatively narrow spatial range, two-target and three-target displays revealed contextual cueing for multiple target locations, but contextual facilitation dissipated as the spatial distances among target locations increased (see also Makovski & Jiang, 2010, for similar findings). This could mean that contextual cueing establishes multiple memory-based associations between an invariant context and proximal target locations. However, the magnitude of contextual cueing still differed between the cued (dominant and minor) target locations, suggesting that contextual cueing of a second or third target was rather a side effect of contextual cueing of the dominant target. Computational models of contextual cueing (Brady & Chun, 2007) have assumed that observers build up associations between the target location and the invariant context in repeated visual search. In subsequent search, target locations are cued by a locally activated context, rather than the whole repeated display (see also Geyer, Shi, & Müller, 2010). Thus, a second or third target, located near the contextually activated dominant target location, automatically benefits from contextual cueing. Given this finding, contextual cueing of minor target locations is presumably a side effect of contextual cueing of a “primarily” cued target area. Similarly, the prominent contextual costs for distant minor target locations also result from (mis)guidance to the “primarily” cued target location.

From the present results, we conclude that observers orient attention primarily to the learned (dominant) target location, and if the target appears at its expected (i.e., learned) location, robust contextual-cueing effects occur. However, if the dominant target is absent, observers need to reorient attention to the unlearned (minor) target, which shows contextual cueing if it is located near the dominant target, but this facilitation dissipates, and even turns into considerable costs, with growing distance from the dominant location (see also Manginelli & Pollmann, 2009, who demonstrated comparable results based on eye movement measures).

Interestingly, single-target learning was equally effective even when three different target locations were paired with one and the same, invariant context. This demonstrates a remarkable degree of selective and noise-resistant (or interference-free) learning. Evidence for the resistance of contextual cueing to interference was already reported by Jiang and Chun (2001), who found contextual learning for a repeated set of nontargets presented among another set of unpredictably changing items (see also Endo & Takeda, 2005; Olson & Chun, 2002). In addition, effective learning of repeated contexts occurred even when these were intermixed with a large number of novel display layouts on five consecutive days (Jiang, Song, & Rigas, 2005). Furthermore, once contextual cueing was established for a set of old-context displays, the subsequent presentation of noise (i.e., the presentation of new-context displays) no longer affected the learned associations (Jungé, Scholl, & Chun, 2007). In general agreement with these findings, in the present study, contextual memory for the learned (dominant) target location was equally strong whether it was associated with its repeated context in 100%, 50%, or only 33% of all cases. Thus, while contextual learning is rather inflexible in adapting to changing environments (e.g., when the target location changes), the learned associations between a repeated context and a target location are remarkably stable.

But what might be the advantage of optimising selectivity at the expense of flexibility? One tentative answer is that contextual learning is, in fact, particularly effective when an invariant context cues only one target region. By contrast, if three (or even more) target locations were learned to be associated with a single invariant context, the context would provide only a vague cue, with a 33% (or smaller) chance of directly guiding attention to the relevant location—thus substantially compromising the benefit of predictive surrounds. Consequently, preserving the functional role of predictability may be more valuable in repeated visual search than a high degree of flexibility.

In summary, our findings show that contextual cueing lacks the potential of multiple-target learning. However, other adaptive processes appear to be maintained in contextual learning—for example, when a given change preserves the context–target relation (Jiang & Wagner, 2004; Nabeta, Ono, & Kawahara, 2003) or when relational changes are predictable (Conci et al., 2011). Also, real environments typically contain much richer sources of information than the simple spatial relations in the contextual-cueing paradigm, and these, in turn, could facilitate multiple-target learning. For example, contextual learning would not be particularly useful if an environment, such as a kitchen, cued only the location of the toaster, but not that of the coffee machine. Hence, factors contributing to multiple-target learning in contextual learning remain a fruitful topic for future studies.