A common assumption across several different literatures (e.g., psycholinguistics, cognition, social cognition, cognitive neuroscience) is that print activates semantics in an “automatic” fashion. The ubiquity of “the” Stroop effect (see MacLeod’s 1991 review) is almost universally taken as evidence that semantic activation is “automatic,” because subjects appear unable to refrain from reading the irrelevant word (i.e., the reading of the irrelevant word is claimed to be unintentional, ballistic, and immune to interference). Specifically, the time to name the ink color is much longer when the irrelevant color carrier is an incongruent color word (e.g., the word “BLUE” printed in RED ink) than when the color word is congruent (e.g., the word “BLUE” printed in BLUE ink).

Indeed, a recent review concludes that:

“…the methodological and empirical arguments discussed here clearly indicate that no empirical evidence from the Stroop task currently contradicts the widespread automatic view of word reading.”

Augustinova and Ferrand (2014, p. 347)

Assessing whether a process is automatic

Assessing whether a particular process is itself automatic, or not, is complicated by the number of characteristics attributed to such processes (e.g., fast, unconscious, unintended, ballistic, capacity free, not interfered with by other processes). There is no good reason to suppose that all of these properties need to hold simultaneously in order for some process to be considered automatic at some level. For example, few would argue against the claim that many cognitive processes are unconscious (and therefore would concede these processes are indeed automatic in that restricted sense); whether they are automatic by other criteria must be determined on a case-by-case basis. Thus, the dominant approach is (or at least should be) to assess whether individual criteria hold (e.g., Moors & De Houwer, 2006).

Here, we concern ourselves with the specific issue of whether semantic activation is automatic in one oft-used sense: whether processing capacity is required. In particular, see Neely and Kahan’s (2001) view:

“We conclude that unless visual feature integration is impaired through misdirected spatial attention, SA (semantic activation) is indeed automatic in that it is unaffected by the intention for it to occur and by the amount and quality of the attentional resources allocated to it.”

Neely and Kahan (2001, p. 88)

One approach that can be used to assess whether a process uses capacity is the Psychological Refractory Period (PRP) paradigm (Pashler, 1994). Subjects perform two speeded tasks (Task 1 and Task 2) with priority given to Task 1 in that subjects are instructed to overtly respond to Task 1 before responding to Task 2. Task overlap is controlled by varying stimulus onset asynchrony (SOA). At short SOAs (e.g., 50 ms), processing for the two tasks overlaps, whereas at long SOAs (e.g., 1,500 ms), it does not. The standard finding is that Task 2 response time (RT) increases dramatically as SOA decreases. The widely accepted interpretation of this finding is that both tasks use the same limited-capacity processor, which acts as an all-or-none processing bottleneck (e.g., Pashler, 1994). According to this account, elements of Task 2 processing that require capacity are delayed until this capacity is no longer needed by Task 1. In contrast, early Task 2 processes that do not need capacity can proceed in parallel with Task 1 processes that do use capacity (see Footnote 1).

Insight into whether a manipulated process uses capacity is provided by examining how the effect of a Task 2 factor changes with increasing task overlap (Pashler, 1994). The delay of Task 2 processes that use capacity creates a period of cognitive slack during which Task 2 processing waits for the bottleneck to be freed up by Task 1 (see Fig. 1, panel A). This period of cognitive slack can absorb the effects of a Task 2 factor that affects processing prior to the bottleneck. Thus, the effects of a pre-bottleneck factor (i.e., one that does not require capacity) will be reduced or eliminated at short SOAs. However, if the effects of a Task 2 factor arise at, or after, the bottleneck, the effects of this factor will be additive with SOA (see Fig. 1, panels B and C).

Fig. 1

An illustration of locus of slack logic in the context of the Psychological Refractory Period (PRP) paradigm. Panel A illustrates absorption of an early effect into cognitive slack (under-additivity of a Task 2 factor with decreasing stimulus onset asynchrony (SOA)). Panels B and C illustrate how central and late effects in Task 2 are unaffected by task overlap (additive effects of a Task 2 factor and SOA).
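The locus of slack logic can be made concrete with a small simulation. The following is a minimal sketch of a three-stage central bottleneck account, not code from the study: the `Stages` class, the function names, and all stage durations are hypothetical values chosen only for illustration. It assumes that each task consists of a pre-bottleneck stage, a bottleneck (capacity-demanding) stage, and a post-bottleneck stage, and that Task 2's bottleneck stage cannot start until Task 1's has finished.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stages:
    """Stage durations in ms (hypothetical illustrative values)."""
    pre: float      # pre-bottleneck processing (does not need capacity)
    central: float  # bottleneck processing (needs capacity)
    post: float     # post-bottleneck processing (e.g., response execution)


def rt2(task1: Stages, task2: Stages, soa: float) -> float:
    """Task 2 RT, measured from Task 2 stimulus onset, under a central bottleneck."""
    bottleneck_free = task1.pre + task1.central       # when Task 1 releases the bottleneck
    central_start = max(soa + task2.pre, bottleneck_free)  # Task 2 waits if necessary
    return central_start + task2.central + task2.post - soa


def factor_effect(task1: Stages, base: Stages, slowed: Stages, soa: float) -> float:
    """Size of a Task 2 factor effect (slowed minus baseline condition) at one SOA."""
    return rt2(task1, slowed, soa) - rt2(task1, base, soa)


task1 = Stages(pre=100, central=300, post=100)
base = Stages(pre=100, central=200, post=100)
pre_slowed = Stages(pre=140, central=200, post=100)      # 40 ms effect before the bottleneck
central_slowed = Stages(pre=100, central=240, post=100)  # 40 ms effect at the bottleneck

for soa in (50, 1500):
    early = factor_effect(task1, base, pre_slowed, soa)
    central = factor_effect(task1, base, central_slowed, soa)
    print(f"SOA {soa:>4} ms: pre-bottleneck effect = {early:.0f} ms, "
          f"bottleneck effect = {central:.0f} ms")
```

With these assumed durations, the 40 ms pre-bottleneck effect is fully absorbed into slack at the 50 ms SOA (under-additivity), whereas the 40 ms bottleneck-stage effect is the same size at both SOAs (additivity).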

The PRP paradigm has been used to determine whether various visual word recognition processes can be characterized as needing capacity or not (see McCann, Remington, & Van Selst’s 2000 seminal paper). In particular, work with this paradigm has demonstrated that feature level, letter level, and word level processing, at least in skilled readers, do not require capacity, and hence manipulations that affect these levels are under-additive with decreasing SOA in this paradigm (Besner, Reynolds & O’Malley, 2009; O’Malley, Reynolds, Stolz & Besner, 2008; Reynolds & Besner, 2006). It follows, therefore, that if semantic activation (which is necessarily contingent on these prior processes) also does not require capacity then factors used to index semantic activation will be under-additive with SOA in Task 2 in the context of the PRP paradigm.

Stroop effects in the context of PRP: Is semantic activation capacity free?

One line of evidence viewed as inconsistent with the hypothesis that semantic activation is capacity free is that the Stroop effect (incongruent minus congruent conditions) is additive with SOA in Task 2 of the PRP paradigm (Fagot & Pashler, 1992; Magen & Cohen, 2002, 2010). This result has been taken to imply that semantic activation uses capacity on the grounds that it is interfered with by Task 1.

However, one logical problem with this interpretation is that color words affect color naming due to both response competition and semantic level competition (e.g., Augustinova and colleagues, 2010, 2015; Manwell, Roberts & Besner, 2004). Response competition arises because the color word activates an incorrect phonological representation (particularly when the word is incongruent with the color and belongs to the response set) through direct mappings from its orthographic lexical representation to its phonological lexical representation. The word also activates its semantic representation in parallel and hence also produces semantic competition (see Fig. 2). The semantic interference effect is considerably smaller than response interference from color words that appear in the response set (see Augustinova et al., 2010, 2015; Manwell et al., 2004). Thus, even if semantic activation is capacity free, whereas response competition is bottlenecked, additivity of congruency and SOA in the standard Stroop context is still expected because the small semantic component effect will be hidden by the longer time taken to resolve response competition in the standard version of the Stroop effect (see Footnote 2).

Fig. 2

Two sources of conflict in the Stroop task

What is needed, then, is a version of the Stroop task that isolates semantic level competition. Following Neely and Kahan (2001), this can be accomplished by using color-associated words (e.g., “sky”) instead of color words (e.g., “blue”). This semantic Stroop effect is (largely) independent of response interference because none of the color-associated words are in the response set. Instead, it is localized to the semantic level because the incongruent condition (e.g., SKY printed in green) will yield more competition at the semantic level than the congruent condition (e.g., FROG printed in green) or neutral condition (e.g., TABLE printed in green). As Augustinova and Ferrand argue:

“Stroop interference can continue to serve as the “gold standard” of automaticity if, and only if, the semantic conflict is appropriately isolated and adequately separated from the response conflict, as can easily be done when the semantic Stroop paradigm is administered with vocal responses.”

Augustinova and Ferrand (2014, p. 347)

For present purposes, then, if semantic activation is capacity free, as suggested by the claim that semantic activation is automatic, then the semantic Stroop effect seen at the long SOA should be eliminated (or at least reduced in magnitude) at the short SOA in the PRP paradigm because of absorption into slack. In contrast, if semantic activation requires capacity, then semantic activation will be bottlenecked by Task 1 and the semantic Stroop effect will be additive with SOA. Below we report the results of an experiment in which both kinds of Stroop effect (using color words and color-associated words) are manipulated in the context of the PRP paradigm. To anticipate the results, both kinds of Stroop effect are additive with SOA, consistent with the inference that semantic activation is capacity limited.

Method

Participants

Thirty-eight undergraduate students at Trent University participated in the experiment for credit towards an eligible course. All participants reported English as their first language, normal or corrected-to-normal visual acuity, and normal color vision.

Stimuli

The ink colors were the standard E-Prime colors for red, yellow, blue, and green (Schneider, Eschman, & Zuccolotto, 2002). The word stimuli consisted of three sets of character strings: color words, color associates, and neutral words. The color words were RED, YELLOW, BLUE, and GREEN. The color associates (SKY, FROG, LEMON, and TOMATO) and neutral words (KEG, JAIL, TABLE, and PALACE) were taken from Manwell et al. (2004).

Apparatus

The experiment was run on a Dell Vostro 420 Desktop computer with a core 2 quad processor, an ATI Radeon 4350 video card, and the Windows XP operating system with service pack 2. Stimuli were presented on a Dell E207 LCD screen. Data collection and stimulus presentation were controlled using E-Prime 2.0 software and a PST Response Box, microphone, and voice-key assembly (Schneider et al., 2002).

Procedure

The experiment consisted of 36 practice trials followed by 288 experimental trials. There were equal numbers of congruent, incongruent, and neutral trials for each word type (color associates/color words).

Each trial began with a fixation marker (“+”) in the center of the screen for 500 ms followed by a blank screen for 500 ms. A timer was started in E-Prime at the onset of the 50-ms tone, which was presented over a pair of Altec Lansing BX1220 speakers. The Stroop stimulus was presented once the timer value was greater than or equal to the SOA value for the trial (i.e., SOA = 50 ms or 1,500 ms; see Footnote 3). The Stroop stimulus was presented in the center of the screen in lowercase, 18-point, bold Courier New font. Subjects were instructed to indicate whether the tone was “high” or “low” via button press (m vs. n). The assignment of button to tone was counterbalanced across subjects.

The Stroop stimulus remained on the screen until a vocal response triggered the voice key. Response times for Task 1 and Task 2 were calculated by querying the timer when a response was collected and subtracting the time at which the corresponding stimulus was presented. Once a response was made to both tasks, the experimenter coded the subject’s vocal response as blue, red, yellow, green, or mistrial (e.g., voice-key failure). In the case of an incorrect vocal or button-press response, feedback indicating the error was presented for 3,200 ms. Subjects were instructed to perform both tasks as quickly as possible, but to give priority to Task 1.

Results

Separate robust repeated measures analyses of variance with 20 % Winsorized means (see Footnote 4), with distractor type (incongruent, congruent, and neutral) and SOA (50 ms vs. 1,500 ms) as factors, were conducted for each Stroop type (associates vs. color words) and for each dependent variable. Robust statistical methods were used because they address issues of non-normality and heteroscedasticity (Wilcox, 1998). Prior to analysis of the RT data, trials on which an error occurred in either Task 1 or Task 2 were removed (16.2 %). An additional 2.7 % of the correct RT data were excluded due to voice-key failures. Mean RTs and percentage errors for each condition can be seen in Table 1 (see Footnote 5).
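For readers unfamiliar with Winsorizing, the idea can be sketched in a few lines. This is only an illustration of a 20 % Winsorized mean under one common convention (replace the lowest and highest 20 % of observations with the nearest retained value, then average); it is not the authors' analysis code, it does not implement the robust ANOVA itself, and the response times below are invented for the example.

```python
import math


def winsorized_mean(values, proportion=0.2):
    """Mean after pulling each tail's most extreme observations in to the nearest kept value."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    n = len(ordered)
    g = math.floor(proportion * n)               # observations Winsorized in each tail
    low, high = ordered[g], ordered[n - g - 1]   # smallest and largest retained values
    clipped = [min(max(v, low), high) for v in ordered]
    return sum(clipped) / n


# Hypothetical RTs (ms) containing one very slow trial.
rts = [410, 430, 450, 470, 490, 510, 530, 550, 570, 1900]
print(sum(rts) / len(rts))    # ordinary mean: 631.0
print(winsorized_mean(rts))   # 20 % Winsorized mean: 500.0
```

The single 1,900 ms trial pulls the ordinary mean well above the bulk of the data, whereas the Winsorized mean stays near the center of the distribution, which is the sense in which such estimators are robust to non-normality.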

Table 1 Response time (RT; ms) and percentage error as a function of word type (color words vs. color associates), condition (congruent, neutral, and incongruent), stimulus onset asynchrony (SOA; 50 ms vs. 1,500 ms), and task (color identification vs. tone discrimination)

Standard Stroop task (color words)

Task 1

The RT data revealed a main effect of SOA (F=12.48, p<.001). There was no effect of distractor type (F=.557, p=.573), nor was there an interaction between SOA and distractor type (F=.237, p=.789).

In Errors, there was a main effect of SOA; more errors were made at the short SOA compared to the long SOA (F=21.94, p<.001). There was no reliable effect of distractor type (F=2.45, p=.087), nor was there an interaction between SOA and distractor type (F=.24, p=.789).

Task 2

Analysis of the RT data revealed a main effect of SOA (F=368.7, p<.001). There was a main effect of distractor type (F=44.15, p<.001). Critically, there was no interaction between distractor type and SOA (F=.22, p=.799). Follow-up tests revealed a reliable Stroop effect (incongruent – congruent conditions) (F=87.2, p<.001), a reliable interference effect (incongruent – neutral conditions) (F=49.688, p<.001), and a reliable facilitation effect (neutral – congruent conditions) (F=11.663, p<.001). None of the Stroop, interference, or facilitation effects interacted with SOA (F < .001, p=.992; F=.283, p=.595; and F=.370, p=.543, respectively).

In Task 2 Errors, there was no main effect of SOA (F=.148, p=.700). There was a main effect of distractor type (F=27.44, p<.001). There was no interaction (F=.038, p=.963). Follow-up tests revealed that there were reliable Stroop (F=45.93, p<.001) and interference effects (F=55.97, p<.001), but no facilitation effect (F=1.95, p=.163). Neither the Stroop effect, the interference effect, nor the facilitation effect interacted with SOA (F=.0786, p=.787, F=.059, p=.808 and F < .001, p=.980, respectively).

Semantic Stroop task (color-associated words)

Task 1

The RT data revealed a main effect of SOA (48 ms, F=8.9, p=.003). There was no effect of distractor type (F=.62, p=.537), nor was there an interaction between SOA and distractor type (F=.57, p=.565).

For Errors, there was a main effect of SOA, where more errors were made at the short SOA than at the long SOA (F=12.6, p<.001). There was no effect of distractor type (F=.048, p=.953), nor was there an interaction between SOA and distractor type (F=.668, p=.513).

Task 2

The RTs yielded a main effect of SOA (F=311.8, p<.001). There was a main effect of distractor type (F=12.3, p<.001). There was no interaction between distractor type and SOA (F=.410, p=.664). Follow-up tests revealed that there were reliable Stroop (F=14.18, p<.001) and interference effects (F=23.2, p<.001), but no facilitation effect (F=.485, p=.486). Neither the Stroop effect, the interference effect, nor the facilitation effect interacted with SOA (F=.312, p=.577, F=.205, p=.651, and F=.796, p=.372, respectively).

For Errors, there was no main effect of SOA (F=.762, p=.383). There was no main effect of distractor type (F=1.600, p=.202) and no interaction (F=.464, p=.629).

Discussion

There are two important aspects to the results of this experiment. The first is that we replicated the additivity of the standard Stroop effect and SOA (Fagot & Pashler, 1992; Magen & Cohen, 2002, 2010; White & Besner, 2016). The standard Stroop effect (incongruent minus congruent) is 187 ms at the long SOA and 187 ms at the short SOA. These results are consistent with the effect of response competition being bottlenecked by Task 1, but do not address the issue of whether semantic level processing is bottlenecked by Task 1.

The second and novel result is that additivity was also observed for the semantic Stroop effect (color-associated words) and SOA. The semantic Stroop effect (incongruent minus congruent) is 32 ms at the long SOA and 43 ms at the short SOA. This demonstration of additivity between semantic Stroop and SOA is conventionally understood in terms of a factor in Task 2 indexing a process that is delayed (bottlenecked) by processing in Task 1. Such bottlenecking occurs because processing in Task 2 for that factor relies on some limited resource (capacity) that is not available until Task 1 no longer needs it. The present data, therefore, are inconsistent with the hypothesis that semantic activation is capacity free (see Footnote 6). Instead, they are consistent with the conclusion that it is capacity demanding, and by that criterion is therefore not automatic (see also White & Besner, 2016).
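The additivity claim can be restated as simple arithmetic on the effect sizes reported above. The notation here ($E$ for the incongruent minus congruent difference, $\Delta$ for the interaction contrast) is introduced only for this illustration and does not appear in the original analysis.

```latex
\begin{align*}
E(\mathrm{SOA}) &= \mathrm{RT}_{\text{incongruent}}(\mathrm{SOA}) - \mathrm{RT}_{\text{congruent}}(\mathrm{SOA}) \\
\Delta &= E(50\ \mathrm{ms}) - E(1{,}500\ \mathrm{ms}) \\[4pt]
\Delta_{\text{standard}} &= 187 - 187 = 0\ \mathrm{ms} \\
\Delta_{\text{semantic}} &= 43 - 32 = 11\ \mathrm{ms}
\end{align*}
```

Absorption into slack predicts $\Delta < 0$ (a smaller effect at the short SOA). Neither contrast is negative, and neither Stroop effect interacted reliably with SOA in the Task 2 RT analyses (F < .001, p=.992 for color words; F=.312, p=.577 for color associates), which is the pattern described here as additivity.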

To the future

These data raise the novel theoretical question of why semantic activation is subject to capacity limitations, whereas prior processes (feature, letter, and word level activation) are not, in the same paradigm with the same Task 1 manipulation. One possibility is that conceptual processing is simply that much more complex than the processes involved in lexically identifying a visually presented word. Another possibility is that the linguistic/conceptual distinction (one of domain specificity) is important. This possibility was alluded to by Reynolds and Besner (2006) when they argued that some form of capacity was required to translate across modalities (i.e., orthography to phonology). Finally, a third possibility is that conceptual level processing may not be intrinsically capacity limited, but that it is sometimes subject to interference from other ongoing tasks. According to this account, for example, working memory would have to organize (control) both orthographic and linguistic processing (feature, letter, and word level) in conjunction with Task 1, and the addition of another domain (here, semantics) would be too taxing, so that semantic processing has to queue. This latter account is consistent with the recent observation reported by Ford and Reynolds (2016) that a semantic factor and SOA yielded under-additive effects when making parity judgments in the context of the PRP paradigm (e.g., where the semantic domain is very narrow, and symbol and meaning tightly conjoined, as in parity judgments about Arabic numerals). Whatever the answer, the present results call for a reconsideration of the received wisdom about the automaticity of semantic processing when reading words.