Introduction

The ability to sustain attention over time is critical for successful performance in a variety of everyday activities. Maintaining focus on task goals is effortful, however, and in reality our attention fluctuates; sometimes we are on task, while other times we stray off because of boredom, fatigue, or distraction. Although we are all susceptible to occasional failures of sustained attention, the ability to remain focused varies widely from person to person and is related to other cognitive abilities and personality differences, as well as to brain structure and neurological health (e.g., Carriere, Cheyne, & Smilek, 2008; Fukuda & Vogel, 2009; Kanai, Dong, Bahrami, & Rees, 2011; Robertson, Manly, Andrade, Baddeley, & Yiend, 1997; Westlye, Grydeland, Walhovd, & Fjell, 2011). In the present study, we aimed to better characterize three important aspects of sustained attention: declines in performance over time (i.e., vigilance decrements), higher-frequency fluctuations in performance (e.g., trial-to-trial changes in reaction times [RTs]), and individual differences in attentional abilities.

Studies of vigilance

Vigilance, the ability to sustain attentional focus and remain alert to stimuli over time (Warm, Parasuraman, & Matthews, 2008), was first studied during World War II to investigate why radar operators were more likely to miss rare events near the ends of their shifts. In the original vigilance task, the Mackworth clock test, observers monitored a pointer moving in regular increments around a blank clock for up to 2 h and were instructed to respond when they saw an infrequent double jump in the pointer’s movement (Mackworth, 1948). Detection accuracy declined after 30 min of watch, and later vigilance studies generally found decrements within the first 15 min of performance, or even the first 5 min under demanding task conditions (Nuechterlein, Parasuraman, & Jiang, 1983; Temple et al., 2000). Continuous performance tasks (CPTs) are another group of paradigms commonly used to study vigilance (Riccio, Reynolds, & Lowe, 2001; Rosvold, Mirsky, Sarason, Bransome, & Beck, 1956), especially in clinical populations. In contrast to more traditional vigilance tasks, which typically involve the detection of intermittent, unpredictable, and infrequent signals over a long period of time (Davies & Parasuraman, 1982), CPTs generally entail continuous discrimination of a constant stream of stimuli with relatively short interstimulus intervals (~1 s). Like traditional vigilance tasks, many CPTs (e.g., X or A-X CPTs) require responses to rare targets and generally elicit decrements in performance over time (Ballard, 2001; Rosvold et al., 1956). Although during both traditional vigilance and rare-target CPT tasks, participants are constantly monitoring for stimuli or making discriminations between targets and nontargets (e.g., Davies & Parasuraman, 1982; Szalma, Hancock, Warm, Dember, & Parsons, 2006; Warm & Jerison, 1984), overt responses are infrequent, and thus moment-to-moment fluctuations in RTs are not accessible.

In contrast, a subset of CPTs, known as not-X CPTs, in which participants respond to the majority of stimuli and withhold responses on rare target trials (e.g., the sustained-attention-to-response task, or SART, of Robertson et al., 1997, and Conners’s, 2000, CPT-II), provide a complementary approach to studying sustained attention. Because the frequent nontarget trials require constant responding, not-X CPTs enable investigation of RT patterns that precede and predict errors. For example, faster and more erratic correct responses have been shown to foreshadow errors within seconds (Cheyne, Carriere, & Smilek, 2006; Robertson et al., 1997), with neural markers of attention lapses preceding errors by up to 20 s (O’Connell et al., 2009). In these tasks, the most common errors are failures to inhibit response on target trials—that is, commission errors, which have been associated with reduced attention to task. In support of this idea, commission errors are associated with task-unrelated thoughts and mind wandering (Cheyne et al., 2006; Christoff, Gordon, Smallwood, Smith, & Schooler, 2009; Hester, Foxe, Molholm, Shpaner, & Garavan, 2005; Manly, Robertson, Galloway, & Hawkins, 1999; Robertson et al., 1997; Smallwood, Beach, Schooler, & Handy, 2008; Smallwood et al., 2004), as well as with self-report measures of absentmindedness (Cheyne et al., 2006; Robertson et al., 1997; Smilek, Carriere, & Cheyne, 2010).

Despite the efficacy of not-X CPTs in foreshadowing errors due to periodic lapses of attention, these tasks have limitations. The duration of not-X CPTs is often shorter than that of rare-target tasks, and vigilance decrements are not typically reported. During longer versions, not-X CPTs sometimes elicit vigilance decrements (e.g., Grier et al., 2003; Helton et al., 2005), but can even show improvements over time (Helton, Kern, & Walker, 2009). Since vigilance decrements are not consistently observed or reported in healthy adult populations, the concern arises that not-X CPT tasks may not adequately tax sustained attention (see also Helton & Russell, 2011, and the Discussion below for further concerns regarding not-X CPTs). One possible reason that vigilance decrements would not consistently be observed or reported is that the abrupt visual onset of each trial, which captures attention exogenously (Yantis & Jonides, 1984), may reduce demands on the endogenous maintenance of attention and enable more consistent performance over time. The effect of these abrupt onsets may be particularly apparent during not-X CPTs (rather than rare-target tasks), as the visual onsets serve as cues to execute a motor response, and are thus consistently relevant behaviorally. In support of this idea, MacLean et al. (2009) found that sudden-onset visual cues presented before stimuli in a rare-target CPT attenuated declines in perceptual sensitivity. In addition, it is possible that the frequent motor responses themselves may undermine the sustained-attention aspect of the task by tapping into other cognitive mechanisms, such as impulsivity and response strategy (e.g., Helton et al., 2009).

In summary, traditional vigilance tasks, rare-target CPTs, and not-X CPTs have been useful in studying unique aspects of sustained attention. Traditional vigilance tasks and rare-target CPTs reliably elicit performance decrements over time, while not-X CPTs, by requiring frequent responses, better assess moment-to-moment fluctuations in RTs. In an effort to jointly examine both vigilance decrements and moment-to-moment RT fluctuations, we developed the gradual-onset CPT (gradCPT), a not-X CPT that removes abrupt stimulus onsets. Instead, gradual transitions between stimuli are introduced in order to more thoroughly tax sustained attention and produce more consistent vigilance decrements than have previously been reported using short-duration not-X CPTs.

RT fluctuations and attention

In tasks with frequent behavioral responses (such as not-X CPTs), another way in which trial-to-trial variations in attention have been explored is through analyses of RT fluctuations. Systematic changes in trial-to-trial RTs have been characterized as important measures of attentional performance since the late nineteenth century (Hylan, 1898) and have been linked to attention in several ways. First, unusually slow RTs have been conceptualized as indicating lack of readiness or reduced attention to task (Weissman, Roberts, Visscher, & Woldorff, 2006), while abnormally fast RTs, thought to indicate premature or routinized responding, have been associated with failures of attentional control and response inhibition (Cheyne, Carriere, & Smilek, 2009). Second, intraindividual variability in RTs has been linked to impairments of attention and executive function (e.g., attention-deficit hyperactivity disorder, ADHD), such that more erratic responding is related to greater deficits (Sonuga-Barke & Castellanos, 2007; Stuss, Murphy, Binns, & Alexander, 2003; West, Murphy, Armilio, Craik, & Stuss, 2002). For example, even when overall speed is controlled, those with ADHD show more variable correct responses and more prominent RT fluctuations every 12–40 s (Castellanos, Sonuga-Barke, Milham, & Tannock, 2006; Di Martino et al., 2008; Vaurio, Simmonds, & Mostofsky, 2009). Important to the goals of the present study, previous work has yet to explore changes in within-subjects response variability as attention fluctuates over time (for one exception, see Faulkner, 1962).

On the basis of findings linking erratic RTs (both faster and slower) before target trials to reduced attention, and also studies relating increased RT variability to attentional impairments, we posited that examining within-subjects RT variability over time would reveal fluctuations in attention during sustained performance that are not fully captured by examining accuracy alone. To explore this possibility, we employed a novel analysis method that makes use of within-subjects fluctuations in RT variability to measure error propensity during more- and less-variable periods of gradCPT performance.

Individual differences

Despite the importance of characterizing individual differences in sustained attention, few clear psychological markers distinguishing high and low performers have been uncovered (for reviews, see Davies & Parasuraman, 1982; Reinerman-Jones, Matthews, Langheim, & Warm, 2010). However, when focusing on the overall performance during sustained-attention tasks rather than on decrements in performance over time, recent work has suggested that certain types of self-report questionnaires are related to individual differences. Specifically, scores on the Cognitive Failures Questionnaire (CFQ; Broadbent, Cooper, FitzGerald, & Parkes, 1982) and the Attention-Related Cognitive Errors Scale (ARCES; Cheyne et al., 2006) correlate modestly with commission errors on the SART (see Smilek et al., 2010, for a recent meta-analysis; CFQ, r = .21; ARCES, r = .22 to .31), such that participants who report more attention lapses in daily life make more errors during task performance (Cheyne et al., 2006; Robertson et al., 1997; Smilek et al., 2010).

Individual differences in the tendency to be mindful may also be predictive of performance on sustained-attention tasks. Here, we refer to mindfulness as open, nonjudgmental awareness of and attention to present experience (e.g., Baer, Smith, Hopkins, Krietemeyer, & Toney, 2006; Brown & Ryan, 2003). Note that this conceptualization, which stems from contemplative traditions, is distinct from the creative cognitive process described by Langer (for a review of the distinction, see Langer, 1989). Originally studied for its beneficial effects on well-being and role in stress reduction (e.g., Baer, 2003; Grossman, Niemann, Schmidt, & Walach, 2004; Hölzel et al., 2011), recently mindfulness has been helpful in understanding intraindividual variation in attention. For example, self-reported mindfulness (on the Mindful Attention Awareness Scale, or MAAS; Brown & Ryan, 2003) shows negative correlations with commission errors and positive correlations with RTs on the SART. That is, more mindful individuals make fewer errors and exhibit slower RTs (Cheyne et al., 2006; Smilek et al., 2010). Evidence also suggests that mindfulness training in the form of meditation improves sustained attention (Chambers, Lo, & Allen, 2007; MacLean et al., 2010). Although MAAS and ARCES scores are often related (negatively correlated, given scoring conventions), the MAAS has been more directly related to RT, and the ARCES to accuracy (Cheyne et al., 2006; Smilek et al., 2010), suggesting that these measures are sensitive to unique aspects of sustained attention.

The associations between self-reported attentional abilities and task performance may be stronger in the presence of distracting or task-irrelevant information (e.g., Kanai et al., 2011; Tipper & Baylis, 1987). Distractors may strengthen relationships with individual-difference measures by increasing task difficulty, and thus impeding performance in certain participants (e.g., Davies & Tune, 1969; Demeter, Hernandez-Garcia, Sarter, & Lustig, 2011), or, in some circumstances, by aiding performance in a subset of individuals via increased arousal (as was found in training studies with healthy controls [O’Connell et al., 2008] and with patients with right hemisphere damage [Robertson, Tegner, Tham, Lo, & Nimmo-Smith, 1995]). In addition, it may be that the presence of distractors more accurately reflects the sustained attention challenges that one faces in everyday life, thus better paralleling the experiences probed by questionnaires such as the CFQ and ARCES. To further explore the possible influence of distraction on sustained-attention performance, in the present study we employed two versions of the gradCPT, with and without distracting background images.

Gradual-onset CPT

We created a novel task, the gradual-onset CPT (gradCPT), to address our aims of better characterizing performance decrements over time, moment-to-moment fluctuations in RTs, and individual differences in sustained attention. The gradCPT represents a unique combination of task features, in that it both requires frequent overt responses and removes abrupt stimulus onsets that may exogenously capture attention. By using an analysis method that explores within-subjects fluctuations in RT variability during gradCPT performance, we exploited a higher-resolution and more continuous measure of attention than response accuracy. We hypothesized that the gradCPT would elicit performance decrements over time in both accuracy and RT variability. Furthermore, we hypothesized that fluctuations in RT variability would interact with error proneness, potentially revealing different attentional states and shedding light on distinct causes for errors. In addition, to examine the potential effect of distraction on the relationship between task performance and individual-difference measures, some participants performed the gradCPT with visual distraction in the background of the central task. We predicted that background distractors would potentially interfere with performance, causing more frequent errors and increased RT variability, and that, importantly, individual differences in self-reported mindfulness and everyday attention lapses (as measured by the MAAS and ARCES, respectively) would be more strongly related to gradCPT performance in the presence of distractors.

Method

Participants

A group of 29 neurologically healthy adults (20 female, nine male; age range = 18–26 years, mean = 20 years) participated in the experiment. Of these participants, 14 were randomly assigned to perform the gradCPT with intact scenes in the background of the central task (distractor present), while 15 performed the task with scrambled scenes in the background (distractor absent). There were no differences in age or gender across groups. The study was approved by the VA Boston Healthcare System institutional review board, and all participants gave written informed consent.

Stimuli and paradigm

Stimuli were created by centrally overlaying grayscale face photographs (from the MIT Face Database [Russell, 2009], with permission from Richard Russell) on scene photographs or scrambled backgrounds. The face stimuli consisted of one female and 10 male faces, cropped to show eyes, nose, and mouth, and sized to a circle of 75-pixel radius. We elected to use these stimuli in order to limit the perceptual challenge and increase the monotony of the task, as well as to roughly equate the numbers of presentations of the individual face images (male faces were presented approximately nine times more often than the female face). Pilot testing demonstrated that discrimination of these faces was close to ceiling.

Scene backgrounds (distractor present) consisted of 20 grayscale photographs of urban and rural scenes from the SUN Database (Xiao, Hays, Ehinger, Oliva, & Torralba, 2010) sized to 400 square pixels (Fig. 1b). The scrambled backgrounds (distractor absent) were created by phase scrambling the scene photographs (Fig. 1a).

Fig. 1

The gradual-onset continuous performance task (gradCPT). a Faces gradually transition from one to the next every 1,200 ms. Participants are instructed to respond to male faces and withhold responses for a target female face. The background noise gradually transitions every 2,250 ms (distractor-absent version). b Distractor-present version of the gradCPT

The images were presented on a 15-in. MacBook Pro using the Psychophysics Toolbox in MATLAB. From a viewing distance of 66 cm, the background stimuli subtended 13 deg of visual angle, and the face stimuli, 2.4 deg in both height and width.

On each trial, one face photograph gradually transitioned into the next using linear pixel-by-pixel interpolation. Each transition took 800 ms, and the faces paused for 400 ms when fully cohered (Fig. 1a; see also Supplementary Video 1). Faces were presented randomly, with male faces occurring 90 % of the time and the target female face 10 % of the time, without allowing identical faces to repeat on consecutive trials (banning repeats was necessary, since there was only one target female; this made the overall probability of targets 9 %). The backgrounds gradually transitioned out of sync with the faces, changing every 2,250 ms.
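The drop from a nominal 10 % to an overall 9 % target probability can be illustrated with a small calculation. The sketch below assumes one particular sampling scheme that the text does not spell out: after a nontarget trial the next face is the target with probability q = .10, and after a target trial the next face is never the target. The function name and this scheme are illustrative assumptions, not the authors' stimulus code.

```python
def stationary_target_probability(q: float = 0.10, n_iter: int = 200) -> float:
    """Long-run share of target trials when the target cannot repeat.

    Assumed scheme (not stated in the paper): a target follows a nontarget
    with probability q and never follows another target.
    """
    p = q  # probability that the current trial is a target
    for _ in range(n_iter):
        # A target can only occur if the previous trial was a nontarget.
        p = q * (1.0 - p)
    return p


target_rate = stationary_target_probability()
closed_form = 0.10 / 1.10  # fixed point of p = q * (1 - p), i.e., q / (1 + q)
print(round(target_rate, 4), round(closed_form, 4))
```

Under this assumption the long-run target rate is q / (1 + q), or roughly 9.1 %, which is consistent with the 9 % figure reported above.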

Note that the distractors were irrelevant both temporally (changing at different rates than the faces) and categorically (scenes not faces). We reasoned that these backgrounds would provide a more ecologically valid measure of visual distraction, in that continuously present distractors unrelated to the task goals are more frequently encountered in daily life (see the Discussion).

Procedure

Before beginning the task, participants were familiarized with the target female face and given two 30-s practice blocks. Participants were instructed to press a key in response to every male face and to withhold response to the target female (essentially, an individuation task). They were asked to disregard the backgrounds and focus on the faces. Participants then performed 12 min of either the distractor-present or the distractor-absent version of the gradCPT (randomly assigned) without a break.

Participants who performed the distractor-present version were then administered a subsequent surprise scene memory task. They were instructed to respond “old” or “new” to each of 50 photographs, depending on whether they had seen them during the gradCPT. The stimuli consisted of the 20 scenes used as backgrounds in the gradCPT and 30 foils of other city and mountain scenes.

Finally, participants completed the ARCES and the MAAS. The ARCES (Cheyne et al., 2006) assesses errors in routine activity caused, in part or entirely, by attentional lapses, drawing questions from the Cognitive Failures Questionnaire (Reason, 1977, 1979, 1984). The MAAS, a measure of mindfulness, probes the propensities for both mind wandering and action errors (Brown & Ryan, 2003). Each of these scales has strong psychometric properties, and has been validated in college and community samples (Carriere et al., 2008; Smilek et al., 2010).

Behavioral analysis

Overall performance and decrements

RTs were calculated relative to the beginning of each image transition, such that an RT between 800 and 1,200 ms indicated a button press when the face on the current trial was 100 % cohered and not interpolated with the subsequent image. An RT shorter than 800 ms indicated that the current face was still in the process of transitioning from the previous one, and an RT longer than 1,200 ms indicated that the current face was transitioning to the next. On the rare trials with highly deviant RTs (before 70 % coherence of the current face n or after 40 % coherence of the following face n + 1) or multiple button presses, an iterative algorithm that maximized correct responses was employed. First, the algorithm assigned unambiguous correct responses, that is, presses to nontarget faces that occurred between 560 and 1,520 ms after stimulus onset (i.e., from 70 % coherence of the current face to 40 % coherence of the following face). Second, the remaining, ambiguous presses (i.e., those before 70 % coherence of the current face or after 40 % coherence of the following face, or multiple presses; less than 5 % of trials) were assigned to an adjacent trial if one of the two had no response. If both adjacent trials had no response, ambiguous presses were assigned to the closest trial, unless one was the target face, in which case participants were given the benefit of the doubt that they had correctly omitted. Finally, if there were multiple presses that could be assigned on any one trial, the fastest response was selected. Slight variations to the algorithm yielded highly similar results, as most button presses showed a one-to-one correspondence with the presented images.
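The 560- and 1,520-ms bounds follow directly from the transition timing, as the following sketch shows. It covers only the first step (flagging a press as unambiguous or ambiguous) and does not reproduce the iterative reassignment of ambiguous presses. The function names and the choice of inclusive bounds are assumptions made for illustration.

```python
TRANSITION_MS = 800  # duration of each face-to-face transition (from the text)
PAUSE_MS = 400       # time a face stays fully cohered (from the text)
TRIAL_MS = TRANSITION_MS + PAUSE_MS  # 1,200 ms per trial


def unambiguous_window(min_pct_current=70, max_pct_next=40):
    """Return (start_ms, end_ms) of the unambiguous response window.

    Coherence rises linearly during a transition, so X % coherence is
    reached X % of the way through the 800-ms transition.
    """
    start = min_pct_current * TRANSITION_MS // 100
    end = TRIAL_MS + max_pct_next * TRANSITION_MS // 100
    return start, end


def classify_press(rt_ms):
    """Label a press time (ms from transition onset); bounds assumed inclusive."""
    start, end = unambiguous_window()
    return "unambiguous" if start <= rt_ms <= end else "ambiguous"


print(unambiguous_window())
print(classify_press(906), classify_press(450))
```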

In addition to mean RT, we examined three other dependent measures: RT variability (i.e., the standard deviation [SD] of the RT), the commission error rate (CE rate, the proportion of target female trials to which the participant failed to inhibit response), and the omission error rate (OE rate, the proportion of nontarget male trials to which the participant failed to respond). In addition to overall performance on these measures (mean RT, SD of RTs, CE rate, and OE rate), we examined how performance changed over time and was influenced by background condition. Vigilance decrements were calculated by dividing the task run into 3-min quartiles and conducting repeated measures ANOVAs and examining linear trends (e.g., Helton et al., 2005) in each of the four dependent measures. These ANOVAs included Background as a between-subjects factor and Time as a within-subjects factor. When significant linear trends were present, we calculated a per-minute linear slope (converting from a quartile-based slope).
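As a rough illustration of the slope conversion, the sketch below fits an ordinary least-squares line to four quartile values and divides the per-quartile slope by the 3-min quartile length. The function name and the example CE rates are made up for illustration and are not data from the study.

```python
def per_minute_slope(quartile_values, minutes_per_quartile=3.0):
    """Least-squares linear slope across quartiles, expressed per minute."""
    n = len(quartile_values)
    xs = list(range(n))  # quartile index 0, 1, 2, 3
    x_mean = sum(xs) / n
    y_mean = sum(quartile_values) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, quartile_values))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope_per_quartile = sxy / sxx
    return slope_per_quartile / minutes_per_quartile


# Hypothetical CE rates (proportions) in the four 3-min quartiles.
example_ce_rates = [0.10, 0.13, 0.16, 0.19]
example_slope = per_minute_slope(example_ce_rates)
print(round(example_slope, 4))
```

With these illustrative values the CE rate rises by 3 percentage points per quartile, which corresponds to 1 percentage point per minute.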

Between-subjects relationships

To examine the unique relationships of the mean RTs and the SDs of RTs with the CE rate, we computed semipartial correlations (i.e., the correlation between mean RTs and CE rate, controlling for SD, and the correlation between SDs and CE rate, controlling for mean RT). We used semipartial rather than partial correlations because we were most interested in the degree to which unique variance in the SDs (or RTs) explained total variance in the CE rate, and also for ease of interpretability. In particular, by using this approach we could directly compare the semipartial correlations of mean RT (controlling for SD) that predicted CE and of SD (controlling for RT) that predicted CE, since they would both indicate the amount of total CE variance explained. We also assessed semipartial correlations between the linear slopes of these performance measures to examine whether changes in accuracy over time would parallel changes in RT or RT variability over time.
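For readers less familiar with semipartial correlations, a minimal sketch using the standard three-variable formula is given here. It residualizes only the predictor (not the outcome) with respect to the control variable. The function and variable names are hypothetical, and the per-participant values are invented for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def semipartial(outcome, predictor, control):
    """Correlation of outcome with the part of predictor unrelated to control."""
    r_op = pearson(outcome, predictor)
    r_oc = pearson(outcome, control)
    r_pc = pearson(predictor, control)
    return (r_op - r_oc * r_pc) / math.sqrt(1.0 - r_pc ** 2)


# Invented per-participant values, for illustration only.
ce_rate = [0.10, 0.25, 0.18, 0.30, 0.12, 0.22]
rt_sd = [90.0, 140.0, 120.0, 150.0, 100.0, 125.0]
mean_rt = [950.0, 880.0, 910.0, 860.0, 940.0, 900.0]

sd_unique = semipartial(ce_rate, rt_sd, mean_rt)   # SD, controlling for mean RT
rt_unique = semipartial(ce_rate, mean_rt, rt_sd)   # mean RT, controlling for SD
print(sd_unique, rt_unique)
```

Because both coefficients are scaled against the total variance of the outcome, their squares can be compared directly as proportions of CE variance uniquely explained.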

Within-subjects fluctuations

To examine within-subjects moment-to-moment fluctuations in attentional stability, RT variability was analyzed using a novel analysis procedure, the variance time course (VTC; see Fig. 2a). A VTC was computed for each participant from the approximately 485 correct responses on the task. Each trial was assigned a value representing the normalized (z-score) absolute deviance of that trial’s RT from the participant’s mean RT on the task, and then values for error trials (CEs and OEs) and correct omissions (COs) were interpolated linearly—that is, by weighting the two neighboring baseline trial RTs (correct commissions; i.e., correct responses to a male face) directly before and after. After assigning each trial a deviance value, the VTC was smoothed using a Gaussian smoothing kernel of nine trials (7.2 s) full width at half maximum, integrating information from the surrounding 20 trials (approximately 24 s), on the basis of previous findings suggesting fluctuations on the order of 20 s (Di Martino et al., 2008).
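The VTC steps described above (deviance, interpolation, smoothing) can be sketched as follows. This is an approximation under stated assumptions rather than the authors' analysis code: deviance is taken as |RT − mean| / SD using the population SD, gaps at the start or end of the run copy the nearest available value, the kernel spans 10 trials on either side, and kernel weights are renormalized at the edges. All names are hypothetical.

```python
import math
from statistics import mean, pstdev


def variance_time_course(rts, fwhm_trials=9.0, half_width=10):
    """Smoothed absolute RT deviance per trial.

    rts: one entry per trial; RT in ms for correct commissions, None for
    error trials and correct omissions (these are interpolated).
    """
    valid = [r for r in rts if r is not None]
    mu, sd = mean(valid), pstdev(valid)
    dev = [abs(r - mu) / sd if r is not None else None for r in rts]

    # Linear interpolation from the nearest valid trials before and after.
    known = [i for i, d in enumerate(dev) if d is not None]
    filled = list(dev)
    for i, d in enumerate(dev):
        if d is not None:
            continue
        prev = max((k for k in known if k < i), default=None)
        nxt = min((k for k in known if k > i), default=None)
        if prev is None:
            filled[i] = dev[nxt]
        elif nxt is None:
            filled[i] = dev[prev]
        else:
            w = (i - prev) / (nxt - prev)
            filled[i] = (1.0 - w) * dev[prev] + w * dev[nxt]

    # Gaussian kernel: convert FWHM to sigma, then smooth with edge renormalization.
    sigma = fwhm_trials / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    offsets = range(-half_width, half_width + 1)
    weights = [math.exp(-0.5 * (o / sigma) ** 2) for o in offsets]
    n = len(filled)
    smoothed = []
    for i in range(n):
        num = den = 0.0
        for o, w in zip(offsets, weights):
            j = i + o
            if 0 <= j < n:
                num += w * filled[j]
                den += w
        smoothed.append(num / den)
    return smoothed


# Toy run: every valid RT is exactly 1 SD from the mean, so the VTC is flat.
example_vtc = variance_time_course([900, 1100, None, 900, 1100, None, 900, 1100])
print(example_vtc)
```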

Fig. 2

Reaction time (RT) variability as a measure of attentional state. a An example variance time course (VTC) in a representative participant. b Participants made more commission errors and omission errors when “out of the zone” than when “in the zone.” ** p < .01

Smoothed VTC values were used to assign trials to participant-specific low- or high-variability epochs via median split (“in-the-zone” and “out-of-the-zone” epochs). Thus, in-the-zone epochs represent periods of low deviance of RTs from the mean, while out-of-the-zone epochs represent periods of relative variability or high deviance of RTs from the mean. The VTC was computed exclusively from correct RTs.
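The median split itself is simple to express in code. The following sketch labels each trial from its smoothed VTC value and then computes an error rate per epoch type. The handling of ties (values equal to the median are labeled "in") and all names are assumptions made for illustration.

```python
from statistics import median


def label_zones(smoothed_vtc):
    """Median split: 'in' for low-variability trials, 'out' for high."""
    cut = median(smoothed_vtc)
    return ["in" if v <= cut else "out" for v in smoothed_vtc]


def error_rate_by_zone(labels, is_error):
    """Proportion of errors among the trials carrying each zone label."""
    rates = {}
    for zone in ("in", "out"):
        flags = [e for lab, e in zip(labels, is_error) if lab == zone]
        rates[zone] = sum(flags) / len(flags) if flags else float("nan")
    return rates


# Toy values, for illustration only.
example_labels = label_zones([0.2, 0.9, 0.4, 1.1])
example_rates = error_rate_by_zone(example_labels, [0, 1, 0, 0])
print(example_labels, example_rates)
```

For a CE rate, the two input lists would be restricted to target trials before calling the second function; for an OE rate, to nontarget trials.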

Individual differences

To examine the relationship between the ARCES and MAAS and task performance, we performed several regression analyses using participants’ questionnaire scores to predict our main dependent measures (mean RT, RT variability, CE rate, and OE rate). These models also explored the influence of background condition through inclusion of the main effect (dummy-coded) and interaction terms. Effects of background condition are reported where significant.

For the distractor-present condition, sensitivity (d′) and response bias (c) were calculated to assess participants’ subsequent memory performance for the distracting scenes.
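A minimal sketch of the signal detection measures follows, using the standard definitions d′ = z(hit rate) − z(false alarm rate) and c = −[z(hit rate) + z(false alarm rate)] / 2. The log-linear correction for extreme rates is an assumption on our part, since the text does not state how rates of 0 or 1 were handled, and the response counts are hypothetical (only the 20 old scenes and 30 foils come from the procedure).

```python
from statistics import NormalDist


def dprime_and_criterion(hits, misses, false_alarms, correct_rejections):
    """Return (d', c) for an old/new recognition test.

    A log-linear correction (add 0.5 to each count, 1 to each total) is
    applied so that rates of 0 or 1 do not yield infinite z-scores; this
    correction is an assumption, not taken from the paper.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion


# Hypothetical responses to the 20 old scenes and 30 foils.
example_d, example_c = dprime_and_criterion(
    hits=12, misses=8, false_alarms=9, correct_rejections=21
)
print(example_d, example_c)
```

A d′ near zero would indicate that the background scenes were not remembered above chance, and a positive c would indicate a conservative bias toward responding "new".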

Results

Overall performance and decrements

On average, participants made CEs 20 % of the time when the female face appeared (i.e., failed to inhibit response; range = 4 %–40 %) and made OEs to 1 % of the male faces (i.e., failed to respond to a male face; range = 0 %–4 %). The mean RT across participants was 906 ms (i.e., 106 ms after the face fully cohered; range = 740–1,007 ms). We found no main effect of background condition on any performance measure [CE rate, F(1, 27) = 1.014, p = .32; OE rate, F(1, 27) = 0.007, p = .94; overall RT, F(1, 27) = 0.48, p = .49; overall SD, F(1, 27) = 0.16, p = .69]. Of note, while RT and SD are often correlated in cognitive tasks, RT and SD were not significantly correlated in the gradCPT (r = .264, p > .1), and using either the coefficient of variation (SD/mean RT) or residualized SD (residual of a linear regression using individual mean RT to predict SD) in place of SD did not impact the results. In nearly all of the subsequent group analyses, no main effects of or interactions with background condition emerged; only significant effects are reported below.

Participants exhibited performance decrements across the 12-min run. We found significant effects of time on CE rates [F(3, 81) = 7.51, p < .001] and RT variability [F(3, 81) = 5.21, p = .002]. A trend analysis revealed a significant linear performance decrement in CE rates [F(1, 27) = 18.24, p < .001; 1.2 % increase each minute] and RT variability [F(1, 27) = 12.79, p = .001; 2-ms increase in SD each minute; Fig. 3a and b]. No significant effect of time was apparent on OE rates or RTs (Fs < 1.48, ps > .2; see Fig. 3c and d). Mauchly’s test of sphericity indicated that the data did not violate the assumption of sphericity [χ²(5) > 0.5, p > .3].

Fig. 3

Vigilance decrements. a Participants made significantly more commission errors over time (1.2 % increase each minute). b Reaction times (RTs) became significantly more variable over time (2-ms increase in SD each minute). c No significant change over time was observed in omission error rates (p > .1). d No significant change over time was observed in RTs (p > .2). Shading represents standard errors. ** p < .01

Between-subjects relationships

Examining the independent associations of mean RT and RT variability with CE rate, we found that RT and SD each explained a unique proportion of the variance in overall CE rates, such that participants who were more variable (controlling for RT) and those who responded more quickly (controlling for SD) made more CEs (SD semipartial r = .537, p = .001; RT semipartial r = −.574, p < .001). When looking at effects over time, the SD slope, but not the RT slope, explained a significant proportion of the variance in CE slope: Participants who showed greater increases in variability over time (even when controlling for changes in overall RT) also showed greater increases in CE rate over time (SD slope semipartial r = .587, p = .001; RT slope semipartial r = −.178, p = .25). Thus, although both fast and variable responders made more errors overall, accuracy declines were associated with increasing variability, not speed, over time.

Within-subjects fluctuations

To further examine the relationship between RT variability and CEs within each subject, the VTC (Fig. 2a) was used to define periods of relatively low and high RT variability for each participant (i.e., “in-the-zone” and “out-of-the-zone” epochs). Participants made more errors—in terms of both CEs [t(28) = −5.55, p < .001] and OEs [t(28) = −3.47, p = .002]—during out-of-the-zone as compared to in-the-zone epochs (see the Method section, Fig. 2b).

To further characterize the in-the-zone and out-of-the-zone periods, we next examined whether CE precursors differed across these two VTC-defined epochs. Collapsed across epochs, the baseline trial RTs immediately preceding CEs (n – 1) were faster than those immediately preceding correct omissions [COs; i.e., correctly inhibiting response to the target female; t(28) = −11.99, p < .001]. That is, incorrectly pressing to target trials was preceded by relatively fast trials, whereas correctly withholding response to targets was preceded by relatively slow trials. These differences did not extend beyond n – 1 trials. Of note, this effect differed across variability epochs, such that RTs preceding CEs were faster, and those preceding COs were slower, when out of the zone than when in the zone [F(1, 27) = 26.81, p < .001; pre-CE in vs. out of zone, t(27) = 4.42, p < .001; pre-CO in vs. out of zone, t(28) = −2.97, p = .006; see Fig. 4a. Note that one participant made no CEs when in the zone, thus df = 27 instead of 28 in comparisons including in-the-zone CE RT]. There were no differences in RTs preceding baseline trials across epochs [t(28) = −0.64, p = .53], and average RTs were nearly identical [t(28) = −0.73, p = .47; see the histogram Fig. 4b].

Fig. 4

a Error precursors by attentional state. Across states, baseline reaction times (RTs; i.e., of correct commissions) immediately preceding commission errors (CEs; i.e., on trial n – 1) were faster than those immediately preceding correct omissions (COs). RTs preceding CEs were faster, and those preceding COs were slower, when out of the zone than when in the zone. There were no differences in RTs preceding baseline trials between in-the-zone and out-of-the-zone periods. b Histogram of baseline RTs in all participants, grouped by VTC-defined epochs. While the mean RTs across participants are not different between the states (907 ms when out of the zone vs. 902 ms when in the zone), the standard deviation of the baseline RTs is larger when out of the zone than when in the zone [184 vs. 101 ms; t(28) = 23.33, p < .001]. Out-of-the-zone baseline RTs are approximately equally distributed among fast and slow RTs (51 % faster than individual participants’ mean RTs, 49 % slower). * p < .05 ** p < .01

Although RTs did not change over time at the group level, we addressed the possibility that some participants might have exhibited fluctuating mean RTs across the task, potentially biasing the VTC to define out-of-the-zone epochs when mean RTs were extreme. A local-mean VTC was computed by comparing each trial’s RT to a sliding, rather than the overall, mean (20-trial window). This yielded identical results with regard to the in-the-zone and out-of-the-zone differences in accuracy and error precursors described above.
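The difference between the overall-mean and local-mean deviance can be sketched as follows. The placement of the 20-trial window around each trial (10 trials before through 9 trials after, clipped at the ends of the run) is an assumption, as the text gives only the window length; the function name is hypothetical.

```python
def local_mean_deviance(rts, window=20):
    """Absolute deviation of each RT from the mean of a window around it."""
    half = window // 2
    deviance = []
    for i, rt in enumerate(rts):
        lo = max(0, i - half)         # window start, clipped at the first trial
        hi = min(len(rts), i + half)  # window end (exclusive), clipped at the last
        local = rts[lo:hi]
        deviance.append(abs(rt - sum(local) / len(local)))
    return deviance


# Toy run with a step change in mean RT halfway through.
step_rts = [800] * 30 + [1000] * 30
step_deviance = local_mean_deviance(step_rts)
print(step_deviance[0], step_deviance[30], step_deviance[59])
```

In this toy run every trial deviates by 100 ms from the overall mean of 900 ms, whereas the local-mean version marks only the trials near the step as deviant, which is the kind of bias the control analysis is meant to rule out.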

Additionally, to address the possibility that out-of-the-zone epochs reflected stable periods of abnormally fast or slow RTs rather than erratic deviant responding, we calculated the SD in every in-the-zone and out-of-the-zone epoch (the average lengths of both the in-the-zone and out-of-the-zone epochs were 20 trials). The SD was significantly higher in out-of-the-zone than in in-the-zone epochs [157 vs. 91 ms; t(28) = −13.53, p < .001], confirming that out-of-the-zone epochs comprised erratic deviant RTs rather than consistently fast or slow RTs.

These findings demonstrate that the degree of RT variability interacts with other behavioral markers of attention: Highly variable out-of-the-zone periods are marked by a greater likelihood of errors and a greater influence of local response speeding or slowing on subsequent performance accuracy, while in-the-zone periods are characterized by relative response consistency and fewer errors. Together, these results suggest that the VTC analysis reveals two potentially distinct attentional states.

To examine whether discrete task or behavioral events precipitated switches between in-the-zone and out-of-the-zone epochs, we calculated both the mean RT and CE rate at various intervals (each consisting of four trials, to ensure that every participant was contributing at least one CE per bin), relative to the transition from in-the-zone to out-of-the-zone epochs, and vice versa (Fig. 5). To summarize, switches were not preceded by patterns of RT slowing or speeding, or by increasing or decreasing CE rates (Fig. 5). Rather, the transition between these two states appears to be gradual. In particular, the increase in CE rate develops over time after a transition from an in-the-zone to an out-of-the-zone epoch (Fig. 5b).

Fig. 5

a Reaction times (RTs) on baseline trials preceding and following transitions between out-of-the-zone and in-the-zone states demonstrate that switches were not preceded by patterns of RT slowing or speeding. Time bins are averages over four trials. Thus, Bin −1 represents the average RT for baseline trials one to four trials before a state transition, and Bin 1, the average RT for one to four trials after. b Switches were also not preceded by increasing or decreasing commission error (CE) rates. Rather, the higher CE rate during out-of-the-zone than during in-the-zone epochs emerges gradually as one moves farther from the transition point. Shading represents standard errors
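The transition-locked binning can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code itself: the first trial of the new epoch is treated as the first trial after the transition, trials are pooled across transitions, bins that run past the session boundaries are dropped, and the function name `transition_bins` is hypothetical.

```python
from statistics import mean

def transition_bins(values, in_zone, n_bins=2, bin_size=4, to_in_zone=False):
    """Average per-trial values in bins before and after each state transition.

    Returns {bin_index: mean}; negative indices precede the switch, and
    there is no Bin 0.
    """
    pooled = {}
    for t in range(1, len(in_zone)):
        # Keep only trials where the state changes into the requested state.
        if in_zone[t] == in_zone[t - 1] or in_zone[t] != to_in_zone:
            continue
        for b in range(-n_bins, n_bins):
            start = t + b * bin_size
            stop = start + bin_size
            if start < 0 or stop > len(values):
                continue  # skip bins that run off the session
            label = b if b < 0 else b + 1
            pooled.setdefault(label, []).extend(values[start:stop])
    return {label: mean(v) for label, v in sorted(pooled.items())}
```

Passing a 0/1 commission-error indicator as `values` gives a CE rate per bin; passing baseline RTs gives the mean RT per bin.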

Individual differences: correlations with self-report questionnaires

Scores on the ARCES were predictive of CE rates (r = .473, p = .009), such that participants who self-rated as being more error-prone in daily life made more CEs on the gradCPT. No other correlations between ARCES and performance were significant.

MAAS scores were not significantly related to performance when collapsing across the two conditions; however, we did find significant interactions between background condition and MAAS scores for both RT variability [t(26) = −3.58, p = .001] and OE rate [t(26) = −2.79, p = .01], such that mindfulness was correlated with variability and OE rate in the distractor-present but not the distractor-absent condition (for SD, distractor present, r = −.772, p = .001; distractor absent, r = .153, p = .58; for OE rate, distractor present, r = −.723, p = .003; distractor absent, r = −.082, p = .76). Thus, more mindful participants responded less variably and made fewer OEs only in the presence of distracting scenes.

In addition to our primary performance measures, in participants who performed the distractor-present condition we also examined relationships between the questionnaire scores and memory for distractor scenes. The overall scene accuracy was 72.4 % (SD = 6.8 %), and using a signal detection approach, we found the overall sensitivity (d′) to be 1.18 (SD = 0.42) and the bias (c) to be .43 (SD = .46). A positive bias in this case indicates a more conservative response strategy, or endorsing fewer scenes as “old.” Only mindfulness was correlated with d′, and marginally with accuracy, such that more mindful participants remembered the background scenes better (d′, r = .579, p = .03; accuracy, r = .51, p = .06). Additionally, more-mindful participants endorsed fewer scenes as being “old” or familiar, as compared to less-mindful participants, indicating a more conservative response strategy (c: r = .546, p = .04).
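For reference, the standard equal-variance signal detection formulas for sensitivity and bias are sketched below. The hit and false alarm rates in the usage note are hypothetical, because the text reports only the resulting d′ and c values; the function name `dprime_and_criterion` is likewise an assumption.

```python
from statistics import NormalDist

def dprime_and_criterion(hit_rate, false_alarm_rate):
    """Equal-variance signal detection: d' = z(H) - z(FA), c = -(z(H) + z(FA)) / 2."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit, z_fa = z(hit_rate), z(false_alarm_rate)
    d_prime = z_hit - z_fa
    c = -(z_hit + z_fa) / 2   # positive c means fewer "old" responses
    return d_prime, c
```

For example, a hypothetical hit rate of .60 with a false alarm rate of .20 gives a positive c, that is, a conservative strategy. Rates of exactly 0 or 1 have no finite z score and would need a correction before this function is applied.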

Discussion

We developed a novel paradigm, the gradual-onset continuous performance task (gradCPT), to tax sustained attention more adequately in a short period of time and to require frequent overt responses, so that moment-to-moment RT fluctuations could be observed. Using the 12-min gradCPT, we observed vigilance decrements in both accuracy and RT variability, such that participants made more errors and responded more variably as the task progressed. We also employed a new method for tracking attentional fluctuations that assessed trial-to-trial RT variability, and found that periods of relative RT stability were accompanied by reduced error proneness. Together, these design and analysis innovations improve our understanding of decrements and fluctuations in sustained attention, and show promising future applications.

The gradCPT elicited vigilance decrements in both error rates and RT variability over a relatively short period of time, findings rarely reported in studies using short-duration not-X CPTs. Previous studies of longer-duration not-X CPTs have found a variety of changes in performance over time: While some have observed decrements over 30 min (e.g., Grier et al., 2003; Helton et al., 2005), others have observed improvements (e.g., Helton et al., 2009). The present decrement over a short duration may be a result of greater workload, due to the task’s gradual stimulus transitions. Gradual transitions may mute the exogenous response cues associated with abrupt stimulus onsets, and thus increase demands on the endogenous maintenance of attention. Future studies comparing performance and workload (e.g., with the NASA task load index) between abrupt- and gradual-stimulus-onset CPTs would be useful to assess this possibility.

Several of the present findings highlight RT variability as being an important indicator of attentional state, one that is observable at a higher frequency and may capture subtler changes than errors. Specifically, we observed both between- and within-subjects relationships between RT variability and error likelihood. That participants with more-variable RTs made more CEs, even when mean RT was statistically controlled, suggests that RT variability is a unique and important component of sustained-attention performance. In addition, the relationship between increasing RT variability and decreasing accuracy over the course of the task suggests that vigilance decrements accompany more erratic, rather than faster (or slower), responding over time. Furthermore, the within-subjects relationship between RT variability and the likelihood of CEs also supports the idea that consistent responding relates to successful performance on the gradCPT, suggesting distinct in-the-zone and out-of-the-zone attentional states. Out-of-the-zone periods, characterized by more-variable correct RTs, a greater likelihood of errors, and stronger speed–accuracy trade-offs (i.e., a greater difference between RTs preceding COs vs. CEs), may be more taxing or effortful than in-the-zone periods in which participants have less-variable response latencies, make fewer errors, and show smaller speed–accuracy trade-offs. In addition, while we replicated the finding that faster RTs predict subsequent errors, we extended this by demonstrating that the tendency for speeding-induced errors ebbs and flows throughout the experiment. Thus, RT variability, here assessed by the VTC, is a potential new way to track attentional state, as demonstrated by its relation to error proneness and speed–accuracy trade-offs.

Our approach and findings may inform theories of sustained attention. Current theories of the vigilance decrement fall into two broad categories, which attribute attentional decline to either “underload” or “overload” (Pattyn, Neyt, Henderickx, & Soetens, 2008). A recent underload theory, the mindlessness model, attributes declines in performance over time to the failure of a supervisory attentional system in directing attention to monotonous tasks, which causes observers to approach the task in a thoughtless, routinized manner, instead of exerting effortful attention (Manly et al., 1999; Robertson et al., 1997). In contrast, the attentional-resource model views the vigilance decrement as the result of an overload of task demands, a consequence of depleted attentional resources that cannot be replaced over time (Parasuraman & Davies, 1977; Parasuraman, Warm, & Dember, 1987). Considering our findings in the context of underload and overload theories of sustained attention, we propose that errors committed while in the zone (during periods of time in which participants perform relatively well) may reflect underload phenomena such as mind wandering, whereas those committed while out of the zone (when participants are performing relatively poorly) may reflect overload mechanisms such as difficulty resulting in resource depletion. While it is admittedly speculative, this proposal receives support from a recent fMRI study by our group (Esterman, Noonan, Rosenberg, & DeGutis, 2012). Specifically, we found that in-the-zone errors are preceded by elevated activity in the default mode network (regions of the brain associated with mind wandering) and thus may reflect periods of mindlessness. On the other hand, out-of-the-zone errors are associated with failures to fully engage task-positive dorsal attention network regions associated with attentional control and may be the result of attentional resource depletion. Notably, if confirmed, this interpretation would imply that underload and overload undermine sustained attention at different times during task performance. In particular, overload may have a greater influence on performance later in the task as RT variability increases; that is, depletion of attentional resources may increasingly contribute to failures over the course of the task.

As in previous studies, we found a relationship between the self-reported tendency to make attention-related errors in daily life (as measured by the ARCES) and performance on the gradCPT, such that participants who reported making more errors showed a higher CE rate on the gradCPT (r = .473, p = .009). Note that the magnitude of this relationship was numerically greater than that found by previous studies using a not-X CPT (in a comparison of SART errors and CFQ scores from a recent meta-analysis, r = .21 [Smilek et al., 2010]; for SART errors and ARCES scores, r = .23–.32 [Cheyne et al., 2006; Smilek et al., 2010]). The gradCPT may thus be more sensitive to individual differences than previous not-X CPTs have been, potentially because it taxes endogenous attention mechanisms to a greater degree and may better simulate the everyday circumstances under which attention fails. Improved sensitivity on the gradCPT may also be due to the better reliability of the measure and/or a wider range of performance, alternatives that future studies could explore.

Although we predicted a main effect of distraction, background scenes did not elicit differences in overall performance or vigilance decrements, suggesting that across participants, continuous irrelevant visual distraction did not impact performance, likely due to the specific nature of our distractors. Whereas distraction is often studied using stimuli that compete with task goals, either by capturing attention via bottom-up salience (such as in a singleton paradigm) or by introducing cognitive factors that interact with the top-down task set (as in contingent capture and flanker tasks), here we were interested in a subtler form of distraction: the continuous presence of a stream of irrelevant information that, while posing no direct conflict with processing of the primary stimuli, could nonetheless divert attention as task-set maintenance fluctuated. The distractors in the present study were irrelevant both temporally (as they changed at a different rate than the faces) and categorically (as they were scenes, not faces). We reasoned that these backgrounds would provide a more ecologically valid measure of visual distraction, in that continuously present distractors unrelated to task goals, such as the view from one’s office window or the passing scenes as one is driving, are more frequently encountered in daily life than are abruptly appearing, transient distractors that present direct conflict. Nevertheless, future research will need to compare the effect of these continuous, irrelevant distractors to that of abrupt, conflicting distractors on decrements and fluctuations in sustained-attention performance.

Despite finding no overall effect of the distractors, we did detect, as hypothesized, an interaction of background condition with self-reported attentional ability, with stronger relationships observed in the presence of distractors. Specifically, when we included scene distractors, mindfulness was associated with lower OE rates and RT variability, further supporting previous characterizations of the MAAS as relating to attention lapses via RT (reflected in more-variable RTs in this study) and the ARCES as relating to errors of commission (Cheyne et al., 2006; Smilek et al., 2010). Stronger relationships between gradCPT performance and the MAAS, but not the ARCES, in the distractor-present condition may be due to the fact that distraction affected subtler attentional lapses (reflected by RT variability), but not more catastrophic lapses (reflected by commission errors), in a subset of participants.

To fully appreciate the influence of mindfulness on gradCPT performance, it is important to consider the results above in combination with the findings regarding scene memory—that is, that mindfulness positively correlated with distractor scene memory (r = .579, p = .03). This result suggests that mindful individuals did not achieve better performance on the distractor-present version of the gradCPT because of enhanced attentional filtering, but were actually more likely to encode the irrelevant background scenes. One possible explanation for this pattern of results is that mindful individuals have greater resources available for processing immediate experiential information, resulting in a more distributed attentional focus (Bishop et al., 2004; Valentine & Sweet, 1999) and/or in more efficient task performance. In line with perceptual load theory (Lavie, 1995; Yi, Woodman, Widders, Marois, & Chun, 2004), unattended background scene processing should occur under conditions of low perceptual load, but not under conditions of high perceptual load, when more resources are demanded by the central task. Thus, if the perceptual load of the task varies across individuals on the basis of cognitive style, capacity, or strategy, this may explain the degree to which mindful individuals both perform the central task more consistently and remember the distractor scenes. In other words, it is possible that the perceptual load of the central task is low for mindful individuals, and thus background scenes are processed throughout the duration of the task. On the other hand, the load of the central task may be high for less mindful individuals, resulting in sporadic scene processing during times of mind wandering or struggle, and thus increased RT variability. In all cases, a tendency to allow background information to enter attentional awareness may be uniquely advantageous in this specific case, when the distractors are noncompeting and task-irrelevant.

Limitations and future directions

Although the gradCPT represents a potentially improved method for examining decrements and fluctuations in sustained attention, we acknowledge several limitations of the present study. First, we employed a variant of the not-X CPT with the assumption that commission errors resulted from lapses of attention. An alternative interpretation would argue that such tasks measure speed–accuracy trade-offs and response strategy rather than sustained attention, and posit that errors are due to failures of motor control rather than attention lapses (Helton, Head, & Russell, 2011; Helton et al., 2005; Helton et al., 2009). However, we believe that several converging lines of evidence demonstrate that not-X CPTs do measure attention. In particular, three sets of findings reinforce the validity of not-X CPTs as measures of sustained attention: associations between not-X CPT performance, mind wandering, and self-reported absentmindedness (Cheyne et al., 2006; Smilek et al., 2010); evidence that performance on a not-X CPT is more strongly related to measures of sustained attention than to measures of other types of attention or response inhibition (Robertson et al., 1997); and work suggesting that children with ADHD demonstrate impaired not-X CPT performance (Johnson et al., 2007) but do not have motor inhibition impairments (Rommelse et al., 2007).

Additionally, further work would be helpful in exploring the assumptions underlying our analysis of within-subjects RT variability. That is, in light of the limited existing data that speak to the time scale of attention fluctuations in healthy individuals, we chose our smoothing kernel of nine trials (7.2 s) on the basis of findings from the ADHD literature that suggest that individuals with ADHD have higher power in frequency bands corresponding to behavioral fluctuations on the order of every 20 s than do controls (Di Martino et al., 2008). Increased spectral power of oscillations in the same range has also been observed in hemodynamics and heart rate (Malliani, Pagani, Lombardi, & Cerutti, 1991; Pagani et al., 1997), which have been linked to arousal and attention. Future studies could examine the impact of using smaller or larger kernels, as well as explore additional fluctuations in performance via frequency analysis.
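To make the kernel-size question concrete, the following sketch smooths a per-trial series with a kernel whose width is given in trials. The Gaussian shape, the interpretation of the nine-trial width as a full width at half maximum, and the renormalization at the session edges are assumptions made for illustration; the trial duration of 0.8 s follows from the nine trials spanning 7.2 s. The names `gaussian_kernel` and `smooth` are hypothetical.

```python
import math

TRIAL_SECONDS = 0.8  # implied by the text: nine trials span 7.2 s

def gaussian_kernel(fwhm_trials):
    """Normalized Gaussian weights, with the FWHM given in trials."""
    sigma = fwhm_trials / (2 * math.sqrt(2 * math.log(2)))
    radius = math.ceil(3 * sigma)
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(series, fwhm_trials=9):
    """Smooth a per-trial series, renormalizing the weights at the session edges."""
    kernel = gaussian_kernel(fwhm_trials)
    radius = len(kernel) // 2
    out = []
    for i in range(len(series)):
        num = den = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < len(series):  # ignore weights that fall outside the session
                num += w * series[j]
                den += w
        out.append(num / den)
    return out
```

Rerunning the analysis with, for example, `fwhm_trials=5` or `fwhm_trials=15` (4 s or 12 s at 0.8 s per trial) would show how sensitive the epoch definitions are to the assumed time scale of attentional fluctuations.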

Consideration should also be given to aspects of our experimental design and procedures. First, we acknowledge that order effects are possible in relation to the administration of the gradCPT followed by questionnaires. Although we consider this an unlikely possibility, responses to the questionnaires could have been influenced by the preceding task performance (e.g., participants who felt that they performed poorly may have self-rated as being less mindful or making frequent attention-related errors). Yet, had the ARCES and MAAS been presented before the gradCPT, they might have activated self-perceptions capable of influencing the subsequent task performance. Thus, order effects could be possible in either direction. Second, as our distraction manipulation was administered between subjects, we do not have a direct measure of the effect of distraction on fluctuations or decrements in sustained attention; future studies utilizing a within-subjects design could address this question.

In sum, the present findings highlight the promise of the gradCPT and the VTC analysis method as a new set of tools in the study of sustained attention. The results extend empirical evidence for both individual differences and within-subjects fluctuations in sustained attention and lay the foundation for future work that could potentially integrate several prevailing models of sustained attention.