The disfluent discourse: Effects of filled pauses on recall

https://doi.org/10.1016/j.jml.2011.03.004

Abstract

We investigated the mechanisms by which fillers, such as uh and um, affect memory for discourse. Participants listened to and attempted to recall recorded passages adapted from Alice’s Adventures in Wonderland. The type and location of interruptions were manipulated through digital splicing. In Experiment 1, we tested a processing time account of fillers’ effects. While fillers facilitated recall, coughs matched in duration to the fillers impaired recall, suggesting that fillers’ benefits cannot be attributed to adding processing time. In Experiment 2, fillers’ locations were manipulated based on norming data to be either predictive or non-predictive of upcoming material. Fillers facilitated recall in both cases, inconsistent with an account in which listeners predict upcoming material using past experience with the distribution of fillers. Instead, these results suggest an attentional orienting account in which fillers direct attention to the speech stream but do not always result in specific predictions about upcoming material.

Highlights

► We test the effects of disfluent filled pauses on recall of a discourse.
► Fillers improve recall whether or not they predict upcoming discourse boundaries.
► No benefit from coughs of equal duration, ruling out a processing time effect.
► Fillers have an attentional orienting effect.

Introduction

Natural conversation, unlike most laboratory speech, is rife with disfluencies, or interruptions in the fluent flow of speech. For instance, speech frequently includes fillers such as uh and um. Estimates have placed the rate of disfluencies in speech as high as 6 per 100 words (Fox Tree, 1995). What are the consequences of these frequent interruptions for understanding and remembering information from a discourse?

Eye tracking and electrophysiological measures have found that fillers often benefit online comprehension of simple materials (see Corley & Stewart, 2008, for review). However, little work has investigated whether findings from these paradigms generalize to the discourse level and to situations involving later memory. Moreover, the specific mechanisms by which fillers benefit comprehension remain unclear.

In two experiments, we tested whether fillers affect later memory for discourse and examined the mechanisms by which they might do so. We compare three theoretical accounts of how fillers affect processing: fillers may allow participants to predict what will come next, fillers may orient attention to the speech stream, or fillers may allow more time to process the discourse.

Although precise taxonomies of disfluencies vary (e.g., Clark, 1996; Maclay & Osgood, 1959; Shriberg, 1994), all include filled pauses, or fillers: verbal interruptions that do not relate to the proposition of the main message. In American English, the most common fillers are uh and um (Clark & Fox Tree, 2002). Fillers are one of the most frequent types of disfluency, accounting for one-third to over one-half of all disfluencies in several corpora (Shriberg, 1994).

Prior work has found that, although fillers interrupt the fluent delivery of an utterance, they often benefit listeners’ online comprehension. Fillers facilitate judgments of whether a word in running speech matches an earlier probe (Fox Tree, 2001) and allow listeners to more quickly respond to an instruction in which the speaker repairs a prior error (Brennan & Schober, 2001). Listeners use the presence of fillers to anticipate that the speaker will refer to a less accessible referent rather than a more accessible one (Arnold et al., 2004; Arnold et al., 2007; Barr & Seyfeddinipur, 2009).

Although these studies have established the benefits of fillers in lexical and referential processing, several outstanding issues remain. First, it is unclear to what degree these effects generalize to later memory. Some evidence that fillers modulate memory comes from Corley, MacGregor, and Donaldson (2007), who found that sentence-final words were more apt to be recognized on a later memory test when they were preceded by a filler than when they were fluent. In general, however, less work has examined what the consequences of fillers and other disfluencies are for the long-term understanding of a connected discourse.

Second, many studies of disfluency have focused on the effects of fillers on identifying individual referents or lexical items. These studies have often used either isolated sentences or discourses with only a few possible referents. Thus, it is unclear whether fillers benefit processing only at lower levels of language comprehension or also at the level of the discourse. In the present work, we address these two issues by examining whether the benefits of fillers generalize to long-term memory for a complex discourse.

Finally, the specific mechanisms by which fillers benefit processing remain uncertain. Below, we review several accounts that have been proposed about how fillers modulate online language comprehension, and discuss how these accounts could be applied to the effect of fillers on long-term memory for a discourse. These hypotheses include predictive processing, attentional orienting, and increased processing time.

One reason that fillers and other disfluencies may benefit comprehension is that listeners can use them to predict what they will hear next. Speakers are most apt to produce fillers before material that is less accessible, such as a referent that is new to the discourse or that is difficult to name. Prior experience with this distribution of fillers might allow listeners to use the presence of a filler to predict that the speaker will next refer to a less accessible referent. This type of finding has been obtained in studies using the visual world paradigm; in these studies, eye fixations are recorded as participants follow instructions that refer to referents in simple scenes (e.g. Tanenhaus, Spivey-Knowlton, Eberhard, & Sedivy, 1995). Participants’ eye movements suggest that when they hear a disfluency, thee uh, before a noun phrase, they anticipate that the upcoming noun phrase refers to a new item in the discourse (Arnold et al., 2004) or one that is difficult to name (Arnold et al., 2007). Similar patterns have also been obtained by tracking participants’ movements of a computer mouse to select one of several objects (e.g., Barr & Seyfeddinipur, 2009).

Although predictive processing accounts have focused on the ability of listeners to quickly identify particular referents, it is plausible that predictions could have consequences at a broader discourse level. Most theories of discourse propose that discourse comprehension requires processing both at a global level and at the level of local reference (for review, see Graesser, Millis, & Zwaan, 1997). Facilitating the local level of processing would allow more time or resources to be devoted to the global level. Furthermore, predictions could also be made at the discourse level itself. Speakers produce fillers more frequently at discourse boundaries (Fraundorf & Watson, 2008; Swerts, 1998), so listeners could potentially use a filler to successfully anticipate a topic change rather than the continuation of an existing topic.

Importantly, however, the predictive processing hypothesis predicts that fillers must lead listeners to make accurate predictions in order to benefit comprehension. When fillers lead to predictions that are later disconfirmed, such as when a filler is followed by a highly accessible referent or the continuation of the same topic, fillers should not benefit, and might even impair, comprehension.

An alternative to the predictive processing hypothesis is that the linguistic nature of fillers orients attention to the speech stream, but does not necessarily result in specific predictions about the nature of upcoming material. As with the predictive processing account, this hypothesis has typically focused on local referential or lexical processing. But attention would likely have consequences for discourse-level comprehension as well. The construction of discourse coherence at a global level is influenced by motivation and goals (Graesser, Millis, & Zwaan, 1997), which would likely be sensitive to changes in attention. Thus, increases in attention in response to fillers might also improve discourse-level comprehension.

In an attentional orienting account, fillers need not be predictive of later material in order to benefit comprehension. For instance, fillers can affect responses even to acoustic events that do not occur in natural speech. Collard, Corley, MacGregor, and Donaldson (2008) tested this hypothesis by examining the effect of fillers on the event-related potentials (ERPs) evoked in response to an oddball auditory effect: the temporary acoustic compression of the speech stream. In general, such oddballs produced large MMN (mismatch negativity) and P300 components, ERP components argued to reflect the orienting and updating of attention. However, a filler before the oddball greatly reduced the magnitude of these effects. Collard et al. interpreted this pattern as indicating that attention was already oriented to the speech stream as a result of the filler. Crucially for the attentional orienting hypothesis, fillers had this effect even though listeners could not predict the unusual, laboratory-only oddball effect based on prior experience about the distribution of fillers.

The attentional orienting hypothesis also predicts that a filler should still benefit comprehension even if it occurs in an atypical location, such as in the middle of a discourse segment or before a highly accessible response. Support for this prediction comes from Corley and Hartsuiker (2003), who asked participants to click on one of several pictures on a computer screen based on an auditory instruction. A filler (um) in the instruction facilitated response times to targets of both high lexical frequency (high accessibility) and low lexical frequency (low accessibility). One interpretation of these results is that the fillers did not lead participants to make a specific prediction that the upcoming referent would be the less accessible one, but rather increased general attention to the referring expression.

The two prior accounts have proposed that fillers affect comprehension qua fillers, either because they lead to specific predictions or because they increase attention to the speech stream. But a third hypothesis is that interruptions of the speech stream simply add time for comprehension processes, including those at the discourse level, to unfold. This processing-time hypothesis predicts that any interruption, including a silent pause or environmental noise, should achieve the same effect as a filler, whether or not the interruption is linguistic in nature or diagnostic of the speaker’s mental state.

Evidence in support of this hypothesis is mixed. When listeners followed instructions in which the speaker self-corrected an erroneous object description, a silent pause was as effective as a filler in speeding responses to the correction (Brennan & Schober, 2001). Listeners were also equally likely to expect that a referring expression would refer to a complex object when it was preceded by a filler as by a silent pause (Watanabe, Hirose, Den, & Minematsu, 2008). In addition, fillers, environmental noises, and fluent modifying expressions all have similar effects on listeners’ interpretation of temporary syntactic ambiguities (Bailey & Ferreira, 2003). However, not all of the results in the literature point to effects of processing time: Barr and Seyfeddinipur (2009) found that fillers directed mouse movements towards a new referent more quickly than did coughs or sniffles at the same point in the utterance. This suggests that fillers had effects beyond the time they spent interrupting the speech stream. Fillers also had larger effects on offline judgments of a speaker’s knowledgeability than silent pauses matched in duration (Brennan & Williams, 1995).

The mixed evidence for the processing-time hypothesis may owe in part to the difficulty of determining the appropriate control condition. Although periods of silence might at first appear to present a useful control interruption, silent pauses are themselves a form of disfluency associated with planning difficulties (Maclay & Osgood, 1959) and so, like fillers, would be diagnostic of a speaker’s planning difficulties. Other experiments (e.g., Bailey & Ferreira, 2003) have compared fillers to environmental noises such as telephone rings and animal calls. These interruptions present their own challenges: they rely on the assumption that an external noise would plausibly explain the cessation in speech. If in natural production many speakers prefer to talk over such interruptions, then the absence of speech during an interruption might still be interpreted by listeners as a disfluency.

In the present study, we follow Barr and Seyfeddinipur (2009) by using coughs as a control condition for the processing-time hypothesis. Coughs are interruptions that are generated by the speaker and that provide a plausible explanation for the cessation of speech, but they should not be interpreted as related to planning difficulties. Unlike in the Barr and Seyfeddinipur experiment, which investigated online reference resolution, we compare fillers to coughs in their effect on memory for a discourse.

The present work aimed to answer two questions: First, do the benefits of fillers in online, local processing generalize to long-term understanding of a discourse? Although it is plausible that disfluencies could also lead to attention to, or predictions about, a more global level of discourse, and that this would improve long-term understanding, this hypothesis has not yet been tested. Second, if fillers do benefit memory for a discourse, are these effects attributable to listeners’ predictions about upcoming materials, to attentional orienting, or to increased processing time?

We examined the influence of fillers on discourse memory using a storytelling paradigm. Participants listened to short recorded stories, excerpted from Alice’s Adventures in Wonderland (Carroll, 1865), and then attempted to retell them from memory. The presence or absence of various interruptions was manipulated by splicing them in and out of the recorded stories. This paradigm permits an assessment of the effects of fillers on discourses that are more complex than those used in many experiments, but still allows precise control over the existence and location of interruptions.

One concern with this paradigm is that if participants found the spliced disfluencies unusual or unnatural, they might devote special attention to them even if they would not do so in natural language comprehension. To ensure that participants found the interruptions plausible, participants were told a cover story that the discourses had been recorded by a participant in a previous study who had to learn the stories and retell them from memory. (Post-experiment debriefing, discussed below, did not find any participants who detected the splicing.)

In Experiment 1, we tested the processing-time hypothesis by comparing comprehension of a story containing fillers to a fluent story and to a story containing a non-linguistic interruption—the speaker coughing—matched in duration to the fillers. If fillers benefit comprehension simply because they allow more processing time, a cough of equal duration should equally benefit processing. However, if the association of fillers with language production difficulties is critical to their facilitation of comprehension, then non-linguistic interruptions such as coughs should not produce the same effect as fillers.

In Experiment 2, we then compared the predictive processing and attentional orienting hypotheses by manipulating the location of fillers within a story. According to the predictive processing hypothesis, fillers only benefit comprehension when they allow listeners to predict upcoming material. Thus, fillers at a more likely location—between discourse segments—should have more predictive utility and benefit comprehension more than fillers in unlikely locations—such as within a plot point. By contrast, if fillers simply orient attention to the speech stream, they may facilitate comprehension relative to a fluent story no matter where they are located.
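
The logic of the two experiments can be summarized as a small lookup of qualitative predictions. The sketch below is only an illustration of that reasoning, not the authors' analysis: the condition names, the True/False coding of "recall benefit relative to a fluent story," and the function name consistent_accounts are all assumptions introduced here.

```python
# Illustrative encoding of the three accounts' qualitative predictions.
# Condition names and the True/False coding are assumptions made for this
# sketch; they are not taken from the authors' materials or analysis.

# True  = the account predicts a recall benefit relative to a fluent story
# False = the account predicts no benefit (or possibly impairment)
PREDICTIONS = {
    "processing_time": {
        "filler_typical": True,    # any interruption adds processing time
        "filler_atypical": True,
        "cough": True,
    },
    "predictive_processing": {
        "filler_typical": True,    # filler correctly signals a new plot point
        "filler_atypical": False,  # prediction is disconfirmed
        "cough": False,            # not tied to production difficulty
    },
    "attentional_orienting": {
        "filler_typical": True,    # fillers orient attention wherever they occur
        "filler_atypical": True,
        "cough": False,
    },
}


def consistent_accounts(observed):
    """Return the accounts whose predictions match every observed condition."""
    return sorted(
        name
        for name, predicted in PREDICTIONS.items()
        if all(predicted[cond] == benefit for cond, benefit in observed.items())
    )


# Pattern reported in the abstract: fillers helped at both kinds of
# location, and duration-matched coughs did not help.
reported = {"filler_typical": True, "filler_atypical": True, "cough": False}
print(consistent_accounts(reported))
```

Under this coding, only the attentional orienting account matches the pattern described in the abstract, which is the conclusion the authors draw. The coding is deliberately coarse: it collapses "no benefit" and "impairment" into a single value.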

Experiment 1

Experiment 1 assessed the effect of fillers on recall of a discourse. Specifically, we contrasted the processing-time hypothesis with the predictive processing and attentional orienting hypotheses by comparing the effects of fillers and coughs.

To verify that fillers were generally associated with production difficulties in these materials, we conducted a production experiment (Fraundorf & Watson, 2008) for norming. Participants in the norming study read passages divided into 14 plot points and

Experiment 2

In Experiment 2, we compared the predictions of the predictive processing and attentional orienting hypotheses by splicing fillers into typical or atypical locations. We determined typical locations for fillers for these materials empirically on the basis of the norming production experiment described above. In this experiment, fillers occurred frequently before what we designated as a new plot point, but far less frequently within a plot point. The mean rate of filler use was 5.19 fillers per

General discussion

In two experiments, we examined the effects of fillers on participants’ ability to correctly recall elements of a short discourse, as well as the potential mechanisms underlying those effects.

In Experiment 1, two different types of interruptions, fillers and coughs, appeared before new plot points. The fillers facilitated recall of the stories relative to a fluent version. However, coughs, an interruption unrelated to language, impaired recall. This divergence provides evidence against the

Acknowledgments

We thank Sarah Brown-Schmidt and members of the Communication and Language Lab for their comments and suggestions, Jessica George for recording stimulus materials, and Lisa Brannan, Shefali Khanna, Sujin Park, and Amie Roten for data collection.

This work was supported by National Science Foundation Graduate Research Fellowship 2007053221 to Scott H. Fraundorf and by NIH grant R01DC008774 from the National Institute on Deafness and Other Communication Disorders.

References (45)

  • V.L. Smith et al. On the course of answering questions. Journal of Memory and Language (1993)
  • M. Swerts. Filled pauses as markers of discourse structure. Journal of Pragmatics (1998)
  • M. Watanabe et al. Filled pauses as cues to the complexity of upcoming phrases for native and non-native listeners. Speech Communication (2008)
  • A. Almor et al. Focus and noun phrase anaphors in spoken language comprehension. Language and Cognitive Processes (2008)
  • J.E. Arnold et al. If you say thee uh- you’re describing something hard: The on-line attribution of disfluency during reference comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition (2007)
  • J.E. Arnold et al. The old and thee, uh, new: Disfluency and reference resolution. Psychological Science (2004)
  • D. Barr. Paralinguistic correlates of conceptual structure. Psychonomic Bulletin & Review (2003)
  • D. Barr et al. The role of fillers in listener attributions for speaker disfluency. Language and Cognitive Processes (2009)
  • Bates, D., Maechler, M., & Dai B. (2010). lme4: Linear mixed-effects models using s4 classes [computer software...
  • A. Bell et al. Effects of disfluencies, predictability, and utterance position on word form variation in English conversation. The Journal of the Acoustical Society of America (2003)
  • K. Bock. Language production: Methods and methodologies. Psychonomic Bulletin & Review (1996)
  • D.H. Brainard. The psychophysics toolbox. Spatial Vision (1997)