The disfluent discourse: Effects of filled pauses on recall
Highlights
► We test the effects of disfluent filled pauses on recall of a discourse.
► Fillers improve recall whether or not they predict upcoming discourse boundaries.
► No benefit from coughs of equal duration, ruling out a processing time effect.
► Fillers have an attentional orienting effect.
Introduction
Natural conversation, unlike most laboratory speech, is rife with disfluencies, or interruptions in the fluent flow of speech. For instance, speech frequently includes fillers such as uh and um. Estimates have placed the rate of disfluencies in speech as high as 6 per 100 words (Fox Tree, 1995). What are the consequences of these frequent interruptions for understanding and remembering information from a discourse?
Eye tracking and electrophysiological measures have found that fillers often benefit online comprehension of simple materials (see Corley & Stewart, 2008, for review). However, little work has investigated whether findings from these paradigms generalize to the discourse level and to situations involving later memory. Moreover, the specific mechanisms by which fillers benefit comprehension remain unclear.
In two experiments, we tested whether fillers affect later memory for discourse and examined the mechanisms by which they might do so. We compare three theoretical accounts of how fillers affect processing: fillers may allow participants to predict what will come next, fillers may orient attention to the speech stream, or fillers may allow more time to process the discourse.
Although precise taxonomies of disfluencies vary (e.g., Clark, 1996; Maclay & Osgood, 1959; Shriberg, 1994), all include filled pauses, or fillers, verbal interruptions that do not relate to the proposition of the main message. In American English, the most common fillers are uh and um (Clark & Fox Tree, 2002). Fillers are one of the most frequent types of disfluency, accounting for one-third to over one-half of all disfluencies in several corpora (Shriberg, 1994).
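As a rough illustration of how a rate such as "fillers per 100 words" can be computed from a transcript, the following minimal sketch counts uh and um tokens. The function name, the tokenization rule, and the choice to exclude fillers from the word count are assumptions made for illustration; they are not the procedure used in the corpora or studies cited here.

```python
import re

# Fillers named in the text as the most common in American English.
FILLERS = {"uh", "um"}


def filler_rate_per_100_words(transcript: str) -> float:
    """Return the number of fillers per 100 non-filler words.

    Assumption: a "word" is any run of letters or apostrophes, and
    fillers are not counted in the denominator.
    """
    tokens = re.findall(r"[a-z']+", transcript.lower())
    n_fillers = sum(1 for token in tokens if token in FILLERS)
    n_words = len(tokens) - n_fillers
    if n_words == 0:
        return 0.0
    return 100.0 * n_fillers / n_words


rate = filler_rate_per_100_words("The um cat sat on the uh mat")
```

A different convention, such as counting fillers in the denominator, would give somewhat different rates, so any comparison across corpora depends on using the same counting rule.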
Prior work has found that, although fillers interrupt the fluent delivery of an utterance, they often benefit listeners’ online comprehension. Fillers facilitate judgments of whether a word in running speech matches an earlier probe (Fox Tree, 2001) and allow listeners to more quickly respond to an instruction in which the speaker repairs a prior error (Brennan & Schober, 2001). Listeners use the presence of fillers to anticipate that the speaker will refer to a less accessible referent rather than a more accessible one (Arnold et al., 2007; Arnold et al., 2004; Barr & Seyfeddinipur, 2009).
Although these studies have established the benefits of fillers in lexical and referential processing, several outstanding issues remain. First, it is unclear to what degree these effects generalize to later memory. Some evidence that fillers modulate memory comes from Corley, MacGregor, and Donaldson (2007), who found that sentence-final words were more apt to be recognized on a later memory test when they were preceded by a filler than when they were fluent. In general, however, less work has examined what the consequences of fillers and other disfluencies are for the long-term understanding of a connected discourse.
Second, many studies of disfluency have focused on the effects of fillers on identifying individual referents or lexical items. These studies have often used either isolated sentences or discourses with only a few possible referents. Thus, it is unclear whether fillers benefit processing only at lower levels of language comprehension or also at the level of the discourse. In the present work, we address these two issues by examining whether the benefits of fillers generalize to long-term memory for a complex discourse.
Finally, the specific mechanisms by which fillers benefit processing remain uncertain. Below, we review several accounts that have been proposed about how fillers modulate online language comprehension, and discuss how these accounts could be applied to the effect of fillers on long-term memory for a discourse. These hypotheses include predictive processing, attentional orienting, and increased processing time.
One reason that fillers and other disfluencies may benefit comprehension is that listeners can use them to predict what they will hear next. Speakers are most apt to produce fillers before material that is less accessible, such as a referent that is new to the discourse or that is difficult to name. Prior experience with this distribution of fillers might allow listeners to use the presence of a filler to predict that the speaker will next refer to a less accessible referent. This type of finding has been obtained in studies using the visual world paradigm; in these studies, eye fixations are recorded as participants follow instructions that refer to referents in simple scenes (e.g., Tanenhaus, Spivey-Knowlton, Eberhard, & Sedivy, 1995). Participants’ eye movements suggest that when they hear a disfluency such as thee uh before a noun phrase, they anticipate that the upcoming noun phrase refers to a new item in the discourse (Arnold et al., 2004) or one that is difficult to name (Arnold et al., 2007). Similar patterns have also been obtained by tracking participants’ movements of a computer mouse to select one of several objects (e.g., Barr & Seyfeddinipur, 2009).
Although predictive processing accounts have focused on the ability of listeners to quickly identify particular referents, it is plausible that predictions could have consequences at a broader discourse level. Most theories of discourse propose that discourse comprehension requires processing both at a global level and at the level of local reference (for review, see Graesser, Millis, & Zwaan, 1997). Facilitating the local level of processing would allow more time or resources to be devoted to the global level. Furthermore, predictions could also be made at the discourse level itself. Speakers produce fillers more frequently at discourse boundaries (Fraundorf & Watson, 2008; Swerts, 1998), so listeners could potentially use a filler to successfully anticipate a topic change rather than the continuation of an existing topic.
Importantly, however, the predictive processing hypothesis predicts that fillers must lead listeners to make accurate predictions in order to benefit comprehension. When fillers lead to predictions that are later disconfirmed, such as when a filler is followed by a highly accessible referent or the continuation of the same topic, fillers should not benefit, and might even impair, comprehension.
An alternative to the predictive processing hypothesis is that the linguistic nature of fillers orients attention to the speech stream, but does not necessarily result in specific predictions about the nature of upcoming material. As with the predictive processing account, this hypothesis has typically focused on local referential or lexical processing. But attention would likely have consequences for discourse-level comprehension as well. The construction of discourse coherence at a global level is influenced by motivation and goals (Graesser, Millis, & Zwaan, 1997), which would likely be sensitive to changes in attention. Thus, increases in attention in response to fillers might also improve discourse-level comprehension.
In an attentional orienting account, fillers need not be predictive of later material in order to benefit comprehension. For instance, fillers can affect responses even to acoustic events that do not occur in natural speech. Collard, Corley, MacGregor, and Donaldson (2008) tested this hypothesis by examining the effect of fillers on the event-related potentials (ERPs) evoked in response to an oddball auditory effect: the temporary acoustic compression of the speech stream. In general, such oddballs produced large MMN (mismatch negativity) and P300 components, ERP components argued to reflect the orienting and updating of attention. However, a filler before the oddball greatly reduced the magnitude of these effects. Collard et al. interpreted this pattern as indicating that attention was already oriented to the speech stream as a result of the filler. Crucially for the attentional orienting hypothesis, fillers had this effect even though listeners could not predict the unusual, laboratory-only oddball effect based on prior experience about the distribution of fillers.
The attentional orienting hypothesis also predicts that a filler should still benefit comprehension even if it occurs in an atypical location, such as in the middle of a discourse segment or before a highly accessible response. Support for this prediction comes from Corley and Hartsuiker (2003), who asked participants to click on one of several pictures on a computer screen based on an auditory instruction. A filler (um) in the instruction facilitated response times to targets of both high lexical frequency (high accessibility) and low lexical frequency (low accessibility). One interpretation of these results is that the fillers did not lead participants to make a specific prediction that the upcoming referent would be the less accessible one, but rather increased general attention to the referring expression.
The two prior accounts have proposed that fillers affect comprehension qua fillers, either because they lead to specific predictions or because they increase attention to the speech stream. But a third hypothesis is that interruptions of the speech stream simply add additional time for comprehension processes, including those at the discourse level, to unfold. This processing-time hypothesis predicts that any interruption, including a silent pause or environmental noise, should achieve the same effect as a filler, whether or not the interruption is linguistic in nature or diagnostic of the speaker’s mental state.
Evidence in support of this hypothesis is mixed. When listeners followed instructions in which the speaker self-corrected an erroneous object description, a silent pause was as effective as a filler in speeding responses to the correction (Brennan & Schober, 2001). Listeners were also equally likely to expect that a referring expression would refer to a complex object when it was preceded by a filler as by a silent pause (Watanabe, Hirose, Den, & Minematsu, 2008). In addition, fillers, environmental noises, and fluent modifying expressions all have similar effects on listeners’ interpretation of temporary syntactic ambiguities (Bailey & Ferreira, 2003). However, not all of the results in the literature point to effects of processing time: Barr and Seyfeddinipur (2009) found that fillers directed mouse movements towards a new referent more quickly than did coughs or sniffles at the same point in the utterance. This suggests that fillers had effects beyond the time they spent interrupting the speech stream. Fillers also had larger effects on offline judgments of a speaker’s knowledgeability than silent pauses matched in duration (Brennan & Williams, 1995).
The mixed evidence for the processing-time hypothesis may be due in part to the difficulty of determining the appropriate control condition. Although periods of silence might at first appear to present a useful control interruption, silent pauses are themselves a form of disfluency associated with planning difficulties (Maclay & Osgood, 1959), so listeners could treat them, like fillers, as diagnostic of the speaker’s difficulty. Other experiments (e.g., Bailey & Ferreira, 2003) have compared fillers to environmental noises such as telephone rings and animal calls. These interruptions present their own challenges: they rely on the assumption that an external noise would plausibly explain the cessation in speech. If in natural production many speakers prefer to talk over such interruptions, then the absence of speech during an interruption might still be interpreted by listeners as a disfluency.
In the present study, we follow Barr and Seyfeddinipur (2009) by using coughs as a control condition for the processing-time hypothesis. Coughs are interruptions that are generated by the speaker and that provide a plausible explanation for the cessation of speech, but they should not be interpreted as related to planning difficulties. Unlike in the Barr and Seyfeddinipur experiment, which investigated online reference resolution, we compare fillers to coughs in their effect on memory for a discourse.
The present work aimed to answer two questions: First, do the benefits of fillers in online, local processing generalize to long-term understanding of a discourse? Although it is plausible that disfluencies could also lead to attention to, or predictions about, a more global level of discourse, and that this would improve long-term understanding, this hypothesis has not yet been tested. Second, if fillers do benefit memory for a discourse, are these effects attributable to listeners’ predictions about upcoming materials, to attentional orienting, or to increased processing time?
We examined the influence of fillers on discourse memory using a storytelling paradigm. Participants listened to short recorded stories, excerpted from Alice’s Adventures in Wonderland (Carroll, 1865), and then attempted to retell them from memory. The presence or absence of various interruptions was manipulated by splicing them in and out of the recorded stories. This paradigm permits an assessment of the effects of fillers on discourses that are more complex than those used in many experiments, but still allows precise control over the existence and location of interruptions.
One concern with this paradigm is that if participants found the spliced disfluencies unusual or unnatural, they might devote special attention to them even if they would not do so in natural language comprehension. To ensure that participants found the interruptions plausible, participants were told a cover story that the discourses had been recorded by a participant in a previous study who had to learn the stories and retell them from memory. (Post-experiment debriefing, discussed below, did not find any participants who detected the splicing.)
In Experiment 1, we tested the processing-time hypothesis by comparing comprehension of a story containing fillers to a fluent story and to a story containing a non-linguistic interruption—the speaker coughing—matched in duration to the fillers. If fillers benefit comprehension simply because they allow more processing time, a cough of equal duration should equally benefit processing. However, if the association of fillers with language production difficulties is critical to their facilitation of comprehension, then non-linguistic interruptions such as coughs should not produce the same effect as fillers.
In Experiment 2, we then compared the predictive processing and attentional orienting hypotheses by manipulating the location of fillers within a story. According to the predictive processing hypothesis, fillers only benefit comprehension when they allow listeners to predict upcoming material. Thus, fillers at a more likely location—between discourse segments—should have more predictive utility and benefit comprehension more than fillers in unlikely locations—such as within a plot point. By contrast, if fillers simply orient attention to the speech stream, they may facilitate comprehension relative to a fluent story no matter where they are located.
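The contrasting predictions of the three accounts can be laid out as a small truth table. The sketch below is only an illustrative summary of the reasoning in the preceding paragraphs: the function name, the condition labels, and the simplification of each prediction to a yes/no "benefit relative to a fluent story" are assumptions introduced here, not part of the original design or analysis.

```python
# Hypothetical labels for the interruption types and locations discussed above.
INTERRUPTIONS = ("filler", "cough")
LOCATIONS = ("boundary", "within")  # between plot points vs. within a plot point


def predicts_benefit(hypothesis: str, interruption: str, location: str) -> bool:
    """Return True if the hypothesis predicts better recall than a fluent story."""
    if hypothesis == "processing_time":
        # Any interruption adds processing time, linguistic or not.
        return True
    if hypothesis == "predictive_processing":
        # Only a filler in a typical location supports an accurate prediction.
        return interruption == "filler" and location == "boundary"
    if hypothesis == "attentional_orienting":
        # A filler orients attention to the speech stream wherever it occurs.
        return interruption == "filler"
    raise ValueError(f"unknown hypothesis: {hypothesis}")


# Print the full table of predictions, one row per hypothesis and condition.
for hypothesis in ("processing_time", "predictive_processing", "attentional_orienting"):
    for interruption in INTERRUPTIONS:
        for location in LOCATIONS:
            print(hypothesis, interruption, location,
                  predicts_benefit(hypothesis, interruption, location))
```

Read this way, the cough condition separates the processing-time account from the other two, and the within-plot-point filler condition separates predictive processing from attentional orienting.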
Section snippets
Experiment 1
Experiment 1 assessed the effect of fillers on recall of a discourse. Specifically, we contrasted the processing-time hypothesis with the predictive processing and attentional orienting hypotheses by comparing the effects of fillers and coughs.
To verify that fillers were generally associated with production difficulties in these materials, we conducted a production experiment (Fraundorf & Watson, 2008) for norming. Participants in the norming study read passages divided into 14 plot points and
Experiment 2
In Experiment 2, we compared the predictions of the predictive processing and attentional orienting hypotheses by splicing fillers into typical or atypical locations. We determined typical locations for fillers for these materials empirically on the basis of the norming production experiment described above. In this experiment, fillers occurred frequently before what we designated as a new plot point, but far less frequently within a plot point. The mean rate of filler use was 5.19 fillers per
General discussion
In two experiments, we examined both the effects of fillers on participants’ ability to correctly recall elements of a short discourse and the potential mechanisms underlying those effects.
In Experiment 1, two different types of interruptions, fillers and coughs, appeared before new plot points. The fillers facilitated recall of the stories relative to a fluent version. However, coughs, an interruption unrelated to language, impaired recall. This divergence provides evidence against the
Acknowledgments
We thank Sarah Brown-Schmidt and members of the Communication and Language Lab for their comments and suggestions, Jessica George for recording stimulus materials, and Lisa Brannan, Shefali Khanna, Sujin Park, and Amie Roten for data collection.
This work was supported by National Science Foundation Graduate Research Fellowship 2007053221 to Scott H. Fraundorf and an NIH grant from the National Institute on Deafness and Other Communication Disorders, R01DC008774.
References (45)
- et al. Disfluencies affect the parsing of garden-path sentences. Journal of Memory and Language (2003)
- et al. How listeners compensate for disfluencies in spontaneous speech. Journal of Memory and Language (2001)
- et al. The feeling of another’s knowing: Prosody and filled pauses as cues to listeners about the metacognitive states of speakers. Journal of Memory and Language (1995)
- et al. Using uh and um in spontaneous speaking. Cognition (2002)
- et al. Repeating words in spontaneous speech. Cognitive Psychology (1998)
- et al. It’s the way that you, er, say it: Hesitations in speech affect language comprehension. Cognition (2007)
- et al. Pronouncing “the” as “thee” to signal problems in speaking. Cognition (1997)
- et al. Item effects in recognition memory for words. Journal of Memory and Language (2010)
- Monitoring and self-repair in speech. Cognition (1983)
- et al. Putting lexical constraints in context into the visual-world paradigm. Cognition (2008)
- On the course of answering questions. Journal of Memory and Language
- Filled pauses as markers of discourse structure. Journal of Pragmatics
- Filled pauses as cues to the complexity of upcoming phrases for native and non-native listeners. Speech Communication
- Focus and noun phrase anaphors in spoken language comprehension. Language and Cognitive Processes
- If you say thee uh- you’re describing something hard: The on-line attribution of disfluency during reference comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition
- The old and thee, uh, new: Disfluency and reference resolution. Psychological Science
- Paralinguistic correlates of conceptual structure. Psychonomic Bulletin & Review
- The role of fillers in listener attributions for speaker disfluency. Language and Cognitive Processes
- Effects of disfluencies, predictability, and utterance position on word form variation in English conversation. The Journal of the Acoustical Society of America
- Language production: Methods and methodologies. Psychonomic Bulletin & Review
- The psychophysics toolbox. Spatial Vision