doi:10.1016/j.specom.2005.02.013
Copyright © 2005 Elsevier B.V. All rights reserved.
Cues to upcoming Swedish prosodic boundaries: Subjective judgment studies and acoustic correlates
a Department of Speech, Music and Hearing, KTH (Royal Institute of Technology), Lindstedsvägen 24, 5th floor, SE-100 44 Stockholm, Sweden
b Department of Computer Science, Columbia University, 1214 Amsterdam Avenue, M/C 0401, 450 CS Building, New York, NY 10027, USA
c Faculty of Arts, Tilburg University, P.O. Box 90153, NL-5000 LE Tilburg, The Netherlands
d Department of Linguistics, University of Antwerp, Universiteitsplein 1, B-2610 Antwerpen, Belgium
Received 23 August 2004; revised 9 February 2005; accepted 27 February 2005. Available online 3 May 2005.
Abstract
Studies of perceptually based predictions of upcoming prosodic boundaries in spontaneous Swedish speech, by both native speakers of Swedish and native speakers of standard American English, reveal marked similarity in judgments. We examined whether Swedish and American listeners were able to predict the occurrence and strength of upcoming boundaries in a series of web-based perception experiments. Utterance fragments (in both long and short versions) were selected from a corpus of spontaneous Swedish speech that had first been labeled for boundary presence and strength by expert labelers. These fragments were then presented to listeners, who were instructed to guess whether or not they were followed by a prosodic break and, if so, what the strength of the break was. Results revealed that both the Swedish and the American listening groups were indeed able to predict whether or not a boundary (of a particular strength) followed the fragment. This suggests that listeners were using acoustic and prosodic information, rather than lexico-grammatical and semantic information, as a primary cue. Acoustic and prosodic correlates of these judgments were then examined, with significant correlations found between judgments and the presence/absence of final creak and phrase-final f0 level and slope.
Keywords: Prosodic boundaries; Prosody perception
Fig. 1. Mean perceived upcoming boundary strength. Data grouped according to labeled boundary strength, fragment size and native language.
Fig. 2. Mean perceived upcoming boundary strength by stimulus length. Data grouped according to subjects’ native language: American (AM) and Swedish (SW).
Fig. 3. Correlation between perceived upcoming boundary strength for each word in isolation and the corresponding 2 s fragment in which the word occurs. Data for both the Swedish and American subjects. Each point corresponds to the mean of the perceived upcoming boundary strength over the subjects. Regression coefficient r = 0.89 (SW) and r = 0.80 (AM).
Fig. 4. Number of stimuli with creaky voice (in %) for different judged boundary strength intervals (one word). No American data with a mean higher than or equal to 4 was found, and thus the corresponding bar is missing.
Fig. 5. Correlation between perceived upcoming boundary strength and the F0 median (Hz) over the final 50 ms for (a) Swedish listeners and (b) American listeners.
Fig. 6. Perceived upcoming boundary strength. Data grouped according to labeled boundary strength and native language, for fragment sizes (a) 2 s and (b) word.