Assigning intonational features in synthesized spoken directions

Authors:
James Raymond Davis, The Media Laboratory, MIT, Cambridge, MA
Julia Hirschberg, AT&T Bell Laboratories, Murray Hill, NJ
ACL '88: Proceedings of the 26th annual meeting on Association for Computational Linguistics, June 1988, Pages 187–193
https://doi.org/10.3115/982023.982046

Published: 7 June 1988

ABSTRACT

Speakers convey much of the information hearers use to interpret discourse by varying prosodic features such as PHRASING, PITCH ACCENT placement, TUNE, and PITCH RANGE. The ability to emulate such variation is crucial to effective (synthetic) speech generation. While text-to-speech synthesis must rely primarily upon structural information to determine appropriate intonational features, speech synthesized from an abstract representation of the message to be conveyed may employ much richer sources. The implementation of an intonation assignment component for Direction Assistance, a program which generates spoken directions, provides a first approximation of how recent models of discourse structure can be used to control intonational variation in ways that build upon recent research in intonational meaning. The implementation further suggests ways in which these discourse models might be augmented to permit the assignment of appropriate intonational features.

Published in
ACL '88: Proceedings of the 26th annual meeting on Association for Computational Linguistics
June 1988
304 pages
Program Chair: Jerry Hobbs, SRI International
Publisher: Association for Computational Linguistics, United States

Acceptance Rates
Overall Acceptance Rate: 85 of 443 submissions, 19%