Word Spotting from Continuous Speech Utterances

Rose, Richard C.

doi:10.1007/978-1-4613-1367-0_13

Part of the book series: The Kluwer International Series in Engineering and Computer Science (SECS, volume 355)

Abstract

There are many speech recognition applications that require only partial information to be extracted from a speech utterance. These applications include human-machine interactions where it may be difficult to constrain users’ utterances to be within the domain of the machine. Other types of applications that are of interest are those where speech utterances arise from human-human interaction, interaction with speech messaging systems, or any other domain that can be characterized as being unconstrained or spontaneous. This chapter is concerned with the problem of spotting keywords in continuous speech utterances. Many important speech input applications involving word spotting will be described. The chapter will also discuss Automatic Speech Recognition (ASR) problems that are particularly important in word spotting applications. These problems include rejection of out-of-vocabulary utterances, derivation of measures of confidence, and the development of efficient and flexible search algorithms.

Author information

Authors and Affiliations

AT&T Bell Laboratories, Murray Hill, NJ, 07974, USA
Richard C. Rose

Editor information

Editors and Affiliations

AT&T Bell Laboratories, Murray Hill, NJ, 07974, USA
Chin-Hui Lee & Frank K. Soong
School of Microelectronic Engineering, Griffith University, Australia
Kuldip K. Paliwal

About this chapter

Cite this chapter

Rose, R.C. (1996). Word Spotting from Continuous Speech Utterances. In: Lee, CH., Soong, F.K., Paliwal, K.K. (eds) Automatic Speech and Speaker Recognition. The Kluwer International Series in Engineering and Computer Science, vol 355. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-1367-0_13

DOI: https://doi.org/10.1007/978-1-4613-1367-0_13
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4612-8590-8
Online ISBN: 978-1-4613-1367-0
eBook Packages: Springer Book Archive
