Word Spotting from Continuous Speech Utterances

  • Chapter
Automatic Speech and Speaker Recognition

Part of the book series: The Kluwer International Series in Engineering and Computer Science (SECS, volume 355)

Abstract

There are many speech recognition applications that require only partial information to be extracted from a speech utterance. These applications include human-machine interactions where it may be difficult to constrain users’ utterances to be within the domain of the machine. Other types of applications that are of interest are those where speech utterances arise from human-human interaction, interaction with speech messaging systems, or any other domain that can be characterized as being unconstrained or spontaneous. This chapter is concerned with the problem of spotting keywords in continuous speech utterances. Many important speech input applications involving word spotting will be described. The chapter will also discuss Automatic Speech Recognition (ASR) problems that are particularly important in word spotting applications. These problems include rejection of out-of-vocabulary utterances, derivation of measures of confidence, and the development of efficient and flexible search algorithms.

Copyright information

© 1996 Kluwer Academic Publishers

About this chapter

Cite this chapter

Rose, R.C. (1996). Word Spotting from Continuous Speech Utterances. In: Lee, CH., Soong, F.K., Paliwal, K.K. (eds) Automatic Speech and Speaker Recognition. The Kluwer International Series in Engineering and Computer Science, vol 355. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-1367-0_13

  • DOI: https://doi.org/10.1007/978-1-4613-1367-0_13

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4612-8590-8

  • Online ISBN: 978-1-4613-1367-0

  • eBook Packages: Springer Book Archive
