Evaluating Text Normalization for Speech-Based Media Selection

  • Conference paper
Perception in Multimodal Dialogue Systems (PIT 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5078)

Abstract

In this paper, we present an approach to evaluating text normalization for multi-lingual speech-based dialogue systems. Text normalization is applied here within the task of music selection, which imposes several important and novel requirements on its performance. The main idea is that text normalization should determine likely user utterances from the metadata available in a user’s music collection. This differs substantially from the text preprocessing applied, for instance, in text-to-speech (TTS) systems, because a) more than one normalization hypothesis may be generated, and b) for media selection the information content may be reduced, which is not desirable for TTS. These factors also have an impact on evaluation.

We describe a data collection effort carried out to build an initial corpus of text normalization references and scorings, as well as experiments with well-known evaluation metrics from different areas of language research, aimed at identifying an adequate evaluation measure.
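
To make the evaluation setting concrete, the following is a minimal sketch of how several normalization hypotheses might be scored against several reference utterances. The abstract does not name the metrics that were tested, so word error rate is used here purely as an assumed stand-in for a well-known metric from speech recognition. The function names and the example strings are hypothetical and are not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Edit distance normalized by reference length (assumed metric)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


def best_score(references, hypotheses):
    """Lowest error rate over all (reference, hypothesis) pairs.

    Taking the minimum reflects that more than one hypothesis may be
    generated and that any of several user utterances may be acceptable.
    """
    return min(word_error_rate(r, h) for r in references for h in hypotheses)


# Hypothetical example: utterances a user might say for one album title,
# and hypotheses a normalizer might produce from the metadata string.
references = ["the best of queen", "best of queen"]
hypotheses = ["best of queen volume one", "best of queen"]
score = best_score(references, hypotheses)
```

Taking the minimum over pairs is only one possible design choice. A hypothesis that drops information, such as the volume number, can still match a reference exactly, which is the kind of behaviour that separates this task from TTS preprocessing.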

Editor information

Elisabeth André, Laila Dybkjær, Wolfgang Minker, Heiko Neumann, Roberto Pieraccini, Michael Weber

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pfeil, M., Buehler, D., Gruhn, R., Minker, W. (2008). Evaluating Text Normalization for Speech-Based Media Selection. In: André, E., Dybkjær, L., Minker, W., Neumann, H., Pieraccini, R., Weber, M. (eds) Perception in Multimodal Dialogue Systems. PIT 2008. Lecture Notes in Computer Science, vol 5078. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69369-7_7

  • DOI: https://doi.org/10.1007/978-3-540-69369-7_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69368-0

  • Online ISBN: 978-3-540-69369-7

  • eBook Packages: Computer Science, Computer Science (R0)
