Audio Lifelog Search System Using a Topic Model for Reducing Recognition Errors

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2011)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6588)

Abstract

A system that records daily conversations is one of the most useful types of lifelogs. It is, however, not widely used due to the low precision of speech recognizers when applied to conversations. To solve this problem, we propose a method that uses a topic model to reduce the number of incorrectly recognized words. Specifically, we measure the relevancy between each term and the other words in the conversation and remove the terms whose relevancy falls below a threshold. An audio lifelog search system was implemented using this method. Experiments showed that the method is effective in compensating for the recognition errors of speech recognizers: we observed an increase in both precision and recall. The results indicate that the method can reduce errors in the index of a lifelog search system.
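The filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the relevancy formula, so everything here is an assumption for illustration only: the toy topic-word table `TOPIC_WORD`, the `FLOOR` smoothing value, the uniform topic prior, and the helper names `topic_posterior`, `relevancy`, and `filter_terms` are all hypothetical. Relevancy is taken here to be the probability of a term given the topic mixture inferred from the other words in the conversation, which is one plausible reading and not necessarily the authors' definition.

```python
from typing import Dict, List

# Hypothetical toy topic model: P(word | topic) for two topics.
# A real system would learn these distributions (e.g. with LDA) from a corpus.
TOPIC_WORD: Dict[str, Dict[str, float]] = {
    "cooking": {"recipe": 0.4, "oven": 0.3, "salt": 0.25, "kernel": 0.05},
    "computing": {"kernel": 0.4, "compiler": 0.3, "memory": 0.25, "salt": 0.05},
}
FLOOR = 1e-6  # assumed probability for a word that a topic has never seen


def topic_posterior(context: List[str]) -> Dict[str, float]:
    """P(topic | context), assuming a uniform topic prior and independent words."""
    scores = {}
    for topic, dist in TOPIC_WORD.items():
        p = 1.0
        for word in context:
            p *= dist.get(word, FLOOR)
        scores[topic] = p
    total = sum(scores.values())
    return {topic: score / total for topic, score in scores.items()}


def relevancy(term: str, context: List[str]) -> float:
    """Assumed relevancy: P(term | context) = sum over topics of P(term | z) P(z | context)."""
    posterior = topic_posterior(context)
    return sum(posterior[t] * TOPIC_WORD[t].get(term, FLOOR) for t in TOPIC_WORD)


def filter_terms(words: List[str], threshold: float) -> List[str]:
    """Keep only the terms whose relevancy to the remaining words reaches the threshold."""
    kept = []
    for i, term in enumerate(words):
        context = words[:i] + words[i + 1:]  # every other word in the conversation
        if relevancy(term, context) >= threshold:
            kept.append(term)
    return kept


# A recognizer output in which "compiler" is a likely misrecognition.
recognized = ["recipe", "oven", "compiler", "salt"]
print(filter_terms(recognized, threshold=0.01))
```

In this toy setting the word that fits none of the topics suggested by its neighbours scores far below the threshold and is dropped before indexing, while the mutually consistent words are kept. The threshold value is arbitrary here and would need tuning against real recognizer output.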

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tezuka, T., Maeda, A. (2011). Audio Lifelog Search System Using a Topic Model for Reducing Recognition Errors. In: Yu, J.X., Kim, M.H., Unland, R. (eds) Database Systems for Advanced Applications. DASFAA 2011. Lecture Notes in Computer Science, vol 6588. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20152-3_6

  • DOI: https://doi.org/10.1007/978-3-642-20152-3_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20151-6

  • Online ISBN: 978-3-642-20152-3

  • eBook Packages: Computer Science, Computer Science (R0)
