Skip to main content

Predicting Word Pronunciation in Japanese

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2011)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 6609))

  • 1286 Accesses

Abstract

This paper addresses the problem of predicting the pronunciation of Japanese words, especially those that are newly created and therefore not in the dictionary. This is an important task for many applications including text-to-speech and text input method, and is also challenging, because Japanese kanji (ideographic) characters typically have multiple possible pronunciations. We approach this problem by considering it as a simplified machine translation/transliteration task, and propose a solution that takes advantage of the recent technologies developed for machine translation and transliteration research. More specifically, we divide the problem into two subtasks: (1) Discovering the pronunciation of new words or those words that are difficult to pronounce by mining unannotated text, much like the creation of a bilingual dictionary using the web; (2) Building a decoder for the task of pronunciation prediction, for which we apply the state-of-the-art discriminative substring-based approach. Our experimental results show that our classifier for validating the word-pronunciation pairs harvested from unannotated text achieves over 98% precision and recall. On the pronunciation prediction task of unseen words, our decoder achieves over 70% accuracy, which significantly improves over the previously proposed models.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Bisani, M., Ney, H.: Investigations on joint-multigram models for grapheme-to-phoneme conversion. In: The Proceedings of the International Conference on Spoken Language Processing (2002)

    Google Scholar 

  2. Bisani, M., Ney, H.: Joint-sequence models for grapheme-to-phoneme conversion. Speech Communication 50(5), 434–451 (2008)

    Article  Google Scholar 

  3. Cao, G., Gao, J., Nie, J.-Y.: A system to mine large-scale bilingual dictionaries from monolingual web pages. In: MT Summit XI (2007)

    Google Scholar 

  4. Chen, S.F.: Conditional and joint models for grapheme-to-phoneme conversion. In: The Proceedings of the European Conference on Speech Communication and Technology (2003)

    Google Scholar 

  5. Cherry, C., Suzuki, H.: Discriminative Substring Decoding for Transliteration. In: EMNLP (2009)

    Google Scholar 

  6. Den, Y., Ogiso, T., Ogura, H., Yamada, A., Minematsu, N., Uchimoto, K., Koiso, H.: The development of an electronic dictionary for morphological analysis and its application to Japanese corpus linguistics. Japanese linguistics 22, 101–122 (2007) (in Japanese)

    Google Scholar 

  7. Friedman, J.H.: Greedy function approximation: a gradient boosting machine. Ann. Statist. 29(5), 1189–1232 (2001)

    Article  MATH  Google Scholar 

  8. Gao, J., Goodman, J., Li, M., Lee, K.-F.: Toward a unified approach to statistical language modeling for Chinese. ACM Transactions on Asian Language Information Processing 1(1), 3–33 (2002a)

    Article  Google Scholar 

  9. Gao, J., Suzuki, H., Wen, Y.: Exploiting headword dependency and predictive clustering for language modeling. In: EMNLP 2002 (2002b)

    Google Scholar 

  10. Ghoshal, A., Jansche, M., Khudanpur, S., Riley, M., Ulinski, M.: Web-derived Pronunciations. In: ICASSP (2009)

    Google Scholar 

  11. Jiampojamarn, S., Kondrak, G., Sherif, T.: Applying many-to-many alignments and hidden markov models to letter-to-phoneme conversion. In: HLT-NAACL (2007)

    Google Scholar 

  12. Jiampojamarn, S., Cherry, C., Kondrak, G.: Joint Processing and Discriminative Training for Letter-to-Phoneme Conversion. In: ACL (2008)

    Google Scholar 

  13. Jiampojamarn, S., Cherry, C., Kondrak, G.: Integrating Joint n-gram Features into a Discriminative Training Framework. In: The Proceedings of NAACL (2010)

    Google Scholar 

  14. Knight, K., Graehl, J.: Machine Transliteration. Computational Linguistics 24(4) (1998)

    Google Scholar 

  15. Koehn, P., Och, F., Marcu, D.: Statistical phrase-based translation. In: NAACL (2003)

    Google Scholar 

  16. Kurata, G., Mori, S., Suitoh, N., Nishimura., M.: Unsupervised lexicon acquisition from speech and text. In: The Proceedings of ICASSP (2007)

    Google Scholar 

  17. Li, H., Zhang, M., Su, J.: A joint source-channel model for machine transliteration. In: ACL (2004)

    Google Scholar 

  18. Lin, D., Zhao, S., Durme, B.V., Pasca, M.: Mining parenthetical translations from the web by word alignment. In: ACL 2008 (2008)

    Google Scholar 

  19. Maekawa, K.: Compilation of the KOTONOHA-BCCWJ Corpus. Nihongo no kenkyu (Studies in Japanese) 4(1), 82–95 (2008) (in Japanese)

    Google Scholar 

  20. Mori, S., Neubig, G.: Automatically improving language processing accuracy by using kana-kanji conversion logs. In: The Proc. of the 16th Annual Meeting of the Association for NLP (2010) (in Japanese)

    Google Scholar 

  21. Mori, S., Sasada, T., Neubig, G.: Language Model Estimation from a Stochastically Tagged Corpus. Technical Report, SIG, Information Processing Society of Japan (2010) (in Japanese)

    Google Scholar 

  22. Nagano, T., Mori, S., Nishimura, M.: An n-gram-based approach to phoneme and accent estimation for TTS. Transactions of Information Processing Society of Japan 47(6), 1793–1801 (2006) (in Japanese)

    Google Scholar 

  23. Och, F.J.: Minimum Error Rate Training for Statistical Machine Translation. In: ACL (2003)

    Google Scholar 

  24. Reddy, S., Goldsmith, J.: An MDL-based approach to extracting subword units for grapheme-to-phoneme conversion. In: NAACL (2010)

    Google Scholar 

  25. Sasada, T., Mori, S., Kawahara, T.: Extracting word-pronunciation pairs from comparable set of text and speech. In: The Proceedings of the 9th Annual Conference of the International Speech Communication Association (2008)

    Google Scholar 

  26. Sasada, T., Mori, S., Kawahara, T.: Domain adaptation of statistical kana-kanji conversion system by automatic acquisition of contextual information with unknown words. In: The Proceedings of the 15th Annual Meeting of the Association for NLP (2009) (in Japanese)

    Google Scholar 

  27. Schroeter, J., Conkie, A., Syrdal, A., Beutnagel, M., Jilka, M., Strom, V., Kim, Y.-J., Kang, H.-G., Kapilow, D.: A perspective on the next challenges for TTS research. In: The Proceedings of the IEEE 2002 Workshop on Speech Synthesis (2002)

    Google Scholar 

  28. Sherif, T., Kondrak, G.: Substring-based transliteration. In: ACL (2007)

    Google Scholar 

  29. Sumita, E., Sugaya, F.: Word Pronunciation Disambiguation using the Web. In: NAACL (2006)

    Google Scholar 

  30. Vance, T.J.: An introduction to Japanese phonology. State University of New York Press (1987)

    Google Scholar 

  31. Zens, R., Ney, H.: Improvements in Phrase-Based Statistical Machine Translation. In: HLT-NAACL (2004)

    Google Scholar 

  32. Zhang, H., Quirk, C., Moore, R.C., Gildea, D.: Bayesian learning of non-compositional phrases with synchronous parsing. In: ACL (2008)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hatori, J., Suzuki, H. (2011). Predicting Word Pronunciation in Japanese. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2011. Lecture Notes in Computer Science, vol 6609. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19437-5_40

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-19437-5_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19436-8

  • Online ISBN: 978-3-642-19437-5

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics