Digits to Words Converter for Slavic Languages in Systems of Automatic Speech Recognition

Chaloupka, Josef

doi:10.1007/978-3-319-66429-3_30

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10458)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

This paper proposes a system for digits-to-words conversion that covers almost all Slavic languages. The system was developed to improve the text corpora that we use for building a lexicon and for training language models and acoustic models in the task of Large Vocabulary Continuous Speech Recognition (LVCSR). Strings of digits, other special characters (%, €, $, ...), and abbreviations of physical units (km, m, cm, kg, l, °C, etc.) occur very often in our text corpora, in about 5% of cases. Such strings of digits or special characters are usually omitted when a lexicon is built or a language model is trained. In non-inflected languages (e.g. English), digits-to-words conversion is solved by a relatively simple conversion or lookup table. The problem is more complex in inflected Slavic languages: a string of digits can be converted into several different word combinations depending on the context, and the resulting words are inflected by gender or case. The main goal of this research was to find the rules (patterns) for converting strings of digits into words in Slavic languages. The second goal was to unify these patterns across Slavic languages and to integrate them into a universal system for digits-to-words conversion.
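
The contrast drawn in the abstract between a plain lookup table and context-dependent inflection can be illustrated with a minimal sketch. It is not the system described in the paper: the names `EN_DIGITS`, `CS_DIGITS`, `CS_UNITS`, `unit_form`, and `expand_cs` are hypothetical, the tables cover only three digits and two units, and only Czech forms that are valid in a nominative or accusative context are included. Choosing among other cases would require analysing the surrounding words, which this sketch does not attempt.

```python
import re

# English: one spoken form per digit, so a flat lookup table is enough.
EN_DIGITS = {"1": "one", "2": "two", "5": "five"}

# Czech (illustrative subset): the numeral agrees in gender with the counted noun.
CS_DIGITS = {
    "1": {"masc": "jeden", "fem": "jedna", "neut": "jedno"},
    "2": {"masc": "dva", "fem": "dvě", "neut": "dvě"},
    "5": {"masc": "pět", "fem": "pět", "neut": "pět"},
}

# Hypothetical unit table: symbol -> (gender, form for 1, form for 2-4, form for 5+).
CS_UNITS = {
    "km": ("masc", "kilometr", "kilometry", "kilometrů"),
    "%": ("neut", "procento", "procenta", "procent"),
}


def unit_form(n, forms):
    """Pick the Czech noun form that matches the count n."""
    singular, plural, genitive_plural = forms
    if n == 1:
        return singular
    if 2 <= n <= 4:
        return plural
    return genitive_plural


def expand_cs(text):
    """Replace '<digit> <unit>' pairs known to the tables with Czech words."""
    pattern = re.compile(r"\b([125]) ?(km|%)")

    def repl(match):
        digit, unit = match.group(1), match.group(2)
        gender, *forms = CS_UNITS[unit]
        return f"{CS_DIGITS[digit][gender]} {unit_form(int(digit), forms)}"

    return pattern.sub(repl, text)


print(expand_cs("Trasa má 2 km."))
print(expand_cs("Ceny vzrostly o 5 %."))
```

Even in this reduced form, the Czech tables need a gender dimension for the numeral and a count-dependent form for the unit, whereas the English table maps each digit to a single word.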

Acknowledgments

The research was supported by the Technology Agency of the Czech Republic in project no. TA04010199.

Author information

Authors and Affiliations

Technical University of Liberec, 461 17, Liberec, Czech Republic
Josef Chaloupka

Corresponding author

Correspondence to Josef Chaloupka.

Editor information

Editors and Affiliations

SPIIRAS, Saint Petersburg, Russia
Alexey Karpov
Moscow State Linguistic University, Moscow, Russia
Rodmonga Potapova
University of Hertfordshire, Hatfield, United Kingdom
Iosif Mporas

About this paper

Cite this paper

Chaloupka, J. (2017). Digits to Words Converter for Slavic Languages in Systems of Automatic Speech Recognition. In: Karpov, A., Potapova, R., Mporas, I. (eds) Speech and Computer. SPECOM 2017. Lecture Notes in Computer Science, vol 10458. Springer, Cham. https://doi.org/10.1007/978-3-319-66429-3_30

DOI: https://doi.org/10.1007/978-3-319-66429-3_30
Published: 13 August 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66428-6
Online ISBN: 978-3-319-66429-3
eBook Packages: Computer Science, Computer Science (R0)
