A Machine Learning Based Approach for Vocabulary Selection for Speech Transcription

Jouvet, Denis; Langlois, David

doi:10.1007/978-3-642-40585-3_9

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8082)

Abstract

This paper introduces a new approach, based on neural networks, for selecting the vocabulary of a speech transcription system. Large sets of text data can now be collected from web sources and used, in addition to more traditional text sources, to build language models for speech transcription systems. However, web sources yield large amounts of heterogeneous data, and as a consequence standard vocabulary selection procedures based on unigram approaches tend to select undesirable items as new words. As an alternative to unigram-based and empirical manual selection approaches, this paper proposes a selection procedure that relies on a machine learning technique, namely neural networks. The paper presents and discusses the results obtained with the various selection procedures. The results of the neural network based selection experiments are promising, and the approach can automatically take various kinds of detailed information into account in the selection process.
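To make the unigram baseline mentioned in the abstract concrete, here is a minimal sketch of one common form of unigram-based vocabulary selection: words are ranked by a weighted mix of their relative frequencies in each corpus, and the top-ranked words are kept. This is an illustration under stated assumptions, not the procedure used in the paper: the function name `select_vocabulary`, the interpolation weights, and the toy corpora are all hypothetical.

```python
from collections import Counter

def select_vocabulary(corpora, weights, size):
    """Rank words by a weighted sum of per-corpus unigram relative
    frequencies and return the `size` best-scoring words.

    corpora: dict mapping a corpus name to a list of tokens
    weights: dict mapping a corpus name to its interpolation weight
             (hypothetical values; missing names count as 0)
    """
    scores = Counter()
    for name, tokens in corpora.items():
        counts = Counter(tokens)
        total = sum(counts.values())
        if total == 0:
            continue  # skip empty corpora
        for word, count in counts.items():
            scores[word] += weights.get(name, 0.0) * count / total
    # Sort by decreasing score; break ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:size]]

# Toy data (assumed): a clean "news" corpus and a noisy "web" corpus.
corpora = {
    "news": "le chat dort le chien dort".split(),
    "web": "le lol lol lol mdr chat".split(),
}

news_heavy = select_vocabulary(corpora, {"news": 0.8, "web": 0.2}, 4)
web_heavy = select_vocabulary(corpora, {"news": 0.1, "web": 0.9}, 2)
```

In this toy setting, giving most of the weight to the web corpus lets a frequent but unwanted token such as "lol" reach the top of the ranking, which is the kind of behaviour the abstract attributes to unigram-based selection on heterogeneous web data.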

Author information

Authors and Affiliations

Speech Group, LORIA Inria, Villers-lès-Nancy, F-54600, France
Denis Jouvet & David Langlois
Université de Lorraine, LORIA, UMR 7503, Villers-lès-Nancy, F-54600, France
Denis Jouvet & David Langlois
CNRS, LORIA, UMR 7503, Villers-lès-Nancy, F-54600, France
Denis Jouvet & David Langlois

Editor information

Editors and Affiliations

University of West Bohemia, 306 14, Pilsen, Czech Republic
Ivan Habernal & Václav Matoušek

About this paper

Cite this paper

Jouvet, D., Langlois, D. (2013). A Machine Learning Based Approach for Vocabulary Selection for Speech Transcription. In: Habernal, I., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2013. Lecture Notes in Computer Science, vol 8082. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40585-3_9

DOI: https://doi.org/10.1007/978-3-642-40585-3_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40584-6
Online ISBN: 978-3-642-40585-3
eBook Packages: Computer Science, Computer Science (R0)