Copyright © 1997 Elsevier B.V. All rights reserved.
Production models as a structural basis for automatic speech recognition
Received 4 October 1996;
Abstract
We postulate that highly structured speech production models have much to contribute to the ultimate success of speech recognition, given the weaknesses in the theoretical foundation of current technology. We analyze these weaknesses in terms of phonological modeling and phonetic-interface modeling. We then present two probabilistic speech recognition models whose structure is based on approximations to human speech production mechanisms. We conclude by suggesting that many of the advantages to be gained from interaction between the speech production and speech recognition communities will come from integrating production models with the probabilistic analysis-by-synthesis strategy currently used by the technology community.
Résumé
In this article, we suggest that highly structured speech production models can contribute significantly to the future success of automatic speech recognition models, which are currently limited by weaknesses in the theoretical basis of present technology. We analyze these weaknesses at the level of phonological models and phonetic models, and present two statistical speech recognition models based on approximations of the mechanisms of speech production. We suggest in conclusion that interaction between the fields of speech production and speech recognition can be particularly effective if production models are integrated into the probabilistic analysis-by-synthesis strategy, which has long been used in speech recognition.
Author Keywords: Speech production; Speech recognition; Analysis by synthesis; Stochastic modeling; Nonlinear phonology; Phonetic interface; Articulatory features; Articulatory dynamics; Stochastic target model













