Improving Phoneme Classification Performance Using Observation Context–Dependent Segment Models

Szarvas, Máté; Matsunaga, Shoichi

doi:10.1023/A:1026502830036

Published: December 2000

International Journal of Speech Technology, Volume 3, pages 253–262 (2000)

Abstract

This article describes a novel method that models the correlation among acoustic observations in contiguous speech segments. The basic idea behind the method is that acoustic observations are conditioned not only on the phonetic context but also on the preceding acoustic segment observation. The correlation between consecutive acoustic observations is modeled by mean trajectory polynomial segment models (PSM). This method is an extension of conventional segment modeling approaches in that it describes the correlation of acoustic observations not only inside segments but also between contiguous segments. It is also a generalization of phonetic context (e.g., triphone) modeling approaches because it can model acoustic context and phonetic context at the same time. Using the proposed method in a speaker-independent phoneme classification test resulted in a 7 to 9% relative reduction of error rate as compared with the traditional triphone segmental model system and a 31% reduction as compared with a similar triphone hidden Markov model (HMM) system.
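
To make the mean-trajectory idea concrete, the following is a minimal sketch of a polynomial segment model for a single one-dimensional feature track. It is an illustration under stated assumptions and not the authors' implementation: the function names (`fit_trajectory`, `log_likelihood`), the time normalization to [0, 1], the least-squares fit, and the single fixed Gaussian variance are all choices made here for illustration. The abstract does not say how a model is conditioned on the preceding acoustic segment, so that part is not shown.

```python
import math


def design_row(t, order):
    """Powers of the normalized time t: [1, t, t^2, ..., t^order]."""
    return [t ** k for k in range(order + 1)]


def normalized_times(n):
    """Map frame indices 0..n-1 onto [0, 1] (assumed normalization)."""
    return [i / (n - 1) if n > 1 else 0.0 for i in range(n)]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_trajectory(frames, order=2):
    """Least-squares polynomial mean trajectory for one segment.

    Requires at least order + 1 frames so the normal equations are solvable.
    """
    rows = [design_row(t, order) for t in normalized_times(len(frames))]
    size = order + 1
    # Normal equations: (Z^T Z) coeffs = Z^T y
    ztz = [[sum(r[i] * r[j] for r in rows) for j in range(size)]
           for i in range(size)]
    zty = [sum(r[i] * y for r, y in zip(rows, frames)) for i in range(size)]
    return solve(ztz, zty)


def log_likelihood(frames, coeffs, variance):
    """Gaussian log-likelihood of a segment around a polynomial mean trajectory."""
    order = len(coeffs) - 1
    total = 0.0
    for t, y in zip(normalized_times(len(frames)), frames):
        mean = sum(c * p for c, p in zip(coeffs, design_row(t, order)))
        total -= 0.5 * (math.log(2 * math.pi * variance)
                        + (y - mean) ** 2 / variance)
    return total
```

In a classifier built on this sketch, each candidate phoneme would hold its own trajectory coefficients, and a segment would be assigned to the phoneme whose model gives the highest `log_likelihood`. Because the times are normalized, segments of different durations are scored against the same trajectory shape.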


Author information

Authors and Affiliations

Máté Szarvas: TSP Laboratory, Department of Telecommunications and Telematics, Budapest University of Technology and Economics, 1117, Budapest, Pázmány P. sétány 1/D, Hungary
Shoichi Matsunaga: NTT Cyberspace Laboratories, 1-1 Hikari-no-oka, Yokosuka-shi, Kanagawa, 239-0847, Japan

About this article

Cite this article

Szarvas, M., Matsunaga, S. Improving Phoneme Classification Performance Using Observation Context–Dependent Segment Models. International Journal of Speech Technology 3, 253–262 (2000). https://doi.org/10.1023/A:1026502830036
