Regular Article
Detection of phonological features in continuous speech using neural networks

https://doi.org/10.1006/csla.2000.0148

Abstract

We report work on the first component of a two-stage speech recognition architecture based on phonological features rather than phones. This paper reports experiments on three phonological feature systems: (1) the Sound Pattern of English (SPE) system, which uses binary features; (2) a multi-valued (MV) feature system, which uses traditional phonetic categories such as manner, place, etc.; and (3) Government Phonology (GP), which uses a set of structured primes. All experiments used recurrent neural networks to perform feature detection. In these networks the input layer is a standard framewise cepstral representation, and the output layer represents the values of the features. The system effectively produces a representation of the most likely phonological features for each input frame. All experiments were carried out on the TIMIT speaker-independent database. The networks performed well in all cases, with the average accuracy for a single feature ranging from 86% to 93%. We describe these experiments in detail and discuss the justification for, and potential advantages of, using phonological features rather than phones as the basis of speech recognition.
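As an illustration of the framewise setup described in the abstract, the following is a minimal sketch of a simple recurrent network that maps cepstral frames to per-frame binary feature probabilities and scores per-feature accuracy. It is not the authors' network: the Elman-style recurrence, the layer sizes (13 cepstral coefficients, 32 hidden units, 4 binary features), the random weights, and the function names `rnn_feature_detector` and `framewise_accuracy` are all assumptions made for the example, and the input is random data rather than TIMIT.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rnn_feature_detector(frames, W_in, W_rec, W_out, b_h, b_out):
    """Forward pass of a simple (Elman-style) recurrent network.

    frames: (T, n_cep) array of cepstral vectors, one row per frame.
    Returns a (T, n_feat) array of per-frame feature probabilities.
    """
    T = frames.shape[0]
    h = np.zeros(W_rec.shape[0])           # hidden state carried across frames
    probs = np.empty((T, W_out.shape[0]))
    for t in range(T):
        h = np.tanh(W_in @ frames[t] + W_rec @ h + b_h)
        probs[t] = sigmoid(W_out @ h + b_out)  # one sigmoid per binary feature
    return probs

def framewise_accuracy(probs, targets, threshold=0.5):
    """Fraction of frames on which each feature is decided correctly."""
    decisions = (probs >= threshold).astype(int)
    return (decisions == targets).mean(axis=0)

# Hypothetical sizes and random (untrained) weights, for illustration only.
rng = np.random.default_rng(0)
n_cep, n_hidden, n_feat, T = 13, 32, 4, 50
W_in = rng.normal(scale=0.1, size=(n_hidden, n_cep))
W_rec = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
W_out = rng.normal(scale=0.1, size=(n_feat, n_hidden))
b_h = np.zeros(n_hidden)
b_out = np.zeros(n_feat)

frames = rng.normal(size=(T, n_cep))             # stand-in for cepstral frames
targets = rng.integers(0, 2, size=(T, n_feat))   # stand-in for feature labels

probs = rnn_feature_detector(frames, W_in, W_rec, W_out, b_h, b_out)
accuracy = framewise_accuracy(probs, targets)
print(accuracy.shape)
```

Independent sigmoid outputs suit a binary system such as SPE; a multi-valued system would more naturally use one softmax group per feature, which this sketch does not cover.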
