Continuous Speech Recognition Based on ICA and Geometrical Learning

Feng, Hao; Cao, Wenming; Wang, Shoujue

doi:10.1007/11739685_102

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3930)

Abstract

We investigate the use of independent component analysis (ICA) for speech feature extraction in digit speech recognition systems. We observe that ICA-based features may be useful for recognition tasks based on Geometrical Learning when little training data is available. In contrast to image processing, phase information is not essential for digit speech recognition. We therefore propose a scheme that removes phase sensitivity by using an analytical description of the ICA-adapted basis functions. Furthermore, since the basis functions are not shift invariant, we extend the method with a frequency-based ICA stage that removes redundant time-shift information. The digit recognition results show promising accuracy, and experiments show that the method based on ICA and Geometrical Learning outperforms HMM across different numbers of training samples.
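
The idea of removing phase sensitivity can be illustrated with a small sketch. It is not the authors' implementation, since the full text is not available here. It assumes that an "analytical description" of a basis function can be approximated by its analytic signal (via an FFT-based Hilbert transform), and it uses a windowed cosine as a hypothetical stand-in for a learned ICA basis function. The names `analytic` and `phase_insensitive_features` are illustrative only.

```python
import numpy as np

def analytic(w):
    """Analytic signal of a real 1-D array, using an FFT-based Hilbert transform."""
    n = len(w)
    spectrum = np.fft.fft(w)
    gain = np.zeros(n)
    gain[0] = 1.0                      # keep the DC component
    if n % 2 == 0:
        gain[n // 2] = 1.0             # keep the Nyquist component
        gain[1:n // 2] = 2.0           # double the positive frequencies
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)  # negative frequencies are zeroed

def phase_insensitive_features(basis, frame):
    """Magnitude of the projection of a frame onto the analytic basis functions."""
    analytic_basis = np.array([analytic(w) for w in basis])
    return np.abs(analytic_basis.conj() @ frame)

# Hypothetical stand-in for one ICA-adapted basis function (assumption, not learned).
n = 256
t = np.arange(n)
window = 0.5 - 0.5 * np.cos(2 * np.pi * t / n)
basis = np.array([window * np.cos(2 * np.pi * 16 * t / n)])

# The same tone at eight different phases.
phases = np.linspace(0.0, 2 * np.pi, 8, endpoint=False)
frames = [np.cos(2 * np.pi * 16 * t / n + p) for p in phases]

raw = np.array([(basis @ f)[0] for f in frames])    # real projection, phase dependent
feats = np.array([phase_insensitive_features(basis, f)[0] for f in frames])

print(np.round(raw, 2))
print(np.round(feats, 2))
```

In this toy setting the real-valued projection swings with the phase of the input tone, while the magnitude of the analytic projection stays essentially constant, which is the property the abstract describes as removing phase sensitivity.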

Author information

Authors and Affiliations

Institute of Intelligent Information System, Information College, Zhejiang University of Technology, Hangzhou, 310032, China
Wenming Cao
Institute of Semiconductors, Chinese Academy of Sciences, Beijing, 100083, China
Wenming Cao & Shoujue Wang
Jiaxing University, 320000, China
Hao Feng

Editor information

Editors and Affiliations

Department of Computing, Hong Kong Polytechnic University, P.O. Box, Hong Kong, China
Daniel S. Yeung
School of Creative Media, City University of Hong Kong, China
Zhi-Qiang Liu
Department of Mathematics and Computer Science, Hebei University, 071002, Baoding, Hebei, P.R. China
Xi-Zhao Wang
School of Electrical and Information Engineering, University of Sydney, 2006, NSW, Australia
Hong Yan

About this paper

Cite this paper

Feng, H., Cao, W., Wang, S. (2006). Continuous Speech Recognition Based on ICA and Geometrical Learning. In: Yeung, D.S., Liu, ZQ., Wang, XZ., Yan, H. (eds) Advances in Machine Learning and Cybernetics. Lecture Notes in Computer Science, vol 3930. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11739685_102

DOI: https://doi.org/10.1007/11739685_102
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33584-9
Online ISBN: 978-3-540-33585-6
eBook Packages: Computer Science, Computer Science (R0)
