Research on English pronunciation training based on intelligent speech recognition

Cai, Jingyu; Liu, Ying

doi:10.1007/s10772-018-9523-8

Published: 15 June 2018

International Journal of Speech Technology, Volume 21, pages 633–640 (2018)

Abstract

When learning English, Chinese students tend to spend much of their time practicing reading and writing while neglecting their spoken English. This study presents an intelligent spoken English pronunciation training system based on speech recognition. The system uses Mel Frequency Cepstral Coefficients (MFCC) as the characteristic parameters of the speech signal and introduces a deep neural network algorithm to improve recognition accuracy. With tone, speech speed and intonation as the evaluation criteria, a simulation experiment comparing manual evaluation with machine evaluation was carried out. The results showed that the deep neural network achieved a high speech recognition rate and that the three evaluation criteria were reliable, which provides a reference for the development of spoken English learning systems.
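
As an illustration of the feature the abstract names, the following is a minimal sketch of how Mel Frequency Cepstral Coefficients might be computed for a single speech frame. It is not the authors' implementation: the full text is not available in this preview, so the frame length, sample rate, filter count, number of coefficients and all function names below are assumptions chosen for the example. It uses only the Python standard library and a naive DFT, so it is suitable for short frames only.

```python
import math

def hz_to_mel(f):
    # Common mel-scale formula (one of several conventions in use).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of one Hamming-windowed frame, naive DFT."""
    n = len(frame)
    win = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
           for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(win))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(win))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with centres spaced evenly on the mel scale."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    mels = [i * top / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * n_bins
        for k in range(left, centre):        # rising edge
            filt[k] = (k - left) / (centre - left)
        for k in range(centre, right):       # falling edge
            filt[k] = (right - k) / (right - centre)
        bank.append(filt)
    return bank

def mfcc(frame, sample_rate, n_filters=20, n_coeffs=13):
    """Log mel-filterbank energies followed by an unnormalised DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Floor the energy so the logarithm is defined for empty filters.
    log_e = [math.log(max(sum(w * p for w, p in zip(f, spec)), 1e-10))
             for f in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_coeffs)]

# Hypothetical input: a 256-sample frame of a 440 Hz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 440.0 * i / SAMPLE_RATE) for i in range(256)]
coeffs = mfcc(frame, SAMPLE_RATE)
```

In a recognizer of the kind the abstract describes, one such vector per frame would typically be passed to the classifier. A production system would more likely use an FFT-based library routine than the naive DFT shown here.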

Funding

Supported by Humanities and Social Sciences Research Project of Mudanjiang Normal University: Chinese Writing—Research on the New Generation of Chinese American Female Novels (No. GG2018009).

Author information

Authors and Affiliations

School of Western Languages, Mudanjiang Normal University, Mudanjiang, 157011, Heilongjiang, China
Jingyu Cai
Foreign Language Department, Mudanjiang Medical University, Mudanjiang, 157011, Heilongjiang, China
Ying Liu

Corresponding author

Correspondence to Ying Liu.

About this article

Cite this article

Cai, J., Liu, Y. Research on English pronunciation training based on intelligent speech recognition. Int J Speech Technol 21, 633–640 (2018). https://doi.org/10.1007/s10772-018-9523-8

Received: 28 December 2017
Accepted: 05 June 2018
Published: 15 June 2018
Issue Date: September 2018
DOI: https://doi.org/10.1007/s10772-018-9523-8
