Speech Analysis with Deep Learning to Determine Speech Therapy for Learning Difficulties

  • Conference paper
Intelligent and Fuzzy Techniques: Smart and Innovative Solutions (INFUS 2020)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1197)

Abstract

The vocal tract movements involved in human speech make it possible to vocalise a complex array of coordinated and meaningful acoustic utterances. At the same time, it is hypothesized that related cognitive disorders can interfere with the neurological, pre-articulatory and fine motor control that these movements require. Because speech production is so cognitively demanding, speech can be used to detect a range of different disorders. Computer screening systems are an efficient approach to the early diagnosis and screening of voice disorders. To achieve the highest possible detection rate, a hybrid machine learning approach is proposed that combines deep learning with an AdaBoost classifier. First, a set of traditional acoustic features associated with the presence of autism, such as fundamental frequency descriptors, will be extracted. Then, a deep learning framework will be used to extract additional contextual acoustic descriptors that cannot be defined with traditional feature extraction methods. Finally, the most informative features will be selected using a minimal-redundancy maximal-relevance feature selection approach, and an AdaBoost classifier will analyse the selected features and inform the operator of the patient’s condition.
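The feature selection step named in the abstract can be illustrated with a minimal sketch of greedy minimal-redundancy maximal-relevance (mRMR) selection. This is not the authors' implementation: it assumes features that have already been discretised into bins, uses the "difference" form of the mRMR criterion (relevance minus mean redundancy), and the feature names and toy values below are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def mrmr_select(features, labels, k):
    """Greedily pick up to k feature names.

    features: dict mapping a feature name to a list of discrete values.
    Each step picks the feature with the highest relevance to the labels
    minus its mean mutual information with the features already chosen.
    """
    relevance = {name: mutual_information(col, labels)
                 for name, col in features.items()}
    selected, remaining = [], list(features)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(mutual_information(features[name], features[s])
                             for s in selected) / len(selected)
            return relevance[name] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 8 recordings, binary label, three binned descriptors.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "f0_mean_bin": [0, 0, 0, 1, 1, 1, 1, 1],   # informative
    "f0_mean_copy": [0, 0, 0, 1, 1, 1, 1, 1],  # exact duplicate, fully redundant
    "jitter_bin": [0, 1, 0, 0, 1, 0, 1, 1],    # weaker but complementary
}

print(mrmr_select(features, labels, 2))
```

On this toy data the duplicate descriptor is skipped in favour of the weaker but less redundant one, which is the behaviour the criterion is designed to produce. In a full pipeline the selected columns would then be passed to a boosted classifier such as AdaBoost.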

Author information

Corresponding author

Correspondence to Nogol Memari.

Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Memari, N., Abdollahi, S., Khodabakhsh, S., Rezaei, S., Moghbel, M. (2021). Speech Analysis with Deep Learning to Determine Speech Therapy for Learning Difficulties. In: Kahraman, C., Cevik Onar, S., Oztaysi, B., Sari, I., Cebi, S., Tolga, A. (eds) Intelligent and Fuzzy Techniques: Smart and Innovative Solutions. INFUS 2020. Advances in Intelligent Systems and Computing, vol 1197. Springer, Cham. https://doi.org/10.1007/978-3-030-51156-2_136
