Speech Analysis with Deep Learning to Determine Speech Therapy for Learning Difficulties

Memari, Nogol; Abdollahi, Saranaz; Khodabakhsh, Sonia; Rezaei, Saeideh; Moghbel, Mehrdad

doi:10.1007/978-3-030-51156-2_136

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1197)

Included in the conference series: International Conference on Intelligent and Fuzzy Systems

Abstract

The vocal tract movements involved in human speech make it possible to vocalise a complex array of coordinated and meaningful acoustic utterances. At the same time, it is hypothesised that related cognitive disorders can interfere with the neurological, pre-articulatory and fine motor control that these movements require. Because speech production is cognitively complex, speech can be used to detect a range of disorders. Computer screening systems are an efficient approach to the early diagnosis and screening of voice disorders. To achieve the highest possible detection rate, a hybrid machine-learning approach is proposed that combines deep learning with an AdaBoost classifier. First, a set of traditional acoustic features associated with the presence of autism, such as fundamental frequency descriptors, will be extracted. Then, a deep learning framework will be used to extract additional contextual acoustic descriptors that cannot be defined with traditional feature extraction methods. Finally, the most informative features will be selected with a minimal-redundancy maximal-relevance (mRMR) feature selection approach, and an AdaBoost classifier will analyse the selected features and inform the operator of the patient's condition.
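
The last two stages named in the abstract, mRMR feature selection followed by AdaBoost classification, can be illustrated with a minimal sketch. This is not the authors' implementation: it is a pure-Python toy under stated assumptions. The features are assumed to be already extracted and discretised, the deep learning stage is omitted, the weak learners are assumed to be one-feature threshold stumps, and every name and value below (`mutual_information`, `mrmr_select`, `train_adaboost`, `predict`, the toy data) is hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def mrmr_select(columns, labels, k):
    """Greedy mRMR: repeatedly add the feature with the highest
    relevance (MI with the label) minus mean redundancy (MI with chosen features)."""
    relevance = [mutual_information(col, labels) for col in columns]
    selected, remaining = [], list(range(len(columns)))

    def score(j):
        if not selected:
            return relevance[j]
        redundancy = sum(mutual_information(columns[j], columns[s])
                         for s in selected) / len(selected)
        return relevance[j] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def train_adaboost(rows, labels, rounds):
    """Discrete AdaBoost over one-feature threshold stumps; labels are -1 or +1."""
    n = len(rows)
    weights = [1.0 / n] * n
    ensemble = []  # entries: (alpha, feature index, threshold, polarity)
    for _ in range(rounds):
        best = None
        for f in range(len(rows[0])):
            for thr in sorted({r[f] for r in rows}):
                for pol in (1, -1):
                    preds = [pol if r[f] >= thr else -pol for r in rows]
                    err = sum(w for w, p, y in zip(weights, preds, labels) if p != y)
                    if best is None or err < best[0]:
                        best = (err, f, thr, pol, preds)
        err, f, thr, pol, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, f, thr, pol))
        # Increase the weight of misclassified samples, then renormalise.
        weights = [w * math.exp(-alpha * y * p)
                   for w, y, p in zip(weights, labels, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, row):
    """Sign of the alpha-weighted vote of all stumps."""
    vote = sum(a * (pol if row[f] >= thr else -pol) for a, f, thr, pol in ensemble)
    return 1 if vote >= 0 else -1


# Toy data (entirely made up): one list per discretised feature, one label per speaker.
labels = [1, 1, 1, 1, -1, -1, -1, -1]
columns = [
    [1, 1, 1, 0, 0, 0, 0, 0],  # informative feature
    [1, 1, 1, 0, 0, 0, 0, 0],  # exact copy of the first (fully redundant)
    [0, 0, 0, 1, 0, 0, 0, 0],  # weaker but complementary feature
    [0, 1, 0, 1, 0, 1, 0, 1],  # unrelated to the label
]

selected = mrmr_select(columns, labels, k=2)
rows = [tuple(columns[j][i] for j in selected) for i in range(len(labels))]
ensemble = train_adaboost(rows, labels, rounds=3)
predictions = [predict(ensemble, row) for row in rows]
```

On this toy data the redundancy term makes the selector skip the duplicated feature in favour of the complementary one, which is the behaviour that motivates mRMR over ranking by relevance alone. In practice, continuous acoustic descriptors would need discretisation or a continuous mutual-information estimator, and performance should be measured on held-out speakers rather than on the training rows.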

Author information

Authors and Affiliations

Binary University of Management and Entrepreneurship, Puchong, Selangor, Malaysia
Nogol Memari
Vali-e-Asr University of Rafsanjan, Rafsanjan, Iran
Saranaz Abdollahi
Universiti Tunku Abdul Rahman (UTAR), Kampar, Perak, Malaysia
Sonia Khodabakhsh
UCSI University, Kuala Lumpur, Malaysia
Saeideh Rezaei
Universiti Teknologi Malaysia, Kuala Lumpur, Malaysia
Mehrdad Moghbel

Corresponding author

Correspondence to Nogol Memari.

Editor information

Editors and Affiliations

Department of Industrial Engineering, Istanbul Technical University, Istanbul, Turkey
Cengiz Kahraman
Department of Industrial Engineering, Istanbul Technical University, Istanbul, Turkey
Sezi Cevik Onar
Department of Industrial Engineering, Istanbul Technical University, İstanbul, Turkey
Basar Oztaysi
Department of Industrial Engineering, Istanbul Technical University, Istanbul, Turkey
Irem Ucal Sari
Industrial Engineering Department, Yildiz Technical University, Istanbul, Turkey
Selcuk Cebi
Industrial Engineering Department, Galatasaray University, Istanbul, Turkey
A. Cagri Tolga

About this paper

Cite this paper

Memari, N., Abdollahi, S., Khodabakhsh, S., Rezaei, S., Moghbel, M. (2021). Speech Analysis with Deep Learning to Determine Speech Therapy for Learning Difficulties. In: Kahraman, C., Cevik Onar, S., Oztaysi, B., Sari, I., Cebi, S., Tolga, A. (eds) Intelligent and Fuzzy Techniques: Smart and Innovative Solutions. INFUS 2020. Advances in Intelligent Systems and Computing, vol 1197. Springer, Cham. https://doi.org/10.1007/978-3-030-51156-2_136

DOI: https://doi.org/10.1007/978-3-030-51156-2_136
Published: 11 July 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-51155-5
Online ISBN: 978-3-030-51156-2
eBook Packages: Intelligent Technologies and Robotics (R0)