Abstract
The vocal tract movements involved in human speech make it possible to vocalise a complex array of coordinated and meaningful acoustic utterances. At the same time, it is hypothesised that related cognitive disorders can interfere with the neurological, pre-articulatory and fine motor control that these movements require. Because speech production is cognitively complex, it can be leveraged to detect a range of different disorders. Computer screening systems can be considered an efficient approach for the early diagnosis and screening of voice disorders. To achieve the highest possible detection rate, a hybrid machine learning approach is proposed that combines deep learning with an AdaBoost classifier. First, a set of acoustic features traditionally associated with the presence of autism, such as fundamental frequency descriptors, will be extracted. Then, a deep learning framework will be used to extract additional contextual acoustic descriptors that cannot be defined using traditional feature extraction methods. Finally, the most informative features will be selected using a minimal-redundancy maximal-relevance feature selection approach, and an AdaBoost classifier will analyse the selected features and inform the operator of the patient’s condition.
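The feature selection step named in the abstract can be illustrated with a minimal sketch. The abstract does not say which variant of minimal-redundancy maximal-relevance selection is used, so the code below assumes the common greedy "difference" form (relevance minus mean redundancy) with mutual information estimated on equal-width bins. The function names, the bin count, the feature names and the eight-sample toy data are all hypothetical and chosen only for illustration; the deep feature extractor and the AdaBoost classifier are not shown.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def discretize(values, bins=2):
    """Equal-width binning of a continuous feature column into integer bin ids."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mrmr_select(columns, labels, k, bins=2):
    """Greedy mRMR (difference form): pick the feature that maximises
    relevance to the labels minus mean redundancy with features already chosen."""
    binned = {name: discretize(col, bins) for name, col in columns.items()}
    relevance = {name: mutual_information(b, labels) for name, b in binned.items()}
    selected = []
    remaining = sorted(binned)  # sorted so that ties break deterministically

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(binned[name], binned[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Synthetic toy data (invented for illustration, not taken from the chapter).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "f0_mean":   [110, 115, 120, 125, 210, 220, 230, 130],
    "f0_median": [112, 117, 119, 127, 208, 222, 228, 131],  # near-duplicate of f0_mean
    "f0_range":  [0.9, 0.8, 0.2, 0.1, 0.9, 0.8, 0.9, 0.7],
    "noise":     [0.1, 0.9, 0.2, 0.8, 0.1, 0.9, 0.2, 0.8],
}
selected = mrmr_select(features, labels, k=2)
print(selected)
```

In this toy setting the near-duplicate feature is as relevant as the first pick but is penalised for redundancy, which is the behaviour the criterion is designed to produce. The selected columns would then be passed to the classifier stage.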
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Memari, N., Abdollahi, S., Khodabakhsh, S., Rezaei, S., Moghbel, M. (2021). Speech Analysis with Deep Learning to Determine Speech Therapy for Learning Difficulties. In: Kahraman, C., Cevik Onar, S., Oztaysi, B., Sari, I., Cebi, S., Tolga, A. (eds) Intelligent and Fuzzy Techniques: Smart and Innovative Solutions. INFUS 2020. Advances in Intelligent Systems and Computing, vol 1197. Springer, Cham. https://doi.org/10.1007/978-3-030-51156-2_136
DOI: https://doi.org/10.1007/978-3-030-51156-2_136
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-51155-5
Online ISBN: 978-3-030-51156-2
eBook Packages: Intelligent Technologies and Robotics (R0)