Abstract
Speech and communication are foundations of our society, and the quality of a person’s speech can seriously affect their life. Moreover, irregularities in voice production can be caused by various diseases, which can be treated more effectively if they are diagnosed at an early stage. In this research we introduce a series of measurements based on continuous speech that could serve as a basis for developing a system that automatically distinguishes several voice disorders and rates voice quality according to the widely used RBH scale (Rauhigkeit (roughness), Behauchtkeit (breathiness), Heiserkeit (hoarseness); 0 = normal voice quality, 3 = heavy huskiness). The feature extraction and classification experiments presented here show that healthy and pathological speech can be separated automatically, and demonstrate possibilities for identifying the type of pathological voice. The results suggest that automatic classification into two or more classes can be improved by a multi-step pre-processing methodology in which only the most significant acoustic features of a given class are extracted from the voice and used to train the SVM (Support Vector Machine) classifiers. We also discuss the importance of data acquisition, and how the selected features and the number of training samples affect the accuracy and performance of automatic voice disorder detection.
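The idea of keeping only the most discriminative features before training an SVM can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic Gaussian values rather than real acoustic measurements, the Fisher score stands in for whatever feature selection the paper actually uses, the linear SVM is trained with plain stochastic subgradient descent rather than a library such as LibSVM, and all function names are hypothetical.

```python
import random


def fisher_scores(X, y):
    """Per-feature Fisher score: squared mean gap over summed class variances."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == -1]
        mp, mn = sum(pos) / len(pos), sum(neg) / len(neg)
        vp = sum((v - mp) ** 2 for v in pos) / len(pos)
        vn = sum((v - mn) ** 2 for v in neg) / len(neg)
        scores.append((mp - mn) ** 2 / (vp + vn + 1e-12))
    return scores


def select_top_k(X, y, k):
    """Indices of the k highest-scoring features, in ascending index order."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=50, seed=0):
    """Linear SVM: stochastic subgradient descent on the L2-regularized hinge loss."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [wj * (1.0 - lr * lam) for wj in w]  # L2 shrinkage step
            if margin < 1.0:  # sample violates the margin: hinge subgradient step
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def predict(w, b, row):
    """Label +1 (assumed: pathological) or -1 (assumed: healthy)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, row)) + b >= 0.0 else -1


def make_synthetic(n_per_class, n_features, informative, shift, seed):
    """Synthetic stand-in for acoustic features; only `informative` columns differ by class."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (-1, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            if label == 1:
                for j in informative:
                    row[j] += shift
            X.append(row)
            y.append(label)
    return X, y


def run_demo():
    """Select features on the training set only, then train and evaluate the SVM."""
    X_train, y_train = make_synthetic(100, 6, informative=(0, 1), shift=3.0, seed=1)
    X_test, y_test = make_synthetic(50, 6, informative=(0, 1), shift=3.0, seed=2)
    selected = select_top_k(X_train, y_train, k=2)

    def project(X):
        return [[row[j] for j in selected] for row in X]

    w, b = train_linear_svm(project(X_train), y_train)
    preds = [predict(w, b, row) for row in project(X_test)]
    accuracy = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
    return selected, accuracy


selected, accuracy = run_demo()
print("selected feature indices:", selected)
print("held-out accuracy:", round(accuracy, 3))
```

Feature selection is fitted on the training split only, so the held-out accuracy is not inflated by information from the test samples. This matters when the number of training samples is small, as the abstract notes.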
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Kazinczi, F., Mészáros, K., Vicsi, K. (2015). Automatic Detection of Voice Disorders. In: Dediu, AH., Martín-Vide, C., Vicsi, K. (eds) Statistical Language and Speech Processing. SLSP 2015. Lecture Notes in Computer Science, vol 9449. Springer, Cham. https://doi.org/10.1007/978-3-319-25789-1_14
DOI: https://doi.org/10.1007/978-3-319-25789-1_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25788-4
Online ISBN: 978-3-319-25789-1
eBook Packages: Computer Science, Computer Science (R0)