Abstract
Assamese is spoken by the people of Assam, a state in the north-east corner of India, and belongs to the Indo-European language family. Its pronunciation, grammar, and vocabulary vary across the state, giving rise to distinct regional dialects. There are four major regional dialects: Central Assamese, spoken in and around Nagaon district; Eastern Assamese, spoken in Sibsagar and its neighboring districts; Kamrupi, spoken in Kamrup, Nalbari, Barpeta, Kokrajhar, and parts of Bongaigaon district; and Goalpari, spoken in Goalpara, Dhubri, and part of Bongaigaon district. A universal Assamese speech recognition system that seamlessly recognizes words spoken in the language and its dialects therefore requires dialect identification as a prerequisite. This research proposes a novel technique for recognizing Assamese dialects using the Gaussian Mixture Model (GMM) and the Gaussian Mixture Model with Universal Background Model (GMM-UBM). Mel-Frequency Cepstral Coefficients (MFCCs) are used to extract spectral information from the collected voice samples, and the dialects are modeled with the GMM and GMM-UBM techniques.
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Das, H.C., Bhattacharjee, U. (2022). Identification of Four Major Dialects of Assamese Language Using GMM with UBM. In: Gupta, D., Goswami, R.S., Banerjee, S., Tanveer, M., Pachori, R.B. (eds) Pattern Recognition and Data Analysis with Applications. Lecture Notes in Electrical Engineering, vol 888. Springer, Singapore. https://doi.org/10.1007/978-981-19-1520-8_24
DOI: https://doi.org/10.1007/978-981-19-1520-8_24
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-1519-2
Online ISBN: 978-981-19-1520-8
eBook Packages: Computer Science, Computer Science (R0)