Identification of Four Major Dialects of Assamese Language Using GMM with UBM

Das, Hem Chandra; Bhattacharjee, Utpal

doi:10.1007/978-981-19-1520-8_24

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 888)

Abstract

The Assamese language is spoken by the people of Assam, a state in the north-east corner of India. Assamese belongs to the Indo-European language family. Its pronunciation, grammar, and vocabulary vary across different parts of the state, resulting in distinct regional dialects. There are four major regional dialects of Assamese: Central Assamese, spoken in and around Nagaon district; Eastern Assamese, spoken in Sibsagar and its neighboring districts; Kamrupi, spoken in Kamrup, Nalbari, Barpeta, Kokrajhar, and some parts of Bongaigaon district; and Goalpari, spoken in Goalpara, Dhuburi, and part of Bongaigaon district. Therefore, to develop a universal Assamese speech recognition system that seamlessly recognizes words spoken in Assamese and its dialects, dialect identification is a necessary condition. This research proposes a novel technique for recognizing Assamese dialects using the Gaussian Mixture Model (GMM) and the Gaussian Mixture Model with Universal Background Model (GMM-UBM). Mel-Frequency Cepstral Coefficients (MFCC) are used to extract spectral information from the collected voice samples, and the dialects are then modeled with the GMM and GMM-UBM techniques.
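The GMM-UBM approach named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no configuration, so the number of mixture components, the relevance factor `r`, the use of diagonal covariances, and mean-only MAP adaptation are all assumptions, and every function name is hypothetical. Synthetic 13-dimensional vectors stand in for MFCC frames, which in practice would be extracted from recorded speech.

```python
import numpy as np

def log_gauss(X, w, mu, var):
    """Per-frame, per-component log(w_k * N(x | mu_k, diag(var_k))); shape (T, K)."""
    d = X.shape[1]
    diff = X[:, None, :] - mu[None, :, :]
    quad = np.sum(diff ** 2 / var[None], axis=2)
    log_det = np.sum(np.log(var), axis=1)
    return np.log(w) - 0.5 * (d * np.log(2 * np.pi) + log_det + quad)

def logsumexp(a, axis):
    m = a.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))).squeeze(axis)

def posteriors(X, gmm):
    """Component responsibilities for each frame; shape (T, K)."""
    lp = log_gauss(X, *gmm)
    return np.exp(lp - logsumexp(lp, 1)[:, None])

def train_ubm(X, k, iters=30, seed=0):
    """Fit a diagonal-covariance GMM with EM on frames pooled over all dialects."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), k, replace=False)].copy()
    var = np.tile(X.var(axis=0), (k, 1))
    w = np.full(k, 1.0 / k)
    for _ in range(iters):
        post = posteriors(X, (w, mu, var))
        n = post.sum(0) + 1e-10
        w = n / n.sum()
        mu = post.T @ X / n[:, None]
        var = np.maximum(post.T @ X ** 2 / n[:, None] - mu ** 2, 1e-6)
    return w, mu, var

def map_adapt_means(X, ubm, r=16.0):
    """Shift UBM means toward one dialect's data; weights and variances are kept."""
    w, mu, var = ubm
    post = posteriors(X, ubm)
    n = post.sum(0)
    ex = post.T @ X / np.maximum(n, 1e-10)[:, None]
    alpha = (n / (n + r))[:, None]  # components with more data move further
    return w, alpha * ex + (1 - alpha) * mu, var

def avg_loglik(X, gmm):
    return logsumexp(log_gauss(X, *gmm), 1).mean()

def identify(X, models):
    """Return the dialect whose adapted model scores the frames highest."""
    return max(models, key=lambda name: avg_loglik(X, models[name]))

def demo(seed=0):
    # Synthetic stand-in for MFCC frames: shared sound classes plus a
    # dialect-specific offset. Real features would come from speech audio.
    rng = np.random.default_rng(seed)
    dialects = ["Central", "Eastern", "Kamrupi", "Goalpari"]
    dim, n_classes = 13, 4
    centers = rng.normal(scale=4.0, size=(n_classes, dim))
    offsets = {d: rng.normal(scale=1.0, size=dim) for d in dialects}

    def frames(d, n):
        idx = rng.integers(n_classes, size=n)
        return centers[idx] + offsets[d] + rng.normal(size=(n, dim))

    train = {d: frames(d, 600) for d in dialects}
    ubm = train_ubm(np.vstack(list(train.values())), k=8)
    models = {d: map_adapt_means(train[d], ubm) for d in dialects}
    return {d: identify(frames(d, 200), models) for d in dialects}

if __name__ == "__main__":
    for true, pred in demo().items():
        print(f"true={true:9s} predicted={pred}")
```

In this sketch the UBM is trained once on data pooled from all four dialects, and each dialect model is derived from it by adaptation, so a dialect with little training data still inherits a complete set of components. A plain GMM baseline would instead call `train_ubm` separately on each dialect's frames.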

Author information

Authors and Affiliations

Bodoland University, Kokrajhar, Assam, 783370, India
Hem Chandra Das
Rajiv Gandhi University, Doimukh, Arunachal Pradesh, 791112, India
Utpal Bhattacharjee

Corresponding author

Correspondence to Hem Chandra Das.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, National Institute of Technology Arunachal Pradesh, Jote, Arunachal Pradesh, India
Deepak Gupta
Department of Computer Science and Engineering, National Institute of Technology Arunachal Pradesh, Jote, Arunachal Pradesh, India
Rajat Subhra Goswami
Department of Computer Science and Engineering, National Institute of Technology Arunachal Pradesh, Jote, Arunachal Pradesh, India
Subhasish Banerjee
Department of Mathematics, Indian Institute of Technology Indore, Indore, Madhya Pradesh, India
M. Tanveer
Department of Electrical Engineering, Indian Institute of Technology Indore, Indore, Madhya Pradesh, India
Ram Bilas Pachori

Cite this paper

Das, H.C., Bhattacharjee, U. (2022). Identification of Four Major Dialects of Assamese Language Using GMM with UBM. In: Gupta, D., Goswami, R.S., Banerjee, S., Tanveer, M., Pachori, R.B. (eds) Pattern Recognition and Data Analysis with Applications. Lecture Notes in Electrical Engineering, vol 888. Springer, Singapore. https://doi.org/10.1007/978-981-19-1520-8_24

DOI: https://doi.org/10.1007/978-981-19-1520-8_24
Published: 02 September 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-1519-2
Online ISBN: 978-981-19-1520-8
eBook Packages: Computer Science, Computer Science (R0)
