Identification of Four Major Dialects of Assamese Language Using GMM with UBM

  • Conference paper
Pattern Recognition and Data Analysis with Applications

Abstract

Assamese, a member of the Indo-European language family, is spoken by the people of Assam, a state in the north-east corner of India. Its pronunciation, grammar, and vocabulary vary across the state, giving rise to distinct regional dialects. There are four major regional dialects: Central Assamese, spoken in and around Nagaon district; Eastern Assamese, spoken in Sibsagar and its neighboring districts; Kamrupi, spoken in Kamrup, Nalbari, Barpeta, Kokrajhar, and parts of Bongaigaon district; and Goalpariya, spoken in Goalpara, Dhubri, and part of Bongaigaon district. Dialect identification is therefore a prerequisite for a universal Assamese speech recognition system that seamlessly recognizes words spoken in the language and its dialects. This research proposes a novel technique for recognizing Assamese dialects using the Gaussian Mixture Model (GMM) and the Gaussian Mixture Model with Universal Background Model (GMM-UBM). Mel-Frequency Cepstral Coefficients (MFCC) are used to extract spectral information from the collected voice samples, and modeling is done with the GMM and GMM-UBM techniques.
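
The abstract does not give the model configuration, so the following is only a minimal sketch of the general GMM-UBM idea, not the authors' implementation. It assumes a diagonal-covariance UBM trained with plain EM on pooled frames, mean-only MAP adaptation with a relevance factor of 16 (a common choice in the GMM-UBM literature), and scoring by average log-likelihood ratio against the UBM. The 2-D frames produced by the hypothetical `fake_frames` helper are synthetic stand-ins for MFCC vectors, and every function name, offset, and parameter here is an illustrative assumption.

```python
import math
import random

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def posteriors(x, gmm):
    """Component responsibilities for frame x and log p(x | gmm)."""
    logs = [math.log(w) + log_gauss(x, m, v) for w, m, v in gmm]
    top = max(logs)
    log_px = top + math.log(sum(math.exp(l - top) for l in logs))
    return [math.exp(l - log_px) for l in logs], log_px

def train_ubm(frames, k, iters=15):
    """Fit a k-component diagonal GMM to the pooled frames with plain EM."""
    dim, n = len(frames[0]), len(frames)
    ordered = sorted(frames)  # naive init: quantiles along the first coefficient
    gmm = [(1.0 / k, list(ordered[(2 * c + 1) * n // (2 * k)]), [1.0] * dim)
           for c in range(k)]
    for _ in range(iters):
        resp = [posteriors(x, gmm)[0] for x in frames]
        new_gmm = []
        for c in range(k):
            n_c = sum(r[c] for r in resp) + 1e-9
            mean = [sum(r[c] * x[d] for r, x in zip(resp, frames)) / n_c
                    for d in range(dim)]
            var = [max(sum(r[c] * (x[d] - mean[d]) ** 2
                           for r, x in zip(resp, frames)) / n_c, 1e-3)
                   for d in range(dim)]
            new_gmm.append((n_c / n, mean, var))
        gmm = new_gmm
    return gmm

def map_adapt(ubm, frames, relevance=16.0):
    """Mean-only MAP adaptation of the UBM towards one dialect's frames."""
    resp = [posteriors(x, ubm)[0] for x in frames]
    adapted = []
    for c, (w, mean, var) in enumerate(ubm):
        n_c = sum(r[c] for r in resp)
        alpha = n_c / (n_c + relevance)  # more data -> trust the dialect more
        stats = [sum(r[c] * x[d] for r, x in zip(resp, frames)) / max(n_c, 1e-9)
                 for d in range(len(mean))]
        adapted.append((w, [alpha * s + (1 - alpha) * m
                            for s, m in zip(stats, mean)], var))
    return adapted

def avg_loglik(frames, gmm):
    """Average per-frame log-likelihood of an utterance under a GMM."""
    return sum(posteriors(x, gmm)[1] for x in frames) / len(frames)

def identify(frames, ubm, models):
    """Return the dialect whose adapted model best beats the UBM."""
    base = avg_loglik(frames, ubm)
    scores = {name: avg_loglik(frames, m) - base for name, m in models.items()}
    return max(scores, key=scores.get)

# Synthetic stand-in for MFCC frames (NOT the paper's data): two shared
# "sound" clusters, shifted by a made-up offset for each dialect.
OFFSETS = {"central": (0.0, 1.5), "eastern": (0.0, -1.5),
           "kamrupi": (1.5, 0.0), "goalpariya": (-1.5, 0.0)}

def fake_frames(offset, n, rng):
    return [(rng.choice((-4.0, 4.0)) + offset[0] + rng.gauss(0, 1),
             offset[1] + rng.gauss(0, 1)) for _ in range(n)]

rng = random.Random(0)
train = {name: fake_frames(off, 200, rng) for name, off in OFFSETS.items()}
ubm = train_ubm([f for frames in train.values() for f in frames], k=2)
models = {name: map_adapt(ubm, frames) for name, frames in train.items()}
predictions = {name: identify(fake_frames(off, 100, rng), ubm, models)
               for name, off in OFFSETS.items()}
print(predictions)
```

In a real pipeline the synthetic frames would be replaced by MFCC vectors extracted from recorded speech, and the UBM would typically use many more mixture components than the two used here.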

Author information

Corresponding author

Correspondence to Hem Chandra Das.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Das, H.C., Bhattacharjee, U. (2022). Identification of Four Major Dialects of Assamese Language Using GMM with UBM. In: Gupta, D., Goswami, R.S., Banerjee, S., Tanveer, M., Pachori, R.B. (eds) Pattern Recognition and Data Analysis with Applications. Lecture Notes in Electrical Engineering, vol 888. Springer, Singapore. https://doi.org/10.1007/978-981-19-1520-8_24

  • DOI: https://doi.org/10.1007/978-981-19-1520-8_24

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-1519-2

  • Online ISBN: 978-981-19-1520-8

  • eBook Packages: Computer Science, Computer Science (R0)
