Modified self-training based statistical models for image classification and speaker identification

Bodapati, Jyostna Devi

doi:10.1007/s10772-021-09861-9

Published: 08 June 2021

International Journal of Speech Technology, Volume 24, pages 1007–1015 (2021)

Jyostna Devi Bodapati^1,2 (ORCID: orcid.org/0000-0002-5185-882X)

Abstract

Building a high-precision statistical model requires large amounts of supervised (labeled) training data. In certain domains, particularly applications involving image, speech and video data, large amounts of labeled data are difficult to acquire, while unlabeled data is plentiful. Self-training is a semi-supervised approach that uses this unlabeled data, together with a small amount of labeled data, to boost the efficiency of the model. In this work, we propose a variant of self-training that assigns soft labels to unlabeled examples rather than the hard labels used in conventional self-training. As our work focuses on image and speaker recognition tasks, a Gaussian Mixture Model (GMM) based Bayesian classifier is used as a wrapper in the self-training approach. Our experimental studies on the STL10, CIFAR10 and MIT (image recognition) and NIST (speaker recognition) benchmark datasets indicate that the proposed modified self-training approach offers enhanced efficiency over conventional self-training.
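
The difference between hard and soft labeling can be illustrated with a small sketch. It is not the article's algorithm, whose details are not given in the abstract. It assumes one-dimensional features and a single Gaussian per class (a GMM with one component), and it assumes that a soft label is the vector of class posteriors, used as fractional weights when the class models are re-estimated. The function names (`self_train`, `fit_classes`, `class_posteriors`, `predict`) and the `threshold` and `rounds` parameters are hypothetical. For brevity, the hard variant re-labels the unlabeled pool every round instead of moving confident examples permanently into the labeled set.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def class_posteriors(x, params):
    """P(class | x) by Bayes' rule; params holds (mean, var, prior) per class."""
    joint = [prior * gaussian_pdf(x, mean, var) for mean, var, prior in params]
    total = sum(joint)
    if total == 0.0:  # x is far from every class: fall back to uniform
        return [1.0 / len(params)] * len(params)
    return [j / total for j in joint]

def fit_classes(labeled, unlabeled, weights, n_classes):
    """Weighted maximum-likelihood fit of one Gaussian per class.

    Labeled points count with weight 1 for their own class; each unlabeled
    point contributes to class c with weight weights[i][c].
    Assumes every class has at least one labeled example.
    """
    stats = []
    for c in range(n_classes):
        xs = [x for x, y in labeled if y == c] + list(unlabeled)
        ws = [1.0 for _, y in labeled if y == c] + [w[c] for w in weights]
        mass = sum(ws)
        mean = sum(w * x for x, w in zip(xs, ws)) / mass
        var = sum(w * (x - mean) ** 2 for x, w in zip(xs, ws)) / mass
        stats.append((mean, max(var, 1e-6), mass))  # floor avoids zero variance
    total_mass = sum(m for _, _, m in stats)
    return [(mean, var, m / total_mass) for mean, var, m in stats]

def self_train(labeled, unlabeled, n_classes, soft=True, threshold=0.9, rounds=10):
    """Self-training loop with soft (posterior) or hard (thresholded) labels."""
    # Start from the labeled data only: all unlabeled weights are zero.
    weights = [[0.0] * n_classes for _ in unlabeled]
    params = fit_classes(labeled, unlabeled, weights, n_classes)
    for _ in range(rounds):
        weights = []
        for x in unlabeled:
            post = class_posteriors(x, params)
            if soft:
                # Soft label: the point supports every class in proportion
                # to its posterior probability.
                weights.append(post)
            else:
                # Hard label: commit to the top class only when confident,
                # otherwise ignore the point in this round.
                best = max(range(n_classes), key=lambda c: post[c])
                keep = post[best] >= threshold
                weights.append([1.0 if keep and c == best else 0.0
                                for c in range(n_classes)])
        params = fit_classes(labeled, unlabeled, weights, n_classes)
    return params

def predict(x, params):
    """Index of the class with the highest posterior."""
    post = class_posteriors(x, params)
    return max(range(len(post)), key=lambda c: post[c])

if __name__ == "__main__":
    labeled = [(0.0, 0), (1.0, 0), (9.0, 1), (10.0, 1)]  # toy (feature, class) pairs
    unlabeled = [-0.5, 0.5, 1.5, 8.5, 9.5, 10.5]
    for mode in (True, False):
        params = self_train(labeled, unlabeled, n_classes=2, soft=mode)
        print("soft" if mode else "hard", [round(m, 3) for m, _, _ in params])
```

In this sketch the two variants differ only in how `weights` is built, which keeps the comparison easy to see: with soft labels an ambiguous point is shared between classes, whereas with hard labels it is either assigned entirely to one class or left out.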

Author information

Authors and Affiliations

Indian Institute of Technology Madras, Chennai, India
Jyostna Devi Bodapati
Vignan’s Foundation for Science Technology and Research, Guntur, 522213, India
Jyostna Devi Bodapati

Corresponding author

Correspondence to Jyostna Devi Bodapati.

About this article

Cite this article

Bodapati, J.D. Modified self-training based statistical models for image classification and speaker identification. Int J Speech Technol 24, 1007–1015 (2021). https://doi.org/10.1007/s10772-021-09861-9

Received: 18 March 2020
Accepted: 06 January 2021
Published: 08 June 2021
Issue Date: December 2021
DOI: https://doi.org/10.1007/s10772-021-09861-9
