Abstract
Building a high-precision statistical model requires large amounts of supervised (labeled) training data. In certain domains, especially applications involving image, speech, and video data, it is difficult to acquire labeled data in large quantities, while unlabeled data is abundant. Self-training is a semi-supervised approach that uses this unlabeled data, together with a minimal amount of labeled data, to boost the efficiency of the model. In this work, we propose a variant of the self-training approach that uses soft labeling of unlabeled examples rather than the hard labeling used in conventional self-training. As our work focuses on image and speaker recognition tasks, a Gaussian Mixture Model (GMM) based Bayesian classifier is used as a wrapper in the self-training approach. Our experimental studies on the STL10, CIFAR10, and MIT (image recognition task) and NIST (speaker recognition task) benchmark datasets indicate that the proposed modified self-training approach offers enhanced efficiency over conventional self-training.
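The contrast between hard and soft labeling can be illustrated with a minimal sketch. It is not the implementation evaluated in this work: for brevity it assumes one diagonal-covariance Gaussian per class (a one-component mixture), a fixed number of iterations, and at least one labeled example per class. All function names (`fit_weighted_gaussians`, `posteriors`, `self_train`, `predict`) and the toy data are hypothetical.

```python
import math

def one_hot(y, n_classes):
    """Hard label as a 0/1 weight vector."""
    return [1.0 if k == y else 0.0 for k in range(n_classes)]

def fit_weighted_gaussians(X, R, var_floor=1e-3):
    """Fit one diagonal Gaussian per class (assumption: 1-component GMM).
    R[i][k] is the weight with which sample i counts toward class k."""
    n_classes, dim = len(R[0]), len(X[0])
    total = sum(sum(r) for r in R)
    params = []
    for k in range(n_classes):
        w = [r[k] for r in R]
        wk = sum(w)  # assumed > 0: every class has labeled examples
        mean = [sum(wi * x[d] for wi, x in zip(w, X)) / wk for d in range(dim)]
        var = [max(sum(wi * (x[d] - mean[d]) ** 2 for wi, x in zip(w, X)) / wk,
                   var_floor) for d in range(dim)]
        params.append((wk / total, mean, var))  # (prior, mean, variance)
    return params

def posteriors(x, params):
    """Bayes rule: P(k | x) is proportional to prior_k * N(x; mean_k, var_k)."""
    logs = []
    for prior, mean, var in params:
        ll = math.log(prior)
        for xd, m, v in zip(x, mean, var):
            ll -= 0.5 * (math.log(2 * math.pi * v) + (xd - m) ** 2 / v)
        logs.append(ll)
    top = max(logs)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logs]
    z = sum(exps)
    return [e / z for e in exps]

def self_train(X_lab, y_lab, X_unl, n_classes, soft=True, n_iter=10):
    """Self-training loop. soft=True reuses the class posteriors as
    fractional labels; soft=False keeps only the arg-max class."""
    R_lab = [one_hot(y, n_classes) for y in y_lab]
    params = fit_weighted_gaussians(X_lab, R_lab)
    for _ in range(n_iter):
        P = [posteriors(x, params) for x in X_unl]
        if not soft:
            P = [one_hot(p.index(max(p)), n_classes) for p in P]
        params = fit_weighted_gaussians(X_lab + X_unl, R_lab + P)
    return params

def predict(x, params):
    p = posteriors(x, params)
    return p.index(max(p))

# Toy data (hypothetical): two well-separated 2-D clusters.
X_lab = [[-1, 0], [1, 0], [0, -1], [0, 1], [9, 10], [11, 10], [10, 9], [10, 11]]
y_lab = [0, 0, 0, 0, 1, 1, 1, 1]
X_unl = [[-1, -1], [1, 1], [0, 0], [9, 9], [11, 11], [10, 10]]
soft_model = self_train(X_lab, y_lab, X_unl, n_classes=2, soft=True)
```

With soft labels, every unlabeled example contributes to each class in proportion to its posterior, so ambiguous examples near a class boundary influence the re-estimated parameters less than under hard labeling, where each example is committed fully to a single class.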
Cite this article
Bodapati, J.D. Modified self-training based statistical models for image classification and speaker identification. Int J Speech Technol 24, 1007–1015 (2021). https://doi.org/10.1007/s10772-021-09861-9
DOI: https://doi.org/10.1007/s10772-021-09861-9