ABSTRACT
Facing ever-increasing volumes of data but limited human annotation capacity, active learning strategies for selecting the most informative labels gain in importance. However, choosing an appropriate active learning strategy is itself a complex task that requires considering different criteria, such as the informativeness of the selected labels, the versatility with respect to classification algorithms, and the processing speed. This raises the question of which combinations of active learning strategies and classification algorithms are the most promising to apply. A general answer to this question, one that does not require application-specific, label-intensive experiments on each dataset, is highly desirable, since active learning is applied precisely in situations where labelled data are scarce. This paper therefore studies several combinations of active learning strategies and classification algorithms and evaluates them in a series of comparative experiments.
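To make the notion of "selecting the most informative labels" concrete, the following is a minimal sketch of uncertainty sampling, one of the classic pool-based strategies of the kind compared in such studies. It is an illustration, not the paper's method: the `pool` of posterior estimates and the entropy criterion are assumptions for the example.

```python
import math

def uncertainty_sample(probabilities):
    """Return the index of the pool instance whose predicted class
    distribution has the highest Shannon entropy, i.e. the instance
    the current classifier is least certain about."""
    def entropy(dist):
        return -sum(p * math.log(p) for p in dist if p > 0)
    return max(range(len(probabilities)),
               key=lambda i: entropy(probabilities[i]))

# Hypothetical posterior estimates from some classifier for three
# unlabeled instances; the second is the most uncertain (0.5/0.5).
pool = [[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]]
print(uncertainty_sample(pool))  # → 1
```

The selected instance would then be sent to a human annotator, its label added to the training set, and the classifier retrained; the informativeness of such selections is one of the criteria the comparative experiments evaluate.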