Abstract
This paper describes a method that improves on standard progressive sampling. The standard method uses samples of data of increasing size until the accuracy of the learned concept cannot be improved further. The issue addressed here is how to avoid using some of the samples in this progression. The paper presents a method for predicting the stopping point using a meta-learning approach. The method requires just four iterations of progressive sampling. The information gathered is used to identify the nearest learning curves for which the sampling procedure was carried out in full. These curves are in turn used to generate a prediction of the stopping point. Experimental evaluation shows that the method can lead to significant savings of time without significant losses of accuracy.
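The idea of predicting the stopping point from the nearest stored learning curves can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Euclidean distance, the number of neighbours `k`, the tolerance `tol`, the use of the lower median to combine neighbours, and the function names `stopping_index` and `predict_stop` are all assumptions made for illustration, since the abstract does not specify them.

```python
from statistics import median_low


def stopping_index(curve, tol=0.01):
    """Return the first index after which accuracy never improves by more than tol.

    `curve` is a list of accuracies, one per sample size in the progression.
    The tolerance-based criterion is an assumption, not taken from the paper.
    """
    for i in range(len(curve) - 1):
        if max(curve[i + 1:]) - curve[i] <= tol:
            return i
    return len(curve) - 1


def predict_stop(partial, stored_curves, k=3, tol=0.01):
    """Predict a stopping index from the first few points of a new curve.

    `partial` holds the accuracies observed so far (four points in the paper);
    `stored_curves` are curves for which sampling was carried out in full.
    Distance measure and aggregation are hypothetical choices.
    """
    n = len(partial)

    def dist(curve):
        # Euclidean distance over the points both curves share.
        return sum((a - b) ** 2 for a, b in zip(partial, curve[:n])) ** 0.5

    nearest = sorted(stored_curves, key=dist)[:k]
    # Lower median keeps the result a valid integer index even when k is even.
    return median_low([stopping_index(c, tol) for c in nearest])


if __name__ == "__main__":
    # Made-up accuracies, for illustration only.
    stored = [
        [0.60, 0.70, 0.75, 0.78, 0.80, 0.80, 0.80],
        [0.50, 0.55, 0.58, 0.60, 0.60, 0.60, 0.60],
        [0.61, 0.71, 0.76, 0.79, 0.82, 0.85, 0.85],
    ]
    print(predict_stop([0.60, 0.70, 0.75, 0.78], stored, k=3))
```

In this sketch the four observed accuracies play the role of the four initial iterations mentioned above, and the returned index identifies the sample size at which the progression would be stopped.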
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Leite, R., Brazdil, P. (2004). Improving Progressive Sampling via Meta-learning on Learning Curves. In: Boulicaut, JF., Esposito, F., Giannotti, F., Pedreschi, D. (eds) Machine Learning: ECML 2004. ECML 2004. Lecture Notes in Computer Science, vol 3201. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30115-8_25
DOI: https://doi.org/10.1007/978-3-540-30115-8_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23105-9
Online ISBN: 978-3-540-30115-8
eBook Packages: Springer Book Archive