Abstract
In multi-instance learning, each example is described by a bag of instances instead of a single feature vector. In this paper, we revisit the idea of performing multi-instance classification based on a point-and-scaling concept by searching for the point in instance space with the highest diverse density. This is a computationally expensive process, and we describe several heuristics designed to improve runtime. Our results show that simple variants of existing algorithms can be used to find diverse density maxima more efficiently. We also show how significant increases in accuracy can be obtained by applying a boosting algorithm with a modified version of the diverse density algorithm as the weak learner.
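For readers unfamiliar with the underlying measure, the following is a minimal sketch of the noisy-OR diverse density of Maron and Lozano-Pérez that the paper's search maximises over instance space: a candidate target point scores highly when every positive bag has at least one instance near it and no negative bag does, with per-dimension scaling weights giving the "point-and-scaling" concept. This is an illustrative reconstruction under standard assumptions, not the authors' implementation; all names (`instance_prob`, `diverse_density`, etc.) are invented for this sketch.

```python
import numpy as np

def instance_prob(x, target, scales):
    """Pr(x belongs to the target concept) under a Gaussian-like model,
    with per-dimension scaling weights (the 'point-and-scaling' concept)."""
    return np.exp(-np.sum((scales * (x - target)) ** 2))

def diverse_density(target, scales, positive_bags, negative_bags):
    """Noisy-OR diverse density at a candidate point: each positive bag
    should contain at least one instance near the target, and no
    negative bag should contain any."""
    dd = 1.0
    for bag in positive_bags:  # bag: array of shape (n_instances, n_dims)
        p_none = np.prod([1 - instance_prob(x, target, scales) for x in bag])
        dd *= 1 - p_none  # at least one instance fits the concept
    for bag in negative_bags:
        dd *= np.prod([1 - instance_prob(x, target, scales) for x in bag])
    return dd

# Toy usage: the standard DD algorithm evaluates this at many starting
# points (e.g. every instance of every positive bag) and runs gradient
# ascent from each; the paper's heuristics aim to reduce that cost.
pos = [np.array([[0.9, 1.1], [3.0, 3.0]])]
neg = [np.array([[3.1, 2.9]])]
print(diverse_density(np.array([1.0, 1.0]), np.ones(2), pos, neg))
```

Because the full search repeats the gradient ascent from a large number of starting points, the overall cost grows with the total number of instances, which is what makes heuristic restarts and pruning attractive.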
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Foulds, J.R., Frank, E. (2010). Speeding Up and Boosting Diverse Density Learning. In: Pfahringer, B., Holmes, G., Hoffmann, A. (eds) Discovery Science. DS 2010. Lecture Notes in Computer Science, vol. 6332. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16184-1_8
DOI: https://doi.org/10.1007/978-3-642-16184-1_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16183-4
Online ISBN: 978-3-642-16184-1