First order random forests: Learning relational classifiers with complex aggregates

Van Assche, Anneleen; Vens, Celine; Blockeel, Hendrik; Džeroski, Sašo

doi:10.1007/s10994-006-8713-9

Published: 21 June 2006

Volume 64, pages 149–182 (2006)

Machine Learning

Anneleen Van Assche¹,
Celine Vens¹,
Hendrik Blockeel¹ &
…
Sašo Džeroski²

Abstract

In relational learning, predictions for an individual are based not only on its own properties but also on the properties of a set of related individuals. Relational classifiers differ in how they handle these sets: some use properties of the set as a whole (using aggregation), while others refer to properties of specific individuals in the set; most classifiers, however, do not combine both. This imposes an undesirable bias on these learners. This article describes a learning approach that avoids this bias, using first order random forests. Essentially, an ensemble of decision trees is constructed in which tests are first order logic queries. These queries may contain aggregate functions, the argument of which may again be a first order logic query. The introduction of aggregate functions in first order logic, as well as upgrading the forest’s uniform feature sampling procedure to the space of first order logic, generates a number of complications. We address these complications and propose a solution for them. The resulting first order random forest induction algorithm has been implemented and integrated in the ACE-ilProlog system, and experimentally evaluated on a variety of datasets. The results indicate that first order random forests with complex aggregates are an efficient and effective approach to learning relational classifiers that involve aggregates over complex selections.
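
To make the idea of an aggregate over a complex selection concrete, the following is a minimal Python sketch, not the ACE-ilProlog implementation. It assumes a hypothetical toy dataset (customers related to a set of accounts), replaces first order logic queries with plain Python predicates, draws tests from a fixed candidate list instead of refining queries, and grows one-level trees (stumps) rather than full trees. All names (`CUSTOMERS`, `make_test`, `learn_forest`, and so on) are illustrative assumptions.

```python
import random
from collections import Counter

# Hypothetical toy data: each customer (example) is related to a set of accounts.
CUSTOMERS = [
    {"label": "pos", "accounts": [("savings", 900), ("savings", 700), ("checking", 50)]},
    {"label": "pos", "accounts": [("savings", 800), ("savings", 650)]},
    {"label": "neg", "accounts": [("savings", 900), ("savings", 100), ("checking", 700)]},
    {"label": "neg", "accounts": [("checking", 950), ("checking", 800)]},
    {"label": "neg", "accounts": []},
]

# Selections restrict which related tuples are aggregated.
SELECTIONS = {
    "any": lambda kind, balance: True,
    "savings": lambda kind, balance: kind == "savings",
    "savings>600": lambda kind, balance: kind == "savings" and balance > 600,
}
# Aggregates map the selected balances to one number (0 for an empty set).
AGGREGATES = {"count": len, "max": lambda xs: max(xs, default=0)}

def make_test(agg, sel, threshold):
    """Build a boolean test: agg over the selected accounts >= threshold."""
    def test(example):
        values = [b for kind, b in example["accounts"] if SELECTIONS[sel](kind, b)]
        return AGGREGATES[agg](values) >= threshold
    test.name = f"{agg}({sel}) >= {threshold}"
    return test

# Fixed candidate space: every aggregate, selection and threshold combination.
CANDIDATES = [make_test(a, s, t)
              for a, thresholds in (("count", (1, 2)), ("max", (500, 850)))
              for s in SELECTIONS for t in thresholds]

def majority(examples, default="neg"):
    counts = Counter(e["label"] for e in examples)
    return counts.most_common(1)[0][0] if counts else default

def learn_stump(examples, rng, k):
    """One-level tree: choose the best of k uniformly sampled candidate tests."""
    def errors(test):
        yes = [e for e in examples if test(e)]
        no = [e for e in examples if not test(e)]
        return sum(e["label"] != majority(part) for part in (yes, no) for e in part)
    best = min(rng.sample(CANDIDATES, k), key=errors)
    yes = [e for e in examples if best(e)]
    no = [e for e in examples if not best(e)]
    fallback = majority(examples)
    return best, majority(yes, fallback), majority(no, fallback)

def learn_forest(examples, n_trees=25, k=4, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        bootstrap = [rng.choice(examples) for _ in examples]  # bagging
        forest.append(learn_stump(bootstrap, rng, k))
    return forest

def predict(forest, example):
    """Majority vote over the stumps' leaf labels."""
    votes = Counter(yes if test(example) else no for test, yes, no in forest)
    return votes.most_common(1)[0][0]
```

In this toy data, the test `count(savings>600) >= 2` is the only candidate that separates the two classes: the plain aggregate `count(savings) >= 2` also fires for the third (negative) customer, which shows why aggregating over a selection can matter. A forest is built with `learn_forest(CUSTOMERS)` and queried with `predict(forest, customer)`.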

Author information

Authors and Affiliations

Department of Computer Science, Katholieke Universiteit Leuven, Celestijnenlaan 200A, 3001, Leuven, Belgium
Anneleen Van Assche, Celine Vens & Hendrik Blockeel
Department of Knowledge Technologies, Jožef Stefan Institute, Jamova 39, 1000, Ljubljana, Slovenia
Sašo Džeroski

Corresponding author

Correspondence to Anneleen Van Assche.

Additional information

Editor: Rui Camacho

About this article

Cite this article

Van Assche, A., Vens, C., Blockeel, H. et al. First order random forests: Learning relational classifiers with complex aggregates. Mach Learn 64, 149–182 (2006). https://doi.org/10.1007/s10994-006-8713-9

Received: 08 April 2005
Revised: 24 February 2006
Accepted: 29 March 2006
Published: 21 June 2006
Issue Date: September 2006
DOI: https://doi.org/10.1007/s10994-006-8713-9
