Abstract
Machine learning techniques have been actively pursued in recent years, mainly because of the large number of applications that rely on some sort of intelligent mechanism for decision-making. In this work, we present an ensemble of optimum-path forest (OPF) classifiers that combines different instances, each computing a score-based confidence level for each training sample, in order to make the classification process “smarter”, i.e., more reliable. This confidence level encodes the effectiveness of each training sample, and it can be used to avoid ties during the OPF competition process. Experimental results over fifteen benchmark datasets show the effectiveness and efficiency of the proposed approach for classification problems, with more accurate results in more than 67% of the datasets considered. We also considered a bagging strategy for comparison purposes, and we show that the proposed approach can lead to considerably better results.
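The role of the confidence level in the competition process can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the usual OPF rule in which a training sample offers a test sample the cost max(training cost, distance), assumes the confidence of a training sample is the fraction of validation samples it labeled correctly, and uses hypothetical names (`confidence_scores`, `classify`, `hits`, `misses`) chosen only for illustration.

```python
import math
from typing import List, Sequence


def confidence_scores(hits: Sequence[int], misses: Sequence[int]) -> List[float]:
    """Assumed score: share of validation samples a training sample labeled correctly."""
    scores = []
    for h, m in zip(hits, misses):
        total = h + m
        # A training sample that never conquered a validation sample gets score 0.
        scores.append(h / total if total > 0 else 0.0)
    return scores


def classify(test, train, labels, costs, confidence):
    """Label `test` with the training sample offering the lowest path cost.

    When two training samples offer exactly the same cost, the one with
    the higher confidence wins the competition.
    """
    best_key = None
    best_label = None
    for i, sample in enumerate(train):
        # Assumed OPF path cost: max of the sample's own cost and its distance to `test`.
        path_cost = max(costs[i], math.dist(sample, test))
        # Lower cost first; among equal costs, higher confidence first.
        key = (path_cost, -confidence[i])
        if best_key is None or key < best_key:
            best_key, best_label = key, labels[i]
    return best_label


# Two training samples are equally far from the test sample, so the costs tie
# and the confidence decides the label.
train = [(0.0, 0.0), (2.0, 0.0)]
conf = confidence_scores(hits=[2, 9], misses=[3, 1])  # [0.4, 0.9]
print(classify((1.0, 0.0), train, ["a", "b"], [0.0, 0.0], conf))
```

In this sketch the confidence only matters when costs are exactly equal; a training sample with a strictly lower path cost still wins regardless of its score.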
Notes
Note that the percentages were chosen empirically; it is more intuitive to provide a larger validation set for computing the confidence levels.
Acknowledgements
The authors are grateful to FAPESP grants #2013/07375-0, #2014/16250-9, #2014/12236-1, and #2016/19403-6, Capes, and CNPq grants #470571/2016-6 and #306166/2014-3 for their financial support. The authors are also grateful to the reviewers for their insightful comments.
Cite this article
Fernandes, S.E.N., Papa, J.P. Improving optimum-path forest learning using bag-of-classifiers and confidence measures. Pattern Anal Applic 22, 703–716 (2019). https://doi.org/10.1007/s10044-017-0677-9
DOI: https://doi.org/10.1007/s10044-017-0677-9