
Pattern Recognition

Volume 60, December 2016, Pages 692-705

Progressive subspace ensemble learning

https://doi.org/10.1016/j.patcog.2016.06.017

Highlights

  • Progressive subspace ensemble learning.

  • It takes into account the data sample space and the feature space at the same time.

  • A progressive selection process, based on new cost functions that incorporate current and long-term information, selects the classifiers sequentially.

Abstract

Few classifier ensemble approaches investigate the data sample space and the feature space at the same time, even though such a multi-pronged approach helps to construct more powerful learning models. For example, the AdaBoost approach only investigates the data sample space, while the random subspace technique only focuses on the feature space. To address this limitation, we propose the progressive subspace ensemble learning approach (PSEL), which takes into account the data sample space and the feature space at the same time. Specifically, PSEL first adopts the random subspace technique to generate a set of subspaces. Then, a progressive selection process based on new cost functions, which incorporate current and long-term information, selects the classifiers sequentially. Finally, a weighted voting scheme is used to summarize the predicted labels and obtain the final result. We also adopt a number of non-parametric tests to compare PSEL and its competitors over multiple datasets. The experimental results show that PSEL works well on most of the real datasets and outperforms a number of state-of-the-art classifier ensemble approaches.

Introduction

Nowadays, more and more researchers pay attention to ensemble learning [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [62], [63], [64], [65], [66], [67], due to its high performance and robustness in various machine learning tasks. A number of ensemble learning algorithms have been proposed in recent years, such as hybrid adaptive ensemble learning [1], ensemble learning based on random linear oracle [2], bagging [3], random subspace [4], classifier ensembles based on neural networks [5], [6], boosting [7], fuzzy classifier ensembles [8], rotation forest [9], random forest [10], and heterogeneous ensembles of classifiers [68], [69]. Ensemble learning approaches have been successfully applied in the fields of data mining [11], [12], text categorization [13], [14], bioinformatics [15], [16] and face image classification [17], [18], along with many other tasks. In the field of data mining, Polikar et al. [11] introduced a new classifier ensemble framework named Learn++.MF to address the missing value problem; it classifies samples with missing values using multiple classifiers. Galar et al. [12] designed a new classifier ensemble algorithm named EUSBoost, which incorporates random sub-sampling with boosting in the ensemble to handle the data imbalance problem. In the field of text categorization, Liu et al. [13] designed a novel classifier ensemble framework named BAM-Vote Box, and introduced a new feature selection algorithm for the text categorization task. Saeedian et al. [14] designed a new spam detection algorithm using an ensemble learning framework based on clustering and weighted voting. In the field of bioinformatics, Liu et al. [15] introduced a new ensemble learning algorithm which adopts mutual information to search for important gene subsets, and then uses the ensemble method to classify cancer gene expression data. Plumpton et al. [16] designed a new ensemble framework to classify streaming functional magnetic resonance images and measure brain activity. 
In the field of face image classification, Connolly et al. [17] introduced a face recognition framework which combines multiple neural network classifiers and produces a progressive ensemble learning algorithm for face recognition from video data. Zhang et al. [18] introduced RDA, an extension of LDA, to analyze face image data in a subspace. Ahmadvand et al. [57] adopted the ensemble classifier for MRI brain image segmentation. In summary, both theoretical and empirical analysis show that under certain conditions, ensemble learning achieves better performance and robustness than single classifiers, due to its capability to integrate more information encapsulated in multiple learners.

While there are different kinds of ensemble learning approaches, most of them only consider the distribution of the data in the sample space, or the distributions of the attributes in the feature space. However, the two should be combined to achieve a better result in most cases. To address this limitation, we propose a progressive subspace ensemble learning approach (PSEL) which explores both the data sample space and the feature space at the same time. Specifically, PSEL adopts the random subspace technique to generate a set of subspaces. Each subspace is used to train a classifier in the original ensemble. Then, a progressive selection process is adopted to select the classifiers based on two cost functions which incorporate current and long-term information. Finally, a weighted voting scheme is used to combine the predicted results from individual classifiers of the ensemble, and generate the final result. The properties of PSEL are analyzed theoretically. We also adopt a number of non-parametric tests to compare PSEL and its competitors on multiple datasets. Our experiments show that PSEL works well on the real datasets, and outperforms most of the state-of-the-art classifier ensemble approaches.
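As a concrete illustration of the first stage, random subspace generation might be sketched as follows. This is a minimal sketch, not the authors' implementation: the decision-tree base learner, the 50% subspace ratio, and the ensemble size are illustrative assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def build_subspace_ensemble(X, y, n_classifiers=20, subspace_ratio=0.5, seed=0):
    """Train one base classifier per randomly sampled feature subspace."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    n_sub = max(1, int(subspace_ratio * n_features))
    ensemble = []
    for _ in range(n_classifiers):
        # Sample a random subset of the feature indices (the "subspace").
        idx = rng.choice(n_features, size=n_sub, replace=False)
        clf = DecisionTreeClassifier(random_state=0).fit(X[:, idx], y)
        ensemble.append((idx, clf))
    return ensemble
```

Each classifier is stored together with its feature indices, so that the same subspace projection can be applied to test samples at prediction time.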

The contribution of the paper is twofold. First, the progressive subspace ensemble learning approach is proposed, which not only explores the data sample space, but also takes the feature space into consideration. Second, two cost functions which incorporate current and long-term information are designed to perform the progressive selection process.

The remainder of the paper is organized as follows. Section 2 discusses previous works related to ensemble learning. Section 3 describes the progressive subspace ensemble learning approach and the progressive selection process. Section 4 analyzes the proposed algorithm theoretically. Section 5 experimentally evaluates the performance of our proposed approach. Section 6 presents our conclusion and future work.

Section snippets

Related work

As an important branch of ensemble learning, classifier ensemble has drawn the attention of researchers from many fields. In recent years, classifier ensemble approaches based on different viewpoints have been developed. Some of them focus on generating new ensembles [19], [20], [21], [22], [23], [24]. For example, Yu et al. [19] introduced a graph based semi-supervised ensemble algorithm which generates the ensemble by performing multiple rounds of dimensionality reduction. Ye et al. [20]

An overview of progressive subspace ensemble learning

Fig. 2 provides an overview of the progressive subspace ensemble learning approach, while Algorithm 1 summarizes the procedure. Given a training set Tr with l pairs of training samples Tr = {(x1, y1), (x2, y2), ..., (xl, yl)} (where l is the number of training samples, xi (i ∈ {1, ..., l}) is a training sample, yi ∈ {1, 2, ..., k} is its label, and k is the number of classes), and each sample xi consists of m attributes {xi1, xi2, ..., xim}, the
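The progressive selection stage can be sketched as a greedy loop over the base classifiers. The paper's two cost functions (current and long-term information) are not reproduced in this snippet, so hold-out majority-vote accuracy of the growing sub-ensemble is used below as a stand-in selection criterion:

```python
import numpy as np

def progressive_select(predictions, y_val, max_members=10):
    """Greedily pick classifiers whose addition most improves the
    majority-vote accuracy of the sub-ensemble on a validation set.

    predictions: list of 1-D integer arrays; predictions[j][i] is
    classifier j's predicted class label (0..k-1) for sample i.
    """
    selected, best_acc = [], -1.0
    remaining = list(range(len(predictions)))
    while remaining and len(selected) < max_members:
        # Try adding each remaining classifier and score the result.
        scored = []
        for j in remaining:
            votes = np.stack([predictions[t] for t in selected + [j]])
            # Majority vote per validation sample over the trial sub-ensemble.
            majority = np.array([np.bincount(col).argmax() for col in votes.T])
            scored.append((float(np.mean(majority == y_val)), j))
        acc, j = max(scored)
        if acc <= best_acc:  # stop when no remaining classifier improves accuracy
            break
        best_acc = acc
        selected.append(j)
        remaining.remove(j)
    return selected, best_acc
```

The early stop when accuracy no longer improves mirrors the idea of selecting members sequentially rather than keeping the whole original ensemble.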

Time complexity analysis

We also perform a theoretical analysis of PSEL in terms of its computational cost. The time complexity T_PSEL of PSEL in the training process is estimated as follows:

    T_PSEL = T_OE + T_PSP

where T_OE and T_PSP denote the computational costs for the original ensemble generation and the progressive selection process, respectively. T_OE is related to Algorithm 1, while T_PSP is related to Algorithm 2.

T_OE is related to the number of training samples l, the number of attributes m, and the number of classifiers B
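Under the simplifying assumption that each base learner trains in time proportional to the number of samples times the subspace dimensionality, the decomposition above can be turned into a rough operation-count model. The linear-time training assumption, subspace ratio, and number of selected members here are illustrative, not taken from the paper:

```python
def psel_training_cost(l, m, B, subspace_ratio=0.5, n_selected=10):
    """Rough operation count for PSEL training under the stated assumptions."""
    m_sub = int(subspace_ratio * m)   # features per random subspace
    t_oe = B * l * m_sub              # train B base learners: T_OE
    t_psp = n_selected * B * l        # greedy selection evaluations: T_PSP
    return t_oe + t_psp               # T_PSEL = T_OE + T_PSP
```

For instance, with l = 100 samples, m = 20 attributes, and B = 30 classifiers, both terms contribute on the same order, so neither stage dominates the training cost under these assumptions.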

Experiment

The performances of PSEL and other classifier ensemble approaches are evaluated using 18 cancer gene expression datasets, as shown in Table 1, and 4 datasets from the UCI machine learning repository [61], in Table 2 (where l denotes the number of data samples, m denotes the number of attributes, and k denotes the number of classes). The preprocessing step for the cancer gene expression profiles is the same as that in [46], [72], [73], [74], [75]. Most of them are challenging datasets that are used in

Conclusion and future work

In this paper, we investigate the problem of how to combine multiple classifiers, and how to select a suitable classifier subset in the ensemble. Our major contribution is a progressive subspace ensemble learning approach (PSEL) which takes into account both the feature space and the sample space at the same time. PSEL integrates the random subspace technique, the proposed progressive ensemble member selection process, and the weighted voting scheme to obtain a more accurate, stable and robust

Acknowledgment

The authors are grateful for the constructive advice received from the anonymous reviewers of this paper. The work described in this paper was partially funded by the grant from the National High-Technology Research and Development Program (863 Program) of China No. 2013AA01A212, the grant from the NSFC for Distinguished Young Scholars 61125205, and the grants from the NSFC Nos. 61332002, 61300044, 61472145, 61572199, 61502174, and 61502173, the grant from the Guangdong Natural Science Funds


References (75)

  • L. Zhang et al.

    Random forests with ensemble of feature spaces

    Pattern Recognit.

    (2014)
  • J. Zhang et al.

    A novel ensemble construction method for multi-view data using random cross-view correlation between within-class examples

    Pattern Recognit.

    (2011)
  • E. Hüllermeier et al.

    Combining predictions in pairwise classification: an optimal adaptive voting strategy and its relation to weighted voting

    Pattern Recognit.

    (2010)
  • H. Tian et al.

    A novel multiplex cascade classifier for pedestrian detection

    Pattern Recognit. Lett.

    (2013)
  • L. Guo et al.

    Pedestrian detection for intelligent transportation systems combining AdaBoost algorithm and support vector machine

    Expert Syst. Appl.

    (2012)
  • S. Ali et al.

    Can-Evo-Ens: classifier stacking based evolutionary ensemble system for prediction of human breast cancer using amino acid sequences

    J. Biomed. Inf.

    (2015)
  • Q. Gu et al.

    An ensemble classifier based prediction of G-protein-coupled receptor classes in low homology

    Neurocomputing

    (2015)
  • G. Giacinto et al.

    Intrusion detection in computer networks by a modular ensemble of one-class classifiers

    Inf. Fus.

    (2008)
  • S. Günter et al.

    Feature selection algorithms for the generation of multiple classifier systems and their application to handwritten word recognition

    Pattern Recognit. Lett.

    (2004)
  • Q. Hu et al.

    EROS: ensemble rough subspaces

    Pattern Recognit.

    (2007)
  • R.M. Cruz et al.

    META-DES: a dynamic ensemble selection framework using meta-learning

    Pattern Recognit.

    (2015)
  • J. Xiao et al.

    A dynamic classifier ensemble selection approach for noise data

    Inf. Sci.

    (2010)
  • L. Yang

    Classifiers selection for ensemble learning based on accuracy and diversity

    Proc. Eng.

    (2011)
  • R. Bryll

    Attribute bagging: improving accuracy of classifier ensembles by using random feature subsets

    Pattern Recognit.

    (2003)
  • Z. Yu et al.

    From cluster ensemble to structure ensemble

    Inf. Sci.

    (2012)
  • L. Nanni et al.

    Double committee adaboost

    J. King Saud Univ.-Sci.

    (2013)
  • C.-X. Zhang et al.

    RotBoost: a technique for combining rotation forest and AdaBoost

    Pattern Recognit. Lett.

    (2008)
  • Z. Yu et al.

    Hybrid adaptive classifier ensemble

    IEEE Trans. Cybern.

    (2015)
  • L.I. Kuncheva et al.

    Classifier ensembles with a random linear oracle

    IEEE Trans. Knowl. Data Eng.

    (2007)
  • L. Breiman

    Bagging predictors

    Mach. Learn.

    (1996)
  • T.K. Ho

    The random subspace method for constructing decision forests

    IEEE Trans. Pattern Anal. Mach. Intell.

    (1998)
  • Z.-H. Zhou et al.

    NeC4.5: neural ensemble based C4.5

    IEEE Trans. Knowl. Data Eng.

    (2004)
  • L.I. Kuncheva

    Fuzzy vs non-fuzzy in combining classifiers designed by boosting

    IEEE Trans. Fuzzy Syst.

    (2003)
  • J.J. Rodriguez et al.

    Rotation forest: a new classifier ensemble method

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2006)
  • L. Breiman

    Random forests

    Mach. Learn.

    (2001)
  • R. Liu, M. Jiang, Chinese text classification based on the BVB Model, in: Fourth International Conference on Semantics,...
  • M.F. Saeedian, H. Beigy, Spam detection using dynamic weighted voting based on clustering, in: Second International...

    Zhiwen Yu (S'06-M'08-SM'14) is a Professor in the School of Computer Science and Engineering, South China University of Technology, and an adjunct professor at Sun Yat-Sen University. He is a senior member of IEEE, ACM, CCF (China Computer Federation) and CAAI (Chinese Association for Artificial Intelligence). To date, Dr. Yu has published more than 90 refereed journal and international conference papers, in venues including TKDE, TEVC, TCYB, TMM, TSMC-B, TCVST, TCBB, TNB, PR, IS, Bioinformatics, and SIGKDD. Refer to the homepage for more details: 〈http://www.hgml.cn/yuzhiwen.htm〉.

    Daxing Wang is a Ph.D. candidate in the School of Computer Science and Engineering in South China University of Technology. He received the B.Sc. degree from South China University of Technology in China. His research interests include bioinformatics, machine learning and data mining.

    Jane You is currently a Professor in the Department of Computing at the Hong Kong Polytechnic University and the Chair of Department Research Committee. Prof. You obtained her B.Eng. in Electronic Engineering from Xi'an Jiaotong University in 1986 and Ph.D in Computer Science from La Trobe University, Australia in 1992. She was a Lecturer at the University of South Australia and Senior Lecturer (tenured) at Griffith University from 1993 till 2002. Prof. You was awarded French Foreign Ministry International Postdoctoral Fellowship in 1993 and worked on the project on real-time object recognition and tracking at Universite Paris XI. She also obtained the Academic Certificate issued by French Education Ministry in 1994. Prof. Jane You has worked extensively in the fields of image processing, medical imaging, computer-aided diagnosis, pattern recognition. So far, she has more than 190 research papers published with more than 1000 non-self citations. She has been a principal investigator for one ITF project, three GRF projects and many other joint grants since she joined PolyU in late 1998. Prof. You is also a team member for two successful patents (one HK patent, one US patent) and three awards including Hong Kong Government Industrial Awards. Her current work on retinal imaging has won a Special Prize and Gold Medal with Jury's Commendation at the 39th International Exhibition of Inventions of Geneva (April 2011) and the second place in an international competition (SPIE Medical Imaging'2009 Retinopathy Online Challenge (ROC'2009)). Her research output on retinal imaging has been successfully led to technology transfer with clinical applications. Prof. You is also an Associate Editor of Pattern Recognition and other journals.

    Hau-San Wong is currently an Associate Professor in the Department of Computer Science, City University of Hong Kong. He received the B.Sc. and M.Phil. degrees in Electronic Engineering from the Chinese University of Hong Kong, and the Ph.D. degree in Electrical and Information Engineering from the University of Sydney. He has also held research positions in the University of Sydney and Hong Kong Baptist University. His research interests include multimedia information processing, multimodal human–computer interaction and machine learning.

    Si Wu received the Ph.D. degree in computer science from City University of Hong Kong, Kowloon, Hong Kong, in 2013. He is an Associate Professor with the School of Computer Science and Engineering, South China University of Technology, Guangzhou, China. His research interests include computer vision and pattern recognition.

    Jun Zhang (M'02-SM'08) received the Ph.D. degree in Electrical Engineering from the City University of Hong Kong, Kowloon, Hong Kong, in 2002. Since 2004, he has been with Sun Yat-Sen University, Guangzhou, China, where he is currently a Cheung Kong Professor. He has authored seven research books and book chapters, and over 100 technical papers in his research areas. His current research interests include computational intelligence, cloud computing, big data, and wireless sensor networks. Dr. Zhang was a recipient of the China National Funds for Distinguished Young Scientists from the National Natural Science Foundation of China in 2011 and the First-Grade Award in Natural Science Research from the Ministry of Education, China, in 2009. He is currently an Associate Editor of the IEEE Transactions on Evolutionary Computation, the IEEE Transactions on Industrial Electronics, and the IEEE Transactions on Cybernetics. He is the Founding and Current Chair of the IEEE Guangzhou Subsection, the Founding and Current Chair of ACM Guangzhou Chapter.

    Guoqiang Han is a Professor at the School of Computer Science and Engineering, South China University of Technology, Guangzhou, China. He is the head of the School of Computer Science and Engineering in SCUT. He received his B.Sc. degree from the Zhejiang University in 1982, and the Master and Ph.D. degree from the Sun Yat-Sen University in 1985 and 1988 respectively. His research interests include multimedia, computational intelligence, machine learning and computer graphics. He has published over 100 research papers.
