Neurocomputing

Volume 194, 19 June 2016, Pages 45-55

Active learning and data manipulation techniques for generating training examples in meta-learning

https://doi.org/10.1016/j.neucom.2016.02.007

Abstract

Algorithm selection is an important task in different domains of knowledge. Meta-learning treats this task by adopting a supervised learning strategy. Training examples in meta-learning (called meta-examples) are generated from experiments performed with a pool of candidate algorithms on a number of problems, usually collected from data repositories or synthetically generated. A meta-learner is then applied to acquire knowledge relating features of the problems to the algorithms with the best performance. In this paper, we address an important aspect of meta-learning: producing a significant number of relevant meta-examples. Generating a high-quality set of meta-examples can be difficult due to the low availability of real datasets in some domains and the high computational cost of labelling the meta-examples. In the current work, we focus on the generation of meta-examples for meta-learning by combining: (1) a promising approach to generate new datasets (called datasetoids) by manipulating existing ones; and (2) active learning methods to select the most relevant of the generated datasets. The datasetoids approach is adopted to augment the number of useful problem instances for meta-example construction. However, not all generated problems are equally relevant. Active meta-learning then arises to select only the most informative instances to be labelled. Experiments were performed considering different scenarios, meta-learning algorithms and dataset selection strategies. Our experiments revealed that it is possible to reduce the computational cost of generating meta-examples while maintaining good meta-learning performance.

Introduction

Algorithm selection is a challenging task in different domains, including computational intelligence, machine learning and optimization. These domains have in common the availability of different algorithms to solve the problems of interest and a shared understanding that no single algorithm can be considered the best one for all problems [1]. For instance, in a machine learning context, different algorithms can be alternatively adopted to solve classification problems, but the performance of the candidate algorithms can vary considerably depending on the features of the problems (e.g., dimensionality, training data quality, class complexity) and on the measures adopted for performance assessment. Additionally, each algorithm may have specific hyperparameters to set, which can also affect algorithm performance depending on the problem.

In this work, the algorithm selection problem was addressed by the meta-learning approach [2], [3], [1]. In meta-learning, algorithm selection is treated as a supervised learning task. Each training example (or meta-example) is related to a learning problem (e.g., a classification problem), the predictor attributes are features of that problem (e.g., class entropy, number of training examples, number of attributes) and the target attribute usually indicates the best algorithm for that problem, assigned after an empirical evaluation procedure (e.g., cross-validation). A meta-learner is adopted to select the best algorithms for new problems by exploiting the relationship between problem features and algorithm performance. A substantial amount of research on this topic was done in the context of the METAL project, resulting in new meta-learning procedures and problem characterization methods. In the last decade, meta-learning has been extended to algorithm selection in a variety of other domains of knowledge [1], with promising results and new perspectives.
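To make this formulation concrete, the sketch below builds one meta-example from a classification dataset. It is a minimal illustration under our own assumptions: a hypothetical pool of two candidate algorithms and three simple meta-features (numbers of examples and attributes, class entropy); the meta-features and candidates actually used in the paper differ.

```python
# Minimal sketch of meta-example construction. The candidate pool and the
# meta-features below are illustrative assumptions, not the paper's setup.
import numpy as np
from scipy.stats import entropy
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

CANDIDATES = {
    "knn": KNeighborsClassifier(),
    "tree": DecisionTreeClassifier(random_state=0),
}

def meta_features(X, y):
    """Simple problem descriptors; assumes integer-encoded class labels."""
    counts = np.bincount(y)
    return {
        "n_examples": X.shape[0],
        "n_attributes": X.shape[1],
        "class_entropy": entropy(counts / counts.sum(), base=2),
    }

def build_meta_example(X, y, cv=10):
    """Predictor attributes plus the best algorithm under cross-validation
    (the empirical evaluation that labels the meta-example)."""
    scores = {name: cross_val_score(algo, X, y, cv=cv).mean()
              for name, algo in CANDIDATES.items()}
    return meta_features(X, y), max(scores, key=scores.get)
```

The call to `build_meta_example` is where the computational cost concentrates: every candidate algorithm must be cross-validated on the full dataset just to obtain one label.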

As with any other learning task, the success of meta-learning depends on a good set of training instances (in our case, a good set of meta-examples). A large amount of previous work has focused on constructing and selecting relevant meta-features, but few papers are concerned with the instances (problems) used to generate the set of meta-examples. Ideally, meta-examples have to be generated from a representative and sufficiently large set of problems in order to result in good meta-learning performance. However, in different domains, there is a low availability of real problem instances or benchmark datasets to produce a rich and large set of meta-examples [4]. In fact, in [1], the author reported meta-learning studies in some domains with very scarce sets of meta-examples. This issue has received attention from the research community, which has addressed it by generating new problem instances from either synthetic or manipulated datasets [5], [6], [7].

A second issue concerns the cost of generating meta-examples. In order to generate a meta-example from a given problem, an empirical evaluation of each candidate algorithm is performed on the available dataset. This evaluation corresponds to the labelling process of a meta-example, which requires assigning the best candidate algorithm for that problem. The labelling process can lead to a high computational cost, for instance, in situations where a large pool of problem instances is available (real problems, synthetic problems or both) or where there is a pool of time-consuming candidate algorithms to evaluate. Selecting only informative and non-redundant datasets is an important issue in meta-learning, which was addressed in [8] by deploying active learning techniques.

Motivated by the previous two issues, in our work we investigate the combination of manipulation approaches for generating datasets and active learning to support the selection of meta-examples. More specifically, in our proposal a previous approach for manipulating datasets, called datasetoids [7], is initially adopted to produce a large pool of problem instances. Next, active learning techniques based on uncertainty sampling are used to select from this pool only the most relevant problem instances, avoiding the generation of meta-examples from redundant or irrelevant ones. The goal of this combination is to address two challenges of meta-learning at the same time: obtaining a significant number of datasets for generating accurate meta-models, and reducing the computational cost of collecting meta-data by actively selecting relevant problem instances.
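The combination can be summarized in a short loop: label a small random seed set, then repeatedly pick the unlabelled problem the current meta-data is most uncertain about. This is a sketch under our own assumptions (the `build_meta_example` and `meta_features` helpers from the previous sketch, an `uncertainty` scorer sketched further below, and hypothetical `budget` and `n_seed` parameters), not the paper's exact procedure.

```python
# Sketch of the active meta-learning loop: only the most informative
# problems from the pool (real datasets plus datasetoids) are labelled.
# `uncertainty`, `budget` and `n_seed` are illustrative assumptions.
import random
import numpy as np

def active_meta_learning(pool, meta_learner, budget, n_seed=5):
    """pool: list of (X, y) problems. Returns the fitted meta-learner."""
    unlabelled = list(pool)
    random.shuffle(unlabelled)
    meta_X, meta_y = [], []

    def label(problem):                          # the expensive step
        mf, best = build_meta_example(*problem)
        meta_X.append(list(mf.values()))
        meta_y.append(best)

    for _ in range(n_seed):                      # small random seed set
        label(unlabelled.pop())
    while unlabelled and len(meta_y) < budget:
        scores = [uncertainty(meta_X, meta_y,
                              list(meta_features(X, y).values()))
                  for X, y in unlabelled]
        label(unlabelled.pop(int(np.argmax(scores))))

    return meta_learner.fit(meta_X, meta_y)      # final meta-model
```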

Different aspects of our proposal were investigated: (1) alternative scenarios were considered to evaluate the usefulness of the datasetoids approach, including how to integrate these data into the pool of real problem instances; (2) the selection of problem instances was accomplished by adopting an uncertainty sampling method based on entropy; (3) we also investigated the effect of peripheral instances on the performance of the uncertainty sampling method, a drawback already known in the active learning literature [9]; (4) we adopted two different algorithms as meta-learners: the k-Nearest Neighbor (k-NN) algorithm (which has been a standard method in meta-learning [4]) and the Random Forest algorithm (especially motivated by its good comparative performance in the literature [10]).
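As an illustration of item (2), and noting the caveat in item (3), the `uncertainty` scorer used in the loop above could be realized for a k-NN meta-learner as the entropy of the best-algorithm labels among the query problem's k nearest labelled meta-examples. This is one plausible implementation, not necessarily the paper's exact formulation.

```python
# Entropy-based uncertainty for a k-NN meta-learner: problems whose
# nearest labelled neighbours disagree on the best algorithm score high.
from collections import Counter

import numpy as np
from scipy.stats import entropy

def uncertainty(meta_X, meta_y, query_mf, k=3):
    X = np.asarray(meta_X, dtype=float)
    dists = np.linalg.norm(X - np.asarray(query_mf, dtype=float), axis=1)
    labels = [meta_y[i] for i in np.argsort(dists)[:k]]
    counts = np.array(list(Counter(labels).values()), dtype=float)
    # Caveat (item 3): peripheral (outlier) problems can also receive high
    # entropy scores, so an outlier filter may be applied to the pool first.
    return entropy(counts / counts.sum(), base=2)
```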

The remainder of this paper is organized as follows. First, Section 2 provides some background on meta-learning, including a presentation of the datasetoids approach. Next, active learning is discussed in the context of meta-learning (Section 3). Section 4 presents the proposed solution. Section 5 presents the experiments and the obtained results. Finally, Section 6 presents conclusions and future work.

Section snippets

Meta-learning for algorithm selection

Based on Rice's framework [12], reproduced in [1], the algorithm selection problem can be defined by considering four components: (1) a problem space P, which represents the possible instances related to a particular problem of interest (e.g., classification problems); (2) the feature space F, which defines the features adopted to describe the problem instances (e.g., number of training examples and number of classes); (3) the algorithm space A, which defines the set of candidate algorithms; and (4) the performance space Y, which represents the measures of algorithm performance on the problem instances.
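In this notation, algorithm selection amounts to learning a selection mapping from problem features to algorithms. The formulation below is the standard rendering of Rice's model, written out here for clarity rather than quoted from the paper:

```latex
% Standard rendering of Rice's algorithm selection framework:
% for a problem instance x in P with features f(x) in F, choose the
% mapping S from F to A that maximizes the measured performance in Y.
S(f(x)) = \operatorname*{arg\,max}_{a \in A} \; y(a, x)
```

Here y(a, x) ∈ Y denotes the performance of algorithm a on problem instance x, so the meta-learner approximates S from a finite sample of labelled problems.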

Active meta-learning

The use of data-generation techniques increases the amount of data available for producing meta-examples, which can have a positive impact on meta-learning performance. However, the techniques used to augment the number of problem instances may introduce difficulties that can result in suboptimal meta-learners. First, data-generation techniques are likely to produce datasets that carry redundant or even irrelevant information [7]. Noisy datasets can harm the meta-learning performance, as it can

Proposal

In the current work, we propose the combination of active meta-learning and the datasetoids approach presented in Section 2, which has been shown in previous work to produce useful datasets for improving meta-learning performance. In this proposal, a pool of candidate problem instances is composed of a set of real datasets collected from UCI and their corresponding datasetoids. Then, an active selection process is adopted to select only the most relevant ones from the initial pool, avoiding the cost of labelling redundant or irrelevant problem instances.
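For reference, a datasetoid is derived from a classification dataset by switching roles: each symbolic attribute in turn becomes the new target, while the original class becomes an ordinary predictive attribute [7]. A minimal sketch, assuming the data sit in a pandas DataFrame and that rows missing the new target are discarded:

```python
# Sketch of datasetoid generation [7]: every symbolic attribute yields a
# new problem in which it plays the role of the target. Assumes a pandas
# DataFrame; rows with a missing value in the new target are dropped.
import pandas as pd

def datasetoids(df: pd.DataFrame, target: str) -> dict:
    symbolic = [c for c in df.columns
                if c != target and df[c].dtype in ("object", "category")]
    # the original target stays in the data as a predictive attribute;
    # only the designated target column changes per datasetoid
    return {col: df.dropna(subset=[col]) for col in symbolic}
```

Applied to every dataset collected from UCI, this multiplies the number of candidate problem instances, which is precisely why the active selection step that follows is needed.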

Experiments and results

In this section, we present the experiments performed to evaluate active meta-learning in the context of the datasetoids approach. Initially, we present the meta-data adopted in our experiments, followed by the methodology used to estimate meta-learning accuracy as well as the active learning settings. Finally, the results are presented and discussed.

Meta-data: In the current work, we adopted the same meta-learning task originally used to evaluate the datasetoids approach in [7]. It consists of

Conclusion

In this work, we propose the combination of active meta-learning, data manipulation and outlier detection techniques to improve the generation of training examples for meta-learning. The main objective of this approach is to reduce the computational cost of collecting meta-data and improve the performance of meta-learning by increasing the number of meta-examples and removing irrelevant data.

Experiments were performed in a case study using k-NN and Random Forest as meta-learners and an

Acknowledgment

The authors would like to thank CNPq, CAPES, and FACEPE (Brazilian Agencies) and FCT – Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) for their financial support.

References (35)

  • J. Kanda, A. Carvalho, E. Hruschka, C. Soares, Using meta-learning to recommend meta-heuristics for the traveling...
  • C. Soares, UCI++, improved support for algorithm selection using datasetoids, in: Lecture Notes in Computer Science,...
  • R. Prudêncio et al., Selective generation of training examples in active meta-learning, Int. J. Hybrid Intell. Syst. (2008)
  • N. Roy, A. McCallum, Toward optimal active learning through sampling estimation of error reduction, in: Proceedings of...
  • R. Caruana, N. Karampatziakis, A. Yessenalina, An empirical evaluation of supervised learning in high dimensions, in:...
  • R. Prudêncio, C. Soares, T. Ludermir, Uncertainty sampling methods for selecting datasets in active meta-learning, in:...
  • B. Souza et al., Meta-learning approach to gene expression data classification, Int. J. Intell. Comput. Cybern. (2009)

Arthur F.M. Sousa received his B.Sc. degree in Computer Engineering from the Universidade de Pernambuco (2010) and his M.Sc. in Computer Science from the Universidade Federal de Pernambuco, Brazil. His main interests are Machine Learning, Neural Networks and Active Learning.

Ricardo B.C. Prudêncio received his B.Sc. degree in Computer Science from the Universidade Federal do Ceará and his M.Sc. and Ph.D. degrees in Computer Science from the Universidade Federal de Pernambuco, Brazil. He is a lecturer at the Center of Informatics, Universidade Federal de Pernambuco, Brazil. His main interests are Machine Learning, Meta-Learning, Hybrid Intelligent Systems, Time Series Forecasting and Text Mining.

Teresa B. Ludermir received the Ph.D. degree in artificial neural networks from Imperial College, University of London, U.K., in 1990. She is a Professor at CIn-UFPE and an Editor-in-Chief of the International Journal of Computational Intelligence and Applications (Imperial College Press). Her research interests include weightless neural networks, hybrid neural systems, and applications of neural networks.

Carlos Soares received his B.Sc. degree in Systems Engineering and Informatics from Universidade do Minho, Portugal. He received his M.Sc. degree in Artificial Intelligence and his Ph.D. in Computer Science from Universidade do Porto, Portugal. He is a lecturer at the Faculdade de Economia da Universidade do Porto. His main interests are Machine Learning, Data Mining, Meta-Learning and Data Streams.
