Neurocomputing

Volume 194, 19 June 2016, Pages 45-55

Active learning and data manipulation techniques for generating training examples in meta-learning

https://doi.org/10.1016/j.neucom.2016.02.007

Abstract

Algorithm selection is an important task in different domains of knowledge. Meta-learning treats this task by adopting a supervised learning strategy. Training examples in meta-learning (called meta-examples) are generated from experiments performed with a pool of candidate algorithms on a number of problems, usually collected from data repositories or synthetically generated. A meta-learner is then applied to acquire knowledge relating features of the problems to the algorithms with the best performance. In this paper, we address an important aspect of meta-learning: producing a significant number of relevant meta-examples. Generating a high-quality set of meta-examples can be difficult due to the low availability of real datasets in some domains and the high computational cost of labelling the meta-examples. In the current work, we focus on the generation of meta-examples for meta-learning by combining: (1) a promising approach to generate new datasets (called datasetoids) by manipulating existing ones; and (2) active learning methods to select the most relevant of the generated datasets. The datasetoids approach is adopted to augment the number of useful problem instances for meta-example construction. However, not all generated problems are equally relevant. Active meta-learning then arises to select only the most informative instances to be labelled. Experiments were performed considering different scenarios, meta-learning algorithms and dataset selection strategies. Our experiments revealed that it is possible to reduce the computational cost of generating meta-examples while maintaining good meta-learning performance.

Introduction

Algorithm selection is a challenging task in different domains, including computational intelligence, machine learning and optimization. These domains have in common the availability of different algorithms to solve the problems of interest and a shared understanding that no single algorithm can be considered the best one for all problems [1]. For instance, in a machine learning context, different algorithms can be alternatively adopted to solve classification problems, but the performance of the candidate algorithms can vary considerably depending on the features of the problems (e.g., dimensionality, training data quality, class complexity) and on the measures adopted for performance assessment. Additionally, each algorithm may have specific hyperparameters to set, which can also affect algorithm performance depending on the problem.

In this work, the algorithm selection problem was addressed by the meta-learning approach [2], [3], [1]. In meta-learning, algorithm selection is treated as a supervised learning task. Each training example (or meta-example) is related to a learning problem (e.g., a classification problem), the predictor attributes are features of that problem (e.g., class entropy, number of training examples, number of attributes) and the target attribute usually indicates the best algorithm for that problem, assigned after an empirical evaluation procedure (e.g., cross-validation). A meta-learner is adopted to select the best algorithms for new problems by exploiting the relationship between problem features and algorithm performance. A substantial amount of research on this topic was done in the context of the METAL project, resulting in new meta-learning procedures and problem characterization methods. In the last decade, meta-learning has been extended to algorithm selection in a variety of other domains of knowledge [1], with promising results and new perspectives.
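To make this formulation concrete, the sketch below builds one meta-example from a classification dataset. It is a minimal illustration under our own assumptions: a hypothetical pool of two candidate algorithms and three simple meta-features (numbers of examples and attributes, class entropy); the meta-features and candidates actually used in the paper differ.

```python
# Minimal sketch of meta-example construction. The candidate pool and the
# meta-features below are illustrative assumptions, not the paper's setup.
import numpy as np
from scipy.stats import entropy
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

CANDIDATES = {
    "knn": KNeighborsClassifier(),
    "tree": DecisionTreeClassifier(random_state=0),
}

def meta_features(X, y):
    """Simple problem descriptors; assumes integer-encoded class labels."""
    counts = np.bincount(y)
    return {
        "n_examples": X.shape[0],
        "n_attributes": X.shape[1],
        "class_entropy": entropy(counts / counts.sum(), base=2),
    }

def build_meta_example(X, y, cv=10):
    """Predictor attributes plus the best algorithm under cross-validation
    (the empirical evaluation that labels the meta-example)."""
    scores = {name: cross_val_score(algo, X, y, cv=cv).mean()
              for name, algo in CANDIDATES.items()}
    return meta_features(X, y), max(scores, key=scores.get)
```

The call to `build_meta_example` is where the computational cost concentrates: every candidate algorithm must be cross-validated on the full dataset just to obtain one label.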

As with any other learning task, the success of meta-learning depends on a good set of training instances (in our case, a good set of meta-examples). A large amount of previous work has focused on constructing and selecting relevant meta-features, but few papers are concerned with the instances (problems) used to generate the set of meta-examples. Ideally, meta-examples have to be generated from a representative and sufficiently large set of problems in order to result in good meta-learning performance. However, in different domains, there is a low availability of real problem instances or benchmark datasets to produce a rich and large set of meta-examples [4]. In fact, in [1], the author reported meta-learning studies in some domains with very scarce sets of meta-examples. This issue has received attention from the research community, which has addressed it by generating new problem instances from either synthetic or manipulated datasets [5], [6], [7].

A second issue concerns the cost of generating meta-examples. In order to generate a meta-example from a given problem, an empirical evaluation of each candidate algorithm is performed on the available dataset. This evaluation corresponds to the labelling process of a meta-example, which requires assigning the best candidate algorithm for that problem. The labelling process can lead to a high computational cost, for instance, in situations where a large pool of problem instances is available (real problems, synthetic problems or both) or where there is a pool of time-consuming candidate algorithms to evaluate. Selecting only informative and non-redundant datasets is an important issue in meta-learning, which was addressed in [8] by deploying active learning techniques.

Motivated by the previous two issues, in our work we investigate the combination of manipulation approaches for generating datasets and active learning to support the selection of meta-examples. More specifically, in our proposal a previous approach for manipulating datasets, called datasetoids [7], is initially adopted to produce a large pool of problem instances. Next, active learning techniques based on uncertainty sampling are used to select from this pool only the most relevant problem instances, avoiding the generation of meta-examples from redundant or irrelevant ones. The goal of this combination is to address two challenges of meta-learning at the same time: obtaining a significant number of datasets for generating accurate meta-models, and reducing the computational cost of collecting meta-data by actively selecting relevant problem instances.
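The combination can be summarized in a short loop: label a small random seed set, then repeatedly pick the unlabelled problem the current meta-data is most uncertain about. This is a sketch under our own assumptions (the `build_meta_example` and `meta_features` helpers from the previous sketch, an `uncertainty` scorer sketched further below, and hypothetical `budget` and `n_seed` parameters), not the paper's exact procedure.

```python
# Sketch of the active meta-learning loop: only the most informative
# problems from the pool (real datasets plus datasetoids) are labelled.
# `uncertainty`, `budget` and `n_seed` are illustrative assumptions.
import random
import numpy as np

def active_meta_learning(pool, meta_learner, budget, n_seed=5):
    """pool: list of (X, y) problems. Returns the fitted meta-learner."""
    unlabelled = list(pool)
    random.shuffle(unlabelled)
    meta_X, meta_y = [], []

    def label(problem):                          # the expensive step
        mf, best = build_meta_example(*problem)
        meta_X.append(list(mf.values()))
        meta_y.append(best)

    for _ in range(n_seed):                      # small random seed set
        label(unlabelled.pop())
    while unlabelled and len(meta_y) < budget:
        scores = [uncertainty(meta_X, meta_y,
                              list(meta_features(X, y).values()))
                  for X, y in unlabelled]
        label(unlabelled.pop(int(np.argmax(scores))))

    return meta_learner.fit(meta_X, meta_y)      # final meta-model
```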

Different aspects of our proposal were investigated: (1) alternative scenarios were considered to evaluate the usefulness of the datasetoids approach, including how to integrate these data into the pool of real problem instances; (2) the selection of problem instances was accomplished by adopting an uncertainty sampling method based on entropy; (3) we also investigated the effect of peripheral instances on the performance of the uncertainty sampling method, a drawback already known in the active learning literature [9]; (4) we adopted two different algorithms as meta-learners: the k-Nearest Neighbor (k-NN) algorithm (which has been a standard method in meta-learning [4]) and the Random Forest algorithm (especially motivated by its good comparative performance in the literature [10]).
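As an illustration of item (2), and noting the caveat in item (3), the `uncertainty` scorer used in the loop above could be realized for a k-NN meta-learner as the entropy of the best-algorithm labels among the query problem's k nearest labelled meta-examples. This is one plausible implementation, not necessarily the paper's exact formulation.

```python
# Entropy-based uncertainty for a k-NN meta-learner: problems whose
# nearest labelled neighbours disagree on the best algorithm score high.
from collections import Counter

import numpy as np
from scipy.stats import entropy

def uncertainty(meta_X, meta_y, query_mf, k=3):
    X = np.asarray(meta_X, dtype=float)
    dists = np.linalg.norm(X - np.asarray(query_mf, dtype=float), axis=1)
    labels = [meta_y[i] for i in np.argsort(dists)[:k]]
    counts = np.array(list(Counter(labels).values()), dtype=float)
    # Caveat (item 3): peripheral (outlier) problems can also receive high
    # entropy scores, so an outlier filter may be applied to the pool first.
    return entropy(counts / counts.sum(), base=2)
```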

The remainder of this paper is organized as follows. First, Section 2 provides some background on meta-learning, including a presentation of the datasetoids approach. Next, active learning is discussed in the context of meta-learning (Section 3). Section 4 presents the proposed solution. Section 5 presents the experiments and the obtained results. Finally, Section 6 presents conclusions and future work.

Section snippets

Meta-learning for algorithm selection

Based on Rice's framework [12], reproduced in [1], the algorithm selection problem can be defined by considering four components: (1) a problem space P, which represents the possible instances related to a particular problem of interest (e.g., classification problems); (2) the feature space F, which defines the features adopted to describe the problem instances (e.g., number of training examples and number of classes); (3) the algorithm space A, which defines the set of candidate algorithms; and (4) the performance space Y, which represents the measures of algorithm performance on the problem instances.
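In this notation, algorithm selection amounts to learning a selection mapping from problem features to algorithms. The formulation below is the standard rendering of Rice's model, written out here for clarity rather than quoted from the paper:

```latex
% Standard rendering of Rice's algorithm selection framework:
% for a problem instance x in P with features f(x) in F, choose the
% mapping S from F to A that maximizes the measured performance in Y.
S(f(x)) = \operatorname*{arg\,max}_{a \in A} \; y(a, x)
```

Here y(a, x) ∈ Y denotes the performance of algorithm a on problem instance x, so the meta-learner approximates S from a finite sample of labelled problems.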

Active meta-learning

The use of data-generation techniques increases the amount of data available for producing meta-examples, which can have a positive impact on meta-learning performance. However, the techniques used to augment the number of problem instances may introduce difficulties that can result in suboptimal meta-learners. First, data-generation techniques are likely to produce datasets that carry redundant or even irrelevant information [7]. Noisy datasets can harm the meta-learning performance, as it can

Proposal

In the current work, we propose the combination of active meta-learning and the datasetoids approach presented in Section 2, which has been shown in previous work to produce useful datasets for improving meta-learning performance. In this proposal, a pool of candidate problem instances is composed of a set of real datasets collected from UCI and their corresponding datasetoids. Then, an active selection process is adopted to select only the most relevant ones from the initial pool, avoiding the cost of labelling redundant or irrelevant problem instances.
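For reference, a datasetoid is derived from a classification dataset by switching roles: each symbolic attribute in turn becomes the new target, while the original class becomes an ordinary predictive attribute [7]. A minimal sketch, assuming the data sit in a pandas DataFrame and that rows missing the new target are discarded:

```python
# Sketch of datasetoid generation [7]: every symbolic attribute yields a
# new problem in which it plays the role of the target. Assumes a pandas
# DataFrame; rows with a missing value in the new target are dropped.
import pandas as pd

def datasetoids(df: pd.DataFrame, target: str) -> dict:
    symbolic = [c for c in df.columns
                if c != target and df[c].dtype in ("object", "category")]
    # the original target stays in the data as a predictive attribute;
    # only the designated target column changes per datasetoid
    return {col: df.dropna(subset=[col]) for col in symbolic}
```

Applied to every dataset collected from UCI, this multiplies the number of candidate problem instances, which is precisely why the active selection step that follows is needed.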

Experiments and results

In this section, we present the experiments performed to evaluate active meta-learning in the context of the datasetoids approach. Initially, we present the meta-data adopted in our experiments, followed by the methodology used to estimate meta-learning accuracy as well as the active learning settings. Finally, the results are presented and discussed.

Meta-data: In the current work, we adopted the same meta-learning task originally used to evaluate the datasetoids approach in [7]. It consists of

Conclusion

In this work, we propose the combination of active meta-learning, data manipulation and outlier detection techniques to improve the generation of training examples for meta-learning. The main objective of this approach is to reduce the computational cost of collecting meta-data and improve the performance of meta-learning by increasing the number of meta-examples and removing irrelevant data.

Experiments were performed in a case study using k-NN and Random Forest as meta-learners and an

Acknowledgment

The authors would like to thank CNPq, CAPES, and FACEPE (Brazilian Agencies) and FCT – Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) for their financial support.

References (35)

  • J. Kanda, A. Carvalho, E. Hruschka, C. Soares, Using meta-learning to recommend meta-heuristics for the traveling...
  • C. Soares, UCI++, improved support for algorithm selection using datasetoids, in: Lecture Notes in Computer Science,...
  • R. Prudêncio et al., Selective generation of training examples in active meta-learning, Int. J. Hybrid Intell. Syst. (2008)
  • N. Roy, A. McCallum, Toward optimal active learning through sampling estimation of error reduction, in: Proceedings of...
  • R. Caruana, N. Karampatziakis, A. Yessenalina, An empirical evaluation of supervised learning in high dimensions, in:...
  • R. Prudêncio, C. Soares, T. Ludermir, Uncertainty sampling methods for selecting datasets in active meta-learning, in:...
  • B. Souza et al., Meta-learning approach to gene expression data classification, Int. J. Intell. Comput. Cybern. (2009)

Arthur F.M. Sousa received his B.Sc. degree in Computer Engineering from the Universidade de Pernambuco (2010) and his M.Sc. in Computer Science from the Universidade Federal de Pernambuco, Brazil. His main interests are Machine Learning, Neural Networks and Active Learning.

Ricardo B.C. Prudêncio received his B.Sc. degree in Computer Science from the Universidade Federal do Ceará and his M.Sc. and Ph.D. degrees in Computer Science from the Universidade Federal de Pernambuco, Brazil. He is a lecturer at the Center of Informatics, Universidade Federal de Pernambuco, Brazil. His main interests are Machine Learning, Meta-Learning, Hybrid Intelligent Systems, Time Series Forecasting and Text Mining.

Teresa B. Ludermir received the Ph.D. degree in artificial neural networks from Imperial College, University of London, U.K., in 1990. She is a Professor at CIn-UFPE and an Editor-in-Chief of the International Journal of Computational Intelligence and Applications (Imperial College Press). Her research interests include weightless neural networks, hybrid neural systems, and applications of neural networks.

Carlos Soares received his B.Sc. degree in Systems Engineering and Informatics from Universidade do Minho, Portugal. He received his M.Sc. degree in Artificial Intelligence and his Ph.D. in Computer Science from Universidade do Porto, Portugal. He is a lecturer at the Faculdade de Economia da Universidade do Porto. His main interests are Machine Learning, Data Mining, Meta-Learning and Data Streams.
