Full Length Article
Unsupervised Mode of Rejection of Foreign Patterns
Introduction
A standard approach to a classification task is to form a model (classifier) that assigns a class label to each input pattern so that a certain performance measure (say, classification error) is minimized. However, we often encounter problems with pattern quality, which hinder the performance of the resulting classifier. The first compelling example concerns a situation in which highly contaminated (distorted) data are processed. For instance, some data samples may be completely erroneous and as such should not be classified at all; say, a segmentation procedure is flawed and extracts patterns improperly, or two streams of data coming from two experiments have been mistakenly merged into a single one. A second example of a real-world problem that does not fall within the realm of a standard data processing task is novelty detection [21], [30]. Novelty detection can be expressed as a one-class classification problem [8], [20]. Patterns that do not belong to the recognized class are assumed to be the novelty. In particular, a noteworthy area of application of novelty detection methods is computer-aided medical diagnosis. Say we have data describing patients who suffer from a certain illness. With the use of novelty detection methods, a new instance (patient) can be regarded as native to the recognized class (in which case we consider the patient sick and proceed with appropriate medical care) or as foreign to the recognized class (rejected, that is, considered healthy). In this way, without any knowledge about the characteristics of the foreign class, we may reject certain patterns based on their “dissimilarity” to native patterns. The ability to form a binary decision rule (to accept or reject a pattern) based only on knowledge about native patterns is especially valuable when samples of the other class (the foreign class) are very difficult to obtain or differ very much.
The described issues are common in computer-aided medical diagnosis. Novelty detection is therefore vital in medical applications, where the majority (or all) of the gathered data comes from ill subjects, and we may use it to train a one-class model that helps determine whether given new material belongs to this class or not. Such an application of machine learning could support doctors in their decision making and reduce the number of invasive and costly tests otherwise needed to formulate a diagnosis. Studies have already presented a few computer-aided diagnosis systems based on classification principles, including classifiers with a reject option. The study reported in [19] presented an automated model for diagnosing vertebral column pathologies, [22] proposed a method for detection of antinuclear autoantibodies, and [16] discussed a fuzzy rule-based model for assessing coronary artery disease.
With the above motivation in mind, in this study we present a novel approach to foreign patterns rejection. By foreign patterns we mean those patterns that should not be classified into any class and at the same time they do not form their own class(es). We propose an unsupervised approach to form the rejection mechanism with an ultimate objective to separate foreign patterns from native patterns. An issue of constructing multiclass classifiers for native patterns is not considered. Instead, whenever we deal with a native dataset comprising several classes, we regard it as a single-class problem (viz. a single class of native patterns). We define models that determine regions predominantly occupied by native patterns. Patterns that do not fall into such regions are rejected. The formation of these regions is realized in an unsupervised mode. This entails that no prior knowledge about foreign patterns is required to establish the rejection mechanism.
The ultimate objective of this paper is to present a novel unsupervised approach to foreign patterns rejection and to investigate its properties. The proposed approach is related to geometrical approaches to rejection. The introduced perspective links the notion of distance with the membership grades encountered in fuzzy sets. Both criteria, distance and membership, are endowed with parameters, which gives some flexibility to adjust them to the particular problem at hand. In this regard, it is worth stressing that the contribution of this study is original. In the paper, we show that the method can be applied to a variety of pattern recognition and classification problems, including critical areas such as computer-aided medical diagnosis.
The proposed method is first investigated at the conceptual and algorithmic level and subsequently applied to several medical datasets.
Though the literature on classification is abundant, relatively few attempts have been made to deal with the problem of rejecting foreign patterns (as a matter of fact, this type of problem formulation has not been encountered very often). Among the techniques related to the approach outlined here are methods that produce classification decisions in the form of flexible scores. A degree of membership in a class is evaluated on a certain scale, and rejection is realized by eliminating those patterns for which the scores are low. The literature brings several classification methods that work in this manner, for example [1], [17], [24]. Classification scores can be obtained, for instance, with an extension of linear discriminant analysis, as reported in the second of the mentioned papers. Apart from theoretical studies, the literature offers studies in which rejection methods are applied to particular pattern recognition problems. For example, [14] contains a study on handwritten word recognition. Score-based rejection itself was recognized quite a long time ago: in 1970, C.K. Chow proposed to reinforce probabilistic classifiers with an ambiguity rejection option, which basically assumed that patterns with low scores were rejected [5]. In the same sense, the rejection option appears in the recalled papers on computer-aided medical diagnosis [16], [19], [22].
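To make the idea of score-based rejection concrete, the following is a minimal sketch of a thresholded decision rule in the spirit of the ambiguity rejection option recalled above. The function name, the use of -1 as a reject label, and the threshold value are assumptions made for illustration; they are not taken from the cited works. Note that this rule rejects ambiguous patterns on the basis of class scores, which is a different task from rejecting foreign patterns.

```python
import numpy as np

def classify_with_reject(scores, threshold):
    """Pick the best-scoring class per pattern, or -1 (reject) if its score is low.

    scores    : array of shape (n_patterns, n_classes), e.g. posterior estimates
    threshold : hypothetical minimum top score required to accept a decision
    """
    scores = np.asarray(scores, dtype=float)
    labels = scores.argmax(axis=1)          # index of the best-scoring class
    top = scores.max(axis=1)                # value of the best score
    labels[top < threshold] = -1            # low top score: reject the pattern
    return labels

# Example with three patterns and two classes.
example_scores = [[0.90, 0.10], [0.55, 0.45], [0.20, 0.80]]
decisions = classify_with_reject(example_scores, threshold=0.6)
```

Raising the threshold rejects more patterns and lowers the error among accepted ones, which is the trade-off studied in [5].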
It should be stressed that foreign patterns are not outliers. Outliers are native patterns that differ substantially from the majority of the data. We remove outliers because they tend to cause problems with the construction of a classifier. In contrast, foreign elements are unknown to us at the stage of data preprocessing and classifier construction, and we have to reject them because they do not belong to the data at all. Notably, there has been substantial development in the area of outlier detection, for instance as reported in [6]. The scope of studies on outlier detection has been extended to semi-supervised learning methods as well, as reported in [23], and it has been shown that native outliers can be removed from the data efficiently, but the issue of unknown, foreign elements remains rarely mentioned.
As we mentioned at the beginning of the introduction, one-class classification (unary classification) is a stream of studies tightly related to the subject of this paper. Let us reiterate that in one-class classification we construct a model based on a given training set of one class. The model is able to evaluate how closely a given new pattern resembles the patterns from the training set. Resemblance is typically calculated based on distance or another similarity measure of choice. Among the different methods for one-class classification one should mention estimators relying on a certain data distribution [31]. This group of methods shares a common weakness: their reliability is relatively low when only a limited amount of data is available for model training. As a remedy, the so-called boundary approaches, which consider only a closed boundary around the training set, were proposed. In this stream of studies we shall mention methods such as k-centers [31] and Support Vector Data Description [25]. These methods aim at constructing a boundary around a given dataset. To minimize the chance of accepting foreign patterns (the cited works call them outliers), the volume of the region enclosing the dataset is minimized.
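As an illustration of a boundary approach, the sketch below covers a training set with k balls of equal radius and accepts a new pattern only if it falls inside one of them. It uses a greedy farthest-first heuristic to pick the centers, which is a simplification assumed here for brevity and not the exact k-centers procedure of [31]; the function names and the toy data are hypothetical.

```python
import numpy as np

def fit_k_centers(X, k, seed=0):
    """Greedy farthest-first choice of k centers among the training patterns.

    Returns the centers and the common radius, i.e. the largest distance
    from any training pattern to its nearest center.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(X)))]               # arbitrary first center
    dist = np.linalg.norm(X - X[chosen[0]], axis=1)    # distance to nearest center
    for _ in range(k - 1):
        nxt = int(dist.argmax())                       # farthest pattern so far
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(X - X[nxt], axis=1))
    return X[chosen], float(dist.max())

def accepts(centers, radius, Z):
    """True for patterns lying within `radius` of at least one center."""
    Z = np.asarray(Z, dtype=float)
    d = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
    return d.min(axis=1) <= radius

# Toy native data: two small groups of patterns (assumed for illustration).
native = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centers, radius = fit_k_centers(native, k=2)
decisions = accepts(centers, radius, [[0.5, 0.5], [50.0, 50.0]])
```

A smaller number of centers yields larger balls and therefore a looser boundary; increasing k tightens the enclosed region at the cost of a more complex model.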
Finally, one may mention Learning from Positive and Unlabeled Examples (PU learning). This approach is usually discussed as an example of partially supervised classification. Its evolution was motivated by classification cases in which there is one large class (called positive) together with a large number of other small classes or unlabeled patterns, and there are no so-called negative instances. In such a case, standard approaches to classification are unable to distinguish negative instances, as there were none available for classifier training. Such problems appear, among others, in the domains of text mining and biomedical informatics. In those domains we often encounter problems with one dominant class and a multitude of rare classes, which are sometimes very hard to label correctly. Typically, the PU learning approach is based on assigning similarity scores between new patterns and the training set of the positive class. Among early papers concerning the ideas of PU learning one may mention the study reported in [29].
This paper is structured as follows. Section 2 presents basic notions of clustering and, in particular, of fuzzy clustering. Section 3 introduces the proposed approach to foreign patterns rejection. Section 4 covers empirical experiments, in which we apply our method and discuss its properties. Section 5 concludes the paper and highlights future research directions.
Essential features of fuzzy clustering
Before we proceed with the description of the proposed rejection mechanism, it is necessary to introduce formal notation and algorithms, which are used in the later parts of this paper.
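As a point of reference for readers unfamiliar with fuzzy clustering, the following is a minimal sketch of the standard fuzzy c-means (FCM) iteration, which alternates between computing membership grades and updating prototypes. That FCM is the algorithm meant here is an assumption based on the section title; the function names, the fuzzification coefficient m = 2, the stopping rule, and the toy data are illustrative choices and not taken from the paper.

```python
import numpy as np

def memberships(X, V, m=2.0):
    """Membership grades u[i, k] of pattern k in cluster i; each column sums to 1."""
    d = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2)   # shape (c, N)
    d = np.fmax(d, 1e-12)                # guard against zero distances
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=0, keepdims=True)

def fuzzy_c_means(X, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Alternate membership and prototype updates until prototypes stabilize."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    V = X[rng.choice(len(X), size=c, replace=False)]   # initial prototypes
    for _ in range(iters):
        W = memberships(X, V, m) ** m
        V_new = (W @ X) / W.sum(axis=1, keepdims=True)  # weighted means
        shift = np.linalg.norm(V_new - V)
        V = V_new
        if shift < tol:
            break
    return V, memberships(X, V, m)

# Toy data: two well-separated groups of four patterns each.
data = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10], [10, 11], [11, 10], [11, 11]]
prototypes, U = fuzzy_c_means(data, c=2)
```

Because the membership grades of a pattern sum to one across clusters, a pattern located far from all prototypes receives grades close to 1/c, which is one reason why distance information is useful alongside membership.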
A straightforward idea for pattern rejection is transferred from the field of clustering. We see a high resemblance between the clustering task and the rejection task. Clustering aims at identifying similar subsets of objects within a given dataset. By analogy, a reversed foreign patterns rejection task may be depicted as
Identifying regions of foreign data: unsupervised learning approach
The underlying observation behind the proposed approach is that clustering determines certain geometry in the feature space in which the patterns are located. Owing to the membership grades one can effectively identify regions in which there is a high likelihood of foreign patterns. Furthermore, as this development is based on the method of unsupervised learning, it is free of any explicit assumptions about specific (say, statistical) characteristics of the foreign data. At the same time, it is
Experimental studies
In this section, we apply the proposed method in three case studies concerning different medical datasets. We investigate properties of the proposed procedure. In particular, we are interested in model tuning. We show how to evaluate different combinations of ε and δ parameters and we look into properties of foreign patterns rejection and native patterns acceptance.
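The kind of parameter study described above can be sketched as a grid evaluation that reports, for each (ε, δ) pair, the fraction of native patterns accepted and the fraction of foreign patterns rejected. The acceptance rule used below is hypothetical, since the exact criteria are not given in this excerpt: a pattern is accepted when its distance to the nearest prototype is at most ε and its highest FCM-style membership grade is at least δ. The prototypes, the data, and the function names are likewise assumptions made for illustration.

```python
import numpy as np
from itertools import product

def accept(Z, V, eps, delta, m=2.0):
    """Hypothetical rule: near some prototype (<= eps) and clearly assigned (>= delta)."""
    Z = np.asarray(Z, dtype=float)
    V = np.asarray(V, dtype=float)
    d = np.linalg.norm(Z[:, None, :] - V[None, :, :], axis=2)   # shape (N, c)
    d = np.fmax(d, 1e-12)
    inv = d ** (-2.0 / (m - 1.0))
    u = inv / inv.sum(axis=1, keepdims=True)    # membership grades per pattern
    return (d.min(axis=1) <= eps) & (u.max(axis=1) >= delta)

def evaluate_grid(native, foreign, V, eps_values, delta_values):
    """Return (eps, delta, native acceptance rate, foreign rejection rate) tuples."""
    rows = []
    for eps, delta in product(eps_values, delta_values):
        native_acc = accept(native, V, eps, delta).mean()
        foreign_rej = 1.0 - accept(foreign, V, eps, delta).mean()
        rows.append((eps, delta, float(native_acc), float(foreign_rej)))
    return rows

# Toy setting with two prototypes (assumed, e.g. obtained from clustering native data).
V = [[0.0, 0.0], [10.0, 10.0]]
native = [[0.0, 0.5], [10.0, 10.5], [0.5, 0.0]]
foreign = [[5.0, 5.0], [30.0, 30.0]]
results = evaluate_grid(native, foreign, V, eps_values=[1.0, 100.0], delta_values=[0.6])
```

In this toy setting a tight ε rejects both foreign patterns, whereas a very loose ε lets through the distant foreign pattern whose membership happens to favor one prototype, which shows why the two parameters need to be tuned jointly.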
As we mentioned in the introduction, an important area of application of foreign patterns rejection techniques is computer-aided
Conclusions
This paper deals with the issue of recognizing native (proper) patterns and rejecting foreign (outlying, erroneous) ones. In the study, we presented a novel unsupervised approach to foreign patterns rejection. We proposed the construction of a geometric model that defines regions in the feature space for native and for foreign patterns. It is worth highlighting that only the native patterns are considered in order to construct such a model. In the development of the method, no specific information
Acknowledgment
The research is supported by the National Science Centre, grant No 2012/07/B/ST6/01501, decision no. UMO-2012/07/B/ST6/01501.
References (31)
- et al., Rejection strategies for offline handwritten text line recognition, Pattern Recognit. Lett. (2006)
- Class imbalance learning via a fuzzy total margin based support vector machine, Appl. Soft Comput. (2015)
- et al., One class proximal support vector machines, Pattern Recogn. (2016)
- et al., Network constraints and multi-objective optimization for one-class classification, Neural Netw. (1996)
- et al., A review of novelty detection, Signal Process. (2014)
- et al., Rigorous and compliant approaches to one-class classification, Chemom. Intell. Lab. Syst. (2016)
- et al., Extended semi-supervised fuzzy learning method for nonlinear outliers via pattern discovery, Appl. Soft Comput. (2015)
- et al., Growing a multi-class classifier with a reject option, Pattern Recognit. Lett. (2008)
- et al., A modified support vector data description based novelty detection approach for machinery components, Appl. Soft Comput. (2013)
- Pattern Recognition with Fuzzy Objective Function Algorithms (2017)
- LOF: identifying density-based local outliers, Proc. of the 2000 ACM SIGMOD International Conference on Management of Data
- On optimum recognition error and reject tradeoff, IEEE Trans. Inf. Theory