BY-NC-ND 3.0 license | Open Access | Published by De Gruyter, February 27, 2016

Multiclass Contour-Preserving Classification with Support Vector Machine (SVM)

  • Piyabute Fuangkhon

Abstract

Multiclass contour-preserving classification (MCOV) has been used to preserve the contour of a data set and improve the classification accuracy of a feed-forward neural network. It synthesizes two types of new instances, called fundamental multiclass outpost vectors (FMCOVs) and additional multiclass outpost vectors (AMCOVs), in the middle of the decision boundary between consecutive classes of data. This paper compares the generalization obtained by including FMCOVs, AMCOVs, and both MCOVs in the final training sets used with a support vector machine (SVM). The experiments were carried out using MATLAB R2015a and LIBSVM v3.20 on seven types of final training sets generated from each of the synthetic and real-world data sets from the University of California Irvine machine learning repository and the ELENA project. The experimental results confirm that including FMCOVs in final training sets that contain the raw data can significantly improve SVM classification accuracy.

MSC 2010: 68T01

1 Introduction

Support vector machine (SVM) [3] refers to a set of related supervised learning methods that analyze data and recognize patterns for classification and regression analysis. An SVM attempts to find a hyperplane in a high- or infinite-dimensional space that divides the two classes of data. A good separation is achieved by the hyperplane that has the largest distance to the nearest instance of any class because, in general, the larger the margin, the lower the generalization error of the classifier. The instances that fall on this margin are called the support vectors. LIBSVM [2] is a library for support vector classification (C-SVC, nu-SVC), regression (epsilon-SVR, nu-SVR), and distribution estimation (one-class SVM). This SVM library supports multiclass classification.

Consider the linearly separable problem shown in Figure 1A; the four classes of data are designated as red, green, blue, and magenta points in a two-dimensional Euclidean space determined by two coordinates. After SVM training, support vectors are identified and designated as black points, as shown in Figure 1B. These support vectors are the representatives of the instances in different classes that are very close to each other. When SVM places a hyperplane or a set of hyperplanes, these hyperplanes tend to be parallel to the x-axis, which can lead to a high misclassification rate for the instances located at the decision boundary between consecutive classes of data. A countermeasure is to assist the SVM in classifying this linearly separable problem non-linearly. This can be implemented by inserting a set of synthesized instances in the middle of the decision boundary between consecutive classes of data, thereby reducing the space there, as shown in Figure 1C. SVM will then be biased toward selecting the synthesized instances as support vectors, resulting in a non-linear classification of a linearly separable problem.

Figure 1: (A) A Four-Class Synthetic Data Set Having 3200 Samples per Class (Light Color Points). (B) A Four-Class Synthetic Data Set Having 3200 Samples per Class and Support Vectors (Black Points). (C) A Four-Class Synthetic Data Set Having 3200 Samples per Class and Synthesized Vectors (Solid-Color Points). (D) A Four-Class Synthetic Data Set Having 3200 Samples per Class and Multiclass Outpost Vectors (MCOVs) (Solid-Color Points) [9].

Multiclass contour-preserving classification (MCOV) [6] is a technique that can reduce the space between consecutive classes of data to improve the contour preservation of the data set for a feed-forward neural network (FFNN). The technique synthesizes two types of new instances, called fundamental multiclass outpost vectors (FMCOVs) and additional multiclass outpost vectors (AMCOVs), at the middle of the decision boundary between consecutive classes of data. These FMCOVs and AMCOVs assist the FFNN to place a set of hyperplanes in such a way that preserves the concave surface (curves inward) and the convex surface (bulges outward) of the data set more accurately. As a result, the generalization of the model can be improved. MCOVs are synthesized at the middle of the decision boundary between consecutive classes of data, as depicted in Figure 1D. The space between consecutive classes of data is significantly reduced. MCOVs might seem to be similar to support vectors. Nevertheless, MCOVs are synthesized from the original instances, while support vectors are selected from the original instances.

This paper compares the generalization of including FMCOVs, AMCOVs, and both MCOVs [6] in the final training sets used with SVM; it is a further study of Ref. [5] extended to support multiclass data. Its goal is to determine whether including FMCOVs, AMCOVs, or both MCOVs in the final training sets can improve SVM classification accuracy. The experiments were carried out using MATLAB R2015a with SVM using LIBSVM v3.20 [2] on six non-overlapping synthetic data sets, eight highly overlapping synthetic data sets from the ELENA project [9], and six multiclass real-world data sets from the University of California Irvine (UCI) machine learning repository [1].

The paper is organized as follows. Section 2 describes research related to SVM and MCOV. Section 3 presents the methodology used to compare the generalization of including FMCOVs, AMCOVs, and both MCOVs in the final training sets with SVM using LIBSVM v3.20. Section 4 presents the experimental results. Section 5 presents the conclusions and future work.

2 Related Works

2.1 SVM

SVM [3] refers to a set of related supervised learning methods that analyze data and recognize patterns for classification and regression analysis. A standard SVM predicts which of two possible classes an input belongs to. Prediction is done by constructing a hyperplane or a set of hyperplanes in a high- or infinite-dimensional space that maximizes the margin between the two classes. When the data set is linearly separable, a hard-margin linear SVM selects as support vectors the instances that lie along the supporting hyperplanes (the hyperplanes parallel to the dividing hyperplane at the edges of the margin). If the data set is not linearly separable, a soft-margin linear SVM can widen the margin, accepting a larger training error in exchange for better generalization. Alternatively, a data set that is not linearly separable in its original feature space can be implicitly mapped into a higher-dimensional space via a kernel function, where a linear separation may exist; the resulting decision boundary is non-linear in the original feature space. The performance of SVM may be considered in terms of accuracy as well as computational complexity. Classification accuracy depends on the trade-off between a high-complexity model, which may over-fit the data, and a large margin, which will misclassify some of the training data in the interest of better generalization. In general, the larger the margin, the lower the generalization error of the classifier. More specifically, a good separation is achieved by the hyperplane that has the largest distance to the neighboring data points (vectors) of both classes. The computational complexity of training depends on solving a quadratic programming problem for linear SVM, and additionally on the selected kernel function for non-linear SVM. At testing time, the computational complexity is linear in the number of support vectors; fewer support vectors lead to faster classification. Figure 2 illustrates support vectors and their hyperplane; a minimal code sketch follows the figure. The LIBSVM library [2] implements SVM with support for multiclass data.

Figure 2: Support Vectors and their Hyperplane.
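To make these ideas concrete, the following minimal Python sketch (not part of the original study) fits a soft-margin linear SVM and an RBF-kernel SVM to a hypothetical toy data set using scikit-learn's SVC, which is built on LIBSVM; the toy data and all parameter values are illustrative assumptions.

    import numpy as np
    from sklearn.svm import SVC  # scikit-learn's SVC is built on LIBSVM

    # Hypothetical two-class toy data: two Gaussian clouds in 2D.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-1.0, 0.5, (50, 2)),
                   rng.normal(+1.0, 0.5, (50, 2))])
    y = np.hstack([np.zeros(50), np.ones(50)])

    # Soft-margin linear SVM: C trades margin width against training error.
    linear_svm = SVC(kernel="linear", C=1).fit(X, y)

    # Non-linear SVM: the RBF kernel implicitly maps the data into a
    # higher-dimensional space where a linear separation may exist.
    rbf_svm = SVC(kernel="rbf", gamma=1, C=1).fit(X, y)

    # Instances lying on or inside the margin become support vectors;
    # testing cost grows linearly with their number.
    print(len(linear_svm.support_vectors_), len(rbf_svm.support_vectors_))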

2.2 MCOV

MCOV [6] is a technique that narrows the space between consecutive classes of data so as to preserve the concave surface (curves inward) and the convex surface (bulges outward) of the data more accurately. The technique extends Ref. [8] to support multiclass data and was confirmed in Ref. [4] to improve classification accuracy with FFNN. It synthesizes two types of new instances, FMCOV and AMCOV, from the original instances in the middle of the decision boundary between consecutive classes of data, using the Euclidean distance function. The anatomy of both FMCOV and AMCOV is described below.

  1. An FMCOV is a synthesized vector that declares the decision boundary of the territory of an instance of one class, say an instance i of class A [denoted by Ai], against the instance of any other class, say an instance j of class X [denoted by Xj], that has the smallest Euclidean distance to Ai. Xj is designated as the paired vector of Ai [denoted by ϕ(i)]. The FMCOV of Ai [denoted by o(i)] is placed at the boundary of Ai’s territory in the direction of Xj.

  2. An AMCOV is a synthesized vector that declares the decision boundary of the paired vector of an instance, say the paired vector ϕ(i) of an instance i of class A, against that instance Ai. The AMCOV of ϕ(i) [denoted by o′(i)] is placed at the boundary of ϕ(i)’s territory, called the counter boundary, in the direction of Ai.

The Multiclass Outpost Vector Generation Algorithm is presented in Algorithm 1, followed by a minimal code sketch. Figure 3 illustrates the concept of the three-class outpost vector [6] to help explain the anatomies of FMCOV and AMCOV. In Figure 3, there are three classes of data, designated as classes A, B, and C. The top left and bottom center circles are the territory of class A. The top right and bottom left circles are the territory of class B. The bottom right circle is the territory of class C. To find the territory of each instance, the instance is modeled to span its territory as a circle (a sphere in three-dimensional space or a hypersphere in higher-dimensional space) until it collides with another territory. The territory of instance k of class A, denoted by Ak, is found by locating the instance not in class A that is nearest to Ak. In this case, B*(Ak) of class B is nearest to Ak and is referred to as Ak’s pair. The boundary of Ak’s territory is then declared at halfway between Ak and B*(Ak); consequently, the radius of Ak’s territory is set at half of the distance between Ak and B*(Ak). This choice guarantees that if B*(Ak) sets its territory using the same radius, the distance from the hyperplane to either Ak or B*(Ak) will be at the maximum. Ak then places its FMCOV against B*(Ak) at the decision boundary of Ak. The territories of B*(Ak) of class B, C*(B*(Ak)) of class C, Aj of class A, and B*(Aj) of class B are found by the same method as for Ak. After that, AMCOVs are generated from all instances as well. The AMCOV of B*(Ak) of class B against Ak of class A is placed at the counter boundary of B*(Ak) against Ak. The AMCOV of C*(B*(Ak)) of class C against B*(Ak) of class B is placed at the counter boundary of C*(B*(Ak)) against B*(Ak). The AMCOV of Aj of class A against C*(B*(Ak)) of class C is placed at the counter boundary of Aj against C*(B*(Ak)). The AMCOV of B*(Aj) of class B against Aj of class A is placed at the counter boundary of B*(Aj) against Aj.

Algorithm 1

The Multiclass Outpost Vector Generation Algorithm [6]

1: {input: the original data set (T)}
2: {output: the set of multiclass outpost vectors (MCOVs) (M)}
3: for each instance i ∈ T do
4:   find the instance ϕ(i) ∉ class(i) that has the shortest Euclidean distance to i
5:   generate an FMCOV o(i) ∈ class(i) at almost halfway between i and ϕ(i), on the territory of i, in the direction of ϕ(i)
6:   add o(i) to M
7: end for
8: for each instance i ∈ T do
9:   if ϕ(ϕ(i)) ≠ i then
10:    generate an AMCOV o′(i) ∈ class(ϕ(i)) at almost halfway between i and ϕ(i), on the territory of ϕ(i), in the direction of i
11:    add o′(i) to M
12:  end if
13: end for
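The following minimal Python sketch is one reading of Algorithm 1, not the authors' implementation; the function name generate_mcovs and the use of a kappa offset to realize "almost halfway" are assumptions (Ref. [6] defines the κ parameter precisely).

    import numpy as np

    def generate_mcovs(X, y, kappa=0.05):
        # X: (n, d) array of instances; y: (n,) array of class labels.
        # kappa: assumed offset that keeps each outpost vector slightly on
        # its owner's side of the halfway point ("almost halfway").
        n, d = X.shape
        pair = np.empty(n, dtype=int)  # index of the paired vector phi(i)
        fm, fmy, am, amy = [], [], [], []

        # Pass 1 (lines 3-7): for each i, find the nearest instance of any
        # other class, then place the FMCOV o(i) just short of halfway
        # toward phi(i), labeled with i's class.
        for i in range(n):
            others = np.where(y != y[i])[0]
            j = others[np.argmin(np.linalg.norm(X[others] - X[i], axis=1))]
            pair[i] = j
            fm.append(X[i] + (0.5 - kappa) * (X[j] - X[i]))
            fmy.append(y[i])

        # Pass 2 (lines 8-13): when i and phi(i) are not mutual pairs,
        # place the AMCOV o'(i) just short of halfway from phi(i) toward i,
        # on phi(i)'s counter boundary, labeled with phi(i)'s class.
        for i in range(n):
            j = pair[i]
            if pair[j] != i:
                am.append(X[j] + (0.5 - kappa) * (X[i] - X[j]))
                amy.append(y[j])

        return (np.asarray(fm).reshape(-1, d), np.asarray(fmy),
                np.asarray(am).reshape(-1, d), np.asarray(amy))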
Figure 3: Instances (Big Rectangles), FMCOVs (Small Rectangles), AMCOVs (Small Triangles), Instances’ Boundary (Solid Circles), Instances’ Counter Boundary (Dotted Circles) in a Two-Dimensional Three-Class Data Set.

MCOV was augmented in Ref. [7] with a postprocessor that removes the MCOVs not located close to the decision boundary between consecutive classes of data, addressing the space requirements of the FMCOV generator and the AMCOV generator.

3 Methodology

This section presents the methodology used to compare the generalization of including FMCOVs, AMCOVs, and both MCOVs in the final training sets with SVM using LIBSVM v3.20 [2]. For each data set, the following steps are performed sequentially (a minimal sketch of the pipeline follows the list).

  1. Prepare a data set for training and a data set for testing.

  2. Generate the following final training sets from the data set for training:

    1. SRaw training set contains all instances from the data set for training.

    2. SFM training set contains all FMCOVs generated from the data set for training.

    3. SAM training set contains all AMCOVs generated from the data set for training.

    4. SM training set contains all FMCOVs and AMCOVs generated from the data set for training.

    5. SRaw+FM training set contains all instances from the data set for training, and all FMCOVs generated from that data set for training.

    6. SRaw+AM training set contains all instances from the data set for training and all AMCOVs generated from that data set for training.

    7. SRaw+M training set contains all instances from the data set for training, and all FMCOVs and AMCOVs generated from that data set for training.

  3. Train SVM with each final training set from step 2 using LIBSVM.

  4. Evaluate the trained SVM from step 3 on the data set for testing and store the SVM misclassification rate.
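Below is a minimal Python sketch of this pipeline, assuming the hypothetical generate_mcovs() helper sketched in Section 2.2 and scikit-learn's LIBSVM-based SVC standing in for the MATLAB/LIBSVM toolchain; the SVC parameters mirror Section 4.1.

    import numpy as np
    from sklearn.svm import SVC  # LIBSVM-based stand-in for MATLAB + LIBSVM

    def evaluate_data_set(X_tr, y_tr, X_te, y_te, kappa=0.05):
        # Step 2: build the seven final training sets from the training data.
        fm, fmy, am, amy = generate_mcovs(X_tr, y_tr, kappa)
        final_sets = {
            "SRaw":    (X_tr, y_tr),
            "SFM":     (fm, fmy),
            "SAM":     (am, amy),
            "SM":      (np.vstack([fm, am]), np.hstack([fmy, amy])),
            "SRaw+FM": (np.vstack([X_tr, fm]), np.hstack([y_tr, fmy])),
            "SRaw+AM": (np.vstack([X_tr, am]), np.hstack([y_tr, amy])),
            "SRaw+M":  (np.vstack([X_tr, fm, am]),
                        np.hstack([y_tr, fmy, amy])),
        }
        # Steps 3-4: train an SVM on each final training set and record its
        # misclassification rate on the data set for testing.
        rates = {}
        for name, (Xf, yf) in final_sets.items():
            svm = SVC(kernel="rbf", C=1, gamma=1, tol=1e-3,
                      cache_size=100, shrinking=True).fit(Xf, yf)
            rates[name] = 1.0 - svm.score(X_te, y_te)
        return rates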

4 Experiments

This section compares the generalization of including FMCOVs, AMCOVs, and both MCOVs in the final training sets with SVM using LIBSVM v3.20 [2].

4.1 Machine Learning

The experiments were carried out using MATLAB R2015a and LIBSVM v3.20 [2]. The following parameters were used (an equivalent LIBSVM training call is sketched after the list):

  • SVM type=C-SVC (multiclass classification)

  • Kernel function=radial basis function

  • Degree in Kernel function=3

  • Gamma in Kernel function=1

  • Coefficient 0 in Kernel function=0

  • Cost of C-SVC=1

  • Weight of C-SVC=1

  • Cache memory size=100 MB

  • Tolerance of termination criterion=0.001

  • Shrinking heuristics=1

  • Probability estimates=0
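For reference, the same configuration can be written as a LIBSVM option string. The sketch below uses svmutil, the Python interface bundled with LIBSVM; the file names are hypothetical placeholders.

    from svmutil import svm_read_problem, svm_train, svm_predict

    # -s 0 selects C-SVC and -t 2 the RBF kernel; the remaining flags mirror
    # the list above (degree, gamma, coef0, cost, cache size, tolerance,
    # shrinking, probability estimates). A class weight of 1 is the default.
    options = '-s 0 -t 2 -d 3 -g 1 -r 0 -c 1 -m 100 -e 0.001 -h 1 -b 0'

    y_train, x_train = svm_read_problem('train.libsvm')  # hypothetical file
    model = svm_train(y_train, x_train, options)
    y_test, x_test = svm_read_problem('test.libsvm')     # hypothetical file
    labels, accuracy, values = svm_predict(y_test, x_test, model)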

4.2 Data Sets

Three groups of data sets were used in the experiments:

  1. Six non-overlapping four-class synthetic data sets

  2. Eight highly overlapping two-class synthetic data sets from the ELENA project [9]

  3. Six multiclass real-world data sets from the UCI machine learning repository [1]

Table 1 presents the characteristics of the six non-overlapping four-class synthetic data sets. These data sets were used to demonstrate how MCOVs can narrow the space between consecutive classes of multiclass data. They were generated by the same algorithm but with different population sizes. There were four classes of data, designated as red, green, blue, and magenta points in a two-dimensional Euclidean space determined by two coordinates. The data sets for training consisted of 100, 200, 400, 800, 1600, and 3200 instances per class, which constituted 400, 800, 1600, 3200, 6400, and 12,800 instances in the six data sets, respectively. The data set for training having 3200 instances per class is depicted in Figure 4A. Figure 4B, C, and D depict the data set for training having 3200 instances per class with FMCOVs, AMCOVs, and both FMCOVs and AMCOVs, respectively.

Table 1

Characteristics of the Non-overlapping Four-Class Synthetic Data Sets.

Data set  | Type      | Data type | No. of classes | No. of dimensions | Training instances | Testing instances
Sine 100  | Bivariate | Integer   | 4              | 2                 | 400                | 25,600
Sine 200  | Bivariate | Integer   | 4              | 2                 | 800                | 25,600
Sine 400  | Bivariate | Integer   | 4              | 2                 | 1600               | 25,600
Sine 800  | Bivariate | Integer   | 4              | 2                 | 3200               | 25,600
Sine 1600 | Bivariate | Integer   | 4              | 2                 | 6400               | 25,600
Sine 3200 | Bivariate | Integer   | 4              | 2                 | 12,800             | 25,600
Figure 4: (A) A Four-Class Synthetic Data Set with 3200 Samples per Class with (B) FMCOVs, (C) AMCOVs, and (D) Both FMCOVs and AMCOVs [9].

Table 2 presents the characteristics of the eight highly overlapping two-class synthetic data sets from the ELENA project [9]: the “Clouds,” “Gaussian 2D,” “Gaussian 3D,” “Gaussian 4D,” “Gaussian 5D,” “Gaussian 6D,” “Gaussian 7D,” and “Gaussian 8D” data sets. These data sets were used to evaluate how MCOVs can improve SVM classification accuracy on two-class data sets having a heavy intersection of the class distributions, a high degree of non-linearity of the class boundaries, and various dimensionalities. The data sets for training and testing of each data set were generated by four-fold cross validation (a minimal sketch of this split follows Figure 6). The “Clouds” data set for training, its FMCOVs, AMCOVs, and both FMCOVs and AMCOVs are depicted in Figure 5A, B, C, and D, respectively. The “Gaussian 2D” data set for training, its FMCOVs, AMCOVs, and both FMCOVs and AMCOVs are depicted in Figure 6A, B, C, and D, respectively.

Table 2

Characteristics of the Highly Overlapping Two-Class Synthetic Data Sets from the ELENA Project.

Data set    | Type      | Data type | No. of classes | No. of dimensions | Training instances | Testing instances
Clouds      | Bivariate | Float     | 2              | 2                 | 3750               | 1250
Gaussian 2D | Bivariate | Float     | 2              | 2                 | 3750               | 1250
Gaussian 3D | Bivariate | Float     | 2              | 3                 | 3750               | 1250
Gaussian 4D | Bivariate | Float     | 2              | 4                 | 3750               | 1250
Gaussian 5D | Bivariate | Float     | 2              | 5                 | 3750               | 1250
Gaussian 6D | Bivariate | Float     | 2              | 6                 | 3750               | 1250
Gaussian 7D | Bivariate | Float     | 2              | 7                 | 3750               | 1250
Gaussian 8D | Bivariate | Float     | 2              | 8                 | 3750               | 1250
Figure 5: (A) “Clouds” Highly Overlapping Two-Class Synthetic Data Set Having 3750 Samples. (B) FMCOVs. (C) AMCOVs. (D) Both FMCOVs and AMCOVs [9].

Figure 6: (A) “Gaussian 2D” Highly Overlapping Two-Class Synthetic Data Set Having 3750 Samples. (B) FMCOVs. (C) AMCOVs. (D) Both FMCOVs and AMCOVs [9].
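A minimal Python sketch of the four-fold split, using scikit-learn's KFold on a hypothetical stand-in for one ELENA data set; shuffling and the random seed are assumptions. Each fold holds out 1250 of the 5000 samples for testing, leaving 3750 for training, matching Table 2.

    import numpy as np
    from sklearn.model_selection import KFold

    # Hypothetical stand-in for one ELENA data set: 5000 samples, 2 features.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5000, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)

    # Four folds: each holds out 1250 instances for testing (see Table 2).
    for train_idx, test_idx in KFold(n_splits=4, shuffle=True,
                                     random_state=0).split(X):
        X_tr, y_tr = X[train_idx], y[train_idx]
        X_te, y_te = X[test_idx], y[test_idx]
        # ... build the seven final training sets and evaluate (Section 3) ...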

Table 3 presents the characteristics of the six multiclass real-world data sets from the UCI machine learning repository [1]: the “Adult Income,” “Statlog (Landsat Satellite),” “Statlog (Shuttle Landing Control),” “Forest Cover Type,” “Pen-Based Recognition of Handwritten Digits,” and “Optical Recognition of Handwritten Digits” data sets. These data sets were used to evaluate how MCOVs can improve SVM classification accuracy on multiclass real-world data sets of various complexities, including the number of classes and the number of features or dimensions. The data set for training and the data set for testing of each data set were provided.

Table 3

Characteristics of the Six Multiclass Real-World Data Sets from the UCI Machine Learning Repository.

Data set          | Type         | Data type           | No. of classes | No. of dimensions | Training instances | Testing instances
Adult income      | Multivariate | Integer/Categorical | 2              | 14                | 32,561             | 16,281
Statlog (Landsat) | Multivariate | Integer             | 6              | 36                | 4435               | 2000
Statlog (Shuttle) | Multivariate | Integer             | 7              | 9                 | 43,500             | 14,500
Forest cover type | Multivariate | Integer/Categorical | 7              | 54                | 58,104             | 522,908
Pen-based recog.  | Multivariate | Integer             | 10             | 16                | 7494               | 3498
Optical recog.    | Multivariate | Integer             | 10             | 64                | 3823               | 1797

The κ parameter [6] used in the FMCOV generator and the AMCOV generator was set to 5% for all final training sets (a brief usage sketch follows Table 4). Table 4 summarizes the components of the final training sets.

Table 4

Components of the Final Training Sets.

Final training set | Data set (raw data) | FMCOVs | AMCOVs
SRaw               | ✓                   |        |
SFM                |                     | ✓      |
SAM                |                     |        | ✓
SM                 |                     | ✓      | ✓
SRaw+FM            | ✓                   | ✓      |
SRaw+AM            | ✓                   |        | ✓
SRaw+M             | ✓                   | ✓      | ✓
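In terms of the hypothetical generate_mcovs() sketch from Section 2.2, this setting corresponds to the following call; X_train and y_train are placeholders for one data set for training.

    # kappa = 5%, as used for all final training sets in the experiments
    fm, fmy, am, amy = generate_mcovs(X_train, y_train, kappa=0.05)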

4.3 Experimental Results

Based on the methodology presented in Section 3, the SVM configuration presented in Section 4.1, and the data sets presented in Section 4.2, the following experimental results were produced.

Table 5 presents the SVM misclassification rates on the six non-overlapping four-class synthetic data sets. For this group of data sets:

  • The final training sets having raw data and FMCOVs (SRaw+FM) and the final training sets having raw data and both FMCOVs and AMCOVs (SRaw+M) yield the lowest and the second-lowest SVM misclassification rates, respectively.

  • The final training sets having raw data and FMCOVs (SRaw+FM), the final training sets having raw data and AMCOVs (SRaw+AM), and the final training sets having raw data and both FMCOVs and AMCOVs (SRaw+M) yield a higher level of SVM classification accuracy than that of the final training sets having raw data only (SRaw) and the final training sets having only some types of MCOVs (SFM, SAM, SM).

Table 5

SVM Misclassification Rates on the Six Non-overlapping Four-Class Synthetic Data Sets.

Data set  | SRaw   | SFM    | SAM    | SM     | SRaw+FM | SRaw+AM | SRaw+M
Sine 100  | 0.7141 | 0.7464 | 0.7136 | 0.7406 | 0.4261  | 0.5501  | 0.4837
Sine 200  | 0.7037 | 0.7450 | 0.7092 | 0.7432 | 0.3141  | 0.4354  | 0.3852
Sine 400  | 0.6732 | 0.7448 | 0.6795 | 0.7430 | 0.1772  | 0.2706  | 0.2033
Sine 800  | 0.6749 | 0.7459 | 0.6734 | 0.7443 | 0.1265  | 0.1764  | 0.1288
Sine 1600 | 0.6540 | 0.7444 | 0.6541 | 0.7369 | 0.0768  | 0.1262  | 0.0788
Sine 3200 | 0.6482 | 0.7439 | 0.6488 | 0.7434 | 0.0639  | 0.0966  | 0.0649

Bold values indicate the lowest misclassification rate. Underlined values indicate the second-lowest misclassification rate.

Table 6 presents the SVM misclassification rates on the eight highly overlapping two-class synthetic data sets. For this group of data sets:

  • Both the final training sets having raw data only (SRaw) and the final training sets having raw data and FMCOVs (SRaw+FM) yield the lowest SVM misclassification rate.

  • An inclusion of AMCOVs on the final training sets (SAM, SRaw+AM, or SM) can adversely affect the level of SVM classification accuracy.

Table 6

SVM Misclassification Rates on the Eight Highly Overlapping Two-Class Synthetic Data Sets.

Data set    | SRaw   | SFM    | SAM    | SM     | SRaw+FM | SRaw+AM | SRaw+M
Clouds      | 0.1224 | 0.8872 | 0.3136 | 0.8872 | 0.1160  | 0.3272  | 0.1472
Gaussian 2D | 0.2632 | 0.7344 | 0.6200 | 0.7368 | 0.2680  | 0.6144  | 0.2696
Gaussian 3D | 0.2176 | 0.7848 | 0.4368 | 0.7840 | 0.2184  | 0.4240  | 0.2488
Gaussian 4D | 0.1912 | 0.8080 | 0.3832 | 0.7976 | 0.1912  | 0.3768  | 0.2328
Gaussian 5D | 0.1512 | 0.8320 | 0.2592 | 0.8000 | 0.1496  | 0.3200  | 0.1856
Gaussian 6D | 0.1352 | 0.8520 | 0.1928 | 0.5288 | 0.1360  | 0.2784  | 0.1592
Gaussian 7D | 0.1176 | 0.8504 | 0.1440 | 0.3352 | 0.1016  | 0.2224  | 0.1072
Gaussian 8D | 0.1128 | 0.8664 | 0.1328 | 0.2440 | 0.1080  | 0.2136  | 0.1192

Bold values indicate the lowest misclassification rate. Underlined values indicate the second-lowest misclassification rate.

Table 7 presents the SVM misclassification rates on the six multiclass real-world data sets. For this group of data sets:

  • The final training sets having raw data and FMCOVs (SRaw+FM) yield the lowest SVM misclassification rates.

  • The experimental results from the “Adult Income,” “Statlog (Shuttle Landing Control),” and “Optical Recognition of Handwritten Digits” data sets show that the final training sets having raw data and FMCOVs (SRaw+FM) can significantly improve SVM classification accuracy.

  • The experimental results from the “Forest Cover Type” and “Pen-Based Recognition of Handwritten Digits” data sets show that the final training sets having raw data and FMCOVs (SRaw+FM) can slightly improve SVM classification accuracy.

  • The experimental results from the “Statlog (Landsat Satellite)” data set show that the final training sets having raw data and both FMCOVs and AMCOVs (SRaw+M) can slightly improve SVM classification accuracy.

Table 7

SVM Misclassification Rates on the Six Multiclass Real-World Data Sets.

Data set          | SRaw   | SFM    | SAM    | SM     | SRaw+FM | SRaw+AM | SRaw+M
Adult income      | 0.2361 | 0.2362 | 0.7638 | 0.7633 | 0.2302  | 0.7638  | 0.7638
Statlog (Landsat) | 0.7685 | 0.8945 | 0.7640 | 0.7650 | 0.7655  | 0.7650  | 0.7355
Statlog (Shuttle) | 0.1872 | 0.8394 | 0.1778 | 0.1699 | 0.0168  | 0.0159  | 0.0161
Forest cover type | 0.5123 | 0.6353 | 0.5122 | 0.5121 | 0.5021  | 0.5122  | 0.5122
Pen-based recog.  | 0.8954 | 0.8959 | 0.8959 | 0.8959 | 0.8899  | 0.8959  | 0.8959
Optical recog.    | 0.8470 | 0.8987 | 0.8809 | 0.8804 | 0.4708  | 0.7129  | 0.5988

Bold values indicate the lowest misclassification rate. Underlined values indicate the second-lowest misclassification rate.

The experimental results show that including FMCOVs in the final training sets having raw data (SRaw+FM) can improve SVM classification accuracy. Hence, it can be concluded that MCOV is applicable to both FFNN and SVM.

5 Conclusions

MCOV is a technique that preserves the contour of a data set. It has been used to improve the classification accuracy of FFNN. It synthesizes two types of new instances, called FMCOV and AMCOV, from the data set in the middle of the decision boundary between consecutive classes of data. This paper compared the generalization of including FMCOVs, AMCOVs, and both MCOVs in the final training sets with SVM. The experiments were carried out using MATLAB R2015a and LIBSVM v3.20, a multiclass SVM library, on six non-overlapping synthetic data sets, eight highly overlapping synthetic data sets from the ELENA project, and six multiclass real-world data sets from the UCI machine learning repository. The training and testing sets for these three groups were, respectively, generated separately, generated by four-fold cross validation, and provided with the data sets. For each data set for training, seven types of final training sets were generated: (i) raw data, (ii) FMCOVs, (iii) AMCOVs, (iv) MCOVs, (v) raw data+FMCOVs, (vi) raw data+AMCOVs, and (vii) raw data+MCOVs. The experimental results confirm that including FMCOVs in the final training sets having raw data (SRaw+FM) can improve SVM classification accuracy.

As a result, it can be concluded that MCOV can be applied with both FFNN and SVM.


Corresponding author: Piyabute Fuangkhon, Department of Business Information Systems, Assumption University, Samut Prakan 10540, Kingdom of Thailand.

Bibliography

[1] K. Bache and M. Lichman, UCI Machine Learning Repository, 2015. Available at http://archive.ics.uci.edu/ml. Accessed 1 January 2015.

[2] C. C. Chang and C. J. Lin, LIBSVM: a library for support vector machines, ACM Trans. Intell. Syst. Technol. 2 (2011), 1–27. doi:10.1145/1961189.1961199.

[3] C. Cortes and V. Vapnik, Support-vector networks, Mach. Learn. 20 (1995), 273–297. doi:10.1007/BF00994018.

[4] P. Fuangkhon, An incremental learning preprocessor for feed-forward neural network, Artif. Intell. Rev. 41 (2014), 183–210. doi:10.1007/s10462-011-9304-0.

[5] P. Fuangkhon and T. Tanprasert, An outpost vector placement evaluation of an incremental learning algorithm for support vector machine, in: Proc. Int. Joint Conf. on Neural Networks (IJCNN’11), pp. 254–261, 2011. doi:10.1109/IJCNN.2011.6033229.

[6] P. Fuangkhon and T. Tanprasert, Multi-class contour preserving classification, in: Proc. Int. Conf. on Intelligent Data Engineering and Automated Learning (IDEAL’12), pp. 35–42, 2012. doi:10.1007/978-3-642-32639-4_5.

[7] P. Fuangkhon and T. Tanprasert, Reduced multi-class contour preserving classification, Neural Process. Lett. (2015). doi:10.1007/s11063-015-9446-1.

[8] T. Tanprasert, C. Tanprasert and C. Lursinsap, Contour preserving classification for maximal reliability, in: Proc. Int. Joint Conf. on Neural Networks (IJCNN’98), pp. 1125–1130, 1998. doi:10.1109/IJCNN.1998.685930.

[9] M. Verleysen, E. Bodt and V. Wertz, UCL Enhanced Learning for Evolutive Neural Architectures, 2015. Available at https://www.elen.ucl.ac.be/neural-nets/Research/Projects/ELENA/elena.htm. Accessed 1 January 2015.

Received: 2015-8-14
Published Online: 2016-2-27
Published in Print: 2017-4-1

©2017 Walter de Gruyter GmbH, Berlin/Boston

This article is distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
