Abstract

Early diagnosis of Alzheimer’s disease helps a doctor choose a treatment appropriate to the patient’s disease stage. Existing deep learning methods for Alzheimer’s classification are limited by overfitting, and optimization-based feature selection methods are easily trapped in local optima and converge poorly. In this research, Differential Evolution-Multiclass Support Vector Machine (DE-MSVM) is proposed to increase the performance of Alzheimer’s classification. Image normalization is applied to enhance image quality and represent the features effectively. The AlexNet model extracts features from the normalized images, and these features are passed to feature selection. The Differential Evolution method applies the Pareto Optimal Front for nondominated feature selection, which helps to select the features that represent the characteristics of the input images. The selected features are passed to the MSVM, which maps them to a high-dimensional space and classifies Alzheimer’s. The DE-MSVM method has an accuracy of 98.13% on the axial slice, whereas the existing whale optimization with MSVM has 95.23% accuracy.

1. Introduction

Alzheimer’s disease (AD) is a degenerative cognitive disorder that leads to dementia and is considered a mental and physical disability. AD has stages such as very mild, mild, and moderate dementia, and a patient’s treatment depends on the stage of AD [1]. AD is one of the fastest-growing and most challenging diseases to treat, and it affects the livelihood of not just patients but also their close family members, nurses, and caregivers. Magnetic Resonance Imaging (MRI) and Positron Emission Tomography (PET) scans are common imaging techniques for analyzing Alzheimer’s [2]. Mild Cognitive Impairment (MCI) is the transitional state of AD, and classifying the stages is necessary so that therapeutic measures can delay disease progression [3]. Clinical neuroimaging techniques such as MRI and PET scans are suitable for analyzing brain changes as MCI and AD progress [4]. Structural MRI scans provide detailed information on the anatomical structures of the brain and can detect and measure the brain atrophy patterns of AD [5].

Machine learning methods help in analyzing high-dimensional data and in automated classification by learning the complex structural patterns in different imaging modalities. Feature extraction supplies the features on which classification algorithms are trained to build predictive models that are useful for clinical processes [6]. Various methods, such as deep learning networks and machine learning methods, have been applied to classify AD early on an individual basis [7]. Predefined features like voxel and regional measures were obtained from image preprocessing pipelines and combined with classifiers such as random forests or Support Vector Machines (SVM) [8]. Existing methods for Alzheimer’s classification have the limitations of overfitting and imbalanced data [9, 10]. Existing feature selection methods are also easily trapped in local optima and therefore select some irrelevant features. Solving the local optima problem improves the selection of relevant features, which in turn improves the learning of feature differences and the classification performance. The proposed DE-MSVM method applies a threshold value to escape from local optima, and a set of samples is selected to fine-tune the model for classwise training. The contributions of the proposed DE-MSVM method are as follows:

(1) A multiobjective optimization method is proposed to select the features based on the data instances and classes of the training data. This helps to fine-tune the model to learn the difference between the features and improves the classification performance.

(2) The feature learning and feature selection method provides higher performance in three-slice classification. The proposed method provides higher performance on MRI and PET images.

(3) The proposed DE-MSVM method has an accuracy of 98.3%, and the existing RFE-GA-SVM method has 95.79% accuracy.

Alzheimer’s disease diagnosis is an integral part of clinical assessment and is usually carried out on MRI and PET images. Various models have been proposed for Alzheimer’s disease classification to improve its reliability. Some notable studies on Alzheimer’s classification are surveyed in this section for a better understanding of existing methods.

Basaia et al. [11] applied Convolutional Neural Network (CNN) for the classification of Alzheimer’s into 3 classes on MRI images. The ADNI dataset was used to test the performance of the proposed CNN model in Alzheimer’s classification. The CNN has higher accuracy in Alzheimer’s classification without feature engineering. Ramzan et al. [12] applied the deep learning method of ResNet-18 architecture to improve the efficiency of Alzheimer’s classification. The model was fully trained from scratch and performed transfer learning, and an extended network architecture was applied to fine-tune the model. The ADNI benchmark dataset was used to test the model performance in Alzheimer’s classification. The ResNet-18 model has a higher performance than existing methods in Alzheimer’s classification. Naz et al. [13] applied freeze features from ImageNet for binary and ternary classification for Alzheimer’s classification. Various CNN models such as VGG, InceptionResNet, Inception v3, DenseNet, ResNet, GoogLeNet, and AlexNet were applied with freeze features to test the performance. The ADNI dataset was applied to test the performance of the proposed method in Alzheimer’s classification. The result shows that the VGG model with freeze features has a higher performance in Alzheimer’s classification. Janghel and Rathore [14] applied VGG-16 architecture of the CNN model with conversion and resizing of images for feature extraction. The SVM, k means clustering, and decision tree were applied for the classification of Alzheimer’s. The developed method was tested on MRI and PET images of the ADNI dataset for Alzheimer’s classification. The developed method has higher performance on MRI and PET images compared to the existing CNN model. González et al. [15] applied the framework of preprocessing, feature extraction, and classification for Alzheimer’s classification in three datasets. The preprocessing involves unified segmentation of the input images. 
The Neuromorphometrics, Hammers, and atlas-based features were extracted for the feature extraction. The linear SVM, logistic regression, and random forest were used for Alzheimer’s classification based on the extracted features. The result shows that the proposed method has higher performance in Alzheimer’s classification than existing methods.

AbdulAzeem et al. [16] applied an end-to-end framework of the CNN model for Alzheimer’s classification in MRI images. The ADNI dataset was used in this study to test the developed model in binary and multiclass classification. The Glorot Uniform Weight Initializer was applied to enable the weight in the activation function to prevent the network from starting from the saturated region. The Adam optimizer was applied to fine-tune the model to improve the performance of Alzheimer’s classification. The developed model has a higher performance in Alzheimer’s classification compared to existing methods. Eitel and Ritter [17] applied four attributes such as occlusion, layerwise relevance propagation, guided backpropagation, and gradient input with the CNN model for Alzheimer’s classification. The developed method was applied in the MRI ADNI dataset to test the performance of Alzheimer’s classification. Buvaneswari and Gayathri [18] applied a deep learning-based SegNet method for segmentation, and ResNet-101 was applied for Alzheimer’s classification. The seven morphological characteristics were extracted for feature extraction. The developed ResNet-based model has higher performance in Alzheimer’s classification compared to the existing method. Alam et al. [19] applied the Twin SVM model with Dual-Tree Complex Wavelet Transform (DTCWT) for Alzheimer’s classification. The developed method has higher efficiency on the ADNI and OASIS datasets for Alzheimer’s classification. Deep learning-based models in Alzheimer’s classification suffer from the limitations of the overfitting problem and generate more irrelevant features for the classification. Some of the researchers were involved in applying the feature selection and SVM-based model for Alzheimer’s classification to overcome the overfitting problem.

Asim et al. [20] applied Principal Component Analysis (PCA) for feature reduction and the SVM method for Alzheimer’s classification. Evaluated on the ADNI dataset, the developed method shows higher efficiency in Alzheimer’s classification than the existing method. Shakarami et al. [21] proposed the AlexNet-SVM model to improve the accuracy of Alzheimer’s classification. The developed method was tested on the PET images from the ADNI dataset and has higher performance than the existing methods. Zeng et al. [22] applied the switching delay PSO method to optimize the SVM parameters for Alzheimer’s classification. The PCA method was applied for feature reduction, and SVM was applied for classification. The developed method was tested on the ADNI dataset and showed higher performance in Alzheimer’s classification. Neffati et al. [23] applied downsized kernel Principal Component Analysis and a Multiclass SVM model for Alzheimer’s classification. The developed method was tested on the OASIS MRI dataset and showed higher performance. Divya et al. [24] applied Recursive Feature Elimination (RFE) and Genetic Algorithm (GA) for feature selection and SVM for Alzheimer’s classification. The developed method has higher performance on the ADNI dataset than the existing methods. Nanni et al. [25] applied various texture descriptors and SVM for Alzheimer’s classification. The MRI ADNI dataset was used to test the performance. The texture features and SVM method improve the performance of Alzheimer’s classification over the existing methods. The SVM-based models suffer from the imbalanced data problem, which affects the classification performance.

Reddy et al. [26] applied an adaptive Genetic Algorithm with fuzzy logic for the prediction of heart disease at early stage. The developed model consists of a rough set-based heart disease feature selection method and a fuzzy rule-based classification model. Rough set theory selects the important features for heart disease classification, and the selected features were applied for the heart disease classification. Gadekallu et al. [27] applied Principal Component Analysis (PCA), Grey Wolf Optimization (GWO), and Deep Neural Network (DNN) for the classification of diabetic retinopathy. The standard scaler normalization method is applied to normalize the input dataset. The PCA method performs feature reduction in the input dataset, and GWO selects the optimal parameter for the DNN model.

Table 1 summarizes the related work on Alzheimer’s classification by taxonomy.

This paper is organized as follows: the DE-MSVM method, normalization, and AlexNet feature extraction are explained in Section 2, and the Multiclass Support Vector Machine is described in Section 3. The simulation setup of the proposed method is given in Section 4, and the results of the proposed model in Alzheimer’s classification are given in Section 5. The conclusion of this research is presented in Section 6.

2. Proposed Method

In this research, the DE-MSVM method is proposed to increase the performance of Alzheimer’s classification. The ADNI datasets were used to test the performance of the proposed DE-MSVM and existing methods for Alzheimer’s classification. The normalization is applied to enhance the quality of the images and applied for feature extraction. The AlexNet model is applied to extract the deep features from the input images. The Differential Evolution method is applied to select the relevant features from the extracted features. The MSVM model is applied for the classification based on the selected features. The block diagram of the proposed DE-MSVM model is shown in Figure 1.

2.1. Normalization

Differences in the original data affect the classification performance, and image data usually have different intensities. The min–max normalization technique is applied to rescale the image intensity, which provides clearer information and improves classifier performance. The min–max intensity normalization is shown in the following equation:

$$I_{\text{norm}} = \frac{I - I_{\min}}{I_{\max} - I_{\min}} \tag{1}$$

where $I_{\min}$ denotes the image minimum intensity, $I_{\max}$ denotes the image maximum intensity, $I_{\text{norm}}$ denotes the normalized image, and $I$ denotes the input image.
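The min–max normalization described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors’ MATLAB implementation; the function name `min_max_normalize`, the nested-list image representation, and the handling of a constant image are assumptions made for this example.

```python
def min_max_normalize(image):
    """Rescale a 2-D image (a list of rows) to the range [0, 1]."""
    flat = [pixel for row in image for pixel in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # Assumption: a constant image has no contrast, so map it to all zeros
        # instead of dividing by zero.
        return [[0.0 for _ in row] for row in image]
    span = hi - lo
    return [[(pixel - lo) / span for pixel in row] for row in image]


# Hypothetical 2 x 2 intensity patch
image = [[50, 100], [150, 250]]
normalized = min_max_normalize(image)
```

After normalization, the darkest pixel maps to 0 and the brightest to 1, so images acquired with different intensity ranges become comparable before feature extraction.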

2.2. Laplacian Redecomposition for Multimodal Medical Image Fusion

The LRD scheme decomposes the source images into LSI images and into overlapping and nonoverlapping domain images. The LEM fusion rule [28–30] is used to fuse the LSI, and two fusion rules, OD and NOD, are used to fuse the overlapping and nonoverlapping domain images, respectively. The IRS fusion rule is applied to reconstruct the HSI fusion image, and the inverse Laplacian transform is applied to reconstruct the final fusion image.

2.3. LEM Fusion Rule

Texture details are information of interest in anatomical images, and low-frequency information is present in some functional images. The LEM is defined as the square of the sum of local window pixels [28]. The square operation leads to unstable energy acquisition due to differences in functional and anatomical images. The square operation in normalized pixels’ range is smaller if the sum of pixels is less than 1 and larger if the sum of pixels is greater than 1. This affects the accurate acquisition of decision results. So, the direct addition operation is applied instead of the square operation in the following equation:where filtering template is denoted as and local window sizes are represented as and . Then, calculate the maximum value of as and this defines LEM, as given in the following equation:where is denoted as binary decision graph and is used to construct an LSI fusion image, as given in equations (4) and (5):where reverse operator in range of 0 to 1 is denoted as.

2.4. OD Fusion Rule

The OD fusion rule lets the overlapping domain fusion image carry more useful information and involves three main steps: (1) a Local Decision Maximum (LDM) based on MLD and LEM is applied to mark the edges and details of anatomical images; (2) another LDM based on the LEM and MLD decision scheme is applied to mark abnormal areas in functional images; (3) a binary decision graph is built by comparing the sizes of the two LDMs, and the fusion image of the overlapping domain is obtained. The advantage of the OD rule is that it fuses the feature information of anatomical images with that of functional images. The algorithm description is given as follows.

Anatomical and functional images of LDM are calculated using LEM and MLD, as given in the following equation:where the anatomical image of LDM is denoted as and MLD obtains , while LEM obtains .

Equations (7) and (8) measure .where LDM value of the functional image is denoted as and binary decision graph is denoted as .

The fusion rule is given in equations (9) and (10).where binary decision graph is denoted as and anatomical and functional overlapping domain images are denoted as and . The overlapping domain fusion image is denoted as .

2.5. NOD Fusion Rule

According to DGR nonoverlapping domain definition, the nonoverlapping domain of fusion image is obtained using the following fusion algorithm, as given in the following equation:where and denote anatomical and functional nonoverlapping domain images. The fusion image of the nonoverlapping domain is denoted as .

2.6. IRS Fusion Rule

The high-frequency subband fusion image is reconstructed using the IRS fusion rule to eliminate image artifacts. Because complementary and redundant information are fused separately in the HSI, reconstructing the high-frequency subband fusion image produces artifacts. To solve this problem, the pixels surrounding the overlapping domain are replaced, and two global decision graphs are constructed. The first decision graph completes the reconstruction task, and the second decision graph is combined with the local mean algorithm to eliminate artifacts.

2.7. Reconstructed Fused Image

Fused medical images are obtained by reconstruction of Laplacian multiscale. The traditional inverse Laplacian transform is applied to reconstruct fused images, and the decomposition process of inverse operation obtains the fusion image. The fusion image is measured in the following equation:

2.8. AlexNet Feature Extraction

The AlexNet is a deep learning technique applied for feature extraction. A fully connected layer of AlexNet CNN is applied for the feature extraction from the fused image. The AlexNet CNN consists of 22 layers of feature extractor based on transfer learning technique, plus a fully connected (FC) layer with 1 × 1 × 64 dimensions [21]. Extracted features were applied to the Differential Evolution method for feature selection.

2.9. Multiobjective Differential Evolution for Feature Selection

Differential Evolution (DE) has a population of solutions for and , where decision variables are represented by , the number of vectors is denoted as , and the number of vector elements is denoted as . The MOOP concept is explained before the DEMO algorithm discussion in detail [31, 32].
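To illustrate how one DE generation builds trial vectors from a population, the following minimal Python sketch uses the common DE/rand/1/bin strategy. The strategy, the scale factor `F`, the crossover rate `CR`, the bounds, and the function name `de_trial_vectors` are assumptions for illustration; the paper does not state which DE variant or parameter values were used.

```python
import random


def de_trial_vectors(population, F=0.5, CR=0.9, lower=0.0, upper=1.0, rng=None):
    """Create one trial vector per population member (DE/rand/1/bin)."""
    rng = rng or random.Random()
    n, dim = len(population), len(population[0])
    trials = []
    for i in range(n):
        # Pick three distinct members, all different from the target vector i.
        r1, r2, r3 = rng.sample([k for k in range(n) if k != i], 3)
        a, b, c = population[r1], population[r2], population[r3]
        j_rand = rng.randrange(dim)  # forces at least one mutated element
        trial = []
        for j in range(dim):
            if rng.random() < CR or j == j_rand:
                value = a[j] + F * (b[j] - c[j])       # differential mutation
                value = min(max(value, lower), upper)  # clip to the search bounds
            else:
                value = population[i][j]               # inherit from the target
            trial.append(value)
        trials.append(trial)
    return trials


# Hypothetical population: 6 vectors with 5 decision variables each
rng = random.Random(0)
population = [[rng.random() for _ in range(5)] for _ in range(6)]
trials = de_trial_vectors(population, rng=rng)
```

For feature selection, one possible encoding (again an assumption) treats each vector element as a score for one extracted feature and marks the feature as selected when the score exceeds a threshold.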

The MOOP has a number of objective functions that are either minimized or maximized. Several constraints are needed to be satisfied during optimization, and MOOP is given in equations (13) and (14).where and .

Subject to ,where , , , and .

The number of objective functions is denoted as , the number of inequality constraints is denoted as , the number of equality constraints is denoted as , and lower and upper boundaries of search space are denoted as and .
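For reference, a MOOP is commonly written in the general form below. The symbol names ($f_m$, $g_j$, $h_k$, $M$, $J$, $K$, $D$, $x_i^{(L)}$, $x_i^{(U)}$) and the direction of the inequality constraint are assumptions chosen for this sketch and may differ from the authors’ notation in equations (13) and (14).

```latex
\begin{aligned}
\text{minimize or maximize} \quad & f_m(\mathbf{x}), && m = 1, \dots, M \\
\text{subject to} \quad & g_j(\mathbf{x}) \ge 0, && j = 1, \dots, J \\
& h_k(\mathbf{x}) = 0, && k = 1, \dots, K \\
& x_i^{(L)} \le x_i \le x_i^{(U)}, && i = 1, \dots, D
\end{aligned}
```

Here $M$ is the number of objective functions, $J$ and $K$ are the numbers of inequality and equality constraints, and $x_i^{(L)}$ and $x_i^{(U)}$ are the lower and upper boundaries of the search space for the $i$-th decision variable.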

Decision space is divided by constraints into infeasible and feasible regions, where objective space is represented by feasible region . The objective functions values are determined in the multidimensional space of objective space. This finds the point in the objective space for each feasible solution .

The quality of a solution in the objective space is measured by dominance. A solution dominates another solution when it is no worse in all objectives and strictly better in at least one objective.

The Pareto Optimal Front (POF) is the set of nondominated solutions, and it is used for selection in this method. The POF of the MOOP is determined in the decision space. After mutation and crossover, the population grows beyond its original size, and nondominated selection is used to truncate it back to the original population size.
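The dominance relation and the extraction of the nondominated set can be sketched as follows. The example assumes that every objective is minimized and uses hypothetical objective pairs; the helper names `dominates` and `pareto_front` are illustrative and are not taken from the paper.

```python
def dominates(a, b):
    """True if objective vector a dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]


# Hypothetical candidates: (fraction of features used, classification error)
candidates = [(0.2, 0.10), (0.5, 0.05), (0.6, 0.08), (0.2, 0.12), (0.9, 0.05)]
front = pareto_front(candidates)
```

In this example only `(0.2, 0.10)` and `(0.5, 0.05)` remain: each of the other candidates is matched or beaten on both objectives by one of these two.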

The MOOP of feature selection uses two objective functions . The function is given inwhere a maximum number of features in vector is denoted as and the function for is given in

The number of used features is defined as the objective function and the function is sophisticated. The classes are defined as , where a maximum number of labels are defined as . A set of valid samples is belonging () to , as given in the following equation:

A set of samples in class and classification is defined as in the following equation:where classification of vector to class is denoted using a symbol . The ratio between the size of both introduced sets is function , as given in the following equation:where equation (19) helps to fine-tune the model to achieve higher accuracy for SVM in the training and validation set. Equation (15) helps the model to find a similar instance to fine-tune the model to reduce the error rate of the model. Furthermore, equation (15) helps to fine-tune the model based on the number of instances in the dataset, and equation (19) helps to fine-tune the model related to the labels. Equations (15) and (19) help the model to learn the difference in the features, which makes it easy for the hyperplane of SVM to classify the data.

The classification accuracy of function is based on a selected feature subset.
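A minimal sketch of evaluating the two objectives for one candidate feature subset is given below. It assumes that the first objective is the fraction of features used and that the second is the ratio of correctly classified validation samples to all validation samples. A simple nearest-centroid classifier stands in for the SVM to keep the example self-contained; it is a swapped-in placeholder, not the classifier used in the paper, and all function names and data are hypothetical.

```python
def select(row, mask):
    """Keep only the features whose mask entry is 1."""
    return [value for value, keep in zip(row, mask) if keep]


def fit_centroids(X, y):
    """Placeholder classifier: one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class with the nearest centroid."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def objectives(mask, X_train, y_train, X_val, y_val):
    """Return (fraction of features used, validation accuracy)."""
    n_selected = sum(mask)
    f1 = n_selected / len(mask)
    if n_selected == 0:
        return f1, 0.0  # no features: nothing can be classified
    model = fit_centroids([select(r, mask) for r in X_train], y_train)
    correct = sum(predict(model, select(x, mask)) == t
                  for x, t in zip(X_val, y_val))
    return f1, correct / len(y_val)


# Hypothetical data: only the first feature separates the two classes
X_train = [[0.0, 5.0, 1.0], [0.2, 1.0, 1.1], [1.0, 4.0, 0.9], [0.9, 0.5, 1.0]]
y_train = [0, 0, 1, 1]
X_val = [[0.1, 3.0, 1.0], [0.8, 2.0, 1.0]]
y_val = [0, 1]
f1, f2 = objectives([1, 0, 0], X_train, y_train, X_val, y_val)
```

A multiobjective optimizer would then prefer subsets that use fewer features while keeping the accuracy high.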

The DEMO algorithm is one of the successful DE realizations for solving MOOPs. In this study, the DEMO method is applied with the improved Strength Pareto Evolutionary Algorithm (SPEA2) for nondominated selection.

3. Multiclass Support Vector Machine

Binary classifiers are constructed for classes, each trained to separate one class from the others [33–37]. The multiclass category is obtained from the maximal output before the SGN function is applied: , where , in which .

Hyperplane distance to the point of a signed real value is denoted as which is referred as the confidence value. The higher value increases the confidence, where belongs to the positive class. The highest confidence value is assigned with .
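The decision rule described above, in which the class with the largest signed output wins, can be sketched as follows. The sketch assumes linear binary classifiers with hand-picked, hypothetical weights and biases; the names `decision_value` and `predict_one_vs_rest` are illustrative only, and training of the classifiers is not shown.

```python
def decision_value(weights, bias, x):
    """Signed score w . x + b of one binary classifier, before taking the sign."""
    return sum(w * v for w, v in zip(weights, x)) + bias


def predict_one_vs_rest(classifiers, x):
    """Return the class whose binary classifier gives the highest confidence."""
    scores = {label: decision_value(w, b, x)
              for label, (w, b) in classifiers.items()}
    return max(scores, key=scores.get), scores


# Hypothetical classifiers for three classes: label -> (weights, bias)
classifiers = {
    "A": ([1.0, -1.0], 0.0),
    "B": ([-1.0, 1.0], 0.0),
    "C": ([0.5, 0.5], -1.0),
}
label, scores = predict_one_vs_rest(classifiers, [2.0, 0.5])
```

For the input `[2.0, 0.5]`, the scores are 1.5 for class A, -1.5 for class B, and 0.25 for class C, so class A is assigned.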

The input data is denoted as , the hypersphere radius is denoted as , and the center is denoted as . The minimum hypersphere which encloses the optimization problem is given in equation (20).

Minimize , subject to ,

Derive. .

The following equation is obtained:

Hence, equation (10) becomes

The optimization problem is solved based on the dual form of , as given in the following:

Maximizesubject to and , .

A Lagrange multiplier can be nonzero only if the corresponding inequality constraint holds with equality.

Optimal solution complementarity conditions for are given in

Training samples lie on the surface of the optimal hypersphere related to .

The following equation provides the decision function solution:

Equations (26) and (27) are provided:

The method aims to obtain the minimum enclosing hypersphere that contains all training samples.
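The steps above follow the usual derivation of a minimum enclosing hypersphere. A sketch of the hard-margin case is given below, assuming training samples $x_i$ ($i = 1, \dots, N$), radius $R$, center $c$, and Lagrange multipliers $\alpha_i$. These symbol names and the omission of slack variables are assumptions, so the lines below may not match the authors’ formulation or equation numbering exactly.

```latex
\begin{aligned}
&\min_{R,\,c} \; R^2 \quad \text{subject to} \quad \lVert x_i - c \rVert^2 \le R^2, \quad i = 1, \dots, N \\
&L(R, c, \alpha) = R^2 - \sum_{i=1}^{N} \alpha_i \left( R^2 - \lVert x_i - c \rVert^2 \right), \quad \alpha_i \ge 0 \\
&\frac{\partial L}{\partial R} = 2R \Big( 1 - \sum_{i} \alpha_i \Big) = 0 \;\Rightarrow\; \sum_{i} \alpha_i = 1, \qquad
\frac{\partial L}{\partial c} = -2 \sum_{i} \alpha_i (x_i - c) = 0 \;\Rightarrow\; c = \sum_{i} \alpha_i x_i \\
&\max_{\alpha} \; \sum_{i} \alpha_i \, (x_i \cdot x_i) - \sum_{i} \sum_{j} \alpha_i \alpha_j \, (x_i \cdot x_j) \quad \text{subject to} \quad \alpha_i \ge 0, \;\; \sum_{i} \alpha_i = 1 \\
&\alpha_i \left( R^2 - \lVert x_i - c \rVert^2 \right) = 0, \qquad f(x) = \operatorname{sgn}\!\left( R^2 - \lVert x - c \rVert^2 \right)
\end{aligned}
```

The complementarity condition in the last line shows why only samples on the surface of the hypersphere can have $\alpha_i > 0$, and the decision function is positive for points inside the hypersphere.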

4. Simulation Setup

The proposed Differential Evolution-Multiclass Support Vector Machine (DE-MSVM) model is implemented on the ADNI dataset and compared with existing methods. This section provides the implementation details of the proposed DE-MSVM model and dataset.

Dataset: ADNI fMRI and PET datasets were used to evaluate the performance of the proposed DE-MSVM method [38, 39]. The ADNI fMRI dataset consists of 3692 images, which contains 1775 normal images and 1917 Alzheimer’s disease images. The ADNI PET dataset consists of 1775 normal images and 900 diseased images. The images of axial, coronal, and sagittal planes are present in the dataset. The sample images of the MRI and PET dataset for three slices are shown in Figure 2.

System requirement: the proposed DE-MSVM method is implemented on a system consisting of an Intel i7 processor, 16 GB RAM, 6 GB graphics card, and Windows 10 64-bit OS. The MATLAB R2018b tool was used to implement the method and measure the performance metrics for classification.

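Metrics: the performance metrics include Accuracy, Sensitivity, Specificity, False Omission Rate (FOR), False Discovery Rate (FDR), and MCC. The formulas for the metrics are given as follows:

$$
\begin{aligned}
\text{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN}, &
\text{Sensitivity} &= \frac{TP}{TP + FN}, \\
\text{Specificity} &= \frac{TN}{TN + FP}, &
\text{FOR} &= \frac{FN}{FN + TN}, \\
\text{FDR} &= \frac{FP}{FP + TP}, &
\text{MCC} &= \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}
\end{aligned}
$$

where TP is true positive, TN is true negative, FP is false positive, and FN is false negative.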
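These metrics can be computed directly from the four confusion-matrix counts. The following Python sketch uses the standard definitions; the function name `classification_metrics` and the example counts are hypothetical and do not come from the reported experiments.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Compute the six metrics from confusion-matrix counts.

    Assumes every denominator is nonzero (both classes are present and
    both labels are predicted at least once).
    """
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "FOR": fn / (fn + tn),
        "FDR": fp / (fp + tp),
        "MCC": (tp * tn - fp * fn) / mcc_denominator,
    }


# Hypothetical counts for a binary normal-versus-disease split
metrics = classification_metrics(tp=90, tn=80, fp=20, fn=10)
```

With these counts, the accuracy is 0.85, the sensitivity is 0.9, and the specificity is 0.8.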

5. Experimental Results

In this study, the DE-MSVM model is proposed to increase the performance of Alzheimer’s classification. The ADNI fMRI and PET images were used to test the performance of Alzheimer’s classification. The normalization method is applied to enhance the quality of the images. AlexNet feature extraction method and DE feature selection are applied to select the relevant features for the classification. The MSVM model is applied with selected features and classifies Alzheimer’s images. This section provides detailed information on the results of the proposed DE-MSVM method.

The extracted feature vector has 4096 elements, and the proposed feature selection method selected 2078 features for the classification. Equation (16) discards most of the features using a threshold of more than 0.5 on the feature importance score.

The proposed DE-MSVM method is applied on the ADNI axial slice for Alzheimer’s classification and compared with existing methods, as shown in Table 2. The proposed DE-MSVM model has higher performance than the existing methods in Alzheimer’s classification. The Pareto Optimal Front in Differential Evolution feature selection selects the relevant features to represent the characteristics of the input in a nondominated manner. The Differential Evolution feature selection method provides clear separation of feature characteristics based on multiobjective optimization. The MSVM model performs well when the classes are separated by a clear margin and is efficient in high-dimensional space. The AdaBoost classifier is sensitive to outliers in the features, and the autoencoder classifier is less efficient when there are many features. The proposed DE-MSVM model has an accuracy of 98.13%, and AdaBoost has 96.67% accuracy.

The proposed DE-MSVM method is tested on the axial slice of the ADNI dataset and compared with existing methods, as shown in Figure 3. This shows that the DE-MSVM method has higher performance than existing feature selection and classifier models. The Pareto Optimal Front in Differential Evolution applies nondominated feature selection to effectively represent the characteristics. The existing feature selection methods such as whale optimization, grey wolf, and bat methods have the limitations of being easily trapped in local optima and having poor convergence. The proposed DE-MSVM method has an accuracy of 98.13%, and the whale-MSVM method has 95.23% accuracy.

The proposed DE-MSVM method is tested on the sagittal slice images and compared with existing methods, as shown in Table 3. The proposed DE-MSVM method has a higher performance in Alzheimer’s classification than existing methods. The Pareto Optimal Front is applied in the DE method to select the features in a nondominated manner. The selected features are applied in the MSVM for the classification, and the MSVM method performs well on high-dimensional data. The autoencoder classifiers have the limitations of lower performance in many features and imbalanced class problems. The AdaBoost classifier is sensitive to an outlier in the features and has lower performance. The linear MSVM model does not consider the nonlinear relationship between the features and target. The proposed DE-MSVM method has 98.65% accuracy, and the autoencoder has 97.89% accuracy.

The proposed DE-MSVM method is tested on the sagittal slice and compared with the existing feature selection method, as shown in Figure 4. The proposed DE-MSVM model has higher performance than other existing feature selections. The Pareto Optimal Front helps to select the relevant features from AlexNet feature extraction for classification. The existing feature selection methods such as whale, grey wolf, and bat have the limitations of being easily trapped into local optima and having poor convergence.

The proposed DE-MSVM method is tested on the coronal slice and compared with existing methods, as shown in Table 4. The proposed DE-MSVM method has higher performance than existing methods in Alzheimer’s classification. The classwise learning of the proposed method helps the model to learn the feature difference that improves the sensitivity and specificity of the model. The Pareto Optimal Front is applied in Differential Evolution to select features in a nondominated manner. The selected feature is applied in the MSVM method for Alzheimer’s classification. The autoencoder method has the limitation of overfitting problem, AdaBoost classifier is sensitive to the outlier of features, and linear MSVM method fails to analyze the nonlinear relationship between the feature and target. The proposed DE-MSVM method has 98.12% accuracy, and the existing DE-AdaBoost has 95.9% accuracy.

The proposed DE-MSVM method is evaluated on the coronal slice and compared with existing methods, as shown in Figure 5. The proposed DE-MSVM method has higher performance than existing methods in Alzheimer’s classification. The result shows that other fine-tuned models are less sensitive to the feature difference, and this affects the sensitivity and specificity of the model. The Pareto Optimal Front in Differential Evolution selects the features in a nondominated manner and applies them for classification. The MSVM method provides the classification of Alzheimer’s based on the selected features. The existing whale, grey wolf, and bat methods have limitations of being easily trapped into local optima and having poor convergence.

5.1. Comparative Analysis

The proposed DE-MSVM method is compared with existing methods on the ADNI dataset to analyze its performance, as shown in Table 5 and Figure 6. The proposed DE-MSVM method has higher performance than existing deep learning methods and SVM-based methods. The proposed DE-MSVM method applies the Pareto Optimal Front to select the features in a nondominated manner. The proposed DE-MSVM model selects the features based on the data instances and classwise learning of features. This helps the model to learn the feature difference, which improves the classification efficiency. The sensitivity and specificity of the proposed method are 98.35% and 98.23%, respectively. This shows that the classwise learning process in the proposed method improves the efficiency of the model. The deep learning methods like SegNet-ResNet-101 [18] and AlexNet-SVM [21] have the limitation of overfitting. The RFE-GA-SVM [24] method has the limitations of being easily trapped in local optima and having poor convergence in the feature selection. The proposed DE-MSVM method has an accuracy of 98.3%, and AlexNet-SVM [21] has 96.39% accuracy.

6. Conclusion

Alzheimer’s is a neurodegenerative disorder, and the early classification of Alzheimer’s helps in providing better treatment. The existing models in Alzheimer’s classification have the limitations of overfitting and local optima in the feature selection. In this study, the DE-MSVM method is proposed to improve the performance of Alzheimer’s classification. The AlexNet model extracts the features from the input images and passes them to feature selection. The Pareto Optimal Front in Differential Evolution selects the relevant features in a nondominated manner, so that the selected features represent the characteristics of the images. The selected features were applied to MSVM for Alzheimer’s classification on the ADNI dataset. The proposed DE-MSVM method has an accuracy of 98.13% on the axial slice, and the existing whale-MSVM method has 95.23% accuracy. In future work, the proposed method will be combined with a deep learning model for the classification.

Data Availability

The datasets analyzed during the current study are available in Alzheimer’s Disease Neuroimaging Initiative (ADNI) repository, https://adni.loni.usc.edu/.

Conflicts of Interest

The authors declare that they have no conflicts of interest.