Elsevier

Physica Medica

Volume 47, March 2018, Pages 103-111

Original paper
The effect of feature selection on multivariate pattern analysis of structural brain MR images

https://doi.org/10.1016/j.ejmp.2018.03.002

Highlights

  • The effectiveness of feature selection for MVPA of brain MR images is examined.

  • Neurodegenerative disorders are predicted using structural brain features.

  • SVM, kNN, and BP-NN are employed to analyze the images of 1390 subjects.

  • Feature selection significantly increased the performance of the analysis.

  • The most successful MVPA method was SVM.

Abstract

Clinical predictions based on structural magnetic resonance (MR) images are crucial in neuroimaging studies and can serve as a complementary method for clinical decision making. Multivariate pattern analysis (MVPA) is an important tool that supports correct predictions by capturing the compound relationships between disease-related features. In this study, the effectiveness of determining the most relevant features for MVPA of brain MR images is examined using the ReliefF and minimum Redundancy Maximum Relevance (mRMR) algorithms to predict Alzheimer’s disease (AD), schizophrenia, autism, and attention deficit hyperactivity disorder (ADHD). Three state-of-the-art MVPA algorithms, namely support vector machines (SVM), k-nearest neighbor (kNN), and backpropagation neural network (BP-NN), are employed to analyze images from five different datasets that include 1390 subjects in total. Feature selection is performed on structural brain features such as the volumes and thicknesses of anatomical structures, and the selected features are used to compare the effect of feature selection on the different MVPA algorithms. Selecting the most relevant features for differentiating images of healthy controls from those of diseased subjects, with both the ReliefF and mRMR methods, significantly increased the performance. SVM was the most successful MVPA method for all classification tasks.

Introduction

Structural MR images provide a high-quality view of the brain that can be used to describe its shape, size, and structures quantitatively. Improving image quality and developing new clinical diagnosis methods are active areas of brain MR imaging research [1], [2], [3]. Predicting neurodegenerative diseases from structural brain MR images is one of the fundamental purposes of neuroimaging studies, where MVPA is used as a powerful tool. MVPA is beneficial where disease-related changes in the brain are so subtle and spatially distributed that it is difficult to discriminate healthy from diseased images using conventional mass-univariate methods such as voxel-based morphometry. MVPA provides correction for multiple comparisons and statistical power for the prediction, which improves its diagnostic value [4], [5]. MVPA methods that use brain MR images have been implemented successfully in previous clinical decision making studies as predictive tools to determine the clinical condition of subjects [3], [4], [5], [6], [7], [8], [9], [10], [11], [12].

Machine learning algorithms are frequently employed to evaluate multivariate patterns in structural brain MR images in order to classify images as healthy or diseased for a number of neurodegenerative diseases [4], [5], [9], [13]. Sabuncu and Konukoglu used SVM, the neighborhood approximation forest (NAF), and the relevance voxel machine (RVoxM) algorithms together with common types of structural measurements from brain MR scans to predict an array of clinically relevant variables. Their results revealed that neurodegenerative diseases can be predicted from brain MR images to a degree and that MVPA produces better prediction accuracies than univariate models [4]. Ecker et al. investigated the predictive value of structural MR images in adults with autism using a whole-brain classification approach employing an SVM. They classified autism correctly at a specificity of 86.0% and a sensitivity of 88.0% [5]. Liu et al. combined the searchlight algorithm and principal component analysis (PCA) in an MVPA approach to classify major depressive disorder (MDD) patients with different therapeutic responses and healthy controls. Based on their results, they suggested that structural MR images with MVPA might be a useful and reliable method for studying the neuroanatomical changes that differentiate patients with MDD from healthy controls [9]. Salvatore et al. analyzed T1-weighted MR images of 137 AD patients, 210 MCI patients, and 162 healthy controls selected from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) cohort to classify AD, MCI converters to AD, and MCI non-converters. They selected the most discriminative features by PCA and used SVM for classification. Their classification accuracies were 76% for AD, 72% for MCI converters, and 66% for MCI non-converters [13].

Feature selection is an essential operation for determining the effective subset of the input variables for a successful MVPA [7], [14], [15]. The input that is useful for classification has to be determined before MVPA to ensure that it is meaningful for the condition of the disease and comparable across subjects. There are two types of feature selection methods: feature ranking and feature subset selection. Feature ranking methods assign a ranking score to each feature according to its degree of relevance, which corresponds to the discriminative power of the feature for classification. The top-ranked features are then used for classification. Feature ranking methods are successful on high dimensional feature sets because of their good generalization ability. Their advantages include independence of the classifier, lower computational cost, and speed. Their disadvantage is that they do not interact with the classifier. Information gain, ReliefF, and mRMR are examples of feature ranking methods. Subset selection methods use a search strategy to determine a subset of features that jointly have discriminative power. Feature subset selection methods are not preferred for high dimensional problems since they are computationally expensive and carry a risk of overfitting. Capturing feature dependencies is an advantage of these methods, but selecting features depending on the classifier can be counted as a disadvantage. Correlation-based feature selection, consistency-based subset evaluation, and wrapper subset evaluation are some of the feature subset selection methods [14], [15], [16], [17]. Two of the most frequently used feature ranking methods, ReliefF and mRMR, are used in this study for feature reduction because of their lower computational cost and their independence of the classifier: the same feature reduction method is applied to three different machine learning algorithms, which are used to analyze images of 1390 people belonging to four different neurodegenerative disease groups.
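To make the idea of a per-feature ranking score concrete, the following is a minimal sketch of a Relief-style ranking for a two-class problem. It is a simplified illustration and not the ReliefF implementation used in the study: the function name `relief_scores`, the toy data, and the range-scaled Manhattan distance are assumptions made here. A feature gains weight when it differs between a sample and its nearest neighbors of the other class (misses) and loses weight when it differs from its nearest neighbors of the same class (hits).

```python
def relief_scores(X, y, k=1):
    """Relief-style feature weights for a two-class problem (illustrative only)."""
    n, d = len(X), len(X[0])
    # Range of each feature, used to scale differences into [0, 1].
    spans = []
    for j in range(d):
        column = [row[j] for row in X]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(j, a, b):
        return abs(a[j] - b[j]) / spans[j]

    def distance(a, b):
        return sum(diff(j, a, b) for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        others = [t for t in range(n) if t != i]
        by_distance = sorted(others, key=lambda t: distance(X[i], X[t]))
        hits = [t for t in by_distance if y[t] == y[i]][:k]      # same class
        misses = [t for t in by_distance if y[t] != y[i]][:k]    # other class
        for j in range(d):
            if hits:
                weights[j] -= sum(diff(j, X[i], X[t]) for t in hits) / (len(hits) * n)
            if misses:
                weights[j] += sum(diff(j, X[i], X[t]) for t in misses) / (len(misses) * n)
    return weights


# Toy data (hypothetical): feature 0 separates the classes, feature 1 does not.
X = [[0.10, 0.50], [0.20, 0.10], [0.15, 0.90],
     [0.90, 0.40], [0.80, 0.95], [0.85, 0.20]]
y = [0, 0, 0, 1, 1, 1]

scores = relief_scores(X, y, k=1)
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(ranking)  # feature indices, most relevant first
```

The score depends only on the data and labels, never on a classifier, which is why the same ranking can be reused for SVM, kNN, and BP-NN.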

Previous neuroimaging studies that identify neurodegenerative diseases have shown that reducing the dimension of the input boosts the classification accuracy and decreases the computation time by excluding highly correlated features and features that are not valuable for discriminating between classes [18], [19], [20], [21], [22]. Demirhan et al. improved the accuracy of classifying AD and MCI using SVM by up to 15% by selecting the most relevant features with the ReliefF algorithm [18]. Cui et al. identified the conversion from MCI to AD by using the mRMR method to choose optimal subsets of features from each modality of data; they then trained the SVM by incrementally adding features in order of their ranking until the highest area under the curve (AUC) was obtained. They showed that the selected features are closely related to AD progression and verified the effectiveness of feature selection [19]. Wee et al. combined ranking and wrapper-based feature selection methods to identify the most relevant features for autism spectrum disorder classification. T-test and mRMR ranking-based methods were used to reduce the number of features based on general characteristics of the data. SVM-based recursive feature elimination (SVM-RFE) was then used to determine the subset of features. They obtained classification accuracies of up to 96% [20]. Castro et al. proposed a recursive feature elimination method that uses a machine learning algorithm based on composite kernels for the classification of healthy controls and patients with schizophrenia. They showed that feature selection improved the accuracy of classification and allowed a better identification of the brain regions that characterize schizophrenia [21]. Dai et al. integrated multimodal image features using multi-kernel learning and compared the effects of using different features for the classification of ADHD patients. They selected the optimal feature subset by combining feature ranking methods and feature subset selection methods. Their experiments showed that multi-kernel learning using selected multimodal features can yield better classification results for ADHD prediction [22].
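The notion of excluding highly correlated and uninformative features can be illustrated with a greedy mRMR-style selection. The sketch below is an assumption-laden illustration rather than the procedure of any cited study: it handles discrete features only, uses the mutual information difference criterion (relevance minus mean redundancy), and the names `mutual_information` and `mrmr` as well as the toy data are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(a, b):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * log((c / n) / ((count_a[u] / n) * (count_b[v] / n)))
        for (u, v), c in count_ab.items()
    )


def mrmr(columns, y, n_select):
    """Greedy selection: maximize relevance to y minus mean redundancy with chosen features."""
    relevance = [mutual_information(col, y) for col in columns]
    # Start with the single most relevant feature.
    selected = [max(range(len(columns)), key=lambda j: relevance[j])]
    while len(selected) < n_select:
        def score(j):
            redundancy = sum(
                mutual_information(columns[j], columns[s]) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy
        candidates = [j for j in range(len(columns)) if j not in selected]
        selected.append(max(candidates, key=score))
    return selected


# Toy data (hypothetical), one list per feature.
y = [0, 0, 0, 0, 1, 1, 1, 1]
f0 = [0, 0, 0, 1, 1, 1, 1, 1]   # strongly related to y
f1 = list(f0)                   # exact duplicate of f0, fully redundant
f2 = [0, 0, 1, 0, 1, 1, 0, 1]   # weaker, but adds information beyond f0

selected = mrmr([f0, f1, f2], y, n_select=2)
print(selected)
```

In this toy case the duplicate feature is as relevant as the first one, yet it is passed over in the second round because its redundancy with the already selected feature outweighs its relevance.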

In this study, MVPA is performed to discriminate AD, schizophrenia, autism, and ADHD patients from healthy controls using morphometric features, such as the volumes and thicknesses of anatomical structures, obtained from T1-weighted structural brain MR images. The effect of feature selection on the classification performance is investigated using the ReliefF and mRMR feature ranking methods with an unbiased brain-wide approach. Three state-of-the-art machine learning algorithms, SVM, kNN, and BP-NN, are employed for the MVPA. Five-fold cross-validation (CV) is used for all feature selection and classification tasks to assess the generalization ability of the performance.

Section snippets

Materials and methods

Brief descriptions of the data used, the feature selection methods, and the MVPA algorithms are given in this section. Details about the model hyperparameters are also stated for reproducibility of the study. A flow diagram of the system that shows the workflow is given in Fig. 1. Feature selection is performed inside the CV loop, before the classification, to prevent the leakage of label information from the test data.
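Placing feature selection inside the CV loop can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline of the study: the ranking is a simple class-mean difference used as a stand-in for ReliefF or mRMR, the classifier is a 1-nearest-neighbor rule, and all function names and the toy data are hypothetical. The key point is that `rank_features` only ever sees the training fold.

```python
def interleaved_folds(n, k):
    """Assign sample i to fold i % k (deterministic, no shuffling)."""
    return [list(range(f, n, k)) for f in range(k)]


def rank_features(X, y):
    """Stand-in ranking: range-scaled absolute difference of the class means.

    Assumes labels are 0/1 and both classes are present in X.
    """
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        span = (max(column) - min(column)) or 1.0
        ones = [v for v, label in zip(column, y) if label == 1]
        zeros = [v for v, label in zip(column, y) if label == 0]
        scores.append(abs(sum(ones) / len(ones) - sum(zeros) / len(zeros)) / span)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def nearest_neighbor_label(X_train, y_train, x):
    """1-NN prediction with squared Euclidean distance."""
    nearest = min(range(len(X_train)),
                  key=lambda i: sum((a - b) ** 2 for a, b in zip(X_train[i], x)))
    return y_train[nearest]


def cv_accuracy(X, y, n_folds=5, n_keep=1):
    """Cross-validated accuracy with feature selection repeated in every fold."""
    correct = 0
    for test_idx in interleaved_folds(len(X), n_folds):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        X_train = [X[i] for i in train_idx]
        y_train = [y[i] for i in train_idx]
        # Rank on the training fold only, so test labels never influence the choice.
        keep = rank_features(X_train, y_train)[:n_keep]
        X_train_sel = [[row[j] for j in keep] for row in X_train]
        for i in test_idx:
            x_sel = [X[i][j] for j in keep]
            correct += int(nearest_neighbor_label(X_train_sel, y_train, x_sel) == y[i])
    return correct / len(X)


# Toy data (hypothetical): feature 1 is informative, feature 0 is noise.
X = [[0.30, 0.10], [0.70, 0.90], [0.90, 0.20], [0.10, 0.80], [0.50, 0.15],
     [0.40, 0.85], [0.80, 0.05], [0.20, 0.95], [0.60, 0.25], [0.55, 0.75]]
y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

accuracy = cv_accuracy(X, y, n_folds=5, n_keep=1)
print(accuracy)
```

Ranking the features once on the full dataset and then cross-validating only the classifier would let the test labels shape the feature set, which tends to inflate the reported accuracy.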

Results

MVPA is performed using the SVM, kNN, and BP-NN algorithms on four different feature sets constructed from morphometric brain measures obtained from the T1-weighted structural brain images of 1390 subjects who have AD, schizophrenia, autism, or ADHD. Dimension reduction, which is intended to improve the classification accuracy, is first performed to select a subset of the feature set without using a priori information related to the cases. Different numbers of features are selected using

Discussion

In the present study, the effect of feature selection on the MVPA of the OASIS, ABIDE, COBRE, ADHD, and MCIC datasets is evaluated using the ReliefF and mRMR methods.

For the OASIS dataset, all the MVPA algorithms were more successful at classifying the case 2 subjects, who have AD, than the case 1 subjects, who have mild AD. BP-NN was not practical for feature sets with too many features because of its high computational cost. SVM using FS-1 was the most successful method for

Conclusions

In this section, conclusions that are drawn from the results of the study and are generally valid for all datasets used are given. Feature selection improved the performance of the MVPA for all feature sets, independent of the number of features they include. The effect of feature selection was prominent for all datasets except ADHD, which can be interpreted as the structural brain measures being only weakly related to the disease, so that feature selection did not help it to achieve

Conflict of interest

The authors have no conflicts of interest to disclose.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

References (40)

  • J. Jiang et al. Medical image analysis with artificial neural networks. Comput Med Imaging Graph (2010)

  • C. Budak et al. Removal of impulse noise in digital images with naïve Bayes classifier method. Turk J Elec Eng Comp Sci (2016)

  • Y. Zhang et al. Preclinical diagnosis of magnetic resonance (MR) brain images via discrete wavelet packet transform with Tsallis entropy and generalized eigenvalue proximal support vector machine (GEPSVM). Entropy (2015)

  • M.R. Sabuncu et al. Clinical prediction from structural brain MRI scans: a large-scale empirical study. Neuroinformatics (2015)

  • M. Peker et al. Computer-aided diagnosis of Parkinson’s disease using complex-valued neural networks and mRMR feature selection algorithm. J Healthc Eng (2015)

  • F. Liu et al. Classification of different therapeutic responses of major depressive disorder with multivariate pattern analysis method based on structural MR scans. PLoS One (2012)

  • F. Hoeft et al. Neural systems predicting long-term outcome in dyslexia. Proc Natl Acad Sci USA (2011)

  • B.A. Johnston et al. Predictive classification of individual magnetic resonance imaging scans from children and adolescents. Eur Child Adoles Psy (2013)

  • C. Salvatore et al. Magnetic resonance imaging biomarkers for the early diagnosis of Alzheimer's disease: a machine learning approach. Front Neurosci (2015)

  • V. Bolón-Canedo et al. A review of feature selection methods on synthetic data. Knowl Inf Syst (2013)

Cited by (21)

    • An algorithm for learning shape and appearance models without annotations

      2019, Medical Image Analysis
      Citation Excerpt :

      Those papers reported multiple accuracies, so it would be difficult to choose a single accuracy with which to compare. The accuracy achieved for the COBRE dataset was 74.7 ± 7.1%, which is similar to the 69.7% accuracy reported by Cabral et al. (2016) using COBRE, and was roughly comparable with many of the accuracies obtained by Monté-Rubio et al. (2018) or Demirhan (2018). Others have used other datasets of T1-weighted scans for identifying patients with schizophrenia.

    • A systematic review of structural MRI biomarkers in autism spectrum disorder: A machine learning perspective

      2018, International Journal of Developmental Neuroscience
      Citation Excerpt :

      The next best classification algorithms include Learned Vector Quantization (LVQ) with an accuracy of 87.7% (as before, obtained on a small dataset), and the Radial Basis Function Neural Classifier (RFBN) with accuracies between 70 to 96% (Subbaraju et al., 2015; Vigneshwaran et al., 2013), although the later result was for classifying ASD specifically in females. Other classification algorithms such as Random Forests (RF) and k-nearest neighbours (k-NN) were not observed to perform as well as SVM, with RF achieving an Area Under the Curve (AUC) performance metric of 60 in one study (Katuwal et al., 2015) and an accuracy of 99% (Vigneshwaran et al., 2013), and k-NN achieving an AUC of 0.54 (Demirhan, 2018) and an accuracy of 75% (Abdelrahman et al., 2012). However, this reduced performance may be due to both methods being generalisable to multi-class problems, and hence are not as tailored to the binary classification of ASD as SVM.
