The effect of feature selection on multivariate pattern analysis of structural brain MR images

doi:10.1016/j.ejmp.2018.03.002

Physica Medica

Volume 47, March 2018, Pages 103-111

https://doi.org/10.1016/j.ejmp.2018.03.002

Highlights

• The effectiveness of feature selection for MVPA of brain MR images is examined.
• Neurodegenerative disorders are predicted using structural brain features.
• SVM, kNN, and BP-NN are employed to analyze the images of 1390 subjects.
• Feature selection significantly increased the performance of the analysis.
• The most successful MVPA method was SVM.

Abstract

Clinical predictions performed using structural magnetic resonance (MR) images are crucial in neuroimaging studies and can be used as a successful complementary method for clinical decision making. Multivariate pattern analysis (MVPA) is a significant tool that supports correct predictions by exhibiting a compound relationship between disease-related features. In this study, the effectiveness of determining the most relevant features for MVPA of brain MR images is examined using the ReliefF and minimum Redundancy Maximum Relevance (mRMR) algorithms to predict Alzheimer’s disease (AD), schizophrenia, autism, and attention deficit and hyperactivity disorder (ADHD). Three state-of-the-art MVPA algorithms, namely support vector machines (SVM), k-nearest neighbor (kNN), and backpropagation neural network (BP-NN), are employed to analyze the images from five different datasets that include 1390 subjects in total. Feature selection is performed on structural brain features such as the volumes and thicknesses of anatomical structures, and the selected features are used to compare the effect of feature selection on different MVPA algorithms. Selecting the most relevant features for differentiating images of healthy controls from those of diseased subjects using both the ReliefF and mRMR methods significantly increased the performance. The most successful MVPA method was SVM for all classification tasks.

Introduction

Structural MR images provide a good-quality view of the brain that can be used to describe its shape, size, and structures quantitatively. Improving the quality of images and developing new clinical diagnosis methods are active areas of brain MR imaging research [1], [2], [3]. Predicting neurodegenerative diseases using structural brain MR images is one of the fundamental purposes of neuroimaging studies, where MVPA is used as a powerful tool. MVPA is beneficial where disease-related changes in the brain are so subtle and spatially distributed that it is difficult to discriminate healthy and diseased images using conventional mass-univariate methods such as voxel-based morphometry. MVPA provides correction for multiple comparisons and statistical power for the prediction, which improves its diagnostic value [4], [5]. MVPA methods that use brain MR images have been implemented successfully in previous clinical decision-making studies as predictive tools to determine the clinical condition of the subjects [3], [4], [5], [6], [7], [8], [9], [10], [11], [12].

Machine learning algorithms are frequently employed to evaluate multivariate patterns in structural brain MR images in order to classify images as healthy or diseased for a number of neurodegenerative diseases [4], [5], [9], [13]. Sabuncu and Konukoglu used SVM, the neighborhood approximation forest (NAF), and the relevance voxel machine (RVoxM) algorithms with common types of structural measurements from brain MR scans to predict an array of clinically relevant variables. Their results revealed that neurodegenerative diseases can be predicted from brain MR images to a degree and that MVPA produces better prediction accuracies than univariate models [4]. Ecker et al. investigated the predictive value of structural MR images in adults with autism using a whole-brain classification approach employing an SVM. They classified autism correctly at a specificity of 86.0% and a sensitivity of 88.0% [5]. Liu et al. utilized an MVPA approach that combined a searchlight algorithm and principal component analysis (PCA) to classify major depressive disorder (MDD) patients with different therapeutic responses and healthy controls. Based on the obtained results, they suggested that structural MR images with MVPA might be a useful and reliable method to study the neuroanatomical changes that differentiate patients with MDD from healthy controls [9]. Salvatore et al. analyzed T1-weighted MR images of 137 AD patients, 210 MCI patients, and 162 healthy controls selected from the Alzheimer’s disease neuroimaging initiative (ADNI) cohort to classify AD, MCI converters, and MCI non-converters to AD. They selected the most discriminative features by PCA and used SVM for classification. Their classification accuracies were 76% for AD, 72% for MCI converters, and 66% for MCI non-converters [13].
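Sensitivity and specificity, as quoted for the autism classifier above, are simple ratios over the confusion matrix. The following is a minimal sketch of that arithmetic only; the function name and the toy labels are illustrative assumptions, not data from any of the cited studies.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for a binary labelling.

    Hypothetical helper: assumes at least one positive and one negative
    case in y_true, otherwise a division by zero occurs.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # patients found
    fn = sum(t == positive and p != positive for t, p in pairs)  # patients missed
    tn = sum(t != positive and p != positive for t, p in pairs)  # controls cleared
    fp = sum(t != positive and p == positive for t, p in pairs)  # controls flagged
    sensitivity = tp / (tp + fn)  # share of patients correctly identified
    specificity = tn / (tn + fp)  # share of controls correctly identified
    return sensitivity, specificity


# Toy example: 4 patients (label 1) and 5 controls (label 0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```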

Feature selection is an essential operation for determining the effective subset of the input variables for a successful MVPA [7], [14], [15]. Input that is useful for classification has to be determined before MVPA to ensure that it is meaningful in the context of the disease and comparable across subjects. There are two types of feature selection methods, namely feature ranking and feature subset selection. Feature ranking methods give each feature a ranking score according to its degree of relevance, which corresponds to the discriminative power of the feature for classification. Top-ranked features are then used for classification. Feature ranking methods are successful on high-dimensional feature sets because of their good generalization ability. Their advantages include independence from the classifier, lower computational cost, and speed. Their disadvantage is that they do not interact with the classifier. Information gain, ReliefF, and mRMR are examples of feature ranking methods. Subset selection methods use a search strategy to determine a subset of features that jointly have discriminative power. Feature subset selection methods are not preferred for high-dimensional problems since they are computationally expensive and carry a risk of overfitting. Capturing feature dependencies is an advantage of these methods, but selecting features depending on the classifier can be counted as a disadvantage. Correlation-based feature selection, consistency-based subset evaluation, and wrapper subset evaluation are some of the feature subset selection methods [14], [15], [16], [17]. Two of the most frequently used feature ranking methods, ReliefF and mRMR, are used in this study for feature reduction because of their lower computational cost and their independence from the classifier. This independence matters here because the same feature reduction method is applied to three different machine learning algorithms, which are used to analyze images of 1390 people who belong to four different neurodegenerative disease groups.
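The idea of scoring each feature and keeping the top-ranked ones can be sketched as follows. This is an illustrative, simplified Relief-style scorer for two classes that uses one nearest hit and one nearest miss per sample; it is not the ReliefF variant or the settings used in the study, and the names `relief_scores` and `top_k` are hypothetical.

```python
from typing import List, Sequence


def relief_scores(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Score features with a basic Relief-style rule (simplified stand-in for ReliefF)."""
    n, d = len(X), len(X[0])
    # Per-feature value range, used to scale differences into [0, 1].
    spans = []
    for j in range(d):
        column = [row[j] for row in X]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(a: Sequence[float], b: Sequence[float], j: int) -> float:
        return abs(a[j] - b[j]) / spans[j]

    def dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue  # a class with a single sample has no hit to compare with
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            # Reward features that differ across classes and agree within a class.
            weights[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n
    return weights


def top_k(scores: Sequence[float], k: int) -> List[int]:
    """Indices of the k highest-scoring features, best first."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


# Toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.9, 0.8], [1.0, 0.2], [0.8, 0.5]]
y = [0, 0, 0, 1, 1, 1]
scores = relief_scores(X, y)
selected = top_k(scores, 1)
```

Because the score depends only on the data and labels, the same ranking can be reused with any classifier, which is the classifier independence described above.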

Previous neuroimaging studies that identify neurodegenerative diseases have shown that reducing the dimension of the input boosts the classification accuracy and decreases the computation time by excluding highly correlated features and features that are not valuable for discriminating between classes [18], [19], [20], [21], [22]. Demirhan et al. improved the accuracy of classifying AD and MCI using SVM by up to 15% by selecting the most relevant features with the ReliefF algorithm [18]. Cui et al. identified the conversion from MCI to AD by using the mRMR method to choose optimal subsets of features from each modality of data; they then applied the SVM while incrementally adding features based on their ranking until the highest area under the curve (AUC) was obtained. They showed that the selected features are closely related to AD progression and verified the effectiveness of feature selection [19]. Wee et al. combined ranking and wrapper-based feature selection methods to identify the most relevant features for autism spectrum disorder classification. T-test and mRMR ranking-based methods were used to reduce the number of features based on general characteristics of the data. SVM-based recursive feature elimination (SVM-RFE) was then used to determine the subset of features. They obtained high classification accuracies of up to 96% [20]. Castro et al. proposed a recursive feature elimination method that uses a machine learning algorithm based on composite kernels for the classification of healthy controls and patients with schizophrenia. They showed that feature selection improved the accuracy of classification and allowed a better identification of the brain regions that characterize schizophrenia [21]. Dai et al. integrated multimodal image features using multi-kernel learning and compared the effects of using different features for the classification of ADHD patients. They selected an optimal feature subset by combining feature ranking methods and feature subset selection methods. Their experiments showed that multi-kernel learning using selected multimodal features can yield better classification results for ADHD prediction [22].

In this study, MVPA is performed to discriminate AD, schizophrenia, autism, and ADHD patients from healthy controls using morphometric features, such as the volumes and thicknesses of anatomical structures, obtained from T1-weighted structural brain MR images. The effect of feature selection on the classification performance is investigated using the ReliefF and mRMR feature ranking methods with an unbiased brain-wide approach. Three state-of-the-art machine learning algorithms, SVM, kNN, and BP-NN, are employed for the MVPA. Five-fold cross-validation (CV) is used for all feature selection and classification tasks to assess the generalization ability of the performance.

Section snippets

Materials and methods

Brief descriptions of the data used, the feature selection methods, and the MVPA algorithms are given in this section. Details about the model hyperparameters are also stated in this section for reproducibility of the study. A flow diagram of the system that shows the workflow is given in Fig. 1. Feature selection is performed inside the CV loop, before the classification, to prevent the leakage of label information from the test data.
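The point about keeping feature selection inside the CV loop can be illustrated with a minimal sketch. It is not the pipeline used in the study: the mean-gap ranking and the nearest-centroid classifier are simple stand-ins for ReliefF/mRMR and SVM/kNN/BP-NN, all function names are hypothetical, and labels are assumed to be 0 (control) and 1 (patient) with both classes present in every training fold.

```python
import random
from typing import List, Sequence


def kfold_indices(n: int, k: int, seed: int = 0) -> List[List[int]]:
    """Shuffle sample indices and deal them into k disjoint test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def mean_gap_scores(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Placeholder ranking: absolute gap between the two class means per feature."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        scores.append(abs(sum(a) / len(a) - sum(b) / len(b)))
    return scores


def nearest_centroid_predict(X_train, y_train, X_test) -> List[int]:
    """Placeholder classifier: assign each test row to the closer class centroid."""
    d = len(X_train[0])
    centroids = {}
    for lab in set(y_train):
        rows = [r for r, l in zip(X_train, y_train) if l == lab]
        centroids[lab] = [sum(r[j] for r in rows) / len(rows) for j in range(d)]
    preds = []
    for r in X_test:
        preds.append(min(
            centroids,
            key=lambda lab: sum((r[j] - centroids[lab][j]) ** 2 for j in range(d)),
        ))
    return preds


def cv_accuracy(X, y, n_features: int, k: int = 5, seed: int = 0) -> float:
    """k-fold CV in which features are ranked on the training fold only."""
    correct = 0
    for test_idx in kfold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        # Ranking sees training labels only, so no test label leaks into selection.
        scores = mean_gap_scores(X_tr, y_tr)
        keep = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:n_features]
        X_tr_sel = [[r[j] for j in keep] for r in X_tr]
        X_te_sel = [[X[i][j] for j in keep] for i in test_idx]
        preds = nearest_centroid_predict(X_tr_sel, y_tr, X_te_sel)
        correct += sum(p == y[i] for p, i in zip(preds, test_idx))
    return correct / len(X)
```

Ranking the features once on the full dataset and then cross-validating only the classifier would let the test labels influence which features are kept, which tends to inflate the reported accuracy.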

Results

MVPA is performed using the SVM, kNN, and BP-NN algorithms on four different feature sets constructed from morphometric brain measures obtained from the T1-weighted structural brain images of 1390 subjects that have AD, schizophrenia, autism, or ADHD. Dimension reduction, which is intended to improve the classification accuracy, is first performed to select a subset of the feature set without using a priori information related to the cases. Different numbers of features are selected using

Discussion

In the present study, the effect of feature selection on the MVPA of the OASIS, ABIDE, COBRE, ADHD, and MCIC datasets is evaluated using the ReliefF and mRMR methods.

For the OASIS dataset, all the MVPA algorithms were more successful at classifying the case 2 subjects, who have AD, than the case 1 subjects, who have mild AD. BP-NN was not practical for feature sets with too many features because of its high computational cost. SVM using FS-1 was the most successful method for

Conclusions

This section gives the conclusions that follow from the results of the study and are generally valid for all datasets used. Feature selection improved the performance of the MVPA for all feature sets, independent of the number of features they include. The effect of feature selection was prominent for all datasets except ADHD, which can be interpreted as indicating that the structural brain measures might be only weakly related to the disease, so that feature selection did not help it to achieve

Conflict of interest

The authors have no conflicts of interest to disclose.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

References (40)

M. Bergamino et al. A review of technical aspects of T1-weighted dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) in human brain. Phys Med (2014)
C. Ecker et al. Investigating the predictive value of whole-brain structural MR scans in autism: a pattern classification approach. Neuroimage (2010)
S. Klöppel et al. Diagnostic neuroimaging across diseases. Neuroimage (2012)
K.A. Norman et al. Beyond mind-reading: multi-voxel pattern analysis of fMRI data. Trends Cogn Sci (2006)
S. Tangaro et al. Automated voxel-by-voxel tissue classification for hippocampal segmentation: methods and validation. Phys Med (2014)
S. Tangaro et al. A fuzzy-based system reveals Alzheimer’s Disease onset in subjects with Mild Cognitive Impairment. Phys Med (2017)
E. Castro et al. Characterization of groups using composite kernels and multi-source fMRI analysis data: application to schizophrenia. Neuroimage (2011)
B. Fischl. FreeSurfer. Neuroimage (2012)
K. Kira et al. A practical approach to feature selection
S. Jiang et al. An improved K-nearest-neighbor algorithm for text categorization. Expert Syst Appl (2012)
J. Jiang et al. Medical image analysis with artificial neural networks. Comput Med Imaging Graph (2010)
C. Budak et al. Removal of impulse noise in digital images with naïve Bayes classifier method. Turk J Elec Eng Comp Sci (2016)
Y. Zhang et al. Preclinical diagnosis of magnetic resonance (MR) brain images via discrete wavelet packet transform with Tsallis entropy and generalized eigenvalue proximal support vector machine (GEPSVM). Entropy (2015)
M.R. Sabuncu et al. Clinical prediction from structural brain MRI scans: a large-scale empirical study. Neuroinformatics (2015)
M. Peker et al. Computer-aided diagnosis of Parkinson’s disease using complex-valued neural networks and mRMR feature selection algorithm. J Healthc Eng (2015)
F. Liu et al. Classification of different therapeutic responses of major depressive disorder with multivariate pattern analysis method based on structural MR scans. PLoS One (2012)
F. Hoeft et al. Neural systems predicting long-term outcome in dyslexia. Proc Natl Acad Sci USA (2011)
B.A. Johnston et al. Predictive classification of individual magnetic resonance imaging scans from children and adolescents. Eur Child Adoles Psy (2013)
C. Salvatore et al. Magnetic resonance imaging biomarkers for the early diagnosis of Alzheimer's disease: a machine learning approach. Front Neurosci (2015)
V. Bolón-Canedo et al. A review of feature selection methods on synthetic data. Knowl Inf Syst (2013)

