Naive random subspace ensemble with linear classifiers for real-time classification of fMRI data
Highlights
► fMRI data is classified using online linear classifiers.
► A random subspace ensemble is used.
► fMRI data labels may not always be available.
► Predicted labels are used to update the ensemble member classifiers (naïve labelling).
Introduction
In recent times, data acquired by functional magnetic resonance imaging (fMRI) have allowed for valuable insights into the human mind, and into those processes which control and reflect human behaviour.
While most fMRI approaches analyse the data off-line (after scanning has been completed), more recently there has been interest in the development and application of real-time fMRI (rtfMRI) [1].
One application is neurofeedback, where a participant performs a certain task while brain activity is measured, and feedback based on this activity is given in real time. In this way, the participant may learn to exercise self-control of specific brain regions, for example those involved in pain perception [5]. This is typically achieved via a closed-loop brain-computer interface (BCI) [9], [34], [4].
As multivariate approaches have been shown to be more sensitive than univariate approaches for off-line fMRI analyses [30], [13], it seems sensible to use the former for rtfMRI as well. The efficiency and precision of rtfMRI for brain control has been demonstrated by participants carrying out tasks such as navigating through computer-generated mazes [15], balancing a virtual inverted pendulum [7], predicting decisions in an economic game [16], and moving an arrow towards a target [25]. fMRI classification can be a fast and accurate component of the BCI loop for the purposes of neurofeedback. The neurofeedback loop is sketched in Fig. 1. The participant receives initial instructions and possibly some stimuli. Next, the participant's state of mind is measured and classified. Based on the measured brain state, feedback is given to the participant who may then attempt to adjust the brain state to improve task performance.
To provide an accurate classification of the brain state, the classifier should be properly trained. Given the limited amount of time to collect individual fMRI data for the subject partaking in the experiment, and the large ratio of features to instances, the initially trained classifier may be of insufficient accuracy. It is desirable that the classifier improves with time. Depending on the information available for updating the classifier, the online training may be done on labelled or unlabelled data. Previous works assumed that the classifier is either not updated with time [15], [7], or updated with labelled data [32]. In this study we are interested in the possibility of using unlabelled data to improve on the classification accuracy of the initially trained classifier, an approach termed semi-supervised learning [29], [33], [35]. This is the most natural and hence useful scenario, because in certain experiments there may be no way to verify the true state of the brain. We propose to use naive labelling, where the classifier is updated by adding the new data point to the training set and taking the predicted label as the true label. This approach should be taken with caution, guarding against the possibility of a runaway classifier that progressively learns ‘the wrong thing’ [3]. Our previous study suggests that simple classifiers may benefit from semi-supervised learning [20].
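The naive labelling update can be illustrated with a minimal Python sketch. The base classifier here is an online nearest mean classifier, chosen only because it is a simple linear classifier with a cheap incremental update; the names `OnlineNearestMean` and `naive_labelling_run` are hypothetical and the code is not the implementation used in the experiments.

```python
class OnlineNearestMean:
    """Two-class nearest mean classifier with running class means (illustrative)."""

    def __init__(self, n_features):
        self.means = {0: [0.0] * n_features, 1: [0.0] * n_features}
        self.counts = {0: 0, 1: 0}

    def update(self, x, label):
        # Incremental mean update: m <- m + (x - m) / n
        self.counts[label] += 1
        n = self.counts[label]
        mean = self.means[label]
        for i, xi in enumerate(x):
            mean[i] += (xi - mean[i]) / n

    def predict(self, x):
        def sqdist(mean):
            return sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
        # Ties are resolved in favour of class 0 (an arbitrary choice).
        return 0 if sqdist(self.means[0]) <= sqdist(self.means[1]) else 1


def naive_labelling_run(clf, stream):
    """Classify each unlabelled point, then update with the predicted label."""
    predictions = []
    for x in stream:
        y_hat = clf.predict(x)
        clf.update(x, y_hat)  # the predicted label is treated as the true label
        predictions.append(y_hat)
    return predictions


# Initial training on labelled points, followed by an unlabelled stream.
clf = OnlineNearestMean(n_features=2)
clf.update([0.0, 0.0], 0)
clf.update([4.0, 4.0], 1)
stream_predictions = naive_labelling_run(clf, [[1.0, 0.0], [3.0, 5.0], [0.0, 1.0]])
```

Because every update trusts the prediction, an early run of misclassifications shifts the class means in the wrong direction, which is the runaway behaviour cautioned against above.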
The rest of the paper is organised as follows. Section 2 discusses classifiers for real-time fMRI, including the random subspace (RS) ensemble, and introduces naive labelling. The dataset and methods are discussed in Section 3 with the results being presented in Section 4. Section 5 concludes the paper.
Section snippets
Classifier models for fMRI
Various classifier models have been used for fMRI classification. Linear classifiers, including support vector machines (SVM) with linear kernel [26] and linear discriminant analysis (LDA) [18], are popular due to their speed and accuracy. Classifier ensembles are deemed to be more accurate than individual classifiers [19]. The random subspace ensemble (RS) is a classifier ensemble method whereby ensemble members are trained on feature subsets rather than on the entire feature set [14]. The
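As an illustration of the random subspace idea, the following Python sketch trains L ensemble members, each on M randomly sampled features. The nearest mean base classifier, the majority vote combiner and all function names are assumptions made for this example; they are not taken from the paper.

```python
import random


def train_nearest_mean(X, y):
    """Return the class means for labels 0 and 1 (a simple linear classifier)."""
    means = {}
    for c in (0, 1):
        rows = [x for x, label in zip(X, y) if label == c]
        means[c] = [sum(col) / len(rows) for col in zip(*rows)]
    return means


def predict_nearest_mean(means, x):
    dist = {c: sum((xi - mi) ** 2 for xi, mi in zip(x, m)) for c, m in means.items()}
    return 0 if dist[0] <= dist[1] else 1


def train_random_subspace(X, y, L, M, seed=0):
    """Train L members, each on M features sampled without replacement."""
    rng = random.Random(seed)
    n_features = len(X[0])
    ensemble = []
    for _ in range(L):
        subset = sorted(rng.sample(range(n_features), M))
        X_sub = [[x[i] for i in subset] for x in X]
        ensemble.append((subset, train_nearest_mean(X_sub, y)))
    return ensemble


def predict_random_subspace(ensemble, x):
    """Majority vote over the members (an odd L avoids ties)."""
    votes = sum(
        predict_nearest_mean(means, [x[i] for i in subset])
        for subset, means in ensemble
    )
    return 1 if 2 * votes > len(ensemble) else 0
```

Each member sees only its own feature subset at both training and prediction time, so the members differ even though they share the same training instances.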
Datasets
We use three emotion-based datasets, which are described below. A summary of the datasets is given in Table 1.
Emotion_Negative (EN1 and EN2) Data: EN1 and EN2 are two runs with different participants in the same experiment. Participants were instructed to up-regulate their target region activity for periods of 20 s (‘up’; 10 TR) using negative emotional imagery, alternating with baseline periods of 14 s (‘rest’; 7 TR). There were 12 blocks of up-regulation and rest. Classification task is to
Results
We calculated the cumulative error progression for each time step. For time-step t the cumulative error is $E(t) = \frac{1}{t}\sum_{j=1}^{t} e(j)$, where e(j) is 0 if the classifier/ensemble has labelled the point at time j correctly, and 1 otherwise. The ‘final’ errors for the three datasets, taken at time t=500, are summarised as colour plots in Fig. 2, Fig. 3, Fig. 4. For each combination of M and , we generated ensembles of L=[5,9,11] classifiers, giving a total of 25 individual classifiers. The individual error
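A cumulative error progression of this kind can be computed in a single pass over the stream. The sketch below assumes that the cumulative error at time t is the running proportion of misclassified points up to and including t; the function name `cumulative_error` is hypothetical.

```python
def cumulative_error(predicted, true):
    """Running mean of the 0/1 errors e(j) for j = 1..t, at every time step t."""
    errors_so_far = 0
    progression = []
    for t, (p, y) in enumerate(zip(predicted, true), start=1):
        errors_so_far += 0 if p == y else 1  # e(t): 0 if correct, 1 otherwise
        progression.append(errors_so_far / t)
    return progression


# Errors at steps 2 and 4 give the progression 0, 1/2, 1/3, 1/2.
progression = cumulative_error([0, 1, 1, 0], [0, 0, 1, 1])
```

Under this assumption, the last element of the returned list corresponds to the ‘final’ error of a run.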
Conclusion
We have shown that classifier ensembles are more accurate than individual classifiers. Our experiments also show that, given an appropriate choice of parameters, classifiers updated using the naive labelling strategy perform well within an ensemble framework, in particular on simple 2-class datasets. We have shown that, given sufficient training data, a naive classifier ensemble performs significantly better than a fixed, pre-trained classifier ensemble.
During a real-time fMRI experiment, there
Acknowledgements
We would like to thank Dr. D. Linden of Bangor University and Dr. S. Johnston for supplying us with the datasets.
References (30)
- et al., A new concept of a unified parameter management, experiment control, and data analysis in fMRI: application to real-time fMRI at 3T and 7T, Journal of Neuroscience Methods (2008)
- et al., Comparison of multivariate classifiers and response normalizations for pattern-information fMRI, NeuroImage (2010)
- et al., A case study on naïve labelling for the nearest mean and the linear discriminant classifiers, Pattern Recognition (2008)
- et al., Classifier ensembles for fMRI data analysis: an experiment, Magnetic Resonance Imaging (2010)
- et al., Support vector machines for temporal classification of block design fMRI data, NeuroImage (2005)
- et al., Beyond mind-reading: multi-voxel pattern analysis of fMRI data, Trends in Cognitive Sciences (2006)
- et al., Real-time functional magnetic resonance imaging: methods and applications, Magnetic Resonance Imaging (2007)
- et al., Classifier geometrical characteristic comparison and its application in classifier selection, Pattern Recognition Letters (2005)
- et al., Real-time functional magnetic resonance imaging, Magnetic Resonance in Medicine (1995)
- et al., Real-time 3D image registration for functional MRI, Magnetic Resonance in Medicine (1999)
- Semi-supervised learning of mixture models
- Applications of real-time fMRI, Nature Reviews Neuroscience
- Control over brain activation and pain learned by using real-time functional MRI, Proceedings of the National Academy of Sciences of the United States of America
- Combining multivariate voxel selection and support vector machines for mapping and classification of fMRI spatial patterns, NeuroImage
- Using real-time fMRI to control a dynamical system by brain activity classification
Catrin O. Plumpton studied Mathematics at Bangor University, UK, before completing an M.Sc. in Computer Systems in 2007. She is now studying for a Ph.D. at the School of Computer Science, Bangor University. Her main research interests include machine learning, specifically classifier ensembles and fMRI data analysis.
Ludmila I. Kuncheva received her M.Sc. degree from the Technical University of Sofia, Bulgaria, in 1982, and her Ph.D. degree from the Bulgarian Academy of Sciences in 1987. Until 1997 she worked at the Central Laboratory of Biomedical Engineering at the Bulgarian Academy of Sciences. Prof. Kuncheva is currently a Professor at the School of Computer Science, Bangor University, UK. Her interests include pattern recognition and classification, machine learning, classifier combination and fMRI data analysis. She has published two books and more than 150 scientific papers.
Nikolaas N. Oosterhof received a double master's degree in Computer Science and in Philosophy of Science, Technology and Society from the University of Twente, the Netherlands, and a third master's degree in Cognitive Science from the University of Amsterdam, the Netherlands. Currently he is a Ph.D. candidate in Cognitive Neuroscience at Bangor University, UK, where he uses pattern classification and multivariate statistics for the analysis of fMRI data.