Discriminant Convex Non-negative Matrix Factorization for the classification of human brain tumours
Introduction
Neuro-Oncology is the medical field that investigates tumours of the human brain. There are over a hundred types and sub-types of brain tumours, as officially catalogued by the World Health Organization (WHO) (Louis et al., 2007). They are grouped hierarchically into families, of which the present study concerns glial tumours and metastases.
What they all have in common is that direct access to the pathological tissue is recommended only in the few selected cases for which no alternative, less invasive treatment is available. Their diagnosis therefore commonly depends on a human expert's assessment based on indirect measurements.
Over the last few decades, several methods for the non-invasive acquisition of information from brain tissue have been introduced in research and clinical practice. Radiologists in clinical settings frequently inform the diagnosis of brain tumours using magnetic resonance techniques that generate data of different modalities, such as magnetic resonance imaging (MRI), which produces spatial information, and magnetic resonance spectroscopy (MRS), which provides local metabolic information (Hollingworth et al., 2006, Sibtain et al., 2007). Less frequently, they use multi-modal data such as magnetic resonance spectroscopic imaging (MRSI), which produces local metabolic information over an anatomical area.
Because brain tissue information is acquired in the form of a signal, it is suitable for pattern recognition (PR). In fact, PR methods have long been used successfully to extract usable knowledge from magnetic resonance data (El-Deredy, 1997, Lisboa et al., 1998). In most instances, the target was the automatic diagnosis of brain tumours, a problem treated as one of supervised classification. In the case of MRS, the inclusion of feature selection and extraction methods in the classification process makes it possible to obtain a sparse and practical metabolic characterization of different types of tumours (González-Navarro et al., 2010, Vellido et al., 2012).
MRS produces a signal in the frequency domain that can be analyzed in an unsupervised manner to extract its constituent sources. We might expect each type of tumour to be characterized by either a single source or a parsimonious subset of sources. Following this expectation, MRS data have been analyzed using Independent Component Analysis (Huang et al., 2003, Ladroue et al., 2003, Han and Li, 2011). More recently, source identification has been accomplished with varying degrees of success using Non-negative Matrix Factorization (NMF) methods (Sajda et al., 2004, Ortega-Martorell et al., 2012a) and, in particular, using a convex variant of NMF (Ding et al., 2010) that is especially suitable for real MRS signals, as it relaxes the NMF constraints to allow negative values both in the spectra and in the extracted sources (Ortega-Martorell et al., 2012b).
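To make the idea of a convex variant concrete, the following is a minimal sketch of Convex-NMF using the multiplicative updates as we recall them from Ding et al. (2010). It is not the implementation used in this study. The function name `convex_nmf`, the layout of `X` (features in rows, spectra in columns), the random initialization (a simplification; cluster-based initialization is a common alternative) and the small constant `eps` are all assumptions made for illustration. The data matrix may contain negative values; only the two factor matrices are constrained to be non-negative, and the sources are recovered as convex-like combinations of the data, `F = X @ W`.

```python
import numpy as np


def convex_nmf(X, k, n_iter=200, eps=1e-9, seed=0):
    """Sketch of Convex-NMF: X (d x n, mixed sign) ~ X @ W @ G.T.

    Returns non-negative W (n x k) and G (n x k).
    The k sources are the columns of F = X @ W.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[1]

    # Split the Gram matrix into its positive and negative parts,
    # so that every term in the updates below is non-negative.
    A = X.T @ X
    A_pos = (np.abs(A) + A) / 2.0
    A_neg = (np.abs(A) - A) / 2.0

    # Assumption: plain random non-negative initialization.
    W = rng.random((n, k)) + 0.1
    G = rng.random((n, k)) + 0.1

    for _ in range(n_iter):
        # Update the mixing matrix G with W fixed.
        GWt = G @ W.T
        num = A_pos @ W + GWt @ (A_neg @ W)
        den = A_neg @ W + GWt @ (A_pos @ W)
        G *= np.sqrt((num + eps) / (den + eps))

        # Update the combination weights W with G fixed.
        GtG = G.T @ G
        num = A_pos @ G + A_neg @ W @ GtG
        den = A_neg @ G + A_pos @ W @ GtG
        W *= np.sqrt((num + eps) / (den + eps))

    return W, G
```

Calling `W, G = convex_nmf(X, k=2)` on a matrix of spectra and then computing `X @ W` would give two candidate sources that can take negative values, while the rows of `G` describe how strongly each spectrum expresses each source.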
This dichotomy between supervised classification and unsupervised source extraction can be bridged: the extraction of sources through NMF could benefit from the discriminant information that is available in class labels. This was proposed in a number of works (Zafeiriou et al., 2006, Kotsia et al., 2007). The resulting models, which have been given the common name of discriminant NMF (DNMF), work by incorporating discriminant constraints into the matrix decomposition process that underlies NMF. The use of class information in NMF has also been approached as a semi-supervised problem (Wang et al., 2008, Lee et al., 2010).
The convex variant of NMF (Convex-NMF) is better suited to the analysis of MRS, as real spectra often have negative values, and better discrimination results for Convex-NMF than for standard NMF have been reported in Ortega-Martorell et al. (2012a). For these reasons, we present in this paper a method that, building on those presented in Kotsia et al., 2007, Wang et al., 2005, introduces the available class information into the unsupervised source extraction process of Convex-NMF. Novel methods to generate diagnostic predictions for new, unseen spectra are also defined and assessed experimentally.
The goals of source separation and classification are thus tightly coupled in this study. Beyond supervised classification, in which classifiers could be used by a radiologist as diagnostic assistants, we aim for tumour-type-specific discriminatory source extraction. Its recommendations could be trusted by a radiologist because the signal sources on which the diagnosis relies have a medical interpretation.
The remainder of the paper is organized as follows. Section 2 describes the human brain tumour MRS data analyzed in the experiments. Section 3 outlines the main methodological issues, starting with brief, self-contained descriptions of NMF, Convex-NMF and DNMF that serve as an introduction to the proposed Discriminant Convex-NMF (DCNMF) method and to several novel approaches to its use in the prediction of unseen data. Section 4 reports and discusses the experimental results.
Materials
Proton MRS (1H-MRS) provides a non-invasive biochemical profile of the metabolites within a tissue, that is, its metabolic fingerprint. In human brain tumours, given the likelihood of biochemical changes appearing earlier than gross morphologic changes, it should be possible to improve diagnostic sensitivity in clinical practice by complementing the morphologic information provided by MRI with the metabolomic pattern provided by MRS. This is because MRI, unlike MRS, only provides an
Non-negative Matrix Factorization
The NMF method was originally conceived with a very specific objective in mind: that of providing a mathematical model to explain how an object can be perceived in a cognitive task described as a parts-based process; that is, a process in which the perception of an object is a combination of the perception of its constituent parts (Lee and Seung, 1999). For this reason, it is not surprising that one of the main applications of NMF has been, from inception, that of face recognition (Liu and
Experiments
This section reports and discusses the results of a set of different experiments designed to validate the DCNMF method and assess its performance in its different variants and also in comparison with Convex-NMF.
As described in Section 2, two data sets were analyzed: one consisting of real MRS and another consisting of synthetically generated MRS-like instances created from the real ones, for which different training and test subsets were generated. In all cases, each of the data instances was
Conclusions
The diagnosis of human brain tumours is a complex medical decision task. Experts can benefit from the use of PR for the analysis of the indirect information that must often be acquired to inform this decision (Lisboa et al., 2010). The use of MRS techniques provides radiologists with a clear metabolic signature of the tumour (Sibtain et al., 2007). This metabolic information takes the form of a signal in the frequency domain, which can be analyzed using source extraction techniques in the hope
Acknowledgments
This research was partially funded by the Spanish projects TIN2009-13895-C02-01, TIN2013-31377, and SAF2011-23870. Centro de Investigación Biomédica en Red Bioingeniería, Biomateriales y Nanomedicina, CIBER-BBN, is an initiative of Instituto de Salud Carlos III, Spain, co-funded by EU FEDER. The authors acknowledge the former INTERPRET European project partners. Data providers: Dr. C. Majós (IDI), Dr. À. Moreno-Torres (CDP), Dr. F.A. Howe and Prof. J. Griffiths (SGUL), Prof. A. Heerschap (RU), Prof. L.
References (36)
- et al. In vivo single-voxel proton MR spectroscopy in the differentiation of high-grade gliomas and solitary metastases. Clinical Radiology (2004)
- et al. Feature and model selection with discriminatory visualization for diagnostic classification of brain tumors. Neurocomputing (2010)
- Data clustering: 50 years beyond K-means. Pattern Recognition Letters (2010)
- et al. Non-negative matrix factorization based methods for object recognition. Pattern Recognition Letters (2004)
- et al. The clinical value of proton magnetic resonance spectroscopy in adult brain tumours. Clinical Radiology (2007)
- et al. Improving non-negative matrix factorizations through structured initialization. Pattern Recognition (2004)
- Brodersen, K.H., Ong, C.S., Stephan, K.E., Buhmann, J.M., 2010. The balanced accuracy and its posterior distribution....
- et al. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B (Methodological) (1977)
- et al. Convex and semi-nonnegative matrix factorizations. IEEE Transactions on Pattern Analysis and Machine Intelligence (2010)
- Ding, X., Lee, J.H., Lee, S.W., 2011. A constrained alternating least squares nonnegative matrix factorization...
- Pattern recognition approaches in biomedical and clinical magnetic resonance spectroscopy: a review. NMR in Biomedicine
- Multiproject–multicenter evaluation of automatic brain tumor classification by magnetic resonance spectroscopy. Magnetic Resonance Materials in Physics, Biology and Medicine (MAGMA)
- Proton NMR chemical shifts and coupling constants for brain metabolites. NMR in Biomedicine
- Multi-resolution independent component analysis for high-performance tumor classification and biomarker discovery. BMC Bioinformatics
- A systematic literature review of magnetic resonance spectroscopy for the characterization of brain tumors. American Journal of Neuroradiology
- Tumour grading from magnetic resonance spectroscopy: a comparison of feature extraction with variable selection. Statistics in Medicine
- A multi-centre, web-accessible and quality control-checked database of in vivo MR spectra of brain tumour patients. Magnetic Resonance Materials in Physics, Biology and Medicine (MAGMA)