Regularized Bagged Canonical Component Analysis for Multiclass Learning in Brain Imaging

  • Original Article
  • Published in Neuroinformatics

Abstract

A fundamental problem of supervised learning algorithms for brain imaging applications is that the number of features far exceeds the number of subjects. In this paper, we propose a combined feature selection and extraction approach for multiclass problems. This method starts with a bagging procedure which calculates the sign consistency of the multivariate analysis (MVA) projection matrix feature-wise to determine the relevance of each feature. This relevance measure provides a parsimonious matrix, which is combined with a hypothesis test to automatically determine the number of selected features. Then, a novel MVA regularized with the sign and magnitude consistency of the features is used to generate a reduced set of summary components providing a compact data description. We evaluated the proposed method with two multiclass brain imaging problems: 1) the classification of elderly subjects into four classes (cognitively normal, stable mild cognitive impairment (MCI), MCI converting to Alzheimer’s disease (AD) within 3 years, and AD) based on structural brain imaging data from the ADNI cohort; 2) the classification of children into three classes (typically developing and two types of Attention Deficit/Hyperactivity Disorder (ADHD)) based on functional connectivity. Experimental results confirmed that each brain image (defined by 29,852 features in the ADNI database and 61,425 in the ADHD database) could be represented with only 30–45% of the original features. Furthermore, this information could be condensed into two or three summary components, providing not only a gain in interpretability but also classification rate improvements when compared to state-of-the-art reference methods.

Notes

  1. The reference (Bron et al. 2015) summarizes the results of the data analysis competition, where the task was to classify subjects into Alzheimer’s disease (AD), mild cognitive impairment (MCI), and cognitively normal classes. We use this summary paper as a reference to all the methods in the competition if there is no particular reason to specify a particular method.

  2. Note that the inclusion of the regularization term over A prevents problems in the calculation of the inverse of \(K_{x}K_{x}\). These issues should not appear when working with high-dimensional data; however, they can occur when there is high redundancy among variables.

  3. As the accuracy validation curves tend to present a saturation profile and their maximum value is given when almost all features are used, we have selected as optimum working point the CV Stability Point (CV-SP), the point of the curve where the saturation begins. In this way, we obtain a good performance point using a reduced set of features.

  4. Note that the T1w template (MNI152 from SPM12) is smaller than our brain mask because our brain mask includes all voxels containing gray matter (GM) for any subject. Therefore, some of the selected voxels appear to lie slightly outside the template, an effect magnified by the 4 mm³ voxel size and the smoothing applied to the GM volumes.

References

  • Abdulkadir, A., Peter, J., Ronneberger, O., Brox, T., & Klöppel, S. (2014). Voxel-based multi-class classification of AD, MCI, and elderly controls. In Medical image computing and computer-assisted intervention (MICCAI) 2014-CADDementia Challenge.

  • Bellec, P., Chu, C., Chouinard-Decorte, F., Benhajali, Y., Margulies, D.S., & Craddock, R.C. (2017). The neuro bureau ADHD-200 preprocessed repository. NeuroImage, 144, 275–286.

  • Bi, J., Bennett, K., Embrechts, M., Breneman, C., & Song, M. (2003). Dimensionality reduction via sparse support vector machines. Journal of Machine Learning Research, 3(Mar), 1229–1243.

  • Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.

  • Bron, E.E., Smits, M., Van Der Flier, W.M., Vrenken, H., Barkhof, F., Scheltens, P., Papma, J.M., Steketee, R.M., Orellana, C.M., Meijboom, R., & et al. (2015). Standardized evaluation of algorithms for computer-aided diagnosis of dementia based on structural MRI: the CADDementia challenge. NeuroImage, 111, 562–579.

  • Chen, L., & Huang, J.Z. (2012). Sparse reduced-rank regression for simultaneous dimension reduction and variable selection. Journal of the American Statistical Association, 107(500), 1533–1545.

  • Cheng, B., Liu, M., Shen, D., Li, Z., Zhang, D., et al., & Alzheimer’s Disease Neuroimaging Initiative. (2017). Multi-domain transfer learning for early diagnosis of Alzheimer’s disease. Neuroinformatics, 15(2), 115–132.

  • Cheng, B., Liu, M., Zhang, D., Shen, D., et al., & Alzheimer’s Disease Neuroimaging Initiative. (2019). Robust multi-label transfer feature learning for early diagnosis of Alzheimer’s disease. Brain Imaging and Behavior, 13(1), 138–153.

  • Craddock, R.C., James, G.A., Holtzheimer, P.E., Hu, X.P., & Mayberg, H.S. (2012). A whole brain fMRI atlas generated via spatially constrained spectral clustering. Human Brain Mapping, 33(8), 1914–1928.

  • Dimitriadis, S.I., Liparas, D., Tsolaki, M.N., et al., & Alzheimer’s Disease Neuroimaging Initiative. (2018). Random forest feature selection, fusion and ensemble strategy: Combining multiple morphological MRI measures to discriminate among healhy elderly, MCI, cMCI and Alzheimer’s disease patients: From the Alzheimer’s disease neuroimaging initiative (ADNI) database. Journal of Neuroscience Methods, 302, 14–23.

  • Dong, A., Toledo, J.B., Honnorat, N., Doshi, J., Varol, E., Sotiras, A., Wolk, D., Trojanowski, J.Q., Davatzikos, C., & Alzheimer Disease Neuroimaging Initiative. (2016). Heterogeneity of neuroanatomical patterns in prodromal Alzheimer’s disease: links to cognition, progression and biomarkers. Brain: A Journal of Neurology, 140 (3), 735–747.

  • Douaud, G., Menke, R.A., Gass, A., Monsch, A.U., Rao, A., Whitcher, B., Zamboni, G., Matthews, P.M., Sollberger, M., & Smith, S. (2013). Brain microstructure reveals early abnormalities more than two years prior to clinical progression from mild cognitive impairment to Alzheimer’s disease. Journal of Neuroscience, 33(5), 2147–2155.

  • Du, L., Liu, K., Yao, X., Risacher, S.L., Han, J., Guo, L., Saykin, A.J., & Shen, L. (2018). Fast multi-task SCCA learning with feature selection for multi-modal brain imaging genetics. In 2018 IEEE international conference on bioinformatics and biomedicine (BIBM) (pp. 356–361): IEEE.

  • Dukart, J., Schroeter, M.L., Mueller, K., et al., & Alzheimer’s Disease Neuroimaging Initiative. (2011). Age correction in dementia–matching to a healthy brain. PloS one, 6(7), e22193.

  • Frisoni, G.B., Fox, N.C., Jack, Jr C.R., Scheltens, P., & Thompson, P.M. (2010). The clinical use of structural MRI in Alzheimer’s disease. Nature Reviews Neurology, 6(2), 67.

  • Gaser, C., Franke, K., Klöppel, S., Koutsouleris, N., Sauer, H., et al., & Alzheimer’s Disease Neuroimaging Initiative. (2013). Brainage in mild cognitive impaired patients: predicting the conversion to Alzheimer’s disease. PloS one, 8(6), e67346.

  • Gomez-Verdejo, V., Parrado-Hernandez, E., Tohka, J., et al., & Alzheimer’s Disease Neuroimaging Initiative. (2019). Sign-consistency based variable importance for machine learning in brain imaging. Neuroinformatics, pp 1–17.

  • Hardoon, D.R., Mourao-Miranda, J., Brammer, M., & Shawe-Taylor, J. (2007). Unsupervised analysis of fMRI data using kernel canonical correlation. NeuroImage, 37(4), 1250–1259.

  • Hinrichs, C., Singh, V., Xu, G., Johnson, S.C., Initiative, A.D.N., & et al. (2011). Predictive markers for AD in a multi-modality framework: an analysis of MCI progression in the ADNI population. NeuroImage, 55(2), 574–589.

  • Huttunen, H., Manninen, T., Kauppi, J.P., & Tohka, J. (2013). Mind reading with regularized multinomial logistic regression. Machine Vision and Applications, 24(6), 1311–1325.

  • Jie, N.F., Zhu, M.H., Ma, X.Y., Osuch, E.A., Wammes, M., Théberge, J, Li, H.D., Zhang, Y., Jiang, T.Z., Sui, J., & et al. (2015). Discriminating bipolar disorder from major depression based on SVM-foba: efficient feature selection with multimodal brain imaging data. IEEE Transactions on Autonomous Mental Development, 7(4), 320–331.

  • Klöppel, S, Stonnington, C.M., Chu, C., Draganski, B., Scahill, R.I., Rohrer, J.D., Fox, N.C., Jack, Jr C.R., Ashburner, J., & Frackowiak, R.S. (2008). Automatic classification of MR scans in Alzheimer’s disease. Brain: A Journal of Neurology, 131(3), 681–689.

  • Liu, S., Liu, S., Cai, W., Che, H., Pujol, S., Kikinis, R., Feng, D., Fulham, M.J., & et al. (2015). Multimodal neuroimaging feature learning for multiclass diagnosis of Alzheimer’s disease. IEEE Transactions on Biomedical Engineering, 62(4), 1132–1140.

  • Michel, V., Gramfort, A., Varoquaux, G., Eger, E., & Thirion, B. (2011). Total variation regularization for fMRI-based prediction of behavior. IEEE Transactions on Medical Imaging, 30(7), 1328–1340.

  • Milham, M.P., Fair, D., Mennes, M., Mostofsky, S.H., & et al. (2012). The ADHD-200 consortium: a model to advance the translational potential of neuroimaging in clinical neuroscience. Frontiers in Systems Neuroscience, 6, 62.

  • Moradi, E., Pepe, A., Gaser, C., Huttunen, H., Tohka, J., et al., & Alzheimer’s Disease Neuroimaging Initiative. (2015). Machine learning framework for early MRI-based Alzheimer’s conversion prediction in MCI subjects. NeuroImage, 104, 398–412.

  • Muñoz-Romero, S, Gómez-Verdejo, V, & Arenas-García, J. (2016). Regularized multivariate analysis framework for interpretable high-dimensional variable selection. IEEE Computational Intelligence Magazine, 11(4), 24–35. https://doi.org/10.1109/MCI.2016.2601701.

  • Muñoz-Romero, S, Gómez-Verdejo, V, & Parrado-Hernández, E. (2017). A novel framework for parsimonious multivariate analysis. Pattern Recognition.

  • Mwangi, B., Tian, T.S., & Soares, J.C. (2014). A review of feature reduction techniques in neuroimaging. Neuroinformatics, 12(2), 229–244.

  • Nadeau, C, & Bengio, Y. (2000). Inference for the generalization error. In Advances in neural information processing systems (pp. 307–313).

  • Nie, F, Huang, H, Cai, X, & Ding, CH. (2010). Efficient and robust feature selection via joint l2,1-norms minimization. In Advances in neural information processing systems (pp. 1813–1821).

  • Parrado-Hernández, E, Gómez-Verdejo, V, Martínez-Ramón, M, Shawe-Taylor, J, Alonso, P, Pujol, J, Menchón, JM, Cardoner, N, & Soriano-Mas, C. (2014). Discovering brain regions relevant to obsessive–compulsive disorder identification through bagging and transduction. Medical Image Analysis, 18(3), 435–448.

  • Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., & et al. (2011). Scikit-learn: machine learning in python. Journal of Machine Learning Research, 12(Oct), 2825–2830.

  • Qureshi, M.N.I., Min, B., Jo, H.J., & Lee, B. (2016). Multiclass classification for the differential diagnosis on the ADHD subtypes using recursive feature elimination and hierarchical extreme learning machine: structural MRI study. PloS one, 11(8), e0160697.

  • Qureshi, M.N.I., Oh, J., Min, B., Jo, H.J., & Lee, B. (2017). Multi-modal, multi-measure, and multi-class discrimination of ADHD with hierarchical feature extraction and extreme learning machine using structural and functional brain MRI. Frontiers in Human Neuroscience, 11, 157.

  • Risacher, S.L., Shen, L., West, J.D., Kim, S., McDonald, B.C., Beckett, L.A., Harvey, D.J., Jack, C.R., Weiner, M.W., Saykin, A.J., & et al. (2010). Longitudinal MRI atrophy biomarkers: relationship to conversion in the ADNI cohort. Neurobiology of Aging, 31(8), 1401–1418.

  • Rondina, J.M., Hahn, T., de Oliveira, L., Marquand, A.F., Dresler, T., Leitner, T., Fallgatter, A.J., Shawe-Taylor, J., & Mourao-Miranda, J. (2013). Scors—a method based on stability for feature selection and mapping in neuroimaging. IEEE Transactions on Medical Imaging, 33(1), 85–98.

  • Stoub, T, Bulgakova, M, Wilson, R, Bennett, D, Leurgans, S, Wuu, J, Turner, D, & et al. (2004). MRI-derived entorhinal volume is a good predictor of conversion from MCI to AD. Neurobiology of Aging, 25(9), 1197–1203.

  • Sun, L., Ji, S., Yu, S., & Ye, J. (2009). On the equivalence between canonical correlation analysis and orthonormalized partial least squares. In IJCAI, (Vol. 9 pp. 1230–1235).

  • Tanpitukpongse, T., Mazurowski, M., Ikhena, J., & Petrella, J. (2017). Predictive utility of marketed volumetric software tools in subjects at risk for Alzheimer Disease: do regions outside the hippocampus matter? American Journal of Neuroradiology, 38(3), 546–552.

  • Tohka, J., Moradi, E., Huttunen, H., et al., & Alzheimer’s Disease Neuroimaging Initiative. (2016). Comparison of feature selection techniques in machine learning for anatomical brain MRI in dementia. Neuroinformatics, 14(3), 279–296.

  • Varon, D., Barker, W., Loewenstein, D., Greig, M., Bohorquez, A., Santos, I., Shen, Q., Harper, M., Vallejo-Luces, T., & Duara, R. (2015). Visual rating and volumetric measurement of medial temporal atrophy in the Alzheimer’s Disease Neuroimaging Initiative (ADNI) cohort: baseline diagnosis and the prediction of MCI outcome. International Journal of Geriatric Psychiatry, 30(2), 192–200.

  • Weiner, M.W., Veitch, D.P., Aisen, P.S., Beckett, L.A., Cairns, N.J., Green, R.C., Harvey, D., Jack, C.R., Jagust, W., Morris, J.C., & et al. (2017). Recent publications from the Alzheimer’s Disease Neuroimaging Initiative: Reviewing progress toward improved AD clinical trials. Alzheimer’s & Dementia, 13(4), e1–e85.

  • Yu, Y., Shen, H., Zhang, H., Zeng, L.L., Xue, Z., & Hu, D. (2013). Functional connectivity-based signatures of schizophrenia revealed by multiclass pattern analysis of resting-state fMRI from schizophrenic patients and their healthy siblings. Biomedical Engineering Online, 12(1), 10.

Acknowledgments

Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

C. Sevilla-Salcedo and V. Gómez-Verdejo’s work has been partly funded by the Spanish MINECO grants TEC2014-52289R and TEC2017-83838-R, as well as by KERMES, a NoE on kernel methods for structured data funded by the Spanish Ministry of Economy and Competitiveness, TEC2016-81900-REDT ru. Jussi Tohka’s work is supported by the Academy of Finland (grant 316258).

Author information

Corresponding author

Correspondence to Carlos Sevilla-Salcedo.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Alzheimer’s Disease Neuroimaging Initiative (ADNI) is a Group/Institutional Author. Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf

Appendices

Appendix A: Hypothesis Test

Considering the success probability \(p_{jcr} = (1/P){\sum }_{p=1}^{P} ({U^{p}_{c}} > 0)\), we can formulate the following hypothesis test:

$$ \begin{cases} H_{0}: & p_{jcr} = 0.5, \quad j \text{ is not relevant for the } c\text{-th class and } r\text{-th eigenvector,}\\ H_{1}: & p_{jcr} \neq 0.5, \quad j \text{ is relevant for the } c\text{-th class and } r\text{-th eigenvector.} \end{cases} \tag{18} $$

To evaluate statistically whether \(p_{jcr}\) differs from 0.5, we define the following statistic:

$$ t_{jcr} = \frac{p_{jcr} - 0.5}{\sigma_{jcr}}, \tag{19} $$

where \(\sigma_{jcr}\) is a scaling factor proportional to the standard error of \(p_{jcr}\). We now derive this scaling factor. The term \({\sum }_{p=1}^{P} ({U^{p}_{c}} > 0)\) counts the number of times that a feature is positive over the \(P\) bagging iterations. Thus, assuming that the bagging iterations are independent, it can be modelled as a Binomial distribution with parameters \(P\) (number of experiments) and \(p_{jcr}\) (success probability). Further, since the number of bagging iterations is very large, the Binomial distribution can be approximated by a Normal distribution with mean \(P p_{jcr}\) and variance \(P p_{jcr}(1 - p_{jcr})\). So, under the independence assumption, we can define \(\sigma_{jcr}\) as the standard deviation of the term \(\frac{1}{P}{\sum }_{p=1}^{P} ({U^{p}_{c}} > 0)\), which is obtained by rescaling the variance of the Normal distribution by \(1/P^{2}\):

$$ \sigma_{jcr} = \sqrt{\frac{1}{P} \cdot p_{jcr}(1-p_{jcr})}. \tag{20} $$
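
As a minimal sketch of how \(p_{jcr}\) and the naive scaling factor in (20) could be computed, the following NumPy snippet assumes that the bagged projection coefficients are stacked in an array `U` of shape `(P, J, C, R)` (bagging iterations, features, classes, eigenvectors). The array layout, the function name `sign_consistency`, and the synthetic data are illustrative assumptions and are not taken from the original implementation.

```python
import numpy as np


def sign_consistency(U):
    """Fraction of bagging iterations in which each coefficient is positive.

    U is assumed (hypothetically) to have shape (P, J, C, R):
    bagging iterations x features x classes x eigenvectors.
    Returns p with shape (J, C, R) and the naive standard deviation
    sqrt(p * (1 - p) / P), which holds only under independence.
    """
    U = np.asarray(U, dtype=float)
    P = U.shape[0]
    p = (U > 0).mean(axis=0)            # (1/P) * sum_p (U^p > 0)
    sigma = np.sqrt(p * (1.0 - p) / P)  # naive scaling factor
    return p, sigma


# Synthetic example: random coefficients, with one entry forced positive.
rng = np.random.default_rng(0)
P, J, C, R = 500, 4, 3, 2
U = rng.normal(size=(P, J, C, R))
U[:, 0, 0, 0] = np.abs(U[:, 0, 0, 0])  # sign-consistent coefficient
p, sigma = sign_consistency(U)
```

In this toy setting the forced coefficient has a success probability of one, while the purely random coefficients stay close to 0.5, which is the value assumed under the null hypothesis.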

However, the observations come from a bagging process, so independence cannot be assumed. To address this problem, the standard deviation is computed with an unbiased estimator (Nadeau and Bengio 2000) which, applied to our scenario, provides the following corrected estimator for the standard deviation:

$$ \begin{aligned} \tilde{\sigma}_{jcr}^{\text{corr}} &= \sqrt{\frac{1}{P} \left( 1 + P\frac{M}{1-M}\right) p_{jcr}(1-p_{jcr})} \\ &\simeq \sqrt{\frac{M}{1-M}\, p_{jcr}(1-p_{jcr})} \end{aligned} \tag{21} $$
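
The approximation in the second line can be made explicit by distributing the factor \(1/P\); the intermediate step below is our own expansion of the expression above:

$$ \frac{1}{P}\left(1 + P\frac{M}{1-M}\right) = \frac{1}{P} + \frac{M}{1-M} \;\longrightarrow\; \frac{M}{1-M} \quad \text{as } P \to \infty. $$

For instance, with the purely illustrative values \(P = 1000\) and \(M = 0.2\) (not taken from the experiments), the factor equals \(0.001 + 0.25 = 0.251\), so dropping the \(1/P\) term changes it by less than half a percent.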

and, therefore, the statistic \(t_{jcr}\) becomes:

$$ t_{jcr} = \frac{p_{jcr} - 0.5}{\sqrt{\frac{M}{1-M}\, p_{jcr}(1-p_{jcr})}}. \tag{22} $$

The statistic \(t_{jcr}\) is distributed according to the t-distribution with \(P - 1\) degrees of freedom. Since \(P\) is very large, one can safely approximate the t-distribution by the standard normal distribution.
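
The corrected statistic and its two-sided p-value under the standard normal approximation could be sketched as follows. This is a hedged illustration: the value passed for `M` is a placeholder (the quantity itself is defined in the main text), the function names are hypothetical, and the small `eps` floor on the variance is our own addition to avoid a division by zero when a coefficient has the same sign in every bagging iteration.

```python
import math

import numpy as np


def corrected_statistic(p, M, eps=1e-12):
    """Corrected statistic (p - 0.5) / sqrt(M / (1 - M) * p * (1 - p)).

    p   : success probabilities (scalar or array).
    M   : quantity from the main text, assumed here to lie in (0, 1).
    eps : variance floor (an assumption of this sketch, not of the paper).
    """
    p = np.asarray(p, dtype=float)
    var = (M / (1.0 - M)) * p * (1.0 - p)
    return (p - 0.5) / np.sqrt(np.maximum(var, eps))


def two_sided_p_value(t):
    """Two-sided p-value under N(0, 1): 2 * (1 - Phi(|t|)) = erfc(|t| / sqrt(2))."""
    return np.vectorize(math.erfc)(np.abs(t) / math.sqrt(2.0))


# Illustrative values only: p = 0.9 and M = 0.2 are not from the paper.
t_example = corrected_statistic(0.9, 0.2)
pval_example = two_sided_p_value(t_example)
```

A coefficient with \(p_{jcr} = 0.5\) yields a statistic of zero and a p-value of one, so it is never flagged as relevant, whatever the significance level.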

Once this statistic is calculated, the class-wise feature selection can be carried out by majority voting over \(r\). This means that, for each class, we select the features that are considered relevant by the majority of the eigenvectors.
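
The majority-voting step can be sketched as below, assuming the statistics are stored in an array `t` of shape `(J, C, R)`. The critical value `z_crit = 1.96` (a two-sided 5% level under the normal approximation), the strict-majority rule for ties, and the function name `select_features` are assumptions made for illustration; the text above does not fix them.

```python
import numpy as np


def select_features(t, z_crit=1.96):
    """Class-wise feature selection by majority vote over eigenvectors.

    t      : statistics with assumed shape (J, C, R).
    z_crit : rejection threshold on |t| (hypothetical default).
    Returns a boolean mask of shape (J, C); entry (j, c) is True when
    feature j is declared relevant for class c by more than half of the
    R eigenvectors.
    """
    t = np.asarray(t, dtype=float)
    relevant = np.abs(t) > z_crit   # per-eigenvector test decision
    votes = relevant.sum(axis=2)    # number of eigenvectors voting "relevant"
    return votes > t.shape[2] / 2.0


# Toy example with J = 2 features, C = 1 class, R = 3 eigenvectors.
t_toy = np.array([[[3.0, 2.5, 0.1]],
                  [[3.0, 0.0, 0.0]]])
mask = select_features(t_toy)
```

Here the first feature is selected because two of the three eigenvectors reject the null hypothesis, whereas the second feature receives only one vote and is discarded.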

Thanks to the inclusion of the statistical test, cross-validation (CV) of the optimum number of selected features is not needed, which reduces the computational time. Furthermore, this efficient approach allows the selection of features in a class-wise manner, improving the interpretability of the results and posing an advantage over the approach presented in Muñoz-Romero et al. (2017).

Appendix B: Unbalanced Method Results

In this appendix, we report further results obtained with different versions of the method. In particular, we present the results obtained on both databases when the balanced version of the method is not used.

Table 9 ADNI - Accuracy results with the different versions of the method, considering the usage of the proposed selection and extraction methods in their unbalanced version

Regarding Table 9, the results are similar to the ones obtained in Table 5 in terms of accuracy and slightly worse considering the AUC. The main advantage of using the balanced version in this database is the improvement in the classification of the most critical class, sMCI.

Table 10 shows the results obtained using the unbalanced version of the method on the ADHD database. In this highly unbalanced database, the accuracy is worse than in Table 6 because, without the balanced version, the method overfits to the most populated class. Therefore, using the balanced version is critical in this database.

Table 10 ADHD - Accuracy results with the different versions of the method, considering the usage of the proposed selection and extraction methods in their unbalanced version

Cite this article

Sevilla-Salcedo, C., Gómez-Verdejo, V., Tohka, J. et al. Regularized Bagged Canonical Component Analysis for Multiclass Learning in Brain Imaging. Neuroinform 18, 641–659 (2020). https://doi.org/10.1007/s12021-020-09470-y
