Abstract
One of the goals of the European Flagship Human Brain Project is to create a platform that will enable scientists to search for new biologically and clinically meaningful discoveries by making use of a large database of neurological data collected from many hospitals. While the patients whose data will be available have been diagnosed, there is widespread concern that their diagnoses, which rely on current medical classification, may be too broad and ambiguous and thus hide important scientific information.
We therefore offer a search strategy that combines supervised and unsupervised learning in three steps: Categorization, Clustering, and Classification. This 3-C strategy runs as follows. Using external medical knowledge, we categorize the available features into three types: the patients’ assigned disease diagnosis, clinical measurements, and potential biological markers, where the last may include genomic and brain imaging information. To reduce the number of clinical measurements, a supervised learning algorithm (Random Forest) is applied and only the best-predicting features are kept. We then use unsupervised learning to create new clinical manifestation classes by clustering the selected clinical measurements. The profiles of these clinical manifestation classes are described visually using profile plots and analytically using decision trees, in order to facilitate their clinical interpretation. Finally, we classify the new clinical manifestation classes by relying on the potential biological markers. Our strategy strives to connect potential biomarkers with classes of clinical and functional manifestation, both expressed by meaningful features. We demonstrate this strategy using data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) cohort.
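The three steps described above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' implementation: it assumes scikit-learn and NumPy are available, uses k-means as a stand-in for whichever clustering method the chapter applies, and every name, parameter value, and data-generating choice (`three_c`, `n_keep`, `n_clusters`, the simulated features) is hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier, export_text


def three_c(diagnosis, clinical, biomarkers, n_keep=3, n_clusters=3, seed=0):
    """Hypothetical sketch of the Categorize -> Cluster -> Classify idea."""
    # Categorization is assumed to be done by the caller: the three inputs
    # are the assigned diagnosis, the clinical measurements, and the
    # potential biological markers, split using external medical knowledge.

    # Feature reduction: rank clinical measurements by how well they
    # predict the assigned diagnosis and keep only the n_keep best.
    forest = RandomForestClassifier(n_estimators=200, random_state=seed)
    forest.fit(clinical, diagnosis)
    keep = np.argsort(forest.feature_importances_)[::-1][:n_keep]
    selected = clinical[:, keep]

    # Clustering: build new clinical manifestation classes from the
    # selected (standardized) measurements. k-means is an assumption here.
    z = (selected - selected.mean(axis=0)) / selected.std(axis=0)
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    clusters = kmeans.fit_predict(z)

    # Analytic description: a shallow decision tree that explains the
    # clusters in terms of the original clinical measurements.
    tree = DecisionTreeClassifier(max_depth=3, random_state=seed)
    tree.fit(selected, clusters)
    names = [f"clinical_{i}" for i in keep]
    rules = export_text(tree, feature_names=names)

    # Classification: how well do the potential biomarkers predict the
    # new classes? Reported as mean 5-fold cross-validated accuracy.
    clf = RandomForestClassifier(n_estimators=200, random_state=seed)
    accuracy = cross_val_score(clf, biomarkers, clusters, cv=5).mean()
    return keep, clusters, rules, accuracy


# Synthetic data (purely illustrative): three hidden manifestation
# classes, of which the coarse diagnosis only separates the first.
rng = np.random.default_rng(0)
n = 300
latent = rng.integers(0, 3, size=n)
clinical = rng.normal(size=(n, 8))
clinical[:, :3] += 3.0 * np.eye(3)[latent]  # only 3 informative features
biomarkers = rng.normal(size=(n, 5))
biomarkers[:, 0] += 2.0 * latent            # one informative biomarker
diagnosis = (latent > 0).astype(int)

keep, clusters, rules, accuracy = three_c(diagnosis, clinical, biomarkers)
print(rules)
print(f"cross-validated accuracy from biomarkers: {accuracy:.2f}")
```

In this sketch the decision-tree rules play the role of the analytic cluster description, and the cross-validated accuracy indicates how strongly the biomarkers are tied to the new classes. A real analysis would also need a principled way to choose the number of clusters.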
An Erratum for this chapter can be found at http://dx.doi.org/10.1007/978-3-319-11812-3_31
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Galili, T., Mitelpunkt, A., Shachar, N., Marcus-Kalish, M., Benjamini, Y. (2014). Categorize, Cluster, and Classify: A 3-C Strategy for Scientific Discovery in the Medical Informatics Platform of the Human Brain Project. In: Džeroski, S., Panov, P., Kocev, D., Todorovski, L. (eds) Discovery Science. DS 2014. Lecture Notes in Computer Science, vol 8777. Springer, Cham. https://doi.org/10.1007/978-3-319-11812-3_7
DOI: https://doi.org/10.1007/978-3-319-11812-3_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11811-6
Online ISBN: 978-3-319-11812-3
eBook Packages: Computer Science (R0)