Statistical strategies for relating metabolomics and proteomics data: a real case study in nutrition research area

  • Original Article
  • Metabolomics

Abstract

The current investigations were carried out in the context of a nutritional case study assessing the postnatal impact of maternal dietary protein restriction during pregnancy and lactation on the plasma metabolome and hypothalamic proteome of rat offspring. Although data generated by different "Omics" technologies are usually considered and analyzed separately, their interrelation may offer a valuable opportunity for assessing the emerging "integrated biology" concept. The overall analysis strategy first investigated data pretreatment and variable selection for each dataset. Then, three multivariate analyses were applied to investigate the links between the abundance of metabolites and the expression of proteins measured on the same samples. Unfold principal component analysis and regularized canonical correlation analysis did not take into account the groups of individuals defined by the intervention study. In contrast, the predictive multiblock partial least squares method used this information. Regularized canonical correlation analysis proved to be a relevant approach for investigating the relationships between the two datasets. However, to highlight the molecular compounds (proteins and metabolites) associated in interacting or common metabolic pathways for the experimental groups, multiblock partial least squares was the most appropriate method in the present nutritional case study.
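To make the idea of regularized canonical correlation analysis more concrete, the following is a minimal Python sketch of one common ridge-type formulation. It is not the authors' implementation: the function names `inv_sqrt` and `rcca`, the regularization values, and the synthetic data are all assumptions chosen for illustration. In practice the regularization parameters would typically be tuned, for example by cross-validation.

```python
import numpy as np


def inv_sqrt(C):
    """Inverse square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(C)
    return (V / np.sqrt(w)) @ V.T


def rcca(X, Y, lam_x, lam_y, n_comp=2):
    """Ridge-regularized CCA between two blocks measured on the same samples.

    Returns the regularized canonical correlations and the sample scores
    for each block. lam_x and lam_y (hypothetical names) are the ridge
    penalties added to the within-block covariance matrices.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Ridge terms keep the covariances invertible when variables outnumber samples.
    Cxx = X.T @ X / (n - 1) + lam_x * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + lam_y * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)
    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy, full_matrices=False)
    A = Wx @ U[:, :n_comp]      # canonical weights for block X
    B = Wy @ Vt.T[:, :n_comp]   # canonical weights for block Y
    return s[:n_comp], X @ A, Y @ B


# Synthetic stand-ins for a "metabolite" block and a "protein" block
# sharing one latent factor (purely illustrative data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(20, 1))
metabolites = latent @ rng.normal(size=(1, 30)) + 0.5 * rng.normal(size=(20, 30))
proteins = latent @ rng.normal(size=(1, 15)) + 0.5 * rng.normal(size=(20, 15))
rho, scores_m, scores_p = rcca(metabolites, proteins, lam_x=0.1, lam_y=0.1)
print("regularized canonical correlations:", rho)
```

Note that this sketch, like the unsupervised analyses described above, ignores any grouping of the samples; a supervised method such as multiblock partial least squares would additionally use the group labels.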

Acknowledgments

This work was supported by a grant from the Agence Nationale de la Recherche (ProtNeonat ANR-05-PNRA-009 grant). We acknowledge the contribution of Bérengère Coupé, PhD, who generated most of the animal samples, sponsored by a doctoral fellowship from Institut National de la Recherche Agronomique and Région Pays de la Loire and granted by La Fondation Louis Bonduelle (France). We acknowledge Emilie Bailly for her help with 2-DE gels and image analysis, and Hélène Rogniaux for her help with protein identification by MALDI-TOF and LC–MS/MS. We also acknowledge the financial support of the NUPEM project (NUtrition Périnatale et Empreinte Métabolique) from the Pays de la Loire Region. Finally, the authors would like to thank Dominique Darmaun (Director of the Laboratory) for his careful re-reading of the paper.

Author information

Corresponding author

Correspondence to Thomas Moyon.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOC 1172 kb)

About this article

Cite this article

Moyon, T., Le Marec, F., Qannari, E.M. et al. Statistical strategies for relating metabolomics and proteomics data: a real case study in nutrition research area. Metabolomics 8, 1090–1101 (2012). https://doi.org/10.1007/s11306-012-0415-7

  • DOI: https://doi.org/10.1007/s11306-012-0415-7
