Data Processing for GC-MS- and LC-MS-Based Untargeted Metabolomics

Part of the book series: Methods in Molecular Biology (MIMB, volume 1978)

Abstract

Gas chromatography and liquid chromatography coupled to mass spectrometry are used extensively in untargeted metabolomics, which involves profiling small-molecule metabolites in biological samples. The complex raw datasets produced by untargeted metabolomics require proper processing before they can be statistically analyzed and interpreted. This chapter describes a high-throughput data processing workflow routinely used in our laboratory, including feature detection and alignment, data reduction, and spectral-matching-based annotation. This semiautomated workflow uses vendor-neutral data file formats and freely available data processing tools and can therefore be readily implemented on datasets acquired from instruments of different vendors.
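
As a rough illustration of what spectral-matching-based annotation involves, the sketch below scores a query spectrum against a small reference library using cosine similarity on binned m/z values. It is a generic example and not the chapter's own implementation: the function names (`bin_spectrum`, `cosine_score`, `best_match`), the 1.0 bin width, and the toy peak lists are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum the intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[round(mz / bin_width)] += intensity
    return dict(binned)


def cosine_score(query, reference, bin_width=1.0):
    """Cosine similarity between two spectra after binning (0.0 to 1.0)."""
    q = bin_spectrum(query, bin_width)
    r = bin_spectrum(reference, bin_width)
    # Only bins present in both spectra contribute to the dot product.
    dot = sum(q[b] * r[b] for b in q.keys() & r.keys())
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in r.values())
    )
    return dot / norm if norm else 0.0


def best_match(query, library, bin_width=1.0):
    """Return (name, score) of the highest-scoring entry in a non-empty library."""
    scored = ((name, cosine_score(query, ref, bin_width)) for name, ref in library.items())
    return max(scored, key=lambda item: item[1])


# Hypothetical toy data: peak lists of (m/z, intensity) pairs.
query = [(73.0, 100.0), (147.1, 50.0)]
library = {
    "compound_A": [(73.0, 200.0), (147.0, 100.0)],
    "compound_B": [(91.0, 100.0)],
}
name, score = best_match(query, library)
```

Because cosine similarity ignores absolute intensity scale, `compound_A` scores highest here even though its intensities are twice those of the query. Practical tools typically add intensity weighting, m/z tolerance handling, and retention-time or retention-index filters on top of a score like this.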

Author information

Corresponding authors

Correspondence to Corey D. Broeckling or Jessica E. Prenni.

Copyright information

© 2019 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Yao, L., Sheflin, A.M., Broeckling, C.D., Prenni, J.E. (2019). Data Processing for GC-MS- and LC-MS-Based Untargeted Metabolomics. In: D'Alessandro, A. (eds) High-Throughput Metabolomics. Methods in Molecular Biology, vol 1978. Humana, New York, NY. https://doi.org/10.1007/978-1-4939-9236-2_18

  • DOI: https://doi.org/10.1007/978-1-4939-9236-2_18

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-4939-9235-5

  • Online ISBN: 978-1-4939-9236-2

  • eBook Packages: Springer Protocols
