Proteomics Data Exchange and Storage: The Need for Common Standards and Public Repositories

  • Protocol
Mass Spectrometry Data Analysis in Proteomics

Part of the book series: Methods in Molecular Biology (MIMB, volume 1007)

Abstract

The existence of both data standards and public databases or repositories has been a key factor behind the development of the existing “omics” approaches. In this book chapter we first review the main existing mass spectrometry (MS)-based proteomics resources: PRIDE, PeptideAtlas, GPMDB, and Tranche. Second, we report on the current status of the proteomics data standards developed by the Proteomics Standards Initiative (PSI), reviewing the formats mzML, mzIdentML, mzQuantML, TraML, and PSI-MI XML.

Finally, we present an easy way to query and access MS proteomics data in the PRIDE database, chosen as a representative of the existing repositories, using the workflow management system (WMS) tool Taverna. Two publicly available workflows are described.

Abbreviations

CV: Controlled vocabulary
DAS: Distributed annotation system
(HUPO) PSI: (Human Proteome Organisation) proteomics standards initiative
GO: Gene ontology
GPMDB: Global Proteome Machine DataBase
IMEX: International Molecular EXchange (collaboration)
MGF: Mascot Generic Format
MI: Molecular interaction
MIAPE: Minimum information about a proteomics experiment
MOPED: Model Organism Protein Expression Database
MRM: Multiple reaction monitoring
PASSEL: PeptideAtlas SRM Experiment Library
PRIDE: PRoteomics IDEntifications (database)
PSICQUIC: Proteomics Standard Initiative Common QUery InterfaCe
SRM: Selected reaction monitoring
TPP: Trans proteomic pipeline
WMS: Workflow management system

Acknowledgments

R.C.J. is supported by the NHLBI Proteomics Center Award HHSN268201000035C. J.A.V. is supported by the EU FP7 grants LipidomicNet [grant number 202272] and ProteomeXchange [grant number 260558].

Copyright information

© 2013 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Jiménez, R.C., Vizcaíno, J.A. (2013). Proteomics Data Exchange and Storage: The Need for Common Standards and Public Repositories. In: Matthiesen, R. (ed.) Mass Spectrometry Data Analysis in Proteomics. Methods in Molecular Biology, vol 1007. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-392-3_14

  • DOI: https://doi.org/10.1007/978-1-62703-392-3_14

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-62703-391-6

  • Online ISBN: 978-1-62703-392-3

  • eBook Packages: Springer Protocols
