Abstract
Mass spectrometry is an important technique for analyzing proteins and other biomolecular compounds in biological samples. Each mass spectrometer vendor uses a different proprietary binary output file format, which has hindered data sharing and the development of open source software for downstream analysis. The solution has been to develop, with the full participation of academic researchers as well as software and hardware vendors, an open XML-based format for encoding mass spectrometer output files, and then to write software that uses this format for archiving, sharing, and processing. This chapter presents the components of this format, mzML, and the information available for it. In addition to the XML schema that defines the file structure, a controlled vocabulary provides clear terms and definitions for the spectral metadata, and a semantic validation rules mapping file allows the mzML semantic validator to ensure that an mzML document complies with one of several levels of requirements. Complete documentation and example files ensure that the format can be implemented uniformly. At the time of release, several implementations of the format already existed, and vendors had committed to supporting the format in their products.
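As a rough illustration of how spectral metadata is tied to controlled vocabulary terms, the following minimal Python sketch reads the cvParam elements from a small hand-written fragment. The fragment is simplified for illustration and is not a schema-valid mzML document; the helper name read_cv_params is hypothetical and not part of any mzML library.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written fragment for illustration only;
# a real mzML file wraps spectra in many more required elements.
FRAGMENT = """\
<spectrum xmlns="http://psi.hupo.org/ms/mzml" index="0" id="scan=1" defaultArrayLength="0">
  <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="1"/>
  <cvParam cvRef="MS" accession="MS:1000579" name="MS1 spectrum" value=""/>
</spectrum>
"""

# Namespace prefix mapping used for the element lookups below.
NS = {"mz": "http://psi.hupo.org/ms/mzml"}


def read_cv_params(xml_text):
    """Return {accession: (name, value)} for each direct cvParam child of the root."""
    root = ET.fromstring(xml_text)
    return {
        p.get("accession"): (p.get("name"), p.get("value", ""))
        for p in root.findall("mz:cvParam", NS)
    }


params = read_cv_params(FRAGMENT)
for accession, (name, value) in params.items():
    print(accession, name, value)
```

Each cvParam carries an accession that points into the controlled vocabulary, so a reader can rely on the accession rather than on free-text labels. Checking that the terms used are allowed in a given context is the job of the semantic validator, which this sketch does not attempt.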
Acknowledgments
The design of mzML was a long process that involved a great many contributors. The author would like to thank the following for their contributions to mzML: Lennart Martens, Pierre-Alain Binz, Darren Kessner, Matt Chambers, Luisa Montecchi-Palazzi, Jim Shofstahl, Josh Tasman, Randall K Julian, Fredrik Levander, Puneet Souda, Jari Häkkinen, Brian Pratt, Erik Nilsson, Mike Coleman, Luis Mendoza, David Shteynberg, Lars Nilse, Benito Cañas, Lola Gutierrez, Alberto Medina, Trish Whetzel, Eva Duchoslav, Henning Hermjakob, Angel Pizarro, Phil Jones, Jimmy Eng, Kent Laursen, Sandra Orchard, Chris Taylor, Patrick Pedrioli, Sean Seymour, David Creasy, Howard Read, Jim Langridge, Jayson Falkner, David Horn, Ruth McNally, Ron Beavis, Norman Paton, Marc Sturm, Parag Mallick, Rune Philosof, David Sparkman, Wilfred Tang, Marius Kallhardt, and Ruedi Aebersold.
EWD has been funded in part with Federal funds from the National Heart, Lung, and Blood Institute, National Institutes of Health, under contract No. N01-HV-28179, and from P50 GM076547/Center for Systems Biology.
Copyright information
© 2010 Humana Press, a part of Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Deutsch, E.W. (2010). Mass Spectrometer Output File Format mzML. In: Hubbard, S., Jones, A. (eds) Proteome Bioinformatics. Methods in Molecular Biology™, vol 604. Humana Press. https://doi.org/10.1007/978-1-60761-444-9_22
DOI: https://doi.org/10.1007/978-1-60761-444-9_22
Publisher Name: Humana Press
Print ISBN: 978-1-60761-443-2
Online ISBN: 978-1-60761-444-9
eBook Packages: Springer Protocols