Abstract
Liquid chromatography-coupled mass spectrometry (LC-MS/MS) is the primary method to obtain direct evidence for the presentation of disease- or patient-specific human leukocyte antigen (HLA) peptides. However, compared to the analysis of tryptic peptides in proteomics, the analysis of HLA peptides still poses computational and statistical challenges. Recently, fragment ion intensity-based matching scores, which assess the similarity between predicted and observed spectra, were shown to substantially increase the number of confidently identified peptides, particularly in use cases where non-tryptic peptides are analyzed. In this chapter, we describe in detail three procedures that use state-of-the-art deep learning models to analyze and validate single spectra, single measurements, and multiple measurements in mass spectrometry-based immunopeptidomics. For this, we explain how to use the Universal Spectrum Explorer (USE), online Oktoberfest, and offline Oktoberfest. For intensity-based scoring, Oktoberfest uses fragment ion intensity and retention time predictions from Prosit, a deep neural network trained on a very large number of synthetic peptides and tandem mass spectra generated within the ProteomeTools project. The examples shown highlight how deep learning-assisted analysis can increase the number of identified HLA peptides, facilitate the discovery of confidently identified neo-epitopes, and assist in assessing the presence of cryptic peptides, such as spliced peptides.
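To make the idea of an intensity-based matching score concrete, the following minimal sketch computes a normalized spectral angle between a predicted and an observed fragment intensity vector. The abstract does not name the score used, so the choice of the spectral angle is an assumption made for illustration; the function name and the requirement that both vectors list the same fragment ions in the same order are likewise hypothetical and do not reflect the Oktoberfest or Prosit API.

```python
import math


def spectral_angle(predicted, observed):
    """Return a normalized spectral angle in [0, 1].

    Illustrative assumption: both inputs hold intensities for the same
    fragment ions in the same order. 1 means identical intensity
    patterns, 0 means no shared signal.
    """
    if len(predicted) != len(observed):
        raise ValueError("intensity vectors must have the same length")

    dot = sum(p * o for p, o in zip(predicted, observed))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_o = math.sqrt(sum(o * o for o in observed))
    if norm_p == 0 or norm_o == 0:
        # An empty spectrum cannot match anything.
        return 0.0

    # Clamp to guard against floating-point drift outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_p * norm_o)))
    return 1.0 - 2.0 * math.acos(cosine) / math.pi


# Hypothetical intensities for five fragment ions of one peptide.
predicted = [0.10, 0.80, 0.05, 0.40, 0.00]
observed = [0.12, 0.75, 0.00, 0.45, 0.02]
score = spectral_angle(predicted, observed)
```

Because the vectors are normalized before comparison, the score depends only on the relative intensity pattern and not on the absolute signal, which is why such scores can complement the search engine score during rescoring.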
Acknowledgments
We thank all members of the kusterlab and wilhelmlab for fruitful discussions. This work was funded in part by the German Federal Ministry of Education and Research (BMBF; Grant Nos. 031L0008A and 031L0168) and by the European Union’s Horizon 2020 Program under Grant Agreement 823839 (H2020-INFRAIA-2018-1; EPIC-XS).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Gabriel, W., Picciani, M., The, M., Wilhelm, M. (2024). Deep Learning-Assisted Analysis of Immunopeptidomics Data. In: Schrader, M., Fricker, L.D. (eds) Peptidomics. Methods in Molecular Biology, vol 2758. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-3646-6_25
DOI: https://doi.org/10.1007/978-1-0716-3646-6_25
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-3645-9
Online ISBN: 978-1-0716-3646-6
eBook Packages: Springer Protocols