Deep Learning-Assisted Analysis of Immunopeptidomics Data

Peptidomics

Part of the book series: Methods in Molecular Biology ((MIMB,volume 2758))

Abstract

Liquid chromatography-coupled tandem mass spectrometry (LC-MS/MS) is the primary method to obtain direct evidence for the presentation of disease- or patient-specific human leukocyte antigen (HLA) peptides. However, compared to the analysis of tryptic peptides in proteomics, the analysis of HLA peptides still poses computational and statistical challenges. Recently, matching scores based on fragment ion intensities, which assess the similarity between predicted and observed spectra, were shown to substantially increase the number of confidently identified peptides, particularly in use cases where non-tryptic peptides are analyzed. In this chapter, we describe in detail three procedures that use state-of-the-art deep learning models to analyze and validate single spectra, single measurements, and multiple measurements in mass spectrometry-based immunopeptidomics. For this, we explain how to use the Universal Spectrum Explorer (USE), online Oktoberfest, and offline Oktoberfest. For intensity-based scoring, Oktoberfest uses fragment ion intensity and retention time predictions from Prosit, a deep neural network trained on a very large number of synthetic peptides and tandem mass spectra generated within the ProteomeTools project. The examples shown highlight how deep learning-assisted analysis can increase the number of identified HLA peptides, facilitate the confident identification of neo-epitopes, and help assess the presence of cryptic peptides, such as spliced peptides.
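
To make the idea of an intensity-based matching score concrete, the sketch below computes a normalized spectral angle between a predicted and an observed fragment ion intensity vector. It assumes both vectors are already aligned, so that position i refers to the same fragment ion in each. The function name and the example intensities are hypothetical, and this is not Oktoberfest's implementation, which may handle missing or filtered peaks differently.

```python
import math


def spectral_angle(predicted, observed):
    """Normalized spectral angle between two aligned intensity vectors.

    Returns a value in [0, 1] for non-negative intensities:
    1 means identical relative intensities, 0 means no shared signal.
    """
    if len(predicted) != len(observed):
        raise ValueError("intensity vectors must have the same length")

    dot = sum(p * o for p, o in zip(predicted, observed))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_o = math.sqrt(sum(o * o for o in observed))

    # An empty spectrum carries no evidence, so score it as no match.
    if norm_p == 0.0 or norm_o == 0.0:
        return 0.0

    # Clamp to guard against floating-point values slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_p * norm_o)))
    return 1.0 - 2.0 * math.acos(cosine) / math.pi


# Hypothetical intensities for five aligned fragment ions.
predicted = [0.10, 0.80, 0.00, 0.45, 1.00]
observed = [0.12, 0.75, 0.05, 0.40, 1.00]
score = spectral_angle(predicted, observed)
```

Because the score depends only on relative intensities, scaling either spectrum by a constant leaves it unchanged. A score of this kind can be supplied as one feature among several to a rescoring step.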


Acknowledgments

We thank all members of the kusterlab and wilhelmlab for fruitful discussions. This work was funded in part by the German Federal Ministry of Education and Research (BMBF; Grant Nos. 031L0008A and 031L0168) and by the European Union’s Horizon 2020 Program under Grant Agreement 823839 (H2020-INFRAIA-2018-1; EPIC-XS).

Author information

Corresponding author

Correspondence to Mathias Wilhelm.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Gabriel, W., Picciani, M., The, M., Wilhelm, M. (2024). Deep Learning-Assisted Analysis of Immunopeptidomics Data. In: Schrader, M., Fricker, L.D. (eds) Peptidomics. Methods in Molecular Biology, vol 2758. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-3646-6_25

  • DOI: https://doi.org/10.1007/978-1-0716-3646-6_25

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-3645-9

  • Online ISBN: 978-1-0716-3646-6

  • eBook Packages: Springer Protocols
