Constrained De Novo Sequencing of Peptides with Application to Conotoxins

  • Conference paper
Research in Computational Molecular Biology (RECOMB 2011)

Abstract

We describe algorithms for incorporating prior sequence knowledge into the candidate generation stage of de novo peptide sequencing by tandem mass spectrometry. We focus on two types of prior knowledge: homology to known sequences, encoded by a regular expression or position-specific score matrix, and amino acid content, encoded by a multiset of required residues. We show an application to de novo sequencing of cone snail toxins, which are molecules of special interest as pharmaceutical leads and as probes to study ion channels. Cone snail toxins usually contain 2, 4, 6, or 8 cysteine residues, and the number of cysteines can be determined by a relatively simple mass spectrometry experiment. We show here that prior knowledge of the number of cysteines in a precursor ion is highly advantageous for de novo sequencing.
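
To illustrate why a known cysteine count narrows the search, the following minimal Python sketch counts candidate residue strings of a given total mass, with and without requiring an exact number of cysteines. It is not the authors' algorithm: the reduced alphabet, the use of integer nominal residue masses, and the function name `count_candidates` are assumptions made purely for illustration.

```python
from functools import lru_cache

# Nominal (integer) residue masses in Da. The alphabet is deliberately
# reduced to eight residues; this is an assumption for illustration only.
RESIDUE_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97,
    "V": 99, "T": 101, "C": 103, "L": 113,
}


def count_candidates(total_mass, constrained=None, exact_count=0):
    """Count residue strings whose nominal masses sum to total_mass.

    If `constrained` names a residue, only strings containing exactly
    `exact_count` copies of that residue are counted (hypothetical
    stand-in for a content constraint such as a known cysteine count).
    """

    @lru_cache(maxsize=None)
    def count(mass_left, used):
        if mass_left == 0:
            # A complete string: accept it only if the constraint is met.
            return 1 if constrained is None or used == exact_count else 0
        total = 0
        for residue, mass in RESIDUE_MASS.items():
            if mass > mass_left:
                continue
            is_constrained = residue == constrained
            if is_constrained and used == exact_count:
                continue  # adding this residue would exceed the allowed count
            total += count(mass_left - mass, used + (1 if is_constrained else 0))
        return total

    return count(total_mass, 0)


if __name__ == "__main__":
    mass = 174
    print("unconstrained:", count_candidates(mass))
    print("exactly one C:", count_candidates(mass, "C", 1))
```

For a nominal mass of 174 Da, this toy alphabet admits three strings (AC, CA, SS), and requiring exactly one cysteine leaves two. The reduction becomes far larger at realistic peptide masses, which is the intuition behind constraining candidate generation by residue content.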

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bhatia, S. et al. (2011). Constrained De Novo Sequencing of Peptides with Application to Conotoxins. In: Bafna, V., Sahinalp, S.C. (eds) Research in Computational Molecular Biology. RECOMB 2011. Lecture Notes in Computer Science, vol 6577. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20036-6_2

  • DOI: https://doi.org/10.1007/978-3-642-20036-6_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20035-9

  • Online ISBN: 978-3-642-20036-6

  • eBook Packages: Computer Science, Computer Science (R0)
