Using the Evolutionary History of Proteins to Engineer Insertion-Deletion Mutants from Robust, Ancestral Templates Using Graphical Representation of Ancestral Sequence Predictions (GRASP)

  • Protocol
Enzyme Engineering

Part of the book series: Methods in Molecular Biology ((MIMB,volume 2397))

Abstract

Analyzing the natural evolution of proteins by ancestral sequence reconstruction (ASR) can provide valuable information about the changes in sequence and structure that drive the development of novel protein functions. ASR has also been used as a protein engineering tool, because it often generates thermostable proteins that can serve as robust and evolvable templates for enzyme engineering. Importantly, ASR has the potential to provide insight into the history of insertions and deletions (indels) that have occurred during the evolution of a protein family. Indels are strongly associated with functional change during enzyme evolution and represent a largely unexplored source of genetic diversity for designing proteins with novel or improved properties. Current ASR methods differ in the way they handle indels. Inclusion or exclusion of indels is often managed subjectively, based on assumptions the user makes about the likelihood of each recombination event, and most currently available ASR tools provide limited, if any, opportunities for evaluating indel placement in a reconstructed sequence. Graphical Representation of Ancestral Sequence Predictions (GRASP) is an ASR tool that maps indel evolution throughout a reconstruction and enables the evaluation of indel variants. This chapter provides a general protocol for performing a reconstruction using GRASP and using the results to create indel variants. The method addresses protein template selection, sequence curation, alignment refinement, tree building, ancestor reconstruction, evaluation of indel variants, and approaches to library development.
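To make the indel question concrete, the following minimal Python sketch shows one generic way to ask whether an alignment column was occupied in an ancestor: code each column as residue present (1) or gap (0), then apply Fitch parsimony up a rooted binary tree. This is an illustration only and is not GRASP's algorithm. The toy alignment, the tree, and the function names (`gap_profile`, `fitch_states`, `root_presence`) are all hypothetical.

```python
# Illustrative sketch only: binary gap coding plus Fitch parsimony.
# This is NOT how GRASP infers indels; all names and data are hypothetical.

def gap_profile(aligned):
    """Encode each aligned sequence as 1 (residue) or 0 (gap) per column."""
    return {name: [0 if ch == "-" else 1 for ch in seq]
            for name, seq in aligned.items()}

def fitch_states(tree, profile, column):
    """Bottom-up Fitch pass for one column.

    A tree is either a leaf name (str) or a pair (left, right).
    Returns the set of equally parsimonious states at this node.
    """
    if isinstance(tree, str):
        return {profile[tree][column]}
    left = fitch_states(tree[0], profile, column)
    right = fitch_states(tree[1], profile, column)
    shared = left & right
    # Children agree: keep the shared states. Otherwise keep both options.
    return shared if shared else left | right

def root_presence(tree, aligned):
    """Candidate presence/absence states at the root for every column."""
    profile = gap_profile(aligned)
    n_cols = len(next(iter(aligned.values())))
    return [fitch_states(tree, profile, c) for c in range(n_cols)]

# Toy data (assumed for illustration): four aligned sequences, rooted tree.
aligned = {
    "A": "MKT-LV",
    "B": "MKT-LV",
    "C": "MKTGLV",
    "D": "MK-GLV",
}
tree = (("A", "B"), ("C", "D"))

for col, states in enumerate(root_presence(tree, aligned)):
    label = "ambiguous" if len(states) > 1 else ("present" if 1 in states else "absent")
    print(col, label)
```

In this toy case the third column resolves to "present" at the root (a single deletion in D explains it), whereas the fourth column is ambiguous: an insertion on one branch and a deletion on the other are equally parsimonious. Columns of this second kind are where an analyst has to make a judgement call and where indel variants are worth evaluating.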

Author information

Corresponding author

Correspondence to Elizabeth M. J. Gillam.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Ross, C.M., Foley, G., Boden, M., Gillam, E.M.J. (2022). Using the Evolutionary History of Proteins to Engineer Insertion-Deletion Mutants from Robust, Ancestral Templates Using Graphical Representation of Ancestral Sequence Predictions (GRASP). In: Magnani, F., Marabelli, C., Paradisi, F. (eds) Enzyme Engineering. Methods in Molecular Biology, vol 2397. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1826-4_6

  • DOI: https://doi.org/10.1007/978-1-0716-1826-4_6

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-1825-7

  • Online ISBN: 978-1-0716-1826-4

  • eBook Packages: Springer Protocols
