Bioinformatics Approaches for Predicting Disordered Protein Motifs

Intrinsically Disordered Proteins Studied by NMR Spectroscopy

Part of the book series: Advances in Experimental Medicine and Biology (AEMB, volume 870)

Abstract

Short linear motifs (SLiMs) in proteins are functional microdomains consisting of contiguous residue segments along the protein sequence, typically no more than 10 consecutive amino acids in length with fewer than 5 defined positions. Many positions are ‘degenerate’, offering flexibility in the amino acid types allowed at those positions. Their short length and degenerate nature confer evolutionary plasticity, meaning that SLiMs often evolve convergently. Further, SLiMs have a propensity to occur within intrinsically unstructured protein segments, which confers versatile functionality on unstructured regions of the proteome. SLiMs mediate multiple types of protein interactions based on domain-peptide recognition and guide functions including posttranslational modification, subcellular localization of proteins, and ligand binding. SLiMs thus behave as modular interaction units that confer versatility on protein function, and SLiM-mediated interactions are increasingly recognized as therapeutic targets. In this chapter we start with a brief description of the properties of SLiMs and their interactions, then discuss algorithms and tools, including several web-based methods, that enable the discovery of novel SLiMs (de novo motif discovery) as well as the prediction of novel occurrences of known SLiMs. Both individual amino acid sequences and sets of protein sequences can be scanned using these methods to obtain statistically overrepresented sequence patterns. Lists of putatively functional SLiMs are then assembled based on parameters such as evolutionary sequence conservation, disorder scores, structural data, gene ontology terms and other contextual information that helps to assess the functional credibility or significance of these motifs. These bioinformatics methods should guide experiments aimed at motif discovery.
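
As a rough illustration of the "novel occurrences of known SLiMs" step described in the abstract, the Python sketch below scans one sequence with a degenerate motif written as a regular expression and then keeps only hits that fall in a predicted disordered region. Everything here is an assumption made for illustration: the pattern, the toy sequence, the per-residue disorder scores, the 0.5 threshold, and the function names `scan_motif` and `filter_by_disorder` are hypothetical and do not correspond to any tool discussed in the chapter. In practice the scores would come from a disorder predictor rather than being typed in by hand.

```python
import re
from typing import Iterator, List, NamedTuple, Sequence


class MotifHit(NamedTuple):
    start: int      # 0-based start index in the sequence
    end: int        # exclusive end index
    peptide: str    # matched residues


def scan_motif(sequence: str, pattern: str) -> Iterator[MotifHit]:
    """Yield every (possibly overlapping) match of a regex motif."""
    # A zero-width lookahead lets overlapping occurrences be reported.
    regex = re.compile(f"(?=({pattern}))")
    for match in regex.finditer(sequence):
        peptide = match.group(1)
        yield MotifHit(match.start(), match.start() + len(peptide), peptide)


def filter_by_disorder(
    hits: Sequence[MotifHit],
    disorder_scores: Sequence[float],
    threshold: float = 0.5,  # assumed cutoff, not taken from the chapter
) -> List[MotifHit]:
    """Keep hits whose mean per-residue disorder score meets the threshold."""
    kept = []
    for hit in hits:
        window = disorder_scores[hit.start:hit.end]
        if window and sum(window) / len(window) >= threshold:
            kept.append(hit)
    return kept


# Illustrative pattern: one fixed residue (P), two degenerate classes
# ([ST] and [KR]) and one wildcard position (.).
TOY_PATTERN = "[ST]P.[KR]"
toy_sequence = "MASPEKGGTPLRAAA"
toy_scores = [0.9] * 8 + [0.2] * 7  # made-up disorder scores, one per residue

hits = list(scan_motif(toy_sequence, TOY_PATTERN))
disordered_hits = filter_by_disorder(hits, toy_scores)

for hit in disordered_hits:
    print(hit.start + 1, hit.end, hit.peptide)  # 1-based residue range
```

Because such short, degenerate patterns match by chance very often, a scan like this is only a first pass; the contextual filters listed in the abstract (conservation, disorder, structural data, gene ontology terms) are what narrow the candidate list.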

Author information

Corresponding author

Correspondence to Peter Tompa.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Bhowmick, P., Guharoy, M., Tompa, P. (2015). Bioinformatics Approaches for Predicting Disordered Protein Motifs. In: Felli, I., Pierattelli, R. (eds) Intrinsically Disordered Proteins Studied by NMR Spectroscopy. Advances in Experimental Medicine and Biology, vol 870. Springer, Cham. https://doi.org/10.1007/978-3-319-20164-1_9
