Abstract
In this protocol, we explain how to run ab initio motif discovery in order to gather putative transcription factor binding motifs (TFBMs) from sets of genomic regions returned by ChIP-seq experiments. The protocol starts from a set of peak coordinates (genomic regions) which can be either downloaded from ChIP-seq databases, or produced by a peak-calling software tool. We provide a concise description of the successive steps to discover motifs, cluster the motifs returned by different motif discovery algorithms, and compare them with reference motif databases. The protocol is documented with detailed notes explaining the rationale underlying the choice of options. The interpretation of the results is illustrated with an example from the model plant Arabidopsis thaliana.
*The authors of this chapter contributed equally to the work.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsReferences
Robertson G, Hirst M, Bainbridge M, Bilenky M, Zhao Y, Zeng T, Euskirchen G, Bernier B, Varhol R, Delaney A, Thiessen N, Griffith OL, He A, Marra M, Snyder M, Jones S (2007) Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nat Methods 4:651–657
Mardis ER (2007) ChIP-seq: welcome to the new frontier. Nat Methods 4:613–614
Kulakovskiy IV, Boeva VA, Favorov AV, Makeev VJ (2010) Deep and wide digging for binding motifs in ChIP-Seq data. Bioinformatics 26:2622–2623
Machanick P, Bailey TL (2011) MEME-ChIP: motif analysis of large DNA datasets. Bioinformatics 27:1696–1697
Thomas-Chollier M, Darbo E, Herrmann C, Defrance M, Thieffry D, van Helden J (2012) A complete workflow for the analysis of full-size ChIP-seq (and similar) data sets using peak-motifs. Nat Protoc 7:1551–1568
Thomas-Chollier M, Herrmann C, Defrance M, Sand O, Thieffry D, van Helden J (2012) RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets. Nucleic Acids Res 40, e31
Medina-Rivera A, Defrance M, Sand O, Herrmann C, Castro-Mondragon JA, Delerce J, Jaeger S, Blanchet C, Vincens P, Caron C, Staines DM, Contreras-Moreira B, Artufel M, Charbonnier-Khamvongsa L, Hernandez C, Thieffry D, Thomas-Chollier M, van Helden J (2015) RSAT 2015: Regulatory Sequence Analysis Tools. Nucleic Acids Res 43:W50–W56
Pepke S, Wold B, Mortazavi A (2009) Computation for ChIP-seq and RNA-seq studies. Nat Methods 6:S22–S32
Steinhauser S, Kurzawa N, Eils R, Herrmann C (2016) A comprehensive comparison of tools for differential ChIP-seq analysis. Brief Bioinform. doi:10.1093/bib/bbv110
Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, Nusbaum C, Myers RM, Brown M, Li W, Liu XS (2008) Model-based analysis of ChIP-Seq (MACS). Genome Biol 9(9):R137. doi:10.1186/gb-2008-9-9-r137
Heinz S, Benner C, Spann N, Bertolino E, Lin YC, Laslo P, Cheng JX, Murre C, Singh H, Glass CK (2010) Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell 38(4):576–589
Wilder S (2009) SWEMBL: a generic peak-calling program. Unpublished. http://www.ebi.ac.uk/~swilder/SWEMBL/
Kharchenko PV, Tolstorukov MY, Park PJ (2008) Design and analysis of ChIP-seq experiments for DNA-binding proteins. Nat Biotechnol 26:1351–1359. doi:10.1038/nbt.1508
Schneider TD, Stephens RM (1990) Sequence logos: a new way to display consensus sequences. Nucleic Acids Res 18:6097–6100
Jolma A, Yan J, Whitington T, Toivonen J, Nitta KR, Rastas P, Morgunova E, Enge M, Taipale M, Wei G, Palin K, Vaquerizas JM, Vincentelli R, Luscombe NM, Hughes TR, Lemaire P, Ukkonen E, Kivioja T, Taipale J (2013) DNA-binding specificities of human transcription factors. Cell 152:327–339
Sebastian A, Contreras-Moreira B (2014) footprintDB: a database of transcription factors with annotated cis elements and binding interfaces. Bioinformatics 30:258–265
Thorvaldsdóttir H, Robinson JT, Mesirov JP (2013) Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform 14:178–192
Barrett T, Wilhite SE, Ledoux P, Evangelista C, Kim IF, Tomashevsky M, Marshall KA, Phillippy KH, Sherman PM, Holko M, Yefanov A, Lee H, Zhang N, Robertson CL, Serova N, Davis S, Soboleva A (2013) NCBI GEO: archive for functional genomics data sets—update. Nucleic Acids Res 41:D991–D995
Kolesnikov N, Hastings E, Keays M, Melnichuk O, Tang YA, Williams E, Dylag M, Kurbatova N, Brandizi M, Burdett T, Megy K, Pilicheva E, Rustici G, Tikhonov A, Parkinson H, Petryszak R, Sarkans U, Brazma A (2015) ArrayExpress update—simplifying data submissions. Nucleic Acids Res 43:D1113–D1116
Kobayashi K, Suzuki T, Iwata E et al (2015) Transcriptional repression by MYB3R proteins regulates plant organ growth. EMBO J 34:1992–2007
Ito M, Araki S, Matsunaga S, Itoh T, Nishihama R, Machida Y, Doonan JH, Watanabe A (2001) G2/M-phase-specific transcription during the plant cell cycle is mediated by c-Myb-like transcription factors. Plant Cell 13:1891–1905
van Helden J, André B, Collado-Vides J (1998) Extracting regulatory sites from the upstream region of yeast genes by computational analysis of oligonucleotide frequencies. J Mol Biol 281:827–842
Acknowledgements
We thank C. Dubos for feedback on MYBR3 proteins. This work was funded in part by Fundación ARAID and by the Enseignants-Chercheurs invités program of Aix-Marseille Université (to B.C.M.). C.R. was supported by the France Génomique National infrastructure, funded as part of the Investissements d’Avenir, program managed by the Agence Nationale pour la Recherche (contract ANR-10-INBS-09). J.C-M PhD grant is funded by the Ecole Doctorale des Sciences de la Vie et de la Santé (EDSVS), Aix-Marseille Université.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer Science+Business Media New York
About this protocol
Cite this protocol
Castro-Mondragon, J.A., Rioualen, C., Contreras-Moreira, B., van Helden, J. (2016). RSAT::Plants: Motif Discovery in ChIP-Seq Peaks of Plant Genomes. In: Hehl, R. (eds) Plant Synthetic Promoters. Methods in Molecular Biology, vol 1482. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-6396-6_19
Download citation
DOI: https://doi.org/10.1007/978-1-4939-6396-6_19
Published:
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-6394-2
Online ISBN: 978-1-4939-6396-6
eBook Packages: Springer Protocols