ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia

  1. Michael Snyder1,28
  1. Department of Genetics, Stanford University, Stanford, California 94305, USA;
  2. Division of Biology, California Institute of Technology, Pasadena, California 92116, USA;
  3. Department of Computer Science, Stanford University, Stanford, California 94305, USA;
  4. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA;
  5. HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA;
  6. Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA;
  7. Department of Statistics, University of California, Berkeley, California 94720, USA;
  8. Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Harvard School of Public Health, Boston, Massachusetts 02215, USA;
  9. Computational Biology & Bioinformatics Program, Yale University, New Haven, Connecticut 06511, USA;
  10. Department of Computer Science and Center for Systems Biology, Duke University, Durham, North Carolina 27708, USA;
  11. Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA;
  12. Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, Section of Molecular Genetics and Microbiology, University of Texas at Austin, Austin, Texas 78701, USA;
  13. Center for Biomedical Informatics, Harvard Medical School, Boston, Massachusetts 02115, USA;
  14. Division of Genetics, Department of Medicine, Brigham & Women’s Hospital, Boston, Massachusetts 02115, USA;
  15. Institute for Genomics and Systems Biology, University of Chicago, Chicago, Illinois 60637, USA;
  16. Department of Statistics, Penn State University, University Park, Pennsylvania 16802, USA;
  17. Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA;
  18. National Human Genome Research Institute/National Institutes of Health, Rockville, Maryland 20852, USA;
  19. Ontario Institute for Cancer Research, Toronto, Ontario M5G 0A3, Canada;
  20. Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, Connecticut 06511, USA;
  21. Department of Pathology, Stanford University, Stanford, California 94305, USA;
  22. Department of Medicine, University of Washington, Seattle, Washington 98195, USA;
  23. University of Massachusetts Medical School, Worcester, Massachusetts 01655, USA;
  24. Department of Biochemistry & Molecular Biology, Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, California 90089, USA;
  25. Department of Biology, Carolina Center for Genome Sciences, and Lineberger Comprehensive Cancer Center, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
    • 26 These authors contributed equally to this work.

    • 27 Present address: Institute for Genome Sciences and Policy, Duke University, Durham, NC 27708, USA.

    Abstract

    Chromatin immunoprecipitation (ChIP) followed by high-throughput DNA sequencing (ChIP-seq) has become a valuable and widely used approach for mapping the genomic location of transcription-factor binding and histone modifications in living cells. Despite its widespread use, there are considerable differences in how these experiments are conducted, how the results are scored and evaluated for quality, and how the data and metadata are archived for public use. These practices affect the quality and utility of any global ChIP experiment. Through our experience in performing ChIP-seq experiments, the ENCODE and modENCODE consortia have developed a set of working standards and guidelines for ChIP experiments that are updated routinely. The current guidelines address antibody validation, experimental replication, sequencing depth, data and metadata reporting, and data quality assessment. We discuss how ChIP quality, assessed in these ways, affects different uses of ChIP-seq data. All data sets used in the analysis have been deposited for public viewing and downloading at the ENCODE (http://encodeproject.org/ENCODE/) and modENCODE (http://www.modencode.org/) portals.

    Footnotes

    • 28 Corresponding authors

      E-mail mpsnyder@stanford.edu

      E-mail woldb@caltech.edu

      E-mail jlieb@bio.unc.edu

      E-mail pfarnham@usc.edu

    • [Supplemental material is available for this article.]

    • Article and supplemental material are at http://www.genome.org/cgi/doi/10.1101/gr.136184.111.

      Freely available online through the Genome Research Open Access option.

    • Received December 10, 2011.
    • Accepted May 10, 2012.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/.
