ABSTRACT
Genome sequencing has developed rapidly, driven by high-throughput next-generation sequencing (NGS) technologies, automation of biological experiments, new bioinformatics tools, and the use of high-performance and cloud computing. ChIP-based NGS technologies, e.g. ChIP-seq and ChIP-exo, are widely used to detect the binding sites of DNA-interacting proteins in the genome and provide a deeper mechanistic understanding of genomic regulation. As ChIP-based NGS pipelines generate sequencing data at an unprecedented pace, there is an urgent need for a metadata management system. To meet this need, we developed the Platform for Eukaryotic Genomic Regulation (PEGR), a web service platform that logs metadata for samples and sequencing experiments, manages data processing workflows, and provides reporting and visualization. PEGR links together people, samples, protocols, DNA sequencers, and bioinformatics computation. With PEGR, scientists can gain a more integrated view of their sequencing data and a better understanding of the mechanisms of genomic regulation. In this paper, we present the architecture and major functionalities of PEGR, share our experience in developing the application, and discuss future directions.