Unbiased analysis of potential targets of breast cancer susceptibility loci by Capture Hi-C
- Nicola H. Dryden1,5,
- Laura R. Broome1,5,
- Frank Dudbridge2,
- Nichola Johnson1,
- Nick Orr1,
- Stefan Schoenfelder3,
- Takashi Nagano3,
- Simon Andrews4,
- Steven Wingett4,
- Iwanka Kozarewa1,
- Ioannis Assiotis1,
- Kerry Fenwick1,
- Sarah L. Maguire1,
- James Campbell1,
- Rachael Natrajan1,
- Maryou Lambros1,
- Eleni Perrakis1,
- Alan Ashworth1,
- Peter Fraser3 and
- Olivia Fletcher1
- 1Breakthrough Breast Cancer Research Centre, Institute of Cancer Research, London SW3 6JB, United Kingdom;
- 2Department of Non-communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, London WC1E 7HT, United Kingdom;
- 3Nuclear Dynamics Programme, The Babraham Institute, Cambridge CB22 3AT, United Kingdom;
- 4Babraham Bioinformatics, The Babraham Institute, Cambridge CB22 3AT, United Kingdom
- Corresponding author: Olivia.Fletcher{at}icr.ac.uk
- 5These authors contributed equally to this work.
Abstract
Genome-wide association studies have identified more than 70 common variants that are associated with breast cancer risk. Most of these variants map to non-protein-coding regions and several map to gene deserts, regions of several hundred kilobases lacking protein-coding genes. We hypothesized that gene deserts harbor long-range regulatory elements that can physically interact with target genes to influence their expression. To test this, we developed Capture Hi-C (CHi-C), which, by incorporating a sequence capture step into a Hi-C protocol, allows high-resolution analysis of targeted regions of the genome. We used CHi-C to investigate long-range interactions at three breast cancer gene deserts mapping to 2q35, 8q24.21, and 9q31.2. We identified interaction peaks between putative regulatory elements (“bait fragments”) within the captured regions and “targets” that included both protein-coding genes and long noncoding (lnc) RNAs over distances of 6.6 kb to 2.6 Mb. Target protein-coding genes were IGFBP5, KLF4, NSMCE2, and MYC; target lncRNAs included DIRC3, PVT1, and CCDC26. For one gene desert, we were able to define two SNPs (rs12613955 and rs4442975) that were highly correlated with the published risk variant and that mapped within the bait end of an interaction peak. In vivo ChIP-qPCR data show that one of these, rs4442975, affects the binding of FOXA1 and implicate this SNP as a putative functional variant.
Footnotes
- [Supplemental material is available for this article.]
- Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.175034.114.
- Received March 7, 2014.
- Accepted August 6, 2014.
This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.