Functional characterization of T2D-associated SNP effects on baseline and ER stress-responsive β cell transcriptional activation

Khetan, Shubham; Kales, Susan; Kursawe, Romy; Jillette, Alexandria; Ulirsch, Jacob C.; Reilly, Steven K.; Ucar, Duygu; Tewhey, Ryan; Stitzel, Michael L.

doi:10.1038/s41467-021-25514-6

Article
Open access
Published: 02 September 2021

Nature Communications volume 12, Article number: 5242 (2021)

Abstract

Genome-wide association studies (GWAS) have linked single nucleotide polymorphisms (SNPs) at >250 loci in the human genome to type 2 diabetes (T2D) risk. For each locus, identifying the functional variant(s) among multiple SNPs in high linkage disequilibrium is critical to understand molecular mechanisms underlying T2D genetic risk. Using massively parallel reporter assays (MPRA), we test the cis-regulatory effects of SNPs associated with T2D and altered in vivo islet chromatin accessibility in MIN6 β cells under steady state and pathophysiologic endoplasmic reticulum (ER) stress conditions. We identify 1,982/6,621 (29.9%) SNP-containing elements that activate transcription in MIN6 and 879 SNP alleles that modulate MPRA activity. Multiple T2D-associated SNPs alter the activity of short interspersed nuclear element (SINE)-containing elements that are strongly induced by ER stress. We identify 220 functional variants at 104 T2D association signals, narrowing 54 signals to a single candidate SNP. Together, this study identifies elements driving β cell steady state and ER stress-responsive transcriptional activation, nominates causal T2D SNPs, and uncovers potential roles for repetitive elements in β cell transcriptional stress response and T2D genetics.

Introduction

Type 2 diabetes (T2D) is a complex disease with both genetic and environmental risk factors that ultimately manifests when pancreatic β cells are unable to secrete adequate amounts of insulin in response to elevated blood glucose levels^1,2. Genome-wide association studies (GWAS) have identified single nucleotide polymorphisms (SNPs) representing 403 association signals in 243 regions of the human genome (loci) for genetic risk of developing T2D^3,4. The overwhelming majority (~90%) of these T2D-associated GWAS SNPs are non-coding, suggesting that altered transcriptional regulation is a common molecular mechanism underlying disease risk in these loci^3,4,5,6. Identifying the functional variant(s) among multiple SNPs that are in linkage disequilibrium (LD) at each T2D GWAS locus is an important step to convert these statistical association signals into molecular and biological insights.

Although studies over the past several years clearly implicate altered islet cis-regulatory element (CRE) activity in T2D genetic risk and progression, they have co-localized only ~1/4 of T2D-associated loci to altered chromatin accessibility and/or gene expression levels in islets^5,7,8,9,10. This may be partially because previous studies measured the effect of genetic variants on chromatin accessibility (caQTLs) and gene expression levels (eQTLs) in islets under steady-state conditions^7,11, consequently missing the role of genetic variants whose functions emerge only under certain cellular conditions. Uncovering such genotype-environment interactions is critical for a complex disorder like T2D.

Endoplasmic reticulum (ER) stress and the unfolded protein response (UPR) contribute to physiologic processes governing β cell protein quality control, insulin processing/secretion, and to pathophysiologic events contributing to islet failure in T2D^12,13,14,15. Mild to moderate ER stress can elicit beneficial responses, such as β cell proliferation, to meet higher demand for insulin synthesis and secretion¹⁶. However, sustained insulin production demands of insulin resistance, associated with modern sedentary lifestyles¹⁷ and overnutrition^18,19,20, can intensify ER stress and activate terminal UPR, leading to β cell dysfunction and death^13,14. Genetic modulation of β cell ER folding capacity can ameliorate²¹ or exacerbate²² β cell death. Non-coding T2D risk alleles may therefore modulate the transcription of genes and pathways that alter ER stress responses and the UPR.

Massively parallel reporter assays (MPRA) are functional genomic tools to interrogate the transcription activating potential of thousands of sequences simultaneously²³. By introducing nucleotide changes in a given sequence of interest, the effect of naturally occurring variants in the human population on MPRA activity can also be elucidated²⁴. Recent studies have employed MPRA to identify functional SNPs associated with different conditions including red blood cell traits²⁵, adiposity²⁶, osteoarthritis²⁷, and eQTLs²⁴.

Here, we use MPRA to comprehensively test 2512 index and genetically linked (r²≥ 0.8) SNP/indel alleles representing 259 GWAS association signals for T2D and related quantitative traits from the NHGRI/EBI GWAS Catalogue, as well as 4124 SNPs residing in in vivo accessible chromatin sites in human islets, for their ability to modulate transcriptional activation in β cells under steady state and (patho)physiologic ER stress conditions. We identify 1982/6621 (29.9%) SNP-containing elements that activate transcription in MIN6 and 879 MPRA activity-modulating alleles. Multiple T2D-associated SNPs alter the activity of short interspersed nuclear element (SINE)-containing elements induced by ER stress. Importantly, MPRA uncovers 220 functional variants at 104 T2D association signals, narrowing 54 signals to a single candidate SNP. By identifying elements driving β cell steady state and ER stress-responsive transcriptional activation, we nominate putative causal T2D SNPs and uncover potential roles for repetitive elements in β cell transcriptional stress response and T2D genetics.

Results

Selection and testing of sequences for MPRA activity in β cells

To identify CRE sequences that activate β cell transcription and to determine how SNP alleles alter this activity, we employed MPRA in MIN6 β cells. The MPRA library consisted of two-hundred base pair (bp) sequences from the human genome containing each allele for SNPs including: (i) 2512 index or linked (EUR r²≥ 0.8) SNPs/indels from 259 T2D and related quantitative trait association signals in the NHGRI/EBI GWAS Catalogue²⁸ (“T2D SNPs”); (ii) 1910 SNPs significantly associated with changes in human islet chromatin accessibility (“caQTL SNPs”)⁷; and (iii) 2214 SNPs that overlapped human islet ATAC-seq peaks, but were not significantly associated with changes in human islet chromatin accessibility (“non-caQTL SNPs”)⁷ (Methods; Fig. 1a).

**Fig. 1: Massively parallel reporter assay (MPRA) identifies steady state and ER stress-responsive transcription activating sequences in β cells.**

Each sequence was cloned upstream of a minimal promoter controlling transcription of GFP mRNAs with distinct 20 bp barcodes in their 3′ end (Fig. 1a). This MPRA plasmid library was transfected into biological replicates of MIN6 mouse β cells and tested for transcriptional activity in three conditions: standard culture, 24-h exposure to the ER stress-inducing agent thapsigargin (250 nM Tg), or DMSO solvent control conditions (Fig. 1a). For each experimental condition (standard culture, DMSO, Tg), cells were harvested a total of thirty hours after transfection for RNA isolation, GFP mRNA capture, and Illumina sequencing of the sequence-associated barcodes (Methods; Fig. 1a). RNA expression of the transfected MPRA libraries was highly correlated between all replicates for each condition and clustered distinctly from the MPRA plasmid library input (Supplementary Figs. 1a–c). Under standard culture conditions, 1373 SNPs (20.7%) had at least one allele exhibiting significantly higher RNA-seq counts compared to the plasmid library input at FDR < 1% (Fig. 1b). We refer to these sequences as ‘MPRA active’ throughout the remainder of the manuscript.
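
The comparison of RNA barcode counts with plasmid input counts can be illustrated with a simple log2 ratio of depth-normalized counts. The sketch below is a simplification under stated assumptions, not the statistical model used in this study: the function name `log2_activity`, the pseudocount, and the example counts are all hypothetical, and barcode counts are assumed to have already been summed per element.

```python
import math


def log2_activity(rna_counts, dna_counts, pseudocount=1.0):
    """Return log2(RNA fraction / plasmid fraction) for each element.

    Both inputs map an element id to its summed barcode read count.
    Fractions are computed within each library to correct for depth.
    """
    ids = sorted(set(rna_counts) & set(dna_counts))
    rna_total = sum(rna_counts[i] + pseudocount for i in ids)
    dna_total = sum(dna_counts[i] + pseudocount for i in ids)
    activity = {}
    for i in ids:
        rna_frac = (rna_counts[i] + pseudocount) / rna_total
        dna_frac = (dna_counts[i] + pseudocount) / dna_total
        activity[i] = math.log2(rna_frac / dna_frac)
    return activity


# Hypothetical counts for two elements (not data from this study).
rna = {"elem_A": 300, "elem_B": 100}
dna = {"elem_A": 100, "elem_B": 100}
activity = log2_activity(rna, dna, pseudocount=0.0)
for element, value in activity.items():
    print(element, round(value, 3))
```

A positive value indicates that an element is over-represented in RNA relative to the plasmid input. Calling an element 'MPRA active' additionally requires a significance test across replicates with multiple-testing correction, which this sketch omits.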

While prior work has routinely used MIN6 mouse β cells to provide meaningful insights into human islet cis-regulatory control and the transcriptional effects of T2D risk alleles^{5,6,7,9,11,29,30,31}, we first confirmed that MIN6 β cells appropriately modeled the cis-regulatory potential of human islets. Binding motifs of 99 transcription factors (TFs) were significantly enriched (FDR < 1%) in MPRA active elements in MIN6 β cells, which notably included motifs for TFs with reported roles in modulating β cell identity and function (Hnf1, MafA/B, Foxo1)^{32,33,34,35,36}, glucose-stimulated insulin secretion (Bcl11a³⁷, LXRE³⁸, RARa³⁹), ER quality control and insulin folding and processing (Atfs), circadian regulation of β cell functions (Clock), and regulation of T2D SNP-containing cis-REs (Foxa2^29,40, Rfx5¹¹) (Fig. 1c and Supplementary Data 1). Several of these same motifs were enriched in sorted human islet β cell ATAC-seq peaks (e.g., Fox, Mef2c)⁴¹ and were among the top motifs identified as predictors of human islet regulatory features by recent computational analyses⁴². In addition, empiric binding sites (i.e., ChIP-seq peaks) for PDX1, NKX6.1, MAFB, and FOXA2 in human islets⁸ were enriched in the MPRA active elements (Supplementary Fig. 2a). In a comparison with nine human tissues, the chromatin accessibility profile of MIN6 β cells most resembled that of human islets (Supplementary Figs. 2b–c). Furthermore, human-mouse sequence similarity did not influence the probability that an element was MPRA active (Supplementary Fig. 3a), and elements overlapping human islet ATAC-seq peaks were more likely to be MPRA active in MIN6 mouse β cells than those that did not (Supplementary Fig. 3b).

Together, comprehensive testing, identification, and analysis of thousands of human sequences using MPRA in MIN6 revealed regulatory features and sequence motifs empirically linked to steady-state β cell transcriptional activation and reinforced MIN6 cells as a valid cellular model to test the β cell transcriptional potential of human sequences.

Identification of ER stress-responsive β cell regulatory sequences

The high burden of insulin production and secretion makes β cells particularly susceptible to ER stress⁴³, and ER stress has been implicated in the genetic etiology and pathophysiology of both monogenic diabetes^44,45 and T2D²². Thapsigargin (Tg) blocks calcium transport into the ER lumen and has been widely used to induce the UPR in β cells^{46,47,48,49,50,51,52,53}. To identify ER stress-responsive sequences, we compared MPRA activity of sequences in MIN6 cells treated with Tg versus DMSO solvent control (Fig. 1a).

Treatment with Tg increased the expression of ER stress response genes Ddit3 (Chop), Hspa5, and Edem1, and reduced Ins2 expression, confirming UPR induction in MIN6 cells (Supplementary Fig. 1d). ER stress increased the MPRA activity of 328 sequences (representing ≥1 allele(s) of 275 elements) and decreased MPRA activity of 656 sequences (≥1 allele of 449 elements) (Fig. 1d). Elements with increased MPRA activity under ER stress were enriched for motifs of TFs that mediate transcriptional responses to uncompensated ER stress and whose activity and/or abundance is increased in T2D patient islets⁵⁴, such as ATF4 and DDIT3/CHOP^15,55,56,57, as well as factors linked to pathophysiologic epigenetic changes in the beta cells of diet-induced obese mice (Mef2a⁵⁸) and beta-cell senescence (THRa⁵⁹) (Fig. 1e; Supplementary Data 2). Elements with decreased MPRA activity were enriched for motifs of β cell TFs that control insulin transcription and secretion^60,61, such as MAFA, FOXA2, and PAX6 (Fig. 1e; Supplementary Data 2). Consistently, Ins2 expression was significantly decreased by Tg (Supplementary Fig. 1d), suggesting ER stress leads to the inactivation of the β cell-specific TFs. MPRA thus revealed β cell regulatory elements that respond to ER stress and provided a functional readout of TF dynamics in this (patho)physiologic β cell stress response.

MPRA identified 1982 elements in total for which one or both alleles activated transcription in MIN6 β cells for at least one of the experimental conditions tested (Fig. 1f). T2D SNP-containing elements had the lowest proportion of active elements among the three categories of SNPs tested by MPRA (Fig. 1g; Fisher’s exact p = 1.50e−63, caQTL vs. T2D; p = 1.23e−19, non-caQTL vs. T2D), presumably because the vast majority of those tested (n = 2299/2512) did not overlap islet ATAC-seq peaks. Although both caQTL and non-caQTL SNPs overlapped islet ATAC-seq peaks, a significantly higher fraction of elements containing caQTL SNPs were MPRA active (Fig. 1g; Fisher’s exact p = 3.12e−16, caQTL vs. non-caQTL). We hypothesize this is due to the closer proximity of caQTL SNPs to islet ATAC-seq peak summits, as SNP-to-ATAC-seq peak summit proximity was associated with increased MPRA activity (Supplementary Fig. 4).

MPRA identifies SNPs altering β cell transcriptional activity

To identify SNPs that altered transcriptional activation in β cells, we compared MPRA activity of each allele under each experimental condition (standard culture, DMSO, Tg). Combining all three categories (caQTL, T2D, and non-caQTL), 879 SNPs exhibited allelic effects on MPRA activity (FDR < 10%) in one or more experimental condition (Fig. 2a, Supplementary Fig. 5a–c, Supplementary Data 3 and 4). For 98.2% (n = 332/338; Binomial test p = 7.2 × 10⁻⁹⁰) of SNPs that altered MPRA activity in multiple conditions, the direction of allelic effects was concordant (Supplementary Fig. 5a–c). To assess if MPRA properly captured in vivo allelic effects, we assessed their concordance with caQTL effects. Strikingly, 82.8% (n = 246/297; p = 8.6 × 10⁻³², binomial test) of caQTL alleles that increased MPRA activity were associated with increased chromatin accessibility in islets (Fig. 2b), underscoring MPRA’s ability to report allelic effects relevant to their endogenous, in vivo consequences.
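
The concordance statistics above can be checked with an exact binomial test. The sketch below assumes a two-sided test against a null concordance probability of 0.5, which the text does not state explicitly; the function name `binomial_two_sided_p` is illustrative. Under that assumption, the 332/338 comparison gives a p-value of about 7.2 × 10⁻⁹⁰, in line with the reported value.

```python
from math import comb


def binomial_two_sided_p(successes, trials):
    """Exact two-sided binomial p-value against a null probability of 0.5."""
    # Use the larger of the two outcome counts so the tail is the extreme one.
    k = max(successes, trials - successes)
    upper_tail = sum(comb(trials, i) for i in range(k, trials + 1))
    # The null distribution is symmetric at 0.5, so doubling one tail
    # gives the two-sided p-value (capped at 1).
    return min(1.0, 2 * upper_tail / 2 ** trials)


# Counts taken from the text; the null of 0.5 is an assumption.
p_direction = binomial_two_sided_p(332, 338)  # concordant direction across conditions
p_caqtl = binomial_two_sided_p(246, 297)      # MPRA vs. caQTL direction
print(f"{p_direction:.1e}")
print(f"{p_caqtl:.1e}")
```

Integer arithmetic with `math.comb` keeps the tail sum exact, so only the final division is subject to floating-point rounding.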

**Fig. 2: Identification of β cell transcription-modulating alleles.**

We next studied the impact of ER stress on SNP allelic effects and β cell transcriptional activation. As anticipated based on motif enrichment analyses (Fig. 1e), fewer caQTL SNPs exhibited significant allelic effects on MPRA activity in Tg-treated cells (Fig. 2a). This was driven by an overall reduction of activity of caQTL elements under ER stress (Fig. 2c) and associated with the enrichment of islet-specific TF binding motifs (e.g., FOXA2, MAFA) in islet caQTL ATAC-seq peaks (Fig. 1e)⁷. In contrast, a larger proportion of elements containing T2D SNPs with allelic effects by MPRA exhibited higher activity under ER stress conditions (Fig. 2c, d). Most of these T2D SNPs (n = 190/220) were not islet caQTL and did not overlap islet ATAC-seq peaks in our extensive set of >150,000 islet open chromatin sites from 19 donors. This may be due in part to the steady-state nature of these ATAC-seq profiles, which do not capture regulatory elements made accessible by ER stress-responsive TFs. Alternatively, the stress-responsive increases in these T2D SNP-containing elements may be mediated by repetitive elements, which have been shown to become activated by and enhance transcriptional stress responses^62,63,64. Consistent with the latter, 63% (139/220) of T2D SNPs with allelic effects on MPRA activity overlapped repetitive elements. Among three repetitive element categories, only SINEs were enriched in MPRA active elements. SINEs containing T2D-associated SNPs were five times more likely to be active than any other SNP-containing repetitive elements tested (Fig. 2e). Therefore, we next asked whether T2D SNP-containing elements that overlapped SINEs showed higher activity under ER stress. Indeed, 79.4% of T2D SNP-containing elements with higher MPRA activity under ER stress overlapped SINEs, a significantly higher proportion compared to T2D SNP-containing elements with lower or no change in MPRA activity under ER stress (Fig. 2f; Supplementary Data 3). 
Alu elements, the most common SINEs in the human genome, are primate-specific⁶⁵. When we investigated conservation in 20 mammalian genomes, T2D SNP-containing elements with higher MPRA activity under ER stress were indeed less likely to be conserved beyond the primate lineage (Supplementary Fig. 6).

In summary, MPRA revealed 879 SNPs that alter β cell transcriptional activation, including 220 T2D-associated SNPs. These SNP-containing elements exhibited distinct patterns of transcriptional activity in Tg- vs. DMSO-treated β cells, with caQTL and T2D SNP- containing elements losing or gaining activity under ER stress, respectively.

MPRA nominates functional SNPs at T2D GWAS signals

We next sought to nominate T2D causal variants within each locus from the set of 2512 SNPs tested by MPRA. We identified allelic effects for 220 T2D-associated SNPs in 104 distinct association signals (Fig. 3a; Supplementary Data 5–7 and Supplementary Fig. 5d–f), only 17 of which were reported index SNPs. The number of candidate functional variants identified within the association signals ranged from one to as many as ten, suggesting that some T2D association signals might have more complex effects on multiple genes and/or regulatory elements (Fig. 3b). Importantly, 6/6 T2D risk alleles that significantly altered MPRA activity and in vivo islet chromatin accessibility (islet caQTL)⁷ did so in the same direction, and T2D SNP effects on MPRA activity were consistent with their previously reported effects on in vitro luciferase reporter activity, including rs7903146 (TCF7L2)^6,66, rs1635852 (JAZF1)³⁰, rs12189774 (VEGFA)⁶⁷, rs2943656 (IRS1)⁷, and rs10428126 (IGF2BP2)^7,68.

**Fig. 3: MPRA identifies functional SNPs at 104 T2D-associated GWAS signals.**

10.4% (n = 23/220) of T2D SNPs exhibited specific allelic effects on MPRA activity when treated with Tg compared to baseline conditions (Fig. 3a; Supplementary Data 5 and Supplementary Fig. 5d–f). This included rs12189774, which is predicted to alter the binding of ATF5 (Supplementary Data 7), a TF that forms complexes with PDX1 and ATF4 in stressed β cells to modulate the expression of stress response and apoptosis genes⁶⁹. Although rs12189774 has not been empirically connected to any target gene(s) to date, it may modulate ER stress-responsive VEGFA expression based on reports that ER stress elicits XBP-1(s) and ATF4 binding to the VEGFA promoter and induces VEGFA expression in pancreatic beta cells and other cell types^70,71,72. 6.8% (n = 15/220) of T2D SNPs altered MPRA activity exclusively in ER stressed cells compared to baseline and DMSO-treated cells (Supplementary Fig. 5d–f; Supplementary Data 5).

For 54 T2D GWAS signals, a single SNP among those tested exhibited allelic effects on MPRA activity (Fig. 3b, black dots; Supplementary Data 6). For instance, rs17748864 was the only SNP among the eight tested in the PEX5L locus that altered MPRA activity (Fig. 3c). In contrast to the non-risk C allele, which was inactive in both DMSO and Tg conditions, the T2D risk T allele exhibited an 11-fold greater activity in MPRA and had significantly higher MPRA activity under ER stress (Fig. 3d). These allelic differences in MPRA activity were supported biochemically in electrophoretic mobility shift assays (EMSAs), which identified allele-specific binding of the non-risk C allele by a protein or complex in MIN6 nuclear extracts (Fig. 3e, arrow). Together, these data suggest that the rs17748864 risk T allele conveys robust transcriptional activation activity to this sequence by abrogating the binding of a transcriptional repressor.

For 25 T2D-associated GWAS signals, we nominated two functional SNPs (Fig. 3b, red points; Supplementary Data 6). For example, in the LARP6 locus, rs11630895 and rs113350503 were the only functional SNPs among the 16 tested, all of which resided on the same tightly linked haplotype (r² = 0.97; Fig. 3f, g). The major alleles of these two SNPs displayed opposing directions-of-effect on MPRA activity in DMSO control conditions and were affected differently by ER stress (Fig. 3f, g). As with the PEX5L locus, EMSAs indicated that both transcription-lowering T2D SNP alleles are bound by distinct MIN6 nuclear factors (Fig. 3h, arrows). Thus, while our results provide conclusive support for allelic effects on in vitro binding and transcriptional output, further investigation to understand their relative contributions and directions-of-effect in their endogenous context is needed. Because oligo design and synthesis for this study occurred prior to the report of credible SNP sets for T2D-associated loci⁴, our LD-based approach to select SNPs for testing did not exhaustively test the functionality of all T2D credible set SNPs. However, for this and other GWAS signals, MPRA has nominated multiple high-priority candidate causal SNPs for targeted and mechanistic investigation in the future (Supplementary Data 5–7).

Integrating MPRA data with QTL maps refines T2D causal alleles/mechanisms

Finally, we sought to understand the in vivo consequences of the putative functional T2D SNPs nominated by MPRA. Integration with islet caQTL revealed striking correlations between SNP effects on in vitro MPRA activity and in vivo islet chromatin accessibility (Fig. 2b), which confirmed that the short sequence element tested by MPRA has function when residing in its broader, endogenous context. To define target genes, we determined islet transcripts whose abundance⁷³ was linked to the functional T2D SNP alleles nominated by MPRA. We found that T2D functional SNPs nominated by MPRA were more enriched for islet eQTLs than T2D SNPs without MPRA activity (Fig. 4a). Integration of MPRA with caQTLs and eQTLs confirmed previously reported effects of the rs10428126 T2D risk allele, which increased reporter activity (Fig. 2b), in vivo chromatin accessibility (Fig. 2b), and IGF2BP2 expression in islets (Fig. 4a).

**Fig. 4: MPRA identifies putative T2D causal SNPs altering islet expression (eQTL) and chromatin accessibility (caQTL).**

Importantly, this integrated approach refined QTL maps to improve insights at additional T2D loci. For example, at the RNF6 locus, rs4630391 was the only SNP to exert allelic effects on MPRA activity among 28 T2D SNPs in high LD that were tested (Fig. 4b). The C risk allele conveyed 30% higher MPRA activity than the T allele (Fig. 4c) and was associated with increased RNF6, CDK8, and WASF3 expression in human islets⁷³ (Fig. 4d). rs4630391 is one of eight SNPs in or immediately adjacent to an ATAC-seq peak containing an islet caQTL (Fig. 4e). Islet donors with AA genotype at the T2D-associated, lead caQTL SNP rs34584161 (n = 10) exhibited higher chromatin accessibility than those with AG (n = 8) or GG (n = 1) genotypes. Although rs34584161 has been recently reported as the SNP with the highest genetic posterior probability of being the causal allele for T2D association in this locus (PPAg = 0.67)⁴, only rs4630391 (PPAg = 0.037) exhibited transcription-modulating effects in MPRA (Fig. 4b). Interestingly, this SNP overlaps an Alu SINE element at the edge of an islet ATAC-seq peak and exhibits allelic effects under ER stress conditions (Fig. 4c, e). Importantly, allelic effects on MPRA activity and chromatin accessibility were concordant, i.e., the rs4630391-C allele was associated with both higher in vitro MPRA activity in MIN6 and increased in vivo chromatin accessibility in human islets (Fig. 4b, e). EMSA revealed that decreased transcriptional activity of the rs4630391-T allele was accompanied by increased T allele-specific binding in MIN6 nuclear extracts (Fig. 4f, arrow). While this does not definitively rule out that the lead caQTL SNP (rs34584161) or other SNPs in high LD are also functional, these data provide compelling functional support for rs4630391 as a putative causal variant despite its lower reported genetic posterior probability.

MPRA also refined the relative functional contributions of two T2D-associated SNPs in high LD. At the SLC35D3 locus, we previously identified rs6937795 and rs6917676, located 15 bp apart, as islet caQTLs⁷ for which the risk alleles were associated with significantly higher chromatin accessibility in islets (Fig. 5a). Their high LD (r²= 0.99) makes it impractical to separate the individual contributions of each SNP allele to altered regulatory function using population genetic approaches. MPRA, however, allowed each pairwise, synthetic combination to be tested independently to identify which of the SNPs has functional effects on β cell transcription. This revealed rs6917676, not rs6937795, as the SNP that altered transcriptional activity (Fig. 5b). Interrogation of recent steady-state islet eQTL data⁷³ indicated that the rs6917676 T allele, which increased MPRA activity, was associated with increased SLC35D3 expression (Fig. 5c). Allelic effects of rs6917676 were lost in ER stressed cells, while the activity and allelic effects of rs947734 and rs947735, two SNPs in high LD (r² = 1) ~4.6 kb away in a SINE, increased (Fig. 5d). While additional mechanistic investigation is clearly needed to understand the causal relationship to T2D, this observation illustrates the value of evaluating variant haplotypes and their effects in multiple conditions.
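
Testing each pairwise, synthetic combination of two nearby SNPs amounts to enumerating all four haplotypes on a shared sequence backbone. The sketch below illustrates that enumeration. The 40 nt backbone, the offsets, and the alleles are hypothetical placeholders (not the rs6937795/rs6917676 sequences), and `haplotype_oligos` is an illustrative helper rather than the design code used in this study.

```python
from itertools import product


def haplotype_oligos(reference, variants):
    """Build one oligo per allelic combination.

    reference: oligo sequence as a string.
    variants: list of (zero_based_offset, (allele_1, allele_2)) tuples for
        single-nucleotide variants located inside the oligo.
    Returns a dict mapping an allele label such as "G/C" to its sequence.
    """
    oligos = {}
    for combo in product(*(alleles for _, alleles in variants)):
        seq = list(reference)
        for (offset, _), allele in zip(variants, combo):
            seq[offset] = allele
        oligos["/".join(combo)] = "".join(seq)
    return oligos


# Hypothetical backbone with two SNPs placed 15 bp apart.
reference = "ACGT" * 10
variants = [(10, ("G", "A")), (25, ("C", "T"))]
oligos = haplotype_oligos(reference, variants)
for label, seq in oligos.items():
    print(label, seq)
```

Comparing the two oligos that differ only at the first SNP, and separately the two that differ only at the second, isolates the contribution of each variant even when the alleles are almost never separated in the population.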

**Fig. 5: MPRA refines T2D-associated haplotype in *SLC35D3* locus.**

Discussion

In this study, we tested 13,252 sequences containing alleles of 6,621 SNPs for their ability to activate and modulate transcription in MIN6 β cells under standard culture conditions and after ER stress or paired solvent control exposures. In total, 29.9% of elements (n = 1982/6621) exhibited increased β cell transcriptional activity from a minimal promoter. SNP alleles in 44.3% of these elements altered MPRA activity (n = 879/1982), including 220 SNPs associated with T2D risk by GWAS.
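
The two proportions above use different denominators: active elements are expressed as a share of all SNPs tested, whereas allelic effects are expressed as a share of active elements. A short arithmetic check using only the counts quoted in this paragraph (the helper `percent` is illustrative):

```python
def percent(numerator, denominator):
    """Return a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# Counts quoted in the text.
active = percent(1982, 6621)   # active elements among all SNPs tested
allelic = percent(879, 1982)   # SNPs with allelic effects among active elements
print(active, allelic)
```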

Multiple lines of evidence indicate that MPRA in MIN6 mouse β cells is capable of identifying features of β cell transcription activation of human sequences and allelic effects of SNPs on this activity, despite potential limitations of cross-species testing of human sequences using episomal assays. First, MPRA active elements were enriched for both motifs and empiric binding of several islet TFs governing human islet cell identity and function under steady-state conditions. Second, under ER stress, changes in MPRA activity reflect reported in vivo changes in TF levels and their activity in β cells^{54,55,56,58,59,60,61}. Finally, allelic effects on MPRA activity in MIN6 exhibited a significant, positive correlation with their effects on in vivo human islet chromatin accessibility. Moreover, the likelihood of MPRA activity was higher for elements in the vicinity of ATAC-seq peak summits, and SNPs closer to ATAC-seq peak summits were more likely to alter MPRA activity and chromatin accessibility. These results confirm cross-species transferability of this assay and help prioritize sequences within open chromatin regions for their importance in regulating human β cell transcriptional activity.

In total, 10.4% (n = 23/220) of T2D SNPs exhibiting allelic effects on MPRA activity did so only when treated with Tg compared to baseline conditions. This included a SNP that has a comparable PPAg to the top SNP in this association signal (PPAg = 0.093 vs. 0.12, respectively) and resides near VEGFA, a putative T2D effector gene whose expression is induced by ER stress via direct binding of the ER stress-responsive TFs XBP-1(s) and ATF4 to its promoter in pancreatic beta cells and other cell types^70,71,72. 6.8% (n = 15/220) of T2D SNPs that altered MPRA activity were detected exclusively in ER stressed cells. This proportion of T2D SNPs with ER stress-specific effects is similar to that of the broader set of SNPs altering ER stress-specific MPRA activity in this study (7.2%). Moreover, it is comparable to the reported percentage of context-specific regulatory element use or activation^{74,75,76,77,78}, including cytokine-responsive elements in islets (4.5%; n = 3798/84,162)⁷⁶ and latent enhancers in monocytes (8.1–15%)⁷⁹, and within the range of genetic variants reported to alter stimulus-responsive gene expression in monocytes exposed to three immunogenic stimuli (3, 9, and 17% response expression QTL upon exposures to LPS, MDP, and dsRNA, respectively)⁸⁰. In the future, it will be interesting and important to determine if the variants altering ER stress-responsive MPRA activity alter stress-responsive islet chromatin accessibility, active histone modifications, or target gene expression in vivo and, more broadly, to test SNP effects on cis-regulatory element use or activity in response to a range of (patho)physiologic stimuli and stressors.

Surprisingly, we found that multiple T2D SNP-containing elements in SINEs were active in MPRA and responded to ER stress with increased activity. These data suggest that SINEs, and SNPs within them, may play underappreciated roles in modulating β cell transcriptional programs in response to stress or other stimuli. Recent studies have contributed to an emerging appreciation of the importance of these elements in epigenetic and transcriptional regulation, demonstrating repetitive element-mediated oncogene activation and modulation of chromatin structure^{81,82,83,84,85,86,87,88,89}. Alu/SINEs have been shown to be transcriptionally induced by cellular stressors^63,90 and are emerging as pervasive transcriptional modulators of cellular functions and stress^{62,64,81,85,91}. Three of four SNPs tested by EMSA overlapped SINEs (rs113350503 and rs11630895 in the LARP6 locus, rs4630391 in the RNF6 locus) and showed specific binding for the allele with lower MPRA activity, suggesting that the activating alleles disrupt binding of a transcriptional repressor. Future studies elucidating target genes of these and other MPRA active, SINE-containing regulatory elements will be necessary to fully understand the functional consequences of sequence variation in these transcriptionally active repetitive element sequences and the potential role of SINE/Alu exaptation^83,92,93 in the genetics of islet (dys)function and T2D.

Finally, a key challenge in T2D genetics is to identify the functional SNPs from multiple variants in high LD per association signal. This is the first study to test thousands of T2D-associated SNPs for their empiric effects on transcriptional activation in β cells. MPRA identified 220 SNP alleles associated with T2D, representing 104 distinct association signals. Candidate causal T2D SNPs nominated by MPRA include those previously studied using targeted, low throughput luciferase assays, such as rs7903146 (TCF7L2)^6,66, rs1635852 (JAZF1)³⁰, rs12189774 (VEGFA)⁶⁷, rs2943656 (IRS1)⁷, and rs10428126 (IGF2BP2)^7,68. Importantly, directions-of-effect detected for the T2D risk alleles by MPRA were consistent with those observed in each previous study. At 54 T2D association signals, only one SNP among all SNPs tested exhibited significant allelic effects on MPRA, nominating it as a putative causal SNP for that respective signal. Two or more candidate causal SNPs were identified for 50 T2D association signals, including LARP6, wherein EMSA and MPRA demonstrated allelic effects on both nuclear factor binding and transcriptional activity for two SNPs in high LD. The components of T2D risk at this and the 49 other GWAS signals may therefore result from a combined effect of multiple functional SNPs. Although this study was underway before T2D credible set SNPs were reported⁴, select loci illustrate how MPRA may help to evaluate and prioritize them by providing functional evidence in support of SNPs with high genetic posterior probabilities (e.g., rs7903146 (TCF7L2, PPAg = 0.59), rs3802177 (SLC30A8, PPAg = 0.57), rs10811661 and rs10811660 (CDKN2A/B, PPAgs = 0.47, 0.41), rs11603349 (CENTD2/ARAP1, PPAg = 0.22), rs4846567 (LYPLAL1, PPAg = 0.17), rs2879813 (TP53INP1, PPAg = 0.14)) or by identifying SNPs with lower genetic posterior probability for T2D association signals, such as those in the RNF6, SLC35D3, ANK1/NKX6-3, JAZF1, SPRY2, THADA, and VEGFA loci, as candidate causal SNPs. 
However, SNPs with low T2D association posterior probabilities that MPRA identifies as functional may not be causal, and technical limitations such as MPRA design, fragment size, cell line used, and conditions tested may preclude the identification of some true causal SNPs. Therefore, comprehensive testing of T2D credible set SNPs by MPRA across multiple metabolic cell types and relevant (patho)physiologic states in the near future will be critical to modify posterior probabilities and nominate functional T2D variants.

Methods

MPRA library design

Our MPRA library comprised 200 base pair (bp) sequences for 6621 SNPs, with 100 bp of genomic sequence flanking each side of each SNP. The SNPs belong to three categories:

1. T2D-associated SNPs/indels: SNPs (n = 2299), small insertions (n = 72), and small deletions (n = 129) in linkage disequilibrium (r² ≥ 0.8) with T2D-associated index SNPs (n = 259) were selected as previously described⁹⁴ for synthesis and testing. Briefly, T2D-associated SNPs were retrieved from the NHGRI/EBI GWAS Catalog (accessed 19 January 2017) and LD-pruned using PLINK version 1.9⁹⁵ with parameters “--maf 0.05 --clump --clump-p1 0.0001 --clump-p2 0.01 --clump-r2 0.8 --clump-kb 1000” to remove index SNPs representing redundant association signals. Additional SNPs from the 1000 Genomes Phase 3 reference panel were identified and included based on their high LD (r² ≥ 0.8) in EUR with each retained index SNP.
2. Islet chromatin accessibility quantitative trait loci (caQTLs): 1910 SNPs previously identified as having a significant association with altered in vivo chromatin accessibility in islet samples were also included⁷. Only SNPs within a given ATAC-seq peak were considered and tested for their association with altered accessibility of that peak. For 1816 caQTLs, one SNP was found to show a significant correlation with chromatin accessibility changes in human islets, all of which were included in the MPRA library (Bonferroni adjusted p-values < 0.023). For 94 caQTLs, two SNPs <25 bp apart showed significant correlations with islet chromatin accessibility changes, so all four allelic combinations were synthesized and tested.
3. Non-caQTL SNPs: 2214 SNPs that overlapped islet ATAC-seq peaks but did not significantly alter the accessibility of those peaks⁷ were also synthesized and tested. Since islet caQTLs were identified in a relatively small cohort of individuals (n = 19), the following criteria were used to include SNPs for which the caQTL study was more appropriately powered to detect associations with chromatin accessibility: (i) unadjusted p value >0.2; and (ii) minor allele frequency >0.125. SNPs overlapping individual-specific peaks or sharing peaks with other SNPs were removed. The 2,214 non-caQTL SNPs were randomly selected for inclusion in the MPRA library from the 15,178 SNPs that passed these inclusion criteria.
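The non-caQTL selection in the third category can be sketched as a filter followed by random sampling. This is a minimal illustration rather than the authors' code: the record fields (`p_value`, `maf`, `individual_specific_peak`, `shares_peak`), the function name, and the random seed are all assumptions introduced here.

```python
import random

def select_non_caqtl(snps, n_select, seed=0):
    """Keep SNPs passing the stated criteria, then sample n_select of them at random.

    Each SNP is a dict with hypothetical keys: p_value, maf,
    individual_specific_peak, shares_peak.
    """
    eligible = [
        s for s in snps
        if s["p_value"] > 0.2                  # (i) unadjusted caQTL p value > 0.2
        and s["maf"] > 0.125                   # (ii) minor allele frequency > 0.125
        and not s["individual_specific_peak"]  # drop individual-specific peaks
        and not s["shares_peak"]               # drop SNPs sharing a peak with another SNP
    ]
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility only
    return rng.sample(eligible, min(n_select, len(eligible)))

# Toy records (not real data)
example = [
    {"id": "snpA", "p_value": 0.50, "maf": 0.30, "individual_specific_peak": False, "shares_peak": False},
    {"id": "snpB", "p_value": 0.10, "maf": 0.30, "individual_specific_peak": False, "shares_peak": False},
    {"id": "snpC", "p_value": 0.60, "maf": 0.05, "individual_specific_peak": False, "shares_peak": False},
    {"id": "snpD", "p_value": 0.90, "maf": 0.20, "individual_specific_peak": False, "shares_peak": True},
]
selected = select_non_caqtl(example, n_select=2)
```

In the study, the analogous step drew 2,214 SNPs from the 15,178 that passed the criteria.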

The vast majority of SNPs and elements tested belonged to only one of three categories. However, T2D-associated SNPs overlapping 13 ATAC-seq peaks were significantly associated with chromatin accessibility in islets (caQTLs). Therefore, for analysis purposes, whenever SNPs were required to belong to only one of the three categories above (such as Fig. 2a), they were not categorized as caQTLs, but as being T2D-associated only.

MPRA library construction

The MPRA library was constructed as previously described²⁴. Briefly, oligos were synthesized (Agilent Technologies) as 230 bp sequences containing 200 bp of genomic sequences and 15 bp of adaptor sequence on either end. Unique 20 bp barcodes were added by PCR along with additional constant sequence for subsequent incorporation into a backbone vector by Gibson assembly. The oligo library was expanded by electroporation into E. coli, and the resulting plasmid library was sequenced by Illumina 2 × 150 bp chemistry to acquire oligo-barcode pairings. The library underwent restriction digestion, and GFP with a minimal TATA promoter was inserted by Gibson assembly resulting in the 200 bp oligo sequence positioned directly upstream of the promoter and the 20 bp barcode falling in the 3′ UTR of GFP. After expansion within E. coli the final MPRA plasmid library was sequenced by Illumina 1 × 31 bp chemistry to acquire a baseline representation of each oligo-barcode pair within the library. Barcodes mapping to more than 1 sequence were discarded from all downstream analyses. Note: Two separate batches of the MPRA library were prepared. The first batch was used to perform MPRA under standard culture conditions. This MPRA library was then electroporated into E. coli to obtain a second batch of the MPRA library, which was used for the paired DMSO-Tg experiments.
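The barcode filtering step described above (discarding barcodes that map to more than one sequence) can be sketched as follows. This is an assumed, simplified representation: barcode-oligo pairings are given as tuples, the function name is hypothetical, and the toy barcodes are shorter than the 20 bp barcodes used in the study.

```python
from collections import defaultdict

def unambiguous_barcodes(pairs):
    """Return {barcode: oligo} for barcodes observed with exactly one oligo sequence."""
    seen = defaultdict(set)
    for barcode, oligo in pairs:
        seen[barcode].add(oligo)
    # Barcodes associated with two or more distinct oligos are dropped
    return {bc: next(iter(oligos)) for bc, oligos in seen.items() if len(oligos) == 1}

# Toy pairings (not real data): CCTA maps to two oligos and is discarded
pairs = [
    ("AAAC", "oligo1"),
    ("AAAC", "oligo1"),
    ("GGTT", "oligo1"),
    ("CCTA", "oligo2"),
    ("CCTA", "oligo3"),
]
kept = unambiguous_barcodes(pairs)
```

Note that several barcodes may legitimately tag the same oligo; only the reverse ambiguity is filtered.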

MPRA library transfection into MIN6 cells

Ten million MIN6 cells were seeded in each of seven 15 cm² dishes. The cells were 60–70% confluent the next day. The media in each 15 cm² dish was replaced with 20 ml of fresh media, and the cells were transfected with 7 µg of the MPRA plasmid library using 55 µl Lipofectamine 2000 (38% transfection efficiency). Six hours after transfection, media was either (i) not changed (MPRA under standard culture conditions), (ii) replaced with media containing 250 nM thapsigargin (Tg) dissolved in 0.025% DMSO, or (iii) replaced with media containing 0.025% DMSO. Thirty hours after transfection, cells were trypsinized and collected by centrifugation. Cell pellets were frozen at −80 °C. For each condition (standard culture, DMSO, or Tg), MIN6 cells were transfected on five separate days to generate biological replicates.

RNA isolation and MPRA RNA-seq library generation

RNA was extracted from frozen cell pellets using the Qiagen RNeasy Midi kit. Following DNase treatment, a mixture of 3 GFP-specific biotinylated primers (Supplementary Data 8; #120, #123, and #126) was used to immunoprecipitate GFP transcripts using Streptavidin C1 Dynabeads (Life Technologies). Following another round of DNase treatment, cDNA was synthesized from GFP mRNA using SuperScript IV and purified with AMPure XP beads. Quantitative PCR using primers specific for GFP (Supplementary Data 8; #34 and #52) was used to determine the cycle at which linear amplification begins for each replicate. Replicates were diluted to approximately the same concentration based on the qPCR results, and PCR with primers #34 and #52 was used to amplify barcodes associated with the ~13.5k sequences included in the MPRA library for each replicate (9 cycles for standard culture, and 13 cycles for DMSO/250 nM Tg). A second round of PCR (6 cycles) was used to add Illumina sequencing adaptors to the DNA/RNA replicates. The resulting MPRA barcode libraries were spiked with 5% PhiX and sequenced using Illumina single-end 31 bp chemistry (with 8 bp index read), clustered at 80–90% maximum density.

MPRA data analysis

Data from the MPRA was analyzed as previously described²⁴. Briefly, the sum of the barcode counts for each oligo within replicates was median normalized, and oligos showing differential expression relative to the plasmid input were identified by modeling a negative binomial distribution with DESeq2⁹⁶ and applying a false discovery rate (FDR) threshold of 1%. For sequences that displayed significant MPRA activity, a paired t-test was applied on the log-transformed mRNA/plasmid ratios for each experimental replicate to test whether the reference and alternate allele had similar activity. An FDR threshold of 10% was used to identify SNPs with significant effects on MPRA activity between alleles. Because the MPRA testing standard culture conditions was performed with a separate MPRA library preparation, the DMSO-Tg MPRA results were not directly compared to MPRA performed under standard culture conditions.
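The allelic comparison described above can be sketched with the standard library. This is an illustrative sketch only: it does not reproduce the DESeq2 activity model, the helper names are hypothetical, and the Benjamini-Hochberg procedure is assumed as the FDR method because the text does not name one. Converting the t statistic to a p-value requires the t distribution with n − 1 degrees of freedom, which is left to a statistics library.

```python
import math
from statistics import mean, stdev

def log2_ratio(rna_counts, plasmid_counts):
    """Per-replicate log2(mRNA / plasmid) ratios for one allele."""
    return [math.log2(r / p) for r, p in zip(rna_counts, plasmid_counts)]

def paired_t_statistic(ref, alt):
    """Paired t statistic on per-replicate log ratios (alt minus ref)."""
    diffs = [a - r for r, a in zip(ref, alt)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

def benjamini_hochberg(pvalues):
    """BH-adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy log ratios for five replicates (not real data)
ref = [1.0, 1.2, 0.9, 1.1, 1.0]
alt = [1.5, 1.6, 1.5, 1.7, 1.4]
t_stat = paired_t_statistic(ref, alt)

# Toy p-values for four SNPs; SNPs with adjusted p < 0.10 would pass a 10% FDR
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
significant = [q < 0.10 for q in adj]
```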

Annotating repetitive elements tested with MPRA

The ‘RepeatMasker’ track for hg19 was downloaded from the UCSC genome browser. Among the ten different classes of repeats, only three classes (long interspersed nuclear element (LINE), long terminal repeat (LTR), and SINE) overlapped more than 100 elements tested with MPRA. Therefore, only these three classes of repeats were assessed for associations with MPRA activity.
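The repeat-class filter can be illustrated with a small interval-overlap sketch. The tuple layouts, function names, and half-open (BED-style) coordinates are assumptions made for this example; the study's own annotation pipeline is not described at this level of detail.

```python
from collections import Counter

def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end

def count_elements_per_repeat_class(elements, repeats):
    """Count tested elements overlapping each repeat class.

    elements: list of (chrom, start, end)
    repeats:  list of (chrom, start, end, repeat_class)
    An element is counted once per class, however many repeats of that class it overlaps.
    """
    counts = Counter()
    for chrom, start, end in elements:
        classes = {
            cls for r_chrom, r_start, r_end, cls in repeats
            if r_chrom == chrom and overlaps(start, end, r_start, r_end)
        }
        counts.update(classes)
    return counts

def classes_to_assess(counts, min_elements=100):
    """Repeat classes overlapping more than min_elements tested elements."""
    return sorted(cls for cls, n in counts.items() if n > min_elements)

# Toy coordinates (not real data); a threshold of 1 stands in for the study's 100
elements = [("chr1", 100, 300), ("chr1", 500, 700), ("chr2", 100, 300)]
repeats = [
    ("chr1", 250, 400, "SINE"),
    ("chr1", 650, 900, "SINE"),
    ("chr1", 120, 130, "LINE"),
    ("chr2", 400, 500, "LTR"),
]
counts = count_elements_per_repeat_class(elements, repeats)
kept_classes = classes_to_assess(counts, min_elements=1)
```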

TF motif enrichment

Homer⁹⁷ findMotifsGenome.pl script was used to investigate TF motifs enriched in a given set of elements. Elements with lower MPRA activity under ER stress were used as background to identify TF motifs enriched in elements with higher MPRA activity under ER stress, and vice-versa (parameters: hg19, -size given). 2008 T2D-associated elements with no MPRA activity were used as background to identify TF motifs enriched in the 492 T2D SNP-containing elements with significant MPRA activity (parameters: hg19, -size given). For the cross-species ATAC-seq peak analysis, ATAC-seq peaks shared with other human cell types were used as background to identify TF motifs enriched in unique ATAC-seq peaks (parameters: mm9, -size given).

Analysis of islet ChIP-seq data

Chromatin immunoprecipitation sequencing (ChIP-seq) data from Pasquali et al.⁸ were aligned to the hg19 reference human genome as previously described⁶. Elements tested with MPRA were then overlapped with ChIP-seq peaks to conduct Fisher’s exact tests using R.
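The authors ran Fisher's exact tests in R. For readers who want to see what the test computes, the following is a minimal standard-library sketch of the two-sided test on a 2×2 table (for instance, MPRA-active versus inactive elements by overlap with a ChIP-seq peak set). The function name and the table layout are assumptions for illustration.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell given fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one;
    # the small tolerance guards against floating-point ties
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))

# Toy table (not real data):
#                 overlaps peak   no overlap
# MPRA active           3              1
# MPRA inactive         1              3
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```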

Electrophoretic mobility shift assay (EMSA)

21-bp biotin end-labeled complementary oligonucleotides were designed with each SNP allele of interest in the 11th position of the oligo (Integrated DNA Technologies; Supplementary Data 8). For each SNP tested, complementary oligos were annealed to create double-stranded probes for each allele tested. Nuclear extract was prepared from MIN6 β cells using the NE-PER Extraction Kit (Thermo Scientific), and EMSAs were completed using the LightShift Chemiluminescent EMSA kit (Thermo Scientific) according to the manufacturer’s instructions. Binding reactions consisted of 1× binding buffer, 1 µg poly dI-dC, 4 µg MIN6 nuclear extract, and 200 fmol labeled probe. Reactions were incubated at 25 °C for 25 min. For competition reactions, 25- and 50-fold excess of non-biotinylated double-stranded probes for either allele was included and pre-incubated in the reaction mixture for 15 min. DNA-protein complexes were detected by chemiluminescence. EMSAs were completed on two or more separate occasions to ensure that results were consistent.
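The probe layout (a 21-bp oligo with the SNP allele at the 11th position, plus its complementary strand) can be sketched as below. The flanking sequences and function names are hypothetical; the actual probe sequences are listed in Supplementary Data 8.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def emsa_probe_pair(flank5, allele, flank3):
    """Build a 21-bp top strand with the allele at position 11, and its complementary strand."""
    if len(flank5) != 10 or len(flank3) != 10 or len(allele) != 1:
        raise ValueError("expected 10 bp flanks on each side of a single-base allele")
    top = flank5 + allele + flank3
    return top, reverse_complement(top)

# Hypothetical flanking sequence, one probe pair per allele
ref_top, ref_bottom = emsa_probe_pair("ACGTACGTAC", "G", "TTGCATGCAA")
alt_top, alt_bottom = emsa_probe_pair("ACGTACGTAC", "A", "TTGCATGCAA")
```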

MPRA-based interrogation of islet eQTLs

InsPIRE Consortium⁷³ islet eQTL p-values were retrieved for genes within 1 megabase (Mb) of each T2D-associated SNP included in the MPRA library. In Fig. 4a, nominal islet eQTL p-values of T2D-associated SNPs for which ≥1 SNP in high LD (r²> 0.8) exhibited significant allelic effects on MPRA activity were plotted and compared to those for which no SNPs in high LD exhibited MPRA activity. For locus-specific plots in Figs. 4d and 5c, nominal p-values were adjusted for multiple testing of genes within 1 Mb on either side of the SNP (Bonferroni corrected p-value cutoff = 0.01).
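The per-locus multiple-testing adjustment can be sketched as a Bonferroni correction over the genes near a SNP. This is a sketch under stated assumptions: gene distance is measured here from a single position per gene (for example, a transcription start site), which the text does not specify, and the function names are hypothetical.

```python
def genes_within_window(snp_pos, genes, window=1_000_000):
    """Names of genes whose position lies within `window` bp on either side of the SNP.

    genes: list of (name, position) on the same chromosome as the SNP.
    """
    return [name for name, pos in genes if abs(pos - snp_pos) <= window]

def bonferroni(pvalues):
    """Bonferroni-adjust a {gene: nominal p-value} mapping for the number of genes tested."""
    m = len(pvalues)
    return {gene: min(1.0, p * m) for gene, p in pvalues.items()}

# Toy locus (not real data)
genes = [("G1", 1_500_000), ("G2", 2_900_000), ("G3", 4_000_000)]
nearby = genes_within_window(2_000_000, genes)
adjusted = bonferroni({"G1": 0.004, "G2": 0.6})
passing = [g for g, p in adjusted.items() if p < 0.01]  # cutoff stated in the text
```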

Mapping human regulatory sequences tested with MPRA to mammalian genomes

The UCSC genome browser Liftover tool was used to map human sequences (hg19) tested with MPRA to 20 mammalian genomes (with a minimum ratio of 0.20 bases that must remap; allowing for multiple output regions). The 20 mammalian genomes are: papAnu2 (Baboon), felCat5 (Cat), PanTro6 (Chimpanzee), BosTau7 (Cow), canFAM3 (Dog), loxAfr3 (Elephant), nomLeu3 (Gibbon), gorGor3 (Gorilla), equCab2 (Horse), mm9 (mouse), ponAbe2 (Orangutan), aiMel1 (Panda), susScr11 (Pig), ochPri3 (Pika), oryCun2 (Rabbit), rn5 (Rat), rheMac8 (Rhesus), oviAri3 (Sheep), sorAra2 (Shrew), speTri2 (Squirrel). Human sequences that did not lift over to the genome assembly of a given species were subsequently classified as not conserved (with a minimum ratio of 0.20 bases that must remap; allowing for multiple output regions).

To obtain human-mouse sequence similarity measures, Liftover was performed 99 times with the minimum ratio of bases that must remap ranging from 0.01 to 1.00 in increments of 0.01 (allowing for multiple output regions). The R package ‘sm’ was used to plot the density of human-mouse sequence similarity and to perform non-parametric bootstrap hypothesis tests of equality. Human sequences that did not lift over to the mm9 mouse genome with even 1% sequence similarity were classified as having 0% sequence similarity.
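One way to read the threshold sweep is that a sequence's similarity is the highest minimum-remap ratio at which it still lifts over. The sketch below encodes that reading. It is an assumption-laden illustration: `lifts_over` is a hypothetical callable standing in for a Liftover run at a given ratio, and the grid simply enumerates 0.01 through 1.00, so the exact grid used by the authors may differ.

```python
def similarity_from_sweep(lifts_over, thresholds=None):
    """Highest minimum-remap ratio at which a sequence lifts over, or 0.0 if it never does.

    lifts_over: callable taking a ratio and returning True if the sequence maps
                at that setting (a stand-in for running Liftover; hypothetical).
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 101)]  # 0.01, 0.02, ..., 1.00
    passing = [t for t in thresholds if lifts_over(t)]
    return max(passing) if passing else 0.0

# Toy stand-ins: one sequence maps up to a ratio of 0.37, another never maps
sim_partial = similarity_from_sweep(lambda ratio: ratio <= 0.37)
sim_none = similarity_from_sweep(lambda ratio: False)
```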

Cross-species mapping of human ATAC-seq peaks to MIN6 ATAC-seq peaks

Human and mouse ATAC-seq data were processed as previously described⁷. Briefly, low-quality portions of reads were trimmed using Trimmomatic⁹⁸, and the trimmed reads were aligned to the hg19 or mm9 genome assembly using Burrows-Wheeler Aligner MEM (BWA-MEM). For each replicate, duplicate reads were removed after shifting. Technical replicates were merged using SAMtools and peaks were called using MACS2⁹⁹ (callpeak with parameters --nomodel -f BAMPE) at FDR < 1%. ATAC-seq peak summit positions were obtained from MACS2 output files. The liftover tool in the UCSC genome browser was used to map human ATAC-seq peaks to the mouse (mm9) genome using a minimum ratio of 0.10 bases that must remap (not allowing for multiple output regions). Using bedtools, human ATAC-seq peaks mapping to the mouse genome were then overlapped with MIN6 ATAC-seq peaks.

Identification of TF binding motifs disrupted by T2D-associated SNPs

Genes expressed in MIN6 beta cells were identified by fitting a Gaussian mixture model (two components) to RNA-seq FPKM values. After identifying genes expressed in MIN6 beta cells (FPKM values ≥ 1.63), mouse gene names were converted to corresponding human homologs using the R Package, ‘biomaRt’¹⁰⁰. After filtering for expression in MIN6 beta cells, the R Package, ‘MotifBreakR’¹⁰¹, was used to identify TF binding motifs disrupted by SNPs with allelic skew in MPRA activity (Supplementary Data 4 and 7). Parameters used were: pwmList = hocomoco, threshold = 1e−2, filterp = TRUE, method = “ic”. In addition to MotifBreakR, predictions for TF binding motifs disrupted by T2D-associated and caQTL SNPs with allelic skew in MPRA activity were also obtained from SNP2TFBS¹⁰² (default options) (Supplementary Data 4 and 7).
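The two-component Gaussian mixture used to separate expressed from non-expressed genes can be illustrated with a small expectation-maximization (EM) routine. This is a generic sketch, not the authors' implementation: the software they used, any transformation applied to the FPKM values, and the initialization are not stated, so the function names, the min/max initialization, and the toy log-scale values are all assumptions.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def fit_two_component_gmm(values, n_iter=200):
    """Fit a 1D two-component Gaussian mixture by EM; returns (means, sds, weights)."""
    lo, hi = min(values), max(values)
    mu = [lo, hi]                               # initialize means at the extremes
    sigma = [max((hi - lo) / 4.0, 1e-3)] * 2
    weight = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value
        resp = []
        for x in values:
            dens = [weight[k] * normal_pdf(x, mu[k], sigma[k]) for k in range(2)]
            total = sum(dens)
            if total == 0.0:
                resp.append([0.5, 0.5])         # far from both components
            else:
                resp.append([d / total for d in dens])
        # M-step: update parameters from the weighted values
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk == 0.0:
                continue
            mu[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, values)) / nk
            sigma[k] = max(math.sqrt(var), 1e-3)  # floor avoids a collapsed component
            weight[k] = nk / len(values)
    return mu, sigma, weight

# Toy log-scale expression values forming two clear groups (not real data)
values = [-3.1, -2.9, -3.0, -2.8, -3.2, 3.0, 3.1, 2.9, 3.2, 2.8]
means, sds, weights = fit_two_component_gmm(values)
```

Genes would then be called expressed when they are more likely to belong to the higher-mean component; in the study this corresponded to FPKM ≥ 1.63.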

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

All datasets generated and analyzed during the current study are publicly available in GEO under Accession GSE145643 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE145643), which includes fastq files containing barcode sequences in the 3′UTR of gfp in the plasmid library and MIN6 RNA samples and processed files containing the sum of all barcode counts for each test construct in the plasmid DNA and MIN6 RNA samples. Human islet ATAC-seq data were obtained from NCBI Sequence Read Archive Accession SRP117935. Summary InsPIRE Consortium⁷³ islet eQTL statistics were obtained from https://zenodo.org/record/3408356 and SNP2TFBS predictions for transcription factor motifs altered by MPRA-modulating SNP alleles were obtained from https://ccg.epfl.ch/snp2tfbs/.

Code availability

Code used for analysis in this paper is available at https://github.com/UcarLab/MPRA_Khetan and linked to Zenodo at https://doi.org/10.5281/zenodo.4974390

References

Prentki, M. & Nolan, C. J. Islet beta cell failure in type 2 diabetes. J. Clin. Invest. 116, 1802–1812 (2006).
Lawlor, N., Khetan, S., Ucar, D. & Stitzel, M. L. Genomics of islet (Dys)function and type 2 diabetes. Trends Genet. 33, 244–255 (2017).
Fuchsberger, C. et al. The genetic architecture of type 2 diabetes. Nature 536, 41–47 (2016).
Mahajan, A. et al. Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps. Nat. Genet. 50, 1505–1513 (2018).
Parker, S. C. J. et al. Chromatin stretch enhancer states drive cell-specific gene regulation and harbor human disease risk variants. Proc. Natl Acad. Sci. USA 110, 17921–17926 (2013).
Stitzel, M. L. et al. Global epigenomic analysis of primary human pancreatic islets provides insights into type 2 diabetes susceptibility loci. Cell Metab. 12, 443–455 (2010).
Khetan, S. et al. Type 2 diabetes-associated genetic variants regulate chromatin accessibility in human islets. Diabetes 67, 2466–2477 (2018).
Pasquali, L. et al. Pancreatic islet enhancer clusters enriched in type 2 diabetes risk-associated variants. Nat. Genet. 46, 136–143 (2014).
Roman, T. S. et al. A type 2 diabetes-associated functional regulatory variant in a pancreatic islet enhancer at the Adcy5 locus. Diabetes https://doi.org/10.2337/db17-0464 (2017).
Kycia, I. et al. A common type 2 diabetes risk variant potentiates activity of an evolutionarily conserved islet stretch enhancer and increases C2CD4A and C2CD4B expression. Am. J. Hum. Genet. 102, 620–635 (2018).
Varshney, A. et al. Genetic regulatory signatures underlying islet gene expression and type 2 diabetes. Proc. Natl Acad. Sci. USA https://doi.org/10.1073/pnas.1621192114 (2017).
Lytrivi, M., Castell, A.-L., Poitout, V. & Cnop, M. Recent insights into mechanisms of β-cell lipo- and glucolipotoxicity in type 2 diabetes. J. Mol. Biol. https://doi.org/10.1016/j.jmb.2019.09.016 (2019).
Cnop, M., Toivonen, S., Igoillo-Esteve, M. & Salpea, P. Endoplasmic reticulum stress and eIF2α phosphorylation: the Achilles heel of pancreatic β cells. Mol. Metab. 6, 1024–1039 (2017).
Back, S. H. & Kaufman, R. J. Endoplasmic reticulum stress and type 2 diabetes. Annu. Rev. Biochem. 81, 767–793 (2012).
Shrestha, N., Reinert, R. B. & Qi, L. Endoplasmic reticulum protein quality control in β cells. Semin. Cell Dev. Biol. 103, 59–67 (2020).
Sharma, R. B. et al. Insulin demand regulates β cell number via the unfolded protein response. J. Clin. Invest. 125, 3831–3846 (2015).
Hu, F. B. Sedentary lifestyle and risk of obesity and type 2 diabetes. Lipids 38, 103–108 (2003).
Perng, W., Oken, E. & Dabelea, D. Developmental overnutrition and obesity and type 2 diabetes in offspring. Diabetologia 62, 1779–1788 (2019).
Gupta, D., Krueger, C. B. & Lastra, G. Over-nutrition, obesity and insulin resistance in the development of β-cell dysfunction. Curr. Diabetes Rev. 8, 76–83 (2012).
Jiang, X., Ma, H., Wang, Y. & Liu, Y. Early life factors and type 2 diabetes mellitus. J. Diabetes Res. 2013, 485082 (2013).
Teodoro-Morrison, T., Schuiki, I., Zhang, L., Belsham, D. D. & Volchuk, A. GRP78 overproduction in pancreatic beta cells protects against high-fat-diet-induced diabetes in mice. Diabetologia 56, 1057–1067 (2013).
Dooley, J. et al. Genetic predisposition for beta cell fragility underlies type 1 and type 2 diabetes. Nat. Genet. 48, 519–527 (2016).
Melnikov, A. et al. Systematic dissection and optimization of inducible enhancers in human cells using a massively parallel reporter assay. Nat. Biotechnol. 30, 271–277 (2012).
Tewhey, R. et al. Direct identification of hundreds of expression-modulating variants using a multiplexed reporter assay. Cell 165, 1519–1529 (2016).
Ulirsch, J. C. et al. Systematic functional dissection of common genetic variation affecting red blood cell traits. Cell 165, 1530–1545 (2016).
Vockley, C. M. et al. Massively parallel quantification of the regulatory effects of noncoding genetic variation in a human cohort. Genome Res. 25, 1206–1214 (2015).
Klein, J. C. et al. Functional testing of thousands of osteoarthritis-associated variants for regulatory activity. Nat. Commun. 10, 2434 (2019).
MacArthur, J. et al. The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog). Nucleic Acids Res. 45, D896–D901 (2017).
Fogarty, M. P., Cannon, M. E., Vadlamudi, S., Gaulton, K. J. & Mohlke, K. L. Identification of a regulatory variant that binds FOXA1 and FOXA2 at the CDC123/CAMK1D type 2 diabetes GWAS locus. PLoS Genet. 10, e1004633 (2014).
Fogarty, M. P., Panhuis, T. M., Vadlamudi, S., Buchkovich, M. L. & Mohlke, K. L. Allele-specific transcriptional activity at type 2 diabetes-associated single nucleotide polymorphisms in regions of pancreatic islet open chromatin at the JAZF1 locus. Diabetes 62, 1756–1762 (2013).
Kulzer, J. R. et al. A common functional regulatory variant at a type 2 diabetes locus upregulates ARAP1 expression in the pancreatic beta cell. Am. J. Hum. Genet. https://doi.org/10.1016/j.ajhg.2013.12.011 (2014).
Kitamura, Y. I. et al. FoxO1 protects against pancreatic beta cell failure through NeuroD and MafA induction. Cell Metab. 2, 153–163 (2005).
Cinti, F. et al. Evidence of β-cell dedifferentiation in human type 2 diabetes. J. Clin. Endocrinol. Metab. 101, 1044–1054 (2016).
Kim-Muller, J. Y. et al. Aldehyde dehydrogenase 1a3 defines a subset of failing pancreatic β cells in diabetic mice. Nat. Commun. 7, 12631 (2016).
Buteau, J. & Accili, D. Regulation of pancreatic beta-cell function by the forkhead protein FoxO1. Diabetes Obes. Metab. 9(Suppl 2), 140–146 (2007).
Nakae, J. et al. Regulation of insulin action and pancreatic beta-cell function by mutated alleles of the gene encoding forkhead transcription factor Foxo1. Nat. Genet. 32, 245–253 (2002).
Peiris, H. et al. Discovering human diabetes-risk gene function with genetics and physiological assays. Nat. Commun. 9, 3855 (2018).
Ogihara, T. et al. Liver X receptor agonists augment human islet function through activation of anaplerotic pathways and glycerolipid/free fatty acid cycling. J. Biol. Chem. 285, 5392–5404 (2010).
Brun, P.-J. et al. Retinoic acid receptor signaling is required to maintain glucose-stimulated insulin secretion and β-cell mass. FASEB J. 29, 671–683 (2015).
Gaulton, K. J. et al. Genetic fine mapping and genomic annotation defines causal mechanisms at type 2 diabetes susceptibility loci. Nat. Genet. 47, 1415–1425 (2015).
Ackermann, A. M., Wang, Z., Schug, J., Naji, A. & Kaestner, K. H. Integration of ATAC-seq and RNA-seq identifies human alpha cell and beta cell signature genes. Mol. Metab. 5, 233–244 (2016).
Wesolowska-Andersen, A. et al. Deep learning models predict regulatory variants in pancreatic islets and refine type 2 diabetes association signals. eLife 9 (2020).
Papa, F. R. Endoplasmic reticulum stress, pancreatic β-cell degeneration, and diabetes. Cold Spring Harb. Perspect. Med. 2, a007666 (2012).
Bonnycastle, L. L. et al. Autosomal dominant diabetes arising from a Wolfram syndrome 1 mutation. Diabetes 62, 3943–3950 (2013).
Delépine, M. et al. EIF2AK3, encoding translation initiation factor 2-alpha kinase 3, is mutated in patients with Wolcott-Rallison syndrome. Nat. Genet. 25, 406–409 (2000).
Booth, C. & Koch, G. L. Perturbation of cellular calcium induces secretion of luminal ER proteins. Cell 59, 729–737 (1989).
Sehgal, P. et al. Inhibition of the sarco/endoplasmic reticulum (ER) Ca2+-ATPase by thapsigargin analogs induces cell death via ER Ca2+ depletion and the unfolded protein response. J. Biol. Chem. 292, 19656–19673 (2017).
Stone, S. et al. Pancreatic stone protein/regenerating protein is a potential biomarker for endoplasmic reticulum stress in beta cells. Sci. Rep. 9, 5199 (2019).
Robbins, R. D. et al. Inhibition of deoxyhypusine synthase enhances islet β cell function and survival in the setting of endoplasmic reticulum stress and type 2 diabetes. J. Biol. Chem. 285, 39943–39952 (2010).
Cunha, D. A. et al. Pancreatic β-cell protection from inflammatory stress by the endoplasmic reticulum proteins thrombospondin 1 and mesencephalic astrocyte-derived neutrotrophic factor (MANF). J. Biol. Chem. 292, 14977–14988 (2017).
Syed, I. et al. PAHSAs attenuate immune responses and promote β cell survival in autoimmune diabetic mice. J. Clin. Invest. 129, 3717–3731 (2019).
Chen, Y.-C. et al. PAM haploinsufficiency does not accelerate the development of diet- and human IAPP-induced diabetes in mice. Diabetologia 63, 561–576 (2020).
Suetomi, R. et al. Adrenomedullin has a cytoprotective role against ER stress for pancreatic ß-cells in autocrine and paracrine manners. J. Diabetes Investig. https://doi.org/10.1111/jdi.13218 (2020).
Eizirik, D. L., Pasquali, L. & Cnop, M. Pancreatic β-cells in type 1 and type 2 diabetes mellitus: different pathways to failure. Nat. Rev. Endocrinol. 16, 349–362 (2020).
Cullinan, S. B. & Diehl, J. A. Coordination of ER and oxidative stress signaling: the PERK/Nrf2 signaling pathway. Int. J. Biochem. Cell Biol. 38, 317–332 (2006).
Oyadomari, S. & Mori, M. Roles of CHOP/GADD153 in endoplasmic reticulum stress. Cell Death Differ. 11, 381–389 (2004).
Ghosh, R., Colon-Negron, K. & Papa, F. R. Endoplasmic reticulum stress, degeneration of pancreatic islet β-cells, and therapeutic modulation of the unfolded protein response in diabetes. Mol. Metab. 27S, S60–S68 (2019).
Nammo, T. et al. Genome-wide profiling of histone H3K27 acetylation featured fatty acid signalling in pancreatic beta cells in diet-induced obesity in mice. Diabetologia 61, 2608–2620 (2018).
Aguayo-Mazzucato, C. et al. T3 induces both markers of maturation and aging in pancreatic β-cells. Diabetes 67, 1322–1331 (2018).
Gao, N. et al. Foxa1 and Foxa2 maintain the metabolic and secretory features of the mature beta-cell. Mol. Endocrinol. Baltim. Md 24, 1594–1604 (2010).
Fu, Z., Gilbert, E. R. & Liu, D. Regulation of insulin synthesis and secretion and pancreatic beta-cell dysfunction in diabetes. Curr. Diabetes Rev. 9, 25–53 (2013).
Hernandez, A. J. et al. B2 and ALU retrotransposons are self-cleaving ribozymes whose activity is enhanced by EZH2. Proc. Natl Acad. Sci. USA 117, 415–425 (2020).
Liu, W. M., Chu, W. M., Choudary, P. V. & Schmid, C. W. Cell stress and translational inhibitors transiently increase the abundance of mammalian SINE transcripts. Nucleic Acids Res. 23, 1758–1765 (1995).
Sundaram, V. & Wang, T. Transposable element mediated innovation in gene regulatory landscapes of cells: re-visiting the ‘gene-battery’ model. BioEssays News Rev. Mol. Cell. Dev. Biol. 40 (2018).
Deininger, P. Alu elements: know the SINEs. Genome Biol. 12, 236 (2011).
Gaulton, K. J. et al. A map of open chromatin in human pancreatic islets. Nat. Genet. 42, 255–259 (2010).
Miguel-Escalada, I. et al. Human pancreatic islet three-dimensional chromatin architecture provides insights into the genetics of type 2 diabetes. Nat. Genet. 51, 1137–1148 (2019).
Greenwald, W. W. et al. Pancreatic islet chromatin accessibility and conformation reveals distal enhancer networks of type 2 diabetes risk. Nat. Commun. 10, 2078 (2019).
Juliana, C. A. et al. A PDX1-ATF transcriptional complex governs β cell survival during stress. Mol. Metab. 17, 39–48 (2018).
Ghosh, R. et al. Transcriptional regulation of VEGF-A by the unfolded protein response pathway. PLoS ONE 5, e9575 (2010).
Pereira, E. R., Frudd, K., Awad, W. & Hendershot, L. M. Endoplasmic reticulum (ER) stress and hypoxia response pathways interact to potentiate hypoxia-inducible factor 1 (HIF-1) transcriptional activity on targets like vascular endothelial growth factor (VEGF). J. Biol. Chem. 289, 3352–3364 (2014).
Pereira, E. R., Liao, N., Neale, G. A. & Hendershot, L. M. Transcriptional and post-transcriptional regulation of proangiogenic factors by the unfolded protein response. PLoS ONE 5 (2010).
Viñuela, A. et al. Genetic variant effects on gene expression in human pancreatic islets and their implications for T2D. Nat. Commun. 11, 4912 (2020).
Reschen, M. E. et al. Lipid-induced epigenomic changes in human macrophages identify a coronary artery disease-associated variant that regulates PPAP2B expression through altered C/EBP-beta binding. PLoS Genet. 11, e1005061 (2015).
Calderon, D. et al. Landscape of stimulation-responsive chromatin across diverse human immune cells. Nat. Genet. 51, 1494–1505 (2019).
Ramos-Rodríguez, M. et al. The impact of proinflammatory cytokines on the β-cell regulatory landscape provides insights into the genetics of type 1 diabetes. Nat. Genet. 51, 1588–1595 (2019).
Vockley, C. M. et al. Direct GR binding sites potentiate clusters of TF binding across the human genome. Cell 166, 1269–1281.e19 (2016).
Umans, B. D., Battle, A. & Gilad, Y. Where are the disease-associated eQTLs? Trends Genet. https://doi.org/10.1016/j.tig.2020.08.009 (2020).
Ostuni, R. et al. Latent enhancers activated by stimulation in differentiated cells. Cell 152, 157–171 (2013).
Kim-Hellmuth, S. et al. Genetic regulatory effects modified by immune activation contribute to autoimmune disease associations. Nat. Commun. 8, 266 (2017).
Pehrsson, E. C., Choudhary, M. N. K., Sundaram, V. & Wang, T. The epigenomic landscape of transposable elements across normal human development and anatomy. Nat. Commun. 10, 5640 (2019).
Su, M., Han, D., Boyd-Kirkup, J., Yu, X. & Han, J.-D. J. Evolution of Alu elements toward enhancers. Cell Rep. 7, 376–385 (2014).
Jang, H. S. et al. Transposable elements drive widespread expression of oncogenes in human cancers. Nat. Genet. 51, 611–617 (2019).
Article CAS PubMed PubMed Central Google Scholar
Franchini, L. F. et al. Convergent evolution of two mammalian neuronal enhancers by sequential exaptation of unrelated retroposons. Proc. Natl Acad. Sci. USA 108, 15270–15275 (2011).
Article ADS CAS PubMed PubMed Central Google Scholar
Choudhary, M. N. et al. Co-opted transposons help perpetuate conserved higher-order chromosomal structures. Genome Biol. 21, 16 (2020).
Article CAS PubMed PubMed Central Google Scholar
Sundaram, V. et al. Widespread contribution of transposable elements to the innovation of gene regulatory networks. Genome Res 24, 1963–1976 (2014).
Article CAS PubMed PubMed Central Google Scholar
Xie, M. et al. DNA hypomethylation within specific transposable element families associates with tissue-specific enhancer landscape. Nat. Genet. 45, 836–841 (2013).
Article CAS PubMed PubMed Central Google Scholar
Sundaram, V. et al. Functional cis-regulatory modules encoded by mouse-specific endogenous retrovirus. Nat. Commun. 8, 14550 (2017).
Article ADS CAS PubMed PubMed Central Google Scholar
Batut, P., Dobin, A., Plessy, C., Carninci, P. & Gingeras, T. R. High-fidelity promoter profiling reveals widespread alternative promoter usage and transposon-driven developmental gene expression. Genome Res. 23, 169–180 (2013).
Article CAS PubMed PubMed Central Google Scholar
Li, T., Spearow, J., Rubin, C. M. & Schmid, C. W. Physiological stresses increase mouse short interspersed element (SINE) RNA expression in vivo. Gene 239, 367–372 (1999).
Article CAS PubMed Google Scholar
Zhang, X.-O., Gingeras, T. R. & Weng, Z. Genome-wide analysis of polymerase III-transcribed Alu elements suggests cell-type-specific enhancer function. Genome Res. 29, 1402–1414 (2019).
Article CAS PubMed PubMed Central Google Scholar
Bejerano, G. et al. A distal enhancer and an ultraconserved exon are derived from a novel retroposon. Nature 441, 87–90 (2006).
Article ADS CAS PubMed Google Scholar
Mita, P. & Boeke, J. D. How retrotransposons shape genome regulation. Curr. Opin. Genet. Dev. 37, 90–100 (2016).
Article CAS PubMed PubMed Central Google Scholar
Lawlor, N. et al. Multiomic profiling identifies cis-regulatory networks underlying human pancreatic β cell identity and function. Cell Rep. 26, 788–801.e6 (2019).
Article CAS PubMed PubMed Central Google Scholar
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Article CAS PubMed PubMed Central Google Scholar
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Article PubMed PubMed Central CAS Google Scholar
Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).
Article CAS PubMed PubMed Central Google Scholar
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinforma. Oxf. Engl. 30, 2114–2120 (2014).
Article CAS Google Scholar
Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
Article PubMed PubMed Central CAS Google Scholar
Durinck, S., Spellman, P. T., Birney, E. & Huber, W. Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt. Nat. Protoc. 4, 1184–1191 (2009).
Article CAS PubMed PubMed Central Google Scholar
Coetzee, S. G., Coetzee, G. A. & Hazelett, D. J. motifbreakR: an R/Bioconductor package for predicting variant effects at transcription factor binding sites. Bioinforma. Oxf. Engl. 31, 3847–3849 (2015).
CAS Google Scholar
Kumar, S., Ambrosini, G. & Bucher, P. SNP2TFBS - a database of regulatory SNPs affecting predicted transcription factor binding site affinity. Nucleic Acids Res. 45, D139–D144 (2017).
Article CAS PubMed Google Scholar

Download references

Acknowledgements

We thank members of the Stitzel and Ucar labs for helpful discussion and critiques during study design and execution and Francis S. Collins, D. Leland Taylor, Cassandra Spracklen, Christine Beck, and Taneli Helenius for helpful comments on the manuscript. This work was supported by The Jackson Laboratory Director’s Innovation Fund (to M.L.S. and R.T.), W81XWH-18-0401 (to M.L.S. and D.U.), R01DK118011 (to M.L.S.), and R00HG008179 (to R.T.).

Author information

Authors and Affiliations

The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
Shubham Khetan, Romy Kursawe, Alexandria Jillette, Duygu Ucar & Michael L. Stitzel
Department of Genetics and Genome Sciences, University of Connecticut, Farmington, CT, USA
Shubham Khetan, Duygu Ucar & Michael L. Stitzel
The Jackson Laboratory for Mammalian Genetics, Bar Harbor, ME, USA
Susan Kales & Ryan Tewhey
Program in Biological and Biomedical Sciences, Harvard Medical School, Boston, MA, USA
Jacob C. Ulirsch
Broad Institute of MIT and Harvard, Cambridge, MA, USA
Jacob C. Ulirsch & Steven K. Reilly
Institute of Systems Genomics, University of Connecticut, Farmington, CT, USA
Duygu Ucar & Michael L. Stitzel
Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, ME, USA
Ryan Tewhey
Tufts University School of Medicine, Boston, MA, USA
Ryan Tewhey

Authors

Shubham Khetan
Susan Kales
Romy Kursawe
Alexandria Jillette
Jacob C. Ulirsch
Steven K. Reilly
Duygu Ucar
Ryan Tewhey
Michael L. Stitzel

Contributions

M.L.S., R.T., S.K., and D.U. designed the study. S.K. and S. Kales designed, generated, and tested the MPRA library. S.K., R.K., and A.J. optimized and completed MPRA experiments in MIN6. S.K. and M.L.S. designed and completed EMSA experiments. S.K., D.U., R.T., and M.L.S. analyzed and discussed MPRA results and interpretations. S.K. and S.K.R. completed evolutionary conservation analyses. J.C.U. and R.T. completed GWAS and credible set SNP enrichment analyses. S.K., S.K.R., D.U., R.T., and M.L.S. wrote the manuscript. All authors reviewed and approved the final manuscript.

Corresponding authors

Correspondence to Ryan Tewhey or Michael L. Stitzel.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Khetan, S., Kales, S., Kursawe, R. et al. Functional characterization of T2D-associated SNP effects on baseline and ER stress-responsive β cell transcriptional activation. Nat Commun 12, 5242 (2021). https://doi.org/10.1038/s41467-021-25514-6

Received: 13 February 2020
Accepted: 10 August 2021
Published: 02 September 2021
DOI: https://doi.org/10.1038/s41467-021-25514-6
