People with a first-degree relative with myocardial infarction (MI) have a two- to four-fold increased relative risk of developing the disease1, indicating a significant genetic role in disease development. Multiple genetic variants have been identified for coronary artery disease2,3. Still, identified variants have been found to explain less than 15% of the heritability, and familial coronary artery disease remains an independent predictor of coronary disease after adjusting for known common genetic variants2,3,4, underscoring that additional approaches are needed to identify the residual genetic variation. Identification of biological pathways involved in coronary artery calcification (CAC) and MI has the potential to pinpoint novel therapeutic approaches to prevent disease occurrence and progression.

Previous population-based genomic studies have largely analyzed each omic data type separately and have applied very stringent statistical cutoffs to reduce false-positive associations. However, this comes at the cost of low sensitivity for capturing true-positive findings. In an integrative analysis of several omics components, directionally concordant associations across data types reduce the risk of both false-positive and false-negative findings5. By applying this method to heart failure and echocardiographic traits, we have previously identified several plausible genetic variants associated with these outcomes6. Although conventional genomic association methods have yielded more genetic variants for coronary disease than for heart failure, we postulated that more candidate genes could be identified for coronary disease using similar methods. The aim of this study was, therefore, to search for additional genetic loci associated with CAC and MI by integrating associations found across GWAS, DNA methylation, and gene expression.

Methods

Population

This study included participants from the Framingham Heart Study’s (FHS) Offspring and Third Generation cohorts. Detailed descriptions of the cohorts are available elsewhere7,8. Individuals were included if they had data on at least one of the omics of interest. Individuals without data on all omics contributed only to the omics analyses for which they had data. We followed participants from the date of the genomic profiling (date of the blood sample) until December 31, 2016, or until MI or death, whichever occurred first.

Omics measures

Blood samples were collected during 1998–2008 for the Offspring cohort (examination cycle 7 and 8), and 2002–2005 for the Offspring Spouse and Third generation cohort (1st examination cycle).

The Affymetrix 550 k Array (Affymetrix, Santa Clara, CA) was used for profiling of genetic variants, which were then imputed to the 1000 Genomes Project reference panel using MaCH (v 1.0.15)9; only variants with an imputation quality greater than 0.3 were retained. DNA methylation was available for participants in the Offspring cohort from the 8th examination cycle and in the Third Generation cohort from the 2nd examination cycle. DNA methylation profiles were measured from whole-blood-derived DNA using the Infinium HumanMethylation450 BeadChip (Illumina, San Diego, CA)10. Rigorous quality control was performed and only high-quality CpG sites were kept. Gene expression profiles were derived from RNA isolated from fasting peripheral whole blood using the Affymetrix Human Exon 1.0 ST Array (Affymetrix, Santa Clara, CA) for the Offspring and Third Generation cohorts at the same exams at which DNA methylation was assessed. More details of the methods for the profiling of omics data are available in previous publications6,11.

Outcomes

To investigate associations between gene function and both the development and the progression of coronary disease, a total of four outcomes were analyzed: CAC as a dichotomous variable (presence or absence of CAC), CAC as a continuous variable (excluding those with a CAC score of 0), prevalent MI, and incident MI. Separate analyses were undertaken for prevalent and incident MI to avoid incorrect handling of time (since pooling prevalent and incident cases may lead to spurious associations in opposite directions). The two measures of CAC were chosen to allow for investigations of associations between gene function and (1) development of calcification (CAC as a dichotomous variable) and (2) the extent of the calcification among subjects with CAC (CAC as a continuous variable). MI was diagnosed by ECG, enzymes and history, or autopsy evidence. CAC prevalence was estimated from computed tomography (CT) scans undertaken for the Offspring cohort in 1998–2001 (7th examination cycle) and the Third Generation cohort in 2002–2005 (1st examination cycle), and the extent of CAC was quantified by the Agatston score12. The Agatston score is computed by multiplying the lesion area by a weighted attenuation score (based on the maximal attenuation within the lesion), where a calcified lesion is an area including at least 3 connected pixels with CT attenuation > 130 Hounsfield units13.
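The Agatston computation described above can be illustrated with a minimal Python sketch. The density weighting bands used here (1 to 4, in steps of roughly 100 Hounsfield units) are the conventional ones and are not stated in the text, so they should be read as an assumption; the function names and the input format are hypothetical. The sketch also assumes that each lesion passed in already satisfies the connectivity criterion of at least 3 connected pixels.

```python
def density_weight(max_hu: float) -> int:
    """Weight for one lesion based on its maximal attenuation (Hounsfield units).

    Threshold of > 130 HU follows the text; the bands above it are the
    conventional Agatston bands (an assumption, not given in the text).
    """
    if max_hu <= 130:
        return 0  # below the calcification threshold: not scored
    if max_hu < 200:
        return 1
    if max_hu < 300:
        return 2
    if max_hu < 400:
        return 3
    return 4


def agatston_score(lesions):
    """Sum of (lesion area in mm^2) x (density weight) over all lesions.

    `lesions` is an iterable of (area_mm2, max_hu) pairs, one per
    calcified lesion per CT slice (hypothetical input format).
    """
    return sum(area * density_weight(max_hu) for area, max_hu in lesions)


# Toy example: two lesions with made-up areas and attenuations.
example_lesions = [(4.0, 150.0), (2.5, 420.0)]
total = agatston_score(example_lesions)  # 4.0 * 1 + 2.5 * 4
```

A participant with no scored lesions receives a score of 0, which corresponds to the "absence of CAC" category of the dichotomous outcome.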

Statistics

The analysis was performed in three steps. In the first step, each omics measure (GWAS, DNA methylation, and gene expression) was tested for association with each of the four outcomes. Given the familial relatedness among FHS participants, generalized estimating equations were used to test the association of omics measures with the presence of CAC and prevalent MI. Similarly, linear mixed models were used to assess the association between omics measures and CAC values. In addition, Cox proportional hazards models clustering on pedigrees were used to assess the association between omics measures and incident MI. All models were adjusted for age, sex, weight, height, and technical covariates.

In the second step, the associations of each of the four outcomes with each type of omic measure (GWAS, DNA methylation, and gene expression) were summarized at the gene level. For the genetic associations, the most significant genetic variant within each gene region was used to represent the overall association of the gene with the outcomes. Similarly, the most significant CpG site within each gene region was used to represent the overall association of the methylation profile with the outcomes. For gene expression, the most significant transcript was used to represent the association of each gene with the outcomes. Finally, in the third step, we used robust rank aggregation to integrate the top 5% of associations from the three different omic data types. This method tests how much better a gene is positioned in the ranked lists than would be expected by chance, where chance is formalized by random shuffling of the ranked lists5. A trans-omic score was calculated to represent the significance of each gene across the different omic data types. A full description of the statistical test is available elsewhere5. The test results for the 10 genes with the lowest trans-omic scores are shown for each outcome. Additionally, we highlighted genes that were among the top genes by trans-omic score for more than one of the studied outcomes. All trans-omic scores and the p-values from the individual analyses are available in Tables S1, S2, S3, and S4 in the Supplemental Data, which may be used by other researchers as a reference.
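The gene-level summary and the rank aggregation can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it follows the commonly described formulation of robust rank aggregation, in which normalized ranks are compared with the order statistics of uniform random ranks and the smallest resulting probability is multiplied by the number of lists. Treating genes outside the top 5% of a list as having normalized rank 1.0, as well as all function names and the input format, are assumptions made for the example.

```python
from math import comb


def gene_level_min_p(records):
    """Keep the smallest p-value per gene from (gene, p_value) pairs."""
    best = {}
    for gene, p in records:
        if gene not in best or p < best[gene]:
            best[gene] = p
    return best


def normalized_ranks(gene_p, top_fraction=0.05):
    """Rank genes by p-value, scale ranks to (0, 1], keep the top fraction."""
    ordered = sorted(gene_p, key=gene_p.get)
    n = len(ordered)
    keep = max(1, int(n * top_fraction))
    return {gene: (i + 1) / n for i, gene in enumerate(ordered[:keep])}


def beta_tail(k, n, x):
    """P(k-th smallest of n independent uniform(0, 1) ranks <= x)."""
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k, n + 1))


def rho_score(ranks):
    """Aggregate one gene's normalized ranks across lists into one score."""
    n = len(ranks)
    ordered = sorted(ranks)
    best = min(beta_tail(k, n, ordered[k - 1]) for k in range(1, n + 1))
    return min(1.0, n * best)  # correct for taking the minimum over k


def trans_omic_scores(rank_lists):
    """rank_lists: one {gene: normalized rank} dict per omic data type."""
    genes = set().union(*rank_lists)
    # Assumption: a gene missing from a truncated list gets rank 1.0.
    return {g: rho_score([rl.get(g, 1.0) for rl in rank_lists]) for g in genes}
```

A gene ranked near the top of all three lists receives a much lower score than a gene ranked highly in only one list, which is the behavior the trans-omic score is meant to capture.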

Ethics approval and consent to participate

The present study was approved by the Institutional Review Board of the Boston Medical Campus. Written informed consent was obtained from all individuals prior to entering the FHS. The study was conducted according to the Declaration of Helsinki.

Results

As shown in Table 1, the current study included 3106 participants with CAC measurements. The mean age was 57 years and 48.9% were female. In total, 1403 (45.2%) had CAC, and the median CAC value was 67.8 (IQR 10.8, 274.9). We found 65 individuals with prevalent MI and 60 with incident MI during a mean follow-up of 8.2 years. For the GWAS analysis, we included 2932 individuals. For the analyses of DNA methylation and gene expression, we included 1936 and 2729 individuals, respectively. Table 2 shows the characteristics of the individuals according to the outcomes. Individuals with MI or CAC were on average older, more often male, and had a higher prevalence of hypercholesterolemia, hypertension, obesity, and diabetes than the total study population (Table 2).

Table 1 Characteristics of study participants.
Table 2 Characteristics of study participants according to the outcomes; presence of coronary artery calcification and myocardial infarction (prevalent and incident).

Association with coronary artery calcifications (CAC)

The top 10 most significant genes for the analyses of CAC included TMEM80, HAPLN2, GAK, PDCD6-AHRR, AHRR, EXOC3, SLC9A3-AS1, ALAS1, DNAH1, and TNFRSF1A for prevalent CAC, and TECPR2, GABARAP, ALPI, MACROD2, TTC34, VWA1, ZNF839, MOK, CLEC4F, and LOC101927666 for CAC as a continuous variable. A full list of annotations, putative functions, and locations of the top 10 genes is presented in Table 3. All trans-omic scores for the three omics data types for CAC presence (dichotomous) and CAC score (continuous) are available in Tables S1 and S2 in the Supplemental Data.

Table 3 Trans-omic scores and p-values for GWAS, DNA-methylation, gene expression for CAC (prevalent and continuous).

Association with myocardial infarction (MI)

The 10 most significant genes comprised LAT, C1orf131, FYTTD1, PPFIBP1, AKAP8, MDC1-AS1, PINK1, BRD4, TUBB, and C1orf128 for prevalent MI, and BTF3L4, STXBP3, LINC02169, IRX3, WDR35, USP34, FOXF2, MIR6720, NDUFA11, and LOC100128568 for incident MI. Annotations, putative functions, and locations of the top 10 genes are presented in Table 4. The trans-omic scores for the three omics data types for prevalent and incident MI can be found in Tables S3 and S4 in the Supplemental Data.

Table 4 Trans-omic scores and p-values for genomics, epigenomics and transcriptomics association studies of prevalent and incident MI.

Integration of different outcomes

Finally, we compared the lists of the top 100 genes with the lowest trans-omic scores for each outcome with one another and identified overlapping genes (Fig. 1). In total, 13 genes were associated with more than one of the outcomes. These genes included PDCD6-AHRR, AHRR, EXOC3, and SLC9A3-AS1 (Table 5). We further examined the enrichment of top genes in biological pathways and found that the top enriched pathways included basal transcription factors, the estrogen signaling pathway, and the longevity regulating pathway, suggesting a potential role for these biological pathways in the pathology of MI.
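The comparison of top-100 lists across outcomes amounts to counting, for each gene, the number of outcome lists in which it appears. The following Python sketch illustrates this; the function name, the input format, and the placeholder gene symbols are hypothetical and do not reproduce the study data.

```python
from collections import defaultdict


def genes_in_multiple_outcomes(top_lists, min_outcomes=2):
    """Return genes that appear in at least `min_outcomes` outcome lists.

    `top_lists` maps an outcome name to an iterable of gene symbols,
    e.g. the 100 genes with the lowest trans-omic scores per outcome.
    """
    outcomes_by_gene = defaultdict(set)
    for outcome, genes in top_lists.items():
        for gene in genes:
            outcomes_by_gene[gene].add(outcome)
    return {
        gene: sorted(outcomes)
        for gene, outcomes in outcomes_by_gene.items()
        if len(outcomes) >= min_outcomes
    }


# Toy example with placeholder symbols (not the study results).
toy_lists = {
    "CAC presence": ["GENE_A", "GENE_B", "GENE_C"],
    "CAC score": ["GENE_B", "GENE_D"],
    "prevalent MI": ["GENE_A", "GENE_B"],
    "incident MI": ["GENE_E"],
}
shared = genes_in_multiple_outcomes(toy_lists)
```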

Figure 1 Top 100 genes for each outcome with the lowest trans-omic scores. Genes identified in the top 100 for more than one outcome are highlighted.

Table 5 Name, location, and annotation/function of genes identified in top 100 for at least two outcomes.

Discussion

In this analysis, we integrated the association results from three omics data types (GWAS, DNA methylation, and gene expression) to identify molecular signatures related to CAC and MI. We also provided a full list of genes from the analyses in the online material, allowing other researchers to access all our results. It is important to point out that the present study did not apply any formal cutoff to claim statistical significance, and the results from this and prior studies are therefore not directly comparable. In this context, the top loci did not reach the conventional genome-wide significance cutoff. For many of the top-ranked genetic loci, however, there are other levels of evidence suggesting that they may be involved in the pathogenesis of coronary disease, as discussed in the next sections; this also aligns with pathophysiological pathways of atherosclerosis identified in previous studies60.

Among the top 10 genes associated with the presence of CAC, there were 4 genes located in proximity to each other on chromosome 5 (PDCD6-AHRR, AHRR, EXOC3, and SLC9A3-AS1). Aryl-Hydrocarbon Receptor Repressor (AHRR), which can bind to nuclear factor-kappa B (NFKB) and may be immune modulating16, has previously been reported to be upregulated among smokers. Further, variation in DNA methylation in the AHRR gene has previously been associated with carotid plaque scores, even after adjustment for smoking status17. The AhR pathway, which can be activated by smoking, can increase the expression of inflammatory markers in macrophages and is involved in the buildup of lipids in macrophages and the formation of plaque17,61. EXOC3 is important for controlling granule secretion and glycoprotein receptor trafficking in platelets, and in EXOC3 conditional knockout mice arterial thrombosis was found to be accelerated along with improved hemostasis20. The sodium proton exchanger subtype 3 (SLC9A3) is highly expressed in the small intestine and colon, where it absorbs salt in the gastrointestinal tract and affects the extracellular fluid volume and blood pressure. SLC9A3 is a potential drug target for hypertension as a means of reducing salt uptake in the gut21. These genes were also among the top genes across all outcomes.

As expected from what we know of vascular biology, several of the top genes are known to be involved in inflammation, macrophage signaling, and endothelial function. None of these genes has, however, been firmly identified by GWAS previously. For instance, HAPLN2, which binds hyaluronan, a molecule expressed in relation to inflammatory signaling that appears to be involved in the progression of atherosclerotic plaques, was among the top genes for CAC presence14,62. The tumor necrosis factor (TNF) receptor type 1 (prevalent CAC), A-kinase anchor protein 8 (an enzyme that binds to protein kinase A, prevalent MI), and Cyclin G Associated Kinase (a transcriptional target of the p53 tumor suppressor gene, prevalent CAC) all appear to be downstream targets of the TNF-alpha signaling pathways39. ST14 (Matriptase, also known as PRSS14/Epithin) represents another potentially interesting pathway that may relate to macrophage migration into the arterial walls. It has previously been reported to be involved in the transendothelial migration of activated macrophages59.

Moreover, several genes have previously been implicated in lipid metabolism, including ALPI, which is involved in intestinal fat absorption30. ALPI deficiency is linked to the metabolic syndrome and ischemic heart disease in humans31. CLEC4F, identified for continuous CAC, may be directly involved in cholesteryl ester transfer protein (CETP) production36 and has been proposed as a target for CVD63. BRD4 is part of the bromodomain and extra-terminal (BET) protein family44 and has been suggested to be of importance for integration of the endothelium. Inhibition of the BET reader protein has been suggested as a possible strategy in the prevention of adverse vascular remodeling64.

Strengths and limitations

Strengths of the present study included multiple omics measures in a well-phenotyped cohort and the familial relatedness in FHS, which could increase the likelihood of finding genetic mechanisms underlying MI given that coronary disease clusters in families. We further integrated evidence from multi-omics data to reduce false-positive findings. Our study revealed multiple pathways possibly involved in the development of coronary disease. Our analyses should, however, be considered as hypothesis generating only. Although several pathways have been implicated in the pathogenesis of atherosclerosis and MI risk before, replication in independent cohorts would have further strengthened the plausibility of our findings. The use of whole blood to measure gene expression is feasible, yet it is an imprecise measure of the actual gene activity within coronary arteries and constitutes a limitation. Finally, this study includes only a moderate sample size with a very limited number of events and consists of a predominantly White population of European descent. Despite its limited sample size, the deep phenotyping, multi-omics measures, and multigenerational structure are unique features of the cohort, justifying the present set of analyses.

Conclusion

Using a trans-omic approach we integrated data from GWAS, DNA methylation, and gene expression to identify potential biological mechanisms in the development of CAC and MI. We identified several candidate genes for MI and CAC, of which many have been implicated in prior studies. Further research is still needed to confirm our findings and identify potential pathways for the prevention and treatment of coronary artery disease.