Introduction

Bronchopulmonary dysplasia (BPD) is a prominent chronic respiratory disease among survivors of extremely preterm (< 28 weeks of gestational age, GA) and very low birth weight (BW, < 1500 g) infants1,2. The modern consensus on BPD pathogenesis is largely attributed to the arrested lung development due to premature birth which interrupts the intra-uterine programmed lung development and the subsequent repair of lung injury caused by the supplemental oxygen (O2) and mechanical ventilation in the neonatal intensive care unit (NICU)2. Lack of definitive cure and the persistent lung impairment can lead to long-term pulmonary dysfunction and increased risk for adverse respiratory symptoms (e.g., airway diseases, exercise intolerance) in BPD survivors3,4,5,6.

There are limited genetic, epigenetic, or transcriptomic resources for the investigation of BPD etiology and pathogenesis in preterm infants. In previous studies, genome-wide association studies (GWASs) and exome sequencing in various populations with BPD elucidated potential genetic risk factors such as SPOCK2 encoding SPARC (osteonectin), Cwcv and Kazal like domains proteoglycan 2 and CRP encoding C-reactive protein7,8,9,10,11,12. Transcriptomic studies by database search, microarray, and RNA sequencing (RNA-Seq) have reported the differentially expressed genes (DEGs) in blood, lung cell or lung biopsy from BPD cases13,14,15,16,17. Most recently single cell RNA-Seq determined monocyte (tracheal aspirate)- or CD8 + T cell-specific BPD-associated genes18,19. Epigenome-wide association studies (EWASs) have identified DNA methylation loci associated with BPD in formalin-fixed lung autopsy/biopsy15 or with neonatal morbidities including BPD20. Recently, we reported epigenetic markers of BPD risk in cord blood DNAs from a preterm cohort21, and they included CpGs annotated with genes including cathepsin H (CTSH, cg24847366) and SPOCK2 (cg17958658)22. The results indicated that cord blood methylome changes involved in lung and tissue development, cell cycle and leukopoiesis, immune-mediated inflammation, T and B cell responses, and platelet activation may precede or indicate risk of BPD development22. In that report, we also demonstrated that methylation-based cord blood cell-type composition varied by BW or GA, and that a higher nucleated red blood cell (NRBC) proportion was significantly associated with a lower BW and DNA hypomethylation profile22.

In the current study, we expanded cell-type deconvolution of DNA methylation to postnatal peripheral blood from the same preterm cohort21,22. Methylome analysis during the newborn period determined potential postnatal epigenome markers of BPD development and RNA-Seq demonstrated BPD-associated DEGs in cord and postnatal peripheral blood cells from the premature infants. In addition, influence of prolonged NICU O2 supplementation on blood cell DNA methylation and gene expression landscape was elucidated in nonBPD infants on postnatal day 14. This investigation of the early life hematologic epigenome and transcriptome provides further insights and biomarkers of BPD pathogenesis.

Results

Demographics of cohort

A total of 109 premature infants were included in the study after applying exclusion criteria. Table 1 and Additional file 1: Table S1 present the neonatal and maternal characteristics and fetal complications of the prematurely born neonates diagnosed with BPD or without (nonBPD). Diagnosis of BPD was made for 15 preterm infants based on the National Institutes of Health consensus definition and O2 needs at either 36 weeks postmenstrual age or at 28–56 days of postnatal life based on GA23. BPD development was sexually dimorphic with 25% of males (10/40) and 7.2% of females (5/69) diagnosed among these premature infants. Infants diagnosed with BPD were born at significantly lower mean BW (p = 0.002) and GA (p < 0.001) and were supplemented with NICU O2 for a greater number of days (46.5 ± 4.0, p < 0.001) compared to nonBPD infants (7.8 ± 1.1). Infections were more common in the neonates who developed BPD (Table 1, p = 0.006). The time points (14 days and 28 days of life) used for peripheral blood analysis were chosen preceding symptoms and diagnosis of BPD.

Table 1 Characteristics of the preterm infant cohort used for overall analyses.

Methylation-based estimation of blood cell composition

The methylation-based deconvolution model estimated seven cell type percentages in cord and peripheral blood. Of these, six cell types (with the exception of natural killer (NK) cells) were determined to be significantly varied between BPD and nonBPD neonates at one or more time points (Fig. 1A). The percentage of lymphocytes including CD8 + T cells (day 14), CD4 + T cells (cord blood, days 14 and 28), and B cells (days 14 and 28) were significantly decreased in BPD relative to nonBPD (Fig. 1A). In contrast, percentage of myeloid cells including monocytes (day 28) and granulocytes (days 14 and 28) and NRBCs (days 14 and 28) were significantly higher in BPD than in nonBPD (Fig. 1A). The previously reported cord blood cell-type compositions22 based on the Houseman deconvolution algorithm24 were re-evaluated together with postnatal day data using the approach of Koestler et al. (Identifying Optimal DNA methylation Libraries, IDOL)25.

Figure 1
figure 1

DNA methylation-based estimation of blood cell type composition in preterm infants. (A) Seven cell-type distributions in preterm infants with bronchopulmonary dysplasia (BPD) development or without (nonBPD) were estimated by DNA methylation profiles of cord blood and peripheral blood on postnatal days 14 and 28. Individual cell type data is in Additional file: Table S1. Mean ± S.E.M. presented. *p < 0.05 vs. time-matched nonBPD. (B) Blood cell type distribution of all preterm infants by gestational age quintile (Q). CD4T CD4 + T cells, CD8T CD8 + T cells, NK natural killer cells, Mono monocytes, Gran granulocytes, NRBC nucleated red blood cell. n = 107 in cord blood (93 nonBPD, 14 BPD). n = 93 on day 14 (81 nonBPD, 12 BPD). n = 82 on day 28 (71 nonBPD, 11 BPD).

The estimated seven cell-type distribution plotted relative to GA quintiles among all premature infants indicated time- and GA-dependent shifts of cell-type composition during the neonatal period (Fig. 1B). Compared to cord blood, NRBC proportions in peripheral blood from days 14 and 28 were markedly decreased in all GA quintile groups (Fig. 1B). Higher CD4 + T cell percentage in higher GA groups were also evident at all times and a similar trend was shown for percentage of CD8 + T cells on day 14 and percentage of B cells in cord blood and on day 28 (Fig. 1B). In contrast granulocyte proportions, assumed to be largely neutrophils, were higher in lower GA groups on days 14 and 28 (Fig. 1B).

There were significant inverse correlations between GA and granulocyte percentage on both postnatal days in preterm infants (day 14, r2 = 0.0837, p = 0.0049; day 28, r2 = 0.1173, p = 0.0016) (Fig. 2A). GA was also significantly, but positively, associated with the percentage of the sum of all lymphoid subpopulations (CD4 + T, CD8 + T, B cell, NK) on both postnatal days (day 14 r2 = 0.2193, p = 1.0E−04; day 28, r2 = 0.1566, p = 2.0E−04) (Fig. 2A). An increased percentage of granulocytes and a suppressed percentage of lymphocytes was observed in BPD infants relative to nonBPD infants (Fig. 2A). The neutrophil (granulocyte)-lymphocyte ratio (NLR), reported to be an early predictor of BPD26,27, was significantly higher in BPD compared to nonBPD on all neonatal days (Fig. 2B). Examining NLR, we found significant correlations with NICU O2 days and GA (Fig. 2C). NLRs were lowered by day 28 (Fig. 2B) while the significant positive correlation to NICU O2 days persisted (Additional file 2: Fig. S1). Overall we observed that higher NLR was correlated with lower GA, longer O2 supplementation, and associated with development of BPD.

Figure 2
figure 2

Association between granulocytes, lymphocytes, and gestational age (GA) in preterm neonates diagnosed with bronchopulmonary dysplasia (BPD) or without (nonBPD). (A) Significant correlations between GA and peripheral blood granulocyte, or total lymphocyte, depicted by disease status and oxygen (O2) exposure days in neonatal intensive care unit (NICU). (B) Significantly elevated neutrophil (granulocyte)-lymphocyte ratio (NLR) in BPD compared to nonBPD in cord blood and peripheral blood on postnatal days 14 and 28. *, p < 0.05 vs. time-matched nonBPD. Individual data with mean ± S.E.M. presented. n = 107 in cord blood (93 nonBPD, 14 BPD). n = 93 in day 14 (81 nonBPD, 12 BPD). n = 82 in day 28 (71 nonBPD, 11 BPD). (C) Higher NLR was also associated with prolonged NICU O2 days and shorter GA on day 14 and  postnatal day 28 (Additional file 2: Fig. S1 A, B).

To further explore differences in cell-type proportions, particularly shifts among lymphocyte subtypes from naïve to memory, we utilized the 12 cell-type model of Salas et al.28. While this model does not include NRBCs and cannot distinguish NRBCs from myeloid lineage cell types, it estimates naïve and memory composition of B cells, CD4 T cells and CD8 T cells. BPD infants showed significant increases in memory B cells at both day 14 and day 28 (p < 0.01, Additional file 2: Table S2), significant decreases in CD4 and CD8 naïve cell types at day 14 (p < 0.05). This model also indicated a highly significant increase in NLR for BPD infants at both day 14 and day 28 (p < 0.01).

Cord blood genes differentially expressed in BPD and comparison with EWAS

RNA-seq was carried out on cord blood RNA, and DESeq2 analysis adjustment for nine covariates (GA, BW, sex, and six cell-type percentages of CD4 + T, CD8 + T, B cell, granulocyte, monocyte, and NRBC) determined DEGs in BPD infants (471 genes at p < 0.01; 1685 genes at p < 0.05) (Table 2; Additional file 1: Table S3; Additional file 2: Fig. S2A). Pathway analysis indicated that 471 DEGs were predominantly involved in the inhibition of lymphopoiesis, T cell receptor signaling and Th1/Th2 pathways, and adaptive immune response (e.g., CD27, CD79B, ICOS, IL18R1, LCK, STAT4) (Fig. 3A; Table 2; Additional file 1: Table S4), consistent with the decreased T and B cell quantity in BPD infants as depicted in Fig. 1A. The observation of the decreased pyroptosis pathway (e.g., CASP9, GZMA), activated antiviral response and interferon (IFN) signaling (e.g., IFIM3, SOCS1, TLR4) and mitotic cell cycle (e.g., BORA, RAD51, SPC25) (Fig. 3A; Additional file 1: Table S4) coordinately indicated inflammatory cell proliferation against infection, agreeing with the increased NLR found in BPD infants (shown in Fig. 2B). These and other pathways including development and respiratory disorders showed similar enrichment in the cord blood BPD-EWAS CpGs22. Parallel analysis of epigenome and transcriptome pathways in cord blood cells suggests an epigenetic modulation of the T cell autocrine lymphokine IL-229, which could be an upstream event to manage transcriptional changes for T cell clonal expansion and growth, leukocyte activation, and IL-18 stimulation for IFN production and NK cell cytotoxicity (Fig. 3B). The 134 genes in common between BPD-CpGs and BPD-DEGs (e.g., CCL5, HLA-DQA2, IL18RAP, LCK, SELP) may play key roles in granulocyte inflammation and innate and adaptive immune responses (Additional file 1: Table S3; Additional file 2: Fig. S2B). Lung alveolarization- and BPD-associated SPOCK27,30 and AGER encoding receptor for advanced glycosylation end-product (RAGE)31,32 were also elucidated in both analyses. Many of the overlapping genes showed reciprocal changes in methylation and gene expression differences indicating epigenetic regulation of their transcription.

Table 2 Differentially expressed cord blood genes in bronchopulmonary dysplasia (BPD).
Figure 3
figure 3

Bronchopulmonary dysplasia (BPD)-associated cord blood genes, predicted pathways and comparison with cord blood epigenetic changes. (A) A dot plot of highly enriched pathways and gene ontologies (GOs) of differentially expressed cord blood genes (471 genes at p < 0.01) in BPD (n = 12) compared to nonBPD (n = 55, ≥ 1 day oxygen-exposure in newborn intensive care unit) determined by RNA sequencing analysis (RNA-Seq) with adjustment for covariates (6 cell-type %, sex, gestational age, birth weight). Top-ranked pathways of cord blood epigenome-wide association analysis (EWAS) for BPD (386 genes annotated to 275 CpGs) were retrieved from a previous publication22. Circle size and number represent adjusted -Log10 p values of pathways/GOs. Circle color indicates activation z-score trend. Details of pathway analysis are in Additional file: Table S4. (B) Ingenuity pathway analysis (IPA) demonstrates that interleukin 2 (IL-2) may be an essential upstream molecule disturbing epigenomes and transcriptomes involved in blood cell development and activity during BPD pathogenesis. Sirt1 sirtuin 1, TREX1 three prime repair exonuclease 1, IRF2 interferon regulatory factor 2, RNASEH2B ribonuclease H2 subunit B, TNF tumor necrosis factor, ZBTB16 zinc finger and BTB domain containing 16, TCF3 transcription factor 3, NFATC2 nuclear factor of activated T cells 2, STAT4 signal transducer and activator of transcription 4, CREBBP CREB binding protein, NFκB nuclear factor kappa-light-chain-enhancer of activated B cells, PPARG peroxisome proliferator activated receptor gamma, KLRF killer cell lectin like receptor F1, IL18R1 IL-18 receptor 1, ICOS inducible T cell costimulatory, PSMB10 Proteasome 20S Subunit Beta 10, IFNGR1 interferon gamma receptor 1, TLR2 toll-like receptor 2.

BPD-associated CpGs and DEGs in peripheral blood cells on postnatal day 14

Robust linear regression analysis with adjustment for 10 covariates (GA, BW, sex, seven cell-type %) identified 153 BPD-associated CpGs at genome-wide significance level (Bonferroni or BF, p < 6.32E−08) and 2871 CpGs at FDR1% (Fig. 4A). More CpGs (76–82%) were hypomethylated in BPD than in nonBPD infants (Additional file 2: Fig. S3A). The 153 BF BPD-CpGs were annotated to a total of 225 nearby genes (Table 3; Additional file 1: Table S5).

Figure 4
figure 4

Epigenome-wide association study (EWAS) and genome-wide gene expression analysis for bronchopulmonary dysplasia (BPD) development in blood cells from premature infants on postnatal day 14. (A) A Manhattan plot displays BPD-associated CpGs on day 14 by comparison of BPD (n = 12) and nonBPD (n = 53) exposed to ≥ 1 day of oxygen (O2) in the neonatal intensive care unit (NICU) with adjustment for covariates [seven cell type %, sex, gestational age (GA), birth weight (BW)]. Robust linear regression model elucidated 153 CpGs with Bonferroni correction (p < 6.32E−08, red line) and 2871 CpGs at false discovery rate (FDR) 1% (blue dotted line). (B) Heat map depicts hierarchical clustering of differentially expressed 233 genes in BPD (FDR 10%) with adjustment for six cell type %, sex, GA, and BW determined by RNA sequencing (RNA-Seq, 12 BPD, 33 nonBPD-O2). Down-regulated and up-regulated genes in BPD relative to nonBPD are in blue and orange, respectively. NICU O2 days, neutrophil–lymphocyte ratio (NLR), nucleated red blood cell (NRBC) %, GA, and BW (color intensity = scale) as well as sex (red=male, blue=female) are labeled for each sample. Samples by row and genes by column. Heatmap created using Partek Flow (Partek Inc., Chesterfield, MO; https://www.partek.com/partek-flow/). (C) Top-ranked pathways and gene ontologies of BPD-associated genes and BPD CpG-annotated genes demonstrate similar or unique molecular events in transcriptome and epigenome. Circle size and label indicate –log10 p values (EWAS) or −log10 adjusted p values (RNA-Seq). Details of the pathways analysis are in Additional file: Table S7.

Table 3 Top 30 CpGs significantly associated with bronchopulmonary dysplasia (BPD) risk on postnatal days.

DESeq2 analysis with adjustment for nine covariates (GA, BW, sex, cell-type % of CD4 + T, CD8 + T, B cell, NK, granulocyte, and monocyte) determined DEGs (731 genes at p < 0.01) in peripheral blood RNA of BPD samples on day 14 (Table 4; Additional file 1: Table S6). Hierarchical clustering analysis of 731 DEGs demonstrated more genes (453/731) were overexpressed in BPD than in nonBPD infants (Fig. 4B; Additional file 2: Fig. S3A,B), which suggested potential demethylation-mediated transcriptional activation.

Table 4 Representative differentially expressed genes (DEGs) in peripheral blood cells of preterm infants with bronchopulmonary dysplasia (BPD).

Differentially methylated CpG-annotated genes were mainly enriched in organ and cellular development and morphology signaling (e.g., GNA12, NCOA1, PDGFA, MYH10), telomere length and cellular senescence (e.g., CDK6, TERT, TNKS), neutrophilic inflammation and immune responses (e.g., ADAM8, CD55, IFNA1, ITGAM, MMP8, PIN1) and oxidative stress (e.g., ALDH3B1, GGT1) (Fig. 4C; Additional file 1: Table S7). Transcriptome changes in infants who developed BPD included genes involved in chromosome segregation and DNA repair (e.g., AURKB, BRCA2, CCNA2, H4C13) potentially inhibiting cell cycle progression and directing cells toward apoptosis or DNA damage repair (Fig. 4C; Additional file 1: Table S7). In BPD infants, there may be decreased proliferation of lymphocytes and reduced T cell-mediated immune responses (e.g., CD2, IL15RA, JAK1, TNFSF8). In addition, enrichment in pathways for IFN signaling and cytokine ‘storm’ (e.g., IFI6, IFITM1, IL1RN, MYD88, LCK), antioxidant response regulated by nuclear factor erythroid 2-related factor 2/glutathione redox (e.g., CHST12, GCLM, GSR, NFE2L2, SOD1), and mitochondrial dysfunction (e.g., ACO1, COX7B, NDUFS4) indicated severe oxidative stress and inflammation in infants who developed BPD. The group of genes that overlapped between epigenome and transcriptome on day 14 (85 genes with FDR 1% epigenome changes, Table S7) showed enrichment in altered T cell development, T cell-mediated immune responses, and inflammation/oxidative stress-mediated cellular senescence (Additional file 2: Fig. S3C). These pathway analysis results were consistent with the reduced lymphocyte percentage, increased granulocyte percentage and the increased NLR in BPD infants on day 14 (shown in Fig. 2A,B).

BPD-associated CpGs and DEGs in peripheral blood cells on postnatal day 28

BPD diagnosis was associated with 116 CpGs at BF (2299 CpGs at FDR 1% cut-off) on day 28 (Fig. 5A; Table 3; Additional file 1: Table S8) and DNA hypomethylation (demethylation of ~ 80% of CpGs) was evident in infants who developed BPD, similarly to the result on day 14 (Additional file 2: Fig. S4A). Pathways of the BPD-CpG-annotated genes were similar to those seen on day 14 and tissue morphology, neuron and body development, and cellular assembly (e.g., CACNB2, CHD7, FGF8, HOXB6, PAX2) were noted (Fig. 5B; Additional file 1: Table S9). In addition, BPD-CpG annotated genes observed on day 28 (Fig. 5B) were involved in immune and inflammatory events, including activation of immune and inflammatory cells, phagosome formation, infection, and platelet development (e.g., FCGR2A, GPR55, IGKV1-37, NFIB, TNFSF8), nucleotide metabolism (e.g., ADARB2, NME6), cholesterol metabolism (e.g., CYP27A1, MBTBS1), and oxidative stress (e.g., ALOX12, MAFF, NXN).

Figure 5
figure 5

Epigenome-wide association study (EWAS) and genome-wide gene expression analysis for bronchopulmonary dysplasia (BPD) development in blood cells from premature infants on postnatal day 28. (A) A Manhattan plot displays BPD-associated CpGs on day 28 by comparison of BPD (n = 11) and nonBPD (n = 48, ≥ 1 day of oxygen (O2) treated in the newborn intensive care unit) with adjustment for covariates (seven cell types, sex, gestational age, birth weight). Robust linear regression model elucidated 116 CpGs with Bonferroni correction (p < 6.46−E08, red line) and 2299 CpGs at FDR 1% (p < 3.0E−05, blue line). (B) Top-ranked pathways and gene ontologies of BPD-associated genes and BPD CpG-annotated genes determined by pathway analysis tools depict similar or unique molecular events in transcriptome and epigenome. Circle size and label indicate −log10 p values (adjusted p for RNA-Seq). Details of the pathways are in Additional file: Table S9.

BPD-DEGs on day 28 included 312 genes (p < 0.05) (Table 4; Additional file 1: Table S10; Additional file 2: Fig. S4B). Enriched pathways of DEGs on day 28 were similar to those on day 14 such as chromosome segregation and DNA damage (e.g., BLM, CHEK1), activation of phagocytosis and degranulation (e.g., GNLY, MPO, LTF), leukocyte activation and immune response (e.g., IL23A, PIK3R3, RAG1, TRDC), neurological disorder (e.g., ATP6V1A, DUSP), and lipid metabolism (e.g., ACACA, SCD) (Fig. 5B; Additional file 1: Table S9). The epigenome-transcriptome common 30 genes on day 28 were also enriched in these pathways (Additional file 2: Fig. S4C), indicating the transcriptome changes are parallel with epigenome changes on day 28 in infants who are developing BPD. Overall, this indicated that altered leukocyte activation and antimicrobial response, DNA damage and nucleotide metabolism, lipid metabolism, and angiogenesis and other growth-related events continue on day 28 in premature infants who went on to develop BPD.

NICU O2 therapy-associated CpGs and DEGs in peripheral blood cells on postnatal day 14

To assess the epigenetic and transcriptomic response to NICU O2 supplementation in prematurity without regard to disease processes, we examined these endpoints on postnatal day 14 in nonBPD infants exposed to O2 for ≥ 14 days (High-O2), and compared them to nonBPD infants who were never exposed to O2 (No-O2). EWAS (n = 20 for High-O2, n = 28 for No-O2) determined 346 differentially methylated loci at FDR 1%, which varied up to 7.5% methylation difference and were annotated to 529 genes (Table 5; Additional file 1: Table S11; Additional file 2: Fig. S5A). RNA-Seq (n = 18 for High-O2, n = 15 for No-O2) determined 513 DEGs at FDR 1% (1480 DEGs at FDR 5%) with the greatest fold changes observed in mitochondria-encoded mitochondrial genes (Table 5; Additional file 1: Table S12). About 61% of DEGs (314/513) were markedly downregulated by O2 exposure (Fig. 6A). High-O2 infants had relatively higher NLR (Fig. 6A) and smaller GA and BW compared to No-O2 infants. In addition, more males (10/15, 67%) required prolonged O2 treatment than females (8/18, 44%).

Table 5 Representative differentially methylated CpGs and differentially expressed genes (DEGs) associated with prolonged oxygen (O2) exposure on postnatal day 14.
Figure 6
figure 6

Genome-wide gene expression analysis of oxygen (O2) treatment in neonatal intensive care unit (NICU) and comparison with epigenome-wide association study (EWAS) in preterm infants on postnatal day 14. (A) Heat map of O2-differentially expressed genes (n = 513 at FDR 1% determined by RNA sequencing (RNA-Seq) between nonBPD infants exposed to prolonged O2 (≥ 14 days, High-O2, n = 18) and those to no O2 exposure (No-O2, n = 15) with adjustment for nine covariates [six cell types, sex, gestational age (GA), birth weight (BW))]. Down-regulated and up-regulated genes in High-O2 relative to No-O2 are in orange and blue, respectively. NICU O2 days, neutrophil–lymphocyte ratio (NLR), nucleated red blood cell (NRBC) %, GA, and BW (color intensity = scale) as well as sex (Red-Male, Blue-Female) are labeled for each sample. Samples by row and genes by column. A volcano plot depicts Log10-transformed adjusted p values against Log2-transformed fold changes of differentially expressed genes by prolonged O2-exposure in nonBPD. Heatmaps, volcano plot and diagrams created using Partek Flow (Partek Inc., Chesterfield, MO; https://www.partek.com/partek-flow/) (B) Pathway analysis of differentially expressed genes predicted that NICU O2 exposure disturbs various molecular events including mitochondrial oxidative phosphorylation and biogenesis, which leads to mitochondrial dysfunction and related diseases such as neurological disorders. (C) Comparison of epigenome and transcriptome changed by ≥ 14 days of NICU O2 supplementation in premature infants elucidates similar enriched pathways and gene ontologies such as developmental signaling (TGF-β, NGF, ID-1), cellular assembly (cytoskeleton, cytoplasm), reactive oxygen species (ROS) production and xenobiotic metabolism, inflammation (interleukin 8, IL-8), and T cell differentiation (IL-15). The activation z-scores of methylome changes (EWAS) are in general reciprocal to expression changes (RNA-Seq).

Transcriptome differences in the blood of O2 supplemented neonates strongly pointed to the inhibition of mitochondrial function and morphology including oxidative phosphorylation (e.g., MT-ATP6, MT-CO2, MT-ND5, MT-CYB, LRPPRC) (Fig. 6B). O2-dependent DEGs were also involved in neuronal development and mental retardation (e.g. CREBBP, ROCK1, CUL3), growth retardation (e.g. HHEX, MT-ND4, NCOA6, NCOR2), lymphopoiesis (e.g. BCOR, GPR183, RASGRP4), neutrophilic inflammation (e.g., MOSPD2, DOK3, OSCAR), and DNA damage and repair (e.g., BAX, BRD4, LIG1) (Additional file 2: Fig. S5B). O2-altered methylome was predicted to affect cellular organization, neuron and organ development (e.g., MME, NCOR1, NGF, PDLIM7, RPL24, TRIO), and lymphoid organ and lymphocyte differentiation (e.g., HLA-DQB1, RAG1, TYK2) (Additional file 2: Fig. S5B).

Comparison of epigenomic and transcriptomic pathways influenced by O2 supplementation showed similarities in neuron and tissue development signaling pathways (e.g., TGF-β, NGF, ID-1), cellular assembly, xenobiotic metabolism, cellular senescence, reactive oxygen species production, cell and organ death, and inflammation and infection (Fig. 6C). The pathway activation z-scores of the methylation changes were, in general, inversely related to those of gene expression changes in these functions.

Discussion

This study analyzed DNA methylation and gene expression profiles of cord and venous blood during the 1st month after premature birth and reported on numerous molecular and cellular differences observed in neonates later diagnosed with BPD (summary in Fig. 7). We observed that GA was significantly correlated with decreased proportions of lymphocytes and increased proportions of granulocytes in both cord and peripheral blood. These changes were also reflected in the NLR, particularly in the extremely preterm neonates, including the infants who ultimately developed BPD. NLR is easily assessed by standard blood count methods and the NLR in the first weeks of life could indicate babies at high risk for later developing BPD as has been suggested27. The present findings are consistent with this notion as we observed greater NLR in BPD babies at each of the time points studied.

Figure 7
figure 7

Comparison of epigenome-wide association study (EWAS) and genome-wide gene expression analysis of bronchopulmonary dysplasia (BPD) risk in cord blood and postnatal day peripheral blood. (A) Overlapping genes annotated to differentially methylated CpGs determined by EWAS or differentially expressed genes determined by RNA-sequencing (RNA-Seq) between different times of samples. Common 103 genes from extended list (false discovery rate 1%) from EWAS were mainly associated with organ morphology and neonatal death, inflammation, and molecular transport. RNA-Seq elucidated seven overlapping genes (at p < 0.05) increased (↑) or decreased (↓) in BPD compared to nonBPD. (B) BPD-associated pathways and gene ontologies predicted from epigenome and transcriptome changes were compared by time.

The NLR is likely to be driven by inflammation, and both systemic and lung inflammation by antenatal infection, chorioamnionitis (e.g., rubella, Ureaplasma species), and postnatal sepsis (e.g., E. coli, group B streptococcus) have been reported to increase BPD risk33,34,35. The NLR and the pathway enrichment in: interferon signaling (cord blood and day 14 methylome and transcriptome); neutrophil degranulation (day 14 methylome, day 28 transcriptome), hyper-cytokinemia (day 14 transcriptome); complement system activation (day 14 methylome); and phagocytosis (day 28 methylome and transcriptome), collectively suggested microbial infection, neutrophilic inflammation, and altered innate immune responses in infants who developed BPD. Key genes in the innate immune response that were differentially expressed or methylated included GNLY and granzyme (GZMA), the antimicrobial peptides from killer lymphocyte granules causing microptosis36. In addition, antimicrobial lactoferrin (LTF), neutrophilic myeloperoxidase (MPO), anti-streptococcus AGER and cathepsin D (CTSD), interferon alpha 1 (IFNA1), interferon induced transmembrane proteins (IFITM1, IFITM3), and the negative regulator of interferon response (NPIR) are likely to be important players in innate immune modulation of BPD pathogenesis. This is consistent with our observation of a higher frequency of infection in the BPD infants (Table 1), including sepsis and pneumonia (Additional file 1: Table S1), and also with an adaptive immune response as indicated by apparent shifts from naïve to memory among B cells, CD4 T cells and CD8 T cells (Additional file 1: Table S2).

Consistent with the lowered percentages of lymphocytes (CD4 + T, CD8 + T, B cells) in BPD compared to nonBPD infants, EWAS and RNA-seq coordinately suggested suppressed T cell immunity in BPD infants (Fig. 7). Mucosal immune imbalance due to lymphocyte insufficiency may mediate respiratory infection and inflammation, and play a role in lung damage and aberrant development. O2-induced DNA damage and cell cycle arrest during the neonatal period (Figs. 4C; 5B) may also contribute to decreased lymphopoiesis in BPD. Immune cell senescence due to oxidative stress and telomere shortening was also suggested by telomere-related epigenetic (e.g., TERT, TNKS) or transcriptomic (e.g., PINX1, TERF2) changes in BPD cases on day 14. In cord blood, epigenetic inhibition of IL-2 may affect downstream transcriptomics leading to improper lymphocyte differentiation and activation in BPD (Fig. 3B). IL-2 from CD4 + T and CD8 + T cells is a powerful modulator of T cell growth, leukocyte activation, and IL-18 stimulation of IFN production and NK cell cytotoxicity29. Downregulation of immune surface marker genes related to T cell maintenance and T and B cell activation (e.g., CD28, CD27, CD7, CD79B, CD83) in cord and/or peripheral blood further supports skewed B and T cell immunity in preterm infants who developed BPD. In addition, decreased expression of tumor necrosis factor superfamily member 8 (TNFSF8) encoding CD153, one of the 7 BPD-DEGs in common to all timepoints, suggested interruption of T cell-dependent anti-mycobacterial immune response37 and also class switch DNA recombination and immunoglobulin production38 in BPD pathogenesis. T cell inadequacy has been suggested as an immunological clue for BPD prediction. For example, a downregulation of T cell receptor signaling was reported in postnatal blood (5–28 days) from preterm BPD babies14,19. Scheible et al.39 reported that premature infants had a lowered level of CD31 + CD4 + T cells on discharge and showed increased risk for respiratory complications at 1 year of age compared to full-term babies37,38.

In addition to the immunological biomarkers, we also observed development- and alveolarization-related epigenetic and transcriptomic markers in cord blood of infants who developed BPD30,31,40. Concurrently with BPD epigenome data22, we identified SPOCK2 and AGER among the BPD-DEGs in cord blood. SPOCK2, a BPD susceptibility gene identified from GWAS7,30,31, was reported to be deleterious in BPD development as its overexpression in the lung altered neonatal mouse lung alveolarization and exacerbated hyperoxia-caused alveolar simplification while anti-SPOCK2 antibody treatment in mice improved it7,30. AGER is a type 1 alveolar cell differentiation marker31 and prenatal exposure to nicotine in mice arrested alveolarization and upregulated the AGER signaling pathway41. In addition, Ager knock-in mice were generated to study alveolar development and type 1 cell turnover during injury and repair42.

A unique aspect of this study was the opportunity to examine the in vivo response of blood cells to O2 therapy among nonBPD infants. Comparing preterm infants who received at least 14 days of NICU O2 supplementation with those who never received O2 therapy, we observed a dramatic downregulation of a large set of mitochondria-related genes, either mitochondrial DNA-encoded (e.g., MT-ND5, MT-CYB, MT-CO2, MT-ATP6) or nuclear DNA-encoded (e.g., CPT1A, LRPPRC, SLC25A24, UCP2). Relative to those who never received O2 therapy, this down-regulation also occurred in BPD neonates. These genes are involved in mitochondrial DNA-related diseases such as mitochondrial leukoencephalopathy, mitochondrial myopathy, ventricular preexcitation, optic and retinal disorders, and type I diabetes (Additional file 2: Fig. S5B). Other downregulated genes (e.g., BAX, HTT, IGF1R, OGDH, TGFB1) have also been associated with mitochondrial morphology and permeability. Consistent with our observation, Das et al.43 reported that hyperoxia exposure reduced mitochondrial oxidative phosphorylation complex (I and II) activity in lung epithelial cells. In addition, inhibition of mitochondrial respiration caused arrested alveolarization of neonatal mice44, indicating mitochondrial biogenesis is critical in lung maturation. In addition, it has been reported that vascular endothelial mitochondrial function is associated with BPD45. We observed mitochondrial gene dysregulation and predicted functional abnormality in blood cells at day 14 of O2 therapy in both BPD and nonBPD neonates relative to neonates with no O2 therapy. While down-regulated mitochondrial biogenesis was observed among neonates with O2 therapy and may be caused by this treatment, it is possible the mitochondrial effects are reflecting the infant’s poor blood O2 saturation in the neonatal period, rather than a response to treatment. Overall, mitochondrial dysfunction may be one of the underlying mechanisms driving BPD in susceptible infants. However, while nonBPD neonates are able to respond and recover from this exposure, the neonates who develop BPD continue to need O2 therapy for many weeks to maintain optimal blood O2 saturation levels. It suggests that additional susceptibility factors such as genetic influences or prenatal conditions that lead to a deficit in the response to O2 therapy are likely to be involved45.

Optimal O2 supplementation in the NICU is essential to reduce the risk of respiratory mortality and other morbidities. However, in addition to BPD, preterm babies who require prolonged O2 treatment often develop neurological disorders, such as cognitive deficits in the absence of apparent brain injury, and are at higher risk of morbidities including retinopathy of prematurity (ROP) and cardiac disorders46,47. Mitochondrial dysfunction due to oxidative stress is associated with adult neurodegenerative disorders48 and cardiovascular toxicity49, and has been proposed as a mechanism in preterm morbidities50. Our observation of O2 treatment-induced mitochondrial effects in blood cells supports this idea.

Conclusions

The small size of the examined cohort and the low incidence of BPD in this study limited the statistical power for many comparisons and therefore, any conclusions should be interpreted with caution. However, despite these limitations, epigenomic and transcriptomic profiling of blood from preterm infants receiving O2 supplementation revealed time-dependent immunopathological events in the early weeks of life before they were diagnosed with BPD. Altered methylation and dysregulated transcription presumably have affected neutrophil and lymphocyte proportions, T cell and adaptive immune responses, inflammation and phagocytosis, cellular assembly, and repair of DNA damage in premature neonates. Prolonged O2 treatment highly suppressed mitochondrial gene expression, which may be driving lung pathology and be implicated in a variety of neuronal and optical disorders of prematurity. These molecular and cellular outcomes such as NLR may be predictors of BPD susceptibility and should be followed up in larger studies of BPD. Similarity of pathways and overlapping genes between two genomic networks suggested interplay between epigenetics and transcriptomics in BPD pathogenesis. These results provided insights into mechanisms of BPD and warrant further investigation into the clinical relevance of blood methylomic and transcriptomic markers of BPD risk.

Methods

Study cohort

The Discovery-Bronchopulmonary Dysplasia Program (D-BPD) cohort was described in previous publications21,22 and much of the methods overlap with the present work. In brief, in the parent study, 378 preterm infants < 1500 g of birth weight in Buenos Aires, Argentina, were recruited within 13 days of life and followed prospectively in the NICU until discharged or 44 weeks of corrected GA21,22. A diagnosis of BPD was made for infants who received at least 28 days of O2 (> 21%) supplementation therapy and need for O2 (≥ 30%) and/or positive pressure (1) at 36 weeks of PMA or at discharge (whichever comes first) if born < 32 weeks GA or (2) at 28–56 days postnatal age or at discharge (whichever comes first) if born ≥ 32 weeks GA, as defined in Jobe and Bancalari23. A total of 109 patients (15 BPD, 94 nonBPD) satisfying all study inclusion criteria provided a cord and/or peripheral blood sample for methylation arrays and/or RNA-Seq analysis (Table 1; Additional file 1: Table S1). The study methods were performed in accordance with the relevant guidelines and regulations and the design was approved by each of the local Institutional Review Boards (IRB) in Buenos Aires and at the NIEHS (08-E-N159). Parents provided written informed consent as described elsewhere21,22.

Genomic DNA and total RNA extraction

Cord or peripheral blood samples were collected at birth and at day 14 and day 28 of life, placed in PAXgene reagent (Qiagen Inc., Valencia, CA) and then snap frozen at – 80 °C. The PAXgene Blood miRNA Kit (PreAnalytix/Qiagen) was used following the manufacturer’s procedure. As in Wang et al.22 blood specimens were incubated at room temperature for > 2 h to lyse RBCs and centrifuged (3500 g, 15 min) to acquire cell pellets. Pellets were washed and treated with proteinase K at 55 °C (800 rpm, 15 min), and isopropanol was added to the soluble fractions of the supernatants prepared from the QiaShredder spin columns. For RNA isolation, the isopropanol precipitants were added into the PAXgene RNA spin columns and processed for DNase treatment followed by RNA extraction procedures as indicated in the manufacturer’s brochure. As in Wang et al.22, for DNA isolation, the isopropanol precipitants prepared with the PAXgene miRNA Kit were loaded into the DNeasy Mini spin columns (DNeasy Blood & Tissue Kit, Qiagen) and followed the manufacturer’s procedure. DNAs and RNAs were quantified using Qubit (Thermo Fisher, Waltham, MA) and stored at – 80 °C until used.

DNA methylation microarray analysis

Aliquots (100–500 ng) of genomic DNA from whole peripheral blood were bisulfite-converted using EZ-96 DNA Methylation MagPrep kits (Zymo Research, Irvine, CA) following the manufacturer’s instructions as described previously22. Bisulfite-converted DNAs were applied to Human MethylationEPIC BeadChip (Illumina, San Diego, CA), which covers over 850,000 CpG sites in the human genome, at the NIEHS Molecular Genomics Core Laboratory. The raw IDAT files from the EPIC methylation arrays were read into R with the minfi package51, and the data was preprocessed with background correction and dye-bias normalization using the preprocessNoob method52 as in22. The champ.runCombat function in ChAMP package53 was used for batch correction ("Sentrix_ID" and "Sentrix_Position"). DNA methylation data were filtered prior to normalization, using the following exclusion criteria: arrays with > 5% failed probes; CpG probes located on X and Y chromosomes; and any probes containing one or more single-nucleotide polymorphisms having a minor allele frequency ≥ 1% (in EUR population of the 1000 Genomes Project) occurring within 5 nucleotides relative to the CpG site. Probes reported to hybridize to one or more non-target sites in the genome were deleted54. There were 775,201 CpG probes remaining after exclusions. DNA methylation array data are deposited in Gene Expression Omnibus (GEO, accession number: GSE225313). To detect potential outlier samples in the methylation dataset we prepared principal component analysis (PCA) plots of methylation data from Day 14, Day 28 and combined (Additional file 2: Fig S6). No methylation samples were excluded.

Methylation-based cord blood and peripheral blood cell type estimation

The percentages of seven blood cell types (CD4 + T, CD8 + T, NK cells, B cells, monocytes, granulocytes, and nucleated red blood cells) were estimated using reference DNA methylation profiles (https://github.com/immunomethylomics/FlowSorted.CordBloodCombined.450k) for cord blood and the IDOL deconvolution algorithm25. The 12 cell-type model of Salas et al.28 was used to differentiate naïve from memory B cells, CD4 T cells and CD8 T cells.

Differentially methylated loci associated with BPD or NICU O2 exposure

Identification of differentially methylated probes was accomplished using a robust linear regression analysis of M values (log ratio of beta values) on disease status (BPD vs nonBPD-exposed to ≥ 1 day O2 in NICU) on methylation values from postnatal day 14 and day 28 with adjustment for infant sex, GA, BW, and seven estimated blood cell type proportions (CD4 + T, CD8 + T, NK, B, monocyte, granulocyte, NRBC). Differentially methylated loci associated with O2 exposure were identified on day 14 among nonBPD neonates (nonBPD-no O2 vs nonBPD- ≥ 14 days O2 in NICU) with adjustment for the 10 covariates indicated above. The Winsorize technique (https://www.rdocumentation.org/packages/DescTools/versions/0.99.44/topics/Winsorize) was used in an alternative EWAS analysis to test if outlier data points affected the resulting differentially methylated CpGs. We did 90% winsorization, that is the top 5% of the data is replaced by the value of the data at the 95th percentile and the value of the bottom 5% of the data is replaced by the value of the data at the 5th percentile. The re-analysis of all EWAS comparisons using the Winsorization method produced very similar results for differentially methylated CpGs at the FDR1% level (Additional file 1: Table S13). Differentially methylated CpGs were annotated to genes using the Illumina manifest and also by identifying the nearest transcription start site, which resulted in some CpGs being associated with more than one transcript.

RNA-seq analysis

Aliquots of RNA (250 ng) were used to generate poly-adenylated RNA libraries with TruSeq Stranded Total RNA Ribo-Zero Human Gold kit (Illumina). Samples were indexed with NEXTfle-96 RNA-Seq Barcodes (Bioo-scientific, Austin, TX) and 75 bp single-end sequencing was performed on NovaSeq 6000 platform using S4 flow cell (Illumina) in the NIEHS Epigenomics and DNA Sequencing Core Laboratory. Raw sequencing reads (FASTQ files, 22–150 million reads per sample) were aligned to hg38 using STAR and gene counts were generated with featureCounts using the GENCODE version 39 annotation. Count matrix data were then imported to Partek Flow (Partek Inc., Chesterfield, MO) and PCA was used to visualize outlier samples (Additional File 2: Fig S7). One outlier was removed from the cord blood analysis. Quantification of transcript expression and differential expression analyses were performed using DESeq adjusting for nine covariates (GA, BW, sex, CD4 + T%, CD8 + T%, B%, monocyte%, granulocyte%, NRBC%). DEGs were determined between BPD and nonBPD on day 0 (cord blood) or between BPD and nonBPD exposed to any day(s) of O2 on days 14 and 28 (peripheral blood) with cutoff for significance at p < 0.05 and/or FDR-adjusted p < 0.1. DEGs between nonBPD exposed no NICU O2 and nonBPD exposed to ≥ 14 days NICU O2 with cutoff for significance at FDR-adjusted p < 0.01. Controlled hierarchical cluster analysis by disease status generated heatmaps showing a structure of DEG expression trends and partition of DEGs into different clusters using Partek Flow. RNA-Seq raw data are deposited in GEO (accession number: GSE220135).

Pathway analyses

Enrichment of canonical pathways, functions, biological processes and diseases for the genes annotated to the differentially methylated loci or the DEGs were analyzed using Ingenuity Pathway Analysis (IPA, Qiagen), ToppGene Suite (https://toppgene.cchmc.org), Reactome Pathway Database (https://reactome.org) and/or David BioInformatics Resources (https://david.ncifcrf.gov) as described previously22. All p values of pathways and gene ontologies were corrected for multiple testing using the Benjamini–Hochberg method.

Statistical analyses

Association between percentages of granulocyte and GA, lymphocytes and GA, or NLR and NICU O2 days were analyzed by linear regression analyses without covariates (GraphPad Prism 9, GraphPad Software, San Diego, CA). Student’s t-test, Fisher Exact, or Chi-square tests were used to determine differences between BPD and nonBPD for each cell type percentage, NLR, and population demographics (SigmaPlot 14.0, Systat Software, San Jose, CA).

Ethics approval and consent to participate

All participating mothers provided written formed consent. The research was performed under the supervision and approval of the Institutional Review Board at recruiting centers in Argentina and National Institute of Environmental Health Sciences, National Institutes of Health (Protocol 08-E-N159).