Abstract
Methylation quantitative trait loci (mQTLs) are essential for understanding the role of DNA methylation changes in genetic predisposition, yet they have not been fully characterized in East Asians (EAs). Here we identified mQTLs in whole blood from 3,523 Chinese individuals and replicated them in an additional 1,858 Chinese individuals from two cohorts. Over 9% of mQTLs displayed specificity to EAs, facilitating the fine-mapping of EA-specific genetic associations, as shown for variants associated with height. Trans-mQTL hotspots revealed biological pathways contributing to EA-specific genetic associations, including an ERG-mediated network of 233 trans-mCpGs implicated in hematopoietic cell differentiation, which likely reflects modulation of the binding efficiency of the ERG protein complex. More than 90% of mQTLs were shared between different blood cell lineages, with a smaller fraction of lineage-specific mQTLs displaying preferential hypomethylation in the respective lineages. Our study provides new insights into the mQTL landscape across genetic ancestries and their downstream effects on cellular processes and diseases/traits.
Data availability
Our mQTL database, available for download at https://www.biosino.org/panmqtl/, incorporates mQTLs not only from EAs (NSPT) but also from published European and South Asian data. The database also supports searching and visualization of genomic, functional and downstream disease/trait hits of mQTLs and mCpGs. The statistics of mQTLs in the NSPT and CGZ cohorts are available for download at NODE https://www.biosino.org/node under accession number OEP002902, or can be accessed directly at https://www.biosino.org/node/project/detail/OEP002902. The statistics of mQTLs replicated in CAS are available for download at OMIX https://ngdc.cncb.ac.cn/omix under accession number OMIX004116, or can be accessed directly at https://ngdc.cncb.ac.cn/omix/release/OMIX004116. The individual-level genotype data are not available because of IRB restrictions due to privacy concerns. The individual-level DNAm data can be requested at https://ngdc.cncb.ac.cn/omix/release/OMIX004363 (NSPT), https://ngdc.cncb.ac.cn/omix/release/OMIX004333 (CAS) and https://www.biosino.org/node/project/detail/OEP002902 (CGZ). Requests are normally processed within 1–3 months. Data usage shall be in full compliance with the Regulations on Management of Human Genetic Resources in China. The DNAm dataset in buccal cells is available by submitting data requests to mrclha.enquiries@ucl.ac.uk; see the full policy at http://www.nshd.mrc.ac.uk/data.aspx. Managed access is in place for this 69-year-old NSHD study to ensure that the use of the data is within the bounds of consent given previously by participants, and to safeguard against any potential threat to anonymity because the participants were all born in the same week. The mQTL results of the EUR cohort (GoDMC) were downloaded from http://mqtldb.godmc.org.uk/downloads. The mQTL results of the EUR cohort (FHS) were downloaded from https://ftp.ncbi.nlm.nih.gov/eqtl/original_submissions/FHS_meQTLs/ (date: September 14, 2020).
The annotation of CpG probes was downloaded from https://zwdzwd.github.io/InfiniumAnnotation (date: November 25, 2019). Significant GWAS results were downloaded from the GWAS Catalog (https://www.ebi.ac.uk/gwas/docs/file-downloads, date: December 25, 2020) and significant EWAS results were downloaded from the EWAS Atlas (https://ngdc.cncb.ac.cn/ewas/downloads, date: December 25, 2020). The cis-eQTL results in whole blood were downloaded from the GTEx V8 database (https://www.gtexportal.org/home/datasets; date: June 17, 2020) and HGVD (http://www.genome.med.kyoto-u.ac.jp/SnpDB/). The human gene information (Ensembl release v104) was downloaded from GENCODE (https://www.gencodegenes.org/human/release_37lift37.html; date: April 26, 2021), the list of human TFs was downloaded from http://humantfs.ccbr.utoronto.ca/download.php (date: April 3, 2020), and the list of housekeeping genes was downloaded from https://www.tau.ac.il/~elieis/HKG/. Motif information for TFs was obtained from the JASPAR 2020 database (http://jaspar.genereg.net/; date: July 2, 2021) and JASPAR 2022 (date: August 22, 2022). ChIP–seq signals of TFs were downloaded from the ChIP-Atlas database (http://chip-atlas.org/; date: June 2, 2021). Other data sources used in this study include BLUEPRINT mQTL summary statistics (https://ega-archive.org/datasets/EGAD00001005200); Phenoscanner GWAS summary statistics (http://www.phenoscanner.medschl.cam.ac.uk/); functional genomic regions from the Functional Annotation of Animal Genomes (FAANG) Project (https://www.faang.org); PCHi-C data (https://osf.io/u8tzp); H3K27ac HiChIP data (https://www.ncbi.nlm.nih.gov/geo/, GSE101498); DNase-seq data for B cells and T cells and H3K27ac ChIP–seq data for neutrophils (https://www.encodeproject.org); GO terms, KEGG pathways and Reactome pathways from the Molecular Signatures Database (https://www.gsea-msigdb.org/gsea/msigdb/index.jsp); and FANTOM5 (https://fantom.gsc.riken.jp/data/).
Additional resources include the Experimental Factor Ontology (EFO) (https://www.ebi.ac.uk/ols/ontologies/efo); GWASs in BBJ (https://pheweb.jp/); GWASs in UKBB (https://pan.ukbb.broadinstitute.org/); super-enhancer databases (http://www.licpathway.net/sedb/; http://www.asntech.org/dbsuper/; http://www.licpathway.net/SEanalysis/); segmented functional regions from the GM12878 cell line (http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeAwgSegmentation); and 15 chromatin states (https://egg2.wustl.edu/roadmap/data/byFileType/chromhmmSegmentations/ChmmModels/coreMarks/jointModel/final/).
Code availability
Code for the analysis is available at GitHub (https://github.com/Fun-Gene/fastQTLmapping) and Zenodo (https://doi.org/10.5281/zenodo.8084877; ref. 75). Most operations are carried out in R (https://cran.r-project.org/), and the plots are mainly made with the ggplot2 v3.4.2 R package (https://cran.r-project.org/web/packages/ggplot2/index.html). mQTL mapping is performed with fastQTLmapping (https://github.com/Fun-Gene/fastQTLmapping) and the R package MatrixEQTL v2.3 (https://cran.r-project.org/web/packages/MatrixEQTL/index.html). Heritability is estimated with GCTA (https://yanglab.westlake.edu.cn/software/gcta/). MKL is available at https://software.intel.com/tools/onemkl. GSL is available at http://www.gnu.org/software/gsl/. SNPs are annotated with ANNOVAR (https://annovar.openbioinformatics.org/en/latest/, date: 2020.11.2) and CpGs are annotated using the manufacturer’s manifest files (date: 2020.10.21). Genotype calling is based on GenomeStudio (https://support.illumina.com/array/array_software/genomestudio/downloads.html). SNP chip data are imputed with SHAPEIT2 (https://mathgen.stats.ox.ac.uk/genetics_software/shapeit/shapeit.html) and IMPUTE2 (https://mathgen.stats.ox.ac.uk/impute/impute_v2.html). Enrichment analysis of mQTLs is performed with the R package clusterProfiler v4.8.1 (https://bioconductor.org/packages/release/bioc/html/clusterProfiler.html). DNAm processing is based on the Bioconductor packages minfi v1.46.0 (https://bioconductor.org/packages/release/bioc/html/minfi.html) and ChAMP v2.30.0 (https://bioconductor.org/packages/release/bioc/html/ChAMP.html). Cell-type mQTLs are estimated with CellDMC, which is available as part of the EpiDISH v2.8 Bioconductor R package (http://bioconductor.org/packages/devel/EpiDISH). eFORGE is run on the eFORGE2.0 web server (https://eforge.altiusinstitute.org/). The sharing effect of cell-type mQTLs is estimated with the R package mashr (https://cran.r-project.org/web/packages/mashr/index.html).
The GO and KEGG pathway enrichment analyses of mCpGs are conducted using the R package missMethyl v1.34.0 (https://bioconductor.org/packages/3.13/bioc/html/missMethyl.html). Gene enrichment analysis for diseases/traits is performed with the R package disgenet2r v0.99.3 (https://www.disgenet.org/disgenet2r) based on the DisGeNET knowledgebase (date: 2021.6.9). The two-sample MR analysis is conducted using the R package TwoSampleMR v0.4.26 (https://mrcieu.github.io/TwoSampleMR/). The HiChIP loops are processed with HiCCUPS, as implemented in Juicer Tools (v0.7.5), with default parameter settings. The influence of SNPs on REs is calculated using the tool OpenCausal (https://github.com/liwenran/OpenCausal). Colocalization is performed with SMR v1.3.1 (https://yanglab.westlake.edu.cn/software/smr/#Download). Enrichment of mQTL CpGs for TF motifs is performed with TFmotifView (http://bardet.u-strasbg.fr/tfmotifview/) and the R package PWMEnrich v4.30.0 (https://bioconductor.org/packages/release/bioc/html/PWMEnrich.html). Phenome-wide association analysis is carried out with PheWAS (https://gwas.mrcieu.ac.uk/phewas).
References
Bonder, M. J. et al. Disease variants alter transcription factor levels and methylation of their binding sites. Nat. Genet. 49, 131–138 (2017).
Huan, T. et al. Genome-wide identification of DNA methylation QTLs in whole blood highlights pathways for cardiovascular disease. Nat. Commun. 10, 4267 (2019).
McRae, A. F. et al. Identification of 55,000 replicated DNA methylation QTL. Sci. Rep. 8, 17605 (2018).
van Dongen, J. et al. Genetic and environmental influences interact with age and sex in shaping the human methylome. Nat. Commun. 7, 11115 (2016).
Gaunt, T. R. et al. Systematic identification of genetic influences on methylation across the human life course. Genome Biol. 17, 61 (2016).
Hannon, E. et al. Leveraging DNA-methylation quantitative-trait loci to characterize the relationship between methylomic variation, gene expression, and complex traits. Am. J. Hum. Genet. 103, 654–665 (2018).
Chen, L. et al. Genetic drivers of epigenetic and transcriptional variation in human immune cells. Cell 167, 1398–1414 (2016).
Min, J. L. et al. Genomic and phenotypic insights from an atlas of genetic effects on DNA methylation. Nat. Genet. 53, 1311–1321 (2021).
McClay, J. L. et al. High density methylation QTL analysis in human blood via next-generation sequencing of the methylated genomic DNA fraction. Genome Biol. 16, 291 (2015).
Lemire, M. et al. Long-range epigenetic regulation is conferred by genetic variation located at thousands of independent loci. Nat. Commun. 6, 6326 (2015).
Banovich, N. E. et al. Methylation QTLs are associated with coordinated changes in transcription factor binding, histone modifications, and gene expression levels. PLoS Genet. 10, e1004663 (2014).
Bell, C. G. et al. Obligatory and facilitative allelic variation in the DNA methylome within common disease-associated loci. Nat. Commun. 9, 8 (2018).
Liu, Y. et al. GeMes, clusters of DNA methylation under genetic control, can inform genetic and epigenetic analysis of disease. Am. J. Hum. Genet. 94, 485–495 (2014).
Kassam, I. et al. Genome-wide identification of cis DNA methylation quantitative trait loci in three Southeast Asian Populations. Hum. Mol. Genet. 30, 603–618 (2021).
Hawe, J. S. et al. Genetic variation influencing DNA methylation provides insights into molecular mechanisms regulating genomic function. Nat. Genet. 54, 18–29 (2022).
Buniello, A. et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019).
Li, M. et al. EWAS Atlas: a curated knowledgebase of epigenome-wide association studies. Nucleic Acids Res. 47, D983–D988 (2019).
Higasa, K. et al. Human genetic variation database, a reference database of genetic variations in the Japanese population. J. Hum. Genet. 61, 547–553 (2016).
Narahara, M. et al. Large-scale East-Asian eQTL mapping reveals novel candidate genes for LD mapping and the genomic landscape of transcriptional effects of sequence variants. PLoS ONE 9, e100924 (2014).
Akiyama, M. et al. Characterizing rare and low-frequency height-associated variants in the Japanese population. Nat. Commun. 10, 4393 (2019).
Yengo, L. et al. A saturated map of common genetic variants associated with human height. Nature 610, 704–712 (2022).
Wilson, N. K. et al. Combinatorial transcriptional control in blood stem/progenitor cells: genome-wide analysis of ten major transcriptional regulators. Cell Stem Cell 7, 532–544 (2010).
Hoang, T., Lambert, J. A. & Martin, R. SCL/TAL1 in hematopoiesis and cellular reprogramming. Curr. Top. Dev. Biol. 118, 163–204 (2016).
Zheng, S. C., Breeze, C. E., Beck, S. & Teschendorff, A. E. Identification of differentially methylated cell types in epigenome-wide association studies. Nat. Methods 15, 1059–1066 (2018).
You, C. et al. A cell-type deconvolution meta-analysis of whole blood EWAS reveals lineage-specific smoking-associated DNA methylation changes. Nat. Commun. 11, 4779 (2020).
Teschendorff, A. E., Jing, H., Paul, D. S., Virta, J. & Nordhausen, K. Tensorial blind source separation for improved analysis of multi-omic data. Genome Biol. 19, 76 (2018).
Li, W., Duren, Z., Jiang, R. & Wong, W. H. A method for scoring the cell type-specific impacts of noncoding variants in personal genomes. Proc. Natl Acad. Sci. USA 117, 21364–21372 (2020).
Hnisz, D. et al. Super-enhancers in the control of cell identity and disease. Cell 155, 934–947 (2013).
Whyte, W. A. et al. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell 153, 307–319 (2013).
Chen, M. H. et al. Trans-ethnic and ancestry-specific blood-cell genetics in 746,667 individuals from 5 global populations. Cell 182, 1198–1213 (2020).
Goyama, S., Huang, G., Kurokawa, M. & Mulloy, J. C. Posttranslational modifications of RUNX1 as potential anticancer targets. Oncogene 34, 3483–3492 (2015).
Wahl, S. et al. Epigenome-wide association study of body mass index, and the adverse outcomes of adiposity. Nature 541, 81–86 (2017).
Dick, K. J. et al. DNA methylation and body-mass index: a genome-wide analysis. Lancet 383, 1990–1998 (2014).
Mendelson, M. M. et al. Association of body mass index with DNA methylation and gene expression in blood cells and relations to cardiometabolic disease: a Mendelian randomization approach. PLoS Med. 14, e1002215 (2017).
Demerath, E. W. et al. Epigenome-wide association study (EWAS) of BMI, BMI change and waist circumference in African American adults identifies multiple replicated loci. Hum. Mol. Genet. 24, 4464–4479 (2015).
Teschendorff, A. E. et al. Correlation of smoking-associated DNA methylation changes in buccal cells with DNA methylation changes in epithelial cancer. JAMA Oncol. 1, 476–485 (2015).
Gurzov, E. N., Stanley, W. J., Brodnicki, T. C. & Thomas, H. E. Protein tyrosine phosphatases: molecular switches in metabolism and diabetes. Trends Endocrinol. Metab. 26, 30–39 (2015).
Rodriguez-Nunez, I. et al. Nod2 and Nod2-regulated microbiota protect BALB/c mice from diet-induced obesity and metabolic dysfunction. Sci. Rep. 7, 548 (2017).
Gurses, S. A. et al. Nod2 protects mice from inflammation and obesity-dependent liver cancer. Sci. Rep. 10, 20519 (2020).
Kreuter, R., Wankell, M., Ahlenstiel, G. & Hebbard, L. The role of obesity in inflammatory bowel disease. Biochim. Biophys. Acta Mol. Basis Dis. 1865, 63–72 (2019).
Hugot, J. P. et al. Association of NOD2 leucine-rich repeat variants with susceptibility to Crohn’s disease. Nature 411, 599–603 (2001).
Liu, P. et al. Foxp1 controls brown/beige adipocyte differentiation and thermogenesis through regulating β3-AR desensitization. Nat. Commun. 10, 5070 (2019).
Palmer, C. J. et al. Cdkal1, a type 2 diabetes susceptibility gene, regulates mitochondrial function in adipose tissue. Mol. Metab. 6, 1212–1225 (2017).
Consortium, U. I. G. et al. Genome-wide association study of ulcerative colitis identifies three new susceptibility loci, including the HNF4A region. Nat. Genet. 41, 1330–1334 (2009).
Anderson, C. A. et al. Investigation of Crohn’s disease risk loci in ulcerative colitis further defines their molecular relationship. Gastroenterology 136, 523–529 (2009).
Hachim, M. Y. et al. An integrative phenotype–genotype approach using phenotypic characteristics from the UAE National Diabetes Study identifies HSD17B12 as a candidate gene for obesity and type 2 diabetes. Genes (Basel) 11, 461 (2020).
Moreno-Navarrete, J. M. et al. Heme biosynthetic pathway is functionally linked to adipogenesis via mitochondrial respiratory activity. Obesity (Silver Spring) 25, 1723–1733 (2017).
Cox, B. et al. A co-expression analysis of the placental transcriptome in association with maternal pre-pregnancy BMI and newborn birth weight. Front. Genet. 10, 354 (2019).
Huang, L. O. et al. Genome-wide discovery of genetic loci that uncouple excess adiposity from its comorbidities. Nat. Metab. 3, 228–243 (2021).
Oliva, M. et al. DNA methylation QTL mapping across diverse human tissues provides molecular links between genetic variation and complex traits. Nat. Genet. 55, 112–122 (2023).
DeFronzo, R. A. Chiglitazar: a novel pan-PPAR agonist. Sci. Bull. 66, 1497–1498 (2021).
Ji, L. et al. Efficacy and safety of chiglitazar, a novel peroxisome proliferator-activated receptor pan-agonist, in patients with type 2 diabetes: a randomized, double-blind, placebo-controlled, phase 3 trial (CMAP). Sci. Bull. 66, 1571–1580 (2021).
Jia, W. et al. Chiglitazar monotherapy with sitagliptin as an active comparator in patients with type 2 diabetes: a randomized, double-blind, phase 3 trial (CMAS). Sci. Bull. 66, 1581–1590 (2021).
Aryee, M. J. et al. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics 30, 1363–1369 (2014).
Teschendorff, A. E. et al. A β-mixture quantile normalization method for correcting probe design bias in Illumina Infinium 450 k DNA methylation data. Bioinformatics 29, 189–196 (2013).
Johnson, W. E., Li, C. & Rabinovic, A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8, 118–127 (2007).
Tian, Y. et al. ChAMP: updated methylation analysis pipeline for Illumina BeadChips. Bioinformatics 33, 3982–3984 (2017).
Wu, M. C. & Kuan, P. F. A guide to Illumina BeadChip data analysis. Methods Mol. Biol. 1708, 303–330 (2018).
Zhou, W., Laird, P. W. & Shen, H. Comprehensive characterization, annotation and innovative use of Infinium DNA methylation BeadChip probes. Nucleic Acids Res. 45, e22 (2017).
Gao, X. et al. FastQTLmapping: an ultra-fast package for mQTL-like analysis. Preprint at bioRxiv https://doi.org/10.1101/2021.11.16.468610 (2021).
Shabalin, A. A. Matrix eQTL: ultra fast eQTL analysis via large matrix operations. Bioinformatics 28, 1353–1358 (2012).
Zhu, Z. et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48, 481–487 (2016).
Teschendorff, A. E., Breeze, C. E., Zheng, S. C. & Beck, S. A comparison of reference-based algorithms for correcting cell-type heterogeneity in epigenome-wide association studies. BMC Bioinformatics 18, 105 (2017).
Zheng, S. C. et al. EpiDISH web server: epigenetic dissection of intra-sample-heterogeneity with online GUI. Bioinformatics 36, 1950–1951 (2019).
Liu, Y. et al. Blood monocyte transcriptome and epigenome analyses reveal loci associated with human atherosclerosis. Nat. Commun. 8, 393 (2017).
Leporcq, C. et al. TFmotifView: a webserver for the visualization of transcription factor motifs in genomic regions. Nucleic Acids Res. 48, W208–W217 (2020).
Stojnic, R. & Diez, D. PWMEnrich: PWM enrichment analysis. R package version 4.30.0. https://bioconductor.org/packages/release/bioc/html/PWMEnrich.html (2021).
Fornes, O. et al. JASPAR 2020: update of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 48, D87–D92 (2020).
Lambert, S. A. et al. The human transcription factors. Cell 172, 650–665 (2018).
Oki, S. et al. ChIP-Atlas: a data-mining suite powered by full integration of public ChIP–seq data. EMBO Rep. 19, e46255 (2018).
Stower, H. Gene expression: super enhancers. Nat. Rev. Genet. 14, 367 (2013).
Jiang, Y. et al. SEdb: a comprehensive human super-enhancer database. Nucleic Acids Res. 47, D235–D243 (2019).
Khan, A. & Zhang, X. dbSUPER: a database of super-enhancers in mouse and human genome. Nucleic Acids Res. 44, D164–D171 (2016).
Qian, F. C. et al. SEanalysis: a web tool for super-enhancer associated regulatory analysis. Nucleic Acids Res. 47, W248–W255 (2019).
Peng, Q. et al. Code for the mQTL analyses in 2023 Nature Genetics (v1.0). Zenodo https://doi.org/10.5281/zenodo.8084877 (2023).
Acknowledgements
This work is supported by the Strategic Priority Research Program of Chinese Academy of Sciences (grant XDB38000000 to S.W., F.L. and P.J.), the National Natural Science Foundation of China (NSFC; 92249302 to S.W. and T.N., 32325013 to S.W., 32370699 and 32170652 to A.E.T., 81930056 to F.L., 32170657 to Y.Z. and L.S., 32200472 to W.L.), CAS Young Team Program for Stable Support of Basic Research (YSBR-077 to S.W.), CAS Interdisciplinary Innovation Team to S.W., CAS Youth Innovation Promotion Association (2020276 to Q.P.), Shanghai Science and Technology Commission Excellent Academic Leaders Program (22XD1424700 to S.W.), the Strategic Priority Research Program of Chinese Academy of Sciences (grant XDC01000000 to F.L.), the National Key Research and Development Project (2018YFC0910403 to S.W. and 2018YFE0201603 to Y.Z. and L.S.), Ministry of Science and Technology of the People’s Republic of China (2015FY111700 to L.J.), Science and Technology Commission of Shanghai Municipality Major Project (2017SHZDZX01 to L.J., S.W., F.L., Y.Z. and L.S.), 111 Project (B13016 to L.J.), CAMS Innovation Fund for Medical Science (2019-I2M-5-066 to L.J. and J.W.), Science and Technology Service Network Initiative of Chinese Academy of Sciences (KFJ-STS-QYZD-2021-08-001 and KFJ-STS-ZDTP-079 to F.L.), Naif Arab University for Security Sciences (NAUSS-23-R18 and NAUSS-23-R19 to F.L.) and China Postdoctoral Science Foundation (2021M693274 and BX2021336 to W.L.). We are grateful to S. Beck from University College London, W. H. Wong from Stanford University and C. Wang from Huazhong University of Science and Technology for helpful discussion, C. Relton and J. Min from the University of Bristol for sharing information about SNPs, CpGs and mQTLs in GoDMC, X. Chen from Taizhou Institute of Health Sciences of Fudan University, Y. Fan from Human Phenome Institute of Fudan University and Y. Hu from CAS for providing materials and samples in this study, and X. Cai and Q. Qian from the University of Chinese Academy of Sciences for helping in data preparation.
Author information
Contributions
S.W., F.L. and A.E.T. designed and drafted the work. Q.P., X.L., W.L., H.J. and A.E.T. performed the statistical analyses and contributed to data interpretation and writing of the paper. J.L., Q.L., C.E.B., G.L. and S.P. contributed to data analysis. X.G. contributed to the program of QTL mapping (fastQTLmapping). J.L., N.Y., J.Q., L.Y. and G.Z. generated the mQTL database. C.Y. and S.D. contributed to preprocessing of methylation and SNP chip data in the discovery cohort. L.J., J.W., J.T. and Z.Y. contributed to the design and acquisition of data in the discovery cohort NSPT. Q.Z., P.J. and C.Z. contributed to the design and acquisition of data in the validation panel CAS. Y.Z., X.L. and L.S. contributed to the design and acquisition of data in the validation panel CGZ. S.G., Y.L., T.N. and B.W. contributed to the design of this work. All the authors revised this work, approved the submitted version, agreed with personal contributions and are responsible for the integrity of the data and the accuracy of the data analysis.
Ethics declarations
Competing interests
X. Lu is an employee of Shenzhen Chipscreen Biosciences. The other authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks Carmen Marsit, Matthew Sudermann and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 mQTLs enrichment for different functional elements.
a,b, Enrichment of mQTLs (a) and mCpGs (b) in six functional elements: CTCF-enriched elements (CTCF), enhancers (E), promoters (P), promoter flanking regions (PF), regulatory elements (RE), and TF binding sites (TFBS). The y-axis indicates the fold changes (see Methods) and the significance from the one-tailed hypergeometric test is denoted by different symbols on each bar, that is, *, P < 0.05; **, P < 0.01; ***, P < 0.001. c, Enrichment of cis-mQTL pairs (left), lcis-mQTL pairs (middle), and trans-mQTL pairs (right) in all combinations of the six functional categories (that is, CTCF-E, P-E, RE-E, and so on). A one-tailed hypergeometric test is applied. The fold changes are labeled within each box. d, Proportion of mQTL SNP-CpG pairs located within the same TAD. e, Comparison of the distance distribution of mQTLs with that of 3D loops.
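As a rough illustration of the enrichment statistics named in this legend, the sketch below computes a fold change and a one-tailed (upper-tail) hypergeometric P value with the Python standard library only. The function name, argument names and the definition of fold change as observed overlap divided by expected overlap are assumptions made for illustration; the study's own definition is given in its Methods and may differ.

```python
from math import comb


def hypergeom_enrichment(total, annotated, selected, overlap):
    """Return (fold_change, p_value) for over-representation.

    total     -- size of the background set (e.g. all tested SNPs)
    annotated -- background items that fall in the functional element
    selected  -- size of the set of interest (e.g. mQTL SNPs)
    overlap   -- items of the set of interest that fall in the element

    Fold change is assumed here to be observed / expected overlap.
    """
    expected = selected * annotated / total
    fold_change = overlap / expected

    # Upper tail P(X >= overlap) when drawing `selected` items
    # without replacement from a background of `total` items.
    upper = min(annotated, selected)
    denom = comb(total, selected)
    tail = sum(
        comb(annotated, k) * comb(total - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return fold_change, tail / denom


if __name__ == "__main__":
    # Toy numbers, not taken from the study.
    fc, p = hypergeom_enrichment(total=20, annotated=5, selected=5, overlap=3)
    print(f"fold change = {fc:.2f}, one-tailed P = {p:.4f}")
```

Exact integer arithmetic keeps the tail sum free of rounding error for small examples; for genome-scale counts a log-space or library implementation would be the more practical choice.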
Extended Data Fig. 2 The cis-colocalization at chr13q14.11 provides epigenetic evidence for the East Asian-specific height-association (rs7335629-height).
a, The East Asian-specific height signal (rs7335629) is in high linkage disequilibrium with three SNPs in the colocalization locus at chr13q14.11. b, rs7335629 has a potential chromatin interaction with one of the CpGs (cg21067652) that colocalized at chr13q14.11. c, Both rs7335629 and cg21067652 are located in regions of high DNase signal in several blood cell lines. d, Two-sample MR result indicates that cg21067652 is a causal factor for ELF1 RNA expression in CAGE (N = 2,765). A two-tailed MR-Egger test is applied. The dots and error bars indicate the beta values and s.e. of the SNP effect on the CpG (x-axis) and on ELF1 expression (y-axis). The blue dotted line indicates the regression line from the MR-Egger test with beta = −9.19, P = 1.34 × 10−4. e, Two-sample MR result indicates that cg21067652 is a causal factor for height in BBJ (N = 165,056). A two-tailed MR-Egger test is applied. The dots and error bars indicate the beta values and s.e. of the SNP effect on the CpG (x-axis) and on body height (y-axis). The blue dotted line indicates the regression line from the MR-Egger test with beta = −0.31, P = 3.73 × 10−9.
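The regression line in panels d and e comes from MR-Egger, which can be viewed as a weighted linear regression of SNP-outcome effects on SNP-exposure effects with a free intercept. The following is a minimal sketch of the point estimates only, assuming inverse-variance weights from the outcome standard errors and the usual orientation of each SNP so that its exposure effect is positive. The function and argument names are hypothetical, and this is not the TwoSampleMR implementation used in the study (standard errors and P values are omitted).

```python
def mr_egger(beta_exposure, beta_outcome, se_outcome):
    """Return (slope, intercept) of an MR-Egger style regression.

    beta_exposure -- per-SNP effects on the exposure (e.g. a CpG)
    beta_outcome  -- per-SNP effects on the outcome (e.g. height)
    se_outcome    -- standard errors of the outcome effects
    """
    # Orient every SNP so that its exposure effect is positive.
    xs, ys = [], []
    for bx, by in zip(beta_exposure, beta_outcome):
        sign = -1.0 if bx < 0 else 1.0
        xs.append(sign * bx)
        ys.append(sign * by)

    # Inverse-variance weights (assumed weighting scheme).
    weights = [1.0 / se ** 2 for se in se_outcome]
    w_sum = sum(weights)
    x_mean = sum(w * x for w, x in zip(weights, xs)) / w_sum
    y_mean = sum(w * y for w, y in zip(weights, ys)) / w_sum

    # Weighted least squares with an intercept.
    sxx = sum(w * (x - x_mean) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - x_mean) * (y - y_mean)
              for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


if __name__ == "__main__":
    # Toy effects, not taken from the study.
    slope, intercept = mr_egger(
        [0.1, -0.2, 0.3], [0.06, -0.11, 0.16], [0.02, 0.02, 0.02]
    )
    print(f"causal estimate = {slope:.3f}, intercept = {intercept:.3f}")
```

The slope plays the role of the causal estimate (the "beta" quoted in the legend), while an intercept far from zero would suggest directional pleiotropy.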
Extended Data Fig. 3 The enrichment of cis- and trans-colocalizations in EA-specific colocalizations and functional states.
a, Enrichment of trans- vs cis-mQTLs amongst EA-specific colocalizations vs others in NSPT (left), and amongst EA-specific vs EAS-EUR shared colocalizations (right). Trans- and cis-colocalization in East Asians is carried out based on mQTLs in NSPT (N = 3,523) and 107 GWASs in BBJ (N = ~170,000). Trans- and cis-colocalization in Europeans is carried out based on mQTLs in GoDMC (N = 27,750) and 107 GWAS traits in UKBB (N = ~500,000) that overlap with traits in BBJ. b, Enrichment results of cis- vs trans-colocalization loci in functional elements. Left, enrichment of cis- and trans-colocalization loci in functional elements; Middle, enrichment of East Asian-specific and EAS-EUR shared cis-colocalization loci in functional elements; Right, enrichment of East Asian-specific and EAS-EUR shared trans-colocalization loci in functional elements. c, Enrichment of cis- and trans-colocalization loci in chromatin states. Left, enrichment of cis- and trans-colocalization loci in chromatin states; Middle, enrichment of East Asian-specific and EAS-EUR shared cis-colocalization signals in chromatin states; Right, enrichment of East Asian-specific and EAS-EUR shared trans-colocalization signals in chromatin states. Two-tailed Fisher’s exact test is applied. Each point with an error bar indicates the log10-scaled odds ratio and its 95% confidence interval.
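The quantities plotted in this figure (a log10-scaled odds ratio, its 95% confidence interval and a two-tailed Fisher's exact P value) can be sketched for a single 2 × 2 table as follows. The legend does not say how the confidence interval was obtained, so the Wald interval used here is an assumption, as are the function names and the toy counts.

```python
from math import comb, log, log10, sqrt


def fisher_two_sided(a, b, c, d):
    """Two-tailed Fisher's exact P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Probability of a table with k in the top-left cell,
        # given fixed row and column totals.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    total = sum(prob(k) for k in range(lo, hi + 1)
                if prob(k) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def log10_odds_ratio_ci(a, b, c, d, z=1.96):
    """Return (log10 OR, lower, upper) using a Wald interval (assumed)."""
    if min(a, b, c, d) == 0:
        raise ValueError("Wald interval is undefined when a cell is zero")
    ln_or = log((a * d) / (b * c))
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    to_log10 = 1 / log(10)
    return (ln_or * to_log10,
            (ln_or - z * se) * to_log10,
            (ln_or + z * se) * to_log10)


if __name__ == "__main__":
    # Toy counts, not taken from the study.
    print(fisher_two_sided(3, 1, 1, 3))
    print(log10_odds_ratio_ci(3, 1, 1, 3))
```

An interval whose log10-scaled bounds both lie above zero corresponds to enrichment, and one lying entirely below zero corresponds to depletion.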
Extended Data Fig. 4 The relation between the trans-colocalization at chr21q22.2 and blood cell traits and immune diseases.
a, The geographic distribution of rs80109907 allele frequencies in different populations (1000 Genomes Phase 3) by the Geography of Genetic Variants (GGV) browser (https://popgen.uchicago.edu/ggv). b, The PheWAS result of rs80109907 (https://gwas.mrcieu.ac.uk/phewas). c, The colocalization result of chr21q22.2 with other blood cell counts and immune-related diseases. SMR test is applied, and the x-axis indicates the beta estimates from the original GWAS while the y-axis shows the -log10(P) of the SMR test. d, Two-sample MR results showing that 39 CpGs are causal for 7 traits (several blood cell counts and immune-related diseases) at FDR < 0.05. MR IVW test is applied. Red and blue squares indicate a positive or negative causal effect of the CpG on the trait, while the size of the square indicates -log10(P) of the MR IVW test.
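For readers unfamiliar with the MR IVW test mentioned in panel d, the sketch below shows the standard fixed-effect inverse-variance-weighted estimator computed from per-SNP summary statistics. The function and argument names are hypothetical and the numbers are toy values; the study itself used the TwoSampleMR R package, whose defaults (for example, random-effects standard errors) may differ from this simplified version.

```python
from math import erfc, sqrt


def mr_ivw(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate; returns (beta, se, two_sided_p).

    beta_exposure -- per-SNP effects on the exposure (e.g. a CpG)
    beta_outcome  -- per-SNP effects on the outcome (e.g. a cell count)
    se_outcome    -- standard errors of the outcome effects
    """
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / se ** 2
              for bx, se in zip(beta_exposure, se_outcome))
    beta = num / den
    se = sqrt(1.0 / den)
    # Two-sided P value from a normal approximation.
    z = beta / se
    p_value = erfc(abs(z) / sqrt(2))
    return beta, se, p_value


if __name__ == "__main__":
    # Toy effects, not taken from the study.
    beta, se, p = mr_ivw([0.1, 0.2, 0.3], [0.05, 0.10, 0.15],
                         [0.01, 0.01, 0.01])
    print(f"IVW beta = {beta:.3f}, s.e. = {se:.4f}, P = {p:.2e}")
```

The sign of the estimate corresponds to the red or blue coloring described in the legend, and -log10 of the P value to the square size.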
Supplementary information
Supplementary Information
Supplementary Protocols, References and Figs. 1–16.
Supplementary Tables 1–23
Supplementary Table 1: The extent of 3.46 million GoDMC mQTLs being also mQTLs in NSPT for different MAF bins. Supplementary Table 2: The extent of 2.65 million NSPT mQTLs being also mQTLs in GoDMC for different MAF bins. Supplementary Table 3: NSPT-only mQTLs (NSPT P < 10−14) defined with different significance thresholds in GoDMC. Supplementary Table 4: The overlapping of population-specific and nonspecific mQTLs with significant SNPs and associations in GWAS Catalog, UKBB and BBJ. Supplementary Table 5: EA-specific mQTLs met with associations and signals in three groups (BBJ-specific, UKBB-specific and shared). Supplementary Table 6: Colocalization results of EA-specific mQTLs and GWAS signals of 230 traits in BBJ. Supplementary Table 7: A total of 144 locus–trait associations (96 loci and 38 traits) identified by cis-colocalization, especially in EAs based on mQTLs in NSPT and traits in BBJ. Supplementary Table 8: A total of 541 locus–trait associations (36 loci and 15 traits) identified by trans-colocalization, especially in EAs based on mQTLs in NSPT and traits in BBJ. Supplementary Table 9: PheWAS result of rs80109907 from public database (https://gwas.mrcieu.ac.uk/phewas/). Supplementary Table 10: Weaker (compared to basophil count) but significant East Asian-specific trans-colocalizations at the locus on chr21q22.2 involving several blood cell counts and immune-related diseases. Supplementary Table 11: A total of 233 CpGs significantly enriched for motifs of 62 TFs (P < 5.3 × 10−5). Supplementary Table 12: Enrichment of 233 CpGs in the binding sites of 13 TFs was validated by blood cell ChIP–seq data. Supplementary Table 13: Cell-lineage-specific mQTLs calculated for random unrelated SNP–CpG pairs. Supplementary Table 14: mQTL hotspots in NSPT. Supplementary Table 15: The 12 (of 16) hotspots index mQTLs or their high LD (r2 > 0.6) trans-mQTLs in GWAS Catalog (P < 5 × 10−8). Supplementary Table 16: mQTL hotspots on each chromosome in NSPT. 
Supplementary Table 17: Annotation of 16 trans-mQTL hotspots. Supplementary Table 18: Overlap of nearby TF/DBPs related cis-eQTL and trans-mQTLs in each trans-mQTL hotspot. Supplementary Table 19: Overlap of nearby TF/DBPs related cis-eQTL in GTEx V7 (P < 0.05) and trans-mQTLs in each trans-mQTL hotspot. Supplementary Table 20: Enrichment of SNPs with significant OpenCausal scores in trans-mQTL hotspots. Supplementary Table 21: A total of 21 putative causal mCpGs (rs4666078 trans-associated) of blood eosinophil count identified by two-sample MR analysis. Supplementary Table 22: Two-sample MR analysis statistics summary of NFKB1 CpGs. Supplementary Table 23: Two-sample MR analysis statistics summary of BMI EWAS CpGs.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Peng, Q., Liu, X., Li, W. et al. Analysis of blood methylation quantitative trait loci in East Asians reveals ancestry-specific impacts on complex traits. Nat Genet 56, 846–860 (2024). https://doi.org/10.1038/s41588-023-01494-9
DOI: https://doi.org/10.1038/s41588-023-01494-9