Abstract
Extensive evidence from genome-wide association studies (GWAS) has shown that jointly analyzing multiple phenotypes can improve the power of association tests compared with the traditional approach of testing a single variant against a single trait. Here we propose an adaptive test based on principal components (ATPC) that is powerful and efficient for detecting the association between a single variant and multiple traits. Our method requires only GWAS summary statistics, which are often available. We first estimate the trait correlation matrix by LD score regression. Then, based on the correlation matrix, we construct a series of test statistics that contain different numbers of principal components. The final test statistic combines the P values of these principal component-based statistics using the aggregated Cauchy association test. A notable feature of the proposed method is that the analytical P value of the test statistic can be computed quickly, without permutation. Extensive simulation studies demonstrate that ATPC controls the type I error rate and is powerful and robust compared with several existing tests across a wide range of simulation settings. The analysis of the lipids GWAS summary data from the Global Lipids Genetics Consortium shows that ATPC identifies 230 new SNPs that are missed by the original single-trait association analysis. A search of the GWAS Catalog shows that some of the SNPs and mapped genes identified by ATPC have been reported to be associated with lipid traits. Further analysis of the GWAS results also identifies some Gene Ontology terms and biological pathways related to lipids.
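The pipeline described above can be illustrated with a minimal Python sketch for a single SNP. It is not the authors' implementation: the exact form of the principal component statistics is not given in the abstract, so the sketch assumes cumulative sums of squared, eigenvalue-standardized principal components of the Z scores, each referred to a chi-square distribution. The function names `acat` and `atpc_sketch`, the equal weights, and the example correlation matrix are all hypothetical, and the trait correlation matrix is taken as given rather than estimated by LD score regression.

```python
import numpy as np
from scipy.stats import chi2


def acat(pvalues, weights=None):
    """Cauchy combination of P values (equal weights unless given)."""
    p = np.clip(np.asarray(pvalues, dtype=float), 1e-300, 1.0)
    if weights is None:
        w = np.full(p.shape, 1.0 / p.size)
    else:
        w = np.asarray(weights, dtype=float)
        w = w / w.sum()
    # tan((0.5 - p) * pi) equals 1 / tan(p * pi); this form is stable for small p.
    t = np.sum(w / np.tan(p * np.pi))
    # arctan2(1, t) equals pi/2 - arctan(t), i.e. the Cauchy upper tail times pi.
    return float(np.arctan2(1.0, t) / np.pi)


def atpc_sketch(z, R):
    """Assumed form of a PC-based adaptive test for one SNP.

    z : Z scores of the SNP for K traits (from GWAS summary statistics).
    R : K x K trait correlation matrix, assumed positive definite.
    """
    z = np.asarray(z, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(np.asarray(R, dtype=float))
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Under the null, z ~ N(0, R), so each standardized squared PC is chi-square(1).
    pcs_sq = (eigvecs.T @ z) ** 2 / eigvals
    stats = np.cumsum(pcs_sq)                  # statistic using the first k PCs
    dfs = np.arange(1, z.size + 1)
    pvals = chi2.sf(stats, dfs)                # analytical P value per statistic
    return acat(pvals), pvals


# Hypothetical example with two traits.
R_example = np.array([[1.0, 0.3], [0.3, 1.0]])
p_combined, p_per_k = atpc_sketch([5.0, 5.0], R_example)
print(p_combined, p_per_k)
```

Because every step is a closed-form calculation (an eigendecomposition, chi-square tail probabilities, and the Cauchy combination), no permutation is needed, which is the property the abstract highlights.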
Acknowledgements
This research was supported by the National Natural Science Foundation of China (Grant Nos. 12071114, 61873087).
Author information
Contributions
Qianran Wei wrote the main manuscript text; Lili Chen conceived and revised all portions of the paper; Yajing Zhou analyzed the data; Huiyi Wang drew all of the diagrams.
Ethics declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Wei, Q., Chen, L., Zhou, Y. et al. An adaptive test based on principal components for detecting multiple phenotype associations using GWAS summary data. Genetica 151, 97–104 (2023). https://doi.org/10.1007/s10709-023-00179-9
DOI: https://doi.org/10.1007/s10709-023-00179-9