Main

Major depression (MD) is one of the most pressing global health challenges1. While genome-wide association studies (GWAS) have shown promise of uncovering biological mechanisms underlying the development of MD2,3, they have revealed a highly polygenic genetic architecture, characterized by variants that individually confer small risk increases4, probably due to the heterogeneity of MD symptoms and etiology5. Previous genetic research explored the impact of different outcome definitions2,6, sex7,8,9 and trauma exposure10,11,12 on heterogeneity. However, the role of ancestry and ethnicity in the genetics of MD has not yet been systematically evaluated.

So far, GWAS of MD were mostly conducted in individuals of European ancestry2,13,14,15. The largest MD GWAS combined data from several studies and identified 223 independent significant single-nucleotide polymorphisms (SNPs)15. That study also included data from 59,600 African Americans from the Million Veteran Program (MVP) cohort. In their bi-ancestral meta-analysis, the number of significant SNPs increased to 233. Other MD GWASs were conducted in African American and Hispanic/Latin American participants with limited sample sizes, and did not find variants with statistically significant associations with MD16,17.

With 10,640 female Chinese participants, the CONVERGE study is the largest MD GWAS conducted outside ‘Western’ countries so far3. The study identified two genome-wide significant associations linked to mitochondrial biology and reported a genetic correlation of 0.33 with MD in European ancestry samples18. In line with this, our recent work demonstrated that some of the previously identified loci from GWAS conducted in samples of European ancestry are not transferable to samples of East Asian ancestry19.

Heterogeneity in genetic effects could impact on findings when evaluating causal effects of risk factors for MD. Previous studies in samples of European ancestry reported genetic correlations and causal relationships between MD and cardiometabolic outcomes2,13,15,20. Notably, our previous study indicated a contradicting direction for associations between MD and body mass index (BMI) in East Asian individuals and European ancestry individuals (positive causal effect of BMI)19,21. Thus, investigating causal relationships using Mendelian randomization (MR) in diverse ancestry groups and in different disease subtypes is important to ensure generalizability and to distinguish between biological and societal mechanisms underlying the relationship between a risk factor and the disease.

Increasing diversity in genetic research is also important to ensure equitable health benefits22. In the United States, differences in presentation of MD across ethnic groups can impact on the likelihood of diagnosis23. Genetics optimized for European ancestry participants would primarily benefit that group of patients and could therefore further widen the disparities in diagnosis and treatment between groups.

In this Article, we used data from samples with diverse ancestries and carried out genome-wide association meta-analyses, followed by fine mapping and prioritization of target genes (Fig. 1). We assessed the transferability of genetic loci across ancestry groups. Finally, we explored bi-directional causal links between MD and cardiometabolic traits.

Fig. 1: Schematic diagram of the analyses in this study.
figure 1

We included data from 21 cohorts with diverse ancestry. We assigned individuals into ancestry/ethnic groups and carried out association analyses with MD for each. Subsequently, we meta-analyzed the results by ancestry/ethnic group. We tested whether previously reported MD loci from European ancestry studies are transferable to these groups. We also used the results for discovery of novel depression associations and MR to assess the causal effects of cardiometabolic traits by ancestry. We subsequently merged all ancestry/ethnicity-specific results in a multi-ancestry meta-analysis that also included samples with European ancestry. The multi-ancestry meta-analysis results formed the basis for locus discovery, fine mapping to identify causal variants and several gene prioritization approaches to identify genes linked to MD risk. ST.(n) refers to the corresponding Supplementary Table. ST.2* (in green) refers to Supplementary Table 2, showing genomic inflation estimates of multiple analyses.

Results

GWAS in African, East Asian and South Asian ancestry and Hispanic/Latin American samples

We first conducted GWAS meta-analyses stratified by ancestry/ethnic group. Individuals were assigned to ancestry groups (African, South Asian, East Asian or European) using principal component analyses based on genetic relatedness matrices. Assignment to the Hispanic/Latin American group was based on self-report or on recruitment in a Latin American country (Supplementary Figs. 17)24,25,26. We acknowledge the arbitrary nature of this approach and of choosing reference groups and cut-offs to assign participants. However, creating such groups enabled us to look for associations that are specific to groups and to assess the transferability of previously identified loci. The studies included in the meta-analyses used the following measures to define MD: structured clinical interviews, medical healthcare records, symptoms questionnaires and self-completed surveys (Supplementary Table 1 and Supplementary Note).

The analyses included 36,818 MD cases and 161,679 controls of African ancestry, 21,980 cases and 360,956 controls of East Asian ancestry, 4,505 cases and 27,176 controls of South Asian ancestry, and 25,013 cases and 352,946 controls in the Hispanic/Latin American group (Extended Data Fig. 1 and Supplementary Figs. 811). To account for the minor inflation found in the Hispanic/Latin American samples (λ1,000 = 1.002 and linkage disequilibrium score regression (LDSC) intercept 1.051; Supplementary Table 2), we corrected test statistics for this analysis based on the LDSC intercept.

In the Hispanic/Latin American group, the G-allele of rs78349146 at 2q24.2 was associated with increased risk of MD (effect allele frequency (EAF) 0.04, β (regression coefficient) = 0.15, s.e.m. 0.03, P = 9.3 × 10−9) (Supplementary Fig. 12). To test the role of these loci in molecular profiles, we performed colocalization for depression and multi-ancestry brain expression quantitative trait loci (eQTLs)27. Loci with posterior probability (PP) >90% for both traits being associated and sharing two different but linked variants (hypothesis (H)3) or a single causal variant (H4) were considered as colocalized. We observed significant colocalization for DPP4, RBMS1 and TANK. We tested ancestry-specific eQTLs from blood and observed RBMS1 (H3: PP (Hispanic/Latin American) 99.12%) and TANK (H3: PP (European) 97.85%; H3: PP (Hispanic/Latin American) 99.61%) at the 2q24.2 locus. For the protein quantitative trait loci (pQTLs) from blood, we either did not find the genes in the cohort or the number of SNPs within the gene was too low (<20) to test for colocalization (Supplementary Table 3).

No variants were associated at genome-wide significance in the GWAS in samples of African, East and South Asian ancestry (Extended Data Fig. 1a,b,d). One locus was suggestively associated in the African ancestry GWAS (Extended Data Fig. 1a and Supplementary Fig. 12). The lead variant, rs6902879 (effect allele: A, EAF 0.16, β = −0.08, s.e.m. 0.01, P = 5.3 × 10−8) at 6q16 is located upstream of the melanin-concentrating hormone receptor 2 gene (MCHR2) and associated with increased expression of MCHR2 in cortex based on genotype–tissue expression (GTEx v8) (P = 6.0 × 10−6). Testing the multi-ancestry brain eQTLs27, we observed significant colocalization for GRIK2 and ASCC3, with significant ancestry differences for ASCC3 (H3: PP (European) 99.97%). MCHR2 was not present in the RNA data.

Although the lead variants at 2q24.2 and at 6q16 did not display strong evidence of association in a large published GWAS in participants of European ancestry14 (P > 0.01), in each case there was an uncorrelated variant within 500 kb of the lead variants associated at P < 10−6 (Supplementary Fig. 12). Hence, although the evidence does not support a shared causal variant, we cannot rule out that there is an association at the same locus, but possibly with a different causal variant in European ancestry participants.

As a sensitivity analysis, we conducted a meta-analysis for each ancestry/ethnic group for clinical depression, comprising studies in which MD was ascertained by structured clinical interviews or medical healthcare records following the International Classification of Diseases (ICD9)/10 or the Diagnostic and Statistical Manual of Mental Disorders (DSM)-IV/5 criteria for major depressive disorder (Supplementary Table 1). There were 29,389 cases and 49,999 controls of African ancestry, 7,886 cases and 14,412 controls of East Asian ancestry, 848 cases and 13,908 controls in the Hispanic/Latin American group, and 4,252 cases and 26,738 controls of South Asian ancestry (Extended Data Fig. 2 and Supplementary Figs. 1316). In the South Asian ancestry GWAS, the A allele of rs7749931 at 6q15 was associated with decreased risk of MD (effect allele: A, EAF 0.49, β = −0.15, s.e.m. 0.03, P = 4.3 × 10−8) (Extended Data Fig. 2d and Supplementary Fig. 15). The variant is located downstream of STX7 (syntaxin 7). We did not observe genome-wide significant loci associated with clinical depression in any other ancestry group.

Transferability of MD associations across ancestry groups

Previous GWAS in samples of European ancestry have identified 206 loci associated with MD (Supplementary Table 4)13,14,15. The results for 196 of these loci were available in at least one of the ancestry/ethnic groups. We assessed whether these genetic associations are shared across different ancestry groups. Individual loci may be underpowered to demonstrate an association; therefore, we followed an approach we recently developed28 and first estimated the number of loci we expect to see an association for when accounting for sample size (n), linkage disequilibrium (LD) and minor allele frequency (MAF). This estimate varied widely between ancestry groups, for example, we expected to detect significant associations for 65% of MD loci in the GWAS with samples of African ancestry, but only for 15% of MD loci in samples of South Asian ancestry (Fig. 2). We report the power-adjusted transferability (PAT) ratio, that is, the observed number divided by the expected number of loci. Transferability was low, with PAT ratios of 0.27 (95% confidence interval (CI) 0.19 to 0.35) in African ancestry samples, and 0.29 in both East Asian (95% CI 0.20 to 0.39) and South Asian (95% CI 0.12 to 0.46) ancestry samples. In the Hispanic/Latin American group, the PAT ratio was 0.63 (95% CI 0.55 to 0.72), notably higher than in the other groups. PAT estimates for clinical MD were close to those for broad MD, with overlapping CIs in each case (Fig. 2). We were unable to estimate PAT ratios for clinical MD in the Hispanic/Latin American group because of insufficient numbers of cases based on this definition. We also assessed the transferability of 102 loci identified in the Psychiatric Genomics Consortium–Major Depressive Disorder Working Group’s (PGC-MDD) GWAS13 and in an independent study in samples of European ancestry, the Australian Genetics of Depression Study (AGDS)14. The PAT ratio was 1.48, considerably higher than the cross-ancestry PAT estimates. We report evidence of transferability of individual loci (Supplementary Table 5) as well as their ancestry-specific eQTL and pQTL colocalization (Supplementary Table 6 and Supplementary Fig. 17).

Fig. 2: Transferability of previously reported loci from European ancestry discovery GWAS of MD to other ancestry groups.
figure 2

a, A Venn diagram showing the numbers of previously identified loci from European ancestry studies with evidence of transferability to the other ancestry/ethnic groups: African, Hispanic/Latin American, South Asian and East Asian (in black) and their intersections (in cyan). Only the 112 loci with evidence of transferability to at least one ancestry group are shown here. b, A plot showing power-adjusted transferability (PAT) ratios. We first calculated the observed number of transferable loci out of the 195, 196, 179 and 180 loci that were present in the African, Hispanic/Latin American, South Asian and East Asian ancestries, respectively. These were divided by the expected number of transferable loci (numbers displayed underneath the figure), taking effect estimates from previous European ancestry studies, and allele frequency and sample size information from our African, Hispanic/Latin American, South Asian and East Asian ancestry cohorts. The ratios are presented separately for broadly defined MD and clinically ascertained MD. The error bars indicate 95% CIs for PAT ratios. We were unable to compute results for clinical MD in the Hispanic/Latin American group because of insufficient numbers of cases.

In addition, we estimated trans-ancestry genetic correlations using POPCORN version 1.0 (ref. 29). We only present genetic correlation estimates where the s.e.m. was less than 0.3. The sample size for the South Asian ancestry group was too small to conduct this analysis. The genetic correlations for MD between the European and the Hispanic/Latin American, African and East Asian ancestry groups were ≥0.75. The lowest estimate was observed between the East Asian ancestry, and the Hispanic/Latin American group (rg = 0.52) (Fig. 3).

Fig. 3: Genetic correlations for MD between different ancestry groups.
figure 3

A plot showing the genome-wide genetic correlations between the African, European, East Asian and Hispanic/Latin American groups. The intensity of the coloring reflects the strength of the correlation. The estimated coefficients and standard errors are also shown in each cell. We only present estimates where the s.e.m. was smaller than 0.3; otherwise, the field is colored in gray.

Multi-ancestry meta-analysis

We carried out a multi-ancestry meta-analysis using data from studies conducted in participants of African, East Asian and South Asian ancestry and Hispanic/Latin American samples (Supplementary Note), and combined them with previously published data for 258,364 cases and 571,252 controls of European ancestry13,14, yielding a total sample size of 345,389 cases and 1,469,702 controls. These analyses provided results for 22,941,580 SNPs after quality control. There was no evidence of residual population stratification (λ1,000 = 1.001, LDSC intercept 1.019; Supplementary Table 2). We identified 190 independent genome-wide significant SNPs mapping to 169 loci that were separated from each other by at least 500 kb (Extended Data Fig. 3, Supplementary Table 7 and Supplementary Fig. 18). Fifty-three of the SNPs represent novel associations (r2 < 0.1 and located more than ±250 kb from previously reported variants). Most of the 196 previously reported loci were associated at genome-wide significance in the multi-ancestry meta-analysis, which incorporates the discovery data for these loci (Supplementary Table 4).

As a sensitivity analysis, we also conducted a multi-ancestry meta-analysis for clinical depression. There were 57,714 cases and 110,358 controls of European ancestry under the clinical definition of MD, which were subsequently combined with the aforementioned non-European clinically ascertained studies by meta-analysis (100,089 cases and 214,415 controls in total) (Extended Data Fig. 4 and Supplementary Figs. 19 and 20). This analysis identified seven genome-wide significant loci, two of which were novel (rs2085224 at 3p22.3 and rs78676209 at 5p12) (Supplementary Table 8).

We then excluded cohorts that had an extreme case–control ratio (ncases/ncontrols <0.25) and did not adjust for this analytically, as well as cohorts with adolescent participants. This sensitivity analysis also yielded consistent results for the 190 lead SNPs (Supplementary Fig. 21).

Finally, we re-analyzed the data using a multi-ancestry meta-analysis approach implemented in MR-MEGA, which resulted in 44 independent regions associated with MD after lambda GC correction, some of which had been missed in the main analyses due to their between-ancestry heterogeneity (Supplementary Table 9).

Multi-ancestry fine mapping

We used a multi-ancestry Bayesian fine-mapping method30 to derive 99% credible sets for 155 loci that were associated at genome-wide significance and did not show evidence of multiple independent signals. For comparison, we also implemented single ancestry fine mapping of the same loci based on GWAS conducted in participants of European ancestry, including PGC-MDD and AGDS13,14.

Multi-ancestry fine mapping increased fine-mapping resolution substantially as compared with fine mapping solely based on the data from European ancestry participants. The median size of the 99% credible sets was reduced from 65.5 to 30 variants. Among the 145 loci for which we conducted fine mapping on both sample sets, 113 (77.9%) loci had a smaller 99% credible set from the multi-ancestry fine mapping, while four loci (0 from the European fine-mapping) were resolved to single putatively causal SNPs (Fig. 4 and Supplementary Table 10). For example, rs12699323, annotated as an intronic variant, is linked to expression of TMEM106B (transmembrane protein 106B). rs1806152 is a splice region variant associated with expression of the nearby gene PAUPAR (PAX6 upstream antisense RNA) on chromosome 11. At another locus, rs9564313 has been linked to expression of PCDH9 (protocadherin-9), a gene that is also highlighted in our TWAS and multi-marker analysis of genomic annotation (MAGMA) results31,32.

Fig. 4: Resolution of the locus fine mapping based on the multi-ancestry and the European ancestry GWAS, showing the size of the credible sets for 155 significant loci.
figure 4

a, A box plot showing the median (central line) and interquartile range (upper and lower hinges) of the sizes of the credible set for fine-mapped loci. The whiskers extend to 1.5 times the interquartile range. Data points falling outside that range are denoted by individual dots in the figure. b, Stacked bar charts showing the number of loci within size categories for credible sets.

TWAS and gene prioritization

To better understand the biological mechanisms of our GWAS findings, we performed several in silico analyses to functionally annotate and prioritize the most likely causal genes. We carried out a transcriptome-wide association study (TWAS) based on the results from the multi-ancestry meta-analysis for expression in tissues relevant to MD33. We combined the TWAS results with functional mapping and annotation (FUMA), conventional MAGMA and HiC-MAGMA to prioritize target genes.

The TWAS identified 354 significant associations (P < 1.37 × 10−6) with MD, 205 of which had not been previously reported (Fig. 5 and Supplementary Table 11). The two most significant gene associations with MD were RPL31P12 (GTEx brain cerebellum, Z = −10.68, P = 1.27 × 10−26) and NEGR1 (GTEx brain caudate basal ganglia, Z = 10.677, P = 1.30 × 10−26), consistent with previous findings33.

Fig. 5: Manhattan-style Z-score plot of gene associations with MD in a TWAS based on the GWAS summary statistics for broadly defined MD.
figure 5

Significant gene associations are shown as red dots (354 significant genes, 205 of them novel), and the 50 most significant gene names are highlighted on both sides of the plot. Novel associations are shown in black, while genes previously associated with MD are shown in gray. The red lines indicate the significance threshold (P < 1.37 × 10−6). For genes on the top part of the graph, increased expression was associated with increased depression risk, while expression of the genes on the bottom part of the plot showed an inverse association. NT, novel transcript.

PCDH8P1 (GTEx brain anterior cingulate cortex BA24, Z = −8.3679, P = 5.86 × 10−17) was the most significant novel TWAS result. NDUFAF3 was another novel gene association with MD (GTEx brain nucleus accumbens basal ganglia, Z = −5.0785, P = 3.80 × 10−7, best GWAS ID rs7617480, best GWAS P = 0.00001). These results were also confirmed by HiC-MAGMA. The protein NDUFAF3 encodes is targeted by metformin, the first-line drug for treating type 2 diabetes.

Forty-three genes displayed evidence of association across all four gene prioritization methods (TWAS, FUMA, MAGMA and HiC-MAGMA) and were classified as high-confidence genes (Table 1 and Supplementary Tables 1115). These included genes repeatedly highlighted in previous studies due to their strong evidence of association and biological relevance in MD: NEGR1, DRD2, CELF4, LRFN5, TMEM161B and TMEM106B. Cadherin-9 (CDH9) and protocadherins (PCDHA1, PCDHA2 and PCDHA3) were also among the high-confidence genes (Supplementary Table 12). Finally, 25 of the high-confidence genes encode targets of established drugs, such as simvastatin (RHOA). These may indicate opportunities for drug repurposing.

Table 1 Genes associated with MD

MR

We assessed bi-directional causal relationships between MD genetic liability and cardiometabolic traits using ancestry-specific two-sample MR analyses. Our results indicated a positive, bi-directional relationship between MD genetic liability and BMI (MD− > BMI: β = 0.092, 95% CI 0.024 to 0.161, P = 8.12 × 10−3, BMI− > MD: β = 0.138, 95% CI 0.097 to 0.180, P = 6.88 × 10−11) (Fig. 6 and Supplementary Table 16). This bi-directional relationship was exclusively observed in samples of European ancestry (P > 0.1 in all other groups). MD genetic liability was also causal for other indicators of unfavorable metabolic profiles in samples of European ancestry: triglycerides (TGs, positive effect; β = 0.116, 95% CI 0.070 to 0.162, P = 7.93 × 10−7), high-density lipoproteins (HDLs, negative effect; β = −0.058, 95% CI −0.111 to −0.006, P = 0.029) and low-density lipoproteins (LDLs, positive effect; β = 0.054, 95% CI 0.012 to 0.096, P = 0.011). The effects remained significant after removing the variants contributing to the possible heterogeneity bias observed through the MR–pleiotropy residual sum and outlier global test. Additionally, no pleiotropy was observed (Supplementary Table 16). In samples of East Asian ancestry, on the other hand, we found a negative causal association between TG and MD (β = −0.127, 95% CI −0.223 to −0.032, P = 9.22 × 10−3). Moreover, MD genetic liability showed a positive causal association with systolic blood pressure (SBP, β = 0.034, 95% CI 0.009 to 0.059, P = 7.66 × 10−3). In samples of African ancestry, SBP had a positive causal association with MD (β = 0.080, 95% CI 0.026 to 0.133, P = 3.43 × 10−3).

Fig. 6: Bi-directional MR tests between MD and cardiometabolic outcomes.
figure 6

The data are presented with a β and a 95% CI. Nominally significant associations are marked with a red asterisk. Statistics have been derived using the β and standard errors for the number of variants used as IVs in each analysis, shown as N SNPs. Results are not shown for diastolic blood pressure for which there were no significant associations. *P < 0.05 (P values in order from top to bottom: 6.88 × 10−11, 8.22 × 10−3, 9.22 ×10−3, 7.93 × 10−7, 7.67 × 10−3, 3.43 ×’10−3, 0.03 and 0.01). More details can be found in Supplementary Table 16. AFR, African ancestry; EAS, East Asian ancestry; EUR, European ancestry; HIS, Hispanic/Latin American group; SAS, South Asian ancestry.

Discussion

We present the first large-scale GWAS of MD in an ancestrally diverse sample, including data from almost 1 million participants of African, East Asian and South Asian ancestry, and Hispanic/Latin American samples. The largest previous report included 26,000 cases of African ancestry15.

By aggregating data in ancestry-specific meta-analyses, we identified two novel loci, 2q24.2 and 6p15. In the Hispanic/Latin American group, variants at 2q24.2 were associated with MD. Most of the cases in this group were defined using symptoms questionnaires. Future studies will be required to assess whether the association of this loci with clinical MD is consistent with our estimate. While the additional association at 6q16 in the GWAS in samples of African ancestry requires further confirmation in future studies, the link with MD is biologically plausible. The lead variant was significantly associated with the expression of MCHR2 specifically in brain cortex tissue. Melanin-concentrating hormone (MCH) is a neuropeptide that is expressed in the central and peripheral nervous systems. It acts as a neurotransmitter and neuromodulator in a broad array of neuronal functions directed toward the regulation of goal-directed behavior, such as food intake, and general arousal34.

The diversity, in combination with the large sample size, enabled a comparison of the causal genetic architecture across ancestry groups. We assessed to what extent the 206 previously identified loci from large European ancestry discovery GWAS were transferable to other ancestry groups. Differences in allele frequencies, linkage disequilibrium and variable sample sizes impact on power to observe associations for each group. We recently developed PAT ratios, an approach to account for all these factors by comparing observed transferable loci with what is expected for a study of a given ancestry and sample size28. The PAT ratios were about 30% for African, South Asian and East Asian ancestry, remarkably similar and consistently low. We previously computed PAT ratios for several other traits and found variation between traits, but the estimates for MD were at the bottom19,28. With a PAT ratio of 64%, the transferability of MD loci discovered in European ancestry samples was much higher for the Hispanic/Latin American group. This finding may reflect that the Hispanic/Latin American group contained many participants with a high proportion of European ancestry35,36. The majority of cases within this group were defined via symptom questionnaires rather than clinical MD. Hence, it may be possible that the transferability for clinical MD is even higher in this group. For African, South Asian and East Asian ancestry, the PAT ratios for clinical MD were all below 0.5 and consistent with the estimates from the main analysis, demonstrating that heterogeneity in outcome definitions does not explain the limited transferability of MD loci across ancestry groups.

To better understand mechanisms underlying individual differences in vulnerability to development of MD, we need to bridge the gap from locus discovery to the identification of target genes. Our study achieved substantial progress in this respect. Fine mapping benefitted from the additional diverse samples, with median credible sets reduced from 65.6 to 35 in size and with 32 loci resolved to ≤10 putatively causal SNPs (11 loci from the European ancestry fine mapping).

On the TWAS, the expression of 354 genes was significantly associated with MD. Out of these, 205 gene associations were novel, and 89 were overlapping with results of the largest previously published MD TWAS15. Furthermore, 80 genes were overlapping with associations from another, previously published, large MD TWAS with largely overlapping samples of European ancestry33. A number of these TWAS features, including NEGR1, ESR2 and TMEM106B, were previously also fine mapped and highlighted as putatively causal in previous post-TWAS analyses, strengthening the role of TWAS as an important tool to better understand the relationship between gene expression and MD.

Through TWAS and three other tools that incorporate the growing body of knowledge about functional annotations of the genome, we classified 43 genes as ‘high confidence’. The definition admittedly remains arbitrary until the field establishes clear guidelines. Nevertheless, the high-confidence list represents an evidence-based starting point for further follow-up. It provides confirmation for several genes that have repeatedly been highlighted as being near a GWAS-associated variant and having high biological plausibility2,13,14,15,33: NEGR1, DRD2, CELF4, LRFN5, TMEM161B and TMEM106B.

Furthermore, cadherin-9 (CDH9) and protocadherins (PCDHA1, PCDHA2 and PCDHA3) were classified as high-confidence genes. Cadherins are transmembrane proteins, mediating adhesion between cells and tissues in organisms37. In previous studies, cadherins have been linked with MD and with other disorders involving the brain, including late-onset Alzheimer’s disease, which often manifests as neuropsychiatric symptoms coupled with depression and anxiety13,38,39,40. The results of our study strengthen the evidence for the involvement of cadherins and protocadherins in the etiology of MD.

Genes newly implicated in MD development in our study highlight novel pathways, pinpoint potential new drug targets and suggest opportunities for drug repurposing. NDUFAF3 encodes mitochondrial complex I assembly protein, which is the main target of the drug metformin41, the first-line drug for treating type 2 diabetes. Research in model organisms has provided a tentative link between metformin and a reduction in depression and anxiety42. Furthermore, a recent study using more than 360,000 samples from the United Kingdom Biobank (UKB) found associations between NDUFAF3 and mood instability, suggesting that energy dysregulation may play an important role in the physiology of mood instability43.

Previous MR studies conducted in populations of European ancestry suggested a causal relationship of higher BMI increasing the odds of depression44,45,46. To our knowledge, evidence of a reverse causal association (that is, MD genetic liability increases the odds of higher BMI) has not been previously reported2. We also observed that the genetic liability to MD was associated with higher TG levels, lower HDL cholesterol and higher LDL cholesterol levels in individuals of European ancestry, which were not significant in the only previous MR study of smaller statistical power47. Individuals with depression present higher levels of inflammation and are at increased risk of cardiometabolic disorders, irrespective of the age of onset48. The phenotypic associations between MD and cardiometabolic traits may partly reflect the genetic overlap between them49. However, in other ancestry groups, no significant relationship between BMI and MD was observed. Our MR analyses showed an effect of reduced TGs on increasing odds of MD in participants of East Asian ancestry. Therefore, we provide further evidence for an opposite direction of effect for the relationship between MD and metabolic traits in European and East Asian ancestry groups19,21. Instead of generalizing findings about depression risk factors across populations, further studies are needed to understand how genetic and environmental factors contribute to the complex relationships across diverse ancestry groups.

Our study has limitations. In this study, we assigned individuals into ancestry and ethnic groups. While this enabled important insights (for example, about transferability of MD loci), such categorical assignments are imprecise and some participants with admixed ancestry may still get excluded. In future research, we aim to implement different analytic strategies that are fully inclusive.

The sample size varied greatly across ancestry groups. The smallest group were individuals of South Asian ancestry. Most of the individuals included in our study live in the United States or in the United Kingdom. To characterize MD in global populations, future studies prioritizing primary data collection are needed. To contribute to this, we are currently recruiting MD patients and controls from Pakistan into the DIVERGE study50. However, a concerted global effort to increase diversity in genetics will be necessary to fully address the issue22. This also applies to the lack of other omics data and other functional databases to support downstream analyses for ancestrally diverse GWAS, such as large resources for transcriptomics or proteomics in relevant tissues51,52. This may have impacted on our TWAS results because the RNA sequencing data was predominantly from participants of European ancestry.

Furthermore, statistical power for discovery of genetic associations may be impacted by reduced coverage of genetic variation present in diverse ancestral groups, as well as other factors such as the reliability of outcome assessment across different groups.

Additionally, our bi-directional MR analysis tested the relationships between MD and cardiometabolic traits. When testing MD as the exposure, the results should be interpreted as the effect of MD genetic liability and not as the effect of MD itself.

This study utilized data from several existing cohorts and bioresources to achieve large sample sizes. This necessitated using different outcome definitions, covering self-administered symptom questionnaires, electronic healthcare records and structured clinical interviews. The potential advantages and disadvantages of these approaches have been extensively discussed in previous studies2,6. It is possible that some of the 190 genome-wide significant loci we identified are linked to a more general susceptibility to mental illness instead of being specific to MD. However, given the overlap between different psychiatric disorders53, such findings are nevertheless of value for our understanding of the biology and for the development of new treatments for MD.

In conclusion, in this first large-scale, multi-ancestry GWAS of MD, we demonstrated through transferability analyses that a notable proportion of MD loci are specific to samples of European ancestry. We identified novel, biologically plausible associations that were missed in European ancestry analyses and demonstrated that large, diverse samples can be important for identifying target genes and putative mechanisms. These findings suggest that for MD, a heterogeneous condition with highly complex etiology, increasing ancestral as well as global diversity in genetic studies may be particularly important to ensure discovery of core genes and to inform about transferability of findings across ancestry groups.

Methods

Participating cohorts

For the analyses of the African, East Asian, South Asian and Hispanic/Latin American group, we included data from 21 cohorts (Supplementary Table 1) with ancestrally diverse participants, where measurements were taken from distinct samples. Details including study design, genotyping and imputation methods and quality control for these studies had been described by previous publications (Supplementary Note). All participants provided informed consent. All studies obtained ethical approvals from local ethics review boards. Measures were taken from distinct samples rather than repeat measures from the same individual.

For each study, a principal component analysis was carried out based on the genetic similarity of individuals. Individuals who clustered around a reference group with confirmed ancestry were assigned to that specific group and included in the association analysis, except for the Hispanic/Latin American group, which was based on self-reported ethnicity. Individuals with admixture between the predefined ancestry reference groups were excluded.

We also included two previously published studies of MD, using data from ancestrally European participants, including the PGC-MDD2 (ncases = 246,241 and ncontrols = 558,568)13 and the AGDS (ncases = 12,123 and ncontrols = 12,684) (ref. 14) to conduct a multi-ancestry meta-analysis of MD (Supplementary Table 1). The total sample size of the multi-ancestry meta-analysis was 1,815,091 (ncases = 345,389 and neffhalf = 559,332). Of the participants, 70.1% (effective sample size) were of European ancestry, 8.2% East Asian, 11.8% African and 1.5% were of South Asian ancestry, and 7.9% were Hispanic/Latin American.

We used a range of measures to define depression, including structured clinical interviews, medical care records, symptom questionnaires and self-reported surveys (Supplementary Table 1). The meta-analyses were primarily conducted combining GWASs of all phenotype definitions (that is, broad MD). In addition, meta-analyses for clinical depression and relevant downstream analyses were also conducted. We considered depression ascertained by structured clinical interviews (directly assessing diagnostic criteria based on DSM-IV, DSM-5 or ICD9/10 through interviews or self-report) or medical care records (ICD9 or ICD10 from primary or secondary care units) as clinical depression. Among the GWASs, the Genes and Health study, MVP, the Genetic Epidemiology Research on Aging Study (GERA), BioVU, the Prevention Intervention Research Center First Generation Trial (PIRC), the Mexican Adolescent Mental Health Survey (MAMHS), CONVERGE, the UKB, the Army Study to Assess Risk and Resilience in Service Members (Army-STARRS) and BioMe fulfilled the clinical definition of depression. On the basis of European ancestry data from previous published work of the PGC-MDD group13, all studies fulfilled the clinical definition, except for the UKB and the 23andMe, which were excluded in the analysis of clinical MD.

Study-level genetic association analyses

Throughout the manuscript, all statistical tests were two sided, unless explicitly indicated. We had access to individual-level data for Army STARRS, UKB, Women’s Health Initiative (WHI), Intern Health Study (IHS), GERA, Jackson Heart Study (JHS), Drakenstein Child Health Study and the Detroit Neighborhood Health Study. Data access was granted via our collaborators, the UKB under application ID 51119 and the dbGaP under project ID 18933.

SNP-level associations with depression were assessed through logistic regressions using PLINK version 1.90. The additive per-allele model was employed. Age, sex, principal components and other relevant study-level covariates were included as covariates. Where available, genotypes on chromosome X were coded 0 or 2 in male participants and 0, 1 and 2 in female participants. Data for variants on X were only available for some of the studies (Supplementary Table 1). The effective sample size was 1,763 for African, 58,833 East Asian, 13,099 South Asian ancestry and 79,720 for Hispanic/Latin American. Summary statistics were received from our collaborators for all other studies. Additive-effect logistic regressions were conducted by the 23andMe Inc, Taiwan-MDD study, MVP, BBJ, Rabinowitz, MAMHS, PrOMIS and BioVU. Age, sex, principal components and other relevant study-level covariates were included as covariates.

Mixed-effect models were used in the association analysis for CKB, BioME and Genes and Health with SAIGE (version 0.36.1) (ref. 54). The CONVERGE study initially conducted mixed-effect model GWA tests with Bayesian and logistic regression toolkit–linear mixed model (BOLT-LMM), followed by PLINK logistic regressions to retrieve log odds ratios (ORs). For the CONVERGE study, the logORs and s.e.m. from PLINK were used in our meta-analysis. The HCHS/SOL implemented mixed-effect model GWA tests to adjust for population structure and relatedness with depression as binary outcome16 and was conducted using GENESIS55. The summary statistics from GENESIS were converted into logOR and s.e.m. before meta-analysis. First, the score and its variance were transformed into β and s.e.m. by β = score/variance and s.e.m. = sqrt(variance)/variance. Afterwards, β and s.e.m. were converted into approximate logOR and s.e.m. using β = β/(pi × (1 − pi)) and s.e.m. = s.e.m./(pi × (1 − pi)), where pi is the proportion of cases in analysis56.

We restricted the downstream analysis to variants with imputation accuracy info score of 0.7 or higher and effective allele count (2 × MAF × (1 − MAF) × N × R2) of 50 or higher. For study of small sample size, we required a minor allele frequency of no less than 0.05. The alleles for indels were re-coded as ‘I’ for the longer allele and ‘D’ for the shorter one. Indels of different patterns at the same position were removed.

Meta-analyses

We first implemented inverse variance-weighted (IVW) fixed-effect meta-analyses for GWAS from each ancestry/ethnic group (that is, African ancestry, East Asian ancestry, South Asian ancestry and the Hispanic/Latin American group) using METAL (version 2011-03-25) (ref. 57). The genomic inflation factor λ was calculated for each study and meta-analysis with R package GenABEL version 1.8.0 (ref. 58). Given the dependence of this estimate on sample size, we also calculated λ1,000 (ref. 59) as λ1,000 = 1 + (λ − 1) × (1/ncase + 1/ncontrol) × 500 (ref. 60). The LDSC intercept was also calculated with an ancestry-matched LD reference panel from the Pan UKB reference panel61 for each meta-analysis with LDSC (version 1.0.1) (ref. 62). For meta-analyses with residual inflation (λ > 1.1), test statistics for variants were adjusted by LDSC intercept. Following the meta-analyses by METAL, variants present in less than two studies were filtered out. Statistical tests were generally two sided unless otherwise stated. We also performed a heterogeneity analysis with METAL to assess whether observed effect sizes (or test statistics) are homogeneous across samples.

We combined data from 71 cohorts with diverse ancestry using an IVW fixed-effects meta-analysis in METAL57. λ and λ1,000 were calculated, and were 1.687 and 1.001, respectively. The LDSC intercept was also calculated with the multi-ancestry LD reference panel (Supplementary Note), which was 1.019 (s.e.m. 0.011). We adjusted the test statistics from the multi-ancestry meta-analysis using the LDSC intercept of 1.019. Only variants present in at least two studies were retained for further analysis, yielding a total of 22,941,580 variants. We also calculated the number of cases and the total number of samples for each variant based on the crude sample size and availability of each study.

We used a significance threshold of 5 × 10−8. To identify independent association signals, the GCTA forward selection and backward elimination process (command ‘cojo-slt’) were applied using the summary statistics from the multi-ancestry meta-analysis, with the aforementioned multi ancestry LD reference panel (GCTA version 1.92.0 beta2)63,64. It is possible that the algorithm identifies false positive secondary signals if the LD in the reference set does not match the actual LD in the GWAS data well; therefore, for each independent signal defined by the GCTA algorithm, locus zoom plots were generated for the 250 kb upstream and downstream region. We then inspected each of these plots manually and removed any secondary signals from our list where there was unclear LD separation, that is, some of the variants close to the secondary hit were in LD with the lead variant.

Loci were defined by the flanking genomic interval mapping 250 kb upstream and downstream of each lead SNP. Where lead SNPs were separated by less than 500 kb, the corresponding loci were aggregated as a single locus with multiple independent signals. The lead SNP for each locus was then selected as the SNP with minimum association P value. The analysis for loci identification, along with all other R-related tasks unless otherwise stated, was conducted using R (version 3.4.3) (ref. 65) and figures were produced using the packages ggplot2 (version 3.2.1) (ref. 66), qqman (version 0.1.4) (ref. 67) and ggpubr (version 0.6.0) (ref. 68).

We conducted sensitivity analyses for outcome definitions, case–control ratio and using a different multi-ancestry meta-analysis approach (Supplementary Note).

Fine mapping

We fine mapped all loci with statistically significant associations from the multi-ancestry GWAS using a statistical fine-mapping method for multi-ancestry samples30. Briefly, this method is an extension of a Bayesian fine-mapping approach30,69 that utilizes estimates of the heterogeneity across ancestry groups, such that variants with different effect estimates across populations have a smaller prior probability to be the causal variant.

For each lead variant, we first extracted all nearby variants with r2 > 0.1 as determined by the multi-ancestry LD reference. The multi-ancestry prior for each variant to be causal was calculated from a fixed-effects meta-analysis combining the summary statistics from ancestry-specific meta-analysis for each of the five major ancestry groups. I2 statistics were calculated to estimate the heterogeneity of the effect estimates across ancestry groups. The posterior probability for a variant to be included in the credible set was proportional to its chi-square test statistic and the prior. The 99% credible set for each lead variant was determined by ranking all SNPs (within r2 > 0.1 of the lead variant) according to their posterior probabilities and then including ranked SNPs until their cumulative posterior probabilities reached or exceeded 0.99.

As a comparison, we also conducted a Bayesian fine-mapping analysis based on the summary statistics of the European-ancestry meta-analysis. The same list of independent lead SNPs from the multi-ancestry meta-analysis were used for this fine mapping in the European ancestry data. All nearby SNPs with r2 > 0.1 as determined by the 1,000 Genomes European LD reference panel were included in the fine mapping. The posterior probability was calculated in a similar way, but without the multi-ancestry prior. Similar to the multi-ancestry fine mapping, all SNPs were ranked, and 99% of the credible sets were derived accordingly.

Since our fine mapping was based on meta-analysis summary statistics, heterogeneity of individual studies (for example, due to differences in genotyping array) can influence the fine-mapping calibration and recall. We used a novel summary statistics-based quality control method proposed by Kanai and colleagues (SLALOM) to dissect outliers in association statistics for each fine-mapped locus70. This method calculates test statistics (DENTIST-S) from Z-scores of test variants and the lead variant (the variant of the lowest P value in each locus), and the LD r between test variants and the lead variant in the locus71. Among the 155 fine-mapped loci in our study, there were 134 loci with the largest variant posterior inclusion probability of greater than 0.1. For these 134 loci, r values were calculated for all variants within the 1 Mb region of the lead variant for each locus based on our multi-ancestry LD reference from the UKB data. In line with the criteria used by Kanai and colleagues, variants with DENTIST-S P value smaller than 1 × 10−4 and r2 with the lead variant greater than 0.6 were defined as outliers. Fine-mapped loci were classified as robust if there were no outlying variants.

Colocalization analysis

We performed colocalization between genetic associations with MD and gene expression in brain and blood tissues from samples of European and African ancestry and Hispanic/Latin American participants using coloc R package72. To select genes for testing, we mapped SNPs within a 3 Mb window at 2q24.2 and 6q16.2 using Variant Effect Predictor31, resulting in eight and four genes, respectively. Loci with posterior probability >90% either for both traits are associated and share two different but linked variants (H3 hypothesis) or a single causal variant (H4 hypothesis) were considered as colocalized. The European and African ancestry summary statistics for MD were tested against multi-ancestry brain eQTLs from European and African American samples27. For the Hispanic/Latin American group, we tested gene and protein expression of blood tissue from Multi-Ethnic Study of Atherosclerosis and Trans-omics for Precision Medicine73. For African ancestry, we tested gene expression of blood from GENOA study74 and proteome expression of blood75. For European ancestry, we tested gene expression of blood from eQTLgen76, and proteome expression from blood75. We also carried out ancestry-specific eQTL and pQTL colocalization analyses for previously reported loci that were or were not transferable.

Assessment of transferability of MD-associated loci

We assessed whether published MD-associated loci display evidence of association in the East Asian, South Asian and African ancestry and Hispanic/Latin American samples. Pooling the independent genome-wide significant SNPs from two large GWAS of MD in samples of European ancestry yielded 195 loci13,14,15. The ancestrally diverse groups included in this study had smaller numbers of participants than the European ancestry discovery studies. Also, a given variant may be less frequent in another ancestry group. Therefore, individual lead variants may not display evidence of association because of lack of power. Moreover, in the discovery study, the lead variant is either the causal variant or is strongly correlated with it. However, differences in LD mean that the lead variant may not be correlated in another ancestry group and may therefore not display evidence of association. Our assessment of transferability was therefore based on PAT ratios that aggregate information across loci and account for all three factors, sample size, MAF and differences in LD28.

First, credible sets for each locus were generated. They consisted of lead variant plus all correlated SNPs (r2 ≥ 0.8) within a 50 kb window of the lead variant (based on ancestry matched LD reference panels from the 1,000 Genomes data) and with P < 100 × Plead. A signal was defined as being ‘transferable’ to another ancestry group if at least one variant from the credible set was associated at two-sided \(P < 10^{(\mathrm{log}_{10}0.05) - P_{\mathrm{f}} \times (N - 1)}\) with MD and had consistent direction of effect between the discovery and test study. N is the number of SNPs in the credible set for each locus, and Pf is a penalization factor we derived from empirical estimations. The effective number of independent SNPs was often higher in other ancestry groups due to differences in LD, leading to higher multiple testing burden and higher likelihood of identifying SNPs with a low P value, by chance alone. This inflates the test statistics and was adjusted for by the penalization factor (Pf). To derive the Pf for each ancestry group, we used the summary statistics from a previous GWAS on breast cancer77, in which phenotypes were believed to be uncorrelated with MD. A total of 441 breast cancer significant SNPs were taken from their paper, and linear regressions were conducted for the P values of these SNPs in each of our ancestrally diverse summary statistics for MD on the number of SNPs in credible sets. The coefficient estimates (slope from regressions) were treated as Pf for each ancestry. As a result, Pf were 0.008341, 0.007378, 0.006847 and 0.003147 for samples of African, East Asian, South Asian ancestry and for the Hispanic/Latin American group, respectively.

In the next step, the statistical power to detect an association of a given locus was calculated assuming an additive effect at a type I error rate of 0.05, with effect estimates from the discovery study, and allele frequency and sample size from each of the target datasets from diverse ancestry/ethnic groups. The power estimates were summed up across published loci to give an estimate of the total number of loci expected to be significantly associated. This is the expected number if all loci are transferable and accounts for the statistical power for replication. We calculated the PAT ratio by dividing the observed number of loci by the expected number. In addition, loci were defined as ‘nontransferable‘ if they had sufficient power for identifying an association but did not display evidence of association, that is, if they contained at least one variant in the credible set with >80% power, while none of the variants in the credible set had P < 0.05 and no variant within 50 kb of locus had P < 1 × 10−3 in the target dataset.

For comparison, we also conducted a transferability assessment for a European ancestry look-up study. The 102 significant loci reported by Howard and colleagues13 were evaluated for their transferability in the AGDS study using the aforementioned method.

To assess whether low transferability may be due to heterogeneous outcome definitions, we carried out a sensitivity analysis, where we estimated PAT ratios based only on studies fulfilling the clinical MD definition.

Trans-ancestry genetic correlations

We estimated trans-ancestry genetic correlations using POPCORN version 1.0 (refs. 29,54,78). Pairwise correlations were calculated between each combination of the five ancestry/ethnic groups (that is, African, European, East Asian, South Asian and Hispanic/Latin American) for broad depression and clinical depression separately.

Gene annotation

The summary statistic from the multi-ancestry meta-analysis was first annotated with FUMA79. Both positional mapping and eQTL mapping results were extracted from FUMA. The 1,000 Genomes European samples were employed as the LD reference panel for FUMA gene annotation. Datasets for brain tissue available in FUMA were employed for eQTL gene annotation.

Gene-based association analyses were implemented using Multi-marker Analysis of GenoMic Annotation (MAGMA, version 1.08) (ref. 80) and Hi-C coupled MAGMA (H-MAGMA)81. The aforementioned multiple ancestry LD reference panel from the UKB was used as the LD reference panel. H-MAGMA assigns noncoding SNPs to their cognate genes based on long-range interactions in disease-relevant tissues measured by Hi-C81. We used the adult brain Hi-C annotation file.

Transcriptome-wide association analysis and drug mapping

To perform a TWAS, the FUSION software was used82. SNP weights were downloaded from the FUSION website83 and were derived from multiple external studies, including (1) SNP weights from all available brain tissues, adrenal gland, pituitary gland, thyroid gland and whole blood33 from GTEx v8 (ref. 84) (based on significantly heritable genes and ‘All Samples’ in GTEx v8, which also includes African American and Asian individuals); (2) SNP weights from the CommonMind Consortium, which includes samples from the brain dorsolateral prefrontal cortex; (3) SNP weights from the Young Finns study; and (4) from the Netherlands Twin Register, which provides SNP weights from blood tissues (whole blood and peripheral blood, respectively).

We used the multi-ancestry LD reference panel described above. Variants present in the 1,000 Genomes European population reference panel were retained. A separate TWAS was also performed using a LD reference panel based on the 1,000 Genomes Project’s samples of European ancestry, as a sensitivity analysis.

The transcriptome-wide significance threshold for the TWAS associations in this study was P < 1.37 × 10−6. This threshold was previously derived using a permutation-based procedure, which estimates a significance threshold based on the number of features tested33.

The results were compared with previous TWAS in MD, including the two largest MD TWAS so far2,15,33,85,86. These studies generally used smaller sets of SNP weights (except the study by Dall’Aglio and colleagues, which used similar SNP weights as the current study, but with SNP weights derived from the previous GTEx release, v7). The TWAS Z-score plot was generated using a TWAS-plotter function87.

To assess the relevance of novel genes to drug discovery, genes were searched in three large drug databases: GeneCards88, DrugBank and ChEMBL89,90. In Table 1, a selection of drugs (the ones reported in multiple publications) probably targeting our high-confidence prioritized gene sets are shown for each gene.

MR

We performed a bi-directional two-sample MR analysis using the TwoSampleMR R package (version 0.5.6)91,92 to test possible causal effects between MD and six cardiometabolic traits. We followed the STROBE-MR (strengthening the reporting of observational studies in epidemiology using Mendelian randomisation) guidelines (Supplementary Note). For individuals of European ancestry, the UKB was used to select instruments for BMI, fasting glucose, HDL, LDL, SBP and TGs. SBP summary data were obtained from the UKB for individuals of African and South Asian ancestry and Hispanic/Latin American participants. For samples of African, East Asian and South Asian ancestry and the Hispanic/Latin American group, a meta-analysis was performed using METAL57 with inverse variance weighting using the UKB and the following consortia: GIANT93 for BMI; MAGIC94 for fasting glucose; Global Lipids Genetics Consortium95 for HDL, LDL and TG; and Biobank Japan95,96 for SBP in samples of East Asian ancestry. The genetic associations with quantitative variables were estimated with respect to the scale, units and models defined in the original studies. Heterogeneity analyses were also performed. To avoid sample overlap, the datasets used to define instrumental variables (IV) for the cardiometabolic traits were excluded from the MD genome-wide association statistics used for the MR analyses conducted with respect to each ancestry group.

Genome-wide significance (P = 5 × 10−8) was used as the threshold to select IVs for the exposures. However, if less than ten variants were available, a suggestive threshold (P = 5 × 10−6) was used to select IVs (Supplementary Table 16). We only included IVs that were present in both datasets (exposure and the outcome). We followed the three main IV assumptions for the analysis: (1) relevance: the IV is associated with the risk factor of interest; (2) independence: the IV is not associated with confounders; and (3) exclusion: the IV is only associated with the outcome through the exposure. We used the following criteria for clumping: r2 = 0.001 and a 10,000 kb window. The following information was used in both the exposure and outcome data: SNP ID, effect size, effect allele, other allele, EAF and P value. We used five different MR methods: IVW, MR–Egger, weighted median, simple mode and weighted mode92. The IVW estimates were reported as the main results due to their higher statistical power97 while the other tests were used to assess the consistency of the estimates across different methods. MR–Egger regression intercept and MR heterogeneity tests were conducted as additional sensitivity analyses. In case of significant heterogeneity, the MR–pleiotropy residual sum and outlier global test was used to remove genetic variants based on their contribution to heterogeneity98).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.