Introduction

Schizophrenia is a heritable and heterogeneous disorder that is likely to be affected by environmental factors. The combined evidence from genetic, epidemiological, transcriptomic and proteomic studies has now converged on alterations in metabolic, neurotrophic and, most prominently, immune/inflammatory processes in schizophrenia. The largest genome-wide association study (GWAS) conducted to date has identified 108 schizophrenia-associated genetic loci involved in glutamatergic neurotransmission and synaptic plasticity and, importantly, in immune processes1. Epidemiological and transcriptomic studies have long hinted at a role for immune dysregulation in schizophrenia2,3,4,5. Blood-based protein biomarker studies have also demonstrated changes in immune/inflammatory processes in prodromal and drug-naïve patients, from which candidate diagnostic and prognostic biomarker panels have been reported6. In addition, clinical trials have shown encouraging initial therapeutic effects of add-on anti-inflammatory medication in schizophrenia patients7.

Despite this convergence of evidence, direct links between the genetic and proteomic findings have not yet been established for schizophrenia. While genetics may provide insights into the biological mechanisms underpinning disease susceptibility, proteomics can provide functional molecular evidence linked to disease manifestation. Although GWAS have ‘implicated’ a number of candidate genes, most of the associations they report are to genomic regions (‘loci’) rather than to individual genes. For most of these loci, it is not certain which exact gene is affected8. In addition, because almost all the schizophrenia-associated single nucleotide polymorphisms (SNPs) have been found to be located within non-coding regions of genes, elucidation and interpretation of the biological basis for the genetic associations remain challenging8.

With this in mind, we investigated for the first time associations between 190 serum proteins implicated in several psychiatric disorders9,10,11,12, including schizophrenia6,13,14,15,16, and SNPs located within the genes encoding the measured proteins in 149 schizophrenia patients and 198 matched controls. The SNPs were genotyped using the custom-made PsychArray, which was developed by Illumina in collaboration with the Psychiatric Genomics Consortium (PGC). The array assesses approximately 270,000 tag SNPs, over 250,000 rare and low-frequency exonic variants and approximately 50,000 custom markers selected on the basis of evidence from prior genetic studies of psychiatric illnesses, including schizophrenia, major depressive, bipolar and autism-spectrum disorders17.

Materials and Methods

Clinical samples

A total of 347 individuals were recruited consecutively from the Department of Psychiatry, University of Magdeburg, Germany, comprising 198 controls and 149 schizophrenia patients (109 first-onset antipsychotic drug-naïve and 40 antipsychotic drug-treated). Schizophrenia was diagnosed by psychiatrists using the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV)18. Information on antipsychotic medication use was confirmed by direct contact with the treating family physicians and relatives, along with consultations regarding detailed histories of psychotropic medication use prior to hospitalization. Controls were matched with the patient group for age, gender and body mass index (BMI) (Table 1). They were recruited from a database of blood donors at the Institute of Transfusion Medicine at the University of Magdeburg, and some were students and staff at the University. Exclusion criteria included chronic illnesses such as diabetes, cardiovascular disease, immune and autoimmune disorders, infections, treatment with immunosuppressive or immunomodulating drugs or antibiotics, other neuropsychiatric or neurological disorders (multiple sclerosis, epilepsy, mental retardation), chronic (terminal) diseases affecting the brain (cancer, hepatic and renal insufficiency), alcohol or drug addiction, organic psychosis/organic affective syndromes, severe trauma, and other psychiatric and non-psychiatric co-morbidity. Medication was administered after completion of diagnostic evaluation as appropriate. Informed written consent was given by all participants, and the study protocols, analysis of samples and test methods were approved by the local Institutional Ethics Review Board and were in compliance with the Standards for Reporting of Diagnostic Accuracy19.

Table 1 Demographic characteristics.

Serum sample preparation

Serum sample collection and preparation followed strict standard operating protocols, as described previously13. Briefly, blood samples were collected from all subjects between 8:00 and 12:00 into S-Monovette 7.5 mL serum tubes (Sarstedt; Nümbrecht, Germany). The samples were left to clot at room temperature for 2 hours and then centrifuged at 4000 × g for 5 minutes. The resulting supernatants were stored at −80 °C in Low Binding Eppendorf tubes (Hamburg, Germany).

Multiplex immunoassay analysis and Quality Control

The Multi-Analyte Profiling immunoassay platform (DiscoveryMAP) was used to measure the concentrations of 190 proteins in serum. The proteins measured were mainly involved in immune/inflammatory, endocrine and metabolic processes previously implicated in several neurological/psychiatric disorders9,10,11,12, including schizophrenia6,13,14,15,16, depression20,21 and bipolar disorder22. All assays were conducted in the Clinical Laboratory Improvement Amendments (CLIA)-certified laboratory at Myriad-RBM (Austin, TX, USA; described previously13). All serum samples were stored at −80 °C until analysis. Data were quality control (QC) assessed and pre-processed using R (http://www.R-project.org/)23, as described previously6. Briefly, proteins with greater than 30% missing values were excluded, and values below or above the detection limits were imputed with the minimum and maximum detected values, respectively. Data were log2-transformed to stabilise variance. Sample outliers were examined using principal components analysis (PCA)24 and through inspection of quantile-quantile (Q-Q) plots.
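The protein pre-processing steps described above (exclusion at more than 30% missing values, imputation of out-of-range readings, log2 transformation) can be illustrated with a minimal sketch. The authors performed this step in R; the Python below is an illustration only. The flags `"<LOW>"` and `">HIGH"`, the use of `None` for missing readings, and the function name are hypothetical conventions chosen for this example, not the platform's actual output format.

```python
import math

# Hypothetical flags for readings outside the assay's detection limits.
BELOW, ABOVE = "<LOW>", ">HIGH"

def preprocess_protein(readings, max_missing=0.30):
    """Apply the described QC to one protein's readings across samples.

    readings: list of positive floats, None (missing), or the BELOW/ABOVE flags.
    Returns log2-transformed values (None kept for missing readings),
    or None if the protein is excluded.
    """
    n_missing = sum(r is None for r in readings)
    if n_missing / len(readings) > max_missing:
        return None  # more than 30% missing: exclude the protein
    detected = [r for r in readings if isinstance(r, (int, float))]
    if not detected:
        return None  # nothing measurable to impute from
    lowest, highest = min(detected), max(detected)
    transformed = []
    for r in readings:
        if r is None:
            transformed.append(None)
        elif r == BELOW:
            transformed.append(math.log2(lowest))   # impute with minimum detected value
        elif r == ABOVE:
            transformed.append(math.log2(highest))  # impute with maximum detected value
        else:
            transformed.append(math.log2(r))
    return transformed
```

In this sketch, missing readings are left in place rather than imputed, since the text only specifies imputation for values outside the detection limits.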

Genotyping and Quality Control

Blood DNA samples from all participants were genotyped using the Illumina Infinium PsychArray v1.0 (Illumina Inc, San Diego, California, USA) at the Department of Genomics at the Life and Brain Centre, University of Bonn. Data were quality control (QC) assessed using PLINK v1.07 (ref. 25) and R (http://www.R-project.org/)23, as described previously26. Per-individual QC involved exclusion of (1) samples with over 3% missing genotypes (no samples); (2) samples with an abnormal heterozygosity rate, more than 3 standard deviations from the mean (7 samples); (3) related or duplicated samples (7 samples), identified through identity-by-state and identity-by-descent sharing analysis on a linkage disequilibrium-pruned set of SNPs (for each pair of related or duplicated individuals, the individual with lower genotyping completeness was excluded); and (4) individuals of non-European ancestry (2 samples), identified by combining study genotypes with genotypes from HapMap3 data with the following population codes: CEU (Utah residents with Northern and Western European ancestry), YRI (Yoruba in Ibadan, Nigeria), CHB (Han Chinese in Beijing, China), JPT (Japanese in Tokyo, Japan). Per-SNP QC involved exclusion of SNPs (1) with over 5% missing genotypes (2348); (2) showing excessive deviation from Hardy-Weinberg equilibrium (P < 10^−3) in the control group (800); (3) with significantly different missing genotype rates between cases and controls (P < 10^−5) (0); and (4) with a very low minor allele frequency (MAF) of less than 1% (278912).
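Three of the per-SNP filters above can be sketched for a single SNP as follows. This is an illustrative Python sketch, not the PLINK procedure the authors ran. In particular, it uses a one-degree-of-freedom chi-square approximation for the Hardy-Weinberg test, which is an assumption made for brevity, and it omits the case/control differential missingness filter. Genotypes are assumed to be coded as allele counts (0, 1, 2) with `None` for missing calls; the function names are hypothetical.

```python
import math

def hwe_chi2_pvalue(genotypes):
    """Chi-square (1 d.f.) test of Hardy-Weinberg equilibrium.

    genotypes: list of allele counts (0, 1 or 2), no missing values.
    """
    n = len(genotypes)
    observed = [genotypes.count(k) for k in (0, 1, 2)]
    p = (observed[1] + 2 * observed[2]) / (2 * n)  # frequency of the counted allele
    q = 1 - p
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: the test is not informative
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 d.f.
    return math.erfc(math.sqrt(chi2 / 2))

def snp_qc_failures(genotypes, is_control,
                    max_missing=0.05, min_maf=0.01, hwe_threshold=1e-3):
    """Return the list of filters a SNP fails (empty list means it passes)."""
    failures = []
    called = [g for g in genotypes if g is not None]
    if 1 - len(called) / len(genotypes) > max_missing:
        failures.append("missingness")
    if called:
        freq = sum(called) / (2 * len(called))
        if min(freq, 1 - freq) < min_maf:
            failures.append("maf")
    # Hardy-Weinberg equilibrium is assessed in controls only, as in the text.
    controls = [g for g, c in zip(genotypes, is_control) if c and g is not None]
    if controls and hwe_chi2_pvalue(controls) < hwe_threshold:
        failures.append("hwe")
    return failures
```

Testing Hardy-Weinberg equilibrium in controls only avoids discarding SNPs whose genotype distribution is distorted in cases by a true disease association.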

SNP selection

The genotype data were subjected to gene annotation using the biomaRt package27 in R to identify the SNPs located within genes encoding for the measured proteins. These SNPs were selected for downstream linear regression analysis.
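The selection step amounts to keeping SNPs whose genomic position falls within the boundaries of a gene encoding a measured protein. The authors obtained the annotation through biomaRt in R; the sketch below replaces that with a local coordinate lookup in Python purely for illustration. The identifiers, coordinates and function name are hypothetical placeholders, not real annotation data.

```python
def snps_within_genes(snps, genes):
    """Map each gene to the SNPs located inside its boundaries.

    snps:  dict of SNP id -> (chromosome, position)
    genes: dict of gene symbol -> (chromosome, start, end), inclusive coordinates
    """
    hits = {}
    for snp_id, (chrom, pos) in snps.items():
        for symbol, (gene_chrom, start, end) in genes.items():
            if chrom == gene_chrom and start <= pos <= end:
                hits.setdefault(symbol, []).append(snp_id)
    return hits

# Hypothetical placeholder data (not real coordinates).
example_genes = {"GENE_X": ("1", 1000, 2000), "GENE_Y": ("2", 500, 900)}
example_snps = {
    "snp_a": ("1", 1500),  # inside GENE_X
    "snp_b": ("1", 2500),  # outside any gene
    "snp_c": ("2", 900),   # on the GENE_Y boundary, counted as inside
}
selected = snps_within_genes(example_snps, example_genes)
```

A nested loop is adequate for a few hundred genes; an interval index would be the natural choice at genome scale.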

Statistical analysis

All statistical analyses were performed in R (http://www.R-project.org/)23. Linear regression analyses were carried out to identify associations between SNPs and the corresponding proteins. For each regression model, each SNP was individually included as the predictor variable (continuous variable coded 0, 1, 2, counting the number of major alleles) along with covariates, diagnosis (binary case/control status) and a SNP x Diagnosis interaction term to identify case/control-specific SNP and protein associations. Protein concentration was modelled as the continuous outcome variable. The covariates were age, gender, BMI and antipsychotic medication. Regression diagnostics were examined to ensure that all the model assumptions were met, including checks for residual normality, data linearity, independence and homoscedasticity, and exclusion of high leverage points, outliers and influential values. The false discovery rate was controlled according to Benjamini and Hochberg28. To assess regression model stability and robustness of findings, bootstrap resampling was repeated 1000 times for each test29. Given the exploratory nature of the study, SNP-protein associations were accepted as significant if adjusted P-values were < 0.05 or if P-values were < 0.05 in over 70% of 1000 bootstrap samples. SNP x Diagnosis results were accepted as significant only following multiple testing correction at an adjusted P-value < 0.05.
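The significance rule described above combines Benjamini-Hochberg adjustment with a bootstrap support criterion. A minimal Python sketch of both parts follows; the authors worked in R, so this is an illustration rather than their code, and the function names are hypothetical. The bootstrap P-values are assumed to have been obtained already by refitting the regression on each resample.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def association_is_significant(adjusted_p, bootstrap_pvalues,
                               alpha=0.05, support=0.70):
    """Exploratory rule for SNP-protein main effects.

    Accept if the adjusted P-value is below alpha, or if the raw P-value is
    below alpha in more than `support` of the bootstrap samples.
    """
    n_below = sum(p < alpha for p in bootstrap_pvalues)
    return adjusted_p < alpha or n_below / len(bootstrap_pvalues) > support
```

For the SNP x Diagnosis interaction terms, the text applies the stricter criterion alone, which in this sketch corresponds to checking only `adjusted_p < alpha`.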

Results

The demographic characteristics of the study cohort are summarised in Table 1. The patient and control groups were matched for age, gender and BMI. The mean age and BMI were 36 and 26, respectively, for patients and 37 and 26, respectively, for controls. The percentage of males/females was 57%/43% for patients and 53%/47% for controls.

SNP and protein expression association

In total, 149 of the 190 proteins measured by the multiplex immunoassay platform survived QC. Following genetic data QC, 308,263 of the original 588,454 SNPs were left for analysis and sample size was reduced to 331 (189 controls and 142 schizophrenia patients). Of these, we found that 632 SNPs were located within 128 genes that encode for 132 of the measured proteins (for minor allele frequencies of SNPs, see Supplementary Table 1). This represented 89% SNP coverage for the 149 proteins surviving QC. Linear regression analysis showed that 115 SNPs were associated with 45 proteins (P-value < 0.05) (Table 2 and Fig. 1). Of these, associations between 81 SNPs and 29 proteins survived multiple testing (adjusted P-value < 0.05) and/or bootstrap resampling. These 29 proteins were involved in several biological functions including immune/inflammatory response (14) [Complement Factor H (CFH), Interleukin-6 receptor (IL-6r), Epithelial-Derived Neutrophil-Activating Protein 78 (ENA-78), Fetuin-A, Interleukin-16 (IL-16), Epidermal Growth Factor (EGF), CD5L, Receptor for advanced glycosylation end products (RAGE), Interleukin-18 (IL-18), Chemokine CC-4 (HCC-4), Bone Morphogenetic Protein 6 (BMP-6), Tumor Necrosis Factor alpha (TNF-alpha), Interferon gamma Induced Protein 10 (IP-10), Myeloid Progenitor Inhibitory Factor 1 (MPIF-1)], blood coagulation (3) [Factor VII, Serotransferrin, Matrix Metalloproteinase-1 (MMP-1)], lipid metabolism (2) [Apolipoprotein E (Apo-E), Apolipoprotein(a) (Lpa)], other metabolic processes (3) [Tamm-Horsfall Urinary Glycoprotein (THP), Matrix Metalloproteinase-3 (MMP-3), Glutathione S-Transferase alpha (GST-alpha)], endocrine or growth factor signalling (3) [Adiponectin, Thyroxine-Binding Globulin (TBG), Tenascin-C (TN-C)], vascular regulation (2) [Angiotensin-Converting Enzyme (ACE), Angiotensinogen] and other (2) [Cystatin-C, Sortilin]. For each protein, the directions of association across its SNPs were consistent, except for six proteins: CFH, Lpa, IL-6r, IL-16, Apo-E and MMP-3. This finding suggests that these SNPs may have differential regulatory effects on protein expression.

Table 2 Association of 115 SNPs with expression of 45 proteins.
Figure 1

Polar histogram showing the significant SNP-protein associations. Key: A positive direction of association (β) indicates that a higher major allele copy number is associated with a higher protein level in blood. A negative direction of association (−β) indicates that a higher major allele copy number is associated with a lower protein level in blood.

SNP x Diagnosis interaction effect on protein expression

A total of 21 SNPs showed significant SNP x Diagnosis interactions for 19 proteins, suggesting that the effect of these SNPs on expression of the respective proteins varies with diagnosis (Table 3). Separate analysis stratified by diagnosis showed that seven SNPs were associated with seven proteins in the control group, another seven SNPs showed associations with seven proteins in the schizophrenia group and two SNPs were associated with two proteins in both the control and schizophrenia groups (for the complete list of significant SNP and protein associations stratified by diagnosis, see Supplementary Table 2). Approximately half of these SNP-protein associations survived multiple testing, including rs555212 (Factor VII) (β = −0.3, adj.P = 3.07E-05), rs11846959 (Alpha-1-Antitrypsin; AAT) (β = −0.19, adj.P = 0.004), rs4256246 (IP-10) (β = 0.23, adj.P = 0.038) and rs12829220 (von Willebrand Factor; vWF) (β = 0.75, adj.P = 0.006) in the control group; rs9658644 (Chromogranin-A; CgA) (β = 0.77, adj.P = 0.008), rs2424577 (Cystatin-C) (β = 0.09, adj.P = 0.009) and rs6123 (Vitamin K-Dependent Protein S; VKDPS) (β = 0.09, adj.P = 0.034) in the schizophrenia group; and rs7553796 (IL-6r) in both the control (β = 0.28, adj.P = 9.44E-07) and schizophrenia (β = 0.43, adj.P = 3.33E-10) groups (Table 3, Fig. 2, Supplementary Figure 1). Figure 2 shows that while an increasing number of major alleles of rs7553796 is associated with increasing levels of the IL-6r protein in blood in both groups, schizophrenia patients with two copies of the major allele (homozygous for the major allele) have significantly higher levels of the IL-6r protein compared to controls homozygous for the major allele. A positive β indicates that a higher number of major alleles is associated with a higher protein level in blood. A negative β indicates that a higher number of major alleles is associated with a lower protein level in blood.
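For orientation, the interaction model described in the Statistical analysis section can be written out explicitly. The notation below is introduced here for illustration and is not taken from the original analysis: G is the major allele count (0, 1 or 2), D is diagnosis (assumed to be coded 0 for controls and 1 for patients), and c collects the covariates (age, gender, BMI and antipsychotic medication).

```latex
\log_2(\text{protein}_i) = \beta_0 + \beta_1 G_i + \beta_2 D_i
  + \beta_3 \,(G_i \times D_i) + \boldsymbol{\gamma}^{\top}\mathbf{c}_i + \varepsilon_i
```

Under this coding, the expected change in log2 protein level per additional major allele is β1 in controls and β1 + β3 in patients, so a non-zero β3 corresponds to a SNP effect that differs by diagnosis. The β values reported for each group come from the separate stratified analyses and need not coincide exactly with these implied slopes.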

Table 3 Significant SNP x Diagnosis interaction and results stratified by diagnosis.
Figure 2

Interaction plots showing significant SNP and protein expression associations where SNP x Diagnosis interaction was significant.

Discussion

We investigated the association between SNPs genotyped using the PsychArray and the expression levels of 190 serum proteins in 149 schizophrenia patients and 198 matched controls. The hypothesis was that protein expression levels for a given individual are associated with SNPs in the corresponding gene.

We found that 632 SNPs were located within 128 genes that encode for 132 of the measured serum proteins. Linear regression analysis identified associations between 81 SNPs and 29 proteins that survived corrections for multiple testing and/or bootstrap resampling. Interestingly, nearly half of these proteins (14 of 29) could be associated with immune and inflammatory responses. The remaining proteins were found to be related to a number of pathways, ranging from blood coagulation, metabolism and endocrine signalling to vascular regulation.

As a next step, we investigated the SNP x diagnosis interaction. When the effect of a SNP on protein level differs between patients and controls (e.g., present in one group and absent in the other), this indicates a difference in the biological regulation of the protein level in patients compared to controls. Furthermore, insights can be obtained into whether the involvement of a protein biomarker in disease development involves disease-specific pathways (Supplementary Figure 2). We found eight serum proteins with a significant interaction that also survived multiple testing correction in the analysis stratified by diagnosis. The corresponding SNPs were rs555212 (Factor VII), rs11846959 (AAT), rs4256246 (IP-10) and rs12829220 (vWF) in the control group; rs9658644 (CgA), rs2424577 (Cystatin-C; CST3) and rs6123 (VKDPS) in the schizophrenia group; and rs7553796 (IL-6r) in both the control and schizophrenia groups. Four of these eight proteins (AAT, IP-10, vWF and IL-6r) are involved in immune function/inflammatory processes. This observation aligns with the finding that schizophrenia associations are enriched at enhancers that are active in tissues linked to immune function1. Importantly, all of the implicated proteins have previously been repeatedly reported to be differentially expressed in serum or plasma of schizophrenia patients6,13,14,15,16. Some of these proteins, such as Factor VII, vWF, CgA, AAT and IL-6r, have also been found to be altered in, or to predict development of schizophrenia in, ultra-high risk or pre-onset individuals6,30. However, to our knowledge only SNPs associated with IL-6r, CgA, vWF and AAT have previously been reported in schizophrenia. A strength of our study is that, through genotyping using the PsychArray and profiling of circulating proteins using the DiscoveryMAP platform, we have attempted to demonstrate a functional link between expression of protein biomarkers previously implicated in schizophrenia and schizophrenia-related SNPs located within genes encoding the measured proteins.

The IL-6r gene has been investigated extensively in previous genetic studies, but the results have not been consistent. While a previous genetic association study of IL-6r reported a significant association of the rs2228145 C allele (Ala allele) with schizophrenia31, others failed to find significant differences in allele or genotype distribution between patients and controls32. Another study investigated a different IL-6r polymorphism, the promoter variant rs4845617, but found no significant association with schizophrenia in Taiwan33. Kapelski and colleagues recently reported a significant association of rs2228145 and rs4537545 with schizophrenia34. Rafiq and colleagues found that the minor allele T of rs4537545 accounted for approximately 20% of the variation in circulating IL-6r levels and that individuals homozygous for the minor allele of rs4537545 had a doubling of IL-6r levels compared to the major allele homozygous group35. Our results are in line with this finding. We found that individuals homozygous for the minor allele of rs4537545 and its exome SNP exm-rs4537545 had the highest IL-6r levels, whereas individuals homozygous for the major allele had the lowest (Table 2). However, the rs4537545 x diagnosis interaction was not significant, suggesting that the effect of this SNP on IL-6r expression does not vary with diagnosis (patient or control). Analysis stratified by diagnosis also showed significant associations between these SNPs (rs4537545 and exm-rs4537545) and IL-6r expression in the control and patient groups separately, demonstrating further that these SNPs are associated with IL-6r expression regardless of diagnosis (Supplementary Table 2). In addition, we demonstrated that another SNP located within the IL-6r gene, rs7553796, was associated with increasing levels of the IL-6r protein in blood in both the control and schizophrenia groups (Table 3). A significant rs7553796 x diagnosis interaction and subsequent analysis stratified by diagnosis showed that, while an increasing number of major alleles is associated with increasing IL-6r levels in both groups, schizophrenia patients homozygous for the major allele had significantly higher levels of the IL-6r protein than controls homozygous for the major allele (Fig. 2). This finding suggests the possibility of differential regulation of protein expression in schizophrenia patients based on the major allele copy number of rs7553796.

Another interesting protein that has been studied extensively is CgA. This protein is widely expressed in secretory granules throughout the central nervous system and in endocrine tissue and is co-released with several neurotransmitters36. CgA has calcium-binding and neuromodulatory properties and is a potent microglial activator, resulting in neurotoxicity mediated through the secretion of glutamate, TNF-alpha and nitric oxide, which in turn induces mitochondrial stress and apoptosis36. Changes in the expression of CgA protein in schizophrenia have been reproducibly shown in post-mortem37 frontal cortex and pituitary, cerebrospinal fluid (CSF)38,39 and serum6,37,40. Biochemical studies have also demonstrated a reduction of CgA immunoreactivity in the prefrontal cortex of schizophrenia patients41. Alterations in CSF CgA levels have also been shown in a range of neurodegenerative disorders, including Parkinson’s42 and Alzheimer’s43 disease. Allelic-association studies in Chinese patients have associated single-nucleotide polymorphisms at the chromogranin B (CHGB) locus with schizophrenia44. Genetic linkage studies in Japanese schizophrenia patients have implicated a genomic region near the CHGB locus on chromosome 20 (ref. 45). Subsequent Japanese studies have reported significant associations between schizophrenia and the CHGB gene, which belongs to the same family as CHGA46. More recently, studies showing associations of rs9658635 in the promoter region and of the rs9658635–rs729940 haplotype in the CHGA gene with schizophrenia have also emerged36. Through SNP x diagnosis interaction analysis, we found that the effect of rs9658644 on expression of the CgA protein varied with diagnosis (Table 3). Analysis stratified by diagnosis showed that an increase in the number of major alleles of rs9658644 was significantly associated with an increase in expression of the CgA protein in the schizophrenia patient group. In the control group, the major allele was not associated with CgA protein levels in blood (Fig. 2). These results suggest differential regulation of CgA protein expression in patients based on the number of rs9658644 major alleles.

The SERPINA1 gene, which encodes the AAT protein, has also been implicated in schizophrenia. Association studies demonstrated that rs1303 displayed different genotype distributions between patients and controls47. In an earlier study, alleles of this gene were found to be linked to a family history of schizophrenia48. We found a significant interaction between another SNP located within the SERPINA1 gene, rs11846959, and diagnosis. Through analysis stratified by diagnosis, we demonstrated that rs11846959 was significantly associated with expression of circulating AAT in the control group but not in the schizophrenia group (Table 3, Fig. 2). In the control group, an increase in the number of major alleles of rs11846959 was significantly associated with a decrease in expression of the AAT protein. Finally, vWF polymorphisms have also been implicated in schizophrenia and bipolar disorder, but only in studies investigating co-segregation and genetic associations between von Willebrand’s disease and psychotic disorders49. We found that a higher number of major alleles of rs12829220 was significantly associated with an increasing level of vWF in controls (Table 3, Fig. 2).

A limitation of our study was that we were restricted to investigating SNPs located within genes encoding the selection of proteins included in the commercially available immunoassay panel. This may explain why we were not able to reproduce recent findings from the largest GWAS1. In addition, despite our efforts to minimise the effects of confounding factors on protein expression through implementation of strict exclusion criteria, we cannot rule out the effects of some key environmental confounders, including menstrual cycle and stress50.

In conclusion, the data presented here show that most of the significant SNP and protein associations and SNP x Diagnosis interactions are either directly or indirectly linked to inflammatory responses. We found significant associations between SNPs located within genes and the circulating proteins those genes encode. Importantly, we identified significant SNP x Diagnosis interactions for eight serum proteins, suggesting that the effect of these SNPs on expression of the respective proteins varies with diagnosis.