A human phenome-interactome network of protein complexes implicated in genetic disorders

Abstract

We performed a systematic, large-scale analysis of human protein complexes comprising gene products implicated in many different categories of human disease to create a phenome-interactome network. This was done by integrating quality-controlled interactions of human proteins with a validated, computationally derived phenotype similarity score, permitting identification of previously unknown complexes likely to be associated with disease. Using a phenomic ranking of protein complexes linked to human disease, we developed a Bayesian predictor that in 298 of 669 linkage intervals correctly ranks the known disease-causing protein as the top candidate, and in 870 intervals with no identified disease-causing gene, provides novel candidates implicated in disorders such as retinitis pigmentosa, epithelial ovarian cancer, inflammatory bowel disease, amyotrophic lateral sclerosis, Alzheimer disease, type 2 diabetes and coronary heart disease. Our publicly available draft of protein complexes associated with pathology comprises 506 complexes, which reveal functional relationships between disease-promoting genes that will inform future experimentation.
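As a rough illustration of what a text-derived phenotype similarity score can look like, the sketch below compares phenotype descriptions as tf-idf weighted term vectors using cosine similarity. The supplementary figure titles mention tf-idf weights and a cosine measure, but the exact scoring scheme is not given in this text, so the weighting formula, the function names (`tfidf_vectors`, `cosine`) and the toy term lists are all assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn tokenized documents into sparse tf-idf vectors (dicts).

    Assumed weighting: raw term count * log(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append(
            {term: count * math.log(n_docs / doc_freq[term])
             for term, count in counts.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical phenotype descriptions, already reduced to terms.
phenotypes = {
    "A": ["retinal", "degeneration", "night", "blindness"],
    "B": ["retinal", "degeneration", "hearing", "loss"],
    "C": ["muscle", "weakness", "motor", "neuron"],
}
names = list(phenotypes)
vecs = dict(zip(names, tfidf_vectors([phenotypes[n] for n in names])))

sim_ab = cosine(vecs["A"], vecs["B"])  # share two terms
sim_ac = cosine(vecs["A"], vecs["C"])  # share no terms
```

In this toy setting, phenotypes A and B share terms and score above zero, while A and C share none and score zero. A score of this kind could then be combined with protein interaction evidence to rank candidates in a linkage interval, which is the step the abstract attributes to the Bayesian predictor.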

Figure 1: Steps in scoring each candidate in a linkage interval.
Figure 2: Performance of the Bayesian predictor.
Figure 3: Case studies of four candidate complexes.

Acknowledgements

The authors wish to thank Ulrik de Lichtenberg and Thomas Skøt Jensen for critical reading of the manuscript, editing and help in developing the protein interaction score. We also thank Christopher Workman and Zoltan Szallasi for valuable discussions and help with the manuscript. Y.M. is supported by K.U. Leuven GOA AMBioRICS, CoE EF/05/007 SymBioSys, BELSPO IUAP P6/25 BioMaGNet, EU-FP6-NoE Biopattern and EU-FP6-MC-EST Bioptrain. Z.M.S. is supported by an EU Biosapiens (NoE), FP6 grant. The Center for Biological Sequence Analysis and the Wilhelm Johannsen Center for Functional Genome Research are supported by the Danish National Research Foundation.

Author information

Corresponding author

Correspondence to Søren Brunak.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Fig. 1

Measuring phenotype association scores between OMIM records. (PDF 287 kb)

Supplementary Fig. 2

Benchmark of the phenotype association score. (PDF 81 kb)

Supplementary Fig. 3

Benchmark of protein interaction score. (PDF 115 kb)

Supplementary Fig. 4

EOC candidate complex. (PDF 153 kb)

Supplementary Fig. 5

IBD candidate complex. (PDF 290 kb)

Supplementary Fig. 6

ALS with frontotemporal dementia candidate complex. (PDF 219 kb)

Supplementary Fig. 7

Influence of tf-idf weight on phenotype similarity scoring scheme. (PDF 83 kb)

Supplementary Fig. 8

Robustness of the cosine phenotype similarity measure. (PDF 79 kb)

Supplementary Fig. 9

Connection profiles. (PDF 101 kb)

Supplementary Fig. 10

Performance of phenotype similarity scheme on phenotypes with same molecular basis. (PDF 107 kb)

Supplementary Table 1

A randomly selected subset of 100 OMIM record pairs cross-referenced by the OMIM curators. (PDF 35 kb)

Supplementary Table 2

A list of 113 candidates identified by the Bayesian predictor. (PDF 30 kb)

Supplementary Table 3

Comparison of the different computational methods that have been tested by ranking candidate genes in linkage intervals. (PDF 7 kb)

Supplementary Table 4

Normalized connectivity of phenotypes from the training set and the prediction set. (PDF 12 kb)

Supplementary Table 5

Benchmarking subset where we ranked the correct gene as number one out of all candidates in the interval. (PDF 46 kb)

Supplementary Data (PDF 23 kb)

Supplementary Methods (PDF 148 kb)

About this article

Cite this article

Lage, K., Karlberg, E., Størling, Z. et al. A human phenome-interactome network of protein complexes implicated in genetic disorders. Nat Biotechnol 25, 309–316 (2007). https://doi.org/10.1038/nbt1295
