Exploiting the proteome to improve the genome-wide genetic analysis of epistasis in common human diseases

  • Review
  • Human Genetics

Abstract

One of the central goals of human genetics is the identification of loci with alleles or genotypes that confer increased susceptibility to disease. The availability of dense maps of single-nucleotide polymorphisms (SNPs), along with high-throughput genotyping technologies, has set the stage for routine genome-wide association studies that are expected to significantly improve our ability to identify susceptibility loci. Before this promise can be realized, several significant challenges need to be addressed. Here we address the challenge of detecting epistasis, or gene–gene interaction, in genome-wide association studies. Discovering epistatic interactions in high-dimensional datasets remains difficult because of the computational complexity of analyzing all possible combinations of SNPs. One potential way to overcome this computational burden would be to devise a logical way to prioritize the many SNPs in a dataset so that the data can be analyzed more efficiently while still retaining important biological information. One of the strongest demonstrations of a functional relationship between genes is protein–protein interaction. Thus, it is plausible that expert knowledge extracted from protein interaction databases may allow for a more efficient analysis of genome-wide studies as well as facilitate the biological interpretation of the data. In this review we discuss the challenges of detecting epistasis in genome-wide genetic studies and the means by which we propose to apply expert knowledge extracted from protein interaction databases to facilitate this process. We also explore some of the fundamentals of protein interactions and the databases that are publicly available.
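
The scale of the combinatorial problem, and the effect of restricting the search to SNP pairs whose genes encode interacting proteins, can be illustrated with a minimal Python sketch. The function names, SNP identifiers, gene names, and interaction list below are hypothetical placeholders rather than entries from any real database, and the filter shown is only one simple way such prior knowledge might be applied, not the specific method proposed in the review.

```python
from itertools import combinations
from math import comb


def count_combinations(n_snps: int, order: int) -> int:
    """Number of distinct SNP combinations of a given order (2 = pairwise)."""
    return comb(n_snps, order)


def prioritized_pairs(snp_to_gene, interacting_genes):
    """Keep only SNP pairs whose genes are listed as encoding interacting proteins.

    snp_to_gene: dict mapping a SNP identifier to a gene name (hypothetical).
    interacting_genes: iterable of (gene, gene) tuples (hypothetical).
    """
    # Store interactions as unordered pairs so (A, B) matches (B, A).
    interactions = {frozenset(pair) for pair in interacting_genes}
    kept = []
    for snp_a, snp_b in combinations(sorted(snp_to_gene), 2):
        gene_pair = frozenset((snp_to_gene[snp_a], snp_to_gene[snp_b]))
        if gene_pair in interactions:
            kept.append((snp_a, snp_b))
    return kept


# Toy example: four SNPs in four placeholder genes, two known interactions.
snp_to_gene = {
    "snp1": "GENE_A",
    "snp2": "GENE_B",
    "snp3": "GENE_C",
    "snp4": "GENE_D",
}
interacting_genes = [("GENE_A", "GENE_B"), ("GENE_C", "GENE_D")]

all_pairs = count_combinations(len(snp_to_gene), 2)       # exhaustive search
kept_pairs = prioritized_pairs(snp_to_gene, interacting_genes)
print(all_pairs, kept_pairs)
```

For a dataset of 500,000 SNPs, `count_combinations(500_000, 2)` gives 124,999,750,000 pairwise tests, which is why some form of prioritization is attractive before any exhaustive analysis is attempted.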

Acknowledgments

This publication was funded in part by National Institutes of Health grants LM009012 and AI59694. We would like to thank Drs. Scott Gerber, David Jewell, Dean Madden and Mike Whitfield for helpful discussions that led to some of the ideas in this paper.

Corresponding author

Correspondence to Jason H. Moore.

About this article

Cite this article

Pattin, K.A., Moore, J.H. Exploiting the proteome to improve the genome-wide genetic analysis of epistasis in common human diseases. Hum Genet 124, 19–29 (2008). https://doi.org/10.1007/s00439-008-0522-8

  • DOI: https://doi.org/10.1007/s00439-008-0522-8
