Abstract
The human genotope is the convex hull of all allele frequency vectors that can be obtained from the genotypes present in the human population. In this paper, we take a few initial steps toward a description of this object, which may be fundamental for future population-based genetics studies. Using data from the HapMap Project, restricted to two ENCODE regions, we study a subpolytope of the human genotope. We examine three approaches for obtaining informative low-dimensional projections of this subpolytope: projection onto a few tag SNPs, principal component analysis, and archetypal analysis. We then describe how this geometric approach applies to identifying population structure from single nucleotide polymorphisms (SNPs).
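As a rough illustration of the first of these projections, the sketch below maps a handful of made-up genotypes to allele frequency vectors, projects them onto two tag SNPs, and computes the convex hull of the projected points. It is a toy under stated assumptions, not the procedure used in the paper: the genotype data, the coding of each SNP as an allele count divided by two, the choice of tag SNPs, and the helper names (`allele_frequencies`, `project_onto_tags`, `convex_hull_2d`) are all hypothetical.

```python
from fractions import Fraction

# Hypothetical toy data (not from HapMap): each row is one individual's
# genotype at three biallelic SNPs, coded as the number of copies (0, 1, 2)
# of a chosen allele.
GENOTYPES = [
    (0, 0, 2),
    (2, 0, 1),
    (2, 2, 0),
    (0, 2, 1),
    (1, 1, 1),
]


def allele_frequencies(genotype):
    """Assumed coding: allele count / 2 gives a frequency in [0, 1] per SNP."""
    return tuple(Fraction(count, 2) for count in genotype)


def project_onto_tags(point, tag_indices):
    """Keep only the coordinates of the chosen tag SNPs."""
    return tuple(point[i] for i in tag_indices)


def cross(o, a, b):
    """Signed area of the turn o -> a -> b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull_2d(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


points = [allele_frequencies(g) for g in GENOTYPES]
projected = [project_onto_tags(p, (0, 1)) for p in points]  # tag SNPs 0 and 1
hull = convex_hull_2d(projected)

for vertex in hull:
    print(tuple(str(coordinate) for coordinate in vertex))
```

In this toy data the heterozygous individual `(1, 1, 1)` projects to the interior of the hull, so it is not a vertex of the projected polytope. Exact `Fraction` arithmetic is used here only to keep the orientation tests free of rounding error.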
Cite this article
Huggins, P., Pachter, L. & Sturmfels, B. Toward the Human Genotope. Bull. Math. Biol. 69, 2723–2735 (2007). https://doi.org/10.1007/s11538-007-9244-7
DOI: https://doi.org/10.1007/s11538-007-9244-7