Abstract
The substitution process in nucleotide sequence evolution was analyzed formally as a Markov process. Matrix algebra was used to provide a theoretical foundation for the methods of Barry and Hartigan (Stat. Sci. 2:191–210, 1987) and Lanave et al. (J. Mol. Evol. 20:86–93, 1984). Extensive computer simulation was used to compare the accuracy and effectiveness of various methods for estimating the evolutionary distance between two nucleotide sequences. The multiparameter methods of Lanave et al. (J. Mol. Evol. 20:86–93, 1984), Gojobori et al. (J. Mol. Evol. 18:414–422, 1982), and Barry and Hartigan (Stat. Sci. 2:191–210, 1987) were shown to be preferable to the others for phylogenetic analysis when the sequences are long. However, when the sequences are short and the evolutionary distance is large, the method of Tajima and Nei (Mol. Biol. Evol. 1:269–285, 1984) is superior to the others.
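To make the idea of a Markov-model distance correction concrete, the following is a minimal sketch of the simplest such estimator, the one-parameter correction of Jukes and Cantor (1969), which appears in the reference list. It is not one of the multiparameter methods compared in the abstract, and it is not taken from the article itself. The function names `p_distance` and `jukes_cantor_distance` are hypothetical, and the sketch assumes the two sequences are already aligned.

```python
import math

NUCLEOTIDES = {"A", "C", "G", "T"}


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous symbol are skipped
    (an assumption of this sketch, not a rule from the article).
    """
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in NUCLEOTIDES and b in NUCLEOTIDES:
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def jukes_cantor_distance(p):
    """Jukes-Cantor distance d = -(3/4) ln(1 - 4p/3).

    The formula is undefined once p reaches 3/4 (saturation); this sketch
    returns infinity in that case.
    """
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    p = p_distance("ACGTACGT", "ACGTACGA")
    print(p, jukes_cantor_distance(p))
```

The saturation branch hints at why short, highly diverged sequences are difficult for any estimator of this kind: with few sites, sampling error alone can push the observed proportion of differences into the region where the logarithm is undefined.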
References
Barry D, Hartigan JA (1987) Statistical analysis of hominoid molecular evolution. Stat Sci 2:191–210
Bellman R (1960) Introduction to matrix analysis. McGraw-Hill, New York, p 34
Blaisdell BE (1985) A method of estimating from two aligned present-day DNA sequences their ancestral composition and subsequent rates of substitution, possibly different in the two lineages, corrected for multiple and parallel substitutions at the same site. J Mol Evol 22:69–81
Cavender JA, Felsenstein J (1987) Invariants of phylogenies in a simple case with discrete states. J Classification 4:57–71
DeBry RW (1992) The consistency of several phylogeny-inference methods under varying evolutionary rates. Mol Biol Evol 9:537–551
Felsenstein J (1973) Maximum-likelihood and minimum-steps methods for evolutionary trees from data on discrete characters. Syst Zool 26:77–88
Felsenstein J (1983) Statistical inference of phylogenies. J R Statist Soc A 146:246–272
Felsenstein J (1984) Distance methods for inferring phylogenies: a justification. Evolution 38:16–24
Felsenstein J (1992) Phylogenies from restriction sites: a maximum-likelihood approach. Evolution 46:159–173
Fitch WM (1980) Estimating the total number of nucleotide substitutions since the common ancestor of a pair of homologous genes: comparison of several methods and three beta hemoglobin messenger RNA's. J Mol Evol 16:153–209
Fitch WM (1986) The estimate of total nucleotide substitution from pairwise differences is biased. Philos Trans R Soc Lond Biol 312:317–324
Gojobori T, Ishii K, Nei M (1982) Estimation of average number of nucleotide substitutions when the rate of substitution varies with nucleotide. J Mol Evol 18:414–422
Gojobori T, Moriyama EN, Kimura M (1990) Statistical method for estimating sequence divergence. In: Doolittle RF (ed) Methods in enzymology, vol 183. Molecular evolution: computer analysis of protein and nucleic acid sequences. Academic Press, San Diego, pp 531–550
Gojobori T, Nei M, Ishii K (1981) Mathematical model of nucleotide substitutions with unequal substitution rates. Genetics 97:s43
Hasegawa M, Kishino H, Yano T (1985) Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J Mol Evol 22:160–174
Holmquist R (1976) Solution to a gene divergence problem under arbitrary stable nucleotide transition probabilities. J Mol Evol 8:337–349
Jukes TH, Cantor CR (1969) Evolution of protein molecules. In: Munro HN (ed) Mammalian protein metabolism. Academic Press, New York, pp 21–123
Kimura M (1980) A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol 16:111–120
Kimura M (1981) Estimation of evolutionary differences between homologous nucleotide sequences. Proc Natl Acad Sci USA 78:454–458
Kishino H, Hasegawa M (1990) Converting distance to time: application to human evolution. In: Doolittle RF (ed) Methods in enzymology, vol 183. Molecular evolution: computer analysis of protein and nucleic acid sequences. Academic Press, San Diego, pp 550–570
Lanave C, Preparata G, Saccone C, Serio G (1984) A new method for calculating evolutionary substitution rates. J Mol Evol 20:86–93
Nei M, Tateno Y (1978) Nonrandom amino acid substitution and estimation of the number of nucleotide substitution in evolution. J Mol Evol 11:333–347
Nguyen T, Speed TP (1992) A derivation of all linear invariants for a nonbalanced transversion model. J Mol Evol 35:60–88
Olsen G (1991) Systematic underestimation of tree branch lengths by Lake's operator metrics: an effect of position-dependent substitution rates. Mol Biol Evol 8:592–608
Saccone C, Lanave C, Pesole G, Preparata G (1990) Influence of base composition on quantitative estimates of gene evolution. In: Doolittle RF (ed) Methods in enzymology, vol 183. Molecular evolution: computer analysis of protein and nucleic acid sequences. Academic Press, San Diego, pp 570–583
Saitou N (1990) Maximum likelihood methods. In: Doolittle RF (ed) Methods in enzymology, vol 183. Molecular evolution: computer analysis of protein and nucleic acid sequences. Academic Press, San Diego, pp 584–598
Saitou N, Nei M (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4:406–425
Tajima F, Nei M (1982) Biases of the estimates of DNA divergence obtained by the restriction enzyme technique. J Mol Evol 18:115–120
Tajima F, Nei M (1984) Estimation of evolutionary distance between nucleotide sequences. Mol Biol Evol 1:269–285
Takahata N, Kimura M (1981) A model of evolutionary base substitution and its application with special reference to rapid change of pseudogenes. Genetics 98:641–657
Tamura K (1992) Estimation of the number of nucleotide substitutions when there are strong transition-transversion and G+C content biases. Mol Biol Evol 9:678–687
Zharkikh A, Li W-H (1992) Statistical properties of bootstrap estimation of phylogenetic variability from nucleotide sequences. II. Four taxa without a molecular clock. J Mol Evol 35:356–366
Zharkikh A, Li W-H (1993) Inconsistency of the maximum parsimony method: the case of five taxa with a molecular clock. Syst Biol 42:113–125
Zharkikh, A. Estimation of evolutionary distances between nucleotide sequences. J Mol Evol 39, 315–329 (1994). https://doi.org/10.1007/BF00160155
DOI: https://doi.org/10.1007/BF00160155