Estimation of evolutionary distances between nucleotide sequences

Journal of Molecular Evolution

Abstract

A formal mathematical analysis of the substitution process in nucleotide sequence evolution was carried out in terms of a Markov process. Using matrix algebra theory, a theoretical foundation was provided for the methods of Barry and Hartigan (Stat. Sci. 2:191–210, 1987) and Lanave et al. (J. Mol. Evol. 20:86–93, 1984). Extensive computer simulation was used to compare the accuracy and effectiveness of various methods for estimating the evolutionary distance between two nucleotide sequences. It was shown that the multiparameter methods of Lanave et al. (J. Mol. Evol. 20:86–93, 1984), Gojobori et al. (J. Mol. Evol. 18:414–422, 1982), and Barry and Hartigan (Stat. Sci. 2:191–210, 1987) are preferable to the others for phylogenetic analysis when the sequences are long. However, when the sequences are short and the evolutionary distance is large, the method of Tajima and Nei (Mol. Biol. Evol. 1:269–285, 1984) is superior to the others.
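To make the idea of a distance estimator concrete, the sketch below computes two of the simpler corrections from the literature cited here: the one-parameter Jukes and Cantor (1969) distance and the Tajima and Nei (1984) distance. The abstract itself gives no formulas, so the expressions used are the commonly quoted forms of these estimators, not something taken from this article. The function names (`comparable_pairs`, `p_distance`, `jukes_cantor`, `tajima_nei`) are hypothetical, and skipping gapped or ambiguous sites is an assumption of this sketch. The multiparameter methods of Lanave et al., Gojobori et al., and Barry and Hartigan are not implemented.

```python
import math
from collections import Counter
from itertools import combinations

BASES = "ACGT"


def comparable_pairs(seq1, seq2):
    """Aligned site pairs where both sequences carry an unambiguous base."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in BASES and b in BASES]
    if not pairs:
        raise ValueError("no comparable sites")
    return pairs


def p_distance(pairs):
    """Observed proportion of sites at which the two sequences differ."""
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor correction: d = -(3/4) * ln(1 - 4p/3)."""
    arg = 1.0 - 4.0 * p / 3.0
    # The estimate is undefined once p reaches 3/4; report infinity.
    return -0.75 * math.log(arg) if arg > 0 else math.inf


def tajima_nei(seq1, seq2):
    """Tajima-Nei correction (commonly quoted form): d = -b * ln(1 - p/b),
    with b = (1 - sum(g_i^2) + p^2 / h) / 2 and
    h = sum over i<j of x_ij^2 / (2 * g_i * g_j)."""
    pairs = comparable_pairs(seq1, seq2)
    n = len(pairs)
    p = p_distance(pairs)
    if p == 0:
        return 0.0
    # g[i]: frequency of base i averaged over both sequences.
    base_counts = Counter(a for a, _ in pairs) + Counter(b for _, b in pairs)
    g = {base: base_counts[base] / (2 * n) for base in BASES}
    # x_ij: relative frequency of the unordered mismatched pair {i, j}.
    mismatches = Counter(frozenset(pair) for pair in pairs if pair[0] != pair[1])
    h = 0.0
    for i, j in combinations(BASES, 2):
        x_ij = mismatches[frozenset((i, j))] / n
        if x_ij > 0:
            h += x_ij ** 2 / (2 * g[i] * g[j])
    b = 0.5 * (1 - sum(f * f for f in g.values()) + p * p / h)
    arg = 1 - p / b
    return -b * math.log(arg) if arg > 0 else math.inf
```

When all four bases are equally frequent and all six mismatch types occur equally often, b equals 3/4 and the Tajima and Nei estimate coincides with the Jukes and Cantor one. This gives a quick sanity check for the sketch.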

References

  • Barry D, Hartigan JA (1987) Statistical analysis of hominoid molecular evolution. Stat Sci 2:191–210

  • Bellman R (1960) Introduction to matrix analysis. McGraw-Hill, New York, p 34

  • Blaisdell BE (1985) A method of estimating from two aligned present-day DNA sequences their ancestral composition and subsequent rates of substitution, possibly different in the two lineages, corrected for multiple and parallel substitutions at the same site. J Mol Evol 22:69–81

  • Cavender JA, Felsenstein J (1987) Invariants of phylogenies in a simple case with discrete states. J Classification 4:57–71

  • DeBry RW (1992) The consistency of several phylogeny-inference methods under varying evolutionary rates. Mol Biol Evol 9:537–551

  • Felsenstein J (1973) Maximum-likelihood and minimum-steps methods for evolutionary trees from data on discrete characters. Syst Zool 26:77–88

  • Felsenstein J (1983) Statistical inference of phylogenies. J R Statist Soc A 146:246–272

  • Felsenstein J (1984) Distance methods for inferring phylogenies: a justification. Evolution 38:16–24

  • Felsenstein J (1992) Phylogenies from restriction sites: a maximum-likelihood approach. Evolution 46:159–173

  • Fitch WM (1980) Estimating the total number of nucleotide substitutions since the common ancestor of a pair of homologous genes: comparison of several methods and three beta hemoglobin messenger RNA's. J Mol Evol 16:153–209

  • Fitch WM (1986) The estimate of total nucleotide substitution from pairwise differences is biased. Philos Trans R Soc Lond Biol 312:317–324

  • Gojobori T, Ishii K, Nei M (1982) Estimation of average number of nucleotide substitutions when the rate of substitution varies with nucleotide. J Mol Evol 18:414–422

  • Gojobori T, Moriyama EN, Kimura M (1990) Statistical method for estimating sequence divergence. In: Doolittle RF (ed) Methods in enzymology, vol 183. Molecular evolution: computer analysis of protein and nucleic acid sequences. Academic Press, San Diego, pp 531–550

  • Gojobori T, Nei M, Ishii K (1981) Mathematical model of nucleotide substitutions with unequal substitution rates. Genetics 97:s43

  • Hasegawa M, Kishino H, Yano T (1985) Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J Mol Evol 22:160–174

  • Holmquist R (1976) Solution to a gene divergence problem under arbitrary stable nucleotide transition probabilities. J Mol Evol 8:337–349

  • Jukes TH, Cantor CR (1969) Evolution of protein molecules. In: Munro HN (ed) Mammalian protein metabolism. Academic Press, New York, pp 21–123

  • Kimura M (1980) A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol 16:111–120

  • Kimura M (1981) Estimation of evolutionary differences between homologous nucleotide sequences. Proc Natl Acad Sci USA 78:454–458

  • Kishino H, Hasegawa M (1990) Converting distance to time: application to human evolution. In: Doolittle RF (ed) Methods in enzymology, vol 183. Molecular evolution: computer analysis of protein and nucleic acid sequences. Academic Press, San Diego, pp 550–570

  • Lanave C, Preparata G, Saccone C, Serio G (1984) A new method for calculating evolutionary substitution rates. J Mol Evol 20:86–93

  • Nei M, Tateno Y (1978) Nonrandom amino acid substitution and estimation of the number of nucleotide substitutions in evolution. J Mol Evol 11:333–347

  • Nguyen T, Speed TP (1992) A derivation of all linear invariants for a nonbalanced transversion model. J Mol Evol 35:60–88

  • Olsen G (1991) Systematic underestimation of tree branch lengths by Lake's operator metrics: an effect of position-dependent substitution rates. Mol Biol Evol 8:592–608

  • Saccone C, Lanave C, Pesole G, Preparata G (1990) Influence of base composition on quantitative estimates of gene evolution. In: Doolittle RF (ed) Methods in enzymology, vol 183. Molecular evolution: computer analysis of protein and nucleic acid sequences. Academic Press, San Diego, pp 570–583

  • Saitou N (1990) Maximum likelihood methods. In: Doolittle RF (ed) Methods in enzymology, vol 183. Molecular evolution: computer analysis of protein and nucleic acid sequences. Academic Press, San Diego, pp 584–598

  • Saitou N, Nei M (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4:406–425

  • Tajima F, Nei M (1982) Biases of the estimates of DNA divergence obtained by the restriction enzyme technique. J Mol Evol 18:115–120

  • Tajima F, Nei M (1984) Estimation of evolutionary distance between nucleotide sequences. Mol Biol Evol 1:269–285

  • Takahata N, Kimura M (1981) A model of evolutionary base substitution and its application with special reference to rapid change of pseudo-genes. Genetics 98:641–657

  • Tamura K (1992) Estimation of the number of nucleotide substitutions when there are strong transition-transversion and G+C content biases. Mol Biol Evol 9:678–687

  • Zharkikh A, Li W-H (1992) Statistical properties of bootstrap estimation of phylogenetic variability from nucleotide sequences. II. Four taxa without a molecular clock. J Mol Evol 35:356–366

  • Zharkikh A, Li W-H (1993) Inconsistency of the maximum parsimony method: the case of five taxa with a molecular clock. Syst Biol 42:113–125

Cite this article

Zharkikh, A. Estimation of evolutionary distances between nucleotide sequences. J Mol Evol 39, 315–329 (1994). https://doi.org/10.1007/BF00160155
