Estimation of evolutionary distances between nucleotide sequences

Zharkikh, Andrey

doi:10.1007/BF00160155

Published: September 1994

Volume 39, pages 315–329, (1994)

Journal of Molecular Evolution

Andrey Zharkikh^1,2

Abstract

The substitution process in nucleotide sequence evolution was analyzed formally as a Markov process. Matrix algebra was used to provide a theoretical foundation for the methods of Barry and Hartigan (Stat. Sci. 2:191–210, 1987) and Lanave et al. (J. Mol. Evol. 20:86–93, 1984). Extensive computer simulation was used to compare the accuracy and effectiveness of various methods for estimating the evolutionary distance between two nucleotide sequences. The multiparameter methods of Lanave et al. (J. Mol. Evol. 20:86–93, 1984), Gojobori et al. (J. Mol. Evol. 18:414–422, 1982), and Barry and Hartigan (Stat. Sci. 2:191–210, 1987) proved preferable to the others for phylogenetic analysis when the sequences are long. When the sequences are short and the evolutionary distance is large, however, the method of Tajima and Nei (Mol. Biol. Evol. 1:269–285, 1984) is superior to the others.
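As an illustration of what a pairwise distance estimator of this kind computes, the following is a minimal sketch of the Tajima and Nei (1984) distance in its commonly cited form, d = -b ln(1 - p/b). It is not taken from the article itself: the function name `tajima_nei_distance`, the decision to skip sites with gaps or ambiguous bases, and the error handling are assumptions made for the example.

```python
import math
from itertools import combinations


def tajima_nei_distance(seq1, seq2):
    """Tajima-Nei (1984) distance between two aligned nucleotide sequences.

    Hypothetical helper for illustration; sites where either sequence has a
    character other than A, C, G, T are skipped (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    bases = "ACGT"
    sites = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in bases and b in bases]
    n = len(sites)
    if n == 0:
        raise ValueError("no comparable sites")

    # p: proportion of sites at which the two sequences differ
    p = sum(a != b for a, b in sites) / n
    if p == 0:
        return 0.0

    # q[x]: frequency of base x averaged over both sequences
    q = {x: (sum(a == x for a, _ in sites) + sum(b == x for _, b in sites)) / (2 * n)
         for x in bases}

    # c: sum over unordered base pairs of x_ij^2 / (2 q_i q_j), where x_ij is
    # the relative frequency of sites showing bases i and j (in either order)
    c = 0.0
    for i, j in combinations(bases, 2):
        x_ij = sum({a, b} == {i, j} for a, b in sites) / n
        if x_ij > 0:
            c += x_ij ** 2 / (2 * q[i] * q[j])

    b = 0.5 * (1 - sum(v * v for v in q.values()) + p * p / c)
    if p >= b:
        raise ValueError("distance undefined: sequences are too divergent")
    return -b * math.log(1 - p / b)
```

When the four bases are equally frequent and mismatches are spread evenly over the six base pairs, b equals 3/4 and the estimate coincides with the Jukes and Cantor (1969) distance, which gives a convenient sanity check.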

References

Barry D, Hartigan JA (1987) Statistical analysis of hominoid molecular evolution. Stat Sci 2:191–210
Bellman R (1960) Introduction to matrix analysis. McGraw-Hill, New York, p 34
Blaisdell BE (1985) A method of estimating from two aligned present-day DNA sequences their ancestral composition and subsequent rates of substitution, possibly different in the two lineages, corrected for multiple and parallel substitutions at the same site. J Mol Evol 22:69–81
Cavender JA, Felsenstein J (1987) Invariants of phylogenies in a simple case with discrete states. J Classification 4:57–71
DeBry RW (1992) The consistency of several phylogeny-inference methods under varying evolutionary rates. Mol Biol Evol 9:537–551
Felsenstein J (1973) Maximum-likelihood and minimum-steps methods for evolutionary trees from data on discrete characters. Syst Zool 26:77–88
Felsenstein J (1983) Statistical inference of phylogenies. J R Statist Soc A 146:246–272
Felsenstein J (1984) Distance methods for inferring phylogenies: a justification. Evolution 38:16–24
Felsenstein J (1992) Phylogenies from restriction sites: a maximum-likelihood approach. Evolution 46:159–173
Fitch WM (1980) Estimating the total number of nucleotide substitutions since the common ancestor of a pair of homologous genes: comparison of several methods and three beta hemoglobin messenger RNA's. J Mol Evol 16:153–209
Fitch WM (1986) The estimate of total nucleotide substitution from pairwise differences is biased. Philos Trans R Soc Lond Biol 312:317–324
Gojobori T, Ishii K, Nei M (1982) Estimation of average number of nucleotide substitutions when the rate of substitution varies with nucleotide. J Mol Evol 18:414–422
Gojobori T, Moriyama EN, Kimura M (1990) Statistical method for estimating sequence divergence. In: Doolittle RF (ed) Methods in enzymology, vol 183. Molecular evolution: computer analysis of protein and nucleic acid sequences. Academic Press, San Diego, pp 531–550
Gojobori T, Nei M, Ishii K (1981) Mathematical model of nucleotide substitutions with unequal substitution rates. Genetics 97:s43
Hasegawa M, Kishino H, Yano T (1985) Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J Mol Evol 22:160–174
Holmquist R (1976) Solution to a gene divergence problem under arbitrary stable nucleotide transition probabilities. J Mol Evol 8:337–349
Jukes TH, Cantor CR (1969) Evolution of protein molecules. In: Munro HN (ed) Mammalian protein metabolism. Academic Press, New York, pp 21–123
Kimura M (1980) A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol 16:111–120
Kimura M (1981) Estimation of evolutionary differences between homologous nucleotide sequences. Proc Natl Acad Sci USA 78:454–458
Kishino H, Hasegawa M (1990) Converting distance to time: application to human evolution. In: Doolittle RF (ed) Methods in enzymology, vol 183. Molecular evolution: computer analysis of protein and nucleic acid sequences. Academic Press, San Diego, pp 550–570
Lanave C, Preparata G, Saccone C, Serio G (1984) A new method for calculating evolutionary substitution rates. J Mol Evol 20:86–93
Nei M, Tateno Y (1978) Nonrandom amino acid substitution and estimation of the number of nucleotide substitution in evolution. J Mol Evol 11:333–347
Nguyen T, Speed TP (1992) A derivation of all linear invariants for a nonbalanced transversion model. J Mol Evol 35:60–88
Olsen G (1991) Systematic underestimation of tree branch lengths by Lake's operator metrics: an effect of position-dependent substitution rates. Mol Biol Evol 8:592–608
Saccone C, Lanave C, Pesole G, Preparata G (1990) Influence of base composition on quantitative estimates of gene evolution. In: Doolittle RF (ed) Methods in enzymology, vol 183. Molecular evolution: computer analysis of protein and nucleic acid sequences. Academic Press, San Diego, pp 570–583
Saitou N (1990) Maximum likelihood methods. In: Doolittle RF (ed) Methods in enzymology, vol 183. Molecular evolution: computer analysis of protein and nucleic acid sequences. Academic Press, San Diego, pp 584–598
Saitou N, Nei M (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4:406–425
Tajima F, Nei M (1982) Biases of the estimates of DNA divergence obtained by the restriction enzyme technique. J Mol Evol 18:115–120
Tajima F, Nei M (1984) Estimation of evolutionary distance between nucleotide sequences. Mol Biol Evol 1:269–285
Takahata N, Kimura M (1981) A model of evolutionary base substitution and its application with special reference to rapid change of pseudo-genes. Genetics 98:641–657
Tamura K (1992) Estimation of the number of nucleotide substitutions when there are strong transition-transversion and G+C content biases. Mol Biol Evol 9:678–687
Zharkikh A, Li W-H (1992) Statistical properties of bootstrap estimation of phylogenetic variability from nucleotide sequences. II. Four taxa without a molecular clock. J Mol Evol 35:356–366
Zharkikh A, Li W-H (1993) Inconsistency of the maximum parsimony method: the case of five taxa with a molecular clock. Syst Biol 42:113–125

Author information

Authors and Affiliations

1. Center for Demographic and Population Genetics, University of Texas, P.O. Box 20334, Houston, TX 77225, USA
2. Institute of Cytology and Genetics, Novosibirsk 630090, Russia

About this article

Cite this article

Zharkikh, A. Estimation of evolutionary distances between nucleotide sequences. J Mol Evol 39, 315–329 (1994). https://doi.org/10.1007/BF00160155

Received: 09 February 1993
Accepted: 14 March 1994
Issue Date: September 1994
DOI: https://doi.org/10.1007/BF00160155
