Rates of silent-site evolution (Ks) vary a great deal between genes within the same species-pair comparison (Bernardi, Mouchiroud, and Gautier 1993; Wolfe and Sharp 1993; Mouchiroud, Gautier, and Bernardi 1995). This variation may represent random fluctuation (Kumar and Subramanian 2002) or may have a deterministic cause. Evidence for the latter comes from the finding that the rate of silent-site evolution of a gene is repeatable across independent lineages (Bulmer, Wolfe, and Sharp 1991; Mouchiroud, Gautier, and Bernardi 1995; Bielawski, Dunn, and Yang 2000). Most notably, Mouchiroud, Gautier, and Bernardi (1995) found that the number of synonymous substitutions per synonymous site (Ks) for a gene in the human-cow comparison was a very strong predictor of the Ks of the same gene in the mouse-rat comparison.

This repeatability has been used as evidence (Mouchiroud, Gautier, and Bernardi 1995) for selection acting upon silent sites in orthologous genes. One argument holds, for example, that if purifying selection favors a particular amino acid at a given site, it should also favor the translationally most accurate codon, so that codon usage bias would be selected for. The repeatability of Ks and the Ka-Ks correlation are then attributed to the same selectionist cause. However, this interpretation is questionable on a number of counts. Most notably, with one possible exception (Debry and Marzluff 1994), there is no compelling evidence that codon usage bias in mammals is the result of selection (Eyre-Walker 1991; Karlin and Mrazek 1996; Urrutia and Hurst 2001). Alternative interpretations of the putative repeatability are also possible. These include gene- or chromosome-specific mutation rates and gene-specific rates of biased gene conversion, all of which could give repeatable Ks and a Ka-Ks correlation.

Leaving the difficulties of interpretation aside, we wish to note two potential problems with the prior analysis of Mouchiroud, Gautier, and Bernardi (1995). First, these authors did not constrain the orthologs to be autosomal. Mammalian X-linked genes often have a low Ks, most probably in part because of the relatively short time spent in the male germ line (for review see Hurst and Ellegren 1998). A data set containing numerous X-linked and autosomal genes could give repeatability of Ks, but this may represent repeatability at the chromosomal level (and be mutationally driven) rather than result from selection on silent sites. To address this we analyze a data set of orthologs known to be autosomal.

Second, it is now well established that the sophistication of estimators of the silent-site rate of evolution can have a major effect on many molecular evolutionary patterns, such as the relationship between Ks and GC4 (Pesole et al. 1995; Smith and Hurst 1998; Bielawski, Dunn, and Yang 2000; Hurst and Williams 2000) and between Ka and Ks (Smith and Hurst 1998; Bielawski, Dunn, and Yang 2000). For example, even if the mutation rate does not vary as a function of GC content, many methods artifactually report an inverted U-shaped relationship between rate and GC content (Pesole et al. 1995). Mouchiroud, Gautier, and Bernardi used the LWL method, which appears to be highly prone to this bias (Pesole et al. 1995). This artifact alone could lead to apparent repeatability. Is the repeatability of the synonymous substitution rate equally sensitive to method, and might the findings of Mouchiroud, Gautier, and Bernardi (1995) be an artifact of using methods that do not make good allowance for such things as biased base composition?

HOVERGEN (Release 40, May 2000; GenBank release 116) was used to collect orthologs. Human, cow, mouse, and rat orthologs were collected for each gene. A total of 150 such sets of orthologs were originally collected; however, because of poor alignments or insufficient evidence of orthology, this was reduced to a final data set of 116 ortholog sets. Using HOVERGEN is probably the best method for determining orthology of sequences, but it does have its shortcomings. Most notably, if a gene from one species has a faster rate of evolution than the others, then the true ortholog can fall out at the bottom of the tree because of long branch attraction. Hence, although the human, cow, mouse, and rat sequences might cluster as a family, the topology is not as expected. The same result could, however, also arise were the putatively fast-evolving sequence a paralog. We must therefore reject the families with unusual topologies, although this may bias the sample toward sequences with less overdispersed (and hence more repeatable) rates of evolution. There were, however, only six families that we had to reject because the bovine gene seemed to have a faster rate of evolution and fell out at the bottom of the phylogeny of the putative orthologs.

The coding sequences were extracted from GenBank files using GBPARSE. CLUSTALW was used to align all four translated sequences together, and the nucleotide alignments were reconstituted from the protein alignments and the nucleotide sequences. For one gene, Lamp1 (J04182, L09113, M32015, M34959), it was not possible to align the signal peptide because of a high level of degeneracy, so the mature peptide alone was aligned. Signal peptides are known to evolve faster than average; they seem to conserve hydrophobicity and little else (Williams, Pal, and Hurst 2000).
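The step of reconstituting a nucleotide alignment from a protein alignment can be sketched as follows. This is a minimal Python illustration, not the script used in this study; the function name `thread_codons` and the use of "-" as the gap character are assumptions made for the example.

```python
def thread_codons(aligned_protein, cds, gap="-"):
    """Rebuild one row of a codon alignment.

    aligned_protein: protein sequence with gap characters, as output by an aligner.
    cds: the unaligned coding sequence that encodes that protein.
    Each residue is replaced by its codon and each gap by three gap characters.
    """
    # Split the coding sequence into complete codons (any trailing partial codon is dropped).
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if n_residues > len(codons):
        raise ValueError("coding sequence is too short for the aligned protein")
    codon_iter = iter(codons)
    row = []
    for aa in aligned_protein:
        row.append(gap * 3 if aa == gap else next(codon_iter))
    return "".join(row)


# Hypothetical example: a two-residue protein aligned with one gap.
print(thread_codons("M-K", "ATGAAATAA"))
```

Codons left over after the last residue (for example a terminal stop codon) are simply not emitted, so every row has three times the length of the protein alignment.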

Ka and Ks were calculated using LWL (Li, Wu, and Luo 1985), the method of Li (1993), YN00 (Yang and Nielsen 2000), and ML (Goldman and Yang 1994), comparing the human gene with the cow gene and the mouse gene with the rat gene. K4 was calculated using K2P (Kimura 1980), as well as using TN (Tamura and Nei 1993). The mean GC4 content was calculated from the human and cow genes as well as the mouse and rat genes. The entire protocol from GenBank file to final result was automated using a Perl script available from the authors. The mean length of the four orthologs defined the length of the gene. The human gene location was ascertained from LocusLink at the NCBI website (http://www.ncbi.nlm.nih.gov/LocusLink/index.html). Doublets were removed using a Perl script obtainable from the authors.
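The K4 and GC4 calculations can be illustrated with a minimal Python sketch. It is not the Perl script used here. The function names, the rule for choosing fourfold sites (codons whose first two bases are identical in both sequences and define a fourfold degenerate family in the standard genetic code), and the pooling of both sequences when computing GC4 are assumptions made for illustration. The distance itself is the standard two-parameter formula of Kimura (1980).

```python
import math

# First two codon bases that define fourfold degenerate families in the
# standard genetic code (Leu, Val, Ser, Pro, Thr, Ala, Arg, Gly).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}
PURINES = {"A", "G"}


def fourfold_pairs(seq_a, seq_b):
    """Third-position base pairs from aligned codons that share a fourfold prefix."""
    pairs = []
    for i in range(0, min(len(seq_a), len(seq_b)) - 2, 3):
        codon_a, codon_b = seq_a[i:i + 3].upper(), seq_b[i:i + 3].upper()
        if codon_a[:2] == codon_b[:2] and codon_a[:2] in FOURFOLD_PREFIXES:
            if codon_a[2] in "ACGT" and codon_b[2] in "ACGT":
                pairs.append((codon_a[2], codon_b[2]))
    return pairs


def k2p_distance(pairs):
    """Kimura two-parameter distance from a list of aligned base pairs."""
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable sites")
    # Transitions stay within purines or within pyrimidines; transversions cross.
    transitions = sum(1 for a, b in pairs
                      if a != b and (a in PURINES) == (b in PURINES))
    transversions = sum(1 for a, b in pairs
                        if (a in PURINES) != (b in PURINES))
    p, q = transitions / n, transversions / n
    # math.log raises ValueError when the sequences are saturated.
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


def gc4(pairs):
    """GC content at the fourfold sites, pooled over both sequences."""
    bases = [base for pair in pairs for base in pair]
    return sum(base in "GC" for base in bases) / len(bases)


# Hypothetical aligned coding sequences differing by one transition.
human = "CTA" * 10
cow = "CTG" + "CTA" * 9
sites = fourfold_pairs(human, cow)
print(k2p_distance(sites), gc4(sites))
```

Because K2P assumes equal base frequencies, a sketch like this shares the sensitivity to biased base composition discussed in the text; the Tamura and Nei (1993) correction is not implemented here.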

In all cases where we looked at a subsample of the complete data set (e.g., when removing short genes), we tested the significance of the results obtained in that subsample by creating randomly subsampled data sets of the same size as the actual subsample. In nearly all cases, the significance obtained in the subsample did not differ significantly from that obtained in the complete data set. Any noteworthy exceptions are pointed out in the results; otherwise it should be assumed that there is no difference.
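A random-subsampling comparison of this kind can be sketched in Python as below. This is an illustration under stated assumptions rather than the procedure actually coded for this study: the function names, the default of 1,000 draws, the fixed seed, and the choice to report the proportion of random subsamples whose r2 is at least as large as the observed one are all hypothetical.

```python
import random


def r_squared(xs, ys):
    """Square of the Pearson correlation coefficient between two samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


def subsample_comparison(rate_pairs, subsample_indices, n_draws=1000, seed=0):
    """Compare r2 in a chosen subsample with r2 in random subsamples of equal size.

    rate_pairs: list of (rate in comparison 1, rate in comparison 2), one per gene.
    subsample_indices: positions of the genes in the subsample of interest.
    Returns the observed r2 and the proportion of random draws with r2 >= observed.
    """
    rng = random.Random(seed)
    chosen = [rate_pairs[i] for i in subsample_indices]
    observed = r_squared(*zip(*chosen))
    size = len(chosen)
    hits = 0
    for _ in range(n_draws):
        draw = rng.sample(rate_pairs, size)  # sampling without replacement
        if r_squared(*zip(*draw)) >= observed:
            hits += 1
    return observed, hits / n_draws
```

For the long-gene analysis, for example, `subsample_indices` would hold the positions of genes longer than 300 codons and `rate_pairs` the human-cow and mouse-rat Ks values of every gene.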

The data set was first restricted to genes definitely present on human autosomes. With this data set we failed to replicate the results of Mouchiroud, Gautier, and Bernardi (see table 1). Notably, whereas they report an r2 of 28%, our autosomal data set gives at most an r2 of around 8%. Possibly of importance, the highest r2 that we found was obtained with the method used by Mouchiroud, Gautier, and Bernardi (i.e., LWL). We also found that K4, whichever method was used, gave no significant evidence of repeatability. This suggests that to some extent the previous high estimate may be a methodological artifact.

One obvious alternative explanation for the discrepancy is that our set of genes, unlike the previous one, is not constrained to longer genes, in which estimates of Ks are more accurate. However, restricting analysis to genes with more than 300 codons, we find that any repeatability all but disappears (see table 1). By contrast, Mouchiroud, Gautier, and Bernardi found the opposite tendency. The biased LWL method reports a weakly significant effect, but no other method reports a significant effect. Note that for this analysis the sample size is close to that employed by Mouchiroud, Gautier, and Bernardi (their data set, N = 85; our long-gene autosomal data set, N = 71).

We appear then to have failed to replicate Mouchiroud, Gautier, and Bernardi's result and can find no convincing evidence that given genes have characteristic synonymous substitution rates. One further possible cause of the discrepancy is that the prior data set took genes regardless of their chromosomal location. We therefore analyzed a data set comprising the known autosomal genes, known X-linked genes (N = 3), and genes of unknown location (N = 7). Again, however, none of the results were highly significant (table 1, N = 116). Again, the method that gave the strongest correlation was LWL, the method implemented by Mouchiroud, Gautier, and Bernardi (1995). K4 again showed the weakest repeatability. Restricting the analysis to longer genes did not increase the strength of repeatability (see table 1).

The discrepancy between the results is probably not because of a difference in the proportion of X-linked genes in the two data sets: we failed to detect a significant difference in the strength of correlation when autosomal and X-linked genes are used compared with when autosomal ones alone are used. This was established by randomly removing three genes, 100 times, from the data set and comparing the repeatability of synonymous substitutions with the data set lacking the three X-linked genes (Ks Li, P = 0.50; Ks LWL, P = 0.42; Ks YN00, P = 0.43; Ks ML, P = 0.51).

It has been hypothesized that repeatability of Ks is expected if (1) Ka is repeatable and (2) Ka covaries with Ks. A selectionist explanation can be provided for both findings. Is our lack of evidence for strong repeatability then possibly caused by Ka not being repeatable or by Ka and Ks not covarying? As might be expected, Ka is strongly and unambiguously gene-specific (see table 1). We also find in the human-cow comparison a strong correlation between Ka and Ks in all methods except YN00 (see table 2), and a weaker correlation in the mouse-rat comparison. The correlation between Ka and K4 is much weaker, however (see table 2). Removing doublets removed the Ka-Ks correlation (see table 2) but made little difference to the evidence for repeatability of Ks (see table 1). This suggests that any repeatability of Ks found using some of the methods is not caused by whatever causes the Ka-Ks correlation.

Possibly the most notable of our findings is that the extent of repeatability is highly method-dependent. Methods that use both twofold and fourfold sites come to conclusions different from those obtained using methods that employ only fourfold sites. The latter never detect repeatability, whereas the former do under some circumstances. The estimates of K4 repeatability are not greatly affected by employing long genes alone, so sample size effects are unlikely. Why then do the different measures of Ks report such drastically different estimates for the extent of repeatability? We suggest that GC content may be of importance.

GC content is strongly repeatable between mammalian orthologs (see table 1) (Bernardi 2000). If a method is biased with respect to GC content, then the repeatability of Ks could simply be an artifact of the repeatability in GC. Importantly, Pesole et al. (1995) showed that methodology was particularly sensitive to GC content, such that at extremes of GC content many methods tended to be inaccurate. Notably, they report that K4 using the TN93 correction correctly recovers GC independence, whereas other corrections (e.g., Kimura 2 parameter), such as those built into LWL, fail to account for biased base composition. We performed similar simulations and found a similar result (data not shown). We also find in our data set that, with LWL, Ks shows the typical (artifactual) inverted U-shaped relationship with GC4. We can then be confident that the repeatability shown using LWL is in some part an artifact of inaccurate Ks estimation at the extremes of GC content.

The weak repeatability shown using Goldman and Yang's maximum likelihood method and its approximation (YN00) is most likely also, in part, a reflection of a relationship between GC4 and Ks. With both, as found previously (Smith and Hurst 1998; Bielawski, Dunn, and Yang 2000; Hurst and Williams 2000), we find that genes with a high GC content also have a high Ks. Under both methods the number of synonymous sites rapidly decreases as third-site GC content tends toward 100%. This is because twofold degenerate codons come in synonymous pairs ending in either G and A or T and C. If GC content is skewed, the method supposes that these twofold degenerate third sites largely represent sites at which only nonsynonymous substitutions can occur. Consequently, almost regardless of the number of synonymous changes, the number of synonymous changes per synonymous site must increase because the number of synonymous sites is plummeting. It is therefore unclear whether these methods are unbiased.
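The arithmetic behind this last point can be illustrated with a small Python sketch. It deliberately uses a simple Jukes-Cantor-style multiple-hit correction as a stand-in, not the YN00 or maximum likelihood estimators, and the counts are hypothetical; the only purpose is to show how a fixed number of synonymous changes yields a rising per-site rate as the estimated number of synonymous sites shrinks.

```python
import math


def corrected_rate(n_syn_differences, n_syn_sites):
    """Jukes-Cantor-corrected substitutions per site (illustrative stand-in only)."""
    p = n_syn_differences / n_syn_sites
    return -0.75 * math.log(1 - 4 * p / 3)


# Hypothetical gene: the same 20 synonymous differences, but fewer and fewer
# sites counted as synonymous as third-site GC content becomes more extreme.
for n_sites in (200, 100, 50):
    print(n_sites, round(corrected_rate(20, n_sites), 3))
```

In this toy case the estimated rate rises each time the site count is halved, even though the number of observed changes never varies.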

To confirm our results we performed a repeatability analysis comparing human-pig with mouse-rat orthologs. Results were much as with the analysis using cows. For example, using just autosomal genes, the repeatability in Ks calculated using Li 1993 was significant (r2 = 9.2%, P = 0.003, N = 91), but using K4 measured by Tamura and Nei, we found no evidence of repeatability (r2 = 1.5%, P = 0.255). As before, restricting analysis to genes with more than 300 codons, we found no evidence of repeatability using either method (Li 1993, r2 = 2.4%, P = 0.301, N = 46; K4 TN, r2 = 0.1%, P = 0.834). Interestingly, we did find repeatability in the larger data set when we used K4 calculated using Kimura 2 parameter. This effect disappeared when we restricted the data set to longer genes. This again suggests that the difference between analyses is caused by methodological differences and GC-based artifacts.

We have been able to reach a few conclusions. We cannot under any circumstance recover the high correlation previously reported (Mouchiroud, Gautier, and Bernardi 1995) between the synonymous substitution rate of orthologs in the human-cow comparison and that in the mouse-rat comparison. In part the strength of the previous correlation may reflect methodological bias. What weak effects we can detect are found only using Ks and never using K4 with a parameter-rich multiple-hit correction method. A priori we expect K4 to be the better measure because Ks has numerous potential GC-related biases. Given the uncertainty over what is fact and what is artifact as regards the GC-Ks correlation, it seems safest to conclude that there is no unambiguous evidence that individual autosomal mammalian genes have their own characteristic synonymous substitution rates. This is consistent with Kumar and Subramanian's (2002) finding that the variation in K4 between genes in a genome may be accounted for by a stochastic model, rather than a deterministic one.

Pekka Pamilo, Reviewing Editor

Address for correspondence and reprints: Laurence D. Hurst, Department of Biology and Biochemistry, University of Bath, Claverton Down, Bath, BA2 7AY, United Kingdom. l.d.hurst@bath.ac.uk

Table 1 The Pearson's Correlation of Human-Cow Substitution Rates with Those for Mouse-Rat Orthologs

Table 2 Pearson's Correlation of Substitution Rates and GC Content in Human-Cow and Mouse-Rat Comparisons, for Genes Longer than 300 Codons

We thank Clare Hamilton for assistance with the pig analysis and BBRSC for funding for L.D.H. We are grateful to the editor and two anonymous referees for comments on an earlier version of the manuscript.

References

Bernardi G. 2000. Isochores and the evolutionary genomics of vertebrates. Gene 241:3-17.

Bernardi G., D. Mouchiroud, C. Gautier. 1993. Silent substitutions in mammalian genomes and their evolutionary implications. J. Mol. Evol. 37:583-589.

Bielawski J. P., K. A. Dunn, Z. H. Yang. 2000. Rates of nucleotide substitution and mammalian nuclear gene evolution: approximate and maximum-likelihood methods lead to different conclusions. Genetics 156:1299-1308.

Bulmer M., K. H. Wolfe, P. M. Sharp. 1991. Synonymous nucleotide substitution rates in mammalian genes: implications for the molecular clock and the relationship of mammalian orders. Proc. Natl. Acad. Sci. USA 88:5974-5978.

Debry R. W., W. F. Marzluff. 1994. Selection on silent sites in the rodent H3 histone gene family. Genetics 138:191-202.

Eyre-Walker A. 1991. An analysis of codon usage in mammals: selection or mutation bias? J. Mol. Evol. 33:442-449.

Goldman N., Z. H. Yang. 1994. Codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol. Biol. Evol. 11:725-736.

Hurst L. D., H. Ellegren. 1998. Sex biases in the mutation rate. Trends Genet. 14:446-452.

Hurst L. D., E. J. B. Williams. 2000. Covariation of GC content and the silent site substitution rate in rodents: implications for methodology and for the evolution of isochores. Gene 261:107-114.

Karlin S., J. Mrazek. 1996. What drives codon choices in human genes. J. Mol. Biol. 262:459-472.

Kimura M. 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120.

Kumar S., S. Subramanian. 2002. Mutation rates in mammalian genomes. Proc. Natl. Acad. Sci. USA 99:803-808.

Li W.-H. 1993. Unbiased estimation of the rates of synonymous and nonsynonymous substitution. J. Mol. Evol. 36:96-99.

Li W.-H., C.-I. Wu, C.-C. Luo. 1985. A new method for estimating synonymous and non-synonymous rates of nucleotide substitution considering the relative likelihood of nucleotide and codon changes. Mol. Biol. Evol. 2:150-174.

Mouchiroud D., C. Gautier, G. Bernardi. 1995. Frequencies of synonymous substitutions in mammals are gene-specific and correlated with frequencies of nonsynonymous substitutions. J. Mol. Evol. 40:107-113.

Pesole G., G. Dellisanti, G. Preparata, C. Saccone. 1995. The importance of base composition in the correct assessment of genetic distance. J. Mol. Evol. 41:1124-1127.

Smith N. G. C., L. D. Hurst. 1998. Sensitivity of patterns of molecular evolution to alterations in methodology: a critique of Hughes and Yeager. J. Mol. Evol. 47:493-500.

Tamura K., M. Nei. 1993. Estimation of the number of nucleotide substitutions in the control region of mitochondrial-DNA in humans and chimpanzees. Mol. Biol. Evol. 10:512-526.

Urrutia A., L. D. Hurst. 2001. Codon usage bias covaries with expression breadth and the rate of synonymous evolution in humans, but this is not evidence for selection. Genetics 159:1191-1199.

Williams E. J. B., C. Pal, L. D. Hurst. 2000. The molecular evolution of signal peptides. Gene 253:313-322.

Wolfe K. H., P. M. Sharp. 1993. Mammalian gene evolution: nucleotide sequence divergence between mouse and rat. J. Mol. Evol. 37:441-456.

Yang Z. H., R. Nielsen. 2000. Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Mol. Biol. Evol. 17:32-43.