Abstract
To test the hypothesis that the nucleotide sequences of primitive informational polymers may not have been chosen at random, and to compare taxa, we compared computer-generated random sequences with tRNA nucleotide sequences from bacterial and archaeal genomes, since tRNA molecules are possible “fossils” of the time (billions of years ago) when life arose. Our approach describes tRNA sequences as random walks and evaluates the distances from the origin with nonlinear indices (largest Lyapunov exponent, entropy, BDS statistic). We studied six different tRNAs from each of ten archaeal and ten bacterial species, both thermophilic and mesophilic (n = 120), together with computer-generated random sequences (n = 50). Our data show that the indices of tRNAs are statistically lower than those of computer-generated random data, that is, tRNAs have a more ordered sequence than random ones (Lyapunov, p < 0.01; entropy, p < 0.05; BDS, p < 0.01). The observed deviation from pure randomness may have arisen from constraints such as the secondary structure of this biological macromolecule, from a “frozen” stochastic transition, or from the possible peculiar origin of tRNA by replication of older proto-RNA. Between taxa, in the species studied, Bacteria show statistically lower BDS and base ratio (G+C)/(A+T) indices than Archaea, together with a 20% increase in entropy. Analysis of a larger number of tRNAs and species will help clarify whether this finding, which indicates higher randomness in bacterial tRNA sequences, is linked to the different base ratio, to the different environments in which the microorganisms live, or to an evolutionary effect.
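As a rough illustration of the random-walk description, the sketch below turns a nucleotide sequence into a series of distances from the origin and builds a computer-generated random sequence for comparison. The abstract does not state how bases map to steps, so the two-dimensional mapping in `STEPS` is an assumption, as are the function names `walk_distances`, `base_ratio` and `random_sequence` and the toy sequence. The nonlinear indices themselves (largest Lyapunov exponent, entropy, BDS statistic) are not computed here; they would be evaluated on the resulting distance series.

```python
import math
import random

# Hypothetical step mapping (an assumption, not taken from the paper):
# each base moves the walker one unit in a 2-D plane.
STEPS = {"A": (0, 1), "T": (0, -1), "U": (0, -1), "G": (1, 0), "C": (-1, 0)}


def walk_distances(seq):
    """Return the Euclidean distance from the origin after each base."""
    x = y = 0
    distances = []
    for base in seq.upper():
        dx, dy = STEPS[base]
        x += dx
        y += dy
        distances.append(math.hypot(x, y))
    return distances


def base_ratio(seq):
    """Return (G+C)/(A+T), counting U as T for RNA sequences.

    Raises ZeroDivisionError if the sequence contains no A, T or U.
    """
    s = seq.upper().replace("U", "T")
    gc = s.count("G") + s.count("C")
    at = s.count("A") + s.count("T")
    return gc / at


def random_sequence(length, seed=None):
    """Draw a sequence with each of the four bases equally likely."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))


if __name__ == "__main__":
    toy = "GGCCAUUA"  # illustrative toy fragment, not a real tRNA
    print(walk_distances(toy))
    print(base_ratio(toy))
    print(walk_distances(random_sequence(len(toy), seed=0)))
```

The distance series of a real tRNA and of many random sequences of the same length could then be passed to whatever estimator of the nonlinear indices is available, and the two groups compared statistically.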
Cite this article
Bianciardi, G., Borruso, L. Nonlinear Analysis of tRNAs Nucleotide Sequences by Random Walks: Randomness and Order in the Primitive Informational Polymers. J Mol Evol 80, 81–85 (2015). https://doi.org/10.1007/s00239-015-9664-1
DOI: https://doi.org/10.1007/s00239-015-9664-1