SNR of DNA sequences mapped by general affine transformations of the indicator sequences

Journal of Mathematical Biology

Abstract

Identifying gene coding regions in DNA sequences with digital signal processing techniques based on the so-called 3-base periodicity is an emerging problem in bioinformatics. The signal-to-noise ratio (SNR) of a DNA sequence is computed after mapping the symbolic DNA sequence into numerical sequences. Typical mapping schemes, such as the Voss, Z-curve and tetrahedron representations, have been used to construct algorithms for detecting gene coding regions. In this paper, we propose an extended definition of SNR that has lower computational cost and wider applicability than the original definition. Furthermore, we analyze the SNRs of different mapping schemes and derive the general relationship between the Voss-based SNR and the SNR of its general affine transformations. We conclude that the SNRs of the Z-curve and tetrahedron maps are linearly proportional to that of the Voss map. This conclusion is instructive for the design of other affine transformations and is significant for understanding the role of symbolic-to-numerical mapping in the detection of gene coding regions.
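
The abstract does not state the SNR formula, so the following Python sketch rests on an assumption: it takes the SNR to be the total spectral power at frequency index N/3 divided by the mean power over the nonzero frequencies, which is one commonly used form and may differ from the paper's extended definition. The function names (voss_map, z_curve_map, snr) are illustrative and do not come from the paper. The sketch maps a sequence with the Voss indicator sequences and with the Z-curve components, written here as signed sums of the indicators, and computes the SNR of each.

```python
import cmath

def voss_map(seq):
    """Voss map: one binary indicator sequence per base, in the order A, C, G, T."""
    seq = seq.upper()
    return [[1 if s == b else 0 for s in seq] for b in "ACGT"]

def z_curve_map(seq):
    """Z-curve components written as signed sums of the Voss indicators."""
    a, c, g, t = voss_map(seq)
    cols = list(zip(a, c, g, t))
    x = [ai + gi - ci - ti for ai, ci, gi, ti in cols]  # purine vs pyrimidine
    y = [ai + ci - gi - ti for ai, ci, gi, ti in cols]  # amino vs keto
    z = [ai + ti - ci - gi for ai, ci, gi, ti in cols]  # weak vs strong H-bond
    return [x, y, z]

def power_at(u, k):
    """Squared magnitude of the DFT coefficient of u at frequency index k."""
    n = len(u)
    coeff = sum(v * cmath.exp(-2j * cmath.pi * k * i / n) for i, v in enumerate(u))
    return abs(coeff) ** 2

def nonzero_power(u):
    """Total power over k = 1..N-1, via Parseval's identity minus the k = 0 term."""
    return len(u) * sum(v * v for v in u) - sum(u) ** 2

def snr(channels):
    """Assumed SNR: power at k = N/3 over the mean power at nonzero frequencies."""
    n = len(channels[0])
    if n == 0 or n % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    peak = sum(power_at(u, n // 3) for u in channels)
    mean = sum(nonzero_power(u) for u in channels) / (n - 1)
    return peak / mean if mean > 0 else 0.0
```

For the perfectly periodic sequence "ATG" * 10, snr(voss_map("ATG" * 10)) evaluates to 14.5 up to floating-point rounding. Under this assumed definition, the Z-curve power equals four times the Voss power at every nonzero frequency, so the two SNRs coincide. The proportionality constant derived in the paper may differ, since it depends on the definition used there.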



Author information

Corresponding author

Correspondence to Jianfeng Shao.

Additional information

The present study was supported in part by the National Basic Research Program of China (2011CBA00800).

About this article

Cite this article

Shao, J., Yan, X. & Shao, S. SNR of DNA sequences mapped by general affine transformations of the indicator sequences. J. Math. Biol. 67, 433–451 (2013). https://doi.org/10.1007/s00285-012-0564-3

  • DOI: https://doi.org/10.1007/s00285-012-0564-3
