Gene
Volume 304, 30 January 2003, Pages 183-192
 
doi:10.1016/S0378-1119(02)01206-4
Copyright © 2002 Elsevier Science B.V. All rights reserved.

Pentamer vocabularies characterizing introns and intron-like intergenic tracts from Caenorhabditis elegans and Drosophila melanogaster

Emanuele Bultrini (a), Elisabetta Pizzi (a, corresponding author), Paolo Del Giudice (b) and Clara Frontali (a)

(a) Laboratorio di Biologia Cellulare, Istituto Superiore di Sanità, Viale Regina Elena 299, 00161, Rome, Italy
(b) Laboratorio di Fisica, Istituto Superiore di Sanità, Rome, Italy

Received 4 July 2002; revised 15 November 2002; accepted 4 December 2002. Received by G. Pesole. Available online 24 January 2003.


Abstract

Overall compositional properties at the level of bases, dinucleotides and longer oligos characterize genomes of different species. In Caenorhabditis elegans, using recurrence analysis, we recognized the existence of a long-range correlation in the oligonucleotide usage of introns and intergenic regions. Through correlation analysis, this is confirmed here to be a genome-wide property of C. elegans non-coding portions. We then investigate the possibility of extracting a typical vocabulary through statistical analysis of experimentally confirmed introns of sufficient length (>1 kb), deprived of known splice signals, the focus being on distributed lexical features rather than on localized motifs. Lexical preferences typical of introns could be exposed using principal component analysis of pentanucleotide frequency distributions, both in C. elegans and in Drosophila melanogaster. In either species, the introns' pentamer preferences are largely shared by intergenic tracts. The pentamer vocabularies extracted for the two species exhibit interesting symmetry properties and overlap in part. A more extensive investigation of the interspecies relationship at the level of oligonucleotide preferences in non-coding regions, not related by sequence similarity, might form the basis of new approaches for the study of the evolutionary behaviour of these regions.

Author Keywords: Introns; Caenorhabditis elegans; Drosophila melanogaster; Linguistic properties

Abbreviations: PCA, principal component analysis; PC1, first principal component; PC2, second principal component; ORF, open reading frame
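
As a rough illustration of the kind of analysis the abstract describes (pentanucleotide frequency distributions followed by PCA), the Python sketch below counts overlapping pentamers in each sequence and extracts PC1 by power iteration. It is not the authors' procedure. The function names, the toy random sequences (one AT-rich group, one uniform group), the sequence length, and the choice of power iteration restricted to PC1 are all assumptions made for illustration.

```python
from itertools import product
import math
import random

K = 5
PENTAMERS = ["".join(p) for p in product("ACGT", repeat=K)]  # 4**5 = 1024 words
INDEX = {word: i for i, word in enumerate(PENTAMERS)}


def pentamer_frequencies(seq):
    """Relative frequencies of all 1024 pentamers over overlapping windows."""
    counts = [0] * len(PENTAMERS)
    seq = seq.upper()
    for i in range(len(seq) - K + 1):
        j = INDEX.get(seq[i:i + K])
        if j is not None:  # windows containing N or other symbols are skipped
            counts[j] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def first_principal_component(rows, iterations=50, seed=0):
    """PC1 loadings and per-row scores, via power iteration on centred data."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Random start: a constant vector would be orthogonal to every centred
    # row, because each frequency vector sums to 1.
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    for _ in range(iterations):
        proj = [sum(x * w for x, w in zip(row, v)) for row in centred]
        v = [sum(s * row[j] for s, row in zip(proj, centred)) for j in range(d)]
        norm = math.sqrt(sum(w * w for w in v))
        if norm == 0.0:  # no variance in the data
            break
        v = [w / norm for w in v]
    scores = [sum(x * w for x, w in zip(row, v)) for row in centred]
    return v, scores


def top_pentamers(loadings, n=10):
    """Pentamers with the largest absolute loading on the component."""
    order = sorted(range(len(loadings)), key=lambda i: -abs(loadings[i]))
    return [(PENTAMERS[i], loadings[i]) for i in order[:n]]


# Toy data (hypothetical): six AT-rich and six uniform random sequences.
rng = random.Random(42)
at_rich = ["".join(rng.choices("ACGT", weights=[4, 1, 1, 4], k=5000))
           for _ in range(6)]
uniform = ["".join(rng.choices("ACGT", k=5000)) for _ in range(6)]

rows = [pentamer_frequencies(s) for s in at_rich + uniform]
pc1, scores = first_principal_component(rows)
vocabulary = top_pentamers(pc1, n=5)
```

In this toy setting the PC1 scores split the two groups, and the pentamers with the largest loadings play the role of a "vocabulary" that distinguishes one group from the other. With real data, each row would come from an annotated intron, intergenic tract, or coding sequence rather than from a random generator.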

Article Outline

1. Introduction
2. Methods
2.1. Data source
2.2. Correlation analysis
2.3. PCA
3. Results
3.1. Correlation analysis
3.2. Caenorhabditis elegans introns' vocabulary
3.3. Symmetry properties of the C. elegans introns' vocabulary
3.4. Genomic distribution of the introns' vocabulary
3.5. Drosophila melanogaster introns' vocabulary
4. Discussion
Acknowledgements
References







