Classification of intra-genomic helitrons based on features extracted from different orders of FCGS

https://doi.org/10.1016/j.imu.2019.100271
Under a Creative Commons license
open access

Abstract

Helitrons, eukaryotic transposable elements (TEs), were discovered 18 years ago in various genomes. In the Caenorhabditis elegans (C. elegans) genome, helitron sequences vary widely in length, from 11 to 8965 base pairs (bp). These TEs are not uniformly dispersed, and they can mobilize within a genome through a rolling-circle mechanism. This ability to move and reproduce in genomes enables these elements to play a major role in genomic evolution. To follow this evolution, we predicted helitron families (10 classes) in the C. elegans genome by combining features extracted from signals that represent the DNA sequences with a Support Vector Machine (SVM) classifier. In our classification system, the features extracted from the signals proved effective for automatically predicting helitronic sequences. The Gaussian radial kernel, evaluated with 100-fold cross-validation, gave the best accuracy rates, ranging from 68% to 97%, with an overall mean score of 83.7%, and we successfully identified the Helitron Y1A class for specific values of c and gamma, reaching an accuracy rate of 100%. In addition, other notable helitrons (NDNAX2, NDNAX3, Helitron_Y2) were predicted with interesting accuracy rates.
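To make the idea of a signal derived from a DNA sequence concrete, the following is a minimal Python sketch of one plausible reading of FCGS coding of order k: every k-mer in a sequence is replaced by its relative frequency within that sequence, giving one numeric value per position. The function names (kmer_frequencies, fcgs_signal), the handling of non-ACGT characters, and the toy sequence are assumptions made for illustration; the abstract does not specify the exact coding or the features computed from it.

```python
from collections import Counter


def kmer_frequencies(sequence, k):
    """Relative frequency of each k-mer made only of A, C, G, T.

    Assumption: k-mers containing other symbols (e.g. N) are ignored.
    """
    seq = sequence.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    kmers = [km for km in kmers if set(km) <= set("ACGT")]
    counts = Counter(kmers)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {km: n / total for km, n in counts.items()}


def fcgs_signal(sequence, k):
    """Hypothetical order-k signal: one frequency value per k-mer position.

    Positions whose k-mer was ignored above receive the value 0.0.
    """
    seq = sequence.upper()
    freqs = kmer_frequencies(seq, k)
    return [freqs.get(seq[i:i + k], 0.0) for i in range(len(seq) - k + 1)]


if __name__ == "__main__":
    toy = "ACGTACGGTA"  # made-up sequence, not a real helitron
    for order in (1, 2, 3):
        print(order, fcgs_signal(toy, order))
```

Signals built this way for several orders k could then be summarized into fixed-length feature vectors and passed to an SVM with a Gaussian radial kernel, whose c and gamma parameters are tuned under cross-validation; which summary features to use is left open here because the abstract does not state them.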

Keywords

Helitrons classification
Signal
FCGS coding technique
Machine learning
SVM


Rabeb Touati: PhD in electrical engineering from the National Engineering School of Tunisia (ENIT). She currently holds a postdoctoral position at the Laboratory of Human Genetics (LR-GH) at the Faculty of Medicine of Tunis (FMT). Her research interests include genomic signal processing and machine learning.

Imen Messaoudi: PhD in electrical engineering from the National Engineering School of Tunisia. She is Assistant Professor at the Higher Institute of Information Technologies and Communications (ISTIC) at Carthage University. Her research interests include pattern recognition and biomedical and genomic signal processing.

Afef Elloumi Oueslati: PhD in electrical engineering from the National Engineering School of Tunisia (ENIT). She is Associate Professor at the National School of Engineers of Carthage (ENICarthage). Her research interests include signal and image processing applied to the biomedical and genomic fields.

Zied Lachiri: PhD in electrical engineering from the National Engineering School of Tunisia (ENIT). He is Professor and Research Director in the Signal, Image and Information Technology laboratory (LR-SITI, ENIT). His research interests include pattern recognition, and signal and image processing in biomedical, multimedia, and man-machine communication.

Maher Kharrat: PhD in Human Genetics from the Faculty of Medicine of Tunis (FMT). He is Associate Professor and Research Director in the Human Genetics laboratory (LR99ES10) at the Faculty of Medicine of Tunis (FMT). He currently works at the Faculty of Medicine, University of Tunis El Manar. Dr. Kharrat does research in the field of human genetics.