Robust Identification of Noncoding RNA from Transcriptomes Requires Phylogenetically-Informed Sampling

Figure 5

Comparative analysis of RNA-seq datasets in the Goldilocks Zone is a powerful approach for identifying RUFs.

This figure illustrates data for three exemplar RUFs that show high covariation, have conserved predicted secondary structures, and are each derived from one of the Goldilocks Zone clades shown in Figure 4. (A–C) The expression levels inferred from RNA-seq in the region encompassing each RUF. The regions contain a mix of ncRNAs (red arrows), protein-coding genes (blue arrows) and a RUF (red arrow). For each nucleotide, the total number of reads that map to that nucleotide was computed, and these counts are presented as a heatmap; darker colours indicate high relative expression, lighter colours indicate low expression, and black indicates a gap in the genomic alignment of the sequences for the loci. (D–F) R2R [68] representations of the predicted consensus secondary structures for exemplar RNAs of Unknown Function (RUFs) selected from the Enterobacteriaceae, Pseudomonas and Xanthomonadaceae data. Covariation is highlighted in green, structure-neutral variation in blue, and highly conserved regions in pink. The Enterobacteriaceae RUF contains a conserved tetraloop of the GNRA or UNCG type, and there have been two independent insertions of hairpins, in S. enterica and K. pneumoniae, within the first hairpin. The Pseudomonas RUF hosts a 3′ rho-independent transcription terminator.
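
The per-nucleotide read counts behind panels A–C can be sketched in a few lines. This is a minimal illustration only, not the pipeline used for the figure: the function name `per_base_coverage`, the half-open `(start, end)` read intervals and the toy coordinates are all assumptions made for the example.

```python
from typing import List, Tuple


def per_base_coverage(
    reads: List[Tuple[int, int]], region_start: int, region_end: int
) -> List[int]:
    """Count the reads overlapping each position in [region_start, region_end).

    Reads are assumed to be half-open (start, end) intervals in the same
    coordinate system as the region (a hypothetical input format).
    """
    length = region_end - region_start
    # Difference array: +1 where a read begins, -1 just past where it ends.
    diff = [0] * (length + 1)
    for start, end in reads:
        # Clip each read to the region of interest.
        s = max(start, region_start) - region_start
        e = min(end, region_end) - region_start
        if s < e:  # skip reads that do not overlap the region
            diff[s] += 1
            diff[e] -= 1

    # A running sum over the difference array gives the depth at each base.
    coverage = []
    running = 0
    for delta in diff[:length]:
        running += delta
        coverage.append(running)
    return coverage


# Toy example: three reads over a six-nucleotide region.
depth = per_base_coverage([(0, 3), (2, 5), (4, 10)], 0, 6)
print(depth)  # [1, 1, 2, 1, 2, 1]
```

Dividing each count by the maximum within a region would give a relative scale of the kind a heatmap colour ramp needs; how the published figure was normalised is not stated in the caption.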

doi: https://doi.org/10.1371/journal.pcbi.1003907.g005