Thousands of corresponding human and mouse genomic regions unalignable in primary sequence contain common RNA structure

  1. Elfar Torarinsson [1,2,3]
  2. Milena Sawera [1,3]
  3. Jakob H. Havgaard [1]
  4. Merete Fredholm [1]
  5. Jan Gorodkin [1,4]

  [1] Division of Genetics and Bioinformatics, IBHV, The Royal Veterinary and Agricultural University, 1870 Frederiksberg C, Denmark
  [2] Department of Natural Sciences, The Royal Veterinary and Agricultural University, 1870 Frederiksberg C, Denmark
  [3] These authors contributed equally to this work.

Abstract

The human and mouse genome sequences contain roughly 100,000 regions that are unalignable in primary sequence and that neighbor corresponding alignable regions in both organisms. These pairs of regions are generally assumed to be nonconserved, although the level of structural conservation between them has never been investigated. Owing to limitations in computational methods, comparative genomics has lacked the ability to search such nonconserved sequence regions for conserved structural RNA elements. We investigated the presence of structural RNA elements by conducting local structural alignment with FOLDALIGN on a subset of these 100,000 corresponding regions, and we estimate that 1800 contain common RNA structures. Comparing our results with the recent mapping of transcribed fragments (transfrags) in human, we find that high-scoring candidates are twice as likely to be found in regions overlapped by transfrags as in regions that are not. To verify the coexpression of predicted candidates in human and mouse, we conducted expression studies by RT-PCR and Northern blotting on mouse candidates that overlap with transfrags on human chromosome 20. RT-PCR confirmed expression of 32 of 36 candidates, whereas Northern blots confirmed four of 12 candidates. Furthermore, many RT-PCR results indicate differential expression across tissues. Hence, our findings suggest that there are corresponding regions between human and mouse that contain expressed non-coding RNA sequences not alignable in primary sequence.
