Supporting data for "Leveraging multiple transcriptome assembly methods for improved gene structure annotation"

Dataset type: Software, Transcriptomic
Data released on July 04, 2018

Venturini L; Caim S; Kaithakottil GG; Mapleson DL; Swarbreck D (2018): Supporting data for "Leveraging multiple transcriptome assembly methods for improved gene structure annotation" GigaScience Database. https://doi.org/10.5524/100464

DOI: 10.5524/100464

The performance of RNA-Seq aligners and assemblers varies greatly across different organisms and experiments, and often the optimal approach is not known beforehand. Here we show that the accuracy of transcript reconstruction can be boosted by combining multiple methods, and we present a novel algorithm to integrate multiple RNA-Seq assemblies into a coherent transcript annotation. Our algorithm can remove redundancies and select the best transcript models according to user-specified metrics, while solving common artefacts such as erroneous transcript chimerisms. We have implemented this method in an open-source Python3 and Cython program, Mikado, available at https://github.com/lucventurini/Mikado.

Additional details

Read the peer-reviewed publication(s):

  • Venturini, L., Caim, S., Kaithakottil, G. G., Mapleson, D. L., & Swarbreck, D. (2018). Leveraging multiple transcriptome assembly methods for improved gene structure annotation. GigaScience, 7(8). https://doi.org/10.1093/gigascience/giy093 (PubMed:30052957)

Additional information:

https://github.com/lucventurini/mikado

https://github.com/lucventurini/mikado-analysis

Accessions (data included in GigaDB):

BioProject: PRJEB22606

Accessions (data not in GigaDB):

BioProject: PRJEB4208
BioProject: PRJEB7093

Sample ID | Common Name | Scientific Name | Sample Attributes | Taxonomic ID | Genbank Name
SAM20465 | Human | Homo sapiens | Description: RNA-seq transcript sequences of Strata...; Alternative names: human; Alternative accession-BioSample: SAM20465; ... | 9606 | human
SAMEA2149758 | roundworm | Caenorhabditis elegans | Description: RNA-seq transcript assembly evaluation...; Alternative names: roundworm; Alternative accession-BioSample: SAMEA2149758; ... | 6239 |
SAMEA2144177 | roundworm | Caenorhabditis elegans | Description: RNA-seq transcript assembly evaluation...; Alternative names: roundworm; Alternative accession-BioSample: SAMEA2144177; ... | 6239 |
SAMEA2161067 | roundworm | Caenorhabditis elegans | Description: RNA-seq transcript assembly evaluation...; Alternative names: roundworm; Alternative accession-BioSample: SAMEA2161067; ... | 6239 |
SAMEA2152280 | Human | Homo sapiens | Description: RNA-seq transcript assembly evaluation...; Alternative names: human; Alternative accession-BioSample: SAMEA2152280; ... | 9606 | human
SAMEA2157650 | Human | Homo sapiens | Description: RNA-seq transcript assembly evaluation...; Alternative names: human; Alternative accession-BioSample: SAMEA2157650; ... | 9606 | human
SAMEA2159595 | Human | Homo sapiens | Description: RNA-seq transcript assembly evaluation...; Alternative names: human; Alternative accession-BioSample: SAMEA2159595; ... | 9606 | human
SAMEA2163240 | Human | Homo sapiens | Description: RNA-seq transcript assembly evaluation...; Alternative names: human; Alternative accession-BioSample: SAMEA2163240; ... | 9606 | human
SAMEA2152217 | | Drosophila melanogaster | Description: RNA-seq transcript assembly evaluation...; Alternative names: fruit fly; Alternative accession-BioSample: SAMEA2152217; ... | 7227 | fruit fly
SAMEA2152327 | | Drosophila melanogaster | Description: RNA-seq transcript assembly evaluation...; Alternative names: fruit fly; Alternative accession-BioSample: SAMEA2152327; ... | 7227 | fruit fly

Description | Data Type | File Format | Size | Release Date | File Attributes
(none) | Readme | TEXT | 5.55 kB | 2018-06-26 |
Archival copy of the GitHub repository https://github.com/lucventurini/mikado, downloaded 24-May-2018. Mikado - pick your transcript: a pipeline to determine and select the best RNA-Seq prediction | GitHub archive | archive | 31.74 MB | 2018-06-26 | MD5 checksum: 079d0668ced5c853e59e6a5a17a92227
Archival copy of the GitHub repository https://github.com/lucventurini/mikado-analysis, downloaded 24-May-2018. This repository contains the scripts used for the Mikado analyses. | GitHub archive | archive | 18.06 MB | 2018-06-26 | MD5 checksum: a7bf04105fb0855b28bfb0a0a130a581
Assemblies derived from real data, downloaded 26-Jun-2018. https://figshare.com/projects/Leveraging_multiple_transcriptome_assembly_methods_for_improved_gene_structure_annotation/26149 | Transcriptome sequence | archive | 98.46 MB | 2018-06-26 | MD5 checksum: 7336463437bfdd26ad9258f05ea3ab5f
Assemblies derived from simulated data, downloaded 26-Jun-2018. https://figshare.com/projects/Leveraging_multiple_transcriptome_assembly_methods_for_improved_gene_structure_annotation/26149 | Transcriptome sequence | archive | 68.17 MB | 2018-06-26 | MD5 checksum: 67db903cad24c22e56d30001c5930134
StringTie and CLASS2 assemblies derived by varying the Minimum Isoform Fraction (MIF) parameter, downloaded 26-Jun-2018. https://figshare.com/projects/Leveraging_multiple_transcriptome_assembly_methods_for_improved_gene_structure_annotation/26149 | Transcriptome sequence | archive | 90.89 MB | 2018-06-26 | MD5 checksum: fa84256c96af36bc3b215d2c9e24667b
Assemblies derived from real data using multiple RNA-Seq samples of A. thaliana, downloaded 26-Jun-2018. https://figshare.com/projects/Leveraging_multiple_transcriptome_assembly_methods_for_improved_gene_structure_annotation/26149 | Transcriptome sequence | archive | 194.11 MB | 2018-06-26 | MD5 checksum: b319a4d62f5d425e3b71e06427756b5c
Alignments and assemblies of Illumina and PacBio reads, downloaded 26-Jun-2018. https://figshare.com/projects/Leveraging_multiple_transcriptome_assembly_methods_for_improved_gene_structure_annotation/26149 | Alignments | archive | 56.33 MB | 2018-06-26 | MD5 checksum: 5bf765ec038c37b5d42dba60225c79f9
Comparisons for the real and simulated datasets, downloaded 26-Jun-2018. https://figshare.com/projects/Leveraging_multiple_transcriptome_assembly_methods_for_improved_gene_structure_annotation/26149 | Expression data | archive | 638.38 MB | 2018-06-26 | MD5 checksum: eb4a075ee93436410bfaedc07f813baa
Comparisons for StringTie/CLASS2 and the derived Mikado runs obtained by varying the MIF parameter, downloaded 26-Jun-2018. https://figshare.com/projects/Leveraging_multiple_transcriptome_assembly_methods_for_improved_gene_structure_annotation/26149 | Expression data | archive | 144.60 MB | 2018-06-26 | MD5 checksum: 50500190a62a409575c2ad955001da87
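
Each archive in the file listing carries an MD5 checksum, so a downloaded copy can be checked for corruption before use. The following is a minimal sketch of such a check using only the Python standard library. The file names in EXPECTED are hypothetical placeholders, since the listing does not show the real ones; only the checksum strings are taken from the listing.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Compare a file's MD5 digest with an expected hex string, ignoring case."""
    return md5_of_file(path) == expected.strip().lower()


# Checksums copied from the file listing. The file names are hypothetical
# placeholders (an assumption), because the listing does not show them.
EXPECTED = {
    "mikado_github_archive.tar.gz": "079d0668ced5c853e59e6a5a17a92227",
    "mikado_analysis_github_archive.tar.gz": "a7bf04105fb0855b28bfb0a0a130a581",
}

if __name__ == "__main__":
    for name, checksum in EXPECTED.items():
        if not Path(name).exists():
            print(f"{name}: not found, skipped")
        elif verify_md5(name, checksum):
            print(f"{name}: OK")
        else:
            print(f"{name}: checksum mismatch")
```

MD5 is adequate here for detecting accidental corruption during transfer; it is not a safeguard against deliberate tampering.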

Funding body | Awardee | Award ID | Comments
Biotechnology and Biological Sciences Research Council | Federica Di Palma | BB/CSP1720/1 | Core Strategic Programme Grant
Biotechnology and Biological Sciences Research Council | Neil Hall | BB/CCG1720/1 | Capability in Genomics and Single Cell
Biotechnology and Biological Sciences Research Council | Ksenia Krasileva | BB/J003743/1 | Strategic LOLA Award

Date | Action
July 4, 2018 | Dataset publish
August 22, 2018 | Manuscript Link added: 10.1093/gigascience/giy093
March 6, 2019 | Sample Attribute added: Sample SAMEA2152327 (6 entries)
June 17, 2021 | Sample Attribute added: Sample SAMPLE:SAM20465 (6 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2161067 (7 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2149758 (7 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2144177 (7 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2161067 (7 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2163240 (6 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2159595 (6 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2157650 (6 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2152280 (6 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2152217 (6 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA1969505 (6 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2162985 (6 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2145518 (6 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2152327 (6 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2725016 (7 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2725012 (7 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2725011 (7 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2725014 (7 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2725018 (7 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2725017 (7 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2725019 (7 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2725008 (7 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2725010 (7 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2725015 (7 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2725013 (7 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2725009 (7 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2149758 (6 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2144177 (6 entries)
June 17, 2021 | Sample Attribute added: Sample SAMEA2161067 (6 entries)
November 11, 2022 | Manuscript Link updated: 10.1093/gigascience/giy093