Supporting data for "Leveraging multiple transcriptome assembly methods for improved gene structure annotation"
Dataset type: Software, Transcriptomic
Data released on July 04, 2018
Venturini L; Caim S; Kaithakottil GG; Mapleson DL; Swarbreck D (2018): Supporting data for "Leveraging multiple transcriptome assembly methods for improved gene structure annotation" GigaScience Database. https://doi.org/10.5524/100464
The performance of RNA-Seq aligners and assemblers varies greatly across different organisms and experiments, and often the optimal approach is not known beforehand. Here we show that the accuracy of transcript reconstruction can be boosted by combining multiple methods, and we present a novel algorithm to integrate multiple RNA-Seq assemblies into a coherent transcript annotation. Our algorithm can remove redundancies and select the best transcript models according to user-specified metrics, while solving common artefacts such as erroneous transcript chimerisms. We have implemented this method in an open-source Python3 and Cython program, Mikado, available at https://github.com/lucventurini/Mikado.
Additional details
Read the peer-reviewed publication(s):
- Venturini, L., Caim, S., Kaithakottil, G. G., Mapleson, D. L., & Swarbreck, D. (2018). Leveraging multiple transcriptome assembly methods for improved gene structure annotation. GigaScience, 7(8). https://doi.org/10.1093/gigascience/giy093 (PubMed:30052957)
Additional information:
https://github.com/lucventurini/mikado
https://github.com/lucventurini/mikado-analysis
Accessions (data included in GigaDB):
BioProject: PRJEB22606
Accessions (data not in GigaDB):
Sample ID | Common Name | Scientific Name | Sample Attributes | Taxonomic ID | Genbank Name |
---|---|---|---|---|---|
SAM20465 | Human | Homo sapiens | Description: RNA-seq transcript sequences of Strata... Alternative names: human Alternative accession-BioSample: SAM20465 ... | 9606 | human |
SAMEA2149758 | roundworm | Caenorhabditis elegans | Description: RNA-seq transcript assembly evaluation... Alternative names: roundworm Alternative accession-BioSample: SAMEA2149758 ... | 6239 | |
SAMEA2144177 | roundworm | Caenorhabditis elegans | Description: RNA-seq transcript assembly evaluation... Alternative names: roundworm Alternative accession-BioSample: SAMEA2144177 ... | 6239 | |
SAMEA2161067 | roundworm | Caenorhabditis elegans | Description: RNA-seq transcript assembly evaluation... Alternative names: roundworm Alternative accession-BioSample: SAMEA2161067 ... | 6239 | |
SAMEA2152280 | Human | Homo sapiens | Description: RNA-seq transcript assembly evaluation... Alternative names: human Alternative accession-BioSample: SAMEA2152280 ... | 9606 | human |
SAMEA2157650 | Human | Homo sapiens | Description: RNA-seq transcript assembly evaluation... Alternative names: human Alternative accession-BioSample: SAMEA2157650 ... | 9606 | human |
SAMEA2159595 | Human | Homo sapiens | Description: RNA-seq transcript assembly evaluation... Alternative names: human Alternative accession-BioSample: SAMEA2159595 ... | 9606 | human |
SAMEA2163240 | Human | Homo sapiens | Description: RNA-seq transcript assembly evaluation... Alternative names: human Alternative accession-BioSample: SAMEA2163240 ... | 9606 | human |
SAMEA2152217 | | Drosophila melanogaster | Description: RNA-seq transcript assembly evaluation... Alternative names: fruit fly Alternative accession-BioSample: SAMEA2152217 ... | 7227 | fruit fly |
SAMEA2152327 | | Drosophila melanogaster | Description: RNA-seq transcript assembly evaluation... Alternative names: fruit fly Alternative accession-BioSample: SAMEA2152327 ... | 7227 | fruit fly |
| File Name | Description | Sample ID | Data Type | File Format | Size | Release Date | File Attributes |
|---|---|---|---|---|---|---|---|
| | | | Readme | TEXT | 5.55 kB | 2018-06-26 | |
| | Archival copy of the GitHub repository https://github.com/lucventurini/mikado, downloaded 24-May-2018. Mikado - pick your transcript: a pipeline to determine and select the best RNA-Seq prediction | | GitHub archive | archive | 31.74 MB | 2018-06-26 | MD5 checksum: 079d0668ced5c853e59e6a5a17a92227 |
| | Archival copy of the GitHub repository https://github.com/lucventurini/mikado-analysis, downloaded 24-May-2018. This repository contains the scripts used for the Mikado analyses. | | GitHub archive | archive | 18.06 MB | 2018-06-26 | MD5 checksum: a7bf04105fb0855b28bfb0a0a130a581 |
| | Assemblies derived from real data, downloaded 26-Jun-2018. https://figshare.com/projects/Leveraging_multiple_transcriptome_assembly_methods_for_improved_gene_structure_annotation/26149 | | Transcriptome sequence | archive | 98.46 MB | 2018-06-26 | MD5 checksum: 7336463437bfdd26ad9258f05ea3ab5f |
| | Assemblies derived from simulated data, downloaded 26-Jun-2018. https://figshare.com/projects/Leveraging_multiple_transcriptome_assembly_methods_for_improved_gene_structure_annotation/26149 | | Transcriptome sequence | archive | 68.17 MB | 2018-06-26 | MD5 checksum: 67db903cad24c22e56d30001c5930134 |
| | StringTie and CLASS2 assemblies derived by varying the Minimum Isoform Fraction parameter, downloaded 26-Jun-2018. https://figshare.com/projects/Leveraging_multiple_transcriptome_assembly_methods_for_improved_gene_structure_annotation/26149 | | Transcriptome sequence | archive | 90.89 MB | 2018-06-26 | MD5 checksum: fa84256c96af36bc3b215d2c9e24667b |
| | Assemblies derived from real data using multiple RNA-Seq samples of A. thaliana, downloaded 26-Jun-2018. https://figshare.com/projects/Leveraging_multiple_transcriptome_assembly_methods_for_improved_gene_structure_annotation/26149 | | Transcriptome sequence | archive | 194.11 MB | 2018-06-26 | MD5 checksum: b319a4d62f5d425e3b71e06427756b5c |
| | Alignments and assemblies of Illumina and PacBio reads, downloaded 26-Jun-2018. https://figshare.com/projects/Leveraging_multiple_transcriptome_assembly_methods_for_improved_gene_structure_annotation/26149 | | Alignments | archive | 56.33 MB | 2018-06-26 | MD5 checksum: 5bf765ec038c37b5d42dba60225c79f9 |
| | Comparisons for the real and simulated datasets, downloaded 26-Jun-2018. https://figshare.com/projects/Leveraging_multiple_transcriptome_assembly_methods_for_improved_gene_structure_annotation/26149 | | Expression data | archive | 638.38 MB | 2018-06-26 | MD5 checksum: eb4a075ee93436410bfaedc07f813baa |
| | Comparisons for StringTie/CLASS and derived Mikados obtained by varying the MIF parameter, downloaded 26-Jun-2018. https://figshare.com/projects/Leveraging_multiple_transcriptome_assembly_methods_for_improved_gene_structure_annotation/26149 | | Expression data | archive | 144.60 MB | 2018-06-26 | MD5 checksum: 50500190a62a409575c2ad955001da87 |
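
The MD5 checksums listed for the archives can be used to confirm that a download is intact. The following is a minimal sketch in Python, using only the standard-library `hashlib` module; the helper names `md5_of_file` and `verify_md5` and the local file name in the usage note are illustrative assumptions, not part of the dataset or of Mikado.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Return True if the file's MD5 digest matches the expected hex string.

    The comparison ignores case and surrounding whitespace in `expected`.
    """
    return md5_of_file(path) == expected.strip().lower()
```

For example, assuming the Mikado repository archive was saved locally under the hypothetical name `mikado_archive.tar.gz`, calling `verify_md5("mikado_archive.tar.gz", "079d0668ced5c853e59e6a5a17a92227")` would return `True` only if the file matches the checksum listed for that archive.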
Funding body | Awardee | Award ID | Comments |
---|---|---|---|
Biotechnology and Biological Sciences Research Council | Federica Di Palma | BB/CSP1720/1 | Core Strategic Programme Grant |
Biotechnology and Biological Sciences Research Council | Neil Hall | BB/CCG1720/1 | Capability in Genomics and Single Cell |
Biotechnology and Biological Sciences Research Council | Ksenia Krasileva | BB/J003743/1 | Strategic LOLA Award |
Date | Action |
---|---|
July 4, 2018 | Dataset publish |
August 22, 2018 | Manuscript Link added : 10.1093/gigascience/giy093 |
March 6, 2019 | Sample Attribute added : of Sample SAMEA2152327 |
June 17, 2021 | Sample Attribute added : of Sample SAMPLE:SAM20465 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2161067 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2149758 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2144177 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2163240 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2159595 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2157650 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2152280 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2152217 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA1969505 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2162985 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2145518 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2152327 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2725016 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2725012 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2725011 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2725014 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2725018 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2725017 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2725019 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2725008 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2725010 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2725015 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2725013 |
June 17, 2021 | Sample Attribute added : of Sample SAMEA2725009 |
November 11, 2022 | Manuscript Link updated : 10.1093/gigascience/giy093 |