Data supporting the 1000 plant (1KP) transcriptomes initiative

Dataset type: Transcriptomic
Data released on October 23, 2019

Carpenter EJ; Matasci N; Ayyampalayam S; Wu S; Sun J; Yu J; Jimenez Vieira FR; Bowler C; Dorrell RG; Gitzendanner MA; Li L; Du W; Ullrich K; Wickett NJ; Barkmann TJ; Barker MS; Leebens-Mack JH; Wong GK (2019): Data supporting the 1000 plant (1KP) transcriptomes initiative. GigaScience Database. https://doi.org/10.5524/100627

DOI: 10.5524/100627

The 1000 Plants (1KP) transcriptomes initiative explored the genetic diversity of green plants (Viridiplantae) by sequencing RNA from 1,342 samples representing 1,173 species. All of the analyses done for the 1KP capstone, and previous studies on subsets of these data, are based on a series of de novo transcriptome assemblies and related outputs that will be described in the accompanying GigaScience publication. We also describe assessments of the data quality and an analysis to remove cross-contamination between the samples. These data will be useful to researchers with interests in specific gene families, either across the green plant tree of life or in more focused lineages.

Additional details

Read the peer-reviewed publication(s):

  • Carpenter, E. J., Matasci, N., Ayyampalayam, S., Wu, S., Sun, J., Yu, J., Jimenez Vieira, F. R., Bowler, C., Dorrell, R. G., Gitzendanner, M. A., Li, L., Du, W., Ullrich, K. K., Wickett, N. J., Barkmann, T. J., Barker, M. S., Leebens-Mack, J. H., & Wong, G. K.-S. (2019). Access to RNA-sequencing data from 1,173 plant species: The 1000 Plant transcriptomes initiative (1KP). GigaScience, 8(10). https://doi.org/10.1093/gigascience/giz126 (PubMed:31644802)

Related datasets:

doi:10.5524/100627 IsPreviousVersionOf doi:10.5524/100910 (doi:10.5524/100910 is a more recent version of this dataset)


Accessions (data included in GigaDB):

BioProject: PRJEB4921
BioProject: PRJEB8056
BioProject: PRJEB21674
STUDY: SRP012845
BioProject: PRJNA163187

| Sample ID | Common Name | Scientific Name | Sample Attributes | Taxonomic ID | Genbank Name |
| URDJ | | Amborella trichopoda | Description:RNA extracted from leaves of Amborella...; Analyte type:RNA; Tissue:leaf [BTO:0000713]; ... | 13333 | |
| WTKZ | | Nuphar advena | Description:RNA extracted from young leaves of Nup...; Analyte type:RNA; Tissue:juvenile leaf [BTO:0003147]; ... | 77108 | |
| ROAP | | Illicium parviflorum | Description:RNA extracted from leaves of Illicium ...; Analyte type:RNA; Tissue:leaf [BTO:0000713]; ... | 13099 | |
| VZCI | Florida anisetree | Illicium floridanum | Description:RNA extracted from young leaves of Ill...; Analyte type:RNA; Tissue:juvenile leaf [BTO:0003147]; ... | 13098 | |
| NWMY | | Kadsura heteroclita | Description:RNA extracted from leaves of Kadsura h...; Analyte type:RNA; Tissue:leaf [BTO:0000713]; ... | 124781 | |
| FZJL | | Austrobaileya scandens | Sample storage condition:RNA extract; Description:RNA extracted from young shoot of Aust...; Analyte type:RNA; ... | 13351 | |
| NPND | hornwort | Ceratophyllum demersum | Description:RNA extracted from leaves of Ceratophy...; Analyte type:RNA; Tissue:leaf [BTO:0000713]; ... | 4428 | |
| DDEV | barbasco | Canella winterana | Description:RNA extracted from young leaves of Can...; Analyte type:RNA; Tissue:juvenile leaf [BTO:0003147]; ... | 3426 | |
| IFCJ | barbasco | Canella winterana | Description:RNA extracted from young leaves of Can...; Analyte type:RNA; Tissue:juvenile leaf [BTO:0003147]; ... | 3426 | |
| WKSU | | Drimys winteri | Alternative accession-SRA Study:ERP023948; Alternative accession-BioSample:SAMEA104170167; Specimen voucher:Soltis and Miles 2843; ... | 3419 | |
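
The sample attributes in the table above follow a "key:value" pattern, and some values end with an ontology term in square brackets, for example "Tissue:leaf [BTO:0000713]". The sketch below shows one way such strings could be split into their parts. The function name `parse_attribute` and the returned dictionary layout are illustrative assumptions, not part of GigaDB, and the pattern is inferred only from the rows shown here.

```python
import re

# Assumed pattern: an optional trailing ontology term such as "[BTO:0000713]".
_ONTOLOGY_ID = re.compile(r"\s*\[([A-Za-z]+:\d+)\]\s*$")


def parse_attribute(line):
    """Split one attribute string into key, value and optional ontology ID.

    Returns None for strings without a colon, e.g. the "..." truncation marker.
    """
    key, sep, value = line.partition(":")
    if not sep:
        return None
    match = _ONTOLOGY_ID.search(value)
    term_id = match.group(1) if match else None
    if match:
        # Drop the bracketed term so only the plain label remains.
        value = value[: match.start()]
    return {"key": key.strip(), "value": value.strip(), "term_id": term_id}


if __name__ == "__main__":
    for text in ("Analyte type:RNA", "Tissue:leaf [BTO:0000713]", "..."):
        print(parse_attribute(text))
```

Truncated descriptions (those ending in "...") are passed through unchanged, since the missing text cannot be recovered from this page.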

| File Name | Description | Sample ID | Data Type | File Format | Size | Release Date | File Attributes | Download |
| | | | Readme | TEXT | 2.60 kB | 2019-07-15 | | |
| | Clean and contaminant FASTA sequence files for each library | | Mixed archive | GZIP | 5.93 GB | 2019-07-15 | MD5 checksum: 05534e4a2d055ccc0d6ebdd48f332daa | |
| | Lists whether the sample is overall judged to be validated by the SILVA-based SSU check, i.e. contains the expected 18S sequence, and whether the sample has alignments to any other plant sequences (described as worrisome contamination). | | Tabular data | CSV | 56.94 kB | 2019-07-15 | MD5 checksum: a1935744d0b756e18e8bc6407ff5a6fe | |
| | Lists each scaffold identified as being an 18S sequence, and which reference sequence it matched against. | | Tabular data | CSV | 686.35 kB | 2019-07-15 | MD5 checksum: c7aa46a1a2e981a7a05549fc5bf2aca1 | |
| | Listing of the detected contaminant scaffolds and the corresponding original source scaffold | | Tabular data | CSV | 5.29 MB | 2019-07-15 | MD5 checksum: 85c140a7d4eb44192449dbc57bb2415d | |
| | Lists summary statistics of the prevalence of the apparent contaminants in each sample | | Tabular data | CSV | 175.87 kB | 2019-07-15 | MD5 checksum: a2e4f303b21a0be9bb3eb18313b85687 | |
| | List of taxonomically close sample pairs which were not compared | | Tabular data | CSV | 1.01 MB | 2019-07-15 | MD5 checksum: 3fb48df5f4a92d75b4c611cf21bb31d4 | |
| | Corresponding ENA/NCBI references for the source read sequences. | | Tabular data | CSV | 556.41 kB | 2019-07-15 | MD5 checksum: 57fc7ba60dc638c755531ef9a06aea1c | |
| | Tables with list of samples/assemblies. | | Tabular data | CSV | 114.06 kB | 2019-07-15 | MD5 checksum: fbe3cde9a3f1bb3e21d778bd5960f3ab | |
| | The SOAPdenovo-Trans assemblies of the sequence reads | | Transcriptome sequence | FASTA | 7.61 MB | 2019-08-06 | MD5 checksum: 46da8d45af68ee9519fdd0ff9e61192f | |
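
Each data file above is listed with an MD5 checksum, which can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_listed_checksum` are illustrative, and pairing the 5.93 GB archive with the file name `1kp_decontamination_libraries.gz` is an assumption based on the history log on this page, since the file-name column is empty here.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed_md5):
    """Compare a local file against a checksum copied from the file table."""
    return md5_of_file(path) == listed_md5.strip().lower()


# Checksum listed above for the 5.93 GB GZIP archive.
EXPECTED_MD5 = "05534e4a2d055ccc0d6ebdd48f332daa"

# Example call (assumed local file name, see the note above):
# matches_listed_checksum("1kp_decontamination_libraries.gz", EXPECTED_MD5)
```

Reading in chunks matters for the multi-gigabyte archive, which would otherwise be loaded into memory in one piece.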
| Funding body | Awardee | Award ID | Comments |
| Alberta Ministry of Advanced Education | GKS Wong | RES0010334 | Alberta Innovates - Technology Futures |
| The National Key Research and Development Program of China | | | |
| The Ministry of Science and Technology of the People's Republic of China | | 2015BAD04B01/2015BAD04B03 | |
| The Ministry of Science and Technology of the People's Republic of China | | 2011DQ782025 | State Key Laboratory of Agricultural Genomics |
| Guangdong Province | | 2011A091000047 | Key Laboratory of core collection of crop genetic resources research and application |
| Shenzhen Municipal Government of China | | CXZZ20140421112021913/JCYJ20150529150409546/JCYJ20150529150505656 | |
| National Science Foundation | | DBI-1265383 | iPlant Collaborative (CyVerse) |
| National Science Foundation Grants | CWd | IOS 0922742 | |
| National Science Foundation Grants | MSB | IOS-1339156 | |
| National Science Foundation Grants | JL-M | DEB 0830009 | |
| National Science Foundation Grants | SWG | EF-0629817 | |
| National Science Foundation Grants | MSB | EF-1550838 | |
| National Science Foundation Grants | TW and JL-M | DEB 0733029 | |
| National Science Foundation Grants | TW | DBI1062335 | |
| National Institutes of Health Grant | TMK | 1R01DA025197 | |
| Natural Sciences and Engineering Research Council of Canada | SWG | | Discovery grant |

| Date | Action |
| October 23, 2019 | Dataset publish |
| November 6, 2019 | Manuscript Link added : 10.1093/gigascience/giz126 |
| November 7, 2019 | File 1kp_decontamination_libraries.gz updated |
| November 15, 2019 | External Link updated : https://www.protocols.io/widgets/doi?uri=dx.doi.org/10.17504/protocols.io.439gyr6 |
| January 20, 2021 | File format attribute of 1kp_decontamination_libraries.gz updated from archive to gzip |
| June 22, 2021 | Relationship added : DOI 100910 |
| October 14, 2022 | Manuscript Link updated : 10.1093/gigascience/giz126 |