Genomic data of Pearl Millet (Pennisetum glaucum).

Dataset type: Genomic
Data released on August 18, 2017

Varshney RK; Liu X; Shi C; Vigouroux Y; Xu X (2017): Genomic data of Pearl Millet (Pennisetum glaucum). GigaScience Database. https://doi.org/10.5524/100192

DOI: 10.5524/100192

Pearl millet is a highly cross-pollinated diploid (2n=2x=14) C4 grass with high photosynthetic efficiency and biomass production potential. It is an important cereal, cultivated as a staple food grain and as a source of straw for fodder and fuel in arid and semi-arid regions of sub-Saharan Africa and South Asia. Exploring the pearl millet genome is needed to reveal its genetic features and to improve crop production.
In our project, the pearl millet genotype Tift 23D2B1-P1-P5 was chosen for sequencing on the Illumina HiSeq 2000 platform. Using SOAPdenovo, we assembled ~1.79 Gb of the pearl millet genome with a scaffold N50 of 884.95 kb (for scaffolds greater than 1 kb). A total of 38,579 gene models, as well as genome repeats and non-coding RNA elements, were predicted from the genome sequence. These genomic data provide further resources to support pearl millet crop breeding.
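
For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: sort sequence lengths from longest to shortest and report the length at which the running total first reaches half of the summed length. The function names `fasta_lengths` and `n50`, the `min_length` parameter, and the file name in the usage note are illustrative assumptions, not part of the dataset or of the authors' pipeline.

```python
def fasta_lengths(path):
    """Return the length of every sequence in a plain-text FASTA file."""
    lengths = []
    current = None  # length of the record being read; None before the first header
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                current += len(line.strip())
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths, min_length=0):
    """Return the N50 of the sequences strictly longer than min_length.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total length of the kept sequences.
    """
    kept = sorted((n for n in lengths if n > min_length), reverse=True)
    if not kept:
        raise ValueError("no sequences pass the length filter")
    half = sum(kept) / 2
    running = 0
    for n in kept:
        running += n
        if running >= half:
            return n
```

Assuming the genome FASTA has been downloaded and decompressed to a hypothetical local file `genome.fa`, `n50(fasta_lengths("genome.fa"), min_length=1000)` would apply the "greater than 1 kb" filter mentioned above. Whether the result matches the reported figure depends on conventions this page does not state.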

Additional details

Read the peer-reviewed publication(s):

  • Varshney, R. K., Shi, C., Thudi, M., Mariac, C., Wallace, J., Qi, P., Zhang, H., Zhao, Y., Wang, X., Rathore, A., Srivastava, R. K., Chitikineni, A., Fan, G., Bajaj, P., Punnuri, S., Gupta, S. K., Wang, H., Jiang, Y., Couderc, M., … Xu, X. (2017). Pearl millet genome sequence provides a resource to improve agronomic traits in arid environments. Nature Biotechnology, 35(10), 969–976. https://doi.org/10.1038/nbt.3943

Accessions (data included in GigaDB):

PROJECT: PRJNA294988
GENBANK: LKME00000000

Accessions (data not in GigaDB):

SRA: SRP063925

Sample ID: Tift 23D2B1-P1-P5
Common Name: bulrush millet
Scientific Name: Cenchrus americanus
Sample Attributes: Alternative names: pearl millet
Taxonomic ID: 4543

Description | Data Type | File Format | Size | Release Date | File Attributes
- | Readme | TEXT | 1.14 kB | 2016-02-29 | MD5 checksum: ddea768600db97440a6f61c780791d57
MD5 checksum | MD5sum | TEXT | 289 B | 2016-02-29 | MD5 checksum: 7f6ec4ba2d252c8d8cb7254b613c63d6
Nucleotide protein coding sequences | Coding sequence | FASTA | 12.90 MB | 2016-02-29 | MD5 checksum: 84e86f883485dd8b863b152d0bd33151
Gene models | Annotation | GFF | 2.23 MB | 2016-02-29 | MD5 checksum: caa75aad9192540200a9034506730b43
Amino acid protein coding sequences | Protein sequence | FASTA | 8.45 MB | 2016-02-29 | MD5 checksum: dfc166e4f34410a1597feb7f7f367aa2
Assembled and linked genome sequences | Genome sequence | FASTA | 488.37 MB | 2016-02-29 | MD5 checksum: 34ec675acc03c517e01be6c68e07da43
MD5 checksum | MD5sum | UNKNOWN | 302 B | 2016-02-29 | MD5 checksum: b6d698f49b97c92e60fbe38ceed20bb2
DNA sequence of each CEGMA prediction along with flanking DNA | Other | UNKNOWN | 3.10 MB | 2016-02-29 | MD5 checksum: c92e559fb3fc58e61ca6721b66e5623a
Protein sequences of the predicted CEGs | Other | FASTA | 202.41 kB | 2016-02-29 | MD5 checksum: 7a7f53dca9dfccfd9ff9adfaf28e5221
Exon details of all of the CEGMA predicted genes | Other | GFF | 435.06 kB | 2016-02-29 | MD5 checksum: b21856fa4e07254e0dd17f3ac19fa3f9
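
The MD5 checksums listed for each file can be used to confirm that a download is complete and uncorrupted. The sketch below shows one way to do this with Python's standard `hashlib` module. The function names `md5_of_file` and `verify_md5` are illustrative assumptions, as is the local file name in the usage note, because file names are not shown in the listing above.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that large files (such as a genome FASTA) need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected_md5):
    """Return True if the file's MD5 digest matches the expected hex string."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, if the genome sequence file were saved under a hypothetical local name `genome.fa.gz`, `verify_md5("genome.fa.gz", "34ec675acc03c517e01be6c68e07da43")` would compare it against the checksum listed for the genome sequence entry. A mismatch usually indicates an incomplete or altered download, or that the checksum refers to a different (for instance, compressed versus uncompressed) form of the file.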
Date | Action
August 18, 2017 | Dataset publish
August 23, 2017 | Manuscript Link added: 10.1038/nbt.3943