Genomics

Volume 95, Issue 6, June 2010, Pages 315-327
Review
Assembly algorithms for next-generation sequencing data

https://doi.org/10.1016/j.ygeno.2010.03.001
Under an Elsevier user license
open archive

Abstract

The emergence of next-generation sequencing platforms led to a resurgence of research in whole-genome shotgun assembly algorithms and software. DNA sequencing data from the Roche 454, Illumina/Solexa, and ABI SOLiD platforms typically present shorter read lengths, higher coverage, and different error profiles compared with Sanger sequencing data. Since 2005, several assembly software packages have been created or revised specifically for de novo assembly of next-generation sequencing data. This review summarizes and compares the published descriptions of packages named SSAKE, SHARCGS, VCAKE, Newbler, Celera Assembler, Euler, Velvet, ABySS, AllPaths, and SOAPdenovo. More generally, it compares the two standard methods known as the de Bruijn graph approach and the overlap/layout/consensus approach to assembly.

Keywords

Genome assembly algorithms
Next-generation sequencing

Jason Rafe Miller manages software research and development on whole-genome shotgun assembly at the J. Craig Venter Institute (JCVI). He previously contributed to assembly infrastructure development at TIGR, genome comparison and annotation software at Celera Genomics, and genomics visualization software at GlaxoSmithKline. Mr. Miller received a Master’s degree from University of Pennsylvania and a Bachelor’s degree from New York University.

Sergey Koren, a software engineer at the J. Craig Venter Institute, is a Ph.D. student at the University of Maryland, College Park, where he received his Master of Science and Bachelor of Science (cum laude, with honors) degrees in Computer Science. His research interests include genome assembly, application of graph analysis to metagenomics, and applications of high-performance computing. He is a contributor to the Celera Assembler, AMOS, and k-mer Tools projects hosted on Source Forge.

Granger Sutton is Senior Director of Informatics at the J. Craig Venter Institute (JCVI). Prior to joining JCVI, Dr. Sutton was a director in the Informatics Research department at Celera Genomics where he developed and managed research programs in gene finding, comparative genomics, and shotgun fragment assembly including the development of the Celera Assembler for assembling the human genome. As Computer Scientist at The Institute for Genomic Research (TIGR), he developed protein homology search, multiple sequence alignment, and shotgun fragment assembly algorithms. Dr. Sutton also worked at AT&T Bell Labs to design and implement office automation software. Dr. Sutton earned his Bachelor's degree in electrical engineering and Doctorate in Computer Science from University of Maryland, College Park, and a Master’s degree in Computer Engineering from Stanford University.