Samuel Aparicio,2,1*
Jarrod Chapman,3
Elia Stupka,1*
Nik Putnam,3
Jer-ming Chia,1
Paramvir Dehal,3
Alan Christoffels,1
Sam Rash,3
Shawn Hoon,1
Arian Smit,4
Maarten D. Sollewijn Gelpke,3
Jared Roach,4
Tania Oh,1
Isaac Y. Ho,3
Marie Wong,1
Chris Detter,3
Frans Verhoef,1
Paul Predki,3
Alice Tay,1
Susan Lucas,3
Paul Richardson,3
Sarah F. Smith,5
Melody S. Clark,5
Yvonne J. K. Edwards,5
Norman Doggett,6
Andrey Zharkikh,7
Sean V. Tavtigian,7
Dmitry Pruss,7
Mary Barnstead,8
Cheryl Evans,8
Holly Baden,8
Justin Powell,9
Gustavo Glusman,4
Lee Rowen,4
Leroy Hood,4
Y. H. Tan,1
Greg Elgar,5*
Trevor Hawkins,3*
Byrappa Venkatesh,1*
Daniel Rokhsar,3*
Sydney Brenner,1,10*
The compact genome of Fugu rubripes has been sequenced
to over 95% coverage, and more than 80% of the assembly is in
multigene-sized scaffolds. In this 365-megabase vertebrate genome,
repetitive DNA accounts for less than one-sixth of the sequence, and
gene loci occupy about one-third of the genome. As with the human
genome, gene loci are not evenly distributed, but are clustered into
sparse and dense regions. Some "giant" genes were observed that had
average coding sequence sizes but were spread over genomic lengths
significantly larger than those of their human orthologs. Although
three-quarters of predicted human proteins have a strong match to
Fugu, approximately a quarter of the human proteins are either highly
diverged from their pufferfish homologs or have none, highlighting the
extent of protein evolution in the 450 million years since teleosts and
mammals diverged. Conserved linkages between Fugu and human
genes indicate the preservation of chromosomal segments from the common
vertebrate ancestor, but with considerable scrambling of gene order.
1 Institute of Molecular and Cell Biology, 30 Medical Drive,
Singapore 117609.
2 University of Cambridge, Department of
Oncology, Hutchison-MRC Research Centre, Cambridge CB2 2XZ, UK.
3 U.S. DoE Joint Genome Institute, 2800 Mitchell Drive,
Walnut Creek, CA 94598, USA.
4 Institute for Systems
Biology, 1441 North 34th Street, Seattle, WA 98103, USA.
5 MRC UK HGMP Resource Centre, Hinxton, Cambridge CB10 1SB,
UK.
6 Los Alamos National Laboratory, Los Alamos, NM 87545, USA.
7 Myriad Genetics Inc., 320 Wakara Way, Salt Lake City,
UT 84108, USA.
8 Celera Genomics, 45 West Gude Drive,
Rockville, MD 20850, USA.
9 Paradigm Therapeutics Ltd.,
Physiological Laboratory, Cambridge CB2 3EG, UK.
10 Salk
Institute, 10010 North Torrey Pines Road, La Jolla, San Diego, CA
92037-1099, USA.
* To whom correspondence and requests for materials should be
addressed. E-mail: saa1000{at}cam.ac.uk (S.A.), elia{at}fugu-sg.org (E.S.),
gelgar{at}hgmp.mrc.ac.uk (G.E.),
trevor.hawkins{at}am.amershambiosciences.com (T.H.),
mcbbv{at}imcb.nus.edu.sg (B.V.), dsrokhsar{at}lbl.gov (D.R.), sbrenner{at}salk.edu (S.B.).
Present address: Amersham Biosciences, 928 East Arques Avenue,
Sunnyvale, CA 94085, USA.