Review
Novel genes retrieved from environmental DNA by polymerase chain reaction: Current genome-walking techniques for future metagenome applications
Introduction
For many decades, activity-screening approaches using isolated microbial strains were the cornerstone of novel enzyme discovery. However, since the 1990s, several landmark reports have revolutionized our view of nature's biological and molecular diversity, which is orders of magnitude larger than previously assumed (Fulthorpe et al., 1998, Schmidt et al., 1991, Torsvik et al., 1990). It has become clear that nature is an extremely rich source of as-yet uncultured or uncultivable microorganisms (Watts et al., 1999), whose genomes are collectively referred to as the metagenome (Handelsman et al., 1998). This metagenome is accessible by extracting total DNA from environmental samples (eDNA), and it is the basis for uncovering novel biocatalysts using culture-independent methods (Lorenz and Schleper, 2002, Cowan et al., 2005). Several routes exist for accessing this biodiversity: (1) shotgun cloning of eDNA and screening for activity of recombinant clones harbouring these eDNA fragments (see for instance Henne et al., 1999, Lämmle et al., 2007, Lorenz and Eck, 2005, Rondon et al., 2000, Voget et al., 2003); (2) shotgun cloning of eDNA and assembly of the obtained sequences into partial genomes (see for instance Quaiser et al., 2003, Schmeisser et al., 2003, Tyson et al., 2004, Venter et al., 2004); and (3) PCR-based methods using degenerate consensus primers that are specific for a particular enzyme class, followed by genome walking. This review focuses on the last of these approaches, comparing the available genome-walking methods and highlighting achievements in sequence-dependent gene mining from the metagenome.
The sequence-dependent approach usually consists of the following steps: (1) multiple sequence alignment of genes encoding members of a specific enzyme class; (2) design of consensus primers; (3) isolation of eDNA; (4) amplification of the target DNA using consensus primers and eDNA as the template; (5) design of gene-specific primers based on the sequences of the amplified fragments; (6) genome-walking PCR using eDNA as the template to determine the unknown upstream and downstream sequences; and (7) repetition of step 6 until the entire sequence is retrieved.
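Steps 1 and 2 can be illustrated with a minimal sketch, assuming an already aligned, gap-free block of nucleotide sequences. It collapses each alignment column into the standard IUPAC ambiguity code, reports the degeneracy of the resulting primer (the number of distinct oligonucleotides it represents), and derives the reverse primer by reverse complementation. The function names and the toy sequences are hypothetical and serve only as an illustration; a real primer design would also weigh melting temperature, codon usage and 3'-end stability.

```python
# Minimal sketch: degenerate consensus primer from a gap-free alignment block.
# All names and the toy sequences below are illustrative assumptions.

# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases they cover.
IUPAC_CODES = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}

# Number of bases each IUPAC code stands for (e.g. R -> 2, N -> 4).
CODE_SIZES = {code: len(bases) for bases, code in IUPAC_CODES.items()}

# Complement table covering the ambiguity codes as well.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def degenerate_consensus(aligned_seqs):
    """Collapse each column of an aligned, gap-free block into one IUPAC code."""
    if not aligned_seqs:
        raise ValueError("at least one sequence is required")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    consensus = []
    for column in zip(*aligned_seqs):
        bases = frozenset(base.upper() for base in column)
        if not bases <= frozenset("ACGT"):
            raise ValueError("only unambiguous, gap-free columns are supported")
        consensus.append(IUPAC_CODES[bases])
    return "".join(consensus)


def degeneracy(primer):
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    total = 1
    for code in primer:
        total *= CODE_SIZES[code]
    return total


def reverse_complement(primer):
    """Reverse complement of a (possibly degenerate) primer."""
    return primer.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical conserved block from three aligned genes of one enzyme class.
    block = ["ATGGCA", "ATGGCC", "ATGACA"]
    primer = degenerate_consensus(block)
    print(primer, degeneracy(primer), reverse_complement(primer))
```

Keeping the degeneracy low matters in practice, because each additional ambiguous position dilutes the concentration of any single matching oligonucleotide in the primer mix.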
PCR-based recovery of entire genes from the metagenome is a convenient activity-independent approach for quickly enlarging the toolbox of future biocatalysts. Although the novel enzymes whose genes were retrieved resemble previously known enzymes to some degree, the consensus-based techniques are ideal for creating a collection of similar enzymes differing in substrate specificity, specific activity, enantiopreference, enantioselectivity, regioselectivity, thermostability, solvent and pH stability, substrate and product inhibition, and other important characteristics (Eschenfeldt et al., 2001, Liebeton and Eck, 2004, Labes et al., 2008). No single enzyme of a particular enzyme class is likely to be suitable for all applications. As a consequence, libraries of enzymes of a particular enzyme family must be created. Mainly for this reason, scientists will use consensus-based PCR and genome-walking methods in the field of industrial enzyme applications. Activity-based screening, which does not rely on any known sequences, is a powerful alternative approach, making it possible to detect metagenome-derived genes that encode novel enzymes of unknown structure and/or function (Streit et al., 2004). However, activity-based screening requires a high-throughput screening method because usually tens to hundreds of thousands of recombinant clones have to be tested (Lorenz and Eck, 2005). Moreover, the activity tests used may not reflect the enzymatic characteristics actually being sought in the analysis of metagenomic libraries. Colorimetric tests for enzyme activity, which can be implemented for high-throughput screening, are available; however, searching for truly interesting enzyme–substrate pairs showing, for instance, high enantioselectivity is technically more demanding and usually cannot be performed easily in a high-throughput format.
As an alternative, consensus-based PCR together with genome walking offers the possibility of obtaining novel functional enzymes with strengths to be discovered at a later stage using analytical methods. Consensus-based PCR is efficient because there is no need for construction and screening of large libraries. With the availability of DNA polymerases that provide good performance with difficult targets, the consensus-based PCR approach will undoubtedly grow in importance in the near future.
Consensus sequences
Many enzymes that belong to a specific class contain conserved sequences, which are most easily visualized by multiple sequence alignments. These conserved regions can then be used for consensus-specific primer design. Consensus sequences have been found in many enzymes. Among those with biotechnological importance are xylanases (Hayashi et al., 2005, Morris et al., 1998, Sunna and Bergquist, 2003), amylolytic enzymes (Kim et al., 2000), 2,5-diketo-d-gluconic acid reductases (Eschenfeldt et
Genome-walking methods
Genome walking has been used for many years to obtain sequence information of unknown segments adjacent to known chromosomal DNA sequences. Since its first use (Fors et al., 1990, Copley et al., 1991, Parker et al., 1991), many modifications of this method have been published. Today, several commercial genome-walking kits are available (for instance TaKaRa Bio Inc. or Clontech Laboratories Inc.). In principle, all genome-walking techniques can also be used with eDNA. Compared to chromosomal DNA
Genome walking with environmental DNA
Only a small fraction of the above-mentioned techniques has been applied with eDNA. Several researchers used nested and cassette primers with unphosphorylated DNA cassettes ligated to digested eDNA to retrieve specific gene fragments, which were then assembled into whole genes by genome walking. This approach has yielded one full-length lipase gene that was functionally expressed in Escherichia coli (Bell et al., 2002). A gene encoding a 1,4-β-xylanase was also amplified from eDNA using
Conclusion
Many methods are available for genome-walking PCR; however, only a few have been applied successfully with eDNA as a template. Using PCR with primers based on consensus sequences together with genome walking has allowed retrieval of many novel genes from unknown microorganisms without the need for extensive and laborious activity-screening. However, in contrast to activity-based screening, the consensus sequence approach does not offer the possibility to retrieve metagenome-derived genes that
Acknowledgments
This work was supported by the Czech Science Foundation (grant 204/06/0458) and by the Institutional Research Concept No. AV0Z50200510.
References (91)
- et al. Cloning of complete genes for novel hydrolytic enzymes from Antarctic sea water bacteria by use of an improved genome walking technique. J. Biotechnol. (2008)
- et al. Rapid amplification of genomic ends (RAGE) as a simple method to clone flanking genomic DNA. Gene (1997)
- et al. Metagenomic gene discovery: past, present and future. Trends Biotechnol. (2005)
- et al. A specific and versatile genome walking technique. Gene (2006)
- et al. Selective amplification of cDNA sequence from total RNA by cassette-ligation mediated polymerase chain reaction (PCR): application to sequencing 6.5 kb genome segment of hantavirus strain B-1. Mol. Cell. Probes (1992)
- et al. Environmental DNA as a source of a novel epoxide hydrolase reacting with aliphatic terminal epoxides. J. Mol. Catal. B: Enzym. (2009)
- et al. Identification of novel enzymes with different hydrolytic activities by metagenome expression cloning. J. Biotechnol. (2007)
- et al. Thermal asymmetric interlaced PCR: automatable amplification and sequencing of insert end fragments from P1 and YAC clones for chromosome walking. Genomics (1995)
- et al. Metagenome–a challenging source of enzyme discovery. J. Mol. Catal. B: Enzym. (2002)
- et al. One armed PCR (OA-PCR): amplification of genomic DNA from a single primer domain. Genomics (1994)