Three assays for in-solution enrichment of ancient human DNA at more than a million SNPs

  1. David Reich1,2,3,4
  1. Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
  2. Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
  3. Howard Hughes Medical Institute, Boston, Massachusetts 02115, USA
  4. Department of Human Evolutionary Biology, Harvard University, Cambridge, Massachusetts 02138, USA
  5. These authors contributed equally to this work.

  • Corresponding authors: nrohland{at}genetics.med.harvard.edu, shop{at}genetics.med.harvard.edu, reich{at}genetics.med.harvard.edu
  • Abstract

    The strategy of in-solution enrichment for hundreds of thousands of single-nucleotide polymorphisms (SNPs) has been used to analyze >70% of individuals with genome-scale ancient DNA published to date. This approach makes it economical to study ancient samples with low proportions of human DNA and increases the rate of conversion of sampled remains into interpretable data. So far, nearly all such data have been generated using a set of bait sequences targeting about 1.24 million SNPs (the “1240k reagent”), but synthesis of the reagent has been cost-effective for only a few laboratories. In 2021, two companies, Daicel Arbor Biosciences and Twist Bioscience, made available assays that target the same core set of SNPs along with supplementary content. We test all three assays on a common set of 27 ancient DNA libraries and show that all three are effective at enriching many hundreds of thousands of SNPs. For all assays, one round of enrichment produces data that are as useful as two. In our testing, the “Twist Ancient DNA” assay produces the highest coverages, greatest uniformity on targeted positions, and almost no bias toward enriching one allele more than another relative to shotgun sequencing. We also identify hundreds of thousands of targeted SNPs for which there is minimal allelic bias when comparing 1240k data to either shotgun or Twist data. This facilitates coanalysis of the large data sets that have been generated using 1240k and Twist capture, as well as shotgun sequencing approaches.

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.276728.122.

    • Freely available online through the Genome Research Open Access option.

    • Received March 5, 2022.
    • Accepted November 16, 2022.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
