A custom software solution for forensic mtDNA analysis of MiSeq data

https://doi.org/10.1016/j.fsigss.2015.09.242

Abstract

The use of a next generation sequencing (NGS) approach for mitochondrial (mt) DNA analysis in forensic casework is imminent. A feature of the NGS approach is the ability to detect and report mtDNA heteroplasmy, which will significantly enhance the discrimination potential of the testing method. A single software solution that allows for robust and user-friendly analysis of NGS-derived mtDNA sequence is not readily available, especially when considering heteroplasmy. This communication outlines the desired features of a software package for forensic applications, and the progress made towards those aims with the development of a custom version of NextGENe®.

Introduction

With the development of next generation sequencing (NGS) approaches, and the increased resolution provided by this technology, the forensic community is presented with an opportunity to identify, report, and take advantage of mtDNA heteroplasmic variants observed in casework [1], [2]. A number of software tools have been developed for the analysis of mtDNA sequence data generated on an NGS platform. Some are simple Excel-spreadsheet-based solutions [3] that help convert data into haplotypes using phylogenetically derived nomenclature and allow for consistent assignment of homopolymeric stretches and sequence motifs [4], [5], but do not allow for analysis of the individual sequencing reads, commonly referred to as the pileup. Others involve the reconstruction of mtgenomes from total genomic DNA [6], which is currently beyond the scope of forensic applications. Still others provide toolboxes from which pipelines can be created to assist in the process of data management [7], [8], [9], but they lack proper nomenclature conversion and analysis of the pileup, and can be challenging to use.
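To make the notion of the pileup concrete, the following minimal sketch counts the bases that individual reads place at one reference position and reports the minor allele fraction, the quantity on which a heteroplasmy call would typically rest. It is an illustration only and not how any of the cited tools work: the function names, the `(start, sequence)` read representation, and the assumption of gapless alignments are all hypothetical simplifications.

```python
from collections import Counter


def pileup_counts(reads, position):
    """Count the bases observed at a 1-based reference position.

    `reads` is a list of (start, sequence) tuples, where `start` is the
    1-based reference position of the read's first base. Alignments are
    assumed to be gapless (an illustrative simplification).
    """
    counts = Counter()
    for start, seq in reads:
        offset = position - start
        if 0 <= offset < len(seq):
            counts[seq[offset]] += 1
    return counts


def minor_allele_fraction(counts):
    """Return (major_base, minor_base, minor_fraction), or None without coverage."""
    total = sum(counts.values())
    if total == 0:
        return None
    ranked = counts.most_common(2)
    major = ranked[0][0]
    if len(ranked) == 1:
        # Only one base observed: no evidence of a second variant.
        return major, None, 0.0
    minor, minor_n = ranked[1]
    return major, minor, minor_n / total
```

In practice a reporting threshold on the minor fraction, together with coverage and quality filters, would decide whether a position is reported as heteroplasmic; those thresholds are laboratory specific and are not specified here.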

The commercially available software package NextGENe® (SoftGenetics, Inc.) is user friendly, includes numerous user-defined parameters, and allows for detailed analysis of the pileup. It has been used successfully to analyze mtDNA sequence data on various NGS platforms [1], [2], but initially lacked some desired features. Those features include: (1) alignment to a circular version of the mtgenome so that data properly spans the transition point in the mtgenome numbering system [10], (2) alignment and nucleotide numbering consistent with the revised mtgenome sequence [11], (3) recognition and proper assignment of motifs and insertions/deletions (INDELs) consistent with phylogenetic and forensic considerations [4], [5], (4) robust identification of heteroplasmic sequences, and (5) export of reports that address forensic considerations and allow for seamless import into tertiary analysis tools such as the new version of EMPOP (www.empop.org, v3/R11). This communication reports on progress made towards the development of a custom, forensic version of NextGENe® that includes the features above.
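Feature (1) can be illustrated with a short sketch of the coordinate arithmetic involved. One common way to handle a circular genome is to align against a linear reference extended past its end and then map positions back into the standard numbering; whether NextGENe® does this internally is not stated here, so the approach and all function names below are assumptions made for illustration. The only fixed value is the 16,569-position numbering of the revised reference sequence.

```python
RCRS_LENGTH = 16569  # positions in the revised reference numbering system


def to_circular_position(linear_pos, genome_length=RCRS_LENGTH):
    """Map a 1-based position on an extended linear reference to 1..genome_length."""
    if linear_pos < 1:
        raise ValueError("positions are 1-based")
    return (linear_pos - 1) % genome_length + 1


def read_positions(start, read_length, genome_length=RCRS_LENGTH):
    """List the circular positions covered by a gapless read starting at `start`."""
    return [to_circular_position(start + i, genome_length)
            for i in range(read_length)]


def spans_origin(start, read_length, genome_length=RCRS_LENGTH):
    """True if the read crosses the transition from the last position back to 1."""
    start = to_circular_position(start, genome_length)
    return start + read_length - 1 > genome_length
```

For example, a four-base read starting at position 16568 covers positions 16568, 16569, 1, and 2, so variants on either side of the transition point are numbered consistently instead of being lost at the ends of a linear reference.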

Materials and methods

Data sets containing hundreds of mtDNA sequences from random individuals in the population have been analyzed in our laboratory with NextGENe®. As new versions have been developed, mtDNA data sets were run through the software to ensure that the outcomes were concordant with previous analyses. The most recent, commercially available, version of NextGENe® being used in our laboratory is v2.4.0.1, which was brought online in June 2015. Alpha versions (currently v2.4.2) of the custom software

Results and discussion

The haplotypes for all samples analyzed with new versions of the NextGENe® software have been concordant with previous results and, for those tested, with Sanger-derived haplotypes [7]. The newest forensic version of the software addresses the primary needs of the forensic community: alignment to a circular version of the mtgenome, appropriate use of nomenclature, alignment of INDELs, and reporting outputs. Fig. 1 illustrates some of these features, and compares the new interface

Conclusions

The introduction of an NGS approach for forensic mtDNA sequence analysis will require a software package that enables the examiner to easily navigate through the data, reliably report the findings, and use the outputs to effectively run database searches. A custom, forensic version of NextGENe® has been developed and evaluated, and it meets these goals; a beta version of the software is available through SoftGenetics, Inc. The software remains in

Conflict of interest

The authors of this article have no relevant financial relationships with commercial interests to disclose.

Acknowledgements

The authors gratefully acknowledge the support of Jonathan Liu, John McGuigan and John Fosnacht from SoftGenetics for development of new versions of NextGENe®. This work was supported in part by grant 2014-DN-BX-K022 from the National Institute of Justice (NIJ). The points of view in this document are those of the authors and do not represent the official position or policies of the U.S. Department of Justice.

Cited by (3)

  • Evaluation of GeneMarker® HTS for improved alignment of mtDNA MPS data, haplotype determination, and heteroplasmy assessment

    2017, Forensic Science International: Genetics
    Citation Excerpt :

    NextGENe® has been used successfully to analyze mtDNA sequence data for forensic applications [7,10], but has lacked an integrated pipeline and desired features. We previously reported on the early development of a customized version of the NextGENe® software [21] to address the following considerations; 1) alignment to a circular version of the mtgenome so that data properly spans the transition point in the mtgenome numbering system [22], 2) alignment and nucleotide numbering consistent with the revised mtgenome sequence [23], 3) recognition of SNP-associated motifs and INDELs (insertions/deletions) consistent with phylogenetic and forensic considerations [16,17], 4) identification of heteroplasmic sequences, and 5) export of reports that address forensic considerations and allow for import into tertiary analysis tools such as EMPOP; www.empop.org, v3/R11. The alignment strategies drew from previous attempts to accomplish these goals [15].

  • Concordance and reproducibility of a next generation mtGenome sequencing method for high-quality samples using the Illumina MiSeq

    2016, Forensic Science International: Genetics
    Citation Excerpt :

    Final profile generation (i.e. alignment/nomenclature adjustments and determination of reportable range) was performed by the analyst in this study, yet automated incorporation of these forensic practices into the software would both ease analysis and help ensure accurate variant calls. Such software tools are now available to the forensic community to address some of these needs [38,39], and a detailed assessment of these tools is warranted. Increased sensitivity is one of the major advantages of NGS and was demonstrated by this study in the detection of the mixed sample and an additional six PHPs that had previously gone undetected with STS.
