Using a fast clustering method for viral segment lineage determination, applied to the H9 influenza hemagglutinin.
- Subject Areas: Bioinformatics, Virology
- Keywords: viral lineages, clustering, influenza, H9, hemagglutinin
- Copyright: © 2017 Dalby et al.
- Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article: 2017. Using a fast clustering method for viral segment lineage determination, applied to the H9 influenza hemagglutinin. PeerJ Preprints 5:e3166v1 https://doi.org/10.7287/peerj.preprints.3166v1
Abstract
Lineage determination is an important part of the analysis of viral sequence data. Previously, it has depended on phylogenetic analysis to identify distinct clades within phylogenetic trees. That approach is time-consuming and relies on a set of empirical rules for clade identification. An alternative is clustering, which is commonly used to identify operational taxonomic units in next-generation sequencing data. In this paper we use clustering to rapidly identify viral segment lineages and clades without the need for tree construction.
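The idea of grouping sequences by similarity rather than by tree topology can be sketched in a few lines. The abstract does not say which clustering algorithm or identity threshold was used, so the following is only an illustrative assumption: a greedy, centroid-based scheme of the kind commonly used for operational taxonomic units, applied to sequences that are assumed to be already aligned to equal length. The function names `identity` and `greedy_cluster`, the 0.9 threshold and the toy sequences are all hypothetical.

```python
from __future__ import annotations


def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two aligned, equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def greedy_cluster(seqs: dict[str, str], threshold: float) -> dict[str, list[str]]:
    """Greedy centroid clustering (assumed scheme, not the paper's stated method).

    Sequences are visited in input order. Each one joins the first existing
    centroid it matches at or above the threshold; otherwise it becomes the
    centroid of a new cluster.
    """
    clusters: dict[str, list[str]] = {}  # centroid name -> member names
    for name, seq in seqs.items():
        for centroid in clusters:
            if identity(seq, seqs[centroid]) >= threshold:
                clusters[centroid].append(name)
                break
        else:
            # No centroid was similar enough, so start a new cluster.
            clusters[name] = [name]
    return clusters


# Toy aligned fragments invented for illustration; these are not real H9 data.
toy = {
    "seqA": "ATGGAAACAA",
    "seqB": "ATGGAAACAT",  # differs from seqA at one position
    "seqC": "CTGCATTCGA",  # divergent from seqA
    "seqD": "CTGCATTCGT",  # differs from seqC at one position
}
clusters = greedy_cluster(toy, threshold=0.9)
print(clusters)
```

With these toy inputs the two similar pairs fall into separate clusters, which stand in for candidate lineages. Because no tree is built, the cost is dominated by pairwise comparisons against centroids only; the trade-off is that results depend on input order and on the chosen threshold.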
Author Comment
This paper was submitted to Virus Genes in July 2017. It presents a clear example of the widespread heterogeneity of influenza subtypes within the H9 hemagglutinin phylogenetic tree. This finding disagrees with the conventional wisdom in tree building, where only a single subtype is considered, and calls into question the WHO guidelines for the H5 nomenclature, which assume homogeneity of subtypes within clades.