Quality control of fighting fish nucleotide sequences in public repositories reveals a dark matter of systematic taxonomic implication

  • Research Article
  • Genes & Genomics

Abstract

Background

The number of nucleotide sequences in public repositories has exploded recently. However, the data contain errors, leading to incorrect species identification. Several fighting fish (Betta spp.) are poorly described, with unresolved cryptic species complexes masking undescribed species. Here, DNA barcoding was used to detect erroneous sequences in public repositories.

Objective

This study reflects the current quantitative and qualitative status of DNA barcoding in fighting fish and provides a rapid and reliable identification tool.

Methods

A total of 1034 barcode sequences were analyzed from mitochondrial cytochrome c oxidase I (COI) and cytochrome b (Cytb) genes from 71 fighting fish species.

Results

The nearest neighbor test showed the highest percentage of intraspecific nearest neighbors at 93.41% for COI and 91.67% for Cytb, which can be used as reference barcodes for certain taxa. Intraspecific variation was usually less than 13%, while most species differed by more than 54%. The barcoding gap, calculated from the difference between inter- and intraspecific sequence divergences, was negative in the COI data set indicating overlapping intra- and interspecific sequence divergence. Sequence saturation was observed in the Cytb data set but not in the COI data set.
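The two summary statistics reported above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not state which distance model or software settings were used, so the uncorrected p-distance, the per-species gap definition (smallest interspecific distance minus largest intraspecific distance), the function names, and the toy sequences are all assumptions made for illustration.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Uncorrected p-distance: share of differing sites among aligned A/C/G/T sites."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites between the two sequences")
    return sum(x != y for x, y in pairs) / len(pairs)


def nearest_neighbor_rate(records):
    """Fraction of sequences whose closest other sequence carries the same species label.

    records: list of (species_label, aligned_sequence) tuples.
    Ties are broken by list order, which is an arbitrary choice of this sketch.
    """
    hits = 0
    for i, (label_i, seq_i) in enumerate(records):
        _, nearest = min(
            (p_distance(seq_i, seq_j), j)
            for j, (_, seq_j) in enumerate(records) if j != i
        )
        hits += records[nearest][0] == label_i
    return hits / len(records)


def barcoding_gaps(records):
    """Per species: smallest interspecific minus largest intraspecific distance.

    A negative value means intra- and interspecific divergences overlap.
    Species with fewer than two sequences are skipped.
    """
    gaps = {}
    for label in {lab for lab, _ in records}:
        own = [s for lab, s in records if lab == label]
        others = [s for lab, s in records if lab != label]
        if len(own) < 2 or not others:
            continue
        max_intra = max(p_distance(a, b) for a, b in combinations(own, 2))
        min_inter = min(p_distance(a, b) for a in own for b in others)
        gaps[label] = min_inter - max_intra
    return gaps


# Hypothetical toy alignment; the last record is labeled sp_B but sits next to sp_A,
# mimicking a possibly mislabeled repository entry.
records = [
    ("sp_A", "ACGTACGTAC"),
    ("sp_A", "ACGTACGTAC"),
    ("sp_B", "TCGAACGGAC"),
    ("sp_B", "TCGAACGGAT"),
    ("sp_B", "ACGTACGTAT"),
]

if __name__ == "__main__":
    print("intraspecific nearest-neighbor rate:", nearest_neighbor_rate(records))
    print("barcoding gap per species:", barcoding_gaps(records))
```

In this toy data set the suspicious record fails the nearest neighbor test and gives sp_B a negative barcoding gap, which is the kind of signal a quality-control screen of repository sequences would flag for closer inspection.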

Conclusion

The COI gene should thus be used as the main barcoding marker for fighting fish.


Data availability

The full dataset and metadata from this publication are available from the Dryad Digital Repository. Dataset, https://datadryad.org/stash/share/MhZqErReqixoxVCF_HEP5EFWzKQ44Dnn2NAoKVz05w (https://doi.org/10.5061/dryad.dz08kps14).

Acknowledgements

We thank the NCBI database for supplying the betta fish accession numbers. We also thank the Faculty of Science for providing supporting research facilities (6501.0901.1) and the Center for Bio-Medical Engineering Core Facility at Dankook University.

Funding

This research was financially supported in part by the High-Quality Research Graduate Development Cooperation Project between Kasetsart University and the National Science and Technology Development Agency (NSTDA) (Grant no. 6417400247) awarded to TP and KS; the Thailand Science Research and Innovation through the Kasetsart University Reinventing University Program 2021 (Grant no. 3/2564) awarded to TP and KS; the Fundamental Fund: Kasetsart University Research and Development Institute (KURDI) (FF(S-KU)17.66) awarded to SFA and KS; the e-ASIA Joint Research Program (Grant no. P1851131) awarded to WS and KS; a grant from the National Science and Technology Development Agency (NSTDA) (Grant nos. NSTDA, P-19–52238 and JRA-CO-2564–14003-TH) awarded to KS and WS; a post-doctoral researcher award at Kasetsart University awarded to SFA and KS; the Higher Education for Industry Consortium (Hi-FI; no. 6414400777) awarded to NA and KS; the Office of the Ministry of Higher Education, Science, Research and Innovation; and International SciKU Branding (ISB), Faculty of Science, Kasetsart University, awarded to KS. No funding source was involved in the study design; the collection, analysis, and interpretation of the data; the writing of the report; or the decision to submit the article for publication.

Author information

Contributions

Conceptualization, TP and KS; formal analysis, TP, NA, PW, WS, SFA, EK, and KS; funding acquisition, KS; investigation, TP; methodology, TP and KS; project administration, KS; visualization, NM and KS; writing—original draft, TP and KS; writing—review and editing, TP, NA, PW, WS, SFA, EK, SD, NM, PD and KS. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Kornsorn Srikulnath.

Ethics declarations

Conflict of interest

The authors declare no competing financial interests or personal relationships that have influenced this study.

Ethical approval

All animal care and experimental procedures were approved (approval no. ACKU63-SCI-007) by the Animal Experiment Committee of Kasetsart University and conducted in accordance with the Regulations on Animal Experiments at Kasetsart University.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

13258_2022_1353_MOESM1_ESM.jpg

Supplementary file1 A phylogram clarifying the phylogenetic relationships among the 746 GenBank accessions, constructed from a Bayesian inference analysis using mitochondrial cytochrome c oxidase I (COI) sequences. (JPG 15800 KB)

13258_2022_1353_MOESM2_ESM.jpg

Supplementary file2 A phylogram clarifying the phylogenetic relationships among the 288 GenBank accessions, constructed from a Bayesian inference analysis using mitochondrial cytochrome b (Cytb) sequences. (JPG 6350 KB)

13258_2022_1353_MOESM3_ESM.jpg

Supplementary file3 Phylogram clarifying the phylogenetic relationships among 746 GenBank accessions, constructed from Bayesian inference analysis using mitochondrial cytochrome c oxidase I (COI) sequences. Group 1: higher-level similarity with the same species. Group 2: higher-level similarity with multiple species. Group 3: unique sequences with no similarity within most sequences. Class 1: sequences with the same species name exhibiting intraspecific cohesive clustering and interspecific distinct clustering with high posterior probability (0.90–1.00). Class 2: sequences with the same species name that do not exhibit intraspecific cohesive clustering. Class 3: sequences with a different species name exhibiting cohesive clustering. An asterisk (*) indicates that there is only one accession number. (JPG 16033 KB)

13258_2022_1353_MOESM4_ESM.jpg

Supplementary file4 Phylogram clarifying the phylogenetic relationships among 288 GenBank accessions, constructed from Bayesian inference analysis using mitochondrial cytochrome b (Cytb) sequences. Group 1: higher-level similarity with the same species. Group 2: higher-level similarity with multiple species. Group 3: unique sequences with no similarity within most sequences. Class 1: sequences with the same species name exhibiting intraspecific cohesive clustering and interspecific distinct clustering with high posterior probability (0.90–1.00). Class 2: sequences with the same species name that do not exhibit intraspecific cohesive clustering. Class 3: sequences with a different species name exhibiting cohesive clustering. An asterisk (*) indicates that there is only one accession number. (JPG 7083 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Panthum, T., Ariyaraphong, N., Wattanadilokchatkun, P. et al. Quality control of fighting fish nucleotide sequences in public repositories reveals a dark matter of systematic taxonomic implication. Genes Genom 45, 169–181 (2023). https://doi.org/10.1007/s13258-022-01353-7

  • DOI: https://doi.org/10.1007/s13258-022-01353-7
