Abstract
Background
The number of nucleotide sequences in public repositories has exploded recently. However, the data contain errors, leading to incorrect species identification. Several fighting fish (Betta spp.) are poorly described, with unresolved cryptic species complexes masking undescribed species. Here, DNA barcoding was used to detect erroneous sequences in public repositories.
Objective
This study assesses the current quantitative and qualitative status of DNA barcoding in fighting fish and provides a rapid and reliable identification tool.
Methods
A total of 1034 barcode sequences were analyzed from mitochondrial cytochrome c oxidase I (COI) and cytochrome b (Cytb) genes from 71 fighting fish species.
Results
The nearest neighbor test showed the highest percentages of intraspecific nearest neighbors, 93.41% for COI and 91.67% for Cytb; the corresponding sequences can be used as reference barcodes for certain taxa. Intraspecific variation was usually less than 13%, while most species differed by more than 54%. The barcoding gap, calculated as the difference between inter- and intraspecific sequence divergences, was negative in the COI data set, indicating overlapping intra- and interspecific sequence divergence. Sequence saturation was observed in the Cytb data set but not in the COI data set.
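As an illustration of the two metrics reported above, the following minimal Python sketch computes a nearest neighbor test and a per-sequence barcoding gap from a small alignment. It is not the authors' pipeline. It assumes uncorrected p-distances, defines the gap for each sequence as its smallest interspecific distance minus its largest intraspecific distance, and counts ties with another species as misses. The function names (`p_distance`, `barcode_summary`) and the toy sequences are hypothetical and chosen only for this example.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Uncorrected p-distance: proportion of differing sites in two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_summary(records):
    """Summarise (species_label, aligned_sequence) records.

    Returns (nn_fraction, gaps):
      nn_fraction: share of tested sequences whose closest other sequence is
                   strictly conspecific (ties with another species count as misses).
      gaps: per-sequence barcoding gap, in input order, for tested sequences.
    """
    n = len(records)
    # Symmetric pairwise distance matrix.
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = p_distance(records[i][1], records[j][1])

    hits, gaps = 0, []
    for i, (species, _) in enumerate(records):
        intra = [dist[i][j] for j in range(n) if j != i and records[j][0] == species]
        inter = [dist[i][j] for j in range(n) if records[j][0] != species]
        if not intra or not inter:
            continue  # singletons (or single-species inputs) cannot be tested
        hits += min(intra) < min(inter)
        gaps.append(min(inter) - max(intra))  # negative => divergences overlap
    if not gaps:
        raise ValueError("need at least two species, one with two or more sequences")
    return hits / len(gaps), gaps


# Toy, made-up alignment (assumption for illustration, not data from the study).
toy = [
    ("sp_A", "ACGTACGTAC"),
    ("sp_A", "ACGTACGTAT"),
    ("sp_B", "ACGTTCGTAT"),
    ("sp_B", "TCGTTCGAAT"),
]
nn_fraction, gaps = barcode_summary(toy)
print(f"intraspecific nearest neighbours: {nn_fraction:.2%}")
print("sequences with a negative gap:", sum(g < 0 for g in gaps))
```

Under these assumptions, a negative gap for a sequence means that at least one sequence labelled as another species is closer to it than its most distant conspecific, which is the kind of overlap the Results describe for the COI data set.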
Conclusion
The COI gene should thus be used as the main barcoding marker for fighting fish.
Data availability
The full dataset and metadata from this publication are available from the Dryad Digital Repository. Dataset, https://datadryad.org/stash/share/MhZqErReqixoxVCF_HEP5EFWzKQ44Dnn2NAoKVz05w (https://doi.org/10.5061/dryad.dz08kps14).
Acknowledgements
We thank the NCBI database for supplying the betta fish accession numbers. We also thank the Faculty of Science for providing supporting research facilities (6501.0901.1) and the Center for Bio-Medical Engineering Core Facility at Dankook University.
Funding
This research was financially supported in part by the High-Quality Research Graduate Development Cooperation Project between Kasetsart University and the National Science and Technology Development Agency (NSTDA) (Grant no. 6417400247) awarded to TP and KS; the Thailand Science Research and Innovation through The Kasetsart University Reinventing University Program 2021 (Grant no. 3/2564) awarded to TP and KS; Fundamental Fund: Kasetsart University Research and Development Institute (KURDI) (FF(S-KU)17.66) awarded to SFA and KS; the e-ASIA Joint Research Program (Grant no. P1851131) awarded to WS and KS; a grant from the National Science and Technology Development Agency (NSTDA) (Grant no’s: NSTDA, P-19–52238 and JRA-CO-2564–14003-TH) awarded to KS and WS; a post-doctoral researcher award at Kasetsart University awarded to SFA and KS; the Higher Education for Industry Consortium (Hi-FI; no. 6414400777) awarded to NA and KS; the Office of the Ministry of Higher Education, Science, Research and Innovation. International SciKU Branding (ISB), Faculty of Science, Kasetsart University awarded funds to KS. No funding source was involved in the study design, collection, analysis, and interpretation of the data, writing the report, or the decision to submit the article for publication.
Author information
Contributions
Conceptualization, TP and KS; formal analysis, TP, NA, PW, WS, SFA, EK, and KS; funding acquisition, KS; investigation, TP; methodology, TP and KS; project administration, KS; visualization, NM and KS; writing—original draft, TP and KS; writing—review and editing, TP, NA, PW, WS, SFA, EK, SD, NM, PD and KS. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing financial interests or personal relationships that have influenced this study.
Ethical approval
All animal care and experimental procedures were approved (approval no. ACKU63-SCI-007) by the Animal Experiment Committee of Kasetsart University and conducted in accordance with the Regulations on Animal Experiments at Kasetsart University.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
13258_2022_1353_MOESM1_ESM.jpg
Supplementary file1 A phylogram clarifying the phylogenetic relationships among the 746 GenBank accessions, constructed from a Bayesian inference analysis using mitochondrial cytochrome c oxidase I (COI) sequences. (JPG 15800 KB)
13258_2022_1353_MOESM2_ESM.jpg
Supplementary file2 A phylogram clarifying the phylogenetic relationships among the 288 GenBank accessions, constructed from a Bayesian inference analysis using mitochondrial cytochrome b (Cytb) sequences. (JPG 6350 KB)
13258_2022_1353_MOESM3_ESM.jpg
Supplementary file3 Phylogram clarifying the phylogenetic relationships among 746 GenBank accessions, constructed from Bayesian inference analysis using mitochondrial cytochrome c oxidase I (COI) sequences. Group 1: higher-level similarity with the same species. Group 2: higher-level similarity with multiple species. Group 3: unique sequences with no similarity within most sequences. Class 1: sequences with the same species name exhibiting intraspecific cohesive clustering and interspecific distinct clustering with high posterior probability (0.90–1.00). Class 2: sequences with the same species name that do not exhibit intraspecific cohesive clustering. Class 3: sequences with a different species name exhibiting cohesive clustering. There is only 1 accession number (asterisk symbol: *). (JPG 16033 KB)
13258_2022_1353_MOESM4_ESM.jpg
Supplementary file4 Phylogram clarifying the phylogenetic relationships among 288 GenBank accessions, constructed from Bayesian inference analysis using mitochondrial cytochrome b (Cytb) sequences. Group 1: higher-level similarity with the same species. Group 2: higher-level similarity with multiple species. Group 3: unique sequences with no similarity within most sequences. Class 1: sequences with the same species name exhibiting intraspecific cohesive clustering and interspecific distinct clustering with high posterior probability (0.90–1.00). Class 2: sequences with the same species name that do not exhibit intraspecific cohesive clustering. Class 3: sequences with a different species name exhibiting cohesive clustering. There is only 1 accession number (asterisk symbol: *). (JPG 7083 KB)
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Panthum, T., Ariyaraphong, N., Wattanadilokchatkun, P. et al. Quality control of fighting fish nucleotide sequences in public repositories reveals a dark matter of systematic taxonomic implication. Genes Genom 45, 169–181 (2023). https://doi.org/10.1007/s13258-022-01353-7
DOI: https://doi.org/10.1007/s13258-022-01353-7