1 Introduction

Korean ginseng (Panax ginseng) has long been regarded as a superior medicine that is non-toxic to the human body even when administered over a long period. Today, ginseng remains one of the representative medicinal crops used to promote human health. In 2020, total sales of the 54 health functional food items registered in Korea amounted to 2.450 billion USD, of which red ginseng accounted for the largest share at 32.7%. In particular, as global interest in strengthening immunity increased after the outbreak of COVID-19, the export share of Korean red ginseng products in 2020 increased by 39.8% to the United States, 16.6% to Vietnam, 282.0% to Singapore, and 292.0% to Australia compared to 2019 (Hwang et al. 2020). With the growing demand for ginseng production, high yield has become one of the most important factors in the breeding and cultivation of ginseng. In addition, the decrease in arable land due to global warming has increased the proportion of ginseng cultivated on unsuitable land, exposing ginseng to various environmental stresses while it is grown in one place for at least 4 years. As a result, physiological disorders such as heat damage and rusty roots have increased, lowering both the quality and the yield of ginseng. Therefore, to improve the yield and quality of ginseng simultaneously, there has been strong demand for a cultivar that combines multiple agricultural characteristics related to high yield and resistance to diseases, pests, and physiological disorders (Lee et al. 2015; Bang et al. 2020; Zhang et al. 2020).

Breeding of ginseng is difficult because of its 4-year generation time and the small number of seeds produced, and thus ginseng cultivars developed before the 2000s did not meet the demand for multi-characteristic cultivars. Panax ginseng cv. Chunpoong, the first registered cultivar in Korea, is tolerant to rusty roots but has the disadvantage of low yield (Kwon et al. 1998). Panax ginseng cv. Yunpoong was bred as a high-yielding cultivar with heavy roots and a large main-root diameter, but it is vulnerable to rusty roots (Lee et al. 1999; Kwon et al. 2000). Panax ginseng cv. Gumpoong is tolerant to rusty roots, P. ginseng cv. Gopoong has a high ginsenoside content, and P. ginseng cv. Cheongsun germinates early. However, their yields were not remarkable compared to Yunpoong (Lee et al. 2015). For these reasons, local landraces, which are native mixed lines, are still the most widely cultivated in Korea, and none of the cultivars has been widely adopted in ginseng fields, with cultivars accounting for only 14.6% (Bang et al. 2020). In the late 2000s, P. ginseng cv. Sunil and P. ginseng cv. Sunmyoung were developed as cultivars with high yield and tolerance to heat damage (Lee et al. 2010, 2021). Sunmyoung, in particular, also showed resistance to Alternaria leaf spot and lodging. However, Sunil was evaluated as susceptible to lodging, and Sunmyoung is at risk of damage from cold waves because of its early sprouting. Therefore, despite these efforts, there is continuing demand for cultivars that maintain high yield and quality by resisting various diseases and environmental stresses during the long-term cultivation of ginseng.

To maintain the purity of elite cultivars and supply them to ginseng farmers, it is also necessary to establish a convenient and reliable authentication system. Because ginseng cultivars are highly similar in morphology, especially in the early developmental stage of ginseng and in dried ginseng, seeds, and seedlings, discrimination of cultivars based on morphological characteristics is limited (Ying et al. 2022). Numerous molecular marker systems have been developed in ginseng over the past 20 years to address this, but most were difficult to use for convenient high-throughput authentication of cultivars because they require post-PCR processing such as gel electrophoresis (Jo et al. 2017). Recently, high-throughput genotyping tools using high-resolution melting (HRM) analysis or kompetitive allele-specific PCR (KASP) have been reported (Jo et al. 2015; Jang et al. 2022). However, because these studies were based on the transcriptome or plastome, they were also limited in distinguishing a wide range of ginseng cultivars. Recent advances in next-generation sequencing (NGS), particularly the advent of genotyping-by-sequencing (GBS), have allowed the rapid and cost-effective discovery of large numbers of single nucleotide polymorphism (SNP) markers at the genome-wide scale (Elshire et al. 2011). The GBS approach has been used successfully for SNP discovery in organisms with large genomes as well as in complex polyploid organisms with narrow genetic differences, such as cotton, wheat, and strawberry (Poland et al. 2012; Islam et al. 2015; Kim et al. 2019). Ginseng also has a large and structurally complex genome, and therefore the GBS strategy is expected to make it easier to obtain large numbers of SNPs from ginseng cultivars (Kim et al. 2018).

This study describes the breeding history and characteristics of P. ginseng cv. Sunhong with high-yielding property and tolerance to multiple physiological disorders including heat damage, rusty roots and lodging. In addition, to improve the purity of Sunhong, we suggest a reliable high-throughput authentication system using genome-wide SNPs obtained through GBS analysis. The results of this study can be used to promote the cultivation ratio of ginseng cultivars, thereby improving the yield and quality of ginseng.

2 Materials and methods

2.1 Breeding and cultivation

Ginseng seedlings with a root weight of 0.8–1.0 g, a root length of 15 cm or longer, and no disease or damage caused by physiological disorders were used in the experiments. The preliminary yield trial was performed by planting 63 seedlings in an area of 1.62 m² at Suwon Experiment Station in Gyeonggi, Korea. The advanced yield trial was performed by planting 126 seedlings in an area of 3.24 m² at Boeun Experiment Station in Chungbuk, Korea. The regional adaptability trial was performed by planting 189 seedlings in an area of 4.86 m² at farms in Yeoju and Anseong, Gyeonggi, Korea. All trials were performed in triplicate. From the end of May to the end of August, additional sun-shading sheets were installed to minimize heat damage. Cultivation management was carried out according to the Ginseng Good Agricultural Practices (RDA 2012a).

2.2 Qualitative traits and ginsenoside contents

The qualitative traits of the aerial part investigated in this study were the color of berries, the color of autumn leaves, the shape of the central and additional leaflets, and the intensity of anthocyanin coloration in the stem, the petiole, and the petiolules. The investigation was performed in accordance with the inspection standards of the International Union for the Protection of New Varieties of Plants (UPOV) (UPOV 2004). The ginsenosides were separated with a Waters ACQUITY UPLC system (Waters, Milford, MA, USA) and detected by PDA at 203 nm as previously described (Park et al. 2013; Jang et al. 2020).

2.3 Physiological disorders, diseases, and pests

Yellow leaf spot and heat damage, which are physiological disorders of the aerial part, were graded from 0 to 9 according to a visual index scale, where 0 = absent, 1 = < 1%, 3 = 1–10%, 5 = 10–25%, 7 = 25–40%, and 9 = > 40%. A leaf was scored as showing yellow leaf spot or heat damage if more than one-third of the leaf was damaged, and the percentage of symptoms (%) was calculated as [the number of leaves with symptoms/the total number of leaves in all plants tested × 100]. Rusty roots and rough skin, which are physiological disorders of the underground part, were measured with the same visual index scale applied to the investigation of leaf discoloration and heat damage (RDA 2012b; Kim et al. 2013). Lodging was rated on a visual index scale from 1 to 9 according to the angle at which the stem was tilted: 1 = 0°, 3 = 1°–10°, 5 = 10°–20°, 7 = 20°–30°, and 9 = > 30° (Lee et al. 2021).
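
The two visual index scales described above can be sketched as simple lookup functions. This is a minimal illustration and not the scoring software used in the study; the function names are hypothetical, and because the published classes share their boundary values (for example, 10% appears in both 1–10% and 10–25%), the sketch assumes that each boundary value belongs to the lower class.

```python
def symptom_percentage(leaves_with_symptoms: int, total_leaves: int) -> float:
    """Percentage of symptomatic leaves among all leaves of the plants tested."""
    if total_leaves <= 0:
        raise ValueError("total_leaves must be positive")
    return 100 * leaves_with_symptoms / total_leaves


def severity_index(percent_affected: float) -> int:
    """Map a symptom percentage to the 0-9 visual index (0, 1, 3, 5, 7, 9)."""
    if not 0 <= percent_affected <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_affected == 0:
        return 0
    if percent_affected < 1:
        return 1
    # Assumption: a value on a class boundary is assigned to the lower class.
    for upper_bound, index in ((10, 3), (25, 5), (40, 7)):
        if percent_affected <= upper_bound:
            return index
    return 9


def lodging_index(tilt_angle_deg: float) -> int:
    """Map the stem tilt angle in degrees to the 1-9 lodging index."""
    if tilt_angle_deg < 0:
        raise ValueError("angle must be non-negative")
    if tilt_angle_deg == 0:
        return 1
    # Assumption: a value on a class boundary is assigned to the lower class.
    for upper_bound, index in ((10, 3), (20, 5), (30, 7)):
        if tilt_angle_deg <= upper_bound:
            return index
    return 9
```

For example, a plot in which 12 of 60 leaves show symptoms has a symptom percentage of 20%, which falls in the 10–25% class and is therefore scored as 5 under these assumptions.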

Alternaria leaf spot and anthracnose were graded by leaf discoloration using the same visual index scale as autumn leaves. Mealybug, an insect pest of the aerial part, was scored on a visual index scale from 0 to 9 based on the number of adult insects in the peak season: 0 = absent, 1 = 1–5 insects, 3 = 6–10 insects, 5 = 11–30 insects, 7 = 31–50 insects, and 9 = > 50 insects (RDA 2012b). The investigations of physiological disorders, diseases, and insects were performed with more than 50 individuals. Statistical analyses were performed using one-way ANOVA with Duncan’s multiple range test (p < 0.05). The SAS package v9.2 (SAS Institute Inc., Cary, NC, USA) was used for statistical analysis (Jang et al. 2021).
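
The mealybug scale above can be sketched in the same way. The function names and the example counts are hypothetical, and averaging the index over replicates is an assumption made only for illustration; the statistical comparison itself (one-way ANOVA with Duncan's test in SAS) is not reproduced here.

```python
from typing import Sequence


def mealybug_index(adult_count: int) -> int:
    """Map the number of adult mealybugs in the peak season to the 0-9 index."""
    if adult_count < 0:
        raise ValueError("count must be non-negative")
    if adult_count == 0:
        return 0
    for upper_bound, index in ((5, 1), (10, 3), (30, 5), (50, 7)):
        if adult_count <= upper_bound:
            return index
    return 9


def mean_index(adult_counts: Sequence[int]) -> float:
    """Average index over replicates (assumed aggregation, for illustration)."""
    if not adult_counts:
        raise ValueError("at least one replicate is required")
    return sum(mealybug_index(count) for count in adult_counts) / len(adult_counts)


# Hypothetical replicate counts, not data from the study.
example_counts = [12, 25, 40]
example_mean = round(mean_index(example_counts), 1)
```

With the hypothetical counts above, the three replicates are scored 5, 5, and 7, which illustrates how a non-integer mean index can arise from an integer scale.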

2.4 GBS library construction and sequencing

A total of 288 ginseng samples representing 8 cultivars were collected from farms in Anseong and Yeoju, Gyeonggi, Korea (Table 1). Among these, 192 samples collected from Anseong were used for construction of two GBS libraries. Total genomic DNA was extracted using the HiQ Stool DNA/RNA kit (BioD, Gwangmyeong, Korea) according to the manufacturer’s protocol, and then digested with the ApeKI restriction enzyme (NEB, Ipswich, MA, USA) at 75 °C for 2 h. Barcoded adapters were ligated to each digested DNA sample, followed by pooling and cleaning with the QIAquick PCR purification kit (Qiagen, Germantown, MD, USA). Each GBS library was prepared after amplification by PCR and was sequenced on a HiSeq-X (Illumina, San Diego, CA, USA).

Table 1 List of ginseng cultivars used for genotyping-by-sequencing (GBS) analysis and genotyping

2.5 GBS analysis, SNP identification, and phylogenetic analysis

De-multiplexing was performed using the barcode sequences, and adapter sequences were removed using cutadapt v1.8.3 (Martin 2011). High-quality clean reads were obtained by trimming short reads with a Phred score of less than 20 using Trimmomatic v0.39 (Bolger et al. 2014), and were then mapped to the scaffold sequences of P. ginseng downloaded from the Ginseng Genome Database (http://ginsengdb.snu.ac.kr) using BWA-MEM v0.7.17 (Li 2013; Jayakodi et al. 2018). Raw SNPs were detected using SAMtools v0.1.16 (Danecek et al. 2021) and were filtered under the conditions of minor allele frequency (MAF) > 5% and missing data < 30%, resulting in 3957 filtered SNPs. SNPs with a read depth ratio greater than 90% were classified as homozygous, and those with a read depth ratio of 40–60% were classified as heterozygous. The evolutionary history was inferred using the Neighbor-Joining method with 1000 replications. The optimal tree with the sum of branch length = 12.85232980 is shown. The evolutionary distances were computed using the Maximum Composite Likelihood method and are in the units of the number of base substitutions per site. The analysis involved 192 nucleotide sequences consisting of 3957 filtered SNPs. All ambiguous positions were removed for each sequence pair. Phylogenetic analyses were conducted in MEGA6 (Tamura et al. 2013).
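
The genotype classification and SNP filtering rules described above can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used in the study: the function names are hypothetical, read depth ratios that fall outside the stated ranges are treated as missing, and the MAF is computed with diploid-style allele counting (two alleles per called sample), which is a simplification for an allotetraploid genome.

```python
from typing import Optional, Sequence


def call_genotype(ref_depth: int, alt_depth: int) -> Optional[str]:
    """Classify one sample at one SNP from its allele read depths.

    Returns "ref" or "alt" for homozygous calls, "het" for heterozygous
    calls, and None when the call is missing or ambiguous.
    """
    total = ref_depth + alt_depth
    if total == 0:
        return None  # no reads: missing data
    alt_ratio = alt_depth / total
    if alt_ratio > 0.9:
        return "alt"  # more than 90% of reads support the alternative allele
    if alt_ratio < 0.1:
        return "ref"  # more than 90% of reads support the reference allele
    if 0.4 <= alt_ratio <= 0.6:
        return "het"
    return None  # assumption: intermediate ratios are treated as missing


def passes_filter(
    genotypes: Sequence[Optional[str]],
    min_maf: float = 0.05,
    max_missing: float = 0.30,
) -> bool:
    """Keep a SNP only if MAF > min_maf and missing rate < max_missing."""
    if not genotypes:
        return False
    called = [g for g in genotypes if g is not None]
    missing_rate = 1 - len(called) / len(genotypes)
    if not called or missing_rate >= max_missing:
        return False
    # Assumption: diploid-style counting, two alleles per called sample.
    alt_alleles = sum(2 if g == "alt" else 1 if g == "het" else 0 for g in called)
    alt_freq = alt_alleles / (2 * len(called))
    return min(alt_freq, 1 - alt_freq) > min_maf
```

In practice these thresholds would be applied to the variant calls of all 192 samples at each raw SNP; the default arguments simply mirror the MAF > 5% and missing data < 30% conditions stated in the text.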

2.6 SNP filtering, validation, and genotyping

A total of 1655 probe-affordable SNPs were selected based on the absence of sequence variation within 30 bp on either side of each of the 3957 filtered SNPs, and were then further filtered based on missing data, MAF, and the presence of two kinds of homozygous genotypes among the eight cultivars. The SNPs selected as candidates for TaqMan markers were validated using Sanger sequencing in 32 samples. TaqMan markers were designed using RealTimeDesign software (LGC Biosearch Technologies, Hoddesdon, UK). The primers and probes were synthesized by Metabion (Planegg/Steinkirchen, Germany). PCR amplification for genotyping was performed using a 7500 Fast system (Applied Biosystems, Waltham, MA, USA), and fluorescence measurements were carried out at the annealing step of PCR in the FAM and HEX channels (Waminal et al. 2020). The genotyping results for each TaqMan marker were analyzed with 7500 software v2.3 (Applied Biosystems), and the data from homozygous samples were then integrated through principal component analysis (PCA). PCA was performed using the princomp function, and the results were visualized with the plot3D package (https://CRAN.R-project.org/package=plot3D) in R v4.2.1.
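
The flanking-region criterion for probe-affordable SNPs can be sketched as a window search over variant positions. This is a minimal sketch, not the selection script used in the study: the function name, the scaffold identifiers, and the input layout (a mapping from scaffold to variant positions) are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right
from typing import Dict, List, Tuple

Snp = Tuple[str, int]  # (scaffold identifier, 1-based position)


def probe_affordable(
    candidates: List[Snp],
    all_variants: Dict[str, List[int]],
    flank: int = 30,
) -> List[Snp]:
    """Keep candidate SNPs with no other variant within `flank` bp on either side."""
    sorted_variants = {scaffold: sorted(pos) for scaffold, pos in all_variants.items()}
    kept = []
    for scaffold, position in candidates:
        positions = sorted_variants.get(scaffold, [])
        # Indices of the variants that fall inside the flanking window.
        low = bisect_left(positions, position - flank)
        high = bisect_right(positions, position + flank)
        neighbours = [p for p in positions[low:high] if p != position]
        if not neighbours:
            kept.append((scaffold, position))
    return kept


# Hypothetical positions, for illustration only.
variants = {"scf1": [100, 120, 500, 531], "scf2": [40]}
snps = [("scf1", 100), ("scf1", 500), ("scf1", 531), ("scf2", 40)]
selected = probe_affordable(snps, variants)
```

In this toy input, the SNP at position 100 is dropped because another variant lies 20 bp away, whereas the SNPs at positions 500 and 531 are 31 bp apart and are both retained.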

3 Results

3.1 Development of P. ginseng cv. Sunhong

In 2000, a ginseng individual with pink berries was discovered for the first time in Korea at the Eumseong Experiment Station of the Korea Ginseng and Tobacco Research Institute (KGTRI). The seeds were planted, and 17 individuals survived until a trait test was performed in 2004. An individual with two or more stems and pink berries was selected (line ‘Pink-13’) and propagated by pure line selection. The preliminary yield trial conducted between 2005 and 2008 showed that the line is high-yielding as well as uniform and stable. In addition, the advanced yield trial performed between 2009 and 2012 confirmed not only that the major traits were stably maintained in its progeny, but also its tolerance to heat damage, lodging, and rusty roots. An application for the cultivar was subsequently filed with the Korea Seed and Variety Service in 2013 under the cultivar name Sunhong (P. ginseng cv. Sunhong). Following the regional adaptability trial and cultivation review from 2016 to 2017, it was registered as a new cultivar in 2019 (Supplementary Fig. S1).

3.2 Qualitative traits and ginsenoside contents of Sunhong

To show the distinctness of Sunhong, its qualitative traits were compared with those of Chunpoong and Yunpoong. Chunpoong is the first cultivar registered in Korea and is tolerant to rusty roots. Yunpoong is a representative high-yielding cultivar in Korea. In addition, these two cultivars were derived from the violet-stem landrace Jakyung, which is the same origin as Sunhong (Kwon et al. 1998, 2000; Um et al. 2016). However, the berries of Sunhong were distinctively reddish pink, whereas those of Yunpoong were red and those of Chunpoong were yellowish orange. The expression level of anthocyanin in stems, petioles, and petiolules was moderate in Sunhong, weaker than in Yunpoong and stronger than in Chunpoong. The color of autumn leaves was yellowish orange in both Sunhong and Chunpoong, while those of Yunpoong turned red with aging (Fig. 1). Other characteristics of Sunhong were similar to those of Yunpoong but clearly different from those of Chunpoong. Anthocyanin coloration was detected along the whole petioles and petiolules in Sunhong and Yunpoong, while Chunpoong displayed purple only on the basal section. Sunhong and Yunpoong showed two or more stems and additional leaflets, and their central leaflets were similar in shape, both being narrow oval. By contrast, in Chunpoong the frequency of multiple stems and additional leaflets was significantly lower, and the central leaflet was narrow (Fig. 1 and Supplementary Table S1). These distinctive features of Sunhong, especially its pink berries, help ginseng farmers to easily distinguish it from other ginseng in the field, making it more convenient to maintain purity during cultivation. In contrast, there was no significant difference in ginsenoside content among the cultivars. The total ginsenoside contents of Sunhong, Chunpoong, and Yunpoong were 18.39 mg g−1, 20.16 mg g−1, and 21.20 mg g−1, respectively, and seven major ginsenosides (Rg1, Re, Rf, Rb1, Rc, Rb2, and Rd) were detected in all three cultivars (Supplementary Table S2).

Fig. 1

Characteristics of the aerial part of ginseng cultivars. A The distinctiveness of the aerial section of Panax ginseng cv. Sunhong. The color of berries is reddish pink. The distribution of anthocyanin coloration shows purple only on the lower section of the stem, but appears purple throughout petioles and petiolules. Multiple stems emerge from one individual. The color of autumn leaves is yellowish orange. B The distinctiveness of the aerial section in P. ginseng cv. Chunpoong. The color of berries and autumn leaves are yellowish orange. The distribution of anthocyanin coloration of the stem, petioles, and petiolules shows purple only on the basal section. A single stem emerges from one individual. C The distinctiveness of the aerial section in P. ginseng cv. Yunpoong. The color of berries is red. The distribution of anthocyanin coloration of the stem shows purple along the whole stems, petioles, and petiolules. Multiple stems emerge from one individual. The color of autumn leaves is red

3.3 Agricultural characteristics of Sunhong

Sunhong exhibited a level of resistance to diseases and insect pests similar to that of Chunpoong and Yunpoong. The prevalence of Alternaria leaf spot and anthracnose did not differ among the three cultivars. The degree of mealybug damage, scored on a visual index scale from 0 to 9 based on the number of adult insects in the peak season, was 5.7 for both Sunhong and Yunpoong. This was slightly lower than that of Chunpoong (7.0), but does not indicate a strong level of resistance to mealybugs (Supplementary Table S3).

Physiological disorders of Sunhong, such as yellow leaf spot, heat damage, lodging, rusty roots, and rough skin, were also evaluated on a visual index scale according to their prevalence. Chunpoong, which is tolerant to rusty roots but vulnerable to heat damage and lodging, and Yunpoong, which is high-yielding and tolerant to heat damage and lodging but vulnerable to rusty roots, were used as control cultivars for comparison with Sunhong. The yellow leaf spot index was not significantly different among the three cultivars, and rough skin was not found in any of the three cultivars. By contrast, the rusty root index of Sunhong was 3.7, which was as low as that of Chunpoong (2.3), a cultivar tolerant to rusty roots. Sunhong also showed tolerance to heat damage and lodging. The heat damage level of Sunhong was 3.7, which was similar to that of Yunpoong (3.3) and over two times lower than that of Chunpoong (8.3). In particular, Sunhong had the lowest level of lodging among the three cultivars: 2.0 in Sunhong, 3.3 in Yunpoong, and 6.7 in Chunpoong (Table 2).

Table 2 Responses to physiological stress to aerial and underground part of ginseng cultivars

In addition, Sunhong clearly showed high yield, comparable to that of Yunpoong, a high-yielding cultivar. In the regional adaptability trials conducted in two regions for 2 years, the average yields of Sunhong and Yunpoong were similar at 580.0 kg 10 a−1 and 577.1 kg 10 a−1, respectively. This result was 32.0% higher than the average yield of Chunpoong (437.4 kg 10 a−1), suggesting that Sunhong is also a high-yielding cultivar like Yunpoong (Table 3). Taken together, Sunhong has potential as a versatile cultivar that combines convenient purity maintenance based on morphological distinctness, tolerance to various physiological disorders, and high yield.

Table 3 The yield potential (kg 10 a−1) of ginseng cultivars in the regional adaptability trials

3.4 GBS approach to establish authentication system of Sunhong

Although the distinct berry color of Sunhong makes it convenient to maintain seed purity, it is still important to develop a reliable molecular marker-aided authentication system, especially to discriminate it from previously developed cultivars. Gumpoong, Gopoong, Cheongsun, Sunil, and Sunmyoung were used as control cultivars in addition to Chunpoong and Yunpoong because these registered elite cultivars have good agricultural traits and have been supplied for a long time in Korea (Lee et al. 2010, 2011, 2015, 2021; Park et al. 2021). A total of 192 samples representing eight cultivars were used to establish the molecular marker-aided authentication system through GBS analysis (Table 1). As a result, an average of 5.1 million reads per sample was generated, and 97.94% of reads were mapped to the reference genome. A total of 3957 bi-allelic SNPs were identified from 271,258 raw SNPs by filtering with thresholds of MAF > 5% and missing data < 30% (Table 4). Phylogenetic analysis using the filtered bi-allelic SNPs placed samples belonging to the same cultivar in the same cluster, except for Sunhong and Yunpoong, indicating that these SNPs are informative for discriminating the cultivars used in this study (Fig. 2). Sunhong and Yunpoong were classified into the same group, which is consistent with their similar qualitative traits as mentioned previously (Fig. 1 and Supplementary Table S1).

Table 4 Summary of GBS data and single nucleotide polymorphism (SNP) selection

Fig. 2

Phylogenetic analysis with 3957 filtered SNPs obtained from 192 ginseng samples consisting of 8 cultivars. The evolutionary history was inferred using the Neighbor-Joining method with 1000 replications. The optimal tree with the sum of branch length = 12.85232980 is shown. The evolutionary distances were computed using the Maximum Composite Likelihood method and are in the units of the number of base substitutions per site. The names of cultivars are abbreviated in two letters: CP Chunpoong, YP Yunpoong, GP Gopoong, KP Gumpoong, CS Cheongsun, SI Sunil, SM Sunmyoung, SH Sunhong

3.5 Selection of polymorphic SNPs to discriminate the eight cultivars

A total of 208 polymorphic SNPs were selected based on the absence of sequence variation within 30 bp on either side of the filtered SNPs, a threshold of missing data < 30% for each cultivar, and the presence of two kinds of homozygous genotypes among the eight cultivars (Table 4). From these 208 SNPs, twelve were selected as cultivar-specific SNPs, eight of which were validated using Sanger sequencing (Supplementary Table S4). Among the eight SNPs, four were developed as TaqMan markers based on their allele combinations in the eight cultivars (Supplementary Table S5). Genotyping results displayed on endpoint fluorescence scatter plots indicated that the 04GS8365 marker was specific for Gumpoong, but the other three markers, 06GS2795, 07GS4336, and 08GS9211, had no distinctive cluster of fluorescence signal for a single cultivar (Fig. 3A and Supplementary Fig. S2). To visualize the multivariate data set generated with the combination of four markers, the complex genotyping data set was simplified by applying PCA. The 3-D score plots for the first three components obtained through PCA showed seven distinctive clusters, each consisting of a single cultivar, with the exception of the cluster containing both Sunhong and Yunpoong (Fig. 3B). The results suggest that these markers were effective in the authentication of most of the cultivars used in this study, but that additional markers are needed to differentiate between Sunhong and Yunpoong.
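
The selection of polymorphic and cultivar-specific SNPs described above can be sketched as two small steps: deriving a consensus homozygous genotype for each cultivar, then checking whether exactly one cultivar differs from all the others. The function names, the majority-vote consensus rule, and the example genotypes are assumptions made for illustration; only the observation that 04GS8365 singles out Gumpoong (KP) comes from the text.

```python
from collections import Counter
from typing import Dict, List, Optional


def consensus_genotype(
    calls: List[Optional[str]], max_missing: float = 0.30
) -> Optional[str]:
    """Majority homozygous genotype of one cultivar, or None if unreliable."""
    if not calls:
        return None
    called = [c for c in calls if c is not None]
    if 1 - len(called) / len(calls) >= max_missing:
        return None  # too many missing calls within this cultivar
    genotype, _ = Counter(called).most_common(1)[0]
    # Assumption: only homozygous consensus calls ("ref"/"alt") are usable.
    return genotype if genotype in ("ref", "alt") else None


def specific_cultivar(consensus: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the one cultivar whose genotype differs from all others, if any."""
    if any(g is None for g in consensus.values()):
        return None
    counts = Counter(consensus.values())
    if len(counts) != 2:
        return None  # monomorphic across cultivars
    for genotype, n in counts.items():
        if n == 1:
            return next(c for c, g in consensus.items() if g == genotype)
    return None  # polymorphic, but not specific to a single cultivar


# Hypothetical allele assignment mimicking a Gumpoong-specific SNP.
gumpoong_like = {
    "CP": "ref", "YP": "ref", "GP": "ref", "KP": "alt",
    "CS": "ref", "SI": "ref", "SM": "ref", "SH": "ref",
}
```

A SNP such as the hypothetical `gumpoong_like` example is reported as specific to KP, whereas a SNP that splits the eight cultivars four against four is polymorphic but returns no single cultivar.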

Fig. 3

Genotyping results of 192 ginseng samples consisting of 8 cultivars. Endpoint fluorescence scatter plots show the genotyping results of each TaqMan marker (A and C). The x and y axes of the scatter plots represent the relative fluorescence units (RFU) of the 04GS8365 (A) and 14GS1293 (C) markers labeled with FAM and HEX, respectively. Three-dimensional score plots show the first three principal components (PCs) analyzed from the combined genotyping results of 171 homozygous samples using multiple TaqMan markers (B and D). The first three PCs were obtained by principal component analysis (PCA) of the genotyping data using four markers, 04GS8365, 06GS2795, 07GS4336, and 08GS9211 (B), and all five markers including the 14GS1293 marker (D)

3.6 Development of additional markers for discrimination of Sunhong and Yunpoong

Because no SNPs with different genotypes in Sunhong and Yunpoong were found among the 208 polymorphic SNPs, the 1655 probe-affordable SNPs were filtered again using loosened criteria of 40% minor allele frequency (MAF) and 50% missing data for each cultivar. Among these filtered SNPs, only three met the condition that two kinds of homozygous genotypes exist between Sunhong and Yunpoong (Table 4). The 14GS1293 marker was subsequently selected after validation by Sanger sequencing and conversion to a TaqMan marker (Supplementary Tables S4 and S5). In the genotyping analysis using the 14GS1293 marker, homozygotes of Sunhong and Yunpoong were clearly divided into two groups. Sunhong, Chunpoong, Gumpoong, and Cheongsun were classified into one group, and the other group consisted of Yunpoong, Gopoong, Sunil, and Sunmyoung (Fig. 3C). Together with the previous four markers, the combination including the 14GS1293 marker clearly separated all eight cultivars from each other in the 3-D score plot for the first three components calculated by PCA (Fig. 3D).
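
The effect of adding a fifth marker can be illustrated by comparing the combined homozygous marker profiles of the cultivars. This sketch replaces the PCA used in the study with a direct comparison of profiles, which is a simpler technique. The genotype table is hypothetical and is not taken from Supplementary Table S5; only two patterns follow the text, namely that 04GS8365 singles out Gumpoong (KP) and that 14GS1293 groups SH, CP, KP, and CS apart from YP, GP, SI, and SM.

```python
from itertools import combinations
from typing import Dict, List, Tuple

CULTIVARS = ["CP", "YP", "GP", "KP", "CS", "SI", "SM", "SH"]

# Hypothetical homozygous calls per marker (0 = allele 1, 1 = allele 2).
MARKERS: Dict[str, Dict[str, int]] = {
    "04GS8365": {"CP": 0, "YP": 0, "GP": 0, "KP": 1, "CS": 0, "SI": 0, "SM": 0, "SH": 0},
    "06GS2795": {"CP": 0, "YP": 0, "GP": 0, "KP": 0, "CS": 0, "SI": 1, "SM": 1, "SH": 0},
    "07GS4336": {"CP": 0, "YP": 0, "GP": 1, "KP": 0, "CS": 1, "SI": 0, "SM": 0, "SH": 0},
    "08GS9211": {"CP": 0, "YP": 1, "GP": 0, "KP": 0, "CS": 1, "SI": 0, "SM": 1, "SH": 1},
    "14GS1293": {"CP": 0, "YP": 1, "GP": 1, "KP": 0, "CS": 0, "SI": 1, "SM": 1, "SH": 0},
}


def indistinguishable_pairs(marker_names: List[str]) -> List[Tuple[str, str]]:
    """List cultivar pairs that share the same combined marker profile."""
    profiles = {
        cultivar: tuple(MARKERS[name][cultivar] for name in marker_names)
        for cultivar in CULTIVARS
    }
    return [
        (a, b) for a, b in combinations(CULTIVARS, 2) if profiles[a] == profiles[b]
    ]


first_four = ["04GS8365", "06GS2795", "07GS4336", "08GS9211"]
pairs_with_four = indistinguishable_pairs(first_four)
pairs_with_five = indistinguishable_pairs(first_four + ["14GS1293"])
```

With this hypothetical table, the first four markers leave only Yunpoong and Sunhong with identical profiles, and adding 14GS1293 gives every cultivar a unique profile, which mirrors the pattern reported for the PCA score plots.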

4 Discussion

Sunhong was newly developed as a ginseng cultivar with reddish-pink berries by pure line selection (Supplementary Fig. S1). In addition, Sunhong showed high yield, heat tolerance, rusty-root resistance, and lodging resistance in all trials conducted during the development process. The expression and distribution of anthocyanin color in Sunhong were distinctly different from those of Chunpoong and Yunpoong (Fig. 1), even though all of them were derived from Jakyung, a local landrace with red berries and violet stems (Kwon et al. 1998, 2000; Lee et al. 2015). Since Jakyung is the most widely cultivated ginseng in Korea despite its mixed origins (Bang et al. 2020), the distinct color of Sunhong makes it easy for farmers to distinguish Sunhong from other cultivars or local landraces in ginseng fields. High yield is one of the most important factors in the breeding and cultivation of ginseng, as the land suitable for ginseng cultivation is gradually decreasing on the Korean peninsula due to global warming (Lee et al. 2015, 2021). Heat damage greatly affects the yield of ginseng, as exposure to 30 °C for more than 7 days results in burning of leaf edges, eventually causing discoloration of the entire leaf and suspension of growth (Lee et al. 2010). Sunhong combined high yield with tolerance to heat damage at levels comparable to Yunpoong, which has long been evaluated as a high-yielding and heat-tolerant cultivar (Tables 2 and 3). The yield and heat tolerance of Sunhong are also similar to those reported previously for Sunil and Sunmyoung, which showed high yield and resistance to high temperatures (Lee et al. 2021). Rusty root, characterized by the development of reddish-brown to orange-brown areas on the root surface, leads to a decrease in the quality of ginseng and the content of ginsenosides (Rahman and Punja 2005). This has been reported as a disadvantage of Yunpoong in a number of previous studies (Lee et al. 1999; Kang et al. 2010; Kim et al. 2017), and it was one of the factors limiting the expansion of Yunpoong cultivation despite its high yield and heat tolerance. On the other hand, Sunhong was resistant to rusty roots, showing a 1.4 times lower prevalence than Yunpoong (Table 2). Lodging, a well-known disorder in cereals, is the process by which shoots are displaced from their vertical stance and become permanently tilted or lie horizontally on the ground, reducing yield by up to 80% and increasing harvest time (Mariani and Ferrante 2017). In the case of ginseng, lodging-susceptible cultivars such as Chunpoong and Sunil require the installation of additional belts to prevent stems from leaning out of the shade canopy and being exposed to strong sunlight and rain (Lee et al. 2021). However, Sunhong is not expected to have this inconvenience, as it showed 3.4 times higher lodging resistance than Chunpoong (Table 2). The ginsenoside contents of Sunhong were not significantly different from those of the previously developed cultivars, indicating that Sunhong is of suitable quality as a raw material for health functional food (Supplementary Table S2). Collectively, Sunhong has proven to be a novel cultivar with multiple advantages related to yield, quality, and convenience of cultivation. The heat tolerance and rusty-root resistance of Sunhong help it to withstand various environmental stresses during long-term cultivation, thereby maintaining its high yield and quality. Its distinct color makes purity management easy in the field, and its lodging resistance increases the convenience of cultivation. In addition, there was no difference in sprouting time, berry ripening time, or disease resistance compared to Yunpoong, which has been evaluated as the cultivar preferred by ginseng farmers and thus showed the highest cultivation rate of 7% (Bang et al. 2020). Therefore, Sunhong is considered a cultivar with strong potential to expand the cultivation area of ginseng cultivars and increase total ginseng production.

Although Sunhong has the convenience of maintaining purity based on the distinctness of its color, distinguishing Sunhong from other cultivars remains limited at early stages of development, such as seeds and seedlings, because of the highly similar morphology of ginseng. Therefore, accurate and reliable authentication systems, such as molecular marker-aided discrimination, are critical to improving the seed purity of superior cultivars, thus supporting ginseng production by supplying pure seeds to farmers. In this study, five TaqMan markers were identified through GBS analysis to discriminate eight elite ginseng cultivars including Sunhong. GBS is a cost-effective approach for genetic and genomic studies in a wide range of species with high diversity and large genomes (Elshire et al. 2011). Since ginseng has a large genome of about 3 Gbp and a complex genome structure with allotetraploidy and diverse repetitive sequences (Kim et al. 2018), the GBS strategy helped to easily obtain large numbers of SNPs from the eight ginseng cultivars. Approximately 4 K filtered SNPs were acquired from 192 samples, and phylogenetic analysis revealed that these SNPs have the discriminant power to clearly divide the eight ginseng cultivars into 7 clusters (Fig. 2). After further filtering and validation, five polymorphic SNPs were finally selected and converted to TaqMan markers (Supplementary Tables S4 and S5). The genotyping data set simplified by PCA and visualized in 3-D plots revealed that a combination of these five TaqMan markers was able to authenticate each of the eight elite cultivars (Fig. 3). TaqMan PCR is considered the preferred technique for genotyping a small number of SNPs in large populations, as it does not require a gel electrophoresis step and is thus high-throughput and time-efficient (Shen et al. 2009). In addition, owing to the presence of two allele-specific TaqMan probes, it is highly accurate and reliable compared to other PCR-based genotyping methods such as HRM analysis (Zhang et al. 2013). Therefore, our results are significant in efficiently establishing a convenient and reliable authentication system using high-throughput TaqMan markers developed with the GBS technique, which will help improve the seed purity of the novel superior cultivar.

The overall qualitative traits of Sunhong were very similar to those of Yunpoong, except for characteristics related to anthocyanin color expression (Fig. 1). In addition, Sunhong and Yunpoong were not distinguished in the phylogenetic analysis using the 3957 SNPs identified by filtering raw SNPs with thresholds of MAF > 5% and missing data < 30% (Fig. 2). These results imply that the two cultivars are genetically very similar, even though they were selected from independent Jakyung populations in different regions (Kwon et al. 2000). For this reason, none of the 208 polymorphic SNPs had different genotypes between Sunhong and Yunpoong (Table 4). To address this, loosened criteria of 50% missing data and 40% MAF were applied to SNP filtering for each cultivar. Missing data is one of the major limitations of GBS data, owing to the non-uniform distribution of sequence reads and low sequence coverage (Beissinger et al. 2013). Heterogeneity within ginseng cultivars was another limitation of GBS data analysis in this study. It is very difficult to achieve homogeneity in ginseng breeding because of the long generation period and morphological trait-based breeding (Kim et al. 2012). Indeed, some off-type individuals such as KP19, CS07, CS19, CP20, and CP21 were identified in the phylogenetic tree (Fig. 2), and heterozygous alleles were also detected by genotyping (Fig. 3C and Supplementary Fig. S2). To generate biological insights from GBS data of highly heterozygous and polyploid species, arbitrary choice of filtering thresholds or empirical selection of permissiveness to missing data has also been reported (Gardner et al. 2014; Bombonato et al. 2020). Considering these circumstances, as many as 24 replicates for each cultivar were used in the GBS analysis, which made it possible to loosen the thresholds for missing data and MAF, eventually revealing three SNPs with different genotypes between Sunhong and Yunpoong (Table 4). One of them was converted to a TaqMan marker, and thus the authentication system consisting of 5 TaqMan markers was finally established (Fig. 3D). This result demonstrates a simple and effective approach to extracting as much information as possible from existing GBS data, which facilitated the development of additional markers for varietal identification. Statistical and computational imputation could be performed on the GBS data in the future to further improve the SNP information obtained with high permissiveness to missing data.

In this study, we developed the ginseng cultivar Sunhong with pink berries and demonstrated its versatile characteristics, such as high yield, cultivation convenience, and tolerance to multiple physiological disorders. The GBS approach enabled the establishment of a reliable high-throughput authentication system for Sunhong and other ginseng cultivars by identifying a sufficient number of informative SNPs. In addition, we demonstrated that more permissive filtering thresholds can be effective in extracting more useful information from GBS data of heterozygous and polyploid species with narrow genetic differences, such as Sunhong and Yunpoong. It is hoped that these results will help to improve yield, quality, and homogeneity, and thereby promote the ginseng industry.