Introduction

Despite the large volume of globally traded vegetable oil, 75% of production comes from only four crops: soybean, oil palm, rapeseed, and sunflower. Because of the dominance of these four (Murphy 1999), many other oil crops are now underutilized or neglected, although these species offer opportunities through their great genetic diversity and diverse agro-ecological adaptation (Hammer and Heller 1998; Padulosi et al. 1999; Thies 2000). This dominance of a few crops has not only increased the vulnerability of agricultural production to sudden disease outbreaks but has also impoverished the human diet. Further domestication and development of underutilized crops is a possible solution for the growing and diversified nutritional needs of humankind.

Safflower (Carthamus tinctorius L.), a member of the family Asteraceae, is one of the important underutilized oil seed crops. It is a highly branched, herbaceous, thistle-like annual or winter annual, usually with many long sharp spines on the leaves. Plants are 30–200 cm tall and have strong taproots, which enable them to thrive in dry climates. It is a multi-purpose crop species grown for oil, medicinal and industrial uses, from the Mediterranean region to the Pacific Ocean, at latitudes between 45° N and 45° S. Traditionally, the crop was grown for its yellow-orange flowers, used for coloring and flavoring foods, and making dyes (Ashri 1975; Weiss 1983; Knowles 1989). Safflower oil is thought to be one of the highest quality vegetable oils, containing oleic acid and linoleic acid. There are two types of safflower; high linoleic types that have an approximate range of linoleic acid from 3.1% to 88.8%, and high oleic types with an approximate range of oleic acid from 3.9% to 90.6% (Fernández-Martínez et al. 1993).

For the effective use of underutilized crops, it is critical to understand the extent and distribution of genetic diversity within species (Padulosi et al. 1999). In the past, genetic diversity studies in crop plants mostly relied on the evaluation of morphological and agronomic traits (Upadhyaya et al. 2002; Atta-Krah et al. 2004). Currently, many molecular marker systems are routinely used to evaluate genetic diversity in plants. These include randomly amplified polymorphic DNA (RAPDs), amplified fragment length polymorphisms (AFLPs) and inter-simple sequence repeats (ISSRs) (Patzak 2001). RAPD markers are frequently used in genetic divergence studies of crops because they do not require prior genomic information, and are simpler, less costly, and less labor-intensive than other DNA marker techniques (Williams et al. 1990). RAPDs have been used successfully with many other underutilized crop species, for example, lentil (Ferguson et al. 1998), sesame (Bhat et al. 1999), and flax (Fu 2005).

Characterization of safflower germplasm has so far included agro-morphological studies (Khan et al. 2004; Dwivedi et al. 2005; Jaradat and Shahid 2006), biochemical analyses (Fernández-Martínez et al. 1993; Pascual-Villalobos and Albuquerque 1996; Zhang 2001) and, more recently, molecular marker studies (Sehgal and Raina 2005; Johnson et al. 2007; Yang et al. 2007). However, the RAPD study by Sehgal and Raina (2005) included only 14 Indian cultivars. Moreover, genetic diversity is best estimated if agro-morphological, biochemical and marker studies are used together (Vollmann et al. 2005) along with appropriate multivariate analysis (Mohammadi and Prasanna 2003). Although little information is available that integrates agro-morphological, biochemical and genetic methods in diversity studies of safflower (e.g., Johnson et al. 2007), core collections have been proposed by Johnson et al. (1999) and Dwivedi et al. (2005).

The objectives of this research were to evaluate the diversity in a comprehensive safflower germplasm collection at agro-morphological, biochemical and molecular levels, to determine relationships among these factors and, consequently, identify patterns of geographical diversity.

Materials and methods

Seed material of 747 accessions from all over the world was obtained from several genebanks and sown in April 2002 in the experimental field at Reinshof of Georg-August-University Göttingen (latitude 51°32′ North, longitude 9°57′ East). The experiment was laid out in a simple lattice design with two replications of each accession. Each plot had three rows, and 25 seeds were sown in each row. Each row was 120 cm long, and the path between rows was 80 cm. All accessions received the same agronomic practices, and weeds were controlled manually. The design was the same as that of similar field trials carried out at the University of Hohenheim, Germany, within the framework of a larger project “Screening, selection and suitability for cultivation of safflower (Carthamus tinctorius) and false flax (Camelina sativa) for the extraction of edible oil under conditions of ecological agriculture” (Reinbrecht et al. 2003) financed by StollVITA-Stiftung, Waldshut, Germany.

Data previously collected by Khan et al. (2004) for the traits flower color, spines, branching pattern, head size, disease, days to flowering, plant height, palmitic acid, stearic acid, oleic acid, and linoleic acid were analyzed. Methodology adopted for the collection of data was as follows. Flower color: Color of fresh flowers on the plants was recorded as white = 1, pale yellow = 2, yellow = 3, yellow orange = 4, orange = 5, orange red = 6, red = 7. Spines: Presence and absence of spines was observed as spineless = 1, spiny = 2. Branching pattern: Three levels of branching pattern were scored, basal = 1, medium = 2, upper = 3. Head size: Scored as small = 1, intermediate = 2, and large = 3. Disease: Infection of any kind of disease was noted as tolerant = 1, intermediate = 2, infected = 3. Days to flowering: Number of days from the date of sowing to the date when at least three plants showed open flowers. Plant height: Mean height (cm) at maturity was measured when plants were in full flower from ground level to the tip of the main stem. Fatty acid analysis: Gas Chromatography (GC) was used for fatty acid analysis after Velasco et al. (1998).

Seed was harvested manually at maturity, about 6 months after sowing. Because of disease incidence and unfavorable climate, only 193 of the 747 accessions produced enough seed to be used for further analysis. These 193 accessions were divided into geographical groups according to the information received from the genebanks, similar to Johnson et al. (1999), plus one group of accessions of unknown origin (Table 1).

Table 1 Safflower accessions divided into geographical groups based on origin after geographical information provided by genebanks

Analysis of agro-morphological and biochemical data

Pearson correlation coefficients were calculated for all agro-morphological and biochemical variables. The means of each geographical group for all agro-morphological traits and fatty acid composition were then used for cluster analysis to study the relationship between geographical groups. Euclidean distance was used as the resemblance coefficient for cluster analysis with the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) using NTSYS-pc software version 2.1 (Rohlf 2001).
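The two computations described above can be illustrated with a minimal Python sketch. It is not the NTSYS-pc procedure itself: the group labels G1 to G4, the trait values, and the helper names `pearson`, `euclidean` and `upgma` are hypothetical and chosen only to show how Euclidean distances among group means feed an average-linkage (UPGMA) clustering.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def euclidean(u, v):
    """Euclidean distance between two vectors of group means."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def upgma(labels, dist):
    """Average-linkage clustering; dist maps frozenset({a, b}) to a distance.
    Returns the merges in order as (cluster_a, cluster_b, height)."""
    clusters = {name: (name,) for name in labels}
    d = dict(dist)
    merges = []
    while len(clusters) > 1:
        # Join the two clusters that are currently closest.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))]
        merged = a + "+" + b
        na, nb = len(clusters[a]), len(clusters[b])
        for c in clusters:
            if c not in (a, b):
                # Size-weighted mean = average over all original pairs.
                d[frozenset((merged, c))] = (
                    na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
                ) / (na + nb)
        clusters[merged] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, height))
    return merges

# Hypothetical group means for two traits (illustrative values only).
group_means = {"G1": [0.0, 0.0], "G2": [3.0, 4.0], "G3": [0.0, 1.0], "G4": [3.0, 6.0]}
dist = {
    frozenset((a, b)): euclidean(group_means[a], group_means[b])
    for a, b in combinations(group_means, 2)
}
merges = upgma(list(group_means), dist)
for a, b, height in merges:
    print(f"{a} joins {b} at distance {height:.3f}")
```

In this toy example the two closest groups are joined first, and the height of the final merge equals the mean of all pairwise distances between the two remaining clusters, which is the defining property of UPGMA.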

Selection of accessions for marker study

Agro-morphological and biochemical data were standardized before use in multivariate analysis by applying the YBAR and STD options in NTSYS-pc software version 2.1 (Rohlf 2001). Principal component analysis (PCA) was then performed on all these variables. In order to cover broad agro-morphological and biochemical variation, accessions were chosen for further analysis on the basis of the first two PCA scores. Four accessions, two with minimum values and two with maximum values, were selected from each of the nine geographical groups for randomly amplified polymorphic DNA (RAPD) analysis. Single plants of the selected accessions were raised in a greenhouse of the University of Göttingen from seed produced in the field; however, one accession from South Western Asia did not germinate.
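The selection step can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the NTSYS-pc workflow: standardization is taken to mean column-wise z-scores, the principal components are obtained by simple power iteration on the correlation matrix, and "two with minimum values and two with maximum values" is read here as the lowest and highest scoring accession on each of PC1 and PC2. The accession names, trait values and function names are hypothetical.

```python
from math import sqrt

def standardize(rows):
    """Column-wise z-scores: subtract the mean, divide by the standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1)) for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def correlation_matrix(z):
    """Covariance of standardized data, which is the correlation matrix."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)] for i in range(p)]

def leading_components(mat, k=2, iters=500):
    """Top-k eigenvectors by power iteration with deflation (small sketch only)."""
    p = len(mat)
    mat = [row[:] for row in mat]
    vectors = []
    for _component in range(k):
        v = [float(i + 1) for i in range(p)]  # arbitrary non-zero start vector
        for _step in range(iters):
            w = [sum(mat[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        eigenvalue = sum(v[i] * mat[i][j] * v[j] for i in range(p) for j in range(p))
        vectors.append(v)
        # Deflate: remove the component just found before extracting the next.
        mat = [[mat[i][j] - eigenvalue * v[i] * v[j] for j in range(p)] for i in range(p)]
    return vectors

def scores(z, vector):
    """Project each standardized accession onto one component."""
    return [sum(a * b for a, b in zip(row, vector)) for row in z]

def extremes(names, values):
    """Names of the accessions with the lowest and the highest score."""
    order = sorted(range(len(names)), key=lambda i: values[i])
    return [names[order[0]], names[order[-1]]]

# Hypothetical accessions of one geographical group (illustrative values only).
# Columns: days to flowering, plant height (cm), linoleic acid (%).
names = ["A", "B", "C", "D", "E", "F"]
traits = [[80, 60, 70], [85, 70, 72], [90, 75, 78], [95, 90, 74], [100, 95, 80], [88, 72, 76]]

z = standardize(traits)
pc1, pc2 = leading_components(correlation_matrix(z), k=2)
pc1_scores, pc2_scores = scores(z, pc1), scores(z, pc2)
selected = sorted(set(extremes(names, pc1_scores) + extremes(names, pc2_scores)))
print(selected)
```

If the same accession is extreme on both components, this sketch returns fewer than four names; how such overlaps were handled in the study is not stated in the text.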

RAPD marker analysis

Genetic diversity was estimated in safflower accessions using RAPD marker analysis. Young, emerging leaves were harvested from a single, randomly chosen plant per accession three weeks after emergence. Genomic DNA was extracted from 90 to 120 mg of young leaf tissue of 35 accessions following the protocol of the Nucleon PhytoPure DNA Extraction Kit (Amersham Pharmacia). DNA concentration was quantified with a fluorometer (VersaFluor™, System 170-2402; Bio-Rad Lab. Inc.). For RAPD analysis of genomic DNA, 10-decamer primers (Operon DNA Technologies, Alameda, California, USA) were chosen randomly and others recommended by RC Johnson (personal communication). The polymerase chain reaction (PCR) was performed in a volume of 25 μl containing 50 ng of genomic DNA, 10× polymerase buffer, 0.2 mM of each dNTP, 3.0 mM MgCl2, 0.4 μM primer, and 1 unit of Taq polymerase. A PTC-100 thermocycler (MJ Research, Watertown, Massachusetts, USA) was used for PCR amplification under the following conditions: a denaturation step at 94°C (30 s) followed by 45 cycles of 1 min at 92°C, 1 min at 35°C, and 2 min at 72°C, and a final extension step of 5 min at 72°C. After amplification, 5 μl of loading buffer was added to each reaction tube and the amplification products were separated on 1.5% agarose gels in TBE buffer. The gels were stained with ethidium bromide (1 mg/l H2O) for 10–15 min and washed in distilled water for 25–30 min. The RAPD bands were then visualized on a UV transilluminator (λ = 254 nm) and photographed.

Analysis of RAPD marker data

Sharp and highly reproducible polymorphic RAPD bands were visually scored as present (1) or absent (0) to develop a binary matrix. Jaccard’s similarity coefficient was calculated and used to produce a UPGMA-based dendrogram. For the similarity analysis, NTSYS-pc software version 2.1 (Rohlf 2001) was applied.
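As a minimal illustration of the similarity measure, the Python sketch below computes Jaccard's coefficient from binary band scores: the number of bands shared by two accessions divided by the number of bands present in at least one of them, so that joint absences are ignored. The accession labels and band patterns are hypothetical, and returning 0.0 when neither accession shows any band is a convention assumed here.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity between two 1/0 band profiles of equal length."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Assumed convention: similarity is 0.0 if no band is present in either profile.
    return both / either if either else 0.0

# Hypothetical band scores (1 = band present, 0 = band absent).
bands = {
    "acc1": [1, 1, 0, 1, 0],
    "acc2": [1, 0, 0, 1, 1],
    "acc3": [0, 1, 1, 0, 0],
}

similarity = {
    (p, q): jaccard(bands[p], bands[q]) for p, q in combinations(bands, 2)
}
for (p, q), s in similarity.items():
    print(p, q, s)
```

For clustering, such similarities are usually converted to dissimilarities (1 minus similarity) before an average-linkage method is applied.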

Comparison of morphological and marker diversity

Finally, for the 35 accessions, a Euclidean distance matrix for agro-morphological and biochemical data and a Jaccard’s similarity matrix for marker data were compared. The correlation between the distance and similarity matrices was calculated using NTSYS-pc software version 2.1 (Rohlf 2001).
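One common way to compare two such matrices is a Mantel-type test: the off-diagonal elements of both matrices are correlated, and significance is judged by repeatedly permuting the rows and columns of one matrix. The Python sketch below assumes this approach; the text does not specify the exact routine used in NTSYS-pc, and the matrices, function names and number of permutations are hypothetical.

```python
import random
from math import sqrt

def upper(m):
    """Elements above the diagonal of a square matrix, row by row."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def mantel(m1, m2, permutations=999, seed=0):
    """Matrix correlation with a two-sided permutation p-value."""
    rng = random.Random(seed)
    n = len(m1)
    r_obs = pearson(upper(m1), upper(m2))
    idx = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(idx)
        # Permute rows and columns of the second matrix together.
        permuted = [[m2[idx[i]][idx[j]] for j in range(n)] for i in range(n)]
        if abs(pearson(upper(m1), upper(permuted))) >= abs(r_obs):
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)

# Hypothetical 4 x 4 distance matrix and a similarity matrix derived from it.
dist = [[0, 1, 4, 5], [1, 0, 3, 6], [4, 3, 0, 2], [5, 6, 2, 0]]
sim = [[1 - d / 10 for d in row] for row in dist]

r, p = mantel(dist, sim)
print(round(r, 3), p)
```

Because one matrix holds distances and the other similarities, perfect agreement between them appears as a correlation of -1 rather than +1, which is what this toy example produces.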

Results

Agro-morphological and biochemical diversity

The safflower germplasm evaluated exhibited large variation in all characters studied (Table 2). Fresh flower color varied from white to orange red, and both spiny and spineless accessions were present in all geographical groups. Under the field conditions tested, South Western Asian (mean 1.5 SD ± 0.4) and East Asian (1.6 SD ± 0.4) accessions were less susceptible to diseases compared to the remainder of the groups. The range of days to flowering in material from Africa, East Asia and South Western Asia was slightly larger than those of the other groups. The range of plant height in East Asian accessions was lower than in the other groups, while it was higher in accessions from Eastern Europe. Mediterranean and African material had a relatively narrow range of plant height compared to other groups.

Table 2 Agro-morphological and biochemical variation in 193 safflower germplasm accessions evaluated at Reinshof, Germany, according to different geographical regions of origin

Material from North America exhibited a larger range of palmitic acid as compared to material from the other groups, while for stearic acid, accessions from Central Western Europe had the widest range. For oleic acid, germplasm from North America, East Asia, and Africa displayed high maximum values. Linoleic acid ranged from 63% to 83%, with Eastern European accessions showing the greatest range (68.3–83.3%).

There was a significant (P < 0.01) correlation between days to flowering and plant height as well as between plant height and disease susceptibility (Table 3). Palmitic acid was positively correlated with both stearic acid and oleic acid, and negatively with linoleic acid (P < 0.01). A highly significant negative correlation existed between oleic acid and linoleic acid.

Table 3 Pearson correlation coefficients for agro-morphological traits and fatty acid composition of 193 safflower germplasm accessions evaluated at Reinshof, Germany

Patterns of agro-morphological and biochemical diversity

The cluster analysis based on means of all agro-morphological and biochemical variables from each geographical group showed that the accessions from Eastern and Southern Europe clustered together with the Mediterranean group, and African accessions were also close to this cluster. The East Asian group was the most distant (Fig. 1).

Fig. 1 Dendrogram produced by UPGMA clustering based on means of 7 agro-morphological traits and fatty acid composition (4 acids) of 193 safflower accessions from eight geographical groups. (East Asia, Ea; South Western Asia, Swa; Eastern Europe, Eeu; Central Western Europe, Cwe; Southern Europe, Seu; Mediterranean, Med; Africa, Af; North America, Na; Unknown, Un.)

When PCA was applied to all agro-morphological and biochemical variables together, the first four principal components accounted for 68.4% of the variation (Table 4). The first principal component (PC1) contributed about 25% of the variation, with the highest contributions from the proportions of linoleic and oleic acid; palmitic acid and disease susceptibility were also important for PC1. PC2 contributed 23% of the total variation and received higher contributions from agro-morphological traits, with days to flowering and plant height contributing the most.

Table 4 Eigenvectors and eigenvalues generated by Principal Component Analysis applied on agro-morphological traits and fatty acid composition of 193 safflower accessions

Despite the large diversity in germplasm for fatty acid composition, days to flowering and plant height, no clear separation between accessions from different geographical groups was achieved by PCA. However, a general tendency was observed that South Western Asian accessions and accessions from Central Western Europe lay close to each other (data not shown).

Genetic diversity

A total of 78 polymorphic bands were produced by 15 primers (Table 5). The primers produced a maximum of 10 (OP A-9, OP B-13) and a minimum of 1 (OP Al-1) polymorphic band. Operon primer AA-04 produced seven polymorphic bands (Fig. 2).

Table 5 Operon primers with number of scorable polymorphic bands, with 5′–3′ sequence used to assess genetic diversity of safflower

Fig. 2 RAPD analysis of 27 safflower accessions from eight different geographical regions with Operon primer AA-04 (see Table 5). (East Asia, Ea; South Western Asia, Swa; Eastern Europe, Eeu; Central Western Europe, Cwe; Southern Europe, Seu; Mediterranean, Med; Africa, Af; North America, Na; Unknown, Un.)

The cluster analysis based on marker data showed no clear grouping of accessions with regard to geographical regions of origin reported (Fig. 3). Two of the African and one of the Mediterranean accessions were most distant in the cluster.

Fig. 3 Dendrogram for 35 safflower accessions generated by UPGMA clustering using Jaccard’s coefficient of similarity on RAPD data. (East Asia, Ea; South Western Asia, Swa; Eastern Europe, Eeu; Central Western Europe, Cwe; Southern Europe, Seu; Mediterranean, Med; Africa, Af; North America, Na; Unknown, Un.)

The correlation between the Euclidean distance matrix for agro-morphological traits and fatty acid composition and the Jaccard’s similarity matrix for marker data was low and not significant (r = 0.027). This shows that agro-morphological and biochemical variation was generally independent of RAPD variation.

Discussion

Agro-morphological, biochemical and genetic diversity

A large amount of diversity was found in safflower accessions at agro-morphological, biochemical, and genetic levels. The traits contributing most to this variation were fatty acid composition (oleic and linoleic acid), plant height and days to flowering. However, the present germplasm did not contain high oleic acid types (Table 2) and, therefore, it is not directly comparable to the core and non-core accessions described by Johnson et al. (1999). Other studies (Ashri 1975; Jaradat and Shahid 2006) reported comparable results, showing a wide range of variation in safflower collections for agro-morphological traits as a result of human and natural selection. This suggests good potential for selection in a breeding program (Pascual-Villalobos and Albuquerque 1996).

PCA identified linoleic and oleic acid as the most important traits responsible for variation in our material. Variation in linoleic and oleic acid content of safflower oil indicates the possibility of improving oil quality through breeding (Fernández-Martínez et al. 1993). Consequently, the most diverse accessions could be selected on the basis of these first two principal components. Even though a small number of RAPD primers was used in this study, these still produced a reasonable number of polymorphic bands, indicating large genetic diversity contained in the materials studied, especially compared to the few cultivars researched by Sehgal and Raina (2005).

Relationship between diversity patterns and geographical origin

Neither cluster analysis nor PCA revealed a clear relationship between diversity pattern and geographical origin. For most of the traits studied, similarity among accessions was independent of their origin, and no trait in the present study could clearly separate the accessions on the basis of geographical origin.

Ashri et al. (1975) stated that the variation for days to flowering within a region was so great that the difference between regions was not apparent. Most regional groups in the comprehensive study by Johnson et al. (1999) strongly overlapped. Our results (Fig. 3) also showed large variation within regions, which most likely explains in part the lack of clear clustering of accessions according to geographical region of origin. On the other hand, because only a single plant per accession was studied by RAPDs, any variability contained within accessions was disregarded. Johnson et al. (2007) stated that the high AFLP marker uniformity within safflower accessions was likely associated with predominant self-pollination, although the species also has potential for substantial outcrossing. They were even able to show that eight bulk samples studied by AFLPs were representative of most individuals from these populations.

Results presented here may also reflect the practice of genebanks across the world, which, in their effort to assemble diverse samples, may have acquired duplicates of accessions from other institutions. For example, Li and Mündel (1996) and Dwivedi et al. (2005) reported elimination of duplicate entries from safflower genebank material. This interpretation is also supported by the fact that a few closely related accessions from otherwise diverse regions appear in the cluster analysis (Fig. 3). Ashri (1975) also reported that divergence was not always obvious. He reasoned that this could be due to uncertainty about variety origins, arising from the inclusion of recently naturalized varieties, the discrepancy between phytogeographical regions and political boundaries, or unreported exchange of seeds between regions by nomads, armies, or merchants. Broad diversity of safflower accessions within large regions is reasonable given the variability of climatic and agricultural conditions, as well as safflower’s traditional mode of cultivation in small plots, which could have fragmented the gene pool into small populations in which natural and human selection and genetic drift could have operated (Ashri 1975; Ashri et al. 1975; Jaradat and Shahid 2006).

Cluster analysis based on geographical groups showed that Eastern European and Southern European accessions were fairly similar and clustered together with the Mediterranean group. African accessions also lay in the same main cluster, which supports the studies by Johnson et al. (1999) and Jaradat and Shahid (2006) that reported the proximity of African accessions to the Mediterranean group. The East Asian group stood out as the most distant. An isozyme study of 89 safflower accessions showed that East Asian accessions had the highest mean gene diversity (Zhang 2001). In contrast, Johnson et al. (2007) attributed higher gene diversity to the American group due to past efforts of crop improvement by introducing exotic materials, especially from China. Genetic diversity showed no relationship to geographical origin (Fig. 3), as in the case of morphological diversity; accessions of unknown origin were also randomly distributed over the whole dendrogram.

Relationship between agro-morphological and biochemical, and genetic diversity

There was no association between agro-morphological and biochemical diversity and molecular diversity. Similar disparity between morphological traits and RAPDs was reported in different studies (Vollmann et al. 2005; Johnson et al. 2007). There could be many reasons for the lack of correlation between RAPDs and morphological distances. One reason could be that RAPDs detect polymorphisms in coding as well as non-coding regions of the genome. Because only a small portion of the genome is coding, it is very likely that a given polymorphism lies in a non-coding region. The relationship between molecular markers and phenotypic traits could be significant if the markers were linked to selected loci (Persson and Gustavsson 2001). Also, plants that are morphologically similar are not necessarily genetically similar, as different gene pools could result in similar phenotypes. Comparisons may show similarity in morphological traits, in molecular markers, in both, or in neither, which makes the relationship between groups complex to interpret. Therefore, consistency should only be expected if plants had shared genetic resources and parallel breeding objectives or, conversely, were very different in both gene-pool source and selection targets (Roldán-Ruiz et al. 2001; Archak et al. 2003).

Conclusions

The diversity encountered in safflower germplasm indicates that there is large potential for the improvement of safflower for both agronomic and quality traits. This wide variation, however, is probably only a proportion of the variation that exists in safflower worldwide, and further evaluation appears warranted to use and conserve this potentially valuable genetic resource. Breeding strategies need to exploit existing variation within the safflower germplasm to broaden the genetic base of currently used cultivars.