Introduction

Landraces are dynamic, heterogeneous crop populations that have evolved in response to natural selection by the local environment, mutation, migration and genetic drift, as well as to the preferences and agricultural practices of local farmers (Boggini et al. 1997; Frankel et al. 1995; Mohammadi et al. 2014; Pusadee et al. 2014; Villa et al. 2005). They are associated with traditional farming systems and are not the result of premeditated crop improvement (Villa et al. 2005). Current elite cultivars have significantly lower genetic diversity than landraces or wild relatives (Boczkowska et al. 2015; Boczkowska and Onyśk 2016). Besides their genetic diversity, landraces have been a significant part of the traditional knowledge and history of local communities, a role that is still underestimated. Poland was one of the very few European countries that maintained on-farm cultivation of cereal landraces for a relatively long time, owing to the considerable fragmentation of farms and, consequently, extensive land use. Such crop management was especially typical of the eastern and southern parts of the country (Hammer et al. 1981; Hanelt and Hammer 1977; Kulpa and Jastrzębski 1986). Currently, cereal landraces are rarely collected during expeditions because of significant changes in agriculture.

The total area of arable land under sown crops in Poland decreased by about 4 million hectares during the last 50 years, which is over 25% of the total area. The decline was particularly drastic after 1990, when economic factors caused large areas of arable land to be set aside. In a period of <20 years (1990–2008), the area of agricultural land decreased by over 2.5 million ha. This decline was due to the transfer of land to non-agricultural purposes, including afforestation. Poor and very poor soils constituted more than 60% of the land allocated for non-agricultural purposes (Krasowicz and Kuś 2010). Many farms, especially small ones, have also abandoned plant production in recent years. The average farm area is growing, although 1 million of the 1.4 million farms still have a size of 1–10 ha. There have also been very large changes in the cultivation area of some cereal species, which is particularly evident for rye and oats. During the last 50 years, the area of oats has decreased more than threefold, from about 1.7 million ha to 0.5 million ha (Sułek and Leszczyńska 2004). Because oat grain is traditionally used mainly as fodder, and only a small part is used for consumption and industrial processing, a significant decline in the stock of horses and unfavourable seed prices have resulted in a systematic decline in the oat growing area. However, in sub-montane areas and in the north, especially the north-eastern part of the country, oats still play an important role in the sowing structure (Sułek and Leszczyńska 2004).

The main aim of this study was a comprehensive analysis of the diversity of common oat landraces acquired from different regions of Poland, employing multiple phenotypic, genetic and metabolic approaches. The focus was on establishing mutual relationships between germplasm features determined at various description levels and on assessing the impact of the eco-geographic conditions of the landrace place of origin on the descriptor patterns. The biochemical characterization of germplasm was performed using infrared spectroscopy, a convenient approach for high-throughput analysis that provides a metabolic fingerprint of a sample. The feasibility of this method for discrimination and taxonomic applications has been successfully demonstrated in studies of various plant organs or parts, including seeds (Demir et al. 2015; Liu and Yu 2011), leaves (Kim et al. 2004; Lu et al. 2008), petals (Li et al. 2015), roots (Naumann et al. 2010), and pollen (Zimmermann and Kohler 2014).

To the best of our knowledge, this is the first attempt to associate spectral features of the same biological material to the extended range of characteristics including eco-geographic data, multiple phenotypic traits and genetic background.

Methods

Plant material

A germplasm assembly of 67 common oat landraces collected from 1973 to 1999 in Poland was obtained from the National Centre for Plant Genetic Resources (NCPGR). The list of tested accessions is provided in the supplementary file (Appendix Table 1). The botanical varieties of all accessions were determined according to the classification of Rodionova (Rodionova et al. 1994) and were described in detail by Boczkowska and Tarczyk (2013).

Eco-geography

The eco-geographic conditions of the collection sites were defined on the basis of geographic coordinates in the passport data (Nowosielska and Nowosielski 2008) (Appendix Table 1). Oat (Avena sativa L.) landraces were collected in five regions of Poland. The geographic coordinates of the collection sites ranged from 49.21 to 54.20 N latitude and from 19.16 to 23.36 E longitude. The altitude of the collection sites ranged from 100 to 913 meters above mean sea level (m a.s.l.). Based on statistical meteorological data, the annual average temperature in Poland is estimated at approximately 8 °C; however, in mountainous regions it is significantly below the average. The annual average temperature at the collection sites was between 5.7 and 8.8 °C, with a mean of 7.7 °C. Average annual precipitation in Poland is about 630 mm. At the collection sites it ranged from 535 to 1066 mm, with an average of approximately 730 mm, which resulted from the typically much higher precipitation in mountain areas.
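The coordinates described above are the basis for the geographic distances used later in the matrix comparisons. The text does not state which distance formula was applied, so the following is only a minimal sketch that assumes the great-circle (haversine) distance on a spherical Earth; the function name and the radius constant are illustrative choices, not taken from the study.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical pair of sites placed at the extremes of the reported coordinate ranges
span_km = haversine_km(49.21, 19.16, 54.20, 23.36)
```

Applying such a function to every pair of collection sites would yield a symmetric geographic distance matrix of the kind compared with the other distance matrices.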

Genetic analysis

Each accession was represented by a bulk sample of 25 randomly chosen plants. The genetic analyses, AFLP (Boczkowska et al. 2014) and ISSR (Boczkowska and Tarczyk 2013), were described in detail in our previous studies and are also included in the supplementary file (Appendix).

Phenotypic analysis

The phenotype of the examined accessions was described using both qualitative (morphological) and quantitative (evaluation of agriculturally relevant traits) characteristics. The morphological description was based on 25 traits (Appendix Table 2), which were assessed according to Boczkowska et al. (2014). Twelve phenological and agronomical quantitative traits, hereinafter called ‘evaluation’, were described on the basis of field experiments during three consecutive seasons, using standardized scales and descriptors (Appendix Table 3). The trial was carried out on an experimental field of the Plant Breeding and Acclimatization Institute at Radzików, Poland, in 2009–2011. Each accession was represented by 1000 grains, which were sown on 2.5 m2 plots. The characters were scored at an appropriate growth stage of the oat plants following UPOV (1994). The traits were measured as follows: the number of days to heading was counted from the sowing date to the date when 50% of panicles had fully emerged; the number of days to maturity was counted from the sowing date to harvest ripeness; the shoot height was determined as the average of three plants measured from the ground to the tip of the panicle; the lodging score was determined at the mature stage on a scale where 1 means all plants lodged and 9 means no lodging at all; the susceptibility to diseases was determined based on the percentage of affected plants (natural infections); the grain yield was assessed as the weight of grain per plot; the thousand seed weight was estimated based on the average weight of three samples of 100 fully filled grains; the number of panicles (number of fertile tillers) was an average of 10 plants.
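As a worked illustration of the thousand seed weight estimate described above, the following minimal sketch scales the mean weight of equally sized grain samples up to 1000 grains. The function name and the sample weights are hypothetical and serve only to show the arithmetic.

```python
def thousand_seed_weight(sample_weights_g, grains_per_sample=100):
    """Estimate thousand seed weight (g) from the weights of equal-sized grain samples."""
    if not sample_weights_g:
        raise ValueError("at least one sample weight is required")
    mean_weight = sum(sample_weights_g) / len(sample_weights_g)
    # Scale the mean weight of one sample up to 1000 grains
    return mean_weight * 1000 / grains_per_sample


# Hypothetical weights (g) of three samples of 100 fully filled grains
example_tsw = thousand_seed_weight([3.1, 3.3, 3.2])
```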

Metabolic analysis

About fifty randomly selected dehusked oat grains were ground and homogenized into a fine powder with a laboratory ball mill (MM 301, Retsch GmbH, Germany). After drying at 35 °C for 24 h, the material was transferred to a desiccator and kept there until the measurements. FTIR spectroscopy measurements were performed at ambient temperature using an FTIR spectrometer (Nicolet iZ 10 module, Thermo Scientific, USA) equipped with a KBr/Ge beam splitter, a deuterated triglycine sulfate (DTGS) detector and the Smart Orbit Attenuated Total Reflectance (ATR) accessory with a one-bounce diamond crystal. Sixty-seven interferograms were collected for three biological replications at a resolution of 4 cm−1 and co-added within the wavenumber range between 4000 and 400 cm−1. Spectra were corrected for background, vapor absorption and baseline (rubber band method), smoothed, area-normalized and averaged using OMNIC software (v. 8.1, Thermo Fisher Scientific Inc.).
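The last two preprocessing steps named above, area normalization and averaging of replicate spectra, can be illustrated with a short sketch. It is a generic illustration and not a reproduction of the OMNIC routines, whose internals are not described in the text; it assumes trapezoidal integration and replicate spectra sampled on the same wavenumber grid, and all function names are hypothetical.

```python
def trapezoid_area(x, y):
    """Absolute area under y(x) by the trapezoidal rule (x may be descending)."""
    area = 0.0
    for i in range(1, len(x)):
        area += 0.5 * (y[i] + y[i - 1]) * (x[i] - x[i - 1])
    return abs(area)


def area_normalize(wavenumbers, absorbance):
    """Scale a spectrum so that the area under the curve equals 1."""
    area = trapezoid_area(wavenumbers, absorbance)
    if area == 0:
        raise ValueError("spectrum has zero area and cannot be normalized")
    return [a / area for a in absorbance]


def average_spectra(spectra):
    """Point-wise mean of replicate spectra measured on the same wavenumber grid."""
    n = len(spectra)
    return [sum(values) / n for values in zip(*spectra)]


# Hypothetical three-point replicate spectra on a descending wavenumber axis
grid = [4000.0, 2200.0, 400.0]
replicates = [[0.10, 0.30, 0.20], [0.12, 0.28, 0.22], [0.11, 0.32, 0.18]]
mean_spectrum = average_spectra([area_normalize(grid, r) for r in replicates])
```

Normalizing each replicate before averaging keeps differences in the amount of sample in contact with the ATR crystal from dominating the mean spectrum; whether the software applies the steps in this order is an assumption.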

Data analysis

To establish mutual correlations and interrelationships between the various types of data, the measured parameters were converted to binary matrices. AFLP and ISSR products of a given length were scored as 1 when present and 0 when absent. The morphological qualitative traits were converted in a similar manner, i.e. the presence or absence of a particular state of a trait was coded as 1 or 0, respectively. FTIR spectral data collected for the whole range of wavenumbers between 4000 and 400 cm−1, as well as the quantitative evaluation scores (mean of 3 years), were normalized and then binarized. The normalization was defined by:

$$a_{i}^{'} = \frac{{a_{i} - a_{i\_\hbox{min} } }}{{a_{i\_\hbox{max} } - a_{i\_\hbox{min} } }}$$

where a_i is the original value, a_i_max is the maximum among all the data points, a_i_min is the minimum among all the data points and a'_i is the data point i normalized between 0 and 1. The resulting data were binarized as: [0, 0.5] = 0 and (0.5, 1] = 1. In addition, the spectral bands at wavenumbers of 1700–1480 and 2990–2830 cm−1, assigned to protein and lipid, respectively, were extracted and treated as two further matrices. Furthermore, matrices of genetic (AFLP + ISSR), phenotypic (morphology + evaluation) and spectral metabolic data (the whole range) were prepared. Subsequent analyses were carried out in the same manner for all of the matrices. Nei’s coefficient (Lynch and Milligan 1994) was calculated to estimate genetic variation within the groups of accessions defined on the basis of eco-geographic data, morphological traits and the method used. One-way ANOVA was used to confirm the significance of differences in the coefficient values between groups. Post-hoc Fisher’s least significant difference (LSD) test was applied to explore all possible pair-wise comparisons of means. The genetic distance was calculated based on Nei’s formula (Nei 1978). The geographic distance was calculated based on the geographic coordinates of the collection sites. Matrices of absolute differences in environmental conditions were created based on the eco-geographic data. The Mantel test (Mantel 1967) with 999 permutations was conducted to compare the dissimilarity matrices with the matrices of geographic distance and of absolute differences in environmental conditions. Principal Coordinate Analysis (PCoA) was used for graphical representation of a resemblance matrix between landraces. Generalized Procrustes Analysis (GPA) was used to obtain a consensus configuration of all analysis types. Redundancy Analysis (RDA) allowed the relationship between the PCoA matrices and the eco-geographic and phenotypic data to be studied. Two-way ANOVA was conducted for the agronomic traits included in the evaluation part of the phenotypic description. All analyses were performed using GenAlex 6.5 (Peakall and Smouse 2012) and Microsoft® Excel 2010/XLSTAT©-Pro software (Version 2013.4.07, Addinsoft, Inc., Brooklyn, NY, USA).
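The normalization and binarization rule described in this section can be sketched in a few lines. The sketch assumes that the minimum and maximum are taken over the series passed to the function (for example one trait across accessions); the text does not specify the scope of "all the data points", and the function names are hypothetical.

```python
def min_max_normalize(values):
    """Rescale values to [0, 1] using (a - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("constant data cannot be min-max normalized")
    return [(v - lo) / (hi - lo) for v in values]


def binarize(normalized, threshold=0.5):
    """Map [0, threshold] to 0 and (threshold, 1] to 1."""
    return [1 if v > threshold else 0 for v in normalized]


# Hypothetical trait values for four accessions
scores = [2.0, 4.0, 6.0, 10.0]
normalized = min_max_normalize(scores)   # [0.0, 0.25, 0.5, 1.0]
binary = binarize(normalized)            # [0, 0, 0, 1]
```

Note that a value exactly at 0.5 falls into the lower class, in line with the closed interval [0, 0.5] given above.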

Results

Phenotype description

Qualitative traits (morphological description)

The morphological traits differed in terms of variation, calculated as Nei’s coefficient (Hj), within the collection (Appendix Fig. 1). The highest variability among the tested landraces was observed for the angle between the leaves and the culm axis (Hj = 0.426). Both erectness of spikelets and hairiness of lemma showed a total lack of variation within the set of landraces: all of the plants had drooping spikelets and a hairless lemma. The remaining 22 traits revealed medium to low diversity.

Quantitative traits (phenological and agricultural values)

Yield-forming parameters, as well as the disease resistance scores, showed great variability depending on the accession (Appendix Table 3). The two-way ANOVA indicated that the heading date was the sole trait determined by genotype, while the other characteristics were influenced by environmental conditions to varying degrees.

FTIR spectroscopy-based metabolic fingerprint

Dehusked oat grains showed a few distinct and prominent peaks in the spectral region characteristic of functional groups, between 4000 and 1500 cm−1, and several overlapping peaks in the fingerprint region between 1500 and 900 cm−1 (Appendix Fig. 2a). The broad and strong band centered at 3316 cm−1 reflected N–H and O–H stretching vibrations, mostly due to the presence of proteins and polysaccharides. It is also shaped to some extent by the presence of liquid water in the grains, particularly through the contribution of symmetric (~3270 cm−1) and asymmetric (3490 cm−1) stretching and the overtone of bending vibrations of water molecules (~3290 cm−1) (Max and Chapados 2009; Venyaminov and Prendergast 1997). Broadening of the band results from the extensive hydrogen bonding of water and the biomolecules containing –NH and –OH groups (Stuart 2005). The strong peaks at 2926 and 2854 cm−1 were assigned to methylene –CH2 asymmetric and symmetric stretching vibrations, respectively. The bands specific for methyl –CH3 groups occurred as minor shoulders at 2962 and 2863 cm−1 (Appendix Fig. 2b) and showed insubstantial absorbance relative to the methylene groups (–CH2). This indicates the prevailing contribution of long-chain fatty acids, with a rather minor impact of side chains in proteins, to the spectral features in this region (Silverstein et al. 2005). The spectral features of this region were used in the present paper for discriminating accessions in terms of lipid fraction content and quality. The presence of a shoulder at 1740 cm−1 due to carbonyl C=O stretching vibrations of the ester group, together with the lack of signals generated by free acids at about 1710 cm−1, indicates that the fatty acids were largely in esterified forms (Vlachos et al. 2006). The protein fraction of the oat grains was spectrally characterized by the shape and height of the bands centered at 1652 and 1544 cm−1, assigned as the amide I and amide II bands, respectively (Appendix Fig. 2a) (Barth 2007). The features of this region were considered here as spectral characteristics of the grain proteins.

The wavenumber range between 1200 and 900 cm−1, with a prominent peak at 1023 cm−1, two distinct shoulders at 1152 and 1079 cm−1 on the higher frequency side and a minor shoulder at 960 cm−1 on the lower frequency side, is considered a spectral characteristic of carbohydrates (Kačuráková and Wilson 2001) (Appendix Fig. 2c). This region consists of many overlapping peaks due to various sugar compounds and linkages and to some extent may show a contribution of nucleic acids (Tajmir-Riahi et al. 2009). The distinct bands in the FTIR spectrum of oat grains (upper) at 1152, 932, 858, and 763 cm−1 were located at the wavenumbers characteristic of the spectrum of starch, taken herein as the experimental reference (lower), and thus reflected the contribution of this component. However, the most prominent and broad band, centered at 1023 cm−1, only partially corresponded to the starch spectral features and should thus be considered the result of overlapping bands attributed to the remaining polysaccharide components, mostly (1→3) (1→4)-β-glucan and, to a much lower extent, cellulose, which occurs at a relatively minor content; these typically show peaks around 1030 cm−1 (Guillon et al. 2011). The absorbance from (1→3) (1→4)-β-glucan also contributed to the bands at 1152 and 1080 cm−1. In turn, the band of highly branched arabinoxylans, prominent in wheat at around 1040 cm−1 (Toole et al. 2007), was not present in the oat landraces. Overall, the whole-range infrared spectrum, taken in this paper as one of the accession characteristics, gave insight into the metabolic profile of dehusked grains, reflecting the relative contribution of the main biochemical components, including storage and structural polysaccharides, proteins, lipids and some minor chemical compounds, e.g. phenolics.

Diversity

Comparison of the diversity within the genetic, phenotypic and metabolic data (Appendix Fig. 3) showed that the highest differentiation occurred for the latter (Hj = 0.378), while the lowest was obtained for the genetic data (Hj = 0.217). The variation detected by particular techniques ranged from 0.205 for AFLP to 0.358 for FTIR spectrometry of lipids. The level of diversity was also determined within groups created on the basis of passport and eco-geographic data (Appendix Fig. 4).
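For readers unfamiliar with the Hj values reported here, the following sketch shows one common way to derive a gene diversity estimate from a binary (presence/absence) matrix. It is a simplified illustration only: it assumes Hardy-Weinberg equilibrium and uses the plain square-root estimator of the null-allele frequency, without the bias corrections of Lynch and Milligan (1994), and it is not claimed to match the exact computation performed in the software used in the study. The function names are hypothetical.

```python
import math


def locus_diversity(column):
    """Expected heterozygosity 2pq for one dominant binary marker (1 = present)."""
    n = len(column)
    absent_fraction = sum(1 for v in column if v == 0) / n
    q = math.sqrt(absent_fraction)  # null-allele frequency, assuming Hardy-Weinberg
    p = 1.0 - q
    return 2.0 * p * q


def mean_diversity(matrix):
    """Average diversity over loci; rows are accessions, columns are loci."""
    columns = list(zip(*matrix))
    return sum(locus_diversity(c) for c in columns) / len(columns)


# Hypothetical matrix: four accessions scored at two loci
example = [[1, 1], [1, 1], [1, 1], [0, 1]]
hj = mean_diversity(example)  # first locus 0.5, second locus 0.0, mean 0.25
```

Under this estimator the value per locus cannot exceed 0.5, which is consistent with the range of the values reported in this section.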

Regions

According to the genetic and metabolic datasets, the highest differentiation occurred in the South region, while the phenotypic data indicated the highest variation in the South-East. Both genetic and phenotypic data showed that the minimum variation was attributed to the central part of Poland, but the metabolic data revealed the lowest value in the North-East. Each of the individual techniques, except for the quantitative traits approach, showed a minimum diversity in the center of the country (Appendix Fig. 4).

Altitude

The accessions collected above 600 m a.s.l. had higher metabolic variation than those acquired from lower altitudes. The genetic and phenotypic results showed the opposite trend. Analysis of the data acquired by particular techniques showed that the differentiation of accessions increased with altitude for both lipids and proteins (i.e. the FTIR ranges corresponding to lipids and proteins), except for the group >800 m a.s.l. (Appendix Fig. 4).

Average annual precipitation

The highest metabolic variation was observed within the group with the highest annual rainfall. The genotypic data presented the opposite trend. For the phenotype, no statistically significant differences between groups in the Hj level were found, regardless of the method used for plant description (Appendix Fig. 4).

Average annual temperature

The diversity differed significantly only for the metabolic data. Interestingly, the level of this variation declined as the temperature increased (Appendix Fig. 4).

Mantel test

A pairwise Mantel test was performed for all combinations of distance matrices (Appendix Table 4). A statistically significant correlation was observed for the comparison of the genetic and phenotypic distance matrices (0.272, p = 0.05). The correlations of the metabolome with both genotype and phenotype were not significant (p > 0.05). On the other hand, comparison of particular techniques allowed a few other significant correlations to be identified, including ISSR vs both morphology and proteins, and morphology vs both evaluation and proteins.

In the next step, all of the developed distance matrices were also compared with the geographical distance and with the absolute differences in altitude, precipitation and temperature (Appendix Table 4). The phenotype matrix correlated significantly, but at a low level, with all of the tested matrices, while the genotype was not correlated with geographical distance. At the same time, no significant correlation was found between the metabolome and the matrices describing the collection site. Further relationships are given in more detail in the Appendix.
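The matrix comparisons reported in this section rest on a permutation test. The following is a minimal sketch of a one-sided Mantel test using the Pearson correlation of the upper triangles and joint row/column permutations; it is written for illustration and is not the GenAlex or XLSTAT implementation. The function names, the one-sided alternative and the fixed random seed are assumptions of the sketch.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(dist_a, dist_b, permutations=999, seed=0):
    """Return (r, p): observed matrix correlation and one-sided permutation p-value."""
    rng = random.Random(seed)
    n = len(dist_a)
    flat_a = upper_triangle(dist_a)
    observed = pearson(flat_a, upper_triangle(dist_b))
    hits = 0
    order = list(range(n))
    for _ in range(permutations):
        rng.shuffle(order)
        # Permute rows and columns of the second matrix with the same ordering
        permuted = [[dist_b[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(flat_a, upper_triangle(permuted)) >= observed:
            hits += 1
    # The observed arrangement is counted as one of the permutations
    return observed, (hits + 1) / (permutations + 1)
```

With 999 permutations, as used in the study, the smallest attainable p-value under this counting scheme is 0.001.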

Grouping analysis

The PCoA results obtained from AFLP, ISSR, phenotypic qualitative and quantitative traits, and protein and lipid spectroscopic profiles were combined using Generalized Procrustes Analysis (GPA). The consensus proportion (Rc, i.e. the proportion of total variance explained by the consensus matrix) for the data set was rather low (0.297), so the agreement among the six methods was not strong, although a permutation test (n = 10,000) indicated that it was statistically significant (p < 0.05). Based on pairwise Rc values, it was clear that AFLP and evaluation gave the least consensual results (Appendix Table 5). The first two PCA dimensions (F1 and F2) accounted for 58% of the variance in the consensus matrix (Fig. 1). At first glance, based on the bi-plot of the first two dimensions, no distinct group of accessions was seen, but a thorough analysis of the scatter plots showed that the grouping pattern was related to yield (RDA correlation = 0.661) and to the average annual temperature at the collection site (RDA correlation = 0.561) (Table 1; Fig. 1).

Fig. 1 Biplots of axis 1 versus axis 2 of the GPA. The upper panel provides a general plot with the accession numbers. The two smaller panels below present the grouping pattern related to yield (left) and to average annual temperature at the collection site (right)

Table 1 Results of RDA analysis

Discussion

In this paper, we evaluated the impact of ecosystem attributes defined by eco-geographical conditions on germplasm biodiversity using 67 oat landraces. A large data set, generated by a number of techniques including genetic (1260 loci), phenotypic (37 traits) and metabolomic approaches (several hundred variables), was employed for a combined and comparative analysis and for establishing regularities that characterize the landraces.

Analysis of the phenotypic data showed that morphologically similar accessions had very different agronomically relevant characteristics. Moreover, the environment influenced the richness of phenotypes and also favored some morphotypes specific to a given location. The interdependence found between stalk morphology and the thousand seed weight additionally confirmed deliberate selection by farmers. Interestingly, white lemma landraces mostly occurred in mountain regions with the highest total annual precipitation. This might be explained by the generally lower tolerance of white lemma genotypes to drought (Lewicki and Mazurek 1967). Furthermore, accessions collected in close proximity were characterized by a very similar appearance but quite different evaluation scores, which, in turn, may indicate that farmers in particular regions preferred to maintain specific morphological forms. The morphological diversity of the examined populations, ranked in our test at an average level, was higher than that estimated on the basis of the same criteria for oat cultivars grown in Poland before 1939 (Boczkowska et al. 2014). All of the analyzed landraces showed relatively high resistance to diseases such as crown and stem rust and powdery mildew. This could be a consequence of the buffering effect on disease spread in genetically heterogeneous populations such as landraces (Frankel et al. 1995) or of the presence of different resistance alleles.

Thorough analysis also revealed that accessions collected in close proximity were completely different genetically, despite their morphological similarities. This indicates a different origin of the landraces grown in particular regions and shows that the genetic distinctiveness of the accessions has been preserved. A small, albeit statistically significant, correlation of the genetic markers with altitude, precipitation and temperature was clearly visible. Consequently, the specific genotype patterns of the oat landraces could be influenced, or even created to a great extent, by the local environment. Based on molecular data, it cannot be unequivocally confirmed whether these changes are related to coding or non-coding genome regions. However, the comprehensive analysis presented in this paper indisputably demonstrated that some of them were located in coding regions and were reflected in both the morphological traits and the protein spectral features.

Cereal grains are biochemically highly heterogeneous organs, containing starch, proteins and lipids as the major compounds (Klose and Arendt 2012; Redaelli et al. 2003; Shewry and Halford 2002) and a relatively high fraction of cell wall components, including heteroxylans, beta-mixed glucans and cellulose (Biel et al. 2014; Conciatori 2000; Shewry et al. 2008; Tiwari and Cummins 2009). All of these compounds and their relative contents affect the average metabolic profile of a grain, providing a unique biochemical signature to particular cereal crop accessions, strains, cultivars, or varieties. By applying FTIR spectroscopy analysis, we expected to reveal the selective pressure of the local environment on the grain metabolic fingerprint, particularly in relation to lipids and proteins, two main grain compounds that were pivotal in the biological improvement of oats. We found that FTIR spectroscopy revealed a much higher level of diversity of the oat landraces compared to the remaining examined features. This particularly concerned data extracted from the whole mid-infrared range of spectra and from the frequency range characteristic of lipids. Protein-related vibrational data showed a polymorphism comparable to that of genetic markers. The total spectral profile, which corresponded to almost all functional groups of organic compounds, did not display any statistically significant correlation with either the genotype or the phenotype. The lack of a clear relationship between the whole metabolome and AFLP and RFLP markers has been previously reported for sesame (Laurentin et al. 2008) and rice (Mochida et al. 2009) germplasm, respectively. Truncation of the spectral data to frequencies assigned specifically to proteins or lipids gave a much more interesting insight into the landrace diversity and enabled the direct association of traits with the environment of the place of origin. In particular, the FTIR technique revealed landrace variability in relation to the altitude, annual temperature, and annual precipitation at the collection sites, some of which could not be detected with other traits. For the groups classified according to the mean annual temperature, only the spectral data revealed significant diversity of the landraces, and its level declined as the temperature increased. Since all accessions were examined in the same field trials and were thus exposed to an identical environment, the observed metabolic diversity reflected specific, adaptive modifications of the landraces to the habitats in their place of origin. In other studies, biochemical traits based on non-targeted spectroscopic data have been used as markers for accession identity (Dumlupinar et al. 2011; Luthria et al. 2008), for investigating the effect of environmental factors on cereal grain quality (Andersson and Börjesdotter 2011; Shao et al. 2015; Yan et al. 2007), or for establishing phylogenetic relationships (Demir et al. 2015).

Oat grains contain lipids at a relatively high concentration, mainly in triacylglycerol form (Klose and Arendt 2012), and show high genotypic variability in this trait (Banaś et al. 2007; Biel et al. 2014; Danytė 2012). Among cereals, they possess an exceptionally high capacity to accumulate oil in the endosperm. The lipid FTIR-based fingerprint showed the highest variability among all tested traits. This feature was also highly diverse when accessions were grouped according to the annual average temperature, the region of landrace origin and the altitude as classification criteria. The variability pattern was, however, distinct from that for proteins. Interestingly, a weak but statistically significant correlation was established between FTIR signatures and the ISSR markers.

The diversity level of oat landraces derived from the amide I and II bands assigned to proteins (Barth 2007) was comparable to that obtained from the genetic polymorphism analysis. A significant correlation occurred between the protein-related spectra and both genotype and phenotypic qualitative traits. Moreover, a large group of landraces with a very high internal similarity of protein and lipid spectral profiles was identified. Interestingly, accessions collected in the same or a very close location had a similar lipid profile, whereas considerable differences were observed in proteins.

This study also revealed interesting relationships indicating the plasticity of the grain biochemical profile dependent on the location of origin and thus prevailing environmental conditions in the natural habitat. This adaptive evolution coincided with the shift in the genetic background of accessions and possibly altered metabolic pathways shaping the pattern of accumulation of the major compounds in grains that was reflected in the FTIR spectral fingerprints. The biochemical diversity of the tested landraces might be a result of both their adaptive fitting to local habitats and the farmer-driven activity (Leclerc and Coppens d’Eeckenbrugge 2011; Watson and Eyzaguirre 2002).

The grains of oats are superior to those of other cereals owing to their relatively high lipid content, health-beneficial lipid and protein composition, high beta-glucan to arabinoxylan ratio and abundance of various phytonutrients, which render them a valuable source of dietary compounds (Araus and Cairns 2014; Banaś et al. 2007; Biel et al. 2009; Kueger et al. 2012; Loskutov and Rines 2011; Vernocchi et al. 2011; Vilmane et al. 2015). Exploration of natural variability in these traits is important for the selection of valuable gene donors for breeding oriented to improved grain quality. Landraces can also be a superior source of genes determining an advantageous response to adverse environments, which may attract attention in the near future given the priorities of sustainable crop production and the challenge of climate change.