Introduction

The biochemical approach to the study of systematics and population genetics has proven to be extremely informative for a wide range of organisms. Biochemical techniques have been particularly useful in the systematics of animals which are morphologically very similar, and when environmental factors may influence classical taxonomic characters (Wilson et al., 1977; Platzer, 1981; Bakker, 1987; Nadler, 1990).

Nowadays, DNA markers detected by PCR are being used mainly to establish genetic relationships within populations (Random Amplified Polymorphic DNA (RAPD), Microsatellites, Amplified Fragment Length Polymorphisms (AFLP),…) replacing protein markers, because the latter detect less variability and are more laborious to use.

Among protein analysis techniques, Two-Dimensional Gel Electrophoresis (2-DE) followed by highly sensitive nonspecific staining methods (silver staining), detects a great deal of genetic variability (O’Farrell, 1975). This technique is based on the separation of proteins from a sample in two dimensions, the first separating by Isoelectric Point (pI) and the second by Molecular Weight (MW), allowing the resolution of hundreds of proteins in a single gel. One of the most frequent criticisms to 2-DE is the technical difficulty of the method and the laborious and complex analysis of the gels that is required. Nevertheless, recent technical improvements have increased the reproducibility of 2-DE and new software applications have simplified the analysis of the gels. These developments have favoured the application of two-dimensional gel electrophoresis to population genetics (Folkertsma et al., 1996a; Van derBeek et al., 1997, 1998; Miyazaki et al., 1998; Tastet et al., 1999).

2-DE data can be processed in two ways: through the calculation of a similarity index when qualitative differences (presence or absence of a protein) are considered (Zimny-Arndt & Klose, 1987; Posch et al., 1992; Thomas & Singh, 1992; VanderBeek et al., 1998), or through the analysis of the variation in allelic frequencies of known loci (Janczewski et al., 1990; Choudhary et al., 1992; Slattery et al., 1994; Folkertsma et al., 1996a). Both methods have some limitations.

Qualitative analysis does not require the prior identification of the proteins in a gel, but it is necessary to examine a large number of spots in order to differentiate related populations (Bakker, 1987; Bossis & Mugniéry, 1993). The analysis of a large number of spots makes it necessary to perform the electrophoresis in high quality gels. Moreover, similarity indexes based on qualitative data are not appropriate for the study of genetically related populations. Qualitative analysis does not consider quantitative variation of a character and two populations would be considered as identical even though they differed 100-fold in the intensity of a spot (Buth, 1984).

On the other hand, the study of allelic frequencies does not require the definition of as many spots as in qualitative analysis, and the algorithms used to reflect the genetic distances are well tested. However it is very difficult to determine which spots on the gels are the products of alleles of the same locus.

The resolution of which of these alternatives is better for the study of genetic relationships among the Globodera pallida and G. rostochiensis species arose during the work in our laboratory. Globodera are microscopic phytonematodes with sexual reproduction. These nematode species are one of the major pests of the potato (Evans & Trudgill, 1992). We were interested in the establishment of the genetic relationships among populations of the Globodera genus, especially in relation to virulence characteristics, since selection and differentiation of populations are correlated with such characters. We found information obtained from expressed characters more suitable than noncoding sequences.

Genetic relationships among G. pallida populations are well documented and Allelic Frequency Analysis (AFA) of 2-DE gels has already been used in the study of this species (Bakker & Bouwman-Smits, 1988; Folkertsma et al., 1996a; Folkertsma, 1997). However, when we tried to use the alleles described by Bakker (1987) in the AFA of G. rostochiensis species we did not succeed, since we could not detect unambiguously which spots of the gel matched the alleles described. This led us to develop an alternative method for 2-DE analysis which no longer required previous knowledge of which spots represented alleles of the same locus.

Picard et al. (1997) developed a new method in durum wheat called Quantitative Analysis (QA), which combines the advantages of Allelic Frequency Analysis (AFA) and of qualitative analysis. This analysis is especially appropriate for the study of organisms in which there is no previous information about allelic variants. Moreover, this method works with a small number of spots. This method does not require the prior identification of which spots of the gel are alleles of the same locus and the number of spots to be studied can be quite small. Nevertheless, the method developed by Picard et al. (1997) has some limitations when it is applied to organisms in which the unit of analysis is the population and not the individual, as in our study of Globodera. In order to apply the QA to Globodera populations, some modifications have been added to Picard’s method and we have renamed it Relative Quantitative Analysis (RQA). RQA analysis allows the calculation of the genetic distance between two populations based on the proportion of individuals which express a particular protein in each population.

In order to establish the feasibility of the RQA method for this kind of analysis we have tested the applicability of this new methodology to G. pallida and we have compared the results with those of traditional AFA analysis, using the information obtained from the study of seven loci described by Folkertsma et al. (1996a) for this purpose.

To reduce the differences in metabolic states of the populations, all were stored, multiplied, harvested and processed in the same stage of development and following the same protocol. Our work is based on the assumption that if two populations have a different quantity of protein for a specific spot of the gel it means that the proportion of individuals which express that protein differs between the populations.

Materials and methods

Populations

Nine G. pallida populations were selected for this study (Table 1). The two South American populations were collected from Peru, one from the area of Junin in the middle of the Andes, provided by the CIP (Centro Internacional de la Papa, Lima, Peru), and the other from Marcapampa, on the eastern side of the Andes, provided by the Proimpa (Proyecto de Investigación de la Papa, Cochabamba, Bolivia). The European populations came from Germany, provided by the SCRI (Scottish Crop Research Institute, Dundee, UK), Balearic Islands (2), Canary Islands (2) and mainland Spain (1). The D3 population, which recently has undergone a bottleneck, was included in order to define the behaviour of RQA analysis in the analysis of populations which have suffered a major genetic drift event.

Table 1 Populations of potato cyst nematode (Globodera pallida), their pathotype and origin

All G. pallida nematode populations were multiplied in a growth chamber (constant temperature 18°C, 16 h light per day), in 12 cm clay pots on the potato cultivar cv. Desirée. All they were multiplied at least twice, in identical conditions.

2DGE analysis

Fifty five-week-old females per population were collected in a small glass mortar and rinsed with 10 mM Tris-HCl, pH 7.4. Total protein samples were prepared by homogenizing the 50 females in 30 μL of homogenization solution (urea 9 M, 5% v/v β-mercaptoethanol and ampholytes, 0.4% pH range 3–10, 0.8% 6–8 and 0.8% 5–7 (Biolyte, Bio-Rad)). The homogenates were centrifuged at approximately 10 000 g (13 000 r.p.m.) for 10 min and supernatants were frozen at −70°C. Just before use, the samples were thawed and protein concentrations estimated using the Bradford method. Six μg of protein were loaded in each electrophoretic run.

2-DE was carried out on two or more gels for each sample, as described by Bossis & Mugniéry (1993), except that piparizine diacrylamide was replaced by bisacrylamide. The voltage conditions for the first dimension were: 17 h at 18 V, 90 min at 180 V, 30 min at 270 V and 95 min at 500 V. The second-dimensional separation was performed on a 12% polyacrylamide gel. The current conditions were: 5 mA for 20 min followed 10 mA during 55 min. Proteins were stained with silver ammonia solution using the protocol of Bossis & Mugniéry (1993).

Gels were scanned in an Apple Colour Onescanner (Apple Computer, Cupertino, CA, USA) at the maximum optic resolution of 300 × 600 ppi (pixels per inch).

Allelic frequency determination

The allelic frequencies of seven loci (E, F, H, I, J, N and X) (Folkertsma, 1997) were scored. For the estimation of the allelic frequencies, the scanned images were analysed in a Sun Sparc Station with ‘2-D ANALYSER’ software (version 6.03, Bioimage, Millipore).

To determine the frequency of the isoelectric point variants, encoding alleles at putative loci, called IP-variants by Folkertsma (1997), the protein quantity of each spot was measured using the integrated intensity (I.I.) option of the software. These I.I. values were converted into allelic frequencies, considering as the unit the sum of the quantity of the different alleles of a single locus (Fig 1). Allelic frequencies for two or three different replicates of the same populations were obtained and mean values were calculated. These mean frequency values were used as input data for genetic distance calculations. Two distance algorithms were used: Rogers’ distance (Table 2) (NTSYS-PC, v 1.8, Applied Biostatistics, 3 Heritage Lane, Setauket, NY) (Rohlf, 1993) and Reynolds’ distance (PHYLIP 3.5 software package, Department of Genetics, University of Washington, Seattle, WA). The distance matrices were used to produce dendrograms using the Unweighted Pair Group Method with Arithmetic Averaging (UPGMA) (NTSYS-PC, v 1.8). In both cases a cophenetic correlation coefficient was computed to evaluate the quality of the cluster analysis. The cophenetic correlation coefficient was computed from the tree matrix and compared to the original similarity matrix in order to measure the goodness of fit (Rohlf, 1972).

Fig. 1
figure 1

Two-dimensional gel electrophoresis of Tf-Pa34CI populations of Globodera pallida. The alleles used in frequency analysis are marked.

Table 2 Distance matrices in populations of Globodera pallida. Upper triangle: distance matrix using 61 quantitative characters (Euclidean distance); lower triangle: distance matrix using frequency analysis of seven loci and 17 alleles (Rogers’ genetic distance)

Relative quantitative analysis

Spots of proteins from all isoelectric point and molecular weight ranges were selected and the I.I. values, which are directly proportional to protein quantity, of 61 spots were estimated. The protein quantity of each spot was transformed into a proportion, defining as unity the sum of the 61 spots’ I.I. values. In this way, the value for relative normalized I.I. value corrects for differences between gels produced by protein loading differences or staining problems.

The normalized I.I. value of each spot was obtained from different replicates of the same population and the mean value was calculated. An I.I. mean values matrix was constructed and used as input data to calculate the root sum of squares of differences (Euclidean distance) matrix (NTSYS-PC, v 1.8) and the Reynolds’ distance (PHYLIP 3.5). Reynolds’ distance is a Euclidean distance when continuous data are used (Reynolds et al., 1983).

The distance matrix was converted into a dendrogram using the UPGMA clustering method from NTSYS-PC. In both cases a cophenetic correlation coefficient was computed to evaluate the quality of the cluster analysis.

Comparison of methods

To estimate the correlation of the distance matrices obtained from allelic frequency analysis and from quantitative analysis, two indexes were used: the Pearson correlation (equivalent to the Mantel statistic ‘r’) and the Spearman nonparametric test for association (SPSS Advanced Statistics 7.5; SPSS Inc., 1997).

In order to measure the variability in the allelic frequencies and in quantitative data, a bootstrap analysis was conducted (Felsenstein, 1985). A thousand bootstrapped data sets were transformed into distance matrices using Reynolds’ algorithm. These results were used to obtain a consensus dendrogram (consensus application; PHYLIP 3.5).

Results

Two methods were used for 2-DE data analysis: Allelic Frequency Analysis (AFA) and Relative Quantitative Analysis (RQA). In both studies the same populations and gels were used.

The AFA of seven loci did not show any population-specific alleles, and no monomorphic loci were observed (Fig. 1). Allelic frequencies were estimated for all loci and populations, except for Lo-Bañares in which the H locus could not be unequivocally detected. For this reason the H locus in the Lo-Bañares population was scored as a missing value for genetic distance calculations.

The genetic distances calculated using AFA are summarized in Table 2 and the dendrogram in Fig. 2.

Fig. 2
figure 2

Dendrogram representing the relationships between G. pallida populations from the Allelic Frequency Analysis (AFA) using 17 alleles of seven loci, Rogers’ genetic distance and UPGMA clustering method. The cophenetic correlation value is r=0.97.

In RQA, 61 spots were measured, five of which were population specific. The Euclidean distances among populations are summarized in Table 2 and the dendrogram in Fig. 3.

Fig. 3
figure 3

Dendrogram representing the relationships between G. pallida populations from the Relative Quantitative Analysis (RQA) using 61 characters, Euclidean distance and UPGMA clustering method. The cophenetic correlation value is r=0.89.

In both dendrograms (Figs 2 and 3), European and South American populations are differentiated into two separate groups.

The main difference between the two analyses (AFA and RQA) concerns the D3 population. Both studies include this population within the Europeans, but AFA shows D3 as the most separated population, whereas RQA places it close to the Europeans.

The mathematical verification of the correlation between the AFA and RQA distance matrices was performed by means of Pearson and Spearman indexes, and in both cases the observed correlation was highly significant (r=0.82, P ≤ 0.01 in Pearson tests and r=0.76, P ≤ 0.01 in Spearman tests).

A bootstrap analysis was performed to determine the strength of the clusters obtained with each method, AFA (Fig. 4) and RQA (Fig. 5). The Lo-Bañares population was not considered since PHYLIP software does not allow missing values.

Fig. 4
figure 4

Consensus dendrogram obtained with Reynolds’ genetic distance from Allelic Frequency Analysis (AFA).

Fig. 5
figure 5

Consensus dendrogram obtained with Euclidean genetic distance from Relative Quantitative Analysis (RQA).

The consensus dendrogram obtained from the bootstrap analysis from AFA (Fig. 4) is identical to the first AFA tree (Fig. 2). In this consensus dendrogram, the European populations cluster together in 692 out of 1000 cases, and the South American ones in 487 out of 1000 cases. Nevertheless, the consensus dendrogram from RQA (Fig. 5) is not identical to the previous RQA tree (Fig. 3), the most important difference being that the two South American populations do not cluster together.

Discussion

Allelic frequency analysis detected by 2-DE has previously been shown to be valuable for population genetic analysis in G. pallida (Bakker, 1987; Folkertsma et al., 1996a). Moreover, correlations have been observed between the results from allelic frequency analysis as well as other molecular techniques such as RAPDs (Folkertsma, 1997) and AFLPs (Fullaondo, 1999), and from analysis of virulence characteristics (Folkertsma, 1997; Fullaondo, 1999).

Although AFA is an adequate tool for the genetic study of these phytonematodes, it creates serious difficulties in the identification and definition of alleles. For this reason, we propose an alternative system named RQA, which allows us to use the information obtained from the distinguishable proteins through 2-DE, with the advantage that it is not necessary to determine previously the allelic variants of a locus.

As can be seen from the correlation analysis results, the two 2-DE data analysis methods (RQA and AFA) (Figs 2 and 3) explain the genetic variation among G. pallida populations in a similar way. However some differences can be observed with regard to: (1) the order of clustering of the European populations; (2) the position of the D3 population; and (3) the clustering of the South American populations, in the consensus dendrogram.

1 With respect to the first point, both methods cluster the European populations together, but in a different order. Similar results have been previously detected by different groups (Folkertsma et al., 1996a; Folkertsma, 1997). The most widely accepted explanation for these differences in the clustering among G. pallida populations is that they are not really genetically different (Folkertsma et al., 1996a, b; Blok et al., 1997; Folkertsma, 1997; Thiéry et al., 1997; Blok et al., 1998). The cited studies show that G. pallida populations of the Pa2/3 virulence group are not genetically different, and for this reason the grouping within them should be considered as random grouping. Moreover, Folkertsma (1997) has developed an analysis to determine the reproducibility of AFA obtained by 2-DE data, and has demonstrated that different gels of the same populations can exhibit genetic distances of 0.05, suggesting that the intrinsic error of the AFA from 2-DE analysis falls within this range. In our AFA work, European populations cluster near the 0.05 limit, which indicates that these populations are so similar among themselves that different groupings derived from AFA and RQA analysis are not relevant, since they are not significant.

2 Regarding the second point, the D3 experimental population falls in the middle of the European group in RQA, but D3 is located outside this group in AFA. This position in AFA analysis is strongly conditioned by the presence of allele N2. This allele is not present in the rest of the European populations analysed, but it appears in the Pe-Marcapampa South American population. In populations that have recently suffered a bottleneck (as is the case described here) rare alleles tend to be lost or they increase greatly in frequency (Chakraborty et al., 1980; Smith & Glenn, 1995). If we exclude the locus N from AFA, the D3 population clusters within the European group (data not shown), as it does in RQA. It seems, hence, that the use of the RQA method for the analysis of populations which have suffered one or more bottlenecks attenuates the peculiarities of these populations due to modifications in the frequencies of specific alleles. Moreover, it allows for a more balanced genetic characterization based on the analysis of a larger number of characters.

3 With respect to the third point, the two South American populations cluster together in the AFA dendrogram (Fig. 4) but they do not in the RQA dendrogram. This prompts the question whether the South American populations constitute a group.

There are several methods to ascertain whether a grouping forms a true group. One of the most widely accepted is the Majority Consensus Rule. This method considers that a group is present when the same cluster is obtained in at least 50% of trees (Margush & McMorris, 1981; Rohlf, 1993). For this reason all groupings that do not reach this value should not be considered as a group. Based on the Majority Consensus Rule criteria in both cases (AFA and RQA), the two South American populations do not form a group because their grouping does not reach this 50% value. The absence of clustering between the two South American populations points to a high level of heterogeneity among the populations of this area. Higher genetic heterogeneity among South American populations than among European populations has been previously reported using RAPDs (Blok et al., 1997; Bendezu et al., 1998; Fullaondo, 1999).

Therefore, after this detailed comparison between the AFA and RQA results, we can conclude that both methods show the same kind of variation, and that the observed clusters are very similar when comparing both methods.

Moreover, the quantitative method shows some advantages, compared to the qualitative method, which make it very valuable for protein variation analysis.

1 It is not necessary to have prior knowledge of the protein genotypic pattern of the organism.

2 It takes advantage of new image analysis software, which allows us to quantify the protein in each spot, and to use more effective mathematical algorithms to determine the genetic relationships between populations of the same species.

3 The relatively high number of characters analysed avoids the distortion associated with the presence of rare alleles, when compared to classical frequency analysis.

4 The high number of characters analysed should allow us to obtain a better description of a population.

In view of these advantages, we believe that RQA can be applied to genetic analysis of other species’ populations when genotypic patterns have not been determined or when technical difficulties hinder their determination. Finally, we think that RQA is especially useful for organisms whose microscopic size limits the unit of analysis to the population.