Introduction

The human genome contains a large amount of highly repetitive DNA sequences including many with a variable number of tandem repeats (VNTRs). Minisatellites are loci composed of tandemly repeated sequences between 10 and 100 base pairs (bp).1 As a consequence of a high rate of germline mutation to new allelic states, minisatellites show substantial allelic variability in the number of repeat units and are among the most polymorphic markers reported to date. The variability shown by many of the VNTRs studied renders them interesting for different applications, including linkage analysis, forensic identification, paternity testing, anthropological research and phylogenetic studies. Recent data show that such polymorphisms can be useful for genetic population studies.2,3

One of these VNTR loci is mapped on chromosome 2 and is located 75 bp from the second polyadenylation signal at the 3′ end of the apolipoprotein B gene (ApoB).4 The ApoB protein is a major component of low-density lipoproteins and plays a central role in the metabolism of serum cholesterol. The 3′ ApoB hypervariable region (HVR) consists of a tandem-repeat sequence, rich in A and T.4 Two basic types of 15-nucleotide-long core repeats (X and Y) have been identified.5 Differences between alleles result from extension of the YX repeats. In a number of core segments, the HVR alleles have sequence microvariations (a pure AT sequence is interrupted by a C or a G), usually concentrated at the 3′ end of the minisatellite.3,6 The 3′ ApoB polymorphism can be analysed by polymerase chain reaction (PCR) amplification followed by high-resolution agarose gel, polyacrylamide gel (PAGE) or capillary electrophoresis.2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 This HVR polymorphism has been widely used in investigations of the history and diversity of humans, both worldwide and in individual population groups.2,5,7,8,12,19 It is considered a suitable locus for a pilot study of the relationships between the shape of allele-size distributions of minisatellites and the micro-evolutionary processes leading to their present-day distribution.2 The allele-size frequencies can be used to calculate interpopulation genetic distances. Higher rates of polymorphism have been found for populations of different origin and ethnicity.2,5,7,8,12

In this study, we analysed the allele frequencies of the 3′ ApoB HVR locus in normal individuals from 10 different populations (five Russian, two Byelorussian, and the Adygei, Kalmyk and Yakut) living in the Russian Federation and the Republic of Belarus.

Materials and methods

DNA samples for study were obtained from donors of 10 populations from five ethnic groups: three Caucasoid and two Mongoloid (Figure 1).

Figure 1

Geographical location of populations tested.

Two ethnic groups originate from the Altaic linguistic family: these are the Kalmyk and Yakut populations. Kalmyks inhabit the steppe region to the northwest of the Caspian Sea and represent a Mongolic linguistic group (Altaic linguistic family). Kalmyks correspond to the Baikal anthropological type. The Yakut population lives in East Siberia and belongs to the Turkic linguistic group (Altaic linguistic family). In classical anthropology, they are classified as the Central Asian type. In this study, samples from the Elista region of Kalmykia and from a central group of Yakut people were examined.

The East Slavonic linguistic group (Indo-European linguistic family) was represented by samples from five Russian populations from the European (western) part of Russia (the Kursk, the Novgorod, the Kostroma, the Smolensk and the Oshevensk), and two Byelorussian populations (the Mjadel and the Bobruisk) from the Republic of Belarus. Russians are the largest ethnic group of East Slavs, representing a Central to East-European anthropological type. The Oshevensk group is an isolated population from the Arkhangelsk region of north Russia where external influences are very slight. The Novgorod group is a North-Western Russian population; the Kursk are from south Russia; the Kostroma group is a North-Eastern Russian population from the European part of Russia; and the Smolensk is a Central group. The Republic of Belarus is located to the west of Russia. The Bobruisk and the Mjadel are two populations originating from the eastern and northern regions of Belarus. The Bobruisk population is urban. The Mjadel population sample was collected in Byelorussian woodland of the Mjadel region (North Belarus) where the migration rate is low.

The Adygei population came from Adyg-Shabsug and represents a North Caucasian linguistic family. Adygei people are one of the most ancient indigenous populations of the Caucasus region,21 living both on the eastern shore of the Black Sea and the main Caucasus mountains (highlanders).

Blood samples were collected with the informed consent of the donors, who were selected according to the following criteria: all individuals had to belong to the native ethnic group of the region studied (with at least three generations living in the region) and had to be unrelated and healthy. DNA was isolated from peripheral leucocytes by proteinase K treatment and extraction with phenol–chloroform.22 Analyses were based on 76–139 individuals per population.

To determine allele size, we performed PCR amplification of the target sequence and separated the PCR products by PAGE. The PCR products were visualized by silver staining because of its higher sensitivity.23 Amplification of the locus was carried out as described by Renges et al.7 Electrophoresis was performed in TBE buffer (pH 9.0) for 3–4 h at 700 V. The exact size of 3′ ApoB alleles was determined by coelectrophoresis with a locus-specific allelic ladder containing 25–52 tandem repeats. The ladder was constructed by amplifying a mixture of DNA from individuals with known alleles24 of the Byelorussian population.25 Allele designation followed the terminology of Ludwig et al.26

Allele frequencies, conformity to Hardy–Weinberg equilibrium, observed and expected heterozygosity levels and Nei's genetic distances were all calculated with the POPGENE software, version 1.32.27 Hardy–Weinberg equilibrium was evaluated using the likelihood ratio test (G-statistics)28 and the exact test.29 χ2 testing for heterogeneity between pairs of samples was performed with the R × C contingency table testing program.30 Multidimensional scaling analysis was performed with STATISTICA software, version 5.5.31
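The basic quantities named above can be illustrated with a short Python sketch. It is not the POPGENE implementation: the genotype list is invented for illustration, the function names are hypothetical, and the optional small-sample correction is an assumption rather than something stated in the text.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Return {allele: frequency} from a list of (allele_a, allele_b) genotypes."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())  # chromosomes typed = 2 x individuals
    return {allele: n / total for allele, n in counts.items()}

def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    heterozygotes = sum(1 for a, b in genotypes if a != b)
    return heterozygotes / len(genotypes)

def expected_heterozygosity(freqs, n_chromosomes=None):
    """1 - sum(p_i^2); if n_chromosomes is given, apply the n/(n-1) correction."""
    h = 1.0 - sum(p * p for p in freqs.values())
    if n_chromosomes is not None:
        h *= n_chromosomes / (n_chromosomes - 1)
    return h

# Hypothetical genotypes (repeat numbers), NOT data from this study.
sample = [(34, 36), (36, 36), (34, 48), (36, 46),
          (34, 34), (36, 48), (34, 36), (32, 36)]

freqs = allele_frequencies(sample)
h_obs = observed_heterozygosity(sample)
h_exp = expected_heterozygosity(freqs)
h_exp_corrected = expected_heterozygosity(freqs, n_chromosomes=2 * len(sample))

print(sorted(freqs.items()))
print(h_obs, h_exp, h_exp_corrected)
```

Alleles are keyed by repeat number, matching the allele designations used in the tables.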

Results

The polymorphic 3′ ApoB minisatellite was examined by PCR in 10 human populations from five ethnic groups. The distributions of allele frequencies found are shown in Table 1. The analysis detected 25 alleles, ranging from 25 to 55 repeats, among the 1932 chromosomes typed. The allele spectra in Caucasoid populations (Russians, Byelorussians and Adygeis) are bimodal, with the main peak at alleles 34–36 and a secondary mode around alleles 46–48. The Mongoloid populations (Kalmyks and Yakuts) showed a unimodal distribution of allele frequencies with a peak around 34–36 repeats. These patterns agree with the Caucasoid and Mongoloid allele frequency profiles reported for this locus in other investigations.2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20

Table 1 Allele frequencies of ApoB 3′ minisatellite locus in different ethnic groups of Russia and Belarus

A high degree of similarity was found for allele profiles for all East Slavonic populations. The most frequent allele in these samples was allele 36 (frequencies ranged from 29 to 45%), followed by allele 34 (from 20 to 28%). A small inversion in the magnitude of 34- and 36-allele frequencies was noted in the Adygei population: the 36-allele frequency was 29.5%; the 34-allele frequency was 32%, but allele frequencies of the second mode were similar to those in other Caucasoid populations. In the Mongoloid populations, as in Caucasoids, the main alleles were 34 and 36, but with strong differences in the magnitude of their frequencies. In the main mode, these populations show a reversed frequency of the two common alleles: that of allele 34 rises to 55% in Yakuts and 44% in Kalmyks, and that of allele 36 decreases to 19% in Kalmyks and 9% in Yakuts.

High heterozygosity levels were observed for all populations investigated (73–84%, Table 2). The observed numbers of distinct genotypes also reveal a high degree of 3′ ApoB minisatellite polymorphism in the populations analysed. The genotype distributions observed do not deviate from Hardy–Weinberg expectations, based on both the likelihood ratio and the exact tests. The allele frequency patterns and the heterozygosity indexes obtained in East Slavonic and Adygei population samples were similar to others reported in Caucasian populations.7,8,9,10,11,12,13,14,15,16,17,18 The allele frequency patterns in Kalmyks and Yakuts diverged from those reported in Caucasoid peoples, but resembled those of Mongoloids.3,19,20

Table 2 Number of chromosomes analysed, expected and observed heterozygosity, and allele spectra parameters for investigated populations in the ApoB 3′ minisatellite

The allele distributions in different ethnic groups were compared using the R × C test30 for each pair of populations studied. When we compared each of the Mongoloid populations (Yakuts and Kalmyks) with any Caucasoid population, statistically significant (P<0.0001) differences were found. The test for heterogeneity of East Slavs revealed a high degree of similarity among the populations studied, except for the Bobruisk. The deviation of the allele distribution in the Bobruisk group was mainly due to the frequency of allele 28, which was 3.7% in this population and 0.7% in the Mjadel population, whereas it was not detected in the others (Table 1).
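A pairwise heterogeneity test of this kind can be sketched as a Pearson χ2 statistic on a populations × alleles table of chromosome counts. The sketch below is not the R × C program cited in the text: the counts are invented, the function names are hypothetical, and the Monte Carlo permutation step for the P value is an assumption about how sparse tables of this sort are commonly assessed.

```python
import random

def chi_square(table):
    """Pearson chi-square statistic for a rows x columns table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            if expected > 0:  # skip alleles absent from both samples
                stat += (observed - expected) ** 2 / expected
    return stat

def monte_carlo_p(table, n_iter=2000, seed=0):
    """Permutation P value: shuffle chromosomes between populations."""
    rng = random.Random(seed)
    observed = chi_square(table)
    n_cols = len(table[0])
    row_sizes = [sum(row) for row in table]
    # One allele-column label per chromosome.
    labels = [j for row in table for j, n in enumerate(row) for _ in range(n)]
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(labels)
        start, shuffled = 0, []
        for size in row_sizes:
            counts = [0] * n_cols
            for j in labels[start:start + size]:
                counts[j] += 1
            shuffled.append(counts)
            start += size
        if chi_square(shuffled) >= observed - 1e-9:
            hits += 1
    return (hits + 1) / (n_iter + 1)

# Hypothetical chromosome counts for alleles 34, 36 and "all others".
table = [
    [20, 45, 35],  # population A
    [55, 10, 35],  # population B
]
statistic = chi_square(table)
p_value = monte_carlo_p(table)
print(statistic, p_value)
```

Permutation avoids relying on the asymptotic χ2 distribution, which can be unreliable when many alleles are rare and expected counts are small.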

For the population relationship analysis, Nei's33 original pairwise genetic distances were calculated. The multidimensional scaling of this matrix results in the two-dimensional plot shown in Figure 2. We have outlined Russian and Byelorussian population samples by great-circle lines to make ethnic group comparisons easier. Statistical robustness of the plot was confirmed by a good correlation between the original distance matrix and that reconstructed by the multidimensional scaling algorithm (the stress index value for two dimensions was 0.028). Allele frequencies were also calculated for two ethnic groupings (Russian and Byelorussian) and compared with other populations of Eurasia in which this polymorphism has been studied (Table 3). The multidimensional scaling analysis (the stress index value for two dimensions was 0.035) based on Nei's distance matrix (Figure 3) shows significant differentiation among the populations corresponding to the two main anthropological types.
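As an illustration of the distance calculation, the following Python sketch assumes that "Nei's original" distance refers to the standard genetic distance, D = −ln(J_xy / √(J_x J_y)), applied here to a single locus. The allele frequencies and the function name are hypothetical and are not taken from Table 1.

```python
import math

def nei_standard_distance(p, q):
    """Nei's standard genetic distance between two allele-frequency dicts."""
    alleles = set(p) | set(q)
    j_xy = sum(p.get(a, 0.0) * q.get(a, 0.0) for a in alleles)
    j_x = sum(v * v for v in p.values())
    j_y = sum(v * v for v in q.values())
    identity = j_xy / math.sqrt(j_x * j_y)  # normalized genetic identity
    return -math.log(identity)

# Hypothetical single-locus allele frequencies for three populations.
populations = {
    "A": {34: 0.25, 36: 0.40, 48: 0.35},
    "B": {34: 0.28, 36: 0.37, 48: 0.35},
    "C": {34: 0.55, 36: 0.10, 48: 0.35},
}

# Pairwise distance matrix, the kind of input a scaling procedure would take.
names = sorted(populations)
distances = {
    (x, y): nei_standard_distance(populations[x], populations[y])
    for x in names for y in names
}
for x in names:
    print(x, [round(distances[(x, y)], 4) for y in names])
```

The resulting symmetric matrix is what a multidimensional scaling routine would then reduce to two dimensions.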

Figure 2

Multidimensional scaling analysis plot (dimension I/II) of populations investigated.

Table 3 Frequencies of the alleles of locus ApoB 3′ minisatellite in the populations of this work and the comparison populations

Figure 3

Multidimensional scaling analysis plot (dimension I/II) of 23 Caucasoid and Mongoloid ethnic groups of Eurasia. Abbreviations are RUS (Russians), BEL (Byelorussians), U (Ukrainians), A (Austrians), G (Germans), FRA (The French), P (The Portuguese), GR (Greeks) and H (Hungarians).

Discussion

We have analysed normal variability in the 3′ ApoB VNTR locus in some East European populations. As established previously,2,3,5,7,8,12 this locus is extremely polymorphic in individuals and significant diversity has been found for allele spectra in groups of different origin; similar results were observed here. We have tested populations of Mongoloid and Caucasoid origin, belonging to different linguistic families (Indo-European, Altaic and North Caucasian). The most remarkable differences in allele distribution were found for the Yakut and Kalmyk populations, who derive from the Altaic linguistic family (Turkic and Mongolic linguistic groups, respectively) and have Mongoloid ancestry. Significant divergence of these populations from other population groups studied was also shown by a χ2 test of the probability of divergence.

The multidimensional scaling analysis was performed to distinguish the populations studied. It is clear that the Yakut and Kalmyk populations are placed far from the others, which form a cluster of Caucasoid populations having their own configuration. East Slav ethnic groups are arranged close together and the Adygei population is the neighbour of East Slavs.

Genetic distance analysis combined with multidimensional scaling demonstrates that this marker is able to differentiate not only between Caucasoid and Mongoloid groups but also between related populations with a similar gene pool. In the multidimensional scaling plots, almost all East Slavonic ethnic groups form a single cluster, with only the Bobruisk population lying a little closer to the Adygei population. The Bobruisk population originates from a region with a higher migration rate than the other population samples studied. The Kostroma and the Mjadel populations, situated in the east and west regions of Eastern Europe, take positions slightly remote from the East Slavonic core cluster and may resemble each other in genetic variability because both are remote, isolated populations.

Using a multidimensional scaling procedure, we compared our data with those reported for 15 European and three Asian populations.3,7,8,9,10,11,12,13,14,15,16,17,18,20 This analysis led to almost the same pattern as the plot for the populations investigated in this study. European populations form one cluster, while the Mongoloid population groups fall in another. On the plot, all the European populations share a common cluster, with 10 forming a core. The other European populations are situated around this core, and all of them are geographical outliers. East European populations are arranged on the plot as follows: East Slavs (Russians, Ukrainians and Byelorussians) are most closely related to the French, Germans, Austrians, Italians, Hungarians, Greeks and Portuguese, and less closely to other European populations. The Adygei population is as far from the European core as the Swedish, Basque, Spanish, Albanian, Slovakian and Finnish populations. The plot pattern is consistent with the hypothesis that East Slavs migrated from the Central European area, as indicated by anthropological and archaeological data.

According to archaeological data summarized by Sedov,34 Slavs as a separate group were being formed about the middle of the first millennium BC on the basis of Lusatian (Lausitz) culture, which belonged to the Central European community of Urnfield cultures, in the region of the Middle and Upper Vistula and the right bank of the Oder. Very similar results were obtained by Alexeeva and Alexeev35 based on craniological and physical anthropological research.

Malyarchuk and Derenko analysed hypervariable segment I mtDNA polymorphisms of East Slavs and found that these populations share rare haplotypes mainly with Germans and Finno-Ugric European populations. They concluded that East Slavs migrated in the early Middle Ages from their putative homeland in Central Europe.36 Recent investigations of short tandem repeat loci variability on the Y chromosome have shown that Eastern Slavs are homogeneous and most closely related to Poles, Hungarians and Germans whereas they are less closely related to some other Europeans.37,38 Thus, all genetic data to date support the hypothesis of an East Slav population originating from the central part of Europe.

The position of the Adygei population as an outlier from the core among European populations is in agreement with the geographic location and population history of these people. According to Macaulay et al,39 this ethnic group, belonging to the North Caucasian linguistic family, is thought to be close to the origins of important population expansions into Europe.

The Kalmyk and Yakut populations are positioned in another cluster with the Japanese at the centre, and are as remote from the Japanese as the Chinese are. While the present Kalmyk population inhabits an East European area, until the early Middle Ages these people lived in Asia as a Mongol confederate on the territory of Mongol China.40,41

Thus, the multidimensional scaling analysis of Nei's genetic distances reveals close affinities between East Slavonic ethnic groups. It shows close relationships between East Slavs, Adygeis and European Caucasoids. The Mongoloid populations are found to be significantly different from Caucasoid populations, and are most closely related to each other. The patterns of allele frequency distributions and χ2 tests for heterogeneity between samples support these conclusions.

While information provided by such a single-locus study should be used with caution, our results on East European populations are compatible with data based on mtDNA, Y-chromosome polymorphisms, anthropology and archaeology.34,35,36,37,38,39,41 We believe that this polymorphism is one of the most convenient available for pilot population relationship research, but of course it should be reinforced with further investigation of other polymorphic systems. This study provides new data about 3′ ApoB minisatellite polymorphisms in East European populations that may be useful for the analysis of human evolution and population history, both in this region and worldwide.