Introduction

Glanzmann thrombasthenia (GT) is a rare inherited bleeding disorder caused by a lack or a qualitative deficiency of the αIIbβ3 integrin expressed at the platelet surface.1 The ITGA2B gene encodes the αIIb subunit and the ITGB3 gene encodes β3; both are located on chromosome 17q21–23. As GT is an autosomal recessive disease, patients in non-consanguineous populations are often compound heterozygotes. More than 150 mutations causing GT have been reported, distributed across almost all exons of the ITGA2B and ITGB3 genes. An Internet database that includes clinical and biological data, as well as mutation analyses of reported patients, is regularly updated (http://med.mssm.edu/glanzmanndb).

GT occurs at high frequency in certain ethnic populations with an increased incidence of consanguinity, such as Iraqi Jews or Jordanian Arabs.2, 3 The unusually high frequency of GT among these groups is largely the result of founder mutations. Specific gene mutations involved in rare disorders, including private founder mutations, have been studied in detail in well-characterized founder populations, such as Israeli Jews, Finns and French Canadians.2, 4, 5

In France, GT is particularly frequent in the Gypsy population, mainly represented by the Manouche tribe.6, 7 The clinical phenotype of these patients is the same as for classic type I disease, with a moderate and occasionally severe bleeding syndrome, absent platelet aggregation and absent clot retraction. Flow cytometry analysis has shown that their platelet surface expresses <5% of normal αIIbβ3 levels, while western blotting showed no αIIb and only trace amounts of β3. These patients are also prone to developing platelet isoantibodies after platelet transfusions.8, 9

Two groups independently localized the disease-causing mutation: a homozygous c.1544+1G>A substitution at the splice donor site of intron 15 of the ITGA2B gene (Figure 1).10, 11 Interestingly, this mutation, the so-called French Gypsy mutation, was detected in all GT patients from the Manouche community, who originated from different regions of France.12, 13 Moreover, this mutation appears to be limited to this tribe, which led us to hypothesize the existence of a founder mechanism in this population.14, 15, 16

Figure 1

The c.1544+1G>A mutation affects the splicing and expression of αIIb. (a) Schematic representation of the exon 15/intron 15 boundary of the ITGA2B gene showing the position of the c.1544+1G>A mutation (marked by red lines); (b) mRNA sequence showing the splice products. An eight-base-pair deletion causes a frameshift, which results in a premature stop codon.

In this study, we examined whether GT patients from different families originating from the Manouche tribe and carrying the French Gypsy mutation share a common ancestor and, if so, estimated the age of the founding event.

Materials and methods

Sample selected for haplotyping

Our study sample was composed of 23 individuals from 16 families originating from the French Manouche community: 9 affected individuals, 6 heterozygous carriers and 8 normal individuals. Males accounted for 43% of the sample (10 males and 13 females). These patients, originating from different regions, were registered in the French Reference Centre for Platelet Diseases. To the best of our knowledge, all families were unrelated. DNA was isolated from peripheral blood following a standard protocol with the QIAamp DNA Blood Mini Kit (Qiagen, Courtaboeuf, France). These individuals had previously been tested for the presence of the French Gypsy mutation in ITGA2B intron 15 by an allelic discrimination assay with high-resolution melting (HRM) curve analysis.17 When DNA was available, parents and/or siblings of patients were genotyped and the correct segregation of the mutation in the pedigree was verified. All patients signed informed consent for genetic analysis before inclusion, and national ethics authorization for DNA analysis was obtained under the sponsorship of the French National Institute of Health and Medical Research (INSERM: ANR-08-GENO-028–03).

Haplotype analysis

Four single-nucleotide polymorphisms (SNPs) were selected for the haplotype reconstruction: rs5910, rs5918, rs5919 and rs11870252. These SNPs were chosen based on their chromosomal position and on their allelic frequencies, according to data available from the GenBank database (National Center for Biotechnology Information). They are located on chromosome 17, in the ITGA2B and ITGB3 genes, outlining a genomic fragment of approximately 3 cM from rs11870252 to rs5910 (Table 1). SNPs rs5918, rs5919 and rs11870252 are located in exon 3, exon 6 and intron 11 of the ITGB3 gene, respectively, whereas SNP rs5910 is located in exon 30 of the ITGA2B gene. The distance from rs5910 to the French Gypsy mutation is about 8000 bp. Two short tandem repeats (STRs) flanking the ITGA2B gene on chromosome 17 were also selected, in the THRA and BRCA1 genes.18 These STRs consist of tandemly repeated dinucleotide (CA) sequences. Alleles were named using an arbitrary scale for the observed fragment length of the microsatellites.

Table 1 French Gypsy mutation-linked microsatellite and SNP genotypes

Genotyping

SNP genotyping was performed by HRM (see Supplementary Table). In brief, 20–60 ng of DNA were amplified in a LightCycler 480 system (Roche Diagnostics, Meylan, France) in a final volume of 20 μl with 2 mM MgCl2, 0.2 μM of each primer, and 10 μl of PCR mix containing LC Green (Roche Diagnostics). Amplification was performed by 45 cycles of 95 °C for 10 s, annealing for 15 s (see Supplementary Table) and 72 °C for 15 s, followed by a melt from 65 to 95 °C at 4 °C/s. Microsatellites were amplified using fluorescently labelled forward primers in a 25 μl reaction mix containing 50 ng of genomic DNA, 1.25 units of Taq polymerase (Applied Biosystems, Courtaboeuf, France), 2.5 mM MgCl2, 400 nM of each primer, and 50 μM of each dNTP. After 7 min at 95 °C, 40 cycles of amplification (95 °C for 30 s; 54 °C for 30 s; 72 °C for 40 s) were performed, followed by 5 min at 72 °C. PCR products were loaded onto a Beckman Coulter CEQ 8000 sequencer with the Beckman Coulter DNA Size Kit 400 (Beckman Coulter, Roissy, France).

Statistical analysis

The age of the most recent common ancestor of the 15 carriers of the mutant allele was estimated using a likelihood-based method.19 This maximum-likelihood algorithm first requires reconstruction of the haplotypes of the carriers of the mutant allele for the surrounding neutral markers. It takes as input the percentage of recombinant haplotypes and the carrier frequency of the mutant allele in the population. It then estimates the number of generations (with 95% CI) elapsed since the most recent common ancestor introduced the mutation into the population, together with the growth rate of this mutation since then. As this method requires reconstructing the haplotypes carried by the patients, we genotyped individuals related to these patients (when available) to determine the phase. A number of GT patients could not be allocated to any pedigree. We used PHASE v2.1 software (University of Washington, Seattle, WA, USA) to infer the haplotypes for these isolated patients.20 This method uses a maximum-likelihood approach to estimate haplotype frequencies and to infer haplotype pairs. Finally, distances in base pairs were converted into distances in cM by using the correspondence between length in base pairs and in cM provided by the International HapMap project.21
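The link between recombinant haplotypes and the age of a mutation can be illustrated with a deliberately simplified calculation. The sketch below is not the likelihood-based method of reference 19. It is a rough approximation which assumes that the fraction of carrier chromosomes still bearing the ancestral allele at a marker decays as (1 − θ)^g, where θ is the recombination fraction between the marker and the mutation and g is the number of generations. The function names, the default rate of 1 cM per Mb, the use of the Haldane map function and the numerical inputs in the example are all illustrative assumptions, not values or choices taken from this study.

```python
import math


def bp_to_cm(distance_bp: float, cm_per_mb: float = 1.0) -> float:
    """Convert a physical distance (bp) to a genetic distance (cM).

    cm_per_mb is an ASSUMED local recombination rate; the study itself
    used the correspondence provided by the International HapMap project.
    """
    return distance_bp / 1_000_000 * cm_per_mb


def cm_to_theta(distance_cm: float) -> float:
    """Haldane map function: genetic distance (cM) -> recombination fraction."""
    morgans = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * morgans))


def generations_since_founder(nonrecombinant_fraction: float, theta: float) -> float:
    """Solve (1 - theta) ** g = nonrecombinant_fraction for g."""
    if not 0.0 < nonrecombinant_fraction <= 1.0:
        raise ValueError("nonrecombinant_fraction must be in (0, 1]")
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must be in (0, 1)")
    return math.log(nonrecombinant_fraction) / math.log(1.0 - theta)


if __name__ == "__main__":
    # Hypothetical inputs chosen only to show the arithmetic:
    # a marker 2 Mb from the mutation, 75% of chromosomes non-recombinant.
    theta = cm_to_theta(bp_to_cm(2_000_000))
    g = generations_since_founder(0.75, theta)
    print(f"theta = {theta:.4f}, approximate age = {g:.1f} generations")
```

A full likelihood treatment additionally accounts for marker allele frequencies, population growth and the joint information from all markers, which this approximation ignores.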

The current number of copies of the mutant allele in the study population can be computed from the population size and the allelic frequency of the mutant allele. A defining characteristic of the Gypsies is their nomadic way of life, which makes it difficult to estimate their population size accurately. In France, statistical institutions put the number of travelling people at approximately 250 000 individuals, most of whom belong to the Gypsy community.20, 22 There are few data on the size of the French Manouche population, which despite fluctuations has remained small. As the Manouche tribe is a major contributor to the French Gypsies, we estimated their number as between 80 000 and 120 000; we therefore used a total value of 100 000 individuals. Although we did not know the effective population size, we also considered two empirical effective values of 10 000 and 25 000 individuals.

Estimating the allelic frequency of the mutant allele requires an unbiased sample of the population. However, patients who present to hospital departments do so because they require treatment, which introduces a recruitment bias. Mutant allele frequencies reported by other studies in similar founder populations mostly range from 0.002 to 0.05.19 Thus, we assumed here that the mutant allele frequency in our population ranged between 0.001 and 0.01.
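As a minimal sketch of how the number of mutant allele copies follows from these assumptions, the snippet below multiplies the population size by two (each individual carries two copies of an autosomal locus) and by the assumed allele frequency. The population sizes and the two frequency bounds come from the text; the function name is hypothetical, and whether the estimation software applies exactly this formula internally is not stated in the source.

```python
def mutant_allele_copies(population_size: int, allele_frequency: float) -> float:
    """Expected copies of the mutant allele at a diploid autosomal locus: 2 * N * q."""
    return 2 * population_size * allele_frequency


if __name__ == "__main__":
    population_sizes = (100_000, 25_000, 10_000)  # census estimate and two effective sizes
    allele_frequencies = (0.001, 0.01)            # bounds assumed in the text
    for n in population_sizes:
        for q in allele_frequencies:
            copies = mutant_allele_copies(n, q)
            print(f"N = {n:>7}, q = {q:<5} -> about {copies:.0f} mutant copies")
```

With 100 000 individuals, the assumed frequency range corresponds to roughly 200 to 2000 copies of the mutant allele.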

Results

The obtained genotypes are shown in Figure 2 and Table 2. It was not possible to accurately determine the haplotype associated with the French Gypsy mutation in patients M11 and M15 because of limited information from family members. The haplotype reconstruction with PHASE was ambiguous for these two patients and we discarded them from further analyses.

Figure 2

Pedigrees of studied cases of GT. Homozygous GT subjects are represented by solid squares or circles; heterozygous individuals by solid squares or circles within open frames. The bars represent the haplotype constructed from four SNPs and two STRs around the French Gypsy mutation of the ITGA2B gene.

Table 2 STR and SNP genotypes for the 23 individuals (Mx) from the 16 Manouche families, with or without the c.1544+1G>A mutation in the ITGA2B gene

For the remaining individuals, 10 chromosomes positive for the c.1544+1G>A transition in 6 different families shared a common haplotype for all SNPs and for the 2 microsatellite repeats (haplotype H1=5, 3, C, C, T, T, respectively, for THRA/BRCA1 STRs and SNPs at nucleotides c.3063C>T, c.176T>C, c.882T>C and c.1914–19T>C) (Table 3). The STR in the BRCA1 gene of one patient (M5) showed a different allele from that of haplotype H1 (haplotype H2=5, 9, C, C, T, T), possibly corresponding to a single recombination event. Three additional individuals differed at the microsatellite repeat located in the THRA gene, suggesting another crossing-over event (haplotype H3=7, 3, C, C, T, T) (Table 3). The frequencies of haplotypes H1, H2 and H3 in these French Gypsy mutation carriers were 68% (15/22 chromosomes), 5% (1/22) and 27% (6/22), respectively (Table 3).
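The haplotype comparison and the frequencies quoted above can be reproduced with a short script. This is only an illustrative sketch: the marker order follows the haplotype definition in the text (THRA, BRCA1, then the four SNPs), the chromosome counts are those reported for Table 3, and the variable and function names are hypothetical.

```python
# Marker order as used for haplotypes H1-H3 in the text.
MARKERS = ("THRA", "BRCA1", "rs5910", "rs5918", "rs5919", "rs11870252")

HAPLOTYPES = {
    "H1": ("5", "3", "C", "C", "T", "T"),
    "H2": ("5", "9", "C", "C", "T", "T"),
    "H3": ("7", "3", "C", "C", "T", "T"),
}

# Number of mutation-bearing chromosomes carrying each haplotype (22 in total).
CHROMOSOME_COUNTS = {"H1": 15, "H2": 1, "H3": 6}


def differing_markers(name: str, reference: str = "H1") -> list[str]:
    """Return the markers at which a haplotype differs from the reference."""
    pairs = zip(MARKERS, HAPLOTYPES[name], HAPLOTYPES[reference])
    return [marker for marker, allele, ref_allele in pairs if allele != ref_allele]


def haplotype_percentages(counts: dict[str, int]) -> dict[str, int]:
    """Convert chromosome counts to percentages rounded to the nearest integer."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


if __name__ == "__main__":
    for name in ("H2", "H3"):
        print(name, "differs from H1 at", differing_markers(name))
    print(haplotype_percentages(CHROMOSOME_COUNTS))
```

H2 and H3 each differ from H1 at a single microsatellite, which is what one would expect if each arose from one recombination event on the ancestral chromosome.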

Table 3 Identification and frequency of c.1544+1G>A haplotypes

To assess whether these strong associations of specific haplotypes were indeed the result of a founder effect, eight related and unrelated healthy individuals from the Manouche community who did not carry the French Gypsy mutation (M3, M4, M12, M13, M17, M19, M20 and M22) were genotyped for the whole set of SNP loci. The most common allele in these individuals for the SNP rs5918 was the T allele (15/16, 94%), which is never associated with c.1544+1G>A among the disease chromosomes (Table 3). Moreover, the only healthy individual (M19) carrying the C allele for rs5918 did not carry the C allele found in all patients at the rs5910 locus. Thus, haplotypes H1, H2 and H3 were absent from our sample of healthy individuals, which clearly indicates that these haplotypes are specifically linked to the French Gypsy mutation.

All patients but one therefore shared a smaller, unique homozygous haplotype (3, C, C, T, T) in the region defined by the following five markers: BRCA1, rs5910, rs5918, rs5919 and rs11870252. This defines a common ancestral core haplotype shared by almost all patients. This strong linkage disequilibrium with the French Gypsy mutation in ITGA2B is a strong indication of a founder effect. Overall, our results suggest that the mutation in the Manouche families is located on a founder chromosome and that crossing-over events at BRCA1 and THRA gave rise to haplotypes H2 and H3. All this suggests that the mutation was transmitted by a relatively recent common ancestor.

The method used to estimate the age of the most recent common ancestor, which considers the linkage disequilibrium across all markers, is implemented in dedicated software.19 We provided the program with an estimate of 100 000 for the number of individuals currently present in the Manouche community. Three separate analyses were performed, each using a different disease allele frequency. Only markers situated on the left side of c.1544+1G>A could be used, as the markers situated on the right side were completely linked with the disease gene. Using these left-side markers, we estimated an age of the mutation ranging from 12.1 to 15.3 generations (Table 4), associated with a growth rate ranging from 1.33 to 2.2, depending on the assumed disease allele frequency. Assuming that a generation spans 25 years, the most recent common ancestor was estimated to have lived about 300 to 400 years ago (95% confidence interval: 255 to 552.5 years). When empirical effective sizes of 10 000 and 25 000 individuals were considered instead, we obtained older estimates for the age, but the differences were quite small (Table 4).
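The conversion from generations to calendar years is simple arithmetic, sketched below with the 25-year generation time assumed in the text. The function names are hypothetical, and the confidence bounds expressed in generations are a back-calculation from the interval quoted in years rather than figures stated in the source.

```python
GENERATION_YEARS = 25  # generation time assumed in the text


def generations_to_years(generations: float, years_per_generation: float = GENERATION_YEARS) -> float:
    """Convert an age in generations to an age in years."""
    return generations * years_per_generation


def years_to_generations(years: float, years_per_generation: float = GENERATION_YEARS) -> float:
    """Convert an age in years back to generations."""
    return years / years_per_generation


if __name__ == "__main__":
    for g in (12.1, 15.3):  # point estimates reported in the text
        print(f"{g} generations -> {generations_to_years(g):.1f} years")
    for y in (255, 552.5):  # 95% confidence bounds reported in years
        print(f"{y} years -> {years_to_generations(y):.1f} generations")
```

The point estimates of 12.1 and 15.3 generations correspond to 302.5 and 382.5 years, and the quoted confidence bounds of 255 and 552.5 years correspond to 10.2 and 22.1 generations.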

Table 4 Estimation of the age of the French Gypsy mutation

Discussion

We report here a large group of patients analyzed for the French Gypsy mutation. In the homozygous state, this mutation results in the absence of αIIbβ3 protein because of abnormal splicing, a frameshift and a premature stop codon in the ITGA2B gene.10, 11 Clusters of GT families in the French Manouche community and a relative over-representation of the disease-causing mutation are indicative of a founder effect. This hypothesis was further supported by the report of nine patients homozygous for the French Gypsy mutation who were tested for the human platelet antigen-1 a/b (HPA), a SNP in the ITGB3 gene.14, 15, 16 Jacquelin et al.14 noted that the French Gypsy mutation was strongly linked to HPA-1b, the minority allele in a normal population, suggesting that these patients carry a homozygous common ancestral haplotype at the ITGA2B locus. Furthermore, among the several hundred other GT patients who have been genotyped worldwide, none has, to our knowledge, ever been found to carry this mutation, which argues against the hypothesis that it arises at a mutation ‘hotspot’.

Genotyping a set of four SNPs and two polymorphic STR markers near the French Gypsy mutation locus allowed us to characterize three different haplotypes in eight unrelated patients, from which a minimal ancestral common core haplotype was defined at markers BRCA1, rs5910, rs5918, rs5919 and rs11870252, spanning about 4 Mb. The large size of this core haplotype is a clear indication of a recent founder effect.

By applying the maximum-likelihood method to the microsatellite haplotype data, we were able to calculate the age of the common ancestor of all carriers of the mutation. The estimated age of the French Gypsy mutation ranged from 12.1 to 15.3 generations (302.5–382.5 years, assuming a 25-year generation time). The study would probably have been more accurate with a larger sample of French Gypsy mutation carriers, but we were limited in part by the fact that GT is a rare disease. Nevertheless, the confidence intervals are narrow enough for our conclusions to be reliable, and our sample of 15 phased chromosomes from unrelated patients is of the same order of magnitude as in other studies.19

Founder effects have previously been identified for GT in 11 unrelated Palestinian patients carrying the same 13-bp deletion in the ITGA2B gene, and in 6 Jordanian families carrying a C549R mutation in β3.23, 24 The Jordanian C549R mutation probably occurred more recently, because the haplotype associated with it was conserved over a distance of 4.5 Mb, whereas the common haplotype associated with the Palestinian 13-bp deletion spanned only 350 kb. Analyzing the closest loci showing recombination events with the mutations yielded an age estimate of 300–600 years for the 13-bp deletion in Palestinian patients and 120–150 years for C549R in Jordanian patients.

Our study illustrates how recent developments in population genetics allow insights into demographic and population history. Records on linguistics, cultural anthropology and demography describe the Gypsies as a population with Indian origins, with an initial exodus into the Byzantine Empire dated approximately to the eleventh century, and a subsequent dispersal throughout Europe by the fifteenth century (Figure 3).25, 26, 27, 28, 29 The migration of Gypsies into France took two routes: the first through the North of Europe, and the second from Persia through North Africa. Gypsies recognize sub-populations among themselves based in part on territorial, cultural and dialect differences. The main branches in France are the Gitans, mostly in Southern France (Spain had long been their favored country), and the Manouches, cousins of the Sinti (who stayed in Germany), who bear Germanic names.10, 11, 15

Figure 3

Origin of the French Gypsy mutation. Origin of Gypsies, European migration in the Balkans, migration in Germany, introduction of the French Gypsy mutation in France and founder effect.

On the basis of our results, it is possible to infer the migration history of these Manouche families in France (Figure 3). It is likely that they moved from Germany to the North of France between the seventeenth and eighteenth centuries. Our results support a founder effect at that time and suggest that the c.1544+1G>A mutation was brought by one founder (or a small group of related founders). The number of carriers of this mutation then increased within the Alsace region.6 Social interactions between the autochthonous population and the newcomers were limited, owing to a strong tradition of consanguineous unions and the maintenance of family structure. It also seems reasonable to assume that, at the time of the appearance of the mutation, the restricted movements of these Manouche families were linked to favorable local agro-ecological conditions.30 Files from registries of the Alsace region suggest a population well integrated into the local agrarian society.30 The high rate of consanguinity in this isolated Manouche group has resulted in an over-representation of the French Gypsy mutation. For a long time, the movements of these Manouches remained confined to the North of France. Demographic expansion, the search for new outlets and perhaps also a taste for adventure pushed the Manouches to spread to other regions of France during the nineteenth century, propagating the mutation.30

The results of this study are consistent with the historical and demographic data available for these Manouche families. Although our demographic model does not establish a definitive migration history, the analysis of this founder mutation provides additional insights into Manouche population movements.