Introduction

Although sperm are typically viewed as being small, sperm length is extremely variable across taxa (for a review, see Pitnick et al., 2009), varying from the tiny porcupine sperm (Gage, 1998) to the enormously elongated sperm of Drosophila hydei (Pitnick et al., 1995). Some of this variation may be due to differences in sperm competition risk, although this is not always the case (Gage, 1994; Hosken, 1997; Stockley et al., 1997; for a review, see Pitnick et al., 2009). Similarly, there is also considerable sperm length variation within species, although within males, this variation is relatively small (e.g. Morrow and Gage, 2001a; for a review, see Pitnick et al., 2009). Again, an unequivocal, single explanation for this variation is lacking, although sperm size may have a function in post-copulatory sexual selection, including sperm competition, in at least some cases (e.g. Gage and Morrow, 2003; Calhim et al., 2007; for a review, see Pitnick et al., 2009).

As with any trait, heritable genetic variation is required for sperm length to evolve under any form of selection, and there is some evidence that sperm length has a significant additive genetic component, although there have been very few studies on the quantitative genetics of sperm morphometry (for a review, see Simmons and Moore, 2009). Nevertheless, aspects of sperm length have been shown to be heritable in birds (Birkhead et al., 2005), rabbits (Napier, 1961) and mice (Woolley and Beatty, 1967). Similarly, among insects, sperm length is heritable in bees (Baer et al., 2006), beetles (Simmons and Kotiaho, 2002) and crickets (Morrow and Gage, 2001b), and these heritability estimates are relatively high, ranging from around 40% to more than 100% (with the latter estimate thought to be upwardly biased by sex-linked inheritance). In addition to the general paucity of estimates of the heritability of sperm length (Simmons and Moore, 2009), only a few studies have selected on sperm length to unequivocally show that length indeed responds to selection (Morrow and Gage, 2001b; Miller and Pitnick, 2002)—sperm length may not evolve because of the underlying genetic architecture (for example, see Blows and Hoffmann, 2005).

The yellow dung fly (Scathophaga stercoraria) is a model system for the study of sperm competition and sperm size variation (Parker, 1970; Simmons et al., 1999; Hosken and Ward, 2000). In one of the first reports of within-species sperm length variation, Ward and Hauschteck-Jungen (1993) showed that although sperm size varied across males, it was far less variable within males. Although some of this variation may be due to environmental temperature differences during development (Blanckenhorn and Hellriegel, 2002), there is no indication that the variation is due to inherent developmental instability (Hosken et al., 2003). In addition, although there is some evidence that sperm length influences sperm storage (Otronen et al., 1997), when fly populations were forced to evolve with and without sperm competition, sperm size did not evolve, even though other reproductive characters did (Hosken and Ward, 2001; Hosken et al., 2001). Furthermore, sperm length is heritable in yellow dung flies, and there is a significant autosomal contribution to this (Ward, 2000). Therefore, all else being equal, sperm length should have evolved if it was important in sperm competition. It is, of course, possible that all else is not equal, and that in spite of significant sperm length heritability, the genetic architecture of sperm in some way precludes sperm length evolution (Blows and Hoffmann, 2005). Artificial selection on sperm length would provide direct, unequivocal evidence that sperm length can evolve in response to selection (Simmons and Moore, 2009), although this approach has rarely been used in studies of sperm length. Here we bi-directionally selected on sperm length in replicated lines of yellow dung flies. We found that sperm length indeed responds to selection and that the realized heritability is not greatly different from previous estimates. We also assessed the fertility of males from the long and short sperm-length lines. Differences might have been apparent if there were a trade-off between sperm length and number, but we found none.

Materials and methods

Pairs of wild S. stercoraria (n=66) were captured near Zürich (Switzerland) in mid-October 2003, and were taken to the laboratory. In the laboratory, each pair was placed into a glass vial and given a portion of fresh cow dung on filter paper on which females could lay their eggs. Females were allowed to lay eggs overnight, and 57 did so. Females captured on dung in the field are normally non-virgins, and they store sperm. Therefore, the effective population size, and hence genetic variation, in our initial population sample was greater than that suggested by the census size.

About 15–20 eggs per field-captured female were used for the parental lab generation and were reared under standard conditions: 20 °C, 66% relative humidity and a 12:12 dark–light regime. Eggs were placed in 100 ml plastic bottles filled with approximately 75 ml of cow dung to ensure ad libitum larval food conditions (more than 2 g dung per larva; Amano, 1983) and to minimize larval competition.

After 20 days offspring began to emerge, and all bottles were subsequently checked daily and the number of emerging flies was recorded (sex and family). Newly emerging flies were always housed singly in a 100 ml glass bottle and provided with ad libitum water, sugar and excess Drosophila melanogaster as prey under standard environmental conditions. To establish the selection lines, males and females that emerged around the mean emergence time (males: approximately 26 days; females: approximately 25 days) were randomly divided into six lines. Three lines (replicates) were assigned to the short-sperm selection treatment (S1–3), the other three to the long-sperm selection treatment (L1–3). Each line consisted of about 40–45 males and the same number of females. After being assigned to their respective lines, flies were re-housed and fed as above. All flies were subsequently housed under standard conditions (except that the rearing temperature was 13 °C—yellow dung flies are cold adapted and this temperature allowed us to measure sperm from all sires before the emergence of the next generation) until matings occurred.

Matings occurred on three successive days (one line per treatment per day). Females were at least 14 days old and males at least 6 days old, to ensure they were sexually mature (Foster, 1967; Hosken et al., 2002). In each line, 35 pairs were provided with the opportunity to copulate. Males were first added to a vial containing a portion of cow dung on a filter paper. Females were added 10–25 min later. Matings between siblings were always avoided. After a full copulation (longer than 15 min, confirmed by direct observation), males were removed once they had released the female voluntarily and were stored in Eppendorf tubes at −20 °C for subsequent sperm length measurement. Females were allowed to lay one clutch of eggs over the next four days, although most laid within 24 h. About 20 eggs per female were then transferred to a 100 ml plastic bottle filled with 75 ml cow dung and stored under standard conditions until offspring emergence.

Sperm length was measured for all frozen males, and hind tibia length (HTL), an indicator of body size (Simmons and Ward, 1991; Parker and Simmons, 1994), was also measured. To measure sperm length, one testis per male fly was dissected out in a droplet of distilled water on a microscope slide. All parts other than the testis (fat droplets, tracheae and other tissue) were removed. The testis was pierced in the proximal third (close to the ejaculatory duct) to obtain only mature sperm. The testis was then discarded and the sperm gently diluted in the droplet of distilled water and dispersed on the slide. Slides were then air-dried, and sperm length was measured using microscope images ( × 160 magnification) conveyed to a PC running ZEISS KS300 software (Carl Zeiss MicroImaging GmbH, Göttingen, Germany). Fifteen sperm per male fly were measured, as preliminary investigation showed this was sufficient to obtain an accurate average of a male's mean sperm length (also see Hellriegel and Bernasconi, 2000). Note that the mean sperm length per male does not vary over successive copulations (Bernasconi et al., 2007).

Mean sperm length was then calculated for each line. All offspring of males with shorter sperm than the line mean in the short-sperm selection treatment (or with longer sperm than the line mean in the long-sperm selection treatment) were kept for the next generation. When fewer than 11 families per line fulfilled this condition, the families from fathers with the next appropriate sperm length (that is, the next shortest or longest) were selected to ensure that 11 families were used per line for the next generation. This was necessary in five lines in generation 1 (S2, two families; S3, three families; L1, three families; L2, one family; and L3, one family) and in two lines in generation 2 (S2 and S3, two families each).
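The family-selection rule described above can be sketched in a few lines of Python. This is an illustrative reading of the procedure, not the authors' own script: the function name `select_families`, the dictionary layout and the example values are all hypothetical.

```python
from statistics import mean


def select_families(sire_sperm_length, direction, min_families=11):
    """Return the family IDs kept for the next generation of one line.

    sire_sperm_length: dict mapping a (hypothetical) family ID to the
        sire's mean sperm length in micrometres.
    direction: "short" or "long", the selection treatment of the line.
    min_families: minimum number of families to keep (11 in the text).
    """
    if direction not in ("short", "long"):
        raise ValueError("direction must be 'short' or 'long'")

    line_mean = mean(sire_sperm_length.values())

    # Rank families from most to least extreme in the selected direction.
    ranked = sorted(
        sire_sperm_length,
        key=sire_sperm_length.get,
        reverse=(direction == "long"),
    )

    # Keep every family whose sire lies beyond the line mean.
    if direction == "long":
        chosen = [f for f in ranked if sire_sperm_length[f] > line_mean]
    else:
        chosen = [f for f in ranked if sire_sperm_length[f] < line_mean]

    # Top up with the next longest (or shortest) sires if too few qualify.
    if len(chosen) < min_families:
        chosen = ranked[:min_families]
    return chosen


# Hypothetical sire means (micrometres) for a tiny example line.
example = {"A": 210.0, "B": 212.0, "C": 214.0, "D": 216.0}
print(select_families(example, "long", min_families=3))
```

With a line mean of 213 μm in this toy example, only two sires exceed the mean, so the third family is taken from the sire with the next longest sperm.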

This procedure was continued for a further three generations (four generations of selection in total), at which time the realized heritability of sperm length was calculated using three methods: treatment divergence and cumulated selection differentials (Falconer and Mackay, 1996); a linear parent–offspring regression (Falconer and Mackay, 1996), weighted because the number of sons that each family contributed to each generation varied; and a restricted maximum likelihood approach on an animal model using WOMBAT, a freeware program equivalent to ASReml (http://agbu.une.edu.au/~kmeyer/wombat.html) (Meyer, 2006, 2007). WOMBAT uses the pedigree structure of our selection lines to estimate trait heritabilities (sperm length and body size) and is probably the most powerful way to estimate heritability given our data. We also obtained estimates of the genetic and phenotypic correlations between sperm length and body size from this model.
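For the first of these methods, realized heritability is conventionally taken as the slope of the regression of cumulative response on cumulative selection differential (Falconer and Mackay, 1996). The Python sketch below illustrates that calculation under simplifying assumptions: the function names and input values are hypothetical, the origin (generation 0) is included as a data point, and no adjustment is made for selection acting on one sex only.

```python
def cumulative(values):
    """Running totals of a sequence, e.g. [1, 2, 3] -> [1, 3, 6]."""
    total = 0.0
    out = []
    for v in values:
        total += v
        out.append(total)
    return out


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def realized_heritability(selection_differentials, responses):
    """Slope of cumulative response on cumulative selection differential.

    Both arguments hold one value per generation of selection, in the
    same units (micrometres here). Generation 0 is added as (0, 0).
    """
    cum_s = [0.0] + cumulative(selection_differentials)
    cum_r = [0.0] + cumulative(responses)
    return ols_slope(cum_s, cum_r)


# Hypothetical per-generation values, NOT data from this study.
s_per_gen = [3.0, 3.5, 2.5, 3.0]
r_per_gen = [0.4 * s for s in s_per_gen]
print(round(realized_heritability(s_per_gen, r_per_gen), 3))
```

For a divergence-based estimate, the response would be the difference between the upward and downward treatment means and the selection differential the sum of the two treatments' differentials; the same regression then applies.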

The fertility of males from both selection treatments was assessed as a potential correlated response after a further generation of selection (generation 5). Ten males from each line were given the opportunity to copulate sequentially with five different females from the same selection line as the male (that is, five matings per male). In all, 20 males from the long-sperm selection lines and 13 males from the short-sperm selection lines copulated successfully with all five females (mean±s.e.m. sperm length of copulating males: long 221.1±0.8 μm; short 212.1±1.2 μm), but the number of successful copulations did not differ between the selection treatments (data not shown).

After a full copulation (visual observation), a male's first and fifth females were transferred to a new glass bottle and given the opportunity to lay eggs on fresh dung once a week until death. Eggs were counted and hatching success recorded. To assess hatching success, 10 eggs from each clutch were transferred to a moist filter paper and checked for hatching after 24 h—it is easy to see which eggs have hatched as the chorion is left behind in a crumpled heap when the maggot exits. The proportion of hatched eggs was then used to estimate the fertility of the male.

There are a number of ways in which these data could be analyzed to test for differences in fertility between selection treatments. We used repeated-measures analysis of variance and non-parametric tests (for example, Wilcoxon's signed-rank and Wilcoxon's rank-sum tests) of the arcsine-transformed hatch proportions (and total eggs hatching; data not shown). As all tests gave qualitatively the same results, only the repeated-measures analysis of variance of the hatch proportions is presented. All fertility analyses were conducted using JMP 7.0.2 (SAS Institute Inc., Cary, NC, USA, 2007).
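As a minimal illustration of the transformation step, the Python sketch below converts a clutch's hatch count into a proportion and applies the arcsine square-root transform. The analyses themselves were run in JMP; the function names here are hypothetical, and the default sample of 10 eggs per clutch follows the hatching assay described above.

```python
import math


def hatch_proportion(hatched, sampled=10):
    """Proportion of sampled eggs that hatched (10 eggs per clutch)."""
    if sampled <= 0 or not 0 <= hatched <= sampled:
        raise ValueError("need 0 <= hatched <= sampled and sampled > 0")
    return hatched / sampled


def arcsine_sqrt(p):
    """Arcsine square-root transform of a proportion, in radians."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    return math.asin(math.sqrt(p))


# Hypothetical clutch: 7 of 10 sampled eggs hatched.
p = hatch_proportion(7)
print(p, arcsine_sqrt(p))
```

The transform maps proportions on [0, 1] to [0, π/2] and is commonly used to stabilize the variance of proportion data before analysis of variance.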

Results

After four generations of selection there were significant differences between our selection treatments, with longer sperm found in our upward selection lines (F1,4=18.86; P=0.012) (Figure 1). Responses to selection for longer sperm were relatively strong and rapid (first generation, 212.284±0.424 μm; fourth generation: 219.483±0.542 μm), whereas the responses to selection for shorter sperm were much weaker (first generation, 212.079±0.389 μm; fourth generation: 213.489±0.570 μm) (Figure 2). Note that this asymmetry in response is not because of differences in selection differentials (mean±s.e.m.: long-sperm selection treatment, 3.328±0.206; short-sperm selection treatment, 3.024±0.207) over generations or between treatments (all F<1.01; all P>0.33).

Figure 1

The final mean±s.e.m. sperm lengths of each line in our selection regimes after four generations of selection. The differences in sperm length were statistically significant. It should be noted that the mean sperm length was initially approximately 212 μm.

Figure 2

The results of selecting on sperm size across four generations. Solid squares are the upward lines, open circles the downward. Error bars are the s.e.m. values across the three replicates per treatment. The asymmetrical responses to selection are clearly evident: upward selection led to an increase in sperm length, whereas downward selection had virtually no effect.

The overall realized heritability, estimated using a combined generation mean (difference) versus cumulative selection differential regression, was 0.45±0.02 (r2=0.99; F1,1=467.7; P=0.029). The realized heritability in the upward direction was 0.81±0.10 (r2=0.96; F1,2=2.73; P=0.016) and that in the downward direction was 0.51±0.20 (r2=0.76; F1,2=6.426; P=0.1267).

A weighted parent–offspring regression yields a total heritability of sperm length of 0.48±1.94 (r2=−0.005; F1,178=0.06; P=0.81). Weighted parent–offspring regression for each direction of selection separately gave a heritability in the upwards direction of 0.32±1.4 (r2=−0.01; F1,97=0.08; P=0.78) and in the downward direction of 0.28±0.20 (r2=0.010; F1,90=1.92; P=0.17).
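A weighted parent–offspring regression of this kind can be sketched as a weighted least-squares slope, with each family weighted by the number of sons it contributed. The Python code below is an illustration only: the function name, weights and data are hypothetical, and the source does not state whether the reported estimates are the raw slope or the slope scaled for a single-parent regression (conventionally twice the slope; Falconer and Mackay, 1996).

```python
def weighted_slope(parent, offspring, weights):
    """Weighted least-squares slope of offspring values on parent values.

    parent: sire mean sperm lengths (micrometres), one per family.
    offspring: mean sperm length of each family's sons (micrometres).
    weights: number of sons measured per family (assumed weighting).
    """
    if not (len(parent) == len(offspring) == len(weights)):
        raise ValueError("inputs must have the same length")
    total_w = sum(weights)
    if total_w <= 0:
        raise ValueError("weights must sum to a positive value")

    # Weighted means of the parent and offspring values.
    mean_p = sum(w * p for w, p in zip(weights, parent)) / total_w
    mean_o = sum(w * o for w, o in zip(weights, offspring)) / total_w

    # Weighted sums of squares and cross-products.
    spp = sum(w * (p - mean_p) ** 2 for w, p in zip(weights, parent))
    if spp == 0:
        raise ValueError("parent values have no variance")
    spo = sum(
        w * (p - mean_p) * (o - mean_o)
        for w, p, o in zip(weights, parent, offspring)
    )
    return spo / spp


# Hypothetical families, NOT data from this study.
sires = [210.0, 212.0, 214.0, 216.0]
sons = [211.0, 212.0, 213.0, 214.0]
n_sons = [3, 1, 4, 2]
print(weighted_slope(sires, sons, n_sons))
```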

The estimated heritability of sperm length with an animal model approach, using generation and line as fixed factors and individual as a random factor, was 0.53±0.09 (N=673; max. log L=−809.48). Body size (HTL) had an estimated heritability of 0.450±0.09 (N=656; max. log L=−809.48). Sperm length and body size were not correlated (genetically: rG=−0.007±0.151; phenotypic: rP=0.042±0.043).

To test for possible fertility differences between the selection regimes, we used repeated-measures analysis of variance with the hatching proportion (arcsine square-root transformed) of eggs from each male's first and fifth mates as the repeated measure and selection treatment (long or short sperm) as the factor. There was no significant difference between the selection treatments (F1,4=0.1126; P=0.7541: mean±s.e.m. hatch proportions: long sperm, first female 0.68±0.07; short sperm, first female 0.73±0.09; long sperm, fifth female 0.64±0.08; short sperm, fifth female 0.58±0.11).

Discussion

Our major findings are that sperm length responds to selection, that divergence was rapid, but responses were asymmetrical—we observed a response in the upward direction, but not downward. In addition, our estimates of the realized heritability were not greatly different from previously published narrow-sense estimates derived from sire–son regression, but we observed no correlated response in male fertility to selection on sperm length. We discuss each of these findings and their significance in turn.

Sperm length responded rapidly to selection in the upward direction, and after four generations of selection significant divergence was found between treatments. This is consistent with findings from other sperm length selection studies. For example, in crickets, sperm length also significantly diverged after four generations of selection (Morrow and Gage, 2001b), and in D. melanogaster, long- and short-sperm lines seemed to differ after about three generations (Miller and Pitnick, 2002). However, maternal line also had to be included in the selection regime for sperm length responses to be realized in the cricket study (Morrow and Gage, 2001b). This indicates some maternal (genome) contribution to sperm length, and in a more formal quantitative genetic analysis of sperm length in Drosophila mojavensis that tested alternative models, including autosomal, maternal, cytoplasmic, and X and Y effects, sperm length models were significant only when X or Y effects were included (Miller et al., 2003). There is also some evidence for sex-linked inheritance of sperm length in yellow dung flies (Ward, 2000), although this inference is based on maternal grandfather–son regressions rather than more formal model assessment. In any case, we generated a rapid response to selection here. This finding indicates that previous failure to document a response in the sperm length of S. stercoraria evolving under different levels of sperm competition (Hosken et al., 2001) is unlikely to be due to a lack of evolutionary potential in sperm length. Rather, as was previously suggested (Hosken et al., 2001), sperm length is likely to be a neutral character in relation to sperm competition sensu stricto in S. stercoraria.

What is less clear, if sperm length has no effect on sperm competitiveness, is the cause and consequence of the apparent sperm length effect on sperm storage in S. stercoraria (Otronen et al., 1997), especially as sperm number is so important in competitive siring success. This previous finding is difficult to interpret, however, because copulations were experimentally interrupted and males did not, therefore, complete a full copulation. In any case, definitive demonstrations of sperm length effects on sperm competition are rare. The size of the amoeboid sperm influences paternity in bulb mites (Radwan, 1996) and nematodes (LaMunyon and Ward, 1999), and the length of a non-fertilizing sperm influences paternity in a snail (Oppliger et al., 2003), but generally, the fitness consequences of intra-specific sperm size variation are unclear (Pitnick et al., 2009). However, across species, sperm length often co-evolves with aspects of the female reproductive tract morphology (for example, Dybas and Dybas, 1981; Briskie et al., 1997; Minder et al., 2005). For example, across the Scathophagidae, sperm length scales significantly with the length of the duct leading to the female sperm store, but not with testis size (Minder et al., 2005). Similar sperm–female correlations have also been reported for other taxa (for example, Morrow and Gage, 2000), and experimental evolution studies provide some evidence that the benefits of sperm length depend on the dimensions of the female reproductive tract (Miller and Pitnick, 2002), as do within-species across-population studies (Pitnick et al., 2003). All these facts suggest that post-copulatory sexual selection more generally (that is, not sperm competition exclusively) has been involved in sperm size divergence across species and populations in at least some instances. Consistent with this, within-species intra-male variation in sperm length is influenced by sexual selection across dung fly species (Hosken and Minder, unpublished).

The response to selection we documented was highly asymmetrical, with a rapid upward response, but essentially no downward response. In fact, there was an initial increase in sperm size in the downward lines. This may be due to moving flies to the laboratory, as temperature slightly influences sperm length in this species (Blanckenhorn and Hellriegel, 2002). Furthermore, previous studies of sperm length in dung flies have also found changes in length that seem to be due to rearing flies in the laboratory, although in those studies, the initial response was a decrease in length (Ward, 2000) rather than the increase we noted. Ward (2000) suggested that some of the ‘noise’ he recorded may have been diet related and dependent on the quality of the Drosophila that were fed to the dung flies. What seems certain is that there are specific environmental effects on sperm length in S. stercoraria, although their exact nature remains to be comprehensively explored (see, for example, Gay et al., 2009). In addition to these potential environmental influences, there are many other possible reasons for asymmetrical responses in artificial selection experiments (Falconer, 1989). These include drift, inbreeding depression, indirect selection, genetic asymmetries and genes of large effect. Indeed, in regression analyses of the heritability of sperm length, there is some evidence for non-linearity (Ward, 2000), probably due to the presence of genes of large effect (Falconer, 1989), and this non-linearity actually predicts some asymmetry in the first generation(s) of selection (Falconer, 1989). Hence, in this regard, our findings are consistent with the expectation based on previous parent–offspring assessment. Nevertheless, any of these potential causes could account for the asymmetrical response we observed, and as Falconer (1989) states, with so many potential causes of asymmetry, it would be surprising if such a difference were not detected. Similar asymmetrical responses to selection on sperm length have been observed in Drosophila (Miller and Pitnick, 2002), and could also relate to sperm–female interactions, with females representing unmeasured indirect selection opposing the artificial selection imposed by experimenters.

Our estimate of the heritability of sperm size is not greatly different from that of Ward (2000). The mean estimate from that study was 67%, whereas ours is between 44 and 53%, depending on the approach. The most precise estimate of sperm length heritability in our study comes from the restricted maximum likelihood analysis (53%). This is well within the range of estimates reported from other taxa, but there are relatively few estimates of sperm length heritability (Simmons and Moore, 2009). Morphological characters typically have relatively high heritability (Mousseau and Roff, 1987; Roff, 1997), and one suggestion for this is that these characters are more distantly related to fitness than life history traits (for example, see Mousseau and Roff, 1987; Roff, 1997). This may well be the explanation for the high heritability of sperm length in dung flies, although whether this is the explanation for the general pattern is a matter of dispute (Price and Schluter, 1991; Rowe and Houle, 1996). Furthermore, sexually selected characters generally tend to have high additive genetic variance and heritability (Pomiankowski and Møller, 1995), which may be due to the sex linkage that seems to be associated with many sexually selected traits (Reinhold, 1998). If sperm length is sexually selected and sex linked, as seems to be the case in at least some species, this could partly explain the generally high heritabilities seen for sperm length (Miller et al., 2003). Finally, we also found no associations between sperm and body size, although body size was heritable, all of which is consistent with previous studies (for example, Ward and Hauschteck-Jungen, 1993; Simmons and Ward, 1991).

We found no evidence for a fertility difference between our treatments, and therefore no evidence for a sperm size/number trade-off, although we acknowledge that the power of this test was low, and the actual differences in sperm lengths between treatments were only about 4%. In addition, although we looked over five copulations, trade-offs may only be detected over a lifetime. A negative association between sperm size and number is a key assumption of sperm competition theory (Pizzari and Parker, 2009), but one for which there is only limited support (for example, see Pitnick, 1996; Oppliger et al., 1998). This is at least partly because of the difficulties in assessing the predicted negative association (Pitnick, 1996; Pizzari and Parker, 2009), but in any case, we find no evidence indicative of a trade-off, subject to the caveats given above, which make the test rather weak.

In conclusion, sperm length responded rapidly but asymmetrically to bidirectional artificial selection, and the realized heritability was close to that estimated previously using sire–son regression. Nevertheless, in spite of considerable investigation, the functional significance of sperm length variation in yellow dung flies remains unclear.