Introduction

Estimates of the heritability of behavioural traits based on single events are often used to make inferences about selection acting on behavioural traits. In general, these estimates suggest that behavioural traits have low heritabilities, which has led to inferences about the strength of selection acting on such traits (Mousseau & Roff, 1987; Roff & Mousseau, 1987). Yet, the fact that heritability estimates for behavioural traits are low may reflect the large variability between events for this class of traits rather than a history of strong selection. Measurements of behavioural traits are normally associated with a low level of repeatability. If an animal is tested in a maze or along a gradient on two occasions, many individuals behave differently upon retesting. In Drosophila, for example, Kekic & Marinkovic (1974) found that, on average, only 33% of the flies selected the same light intensity on successive days. This may be largely caused by minor changes in the apparatus, an individual's internal state or testing conditions.

The repeatability of a trait is normally computed as the intraclass correlation (see Sokal & Rohlf, 1981; Lessells & Boag, 1987) and can be defined as VI/(VI+VM), where VI is the variance component among individuals and VM is the variance attributable to measurement error. It therefore follows that the measurement repeatability (rM) is

$$r_M = \frac{V_I}{V_I + V_M} = \frac{V_I}{V_P} \qquad (1)$$

where VP is the total phenotypical variance (i.e. VI+VM). Low values of rM will influence estimates of the heritability of behavioural traits. When estimates of heritability are based on single events, as is invariably the case, any variation between measurements on the same individual is included in the environmental variance. This variation, known as the special environmental variance (VEs), is equivalent to VM when multiple measures are taken. In a single-event estimate, VEs cannot be separated from the general environmental variance (VEg), the environmental variance contributing to between-individual variability (Falconer, 1989). As a consequence, the narrow-sense heritability of a trait, defined as the additive genetic variance (VA) divided by the phenotypical variance (VP), will be underestimated, because VEs is included in the VP term. When the phenotypical value of an individual is based on the mean of n measurements, the phenotypical variance is given by

$$V_{P(n)} = V_G + V_{Eg} + \frac{V_{Es}}{n}$$

where VG is the genetic variance. Increasing the number of measures will, therefore, decrease the phenotypical variance and increase the heritability of a trait.
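The dependence of heritability on the number of measurements can be illustrated with a minimal Python sketch. It assumes the partition of Falconer (1989), in which the variance of a mean of n measurements is VG + VEg + VEs/n. The function names and the variance components below are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
def phenotypic_variance(v_g, v_eg, v_es, n):
    """Phenotypic variance when each phenotype is the mean of n measurements."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Only the special environmental variance is reduced by averaging.
    return v_g + v_eg + v_es / n


def heritability(v_a, v_g, v_eg, v_es, n):
    """Narrow-sense heritability for phenotypes averaged over n measurements."""
    return v_a / phenotypic_variance(v_g, v_eg, v_es, n)


# Hypothetical variance components (illustrative only).
V_A, V_G, V_EG, V_ES = 1.0, 1.0, 0.5, 8.5

for n in (1, 2, 6):
    print(n, round(heritability(V_A, V_G, V_EG, V_ES, n), 3))
```

With these illustrative values, most of the phenotypical variance of a single event is VEs, so the heritability rises steeply over the first few repeat measurements and approaches VA/(VG+VEg) as n becomes large.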

The extent to which VP should be corrected when determining the heritability of a trait will depend on the number of times an individual undertakes a behaviour in its lifetime. For instance, if a female mates only once, then its mating behaviour will be reflected solely by what happens in this event, and no adjustment may be necessary. However, most behaviours are repeated many times by an individual as it responds to cues from the environment and from other individuals. An animal may respond thousands of times to olfactory cues or phototactic cues during its lifetime. All of these responses will contribute to its lifetime fitness as it searches for food, reproduces and avoids predators, making it important to control for intraindividual variability.

To examine the impact of repeat measurements on the heritability of mating behaviours in Drosophila, variation in two traits was considered: the mating and courtship speed of males when they encounter females. The mating speed of males is commonly measured as the time males take to copulate with virgin females when they are introduced into a container, and this measure is thought to reflect overall male mating success (Fulker, 1966). This trait typically shows a low narrow-sense heritability when estimates are obtained from single events, which has been interpreted as indicating that the trait is closely related to fitness (Fulker, 1966; Stamenkovic-Radak et al., 1992). Another courtship trait that appears to be at least partly independent of mating speed is the time taken for males to start courting females when introduced into a container. Gromko (1987) has found that this trait does not show detectable genetic variability in D. melanogaster. By comparing single-event and multiple-event measures of these behaviours, it is shown that substantial levels of heritable variation become evident when multiple events are considered, in contrast to conclusions based on earlier studies.

Materials and methods

The procedure for measuring courtship and mating speed follows Gromko (1987). All culturing and experiments were undertaken under continuous light. Drosophila melanogaster were obtained from a genetically heterogeneous stock that had been founded by 50 females from Hastings, near Melbourne, 15 generations previously. This stock had been maintained as discrete generations at a census size of around 1000 flies. Males were collected as virgins without anaesthesia and then aged at 25°C for 2 days before the experiment was started. Each male was paired with a virgin female (2–3 days old) in a vial containing 10 mL of a laboratory medium. The time taken for the male to start courting the female and to copulate with her was then scored with a stopwatch to the nearest second. Almost all males (>95%) successfully copulated with females on this first occasion. Males were removed from females, and the process was repeated on the following 5 days with fresh batches of females of the same age. On each occasion, most males (>92%) mated with the females.

Males for the offspring generation were obtained from copulations with the first set of females. Inseminated females were allowed to oviposit in vials at 25°C for only 2 days to ensure that larvae developed under low-density conditions. Offspring were aged and tested as for the parental generation. An attempt was made to test two male offspring for each family. However, data from only one offspring were collected for around 15% of the 82 families tested, either because only one male emerged within the collection time being used or because one of the males died during the experiment.

To examine heritable variation, the mean scores of offspring were regressed onto male parental scores. Regressions were undertaken with males tested on the first day of scoring to determine heritabilities for a single event (coefficients for males scored only on days 2, 3, 4, 5 or 6 were similar to those for males scored on day 1; for courtship speed, the mean value was −0.03±0.05, whereas for mating speed it was −0.026±0.03). To obtain the two-event scores, data were averaged for each individual over the first 2 days of scoring, and so forth for the three-, four-, five- and six-event data. Because regressions are only based on one parent, coefficients were doubled to obtain heritability estimates.
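The regression procedure can be sketched in Python as follows. This is a minimal illustration rather than the analysis code used in the study: the function name and the small data arrays are hypothetical, and it assumes that offspring scores have already been averaged within each family for each day and that both generations are averaged over the same first k days.

```python
import numpy as np


def heritability_from_single_parent(parent_scores, offspring_means, n_events):
    """Twice the slope of offspring means regressed on paternal means.

    parent_scores: array of shape (families, days), one row per father.
    offspring_means: array of shape (families, days), family means per day.
    n_events: number of days (from day 1) averaged for each individual.
    """
    parent = np.asarray(parent_scores, dtype=float)[:, :n_events].mean(axis=1)
    offspring = np.asarray(offspring_means, dtype=float)[:, :n_events].mean(axis=1)
    # Least-squares slope = cov(parent, offspring) / var(parent).
    slope = np.cov(parent, offspring, ddof=1)[0, 1] / np.var(parent, ddof=1)
    # Only one parent is measured, so the slope is doubled.
    return 2.0 * slope


# Hypothetical scores for three families over two days (illustrative only).
fathers = [[1.0, 3.0], [2.0, 4.0], [3.0, 5.0]]
sons = [[1.0, 2.0], [1.5, 2.5], [2.0, 3.0]]

for k in (1, 2):
    print(k, heritability_from_single_parent(fathers, sons, k))
```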

Results and discussion

Courtship and mating speeds from the 6 days were analysed initially by ANOVAs to determine the repeatability of individual scores. These indicate significant differences among individuals for both traits, as well as minor differences between days for one trait (Table 1). These data were used to quantify measurement error associated with the behaviours. By computing variance components attributable to error and to differences among individuals, estimates of the measurement repeatability (rM) of each trait were obtained according to eqn (1). Based on the data presented in Table 1, the measurement repeatabilities for time to courtship and mating computed according to Lessells & Boag (1987) are 0.064±0.021 and 0.175±0.026, respectively. These repeatabilities assume that there is no interaction between individuals and day of mating (i.e. that the expected mean square for individuals is given by $\sigma^2 + 6\sigma^2_{\mathrm{ind}}$). These are low values and reflect the large degree of variability associated with the traits. Removing this error boosts heritability estimates based on single events by a factor of 1/rM, in this case by 15.5 for courtship and by 5.7 for mating speed.
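The repeatability calculation can be sketched as follows, assuming a one-way ANOVA with equal numbers of measurements per individual and no individual-by-day interaction, in the manner described by Lessells & Boag (1987). The function names and the mean squares in the example are hypothetical and are not the values from Table 1.

```python
def repeatability(ms_among, ms_within, n_per_individual):
    """Intraclass correlation from one-way ANOVA mean squares.

    Assumes E[MS_among] = s2 + n * s2_ind, i.e. equal group sizes and
    no interaction between individuals and days.
    """
    var_among = (ms_among - ms_within) / n_per_individual
    return var_among / (var_among + ms_within)


def single_event_correction(r_m):
    """Factor by which a single-event heritability rises once
    measurement error is removed (1 / r_M)."""
    return 1.0 / r_m


# Hypothetical mean squares for six measurements per male (illustrative only).
r = repeatability(ms_among=220.0, ms_within=100.0, n_per_individual=6)
print(round(r, 3))

# Correction factors implied by the rounded repeatabilities reported above.
for trait, r_m in (("courtship", 0.064), ("mating", 0.175)):
    print(trait, round(single_event_correction(r_m), 1))
```

Applied to the rounded repeatabilities, 1/rM is about 15.6 for courtship and 5.7 for mating speed; the small difference from the 15.5 quoted for courtship presumably reflects rounding of the published repeatability.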

Table 1 Mean squares from ANOVAs for time to courtship and time to mating of Drosophila melanogaster measured repeatedly on males from the parental and offspring generations over 6 days

Regression coefficients for both traits increase as the number of events increases (Fig. 1). For time to mating, heritable variation does not become evident until individuals have been scored on more than four occasions, and regression coefficients are only significantly greater than 0 for the five-event (P=0.03) and six-event (P<0.001) comparisons. For courtship speed, regression coefficients are significant for the three-event (P=0.002) and six-event (P=0.04) comparisons, and borderline for the four-event (P=0.10) and five-event (P=0.06) comparisons. When behaviours are averaged over six events, heritability estimates obtained by doubling regression coefficients are intermediate for both mating speed (64%) and courtship speed (39%), whereas they are essentially 0% for the single-event estimates because of the negative but non-significant regression coefficients. Heritabilities for these traits are likely to be at least partly independent, because the correlation between the traits was fairly low (r=0.20, n=1318, P<0.001).

Fig. 1 Effect of measuring repeated courtship and mating events on the regression of offspring scores onto parental scores. Male Drosophila melanogaster were scored for 1–6 days, and regressions are based on one event (scores on day 1), the means of two events (days 1 and 2), the means of three events (days 1–3) and so forth. Error bars represent standard errors.

The results show that scoring the same individual several times leads to a marked increase in the heritability of time to mating and time to courtship. Both traits have intermediate heritabilities when measurements are averaged across events. The estimates obtained are higher than those reported in the literature for these traits. For instance, Drosophila estimates for mating speed include values of 1.6% (Spuhler et al., 1978), 7% (Gromko, 1987), 17% (Singh & Chatterjee, 1988) and 21% (Stamenkovic-Radak et al., 1992), all considerably lower than the six-event estimate obtained in this study. The heritability estimate for courtship speed based on multiple events indicates that there is significant additive genetic variance for this trait. In a previous study in D. melanogaster (Gromko, 1987), this trait lacked detectable genetic variability, and heritability was estimated as 1%, consistent with findings for the single-event comparison.

In a widely cited paper, Roff & Mousseau (1987) summarized laboratory estimates of narrow-sense heritabilities for traits in Drosophila. Their main conclusion was that life history and behavioural traits had low heritabilities, whereas morphological and physiological traits had high heritabilities. Overall, Roff & Mousseau (1987) found that the heritability for behavioural traits was 0.18, based on averaging the median value of different studies. A related survey of estimates from other organisms (Mousseau & Roff, 1987) found the same trends.

Such comparisons of heritabilities among trait classes have been used to make inferences about the effects of selection acting on the classes. If a trait is under intense selection and if alleles influencing it act in an additive manner, then selection is expected to favour those alleles that have a higher fitness. As the favoured alleles go to fixation, the additive genetic variance of the trait is expected to decline. This will lead to a concomitant decrease in narrow-sense heritability if the environmental variance remains constant. Therefore, different classes of traits may have different heritabilities, depending on how closely they are related to fitness. An alternative hypothesis with the same prediction (Price & Schluter, 1991) is that variability in one class of traits (particularly those related to fitness) might be partly determined by variability in a second class of traits. Fitness component traits are expected to have a larger environmental variance, because of variances in all underlying metric traits.

Unfortunately, the present results indicate that comparisons of behavioural traits with other classes of traits are probably meaningless. Low heritabilities will often be an artifact of the low repeatability of single events rather than the outcome of a history of directional selection. This helps to explain why low heritability estimates have been obtained even for behavioural assays that are unlikely to have much ecological relevance (Drosophila examples include measuring geotactic responses in a maze or locomotion through a tube). Low heritabilities for behavioural traits are also inconsistent with evidence that heritable variation in behavioural traits can easily be found in nature (e.g. Hoffmann et al., 1984; Sokolowski et al., 1986; Hoffmann & O'Donnell, 1992).

Problems are also likely to be encountered when using evolvabilities (Houle, 1992) to compare trait classes. Evolvabilities represent VA estimates expressed relative to the mean of a trait; for instance, one measure of evolvability is $I_A = V_A/\mathrm{mean}^2$. Because repeat measures only control for the inflated effects of VEs on the phenotypical variance (see eqn 1), evolvabilities should not be influenced by the number of times a trait is measured. However, in practice, when measurements are only carried out once, an extremely poor estimate of VA will be obtained, and evolvabilities may be low. For instance, the negative covariances for time to copulation evident from Fig. 1 lead to negative IA-values when the one-, two- and three-event data are considered. The IA for courtship speed was highest for the five-event comparison (0.52) and, for copulation, it was highest (0.67) for the six-event comparison.
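The link between a negative parent-offspring covariance and a negative IA can be made explicit with a short sketch. It assumes that VA is estimated as twice the covariance between fathers and offspring, which is the standard expectation for a single-parent design but is not stated as the method used here. The function names and data are hypothetical.

```python
import numpy as np


def additive_variance_single_parent(parent, offspring):
    """V_A estimated as twice the parent-offspring covariance
    (assumption: single-parent design, cov = V_A / 2)."""
    return 2.0 * np.cov(parent, offspring, ddof=1)[0, 1]


def evolvability_ia(v_a, trait_mean):
    """I_A = V_A / mean^2 (Houle, 1992)."""
    return v_a / trait_mean ** 2


# Hypothetical scores with a negative parent-offspring covariance.
fathers = np.array([1.0, 2.0, 3.0])
sons = np.array([2.0, 1.0, 0.0])

v_a = additive_variance_single_parent(fathers, sons)
print(v_a, evolvability_ia(v_a, trait_mean=2.0))
```

Because the mean is squared, the sign of IA is determined entirely by the sign of the estimated VA, so a poorly estimated covariance from single events carries straight through to the evolvability.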

What is the ‘true’ heritability of mating and courtship speed? This will depend on the number of matings and number of courtships undertaken by males. The latter are likely to be numerous, because males will court many females during their lifetime. The heritability for courtship speed might, therefore, be fairly high. In contrast, the number of times that a male mates will be lower and will depend on the incidence of remating and on survival rates in a population.

Using repeated events to estimate the heritability of behavioural traits has limitations. For instance, the heritability of a trait may change with age, as seems likely for fecundity (Rose & Charlesworth, 1981). If different genes control a trait at different ages, then traits need to be measured over a limited age span. In addition, the heritability of behavioural traits should ideally be scored under field conditions and in assays that are relevant to fitness under natural conditions. For instance, mating speed may be unrelated to mating success in some situations (Hoffmann & Cacoyianni, 1990). Nevertheless, these factors do not affect the overall conclusion that heritabilities for courtship and mating behaviours based on single events will often represent underestimates.