Introduction

We frequently experience distortion of time when we encounter emotional stimuli or events in our daily lives. This phenomenon is called emotional temporal distortion (Lake et al., 2016). As early as 1890, James (1890) noted that our perception of time changed with different mental moods. Later, some studies found that emotions affected time perception (e.g., Falk & Bindra, 1954; Gulliksen, 1927; Hare, 1963; Langer et al., 1961; Rosenzweig & Koht, 1933; Thayer & Schiff, 1975). However, these studies suffered from methodological limitations, making their findings inconsistent and difficult to interpret (Lake et al., 2016). For example, some studies made inappropriate comparisons between emotional and non-emotional conditions (Hare, 1963; Langer et al., 1961; Thayer & Schiff, 1975). Other studies failed to induce the targeted emotion (Rosenzweig & Koht, 1933) or induced other confounding psychological processes (Gulliksen, 1927; Langer et al., 1961). Due to those limitations, the above studies only found that emotions affected time perception but failed to explain how. With the development of emotion research methods, Angrilli et al. (1997) started to use standardized emotional materials to explore this phenomenon to make up for the above limitations and to further understand the causal link between emotion and time perception. Since then, much evidence has demonstrated that emotions distort time perception (Gil et al., 2009; Mioni et al., 2020; Ogden et al., 2021; Yin et al., 2021b, 2022; Yuan et al., 2020).

As the number of studies grew, several potential moderators were identified. First, based on emotion dimension theory (Lang et al., 1998; Russell, 1980), valence and arousal may modulate emotional temporal distortion (Angrilli et al., 1997; Noulhiane et al., 2007). Second, emotions can be induced with many kinds of stimuli, such as words, pictures, sounds, and videos, which differ in their effectiveness and validity in eliciting emotional and physiological reactions (Ellard et al., 2012; Siedlecka & Denson, 2019). Therefore, stimulus type may modulate the effect of emotion on time perception. Third, there are many paradigms for measuring time perception, and they differ in the responses they require (Thoenes & Oberfeld, 2017). Therefore, temporal paradigm may modulate emotional temporal distortion (Gil & Droit-Volet, 2011).

Though previous studies found that valence, arousal, stimulus type, and temporal paradigm modulated emotional temporal distortion, there was some evidence to the contrary. Meta-analysis is an effective method of exploring moderators, but no researcher has yet used meta-analysis to examine these moderators. Therefore, the current meta-analysis aimed to systematically clarify the effect of these moderators.

Valence

Emotional valence refers to the degree of pleasantness of an emotion, ranging from pleasant to unpleasant (Bradley et al., 2001; Cacioppo & Gardner, 1999). In its simplest form in empirical studies, valence is divided into three levels: positive, neutral, and negative. Many studies have examined valence-related temporal distortion, but the results are mixed. Some studies found that both positive and negative stimuli led to temporal overestimation. Specifically, compared with neutral stimuli, both positive and negative stimuli extended perceived duration (Droit-Volet et al., 2004, 2016; Grommet et al., 2011; Jones et al., 2017; Li & Tian, 2020; Smith et al., 2011). However, others found inconsistent results. For example, Lui et al. (2011) observed that both negative and positive stimuli shortened perceived duration; Tipples (2008) found that happy expressions did not lengthen perceived duration relative to neutral expressions; Eberhardt et al. (2020) found that angry expressions did not cause overestimation.

The comparison between negative and positive stimuli is also mixed. Some studies found that time perception of negative stimuli was longer than that of positive stimuli (Buetti & Lleras, 2012; Mereu & Lleras, 2013; Noulhiane et al., 2007; Yamada & Kawabe, 2011). This is consistent with everyday experience: when you are happy, time flies; in sorrow, days seem years. However, other studies found the opposite (e.g., Van Volkinburg & Balsam, 2014). Therefore, it is necessary to clarify whether valence has a moderating effect on emotional temporal distortion.

Arousal

Arousal refers to the degree of physiological activation, ranging from calm to excitement (Bradley et al., 2001; Lang et al., 1998; Russell, 1980). It is an essential dimension of emotion and plays a key role in emotional temporal distortion. In empirical studies, arousal is usually manipulated by presenting different emotional stimuli (Clark, 1983; Gross & Levenson, 1995; Lang et al., 1993). How emotional arousal affects time perception has been explored in many ways. For example, participants reported how aroused they felt (Yin et al., 2021b); experimenters chose stimuli they believed to be arousing (Gil et al., 2007); experimenters measured physiological arousal directly (Mella et al., 2011). Correspondingly, temporal overestimation with increasing arousal has been observed under various experimental manipulations (Campbell & Bryant, 2007; Dirnberger et al., 2012; Droit-Volet et al., 2020; Zhou et al., 2021).

Although some studies observed that arousal modulated time perception, a few studies found inconsistent results, so there are reasons for caution. Firstly, there were exceptions to the usual pattern: high-arousal stimuli were not perceived as lasting longer than low-arousal stimuli (Noulhiane et al., 2007). Secondly, studies using pictures found that the relationship was modulated by valence and not always in the same way (Angrilli et al., 1997; Smith et al., 2011; Van Volkinburg & Balsam, 2014). The interaction between valence and arousal has also been found with auditory stimuli (Noulhiane et al., 2007). To summarize, although some research suggested that arousal prolonged temporal estimations, other studies showed that an increase in emotional arousal did not lengthen perceived duration (Noulhiane et al., 2007). The evidence across studies therefore remains inconsistent and needs to be clarified.

Stimulus type

There are many mood induction procedures (MIPs). They can be divided into two categories: mood-induction situation and mood-eliciting material (Zheng et al., 2013). The former includes Imagination MIP, Velten MIP, Social Interaction MIP, Gift MIP, etc. (Zheng et al., 2013). The latter includes video, sound, word, and picture (Siedlecka & Denson, 2019).

In the field of time perception, although some researchers have adopted mood-induction situations (Benau & Atchley, 2020; Matsuda et al., 2020; Piovesan et al., 2019), most have used mood-eliciting material: words (Johnson & MacKay, 2019; Tipples, 2010), facial expressions (Bar-Haim et al., 2010; Fayolle & Droit-Volet, 2014; Li & Yuen, 2015; Tipples, 2011; Zhang et al., 2014), scenic pictures (Gable et al., 2016; Grondin et al., 2014; Tian et al., 2018; Tipples, 2019), sounds (Droit-Volet et al., 2010; Noulhiane et al., 2007; Wackermann et al., 2014), and videos (Özgör et al., 2018; Wöllner et al., 2018).

However, studies using different emotional stimuli to explore emotional temporal distortion have produced inconsistent results. For example, Noulhiane et al. (2007) and Angrilli et al. (1997) used sounds and scenic pictures, respectively, to explore the effect of emotion on time perception. Although both studies used the same time-reproduction paradigm and included positive and negative stimuli high and low in arousal, they found different outcomes. Angrilli et al. (1997) found that low-arousal positive pictures were overestimated relative to low-arousal negative pictures, whereas high-arousal negative pictures were overestimated relative to high-arousal positive pictures. In contrast, Noulhiane et al. (2007) found that negative sounds were judged to be longer than positive sounds, regardless of whether arousal was high or low. In addition, Zhang et al. (2017) did not observe overestimation with words, whereas many studies found emotional temporal distortion with facial pictures, scenic pictures, and sounds (Bar-Haim et al., 2010; Fayolle & Droit-Volet, 2014; Li & Yuen, 2015). To summarize, the results of studies using different stimuli are inconsistent and need further clarification.

Temporal paradigm

Paradigms used to study time perception encompass time estimation, time reproduction, and duration discrimination (Thoenes & Oberfeld, 2017). All three have been used to study the effect of emotion on time perception, so the effect has been examined across different temporal paradigms: (a) estimation paradigms, which typically incorporate rating scales on which participants rate the perceived duration from short to long using a Likert-type scale (Noulhiane et al., 2007; Ogden et al., 2019); (b) reproduction paradigms, which ask participants to reproduce a given interval (Angrilli et al., 1997; Bar-Haim et al., 2010; Doi & Shinohara, 2009; Noulhiane et al., 2007); and (c) discrimination paradigms, which require participants to decide whether a specific duration is longer or shorter than a standard duration (Doi & Shinohara, 2009; Gil & Droit-Volet, 2012; Grommet et al., 2019).

Some empirical findings with different temporal paradigms have revealed high levels of correlation (Wearden, 2003; Wearden & Lejeune, 2008). However, many studies about emotional temporal distortions have found that different temporal paradigms caused different results (e.g., Gan et al., 2009; Gil & Droit-Volet, 2011; Huang et al., 2018a). For example, Gil and Droit-Volet (2011) tested anger-related temporal distortion using estimation, reproduction, and different kinds of discrimination paradigms (i.e., bisection and generalization). Results showed that in the estimation and one discrimination paradigm (bisection), the time of angry faces was estimated to be longer than that of neutral faces, but not in the reproduction and another discrimination paradigm (generalization). To summarize, the temporal paradigm is a possible variable that modulates emotional temporal distortion and needs to be further clarified. Despite the claim that the influence of emotion on time perception generalizes across paradigms, results seem mixed when examined across different temporal paradigms.

The current study

Since the study by Angrilli et al. (1997), a growing number of studies have focused on emotional temporal distortion. Although most work has found that emotions distort time perception, findings on how arousal, valence, stimulus type, and temporal paradigm modulate emotional temporal distortion are inconsistent. Given that many of the studies reviewed above had relatively small sample sizes, some results were likely limited by a lack of statistical power and had an increased risk of type I and random errors. However, these results are well suited for meta-analysis, which is a powerful statistical method that can identify trends across numerous small-sample studies based on effect sizes (Borenstein et al., 2009). Therefore, the current study aimed to clarify how valence, arousal, stimulus type, and temporal paradigm modulate emotional temporal distortion through meta-analysis.

Specifically, the current study first used meta-regression to examine the moderating effects of valence and arousal separately. Because previous studies have found an interaction between valence and arousal (Angrilli et al., 1997), meta-regression was also used to test their interaction. Furthermore, since previous studies usually manipulated valence and arousal as categorical variables (positive high arousal, positive low arousal, negative high arousal, and negative low arousal), which makes a linear relationship difficult to satisfy, we also used subgroup categorical analysis to test their interaction. Subgroup categorical analysis was likewise used to examine the moderating effects of stimulus type and temporal paradigm. Lastly, an analysis of publication bias was conducted.

Method

Literature search

We conducted an exhaustive literature search using sequential strategies to locate studies that provide data on the effects of emotion on time perception. First, we searched for relevant studies in Web of Science and SpiScholar. The primary keywords were “time perception,” “time estimation,” “time judgment,” “time evaluation,” “interval,” “duration,” “temporal” in conjunction with “emotion,” “affective,” “fear*,” “disgust*,” “ang*,” “sad*,” and “surprise*.” To collect as much literature as possible, we also supplemented the search with Google Scholar. In addition, we searched the reference lists of all included articles and relevant review articles in the field. The time window of our literature search was from January 1997 to 31 May 2021, because the first article using standardized emotion materials to explore emotional temporal distortion scientifically was published in 1997 (Angrilli et al., 1997).

Study selection

Exclusion criteria were as follows:

  1. The duration: Fraisse (1984) thought that time perception has an upper limit that hardly exceeds 5 s. Therefore, studies with a duration of more than 5 s were excluded.

  2. Peer-reviewed: Studies that were not published in peer-reviewed journals according to indices SCI, EI, SSCI, CSSCI, and CSCD.

  3. Control condition: Studies that did not provide a control condition, including emotional condition and neutral condition; or studies with inconsistent variables other than emotion between the experimental and control groups.

  4. Emotion type: Studies in which emotion type could not be determined for the absence of valence, or arousal information of stimulus.

  5. Temporal paradigm: Studies that did not use estimation, discrimination, or reproduction paradigm, or used a retrospective paradigm.

  6. Stimulus type: Studies that did not induce emotion with word, picture (facial picture and scenic picture), sound, and video.

  7. Language: Studies that were not written in English or Chinese.

  8. Sample: Studies that did not involve healthy human participants or participants' average age in these studies were not between 18 and 60.

  9. Modality: Studies that used tactile stimuli to measure time perception.

  10. Article type: Review, meta-analysis, editorial, or commentary.

  11. Compute effect size: Studies reporting results with insufficient information to compute effect size.

Notably, when study results were ambiguous or insufficient for inclusion in the meta-analysis (e.g., information required to calculate effect size was not reported), we contacted the corresponding authors of the studies to request further information. After these exclusion criteria were applied to the 4,116 potentially relevant articles, 31 articles remained. In total, 95 effect sizes were included in the current meta-analytic review (Fig. 1).

Fig. 1 Flowchart of the study selection process. Note: n represents the number of articles, and k represents the number of independent effect sizes

Data extraction

Data were extracted independently by two candidates and cross-checked until consensus was reached. The following variables were extracted from each eligible article: study identification data (i.e., author and publication year), participants’ mean age, sample size, arousal, valence, emotion type, stimulus type, temporal paradigm, and the statistics for the calculation of effect size.

Valence

We extracted the valence values from the articles and converted them uniformly to a nine-point Likert scale (1 = “extremely negative,” 9 = “extremely positive”). Specifically, we divided each reported value by the scoring scale employed and then multiplied the result by nine. Values greater than 5 were coded as positive, and values less than 5 were coded as negative (Bradley et al., 2001; Cacioppo & Gardner, 1999). If no valence value was provided, we coded valence by emotion category; that is, happiness was coded as positive, while anger, fear, disgust, etc., were coded as negative.
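The conversion described above can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes that "the scoring scale employed" means the maximum point of the original rating scale, and the function names `rescale_to_nine_point` and `code_valence` are hypothetical. The text does not say how a value of exactly 5 was handled, so the sketch leaves it uncoded.

```python
from typing import Optional


def rescale_to_nine_point(value: float, scale_max: float) -> float:
    """Divide a rating by its scale maximum (assumed), then multiply by nine."""
    if scale_max <= 0:
        raise ValueError("scale_max must be positive")
    return value / scale_max * 9


def code_valence(nine_point_value: float) -> Optional[str]:
    """Code a nine-point valence value as positive (> 5) or negative (< 5)."""
    if nine_point_value > 5:
        return "positive"
    if nine_point_value < 5:
        return "negative"
    # Exactly 5: the source does not specify a rule, so nothing is assigned.
    return None


# Example: a rating of 3.5 on a seven-point scale becomes 4.5 on the nine-point scale.
converted = rescale_to_nine_point(3.5, 7)
label = code_valence(converted)
```

The same thresholding would apply to arousal, with "high" and "low" in place of "positive" and "negative".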

Arousal

Similar to valence, we extracted the arousal values from the articles and converted them uniformly to a nine-point Likert scale (1 = “low arousal,” 9 = “high arousal”). Values greater than 5 were coded as high arousal, and values less than 5 were coded as low arousal (Bradley et al., 2001; Cacioppo & Gardner, 1999).

Stimulus type

We coded stimulus types into five categories: scenic picture, facial picture, word, sound, and video. After coding, we found that only one study used video as emotion-eliciting stimuli, providing four effect sizes (Droit-Volet et al., 2011). Therefore, we excluded videos as a stimulus type in the subsequent analysis.

Temporal paradigm

We encoded temporal paradigm into three categories: estimation, reproduction, and discrimination. We included the verbal estimation task and rating scales as types of temporal estimation paradigms. Also, we regard bisection (Droit-Volet et al., 2015; Li & Yin, 2019), generalization (Huang et al., 2018b), and S1/S2 temporal discrimination paradigm (Lui et al., 2011) as discrimination paradigm (Table 1).

Table 1 Characteristics and main findings of the studies included in the meta-analysis

Meta-analysis

Effect size

For each study, the effect sizes relevant to this analysis were calculated as Hedges’ g, as it shows a lower level of bias (Borenstein et al., 2009). In the current analysis, Hedges’ g was calculated as follows. If the study provided the mean and standard deviation of the emotion condition and the neutral condition, it was calculated according to the formula \(g=\left({Mean}_{1}-{Mean}_{2}\right)/{SD}_{pooled}\). If the statistics required by this formula were missing, Hedges’ g was derived from the t value and sample sizes according to the formula \(g=t \sqrt{\left({n}_{1}+{n}_{2}\right)/\left({n}_{1}\times {n}_{2}\right)}\). If the t value was also not reported, the p value reported in the article was converted to a t value. The extraction of p values followed previous studies (Yuan et al., 2019). If a result was reported as nonsignificant, it was conservatively assigned a one-tailed p value of 0.50, such that Hedges’ g was 0. If a reported result was significant but the exact p value was not provided, the p value was assumed to equal the reported threshold (0.05, 0.01, or 0.001; two tails). For example, Grommet et al. (2019) did not report an exact p value but instead p < 0.001; in the Comprehensive Meta-Analysis (Version 3; CMA; Biostat, Englewood, NJ, USA) software package, we calculated the effect size according to p = 0.001 (two tails). Similarly, Mella et al. (2011) reported p < 0.05; we calculated the effect size based on p = 0.05 (two tails). The effect size was positive if duration judgments were longer for emotional stimuli than for neutral stimuli.
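The two formulas above can be illustrated with a short sketch. It is not the CMA implementation: the pooled standard deviation uses the usual two-group definition, which the text does not spell out, and any small-sample correction is omitted because the text does not mention one. All function names are hypothetical.

```python
import math


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation of two groups (standard definition, assumed here)."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def g_from_means(mean1: float, sd1: float, n1: int,
                 mean2: float, sd2: float, n2: int) -> float:
    """Effect size from means and SDs: (Mean1 - Mean2) / SDpooled."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)


def g_from_t(t: float, n1: int, n2: int) -> float:
    """Effect size from a t value: t * sqrt((n1 + n2) / (n1 * n2))."""
    return t * math.sqrt((n1 + n2) / (n1 * n2))


# Emotional condition (group 1) versus neutral condition (group 2); made-up numbers.
g_means = g_from_means(1.2, 0.4, 20, 1.0, 0.4, 20)

# The matching independent-samples t value leads to the same effect size.
t_value = (1.2 - 1.0) / (pooled_sd(0.4, 20, 0.4, 20) * math.sqrt(1 / 20 + 1 / 20))
g_t = g_from_t(t_value, 20, 20)
```

A positive value means that the emotional condition was judged as longer than the neutral one, which matches the sign convention stated in the text.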

We used the Comprehensive Meta-Analysis (Version 3; CMA; Biostat, Englewood, NJ, USA) software package to order, calculate, and compare effect sizes.

Model selection

Most meta-analyses are based on fixed- or random-effects models. Following the suggestion of Borenstein et al. (2010), since the included articles differed in temporal paradigm and stimulus type, and we expected the results to generalize to a broader population, a random-effects model was selected for the current meta-analysis.

Heterogeneity

The heterogeneity of the distribution of effect sizes was assessed by Q and \({I}^{2}\) tests. In the Q test, a statistically significant Q value (p < 0.1) shows heterogeneity in the distribution of effect sizes. In the \({I}^{2}\) test, \({I}^{2}\) is the proportion of total variation in the estimates of effects that is due to heterogeneity rather than to chance, and higher \({I}^{2}\) values indicate greater heterogeneity (Higgins, 2003). Furthermore, heterogeneity can be used to assess the rationality of model selection. Consistent with most meta-analyses, we regarded \({I}^{2}\) values of 25%, 50%, and 75% as low, moderate, and high heterogeneity, respectively, and \({I}^{2}\) > 25% is a necessary condition for random-effects models (Borenstein et al., 2009).
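As an illustration of the two heterogeneity statistics, the sketch below computes Cochran's Q from effect sizes and their variances and then derives \({I}^{2}\) as (Q - df)/Q. These are the standard textbook definitions and may differ in detail from what CMA does internally. The function names are hypothetical.

```python
from typing import Sequence


def cochran_q(effects: Sequence[float], variances: Sequence[float]) -> float:
    """Cochran's Q: weighted squared deviations from the inverse-variance pooled effect."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * g for w, g in zip(weights, effects)) / sum(weights)
    return sum(w * (g - pooled) ** 2 for w, g in zip(weights, effects))


def i_squared(q: float, df: int) -> float:
    """I^2 in percent: share of total variation attributed to heterogeneity."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Toy example with two effect sizes of equal precision.
q_toy = cochran_q([0.1, 0.3], [0.01, 0.01])
i2_toy = i_squared(q_toy, df=1)

# Applied to the overall values reported in the Results, Q(94) = 185.601.
i2_reported = i_squared(185.601, 94)
```

With Q = 185.601 and df = 94, the formula gives a value of about 49.35%, in line with the \({I}^{2}\) reported in the Results.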

Publication bias

Publication bias was identified and assessed by funnel plots (Sterne & Egger, 2001), Egger’s regression test (Egger et al., 1997), trim-and-fill (Duval & Tweedie, 2000), and classic fail-safe N (Begg & Mazumdar, 1994; Rosenthal, 1979). If no publication bias is present, the funnel plot of the effect sizes should appear symmetric. In Egger’s regression test, an intercept that does not differ significantly from zero (p > 0.05) indicates the absence of publication bias. The classic fail-safe N addresses the question of how many new studies averaging a null result would be required to bring the overall effect size to nonsignificance. If the classic fail-safe N is greater than 5k + 10, the publication bias is tolerable (Rosenthal, 1979). In Duval and Tweedie's trim-and-fill, the distribution of the effect sizes in included studies is trimmed or filled on the left or right to provide a symmetrical distribution, and a nonsignificant difference between the adjusted and observed effect sizes indicates that the impact of publication bias is not serious.
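The fail-safe N criterion can be sketched as follows. The tolerance level 5k + 10 is taken directly from the text. The fail-safe N itself uses the commonly cited Rosenthal formulation with a one-tailed alpha of 0.05, which is an assumption here because the text does not state which alpha was used. Function names are hypothetical.

```python
from statistics import NormalDist
from typing import Sequence


def fail_safe_tolerance(k: int) -> int:
    """Tolerance level 5k + 10 for k effect sizes."""
    return 5 * k + 10


def rosenthal_fail_safe_n(z_values: Sequence[float], alpha: float = 0.05) -> float:
    """Number of null studies needed to make the combined Z nonsignificant.

    Uses (sum of Z)^2 / z_crit^2 - k with a one-tailed critical value (assumed).
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)
    k = len(z_values)
    return sum(z_values) ** 2 / z_crit**2 - k


# Toy example: ten studies, each with Z = 2.0.
n_toy = rosenthal_fail_safe_n([2.0] * 10)

# Tolerance level for the k = 95 effect sizes in this meta-analysis.
tolerance = fail_safe_tolerance(95)
```

For k = 95 the tolerance level is 485, so any fail-safe N above that value would count as tolerable under this rule.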

Results

A total of 31 papers offering 95 effect sizes were included in the primary meta-analysis of the emotional temporal distortion. The total number of participants was 3,776 (Fig. 2).

Fig. 2 The forest plot of emotional temporal distortion

Overall effect size

The overall effect size was statistically significant, g = 0.200, 95% CI [0.134, 0.265], Z = 5.966, p < 0.001, showing that emotional stimuli were judged as lasting longer than neutral stimuli. Heterogeneity analysis showed moderate heterogeneity across the included studies, Q(94) = 185.601, p < 0.001, \({I}^{2}\) = 49.354%. Following the suggestion of Borenstein et al. (2010), the random-effects model was appropriate given this heterogeneity between studies.

Moderator of emotional temporal distortion

Valence

We first entered the valence value into the meta-regression. The result revealed that there was no significant moderating effect of valence, k = 66, β = -0.018, se = 0.017, Z = -1.020, p = 0.309, 95% CI [-0.051, 0.016].

Previous studies mostly manipulated valence and arousal into categorical variables (positive and high arousal, positive and low arousal, negative and high arousal, negative and low arousal), which made them challenging to satisfy the linear relationship. Therefore, we combined them into the variable named emotion type to conduct subgroup categorical analysis.

The subgroup categorical analysis showed that the moderating effect of emotion type was statistically significant, Q(3) = 17.570, p < 0.001. Further analysis showed that valence was a significant moderator. Specifically, pairwise comparisons revealed that the overall effect size of the negative high-arousal stimuli was significantly bigger than that of positive high-arousal stimuli, Q(1) = 10.371, p = 0.001. In addition, the overall effect size of the negative low-arousal stimuli tended to be bigger than that of positive low-arousal stimuli, although this difference was only marginally significant, Q(1) = 2.733, p = 0.098 (Table 2).

Table 2 The meta-analytic results of moderator in emotional temporal distortion

Arousal

Similarly, we explored the moderating effect of arousal through meta-regression and subgroup categorical analysis. The meta-regression analysis revealed that arousal was a significant moderator, k = 95, β = 0.083, se = 0.023, Z = 3.610, p < 0.001, 95% CI [0.038, 0.129], showing that time perception was prolonged with the increase of arousal. It accounted for 17% of the heterogeneity. Similar results were found when both valence and arousal were simultaneously included in the meta-regression. Arousal was a significant moderator, k = 66, β = 0.048, se = 0.024, Z = 2.030, p = 0.043, 95% CI [0.002, 0.095]; valence was not, k = 66, β = -0.014, se = 0.017, Z = −0.800, p = 0.425, 95% CI [-0.047, 0.020]. They accounted for 7% of the heterogeneity.
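The link between a meta-regression coefficient, its standard error, and the reported Z, p, and confidence interval can be illustrated with a Wald-type calculation. This is a sketch under the assumption that the reported statistics are normal-theory (Z-based) values. Because the published β and se are rounded, recomputed values can differ slightly in the last digit. The function name `wald_summary` is hypothetical.

```python
from statistics import NormalDist
from typing import Tuple


def wald_summary(beta: float, se: float,
                 level: float = 0.95) -> Tuple[float, float, Tuple[float, float]]:
    """Return (Z, two-tailed p, confidence interval) for a coefficient and its SE."""
    normal = NormalDist()
    z = beta / se
    p = 2 * (1 - normal.cdf(abs(z)))
    z_crit = normal.inv_cdf(1 - (1 - level) / 2)
    return z, p, (beta - z_crit * se, beta + z_crit * se)


# Arousal coefficient reported above: beta = 0.083, se = 0.023.
z_arousal, p_arousal, ci_arousal = wald_summary(0.083, 0.023)
```

For the arousal coefficient this gives Z of about 3.61, p below 0.001, and an interval of roughly [0.038, 0.128], close to the values reported above.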

The result of subgroup categorical analysis also supported the moderating effect of arousal. Though there was no significant difference between positive low-arousal stimuli and positive high-arousal stimuli, Q(1) = 0.069, p = 0.792, the overall effect size of the negative high-arousal stimuli was significantly bigger than that of negative low-arousal stimuli, Q(1) = 3.965, p = 0.046. This suggested that there was an interaction between valence and arousal.

Stimulus type

The moderating effect of stimulus type was statistically significant, Q(3) = 13.806, p = 0.003. Pairwise comparisons revealed that the effect size of word was significantly more negative than that of scenic picture, Q(1) = 8.949, p = 0.003, facial expression, Q(1) = 13.058, p < 0.001, and sound, Q(1) = 4.026, p = 0.045. There were no significant differences among scenic picture, facial expression, and sound (ps > 0.1). These results suggested that stimulus type modulated emotional temporal distortion.

Temporal paradigm

The moderating effect of temporal paradigm was statistically significant, Q(2) = 11.188, p = 0.004. Pairwise comparisons revealed that the overall effect size for estimation was significantly bigger than that for discrimination, Q(1) = 10.058, p = 0.002, and reproduction, Q(1) = 7.705, p = 0.006. There was no significant difference between the overall effect sizes for discrimination and reproduction, Q(1) = 0.579, p = 0.447. These results suggested that temporal paradigm modulated emotional temporal distortion.
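The p values attached to the between-group Q statistics follow from a chi-square distribution with the stated degrees of freedom. The sketch below implements the upper-tail probability for integer degrees of freedom using closed-form series and only the Python standard library. It is meant as an illustration of the test, not as the routine used by CMA, and the function name `chi2_sf` is hypothetical.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if x < 0 or df < 1:
        raise ValueError("x must be non-negative and df a positive integer")
    half = x / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{i=1}^{(df-1)/2} (x/2)^(i-1/2) / Gamma(i+1/2)
    total = 0.0
    term = math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (i + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


# Temporal paradigm: Q(2) = 11.188; discrimination versus reproduction: Q(1) = 0.579.
p_paradigm = chi2_sf(11.188, 2)
p_disc_vs_repro = chi2_sf(0.579, 1)
# Stimulus type: Q(3) = 13.806.
p_stimulus = chi2_sf(13.806, 3)
```

These calls give approximately 0.004, 0.447, and 0.003, which agree with the p values reported for the corresponding tests.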

Additional analyses of arousal and valence

Although the results showed moderating effects of stimulus type and temporal paradigm, another possibility is that these effects were caused by differences in valence and arousal. Therefore, we conducted a series of analyses of variance (ANOVAs) to clarify whether the moderating effects of stimulus type and temporal paradigm were independent of arousal and valence.

When valence, paradigm, and stimulus type were included in the same ANOVA, some levels lacked corresponding values. Therefore, we performed two separate ANOVAs to test the difference in arousal. Firstly, we analyzed arousal values using a 2 (Valence: positive, negative) × 3 (Paradigm: discrimination, estimation, and reproduction) two-way factorial ANOVA. The main effects of valence, F (1, 89) = 0.714, p = 0.400, η2 = 0.008, and temporal paradigm, F (2, 89) = 2.239, p = 0.113, η2 = 0.048, did not reach statistical significance. In addition, the valence × temporal paradigm interaction was not significant, F (2, 89) = 0.346, p = 0.708, η2 = 0.008. These results suggested that the arousal values were similar across the three paradigms and the two valences. Consequently, the moderating effects of valence and paradigm reported above should be independent of the arousal effect. Secondly, we conducted a one-way ANOVA on arousal values with stimulus type (facial picture, scenic picture, sound, and word) as the factor. The result revealed a significant main effect of stimulus type, F (3, 91) = 6.775, p < 0.001, η2 = 0.183. Post hoc comparisons with the LSD test showed that the arousal of words was lower than that of facial pictures (p = 0.001), scenic pictures (p = 0.011), and sounds (p < 0.001). The arousal of sounds was marginally higher than that of facial pictures (p = 0.062) and significantly higher than that of scenic pictures (p = 0.006). The arousal of facial pictures was similar to that of scenic pictures (p = 0.139).

Similarly, we conducted two ANOVAs to test the difference in valence. Firstly, we analyzed valence values using a two-way factorial ANOVA in a 2 (Arousal: high, low) × 3 (Paradigm: discrimination, estimation, and reproduction). The main effect of arousal, F (1, 61) = 0.041, p = 0.841, η2 = 0.001, and temporal paradigm, F (2, 61) = 0.442, p = 0.645, η2 = 0.014, did not reach statistical significance. The interaction of arousal × temporal paradigm was not significant, F (1, 61) = 1.413, p = 0.239, η2 = 0.023. These results suggested the valence value was similar in three paradigms and two levels of arousal. Consequently, the moderating effects of arousal and paradigm should be independent of the valence effect. Secondly, we conducted a one-way ANOVA with valence on stimulus type (facial picture, scenic picture, sound, and word). The main effect of stimulus type was not significant, F (3,62) = 0.590, p = 0.624, η2 = 0.028. The result suggested that the valence value was similar across all four categories of stimuli. Consequently, the moderating effect of stimulus type should be independent of the valence effect.

Publication bias

Publication bias was identified and assessed via funnel plots, Egger’s regression test, classic fail-safe N, and trim-and-fill. The funnel plot was asymmetrical (Fig. 3). In addition, Egger’s regression test indicated a possible publication bias, t (93) = 5.364, p < 0.001. Given this indication of publication bias, we further assessed its impact with classic fail-safe N and trim-and-fill. According to classic fail-safe N, the number of missing studies that would bring the overall effect to nonsignificance was 1,708, which was greater than the tolerance level of 5k + 10 (485, k = 95). The trim-and-fill showed that 20 effect sizes were missing on the left of the overall effect size. When the 20 effect sizes were filled, the overall effect size was reduced to g = 0.084, 95% CI [0.009, 0.158]. However, the adjusted effect size was not significantly different from the observed overall effect size, g = 0.200, 95% CI [0.134, 0.265]. The results showed that although there was a publication bias, it did not affect the conclusions.

Fig. 3 Funnel plots of publication bias analysis

Discussion

As a subjective feeling, time perception is flexible and affected by many factors. For the past 25 years, a growing body of empirical research has increased our knowledge of how emotion affects time perception. Although increasing empirical evidence has proved that emotions distort time perception and usually result in overestimation, it is controversial how valence (positive/negative), arousal (high/low), stimulus type (scenic picture/facial expression/word/sound), and temporal paradigm (reproduction/estimation/discrimination) modulate the effect of emotion on time perception. Therefore, the current study used meta-analysis to quantify existing evidence, aiming to clarify the effects of these moderators on emotional temporal distortion.

Valence

The current meta-analysis suggests that valence is a moderator of emotional temporal distortion. The subgroup categorical analysis showed that the effect size of negative valence was greater than that of positive valence under high arousal, with a marginal difference in the same direction under low arousal (p = 0.098). The meta-regression did not detect this trend, possibly because previous studies have generally treated valence as a categorical variable, so that valence values may not satisfy the assumption of a linear relationship.

Though an authoritative classification is to bisect emotional stimuli symmetrically into positive and negative categories (Lang et al., 1998; Russell, 1980), the effects of positive and negative stimuli on us are rarely symmetrical (Yuan et al., 2019). The current finding of valence on time perception could be considered as negativity bias, a phenomenon in which the response to negative stimuli is more intense than to positive stimuli, and it is also found in a variety of cognitive processes such as memory, emotional response, and decision-making (Kress & Aue, 2017; Lam et al., 2020). A common explanation is that negativity bias may reflect an evolutionarily based activation of the aversive motivational system. The brain allocates more resources to negative emotional processing, which helps to detect environmental danger and mobilize defensive behavior, such as escaping from danger and maintaining vigilance. Thus, it is conducive to survival and environmental adaptation (Yuan et al., 2019).

In addition, subgroup categorical analyses found that there was an interaction between valence and arousal. Therefore, the moderating effect of valence on emotional temporal distortion should take into account arousal. This is discussed in more detail in the next section.

Arousal

The moderating effect of arousal was observed by both meta-regression and subgroup categorical analyses, that is, higher arousal leads to greater temporal overestimation.

Within the time perception literature, arousal has been considered the key mechanism determining perceived duration. Particularly for clock-like models (i.e., internal clock model, attention gate model, and scalar timing model), arousal is conceptualized as any manipulation that changes the speed of the clock (Gibbon et al., 1984; Treisman, 1963; Zakay & Block, 1997), with an increase in arousal equivalent to an increase in clock speed. Physiological and pharmacological manipulations in both animals and humans have produced changes in arousal that covary with changes in time perception (Cheng et al., 2006; Meck, 1983; Mella et al., 2011). Therefore, increased arousal generally results in increased temporal distortion.
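The clock-speed account described above can be illustrated with a minimal pacemaker-accumulator sketch. This is a toy illustration rather than a model fitted in the current study: the function name, the baseline rate of 50 ticks per second, and the multiplier of 1.2 are all hypothetical values chosen only to show the direction of the effect.

```python
def perceived_duration_ms(physical_ms, clock_multiplier=1.0, baseline_rate_hz=50.0):
    """Toy pacemaker-accumulator model (all parameter values are hypothetical).

    The pacemaker emits ticks at baseline_rate_hz * clock_multiplier.
    Accumulated ticks are then read out against the baseline rate, so a
    faster clock (multiplier > 1) yields a longer perceived duration.
    """
    ticks = baseline_rate_hz * clock_multiplier * physical_ms / 1000.0
    return ticks / baseline_rate_hz * 1000.0


# Same physical interval, neutral versus aroused clock speed.
neutral = perceived_duration_ms(1000.0)
aroused = perceived_duration_ms(1000.0, clock_multiplier=1.2)
print(neutral, aroused)
```

In this sketch the aroused condition accumulates more ticks over the same physical interval, which is read out as a longer duration, that is, as temporal overestimation.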

However, as mentioned above, the current meta-analysis found that there may be an interaction between valence and arousal, that is, negative valence boosts the moderating effect of arousal on emotional temporal distortion. Specifically, under positive valence, there is no significant difference between the effect sizes of high and low arousal; however, under negative valence, the effect size of high arousal is greater than that of low arousal. An additional ANOVA on arousal showed that the arousal levels of negative and positive stimuli were matched. These results suggest that the moderating effect of arousal on emotional temporal distortion is affected by valence. In other words, valence and arousal, as basic dimensions of emotion, jointly modulate emotional temporal distortion.

Recently, an adaptive perspective has emerged in time perception research. Emotional temporal distortion has been thought to allow individuals to adaptively respond to changes in the environment (e.g., Droit-Volet & Gil, 2009; Harrington et al., 2011; Lake et al., 2016; Matthews & Meck, 2014). Specifically, temporal distortion may allow individuals to have more subjective time to approach, attack, or flee. Although the bipolar structure theory of emotion posits that both positive and negative valence have essential associations with adaptive survival (Lang et al., 1998; Russell, 1980), the former is generally related to reward pursuit, and the latter is usually associated with threat avoidance (Cacioppo & Berntson, 1994; Cacioppo & Gardner, 1999). However, increased arousal is always induced by a more intense situation. Consequently, it evolutionarily boosts humans’ adaptive responses more strongly in the defensive than in the appetitive motivational system, as it is more important to avoid a threatening event than to approach a rewarding target (Peeters & Czapinski, 1990; Taylor, 1991). Therefore, elevated arousal is linked with the prioritized processing of negative over positive stimuli (Schupp et al., 2007), and eventually leads to a larger emotional temporal distortion.

Stimulus type

The moderating effect of the stimulus type is significant, indicating emotional temporal distortion varies with stimulus type. Specifically, the subgroup categorical analysis showed that the facial expression, scenic picture, and sound led to significant temporal overestimation, while the word did not. These results suggest that word may be weaker in inducing emotional temporal distortion relative to facial expression, scenic picture, and sound.

A notable difference between facial expression, scenic picture, sound, and word, the most common stimulus types through which people receive emotional information, is that emotional word is generally associated with a lower level of emotional arousal than other emotional material (Bayer & Schacht, 2014; Hinojosa et al., 2009; Liu et al., 2010), which was also demonstrated by the additional ANOVA on arousal value in the current study. Since the increase in arousal has been considered the key to emotional temporal distortion (Gibbon et al., 1984; Lake et al., 2016; Zakay & Block, 1997), it is reasonable to observe that word is not an effective stimulus to induce emotional temporal distortion.

However, the difference in arousal between stimulus types could not fully explain the moderating effect of stimulus type, because the additional ANOVA on arousal value showed that the arousal level of sound was significantly higher than that of other stimuli, but the emotional temporal distortion of sound was not significantly different from that of facial expression and scenic picture. One possible explanation is evolutionary. Based on sensory channels, facial expression and scenic picture can be classified as visual stimuli, while sound can be classified as an auditory stimulus. Although vision and hearing are the two main channels for humans to receive emotional information (Royet et al., 2000), humans have evolved into diurnal animals and therefore rely more on vision (Paulmann & Pell, 2011). Due to the adaptability shaped by evolution, the functional mobilization of physiological response (i.e., arousal) to visual stimuli would be stronger (Delaney-Busch et al., 2016). Therefore, it is reasonable to observe that although sound has a higher degree of arousal, the emotional temporal distortion of sound is not significantly greater than that of facial expression and scenic picture.

Temporal paradigm

A significant moderating effect of the temporal paradigm was observed, revealing that both estimation and discrimination lead to significant emotional temporal distortion; further pairwise comparisons revealed that the emotional temporal distortion measured by estimation is significantly larger than that measured by discrimination and reproduction, suggesting that estimation is likely to be the most sensitive paradigm.

Differences in response format between paradigms may contribute substantially to the moderating effect of the temporal paradigm, although estimation, discrimination, and reproduction all require participants to encode temporal information. Estimation requires participants to verbally report their estimate of the duration of a stimulus (usually in ms), while discrimination requires participants only to indicate how the duration of a stimulus compares with a standard (i.e., longer or shorter). In estimation, participants report specific values. In discrimination, however, participants convert numerical values into "long" or "short" responses. For example, in estimation, durations of 400 ms and 800 ms would be reported as 400 and 800, respectively; in discrimination, both would be converted to "less than 1,000". This conversion loses information relative to estimation, and this loss leads to a smaller effect for discrimination than for estimation. In reproduction, participants are required to reproduce the duration of the emotional stimulus through a neutral stimulus (e.g., pressing a key until the elapsed time is subjectively equal to the duration of the emotional stimulus), but emotion-induced arousal gradually decreases during reproduction. Since increased arousal is associated with increased temporal distortion (Droit-Volet & Meck, 2007), reproduction is likely to weaken the emotional effect.
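The information-loss argument above can be made concrete with a small single-trial sketch. It assumes a hypothetical 1,000 ms standard (matching the "less than 1,000" example) and a hypothetical 10% emotional lengthening; the function names are illustrative and do not correspond to any procedure in the included studies.

```python
def estimation_response(perceived_ms):
    """Verbal estimation: the participant reports a specific value in ms."""
    return round(perceived_ms)


def discrimination_response(perceived_ms, standard_ms=1000.0):
    """Discrimination: the participant only reports longer or shorter
    than the standard (standard_ms is a hypothetical value)."""
    return "longer" if perceived_ms > standard_ms else "shorter"


# The example from the text: 400 ms and 800 ms.
perceived = [400.0, 800.0]
estimates = [estimation_response(p) for p in perceived]
judgments = [discrimination_response(p) for p in perceived]

# A hypothetical 10% emotional lengthening of an 800 ms stimulus.
neutral_ms, emotional_ms = 800.0, 880.0
estimation_shift = estimation_response(emotional_ms) - estimation_response(neutral_ms)
discrimination_changed = (
    discrimination_response(emotional_ms) != discrimination_response(neutral_ms)
)
print(estimates, judgments, estimation_shift, discrimination_changed)
```

On a single trial, the estimation response carries the full size of the shift, whereas the binary response changes only when the perceived duration crosses the standard. Actual discrimination tasks aggregate many such binary responses, so this sketch illustrates the direction of the argument rather than its magnitude.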

Nevertheless, in previous empirical studies, emotional temporal distortion has been observed using estimation (Noulhiane et al., 2007; Ogden et al., 2021), discrimination (Doi & Shinohara, 2009; Droit-Volet, 2016; Effron et al., 2006; Yuan et al., 2020), and even reproduction (Angrilli et al., 1997; Noulhiane et al., 2007; Yin et al., 2021a). Since meta-analysis can only identify trends across numerous small-sample studies based on effect sizes, the moderating effect of the temporal paradigm found in the current meta-analysis reflects a trend to some extent, rather than a final conclusion about the efficacy of temporal paradigms.

Limitations

Several important issues warrant consideration in the interpretation of the current results. Firstly, several effect sizes in the current meta-analysis were derived from the p value in combination with the sample size (e.g., Huang et al., 2018b; Nicol et al., 2013). However, some of these studies reported only a significance threshold (i.e., p < 0.05, 0.01, or 0.001); to avoid overestimation, their p values were assumed to be 0.05, 0.01, or 0.001, respectively. This may slightly underestimate the effect size. Secondly, the included studies used stimuli from different material systems. These material systems used different Likert scales (e.g., five-point, seven-point, or nine-point), which differ in how validly they represent raters’ power of discrimination (Matell & Jacoby, 1972). In this regard, the approach of converting all the rating data uniformly to the nine-point Likert scale should be considered tentative and treated with caution. Thirdly, because a small number of studies may increase the risk of Type I and random errors, the current meta-analysis did not include video studies, for which there were insufficient eligible studies; similarly, it did not include studies using the time production paradigm because of their small number and the particularity of the paradigm. Specifically, the production paradigm first presents the neutral stimulus and then uses the emotional stimulus to produce the time interval. Its results therefore run in the opposite direction to those of the other paradigms: a shorter produced interval means that time is overestimated. Thus, future studies may pay more attention to both video stimuli and time production so that enough studies accumulate for meta-analytic synthesis. Lastly, although meta-analysis uses statistical methods to identify trends across numerous small-sample studies based on effect sizes, it cannot replace a large-sample study that directly provides empirical evidence. Therefore, it will still be valuable to use a large sample to verify the current findings.
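The first limitation can be illustrated with a short sketch of why substituting the threshold for an unreported exact p value is conservative. The conversion used here (two-tailed p to a standard normal z, then r = z / sqrt(N), then r to d) is one common textbook approach and is an assumption on our part; it is not necessarily the computation performed by the software used in this study, and the sample size of 25 and the exact p of 0.02 are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def p_to_r(p_two_tailed, n):
    """Approximate effect size r from a two-tailed p value and sample size n,
    using r = z / sqrt(n) (an assumed, commonly used conversion)."""
    z = NormalDist().inv_cdf(1.0 - p_two_tailed / 2.0)
    return z / sqrt(n)


def r_to_d(r):
    """Convert a correlation-type effect size r to a standardized difference d."""
    return 2.0 * r / sqrt(1.0 - r * r)


# A study reporting only "p < .05" is entered as p = .05 (hypothetical n = 25).
r_bound = p_to_r(0.05, 25)
# If the unreported exact value had been p = .02, the effect would be larger.
r_exact = p_to_r(0.02, 25)
print(round(r_bound, 3), round(r_exact, 3), round(r_to_d(r_bound), 3))
```

Because a smaller p value maps to a larger z, entering the threshold itself gives the smallest effect size compatible with the reported significance level, which is why this substitution can only underestimate the true effect.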

Conclusion

The current study used meta-analysis to clarify the moderating effects of valence (positive/negative), arousal (high/low), stimulus type (scenic picture/facial expression/word/sound), and temporal paradigm (discrimination/estimation/reproduction) on emotional temporal distortion. The results revealed that negative valence tends to result in greater overestimation than positive valence; increasing arousal leads to increasing temporal dilation; scenic picture, facial expression, and sound are more effective in inducing overestimation than word; and both discrimination and estimation are more effective than reproduction in measuring emotional temporal distortion, with estimation likely to be the most sensitive.

Transparency and openness

We reported how we searched, excluded, and coded literature. The complete set of data will be publicly stored in APA's Repository upon publication. The Comprehensive Meta-Analysis (Version 3; CMA; Biostat, Englewood, NJ, USA) software package was used to order, calculate, and compare effect sizes. The meta-analysis was not pre-registered.