Introduction

Empathy for pain is a multidimensional psychological process that allows people to recognize and share others’ feelings of pain (Cui et al., 2017a; Li et al., 2019; Peng et al., 2021; Ren et al., 2020; Zhang et al., 2022). Due to its specific cognitive and neural mechanisms, researchers have paid significant attention to empathy for pain, making it a popular topic in empathy studies. Researchers often use experiments with stimuli representing other people’s pain to explore the potential mechanisms behind empathy for pain.

Over the last few decades, the wealth of experimental research regarding people’s empathy for pain has been supplemented by several resources for testing. When individuals observe, hear, or imagine other people in painful situations, they can experience their affective and cognitive states in their minds (Penga et al., 2019). Typically, previous studies of empathy for pain have used visual or auditory stimuli.

According to a meta-analysis of 40 studies on ERP investigations of pain empathy (Coll, 2018), 90% (36 studies) of visual studies of empathy for pain used pictures (33 studies, 82.5%) or short videos (three studies, 7.5%) depicting human limbs (such as hands or feet) exposed to harmful stimuli (for example, a hand cut by a knife or foot penetrated by a syringe). These images were placed alongside images of limbs subject to non-harmful stimuli (for example, a hand using a knife to cut vegetables or a foot being touched by a Q-tip). These images have generally been similar to those initially used by Jackson et al. (Jackson et al., 2005). Compared with non-painful limb stimuli, painful limb stimuli typically elicit shorter reaction times (RTs) (Fabi & Leuthold, 2018; Wang et al., 2016) and less pleasure (Gonzalez-Liencres et al., 2016). Painful stimuli also activate the anterior cingulate cortex and several other areas (Gu & Han, 2007) and provoke larger P3 and long-latency positive component event-related potential (ERPs) amplitudes (Cheng et al., 2014; Cui et al., 2017b; Fan & Han, 2008; Meng et al., 2013; Meng et al., 2019b).

Approximately 7.5% (three of 40 studies, according to Coll, 2018) of visual studies on empathy for pain used images depicting faces of individuals pricked by a needle (painful faces) or gently touched by a Q-tip (non-painful faces). Studies revealed that painful faces induced longer RTs (Li et al., 2020), larger N2 and P3 amplitudes (Li et al., 2020; Meng et al., 2020a), and more activity in the anterior cingulate cortex (Han et al., 2009) than non-painful faces did.

Most studies of empathy for pain have focused on the visual modality, but recognizing pain from others’ voices is the auditory equivalent of visual pain recognition. The Montreal Affective Voices database (Belin et al., 2008) included ten nonverbal outcries of voices in pain and ten neutral voices recorded by ten actors (five women). However, the duration of these voices varied from 432 to 1528 ms. Previous studies manipulated these voices to achieve a duration of 700 ms and a mean intensity of 70 dB (Liu et al., 2019; Meng et al., 2019a, 2019b; Meng, Li, & Shen, 2020b). This allowed them to explore the behavioral and neural mechanisms behind empathy for auditory pain. Painful voices elicit higher accuracies (ACCs) and shorter RTs (Meng et al., 2019b), greater judgments of the intensity of pain, more negative emotional reactions, and larger P2 amplitudes (Meng et al., 2019a) than neutral voices.

However, the results of studies using different stimuli of others’ pain have not been consistent (Josiane et al., 2019). This may be because some of these studies did not control the luminance, contrast, and color of the painful and non-painful visual stimuli or the duration and intensity of the painful and non-painful auditory stimuli. In addition, some of these stimuli were not evaluated well. They were not selected based on basic emotional dimensions, such as pain intensity, affective valence, arousal, and dominance. Moreover, although understanding others’ pain in real life involves interpreting various cues, including postures and actions, to our knowledge, only a few studies (e.g., Li et al., 2022) have used human postures and actions to investigate empathy for pain. Using the whole body may have better ecological validity than only using part of the body.

The present study aimed to make available a richly varied, well-validated, and open-access stimuli database for researchers examining empathy for pain. This is the Empathy for Pain Stimuli System (EPSS). The EPSS consists of five sub-databases: (1) the Empathy for Limb Pain Picture Database (EPSS-Limb), (2) the Empathy for Face Pain Picture Database (EPSS-Face), (3) the Empathy for Voice Pain Database (EPSS-Voice), (4) the Empathy for Action Pain Video Database (EPSS-Action_Video), (5) the Empathy for Action Pain Picture Database (EPSS-Action_Picture). Each database consists of various painful stimuli and corresponding non-painful stimuli. All the stimuli were recorded by actors or revised from stimuli previously validated in other published studies. The five sub-databases were validated using ratings of the pain intensity, affective valence, arousal, and dominance of the painful and non-painful stimuli.

We hypothesized that the painful and non-painful stimuli in the EPSS would evoke the intended emotional feeling in participants. We tested whether different genders experienced different levels of perception and emotion in the EPSS. We intend to provide a well-validated and freely accessible stimuli database for research into the empathy for pain.

Empathy for Limb Pain Picture Database (EPSS-Limb)

Methods

Development of stimuli

Actors

Four actors (two females) at universities in Chongqing, China, aged 24–26 (mean = 25.34, standard deviation (SD) = 1.34), participated in the picture photograph sessions. All actors signed informed consent forms for the photo process in accordance with the Declaration of Helsinki.

Acting

In the photograph session, similar to the pictures used in our previously published studies (Meng et al., 2012; Meng et al., 2013; Meng et al., 2019b), the actors were instructed to produce painful and non-painful actions with their limbs (hands, feet, and forearms). All the pictures depicted familiar events that may happen in everyday life. Examples of painful limb pictures (see the left panel of Fig. 1) included a hand pricked by a needle and a foot penetrated by a syringe. The non-painful limb pictures corresponded to the painful limb pictures but without any nociceptive component. Examples of non-painful limb pictures included a hand using a needle to sew clothes and a foot touched by a pencil (see the right panel of Fig. 1). In the session for stimuli development, 213 original pictures exhibiting people’s limbs in painful and non-painful situations were taken.

Fig. 1
figure 1

Examples of painful (left panel) and non-painful (right panel) limb pictures in the EPSS-Limb

Photographing and editing

A Sony camera (Sony Group Corporation) was used to take painful and non-painful limb pictures at a distance of approximately 1 m. Adobe Photoshop CS6 software (Adobe Systems Incorporated, San Jose, CA, USA) was used to edit the images. The luminance, contrast, and color of the painful and non-painful limb pictures were matched. Each picture was 9 × 6.76 cm (width × height) and 100 pixels per inch.

Stimulus validation

Participants

The validation procedure included 70 paid volunteers (35 women) from Chongqing Normal University, Chongqing, China. They were 18–27 years of age (mean = 23.01, SD = 1.95) and right-handed, with normal or corrected-to-normal vision. None of them had previously been diagnosed with a psychiatric, medical, or neurological disorder, nor did they have any painful symptoms from severe somatic diseases. Before validation, all the participants received a description of the stimulus validation procedure and signed an informed consent form. All the procedures were in accordance with the Declaration of Helsinki and approved by the Research Ethics Committee of Chongqing Normal University. The procedures were performed following ethical guidelines and regulations.

Procedure

The participants were seated in a quiet room with an ambient temperature of about 26 °C. The procedure was conducted using the E-Prime (3.0) program (Psychology Software Tools, Pittsburgh, PA, USA). Prior to the formal procedure, each participant participated in a training session to become familiar with the process (details see Appendix).

Each participant was instructed to evaluate the stimuli using four dimensions of rating scales: ① Pain intensity: Judge the perceived pain intensities of the models in the stimuli (1 = no sensation, 4 = pain threshold, 9 = most intense pain imaginable); ② Affective valence: Judge the perceived pleasure of the models in the stimuli (1 = very unhappy, 9 = very happy); ③ Arousal: Judge the perceived level of arousal of the models in the stimuli (1 = extremely peaceful, 9 = extremely excited); ④ Dominance: Judge the perceived level of control (1 = extremely out of control, 9 = extremely in control).

As illustrated in Fig. 2, the participants were shown the stimuli on a computer screen. Then, a rating board with four dimensions of rating scales was displayed on the other computer screen. The participants were instructed to respond as accurately as possible to the stimuli by pressing a specific key (1 to 9) corresponding to the four dimensions of the rating scales. The order of the four dimensions of the rating scales was counterbalanced across all the participants to control for the order’s possible effects. The participants rated the stimuli on one dimension of the rating scale and moved to the next dimension. After completing the rating of one stimulus, the participants were instructed to click the mouse to rate the next stimulus. The stimuli were presented in a pseudorandomized order in 4 to 8 blocks. Each block lasted for < 10 min. The participants could take breaks at will between blocks to reduce fatigue and help them maintain attention on the rating scales. The rating was completed for each participant with several sessions of 1 to 2-h duration, spanning 1–3 days within a week.

Fig. 2
figure 2

Procedure settings of stimuli validation of EPSS-Limb

Selection

The stimuli in the EPSS-Limb were selected using two criteria. First, the mean pain intensity rating scores for the painful stimuli had to be > 4 points, and the scores for the non-painful stimuli had to be < 4 points (4 is the pain threshold based on the rating scales of pain intensity). Second, rating scores ±3 SD away from the average for dimensions were excluded.

Based on a previous study (Yao et al., 2017), two different criteria were used to exclude participants. First, participants using the same response (e.g., 1) for more than 85% of the total responses for each dimension were excluded. Second, participants’ scores that were ±2.5 SD away from the dimension average were excluded.

Statistical analysis

The statistical analysis comprised four parts. First, the characteristics of the sub-database (i.e., the descriptive statistics of the painful and non-painful stimuli made by total, female, and male participants) of the four dimensions (pain intensity, affective valence, arousal, and dominance) were examined. Second, to ensure the reliability of the measures, the internal consistency of participant assessments was estimated by calculating split-half reliability scores with Spearman–Brown formula reliability scores (the participants were split into two subgroups of equal size according to a random procedure). Third, to explore the relationships between the dimensions, Pearson’s correlation coefficient was calculated, and interactive scatterplots were provided, where researchers could check the space location of each stimulus by considering the relationships of the four dimensions. Finally, to test dimensional ratings for painful and non-painful stimuli concerning the genders of participants and actors, the potential effect of stimuli type (painful vs. non-painful) × participant gender (female vs. male) × actor gender (female vs. male) on the participants’ assessments were calculated using repeated-measures analyses of variance (ANOVAs) in SPSS Statistics (IBM Corporation, Armonk, NY, USA).

Results

Based on the selection criteria of the stimuli, 136 pictures (63.9% of the total stimuli) were selected for the EPSS-Limb, including 68 painful and 68 non-painful limb pictures. Based on the selection criteria of the participants, eight participants (11.4% of the total, four women) were excluded because of responses that formed a pattern with almost no variation or ±2.5 SD away from their group’s average. Thus, rating scores from 62 participants (31 women) aged 19–27 (mean = 23.24, SD = 1.89) were calculated.

Characteristics of the EPSS-Limb

The EPSS-Limb provides the number (N), mean, and SD of the rating scores for the painful and non-painful limb pictures across the four dimensions made by the total, female, and male participants (Table 1).

Table 1 Descriptive statistics for the EPSS-Limb (Mean ± SD)

Reliability of the measures of the EPSS-Limb

The internal consistency of participant assessments of the EPSS-Limb was estimated by calculating split-half reliability scores. The Spearman–Brown formula reliability scores were particularly high in four-dimensional ratings (pain intensity: r = 0.96; affective valence: r = 0.96; arousal: r = 0.99; dominance: r = 0.99). Therefore, the dimensional rating scores of participant assessments might be considered highly homogeneous in the EPSS-Limb.

Relationships among dimensions of the EPSS-Limb

Figure 3 shows the relationships between the four dimensions of mean rating scores of painful and non-painful limb pictures for the EPSS-Limb. For the painful limb pictures, significant negative correlations were found in pain intensity × affective valence, pain intensity × dominance, arousal × affective valence, and arousal × dominance. Conversely, significant positive correlations were found in pain intensity × arousal and dominance × affective valence (all ps < 0.05). Concerning the non-painful limb pictures, significant negative correlations were found in pain intensity × dominance and arousal × dominance. Conversely, significant positive correlations were found in dominance × affective valence (all ps < 0.05).

Fig. 3
figure 3

Scatterplot representing the distribution of the mean ratings in four dimensions provided in each painful (red) and non-painful (blue) limb picture of the EPSS-Limb

Statistical analysis of the EPSS-Limb

Table 2 shows the summary of the statistical analysis of stimulus type × participant gender × actor gender in the participant assessments for four dimensions of the EPSS-Limb.

Table 2 Summary of the effect of stimulus type × participant gender × actor gender of the EPSS-Limb

Pain intensity

Pain intensity rating scores were modulated by stimulus type [F(1,60) = 544.73, p < 0.001, ηp2 = 0.90] and actor gender [F(1,60) = 50.30, p < 0.001, ηp2 = 0.46]. Painful limb pictures (5.78 ± 0.13) were more painful than non-painful limb pictures (1.81 ± 0.12). Pictures with male actors (3.91 ± 0.09) were deemed more painful than those of female actors (3.68 ± 0.09). Pain intensity rating scores were significantly influenced by the interaction of stimulus type × actor gender [F(1,60) = 9.35, p = 0.003, ηp2 = 0.14]. The differences between male and female actors tended to be smaller for painful limb pictures (male actor: 5.85 ± 0.12, female actor: 5.71 ± 0.13) than non-painful limb pictures (male actor: 1.97 ± 0.13, female actor: 1.65 ± 0.11). Pain intensity ratings were also influenced by the interaction of participant gender × actor gender [F(1,60) = 8.82, p = 0.004, ηp2 = 0.13]. The differences in rating scores between male and female actors made by female participants were larger than male participants (male participants: 0.13 ± 0.19, female participants: 0.32 ± 0.30; p = 0.004).

Affective valence

Affective valence rating scores were moderated by the main effect of stimulus type [F(1,60) = 20.75, p < 0.001, ηp2 = 0.26]. Painful limb pictures (3.39 ± 0.18) induced more negative feelings than non-painful limb pictures (4.48 ± 0.14). The interaction of stimulus type × participant gender [F(1,60) = 12.70, p = 0.001, ηp2 = 0.18] indicated that female participants rated the non-painful limb pictures more negatively than male participants (male participants: 5.00 ± 0.20, female participants: 3.97 ± 0.20; p < 0.001), while no difference was found for the painful limb pictures (male participants: 3.05 ± 0.25, female participants: 3.73 ± 0.25; p = 0.062). The interaction of participant gender × actor gender [F(1,60) = 11.83, p = 0.001, ηp2 = 0.17] indicated that for female participants, the rating scores for female actors were more negative than male actors (male actor: 3.93 ± 0.16, female actor: 3.76 ± 0.15; p = 0.005), while no difference was found for male participants (male actor: 3.97 ± 0.16, female actor: 4.08 ± 0.15; p = 0.058). In addition, the interaction of stimulus type × actor gender × participant gender [F(1,60) = 11.91, p = 0.001, ηp2 = 0.17] revealed that for non-painful limb pictures, male participants rated male actors more negatively than female actors (male actors: 4.85 ± 0.18, female actors: 5.15 ± 0.22; p = 0.003), while female participants rated female actors more negatively than male actors (male actors: 4.10 ± 0.18, female actors: 3.84 ± 0.22; p = 0.009). However, no such difference was found for the painful limb pictures (all ps > 0.05).

Arousal

Arousal rating scores were moderated by the main effects of stimulus type [F(1,60) = 205.25, p < 0.001, ηp2 = 0.77] and actor gender [F(1,60) = 54.94, p < 0.001, ηp2 = 0.48]. Painful limb pictures (5.53 ± 0.20) were more exciting than non-painful limb pictures (2.39 ± 0.16). Pictures with male actors (4.09 ± 0.15) were more exciting than female actors (3.84 ± 0.14). Arousal was also modulated by the interaction of actor gender × participant gender [F(1,60) = 6.98, p = 0.010, ηp2 = 0.10]. The differences in rating scores between male and female actors made by female participants were larger than male participants (male participants: 0.16 ± 0.23, female participants: 0.34 ± 0.30; p = 0.011).

Dominance

Dominance rating scores were moderated by the main effects of stimulus type [F(1,60) = 114.18, p < 0.001, ηp2 = 0.66] and actor gender [F(1,60) = 22.88, p < 0.001, ηp2 = 0.28]. Painful limb pictures (4.85 ± 0.25) were rated as more out of control than non-painful stimuli (7.69 ± 0.20). Pictures with male actors (6.17 ± 0.18) were rated as more out of control than female actors (6.36 ± 0.18). Dominance was also influenced by the interaction of actor gender × participant gender [F(1,60) = 7.82, p = 0.007, ηp2 = 0.16]. Female participants rated pictures with female actors as more in-control than male actors (male actors: 6.09 ± 0.26, female actors: 6.39 ± 0.26; p < 0.001), while no difference was found for male participants (male actor: 6.26 ± 0.26, female actor: 6.33 ± 0.26; p = 0.165).

Empathy for Face Pain Picture Database (EPSS-Face)

Method

Development of stimuli

The EPSS-Face database contains 160 digital pictures of faces in painful or non-painful situations. Some of the stimuli have been used in previously published studies (Li et al., 2020; Meng et al., 2020a; Yang et al., 2022). These stimuli were revised from a picture database that had been previously validated, in which the images of the faces were morphed pictures with given black backgrounds and shown in grayscale (Hu et al., 2018; Yang et al., 2015). The EPSS-Face comprises pictures of 80 painful faces (40 female and 40 male) and 80 non-painful faces (40 female and 40 male). The painful face pictures depict pain by penetrating the model’s cheek with a needle (see the left panel of Fig. 4), and the non-painful face pictures show the model’s face being touched with a Q-tip (see the right panel of Fig. 4). The pictures were edited using Adobe Photoshop CS6. The luminance, contrast, and color of the painful and non-painful pictures were matched. Each picture was 6.88 cm × 7.94 cm (width × height), with 96 pixels per inch.

Fig. 4
figure 4

Examples of painful (left panel) and non-painful (right panel) face pictures in the EPSS-Face

Stimulus validation

Participants

The validation procedure included 70 paid volunteers (35 women) from Chongqing Normal University, Chongqing, China. They were aged 18–30 (mean = 22.29, SD = 2.53) and right-handed, with normal or corrected-to-normal vision. The other inclusion criteria were similar to those for the EPSS-Limb.

Procedures, selection criteria, and statistical analysis

The evaluation procedure, selection criteria of stimuli and participants, and statistical analysis methods for the EPSS-Face were similar to those for the EPSS-Limb.

Results

Based on the selection criteria, all the pictures were selected for the EPSS-Face, including 80 painful and 80 non-painful face pictures. In addition, the participants’ rating data were included for further statistical analyses. Examples of painful and non-painful face pictures are shown in Fig. 4.

Characteristics of the EPSS-Face

The EPSS-Face provides the descriptive statistics results of the rating scores for the painful and non-painful face pictures across the four dimensions made by the total, female, and male participants (Table 3).

Table 3 Descriptive statistics for the EPSS-Face (Mean ± SD)

Reliability of the measures of the EPSS-Face

The internal consistency of participant assessments of the EPSS-Face was estimated by calculating split-half reliability scores. The Spearman–Brown formula reliability scores were particularly high in four-dimensional ratings (pain intensity: r = 0.98; affective valence: r = 0.97; arousal: r = 0.99; dominance: r = 0.99). Therefore, dimensional ratings of participant assessments might be considered highly homogeneous in the EPSS-Face.

Relationships between the dimensions of the EPSS-Face

Figure 5 shows the relationships between the four dimensions of mean rating scores of painful and non-painful face pictures for the EPSS-Face. Concerning the painful face pictures, significant negative correlations were found in pain intensity × affective valence, pain intensity × dominance, and arousal × dominance. In contrast, significant positive correlations were found in pain intensity × arousal and dominance × affective valence (all ps < 0.05). For the non-painful face pictures, significant negative correlations were found in pain intensity × affective valence, and pain intensity × dominance. In addition, significant positive correlations were found in affective valence × arousal and dominance × affective valence (all ps < 0.05).

Fig. 5
figure 5

Scatterplot representing the distribution of the mean rating in four dimensions provided in each painful (red) and non-painful (blue) face picture of the EPSS-Face

Statistical analysis of the EPSS-Face

Table 4 shows the summary of the statistical analysis of stimulus type × participant gender × actor gender in the participant assessments for four dimensions of the EPSS-Face.

Table 4 Summary of the effect of stimulus type × participant gender × actor gender of the EPSS-Face

Pain intensity

Pain intensity rating scores were moderated by the main effects of stimulus type [F(1,68) = 921.30, p < 0.001, ηp2 = 0.93] and actor gender [F(1,68) = 6.58, p =0.013, ηp2 = 0.09]. Painful face pictures (6.04 ± 0.14) were more painful than the non-painful face pictures (1.38 ± 0.07). Pictures with male actors (3.74 ± 0.08) were deemed more painful than female actors (3.69 ± 0.08). The interaction of actor gender × participant gender [F(1,68) = 8.66, p =0.004, ηp2 = 0.11] indicated that male participants rated male actors (3.79 ± 0.12) as more painful than female actors (3.68 ± 0.11), while no difference was found for female participants (male actor: 3.69 ± 0.12, female actor: 3.69 ± 0.11; p = 0.790). The interaction of stimulus type × actor gender × participant gender [F(1,68) = 5.01, p = 0.028, ηp2 = 0.07] was significant. For non-painful face pictures, male participants rated male actors as being more painful than female actors (male actors: 1.45 ± 0.10, female actors: 1.38 ± 0.09; p = 0.025), while no difference was found in female participants (male actors: 1.38 ± 0.10, female actors: 1.33 ± 0.09; p = 0.153). For painful face pictures, male participants rated male actors as being more painful than female actors (male actors: 6.14 ± 0.21, female actors: 5.99 ± 0.20; p = 0.003), while no such difference was found in female participants (male actors: 6.00 ± 0.21, 6.06 ± 0.20; p = 0.252). In addition, the differences in rating scores between painful and non-painful face pictures with male actors were larger than female actors for female participants (male actors: 4.70 ± 1.44, female actors: 4.61 ± 1.30; p = 0.037), while no such difference was found in male participants (male actors: 4.62 ± 1.24, female actors: 4.72 ± 1.19; p = 0.222).

Affective valence

Affective valence rating scores were moderated by the main effects of stimulus type [F(1,68) = 81.96, p < 0.001, ηp2 = 0.55] and actor gender [F(1,68) = 7.57, p = 0.008, ηp2 = 0.10]. Painful face pictures (3.33 ± 0.16) were rated as more negative than non-painful face pictures (4.94 ± 0.08). Pictures with male actors (4.09 ± 0.09) were rated as more negative than female actors (4.18 ± 0.10). The interaction of actor gender × participant gender [F(1,68) = 5.14, p = 0.027, ηp2 = 0.07] indicated that female participants rated male actors as more negative than female actors (male actors: 4.08 ± 0.12, female actors: 4.23 ± 0.14; p = 0.001), while no difference was found in the male participants (male actors: 4.11 ± 0.12, female actors: 4.12 ± 0.14; p = 0.733). The interaction of stimuli type × actor gender [F(1,68) = 4.89, p = 0.030, ηp2 = 0.07] indicated that for the non-painful face pictures, pictures with female actors were rated as less negative than male actors (male actors: 4.88 ± 0.08, female actors: 5.01 ± 0.09; p = 0.001), while no difference was found for the painful face pictures (male actors: 3.31 ± 0.15, female actors: 3.35 ± 0.17; p = 0.285).

Arousal

Arousal rating scores were moderated by the main effect of stimulus type [F(1,68) = 72.40, p < 0.001, ηp2 = 0.52]. Painful face pictures (4.99 ± 0.23) were rated as more exciting than non-painful face pictures (2.65 ± 0.18).

Dominance

Dominance rating scores were moderated by the main effect of stimulus type [F(1,68) = 67.34, p < 0.001, ηp2 = 0.50], and painful face pictures (5.02 ± 0.25) were rated as less in control than non-painful face pictures (7.43 ± 0.18). The interaction of stimulus type × actor gender [F(1,68) = 4.67, p = 0.034, ηp2 = 0.06] indicated that for the non-painful face pictures, pictures with female actors were rated as more in control than male actors (male actors: 7.38 ± 0.18, female actors: 7.47 ± 0.18; p = 0.017), while no difference was found for the painful face pictures (male actors: 5.03 ± 0.24, female actors: 5.02 ± 0.25; p = 0.808).

Empathy for Voice Pain Database (EPSS-Voice)

Method

Development of stimuli

Actors

Thirty-one actors (15 females) studying drama at universities in Chongqing, China, aged 18–23 (mean = 21.13, SD = 1.41), participated in the recording sessions. All the actors had normal vision and hearing and signed informed consent forms to participate in the recording process in accordance with the Declaration of Helsinki. They received ¥100 in compensation.

Acting

The actors were instructed to produce short vocal exclamations that were either painful or neutral using the vowel sound /a/. A short rehearsal session preceded each recording round, during which the sounds’ level and duration were adjusted. Each vocalization category was performed several times until our qualitative criterion was reached. Two experimenters had to be able to recognize the painful or non-painful voices the actors produced. Feedback was given to the actors during the session so that they could improve their performance. Finally, 120 vocalizations (89 painful and 31 non-painful) were recorded and categorized as painful and non-painful voices, respectively.

Recording and editing

The voices were recorded in a sound-proof room using an ATR2500 condenser microphone (Audio Technica) at a distance of approximately 40 cm. The recordings were edited into short, meaningful segments of about 1 s each. Their peak values were normalized, and they were down-sampled at 44.1 kHz using Adobe Audition (Adobe Systems, Inc.). Only the best example for each actor and vocalization category was kept for the validation stage (Fig. 6).

Fig. 6
figure 6

Examples of painful (left panel) and non-painful (right panel) voice waveforms in the EPSS-Voice

Stimuli validation

Participants

Seventy adults (35 females) from Chongqing Normal University, Chongqing, China, were recruited as paid participants. They were aged 18–31 (mean = 22.89, SD = 2.32) and right-handed, with normal vision and hearing. The other inclusion criteria were the same as for the EPSS-Limb.

Procedures, selection criteria, and statistical analysis

The main evaluation procedure for the EPSS-Voice was similar to those for the EPSS-Limb, except that the painful and non-painful voices were presented through headphones (Fig. 7). In addition, the selection criteria and statistical analysis methods for the EPSS-Voice were similar to those for the EPSS-Limb.

Fig. 7
figure 7

Procedure settings of stimuli validation of EPSS-Voice

Results

Based on the selection criteria of the stimuli, 15 male and 15 female actors who produced the most successful displays were selected. Sixty voices were selected to form the EPSS-Voice, including 30 painful (see the left panel of Fig. 6) and 30 non-painful voices (see the right panel of Fig. 6). Based on the selection criteria of the participants, ten participants (14.3% of the total, five women) were excluded. Thus, rating scores from 60 participants (30 females) aged 19–31 (mean = 22.77, SD = 2.27) were calculated.

Characteristics of the EPSS-Voice

The EPSS-Voice provides the descriptive statistics results of the rating scores for the painful and non-painful voices across the four dimensions made by the total, female, and male participants (Table 5).

Table 5 Descriptive statistics for the EPSS-Voice (Mean ± SD)

Reliability of the measures of the EPSS-Voice

The internal consistency of participant assessments of the EPSS-Voice was estimated by calculating split-half reliability scores. The Spearman–Brown formula reliability scores were particularly high in four-dimensional ratings (pain intensity: r = 0.94; affective valence: r = 0.92; arousal: r = 0.95; dominance: r = 0.99). Therefore, dimensional ratings of participant assessments might be considered highly homogeneous in the EPSS-Voice.

Relationships between dimensions of the EPSS-Voice

Figure 8 shows the relationships between the four dimensions of mean rating scores of painful and non-painful voices for the EPSS-Voice. Concerning the painful voices, significant negative correlations were found in pain intensity × affective valence, pain intensity × dominance, and arousal × dominance. In contrast, significant positive correlations were found in pain intensity × arousal and dominance × affective valence (all ps < 0.05). Concerning the non-painful voices, a significant negative correlation was found in pain intensity × dominance (p < 0.05).

Fig. 8
figure 8

Scatterplot representing the distribution of the mean ratings in the four dimensions provided in each painful (red) and non-painful (blue) voice of the EPSS-Voice

Statistical analysis of the EPSS-Voice

Table 6 shows the summary of the statistical analysis of stimulus type × participant gender × actor gender in the participant assessments for four dimensions of the EPSS-Voice.

Table 6 Summary of the effect of stimulus type × participant gender × actor gender in the EPSS-Voice

Pain intensity

Pain intensity rating scores were moderated by the main effects of stimulus type [F(1,58) = 1224.42, p < 0.001, ηp2 = 0.96] and actor gender [F(1,58) = 38.46, p < 0.001, ηp2 = 0.40]. Painful voices (5.80 ± 0.08) were more painful than non-painful voices (1.60 ± 0.09). Voices with female actors (3.83 ± 0.07) were more painful than male actors (3.57 ± 0.06). The interaction of stimuli type × actor gender [F(1,60) = 42.36, p < 0.001, ηp2 = 0.42] indicated that for the painful voices, voices with female actors were more painful than male actors (male actors: 5.53 ± 0.08, female actors: 6.06 ± 0.10; p <0.001), while no difference was found for the non-painful voices (male actors: 1.61 ± 0.09, female actors: 1.59 ± 0.09; p = 0.766).

Affective valence

Affective valence rating scores were moderated by the main effects of stimulus type [F(1,58) = 158.34, p < 0.001, ηp2 = 0.73] and actor gender [F(1,58) = 19.00, p < 0.001, ηp2 = 0.25]. Painful voices (1.60 ± 0.09) were more negative than non-painful voices (5.80 ± 0.08). Voices with female actors (4.32 ± 0.08) were more negative than male actors (4.54 ± 0.07). The interaction of stimulus type × actor gender [F(1,58) = 30.53, p < 0.001, ηp2 = 0.35] indicated that concerning painful voices, voices with female actors were more negative than male actors (male actors: 3.92 ± 0.10, female actors: 3.44 ± 0.14, p < 0.001), while no difference was found in non-painful voices (male actors: 5.16 ± 0.07, female actors: 5.19 ± 0.07; p = 0.611).

Arousal

Arousal rating scores were moderated by the main effect of stimulus type [F(1,58) = 13.51, p = 0.001, ηp2 = 0.19]. Painful voices (4.98 ± 0.18) were more exciting than non-painful voices (3.87 ± 0.20).

Dominance

Dominance rating scores were significantly moderated by the main effects of stimulus type [F(1,58) = 166.85, p < 0.001, ηp2 = 0.74] and actor gender [F(1,58) = 31.61, p < 0.001, ηp2 = 0.35]. Painful voices (4.98 ± 0.19) were rated as more out of control than non-painful voices (7.90 ± 0.20). However, the voices of male actors (6.70 ± 0.17) were rated as more in control than female actors (6.19 ± 0.16). The interaction of stimulus type × actor gender [F(1,58) = 50.91, p < 0.001, ηp2 = 0.47] indicated that concerning painful voices, the voices of male actors were rated as more in control than female actors (male actors: 5.42 ± 0.20, female actors: 4.53 ± 0.20, p <0.001), while no difference was found for non-painful voices (male actors: 7.97 ± 0.20, female actors: 7.84 ± 0.20; p = 0.127).

Empathy for Action Pain Video Database (EPSS-Action_Video)

Method

Development of stimuli

Actors

Twenty actors studying drama at universities in Chongqing, China, aged 18–24 (mean = 22.30, SD = 2.12), participated in the video recording sessions. All the actors signed informed consent forms to participate in the recording process in accordance with the Declaration of Helsinki and the portrait agreement. They received ¥200 for their performance.

Acting

The actors were instructed to produce painful and non-painful actions in the filming session. The actions used 12 body parts (i.e., head, teeth, neck, arms, elbows, hands, belly, knees, genitalia, chest, waist, and hip). In the session of stimuli development, 480 painful actions and 480 non-painful actions were filmed, the examples of which are illustrated in Fig. 9.

Fig. 9
figure 9

An illustration of sample frames from painful (left panel) and non-painful (right panel) action videos in the EPSS-Action_Video

Filming and editing

Before filming, the actors were asked to dress in uniforms (a white T-shirt and black shorts). They were also asked to take off their accessories and not to wear make-up. The actors were filmed in an evenly lit green-screen studio with an ambient temperature of approximately 26°C. A Sony FDR-AXP55 (Sony Group Corporation) camera was used at a distance of approximately 3 m. Each film was 2160 × 1280 pixels, with a 60-fps progressive scan. The height of the camera was 1.2 m.

Adobe Premiere Pro2020 (Adobe Systems Incorporated) was used to edit the video footage. The green background was changed into a gray background, and the actor was isolated from all other contextual information. Each video was edited to a duration of 1 s. It was saved in mp4 format at 768 × 432 pixels and 60 fps.

Validation

Participants

Seventy adults (34 females) from Chongqing Normal University, Chongqing, China, were recruited as paid participants. They were aged 18–30 (mean = 23.23, SD = 2.15) and right-handed, with normal or corrected-to-normal vision. The other inclusion criteria were similar to the EPSS-Limb.

Procedures, selection criteria, and statistical analysis

The evaluation procedure, selection criteria of stimuli, participants, and statistical analysis methods for the EPSS-Action_Video were similar to those for the EPSS-Limb.

Results

Based on the selection criteria of the stimuli, 239 painful and 239 non-painful action videos, recorded by 20 actors (ten females), were selected (illustrated in Fig. 9). Based on the selection criteria of the participants, ten participants (14.3% of the total, four women) were excluded. Thus, rating scores from 60 participants (30 females) aged 18–25 (mean = 21.80, SD = 1.90) were calculated.

Characteristics of the EPSS-Action_Video

The EPSS-Action_Video provides the descriptive statistics results of the rating scores for the painful and non-painful action videos across the four dimensions made by the total, female, and male participants (Table 7).

Table 7 Descriptive statistics for the EPSS-Action_Video (Mean ± SD)

Reliability of the measurements of the EPSS-Action_Video

The internal consistency of participant assessments of the EPSS-Action_Video was estimated by calculating split-half reliability scores. The Spearman–Brown formula reliability scores were particularly high in four-dimensional ratings (Pain intensity: r = 0.99; Affective valence: r = 0.99; Arousal: r = 0.99; Dominance: r = 0.99). Therefore, dimensional ratings of participant assessments might be considered highly homogeneous in the EPSS-Action_Video.

Relationships between the dimensions of the EPSS-Action_Video

Figure 10 shows the relationships between the four dimensions of mean rating scores of painful and non-painful action videos for the EPSS-Action_Video. Concerning both the painful and non-painful action videos, significant negative correlations were found in pain intensity × affective valence, pain intensity × dominance, arousal × affective valence, and arousal × dominance. In addition, significant positive correlations were found in pain intensity × arousal and dominance × affective valence (all ps < 0.001).

Fig. 10
figure 10

Scatterplot representing the distribution of the mean ratings in four dimensions provided in each painful (red) and non-painful (blue) action video of the EPSS-Action_Video

Statistical analysis of the EPSS-Action_Video

Table 8 shows the summary of the statistical analysis of stimulus type × participant gender × actor gender in the participant assessments for four dimensions of the EPSS-Action_Video.

Table 8 Summary of the effect of stimulus type × participant gender × actor gender of the EPSS-Action_Video

Pain intensity

Pain intensity rating scores were moderated by the main effect of stimulus type [F(1,58) = 831.44, p < 0.001, ηp2 = 0.96]. Painful action videos (6.31 ± 0.11) were more painful than non-painful action videos (2.24 ± 0.11).

Affective valence

Affective valence rating scores were moderated by the main effect of stimulus type [F(1,58) = 31.28, p < 0.001, ηp2 = 0.35]. Painful action videos (3.88 ± 0.16) were more negative than non-painful action videos (5.01 ± 0.09).

Arousal

Arousal rating scores were moderated by the main effect of stimulus type [F(1,58) = 52.60, p < 0.001, ηp2 = 0.48]. Painful action videos (4.84 ± 0.23) were rated as more exciting than non-painful action videos (3.11 ± 0.20).

Dominance

Dominance rating scores were moderated by the main effect stimulus type [F(1,58) = 20.19, p < 0.001, ηp2 = 0.26]. Painful action videos (5.88 ± 0.25) were rated as more out of control than non-painful action videos (7.02 ± 0.26).

Empathy for Action Pain Picture Database (EPSS-Action_Picture)

Methods

Development of stimuli

The EPSS-Action_Picture stimuli were developed from the EPSS-Action_Video. A frame of the image best representing the painful or non-painful states of the actor in each video of EPSS-Action_Video was cut out as the stimulus for the EPSS-Action_Picture (examples are shown in Fig. 11). Thus, like the EPSS-Action_Video, the pictures in EPSS-Action_Picture conveyed 12 body parts (head, teeth, neck, arms, elbows, hands, belly, knees, genitalia, chest, waist, and hip). In total, 239 painful and 239 non-painful action pictures from 20 actors (ten females) were selected. The luminance, contrast, and color were matched between the painful and non-painful action pictures. Each picture was 15.24 × 27.9 cm (width × height) and 72 pixels per inch.

Fig. 11
figure 11

An illustration of painful (left panel) and non-painful (right panel) action pictures in the EPSS-Action_Picture

Validation

Participants

Seventy adults (36 females) from Chongqing Normal University, Chongqing, China, were recruited as paid participants. They were aged 18–28 (mean = 22.66, SD = 2.17) and right-handed, with normal or corrected-to-normal vision. The other inclusion criteria were the same as for the EPSS-Limb.

Procedures, selection criteria, and statistical analysis

The evaluation procedure, selection criteria of stimuli, participants, and statistical analysis methods for the EPSS-Action_Picture were similar to those for the EPSS-Limb.

Results

Based on the selection criteria of the stimuli, 478 action pictures (239 painful and 239 non-painful action pictures), recorded by 20 actors (ten females), were selected (illustrated in Fig. 11). Based on the selection criteria of the participants, ten participants (14.3% of the total, six women) were excluded. Thus, rating scores from 60 participants (30 females) aged 18–25 (mean = 21.80, SD = 1.90) were calculated.

Characteristics of the EPSS-Action_Picture

The EPSS-Action_Picture provides the descriptive statistics results of the rating scores for the painful and non-painful action pictures across the four dimensions made by the total, female, and male participants (Table 9).

Table 9 Descriptive statistics for the EPSS-Action_Picture (mean ± SD)

Reliability of the measurements of the EPSS-Action_Picture

The internal consistency of participant assessments of the EPSS-Action_Picture was estimated by calculating split-half reliability scores. The Spearman–Brown formula reliability scores were particularly high in four-dimensional ratings (Pain intensity: r = 0.99; Affective valence: r = 0.99; Arousal: r = 0.99; Dominance: r = 0.99). Therefore, dimensional ratings of participant assessments might be considered highly homogeneous in the EPSS-Action_Picture.

Relationships between dimensions of the EPSS-Action_Picture

Figure 12 shows the relationships between the four dimensions of mean rating scores of painful and non-painful action pictures for the EPSS-Action_Picture. Concerning both the painful and non-painful action pictures, significant negative correlations were found in pain intensity × affective valence, pain intensity × dominance, arousal × affective valence, and arousal × dominance. In addition, significant positive correlations were found in pain intensity × arousal and dominance × affective valence (all ps < 0.001).

Fig. 12
figure 12

Scatterplot representing the distribution of the mean ratings in four dimensions provided in each painful (red) and non-painful (blue) action picture of the EPSS-Action_Picture

Statistical analysis of the EPSS-Action_Picture

Table 10 shows the summary of the statistical analysis of stimulus type × participant gender × actor gender in the participant assessments for four dimensions of the EPSS-Action_Picture.

Table 10 Summary of the effect of stimulus type × participant gender × actor gender of the EPSS-Action_Picture

Pain intensity

Pain intensity rating scores were moderated by the main effect of stimulus type [F(1,58) = 776.99, p < 0.001, ηp2 = 0.93]. Painful action pictures (6.09 ± 0.11) were more painful than non-painful action pictures (2.14 ± 0.10).

Affective valence

Affective valence rating scores were moderated by the main effect of stimulus type [F(1,58) = 21.33, p < 0.001, ηp2 = 0.27]. Painful action pictures (4.00 ± 0.16) were more negative than non-painful action pictures (4.89 ± 0.08). The interaction of stimulus type × participant gender [F(1,58) = 7.87, p = 0.007, ηp2 = 0.12] indicated that for painful action pictures, female participants rated more negative than male participants (male participants: 4.40 ± 0.23, female participants: 3.59 ± 0.23; p = 0.017), while no difference was found for non-painful stimuli (male participants: 4.75 ± 0.11, female participants: 5.03 ± 0.11; p = 0.205).

Arousal

Arousal rating scores were moderated by the main effect of stimulus type [F(1,58) = 70.20, p < 0.001, ηp2 = 0.55]. Painful action pictures (4.85 ± 0.23) were more exciting than non-painful action pictures (2.99 ± 0.18). The interaction of stimulus type × actor gender × participant gender [F(1,58) = 7.70 p = 0.007, ηp2 = 0.18] indicated that for the non-painful action pictures, female participants rated pictures with male actors as more exciting than female actors (male actors: 3.12 ± 0.26, female actors: 3.05 ± 0.26; p = 0.031), while no other difference was found (ps > 0.05).

Dominance

Dominance rating scores were moderated by the main effect of stimulus type [F(1,58) = 21.11, p < 0.001, ηp2 = 0.27]. Painful action pictures (5.71 ± 0.24) were rated as less in control than non-painful action pictures (6.97 ± 0.25). The interaction of stimulus type × actor gender × participant gender [F(1,58) = 4.36, p = 0.041, ηp2 = 0.07] indicated that for the painful action pictures, female participants rated the pictures with female actors as less in control than male actors (male actors: 5.89 ± 0.33, female actors: 5.82 ± 0.36; p = 0.039), while no other difference was found (ps > 0.05).

Discussion

The present study described the generation and assessment of the Empathy for Pain Stimuli System (EPSS), including the Empathy for Limb Pain Picture Database (EPSS-Limb), Empathy for Face Pain Picture Database (EPSS-Face), Empathy for Voice Pain Database (EPSS-Voice), Empathy for Action Pain Video Database (EPSS-Action_Video), and Empathy for Action Pain Picture Database (EPSS-Action_Picture). We believe that these databases, provided for free at https://osf.io/muyah/?view_only=33ecf6c574cc4e2bbbaee775b299c6c1, will be useful for researchers hoping to conduct theoretically motivated explorations of the behavioral, cognitive, and neural processes underlying pain and empathy for pain.

For the stimuli in all the EPSS sub-databases (i.e., EPSS-Limb, EPSS-Face, EPSS-Voice, EPSS-Action_Video, EPSS-Action_Picture), pain intensity was rated using a nine-point Likert scale (1 = no sensation, 4 = pain threshold, 9 = the most intense pain imaginable). The selection criteria for the stimuli were that the pain intensity scores had to be > 4 for the painful stimuli and < 4 for the non-painful stimuli. Thus, the painful and non-painful stimuli could be accurately identified. As a result, the recognition rates for the database were higher than those found for the database of painful stimuli (Belin et al., 2008).

In addition to pain intensity, the other three dimensions (affective valence, arousal, and dominance), which indicated emotional reactions in previous databases (Bai et al., 2005; Lang et al., 1999), were rated. Consistent with a previous database of empathy for pain (Fernandes-Magalhaes et al., 2022), for all the EPSS sub-databases, the painful stimuli were more painful, more arousing, less pleasurable, and less dominant than the non-painful stimuli. The pain intensity scores of painful stimuli for the five sub-databases were negatively correlated with affective valence and dominance. However, they were positively correlated with arousal, suggesting that the more painful the stimuli, the less pleasure, the less dominance, and the more excited people felt.

Consistent with previous studies (Belin et al., 2008), the rating scores of the stimuli in the EPSS were influenced by the gender of the actors in EPSS-Limb, EPSS-Face, and EPSS-Voice. For example, male actors were perceived to feel more pain than female actors in both EPSS-Limb and EPSS-Face. Furthermore, consistent with previous studies on empathy for pain (Fernandes-Magalhaes et al., 2022), the significant interaction effects between actor gender and participant gender were found in affective valence in the EPSS-Limb and EPSS-Face, suggesting that female participants were more sensitive to others’ pain with different genders than male participants. These interactions were also found in the pain intensity, arousal, and dominance of the EPSS-Limb, as well as the pain intensity of the EPSS-Face.

There were some limitations to the EPSS that should be mentioned. First, the models in the EPSS stimuli were Chinese adults, which may be a limitation of this study. In fact, studies have shown that racial bias can be an important factor concerning empathy for pain (Avenanti et al., 2010; Fabi & Leuthold, 2018; Sheng & Han, 2012; Xiang et al., 2018). Therefore, future studies could expand the current database to include models from other races. Second, stimuli in the EPSS were performed by trainee actors or non-professional actors. However, since previous studies have shown that emotional actions produced by professional and non-professional actors are similar (Barliya et al., 2013; Roether et al., 2009) or slightly different (Keefe et al., 2014), the present database could be accepted for the stimuli used in this study, particularly given the strict selection criteria used in this study. Third, for many of the stimuli in the database, the participants recruited in each sub-database had to spend 1–2 h to rate the stimuli during the validation process at a time. Fatigue and practice may have played a role in the rating process. Finally, as the total number of participants included in the EPSS was 350 (70 participants for each sub-database, 312 of whom met the selection criteria) and only young Asian participants were included in the study, a future study would be advisable to expand the validation sample.

In conclusion, we presented the Empathy for Pain Stimuli System (EPSS). The EPSS includes five sub-databases: the Empathy for Limb Pain Picture Database (EPSS-Limb), Empathy for Face Pain Picture Database (EPSS-Face), Empathy for Voice Pain Database (EPSS-Voice), Empathy for Action Pain Video Database (EPSS-Action_Video), and Empathy for Action Pain Picture Database (EPSS-Action_Picture). For each stimulus in the database, the EPSS provides a wealth of information, such as pain intensity, affective valence, arousal, and dominance. We generated and made available this large database concerning empathy for pain, which could be used in behavioral, psychological, and neural research.