1 Introduction

In expectancy-value models, key determinants of students’ achievement-related choices and students’ success are task value and self-efficacy, which have been postulated to interact with each other (Eccles et al., 1983; Eccles & Wigfield, 2020; Middleton & Spanias, 1999). To plan lessons and to support school students’ learning in mathematics classes, teachers should be able to judge school students’ motivation to engage in different activities, such as students’ motivation to solve modelling, word, and intramathematical problems (Hannula et al., 2019; Schukajlow et al., 2017). The accuracy of teachers’ diagnostic judgments of school students’ motivation depends on the teachers’ diagnostic competence regarding their judgments of affective components, such as the value school students might attribute to solving selected problems and how confident they are in solving these problems (i.e., self-efficacy). To our knowledge, research on the determinants and underlying assumptions of teachers’ diagnostic judgments of school students’ task value and self-efficacy is scarce.

The goal of the present study was to investigate preservice teachers’ own task value and self-efficacy and their diagnostic judgments of school students’ task value and self-efficacy across different types of mathematical problems, specifically, modelling, word, and intramathematical problems. We were interested in whether judgments of motivation are different for different types of problems and whether there is a positive relationship between preservice teachers’ own ratings and their diagnostic judgments of school students’ motivation. In reporting this research, we seek to contribute to the understanding of the role of the type of problem in diagnostic judgments, and to supply initial indications of the mechanism behind diagnostic judgments, according to the data we collected.

Theoretical foundations for the analyses in the present study are expectancy-value theory (EVT; Eccles & Wigfield, 2002; Wigfield et al., 2020; Wigfield & Eccles, 1992, 2000), social-cognitive theory (Bandura, 2003), and theories about types of mathematical problems (Niss et al., 2007), teachers’ diagnostic competence (Südkamp et al., 2012), and the cognitive systems involved in diagnostic judgments (Kahneman, 2003). With the provided theoretical framework and the subsequent analyses of preservice teachers’ judgments of motivation in solving modelling, word, and intramathematical problems, we aim to provide insights into the foundations of judgment accuracy in the early stages of teacher education.

2 Theoretical and empirical background

2.1 Importance of school students’ task value and self-efficacy for learning and performance

According to EVT (Eccles & Wigfield, 2002; Wigfield et al., 2020; Wigfield & Eccles, 1992, 2000), achievement-related choices, effort, persistence, and performance are strongly influenced by efficacy beliefs and task value. In EVT, task value describes the perceived importance of tasks, activities, or objects (Eccles & Wigfield, 2002). Task value can be further distinguished into four subcomponents, namely, attainment value, intrinsic value, utility value, and cost (Eccles et al., 1983). Attainment value is linked to social and personal identities and can be described as the importance of the task for “its ability to satisfy key aspects of the individual’s core self-schema and life goals/values” (Wigfield et al., 2020, p. 665). Intrinsic value means the joy that comes from engaging in this task, and utility value refers to the fit between the given task and the present and future lives of the school students (Eccles et al., 1983). Lastly, cost is related to the value attributed to a task with respect to the effort necessary to complete the task cognitively and emotionally (Eccles et al., 1983; Eccles & Wigfield, 2020).

Another central construct in expectancy-value theory is efficacy beliefs (Eccles & Wigfield, 2020). Self-efficacy “refers to beliefs in one’s capabilities to organize and execute the courses of action required to produce given attainments” (Bandura, 2003, p. 3). Self-efficacy comprises students’ confidence in performing a task in the future, and it has a strong impact on how much effort students will invest in solving a given problem and how long they will persist in trying to solve a problem when encountering obstacles during the solution process (Bandura, 2003).

Research has shown a multitude of positive effects of high self-efficacy expectations and of attributing high value to achievement-related activities or objects (e.g., learning materials or problem solving; for an overview, see Rosenzweig & Wigfield, 2016; Wigfield & Eccles, 2000). For example, task value and self-efficacy and their interaction have been found to be related to performance (Hannula et al., 2014; Hoffman & Spatariu, 2008; Hulleman et al., 2008; Trautwein et al., 2012) and strategy use in mathematics (Schukajlow, Blomberg, et al., 2021) and to depend on the problems offered in the mathematics classroom (Krawitz & Schukajlow, 2018).

2.2 Characteristics of modelling, word, and intramathematical problems

Solving problems is the dominant activity in mathematics classes, and characteristics of problems that teachers offer in the classroom affect school students’ learning processes to a great extent. Mathematical problems can be differentiated along several dimensions, one of which is the problem’s relation to reality. Using Niss et al.’s (2007) categories, Schukajlow et al. (2012) distinguished problems by the cognitive processes required to solve them, which depend on whether the problem has a relation to reality. Modelling problems have a strong relation to reality, ‘dressed up’ word problems have a moderate relation to reality, and intramathematical problems do not have a relation to reality (Schukajlow et al., 2012).

Solving modelling problems requires cognitively demanding transfer processes between the extramathematical world and mathematics (Blum & Leiß, 2007). These processes are usually represented in models of ideal solution processes called modelling cycles. Blum & Leiß (2007) proposed seven key phases required for appropriate mathematical modelling, which we characterize by looking at the “Maypole” modelling problem (Fig. 1). This problem describes a real-world situation, the so-called “may dance,” which is a traditional dance in parts of Germany. To solve this problem, the problem solver needs to construct an individual situation model and simplify and structure it, thus constructing the so-called real model. Afterwards, the real model needs to be mathematized to construct a suitable mathematical model. The suitability of the real model (and consequently the mathematical model) depends on whether the problem solver made assumptions about the height at which the dancers hold the ribbons in their hands and whether a right-angled triangle could be identified in the simplified and structured real-world situation. If the mathematization is correct, the problem solver can solve the problem mathematically by applying the Pythagorean theorem, with the aim of computing the unknown length in the triangle, the so-called mathematical result. This mathematical result needs to be interpreted and validated with respect to the real-world situation and—if the validation cannot be deemed satisfactory—the utilized models will have to be adapted, and the solution process will start again (cf. Blum & Leiß, 2007).

Fig. 1 The “Maypole” modelling problem (Schukajlow & Leiss, 2011)

The second type of mathematical problem with a relation to reality is the ‘dressed up’ word problem. This type of problem is loosely related to reality and includes significant simplifications and prestructuring. The real model is already presented in ‘dressed up’ word problems (Schukajlow et al., 2012). In the “Soccer Field” problem (Fig. 2), understanding that the diagonal of the given soccer field with its length of 110 m and its width of 70 m has to be calculated is sufficient for identifying the right-angled triangle and for constructing a suitable mathematical model. After applying the Pythagorean theorem, the result needs to be interpreted as the length of the diagonal of the soccer field. Validation is much less demanding while solving ‘dressed up’ word problems than for modelling problems because the problem solver does not need to examine the real model critically.
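The mathematical step of the “Soccer Field” problem can be written out as a minimal Python sketch. Only the length (110 m) and the width (70 m) come from the problem; the variable names are illustrative.

```python
import math

# Side lengths given in the "Soccer Field" problem (in metres).
length_m = 110.0
width_m = 70.0

# Pythagorean theorem: the diagonal is the hypotenuse of the right-angled
# triangle formed by the length and the width of the field.
diagonal_m = math.sqrt(length_m**2 + width_m**2)

print(f"Diagonal of the soccer field: {diagonal_m:.1f} m")
```

The result, about 130.4 m, then only has to be interpreted as the length of the diagonal; no assumptions about the real situation have to be checked.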

Fig. 2 ‘Dressed up’ word problem “Soccer Field” (Schukajlow et al., 2012)

Contrary to the modelling and ‘dressed up’ word problems, solving intramathematical problems, such as the “Side c” problem (Fig. 3), does not require the problem solver to convert the real-world situation into a mathematical model and vice versa. The problem itself is presented in mathematical-symbolic language, and thus, the mathematical model is already preconstructed. The problem solver can apply the Pythagorean theorem immediately to compute the requested length of the triangle’s side. Because the problem itself has no relation to the extramathematical world, no interpretation or validation process regarding the relation to reality needs to be initiated by the problem solver.

Fig. 3 Intramathematical problem “Side c” (Krug & Schukajlow, 2013)

2.3 Task values and self-efficacy for modelling, word, and intramathematical problems

2.3.1 Task values for modelling, ‘dressed up’ word, and intramathematical problems

Schukajlow et al. (2012) found that school students value modelling, word, and intramathematical problems similarly. Further, in a study by Krawitz & Schukajlow (2018), school students expressed that being able to solve modelling problems was less important to them than solving word and intramathematical problems. These results contradict the considerations that (1) a relation to reality in modelling and ‘dressed up’ word problems might offer an additional source of value compared with intramathematical problems and (2) students should value the ability to solve modelling problems more than ‘dressed up’ word problems because modelling problems do not include strong simplifications and are related to reality. The reasons why school students attribute low or high value to a given mathematical problem might lie in the cognitive and noncognitive cost, attainment, and utility value of the problems (Eccles & Wigfield, 2020). Solving modelling problems might be associated with high cost because of the complexity of the cognitive demands of this type of problem, thus possibly negating the gain in task value from a problem’s relation to reality, especially when students are not familiar with this type of problem. Attainment and utility value for modelling problems can be lower for students when compared with word and intramathematical problems because modelling problems are not (yet) common on exams or in school curricula (Blum & Borromeo Ferri, 2009; Krawitz et al., 2021). Thus, school students have not built an identity regarding modelling problems and do not fully recognize and appreciate the necessity of being able to solve them. This issue might also hold for future teachers if they see modelling problems as less integrated into everyday lessons and do not recognize the benefits of using modelling problems in class.
Furthermore, future teachers’ valuing of modelling, word, and intramathematical problems might be influenced by university curricula, which are strongly focused on argumentation, proofs, and intramathematical problems. In sum, on the one hand, the relation to reality can be a source of the higher value of modelling problems in comparison with word and intramathematical problems. On the other hand, high cost, low attainment, and utility value for modelling problems can negatively influence future teachers’ own overall valuing of modelling problems.

2.3.2 Self-efficacy for modelling, word, and intramathematical problems

According to social-cognitive theory, self-efficacy strongly depends on the types of activities that individuals are going to perform (Bandura, 1993). As solving modelling, word, and intramathematical problems requires different cognitive activities, students’ self-efficacy can depend on the types of problems. Furthermore, prior positive experience has been suggested to be important for self-efficacy (Bandura, 1993; Usher & Pajares, 2009), especially mastery experience (Usher & Pajares, 2008). For example, when a problem is solved accurately and the student interprets the solution process as a success, the student gains mastery experience, which, in turn, is beneficial for the student’s self-efficacy.

From a theoretical point of view, if students are familiar with the three types of problems and have similar mastery experience in solving these types of problems, they should express comparable self-efficacy expectations across the three types (Große, 2014). In line with the theoretical assumptions made about the influence of meaningful relations to reality in mathematical problems on school students’ self-efficacy, Schukajlow et al. (2012) found that students felt slightly more confident about solving modelling problems in comparison with word problems, whereas no difference was found between modelling and intramathematical problems. These results are in contrast with Krawitz and Schukajlow’s (2018) findings that school students reported lower self-efficacy for modelling problems than for word and intramathematical problems. Krawitz and Schukajlow explained that their results can be accounted for by school students’ lack of mastery experience in solving modelling problems, coupled with the high cognitive and noncognitive demands for solving modelling problems. However, preservice teachers have higher mathematical knowledge than school students and may have also gained additional mastery experience in solving modelling, word, and intramathematical problems in their studies. Consequently, preservice teachers may see solving the school problems as routine exercises and report high self-efficacy for the three types of problems.

2.4 Diagnostic competence and judgment accuracy

Teachers need to be able to judge the cognitive and affective characteristics of their students accurately in order to foster individual students’ learning by designing and implementing meaningful classroom instruction. While selecting tasks for their mathematics classes, teachers must consider how students will react to being required to solve these tasks. Diagnostic competence involves the ability to judge students’ learning processes and their individual learning and performance-related circumstances accurately (Schrader, 2009; Schrader & Praetorius, 2018; Südkamp et al., 2012), including noncognitive characteristics, such as motivation (Campbell et al., 2014). Diagnostic competence also includes person-related and person-specific judgment accuracy, as well as task-related judgment accuracy (Urhahne & Wijnia, 2021). Consequently, diagnostic competence for cognitive and noncognitive aspects of students’ learning is a crucial aspect of teachers’ pedagogical expertise. Diagnostic competence includes the level and rank components (Spinath, 2005). The level component describes the degree of conformity in the estimation of the absolute values of the participants’, objects’, or activities’ characteristics, for example, the motivation to engage in solving a problem, the importance of a task, or the learning of mathematics itself. The rank component describes how well teachers’ judgments reproduce the rank order of the subjects, objects, or activities with respect to the investigated characteristics. Both components have been established as important in the investigation of teachers’ judgment accuracy (Helmke & Schrader, 1987; Spinath, 2005).

According to Leuders et al. (2018) and their specification of Blömeke et al.’s (2015) model on diagnostic competence, the diagnosis of students’ task value and self-efficacy for different types of mathematical problems is a diagnostic skill. To our knowledge, no overarching theoretical framework for the foundation of diagnostic judgments or decision making in educational contexts exists (Leuders et al., 2018; Praetorius & Südkamp, 2017). However, a framework for diagnostic decision making that is being used in cognitive psychology was developed by Kahneman (2003). Kahneman distinguished between two types of systems involving different kinds of knowledge, which are responsible for the generation of judgments. System 1 is linked to intuition and is fast, automatic, effortless, and influenced by affect (Kahneman, 2003). By contrast, System 2 is linked to reasoning and is consequently rather slow, controlled, effortful, and neutral (Kahneman, 2003). Thus, judgments that are based on System 1 are intuitive judgments, whereas judgments based on System 2 are more deliberate or analytical (Leuders et al., 2018; Loibl et al., 2020).

When teachers generate judgments on the task value and the self-efficacy that students might attribute to a word, modelling, or intramathematical problem, these judgments can be intuitive or deliberate. Both systems of judgments can rely either on observers’ own perceptions of value and self-efficacy regarding these types of problems or on observers’ perceptions of students’ possible attributions of value and self-efficacy with respect to mathematical problems. An intuitive judgment would be shaped by the information that is readily available at the time when the judgment is made, such as prior experience with the type of problem or mathematical problems in general, the current emotional and motivational state of the teacher, and the students who are to be judged. This means that teachers’ own task value and self-efficacy may influence their diagnostic judgments. The deliberate judgment would incorporate educational and affective theories about the types of mathematical problems and the prerequisites for motivation to emerge, for example, perceptions of a problem’s characteristics (e.g., authenticity or relation to reality), which in turn might affect judgments of students’ motivation. Furthermore, students’ average emotional and affective state over time would be part of the judgment as well. Reasoning based on System 2 can discard or keep the intuitive judgment made by System 1.

2.5 Preservice teachers’ judgments of students’ task value and self-efficacy

Mostly moderate to small correlations were found when teachers’ judgments were focused on students’ emotions and motivation (Urhahne & Zhu, 2015; Zhu & Urhahne, 2021). These findings are in line with research by Rellensmann & Schukajlow (2017, 2018), who investigated preservice teachers’ judgment accuracy regarding enjoyment, boredom, and interest in solving mathematical problems. These findings are supported by studies on inservice teachers’ judgments of students’ self-efficacy and self-concepts from other domains (Givvin et al., 2001; Praetorius et al., 2013; Spinath, 2005). Further, prior research has demonstrated that preservice teachers overestimate students’ enjoyment and interest in problems related to reality and underestimate students’ boredom in solving intramathematical problems (Rellensmann & Schukajlow, 2017, 2018). As value is positively related to interest and enjoyment, preservice teachers might perceive that students value modelling and word problems more than they value intramathematical problems. No clear predictions can be made for judgments of students’ self-efficacy.

As preservice teachers at the beginning of their teacher education have limited pedagogical content knowledge about mathematical problems and students’ motivation (Krauss et al., 2008), they may be likely to rely on intuitive judgments (System 1) when making judgments about students’ task value and self-efficacy. However, differences between students’ emotions and interest and preservice teachers’ judgments of students’ affect at the very beginning of teacher training (Rellensmann & Schukajlow, 2018) revealed that preservice teachers adopted a reason-based system for making diagnostic judgments. These results indicate that System 2 (deliberate judgments, according to Kahneman, 2003) might also be important for preservice teachers’ judgments. When making deliberate judgments, preservice teachers’ own task value and self-efficacy can be expected to be significantly different from their judgments of how hypothetical students might value a given problem and how confident such students would feel about solving the problem. These discrepancies would be expected to appear as large differences in the level component and low rank correlations between preservice teachers’ own judgments and their judgments of hypothetical students.

3 Research questions and expectations in the present study

RQ1: How do preservice teachers rate their own task value and self-efficacy across modelling, word, and intramathematical problems? We did not have clear expectations for differences in preservice teachers’ own task values. On the one hand, they might be able to see a connection to reality as an additional source of value, but on the other hand, they might attribute higher cost and lower attainment and utility value to modelling and word problems. Further, we expected that preservice teachers would feel equally confident in solving each type of problem because of their high mathematical content knowledge and strong mastery experience.

RQ2: How do preservice teachers diagnostically judge students’ task value and self-efficacy across modelling, word, and intramathematical problems? We hypothesized that preservice teachers’ judgments of students’ valuing of modelling and word problems would be higher than their judgments of students’ valuing of intramathematical problems because preservice teachers expect that the extent to which a problem is related to reality is important for school students’ task value. We could not make a prediction for preservice teachers’ diagnostic judgments of students’ self-efficacy.

RQ3: To what extent are preservice teachers’ own task value and self-efficacy regarding modelling, word, and intramathematical problems similar to preservice teachers’ judgments of students’ task value and self-efficacy for the three types of problems? We expected deliberate judgments on the basis of prior results (Rellensmann & Schukajlow, 2018), which would result in significant differences and low or zero correlations between preservice teachers’ own task values and self-efficacy and their judgments of students’ motivation.

4 Method

4.1 Sample

Our sample comprised 182 preservice teachers (75.3% women) who will teach grades 5 to 10, mainly at German middle- and low-track schools (German Hauptschule, Realschule, Sekundarschule or Gesamtschule). Their mean age was M = 23.45 years (SD = 6.23). They participated in the study voluntarily as part of regular university courses; 75.82% attended master’s courses, and thus, they were familiar with mathematical modelling because mathematical modelling is part of the introductory course on mathematics education in bachelor studies at the university where the sample was recruited. After graduating with a Master of Education degree, the preservice teachers in this study will be permitted to teach students in Grades 5 to 10. We chose to investigate preservice teachers in this study because, as Rellensmann & Schukajlow (2018) postulated, in diagnosing the proximate construct of interest, the desirable shift toward deliberate analytical decisions about learning material may occur as early as the beginning of teacher education.

4.2 Procedure and instruments

The preservice teachers were randomly assigned to one of two groups and were given a questionnaire with twelve mathematical problems. One group of preservice teachers was asked to answer a questionnaire about the value they would attribute to each problem and their confidence in their ability to solve each problem (RQ1), whereas the other group was asked about how hypothetical students might value the problems and how confident these students would be in their abilities to solve the problems (RQ2). The preservice teachers were not required to solve the problems because we assumed that their knowledge was sufficient to solve the middle-school problems. We chose a between-subjects design, divided the sample into two parts, and distributed different questionnaires to each group. We used this design to avoid having to ask the preservice teachers to switch between the teachers’ and the students’ perspectives during the test, which might have resulted in biased responses.

4.2.1 Mathematical problems

We used four modelling problems, four ‘dressed up’ word problems, and four intramathematical problems in this study. The problems were developed and piloted in a previous study (Schukajlow et al., 2012). Each problem could be solved by applying the Pythagorean theorem. Sample problems are presented in Figs. 1, 2, and 3.

4.2.2 Task value and self-efficacy scales

To assess the preservice teachers’ task value and self-efficacy, we used a well-evaluated scale developed in the study by Schukajlow et al. (2012). Both groups were given the following instructions: “Read each task carefully and then answer some questions. You do not have to solve the problems!” Depending on the group they were assigned to, the statements in the questionnaire had to be answered from their own perspective as preservice teachers, or from the perspective of a hypothetical ninth grader.

Preservice teachers’ own task value and self-efficacy. For each of the twelve problems, the preservice teachers were asked to rate statements about the value of the task and their self-efficacy with respect to the problem. The statement measuring the task value was “I think it is important to be able to solve this problem”; for self-efficacy, it was “I am confident I can solve this problem.”

Preservice teachers’ diagnostic judgments of students’ task value and self-efficacy. The preservice teachers had to rate statements about how hypothetical students might value the twelve problems (“Students think it is important to be able to solve this problem”) and how confident the students would be in their ability to solve the problem (“The students are confident that they can solve this problem”).

A 5-point Likert scale (1 = not at all true, 5 = completely true) was used to score the statements regarding the extent to which the preservice teachers agreed with the statement. The scores for each of the twelve statements were averaged for each type of problem (modelling, word, and intramathematical problems). The scale reliabilities were mostly acceptable (see Table 1). As the reliability of the scales for judgments of students’ self-efficacy was slightly below the cut-off value of 0.65, results involving these scales should be interpreted with caution.
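The scoring and the reliability estimate described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard formula for Cronbach’s α with sample variances; the function names and the ratings are hypothetical and are not taken from the study data.

```python
from statistics import mean, variance

def scale_score(item_ratings):
    """Average one respondent's Likert ratings for the problems of one type."""
    return mean(item_ratings)

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item ratings."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the respondents' summed scores.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical ratings (1-5) of five respondents on the four problems of one type.
hypothetical_responses = [
    [4, 4, 5, 4],
    [3, 3, 3, 4],
    [5, 5, 4, 5],
    [2, 3, 2, 2],
    [4, 3, 4, 4],
]

scores = [scale_score(row) for row in hypothetical_responses]
alpha = cronbach_alpha(hypothetical_responses)
print(scores, round(alpha, 2))
```

With four items per problem type, α is computed separately for each type, construct, and perspective, which gives the twelve coefficients that a table such as Table 1 would contain.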

Table 1 Reliability (Cronbach’s α) for task values and self-efficacy depending on the preservice teachers’ perspective

4.2.3 Level and rank components

To address RQ3, we computed the level and rank components (Behrmann & Souvignier, 2013; Helmke & Schrader, 1987). The level and rank components can be used to estimate the degree of conformity between judgments (Helmke & Schrader, 1987). We asked one group of preservice teachers to rate their own task value and self-efficacy for modelling, word, and intramathematical problems and another group to judge how hypothetical school students might value the different tasks. Afterwards, we computed the level component (Helmke & Schrader, 1987) to describe the differences between the two perspectives. According to Behrmann & Souvignier (2013), the level component describes how the preservice teachers might over- or under-estimate absolute task value and self-efficacy when switching between the school students’ and their own perspective. To compute the level component, the mean task value or self-efficacy from the preservice teachers’ own perspective is subtracted from the corresponding mean from the school students’ perspective. A negative score indicates an overestimation of the task value or self-efficacy from the preservice teachers’ own perspective. A score of 0 means there is no difference between the two groups’ ratings, which corresponds to an accurate judgment.
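As a minimal sketch, the level component for one type of problem is a difference between two group means. The sign convention used here (judged-student mean minus own mean) is an assumption, chosen so that a negative score means higher ratings from the preservice teachers’ own perspective; the ratings are hypothetical.

```python
from statistics import mean

def level_component(judged_student_ratings, own_ratings):
    """Mean judged-student rating minus mean own rating for one problem type."""
    return mean(judged_student_ratings) - mean(own_ratings)

# Hypothetical scale scores (1-5) for one problem type in the two groups.
own_ratings = [4.0, 3.5, 4.5, 3.75]
judged_student_ratings = [3.0, 2.5, 3.25, 2.75]

level = level_component(judged_student_ratings, own_ratings)
print(round(level, 2))  # negative: own ratings are higher than judged ratings
```

Because the two perspectives were rated by different groups, the difference is taken between group means rather than within persons.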

To investigate the differences between the two groups, we ranked the twelve problems that were given to the participants regarding the reported mean levels of task value and self-efficacy. The rank component was assessed via correlations between the observers’ judgments and the investigated characteristics (Behrmann & Souvignier, 2013; Helmke & Schrader, 1987). The rank component indicates whether the preservice teachers’ own ranking of the problems by their mean scores on task value or self-efficacy corresponds adequately with the rank order from the school students’ perspective. Therefore, we constructed a rank order for each of the three types of problems and computed the Spearman rank correlation for every single preservice teacher between the preservice teachers’ rank ordering of the problems and the rank ordering of how the preservice teachers judged the problems from the school students’ perspective. After that, these rank-order coefficients were Fisher z transformed (Salkind & Rasmussen, 2007), and means were computed. These means were subsequently transformed back into correlation coefficients so that we could interpret the coefficients on an interval scale.
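The rank-component steps (Spearman rank correlation, Fisher z transformation, averaging, back-transformation) can be sketched as follows. This is an illustration under stated assumptions: ties receive average ranks, every coefficient lies strictly between -1 and 1 (the z transformation is undefined at the boundaries), and all names and ratings are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1 upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; assumes both lists vary (non-zero variance)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def mean_correlation(coefficients):
    """Average correlations via Fisher z (atanh) and transform back (tanh)."""
    z_values = [math.atanh(r) for r in coefficients]
    return math.tanh(sum(z_values) / len(z_values))

# Hypothetical mean ratings of four problems from the two perspectives.
own_means = [3.9, 3.6, 4.1, 3.8]
judged_means = [2.8, 2.5, 2.7, 3.0]

rho = spearman(own_means, judged_means)
averaged = mean_correlation([0.2, 0.6])
print(round(rho, 2), round(averaged, 2))
```

Averaging in the z metric rather than averaging the raw coefficients reduces the bias that arises because correlations are bounded and not on an interval scale.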

5 Results

5.1 Preliminary analyses

To estimate whether the scales could be separated statistically, we computed confirmatory factor analyses (CFA). The CFAs showed that the goodness of fit values were slightly below the typical cut-off values (typical cut-off values are CFI = 0.90, TLI = 0.90, and RMSEA = 0.08; Bentler & Bonett, 1980; Browne & Cudeck, 1992) for the three-factor model of problem types from preservice teachers’ own perspective for task value, χ²(51) = 115.169, p < .001; CFI = 0.909, TLI = 0.882, RMSEA = 0.114, and self-efficacy, χ²(51) = 107.546, p < .001; CFI = 0.922, TLI = 0.898, RMSEA = 0.107, and for judgments of school students’ task value, χ²(51) = 69.837, p > .05; CFI = 0.937, TLI = 0.919, RMSEA = 0.066, and self-efficacy, χ²(51) = 97.876, p < .01; CFI = 0.793, TLI = 0.733, RMSEA = 0.104. In addition, we collected evidence of the structural validity of the three-factor model by calculating the fit of the two-factor model (problems without a connection to reality vs. problems with a connection to reality) and the one-factor model (mathematical problems). The analysis revealed that the two other models did not fit better than the three-factor model, which was derived on the basis of the theory of mathematical problems.

The correlations between the preservice teachers’ task value and self-efficacy across the modelling, word, and intramathematical problems as presented in Table 2 are in line with expectancy-value theory (i.e., task value and self-efficacy are positively related). There were no missing values in the data used in the analyses.

Table 2 Pearson correlations between preservice teachers’ task values and self-efficacy

5.2 Preservice teachers’ own task value and self-efficacy across modelling, word, and intramathematical problems

To address RQ1 and RQ2, we used t tests for dependent samples with a Bonferroni correction. We found that preservice teachers’ own task value for modelling problems (M_Mod = 3.75, SD_Mod = 0.75) was lower than their task value for word problems (M_Word = 3.94, SD_Word = 0.82), t(96) = -3.64, p < .001, Cohen’s d = 0.37, and intramathematical problems (M_Int = 3.93, SD_Int = 0.84), t(96) = -2.65, p < .01, d = 0.27. No statistically significant difference was found between preservice teachers’ own task value for word and intramathematical problems, t(96) = 0.1, p > .10 (see graphical illustration in Fig. 4).

Further, we found that the preservice teachers’ own self-efficacy for modelling problems (M_Mod = 4.34, SD_Mod = 0.64) was lower than their self-efficacy for word problems (M_Word = 4.51, SD_Word = 0.65), t(96) = -4.57, p < .01, d = 0.46, and for intramathematical problems (M_Int = 4.46, SD_Int = 0.66), t(96) = -2.95, p < .05, d = 0.30. No statistically significant difference was found between preservice teachers’ own self-efficacy for word and intramathematical problems, t(96) = 1.21, p > .10.
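The dependent-samples comparisons can be illustrated with a minimal Python sketch. Several points are assumptions rather than details given in the text: the effect size is computed as the mean difference divided by the standard deviation of the differences, the Bonferroni correction is shown for three pairwise comparisons, and the ratings are hypothetical. The text does not say which variant of Cohen’s d was used, but the reported values agree with |t| / √n, where n = 97 follows from the 96 degrees of freedom.

```python
import math
from statistics import mean, stdev

def paired_t_test(first, second):
    """Return (t, df, d_z) for two lists of ratings from the same respondents."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    # Effect size: mean difference in units of the SD of the differences.
    d_z = mean(differences) / stdev(differences)
    t = d_z * math.sqrt(n)
    return t, n - 1, d_z

def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under a Bonferroni correction."""
    return alpha / n_comparisons

# Hypothetical scale scores of five respondents for two problem types.
modelling = [3, 4, 5, 4, 3]
word = [4, 4, 5, 5, 4]

t, df, d_z = paired_t_test(modelling, word)
adjusted_alpha = bonferroni_alpha(0.05, 3)  # three comparisons is an assumption

# Consistency check against the reported task-value result, t(96) = -3.64.
d_from_reported_t = abs(-3.64) / math.sqrt(96 + 1)
print(round(t, 2), df, round(d_from_reported_t, 2))
```

The same check reproduces the other reported effect sizes for this group, for example 4.57 / √97 ≈ 0.46.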

Fig. 4

Preservice teachers’ own mean levels of task value and self-efficacy for modelling, word, and intramathematical problems. Error bars represent standard errors

5.3 Preservice teachers’ diagnostic judgments of school students’ task value and self-efficacy across modelling, word, and intramathematical problems

We found that the preservice teachers’ diagnostic judgments of students’ task value for modelling problems (MMod = 2.70, SDMod = 0.64) were lower than for word problems (MWord = 3.20, SDWord = 0.67), t(84) = 8.29, p < .01, d = 0.90, and higher than for intramathematical problems (MInt = 2.61, SDInt = 0.79), t(84) = 3.75, p < .01, d = 0.41. The difference in the diagnostically judged task value for word and intramathematical problems was significant, t(84) = 9.45, p < .01, d = 1.02. The results are displayed graphically in Fig. 5.

The analysis of self-efficacy revealed similar patterns. We found that preservice teachers’ diagnostic judgments of students’ self-efficacy for modelling problems (MMod = 3.38, SDMod = 0.59) were lower than for word problems (MWord = 3.78, SDWord = 0.49), t(84) = 7.50, p < .01, d = 0.81. No difference was found between preservice teachers’ diagnostically judged self-efficacy for modelling and intramathematical problems (MInt = 3.40, SDInt = 0.61), t(84) = -0.25, p > .10. The difference in the diagnostically judged self-efficacy for word and intramathematical problems was significant, t(84) = 6.93, p < .01, d = 0.75.

Fig. 5

Preservice teachers’ mean judgments of hypothetical school students’ task value and self-efficacy for modelling, word, and intramathematical problems. Error bars represent standard errors

5.4 Comparison of preservice teachers’ own task value and self-efficacy for modelling, word, and intramathematical problems and their judgments of school students’ task value and self-efficacy regarding the three types of problems

Level component. To address RQ3 regarding the level component, we used pairwise t tests for independent samples with a Bonferroni correction. We found that preservice teachers attributed lower task value and self-efficacy to every type of problem when rating the problems from the school students’ perspective. The mean difference score between the two groups regarding the task value attributed to the modelling problems was M = -1.04 (SD = 0.24), t(180) = 10.028, p < .01, d = 1.49. It was M = -0.74 (SD = 0.16), t(180) = 6.589, p < .01, d = 0.98, for word problems and M = -1.47 (SD = 0.06), t(180) = 12.152, p < .01, d = 1.81, for intramathematical problems. Looking at task-specific self-efficacy, the mean difference score for modelling problems was M = -0.96 (SD = 0.13), t(180) = 10.403, p < .01, d = 1.55. It was M = -0.73 (SD = 0.08), t(180) = 8.411, p < .01, d = 1.25, for word problems and M = -1.07 (SD = 0.33), t(180) = 11.302, p < .01, d = 1.68, for intramathematical problems.
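For the between-group comparisons, the link between t and d can be sketched in the same spirit. The sketch assumes d = |t| * sqrt(1/n1 + 1/n2) and group sizes of 97 and 85, which are inferred from the degrees of freedom reported in Sections 5.2 and 5.3 (96 and 84) and are consistent with df = 180 here; neither the formula nor the group sizes are stated explicitly in this section, and the function name is hypothetical.

```python
import math


def cohens_d_independent(t, n1, n2):
    """Effect size for two independent groups, assuming d = |t| * sqrt(1/n1 + 1/n2)."""
    return abs(t) * math.sqrt(1.0 / n1 + 1.0 / n2)


# Group sizes inferred from the degrees of freedom (assumption): 96 + 1 and 84 + 1.
N_OWN, N_JUDGED = 97, 85

# (comparison, reported t, reported d) taken from the level-component paragraph.
REPORTED = [
    ("task value, modelling", 10.028, 1.49),
    ("task value, word", 6.589, 0.98),
    ("task value, intramathematical", 12.152, 1.81),
    ("self-efficacy, modelling", 10.403, 1.55),
    ("self-efficacy, word", 8.411, 1.25),
    ("self-efficacy, intramathematical", 11.302, 1.68),
]

if __name__ == "__main__":
    assert N_OWN + N_JUDGED - 2 == 180  # matches the reported df
    for label, t, d_reported in REPORTED:
        d = cohens_d_independent(t, N_OWN, N_JUDGED)
        print(f"{label}: d = {d:.2f} (reported {d_reported:.2f})")
```

Under these assumptions, the recomputed effect sizes match the reported ones to two decimal places.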

Rank component. For the task value attributed to modelling problems, the mean rank-order coefficient was Mr = 0.51, ranging from r = − .77 to a perfect positive correlation. For word problems, the mean rank-order coefficient was Mr = 0.28, ranging from r = − .77 to r = .95, and for intramathematical problems, the mean rank-order coefficient was Mr = 0.21, ranging from r = − .95 to r = .95. Regarding self-efficacy, the mean rank-order coefficient for modelling problems was Mr = − 0.06, ranging from r = − .95 to r = .95. For word problems, the mean rank-order coefficient was Mr = 0.36, ranging from r = − .95 to r = .95, and for intramathematical problems, the mean rank-order coefficient was Mr = 0.62, ranging from r = − .89 to r = 1.00. Overall, we found a positive relationship between preservice teachers’ own valuing and diagnostic judgments of school students’ task value across problems, whereas the relationship for self-efficacy ranged from about zero for modelling problems to a strong relationship for intramathematical problems.
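A rank-order coefficient of this kind can be illustrated with a minimal sketch of a Spearman correlation that assigns average ranks to ties. How own ratings and judgments were paired in the study is not described in this section, so the sketch simply takes two equally long lists of ratings for the same problems; all ratings below are hypothetical, as are the function names.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either list has no variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return math.nan
    return sxy / math.sqrt(sxx * syy)


def rank_order_coefficient(own, judged):
    """Spearman correlation computed as the Pearson correlation of average ranks."""
    return pearson(average_ranks(own), average_ranks(judged))


def mean_coefficient(pairs):
    """Mean of the defined (non-NaN) coefficients across rating pairs."""
    coefficients = [rank_order_coefficient(own, judged) for own, judged in pairs]
    defined = [r for r in coefficients if not math.isnan(r)]
    return sum(defined) / len(defined)


# Hypothetical ratings of four problems of one type (not the study's data).
HYPOTHETICAL_PAIRS = [
    ([2, 3, 4, 5], [2, 3, 3, 4]),  # one tie in the judged ratings
    ([5, 4, 3, 2], [2, 3, 4, 5]),  # exactly reversed order
    ([3, 4, 2, 5], [3, 4, 2, 5]),  # identical order
]

if __name__ == "__main__":
    for own, judged in HYPOTHETICAL_PAIRS:
        print(own, judged, round(rank_order_coefficient(own, judged), 2))
    print("mean:", round(mean_coefficient(HYPOTHETICAL_PAIRS), 2))
```

With only a few problems per type, a coefficient can take only a limited set of values, and a single tie already moves an otherwise perfect ordering below 1, which is one reason why the individual coefficients can vary widely around their mean.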

6 Discussion

The aim of the present study was to investigate the roles that the types of problems and preservice teachers’ own ratings play in determining preservice teachers’ diagnostic judgments of school students’ task value and self-efficacy.

6.1 Preservice teachers’ own task value and self-efficacy

Task value and self-efficacy are important for learning (Eccles et al., 1983). Therefore, we analyzed how preservice teachers rated their own task value and self-efficacy across modelling, word, and intramathematical problems. Contrary to our expectations, preservice teachers reported significantly lower task value for modelling problems than for word and intramathematical problems. This result suggests that preservice teachers believe that the ability to solve problems with a meaningful relation to reality is less important than the ability to solve word or intramathematical problems. Even though the literature suggests that solving modelling problems should increase school students’ motivation, and therefore the preservice teachers’ own motivation, the current results are in line with prior studies on school students’ motivation and affect (Krawitz & Schukajlow, 2018; Rellensmann & Schukajlow, 2017, 2018; Schukajlow et al., 2012). How can we explain these findings? Firstly, the meaningful relation to reality in modelling problems has been suggested to increase the value that is attributed to the given problems, but the potentially higher perceived cognitive and noncognitive costs that come with modelling problems might reduce their value. Secondly, because modelling problems are not common in schools or universities, their attainment and utility value might be rated as low, as might the overall task value of modelling problems.

Our analyses showed that preservice teachers felt less confident about solving modelling problems than ‘dressed up’ word and intramathematical problems. This result is unexpected, because the mathematical content knowledge was familiar. A possible explanation is that the preservice teachers expected to encounter modelling-specific obstacles during the solution process (e.g., making assumptions about missing information, structuring and validating the models). This result confirms prior observations that solving modelling problems is perceived as demanding not only for students but also for teachers (Niss & Blum, 2020). From the view of social-cognitive theory (Bandura, 2003), low self-efficacy might result from a lack of mastery experience in solving modelling problems. This effect was found in school students (Usher & Pajares, 2009) and might also exist in preservice teachers. The lower self-efficacy for modelling problems might also have a negative effect on preservice teachers’ perceived task value for modelling problems, as proposed in motivational theories (Bandura, 2003; Eccles & Wigfield, 2002). The positive correlation between the two constructs in our study (r = .32) supports these considerations.

One theoretical implication of these results is the importance of addressing different problems in the assessment of task value and self-efficacy, as motivation seems to depend greatly on the types of problems in the domain of mathematics. Further, we concur with Krauss et al. (2008), who suggested that knowledge about different types of problems should be an important part of instruction in preservice teachers’ learning programs. As a practical implication, we suggest that modelling problems be addressed more often in school and university curricula to enhance the development of value components in school students and preservice teachers regarding this type of problem, and to give school students a chance to have mastery experiences in solving these problems, as mastery experience is a key source of self-efficacy expectations. This is particularly important because modelling problems are rarely addressed in school practice, and preservice teachers are not familiar with this type of problem at the beginning of their studies.

6.2 Preservice teachers’ diagnostic judgments of school students’ task value and self-efficacy

Although motivational facets of learning and instruction are important for creating learning environments, diagnostic judgments about school students’ motivation in mathematics have rarely been addressed in prior research (Südkamp et al., 2012; Urhahne & Wijnia, 2021; Urhahne & Zhu, 2015). In this study, we analyzed differences in preservice teachers’ diagnostic judgments of school students’ task value and self-efficacy across modelling, word, and intramathematical problems.

The preservice teachers reported that school students would attribute the highest task value to solving ‘dressed up’ word problems, followed by a moderate value for solving modelling problems, and the lowest value for solving intramathematical problems. This result is in line with our expectations of a higher ranking of real-world problems than of intramathematical problems. From the preservice teachers’ view, the relation to reality seems to be an important additional source of school students’ value. However, preservice teachers unexpectedly ranked the value of solving word problems higher than the value of solving modelling problems, even though the importance of solving modelling problems is part of the curriculum in Germany and many countries in the world (Kaiser & Sriraman, 2006; Niss & Blum, 2020; Schukajlow et al., 2021b). One possible explanation for this finding could be the preservice teachers’ lack of content knowledge concerning the importance of modelling problems compared with word problems. Further, preservice teachers might have considered the high level of difficulty in modelling problems for students and the high cost of solving these problems compared with word problems.

The preservice teachers judged that hypothetical ninth graders would feel most confident about solving word problems and less confident about solving modelling and intramathematical problems. Given that studies using samples of school students found that they had higher self-efficacy in solving word problems compared with two other types of problems (Krawitz & Schukajlow, 2018), preservice teachers seem to have a good starting point from which to make accurate judgments of self-efficacy with respect to these types of problems.

Our results add to prior research on judgments of school students’ affect in mathematics (Südkamp et al., 2012). We demonstrated that preservice teachers’ positive perceptions of school students’ emotions and interest in real-world problems found in prior studies are also valid for task value (Rellensmann & Schukajlow, 2017, 2018). One novel finding is that teachers perceived that students attribute greater value to word problems than to modelling problems. One explanation for this result might be that preservice teachers believe that school students will be able to solve word problems (reflected in their diagnostic judgments of school students’ high self-efficacy in solving word problems).

6.3 Comparisons of preservice teachers’ own task value and self-efficacy and their diagnostic judgments of school students’ task value and self-efficacy

Diagnostic judgments can positively influence the quality of teaching and learning (Helmke & Schrader, 1987). To date, the foundation of preservice teachers’ diagnostic judgments has not been clarified. In this study, we addressed this research gap by analyzing the relationship between preservice teachers’ own motivation in solving modelling, word, and intramathematical problems, and preservice teachers’ judgments of school students’ motivation to solve these three types of problems. In line with our expectations, we found that preservice teachers valued solving problems more highly and reported higher self-efficacy than they judged hypothetical ninth graders would. Consequently, preservice teachers might ascribe higher motivation to themselves than to the average school student. Indeed, according to EVT, motivation is an important factor for future career choices (Eccles & Wigfield, 2020).

Our results also support expectations about the importance of preservice teachers’ own task value for their diagnostic judgments of school students’ task value, whereas the results for self-efficacy are mixed and strongly depend on the types of problems. The zero correlation found between preservice teachers’ own self-efficacy and their diagnostic judgments of school students’ self-efficacy in solving modelling problems might have resulted from these teachers’ lack of experience with this type of problem. The average magnitude of the relationship between the teachers’ ratings of their own motivation and their diagnostic judgments of school students’ motivation ranged from zero to moderate. These results support the hypothesis that a switch in perspective occurs in preservice teachers (Rellensmann & Schukajlow, 2017, 2018), and—even though they might still lack a deep level of pedagogical content knowledge (Krauss et al., 2008) or mastery experience—their diagnostic judgments are more deliberate than intuitive. Consequently, preservice teachers’ reasoning is likely to be based on System 2 (Kahneman, 2003), which relies on critical evaluations when making judgments.

The main theoretical implication of these results is that both preservice teachers’ own motivation and the object of motivation—reflected in the types of problems in our study—contribute to the diagnostic judgments. These two factors of diagnostic judgments should be considered in teacher education by addressing these topics in educational programs.

6.4 Limitations of the present study

One important limitation of the present study was that preservice teachers were asked to judge how hypothetical ninth graders would rate their own motivation. We used the wording “hypothetical ninth graders” because it increased the scale’s objectivity. However, the results may differ for judgments of familiar school students who are being taught by the respective preservice teachers for a long period of time (Südkamp et al., 2012). Additionally, this study did not take into account some important reasons for preservice teachers’ judgments of the hypothetical ninth graders. Such reasons may include the teachers’ own experiences as school students and other factors. The choice of the Pythagorean theorem as the mathematical content for the problems was based on the importance of this content area in curricula. In future studies, other content areas should be addressed as motivation can depend on the content area (Krawitz & Schukajlow, 2018). For the questionnaire, single task-specific statements were used to assess task value and self-efficacy, built on well-evaluated scales by Schukajlow et al. (2012). The benefits of this assessment for time and test motivation are well known. Addressing different components of motivation (e.g., using attainment value, intrinsic value, utility value, and cost scales) will offer deeper insights into the nature of preservice teachers’ motivation and diagnostic judgments. In this study, we used only a proximal indicator of preservice teachers’ experience in modelling (bachelor vs. master’s degree studies in mathematics education). In future studies, experience with the types of problems that are being assessed should be included in the analysis of the diagnostic judgments.

Furthermore, although the between-subjects design used in this study has benefits, it also limits the conclusions that can be drawn from the results for our third research question. These results should first and foremost be interpreted as hints for future studies. They also suggest that the question of whether preservice teachers’ diagnostic judgments are based on System 2 still needs to be investigated and that a within-subject design is needed in future studies.

Another important limitation of the present study is the relatively weak fit of the three-factor model that we used to distinguish between the types of problems. A better model fit would increase the construct validity of assessments of motivation regarding the three types of problems. However, whether this classification of the type of problem is comprehensive for students also remains an open research question.

7 Conclusions

Our study demonstrates the importance of preservice teachers’ own motivation and the significance of the object of motivation (type of problem) for diagnostic judgments of school students’ motivation. Further, our results provide initial indications that preservice teachers do not rely exclusively on intuitive diagnostic judgments but that their diagnostic judgments might be more deliberate and might take into account the tasks’ and school students’ characteristics. These results have theoretical implications, as we suggest that these factors should be taken into account in models of diagnostic judgments. Practical implications for teacher education programs are that information about school students’ motivation should be included and that possible sources of motivation regarding different types of problems should be discussed in university courses for future teachers.