Introduction

By applying selection procedures, medical schools aim to admit motivated students who will perform well in their studies (Turner and Nicholson 2011). However, the currently available selection tools, which are usually combined in selection procedures, appear to be suboptimal for identifying the most suitable candidates. While academic records, multiple mini-interviews (MMIs), aptitude tests, situational judgement tests and selection centres are among the most promising selection tools, none of these have proven to be perfect in terms of reliability, validity, fairness and cost-effectiveness (Cleland et al. 2012; Patterson et al. 2016). Evidence for the added value of costly selection procedures compared to a weighted lottery procedure, which was applied in the Netherlands for many years, is not unequivocal. The literature on selection mainly contains single-site studies and studies investigating selection tools in isolation, rather than combinations of selection tools (Patterson et al. 2016). This multi-site study aims to fill these gaps by examining the value of (different) selection procedures compared to a weighted lottery procedure, which is weighted for pre-university grade point average (pu-GPA) and includes direct admission for students with top-pu-GPAs (≥8 out of 10) (Ten Cate 2007). Outcomes of interest were student performances, as well as motivation (for studying medicine) and engagement in learning, because these variables are deemed important for the learning, performance and well-being of students (Casuso-Holgado et al. 2013; Prins et al. 2009; Williams et al. 1999). Motivation concerns the reasons people act in certain ways. These reasons can originate from within the person or from external factors. According to the Self-determination theory (SDT), autonomous motivation (AM) is seen when one does something out of genuine interest or because of a positive valuation of the activity. 
AM is associated with better learning, academic performance and well-being, compared to controlled motivation (CM), which is seen when one experiences internal or external pressure (Artino et al. 2010; Kusurkar et al. 2011a, 2013; Moulaert et al. 2004; Ryan and Deci 2000; Sobral 2004; Stegers-Jager et al. 2012; Vansteenkiste et al. 2005; Williams et al. 1999). Student engagement also contributes to better learning and academic performance of (medical) students (Carini et al. 2006; Casuso-Holgado et al. 2013; Schaufeli et al. 2002a; Svanum and Bigatti 2009) and has a negative relationship with burnout (Schaufeli et al. 2002b). Engagement is defined as “a positive, fulfilling, and work-related state of mind that is characterized by vigour, dedication, and absorption” (Schaufeli et al. 2002b).

The effects of the various admission pathways (i.e., admission based on a selection procedure, a weighted lottery procedure and top-pu-GPA) have been studied and the results are inconclusive. Whenever performance differences are found, however small they are, applying a selection procedure seems favourable over applying weighted lottery (de Visser et al. 2016; Lucieer et al. 2015; Schripsema et al. 2014; Urlings-Strop et al. 2009, 2011, 2013). However, the differences are often not statistically significant (de Visser et al. 2016; Hulsman et al. 2007; Lucieer et al. 2015; Schripsema et al. 2014; Stegers-Jager et al. 2015; Urlings-Strop et al. 2009). A more consistent finding is that top-pu-GPA students generally outperform selected and lottery-admitted students (de Visser et al. 2016; Schripsema et al. 2014). Different contexts of the single-site studies (such as selection procedures, proportion of students admitted through the different admission pathways) may have led to conflicting results. Moreover, a variety of outcome measures, mainly pertaining to the pre-clinical phase, has been used. Research on the motivation, especially quality of motivation, of students admitted through different pathways is scarce and the findings show a similar pattern. Either no significant differences (Nieuwhof et al. 2004; Wouters et al. 2016), or better quantity and quality of motivation among selected students have been reported (Hulsman et al. 2007; Kusurkar et al. 2010, 2013; Wouters et al. 2016). Despite its implications for learning and performance, engagement has not yet been studied as an outcome measure of selection. The first of three research questions addressed in this study is: Are different admission groups associated with differences in motivation, engagement and pre-clinical and clinical performance? 
Based on the literature, we hypothesize that top-pu-GPA students will outperform selected and lottery-admitted students, and selected students will outperform lottery-admitted students in pre-clinical and clinical education, while selected students will report higher AM and engagement than lottery-admitted and top-pu-GPA students.

Some researchers have made use of the fact that the lottery-admitted group consists of two types of students: students who had participated in selection and students who refrained from it. It has been argued that applicants who invest the time and effort necessary for participation in selection may perform better than those who refrain from it (Schripsema et al. 2014), suggesting that selection may attract a group of better-quality students. The evidence for this is scarce and findings are inconclusive. Students who had not participated in selection have been found to underperform compared to students who were selected or students who had enrolled through weighted lottery after being rejected in selection (de Visser et al. 2016; Schripsema et al. 2014), but these findings did not always reach significance (de Visser et al. 2016; Schripsema et al. 2014; Urlings-Strop et al. 2013). Thus, limited evidence supports the hypothesis that students who have participated in selection outperform those who have not. Motivation, as reflected in the preparation for selection, has been suggested as one reason why students who have participated might perform better (Schripsema et al. 2014, 2016; Wouters et al. 2016), but this assumption has not yet been investigated. The second research question addressed in this study is: Is participation in selection associated with differences in motivation, engagement and pre-clinical and clinical performance? Based on the literature, we hypothesize that students who participated in selection will outperform students who did not participate in selection and report higher AM and engagement.

Because selection procedures are usually costly, identifying which type of procedure is associated with the most desirable student characteristics can inform policy decisions. However, the mere presence of an effort-intensive selection procedure may be more important than the specific characteristics of the procedure. The single-institution nature of previous research has resulted in a relative lack of studies comparing different selection procedures. A study comparing students selected using cognitive criteria and students selected using non-cognitive criteria within one medical school showed no differences with regard to dropout and performance during the pre-clinical and clinical phases of medical study (Lucieer et al. 2015). The present study includes multiple institutions applying different selection procedures, enabling comparisons across medical schools. The third research question addressed in this study is: Are different selection procedures associated with differences in motivation, engagement and pre-clinical and clinical performance? Based on the limited evidence and the notion that the content of the selection procedure may be secondary to the presence of a selection procedure, we hypothesize that different types of selection procedures do not result in differences in motivation, engagement and pre-clinical and clinical performance.

Methods

Study design

This was a multi-site cross-sectional study using an online survey (Net Questionnaire) comprising questions on personal data and standard, validated questionnaires. The indicators of academic performance of the participating students were retrieved from student administrative databases.

Setting

This study was carried out at three of the eight Dutch medical schools: VUmc School of Medical Sciences Amsterdam (VUmc), Academic Medical Center Amsterdam (AMC), and University Medical Center Groningen (UMCG). Inclusion of these medical schools was based on differences in selection procedures and their use of selection at least since 2010 (for enabling inclusion of Year-4 performance). Medical study in the Netherlands consists of three years of pre-clinical education, followed by three years of clinical education, after which students obtain their medical degrees. Although small local differences may exist, we expect the curricula to be largely comparable because medical curricula across the Netherlands are all vertically integrated, student-centred (Ten Cate 2007) and driven by nationally standardized end terms (Van Herwaarden et al. 2009). An overview of the characteristics of the selection procedures of the different medical schools is provided in Table 1.

Table 1 Differences of the selection procedures of the three universities

Participants

In the 2013–2014 academic year, students were invited via e-mail (with two reminders) to participate in this study. Participation was voluntary, and the e-mail informed students about the aims of the study and the handling of data. At the beginning of the survey, students gave their informed consent. The sample consisted of students from Year-1 (pre-clinical phase) and Year-4 (clinical phase) because assessment in Year-1 is based on cognitive skills and assessment in Year-4, the first clinical year of the study, is mostly based on non-cognitive skills. Moreover, the selection procedures may be associated with pre-clinical and clinical performance to a different extent. For every ten participants, a gift card of €25 was awarded through random selection.

Outcome measures

Academic performance

Three measures were defined to represent academic performance in Year-1: course credits, GPA and professional behaviour. Course credits (European credits) obtained in the respective study year were used. At all medical schools, the maximum number of course credits per year was 60. GPA in the first year was the average of the first attempts on all knowledge tests. For VUmc, AMC and UMCG, respectively, six, five, and four tests were included. For professional behaviour, unsatisfactory, satisfactory or good judgments on professional development were examined. Similar to other researchers, we chose the achievement of good clerkship performance as an indicator of performance in Year-4 (Stegers-Jager et al. 2015). Good clerkship performance was defined as receiving a grade of 8 or higher out of 10 for at least half of the clerkships. Clinical educators tend to be reluctant to fail students for their clerkships (Daelmans et al. 2016) and clerkship grades are usually above average. By using good clerkship performance as an outcome measure, we aimed to identify the students who stood out by performing well in at least half of their clerkships. The final clerkship grade is a single grade that includes an assessment of professional behaviour. Year-4 was composed of six clerkships at VUmc and AMC and four clerkships at UMCG.
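The definition of good clerkship performance given above can be illustrated with a minimal sketch. The function name and the constant are hypothetical and are not taken from the study's analysis files; only the threshold (a grade of 8 or higher out of 10) and the "at least half of the clerkships" rule come from the text.

```python
from typing import Sequence

# Threshold stated in the text: a grade of 8 or higher out of 10.
GOOD_GRADE = 8.0


def good_clerkship_performance(grades: Sequence[float]) -> bool:
    """Return True if at least half of the clerkship grades are >= 8.

    Hypothetical helper for illustration only.
    """
    if not grades:
        raise ValueError("at least one clerkship grade is required")
    n_good = sum(1 for g in grades if g >= GOOD_GRADE)
    # "At least half": compare twice the count with the number of clerkships,
    # which avoids fractional thresholds for odd numbers of clerkships.
    return 2 * n_good >= len(grades)
```

Under this sketch, a student with six clerkships (as at VUmc and AMC) would need three grades of 8 or higher, and a student with four clerkships (as at UMCG) would need two.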

Motivation

Two measures were defined to represent motivation: strength of motivation and type of motivation (autonomous and controlled; AM and CM). We used the concept of motivation put forth by Self-determination Theory (SDT) (Deci and Ryan 1985). AM and CM were measured with the 16-item Academic Self-regulation Questionnaire (Vansteenkiste et al. 2009). Scores ranged from 1 (not important at all) to 5 (very important). Example items are “I am studying medicine because I want to learn new things” and “I am studying medicine because I want others to think I’m smart.” for autonomous and controlled motivation, respectively. Relative autonomous motivation (RAM) was calculated by subtracting the CM subscale score from the AM subscale score. Strength of motivation was measured with the 15-item Strength of Motivation for Medical School-Revised questionnaire (SMMS-R; Kusurkar et al. 2011b; Leibach and Stern 2013; Nieuwhof et al. 2004). An example item is “I would still choose medicine even if that meant I would never be able to go on holidays with my friends anymore”. Scores ranged from 1 (strongly disagree) to 5 (strongly agree).
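As an illustration of the RAM calculation described above, the following sketch subtracts a CM subscale score from an AM subscale score. It assumes that each subscale score is the mean of its item responses on the 1 to 5 scale; the text does not state whether means or sums were used, and the function names are hypothetical.

```python
from statistics import mean
from typing import Sequence


def subscale_score(items: Sequence[int]) -> float:
    """Mean item response on the 1-5 scale (assumption: mean, not sum)."""
    if any(not 1 <= x <= 5 for x in items):
        raise ValueError("item responses must lie between 1 and 5")
    return mean(items)


def relative_autonomous_motivation(
    am_items: Sequence[int], cm_items: Sequence[int]
) -> float:
    """RAM = AM subscale score minus CM subscale score."""
    return subscale_score(am_items) - subscale_score(cm_items)
```

With mean scores, RAM ranges from -4 (fully controlled) to +4 (fully autonomous); a positive value indicates that autonomous motivation outweighs controlled motivation.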

Engagement

The total score on the nine-item Utrecht Work Engagement Scale-Students (UWES-S-9) (Schaufeli et al. 2002a) represented students’ engagement. Students rated their level of engagement across the domains of vigour, absorption and dedication. An example item is “When I am studying, I forget everything else around me”. Scores ranged from 0 (never) to 6 (always).

Independent variables

To answer the three research questions, three independent variables were defined: admission group (selection, lottery, and top-pu-GPA), selection participation (participation and no participation in selection) and selection procedure (selection procedures A, B and C for the selection procedures at VUmc, AMC and UMCG, respectively).

Confounders

We investigated whether the variables age, gender, university, first-generation student, doctor parent, ethnicity, area of growing up, living situation and pu-GPA needed to be included as confounders in the final models, along with the independent variables. Ethnicity was defined using the definition of Statistics Netherlands (CBS; www.cbs.nl), which states that a person belongs to an ethnic minority group if at least one of his or her parents was born outside the Netherlands. These variables were indicated as possible confounders because previous research showed the importance of students’ background characteristics in performance and motivation (Kusurkar et al. 2011a; Stegers-Jager et al. 2015; Strauser et al. 2012).

Statistical analysis

For linear and dichotomous outcome variables, respectively, linear and binary logistic regression modelling was performed. First, we performed univariate regression analyses. Next, for every regression model, we investigated whether variables needed to be included as confounders in the final model based on a change in the regression coefficient of 10% or more and a significant association with the outcome variable (Twisk 2006). Wherever appropriate, we used the Bonferroni procedure for multiple comparison correction. Pu-GPA was not considered as a possible confounder in the analyses in which the different admission groups (top-pu-GPA, selection and lottery) were compared, because the top-pu-GPA group was, by definition, the group with the highest pu-GPAs. Analyses were performed using IBM SPSS Statistics for Windows Version 20.0 (IBM Corp., Armonk, NY, USA).
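The confounder criterion and the multiple-comparison correction described above can be sketched as follows. This is an illustration under stated assumptions, not the SPSS procedure used in the study: the change is taken relative to the unadjusted (crude) coefficient, the significance level of 0.05 is an assumed default, and all function names are hypothetical.

```python
def relative_change(b_crude: float, b_adjusted: float) -> float:
    """Absolute change in the coefficient, relative to the crude estimate."""
    if b_crude == 0:
        raise ValueError("crude coefficient must be non-zero")
    return abs(b_adjusted - b_crude) / abs(b_crude)


def is_confounder(
    b_crude: float,
    b_adjusted: float,
    p_outcome: float,
    alpha: float = 0.05,      # assumed significance level
    threshold: float = 0.10,  # 10% change, as stated in the text
) -> bool:
    """Both conditions must hold: coefficient change >= 10% and a
    significant association of the candidate variable with the outcome."""
    changed = relative_change(b_crude, b_adjusted) >= threshold
    return changed and p_outcome < alpha


def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance level under the Bonferroni procedure."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons
```

For example, with three pairwise comparisons between admission groups and an overall level of 0.05, each comparison would be tested at roughly 0.0167 under this sketch.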

Results

First, we provide the descriptives and the reliability tests of the used scales. Next, we report the results for each research question separately.

The 666 participants (response rate ≈35% across all three universities) included 387 Year-1 students and 273 Year-4 students. The average ages of the participants (18.7, 19.1 and 18.5 for Year-1 students and 23.2, 22.9 and 22.6 for Year-4 students from VUmc, AMC and UMCG, respectively) fairly reflected the average ages of their respective cohorts (19.3, 19.2 and 18.4 for Year-1 students and 24.2, 23.5 and 23.0 for Year-4 students from VUmc, AMC and UMCG). Female students were slightly overrepresented in our study sample (77.5, 68 and 78% for Year-1 students and 78.8, 73.3 and 72.4% for Year-4 students from VUmc, AMC and UMCG) compared to the respective cohorts (67.3, 59.2 and 78% for Year-1 students and 68.1, 68.7 and 67.2% for Year-4 students from VUmc, AMC and UMCG). Participants who enrolled in a graduate entry programme (n = 6) and participants admitted under special circumstances (n = 17) were excluded from the analyses. Seventy-six students (12%) were admitted based on top-pu-GPA, 75 students (12%) enrolled through a weighted lottery without having participated in a selection procedure, 82 students (13%) were admitted through a weighted lottery after being rejected in selection and 395 students (61%) were admitted through selection. We considered the admission pathway distribution in our sample to be a fair reflection of the population, based on the places assigned through selection at each medical school. Of the Year-1 participants, 62.5, 70.3 and 92.4% at VUmc, AMC and UMCG, respectively, were admitted through selection. Of the Year-4 participants, 44.2, 45.0 and 40% at VUmc, AMC and UMCG, respectively, were admitted through selection. A further breakdown by study year is provided in Supplementary File 1 (Appendix).

The Cronbach’s alpha values for reliability for the UWES-S-9, AM, CM and the SMMS-R were 0.90, 0.82, 0.84 and 0.79, respectively.
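For readers unfamiliar with the reliability coefficient reported above, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The function name and the data layout (one row per respondent, one column per item) are assumptions for illustration; the study itself computed these values in SPSS.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for responses[i][j] = respondent i's answer to item j."""
    if len(responses) < 2:
        raise ValueError("at least two respondents are required")
    k = len(responses[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Transpose rows (respondents) into columns (items).
    items = list(zip(*responses))
    sum_item_vars = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum_item_vars / total_var)
```

Values closer to 1 indicate that the items of a scale vary together; two perfectly correlated items with equal variances yield an alpha of exactly 1 in this sketch.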

Table 2 summarizes the means and standard deviations of all variables. The results of the regression analyses for Year-1 and Year-4 students and Pearson correlations between linear and dichotomous variables are depicted in Tables 3 and 4 and Supplementary File 1 (Appendix), respectively. The incidence of unsatisfactory judgments for professional behaviour was too low (1.4%) to conduct further analyses.

Table 2 Descriptives
Table 3 Results regression analyses Year-1
Table 4 Results regression analyses Year-4

Admission group

Students with top-pu-GPAs obtained higher GPAs (B = 0.526, p < 0.01) and were more likely to show good performance during their clerkships [Odds Ratio (OR) 1.218, p < 0.1] than selected students. Selected students reported higher strength of motivation in Year-1 than students with top-pu-GPAs (B = 2.581, p < 0.05). Analyses showed no significant associations between admission group and course credits, engagement, strength of motivation in Year-4, AM, CM and RAM. These findings partly support our hypothesis that top-pu-GPA students would outperform other students, while selected students would report higher AM and engagement than top-pu-GPA and lottery-admitted students. Top-pu-GPA students performed best, and selected students reported higher strength of motivation, but only in Year-1. Selected students did not show better quality of motivation and engagement than the other students.

Participation in selection

Students who had participated in selection were more likely to show good performance during their clerkships (OR 2.883, p < 0.01) and reported significantly higher engagement in Year-4 (B = 0.317, p < 0.05) than students who had not participated. Analyses showed no significant associations with performance and engagement in Year-1, AM, CM, RAM and strength of motivation. Our hypothesis that students who participated in selection would outperform others and show better motivation and engagement was not supported for Year-1, because there were no differences in this regard. Our hypothesis was partly supported for Year-4 students.

Type of selection procedure

Year-1

Analyses showed significant associations between type of selection procedure and performance and strength of motivation in Year-1. Procedure C was associated with more course credits than Procedure A (B = 3.404, p < 0.05). Procedure B was associated with higher GPAs than Procedures A (B = 1.248, p < 0.01) and C (B = 0.995, p < 0.01). In addition, Procedure B was associated with higher strength of motivation in Year-1 than Procedures A (B = 2.770, p < 0.01) and C (B = 1.170, p < 0.1). Analyses showed no significant associations with performance in Year-4, AM, CM, RAM, engagement and strength of motivation in Year-4. Our hypothesis that students admitted through the three different selection procedures would not show differences was supported for Year-4 and only partly supported for Year-1. In Year-1, type of motivation and engagement were similar among students selected through the different procedures. Procedure B was associated with higher GPAs and strength of motivation than Procedures A and C, and Procedure C was associated with more course credits than Procedure A.

Discussion

Building on previous literature, this multi-site study investigated the added value of selection compared to a weighted lottery procedure by focusing on student performance, motivation and engagement in both the pre-clinical and clinical phases of the medical study. Findings with regard to the different admission groups confirmed that students who excel in pre-university education perform better in the pre-clinical and clinical phases of medical study as well, despite showing lower strength of motivation than selected students. This was not surprising as previous performance is the best predictor of future performance (Benbassat and Baumal 2007; Hulsman et al. 2007; Patterson et al. 2016; Salvatori 2001; Siu and Reiter 2009). Lower strength of motivation among top-pu-GPA students has been reported before (Hulsman et al. 2007; Kusurkar et al. 2010; Wouters et al. 2016) and may be explained by the fact that these students gain direct admission to the medical study. Without the need for participating in a selection or weighted lottery procedure, these students may be stimulated less to think about their study choice, an activity which can help in making an informed and motivated study choice (Wouters et al. 2014). In our study, selected students did not outperform lottery-admitted students or report better quality of motivation and engagement in the pre-clinical and clinical phase of the study, which supports the notion that selection may have little added value over a weighted lottery procedure. However, findings differ greatly across studies, suggesting that context is an important factor. The increased proportion of places allocated through selection, for example, may stimulate a broader range of students to apply for selection, reducing the performance gap between selected and lottery-admitted students. Moreover, selection tools are used in different ways in different contexts, which complicates replication of studies and generalization of findings (Edwards et al. 2013). 
Furthermore, differences are usually small and do not always reach significance. For motivation and engagement this may be due to the restricted range, which means that students scored at the top end for engagement and autonomous motivation and at the bottom end for controlled motivation. The mean engagement score in our sample (M = 4.17), for example, clearly exceeds the score of the norm group of social sciences students (M = 3.18) (Schaufeli and Bakker 2003). In sum, top-pu-GPA students outperformed the other students, while few differences were found between selected and lottery-admitted students. This highlights the challenge that selection committees are confronted with, namely selecting the best candidates from a pool of seemingly equally suitable candidates. Future research should reveal whether students’ performance, motivation and engagement develop differently throughout medical study. A next important step in selection research is to follow up on the various groups of students after graduation. We plan to conduct longitudinal research to study this.

Our hypothesis that students who had participated would outperform students who had not participated and show higher AM and engagement was only partly supported. As no differences in motivation were found, the assumption that better motivation among selection participants would explain their better performance (Schripsema et al. 2014) was not supported. Among students in the clinical phase, participation in selection was related to better clerkship performance and engagement. Students who previously chose to participate in a selection procedure, for which coping with stress and being able to combine studies with other activities are important, may become energized by and be able to cope better with the pressure of clerkships. Students who had participated in selection have been found to be more emotionally stable and conscientious than students who did not (Schripsema et al. 2016). The group of Year-1 students that had not participated in selection was rather small; this might explain why these differences did not reach significance.

Based on the hypothesis that the presence of a selection procedure may be more important than the type of procedure used, we did not expect to find differences in performance, motivation and engagement. Findings among the students in the clinical phase supported this. This must be interpreted with caution, however. Relatively small group sizes might have resulted in insufficient power to detect smaller effects. Among the students in the pre-clinical phase, we found some differences, mainly related to performance. Of course, the medical school context, as a whole, should be considered when interpreting these results (Edwards et al. 2013). While the three medical schools train their students to meet the same end terms, differences in curricular structures and assessment and grading programs may have influenced the study results. Indeed, the factor ‘university’ appeared to be a confounder in some of the other analyses, but could not be controlled for in the comparisons between the selection procedures. Differences with regard to cognitive performance in the medical study may be related to the weightings of cognitive assessments in the selection procedures. The findings seem to suggest that potential differences between the students selected at the three medical schools fade over the course of medical study, but a longitudinal study design is necessary to confirm this. Another explanation may be that the characteristics of the three medical schools (selection procedure, curriculum and location) appeal to different types of students. Two of the three medical schools in our study are located in the same city. We have examined different types of students’ reasons for applying to a certain medical school in a separate paper (Wouters et al. submitted). In sum, few differences were found between students admitted through the different selection procedures. 
Because the differences mainly concerned performance outcomes, they may have been strongly influenced by differences in the assessments and grading cultures of the different institutions. Further research should determine the relative influence of curriculum characteristics on performance differences.

Some of the outcome measures in the study were interrelated. For example, autonomous motivation showed a low positive correlation with course credits. Low negative correlations were found between clerkship performance and controlled motivation, while low positive correlations were found between clerkship performance and engagement and relative autonomous motivation. This is in line with previous research on motivation and engagement reporting positive correlations with performance. Moreover, some significant differences found in the unadjusted models disappeared when confounders were included in the final model. For performance outcomes, this was mainly due to age, gender, pu-GPA and university, but sometimes also to socioeconomic factors, such as being a first-generation student or ethnic background. Further research should determine the influence of socioeconomic factors on student performance. The final models explained up to 52% of the variance in the outcome measures. For some measures, e.g. controlled motivation and course credits, the models explained little variance, which indicates the need for more research on these outcomes.

Limitations

Possible limitations include selection bias and response bias. While we included the three medical schools for methodological and practical reasons, the findings may not be generalizable to other medical schools. In addition, administration of a web-based survey enabled us to approach all students from the proposed cohorts, decreasing the influence of selection bias at the student level, but may also have resulted in a lower response rate. Nevertheless, a response rate of 35% can be considered good in current times in which students receive many evaluation forms and junk mail (Sax et al. 2003). Female students were slightly overrepresented in our study. We included gender as a confounder in the analyses whenever necessary. A response bias is likely because we do not know how non-responders would have answered the motivation and engagement questions. It is reasonable to assume that non-responders have lower motivation and engagement. Some groups in our sample were relatively small. We have taken this into account in the interpretation of the findings. The top-pu-GPA group is consistently small because only 4% of all pre-university graduates in the Netherlands achieve this. A further limitation is that the clerkship grade in the first year of clinical rotations may not be a true reflection of how the students will perform as doctors in their actual practice owing to the little autonomy they have. Furthermore, the grades may reflect students’ ability to cope with the transition from theory to practice, rather than their clinical skills. Future research on performance in later stages of medical education and specialty training could provide more insight into more clinically relevant outcomes of selection.

Conclusion

Top-performing students in pre-university education perform best in the medical study. A selection procedure, which is usually costly, seems to be of little additional value compared to a weighted lottery procedure, especially when a large proportion of students is admitted through selection. The results suggest that the type of selection procedure may make little difference. Differences are small due to good overall performance, motivation and engagement levels.