Assessing evaluation procedures for individual researchers: The case of the Italian National Scientific Qualification
Introduction
The National Scientific Qualification (ASN) was introduced in 2010 as part of a comprehensive reform of the Italian university system. The new rules require that applicants for professorship positions at state-recognized universities must first obtain a National Scientific Qualification for the discipline and role they apply for.
The ASN is to be held once a year; at the time of writing, two rounds have been completed, started in 2012 and 2013, respectively. Applicants are evaluated using quantitative indicators as well as expert assessment. The Italian Ministry of University and Research (MIUR) appoints 184 evaluation committees, one for each scientific discipline. Each committee is made up of five members: four are selected from among full professors at Italian universities, and one from foreign universities or research institutions. Each committee processes all applications for both the associate and full professor levels in its field of competence.
Candidates are evaluated according to their scientific profile (research output and other scientific titles, see Section 2). However, in an attempt to limit the unfair selection practices that have been associated with the Italian concorso (Gerosa, 2001), applicants are also evaluated according to three bibliometric indicators of impact and scientific productivity defined by the MIUR. The reliance of the ASN on bibliometric indicators was welcomed by part of the academic community as a step towards more objective evaluation practices, but was also heavily criticized by others as a form of “career assessment by numbers” – a term first used in Kelly and Jennions (2006) – and as contrary to the best practices for the correct use of bibliometrics in the evaluation of individual researchers (Banfi & De Nicolao, 2013). Further complaints were raised as soon as the final results were made available. The fraction of qualified applicants varied considerably across Scientific Disciplines (SDs), from a minimum of 15.1% to a maximum of 81.1% (Marzolla, 2015). Such large differences cannot be explained in terms of uncompetitive applicants; rather, they suggest that the committees adopted different criteria for qualification, if not unfair evaluation practices (Abramo & D’Angelo, 2015). In addition, many applicants perceived the individual evaluations they received as hastily written and poorly motivated.
The issues above are not specific to the ASN: indeed, defining open, fair, and transparent evaluation procedures for career advancement of scientists is a challenging task, as witnessed by the plurality of hiring practices adopted in different countries (Bennion & Locke, 2010; Dettmar, 2004; van den Brink et al., 2013; Vicker & Royer, 2006). The ASN is an interesting case study, since it produced a large amount of data that have been made available on the Web for a short period of time. The data include, for each applicant: the list of publications and other scientific titles; the values of bibliometric indicators; the outcome of the application (qualified/not qualified), and a written assessment by the evaluation panel.
In this paper we address the following two questions: (i) does the ASN comply with the best practices for the use of bibliometric indicators for evaluating individual researchers? (ii) do the final reports provide useful feedback to the applicants? Both questions refer to the quality of the ASN, understood here as its level of transparency and fairness.
The case study illustrated in this paper provides some important lessons about the risks and unintended side-effects of evaluation procedures for academics, especially when too much emphasis is put on quantity rather than quality. As bibliometrics is increasingly used to support hiring and promotion decisions (Sahel, 2011), it is important to share the experience gathered in the field so that errors are not repeated. Moreover, nation-wide research evaluation campaigns such as the ASN face additional challenges due to the large number of applications that must be processed. In these situations it is tempting for evaluation committees to “cut corners” and employ sloppy practices to speed up the evaluation process, which reflects negatively on those being evaluated.
As valuable byproducts, we study the frequency of publication categories appearing in the application forms, and the structure of collaboration networks across scientific fields. The distribution of publication types can be used to understand how researchers in different disciplines disseminate their work. The investigation of the structure and dynamics of inter-disciplinary research collaboration is an important topic in itself that has attracted considerable interest (Abbasi et al., 2012; Newman, 2001; van Rijnsoever & Hessels, 2011; Wagner et al., 2011), and is important, e.g., for funding agencies to identify and possibly support joint research and development activities.
Related work. Hiring and promotion procedures for academic staff vary considerably across countries. The Academic Career Observatory of the European University Institute published a comprehensive overview of the recruiting and career advancement procedures in European countries and abroad, including information on salaries, access for non-nationals, and gender issues.
Qualification procedures somewhat similar to the ASN are already in place in other European countries, such as Germany, France, and Spain. In Germany there are two paths towards professorship positions: assistants working towards the Habilitation, and junior professors who must carry out a variety of tasks (including research, teaching, and management) but are not required to obtain the Habilitation. The German Habilitation is essentially a second PhD, and may consist of either a thesis or several publications of high quality (Enders, 2001). Similarly, the French habilitation à diriger des recherches is awarded to applicants with a strong publication record over a period of years, and is required to supervise PhD students and to apply for professor positions (Musselin, 2004). Finally, Spain introduced the accreditation as a prerequisite for applying to Agregat and Catedràtic positions (roughly equivalent to associate and full professor). The accreditation is granted by the Spanish national evaluation agency (ANECA) after a detailed assessment of the applicant's CV, including teaching, research experience, and list of publications. Of the three procedures above, the Spanish accreditation is the most similar to the ASN. However, the ASN is, to the best of our knowledge, the only scientific qualification that explicitly relies on bibliometric indicators of scientific productivity and impact to evaluate applicants. Also, while teaching activities play a significant role in the Spanish accreditation, they are barely considered in the ASN (see Appendix B).
A quantitative account of the ASN is given by Marzolla (2015): the author computes a set of descriptive statistics, showing among other things the fraction of qualified applicants, and the distribution of the values of bibliometric indicators. The study shows that the fraction of successful applicants varies considerably across SDs, suggesting that the qualification criteria were interpreted differently by each evaluation panel. This is confirmed by the comparison of bibliometric indicators of qualified and not qualified applicants, showing that some panels were more likely than others to deviate from purely quantitative considerations when granting or denying qualification. Abramo and D’Angelo (2015) examine the relationship of the ASN outcome with the scientific merit of applicants, in order to identify possible cases of discrimination or favoritism. Discrimination refers to skilled (according to their bibliometric indicators) applicants who are denied qualification, while favoritism refers to under-performing applicants who are granted qualification. The results reveal that applicants who are not already employed by an academic institution (“outsiders”) tend to be penalized more. Finally, Pautasso (2015) studies the proportions and success rates of female applicants across the various SDs to investigate gender issues. While in most disciplines the success rates of female applicants are comparable to those of male candidates, the study observes a significantly lower proportion of female scientists applying to most SDs, especially for the full professor role.
Organization of this paper. This paper is organized as follows. In Section 2 we give some information on the ASN. In Section 3 we examine the evaluation forms: we study their length and average similarity as proxies of their perceived quality. In Section 4 we discuss whether the ASN methodology follows the current best practices for the correct use of bibliometric indicators for the evaluation of researchers. Finally, conclusions are presented in Section 5. Some interesting descriptive statistics on the ASN dataset that have been produced as a byproduct of the main analysis are described in Appendix B.
Background
In this section we provide some background on the ASN and the Italian university system; for an historical perspective, see Degli Esposti and Geraci (2010).
In Italy, each professor and researcher is bound to an SD representing a specific field of study. There are 184 SDs organized into the 14 areas shown in Table 1. Each SD is identified by a four-character code of the form AA/MC, where AA is the numeric ID of the area (01–14), M is a single letter identifying the macro-sector, and C is a single digit.
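As an illustration, the code format described above can be validated and decomposed with a short script. This is a minimal sketch based only on the structure stated in the text (two-digit area 01–14, one macro-sector letter, one digit); the example codes are used purely for illustration.

```python
import re

# Pattern for an SD code "AA/MC": two-digit area (01-14),
# one uppercase macro-sector letter, one digit.
SD_CODE = re.compile(r"^(0[1-9]|1[0-4])/([A-Z])(\d)$")

def parse_sd(code):
    """Split an SD code into (area, macro-sector, digit), or None if malformed."""
    m = SD_CODE.match(code)
    if m is None:
        return None
    area, macro, digit = m.groups()
    return int(area), macro, int(digit)

print(parse_sd("01/A2"))   # well-formed code -> (1, 'A', 2)
print(parse_sd("15/A2"))   # area out of range -> None
```

Such a parser is useful, e.g., for grouping applications by area when computing per-area statistics like the qualification rates discussed above.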
Analysis of final reports
In this section we focus our attention on the final reports containing the assessment of each applicant. A typical report is shown in Fig. 2, and contains the following elements:
- 1. Applicant's last and first name;
- 2. Collegial assessment (Giudizio collegiale) formulated by the whole panel;
- 3–7. Individual assessments (Giudizi individuali) formulated by each member of the evaluation committee; the name of the committee member appears above each assessment, so the evaluations are not anonymous;
- 8. Result (qualified/not qualified).
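Since we use the average similarity of the report texts as a proxy for their perceived quality, a natural measure is the mean pairwise similarity over a set of reports. The following is only an illustrative sketch (not the paper's actual methodology), using Python's standard `difflib.SequenceMatcher` ratio as the similarity measure; the example assessments are made up.

```python
from difflib import SequenceMatcher
from itertools import combinations

def mean_pairwise_similarity(reports):
    """Average similarity ratio (0..1) over all unordered pairs of report texts."""
    pairs = list(combinations(reports, 2))
    if not pairs:
        return 0.0
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)

# Toy example with invented assessments: near-identical reports yield a
# high average similarity, hinting at copy-and-paste evaluations.
reports = [
    "The candidate shows a good publication record and scientific maturity.",
    "The candidate shows a good publication record and scientific maturity.",
    "The applicant's research output is limited and lacks continuity.",
]
print(round(mean_pairwise_similarity(reports), 2))
```

A high average similarity across the assessments written by one panel would suggest boilerplate evaluations rather than individually motivated judgments.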
Discussion
In the previous section we analyzed whether the ASN results provide useful feedback to the applicants. In this section we take a broader view by discussing the appropriateness of the ASN methodology, including the use of bibliometric indicators to evaluate individual applicants. Indeed, the ASN is the only national scientific qualification procedure that also uses quantitative indicators of productivity and impact for assessing applicants.
The recently published Leiden manifesto for
Conclusions
In this paper we have considered the Italian ASN as a case study in the evaluation of individual researchers for promotion. In particular, we were interested in assessing the appropriateness of the ASN in terms of fairness and quality of feedback provided to applicants. To do so, we addressed the following two questions: (i) does the ASN comply with the best practices for the use of bibliometric indicators for evaluating individual researchers? (ii) do the final reports provide useful feedback to the applicants?
Acknowledgments
The author thanks Giuseppe De Nicolao for providing feedback on a preliminary version of this analysis, and the online community of Redazione ROARS (http://www.roars.it/) for valuable ideas and discussion.
Author contributions
Conceived and designed the analysis: Moreno Marzolla
Collected the data: Moreno Marzolla
Contributed data or analysis tools: Moreno Marzolla
Performed the analysis: Moreno Marzolla
Wrote the paper: Moreno Marzolla
References (38)
- Abbasi et al. (2012). Betweenness centrality as a driver of preferential attachment in the evolution of research collaboration networks. Journal of Informetrics.
- Gerosa (2001). Competition for academic promotion in Italy. The Lancet.
- Kelly & Jennions (2006). The h index and career assessment by numbers. Trends in Ecology & Evolution.
- Marzolla (2015). Quantitative analysis of the Italian national scientific qualification. Journal of Informetrics.
- van Rijnsoever & Hessels (2011). Factors associated with disciplinary and interdisciplinary research collaboration. Research Policy.
- Vanclay (2011). An evaluation of the Australian Research Council's journal ranking. Journal of Informetrics.
- Wagner et al. (2011). Approaches to understanding and measuring interdisciplinary scientific research (IDR): A review of the literature. Journal of Informetrics.
- Abramo et al. (2015). An assessment of the first “scientific habilitation” for university appointments in Italy. Economia Politica.
- Albarrán et al. (2011). The skewness of science in 219 sub-fields and a number of aggregates. Scientometrics.
- Banfi & De Nicolao (2013). La valutazione della ricerca fra scienza e feticismo dei numeri [Research evaluation between science and the fetishism of numbers]. Il Mulino, I/2013.