
The curious case of the test set AUROC

Accepted version
Peer-reviewed

Type

Article

Authors

Hazan, A 
Schönlieb, CB 

Abstract

Whilst the size and complexity of ML models have rapidly and significantly increased over the past decade, the methods for assessing their performance have not kept pace. In particular, among the many potential performance metrics, the ML community stubbornly continues to use (a) the area under the receiver operating characteristic curve (AUROC) for a validation and test cohort (distinct from training data) or (b) the sensitivity and specificity for the test data at an optimal threshold determined from the validation ROC.
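For readers unfamiliar with the two reporting conventions named in the abstract, the following is a minimal sketch of how they are commonly computed. It is not taken from the article. The abstract does not say how the "optimal" threshold is chosen, so the sketch assumes Youden's J (sensitivity + specificity - 1) as the criterion. The function names and the toy labels and scores are hypothetical and purely illustrative.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sens_spec(labels, scores, threshold):
    """Sensitivity and specificity when a score >= threshold is called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def pick_threshold(labels, scores):
    """Pick the threshold maximising Youden's J (an assumed criterion, not from the article)."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: sum(sens_spec(labels, scores, t)) - 1)


# Hypothetical toy cohorts: labels (1 = positive) and model scores.
val_y = [0, 0, 1, 0, 0, 1, 1, 1]
val_s = [0.1, 0.2, 0.3, 0.4, 0.5, 0.7, 0.8, 0.9]
test_y = [0, 0, 0, 0, 1, 1, 1, 1]
test_s = [0.2, 0.3, 0.6, 0.75, 0.4, 0.65, 0.8, 0.95]

# Convention (a): AUROC on the validation and test cohorts.
val_auc = auroc(val_y, val_s)
test_auc = auroc(test_y, test_s)

# Convention (b): threshold fixed on validation data, then applied unchanged to test data.
threshold = pick_threshold(val_y, val_s)
test_sens, test_spec = sens_spec(test_y, test_s, threshold)

print(f"validation AUROC={val_auc}, test AUROC={test_auc}")
print(f"threshold={threshold}, test sensitivity={test_sens}, test specificity={test_spec}")
```

The threshold is selected using validation data only and is never re-tuned on the test cohort, which mirrors the procedure the abstract describes.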

Keywords

46 Information and Computing Sciences, 40 Engineering, Machine Learning and Artificial Intelligence, 3 Good Health and Well Being

Journal Title

Nature Machine Intelligence

Journal ISSN

2522-5839

Publisher

Springer Science and Business Media LLC

Sponsorship

EPSRC (EP/T017961/1)
Engineering and Physical Sciences Research Council (EP/N014588/1)