ROC with confidence — a Perl program for receiver operator characteristic curves

https://doi.org/10.1016/S0169-2607(00)00098-5

Abstract

Receiver operator characteristic (ROC) curves are recommended for assessing the diagnostic value of tests that depend on a single cut-off value of a continuous variable. A ROC curve plots the true-positive rate (sensitivity) against the false-positive rate (1 - specificity). Especially when samples of observations are small, it is desirable to display confidence bounds for the ROC curve. This paper presents a Perl program that calculates the ROC curve and its distribution-free confidence bounds. A simple user interface, also written in Perl, displays them.

Introduction

Receiver operator characteristic (ROC) curves are recommended for assessing the diagnostic value of tests that depend on a single cut-off value of a continuous variable. ROC curves plot the true-positive rate (sensitivity) against the false-positive rate (1 - specificity) [1]. To compare the discriminative capability of two tests, both ROC curves are calculated and usually the test whose curve encloses the greater area is chosen. In a clinical setting, typically only a small sample of observations is available, so the ROC curves are only estimates of the true relationship between sensitivity and specificity for a given diagnostic test. These estimates are often treated as if they were the true curves, even though the number of observations is very limited. It is therefore important to have a measure of how much sampling variability alone influences the estimated sensitivity and specificity, so that the conclusions drawn can be judged better.
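As a rough illustration of the definitions above, the following Python sketch builds an empirical ROC curve and its area by the trapezoidal rule. It is not the Perl module described in this paper; the function names `roc_points` and `auc`, the example data, and the convention that higher test values indicate disease are all assumptions made for the example.

```python
def roc_points(scores, labels):
    """Return the empirical ROC curve as a list of (fpr, tpr) points.

    scores: test values; higher values are ASSUMED to indicate disease.
    labels: 1 for diseased, 0 for non-diseased.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one diseased and one non-diseased case")
    points = [(0.0, 0.0)]
    # Lower the cut-off step by step; a case is called positive if score >= cut-off.
    for cut in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 0)
        points.append((fp / n_neg, tp / n_pos))  # (1 - specificity, sensitivity)
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical data: 4 diseased and 4 non-diseased cases.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
print(curve)
print(auc(curve))  # 0.8125 for this data
```

With only eight cases, moving a single observation shifts the curve by a quarter of the axis, which is the kind of sampling variability the confidence bounds are meant to expose.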

In such situations, confidence bounds are desirable to describe the ‘correctness’ of a given ROC curve or the validity of a difference between tests [2]. Yet confidence bounds for ROC curves are missing from many statistical packages.

This paper presents a collection of subroutines, organized into a Perl module, that calculate estimates of the ROC curve and its distribution-free confidence bounds. The procedure of Hilgers [3] was used throughout. In addition, a user interface, also written in Perl, is presented; it displays the ROC curves (empirical and estimated), the confidence bounds and, optionally, the optimal cut-off value together with the associated accuracy, sensitivity, specificity, and positive and negative predictive values.
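The quantities reported for a cut-off can be sketched in Python as follows. This is only an illustration, not the program's code: the text above does not state how the optimal cut-off is defined, so the sketch assumes it is the cut-off with the highest accuracy (ties resolved in favour of the lowest cut-off), and the names `metrics_at` and `best_cutoff` and the data are hypothetical.

```python
def metrics_at(scores, labels, cut):
    """Diagnostic measures when a case is called positive if score >= cut."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        if s >= cut:
            if y == 1:
                tp += 1
            else:
                fp += 1
        else:
            if y == 1:
                fn += 1
            else:
                tn += 1

    def ratio(a, b):
        # Undefined ratios (empty denominator) are reported as NaN.
        return a / b if b else float("nan")

    return {
        "cutoff": cut,
        "accuracy": (tp + tn) / len(labels),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),  # positive predictive value
        "npv": ratio(tn, tn + fn),  # negative predictive value
    }


def best_cutoff(scores, labels):
    """ASSUMPTION: 'optimal' means maximal accuracy over all observed values."""
    candidates = sorted(set(scores))
    return max((metrics_at(scores, labels, c) for c in candidates),
               key=lambda m: m["accuracy"])


# Hypothetical data: 4 diseased and 5 non-diseased cases.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
labels = [0, 0, 0, 1, 0, 0, 1, 1, 1]
best = best_cutoff(scores, labels)
print(best)
```

Other optimality criteria, such as maximizing the sum of sensitivity and specificity, would only require a different `key` function.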

Statistical methods

Hilgers [3], [4] presents a method of constructing distribution-free confidence bounds for ROC curves. As quantities in a clinical setting are mostly measured on an ordinal scale, nonparametric methods are used. The focus here is on local bounds for specific values of sensitivity and specificity at certain threshold values that divide the groups. Values between these local confidence intervals are interpolated linearly. The method is based on the connection of order statistics with the
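To give a feel for what a local, distribution-free interval at one threshold looks like, the Python sketch below computes exact binomial (Clopper-Pearson) intervals for sensitivity and specificity. This is a swapped-in, standard technique chosen for illustration and should not be read as Hilgers' procedure, whose details are not reproduced here. The function names and the counts in the usage lines are assumptions.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def _solve(f, target):
    """Find p in [0, 1] with f(p) = target by bisection; f must decrease in p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies at a larger p
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a proportion k/n."""
    # Lower limit solves P(X >= k | p) = alpha/2, i.e. P(X <= k-1 | p) = 1 - alpha/2.
    lower = 0.0 if k == 0 else _solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper limit solves P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else _solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


# Hypothetical counts at one threshold: 3 of 4 diseased cases test positive,
# 5 of 5 non-diseased cases test negative.
sens_interval = clopper_pearson(3, 4)
spec_interval = clopper_pearson(5, 5)
print(sens_interval)
print(spec_interval)
```

Intervals of this kind are wide for small groups, which illustrates why point estimates of sensitivity and specificity from small samples need to be read with caution.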

Program description and availability

The program was intentionally designed as a simple-to-use, stand-alone tool for calculating and displaying ROC curves. The programming language Perl was chosen because it is freely available for all major platforms. The program is separated into a Perl module (suffix pm), which contains the statistical routines, and a user interface. The module runs under any version of Perl5; the user interface (UI) runs under Perl5.004 with the Perl/Tk (Tk 8.0) GUI toolkit installed. For the UI two other Perl

References (11)

  • T. Lang et al., How to Report Statistics in Medicine (1997)
  • M. Gardner, D. Altman, Statistics with confidence, Br. Med. J. Lond....
  • R.A. Hilgers, Distribution-free confidence bounds for ROC curves, Methods Inform. Med. (1991)
  • H.A. Kestler, Calculation and display of confidence bounds for receiver operator characteristics, Methods Inform. Med. (1999)
  • M. Kendall et al. (1979)
There are more references available in the full text version of this article.
