Development of symptom assessments utilising item response theory and computer-adaptive testing—A practical method based on a systematic review

https://doi.org/10.1016/j.critrevonc.2009.03.007

Abstract

Assessment of individual patients’ distress is a cornerstone of clinical care for advanced cancer. Patients’ ability to fill out lengthy questionnaires is compromised by many factors. Computer-adaptive tests (CATs) offer a promising approach to developing tailored instruments that administer only items relevant to the individual patient. A systematic review of the literature about CATs in medical databases was conducted. Based on the results, a method for developing a CAT was designed that requires nine steps: (1) build an item pool; (2) administer the items to a predefined sample in a calibration study; (3) eliminate inappropriate items; (4) examine whether all items are influenced by a single dominant trait; (5) calibrate the items with the best-fitting item response theory (IRT) model; (6) evaluate items’ parameter equivalence across subgroups; (7) build an item bank with the calibrated items; (8) develop the CAT; and (9) pilot test the developed CAT. CAT offers the chance to extend the usefulness of patient-reported outcome (PRO) measurements from clinical studies to daily clinical practice.

Introduction

Caring for patients who have advanced cancer and for their family members challenges clinicians with complex and fluctuating multidimensional issues [1]. As important navigation aids, both for clinical care decision making and for clinical research, assessments of outcomes reported by patients and their families serve to screen for, identify, characterise, and monitor the condition and status of patients and their families. Because most patients with advanced cancer suffer from concurrent and interrelated problems, first-level screening assessments may not provide sufficient information, and multiple second-level assessments for specific symptoms, for example the Brief Fatigue Inventory (BFI) [2] or the Hospital Anxiety and Depression Scale (HADS) [3], must be applied. The concurrent use of several second-level assessments is poorly standardised and is challenging in daily clinical practice. Although patients differ in the burden they perceive when responding to questionnaire-based symptom studies and in the number of items they can respond to reliably [4], it is generally agreed that most patients with advanced cancer can cope with only a limited number of questions. As new assessment instruments evaluating important issues are being developed in all aspects of advanced cancer and palliative care, it is worth exploring the use of tailored interventions for improving care and outcomes [5], [6].

Most symptom-assessment questionnaires currently used in medicine are developed by using classical test theory (CTT). CTT requires that every item or question in an assessment instrument be administered to every patient tested in order to obtain valid scores. In developing a fixed-length questionnaire for patients with advanced cancer, it is necessary to balance a low response burden for patients against sufficient measurement precision. When instruments are adapted by simply reducing the number of questions (a recent example is the EORTC QLQ-C15-PAL [7]), the result may be an instrument that is reliable enough for assessing differences between study populations but not for assessing individual patients to guide patient care in daily clinical practice [8].
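The trade-off between questionnaire length and precision described above can be illustrated with the Spearman-Brown formula, a standard CTT result that this article does not itself invoke. The sketch below assumes parallel items, and the reliability of 0.90, the score standard deviation of 10, and the halving of the questionnaire are hypothetical values chosen only for illustration.

```python
import math


def spearman_brown(reliability, length_factor):
    """Predicted reliability when test length is multiplied by length_factor.

    Assumes the added or removed items are parallel to the existing ones.
    """
    return (length_factor * reliability) / (
        1.0 + (length_factor - 1.0) * reliability
    )


def standard_error_of_measurement(score_sd, reliability):
    """CTT standard error of measurement for an individual score."""
    return score_sd * math.sqrt(1.0 - reliability)


# Hypothetical full-length questionnaire (illustrative values only).
full_reliability = 0.90
score_sd = 10.0

# Halve the number of items, as in a simple shortening of the instrument.
short_reliability = spearman_brown(full_reliability, 0.5)

full_sem = standard_error_of_measurement(score_sd, full_reliability)
short_sem = standard_error_of_measurement(score_sd, short_reliability)

print(f"full:  reliability={full_reliability:.3f}, SEM={full_sem:.2f}")
print(f"short: reliability={short_reliability:.3f}, SEM={short_sem:.2f}")
```

Under these assumptions the shortened form has a lower reliability and a larger standard error for each individual score, which is the concern raised for individual-level clinical use.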

Forty years ago, item response theory (IRT) was introduced as an alternative to CTT for test development in education and psychology. Now IRT is used in conjunction with powerful computer capabilities and offers an exciting new research area in patient-reported outcome (PRO) measurement. Important groundwork has been the development of item banks containing questions that have been calibrated with a set of psychometric properties covering the whole range of a latent trait, such as pain or fatigue [9], [10]. With such item banks, it is no longer necessary to simply reduce the number of items at the risk of losing important information. In a process resembling that used intuitively in structuring a clinical interview, computer-adaptive testing (CAT) selects from the item bank the most informative question for each individual respondent based on the respondent's answers to previously administered questions. CAT-based individualised item administration results in shorter assessments without the trade-off of losing measurement precision [11], [12]. The use of item banks in combination with CAT affords the possibility of developing questionnaires feasible for use in daily clinical practice. The development of IRT-based assessment tools is also receiving progressively more attention. Recently a group of outcomes scientists and the National Institutes of Health (NIH) formed a cooperative network funded under the NIH Roadmap for Medical Research Initiative to re-engineer the clinical research enterprise. This initiative, the Patient-Reported Outcomes Measurement Information System (PROMIS), aims to revolutionise the way patient-reported outcome tools are selected and used in clinical research and practice evaluation. One main goal of the PROMIS initiative is to develop a set of publicly available CATs for the clinical research community (http://www.nihpromis.org) [13], [14].
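The adaptive item selection described above can be sketched in a few lines. This is a minimal illustration, not the method of any study reviewed here. It assumes dichotomous items under a two-parameter logistic (2PL) model, although PRO items are often polytomous, along with maximum-information item selection, an expected a posteriori (EAP) trait estimate with a standard normal prior, and a stopping rule based on a maximum number of items or a target standard error. The item bank, its parameters, and the simulated respondent are all hypothetical.

```python
import math


def prob_2pl(theta, a, b):
    """Probability of endorsing a dichotomous item under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = prob_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


def estimate_theta(responses, bank, grid):
    """EAP estimate and posterior SD on a grid, with an N(0, 1) prior."""
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # unnormalised prior density
        for item_id, answer in responses:
            a, b = bank[item_id]
            p = prob_2pl(theta, a, b)
            w *= p if answer == 1 else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)


def run_cat(bank, answer_fn, max_items=5, se_target=0.4):
    """Administer items one at a time, always picking the most informative."""
    grid = [x / 10.0 for x in range(-40, 41)]
    responses = []
    theta, se = 0.0, 1.0  # start at the prior mean and prior SD
    while len(responses) < max_items and se > se_target:
        asked = {item_id for item_id, _ in responses}
        remaining = [item_id for item_id in bank if item_id not in asked]
        if not remaining:
            break
        next_item = max(
            remaining, key=lambda i: item_information(theta, *bank[i])
        )
        responses.append((next_item, answer_fn(next_item)))
        theta, se = estimate_theta(responses, bank, grid)
    return theta, se, [item_id for item_id, _ in responses]


# Hypothetical calibrated item bank: item id -> (discrimination a, location b).
bank = {
    "f1": (1.2, -1.5),
    "f2": (1.5, -0.5),
    "f3": (1.8, 0.0),
    "f4": (1.4, 0.8),
    "f5": (1.1, 1.6),
}

TRUE_THETA = 1.0  # hypothetical respondent used only for this simulation


def simulated_answer(item_id):
    """Deterministic respondent: endorses items located below TRUE_THETA."""
    return 1 if bank[item_id][1] < TRUE_THETA else 0


theta, se, asked = run_cat(bank, simulated_answer)
print("items administered:", asked)
print("trait estimate:", round(theta, 2), "posterior SD:", round(se, 2))
```

A real CAT would typically add content balancing, exposure control, and a polytomous IRT model, all of which are omitted here for brevity.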

Developing new CAT instruments based on IRT poses specific challenges. This systematic review summarises the methods used in IRT-based CAT development for PROs and, informed by those methods, provides a stepwise approach to the development of PRO assessment instruments that can be used by clinical researchers in oncology to evaluate and interpret articles on IRT- and CAT-based PROs.

Methods

Electronic searches were undertaken in October 2007 in MEDLINE (via PubMed) and in the Cochrane Library and PsycINFO (via Ovid) to identify relevant articles.

As no exact MeSH terms exist for the development of CATs for PROs, and as this is a relatively new research field, we used broad and unrestricted search strategies. We combined the two expressions “item response theory OR IRT” and “CAT OR (computer$ AND adaptive)” by using the operator “AND”. There were no restrictions in the

Results

Searching the online databases and hand-searching resulted in the retrieval of 192 papers, of which 32 met the inclusion criteria (Fig. 1) [9], [10], [11], [12], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32], [33], [34], [35], [36], [37], [38], [39], [40], [41], [42], [43], [44], [45]. The selected studies cover a wide range of PRO instruments assessing physical functioning in different populations, cancer-related fatigue, depression, or anxiety (

Discussion

A few years ago it was hardly possible to find studies that applied IRT methods and CAT for assessing PROs. Although today most PRO measurements are still developed based on CTT, we found several promising research efforts in this area, and the early movers in this field have inspired many colleagues to evaluate and investigate the advantages of CAT for PROs. Also encouraging are research projects like the PROMIS Roadmap initiative (http://www.nihpromis.org) [13], [14] – a collaborative

Conclusions

Computer-adaptive measurement of critical symptoms in internal medicine and advanced cancer care has the potential to produce more precise estimates than traditional methods, providing a detailed overview of patients’ distress from their symptoms. The burden to patients undergoing a CAT is substantially reduced because patients need to answer only questions with particular relevance to their own individual situation. In addition, the results of evaluation are easier to integrate into

Conflict of interest statement

The author(s) declare that they have no financial or non-financial conflicts of interest.

Reviewer

Victor Chang, MD, UMDNJ/New Jersey Medical School, VA New Jersey Health Care System, Section Hematology Oncology, 385 Tremont Avenue, East Orange, NJ 07018, United States.

Acknowledgements

This work has been partially supported by unrestricted grants from Grünenthal Switzerland and Gastrotech Denmark, who did not contribute to and were not informed about the design or conduct of the study, the data analysis, the writing of the manuscript, or the decision to submit it for publication. The manuscript was edited by Susan Eastwood ELS(D), an independent biomedical editor. We are grateful for the indispensable work of our librarian, Daniel Kauffmann.

Jochen Walker graduated from the University of Freiburg (Germany) in 2002. He is currently a research assistant at the cantonal hospital of St. Gallen (Switzerland), with a main interest in medical informatics.

References (77)

  • S.M. Haley et al. Assessing mobility in children using a computer adaptive testing version of the pediatric evaluation of disability inventory. Arch Phys Med Rehabil (2005)
  • M.P. Dijkers. A computer adaptive testing simulation applied to the FIM instrument motor component. Arch Phys Med Rehabil (2003)
  • D.L. Hart et al. Simulated computerized adaptive tests for measuring functional status were efficient with good discriminant validity in patients with hip, knee, or foot/ankle impairments. J Clin Epidemiol (2005)
  • D.L. Hart et al. Simulated computerized adaptive test for patients with lumbar spine impairments was efficient and produced valid measures of function. J Clin Epidemiol (2006)
  • M. Kosinski et al. An evaluation of a patient-reported outcomes found computerized adaptive testing was efficient in assessing osteoarthritis impact. J Clin Epidemiol (2006)
  • M. Rose et al. Evaluation of a preliminary physical function item bank supported the expected advantages of the Patient-Reported Outcomes Measurement Information System (PROMIS). J Clin Epidemiol (2008)
  • S.M. Haley et al. Computerized adaptive testing for follow-up after discharge from inpatient rehabilitation: I. Activity outcomes. Arch Phys Med Rehabil (2006)
  • M.A. Petersen et al. Item response theory was used to shorten EORTC QLQ-C30 scales for use in palliative care. J Clin Epidemiol (2006)
  • J. Ingham et al. Cachexia-anorexia in cancer patients (1996)
  • A.S. Stroemgren et al. Self-assessment in cancer patients referred to palliative care: a study of feasibility and symptom epidemiology. Cancer (2002)
  • C.L. Nekolaichuk et al. Assessing hope at the end of life: validation of an experience of hope scale in advanced cancer patients. Palliat Support Care (2004)
  • B. Rosenfeld et al. The schedule of attitudes toward hastened death: measuring desire for death in terminally ill cancer patients. Cancer (2000)
  • M.A. Echteld et al. EORTC QLQ-C15-PAL: the new standard in the assessment of health-related quality of life in advanced cancer? Palliat Med (2006)
  • W. Gardner et al. Computerized adaptive measurement of depression: a simulation study. BMC Psychiatry (2004)
  • A.M. Jette et al. Prospective evaluation of the AM-PAC-CAT in outpatient rehabilitation settings. Phys Ther (2007)
  • D. Cella et al. The Patient-Reported Outcomes Measurement Information System (PROMIS): progress of an NIH Roadmap cooperative group during its first two years. Med Care (2007)
  • B.B. Reeve et al. Psychometric evaluation and calibration of health-related quality of life item banks: plans for the Patient-Reported Outcomes Measurement Information System (PROMIS). Med Care (2007)
  • F. Strasser et al. Fighting a losing battle: eating-related distress of men with advanced cancer and their female partners. A mixed-methods study. Palliat Med (2007)
  • T. Greenhalgh et al. Effectiveness and efficiency of search methods in systematic reviews of complex evidence: audit of primary sources. Br Med J (2005)
  • J.E. Ware et al. Item response theory and computerized adaptive testing: implications for outcomes measurement in rehabilitation. Rehabil Psychology (2005)
  • C. Schwartz et al. Computerized adaptive testing of diabetes impact: a feasibility study of Hispanics and non-Hispanics in an active clinic population. Qual Life Res (2006)
  • H. Siebens et al. Measuring physical function in patients with complex medical and postsurgical conditions: a computer adaptive approach. Am J Phys Med Rehabil (2005)
  • S.M. Haley et al. A computer adaptive testing approach for assessing physical functioning in children and adolescents. Dev Med Child Neurol (2005)
  • K.F. Cook et al. Development and psychometric evaluation of the Flexilevel Scale of Shoulder Function. Med Care (2003)
  • W. Gardner et al. Multidimensional adaptive testing for mental health problems in primary care. Med Care (2002)
  • J.E. Ware et al. Applications of computerized adaptive testing (CAT) to the assessment of headache impact. Qual Life Res (2003)
  • J.B. Bjorner et al. Calibration of an item pool for assessing the burden of headaches: an application of item response theory to the headache impact test (HIT). Qual Life Res (2003)
  • P.L. Andres et al. Computer adaptive testing: a strategy for monitoring stroke rehabilitation across settings. Top Stroke Rehabil (2004)
    Jan R. Böhnke graduated from the University of Konstanz (Germany) in 2007 and has worked since 2005 as a freelance statistical consultant in psychological, medical, and political research contexts. At present, he is a researcher at the University of Trier (Germany), with main interests in test development and process research.

    Thomas Cerny graduated from the Medical School of the University of Bern (Switzerland) in 1978. He completed a fellowship in hemato-oncology in 1986. He is currently head of oncology/haematology at the cantonal hospital of St. Gallen and president of the Swiss Cancer League.

    Florian Strasser graduated from the University of Zürich in 1990. He completed fellowships in internal medicine in 1997, in medical oncology in 2000, and in palliative medicine (American Board of Hospice and Palliative Medicine) in 2001. He is currently head of oncological palliative medicine at the cantonal hospital of St. Gallen, Switzerland.
