Original Article
Rasch-based scoring offered more precision in differentiating patient groups in measuring upper limb function

https://doi.org/10.1016/j.jclinepi.2012.12.014

Abstract

Objective

To compare the discriminatory ability of Rasch-based and summative scoring in the context of assessing upper limb function of patients with stroke.

Study Design and Setting

Data were from a cohort study of 497 adults with stroke undergoing physiotherapy. Upper limb function was assessed at admission and discharge using the upper limb subscale of the Motor Assessment Scale (UL-MAS). Rasch analysis was used to transform raw UL-MAS scores into interval measures. A relative precision (RP) index was used to compare how well the two scoring methods differentiated patients by discharge destination.

Results

The analysis confirmed the unidimensional structure of UL-MAS at both admission and discharge and demonstrated adequate fit of the items. The RP index favored Rasch-based scoring over summative scoring in differentiating between the two patient groups, with significant gains in precision at admission (15%) and discharge (11%). Among patients in the upper or lower quartile of UL-MAS, the gains in precision were also statistically significant in favor of Rasch-based scoring: 20% at admission and 19% at discharge.

Conclusion

Rasch-based scoring was more precise in differentiating patient groups by discharge destination than the summative scoring used to measure upper limb function, especially at the extreme range of the scale.

Introduction

What is new?

  • The UL-MAS items with rating scales demonstrated adequate fit to the Rasch model in people with stroke at both admission and discharge, supporting the transformation of the ordinal scores into interval-level measurements.

  • Rasch-based scoring demonstrated more precision in differentiating patient groups than the conventional summative scoring in measuring upper limb function using the UL-MAS.

  • In addition to establishing better precision of the Rasch-based scoring, the present study empirically demonstrated that the gains in precision were significantly higher when measuring the construct at the extreme range of the scale.

  • The results of the present study provided additional incentive for using Rasch-based scoring when measuring clinical outcomes using rating scales.

Rating scales are widely used by researchers and clinicians to measure latent individual characteristics or constructs that are related to health outcomes [1], [2]. Typically, scores from individual scale items are summed without any weighting or standardization to generate a scale score to represent the degree to which the construct being measured is present [3], [4]. Summative scoring assumes that all the items are measured on the same interval scale, and each item is equally related to the underlying construct being measured [5]. However, summative scores are rank ordered and may not represent a true linear and continuous measurement with a constant unit of measurement that is amenable to mathematical operations [3], [6]. As such, a change or difference of one point may vary in meaning across the continuum of the scale. This creates a methodological challenge for performing mathematical operations on summative scores, in which ordinal “numbers” do not have any underlying number line with an equal interval and only represent “greater than” or “less than” quantities [7]. This also raises subsequent clinical concerns that the effect of any intervention may not be truly reflected or captured by using the conventional summative scores.

Item response theory (IRT) is an alternative model-based approach that has been commonly used to transform summative scores into interval-like measurements [3], [8]. IRT postulates that the probability of producing a certain response on a specific item is a function of the underlying construct [9]. Under the IRT algorithm, measurement errors can be more accurately adjusted for sample independence and invariance. In addition, item parameters (e.g., difficulty levels) and individual ability parameters can be measured separately [9]. It has been argued, therefore, that scoring based on IRT methods could offer greater accuracy and responsiveness than conventional summative scoring in measuring health-related outcomes for clinical measures based on rating scales [10], [11]. In contrast, a recent simulation study demonstrated that IRT-based and summative scores are comparable in predicting outcomes [12]. More work is needed, therefore, to determine which type of scoring should be used to compare groups in clinical practice or research.
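As a rough illustration of why a one-point difference in a summed score need not mean the same thing across the scale, the sketch below uses the simplest dichotomous Rasch model to convert raw scores into logit measures. It is a simplification under stated assumptions: the UL-MAS items use rating scales, which call for a polytomous model, and the item difficulties, function names, and 10-item test here are hypothetical and not taken from the study.

```python
import math

def rasch_probability(theta, difficulty):
    """Probability of success on a dichotomous item under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def ability_for_raw_score(raw_score, difficulties, tol=1e-8, max_iter=100):
    """Maximum-likelihood ability (logits) for a non-extreme raw score.

    Solves sum_i P(theta, b_i) = raw_score by Newton-Raphson iteration.
    """
    n = len(difficulties)
    if not 0 < raw_score < n:
        raise ValueError("extreme scores have no finite ML estimate")
    theta = math.log(raw_score / (n - raw_score))  # starting value
    for _ in range(max_iter):
        probs = [rasch_probability(theta, b) for b in difficulties]
        expected = sum(probs)                       # expected raw score at theta
        information = sum(p * (1.0 - p) for p in probs)
        step = (raw_score - expected) / information
        theta += step
        if abs(step) < tol:
            break
    return theta

# Hypothetical item difficulties in logits (assumption; not UL-MAS values).
difficulties = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.0, 0.5, 1.0, 1.5, 2.0]
measures = {r: ability_for_raw_score(r, difficulties) for r in range(1, 10)}
for raw, logit in measures.items():
    print(f"raw score {raw}: {logit:+.2f} logits")
```

In this toy setup, adjacent raw scores near the ends of the scale are further apart in logits than adjacent scores near the middle, which is the sense in which summed scores are ordinal rather than interval.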

The purpose of this study was to compare the discriminatory ability of Rasch-based and conventional summative scoring using a clinical measure rating scale. Data were from a cohort study that used the upper limb subscale of the Motor Assessment Scale (UL-MAS) to assess upper limb function of patients with stroke [13]. Summative scores of UL-MAS have been used widely to evaluate stroke patients’ progress in arm and hand motor recovery [1], [14], [15]. Although the psychometric properties of the scale have been well studied [16], [17], [18], it remains challenging to accurately detect differences between patient groups or assess improvement over time. The present study used an innovative method involving a “relative precision” (RP) index [19], [20] to compare the discriminatory ability of the two scoring methods. Specifically, this article assessed the RP of Rasch-based and conventional summative scoring to differentiate between the two patient groups expected to differ in terms of poststroke discharge destination.
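The excerpt does not give the formula for the RP index. A minimal sketch follows, assuming the definition often used in the measurement literature: the ratio of the one-way ANOVA F statistics obtained when the same groups are compared under each scoring method. The function names and the small data sets are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups of numeric scores."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def relative_precision(groups_a, groups_b):
    """Ratio of F statistics (assumed RP definition); above 1 favors method A."""
    return f_statistic(groups_a) / f_statistic(groups_b)

# Hypothetical scores for two discharge groups under each scoring method.
summative = [[3, 9, 12, 15], [14, 16, 17, 18]]
rasch = [[-3.1, -0.4, 0.3, 1.2], [0.9, 1.8, 2.6, 4.0]]
rp = relative_precision(rasch, summative)
print(f"relative precision (Rasch vs summative): {rp:.2f}")
```

Because the F statistic is unchanged by any linear rescaling of the scores, an RP computed this way departs from 1 only to the extent that the Rasch transformation is nonlinear.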

Section snippets

Participants

Data were from a cohort study of patients (age range, 18–101 years; male, 53%; and female, 47%) with stroke who were undergoing inpatient physiotherapy at one of 15 facilities in Australia [13]. The Motor Assessment Scale (MAS) was administered by the treating physiotherapists to measure patients’ functional movement at both admission and discharge. Earlier research demonstrated that the three items of UL-MAS were significantly associated with poststroke discharge destination [13]. “Discharge

Results

In examining the unidimensionality of UL-MAS, the PCA of residuals revealed that 89.2% of the total variance was explained by the Rasch measure at admission, whereas 90.6% of the total variance was explained by the Rasch measure at discharge. The three items of UL-MAS were found to demonstrate good interitem correlations (Spearman rho coefficients were 0.81–0.86 at admission and 0.76–0.81 at discharge). With exploratory factor analysis, the UL-MAS items loaded strongly on one factor for both

Discussion

Although Rasch analysis has been widely used to develop instruments and investigate their psychometric properties, understanding of the clinical usefulness of Rasch-based scoring in measuring patient outcomes remains limited [29], [30]. The present article examined the relative advantages of Rasch-based scoring over conventional summative scoring in the context of measuring upper limb function using UL-MAS from a sizable sample of patients with stroke. Our results demonstrated higher

Acknowledgments

The authors thank the participating physiotherapists and site representatives for their tremendous assistance in coordinating data collection. They also thank all patients with stroke who participated in the study.

References (36)

  • J. Harper et al. A randomised controlled trial of strapping to prevent post stroke shoulder pain. Clin Rehabil (2000)

  • J. Hobart. Measuring disease impact in disabling neurological conditions: are patients’ perspectives and scientific rigor compatible? Curr Opin Neurol (2002)

  • R. Fitzpatrick et al. A comparison of Rasch with Likert scoring to discriminate between patients’ evaluations of total hip replacement surgery. Qual Life Res (2004)

  • D. Streiner et al. Health measurement scales: a practical guide to their development and use (2008)

  • L. Tesio. Measuring behaviours and perceptions: Rasch analysis as a tool for rehabilitation research. J Rehabil Med (2003)

  • B. Wright et al. Rating scale analysis (1982)

  • S. Embretson et al. Item response theory for psychologists (2000)

  • R. Hays et al. Item response theory and health outcomes measurement in the 21st century. Med Care (2000)

Conflict of interest: None declared.