Introduction
What is new?
- The UL-MAS items with rating scales demonstrated adequate fit to the Rasch model in people with stroke at both admission and discharge, supporting the transformation of the ordinal scores into interval-level measurements.
- Rasch-based scoring demonstrated more precision in differentiating patient groups than the conventional summative scoring in measuring upper limb function using the UL-MAS.
- In addition to establishing better precision of the Rasch-based scoring, the present study empirically demonstrated that the gains in precision were significantly higher when measuring the construct at the extreme range of the scale.
- The results of the present study provided additional incentive for using Rasch-based scoring when measuring clinical outcomes using rating scales.
Rating scales are widely used by researchers and clinicians to measure latent individual characteristics or constructs that are related to health outcomes [1], [2]. Typically, scores from individual scale items are summed without any weighting or standardization to generate a scale score that represents the degree to which the construct being measured is present [3], [4]. Summative scoring assumes that all items are measured on the same interval scale and that each item is equally related to the underlying construct [5]. However, summative scores are ordinal (rank ordered) and may not represent a true linear, continuous measurement with a constant unit that is amenable to mathematical operations [3], [6]. As such, a change or difference of one point may vary in meaning across the continuum of the scale. This creates a methodological challenge for performing mathematical operations on summative scores, because ordinal “numbers” have no underlying number line with equal intervals and represent only “greater than” or “less than” relations [7]. It also raises the clinical concern that the effect of an intervention may not be accurately captured by conventional summative scores.
Item response theory (IRT) is an alternative, model-based approach that is commonly used to transform summative scores into interval-like measurements [3], [8]. IRT postulates that the probability of producing a certain response on a specific item is a function of the underlying construct [9]. Under the IRT framework, measurement error can be adjusted for more accurately, with the aim of sample independence and invariance. In addition, item parameters (e.g., difficulty levels) and individual ability parameters can be estimated separately [9]. It has been argued, therefore, that scoring based on IRT methods could offer greater accuracy and responsiveness than conventional summative scoring when health-related outcomes are measured with rating scales [10], [11]. In contrast, a recent simulation study demonstrated that IRT-based and summative scores are comparable in predicting outcomes [12]. More work is needed, therefore, to determine which type of scoring should be used to compare groups in clinical practice or research.
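To make the idea of a response probability driven by a latent construct concrete, the following minimal sketch uses the dichotomous Rasch model, often described as the simplest member of the IRT family. It is an illustration only: UL-MAS items are scored on multi-category rating scales, so an analysis of those items would need a polytomous extension, and the item difficulties below are hypothetical values chosen for the example, not estimates from any data set.

```python
import math


def rasch_probability(theta, b):
    """Probability of a positive response under the dichotomous Rasch model.

    theta: person ability in logits; b: item difficulty in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def expected_raw_score(theta, difficulties):
    """Expected summative score: the sum of the item success probabilities."""
    return sum(rasch_probability(theta, b) for b in difficulties)


# Hypothetical item difficulties (logits), chosen only for illustration.
DIFFICULTIES = [-2.0, -1.0, 0.0, 1.0, 2.0]

if __name__ == "__main__":
    # Compare one-logit steps near the low end, the middle, and the high end.
    for theta in (-4.0, -3.0, -0.5, 0.5, 3.0, 4.0):
        score = expected_raw_score(theta, DIFFICULTIES)
        print(f"theta = {theta:+.1f} logits -> expected raw score = {score:.2f}")
```

In this sketch, a one-logit change in ability moves the expected raw score less near the ends of the scale than in the middle, which is one way to see why a one-point difference in a summative score need not mean the same thing across the continuum.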
The purpose of this study was to compare the discriminatory ability of Rasch-based and conventional summative scoring using a clinical rating scale. Data were from a cohort study that used the upper limb subscale of the Motor Assessment Scale (UL-MAS) to assess upper limb function of patients with stroke [13]. Summative scores of the UL-MAS have been used widely to evaluate stroke patients’ progress in arm and hand motor recovery [1], [14], [15]. Although the psychometric properties of the scale have been well studied [16], [17], [18], it remains challenging to accurately detect differences between patient groups or to assess improvement over time. The present study used an innovative method involving a “relative precision” (RP) index [19], [20] to compare the discriminatory ability of the two scoring methods. Specifically, this article assessed the RP of Rasch-based and conventional summative scoring to differentiate between two patient groups expected to differ in terms of poststroke discharge destination.
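As a rough illustration of how an RP comparison can be set up, the sketch below assumes a formulation that is common in the measurement literature: RP is the ratio of the F statistics obtained when each scoring method is used to compare the same groups in a one-way analysis of variance, with the conventional score as the reference. The text above does not define the index, so whether this matches the exact computation in [19], [20] is an assumption. The function names and all scores are hypothetical and serve only to show the mechanics.

```python
from statistics import mean


def one_way_f(*groups):
    """F statistic from a one-way ANOVA across the given groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)       # number of groups
    n = len(all_values)   # total number of observations
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def relative_precision(f_test, f_reference):
    """Assumed RP definition: ratio of the test F to the reference F."""
    return f_test / f_reference


if __name__ == "__main__":
    # Hypothetical scores for two discharge-destination groups (not study data).
    summative_a, summative_b = [12, 14, 15, 16, 17], [3, 5, 8, 9, 10]
    rasch_a, rasch_b = [1.1, 1.8, 2.3, 2.9, 3.6], [-3.4, -2.1, -0.6, -0.2, 0.3]

    f_summative = one_way_f(summative_a, summative_b)
    f_rasch = one_way_f(rasch_a, rasch_b)
    print(f"RP (Rasch vs. summative) = {relative_precision(f_rasch, f_summative):.2f}")
```

Under this assumed definition, an RP above 1 would indicate that the Rasch-based scores separate the groups more sharply than the summative scores, and an RP below 1 the reverse.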