Original Article
Global Perceived Effect scales provided reliable assessments of health transition in people with musculoskeletal disorders, but ratings are strongly influenced by current status

https://doi.org/10.1016/j.jclinepi.2009.09.009

Abstract

Objective

The study investigated the test–retest reliability and construct validity of the Global Perceived Effect (GPE) scale in patients with musculoskeletal disorders.

Study Design and Setting

Data from seven clinical studies including 861 subjects were used for the analyses. Repeat measures taken at the same attendance and from attendances separated by 24 hours were compared to estimate test–retest reliability. Construct validity was evaluated by examining relationships between pre, post, and change scores in pain and disability measures with GPE measures.

Results

Intraclass correlation coefficient values of 0.90–0.99 indicate excellent reproducibility of the GPE scale. In all but one data set, change scores on pain and disability measures correlated well (r = 0.40–0.74) with GPE; however, post scores nearly always correlated even more strongly (r = 0.58–0.84), and pre scores showed much weaker association (r = 0.00–0.28). Pre scores accounted for only a small amount of additional R² when added to regression models including post score.

Conclusions

Test–retest reliability of the GPE is excellent. GPE ratings are strongly influenced by current status, with the effect more obvious as transition time lengthens. This result questions whether transition ratings truly reflect change, or rather just current state. This finding also has implications for the use of GPE ratings as an external criterion of change in clinimetric studies.

Introduction

What is new?

  • Global Perceived Effect (GPE) scales can be reliably rated by patients with musculoskeletal conditions.

  • Patients have difficulty taking their baseline status into account when scoring the GPE, and ratings are very strongly influenced by their current health status.

  • The influence of current status may increase with longer transition time.

  • GPE ratings may not offer an accurate measure of change as transition time stretches into months.

  • GPE ratings may be unsuitable for use as external criteria of change when determining minimally important change and responsiveness of other instruments.

The measurement of complex constructs, such as recovery, pain, and disability, is a difficult process, both for clinicians and researchers. Therefore, establishing the reliability and validity of instruments that aim to measure these constructs is necessary to sensibly interpret the findings of clinical studies. Similarly, clinicians need to understand the strengths and shortcomings of the outcome measures they use to inform appropriate clinical practice [1].

In clinical practice, measurement of patient-rated recovery often takes the form of the question: Are you feeling better (or worse)? Or, to what extent have you improved (or deteriorated) since last time? This type of rating of perceived recovery is a “transition scale” or Global Perceived Effect (GPE) scale. The GPE scale asks the patient to rate, on a numerical scale, how much their condition has improved or deteriorated since some predefined time point. The GPE has several qualities that make it an appealing tool for use in clinical practice and research; being a single question, it is easy and quick to administer and the results are seemingly simple to interpret. Such scales have been recommended for use as a core outcome measure for chronic pain trials [2] and been advocated to increase the relevance of information from clinical trials to clinical practice [3]. From the patient's perspective, the question is intuitively easy to understand and it allows them to rate those aspects of recovery that are most important to them. In addition to measurement of outcome, the GPE is commonly used as an external criterion to test the measurement properties of other outcome measures [4], [5], [6]. In the field of musculoskeletal research, these “other” outcomes are often pain or disability, domains that are assumed to have an important impact on quality of life for these patients.

However, the GPE has several potential limitations, and relatively little work has been published on its reliability and validity. One concern is that the GPE may have low test–retest reliability [7], given that it is a single-item measure. Further, there are validity concerns, as there is evidence that patients have difficulty recalling their previous status and that their estimates of transition are biased by their current status [8], [9]. It has been suggested that this bias increases with the length of the period that the transition spans; that is, patients become more likely to confuse change over time with current status as the time interval lengthens [8].

Guyatt et al. [8] conducted an assessment of the measurement properties of the GPE. They began by calculating the correlation between GPE scores and the change in measures of health-related quality of life in subjects with respiratory disorders. They asserted that although a significant correlation is necessary, this alone is insufficient to confirm that the GPE is truly measuring change rather than current status. This is because current status and change will be highly correlated in some situations but not in others.

To test the validity of the GPE, Guyatt et al. suggest two criteria. First, the calculated change on another measure should correlate strongly with the GPE, with the threshold arbitrarily set at 0.5. Second, the pre (baseline) scores and the post (current) scores on that measure should have lower correlations with the GPE that are approximately equal in size and opposite in direction. The thesis depends on comparable variances in the pre and post scores and is supported by a mathematical proof in their paper. To explore the matter further, the authors also constructed regression models and entered pre and post scores as predictors, with GPE as the dependent variable. They reasoned that if the partial regression coefficient for the pre score was significant, it would indicate that baseline scores are taken into account in the calculation of change. The results of Guyatt et al.'s study were somewhat mixed: in some cases patients seemed to recall their prior state, even after up to 4 weeks; in others they did not. Guyatt et al. suggested that confidence in the utility of transition (GPE) scales is determined by the extent to which the correlations between pre and transition and between post and transition are similar; where they are very dissimilar, the scale may be providing biased information.
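The checks described above can be illustrated with a minimal sketch on synthetic data. Everything in it is an assumption made for illustration: the function names `transition_checks` and `simulate`, the score distributions, and the two hypothetical raters (one who reports true change, one who reports only current state). It is not the analysis code of Guyatt et al. or of the present study.

```python
import numpy as np


def transition_checks(pre, post, gpe):
    """Correlations and regression coefficients relating pre/post scores to GPE."""
    pre, post, gpe = (np.asarray(v, dtype=float) for v in (pre, post, gpe))
    change = post - pre

    def r(a, b):
        return float(np.corrcoef(a, b)[0, 1])

    # Ordinary least squares for: gpe = b0 + b_pre * pre + b_post * post
    design = np.column_stack([np.ones_like(pre), pre, post])
    coef, *_ = np.linalg.lstsq(design, gpe, rcond=None)
    return {
        "r_change": r(change, gpe),
        "r_pre": r(pre, gpe),
        "r_post": r(post, gpe),
        "b_pre": float(coef[1]),
        "b_post": float(coef[2]),
    }


def simulate(n=2000, seed=0):
    """Hypothetical pain scores (higher = worse) with similar pre and post variance."""
    rng = np.random.default_rng(seed)
    pre = rng.normal(6.0, 1.5, n)
    post = 0.5 * pre + rng.normal(0.0, 1.3, n)  # SD of post is also about 1.5
    noise = rng.normal(0.0, 1.0, n)
    gpe_change = (pre - post) + noise  # hypothetical rater who reports true change
    gpe_current = -post + noise        # hypothetical rater who reports current state only
    return pre, post, gpe_change, gpe_current


pre, post, gpe_change, gpe_current = simulate()
for label, gpe in (("true change", gpe_change), ("current state", gpe_current)):
    res = transition_checks(pre, post, gpe)
    print(label, {k: round(v, 2) for k, v in res.items()})
```

Under these assumptions, the rater who reports true change should show pre and post correlations of similar size and opposite sign, while the rater who reports only current state should show a post correlation that dominates and a pre regression coefficient near zero.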

The present study was designed to further investigate the measurement properties of the GPE scale in subjects with musculoskeletal conditions. The three aims of the study were: first, to establish the test–retest reliability of the scale; second, to determine to what extent, if any, patients take their baseline status into account when scoring the GPE; and third, to determine how much influence the transition time period has on the performance of the scale.

Data sets and questionnaires

Studies were chosen from those available to the researchers to represent a range of musculoskeletal conditions and had to measure pain, disability, and GPE at a number of time points. Data collected from seven different studies were used in the analyses; features of the source studies are outlined in brief in Tables 1 and 2. Pain was assessed in all the studies via a numerical rating scale or visual analog scale, and disability was measured using validated questionnaires. GPE was measured in

Results

The test–retest reliability of GPE measures in 134 patients with chronic whiplash-associated disorders (WAD) was high at all time points. The point estimates (95% confidence interval) for the ICC(2,1) statistic were 0.998 (0.997, 0.999) at the first assessment, 0.965 (0.951, 0.975) at 6 weeks, and 0.925 (0.895, 0.947) at 12 months. The reliability of GPE scores collected 24 hours apart in 50 subjects with chronic low back pain (LBP) was also high, with an ICC(2,1) value of 0.901 (0.856, 0.932).
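For readers who want to see how such a coefficient is obtained, the following is a minimal sketch of a single-measure, two-way random effects, absolute agreement ICC, computed from the mean squares of a subjects-by-occasions table. It assumes that the ICC statistic reported above is the Shrout and Fleiss ICC(2,1) form. The function name `icc_2_1` and the ratings in the example are hypothetical, and confidence intervals are not computed.

```python
import numpy as np


def icc_2_1(scores):
    """ICC(2,1) for an array with one row per subject and one column per occasion."""
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()

    # Sums of squares for the two-way layout without replication
    ss_subjects = k * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_occasions = n * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_error = np.sum((x - grand) ** 2) - ss_subjects - ss_occasions

    ms_subjects = ss_subjects / (n - 1)
    ms_occasions = ss_occasions / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_occasions - ms_error) / n
    )


# Hypothetical test and retest GPE ratings for eight subjects (not study data)
ratings = np.array([
    [4, 4], [2, 3], [-1, -1], [5, 5],
    [0, 1], [3, 3], [-3, -2], [1, 1],
])
print(round(icc_2_1(ratings), 3))
```

Because this form treats occasions as a random effect and penalises systematic shifts between test and retest, it is a stricter measure of agreement than a simple correlation between the two sets of ratings.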

The correlations between pre, post, and

Principal findings

ICCs that ranged from 0.90 to 0.99 indicate excellent test–retest reliability of the GPE for periods up to 24 hours in subjects with WAD and chronic LBP. Although authors have questioned the reproducibility of single-item measures [7], as far as we are aware no one has empirically tested this in GPE scales. Given the relatively short time period between successive administrations of the question, we cannot completely rule out the possibility of recall bias. However, in the absence of any
