Introduction

Up to 60% of persons with spinal cord injury (SCI) experience pain.1 Chronic pain after SCI has been associated with depressive symptoms, anxiety, poorer quality of life, and sleep disturbances.2, 3 Individuals with SCI report the greatest pain relief from opioid medications, and 82% report taking at least one prescription medication for pain management.4 Given the widespread implications of pain post-SCI, and the epidemic of pain medication overdose in the general population,5 it is critical that clinicians working with individuals with SCI have a thorough understanding of the potential for pain medication misuse (PMM) as well as valid and reliable means of identifying individuals at elevated risk of PMM.

PMM has been documented in 17–25% of persons with SCI.6, 7 Various factors have been associated with increased risk for PMM in individuals with SCI including: greater reported pain intensity; more limitations in daily activities due to pain; higher frequency of pain medication use; smoking tobacco; cannabis use; anxiety; depressive symptoms; and impulsive sensation-seeking.6, 7 PMM is an important issue for clinical consideration, as it has been associated with higher likelihood of fall-related injuries in ambulatory persons with SCI,8 as well as fractures, overdose, and myocardial infarction in the general population.9 Given that cardiovascular disease is one of the leading causes of mortality10 and fractures are associated with increased risk of mortality11 in individuals with SCI, PMM in this population should be carefully monitored.

One questionnaire utilized for identifying individuals at risk for PMM is the Pain Medication Questionnaire (PMQ).12 The PMQ consists of 26 items rated on a five-point Likert scale with total scores ranging from 0–104, where higher scores indicate higher risk of PMM. Initial psychometric testing on the PMQ showed moderate, but acceptable, reliability coefficients (test-retest reliability Pearson’s r=0.85, Cronbach’s α=0.73) as well as correlation of scores with levels of psychosocial distress (Pearson’s r=0.23–0.35, P<0.05) and physical functioning (Pearson’s r=0.23–0.36, P<0.01).12 PMQ scores have also demonstrated sensitivity to change over time and have been shown to be predictive of future pain-medication seeking behaviors and early termination from pain medication treatment programs.13, 14

Previous research studies have utilized cutoff values for PMQ total scores6, 7 or have divided total scores into tertiles14, 15 to identify individuals who demonstrate behaviors predictive of PMM. The difficulty with approaches based on total scores is twofold: (1) they assume that the PMQ measures a single, unidimensional construct, and (2) they assume an interval-level scale, which is a prerequisite for additivity of scores.16 There is some literature to suggest that the PMQ is multidimensional; previous psychometric testing on the PMQ revealed items with low point-total correlations15 and 10 significant components were identified via a principal component analysis (PCA).17 Additionally, it is known that ordinal-level observations, such as those obtained from the rating scale of the PMQ, do not sufficiently approximate interval-level data and, thus, are not sufficient for the generation of summed total scores.18 As such, it is imperative that the precision of the PMQ for separating individuals into meaningful classification categories be empirically tested utilizing modern measurement techniques, which can transform ordinal scale observations into interval scale measurements.

One such modern measurement technique is Rasch analysis, which allows for direct comparison of item difficulty and person ability on a common, interval-level scale.19 Rasch analysis also allows for the examination of a measure’s capacity for distinguishing statistically distinct levels of person ability on the measured construct,20 making this analytical approach ideal for evaluating the PMQ’s capacity to distinguish individuals with and without high likelihood of PMM. In addition, the item and person parameters obtained from Rasch analysis are invariant, that is, sample independent.21 Therefore, PMQ measurement properties obtained from a given sample of persons with SCI would remain consistent had a different sample been selected from the population of persons with SCI. This useful feature of Rasch analysis maximizes the generalizability of sample findings to the broader clinical population of individuals with SCI.

Purpose

The objective of the present study is to determine how well the PMQ measures the construct of risk for PMM and its precision in separating individuals with SCI into meaningful classification categories that indicate likelihood of PMM.

Materials and methods

Data source

The present study was a secondary analysis of data collected from the South Carolina SCI Surveillance System Registry (SCISSR), an annual population-based registry of SCI. Incident cases of SCI were identified using International Classification of Diseases, 9th Revision, Clinical Modification codes of 806 (0.0–0.9) and 952 (0.0–0.9). Personal identifiers were used to eliminate duplicate admissions, and nonresidents of South Carolina were excluded. Persons with SCI and hospital discharge in 1998 to 2012 were eligible for the SCISSR if at the time of the study they: (a) were ≥18 years of age; (b) were ≥1 year post injury; and (c) had traumatic SCI with residual effects. Data on diagnoses in the SCISSR were verified through random selection of medical charts.22

The SCISSR was used as a basis for identifying participants for a more detailed follow-up self-report assessment, collected by mail. There were 971 participants in the follow-up. The present analysis included PMQ data for individuals in the follow-up who reported active use of pain medication at the time of the data collection (n=745). Participants were an average of seven years post-SCI and 50.5 years of age (Table 1). Seventy-one percent of participants were male. A majority of participants were white non-Hispanic (n=422) or black non-Hispanic (n=262), had a non-cervical SCI (n=247), and were ambulatory (n=396). As with all self-report, there were varying degrees of missing data on individual items.

Table 1 Participant characteristics

Analyses

Tests of unidimensionality

As unidimensionality is a key assumption of Rasch measurement, the unidimensionality of the PMQ was explored via exploratory factor analysis (EFA) and PCA of Rasch-derived residuals. An EFA was conducted with principal factors estimation and oblique rotation in the statistical software R with package ‘psych.’23 A polychoric correlation matrix was generated for only those participants with complete data (n=646), due to the ordinal nature of the data24 and used as the input matrix for the EFA. A minimum factor loading of 0.32 was chosen as the level of significance, as this equates to approximately 10% overlapping variance with other items in that factor.25
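The 0.32 threshold can be checked with one line of arithmetic. This assumes the usual reading that a squared loading approximates the share of variance an item has in common with its factor, an interpretation that is only approximate under oblique rotation:

```latex
\[
0.32^{2} = 0.1024 \approx 10\%
\]
```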

In addition, PCA of Rasch-derived residuals was conducted based on the results of the EFA. All items that loaded 0.32 or above on the EFA were included in the PCA. The following three criteria were used to examine results for unidimensionality of included items: 1) the eigenvalue of the first residual component, after the Rasch-derived construct is removed, is ≤2.0;26 2) the magnitudes of the item loadings on the first residual component are <|0.38|;27 and 3) item infit and outfit statistics are <2.0. The impact of multidimensionality on the estimation of person measures was evaluated by conducting a series of independent t-tests and constructing 95% confidence intervals in a plot of person measures derived from all items and from subsets of items identified as contributing to multidimensionality.27 The PCA of Rasch residuals was conducted in WINSTEPS software version 3.90.028 and included participants with and without missing data (n=745). The overall missing data rate for the 26 original PMQ items was 2.10%. During WINSTEPS estimation, the observed marginal counts and the observed and expected marginal scores are computed from non-missing observations.26

Rasch analysis

Once a reasonably unidimensional set of items was identified from the PMQ, a rating scale model Rasch analysis with JML estimation was conducted using WINSTEPS, version 3.90.0.28 First, the appropriateness of the rating scale was evaluated using the following criteria: (1) at least 10 observations of each category, collapsed across all items; (2) monotonicity of rating scale categories (that is, 0–4) as evidenced by an increase in average category difficulty with increasing category value; and (3) outfit mean-square <2.0. Second, the fit of the items and persons to the Rasch model was evaluated by examining infit and outfit mean squares and standardized z-values.29 Mean square values >1.70 and standardized z-values >2.0 were considered indicative of misfit to the Rasch model.30 Third, reliability indicators were examined including: (1) person reliability, which represents the reproducibility of person ordering and was interpreted such that values ≥0.5 were considered adequate, ≥0.80 were considered good, and ≥0.90 were considered high,26 and (2) the separation index, which was used to calculate the number of statistically distinct ability strata in the sample.20 The number of person strata is calculated according to the formula $\mathrm{strata} = (4G + 1)/3$, where G is the person separation index; the result indicates the number of statistically distinct person measures with centers three calibration errors apart.20 Test targeting, test coverage, and item hierarchy were examined visually using person-item maps. Last, we examined differential item functioning (DIF) of included PMQ items for individuals with cervical vs non-cervical SCI using the Mantel test.26 The rationale behind this analysis is that some studies have shown that individuals with cervical SCI experience higher levels of pain than persons with non-cervical SCI.31, 32 Thus, it is important to assess whether the measurement properties of the PMQ are consistent across these diagnostic populations. DIF contrasts, which represent the magnitude of the difference in item difficulties between classification groups, of ≥0.43 logits were interpreted as slight to moderate and those of ≥0.64 logits as moderate to large.26 Problematic DIF was identified by a statistically significant χ2 test and a DIF contrast ≥0.43 logits.
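For readers unfamiliar with the rating scale model, the sketch below shows how category probabilities for a five-category item (0–4) follow from a person measure, an item difficulty, and a shared set of thresholds. It uses the standard Andrich formulation; the function names and the threshold values are hypothetical and are not estimates from this study.

```python
import math


def rsm_category_probabilities(theta, delta, thresholds):
    """Return P(X = k), k = 0..m, under the Andrich rating scale model.

    theta: person measure (logits); delta: item difficulty (logits);
    thresholds: the m step thresholds shared by all items.
    """
    # Cumulative sums of (theta - delta - tau_j); the k = 0 term is 0 by definition.
    exponents = [0.0]
    for tau in thresholds:
        exponents.append(exponents[-1] + (theta - delta - tau))
    peak = max(exponents)  # subtract the maximum for numerical stability
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


def expected_item_score(theta, delta, thresholds):
    """Model-expected rating for one person on one item."""
    probs = rsm_category_probabilities(theta, delta, thresholds)
    return sum(k * p for k, p in enumerate(probs))


# Hypothetical thresholds for a 0-4 rating scale.
taus = [-1.5, -0.5, 0.5, 1.5]
print(rsm_category_probabilities(theta=0.0, delta=0.0, thresholds=taus))
print(expected_item_score(theta=1.0, delta=0.0, thresholds=taus))
```

Because every item shares the same thresholds, items differ only in their difficulty, which is what allows item and person measures to be placed on one logit scale.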

Statement of ethics

We certify that all applicable institutional and governmental regulations concerning the ethical use of human volunteers were followed during the course of this research.

Results

Unidimensionality

Exploratory factor analysis

Results of the EFA suggest that a two-factor solution provided the optimum explanation of the observed data. Review of the factor loadings (Table 2) revealed that 23 items had significant loadings (>0.32) on the first factor, and two items had significant loadings on the second factor. Further inspection revealed the two items with significant loadings on the second factor (Item 1: ‘I believe I am [NOT] receiving enough medication to relieve my pain,’ and Item 2: ‘My doctor [DOES NOT] spend enough time talking to me about my pain medication during appointments’) also loaded significantly on the first factor. These results were interpreted to be preliminary evidence of unidimensionality for the 23 items that significantly loaded on the first factor. Factor 2 was excluded from subsequent analyses for the following reasons: (1) the two items comprising this factor also loaded significantly on the first factor; and (2) Factor 2 was judged to be insufficient for measurement of a separate construct, due to having only two items.

Table 2 Loadings from exploratory factor analysis (EFA) and principal component analysis (PCA) of standardized Rasch residuals

PCA of Rasch-derived residuals

Using these 23 items, results of PCA of Rasch-derived residuals revealed that the items did not meet our a priori criteria for unidimensionality. Specifically, the first residual component had an eigenvalue of 2.29, and multiple items had loadings >|0.38| on residual contrasts (Table 2). Additionally, one item had an outfit statistic >2.0. As a result, the 23 items were subjected to a series of analyses to test the effect of any multidimensionality on person measures.27 Specifically, we conducted a series of independent t-tests to compare person measures obtained from the 23-item PMQ with person measures obtained from the items that loaded significantly on the first residual contrast (first seven items listed in Table 2). The rationale behind this approach is that, if the 23 items from the PMQ are sufficiently unidimensional, there will not be a significant distortion in person measures obtained from any subset of these 23 items. Findings revealed that only 1.7% (95% binomial confidence interval: 0.8–2.7%) of person measures were distorted, which suggests that the subset of 23 PMQ items was sufficiently unidimensional.

Rasch analysis

Given a set of 23 sufficiently unidimensional items, we proceeded with Rasch analysis and examined indices of fit. Examination of the rating scale indicated acceptable fit (infit and outfit <2.0) and monotonicity of all rating scale categories. All but three items, which are indicated with asterisks in Table 3, demonstrated adequate fit to the Rasch model as evidenced by infit and outfit statistics <1.60 and standardized residuals (standardized z-values) <2.0. Examination of person fit statistics, using the same criteria, revealed that 47 persons (6.31%) demonstrated significant misfit to the model.

Table 3 Measures and fit statistics for 23 items from the Pain Medication Questionnaire

The Cronbach’s α coefficient for the PMQ was 0.78 and person reliability was adequate (0.67). Additionally, results revealed a person separation index of 1.44, which was input into the formula for strata calculation.20 Strata calculation revealed that the analyzed subset of PMQ items functions to separate persons into two groups (strata =2.25): those more likely to misuse pain medication and those with low likelihood of misusing pain medication.
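The reported figures can be reproduced with a short calculation. The sketch assumes the commonly used strata formula (4G + 1)/3 and the standard Rasch identity linking separation to reliability, G²/(1 + G²); the latter is not stated in the text and is included only as a consistency check. Function names are illustrative.

```python
def person_strata(separation):
    """Number of statistically distinct strata implied by separation index G."""
    return (4.0 * separation + 1.0) / 3.0


def reliability_from_separation(separation):
    """Reliability implied by separation index G (standard Rasch identity)."""
    return separation ** 2 / (1.0 + separation ** 2)


G = 1.44  # person separation index reported above
print(round(person_strata(G), 2))                # 2.25
print(round(reliability_from_separation(G), 2))  # 0.67
```

Both values agree with the strata (2.25) and person reliability (0.67) reported above.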

Test targeting and coverage, as well as item hierarchy, were examined by inspecting the distribution of the item and person measures from item-person maps. The mean person measure was 1.08 logits (standard error =0.26) lower than the mean item measure, which suggests the present sample was skewed towards low likelihood of PMM (Figure 1). Floor effects were minimal with 0.7% of persons who achieved minimum scores; no ceiling effects were found. Person-item maps (Figure 1) and examination of item measures (Table 3) also revealed trends in item hierarchy such that the three easiest items to endorse (‘How many painful conditions do you have?’; ‘I would feel better with a higher dose of my pain medication;’ and ‘I believe I am (NOT) receiving enough medication to relieve my pain’) are related to thoughts or beliefs about pain medication, while the three items that were least likely to be endorsed (‘To help me out, family members have obtained pain medications for me from their own doctors;’ ‘I get pain medication from more than one doctor in order to have enough medication for my pain;’ and ‘How many times in the past year have you accidentally misplaced your prescription for pain medication and had to ask for another?’) address behaviors which go beyond traditional methods of obtaining pain medication to address pain. Examination of Rasch-half-point threshold maps and items with mean measures within two standard errors of one another revealed a considerable degree of overlap in the middle of the scale, indicating that item reduction may be possible.

Figure 1

Rasch person-item map depicting relationship of person measures to item difficulties, in logits, on the same interval level scale. Person measures and item difficulties range from low (bottom of figure) to high (top of figure). Each ‘ × ’ represents 8 persons, and each ‘.’ represents 1–7 persons.

In a post hoc analysis, prompted by the finding that the PMQ separated persons with SCI into two strata, we examined commonly used total score cutoffs6, 7 of 25 and 30 in relation to the item difficulty hierarchies produced by Rasch analysis. WINSTEPS software provides complete score-to-measure conversion tables and an ogive curve, which allows for the comparison of total scores on an instrument to logit measures (Figure 2). To allow for direct comparison of these conventional total score cutoffs with Rasch results, we conducted a Rasch analysis with all 26 items from the original PMQ included. Results revealed that the cutoff scores of 25 and 30 were associated with Rasch measures of −0.77 and −0.62, respectively. Examination of these values on the person-item map (Figure 1) revealed that these scores were: a) near the center of the person measure distribution in the present sample, and b) around the same level as the mean measure of the item: ‘I believe I am [NOT] receiving enough medication to relieve my pain.’ This suggests that use of these cutoff scores would identify likely pain medication misusers as persons who endorse that item or any items of greater difficulty.

Figure 2

Raw score to measure ogive for complete PMQ. This figure provides a conversion between Rasch-calibrated person measures and item difficulties (x axis) and PMQ total scores (y axis).

Last, we examined item DIF (Table 4) for individuals with cervical vs non-cervical SCI. Participants who did not report their SCI injury level in the survey (n=96) were not included in this analysis. A Bonferroni correction was applied due to multiple comparisons, which resulted in a threshold of significance at P≤0.02. Only one item, ‘Number of painful conditions,’ demonstrated statistically significant DIF (χ2(1)=10.07, P=0.002), where this item was more difficult for individuals with non-cervical injuries. However, the DIF contrast, that is, the magnitude of the difference in item difficulty between the two groups of persons, was small (0.20 logits). As it is suggested that DIF contrasts <0.43 logits are indicative of a negligible level of DIF, this item was retained.26

Table 4 Results of DIF analysis for individuals with cervical vs non-cervical SCI

Discussion

The present study utilized Rasch analysis to explore the measurement properties of the PMQ in a sample of persons with SCI. Results suggested a subset of 23 items from the PMQ represented a single unidimensional construct. Rasch analysis results revealed that the rating scale and the majority of persons (>93%) and items (20/23) fit the Rasch measurement model. The PMQ demonstrated adequate reliability and functions to separate persons into two strata: those likely to misuse pain medication and those with low likelihood of misusing pain medication. An absence of ceiling effects and minimal (0.7% of persons) floor effects were observed when comparing person and item measures. Examination of item measures and thresholds revealed that some PMQ items, particularly those with moderate difficulty, demonstrate measurement overlap; this suggests that item reduction may be possible.

Overall, the PMQ items performed well in the Rasch model. A negligible amount of DIF was observed for one item when comparing individuals with cervical vs non-cervical SCI. Three items demonstrated significant misfit to the model. Those items were:

  • Item 5: ‘I [WOULD] mind quitting my current medication and trying a new one if my doctor recommends it’

  • Item 15: ‘I get pain medication from more than one doctor in order to have enough medication for my pain’

  • Item 17: ‘To help me out, family members have obtained pain medications for me from their own doctors’

Two of these items (Items 15 and 17) were found to be the most difficult items to endorse. Post hoc descriptive analyses of these variables revealed that >95% of respondents answered ‘0’ to these items, indicating they ‘never’ engage in those behaviors. It is possible the misfit was a result of invariance in responses to these items such that respondents were either not engaging in these behaviors or did not report engagement in these behaviors.26 As a result, these items may be of little value in the measurement of risk of PMM in persons with SCI. However, in other diagnostic populations, these items may be more informative.

Our findings also have important clinical implications. Specifically, results do not support the division of PMQ total scores into tertiles to classify individuals as high, medium or low risk for PMM, as the PMQ distinguishes persons into only two strata. In addition, the commonly utilized cutoff scores6, 7 may be too low, thus leading to misclassifications of an individual’s risk of PMM. Examination of item content in relation to item difficulty hierarchy suggests a cutoff around the items ‘How many times in the past year have you run out of pain medication early and had to request an early refill,’ ‘At times, I run out of pain medication early and have to call my doctor for refills,’ and ‘My pain medication makes it hard for me to think clearly sometimes’ may be more indicative of PMM, and therefore more clinically meaningful, than the item ‘I believe I am [NOT] receiving enough medication to relieve my pain.’ These item measures are around 0.05–0.14 logits (Table 3), which corresponds to a PMQ total score of 55–60 (Figure 2).

Limitations and future directions

The present study has some methodologic considerations worth noting. First, the PMQ was not originally designed to adhere to the Rasch measurement model, which requires a single unidimensional construct and hierarchical item structure. As a result, it is unsurprising that the PMQ failed to meet some Rasch assumptions. Second, the present sample was skewed towards lower likelihood of PMM. It is possible that inclusion of persons with higher likelihood of PMM in the analysis may reveal that the PMQ distinguishes persons into more than two strata. Moving forward, the measurement properties of the PMQ should be studied in individuals with SCI with greater variety in likelihood of PMM to build upon the findings. Third, we utilized a population-based cohort, which is the gold standard for SCI recruitment, since it captures the full range of SCI, including those who may not receive treatment in a rehabilitation or specialty hospital. The findings therefore generalize to the full population of persons with SCI; however, such a cohort includes a substantially larger portion of individuals who are ambulatory than cohorts treated in traditional specialty hospitals.

Conclusion

Given the exponential increase in the rate of prescription pain medication overdose-related deaths in recent years in the general population, it is critical that clinicians and researchers utilize valid and reliable measures for identifying individuals at elevated risk of PMM. The chronic pain experienced by individuals post-SCI may predispose this population to elevated risk of PMM and subsequent PMM-related comorbidities. Findings of this study suggest that 23 of the 26 PMQ items yield valid and reliable estimates of PMM in persons with SCI and function to distinguish them into two strata: those more likely to misuse pain medication and those with low likelihood of misusing pain medication. Gaining a deeper understanding of the measurement properties of the PMQ is a necessary precursor for widespread population-based studies seeking to elucidate the incidence of PMM in persons with SCI. Future studies that build upon these findings by including individuals with SCI across a broader range of likelihood of PMM are warranted.