This study had two main results. First, the BS-derived joint assessment correlated significantly with clinical and PET/CT-derived joint counts, and its reliability was good against both clinical and PET/CT-derived findings. Second, we developed a disease activity formula, the BS/DAS, which is composed of the BSS28, the ESR level, and the PGA. The formula was also confirmed in a validation group.
Our previous study showed that FDG-PET/CT can serve as a sensitive and reproducible method for assessing disease activity in patients with RA [12]. Although the radiation dose is reduced with more advanced scanners, increased radiation exposure remains a major safety concern with this procedure [15]. In Korea, the average radiation doses of PET/CT and BS are 12.2 and 4.2 mSv, respectively, as estimated by a national survey [14, 15]. Furthermore, a PET/CT examination is costly and requires accompanying facilities, including those for tracer production, so PET/CT may not be feasible in small to moderate-sized facilities.
Therefore, the use of FDG-PET/CT for evaluating disease activity in routine clinical practice remains challenging. In contrast, BS imaging for active joint counts involves much less radiation exposure than PET/CT imaging while providing similarly reliable results in patients with RA. The correlation between the BS/DAS and the DAS28-ESR in the validation group of this study was comparable to that of the PET/DAS formula in a previous study (r = 0.806, p < 0.001 vs. r = 0.843, p < 0.001, respectively) [12].
BS is a highly sensitive nuclear imaging technique that uses a radiotracer to evaluate the distribution of active bone formation [20]. Indications for BS include solid tumors with high affinity for bone, metabolic bone diseases, and joint diseases such as chronic inflammatory arthritis and osteoarthritis (OA) [20]. In rheumatology, BS has been used for the differential diagnosis of RA, OA, spondyloarthritis, and unclassified arthritis [21–23]. Additionally, our results show that joint counting by BS is a reproducible method for assessing bone changes in joints affected by synovitis, with good reliability between observers. Thus, BS can be used for measuring disease activity in patients with RA.
Although previous studies on disease activity assessment using BS in patients with RA are limited, two reports showed a significant correlation between regional uptake in large joints on BS and disease activity [24, 25]. These studies did not evaluate all 28 joints, which include small joints, and did not compare the BS values with the DAS28. According to an analysis of affected joints in a large cohort of patients with RA, tender joints were frequently observed in large joints, whereas swollen joints were frequently observed in the small joints of the hands [26]. Thus, evaluating large joints alone is not sufficient to represent disease activity accurately. Furthermore, in our study, the reliability of BS relative to clinical assessment was lower for large joints, such as the knee and shoulder, than for other joints. Therefore, a joint count based on the BS values of 28 joint areas, including both small and large joints, should provide a more objective parameter for disease activity assessment. Because it is important to determine the cut-off value of the BS score for identifying synovitis in patients with RA, we compared individual affected joints between BS scores and PET/CT findings. A BS uptake score of 2 was significantly more reliable than a BS uptake score of 1 in detecting PET-positive joints among the 28 joints. Thus, we used a BS uptake score of 2 as the criterion for a BS-positive joint.
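The joint-level criterion described above can be illustrated with a minimal sketch. It assumes that each joint carries an integer BS uptake score and that a joint counts as BS-positive when its score reaches the cut-off of 2; treating scores above 2 as positive, as well as the function name, joint labels, and example scores, are hypothetical choices made for illustration and are not taken from the study data.

```python
from typing import Mapping

# Cut-off taken from the text: an uptake score of 2 marks a BS-positive joint.
# Treating scores above 2 as positive too is an assumption of this sketch.
BS_POSITIVE_CUTOFF = 2


def count_bs_positive_joints(uptake_scores: Mapping[str, int],
                             cutoff: int = BS_POSITIVE_CUTOFF) -> int:
    """Return the number of joints whose uptake score reaches the cut-off."""
    return sum(1 for score in uptake_scores.values() if score >= cutoff)


# Hypothetical patient: only a few of the 28 joints are listed for brevity.
example_scores = {
    "right_wrist": 2,
    "left_wrist": 1,
    "right_knee": 2,
    "left_mcp_2": 0,
}
positive_count = count_bs_positive_joints(example_scores)
```

Lowering the `cutoff` argument to 1 in this sketch would also count the joint scored 1, which corresponds to the less reliable criterion that was compared against PET/CT.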
Although measuring RA disease activity plays a crucial role in detecting synovitis, clinical joint counts are not routinely performed in clinics because the reliability of joint count assessments, in terms of both intra-observer and inter-observer variability, needs to be explored further [27]. The intra-observer ICCs for clinical joint counts by healthcare professionals ranged from 0.47 to 0.98 for both TJC and SJC [28], whereas kappa values at the joint level varied from fair to good for SJC [29], suggesting that joint assessment is inconsistent in clinical practice. Furthermore, inter-observer reliability depended on the variation among study samples in identifying positive joints, with ICCs ranging from 0.29 to 0.98 and kappa values ranging from poor to excellent [30, 31]. By contrast, joint counting by BS is a reproducible method for assessing synovitis, with excellent inter-observer and intra-observer reliability. Moreover, BS images show the overall pattern of joint involvement in synovial inflammation [20].
Surprisingly, the ICC for reliability between BS and PET/CT findings across the 28 joints, that is, between BSS28 and PET28, was 0.782 (0.646–0.866). Furthermore, the ICC between BSS28 and TJC28 was comparable to that between PET28 and TJC28 (0.646 and 0.728, respectively) [12], implying good reliability between the BSS28 and clinical assessments performed by experienced clinicians. We also developed a novel BS/DAS formula that takes its joint count from BS assessment alone, without using the results of joint assessment performed by experienced clinicians. This formula was confirmed in an independent validation group of patients with RA. The BS/DAS, which may overcome the variability of clinical evaluation by joint assessors with diverse backgrounds, can complement the DAS28-ESR and may provide results similar to those of more advanced modalities such as PET/CT for the evaluation of disease activity.
This study has two limitations. First, because BS reflects bone remodeling, uptake in knee joints can be observed in patients with knee OA [21], regardless of RA disease activity. Second, patients were enrolled at a single center; thus, multicenter validation studies of BS are warranted to determine whether our findings are generalizable.