Introduction

Dual-energy X-ray absorptiometry (DXA) is currently a principal method for measuring bone mineral density (BMD) in both clinical practice and drug trials. The three dominant DXA manufacturers are Hologic Inc. (Bedford, MA, USA), GE-Lunar Inc. (Madison, WI, USA), and Cooper Surgical (Norland; Trumbull, CT, USA). Although the DXA technology is similar across these manufacturers, the BMD results differ because of different calibration standards, proprietary algorithms for calculating BMD, and differences in the regions of interest (ROI). As a result, a patient scanned on three different DXA systems will have substantially different BMD values. For example, Hologic spine BMD is typically 11.7% lower than GE-Lunar BMD and 0.6% higher than Norland BMD. These differences complicate the pooling of BMD values from different systems in multi-center clinical trials and make it difficult to compare BMD measures over time when a patient is scanned on different systems.

To solve this comparability problem, the International Committee for Standards in Bone Measurements (ICSBM) conducted a study in 1994 in which 100 women were scanned on all three of these DXA systems. The study was performed at the University of California at San Francisco (UCSF) using pencil-beam DXA systems made by the three dominant manufacturers at that time: Hologic QDR 2000 in pencil-beam mode, Lunar DPX-L, and Norland XR26 Mark II. Subsequently, the ICSBM published universal BMD standardization equations to convert spine and hip BMD results from different systems into a standardized BMD (sBMD) [1, 2], so that sBMD derived from the three systems would be approximately the same for any given patient. To overcome residual bias present in the published ICSBM conversions, Hui et al. published optimized equations for spinal sBMD [3]. In 2001, Lu et al. published femur subregional conversion equations to cross-calibrate between different manufacturers [4]. These updated formulas are frequently used in large multi-center clinical trials and epidemiological studies.

Advances in DXA technology have resulted in the development of a new generation of densitometers in which the pencil-beam X-ray source and the single detector of the pencil-beam instruments were replaced by a fan-beam X-ray source and a multiple-element detector array. Whereas pencil-beam scans report accurate bone area and dimensions, the measures of bone area (AREA) and bone mineral content (BMC) for fan-beam scans may have a magnification error that depends on the height of the bone above the scanning table (i.e., the higher the bone off the table, the smaller the projected bone area, since the X-ray source is in the table) [5]. Hologic systems employ a single-pass wide-angle fan beam, while GE-Lunar systems use a multi-pass narrow-angle fan beam with some overlap between passes. The current DXA software is highly automated for the placement of ROI, while the older software versions were completely manual. These software changes include adjustments to the absolute BMD values as well. The traditional recommendation regarding patient positioning for spine scans involved elevating the legs with a positioning block for pencil-beam systems. Currently, the Hologic fan-beam systems still use the positioning block, while GE-Lunar offers the option (Onescan™) of not elevating the legs, which slightly alters the projection of the spine in the image. The peak X-ray tube voltages used to generate the dual-energy images for the Hologic systems differ between their current fan-beam systems and previous pencil-beam models (140 and 100 kVp currently versus 140 and 70 kVp previously). Throughout all of the changes over the years, the DXA manufacturers have worked to keep the calibration of new models consistent with their original models. Lastly, the sBMD equations for the spine were derived using L2-L4, while L1-L4 is the current clinically recommended measurement.
Nevertheless, as older systems are replaced with newer models, the comparability of measurements made using different systems, with their associated proprietary software and different modes of operation, becomes an important issue in research studies as well as clinical practice. The objective of this study was to determine whether the standardization formulas derived from pencil-beam DXA scanners are still appropriate for modern DXA systems.

Materials and methods

Study population

The three facilities involved in this study were New Mexico Clinical Research & Osteoporosis Center, Albuquerque, NM, USA (site 1); Colorado Center for Bone Research, Lakewood, CO, USA (site 2); and UCSF, San Francisco, CA, USA (site 3). Each study facility recruited 30 postmenopausal women, ages 52 to 85 years (mean age 63.3 ± 9.2), for a total of 90 participants. Three participants’ scans were lost due to corrupted scan files, so the scan results of 87 women were included in this report. The local human research committee for each facility approved the study, and participants signed an approved informed consent prior to participating. There were no participant restrictions for ethnicity or body mass.

Bone densitometry

All women were scanned twice on both Hologic Delphi (Hologic, Inc., Waltham, MA, USA) and GE-Lunar Prodigy (Madison, WI, USA) DXA systems using each manufacturer’s standard scan and positioning protocols. Spine phantom quality control scans were acquired on each of the six systems on a continual basis during the study, but no cross-calibration was performed for any of the systems. Each patient was positioned for the lumbar spine scan and then for the left and right proximal femur scans. The subjects were asked to stand between scans and were then repositioned. The 30-s scan mode was used on both systems and for all positions. The legs were elevated using the Hologic positioning cushion for spine scans on the Hologic systems; legs were flat on the table for the femur scans. The Onescan™ method was used to scan the participants on the GE-Lunar system, except at one study facility (UCSF), where the single femur mode was used to scan each hip separately. The positioning and scan modes were chosen to mimic current clinical practice and manufacturer recommendations.

Scan analysis

Using the methods recommended by each manufacturer for the ROI placement, one technologist analyzed all the images using either Hologic Apex 3.0 (prerelease) or GE-Lunar EnCore 10.10.

The “compare” (Apex) or “copy” (Prodigy) methods were used to analyze the repeat measurements, thereby facilitating consistent size and placement of analysis regions for each participant. The test–retest precision of the scans was previously reported [6]. In short, the pooled precision from duplicate scans on this population for Apex and Prodigy was statistically the same for L1-L4 (1%) and total hip (1.1%), and different for the femur neck (2.3% versus 1.8%, respectively (p = 0.03)).
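As a rough illustration of how a pooled precision figure can be obtained from duplicate scans, the sketch below computes a root-mean-square coefficient of variation. The RMS-CV definition, the function name, and the sample values are assumptions made for illustration; the exact method used here is the one described in [6].

```python
import math

def rms_cv_percent(pairs):
    """Pooled (root-mean-square) %CV from duplicate measurements.

    pairs: iterable of (first_scan, second_scan) BMD values for each subject.
    Assumes the RMS-CV definition; this is not taken from the paper itself.
    """
    squared_cvs = []
    for first, second in pairs:
        sd = abs(first - second) / math.sqrt(2)  # SD of two measurements
        mean = (first + second) / 2
        squared_cvs.append((sd / mean) ** 2)
    return 100 * math.sqrt(sum(squared_cvs) / len(squared_cvs))

# Hypothetical duplicate spine BMD values in g/cm2
duplicates = [(1.01, 0.99), (0.88, 0.89), (1.20, 1.18)]
pooled_precision = rms_cv_percent(duplicates)
```

The squared CVs are averaged before taking the square root, so subjects with poor repeatability weigh more heavily than they would in a simple mean of CVs.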

Data conversion and statistical analysis

Demographics and other characteristics of the study population were expressed as means and standard deviation. The relationship between Apex and Prodigy software was defined using linear regression. The BMD values from both systems were converted into sBMD using the Hui et al. formulas for spinal BMD [3]:

$$ \begin{array}{l} \text{sBMD}_{\text{spine}} = 1.0550 \times \left( \text{SPTOTBMD}_{\text{Hologic}} - 0.972 \right) + 1.0436 \\ \text{sBMD}_{\text{spine}} = 0.9683 \times \left( \text{SPTOTBMD}_{\text{Lunar}} - 1.100 \right) + 1.0436 \end{array} $$

and the Lu et al. formulas for femur BMD [4]:

$$ \begin{array}{l} \text{sBMD}_{\text{Total hip}} = 1.008 \times \text{HTOTBMD}_{\text{Hologic}} + 0.006 \\ \text{sBMD}_{\text{Total hip}} = 0.979 \times \text{HTOTBMD}_{\text{Lunar}} - 0.031 \\ \text{sBMD}_{\text{Neck}} = 1.087 \times \text{NeckBMD}_{\text{Hologic}} + 0.019 \\ \text{sBMD}_{\text{Neck}} = 0.939 \times \text{NeckBMD}_{\text{Lunar}} - 0.023 \end{array} $$

Although it is customary to represent sBMD in mg/cm², we used g/cm² throughout this paper for both BMD and sBMD values, to compare the magnitude of absolute differences before and after applying the standardization equations.
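The spine and femur conversions above can be sketched in a few lines of code. The coefficients are copied from the Hui et al. [3] and Lu et al. [4] equations as quoted here; the function names, the dictionary layout, and the sample BMD values are illustrative assumptions.

```python
# Coefficients are those of the spine [3] and femur [4] equations quoted above.
# All names and the example inputs are hypothetical; values are in g/cm2.

SPINE = {"hologic": (1.0550, 0.972), "lunar": (0.9683, 1.100)}

FEMUR = {
    ("total_hip", "hologic"): (1.008, 0.006),
    ("total_hip", "lunar"): (0.979, -0.031),
    ("neck", "hologic"): (1.087, 0.019),
    ("neck", "lunar"): (0.939, -0.023),
}

def sbmd_spine(bmd, maker):
    """Spine form: slope * (BMD - offset) + 1.0436."""
    slope, offset = SPINE[maker]
    return slope * (bmd - offset) + 1.0436

def sbmd_femur(bmd, roi, maker):
    """Femur form: slope * BMD + intercept."""
    slope, intercept = FEMUR[(roi, maker)]
    return slope * bmd + intercept

# Hypothetical patient measured on both systems
spine_hologic = sbmd_spine(0.950, "hologic")
spine_lunar = sbmd_spine(1.080, "lunar")
neck_lunar = sbmd_femur(0.900, "neck", "lunar")
```

By construction, a Hologic spine BMD of 0.972 g/cm² and a Lunar spine BMD of 1.100 g/cm² both map to the same sBMD of 1.0436 g/cm².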

Bland–Altman statistics [7] were used to test the agreement between the sBMD of the Apex and Prodigy. All statistical analyses were performed using SAS software version 9.1. All statistical tests were two-sided, and two BMD measures were considered significantly different when the p value of the intercept or the slope, or both, was 0.05 or less. The Deming regression method was used to derive cross-calibration equations, mimicking the approach used by Hui et al. [3] and Lu et al. [4], to take into account that both variables have measurement uncertainties.
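For readers unfamiliar with Deming regression, the following minimal sketch shows the closed-form fit. The ratio of error variances (delta) is not stated in this paper, so delta = 1 (orthogonal regression) is an assumption here, as are the function name and the sample data; the analysis itself was run in SAS.

```python
import math

def deming(x, y, delta=1.0):
    """Return (intercept, slope) of a Deming fit of y on x.

    delta is the ratio of the y-error variance to the x-error variance.
    delta=1.0 is an assumption for illustration, not a value from the paper.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - mean_y) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    d = syy - delta * sxx
    slope = (d + math.sqrt(d * d + 4.0 * delta * sxy * sxy)) / (2.0 * sxy)
    return mean_y - slope * mean_x, slope

# Hypothetical paired spine BMD values in g/cm2
prodigy = [0.95, 1.05, 1.12, 1.20, 1.31]
apex = [0.82, 0.90, 0.97, 1.03, 1.13]
intercept, slope = deming(prodigy, apex)
```

Unlike ordinary least squares, this fit allows for error in both variables, which is why it suits cross-calibration between two imperfect instruments.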

Since standardization equations are not available for BMC and AREA, and since we wished to investigate the possible cause of the disagreement in the sBMD values, the original Genant equations [8] were used to convert the Prodigy AREA and BMD to Hologic-equivalent values. The Genant equations for the spine are

$$ \begin{array}{l} \text{Hol\_AREA}_{\text{Genant}} = \left( 0.873 \times \text{Lun\_AREA} \right) + 8.808 \\ \text{Hol\_BMD}_{\text{Genant}} = \left( 0.906 \times \text{Lun\_BMD} \right) - 0.025 \end{array} $$

BMC was calculated as the product of the two converted values, Hol_BMD_Genant × Hol_AREA_Genant.
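The conversion just described can be written as a short function. The coefficients come from the Genant equations above; the function name, the assumption that AREA is in cm² (so that BMC is in g), and the example inputs are illustrative.

```python
def lunar_to_hologic_spine(lun_area, lun_bmd):
    """Convert Prodigy spine AREA and BMD to Hologic-equivalent values.

    Assumes AREA in cm2 and BMD in g/cm2, giving BMC in g.
    Returns (hol_area, hol_bmd, hol_bmc).
    """
    hol_area = 0.873 * lun_area + 8.808
    hol_bmd = 0.906 * lun_bmd - 0.025
    hol_bmc = hol_bmd * hol_area  # BMC is the product of converted BMD and AREA
    return hol_area, hol_bmd, hol_bmc

# Hypothetical Prodigy L2-L4 measurement
area, bmd, bmc = lunar_to_hologic_spine(50.0, 1.100)
```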

A similar investigation of the hip ROIs was not possible, since the AREA relationships for the proximal femur were not published in any report of the standardization study, including Genant [8], Lu et al. [4], and Hui et al. [3]. Bland–Altman plots were again used to study the relationship of AREA and BMC.
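A Bland–Altman analysis of the kind used here can be sketched as follows: the paired differences are summarized by their mean (bias) and then regressed on the pairwise means to look for a trend. The function name and data are hypothetical, the 95% limits of agreement are the conventional bias ± 1.96 SD, and the significance tests on slope and intercept are omitted.

```python
import statistics

def bland_altman(a, b):
    """Bland-Altman summary for paired measurements a and b (same units)."""
    means = [(ai + bi) / 2 for ai, bi in zip(a, b)]
    diffs = [ai - bi for ai, bi in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    # Ordinary least-squares fit of difference on mean to detect a trend
    mean_of_means = statistics.mean(means)
    sxx = sum((m - mean_of_means) ** 2 for m in means)
    sxy = sum((m - mean_of_means) * (d - bias) for m, d in zip(means, diffs))
    slope = sxy / sxx
    return {
        "bias": bias,
        "limits": (bias - 1.96 * sd, bias + 1.96 * sd),
        "slope": slope,
        "intercept": bias - slope * mean_of_means,
    }

# Hypothetical sBMD values in g/cm2 (Apex minus Prodigy)
apex = [0.85, 0.95, 1.05, 1.15]
prodigy = [0.86, 0.98, 1.10, 1.22]
result = bland_altman(apex, prodigy)
```

A slope that differs from zero means the disagreement between the two systems changes with the magnitude of the measurement, which a single constant offset cannot correct.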

Results

There were no statistically significant differences among the study facilities for age, height, weight, spinal BMD, and femoral BMDs. For all the study sites, the Prodigy BMD values were, as expected, significantly greater than the Hologic BMD values, as previously reported in Shepherd et al. [9] (see Table 1). The comparison of pooled Apex and Prodigy results is given in Table 2. The Apex and Prodigy BMD results were highly correlated, with correlation coefficients (r values) that ranged from 0.91 (left neck) to 0.98 (spine). Before applying the universal standardization equations, all the BMD measures were significantly different between the Apex and Prodigy systems. The mean BMD differences (Apex − Prodigy) were −0.169 ± 0.063 g/cm² (16.5%, P < 0.0001) and −0.164 ± 0.048 g/cm² (15.6%, P < 0.001) for L1-L4 spine BMD and L2-L4 spine BMD, respectively. The differences for the femur total BMD between the two systems (Apex − Prodigy) were −0.072 ± 0.028 g/cm² (8.2%, P < 0.001) for the left femur and −0.068 ± 0.028 g/cm² (7.8%, P < 0.001) for the right femur. The differences for the femoral neck BMDs were much greater than for the femur total BMDs: −0.164 ± 0.043 g/cm² (21.0%, P < 0.001) and −0.156 ± 0.038 g/cm² (20.0%, P < 0.001) for left and right, respectively.

Table 1 Demographics of the participants
Table 2 Means and standard deviation of Hologic Apex and GE-Lunar Prodigy BMD in g/cm²

After converting the manufacturers’ BMD values to sBMD values, percent differences between these two systems were reduced but not eliminated. For the spine, the mean BMD differences between Apex and Prodigy were reduced from 16.5% to 4.1% for L1-L4 sBMD and from 15.6% to 3.3% for L2-L4 sBMD. The femoral neck sBMD values for Apex and Prodigy were not significantly different. There was a 1.0% difference for the left femur total sBMD values, or 0.009 ± 0.027 g/cm² (P < 0.05), but no differences were found for the right total sBMD values. Significant trends in the spine sBMD differences as a function of the magnitude of the BMD (r = 0.31, P < 0.05) were found (see Table 3). The difference between the spine sBMD measures increased as the sBMD increased (Fig. 1). In contrast to the spine, the femoral total and neck sBMD did not show significant differences or trends between the differences and means (see Figs. 2, 3, 4, and 5). The cross-calibration equations derived from this study data are shown in Table 4. The cross-calibration equations for L1-L4 and L2-L4 spine BMD had significantly different slopes and intercepts. The total femur and femoral neck BMD cross-calibration equations were also unique. However, the femur equations did not differ significantly between the left and right sides.

Table 3 Bland–Altman analysis results
Fig. 1 Bland–Altman plot of lumbar spine L1-L4 (a) and L2-L4 (b) sBMD of Hologic Apex and GE-Lunar Prodigy. The dotted lines are the 95% confidence intervals around the best-fit line

Fig. 2 Bland–Altman plot of left total femur sBMD of Hologic Apex and GE-Lunar Prodigy. The dotted lines are the 95% confidence intervals around the best-fit line

Fig. 3 Bland–Altman plot of right total femur sBMD of Hologic Apex and GE-Lunar Prodigy. The dotted lines are the 95% confidence intervals around the best-fit line

Fig. 4 Bland–Altman plot of left femur neck sBMD of Hologic Apex and GE-Lunar Prodigy. The dotted lines are the 95% confidence intervals around the best-fit line

Fig. 5 Bland–Altman plot of right femur neck sBMD of Hologic Apex and GE-Lunar Prodigy. The dotted lines are the 95% confidence intervals around the best-fit line

Table 4 Conversion equations for GE-Lunar Prodigy and Hologic Apex systems

To investigate the cause of the differences in the spine, we also compared the L2-L4 BMC and AREA. Figures 6 and 7 show the differences in L2-L4 spine BMC and AREA, respectively. There was a significant slope in L2-L4 AREA but not BMC. Thus, the trend in differences between the L2-L4 sBMD values can be explained by the trend in the differences in spine AREA alone.

Fig. 6 Bland–Altman plot of L2-L4 BMC of Hologic Apex and GE-Lunar Prodigy converted to Hologic Apex BMC. The dotted lines are the 95% confidence intervals around the best-fit line

Fig. 7 Bland–Altman plot of L2-L4 AREA of Hologic Apex and GE-Lunar Prodigy converted to Hologic Apex AREA. The dotted lines are the 95% confidence intervals around the best-fit line

Discussion

This study found that marked systematic differences in BMD values at all measurement sites are reduced by using the sBMD equations, but important differences still remain for fan-beam systems in the spine. Furthermore, the relationships relating Apex to Prodigy for L1-L4 and L2-L4 were not interchangeable.

Several studies had previously indicated that there were significant measurement differences between the new and older generation systems. Pearson et al. [10] found similar differences in their cross-calibration study: the spine sBMD on the GE-Lunar Prodigy system was significantly higher than when the same subjects were scanned on a Hologic QDR 2000 system in fan-beam mode (the mean difference was 0.035 g/cm²). As in our study, no differences in sBMD were found for the femoral neck and femur total ROIs. Ozdemir and Ucar [11] compared hip and spine measures on the same patients between the GE-Lunar DPX-NT and Hologic 4500C systems and found that the spine sBMD was significantly different between the two systems (1.017 and 1.022 g/cm², respectively). These observed differences are due in part to the significant changes in results between pencil-beam and fan-beam systems from the same manufacturer [10, 12–15]. In the worst reported case, a difference of 17% was observed between pencil-beam QDR 1000W and fan-beam QDR 4500W scanners [12].

There are many identifiable differences between these particular fan-beam and pencil-beam systems. Some are specific to their scan geometries, while other long-standing differences have to do with the proprietary way each manufacturer measures bone density (edge detection algorithms, calibration methods, X-ray tube voltages, “K-edge filtered” versus “voltage switching” X-ray sources). The geometry of the pencil-beam systems was very similar, but the scan geometry used in the fan-beam systems is substantially different. Hologic uses a lateral fan beam, whereas the Prodigy’s fan beam is oriented in the superior–inferior direction [16]. Due to its scan geometry, the Hologic system makes one pass over the region of interest with a broad fan beam. Thus, the X-rays pass through a part of the body only once. A consequence of the Hologic geometry is that bone area is magnified based on the distance between the examination table and the spine. In contrast, with their narrow fan beam, Prodigy scanners make multiple passes and oversample some parts of the scan area while not sampling other areas at all, depending on where the passes intersect above the tabletop. The Prodigy scanner stitches the passes together in the bone plane to create an undistorted view of the bone, and so does not exhibit magnification [5]. Another consideration is that the 1994 sBMD equations were derived from data collected at one clinic using one system from each manufacturer and could not take into account intra-manufacturer variation. Our study consisted of three study sites, with three pairs of Hologic Delphi and GE-Lunar Prodigy devices, and the systems were intentionally not cross-calibrated between sites, to provide a more robust relationship. This is different from the quality control performed for multi-center clinical trials, where the goal is to remove systematic differences between DXA systems by phantom cross-calibration.

The difference in L2-L4 AREA showed a significant trend as a function of the mean AREA measured. Two possible explanations for this are the more pronounced magnification in the Hologic Delphi fan-beam systems than in the GE-Lunar Prodigy and the difference in leg positioning. First, Boudoueq et al. [5] found in phantom experiments that decreasing height above the table increased AREA for the Hologic Discovery device and not for the Prodigy. Second, Hwua et al. found that the GE-Lunar Prodigy BMD results for the legs-down position were on average 1.33% higher than when measured with legs up, due to a change in the bone projection [17]. However, Nord et al. showed that the GE-Lunar Prodigy spine AREA, BMC, and BMD in the legs-down position were highly correlated with results from the traditional position [18]. Unfortunately, we were not able to determine which of these effects accounted for the differences found in this study.
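The direction of the magnification effect discussed above follows from simple point-source projection geometry, which the sketch below illustrates. All distances are hypothetical and do not describe any actual scanner, and the model assumes that magnification acts only along the fan direction, so projected area scales linearly with the magnification factor.

```python
def fan_magnification(source_to_detector_cm, source_to_table_cm, height_cm):
    """Point-source magnification of an object height_cm above the tabletop."""
    return source_to_detector_cm / (source_to_table_cm + height_cm)

def relative_area(height_cm, nominal_height_cm,
                  source_to_detector_cm=100.0, source_to_table_cm=40.0):
    """Projected area relative to the area at a nominal bone height.

    Assumes magnification in one direction only (along the fan), so area
    scales with the ratio of magnification factors. Distances are hypothetical.
    """
    actual = fan_magnification(source_to_detector_cm, source_to_table_cm, height_cm)
    nominal = fan_magnification(source_to_detector_cm, source_to_table_cm, nominal_height_cm)
    return actual / nominal

# A bone 5 cm closer to the table (and thus to the source) than nominal
closer = relative_area(height_cm=5.0, nominal_height_cm=10.0)
farther = relative_area(height_cm=15.0, nominal_height_cm=10.0)
```

With the source below the table, lowering the bone brings it closer to the source and enlarges its projection, consistent with the phantom findings in [5].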

This study had several limitations. First, no phantom cross-calibration was performed between study sites. The absolute calibration differences between systems of the same make were not known during the period of the study. However, the sites were monitored with their local quality control phantoms and found to be stable and calibrated to their factory standards. Clinical systems can vary in their absolute calibration by as much as ±2% [14], so using another set of systems may generate slightly different equations. However, since there is no gold standard phantom for field calibration of either Hologic or GE-Lunar systems, this limitation is unavoidable. This study at least averaged three systems from each manufacturer, while the original sBMD study used one system of convenience from each manufacturer. Second, only two of the three major DXA manufacturers’ systems were included in the study. Thus, we could not validate any of the sBMD relationships involving Norland systems. Third, our findings are only strictly applicable when the spine-positioning block is used for the Hologic systems and not used on the GE-Lunar systems. Currently, the GE-Lunar Prodigy can be used with the positioning block or without it using the Onescan™ option. Lastly, our study was not able to determine which of the many differences between the pencil-beam and fan-beam systems was responsible for the differences seen at the spine. The time of and reason for the change in inter-manufacturer accuracy are important to determine, since studies often involve different models and software versions.

The pencil-beam sBMD equations made it possible to compare BMD measurements in studies using different DXA systems. Pencil-beam technology has been almost entirely replaced by fan-beam systems because of faster scan times, improved image quality, and greater measurement precision. It is important to note that neither sBMD nor the cross-calibration equations derived in this study solve the problem of comparing the DXA results of a patient scanned at one clinic on a Hologic scanner with those obtained at a second clinic on a GE-Lunar scanner. The large standard error of the estimate (SEE) of the standardization (or conversion) equations, which in this study was in the range of 4–7%, prevents a precise comparison of the BMD of an individual between scanners from different manufacturers. As previously pointed out by Formica [19] and Ozdemir and Ucar [11], these equations are most useful for pooling data from multi-center trials to remove systematic differences and not for comparing results of individual patients.

In conclusion, this study found that marked systematic differences in BMD values between current generation fan-beam DXA systems are reduced when using the sBMD equations, but residual differences remain especially for the spine ROIs. New relationships were derived from cross-calibration data averaged between three clinical sites that removed the systematic differences at all ROIs. This study emphasizes the need to keep standardization equations up to date with advances in technology and clinical practice to ensure accuracy when pooling results between scanners.