Introduction

New computed tomography techniques enable the detection and quantification of coronary calcification. In 1990 Agatston proposed a scoring algorithm to quantify the amount of calcium in the coronary arteries using electron beam CT (EBT) [1]. In large and diverse studies the Agatston score has been found to be a strong predictor of future myocardial events [2–5]. However, because of the limited reproducibility of the Agatston score, other scoring algorithms were introduced, such as the volume score and the mass score [6–8]. Various studies showed that the mass score has a lower variability than the Agatston score and the volume score [9–13].

Currently multi-detector CT (MDCT) systems are widely used to diagnose and quantify coronary calcifications. A large number of medical centers have multiple MDCT systems installed, and multi-site studies assessing calcium scores are performed more and more frequently [14, 15]. The number of patients examined on different scanners for follow-up calcium score determination is therefore expected to increase. It is important to understand the effect of using different scanners in consecutive calcium score determinations of the same patient, particularly when the calcium score is used as a marker to follow the development of atherosclerosis. In that case it is especially important to establish whether a difference in calcium score reflects a true change in calcium or is due to interscan variability or to a difference in scan technique. Several articles have proposed repeatability limits based on EBT and on MDCT to define significant change of coronary calcium on repeated scans [15, 16]. Calcium score protocols for MDCT are not standardized, and scans can be acquired in either sequential or spiral mode.

The purpose of this study was therefore threefold. First, we assessed whether absolute mass scores are comparable or differ between identical 64-slice MDCT scanners of the same manufacturer, and determined the mass score variability. Second, we determined how the mass scores compared to the physical mass. Finally, we analyzed the effect of the scan mode on the absolute mass scores. Three identical scanners of manufacturer A and three of manufacturer B were included.

Materials and methods

A non-moving anthropomorphic phantom (QRM, Möhrendorf, Germany) with nine calcifications of three sizes (1, 3 and 5 mm) and three densities (low, medium and high) was scanned (Fig. 1). The phantom was scanned with three 64-slice MDCT scanners of manufacturer A (system A) and three of manufacturer B (system B). On each scanner the measurements were performed with a sequential and a spiral protocol. For each scanner and scan mode, the phantom was scanned 30 times. Between consecutive scans, the phantom was randomly displaced by shifting it several millimeters and rotating it a few degrees in the horizontal plane, in order to simulate the random positioning of the coronaries in each R–R interval. The phantom was also scanned fifteen times without random displacement for each scanner and scan mode.

Fig. 1 QRM heart insert with nine calcifications (top) as part of the anthropomorphic phantom body (bottom)

The scan parameters on system A were: tube voltage 120 kV, collimation 64 × 0.6 mm and rotation time 330 ms. System B was used with similar scan parameters: tube voltage 120 kV, collimation 64 × 0.5 mm and rotation time 400 ms. The tube current-time product was 50 mAs in sequential scan mode for all scanners. For the spiral acquisitions it was 110 mAs for the scanners of system A and 76 mAs for the scanners of system B.

For both systems the acquired data were reconstructed at 75% of the R–R interval with non-overlapping 3 mm slice thickness and 320 mm FOV. For ECG gating an ECG signal was generated by an external patient simulator with a frequency of 71 bpm. Reconstruction was performed using a similar medium smooth convolution kernel available on both systems.

Reconstructed images were analyzed on a Siemens Syngo workstation (Siemens, Forchheim, Germany) using the application Ca score. First, the mass score was calibrated on each scanner to ensure that mass scores of the individual scanners could be compared. Calibration was performed by measuring two calibration inserts with known densities available in the QRM phantom; a water equivalent insert (0 HU) and a medium density insert of 200 HU (Fig. 1). Subsequently the calibration factor was calculated as described in the literature [9] (Table 1). Finally the mass score MS (mg) was determined for all individual calcifications above the default threshold of 130 HU.
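The calibration and scoring steps described above can be illustrated with a minimal sketch. It assumes the commonly used form of the mass score, in which the calibration factor k is the known insert density divided by the measured HU difference between the insert and water, and the mass score is k times the scored volume times the mean HU of the scored voxels. The function names, the inclusive threshold comparison and all numeric inputs below are hypothetical; this is not the workstation's implementation.

```python
def calibration_factor(rho_insert_mg_cm3, hu_insert, hu_water):
    """Calibration factor k in mg/cm3 per HU.

    Assumed form: known insert density divided by the measured HU
    contrast between the calibration insert and the water insert.
    """
    return rho_insert_mg_cm3 / (hu_insert - hu_water)


def mass_score(voxel_hu, voxel_volume_mm3, k, threshold_hu=130.0):
    """Mass score in mg for one calcification.

    Only voxels at or above the threshold are scored (the inclusive
    comparison is an assumption). The result is
    k * scored volume (cm3) * mean HU of the scored voxels.
    """
    scored = [hu for hu in voxel_hu if hu >= threshold_hu]
    if not scored:
        return 0.0  # nothing reaches the threshold, so nothing is scored
    volume_cm3 = len(scored) * voxel_volume_mm3 / 1000.0
    mean_hu = sum(scored) / len(scored)
    return k * volume_cm3 * mean_hu


# Hypothetical calibration measurement and voxel values, for illustration only.
k = calibration_factor(rho_insert_mg_cm3=200.0, hu_insert=250.0, hu_water=0.0)
ms = mass_score([100.0, 150.0, 250.0, 400.0], voxel_volume_mm3=1.0, k=k)
```

In this sketch a calcification whose voxels all stay below 130 HU receives a mass score of zero, which is the situation reported for the two smallest low and medium density inserts.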

Table 1 Calibration factors k for all scanners in both sequential and spiral mode

The variability in percent (Cv = SD / MSmean × 100%) was determined by calculating the mean mass score (MSmean) and the standard deviation (SD) of each individual calcification over all 30 consecutive scans (N = 30).
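As a minimal sketch of this variability measure, the coefficient of variation can be computed from the repeated mass scores of one calcification. The use of the sample standard deviation (n − 1 in the denominator) is an assumption, since the text does not state which estimator was used, and the input values are hypothetical.

```python
from statistics import mean, stdev


def variability_percent(mass_scores):
    """Cv = SD / MSmean * 100 for repeated mass scores of one calcification.

    Uses the sample standard deviation (n - 1); this choice is an assumption.
    """
    ms_mean = mean(mass_scores)
    if ms_mean == 0:
        raise ValueError("Cv is undefined when the mean mass score is zero")
    return stdev(mass_scores) / ms_mean * 100.0


# Hypothetical mass scores (mg) from three repeated scans.
cv = variability_percent([10.0, 12.0, 14.0])
```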

For all scanners the smallest calcium inserts of 1 mm size with a density of 200 and 400 mg/cm3 could not be measured because their HU-values were always below the threshold (130 HU). These two inserts were therefore omitted from the results, yielding seven results for each scan. Subsequently the data were analyzed: (i) mass scores of identical scanners were compared to each other; (ii) mass scores were compared to the physical calcium mass; (iii) mass scores acquired in sequential and spiral scan mode were compared; (iv) the variability of the mass scores was assessed.

Statistics

The scored data were statistically analyzed using SPSS for Windows 14.0 (SPSS Inc, Chicago, USA). A univariate ANOVA (α = 0.05) was used for the comparison between identical scanners. For the comparison between mass score and physical mass a one-sample t test (α = 0.05) was used. For the comparison of the mass scores between the scan modes paired t tests (α = 0.05) were used.
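The test statistics behind these three comparisons can be sketched with the Python standard library. This is an illustration of the textbook formulas only, not a reproduction of the SPSS analysis: the function names and input values are hypothetical, and the p-values (obtained by comparing each statistic with the t or F distribution at α = 0.05) are not computed here.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, reference):
    """t statistic for mass scores against a fixed reference (physical mass)."""
    n = len(values)
    return (mean(values) - reference) / (stdev(values) / sqrt(n))


def paired_t(first, second):
    """t statistic for paired measurements, e.g. sequential versus spiral.

    A paired test is a one-sample test on the pairwise differences.
    """
    diffs = [a - b for a, b in zip(first, second)]
    return one_sample_t(diffs, 0.0)


def one_way_anova_f(groups):
    """F statistic for comparing the mean mass scores of several scanners."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)
    n = len(all_values)
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical values, for illustration only.
t_one = one_sample_t([1.0, 2.0, 3.0], 1.0)
t_pair = paired_t([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
f_stat = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]])
```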

Results

Comparison between identical scanners

Mass scores of the three identical scanners of manufacturer A were similar for almost all calcifications (Table 2). In sequential mode, the ANOVA test showed significantly different mass scores between the scanners only for the high density calcifications of 1 and 3 mm. In spiral scan mode no significantly different mass scores were observed. Mass score differences were small, with inter-scanner deviations ranging from 0.1 to 2.9 mg (1.4–57.4%).

Table 2 Comparison of mass scores (MS) obtained on systems A in sequential and spiral mode versus physical calcium mass (M)

For manufacturer B, mass scores of the three identical scanners were also similar for almost all calcifications (Table 3). In sequential scan mode, the mass scores were significantly different for the high density calcifications of 3 and 5 mm only. In spiral scan mode the 5 mm high density calcification and the 3 mm low and medium density calcifications showed significantly different mass scores between the scanners. Mass score differences were small, with inter-scanner deviations ranging from 0 to 5.8 mg (0–42.9%).

Table 3 Comparison of mass scores (MS) obtained on systems B in sequential and spiral mode versus physical calcium mass (M)

For both manufacturers, most differences between identical scanners were observed for high density calcifications. For the total mass scores inter-scanner differences ranged from 2.6 to 5.8 mg (1.5–3.4%). Between the identical scanners of manufacturer A fewer mass score differences were observed than between the identical scanners of manufacturer B (3 vs. 7).

Acquired mass score compared to physical calcium mass

Mass scores were significantly (P ≤ 0.05) different from the physical mass for all scanners and almost all calcifications (Tables 2, 3). The deviation of the mass scores from the physical mass was, however, consistent for all scanners (Fig. 2). Severe underestimation was found for the 1 mm calcification, ranging from 103 to 710% for all scanners and both manufacturers. The 3 mm calcifications of low and medium density were on average underestimated by 57 and 15% respectively, and the high density calcification was overestimated by 4%. The 5 mm low density calcification was underestimated by all scanners (−20%), and the medium density calcification was overestimated by 1.7%, except for scanners A in spiral mode, which showed an underestimation of 2.6%. The high density calcification of 5 mm was overestimated by all scanners (+8.4%).

Fig. 2 Difference (%) observed between mass scores (MS) and physical mass (M) with 95% confidence bands. Mass scores were obtained on three 64-slice MDCT systems of manufacturer A (a) and three of manufacturer B (b) in sequential and spiral scan mode

The total physical mass (167.8 mg) was significantly overestimated (+2.3%) by system A and two scanners of system B when sequential scan mode was used. In spiral mode a significant underestimation (−1.8%) was found for two scanners of system A and a significant overestimation (+2.5%) for system B.

The average total mass scores were different between manufacturer A and B. In sequential mode the average total mass scores were significantly (P ≤ 0.028) higher for scanners A (171.1 mg) compared to scanners B (169.9 mg). In spiral mode the average total mass scores were significantly (P ≤ 0.001) lower for scanners A (165.4 mg) compared to scanners B (172.1 mg).
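The relative deviation of a mass score from the physical mass follows directly from the two values. The sketch below applies this to the average total mass scores quoted above and the total physical mass of 167.8 mg; the function and variable names are hypothetical. Because these are averages over three scanners, the resulting percentages need not coincide with the per-scanner deviations reported in the preceding paragraph.

```python
PHYSICAL_TOTAL_MASS_MG = 167.8  # total physical calcium mass from the text


def percent_deviation(mass_score_mg, physical_mass_mg):
    """Signed deviation of a mass score from the physical mass, in percent."""
    return (mass_score_mg - physical_mass_mg) / physical_mass_mg * 100.0


# Average total mass scores (mg) per system and scan mode, as quoted in the text.
average_totals_mg = {
    ("A", "sequential"): 171.1,
    ("B", "sequential"): 169.9,
    ("A", "spiral"): 165.4,
    ("B", "spiral"): 172.1,
}

# Positive values are overestimations, negative values underestimations.
deviations = {
    key: percent_deviation(value, PHYSICAL_TOTAL_MASS_MG)
    for key, value in average_totals_mg.items()
}
```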

Sequential versus spiral scan mode

Scan mode had a different effect on the mass scores for each manufacturer. For system A, the average total mass scores were significantly (P ≤ 0.026) higher in sequential mode (171.1 mg) compared to spiral mode (165.4 mg). System B showed the opposite effect where the average total mass scores were significantly (P ≤ 0.017) lower in sequential mode (169.8 mg) compared to spiral mode (172.0 mg). Scanner B3 did not show significant differences between the scan modes (P = 0.786).

For system A using spiral scan mode no differences were found between identical scanners, whereas a few differences were found using sequential mode. For system B the scan mode did not affect the number of different mass scores between identical scanners.

The variability of the total mass scores was lower in sequential mode compared to spiral mode for system A whereas for system B the variability was lower in spiral mode (Fig. 3).

Fig. 3 Variability in percentage (Cv) of the total mass score averaged over the three scanners of system A and B in sequential and spiral scan mode. Results are shown with and without random movement of the phantom on the table

Variability of mass scores

The variability of the mass scores increased when the density of the calcifications decreased (Fig. 4). Large 5 mm calcifications yielded lower mass score variabilities than the smaller 3 mm calcifications, ranging from 1.1 to 25.5%. The smallest calcification of 1 mm showed the highest variability (48–177%).

Fig. 4 Variability (Cv) of measured mass scores of the seven calcifications. Mass scores were obtained on three 64-slice MDCT systems of manufacturer A (a) and three of manufacturer B (b) in sequential and spiral scan mode

The variability of the total mass scores was 1.9% for system A in sequential mode and 3.5% in spiral scan mode (Fig. 4). The variability of the total mass scores was 2.7% for system B in sequential mode and 1.3% in spiral scan mode. Additionally, the average variability of the total mass scores without random movement of the phantom was lower than when movement was added to the experiment (Fig. 3).

Discussion

Comparison between identical scanners

In this study we investigated whether mass scores are comparable or differ between identical scanners of the same manufacturer. We found that mass scores of identical scanners were similar for almost all calcifications. In a few cases the mass scores were significantly different between identical scanners, in particular for high density calcifications. However, in all cases the absolute differences between the mean mass scores of the identical scanners were small, and for the total mass scores the differences ranged from 1.5 to 3.4%. These differences are of the same order of magnitude as the mass score variability measured for each scanner separately (1.3–3.5%). Calibration of the measurement guaranteed that the HU-scales of all scanners were fixed to known calcium mass densities and that consequently any calcium mass offsets between the scanners were cancelled. It is therefore likely that the small differences in mass scores between identical scanners are the result of the displacement of the phantom on the table and table movement variations [17].

In a previous study no differences were observed between scanners of the same manufacturer when the Agatston and volume score algorithms were used [18]. The variability of the Agatston and volume scores is relatively high compared to the mass score variability [9–13]. The probability of detecting significant differences between groups increases when the variability of the measurement decreases. This explains why we found significantly different mass scores between scanners in a few cases when the mass score algorithm was used.

Acquired mass score compared to physical calcium mass

In addition to the similarity of mass scores between identical scanners, we assessed the agreement of the mass scores with the physical mass. We found that overall all scanners showed mass scores in good agreement with the physical calcium mass. Low density calcifications tended to be underestimated and high density calcifications were overestimated. However, calcium plaques in vivo are usually composed of mixed density components, and the total mass score is usually used as a measure for risk stratification; on average, overestimation will therefore be compensated by underestimation. Very small calcifications were severely underestimated by all scanners as a result of the partial volume effect in combination with an improper scoring threshold (default 130 HU). This can jeopardize the discrimination between a zero-calcium score and initial coronary calcification. A dynamic scoring threshold, dependent on the calcium density, is expected to improve the agreement between mass scores and the physical mass [13, 19–21]. The amount of underestimation of very small calcifications was, however, consistent between all scanners.

Sequential versus spiral scan mode

Both sequential and spiral scan modes were used to determine the effect of the acquisition mode on the mass scores obtained on identical scanners. It was observed that scan mode had a different effect on the mass scores for each manufacturer. Total mass scores were significantly different between identical scanners of system A when sequential scan mode was used; however, no significant differences were found using the spiral scan mode. For system B the total mass scores were significantly different in both scan modes.

Furthermore, system A showed higher mass scores in sequential mode compared to spiral mode, whereas for system B the opposite was observed. The latter could be due to the higher tube current in spiral scan mode; however, it has also been shown that tube current does not have a significant effect on HU-values, and consequently mass scores are not expected to depend on tube current [22]. Furthermore, system A showed lower mass scores in spiral mode.

The observed differences between the scan modes indicate that the scan mode is an important aspect in calcium mass quantification and that the recommended scan mode strongly depends on the scanner manufacturer.

Variability of mass scores

The precision of mass scores was assessed by determining the mass score variability of repeated scans. Besides the dependence of absolute mass scores on the scan mode, we found that the variability of the mass scores also depends on the scan mode. For system A the total mass score variability increased almost twofold from sequential (1.9%) to spiral scan mode (3.5%), which is similar to a previous study [23]. Remarkably, for system B the total mass score variability decreased twofold from sequential (2.7%) to spiral scan mode (1.3%), which was also found in another study [24]. In addition, it was recently shown that the mass score variability is smallest for high density calcifications, which is confirmed by our findings [25].

Variability of the mass scores is related to the variations in HU-values between the scans, which are mainly due to table movement and displacement of the phantom on the table. The average variability of the mass scores without displacement of the phantom was lower than when random displacement was added to the experiment. Without displacing the phantom on the table the mass score variability was still between 0.5 and 2.4%. From this it can be concluded that the variability of mass scores is affected for a large part by table movement. This is consistent with a previous study showing that a large part of score variability and mass score variability is due to variations in scan starting position [17].

Limitations

In this study we simulated the random positioning of the coronaries by performing a random translation and rotation of the phantom in between each consecutive scan. The variability resulting from this procedure is a significant fraction of the overall variability of realistically moving coronaries. However, to obtain a mass score variability even more in agreement with in vivo conditions, one has to take realistic heart motion into account. It was recently shown that Agatston, volume and mass scores are strongly heart rate dependent [26, 27]. In vivo, coronaries can show average velocities of 69.5 mm/s (RCA), 22.4 mm/s (LAD) and 48.4 mm/s (CCA) [28]. It is therefore expected that the variability of mass scores will increase when realistic heart movement is included, and that the few remaining differences observed between identical scanners in this study will become non-significant and the mass scores obtained on separate scanners are comparable.

Clinical applicability

The variability in mass scores found in this non-moving phantom study was relatively small compared to the reproducibility reported for the Agatston score. The Agatston score is still the score mainly used in clinical practice, including for determining the change in extent of coronary calcium on sequential scans. The percentage of change in Agatston score that is considered significant, taking into account interscan variability, ranged in one clinical study from 24 to 190% [17]. The differences in mass scores we found for identical scanners are much smaller (0–57.4%). If studies in moving heart phantoms and in vivo confirm our results, change in coronary calcification can be more accurately assessed by applying the mass score. No repeatability limits are yet available for the mass score. These will have to be determined from large, population-based studies.

Conclusions

Assessment of coronary calcium mass quantified by the mass score algorithm showed that mass scores are comparable between identical 64-slice CT scanners. Identical 64-slice CT scanners of the same manufacturer can be used in follow-up studies when calcium mass scores are used to predict the risk of myocardial infarction. This study also showed the need for using the same scan mode in follow-up studies. Furthermore, for all systems significant differences were found between mass scores and the physical calcium mass; however, the differences were relatively small and consistent.