Introduction

Following a radiological accident, it is necessary to rapidly perform radiation dosimetry on victims to identify those who have suffered overexposure and require urgent medical treatment. In general, the dicentric chromosome assay (DCA) is considered to be the gold standard for such biodosimetry. It has been widely used to evaluate radiation doses of accidentally and occupationally exposed persons (Slozina et al. 2001; Ramalho and Nascimento 1991; Suto et al. 2013; Chung et al. 1996), but it might not be suitable for larger-scale radiological accidents due to multiple drawbacks: it is labor-intensive, time-consuming, and requires highly skilled personnel.

The development of automated systems using alternative tools should be considered to overcome these limitations and increase dosimetry throughput. As counting of micronuclei (MN) is much simpler and faster than the DCA, it has been considered as an alternative. MN are produced by lagging acentric chromosome fragments or whole chromosomes at anaphase (IAEA 2011; Lue et al. 2015). The cytokinesis-block micronucleus (CBMN) assay developed by Fenech and Morley (Fenech and Morley 1985) is a well-established method that exploits this phenomenon for genotoxicity testing. It has been recommended by the Organisation for Economic Co-operation and Development (OECD) for in vitro genotoxicity testing (OECD 2016). It has been reported that MN frequencies in binucleated (BN) cells are strongly correlated with radiation dose (Vral et al. 2011, 1994), and the CBMN assay has been recommended as a valuable technique to measure chromosomal damage for biodosimetry (IAEA 2011). The International Organization for Standardization (ISO) has published a guideline on CBMN performance criteria for biodosimetry (ISO 2014).

The simplicity of MN scoring and the availability of automated scoring systems based on computerized imaging make the CBMN assay more attractive, especially for large-scale radiological accidents (Depuydt et al. 2017). Multiple attempts have been made to score MN frequencies automatically, using computerized imaging or flow cytometry (Shibai-Ogata et al. 2011). One of these, the MNScore module, is an automated MN scoring system integral to the MetaSystems Metafer 4 image-analysis platform, which is commonly used to find metaphase cells in clinical cytogenetics laboratories. Automation of the CBMN assay with the MNScore module has been introduced as a biodosimetry tool for population triage, but its accuracy relative to manual scoring has not been extensively studied.

From a clinical viewpoint, dosimetry that identifies subjects who need urgent clinical attention may provide sufficient information, but it would be desirable to improve accuracy as much as possible to support long-term epidemiological follow-up (Romm et al. 2013; Rothkamm et al. 2013). Here, we investigated the impacts of automated scoring errors and sex on MN dose–response curves.

Materials and methods

Blood samples and irradiation

This study was approved by the Institutional Review Board (IRB) of the Korea Institute of Radiological and Medical Sciences (IRB No. K-1707–001-003). Heparinized blood samples were collected from healthy donors (3 males and 3 females with ages ranging from 29 to 34) who provided written informed consent. For dose–response curves, blood samples were irradiated with different doses (0–4 Gy) of 60Co gamma rays at 0.5 Gy/min in a water phantom at 37 ℃. After irradiation, samples were incubated at 37 ℃ for 2 h, then processed for the CBMN assay.

CBMN assay

Whole-blood samples (1.5 ml) were cultured in 9 ml Roswell Park Memorial Institute (RPMI) 1640 medium (Gibco, Waltham, MA) supplemented with 20% fetal bovine serum (JR Scientific, Woodland, CA), 1% antibiotic–antimycotic (Gibco), and 2% phytohemagglutinin (Gibco) at 37 ℃ and 5% CO2 in air. After 24 h of culture, cytochalasin B (Sigma, St. Louis, MO) was added to the cultures at a final concentration of 6 μg/ml. After an additional 48 h of culture, cells were harvested and resuspended in ice-cold hypotonic solution (0.075 M KCl). Cells were fixed once with methanol/acetic acid (10:1) diluted 1:1 with Ringer’s solution, and fixed three more times with methanol/acetic acid without Ringer’s solution. Fixed cells were dropped on slides. To obtain enough BN cells, 1–4 slides per dose point of each donor were made and stained with DAPI (Cytocell, Cambridge, UK).

MN scoring

DAPI-stained slides were scanned with Metafer 4 software (MetaSystems, Altlussheim, Germany) using a 10× objective. In the fully-automated scoring mode, MN in BN cells were scored by the MNScore module of the Metafer 4 image-analysis platform. After automated scoring, images captured with MNScore were reanalyzed by a trained human scorer according to published scoring criteria (Fenech et al. 2003). In the semi-automated scoring mode, only BN cells with MNScore-detected MN were inspected, to eliminate false-positive MN. In the manual scoring mode, all BN cells, both with and without detected MN, were scored completely to remove false-positive and false-negative MN. False-positive BN cells were rejected in both the semi-automated and manual scoring modes.

Validation using blind samples

X-irradiated samples (n = 10) for dose estimation tests were provided by Health Canada as part of intercomparison exercises for radiation biodosimetry, which were approved by the IRB of Health Canada (approval REB 2002–0012). Blood samples were obtained from 10 donors (6 males, 4 females, age 21–55) after obtaining informed consent. Samples were irradiated with different doses (0, 0.4, 0.8, 1.0, 1.4, 2.0, 2.2, 2.6, 3.2 and 3.6 Gy) at 0.37 Gy/min using an X-RAD 320 device operated at 250 kVp and 15 mA. After irradiation, blood samples were incubated at 37 ℃ for 2 h, coded to blind us to sources, and shipped to our laboratory in the Korea Institute of Radiological and Medical Sciences (KIRAMS). γ-irradiated samples for validation were prepared in KIRAMS, Republic of Korea. For γ-irradiated samples (n = 12), blood samples collected from 3 donors (1 male and 2 females, age 35–50) were irradiated with different doses (0, 0.5, 1, 3 Gy) of 60Co gamma rays at 0.5 Gy/min in a water phantom at 37 ℃ using a GammaBeam 100–80 (Best Theratronics) at KIRAMS. All samples for validation were coded and the CBMN assay was performed as described above.

Dose estimation and statistical analysis

Fitting of dose–response curves to data from blind samples was performed using Dose Estimate software ver. 5.2, kindly provided by Dr. E.A. Ainsbury of the UK Health Security Agency (Ainsbury and Lloyd 2010). The curves for MN were fitted to the linear quadratic model: \(y=c+\alpha D +\beta {D}^{2}\), where y is the MN frequency per BN cell, c is the spontaneous MN frequency, α is the linear coefficient of the curve, β is the quadratic coefficient of the curve, and D is the radiation dose. Doses given to the 10 validation samples were estimated with the Dose Estimate software. The 95% upper and lower confidence limits were calculated taking into account Poisson and calibration curve errors (IAEA 2011). To test the discriminatory power (≤ 1.5 Gy/ > 1.5 Gy) of our CBMN assay, sensitivity, specificity and accuracy were calculated according to Rothkamm et al. (2013). We considered the dose estimates to be accurate when their 95% confidence intervals encompassed the known, actual dose.
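As a worked step that is not spelled out in the text, a point estimate of dose can be obtained from an observed MN frequency y by solving the linear quadratic model for D and keeping the positive root. This is only the algebraic inversion; it is an assumption that it matches the point estimate returned by Dose Estimate, and it does not include the Poisson and calibration curve errors used for the confidence limits.

\[
\beta {D}^{2}+\alpha D+\left(c-y\right)=0
\quad\Rightarrow\quad
D=\frac{-\alpha +\sqrt{{\alpha }^{2}+4\beta \left(y-c\right)}}{2\beta },
\qquad \beta >0,\; y\ge c
\]

For y below the spontaneous frequency c, the expression gives a negative value, which in this sketch would be read as a dose of zero.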

Results

Dose–response calibration curve

The data for micronucleus formation by 60Co γ-irradiation obtained from 6 healthy donors (3 males and 3 females) were pooled to construct a dose–response calibration curve (Table 1, Supplementary Tables 1 and 2). Dose–response curves were constructed from the average values of the 3 males and 3 females. For automated dose–response curves, the MNScore software in the Metafer 4 platform scored at least 16,000 binucleated (BN) cells for each dose point.

Table 1 Micronucleus frequencies and distributions in lymphocytes from 6 donors (3 males and 3 females pooled) scored by the fully-automated method

To evaluate the accuracy of our automated scoring system, the image galleries captured with MNScore were manually inspected. Table 2 shows the false detection rates of BN cells and MN in the automated scoring system. After visual inspection, 0.72–2.20% of the auto-selected BN cells were rejected because they did not comply with the standardized scoring criteria (Fenech et al. 2003). Average false-positive and false-negative MN frequencies in the total scored BN cells were 1.03% (range: 0.72–1.50) and 3.50% (range: 1.02–10.78), respectively. The proportions of rejected BN cells and falsely detected MN in the automated scoring system seemed to increase with radiation dose.
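The percentages above can be illustrated with a minimal sketch. It assumes that every rate is expressed relative to the number of auto-selected BN cells at a dose point, which is one reading of "total scored BN cells"; the function names and the counts in the usage lines are hypothetical and are not taken from Table 2.

```python
def percent(count, total):
    """Return count as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


def false_detection_rates(auto_bn, rejected_bn, false_pos_mn, false_neg_mn):
    """Summarize automated scoring errors for one dose point.

    Assumption: all rates use the auto-selected BN cell count as denominator.
    """
    return {
        "rejected_bn_pct": percent(rejected_bn, auto_bn),
        "false_positive_mn_pct": percent(false_pos_mn, auto_bn),
        "false_negative_mn_pct": percent(false_neg_mn, auto_bn),
    }


if __name__ == "__main__":
    # Hypothetical counts for a single dose point (not study data).
    rates = false_detection_rates(
        auto_bn=16000, rejected_bn=160, false_pos_mn=120, false_neg_mn=400
    )
    for name, value in rates.items():
        print(f"{name}: {value:.2f}%")
```

If the rates were instead computed against the BN cells remaining after rejection, only the denominator passed to `percent` would change.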

Table 2 False detection rate of automated micronucleus scoring shown in Table 1

The dose–response curves of micronuclei shown in Fig. 1 were fitted to a linear quadratic equation using Dose Estimate ver. 5.2. The fitted equations were: \(\mathrm{y}=0.0178 \left(\pm 0.0016\right)+0.0237 \left(\pm 0.0039\right)\times D +0.0080 \left(\pm 0.0012\right)\times {D}^{2}\) for the fully-automated scoring method, \(\mathrm{y}=0.0096 \left(\pm 0.0011\right)+0.0170 \left(\pm 0.0031\right)\times D +0.0111 \left(\pm 0.0010\right)\times {D}^{2}\) for the semi-automated scoring method and \(\mathrm{y}=0.0197\left(\pm 0.0018\right)+0.0259 \left(\pm 0.0045\right)\times D +0.0135 \left(\pm 0.0014\right)\times {D}^{2}\) for the manual scoring method.
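To make the three fitted curves concrete, the following sketch evaluates each one at a given dose and inverts it to recover a point dose estimate from an MN yield. The coefficients are the central values reported above; the function names are illustrative, the coefficient uncertainties are ignored, and the inversion is plain algebra rather than the procedure implemented in Dose Estimate.

```python
import math

# Central values (c, alpha, beta) of the fitted curves reported in the text.
CURVES = {
    "fully_automated": (0.0178, 0.0237, 0.0080),
    "semi_automated": (0.0096, 0.0170, 0.0111),
    "manual": (0.0197, 0.0259, 0.0135),
}


def mn_yield(dose_gy, c, alpha, beta):
    """Expected MN per BN cell at dose_gy under y = c + alpha*D + beta*D^2."""
    return c + alpha * dose_gy + beta * dose_gy ** 2


def estimate_dose(observed_yield, c, alpha, beta):
    """Point dose estimate: positive root of beta*D^2 + alpha*D + (c - y) = 0."""
    if observed_yield <= c:
        return 0.0  # yields at or below background are mapped to zero dose
    discriminant = alpha ** 2 + 4.0 * beta * (observed_yield - c)
    return (-alpha + math.sqrt(discriminant)) / (2.0 * beta)


if __name__ == "__main__":
    for mode, coeffs in CURVES.items():
        y = mn_yield(2.0, *coeffs)
        d = estimate_dose(y, *coeffs)
        print(f"{mode}: yield at 2 Gy = {y:.4f} MN/BN, back-estimated dose = {d:.2f} Gy")
```

Because the three curves differ, the same observed yield maps to a different dose depending on which scoring mode produced the calibration, so a sample should be evaluated against the curve built with the same mode.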

Fig. 1

Dose–response curves from micronucleus (MN) data produced by fully-automated, semi-automated and manual scoring. MN yields were fitted to a linear quadratic model: \(\mathrm{y}=0.0178 \left(\pm 0.0016\right)+0.0237 \left(\pm 0.0039\right)\times D +0.0080 \left(\pm 0.0012\right)\times {D}^{2}\) for the fully-automated scoring method, \(\mathrm{y}=0.0096 \left(\pm 0.0011\right)+0.0170 \left(\pm 0.0031\right)\times D +0.0111 \left(\pm 0.0010\right)\times {D}^{2}\) for the semi-automated scoring method, and \(\mathrm{y}=0.0197\left(\pm 0.0018\right)+0.0259 \left(\pm 0.0045\right)\times D +0.0135 \left(\pm 0.0014\right)\times {D}^{2}\) for the manual scoring method. Symbols and lines represent the average MN frequencies for 6 subjects and the fitted curves, respectively. BN: binucleated

Radiation dose prediction

For the dose prediction exercise, we estimated the radiation doses of 22 blind samples irradiated with different doses of X-rays or γ-rays from the MN frequencies observed with the fully-automated, semi-automated and manual modes (Supplementary Tables 3 and 4). To test the performance of our automated scoring system for triage in a large-scale radiological incident, we merged dose measurements into binary categories reflecting clinically relevant aspects. The sensitivity, specificity and accuracy based on MN measurements using the automated, semi-automated and manual modes are summarized in Table 3. The sensitivity, specificity and accuracy to detect MN and non-MN correctly in total BN cells were 1.0, 0.20, and 0.56, respectively, in the fully-automated mode. With its high sensitivity, our automated scoring system seemed sufficient to identify subjects who are likely to suffer from acute radiation syndrome several days after radiation exposure, but its ability to distinguish persons exposed to less than 1.5 Gy from the more highly exposed group was low. Visual inspection after automated scoring overcame the poor specificity of fully-automated scoring. The sensitivity, specificity and accuracy in the semi-automated and manual modes were 1.0, 0.90 and 0.94, respectively. When the data were split according to radiation source, similar results were observed, and γ-irradiated samples had notably higher specificity and accuracy than X-irradiated ones. These data show that additional visual inspection improves the performance of automated scoring, allowing better identification of subjects who need less urgent clinical attention.
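The binary triage evaluation can be sketched as follows. This is an illustration under stated assumptions: a sample counts as "positive" when its dose exceeds 1.5 Gy, classification uses the point dose estimate only, and the doses in the usage lines are hypothetical. The exact procedure of Rothkamm et al. (2013) may differ in detail, so the sketch is not expected to reproduce Table 3.

```python
def triage_metrics(true_doses, estimated_doses, threshold_gy=1.5):
    """Sensitivity, specificity and accuracy for a > threshold_gy triage call.

    Assumption: a sample is positive when its dose is above threshold_gy,
    and the call is made from the point dose estimate alone.
    """
    if len(true_doses) != len(estimated_doses):
        raise ValueError("inputs must have the same length")
    tp = fn = tn = fp = 0
    for true_d, est_d in zip(true_doses, estimated_doses):
        truly_high = true_d > threshold_gy
        called_high = est_d > threshold_gy
        if truly_high and called_high:
            tp += 1
        elif truly_high:
            fn += 1
        elif called_high:
            fp += 1
        else:
            tn += 1
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "accuracy": (tp + tn) / total if total else float("nan"),
    }


if __name__ == "__main__":
    # Hypothetical delivered and estimated doses in Gy (not study data).
    delivered = [0.0, 0.5, 1.0, 1.4, 2.0, 3.0]
    estimated = [0.3, 1.7, 1.2, 1.6, 2.4, 2.8]
    print(triage_metrics(delivered, estimated))
```

In this hypothetical example, systematic overestimation leaves sensitivity at 1.0 but pushes two low-dose samples over the threshold, which lowers specificity. That is the same qualitative pattern described for the fully-automated mode.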

Table 3 Sensitivity, specificity and accuracy of triage classification in the automated micronucleus (MN) assay

Next, we compared dose estimation between the scoring modes. Of the 10 X-irradiated samples, actual doses fell within the 95% confidence interval of dose estimates for 7 and 10 samples for the semi-automated and manual modes, respectively, whereas only 3 samples had accurate dose estimates in the fully-automated mode (Fig. 2A). Similarly, the semi-automated and manual modes estimated the doses of the 12 γ-irradiated samples more accurately (8 for semi-automated and 10 for manual vs. 4 for fully-automated mode; Fig. 2B). These findings indicate that a manual inspection step following automated scoring improves the accuracy of dose prediction.
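The accuracy criterion used here (an estimate is accurate when its 95% confidence interval contains the delivered dose) amounts to a simple count. The sketch below illustrates it with hypothetical intervals; the function name and the numbers are not taken from the study.

```python
def count_accurate(samples):
    """Count samples whose 95% CI contains the delivered dose.

    samples: iterable of (delivered_dose, ci_lower, ci_upper) in Gy.
    """
    return sum(1 for delivered, lower, upper in samples if lower <= delivered <= upper)


if __name__ == "__main__":
    # Hypothetical (delivered, lower, upper) triples in Gy (not study data).
    blind_samples = [
        (0.0, 0.0, 0.4),   # covered
        (1.0, 1.2, 1.9),   # not covered: interval lies above the delivered dose
        (2.0, 1.6, 2.5),   # covered
        (3.6, 2.9, 3.5),   # not covered: interval lies below the delivered dose
    ]
    n_ok = count_accurate(blind_samples)
    print(f"{n_ok} of {len(blind_samples)} estimates are accurate")
```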

Fig. 2

Dose prediction using the fully-automated, semi-automated and manual modes of the automated micronucleus scoring system. Blind samples were irradiated with X-rays (A) or γ-rays (B). Symbols and error bars represent estimated doses and the corresponding 95% confidence intervals. The dashed and solid lines represent the ideal fit, at which the estimated dose equals the delivered dose, and its ± 0.5 Gy intervals

To investigate the impact of sex on MN dose–response curves, our MN scoring data were divided and dose response curves for males and females were reconstructed (Table 4).

Table 4 Coefficients of calibration curves for micronuclei in male and female lymphocytes scored by different methods

Table 5 and Supplementary Tables 3 and 4 show the dose predictions using pooled and sex-specific dose–response curves with the different scoring modes. The use of sex-specific curves seemed to further improve dose prediction in the semi-automated and manual modes, but the improvement was not statistically significant.

Table 5 Comparison of dose estimation between pooled and sex-specific dose response curves

Discussion

The MN assay is a valuable tool for radiation biodosimetry that overcomes the limitations of the dicentric chromosome assay (Vral et al. 2011). Automated MN scoring using the Metafer slide-scanning system has many advantages over the conventional manual MN assay, enhancing throughput and reducing laborious and time-consuming tasks (Seager et al. 2014; Decordier et al. 2009). We found that dose estimation based on automated MN scoring can be improved by correcting automatic scoring errors.

Automated scoring tends to have a high false-positive rate (Seager et al. 2014). We evaluated the false detection rates of our automated scoring system. Only 0.72–2.20% of the scored BN cells did not comply with the standardized scoring criteria (Fenech et al. 2003); that is, most of the automatically identified BN cells were correctly detected. Our false-positive BN (0.72–2.20%) and MN frequencies (0.67–1.50%) were comparable to those reported by Willems et al. (2010) [6.28% false-positive BN rate, 1% false-positive MN yields]. The error rates of BN and MN detection tended to increase with the radiation dose, which may be related to radiation-induced cell death, including apoptosis (Boreham et al. 2000). This reduces the accuracy of the fully-automated scoring mode.

To correct detection errors occurring during the automated micronucleus assay, a visual inspection of BN cells in the image gallery produced by automated scoring was performed. In this way, false-positive and false-negative MN scoring was corrected and false-positive BN cells were rejected. As a result, the ability to identify individuals at risk of acute radiation syndrome in triage and the accuracy of dose estimation were improved relative to fully-automated scoring. Similarly, the MultiBiodose study and RENEB intercomparison exercises have shown the higher accuracy of semi-automated micronucleus scoring (Depuydt et al. 2017; Thierens et al. 2014). By comparing dose estimates for blind samples irradiated with 12 different doses, our study found that visual inspection following automated scoring, in the manual as well as the semi-automated mode, can improve CBMN assay performance.

MN frequency can be affected by various factors such as exposure to environmental mutagens, dietary factors, age and sex (IAEA 2011). In the present study, dose estimates of 3 blind samples exposed to 0 Gy 60Co tended to be somewhat overestimated. The three donors (age: 35 to 50) were older than the subjects used for the MN dose–response curve (age: 29 to 34), so not only donor age but also history of exposure to environmental clastogens and aneugens could be contributing factors. Various confounding factors influencing the spontaneous MN frequency could be a problem in a real radiological accident. Discriminating between centromere-negative and centromere-positive MN could overcome this limitation because age mainly increases centromere-positive MN (Thierens et al. 1999, 2000). Indeed, it would be helpful to more precisely assess background MN frequencies in various age groups and investigate confounding factors such as antecedent exposure history.

Females are known to have higher spontaneous MN frequencies than males (Bonassi et al. 2001; Fenech and Bonassi 2011; Fenech et al. 1999, 1994). Female baseline MN frequencies are higher by 1.4–1.65-fold depending on age (Fenech et al. 1994), with the difference increasing with age (Bonassi et al. 2001; Fenech and Bonassi 2011). We split our automated MN scoring data based on sex. The use of sex-specific curves seemed to further improve the dose prediction of the semi-automated and manual modes, but the improvement was not statistically significant. Our subjects for the MN dose–response curve consisted of 3 males and 3 females, so the sample size might not be sufficient to reach statistical significance. Larger studies are needed to confirm the improvement of dose estimation by the use of sex-specific curves.

To determine the best way to use automated scoring, we extensively compared its characteristics with those of other scoring methods. Visual inspection improved accuracy, but the additional steps increased scoring time. Scanning and analyzing one slide required approximately 10 min in the fully-automated mode, 15 min in the semi-automated mode, and 30 min in the manual mode. The best choice of scoring system would therefore depend on the purpose. When the main goal of the MN assay is to identify subjects who need urgent clinical treatment in triage, a more rapid method would be preferred. But if more precision is required, scoring methods with visual inspection, semi-automated or manual, should be chosen over fully-automated scoring. Considering that the same images can be used for both automated and visually inspected methods, those performing the assay have significant technical and temporal latitude to adjust the assay to achieve the accuracy required for specific situations. Additional visual inspection following automated scoring can be the best approach. In addition, the use of sex-specific curves can be considered as a simple way to further improve dose estimation.

In addition, the energy of the photon radiation source could affect the MN frequency induced by radiation. Our dose–response curve was constructed using blood samples exposed to γ-rays from 60Co with a mean energy of 1.2 MeV. The dose of γ-irradiated blind samples could be estimated with higher accuracy and specificity than that of the samples exposed to 250 kVp X-rays. This might be explained by the higher relative biological effectiveness (RBE) of soft vs hard photons (Schmid et al. 2002). The dependence of RBE on the energy of sparsely ionizing radiations has been attributed to microdosimetric differences between these radiations. Lloyd et al. (1975) and Schmid et al. (2002) showed that 250 kV X-rays produced a higher α coefficient than 60Co γ-rays. These differences might cause overestimation of the exposed dose in X-irradiated samples when using a dose–response curve generated with 60Co γ-rays. Generating additional dose–response curves for X-irradiation of appropriate energy could improve the accuracy of dose estimation.

Our study provides strong evidence showing that visual inspection of images captured by an automated MN system is necessary for accurate dosimetry. Using a validation data set of 22 blind samples, we found that the correction of automated scoring improved the performance of automated MN scoring. Our findings could be useful for performing radiation dosimetry on large numbers of people rapidly, accurately, and efficiently.