Introduction

Radiology is an essential tool in patient management. The trend toward personalized, imaging-based medicine increasingly requires specialized knowledge to answer the specific clinical questions of referring specialists [1,2,3]. This communication occurs through the report written by the radiologist [4,5,6]. Describing and interpreting the imaging features and formulating a probability-based differential diagnosis are the radiologist’s principal responsibilities [7]. Traditionally, reports are free-form narratives. However, reducing variability among reports and creating guideline-concordant templates is essential to radiology’s success in value-based payment models and beneficial for patient care [8, 9]. In recent times there has been a strong push toward improved structure and standardization in radiology reporting. A notable example of structured reporting is found in breast imaging, where the American College of Radiology (ACR) developed and promulgated the Breast Imaging Reporting and Data System (BI-RADS). BI-RADS includes a standardized lexicon for the description of breast imaging findings and their clinical management [10].

Several proposals for the use of structured reports (SR) have been supported by the major international radiological societies, such as the European Society of Radiology and the Radiological Society of North America, in the so-called “Structured reporting initiative”. The Italian Society of Medical and Interventional Radiology (SIRM) has made several templates available to its members for use in their daily practice [11].

The advantages of SR derive from several features. In the oncological setting, there have been substantial advances in the quality of templates and in the reporting of imaging features. Several studies showed that using an SR significantly improved the clarity and comprehensiveness of imaging findings [12,13,14]. In a paper on SR for the staging of pancreatic lesions, the surveyed surgeons reported that only 25 – 42% of narrative reports described all features relevant for surgical planning, whereas this proportion increased to 69 – 98% with SRs [15].

Outside of oncology, across all radiological fields, SR results in a relevant increase in quality, since considerably more pertinent information is included in the reports and referring clinicians prefer the SR to the free-text report (FTR). The advantages of SR go far beyond communication. For instance, data on contrast medium or radiation exposure could be archived and added to the template with little technical effort [16, 17].

Despite all of these promising advances, SR has not yet been widely adopted in clinical practice. A survey of SIRM members found that Italian radiologists are aware of SRs, but only a small group habitually uses them in clinical practice [18].

Among women, breast cancer is the most commonly diagnosed cancer and the leading cause of cancer-related death in the world [19, 20]. Digital breast tomosynthesis (DBT) has rapidly gained ground in breast cancer screening and diagnosis [21]. Published data have shown the superiority of DBT over the current standard, digital mammography (DM): DBT provides improved sensitivity and specificity, as well as improved lesion conspicuity and localization [21]. In this context, the current free-text reporting (FTR) format needs to be organized and shifted toward SR. The three main reasons for moving from FTR to SR are quality, datafication/quantification, and accessibility. A critical quality improvement resulting from the use of SR is standardization. The use of templates in SR provides a checklist for verifying that all relevant items for a particular examination have been addressed. Thanks to this “structure”, the radiology report will also allow radiological data to be associated with other key clinical features, leading to a precise diagnosis and personalized medicine.

The aim of our study was to propose a structured reporting template for x-ray mammography in the first diagnosis of breast cancer, to guide radiologists toward systematic reporting and to improve the communication of the report to clinicians.

Materials and methods

A critical discussion among specialists in breast radiology, based on a multi-round, consensus-building modified Delphi method, was carried out to develop a comprehensive SR for mammography of patients with breast cancer.

Panel experts

A working group of 16 experts (group A), members of the board of the SIRM study section on breast radiology, was formed to create a structured report for the first diagnosis of breast cancer in x-ray mammography. A further working group of 4 experts in breast radiology (group B), chosen among senior past board members who agreed to participate in the consensus and blinded to the activities of group A, was formed to assess the quality and clinical usefulness of the final draft of the structured report.

All panellists of group A searched the main scientific databases, including PubMed, Scopus, and Google Scholar, for papers on mammography findings of breast cancer published from December 2000 to December 2020. The full text of the selected studies was reviewed and helped the panellists to compose a first list of report items, via emails and/or teleconferences.

The SR was divided into 10 sections: (a) Personal Data; (b) Setting; (c) Comparison with previous breast examination; (d) Anamnesis and clinical context; (e) Technique; (f) Radiation dose; (g) Parenchymal pattern; (h) Description of the finding; (i) Diagnostic categories and Report; and (j) Conclusions. As part of the template, we added a dedicated section for the most relevant images.

Delphi rounds

Preliminarily, each panellist independently contributed to improving the draft of the SR by means of online meetings or email exchanges. Subsequently, three Delphi rounds were performed [17].

During the first round, a Google Form survey was used to test the panellists’ agreement with the SR draft. Each section of the SR (i.e., Patient Clinical Data, Clinical Evaluation, Exam Technique, Report, Findings, and Conclusion) was rated on a five-point Likert scale (1 = strongly disagree, 2 = slightly disagree, 3 = neither agree nor disagree, 4 = slightly agree, 5 = strongly agree).

After the second round, the final version of the structured report was generated on the dedicated RSNA website (radreport.org) by using the T-Rex template in HTML format, in line with the IHE (Integrating the Healthcare Enterprise) MRRT (Management of Radiology Report Templates) profile, accessible as open-source software, with the technical support of Exprivia (Molfetta, Bari, Italy). These standards determine both the format of radiology report templates [by using version 5 of the Hypertext Markup Language (HTML5)] and the transport mechanism used to request, retrieve, and store these templates [18]. The radiology report was structured by using a series of “codified queries” integrated in the preselected sections of the T-Rex editor [18].

In the third round of the Delphi process, the experts of group B were asked to express their level of agreement, on a five-point Likert scale, regarding the quality of reporting. In particular, the experts were asked to rate the following statements: 1) The structured report contains all the descriptive elements of a first diagnosis mammogram; 2) The structured report allows the diagnosis to be clearly expressed; 3) The structured report allows the patient’s management to be clearly indicated; 4) The structured report reduces the reporting time compared to the descriptive one already used in clinical practice; 5) The structured report is easy for the radiologist to implement in clinical practice; 6) A training period for the radiologist is required to adopt the structured report.

Statistical analysis

Each panellist’s answers were exported to a Microsoft Excel document for data collection and statistical analysis.

Mean score, standard deviation, and the sum of scores were used as statistical descriptors of the scores attributed by panellists to each section. A mean score of 3 was considered good, while a score of 5 was considered excellent.
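As an illustration of these descriptors, the following minimal Python sketch computes the mean, standard deviation, and sum of scores per section. The score matrix is hypothetical and is not the study data; the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not specify which estimator was applied, and the function name `describe_sections` is introduced here only for illustration.

```python
from statistics import mean, stdev

# Hypothetical Likert scores (1-5): one row per panellist, one column per SR section.
# These values are illustrative only; they are NOT the study data.
scores = [
    [5, 4, 5],
    [5, 5, 4],
    [4, 5, 5],
    [5, 3, 5],
]

def describe_sections(rows):
    """Return mean, sample standard deviation, and sum of scores per section."""
    columns = list(zip(*rows))  # transpose: one tuple of scores per section
    return [
        {"mean": mean(col), "sd": stdev(col), "sum": sum(col)}
        for col in columns
    ]

summary = describe_sections(scores)
for index, stats in enumerate(summary, start=1):
    print(f"Section {index}: mean={stats['mean']:.2f}, "
          f"sd={stats['sd']:.2f}, sum={stats['sum']}")
```

With the thresholds stated above, a section in this toy example would be rated at least good whenever its mean is 3 or higher.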

The internal consistency of the panellist scores for each section was assessed and a quality analysis with the average inter-item correlation was performed using Cronbach’s alpha (Cα) correlation coefficient [22, 23]. Cα was determined after each round.

An alpha coefficient (α) > 0.9 was considered excellent, α > 0.8 good, α > 0.7 acceptable, α > 0.6 questionable, α > 0.5 poor, and α < 0.5 unacceptable. In the iterations an α of 0.8 was considered a reasonable goal for internal reliability.
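The computation and interpretation of Cα can be sketched as follows. This is a minimal Python illustration of the standard formula, α = k/(k − 1) · (1 − Σ item variances / variance of total scores), and not the MATLAB code used in the study. It assumes that panellists are treated as raters (rows) and SR sections as items (columns), that sample variances are used, and that a value of exactly 0.5, which the thresholds above leave unassigned, is labelled unacceptable. The function names and the ratings matrix are hypothetical.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a ratings matrix (rows = raters, columns = items)."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Sample variance of each item (column) across raters.
    item_variances = [variance(col) for col in zip(*rows)]
    # Sample variance of each rater's total score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

def interpret_alpha(alpha):
    """Map alpha to the qualitative labels listed in the text."""
    if alpha > 0.9:
        return "excellent"
    if alpha > 0.8:
        return "good"
    if alpha > 0.7:
        return "acceptable"
    if alpha > 0.6:
        return "questionable"
    if alpha > 0.5:
        return "poor"
    return "unacceptable"  # assumption: exactly 0.5 is grouped here

# Hypothetical ratings (NOT the study data): 4 raters x 3 items.
ratings = [
    [5, 5, 5],
    [4, 4, 5],
    [3, 4, 4],
    [2, 3, 3],
]
alpha = cronbach_alpha(ratings)
print(f"alpha = {alpha:.3f} ({interpret_alpha(alpha)})")
```

Because the thresholds are strict inequalities, a coefficient of exactly 0.9 or 0.8 falls into the lower of the two adjacent categories under this reading.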

The data analysis was performed using the Statistics Toolbox of MATLAB (The MathWorks, Inc., Natick, MA, USA).

Results

Structured report

The final SR version (Appendix 1) was built by including n = 2 items in Personal Data; n = 4 items in Setting; n = 2 items in Comparison with previous breast examination; n = 19 items in Anamnesis and clinical context; n = 10 items in Technique; n = 1 item in Radiation dose; n = 5 items in Parenchymal pattern; n = 28 items in Description of the finding; n = 12 items in Diagnostic categories and Report; and n = 1 item in Conclusions. Overall, 84 items composed the definitive version of the SR.
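The overall item count can be checked with a short Python sketch that represents the template as a mapping from section name to number of items. The dictionary is only an illustrative data structure assumed here, not the format of the T-Rex template; the counts are those reported above.

```python
# Item counts per section, as reported in the text for the final SR.
items_per_section = {
    "Personal Data": 2,
    "Setting": 4,
    "Comparison with previous breast examination": 2,
    "Anamnesis and clinical context": 19,
    "Technique": 10,
    "Radiation dose": 1,
    "Parenchymal pattern": 5,
    "Description of the finding": 28,
    "Diagnostic categories and Report": 12,
    "Conclusions": 1,
}

total_items = sum(items_per_section.values())
print(f"{len(items_per_section)} sections, {total_items} items")
```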

The “Personal Data” section includes patient clinical information, such as weight, height, BMI, and waist circumference, and conditions such as hyperglycemia, hypercholesterolemia, hypertriglyceridemia, and arterial hypertension.

The “Setting” section clarifies the clinical setting of the examination, such as organized screening assessment of recalls or diagnostic mammogram (spontaneous/opportunistic screening) in asymptomatic or symptomatic women.

The “Comparison with previous breast examinations” section includes, when possible, data obtained from previous examinations, so that current findings can be compared with them.

The “Anamnesis and clinical context” section includes personal or family history of malignancies, risk factors, and genetic panel, as well as data on the presence of symptoms such as breast lump, axillary lump, nipple discharge, skin/nipple alterations, mastodynia, or others.

The “Technique” section includes data on the type of exam performed, such as film screen mammography or digital mammography, as well as on the methodology used.

The “Radiation dose” section includes data on the category of radiation exposure.

The “Parenchymal pattern” section is based on the ACR classification:

  1. Almost entirely adipose tissue with sparse areas of fibroglandular tissue

  2. Heterogeneously dense, with possible masking of small lesions

  3. Homogeneously dense, with reduced sensitivity

The “Description of the finding” section includes data on lesion location, type of lesion (mass or non-mass), size, shape, margins, density, the presence and type of calcifications, and associated clinical findings (such as skin retraction, skin thickening, nipple retraction, and axillary adenopathy).

In the “Diagnostic categories and Report conclusions” section, the lesion is stratified in the different categories (negative, benign, probably benign finding, indeterminate lesion, finding highly suggestive of malignancy and known breast malignancy already demonstrated at histopathology), with consequent follow-up or diagnostic suggestion.

The “Conclusions” section is a free-text section, with radiological diagnosis.

Consensus agreement

Table 1 reports the individual scores and the sum of scores of the 16 panellists for the SR in the first round. One of the experts did not participate in the second round; Table 2 reports the individual scores and the sum of scores of the panellists for the SR in the second round.

Table 1 Single score and sum of scores of panellists for structured report (I round)
Table 2 Single score and sum of scores of panellists for structured report (II round)

In both the first and second rounds, as reported in Tables 1 and 2, all parts received a more than good assessment. In the first round, the overall mean score of the experts (n = 16) and the sum of scores for the SR were 4.7 (range 2–5) and 896, respectively (Table 1). In the second round, the overall mean score of the panellists (n = 15) and the sum of scores for the SR were 4.9 (range 2–5) and 807, respectively (Table 2).

The overall mean score of the panellists in the second round was higher than the overall mean score of the first round with a lower standard deviation value.

The Cα correlation coefficient for the structured report was 0.78 in the first round and 0.82 in the second round.

Table 3 reports the individual scores given by the 4 panellists in the third round for the SR quality evaluation. All questions received a more than good rating (≥ 3). The overall mean score of the experts (n = 4) was 3.3 (range 2–5). The Cα correlation coefficient for the structured report was 0.90 in this third round.

Table 3 Single score of panellists for structured quality evaluation (III round)

Discussion

In this study, an SR for x-ray mammography in the first diagnosis of breast cancer was proposed and built through a multi-round modified Delphi consensus. An additional round was introduced, involving a group of experts blinded to the activities of group A, to evaluate the quality and clinical usefulness of the final draft of the SR. In both the first and second rounds, all parts received a more than good assessment. The overall mean score of the panellists and the sum of scores for the SR were 4.7 and 896 in the first round, and 4.9 and 807 in the second round. The overall mean score in the second round was higher than that of the first round, with a lower standard deviation, underlining the greater agreement among the experts reached in this round. Regarding the panellists’ answers on the quality evaluation, all questions received a more than good rating (≥ 3). The overall mean score of the panellists was 3.3; however, for the statement that the structured report reduces the reporting time compared to the descriptive one, two experts gave a score of 2, since they considered the time to be similar in both cases. For the statement that a training period is required for the radiologist to adopt the structured report, one expert gave a score of 1, considering it unnecessary.

The Cronbach’s alpha (Cα) correlation coefficient for the structured report was 0.90 in this third round.

With regard to “Personal data”, this section obtained slightly less favorable mean and SD values than the other sections, a trend confirmed in both the first and second rounds. In our opinion, this is due to the panellists’ view that such meticulous data entry could slow down the normal workflow and would not be easy to use. However, it should be pointed out that all sections are independent of each other; this is therefore an optional section that may be left blank, although it was conceived with the aim of creating databases. Indeed, collecting all these data would allow the creation of a large database, not only for epidemiological studies but also, in the broadest conception of radiology, to lay the foundations for radiomics studies.

The present study provides the first mammography template built on a standardized structure and lexicon, features essential for adherence to diagnostic-therapeutic recommendations and for reducing the uncertainty that could result from a non-standardized lexicon. It is the authors’ opinion that the proposed structured report will enable clear communication between radiologists and clinicians; of note, the conclusion allows a definite diagnosis or a weighted differential diagnosis (DD) to be expressed [24]. The template includes several sections, and their evaluation allows the lesion to be stratified into the different categories (negative, benign, probably benign finding, indeterminate lesion, finding highly suggestive of malignancy, and known malignancy), with consequent follow-up or diagnostic suggestion. The SR of mass lesions is based on the BI-RADS lexicon provided by the American College of Radiology [25]. The BI-RADS lexicon requires the radiologist’s judgment to assign a final category. However, there is significant variability among radiologists in the assignment of BI-RADS categories, depending on the practice setting and the individual radiologist [26, 27]. This variability can be reduced by training readers in the use of the lexicon [26].

Several authors have reported that the use of a checklist may improve diagnostic accuracy [27,28,29]. The development of an SR to guide the assessment of the lesion should decrease variability among radiologists. Another key question relates to the presence of multiple lesions; radiologists usually describe the lesion that is most relevant to clinicians in defining patient management. Thus, identifying and extracting the index lesion is a critical clinical task [30, 31].

The present SR not only takes into account the categories suggested by the ACR, and should therefore favor a correct evaluation of the lesion, but is also composed of different sections that allow the radiological features to be correlated with the clinical history. This radiology report is conceived to be rich in data that could potentially be pooled, analyzed, and correlated with patient outcomes, thereby informing future clinical and imaging guidelines. However, the use of a non-standardized lexicon would limit data collection efforts across multiple institutions [32, 33].

Regarding the “Technique” section, reporting the examination technique, not only within one’s own department but also to departments of other centers, serves a dual purpose. First, it permits the standardization of study protocols; second, it permits the optimization of study protocols across different centers. Protocol optimization should drive quality improvement through enhanced patient safety (e.g., radiation dose reduction), best practice, improved image quality, and reduced medical error [34,35,36,37,38,39,40].

The benefits of SR over the narrative report include a standardized structure and lexicon, features that are mandatory for adherence to diagnostic and therapeutic recommendations and for enrollment in clinical trials. SR decreases the ambiguity due to a non-uniform lexicon. Wide application of SR is essential to offer referring physicians the best quality of service and researchers the best quality of information in the setting of big data [38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54].

Despite the favorable results, several limitations should be considered. First, the panellists were all from the same country; the involvement of international specialists would permit broader participation and promote wider uniformity of the SR. Second, this study did not assess the clinical effect of the SR on the management of breast cancer patients. However, this study has the advantage of having been supported by a multidisciplinary team, in which several experts assessed the quality of the clinical impact.

Conclusion

In this study, a structured reporting template for x-ray mammography in the first diagnosis of breast cancer was proposed and built through a multi-round modified Delphi consensus. An additional round was introduced, involving a group of experts blinded to the activities of group A, to assess the quality and clinical usefulness of the final draft of the structured report. In both the first and second rounds, all parts received a more than good assessment. A standardized approach with best practice guidelines will improve training in, and the performance of, the assignment of BI-RADS assessment categories, and will offer the basis for quality assurance procedures within centers and across international borders.