Background & Summary

The spatial manifestation of inter- and intra-tumor heterogeneity in breast cancer is well established1,2. Current breast cancer diagnosis and subsequent disease management primarily occurs on the basis of histopathologic assessment and biomarkers, which are derived from the sampled tissue. Utilization of biopsies and conventional biomarkers cannot fully capture the intra-tumor heterogeneity, as they are limited by the tissue sampling error, leading to over- or under-treatment. As such, there is a clinical need to characterize the intra-tumor heterogeneity to better understand this disease and its progression mechanisms.

The use of magnetic resonance imaging (MRI) in breast cancer screening, diagnosis, and treatment management, allows for the non-invasive and longitudinal sampling of disease burden3,4. Beyond the conventional and qualitative uses of MRI in breast cancer disease management, the field of radiomics, broadly defined as the extraction of high-throughput visual and sub-visual cues derived from medical imaging5,6,7, has allowed for a quantitative characterization and assessment of the breast tumor disease burden. This has led to the development of prognostic and predictive radiomic biomarkers that capture breast intra-tumor heterogeneity, promoting personalized clinical decision making8.

Clinical and computational studies analyzing the radiologic presentations of breast tumor disease burden require ample and diverse data to ensure robust characterization. Publicly available datasets, such as those hosted through The Cancer Imaging Archive (TCIA www.cancerimagingarchive.net)9, created by the National Cancer Institute (NCI) of the National Institutes of Health (NIH), provide large study cohorts for meaningful research development. Furthermore, such datasets10,11,12,13 allow for study reproducibility and analyses comparisons across varying institutions, promoting increasingly robust conclusions. However, publicly available radiographic scans require accompanying expertly annotated ground truth tumor annotations to ensure accurate study comparisons and reproducible analyses. Furthermore, any computational analyses, including radiomics-based pipelines, require standardized image normalization and feature parameter selections for consistent analyses6,7,14,15,16.

To address this limitation, this manuscript provides the ‘I-SPY1-Tumor-SEG-Radiomics’ collection, which extends the current TCIA collection ‘I-SPY1’ (https://wiki.cancerimagingarchive.net/display/Public/I-SPY1)17,18, with segmentations labels and radiomic features panel for the ACRIN 6657/I-SPY1 TRIAL cohort. The latter contains dynamic contrast enhanced (DCE) MRI images of women diagnosed with locally advanced breast cancer who underwent longitudinal neoadjuvant chemotherapy17,18. The primary goal is to allow standardized expert image annotations and radiomic features for researchers to conduct reproducible analyses. To this end, annotations and radiomic features for the baseline (pre-treatment) images of n = 163 women have been provided. Based on the analyses that needs to be performed, the selected cohort includes women with baseline (T1) DCE-MRI with at least two post-contrast images for future studies wishing to explore dynamic assessments of breast tumor behavior and treatment response prediction. For each patient visit, three MRI scans are provided over the duration of a single contrast administration: a pre-contrast image, and two post-contrast images. All provided images are pre-operative and pre-treatment. Two sets of annotated labels are provided: i) structural tumor volume (STV) segmentations assessed by an expert board-certified breast radiologist, and ii) functional tumor volume (FTV) segmentations, as described in prior studies18,19. While FTV segmentations can provide an assessment of tumor vascularity and perfusion, they are limited in describing the entire structural tumor burden as they only account for voxels of a region of interest (ROI) above a specific intensity threshold. In contrast, the provided STV segmentations annotate the entire structural region (i.e., the whole extent) of the primary lesion. The STV segmentations have been used in prior studies in which radiomic features extracted from the STV region resulted in improved prognostic performance than FTV values20. Preliminary evaluation of radiomic features extracted from STV defined primary lesion volumes has demonstrated improved prognostic performance over established clinical covariates21.

Additionally, the data cohort includes a comprehensive panel of radiomic features characterizing breast tumor morphology, intensity, and texture. This panel of radiomic features is extracted in compliance with the Image Biomarker Standardization initiative (IBSI)7, using the publicly available Cancer Imaging Phenomics Toolkit (CaPTk,https://www.cbica.upenn.edu/captk)22,23,24.

The availability of annotations characterizing the functional active regions around the lesion’s ROI, the entire primary lesion structure, and the computed radiomic features can enable for the development of prognostic and predictive biomarkers characterizing breast tumor heterogeneity through the direct utilization of the TCIA ACRIN 6657/I-SPY1 TRIAL data potential in clinical and computational studies, but importantly can contribute to repeatable, reproducible, and comparative quantitative studies enabling direct utilization of the TCIA I-SPY collection.

Methods

Data collection

The ACRIN 6657/I-SPY1 TRIAL17,18 enrolled n = 237 women with their consent from May 2002 to March 2006. From this cohort, n = 230 women met the eligibility criteria of being diagnosed with locally advanced breast cancer with primary tumors of stage T3 measuring at least 3 cm in diameter18. The pre-operative DCE-MRI images of 222 women were publicly available via The Cancer Imaging Archive (TCIA)9. From this TCIA set, 15 women were excluded for our present study, due to incomplete DCE acquisition scans. A subsequent 44 women were also excluded due to either incomplete histopathologic data or recurrence free survival (RFS) outcome, or missing pre-treatment DCE-MRI scans. This resulted in the inclusion of n = 163 women for this study, for whom at least two post-contrast scans from the baseline pre-treatment DCE-MRI scans were available. Women underwent neoadjuvant chemotherapy with an anthracycline-cyclophosphamide regimen alone or followed by taxane. All women underwent longitudinal DCE-MRI imaging on a 1.5 T field-strength system. Distributions of patient histopathologic characteristics and image scanner manufacturer details can be found in Tables 1 and 2. An exemplary illustration showing the spatial intratumor heterogeneity is shown in Fig. 1. The complete clinical metadata is available in the Supplementary Table.

Table 1 Summary of patient histopathologic characteristics from study cohort.
Table 2 Scanner manufacturer and model name for study cohort.
Fig. 1
figure 1

Four representative breast tumors demonstrating spatial intratumor heterogeneity.

Preprocessing

The preprocessing procedures involved in preparing the data for further analyses were conducted using the Cancer Imaging Phenomics Toolkit (CaPTk)22,23,24, and they are outlined as follows:

  1. 1.

    Image format conversion: For each patient, baseline images were converted to the Neuroimaging Informatics Technology Initiative (NIfTI)25 file format from the publicly available DICOM scans. This format does not include any identifiable information as the DICOM headers hold, and only preserves the actual imaging information and the necessary information to define the data in the physical coordinates.

  2. 2.

    Bias Field Correction: All the converted NIfTI images were bias corrected to rectify any non-uniformity associated with the magnetic field of the MRI scanner26,27.

  3. 3.

    Data harmonization: This step is required to ensure consistency in the entire dataset as described below.

    1. (a).

      Resampling: The raw I-SPY images have different voxel resolutions, preventing cohesive analysis across the entire dataset. To mitigate this, all the images were resampled to the standard 1mm3 isotropic resolution to ensure harmonized processing for computational algorithms. This resolution is chosen because this resizes all the images to a size which can fit in the GPU memory (more details will be explained later)

    2. (b).

      Z-Scoring: After the images are resampled, we Z-score the images using instance level (considering all timepoints of the given patient rather than entire dataset) statistics of mean and variance. Z-scoring is a widely accepted method from extended observations28,29,30,31, that normalizing every single multi-timepoint scan (i.e., instance-level normalization) to zero mean and a unit variance helps to improve algorithmic generalizability and to preserve the relative intensity differences between the pre- and post-contrast excitation scans.

DCE-MRI NIfTI volumes

Three volumes have been provided for each patient from the pre-operative, pre-treatment visit. These images include the pre-contrast administration MRI scan (0000), first post-contrast image (0001), and second post-contrast image (0002).

Expert tumor annotations

From the NIfTI images, the functional tumor volume (FTV) segmentation was identified within the region of interest (ROI), provided through TCIA, from the signal enhancement ratio image, as previously described18,32. In order to generate the structural tumor volume (STV) segmentations, voxels outside of the largest contiguous volume region and voxels greater than 2 cm away from the largest contiguous volume region, within the FTV, were manually removed. Our expert board-certified breast radiologist then identified the primary lesions in each of the n = 163 baseline DCE-MRI images using the manually cleaned, FTV segmentation as a guide. The first-post contrast image for each case was used by the radiologist to delineate the entire 3-D primary tumor segmentation for each patient. Satellite lesions were not considered in the primary tumor segmentations. ITK-SNAP (www.itksnap.org)33 was utilized to perform the manual delineations.

Computationally-generated annotations

A 3D Convolutional Neural Network based on U-Net34, with residual connections35, was trained on all the preprocessed 3 timepoints to perform automated segmentations of the STV and the code has been made available for reproducibility. The models are trained using the Multi-class Dice36 Loss function37 with on-the-fly data augmentation techniques such as ghosting, blur, and gaussian noise applied in a random manner with a given probability for each type of augmentation38. All the experiments are done using nested k-fold cross validation and the median Dice score across the holdout folds is 0.74. An initial learning rate of 0.01 is used, which is varied in a linear triangular fashion having a minimum learning rate of 10−3 times the initial learning rate. We use the Stochastic Gradient Descent optimizer to update weights of our network.

Radiomic features

An comprehensive array of 370 unique features were extracted. These are from 8 different feature families, based on intensity statistics (n = 20), morphology (n = 21), histograms (n = 285), Gray-level co-occurrence matrix (GLCM) (n = 8), Gray-level run-length matrix (GLRLM) (n = 12), Gray-level size zone matrix (GLSZM) (n = 18), Neighborhood gray tone difference matrix (NGTDM) (n = 5), and Local binary patters (LBP) (n = 1). We used non-filtered images after the first post-contract injection that were bias-corrected, resampled and z-score normalized. The radiomic features were then extracted from the region defined by the STV. The extraction was done using the Cancer imaging Phenomics Toolkit (CaPTk, www.cbica.upenn.edu/captk)22,23,24. CaPTk is an open-source software toolkit, which offers functionalities to extract a wide array of radiomic features compliant with the image biomarker standardisation initiative (IBSI)7, the Quantitative Imaging Network6, and has been extensively used in radiomic analysis studies39,40,41,42,43. The exact parameters used for the radiomic analysis are available through TCIA’s repository, at https://doi.org/10.7937/TCIA.XC7A-QT2044.

Data Records

We are using the data17 published through the ACRIN 6657/I-SPY1 TRIAL study18. Specifically, we selected baseline subjects for whom at least two pre-operative post-contrast scans were available. The raw and generated data, which includes the preprocessed images in isotropic resolution of 1mm3, the expert and computationally-generated annotations, and the extracted radiomic features, have been made available through TCIA’s Analysis Results Directory www.cancerimagingarchive.net/tcia-analysis-results/ using https://doi.org/10.7937/TCIA.XC7A-QT2044. The computationally generated annotations can stand as a benchmark for improving segmentation algorithms related to this data in future computational studies.

Technical Validation

Data collection

The dataset was directly downloaded from TCIA and quantitatively analyzed to ensure all images have a defined coordinate system and contain non-zero pixel values. Two cases, 1183 and 1187, had white image artifacts outside of the breast region. While these artifacts do not affect intensity distributions within the anatomical breast or the corresponding lesion segmentations, they may cause difficulties in image visualization, and downstream analyses. These artifacts were present in images directly downloaded from TCIA (illustrated in Fig. 2). Additionally, qualitative assessment was performed to look for any visual data corruption.

Fig. 2
figure 2

Representative image slice where image artifact is present. (a) Visualization of image artifact for case 1183, (b) visualization of image artifact for case 1187. These image artifacts do not affect the intensity values within anatomical breast region.

Preprocessing

Each step of preprocessing was followed by manual qualitative assessment of the image to ensure data validity. In addition, quantitative assessment was performed following the data harmonization step to ensure that the entire dataset had the same parametric definition (i.e., same resolution and pixel intensity distribution).

Expert tumor annotations

The expert annotated STV segmentations were qualitatively assessed, manually edited and approved by a board certified, fellowship-trained breast radiologist.

Computationally-generated annotations

The FTV annotations were quantitatively compared with the corresponding STV annotations using the Dice score in order to quantify the difference between the two annotations. Additionally, a qualitative analysis was performed for the best and worst performing cases (illustrated in Fig. 3).

Fig. 3
figure 3

Three representative single slice tumor segmentations. (a) First-post contrast image of entire breast. (b) Primary tumor region of interest. (c) Functional tumor volume (FTV) segmentation (d) Structural tumor volume (STV) segmentations which have been expert annotated. Rows showcase different representative images for each case.

Feature extraction

Considering the mathematical formulation of these features, it is possible for a division by zero to occur (lack of heterogeneity or very small number of voxels). In CaPTk, we provide “not a number” for the result of these features to provide a position of clarity for the user to make subsequent downstream analyses more coherent based on the entire population. We acknowledge this could be provided as “inf” instead, but we are providing this as “NaN” to have parity between various programming languages and processing protocols.

Usage Notes

This collection of images (both normalized and resampled) and accompanying annotations can be analyzed using different tools or software. We provide all the annotations in a research-friendly NIfTI format to allow users to read the images and annotations through many programming languages such as C++, Python, R, or others. The data is accompanied by a XSLX file that provides additional information about each subject.