Expert tumor annotations and radiomics for locally advanced breast cancer in DCE-MRI for ACRIN 6657/I-SPY1

Chitalia, Rhea; Pati, Sarthak; Bhalerao, Megh; Thakur, Siddhesh Pravin; Jahani, Nariman; Belenky, Vivian; McDonald, Elizabeth S.; Gibbs, Jessica; Newitt, David C.; Hylton, Nola M.; Kontos, Despina; Bakas, Spyridon

doi:10.1038/s41597-022-01555-4

Data Descriptor
Open access
Published: 23 July 2022

Scientific Data volume 9, Article number: 440 (2022)

Abstract

Breast cancer is one of the most pervasive forms of cancer and its inherent intra- and inter-tumor heterogeneity contributes towards its poor prognosis. Multiple studies have reported results from either private institutional data or publicly available datasets. However, current public datasets are limited in terms of having consistency in: a) data quality, b) quality of expert annotation of pathology, and c) availability of baseline results from computational algorithms. To address these limitations, here we propose the enhancement of the I-SPY1 data collection, with uniformly curated data, tumor annotations, and quantitative imaging features. Specifically, the proposed dataset includes a) uniformly processed scans that are harmonized to match intensity and spatial characteristics, facilitating immediate use in computational studies, b) computationally-generated and manually-revised expert annotations of tumor regions, as well as c) a comprehensive set of quantitative imaging (also known as radiomic) features corresponding to the tumor regions. This collection describes our contribution towards repeatable, reproducible, and comparative quantitative studies leading to new predictive, prognostic, and diagnostic assessments.

Background & Summary

The spatial manifestation of inter- and intra-tumor heterogeneity in breast cancer is well established^1,2. Current breast cancer diagnosis and subsequent disease management rely primarily on histopathologic assessment and biomarkers derived from sampled tissue. Biopsies and conventional biomarkers cannot fully capture intra-tumor heterogeneity because they are limited by tissue sampling error, which can lead to over- or under-treatment. As such, there is a clinical need to characterize intra-tumor heterogeneity to better understand this disease and its progression mechanisms.

The use of magnetic resonance imaging (MRI) in breast cancer screening, diagnosis, and treatment management allows for the non-invasive and longitudinal sampling of disease burden^3,4. Beyond the conventional and qualitative uses of MRI in breast cancer disease management, the field of radiomics, broadly defined as the high-throughput extraction of visual and sub-visual cues from medical imaging^5,6,7, has allowed for a quantitative characterization and assessment of the breast tumor disease burden. This has led to the development of prognostic and predictive radiomic biomarkers that capture breast intra-tumor heterogeneity, promoting personalized clinical decision making⁸.

Clinical and computational studies analyzing the radiologic presentations of breast tumor disease burden require ample and diverse data to ensure robust characterization. Publicly available datasets, such as those hosted through The Cancer Imaging Archive (TCIA, www.cancerimagingarchive.net)⁹, created by the National Cancer Institute (NCI) of the National Institutes of Health (NIH), provide large study cohorts for meaningful research development. Furthermore, such datasets^10,11,12,13 allow for study reproducibility and comparison of analyses across institutions, promoting increasingly robust conclusions. However, publicly available radiographic scans require accompanying expert ground truth tumor annotations to ensure accurate study comparisons and reproducible analyses. Furthermore, any computational analyses, including radiomics-based pipelines, require standardized image normalization and feature parameter selections for consistent analyses^6,7,14,15,16.

To address this limitation, this manuscript provides the ‘I-SPY1-Tumor-SEG-Radiomics’ collection, which extends the current TCIA collection ‘I-SPY1’ (https://wiki.cancerimagingarchive.net/display/Public/I-SPY1)^17,18 with segmentation labels and a panel of radiomic features for the ACRIN 6657/I-SPY1 TRIAL cohort. The latter contains dynamic contrast enhanced (DCE) MRI images of women diagnosed with locally advanced breast cancer who underwent longitudinal neoadjuvant chemotherapy^17,18. The primary goal is to provide standardized expert image annotations and radiomic features so that researchers can conduct reproducible analyses. To this end, annotations and radiomic features for the baseline (pre-treatment) images of n = 163 women have been provided. To support future studies that explore dynamic assessments of breast tumor behavior and treatment response prediction, the selected cohort includes women with baseline (T1) DCE-MRI with at least two post-contrast images. For each patient visit, three MRI scans are provided over the duration of a single contrast administration: a pre-contrast image and two post-contrast images. All provided images are pre-operative and pre-treatment. Two sets of annotated labels are provided: i) structural tumor volume (STV) segmentations assessed by an expert board-certified breast radiologist, and ii) functional tumor volume (FTV) segmentations, as described in prior studies^18,19. While FTV segmentations can provide an assessment of tumor vascularity and perfusion, they are limited in describing the entire structural tumor burden, as they only account for voxels of a region of interest (ROI) above a specific intensity threshold. In contrast, the provided STV segmentations annotate the entire structural region (i.e., the whole extent) of the primary lesion. The STV segmentations have been used in prior studies, in which radiomic features extracted from the STV region yielded better prognostic performance than FTV values²⁰. Preliminary evaluation of radiomic features extracted from STV-defined primary lesion volumes has demonstrated improved prognostic performance over established clinical covariates²¹.

Additionally, the data cohort includes a comprehensive panel of radiomic features characterizing breast tumor morphology, intensity, and texture. This panel of radiomic features is extracted in compliance with the Image Biomarker Standardization Initiative (IBSI)⁷, using the publicly available Cancer Imaging Phenomics Toolkit (CaPTk, https://www.cbica.upenn.edu/captk)^22,23,24.

Together, the annotations characterizing the functionally active regions around the lesion’s ROI, the annotations of the entire primary lesion structure, and the computed radiomic features can enable the development of prognostic and predictive biomarkers characterizing breast tumor heterogeneity in clinical and computational studies. Importantly, they can also contribute to repeatable, reproducible, and comparative quantitative studies by enabling direct utilization of the TCIA ACRIN 6657/I-SPY1 TRIAL collection.

Methods

Data collection

The ACRIN 6657/I-SPY1 TRIAL^17,18 enrolled n = 237 women with their consent from May 2002 to March 2006. From this cohort, n = 230 women met the eligibility criteria of being diagnosed with locally advanced breast cancer with primary tumors of stage T3 measuring at least 3 cm in diameter¹⁸. The pre-operative DCE-MRI images of 222 women were publicly available via The Cancer Imaging Archive (TCIA)⁹. From this TCIA set, 15 women were excluded for our present study, due to incomplete DCE acquisition scans. A subsequent 44 women were also excluded due to either incomplete histopathologic data or recurrence free survival (RFS) outcome, or missing pre-treatment DCE-MRI scans. This resulted in the inclusion of n = 163 women for this study, for whom at least two post-contrast scans from the baseline pre-treatment DCE-MRI scans were available. Women underwent neoadjuvant chemotherapy with an anthracycline-cyclophosphamide regimen alone or followed by taxane. All women underwent longitudinal DCE-MRI imaging on a 1.5 T field-strength system. Distributions of patient histopathologic characteristics and image scanner manufacturer details can be found in Tables 1 and 2. An exemplary illustration showing the spatial intratumor heterogeneity is shown in Fig. 1. The complete clinical metadata is available in the Supplementary Table.
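The cohort selection above can be checked with simple arithmetic. The short sketch below only restates the counts given in the text; the variable names are ours and purely illustrative.

```python
# Cohort counts as stated in the text (variable names are illustrative).
enrolled = 237                 # women enrolled in ACRIN 6657/I-SPY1
eligible = 230                 # met the eligibility criteria
on_tcia = 222                  # pre-operative DCE-MRI publicly available via TCIA
excluded_incomplete_dce = 15   # incomplete DCE acquisition scans
excluded_missing_data = 44     # incomplete histopathology/RFS or missing baseline scans

included = on_tcia - excluded_incomplete_dce - excluded_missing_data
print(included)  # 163
```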

Table 1 Summary of patient histopathologic characteristics from study cohort.

Table 2 Scanner manufacturer and model name for study cohort.

Preprocessing

The preprocessing procedures involved in preparing the data for further analyses were conducted using the Cancer Imaging Phenomics Toolkit (CaPTk)^22,23,24, and they are outlined as follows:

1. Image format conversion: For each patient, the baseline images were converted from the publicly available DICOM scans to the Neuroimaging Informatics Technology Initiative (NIfTI)²⁵ file format. Unlike DICOM headers, this format does not carry any identifiable information; it preserves only the image data and the information needed to place the data in physical coordinates.
2. Bias field correction: All converted NIfTI images were bias corrected to rectify any non-uniformity associated with the magnetic field of the MRI scanner^26,27.
3. Data harmonization: This step ensures consistency across the entire dataset, as described below.
  (a) Resampling: The raw I-SPY images have different voxel resolutions, which prevents cohesive analysis across the entire dataset. To mitigate this, all images were resampled to a standard isotropic resolution of 1 mm³ to ensure harmonized processing for computational algorithms. This resolution was chosen because it brings all images to a size that can fit in GPU memory (more details will be explained later).
  (b) Z-scoring: After resampling, the images were Z-scored using instance-level statistics (mean and variance computed over all timepoints of a given patient rather than over the entire dataset). Extended observations^28,29,30,31 indicate that normalizing each multi-timepoint scan to zero mean and unit variance (i.e., instance-level normalization) helps to improve algorithmic generalizability while preserving the relative intensity differences between the pre- and post-contrast excitation scans.
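The harmonization steps can be illustrated with a minimal NumPy sketch. This is not the CaPTk implementation; the function names `resampled_shape` and `zscore_instance` are hypothetical, and the sketch only shows the grid size implied by 1 mm isotropic resampling and the instance-level Z-scoring, in which a single mean and standard deviation are shared by all timepoints of one patient.

```python
import numpy as np

def resampled_shape(shape, spacing_mm, target_mm=1.0):
    """Grid size that covers the same physical extent at the target isotropic spacing."""
    return tuple(int(round(n * s / target_mm)) for n, s in zip(shape, spacing_mm))

def zscore_instance(timepoints):
    """Z-score all timepoints of one patient with one shared mean and standard deviation."""
    stacked = np.stack(timepoints).astype(np.float64)
    mean, std = stacked.mean(), stacked.std()
    if std == 0:
        raise ValueError("constant image: Z-score is undefined")
    # Sharing the statistics keeps the pre/post-contrast intensity differences intact.
    return [(t - mean) / std for t in stacked]

# Synthetic example: a pre-contrast volume and two brighter post-contrast volumes.
rng = np.random.default_rng(0)
pre = rng.normal(100.0, 10.0, size=(4, 4, 4))
normalized = zscore_instance([pre, pre + 50.0, pre + 60.0])
```

Because the statistics are pooled over the three timepoints, the post-contrast volumes remain brighter than the pre-contrast volume after normalization, which per-volume Z-scoring would not guarantee.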

DCE-MRI NIfTI volumes

Three volumes have been provided for each patient from the pre-operative, pre-treatment visit. These images include the pre-contrast administration MRI scan (0000), first post-contrast image (0001), and second post-contrast image (0002).

Expert tumor annotations

From the NIfTI images, the functional tumor volume (FTV) segmentation was identified within the region of interest (ROI), provided through TCIA, from the signal enhancement ratio image, as previously described^18,32. In order to generate the structural tumor volume (STV) segmentations, voxels outside of the largest contiguous volume region and voxels greater than 2 cm away from the largest contiguous volume region, within the FTV, were manually removed. Our expert board-certified breast radiologist then identified the primary lesions in each of the n = 163 baseline DCE-MRI images using the manually cleaned FTV segmentation as a guide. The first post-contrast image for each case was used by the radiologist to delineate the entire 3-D primary tumor segmentation for each patient. Satellite lesions were not considered in the primary tumor segmentations. ITK-SNAP (www.itksnap.org)³³ was utilized to perform the manual delineations.
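For readers unfamiliar with the signal enhancement ratio (SER), the sketch below uses the common definition SER = (S_early − S_pre) / (S_late − S_pre), computed voxel-wise on unnormalized intensities. This is an illustration under stated assumptions, not the FTV procedure of the cited studies: the function names and the `ser_min` threshold are hypothetical, and the actual enhancement thresholds are those described in the references.

```python
import numpy as np

def signal_enhancement_ratio(pre, early, late, eps=1e-6):
    """Voxel-wise SER; NaN where the late enhancement (late - pre) is ~0."""
    pre, early, late = (np.asarray(a, dtype=np.float64) for a in (pre, early, late))
    denom = late - pre
    ser = np.full(pre.shape, np.nan)
    valid = np.abs(denom) > eps
    ser[valid] = (early[valid] - pre[valid]) / denom[valid]
    return ser

def enhancing_voxels(ser, roi, ser_min):
    """Voxels inside the ROI whose SER is at least ser_min (hypothetical threshold)."""
    valid = ~np.isnan(ser)
    above = np.zeros(ser.shape, dtype=bool)
    above[valid] = ser[valid] >= ser_min
    return above & np.asarray(roi, dtype=bool)

# Toy intensities for three voxels: washout, persistent enhancement, no enhancement.
pre = np.array([100.0, 100.0, 100.0])
early = np.array([200.0, 150.0, 100.0])
late = np.array([150.0, 200.0, 100.0])
ser = signal_enhancement_ratio(pre, early, late)
mask = enhancing_voxels(ser, roi=np.array([1, 1, 1]), ser_min=1.0)
```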

Computationally-generated annotations

A 3D Convolutional Neural Network based on U-Net³⁴, with residual connections³⁵, was trained on all three preprocessed timepoints to perform automated segmentation of the STV, and the code has been made available for reproducibility. The models were trained using the multi-class Dice³⁶ loss function³⁷ with on-the-fly data augmentation techniques such as ghosting, blur, and Gaussian noise, each applied at random with a given probability³⁸. All experiments used nested k-fold cross-validation, and the median Dice score across the holdout folds was 0.74. The initial learning rate of 0.01 was varied in a linear triangular fashion, with a minimum learning rate of 10⁻³ times the initial learning rate. The Stochastic Gradient Descent optimizer was used to update the network weights.
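A linear triangular learning-rate schedule of the kind described can be sketched as follows. The text gives only the initial rate (0.01) and the minimum (10⁻³ times the initial rate); the cycle length `half_cycle`, the choice to start each cycle at the maximum, and the function name are assumptions made for illustration and may differ from the GaNDLF scheduler.

```python
def triangular_lr(step, half_cycle, max_lr=0.01, min_ratio=1e-3):
    """Learning rate that moves linearly between max_lr and max_lr * min_ratio.

    Assumption: each cycle starts at max_lr, reaches the minimum after
    half_cycle steps, and returns to max_lr after 2 * half_cycle steps.
    """
    min_lr = max_lr * min_ratio
    pos = step % (2 * half_cycle)
    # 1.0 at the ends of the cycle, 0.0 at its midpoint.
    distance_from_mid = abs(pos - half_cycle) / half_cycle
    return min_lr + (max_lr - min_lr) * distance_from_mid

schedule = [triangular_lr(s, half_cycle=10) for s in range(21)]
```

With these values the rate falls from 0.01 to 0.00001 over the first ten steps and climbs back over the next ten.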

Radiomic features

A comprehensive array of 370 unique features was extracted. These features come from 8 feature families: intensity statistics (n = 20), morphology (n = 21), histograms (n = 285), gray-level co-occurrence matrix (GLCM) (n = 8), gray-level run-length matrix (GLRLM) (n = 12), gray-level size zone matrix (GLSZM) (n = 18), neighborhood gray tone difference matrix (NGTDM) (n = 5), and local binary patterns (LBP) (n = 1). We used the non-filtered images from the first post-contrast acquisition, which were bias-corrected, resampled, and z-score normalized. The radiomic features were then extracted from the region defined by the STV. The extraction was done using the Cancer Imaging Phenomics Toolkit (CaPTk, www.cbica.upenn.edu/captk)^22,23,24. CaPTk is an open-source software toolkit that offers functionalities to extract a wide array of radiomic features compliant with the Image Biomarker Standardization Initiative (IBSI)⁷ and the Quantitative Imaging Network⁶, and it has been extensively used in radiomic analysis studies^39,40,41,42,43. The exact parameters used for the radiomic analysis are available through TCIA’s repository, at https://doi.org/10.7937/TCIA.XC7A-QT20⁴⁴.
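As a rough illustration of what mask-restricted feature extraction involves, the sketch below checks that the per-family counts add up to 370 and computes a handful of first-order intensity statistics inside a binary mask. It is not CaPTk's code and does not reproduce its IBSI-compliant definitions or parameters; the function name and the selection of statistics are ours.

```python
import numpy as np

# Feature counts per family, as listed in the text.
FEATURE_FAMILIES = {
    "intensity": 20, "morphology": 21, "histogram": 285, "GLCM": 8,
    "GLRLM": 12, "GLSZM": 18, "NGTDM": 5, "LBP": 1,
}
total_features = sum(FEATURE_FAMILIES.values())  # 370

def first_order_features(image, mask):
    """A few intensity statistics over the voxels selected by a binary mask."""
    voxels = np.asarray(image, dtype=np.float64)[np.asarray(mask, dtype=bool)]
    if voxels.size == 0:
        raise ValueError("empty mask")
    mean, std = voxels.mean(), voxels.std()
    # Skewness is undefined for a constant region; report NaN rather than inf.
    skew = float("nan") if std == 0 else float(np.mean(((voxels - mean) / std) ** 3))
    return {"mean": float(mean), "std": float(std), "min": float(voxels.min()),
            "max": float(voxels.max()), "skewness": skew}

# Toy volume: the mask selects the voxels with values 1, 2 and 3.
image = np.arange(8, dtype=np.float64).reshape(2, 2, 2)
mask = np.zeros((2, 2, 2), dtype=bool)
mask.flat[1:4] = True
features = first_order_features(image, mask)
```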

Data Records

We use the data¹⁷ published through the ACRIN 6657/I-SPY1 TRIAL study¹⁸. Specifically, we selected baseline subjects for whom at least two pre-operative post-contrast scans were available. The raw and generated data, which include the preprocessed images at an isotropic resolution of 1 mm³, the expert and computationally-generated annotations, and the extracted radiomic features, have been made available through TCIA’s Analysis Results Directory (www.cancerimagingarchive.net/tcia-analysis-results/) at https://doi.org/10.7937/TCIA.XC7A-QT20⁴⁴. The computationally generated annotations can stand as a benchmark for improving segmentation algorithms related to this data in future computational studies.

Technical Validation

Data collection

The dataset was directly downloaded from TCIA and quantitatively analyzed to ensure all images have a defined coordinate system and contain non-zero pixel values. Two cases, 1183 and 1187, had white image artifacts outside of the breast region. While these artifacts do not affect intensity distributions within the anatomical breast or the corresponding lesion segmentations, they may cause difficulties in image visualization, and downstream analyses. These artifacts were present in images directly downloaded from TCIA (illustrated in Fig. 2). Additionally, qualitative assessment was performed to look for any visual data corruption.
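A check of this kind might look like the sketch below, which assumes the voxel array and the 4×4 affine of a volume have already been loaded. The function name and the specific criteria (any non-zero voxel, finite values, a non-singular affine) are our reading of "defined coordinate system" and "non-zero pixel values", not the authors' exact procedure.

```python
import numpy as np

def check_volume(data, affine):
    """Basic sanity checks on one image volume and its voxel-to-world affine."""
    data = np.asarray(data)
    affine = np.asarray(affine, dtype=np.float64)
    affine_ok = affine.shape == (4, 4) and bool(abs(np.linalg.det(affine[:3, :3])) > 0)
    return {
        "has_nonzero": bool(np.any(data != 0)),      # image is not empty
        "all_finite": bool(np.all(np.isfinite(data))),
        "affine_ok": affine_ok,                      # coordinate system is invertible
    }

good = check_volume(np.ones((2, 2, 2)), np.eye(4))
empty = check_volume(np.zeros((2, 2, 2)), np.eye(4))
degenerate = check_volume(np.ones((2, 2, 2)), np.zeros((4, 4)))
```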

Preprocessing

Each step of preprocessing was followed by manual qualitative assessment of the image to ensure data validity. In addition, quantitative assessment was performed following the data harmonization step to ensure that the entire dataset had the same parametric definition (i.e., same resolution and pixel intensity distribution).
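The quantitative harmonization check could be sketched as below, under the assumption that voxel spacing is read from the NIfTI affine (the column norms of its upper-left 3×3 block) and that the intensity check is applied to all timepoints of a patient jointly. The function names and tolerances are hypothetical.

```python
import numpy as np

def voxel_sizes(affine):
    """Voxel spacing in mm along each image axis, from a 4x4 voxel-to-world affine."""
    return np.linalg.norm(np.asarray(affine, dtype=np.float64)[:3, :3], axis=0)

def is_harmonized(timepoints, affine, spacing_tol=1e-3, stat_tol=1e-6):
    """True if spacing is 1 mm isotropic and pooled intensities have mean 0, std 1."""
    isotropic = np.allclose(voxel_sizes(affine), 1.0, atol=spacing_tol)
    stacked = np.stack(timepoints).astype(np.float64)
    normalized = abs(stacked.mean()) < stat_tol and abs(stacked.std() - 1.0) < stat_tol
    return bool(isotropic and normalized)

# Synthetic patient with three timepoints, Z-scored jointly.
rng = np.random.default_rng(1)
raw = rng.normal(100.0, 10.0, size=(3, 4, 4, 4))
scored = (raw - raw.mean()) / raw.std()
ok = is_harmonized(list(scored), np.eye(4))
not_resampled = is_harmonized(list(scored), np.diag([0.7, 0.7, 2.0, 1.0]))
not_normalized = is_harmonized(list(raw), np.eye(4))
```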

Expert tumor annotations

The expert annotated STV segmentations were qualitatively assessed, manually edited and approved by a board certified, fellowship-trained breast radiologist.

Computationally-generated annotations

The FTV annotations were quantitatively compared with the corresponding STV annotations using the Dice score in order to quantify the difference between the two annotations. Additionally, a qualitative analysis was performed for the best and worst performing cases (illustrated in Fig. 3).
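The Dice score between two binary masks A and B is 2|A ∩ B| / (|A| + |B|). A minimal NumPy sketch follows; the function name is ours, and returning NaN when both masks are empty is a convention chosen here, not one stated in the text.

```python
import numpy as np

def dice_score(mask_a, mask_b):
    """Dice overlap between two binary masks of the same shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    if a.shape != b.shape:
        raise ValueError("masks must have the same shape")
    total = a.sum() + b.sum()
    if total == 0:
        return float("nan")  # both masks empty: overlap is undefined
    return float(2.0 * np.logical_and(a, b).sum() / total)

# Toy example: the second mask covers half of the first.
stv = np.array([1, 1, 0, 0])
ftv = np.array([1, 0, 0, 0])
score = dice_score(stv, ftv)  # 2 * 1 / (2 + 1)
```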

Feature extraction

Given the mathematical formulation of these features, a division by zero can occur (for example, when a region lacks heterogeneity or contains very few voxels). In such cases, CaPTk reports “not a number” (NaN) for the affected feature, so that users have an unambiguous marker and can keep downstream analyses coherent across the entire population. We acknowledge that this could be reported as “inf” instead, but we use “NaN” to maintain parity between various programming languages and processing protocols.
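Downstream code therefore needs to handle NaN entries explicitly. The sketch below shows two common options on a made-up feature table: NaN-aware aggregation and dropping incomplete subjects. The `safe_ratio` helper only illustrates how a zero denominator can be mapped to NaN; it is not CaPTk's implementation, and the numbers are invented.

```python
import math
import numpy as np

def safe_ratio(numerator, denominator):
    """Return NaN instead of raising or producing inf when the denominator is zero."""
    return numerator / denominator if denominator != 0 else float("nan")

# Made-up table: rows are subjects, columns are features; one entry is undefined.
features = np.array([
    [1.0, 2.0],
    [safe_ratio(1.0, 0.0), 4.0],
    [3.0, 6.0],
])

column_means = np.nanmean(features, axis=0)               # ignores NaN entries
complete_rows = features[~np.isnan(features).any(axis=1)]  # drops subjects with any NaN
```

Which option is appropriate depends on the analysis; imputing or excluding a feature across the whole population are other possibilities.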

Usage Notes

This collection of images (both normalized and resampled) and accompanying annotations can be analyzed using different tools or software. We provide all the annotations in a research-friendly NIfTI format to allow users to read the images and annotations through many programming languages such as C++, Python, R, or others. The data is accompanied by an XLSX file that provides additional information about each subject.
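In Python, the volumes can be read with the third-party NiBabel package, as in the sketch below. The directory layout, file-name pattern, and subject identifier are hypothetical placeholders (the actual names are defined by the TCIA download); only the timepoint suffixes 0000, 0001, and 0002 come from the text.

```python
from pathlib import Path

TIMEPOINTS = ("0000", "0001", "0002")  # pre-contrast, first and second post-contrast

def timepoint_paths(root, subject_id):
    """Expected files for one subject (hypothetical layout, adapt to the download)."""
    root = Path(root)
    return [root / subject_id / f"{subject_id}_{tp}.nii.gz" for tp in TIMEPOINTS]

def load_volume(path):
    """Read a NIfTI file and return (voxel array, 4x4 affine)."""
    import nibabel as nib  # third-party; imported here so the rest runs without it
    img = nib.load(str(path))
    return img.get_fdata(), img.affine

# "SUBJECT_0001" is a placeholder identifier, not a real case ID.
paths = timepoint_paths("data", "SUBJECT_0001")
```

Segmentation masks stored as NIfTI files can be read with the same `load_volume` helper and compared voxel-wise with the images, provided both share the same grid.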

Code availability

In favor of transparency and reproducibility, and in line with the scientific data principles of Findability, Accessibility, Interoperability, and Reusability (FAIR)⁴⁵, we have made the tools used to generate the data for this study publicly available³⁸. Specifically, the CaPTk platform^22,23,24, version 1.8.1, was used for all the preprocessing steps. CaPTk’s source code and binary executables are publicly available for multiple operating systems through its official GitHub repository (https://github.com/CBICA/CaPTk). The implementation and configuration of the U-Net with residual connections used in this study can be found in the GitHub page of the Generally Nuanced Deep Learning Framework (GaNDLF), version 0.0.14 (https://github.com/CBICA/GaNDLF). Finally, ITK-SNAP³³ was used for all the manual annotation refinements.

References

Polyak, K. et al. Heterogeneity in breast cancer. The Journal of clinical investigation 121, 3786–3788 (2011).
Marusyk, A. & Polyak, K. Tumor heterogeneity: causes and consequences. Biochimica et Biophysica Acta (BBA)-Reviews on Cancer 1805, 105–117 (2010).
Gavenonis, S. C. & Roth, S. O. Role of magnetic resonance imaging in evaluating the extent of disease. Magnetic Resonance Imaging Clinics 18, 199–206 (2010).
Weinstein, S. & Rosen, M. Breast mr imaging: current indications and advanced imaging techniques. Radiologic Clinics 48, 1013–1042 (2010).
Gillies, R. J., Kinahan, P. E. & Hricak, H. Radiomics: images are more than pictures, they are data. Radiology 278, 563–577 (2016).
McNitt-Gray, M. et al. Standardization in quantitative imaging: a multicenter comparison of radiomic features from different software packages on digital reference objects and patient data sets. Tomography 6, 118–128 (2020).
Zwanenburg, A. et al. The image biomarker standardization initiative: standardized quantitative radiomics for high-throughput image-based phenotyping. Radiology 295, 328–338 (2020).
Valdora, F., Houssami, N., Rossi, F., Calabrese, M. & Tagliafico, A. S. Rapid review: radiomics and breast cancer. Breast cancer research and treatment 169, 217–229 (2018).
Clark, K. et al. The cancer imaging archive (tcia): maintaining and operating a public information repository. Journal of digital imaging 26, 1045–1057 (2013).
Saha, A. et al. A machine learning approach to radiogenomics of breast cancer: a study of 922 subjects and 529 dce-mri features. British journal of cancer 119, 508–516 (2018).
Saha, A. et al. Dynamic contrast-enhanced magnetic resonance images of breast cancer patients with tumor locations. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.e3sv-re93 (2021).
Lehman, C. et al. Acrin trial 6667 investigators group. mri evaluation of the contralateral breast in women with recently diagnosed breast cancer. N Engl J Med 356, 1295–303 (2007).
Kinahan, P., Muzi, M., Bialecki, B., Herman, B. & Coombs, L. Acrin-contralateral-breast-mr (acrin 6667). The Cancer Imaging Archive. https://doi.org/10.7937/Q1EE-J082 (2021).
Castaldo, R., Pane, K., Nicolai, E., Salvatore, M. & Franzese, M. The impact of normalization approaches to automatically detect radiogenomic phenotypes characterizing breast cancer receptors status. Cancers 12, 518 (2020).
Pati, S. et al. Reproducibility analysis of multi-institutional paired expert annotations and radiomic features of the ivy glioblastoma atlas project (ivy gap) dataset. Medical Physics 47, 6039–6052 (2020).
Saint Martin, M.-J. et al. A radiomics pipeline dedicated to breast mri: validation on a multi-scanner phantom study. Magnetic Resonance Materials in Physics, Biology and Medicine 34, 355–366 (2021).
Newitt, D. et al. Multi-center breast dce-mri data and segmentations from patients in the i-spy 1/acrin 6657 trials. The Cancer Imaging Archive. https://doi.org/10.7937/K9/TCIA.2016.HdHpgJLK (2016).
Hylton, N. M. et al. Neoadjuvant chemotherapy for breast cancer: functional tumor volume by mr imaging predicts recurrence-free survival—results from the acrin 6657/calgb 150007 i-spy 1 trial. Radiology 279, 44–55 (2016).
Hylton, N. M. Vascularity assessment of breast lesions with gadolinium-enhanced mr imaging. Magnetic resonance imaging clinics of North America 7, 411–20 (1999).
Chitalia, R. et al. Radiomic tumor phenotypes can augment molecular profiling in predicting survival after breast neoadjuvant chemotherapy: Results from acrin 6657/i-spy 1. Under review (2021).
Chitalia, R. D. et al. Imaging phenotypes of breast cancer heterogeneity in preoperative breast dynamic contrast enhanced magnetic resonance imaging (dce-mri) scans predict 10-year recurrence. Clinical Cancer Research 26, 862–869 (2020).
Davatzikos, C. et al. Cancer imaging phenomics toolkit: quantitative imaging analytics for precision diagnostics and predictive modeling of clinical outcome. Journal of medical imaging 5, 011018 (2018).
Pati, S. et al. The cancer imaging phenomics toolkit (captk): Technical overview. In International MICCAI Brainlesion Workshop, 380–394 (Springer, 2019).
Rathore, S. et al. Brain cancer imaging phenomics toolkit (brain-captk): an interactive platform for quantitative analysis of glioblastoma. In International MICCAI Brainlesion Workshop, 133–145 (Springer, 2017).
Cox, R. et al. A (sort of) new image data format standard: Nifti-1: We 150. Neuroimage 22 (2004).
Sled, J. G., Zijdenbos, A. P. & Evans, A. C. A nonparametric method for automatic correction of intensity nonuniformity in mri data. IEEE transactions on medical imaging 17, 87–97 (1998).
Tustison, N. J. et al. N4itk: improved n3 bias correction. IEEE transactions on medical imaging 29, 1310–1320 (2010).
Al Shalabi, L. & Shaaban, Z. Normalization as a preprocessing engine for data mining and the approach of preference matrix. In 2006 International conference on dependability of computer systems, 207–214 (IEEE, 2006).
Ribaric, S. & Fratric, I. Experimental evaluation of matching-score normalization techniques on different multimodal biometric systems. In MELECON 2006-2006 IEEE Mediterranean Electrotechnical Conference, 498–501 (IEEE, 2006).
Bakas, S. et al. Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the brats challenge. arXiv preprint arXiv:1811.02629 (2018).
Abdi, H. et al. Normalizing data. Encyclopedia of research design 1 (2010).
Jafri, N. F. et al. Optimized breast mri functional tumor volume as a biomarker of recurrence-free survival following neoadjuvant chemotherapy. Journal of Magnetic Resonance Imaging 40, 476–482 (2014).
Yushkevich, P. A. et al. User-guided 3D active contour segmentation of anatomical structures: Significantly improved efficiency and reliability. Neuroimage 31, 1116–1128 (2006).
Ronneberger, O., Fischer, P. & Brox, T. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention, 234–241 (Springer, 2015).
Thakur, S. et al. Brain extraction on mri scans in presence of diffuse glioma: Multi-institutional performance evaluation of deep learning methods and robust modality-agnostic training. NeuroImage 220, 117081 (2020).
Zijdenbos, A. P., Dawant, B. M., Margolin, R. A. & Palmer, A. C. Morphometric analysis of white matter lesions in mr images: method and validation. IEEE transactions on medical imaging 13, 716–724 (1994).
Sudre, C. H., Li, W., Vercauteren, T., Ourselin, S. & Cardoso, M. J. Generalised dice overlap as a deep learning loss function for highly unbalanced segmentations. In Deep learning in medical image analysis and multimodal learning for clinical decision support, 240–248 (Springer, 2017).
Pati, S. et al. Gandlf: A generally nuanced deep learning framework for scalable end-to-end clinical workflows in medical imaging. arXiv preprint arXiv:2103.01006 (2021).
Macyszyn, L. et al. Imaging patterns predict patient survival and molecular subtype in glioblastoma via machine learning techniques. Neuro-oncology 18, 417–425 (2015).
Bakas, S. et al. Advancing the cancer genome atlas glioma mri collections with expert segmentation labels and radiomic features. Scientific data 4, 170117 (2017).
Fathi Kazerooni, A. et al. Cancer imaging phenomics via captk: Multi-institutional prediction of progression-free survival and pattern of recurrence in glioblastoma. JCO Clinical Cancer Informatics 4, 234–244 (2020).
Bakas, S. et al. Integrative radiomic analysis for pre-surgical prognostic stratification of glioblastoma patients: from advanced to basic mri protocols. In Medical Imaging 2020: Image-Guided Procedures, Robotic Interventions, and Modeling, vol. 11315, 113151S (International Society for Optics and Photonics, 2020).
Thakur, S. P. et al. Skull-stripping of glioblastoma mri scans using 3d deep learning. In International MICCAI Brainlesion Workshop, 57–68 (Springer, 2019).
Chitalia, R. et al. Expert tumor annotations and radiomic features for the ispy1/acrin 6657 trial data collection. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.XC7A-QT20 (2022).
Wilkinson, M. D. et al. The fair guiding principles for scientific data management and stewardship. Scientific data 3, 1–9 (2016).

Acknowledgements

Research reported in this publication was partly supported by the National Cancer Institute (NCI) of the National Institutes of Health (NIH), under award numbers U01CA242871, U24CA189523, U01CA151235, R01CA197000, and R01CA132870. The content of this publication is solely the responsibility of the authors and does not represent the official views of the NIH.

Author information

These authors contributed equally: Rhea Chitalia and Sarthak Pati.
These authors jointly supervised this work: Despina Kontos and Spyridon Bakas.

Authors and Affiliations

Center for Biomedical Image Computing and Analytics (CBICA), University of Pennsylvania, Philadelphia, PA, 19104, USA
Rhea Chitalia, Sarthak Pati, Megh Bhalerao, Siddhesh Pravin Thakur, Nariman Jahani, Vivian Belenky, Despina Kontos & Spyridon Bakas
Department of Radiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
Rhea Chitalia, Sarthak Pati, Siddhesh Pravin Thakur, Nariman Jahani, Vivian Belenky, Elizabeth S. McDonald, Despina Kontos & Spyridon Bakas
Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
Rhea Chitalia, Sarthak Pati & Spyridon Bakas
University of California San Francisco (UCSF), San Francisco, CA, 94115, USA
Jessica Gibbs, David C. Newitt & Nola M. Hylton

Contributions

D.K. and S.B. conceived the experiment(s). R.C., S.P., M.B., S.T., N.J., V.B. and E.M. conducted the experiment(s). R.C., S.P. and M.B. analysed the results. R.C. and S.P. wrote the first version of the manuscript. J.G., D.N. and N.H. provided the data. All authors reviewed, edited, and approved the manuscript.

Corresponding author

Correspondence to Spyridon Bakas.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Table - Original clinical metadata

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Chitalia, R., Pati, S., Bhalerao, M. et al. Expert tumor annotations and radiomics for locally advanced breast cancer in DCE-MRI for ACRIN 6657/I-SPY1. Sci Data 9, 440 (2022). https://doi.org/10.1038/s41597-022-01555-4

Received: 20 November 2021
Accepted: 29 June 2022
Published: 23 July 2022
DOI: https://doi.org/10.1038/s41597-022-01555-4
