Qualitative Analysis of Tissuetype Segmentation Maps in the Development Cohort. As described in detail in the Materials and Methods section, the thresholds in Figure 2 were iteratively optimized to segment Fluid (blue) and vasculature (red) in non-tumor areas, and CE (yellow) and NE tumor within the tumor VOI, so that the resulting maps matched the expectations of a radiologist (JAA) based on the unenhanced T1-weighted (T1w), contrast-enhanced T1-weighted (T1wCE), T2-weighted (T2w), FLAIR, and ADC images. Thresholds were also selected for segmenting GM (green) and WM (white), although the mpMRI acquisitions were not optimized for this task. Figure 3 illustrates tissuetype maps ("colormaps") computed from co-registered mpMRI images of participants with brain metastases from melanoma (Figure 3A) and colon cancer (Figure 3B), and of a participant with GBM (Figure 3C). Edema2 (cyan) co-localizes with regions of leukoaraiosis (arrows; magnified view in Supplementary Figure S2), while substantial portions of the high FLAIR regions within the tumor VOI are classified as Edema1 (brown). The rules correctly identify hemorrhage (purple) in the melanoma and colon cancer metastases (Figures 3A and 3B) but also artifactually label edge pixels affected by EPI distortions in the ADC map (Figure 3C).
Validation of Tissuetype Volumes vs. Ground Truth in an External Cohort. Segmentation methods are typically evaluated using the Dice score or an equivalent voxelwise similarity measure against a ground truth. However, our primary purpose is volumetry of intratumoral tissuetypes. We therefore compared the volumes of specific tissuetypes computed using the algorithm in Figure 2 with the manually segmented volumes of ET, ED, and NCR in the BraTS 2020 dataset. The volume of the CE tissuetype was strongly correlated with BraTS ET (R=0.85; p<0.001; Figure 4A). On average, the manually drawn ET primarily encompassed CE voxels (~67%), with smaller contributions from other tissuetypes (Figure 4B). The BraTS ED component was highly correlated with High FLAIR (R=0.87; p<0.001) and moderately correlated with Edema1 (R=0.69; p<0.001) and Edema2 (R=0.64; p<0.001) individually (Figure 4A). The manually drawn ED encompassed pixels classified as High FLAIR (49%) and as normal brain tissue (49%) (Figure 4B). Unlike ET and ED, NCR/NET was only weakly correlated with CE, High FLAIR, intratumoral Fluid, Edema1, and Edema2 (R=0.22-0.52; Figure 4A).
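The two quantities reported above, a per-case volume correlation and the tissuetype composition of a manual mask, can be illustrated with a minimal sketch. The function names (`pearson_r`, `composition`) and all numbers below are hypothetical and chosen only for illustration; they are not the study data or the analysis code used in this work.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def composition(manual_mask, tissuetype_labels):
    """Fraction of voxels inside a manual mask that carry each tissuetype label."""
    counts = {}
    for inside, label in zip(manual_mask, tissuetype_labels):
        if inside:
            counts[label] = counts.get(label, 0) + 1
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Hypothetical per-case volumes in cm^3 (one entry per case).
ce_volume = [12.0, 30.5, 8.2, 21.7, 15.3]   # rule-based CE tissuetype
et_volume = [14.1, 33.0, 10.4, 20.2, 18.8]  # manually drawn ET
r = pearson_r(ce_volume, et_volume)

# Hypothetical flattened voxel arrays for a single case.
manual_et = [1, 1, 1, 0, 1, 1, 0, 1]
labels = ["CE", "CE", "Edema1", "CE", "CE", "Fluid", "WM", "CE"]
fractions = composition(manual_et, labels)
```

Comparing volumes in this way tolerates voxel-level disagreement that a Dice score would penalize, which is consistent with the volumetric purpose stated above.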
Figure 5 depicts overlays of ground truth BraTS segmentations of NCR/NET, ED, and ET onto tissuetype maps in illustrative examples. Qualitatively, there is good agreement between manually drawn ET and voxelwise CE, and between manually drawn ED and voxelwise Edema1 and Edema2 tissuetypes. It is also apparent that the manually drawn ET, ED, and NCR/NET segmentations encompass some voxels with divergent tissuetype signatures.
Univariable Correlations with TTP in Treatment Dataset. Figure 6 presents the cross-correlations of CE, High FLAIR, Edema1, Edema2, intratumoral Fluid, t_C1D1, and TTP. On a univariable basis, t_C1D1 showed the strongest correlation with TTP (R=−0.61), followed by CE (R=−0.52), Edema2 (R=−0.27), High FLAIR (R=−0.26), Edema1 (R=−0.20), and Fluid (R=−0.036). Edema1, Edema2, and High FLAIR were strongly correlated with each other (R=0.82-0.98) and moderately correlated with t_C1D1 (R=0.58-0.65). Intratumoral Fluid was moderately correlated with Edema1, Edema2, and High FLAIR (R=0.60-0.73) and weakly correlated with t_C1D1 (R=0.49) and CE (R=0.20).
Tissuetype Volume Trends in Treatment Dataset. Temporal dynamics of the volumes of CE (red), Fluid (blue), High FLAIR (light blue), Edema1 (brown), and Edema2 (cyan) tissuetypes within the abVOI are presented in Figure 7 for 10 treatment study participants. In 7 of the 10 participants in Cohort 1 (Figure 7), the CE tissuetype volume showed an increasing trend beginning, on average, 100 days before progression was called. Trends in the volumes of Edema1, Edema2, High FLAIR, and Fluid were inconsistent in the majority of participants in Cohort 1. Intratumoral tissuetype volume dynamics for the 22 treatment study participants in Cohorts 2-5 are presented in Supplementary Figures S3-S6. Across the majority of the 231 scan dates of the 32 participants in the treatment dataset, the timepoint-to-timepoint changes in the volumes of all intratumoral tissuetypes were qualitatively smooth, lending confidence that applying the rules in Figure 2 to intensity-calibrated mpMRI yields consistent volumetric data that can meaningfully inform longitudinal assessment of rHGG.
Multivariable Model for Predicting TTP in Treatment Dataset. To further characterize the utility of the proposed mpMRI tissuetyping algorithm in a clinical trial setting, we undertook a proof-of-concept multivariable analysis to predict TTP from t_C1D1 and the volumes of CE, High FLAIR, Edema1, Edema2, and Fluid within the abVOI. Across 92 time points in Cohort 1, we found significant relationships between TTP and t_C1D1 (p<1.80×10⁻¹⁰), CE (p<1.89×10⁻⁴), and Fluid (p<8.15×10⁻⁵). Together, these three variables significantly predicted TTP (F(3,88)=34.73, p<6.643×10⁻¹⁵, multiple R²=0.54, adjusted R²=0.53), and the model combining them yielded the lowest AIC with 10-fold cross-validation:
TTP = 205.6 – 0.6∙(t_C1D1) – 2.5∙(CE) + 4.8∙(Fluid) (1)
where TTP and t_C1D1 are in units of days, and CE and Fluid are in cm³.
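As a worked illustration of equation (1), the sketch below evaluates the fitted model for a single scan. The function name `predict_ttp_days` and the input values are hypothetical; the coefficients are the rounded values reported in equation (1), so the output inherits that rounding.

```python
def predict_ttp_days(t_c1d1_days, ce_cm3, fluid_cm3):
    """Evaluate equation (1) with its rounded, published coefficients.

    t_c1d1_days: t_C1D1 in days
    ce_cm3:      CE tissuetype volume within the abVOI, in cm^3
    fluid_cm3:   Fluid tissuetype volume within the abVOI, in cm^3
    Returns the predicted TTP in days.
    """
    return 205.6 - 0.6 * t_c1d1_days - 2.5 * ce_cm3 + 4.8 * fluid_cm3


# Hypothetical scan: t_C1D1 = 100 days, CE = 10 cm^3, Fluid = 5 cm^3.
# 205.6 - 60.0 - 25.0 + 24.0, i.e. approximately 144.6 days.
example_ttp = predict_ttp_days(100.0, 10.0, 5.0)
```

In this hypothetical case, each additional cm³ of CE shortens the predicted TTP by 2.5 days, whereas each additional cm³ of Fluid lengthens it by 4.8 days.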
Model Accuracy for Predicting Progression in Treatment Dataset. The model-predicted TTP (plotted in gray) was generally in good agreement with the retrospectively adjusted RANO-based progression12 (vertical dashed line) for most participants in Cohort 1 with IDH-wt tumors (Figure 7). Model performance was qualitatively lower in participants with IDH-mutant tumors (Cohort 2, Supplementary Figure S3), participants with clinical progression (Cohort 3, Supplementary Figure S4), and participants with remote recurrence (Cohort 4, Supplementary Figure S5). The model in equation (1) is a multiple linear regression model fitted to continuous TTP data from 92 time points; we also investigated its performance for predicting binary outcomes. Model accuracy for predicting whether progression would occur within n days of a given scan date is summarized in Table 3 for Cohorts 1-4. In the radiological progression Cohorts 1, 2, and 4, model accuracy was 80-88.5% for predicting events 30 days in the future from the scan date, decreasing to 40-73.7% for predicting events 90 days after the scan date. As expected, model performance for each value of n was worse in participants with clinical progression (Cohort 3) than in participants whose progression was based on local radiologic changes (Cohorts 1 and 2).
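One way to turn a continuous TTP prediction into the binary outcome described above is sketched below. The thresholding rule (progression is predicted within n days when the predicted TTP is at most n) is an assumption made for illustration, as are the function names and the example values; the exact rule behind Table 3 may differ.

```python
def progression_within(ttp_days, n_days):
    """True if progression falls within n_days of the scan date (assumed rule)."""
    return ttp_days <= n_days


def binary_accuracy(predicted_ttp, observed_ttp, n_days):
    """Fraction of scans where predicted and observed binary outcomes agree."""
    agreements = sum(
        progression_within(pred, n_days) == progression_within(obs, n_days)
        for pred, obs in zip(predicted_ttp, observed_ttp)
    )
    return agreements / len(predicted_ttp)


# Hypothetical predicted and observed TTP values in days, one pair per scan.
predicted = [20.0, 45.0, 120.0, 75.0, 10.0]
observed = [25.0, 28.0, 150.0, 95.0, 40.0]

accuracy_30 = binary_accuracy(predicted, observed, 30)
accuracy_90 = binary_accuracy(predicted, observed, 90)
```

Under this rule, accuracy counts both correctly predicted events and correctly predicted non-events, so it depends on how many scans fall on each side of the n-day cutoff.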