Intra-domain task-adaptive transfer learning to determine acute ischemic stroke onset time

https://doi.org/10.1016/j.compmedimag.2021.101926

Abstract

Treatment of acute ischemic stroke (AIS) is largely contingent upon the time since stroke onset (TSS). However, TSS may not be readily available in up to 25% of patients with unwitnessed AIS. Current clinical guidelines for patients with unknown TSS recommend the use of MRI to determine eligibility for thrombolysis, but radiology assessments have high inter-reader variability. In this work, we present deep learning models that leverage MRI diffusion series to classify TSS based on clinically validated thresholds. We propose an intra-domain task-adaptive transfer learning method, which involves training a model on an easier clinical task (stroke detection) and then refining the model with different binary thresholds of TSS. We apply this approach to both 2D and 3D CNN architectures; our top model achieves an ROC-AUC of 0.74, with a sensitivity of 0.70 and a specificity of 0.81, for classifying TSS < 4.5 h. Our pretrained models achieve better classification metrics than models trained from scratch, and these metrics exceed those of previously published models applied to our dataset. Furthermore, our pipeline accommodates a more inclusive patient cohort than previous work, as we did not exclude imaging studies based on clinical, demographic, or image processing criteria. When applied to this broad spectrum of patients, our deep learning model achieves an overall accuracy of 75.78% for classifying TSS < 4.5 h, carrying potential therapeutic implications for patients with unknown TSS.

Introduction

Acute ischemic stroke (AIS) is a cerebrovascular disease accounting for 2.7 million deaths worldwide every year (Benjamin et al., 2019). Treatment of AIS is heavily dependent on the time since stroke onset (TSS); current clinical guidelines recommend thrombolytic therapies for AIS patients presenting within 4.5 h of onset and endovascular thrombectomy for those presenting up to 24 h after onset. AIS without a clear TSS is relatively common, accounting for up to 25% of all AIS cases (Thomalla et al., 2014, Urrutia et al., 2018). Reasons for an unclear TSS include unwitnessed strokes, wake-up strokes, and unreliable reporting by patients. For this patient population, the most recent AHA guidelines recommend using MRI sequences to assess patient eligibility for thrombolytics (Powers et al., 2019).

Following the WAKE-UP trial (Thomalla et al., 2018), which used DWI-FLAIR mismatch to select patients for an extended intravenous thrombolysis time window, the use of MRI (DWI-FLAIR mismatch) is now recommended (level IIa) to identify unwitnessed AIS patients who may benefit from thrombolytic treatment (Powers et al., 2019). Specifically, diffusion-weighted imaging (DWI) displays increased signal in ischemic areas within minutes of stroke occurrence, while fluid-attenuated inversion recovery (FLAIR) imaging shows fluid accumulation only after several hours (Etherton et al., 2018), as shown in Fig. 1. A DWI-positive, FLAIR-negative mismatch can therefore identify stroke lesions that could benefit from administration of thrombolytics. However, assessment of this mismatch is subject to high variability across repeated readings and between radiologists (Thomalla et al., 2011). Thus, determining stroke onset from imaging alone could increase the number of patients eligible to receive thrombolytic treatments, potentially improving their outcomes.

Several machine learning approaches have been used to determine stroke onset time in an automated fashion. These involve generating hand-crafted, radiomic, or deep learning-derived features from either clinical reports or images and then using these features as inputs to a variety of machine learning models (Ho et al., 2019, Ho et al., 2017, Lee et al., 2020). This feature extraction has typically relied on defined regions of interest (ROIs), determined by applying an image threshold or using a parameter map. Limiting features to this immediate region may fail to capture imaging characteristics of the surrounding tissue, which could be crucial for informing TSS given the interconnected nature of cerebral blood flow (Bang et al., 2011). Moreover, previously published approaches have applied strict exclusion criteria, excluding patients by stroke location or by imaging factors related to preprocessing; in these studies, as many as 40% of patients were ineligible for assessment (Lee et al., 2020).
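For illustration, the sketch below shows the kind of ROI-based feature extraction such pipelines rely on: an intensity threshold applied to a coregistered diffusion volume defines the ROI, and simple first-order statistics are computed inside it. The array shapes, threshold rule, and feature set here are assumptions for demonstration, not the implementations used in the cited studies.

```python
# Illustrative sketch (not the cited authors' code): derive an ROI by
# intensity-thresholding a diffusion volume, then summarize voxel
# intensities inside the mask as simple first-order features.
import numpy as np

def roi_features(dwi: np.ndarray, z_thresh: float = 2.0) -> dict:
    """Mask voxels brighter than mean + z_thresh * std and summarize them."""
    mu, sigma = dwi.mean(), dwi.std()
    mask = dwi > (mu + z_thresh * sigma)          # hyperintense voxels
    if not mask.any():                            # no ROI detected
        return {"volume_vox": 0}
    vals = dwi[mask]
    return {
        "volume_vox": int(mask.sum()),            # lesion size in voxels
        "mean": float(vals.mean()),
        "std": float(vals.std()),
        "p90": float(np.percentile(vals, 90)),
    }

# Example with a random volume standing in for a DWI series (slices, H, W)
features = roi_features(np.random.randn(28, 224, 224))
print(features)
```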

Deep learning models have excelled in medical imaging for segmentation and classification tasks (Shin et al., 2016, Milletari et al., 2016, Chan et al., 2020, Nie et al., 2016, Winzeck et al., 2018). Specifically, convolutional neural networks (CNNs) have produced state-of-the-art results even on the small datasets common in medical imaging research (Litjens et al., 2017). Convolutions, which aggregate pixel neighborhoods across layers, may be performed in either two or three dimensions. While a wide range of 2D CNNs has been applied to medical imaging tasks, 3D CNNs offer the added advantage of integrating information along the z-axis. This potential advantage of 3D convolutions comes at the cost of increased model complexity, which generally requires more data and computational power to train.
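As a rough sketch of this trade-off, the PyTorch snippet below (illustrative only; channel counts and input sizes are assumptions, not the paper's configuration) contrasts a 2D block operating on individual slices with a 3D block that also aggregates along the z-axis, and prints the resulting parameter counts.

```python
# Sketch contrasting 2D (per-slice) and 3D (volumetric) convolutional blocks.
import torch
import torch.nn as nn

block2d = nn.Sequential(
    nn.Conv2d(2, 16, kernel_size=3, padding=1),   # e.g., DWI + FLAIR as 2 channels
    nn.BatchNorm2d(16),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(2),
)

block3d = nn.Sequential(
    nn.Conv3d(2, 16, kernel_size=3, padding=1),   # also aggregates along z
    nn.BatchNorm3d(16),
    nn.ReLU(inplace=True),
    nn.MaxPool3d(2),
)

params = lambda m: sum(p.numel() for p in m.parameters())
print(params(block2d), params(block3d))           # the 3D conv has ~3x the weights

x2d = torch.randn(4, 2, 128, 128)                 # batch of slices
x3d = torch.randn(1, 2, 16, 128, 128)             # batch of volumes
print(block2d(x2d).shape, block3d(x3d).shape)
```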

Due to the large number of parameters in a deep neural network, a high volume of data is typically required for training. For particularly complex classification tasks, transfer learning has been shown to achieve model convergence with less computation and to boost performance compared to training models from scratch (Pan and Yang, 2010). Transfer learning traditionally involves training a model on one dataset and then refining it on another dataset for a different task. Cross-domain transfer learning involves training on data from a source domain and using the learned weights in a model trained on data from a different target domain (Weiss et al., 2016), e.g., from the natural image domain to the medical image domain, or from the CT domain to the MR domain. Many deep learning approaches for medical images have used established architectures pretrained on large natural image datasets such as ImageNet (Russakovsky et al., 2015) and refined the model for the domain-specific task. This is thought to improve model convergence and to reuse low-level features learned from a high-volume dataset on a smaller one, a common scenario for medical imaging models given the high cost of acquiring sufficient data. However, differences between natural images and medical images limit the wide applicability of this method, likely due to over-parameterization of the original models (Carneiro et al., 2019). Efforts have been made to pretrain models on public medical datasets, but access to such datasets is still limited. Moreover, higher-level features of medical images vary significantly across medical domains. To address the limitations of cross-domain transfer learning and increase feature reuse across models, intra-domain transfer learning has been implemented for both natural image and medical image tasks (Raghu et al., 2019). Commonly, a model is initialized in a self-supervised or unsupervised fashion; the advantage of this approach is that it does not require outside datasets or labels. However, even intra-domain pretraining may result in limited feature reuse beyond the first convolutional layer (Verenich et al., 2020). A task-adaptive approach, which pretrains and then refines a model on the same dataset using two different label sets, has been shown to increase feature reuse and enhance performance (Elman, 1993, Bengio et al., 2009). However, this has not yet been applied in the medical image domain.
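To make the task-adaptive idea concrete, the sketch below (our illustrative PyTorch code; the tiny backbone, head sizes, and learning rates are placeholders, not the paper's architecture or hyperparameters) pretrains a backbone on the easier label set and then reuses its weights under a fresh head for the harder label set.

```python
# Sketch of intra-domain task-adaptive transfer: same imaging data, two label
# sets. Stage 1 trains on stroke detection; stage 2 reuses the backbone for
# binary TSS classification with a new head and a smaller backbone learning rate.
import torch
import torch.nn as nn

def make_backbone() -> nn.Module:
    return nn.Sequential(
        nn.Conv2d(2, 32, kernel_size=3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),       # -> (batch, 32) features
    )

# Stage 1: stroke detection (easier task)
backbone = make_backbone()
detector = nn.Sequential(backbone, nn.Linear(32, 2))
# ... train `detector` on stroke-detection labels ...

# Stage 2: TSS classification (harder task), reusing the pretrained backbone
tss_head = nn.Linear(32, 2)
tss_model = nn.Sequential(backbone, tss_head)
optimizer = torch.optim.Adam([
    {"params": backbone.parameters(), "lr": 1e-5},   # gently refine reused features
    {"params": tss_head.parameters(), "lr": 1e-3},   # train the new head faster
])
# ... train `tss_model` on binary TSS labels (e.g., < 4.5 h) ...
```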

We propose an intra-domain task-adaptive transfer learning approach and implement it for TSS classification. The approach uses a multi-stage training schema, leveraging features learned by training on an easier task (stroke detection) to refine the model for a more difficult task (TSS classification). We developed both 2D and 3D CNN models to classify TSS, and we demonstrated that our proposed transfer learning approach enhanced classification performance for both architectures compared to other pretraining schemas, with our 2D model achieving the best performance for classifying TSS < 4.5 h. We also showed that adding soft attention mechanisms during the later training stages further improved performance. To offer clinical insight, we compared our model's performance to both previously published methods and radiologist assessment of DWI-FLAIR mismatch. Our deep learning models achieved greater classification sensitivity while maintaining the specificity achieved by expert neuroradiologists. By visualizing network gradients via Grad-CAM (Selvaraju et al., 2019), we illustrated that our pretrained models localized the stroke infarct more precisely than models trained from scratch. To our knowledge, this is the first end-to-end deep learning approach to classify TSS on a patient dataset with minimal exclusion criteria; moreover, our model exceeds the performance of previously reported state-of-the-art machine learning models.
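For reference, Grad-CAM heatmaps of the kind used for these visualizations can be reproduced with a few lines of hook-based PyTorch code; the toy model and target layer below are stand-ins for illustration, not our trained networks.

```python
# Minimal Grad-CAM sketch: capture the target layer's activations and gradients
# with hooks, weight the activation maps by their pooled gradients, and upsample.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(
    nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),      # CAM target layer (index 2)
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 2),
)
feats, grads = {}, {}
target = model[2]
target.register_forward_hook(lambda m, i, o: feats.update(a=o))
target.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))

x = torch.randn(1, 2, 128, 128)                      # stand-in input slice
logits = model(x)
logits[0, 1].backward()                              # gradient of the class of interest

weights = grads["g"].mean(dim=(2, 3), keepdim=True)  # global-average-pooled gradients
cam = F.relu((weights * feats["a"]).sum(dim=1))      # weighted sum of activation maps
cam = F.interpolate(cam.unsqueeze(1), size=x.shape[-2:],
                    mode="bilinear", align_corners=False)
heatmap = cam[0, 0]                                  # overlay on the input for inspection
```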

Section snippets

Dataset and preprocessing

A total of 422 patients treated for AIS at the UCLA Ronald Reagan Medical Center from 2011 to 2019 were included in this study. This work was performed under the approval of the UCLA Institutional Review Board (#18-000329). A patient was included if they were diagnosed with AIS, had a known stroke onset time, and underwent MRI prior to any treatment, if given. Clinical parameters were gathered from imaging reports and the patient record, with demographic data summarized in Table 1. The study

Results

The performance metrics for all of our training phases are summarized in Table 2. For stroke detection, the 2D and 3D architectures achieved ROC-AUC values of 0.8905 and 0.9460, respectively. This indicates that the models were able to reliably identify stroke at both the slice and volume level, which aligns with intensity differences usually observed for stroke lesions on DWI and FLAIR series. For the second training phase, classifying TSS < 3 h, our pretraining approach improved the

Discussion

Among the models tested, the pretrained 2D model achieved the highest performance metrics with a sensitivity of 0.70 and a specificity of 0.81 in classifying TSS < 4.5 h. Our model was more sensitive than the DWI-FLAIR assessments performed by the neuroradiologists, which we treated as a surrogate for determining a TSS < 4.5 h. We also compared our model to the previously published state-of-the-art method. The threshold method implemented in Lee et al. (2020), which was used to create the ROI,

Conclusion

This approach uses 2D and 3D CNN models to classify TSS for 422 patients and compares model performances to DWI-FLAIR mismatch readings performed by three expert neuroradiologists. We demonstrate that our 2D model outperforms the 3D model when classifying TSS < 4.5 h, which is the current clinical guideline. We show that pretraining the model on stroke detection, then refining the model on TSS classification yields better performance than training on TSS classification labels alone; the

Authors’ contribution

Haoyue Zhang: conceptualization, methodology, software, validation, formal analysis, visualization, contribution, writing – original draft. Jennifer Polson: conceptualization, methodology, software, visualization, contribution, validation, formal analysis, writing – original draft. Kambiz Nael, Noriko Salamon and Bryan Yoo: data curation, validation, writing – review & editing. Fabien Scalzo: writing – review & editing. William Speier: conceptualization, formal analysis, writing – review &

Declaration of Competing Interest

The authors report no declarations of interest.

Acknowledgments

This work was supported by the United States National Institutes of Health (NIH) grants R01NS100806 and T32EB016640, and an NVIDIA Academic Hardware Grant. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Conflict of interest: None declared.

References (40)

  • Chan, H.P., et al., 2020. Deep learning in medical image analysis. Adv. Exp. Med. Biol.
  • Cheng, J., et al., 2015. Enhanced performance of brain tumor classification via tumor region augmentation and partition. PLOS ONE.
  • d'Esterre, C.D., 2017. Regional comparison of multiphase computed tomographic angiography and computed tomographic perfusion for prediction of tissue fate in ischemic stroke. Stroke.
  • Etherton, M.R., et al., 2018. Neuroimaging paradigms to identify patients for reperfusion therapy in stroke of unknown onset. Front. Neurol.
  • Fonov, V., et al., 2009. Unbiased nonlinear average age-appropriate brain templates from birth to adulthood. Neuroimage.
  • He, K., et al., 2016. Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
  • Ho, K.C., et al., 2017. Classifying acute ischemic stroke onset time using deep imaging features. AMIA Annu. Symp. Proc.
  • Ho, K.C., et al., 2019. A machine learning approach for classifying ischemic stroke onset time from imaging. IEEE Trans. Med. Imaging.
  • Lee, H., 2020. Machine learning approach to identify stroke within 4.5 hours. Stroke.
  • Luo, L., et al., 2019. Adaptive gradient methods with dynamic bound of learning rate. Proceedings of the 7th International Conference on Learning Representations.
1 Haoyue Zhang and Jennifer S Polson contributed equally.
