Introduction

Non-small cell lung cancer (NSCLC) is the most common type of lung cancer, accounting for 85% of all cases [1, 2]. Precise survival risk stratification of patients with NSCLC is a crucial step in treatment. Although the tumor, node, and metastasis (TNM) classification for lung cancer is the most objective and authoritative indicator of prognosis, patients at the same tumor stage still have heterogeneous prognoses [3,4,5]. To improve the management of NSCLC and support appropriate treatment decisions, numerous studies have reported other independent clinical prognostic factors, including age, sex, and performance status [6,7,8]. In addition, medical imaging, such as CT, can yield potential markers of prognosis, including tumor volume, pleural effusion, and radiomics features [9,10,11,12,13,14].

Radiomics based on medical imaging can assess the tumor and its environment in their entirety, which can provide additional information for predicting cancer outcomes [15,16,17]. Several studies have successfully applied intratumoral radiomics features to predict overall survival, cancer recurrence, and time to progression in patients with NSCLC [17,18,19]. Other studies have investigated the clinical use of quantifying peritumoral regions at CT to help predict tumor invasiveness, tumor spread through air spaces, and especially prognostic outcomes [20,21,22,23]. For example, Wang et al found that combining radiomics features extracted from intra- and peritumoral areas could improve the accuracy of prognosis prediction in pure-solid NSCLC [23]. However, neither the added value of extratumoral radiomics nor the quality of these studies has been systematically assessed, so the potential association between peritumoral radiomics features and prognosis in NSCLC remains to be explored further.

Therefore, the aim of this study was to systematically review and appraise the results of published studies that examined the prognostic value of CT-based peritumoral radiomics features in patients with NSCLC, and to summarize the potential biological underpinnings.

Materials and methods

This systematic review was reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [24]. The review was registered on PROSPERO before initiation (registration no. CRD42022322916).

Search strategy

The PubMed, Embase, Web of Science, and Cochrane Library databases were comprehensively searched up to February 21, 2022, to identify studies that used CT-based peritumoral radiomics to evaluate prognosis in patients with NSCLC. The reference lists of the included articles and the relevant literature were also manually searched. The following basic search terms were used: NSCLC, pulmonary nodule, CT, radiomics, peritumoral, and prognosis. The detailed search criteria are described in the supplementary material. The search was performed without language or date restrictions.

Study selection

Original research articles were included in the study. Eligibility criteria were as follows: (1) patients with NSCLC; (2) evaluation of patient prognosis by a peritumoral radiomics approach on CT. Studies were excluded if they (1) were case studies, editorials, letters, review articles, or conference abstracts; (2) were not in the field of interest; or (3) had overlapping study populations.

Data extraction

The following data were extracted: (1) study details: first author, publication year, country, study design; (2) patient details: the source of data acquisition (single-center/multicenter), type of cohort, sample size, TNM staging, histological subtype, type of treatment, prognostic outcome; (3) imaging details: CT tube voltage, reconstruction slice thickness (mm), plain or contrast CT; (4) radiomics details: segmentation software, segmentation method, peritumoral definition and its reference, feature extraction software, type of radiomics features, number of radiomics features, radiomics feature selection methods, type of models constructed, final classifier, number of radiomics features in the final model, type of radiomics features in the final model, and performance of the models. Two independent reviewers (L.W. and C.G.) completed the initial screening and extracted data from all enrolled studies.

Risk of bias assessment

The methodological quality of each study was evaluated by using the Radiomics Quality Score (RQS) [25] and the Prediction Model Risk of Bias Assessment Tool (PROBAST) [26]. The RQS provides a standardized and quantitative evaluation criterion for the methodology of radiomics research. The RQS assessment contains sixteen key components, from data selection, medical imaging, feature extraction, and exploratory analysis to modelling. Each item contributes to the final score, and the total score ranges from -8 to 36 points [25]. A detailed description of each RQS item and the corresponding scores is provided in Table S1. PROBAST is a tool to assess the risk of bias (ROB) and the applicability of prediction models for diagnosis or prognosis. The risk of bias assessment of all enrolled studies was made by two reviewers (L.W. and C.G.) with a consensus agreement.

Results

Literature search and data extraction

The PRISMA flow diagram of the literature search is shown in Fig. 1. A total of 433 studies were identified, of which 432 were identified by the comprehensive literature search and one by a hand search of the relevant literature. After screening and evaluation, 13 studies with 2942 patients meeting the criteria were included in this systematic review [22, 23, 27,28,29,30,31,32,33,34,35,36,37].

Fig. 1 Flowchart of the study screening and selection process of this systematic review

Patient and study characteristics

The patient characteristics of the 13 studies are summarized in Table 1. The included studies were published from 2017 to 2022. Almost all the studies (12/13, 92%) were retrospectively designed [22, 23, 27,28,29,30,31,32,33,34,35,36]; the remaining study was prospective [37]. Patients in seven studies (7/13, 54%) were from one center [22, 28,29,30, 34, 36, 37], and those in the others (6/13, 46%) were from multiple centers [23, 27, 31,32,33, 35]. Most studies (10/13, 77%) included a training cohort and a validation/test cohort, of which six conducted external tests with data from another center [23, 27, 31,32,33, 35]. The number of patients included in the studies ranged from 90 to 592. The type of treatment and type of prognostic outcome are summarized in Table 1. Treatments varied and included surgery, adjuvant chemo-/radiotherapy, and immune checkpoint treatment. The prognostic outcomes included survival [23, 27,28,29, 31,32,33,34,35, 37], distant metastasis [22, 30, 36], and response status [28, 29, 31, 34]. The most frequent study purpose was the prediction of overall survival [27,28,29, 31,32, 34, 37] (7/13, 54%).

Table 1 Basic characteristics and CT scanner information of the included studies

Radiomics workflow

The image acquisition parameters of the radiomics studies are shown in Table 1. The CT slice thickness ranged from 0.6 to 5 mm, except in one study that did not report the thickness [32]. Six studies (6/13, 46%) conducted radiomics using contrast-enhanced CT images [22, 27, 30, 31, 35, 36], while only one study used non-contrast CT images [29]. Two studies (2/13, 15%) used either contrast-enhanced or non-contrast CT images [33, 34]. The other studies (4/13, 31%) did not report whether contrast was used [23, 28, 32, 37].

The details of the radiomics workflow, including region of interest (ROI) segmentation, feature extraction and selection, and model construction, are summarized in Table 2. The ROI segmentation was manual in most studies (9/13, 69%) [22, 23, 28, 29, 31, 33,34,35, 37], semi-automatic in three studies [27, 30, 32], and automatic in one study [36]. The most commonly used software for ROI segmentation was 3D Slicer (5/13, 38%), and the most commonly used software for radiomics feature extraction was MATLAB (8/13, 62%) (Table 2). The types of extracted radiomics features included texture features and/or first-order statistics and shape features [22, 23, 28,29,30,31,32,33,34,35,36,37]. Moreover, some novel radiomics features were introduced. For example, Tunali et al generated radial gradient and radial deviation features that represent voxel-by-voxel gradient changes [27], and Vaidya et al analyzed quantitative vessel tortuosity features that represent the curvedness of tumor vessels [34].

Table 2 Segmentation, extraction, selection methods, radiomics features, and prediction models of the included studies

The number of extracted radiomics features ranged from 48 to 5309. The radiomics feature selection methods frequently included intraclass correlation coefficients, univariable analyses, and multivariable analyses. The number of model types constructed in these studies ranged from 2 to 24. The final radiomics model was usually a multivariable Cox model [22, 27, 28, 30, 31, 35, 36]. The number of radiomics features in the final radiomics model ranged from 2 to 18.

The peritumoral radiomics model and possible biological underpinnings

All the included studies segmented both the intra- and peritumoral regions; however, the definitions of the peritumoral regions varied. Three different definitions of peritumoral regions are summarized in Fig. 2. Almost all erosion and dilation operations were based on the morphology of the tumor and can be classified into three types. In type 1, the border mask was defined as the region from an inward erosion of 12.5/15 mm [27] or 3 mm [22, 35, 36] to an outward dilation of 7.5/10 mm [27] or 3 mm [22, 35, 36] along the tumor border. The outside mask was defined as the area expanding outward from the tumor to 17.5/22.5 mm [27] or 3/6 mm [35]. The exterior mask was defined as the area 3 to 9 mm away from the tumor [22]. In type 2, the border mask was defined as the region expanding 3 mm outward from the tumor boundary [32], while the criterion for the outside mask was 15 mm [23, 28, 29, 33, 34], 20 mm [30], or 30 mm [31]. In type 3, the gross tumor volume equalled the original volume of the tumor lesion without any erosion or dilation. The clinical target volume contained the gross tumor volume plus an area expanding outward from the tumor boundary. The planning target volume was defined as the combination of the tumor volume and the area dilated from the tumor border, which was necessary to manage internal motion and set-up reproducibility [37].

Fig. 2 The three types of definitions for peritumoral regions. Type 1: Border Mask: (−12.5 or −15 to +7.5 or +10) mm [27], (−3 to +3) mm [22, 35, 36]; Outside Mask: (0 to +17.5 or +22.5) mm [27], (0 to +3 or +6) mm [35]; Exterior Mask: (+3 to +9) mm [22]. Type 2: Border Mask: (0 to +3) mm [32]; Outside Mask: (0 to +15) mm [23, 28, 29, 33, 34], (0 to +20) mm [30], (0 to +30) mm [31]. Type 3: Tumor Mask: gross tumor volume; Border Mask: clinical target volume minus Tumor Mask; Outside Mask: planning target volume minus Tumor Mask [37]. −: inward erosion; +: outward dilation; 0: tumor boundary
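The type 1 and type 2 definitions can be read as set operations on eroded and dilated copies of the tumor mask. The following is a minimal sketch in plain Python, assuming a 2D mask stored as a set of pixel coordinates, isotropic 1 mm pixels, and a disc-shaped structuring element. The helper names (`disc_offsets`, `dilate`, `erode`, `border_mask`, `outside_mask`) are hypothetical and do not come from any of the included studies.

```python
from itertools import product


def disc_offsets(radius):
    """(dy, dx) offsets of a disc-shaped structuring element, radius in pixels."""
    r = int(radius)
    return [(dy, dx) for dy, dx in product(range(-r, r + 1), repeat=2)
            if dy * dy + dx * dx <= radius * radius]


def dilate(mask, radius):
    """Outward dilation: every pixel within `radius` of a mask pixel."""
    offsets = disc_offsets(radius)
    return {(y + dy, x + dx) for (y, x) in mask for (dy, dx) in offsets}


def erode(mask, radius):
    """Inward erosion: mask pixels whose whole disc neighbourhood lies in the mask."""
    offsets = disc_offsets(radius)
    return {(y, x) for (y, x) in mask
            if all((y + dy, x + dx) in mask for (dy, dx) in offsets)}


def border_mask(tumor, inward, outward):
    """Band straddling the tumor boundary, e.g. (-3 to +3) mm."""
    return dilate(tumor, outward) - erode(tumor, inward)


def outside_mask(tumor, outward):
    """Ring from the tumor boundary outward, e.g. (0 to +15) mm."""
    return dilate(tumor, outward) - tumor


# Toy tumor: a disc of radius 6 pixels centred on the origin (assumed 1 mm pixels).
tumor = set(disc_offsets(6))
border = border_mask(tumor, inward=3, outward=3)
outside = outside_mask(tumor, outward=15)
```

In practice the resulting masks would also be clipped to the image extent and usually to the lung parenchyma, and the millimetre distances would be converted to voxels using the scan spacing; both steps are omitted here.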

Moreover, the rationale cited for the choice of peritumoral region also varied among the included studies. The peritumoral regions were often dilated from the tumor boundary by 15 mm, 20 mm, or 30 mm (7/13, 54%) [23, 28,29,30,31, 33, 34]. Some of these studies (6/13, 46%) referred to previous findings that a resection margin > 15 mm did not decrease the risk of recurrence or that a resection margin ≥ 20 mm was the safe margin [28,29,30,31, 33, 34]. In several studies (4/13, 31%), an area just outside the tumor border was chosen as the peritumoral region, where microscopic extension of cancerous islets or the “real invasive front” can still be found [22, 35,36,37].

Several researchers have explored the biological underpinnings of peritumoral radiomics features in the prediction of prognostic outcomes in patients with NSCLC [27, 31,32,33]. Khorrami et al investigated associations between changes in radiomics features and the density of tumor-infiltrating lymphocytes on digitized hematoxylin-eosin images [31]. Pérez-Morales et al analyzed the associations between the final two radiomics features and gene probesets [32]. Vaidya et al investigated associations between prognostic radiomics features and tumor-infiltrating lymphocytes (radiopathomic analysis), as well as between the radiomics features and mRNA sequencing data (radiogenomic analysis) [33]. Tunali et al explored potential biological underpinnings by analyzing the correlations of radiomics features with semantic radiological features [27]. Others discussed the possible pathological basis of prognostic radiomics features from the peritumoral region, such as the “real invasive front,” hypoxic tumor environment, neovascularization and angiogenesis in the tumor microenvironment, lymphovascular tumor invasion, and micrometastasis [22, 28,29,30, 34, 35].

The performance of the models

The models with the best performance and the corresponding performance metrics in the included studies are summarized in Table 2. The concordance index (C-index) and the area under the receiver operating characteristic curve (AUC) were used to evaluate the performance of these models in twelve of the thirteen included studies [22, 23, 28,29,30,31,32,33,34,35,36,37]. The peritumoral radiomics features played an important role in the survival models [22, 23, 27,28,29,30,31,32,33,34,35,36,37]. The C-index or AUC values of these best-performing models ranged from 0.65 to 0.90 [22, 23, 28,29,30,31,32,33,34,35,36,37].
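For readers less familiar with the C-index, the sketch below shows one common formulation, Harrell's C for right-censored survival data, in plain Python. It is an illustration only: the included studies do not state which implementation they used, the function name and toy data are hypothetical, and pairs with tied follow-up times are simply skipped as a simplification.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data.

    A pair (i, j) is comparable when subject i has the shorter follow-up time
    and experienced the event. It is concordant when subject i also has the
    higher predicted risk; tied risks count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort (hypothetical): follow-up in months, event flag, model risk score.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.7, 0.4, 0.2]
c_perfect = concordance_index(times, events, risks)
c_reversed = concordance_index(times, events, risks[::-1])
c_constant = concordance_index(times, events, [0.5] * 4)
```

A value of 0.5 corresponds to a model that cannot rank patients by risk, which gives context for the reported range of 0.65 to 0.90.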

Quality assessment

The total RQS and the percentages of the maximum score are summarized in Table 3. The median RQS of the studies was 13 (range 4–19), and the corresponding percentage of the score was 36.11% (range 11.11–52.78%). Figure 3 shows the percentages of scores in the studies for the sixteen components of RQS. The results of the ROB and applicability assessments of these studies are presented in Table 4. Figure 4 presents the percentage of the studies rated by level of concern, ROB, and applicability for each domain. All studies were assessed as high ROB overall [22, 23, 27,28,29,30,31,32, 33,34,35,36,37]. Most studies (12/13, 92%) were considered low concern regarding applicability [22, 23, 28,29,30,31,32,33,33, 35,36,37].
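The reported percentages follow directly from dividing each total RQS by the maximum of 36 points. A minimal sketch of that arithmetic in Python is shown below; the function name is hypothetical, and clamping negative totals to 0% is an assumption made here for illustration, since no included study scored below zero.

```python
RQS_MAX = 36  # maximum total Radiomics Quality Score


def rqs_percentage(total):
    """Express a total RQS as a percentage of the maximum score.

    Negative totals are clamped to 0% (an assumption of this sketch).
    """
    return round(max(total, 0) / RQS_MAX * 100, 2)


median_pct = rqs_percentage(13)  # median RQS reported in this review
lowest_pct = rqs_percentage(4)   # lowest RQS reported
highest_pct = rqs_percentage(19)  # highest RQS reported
```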

Table 3 Radiomics quality scores for the included studies

Fig. 3 Quality assessment of the included studies by the Radiomics Quality Score (RQS), presented as the percentages of scores of the included studies

Table 4 Prediction model risk of bias assessment of included studies (PROBAST)

Fig. 4 The percentage of the included studies rated by the risk of bias and applicability using the Prediction Model Risk of Bias Assessment Tool (PROBAST)

Discussion

In this systematic review, we found that the radiomics features extracted from the peritumoral lung parenchyma on CT images can be considered a potential prognostic factor for patients with NSCLC. However, the included studies showed considerable variability and heterogeneity (including CT acquisition parameters and radiomics methodology) in each step of radiomics analysis.

Standardized radiomics analysis has been advocated to eliminate unnecessary confounding variability [25, 38]. Because the included studies covered a wide range of section thicknesses (0.6–5 mm), the impact of section thickness on model performance should be evaluated. Khorrami et al evaluated the impact of section thickness on the performance of the classifier and found that the areas under the receiver operating characteristic curves for the radiomics model decreased slightly when the section thickness increased [28, 29, 31]. Bettinelli et al found that the agreement of seven radiomics software programs varied [39]. Test-retest variation and differences between and within CT protocols can affect the stability of radiomics features to different degrees [40]. Therefore, several studies selected stable and reproducible features on the test-retest RIDER lung CT dataset and retained features with an intraclass correlation coefficient at or above a threshold of 0.75, 0.8, or 0.85 [22, 27,28,29, 31,32,33,34].
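The test-retest filtering step can be sketched as follows. The review does not state which form of the intraclass correlation coefficient each study used, so this example assumes the one-way random-effects ICC(1,1); the function names, feature names, and values are hypothetical.

```python
def icc_oneway(measurements):
    """One-way random-effects ICC(1,1).

    `measurements` is a list with one tuple per patient, each holding the
    feature value from every repeated scan (e.g. test and retest).
    Undefined (division by zero) if every value is identical.
    """
    n = len(measurements)
    k = len(measurements[0])
    subject_means = [sum(m) / k for m in measurements]
    grand_mean = sum(subject_means) / n
    ms_between = k * sum((s - grand_mean) ** 2 for s in subject_means) / (n - 1)
    ms_within = sum((x - s) ** 2
                    for m, s in zip(measurements, subject_means)
                    for x in m) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def select_stable(features, threshold=0.8):
    """Keep the names of features whose ICC is at or above the threshold."""
    return sorted(name for name, values in features.items()
                  if icc_oneway(values) >= threshold)


# Hypothetical test-retest values for two features in four patients.
features = {
    "feature_a": [(1.0, 1.1), (2.0, 2.1), (3.0, 2.9), (4.0, 4.2)],  # repeatable
    "feature_b": [(1.0, 3.0), (2.0, 1.0), (3.0, 2.5), (1.5, 3.5)],  # unstable
}
stable = select_stable(features, threshold=0.8)
```

Raising the threshold from 0.75 to 0.85 trades a smaller feature pool for features that are more robust to rescanning.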

ROIs can be segmented manually or (semi)automatically. However, manual segmentation remained the main method in the radiomics studies, and 69% of the included studies segmented the ROI manually [22, 23, 28, 29, 31, 33,34,35, 37]. The variability in manual delineations can be reduced by multiple segmentation, but this is time-consuming [25]. Hence, rapid and reliable automatic ROI segmentation is highly desirable but still challenging. Some efforts to automatically segment lung nodules have been made and are promising [41,42,43]. Feature selection, modeling methodology, and validation are three major aspects of a radiomics model. Feature reduction for high-throughput radiomics features was performed to decrease the risk of overfitting by multiple methods, such as the max-relevance and min-redundancy method and the least absolute shrinkage and selection operator method [22, 28, 29, 33, 35]. Validation is an indispensable component of radiomics analysis [25]. Most of the included studies conducted internal validation or even external validation with data from another center [22, 23, 27,28,29,29, 31,32,32,34,35].

CT images may contain information that reflects the underlying pathophysiology of the tumor, and converting images into structured data can assist clinical decision support [38]. Peritumoral mask segmentation is usually based on morphologic operations (dilation) from the lesion boundary. Features are often extracted from a three-dimensional volume of interest and/or on a section-by-section basis [22, 23, 27,28,29,30,31], while a few studies extracted features from the three slices with the largest tumor area [33, 34]. With an underlying biological rationale, such as the “real invasive front” and micrometastasis around the tumor, the peritumoral regions of the included studies were dilated from the tumor boundary by between 3 and 30 mm [22, 23, 27,28,29,30,31,32,33,34,35,36]. The biological underpinning of radiomics is important to its wider use and further validation. Efforts to explain the biological meaning of radiomics are emerging, including relationships with semantic features, gene expression, microscopic histopathologic findings, and macroscopic histopathologic marker expression [44]. Encouragingly, several researchers have investigated the correlation between prognostic radiomics features and the density of tumor-infiltrating lymphocytes as well as gene and mRNA sequencing data [31,32,33]. This exploration will reinforce our understanding of the biological meaning of peritumoral radiomics in predicting the prognosis of patients with NSCLC.

The RQS was used to assess the methodology, analysis, and reporting of each radiomics study. The median RQS of the studies was 13 (range 4–19), or 36.11% of the maximum score, which indicates that most of the included studies did not reach a moderate level of radiomics quality. All the included studies performed feature reduction and discussed biological correlates. None of the included studies conducted a cost-effectiveness analysis, and most of the studies lacked open science. According to the PROBAST, all of the studies were considered to have a high ROB overall. The reasons for the high ROB of these model development and validation studies may be as follows: (1) Most of the included studies (12/13, 92%) were retrospective. (2) Calibration was not evaluated in most studies. (3) Whether predictors were assessed without knowledge of outcome information was not mentioned.

This systematic review has several limitations that should be noted. First, the number of eligible studies was relatively small. Second, because high heterogeneity was found in radiomics analysis, such as the type of treatment, outcome of prognosis, and radiomics modeling, a meta-analysis of pooled outcomes was not conducted. Third, most of the studies were evaluated as having low RQS and high ROB, so the results should be interpreted with caution.

In conclusion, growing evidence shows that peritumoral CT-based radiomics features are promising for predicting the prognosis of NSCLC, although the radiomics analysis needs standardization. Because most of the studies were performed retrospectively, prospective, multicenter studies that also examine biological correlations are needed to promote clinical use.