Development and validation of a new stage-specific nomogram model for predicting cancer-specific survival in patients in different stages of colon cancer: A SEER population-based study and external validation

Hu, Chenhao; Shi, Feiyu; Zhang, Zhe; Zhang, Lei; Liu, Ruihan; Sun, Xuejun; Zheng, Liansheng; She, Junjun

doi:10.3389/fonc.2022.1024467

ORIGINAL RESEARCH article

Front. Oncol., 07 December 2022

Sec. Cancer Epidemiology and Prevention

Volume 12 - 2022 | https://doi.org/10.3389/fonc.2022.1024467

This article is part of the Research Topic Clinically Prediction Models for Gastrointestinal Cancer Diagnosis and Prognosis in the Era of Precision Oncology

Chenhao Hu^1,2,3†

Feiyu Shi^1,2,3†

Zhe Zhang^1,2,3

Lei Zhang^1,2,3

Ruihan Liu^2,3

Xuejun Sun¹

Liansheng Zheng^4*‡

Junjun She^1,2,3*‡

¹Department of General Surgery, The First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi, China
²Center for Gut Microbiome Research, Med-X Institute, The First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi, China
³Department of High Talent, The First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi, China
⁴Department of Digestive Minimally Invasive Surgery, The Second Affiliated Hospital of Baotou Medical College, Baotou, China

Background: The effects of laterality of the primary tumor on survival in patients in different stages of colon cancer are contradictory. We still lack a strictly evaluated and validated survival prediction tool, considering the different roles of tumor laterality in different stages.

Methods: A total of 101,277 and 809 colon cancer cases were reviewed using the Surveillance, Epidemiology, and End Results database and the First Affiliated Hospital of Xi’an Jiaotong University database, respectively. We established training sets, internal validation sets and external validation sets. We developed and evaluated stage-specific prediction models and unified prediction models to predict cancer-specific survival and compared the prediction abilities of these models.

Results: Compared with right-sided colon cancers, the risk of cancer-specific death of left-sided colon cancer patients was significantly higher in stage I/II but was markedly lower in stage III patients. We established stage-specific prediction models for stage I/II and stage III separately and established a unified prediction model for all stages. By evaluating and validating the validation sets, we reported high prediction ability and generalizability of the models. Furthermore, the stage-specific prediction models had better predictive power and efficiency than the unified model.

Conclusions: Right-sided colon cancer patients have better cancer-specific survival than left-sided colon cancer patients in stage I/II and worse cancer-specific survival in stage III. Using stage-specific prediction models can further improve the prediction of cancer-specific survival in colon cancer patients and guide clinical decisions.

Introduction

Colon cancer remains one of the most commonly diagnosed malignancies and a leading cause of cancer-related deaths worldwide, and its morbidity and mortality rates have been increasing in recent years (1). In recent decades, increasing evidence has suggested that the laterality of the primary tumor is an effective prognostic factor for colon cancer. A meta-analysis of 66 relevant studies involving 1,437,846 patients suggested that the risk of death was significantly reduced in patients with left-sided primary tumors and demonstrated that the laterality of the primary tumor should be considered when deciding the ideal treatment method (2). In addition, several studies have shown that left-sided and right-sided colon cancers harbor different clinical, pathobiologic and molecular characteristics (3, 4). Moreover, the laterality of the primary tumor may be associated with the response to adjuvant therapy and targeted therapy and may help predict the survival benefit of targeted therapy (4, 5).

However, several recent studies provide new evidence suggesting that the relationship between survival and the laterality of the primary tumor is stage dependent. Compared with left-sided colon cancer patients, right-sided colon cancer patients had a significantly lower risk of mortality in stage II but a markedly higher risk in stage III (6–9). Because laterality contributes differently to the prognosis of colon cancer in stage II and stage III, using a unified prediction model inevitably leads to the misestimation of survival in some patients. However, to date, stage-specific prediction models based on large populations are still limited. Given that a powerful prognostic prediction tool plays a crucial role in deciding the appropriate therapy to improve survival, it is necessary to develop stage-specific prediction models for stage II and stage III separately to increase the accuracy of survival prediction.

Therefore, in this study, we retrieved and extracted data from the Surveillance, Epidemiology, and End Results (SEER) incidence database and the Electronic Medical Record and Analysis System (EMRAS) of the First Affiliated Hospital of Xi’an Jiaotong University. We developed prediction models based on the training data set of SEER data to predict cancer-specific survival, and evaluated and validated models using internal and external validation datasets. Furthermore, we reported that the stage-specific prediction models had better predictive power and efficiency after comparing the accuracy, discrimination, calibration and clinical usefulness of stage-specific prediction models and unified models.

Methods

Study design and data sources

We conducted a population-based retrospective cohort study with two follow-up cohorts. The main cohort was extracted from the SEER incidence database, which covers approximately 47.9% of the U.S. population. All patients (older than 20 years of age) who were diagnosed with primary colon cancer between 2010 and 2018 and histologically confirmed to have stage I-III malignant adenocarcinoma (ICD-O-3: 8140-8389) or mucinous adenocarcinoma (ICD-O-3: 8480) were identified and extracted (Figure 1). We excluded patients who met any of the following criteria: (1) missing demographic information, including age, sex and race; (2) unknown grade or stage, or T0, Tis or M1 disease; (3) no surgery or unknown surgery status; (4) no primary cancer; (5) unknown number of regional nodes examined or of positive regional nodes; and (6) unknown cause of death (Figure 1). A total of 101,277 patients were ultimately identified and extracted.

Figure 1 Flowchart for data retrieval and filtration of patients with colon cancer from the SEER database and EMRAS.

The Xi’an cohort, an external validation cohort, was obtained from the EMRAS database of the First Affiliated Hospital of Xi’an Jiaotong University using the same inclusion criteria from 2015 to 2018. A total of 809 patients were ultimately included (Figure 1).

We categorized the patients into three groups according to their AJCC stage: a stage I/II group, a stage III group and an all-stage group. For each group, the training data set was established with 70% of randomly selected patients from the SEER cohort. The remaining 30% of the patients from the SEER cohort were included in the internal validation data set. The Xi’an cohort was defined as an external validation data set (Figure 1).
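The 70/30 allocation described above can be sketched as follows. This is only a minimal illustration: the function name `split_cohort`, the seed, and the toy patient IDs are hypothetical, and splitting each group independently is an assumption, since the text does not say whether the all-stage split was drawn separately or pooled from the stage-specific splits.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=2022):
    """Randomly assign patient IDs to a training and an internal validation set."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # fixed seed keeps the split reproducible
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


# Hypothetical toy IDs, not study data: 60 stage I/II and 40 stage III patients.
groups = {
    "stage I/II": [f"P{i:03d}" for i in range(60)],
    "stage III": [f"P{i:03d}" for i in range(60, 100)],
}
groups["all-stage"] = groups["stage I/II"] + groups["stage III"]

# One independent 70/30 split per group (an assumption about the procedure).
splits = {name: split_cohort(ids) for name, ids in groups.items()}
for name, (train, valid) in splits.items():
    print(name, len(train), len(valid))
```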

Outcomes and covariates

For each patient, the baseline was defined as the date of cancer diagnosis. In the Xi’an cohort, patients were followed up until the date of death due to any cause or March 30th, 2020. The outcome of interest was death due to colon cancer. Death attributed to any other cause was defined as a competing event. For each patient, we retrieved the following clinical information from the SEER database or EMRAS database and adjusted for these factors as potential confounders: age, sex, year of diagnosis, race, marital status, tumor position, differentiation grade, histological type, T stage, N stage, radiotherapy, chemotherapy, carcinoembryonic antigen (CEA), tumor deposits, number of examined regional lymph nodes, number of positive regional lymph nodes and perineural invasion.
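To make the role of the competing event concrete, the sketch below estimates the cumulative incidence of cancer-specific death when deaths from other causes compete with it. It is a plain nonparametric (Aalen-Johansen type) estimator written for illustration, not the authors' R code; the event coding (0 = censored, 1 = cancer-specific death, 2 = death from another cause) and the toy data are assumptions.

```python
def cumulative_incidence(times, events, cause=1):
    """Nonparametric cumulative incidence of `cause` under competing risks.

    events: 0 = censored, 1 = cancer-specific death, 2 = death from another cause
    (this coding is an assumption made for the example).
    Returns a list of (time, cumulative incidence) at each time with a death.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    overall_survival = 1.0  # probability of being free of any death just before t
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        j = i
        d_cause = d_any = 0
        while j < len(data) and data[j][0] == t:  # handle tied times together
            if data[j][1] == cause:
                d_cause += 1
            if data[j][1] != 0:
                d_any += 1
            j += 1
        if d_any > 0:
            cif += overall_survival * d_cause / n_at_risk
            overall_survival *= 1 - d_any / n_at_risk
            curve.append((t, cif))
        n_at_risk -= j - i  # deaths and censored patients leave the risk set
        i = j
    return curve


# Toy follow-up times in years (hypothetical, not study data).
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 0, 2, 1, 0]
cancer_curve = cumulative_incidence(toy_times, toy_events, cause=1)
other_curve = cumulative_incidence(toy_times, toy_events, cause=2)
```

Treating deaths from other causes as censored in a Kaplan-Meier analysis would instead overestimate the probability of cancer-specific death, which is why a competing risk framework is used.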

Statistical analyses

All statistical analyses were performed using R statistical software, version 4.0.2 (R Foundation for Statistical Computing, Vienna, Austria). Continuous variables were expressed as the mean ± standard deviation (SD) or median (interquartile range (IQR)) according to the normality of the data. For each group (stage I/II group, stage III group and all-stage group), least absolute shrinkage and selection operator (LASSO) analysis was used to select the variables. The proportional hazard (PH) assumption was tested for each variable before the models were established. A variable with a P value less than 0.05 was considered to violate the PH assumption. For each variable that violated the PH assumption, we introduced a time function (sqrt(t)) to construct an interaction term. Considering that death due to other causes competed with the outcome of interest, we used a competing risk model to estimate the subdistribution hazard ratio (SHR) and the 95% confidence interval (CI) for cancer-specific death after adjusting for confounding variables selected by the LASSO analysis (10). Nomograms were constructed based on the results of the multivariate competing risk model. The Akaike information criterion (AIC) was used to evaluate the complexity of the model (11). Harrell’s concordance index (C-index) was used to evaluate the accuracy of the prediction (12). The time-dependent receiver operating characteristic curve (time-ROC) and time-dependent area under the curve (time-AUC) were used to assess the discrimination of the models. Calibration curves were used to assess the calibration of the models. Decision curve analysis (DCA) was conducted to evaluate the clinical usefulness of the models by calculating the net benefits at different threshold probabilities. X-tile software was used to calculate the cutoff value of the total score of the nomograms (13). All statistical tests were two-sided, and a P value less than 0.05 was considered statistically significant.
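For orientation, the quantities named in this section can be written out as follows. This is a sketch under the assumption that the competing risk model is a subdistribution hazard (Fine-Gray type) regression, which is consistent with the reported SHRs but is not spelled out in the text; the symbols $\lambda_{1,0}$, $\beta$, $\gamma_j$ and $p$ are notation introduced here, not the authors'.

```latex
% Subdistribution hazard for cancer-specific death (cause 1) given covariates X
\[
\lambda_1(t \mid X) = \lambda_{1,0}(t)\,\exp\!\left(\beta^{\top} X\right),
\qquad \mathrm{SHR}_k = \exp(\beta_k)
\]

% A covariate x_j that violates the PH assumption receives an x_j * sqrt(t) interaction
\[
\lambda_1(t \mid X) = \lambda_{1,0}(t)\,
\exp\!\Big(\sum_{k \neq j} \beta_k x_k + \beta_j x_j + \gamma_j x_j \sqrt{t}\Big),
\qquad \mathrm{SHR}_j(t) = \exp\!\left(\beta_j + \gamma_j \sqrt{t}\right)
\]

% Akaike information criterion for a model with p estimated parameters
\[
\mathrm{AIC} = 2p - 2\ln \hat{L}
\]
```

Under this formulation, the SHR of a variable with a time interaction is no longer a single number but changes with follow-up time, and a lower AIC indicates a better trade-off between fit and the number of parameters.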

Ethical statement

This study was conducted in accordance with the Declaration of Helsinki and was approved by the institutional review board of the First Affiliated Hospital of Xi’an Jiaotong University. All the data from the SEER database were public and deidentified, and individual informed consent was exempted. The data from EMRAS were deidentified, and all patients provided written, broad informed consent at admission. Because this study did not collect new clinical information or biospecimens, additional individual informed consent was exempted.

Results

Baseline characteristics of the study cohorts

From the SEER database, we retrieved data for a total of 167,013 patients diagnosed with colon cancer from 2010 to 2018 (Figure 1). After exclusion, a total of 101,277 patients were included for further analysis (Figure 1). From the EMRAS database, a total of 809 patients diagnosed with colon cancer were retrieved and included (Figure 1). In the SEER cohort, the median age at diagnosis was 67 (IQR: 57-77) years. A total of 49.3% (N=49,960) of the patients were male, and 77.8% (N=78,779) of the patients were white. A total of 61.5% (N=62,263) of the patients had tumors located in the right colon. The total follow-up duration was 369,527 person-years, with a median of 3.25 years (IQR: 1.50-5.58 years). In the Xi’an cohort, the median age at diagnosis was 63 (IQR: 54-73) years. A total of 56.6% (N=458) of the patients were male. A total of 52.7% (N=426) of the patients had tumors located in the right colon. The total follow-up duration was 2,219 person-years, with a median of 2.42 years (IQR: 1.33-4.00 years). Detailed demographic, clinicopathological and follow-up information is shown in Table 1.

Table 1 Baseline characteristics of patients with colon cancer in the training, internal validation and external validation data sets.

Tumor laterality was related to patient prognosis and had different effects on survival in the stage I/II group and stage III group

We first analyzed cancer-specific survival in the stage I and stage II groups separately (Table S1). The associations between the variables and cancer-specific survival were similar in the two groups. Because of the small sample size of the stage I group, we combined it with the stage II group into the stage I/II group.

We separated the patients retrieved from the SEER database into two groups, stage I/II and stage III, and assessed the effects of tumor laterality in each group. After accounting for competing causes of death and adjusting for covariates (age, sex, race, marital status, tumor laterality, differentiation grade, histological type, T stage, N stage, radiotherapy, chemotherapy, CEA, tumor deposits, number of examined regional lymph nodes and perineural invasion), the risk of cancer-specific death was significantly higher in patients with left colon cancer in the stage I/II group (left vs. right SHR: 1.170, 95% CI: 1.105-1.238, adjusted P<0.001, Figure 2A), while the risk of cancer-specific death was markedly lower in patients with left colon cancer in the stage III group (left vs. right SHR: 0.836, 95% CI: 0.797-0.876, adjusted P<0.001, Figure 2B). We observed the same direction of effect in the Xi’an cohort (Figure S2 and Table S5), although the differences were not statistically significant. Table S6 reports the results of the PH assumption tests.

Figure 2 Cumulative incidence of cancer-specific death of right colon cancer and left colon cancer in stage I/II (A) and stage III (B). CI, confidence interval; SHR, subdistribution hazard ratio.

Variable selection using LASSO analysis

We established three data sets according to the AJCC stage of the patients: the stage I/II, stage III and all-stage groups. For each group, we conducted univariate and multivariate analyses of factors associated with cancer-specific death. In the stage I/II group, age, sex, race, marital status, tumor position, differentiation grade, T-stage, radiotherapy, chemotherapy, CEA, tumor deposits, number of examined regional lymph nodes and perineural invasion were significantly associated with cancer-specific survival (Table S2). In the stage III and all-stage groups, all factors were significantly associated with cancer-specific survival (Table S3 and Table S4).

Thus, we conducted LASSO analysis to further reduce the number of variables. According to the results of the LASSO analysis, three lists of variables were established (Figure S1). Model 1 included the variables selected at the minimum λ value from the LASSO analysis. Model 2 included the most parsimonious combination of variables selected at a λ value within 1 standard error (SE) of the minimum. The AJCC model, representing the traditional prognostic prediction approach, included only T stage and N stage. A total of eight models were established (in the all-stage group, Model 1 and Model 2 included the same lists of variables).
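The choice between the two λ values can be sketched as below. The example assumes the common convention in which the candidate λ values are compared on a cross-validated error, the "minimum" λ is the one with the lowest error, and the "1 SE" λ is the largest λ whose error is within one standard error of that minimum; the text does not confirm that the authors used exactly this convention. The path values are made up for illustration.

```python
def select_lambdas(path):
    """Return (lambda_min, lambda_1se) from a cross-validated LASSO path.

    path: list of dicts with keys 'lam', 'cv_error', 'cv_se' (hypothetical layout).
    """
    best = min(path, key=lambda p: p["cv_error"])
    threshold = best["cv_error"] + best["cv_se"]
    # Larger lambda means stronger shrinkage and therefore fewer variables.
    within_one_se = [p for p in path if p["cv_error"] <= threshold]
    simplest = max(within_one_se, key=lambda p: p["lam"])
    return best["lam"], simplest["lam"]


# Made-up path: error is nearly flat until the penalty becomes too strong.
toy_path = [
    {"lam": 0.001, "cv_error": 10.20, "cv_se": 0.05, "n_variables": 15},
    {"lam": 0.005, "cv_error": 10.18, "cv_se": 0.05, "n_variables": 14},
    {"lam": 0.010, "cv_error": 10.21, "cv_se": 0.05, "n_variables": 12},
    {"lam": 0.020, "cv_error": 10.22, "cv_se": 0.06, "n_variables": 10},
    {"lam": 0.050, "cv_error": 10.40, "cv_se": 0.07, "n_variables": 6},
]
lambda_min, lambda_1se = select_lambdas(toy_path)
```

In this toy path, the minimum-error λ keeps 14 variables, while the 1 SE λ keeps 10 at an error that is statistically indistinguishable from the minimum, which mirrors the trade-off between Model 1 and Model 2.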

Establishment of stage-specific prediction models and a unified model and selection of the optimal model

A competing risk model was used to establish the prediction models using the training dataset of each group (Tables S2–S4). We chose the C-index and AIC to evaluate the accuracy of the three different models to select the optimal model in each group (Table 2 and Figure S3). Furthermore, the time-ROC and time-AUC were used to assess the discriminability of the models (Figures S4 and S5).
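As a reminder of what the C-index measures, the sketch below computes Harrell's C for a set of predicted risk scores. It is a simplified illustration: patients without the event of interest (including, as an assumption here, those who died of other causes) are treated as censored, pairs with tied follow-up times are skipped, and the toy data are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index.

    A pair (i, j) is comparable when patient i had the event and a shorter
    follow-up time than patient j. The pair is concordant when the patient
    who failed earlier also has the higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up in years, 1 = cancer-specific death.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.5, 0.1]
c_index = harrell_c_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk.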

Table 2 AIC values of different models and C-indexes in predicting cancer-specific survival in the training, internal validation and external validation data sets.

In the stage I/II group, Model 1 was developed using a combination of 14 variables (Table S2), Model 2 was developed using a combination of 10 variables (Table S2), and the AJCC model exclusively included T stage. The AIC values and C-indexes were similar between Model 1 and Model 2, but lower in the AJCC model (Table 2 and Figure S3). In addition, the time-ROC and time-AUC showed similar results (Figures 3, S4 and S5). However, Model 2 is simpler and easier to use in a clinical setting than Model 1. Therefore, we ultimately chose Model 2 to predict the prognosis of patients with stage I/II colon cancer.

Figure 3 Time-ROC for the training set, internal validation set and external validation set in different groups at the 3rd year and comparing the time-ROC between stage I/II, stage III and all-stage groups. (A) the time-ROC for the training set, the internal validation set and the external validation set in the stage I/II group at the 3rd year; (C) the time-ROC for the training set, the internal validation set and the external validation set in the stage III group at the 3rd year; (E) the time-ROC for the training set, the internal validation set and the external validation set in the all-stage group at the 3rd year; (B) the time-ROC comparing the stage-specific prediction models and the unified model in the training set; (D) the time-ROC comparing the stage-specific prediction models and the unified model in the internal validation set; (F) the time-ROC comparing the stage-specific prediction models and the unified model in the external validation set. *: P<0.05.

In the stage III group, Model 1 was developed using a combination of 15 variables (Table S3), Model 2 was developed using a combination of 14 variables (Table S3), and the AJCC model exclusively included T stage and N stage. The AIC values and C-indexes were similar between Model 1 and Model 2 but were lower in the AJCC model (Table 2 and Figure S3). In addition, the time-ROC and time-AUC showed similar results (Figures 3, S4 and S5). Therefore, we ultimately chose Model 2 to predict the prognosis of patients with stage III colon cancer.

In the all-stage group, Model 1 was developed using a combination of 15 variables (Table S4), and the AJCC model exclusively included T stage and N stage. The AIC values and C-indexes were lower in the AJCC model than in Model 1 (Table 2 and Figure S3). In addition, the time-ROC and time-AUC showed similar results (Figures 3, S4, S5). Therefore, we ultimately chose Model 1 as the unified model to predict the prognosis of patients with colon cancer.

Based on the selected models, three separate nomograms were established, as shown in Figure 4. We estimated the probability of 3-year, 5-year, and 8-year cancer-specific survival.

Figure 4 Established nomograms for optimal models for stage I/II, stage III and all-stage groups. (A) nomogram for stage I/II; (B) nomogram for stage III; (C) Nomogram for all stages. *: Asian or Pacific Islander; **: American Indian/Alaska Native.

The stage-specific prediction models had better performance than the unified model

We evaluated the accuracy of the model predictions using the C-index, as shown in Table 2. These three nomograms achieved favorable predictive accuracy. Furthermore, stage-specific prediction models, including the stage I/II prediction model and stage III prediction model, showed better predictive accuracy than the unified model (Figure S3).

We assessed the discriminability of the models using the time-ROC and time-AUC (Figures 3 and S5). The 5-year AUC values of the stage-specific prediction models were higher than those of the unified model in the training set and the internal validation set, but not in the external validation set (Figure 3).

In addition, we assessed the calibration of the models using calibration curves at 3 years (Figure S6), 5 years (Figure S7) and 8 years (Figure S8; the 8-year data were unavailable in the external validation set). Our results showed that the nomograms, including those of the stage-specific prediction models and the unified models, provided optimal agreement between model prediction and actual observations for 3-, 5- and 8-year cancer-specific survival in the training set, internal validation set and external validation set.

Furthermore, we conducted DCA to evaluate the clinical usefulness of the models. Within most of the threshold probability range, the nomograms we established were associated with a higher net benefit. Consistently, the net benefits of the stage-specific models were higher than those of the unified model in predicting 3-year (Figure S9), 5-year (Figure S10) and 8-year (Figure S11; the 8-year data were unavailable in the external validation set) cancer-specific survival in the training set, internal validation set and external validation set.
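The net benefit that DCA compares across threshold probabilities can be illustrated with the standard formula, net benefit = TP/n − (FP/n) × p_t/(1 − p_t). The sketch below applies it to a binary outcome at a fixed time horizon; it ignores censoring and competing events, which a survival DCA has to handle, so it is a simplification rather than the procedure used in the study. The predicted risks and outcomes are hypothetical.

```python
def net_benefit(predicted_risk, outcome, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(outcome)
    flagged = [(p, y) for p, y in zip(predicted_risk, outcome) if p >= threshold]
    true_pos = sum(1 for _, y in flagged if y == 1)
    false_pos = sum(1 for _, y in flagged if y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


def net_benefit_treat_all(outcome, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(outcome) / len(outcome)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predicted 5-year risks and observed outcomes (1 = event).
toy_risk = [0.05, 0.10, 0.30, 0.60, 0.80]
toy_outcome = [0, 0, 1, 0, 1]
nb_model = net_benefit(toy_risk, toy_outcome, threshold=0.2)
nb_all = net_benefit_treat_all(toy_outcome, threshold=0.2)
```

A model is clinically useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero.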

Optimal cutoff values of the total score for the stage-specific nomograms

We calculated the optimal cutoff values of the total score for the stage I/II nomogram and stage III nomogram using X-tile software and the training sets of each group. In the stage I/II group, a total score greater than or equal to 115 points was considered high risk. In the stage III group, a total score greater than or equal to 200 points was considered high risk. The distributions of the total scores for patients and cancer-specific survival are shown in Figure 5. Furthermore, both in the stage I/II group and stage III group, high-risk patients had worse cancer-specific survival than low-risk patients in the training, internal validation and external validation sets (Figure 5).
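Applying the reported cutoffs is a simple threshold rule, sketched below. The cutoff values (115 and 200 points) come from the text; the function and dictionary names are hypothetical, and the total score itself has to be read from the corresponding nomogram in Figure 4, which is not reproduced here.

```python
# Cutoffs reported in the text for the stage-specific nomograms.
HIGH_RISK_CUTOFF = {"I/II": 115, "III": 200}


def risk_group(stage_group, total_score):
    """Classify a patient as 'high' or 'low' risk from the nomogram total score."""
    if stage_group not in HIGH_RISK_CUTOFF:
        raise ValueError(f"no cutoff defined for stage group {stage_group!r}")
    # A score equal to the cutoff counts as high risk ("greater than or equal to").
    return "high" if total_score >= HIGH_RISK_CUTOFF[stage_group] else "low"


# Hypothetical total scores, for illustration only.
examples = [("I/II", 114), ("I/II", 115), ("III", 180), ("III", 230)]
labels = [risk_group(stage, score) for stage, score in examples]
```

Because the two nomograms assign points on different scales, a score is only meaningful together with its stage group.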

Figure 5 Distribution of total score for patients and survival and comparison of the cancer-specific survival of high-risk and low-risk patients. (A): training data sets of the stage I/II group; (B): internal validation data sets of the stage I/II group; (C): external validation data sets of the stage I/II group; (D): training data sets of the stage III group; (E): internal validation data sets of the stage III group; (F): external validation data sets of the stage III group.

Discussion

In this retrospective study, we established two independent cohorts. A total of 101,277 colon cancer patients from the SEER database and 809 colon cancer patients from the EMRAS database were included in the analysis. We confirmed that the laterality of the primary tumor markedly affects the patients’ prognoses, while the effects are contradictory in different stages. We reported that, compared with right-sided colon cancers, the risk of cancer-specific death was higher in patients with left colon cancer in the stage I/II group (left vs. right SHR: 1.170), while the risk of cancer-specific death was markedly lower in patients with left colon cancer in the stage III group (left vs. right SHR: 0.836). Based on the optimal models selected using the LASSO analysis for the groups, we established stage-specific prediction models for stage I/II and stage III separately and a unified prediction model for all stages. The C-index values for the established models were more than 0.7, indicating that the proposed models could predict survival with high accuracy. Moreover, we conducted discrimination and calibration analyses, which indicated that the proposed models were efficient predictors. The results of the DCA indicated that the proposed models could gain a higher net benefit within most of the threshold probability range. By validating the models in an independent external cohort from a different region and obtaining acceptable results, we showed that the proposed models have high generalizability. Moreover, by comparing the stage-specific prediction models with the unified model, we concluded that the stage-specific prediction models had better predictive power and efficiency. Finally, we calculated the optimal total score cutoff values for the stage-specific nomograms and efficiently identified the high-risk subsets. The prediction of survival in colon cancer patients can therefore be further improved by using stage-specific prediction models.

The laterality of the primary tumor has been widely accepted as one of the independent predictors of tumor prognosis (3). However, whether the prognosis of right-sided colon cancer is better or worse than that of left-sided colon cancer is still controversial and has been challenged by emerging evidence. Recent studies reported an interesting phenomenon in which stage II right-sided colon cancer patients had better survival than left-sided colon cancer patients, and stage III right-sided cancer patients had worse survival (3, 6, 8, 14–17). Weiss et al. reported conflicting results regarding the laterality of the primary tumor for predicting survival at different stages from the SEER database (6). Other researchers reached consistent conclusions based on several databases, including the National Cancer Database (NCDB, the United States) and British Columbia Cancer Agency Gastrointestinal Cancer Outcomes Unit (BCCA-GICOU, Canada) (15, 17). Moreover, Kishiki et al. reported that right-sided colon cancers had lower recurrence rates in stage I and stage II patients and a higher recurrence rate in stage III patients according to data retrieved from the databases of 23 institutions belonging to the Japanese Study Group for Postoperative Follow-up of Colorectal Cancer (14). However, although several studies have reported this result, we still lack a strictly evaluated and validated survival prediction tool that considers the different roles of tumor laterality in patients in different stages of colon cancer.

Several studies have shown that left-sided and right-sided colon cancers harbor different clinicopathological, biological and molecular characteristics, which may result in the different contributions of the laterality of the primary cancer to survival prediction in the different stages of colon cancer (3). The right-sided and left-sided colon have distinct embryologic origins. The right-sided colon, comprising the cecum, the ascending colon and the proximal two-thirds of the transverse colon, derives from the midgut, while the left-sided colon, including the distal one-third of the transverse colon, the splenic flexure, the descending colon, and the sigmoid colon, derives from the hindgut (18). Due to the distinct origins, the blood supplies of the right-sided and left-sided colon are also different: branches of the superior mesenteric artery mainly perfuse the right-sided colon, and branches of the inferior mesenteric artery mainly perfuse the left-sided colon. Furthermore, such distinct embryologic origins may account for a series of biological and molecular differences between left-sided and right-sided colon cancers. Microsatellite instability (MSI), which is thought to result from a deficient mismatch repair (MMR) system caused by either gene mutation or hypermethylation, occurs in approximately 15% of colon cancers and promotes tumorigenesis by generating mutations in target genes that possess coding microsatellite repeats (19, 20). MSI is observed more frequently in right-sided colon cancer than in left-sided colon cancer (21–23). MSI status is closely related to the survival of patients with colon cancer. Studies have shown that patients with stage II/III MSI colon cancer have better survival than those with microsatellite stability (MSS) (19, 24, 25). In contrast, in patients with metastatic colon cancer, the presence of MSI may significantly decrease survival (26).

Additionally, the frequency of mutations in key oncogenes and tumor suppressors is significantly different between right-sided and left-sided colon cancers (22). Several key mutations associated with different tumorigenesis pathways and survival, such as BRAF V600E and KRAS, are significantly more common in right-sided colon cancers, while the mutations of APC and TP53 are enriched in left-sided colon cancers (4, 27–31). The differential expression of these key tumor-associated molecules in right-sided versus left-sided colon cancer and their correlation with prognosis may be the result of the distinct embryologic origins of the right-sided and left-sided colon and may partly explain the difference in prognosis between right-sided and left-sided colon cancer.

Additionally, right-sided colon cancers, especially hepatic flexure and transverse colon cancers, have the possibility of alternative routes of lymphatic spread through the gastroepiploic ligament (32). Stelzner et al. identified small blood and lymphatic vessels connecting the transverse colon with the greater omentum and with the pancreas, which may be the potential pathways for lymphatic metastasis to infrapyloric and gastroepiploic lymph nodes (IGLN) (33). Previous studies reported incidence rates of 0.7-22% for IGLN metastases in right-sided colon cancer (34–42). In addition, the rates of IGLN metastases were higher (1.7-33%) in patients with positive mesocolic nodes (32). Inadequate dissection of IGLNs could have two consequences. On the one hand, residual metastatic nodes may lead to a high rate of local recurrence, although the impact of IGLN metastases on overall survival is still unclear. On the other hand, a lack of assessment of IGLN metastasis results in a misestimation of cancer staging and further affects the choice of adjuvant therapy after surgery. Therefore, several researchers have suggested that extended lymphadenectomy should be performed as a standard treatment for flexure and transverse colon cancers (32). However, whether the IGLNs are regional nodes, whether IGLN dissection should be performed routinely or selectively, and the exact role of IGLN dissection in prognosis are still unclear.

Risk prediction models have been widely used to inform doctors and patients about the risk of disease, to identify high-risk populations, to predict survival and to guide therapeutic strategies (43, 44). However, an important concern is that a model that performs well on the training dataset may not perform well when it is applied to another dataset. Model overfitting occurs when too many variables are included (45). Additionally, an established model that uses too many variables is difficult to use. The methods generally used in the past to select variables were univariable screening and stepwise selection. However, after univariable screening or stepwise selection, the model may still include too many variables and be susceptible to overfitting, especially when the sample size is large. A useful and simple way to reduce overfitting is penalized regression. Two popular penalized regression methods are ridge and LASSO (45). Unlike ridge, LASSO can perform variable selection by shrinking coefficients to exactly zero. Ambler et al. reported that LASSO is superior to backward elimination and univariable screening when performing variable selection (46). Our results indicated that the 10-variable model performed as well as the all-variable models in the stage I/II group, whereas the univariable analysis suggested that all the variables should be included in the model. We effectively reduced the complexity of the developed models by performing variable selection using LASSO.

To our knowledge, this is the first study to develop stage-specific prediction models considering the different effects of the laterality of the primary tumor on colon cancer prognosis at different stages. Our study has several strengths. First, we used a large-scale nationwide cohort from the SEER program, a high-quality and reliable database, which allowed us to adequately adjust for confounding factors for survival, such as demographic, clinico-pathologic and therapeutic information. Second, we established an external validation dataset from a population of a different race, region and socioeconomic environment to validate and evaluate our developed stage-specific prediction models. Finally, to account for the competing risk of death due to other causes, we used a competing risk regression model to calculate SHRs and adjusted for confounding factors.

This study has several limitations that could be sources of bias. First, several pathological variables, such as microsatellite status, KRAS mutation and BRAF mutation, are unavailable in the SEER database. Although laterality is prognostic, it might be a result of, and a surrogate for, differences in molecular factors and/or cancer subtypes between the right-sided and left-sided colon. Although several studies have indicated that laterality is a prognostic factor independent of molecular factors and cancer subtypes, the relationship between laterality and molecular markers should be evaluated when developing prediction models (47). Recently, the SEER program has required registries to report, as far as possible, the status of some important molecular markers, such as MSI, KRAS and BRAF, and may provide these data in the future. Second, disease-free survival (DFS) data are not captured in the SEER database, so recurrence and metastasis, which are important outcomes, could not be evaluated. Third, detailed information on adjuvant therapy, such as the regimen, dose and completion of chemotherapy and the dose of radiotherapy, was unavailable. Finally, external validation cohorts from other regions and races are needed to further validate the stage-specific prediction models developed in this study.

Conclusions

This study demonstrated that the laterality of the primary tumor affects prognosis in patients with colon cancer, but the effects are contradictory in patients at different stages. Right-sided colon cancer patients have better survival than left-sided colon cancer patients in stage I/II, whereas they have worse survival in stage III. We developed stage-specific prediction models that account for these contradictory effects of laterality to predict cancer-specific survival more precisely. In internal and external validation sets, the stage-specific prediction models showed better prediction ability than the unified models and may guide treatment decisions for colon cancer patients.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

The studies involving human participants were reviewed and approved by the institutional review board of the First Affiliated Hospital of Xi’an Jiaotong University. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

Author contributions

CH and FS were responsible for the design of the study, statistical analysis, and drafting and revising of the manuscript; ZZ, LZ, RL, and XS provided critical comments and review of the manuscript; JS and LSZ designed and supervised the study and revised the manuscript. All authors contributed to the article and approved the submitted version.

Funding

This project was supported by the National Natural Science Foundation of China (No. 81870380 and 82173394) and the Shaanxi Province Science Foundation (2020ZDLSF01-03 and 2020KWZ-020).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2022.1024467/full#supplementary-material

References

1. Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin (2021) 71:209–49. doi: 10.3322/caac.21660

2. Petrelli F, Tomasello G, Borgonovo K, Ghidini M, Turati L, Dallera P, et al. Prognostic survival associated with left-sided vs right-sided colon cancer: A systematic review and meta-analysis. JAMA Oncol (2017) 3:211–9. doi: 10.1001/jamaoncol.2016.4227

3. Lee MS, Menter DG, Kopetz S. Right versus left colon cancer biology: Integrating the consensus molecular subtypes. J Natl Compr Cancer Netw (2017) 15:411–9. doi: 10.6004/jnccn.2017.0038

4. Margonis GA, Amini N, Buettner S, Kim Y, Wang J, Andreatos N, et al. The prognostic impact of primary tumor site differs according to the KRAS mutational status: A study by the international genetic consortium for colorectal liver metastasis. Ann Surg (2021) 273:1165–72. doi: 10.1097/SLA.0000000000003504

5. Tejpar S, Stintzing S, Ciardiello F, Tabernero J, Van Cutsem E, Beier F, et al. Prognostic and predictive relevance of primary tumor location in patients with RAS wild-type metastatic colorectal cancer: Retrospective analyses of the CRYSTAL and FIRE-3 trials. JAMA Oncol (2017) 3:194–201. doi: 10.1001/jamaoncol.2016.3797

6. Weiss JM, Pfau PR, O'connor ES, King J, Loconte N, Kennedy G, et al. Mortality by stage for right- versus left-sided colon cancer: Analysis of surveillance, epidemiology, and end results-Medicare data. J Clin Oncol (2011) 29:4401–9. doi: 10.1200/JCO.2011.36.4414

7. Ulanja MB, Rishi M, Beutler BD, Sharma M, Patterson DR, Gullapalli N, et al. Colon cancer sidedness, presentation, and survival at different stages. J Oncol (2019) 2019:4315032. doi: 10.1155/2019/4315032

8. Warschkow R, Sulz MC, Marti L, Tarantino I, Schmied BM, Cerny T, et al. Better survival in right-sided versus left-sided stage I - III colon cancer patients. BMC Cancer (2016) 16:554. doi: 10.1186/s12885-016-2412-0

9. Li YQ, Feng Y, Dai WX, Li QG, Ca SJ, Peng JJ. Prognostic effect of tumor sidedness in colorectal cancer: A SEER-based analysis. Clin Colorectal Cancer (2019) 18:E104–16. doi: 10.1016/j.clcc.2018.10.005

10. Fine JP, Gray RJ. A proportional hazards model for the subdistribution of a competing risk. J Am Stat Assoc (1999) 94:496–509. doi: 10.1080/01621459.1999.10474144

11. Akaike H. New look at statistical-model identification. IEEE Trans Automatic Control (1974) 19:716–23. doi: 10.1109/TAC.1974.1100705

12. Harrell FE, Lee KL, Mark DB. Multivariable prognostic models: Issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med (1996) 15:361–87. doi: 10.1002/(SICI)1097-0258(19960229)15:4<361::AID-SIM168>3.0.CO;2-4

13. Camp RL, Dolled-Filhart M, Rimm DL. X-Tile: A new bio-informatics tool for biomarker assessment and outcome-based cut-point optimization. Clin Cancer Res (2004) 10:7252–9. doi: 10.1158/1078-0432.CCR-04-0713

14. Kishiki T, Kuchta K, Matsuoka H, Kojima K, Asou N, Beniya A, et al. The impact of tumor location on the biological and oncological differences of colon cancer: Multi-institutional propensity score-matched study. Am J Surg (2019) 217:46–52. doi: 10.1016/j.amjsurg.2018.07.005

15. Kennecke HF, Yin Y, Davies JM, Speers CH, Cheung WY, Lee-Ying R. Prognostic effect of sidedness in early stage versus advanced colon cancer. Health Sci Rep (2018) 1:e54. doi: 10.1002/hsr2.54

16. Huang ZS, Wu JW, Li Y, Lin YH, Li XY. Effect of sidedness on survival among patients with early-stage colon cancer: A SEER-based propensity score matching analysis. World J Surg Oncol (2021) 19:127. doi: 10.1186/s12957-021-02240-3

17. Turner MC, Becerra D, Sun Z, Watson J, Leung K, Migaly J, et al. The side of the primary tumor affects overall survival in colon adenocarcinoma: An analysis of the national cancer database. Techniques Coloproctology (2019) 23:537–44. doi: 10.1007/s10151-019-01997-w

18. Gervaz P, Bucher P, Morel P. Two colons-two cancers: Paradigm shift and clinical implications. J Surg Oncol (2004) 88:261–6. doi: 10.1002/jso.20156

19. Zaanan A, Shi Q, Taieb J, Alberts SR, Meyers JP, Smyrk TC, et al. Role of deficient DNA mismatch repair status in patients with stage III colon cancer treated with FOLFOX adjuvant chemotherapy: A pooled analysis from 2 randomized clinical trials. JAMA Oncol (2018) 4:379–83. doi: 10.1001/jamaoncol.2017.2899

20. Jasmine F, Haq Z, Kamal M, Raza M, Da Silva G, Gorospe K, et al. Interaction between microsatellite instability (MSI) and tumor DNA methylation in the pathogenesis of colorectal carcinoma. Cancers (2021) 13:4956. doi: 10.3390/cancers13194956

21. Song YL, Wang LL, Ran WW, Li GQ, Xiao YJ, Wang XN, et al. Effect of tumor location on clinicopathological and molecular markers in colorectal cancer in Eastern China patients: An analysis of 2,356 cases. Front Genet (2020) 11. doi: 10.3389/fgene.2020.00096

22. Muzny DM, Bainbridge MN, Chang K, Dinh HH, Drummond JA, Fowler G, et al. Comprehensive molecular characterization of human colon and rectal cancer. Nature (2012) 487:330–7. doi: 10.1038/nature11252

23. Sinicrope FA, Rego RL, Foster N, Sargent DJ, Windschitl HE, Burgart LJ, et al. Microsatellite instability accounts for tumor site-related differences in clinicopathologic variables and prognosis in human colon cancers. Am J Gastroenterol (2006) 101:2818–25. doi: 10.1111/j.1572-0241.2006.00845.x

24. Benatti P, Gafa R, Barana D, Marino M, Scarselli A, Pedroni M, et al. Microsatellite instability and colorectal cancer prognosis. Clin Cancer Res (2005) 11:8332–40. doi: 10.1158/1078-0432.CCR-05-1030

25. Taieb J, Shi Q, Pederson L, Alberts S, Wolmark N, Van Cutsem E, et al. Prognosis of microsatellite instability and/or mismatch repair deficiency stage III colon cancer patients after disease recurrence following adjuvant treatment: results of an ACCENT pooled analysis of seven studies. Ann Oncol (2019) 30:1466–71. doi: 10.1093/annonc/mdz208

26. Venderbosch S, Nagtegaal ID, Maughan TS, Smith CG, Cheadle JP, Fisher D, et al. Mismatch repair status and BRAF mutation status in metastatic colorectal cancer patients: A pooled analysis of the CAIRO, CAIRO2, COIN, and FOCUS studies. Clin Cancer Res (2014) 20:5322–30. doi: 10.1158/1078-0432.CCR-14-0332

27. Tran B, Kopetz S, Tie J, Gibbs P, Jiang ZQ, Lieu CH, et al. Impact of BRAF mutation and microsatellite instability on the pattern of metastatic spread and prognosis in metastatic colorectal cancer. Cancer (2011) 117:4623–32. doi: 10.1002/cncr.26086

28. Lochhead P, Kuchiba A, Imamura Y, Liao XY, Yamauchi M, Nishihara R, et al. Microsatellite instability and BRAF mutation testing in colorectal cancer prognostication. J Natl Cancer Inst (2013) 105:1151–6. doi: 10.1093/jnci/djt173

29. Tol J, Nagtegaal ID, Punt CJA. BRAF mutation in metastatic colorectal cancer. New Engl J Med (2009) 361:98–9. doi: 10.1056/NEJMc0904160

30. Gonsalves WI, Mahoney MR, Sargent DJ, Nelson GD, Alberts SR, Sinicrope FA, et al. Patient and tumor characteristics and BRAF and KRAS mutations in colon cancer, NCCTG/Alliance N0147. J Natl Cancer Inst (2014) 106:dju106. doi: 10.1093/jnci/dju106

31. Schell MJ, Yang ML, Teer JK, Lo FY, Madan A, Coppola D, et al. A multigene mutation classification of 468 colorectal cancers reveals a prognostic role for APC. Nat Commun (2016) 7:11743. doi: 10.1038/ncomms11743

32. Piozzi GN, Rusli SM, Baek SJ, Kwak JM, Kim J, Kim SH. Infrapyloric and gastroepiploic node dissection for hepatic flexure and transverse colon cancer: A systematic review. Eur J Surg Oncol (2022) 48:718–26. doi: 10.1016/j.ejso.2021.12.005

33. Stelzner S, Hohenberger W, Weber K, West NP, Witzigmann H, Wedel T. Anatomy of the transverse colon revisited with respect to complete mesocolic excision and possible pathways of aberrant lymphatic tumor spread. Int J Colorectal Dis (2016) 31:377–84. doi: 10.1007/s00384-015-2434-0

34. Toyota S, Ohta H, Anazawa S. Rationale for extent of lymph-node dissection for right colon-cancer. Dis Colon Rectum (1995) 38:705–11. doi: 10.1007/BF02048026

35. Feng B, Sun J, Ling TL, Lu AG, Wang ML, Chen XY, et al. Laparoscopic complete mesocolic excision (CME) with medial access for right-hemi colon cancer: feasibility and technical strategies. Surg Endoscopy Other Interventional Techniques (2012) 26:3669–75. doi: 10.1007/s00464-012-2435-9

36. Feng B, Ling TL, Lu AG, Wang ML, Ma JJ, Li JW, et al. Completely medial versus hybrid medial approach for laparoscopic complete mesocolic excision in right hemicolon cancer. Surg Endoscopy Other Interventional Techniques (2014) 28:477–83. doi: 10.1007/s00464-013-3225-8

37. Bertelsen CA, Bols B, Ingeholm P, Jansen JE, Jepsen LV, Kristensen B, et al. Lymph node metastases in the gastrocolic ligament in patients with colon cancer. Dis Colon Rectum (2014) 57:839–45. doi: 10.1097/DCR.0000000000000144

38. Perrakis A, Weber K, Merkel S, Matzel K, Agaimy A, Gebbert C, et al. Lymph node metastasis of carcinomas of transverse colon including flexures. Consideration of the extramesocolic lymph node stations. Int J Colorectal Dis (2014) 29:1223–9. doi: 10.1007/s00384-014-1971-2

39. Uematsu D, Akiyama G, Sugihara T, Magishi A, Yamaguchi T, Sano T. Laparoscopic radical lymph node dissection for advanced colon cancer close to the hepatic flexure. Asian J Endoscopic Surg (2017) 10:23–7. doi: 10.1111/ases.12311

40. Yuksel BC, Er S, Cetinkaya E, Aslar AK. Does transverse colon cancer spread to the extramesocolic lymph node stations? Acta Chirurgica Belgica (2021) 121:102–8. doi: 10.1080/00015458.2019.1689642

41. Sun YM, Zhang DS, Feng YF, Wang Y, Xu ZW, Tang JW, et al. Infrapyloric lymph node dissection in right hemicolectomy for colon cancer: Should prophylactic resection be recommended? J Surg Oncol (2021) 123:S30–5. doi: 10.1002/jso.26388

42. Wang XJ, Huang SH, Lu XR, Huang Y, Chi P. Incidence of and risk factors for gastroepiploic lymph node involvement in patients with cancer of the transverse colon including the hepatic flexure. World J Surg (2021) 45:1514–25. doi: 10.1007/s00268-020-05933-0

43. Moons KGM, Royston P, Vergouwe Y, Grobbee DE, Altman DG. Prognosis and prognostic research: What, why, and how? BMJ (2009) 338:b375. doi: 10.1136/bmj.b375

44. Moons KGM, Kengne AP, Woodward M, Royston P, Vergouwe Y, Altman DG, et al. Risk prediction models: I. development, internal validation, and assessing the incremental value of a new (bio)marker. Heart (2012) 98:683–90. doi: 10.1136/heartjnl-2011-301246

45. Pavlou M, Ambler G, Seaman S, De Iorio M, Omar RZ. Review and evaluation of penalised regression methods for risk prediction in low-dimensional data with few events. Stat Med (2016) 35:1159–77. doi: 10.1002/sim.6782

46. Ambler G, Seaman S, Omar RZ. An evaluation of penalised survival methods for developing prognostic models with rare events. Stat Med (2012) 31:1150–61. doi: 10.1002/sim.4371

47. Venook AP, Ou FS, Lenz HJ, Kabbarah O, Qu XP, Niedzwiecki D, et al. Primary (1°) tumor location as an independent prognostic marker from molecular features for overall survival (OS) in patients (pts) with metastatic colorectal cancer (mCRC): Analysis of CALGB/SWOG 80405 (Alliance). J Clin Oncol (2017) 35:3503. doi: 10.1200/JCO.2017.35.15_suppl.3503

Keywords: colorectal cancer, SEER, risk model, nomogram, survival

Citation: Hu C, Shi F, Zhang Z, Zhang L, Liu R, Sun X, Zheng L and She J (2022) Development and validation of a new stage-specific nomogram model for predicting cancer-specific survival in patients in different stages of colon cancer: A SEER population-based study and external validation. Front. Oncol. 12:1024467. doi: 10.3389/fonc.2022.1024467

Received: 21 August 2022; Accepted: 21 November 2022;
Published: 07 December 2022.

Edited by:

Vincent C. H. Chung, The Chinese University of Hong Kong, China

Reviewed by:

Lingchen Wang, University of Nevada, Reno, United States
Zi-Xiang Ye, Peking University, China

Copyright © 2022 Hu, Shi, Zhang, Zhang, Liu, Sun, Zheng and She. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Junjun She, junjunshe1975@sina.com; Liansheng Zheng, shenglianzheng@163.com

^†These authors have contributed equally to this work and share first authorship

^‡These authors have contributed equally to this work and share senior authorship
