Cochrane Database of Systematic Reviews Protocol - Intervention

Preoperative chemoradiotherapy versus chemotherapy for adenocarcinoma of the esophagus and esophagogastric junction (AEG): systematic review with individual participant data (IPD) network meta‐analysis (NMA)

Abstract

Objectives

This is a protocol for a Cochrane Review (intervention). The objectives are as follows:

To compare and rank preoperative chemoradiotherapy and chemotherapy treatment modalities for patients with resectable adenocarcinoma of the esophagus and esophagogastric junction, in terms of overall survival and other patient‐relevant outcomes, by conducting a network meta‐analysis.

Background

Please see Appendix 1 for a glossary of terms.

Description of the condition

Esophageal cancer is the ninth most‐common cancer worldwide and ranks sixth in terms of cancer mortality (Lagergren 2017). For 2018, the age‐standardized incidence of esophageal cancer in Europe was estimated to be 8.7/100,000 in men and 1.9/100,000 in women, corresponding to 52,970 new cases. Estimated age‐standardized mortality was 4.1/100,000, corresponding to 45,060 deaths annually (Ferlay 2018). For 2025, esophageal cancer incidence in Europe is estimated to reach 57,321 new cases annually, and mortality is predicted to reach 48,978 deaths annually (Ferlay 2018).

Esophageal cancer comprises two histologies, squamous cell carcinoma and adenocarcinoma. In the Western world, due to changes in risk factor prevalence, the incidence of squamous cell carcinoma has markedly declined while that of adenocarcinoma has steadily risen (Gupta 2017). In terms of clinical management, esophageal adenocarcinoma and esophagogastric junction tumors are commonly regarded as one entity, which will subsequently be referred to as adenocarcinoma of the esophagogastric junction (AEG). Prognosis for patients with AEG remains poor, particularly in advanced stages. According to a pan‐European study, five‐year survival rates range from 36.9% for locally confined tumors without nodal spread, to 9.6% for node‐positive disease and 2.6% for metastatic disease (Gavin 2012).

Description of the intervention

In patients with AEG, surgery alone achieves poor oncological outcomes and long‐term survival is limited (DeMeester 2006). This has spurred the development of multimodal therapies. There is now substantial evidence that, for locally advanced tumors, preoperative multimodal treatment (i.e. chemotherapy or chemoradiotherapy) improves overall survival when compared to surgery alone (Ronellenfitsch 2013a; Shapiro 2015; Sjoquist 2011; van Hagen 2012). This approach is recommended by national and international guidelines (Lordick 2016; Leitlinienprogramm Onkologie 2019; NCCN 2021). Preoperative treatment is preferred to postoperative because it increases the likelihood of complete resection and leads to better functional outcomes. In addition, many patients are unable to begin or sustain postoperative treatment within the recommended time interval, due to perioperative complications or poor postoperative performance status (Biffi 2010; Ronellenfitsch 2013b).

How the intervention might work

Preoperative chemotherapy is believed to eliminate micrometastases. Moreover, it potentially downstages the tumor by reducing its size, thus facilitating safe complete resection (Garg 2016). The addition of radiotherapy to chemotherapy, with the latter also acting as a radiosensitizer, might exert even larger effects in terms of downstaging the tumor and eliminating tumor cells in locoregional lymph nodes. Consequently, both interventions result in higher rates of R0 resections (microscopically tumor‐free resection margins), a lower probability of locoregional and distant recurrences, and prolonged survival, compared to surgery alone (Giacopuzzi 2017; Ronellenfitsch 2013a).

Why it is important to do this review

From the currently available evidence, it remains unclear whether preoperative chemoradiotherapy or preoperative chemotherapy achieves better outcomes in patients with AEG. Both approaches have individually been found to achieve better survival rates than surgery alone in several randomized controlled trials (RCTs) (Allum 2009; Cunningham 2006; Shapiro 2015; Tepper 2008; van Hagen 2012; Walsh 1996; Ychou 2011) and meta‐analyses (Fu 2015; Ronellenfitsch 2013a; Sjoquist 2011). A randomized head‐to‐head comparison between preoperative chemoradiotherapy and chemotherapy has only been performed in three trials (Burmeister 2011; Stahl 2017; von Döbeln 2019). A meta‐analysis included two of these trials (Sjoquist 2011). Pooling aggregate data from direct and indirect comparisons suggested longer survival following chemoradiotherapy, but the difference did not reach statistical significance. The authors also included trials of participants with squamous cell carcinoma, and the result is prone to bias because they neither used individual participant data nor conducted a network meta‐analysis (NMA), which is the gold standard for comparing two or more interventions with a common comparator.

Three ongoing RCTs, known as TOPGEAR (Leong 2015), Neo‐AEGIS (Reynolds 2017) and POWERRANGER (NCT01404156), compare preoperative chemoradiotherapy with chemotherapy. TOPGEAR (Leong 2015) includes junction and gastric, but not esophageal, adenocarcinoma; the latter two RCTs include only esophageal and junction tumors. Survival results from these trials will not be available in the near future. In Germany, two trials have been initiated which compare preoperative chemoradiotherapy with chemotherapy for AEG: the ESOPEC trial (Hoeppner 2016) and the RACE trial (AIO‐STO‐0118). Results from these trials will not be available for a number of years. Therefore, an individual participant data network meta‐analysis (IPD NMA) is required to compare preoperative chemoradiotherapy versus preoperative chemotherapy for patients with adenocarcinoma of the esophagus and esophagogastric junction, using all available evidence from randomized trials to conduct a valid comparison between these two approaches.

Objectives

To compare and rank preoperative chemoradiotherapy and chemotherapy treatment modalities for patients with resectable adenocarcinoma of the esophagus and esophagogastric junction, in terms of overall survival and other patient‐relevant outcomes, by conducting a network meta‐analysis.

Methods

Criteria for considering studies for this review

Types of studies

In this systematic review with IPD NMA, we will only include RCTs. Due to the specific interventions and comparator under investigation, blinding of the participant and the treating physician is technically or ethically impossible (or both) and is therefore not considered an inclusion or exclusion criterion. There will be no restrictions regarding minimal time of follow‐up or number of included participants. We will include studies reported as full text, those published as abstract only, and unpublished data.

Types of participants

To be included in the review, trials must be conducted in participants who:

  • are diagnosed with histologically confirmed adenocarcinoma of the oesophagus or gastroesophageal junction (for studies including participants with both adenocarcinoma and other histological entities like squamous cell carcinoma, we will seek to obtain IPD or aggregate measures relating to participants with adenocarcinoma only);

  • are previously untreated;

  • have resectable tumours, based on staging; and

  • have no distant or peritoneal metastases.

In NMA, transitivity — also called similarity (Donegan 2010) or exchangeability (Dias 2018) — is required for a valid estimation (Salanti 2009). The transitivity assumption states that the choice of treatment comparisons in a study is not associated with the true relative effectiveness of the interventions, or that the patients observed to be treated with a certain intervention could, in principle, have been randomized to any other of the included treatments (Salanti 2012). In the current setting, all three of the included treatments are legitimate treatment alternatives that are not systematically applied to participants of different demographics or morbidities. In particular, all of the treatment options are commonly used in participants with resectable tumours and choice of the treatment is not directly related to tumour stage. A three‐armed RCT comparing all of the included interventions simultaneously is thus, in theory, possible.

Types of interventions

To be included in this NMA, trials have to either compare: intervention 1 directly with intervention 2 (below); one of the two interventions with surgery alone; or all three treatment options, in a three‐armed design.

  • Intervention 1: preoperative chemoradiotherapy followed by surgery. Chemoradiotherapy refers to a treatment with any kind of cytotoxic/antineoplastic drug or a combination of several of these drugs, in a sequential or concurrent combination with external beam radiotherapy. Trials will be included regardless of radiation dose, planning target volume, radiation technique, and possible postoperative continuation of chemotherapy or chemoradiotherapy.

  • Intervention 2: preoperative chemotherapy followed by surgery. Chemotherapy refers to a treatment with any kind of cytotoxic/antineoplastic drug or a combination of several of these drugs. Trials will be included regardless of possible postoperative continuation of chemotherapy or chemoradiotherapy.

For both interventions, the exact surgical approach (e.g. transhiatal/transthoracic, open/minimally‐invasive) and extent of lymphadenectomy stipulated in the single trials will not be regarded as inclusion or exclusion criteria. The definition of what constitutes a node in the network is not always straightforward (James 2018). We expect to identify trials with slight variation in dose or regime in chemotherapy or chemoradiotherapy as there is no established standard regime that might be used in all trials. As we are mainly interested in comparing the different types of treatment and do not expect to identify systematic differences between trials, we will lump variations in dose and regime within the same type of treatment. If we identify systematic differences in the retrieved trials with regard to chemotherapy or chemoradiotherapy (e.g. if trials compare low doses to high doses of the same intervention), we will consider a more expanded network where we define a node as treatment*variant interaction as a secondary analysis. We will not consider trials that include co‐interventions. Therefore, three nodes can be defined:

  • neoadjuvant chemotherapy followed by surgery;

  • neoadjuvant chemoradiotherapy followed by surgery; and

  • surgery alone.

Types of outcome measures

Primary outcomes

No primary outcomes will be defined a priori; instead, patients' representatives will rank the subjective importance of the available outcomes. Regardless of this ranking, the different outcomes reflect different aspects of treatment for the disease under study. The review authors will state any pre‐defined outcomes which were not reported in the single publications.

  • Overall survival, defined as the time from randomization until death from any cause.

  • Disease‐free survival, defined as the time from randomization until recurrence or death from any cause.

  • Local‐recurrence‐free survival, defined as the time from randomization until local recurrence.

  • Distant‐recurrence‐free survival, defined as the time from randomization until distant recurrence.

  • In participants who underwent preoperative treatment: toxicity of preoperative treatment, measured according to National Cancer Institute Common Terminology Criteria for Adverse Events (NCI CTCAE) and late effects on normal tissues, in subjective, objective, management and analytic categories (LENT‐SOMA).

  • Postoperative mortality.

  • Postoperative morbidity.

  • Achievement of tumour‐free resection margins (R0 resectability).

  • Pathological tumour stage at resection, according to the International Union against Cancer Tumour Node Metastasis (UICC TNM) classification.

  • Rate of pathological complete response (pCR).

  • Quality of life, measured with specific assessment tools, e.g. Core Quality of Life Questionnaire (QLQ‐C30), Quality of Life Questionnaire in patients with tumours of the oesophagus, oesophago‐gastric junction or stomach (QLQ‐OES24/OG25).

Secondary outcomes

Likewise, no secondary outcomes will be defined a priori (see primary outcomes).

Search methods for identification of studies

We will design the search strategies with the help of the Cochrane Gut Information Specialist before performing literature searches. No restrictions will be placed on the language of publication when searching the electronic databases or reviewing reference lists in identified studies.

Electronic searches

We will conduct a literature search to identify all published and unpublished RCTs in all languages. We will translate papers not written in English and fully assess them for potential inclusion in the review as necessary.

We will search the following electronic databases:

  • PubMed (1966 to present; Appendix 2);

  • Cochrane Library (inception to present; Appendix 3);

  • CINAHL (1982 to present; Appendix 4); and

  • ClinicalTrials.gov.

We will not conduct a literature search in EMBASE, since barrier‐free access is not available in Germany.

Searching other resources

We will check the reference lists of all primary studies and review articles for additional references. We will contact authors of identified trials and ask them to identify other published and unpublished studies. We will also contact manufacturers and experts in the field. We will search for errata or retractions from eligible trials on PubMed (www.ncbi.nlm.nih.gov/pubmed) and report the date this was done in the review. We will search for grey literature on www.opengrey.eu and will conduct a search of clinical trial registers/trial result registers on the International Clinical Trials Registry Platform Search Portal.

Data collection and analysis

Selection of studies

Two review authors (UR, JF) will independently screen titles and abstracts for inclusion. All of the potential studies we identify as a result of the search will be coded as either 'retrieve' (eligible, potentially eligible, or unclear) or 'do not retrieve'. We will retrieve the full text of potentially eligible studies and two review authors (UR, JF) will independently screen the full text, and identify studies for inclusion. They will also identify and record reasons for exclusion of the ineligible studies. We will resolve any disagreement through discussion or, if required, we will consult a third author (CM). We will identify and exclude duplicates and collate multiple reports of the same study, so that each study rather than each report is the unit of interest in the review. We will record the selection process in sufficient detail to complete a PRISMA flow diagram and 'Characteristics of excluded studies' table.

Data extraction and management

Published aggregate data

We will use a standardised data collection form for study characteristics and outcome data, which will be piloted on at least one study included in the review. One review author (JF) will independently extract study characteristics from included studies. We will extract the following study characteristics and results.

  • General study information: title, authors, contact address, funding source, language, publication status, year of publication, place(s) and year(s) of study conduction.

  • Study design issues: inclusion/exclusion criteria, randomization, risk of bias, length of study/follow‐up period.

  • Baseline characteristics of participants: size of intervention and comparison group, and for each group the distribution of age, sex, co‐morbidity (measured, if given, as World Health Organisation (WHO) performance status or American Society of Anesthesiologists classification), histology (adenocarcinoma/squamous cell carcinoma), tumour location (esophagus, gastroesophageal junction), tumour stage (Tumour Node Metastasis (TNM) stage and Union for International Cancer Control (UICC) stage), administration of preoperative and adjuvant therapies.

  • Characteristics of the intervention: details of applied chemotherapy/chemoradiotherapy (including drug dosages, radiotherapy dosages, radiotherapy modality, etc.).

  • Loss to follow‐up.

  • Hazard ratios (HRs) both for overall and, if available, disease‐free, local‐recurrence‐free and distant‐recurrence‐free survival.

  • Toxicity of preoperative treatment, measured according to NCI CTCAE and LENT‐SOMA.

  • Postoperative in‐hospital mortality.

  • Postoperative morbidity (any complication that would be classified as Clavien‐Dindo grade I to IV).

  • Completeness of resection margins (R0/R1/R2).

  • Pathological tumour stage at resection, as assessed from the surgical specimen according to the UICC TNM classification.

  • Quality of life, as measured within the single trial.

  • Notes: funding for trial, notable conflicts of interest of trial authors.

Individual participant data (IPD)

For each study, IPD will be solicited from the respective trialists. The requested variables are as follows.

  • Age at randomization.

  • Sex.

  • Histological type (adenocarcinoma/squamous cell carcinoma).

  • Site of tumour (oesophagus/cardia).

  • Allocated treatment arm (surgery alone, chemotherapy, chemoradiotherapy).

  • Date of randomization.

  • Date of surgery, begin and end date of chemotherapy/chemoradiotherapy.

  • WHO/Eastern Cooperative Oncology Group (ECOG) performance status.

  • American Society of Anesthesiologists class.

  • Date of last contact/follow‐up.

  • Vital status (alive/dead).

  • Cause of death.

  • Lost to follow up (yes/no).

  • Postoperative death.

  • Postoperative complications (severity according to Clavien‐Dindo and type).

  • TNM pT (pathological tumour) and pN (pathological nodes) stage at resection.

  • Completeness of margins (R0/R1/R2).

  • Pathological regression.

  • Pre‐treatment TNM T and N stage.

  • Features of adjuvant therapy (none, chemotherapy, chemoradiotherapy).

  • Toxicity of preoperative treatment (according to NCI CTCAE and LENT‐SOMA).

  • Reasons why surgery was not done.

  • Quality of life at time points defined in the single trials.

  • Date and site (local/distant/both) of first recurrence.

Data will be requested for all randomized participants in the trial (intention‐to‐treat population). Trialists will be asked to provide the most complete and updated follow‐up data available, even if follow‐up is longer than that used for the pertinent publication.

All data will be entered in a dedicated database. One review author (JF) will copy across the data from the data collection form into the Review Manager 5 file (Review Manager 2020). We will double‐check that the data are entered correctly by comparing the study reports with how the data are presented in the systematic review. A second review author will spot‐check study characteristics for accuracy against the trial report.

Quality control

The quality of submitted IPD from the single trials will be assessed in several ways, as follows. Any detected inconsistencies will be clarified with the respective trialists and missing data will be requested.

  • IPD will be compared with the intention‐to‐treat population reported in publications.

  • Datasets will be screened for obvious duplicates or omissions (e.g. checking participants' IDs).

  • Plausibility of the values supplied for each variable will be checked by looking for extreme outliers.

  • Summary measures calculated from the dataset will be compared with corresponding results in publications.

  • Overall and disease‐free survival of the different treatment groups in each trial will be derived using the Kaplan‐Meier method and standard Cox regression analysis, and compared with published estimates.

  • Completeness and equality of follow‐up in the two study arms will be checked by plotting a 'reverse' Kaplan‐Meier curve considering censored participants as participants who incurred the outcome (Stewart 1995); in addition the median follow‐up time will be evaluated.
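The 'reverse' Kaplan‐Meier check in the last item can be sketched as follows. This is a minimal illustration in plain Python, not the planned analysis code: the function names `kaplan_meier` and `median_follow_up` and the input layout (one follow‐up time and one death indicator per participant) are assumptions made for this example. The idea is to swap the roles of death and censoring, so that the resulting curve describes how long participants remained under observation.

```python
def kaplan_meier(times, events):
    """Return the Kaplan-Meier steps as a list of (time, survival) pairs.

    times  : follow-up time for each participant
    events : 1 if the event of interest occurred at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all participants who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            steps.append((t, survival))
        at_risk -= n_leaving
    return steps


def median_follow_up(times, died):
    """Reverse Kaplan-Meier: censoring is treated as the event, death as censoring."""
    reversed_events = [1 - d for d in died]
    for t, s in kaplan_meier(times, reversed_events):
        if s <= 0.5:
            return t
    return None  # median follow-up not reached


# Hypothetical arm: five participants, two deaths (at times 2 and 6).
print(median_follow_up([2, 4, 6, 8, 10], [1, 0, 1, 0, 0]))
```

Running the function separately for each trial arm and comparing the two medians (or plotting both step functions) gives a quick view of whether follow‐up was similarly complete in the arms.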

Effect modifiers

  • Age

  • Sex

  • Co‐morbidity (measured, if given, as World Health Organization (WHO) performance status or American Society of Anesthesiologists (ASA) classification).

  • Tumour location (esophagus, gastroesophageal junction).

  • Tumour stage (TNM and UICC stage).

  • Characteristics of the intervention: details of applied chemotherapy/chemoradiotherapy (including drug dosages, radiotherapy dosages, radiotherapy modality, etc.).

Assessment of risk of bias in included studies

Two review authors (UR, JF) will independently assess the risk of bias for each included study, using the criteria outlined in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2021) and version 2 of the Cochrane 'Risk of bias' tool (RoB 2) (Sterne 2019). We will resolve any disagreement by discussion, or by involving a third review author (CM). We will assess the risk of bias according to the following domains:

  • bias arising from the randomization process;

  • bias due to deviations from intended interventions;

  • bias due to missing outcome data;

  • bias in measurement of the outcome; and

  • bias in selection of the reported result.

The effect of interest will be the effect of the assignment to the interventions at baseline, regardless of whether the interventions were actually received and adhered to as intended. The following results will be assessed using RoB 2:

  • overall survival, defined as the time from randomization until death of any cause;

  • disease‐free survival, defined as the time from randomization until recurrence or death of any cause;

  • postoperative mortality;

  • postoperative morbidity;

  • achievement of tumour‐free resection margins (R0 resectability);

  • pathological tumour stage at resection according to the UICC TNM classification;

  • rate of pathological complete response (pCR).

We will grade each potential source of bias as "high", "some concerns" or "low", and provide a quote from the study report and justification for our judgment in the 'Risk of bias' table. We will summarise the ‘Risk of bias’ judgments across studies for each of the domains listed. We will consider blinding separately for different key outcomes where necessary, e.g. for unblinded outcome assessment, risk of bias for all‐cause mortality may be very different than for a person‐reported pain scale. Where information on risk of bias relates to unpublished data or correspondence with a study author, we will note this in the 'Risk of bias' table. Overall risk of bias will be ascertained by using the signalling questions and algorithm provided by the RoB 2 tool.

The RoB 2 Excel tool to implement RoB 2 will be used to manage the assessment of bias. When considering treatment effects, the overall RoB 2 judgment will be used to inform the GRADE assessment.

Due to an expected large amount of data, we will provide only the consensus decisions for the signalling questions in the full review. The full data will be provided in a supplement.

Assessment of bias in conducting the systematic review

We will conduct the review according to this published protocol and report any deviations from it in the 'Differences between protocol and review' section of the systematic review.

Measures of treatment effect

We will analyse dichotomous data as odds ratios (ORs) with 95% confidence intervals (CIs). Mean difference (MD) with 95% CIs will be calculated for continuous data and the standardised mean difference (SMD) with 95% CIs will be calculated if continuous outcomes were assessed on different scales. In the context of this review we expect this to be the case for the quality‐of‐life outcome. We will analyse time‐to‐event outcomes as hazard ratios (HRs) with their 95% CIs. For ORs and HRs, computations will be carried out on a log scale and the results will be converted back into the original metric. We will ensure that higher scores for continuous outcomes have the same meaning for the particular outcome, explain the direction to the reader and report where the directions were reversed if this was necessary. For the SMD we will re‐express the obtained combined effect estimate on the scale of the instrument most commonly identified to facilitate interpretability, as described in Section 15.5.3.2 of the Cochrane Handbook (Schünemann 2021). We will undertake meta‐analyses only where this is meaningful, i.e. if the treatments, participants and the underlying clinical question are similar enough for pooling to make sense.
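To illustrate what computing on the log scale and converting back means for a dichotomous outcome, the following minimal sketch derives an odds ratio and its 95% CI from a 2×2 table. The function name `odds_ratio_ci` and the example counts are hypothetical, and the sketch assumes that no cell of the table is zero (a continuity correction or another method would be needed otherwise).

```python
import math


def odds_ratio_ci(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Odds ratio with a Wald confidence interval computed on the log scale.

    Assumes every cell of the 2x2 table is non-zero.
    """
    a, b = events_trt, n_trt - events_trt   # treatment arm: events / non-events
    c, d = events_ctl, n_ctl - events_ctl   # control arm: events / non-events
    log_or = math.log((a * d) / (b * c))
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return math.exp(log_or), lower, upper


# Hypothetical counts: 10/100 events under treatment, 20/100 under control.
or_, lower, upper = odds_ratio_ci(10, 100, 20, 100)
print(round(or_, 3), round(lower, 3), round(upper, 3))
```

The interval is symmetric around the log odds ratio but asymmetric once back‐transformed, which is why the limits are exponentiated separately rather than added to and subtracted from the odds ratio itself.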

A common way that trialists indicate they have skewed data is by reporting medians and interquartile ranges. When we encounter this, we will note if the data are skewed and consider the implication of this. If the data are skewed, we will not perform a meta‐analysis, but will provide a narrative summary instead.

Unit of analysis issues

If a trial reports different follow‐up times that are summarised into the same category, we will use the latest one. Additionally, if a trial reports multiple time points, we will use the categories described in the section of outcome assessment. Where multiple trial arms are reported in a single trial, we will include only the relevant arms. We will, however, indicate that additional arms were available in the 'Characteristics of included studies' table. If two comparisons (e.g. drug A versus placebo and drug B versus placebo) must be entered into the same meta‐analysis, we will use NMA methodology to adequately incorporate these. Cross‐over trials and cluster‐randomized trials are not relevant for the research question under study.

Dealing with missing data

We will contact investigators or study sponsors in order to verify key study characteristics and to obtain missing data, such as missing outcomes, missing summary data, and missing individuals (e.g. when a study is identified as abstract only). If we are unable to obtain missing statistics from the investigators or study sponsors, we will impute the mean based on the median and the range, or based on the median, lower and upper quartile values according to the formula proposed by Wan and colleagues (Wan 2014). For aggregated data, the standard deviation will be imputed from the range, lower and upper quartile values considering the sample sizes according to Wan (Wan 2014).
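The imputation of means and standard deviations from order statistics can be sketched as follows. The formulas are our transcription of the estimators commonly attributed to Wan 2014 and should be checked against the original paper before use; the function names are hypothetical. Only the Python standard library is used (`statistics.NormalDist` provides the normal quantile function).

```python
from statistics import NormalDist


def mean_sd_from_range(minimum, median, maximum, n):
    """Approximate mean and SD from the median, the range and the sample size."""
    mean = (minimum + 2 * median + maximum) / 4
    # Expected position of the extremes for a normal sample of size n.
    z = NormalDist().inv_cdf((n - 0.375) / (n + 0.25))
    sd = (maximum - minimum) / (2 * z)
    return mean, sd


def mean_sd_from_quartiles(q1, median, q3, n):
    """Approximate mean and SD from the median, the quartiles and the sample size."""
    mean = (q1 + median + q3) / 3
    z = NormalDist().inv_cdf((0.75 * n - 0.125) / (n + 0.25))
    sd = (q3 - q1) / (2 * z)
    return mean, sd


# Hypothetical quality-of-life score reported as median 50 (IQR 40 to 60), n = 100.
print(mean_sd_from_quartiles(40, 50, 60, 100))
```

Both estimators assume approximately normally distributed data, so they are least reliable exactly where medians tend to be reported, namely for skewed outcomes. This is one reason for the sensitivity analyses on imputed statistics.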

To obtain HRs for the meta‐analysis from aggregate data we will estimate the HR and its standard deviation for each study according to the methods described by Parmar (Parmar 1998) and Tierney (Tierney 2007). If results are only reported graphically, then we will estimate the values from these figures. We will assess the impact of including studies with imputed statistics by conducting sensitivity analyses. If we are unable to calculate the standard deviation based on the reported summary statistics, we will impute standard deviation as the highest standard deviation in the remaining trials included in the outcome, whilst being aware that this method of imputation will decrease the weight of the studies in the meta‐analysis of MD, and shift the effect towards no effect for SMD.
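The simplest of the aggregate‐data routes, recovering the log HR and its standard error from a published HR with its 95% CI, can be sketched as follows. This covers only one of several situations addressed in the cited methods papers (others start from P values, event counts or survival curves). The function name `log_hr_and_se` and the example values are hypothetical.

```python
import math


def log_hr_and_se(hr, ci_lower, ci_upper, z=1.96):
    """Recover log(HR) and its standard error from a reported HR and 95% CI.

    Assumes the interval was computed as exp(log HR +/- z * SE).
    """
    log_hr = math.log(hr)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)
    return log_hr, se


# Hypothetical publication reporting HR 0.80 (95% CI 0.64 to 1.00).
log_hr, se = log_hr_and_se(0.80, 0.64, 1.00)
print(round(log_hr, 4), round(se, 4))
```

A useful plausibility check is that the reported HR should lie close to the geometric mean of the two confidence limits; a large discrepancy suggests rounding problems or an interval that was not computed on the log scale.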

Assessment of heterogeneity

We will use the I² statistic, the P value from the Chi² test, and the between‐study heterogeneity τ² to assess heterogeneity among the trials in each analysis (Higgins 2003). If we identify substantial heterogeneity (I² greater than 50% to 60%, according to the Cochrane Handbook for Systematic Reviews of Interventions), we will investigate the reasons for it by performing subgroup analysis and meta‐regression (Higgins 2021). We will also assess heterogeneity by evaluating whether there is good overlap of CIs. We will take into account any statistical heterogeneity when interpreting the results.
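For orientation, the heterogeneity statistics named above can be computed from study‐level effects and variances as in the following sketch. It uses Cochran's Q with the DerSimonian‐Laird moment estimator of τ², a frequentist stand‐in chosen here for brevity; it is not the Bayesian estimate of between‐trial heterogeneity planned for the main analyses. The function name `heterogeneity` and the example data are hypothetical.

```python
def heterogeneity(effects, variances):
    """Return Cochran's Q, I^2 (in percent) and the DerSimonian-Laird tau^2.

    effects   : study-level estimates (e.g. log HRs or log ORs)
    variances : their within-study variances
    """
    if len(effects) < 2:
        raise ValueError("at least two studies are needed")
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)
    return q, i2, tau2


# Three hypothetical trials with equal precision and clearly different effects.
q, i2, tau2 = heterogeneity([0.0, 0.5, 1.0], [0.04, 0.04, 0.04])
print(round(q, 2), round(i2, 1), round(tau2, 3))
```

Under the null hypothesis of homogeneity, Q is referred to a Chi² distribution with k − 1 degrees of freedom (k being the number of trials), which yields the P value mentioned above.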

Assessment of reporting biases

If we are able to pool more than 10 trials, we will create and examine a comparison‐adjusted funnel plot to explore possible publication biases. We will use Egger's test to determine the statistical significance of the reporting bias (Egger 1997). We will consider a P value of less than 0.05 to represent statistically significant reporting bias.
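The regression underlying Egger's test can be sketched as follows: each study's standardised effect (estimate divided by its standard error) is regressed on its precision (one divided by the standard error), and an intercept far from zero indicates funnel plot asymmetry. This is a minimal ordinary least squares illustration for a single pairwise comparison. The function name `egger_intercept` and the example data are hypothetical, and the P value, which requires a t distribution with k − 2 degrees of freedom, is deliberately left to a statistics package.

```python
import math


def egger_intercept(effects, std_errors):
    """Return the Egger regression intercept and its standard error."""
    y = [e / s for e, s in zip(effects, std_errors)]  # standardised effects
    x = [1 / s for s in std_errors]                   # precisions
    k = len(y)
    if k < 3:
        raise ValueError("at least three studies are needed")
    mean_x, mean_y = sum(x) / k, sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_intercept = math.sqrt(residual_ss / (k - 2) * (1 / k + mean_x ** 2 / sxx))
    return intercept, se_intercept


# Hypothetical data in which smaller studies (larger SE) show larger effects.
ses = [0.1, 0.2, 0.25, 0.5]
effects = [0.2 + 1.5 * s for s in ses]
intercept, se_intercept = egger_intercept(effects, ses)
print(round(intercept, 3))
```

Dividing the intercept by its standard error gives the test statistic; with few trials the test has low power, which is consistent with applying it only when more than 10 trials can be pooled.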

Data synthesis

For all outcomes, we will combine IPD and aggregated data using the two‐stage method (Riley 2017). This implies that from studies where IPD are available, we will calculate the outcome measure, as defined above, from the provided data, using appropriate linear and generalised linear models adjusted for the same covariate if possible. For studies where IPD are not available, we will use the aggregate outcome measure provided in the pertinent publication, if available. We will extract adjusted and unadjusted values if both are provided. If for a given outcome a summary measure is not available from IPD or from publications, the respective study will not be included in the analysis, but will be reported narratively.

We will perform all meta‐analyses using R (R), as only pairwise analyses are implemented in Review Manager (Review Manager 2020). We will use a Bayesian random‐effects model (Dias 2014) because we expect non‐explainable heterogeneity. For between‐trial heterogeneity, we will use a half‐normal prior scaled to 0.5 in all analyses, as has been recommended in Friede (Friede 2017). To test the robustness of our findings to the choice of model, we will conduct sensitivity analyses for primary outcomes using the fixed‐effect model. In case of divergence between the two models, we will present both results; otherwise, we will present only results from the random‐effects model.

Pairwise meta‐analysis

We will perform Bayesian pairwise random‐effects meta‐analyses (Dias 2014) at the trial level for all end points. Each pairwise meta‐analytical comparison will be restricted to the corresponding trial results irrespective of whether a third treatment arm was investigated. Where IPD are available, we will calculate the outcome measures from the provided data. The analyses will be performed according to the intention‐to‐treat principle. For studies where IPD are not available, we will use the aggregated outcome measures provided in the pertinent publication. Pooled effect sizes will be estimated from the mean or median of the posterior distribution. We will estimate 95% credibility intervals from the 2.5th and 97.5th percentiles of the highest posterior density interval; these will not necessarily be symmetric.

The model will include random effects at the level of trials to account for possible variation between trials due to clinical heterogeneity. Clinical heterogeneity will be defined as the existence of inhomogeneous study populations or the variability of chemotherapy and chemoradiotherapy regimens. Statistical heterogeneity will be estimated from the median of the between‐trial standard deviation (τ) observed in the posterior distribution. A half‐normal prior (with scale 0.5 or 1.0) for log odds ratios will be used for τ, which we plan to adapt for other outcome types (Friede 2017). For all trial baselines and treatment effects, vague priors will be implemented. Adjustments for multiple testing are not planned in the pairwise meta‐analyses. Publication bias will be explored by evaluating funnel plot asymmetry, if a sufficient number of studies is available.

Network meta‐analysis using aggregate data

We will use a Bayesian random‐effects model for NMA, for the primary and secondary outcomes separately (Dias 2012; Dias 2014). This model preserves the comparison of randomized treatments within each trial while combining all available comparisons between the treatments (surgery alone, preoperative chemoradiotherapy, preoperative chemotherapy) and, if applicable, accounts for multiple comparisons within a trial when there are more than two treatment groups (Cooper 2006). As with the pairwise meta‐analysis, pooled effect sizes will be estimated from the mean or median of the posterior distribution, together with 95% credible intervals. The same priors will be used as for the pairwise meta‐analysis. The three treatment options will be ranked regarding the posterior probability of being the most successful treatment with respect to overall survival, summarised using the surface under the cumulative ranking curve (SUCRA). The total residual deviance and the deviance information criterion (DIC) will be given as goodness‐of‐fit measures. The consistency assumption will be tested using the method of Bucher, by comparing the effect size from direct comparisons within randomized trials and the effect size from indirect comparisons between randomized trials with one intervention in common (Bucher 1997). If there is evidence of inconsistency, we will additionally fit a Bayesian random‐effects network model that takes inconsistency into account (Dias 2014). Convergence of Markov chains will be checked by the Brooks‐Gelman‐Rubin statistic (Brooks‐Gelman 1998).
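The consistency check described above can be illustrated with a short sketch. With surgery alone as the common comparator, an indirect estimate of chemoradiotherapy versus chemotherapy is obtained by subtracting the two log effect estimates against surgery and adding their variances. The indirect estimate is then compared with the direct (head‐to‐head) estimate. The function names and all numerical inputs below are hypothetical, and a normal approximation is assumed for the test.

```python
import math
from statistics import NormalDist


def bucher_indirect(d_ac, se_ac, d_bc, se_bc):
    """Indirect estimate of A versus B through the common comparator C.

    d_ac, d_bc : log effect estimates (e.g. log HRs) of A vs C and B vs C
    """
    d_ab = d_ac - d_bc
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)
    return d_ab, se_ab


def inconsistency_test(d_direct, se_direct, d_indirect, se_indirect):
    """Difference between direct and indirect evidence with a two-sided z-test."""
    diff = d_direct - d_indirect
    se_diff = math.sqrt(se_direct ** 2 + se_indirect ** 2)
    z = diff / se_diff
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, z, p


# Hypothetical log HRs versus surgery alone (A = chemoradiotherapy, B = chemotherapy).
d_ind, se_ind = bucher_indirect(math.log(0.70), 0.10, math.log(0.80), 0.10)
# Hypothetical direct head-to-head estimate of A versus B.
diff, z, p = inconsistency_test(math.log(0.95), 0.20, d_ind, se_ind)
print(round(math.exp(d_ind), 3), round(z, 2), round(p, 3))
```

Because the variance of the indirect estimate is the sum of two variances, indirect evidence is always less precise than either of its components. The test therefore has limited power in a network with only three nodes, and a non‐significant result should not be read as proof of consistency.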

Network meta‐analysis using individual participant data and aggregate data

An NMA using all available IPD, as well as aggregate data from trials that do not provide IPD, will be our primary analysis for each outcome. As for conventional pairwise meta‐analysis, IPD from all included studies is preferred over aggregate data in an NMA. One of the main advantages of IPD over aggregate data is that the assessment of heterogeneity introduced by patient‐level covariates does not suffer from ecological bias. However, if IPD cannot be obtained from all included trials, the available evidence may be based partly on IPD and partly on aggregate data. Saramago, Jansen and Donegan have proposed Bayesian synthesis models for binary outcomes that incorporate both IPD and aggregate data and allow patient‐level covariates to be taken into account (Donegan 2012; Jansen 2012; Saramago 2012). Thom and colleagues published a Bayesian model for continuous outcomes, which incorporates both IPD and aggregate data and enables consideration of patient‐level covariates (Thom 2015). Saramago and colleagues subsequently extended their Bayesian synthesis model to time‐to‐event outcomes (Saramago 2014). Patient‐level covariates of special interest are sex, age, performance status, tumour location (esophagus, esophagogastric junction), and pretreatment tumour stage.

'Summary of findings' table

We will create a 'Summary of findings' table according to Cochrane methodology. Further information can be found at ‘Summary of findings and assessment of the certainty of the evidence’.

Subgroup analysis and investigation of heterogeneity

We plan to carry out the following subgroup analyses:

  • tumour location: esophagus versus cardia;

  • WHO/Eastern Cooperative Oncology Group (ECOG) performance status: 0 versus 1 versus 2 or higher;

  • age: less than 65 years, 65 to 75 years, over 75 years;

  • sex: male versus female;

  • surgical approach: transthoracic versus transhiatal; and

  • chemotherapeutic agents used in preoperative therapy: cisplatin/fluorouracil (5‐FU) versus other.

The following outcomes will be used in subgroup analysis:

  • overall survival; and

  • disease‐free survival.

We will use the formal Q test for subgroup difference to test for subgroup interactions. No meta‐regressions are planned.
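The Q test for subgroup differences can be sketched in Python as below. The snippet assumes that one pooled estimate and standard error per subgroup are already available and that the subgroups are independent; it compares the between‐subgroup Q statistic with a chi‐squared distribution on (number of subgroups − 1) degrees of freedom. Function names and input values are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper tail probability of a chi-squared distribution (integer df >= 1).

    Uses the closed forms for df = 1 and df = 2 and the recurrence
    Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1), with h = x / 2.
    """
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        a, sf = 1.0, math.exp(-half)             # df = 2
    else:
        a, sf = 0.5, math.erfc(math.sqrt(half))  # df = 1
    while 2 * a < df:
        sf += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return sf


def subgroup_q_test(estimates, ses):
    """Q test for differences between independent subgroup estimates."""
    weights = [1.0 / s ** 2 for s in ses]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    return q, df, chi2_sf(q, df)


# Hypothetical pooled log hazard ratios in three age subgroups
q, df, p = subgroup_q_test([-0.35, -0.20, 0.05], [0.12, 0.15, 0.30])
print(f"Q = {q:.2f} on {df} df, p = {p:.3f}")
```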

Sensitivity analysis

All eligible studies will be included. For all outcomes, we will perform sensitivity analyses based on the risk of bias assigned to studies as described before (low, some concerns, or high). Depending on the availability of studies, results from aggregate data will be compared with the results from pooled IPD in another sensitivity analysis (Freeman 2017; Pignon 2001). Sensitivity analyses with respect to different types of prior distributions for the basic parameters and the between‐trial variance used in the Bayesian network approach will be undertaken, in order to investigate the robustness of the network results (Bafeta 2014). In addition, we will perform sensitivity analyses restricted to aggregate‐data trials that report standard errors directly, leaving out trials where those measures were imputed. All statistical analyses will be conducted in the most up‐to‐date versions of R (R) and JAGS (mcmc-jags.sourceforge.net).

Reaching conclusions

We will only base our conclusions on findings from the quantitative or narrative synthesis of studies included in this review. We will avoid making recommendations for practice; our implications for research will give the reader a clear sense of the needed focus of future research and remaining uncertainties in the field.

Summary of findings and assessment of the certainty of the evidence

We will create a 'Summary of findings' table according to Cochrane methodology. We will use the five GRADE considerations (study limitations, consistency of effect, imprecision, indirectness, and publication bias) to assess the quality of the body of evidence based on the studies that contributed data to the meta‐analyses for each outcome, classifying it as high, moderate, low or very low. We will use the methods and recommendations described in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2021), and GRADEpro GDT software (GRADEpro 2015). We will justify all decisions to downgrade or upgrade the quality of the evidence in the footnotes of the 'Summary of findings' tables, and we will provide comments to aid the reader's understanding of the review where necessary. We will consider whether there is additional outcome information that was not incorporated into the meta‐analyses; we will note this in the comments, and state if it supports or contradicts the information from the meta‐analyses.

The following seven outcomes will be included for a 'Summary of findings' table:

  • overall survival, defined as the time from randomization until death from any cause;

  • disease‐free survival, defined as the time from randomization until recurrence or death from any cause;

  • postoperative mortality;

  • postoperative morbidity;

  • achievement of tumour‐free resection margins (R0 resectability);

  • pathological tumour stage at resection according to the UICC TNM classification; and

  • rate of pathological complete response (pCR).

An NMA 'Summary of findings' table will be presented in a format reporting multiple treatment comparisons and multiple beneficial and harmful outcomes, according to Yepes‐Nuñez 2019. The effects and the corresponding CIs will be presented in a single table for all outcomes. A final GRADE certainty of evidence for the NMA as a whole will be stated, as well as ranking information for each outcome and treatment option.
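As an illustration of how the ranking information could be derived, the following Python sketch computes SUCRA values from posterior draws of the treatment effects. It assumes one row per MCMC draw and one column per treatment, with lower values indicating better outcomes (for example, log hazard ratios against a common reference), and uses the identity SUCRA = (a − mean rank) / (a − 1) for a treatments. The function name and the draws are hypothetical.

```python
def sucra(draws, lower_is_better=True):
    """SUCRA per treatment from posterior draws.

    draws: list of rows, one per MCMC draw; each row holds one effect value
    per treatment. Ties within a draw are broken by column order, which is
    adequate for continuous posterior samples.
    """
    n_trt = len(draws[0])
    rank_sums = [0] * n_trt
    for row in draws:
        # Order treatments from best to worst within this draw
        order = sorted(range(n_trt), key=lambda k: row[k],
                       reverse=not lower_is_better)
        for rank, k in enumerate(order, start=1):
            rank_sums[k] += rank
    n_draws = len(draws)
    # SUCRA equals (number of treatments - mean rank) / (number of treatments - 1)
    return [(n_trt - rs / n_draws) / (n_trt - 1) for rs in rank_sums]


# Hypothetical draws of log hazard ratios for surgery alone (reference, 0.0),
# preoperative chemotherapy and preoperative chemoradiotherapy
example_draws = [
    [0.0, -0.15, -0.30],
    [0.0, -0.25, -0.20],
    [0.0, -0.10, -0.35],
    [0.0, 0.05, -0.28],
]
labels = ["surgery alone", "chemotherapy", "chemoradiotherapy"]
for label, value in zip(labels, sucra(example_draws)):
    print(f"{label}: SUCRA = {value:.2f}")
```

A SUCRA of 1 means a treatment ranks first in every draw and 0 means it always ranks last; the values across a treatments always sum to a/2.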