To the Editor:

Relapse and, less frequently, refractoriness to front-line therapy are the main causes of treatment failure in childhood B-cell precursor acute lymphoblastic leukemia (BCP-ALL), occurring in 15–20% of patients [1, 2]. Prognosis after relapse depends primarily on the time elapsing between diagnosis and relapse, the site of relapse, and the disease immunophenotype [2]; unfortunately, many of these patients relapse again despite receiving allogeneic hematopoietic stem cell transplantation (allo-HSCT) [3].

Blinatumomab, a bispecific T-cell engager antibody construct, directs CD3-positive effector-memory T lymphocytes towards CD19-positive cells, triggering cell death of the latter [4]. Efficacy of blinatumomab in pediatric patients with relapsed/refractory (R/R) BCP-ALL has been demonstrated in an international phase 1/2, single-arm study (NCT01471782) [4].

R/R pediatric ALL is rare; consequently, most studies are single-arm and limited by small population sizes. Complete remission (CR) rates for pediatric patients in first or more advanced relapse vary from 8 to 75% [2, 5–9]. This variation can be attributed to differences in patient characteristics, sample sizes, and the definition of CR used [2, 5–9].

For rare diseases, one approach to estimate treatment efficacy is to identify appropriate control populations with similar characteristics [9, 10]. Such a comparison has been undertaken for adult patients with Ph-negative R/R BCP-ALL from a single-arm study of blinatumomab with a historical dataset [11], but not for children.

We analyzed the blinatumomab phase 1/2 study [4] in comparison with three historical comparator groups from North America, Australia, and Europe. Propensity score (PS) analyses, along with a more conventional weighted analysis, evaluated two endpoints: overall survival (OS) and CR. The PS approach aims to create a balance between blinatumomab-treated subjects and historical comparator subjects with respect to multiple prognostic clinical factors.

Three historical comparator groups contributed data on patients (<18 years) who had received intensive polychemotherapy with curative intent for R/R ALL between 2005 and 2013. The TACL study (group 1), conducted in 24 pediatric centers in the USA, Canada, and Australia, collected data on patients (≤21 years old) with R/R BCP-ALL or BCP-ALL relapsed after HSCT who received standard-of-care (SOC) chemotherapy in 2005–2013. Only data from patients aged <18 years at the time of the earliest qualifying treatment failure were included in this analysis [9]. Two EU historical study groups provided data collected retrospectively from existing databases: the Austrian and German BFM (Berlin–Frankfurt–Münster) study group (group 2) and the Italian AIEOP (Associazione Italiana di Ematologia e Oncologia Pediatrica) study group (group 3).

Patient characteristics and endpoint definitions in the historical comparator studies were aligned to those used in the blinatumomab study [4]. Patients with Ph-negative R/R BCP-ALL with one of the following earliest qualifying events were selected: refractory to SOC induction/reinduction chemotherapy, relapse after allo-HSCT, or ≥2nd bone marrow (BM) relapse. The last qualifying treatment was used for these analyses, because the resulting data were more comparable with the blinatumomab study population. At the time of treatment for R/R disease, patients were required to have >25% leukemic BM blasts, to have no central nervous system involvement at the time of the qualifying event, and to have had no previous or current treatment with blinatumomab. Information was documented from the date of initial ALL diagnosis through the date of R/R disease until the date of death or last follow-up.

Patients with different outcome measures in historical comparator groups are summarized in Supplementary Fig. 1. CR with or without full hematological recovery was defined in accordance with the blinatumomab study [4]. CR with full peripheral blood count recovery (CR-full) was defined as CR with ANC ≥0.5 × 10⁹/l and platelet count ≥100 × 10⁹/l. CR-full was not available for the BFM dataset. Follow-up time for OS was from the date of the start of the last salvage therapy, or date of last relapse if salvage date was not available, to date of death or last follow-up. Patients lost to follow-up were censored at the last known follow-up date.

Two statistical methods (i.e., conventional weighted analysis and PS-weighted comparative analysis) were applied to quantitatively evaluate the effect of blinatumomab on OS and CR rates, while adjusting for important risk factors for both endpoints. The main strata used were the nature of refractory disease/relapse (disease status), BM blasts, and time from prior treatment (Supplementary Appendix A). The 95% CIs were estimated by bootstrapping (Supplementary Appendix B). Weighting by PS analysis allowed estimation of treatment effect and CIs, while adjusting for differences in multiple data sources [12, 13]. The propensity to be treated was estimated via a logistic regression model, using the patient’s treatment status as the outcome and a stepwise selection method to select among main effects and two-way interactions of the following covariates (see also the appendix): age; gender; region; previous allo-HSCT; number of previous salvage therapies; time since last therapy or allo-HSCT; percentage of BM blasts before starting salvage therapy; refractoriness to previous therapy; 11q23 abnormalities. The PS-weighted CR and OS analyses were performed using a logistic regression model and a Cox proportional hazards model, respectively, each weighted with stabilized inverse probability of treatment weights (IPTW) derived from the predicted PS. The models included as independent variables patients’ treatment status and any covariates not sufficiently balanced by the PS weighting, and estimated odds ratios (OR) or hazard ratios (HR) for treatment effects.
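As an illustration of the stabilized IPTW step, the sketch below computes weights from a treatment indicator and an already-estimated PS. It is a minimal example under stated assumptions, not the code used in this analysis: the function name `stabilized_iptw`, the toy data, and the usual definition of stabilized weights (marginal probability of the observed treatment divided by the PS-based probability of the observed treatment) are assumptions introduced here.

```python
def stabilized_iptw(treated, propensity):
    """Return stabilized inverse-probability-of-treatment weights.

    treated:    list of 0/1 flags (1 = blinatumomab, 0 = historical comparator)
    propensity: list of estimated P(treated = 1 | covariates), each in (0, 1)
    """
    if len(treated) != len(propensity):
        raise ValueError("inputs must have the same length")

    # Marginal probability of treatment; this is the stabilizing numerator.
    p_treated = sum(treated) / len(treated)

    weights = []
    for t, ps in zip(treated, propensity):
        if not 0.0 < ps < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        if t == 1:
            weights.append(p_treated / ps)
        else:
            weights.append((1.0 - p_treated) / (1.0 - ps))
    return weights


# Hypothetical toy data, not taken from the study.
treated = [1, 1, 0, 0, 0, 0]
propensity = [0.5, 0.25, 0.5, 0.25, 0.2, 0.2]
weights = stabilized_iptw(treated, propensity)
```

Weights of this kind would then be passed as case weights to the outcome model (logistic regression for CR, Cox regression for OS). Stabilization keeps the weighted sample size close to the original one and tends to reduce the influence of extreme PS values.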

Baseline patient demographics and clinical characteristics for the historical comparator groups and the blinatumomab-treated population are shown in Table 1. In the blinatumomab-treated population, 70% of patients had relapsed <6 months from the last prior treatment compared with 46% in the combined historical groups.

Table 1 Baseline patient demographics and clinical characteristics in the historical comparator and blinatumomab studies.

Unweighted proportions of CR-full (95% CI) in the combined TACL/AIEOP, TACL alone, and AIEOP alone groups are shown in Supplementary Table 1. CR-full (95% CI) values in the combined TACL/AIEOP group were 10% (5–14), 11% (6–15), and 9% (5–12) when weighted by disease status, BM blast percentage at treatment start, and time since previous treatment, respectively (Table 2). The corresponding CR proportions with/without peripheral blood count recovery (95% CI) in the combined TACL/BFM/AIEOP group were 44% (38–50), 48% (42–54), and 42% (36–47). The stratum-specific CR proportions with/without peripheral blood count recovery were higher in the BFM group than in the AIEOP and TACL groups for patients with refractory disease and those who had experienced ≥2 relapses (Supplementary Table 1).
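The weighting of historical CR proportions to the blinatumomab study data can be illustrated by direct standardization. The sketch below is an assumption about the general form of such a calculation (the exact procedure is given in Supplementary Appendix A, which is not reproduced here): stratum-specific historical response proportions are averaged using the stratum distribution of the target population as weights. The function name `standardized_proportion`, the stratum labels, and all counts are hypothetical.

```python
def standardized_proportion(historical, target_counts):
    """Directly standardize a response proportion to a target population.

    historical:    dict mapping stratum -> (responders, total) in comparator data
    target_counts: dict mapping stratum -> number of patients in the target study
    """
    total_target = sum(target_counts.values())
    if total_target == 0:
        raise ValueError("target population is empty")

    result = 0.0
    for stratum, n_target in target_counts.items():
        responders, total = historical[stratum]
        if total == 0:
            raise ValueError(f"no historical patients in stratum {stratum!r}")
        # Weight each historical stratum rate by that stratum's share of the target.
        result += (n_target / total_target) * (responders / total)
    return result


# Hypothetical counts for two disease-status strata, not taken from the study.
historical = {"refractory": (10, 100), "relapse after allo-HSCT": (30, 100)}
target_counts = {"refractory": 30, "relapse after allo-HSCT": 70}
weighted_cr = standardized_proportion(historical, target_counts)
```

In this toy example the unweighted historical proportion would be 40/200 = 20%, whereas weighting to the target stratum mix gives 24%, which shows how a difference in case mix alone can shift the comparator estimate.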

Table 2 (a) Complete remission and median overall survival weighted to blinatumomab study data, and (b) propensity score weighted comparative analysis on complete remission and overall survival.

Median OS (95% CI) in the combined historical dataset was 6.2 months (5.1–7.2) (Supplementary Fig. 2A). Median OS was longer in the BFM group than in the AIEOP or TACL groups (Supplementary Fig. 2B). As published previously [4], the median OS (95% CI) in the blinatumomab study was 7.5 months (4.0–11.8) (Supplementary Fig. 3). Median OS estimates in the combined comparator group were 5.9, 6.2, and 5.5 months when weighted by disease status, BM blast percentage at treatment start, and time since previous treatment, respectively (Table 2). Median OS was longer for patients who had <50% blast cells than for those who had ≥50% blast cells at the start of salvage treatment (Supplementary Table 2). OS was shortest in patients with 11q23 abnormalities (3.3 months) and in those <6 years old (Supplementary Fig. 4). For patients who had relapsed >6 months from last treatment, median OS was 9.3 months versus 3.9 months for those who had relapsed sooner (Supplementary Table 2).

In the stabilized IPTW analysis, the odds of achieving CR-full in the blinatumomab group were almost twice those in the combined historical control group, although the CI included 1 (OR, 1.82; 95% CI, 0.74–4.51). The HR for death in the blinatumomab group versus historical controls was 0.65 (95% CI, 0.44–0.94) (Table 2).

Through historical comparator data from pediatric patients with R/R BCP-ALL and application of two analytical approaches, it was possible to compare the efficacy of blinatumomab from a single-arm, phase 1/2 study with that of historical SOC therapy. Single-agent blinatumomab treatment was associated with longer OS and a trend for higher CR-full in comparison with SOC chemotherapy, suggesting that the agent compares favorably with historical approaches.

We acknowledge that this study may have limitations: the weighted analysis relies on categorization by prognostic variables and stratifying by prognostic factors may not be sufficient for controlling confounding factors. Differences in data availability and collection among study populations can result in the exclusion of potentially important confounders in the PS model (e.g., physician’s reasons for treating patients with blinatumomab versus chemotherapy). Conclusions of propensity-adjusted analyses are limited by availability of overlapping covariates in the three study datasets. Finally, the limited sample size could reduce the power to detect clinically meaningful differences between groups. Nonetheless, this study has several strengths. Data were included from patients across six countries worldwide; pooling these data removed some of the noise observed when datasets were considered individually. Stratified and weighted analyses were used at the patient level to provide optimal data summaries.

This study revealed differences in outcomes by important stratifying factors: in the combined subgroup analyses, median OS was shortest in patients <6 years, in those with 11q23 abnormalities, in those with refractory disease, and in those who had received their last treatment line <6 months before the event qualifying for study entry. Similar trends were observed in the blinatumomab cohort, except that younger patients appeared to respond better than older patients [4]. Defining age groups according to International Council for Harmonisation guidelines [10] resulted in no difference in efficacy across age groups.

Altogether, these data support the efficacy of blinatumomab in R/R BCP-ALL.