Applicability and added value of novel methods to improve drug development in rare diseases

Abstract

Background

The ASTERIX project developed a number of novel methods suited to study small populations. The objective of this exercise was to evaluate the applicability and added value of novel methods to improve drug development in small populations, using real world drug development programmes as reported in European Public Assessment Reports.

Methods

The applicability and added value of thirteen novel methods developed within ASTERIX were evaluated using data from 26 European Public Assessment Reports (EPARs) for orphan medicinal products, representative of rare medical conditions as predefined through six clusters. The novel methods covered ‘innovative trial designs’ (six methods), ‘level of evidence’ (one method), ‘study endpoints and statistical analysis’ (four methods), and ‘meta-analysis’ (two methods). They were selected from the methods developed within ASTERIX based on their novelty; methods that discussed already available and applied strategies were not included in this validation exercise. Pre-requisites for application in a study were systematized for each method, and for each main study in the selected EPARs it was assessed whether all pre-requisites were met. First, this direct applicability was assessed using the actual study design. Second, applicability and added value were explored allowing changes to study objectives and design, but without deviating from the context of the drug development plan. We evaluated whether differences in applicability and added value could be observed between the six predefined condition clusters.

Results and discussion

Direct applicability of novel methods appeared to be limited to specific selected cases. The applicability and added value of novel methods increased substantially when changes to the study setting within the context of drug development were allowed. In this setting, novel methods for extrapolation, sample size re-assessment, multi-armed trials, optimal sequential design for small sample sizes, Bayesian sample size re-estimation, dynamic borrowing through power priors and fall-back tests for co-primary endpoints showed most promise, being applicable in more than 40% of evaluated EPARs in all clusters. Most of the novel methods were applicable to conditions in the cluster of chronic and progressive conditions involving multiple systems/organs. Relatively few methods were applicable to acute conditions with single episodes. Goal Attainment Scaling was found to be particularly applicable in the chronic clusters, as opposed to the other (non-chronic) clusters.

Conclusion

Novel methods as developed in ASTERIX can improve drug development programs. Achieving optimal added value of these novel methods often requires consideration of the entire drug development program, rather than reconsideration of methods for a specific trial. The novel methods tested were mostly applicable in chronic conditions, and acute conditions with recurrent episodes.

Background

Background on ASTERIX project

ASTERIX was a novel EU-funded research project focusing on the development of more efficient and effective research designs to study new drugs and treatments for rare diseases. The overall aim was to achieve more reliable and cost-efficient clinical development of treatments for rare diseases and to stimulate the search for treatments for these devastating and largely ignored diseases.

The main objectives were to:

  • Develop design and analysis methods for single trials and series of trials in small populations.

  • Include patient-level information and perspectives in design and decision making throughout the clinical trial process.

  • Validate new methods and propose improvements for regulatory purposes.

ASTERIX worked through six highly interactive and interdependent Work Packages, ranging from development of methodology and stakeholder participation to dissemination of the results. Unique to this project was that patients were directly involved in the research process and their input was taken into account in the design and analysis of studies.

Context

Six percent of the global population is affected by one of the estimated 5000–8000 rare diseases at some stage in their life [1]. In Europe a disease is classified as ‘rare’ if it affects less than 5 in 10,000 people [2]. Evaluating interventions aimed at preventing, diagnosing or treating a rare disease is a challenge, and can lead to slow evaluation and approval of Orphan Medicinal Products (OMPs) for marketing, and thereby delay access by patients [3]. To stimulate the development of medicines for rare diseases the EU Orphan Regulation came into effect in 2000. This regulation provides an incentive for research, development and marketing of OMPs to target rare diseases [4]. Although more than 1800 orphan drug designations have been granted since 2000, by 2017 only 129 OMPs were granted market authorisation [5]. Hence, although drugs do become available, a treatment still needs to be found for the vast majority of rare diseases. The main issue that distinguishes medicines development for rare diseases from more common diseases is the challenge of generating robust clinical evidence. The limited recruitment potential calls for an efficient study design, able to estimate the treatment effect in a valid and reliable way with a small number of patients [6].

There is an abundance of methodology to improve the design and analysis of individual trials, often essentially aimed at increasing efficiency: extract more information from the same trial, increase the probability of success of an individual clinical trial and enable the conduct of smaller trials. Yet, progress for clinical trials in truly small populations has proven difficult to achieve. Some frameworks have been proposed to guide the choice of the best suited methodology and study designs in drug development for such rare diseases. At the regulatory level, the European Medicines Agency (EMA) released the ‘Guideline on Clinical Trials in Small Populations’, which summarises a range of possible approaches in the context of small populations in drug development, acknowledging that any efficiency improvements for small population clinical trials would also be relevant to larger trials and vice-versa [7]. Other available frameworks typically aimed to propose algorithms or decision processes to arrive at the most suited design for a given clinical trial. These focus either on a specific condition ([8, 9]), a specific method or group of methods ([10, 11]), or provide general recommendations [12,13,14,15].

However, most of these algorithms or frameworks are guided by items related to only a few characteristics of the condition such as clinical course, timing and reversibility of the outcome, or trial feasibility, and they are not always exhaustive to fit all possible situations.

ASTERIX was a novel EU-funded research project (7th Framework Program (FP7) Call – Health.2013.4.2–3) focusing on the development of more efficient and effective research designs to study new drugs and treatments for rare diseases. The overall aim was to achieve more reliable and cost-efficient clinical development of treatments for rare diseases and to stimulate the search for treatments for these devastating and largely ignored diseases. ASTERIX focused on progress in clinical research for new treatments for rare diseases. Its vision was that such progress can best be made by advancing coherently on three fronts: (1) statistical methodology for design and analysis, (2) incorporation of the patient perspective in design and outcomes and (3) uptake in practice and regulatory guidance [16].

Within the ASTERIX project 13 novel methods have been developed proposing innovative approaches to adapt and analyse clinical trials on small populations and rare diseases (Table 1). We aimed to evaluate these methods for added value against an appropriate framework to guide application, preferably tailored to characteristics of the medical condition. The limitations of the existing frameworks to provide guidance that directly incorporates characteristics of the medical condition treated are obvious. Apart from their low prevalence, orphan diseases are a highly heterogeneous group of diseases. Such heterogeneity makes it very difficult to issue useful regulatory recommendations relevant to all (or at least most) possible clinical situations in the course of uncommon diseases. Nevertheless, some groups of conditions share similar clinical characteristics linked to the applicability of certain trial designs and general approaches.

Table 1 Overview of the methods that were evaluated

Thus, within the ASTERIX project we used a heuristic framework that could help identify groups of medical conditions, defined as the combination of a clinical situation and a given therapeutic approach to be tested, for which similar methods could be useful for drug development. The six condition clusters were used to check for patterns within clusters of conditions that share similar features, which are also reflected in methodological and trial design challenges. The clusters were used as a strategy to provide methodological insights by cluster of conditions and to overcome the challenges posed by the large number and high variation of rare conditions. The methodology, reasoning and derivation behind the condition clustering are detailed in ([17], to appear).

In this study, we aimed to evaluate the applicability and added value (via the potential advantages) of the 13 novel methods developed within the ASTERIX project against a comprehensive set of real life examples of drug development programs for OMPs, as identified in European Public Assessment Reports (EPARs). The framework of medical condition clusters was applied as a way to structure our evaluation, so that guidance on the (novel) methods could be given more specifically at the condition cluster level. In addition, we described advantages and disadvantages of using the newly developed methodology. Based on the applicability and potential advantages and disadvantages, we aimed to tailor guidance on the use of this new methodology to specific medical condition clusters.

Methods

We included all novel methods that were developed within the ASTERIX project and that had been reported in a published or (nearly) submitted manuscript by 1 September 2017.

There are numerous other methods and tools available that address challenges encountered in conducting research for rare diseases and small populations (e.g. n-of-1 trials, patient registries), and some were investigated in ASTERIX [18]. However, for the purpose of this research we focused only on the novel methods developed within ASTERIX. We excluded manuscripts that discussed already existing methods, or described a new perspective on an already existing method. We categorised the methods into four main groups:

  • Six ‘innovative trial designs’, including: delayed-start design, a method for interim analysis and stopping rules in multi-arm parallel trials, two methods for sample-size reassessment (one for adaptive survival trials, and a second one with a Bayesian approach for continuous end-points), a method to optimize boundaries in group-sequential designs, and a method to weight prior information in Bayesian trials based on similarity of previous data.

  • One ‘level of evidence’ method, consisting of a set of recommendations to check whether prior information can be used for inference, which would allow the significance level in confirmatory trials to be relaxed and the sample size to be reduced while controlling for certainty.

  • Four ‘study endpoints and statistical analysis’ methods, including: three methods to analyze multiple end-points (one for analysis of repeated measurements of multiple end-points, one allowing conclusions for multiple co-primary endpoints even when not all meet statistical significance, and an exact non-parametric method for multiple binary end-points), and a patient-centered measurement instrument (Goal Attainment Scale or GAS) aimed to standardise individual patients’ functional outcomes in conditions with heterogeneous clinical expression.

  • Two ‘meta-analysis’ methods, both aimed at improving the management of heterogeneity estimators in meta-analysis of sparse-event studies.

Heuristic framework used

Six clusters of medical conditions were defined: ‘acute single episodes’, ‘conditions with acute recurrent episodes’, ‘chronic condition with stable or slow progression’, ‘chronic progressive condition, led by one system/organ’, ‘chronic progressive condition led by multiple systems/organs’ and ‘chronic staged conditions’ (Additional file 1: Appendix 1). In addition to this classification, a consideration of extreme rarity was also taken into account, since ultra-rare conditions (< 1/100,000) have additional limitations to the recruitment potential.
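The two prevalence thresholds mentioned in this article (fewer than 5 in 10,000 for ‘rare’ in Europe, and fewer than 1 in 100,000 for ‘ultra-rare’) can be written down as a small check. The following is only an illustrative sketch: the function name `rarity_category` and the label strings are hypothetical and were not part of the ASTERIX evaluation.

```python
from fractions import Fraction

# Thresholds as stated in the text (exact fractions avoid rounding issues).
RARE_THRESHOLD = Fraction(5, 10_000)         # 'rare': fewer than 5 in 10,000
ULTRA_RARE_THRESHOLD = Fraction(1, 100_000)  # 'ultra-rare': fewer than 1 in 100,000


def rarity_category(cases: int, population: int) -> str:
    """Hypothetical helper: classify a condition by its prevalence."""
    if population <= 0 or cases < 0:
        raise ValueError("population must be positive and cases non-negative")
    prevalence = Fraction(cases, population)
    if prevalence < ULTRA_RARE_THRESHOLD:
        return "ultra-rare"
    if prevalence < RARE_THRESHOLD:
        return "rare"
    return "not rare"


# Illustrative values only, not taken from any EPAR.
print(rarity_category(1, 200_000))  # below 1 in 100,000
print(rarity_category(3, 10_000))   # below 5 in 10,000
```

Both comparisons are strict, so a prevalence of exactly 1 in 100,000 falls in the ‘rare’ rather than the ‘ultra-rare’ category under this reading of the thresholds.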

Selection of EPARs for validation

We selected 26 EPARs out of the available 125 OMPs approved by EMA between 2001 (start of the Orphan Regulation application) and 2014 (time cut-off when this research was initiated). We aimed to select EPARs that represent the conditions within each of the six medical condition clusters, without prior information about the potential applicability of the novel developed methods. We used the following criteria for selection:

  • In principle four EPARs for each condition cluster. We considered this number sufficient to capture the diversity within the cluster, but acknowledged that exceptions are still possible;

  • Since extreme rarity of a given medical condition raises additional limitations to the recruitment potential, for each condition cluster at least one EPAR describing an ultra-rare condition (affecting < 1/100,000 persons in the EU) was selected;

  • Per medical condition we included only one EPAR. The same drug could have been included more than once if developed for more than one indication, although none actually was;

  • At least one repurposed drug per cluster, defined as a drug that was already in use for a different medical condition and for which a new authorization was applied and granted for an orphan indication. Repurposed drugs may have different development approaches because part of the already available information may be extrapolated from former use to the new application.

If information in EPARs was insufficiently detailed, FDA summary basis of approval [19], published original articles and public clinical trial registries were consulted in order to obtain the necessary information for assessment [20,21,22].

Method of evaluation of applicability and added value of novel methodology

Key characteristics of the studies that were used as pivotal evidence to support approval of orphan products were extracted from the EPARs and systematised (Additional file 1: Appendix 3). EPARs were used as the basic source for the data extraction since they contained the key information for the regulatory assessment in the EU. However, when EPARs were insufficiently detailed (e.g. recruitment pattern or recruitment timing was missing), we investigated other publicly available sources, such as the reports from FDA. A data extraction form was created, pilot tested and refined in multiple iterations by discussion amongst nine reviewers (Additional file 1: Appendix 1). One researcher (MM) extracted the key characteristics from the studies reported in the EPARs. One researcher extracted the pre-requisites of the methods, which were checked by a second researcher (KOR) and the (co)developer of the method. Five researchers summarized the OMPs and orphan conditions. The summaries and extracted data were also checked by at least one researcher independent of the previous tasks during the evaluation of the applicability of the methods.

To judge applicability, at least two researchers were involved: 1) MM or KOR or both, and 2) the (co)developer of the method evaluated. In approximately a quarter of the cases, and in all cases with any uncertainty, the applicability evaluation was discussed with the ASTERIX project lead. If opinions did not concur, agreement was reached in a discussion between the researchers who judged applicability, the (co)developers of the methods and the ASTERIX project lead. We summarised the key features of the 13 novel ASTERIX methods, their prerequisites and potential advantages and disadvantages. Advantages and disadvantages of the methods were extracted from the papers and manuscripts, if reported. When not reported, advantages and disadvantages were added by the reviewers (MM and KOR), based on logical reasoning. These were reviewed by the (co-)developers of the methods, and refined where necessary to reflect the advantages/disadvantages of applying the particular method. We evaluated these prerequisites against the design characteristics of the pivotal studies (main efficacy studies that supported the regulatory evaluation and approval) and the characteristics of the orphan conditions mirroring the methods’ prerequisites and design elements. This was first done in a pilot including four studies reported in two EPARs (for the OMPs Savene and Cayston [23]), after which we refined the list of characteristics with study and applicability details (Additional file 1: Appendix 1). In addition, we evaluated whether the method (if applicable) would have added value compared to the method actually used. We chose the method actually used as comparator, rather than one common standard, because a common standard would be difficult to define given the plethora of challenges associated with each condition and patient population. In our opinion, this comparison best reflects the improvements that can be achieved for each scenario.

The extracts, interpretation and conclusions were sent for validation to the lead authors of the manuscripts describing the novel methods. Any disagreements between primary evaluators and authors were debated until general consensus was reached. Once the list of characteristics was completed, data were extracted from pivotal trials, including a summary of the condition, the trial characteristics needed to judge whether the pre-requisites for applying the novel methods could have been fulfilled, and any applicant’s justification for choice of design elements and strategy, if available. We used a two-step approach to determine whether or not the methods could have been applicable and add value:

  • Step 1. The static step: evaluation of direct applicability without any adjustments to the original setting of the pivotal studies. This evaluation was based on the (methodological) pre-requisites of the methods and whether these were fulfilled for the trials.

  • Step 2. The dynamic step: evaluation of applicability allowing for adjustments to the original setting or design of the studies without changing the original objective and context of the development plan. Changes were made only after checking therapeutic guidelines, regulatory guidelines or any published article of a study in the same condition, in order to justify the applicability and improve the drug development program ([24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49]).

For example, a secondary outcome could have been promoted to a primary outcome, or primary and secondary endpoints could have been defined as multiple co-primary endpoints, if this was clinically and methodologically appropriate and sound from a regulatory point of view.

Analysis, interpretation and synthesis of the results

Based on the comparison between the methods’ pre-requisites and the characteristics of the pivotal trials, we used applicability evaluation decision trees (Additional file 1: Appendix 2) to measure how applicable the methods were for each EPAR (‘applicable’ denoted by green colour, ‘may be applicable’ denoted by light green colour, ‘no applicability’ denoted by orange colour, ‘no possibility for application irrespective of changes’ denoted by grey colour), depending on fulfillment of the pre-requisites.

For step 1 if one of the pre-requisites was not fulfilled then non-applicability was concluded, while for step 2 if pre-requisites were not fulfilled, then relevant changes were explored before concluding on applicability or non-applicability.
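The two-step rule described above can be sketched in a few lines of code. This is a simplified illustration, not the decision trees of Additional file 1: Appendix 2. The function names, the pre-requisite labels and the `resolvable_by_change` argument are hypothetical, and the intermediate ‘may be applicable’ and grey categories are deliberately not modelled.

```python
def unmet_prerequisites(prerequisites):
    """Return the names of all pre-requisites that are not fulfilled."""
    return {name for name, met in prerequisites.items() if not met}


def static_step(prerequisites):
    """Step 1: a single unmet pre-requisite leads to non-applicability."""
    if unmet_prerequisites(prerequisites):
        return "no applicability"
    return "applicable"


def dynamic_step(prerequisites, resolvable_by_change):
    """Step 2: unmet pre-requisites are first checked against justified changes.

    `resolvable_by_change` holds the pre-requisites that a justified change to
    the study setting could satisfy (an assumption of this sketch).
    """
    if unmet_prerequisites(prerequisites) <= set(resolvable_by_change):
        return "applicable"
    return "no applicability"


# Hypothetical trial: randomised, but with a single primary endpoint.
trial = {"randomised comparison": True, "multiple co-primary endpoints": False}

print(static_step(trial))
# If secondary endpoints could justifiably be redefined as co-primary endpoints:
print(dynamic_step(trial, {"multiple co-primary endpoints"}))
```

In this sketch the static step rejects the hypothetical trial, whereas the dynamic step accepts it once redefining the endpoints is considered a justified change, mirroring the example of co-primary endpoints given for step 2.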

The applicability was summarized numerically and visualized in a heat map for the first (static) and second (dynamic) step of the evaluation. Based on these heat maps, we derived recommendations on the use of the novel ASTERIX methodologies per cluster of conditions.

Results

Selected EPARs for evaluation

We initially included 24 EPARs across the six clusters from the public EMA website [21]. As a result of the detailed evaluation, 2 EPARs were re-classified from the cluster ‘chronic progressive condition led by multiple systems/organs’ to the cluster ‘conditions with acute recurrent episodes’. Two EPARs were added to ensure at least 4 EPARs per cluster, leading to 26 EPARs in total (Table 2). There was no available OMP and corresponding EPAR for an ultra-rare condition within the cluster of ‘chronic staged conditions’, therefore none could be selected for this cluster.

Table 2 EPARs included in the evaluation

Potential applicability of methods and advantages based on information from actual trials

In the first, static step, we found that all individual methods were directly applicable to a minimum of 1 (4%) up to 9 (35%) of the 26 EPARs, and overall each method was applicable to a minimum of 1 (17%) and a maximum of 5 (83%) of the 6 clusters. In the second, dynamic step we found the individual methods applicable in 1 (4%) up to 17 (65%) of the EPARs, and a minimum of 1 (17%) out of 6 and a maximum of 6 (100%) of the 6 clusters (Tables 1 and 3, and Fig. 1).
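The percentages quoted above follow directly from the counts and the two denominators (26 EPARs and 6 clusters). As a quick arithmetic check, a minimal sketch with a hypothetical helper `share`:

```python
def share(count: int, total: int) -> int:
    """Percentage of `count` in `total`, rounded to the nearest whole number."""
    return round(100 * count / total)


N_EPARS = 26    # EPARs included in the evaluation
N_CLUSTERS = 6  # predefined condition clusters

# Counts reported in the text for the static and dynamic steps.
for count in (1, 9, 17):
    print(f"{count}/{N_EPARS} EPARs = {share(count, N_EPARS)}%")
for count in (1, 5, 6):
    print(f"{count}/{N_CLUSTERS} clusters = {share(count, N_CLUSTERS)}%")
```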

Table 3 Percentage of EPARs where the methods are applicable

Fig. 1 Percentage of EPARs where the methods are applicable

Conditions with single acute episodes

Methods that were applicable following adjustments within this cluster of conditions were: ‘extrapolation’ (1/4 EPARs), ‘sample size reassessment and hypothesis testing’ (2/4 EPARs), ‘multi-arm group sequential design with a simultaneous stopping rule’ (2/4 EPARs), ‘sequential designs for small samples’ (3/4 EPARs), ‘Bayesian sample size re-estimation using power priors’ (1/4 EPARs), ‘dynamic borrowing through empirical power priors’ (1/4 EPARs), ‘fallback tests for co-primary endpoints’ (2/4 EPARs) and ‘optimal exact test for multiple binary endpoints’ (2/4 EPARs). ‘Extrapolation’ is related to ‘dynamic borrowing through empirical power priors’, with the difference that its purpose is to plan a trial in the target population under the assumption that no data in the source population are available yet (as is the case when a paediatric investigation plan is formulated by EMA).

‘Heterogeneity estimators’ and ‘prior distributions for variance parameters in sparse-event meta-analysis’ were not applicable, as all EPARs used either a single pivotal trial or single-arm trials with non-sparse outcomes. Given the possible use of binary outcomes in this cluster, the methods could become applicable where at least two pivotal trials are available. ‘Delayed-start randomisation’ was not applicable as a placebo arm was not used in any of the EPARs. ‘Goal Attainment Scaling’ was not applicable because it requires previous patient experience with the disease to individualize the goals, as well as follow-up assessments at a sustained functional level, and is thus not suitable for conditions with acute onset and clinical course. Similarly, ‘Simultaneous inference for multiple marginal GEE models’ was not applicable because, owing to the acute onset and short clinical course limited to a single episode, the measurements in this cluster were not repeated. This does not totally preclude applicability, as the single episode may be long enough to allow valuable use of repeated measurements.

Conditions with acute recurrent episodes

Methods that were applicable following adjustments were ‘extrapolation’ (5/6 EPARs), ‘sample size reassessment and hypothesis testing in adaptive survival trials’ (5/6 EPARs), ‘multi-arm group sequential designs with a simultaneous stopping rule’ (5/6 EPARs), ‘sequential designs for small samples’ (5/6 EPARs), ‘Bayesian sample size re-estimation using power priors’ (5/6 EPARs), ‘dynamic borrowing through empirical power priors that control type I error’ (5/6 EPARs), ‘fallback tests for co-primary endpoints’ (4/6 EPARs), ‘optimal exact tests for multiple binary endpoints’ (2/6 EPARs), ‘heterogeneity estimators’ and ‘prior distributions for variance parameters in sparse-event meta-analysis’ (1/6 EPARs), and ‘delayed-start randomisation’ (1/6 EPARs). ‘Goal Attainment Scaling’ was not applicable for the same reasons as for the cluster of conditions with acute single episodes. ‘Simultaneous inference for multiple marginal GEE models’ appeared not to be applicable because in this cluster the measurements were not repeated in the setting being modelled. However, as with acute single episodes, it is likely that the method can be extended to the type of repeated assessments that apply to conditions in this cluster.

Chronic conditions with stable or slow progression

Methods that were applicable following adjustments were: ‘extrapolation’ (1/4 EPARs), ‘sample size reassessment and hypothesis testing in adaptive survival trials’ (1/4 EPARs), ‘multi-arm group sequential designs with a simultaneous stopping rule’ (2/4 EPARs), ‘sequential designs for small samples’ (3/4 EPARs), ‘Bayesian sample size re-estimation using power priors’ (2/4 EPARs), ‘dynamic borrowing through empirical power priors that control type I error’ (2/4 EPARs), ‘fallback tests for co-primary endpoints’ (1/4 EPARs), ‘optimal exact tests for multiple binary endpoints’ (1/4 EPARs), ‘simultaneous inference for multiple marginal GEE models’ (1/4 EPARs). ‘Goal Attainment Scaling’ may be methodologically applicable (2/4 EPARs), but its added value may be limited within this cluster as, at least in the selected examples, there are already available validated patient reported outcomes capturing functionality for all targeted patients.

‘Delayed-start randomisation’ was not applicable, as most added value is achieved if there is disease progression during the trial period and treatments have a lasting response, while this cluster was characterised by a relatively stable clinical course with treatments having reversible effects. ‘Heterogeneity estimators’ and ‘prior distributions for variance parameters in sparse-event meta-analysis’ were not applicable since there were few randomised trials and the examples included mostly continuous or non-sparse discrete endpoints.

Chronic progressive conditions led by one system/organ

Methods that were applicable following adjustments included: ‘sample size reassessment and hypothesis testing in adaptive survival trials’ (2/4 EPARs), ‘multi-arm multi-stage trial with a simultaneous stopping rule’ (1/4 EPARs), ‘sequential design for small samples’ (1/4 EPARs), ‘delayed-start randomisation’ (1/4 EPARs), ‘Bayesian sample size re-estimation using power priors’ (1/4 EPARs) and ‘dynamic borrowing through empirical power priors that control type I error’ (1/4 EPARs). ‘Heterogeneity estimators’ and ‘prior distributions for variance parameters in sparse-event meta-analysis’ were not applicable, similarly to other clusters of chronic conditions, due to lack of randomised trials, use of continuous or discrete endpoints, or binary endpoints that were not sparse. Also, ‘optimal exact tests for multiple binary endpoints’ was not applicable due to use of non-binary endpoints (i.e. time-to-event or continuous endpoints).

This condition cluster contains two EPARs that were approved based on case series and not on experimental or observational trials. There are several reasons for this path, including the scarcity of patients and the seriousness and severity of the condition, leading to ethical concerns and reluctance to use placebo. In these cases, the entire drug development program would need to be reshaped for the methods to become applicable. Although this is technically possible, judging whether or not it would be feasible was outside the scope of our evaluation. For those isolated instances we concluded that the methods were not applicable.

Chronic progressive conditions led by multiple system/organs

Almost all methods were applicable in this cluster of conditions to some extent following adjustments: ‘sample size reassessment and hypothesis testing in adaptive survival trials’ (1/4 EPARs), ‘multi-arm multi-stage trial with a simultaneous stopping rule’ (3/4 EPARs), ‘sequential design for small samples’ (3/4 EPARs), ‘delayed-start randomisation’ (1/4 EPARs), ‘Bayesian sample size re-estimation using power priors’ (2/4 EPARs), ‘dynamic borrowing through empirical power priors that control type I error’ (2/4 EPARs), ‘fallback tests for multiple endpoints’ (2/4 EPARs), ‘optimal exact tests for multiple binary endpoints’ (1/4 EPARs) and ‘GAS’ (3/4 EPARs). Similarly to other clusters including chronic conditions, ‘heterogeneity estimators’ and ‘prior distributions for variance parameters in sparse-event meta-analysis’ were not applicable due to use of continuous or discrete endpoints, lack of randomised trials or binary endpoints that were not sparse.

Chronic staged conditions

Methods that were applicable following adjustments were ‘sample size reassessment and hypothesis testing in adaptive survival trials’ (4/4 EPARs), ‘multi-arm multi-stage trial with a simultaneous stopping rule’ (2/4 EPARs), ‘sequential design for small samples’ (2/4 EPARs), ‘Bayesian sample size re-estimation using power priors’ (2/4 EPARs), ‘dynamic borrowing through empirical power priors that control type I error’ (2/4 EPARs), ‘fallback tests for multiple endpoints’ (3/4 EPARs), and ‘simultaneous inference for multiple marginal GEE models’ (2/4 EPARs). ‘GAS’ (1/4 EPARs) was applicable in the only non-oncological condition within this cluster, i.e. pulmonary hypertension. ‘Heterogeneity estimators’ and ‘prior distributions for variance parameters in sparse-event meta-analysis’ were not applicable due to use of continuous or discrete endpoints, lack of randomised trials or use of binary endpoints to measure outcomes that were not sparse.

‘Delayed-start randomisation’ was not applicable, as most added value is achieved if there is disease progression during the trial period and treatments have a lasting response, while this cluster was characterised by staged conditions with treatments having reversible effects.
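The per-cluster counts reported in the preceding subsections can be combined into one row of the dynamic-step summary. The sketch below does this for ‘sequential designs for small samples’, using only the counts quoted in the text; the variable names and shortened cluster labels are illustrative.

```python
# (applicable EPARs, EPARs in cluster) for 'sequential designs for small
# samples' after adjustments, as quoted in the cluster subsections above.
sequential_designs = {
    "acute single episodes": (3, 4),
    "acute recurrent episodes": (5, 6),
    "chronic, stable or slow progression": (3, 4),
    "chronic progressive, one system/organ": (1, 4),
    "chronic progressive, multiple systems/organs": (3, 4),
    "chronic staged": (2, 4),
}

# Percentage of EPARs per cluster (one row of a heat map).
per_cluster = {
    cluster: round(100 * applicable / total)
    for cluster, (applicable, total) in sequential_designs.items()
}

total_applicable = sum(applicable for applicable, _ in sequential_designs.values())
total_epars = sum(total for _, total in sequential_designs.values())
overall = round(100 * total_applicable / total_epars)

for cluster, pct in per_cluster.items():
    print(f"{cluster}: {pct}%")
print(f"overall: {total_applicable}/{total_epars} EPARs = {overall}%")
```

The total of 17 out of 26 EPARs is consistent with the maximum of 17 (65%) reported for the dynamic step.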

Advantages and disadvantages compared to the methods used

Potential advantages of using the new methods compared to the methods that were used in the drug development program may include (Table 1): reduction in sample size (depending on method and design), more robust evidence, reduced placebo use and/or reduced exposure to placebo or (in retrospect) inferior treatment, and patient involvement in benefit-risk assessment. Potential disadvantages were a level of evidence that is sufficient but not overwhelming (regardless of the positive or detrimental effect on patients), extra patients needed in case of variance overestimation compared to a frequentist approach, more time- and resource-demanding trials, and increased complexity or increased logistic demand on all involved stakeholders.

The evaluation of methods in the ‘meta-analysis’ group resulted in the conclusion that in the current selection of EPARs the two methods were only applicable in the cluster of conditions with acute recurrent episodes, while in fact the two methods have much more potential for applicability. Given the general preference for types of endpoints other than binary (i.e. continuous or time-to-event), that are generally considered to provide better statistical power and sensitivity to change, the meta-analysis methods were often not directly applicable. However, the methods can easily become applicable depending on the choice of endpoint and development background (i.e. number of trials using the same binary endpoint). The two methods could be taken into account in advance and pre-specified to be used in any development program with more than one randomised trial that measures dichotomous outcomes.

Discussion

The new methods developed in ASTERIX included new proposals for interim analysis and stopping rules in multi-arm parallel trials, methods for sample-size reassessment, rules to optimise boundaries in group-sequential designs, methods to tune the use of prior information from similar trials in Bayesian analysis, considerations to apply flexibility to the level of evidence, new approaches to analyse multiple endpoints, a patient-centered instrument for heterogeneous functional outcomes and two methods for meta-analysis of sparse binary data. The applicability requirements for the methods mainly concerned the type of measurement (i.e. binary or continuous variable, single or multiple main endpoints, scarcity of data), availability of more than one trial, availability of previous studies with good quality data, the length of time to endpoint as compared to the time to complete recruitment, and feasibility of randomised designs. While all methods were applicable to some extent and in total could add value on average in 76% of the condition clusters, they were often not directly applicable to the actual trial design or approaches used during clinical development of the OMP as described in the EPAR. Applicability and added value of the novel methods increased when the assessment was not limited to the actual study design but also considered potential changes to the individual trials within the context of the drug development program, i.e. when it considered the characteristics of the medical condition and optimised the drug development program as a whole, rather than improving the trial as presented in isolation.

The most notable strength of our research is that we systematically evaluated the applicability of the novel methods in a representative sample of real-life examples obtained from EPARs, with input from a multidisciplinary team of experts in epidemiology, statistics, drug development, drug regulation and clinical practice. We used therapeutic guidelines to determine whether reasonable changes could be made to the actual development plan or trials, such as the use of an additional control arm (depending on the seriousness and severity of the condition and on the availability of standard of care or best supportive care) or the use of a different type of endpoint. Importantly, all considered alternatives kept the primary development objectives intact. Some conditions are rare variants of non-rare conditions (such as Dravet syndrome being a rare and severe variant of epilepsy). Hence, we also checked whether we could borrow design elements and apply strategies from the more common variant (e.g. epilepsy in the case of Dravet syndrome), which provided proper justification for the changes made and made our conclusions more robust.

This evaluation also has some limitations. First, for feasibility reasons only 4–6 EPARs were evaluated within each cluster. Although we aimed to select a representative sample of different development approaches within each cluster, the applicability within these EPARs might not be fully generalizable to all conditions and drug development plans within the cluster. If a method turned out not to be applicable to any condition within a cluster, the method may still be applicable to some conditions within this cluster that were not selected for evaluation. The reverse is also true: if the method turned out to be applicable in all included EPARs within a cluster, the method may still not be applicable to all conditions within the cluster. Yet, the exercise in itself showed that a systematic approach, which defines the applicability pre-requisites together with the general characteristics of the medical conditions included in a given cluster, can guide investigators on whether to consider a given method for a certain type of medical condition. Thus, in each individual case the method’s pre-requisites, advantages, and disadvantages should be thoroughly evaluated for adequacy in the full context of the drug development program. While the applicability exercise may help to define the best toolbox to consider for a given clinical situation, the implications of the methods may differ between conditions and trials, and it should be judged on a case-by-case basis which one of them is optimal.

A further limitation is that the EPARs often reported limited detail on the information needed to determine applicability (e.g. recruitment rates and study timelines). This made it difficult to reach a thorough and fully informed judgment on the (in)applicability of a method, because that judgment depended on which changes were deemed feasible.

Additionally, only positive opinions were included in our evaluation, given the lack of accessible information on negative opinions. The inability to include negative opinions could have influenced the applicability of the methods. However, we conjecture that in negative opinions there is probably even more potential for improvement.

This work was limited to the European regulatory region. It could have included orphan drugs approved in other regions, notably the US and Japan, in order to cover more orphan conditions, but several factors hamper this approach. The criteria for OMP designation in the US do not match those used in the EU (i.e. different prevalence cut-off and inclusion of medical devices). Furthermore, detailed data on Japanese clinical developments for OMPs were not easily available.

We observed that the novel methods are applicable to real life studies, such as those reported in the EPARs, and that they have the potential to improve clinical drug development for small populations and directly address some of the issues flagged in the ‘Guideline for Clinical Trials in Small Populations’.

Not all challenges reported in EPARs or encountered in trials in rare diseases were covered by the novel methods developed within ASTERIX.

One possible avenue for extending this validation exercise would be to add, alongside the novel methods developed in the ASTERIX project, other study designs and methods for rare diseases available in the literature, as the results here show that the validation approach based on studies reported in EPARs works and can be extended.

Further research into methods to address these challenges is needed to improve and optimise drug development to ultimately be able to efficiently develop efficacious and safe treatments for all patients suffering from a rare disease.

Conclusion

Novel methods developed in ASTERIX include methods for trial design, analysis or meta-analysis of trials in small populations. The 13 developed methods have been found to be applicable to real-life examples, and can potentially improve drug development programs. Achieving optimal added value of these novel methods often requires consideration of the entire drug development program, rather than reconsideration of methods for a specific trial. The novel methods tested were mostly applicable in chronic conditions, and acute conditions with recurrent episodes. The implications of the methods may differ in specific medical conditions, and the systematic assessment as presented may guide selecting the optimal methods on a case-by-case basis.

Abbreviations

ASTERIX:

Advances in Small Trials dEsign for Regulatory Innovation and eXcellence

BMK:

Biomarker

EMA:

European Medicines Agency

EPAR:

European Public Assessment Report

FDA:

Food and Drug Administration

FP7:

Seventh Framework Program

OMP:

Orphan Medicinal Product

References

  1. de Vrueh R, Baekelandt ERF, de Haan JMH. Background Paper 6.19 Rare Diseases. 2013 [cited 2017 Sep 20]; Available from: http://www.who.int/entity/medicines/areas/priority_medicines/BP6_19Rare.pdf?ua=1.

  2. Policy - Public Health - European Commission. Public Health. [cited 2017 Sep 20]. Available from: /health/rare_diseases/policy_en

  3. European Medicines Agency - Overview - Support for early access. [cited 2017 Sep 20]. Available from: https://www.ema.europa.eu/human-regulatory/overview/support-early-access.

  4. Casteels-Rappagliosi B. Rare diseases and medical devices in the European. Probl Herb Med Leg Status. 1999;69.

  5. European Medicines Agency - Overview - Orphan designation. [cited 2017 Nov 29]. Available from: https://www.ema.europa.eu/human-regulatory/overview/orphan-designation.

  6. Hee SW, Willis A, Smith CT, Day S, Miller F, Madan J, et al. Does the low prevalence affect the sample size of interventional clinical trials of rare diseases? An analysis of data from the aggregate analysis of clinicaltrials.gov. Orphanet J Rare Dis. 2017;12:44.

  7. Guideline on clinical trials in small populations. [cited 2017 Nov 29]. Available from: http://www.ema.europa.eu/docs/en_GB/document_library/Scientific_guideline/2009/09/WC500003615.pdf

  8. Valsecchi MG, De Lorenzo P. Strategies for Trial Design and Analyses. In: Saha V., Kearns P. (eds) New Agents for the Treatment of Acute Lymphoblastic Leukemia. New York: Springer, 2011. p. 83-104.

  9. Bogaerts J, Sydes MR, Keat N, McConnell A, Benson A, Ho A, et al. Clinical trial designs for rare diseases: Studies developed and discussed by the International Rare Cancers Initiative. Eur J Cancer. 2015;51:271–81.

  10. Huang B, Giannini EH, Lovell DJ, Ding L, Liu Y, Hashkes PJ. Enhancing crossover trial design for rare diseases: Limiting ineffective exposure and increasing study power by enabling patient choice to escape early. Contemp Clin Trials. 2014;38:204–12.

  11. Hampson LV, Whitehead J, Eleftheriou D, Brogan P. Bayesian methods for the design and interpretation of clinical trials in very rare diseases. Stat Med. 2014;33:4186–201.

  12. Cornu C, Kassai B, Fisch R, Chiron C, Alberti C, Guerrini R, et al. Experimental designs for small randomised clinical trials: an algorithm for choice. Orphanet J Rare Dis. 2013;8:48.

  13. Gupta S, Faughnan ME, Tomlinson GA, Bayoumi AM. A framework for applying unfamiliar trial designs in studies of rare diseases. J Clin Epidemiol. 2011;64:1085–94.

  14. Parmar MKB, Sydes MR, Morris TP. How do you design randomised trials for smaller populations? A framework. BMC Med. 2016 [cited 2017 Sep 7];14. Available from: http://bmcmedicine.biomedcentral.com/articles/10.1186/s12916-016-0722-3.

  15. Abrahamyan L, Feldman BM, Tomlinson G, Faughnan ME, Johnson SR, Diamond IR, et al. Alternative designs for clinical trials in rare diseases. Am J Med Genet C Semin Med Genet. 2016;172:313–31.

  16. Asterix. Welcome to the ASTERIX project. Asterix. [cited 2017 Dec 3]. Available from: http://www.asterix-fp7.eu/.

  17. Pontes C, Fontanet M, Vives R, Sancho A, Gomez-Valent M, Rios J, et al. Evidence supporting regulatory-decision making on orphan medicinal products authorisation in Europe: methodological uncertainties. Orphanet J Rare Dis. 2018 (accepted).

  18. Jansen-van der Weide MC, Gaasterland CMW, Roes KCB, Pontes C, Vives R, Sancho A, et al. Rare disease registries: potential applications towards impact on development of new drug treatments. Orphanet J Rare Dis. 2018;13:154.

  19. Drugs@FDA: FDA Approved Drug Products. [cited 2018 Feb 12]. Available from: https://www.accessdata.fda.gov/scripts/cder/daf/.

  20. EU Clinical Trials Register - Update. [cited 2017 Dec 10]. Available from: https://www.clinicaltrialsregister.eu/.

  21. Home - clinicaldata.ema.europa.eu. [cited 2017 Dec 10]. Available from: https://clinicaldata.ema.europa.eu/web/cdp/home.

  22. Home - ClinicalTrials.gov. [cited 2017 Dec 10]. Available from: https://clinicaltrials.gov/.

  23. European Medicines Agency: European Public Assessment Reports. Available from: https://www.ema.europa.eu/about-us/how-we-work/what-we-publish/european-public-assessment-reports.

  24. Guideline on the investigation of medicinal products in the term and preterm neonate. [cited 2017 Nov 29]. Available from: https://www.ema.europa.eu/documents/scientific-guideline/draft-guideline-investigation-medicinal-products-term-preterm-neonate_en.pdf.

  25. Concept paper on the need for the development of a reflection paper on regulatory requirements for the development of medicinal products for chronic non-infectious liver diseases (PBC, PSC, NASH). [cited 2017 Nov 29]. Available from: https://www.ema.europa.eu/documents/scientific-guideline/concept-paper-need-development-reflection-paper-regulatory-requirements-development-medicinal_en.pdf.

  26. European Medicines Agency - News and Events - Development of medicines to treat tuberculosis. [cited 2017 Nov 29]. Available from: https://www.ema.europa.eu/news/development-medicines-treat-tuberculosis.

  27. Genser B, Cooper PJ, Yazdanbakhsh M, Barreto ML, Rodrigues LC. A guide to modern statistical analysis of immunological data. BMC Immunol. 2007;8:27.

  28. European Medicines Agency - Clinical efficacy and safety - Evaluation of medicinal products indicated for treatment of bacterial infections. [cited 2017 Nov 29]. Available from: https://www.ema.europa.eu/evaluation-medicinal-products-indicated-treatment-bacterial-infections.

  29. Safety and Efficacy Study of Aztreonam for Inhalation Solution (AZLI) in Cystic Fibrosis (CF) Patients With Pseudomonas Aeruginosa (PA) - Study Results - ClinicalTrials.gov. [cited 2017 Sep 6]. Available from: https://clinicaltrials.gov/ct2/show/results/NCT00128492.

  30. Guideline on the clinical development of medicinal products for the treatment of cystic fibrosis. [cited 2017 Nov 29]. Available from: https://www.ema.europa.eu/documents/scientific-guideline/guideline-clinical-development-medicinal-products-treatment-cystic-fibrosis_en.pdf.

  31. Lebensburger JD, Hilliard LM, Pair LE, Oster R, Howard TH, Cutter GR. Systematic review of interventional sickle cell trials registered in ClinicalTrials.gov. Clin Trials. 2015;12:575–83.

  32. Evidence-Based Management of Sickle Cell Disease Expert Panel Report, 2014. [cited 2017 Nov 29]. Available from: https://www.nhlbi.nih.gov/health-topics/evidence-based-management-sickle-cell-disease.

  33. Denton CP, Hughes M, Gak N, Vila J, Buch MH, Chakravarty K, et al. BSR and BHPR guideline for the treatment of systemic sclerosis. Rheumatology. 2016;55:1906–10.

  34. Public summary of opinion on orphan designation Insulin human for the treatment of short bowel syndrome. [cited 2017 Nov 29]. Available from: https://www.ema.europa.eu/documents/orphan-designation/eu/3/15/1532-public-summary-opinion-orphan-designation-insulin-human-treatment-short-bowel-syndrome_en.pdf.

  35. Jeppesen PB, Gilroy R, Pertkiewicz M, Allard JP, Messing B, O’Keefe SJ. Randomised placebo-controlled trial of teduglutide in reducing parenteral nutrition and/or intravenous fluid requirements in patients with short bowel syndrome. Gut. 2011;60:902–14.

  36. Bornstein SR, Allolio B, Arlt W, Barthel A, Don-Wauchope A, Hammer GD, et al. Diagnosis and Treatment of Primary Adrenal Insufficiency: An Endocrine Society Clinical Practice Guideline. J Clin Endocrinol Metab. 2016;101:364–89.

  37. Harrison CN, Bareford D, Butt N, Campbell P, Conneally E, Drummond M, et al. Guideline for investigation and management of adults and children presenting with a thrombocytosis. Br J Haematol. 2010;149:352–75.

  38. Hillmen P, Muus P, Dührsen U, Risitano AM, Schubert J, Luzzatto L, et al. Effect of the complement inhibitor eculizumab on thromboembolism in patients with paroxysmal nocturnal hemoglobinuria. Blood. 2007;110:4123–8.

  39. Blanke CD, Demetri GD, von Mehren M, Heinrich MC, Eisenberg B, Fletcher JA, et al. Long-Term Results From a Randomized Phase II Trial of Standard- Versus Higher-Dose Imatinib Mesylate for Patients With Unresectable or Metastatic Gastrointestinal Stromal Tumors Expressing KIT. J Clin Oncol. 2008;26:620–5.

  40. Roberts EA, Schilsky ML. Diagnosis and treatment of Wilson disease: An update. Hepatology. 2008;47:2089–111.

  41. Leggio L, Addolorato G, Parker S, Gasbarrini G, Tanner S, Group ES, et al. Wilson’s disease: Creating a european clinical database and designing randomised controlled clinical trials. Dig Liver Dis. 2006;38:S188.

  42. European Association for the Study of the Liver. EASL clinical practice guidelines: Wilson’s disease. J Hepatol. 2012;56:671–85.

  43. The ESMO/European Sarcoma Network Working Group. Gastrointestinal stromal tumours: ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann Oncol. 2014;25:iii21–6.

  44. Eng CM, Banikazemi M, Gordon RE, Goldman M, Phelps R, Kim L, et al. A phase 1/2 clinical trial of enzyme replacement in Fabry disease: pharmacokinetic, substrate clearance, and safety studies. Am J Hum Genet. 2001;68:711–22.

  45. Zimran A, Elstein D. Gaucher disease and the clinical experience with substrate reduction therapy. Philos Trans R Soc B Biol Sci. 2003;358:961–6.

  46. Gaucher disease A strategic collaborative approach from EMA and FDA. [cited 2017 Nov 29]. Available from: https://www.ema.europa.eu/documents/regulatory-procedural-guideline/gaucher-disease-strategic-collaborative-approach-european-medicines-agency-food-drug-administration_en.pdf.

  47. Guideline on the evaluation of anticancer medicinal products in man. Oncology. 2012 [cited 2017 Sep 6]; Available from: http://www.fdanews.com/ext/resources/files/archives/w/WC500119966.pdf.

  48. Appendix 1 to the guideline on the evaluation of anticancer medicinal products in man - methodological consideration for using progression-free survival (PFS) or disease-free survival (DFS) in confirmatory trials. [cited 2017 Nov 29]. Available from: https://www.ema.europa.eu/documents/scientific-guideline/appendix-1-guideline-evaluation-anticancer-medicinal-products-man-methodological-consideration-using_en.pdf.

  49. Simonneau G, Channick RN, Delcroix M, Galiè N, Ghofrani H-A, Jansa P, et al. Incident and prevalent cohorts with pulmonary arterial hypertension: insight from SERAPHIN. Eur Respir J. 2015;46:1711–20.

  50. Hlavin G, Koenig F, Male C, Posch M, Bauer P. Evidence, eminence and extrapolation. Stat Med. 2016;35:2117–32.

  51. Pateras K, Nikolakopoulos S, Mavridis D, Roes KCB. Interval estimation of the overall treatment effect in a meta-analysis of a few small studies with zero events. Contemp Clin Trials Commun. 2018;9:98–107.

  52. Spineli LM, Jenz E, Großhennig A, Koch A. Critical appraisal of arguments for the delayed-start design proposed as alternative to the parallel-group randomized clinical trial design in the field of rare disease. Orphanet J Rare Dis. 2017;12:140.

  53. Magirr D, Jaki T, Koenig F, Posch M. Sample Size Reassessment and Hypothesis Testing in Adaptive Survival Trials. Hills RK, editor. PLOS ONE. 2016;11:e0146465.

  54. Urach S, Posch M. Multi-arm group sequential designs with a simultaneous stopping rule. Stat Med. 2016;35:5536–50.

  55. Nikolakopoulos S, Roes KC, van der Tweel I. Sequential designs with small samples: Evaluation and recommendations for normal responses. Stat Methods Med Res. 2016;096228021665377.

  56. Brakenhoff T, Roes K, Nikolakopoulos S. Bayesian sample size re-estimation using power priors. Stat Methods Med Res. 2018;096228021877231.

  57. Nikolakopoulos S, van der Tweel I, Roes KCB. Dynamic borrowing through empirical power priors that control type I error. Biometrics. 2018;74:874–80.

  58. Ristl R, Frommlet F, Koch A, Posch M. Fallback tests for co-primary endpoints. Stat Med. 2016;35:2669–86.

  59. Ristl R, Xi D, Glimm E, Posch M. Optimal exact tests for multiple binary endpoints. Computational Statistics & Data Analysis. 2018;122:1–17.

  60. Ristl R, McDaniel L, Henderson N, Prague M. mmmgee: Simultaneous inference for multiple marginal GEE models. 2018. [cited 2018 Oct 10]. Available from: https://CRAN.R-project.org/package=mmmgee.

  61. Gaasterland CMW, Jansen-van der Weide MC, Weinreich SS, van der Lee JH. A systematic review to investigate the measurement properties of goal attainment scaling, towards use in drug trials. BMC Med Res Methodol. 2016;16:99.

Acknowledgements

This work was supported by the EU FP7 HEALTH.2013.4.2-3 PROJECT “Advances in Small Trials dEsign for Regulatory Innovation and eXcellence” (ASTERIX): Grant no 603160 ASTERIX Project - http://www.asterix-fp7.eu/ * [53] was jointly developed with partial funding from four projects including ASTERIX (Grant no 603160).

Funding

EU FP7 HEALTH.2013.4.2–3 PROJECT “Advances in Small Trials dEsign for Regulatory Innovation and eXcellence” (ASTERIX): Grant no 603160 ASTERIX Project - http://www.asterix-fp7.eu/

Availability of data and materials

All materials are included in this published article and its supplementary information files.

Author information

Contributions

MM and ORK drafted the first version of this manuscript. All authors engaged in the assessment of added value and applicability, and participated in full review and approval of the final manuscript.

Corresponding author

Correspondence to Caridad Pontes.

Ethics declarations

Ethics approval and consent to participate

As only publicly available information and data were used, there was no need for ethics approval or consent.

Consent for publication

All authors have contributed, reviewed and approved the submission of this version of the manuscript.

Competing interests

All authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional file

Additional file 1:

Provided in doc file: Supplementary Materials.docx. Appendix 1: Data extraction form for EPARs including condition summary and criteria list. Appendix 2: Decision tree structure for methods evaluation of applicability. Appendix 3: List of characteristics used to build the studies profile. Figure S1. Proposed method based on validation exercise. How to determine on the optimal trial design (DOCX 35 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Mitroiu, M., Oude Rengerink, K., Pontes, C. et al. Applicability and added value of novel methods to improve drug development in rare diseases. Orphanet J Rare Dis 13, 200 (2018). https://doi.org/10.1186/s13023-018-0925-0

  • DOI: https://doi.org/10.1186/s13023-018-0925-0

Keywords