Key Summary Points

A systematic literature review of real-world studies in patients with treatment-resistant depression (TRD) identified 20 studies conducted over the last 10 years.

We found that there was a lack of clinical consensus regarding the definition, assessment and monitoring of TRD in real-world practice.

At baseline, patients with TRD presented with moderate to severe depression and had typically experienced long-lasting major depressive episodes spanning multiple years; while seldom measured, health-related quality of life was apparently low.

Rates of response to treatment varied greatly between studies but were generally low; few studies investigated long-term outcomes, but those that did typically reported marginally greater rates of remission than those reporting on short-term outcomes.

Future real-world studies would benefit from standardised modalities of assessment, monitoring and reporting of treatment effectiveness, including greater consideration of health-related quality of life outcomes.

Introduction

Major depressive disorder (MDD) is one of the most common, yet debilitating, psychiatric disorders, characterised by persistently low mood and energy; anhedonia; changes in appetite, weight and sleep; fatigue; and suicidality, among other symptoms [1, 2]. The lifetime prevalence of MDD among the general population is estimated to be ~ 13 to 15%, with first-line treatments consisting of antidepressant medications, behavioural psychotherapy, or a combination thereof [3, 4]. Many patients with MDD, however, do not experience a sufficient response to initial antidepressant treatments and may develop treatment-resistant depression (TRD) [5]. TRD is most commonly defined as non-response to two or more different pharmacological treatments, taken for an adequate duration and at an adequate dosage [6, 7].

TRD affects approximately one-third of patients with MDD and is associated with functional and physical decline, resulting in diminished health-related quality of life (HRQoL) [3, 8, 9]. Indeed, a considerable proportion of patients living with TRD are reported to be on long-term sick leave or unemployed [10]. The burden of illness is substantially greater, both to the individual and to society, for patients with TRD than it is for patients with MDD who respond to initial treatment [11]. Furthermore, the burden of TRD increases with the duration of the disorder, culminating in rates of hospitalisation for general medical and depression-related causes that are double those reported in patients with treatment-responsive MDD [12, 13]. Moreover, while long-term remission is the primary goal of antidepressant treatment, the probability of achieving remission after experiencing non-response to two adequate trials of medication decreases with each subsequent treatment, and those who require more treatment steps also demonstrate higher rates of relapse during follow-up [5, 14, 15]. Even in the absence of treatment resistance, residual symptoms in patients who do not achieve complete remission impose an increased personal burden and result in an increased risk of relapse, lower levels of social and psychological functioning, and greater rates of physical morbidity and mortality [14].

Pharmacological treatment of TRD can employ all approved antidepressant drugs, including selective serotonin reuptake inhibitors and serotonin and norepinephrine reuptake inhibitors (SSRIs/SNRIs), tricyclic antidepressants (TCAs), monoamine oxidase inhibitors and other types of antidepressants. Several medications that are not approved for antidepressant monotherapy in MDD and do not have direct antidepressant activity, such as lithium, thyroid hormone and some atypical antipsychotic drugs, may be used to augment antidepressant treatments [16, 17]. However, advances in the development of specific treatments for TRD have been slow. Currently, in Europe, the only treatment approved specifically for TRD, as it is defined above, is esketamine, an N-methyl-D-aspartate receptor antagonist, which is administered as a nasal spray in combination with an SSRI/SNRI [18]. In the US, in addition to esketamine, a combination of olanzapine and fluoxetine hydrochloride (Symbyax®) is also approved [19, 20].

Current treatment strategies involve the switching, combining and augmenting of medications approved for the treatment of MDD [10]. Beyond the confines of clinical trials, there is a dearth of evidence assessing the characteristics of, treatment strategies employed for, and outcomes experienced by, patients with TRD in the real world, where populations are more diverse, have more comorbidities and may be less adherent to treatments [16]. Such real-world data are essential to draw a more realistic picture of the treatment landscape, long-term outcomes and the personal burden of disease for patients with TRD. Similarly, it is important to establish how outcomes experienced by patients with TRD are assessed and monitored in real-life clinical practice, and the timescales over which outcomes are reported, in order to support greater comparability of future studies.

The purpose of this systematic literature review (SLR) was therefore to assess real-world evidence in TRD, in order to understand and summarise available evidence regarding current treatment strategies and outcomes. Specifically, the objectives of this SLR were: (1) to further understand the clinical and patient-reported disease burden for patients with TRD in the real world, and (2) to explore the real-world effectiveness of available treatments and the unmet need for better treatment options for these patients. Given the rapid transformation of real-world treatment settings and practices, and in order to provide data relevant to current real-world practice, this systematic review was restricted to the decade prior to the review date.

Methods

This article is based on previously conducted studies and does not contain any new studies with human participants or animals performed by any of the authors.

Eligibility Criteria, Selection Process and Outcomes

This SLR was conducted according to a pre-specified protocol. Studies were included if they reported outcomes for adult patients with treatment-resistant MDD [with MDD being diagnosed by the Diagnostic and Statistical Manual of Mental Disorders (DSM) Edition 3 or above, or the International Classification of Diseases (ICD) Edition 9 or above], in which treatment resistance was defined as failure to adequately respond to at least two treatments given at an adequate dose during the same major depressive episode (MDE). Any definition of failure or inadequate response to a treatment as provided by the study authors was deemed eligible for inclusion. Eligible studies included those reporting on both pharmacological and non-pharmacological interventions but excluded non-medical interventions such as traditional Chinese medicine and nutritional supplements. While the search strategy also included studies reporting outcomes related to psychiatric emergencies in patients with MDD (MDD-PE), only articles specific to the TRD population are included within this publication.

Data were collected on prevalence of TRD, treatment types and effectiveness, HRQoL and patient characteristics. Studies reporting the following outcomes for the patient population of interest were included: improvement of depression severity as measured by one of the following validated scales: Montgomery–Åsberg Depression Rating Scale (MADRS), Hamilton Depression Rating Scale (HAM-D), Beck Depression Inventory (BDI), Clinical Global Impression–Change (CGI-C), Clinical Global Impression–Severity (CGI-S), 9-question Patient Health Questionnaire (PHQ-9); improvement in HRQoL as measured by any valid instrument; and rates of remission and/or response and/or non-remitters and/or non-responders, according to any of the aforementioned validated scales.

Only real-world non-interventional studies were included. These comprised cohort studies, cross-sectional studies, case–control studies, chart reviews and registry studies; case reports and case series were excluded owing to their high potential for selection bias. Interventional studies, such as randomised controlled trials, were also excluded. The search included studies published from January 2012 up to and including May 2022, and conference proceedings published from January 2020 up to and including June 2022. Full eligibility criteria are reported in Table 1.

Table 1 Inclusion and exclusion criteria

All titles were reviewed by a single senior reviewer (AB or HL), and 10% of each reviewer's exclusion decisions were checked by a second reviewer. All articles included after the title review were then reviewed by two independent reviewers at the abstract and full-text stages. At these stages, disagreements were resolved by discussion until consensus was reached. If necessary, a third reviewer made the final decision.
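The 10% verification of title-stage exclusions can be illustrated with a short sketch. The review does not state how the checked records were selected, so the random draw, the fixed seed and the names `sample_for_second_review` and `record_ids` are assumptions made purely for illustration.

```python
import random


def sample_for_second_review(excluded_ids, percent=10, seed=0):
    """Draw a reproducible sample of one reviewer's exclusions for checking.

    Assumption: records are chosen at random; the review only states that
    10% of each reviewer's exclusion decisions were checked.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    n = len(excluded_ids)
    # Integer ceiling division, so that small batches still get checked.
    k = (n * percent + 99) // 100
    rng = random.Random(seed)  # fixed seed keeps the audit sample reproducible
    return sorted(rng.sample(list(excluded_ids), k))


# Hypothetical record identifiers excluded at title stage by one reviewer.
record_ids = [f"REC{i:04d}" for i in range(1, 251)]
to_check = sample_for_second_review(record_ids)
```

Rounding the sample size up rather than down is a design choice in this sketch; it ensures that at least one record is checked whenever a reviewer has excluded anything.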

Search Strategy

A search was conducted using the electronic databases MEDLINE (including MEDLINE In-Process, MEDLINE Daily and MEDLINE Epub Ahead of Print), Embase, the Cochrane Library [including Cochrane Database of Systematic Reviews (CDSR) and Cochrane Central Register of Controlled Trials (CENTRAL)] and PsycINFO. MEDLINE, MEDLINE In-Process, MEDLINE Epub Ahead of Print and Embase were searched simultaneously via the Ovid SP platform (11 May 2022). The full list of search terms used for the Ovid SP platform is presented in Supplementary Table S1. CDSR and CENTRAL were searched via the Cochrane Library on the Wiley Online platform (11 May 2022; Supplementary Table S2). PsycINFO was searched via the American Psychological Association (APA) website (10 May 2022; Supplementary Table S3).

In addition, the bibliographies of all relevant SLRs identified during the literature review were hand-searched for any additional relevant studies. Furthermore, conference proceedings for 2020 to 2022 from the European College of Neuropsychopharmacology Congress, European Congress of Psychiatry, American Psychiatric Association Annual Meeting, American College of Neuropsychopharmacology Annual Meeting and Psych Congress were also searched.

Data Collection Process and Data Items

Data extractions and quality assessments were performed by a single researcher, with a second researcher independently verifying the extracted information. When necessary, a third individual was enlisted to arbitrate the final decision. The quality of all included studies was assessed using the Alberta Heritage Foundation for Medical Research (AHFMR) tool (Supplementary Figure S1), which was found to be the tool most suited to the heterogeneous nature of the study designs and outcomes collected.

Data were extracted for predefined outcomes. Extracted study characteristics included: the definition of TRD used, patient inclusion and exclusion criteria, total number of patients included and number of patients of relevance (patients with TRD) included, duration of follow-up and investigational treatment type. For baseline participant characteristics, we recorded: age, sex, education level, marital status, employment status, race or ethnicity, disease severity, disease duration, number and type of previous therapeutic interventions that did not result in adequate response and existing treatment within the current MDE. Treatment outcomes were extracted for: change in depression severity and rates of remission and response over time, as measured by MADRS, HAM-D, BDI, CGI-C, CGI-S or PHQ-9 score, as well as change in HRQoL over time, as measured by any validated instrument.

Results

Included Studies

A total of 8,030 records were identified through database searches, with a further 5,296 identified through supplementary searches. Following title, abstract and full-text review, 22 publications were included in the SLR (Fig. 1). The 22 publications reported on 20 unique studies, including 13 prospective cohort studies, 5 retrospective cohort studies, 1 chart review and 1 case–control study.

Fig. 1 Flowchart of studies included and excluded in the systematic review process. CDSR Cochrane Database of Systematic Reviews, CENTRAL Cochrane Central Register of Controlled Trials, MDD-PE major depressive disorder-psychiatric emergency, TRD treatment-resistant depression, SLR systematic literature review

The key characteristics of the included studies reporting on the TRD population are detailed in Table 2. Non-pharmacological treatments consisted of transcranial magnetic stimulation (TMS; n = 7) [21,22,23,24,25,26,27] and electroconvulsive therapy (ECT; n = 1) [28]. Studies of specific pharmacological treatments comprised ketamine and/or esketamine (n = 3) [29,30,31], onabotulinum toxin (n = 1) [32], valproate (n = 1) [33], pramipexole (n = 1) [34] and tranylcypromine and amitriptyline (n = 1) [35]. Other studies employed combinations of multiple pharmacological and/or non-pharmacological treatments (n = 5) [16, 36,37,38,39,40]. The majority of included studies (13/20) were prospective cohort investigations and reported a wide range of follow-up durations (range: 2 weeks to 9.4 years). The number of patients of relevance included in each study ranged from 14 to 411 (Table 2). Just over half of the included studies (12/20) had fewer than 50 patients with TRD, with only three studies including more than 100 patients.

Table 2 Characteristics of included studies in the TRD patient population

Of the 19 studies reporting the number of sites, most were undertaken at a single site (14/19), with the remaining five being multicentre studies. The majority of studies were conducted in North America (8/20), Europe (5/20) or Asia (5/20), with only one study reporting data from multiple countries [16]. The most common countries of study location were Canada (4/20), the United States (4/20), India (3/20) and Italy (3/20).

Definition of TRD

There was considerable variation in the definition of TRD used within the inclusion criteria of studies (Table 2). The minimum number of previously failed treatments for classification as treatment-resistant ranged from at least two (as per the study inclusion criteria for this SLR) to at least four. Only some studies (7/20) specified the necessary minimum duration of treatment administration considered to be adequate; this ranged from ‘at least 3 weeks’ to ‘at least 6 to 8 weeks’. Only one study explicitly reported a quantitative value to define inadequate improvement with treatment (≤ 25% improvement on best day Massachusetts General Hospital–Antidepressant Treatment Response Questionnaire [MGH-ATRQ] score) [16]. Other studies employed different levels of responsiveness, with variations in the language used to define treatment failure, including ‘failure to remit’, ‘insufficient response’ and ‘demonstrated inadequacy’.
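To make the consequence of this variation concrete, the following minimal sketch applies two definitions drawn from the extremes reported above (at least two failures with at least 3 weeks of treatment, versus at least four failures with at least 6 weeks) to the same treatment history. The `Trial` record, the function names and the example history are hypothetical, and whether a trial "responded" is taken as given, since the included studies defined inadequate response in different ways.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """One prior treatment trial in the current episode (hypothetical record)."""
    name: str
    weeks: float          # duration of the trial
    adequate_dose: bool   # dose judged adequate
    responded: bool       # True if the study's own response criterion was met


def count_adequate_failures(trials, min_weeks):
    """Count trials adequate in dose and duration that did not produce response."""
    return sum(
        1
        for t in trials
        if t.adequate_dose and t.weeks >= min_weeks and not t.responded
    )


def is_treatment_resistant(trials, min_failures, min_weeks):
    """Apply one study-specific TRD definition to a treatment history."""
    return count_adequate_failures(trials, min_weeks) >= min_failures


# Illustrative history only; not taken from any included study.
history = [
    Trial("drug A", weeks=4, adequate_dose=True, responded=False),
    Trial("drug B", weeks=8, adequate_dose=True, responded=False),
    Trial("drug C", weeks=7, adequate_dose=True, responded=False),
]

lenient = is_treatment_resistant(history, min_failures=2, min_weeks=3)
strict = is_treatment_resistant(history, min_failures=4, min_weeks=6)
```

Under these assumptions the same patient meets the more lenient definition but not the stricter one, which is one reason why cohorts labelled as TRD in different studies are not directly comparable.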

Baseline Characteristics

The level of descriptive characteristic data captured at baseline varied between studies (Table 3). Included patients with TRD were typically middle-aged (range of mean age: 41.2 to 64.5 years) and overall were approximately balanced for sex (range of percentage female: 27.3 to 66.1%).

Table 3 Baseline characteristics of patients of relevance in included studies of TRD

Disease severity was reported by 18/20 studies, with the HAM-D scale being used most frequently (n = 11). Other rating scales used included MADRS (n = 7), CGI-S (n = 4), BDI (n = 3) and the PHQ-9 (n = 3). All studies that included a measure of disease severity at baseline reported mean scores that could be classified as either moderate, moderate to severe, or severe, according to previously defined thresholds for the HAM-D [41], MADRS [42], BDI [43] and PHQ-9 [44] scales (Fig. 2). Mean disease duration was reported by 9/20 studies, ranging from 1.4 to 16.5 years in the overall cohorts of included studies, with 6/9 of these studies reporting mean durations > 10 years. The duration of the current depressive episode was reported by 6/20 studies, with mean durations ranging from 0.8 to 12.5 years. Three of the included studies reported the mean number of previously failed treatments during the current depressive episode, ranging from 2.9 to 5.9 in the main study cohorts, with one study reporting a mean of 6.4 previously failed treatments in a subset of patients who did not respond to TMS treatment [21].

Fig. 2 Baseline depression severity scores. Data are presented as mean ± SD (where available). ᵃGroup receiving antidepressant treatment. ᵇGroup receiving second-generation antipsychotic plus antidepressant treatment. ᶜPatients who subsequently experienced remission. ᵈPatients who did not subsequently achieve remission. ᵉSeverity classification for HAM-D score defined by Zimmerman et al. [41]. ᶠSeverity classification for MADRS score defined by Muller et al. [42]. ᵍSeverity classification for BDI score defined by Beck et al. [43]. ʰPatients who were THC-positive. ⁱPatients who were THC-negative. ʲSeverity classification for PHQ-9 score defined by Kroenke et al. [44]. BDI Beck Depression Inventory, HAM-D Hamilton Depression Rating Scale, MADRS Montgomery–Åsberg Depression Rating Scale, PHQ-9 9-question Patient Health Questionnaire, SD standard deviation, THC tetrahydrocannabinol

Clinical Outcomes

Responsiveness to treatment was assessed most frequently by a version of the HAM-D scale (n = 10), followed by MADRS (n = 9), CGI-S (n = 4), BDI (n = 3), CGI-C (n = 2) and PHQ-9 (n = 2). Studies typically reported follow-up data across a relatively short period, with 11/20 studies featuring a follow-up period of 12 weeks or less and only 5/20 studies reporting follow-up data over a period of 12 months or more. Of studies reporting HAM-D scores, 7/10 studies reported the number of patients experiencing at least one level of response (remission, response, partial response or non-response), 8/10 reported absolute scores and 6/10 reported either an absolute or relative change from baseline. Of studies reporting MADRS scores, 6/9 studies reported the number of patients experiencing at least one level of response, 7/9 reported absolute scores and 4/9 reported either an absolute or relative change from baseline.
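The relationship between absolute scores, relative change from baseline and categorical levels of response can be sketched as follows. This is an illustration under stated assumptions rather than the method of any included study: the 50% reduction used for response is a widely used convention, the remission cut-off must be supplied by the caller because it depends on the scale and the source, and the example scores and cut-off of 10 are placeholders.

```python
def relative_change(baseline, follow_up):
    """Relative change from baseline; negative values indicate improvement."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return (follow_up - baseline) / baseline


def classify_outcome(baseline, follow_up, remission_cutoff, response_reduction=0.5):
    """Assign a level of response from two total scores on the same scale.

    Assumptions: remission means a follow-up score at or below
    `remission_cutoff`; response means a relative reduction of at least
    `response_reduction` (50% by default).
    """
    if follow_up <= remission_cutoff:
        return "remission"
    if relative_change(baseline, follow_up) <= -response_reduction:
        return "response without remission"
    return "non-response"


# Placeholder scores on an unspecified scale, with an illustrative cut-off of 10.
examples = {
    "patient 1": classify_outcome(30, 8, remission_cutoff=10),
    "patient 2": classify_outcome(30, 14, remission_cutoff=10),
    "patient 3": classify_outcome(30, 24, remission_cutoff=10),
}
```

Because the included studies differed in both the scale used and the thresholds applied, the same pair of scores could be assigned to different categories in different studies.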

For studies reporting absolute or relative changes in a depression severity rating scale, a summary of the mean change in depression severity score from baseline to the final pre-specified timepoint is presented in Table 4. Studies reporting absolute mean HAM-D (8/20), MADRS (7/20) or BDI (2/20) scores over time are presented alongside previously described thresholds for severity classification and remission in Fig. 3 [41,42,43, 45, 46].

Table 4 Change in depression severity score from baseline to final prespecified timepoint

Fig. 3 Change in mean absolute depression severity scores over time. ᵃGroup receiving antidepressant treatment. ᵇGroup receiving second-generation antipsychotic plus antidepressant treatment. ᶜSeverity classification for HAM-D score defined by Zimmerman et al. [41]. ᵈRemission thresholds for HAM-D and MADRS previously defined by Zimmerman et al. [45]. ᵉSeverity classification for MADRS score defined by Muller et al. [42]. ᶠSeverity classification for BDI score defined by Beck et al. [43]. ᵍRemission threshold for BDI previously defined by Riedel et al. [46]

In studies reporting the proportion of patients achieving remission and/or response, no single treatment type exhibited a marked pattern of higher rates of treatment responsiveness. However, remission and response rates broadly increased over time.

In the acute setting (≤ 8 weeks), explicitly reported remission rates using HAM-D (2/20) or MADRS (1/20) scores ranged from 0 to 18% in the overall study populations, while rates of response without remission (only reported using HAM-D data) ranged from 0 to 57.1%. In studies explicitly reporting medium-term remission rates (> 8 weeks to ≤ 6 months) using MADRS (2/20) or HAM-D (1/20) scores, remission rates were generally higher than those reported in the acute setting, ranging from 16.7 to 70.9% in the overall study populations. In these studies, rates of response without remission ranged from 9.8 to 80.6%. Long-term (≥ 12 months) rates of remission and/or response were reported by 2/20 studies using MADRS and 1/20 studies using HAM-D. Long-term remission rates ranged from 19.2 to 54.8% in the overall study populations, while rates of response without remission ranged from 11.6 to 15.9% (only reported using MADRS data).
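The three reporting windows used above can be expressed as a simple classification of follow-up duration. The sketch below assumes that 6 months corresponds to 26 weeks and 12 months to 52 weeks; the function name and the "unclassified" label for durations between the medium-term and long-term windows are illustrative choices, not terms used in the review.

```python
WEEKS_IN_6_MONTHS = 26   # assumption: 6 months taken as 26 weeks
WEEKS_IN_12_MONTHS = 52  # assumption: 12 months taken as 52 weeks


def follow_up_window(weeks):
    """Map a follow-up duration in weeks to a reporting window."""
    if weeks < 0:
        raise ValueError("follow-up duration cannot be negative")
    if weeks <= 8:
        return "acute"
    if weeks <= WEEKS_IN_6_MONTHS:
        return "medium-term"
    if weeks >= WEEKS_IN_12_MONTHS:
        return "long-term"
    # Durations between 6 and 12 months fall outside the three windows.
    return "unclassified"


# The shortest and longest follow-up durations reported by the included studies.
shortest = follow_up_window(2)
longest = follow_up_window(9.4 * 52)
```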

Health-Related Quality of Life Outcomes

Two studies, Heerlein et al. and Perugi et al., reported HRQoL data, the latter reporting on an Italian subset of patients in the study of the former [16, 38, 40]. These studies assessed HRQoL at baseline and after 6 months using the European Quality of Life Group, 5-Dimension 5-Level Scale (EQ-5D-5L), whereby an index score of 1 represents perfect health, 0 represents a health state equivalent to death and < 0 represents a state worse than death [38]. Heerlein et al. reported a mean baseline EQ-5D-5L index of 0.41 in 397 patients [38]. In a separate publication reporting on the same study, after 6 months of receiving various treatments, the EQ-5D-5L score had increased by 0.11 in patients who did not respond to treatment, by 0.26 in those who experienced response without remission and by 0.34 in those who experienced remission [16]. After 12 months, the improvements from baseline were 0.11, 0.31 and 0.35, respectively [16]. Perugi et al. similarly reported a mean baseline EQ-5D-5L index of 0.4 in 121 patients [40]. After 6 months of receiving various treatments, the EQ-5D-5L index improved to 0.6 (n = 85) in patients remaining in the study, but was lower in those who did not respond to treatment (0.2; n = 61) versus those who responded (0.7; n = 8) or reached remission (0.9; n = 16).

Heerlein et al. also reported the impact of TRD on functioning [Sheehan Disability Scale (SDS)] and work productivity [Work Productivity and Activity Impairment (WPAI) questionnaire]. At baseline, according to the SDS, 61.6% of these patients experienced marked or extreme work impairment (mean SDS total score: 22.4), with WPAI scores revealing overall mean impairment of work and activity to be 60.5% and 73.3%, respectively. After 6 and 12 months of treatment, mean change from baseline in total SDS score was − 2.67 and − 2.91, respectively, in those who did not respond to treatment, − 7.58 and − 7.00 in those with response without remission and − 12.53 and − 14.44 in those who experienced remission. Changes in WPAI scores were not reported.

Quality Assessment of Included Studies

The quality of the included studies, as indicated by the AHFMR quality assessment checklist, was moderate to good (Supplementary Figure S1). Across the included studies, the description of subjects and settings was generally appropriate, with just two studies providing only a partially adequate description. While only three studies included an appropriate sample size for the study design and target population, all studies provided an adequate description of the statistical analysis methods employed. There was a consistent lack of adjustment for confounding, which was either not done or not reported in almost all (19/20) studies.

Discussion

This systematic review has identified 20 real-world studies, comprising a variety of pharmacological and non-pharmacological treatments, reporting baseline characteristics and clinical outcomes in patients with TRD. There was substantial heterogeneity in the definition of TRD and in the means of assessing, and manner of reporting, the burden of illness and treatment outcomes, preventing the quantitative synthesis of results. Nevertheless, patients with TRD consistently presented with moderate to severe depression, long durations of illness and poor HRQoL. Only two studies assessed the latter, suggesting that greater emphasis is placed on clinical outcomes than patient-centred outcomes in real-world studies. Treatment outcomes varied greatly. While many patients experienced a level of response by the end of the included studies’ follow-up period, rates of remission were generally low. Studies predominantly involved relatively small samples followed up over relatively short durations, highlighting the need for larger-scale, longer-term studies.

In their criteria for TRD, studies did not consistently define what constitutes response failure, nor adequate improvement, with the latter ranging from remission to response. Despite the definition of TRD being centred around the number of prior treatment failures and the well-established negative relationship between the number of prior treatment failures and the probability of relapse from acute response over time, very few studies reported the absolute number of prior treatments received [5]. The tools used to assess disease severity were varied, with most studies only reporting outcomes using a single tool. As the clinical tools developed to assess TRD focus on different elements of the disease and use a range of assessment methods, a more complete picture of the disease and its burden could be developed by consistently using multiple tools within individual studies. Similarly, the sample sizes and follow-up durations of the included studies were wide-ranging, but studies typically featured relatively small sample sizes monitored over a period of several weeks to a few months, potentially reducing the robustness of the findings. Of the included studies, the most frequently utilised treatment was TMS, while several studies also reported on patients receiving multiple treatments, often comprising two or more different treatment types. Collectively, the heterogeneity of the included studies suggests a lack of clinical consensus and standardisation in the severity classification and monitoring of TRD in real-world practice [11].

Applying previously defined cut-offs, the severity of depression at baseline, which has been identified as the most important prognostic factor for TRD [47], ranged from moderate to severe. Disease duration at baseline was typically greater than 10 years, with the current MDE frequently spanning several years. These findings are consistent with an earlier review of the burden of TRD, which reported that, on average, patients with TRD had MDD durations of 4.4 years and had completed 4.7 unsuccessful drug treatments [11]. Taken together, the common concurrence of MDD that spans many years, with severe and prolonged MDEs, is indicative of the substantial and often unremitting burden imposed by TRD, which exceeds that of MDD alone [3].

Definitions of remission and response varied between studies. In those studies reporting rates of responsiveness to treatment in the acute setting, remission rates were generally low. Given the established propensity among patients with TRD for severe and long-lasting MDD and MDEs, it follows that longer-term interventions and follow-up periods are likely to be required for many patients to achieve remission. Indeed, studies of medium- and long-term follow-up durations typically reported higher rates of treatment responsiveness than those of acute interventions, but these rates were nevertheless highly variable, with many patients not reaching remission after 12 months of treatment. Heerlein et al. reported that, 6 months after initiating a new antidepressant treatment, only 16.7% of patients with TRD achieved remission, rising only to 19.2% at 12 months [16]. Similarly, Perugi et al., reporting on a subset of the previous study, demonstrated that, after 6 months of receiving various treatments, only 18% of patients had experienced remission, rising only marginally to 22.7% at 12 months [40]. This is in agreement with an earlier review of studies of patients with TRD, which captured data from nearly 60,000 patients and reported wide-ranging rates of remission and response, averaging 20% and 36%, respectively [11]. Indicative of the severe burden and unmet need for effective treatments experienced by patients with TRD, the aforementioned review also reported a 17% prevalence of prior suicide attempts in this population. Collectively, these studies demonstrate that existing treatment strategies are often insufficiently effective to enable patients with TRD to experience remission, particularly over short durations of treatment.

Health-related quality of life was rarely assessed within the included studies, suggesting that, in real-world practice, greater emphasis is still placed on clinical outcomes than patient-centred outcomes. However, it should also be considered that there are few disease-specific tools for the assessment of HRQoL in patients with MDD, which may contribute to the infrequency of HRQoL evaluation. Nevertheless, it has been previously demonstrated that, when compared with MDD patients without treatment resistance, patients with TRD experience significantly lower HRQoL and a greater impairment of work activity and productivity [3]. Both studies that reported HRQoL data demonstrated low HRQoL scores in the TRD population [16, 38, 40]. In these studies, patients with TRD exhibited mean baseline EQ-5D-5L index scores (0.4 and 0.41) that were substantially lower than those of the general adult population, which typically range between ~ 0.70 and 0.95 globally [38, 40, 48]. Patients with TRD were also likely to be experiencing significant work and activity impairment, culminating in high levels of absenteeism and presenteeism [10]. In these two studies, patients who did not achieve remission during the study period experienced further declines or minimal improvement in HRQoL. Importantly, although remission rates were low in these studies, those patients with TRD who did experience remission also experienced substantial increases in HRQoL, emphasising the merit of remission as the primary treatment goal [16, 40]. Future studies would benefit from increased assessment and continued monitoring of HRQoL to better capture the personal burden of TRD and to assess the effect of treatment options on functioning and productivity. This, alongside consistent use of a range of clinical outcome assessment tools, could build a more comprehensive picture of the TRD population and the specific unmet needs that patients experience in their day-to-day lives.

This SLR was conducted in accordance with best practice guidelines, such as the use of two independent systematic reviewers to review abstracts and full-text articles against the inclusion and eligibility criteria [49]. Nevertheless, conclusions drawn from this systematic review are naturally limited by the information available in, and the methodological quality of, the included published literature. While the present study’s evaluation of real-world evidence enables greater confidence in the relevance and applicability of findings to the patient population, the lack of standardisation prevented quantitative synthesis of the findings. Owing to the inclusion of only real-world evidence studies, and the heterogeneity of the interventions and study designs employed, safety outcomes were not captured. This decision was made since, outside randomised controlled trials, safety outcomes are inconsistently monitored and reported, thus limiting comparability. Nevertheless, it must be considered that treatment side effects are likely to influence subsequent treatment decisions, clinical outcomes and patient-centred outcomes, such as HRQoL and functioning. As such, it may be of value to investigate real-world safety patterns of therapies used for TRD in future studies.

Conclusions

This review demonstrates that there is a paucity of studies investigating real-world treatment of patients with TRD. Those that do are heterogeneous in their definition, assessment and monitoring of TRD and feature a wide range of treatment types and durations. The lack of evidence, together with the heterogeneity of the studies that are available, makes drawing specific conclusions about this patient population in the real world challenging. However, more broadly, studies found in this review show that patients with TRD had typically experienced long-lasting MDD, with moderate or severe MDEs spanning multiple years. Rates of response to treatment varied greatly between studies, but remission rates were typically low. Few studies had investigated long-term treatment outcomes in this patient population, for whom response to treatment is notoriously elusive. Longer durations of study intervention and follow-up were associated with marginally greater gains in favourable treatment outcomes, but remission rates typically remained low even after a year of treatment. Health-related quality of life was seldom measured, suggesting that greater emphasis is typically placed on the reporting of clinical outcomes over patient outcomes. When it was measured, HRQoL was reported to be particularly low in patients with TRD. Thus, while there is a lack of clinical consensus on the definition of TRD, the condition doubtless carries a high burden of illness and there exists an unmet need for more effective treatment options. Furthermore, future real-world studies would benefit from the application of standardised modalities of assessment, monitoring and reporting of treatment effectiveness, including greater consideration of HRQoL outcomes, to better understand the burden on patients affected by this condition.