Cochrane Database of Systematic Reviews Protocol - Intervention

Telerehabilitation for acute, subacute and chronic low back pain

Abstract

Objectives

This is a protocol for a Cochrane Review (intervention). The objectives are as follows:

To evaluate the benefits and harms of telerehabilitation for patients with acute, subacute or chronic non‐specific LBP, compared to usual care, no treatment, waiting list, any form of education and advice (remotely or face‐to‐face), or a similar face‐to‐face intervention on pain intensity and disability.

Background

Description of the condition

Low back pain (LBP) is defined as pain and discomfort between the lower rib margins and the buttock creases (Dionne 2008; Van Tulder 2002), with or without referred leg pain and neurological symptoms (Hartvigsen 2018). Non‐specific LBP is the most common type of LBP and is diagnosed by excluding radicular symptoms and serious spinal pathologies (i.e. fracture, cancer, infection, and inflammatory diseases) (Airaksinen 2006; Van Tulder 2006). LBP has been classified according to the duration of symptoms as acute/subacute (acute being less than six weeks' duration, and subacute being six to 12 weeks' duration); and as chronic (greater than 12 weeks' duration) (Airaksinen 2006).

Low back pain is the leading cause of disability worldwide (Global Burden of Disease Study 2015). LBP is experienced by people of all ages and socioeconomic status (Hartvigsen 2003; Hoy 2012). Low‐ and middle‐income countries are the most severely affected by disability caused by LBP, as their health and social systems may not be well equipped to deal with this growing burden (Hartvigsen 2018). However, the internet is widely available (Kemp 2018) and is increasingly being used as a tool to help deliver interventions for patients with LBP (Chiauzzi 2010; Del Pozo 2013a; del Pozo 2013b; Krein 2013).

Description of the intervention

The growth of mobile telecommunications and internet access presents new opportunities to reach, support, and treat patients with LBP. Recent reports have shown an increase in internet users in Africa (73 million new users or a 20% increase), America (23 million new users or a 3% increase), the Asia‐Pacific (98 million new users or a 5% increase), Europe (37 million new users or a 6% increase), and the Middle East (17 million new users or an 11% increase) in 2018 (Kemp 2018). This means that over half of the world's population (i.e. 4.021 billion people) used the internet in 2018 (Kemp 2018).

Remotely delivered interventions, including telerehabilitation, have the potential to increase access to general healthcare services, deliver care to rural areas, offer providers greater flexibility in scheduling, and save patients' time and resources in seeking care (Kruse 2018). Telerehabilitation can be broadly defined as the delivery of rehabilitation services over telecommunication technologies such as websites, smartphone apps, videoconferencing systems and telephone, and can be considered as a subfield of telehealth (Russell 2007). It could provide a platform to deliver services offered by a number of health disciplines including physiotherapy, occupational therapy, dietetics, psychology and others. It may involve the full spectrum of client care, including the client interview, physical assessment and diagnosis, treatment, maintenance activities, consultation, education and training. Telerehabilitation services have been developed as a way of increasing accessibility to healthcare, especially for rural populations, those with disability, or people living in highly‐populated cities where healthcare systems can be overcrowded. It overcomes some of the potential barriers to healthcare access such as travel (distance, traffic, transport), time consumed, high demand for the public health system (long waiting lists), lack of insurance cover for private care, and high costs for long‐term treatment (Kairy 2009; Lee 2018). Telerehabilitation is a promising approach to the management of patients with chronic conditions, including back pain.

Different terms are used within this growing field, typically: telemedicine, telehealth, e‐health, telecare and distance treatment. This type of intervention is usually delivered without (or with very limited) face‐to‐face contact with a therapist, via technologies (mainly the internet, but also via smartphone or computer, virtual reality or stand‐alone computer programmes). Telerehabilitation interventions have shown promise for the management of chronic pain (Dear 2013), other chronic health conditions such as diabetes (Mori 2011), and patients' recovery after coronary artery bypass surgery (Barnason 2009); however, the effectiveness of telerehabilitation for patients with non‐specific LBP is still unclear.

How the intervention might work

Guidelines have recently revised recommendations for the management of LBP. More emphasis has been placed on simple management in first‐line care, instead of pharmacological and complex therapies (Hartvigsen 2018). The choice of delivery of self‐management and simple interventions through telecommunication devices and technologies has become more common. Recent published studies have used telerehabilitation to support and educate patients with LBP to self‐manage their conditions, and to encourage an active lifestyle (Amorim 2019; Irvine 2015; Shebib 2019). Telerehabilitation interventions have been used as a single intervention (Chiauzzi 2010; Irvine 2015; Krein 2013; Lorig 2002), and in combination with different interventions, such as usual care and educational materials (Damush 2003a; Damush 2003b; Del Pozo 2013a; del Pozo 2013b; Iles 2011). Studies have delivered telerehabilitation via websites (Chiauzzi 2010; Del Pozo 2013a; del Pozo 2013b; Krein 2013), online chat group discussions (Krein 2013), email discussions (Lorig 2002), phone calls (Iles 2011), or a combination of these strategies (Amorim 2019; Irvine 2015; Krein 2013; Shebib 2019).

Why it is important to do this review

The number of studies on telerehabilitation for patients with LBP has increased. The effectiveness of telerehabilitation has been tested in a number of randomised controlled trials (Amorim 2019; Chhabra 2018; Chiauzzi 2010; Damush 2003a; Damush 2003b; Del Pozo 2013a; del Pozo 2013b; Geraghty 2018; Irvine 2015; Krein 2013; Lorig 2002; Rutledge 2018a; Rutledge 2018b; Shebib 2019; Williams 2019; Zadro 2019), yet only a few of these have been summarised in one systematic review (Dario 2017). We intend to perform a Cochrane Review in order to provide accurate and robust information on the effectiveness of telerehabilitation for patients with non‐specific LBP.

Objectives

To evaluate the benefits and harms of telerehabilitation for patients with acute, subacute or chronic non‐specific LBP, compared to usual care, no treatment, waiting list, any form of education and advice (remotely or face‐to‐face), or a similar face‐to‐face intervention on pain intensity and disability.

Methods

Criteria for considering studies for this review

Types of studies

We will include randomised controlled trials (RCTs) with clearly reported true randomisation methods, cross‐over RCTs and cluster‐RCTs. We will include studies reported as full text, those published as abstract only and unpublished data. There will be no restriction on date or language of publication.

Types of participants

We will consider trials of adult participants (aged 18 years or greater) of any sex, with non‐specific LBP that is either acute/subacute (acute being less than six weeks' duration, and subacute being six to 12 weeks' duration) or chronic (greater than 12 weeks' duration). Non‐specific LBP will be defined as pain not attributed to a recognisable or specific pathology, such as radicular syndromes or serious pathology (i.e. fracture, cancer, infection and inflammatory diseases). We will exclude studies that included individuals with specific conditions such as LBP related to pregnancy, infection, fracture, spinal stenosis, cancer, and individuals with pain less than six months after spinal surgery. Studies including a mixed population (e.g. musculoskeletal pain) will be excluded, unless separate data were provided for LBP patients. We will contact study authors if information about the population with LBP is not provided separately. No restriction will be placed on the setting or context of the studies.

Types of interventions

We will consider any health intervention delivered remotely through telecommunication networks or the internet, including telephone calls, short message service (SMS), website, videoconferencing systems, software and mobile application (app). Telerehabilitation will be classified into three main categories according to the most predominant component of the intervention:

  • education or psychological interventions (behavioural and psychotherapeutic treatments designed to reduce psychological distress and maladaptive behavior, e.g. cognitive behavioural therapy (CBT), counselling);

  • exercise and physical activity (this includes general or specific exercises and strategies to increase physical activity levels);

  • others, including multi‐component interventions (when the telerehabilitation intervention includes both a psychological component and an exercise or physical activity component).

We will consider mixed interventions (i.e. a combination of face‐to‐face and telerehabilitation components) if the telerehabilitation component comprises at least 75% of either the planned sessions or the time of the intervention.

Telerehabilitation will be compared to usual care, minimal interventions (e.g. no treatment, waiting list, advice), and a similar face‐to‐face intervention (e.g. exercise). The main comparison reported in the 'Summary of findings' table(s) will be telerehabilitation versus usual care.

Types of outcome measures

Major outcomes

The key major outcomes will be pain intensity and disability, which have been recommended as part of the core outcome measures for non‐specific LBP trials (Chiarotto 2018).

  • Pain intensity, measured with a continuous self‐report scale (e.g. visual analogue scale (VAS); numeric rating scale (NRS)), a rating scale within a composite measure of pain (e.g. McGill Pain Questionnaire) or ordinal scale (we will consider ordinal scales as continuous measures). If studies present more than one measure for pain intensity, we will give preference to NRS, VAS, and other validated measures, in this order of priority. For trials reporting pain as a dichotomous outcome (e.g. 30% or 50% reduction in pain intensity) we will include those data as minor outcomes.

  • Disability, measured with a continuous, self‐report scale (e.g. Roland Morris Disability Questionnaire (RMDQ); Oswestry Disability Index (ODI)) or an ordinal scale (we will consider ordinal scales as continuous measures). If studies present more than one measure for disability, we will give preference to RMDQ, ODI, and other validated measures, in this order of priority. For trials reporting disability as a dichotomous outcome (e.g. 30% or 50% reduction in disability) we will include those data as minor outcomes.

  • Health‐related quality of life (in the following order of preference: 12‐Item Short Form Survey (SF‐12), European Quality of Life Survey ‐5 Dimensions (EQ‐5D), physical and mental health domains of the 36‐Item Short Form Health Survey (SF‐36), other algofunctional scale).

  • Anxiety (in the following order of preference: Hospital Anxiety and Depression Scale; the Spielberger State‐Trait Anxiety Inventory; other algofunctional scale).

  • Depression (in the following order of preference: Hospital Anxiety and Depression Scale; Centre for Epidemiological Studies Depression Scale; Beck Depression Inventory, other algofunctional scale).

  • Withdrawals due to adverse events (any adverse events reported, such as fatigue and worsening of pain).

  • Short‐term serious adverse effects from trials (e.g. cardiovascular events).

All ordinal scales with more than six levels will be considered to exhibit continuous properties. Studies that use other measurement tools will not be excluded from the review, but we will estimate the relative ranking of the interventions according to their effect on pain intensity at post‐treatment.

Minor outcomes

  • Return to work (e.g. percentage of patients who returned to work after LBP).

  • Self‐efficacy (in the following order of preference: Pain Self‐Efficacy Questionnaire; Chronic Pain Self‐Efficacy Scale; other algofunctional scale).

  • Fear avoidance (in the following order of preference: Fear‐Avoidance Belief Questionnaire; other algofunctional scale).

  • Pain catastrophisation (in the following order of preference: Pain Catastrophising Scale; other algofunctional scale).

  • Intervention adherence (expressed in percentages or number of participants).

We will consider the following time points of follow‐up: short‐term (less than three months after randomisation); intermediate (at least three months but less than 12 months after randomisation); and long‐term (12 months or more after randomisation). If there are multiple time points classified within the same category, we will use the one closest to the end of the treatment for short‐term follow‐up and the one closest to six months for intermediate‐term follow‐up, and we will collect all available long‐term follow‐up time points.
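As an illustration only, the three follow‐up categories defined above could be encoded as in the following Python sketch. The function name and the choice of months as the unit are assumptions made for this example and are not part of the protocol.

```python
def classify_follow_up(months_after_randomisation: float) -> str:
    """Map a follow-up time (in months after randomisation) to a category.

    Boundaries follow the text: under 3 months is short-term, 3 to under
    12 months is intermediate, and 12 months or more is long-term.
    """
    if months_after_randomisation < 0:
        raise ValueError("time since randomisation cannot be negative")
    if months_after_randomisation < 3:
        return "short-term"
    if months_after_randomisation < 12:
        return "intermediate"
    return "long-term"


if __name__ == "__main__":
    # Hypothetical time points, used only to show the boundaries.
    for months in (1.5, 3, 6, 12, 24):
        print(months, classify_follow_up(months))
```

Note that a time point of exactly three months falls in the intermediate category and one of exactly 12 months falls in the long‐term category.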

Search methods for identification of studies

Electronic searches

We will perform computerised electronic searches of the following databases, from inception to the present.

  • Cochrane Central Register of Controlled Trials (CENTRAL) (the Cochrane Library, current issue, which includes the Cochrane Back and Neck trials register)

  • MEDLINE via OvidSP (1946 to current)

  • Embase via OvidSP (1980 to current)

  • Cumulative Index to Nursing and Allied Health Literature (CINAHL, EBSCO) (1982 to current)

  • Physiotherapy Evidence Database (PEDro)

  • ClinicalTrials.gov

  • WHO International Clinical Trial Registry Platform (WHO ICTRP)

Search strategies will be developed using the methods described in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2019a). The search strategy for MEDLINE can be found in Appendix 1. It will be translated as closely as possible across the other databases. We will not apply any restrictions on date, language or publication status to the searches.

Searching other resources

We will check the reference lists of all primary studies and previous systematic reviews on this topic to identify additional studies. We will search for errata or retractions from included studies published in full text on PubMed, and report the date this was done.

Data collection and analysis

Selection of studies

After merging search results and discarding duplicates, two review authors (LA and BS) will independently screen titles and abstracts of all reports identified as a result of the search. We will retrieve the full text of potentially relevant reports and two authors (LA and BS) will independently screen the full text and identify studies for inclusion. They will also identify and record reasons for excluding ineligible studies. We will resolve any disagreements between the review authors through discussion, or by arbitration by a third person (LC or SK) when consensus cannot be reached. We will identify and exclude duplicates and collate multiple reports of the same study so that each study, rather than each report, is the unit of interest in the review. We will report the selection process in sufficient detail to complete a PRISMA flow diagram and 'Characteristics of excluded studies' table (PRISMA Group 2009).

Data extraction and management

We will use a data collection form for study characteristics and outcome data, which we will pilot on at least one study in the review. Two review authors (LA and BS) will independently extract relevant data from each of the included studies. We will extract the following study characteristics.

  • Bibliometric data (e.g. authors, year, and language)

  • Methods (e.g. study design, duration of study, sample size, country, withdrawals, clinical trial registration and funding)

  • Participants (e.g. number, gender, mean age, age range, symptoms, inclusion and exclusion criteria and severity of the condition)

  • Interventions: intervention (including the type of technology used e.g. telephone, app, and/or website), comparison, concomitant medications, and co‐interventions. We will collect the reporting of interventions according to the Template for Intervention Description and Replication (TIDieR) checklist (Hoffmann 2014; Yamato 2016) (Appendix 2)

  • Outcomes: primary and secondary outcomes specified and collected, and time points reported

  • Characteristics of the design of the trial, as outlined below in Assessment of risk of bias in included studies

  • Notes: funding for the trial and notable declarations of interest of trial authors

  • Whether the trial was prospectively registered or not

We will note in the 'Characteristics of included studies' table if outcome data were not reported in a usable way and when data were transformed or estimated from a graph. We will use PlotDigitizer software (PlotDigitizer [Computer program]) to extract data from graphs or figures (we will extract these data in duplicate). We will resolve disagreements by consensus or by involving a third person (LC or CM). One review author (LA) will transfer data into the Review Manager 5 file (Review Manager 2014). We will double‐check that data are entered correctly by comparing the data presented in the systematic review with the study reports.

We will select data based on the following decision rules.

  • We will give preference to change scores if both change and endpoint values are available.

  • We will give preference to intention‐to‐treat (ITT) analysis data rather than 'per protocol' or 'as treated', if available.

  • If multiple time points are reported, we will use the one closest to three months for short‐term follow‐up, and six months for intermediate‐term follow‐up. We will collect all available long‐term follow‐up data.

Where additional data about treatment effects, as well as other characteristics, are required, we will contact study authors up to three times. If no reply is received within six weeks, we will consider the data unobtainable for this iteration of the review. Where there are discrepancies between published and unpublished versions of the same data, we will give preference to the published data because these have been through a peer review process.

Main planned comparisons

Our primary comparison will be telerehabilitation versus usual care.

Our other main comparisons are grouped as follows.

  • Telerehabilitation versus minimal interventions (i.e. waiting list, no treatment, advice)

  • Telerehabilitation versus similar face‐to‐face treatments

Assessment of risk of bias in included studies

The assessment of risk of bias in included studies will be performed using the methods outlined in the Handbook (Higgins 2019a). This will be done independently by two review authors (LA and BS); disagreements will be resolved by discussion or arbitration by a third author (LC or SK). We will assess the risk of bias using the Cochrane risk of bias assessment tool from the Back and Neck Group (Furlan 2015) (Table 1; Table 2). We will assess each of the 13 items of the 'Risk of bias' assessment tool as being at high, low or unclear risk of bias. We will provide a quote from the study report, together with a justification for our judgement, in the 'Risk of bias' table. We will summarise the 'Risk of bias' judgement across different studies for each of the domains listed. Where information on risk of bias relates to unpublished data or correspondence with a trialist, we will note this in the 'Risk of bias' table. When considering treatment effects, we will take into account the risk of bias for the studies that contribute to that outcome. We will present the figures generated by the 'Risk of bias' tool to provide a summary assessment of the risk of bias.

Table 1. Sources of risk of bias

| Bias domain | Source of bias | Possible answers |
|---|---|---|
| Selection | (1) Was the method of randomisation adequate? | Yes/no/unsure |
| Selection | (2) Was the treatment allocation concealed? | Yes/no/unsure |
| Performance | (3) Was the patient blinded to the intervention? | Yes/no/unsure |
| Performance | (4) Was the care provider blinded to the intervention? | Yes/no/unsure |
| Detection | (5) Was the outcome assessor blinded to the intervention? | Yes/no/unsure |
| Attrition | (6) Was the dropout rate described and acceptable? | Yes/no/unsure |
| Attrition | (7) Were all randomised participants analysed in the group to which they were allocated? | Yes/no/unsure |
| Reporting | (8) Are reports of the study free of suggestion of selective outcome reporting? | Yes/no/unsure |
| Selection | (9) Were the groups similar at baseline regarding the most important prognostic indicators? | Yes/no/unsure |
| Performance | (10) Were co‐interventions avoided or similar? | Yes/no/unsure |
| Performance | (11) Was the compliance acceptable in all groups? | Yes/no/unsure |
| Detection | (12) Was the timing of the outcome assessment similar in all groups? | Yes/no/unsure |
| Other | (13) Are other sources of potential bias unlikely? | Yes/no/unsure |

Table 2. Criteria for a judgment of 'yes' for the sources of risk of bias

(1) A random (unpredictable) assignment sequence. Examples of adequate methods are coin toss (for studies with 2 groups), rolling a dice (for studies with 2 or more groups), drawing of balls of different colours, drawing of ballots with the study group labels from a dark bag, computer‐generated random sequence, preordered sealed envelopes, sequentially‐ordered vials, telephone call to a central office, and preordered list of treatment assignments. Examples of inadequate methods are: alternation, birth date, social insurance/security number, date in which they are invited to participate in the study, and hospital registration number.

(2) Assignment generated by an independent person not responsible for determining the eligibility of the patients. This person has no information about the persons included in the trial and has no influence on the assignment sequence or on the decision about eligibility of the patient.

(3) Index and control groups are indistinguishable for the patients or if the success of blinding was tested among the patients and it was successful.

(4) Index and control groups are indistinguishable for the care providers or if the success of blinding was tested among the care providers and it was successful.

(5) Adequacy of blinding should be assessed for each primary outcome separately. This item should be scored 'yes' if the success of blinding was tested among the outcome assessors and it was successful or:

  • for patient‐reported outcomes in which the patient is the outcome assessor (e.g. pain, disability): the blinding procedure is adequate for outcome assessors if participant blinding is scored 'yes';

  • for outcome criteria assessed during scheduled visit and that supposes a contact between participants and outcome assessors (e.g. clinical examination): the blinding procedure is adequate if patients are blinded, and the treatment or adverse effects of the treatment cannot be noticed during clinical examination;

  • for outcome criteria that do not suppose a contact with participants (e.g. radiography, magnetic resonance imaging): the blinding procedure is adequate if the treatment or adverse effects of the treatment cannot be noticed when assessing the main outcome;

  • for outcome criteria that are clinical or therapeutic events that will be determined by the interaction between patients and care providers (e.g. co‐interventions, hospitalisation length, treatment failure), in which the care provider is the outcome assessor: the blinding procedure is adequate for outcome assessors if item 4 (caregivers) is scored 'yes';

  • for outcome criteria that are assessed from data of the medical forms: the blinding procedure is adequate if the treatment or adverse effects of the treatment cannot be noticed on the extracted data.

(6) The number of participants who were included in the study but did not complete the observation period or were not included in the analysis must be described and reasons given. If the percentage of withdrawals and dropouts does not exceed 20% for short‐term follow‐up and 30% for long‐term follow‐up and does not lead to substantial bias a 'yes' is scored. (N.B. these percentages are arbitrary, not supported by literature).

(7) All randomised patients are reported/analysed in the group they were allocated to by randomisation for the most important moments of effect measurement (minus missing values) irrespective of noncompliance and co‐interventions.

(8) All the results from all prespecified outcomes have been adequately reported in the published report of the trial. This information is either obtained by comparing the protocol and the report, or in the absence of the protocol, assessing that the published report includes enough information to make this judgement.

(9) Groups have to be similar at baseline regarding demographic factors, duration and severity of complaints, percentage of patients with neurological symptoms, and value of main outcome measure(s).

(10) If there were no co‐interventions or they were similar between the index and control groups.

(11) The reviewer determines if the compliance with the interventions is acceptable, based on the reported intensity, duration, number and frequency of sessions for both the index intervention and control intervention(s). For example, physiotherapy treatment is usually administered for several sessions; therefore it is necessary to assess how many sessions each patient attended. For single‐session interventions (e.g. surgery), this item is irrelevant.

(12) Timing of outcome assessment should be identical for all intervention groups and for all primary outcome measures.

(13) Other types of biases. For example:

  • when the outcome measures were not valid. There should be evidence from a previous or present scientific study that the primary outcome can be considered valid in the context of the present;

  • industry‐sponsored trials. The conflict of interest (COI) statement should explicitly state that the researchers have had full possession of the trial process from planning to reporting without funders with potential COI having any possibility to interfere in the process. If, for example, the statistical analyses have been done by a funder with a potential COI, usually 'unsure' is scored.

Assessment of bias in conducting the systematic review

We will conduct the review according to this published protocol and report any deviations from it in the 'Differences between protocol and review' section of the systematic review.

Measures of treatment effect

We will analyse the primary outcomes (i.e. pain intensity and disability) and present these on a continuous scale (ranging from zero to 100) as mean differences (MDs) with 95% confidence intervals (CIs). For the secondary outcomes presented as continuous measures (e.g. health‐related quality of life), we will express pooled effects with standardised mean difference (SMD) and 95% CIs if these trials used different measurement scales to assess these outcomes. If the studies presented the same outcome measures, we will use mean difference. Dichotomous outcomes (e.g. adverse events, return to work) will be expressed as risk differences and 95% CIs. To facilitate interpretation, we will translate pooled SMD values to the equivalent in commonly used scales for measuring disability, using the standard deviation reported in the included studies.
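The conversion to a common zero‐to‐100 scale and the mean difference with its 95% CI can be sketched as follows. This is a minimal illustration, assuming a linear rescaling of each instrument's score range and a normal‐approximation interval; the function names and the numbers in the usage example are hypothetical and do not come from any included study.

```python
import math


def rescale_to_0_100(value: float, scale_min: float, scale_max: float) -> float:
    """Linearly map a mean score from [scale_min, scale_max] onto 0 to 100."""
    return (value - scale_min) / (scale_max - scale_min) * 100.0


def rescale_sd_to_0_100(sd: float, scale_min: float, scale_max: float) -> float:
    """Rescale a standard deviation by the same linear factor (no offset)."""
    return sd / (scale_max - scale_min) * 100.0


def mean_difference(m1, sd1, n1, m2, sd2, n2, z=1.96):
    """Mean difference (group 1 minus group 2) with a normal-approximation CI."""
    md = m1 - m2
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return md, md - z * se, md + z * se


if __name__ == "__main__":
    # Hypothetical RMDQ means (0 to 24 scale) converted to 0 to 100.
    tele = rescale_to_0_100(9.6, 0, 24)
    usual = rescale_to_0_100(12.0, 0, 24)
    sd = rescale_sd_to_0_100(4.8, 0, 24)
    print(mean_difference(tele, sd, 50, usual, sd, 50))
```

Rescaling the SD by the same factor as the mean keeps the mean difference and its interval on the zero‐to‐100 scale.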

For dichotomous outcomes, we will calculate the number needed to treat for an additional beneficial outcome (NNTB), or number needed to treat for an additional harmful outcome (NNTH) from the control group event rate and the relative risk using the Visual Rx NNT calculator (Cates 2008). We will calculate the NNTB for continuous measures using the Wells calculator (available at the Cochrane Musculoskeletal website, musculoskeletal.cochrane.org). We will only calculate NNTB or NNTH for outcomes showing a statistically significant benefit or harm.
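For orientation, the arithmetic behind a number needed to treat derived from a control group event rate and a relative risk can be sketched as below. This is not the Visual Rx or Wells calculator; it is a minimal sketch assuming the usual relationship, in which the risk difference equals the control event rate multiplied by (relative risk minus one), with the result rounded up to a whole person. The function name is hypothetical.

```python
import math


def number_needed_to_treat(control_event_rate: float, risk_ratio: float) -> int:
    """Return 1 / |risk difference|, rounded up to the next whole person.

    The risk difference is derived as CER * (RR - 1), i.e. the intervention
    group risk minus the control group risk.
    """
    if not 0 < control_event_rate <= 1:
        raise ValueError("control event rate must be in (0, 1]")
    if risk_ratio < 0:
        raise ValueError("risk ratio cannot be negative")
    risk_difference = control_event_rate * (risk_ratio - 1.0)
    if risk_difference == 0:
        raise ValueError("no difference in risk; the NNT is undefined")
    # Round first so floating-point noise does not push a whole number upwards.
    return math.ceil(round(1.0 / abs(risk_difference), 9))


if __name__ == "__main__":
    # Hypothetical inputs: 20% control event rate, relative risk of 0.5.
    print(number_needed_to_treat(0.20, 0.5))
```

Whether the result is read as an NNTB or an NNTH depends on whether the event is desirable and on the direction of the risk difference, which the caller has to judge.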

For dichotomous outcomes, we will calculate the absolute change from the difference in the risk between the intervention and control group using GRADEpro GDT, and express this as a percentage (GRADEpro GDT 2015). We will calculate the relative change as the risk ratio minus one, and will express this as a percentage. For continuous outcomes, we will calculate the absolute change by dividing the mean difference by the scale of the measure and expressing it as a percentage. We will calculate the relative difference as the absolute benefit (mean difference) divided by the baseline mean of the control group, and express this as a percentage.
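The four percentage calculations described above can be written out as a short sketch. It assumes that "the scale of the measure" means the range of the measurement scale (for example, 100 for a zero‐to‐100 scale); the function names and the example values are hypothetical.

```python
def continuous_changes(mean_difference: float, scale_range: float,
                       control_baseline_mean: float):
    """Absolute and relative change (both in percent) for a continuous outcome."""
    absolute_pct = mean_difference / scale_range * 100.0
    relative_pct = mean_difference / control_baseline_mean * 100.0
    return absolute_pct, relative_pct


def dichotomous_changes(risk_intervention: float, risk_control: float):
    """Absolute and relative change (both in percent) for a dichotomous outcome."""
    absolute_pct = (risk_intervention - risk_control) * 100.0
    risk_ratio = risk_intervention / risk_control
    relative_pct = (risk_ratio - 1.0) * 100.0
    return absolute_pct, relative_pct


if __name__ == "__main__":
    # Hypothetical: MD of -10 on a 0 to 100 scale, control baseline mean of 50.
    print(continuous_changes(-10.0, 100.0, 50.0))
    # Hypothetical: 10% risk with the intervention versus 20% with control.
    print(dichotomous_changes(0.10, 0.20))
```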

In the 'Comments' column of the 'Summary of findings' table, we will report the absolute percentage difference and the relative percentage change from baseline. For outcomes that show a significant difference between treatment groups, we will also report the NNTB or NNTH.

Unit of analysis issues

The unit of analysis will be the participant for all trials. Where multiple trial arms are reported in a single trial, we will include only the relevant arms. If two comparisons are combined in the same meta‐analysis, we will halve the control group to avoid double‐counting (e.g. two groups of different interventions and one waiting‐list control). If we identify any cross‐over trials, we will only extract data from the first phase of the trial to avoid potential carry‐over effects. If we identify any cluster‐RCTs or studies that included more than one joint in the analysis, we will multiply the standard error of the effect estimate (from an analysis ignoring clustering) by the square root of the design effect (inflated variances), according to the methods described in Chapter 23 of the Handbook (Higgins 2019b). The meta‐analysis using the inflated variances will be performed using the generic inverse‐variance method.
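The two adjustments described above, inflating the standard error of a cluster‐RCT estimate and splitting a shared control group, can be sketched as follows. The design effect is taken here as 1 + (M − 1) × ICC, where M is the average cluster size; the ICC value would have to come from the trial report or an external source and is an assumption of this example, as are the function names.

```python
import math


def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Design effect: 1 + (M - 1) * ICC for average cluster size M."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def inflate_standard_error(se: float, mean_cluster_size: float, icc: float) -> float:
    """Multiply a standard error that ignored clustering by sqrt(design effect)."""
    return se * math.sqrt(design_effect(mean_cluster_size, icc))


def split_shared_control(n_control: int):
    """Split a shared control group into two roughly equal halves."""
    first = n_control // 2
    return first, n_control - first


if __name__ == "__main__":
    # Hypothetical cluster trial: 21 participants per cluster, ICC of 0.05.
    print(design_effect(21, 0.05))
    print(inflate_standard_error(2.0, 21, 0.05))
    # Hypothetical waiting-list control of 41 participants shared by two arms.
    print(split_shared_control(41))
```

The inflated standard error is what would then be entered into a generic inverse‐variance analysis. When a continuous‐outcome control group is split, only the sample size is divided; its mean and SD are left unchanged.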

Dealing with missing data

We will contact trial authors or study sponsors in order to verify key study characteristics and obtain missing numerical outcome data when possible (e.g. when a study is identified as abstract only or when data are not available for all participants). When this is not possible and the missing data are likely to introduce serious bias, we will explore the impact of including such studies in the overall assessment of results by a sensitivity analysis. We will clearly describe any assumptions and imputations to handle missing data and explore the effect of imputation by sensitivity analyses.

For dichotomous outcomes (e.g. number of withdrawals due to adverse events), we will calculate the withdrawal rate using the number of participants randomised in the group as the denominator. For continuous outcomes (e.g. mean change in pain intensity score), we will calculate the MD or SMD, based on the number of participants analysed at that time point. If the number of participants analysed is not presented for each time point, we will use the number of the randomised patients in each group at baseline.

When possible, we will compute missing standard deviations (SDs) from other statistics such as standard errors (SEs), CIs or P values, according to the methods recommended in the Handbook (Higgins 2019a). If SDs cannot be calculated, we will impute them (e.g. from other studies in the meta‐analysis).
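Two of the conversions mentioned above, for a single group mean, can be sketched as follows. The sketch assumes large samples, so that a 95% CI spans 2 × 1.96 standard errors; for small samples a value from the t distribution would replace 1.96. Conversion from P values is not shown, and the function names are hypothetical.

```python
import math


def sd_from_se(se: float, n: int) -> float:
    """SD of a single group from the standard error of its mean."""
    return se * math.sqrt(n)


def sd_from_ci(lower: float, upper: float, n: int, z: float = 1.96) -> float:
    """SD of a single group from a CI around its mean (large-sample case)."""
    return math.sqrt(n) * (upper - lower) / (2.0 * z)


if __name__ == "__main__":
    # Hypothetical group of 25 participants.
    print(sd_from_se(2.0, 25))
    print(sd_from_ci(46.08, 53.92, 25))
```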

Assessment of heterogeneity

We will assess clinical and methodological diversity in terms of participants, interventions, outcomes and study characteristics for the included studies to determine whether a meta‐analysis is appropriate. We will do this by examining these data from the data extraction tables. We will assess statistical heterogeneity by visual inspection of the forest plot to assess for obvious differences in results between the studies. We will also use the I² statistic and the Chi² test.

As recommended in the Handbook (Deeks 2019), we will interpret I² values as follows.

  • 0% to 40%: might not be important

  • 30% to 60%: may represent moderate heterogeneity

  • 50% to 90%: may represent substantial heterogeneity

  • 75% to 100%: considerable heterogeneity

As noted in the Handbook, the importance of I² depends on the magnitude and direction of effects, and strength of evidence for heterogeneity. The Chi² test will be interpreted where a P value of 0.10 or less indicates evidence of statistical heterogeneity.
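For readers less familiar with these statistics, the following sketch shows how the Chi² statistic (Cochran's Q) and I² are related. It assumes inverse‐variance weights and uses hypothetical effect estimates; the function names are illustrative only.

```python
from typing import List, Tuple


def cochran_q(effects: List[float], standard_errors: List[float]) -> Tuple[float, int]:
    """Return Cochran's Q (the Chi² heterogeneity statistic) and its degrees of freedom."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return q, len(effects) - 1


def i_squared(q: float, df: int) -> float:
    """I² = max(0, (Q - df) / Q) * 100, expressed as a percentage."""
    if q <= 0.0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Two hypothetical trials with clearly different effects.
q, df = cochran_q([0.0, 2.0], [0.5, 0.5])
i2 = i_squared(q, df)
# The P value for Q comes from a chi-square distribution with df degrees of
# freedom; it is not computed here to keep the sketch dependency-free.
```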

Assessment of reporting biases

We will create and examine a funnel plot to explore possible small‐study biases. When interpreting funnel plots, we will examine the different possible reasons for asymmetry, and relate this to the results of the review. If we are able to pool more than 10 trials, we will undertake formal statistical tests to investigate funnel plot asymmetry, and will follow the recommendations in Chapter 10 of the Handbook (Sterne 2017).

To assess outcome reporting bias, we will check trial protocols against published reports. For studies published after 1 July 2005, we will screen the Clinical Trial Register at the International Clinical Trials Registry Platform of the World Health Organization (www.who.int/ictrp/en) for the a priori trial protocol. We will evaluate whether selective reporting of outcomes is present.

Data synthesis

We will use a random‐effects model for all meta‐analyses only where meta‐analysis is meaningful; that is, if the treatments, participants and the underlying clinical question are similar enough for pooling to make sense. We will use alternative synthesis methods, such as summary of effect estimates (e.g. median, interquartile range with 'box‐and‐whisker' plots) or the combination of P values in the circumstance where there is no, or minimal, information reported on the direction of effect (McKenzie 2019).
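A random‐effects pooled estimate can be sketched as below. The protocol does not name a between‐study variance estimator, so the DerSimonian and Laird method used here is an assumption made for illustration, as are the function name and the example data.

```python
import math
from typing import List, Tuple


def random_effects_pool(
    effects: List[float], standard_errors: List[float]
) -> Tuple[float, float, float]:
    """Return (pooled effect, its SE, tau²) using DerSimonian and Laird."""
    w = [1.0 / se ** 2 for se in standard_errors]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # Method-of-moments estimate of the between-study variance, floored at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau² to each within-study variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in standard_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    pooled_se = math.sqrt(1.0 / sum(w_star))
    return pooled, pooled_se, tau2


# Two hypothetical trials (mean differences with their standard errors).
pooled, pooled_se, tau2 = random_effects_pool([0.0, 2.0], [0.5, 0.5])
ci_95 = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
```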

Subgroup analysis and investigation of heterogeneity

We will stratify the analyses based upon the duration of follow‐up reported for each outcome (that is, short‐, intermediate‐, and long‐term); and also based upon duration of symptoms (acute/subacute and chronic). We will also perform a subgroup analysis for telerehabilitation modality (phone, website, app, video conference). We will use the formal test for subgroup interactions in Review Manager 5 (Review Manager 2014).

Sensitivity analysis

We plan to carry out the following sensitivity analyses for the main comparison to investigate the robustness of the treatment effect for pain intensity and physical function (all time points).

  • Studies we judge as being at low risk of selection bias (i.e. we will exclude all studies with unclear or high risk of selection bias);

  • Studies we judge as being at low risk of detection bias (i.e. we will exclude all studies with unclear or high risk of detection bias);

  • Studies we judge as being at low risk of attrition bias (i.e. we will exclude all studies with unclear or high risk of attrition bias).

If studies are rated as being at high risk of bias for at least one of these bias domains, we will consider the overall risk of bias for the study as high. We will follow the same decision for data synthesis and unit of analysis issues as in the main analyses.

GRADE and 'Summary of findings' table

We will create a 'Summary of findings' table for the main comparison (i.e. telerehabilitation versus usual care), and include the following outcomes: pain intensity, disability, quality of life, anxiety, depression, withdrawals due to adverse events and short‐term serious adverse events. We anticipate that the majority of relevant trials will include participants with chronic symptoms (greater than 12 weeks' duration), therefore we will present the evidence for chronic LBP. An example template of the 'Summary of findings' table for the main comparison is shown in Table 3. We will also create 'Summary of findings' tables for the other comparisons at post‐treatment follow‐up.

Table 3. Draft Summary of findings for the main comparison

Title: Telerehabilitation versus usual care

Patient or population: Chronic non‐specific low back pain
Settings: Primary or tertiary care
Intervention: Telerehabilitation interventions
Comparison: Usual care

| Outcomes | Illustrative comparative risks (95% CI): assumed risk (usual care) | Illustrative comparative risks (95% CI): corresponding risk (intervention) | Relative effect (95% CI) | No of participants (studies) | Quality of the evidence (GRADE) | Comments |
| --- | --- | --- | --- | --- | --- | --- |
| Pain (e.g. VAS). Scale from 0 to 100. Follow‐up: less than 3 months after randomisation | | | | | | |
| Disability (e.g. ODI). Scale from 0 to 100. Follow‐up: less than 3 months after randomisation | | | | | | |
| Quality of life (e.g. SF‐36). Scale from 0 to 100. Follow‐up: less than 3 months after randomisation | | | | | | |
| Anxiety (HADS). Scale from 0 to 100. Follow‐up: less than 3 months after randomisation | | | | | | |
| Depression (HADS). Scale from 0 to 100. Follow‐up: less than 3 months after randomisation | | | | | | |
| Withdrawals due to adverse events (percentage). Follow‐up: less than 3 months after randomisation | | | | | | |
| Short‐term serious adverse events (percentage). Follow‐up: less than 3 months after randomisation | | | | | | |

The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).
CI: confidence interval; VAS: visual analogue scale; ODI: Oswestry Disability Index; SF‐36: The Short Form (36) Health Survey

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

Two review authors (LA and JF) will independently assess the quality of the evidence. We will use the five GRADE considerations (study limitations; inconsistency; imprecision; indirectness; and publication bias) to assess the quality of a body of evidence as it relates to the studies which contribute data to the meta‐analyses for the prespecified outcomes, and report the quality of evidence as high, moderate, low, or very low. We will use GRADEpro GDT software to prepare the 'Summary of findings' tables (GRADEpro GDT 2015). We will justify all decisions to downgrade the quality of evidence using footnotes, and we will make comments to aid the reader's understanding of the review where necessary.

We will use the following criteria to downgrade the certainty of evidence based on the five GRADE considerations.

1. Study design and risk of bias

We will consider comparisons to be at high risk of bias when more than 25% of participants in the comparison are from studies at high risk of bias overall (i.e. studies for which one or more domains are judged to be at high risk of bias); we will downgrade the evidence by one level for this. We will downgrade by two levels if more than 50% of participants in the comparison are from studies at high risk of bias overall.
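The participant‐weighted rule can be expressed as a short function. This is a sketch under stated assumptions: both thresholds are read as strict ("more than 25%" and "more than 50%"), and the function name and example trials are hypothetical.

```python
from typing import List, Tuple


def risk_of_bias_downgrade(studies: List[Tuple[int, bool]]) -> int:
    """Return the number of GRADE levels to downgrade for risk of bias.

    Each study is given as (number of participants, overall high risk of bias?).
    """
    total = sum(n for n, _ in studies)
    if total == 0:
        return 0
    high = sum(n for n, is_high in studies if is_high)
    share_high = high / total
    if share_high > 0.50:  # assumed reading of the 50% threshold
        return 2
    if share_high > 0.25:
        return 1
    return 0


# Hypothetical comparison: 300 of 400 participants come from high-risk trials.
levels = risk_of_bias_downgrade([(300, True), (100, False)])
```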

2. Inconsistency

We will evaluate each direct comparison for consistency in the direction and magnitude of the effect sizes from individual trials, considering the width of the prediction interval and magnitude of the heterogeneity parameter. We will downgrade the evidence by one level if we identify important and unexplained heterogeneity through visual inspection, or substantial heterogeneity in the I² statistic (i.e. an I² value of more than 50%). Where there is evidence of serious inconsistency (an I² value greater than 75%), we will downgrade the quality assessment by two levels.

3. Indirectness

We will downgrade by one level if more than 50% of the participants are assessed as being outside the target group. We will not downgrade this domain by two levels.

4. Imprecision

In cases where studies include relatively few participants and few events, and thus have wide CIs around the estimate of the effect, the results are imprecise.

Dichotomous outcomes

A) When there is only one study, or when there is more than one study but the total number of events is less than 300, we will downgrade the evidence by one level.

B) When the 95% CI around the pooled or best estimate of effect includes both (i) no effect and (ii) appreciable benefit or appreciable harm, we will downgrade the evidence by one level. We will downgrade the evidence by two levels when there is imprecision due to both A) and B).

Continuous outcomes

A) When there is only one study, or when there is more than one study but the total sample size is less than 400, we will downgrade the evidence by one level.

B) When the 95% CI around the pooled or best estimate of effect includes no effect and the CI crosses an effect size of SMD = 0.5 or MD > 10% of the scale in either direction, we will downgrade the evidence by one level. We will downgrade the evidence by two levels when there is imprecision due to both A) and B).
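The two imprecision criteria for continuous outcomes can be combined as in the sketch below. The function name and example values are hypothetical, and "crosses" is interpreted here as the CI reaching the threshold in either direction; the threshold would be 0.5 for an SMD, or 10% of the scale range for an MD (e.g. 10 points on a 0 to 100 scale).

```python
from typing import Tuple


def imprecision_downgrade_continuous(
    n_studies: int,
    total_sample_size: int,
    ci: Tuple[float, float],
    threshold: float,
) -> int:
    """Return the number of GRADE levels to downgrade for imprecision."""
    lower, upper = ci

    # Criterion A: a single study, or fewer than 400 participants in total.
    criterion_a = n_studies == 1 or total_sample_size < 400

    # Criterion B: the CI includes no effect and reaches the threshold
    # for an appreciable effect in either direction.
    includes_no_effect = lower <= 0.0 <= upper
    reaches_threshold = lower <= -threshold or upper >= threshold
    criterion_b = includes_no_effect and reaches_threshold

    return int(criterion_a) + int(criterion_b)


# Hypothetical pooled SMD from two small trials with a wide CI.
levels = imprecision_downgrade_continuous(2, 300, (-0.6, 0.1), threshold=0.5)
```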

5. Publication bias

To assess publication bias, we plan to generate funnel plots if at least 10 trials examining the same intervention comparison are included in the review. If funnel plots show asymmetry, we will downgrade the quality of the evidence by one level. We will not downgrade this domain by two levels.

We will use the methods and recommendations described in Chapters 8, 14 and 15 of the Handbook (Higgins 2019c; Schünemann 2019a; Schünemann 2019b). When assessing the overall quality of evidence, we may downgrade by one level for each factor, up to a maximum of three levels for all factors. If there are very severe problems for any one factor, we may downgrade the evidence by two levels due to that factor alone (Higgins 2019c).

Interpreting results and reaching conclusions

We will follow the guidelines for interpreting results in Chapter 15 of the Handbook (Schünemann 2019b), and will take care to distinguish a lack of evidence of effect from a lack of effect. We will base our conclusions only on findings from the quantitative or narrative synthesis of included studies for this review. We will avoid making recommendations for practice, and our implications for research will suggest priorities for future research and outline the remaining uncertainties in the area.

Table 1. Sources of risk of bias

| Bias domain | Source of bias | Possible answers |
| --- | --- | --- |
| Selection | (1) Was the method of randomisation adequate? | Yes/no/unsure |
| Selection | (2) Was the treatment allocation concealed? | Yes/no/unsure |
| Performance | (3) Was the patient blinded to the intervention? | Yes/no/unsure |
| Performance | (4) Was the care provider blinded to the intervention? | Yes/no/unsure |
| Detection | (5) Was the outcome assessor blinded to the intervention? | Yes/no/unsure |
| Attrition | (6) Was the dropout rate described and acceptable? | Yes/no/unsure |
| Attrition | (7) Were all randomised participants analysed in the group to which they were allocated? | Yes/no/unsure |
| Reporting | (8) Are reports of the study free of suggestion of selective outcome reporting? | Yes/no/unsure |
| Selection | (9) Were the groups similar at baseline regarding the most important prognostic indicators? | Yes/no/unsure |
| Performance | (10) Were co‐interventions avoided or similar? | Yes/no/unsure |
| Performance | (11) Was the compliance acceptable in all groups? | Yes/no/unsure |
| Detection | (12) Was the timing of the outcome assessment similar in all groups? | Yes/no/unsure |
| Other | (13) Are other sources of potential bias unlikely? | Yes/no/unsure |

Table 2. Criteria for a judgment of ‘‘yes’’ for the sources of risk of bias

1

A random (unpredictable) assignment sequence. Examples of adequate methods are coin toss (for studies with 2 groups), rolling a dice (for studies with 2 or more groups), drawing of balls of different colours, drawing of ballots with the study group labels from a dark bag, computer‐generated random sequence, preordered sealed envelopes, sequentially‐ordered vials, telephone call to a central office, and preordered list of treatment assignments. Examples of inadequate methods are: alternation, birth date, social insurance/security number, date in which they are invited to participate in the study, and hospital registration number.

2

Assignment generated by an independent person not responsible for determining the eligibility of the patients. This person has no information about the persons included in the trial and has no influence on the assignment sequence or on the decision about eligibility of the patient.

3

Index and control groups are indistinguishable for the patients or if the success of blinding was tested among the patients and it was successful.

4

Index and control groups are indistinguishable for the care providers or if the success of blinding was tested among the care providers and it was successful.

5

Adequacy of blinding should be assessed for each primary outcome separately. This item should be scored ‘‘yes’’ if the success of blinding was tested among the outcome assessors and it was successful or:

  • for patient‐reported outcomes in which the patient is the outcome assessor (e.g. pain, disability): the blinding procedure is adequate for outcome assessors if participant blinding is scored ‘‘yes’’;

  • for outcome criteria assessed during scheduled visit and that supposes a contact between participants and outcome assessors (e.g. clinical examination): the blinding procedure is adequate if patients are blinded, and the treatment or adverse effects of the treatment cannot be noticed during clinical examination;

  • for outcome criteria that do not suppose a contact with participants (e.g. radiography, magnetic resonance imaging): the blinding procedure is adequate if the treatment or adverse effects of the treatment cannot be noticed when assessing the main outcome;

  • for outcome criteria that are clinical or therapeutic events that will be determined by the interaction between patients and care providers (e.g. co‐interventions, hospitalisation length, treatment failure), in which the care provider is the outcome assessor: the blinding procedure is adequate for outcome assessors if item ‘‘4’’ (caregivers) is scored ‘‘yes’’;

  • for outcome criteria that are assessed from data of the medical forms: the blinding procedure is adequate if the treatment or adverse effects of the treatment cannot be noticed on the extracted data.

6

The number of participants who were included in the study but did not complete the observation period or were not included in the analysis must be described and reasons given. If the percentage of withdrawals and dropouts does not exceed 20% for short‐term follow‐up and 30% for long‐term follow‐up and does not lead to substantial bias a ‘‘yes’’ is scored. (N.B. these percentages are arbitrary, not supported by literature).

7

All randomised patients are reported/analysed in the group they were allocated to by randomisation for the most important moments of effect measurement (minus missing values) irrespective of noncompliance and co‐interventions.

8

All the results from all prespecified outcomes have been adequately reported in the published report of the trial. This information is either obtained by comparing the protocol and the report, or in the absence of the protocol, assessing that the published report includes enough information to make this judgement.

9

Groups have to be similar at baseline regarding demographic factors, duration and severity of complaints, percentage of patients with neurological symptoms, and value of main outcome measure(s).

10

There were no co‐interventions, or they were similar between the index and control groups.

11

The reviewer determines if the compliance with the interventions is acceptable, based on the reported intensity, duration, number and frequency of sessions for both the index intervention and control intervention(s). For example, physiotherapy treatment is usually administered for several sessions; therefore it is necessary to assess how many sessions each patient attended. For single‐session interventions (e.g. surgery), this item is irrelevant.

12

Timing of outcome assessment should be identical for all intervention groups and for all primary outcome measures.

13

Other types of biases. For example:

  • when the outcome measures were not valid. There should be evidence from a previous or present scientific study that the primary outcome can be considered valid in the context of the present;

  • industry‐sponsored trials. The conflict of interest (COI) statement should explicitly state that the researchers have had full possession of the trial process from planning to reporting without funders with potential COI having any possibility to interfere in the process. If, for example, the statistical analyses have been done by a funder with a potential COI, usually ‘‘unsure’’ is scored.
