INTRODUCTION

Approximately 75 % of all patients with diabetes have concurrent hypertension (HTN).1 While blood sugar control is the focal point of diabetes management for most patients, data suggest that there may be more benefit in blood pressure (BP) control than in reduction of blood sugar in patients with diabetes.2 Evidence from several randomized controlled trials (RCTs)3–5 has demonstrated reduced microvascular and cardiovascular complications when BP is tightly controlled. It has been estimated that rigorous control of BP is substantially more cost-effective than similarly tight blood sugar control in patients with diabetes, and is potentially cost-saving.2

Nurse management programs for diabetes have been formally tested in randomized controlled trials in a handful of academic settings, and have been efficacious in controlling glycemia.6–9 In these studies, reductions in hemoglobin A1c (A1c) levels have been as large as 1.1 percentage points.6–10 These studies have been conducted in settings with strong academic affiliations, where intervention delivery can be tightly controlled; the potency of interventions under these ideal settings is generally referred to as “efficacy.” In contrast, “effectiveness” refers to the potency of interventions under circumstances close to those that would be experienced in medical practice. It is generally assumed that some potency will be lost in assessing the effectiveness of efficacious interventions, but there are few trials that have measured this decrement directly, and none in diabetes nurse management. Additionally, trials in nurse management of HTN were conducted in similar settings.10,11 There are no randomized trial data on the effectiveness of behavioral co-management of diabetes and HTN.

Our objective in this randomized controlled trial was to assess the effectiveness of nurse-assisted behavioral co-management of blood sugar and BP among patients with diabetes and HTN in a practice-based research network of community primary care practices.

METHODS

Study Design

The study was a single-blinded randomized controlled trial (RCT) of a tailored telephone-delivered behavioral nurse intervention. A detailed description of the methods utilized in the Tailored Case Management for Diabetes and Hypertension (TEACH-DM) study was published previously.12

Setting

We performed the study in nine primary care practices in the Duke Clinical Research Institute Primary Care Research Consortium (PCRC). Practices in this established primary care-based research network are only very loosely academically affiliated; only two of the 71 providers across our nine clinics have any teaching responsibilities with Duke (limited to one half-day in an outpatient setting). Practices comprised both physician and mid-level primary care providers, trained in either general internal medicine or family medicine, and located in either urban or rural settings.

Patients

Patients had been enrolled in primary care at one of the nine clinics for at least a year at the time of enrollment. Eligible patients were identified through administrative data, and a list of eligible patients was generated for each PCP. The lists were vetted by each PCP for participation of individual patients. Patients were required to have both diabetes and HTN, and had to be taking medication for both, as confirmed by administrative data as well as patient self-reporting. Blood sugar had to be inadequately controlled, as determined by most recent A1c ≥ 7.5 %, but subjects were not required to have poorly controlled BP. Patients were broadly excluded for any of a range of comorbid serious illnesses12, type I diabetes, the inability to receive a telephone intervention in English, participation in another diabetes or HTN study, or living in an assisted living facility.

Trial Protocol

After patients had provided written informed consent, we obtained baseline measurements including height, weight, A1c, BP, and demographics from all patients. Patients remaining eligible (i.e., with A1c ≥ 7.5 %) were randomized to intervention or control groups at a 1:1 ratio, stratified by clinic and baseline BP control (systolic BP [SBP] ≥ 130 or diastolic BP [DBP] ≥ 80 vs. SBP < 130 and DBP <80). Regardless of arm assignment, all subjects had follow-up ascertainment of all outcomes at 6, 12, and 24 months after enrollment by a research assistant blinded to arm assignment.
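The stratified 1:1 allocation described above can be sketched as follows. This is a minimal illustration, not the trial's actual procedure: the use of permuted blocks, the block size of 4, and the names `bp_controlled` and `StratifiedRandomizer` are assumptions introduced here, since the text states only the allocation ratio and the stratification factors (clinic and baseline BP control).

```python
import random


def bp_controlled(sbp, dbp):
    """Controlled BP per the trial's threshold: SBP < 130 and DBP < 80 mmHg."""
    return sbp < 130 and dbp < 80


class StratifiedRandomizer:
    """1:1 allocation within each (clinic, BP-control) stratum.

    Permuted blocks of size 4 are an assumption for illustration; the
    source does not state which blocking scheme, if any, was used.
    """

    def __init__(self, block_size=4, seed=None):
        if block_size % 2:
            raise ValueError("block_size must be even for 1:1 allocation")
        self.block_size = block_size
        self.rng = random.Random(seed)
        self.blocks = {}  # stratum -> assignments left in the current block

    def assign(self, clinic, sbp, dbp):
        stratum = (clinic, bp_controlled(sbp, dbp))
        block = self.blocks.get(stratum)
        if not block:
            # Start a fresh, shuffled block with equal numbers per arm.
            half = self.block_size // 2
            block = ["intervention"] * half + ["control"] * half
            self.rng.shuffle(block)
            self.blocks[stratum] = block
        return block.pop()
```

Blocking within strata keeps the arms balanced on clinic and baseline BP control even when a small clinic contributes few patients, which is the usual motivation for stratifying in a multi-practice trial.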

Outcome Measurement

A1c and BP were measured using standard methods. A1c was measured by fingerstick on a well-validated point-of-care A1c machine (Bayer DCA 2000,13), calibrated to the national standard (mean = 5.0 %, top of normal range = 6.0 %). For BP, at each visit, two measures at 5-min intervals were obtained and were then averaged for the outcome.14 SBP and A1c were co-primary outcomes; DBP and BP control (SBP <130 and DBP <80, as per guidelines at the start of the study) were secondary outcomes.

Other key secondary outcomes included weight and physical activity. Weight was measured using the same scale for each patient at all visits. Physical activity was measured by means of the short-form International Physical Activity Questionnaire (IPAQ), a validated measure of leisure-time physical activity, the outcome of which is expressed in metabolic equivalents.15 Covariates included age, race, ethnicity, gender, education, financial security, self-efficacy, health literacy, and medication adherence. Self-efficacy was measured with the Perceived Competence Scale,16 health literacy was measured using the Rapid Estimate of Adult Literacy in Medicine (REALM) tool,17 and medication adherence was obtained from self-reports and dichotomized into perfect vs. imperfect.18

Intervention

A single nurse with extensive experience in case management delivered both the tailored behavioral intervention and the attention control. For both arms, calls were delivered every two months, for a total of 12 calls.

For the intervention arm, the nurse’s behavior-modifying content was tailored to each patient’s individual barriers to controlling blood sugar or BP. This content was divided into a series of topical modules addressing one or more behaviors appropriate for improving control of BP or blood sugar, and included physical activity, weight reduction, low salt intake, smoking cessation, medication adherence, management of hypoglycemia, and blood glucose monitoring. The full list of modules was published previously.12 These modules were activated only if a patient required them (e.g., nonsmokers did not receive smoking cessation advice). The modules assessed barriers to specific behaviors, and the nurse then tried to engage the patient in problem-solving in order to determine actions for overcoming these barriers. In addition, barriers that might generalize to a number of problems—specifically, low levels of disease knowledge, poor memory, poor social support, and concern about the quality of physician-patient decision-making—were addressed on their own. Fidelity was assessed by two nurse-investigators (KP, BG), who listened to a sample of 5 % of total calls for delivery of intended content.

The control patients received calls that were not tailored; these calls provided traditional didactic information on a range of topics that had no relationship to HTN, DM, or any of the behaviors we were trying to improve (e.g., flu shots, skin cancer prevention). Content was tightly scripted, designed to limit the potential for productive interaction between nurse and patient, and was informed by standard guidelines as stated on government websites.

Analysis

Sample Size and Power Considerations. The original proposed primary outcomes (BP control and glycemic control) were longitudinal and binary. The sample size estimate was based on the second primary hypothesis that patients in the intervention group would achieve a higher rate of adequate BP control compared to the attention control group over a two-year follow-up period. The BP hypothesis was chosen because it required a larger sample than the glycemic control hypothesis (because not all patients had poorly controlled BP, while all had poorly controlled A1c). Assuming a 40 % baseline rate of control, a type I error rate of 5 %, a dropout rate of 20 %, and an AR(1) correlation structure, we estimated that 200 subjects in each arm had 80 % power to detect a 20 % change in proportion of BP control over 24 months, for a total of 400 subjects. Before enrollment began, however, a published study suggested that the HbA1c level was best analyzed as a continuous variable,19 and so we decided at that time to analyze our primary outcomes as continuous A1c and SBP. The sample size of 200 patients per arm was projected to yield greater than 80 % power to detect clinically meaningful differences in these outcomes.

Participants were analyzed in the group to which they were assigned regardless of whether they participated in any intervention calls.20 Statistical analyses were performed using SAS for Windows (version 9.2; SAS Institute, Cary, NC, USA). For primary analysis, we fit linear mixed models (LMM) with a random effect for clinic and an unstructured covariance model for the repeated measures.21 Our primary predictors included indicator variables for the 6-, 12-, and 24-month follow-up periods and interaction variables for the treatment group by follow-up time; primary inference was on treatment by 24-month follow-up interaction. All available patient data were used; no observations were deleted because of missing follow-up data. LMM via maximum likelihood estimation implicitly accommodates missingness when missingness is due to treatment, prior outcome, or other baseline covariates included in the model, defined as missing at random.22 The final models included a stratification variable for baseline BP control (SBP ≥ 130 or DBP ≥ 80 vs. SBP < 130 and DBP <80). For sensitivity or per-protocol analyses, we fit the same LMM as described above to the subgroup of subjects that completed at least 10 intervention or attention control calls. The secondary continuous outcomes of DBP, weight, and metabolic equivalent of task (MET) minutes per week were analyzed using LMM as described above for the primary outcomes. For sensitivity analysis, we transformed MET minutes per week using a square root transformation, and we also compared the change in MET minutes from baseline to 24 months between arms, using a Wilcoxon rank-sum test. For the secondary binary outcome, BP control, we used a generalized estimating equation (GEE) model,23 which accounts for clustering of patients within clinics and repeated measures within patients. We used a logit link and unstructured covariance.

RESULTS

Patients

We randomly assigned 377 of the 2,601 patients assessed for eligibility (Fig. 1). Most of the ineligible patients were excluded because of improved HbA1c control; serious comorbid illness also accounted for some exclusions. We enrolled patients between June 2009 and July 2011. Seventy-eight percent of patients completed the 12-month follow-up, and 263 patients (70 %) reached the primary 24-month endpoint. Overall, we completed 82 % of interviews. These 377 patients were enrolled from nine practices, with as many as 104 and as few as 16 patients from a single practice. Patient characteristics in the intervention and control arms were similar at baseline (Table 1).

Fig. 1 CONSORT diagram for the trial

Table 1 Subject Characteristics, Overall and by Study Arm

Attendance

Table 2 shows the distribution of completed encounters by randomization arm. Patients in the intervention arm completed 78 % of scheduled calls; control patients completed 81 %. The mean (SD) number of completed calls (of a total of 12 scheduled) was 9.7 (3.5) for control and 9.3 (3.3) for intervention patients. The mean number of completed calls was similar between arms (p = 0.23).

Table 2 Number of Completed Intervention Calls by Arm

Primary Outcomes

After adjustment for stratification variables, the estimated mean A1c and SBP were similar between arms at the primary endpoint of 24 months (intervention 0.1 % higher than control, 95 % CI [−0.3 %, 0.5 %], p = 0.50 for A1c; intervention 0.9 mmHg lower than control, 95 % CI [−5.4, 3.5], p = 0.69 for SBP; see Figs. 2a and b). Similarly, we found no difference between arms in mean A1c at 6 (p = 0.65) or 12 months (p = 0.23), or in mean SBP at 6 (p = 0.10) or 12 months (p = 0.54) (Fig. 2a). The largest estimated difference in SBP between arms was at 6 months (intervention 3.3 mmHg lower than control). The estimated intraclass correlation coefficients (ICC) were 0.008 for A1c and 0.002 for SBP. In sensitivity analyses of patients who completed at least 10 intervention or attention control calls, similar results were found for both A1c and SBP (not shown).

Fig. 2 Hemoglobin A1c (A1c, a) and systolic blood pressure (SBP, b) by study arm over time

Secondary Outcomes

We analyzed several secondary outcomes. Briefly, there were no significant differences between arms in DBP, weight, or physical activity throughout the two-year study period (see Table 3). At 24 months, the difference in DBP between arms was 0.4 mmHg (lower in the control arm), the mean weight was 0.3 kg higher among intervention group patients, and mean leisure time physical activity was 150 MET-min/week higher for intervention versus control patients. Similar results were found with a square root transformation applied to the leisure time activity outcome, and we found no difference between arms in change in leisure time physical activity at 24 months using a Wilcoxon rank-sum test (p = 0.14). Finally, we found no difference between arms in BP control rates (OR 1.1, 95 % CI [0.6, 2.0]; p = 0.80), with estimated BP control rates of 18.3 % for the intervention arm and 17.2 % for the control arm at 24 months.

Table 3 Estimated Mixed Model-Derived Difference in Secondary Outcomes Between Intervention and Control Arms

DISCUSSION

A tailored case management intervention for diabetes and hypertension in a community practice setting failed to yield improvements in SBP or A1c among patients. This null finding resulted from an improvement in A1c levels in both the control and intervention arms and a lack of improvement in SBP values in either arm. It occurred despite our intervention having previously shown efficacy for BP in an academic setting10,24 and similar interventions having shown efficacy for A1c in other academic settings. This result raises the question of whether the intervention failed to translate to the community setting because of differences inherent in the complexities of translating interventions to community settings in general or our practices in particular, or because of some other concern.

It is important to consider the possibility that our intervention in particular, and nurse case management in general, does not work very well in community settings. Previous trials have been conducted in VA facilities, resident teaching clinics, and other highly controlled settings, with teaching physicians and less pressure to generate revenue. The practices we used had very few (2/71) teaching primary care providers, and they were fee-for-service providers whose primary purpose beyond excellent care is to generate revenue. Moreover, we are not alone in experiencing this difficulty in community practices; since the end of our trial, three trials have been published of similar interventions in community practices in three countries in Europe, all of which showed no impact on clinical outcomes.25–27

The purpose of our trial was to rigorously evaluate an intervention similar to nurse behavioral management interventions that have almost certainly been implemented in a number of community practices. These interventions likely show apparent efficacy in uncontrolled settings, thus encouraging practices to continue to devote resources to nurse management. However, the improvement in A1c seen in the control arm of our study underscores the fallacy in assuming that it was the case management that led to the improvement in these cases. Patients who are eligible to start a behavioral intervention at the nadir of their disease control are likely to improve over time, with or without the program, and for that reason, controlled trials are better positioned to measure the value of such programs.28 The null result of this trial, therefore, supports the possibility that the apparent gains of nurse management programs are due to this regression to the mean, and that these interventions may be ineffective. However, there are a few categories of other possible reasons for our null result.
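The regression-to-the-mean argument above can be illustrated with a small simulation. This is a sketch under invented assumptions, not an analysis of trial data: the population mean, the between-patient and within-patient standard deviations, and the function name `simulate_regression_to_mean` are all hypothetical, and only the 7.5 % entry cutoff comes from the trial's eligibility criterion. No patient in the simulation receives any intervention.

```python
import random
from statistics import mean


def simulate_regression_to_mean(n=20000, true_mean=7.5, between_sd=1.0,
                                within_sd=0.5, cutoff=7.5, seed=0):
    """Return (mean baseline A1c, mean follow-up A1c) among enrolled patients.

    Each simulated patient has a stable underlying A1c; every measurement
    adds independent noise. Patients are enrolled only if the first
    measurement is at or above the cutoff. All numeric defaults except the
    cutoff are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    baseline, followup = [], []
    for _ in range(n):
        true_a1c = rng.gauss(true_mean, between_sd)
        first = true_a1c + rng.gauss(0, within_sd)
        if first >= cutoff:  # enrollment depends on the noisy first reading
            baseline.append(first)
            followup.append(true_a1c + rng.gauss(0, within_sd))
    return mean(baseline), mean(followup)


base, follow = simulate_regression_to_mean()
print(f"enrolled baseline mean: {base:.2f}, follow-up mean: {follow:.2f}")
```

Because enrollment selects on a measurement that includes noise, the enrolled group's mean falls at follow-up even though nothing was done to them. An uncontrolled before-and-after comparison would attribute that drop to the program, whereas a control arm shows the same drop and cancels it out.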

There are at least three important design elements of this trial that may have portended the negative result. First, we used an attention control; most previous studies of disease management for DM and/or HTN have used usual care controls (including our previous studies in HTN). The reason for attention controls is precisely the concern that contact time with the patient, independent of content of intervention, may be a potent intervention, and theoretically this could explain the A1c result. However, many studies have shown that didactic lectures on diabetes-related topics are a weak interventional approach (reviewed in29), much less lectures on unrelated information. Also, a potent attention control does not explain the lack of improvement in BP in either arm. Second, co-intervention is always a concern in effectiveness trials of quality improvement (QI) type interventions, and a co-intervention in the form of QI initiatives in the practice network theoretically could have caused the null result. Again, this is not a likely explanation for the BP results, where there was no improvement at all. Additionally, the mean A1c for patients in Duke Primary Care was unchanged at 7.3 % over the course of the study, suggesting that system-wide co-intervention was not an important factor for improvement in A1c among the control group. Finally, our choice to allow patients to enter the study despite having adequate BP control may have contributed to the null BP result. We made this decision based on the fact that 40 % of patients who are in control at a given moment fall out of control within a year,30 and the mean baseline SBP was well above the treatment goal of 130 mmHg for patients with diabetes in essentially all guidelines used for the duration of the study; 72 % of patients had baseline SBP > 130 mmHg. Furthermore, this is not an explanation for the A1c results.

It is also possible that our intervention was not just ineffective, but non-efficacious, despite the prior literature suggesting efficacy. Part of the challenge of working in community practices is that it is very hard to closely link an intervention or interventionist with any of the several practices within a study, and any such linkage was frankly considered undesirable by many of the practices. This lack of integration with primary care may have weakened the intervention; this has been a concern for other telephone management interventions as well.31 It is also possible that, while the telephone was an important modality for intervention delivery at the time that prior studies were conducted, patients now prefer web- or text-based interventions, and the telephone is no longer a potent means of intervention delivery.

Regarding the BP outcome, we have had extensive experience with this part of the intervention, and we do not think that our content was inadequate to effect BP lowering. In contrast, we had not previously developed a DM intervention. And while the DM intervention was built around the same theoretical principles as the HTN intervention, and contained the same domains of material as were used in other similar interventions in the literature, it is possible that the DM-focused content of the intervention (e.g., self-monitoring, insulin management) was not as potent as in other studies. It is also possible that the two-month frequency of contact that had worked for us in HTN was not frequent enough to produce a change in A1c; prior diabetes interventions have used monthly calls.9

There are two analytic issues that must also be considered as possible contributing factors. First, the study may have been underpowered for the BP outcome objective. Sample standard deviations (SDs) are unpredictable. Our previous work had shown SD ranging from 14 to 17 mmHg,10,32 but our SD in this study was almost 21 mmHg. With this unexpectedly large SD, a relatively small absolute effect size of <0.25 SD becomes clinically important, and our 95 % CIs around the difference in SBP between arms fail at all time points to exclude clinically important effects favoring intervention (∼5 mmHg). This is a lesser consideration for A1c, where at two of three time points, 95 % CIs exclude clinically relatively small differences of 0.35 % favoring intervention. Unexpectedly high rates of dropout (30 % at 24 months, compared to 20 % expected) also contributed to the limited power. Second, the results of this study may simply reflect a spurious type II error. Spurious error, by convention, is better tolerated for null results than for positive findings (i.e., β is set greater than α in power calculations), and it is certainly possible that we randomly enrolled a sample of patients less responsive to disease management than the population as a whole.
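The effect of the larger-than-expected SD and dropout on power can be approximated with a simple two-sample normal calculation. This is a rough sketch only: it ignores the repeated measures and clinic clustering handled by the mixed model, so it does not reproduce the trial's power analysis. The SD of 15.5 mmHg (midpoint of the 14 to 17 mmHg range cited), the per-arm counts of 160 (200 less 20 % dropout) and 131 (263 completers split evenly), and the function name `two_sample_power` are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(diff, sd, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two arm means.

    Assumes equal arm sizes, a common SD, and independent observations.
    """
    z = NormalDist()
    se = sd * sqrt(2 / n_per_arm)          # standard error of the difference
    z_crit = z.inv_cdf(1 - alpha / 2)      # two-sided critical value
    shift = abs(diff) / se                 # standardized true difference
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Planning-stage assumptions (hypothetical midpoint SD, 20 % dropout).
planned = two_sample_power(diff=5, sd=15.5, n_per_arm=160)
# Roughly what was observed (SD near 21 mmHg, 30 % dropout).
observed = two_sample_power(diff=5, sd=21, n_per_arm=131)
print(f"planned: {planned:.2f}, observed: {observed:.2f}")
```

Under these simplifying assumptions, power to detect a 5 mmHg difference in SBP falls from roughly 0.8 at the planning values to roughly 0.5 at the observed SD and completion rate, which is consistent with confidence intervals that cannot exclude a clinically important effect.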

Our study illustrates the intertwined challenges associated with both the translation of chronic illness interventions to community practices and the scientific study of that translation. While small gains in clinical outcomes may add up to an important public health impact over a large population, the study of a modest intervention by traditional trial methods may not be feasible. Future studies will want to start with more potent interventions, perhaps with more frequent contact, so that their effects can be measured even after potency is lost in translation. Alternatively, they will want to have a greatly increased sample size such that smaller effects among more statistically diverse samples can be detected.