Process of Developing the BeFITS-MH Measure
We undertook an extensive process to develop the BeFITS-MH measure. First, we developed a multi-level conceptual model to guide our understanding of the domains of barriers and facilitators associated with task-sharing mental health interventions. Second, we further specified the conceptual model using two data sources: the Shared Research Project (described below) and a systematic review. Based on the results of this model-building process, we constructed the initial draft of the BeFITS-MH measure. The measure was then revised through expert feedback from a Delphi panel and further refined during the translation and local adaptation stage. Finally, we conducted concurrent pilot testing to finalize the BeFITS-MH measure within the three study site programs.
Theory-driven and empirically-grounded measure development.
In developing our theoretical model, we selected the Consolidated Framework for Implementation Research (CFIR) (35, 36) and the Theoretical Domains Framework (TDF) (37, 38), which together allowed us to enumerate and categorize a wide range of potential implementation determinants, that is, 'barriers and facilitators' (39). We also drew on Chaudoir et al.'s framework (10), which specifies that implementation outcomes (e.g., acceptability, feasibility, fidelity, reach, adoption) are predicted by implementation factors (i.e., barriers and facilitators) at five levels: (I) client; (II) provider; (III) innovation (defined as the evidence-based practice or intervention); (IV) organization; and (V) structural. This framework is especially applicable to task-sharing because it includes the characteristics of the providers, who are critical in delivering task-sharing mental health interventions.
We applied and iteratively refined our framework of the domains and constructs for the types of barriers and facilitators using data from two parallel studies: (I) the Shared Research Project, a qualitative study that collected interview data from three NIMH-funded collaborative U19 “hubs” that implemented task-sharing mental health interventions in different LMIC sites; and (II) a systematic review synthesizing reported implementation barriers and facilitators of task-sharing mental health interventions in LMICs (40). For each data source, trained research assistants coded the transcripts (for the Shared Research Project) or included articles (for the systematic review) for the type of implementation factor, and iteratively revised the resulting codebook until we judged it to comprehensively capture the barriers and facilitators in task-sharing mental health interventions.
A detailed description of the resulting BeFITS-MH framework and codebook is presented in Le et al. (40). Briefly, we specify eight domains of task-sharing mental health intervention barriers and facilitators within and across different spheres of influence: (I) client characteristics and (II) provider characteristics at the micro level; (III) family- and community-related factors and (IV) organizational factors at the meso level; (V) societal and (VI) mental health system domains at the macro level; and (VII) intervention characteristics and (VIII) mental health stigma domains operating across levels. Figure 1 illustrates the conceptual framework for the BeFITS-MH measure, specifying: (I) the eight domains of barriers and facilitators in implementation of task-sharing mental health interventions, and (II) the three implementation outcomes against which we aim to validate the BeFITS-MH measure: acceptability, appropriateness, and feasibility. We selected these implementation outcomes because they are leading indicators of adoption of evidence-based interventions (41, 42).
INSERT FIGURE 1 ABOUT HERE
Initial BeFITS-MH measure.
Based on the conceptual framework and the results of the Shared Research Project and the systematic review, we developed an initial version of the BeFITS-MH measure, which contained six subscales and a total of 43 items (6–8 items per subscale), capturing critical aspects of task-sharing mental health implementation barriers and facilitators.
Delphi process.
To refine the BeFITS-MH measure and to arrive at an expert group consensus on the measure’s core initial domains, format, and structure, we conducted a modified Delphi process; our modification consisted of group sessions that provided opportunities to discuss differences in responses. First, we assembled a ‘Dissemination Panel’ of 19 global experts in the implementation of task-sharing mental health interventions and in health services research, particularly in LMIC settings, including the study co-investigators at the three sites (South Africa, Chile, Nepal). Over a period of five months, the panel met in three virtual forums (2 hours each), interspersed with two rounds of online questionnaires in which panel members were asked to individually provide feedback about different aspects of the BeFITS-MH measure (e.g., the construct and content validity of the domains [subscales], cultural and linguistic appropriateness of the items, hypothesized relationships of subscales to implementation outcomes). All questionnaire responses were compiled and discussed at the following virtual forum.
Field-based translation and local adaptations.
Following the Delphi process, we held regular biweekly virtual meetings with the lead BeFITS-MH measure developers and co-investigators from each of the three study sites to translate and locally adapt the BeFITS-MH measure. Within each site, we opted for a group translation process, wherein 2–3 local staff (researchers, clinicians, task-sharing providers, and program implementers) jointly translated the measure. This collaborative process has been identified as particularly important for mental health problems and programs, where assessments of emotions and behaviors need to be aligned with local understandings and conceptualizations (43, 44). Along with the translations, site-specific adaptations included using appropriate terms to describe the target mental health problem and the task-sharing mental health intervention being implemented within each setting. For example, each site provided the project-specific term used for the task-sharing ‘provider’ (e.g., ‘counselor’ in South Africa, ‘team member’ in Chile, and ‘primary health care worker’ in Nepal). Notes on how each item was translated and on all site-specific adaptations were recorded and discussed during the biweekly virtual meetings to harmonize the measure across sites and to preserve the comprehensiveness of item content (i.e., content validity) to the extent possible.
Pilot testing.
Piloting of the translated and adapted BeFITS-MH measure was conducted concurrently across the three sites with providers (South Africa, 4; Chile, 5; Nepal, 35) and service users (South Africa, 10; Chile, 5; Nepal, 6). As part of the piloting process, cognitive interviews were conducted in which respondents were asked to “think aloud” while responding to each item and to comment on whether items were worded in an understandable way. We further asked whether items were applicable to the specific task-sharing mental health program being implemented and to the local setting; this enabled us to identify whether the full range of identified barriers and facilitators was represented and, by triangulating responses across the three sites, which aspects of barriers and facilitators could be considered core to task-sharing across sites. We also gathered feedback on each project's preferred format for the measure (question vs. statement) and on the response scaling used. We discussed the findings during the biweekly virtual meetings, noting site-specific findings as well as commonalities across sites.
Process of Enhancing Utility of the BeFITS-MH measure: Assessing Associations with Implementation Outcomes
To support later BeFITS-MH validation testing, we describe the process of enhancing the construct validity and utility of the BeFITS-MH measure for assessing three implementation outcomes of interest: acceptability, appropriateness, and feasibility. We did this by pilot testing three brief measures previously used in implementation science research (described below) and through stakeholder discussions at each site.
Standard measures of implementation outcomes.
The three selected measures were: (a) the Acceptability of Intervention Measure (AIM); (b) the Intervention Appropriateness Measure (IAM); and (c) the Feasibility of Intervention Measure (FIM) (45). These measures were developed by implementation science researchers and mental health professionals in the United States; the vast majority of the developers, and of the sample of counselors who took part in the development process, were Caucasian Americans. The three measures were initially developed for use with mental health counselors in the United States to evaluate the acceptability, appropriateness, and feasibility of different treatment options (45). They have been used in English-speaking populations across a range of interventions, including with school staff for student-wellness programming in England and with health care providers delivering antenatal alcohol reduction interventions in Australia (46, 47). More recently, these measures have been used in LMIC settings (Kenya, Tanzania, Botswana, South Africa, and Guatemala) in studies of mental health interventions (for depression, anxiety, and alcohol use disorder), including those utilizing task-sharing strategies, HIV services, and medical interventions for genetic disorders and malignant cancers (48–53). Of note, English-language versions of these measures have been used in most settings, with a Swahili version developed in Kenya through translation-back-translation methods (54). In addition to planning to use these three measures with the task-sharing providers, we explored the potential for using them with the clients and patients who were receiving the task-sharing mental health interventions. Field testing of these three measures with providers and service users was conducted concurrently with the pilot testing of the BeFITS-MH measure (above). After site-specific translation, we made one adaptation to the measures: replacing the term ‘EBP’ with the name of the specific task-sharing program implemented at each site.
We then administered the measures to samples of task-sharing providers and service users.
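As a concrete illustration of how responses to these measures are typically summarized: in the published versions, the AIM, IAM, and FIM each contain four items rated on a 5-point agreement scale, and each scale is scored as the mean of its items; the item structure and scoring should be confirmed against the version administered at each site. A minimal sketch, with hypothetical responses, assuming that four-item structure:

```python
from statistics import mean

# Hypothetical item responses for one respondent
# (1 = completely disagree ... 5 = completely agree).
responses = {
    "AIM": [4, 5, 4, 4],
    "IAM": [3, 4, 4, 3],
    "FIM": [5, 4, 4, 5],
}

def scale_score(items):
    """Scale score computed as the mean of item responses."""
    return mean(items)

scores = {measure: scale_score(items) for measure, items in responses.items()}
print(scores)  # {'AIM': 4.25, 'IAM': 3.5, 'FIM': 4.5}
```

Averaging (rather than summing) keeps each scale score on the original 1–5 response metric, which eases comparison across the three measures and across sites.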
Stakeholder feedback sessions.
To gain a better understanding, from the mental health practitioner and system perspectives, of the implementation outcomes, we held discussions with local staff in each of the three study countries. These individual and small group conversations were led by site co-investigators using a standard script that included definitions of acceptability, appropriateness, and feasibility, and probes for level-specific indicators for clients, providers, and the setting (Table 2). By indicators we mean individual items or programmatic and clinical metrics (such as cases seen per month) that can be included in the measure and that are directly related to the measurement of the implementation outcomes. After an introduction of the definitions of the three implementation outcomes, the probes asked participants to suggest how they thought we could best measure these outcomes from the perspective of the people ‘receiving’ the program, the people ‘delivering’ the program, and the locale where the program is being provided.
Table 2
Stakeholder Feedback Definitions and Probes
Implementation Outcome | Definition |
Acceptability | This is the view that the program is agreeable and satisfactory to the people providing the program and to the people receiving the program. This means the program is a good fit for the individuals providing the program and for the people receiving the program. |
Appropriateness | This is the view that the program fits and is relevant to the setting, to the people providing the program and to the people receiving the program. This means the program is suitable, compatible, a good fit for the health issue, and/or given the norms and beliefs of the clinic, the people giving the program, or the people receiving the program. |
Feasibility | This is the view that the program can be successfully used and carried out by the providers in a given setting. This means the program is possible to do given the resources (such as time, effort, and money), and the circumstances (such as policies, timing). |
Probes for Programmatic Indicators in Stakeholder Feedback Sessions
Client-level probe | Provider-level probe | Setting-level probe |
What are your ideas for how we could measure whether people receiving the program think the program is acceptable? | What are your ideas for how we could measure whether people providing the program think the program is acceptable? | What are your ideas for how we could measure whether the place where the program is provided is acceptable? |
In Nepal, two small group discussions were held: the first with 3 participants (1 medical officer and 2 senior auxiliary health workers) and the second with 7 participants (4 psychosocial counselors and 3 health systems research staff). In South Africa, one small group discussion was held with 3 program staff (program monitoring and evaluation staff and program implementers). In Chile, information was collected by direct interview of 4 mental health professionals (1 psychologist, 1 occupational therapist, 1 social worker, 1 nurse) working at mental health centers where the task-sharing program is being implemented.
The discussions were transcribed and shared in English (for Nepal and Chile) with the full study team. Transcripts were reviewed and coded for: (I) each of the three implementation outcomes; and (II) each perspective (client, provider, system) by the study PIs (LHY, JB) and key study team members (PTL, MG). Results were reviewed to identify commonalities and common indicators, with particular attention to where differences may be driven by the distinct type of task-sharing program being implemented at each site. Summaries of the stakeholder perspectives on each implementation outcome are presented in the Results; recommended programmatic and clinical indicators that can be used in future formal validation testing of the BeFITS-MH measure to enhance its utility are addressed in the Discussion.