Improved calibration estimators for the total cost of health programs and application to immunization in Brazil

Correction

15 Apr 2019: Rivera-Rodriguez C, Toscano C, Resch S (2019) Correction: Improved calibration estimators for the total cost of health programs and application to immunization in Brazil. PLOS ONE 14(4): e0215701. https://doi.org/10.1371/journal.pone.0215701

Abstract

Multi-stage/level sampling designs have been widely used by survey statisticians as a means of obtaining reliable and efficient estimates at a reasonable implementation cost. This method has been particularly useful in nationwide surveys that assess the costs of delivering public health programs, which generally originate at different levels of service management and delivery. Unbiased and efficient estimates of costs are essential to adequately allocate resources and inform policy and planning. In recent years, the global health community has become increasingly interested in estimating the costs of immunization programs. In such programs, part of the cost corresponds to vaccines, which in most countries are procured at the central level, while the rest of the costs are incurred in states, municipalities and health facilities. As such, total program cost is the sum of these costs, and its variance should account for the relation between the totals at the different levels. An additional challenge is missing information at the various levels. A variety of methods have been developed to compensate for such missing data. Weighting adjustments are often used to make the estimates consistent with readily available information. For estimation of total program costs this implies adjusting the estimates at each level to comply with the characteristics of the country. In 2014, a national study to estimate the costs of the Brazilian National Immunization Program was initiated at the request of the Ministry of Health and with the support of international partners. We formulate a quick and useful way to compute the variance and deal with missing values at the various levels. Our approach involves calibrating the weights at each level using additional readily available information such as the total number of doses administered. Taking the Brazilian immunization costing study as an example, this approach results in substantial gains in both efficiency and precision of the cost estimate.

Introduction

The global health community has become increasingly interested in estimating the costs of delivering public health programs, which generally originate at different levels of service management and delivery. The interest has arisen from the need for financial sustainability and expansion of these programs. However, in low and middle income countries, information on resource utilization and finances is scarce. Among the reasons for the lack of such information is the fact that public health program implementation occurs mostly at decentralized levels of management, and many costs are shared with other programs. As a result, program managers do not have the key information to improve efficiency or to advocate for adequate budget support. This is even more important when considering newly available technologies that are incorporated into public health programs, such as the introduction of new vaccines, which are significantly more costly than the traditional childhood vaccines [1–4].

The costs of immunization programs occur at various levels: part of the cost corresponds to vaccines, which in most countries are procured at the central level, while the rest of the costs are incurred in states, municipalities and health facilities. As such, total program cost is the sum of these costs, and its variance should account not only for the uncertainty at each level, but also for the uncertainty due to the totals at each level being related (or nested) and for missing data.

In the past several years, selected studies have been conducted to assess the costs of immunization programs. In particular, various low and middle income countries have conducted immunization costing studies since 2010, with support from the Expanded Program on Immunization Costing and Financing (EPIC) project [1]. Detailed costing information on routine child immunization is available from Benin, Ghana, Honduras, Moldova, Uganda and Zambia [2, 3, 5–8]. These studies had in common that they aimed at estimating total immunization costs, and not only expenditures on immunization. They were characterized by multi-stage sampling of facilities and regression modelling to estimate the average costs at the various levels. The estimated average cost was then multiplied by the nationwide number of corresponding units (e.g. districts, facilities, etc.) to estimate total program costs. The study conducted in Honduras used sampling weights to obtain estimates of total national costs for routine immunization [3]. Despite having many strengths, these methods require that no information is missing, and they do not allow generating a measure of uncertainty for the estimated total immunization program cost.

Over recent years, a variety of methods have been developed to compensate for missing data [9–17]. Weighting adjustments are often used to make the estimates consistent with readily available information. For estimation of total program costs this implies adjusting the estimates at each level to comply with the characteristics of the country. Due to the need for collecting data at various levels, the large number of health-care units providing immunization at the local level, and the wide diversity of cost categories in immunization programs, it is inevitable that some data will be lacking when conducting a comprehensive costing survey of immunization programs. Such missing data may come from the sampling units at any of the levels surveyed. This, coupled with the lack of an uncertainty measure for the overall cost estimate, represents the major methodological challenge when conducting nationwide immunization costing studies. Obtaining a precise and accurate cost estimator is critical for decision makers to inform policy and planning, and to recommend future studies.

Brazil structured its National Immunization Program in the 1970s, and has since strengthened the program with legislation to secure funds. Various newly available vaccines have been incorporated into the program in the past decade, taking into consideration a recommended framework for vaccine introduction decision-making [4]. In order to generate evidence to support decision making and strengthen national capacity to appropriately inform policy and planning on the introduction of new vaccines, the Brazilian Ministry of Health commissioned a study to assess the costs of the Brazilian National Immunization Program. A nationwide costing survey was conducted for the year 2013 from the perspective of SUS, the Brazilian national public health-care system, which provides immunization services free of charge to the entire population through its more than 35,000 health-care services distributed across the country's 5 macro-regions and 27 states (including a federal district) [4]. More details of the methods can be found in [6].

In brief, data on all resources used in the immunization activities which occur at four organizational levels was collected: central, state, municipal, and health care facilities providing immunization services. Data was collected at the central and all 27 state levels, in addition to a sample of municipalities and health-care facilities. The sample was collected following a two-stage stratified scheme, where the strata and stages are based on regions and municipalities respectively. After data collection, selected data was missing at both the facility and municipality levels.

Using the Brazilian National Immunization Program Costing Study, we propose and validate a quick and useful approach to estimating the inter-level variance while accounting for missing data at different sampling levels. Our method involves calibrating the weights at each level using additional readily-available information for the corresponding entire level such as the total number of doses administered or population targeted by the immunization program. Furthermore, we present theoretical calculations on how to compute the variance of this estimator using Taylor approximations. The paper is composed as follows. The first section describes the total cost and its components, the second section describes the sampling design and missing data, followed by estimation methods and variance. This is followed by post-design methods to improve estimation of the total cost and we finalize the paper with application to the Brazilian immunization study.

The total cost T

The main interest lies in estimating the total cost of a health program, e.g. the routine immunization program in Brazil. This cost can be expressed as

$$T = T_F + T_M + T_R + T_C, \tag{1}$$

where $T_F$, $T_M$ and $T_R$ represent the total costs at the facility level, municipality level and regional level, respectively. Note that $T_F$, $T_M$ and $T_R$ are not necessarily unknown, but it is usually the case that at least two of them are to be estimated [2, 3, 5, 7, 8]. $T_C$ represents the additional cost accrued centrally and it is assumed known. For example, in Brazil, this cost arises from vaccines purchased at the central level, in addition to coordination and management activities that incur personnel, transportation and infrastructure costs. Note that, if desired, this expression can also be used to estimate any total figure of interest. Costing studies can have different goals, but to keep the paper readable we focus on total cost. We denote by $N_R$ the total number of regions, by $N_{M,r}$ the total number of municipalities within region $r$ and by $N_{F,rm}$ the total number of facilities in municipality $m$, region $r$. Each of these costs can, respectively, be expressed as follows:

$$T_R = \sum_{r=1}^{N_R} y_r, \tag{2}$$

where $y_r$ is the total cost specific to region $r$ and $N_R$ is the number of regions. Similarly, $T_M$ and $T_F$ can be represented as

$$T_M = \sum_{r=1}^{N_R} \sum_{m=1}^{N_{M,r}} y_{rm}, \tag{3}$$

$$T_F = \sum_{r=1}^{N_R} \sum_{m=1}^{N_{M,r}} \sum_{f=1}^{N_{F,rm}} y_{rmf}, \tag{4}$$

where $y_{rm}$ is the total cost incurred by municipality $m$ in region $r$ and $y_{rmf}$ is the total cost specific to facility $f$ in municipality $m$ in region $r$. Note that this expression explicitly acknowledges that facilities are structured as belonging to some specific municipality within some specific region. Furthermore, the number of municipalities within any given region may vary across regions, and the number of facilities within any given municipality may also vary. Taking this structure into consideration is essential for valid estimation and inference.
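The nested structure of the total cost can be illustrated with a minimal sketch. It assumes a tiny, fully observed population; the dictionary layout, the region and municipality names and every figure are invented for illustration and do not come from the Brazilian study.

```python
# Hypothetical toy population: region -> municipality -> facility costs.
# All names and figures are invented for illustration only.
population = {
    "north": {
        "cost": 10.0,  # cost incurred at the regional level
        "municipalities": {
            "m1": {"cost": 4.0, "facilities": [1.0, 2.0]},
            "m2": {"cost": 3.0, "facilities": [1.5]},
        },
    },
    "south": {
        "cost": 8.0,
        "municipalities": {
            "m3": {"cost": 5.0, "facilities": [2.0, 2.5, 0.5]},
        },
    },
}
central_cost = 20.0  # centrally accrued cost, assumed known


def total_cost(pop, t_central):
    """Return (T, T_F, T_M, T_R) by summing costs over the nested levels."""
    t_r = sum(region["cost"] for region in pop.values())
    t_m = sum(
        muni["cost"]
        for region in pop.values()
        for muni in region["municipalities"].values()
    )
    t_f = sum(
        cost
        for region in pop.values()
        for muni in region["municipalities"].values()
        for cost in muni["facilities"]
    )
    return t_f + t_m + t_r + t_central, t_f, t_m, t_r


T, T_F, T_M, T_R = total_cost(population, central_cost)
print(T, T_F, T_M, T_R)
```

Note that the number of municipalities per region and of facilities per municipality differs across units, which is exactly the situation the nested sums accommodate.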

The sampling design and missing data

Generally, the costs incurred at region, municipality and facility levels are unknown. Researchers usually have access to readily available information on the hierarchical structure or demographic information at each level, but not to cost data. It is often the case that investigators have access or resources to collect detailed information on costs in a sub-sample of units. A cost-efficient approach is to implement a multi-stage sampling design where a sample of units is selected at each stage and the costs/variables of interest are recorded for each sampled unit. For example, in Brazil, the first-stage units are regions and a sample of $n_R = N_R$ regions was selected. Because all regions were included in the sample, this is the same as a stratified sampling design. Subsequently, a sample of municipalities was selected from each region. The sampling design within each region was proportional to the number of children under one year old, with an expected sample size of $E(n_{M,r}) = 11$, where $n_{M,r}$ denotes the final number of municipalities sampled in region $r$. Additionally, within each municipality, a random sample of facilities was selected. The expected municipality sample size per macro-region was 11 (55 in total across the 5 macro-regions), and the planned facility sample size was 66 immunization services per macro-region, totaling 330 immunization services in the whole country. In each macro-region, the facilities from the 11 selected municipalities were assembled into a pool, and 66 facilities were then selected from each pool. Due to this sampling mechanism, not all municipalities had facilities selected. In fact, this yielded a final sample of 330 vaccination units located in 40 Brazilian municipalities. The 15 municipalities for which no health facilities were selected in the second stage of the sampling process were not visited, and therefore no municipal-level data were collected for them. In addition, the information on two municipalities and their facilities was lost due to logistic issues. These municipalities were Inocência, in the Central-West region, and Riachão do Jacuípe, in the Northeast region. This implies that the information from facilities belonging to these two municipalities was also missing.
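To see why pooling facilities can leave some selected municipalities without any sampled facility, consider the following minimal sketch. It assumes an equal-probability draw without replacement from the pool, which may differ from the selection rule actually used in the study; the function name, the municipality names and the facility counts are all hypothetical.

```python
import random


def pooled_facility_sample(facilities_by_muni, n_facilities, seed=0):
    """Draw facilities from the pooled list of all selected municipalities.

    Returns the sampled (municipality, facility) pairs, the municipalities
    that received at least one facility, and those that received none.
    """
    rng = random.Random(seed)
    pool = [(m, f) for m, facs in facilities_by_muni.items() for f in facs]
    # Assumption: equal-probability sampling without replacement.
    sample = rng.sample(pool, n_facilities)
    visited = {m for m, _ in sample}
    skipped = set(facilities_by_muni) - visited
    return sample, visited, skipped


# Hypothetical macro-region: 11 selected municipalities, one large city
# and ten small towns (facility counts are invented).
region = {"city": [f"city-{i}" for i in range(400)]}
region.update({f"town{k}": [f"town{k}-0", f"town{k}-1"] for k in range(10)})

sample, visited, skipped = pooled_facility_sample(region, n_facilities=66)
print(len(sample), sorted(visited), sorted(skipped))
```

Because large municipalities dominate the pool, small ones can easily end up in `skipped`, which corresponds to the municipalities that were never visited.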

Table 1 shows the final number of municipalities and facilities per region. For the Brazilian study, the total expected number of sampled municipalities was 55, but the actual number available for analysis was 38; the actual facility sample sizes are reported in Table 1.

Table 1. Final number of municipalities and facilities per region.

https://doi.org/10.1371/journal.pone.0212401.t001

IPW (Inverse probability weighting) estimation

In the absence of complete data at the municipality and facility levels, neither $T_R$, $T_M$ nor $T_F$ can be directly calculated. This is the case for the multi-stage sampling designs of interest here. A possible approach for overcoming this limitation is the well-known inverse probability weighting (IPW) [10, 18]. It uses sampling weights to build a bridge between the observed subsample and the entire population, producing an estimator that represents the whole population. Let $n_{F,rm}$ be the total number of facilities sampled in municipality $m$ in region $r$ and $n_{F,r}$ be the total number of facilities sampled in region $r$. Focusing on $T_R$, the IPW estimate is given by

$$\hat{T}_R = \sum_{r=1}^{N_R} R_r \, w_r^R \, y_r, \tag{5}$$

where $y_r$ is the cost of region $r$, $R_r$ is an indicator of whether region $r$ was selected and $w_r^R = 1/\pi_r^R$, with $\pi_r^R$ the probability that region $r$ was selected to be part of the survey. In Brazil, for example, $R_r = 1$ for all five regions and $\pi_r^R = 1$. In some settings, such as the Honduras EPIC study, not all regions were included [3]. In such situations, some $R_r$ are zero. Similarly, an IPW estimator of $T_M$ is

$$\hat{T}_M = \sum_{r=1}^{N_R} \sum_{m=1}^{N_{M,r}} R_{rm} \, w_{rm}^M \, y_{rm}, \tag{6}$$

where $y_{rm}$ is the cost of municipality $m$ in region $r$, $R_{rm}$ is an indicator of whether municipality $m$ in region $r$ was selected in the second-stage sample, and $w_{rm}^M = 1/\pi_{rm}^M$, with $\pi_{rm}^M$ being the probability that the municipality was selected. In Brazil, 40 municipalities were visited, but the information on two of them is missing, so only 38 are available for the analysis.

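Similarly, an IPW estimator of $T_F$ is

$$\hat{T}_F = \sum_{r=1}^{N_R} \sum_{m=1}^{N_{M,r}} \sum_{f=1}^{N_{F,rm}} R_{rmf} \, w_{rmf}^F \, y_{rmf}, \tag{7}$$

where $y_{rmf}$ is the cost of facility $f$ in municipality $m$ in region $r$, $R_{rmf}$ is an indicator of whether that facility was selected in the third-stage sample, and $w_{rmf}^F = 1/\pi_{rmf}^F$, with $\pi_{rmf}^F$ the inclusion probability for such a facility. The final IPW estimator is then given by $\hat{T} = \hat{T}_F + \hat{T}_M + \hat{T}_R + T_C$. Under a true design, statistical properties guarantee $\hat{T}$ to be an unbiased estimator of the total cost $T$ [10, 18].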
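The IPW idea at each level can be sketched in a few lines. The example below is only illustrative: the helper name `ipw_total`, the costs and the inclusion probabilities are all hypothetical, and each probability is taken to be the overall probability that the unit ends up in the sample.

```python
def ipw_total(values, inclusion_probs):
    """Inverse-probability-weighted estimate of a population total.

    Each sampled value is weighted by 1 / (its inclusion probability).
    """
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have equal length")
    return sum(y / p for y, p in zip(values, inclusion_probs))


# Hypothetical sampled costs and inclusion probabilities at each level.
region_costs, region_probs = [10.0, 8.0], [1.0, 1.0]  # all regions sampled
muni_costs, muni_probs = [4.0, 5.0], [0.5, 0.25]
fac_costs, fac_probs = [1.0, 2.0, 0.5], [0.25, 0.25, 0.125]
central_cost = 20.0  # assumed known, so it carries no sampling weight

t_r_hat = ipw_total(region_costs, region_probs)
t_m_hat = ipw_total(muni_costs, muni_probs)
t_f_hat = ipw_total(fac_costs, fac_probs)
t_hat = t_f_hat + t_m_hat + t_r_hat + central_cost
print(t_r_hat, t_m_hat, t_f_hat, t_hat)
```

Units with a small inclusion probability stand in for many unsampled units, so they contribute more to the estimated total.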

Variance estimation

Usually IPW estimators only involve data at the last stage of the sampling design. An example of this is the facility-level estimator $\hat{T}_F$. In this case, the variance of the estimator is well known and straightforward to calculate following standard properties of multi-stage designs [18]. However, our estimator of interest involves information from the various stages of the sampling design. This implies that the variance has to be estimated differently. Note first that (8) Using this, the number of facilities sampled in municipality m, region r can be written as and the number of facilities sampled in region r can be written as and these quantities are fixed by design. Furthermore, note that (Eqs (5), (6) and (7)) can respectively be written as (9) (10) (11) Now, let (12) Then, using (9)–(11) and urmf, we can write the estimator as (13)

Since , , and are fixed by design, all the design-based uncertainty is captured in the expression above and we can calculate the variance of the estimator using standard expressions or techniques for the IPW estimators, but now the target estimator is for the total of the variable urmf. Let sR, and denote the sample of regions, municipalities in region r and facilities in municipality m in region r, respectively. Note that the variance of the estimator can be calculated using the expression (14) Following [18], the expression for the first term is (15) where , which represents the pairwise inclusion probabilities at the regional level, . The term is given by , where . The term is calculated in the same way as (15) and so on for each sampling stage. Further details are given in the supporting information S1. An estimator of (15) is given by (16) where is such that . This expression is calculated in a similar way to (16). Details on these derivations can be found in [18]. We also provide specific formulas for (15) and (16) for the Brazilian design. An unbiased estimator for the second term in (14) is given by (17) where (18)
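As a hedged illustration of how pairwise inclusion probabilities enter a variance estimator of this kind, the sketch below implements the standard single-stage estimator described in [18] and checks it against the textbook formula for simple random sampling without replacement. It covers one sampling stage only, not the full multi-stage expression; the function name and the data are hypothetical.

```python
def ht_variance_estimate(y, pi, pi_pair):
    """Single-stage variance estimator for an IPW total.

    Computes sum_i sum_j (D_ij / pi_ij) * (y_i / pi_i) * (y_j / pi_j)
    over sampled units, with D_ij = pi_ij - pi_i * pi_j and pi_ii = pi_i.
    """
    n = len(y)
    total = 0.0
    for i in range(n):
        for j in range(n):
            p_ij = pi[i] if i == j else pi_pair(i, j)
            delta = p_ij - pi[i] * pi[j]
            total += (delta / p_ij) * (y[i] / pi[i]) * (y[j] / pi[j])
    return total


# Hypothetical example: simple random sample of n = 4 from N = 10 units.
N, n = 10, 4
y = [3.0, 5.0, 2.0, 6.0]
pi = [n / N] * n


def srs_pair(i, j):
    """Pairwise inclusion probability under simple random sampling."""
    return n * (n - 1) / (N * (N - 1))


v_hat = ht_variance_estimate(y, pi, srs_pair)

# Textbook formula for the same design: N^2 * (1 - n/N) * s^2 / n.
mean = sum(y) / n
s2 = sum((v - mean) ** 2 for v in y) / (n - 1)
v_textbook = N ** 2 * (1 - n / N) * s2 / n
print(v_hat, v_textbook)
```

Both quantities agree (50 up to floating-point rounding), which is a useful sanity check before plugging in the unequal probabilities of a real design.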

Estimation of the cost per dose

The cost per dose is defined as the total cost of implementation divided by the number of doses provided by the program. Mathematically, $C = T/D$, where $T$ is the total cost of the program and $D = \sum_{r}\sum_{m}\sum_{f} d_{rmf}$ is the total number of doses, with $d_{rmf}$ being the number of doses in facility $f$ in municipality $m$ in region $r$. The estimate of the cost per dose is

$$\hat{C} = \frac{\hat{T}}{D}, \tag{19}$$

where $\hat{T}$ is the estimated total cost obtained in (13). The variance of this estimator is given by

$$V(\hat{C}) = \frac{V(\hat{T})}{D^2}, \tag{20}$$

and an unbiased estimator of this quantity is given by

$$\hat{V}(\hat{C}) = \frac{\hat{V}(\hat{T})}{D^2}, \tag{21}$$

where $\hat{V}(\hat{T})$ is the estimator of the variance given in (14).
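The arithmetic behind the cost per dose can be sketched as follows, under the assumption that the total number of doses is known exactly (for instance from administrative records), so that it acts as a constant divisor. The function name and all figures are hypothetical.

```python
def cost_per_dose(total_cost_hat, total_cost_se, total_doses):
    """Return the estimated cost per dose and its standard error.

    Assumes total_doses is a known constant, so dividing the total-cost
    estimate by it rescales the standard error by the same factor.
    """
    if total_doses <= 0:
        raise ValueError("total_doses must be positive")
    return total_cost_hat / total_doses, total_cost_se / total_doses


# Hypothetical figures: costs in million R$, doses in millions.
c_hat, c_se = cost_per_dose(total_cost_hat=1200.0,
                            total_cost_se=40.0,
                            total_doses=300.0)
print(c_hat, c_se)
```

If the number of doses had to be estimated from the same sample, the estimator would become a ratio of two estimates and a different variance approximation would be needed.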

Post-design methods: Calibrated weights

The main purpose of post-design methods is to improve the estimates by reducing the variance, i.e. to produce more efficient estimators. Post-design methods include techniques such as post-stratification, estimation of weights and calibration of weights. The latter is known to yield more efficient estimates by adjusting the weights using information readily available at the first phase. In Brazil, for example, the total number of doses administered and the total number of facilities in the country are known. Hence these totals can be used to adjust (calibrate) the weights.

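Breslow et al. [13] proposed the use of calibration to adjust the sampling weights in case–cohort designs. Let $v_F$, $v_M$ and $v_R$ denote variables that are readily available at the first phase at the different levels: $v_F$ is available for all units at the final stage (facilities), $v_M$ for all second-stage units (municipalities) and $v_R$ for all first-stage units, e.g. regions. The main goal is to adjust the weights such that the estimates of the totals, $\hat{V}_F$, $\hat{V}_M$ and $\hat{V}_R$, equal the actual totals $V_F$, $V_M$ and $V_R$, while forcing the new weights ($w^*$) to be as close as possible to the design weights ($w$). For example, consider the IPW estimates $\hat{V}_F$, $\hat{V}_M$ and $\hat{V}_R$ based on the same weights used to estimate $T$, i.e.

$$\hat{V}_R = \sum_{r} R_r \, w_r^R \, v_r, \qquad \hat{V}_M = \sum_{r}\sum_{m} R_{rm} \, w_{rm}^M \, v_{rm}, \qquad \hat{V}_F = \sum_{r}\sum_{m}\sum_{f} R_{rmf} \, w_{rmf}^F \, v_{rmf},$$

where $R$ denotes the sample-inclusion indicator of a unit and $w$ its design weight.

Since we know $V_F$, $V_M$ and $V_R$, we can slightly modify the weights ($w$) and find ($w^*$) such that

$$\sum_{r} R_r \, w_r^{R*} \, v_r = V_R, \qquad \sum_{r}\sum_{m} R_{rm} \, w_{rm}^{M*} \, v_{rm} = V_M, \qquad \sum_{r}\sum_{m}\sum_{f} R_{rmf} \, w_{rmf}^{F*} \, v_{rmf} = V_F. \tag{22}$$

Intuitively, the modified weights ($w^*$) contain information about the entire population of facilities and we can use the new weights to find a new estimate of $T$: $\hat{T}^* = \hat{T}_F^* + \hat{T}_M^* + \hat{T}_R^* + T_C$, where $\hat{T}_F^*$, $\hat{T}_M^*$ and $\hat{T}_R^*$ represent the estimators of total cost at the facility, municipality and regional level, respectively. Mathematically, a distance function measuring the distance from the original weights ($w$) to the new weights ($w^*$) is minimized subject to the constraints (22). There are several options for choosing this distance. Substantial gains in accuracy depend on how correlated the calibration variables are with the quantity of interest [9].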
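One common choice of distance is the chi-square distance, which leads to a closed-form (linear) calibration as described in [9]. The sketch below is a minimal illustration of that particular choice with two calibration variables, a unit count and the number of doses; it is not necessarily the distance used in the study, and the function name, weights, doses, costs and known totals are all hypothetical.

```python
def linear_calibration(weights, aux, totals):
    """Calibrate weights to two known totals (chi-square distance).

    Minimises sum((w_new - w)**2 / w) subject to
    sum(w_new * x1) == totals[0] and sum(w_new * x2) == totals[1].
    The solution is w_new = w * (1 + lam1 * x1 + lam2 * x2).
    """
    a11 = sum(w * x1 * x1 for w, (x1, x2) in zip(weights, aux))
    a12 = sum(w * x1 * x2 for w, (x1, x2) in zip(weights, aux))
    a22 = sum(w * x2 * x2 for w, (x1, x2) in zip(weights, aux))
    b1 = totals[0] - sum(w * x1 for w, (x1, x2) in zip(weights, aux))
    b2 = totals[1] - sum(w * x2 for w, (x1, x2) in zip(weights, aux))
    det = a11 * a22 - a12 * a12
    if det == 0:
        raise ValueError("calibration variables are collinear")
    # Solve the 2x2 linear system for the multipliers (Cramer's rule).
    lam1 = (b1 * a22 - a12 * b2) / det
    lam2 = (a11 * b2 - a12 * b1) / det
    return [w * (1 + lam1 * x1 + lam2 * x2) for w, (x1, x2) in zip(weights, aux)]


# Hypothetical sample of four facilities: (constant 1, doses administered).
design_w = [10.0, 10.0, 10.0, 10.0]
aux = [(1.0, 500.0), (1.0, 1500.0), (1.0, 4000.0), (1.0, 9000.0)]
costs = [20.0, 45.0, 90.0, 160.0]
known_totals = (50.0, 200000.0)  # facilities and doses in the population

cal_w = linear_calibration(design_w, aux, known_totals)
ipw_cost = sum(w * y for w, y in zip(design_w, costs))
cal_cost = sum(w * y for w, y in zip(cal_w, costs))
print(cal_w, ipw_cost, cal_cost)
```

After calibration, the weighted counts and weighted doses reproduce the known population totals, and the cost estimate shifts accordingly. Linear calibration can in principle produce negative weights; other distance functions avoid this at the price of an iterative solution.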

Variance estimation

In order to estimate the variance in practice we use a Taylor expansion as in [18]. Define where , and y are given by (23) and (24)

Therefore (25) with . Expression (25) is found in a similar way as the variance estimate in section Variance estimation.

Results from Brazilian study

Table 2 displays the information used for calibration of the weights. At the facility level, weights (wF) were calibrated using all the totals presented in Table 2. This includes information on the facility size, the total number of doses in 2014 and the number of facilities in each region. Furthermore, at the municipality level, weights (wM) were calibrated using only the total number of municipalities. The number of expected municipalities in the sample was 55, but the actual sample contained 38 municipalities. This was due to sampling variation and to two missing municipalities.

Table 2. Information used for calibration of the weights.

Facility size is defined as follows: huge (No. doses > 10000); large (5000 < No. doses ≤ 10000); medium (1500 < No. doses ≤ 5000); small (500 < No. doses ≤ 1500); and tiny (No. doses ≤ 500).

https://doi.org/10.1371/journal.pone.0212401.t002
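The facility-size categories in the note to Table 2 can be written as a small function, which is convenient when building the size totals used for calibration. The function name and the lowercase labels are illustrative choices; the thresholds are those stated in the note.

```python
def facility_size(n_doses):
    """Classify a facility by its number of doses (thresholds from Table 2)."""
    if n_doses < 0:
        raise ValueError("number of doses cannot be negative")
    if n_doses > 10000:
        return "huge"
    if n_doses > 5000:
        return "large"
    if n_doses > 1500:
        return "medium"
    if n_doses > 500:
        return "small"
    return "tiny"


# Values on a boundary fall in the smaller category, as the note specifies.
for doses in (500, 501, 1500, 5000, 10000, 10001):
    print(doses, facility_size(doses))
```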

Tables 3 and 4 display the estimated total cost for each of the different categories of cost at the municipal level and for the entire program, respectively. Table 3 presents the estimates of costs incurred by municipalities, based on the sample of 38 municipalities. The table shows significant differences between the estimates using the unadjusted sampling weights and the estimates using the calibrated weights. The reason for this is the small sample of municipalities available. With this in mind, calibrating the sampling weights yields more representative and informative estimators. In addition, because the calibrated estimates are larger in magnitude, their standard errors (SE) are also larger.

Table 4 displays the estimates of the total cost of the entire program for each of the different categories. As observed, the estimates differ in magnitude for all the categories and the calibrated estimates are always larger. An additional feature of the calibrated estimates is that their standard errors are lower than those obtained with the unadjusted sampling weights. This leads to the conclusion that the calibrated estimates are more precise and efficient and therefore more reliable.

Results

Table 3. Estimated municipality level costs, by cost category, considering unadjusted and calibrated weights (in million R$) for the Costing Study of the Brazilian Immunization Program.

Brazil, 2013. Standard errors are displayed in parentheses. Municipal level weights were calibrated to the national number of municipalities.

https://doi.org/10.1371/journal.pone.0212401.t003

Table 4. Estimated total costs, by cost category, considering unadjusted and calibrated weights (in million R$) for the Costing Study of the Brazilian Immunization Program.

Brazil, 2013. Standard errors are displayed in parentheses. This includes facility level cost, municipality level cost and state level cost. Municipal level weights were calibrated to the national number of municipalities, and facility level weights were calibrated using information presented in Table 2.

https://doi.org/10.1371/journal.pone.0212401.t004

Discussion

We have proposed inverse probability weighting (IPW) with calibration methods to improve estimators in studies where information is collected at several levels, e.g. states, municipalities and clinics. We find an expression for the estimator of the total that accounts for the totals at each level and also provides a feasible way to find the variance. Further, we use information available for all units in the targeted population to adjust the weights with calibration methods. The approach allows for calibration of the weights at all levels of sampling if desired by investigators. These methods can be used to correct for missing information at the various levels. This is achieved by adjusting the estimates at each level to comply with the characteristics of the country. The methods were applied to the Brazilian National Immunization Program. The results showed that calibrating the weights not only corrects for missingness, but also results in large gains in efficiency and precision compared to standard methods. The estimators are difficult to compare in terms of efficiency due to the difference in magnitude. To provide an interpretation, we considered that the estimates obtained by calibration were consistent, which is supported by asymptotic theory [9, 10]. We compare the estimated MSEs as follows. For the total cost with standard weights, the MSEs are: 278 (Capital), 1138 (Recurrent), 1415 (Total). With calibrated weights these are 61 (Capital), 291 (Recurrent), 331 (Total). Calibration clearly returns the smallest MSEs if the estimator is consistent (or unbiased). A point of interest for future research is to investigate how different the results would have been if the sampling design had been different, for example, stratified by facility size and/or type. This can enhance estimators because facility size/type is related to the cost. In contrast, stratification by something not related to the cost would not yield gains in efficiency. We hope that investigators become more aware of the existence and implementation of these methods. This can help analysts and policy-makers to base their decisions on more reliable information.
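The MSE comparison can be made concrete with a short calculation on the figures reported above. The `mse` helper reflects one plausible reading of the comparison, namely that the calibrated estimate is treated as the reference so that the bias of the standard estimate is its difference from the calibrated one; this reading is an assumption, and the helper is shown only to make the variance-plus-squared-bias decomposition explicit.

```python
def mse(standard_error, bias=0.0):
    """Mean squared error as variance plus squared bias."""
    return standard_error ** 2 + bias ** 2


# MSEs reported in the text: (standard weights, calibrated weights).
reported_mse = {
    "Capital": (278, 61),
    "Recurrent": (1138, 291),
    "Total": (1415, 331),
}

# Ratio of standard to calibrated MSE; values above 1 favour calibration.
relative_efficiency = {
    category: standard / calibrated
    for category, (standard, calibrated) in reported_mse.items()
}
for category, ratio in relative_efficiency.items():
    print(f"{category}: {ratio:.2f}")
```

On these figures the standard weights give an MSE roughly four times that of the calibrated weights in every cost category.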

Supporting information

S1 Appendix. Estimation and variance of the total cost and cost per dose from the Brazilian immunization study.

https://doi.org/10.1371/journal.pone.0212401.s001

(PDF)

S2 Appendix. Estimation and variance of the total cost and cost per dose using design and calibrated weights.

https://doi.org/10.1371/journal.pone.0212401.s002

(PDF)

S1 Dataset. The dataset contains the data for the estimation of the total cost from the Brazilian study.

https://doi.org/10.1371/journal.pone.0212401.s003

(TXT)

References

  1. Brenzel L, Schütte C, Goguadze K, Valdez W, Le Gargasson JB, Guthrie T. EPIC Studies: Governments Finance, On Average, More Than 50 Percent Of Immunization Expenses, 2010-11. Health Affairs. 2016;35(2):259–265. pmid:26858378
  2. Brenzel L, Young D, Walker DG. Costs and financing of routine immunization: Approach and selected findings of a multi-country study (EPIC). Vaccine. 2015;33 Suppl 1:A13–20. pmid:25919153
  3. Janusz CB, Castañeda-Orjuela C, Molina Aguilera IB, Felix Garcia AG, Mendoza L, Díaz IY, et al. Examining the cost of delivering routine immunization in Honduras. Vaccine. 2015;33 Suppl 1:A53–9. pmid:25919175
  4. Robson M, Andrus JK, Toscano CM, Lewis M, Oliveiria L, Ropero AM, et al. A Model for Enhancing Evidence-Based Capacity to Make Informed Policy Decisions on the Introduction of New Vaccines in the Americas: PAHO's ProVac Initiative. Public Health Reports. 2007;122(6):811–816.
  5. Le Gargasson JB, Nyonator FK, Adibo M, Gessner BD, Colombini A. Costs of routine immunization and the introduction of new and underutilized vaccines in Ghana. Vaccine. 2015;33 Suppl 1:A40–6. pmid:25919173
  6. Fundo Nacional de Saude F. Study Report: Brasil. Estimativa de custo do programa nacional de imunizacao. Relatorio Final. Ministerio da Saude, Programa Nacional de Imunizacao; 2018.
  7. Schütte C, Chansa C, Marinda E, Guthrie TA, Banda S, Nombewu Z, et al. Cost analysis of routine immunisation in Zambia. Vaccine. 2015;33 Suppl 1:A47–52. pmid:25919174
  8. Goguadze K, Chikovani I, Gaberi C, Maceira D, Uchaneishvili M, Chkhaidze N, et al. Costs of routine immunization services in Moldova: Findings of a facility-based costing study. Vaccine. 2015;33 Suppl 1:A60–5. pmid:25919177
  9. Deville JC, Särndal CE. Calibration estimators in survey sampling. Journal of the American Statistical Association. 1992;87:376–382.
  10. Fuller W. Sampling Statistics. John Wiley and Sons; 2009.
  11. Lumley T, Shaw PA, Dai JY, Tsiatis AA, Davidian M, Handcock MS, et al. Connections between Survey Calibration Estimators and Semiparametric Models for Incomplete Data [with Discussions]. International Statistical Review / Revue Internationale de Statistique. 2011;79(2):200–232. pmid:23833390
  12. Breslow N, Amorim G, Pettinger M, Rossouw J. Using the Whole Cohort in the Analysis of Case-Control Data: Application to the Women's Health Initiative. Stat Biosci. 2014;5(2).
  13. Breslow NE, Lumley TS, Ballantyne CM, Chambless LE, Kulich M. Using the whole cohort in the analysis of case-cohort data. American Journal of Epidemiology. 2009;169(11):1398–405. pmid:19357328
  14. Breslow N, Lumley T, Ballantyne C, Chambless L, Kulich M. Improved Horvitz-Thompson Estimation of Model Parameters from Two-phase Stratified Samples: Applications in Epidemiology. Statistics in Biosciences. 2009;1(1):32–49. pmid:20174455
  15. Haneuse S, Hedt-Gauthier B, Chimbwandira F, Makombe S, Tenthani L, Jahn A. Strategies for monitoring and evaluation of resource-limited national antiretroviral therapy programs: the two-phase design. BMC Medical Research Methodology. 2015;15(31). pmid:25886976
  16. Kim JK, Kwon Y, Paik MC. Calibrated propensity score method for survey nonresponse in cluster sampling. Biometrika. 2016;103(2):461. pmid:27279670
  17. Walker DG, Hutubessy R, Beutels P. WHO Guide for standardisation of economic evaluations of immunization programmes. Vaccine. 2010;28(11):2356–2359. pmid:19567247
  18. Särndal CE, Swensson B, Wretman J. Model Assisted Survey Sampling. New York; 1992.