Regression-adjusted matching and double-robust methods for estimating average treatment effects in health economic evaluation

Health Services and Outcomes Research Methodology

Abstract

Regression, propensity score (PS) and double-robust (DR) methods can reduce selection bias when estimating average treatment effects (ATEs). Economic evaluations of health care interventions exemplify complex data structures, in that the covariate–endpoint relationships tend to be highly non-linear, with highly skewed cost and health outcome endpoints. When either the regression or PS model is correct, DR methods can provide unbiased, efficient estimates of ATEs, but the correct specification of both models is generally unknown. Regression-adjusted matching can also protect against bias from model misspecification, but has not been compared to DR methods. This paper compares regression-adjusted matching to selected DR methods (weighted regression and augmented inverse probability of treatment weighting) as well as to regression and PS methods for addressing selection bias in cost-effectiveness analyses (CEA). We contrast the methods in a CEA of a pharmaceutical intervention, where there are extreme estimated PSs, hence unstable inverse probability of treatment (IPT) weights. The case study motivates a simulation study that considers settings with functional form misspecification in the PS and endpoint regression models (e.g. a cost model with a log instead of an identity link), under both stable and unstable PS weights. We find that in the realistic setting of unstable IPT weights and misspecification of the PS and regression models, regression-adjusted matching reports less bias than DR methods. We conclude that regression-adjusted matching is a relatively robust method for estimating ATEs in applications with complex data structures exemplified by CEA.


Notes

  1. Here a calliper is defined as the pre-specified amount by which propensity scores of matched pairs are allowed to differ.

  2. Further possible ways of balancing with the PS include stratification (blocking) by the quintiles of the PS, and adding the PS as a covariate (Rosenbaum and Rubin 1983). They have been demonstrated to be dominated by IPTW and matching (Lunceford and Davidian 2004).

  3. The cross-validation used a twofold split sample and measured goodness of fit by the mean squared prediction error, averaged over 100 iterations.

  4. Standardised differences are weighted using matching frequency weights and IPT weights.

  5. The copula function can generate draws from a flexible multivariate distribution (in this case bivariate) with different marginal distributions (here, the gamma and the normal).

  6. This resulted in a correlation of 0.34 between the cost and QALY variables, which reflects the correlation (0.22) found in the case study.

  7. The choice of the normal distribution for Y_E and the identity link function for Y_C was made for transparency reasons and to facilitate replication.

  8. The proportion of individuals in the treatment group was typically around 50% (scenarios 1 and 2) and 60% (scenarios 3 and 4), compared with 46% in the case study.
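
As an illustration of Note 4, the following is a minimal Python sketch of a weighted standardised difference for a single covariate. It is not the authors' code (their implementation is in R and Stata). The function names are hypothetical, and the weighted variance uses a simple sum-of-weights denominator, which is an assumption; other small-sample corrections exist. The weights can be matching frequency weights or IPT weights.

```python
import math


def weighted_mean(x, w):
    """Weighted mean of x with non-negative weights w."""
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)


def weighted_var(x, w):
    """Weighted variance with a sum-of-weights denominator (an assumption)."""
    m = weighted_mean(x, w)
    return sum(wi * (xi - m) ** 2 for wi, xi in zip(w, x)) / sum(w)


def std_diff(x_treat, w_treat, x_ctrl, w_ctrl):
    """Weighted standardised difference of one covariate between two groups.

    Returned as a fraction of the pooled standard deviation (not in percent).
    Raises ZeroDivisionError if the covariate is constant in both groups.
    """
    diff = weighted_mean(x_treat, w_treat) - weighted_mean(x_ctrl, w_ctrl)
    pooled_sd = math.sqrt(
        (weighted_var(x_treat, w_treat) + weighted_var(x_ctrl, w_ctrl)) / 2.0
    )
    return diff / pooled_sd


# Unit weights reproduce the unweighted standardised difference.
print(std_diff([2.0, 4.0], [1, 1], [0.0, 2.0], [1, 1]))
# Up-weighting the second control moves the control mean towards the treated mean.
print(std_diff([2.0, 4.0], [1, 1], [0.0, 2.0], [1, 3]))
```

Passing the weights explicitly keeps the same function usable for the matched sample and for the IPT-weighted sample.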

References

  • Abadie, A., Drukker, D., Herr, J.L., Imbens, G.: Implementing matching estimators for average treatment effects in Stata. Stata J. 4(3), 290–311 (2004a)

  • Abadie, A., Herr, J.L., Imbens, G.W., Drukker, D.M.: NNMATCH: Stata module to compute nearest-neighbor bias-corrected estimators. http://fmwww.bc.edu/repec/bocode/n/nnmatch.hlp (2004b). Accessed 15 June 2012

  • Abadie, A., Imbens, G.W.: Large sample properties of matching estimators for average treatment effects. Econometrica 74(1), 235–267 (2006)

  • Abadie, A., Imbens, G.W.: Bias-corrected matching estimators for average treatment effects. J. Bus. Econ. Stat. 29(1), 1–11 (2011)

  • Austin, P.C.: A critical appraisal of propensity-score matching in the medical literature between 1996 and 2003. Stat. Med. 27(12), 2037–2049 (2008)

  • Austin, P.C.: Balance diagnostics for comparing the distribution of baseline covariates between treatment groups in propensity-score matched samples. Stat. Med. 28, 3083–3107 (2009)

  • Austin, P.C.: Using ensemble-based methods for directly estimating causal effects: an investigation of tree-based G-computation. Multivariate Behav. Res. 47(1), 115–135 (2012)

  • Bang, H., Robins, J.M.: Doubly robust estimation in missing data and causal inference models. Biometrics 61, 962–972 (2005)

  • Barber, J., Thompson, S.G.: Multiple regression of cost data: use of generalised linear models. J. Health Serv. Res. Policy 9(4), 197–204 (2004)

  • Basu, A.: Economics of individualization in comparative effectiveness research and a basis for a patient-centered health care. J. Health Econ. 30(3), 549–559 (2011)

  • Basu, A., Manca, A.: Regression estimators for generic health-related quality of life and quality-adjusted life years. Med. Decis. Making 32(1), 56–69 (2011)

  • Basu, A., Manning, W.G.: Issues for the next generation of health care cost analyses. Med. Care 47(7_Supplement_1), S109–S114 (2009)

  • Basu, A., Polsky, D., Manning, W.: Estimating treatment effects on healthcare costs under exogeneity: is there a ‘magic bullet’? Health Serv. Outcomes Res. Methodol. 11(1), 1–26 (2011). doi:10.1007/s10742-011-0072-8

  • Basu, A., Rathouz, P.J.: Estimating marginal and incremental effects on health outcomes using flexible link and variance function models. Biostatistics 6(1), 93–109 (2005)

  • Buntin, M.B., Zaslavsky, A.M.: Too much ado about two-part models and transformation? Comparing methods of modeling Medicare expenditures. J. Health Econ. 23(3), 525–542 (2004)

  • Busso, M., DiNardo, J., McCrary, J.: New evidence on the finite sample properties of propensity score reweighting and matching estimators. In: Working paper, vol. 3998, 2011

  • Caliendo, M., Kopeinig, S.: Some practical guidance for the implementation of propensity score matching. J. Econ. Surv. 22(1), 31–72 (2008). doi:10.1111/j.1467-6419.2007.00527.x

  • Crump, R.K., Hotz, V.J., Imbens, G.W., Mitnik, O.A.: Dealing with limited overlap in estimation of average treatment effects. Biometrika 96(1), 187–199 (2009)

  • Davison, A., Hinkley, D.: Bootstrap Methods and Their Application. Cambridge University Press, New York (1997)

  • Dehejia, R.H., Wahba, S.: Propensity score-matching methods for nonexperimental causal studies. Rev. Econ. Stat. 84(1), 151–161 (2002)

  • Diamond, A., Sekhon, J.S.: Genetic matching for estimating causal effects: a general multivariate matching method for achieving balance in observational studies. Rev. Econ. Stat. 95(3), 932–945 (2013)

  • Fenwick, E., O’Brien, B., Briggs, A.: Cost-effectiveness acceptability curves—facts, fallacies and frequently asked questions. Health Econ. 13(5), 405–415 (2004)

  • Freedman, D., Berk, R.A.: Weighting regression by propensity score. Eval. Rev. 32(4), 392–409 (2008)

  • Fung, V., Brand, R.J., Newhouse, J.P., Hsu, J.: Using medicare data for comparative effectiveness research: opportunities and challenges. Am. J. Manag. Care 17(7), 489–496 (2011)

  • Funk, M.J., Westreich, D., Wiesen, C., Stürmer, T., Brookhart, M.A., Davidian, M.: Doubly robust estimation of causal effects. Am. J. Epidemiol. 173(7), 761–767 (2011). doi:10.1093/aje/kwq439

  • Glick, H., Doshi, J., Sonnad, S., Polsky, D.: Economic Evaluation in Clinical Trials. Oxford University Press, Oxford (2007)

  • Glynn, A.N., Quinn, K.M.: An introduction to the augmented inverse propensity weighted estimator. Political Anal. 18, 36–56 (2010)

  • Golinelli, D., Ridgeway, G., Rhoades, H., Tucker, J., Wenzel, S.: Bias and variance trade-offs when combining propensity score weighting and regression: with an application to HIV status and homeless men. Health Serv. Outcomes Res. Methodol. 12(2–3), 104–118 (2012)

  • Grieve, R., Sekhon, J.S., Hu, T.-W., Bloom, J.: Evaluating health care programs by combining cost with quality of life measures: a case study comparing capitation and fee for service. Health Serv. Res. 43(4), 1204–1222 (2008)

  • Gruber, S., van der Laan, M.J.: An application of collaborative targeted maximum likelihood estimation in causal inference and genomics. Int. J. Biostat. 6(1), Article 18 (2010). doi:10.2202/1557-4679.1182

  • Hill, J., Reiter, J.P.: Interval estimation for treatment effects using propensity score matching. Stat. Med. 25(13), 2230–2256 (2006)

  • Hirano, K., Imbens, G.W.: Estimation of causal effects using propensity score weighting: an application to data on right heart catheterization. Health Serv. Outcomes Res. Methodol. 2(3), 259–278 (2001)

  • Hirano, K., Imbens, G.W., Ridder, G.: Efficient estimation of average treatment effects using the estimated propensity score. Econometrica 71(4), 1161–1189 (2003)

  • Ho, D.E., Imai, K., King, G., Stuart, E.A.: Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference. Political Anal. 15(3), 199–236 (2007)

  • Imbens, G.W., Wooldridge, J.M.: Recent developments in the econometrics of program evaluation. J. Econ. Lit. 47(1), 5–86 (2009)

  • Jackson, C., Bojke, L., Thompson, S., Claxton, K., Sharples, L.: A framework for addressing structural uncertainty in decision models. Med. Decis. Making 31, 662–674 (2011)

  • Jones, A., Lomas, J., Rice, N.: Applying beta-type size distributions to healthcare cost regressions. In: HEDG working papers, vol. WP 11/31. HEDG, c/o Department of Economics, University of York, 2011

  • Jones, A.M.: Models for health care. In: HEDG working papers. HEDG, c/o Department of Economics, University of York, 2010

  • Kang, J.D.Y., Schafer, J.L.: Demystifying double robustness: a comparison of alternative strategies for estimating a population mean from incomplete data. Stat. Sci. 22(4), 523–539 (2007)

  • Kreif, N., Grieve, R., Radice, R., Sadique, Z., Ramsahai, R., Sekhon, J.S.: Methods for estimating subgroup effects in cost-effectiveness analyses that use observational data. Med. Decis. Making 32(6), 750–763 (2012a). doi:10.1177/0272989x12448929

  • Kreif, N., Grieve, R., Sadique, Z.: Statistical methods for cost-effectiveness analyses that use observational data: a critical appraisal tool and review of current practice. Health Econ. 22(4), 486–500 (2012b). doi:10.1002/hec.2806

  • Lee, B.K., Lessler, J., Stuart, E.A.: Improving propensity score weighting using machine learning. Stat. Med. 29(3), 337–346 (2010)

  • Lunceford, J.K., Davidian, M.: Stratification and weighting via the propensity score in estimation of causal treatment effects: a comparative study. Stat. Med. 23(19), 2937–2960 (2004)

  • Manca, A., Austin, P.C.: Using propensity score methods to analyse individual patient-level cost-effectiveness data from observational studies. http://www.york.ac.uk/res/herc/documents/wp/08_20.pdf (2008). Accessed 15 June 2012

  • Manning, W.G., Basu, A., Mullahy, J.: Generalized modeling approaches to risk adjustment of skewed outcomes data. J. Health Econ. 24(3), 465–488 (2005). doi:10.1016/j.jhealeco.2004.09.011

  • Mihaylova, B., Briggs, A., O’Hagan, A., Thompson, S.: Review of statistical methods for analysing healthcare resources and costs. Health Econ. (2010). doi:10.1002/hec.1653

  • NICE: Guide to the methods of technology appraisal 2013. http://www.nice.org.uk/media/D45/1E/GuideToMethodsTechnologyAppraisal2013.pdf (2013). Accessed 10 July 2013

  • Nixon, R., Wonderling, D., Grieve, R.: Non-parametric methods for cost-effectiveness analysis: the central limit theorem and the bootstrap compared. Health Econ. 19(3), 316–333 (2010)

  • Nixon, R.M., Thompson, S.G.: Methods for incorporating covariate adjustment, subgroup analysis and between-centre differences into cost-effectiveness evaluations. Health Econ. 14(12), 1217–1229 (2005)

  • Pearl, J.: Causal diagrams for empirical research. Biometrika 82(4), 669–688 (1995)

  • Petersen, M.L., Porter, K., Gruber, S., Wang, Y., Laan, M.J.V.D.: Diagnosing and responding to violations in the positivity assumption. Stat. Methods Med. Res. 21(1), 31–54 (2012)

  • Porter, K.E., Gruber, S., Laan, M.J.V.D., Sekhon, J.S.: The relative performance of targeted maximum likelihood estimators. Int. J. Biostat. (2011). doi:10.2202/1557-4679

  • Quinn, C.: The health-economic applications of copulas: methods in applied econometric research. http://ideas.repec.org/p/yor/hectdg/07-22.html (2007). Accessed 10 Aug 2011

  • R Development Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna (2011)

  • Radice, R., Grieve, R., Ramsahai, R., Kreif, N., Sadique, Z., Sekhon, J.S.: Evaluating treatment effectiveness in patient subgroups: a comparison of propensity score methods with an automated matching approach. Int. J. Biostat. 8(1), 25 (2012)

  • Robins, J., Rotnitzky, A., Zhao, L.P.: Estimation of regression coefficients when some regressors are not always observed. J. Am. Stat. Assoc. 89, 846–866 (1994)

  • Robins, J., Sued, M., Lei-Gomez, Q., Rotnitzky, A.: Comment: performance of double-robust estimators when “inverse probability” weights are highly variable. Stat. Sci. 22(4), 544–559 (2007)

  • Robins, J.M., Rotnitzky, A., Zhao, L.P.: Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. J. Am. Stat. Assoc. 90(429), 106–121 (1995)

  • Rosenbaum, P.R., Rubin, D.B.: The central role of the propensity score in observational studies for causal effects. Biometrika 70(1), 41–55 (1983). doi:10.1093/biomet/70.1.41

  • Rowan, K., Welch, C., North, E., Harrison, D.: Drotrecogin alfa (activated): real-life use and outcomes for the UK. Crit. Care 12(2), R58 (2008)

  • Rubin, D.B.: The use of matched sampling and regression adjustment to remove bias in observational studies. Biometrics 29, 185–203 (1973)

  • Rubin, D.B.: The design versus the analysis of observational studies for causal effects: parallels with the design of randomized trials. Stat. Med. 26(1), 20–36 (2007)

  • Rubin, D.B.: On the limitations of comparative effectiveness research. Stat. Med. 29, 1991–1995 (2010)

  • Rubin, D.B., Thomas, N.: Combining propensity score matching with additional adjustments for prognostic covariates. J. Am. Stat. Assoc. 95, 573–585 (2000)

  • Sadique, M.Z., Grieve, R., Harrison, D., Cuthbertson, B., Rowan, K.: Is Drotrecogin alfa (activated) for adults with severe sepsis, cost-effective in routine clinical practice? Crit. Care 15(5), R228 (2011)

  • Sekhon, J.S.: Matching: multivariate and propensity score matching with automated balance search. J. Stat. Softw. 42(7), 1–52 (2011)

  • Sekhon, J.S., Grieve, R.D.: A matching method for improving covariate balance in cost-effectiveness analyses. Health Econ. 21(6), 695–714 (2011). doi:10.1002/hec.1748

  • StataCorp: Stata Statistical Software: Release 12. StataCorp LP, College Station (2011)

  • Stuart, E.A.: Matching methods for causal inference: a review and a look forward. Stat. Sci. 25(1), 1–21 (2010)

  • Trivedi, P.K., Zimmer, D.M.: Copula Modeling: An Introduction to Practitioners, vol. 1. Foundations and Trends in Econometrics. Now Publishing Inc., Delft (2005)

  • Tunis, S.R., Benner, J., McClellan, M.: Comparative effectiveness research: policy context, methods development and research infrastructure. Stat. Med. 29(19), 1963–1976 (2010). doi:10.1002/sim.3818

  • van der Laan, M.J.: Targeted maximum likelihood based causal inference: part I. Int. J. Biostat. (2010). doi:10.2202/1557-4679.1211

  • van der Laan, M.J., Gruber, S.: Collaborative double robust targeted maximum likelihood estimation. Int. J. Biostat. (2010). doi:10.2202/1557-4679.1181

  • van der Laan, M.J., Polley, E.C., Hubbard, A.E.: Super learner. Stat. Appl. Genet. Mol. Biol. (2007). doi:10.2202/1544-6115.1309

  • Westreich, D., Cole, S.R.: Invited commentary: positivity in practice. Am. J. Epidemiol. 171(6), 674–677 (2010). doi:10.1093/aje/kwp436

  • Westreich, D., Lessler, J., Funk, M.: Propensity score estimation: neural networks, support vector machines, decision trees (CART), and meta-classifiers as alternatives to logistic regression. J. Clin. Epidemiol. 63(8), 826–833 (2010)

  • Willan, A.R., Briggs, A.H., Hoch, J.S.: Regression methods for covariate adjustment and subgroup analysis for non-censored cost-effectiveness data. Health Econ. 13(5), 461–475 (2004)


Acknowledgments

We thank Zia Sadique (LSHTM) for help in the motivating case study, Roland Ramsahai (University of Cambridge) for valuable comments on the Monte Carlo simulations, Manuel Gomes, Karla Diaz-Ordaz, Adam Steventon, Rhian Daniel (all LSHTM) and Susan Gruber (Harvard School of Public Health) for comments on the manuscript. We also thank David Harrison and Kathy Rowan (ICNARC) for access to the data used in the case study. Funding from the Economic and Social Research Council (Grant no. RES-061-25-0343) is greatly appreciated.

Author information

Corresponding author

Correspondence to Noémi Kreif.

Appendices

Appendix 1

See Tables 6 and 7.

Table 6 Monte Carlo simulation results: relative bias and RMSE of the estimated incremental cost
Table 7 Monte Carlo simulation results: relative bias and RMSE of the estimated incremental QALYs

Appendix 2: Code for implementing the methods

This section provides code for implementing the combined statistical approaches proposed in the paper, using the R (R Development Core Team 2011) and Stata (StataCorp 2011) statistical software. The user-written functions call existing R routines, for example “glm” for generalised linear models, or the “Matching” library (Sekhon 2011). When implementing the methods in Stata, we use the NNMATCH routine (Abadie et al. 2004b) for matching.
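
The authors' R and Stata functions are not reproduced on this page. As a rough, independent illustration of one of the DR estimators named in the abstract, augmented inverse probability of treatment weighting (AIPTW), here is a minimal Python sketch. It is not the paper's code. The function name `aiptw_ate` and its arguments are hypothetical, and it assumes the propensity scores and the two sets of regression predictions have already been estimated elsewhere (for example with a logistic model and a GLM).

```python
def aiptw_ate(y, t, ps, mu1, mu0):
    """AIPTW estimate of the average treatment effect.

    y   : observed endpoint (e.g. cost or QALY) per individual
    t   : treatment indicator (1 = treated, 0 = control)
    ps  : estimated propensity score P(t = 1 | covariates)
    mu1 : predicted endpoint under treatment from the regression model
    mu0 : predicted endpoint under control from the regression model
    """
    total = 0.0
    for yi, ti, pi, m1, m0 in zip(y, t, ps, mu1, mu0):
        if not 0.0 < pi < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        # Regression prediction plus an IPT-weighted residual correction.
        total += (m1 - m0) + ti * (yi - m1) / pi - (1 - ti) * (yi - m0) / (1.0 - pi)
    return total / len(y)


# Toy inputs: the regression predictions match the observed endpoints exactly,
# so the weighted residual terms vanish and the estimate is the mean of mu1 - mu0.
y = [3.0, 1.0, 4.0, 2.0]
t = [1, 0, 1, 0]
ps = [0.9, 0.2, 0.5, 0.4]
print(aiptw_ate(y, t, ps, mu1=[3.0, 2.0, 4.0, 3.0], mu0=[2.0, 1.0, 3.0, 2.0]))
```

The residual terms are divided by `ps` or `1 - ps`, which is why estimated PSs close to 0 or 1 lead to the unstable IPT weights discussed in the abstract.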

Appendix 3: R code for generating data in the simulations
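
The authors' R code is not included on this page. As an illustration of the idea in Note 5 only, the following Python sketch draws correlated cost and QALY pairs from a bivariate distribution with a gamma marginal for cost and a normal marginal for QALYs. Everything specific here is an assumption: the use of a Gaussian copula, the integer gamma shape (chosen so the CDF has a closed form in the standard library), and all parameter values. It does not reproduce the paper's data-generating process.

```python
import math
import random
from statistics import NormalDist


def erlang_cdf(x, shape, scale):
    """CDF of a gamma distribution with integer shape (Erlang)."""
    z = x / scale
    tail = sum(z ** n / math.factorial(n) for n in range(shape))
    return 1.0 - math.exp(-z) * tail


def erlang_ppf(u, shape, scale):
    """Inverse CDF by bisection; adequate for a sketch."""
    lo, hi = 0.0, scale
    while erlang_cdf(hi, shape, scale) < u:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if erlang_cdf(mid, shape, scale) < u:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def draw_cost_qaly(n, rho, seed=0):
    """Draw n (cost, QALY) pairs linked by a Gaussian copula with parameter rho."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        # Correlated standard normal pair.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Map z1 to a uniform, then through the gamma inverse CDF.
        u = min(max(std_normal.cdf(z1), 1e-12), 1.0 - 1e-12)
        cost = erlang_ppf(u, shape=2, scale=1000.0)  # hypothetical gamma marginal
        # For a normal marginal the inverse CDF of Phi(z2) is mean + sd * z2.
        qaly = 0.7 + 0.2 * z2  # hypothetical normal marginal
        pairs.append((cost, qaly))
    return pairs


sample = draw_cost_qaly(5, rho=0.5, seed=1)
print(sample)
```

Note that `rho` is the correlation of the underlying normal pair, not the correlation between cost and QALY on their own scales; the latter is generally smaller in magnitude after the non-linear gamma transform.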

About this article

Cite this article

Kreif, N., Grieve, R., Radice, R. et al. Regression-adjusted matching and double-robust methods for estimating average treatment effects in health economic evaluation. Health Serv Outcomes Res Method 13, 174–202 (2013). https://doi.org/10.1007/s10742-013-0109-2

  • DOI: https://doi.org/10.1007/s10742-013-0109-2
