Abstract

Respondent-driven sampling is a commonly used method for sampling from hard-to-reach human populations connected by an underlying social network of relations. Beginning with a convenience sample, participants pass coupons to invite their contacts to join the sample. Although the method is often effective at attaining large and varied samples, its reliance on convenience samples, social network contacts, and participant decisions makes it subject to a large number of statistical concerns. This article reviews inferential methods available for data collected by respondent-driven sampling.
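The recruitment mechanism described above can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not a method taken from the article: the function name `simulate_rds`, the parameters `coupons` and `target_size`, the ring-shaped example network, and the rule that each participant passes coupons to uniformly chosen, not-yet-sampled contacts are all hypothetical simplifications.

```python
import random
from collections import deque


def simulate_rds(network, seeds, coupons=3, target_size=50, rng=None):
    """Simulate coupon-based recruitment on an undirected network.

    network: dict mapping each person (int) to a set of contacts.
    seeds: the initial convenience sample.
    Returns a list of (participant, recruiter) pairs in enrollment order;
    seeds have recruiter None.
    """
    rng = rng or random.Random()
    sampled = set()
    waiting = deque()  # participants who have not yet passed their coupons
    sample = []

    # Enroll the seeds first.
    for seed in seeds:
        if seed not in sampled and len(sample) < target_size:
            sampled.add(seed)
            sample.append((seed, None))
            waiting.append(seed)

    # Each enrolled participant passes up to `coupons` coupons to contacts.
    while waiting and len(sample) < target_size:
        recruiter = waiting.popleft()
        # Assumption: nobody is sampled twice (sampling without replacement).
        eligible = [c for c in sorted(network[recruiter]) if c not in sampled]
        rng.shuffle(eligible)  # assumption: contacts are chosen uniformly at random
        for contact in eligible[:coupons]:
            if len(sample) >= target_size:
                break
            sampled.add(contact)
            sample.append((contact, recruiter))
            waiting.append(contact)
    return sample


# Hypothetical example: 20 people arranged in a ring, one seed, two coupons each.
ring = {i: {(i - 1) % 20, (i + 1) % 20} for i in range(20)}
result = simulate_rds(ring, seeds=[0], coupons=2, target_size=10,
                      rng=random.Random(1))
```

Even in this simplified setting, who ends up in the sample depends on the choice of seeds, the network structure, and the recruiters' decisions, which is the source of the statistical concerns the abstract mentions.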

/content/journals/10.1146/annurev-statistics-031017-100704
2018-03-07
2024-04-16
  • Article Type: Review Article