research-article

A Bayesian Approach for Quantifying Data Scarcity when Modeling Human Behavior via Inverse Reinforcement Learning

Published: 07 March 2023

Abstract

Computational models that formalize complex human behaviors enable study and understanding of such behaviors. However, collecting behavior data required to estimate the parameters of such models is often tedious and resource intensive. Thus, estimating dataset size as part of data collection planning (also known as Sample Size Determination) is important to reduce the time and effort of behavior data collection while maintaining an accurate estimate of model parameters. In this article, we present a sample size determination method based on Uncertainty Quantification (UQ) for a specific Inverse Reinforcement Learning (IRL) model of human behavior, in two cases: (1) pre-hoc experiment design—conducted in the planning stage before any data is collected, to guide the estimation of how many samples to collect; and (2) post-hoc dataset analysis—performed after data is collected, to decide if the existing dataset has sufficient samples and whether more data is needed. We validate our approach in experiments with a realistic model of behaviors of people with Multiple Sclerosis (MS) and illustrate how to pick a reasonable sample size target. Our work enables model designers to perform a deeper, principled investigation of the effects of dataset size on IRL model parameters.
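The two cases in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' IRL model: as a stand-in, it assumes a single behavior parameter (the probability that a person picks one of two actions) with a uniform Beta(1, 1) prior, and it uses the posterior standard deviation as the uncertainty measure. All function names, the simulation count, and the target threshold are hypothetical choices made for illustration only.

```python
import math
import random


def beta_posterior_std(successes, failures, prior_a=1.0, prior_b=1.0):
    """Standard deviation of the Beta posterior after Bernoulli observations."""
    a = prior_a + successes
    b = prior_b + failures
    return math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))


def expected_posterior_std(n, num_sims=2000, seed=0):
    """Pre-hoc case: average posterior std over datasets of size n.

    Each simulated dataset is generated from a parameter drawn from the
    uniform prior, so no real data is needed (assumption: Beta(1, 1) prior).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_sims):
        theta = rng.random()  # draw the unknown parameter from the prior
        successes = sum(rng.random() < theta for _ in range(n))
        total += beta_posterior_std(successes, n - successes)
    return total / num_sims


def smallest_sufficient_n(candidates, target_std):
    """Return the smallest candidate size whose expected std meets the target."""
    for n in sorted(candidates):
        if expected_posterior_std(n) <= target_std:
            return n
    return None  # none of the candidate sizes is large enough


def dataset_is_sufficient(observations, target_std):
    """Post-hoc case: check whether an existing 0/1 dataset meets the target."""
    successes = sum(observations)
    failures = len(observations) - successes
    return beta_posterior_std(successes, failures) <= target_std


if __name__ == "__main__":
    print(smallest_sufficient_n([10, 100, 1000], target_std=0.05))
    print(dataset_is_sufficient([1, 0] * 5, target_std=0.05))
```

In this toy setting the posterior is available in closed form; a Bayesian IRL model would generally need an approximate posterior (for example, from MCMC) in place of `beta_posterior_std`, but the pre-hoc and post-hoc decisions would follow the same pattern.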

    • Published in

      ACM Transactions on Computer-Human Interaction  Volume 30, Issue 1
      February 2023
      537 pages
      ISSN:1073-0516
      EISSN:1557-7325
      DOI:10.1145/3585399
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 7 March 2023
      • Online AM: 27 July 2022
      • Accepted: 30 May 2022
      • Revised: 30 April 2022
      • Received: 20 January 2021