Towards Efficient Annotations for a Human-AI Collaborative, Clinical Decision Support System: A Case Study on Physical Stroke Rehabilitation Assessment

Authors:
Min Hun Lee, Singapore Management University, Singapore
Daniel P. Siewiorek, Carnegie Mellon University, United States
Asim Smailagic, Carnegie Mellon University, United States
Alexandre Bernardino, Institute for Systems and Robotics, Instituto Superior Tecnico, University of Lisbon, Portugal
Sergi Bermúdez i Badia, Faculdade de Ciências Exatas e da Engenharia, Universidade da Madeira, Portugal

IUI '22: Proceedings of the 27th International Conference on Intelligent User Interfaces, March 2022, Pages 4–14. https://doi.org/10.1145/3490099.3511112

Published: 22 March 2022

ABSTRACT

Artificial intelligence (AI) and machine learning (ML) algorithms are increasingly being explored to support various decision-making tasks in health (e.g., rehabilitation assessment). However, developing such AI/ML-based decision support systems is challenging because collecting an annotated dataset is expensive. In this paper, we describe the development process of a human-AI collaborative, clinical decision support system that augments an ML model with a rule-based (RB) model from domain experts. We conducted an empirical evaluation in the context of assessing physical stroke rehabilitation, using a dataset of three exercises from 15 post-stroke survivors and therapists. Our results bring new insights into the efficient development and annotation of a decision support system: when an annotated dataset is not initially available, the RB model can be used to assess the quality of motion of post-stroke survivors and to identify samples with low confidence scores, supporting efficient annotation for training an ML model. Specifically, our system requires only 22–33% of the annotations from therapists to train an ML model that performs as well as an ML model trained with all annotations from a therapist. Our work discusses the value of a human-AI collaborative approach for effectively collecting an annotated dataset and supporting a complex decision-making task.
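
The idea of routing low-confidence samples to therapists can be illustrated with a minimal sketch. The abstract does not specify how confidence scores are computed or how samples are chosen, so everything here is an assumption made for illustration: the function name `select_for_annotation`, the `budget_fraction` parameter, the lowest-first ranking rule, and the example scores are all hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def select_for_annotation(confidences: Sequence[float], budget_fraction: float) -> List[int]:
    """Return indices of the lowest-confidence samples, up to the given budget.

    Assumption: each sample already has a confidence score in [0, 1] produced
    by some rule-based model; how that score is derived is outside this sketch.
    """
    if not 0.0 <= budget_fraction <= 1.0:
        raise ValueError("budget_fraction must be in [0, 1]")
    n_select = round(len(confidences) * budget_fraction)
    # Rank sample indices by ascending confidence (least confident first).
    ranked = sorted(range(len(confidences)), key=lambda i: confidences[i])
    # Return the selected indices in their original order for readability.
    return sorted(ranked[:n_select])


# Hypothetical confidence scores for nine exercise trials.
rb_confidences = [0.95, 0.40, 0.88, 0.55, 0.91, 0.62, 0.99, 0.35, 0.80]

# Ask a therapist to annotate roughly a third of the trials.
to_annotate = select_for_annotation(rb_confidences, budget_fraction=0.33)
print(to_annotate)
```

In this sketch, the remaining samples would keep the labels suggested by the rule-based model, while only the selected ones are sent for expert annotation. A real system would likely also need calibrated scores and per-exercise handling, which this example does not attempt.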

Published in

IUI '22: Proceedings of the 27th International Conference on Intelligent User Interfaces
March 2022
888 pages
ISBN: 9781450391443
DOI: 10.1145/3490099

Copyright © 2022 Owner/Author
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives International 4.0 License.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Clinical Decision Support Systems
Human Centered AI
Human-AI Collaboration
Human-In-the-Loop Systems
Physical Stroke Rehabilitation Assessment
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 746 of 2,811 submissions, 27%
Article Metrics
- Total Citations: 3
- Total Downloads: 883
- Downloads (Last 12 months): 340
- Downloads (Last 6 weeks): 40