Auto Response Generation in Online Medical Chat Services

Jahanshahi, Hadi; Kazmi, Syed; Cevik, Mucahit

doi:10.1007/s41666-022-00118-x

Research Article
Published: 15 July 2022

Volume 6, pages 344–374, (2022)
Journal of Healthcare Informatics Research

Abstract

Telehealth facilitates access to medical professionals by enabling remote medical services for patients. These services have gradually become popular over the years with the advent of the necessary technological infrastructure. The benefits of telehealth have become even more apparent since the beginning of the COVID-19 crisis, as people have become less inclined to visit doctors in person during the pandemic. In this paper, we focus on facilitating chat sessions between a doctor and a patient. We note that the quality and efficiency of the chat experience can be critical as the demand for telehealth services increases. Accordingly, we develop a smart auto-response generation mechanism for medical conversations that helps doctors respond to consultation requests efficiently, particularly during busy sessions. We explore over 900,000 anonymous, historical online messages between doctors and patients collected over 9 months. We implement clustering algorithms to identify the most frequent responses by doctors and manually label the data accordingly. We then train machine learning algorithms using this preprocessed data to generate the responses. The considered algorithm has two steps: a filtering (i.e., triggering) model to filter out infeasible patient messages, and a response generator to suggest the top-3 doctor responses for the messages that pass the triggering phase. Among the models utilized, BERT achieves a precision@3 of 85.41% and shows robustness to its parameters.
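The two-step flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function `suggest_responses`, the 0.5 threshold, the toy scoring functions, and the response labels are all hypothetical stand-ins for the trained triggering and response generation models.

```python
from typing import Callable, Dict, List, Optional


def suggest_responses(
    message: str,
    trigger_score: Callable[[str], float],
    response_scores: Callable[[str], Dict[str, float]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
    k: int = 3,
) -> Optional[List[str]]:
    """Return up to k suggested reply labels, or None if the message is filtered out."""
    # Step 1 (triggering): drop messages for which no suggestion should be shown.
    if trigger_score(message) < threshold:
        return None
    # Step 2 (response generation): rank candidate reply labels by model score.
    scores = response_scores(message)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


# Toy stand-ins for the trained models (hypothetical, for illustration only).
def toy_trigger(message: str) -> float:
    # Pretend that short messages are the ones a frequent reply can answer.
    return 0.9 if len(message.split()) <= 8 else 0.1


def toy_responses(message: str) -> Dict[str, float]:
    return {"greeting": 0.6, "ask_age": 0.25, "ask_duration": 0.1, "goodbye": 0.05}


short_result = suggest_responses("hello doctor", toy_trigger, toy_responses)
long_result = suggest_responses(
    "I have had a complicated history of many symptoms over several years",
    toy_trigger,
    toy_responses,
)
print(short_result)
print(long_result)
```

In a real deployment the two callables would wrap trained classifiers (for example, an LSTM or BERT model), and a `None` result would simply mean that no suggestion is shown to the doctor.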

Acknowledgements

The authors would like to thank Your Doctors Online for funding and supporting this research. This work was also funded and supported by Mitacs through the Mitacs Accelerate Program. The authors would also like to thank Gagandip Chane for his help with the data labeling.

Author information

Authors and Affiliations

Data Science Lab, Ryerson University, 44 Gerrard St E, Toronto, M5B 1G3, Ontario, Canada
Hadi Jahanshahi, Syed Kazmi & Mucahit Cevik

Corresponding author

Correspondence to Hadi Jahanshahi.

Ethics declarations

Conflict of Interest

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Finding the Best Pipeline

In this section, we discuss the evaluation results of the end-to-end pipelines obtained by combining different models for triggering and response generation. Table 4 reports the performance of various combinations of the triggering and response generation models. We use Precision@3 as the representative performance metric because we consider the top-3 proposed responses in a generic chat application. These results show that using LSTM for triggering and BERT for response generation outperforms the other combinations, with an average Precision@3 value of 85.58%. Using BERT for both phases leads to very similar performance, with an average Precision@3 value of 85.42%, which ranks second among all the tested combinations. The higher performance obtained with LSTM in the first phase can be attributed to the particularly good performance of LSTM on the triggering task. That is, LSTM outperforms BERT in three of the seven performance metrics that we report in Table 2, with precision values for the feasible messages exceeding those of the BERT model by 2.55% on average. We also find that rule-based approaches and their combinations have significantly lower performance, which points to the benefits of employing machine learning algorithms to create this end-to-end pipeline.

Table 4 Summary performance values for the triggering and response generation model combinations to create an end-to-end pipeline
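As a rough illustration of how a Precision@3 value for such a pipeline could be computed, consider the sketch below. It assumes that a suggestion counts as correct when the doctor's actual response label appears among the top three proposals, and that messages rejected by the triggering model are left out of the average. Both are assumptions made for illustration; the excerpt does not spell out the exact definition, and the function name `precision_at_k` and the labels are hypothetical.

```python
from typing import List, Optional, Sequence


def precision_at_k(
    true_labels: Sequence[str],
    suggestions: Sequence[Optional[List[str]]],
    k: int = 3,
) -> float:
    """Share of triggered messages whose true reply is among the top-k suggestions."""
    hits = 0
    total = 0
    for truth, ranked in zip(true_labels, suggestions):
        if ranked is None:
            # Assumption: messages filtered out by the triggering model are skipped.
            continue
        total += 1
        if truth in ranked[:k]:
            hits += 1
    return hits / total if total else 0.0


# Hypothetical labels for four messages; the third one was filtered out.
true_labels = ["a", "b", "c", "d"]
suggestions = [
    ["a", "x", "y"],       # hit: true label ranked first
    ["x", "y", "z"],       # miss: true label not suggested
    None,                  # rejected by the triggering step
    ["x", "d", "y", "q"],  # hit: true label ranked second
]
score = precision_at_k(true_labels, suggestions)
print(round(score, 4))
```

Under these assumptions, two of the three triggered messages are hits, so the score is 2/3. A different convention for filtered messages (for instance, counting them as misses) would change the value, so the numbers in Table 4 should be read with the paper's own definition in mind.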

About this article

Cite this article

Jahanshahi, H., Kazmi, S. & Cevik, M. Auto Response Generation in Online Medical Chat Services. J Healthc Inform Res 6, 344–374 (2022). https://doi.org/10.1007/s41666-022-00118-x

Received: 09 September 2021
Revised: 28 June 2022
Accepted: 29 June 2022
Published: 15 July 2022
Issue Date: September 2022
DOI: https://doi.org/10.1007/s41666-022-00118-x
