Auto Response Generation in Online Medical Chat Services

  • Research Article
Journal of Healthcare Informatics Research

Abstract

Telehealth facilitates access to medical professionals by enabling remote medical services for patients. These services have gradually become popular over the years with the advent of the necessary technological infrastructure. The benefits of telehealth have become even more apparent since the beginning of the COVID-19 crisis, as people have been less inclined to visit doctors in person during the pandemic. In this paper, we focus on facilitating chat sessions between a doctor and a patient. The quality and efficiency of the chat experience can be critical as the demand for telehealth services increases. Accordingly, we develop a smart auto-response generation mechanism for medical conversations that helps doctors respond to consultation requests efficiently, particularly during busy sessions. We explore over 900,000 anonymous, historical online messages between doctors and patients collected over 9 months. We apply clustering algorithms to identify the most frequent responses by doctors and manually label the data accordingly. We then train machine learning algorithms on this preprocessed data to generate the responses. The algorithm has two steps: a filtering (i.e., triggering) model that filters out infeasible patient messages, and a response generator that suggests the top-3 doctor responses for the messages that pass the triggering phase. Among the models considered, BERT achieves a precision@3 of 85.41% and is robust to its parameters.
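The two-step design described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `suggest_responses`, the 0.5 threshold, and the toy stand-in models are all assumptions introduced here for illustration, and a trained classifier (for example LSTM or BERT) would take the place of each toy function.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical type aliases; neither name comes from the paper.
TriggerModel = Callable[[str], float]              # message -> P(feasible)
ResponseModel = Callable[[str], Dict[str, float]]  # message -> {response: score}


def suggest_responses(
    message: str,
    trigger: TriggerModel,
    generator: ResponseModel,
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
    k: int = 3,
) -> Optional[List[str]]:
    """Return up to k suggested replies, or None if the message is filtered out."""
    # Step 1: triggering. Messages judged infeasible get no suggestions.
    if trigger(message) < threshold:
        return None
    # Step 2: response generation. Rank candidate replies by model score.
    scores = generator(message)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


# Toy stand-ins for trained models (illustrative only).
def toy_trigger(message: str) -> float:
    return 0.9 if "?" in message else 0.1


def toy_generator(message: str) -> Dict[str, float]:
    return {
        "How long have you had these symptoms?": 0.6,
        "Are you taking any medication?": 0.25,
        "You are welcome.": 0.1,
        "Please book a follow-up.": 0.05,
    }


suggestions = suggest_responses(
    "I have a headache, what should I do?", toy_trigger, toy_generator
)
skipped = suggest_responses("ok thanks", toy_trigger, toy_generator)
```

Keeping the two stages as separate callables mirrors the idea that the triggering and response generation models can be chosen independently.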

Notes

  1. https://yourdoctors.online

  2. https://bio.nlplab.org/

  3. https://tfhub.dev/google/experts/bert/pubmed/2

Acknowledgements

The authors would like to thank Your Doctors Online for funding and supporting this research. This work was also funded and supported by Mitacs through the Mitacs Accelerate Program. The authors would also like to thank Gagandip Chane for his help with the data labeling.

Author information

Corresponding author

Correspondence to Hadi Jahanshahi.

Ethics declarations

Conflict of Interest

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Finding the Best Pipeline

In this section, we discuss the evaluation results of the end-to-end pipelines obtained by combining different models for triggering and response generation. Table 4 reports the performance of various combinations of the triggering and response generation models. We use Precision@3 as the representative performance metric because we consider the top-3 proposed responses in a generic chat application. These results show that using LSTM for triggering and BERT for response generation outperforms the other combinations, with an average Precision@3 of 85.58%. Using BERT for both phases leads to very similar performance, with an average Precision@3 of 85.42%, the second-best among all tested combinations. The higher performance obtained with LSTM in the first phase can be attributed to its particularly good performance on the triggering task. That is, LSTM outperforms BERT in three of the seven performance metrics that we report in Table 2, with precision values for the feasible messages exceeding those of the BERT model by 2.55% on average. We also find that rule-based approaches and their combinations perform significantly worse, pointing to the benefits of employing machine learning algorithms to create this end-to-end pipeline.
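As a rough illustration of the metric, the sketch below computes Precision@3 under one common reading: a message counts as a hit when its true response label appears among the top three suggestions. This reading is an assumption, since the exact definition used for Table 4 (including how non-triggered messages are treated) is not given in this section. The function name `precision_at_k` and the toy labels are hypothetical.

```python
from typing import List, Sequence


def precision_at_k(
    true_labels: Sequence[str],
    ranked_predictions: Sequence[List[str]],
    k: int = 3,
) -> float:
    """Fraction of messages whose true response is among the top-k suggestions."""
    if len(true_labels) != len(ranked_predictions):
        raise ValueError("inputs must have the same length")
    if not true_labels:
        raise ValueError("need at least one example")
    # Count a hit when the true label appears in the first k ranked suggestions.
    hits = sum(
        truth in preds[:k]
        for truth, preds in zip(true_labels, ranked_predictions)
    )
    return hits / len(true_labels)


# Toy data: labels "A".."D" stand in for clustered doctor responses.
truths = ["A", "B", "C", "D"]
rankings = [
    ["A", "B", "C"],       # hit at rank 1
    ["C", "A", "B"],       # hit at rank 3
    ["A", "B", "D", "C"],  # miss: true label only appears at rank 4
    ["A", "B", "C"],       # miss: true label not suggested
]
score = precision_at_k(truths, rankings)  # 2 hits out of 4 messages
```

Averaging this score over evaluation folds or runs would give a summary value comparable in form to the averages quoted above.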

Table 4 Summary performance values for the triggering and response generation model combinations to create an end-to-end pipeline

About this article

Cite this article

Jahanshahi, H., Kazmi, S. & Cevik, M. Auto Response Generation in Online Medical Chat Services. J Healthc Inform Res 6, 344–374 (2022). https://doi.org/10.1007/s41666-022-00118-x

  • DOI: https://doi.org/10.1007/s41666-022-00118-x
