A study of factors related to patients’ length of stay using data mining techniques in a general hospital in southern Iran

  • Research
Health Information Science and Systems

Abstract

Purpose

The length of stay (LOS) in hospitals is a widely used indicator for purposes such as health care management, quality control, utilization of hospital services and resources, and assessment of efficiency. Various methods have been used to identify the factors influencing LOS. This study takes a comparative approach, applying data mining techniques to investigate the influential factors and predict LOS in Shahid-Mohammadi Hospital, Bandar Abbas, Iran.

Methods

Using a dataset consisting of 526 patient records from Shahid-Mohammadi Hospital, covering March 2016 to March 2017, factors affecting LOS were ranked using information gain and correlation indices. In addition, classification models for LOS prediction were built using nine data mining classifiers, each applied with and without feature selection. Finally, the models were compared.
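As an illustration of the ranking step, the following is a minimal sketch of information gain for categorical features. It is not the authors' implementation: the toy records, the field names (`ward`, `sex`, `los`), and the helper `rank_features` are hypothetical and chosen only to show how the index orders candidate factors.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the label within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(records, label_key):
    """Return (feature, gain) pairs sorted from most to least informative."""
    labels = [r[label_key] for r in records]
    features = [k for k in records[0] if k != label_key]
    gains = {f: information_gain([r[f] for r in records], labels) for f in features}
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy records; the study's dataset is not reproduced here.
records = [
    {"ward": "surgery", "sex": "F", "los": "long"},
    {"ward": "surgery", "sex": "M", "los": "long"},
    {"ward": "internal", "sex": "F", "los": "short"},
    {"ward": "internal", "sex": "M", "los": "short"},
]
ranking = rank_features(records, "los")
```

In this toy case `ward` fully determines the LOS class and `sex` carries no information, so `ward` ranks first. Continuous attributes would need to be discretized before this calculation applies.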

Results

The most important factors affecting LOS were the number of para-clinical services, counseling frequency, clinical ward, the specialty and degree of the doctor, and the cause of hospitalization. Among the classifiers built on the dataset, Logistic Regression achieved the best accuracy (83.91%) and Naïve Bayes the best sensitivity (80.36%). The best AUC (0.896) was achieved by the Random Forest and Generalized Linear classifiers.
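For readers unfamiliar with the three reported metrics, the sketch below shows one common way to compute accuracy, sensitivity, and AUC for a binary classifier. The labels, scores, and the 0.5 decision threshold are made-up assumptions for illustration; the abstract does not state how the LOS classes or thresholds were defined in the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def sensitivity(y_true, y_pred, positive=1):
    """True positive rate: TP / (TP + FN)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn)


def auc(y_true, scores, positive=1):
    """Area under the ROC curve as the probability that a random positive
    case is scored higher than a random negative case (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = long stay, 0 = short stay.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]          # predicted probability of class 1
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

acc = accuracy(y_true, y_pred)
sens = sensitivity(y_true, y_pred)
area = auc(y_true, scores)
```

Accuracy and sensitivity depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is one reason different classifiers can lead on different metrics.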

Conclusion

The results showed that most of the proposed models are suitable for classifying length of stay, although Logistic Regression might perform slightly better than the others in terms of accuracy and can be used to determine patients’ LOS. In general, continuous monitoring of the factors influencing each performance indicator, based on proper and accurate models, is important for supporting management decisions in hospitals.

Acknowledgements

This study is part of a registered research project (grant number 9520, ethics code HUMS.REC.1395.56) of the Deputy of Research and Technology of Hormozgan University of Medical Sciences. We thank the university’s Deputy of Research and Technology for its support, and we sincerely thank our counselors at the Clinical Research Development Center of Shahid Mohammadi Hospital.

Author information

Corresponding author

Correspondence to Tayebeh Baniasadi.

Ethics declarations

Conflict of interest

The authors report no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ayyoubzadeh, S.M., Ghazisaeedi, M., Rostam Niakan Kalhori, S. et al. A study of factors related to patients’ length of stay using data mining techniques in a general hospital in southern Iran. Health Inf Sci Syst 8, 9 (2020). https://doi.org/10.1007/s13755-020-0099-8

  • DOI: https://doi.org/10.1007/s13755-020-0099-8