Abstract
During a medical appointment, patient information, including past exams, is analyzed to reach a diagnosis. This process is error-prone because many diagnoses may be possible, and it depends heavily on the doctor's experience. Even with the correct diagnosis, prescribing can be difficult: several drugs may exist for each disease, and some cannot be used because of allergies or high cost. A system that, for each diagnosis, provides a list of the most suitable medicines would therefore help doctors. Our approach supports the physician in this process. Rather than trying to predict a single medicine, we aim to predict, from the available information, the set of most likely drugs. The prescription problem can be framed as a multi-label classification problem because multiple drugs may be prescribed at the same time for a given diagnosis. Because of the problem's complexity, some simplifications were needed to make it tractable, so several approaches were tested under different assumptions. The supplied data were also complex and had significant quality problems, which required a substantial investment in data preparation, in particular feature engineering. Overall, the results in each scenario are good, with performance almost twice the baseline, especially when Binary Relevance is used as the transformation approach.
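To make the Binary Relevance transformation concrete, the following minimal sketch trains one independent binary classifier per drug and then returns either the predicted drug set or a ranking of drugs by estimated likelihood. It is an illustration only, not the pipeline used in this work: the class names, the counting-based base learner, the single diagnosis feature, and the toy records (`dx_1`, `drug_A`, and so on) are all assumptions introduced here.

```python
from collections import defaultdict


class FrequencyBinaryClassifier:
    """Toy base learner (an assumption): estimates P(label = 1 | diagnosis) by counting."""

    def __init__(self):
        self.pos = defaultdict(int)    # positive examples seen per diagnosis
        self.total = defaultdict(int)  # all examples seen per diagnosis

    def fit(self, diagnoses, targets):
        for diagnosis, target in zip(diagnoses, targets):
            self.total[diagnosis] += 1
            self.pos[diagnosis] += target
        return self

    def predict_proba(self, diagnosis):
        n = self.total.get(diagnosis, 0)
        # Unseen diagnoses get probability 0.0 in this toy model.
        return self.pos.get(diagnosis, 0) / n if n else 0.0


class BinaryRelevance:
    """Binary Relevance: one independent binary problem per label (drug)."""

    def __init__(self, labels):
        self.labels = list(labels)
        self.models = {}

    def fit(self, diagnoses, label_sets):
        for label in self.labels:
            # Transform the multi-label target into a 0/1 target for this label.
            targets = [1 if label in labels else 0 for labels in label_sets]
            self.models[label] = FrequencyBinaryClassifier().fit(diagnoses, targets)
        return self

    def predict(self, diagnosis, threshold=0.5):
        """Return the set of drugs whose estimated probability reaches the threshold."""
        return {
            label
            for label, clf in self.models.items()
            if clf.predict_proba(diagnosis) >= threshold
        }

    def rank(self, diagnosis):
        """Return (drug, probability) pairs, most likely first; ties broken by name."""
        scores = [(label, clf.predict_proba(diagnosis)) for label, clf in self.models.items()]
        return sorted(scores, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical toy records: one diagnosis code and one set of prescribed drugs each.
diagnoses = ["dx_1", "dx_1", "dx_1", "dx_2", "dx_2"]
prescriptions = [
    {"drug_A", "drug_B"},
    {"drug_A"},
    {"drug_A", "drug_B"},
    {"drug_C"},
    {"drug_C", "drug_D"},
]

model = BinaryRelevance(["drug_A", "drug_B", "drug_C", "drug_D"]).fit(diagnoses, prescriptions)
print(sorted(model.predict("dx_1")))
print(model.rank("dx_2"))
```

Because each drug is modelled separately, Binary Relevance ignores co-prescription patterns between drugs; in exchange, any binary classifier can serve as the base learner in place of the counting model assumed here.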
Acknowledgements
I would like to thank Glintt - Healthcare Solutions S.A., who provided data and expertise that greatly assisted the development of this work.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Silva, P., Rivolli, A., Rocha, P., Correia, F., Soares, C. (2018). Machine Learning for Drugs Prescription. In: Yin, H., Camacho, D., Novais, P., Tallón-Ballesteros, A. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2018. Lecture Notes in Computer Science, vol 11314. Springer, Cham. https://doi.org/10.1007/978-3-030-03493-1_57
DOI: https://doi.org/10.1007/978-3-030-03493-1_57
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03492-4
Online ISBN: 978-3-030-03493-1
eBook Packages: Computer Science, Computer Science (R0)