Multi-objective feature selection for warfarin dose prediction

doi:10.1016/j.compbiolchem.2017.06.002

Computational Biology and Chemistry

Volume 69, August 2017, Pages 126-133

Highlights

• The purpose of this paper is to identify influential features using multi-objective optimization algorithms.
• We propose two new approaches based on NSGA-II and MOPSO to predict warfarin dosage.
• Warfarin dose prediction in this paper is based on a Multi-Layer Perceptron.
• Multi-objective optimization methods are more accurate than classic feature selection methods.

Abstract

With the increasing application of decision support systems in various fields, their use in different areas of medical science has been growing. Drug dose prediction is one of the most important problems that decision support systems can improve. In this paper, a new multi-objective feature selection approach is proposed to support warfarin dose prediction decisions. Warfarin is an anticoagulant normally used to prevent the formation of clots. This research was conducted on 553 patients during 2013–2015 who were candidates for warfarin therapy and whose INR was in the target range. Clinical and genetic features affecting the dose were extracted and evaluated, and new feature selection methods based on multi-objective optimization, namely the Non-dominated Sorting Genetic Algorithm-II (NSGA-II) and Multi-Objective Particle Swarm Optimization (MOPSO), were used, with artificial neural networks serving as the evaluator. Multi-objective optimization methods achieve higher accuracy and performance than classic feature selection methods. Furthermore, MOPSO has higher precision than NSGA-II. With seven selected features, the mean square error (MSE), root mean square error (RMSE) and mean absolute error (MAE) for MOPSO were 0.011, 0.1 and 0.109, respectively.

Introduction

Abnormal blood clots in arteries and veins are an important cause of morbidity and mortality (Mannucci and Franchini, 2011). Oral anticoagulant drugs are often used to prevent thrombosis and occlusion of blood flow in conditions such as atrial fibrillation, pulmonary embolism and heart valve replacement (Fauci et al., 2008).

Warfarin is one of the most common oral anticoagulants. It reduces blood coagulability and prevents the formation of blood clots. The warfarin dose is determined based on the results of the International Normalized Ratio (INR) test (Riley et al., 2000, Fitzmaurice and Blann, 2002). The target INR range is usually between 2 and 3 for patients taking warfarin (Princeton, 2006). The INR value should be kept within the therapeutic range because deviations increase the risk of mortality (Oden and Fahlen, 2002). Many factors, such as other drugs and diseases, raise the risk of warfarin therapy and increase or decrease the INR. The warfarin treatment period usually ranges from three months to lifelong (Hirsh et al., 2003).

The problem with warfarin is how to achieve the treatment goals for each patient. Since dose adjustment is based on trial and error, and the dose is often revised according to subsequent tests, data mining methods can play a significant role in predicting the dose. Data mining is the process of extracting previously unknown knowledge from data. There are several types of data mining techniques (Sohrabi and Akbari, 2016), and data mining has numerous applications, including several types of prediction in social networks, analysis of customer purchase patterns, analysis of web access patterns, and especially the investigation of medical processes (Sohrabi and Barforoush, 2012, Sohrabi and Barforoush, 2013, Sohrabi and Ghods, 2014, Sohrabi and Ghods, 2016a, Sohrabi and Ghods, 2016b, Sohrabi and Akbari, 2016, Sohrabi and Marzooni, 2016, Sohrabi and Roshani, 2017, Sohrabi and Azgomi, 2017).

The prescribed dose in warfarin therapy is very important: prescribing too much of the drug can cause complications that are life-threatening in some cases, while prescribing less than the required amount does not achieve the therapeutic purpose (Sinxadi and Blockman, 2008).

Currently, the dose is determined based on the experience of the physician, and patients begin treatment with low doses, which slows progress toward the therapeutic range and requires frequent tests. Many clinical and genetic features that affect patients’ dosage have received little study. Decision support systems can be used for diagnosis or drug dose prediction to help human experts make the best decisions (Beaudoin et al., 2016, Nielsen et al., 2017, Sohrabi and Tajik, 2016, Yet et al., 2013). Because of their importance, final decisions on such issues usually rest with physicians. Therefore, in cases where a human can review the decision made by the system and correct or change it, the speed of decision-making matters along with acceptable accuracy.

Since decisions in a high-dimensional, complex environment often take a long time and reduce the system's ability to respond online, which is the main mission of decision support systems, reducing the scale of the problem and the operating space by selecting key features is a very important operation. Moreover, the available hospital data on patients are not clean, and the records of some patients lack values for some features. In such cases, where the values of some features in the training or testing dataset are unavailable, dynamic feature selection and quick determination of alternative sets of features for medication is an important benefit of decision support systems. In the past few years, traditional methods of feature selection have been replaced by multi-objective optimization methods, which are superior in terms of accuracy and complexity (Xue et al., 2013, Zhou et al., 2013, Huang et al., 2010). The purpose of this paper is to identify influential features using the multi-objective optimization algorithms NSGA-II (Deb et al., 2002) and MOPSO (Coello et al., 2004), and then to predict the dose with a Multi-Layer Perceptron (MLP) (Haykin, 2007).

The rest of this paper is organized as follows:

The second section contains a selective literature review in the field of warfarin dose prediction and multi-objective optimization methods. The proposed methods are explained in section three, where we first describe two multi-objective optimization methods and then present our new approaches based on multi-objective feature selection and the evaluation factors of the multilayer neural network. In section four, the experimental results are presented, and finally we conclude our work in section five.

Section snippets

Related works

Feature selection techniques have been used in computational biology and its related applications for a long time (He and Yu, 2010, Martinez et al., 2010, Gumus et al., 2013, Garbarine et al., 2011, Li et al., 2015). Reducing the size of the datasets, decreasing the computation time of classification by removing redundant or irrelevant features, and improving the classification time by eliminating misleading and inappropriate features, are some of the most important purposes of features

Proposed method

In this section, we first explain two of the most common multi-objective optimization algorithms, NSGA-II and MOPSO, and then describe our multi-objective feature selection approach for predicting warfarin dosage using these two algorithms.
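As a rough illustration of what a multi-objective feature selection problem of this kind looks like, the sketch below scores a candidate feature subset on two objectives to be minimized: the number of selected features and the prediction MSE. It is a minimal sketch under stated assumptions, not the paper's implementation: the binary-mask encoding, the function names and the toy data are hypothetical, and a leave-one-out nearest-neighbour predictor stands in for the MLP evaluator so that the example runs with the standard library alone.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def loo_nearest_neighbour(X_sub, y):
    """Stand-in predictor (NOT the paper's MLP): for each row, predict the
    target of its nearest other row (leave-one-out 1-nearest neighbour)."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    preds = []
    for i, row in enumerate(X_sub):
        others = [j for j in range(len(X_sub)) if j != i]
        nearest = min(others, key=lambda j: sq_dist(row, X_sub[j]))
        preds.append(y[nearest])
    return preds


def evaluate_mask(mask, X, y, predictor=loo_nearest_neighbour):
    """Return (number of selected features, MSE) for a 0/1 feature mask.
    Both objectives are to be minimized; an empty mask gets infinite error."""
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return (0, float("inf"))
    X_sub = [[row[j] for j in selected] for row in X]
    return (len(selected), mse(y, predictor(X_sub, y)))


# Synthetic toy data (hypothetical, not patient data): feature 0 tracks the
# target, feature 1 does not.
X_toy = [[0.0, 5.0], [1.0, 0.0], [2.0, 4.0], [3.0, 1.0]]
y_toy = [0.0, 1.0, 2.0, 3.0]
for candidate in ([1, 0], [0, 1], [1, 1]):
    print(candidate, evaluate_mask(candidate, X_toy, y_toy))
```

An optimizer such as NSGA-II or MOPSO would then search over masks and keep those whose objective pairs are not dominated by any other mask; any regression model, including an MLP, could be passed in place of the stand-in predictor.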

Experimental results

In the present study, 19 clinical and genetic features were examined. The target variable for modeling was the warfarin dose in milligrams per week (mg/w). The parameters of the NSGA-II and MOPSO algorithms are shown in Table 2 and Table 3, respectively.

The non-dominated optimal solution sets obtained from running the NSGA-II and MOPSO algorithms with the MSE criterion are shown in Fig. 6 and Fig. 7.
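To make the notion of a non-dominated solution set concrete, the following minimal sketch filters a list of (number of features, MSE) pairs down to its Pareto front, assuming both objectives are minimized. The function names and the candidate values are hypothetical illustrations and are not results from the paper.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better
    in at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (n_features, mse) pairs, for illustration only.
candidates = [(3, 0.30), (5, 0.20), (5, 0.25), (7, 0.20), (9, 0.15)]
print(non_dominated(candidates))  # [(3, 0.3), (5, 0.2), (9, 0.15)]
```

Here (5, 0.25) and (7, 0.20) are dropped because (5, 0.20) is at least as good on both objectives; the remaining points represent different trade-offs between subset size and error.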

The output of the NSGA-II and MOPSO algorithms based on the

Conclusion

Data mining can be applied to various aspects of medicine, including the prevention and detection of disease, its treatment, and the duration of hospitalization. One application of data mining in the medical field is predicting the dose of some vital medicines. Warfarin is one of the best-known oral medications and plays an important role in the prevention of blood clots. Identifying the dose of the drug is difficult for physicians and is often based on observed changes across successive tests.