Research Article
Multi-objective feature selection for warfarin dose prediction

https://doi.org/10.1016/j.compbiolchem.2017.06.002

Highlights

  • The purpose of this paper is to identify influential features using multi-objective optimization algorithms.

  • We propose two new approaches based on NSGA-II and MOPSO to predict warfarin dosage.

  • Warfarin dose prediction in this paper is based on a Multi-Layer Perceptron.

  • Multi-objective optimization methods are more accurate than the classic feature selection methods.

Abstract

With the growing application of decision support systems in various fields, their use in different areas of medical science has been increasing. Drug dose prediction is one of the most important tasks that decision support systems can improve. In this paper, a new multi-objective feature selection approach is proposed to support warfarin dose prediction decisions. Warfarin is an anticoagulant normally used to prevent the formation of clots. This research was conducted during 2013–2015 on 553 patients who were candidates for warfarin therapy and whose INR was in the target range. Clinical and genetic characteristics affecting the dose were extracted and evaluated, and new feature selection methods based on multi-objective optimization, namely the Non-dominated Sorting Genetic Algorithm-II (NSGA-II) and Multi-Objective Particle Swarm Optimization (MOPSO), were used together with evaluation by artificial neural networks. Multi-objective optimization methods showed higher accuracy and performance than the classic feature selection methods. Furthermore, MOPSO achieved higher precision than NSGA-II. With seven selected features, the mean square error (MSE), root mean square error (RMSE) and mean absolute error (MAE) for MOPSO were 0.011, 0.1 and 0.109, respectively.

Introduction

Abnormal blood clots in arteries and veins are an important cause of morbidity and mortality (Mannucci and Franchini, 2011). Oral anticoagulant drugs are often used to prevent thrombosis and occlusion of blood flow in conditions such as atrial fibrillation, pulmonary embolism and heart valve replacement (Fauci et al., 2008).

Warfarin is one of the most common oral anticoagulants. It reduces blood coagulability and prevents the formation of blood clots. The warfarin dose is determined based on the results of the International Normalized Ratio (INR) test (Riley et al., 2000, Fitzmaurice and Blann, 2002). The target INR range is usually between 2 and 3 for patients taking warfarin (Princeton, 2006). The INR should be kept within the therapeutic range because deviations increase the risk of mortality (Oden and Fahlen, 2002). Many factors, such as other drugs and diseases, raise the risk of warfarin therapy and increase or decrease the INR. The warfarin treatment period usually ranges from three months to lifelong (Hirsh et al., 2003).

The problem with warfarin is how to achieve the goals of treatment for each patient. Since dose adjustment relies on trial and error and is often revised based on subsequent tests, data mining methods have a significant role in predicting the dose. Data mining is the process of extracting previously unknown knowledge from data. There are several types of data mining techniques (Sohrabi and Akbari, 2016), and data mining has numerous applications, including several types of prediction in social networks, analysis of customer purchase patterns, analysis of web access patterns, and especially the investigation of medical processes (Sohrabi and Barforoush, 2012, Sohrabi and Barforoush, 2013, Sohrabi and Ghods, 2014, Sohrabi and Ghods, 2016a, Sohrabi and Ghods, 2016b, Sohrabi and Akbari, 2016, Sohrabi and Marzooni, 2016, Sohrabi and Roshani, 2017, Sohrabi and Azgomi, 2017).

The prescribed dose in warfarin therapy is very important: too much of the drug can cause complications that are life-threatening in some cases, whereas too little fails to achieve the therapeutic purpose (Sinxadi and Blockman, 2008).

Currently, the dose is determined based on the experience of the physician, and patients begin treatment with low doses, which delays reaching the therapeutic range and requires frequent tests. Many clinical and genetic features that affect the dosage have been less studied. Decision support systems can be used for diagnosis or for the prediction of drug doses to help human experts make the best decisions (Beaudoin et al., 2016, Nielsen et al., 2017, Sohrabi and Tajik, 2016, Yet et al., 2013). Because of their importance, final decisions on such issues usually rest with physicians. Therefore, in cases where a human can review the decision made by the system and correct or change it, the speed of decision-making matters alongside acceptable accuracy.

Since decisions in a high-dimensional, complex environment often take a long time and reduce the system's ability to respond online, which is the main mission of decision support systems, reducing the scale of the problem and the search space by selecting key features is a very important operation. Moreover, the available hospital data on patients are not clean, and the records of some patients lack values for some features. In such cases, where the values of some features in the training or testing dataset are unavailable, dynamic feature selection and quick determination of alternative feature sets for medication are an important benefit of decision support systems. In the past few years, traditional feature selection methods have been replaced by multi-objective optimization methods, which are superior in terms of accuracy and complexity (Xue et al., 2013, Zhou et al., 2013, Huang et al., 2010). The purpose of this paper is to identify influential features using the multi-objective optimization algorithms NSGA-II (Deb et al., 2002) and MOPSO (Coello et al., 2004), and then to predict the dose using a Multi-Layer Perceptron (MLP) (Haykin, 2007).

The rest of this paper is organized as follows:

The second section contains a selective literature review in the field of warfarin dose prediction and multi-objective optimization methods. The proposed methods are explained in section three. In that section, we first describe two multi-objective optimization methods and then present our new approaches based on multi-objective feature selection and the evaluation factors of the multilayer neural network. In section four, the experimental results are presented, and finally we conclude our work in section five.

Section snippets

Related works

Feature selection techniques have been used in computational biology and its related applications for a long time (He and Yu, 2010, Martinez et al., 2010, Gumus et al., 2013, Garbarine et al., 2011, Li et al., 2015). Reducing the size of the datasets, decreasing the computation time of classification by removing redundant or irrelevant features, and improving the classification time by eliminating misleading and inappropriate features, are some of the most important purposes of features

Proposed method

In this section, we first explain two of the most common multi-objective optimization algorithms, NSGA-II and MOPSO, and then describe our multi-objective feature selection approach for predicting warfarin dosage using these two algorithms.
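Both NSGA-II and MOPSO rank candidate solutions by Pareto dominance. The following is a minimal sketch of that idea for feature selection, assuming two objectives that are both minimized: the prediction error of a feature subset and the number of features it keeps. The objective pair, the names `evaluate`, `dominates` and `non_dominated`, and the toy scores are illustrative assumptions, not the implementation used in this paper.

```python
from typing import Callable, List, Sequence, Tuple

Mask = Tuple[int, ...]          # 1 = feature kept, 0 = feature dropped
Objectives = Tuple[float, int]  # (prediction error, number of kept features)


def evaluate(mask: Mask, error_of: Callable[[Mask], float]) -> Objectives:
    """Score one candidate subset on both objectives (both are minimized).

    `error_of` is a hypothetical callback, e.g. the validation MSE of a
    model trained only on the features selected by `mask`.
    """
    return (error_of(mask), sum(mask))


def dominates(a: Objectives, b: Objectives) -> bool:
    """a dominates b: no worse on every objective, strictly better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(scores: Sequence[Objectives]) -> List[Objectives]:
    """Keep the scores that no other score dominates (first Pareto front)."""
    return [s for s in scores if not any(dominates(o, s) for o in scores)]


# Toy scores invented for illustration only (not results from the paper).
toy_scores = [(0.20, 5), (0.10, 7), (0.10, 9), (0.30, 4), (0.25, 6)]
front = non_dominated(toy_scores)
print(front)
```

In this toy set, the nine-feature subset is dropped because a seven-feature subset reaches the same error, which is the kind of trade-off a non-dominated set makes visible.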

Experimental results

In the present study, 19 clinical and genetic features were examined. The target variable for modeling was the warfarin dose in milligrams per week (mg/w). The parameters of the NSGA-II and MOPSO algorithms are shown in Table 2 and Table 3, respectively.

The sets of non-dominated optimal solutions obtained from the NSGA-II and MOPSO algorithms based on the MSE criterion are shown in Fig. 6 and Fig. 7.
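For reference, the three error criteria reported in this work (MSE, RMSE and MAE) can be computed as in the short sketch below. It uses the standard textbook definitions; the function names and the sample doses are hypothetical and are not taken from the study data.

```python
import math
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean square error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error: average of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical weekly doses in mg/w, invented for illustration only.
toy_true = [10.0, 20.0, 30.0]
toy_pred = [10.0, 20.0, 36.0]
print(mse(toy_true, toy_pred), rmse(toy_true, toy_pred), mae(toy_true, toy_pred))
```

If the target is normalized before training, these values are on the normalized scale rather than in mg/w.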

The output of the NSGA-II and MOPSO algorithms based on the

Conclusion

Data mining can be applied to various aspects, including the prevention and detection of disease, the treatment of disease, and the duration of hospitalization. One application of data mining in the medical field is predicting the dose of some vital medicines. Warfarin is one of the best-known oral medications and plays an important role in the prevention of blood clots. Identifying the dose of the drug is difficult for physicians and is often based on observed changes in successive tests.

References (58)

  • H.H. Inbarani et al.

    Supervised hybrid feature selection based on PSO and rough sets for medical diagnosis

    Comput. Methods Progr. Biomed.

    (2014)
  • M. Iqbal et al.

    Multi-objective optimization in sensor networks: optimization classification, applications and solution approaches

    Comput. Netw.

    (2016)
  • D. Kimovski et al.

    Parallel alternatives for evolutionary multi-objective optimization in unsupervised feature selection

    Expert Syst. Appl.

    (2015)
  • F.S. Lobato et al.

    Determination of an optimal control strategy for drug administration in tumor treatment using multi-objective optimization differential evolution

    Comput. Methods Progr. Biomed.

    (2016)
  • E. Martinez et al.

    Compact cancer biomarkers discovery using a swarm intelligence feature selection algorithm

    Comput. Biol. Chem.

    (2010)
  • C. Nantasenamat et al.

    Predictive QSAR modeling of aldose reductase inhibitors using Monte Carlo feature selection

    Eur. J. Med. Chem.

    (2014)
  • P.B. Nielsen et al.

    Using a personalized decision support algorithm for dosing in warfarin treatment: a randomised controlled trial

    Clin. Trials Regul. Sci. Cardiol.

    (2017)
  • M. Ojha et al.

    Overlapping structure features selection in linear and non-linear QSAR

    J. Pharm. Res.

    (2013)
  • S. Paul et al.

    Simultaneous feature selection and weighting - an evolutionary multi-objective optimization approach

    Pattern Recognit. Lett.

    (2015)
  • S.N. Qasem et al.

    Radial basis function network based on time variant multi-objective particle swarm optimization for medical diseases diagnosis

    Appl. Soft Comput.

    (2011)
  • Z. Qin et al.

    QSAR studies of the bioactivity of hepatitis C virus (HCV) NS3/4A protease inhibitors by multiple linear regression (MLR) and support vector machine (SVM)

    Bioorg. Med. Chem. Lett.

    (2017)
  • M.K. Sohrabi et al.

    A comprehensive study on the effects of using data mining techniques to predict tie strength

    Comput. Hum. Behav.

    (2016)
  • M.K. Sohrabi et al.

    Parallel set similarity join on big data based on locality-sensitive hashing

    Sci. Comput. Progr.

    (2017)
  • M.K. Sohrabi et al.

    Efficient colossal pattern mining in high dimensional datasets

    Knowl. Based Syst.

    (2012)
  • M.K. Sohrabi et al.

    Parallel frequent itemset mining using systolic arrays

    Knowl. Based Syst.

    (2013)
  • M.K. Sohrabi et al.

    Frequent itemset mining using cellular learning automata

    Comput. Hum. Behav.

    (2017)
  • S. Soltani et al.

    QSAR analysis of diaryl COX-2 inhibitors: comparison of feature selection and train-test data selection methods

    Eur. J. Med. Chem.

    (2010)
  • Z. Wang et al.

    A multi-objective evolutionary algorithm for feature selection based on mutual information with a new redundancy measure

    Inf. Sci.

    (2015)
  • B. Yet et al.

    Decision support system for Warfarin therapy management using Bayesian networks

    Decis. Support Syst.

    (2013)