
Evolutionary binary feature selection using adaptive ebola optimization search algorithm for high-dimensional datasets

  • Olaide N. Oyelade,

    Roles Conceptualization, Data curation, Investigation, Methodology, Software, Validation, Writing – original draft

    Affiliation Department of Computer Science, Faculty of Physical Sciences, Ahmadu Bello University, Zaria, Nigeria

  • Jeffrey O. Agushaka,

    Roles Methodology, Resources, Writing – original draft, Writing – review & editing

    Affiliation Unit for Data Science and Computing, North-West University, Potchefstroom, South Africa

  • Absalom E. Ezugwu

    Roles Conceptualization, Methodology, Resources, Supervision, Validation, Writing – review & editing

    absalom.ezugwu@nwu.ac.za

    Affiliation Unit for Data Science and Computing, North-West University, Potchefstroom, South Africa

Abstract

Feature selection requires approximate algorithms to identify discriminative and optimally combined features. The evaluation and suitability of these selected features are often analyzed using classifiers. These features are embedded in data increasingly generated from sources such as social media, surveillance systems, network applications, and medical records. The high dimensionality of these datasets often impairs the quality of the optimal combination of the features selected. The use of binary optimization methods has been proposed in the literature to address this challenge. However, the underlying deficiency of a single binary optimizer is transferred to the quality of the features selected. Though hybrid methods have been proposed, most still suffer from the inherited design limitations of the single methods they combine. To address this, we propose a novel hybrid binary optimization capable of effectively selecting features from increasingly high-dimensional datasets. The approach used in this study designs a sub-population selective mechanism that dynamically assigns individuals to a 2-level optimization process. The level-1 method first mutates items in the population and then reassigns them to a level-2 optimizer. The selective mechanism determines which sub-population is assigned to the level-2 optimizer based on the exploration and exploitation phases of the level-1 optimizer. In addition, we designed nested transfer (NT) functions and investigated their influence on the level-1 optimizer. The binary Ebola optimization search algorithm (BEOSA) is applied for the level-1 mutation, while the simulated annealing (SA) and firefly (FFA) algorithms are investigated for the level-2 optimizer. The outcomes are HBEOSA-SA and HBEOSA-FFA, which use the NT functions, and their corresponding variants HBEOSA-SA-NT and HBEOSA-FFA-NT, to which no transfer function is applied. The hybrid methods were experimentally tested over high-dimensional datasets to address the challenge of feature selection, and a comparative analysis was done on the methods to obtain performance variability with the low-dimensional datasets. Classification accuracies obtained for large, medium, and small-scale datasets are 0.995 using HBEOSA-FFA, 0.967 using HBEOSA-FFA-NT, and 0.953 using HBEOSA-FFA, respectively. Fitness and cost values relative to large, medium, and small-scale datasets are 0.066 and 0.934 using HBEOSA-FFA, 0.068 and 0.932 using HBEOSA-FFA, and 0.222 and 0.970 using HBEOSA-SA-NT, respectively. Findings from the study indicate that HBEOSA-SA, HBEOSA-FFA, HBEOSA-SA-NT, and HBEOSA-FFA-NT outperformed the BEOSA.

1. Introduction

Recent technological advances have led to an increase in the amount of data generated and stored. The increase is in both the volume and nature of the data, which usually has large dimensions or feature sets, outliers, skewness, missing values, redundant features, irrelevant data, integration, and heterogeneity [1, 2]. This increase significantly reduces a classifier's accuracy, and the ability to manipulate the data decreases, too [3]. Hence the need for tools that can handle this volume of data. The issue of datasets with large dimensions and redundant or irrelevant features can be solved using feature selection methods [4]. These methods aim to reduce the number of features to the barest minimum without information loss [5]. Feature selection (FS) methods have been successfully applied in many domains, including computational medicine [6, 7], clustering [8, 9], intrusion and spam detection [10–13], and genomics [14].

The methods for solving FS problems are broadly classified into filter-based, wrapper-based, and embedded-based methods. The filter-based methods reduce the number of features by assessing the features based on similarity, distance measures, information loss or gain, consistency, and statistical measures, and then ranking the features based on these criteria [15]. The merit and demerit of the filter method are low computational cost and low performance, respectively. The wrapper-based methods perform feature reduction using a predetermined learning algorithm that evaluates all possible feature subsets to find the optimal one [16]. The wrapper has the advantage of providing higher classification accuracy than the others. Finally, the embedded methods are hybrids of the wrapper- and filter-based methods. They combine the advantages of both and incorporate the optimal feature search into the classifier training process [17].

Feature selection is an NP-hard problem because it involves finding an optimal subset out of the 2^N subsets of a dataset with N features. Approximate algorithms such as metaheuristic algorithms have been used to find an optimal subset among near-optimal subsets heuristically [18, 19]. Just like in other areas of application of metaheuristic algorithms, such as engineering problems [20, 21] and scheduling problems [22, 23], significant successes have been recorded in the area of FS [24, 25]. Emary et al. [26] used the wrapper-based method to propose two versions of the binary grey wolf optimizer (bGWO) that use stochastic crossover among the three best solutions and the S-shaped transfer function. The proposed methods were used to solve the FS problem. In the same two-way approach of converting the continuous search space to a binary one, Mafarja et al. [27] proposed a wrapper-based binary grasshopper optimization algorithm (BGOA) framework that uses the S-shaped and V-shaped transfer functions in the first instance and combines the finest solutions found so far. Their approach was used to solve the FS problem. The FS solution proposed by [28], called the improved sine cosine algorithm (ISCA), introduced an elitism technique and a solution update mechanism that help select an optimal feature subset and increase classification accuracy. The authors of [29] used different variants of S-shaped and V-shaped transfer functions to develop eight binary variants of the newly proposed emperor penguin optimizer to solve the FS problem.

The dwarf mongoose optimization (DMO) proposed by [30] has been gaining attention from the metaheuristic research community. It was improved into a DMO-based secure clustering scheme combined with a multi-hop routing scheme (DMOSC-MHRS) to solve the clustering problem [31]. The binary version of DMO (BDMO) was developed by [24] and applied to solve the multiclass high-dimensional feature selection problem. Simulated annealing (SA) was used to improve the local search mechanism of the BDMO in a hybrid of the two algorithms, which was used to solve the FS problem [3]. The Ebola optimization search algorithm (EOSA) [32] is a recently proposed swarm-based metaheuristic algorithm inspired by the propagation of the Ebola virus disease. It was used to solve 47 classical benchmark functions and significantly outperformed seven well-known state-of-the-art algorithms used for comparison. The binary version of EOSA, called BEOSA, was proposed by [1] and used two newly formulated S-shape and V-shape transfer functions to investigate mutations of the infected population in the exploitation and exploration phases. The result of applying BEOSA to 22 benchmark datasets consisting of low, medium, and high-dimensional data shows that BEOSA and its variant BIEOSA significantly outperform the other known FS methods used for the comparisons.

Interestingly, hybrids of these algorithms have proven to yield better performance in adequately solving real-life problems. For instance, in [33], the authors hybridized Harris hawk optimization (HHO) with grey wolf optimization (GWO) to enable unmanned aerial vehicles (UAVs) to avoid obstacles during payload hold-release missions. Similarly, a study in [34] demonstrated the use of a hybrid AI-based approach for solving real-life problems, and hybrid binary optimizers have also been proposed in the literature to address the feature selection problem. However, no study has investigated the effect of integrating, or omitting, both nested transfer functions and the threshold method in solving this same problem. We consider that investigating this design pattern through a hybrid of binary optimizers contributes to the current research focus in the domain. Moreover, this study is motivated by the applicability and feasibility of using nested transfer functions and threshold methods to solve large high-dimensional datasets and to compare performance with low-dimensional datasets. Furthermore, the study was motivated to harness a recent binary optimizer (BEOSA), whose continuous variant remains one of the state-of-the-art metaheuristic algorithms, by leveraging its optimization structures in the approach proposed in this study.

This paper aims to overcome the curse-of-dimensionality difficulties in the FS domain by generating high-quality solutions. The hybrids are carefully aligned to improve the global and local searches to enhance the feature selection mechanism. Specifically, a novel hybrid binary optimization capable of effectively selecting features from increasingly high-dimensional datasets is proposed. The approach used in this study designs a sub-population selective mechanism that dynamically assigns individuals to a 2-level optimization process. The binary Ebola optimization search algorithm (BEOSA) is applied for the level-1 mutation, while the simulated annealing (SA) and firefly (FFA) algorithms are investigated for the level-2 optimizer. The level-1 method first mutates items in the population and then reassigns them to a level-2 optimizer. The selective mechanism determines which sub-population is assigned to the level-2 optimizer based on the exploration and exploitation phases of the level-1 optimizer. In addition, we designed nested transfer (NT) functions and investigated the influence of the functions on the level-1 optimizer. The outcomes are HBEOSA-SA and HBEOSA-FFA, which use the NT functions, and their corresponding variants HBEOSA-SA-NT and HBEOSA-FFA-NT, to which no transfer function is applied. The hybrid methods were experimentally tested over high-dimensional datasets to address the challenge of feature selection. A comparative analysis was done on the methods to obtain performance variability with the low-dimensional datasets. The main contributions of this study are summarized as follows.

  • This study introduced a novel hybrid binary optimization capable of effectively selecting features from increasingly high-dimensional datasets.
  • The approach is based on a 2-level optimization process where a sub-population selective mechanism dynamically assigns individuals to the 2-level optimizer.
  • The binary Ebola optimization search algorithm (BEOSA) is used as the level-1 mutation operator, while the simulated annealing (SA) and firefly (FFA) algorithms are used as the level-2 optimizers, yielding HBEOSA-SA and HBEOSA-FFA, respectively.
  • A novel nested transfer (NT) function is designed, and its influence on the level-1 optimizer is investigated, with threshold-based variants called HBEOSA-SA-NT and HBEOSA-FFA-NT, which apply no transfer function, derived for comparison.
  • The hybrid methods were experimentally tested over high-dimensional datasets to address the challenge of feature selection.
  • A comparative analysis was done on the methods to obtain performance variability with the low-dimensional datasets.

The rest of this manuscript is structured as follows: Related works are presented in Section 2. Section 3 discusses the methodology used in this study. The details about the datasets and performance metrics are presented in Section 4. Section 5 presents the experiments’ results and discusses this study’s findings. Finally, Section 6 provides the conclusion and possible future work.

2. Related works

There are many FS approaches in the literature that employ metaheuristic optimization methods [35]. In practice, an FS approach involves any of the Filter, Wrapper, Embedded, and Hybrid approaches. The Hybrid approach combines the best features of the Filter and Wrapper approaches into one approach. The Wrapper and Hybrid approaches each have different ways of using metaheuristic algorithms for FS. A metaheuristic algorithm is either adopted wholly as it is, modified (improved to tackle FS peculiarities), or hybridized (combining the best features of two or more metaheuristic algorithms). The terms hybrid and hybridize refer to different things: hybrid refers to an FS approach, while hybridize refers to combining the best features of two or more metaheuristic algorithms.

This review starts with approaches that adapt or modify a metaheuristic algorithm for FS problems. Two novel binary algorithms based on the butterfly optimization algorithm (BOA) used the wrapper method to find the optimum features for efficiently classifying objects. The performance of the proposed approach was tested using over 21 datasets from the UCI repository and compared with four high-performance optimization algorithms [36]. Similarly, a dynamic butterfly optimization algorithm (DBOA) was proposed by enhancing the BOA with a local search algorithm based on mutation (LSAM). The enhancement prevents the BOA from getting stuck in local minima and was tested using 20 datasets from the UCI repository. The results show that DBOA outperforms the candidate algorithms used in the study [37].

Different versions of artificial butterfly optimization (ABO) were proposed by [38]. The first version is used for single-objective optimization, and the second and third are used for multi- and many-objective FS optimization. The study was validated using 8 publicly available datasets, and the results showed the superiority of the proposed algorithms. An FS strategy using particle swarm optimization (PSO) to improve text clustering, called FSPSOTC, was proposed by [39]. The performance of FSPSOTC was tested using six regular text datasets characterized by an assortment of features. The findings showed that FSPSOTC could assemble informative features by generating a subgroup of written descriptive features.

The authors of [40] proposed a novel binary butterfly optimization algorithm based on information gain (IG-bBOA) to address the redundancy and feature-relevancy shortcomings of the S-shaped binary butterfly optimization algorithm (S-bBOA). Six routine UCI repository datasets were used to test the proposed FS method's performance. The results showed the superiority of the proposed method over the other methods used for comparison. In [41], four text representation methods were used before a genetic algorithm (GA) was applied to select the optimal set of features. The text representation methods used are the bag of words (BOW), N-gram, stemming, and conceptual representation.

Similar studies [42–44] used metaheuristic algorithms to find the optimal subset of features from text data found in three benchmark datasets. Specifically, invasive weed optimization (IWO) was used to find the optimal subset of features, and its accuracy was evaluated using the NB classifier; the study was compared with PSO and GA [42]. In [43], all significant features are weighted using various Term Frequency (TF) methods consisting of TF, NORMTF, LOGTF, ITF, and SPARCK. The flower pollination algorithm (FPA) was then used to select the optimal set of features, and its accuracy was tested using the Ada-boost algorithm. Finally, in [44], the crow search algorithm (CSA) and KNN were used as the FS method and classifier, respectively.

Now the approaches that hybridize different metaheuristic algorithms are discussed. The goal is to create a robust method to select the relevant and optimum feature subset from the large feature sets in the original dataset. The authors of [45] combined the best features of the artificial bee colony (ABC) and bacterial foraging optimization (BFO) to form a wrapper-based hybrid called HABBFO. The hybridized HABBFO is then applied to select the most significant feature subset from the Reuters dataset, which is later used for prediction. The optimal feature subset is fed to an ANN, which performs the multi-label classification.

A three-step classification model was proposed by [46]. The authors hybridized the grasshopper optimization algorithm (GOA) and the crow search algorithm (CSA) to obtain a robust algorithm called GCOA, which is used for the FS process. The vector space model (VSM) extracts features, and a Deep Belief Network (DBN) is used for text categorization (TC). Another hybridization, of ant colony optimization (ACO) and GA, called ACOGA, was proposed by [47]. The hybrid was used as the FS method and KNN as the classifier.

It is common knowledge that the major disadvantage of the wrapper-based FS approaches is the high cost of computational resources. The process of optimal feature subset identification is deeply embedded in the randomization mechanism of the algorithms. Many researchers have proposed a hybrid of intelligent optimization algorithms with traditional FS methods as a solution. This form of hybrid works by first performing preprocessing tasks that prune the data’s high dimension using any filter method. It then uses the wrapper-based metaheuristic method, which refines the selected feature subsets.

The authors [48] used the information gain (IG) and chi-square statistic (CHI) to preselect relevant feature subsets. Then, the preselected feature subset is further refined using a small-world optimization algorithm (SWA) to get the optimal feature subset. The KNN and SVM are used for text classification. In [49], the feature selection process is carried out in two phases. The filter method consisting of correlation (CO), information gain (IG), gain ratio (GR), and symmetrical uncertainty (SU) was used for preprocessing, while the wrapper-based PSO algorithm was used to refine the preselected feature subsets. The NB classifier was used to evaluate the optimally selected feature subset.

Likewise, [50] proposed a hybrid FS method that used the Normalized Difference Measure (NDM) as a filter-based method and a wrapper-based Binary Jaya Optimization (BJO). The hybrid, called NDM-BJO, was used for the dimensionality reduction of the feature space. The authors evaluated the selected feature subset using the NB and SVM. In [51], the Sine Cosine Algorithm (SCA) was improved into the ISCA for feature selection. However, the authors first used an information gain (IG) filter to rank the features and select the highest-ranked ones, thereby reducing the high dimensionality. The NB algorithm was then used to validate the ISCA-selected feature subset.

The authors of [52] modified the gaining sharing knowledge-based optimization algorithm (GSK) using a probability estimation operator, called Bi-GSK, to find the best feature subsets. The performance of Bi-GSK was enhanced using ten chaotic maps. The performance of these improved feature selection algorithms on twenty-one benchmark datasets taken from the UCI repository was compared with other existing algorithms, which showed that the Chebyshev chaotic map has the best result among all the chaotic maps. Similarly, the authors of [53] used eight S-shaped and V-shaped transfer functions to binarize the GSK. The same datasets as the previous authors were used, and the V4 transfer function outperformed the other optimizers in terms of accuracy, fitness values, and the minimal number of features. The binary GSK has succeeded in other areas, such as the knapsack problem [54] and fault section location in distribution networks [55]. A decade-long survey of metaheuristic algorithms for feature selection (2009–2019) was presented in [56].

Undoubtedly, the use of metaheuristic algorithms for FS problems has been successful. However, it also comes with challenges, such as multi-objectivity, dynamicity, constraint, and uncertainty. Multi-objectivity implies multiple objectives that can be conflicting, and tradeoffs or Pareto optimal sets are needed for successful optimization. Uncertainty implies that the position of the global solution changes frequently. This scenario would require careful handling by these algorithms. The nature of the problem search space could lead to local minima stagnation and many more. The challenges of exploration and exploitation are enormous. They both serve conflicting purposes since increasing exploration may mean decreasing exploitation. Also, there is no clearly defined milestone for transiting between the two.

3. Methodology

The approach applied for the design of the proposed hybrid algorithms is presented in this section. First, the optimization process demonstrating how other algorithms are incorporated into the BEOSA method is presented. This model is further detailed using mathematical models. The design process also showed how each candidate solution is evaluated to obtain the best solution. Meanwhile, the transfer functions that support the binary optimizer are also detailed.

3.1 The hybrid HBEOSA model

The BEOSA [57] is a recent binary optimizer derived from the EOSA metaheuristic [58] and its immunity-based variant IEOSA [35]. The foundational design of the EOSA method was inspired by the Ebola virus and its propagation. The base algorithm follows the susceptible, infected, recovered, exposed, hospitalized, vaccinated, quarantined, and dead (SIREHVQD) model. In this study, we leverage the EOSA and BEOSA to derive a new hybrid, HBEOSA. The methodology follows a two-level (2-level) optimization approach using a novel nested transfer function. In this section, we describe the design of a level-1 optimizer using the BEOSA and then derive new methods through the integration of the SA and FFA algorithms as the level-2 optimizer.

As inherited by the hybrid methods proposed in this study, an individual in the population initialization for the search space of BEOSA is determined by Eq (1), while Eq (2) gives the entire population (S) of size N.

ind_i = sp        (1)

S = {ind_1, ind_2, …, ind_N}        (2)

where sp = s(D, rnd(mn, mx)); mn and mx represent the values 1 and ⌊0.5*D⌋, respectively; D is the dimension of each ind_i in the population; rnd() returns a random positive non-zero integer within the range of its parameters, and s is a sampling function that samples and returns a value within the range [0, D].
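As an illustration, the following minimal Python sketch shows one way Eqs (1) and (2) could be realized for a binary search space; the function and variable names are ours, and the sampling rule simply sets a random number of positions (between mn and mx) of each individual to 1.

```python
import numpy as np

def init_population(psize, D, rng=None):
    """Sketch of Eqs (1)-(2): build a binary population of size N = psize.

    Each individual is a D-dimensional 0/1 vector in which between
    mn = 1 and mx = floor(0.5 * D) randomly chosen positions are set to 1.
    """
    rng = rng or np.random.default_rng()
    mn, mx = 1, max(1, D // 2)
    S = np.zeros((psize, D), dtype=int)
    for i in range(psize):
        k = rng.integers(mn, mx + 1)           # rnd(mn, mx)
        ones = rng.choice(D, size=k, replace=False)
        S[i, ones] = 1                         # sampled feature subset
    return S
```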

An individual in S is positioned within a space and is allowed to move around to demonstrate the concept of infectiousness so that the individual can transit to the infected (I) compartment. As a result, position update for every indi in the system is computed using Eq (3).

ind_i^(t+1) = ind_i^t + ρ · rand(−1|0|1)        (3)

where ρ represents the scale factor of displacement of an individual, and ind_i^t and ind_i^(t+1) are the original and updated positions at times t and t+1, respectively. The rand(−1|0|1) randomly yields a value that can be -1, 0, or 1, with each denoting movement leading to covered, intensification, and exposed displacements, respectively.

Only individuals exposed and infected are mutated, as represented using Eq (4). In the equation, the Δ notation denotes the change factor of an individual, rand represents a randomly generated uniform number in the range [−1, 1], and gbest represents the current global best solution.

ind_i^(t+1) = ind_i^t + Δ · rand(−1, 1) · (gbest − ind_i^t)        (4)
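A compact Python sketch of the movement and mutation steps follows. Eq (3) is transcribed directly from the description above; for Eq (4) we assume a pull-toward-gbest form built from the components named in the text (Δ, rand in [−1, 1], and gbest), so the mutate helper should be read as illustrative rather than as the authors' exact update.

```python
import numpy as np

rng = np.random.default_rng()

def move(ind, rho=1.0):
    """Eq (3) sketch: displace an individual by rho * rand(-1|0|1)."""
    step = rng.choice([-1, 0, 1])              # covered / intensification / exposed
    return ind + rho * step

def mutate(ind, delta, gbest):
    """Eq (4) sketch (assumed form): pull an exposed/infected individual
    toward the global best, scaled by the change factor delta and a
    uniform random number in [-1, 1]."""
    r = rng.uniform(-1, 1)
    return ind + delta * r * (gbest - ind)
```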

3.1.1 Simulated Annealing (SA).

The SA algorithm is considered the first method to hybridize with BEOSA for performance improvement. We take advantage of the core part of SA, which uses Eq (5) to update the current global best in the population by replacing ind_0 with ind_k if Δf returns a value less than zero; otherwise, we compute p_f with Eq (6) and check whether the condition rand < p_f is satisfied to confirm if ind_k is retained as the global best solution.

Δf = f(ind_k) − f(ind_0)        (5)

p_f = e^(−Δf/T)        (6)
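In code, this acceptance rule is the standard Metropolis criterion; a minimal sketch is given below, where T (the current annealing temperature) is a parameter we assume from standard SA.

```python
import numpy as np

rng = np.random.default_rng()

def sa_accept(f_old, f_new, T):
    """Metropolis rule of Eqs (5)-(6): accept ind_k over ind_0 when
    delta_f < 0; otherwise accept with probability p_f = exp(-delta_f / T)."""
    delta_f = f_new - f_old
    if delta_f < 0:
        return True                            # strictly better: always accept
    return rng.random() < np.exp(-delta_f / T)
```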

3.1.2 Firefly Algorithm (FFA).

The FFA, sometimes referred to as FA, is the second algorithm investigated for the hybridization process. The mutation of individuals in the algorithm is achieved using Eq (7).

ind_i^(t+1) = ind_i^t + β_0 · e^(−γ·r^2) · (ind_j^t − ind_i^t) + 0.05 · urand        (7)

where r is the radius between two individuals, computed with the Frobenius norm function F as r = ‖ind_i − ind_j‖_F, and β_0 · e^(−γ·r^2) is the attraction level; urand represents a uniform random number in the range [0, 1], and 0.05 · urand is computed as the mutation vector.
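A short Python sketch of this firefly move follows; β_0 and γ are standard FFA parameters whose experimental values we do not know, so the defaults here are placeholders.

```python
import numpy as np

rng = np.random.default_rng()

def ffa_move(ind_i, ind_j, beta0=1.0, gamma=1.0):
    """Eq (7) sketch: move individual i toward a brighter firefly j.
    r is the Frobenius norm of the difference; 0.05*urand is the mutation vector."""
    r = np.linalg.norm(ind_i - ind_j)          # Frobenius norm of the difference
    beta = beta0 * np.exp(-gamma * r ** 2)     # attraction level
    return ind_i + beta * (ind_j - ind_i) + 0.05 * rng.uniform(0, 1, ind_i.shape)
```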

3.1.3 Hybrid BEOSA (HBEOSA).

The optimization process described by the hybrid model is illustrated using Fig 1. We note that while the BEOSA initializes S, generates the number of infected to allocate to Q, and exposes a certain fraction of S to I, only during the infection stage is the integration of either SA or FFA applicable. Note that the hybrid allows for either individual in S to be further optimized with SA/FFA during the exploration phase of BEOSA, or we optimize the individuals of I using SA/FFA during the exploitation phase of BEOSA.

Fig 1. An optimization process of the proposed hybrid BEOSA (HBEOSA) combining both SA and FFA methods into BEOSA.

https://doi.org/10.1371/journal.pone.0282812.g001

Therefore, the hybrid model follows the mathematical model in Eq (8).

h(ind_i^t) = optimize(⊕(ind_i^t)), ∀ ind_i ∈ S ∪ I        (8)

where h(ind) represents the hybrid function, which generates a set of individuals optimized by the two methods with the BEOSA being the base method; the ⨁ and optimize() functions represent the BEOSA and SA/FFA optimization operators, respectively, and ind_i is an element in the set S or I at any time t.
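Read this way, Eq (8) amounts to composing the two operators over a sub-population; a minimal Python sketch, in which the operator names are placeholders for the BEOSA mutation and the SA/FFA refinement, is:

```python
def hybrid_step(pop, beosa_op, level2_op):
    """Eq (8) sketch: apply the BEOSA operator first, then refine the
    mutated individuals with the level-2 optimizer (SA or FFA)."""
    return [level2_op(beosa_op(ind)) for ind in pop]
```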

The same fitness function is applied for evaluating solutions in the population used in the BEOSA, SA, and FFA algorithms. This fitness function is described by Eq (9), which evaluates a solution based on its performance with a given classifier clf on a subset of the dataset and the application of the control parameter ω. The notation |F|, as used in the equation, returns the number of 1s in the array representing the individual ind_i, that is, the number of features selected, while D represents the dimension of the features in the dataset X. For experimental purposes, the value of 0.99 was used for ω.

fitness(ind_i) = ω · err(clf, X_F) + (1 − ω) · |F|/D        (9)

Another evaluation function, known as the cost function and described in Eq (10), was applied to check the cost-effectiveness of a potential solution, whereas the outcome of the previous equation demonstrates a solution's fitness.

cost(ind_i) = ω · acc(clf, X_F) + (1 − ω) · (1 − |F|/D)        (10)
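A minimal wrapper-style sketch of Eqs (9) and (10) is shown below. The choice of classifier (KNN) and of cross-validation is our assumption, since the text only states that a classifier clf evaluates the selected subset; likewise, Eq (10) is treated as the complement of Eq (9), which is broadly consistent with the fitness/cost pairs reported in the abstract.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def fitness(ind, X, y, omega=0.99):
    """Eq (9) sketch: omega-weighted classification error plus feature ratio.
    The KNN classifier is illustrative only."""
    F = np.flatnonzero(ind)                    # indices of selected features
    if F.size == 0:
        return 1.0                             # worst fitness: nothing selected
    clf = KNeighborsClassifier(n_neighbors=5)
    acc = cross_val_score(clf, X[:, F], y, cv=5).mean()
    D = ind.size
    return omega * (1 - acc) + (1 - omega) * F.size / D

def cost(ind, X, y, omega=0.99):
    """Eq (10) sketch (assumed complement of Eq (9)): note that
    omega*acc + (1-omega)*(1 - |F|/D) equals 1 - fitness algebraically."""
    return 1.0 - fitness(ind, X, y, omega)
```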

In this study, we propose a novel approach to the design and use of transfer functions in binary optimization methods. The popular S, V, Z, and Q shapes have been reported and used in the literature. Nevertheless, we consider that a novel optimization and transformation outcome can be achieved using a nested transfer function. As a result, we modeled eight different transfer functions taking a cue from the basic S and V functions applied in our recent study [57]. In that previous study, we proposed using the S1 and S2 for the S-family and the V1 and V2 for the V-family transfer function. The first four transfer functions are categorized into the S-V function, while the other category is named the V-S function. In both categories, the nesting of the second term is achieved in the first term.

In Eq (11), we have the S1(V1) transfer function, which first applies an arbitrary ind_i to the V1 function; the outcome is then applied to the S1 function. A similar operation is designed for the S2(V1), S1(V2), and S2(V2) transfer functions, which are defined in Eqs (12)–(14).

SV_1(x) = S1(V1(x))        (11)

SV_2(x) = S2(V1(x))        (12)

SV_3(x) = S1(V2(x))        (13)

SV_4(x) = S2(V2(x))        (14)

Also, for the V-family, which is nested with the S-family, we show in Eqs (15)–(18) the definitions for the V1(S1), V2(S1), V1(S2), and V2(S2) transfer functions.

VS_1(x) = V1(S1(x))        (15)

VS_2(x) = V2(S1(x))        (16)

VS_3(x) = V1(S2(x))        (17)

VS_4(x) = V2(S2(x))        (18)
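The nesting can be expressed compactly in code. In the sketch below, the base S1, S2, V1, and V2 functions are standard textbook forms that we assume for illustration; the exact formulations in [57] may use different scaling constants, but the composition pattern of Eqs (11)–(18) is unchanged.

```python
import numpy as np

# Base transfer functions (standard forms assumed; the S1, S2, V1, V2 of
# the BEOSA paper may differ in their scaling constants).
S1 = lambda x: 1.0 / (1.0 + np.exp(-x))
S2 = lambda x: 1.0 / (1.0 + np.exp(-x / 2.0))
V1 = lambda x: np.abs(np.tanh(x))
V2 = lambda x: np.abs(x / np.sqrt(1.0 + x ** 2))

# Nested S-V family, Eqs (11)-(14): apply a V function first, then an S function.
SV = [lambda x, s=s, v=v: s(v(x)) for v in (V1, V2) for s in (S1, S2)]

# Nested V-S family, Eqs (15)-(18): apply an S function first, then a V function.
VS = [lambda x, v=v, s=s: v(s(x)) for s in (S1, S2) for v in (V1, V2)]
```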

Plotting the graphs of the eight newly derived nested transfer functions, we discovered an interesting shape that promises to impact how the functions are applied to solutions in the search space, thereby enhancing the optimization outcome. In Fig 2(A), we graphed the S1 and S2 transfer functions, which form the basis of the four derived functions shown in Fig 2(B). While the original S-shaped transfer functions trace the familiar S curve, the derived functions illustrate different versions of a V-shape with an inherent S-shape. We found this very interesting and suitable for testing the proposed hybrids of BEOSA. Similarly, the original V-shaped graphs consisting of the V1 and V2 transfer functions are shown in Fig 2(C). The derived nested transfer functions resulting from these are displayed in Fig 2(D), where the plot curves follow the S-pattern, though with an inherent V-shape.

Fig 2.

A graphical chart of the values of (a) two variants of S transfer functions, (b) four variants of S(V) transfer functions, (c) two variants of V transfer functions, and (d) four variants of V(S) transfer functions.

https://doi.org/10.1371/journal.pone.0282812.g002

We demonstrate the applicability of the proposed nested transfer functions in Algorithm 1, which details the design of the hybrid algorithms.

3.2 Algorithmic and procedural flow of HBEOSA

The algorithmic design of the hybrid BEOSA methods is detailed in this sub-section, with emphasis on the use of the transfer functions as well as the branching from the BEOSA flow to the hybrids. In the algorithm, both the SA and FFA methods are used for the hybrid to achieve what are referred to as HBEOSA-SA and HBEOSA-FFA. This study also investigates the possible performance of the hybrids when the derived transfer functions are used, and what the likely output would look like should the hybrids simply use a threshold approach with no transfer function. Hence, when the transfer functions are not used, new sets of hybrids are obtained, namely HBEOSA-SA-NT and HBEOSA-FFA-NT, where the NT acronym denotes non-transfer-function usage.

In Algorithm 1, the input and expected output for the hybrid algorithm are listed in Lines 1–2, while the body of the algorithm is listed in Lines 3–38. The initialization of the population and the assignment of the index case of the infection in the population are described in Lines 4–5. Recall that the proposed method is designed either to use the derived nested transfer functions or not, depending on the isThreshold control parameter assigned on Line 6. When the value of this parameter is set to false (0), the transfer functions are applied and the HBEOSA-SA and HBEOSA-FFA algorithms are obtained; otherwise, we derive HBEOSA-SA-NT and HBEOSA-FFA-NT from Algorithm 1.

Algorithm 1. HBEOSA method.

1 Input: maxIter, psize, srate, lrate, dim
2 Output: gbest
3 begin
4   S = Initialize and binarize population (psize) as S
5   I, gbest ← S[0], S[0]
6   isThreshold = rand(1|0)
7   while e < maxIter and size(I) > 0 do:
8     Compute individuals to be quarantined
9     I = difference of current infected cases (I) from quarantined cases
10    for i in 1 to size(I) do:
11      generate new infected (nI) cases from S
12      mutate nI_i using Eq (4)
13      if !isThreshold:
14        for j in 1 to dim do:
15          randomly generate d between 1|0
16          if displacement(nI_i) > 0.5 do:
17            update size of nI using srate
18            s = S(V(nI_i,j)) using one of Eqs (11)-(14)
19            if s >= rand do:
20              nI_i,j = 1
21            else:
22              nI_i,j = 0
23          else:
24            update size of nI using lrate
25            t = V(S(nI_i,j)) using one of Eqs (15)-(18)
26            if t >= rand do:
27              nI_i,j = 1
28            else:
29              nI_i,j = 0
30      if displacement(nI_i) < 0.5:
31        nI = SA(nI) | FFA(nI)
32      else:
33        S = SA(S) | FFA(S)
34      Evaluate new fitness of nI_i
35      I ← nI
36    Update gbest and compartment variables
37  return gbest
38 end

The iterative process describing the optimization is outlined in Lines 7–36, starting with the while structure, which has a conditional statement to terminate the loop. Following this is the assignment of some infected (I) cases to the quarantine (Q) compartment, as shown in Lines 8–9. It is desired that every infected case has the potency to infect new cases from the susceptible (S) compartment; this is described with the first for loop structure and specifically modeled in Lines 11–12. The branching off from using the derived nested functions is shown in Line 13, so that the design of HBEOSA-SA and HBEOSA-FFA is listed only between Lines 14–29. The use of the transfer function in the exploration and exploitation phases of the algorithm is shown in Lines 18 and 25, respectively. The return values for s and t, as conditioned on a randomly generated number, determine whether a 1 or 0 is assigned to the nI_i,j element of the infected case being transformed. Note the use of SA and FFA to optimize nI and S in Lines 31 and 33. This demonstrates that only when the algorithm is in the exploitation phase is either SA or FFA applied to optimize individuals in the newly infected nI compartment; otherwise, the entire population remaining in S is optimized.
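To make the two decision points concrete, the sketch below restates Lines 13–33 in Python. The 0.5 threshold rule for the -NT variants and the displacement test are written as we read them from the algorithm, so both helpers are illustrative rather than exact.

```python
import numpy as np

rng = np.random.default_rng()

def binarize_dim(value, transfer, is_threshold):
    """Lines 13-29 (sketch): flip one dimension of a mutated individual.
    With a nested transfer function, the transformed value is compared
    against a random number; the -NT variants use a plain 0.5 threshold
    (assumed form of the threshold rule)."""
    if not is_threshold:                       # HBEOSA-SA / HBEOSA-FFA
        return 1 if transfer(value) >= rng.random() else 0
    return 1 if value >= 0.5 else 0            # HBEOSA-SA-NT / HBEOSA-FFA-NT

def level2_dispatch(displacement, nI, S, level2):
    """Lines 30-33 (sketch): in the exploitation phase the newly infected
    sub-population is refined by SA/FFA; otherwise the susceptibles are."""
    if displacement < 0.5:
        return level2(nI), S                   # exploitation phase
    return nI, level2(S)                       # exploration phase
```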

The flowchart further detailing the data flow within the algorithm described above is shown in Fig 3. We differentiate the flowchart of HBEOSA from that of BEOSA using some colored boxes. The highlighted boxes show the use of the isThreshold control parameter and the mutation of the newly infected case nI_i. Once the mutation operation is applied, the isThreshold parameter is tested to determine whether the flowchart branches to run HBEOSA-SA and HBEOSA-FFA or HBEOSA-SA-NT and HBEOSA-FFA-NT. Note also the highlighting of the boxes showing the derived transfer functions and the use of the SA/FFA methods for further optimization.

Fig 3. The flowchart showing the optimization process of the hybrid methods using BEOSA as the base algorithm and SA and FFA as integrated algorithms.

https://doi.org/10.1371/journal.pone.0282812.g003

The method described in this section demonstrates the proposed hybrid BEOSA (HBEOSA) algorithms. The hybrid algorithm is used to derive two methods, HBEOSA-SA and HBEOSA-FFA. Furthermore, we showed the design of novel nested transfer functions, whose applicability to solving the feature selection problem is tested against two variants of the proposed hybrids that do not use them, namely HBEOSA-SA-NT and HBEOSA-FFA-NT. The following sections discuss the datasets, experimentation, evaluation criteria, results, and discussion of the proposed method.

4. Datasets and evaluation metrics

The performance of the hybrids of the BEOSA algorithm is evaluated using publicly available datasets, which can be categorized into high-dimensional, medium-dimensional, and low-dimensional [59]. The high-dimensional datasets include WaveformEW, Sonar, PenglungEW, KrVsKpEW, Ionosphere, BreastEW, Prostate, Colon, and Leukemia. Those categorized in the medium-scale group include Zoo, Vote, SpectEW, Lymphography, and CongressEW. The Iris, Wine, Tic-tac-toe, M-of-n, HeartEW, Exactly, Exactly2, and BreastCancer datasets are grouped into the low-dimensional category. Details about the datasets used for the experimentation in this study are given in Table 1.

The experiments in this study were conducted using a personal computer (PC) with the following configuration: CPU, Intel® Core i5-4210U CPU 1.70 GHz, 2.40 GHz; RAM of 8 GB; Windows 10 OS. This was complemented with other computer systems having Intel® Core i5-4200, CPU 1.70 GHz, 2.40 GHz; RAM of 16 GB; 64-bit Windows 10 OS. The hybrid metaheuristic algorithms were implemented using Python 3.7.3 and supporting libraries, such as Numpy and other dependent libraries.

The comparative performances of all hybrid methods were considered under the following measures: classification accuracy, cost and fitness function values, the number of features selected, and computational time. Table 2 presents the parameter settings of the base algorithm (BEOSA) used to derive the hybrids (HBEOSA-SA, HBEOSA-SA-NT, HBEOSA-FFA, HBEOSA-FFA-NT).

5. Results and discussion

The results of the experimentation carried out in the study are presented in this section. Emphasis is placed on a comparative approach in the presentation of the outcome. Accordingly, the comparative performances of all hybrid methods were considered under the following measures: classification accuracy, cost and fitness function values, the number of features selected, and, lastly, computational time. Tabular and graph-based result outlines are shown for the HBEOSA-SA, HBEOSA-SA-NT, HBEOSA-FFA, HBEOSA-FFA-NT, and BEOSA algorithms. This is motivated by the need to observe the performance of the proposed hybrids of BEOSA to allow for an investigative report of these performances and their suitability for practical applicability. Furthermore, to discover the performance of each algorithm with respect to population variation, we subjected the experimentation to both 50 and 100 population sizes for every run using 50 iterations. The section concludes by highlighting the study findings based on the metrics supporting the feature selection process.

The investigation of the hybrids of the BEOSA algorithm is considered under the categorization of the datasets into the high-dimensional, medium-dimensional, and low-dimensional groups described in Section 4.

5.1 Comparative analysis of features count by hybrid methods

The evaluation of the number of features selected by the HBEOSA-SA, HBEOSA-SA-NT, HBEOSA-FFA, HBEOSA-FFA-NT, and BEOSA algorithms is discussed in this sub-section. Considering that the study aims to observe the best-performing method based on the number of features selected, we highlight methods with the most optimal number of features and mention those with the worst performance in terms of selected features. Table 3 shows a comparative listing of the feature counts reported for all datasets in the high-dimensional category. Some algorithms, namely HBEOSA-FFA and HBEOSA-SA, underperformed by returning a negligible number of features on almost all the datasets in the category. This is reflected in the table by those rows with a zero (0) value in the feature count column. The implication for the HBEOSA-FFA and HBEOSA-SA algorithms on those datasets is the unsuitability of the methods due to the integration of the transfer function in their design. On the other hand, the corresponding methods HBEOSA-FFA-NT and HBEOSA-SA-NT, which were not designed with transfer functions, performed well on all datasets in the high-dimensional category.

Table 3. Large-scale dataset comparative analysis using the number of features selected.

https://doi.org/10.1371/journal.pone.0282812.t003

The performances of HBEOSA-FFA-NT and HBEOSA-SA-NT on WaveformEW, Sonar, PenglungEW, KrVsKpEW, Ionosphere, BreastEW, Prostate, Colon, and Leukemia returned different results, which are worth considering. HBEOSA-FFA-NT performed better than HBEOSA-SA-NT, though BEOSA outperformed all methods on the BreastEW dataset. A similar performance is observed for Prostate, Leukemia, Ionosphere, and KrVsKpEW. With regard to the Colon dataset, almost the same pattern is obtained except for the inability of BEOSA to compete with its hybrids. The HBEOSA-SA-NT did well with the WaveformEW dataset, followed by HBEOSA-FFA-NT and BEOSA in that order. The result on the PenglungEW dataset showed that while HBEOSA-SA-NT still leads, BEOSA comes out better than HBEOSA-FFA-NT, which lags far behind in performance. The summary of all these performances reveals that HBEOSA-SA-NT and HBEOSA-FFA-NT are competitive, outperforming the basic BEOSA, and are much more applicable for extracting the optimal combination of features needed for classification purposes. Therefore, this shows that the use of the transfer function in binary optimization methods is not as significant as reported in the literature. Nevertheless, a careful hybridization of binary optimizers could yield better performance for a high-dimensional dataset.

The results obtained for the medium-dimensional datasets are listed in Table 4, where the following are considered: Zoo, Vote, SpectEW, Lymphography, and CongressEW. Similar to the observation noted for the high-dimensional datasets, we see that the performance of both HBEOSA-SA and HBEOSA-FFA was impaired due to the use of the transfer function, with CongressEW being an exception to this observation. On the contrary, their corresponding methods, HBEOSA-SA-NT and HBEOSA-FFA-NT, which did not use the transfer function, returned a good number of selected features. In all the datasets, we found BEOSA returning suboptimal feature counts compared with its hybrids HBEOSA-SA-NT and HBEOSA-FFA-NT. For instance, HBEOSA-SA-NT showed the best performance with the CongressEW dataset, while HBEOSA-FFA-NT demonstrated the same superiority with Vote, Zoo, and Lymphography. The implication is that a hybrid of the BEOSA and FFA algorithms is much more compatible and productive than the hybrid of BEOSA and SA.

Table 4. Medium-scale dataset comparative analysis using the number of features selected.

https://doi.org/10.1371/journal.pone.0282812.t004

Six datasets were experimented with in the low-dimensional category, and the results obtained are listed in Table 5. HBEOSA-SA-NT and HBEOSA-FFA-NT are seen to compete closely here, especially on the Iris and Exactly datasets. However, HBEOSA-FFA-NT seems to outperform HBEOSA-SA-NT on the M-of-n, Tic-tac-toe, and Exactly2 datasets while lagging on the Wine dataset. This again confirms that HBEOSA-FFA-NT remains the best hybrid of BEOSA for optimal performance in terms of the number of features selected for classification purposes. Recall that we observed this performance trend with the high-dimensional, medium-dimensional, and now low-dimensional datasets.

Table 5. Small-scale dataset comparative analysis using the number of features selected.

https://doi.org/10.1371/journal.pone.0282812.t005

In summary, comparing the methods' performance in terms of the number of features selected on all categories of datasets shows that applying hybrid algorithms is more suitable. Furthermore, we observed that using a transfer function can greatly impair the hybrid methods' performance when the function's design and integration are ineffective. We also noted that the hybrids of FFA with BEOSA yield better performance than the hybrid with SA. This suggests that the shared biology/swarm-based nature of BEOSA and FFA might be the reason for the good performance reported by the hybrid. Recall that swarm-based algorithms are often more competitive than physics-based ones.

5.2 Comparative analysis of classification accuracy by hybrid methods

The problem of feature selection is evaluated by investigating the outcome of the classification accuracy resulting from using the selected features. In this study, we experimented with the features selected by HBEOSA-SA, HBEOSA-SA-NT, HBEOSA-FFA, HBEOSA-FFA-NT, and BEOSA. Further, we investigated the impact of varying the population size for each method, and an average classification accuracy value was computed. Results are presented and compared in the three categories of datasets followed in the last sub-section.

Table 6 lists the results obtained for the high-dimensional datasets, including WaveformEW, Sonar, PenglungEW, KrVsKpEW, Ionosphere, BreastEW, Prostate, Colon, and Leukemia. In most cases, we found only minimal differences between the classification accuracy obtained for experiments using a population size of 50 and those using a population size of 100. This is readily noticeable with the BreastEW and Colon datasets. The analysis aims to see the impact of the reduced features in achieving good classification accuracy. It is desired that such accuracy be significant; otherwise, we conclude that the features selected are suboptimal and, hence, that the proposed binary optimizers are ineffective. As an example, we note that the average classification accuracy observed for HBEOSA-SA, HBEOSA-SA-NT, HBEOSA-FFA, HBEOSA-FFA-NT, and BEOSA on BreastEW, Colon, Ionosphere, KrVsKpEW, Leukemia, and Prostate ranges between [0.91, 1.00], except for HBEOSA-SA on Ionosphere, which yielded 0.867857; the general classification outcome is nonetheless significant and appreciable. On the other hand, the classification accuracies reported for PenglungEW, Sonar, and WaveformEW with the same algorithms, namely HBEOSA-SA, HBEOSA-SA-NT, HBEOSA-FFA, HBEOSA-FFA-NT, and BEOSA, were lower, though still significant, in the range [0.7166, 0.8809]. This result implies that all the hybrids of BEOSA proposed in the study are suitable for fetching only the relevant features required for obtaining significant classification accuracy, even on high-dimensional datasets.

Table 6. Large-scale dataset comparative analysis using classification accuracy for population sizes 50 and 100.

https://doi.org/10.1371/journal.pone.0282812.t006

The features extracted by the hybrid binary optimizers are seen to be very competitive in terms of the classification accuracy obtained on the medium-scale datasets compared to those from the high-dimensional group. A careful look at the accuracy values reported in Table 7 for experiments using a population size of 50 and those using 100 reveals that a change in population size may not bring any significant performance enhancement if the hybrids of a binary optimizer are well articulated and designed. We see this confirmed in the results for Zoo, Vote, SpectEW, Lymphography, and CongressEW. There are, however, some exceptions, in the cases of HBEOSA-SA-NT and BEOSA on Lymphography, HBEOSA-SA and HBEOSA-FFA-NT on SpectEW, and HBEOSA-SA-NT and HBEOSA-FFA on the Zoo dataset, where there is a wide margin between the classification accuracies for the 50 and 100 population sizes. Meanwhile, we note that the average accuracy for all the medium-scale datasets on all the hybrid methods is also significant, with the least and best being 0.783333 and 0.966667, respectively. The average classification accuracy observed for all experiments with a population size of 50 is 0.898019, that for a population size of 100 is 0.866073, and the average over the individual averages is 0.904246.

Table 7. Medium-scale dataset comparative analysis using classification accuracy for population sizes 50 and 100.

https://doi.org/10.1371/journal.pone.0282812.t007

An interesting, though reduced, performance on classification accuracy compared with the high-dimensional datasets is observed for the low-dimensional datasets. In Table 8, most of the accuracies obtained for the 50 and 100 population sizes are lower, ranging between [0.60, 0.80]. This motivated us to ask whether the features extracted for the low-dimensional datasets were not representative of those which can yield good classification accuracy. This concern is justified by the fact that it is desirable for the selected features to produce better classification accuracy. However, since the results obtained for the low-dimensional datasets do not match those of the other categories, we conclude that HBEOSA-SA, HBEOSA-SA-NT, HBEOSA-FFA, and HBEOSA-FFA-NT selected an optimal number of features, but that more suggestive features were left out.

Table 8. Small-scale dataset comparative analysis using classification accuracy for population sizes 50 and 100.

https://doi.org/10.1371/journal.pone.0282812.t008

The summary of the findings observed in the comparative analysis of the hybrid methods with respect to classification accuracy is that methods that yielded lower performance with respect to the number of features extracted still produced significant classification accuracy. We also noted that it is important to design binary optimizers that select the optimal number of features while including the most discriminant features capable of supporting the classifier in producing good results. To provide an overview of the findings from the analysis, charts illustrating the distribution of the average classification accuracies in all the categories of datasets have been plotted, as seen in Fig 4.

Fig 4.

A bar chart plot for the comparison of the hybrids of BEOSA based on the classification accuracy obtained using (a) high-dimensional, (b) medium-dimensional, and (c) low-dimensional datasets.

https://doi.org/10.1371/journal.pone.0282812.g004

In the following two sub-sections, we focus on analyzing the values returned by the fitness function and the cost function, as well as the computational cost of running all the hybrids compared with the single binary optimizer. This is necessary to corroborate the significance of applying the methods that yielded the impressive performance reported in the previous and current sub-sections.

5.3 Comparative analysis of fitness and cost values by hybrid methods

Evaluation of the fitness and cost functions is very relevant to consolidating the results obtained for classification accuracy and feature counts. Whereas the fitness value demonstrates the high ranking associated with the selected solution from a wide range of candidate solutions, the cost value demonstrates what is required to obtain that solution. The fitness value is expected to be minimized while the cost value is maximized, hence a min-max optimization process. In this sub-section, we analyze the fitness and cost values obtained for the 50 and 100 population sizes on all categories of datasets using the hybrid methods of BEOSA.

In Table 9, the values obtained for the fitness and cost functions are outlined for population sizes 50 and 100 on all the datasets listed. As observed during the discussion of the feature count results, we note that the fitness values of both HBEOSA-SA and HBEOSA-FFA in some cases dropped as low as negative values. This is consistent with the feature counts reported by these same methods, where we observed that negligible feature counts were returned, although some other cases returned positive fitness values. Again, this abnormal performance is associated with the use of the transfer function in the hybrids, and we have already motivated the need to consider whether using the transfer function in hybrids of the binary optimizer is necessary. The results obtained for HBEOSA-SA-NT and HBEOSA-FFA-NT are very impressive because, as expected, the fitness and cost values were minimized and maximized accordingly. For instance, the values returned by HBEOSA-SA-NT and HBEOSA-FFA-NT on the WaveformEW, Sonar, PenglungEW, KrVsKpEW, Ionosphere, BreastEW, Prostate, Colon, and Leukemia datasets are very low for fitness and high for cost for both the 50 and 100 population sizes.

Table 9. Large-scale dataset comparative analysis using fitness and cost values for population sizes 50 and 100.

https://doi.org/10.1371/journal.pone.0282812.t009

The results for the medium-scale datasets are listed in Table 10 for all the hybrid methods and for population sizes 50 and 100. We noted that HBEOSA-SA, HBEOSA-FFA, HBEOSA-SA-NT, and HBEOSA-FFA-NT all performed well except for HBEOSA-FFA on CongressEW (population size 50), HBEOSA-FFA on Lymphography (population size 100), HBEOSA-SA on SpectEW (population sizes 50 and 100), HBEOSA-SA and HBEOSA-FFA on Vote (population size 50), and HBEOSA-FFA on Zoo (population sizes 50 and 100). The fitness and cost values for both the 50 and 100 population sizes on HBEOSA-SA, HBEOSA-FFA, HBEOSA-SA-NT, and HBEOSA-FFA-NT are seen to be significantly low and correspondingly high for Zoo, Vote, SpectEW, Lymphography, and CongressEW. The implication of this result is that the solution selected from all candidate solutions represents the best solution.

Table 10. Medium-scale dataset comparative analysis using fitness and cost values for population size 50 and 100.

https://doi.org/10.1371/journal.pone.0282812.t010

The hybrid methods were applied to the low-dimensional datasets Iris, Wine, Tic-tac-toe, M-of-n, HeartEW, Exactly, Exactly2, and BreastCancer, and the results are listed in Table 11. The results obtained are consistent with those reported for the high-dimensional and medium-dimensional categories. Some cases of the HBEOSA-SA and HBEOSA-FFA methods yield negative values for the fitness function. On the other hand, HBEOSA-SA-NT and HBEOSA-FFA-NT did well regarding the values returned for the fitness and cost functions. Note that the low-dimensional fitness results are quite high compared with those obtained for the high and medium-scale datasets. Again, this points to the suitability of the proposed hybrid methods for handling high-dimensional datasets more effectively. Recall that the challenge associated with high-dimensional datasets often impairs binary optimizers' outcomes. However, this study shows that the proposed hybrid methods are significantly suitable for high-dimensional datasets.

Table 11. Small-scale dataset comparative analysis using fitness and cost values for population sizes 50 and 100.

https://doi.org/10.1371/journal.pone.0282812.t011

The fitness and cost convergence curves for selected datasets in the three dataset categories were obtained for graphing. In Fig 5, the fitness convergence curves for the WaveformEW, Zoo, and Wine datasets are shown and compared for population sizes 50 and 100. The comparison for WaveformEW shows that the fitness curves for HBEOSA-FFA and HBEOSA-FFA-NT using the 50 and 100 population sizes rank high in the plots, while those of HBEOSA-SA and HBEOSA-SA-NT trail behind. Almost the contrary is observed for the Zoo dataset belonging to the medium-scale datasets. Here, the curves for both HBEOSA-SA and HBEOSA-SA-NT flow high in the plots, though HBEOSA-FFA-NT competes successfully. All the hybrid methods under-performed compared with BEOSA on the Wine dataset for the 50 population size. However, we see a different curve pattern in the 100 population size plots, where all hybrid methods rose high except for HBEOSA-SA. This is consistent with the report obtained from the tabular data discussed earlier. Convergence curves are expected to show how the solutions benefit from the optimization process through a drop in the pattern of each curve on a plot. We see this convergence pattern replicated for most algorithms on each dataset except for the Wine dataset using the 50 population size.

Fig 5. An illustration and comparison of the fitness convergence curves for the Large-scale dataset (WaveformEW), medium-scale dataset (Zoo dataset), and Small-scale (Wine dataset) using 50 and 100 population sizes.

(a) WaveformEW dataset for 50 population size; (b) WaveformEW dataset for 100 population size; (c) Zoo dataset for 50 population size; (d) Zoo dataset for 100 population size; (e) Wine dataset for 50 population size; (f) Wine dataset for 100 population size.

https://doi.org/10.1371/journal.pone.0282812.g005

Similarly, we plot the cost function graphs for the WaveformEW, Zoo, and Wine datasets for the corresponding 50 and 100 population sizes. In the case of the cost function curve, we expect each curve for the hybrid methods to rise rather than drop, as defined for the fitness curves. In Fig 6, the WaveformEW plots for the 50 population size show that HBEOSA-SA and HBEOSA-FFA rose high while HBEOSA-SA-NT and HBEOSA-FFA-NT stayed low. An almost similar display is seen for the 100 population size, with HBEOSA-SA at the peak, followed by HBEOSA-SA-NT, while HBEOSA-FFA and HBEOSA-FFA-NT are at the bottom of the plot. The Zoo dataset for the 50 population size shows the opposite, with HBEOSA-SA, HBEOSA-SA-NT, and HBEOSA-FFA-NT flowing at the bottom of the curves while only HBEOSA-FFA rose to the top. For the 100 population size, still on the Zoo dataset, only HBEOSA-FFA-NT ranked low in the plot. All the hybrid methods performed well as graphed for the Wine dataset with the 50 population size, while only HBEOSA-SA peaked when a population size of 100 was used. The cost curves for all the datasets on the hybrid methods are seen to rise from low points to higher points except for Wine with the 50 population size.

Fig 6. An illustration and comparison of the cost convergence curves for the large-scale dataset (WaveformEW), medium-scale dataset (Zoo), and small-scale dataset (Wine) using 50 and 100 population sizes.

(a) WaveformEW dataset for 50 population size; (b) WaveformEW dataset for 100 population size; (c) Zoo dataset for 50 population size; (d) Zoo dataset for 100 population size; (e) Wine dataset for 50 population size; (f) Wine dataset for 100 population size.

https://doi.org/10.1371/journal.pone.0282812.g006

The evaluation of the fitness and cost function results for the low-, medium-, and high-dimensional datasets confirms that the solutions selected during the feature selection and classification process are indeed optimal. This verification is required to establish that the binary optimizers can search the solution space and identify the best solution among all candidate solutions.

5.4 Comparative analysis of computation time by hybrid methods

Computational resources are a necessary consideration when implementing new algorithms and must be evaluated during experimentation. This study compares the computational cost of all the hybrid methods considered, organized by dataset category and by the performance of each method. Although the computational cost of BEOSA was also collected and is presented, it is not compared directly with the hybrid methods because its design is fundamentally different; as expected, the cost of BEOSA is far lower than that of the hybrids. The hybrid methods, however, achieved outstanding feature selection and classification performance, so the tradeoff is improved classification accuracy on an optimal feature set at a higher computational cost. The following paragraphs detail the results of all methods according to the dataset grouping.
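A minimal sketch of how the per-method computation times in Tables 12-14 can be collected is shown below; timings are averaged over repeated independent runs with a wall-clock timer. The optimizer.run entry point and its parameters are hypothetical stand-ins, since the actual BEOSA/HBEOSA interface is not shown in this section.

```python
# Minimal sketch of per-method timing. `optimizer.run` and its parameters are
# hypothetical stand-ins; the actual BEOSA/HBEOSA entry point is not shown in
# this section.
import time

def timed_run(optimizer, dataset, pop_size, runs=10):
    """Average wall-clock time (seconds) over repeated independent runs."""
    elapsed = []
    for _ in range(runs):
        start = time.perf_counter()
        optimizer.run(dataset, population_size=pop_size)  # hypothetical API
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / len(elapsed)
```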

The computation time required for all the high-dimensional datasets is listed in Table 12. The computational cost of HBEOSA-FFA-NT is slightly higher than that of its corresponding HBEOSA-FFA on the BreastEW, KrVsKpEW, and Sonar datasets; on the Prostate, Colon, Leukemia, Ionosphere, and PenglungEW datasets, however, HBEOSA-FFA-NT completed its task at a lower cost than HBEOSA-FFA. Comparing HBEOSA-SA-NT and HBEOSA-SA on the high-dimensional datasets shows that the former is more cost-effective than the latter, as seen on Prostate, Colon, Leukemia, Sonar, and WaveformEW. Even where HBEOSA-SA recorded a lower computational cost than HBEOSA-SA-NT, the difference is insignificant. Hence, HBEOSA-SA-NT and HBEOSA-FFA-NT are cost-effective and the best-performing methods with respect to feature selection and classification, while all methods remain competitive on the computational cost listed for each high-dimensional dataset.

Table 12. Large-scale dataset comparative analysis using computation resources.

https://doi.org/10.1371/journal.pone.0282812.t012

The computational costs of HBEOSA-SA, HBEOSA-SA-NT, HBEOSA-FFA, and HBEOSA-FFA-NT for the Zoo, Vote, SpectEW, Lymphography, and CongressEW datasets are listed in Table 13. HBEOSA-SA-NT recorded the lowest computational cost on almost all the datasets, except Vote, where HBEOSA-FFA outperformed it. Similarly, HBEOSA-SA-NT trails HBEOSA-FFA-NT in terms of low computational cost. This implies that removing the transfer function from the hybrid methods yields greater benefits both in feature selection with classification and in computational cost.

Table 13. Medium-scale dataset comparative analysis using computation resources.

https://doi.org/10.1371/journal.pone.0282812.t013

Table 14 reports the computational cost for the low-dimensional datasets: Iris, Wine, Tic-tac-toe, M-of-n, HeartEW, Exactly, Exactly2, and BreastCancer. Generally, a lower computational cost is reported for the low-dimensional datasets than for the large- and medium-scale ones, demonstrating the consistency of the hybrid algorithms and confirming their reliability and applicability to real-life optimization problems. Meanwhile, as reported earlier, the computational cost of HBEOSA-SA and HBEOSA-FFA was lower than that of their corresponding models without transfer functions, HBEOSA-SA-NT and HBEOSA-FFA-NT. For instance, on the Exactly, M-of-n, Tic-tac-toe, and Wine datasets, the costs for HBEOSA-SA and HBEOSA-FFA were (853.813, 902.6218), (801.7767, 801.7767), (1599.557, 1551.322), and (832.3618, 946.9568), as against (933.5138, 1122.875), (893.3054, 800.887), (1793.579, 1702.181), and (797.7159, 871.5879) for HBEOSA-SA-NT and HBEOSA-FFA-NT, respectively. However, on the Exactly2 and Iris datasets, the computational cost of HBEOSA-FFA was higher than that of its corresponding HBEOSA-FFA-NT.

Table 14. Small-scale dataset comparative analysis using computation resources.

https://doi.org/10.1371/journal.pone.0282812.t014

The computational costs discussed in the previous paragraphs for the three dataset categories are further presented graphically for clarity. In Fig 7, bar charts show the distribution of computational cost for each hybrid algorithm. Among the high-dimensional datasets, Sonar, PenglungEW, and Ionosphere are computationally cheap compared with WaveformEW, KrVsKpEW, BreastEW, Prostate, Colon, and Leukemia. On almost all these datasets, the bar for HBEOSA-SA peaks above those of the other methods. This contrasts with the medium- and low-dimensional datasets, where the bars for HBEOSA-FFA and HBEOSA-SA-NT, respectively, are the highest among the hybrid methods.
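A grouped bar chart of the kind shown in Fig 7 can be sketched as follows. The values used here are the low-dimensional computation times quoted in the discussion of Table 14; the figure layout (bar width, grouping) is an illustrative assumption rather than the exact plotting code used in the study.

```python
# Sketch of a grouped bar chart in the style of Fig 7. The values are the
# low-dimensional computation times quoted in the discussion of Table 14;
# bar width and grouping are illustrative layout choices.
import matplotlib.pyplot as plt
import numpy as np

datasets = ["Exactly", "M-of-n", "Tic-tac-toe", "Wine"]
costs = {  # seconds, per method (from the Table 14 discussion)
    "HBEOSA-SA": [853.813, 801.7767, 1599.557, 832.3618],
    "HBEOSA-FFA": [902.6218, 801.7767, 1551.322, 946.9568],
    "HBEOSA-SA-NT": [933.5138, 893.3054, 1793.579, 797.7159],
    "HBEOSA-FFA-NT": [1122.875, 800.887, 1702.181, 871.5879],
}

x = np.arange(len(datasets))
width = 0.2
fig, ax = plt.subplots(figsize=(8, 4))
for i, (name, vals) in enumerate(costs.items()):
    ax.bar(x + i * width, vals, width, label=name)
ax.set_xticks(x + 1.5 * width)
ax.set_xticklabels(datasets)
ax.set_ylabel("Computation time (s)")
ax.legend()
plt.tight_layout()
plt.show()
```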

Fig 7. A bar chart comparing the hybrids of BEOSA, namely HBEOSA-SA, HBEOSA-SA-NT, HBEOSA-FFA, and HBEOSA-FFA-NT, based on the computational resources obtained using (a) high-dimensional, (b) medium-dimensional, and (c) low-dimensional datasets.

https://doi.org/10.1371/journal.pone.0282812.g007

In summary, the computational cost observed for all the hybrid methods across the three dataset categories is justified by the gains achieved in the reduced feature sets selected and in classification accuracy. This corroborates the study's aim of promoting a hybrid binary optimizer that outperforms a single binary optimizer at a reasonable computational cost.

5.5 Discussion on findings

In this sub-section, the findings of the study are presented through a combined examination of the performance of all hybrid algorithms on fitness, classification accuracy, and cost. When these metrics were examined individually, the results were consistent with the features selected by each method. To arrive at justifiable findings, however, we use radar plots that chart the three metrics on a single graph for selected datasets in each dataset category.

In Fig 8, the fitness, classification accuracy, and cost values on the Sonar, PenglungEW, and Leukemia datasets for population sizes 50 and 100 are plotted as radar charts. Placing the graphs for the two population sizes side by side further tests whether population size influences the performance of the hybrid methods. For the Sonar dataset, the values returned for fitness, accuracy, and cost align strongly for HBEOSA-SA, HBEOSA-SA-NT, and HBEOSA-FFA-NT at both population sizes; the exception is the plot for HBEOSA-FFA. For the PenglungEW dataset, the lag among the plots of the hybrid methods is very small at both population sizes. On the Leukemia dataset, a small lag exists only for HBEOSA-FFA at population size 50 and for HBEOSA-SA at population size 100. This shows that, for all the hybrid methods proposed in this study, fitness, accuracy, and cost are correlated with the features selected; when all three metrics are poor, the aim of minimizing the number of selected features is defeated. It also demonstrates the need to assess binary optimizers not only on individual metrics but by correlating related metrics in a manner that captures the harmonious behavior of the optimizer on the feature selection task. Secondly, the findings show minimal performance gain when the population size is varied for each hybrid method. In fact, even the single binary optimizer appears to trail its corresponding hybrids in performance, confirming that hybrid binary optimizers maintain the behavioral pattern of their single/base method while improving performance. Finally, as charted in the plots, the proposed hybrid methods are very suitable for high-dimensional datasets, with no abnormality observed.
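A radar plot of the kind used in Figs 8-10 can be sketched with matplotlib's polar axes, as below. The per-method values are illustrative placeholders that loosely echo the magnitudes reported in the study, not actual experimental results.

```python
# Sketch of a radar plot in the style of Figs 8-10, with three axes (fitness,
# accuracy, cost) per method. The values are illustrative placeholders that
# loosely echo the magnitudes reported in the study, not actual results.
import matplotlib.pyplot as plt
import numpy as np

metrics = ["Fitness", "Accuracy", "Cost"]
results = {  # hypothetical per-method values
    "BEOSA": [0.080, 0.930, 0.920],
    "HBEOSA-FFA": [0.066, 0.995, 0.934],
}

angles = np.linspace(0, 2 * np.pi, len(metrics), endpoint=False).tolist()
angles += angles[:1]  # close the polygon

ax = plt.subplot(polar=True)
for name, vals in results.items():
    vals = vals + vals[:1]
    ax.plot(angles, vals, label=name)
    ax.fill(angles, vals, alpha=0.1)
ax.set_xticks(angles[:-1])
ax.set_xticklabels(metrics)
ax.legend(loc="lower right")
plt.show()
```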

Fig 8. A radar plot illustrating the comparison of the classification accuracy, cost values, and fitness values for the hybrids of BEOSA when applied to some high-dimensional datasets using a variation of 50 and 100 population sizes.

(a) Sonar dataset for 50 population size; (b) Sonar dataset for 100 population size; (c) PenglungEW dataset for 50 population size; (d) PenglungEW dataset for 100 population size; (e) Leukemia dataset for 50 population size; (f) Leukemia dataset for 100 population size.

https://doi.org/10.1371/journal.pone.0282812.g008

The behavior of the hybrid methods on the medium-scale datasets is investigated further in Fig 9. The Vote, Zoo, and SpectEW datasets were randomly selected for this analysis at population sizes 50 and 100. On the Vote dataset, the fitness and cost values of the hybrid methods at population size 50 were better than those of the single BEOSA method, while the classification accuracies of all hybrid and single methods overlapped; this overlap is also noticed at population size 100. On the Zoo dataset, the plots of the hybrid and single binary methods again overlap, except for HBEOSA-FFA, which reported a more desirable result at population size 50. The competitive fitness, cost, and classification accuracy on SpectEW is likewise demonstrated by the overlap in the plots between the hybrid methods and the single binary optimizer. This shows that, on medium-scale datasets, the hybrid binary optimizers perform almost the same as the single binary optimizer, and therefore that large-scale datasets stand to benefit more from the hybridization of single binary optimizers. Even on the medium-scale datasets, the hybrid methods performed well; there was simply no significant difference from the single binary optimizer.

Fig 9. A radar plot illustrating the comparison of the classification accuracy, cost values, and fitness values for the hybrids of BEOSA when applied to some medium-dimensional datasets using a variation of 50 and 100 population sizes.

(a) Vote dataset for 50 population size; (b) Vote dataset for 100 population size; (c) Zoo dataset for 50 population size; (d) Zoo dataset for 100 population size; (e) SpectEW dataset for 50 population size; (f) SpectEW dataset for 100 population size.

https://doi.org/10.1371/journal.pone.0282812.g009

The findings from applying the low-dimensional datasets to the proposed hybrid methods are illustrated in Fig 10. The Tic-tac-toe, Exactly2, and Exactly datasets were randomly selected from this category for the comparative analysis of fitness, cost, and classification accuracy. On the Tic-tac-toe dataset at population size 50, there is no significant difference between the hybrid methods and the single binary optimizer; at population size 100, only HBEOSA-SA gained noticeably in fitness and cost, while the other hybrid algorithms overlap in performance. A similar pattern is observed on the Exactly2 dataset, especially at population size 100, whereas at population size 50 HBEOSA-SA and HBEOSA-FFA show better fitness and cost values. On the Exactly dataset, by contrast, HBEOSA-SA and HBEOSA-FFA-NT reported a slight drop in fitness and cost at population size 50, and HBEOSA-SA-NT and HBEOSA-FFA-NT reported a similar drop at population size 100; the remaining two hybrid algorithms outperformed the single binary optimizer in both cases.

Fig 10. A radar plot illustrating the comparison of the classification accuracy, cost values, and fitness values for the hybrids of BEOSA when applied to some low-dimensional datasets using a variation of 50 and 100 population sizes.

(a) Tic-tac-toe dataset for 50 population size; (b) Tic-tac-toe dataset for 100 population size; (c) Exactly2 dataset for 50 population size; (d) Exactly2 dataset for 100 population size; (e) Exactly dataset for 50 population size; (f) Exactly dataset for 100 population size.

https://doi.org/10.1371/journal.pone.0282812.g010

Recall that the motivation for this study is to investigate the performance enhancement obtainable when nested transfer functions are applied to the FS problem, as against the traditional single-function approach. The study also set out to observe the performance gain of the threshold method compared with the transfer function method for binarizing the continuous optimization process. The outcome of the study points to the following observations as the reasons for the results obtained:

  1. A correlation is expected between the values returned by the fitness, accuracy, and cost functions. In most cases this correlation holds for the hybrid algorithms, supporting the validity and relevance of the achieved performance enhancements for the FS problem. Furthermore, the values of these three functions remained largely aligned even when the population size varied between 50 and 100. This is readily noticeable with the HBEOSA-SA, HBEOSA-SA-NT, and HBEOSA-FFA-NT algorithms, most of which are the methods using the nested transfer functions. This behavior confirms the role of the nested transfer function as a stabilizer of binary optimizers, even when solving FS problems on high-dimensional datasets, which are notoriously problematic for binary optimizers (a minimal sketch of such a nested transfer function follows this list).
  2. At a particular population size, 50, the results revealed no significant difference between the hybrid methods and the single binary optimizer. Again, the nested transfer function explains this behavior, since it stabilizes the candidate solutions.
  3. On the low-dimensional datasets, the fitness and cost values confirmed that the hybrid optimizers performed better than the single optimizer, owing to the mutual benefit derived from leveraging the strengths of the composing algorithms. As in the previous observations, the hybrid methods using nested transfer functions lead in this respect.
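As referenced in observation 1 above, the following is a minimal sketch contrasting the threshold method with a nested transfer function for binarizing continuous positions. The specific composition shown, an S-shaped (sigmoid) map applied to a V-shaped (|tanh|) response, is an illustrative assumption; the exact NT formulation of BEOSA's hybrids is not reproduced here.

```python
# Minimal sketch contrasting threshold binarization with a nested transfer
# function. The specific composition (sigmoid applied to |tanh|) is an
# illustrative assumption, not the exact NT formulation used by the hybrids.
import numpy as np

def s_shaped(x):
    """S-shaped transfer: sigmoid mapping the reals into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def v_shaped(x):
    """V-shaped transfer: |tanh(x)|, mapping the reals into [0, 1)."""
    return np.abs(np.tanh(x))

def binarize_nested(positions, rng):
    """Nested transfer: pass the V-shaped response through the S-shaped map,
    then set each bit stochastically against the resulting probability."""
    prob = s_shaped(v_shaped(positions))
    return (rng.random(positions.shape) < prob).astype(int)

def binarize_threshold(positions, tau=0.5):
    """Threshold method: select a feature when its position exceeds tau."""
    return (positions > tau).astype(int)

rng = np.random.default_rng(42)
positions = rng.normal(size=10)  # continuous positions of one individual
print(binarize_nested(positions, rng))
print(binarize_threshold(positions))
```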

In summary, the study finds that hybrid binary optimizers improve performance on feature selection problems compared with their corresponding single binary optimizers. This enhancement is reflected in the quantity and quality of the features selected and in the fitness and cost of the best solution among the candidate solutions. Meanwhile, the hybrid methods and the single binary optimizers perform competitively on the medium-scale and low-dimensional datasets. These findings imply that high-dimensional datasets benefit more from a hybrid binary optimizer than from a single one, even though the computational cost of the hybrids is much higher. The significant performance gain on high-dimensional datasets is an interesting discovery, because most real-life problems are characterized by high-dimensional datasets, which the proposed hybrid binary optimizers solve with better performance.

6. Conclusion

The use of hybrid binary optimization algorithms is proposed and investigated in this study. The binary Ebola optimization search algorithm (BEOSA) serves as the base single binary optimizer from which the hybrid algorithms are derived: simulated annealing (SA) and the firefly algorithm (FFA) were hybridized with BEOSA to obtain HBEOSA-SA and HBEOSA-FFA. The influence of transfer functions on the design of the hybrid methods was also investigated, and the results showed that the hybrid algorithms designed without transfer functions outperformed those that used them. The findings further show that studies on binary optimization algorithms need to assess binary optimizers not only on individual metrics but by correlating related metrics in a manner that projects the harmonious behavior of the optimizer. The study also investigated the influence of increasing the population size of the solutions in the search space and confirmed that varying the population size yields minimal performance gain for each hybrid method. Furthermore, the hybrid algorithms exhibited performance patterns almost identical to those of the single binary optimizer, showing that hybrid binary optimizers maintain the behavioral pattern of their single/base method while improving performance. The datasets used for experimentation were categorized as high-dimensional, medium-scale, and low-dimensional. The experiments revealed that the hybrid methods performed better on the high-dimensional datasets than on those in the other two categories, showing that large-scale datasets stand to benefit more from the hybridization of single binary optimizers. In future work, we recommend investigating other transfer functions within the hybrid methods to examine their effect on performance; the threshold method, as opposed to the transfer function method, can also be investigated for a comparative analysis. Recent discrete and continuous optimizers might also be considered for hybridization with BEOSA to reveal how efficiently the resulting hybrids perform compared with what is reported in this study.
