Abstract

Many optimization algorithms have been applied to high-dimensional instances of the 0-1 knapsack problem. However, these algorithms often become trapped in local optima and thus fail to obtain global optimal solutions. To circumvent this shortcoming, a hybrid harmony search algorithm with distribution estimation is proposed in this paper. The important features of the proposed algorithm are as follows: (i) the idea of probability distribution estimation is employed to design an adaptive search strategy, (ii) a directed improvisation process is presented to improve the searching ability of the algorithm, (iii) a new initialization method is used to ensure that every initial harmony is feasible, and (iv) an improved repair approach is proposed to effectively repair infeasible solutions. To assess the effectiveness of the proposed algorithm, extensive experiments are carried out. The experimental results reveal that the proposed algorithm is a reliable and promising alternative for solving the 0-1 knapsack problem.

1. Introduction

With the development of digitization, a large number of discrete optimization problems are emerging, which has attracted the attention of many scholars [1–4]. Combinatorial optimization is a typical class of such problems and includes integer programming and 0-1 programming. The 0-1 knapsack problem, proposed by Fayard in 1975, is a typical combinatorial optimization problem [5] and is NP-hard. It is widely applied in many areas such as investment decision making [6], cutting stock [7], the housing problem [8], cryptography [9], adaptive multimedia systems [10], portfolio choice [11, 12], computer memory [13], the allocation of resources [14], energy minimization [15, 16], cargo loading [17–19], real estate property maintenance optimization [20], the main budget [21], and blanking [7]. Many real-world optimization problems resemble the 0-1 knapsack problem; therefore, exploring approaches that solve it effectively is important and can offer new methods for complex engineering optimization problems. Without loss of generality, the 0-1 knapsack problem can be expressed as

$$\max \; f(x) = \sum_{i=1}^{n} p_i x_i \quad \text{subject to} \quad \sum_{i=1}^{n} w_i x_i \le C, \qquad x_i \in \{0, 1\},$$

where $n$ is the total number of objects, each object $i$ has a separate profit value $p_i$ and weight $w_i$, $C$ is the largest weight the knapsack can accommodate, and $x_i$ encodes the state of object $i$: if the object is put into the knapsack, $x_i$ is 1; otherwise, $x_i$ is 0.
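To make the model concrete, the following short Python sketch evaluates a candidate selection against the formulation above; the function name and the data values are made up for illustration.

# Evaluate a candidate selection x for the 0-1 knapsack model above.
# The profits, weights, and capacity below are illustrative only.
def knapsack_value(x, profits, weights, capacity):
    total_weight = sum(w for w, xi in zip(weights, x) if xi == 1)
    if total_weight > capacity:
        return None  # the capacity constraint is violated
    return sum(p for p, xi in zip(profits, x) if xi == 1)

profits = [60, 100, 120]   # p_i
weights = [10, 20, 30]     # w_i
capacity = 50              # C
print(knapsack_value([0, 1, 1], profits, weights, capacity))  # 220 (feasible)
print(knapsack_value([1, 1, 1], profits, weights, capacity))  # None (too heavy)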

The 0-1 knapsack problem asks for the selection of items that maximizes the value carried in the knapsack while respecting its capacity. Many scholars have studied the 0-1 knapsack problem and various algorithms have been proposed. These algorithms can be classified into two categories: deterministic algorithms, which include dynamic programming [22], backtracking, and branch and bound [23]; and intelligent optimization algorithms. Although dynamic programming and branch and bound are often used to solve the 0-1 knapsack problem, their computation time increases rapidly, and the attainable accuracy decreases, as the dimension increases. Thus, metaheuristic algorithms imitating natural phenomena have attracted more and more attention [24–29].
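For context, the following Python sketch shows the standard dynamic-programming recurrence for the 0-1 knapsack problem; it illustrates why exact methods become expensive, since the table grows with both the number of items and the capacity. The function name and test data are ours.

def knapsack_dp(profits, weights, capacity):
    # dp[c] holds the best profit achievable with capacity c
    dp = [0] * (capacity + 1)
    for i in range(len(profits)):
        # iterate capacities downward so each item is used at most once
        for c in range(capacity, weights[i] - 1, -1):
            dp[c] = max(dp[c], dp[c - weights[i]] + profits[i])
    return dp[capacity]

print(knapsack_dp([60, 100, 120], [10, 20, 30], 50))  # 220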

In recent years, many metaheuristics have been proposed to solve 0-1 knapsack problems. Lim proposed a monogamous genetic algorithm with different offspring for the 0-1 knapsack problem [30]. A novel immune genetic algorithm (NIGA) recognizes antigen characteristics quickly by injecting a vaccine; experimental results show that NIGA effectively prevents degradation during genetic optimization and improves the convergence speed [31]. Eberhart and Kennedy proposed a binary particle swarm optimization algorithm (BPSO), which transforms the continuous problem into a discrete one through a sigmoid function [32]. Its main shortcoming is that the change of probability becomes smaller as the speed increases, which can drive the evolution into a local optimum. An improved binary particle swarm optimization (MBPSO) was presented by Kennedy, in which a new probability function is designed to ensure the diversity of the population; compared with classic binary particle swarm optimization algorithms, MBPSO is more efficient in solving the 0-1 knapsack problem [33]. He et al. proposed a greedy particle swarm optimization (GPSO), in which a new greedy transformation method is combined with a binary particle swarm optimization algorithm using two-layer structure coding; simulation results reveal that GPSO is more effective [34]. Wang proposed the monarch butterfly optimization algorithm [35], and Feng et al. then proposed a chaotic monarch butterfly optimization [36] and a new binary monarch butterfly optimization algorithm [37] to solve 0-1 knapsack problems. Marinakis proposed a two-stage hybrid algorithm based on the concepts of artificial bee colony optimization and greedy randomized adaptive search (GRASP) [38], which combines artificial bee colony optimization for feature selection with GRASP for clustering. Cao proposed a binary differential evolution artificial bee colony algorithm (BABC-DE) to solve the 0-1 knapsack problem; in the employed-bee phase, a binary search operator that considers memory and neighbor information is designed, and in the onlooker-bee phase, the mutation and crossover operations of differential evolution are used [39]. In 2021, Wang et al. presented a hybridization of dragonfly algorithm optimization and an angle modulation mechanism for 0-1 knapsack problems [40]. Abdel-Basset et al. proposed a binary flower pollination algorithm (BFPA) to solve the 0-1 knapsack problem [41]; the algorithm uses a conversion function to map the continuous values generated by FPA into binary values, adds a penalty to the evaluation function to give negative value to infeasible solutions, and finally repairs infeasible solutions by a two-stage repair operation. Abdel-Basset et al. also proposed a binary equilibrium optimization method (BEO) for the 0-1 knapsack problem, which mainly uses V-shaped and S-shaped transfer functions to discretize the continuous function, uses a penalty function to screen out infeasible solutions, and applies a repair algorithm (RA) to transform them into feasible solutions; results show that the algorithm outperforms other algorithms [42]. In 2020, Abdel-Basset et al. further presented new binary marine predators optimization algorithms for 0-1 knapsack problems [43]. In 2021, Moradi et al. designed an efficient population-based simulated annealing algorithm for the 0-1 knapsack problem [44].

The harmony search algorithm, proposed by Geem, is an optimization algorithm with a relatively simple structure and relatively few adjustable parameters [45]. Harmony search imitates the way musicians adjust the pitches of their instruments to produce pleasing harmony. Zou et al. [46] proposed a novel global harmony search algorithm (NGHS). The algorithm improves position updating and adds genetic mutation, which moves the worst harmony in the harmony memory toward the global optimum and prevents the algorithm from falling into local optima. Wang proposed an improved adaptive binary harmony search algorithm (ABHS), which enhances the search ability and robustness of the algorithm [47]. Kong et al. proposed a simplified binary harmony search (SBHS) algorithm, which modifies the improvisation process and adapts the harmony memory considering rate [48]; for infeasible solutions, the algorithm uses a two-stage greedy repair process. Wang et al. proposed an improved differential harmony search (IDHS) algorithm to solve numerical function optimization problems [49]. Mahmoudi et al. combined a fuzzy logic controller with the harmony search (HS) algorithm for PWR loading pattern optimization; parameters such as the bandwidth and the pitch adjustment rate are tuned by the fuzzy logic controller to improve the performance of the proposed algorithm, and the superiority of the algorithm is demonstrated by a case study [50].

Despite the improvements above, harmony search variants still tend to fall into local optima on high-dimensional problems. Guided by the characteristics of the 0-1 knapsack problem, this paper proposes a hybrid harmony search algorithm with distribution estimation (HHSEDA). It mainly addresses the tendency to fall into local optima on high-dimensional 0-1 knapsack problems, and low-dimensional tests are also carried out to verify the reliability of the algorithm.

The main features of the proposed algorithm are as follows:
(1) A feasible discrete initialization method is used to increase the randomness of the initial solutions; each initial harmony is a 0-1 vector and all initial harmonies are feasible.
(2) The probability idea of the distribution estimation algorithm is used to adjust the selected harmonies, which steers the changes of the harmonies toward the optimal direction and enhances the search ability of the algorithm.
(3) The improvisation of the harmony search algorithm is adapted to a discrete form.
(4) A greedy repair mechanism is designed to repair harmonies that violate the constraint and to further improve harmonies that satisfy it.

The rest of this paper is organized as follows. Section 2 introduces the basic harmony search algorithm and related improved versions. Section 3 describes the hybrid harmony search algorithm with distribution estimation in detail. Section 4 presents the experimental results and evaluates the optimization performance of the proposed hybrid algorithm. The conclusions and directions for further research are summarized in Section 5.

2. Harmony Search Algorithms

2.1. The Harmony Search Algorithm

The harmony search algorithm is a metaheuristic proposed by the Korean scholar Geem in 2001 [45]. It imitates the principle of musical performance. Harmony search differs from the genetic algorithm, the differential evolution algorithm, and particle swarm optimization in that it is guided by probability. The key idea is to first initialize a harmony memory and then improvise new harmonies from it through three operations: harmony memory consideration, pitch adjustment, and random search. Iteration proceeds by comparing fitness values: if the new harmony's fitness is better than the worst fitness in the harmony memory, the worst harmony is replaced, and this cycle continues until the stopping condition is satisfied. The detailed steps of the harmony search algorithm are as follows:

Step 1. Initialize the algorithm parameters. HS generally needs five parameters: the harmony memory size (HMS), the harmony memory considering rate (HMCR), the pitch adjustment rate (PAR), the fine-tuning bandwidth (BW), and the maximum number of iterations K.

Step 2. The harmony memory HM is randomly initialized according to

$$x_i^j = x_i^{L} + \text{rand} \times (x_i^{U} - x_i^{L}), \quad i = 1, \dots, N, \; j = 1, \dots, \text{HMS},$$

so that HM stores HMS harmony vectors of dimension N. Here $x_i^{U}$ and $x_i^{L}$ are the upper and lower limits of the ith-dimensional variable, and rand is a random number in (0, 1).

Step 3. Improvise to generate new harmonies. Improvisation is performed with the given parameters HMCR, PAR, and BW to generate a new harmony. The specific process is as follows (Algorithms 1–5).

for i = 1 to N do
   if rand < HMCR
      % Harmony memory consideration: inherit the value from a randomly chosen harmony
      x_new(i) = HM(ceil(HMS ∗ rand), i);
      if rand < PAR
         % Pitch adjustment: fine-tune the value within the bandwidth BW
         x_new(i) = x_new(i) ± rand ∗ BW;
      end if
   else
      % Random mutation: draw a fresh value from the variable's range
      x_new(i) = x_i^L + rand ∗ (x_i^U − x_i^L);
   end if
end for
for i = 1: HMS
   for j = 1: N
      if rand < rand
         x(i, j) = 1;
      else
         x(i, j) = 0;
      end if
   end for
   Calculate the total weight G(i) of the harmony
   while G(i) > C % the harmony violates the capacity constraint
      Reinitialize the harmony vector x(i, :) and recompute G(i)
   end while
end for
Initialize the harmony memory
Calculate the fitness values:
for i = 1 to HMS
   The fitness of harmony i is the total value of the objects it loads into the knapsack, provided the capacity constraint is met
end
% Sort the harmonies by fitness value
[Fitness, index] = sort(Fitness_Value);
% Select the dominant harmonies
for i = 1 to YouxiuNum
   Select the harmonies with the highest fitness values according to the sorted order
end
% Update the probability model
Update the probability model with the dominant harmonies and generate new probability values.
% Check the termination condition
If the stopping condition is not met, feed the updated probabilities back into the loop until it is met.
for j = 1: N
   Calculate the probability that dimension j is selected in the dominant population
   if rand < HMCR % harmony memory consideration
      Take the jth element of a harmony vector randomly selected from the harmony memory
      if rand < PAR % determine whether to make a pitch adjustment
         if rand < probability(j) % judged with the probability produced by the distribution estimation algorithm
            B(j) = 1;
         else
            B(j) = 0;
         end if
      end if
   else
      B(j) = round(rand); % random mutation: 0 or 1
   end if
end for
while G > C % the harmony violates the capacity constraint
   Randomly select a certain number of dimensions dd
   Shuffle the selected dimensions
   if B(dd(i)) == 1
      B(dd(i)) = 0;
      G = G − W(dd(i));
   end
   Check whether all selected dimensions have been processed
end

In the above algorithms, rand denotes a random number generated in the range (0, 1).

Step 4. Update the harmony memory.
If the newly generated harmony $x^{new}$ is better than the worst harmony $x^{worst}$ in the harmony memory, the worst harmony in HM is replaced with $x^{new}$.

Step 5. Check the stopping criterion.
If the current iteration number is greater than the maximum iteration number K, the algorithm terminates and the best harmony vector in HM is output as the final result; otherwise, Steps 3 and 4 are repeated.
The harmony search algorithm can solve some problems encountered in real life, but the simplicity of its mechanism brings shortcomings: during iteration its local search ability is poor, and it easily falls into local optima.
Many attempts have been made to improve the performance of the original algorithm. The directions of improvement include changing the algorithm parameters, changing the algorithm strategy, and combining the two. Experiments show that varying the parameters of the algorithm can play an important role in the optimization results.

2.2. Binary-Coding Harmony Search Algorithm (BHS)

Geem proposed a binary harmony search algorithm (BHS) to solve the pump switching problem [51]. Compared with the classical harmony search algorithm, BHS selects each element of a new harmony either from the harmony memory HM or randomly as 0 or 1.

The improvisation process of BHS is

$$x_i^{new} = \begin{cases} x_i^j \in \{x_i^1, x_i^2, \dots, x_i^{\text{HMS}}\}, & \text{if rand} < \text{HMCR}, \\ 0 \text{ or } 1 \text{ chosen at random}, & \text{otherwise}, \end{cases}$$

where $x_i^{new}$ is the ith element of the newly generated harmony and rand is a random number in the range of 0 to 1.

2.3. Discrete Binary Harmony Search Algorithm (DBHS)

Because the standard harmony search algorithm cannot directly handle discrete problems, Wang et al. [52] proposed a new discrete binary harmony search (DBHS) algorithm. DBHS introduces a special pitch adjustment scheme and forms each element of the new harmony from a single harmony selected in HM.

The improvisation process of DBHS is

$$x_i^{new} = x_i^{r}, \qquad r \in \{1, 2, \dots, \text{HMS}\},$$

where $r$ is an index randomly selected from $\{1, 2, \dots, \text{HMS}\}$ and $x_i^{r}$ represents the ith element of the rth harmony in HM.

In the specific 0-1 knapsack problem, each element of a harmony can only be 0 or 1, and the pitch adjustment formula becomes

$$x_i^{new} = x_i^{gbest},$$

where $x_i^{gbest}$ represents the ith element of the global best harmony vector $x^{gbest}$. This formula improves the local search ability of the DBHS algorithm.

2.4. A Novel Global Harmony Search Algorithm (NGHS)

Zou et al. proposed a novel global harmony search algorithm inspired by swarm intelligence [46]. The improvements are mainly made in two aspects: position updating and genetic mutation. The improvisation process of NGHS is

$$x_R = 2 x_i^{best} - x_i^{worst},$$
$$x_i^{new} = x_i^{worst} + \text{rand} \times (x_R - x_i^{worst}),$$

where $x_R$ is clamped to the feasible range, and, with a small genetic mutation probability $p_m$,

$$x_i^{new} = x_i^{L} + \text{rand} \times (x_i^{U} - x_i^{L}).$$

In the above process, best and worst represent the global best harmony and the global worst harmony in HM, respectively, and rand is a random number in the range 0 to 1. It can be seen that in NGHS each harmony always moves in the direction of the best harmony, while generating a random new solution with a certain probability prevents the search from falling into local optima.

2.5. Adaptive Binary Harmony Search Algorithm (ABHS)

Wang et al. proposed an improved adaptive binary harmony search algorithm (ABHS) to overcome the degradation of the pitch adjustment operator in the harmony search algorithm [47]. The algorithm uses a position selection strategy for the harmony memory consideration and the same pitch adjustment rule as DBHS. The main innovation of the algorithm is the adaptation of HMCR and PAR: HMCR changes dynamically according to the problem dimension and the current iteration number, where K and k denote the maximum and the current iteration number respectively, c is a constant, and [x] is the operator that takes the largest integer not exceeding x.

3. Hybrid Harmony Search Algorithm with Distribution Estimation

In this section, we introduce the hybrid algorithm based on distribution estimation and harmony search. We consider the initialization, distribution estimation algorithm, improvisation operation and improved repair algorithm.

3.1. An Initialization Method under Constraints

The 0-1 knapsack problem is a discrete problem whose states are represented by 0 or 1: 1 means the object is loaded into the knapsack and 0 means it is not. The proposed method initializes random harmonies and guarantees that every harmony is feasible, as described below. Its advantage lies in its simplicity: if a generated harmony violates the constraint, it is regenerated until the constraint is satisfied.

Here, G(i) is the total weight of the knapsack for the ith harmony, and x(i, j) is the state of the jth object in the ith harmony.
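A minimal Python sketch of this initialization is given below, assuming the redraw-until-feasible behavior described above; the function and variable names are ours, not the paper's.

import random

def init_harmony_memory(hms, weights, capacity):
    # Draw random 0-1 harmonies; redraw any harmony that exceeds the capacity,
    # so every harmony stored in the memory is feasible.
    memory = []
    for _ in range(hms):
        while True:
            x = [random.randint(0, 1) for _ in weights]
            if sum(w for w, xi in zip(weights, x) if xi) <= capacity:
                memory.append(x)
                break
    return memory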

3.2. The Distribution Estimation Algorithm

The distribution estimation algorithm was proposed in 1996 and has been studied intensively ever since [53]. In a traditional genetic algorithm, candidate solutions are generated by iterating selection, crossover, mutation, and other operations that simulate natural evolution. Unlike such algorithms, the distribution estimation algorithm uses a probability model and a sampling procedure. In essence, it describes the spatial distribution of the candidate solutions through a probability model learned statistically at the macroscopic level of the population. The probability model is then randomly sampled to generate a new population, and the iterations repeat, updating the population until the termination condition is met. The specific steps are as follows:

Step 1. Randomly generate M individuals as the initial population.

Step 2. Calculate the fitness of each individual in the Lth-generation population and judge whether the termination condition is met. If so, the cycle terminates; if not, proceed to the next step.

Step 3. Select the N most dominant individuals from the population according to their fitness values to form the dominant subpopulation of the (L + 1)th generation.

Step 4. Update the probability model according to the dominant subpopulation (N ≤ M).

Step 5. Randomly sample the probability model to generate a new population of size M and return to Step 2.
The above description outlines the basic process of the distribution estimation algorithm, which Figure 1 summarizes. Unlike the genetic algorithm, the distribution estimation algorithm has no crossover or mutation operations; instead, by learning a probability model and sampling from it, the population distribution evolves toward better individuals. From the perspective of biological evolution, the genetic algorithm simulates the microscopic changes among individuals, while the distribution estimation algorithm models and simulates the overall distribution of the biological population.
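For a binary problem such as the 0-1 knapsack, Steps 3 and 4 can be realized with a simple univariate marginal model, as in the Python sketch below: the top-N individuals are kept and, for each dimension, the fraction of them with that bit set to 1 becomes the sampling probability. This is one common choice for binary EDAs and an assumption of the sketch, not necessarily the paper's exact model.

def update_probability_model(population, fitnesses, n_dominant):
    # Rank individuals by fitness (best first) and keep the top n_dominant.
    ranked = sorted(zip(fitnesses, population), key=lambda t: t[0], reverse=True)
    dominant = [ind for _, ind in ranked[:n_dominant]]
    dims = len(dominant[0])
    # probability[j] = fraction of dominant individuals whose bit j equals 1
    return [sum(ind[j] for ind in dominant) / n_dominant for j in range(dims)]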

3.3. A Directed Improvisation

In HS, the improvisation stage is important, and changes to the harmony vectors mainly take place here. There are three key operations in the improvisation process; since the 0-1 knapsack problem is discrete, each must be adapted to its discrete nature: (1) harmony memory consideration, in which HMCR is compared with a random number and a harmony vector is selected from the harmony memory HM for subsequent operations; (2) pitch adjustment, in which PAR is compared with a random number and, if triggered, the probability obtained from the distribution estimation algorithm is compared with a random number to set the element to 0 or 1; and (3) random mutation, a random perturbation strategy that explores new harmony variables, which can only take the values 0 or 1.

In this paper, the probability idea of the distribution estimation algorithm is used to guide the tendency of the whole harmony memory, and the harmonies are optimized and adjusted using the estimated distribution probabilities. In line with the characteristics of the 0-1 knapsack problem, a random correction process carries out the mutation operation. Together, these form what we call directed improvisation.

In the directed improvisation process, the probability used during pitch adjustment is the 0-1 distribution probability estimated from the harmony memory, which steers the newly generated harmonies, as a whole, toward better results. After many iterations, the harmonies in the harmony memory draw ever closer to the best harmony.
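The following Python sketch follows the structure of Algorithm 4: harmony memory consideration with probability HMCR, probability-guided pitch adjustment with probability PAR, and random 0-1 mutation otherwise. The signature is ours; probability is the marginal model from the distribution estimation step.

import random

def improvise(memory, probability, hmcr, par):
    dims = len(probability)
    new_harmony = [0] * dims
    for j in range(dims):
        if random.random() < hmcr:
            # harmony memory consideration: copy bit j from a random harmony
            r = random.randrange(len(memory))
            new_harmony[j] = memory[r][j]
            if random.random() < par:
                # pitch adjustment guided by the estimated distribution
                new_harmony[j] = 1 if random.random() < probability[j] else 0
        else:
            # random mutation: a fresh 0 or 1
            new_harmony[j] = random.randint(0, 1)
    return new_harmony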

3.4. An Improved Repair Operator

In constrained problems, the search should stay in the feasible region; infeasible solutions cause the search to stagnate in the infeasible region. Therefore, it is necessary to ensure that all harmony vectors stored in HM satisfy the constraint. If a newly created harmony does not, a random repair mechanism is applied: a certain number of dimensions are selected independently at random and shuffled; for each selected dimension, if the corresponding object is in the knapsack, it is removed, and the procedure checks whether the harmony now meets the knapsack capacity.

If a newly generated harmony satisfies the constraint, an improved method is used to further optimize it under the constraint, as described in Algorithm 6 below. In this way, harmonies that violate the constraint are repaired and harmonies that satisfy it are improved, so that the harmony vectors stored in the harmony memory HM load as much value into the knapsack as possible while meeting the capacity constraint.

while G < C % spare capacity remains
   Randomly select a certain number of dimensions dd
   Shuffle the selected dimensions
   L = G; % total weight held
   if B(dd(i)) == 0
      B(dd(i)) = 1;
      L = L + W(dd(i));
   end
   if L > C % the added object overshoots the capacity, so undo it
      B(dd(i)) = 0;
      L = L − W(dd(i));
   end
   Check whether all selected dimensions have been processed
end
G = L;

Here, B(dd(i)) is the state of the object at the selected dimension dd(i) in the harmony, and W(dd(i)) is the weight of that object.
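A compact Python sketch of the two operators is shown below, assuming the shuffled-dimension scheme of Algorithms 5 and 6: items are removed at random while the harmony is overweight, and then items are added at random whenever they still fit, which is equivalent to the add-then-undo test in Algorithm 6. Names are illustrative.

import random

def repair_and_improve(x, weights, capacity):
    total = sum(w for w, xi in zip(weights, x) if xi)
    order = list(range(len(x)))
    random.shuffle(order)           # "sort the selected dimensions out of order"
    for i in order:                 # repair: drop items until feasible
        if total <= capacity:
            break
        if x[i] == 1:
            x[i] = 0
            total -= weights[i]
    random.shuffle(order)
    for i in order:                 # improvement: fill the leftover capacity
        if x[i] == 0 and total + weights[i] <= capacity:
            x[i] = 1
            total += weights[i]
    return x, total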

3.5. The HHSEDA Algorithm Flow

The proposed HHSEDA algorithm can be described as follows.

Step 1. Initialize the parameters of the HHSEDA algorithm, including the harmony memory size (HMS), the harmony memory considering rate (HMCR), the size of the dominant population (YouxiuNum), the maximum number of iterations (J1), and the population distribution probability (probability).

Step 2. Initialize the harmony memory with the initialization method that satisfies the constraints, and calculate the fitness values.

Step 3. Generate new harmonies by the directed improvisation based on the idea of the distribution estimation algorithm.

Step 4. The newly generated solution is repaired or improved by the improved repair operator.

Step 5. Update the harmony memory: determine whether the newly generated harmony is better than the worst harmony in HM. If so, replace the worst harmony; if not, leave HM unchanged.

Step 6. Check whether the termination condition is met. If yes, the algorithm terminates; if no, repeat Steps 3 to 5.
The pseudocode of the HHSEDA algorithm is given in Algorithm 7, in which J1 is the maximum number of iterations. In each iteration, the dominant population is selected, the probability of each object being selected is calculated with the distribution estimation algorithm, and then harmony memory consideration and pitch adjustment are performed. Whenever a new harmony is generated, the greedy operators are used to repair or improve it.
The flow chart of the algorithm of HHSEDA is shown in Figure 2.

Initialize the algorithm and problem parameters
Initialize the harmony memory with the feasible initialization method so that the knapsack capacity constraint is met
Calculate the fitness values after initialization
for i = 1 to J1 do
   Record the worst and the best harmony in the harmony memory
   Calculate the fitness values and sort them
   for k = 1 to YouxiuNum % select the dominant population
      Select the YouxiuNum harmonies with the highest fitness values
   end
   for j = 1 to N % generate a new harmony by directed improvisation
      Count how often object j is selected in the dominant population
      if rand < HMCR
         r = ceil(HMS ∗ rand); % randomly select a harmony vector
         if rand < PAR
            if rand < probability(j)
               Update the corresponding element of the new harmony
            end
         end
      else
         Random mutation produces the new element
      end
   end for
   Calculate the total weight of the newly produced harmony
   If the knapsack capacity constraint is met, apply the greedy improvement under the constraint
   If it is not met, repair the harmony
   if fit(B) > fit(worst)
      Replace the worst harmony in HM
   end
end for
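Wiring the earlier sketches together gives the following compact Python version of the HHSEDA loop. The helper functions (init_harmony_memory, update_probability_model, improvise, repair_and_improve, knapsack_value) are the illustrative ones sketched in earlier sections, and the default parameter values are ours, not the paper's tuned settings.

def hhseda(profits, weights, capacity, hms=30, hmcr=0.9, par=0.3,
           n_dominant=10, max_iters=10000):
    memory = init_harmony_memory(hms, weights, capacity)
    fitness = [knapsack_value(x, profits, weights, capacity) for x in memory]
    for _ in range(max_iters):
        # estimate marginal probabilities from the dominant harmonies
        probability = update_probability_model(memory, fitness, n_dominant)
        new = improvise(memory, probability, hmcr, par)
        new, _ = repair_and_improve(new, weights, capacity)
        new_fit = knapsack_value(new, profits, weights, capacity)
        worst = min(range(hms), key=lambda i: fitness[i])
        if new_fit > fitness[worst]:      # replace the worst harmony
            memory[worst], fitness[worst] = new, new_fit
    best = max(range(hms), key=lambda i: fitness[i])
    return memory[best], fitness[best]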

4. Computational Results

In this section, extensive experimental studies are carried out to assess the performance of the proposed hybrid algorithm. Based on 10 low-dimensional instances and 3 groups of high-dimensional instances, the advantages of the proposed hybrid harmony search algorithm with distribution estimation are confirmed. The algorithm is written in MATLAB 2018a and runs on a computer with an Intel Core i5-6300 (2.3 GHz) processor, 8 GB of RAM, and the Windows 10 operating system.

4.1. Parameter Settings

This paper uses BHS [51], DBHS [52], NGHS [46], ABHS [47], and other algorithms related to HHSEDA and harmony search to solve the 0-1 knapsack problem, and compares them with GA, PSO, BFPA [41], and BABC-DE [54]. To ensure the comparability and effectiveness of the simulation experiments, the relevant parameters of the above nine algorithms are listed in Table 1.

4.2. Comparison Based on Low-Dimensional Instances of the 0-1 Knapsack Problem

In this section, 10 low-dimensional instances of the 0-1 knapsack problem are selected to study the performance of the proposed algorithm. Table 2 lists the information needed for the tests, including the dimension, weights, values, and knapsack capacity of each instance. The maximum number of iterations is 3000, and each instance is run in 30 independent experiments. To compare the performance of the improved harmony search algorithm on the 0-1 knapsack problem more comprehensively, we consider the statistical indicators "best," "worst," "average," and "standard deviation." The best and worst values indicate the strengths and weaknesses of each algorithm, while the average and standard deviation intuitively show its robustness.

The performance of the algorithm is compared with that of eight other algorithms. The comparison results with NGHS [46], BHS [51], DBHS [52], and ABHS [47] are given in Table 3, and the comparison results with GA, PSO, BFPA [41], and BABC-DE [54] are given in Table 4.

Table 3 contains the experimental results for the 10 low-dimensional knapsack instances, whose optimal values are known. Each of HHSEDA, ABHS, NGHS, DBHS, and BHS is run for 3000 iterations with a population size of 30 and 30 independent runs.

According to Table 3, HHSEDA and ABHS show the best performance: their best value, median, worst value, and average all equal the known optimum, and the standard deviation over 30 independent runs is 0 on every instance. The other three algorithms do not perform as well. NGHS reaches the optimum in terms of the best value, median, worst value, and average on 100%, 70%, 40%, and 40% of the instances, respectively. DBHS reaches the optimum on these criteria on 70%, 60%, 60%, and 60% of the instances, respectively. BHS reaches the optimum in terms of the best value, median, worst value, average, and standard deviation on 100%, 100%, 50%, 50%, and 50% of the instances, respectively.

In Table 4, GA, PSO, BFPA, and BABC-DE use the same numbers of iterations, population size, and independent runs as in Table 3. According to Table 4, BABC-DE achieves the same results as HHSEDA. GA and PSO perform well overall, but they cannot produce results satisfying the knapsack capacity constraint on KP3, KP4, and KP10; the reason is that the penalty method used in their repair process cannot cope with relatively complex data. BFPA reaches the optimum in terms of the best value, median, worst value, and average on 100%, 70%, 70%, and 70% of the instances, respectively.

In sum, based on the above experiments on the low-dimensional instances, HHSEDA, ABHS, and BABC-DE are better than the other algorithms and reach the optimum on all five evaluation criteria.

4.3. Comparison Based on High-Dimensional Instances of the 0-1 Knapsack Problem

Based on the above experiments on 10 low-dimensional instances of the 0-1 knapsack problem, HHSEDA performs better than the other algorithms in most cases. However, these low-dimensional instances are relatively simple; the highest dimension above is only 23. As the dimension increases, the 0-1 knapsack problem becomes more and more complex.

4.3.1. Uncorrelated High-Dimensional Instances

In order to further assess the performance of the proposed HHSEDA algorithm, we use nine high-dimensional instances of the 0-1 knapsack problem with dimensions 100, 200, 300, 500, 700, 1000, 1200, 1500, and 2000. Each uncorrelated high-dimensional instance (KP11–KP19) is generated as follows: the volume of each object is randomly selected from 5 to 20, the corresponding profit is randomly selected from 50 to 100, and the maximum capacity of the knapsack is 0.75 of the total volume of the items generated by this procedure. Each instance is generated randomly once and remains unchanged in all later experiments. The maximum numbers of iterations for the nine instances are 10000, 10000, 10000, 20000, 30000, 50000, 50000, 70000, and 100000, respectively. The population size is set to 30 in all experiments, and the number of independent runs is 30.
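The following Python sketch reproduces this uncorrelated generator under the stated ranges; whether the draws are integers, and the seeding scheme, are our assumptions.

import random

def generate_uncorrelated_instance(n, seed=None):
    rng = random.Random(seed)
    weights = [rng.randint(5, 20) for _ in range(n)]      # volumes in [5, 20]
    profits = [rng.randint(50, 100) for _ in range(n)]    # profits in [50, 100]
    capacity = int(0.75 * sum(weights))                   # 75% of total volume
    return profits, weights, capacity

# generate each instance once and keep it fixed across all later experiments
profits, weights, capacity = generate_uncorrelated_instance(100, seed=1)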

Tables 5 and 6 report the performance metrics of HHSEDA and the comparison algorithms on each instance, with the best results highlighted in boldface. The optimal profits of KP11–KP19 are unknown, so the best value, median, worst value, average, and standard deviation are all reported in the tables.

In many cases, the worst values obtained by HHSEDA are far better than those obtained by the other algorithms. In terms of the best value, HHSEDA performs the same as the ABHS, BHS, and BABC-DE algorithms except on KP11 (100-dimensional). Compared with the suboptimal values achieved by the other algorithms, the corresponding improvement rates of HHSEDA are 0%, 0.05%, 0.09%, 0.133%, 0.16%, 0.15%, 0.14%, and 0.22%, respectively (the results of the GA on the 100-dimensional KP11 and the 200-dimensional KP12 violate the constraints and are excluded). In terms of the worst value, HHSEDA performs better in all cases except KP11 (100-dimensional), where it is slightly below the BABC-DE algorithm. Compared with the worst values achieved by the other algorithms, the corresponding improvement rates are 51.38%, 55.93%, 53.16%, 62.38%, 57.72%, 63.58%, 64.58%, 64.38%, and 65.47%, respectively. Among these, the biggest improvement, 65.47%, is over the BFPA algorithm on the 2000-dimensional instance. In general, the improvement rate becomes more pronounced as the dimension increases. In terms of the average value, the corresponding improvement rates are 0.5%, 25.26%, 42.22%, 0.20%, 13.37%, 26.27%, 56.56%, and 1.03%, respectively.

4.3.2. Low-Correlation High-Dimensional Instances

In order to further evaluate the performance of HHSEDA, nine instances of the 0-1 knapsack problem with dimensions 100, 200, 300, 500, 700, 1000, 1200, 1500, and 2000 are randomly generated. The nine instances (KP20–KP28) are generated under the following conditions: the volume of each object is randomly selected from 10 to 50, the corresponding profit is randomly selected within 10 of the object's volume, and the maximum capacity of the knapsack is one-half of the total weight of the items generated by this procedure.

BHS, DBHS, NGHS, ABHS, GA, PSO, BFPA, BABC-DE and HHSEDA are applied to these instances. The results are reported in Tables 7 and 8.

In terms of optimal value, HHSEDA obtains the same optimal value as ABHS and BHS algorithms except for KP20 (dimension = 100). Compared with the suboptimal value achieved by other algorithms, the corresponding improvement rates of HHSEDA are 0%, 0.03%, 0.06%, 0.06%, 0.05%, 0.05%, 0.14%, 0.14%, and 0.16%, respectively. In terms of average value, the corresponding improvement rates of HHSEDA are 0.78%, 8.98%, 12.69%, 0.13%, 6.84%, 10.19%, 11.99%, and 7.98%, respectively. Compared with the worst value achieved by other algorithms, the corresponding improvement rates are 10.67%, 12.75%, 13.29%, 12.88%, 13.45%, 14.28%, 14.42%, 14.46%, and 14.44% respectively.

4.3.3. High-Correlation High-Dimensional Instances

In order to further evaluate the performance of HHSEDA, nine instances of the 0-1 knapsack problem with dimensions 100, 200, 300, 500, 700, 1000, 1200, 1500, and 2000 are randomly generated. These instances, with strong correlation between value and weight, are generated under the following conditions: the volume of each object is randomly selected from 10 to 50, the corresponding profit is the object's volume plus a random number from 0 to 10, and the maximum capacity of the knapsack is half of the total weight of the items generated by this procedure. BHS, DBHS, NGHS, ABHS, GA, PSO, BFPA, BABC-DE, and HHSEDA are used for comparison, and the experimental results are presented in Tables 9 and 10.

In terms of the best value, because the weight of each object loaded into the knapsack is highly correlated with its value, the improvement in the best value is not obvious. In terms of the suboptimal value, the corresponding improvement rates of HHSEDA are 0%, 0.08%, 0.01%, 0.10%, 0.08%, 0.07%, 0.11%, 0.09%, and 0.01%, respectively. In terms of the average value, the corresponding improvement rates of HHSEDA are 0.75%, 4.95%, 6.32%, 0.23%, 3.64%, 4.39%, 6.71%, and 3.25%, respectively. In terms of the worst value, the corresponding improvement rates of HHSEDA are 6.51%, 7.24%, 7.19%, 6.74%, 7.41%, 7.37%, 7.45%, 7.41%, and 7.07%, respectively. Among these, the largest rate, 7.45%, is over the BFPA algorithm on the instance with dimension 1200.

According to Table 11, among the harmony search variants, HHSEDA runs slower than NGHS, DBHS, and BHS. This is because HHSEDA must compute the probability distribution of the harmonies in the harmony memory at every iteration, which adds some time cost. However, compared with the non-harmony-search algorithms, it still holds an advantage in time consumption.

The maximum standard deviation of HHSEDA is less than 32, 8, and 12 on the instances with no, low, and high correlation, respectively. In terms of standard deviation, HHSEDA thus has an advantage over the other algorithms. This is because HHSEDA summarizes the harmony memory at the macro level, feeds this summary back to the individual harmonies, and sharpens the results through the improvement and repair operators.

Figure 3 compares the convergence of BABC-DE, ABHS, HHSEDA, BFPA, GA, and PSO on the 27 high-dimensional instances. Except in Experiment 1 (uncorrelated, 100-dimensional) and Experiment 2 (uncorrelated, 200-dimensional), where the GA violates the constraints, the average convergence speed and accuracy of HHSEDA are clearly better than those of the other algorithms.

From the simulation curves in Figure 3, BABC-DE generally converges quickly and achieves good results. However, in Experiments 8–27 it easily falls into local optima, because its repair operator cannot escape a local optimum when processing highly correlated data. The convergence curves of BFPA show that it is consistently ineffective on high-dimensional problems: it spends a lot of time on global search (Lévy flights) and local search, and its repair operator makes it prone to local optima. The convergence curves of particle swarm optimization (PSO) and the genetic algorithm (GA) show poor results on high-dimensional problems and a tendency to fall into local optima; their search is not very efficient, the penalty function method contributes little to repairing the results, and constraints are violated on some data.

The convergence curves of ABHS in Figure 3 show relatively good results in all experiments. In Experiments 1–9, the convergence curve of ABHS is the closest to that of HHSEDA, thanks to its parameter adaptation. Compared with HHSEDA, however, adjusting the harmony vectors without the probability of the harmony distribution in the harmony memory is less effective.

To evaluate the dispersion of the HHSEDA algorithm, the 9 low-correlation instances are selected and each is run independently 10 times. The results are presented as box plots in Figure 4. The figure shows that the minimum, maximum, and median of HHSEDA are both closer together and better than those of the other algorithms. In conclusion, HHSEDA has better accuracy and computational stability.

It can be seen from Tables 1–11 and Figures 1–4 that HHSEDA is an excellent HS variant with fast convergence and excellent stability. Therefore, HHSEDA is an efficient and reliable algorithm for solving high-dimensional 0-1 knapsack problems. The reasons include the better initialization, the enhanced local search based on the hybrid strategies, and the infeasible-solution repair strategy.

5. Conclusions

This paper presents a hybrid harmony search and distribution estimation algorithm for the 0-1 knapsack problem. First, a new initialization method under constraints is proposed; compared with classical initialization, its advantage is that it directly generates harmonies conforming to the constraints. In addition, the hybrid algorithm combines the probability idea of the distribution estimation algorithm with harmony search: it finds the individuals with better fitness, analyzes the distribution profile of these excellent harmonies, and steers subsequent improvements in a more optimal direction, which greatly improves the probability of finding the optimal value. Finally, to correct infeasible solutions during the iterations, a new improved repair operator is proposed. It transforms infeasible solutions into feasible ones and further optimizes feasible solutions under the constraints to produce better solutions. This repair operator not only guarantees the validity of the solutions but also improves the accuracy and convergence of the HHSEDA algorithm.

In general, HHSEDA has the following three advantages:
(i) In the initialization stage, a population conforming to the constraints can be generated, preparing for the subsequent iterations of the algorithm.
(ii) In the improvisation process, the probability idea of the distribution estimation algorithm is added to harmony search, which makes it difficult to fall into local optima.
(iii) For infeasible solutions, a newly proposed repair strategy is added, which repairs infeasible solutions into feasible ones more efficiently and so improves the overall fitness.

Experiments are conducted on 10 classic low-dimensional instances of the 0-1 knapsack problem and on 9 high-dimensional instances each with no correlation, low correlation, and high correlation. The simulation results show that HHSEDA possesses good optimization potential and stability for solving the 0-1 knapsack problem. Especially for large-scale 0-1 knapsack problems, it performs much better than ABHS, BABC-DE, BFPA, and other algorithms. In sum, the HHSEDA algorithm performs better than the other algorithms on low-dimensional and on large-scale uncorrelated, low-correlation, and high-correlation 0-1 knapsack problems. In future work, the application of the algorithm can be further expanded based on its characteristics, for example to multidimensional knapsacks, discounted knapsacks, and even path-planning problems.

Data Availability

All data used to support the findings of this study come from our experiments and are publicly available. They can be provided by the authors upon request.

Ethical Approval

This article does not contain any studies with human participants performed by any of the authors.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant no. 61806058), Natural Science Foundation of Guangdong Province (2018A030310063), and Guangzhou Science and Technology Plan Project (201804010299).