Article

Multi-Objective Optimization Benchmarking Using DSCTool

Computer Systems Department, Jožef Stefan Institute, Jamova cesta 39, SI-1000 Ljubljana, Slovenia
* Author to whom correspondence should be addressed.
Mathematics 2020, 8(5), 839; https://doi.org/10.3390/math8050839
Submission received: 8 May 2020 / Revised: 20 May 2020 / Accepted: 21 May 2020 / Published: 22 May 2020
(This article belongs to the Special Issue Advances of Metaheuristic Computation)

Abstract

Statistical approaches are essential for exploring data in any data analysis. Nowadays, with the increase in computational power and the availability of big data in different domains, it is no longer enough to perform exploratory data analysis (descriptive statistics) to obtain prior insights from the data; higher-level statistics are required, which in turn demand much greater knowledge from the user to be applied properly. One research area where the proper usage of statistics is important is multi-objective optimization, where the performance of a newly developed algorithm should be compared with the performances of state-of-the-art algorithms. In multi-objective optimization, we are dealing with two or more, usually conflicting, objectives, which result in high-dimensional data that needs to be analyzed. In this paper, we present a web-service-based e-Learning tool called DSCTool that can be used to perform a proper statistical analysis for multi-objective optimization. The tool does not require any special statistical knowledge from the user. Its usage and the influence of a proper statistical analysis are shown using data taken from a benchmarking study performed at the 2018 IEEE CEC (IEEE Congress on Evolutionary Computation) Competition on Evolutionary Many-Objective Optimization.

1. Introduction

Nowadays, comparing the performance of a newly developed multi-objective optimization algorithm typically involves calculating descriptive statistics such as means, medians, and standard deviations of the algorithm’s performance. Recently, we have published several studies showing that calculating descriptive statistics is not enough to compare algorithms’ performances; some higher-level statistics are also required [1,2,3].
In this paper, we will show how different benchmarking practices can have a big influence on the outcome of a statistical analysis performed on multi-objective optimization algorithms. Additionally, we will present a web-service-based e-Learning tool called DSCTool [4], which guides the user through all the steps needed to perform a proper statistical analysis. The DSCTool also reduces the additional statistical knowledge required from the user, such as knowing which conditions must be fulfilled to select a relevant and proper statistical test (e.g., parametric or nonparametric) [5]. The conditions that should be checked before selecting a relevant statistical test include data independence, normality, and homoscedasticity of variances.
In multi-objective optimization, there is no single solution that simultaneously optimizes each objective. Since the objectives are said to be conflicting, there exists a set of alternative solutions. Each solution that belongs to this set is optimal in the sense that no other solution from the search space is superior to it when all objectives are considered. The question that arises here is how to compare algorithms with regard to sets of solutions. To this end, many different performance metrics have been proposed, which map solution sets to real numbers. By using performance metrics, we can quantify the differences between solution sets, and this data can then be used as input for statistical tests.
To show the importance of a proper statistical analysis in benchmarking studies that involve multi-objective optimization algorithms, we performed an analysis of the results presented at the 2018 IEEE CEC Competition on Evolutionary Many-Objective Optimization. The analyses are performed using the DSCTool, and each step of the DSCTool pipeline is explained in more detail. More details about the DSCTool itself are presented in [4].
The rest of the paper is organized as follows. In Section 2, multi-objective optimization is briefly reintroduced, followed by Section 3, where two important caveats of statistical analysis are presented. Section 4 reintroduces the DSCTool, and Section 5 shows and compares different statistical analyses performed using the DSCTool. Finally, the conclusions of the paper are presented in Section 6.

2. Multi-Objective Optimization

Multi-objective optimization is an area of multiple-criteria decision making concerned with optimization problems that involve more than one objective to be optimized simultaneously. A multi-objective problem can be formulated as
$\min \left( f_1(x), f_2(x), \ldots, f_k(x) \right), \quad x \in X$
where $k \ge 2$ is the number of objectives and $X$ is the set of feasible solutions. $X$ must often satisfy inequality constraints $g_i(x) \le 0$, $i = 1, \ldots, l$, and equality constraints $h_j(x) = 0$, $j = 1, \ldots, m$. Since typical optimization algorithms are unconstrained search methods, dealing with constraints is a challenging task. Standard approaches [6] to handling constraints consist of penalty functions, objectivization of constraint violations, repair algorithms, separation of constraints and objectives, and hybrid methods.
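As a small, self-contained illustration of the penalty-function approach (the toy problem, constraint, and penalty weight below are our own choices and are not taken from the paper), a term proportional to the constraint violation can simply be added to every objective:

```python
# Toy bi-objective problem: f1(x) = x^2, f2(x) = (x - 2)^2,
# subject to the inequality constraint g(x) = 1 - x <= 0.

def objectives(x):
    return x ** 2, (x - 2.0) ** 2

def constraint_violation(x):
    # g(x) = 1 - x <= 0 is violated when x < 1.
    return max(0.0, 1.0 - x)

def penalized_objectives(x, penalty=1000.0):
    # Penalty-function approach: add a term proportional to the squared
    # violation to every objective, so the search algorithm can treat the
    # problem as unconstrained.
    f1, f2 = objectives(x)
    p = penalty * constraint_violation(x) ** 2
    return f1 + p, f2 + p

if __name__ == "__main__":
    for x in (0.5, 1.0, 1.5):
        print(x, penalized_objectives(x))
```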
The objectives are usually conflicting; therefore, there exists some kind of trade-off among them, i.e., we can only improve one objective at the expense of another. As a consequence, there is no single optimum solution for all objectives, but a set of solutions called the Pareto optimal set, which are all considered optimal with respect to all objectives. There are many quality aspects with regard to which approximation sets are compared, such as closeness to the Pareto optimal set and coverage of a wide range of diverse solutions. Quality is measured in terms of criteria that relate to the properties of convergence and diversity. Many studies have tackled this problem by using unary and binary quality indicators, which take one or two input sets and map them into a real number. However, one problem that arises is that there exist many quality indicators, for example, hypervolume (HV) [7], epsilon indicator (EI) [8], generational distance (GD) [9], and inverse generational distance (IGD) [9], which capture different information from the approximation set, and their selection can greatly influence the outcome of a comparison study. For the interested reader, Riquelme et al. [10] provide an in-depth review and analysis of 54 multi-objective-optimization metrics. One option to reduce the influence of the selection of a quality indicator is to use ensemble learning, where the idea is to combine the information gained from several quality indicators into one real value. With a suitable performance metric calculated, we then need to perform a statistical analysis to gain an understanding of the compared algorithms’ performances.
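To make the role of a quality indicator concrete, the following sketch computes GD and IGD for a toy approximation set against a toy reference set, using one common formulation (the averaged Euclidean distance); the point sets are illustrative only, and real studies use sampled Pareto fronts as in Section 5:

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def gd(approximation, reference):
    # Generational distance (averaged form): mean distance from each point of
    # the approximation set to its nearest point in the reference set.
    return sum(min(euclidean(a, r) for r in reference)
               for a in approximation) / len(approximation)

def igd(approximation, reference):
    # Inverse generational distance: the same computation with the roles of
    # the two sets swapped.
    return gd(reference, approximation)

if __name__ == "__main__":
    reference = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]      # toy Pareto front
    approximation = [(0.1, 0.9), (0.6, 0.6), (1.1, 0.1)]  # toy solution set
    print("GD :", gd(approximation, reference))
    print("IGD:", igd(approximation, reference))
```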

3. Statistical Analysis

Frequentist statistical analysis can be done via exploratory data analysis, which involves calculating descriptive statistics (i.e., mean, median, standard deviation, etc.), or via inferential statistics, which explore relations between the compared variables (e.g., hypothesis testing with statistical tests) [2]. No matter which statistical analysis is performed, it is very important to understand all its limitations and caveats, because improper usage can lead to wrong conclusions. For example, improper usage comes from misunderstanding or not knowing the requirements under which a certain statistical test can be applied. Recently, it was shown that outliers and small differences in the data can have a negative impact on the results obtained from statistical tests [11].

3.1. Outliers

An outlier is an observation that lies outside the overall distribution of the data [12]. If outliers are not properly handled, the resulting statistics can be deceptive and might not reflect the actual probability distribution of the data. In such cases, the statistical analysis may show a statistically significant difference while the probability distributions show the opposite.
Outliers are data points that differ significantly from the other data points in the data set and can cause serious problems in statistical analyses. For example, means are the most commonly used descriptive statistics in comparison studies because they are unbiased estimators. However, they are sensitive to outliers, so by using them for statistical analysis we inherently transfer this sensitivity to the results of any analysis that originates from them. One option to reduce the influence of outliers is to use medians, which are a more robust statistic since they are less sensitive to outliers.
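A short numeric illustration (with arbitrary values) makes the difference visible:

```python
from statistics import mean, median

runs = [0.101, 0.103, 0.102, 0.104, 0.102]    # five runs of an algorithm
runs_with_outlier = runs + [1.500]            # one failed run acts as an outlier

print(mean(runs), mean(runs_with_outlier))      # the mean shifts noticeably
print(median(runs), median(runs_with_outlier))  # the median barely moves
```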

3.2. Small Differences in the Data

Although medians reduce the sensitivity to outliers, both means and medians can have a negative impact on the results obtained from statistical tests. Such an impact can be observed when the differences between means or medians lie in some ϵ-neighborhood. An ϵ-neighborhood is a range in the function value space in which the distance from a given number is less than some specified number ϵ. When the means or medians are not the same but lie in some ϵ-neighborhood, the algorithms receive different rankings. However, if the distributions of multiple runs are the same, indicating that there are no performance differences between the compared algorithms, they should obtain the same rankings. Conversely, it can also happen that the distributions of multiple runs are not the same, indicating that the algorithms should obtain different rankings, but the means or medians are the same.
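The following sketch (on synthetic data, not the competition results) contrasts a mean-based comparison with a distribution-based one, mirroring the ϵ-neighborhood issue: the means differ by a small ϵ, so a mean-based ranking would separate the two algorithms, while a two-sample test on the distributions does not:

```python
import random
from statistics import mean

from scipy import stats

random.seed(1)
# Two algorithms whose results follow the same distribution.
a = [random.gauss(0.50, 0.05) for _ in range(30)]
b = [random.gauss(0.50, 0.05) for _ in range(30)]

# The means differ by a small epsilon, so a mean-based comparison
# would rank the two algorithms differently ...
print("difference of means:", abs(mean(a) - mean(b)))

# ... while a two-sample test on the distributions typically finds
# no statistically significant difference.
statistic, p_value = stats.ks_2samp(a, b)
print("two-sample KS p-value:", p_value)
```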

4. The DSCTool

The DSCTool is presented in [4] as an e-Learning tool whose goal is to support all required scenarios for different performance evaluations of single- and multi-objective optimization algorithms. It guides the user through the steps of a statistical analysis pipeline, starting from preparing and providing the input data (optimization algorithm results), then selecting the desired comparison scenario, and finally obtaining the result. In this way, the statistical knowledge required from the user is greatly reduced. In fact, the DSCTool allows users not to think about how the statistical analysis should be performed, but only to define the comparison scenario that is relevant for their experimental setup. Basically, the user must select the significance level used by the statistical test and follow a pipeline that selects the statistical tests most commonly used in benchmarking evolutionary algorithms. For multi-objective optimization algorithm analysis, the user must provide data in the form of one or more quality indicators, decide on a significance level, and choose which kind of ensemble (i.e., average, hierarchical majority vote, or data-driven) is desired for comparing the data calculated by the quality indicators. Following the pipeline is straightforward, as will be shown in the next sections.
The DSCTool was developed for the Deep Statistical Comparison (DSC) approach [2] and its variants, which all have the benefit of providing a robust statistical analysis with a reduced influence of outliers and small differences in the data, since the comparison is made on data distributions. In this paper, we only show the web services that implement the DSC variants used with multi-objective optimization, i.e., the basic DSC ranking scheme used for comparing data for one quality indicator [13], and the average, hierarchical majority vote, and data-driven ensembles that are used to fuse the information from several quality indicators [1,3].
The DSC ranking scheme is based on comparing distributions using a two-sample statistical test [14] with a predefined significance level. The obtained ranking scheme is used to make a further statistical comparison. The DSC average ensemble is a ranking scheme that uses a set of user-selected quality indicators [1] and calculates the mean of the DSC rankings of each quality indicator on a specific problem [15]. The DSC hierarchical majority vote ensemble is a ranking scheme [1] that checks which algorithm wins for the most quality indicators, i.e., which algorithm obtains the best DSC ranking the most times, on each benchmark problem separately. Finally, in the DSC data-driven ensemble [3], the preference of each quality indicator is estimated using its entropy, which is calculated by the Shannon entropy weighted method [16]. The preference ranking organization method (PROMETHEE) [17] is then used to determine the rankings.
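The idea behind the basic DSC ranking scheme can be sketched in a few lines: algorithms whose distributions of indicator values are not significantly different according to a two-sample test share the same (averaged) rank. The code below is a simplified illustration of this idea only; the actual DSC scheme described in [2] additionally checks the consistency (transitivity) of the pairwise comparisons:

```python
import numpy as np
from scipy import stats

def dsc_like_ranks(runs_per_algorithm, alpha=0.05):
    """Simplified DSC-style ranking on a single problem.

    runs_per_algorithm: dict mapping algorithm name -> list of quality
    indicator values (lower is better). Algorithms whose distributions are
    not significantly different (two-sample Anderson-Darling test) share
    the average of their would-be ranks.
    """
    names = sorted(runs_per_algorithm, key=lambda n: np.mean(runs_per_algorithm[n]))
    groups = [[names[0]]]
    for name in names[1:]:
        previous = groups[-1][-1]
        result = stats.anderson_ksamp([runs_per_algorithm[previous],
                                       runs_per_algorithm[name]])
        if result.significance_level > alpha:   # distributions look the same
            groups[-1].append(name)
        else:
            groups.append([name])
    ranks, next_rank = {}, 1
    for group in groups:
        shared = sum(range(next_rank, next_rank + len(group))) / len(group)
        for name in group:
            ranks[name] = shared
        next_rank += len(group)
    return ranks

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = {
        "A": rng.normal(0.50, 0.05, 20).tolist(),
        "B": rng.normal(0.50, 0.05, 20).tolist(),   # same distribution as A
        "C": rng.normal(0.80, 0.05, 20).tolist(),   # clearly worse
    }
    print(dsc_like_ranks(data))
```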
The DSCTool offers web services that are accessible at https://ws.ijs.si:8443/dsc-1.5/service/ and follow the REST software architectural style. The web services use the JSON format to transfer input/output data. More detailed information about using the DSCTool can be found at https://ws.ijs.si:8443/dsc-1.5/documentation.pdf.
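As an illustration of how such a REST service can be called from a script, a minimal sketch is given below; the endpoint name rank follows the service names used in this paper, the local file name is hypothetical, and the exact paths and JSON schema should be taken from the documentation linked above:

```python
import json

import requests

BASE_URL = "https://ws.ijs.si:8443/dsc-1.5/service"

# Load a prepared JSON input for the rank web service (the file name is a
# placeholder; the example inputs used in Section 5 are linked there) and
# send it to the service. The "rank" endpoint name mirrors the service name
# used in the text and should be verified against the documentation.
with open("M05-ei-rank.json") as f:
    payload = json.load(f)

response = requests.post(f"{BASE_URL}/rank", json=payload, timeout=60)
response.raise_for_status()

rankings = response.json()
print(json.dumps(rankings, indent=2)[:500])
```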

5. Experiments, Results, and Discussion

To show the usefulness and the power of the DSCTool, the results presented at the 2018 IEEE CEC Competition on Evolutionary Many-Objective Optimization [18] are analyzed. Ten algorithms were submitted to the competition (i.e., AGE-II [19], AMPDEA, BCE-IBEA [20], CVEA3 [21], fastCAR [22], HHcMOEA [23], KnEA [24], RPEA [25], RSEA [26], and RVEA [27]). The competition provided 15 optimization problems (MaF01–MaF15) with 5, 10, and 15 objectives [28]. The optimization results obtained by each algorithm can be accessed at https://github.com/ranchengcn/IEEE-CEC-MaOO-Competition/tree/master/2018. We would like to point out that we focus only on the issue of how to evaluate performance (i.e., which statistical approach should be used), since the problem selection and the experimental setup had already been decided and set by the competition organizers.
As performance metrics, the organizers selected the inverse generational distance (IGD), for which 10,000 uniformly distributed reference points were sampled on the Pareto front, and the hypervolume (HV), for which the population was normalized by the nadir point of the Pareto front and a Monte Carlo estimation method with 1,000,000 points was adopted. Each algorithm was executed 20 times on each problem with 5, 10, and 15 objectives, resulting in 900 approximation sets. Using these approximation sets, both quality indicators were calculated, and the mean of each quality indicator was taken as the comparison metric for each problem and number of objectives. The algorithms were then ranked according to the comparison metric, and the final score of an algorithm was determined as the sum of the reciprocal values of its rankings. In Table 1 and Table 2, the official competition ranking results for 5 and 10 objectives are presented for both quality indicators.
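For clarity, the competition's scoring rule can be written down directly; the rankings used below are placeholders, not the competition data:

```python
# Competition scoring: on each problem, algorithms are ranked by the mean of a
# quality indicator over 20 runs; an algorithm's final score is the sum of the
# reciprocals of its per-problem ranks, and a higher score is better.

def final_scores(ranks_per_problem):
    """ranks_per_problem: list of dicts mapping algorithm -> rank on a problem."""
    scores = {}
    for ranks in ranks_per_problem:
        for algorithm, rank in ranks.items():
            scores[algorithm] = scores.get(algorithm, 0.0) + 1.0 / rank
    return scores

if __name__ == "__main__":
    example = [{"A": 1, "B": 2, "C": 3},
               {"A": 2, "B": 1, "C": 3},
               {"A": 1, "B": 3, "C": 2}]
    for algorithm, score in sorted(final_scores(example).items(),
                                   key=lambda item: item[1], reverse=True):
        print(algorithm, round(score, 3))
```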
From the tables, it can be seen that CVEA3 is the best performing algorithm, since it obtained a total ranking of 1 for both quality indicators with 5 and 10 objectives.
Next, a statistical analysis of the same data using the DSCTool is shown. First, the quality indicators should be calculated from the approximation sets obtained by the algorithms. In addition to IGD and HV, the generational distance (GD) and the epsilon indicator (EI) were also calculated. After all quality indicators are calculated, the data for each quality indicator must be organized into an appropriate JSON input for the rank web service that performs the DSC ranking scheme. The JSON inputs for each quality indicator and each number of objectives are available at http://cs.ijs.si/dl/dsctool/rank-json.zip. For our analysis, the two-sample Anderson–Darling test was selected to compare the data distributions (since it is more powerful than the Kolmogorov–Smirnov test). To obtain the DSC rankings for each quality indicator, the rank web service was executed with the appropriate JSON input. The result is a JSON response in which the rankings are calculated based on the DSC ranking scheme. The ranking results for all quality indicators for 5 and 10 objectives are provided in Table 3, Table 4, Table 5 and Table 6.
First, let us compare the results obtained in the official competition (see Table 1 and Table 2) with the results obtained by the DSCTool when the comparisons are done using only one quality indicator (see Table 3, Table 4, Table 5 and Table 6). Here, we would like to remind the reader that the competition ranking is based on a simple statistic, the mean value obtained for a specific quality indicator, while the DSCTool uses the DSC approach, in which the comparison is based on the distribution of the quality indicator values from several runs. One can quickly see that for some algorithms the obtained ranking is the same (e.g., for the algorithm CVEA3 on 5 objectives, both approaches returned ranking 1 according to the IGD and HV quality indicators), while for others there is a large difference between the rankings (e.g., for the algorithm AGE-II on 5 objectives and using the HV quality indicator, the competition ranked it 3rd, while the DSCTool ranked it 9th). Since the paper is not about comparing different statistical approaches, we will not go into details. Nevertheless, we would like to remind the reader that although mean values are unbiased estimators, they can be heavily influenced by outliers and small differences in the data. For further details on this topic, we refer the reader to [2]. Therefore, simply by using different (in our case more powerful) statistics, the conclusions obtained from the rankings can change drastically. In our experiment, two additional quality indicators (i.e., GD and EI) were calculated on purpose, so that the influence of the quality indicator selection can be seen even better. Looking at Table 3, Table 4, Table 5 and Table 6, drastic changes in the rankings can be observed. Looking again at the algorithm AGE-II, it can be seen that if only EI and GD were used, AGE-II would change from an average-performing algorithm to the best one, also outperforming the algorithm CVEA3 in all cases.
Although it is well known that the selection of the quality indicator can have a big influence on the outcome of a statistical analysis, this influence can also be clearly observed in Table 3, Table 4, Table 5 and Table 6. In such situations, it is better to use an ensemble of several quality indicators and estimate the performance according to all of them. The DSCTool provides the ensemble service, which calculates rankings according to inputs from several quality indicators. To do this, the ranking results from running the rank web service on all quality indicators are taken and used as inputs to the ensemble service. In addition to putting all the quality indicator data into a proper JSON form, one needs to decide which ensemble technique should be used. The DSCTool provides three ensemble options, namely average, hierarchical, and data-driven. With the average method, the rankings are simply averaged; with the hierarchical method, the input rankings are viewed from a hierarchical viewpoint, where the algorithm with the highest rankings obtains the best ensemble ranking; and with the data-driven method, the information gain provided by the quality indicators is also taken into account when determining the ensemble ranking. Examples of the JSON input for the average ensemble method can be found at http://cs.ijs.si/dl/dsctool/M05-average.json and http://cs.ijs.si/dl/dsctool/M10-average.json for 5 and 10 objectives, respectively. To prepare the JSON input for the hierarchical or data-driven ensemble, the name of the method should be changed from average to hierarchical or data-driven, while everything else remains the same. To obtain the ensemble rankings, the ensemble service was executed with the appropriate JSON input. The rankings obtained from the ensemble services for 5 and 10 objectives are shown in Table 7, Table 8 and Table 9. Unsurprisingly, it can again be seen that the rankings have changed with respect to the individual rankings shown in Table 3, Table 4, Table 5 and Table 6. Now, the rankings no longer represent individual quality indicators, but a fusion of the information from all of them. All ensembles provide similar ranking results, but there are still some differences between them, so the selection of an ensemble method can also have an influence and must be made with great care. If we are interested purely in average performance, the average ensemble should be selected; if we are interested in the most often high-performing algorithm, the hierarchical ensemble should be selected (this is also recommended for dynamic optimization); and if we care about which algorithm seems the most relevant in general, the data-driven ensemble method should be selected.
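In practice, the switch between ensemble methods can be scripted. The sketch below downloads the 5-objective average-ensemble input referenced above, renames the method as described, and posts the modified input to the ensemble service; the endpoint name and the assumption that the method appears in the file as the quoted string "average" are ours and should be checked against the documentation:

```python
import requests

BASE_URL = "https://ws.ijs.si:8443/dsc-1.5/service"
INPUT_URL = "http://cs.ijs.si/dl/dsctool/M05-average.json"

# Download the prepared average-ensemble input and switch the ensemble method
# name from "average" to "hierarchical"; everything else stays the same.
text = requests.get(INPUT_URL, timeout=60).text
hierarchical_input = text.replace('"average"', '"hierarchical"')

# Post the modified input to the ensemble service ("ensemble" endpoint name
# assumed from the service name used in the text).
response = requests.post(f"{BASE_URL}/ensemble",
                         data=hierarchical_input,
                         headers={"Content-Type": "application/json"},
                         timeout=60)
response.raise_for_status()
print(response.json())
```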
The algorithm rankings obtained on each benchmark problem carry statistical significance that can be used to compare the algorithms only on that specific problem (single-problem analysis). However, if we are interested in a more general conclusion, i.e., in comparing the algorithms over the set of all benchmark problems (multiple-problem analysis), it is not enough to look only at the mean ranking of the algorithms over all benchmark problems. In such situations, we should analyze the data by applying an appropriate statistical test. For this purpose, an omnibus test must be performed, which is implemented in the DSCTool as the omnibus web service. The omnibus test can be performed on the results from any of the above tables. When using the DSCTool rank or ensemble web service, we receive a JSON result that consists of the rankings for all algorithms and all benchmark problems (shown in Table 3, Table 4, Table 5, Table 6, Table 7, Table 8 and Table 9). Additionally, it provides the information on whether parametric tests can be applied to our data and which statistical tests are relevant for it. In our case, the Friedman test was recommended as appropriate. To make a general conclusion, we continue by using it with a significance level of 0.05.

Instead of showing the results for all the obtained rankings (presented in the tables), only one scenario was selected to show how this can be done, and its results were compared to the outcome of the competition. We decided to perform the omnibus test on the results obtained with the EI quality indicator on the five-objective problems. The reason is that, here, the algorithm AGE-II outperformed CVEA3, which was shown to be the best performing algorithm at the competition. After preparing the JSON input (accessible at http://cs.ijs.si/dl/dsctool/M05-ei-omnibus.json), the omnibus web service was executed. The obtained results indicate that the null hypothesis is rejected, since the calculated p-value of $1.09 \times 10^{-10}$ is smaller than the previously set significance level (0.05). This means that there is statistical significance in the data, and a post-hoc test should be performed to identify where these significant differences come from. In our case, the algorithm with the lowest mean ranking (AGE-II) was selected as the control algorithm and compared to the other algorithms. To apply the post-hoc test, the algorithm mean rankings obtained from the omnibus test were taken, the number of algorithms and the number of benchmark functions were set (in our case 10 and 15, respectively), and the control algorithm was specified (in our case AGE-II). Since the post-hoc statistic depends on the omnibus statistical test, the same statistical test must be applied in the post-hoc test (in our case, the Friedman test). After creating the JSON input (accessible at http://cs.ijs.si/dl/dsctool/M05-ei-posthoc.json), the posthoc web service was executed. The results show that the algorithm AGE-II significantly outperforms every other algorithm. Therefore, we have shown not only that the rankings can change when a different quality indicator is applied, but also that the differences between performances can be statistically significant. If we did a similar test using the HV quality indicator on the five-objective problems, the results would be reversed: CVEA3 would be shown to be an algorithm that significantly outperforms every other algorithm.
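The omnibus step itself can also be reproduced locally with standard tools; the sketch below applies the Friedman test to three made-up algorithms ranked over ten problems (illustrative numbers only, not the EI rankings from Table 5):

```python
from scipy import stats

# Per-problem rankings of three algorithms over ten benchmark problems
# (made-up values for illustration).
algorithm_a = [1.0, 1.0, 2.0, 1.0, 1.0, 2.0, 1.0, 1.0, 2.0, 1.0]
algorithm_b = [2.0, 3.0, 1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 3.0, 2.0]
algorithm_c = [3.0, 2.0, 3.0, 3.0, 2.0, 3.0, 3.0, 2.0, 1.0, 3.0]

# Friedman omnibus test over the set of benchmark problems.
statistic, p_value = stats.friedmanchisquare(algorithm_a, algorithm_b, algorithm_c)
print("Friedman p-value:", p_value)

if p_value < 0.05:
    print("Significant differences exist; run a post-hoc test against the "
          "best-ranked (control) algorithm.")
```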
However, if we look at the results of any of the ensemble methods, we see that there is no statistically significant difference between the algorithms AGE-II and CVEA3, which in the end would be the general conclusion (assuming we are not searching for the best algorithm according to some problem-specific quality indicator).
With our experiments, we have shown how important it is to perform a proper analysis, since we were able to change the outcomes of the study simply by selecting different quality indicators and/or statistical approaches.

6. Conclusions

Performing a proper statistical analysis is an important task that must be undertaken with great care. This was clearly shown in Section 5, where we demonstrated that the selection of the quality indicators and of the statistical approach used to analyze the data has a big influence on the end results of the comparison. For these reasons, the DSCTool can help users perform a proper statistical analysis quickly and error-free. It guides users through all the analysis steps and provides them with all the information needed to perform a proper statistical analysis and draw final conclusions from their studies. There are still some decisions that must be made by the user, such as selecting the significance level and choosing the relevant ensemble method, but this requires much less knowledge than performing all of the statistics on their own.

Author Contributions

Conceptualization, P.K. and T.E.; methodology, T.E. and P.K.; software, P.K., and T.E.; validation, P.K., and T.E.; formal analysis, T.E. and P.K.; investigation, P.K. and T.E.; writing—original draft preparation, P.K.; writing—review and editing, T.E. and P.K.; funding acquisition, T.E. and P.K. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Slovenian Research Agency (research core funding No. P2-0098 and project No. Z2-1867) and by the European Union’s Horizon 2020 research and innovation program under grant agreement No. 692286.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
DSC  deep statistical comparison
EI   epsilon indicator
GD   generational distance
HV   hypervolume
IGD  inverse generational distance

References

  1. Eftimov, T.; Korošec, P.; Seljak, B.K. Comparing multi-objective optimization algorithms using an ensemble of quality indicators with deep statistical comparison approach. In Proceedings of the 2017 IEEE Symposium Series on Computational Intelligence (SSCI), Honolulu, HI, USA, 27 November–1 December 2017; pp. 1–8.
  2. Eftimov, T.; Korošec, P.; Seljak, B.K. A novel approach to statistical comparison of meta-heuristic stochastic optimization algorithms using deep statistics. Inf. Sci. 2017, 417, 186–215.
  3. Eftimov, T.; Korošec, P.; Seljak, B.K. Data-Driven Preference-Based Deep Statistical Ranking for Comparing Multi-objective Optimization Algorithms. In Proceedings of the International Conference on Bioinspired Methods and Their Applications, Paris, France, 16–18 May 2018; pp. 138–150.
  4. Eftimov, T.; Petelin, G.; Korošec, P. DSCTool: A web-service-based framework for statistical comparison of stochastic optimization algorithms. Appl. Soft Comput. 2020, 87, 105977.
  5. García, S.; Molina, D.; Lozano, M.; Herrera, F. A study on the use of non-parametric tests for analyzing the evolutionary algorithms’ behaviour: A case study on the CEC’2005 special session on real parameter optimization. J. Heuristics 2009, 15, 617.
  6. Coello, C.A.C. Constraint-Handling Techniques Used with Evolutionary Algorithms. In Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO’18); Association for Computing Machinery: New York, NY, USA, 2018; pp. 773–799.
  7. Zitzler, E.; Thiele, L. Multiobjective evolutionary algorithms: A comparative case study and the strength Pareto approach. IEEE Trans. Evol. Comput. 1999, 3, 257–271.
  8. Knowles, J.; Thiele, L.; Zitzler, E. A tutorial on the performance assessment of stochastic multiobjective optimizers. Tik Rep. 2006, 214, 327–332.
  9. Van Veldhuizen, D.A.; Lamont, G.B. Multiobjective Evolutionary Algorithm Research: A History and Analysis; Technical Report; CiteSeer: Princeton, NJ, USA, 1998.
  10. Riquelme, N.; Von Lücken, C.; Baran, B. Performance metrics in multi-objective optimization. In Proceedings of the 2015 Latin American Computing Conference (CLEI), Arequipa, Peru, 19–23 October 2015; pp. 1–11.
  11. Eftimov, T.; Korošec, P.; Seljak, B.K. Disadvantages of statistical comparison of stochastic optimization algorithms. In Proceedings of the Bioinspired Optimization Methods and their Applications (BIOMA), Bled, Slovenia, 18–20 May 2016; pp. 105–118.
  12. Moore, D.S.; McCabe, G.P.; Craig, B. Introduction to the Practice of Statistics, 9th ed.; W. H. Freeman: New York, NY, USA, 1998.
  13. Eftimov, T.; Korošec, P.; Koroušić Seljak, B. Deep Statistical Comparison Applied on Quality Indicators to Compare Multi-objective Stochastic Optimization Algorithms. In Machine Learning, Optimization, and Big Data; Nicosia, G., Pardalos, P., Giuffrida, G., Umeton, R., Eds.; Springer International Publishing: Cham, Switzerland, 2018; pp. 76–87.
  14. Eftimov, T.; Korosec, P.; Korousic-Seljak, B. The Behavior of Deep Statistical Comparison Approach for Different Criteria of Comparing Distributions. In Proceedings of the IJCCI, Funchal, Madeira, Portugal, 1–3 November 2017; pp. 73–82.
  15. Eftimov, T.; Korošec, P.; Seljak, B.K. Deep statistical comparison applied on quality indicators to compare multi-objective stochastic optimization algorithms. In Proceedings of the International Workshop on Machine Learning, Optimization, and Big Data, Siena, Italy, 10–13 September 2017; pp. 76–87.
  16. Boroushaki, S. Entropy-based weights for multicriteria spatial decision-making. Yearb. Assoc. Pac. Coast Geogr. 2017, 79, 168–187.
  17. Brans, J.P.; Mareschal, B. PROMETHEE methods. In Multiple Criteria Decision Analysis: State of the Art Surveys; Springer: New York, NY, USA, 2005; pp. 163–186.
  18. Cheng, R.; Li, M.; Tian, Y.; Xiang, X.; Zhang, X.; Yang, S.; Jin, Y.; Yao, X. Competition on Many-Objective Optimization at 2018 IEEE Congress on Evolutionary Computation. 2020. Available online: https://www.cs.bham.ac.uk/~chengr/CEC_Comp_on_MaOO/2018/webpage.html (accessed on 2 April 2020).
  19. Wagner, M.; Neumann, F. A Fast Approximation-Guided Evolutionary Multi-Objective Algorithm. In Proceedings of the 15th Annual Conference on Genetic and Evolutionary Computation (GECCO’13), Amsterdam, The Netherlands, 6–10 July 2013; Association for Computing Machinery: New York, NY, USA, 2013; pp. 687–694.
  20. Li, M.; Yang, S.; Liu, X. Pareto or Non-Pareto: Bi-Criterion Evolution in Multiobjective Optimization. IEEE Trans. Evol. Comput. 2016, 20, 645–665.
  21. Yuan, J.; Liu, H.; Gu, F. A Cost Value Based Evolutionary Many-Objective Optimization Algorithm with Neighbor Selection Strategy. In Proceedings of the 2018 IEEE Congress on Evolutionary Computation (CEC), Rio de Janeiro, Brazil, 8–13 July 2018; pp. 1–8.
  22. Zhao, M.; Ge, H.; Han, H.; Sun, L. A Many-Objective Evolutionary Algorithm with Fast Clustering and Reference Point Redistribution. In Proceedings of the 2018 IEEE Congress on Evolutionary Computation (CEC), Rio de Janeiro, Brazil, 8–13 July 2018; pp. 1–6.
  23. Fritsche, G.; Pozo, A. A Hyper-Heuristic Collaborative Multi-objective Evolutionary Algorithm. In Proceedings of the 2018 7th Brazilian Conference on Intelligent Systems (BRACIS), Sao Paulo, Brazil, 22–25 October 2018; pp. 354–359.
  24. Zhang, X.; Tian, Y.; Jin, Y. A Knee Point-Driven Evolutionary Algorithm for Many-Objective Optimization. IEEE Trans. Evol. Comput. 2015, 19, 761–776.
  25. Liu, Y.; Gong, D.; Sun, X.; Zhang, Y. Many-Objective Evolutionary Optimization Based on Reference Points. Appl. Soft Comput. 2017, 50, 344–355.
  26. He, C.; Tian, Y.; Jin, Y.; Zhang, X.; Pan, L. A radial space division based evolutionary algorithm for many-objective optimization. Appl. Soft Comput. 2017, 61, 603–621.
  27. Cheng, R.; Jin, Y.; Olhofer, M.; Sendhoff, B. A Reference Vector Guided Evolutionary Algorithm for Many-Objective Optimization. IEEE Trans. Evol. Comput. 2016, 20, 773–791.
  28. Cheng, R.; Li, M.; Tian, Y.; Zhang, X.; Yang, S.; Jin, Y.; Yao, X. A benchmark test suite for evolutionary many-objective optimization. Complex Intell. Syst. 2017, 3, 67–81.
Table 1. Official competition inverse generational distance (IGD) results.
(a) D = 5
Problem  AGE-II  AMPDEA  BCE-IBEA  CVEA3  fastCAR  HHcMOEA  KnEA  RPEA  RSEA  RVEA
MaF01  1  4  3  5  6  8  2  9  7  10
MaF02  1  2  3  4  7  5  6  10  8  9
MaF03  4  3  8  1  2  6  9  10  7  5
MaF04  9  2  3  1  4  7  6  5  8  10
MaF05  4  9  1  2  5  8  3  10  7  6
MaF06  5  4  1  2  7  6  3  8  10  9
MaF07  6  4  2  1  7  5  3  10  8  9
MaF08  1  4  2  3  6  5  9  8  7  10
MaF09  1  5  6  2  3  4  10  8  9  7
MaF10  7  10  1  5  6  8  3  9  4  2
MaF11  8  6  5  4  9  7  3  10  2  1
MaF12  6  10  1  2  4  9  3  8  7  5
MaF13  2  9  4  1  5  3  6  8  7  10
MaF14  4  1  8  2  3  7  5  10  6  9
MaF15  2  3  8  1  4  5  10  9  7  6
Total  3  4  2  1  5  7  6  10  8  9
(b) D = 10
Problem  AGE-II  AMPDEA  BCE-IBEA  CVEA3  fastCAR  HHcMOEA  KnEA  RPEA  RSEA  RVEA
MaF01  6  1  2  4  8  7  3  9  5  10
MaF02  7  5  3  2  8  4  1  6  10  9
MaF03  5  2  10  6  1  4  9  7  8  3
MaF04  9  1  6  2  4  8  5  3  7  10
MaF05  9  4  1  2  7  10  5  3  6  8
MaF06  5  2  9  1  4  7  10  3  8  6
MaF07  5  4  1  2  7  6  3  8  9  10
MaF08  2  6  1  3  9  4  5  8  7  10
MaF09  1  7  9  3  4  5  10  6  2  8
MaF10  8  9  1  6  2  7  3  10  5  4
MaF11  3  7  4  2  9  1  6  8  5  10
MaF12  10  9  2  1  6  7  5  4  8  3
MaF13  5  9  2  1  8  3  4  6  7  10
MaF14  6  1  9  2  4  8  10  5  7  3
MaF15  3  2  9  1  5  8  10  7  6  4
Total  4  2  3  1  5  6  7  8  9  10
Table 2. Official competition hypervolume (HV) results.
Table 2. Official competition hypervolume (HV) results.
(a) D = 5
Problem  AGE-II  AMPDEA  BCE-IBEA  CVEA3  fastCAR  HHcMOEA  KnEA  RPEA  RSEA  RVEA
MaF01  1  2  4  5  6  8  3  9  7  10
MaF02  7  4  3  1  8  10  2  6  5  9
MaF03  7  3  6  1  2  4  9  10  8  5
MaF04  6  5  8  1  7  9  2  3  4  10
MaF05  6  9  4  3  1  10  5  8  7  2
MaF06  2  4  1  3  6  7  5  9  8  10
MaF07  6  4  1  2  7  8  5  10  3  9
MaF08  1  4  3  2  6  7  8  9  5  10
MaF09  2  5  6  1  3  4  10  7  9  8
MaF10  8  10  3  1  9  7  4  5  2  6
MaF11  10  8  3  1  4  5  7  9  2  6
MaF12  7  10  5  1  3  9  2  8  6  4
MaF13  1  10  4  2  5  3  7  6  8  9
MaF14  4  2  8  1  3  6  5  9  7  10
MaF15  2  3  9  1  4  6  10  8  7  5
Total  3  5  2  1  4  8  9  10  7  9
(b) D = 10
Problem  AGE-II  AMPDEA  BCE-IBEA  CVEA3  fastCAR  HHcMOEA  KnEA  RPEA  RSEA  RVEA
MaF01  6  4  2  7  3  9  5  8  1  10
MaF02  8  4  6  3  5  9  10  2  1  7
MaF03  6  1  8  2  4  3  10  7  9  5
MaF04  8  3  9  2  7  6  4  5  1  10
MaF05  9  4  2  3  1  10  5  7  8  6
MaF06  6  1  9  2  3  5  10  4  7  8
MaF07  10  3  7  5  4  9  8  2  1  6
MaF08  4  5  3  2  9  7  6  8  1  10
MaF09  1  6  9  3  4  5  10  7  2  8
MaF10  10  9  6  1  3  7  5  4  2  8
MaF11  10  1  4  2  6  5  7  9  3  8
MaF12  10  9  6  1  3  8  2  7  4  5
MaF13  7  10  4  1  3  5  6  2  9  8
MaF14  6  2  8  1  4  7  10  5  9  3
MaF15  5  1  7  2  4  6  9  10  8  3
Total  2  6  1  3  7  6  10  5  4  8
Table 3. DSCTool rankings for IGD.
Table 3. DSCTool rankings for IGD.
(a) D = 5
Problem  AGE-II  AMPDEA  BCE-IBEA  CVEA3  fastCAR  HHcMOEA  KnEA  RPEA  RSEA  RVEA
MaF01  2.0  5.5  3.0  4.0  1.0  7.5  5.5  9.0  7.5  10.0
MaF02  1.5  5.0  1.5  3.0  7.0  7.0  4.0  10.0  7.0  9.0
MaF03  10.0  9.0  3.0  4.0  8.0  1.0  7.0  6.0  5.0  2.0
MaF04  9.0  2.0  7.0  1.0  3.0  8.0  5.0  4.0  6.0  10.0
MaF05  4.0  8.0  5.0  3.0  2.0  9.0  6.0  10.0  7.0  1.0
MaF06  3.0  4.0  1.0  2.0  7.0  6.0  5.0  8.0  10.0  9.0
MaF07  4.0  5.0  1.0  2.0  7.0  6.0  3.0  10.0  8.0  9.0
MaF08  2.0  4.0  1.0  3.0  5.0  6.0  9.0  8.0  7.0  10.0
MaF09  5.0  1.0  6.0  3.0  2.0  4.0  10.0  8.0  9.0  7.0
MaF10  8.0  10.0  1.0  3.0  6.0  7.0  2.0  9.0  4.0  5.0
MaF11  9.0  6.0  6.0  3.0  1.0  8.0  3.0  10.0  6.0  3.0
MaF12  5.5  10.0  5.5  3.5  1.0  8.0  3.5  8.0  8.0  2.0
MaF13  9.0  10.0  1.0  8.0  5.0  3.0  2.0  7.0  4.0  6.0
MaF14  4.0  1.0  8.0  2.0  3.0  7.0  5.0  10.0  6.0  9.0
MaF15  3.0  4.0  2.0  6.0  5.0  1.0  10.0  9.0  8.0  7.0
Total  4  6  2  1  3  7  5  10  9  8
(b) D = 10
Problem  AGE-II  AMPDEA  BCE-IBEA  CVEA3  fastCAR  HHcMOEA  KnEA  RPEA  RSEA  RVEA
MaF01  4.0  1.0  5.0  2.0  9.0  7.0  3.0  8.0  6.0  10.0
MaF02  2.0  4.0  1.0  3.0  7.0  5.0  9.0  6.0  8.0  10.0
MaF03  10.0  7.0  2.0  8.0  9.0  3.0  1.0  6.0  4.0  5.0
MaF04  9.0  1.0  7.0  2.0  3.0  8.0  5.0  4.0  6.0  10.0
MaF05  9.0  4.0  1.0  2.0  6.0  10.0  5.0  3.0  8.0  7.0
MaF06  5.0  10.0  4.0  9.0  7.0  2.0  3.0  8.0  1.0  6.0
MaF07  6.0  2.0  4.0  1.0  7.0  5.0  3.0  9.0  10.0  8.0
MaF08  3.0  6.0  1.0  2.0  9.0  4.0  5.0  8.0  7.0  10.0
MaF09  10.0  2.0  1.0  7.0  9.0  4.0  3.0  5.0  6.0  8.0
MaF10  10.0  9.0  1.0  5.0  3.0  7.0  2.0  8.0  4.0  6.0
MaF11  10.0  3.0  1.0  4.0  8.0  7.0  2.0  9.0  5.0  6.0
MaF12  10.0  9.0  4.0  3.0  1.0  8.0  5.0  6.0  7.0  2.0
MaF13  4.0  10.0  1.0  5.0  6.0  3.0  7.0  8.0  2.0  9.0
MaF14  6.0  2.0  9.0  1.0  3.0  8.0  10.0  5.0  7.0  4.0
MaF15  5.0  4.0  9.0  7.0  6.0  1.0  10.0  3.0  2.0  8.0
Total  9  4  1  2  7  5  3  8  6  10
Table 4. DSCTool rankings for HV.
Table 4. DSCTool rankings for HV.
(a) D = 5
Problem  AGE-II  AMPDEA  BCE-IBEA  CVEA3  fastCAR  HHcMOEA  KnEA  RPEA  RSEA  RVEA
MaF01  4.5  3.0  4.5  1.0  2.0  7.5  7.5  7.5  7.5  10.0
MaF02  9.0  4.0  3.0  2.0  7.0  10.0  1.0  6.0  5.0  8.0
MaF03  7.0  3.0  10.0  3.0  6.0  9.0  8.0  3.0  3.0  3.0
MaF04  6.5  1.5  4.5  1.5  6.5  4.5  9.0  3.0  10.0  8.0
MaF05  8.0  10.0  6.0  3.0  2.0  7.0  4.0  9.0  5.0  1.0
MaF06  7.0  3.0  1.0  2.0  6.0  5.0  4.0  8.0  10.0  9.0
MaF07  2.0  5.0  5.0  1.0  8.0  3.0  5.0  10.0  9.0  7.0
MaF08  4.0  3.0  2.0  1.0  6.0  6.0  9.0  6.0  8.0  10.0
MaF09  7.0  6.0  3.0  1.0  5.0  4.0  10.0  2.0  8.0  9.0
MaF10  8.0  10.0  3.0  1.0  9.0  6.0  5.0  4.0  2.0  7.0
MaF11  9.0  10.0  4.0  1.0  5.0  2.0  7.0  8.0  3.0  6.0
MaF12  7.0  10.0  6.0  1.0  4.0  8.0  3.0  9.0  2.0  5.0
MaF13  4.0  10.0  7.0  1.0  5.0  3.0  6.0  2.0  9.0  8.0
MaF14  9.0  8.0  4.0  2.0  7.0  3.0  5.0  6.0  1.0  10.0
MaF15  5.0  2.0  3.0  1.0  7.0  4.0  6.0  8.0  10.0  9.0
Total  9  5  2  1  4  3  6  7  8  10
(b) D = 10
Problem  AGE-II  AMPDEA  BCE-IBEA  CVEA3  fastCAR  HHcMOEA  KnEA  RPEA  RSEA  RVEA
MaF01  4.5  2.0  6.5  1.0  9.0  8.0  3.0  4.5  6.5  10.0
MaF02  9.0  4.0  5.0  2.0  6.0  7.0  10.0  3.0  1.0  8.0
MaF03  9.0  1.0  7.0  4.0  2.0  5.0  6.0  10.0  8.0  3.0
MaF04  7.0  1.0  4.0  2.0  6.0  5.0  9.0  3.0  10.0  8.0
MaF05  10.0  6.0  4.0  2.0  3.0  9.0  1.0  7.0  5.0  8.0
MaF06  7.0  1.0  9.0  4.0  3.0  8.0  10.0  2.0  5.0  6.0
MaF07  6.0  5.0  7.0  4.0  8.0  1.0  2.0  10.0  9.0  3.0
MaF08  6.0  5.0  4.0  2.0  9.0  3.0  7.0  1.0  8.0  10.0
MaF09  7.0  6.0  8.5  1.0  3.5  3.5  10.0  2.0  5.0  8.5
MaF10  10.0  9.0  8.0  1.0  5.0  3.0  6.0  4.0  2.0  7.0
MaF11  10.0  1.0  4.0  2.0  6.0  5.0  8.0  9.0  3.0  7.0
MaF12  10.0  8.5  6.0  1.5  4.5  8.5  3.0  7.0  1.5  4.5
MaF13  7.0  9.5  5.0  2.0  3.5  3.5  7.0  1.0  7.0  9.5
MaF14  9.0  10.0  1.0  4.0  7.0  5.0  2.0  6.0  8.0  3.0
MaF15  6.0  2.0  4.0  1.0  7.0  5.0  10.0  3.0  9.0  8.0
Total  10  2  6  1  5  4  8  3  7  9
Table 5. DSCTool rankings for EI.
Table 5. DSCTool rankings for EI.
(a) D = 5
Problem  AGE-II  AMPDEA  BCE-IBEA  CVEA3  fastCAR  HHcMOEA  KnEA  RPEA  RSEA  RVEA
MaF01  1.0  2.0  4.0  3.0  5.0  6.0  8.0  9.0  7.0  10.0
MaF02  1.0  4.5  2.5  2.5  7.5  7.5  4.5  7.5  7.5  10.0
MaF03  1.0  3.0  8.0  5.0  2.0  6.0  7.0  10.0  9.0  4.0
MaF04  1.0  4.0  7.0  3.0  2.0  6.0  9.0  5.0  10.0  8.0
MaF05  1.0  9.0  3.0  2.0  5.0  8.0  6.0  10.0  7.0  4.0
MaF06  1.0  4.0  2.0  3.0  7.0  6.0  5.0  8.0  10.0  9.0
MaF07  1.0  6.0  4.0  3.0  8.0  5.0  2.0  10.0  9.0  7.0
MaF08  1.0  4.0  2.0  3.0  5.5  5.5  9.5  7.5  7.5  9.5
MaF09  1.0  5.0  7.0  2.0  4.0  3.0  10.0  8.0  9.0  6.0
MaF10  2.0  10.0  4.0  5.0  6.0  9.0  3.0  7.0  1.0  8.0
MaF11  1.0  10.0  7.0  6.0  4.0  8.0  3.0  9.0  2.0  5.0
MaF12  1.0  10.0  5.0  3.0  2.0  9.0  6.0  8.0  7.0  4.0
MaF13  1.0  10.0  4.0  2.0  5.0  3.0  6.0  8.0  7.0  9.0
MaF14  2.0  1.0  8.0  3.0  4.0  6.0  5.0  9.0  7.0  10.0
MaF15  2.0  4.0  8.0  1.0  3.0  6.0  10.0  9.0  7.0  5.0
Total  1  5  4  2  3  6  6  10  8  9
(b) D = 10
Problem  AGE-II  AMPDEA  BCE-IBEA  CVEA3  fastCAR  HHcMOEA  KnEA  RPEA  RSEA  RVEA
MaF01  3.0  1.0  7.0  2.0  9.0  4.0  6.0  5.0  8.0  10.0
MaF02  1.0  3.0  3.0  7.0  7.0  7.0  10.0  3.0  7.0  7.0
MaF03  1.0  3.0  10.0  6.0  5.0  3.0  9.0  7.0  8.0  3.0
MaF04  1.0  3.0  8.0  4.0  2.0  5.0  9.0  7.0  10.0  6.0
MaF05  4.0  7.0  9.0  8.0  1.0  10.0  3.0  6.0  5.0  2.0
MaF06  5.0  3.0  9.0  1.0  4.0  8.0  10.0  2.0  7.0  6.0
MaF07  1.0  6.0  7.0  5.0  8.0  3.0  4.0  10.0  9.0  2.0
MaF08  2.0  5.0  1.0  3.0  9.0  4.0  8.0  6.0  7.0  10.0
MaF09  4.0  7.0  9.0  1.0  2.0  5.0  10.0  6.0  3.0  8.0
MaF10  2.0  9.5  5.5  9.5  5.5  5.5  5.5  8.0  2.0  2.0
MaF11  1.0  6.0  7.0  10.0  3.0  9.0  4.0  8.0  5.0  2.0
MaF12  1.0  10.0  7.0  5.0  4.0  9.0  2.0  6.0  8.0  3.0
MaF13  4.0  9.0  2.0  1.0  8.0  3.0  5.0  6.0  7.0  10.0
MaF14  6.0  1.0  9.0  2.0  4.0  8.0  10.0  5.0  7.0  3.0
MaF15  3.0  1.0  9.0  2.0  4.0  8.0  10.0  6.0  7.0  5.0
Total  1  3  9  2  4  7  10  6  8  5
Table 6. DSCTool rankings for GD.
Table 6. DSCTool rankings for GD.
(a) D = 5
Problem  AGE-II  AMPDEA  BCE-IBEA  CVEA3  fastCAR  HHcMOEA  KnEA  RPEA  RSEA  RVEA
MaF01  2.0  1.0  7.0  3.0  6.0  9.0  4.0  8.0  5.0  10.0
MaF02  1.5  1.5  4.5  6.5  8.5  10.0  3.0  6.5  4.5  8.5
MaF03  5.0  2.0  8.0  4.0  1.0  10.0  7.0  6.0  3.0  9.0
MaF04  1.0  6.0  8.0  2.0  4.0  9.0  5.0  7.0  3.0  10.0
MaF05  4.0  1.0  6.0  9.0  2.0  10.0  7.0  5.0  8.0  3.0
MaF06  1.0  2.5  4.5  2.5  4.5  7.5  7.5  9.0  6.0  10.0
MaF07  1.0  9.0  4.0  5.0  7.0  8.0  6.0  2.0  3.0  10.0
MaF08  1.0  9.0  8.0  2.0  6.0  5.0  3.0  4.0  7.0  10.0
MaF09  1.0  10.0  7.0  2.0  5.0  4.0  8.0  6.0  3.0  9.0
MaF10  1.0  10.0  6.0  5.0  9.0  7.0  4.0  2.0  3.0  8.0
MaF11  1.0  8.0  10.0  6.0  3.0  8.0  8.0  2.0  5.0  4.0
MaF12  2.0  10.0  4.0  1.0  8.0  9.0  3.0  7.0  5.0  6.0
MaF13  1.0  3.0  9.0  2.0  6.0  10.0  8.0  4.0  7.0  5.0
MaF14  9.0  2.0  7.0  1.0  5.0  10.0  6.0  4.0  3.0  8.0
MaF15  3.5  7.0  9.0  1.5  3.5  7.0  10.0  7.0  1.5  5.0
Total  1  6  8  2  4  10  7  5  3  9
(b) D = 10
Problem  AGE-II  AMPDEA  BCE-IBEA  CVEA3  fastCAR  HHcMOEA  KnEA  RPEA  RSEA  RVEA
MaF01  6.0  1.5  8.0  5.0  1.5  10.0  4.0  7.0  3.0  9.0
MaF02  2.0  2.0  5.0  9.0  8.0  10.0  6.0  4.0  7.0  2.0
MaF03  1.0  4.0  10.0  3.0  2.0  8.0  9.0  5.0  7.0  6.0
MaF04  4.0  6.0  8.0  3.0  2.0  10.0  5.0  7.0  1.0  9.0
MaF05  3.0  1.0  4.0  9.0  2.0  10.0  7.0  6.0  8.0  5.0
MaF06  4.0  1.0  7.0  4.0  4.0  7.0  9.0  2.0  7.0  10.0
MaF07  1.0  7.5  3.5  3.5  7.5  10.0  7.5  3.5  3.5  7.5
MaF08  1.0  9.0  8.0  3.0  2.0  6.0  7.0  5.0  4.0  10.0
MaF09  1.0  8.0  9.0  4.0  2.0  6.0  7.0  5.0  3.0  10.0
MaF10  9.0  10.0  8.0  6.0  5.0  4.0  7.0  1.0  3.0  2.0
MaF11  1.0  6.0  9.0  7.0  4.0  10.0  8.0  3.0  5.0  2.0
MaF12  4.0  3.0  5.0  7.0  9.0  10.0  6.0  2.0  1.0  8.0
MaF13  10.0  2.0  8.0  5.0  6.0  7.0  4.0  1.0  9.0  3.0
MaF14  7.0  3.0  9.0  1.0  2.0  10.0  8.0  6.0  4.0  5.0
MaF15  4.0  6.0  9.0  3.0  2.0  10.0  8.0  7.0  5.0  1.0
Total  1  4  9  6  2  10  8  3  5  7
Table 7. DSCTool average ensemble rankings.
Table 7. DSCTool average ensemble rankings.
(a) D = 5
Problem  AGE-II  AMPDEA  BCE-IBEA  CVEA3  fastCAR  HHcMOEA  KnEA  RPEA  RSEA  RVEA
MaF01  2.25  2.75  4.50  2.75  3.50  7.00  5.75  8.00  6.25  10.00
MaF02  3.00  3.50  2.50  3.25  6.75  8.00  3.00  7.00  5.25  8.75
MaF03  5.75  3.75  7.25  3.50  4.25  6.50  7.25  5.75  4.50  4.00
MaF04  4.25  3.25  6.50  1.75  3.75  6.75  7.00  4.75  7.25  9.00
MaF05  4.25  7.00  5.00  4.25  2.75  8.50  5.75  8.50  6.75  2.25
MaF06  3.00  3.25  2.00  2.25  6.00  6.00  5.25  8.25  9.00  9.25
MaF07  2.00  6.00  3.25  2.75  7.50  5.50  3.75  8.00  7.25  8.25
MaF08  2.00  5.00  3.25  2.25  5.25  5.25  7.50  6.00  7.25  9.75
MaF09  3.50  5.50  5.75  2.00  4.00  3.75  9.50  6.00  7.25  7.75
MaF10  4.75  10.00  3.50  3.50  7.50  7.25  3.50  5.50  2.50  7.00
MaF11  5.00  8.00  6.50  3.75  3.25  6.25  4.75  7.25  3.75  4.25
MaF12  3.75  10.00  5.00  2.00  3.75  8.25  3.75  7.75  5.25  4.25
MaF13  3.75  8.25  5.25  3.25  5.25  4.75  5.50  5.25  6.75  7.00
MaF14  6.00  3.00  6.75  2.00  4.75  6.50  5.25  7.25  4.25  9.25
MaF15  3.25  4.00  5.50  2.25  4.50  4.25  9.00  8.00  6.50  6.50
Total  2  5  3  1  4  8  6  9  7  10
(b) D = 10
Problem  AGE-II  AMPDEA  BCE-IBEA  CVEA3  fastCAR  HHcMOEA  KnEA  RPEA  RSEA  RVEA
MaF01  4.25  1.25  6.50  2.50  7.00  7.25  4.00  6.00  5.75  9.75
MaF02  3.25  2.75  3.25  4.75  6.50  6.75  8.75  3.75  5.25  6.00
MaF03  5.25  3.50  7.25  5.25  4.50  4.50  6.25  7.00  6.75  4.00
MaF04  5.25  2.75  6.75  2.75  3.25  7.00  7.00  5.25  6.75  8.25
MaF05  6.50  4.50  4.50  5.25  3.00  9.75  4.00  5.50  6.50  5.50
MaF06  5.00  3.75  7.00  4.25  4.25  6.00  8.00  3.50  4.75  7.00
MaF07  3.50  4.75  5.00  3.00  7.25  4.75  3.75  7.75  7.50  4.75
MaF08  3.00  6.25  3.50  2.50  7.25  4.25  6.75  5.00  6.50  10.00
MaF09  5.50  5.75  6.75  3.25  4.00  4.50  7.50  4.50  4.25  8.50
MaF10  7.50  9.25  5.25  5.25  4.25  4.50  4.75  5.25  2.50  4.00
MaF11  5.50  4.00  5.25  5.75  5.25  7.75  5.50  7.25  4.50  4.25
MaF12  6.25  7.50  5.50  4.00  4.50  8.75  4.00  5.25  4.25  4.25
MaF13  6.00  7.50  4.00  3.25  5.75  4.00  5.50  4.00  6.00  7.75
MaF14  7.00  4.00  7.00  2.00  4.00  7.75  7.50  5.50  6.50  3.75
MaF15  4.50  3.25  7.75  3.25  4.75  6.00  9.50  4.75  5.75  5.50
Total  4  2  7  1  3  10  8  5  6  9
Table 8. DSCTool hierarchical ensemble rankings.
Table 8. DSCTool hierarchical ensemble rankings.
(a) D = 5
Problem  AGE-II  AMPDEA  BCE-IBEA  CVEA3  fastCAR  HHcMOEA  KnEA  RPEA  RSEA  RVEA
MaF01  1.0  2.0  5.0  4.0  3.0  8.0  6.0  9.0  7.0  10.0
MaF02  1.0  4.0  2.0  5.0  8.0  9.0  3.0  7.0  6.0  10.0
MaF03  6.0  1.0  9.0  5.0  3.0  8.0  10.0  7.0  4.0  2.0
MaF04  2.0  3.0  8.0  1.0  4.0  7.0  9.0  5.0  6.0  10.0
MaF05  2.0  3.0  6.0  5.0  4.0  10.0  7.0  9.0  8.0  1.0
MaF06  2.0  4.0  1.0  3.0  6.0  7.0  5.0  9.0  8.0  10.0
MaF07  1.0  8.0  3.0  2.0  9.0  6.0  4.0  5.0  7.0  10.0
MaF08  1.0  4.0  2.0  3.0  7.5  7.5  5.0  6.0  9.0  10.0
MaF09  1.0  3.0  7.0  2.0  4.0  6.0  10.0  5.0  8.0  9.0
MaF10  2.0  10.0  3.0  4.0  8.0  9.0  5.0  6.0  1.0  7.0
MaF11  1.0  10.0  9.0  2.0  3.0  7.0  5.0  8.0  4.0  6.0
MaF12  3.0  10.0  7.0  1.0  2.0  9.0  6.0  8.0  5.0  4.0
MaF13  1.0  7.0  3.0  2.0  9.0  6.0  5.0  4.0  8.0  10.0
MaF14  4.0  1.0  8.0  2.0  5.0  6.0  9.0  7.0  3.0  10.0
MaF15  4.0  6.0  5.0  1.0  7.0  2.0  10.0  9.0  3.0  8.0
Total  1  3  4  2  5  9  7  8  6  10
(b) D = 10
Problem  AGE-II  AMPDEA  BCE-IBEA  CVEA3  fastCAR  HHcMOEA  KnEA  RPEA  RSEA  RVEA
MaF01  5.0  1.0  9.0  2.0  3.0  8.0  4.0  7.0  6.0  10.0
MaF02  1.0  2.0  3.0  7.0  9.0  8.0  10.0  6.0  4.0  5.0
MaF03  1.0  2.0  7.0  8.0  4.0  6.0  3.0  10.0  9.0  5.0
MaF04  2.0  1.0  7.0  4.0  5.0  8.0  9.0  6.0  3.0  10.0
MaF05  7.0  4.0  3.0  5.0  1.0  10.0  2.0  8.0  9.0  6.0
MaF06  7.0  1.0  9.0  2.0  6.0  5.0  8.0  4.0  3.0  10.0
MaF07  1.0  7.0  6.0  2.0  10.0  3.0  4.0  9.0  8.0  5.0
MaF08  2.0  8.0  1.0  4.0  5.0  6.0  9.0  3.0  7.0  10.0
MaF09  2.0  6.0  3.0  1.0  4.0  8.0  9.0  5.0  7.0  10.0
MaF10  6.0  10.0  3.5  5.0  9.0  8.0  7.0  3.5  1.0  2.0
MaF11  1.0  2.0  3.0  5.0  7.0  10.0  6.0  9.0  8.0  4.0
MaF12  4.0  8.0  9.0  2.0  3.0  10.0  6.0  7.0  1.0  5.0
MaF13  9.0  5.0  3.0  2.0  7.0  6.0  10.0  1.0  4.0  8.0
MaF14  10.0  2.0  3.0  1.0  4.0  9.0  5.0  8.0  7.0  6.0
MaF15  8.0  2.0  9.0  1.0  5.0  4.0  10.0  7.0  6.0  3.0
Total  3  2  4  1  5  10  9  7  6  8
Table 9. DSCTool data-driven ensemble rankings.
Table 9. DSCTool data-driven ensemble rankings.
(a) D = 5
Problem  AGE-II  AMPDEA  BCE-IBEA  CVEA3  fastCAR  HHcMOEA  KnEA  RPEA  RSEA  RVEA
MaF01  1.0  3.0  5.0  2.0  4.0  8.0  6.0  9.0  7.0  10.0
MaF02  3.0  5.0  1.0  4.0  8.0  9.0  2.0  7.0  6.0  10.0
MaF03  6.0  2.0  10.0  1.0  3.0  8.0  9.0  7.0  5.0  4.0
MaF04  4.0  2.0  6.0  1.0  3.0  7.0  8.0  5.0  9.0  10.0
MaF05  4.0  8.0  5.0  3.0  2.0  9.5  6.0  9.5  7.0  1.0
MaF06  3.0  4.0  1.0  2.0  7.0  6.0  5.0  8.0  9.0  10.0
MaF07  1.0  6.0  3.0  2.0  8.0  5.0  4.0  9.0  7.0  10.0
MaF08  1.0  4.0  3.0  2.0  5.5  5.5  9.0  7.0  8.0  10.0
MaF09  2.0  5.0  6.0  1.0  4.0  3.0  10.0  7.0  8.0  9.0
MaF10  5.0  10.0  4.0  2.5  9.0  8.0  2.5  6.0  1.0  7.0
MaF11  5.0  10.0  8.0  3.0  1.0  7.0  6.0  9.0  2.0  4.0
MaF12  3.0  10.0  6.0  1.0  2.0  9.0  4.0  8.0  7.0  5.0
MaF13  2.0  10.0  4.5  1.0  4.5  3.0  7.0  6.0  8.0  9.0
MaF14  6.0  2.0  8.0  1.0  4.0  7.0  5.0  9.0  3.0  10.0
MaF15  2.0  3.0  6.0  1.0  5.0  4.0  10.0  9.0  8.0  7.0
Total  2  5  4  1  3  8  6  9  7  10
(b) D = 10
Problem  AGE-II  AMPDEA  BCE-IBEA  CVEA3  fastCAR  HHcMOEA  KnEA  RPEA  RSEA  RVEA
MaF01  4.0  1.0  7.0  2.0  8.0  9.0  3.0  6.0  5.0  10.0
MaF02  3.0  1.0  2.0  5.0  8.0  9.0  10.0  4.0  6.0  7.0
MaF03  5.0  1.0  10.0  6.0  3.0  4.0  7.0  9.0  8.0  2.0
MaF04  4.5  1.5  6.0  1.5  3.0  8.0  9.0  4.5  7.0  10.0
MaF05  8.5  3.0  4.0  5.0  1.0  10.0  2.0  7.0  8.5  6.0
MaF06  6.0  2.0  9.0  3.0  4.0  7.0  10.0  1.0  5.0  8.0
MaF07  2.0  5.5  7.0  1.0  8.0  4.0  3.0  10.0  9.0  5.5
MaF08  2.0  6.0  3.0  1.0  9.0  4.0  8.0  5.0  7.0  10.0
MaF09  6.0  7.0  8.0  1.0  2.0  5.0  9.0  4.0  3.0  10.0
MaF10  9.0  10.0  8.0  7.0  3.0  4.0  5.0  6.0  1.0  2.0
MaF11  6.5  1.0  4.0  8.0  5.0  10.0  6.5  9.0  3.0  2.0
MaF12  8.0  9.0  7.0  2.0  5.0  10.0  1.0  6.0  4.0  3.0
MaF13  7.5  9.0  3.0  1.0  6.0  4.0  5.0  2.0  7.5  10.0
MaF14  7.0  3.5  8.0  1.0  3.5  10.0  9.0  5.0  6.0  2.0
MaF15  3.0  1.0  9.0  2.0  4.5  8.0  10.0  4.5  7.0  6.0
Total  4  2  8  1  3  10  9  5  6  7
