Abstract
Analogy-based effort estimation (ABE) is one of the prominent methods for software effort estimation. Its fundamental concept is close to the reasoning of expert estimation, but it uses an automated procedure in which the final estimate is generated by reusing similar historical projects. The key issue when using ABE is how to adapt the effort of the retrieved nearest neighbors. Adaptation is an essential part of ABE: it tunes the selected raw solutions with an adaptation strategy to produce a more accurate estimate. In this study, we show that three interrelated decision variables have a great impact on the success of an adaptation method: (1) the number of nearest analogies (k), (2) the optimum feature set needed for adaptation and (3) the adaptation weights. To find the right decision regarding these variables, one needs to study all possible combinations and evaluate them individually to select the one that improves all prediction evaluation measures. The existing evaluation measures usually behave differently, sometimes presenting opposite trends when evaluating prediction methods. This means that changing one decision variable could improve one evaluation measure while degrading the others. Therefore, the main theme of this research is how to find the decision variables that improve the adaptation strategy, and thus the evaluation measures overall, without degrading any of them. The joint impact of these decisions has not been investigated before; therefore, we propose to view the building of the adaptation procedure as a multi-objective optimization problem. The particle swarm optimization (PSO) algorithm is used to find the optimum solutions for these decision variables by optimizing multiple evaluation measures. We evaluated the proposed approaches over 15 datasets using four evaluation measures.
After extensive experimentation, we found that: (1) the predictive performance of ABE improved noticeably, (2) optimizing all decision variables together is more efficient than ignoring any one of them, and (3) optimizing the decision variables for each project individually yields better accuracy than optimizing them for the whole dataset.
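To make the three decision variables and the multi-objective view concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a Euclidean distance over a binary feature mask, one adaptation weight per retrieved analogy, and error measures that are all minimized. The names `abe_estimate`, `dominates`, `pareto_front` and the toy data are hypothetical and chosen only for illustration. The PSO search itself is omitted; the sketch shows only what a candidate setting controls and how non-dominated settings could be filtered.

```python
from math import sqrt


def abe_estimate(target, history, efforts, k, feature_mask, weights):
    """Estimate effort as a weighted mean of the k nearest historical projects.

    Assumptions (not from the paper): Euclidean distance over the features
    switched on in feature_mask, and one weight per retrieved analogy.
    """
    def distance(a, b):
        # Only features selected by the mask contribute to the distance.
        return sqrt(sum((x - y) ** 2
                        for x, y, use in zip(a, b, feature_mask) if use))

    # Rank historical projects by distance to the target, keep the k closest.
    ranked = sorted(range(len(history)),
                    key=lambda i: distance(target, history[i]))
    nearest = ranked[:k]

    # Weighted mean of the retrieved efforts (closest analogy gets weights[0]).
    total = sum(weights[:k])
    return sum(w * efforts[i] for w, i in zip(weights, nearest)) / total


def dominates(a, b):
    """True if error vector a is no worse than b on every measure
    and strictly better on at least one (all measures are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(candidates):
    """Keep the candidates whose error vectors no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(o["errors"], c["errors"])
                       for o in candidates if o is not c)]


# Toy data: two features per project and the recorded effort.
history = [[10, 2], [12, 3], [50, 9], [11, 2]]
efforts = [100, 130, 600, 110]
target = [11, 2]

estimate = abe_estimate(target, history, efforts,
                        k=2, feature_mask=[1, 1], weights=[1, 1])

# Three hypothetical settings, each scored on two error measures.
settings = [
    {"name": "A", "errors": (0.30, 0.50)},
    {"name": "B", "errors": (0.40, 0.40)},
    {"name": "C", "errors": (0.45, 0.55)},
]
front_names = [c["name"] for c in pareto_front(settings)]
```

In this toy run, setting C is worse than A on both measures and is discarded, while A and B trade one measure against the other and both remain. This is the situation the abstract describes, in which improving one evaluation measure can degrade another.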
Acknowledgments
The authors are grateful to the Applied Science Private University, Amman, Jordan, for the financial support granted to this research project.
Cite this article
Azzeh, M., Nassif, A.B., Banitaan, S. et al. Pareto efficient multi-objective optimization for local tuning of analogy-based estimation. Neural Comput & Applic 27, 2241–2265 (2016). https://doi.org/10.1007/s00521-015-2004-y
DOI: https://doi.org/10.1007/s00521-015-2004-y