Abstract
Predicting the defect proneness of software products has been an active research area in the software engineering domain in recent years. To date, researchers have used static code metrics, code churn metrics, developer networks, and module networks as inputs to their proposed models. However, domain-specific characteristics of software have not been taken into account. In this research, we propose a new set of metrics that improves defect prediction performance for web applications by utilizing their characteristics. To validate our hypotheses, we conducted experiments on datasets from three open-source web applications. Defect prediction was then performed using different machine learning algorithms. The results of the experiments revealed that the overall performance of defect predictors improves compared to using only existing static code metrics. Therefore, we recommend that practitioners utilize domain-specific characteristics in defect prediction, as they can be informative.
Notes
1. In this study, the term ‘fault’ is used interchangeably with ‘defect’, ‘bug’ and ‘error’ to mean a fault in software code.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Biçer, M.S., Diri, B. (2015). Predicting Defect Prone Modules in Web Applications. In: Dregvaite, G., Damasevicius, R. (eds) Information and Software Technologies. ICIST 2015. Communications in Computer and Information Science, vol 538. Springer, Cham. https://doi.org/10.1007/978-3-319-24770-0_49
DOI: https://doi.org/10.1007/978-3-319-24770-0_49
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24769-4
Online ISBN: 978-3-319-24770-0
eBook Packages: Computer Science (R0)