DOI: 10.1145/3345629.3351449

Does chronology matter in JIT defect prediction?: A Partial Replication Study

Published: 18 September 2019

ABSTRACT

BACKGROUND: Just-In-Time (JIT) models, unlike traditional defect prediction models, detect fix-inducing changes (also called defect-inducing changes). These models are designed on the assumption that past code change properties are similar to future ones. However, as a system evolves, the expertise of its developers and/or the complexity of the system also change.

AIM: In this work, we investigate the effect of code change properties on JIT models over time. We also study the impact of using recent data versus all available data on the performance of JIT models. Further, we analyze the effect of weighted sampling on the performance of JIT models in predicting fix-inducing changes. For this purpose, we used datasets from four open-source projects: Eclipse JDT, Mozilla, Eclipse Platform, and PostgreSQL.

METHOD: We used five families of code change properties: size, diffusion, history, experience, and purpose. We used Random Forest to train and test the JIT models, with the Brier Score (BS) and the Area Under the ROC Curve (AUC) as performance measures. We applied the Wilcoxon Signed Rank Test to the output to statistically validate whether the performance of JIT models improves when using all the available data rather than only the recent data.
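To make this setup concrete, below is a minimal sketch of such a pipeline (not the authors' replication code), assuming scikit-learn and SciPy; the file changes.csv, the metric column names, and the label column fix_inducing are hypothetical placeholders rather than the study's actual data layout.

```python
# A minimal sketch of the evaluation pipeline, assuming scikit-learn and
# SciPy. The file name, metric columns, and label column are hypothetical.
import pandas as pd
from scipy.stats import wilcoxon
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import brier_score_loss, roc_auc_score

# One row per code change; columns cover the five metric families
# (size, diffusion, history, experience, purpose) plus a binary label.
data = pd.read_csv("changes.csv")
features = ["la", "ld", "nf", "ns", "exp", "fix"]  # placeholder metric names
X, y = data[features], data["fix_inducing"]

# Chronological split: train on earlier changes, test on later ones.
split = int(len(data) * 0.8)
X_train, X_test = X.iloc[:split], X.iloc[split:]
y_train, y_test = y.iloc[:split], y.iloc[split:]

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
prob = model.predict_proba(X_test)[:, 1]

print("AUC:", roc_auc_score(y_test, prob))             # higher is better
print("Brier score:", brier_score_loss(y_test, prob))  # lower is better

# Wilcoxon Signed Rank Test on paired per-period scores from two setups
# (e.g., all-data vs. recent-data models); the numbers are illustrative.
auc_all_data = [0.74, 0.76, 0.71, 0.78, 0.75]
auc_recent = [0.72, 0.75, 0.70, 0.77, 0.74]
print(wilcoxon(auc_all_data, auc_recent))
```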

RESULTS: Our results suggest that the predictive power of JIT models does not change over time. Furthermore, we observed that the chronology of data in JIT defect prediction models can be disregarded by considering all the available data. On the other hand, the importance scores of the families of code change properties oscillate over time.
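Such oscillation can be observed by refitting the model per time period and reading the Random Forest importance scores, as in the hedged sketch below; the period column (e.g., the year of each change) is an assumption on top of the same hypothetical dataset.

```python
# Sketch: refit per time period and track metric importance, assuming the
# same hypothetical dataset plus a period column (e.g., the change's year).
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

data = pd.read_csv("changes.csv")
features = ["la", "ld", "nf", "ns", "exp", "fix"]  # placeholder metric names

for period, chunk in data.groupby("period"):
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(chunk[features], chunk["fix_inducing"])
    # Impurity-based importance of each metric in this period's model.
    scores = pd.Series(model.feature_importances_, index=features)
    print(period, scores.round(3).to_dict())
```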

CONCLUSION: To mitigate the impact of the evolution of code change properties, we recommend a weighted sampling approach in which more emphasis is placed on changes occurring closer to the current time. Moreover, since properties such as "Expertise of the Developer" and "Size" evolve over time, models obtained from old data may exhibit different characteristics than those built from newer data. Hence, practitioners should regularly retrain JIT models to include fresh data.
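A minimal sketch of one such recency-weighted scheme follows, using scikit-learn's sample_weight; the age_in_days column and the one-year decay constant are assumptions for illustration, not values prescribed by the paper.

```python
# A hedged sketch of recency-weighted sampling: older changes receive
# exponentially smaller training weights. The age_in_days column and the
# one-year decay constant are assumptions, not values from the paper.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

data = pd.read_csv("changes.csv")
features = ["la", "ld", "nf", "ns", "exp", "fix"]  # placeholder metric names

# weight = exp(-age / tau): a change one year old gets weight ~0.37.
weights = np.exp(-data["age_in_days"].to_numpy() / 365.0)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(data[features], data["fix_inducing"], sample_weight=weights)
```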


  • Published in

    PROMISE'19: Proceedings of the Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering
    September 2019
    103 pages
    ISBN: 9781450372336
    DOI: 10.1145/3345629

    Copyright © 2019 ACM


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 64 of 125 submissions, 51%
