ABSTRACT
BACKGROUND: Unlike traditional defect prediction models, Just-In-Time (JIT) models detect fix-inducing (or defect-inducing) changes. These models are built on the assumption that past code change properties resemble future ones. However, as a system evolves, the expertise of its developers and/or the complexity of the system also change.
AIM: In this work, we investigate the effect of code change properties on JIT models over time. We also study the impact of using recent data versus all available data on the performance of JIT models, and we analyze the effect of weighted sampling on that performance. For this purpose, we used datasets from four open-source projects: Eclipse JDT, Mozilla, Eclipse Platform, and PostgreSQL.
METHOD: We used five families of code change properties: size, diffusion, history, experience, and purpose. We trained and tested JIT models with Random Forest, and measured performance with the Brier Score (BS) and the Area Under the ROC Curve (AUC). We applied the Wilcoxon Signed Rank Test to the output to statistically validate whether the performance of JIT models improves when using all available data rather than only recent data.
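The evaluation pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' actual implementation: the feature matrix is synthetic stand-in data for the five property families, and the paired AUC values fed to the Wilcoxon test are illustrative placeholders.

```python
import numpy as np
from scipy.stats import wilcoxon
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import brier_score_loss, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a change-level dataset: five feature columns
# (standing in for size, diffusion, history, experience, purpose)
# and a binary fix-inducing label.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=1000) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
probs = model.predict_proba(X_te)[:, 1]

bs = brier_score_loss(y_te, probs)   # Brier Score: lower is better
auc = roc_auc_score(y_te, probs)     # AUC: higher is better
print(f"BS={bs:.3f}  AUC={auc:.3f}")

# Wilcoxon Signed Rank Test on paired performance scores of two model
# configurations (e.g. all data vs. recent data); values are illustrative.
auc_all = [0.78, 0.75, 0.80, 0.77]
auc_recent = [0.74, 0.73, 0.79, 0.75]
stat, p = wilcoxon(auc_all, auc_recent)
print(f"Wilcoxon p-value={p:.3f}")
```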
RESULTS: Our results suggest that the predictive power of JIT models does not change over time. Furthermore, we observed that the chronology of data in JIT defect prediction models can be disregarded by considering all the available data. On the other hand, the importance scores of the families of code change properties oscillate over time.
CONCLUSION: To mitigate the impact of the evolution of code change properties, we recommend a weighted sampling approach in which more emphasis is placed on changes occurring closer to the current time. Moreover, since properties such as developer expertise and change size evolve over time, models trained on old data may exhibit different characteristics from those trained on newer data. Hence, practitioners should regularly retrain JIT models to include fresh data.
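The recommended recency weighting can be sketched as follows. This is a hypothetical example, not the paper's implementation: the exponential decay and the 90-day half-life are assumptions chosen for illustration, and the data is synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic change-level data with commit timestamps in days.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))
y = (X[:, 0] + rng.normal(size=500) > 0).astype(int)
timestamps = np.sort(rng.uniform(0, 365, size=500))  # days since project start

# Assumed decay scheme: a change's weight halves for every 90 days of age,
# so changes closer to the current time dominate training.
half_life = 90.0
age = timestamps.max() - timestamps
weights = 0.5 ** (age / half_life)

model = RandomForestClassifier(n_estimators=100, random_state=1)
model.fit(X, y, sample_weight=weights)  # recent changes count more
```

Any classifier accepting per-sample weights could be substituted; the key design choice is that weight decays monotonically with change age.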