ABSTRACT
According to psychological scientists, humans best understand models that match their own internal models, which they characterize as lists of "heuristics" (i.e., lists of very succinct rules). One such heuristic rule generator is the Fast-and-Frugal Tree (FFT) favored by psychological scientists. Despite their successful use in many applied domains, FFTs have not been applied in software analytics. Accordingly, this paper assesses FFTs for software analytics.
We find that FFTs are remarkably effective: their models are very succinct (5 lines or fewer, describing a binary decision tree) while also outperforming results from very recent, top-level conference papers. Also, when we restrict training data to operational attributes (i.e., those attributes that are frequently changed by developers), the performance of FFTs is not affected (while the performance of other learners can vary wildly).
Our conclusions are two-fold. Firstly, there is much that the software analytics community could learn from psychological science. Secondly, proponents of complex methods should always baseline those methods against simpler alternatives. For example, FFTs could be used as a standard baseline learner against which other software analytics tools are compared.
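To make the succinctness claim concrete, the sketch below shows the general shape of an FFT-style model in Python. The attribute names and thresholds (coupling, response_for_class, lines_of_code, churn) are hypothetical illustrations; in practice the cues and cut-offs are learned from data and are not taken from this paper.

```python
# A minimal sketch of a fast-and-frugal tree (FFT) for defect prediction.
# All attribute names and thresholds here are hypothetical illustrations,
# not the cues reported in the paper.

def fft_defective(metrics):
    """Classify a module as defective (True) or clean (False).

    Each level of an FFT tests exactly one attribute; one branch exits
    immediately with a decision, the other falls through to the next
    test. The final test exits on both branches, so the whole model
    fits in a handful of lines.
    """
    if metrics["coupling"] <= 4:              # low coupling -> exit: clean
        return False
    if metrics["response_for_class"] > 32:    # high RFC -> exit: defective
        return True
    if metrics["lines_of_code"] > 250:        # large module -> exit: defective
        return True
    return metrics["churn"] > 10              # final cue decides both ways


# Example usage on a hypothetical module:
print(fft_defective({"coupling": 6, "response_for_class": 12,
                     "lines_of_code": 100, "churn": 15}))   # -> True
```

The "one test, one exit per level" structure is what keeps such models readable: the tree above is exactly the kind of 5-line binary decision tree the abstract describes.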