ABSTRACT
We present the first systematic analysis of key characteristics of patch search spaces for automatic patch generation systems. We analyze sixteen different configurations of the patch search spaces of SPR and Prophet, two current state-of-the-art patch generation systems. The analysis shows that 1) correct patches are sparse in the search spaces (typically at most one correct patch per search space per defect), 2) incorrect patches that nevertheless pass all of the test cases in the validation test suite are typically orders of magnitude more abundant, and 3) leveraging information other than the test suite is therefore critical for enabling the system to successfully isolate correct patches.
We also characterize a key tradeoff in the structure of the search spaces. Larger and richer search spaces that contain correct patches for more defects can actually cause systems to find fewer, not more, correct patches. We identify two reasons for this phenomenon: 1) increased validation times because of the presence of more candidate patches and 2) more incorrect patches that pass the test suite and block the discovery of correct patches. These fundamental properties, which are all characterized for the first time in this paper, help explain why past systems often fail to generate correct patches and help identify challenges, opportunities, and productive future directions for the field.
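The two reasons given above can be illustrated with a toy model. The sketch below is not the search algorithm of SPR or Prophet; the `Candidate` type, the `search` function, the fixed validation budget, and the candidate orderings are all hypothetical, chosen only to show how a larger space can either exhaust the budget or surface a plausible but incorrect patch first.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    passes_tests: bool  # plausible: passes every test in the validation suite
    is_correct: bool    # correct: also matches the intended behavior


def search(candidates: List[Candidate], budget: int) -> Optional[Candidate]:
    """Validate candidates in order and return the first plausible one.

    Assumption: each validation costs one unit, and the search stops
    once `budget` candidates have been validated.
    """
    for candidate in candidates[:budget]:
        if candidate.passes_tests:
            return candidate
    return None


def failing(prefix: str, n: int) -> List[Candidate]:
    """Build n candidates that fail the test suite."""
    return [Candidate(f"{prefix}{i}", False, False) for i in range(n)]


correct = Candidate("correct", True, True)
plausible_wrong = Candidate("plausible-but-wrong", True, False)

# Small space: the correct patch is reached within the budget.
small_space = failing("s", 3) + [correct]
# Richer space, reason 1: more candidates push the correct patch past the budget.
slow_space = failing("t", 10) + [correct]
# Richer space, reason 2: an incorrect patch passes the tests first and blocks it.
blocked_space = failing("b", 3) + [plausible_wrong, correct]

budget = 5
r_small = search(small_space, budget)
r_slow = search(slow_space, budget)
r_blocked = search(blocked_space, budget)

for label, result in [("small", r_small), ("slow", r_slow), ("blocked", r_blocked)]:
    print(label, result.name if result else None)
```

In this toy setting both enlarged spaces still contain the correct patch, yet neither search returns it. Test results alone cannot distinguish `plausible_wrong` from `correct`, which is why the ordering of candidates, and hence information beyond the test suite, matters.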