
CodeEditor: Learning to Edit Source Code with Pre-trained Models


Abstract

Developers often perform repetitive code editing activities (up to 70%) for various reasons (e.g., code refactoring) during software development. Many deep learning (DL) models have been proposed to automate code editing by learning from the code editing history. Among DL-based models, pre-trained code editing models have achieved state-of-the-art (SOTA) results. Pre-trained models are first pre-trained on pre-training tasks and then fine-tuned on the code editing task. Existing pre-training tasks are mainly code infilling tasks (e.g., masked language modeling), which originate in the natural language processing field and are not designed for automatic code editing.
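For illustration only, the following is a minimal sketch (not the paper's implementation) of the kind of code-infilling example that masked language modeling produces: random tokens in a code snippet are replaced with a sentinel, and the model is trained to recover them. The whitespace tokenization and the `<mask>` sentinel are simplifying assumptions to keep the sketch short.

```python
# Minimal sketch of an MLM-style code-infilling example (illustration only).
# Assumes whitespace tokenization and a hypothetical "<mask>" sentinel.
import random

MASK = "<mask>"

def make_infilling_example(code: str, mask_ratio: float = 0.15, seed: int = 0):
    """Return (masked_code, targets): the model must predict the masked tokens."""
    rng = random.Random(seed)
    masked, targets = [], []
    for tok in code.split():
        if rng.random() < mask_ratio:
            masked.append(MASK)   # hide the original token
            targets.append(tok)   # the token the model should recover
        else:
            masked.append(tok)
    return " ".join(masked), targets

if __name__ == "__main__":
    snippet = "public int add ( int a , int b ) { return a + b ; }"
    masked_code, targets = make_infilling_example(snippet)
    print(masked_code)  # the snippet with some tokens replaced by <mask>
    print(targets)      # the hidden tokens, i.e., the infilling targets
```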

In this article, we propose a novel pre-training task specialized in code editing and present an effective pre-trained code editing model named CodeEditor. Compared to previous code infilling tasks, our pre-training task further improves the performance and generalization ability of code editing models. Specifically, we collect a large number of real-world code snippets as the ground truth and use a powerful generator to rewrite them into mutated versions. Then, we pre-train our CodeEditor to edit the mutated versions back into the corresponding ground truth, thereby learning edit patterns. We conduct experiments on four code editing datasets and evaluate the pre-trained CodeEditor in three settings (i.e., fine-tuning, few-shot, and zero-shot). (1) In the fine-tuning setting, we train the pre-trained CodeEditor with four datasets and evaluate it on the test data. CodeEditor outperforms the SOTA baselines by 15%, 25.5%, 9.4%, and 26.6% on the four datasets. (2) In the few-shot setting, we train the pre-trained CodeEditor with limited data and evaluate it on the test data. CodeEditor performs substantially better than all baselines, even outperforming baselines that are fine-tuned with all data. (3) In the zero-shot setting, we evaluate the pre-trained CodeEditor on the test data without any training. CodeEditor correctly edits 1,113 programs, while the SOTA baselines cannot work in this setting. The results show the superiority of our pre-training task and demonstrate that the pre-trained CodeEditor is more effective in automatic code editing.
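To make the described pre-training task concrete, here is a minimal sketch of how one pre-training pair could be constructed: the mutated code serves as the model input and the original real-world snippet as the target, so the model learns to edit the mutation back into the ground truth. The paper uses a powerful learned generator to produce mutations; the simple identifier-renaming mutator below, and all function names in it, are hypothetical stand-ins for illustration only.

```python
# Minimal sketch of building one pre-training pair for the editing task.
# The paper's learned generator is replaced here by a toy rule-based mutator.
import random
import re

def mutate(code: str, seed: int = 0) -> str:
    """Rewrite the snippet into a plausible but different version (stand-in generator)."""
    rng = random.Random(seed)
    identifiers = sorted(set(re.findall(r"[A-Za-z_]\w*", code)))
    if not identifiers:
        return code
    victim = rng.choice(identifiers)
    # Rename one identifier to produce a mutated version of the snippet.
    return re.sub(rf"\b{re.escape(victim)}\b", victim + "_tmp", code)

def make_editing_example(original: str):
    """Return (model_input, target): edit the mutated version back to the ground truth."""
    return mutate(original), original

if __name__ == "__main__":
    ground_truth = "int total = price * count ;"
    model_input, target = make_editing_example(ground_truth)
    print("input :", model_input)  # mutated version produced by the (stand-in) generator
    print("target:", target)       # the original real-world snippet to recover
```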



• Published in

  ACM Transactions on Software Engineering and Methodology, Volume 32, Issue 6
  November 2023, 949 pages
  ISSN: 1049-331X
  EISSN: 1557-7392
  DOI: 10.1145/3625557
  Editor: Mauro Pezzè


          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 30 September 2023
          • Online AM: 22 May 2023
          • Accepted: 7 April 2023
          • Revised: 26 February 2023
          • Received: 25 September 2022
Published in TOSEM Volume 32, Issue 6

