Abstract
Developers often perform repetitive code editing activities (up to 70%) for various reasons (e.g., code refactoring) during software development. Many deep learning (DL) models have been proposed to automate code editing by learning from code editing history. Among DL-based models, pre-trained code editing models have achieved state-of-the-art (SOTA) results. Such models are first pre-trained with pre-training tasks and then fine-tuned on the code editing task. Existing pre-training tasks are mainly code infilling tasks (e.g., masked language modeling), which are derived from the natural language processing field and are not designed for automatic code editing.
In this article, we propose a novel pre-training task specialized for code editing and present an effective pre-trained code editing model named CodeEditor. Compared to previous code infilling tasks, our pre-training task further improves the performance and generalization ability of code editing models. Specifically, we collect many real-world code snippets as the ground truth and use a powerful generator to rewrite them into mutated versions. Then, we pre-train our CodeEditor to edit the mutated versions back into the corresponding ground truth, thereby learning edit patterns. We conduct experiments on four code editing datasets and evaluate the pre-trained CodeEditor in three settings (i.e., fine-tuning, few-shot, and zero-shot). (1) In the fine-tuning setting, we train the pre-trained CodeEditor on four datasets and evaluate it on the test data. CodeEditor outperforms the SOTA baselines by 15%, 25.5%, 9.4%, and 26.6% on the four datasets. (2) In the few-shot setting, we train the pre-trained CodeEditor with limited data and evaluate it on the test data. CodeEditor performs substantially better than all baselines, even outperforming baselines that are fine-tuned with all available data. (3) In the zero-shot setting, we evaluate the pre-trained CodeEditor on the test data without any training. CodeEditor correctly edits 1,113 programs, while the SOTA baselines cannot work in this setting. The results demonstrate the superiority of our pre-training task and show that the pre-trained CodeEditor is more effective in automatic code editing.
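The sketch below illustrates the idea behind the pre-training task described above: real-world code snippets serve as ground truth, a generator rewrites masked spans to produce mutated versions, and the editor model is then trained to map each mutated version back to the original. This is only a minimal illustration, not the authors' released implementation; the snippet loader and the span-infilling generator here (`dummy_generator_fill`) are hypothetical placeholders standing in for the powerful pre-trained generator used in the paper.

```python
# Minimal sketch of the pre-training data construction (assumed, simplified).
import random
from typing import List, Tuple


def mask_random_span(tokens: List[str], span_len: int = 3) -> Tuple[List[str], int]:
    """Replace a random span of tokens with a single <mask> placeholder."""
    start = random.randrange(0, max(1, len(tokens) - span_len))
    masked = tokens[:start] + ["<mask>"] + tokens[start + span_len:]
    return masked, start


def dummy_generator_fill(masked_tokens: List[str]) -> List[str]:
    """Stand-in for the generator: fills the <mask> with plausible but often
    incorrect tokens, yielding a 'mutated' version of the original code."""
    fill = ["0", "/*TODO*/"]  # deliberately imperfect completion
    idx = masked_tokens.index("<mask>")
    return masked_tokens[:idx] + fill + masked_tokens[idx + 1:]


def build_pretraining_pairs(snippets: List[str]) -> List[Tuple[str, str]]:
    """Each pair is (mutated code, original code); the editor model is
    pre-trained to edit the first element back into the second."""
    pairs = []
    for code in snippets:
        tokens = code.split()
        masked, _ = mask_random_span(tokens)
        mutated = dummy_generator_fill(masked)
        pairs.append((" ".join(mutated), code))
    return pairs


if __name__ == "__main__":
    snippets = ["int add ( int a , int b ) { return a + b ; }"]
    for mutated, target in build_pretraining_pairs(snippets):
        print("input :", mutated)
        print("target:", target)
```

In the actual approach, the resulting (mutated, ground-truth) pairs would feed a standard sequence-to-sequence pre-training objective, after which the model is fine-tuned (or used few-/zero-shot) on real code editing data.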