DOI: 10.1145/3533767.3534390
Research article

An extensive study on pre-trained models for program understanding and generation

Published: 18 July 2022

ABSTRACT

Automatic program understanding and generation techniques can significantly improve programmer productivity and have been widely studied in academia and industry. Recently, the advent of the pre-training paradigm has inspired researchers to develop general-purpose pre-trained models that can be applied to a broad range of program understanding and generation tasks. Such pre-trained models, trained with self-supervised objectives on large unlabelled corpora, can be fine-tuned for downstream tasks (such as code search and code generation) with minimal adaptation. Although these pre-trained models claim superiority over prior techniques, they seldom follow equivalent evaluation protocols; for example, they are rarely evaluated on identical benchmarks, tasks, or settings. Consequently, there is a pressing need for a comprehensive study of the effectiveness, versatility, and limitations of pre-trained models to provide implications and guidance for future development in this area. To this end, we first perform an extensive study of eight open-access pre-trained models over a large benchmark of seven representative code tasks to assess their reproducibility. We further compare the pre-trained models with domain-specific state-of-the-art techniques to validate the effectiveness of pre-training. Finally, we investigate the robustness of the pre-trained models by inspecting their performance variations under adversarial attacks. Through the study, we find that while we can in general replicate the original performance of the pre-trained models on their evaluated tasks and adopted benchmarks, subtle performance fluctuations can refute findings in their original papers. Moreover, none of the existing pre-trained models dominates all the others. We also find that the pre-trained models can significantly outperform non-pre-trained state-of-the-art techniques on program understanding tasks. Furthermore, we perform the first study of the robustness of natural language-programming language pre-trained models via adversarial attacks, and find that a simple random attack approach can easily fool the state-of-the-art pre-trained models and thus incur security issues. Finally, we provide multiple practical guidelines for advancing future research on pre-trained models for program understanding and generation.
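To make the robustness setting concrete, the sketch below (not taken from the paper) illustrates one minimal form of the kind of simple random, semantics-preserving perturbation the abstract refers to: randomly renaming identifiers in an input program before it is fed to a fine-tuned model. The function name, keyword list, and renaming scheme are illustrative assumptions; the study's actual attack implementation may differ.

```python
# Illustrative sketch only: a naive random identifier-renaming perturbation.
# It does not distinguish builtins or attribute names and is NOT the paper's
# exact attack; it merely shows the flavor of a random, semantics-preserving edit.
import random
import re

PYTHON_KEYWORDS = {
    "def", "return", "if", "else", "elif", "for", "while", "in", "and",
    "or", "not", "None", "True", "False", "import", "from", "as", "class",
}

def random_rename_attack(code: str, n_renames: int = 1, seed: int = 0) -> str:
    """Randomly rename identifiers in a Python snippet, preserving its behavior."""
    rng = random.Random(seed)
    identifiers = sorted(set(re.findall(r"\b[A-Za-z_]\w*\b", code)) - PYTHON_KEYWORDS)
    if not identifiers:
        return code
    for old_name in rng.sample(identifiers, min(n_renames, len(identifiers))):
        new_name = "v_" + "".join(rng.choices("abcdefghij", k=6))
        code = re.sub(rf"\b{re.escape(old_name)}\b", new_name, code)
    return code

if __name__ == "__main__":
    snippet = "def add(first, second):\n    return first + second"
    # A fine-tuned model's prediction (e.g., defect detection or clone detection)
    # would be compared on the original snippet and the perturbed one.
    print(random_rename_attack(snippet, n_renames=2))
```

In this setting, an attack is typically counted as successful when the model's prediction changes on the renamed program even though the program's behavior is unchanged.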


Published in

ISSTA 2022: Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis
July 2022, 808 pages
ISBN: 9781450393799
DOI: 10.1145/3533767
Copyright © 2022 ACM
Publisher

Association for Computing Machinery, New York, NY, United States