A Comparative Study of Vectorization Approaches for Detecting Inconsistent Method Names

Minehisa, Tomoya; Aman, Hirohisa; Yokogawa, Tomoyuki; Kawahara, Minoru

doi:10.1007/978-3-030-79474-3_9

Part of the book series: Studies in Computational Intelligence (SCI, volume 985)

Included in the following conference series:

International Conference on Intelligence Science

Abstract

Methods (functions) are the fundamental components of software. Programmers usually grasp a method's behavior by looking at its name, so the name should summarize what the method does. A previous study used Word2Vec, Doc2Vec, and a convolutional neural network (CNN) to evaluate the consistency between a method's name and its body automatically. While that conventional procedure detects inconsistent method names successfully, constructing its CNN-based vectorization model is computationally costly. This paper focuses on that computational cost and proposes replacing the CNN-based model with a more lightweight vectorization approach. A comparative study of four alternative approaches showed that one of them, Sent2Vec, can build the vectorization model 14 times faster than the conventional approach while maintaining the capability of detecting inconsistent method names.
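
To make the idea of checking a name against a body more concrete, the following is a minimal sketch and not the procedure evaluated in the chapter. It assumes one plausible reading of "consistency": a name is suspicious when methods with similar bodies carry dissimilar names. Plain token counts stand in for the learned embeddings (Word2Vec, Doc2Vec, CNN, or Sent2Vec), and every name here (`split_identifier`, `name_looks_inconsistent`, `CORPUS`, the values of `k` and `threshold`) is hypothetical.

```python
import math
import re
from collections import Counter


def split_identifier(name):
    """Split a camelCase or snake_case identifier into lowercase tokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def bag_of_tokens(tokens):
    """Toy vectorizer: token counts stand in for a learned embedding."""
    return Counter(tokens)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def name_looks_inconsistent(name, body_tokens, corpus, k=2, threshold=0.3):
    """Flag `name` when the k methods with the most similar bodies
    all have names that are dissimilar to it (assumed heuristic)."""
    body_vec = bag_of_tokens(body_tokens)
    ranked = sorted(
        corpus,
        key=lambda m: cosine(body_vec, bag_of_tokens(m[1])),
        reverse=True,
    )
    name_vec = bag_of_tokens(split_identifier(name))
    best = max(
        (cosine(name_vec, bag_of_tokens(split_identifier(n)))
         for n, _ in ranked[:k]),
        default=0.0,
    )
    return best < threshold


# Hypothetical corpus of (method name, simplified body tokens).
CORPUS = [
    ("readFile", ["open", "read", "close", "return"]),
    ("writeFile", ["open", "write", "close"]),
    ("computeSum", ["for", "add", "return"]),
]

if __name__ == "__main__":
    body = ["open", "read", "close", "return"]
    print(name_looks_inconsistent("computeSum", body, CORPUS))
    print(name_looks_inconsistent("loadFile", body, CORPUS))
```

Swapping `bag_of_tokens` for a trained sentence-embedding model would bring the sketch closer to the kind of vectorization the chapter compares; the count-based version is used here only so that the example runs without external dependencies.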

Notes

1. https://www.abbreviations.com/abbreviation/
2. CPU: Intel Core i5-6600, 3.3 GHz; Memory: 16 GB; OS: Linux 4.19.128.
3. https://deeplearning4j.org/

Acknowledgements

The authors would like to thank the anonymous reviewers for their helpful comments on an earlier version of this paper. This work was supported by JSPS KAKENHI Grants #20H04184, #21K11831, and #21K11833.

Author information

Authors and Affiliations

Graduate School of Science and Engineering, Ehime University, Matsuyama, Ehime, 790-8577, Japan
Tomoya Minehisa
Center for Information Technology, Ehime University, Matsuyama, Ehime, 790-8577, Japan
Hirohisa Aman & Minoru Kawahara
Faculty of Computer Science and Systems Engineering, Okayama Prefectural University, Soja, Okayama, 719-1197, Japan
Tomoyuki Yokogawa

Corresponding author

Correspondence to Hirohisa Aman.

Editor information

Editors and Affiliations

Software Engineering and Information Technology Institute, Central Michigan University, Mount Pleasant, MI, USA
Roger Lee

About this chapter

Cite this chapter

Minehisa, T., Aman, H., Yokogawa, T., Kawahara, M. (2021). A Comparative Study of Vectorization Approaches for Detecting Inconsistent Method Names. In: Lee, R. (eds) Computer and Information Science 2021—Summer. ICIS 2021. Studies in Computational Intelligence, vol 985. Springer, Cham. https://doi.org/10.1007/978-3-030-79474-3_9

DOI: https://doi.org/10.1007/978-3-030-79474-3_9
Published: 24 June 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-79473-6
Online ISBN: 978-3-030-79474-3
eBook Packages: Intelligent Technologies and Robotics (R0)
