A Comparative Study of Vectorization Approaches for Detecting Inconsistent Method Names

  • Chapter
Computer and Information Science 2021—Summer (ICIS 2021)

Part of the book series: Studies in Computational Intelligence (SCI, volume 985)

Abstract

Methods (functions) are the fundamental components of software. Programmers usually grasp a method’s behavior by reading its name, so the name should summarize what the method does. A previous study used Word2Vec, Doc2Vec, and a convolutional neural network (CNN) to automatically evaluate the consistency between a method’s name and its body. Although that conventional procedure detects inconsistent method names successfully, constructing its CNN-based vectorization model is computationally expensive. This paper focuses on that computational cost and proposes replacing the CNN-based model with a lightweight vectorization approach. A comparative study of four alternative approaches showed that one of them, Sent2Vec, builds the vectorization model 14 times faster than the conventional approach while maintaining the capability to detect inconsistent method names.
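
To make the idea of comparing a method's name with its body concrete, here is a minimal sketch in Python. It is not the procedure evaluated in this chapter: in place of Word2Vec, Doc2Vec, a CNN, or Sent2Vec, it uses plain bag-of-words counts as a stand-in vectorization, and the helper names (`split_identifier`, `name_body_similarity`) are hypothetical. It only illustrates the general pattern of turning a name and a body into vectors and scoring their similarity.

```python
import math
import re
from collections import Counter


def split_identifier(identifier):
    """Split a camelCase or snake_case identifier into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return [part.lower() for part in parts]


def cosine_similarity(vec_a, vec_b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(vec_a[key] * vec_b[key] for key in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def name_body_similarity(method_name, body_identifiers):
    """Score how well a method name matches the identifiers used in its body.

    Assumption: bag-of-words counts stand in for a learned embedding model.
    """
    name_vector = Counter(split_identifier(method_name))
    body_words = [
        word
        for identifier in body_identifiers
        for word in split_identifier(identifier)
    ]
    return cosine_similarity(name_vector, Counter(body_words))


if __name__ == "__main__":
    consistent = name_body_similarity("getUserName", ["userName", "user"])
    suspicious = name_body_similarity("getUserName", ["socket", "close"])
    print(consistent > suspicious)
```

A low score for a given name and body would only be a hint that the name deserves a second look; a learned vectorization model, as studied in the chapter, can capture similarity between different words that raw counts cannot.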

Notes

  1. https://www.abbreviations.com/abbreviation/.

  2. CPU: Intel Core i5-6600 3.3 GHz; Memory: 16 GB; OS: Linux 4.19.128.

  3. https://deeplearning4j.org/.

Acknowledgements

The authors would like to thank the anonymous reviewers for their helpful comments on an earlier version of this paper. This work was supported by JSPS KAKENHI Grants #20H04184, #21K11831, and #21K11833.

Author information

Corresponding author

Correspondence to Hirohisa Aman.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this chapter

Minehisa, T., Aman, H., Yokogawa, T., Kawahara, M. (2021). A Comparative Study of Vectorization Approaches for Detecting Inconsistent Method Names. In: Lee, R. (eds) Computer and Information Science 2021—Summer. ICIS 2021. Studies in Computational Intelligence, vol 985. Springer, Cham. https://doi.org/10.1007/978-3-030-79474-3_9