Clone Detection in Reuse of Software Technical Documentation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9609)

Abstract

As software documentation grows more complicated, the efficiency of the maintenance process can be increased through documentation reuse. In this paper, we apply a software clone detection technique to automate the search for repeated fragments in software technical documentation that can be reused. Our approach supports adaptive reuse, which means extracting “near duplicate” text fragments (repetitions with variations) and producing customizable reusable elements. We present a process and a tool that work with DocBook documentation (DocBook is a widely used XML markup language), with DRL (a DocBook extension with adaptive reuse features), and with plain text. Our tool is based on the Clone Miner software clone detection tool and is integrated into the DocLine environment (an adaptive reuse documentation framework). It provides visualization and navigation facilities for the clone groups found and supports refactoring to extract clones into reusable elements.
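
To illustrate the general idea of searching documentation for repeated fragments, the following is a minimal sketch, not the Clone Miner algorithm or the authors' tool. It assumes plain-text input and a simple word tokenizer, and it reports only exact repetitions of a fixed token length; near duplicates with variations are out of its scope. The function name `find_repeated_fragments` and the parameter `min_tokens` are hypothetical, introduced here for illustration only.

```python
import re
from collections import defaultdict


def find_repeated_fragments(text, min_tokens=4):
    """Map each repeated token window to the token offsets where it starts.

    Illustrative assumption: a "clone" here is an exact repetition of
    `min_tokens` consecutive lowercase word tokens. Real clone detectors
    use more elaborate tokenization and matching.
    """
    tokens = re.findall(r"\w+", text.lower())
    positions = defaultdict(list)
    # Slide a fixed-size window over the token stream and index each window.
    for start in range(len(tokens) - min_tokens + 1):
        window = tuple(tokens[start:start + min_tokens])
        positions[window].append(start)
    # Keep only windows that occur at more than one position.
    return {
        " ".join(window): starts
        for window, starts in positions.items()
        if len(starts) > 1
    }


if __name__ == "__main__":
    sample = (
        "Call the init function before use. "
        "Then call the init function before exit."
    )
    for fragment, starts in find_repeated_fragments(sample).items():
        print(fragment, starts)
```

Each group in the result plays the role of a candidate clone group: a fragment together with every place it occurs. A reviewer could then decide which groups are worth extracting into a reusable element.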

Acknowledgements

The authors thank the students Artem Shutak, Dmitry Kopin, Mikhail Smarzhevskij and Adeel Khan, who implemented the draft versions of selected parts of the solution, and participated in discussions.

Corresponding author

Correspondence to Dmitrij Koznov.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Koznov, D., Luciv, D., Basit, H.A., Lieh, O.E., Smirnov, M. (2016). Clone Detection in Reuse of Software Technical Documentation. In: Mazzara, M., Voronkov, A. (eds) Perspectives of System Informatics. PSI 2015. Lecture Notes in Computer Science, vol 9609. Springer, Cham. https://doi.org/10.1007/978-3-319-41579-6_14

  • DOI: https://doi.org/10.1007/978-3-319-41579-6_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41578-9

  • Online ISBN: 978-3-319-41579-6

  • eBook Packages: Computer Science, Computer Science (R0)
