Abstract
As software documentation grows more complex, reusing documentation can make its maintenance more efficient. In this paper, we apply a software clone detection technique to automate the search for repeated fragments in software technical documentation that are candidates for reuse. Our approach supports adaptive reuse, which means extracting “near duplicate” text fragments (repetitions with variations) and producing customizable reusable elements. We present a process and a tool that work with DocBook documentation (a widely used XML markup language), with DRL (a DocBook extension with adaptive reuse features), and with plain text. Our tool is based on the Clone Miner software clone detection tool and is integrated into the DocLine environment (an adaptive reuse documentation framework). It provides visualization of and navigation over the clone groups found, and supports refactoring to extract clones into reusable elements.
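The idea of searching documentation for repeated fragments can be illustrated with a minimal sketch. This is not the Clone Miner algorithm or the DocLine tool described in the paper; it is a naive stand-in that assumes plain-text input, word-level tokens, fixed-length windows, and exact matches only (no near duplicates). The names `tokenize`, `find_repeated_fragments`, and `min_tokens` are hypothetical and chosen for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split plain text into lowercase word tokens (a simplifying assumption)."""
    return re.findall(r"\w+", text.lower())


def find_repeated_fragments(text, min_tokens=4):
    """Group identical token windows that occur more than once.

    Returns a dict mapping each repeated fragment to the list of token
    offsets where it starts. Each entry is a crude analogue of a
    "clone group"; real detectors also merge and extend matches.
    """
    tokens = tokenize(text)
    positions = defaultdict(list)
    for start in range(len(tokens) - min_tokens + 1):
        window = tuple(tokens[start:start + min_tokens])
        positions[window].append(start)
    return {
        " ".join(window): starts
        for window, starts in positions.items()
        if len(starts) > 1
    }


if __name__ == "__main__":
    sample = "Call init before use. Then call init before use again."
    for fragment, starts in find_repeated_fragments(sample).items():
        print(fragment, starts)
```

Each repeated fragment found this way is a candidate for extraction into a single reusable element; handling repetitions with variations, as adaptive reuse requires, would need approximate matching on top of this.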
Acknowledgements
The authors thank the students Artem Shutak, Dmitry Kopin, Mikhail Smarzhevskij and Adeel Khan, who implemented the draft versions of selected parts of the solution, and participated in discussions.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Koznov, D., Luciv, D., Basit, H.A., Lieh, O.E., Smirnov, M. (2016). Clone Detection in Reuse of Software Technical Documentation. In: Mazzara, M., Voronkov, A. (eds) Perspectives of System Informatics. PSI 2015. Lecture Notes in Computer Science, vol 9609. Springer, Cham. https://doi.org/10.1007/978-3-319-41579-6_14
DOI: https://doi.org/10.1007/978-3-319-41579-6_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41578-9
Online ISBN: 978-3-319-41579-6
eBook Packages: Computer Science, Computer Science (R0)