ABSTRACT
Software projects use Issue Tracking Systems (ITS) like JIRA to track issues and organize the workflows around them. Issues are often inter-connected via different links such as the default JIRA link types Duplicate, Relate, Block, or Subtask. While previous research has mostly focused on analyzing and predicting duplication links, this work aims to understand the various other link types, their prevalence, and their characteristics, as a step towards more reliable link type prediction. For this, we studied 607,208 links connecting 698,790 issues in 15 public JIRA repositories. Besides the default types, the custom types Depend, Incorporate, Split, and Cause were also common. We manually grouped all 75 link types used in the repositories into five general categories: General Relation, Duplication, Composition, Temporal / Causal, and Workflow. Comparing the structures of the corresponding graphs, we observed several trends. For instance, Duplication links tend to represent simpler issue graphs, often with two components, and Composition links show the highest share of hierarchical tree structures (97.7%). Surprisingly, General Relation links have a significantly higher transitivity score than Duplication and Temporal / Causal links.
Motivated by the differences between the link types and by their popularity, we evaluated the robustness of two state-of-the-art duplicate detection approaches from the literature on the JIRA dataset. We found that current deep-learning approaches confuse Duplication links with other link types in almost all repositories. On average, the classification accuracy dropped by 6% for one approach and 12% for the other. Extending the training sets with other link types seems to partly solve this issue. We discuss our findings and their implications for research and practice.
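The graph measures mentioned above (transitivity and the share of tree-shaped components) can be illustrated with a minimal sketch. It assumes the standard definition of global transitivity (closed connected triples divided by all connected triples) and treats links as undirected. The issue keys, the function names, and the toy link list are hypothetical and not taken from the studied repositories, and the exact computation used in the study may differ.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(links):
    """Build an undirected adjacency map from (issue_a, issue_b) pairs."""
    adj = defaultdict(set)
    for a, b in links:
        if a != b:  # ignore self-links
            adj[a].add(b)
            adj[b].add(a)
    return adj


def transitivity(adj):
    """Closed connected triples / all connected triples (0.0 if none)."""
    closed = 0   # triples whose two end points are also linked
    triples = 0  # all paths of length two, counted by their center node
    for nbrs in adj.values():
        for u, v in combinations(nbrs, 2):
            triples += 1
            if v in adj[u]:
                closed += 1
    return closed / triples if triples else 0.0


def components(adj):
    """Yield each connected component as a set of issue keys."""
    seen = set()
    for start in adj:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for nxt in adj[node]:
                if nxt not in comp:
                    comp.add(nxt)
                    stack.append(nxt)
        seen |= comp
        yield comp


def tree_share(adj):
    """Fraction of components that are trees (n nodes, n - 1 edges)."""
    comps = list(components(adj))
    if not comps:
        return 0.0
    trees = 0
    for comp in comps:
        edges = sum(len(adj[n]) for n in comp) // 2  # each edge seen twice
        if edges == len(comp) - 1:
            trees += 1
    return trees / len(comps)


# Hypothetical links: one triangle (A-*) and one small tree (B-*).
links = [
    ("A-1", "A-2"), ("A-2", "A-3"), ("A-1", "A-3"),
    ("B-1", "B-2"), ("B-1", "B-3"),
]
adj = build_graph(links)
print(transitivity(adj), tree_share(adj))
```

Running the same two functions separately on the links of each category would allow a rough per-category comparison of the kind summarized in the abstract.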