ABSTRACT
Although very important in software engineering, establishing traceability links between software artifacts is extremely tedious and error-prone, and it requires significant effort. Even when approaches for automated traceability recovery exist, they provide the requirements analyst with a ranked list of candidate links, usually very long, that must be manually inspected. In this paper we introduce an approach called Estimation of the Number of Remaining Links (ENRL), which aims to estimate, via Machine Learning (ML) classifiers, the number of remaining positive links in a ranked list of candidate traceability links produced by a recovery approach based on Natural Language Processing (NLP) techniques. We evaluated the accuracy of the ENRL approach by considering several ML classifiers and NLP techniques on three datasets from industry and academia, covering traceability links among different kinds of software artifacts, including requirements, use cases, design documents, source code, and test cases. Results from our study indicate that: (i) specific estimation models are able to provide accurate estimates of the number of remaining positive links; (ii) the estimation accuracy depends on the choice of the NLP technique; and (iii) univariate estimation models outperform multivariate ones.
- Davide Falessi, Massimiliano Di Penta, Gerardo Canfora, and Giovanni Cantone. 2017. Estimating the number of remaining links in traceability recovery. Empirical Software Engineering 22, 3 (2017), 996–1027.