ABSTRACT
Planning and allocating resources for testing is difficult, and it is usually done on an empirical basis, often with unsatisfactory results. The ability to estimate the potential faultiness of software early could be of great help in planning and executing testing activities. Most research concentrates on techniques for computing multivariate models and evaluating their statistical validity, but we still lack experimental data about the validity of such models across different software applications. This paper reports an empirical study of the validity of multivariate models for predicting software fault-proneness across different applications. It shows that suitably selected multivariate models can predict the fault-proneness of modules of different software packages.
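To make the idea of a multivariate fault-proneness model concrete, the sketch below fits a logistic regression over two per-module metrics and returns an estimated probability that a module is faulty. This is a minimal illustration only, not the paper's actual models: the choice of metrics (lines of code and cyclomatic complexity), the synthetic module data, the feature rescaling, and the gradient-descent fit are all assumptions made for the example.

```python
import math

# Hypothetical per-module data: (lines of code, cyclomatic complexity, faulty?)
# 1 = at least one fault reported against the module, 0 = none. Synthetic values
# invented for illustration; not data from the study.
MODULES = [
    (120, 4, 0), (450, 18, 1), (80, 2, 0), (600, 25, 1),
    (200, 7, 0), (520, 20, 1), (150, 5, 0), (700, 30, 1),
]

def _features(loc, cc):
    # Crude rescaling so both metrics have comparable magnitudes.
    return (loc / 100.0, cc / 10.0)

def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Fit p(faulty) = sigmoid(w0 + w1*loc' + w2*cc') by batch gradient
# descent on the log-loss.
w = [0.0, 0.0, 0.0]
LR = 0.01
for _ in range(5000):
    grad = [0.0, 0.0, 0.0]
    for loc, cc, y in MODULES:
        x1, x2 = _features(loc, cc)
        err = _sigmoid(w[0] + w[1] * x1 + w[2] * x2) - y
        grad[0] += err
        grad[1] += err * x1
        grad[2] += err * x2
    for i in range(3):
        w[i] -= LR * grad[i]

def fault_proneness(loc, cc):
    """Estimated probability that a module with these metrics contains a fault."""
    x1, x2 = _features(loc, cc)
    return _sigmoid(w[0] + w[1] * x1 + w[2] * x2)
```

The cross-application question the paper studies would correspond, in this sketch, to fitting the weights on modules from one software package and then evaluating `fault_proneness` on modules from a different one.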