The impact of accounting for special methods in the measurement of object-oriented class cohesion on refactoring and fault prediction activities
Highlights
► The paper empirically addresses whether to include or exclude special methods from cohesion measurements.
► Constructors must be included and access methods must be excluded when using cohesion metrics in Extract Class refactoring activity.
► Including special methods does not significantly affect the abilities of the cohesion metrics to detect faulty classes.
Introduction
Over the past decade, object-oriented programming languages, such as C++ and Java, have been widely used in the software industry. The basic unit of design in object-oriented programs is the class. The members of a class include its attributes and methods. Several factors have to be addressed to evaluate the quality of class design, including cohesion, coupling, and complexity (Fenton and Pfleeger, 1998). Class cohesion refers to the extent to which the methods and attributes in a class are related (Briand et al., 1998).
In object-oriented programming, some methods are typically provided to support encapsulation, a key feature of the object-oriented paradigm. These methods do not provide any functionality that contributes to the problem for which the class was developed. Instead, they are used to initialize attributes or to inquire about their values (Deitel and Deitel, 2005). We refer to these methods as special methods, and we classify them into four types: constructors, destructors, access methods, and delegation methods. A constructor is invoked when an object of the class is created, and it typically initializes most or all of the attributes in the class. In contrast, a destructor is invoked at the end of the object life cycle and typically deinitializes most or all of the attributes. Unlike constructors, destructors are not supported by some object-oriented programming languages, such as Java. Access methods are classified as either setters or getters. A setter method sets a single attribute to the reference/value that is passed through the method's parameter. A getter method returns the reference/value of a single attribute. Finally, a delegation method is used to inquire about the status of a single attribute (e.g., whether the value is positive/negative or whether a stack/queue is full/empty). As a result, special methods represent extreme cases with respect to the percentage of class attributes that they reference. That is, constructors and destructors potentially reference a relatively high percentage of attributes, whereas access and delegation methods potentially reference a relatively low percentage of attributes.
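The contrast between the types of special methods can be illustrated with a minimal Python sketch. It models a class as a mapping from each method to the set of attributes that the method references and reports the fraction of attributes each method touches. The Stack-like class, its attribute names, and the helper functions are hypothetical and are used only for illustration; they are not taken from the systems studied in the paper.

```python
# Hypothetical model of a small Stack class. Every name below is an
# illustrative assumption, not data from the study.
ATTRIBUTES = {"items", "capacity", "label"}

# method name -> (kind of method, attributes the method references)
METHODS = {
    "__init__": ("constructor", {"items", "capacity", "label"}),
    "get_label": ("getter", {"label"}),
    "set_label": ("setter", {"label"}),
    "is_empty": ("delegation", {"items"}),
    "push": ("ordinary", {"items", "capacity"}),
}


def referenced_fraction(referenced, attributes):
    """Return the fraction of the class attributes that a method references."""
    return len(referenced & attributes) / len(attributes)


def fractions_by_method(methods, attributes):
    """Map each method name to (kind, fraction of attributes referenced)."""
    return {
        name: (kind, referenced_fraction(refs, attributes))
        for name, (kind, refs) in methods.items()
    }


if __name__ == "__main__":
    for name, (kind, fraction) in fractions_by_method(METHODS, ATTRIBUTES).items():
        print(f"{name:10s} {kind:12s} {fraction:.2f}")
```

In this toy class the constructor references every attribute, whereas the getter, setter, and delegation method each reference a single attribute, which mirrors the two extremes described above.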
All existing cohesion metrics are directly or indirectly based on observations of the attributes referenced by the methods, although they apply different approaches to measure cohesion. Therefore, in some cases, the inclusion of special methods may influence the cohesion values so strongly that they become clearly incorrect cohesion indicators, in comparison to human intuition. In such cases, the obtained cohesion values could lead to incorrect class quality assessments and refactoring decisions. To solve this problem, special methods must be excluded if the obtained cohesion values indicate class quality incorrectly or lead to incorrect refactoring decisions. Therefore, after a cohesion metric is defined and before the metric is used in practice, the impact of including or excluding special methods on the values obtained using the metric must be theoretically and empirically investigated. Guidelines based on the obtained results must be provided to the practitioners to direct them toward the best scenario in which the metric can be applied.
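As a rough illustration of how special methods can shift a cohesion value, the following Python sketch computes LCOM1, taken here as the number of method pairs that share no attribute, for a hypothetical Account class with and without its special methods. The class, its methods, and the attribute sets are assumptions made for this example only; the metric is applied to a hand-written model rather than to real source code.

```python
from itertools import combinations

# Hypothetical Account class: method name -> attributes it references.
ORDINARY = {
    "deposit": {"balance"},
    "withdraw": {"balance"},
    "add_interest": {"balance", "rate"},
}
ACCESS = {
    "get_owner": {"owner"},
    "set_owner": {"owner"},
    "get_rate": {"rate"},
}
CONSTRUCTOR = {"__init__": {"balance", "owner", "rate"}}


def lcom1(method_refs):
    """Count the method pairs that have no referenced attribute in common."""
    return sum(
        1
        for refs_a, refs_b in combinations(method_refs.values(), 2)
        if not (refs_a & refs_b)
    )


if __name__ == "__main__":
    print("ordinary methods only:", lcom1(ORDINARY))
    print("plus constructor:     ", lcom1({**ORDINARY, **CONSTRUCTOR}))
    print("plus access methods:  ", lcom1({**ORDINARY, **ACCESS}))
    print("all methods:          ", lcom1({**ORDINARY, **ACCESS, **CONSTRUCTOR}))
```

For this toy class, adding the constructor leaves the count unchanged because the constructor shares an attribute with every other method, whereas adding the access methods raises the count from 0 to 10 even though the functional part of the class has not changed.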
Several class cohesion metrics that apply different approaches, use different pieces of information, and measure different cohesion aspects have been proposed in the literature. The problem of including/excluding special methods in cohesion measurement has been qualitatively addressed for some of the existing metrics, albeit in brief (e.g., Bieman and Kang, 1995, Briand et al., 1998, Etzkorn et al., 1998, Chae et al., 2000, Zhou et al., 2004). As far as we know, the impact of including/excluding special methods has not been empirically studied for any metric, although this impact can affect the use of the cohesion metrics in the applications of interest for software practitioners. In practice, cohesion is among the quality attributes that are useful only when they are shown to be related to other quality attributes of interest for software users and practitioners, such as fault proneness, reusability, and maintainability (Morasca, 2009). In addition, measuring cohesion is useful when it is demonstrated to be related to design enhancement tasks such as refactoring (Czibula and Serban, 2006, De Lucia et al., 2008, Al Dallal and Briand, 2012). As a result, addressing the problem of including/excluding special methods is not of interest for software practitioners unless the impact of including/excluding special methods in cohesion measurement on other quality attributes and design improvement tasks is investigated, which is the main goal of this paper.
Refactoring is the process of changing existing object-oriented software code to enhance its internal structure while preserving its external behavior (Fowler, 1999). A class is predicted to require refactoring when a value based on design metrics that measure quality attributes, such as cohesion, is found to be less than a certain threshold (e.g., De Lucia et al., 2008, Czibula and Serban, 2006, Al Dallal and Briand, 2012). Although including/excluding special methods in cohesion measurement can have a crucial impact on the use of cohesion metrics in refactoring activities, as far as we know, this impact has not been empirically studied.
In this paper, we qualitatively discuss the expected impact of including/excluding different types of special methods on cohesion measurement. The goal of this discussion is to demonstrate, with the aid of examples, that the impact of including special methods on class cohesion values greatly depends on the type of the special method considered. In addition, we empirically investigated how including/excluding the special methods of 2245 classes from five open-source Java systems affects the cohesion values obtained with several cohesion metrics. In this empirical study, 20 cohesion metrics that apply different cohesion measurement approaches are considered. In addition, we address the four possible scenarios of (1) excluding all special methods, (2) including constructors and excluding access methods, (3) excluding constructors and including access methods, and (4) including all special methods. Destructors are not considered because they are not supported by Java, and delegation methods are not accounted for because they are difficult to detect automatically. To empirically investigate whether the changes in the cohesion values affect refactoring decisions, several thresholds are selected to examine the percentage of classes for which the refactoring decisions change when the special methods are included in the cohesion measurement. A statistical technique was applied to explore whether the changes in the refactoring decisions were significant. The results indicate that a high percentage of the methods in a class are special, and it is therefore important to address the problem of including/excluding special methods. The results also demonstrate that, for most of the cohesion metrics considered, the cohesion values and the corresponding refactoring decisions change significantly when the special methods are included.
According to the above discussion, these results suggest that the access methods must be excluded and the constructor methods must be included when using cohesion metrics to predict the classes that require refactoring.
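The four scenarios and the threshold-based refactoring decision can be sketched in Python as follows. The cohesion measure used here, the fraction of method pairs that share at least one attribute, is a deliberately simplified stand-in and is not claimed to be one of the 20 metrics studied; the example class and the thresholds 0.5 and 0.6 are likewise hypothetical.

```python
from itertools import combinations

# Hypothetical class model: (kind of method, attributes it references).
METHODS = [
    ("constructor", {"balance", "owner", "rate"}),
    ("access", {"owner"}),
    ("access", {"owner"}),
    ("access", {"rate"}),
    ("ordinary", {"balance"}),
    ("ordinary", {"balance"}),
    ("ordinary", {"balance", "rate"}),
]

# Scenario label -> kinds of special methods included in the measurement.
SCENARIOS = {
    "exclude all special methods": set(),
    "include constructors only": {"constructor"},
    "include access methods only": {"access"},
    "include all special methods": {"constructor", "access"},
}


def sharing_ratio(refs_list):
    """Fraction of method pairs sharing at least one attribute (toy measure)."""
    pairs = list(combinations(refs_list, 2))
    if not pairs:
        return 1.0  # assumption: fewer than two methods counts as cohesive
    return sum(1 for a, b in pairs if a & b) / len(pairs)


def scenario_values(methods):
    """Compute the toy cohesion value of one class under each scenario."""
    values = {}
    for label, included in SCENARIOS.items():
        kept = [refs for kind, refs in methods
                if kind == "ordinary" or kind in included]
        values[label] = sharing_ratio(kept)
    return values


def needs_refactoring(value, threshold):
    """Flag the class when its cohesion value falls below the threshold."""
    return value < threshold


if __name__ == "__main__":
    for label, value in scenario_values(METHODS).items():
        flags = {t: needs_refactoring(value, t) for t in (0.5, 0.6)}
        print(f"{label:30s} value={value:.3f} flagged={flags}")
```

In this toy case the class is never flagged when access methods are excluded, is flagged at both thresholds when only access methods are included, and is flagged only at the higher threshold when all special methods are included, which shows how the decision can depend on both the scenario and the chosen threshold.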
Highly cohesive classes are expected to be less prone to faults. This expectation has been confirmed by several empirical studies (e.g., Briand et al., 1998, Briand et al., 2001, Gyimothy et al., 2005, Aggarwal et al., 2007, Marcus et al., 2008) in which cohesion metrics were involved in models used to predict faulty classes. The impact of including/excluding special methods on the abilities of the cohesion metrics to predict faulty classes has not previously been thoroughly studied or addressed. In this paper, we empirically investigated how including/excluding special methods from classes of the same five open-source Java systems affects the abilities of the same 20 cohesion metrics to predict faulty classes. To perform this study, we collected fault data for the classes in the considered software systems from publicly available fault repositories, obtained the cohesion values for the considered classes using the 20 considered cohesion metrics and the four considered scenarios, statistically analyzed the relationship between the cohesion values and the presence of faults in the classes, and statistically compared the fault prediction results across the four scenarios. The results show that including/excluding special methods in cohesion measurement does not significantly change the abilities of the cohesion metrics considered to predict faulty classes.
In summary, the major contributions of this paper are as follows:
1. It empirically explores the importance of addressing the problem of including/excluding special methods and the impact of including/excluding special methods on the cohesion values that are obtained from using 20 different metrics on five open-source systems.
2. It empirically investigates the effects of including/excluding special methods from classes of five open-source systems on refactoring decisions based on applying 20 different cohesion metrics.
3. It empirically explores the impact of including/excluding special methods from classes of five open-source systems on the ability of 20 different cohesion metrics to detect faulty classes.
This paper is organized as follows. Section 2 reviews related work. Section 3 qualitatively discusses how including/excluding special methods in cohesion measurement changes the cohesion values. Section 4 demonstrates the empirical effects of including/excluding special methods on the cohesion values obtained. Sections 5 and 6 report and discuss the results of the empirical studies that investigate the impact of including/excluding special methods on the refactoring decisions and on the abilities of the metrics to predict faulty classes, respectively. Section 7 lists validity threats to the empirical studies. Finally, Section 8 concludes the paper and discusses future work.
Related work
Several metrics have been proposed in the literature to measure cohesion in object-oriented systems at different abstraction levels, including method metrics (e.g., Al Dallal, 2009) and class metrics (Briand et al., 1998). The class cohesion metrics can be classified according to different perspectives, such as the types of interactions considered, the development phase during which they are applicable, and the types of methods considered. In this paper, we consider 20 metrics, including LCOM1,
Qualitative analysis
The cohesion of a class is determined by the extent to which the attributes and methods of the class are directly or indirectly related. Typically, the existence of constructors and destructors in a class increases the number of direct and indirect relations in a class, for two reasons. First, the constructors and destructors reference most, if not all, of the class attributes, which introduces additional direct relations to the class between each of the constructors and destructors and the
Empirical analysis for the cohesion values
We empirically studied the impact of including special methods on the values that were obtained using the 20 cohesion metrics summarized in Table 1. The goal of this study was to empirically determine whether the changes in the cohesion values are significant, and consequently, to empirically determine whether it is important for software practitioners to pay attention to the problem of including/excluding special methods when using cohesion metrics in supporting software quality decisions.
Impact of including special methods on refactoring decisions
Refactoring aims to improve code maintainability and understandability, and it refers to the process of changing an existing object-oriented software code to enhance its internal structure while preserving its external behavior (Fowler, 1999). Automating design changes, reducing testing efforts, simplifying designs, assisting validation, and experimenting with new designs are among the refactoring benefits identified by Tokuda and Batory (2001). Researchers use metrics that measure quality
Impact of including special methods on predicting faulty classes
Researchers have provided empirical evidence showing that cohesion metrics strongly contribute to models used in predicting the presence of faults in classes (e.g., Briand et al., 1998, Briand et al., 2001, Gyimothy et al., 2005, Aggarwal et al., 2007, Marcus et al., 2008).
Typically, some special methods, such as access methods, are less complex and smaller in size than other (nonspecial) methods. In addition, these special methods can be fully or semi-automated by many existing programming
Threats to validity
The reported empirical studies have several internal and external threats that may restrict the generality and limit the interpretation of the results. These threats are detailed as follows.
Conclusions and future work
This paper empirically addressed whether to include or exclude special methods from cohesion measurements. Two types of special methods were considered, constructors and access methods. The impact of including/excluding each of these special methods on the cohesion values that were obtained using 20 metrics was empirically studied. In the empirical analyses, four scenarios were considered when applying each metric. The empirical study demonstrated the importance of addressing how to deal with
Acknowledgments
The author would like to acknowledge the support of this work by Kuwait University Research Grant WI06/09. In addition, the author would like to thank Anas Abdin and Saqiba Sulman for assisting in collecting the cohesion results.
References (56)
Improving the applicability of object-oriented class cohesion metrics. Information and Software Technology (2011).
An object-oriented high-level design-based class cohesion metric. Information and Software Technology (2010).
A systematic and comprehensive investigation of methods to build and evaluate fault prediction models. Journal of Systems and Software (2010).
Exploring the relationship between design measures and software quality in object-oriented systems. Journal of Systems and Software (2000).
Evaluating the quality of open source software. Electronic Notes in Theoretical Computer Science (2009).
Bonferroni and Sidak corrections for multiple comparisons.
Investigating effect of design metrics on fault proneness in object-oriented systems. Journal of Object Technology (2007).
Software similarity-based functional cohesion metric. IET Software (2009).
Mathematical validation of object-oriented class cohesion metrics. International Journal of Computers (2010).
Measuring the discriminative power of object-oriented class cohesion metrics. IEEE Transactions on Software Engineering (2011).
Improving object-oriented lack-of-cohesion metric by excluding special methods.
A precise method–method interaction-based cohesion metric for object-oriented classes. ACM Transactions on Software Engineering and Methodology (TOSEM).
A proposal of a new class cohesion criterion: an empirical study. Journal of Object Technology.
A class cohesion metric for object-oriented designs. Journal of Object-Oriented Program.
Cohesion and reuse in an object-oriented system.
Metrics for class cohesion and similarity between methods.
A unified framework for cohesion measurement in object-oriented systems. Empirical Software Engineering: An International Journal.
Empirical studies of quality models in object-oriented systems. Advances in Computers.
Replicated case studies for investigating quality factors in object-oriented designs. Empirical Software Engineering.
A cohesion measure for object-oriented classes. Software: Practice & Experience.
Towards a metrics suite for object-oriented design. Object-Oriented Programming Systems, Languages and Applications (OOPSLA).
A metrics suite for object oriented design. IEEE Transactions on Software Engineering.
The interpretation and utility of three cohesion metrics for object-oriented design. ACM Transactions on Software Engineering and Methodology (TOSEM).
Improving systems design using a clustering approach. IJCSNS International Journal of Computer Science and Network Security.
Java How to Program.
Using structural and semantic metrics to improve class cohesion.
A practical look at the lack of cohesion in methods metric. Journal of Object-Oriented Programming.
Jehad Al Dallal received his PhD in Computer Science from the University of Alberta in Canada and was granted the award for best PhD researcher. He is currently working at Kuwait University in the Department of Information Science as an Associate Professor. Dr. Al Dallal has completed several research projects in the areas of software testing, software metrics, and communication protocols. In addition, he has published more than 60 papers in conference proceedings and ACM, IEEE, IET, Elsevier, Wiley, and other journals. Dr. Al Dallal was involved in developing more than 20 software systems. He also served as a technical committee member of several international conferences and as an associate editor for several refereed journals.