The impact of accounting for special methods in the measurement of object-oriented class cohesion on refactoring and fault prediction activities

https://doi.org/10.1016/j.jss.2011.12.006

Abstract

Class cohesion is a key attribute that is used to assess the design quality of a class, and it refers to the extent to which the attributes and methods of the class are related. Typically, classes contain special types of methods, such as constructors, destructors, and access methods. Each of these special methods has its own characteristics, which can artificially affect the class cohesion measurement. Several metrics have been proposed in the literature to indicate class cohesion during high- or low-level design phases. The impact of accounting for special methods in cohesion measurement has not been addressed for most of these metrics. This paper empirically explores the impact of including or excluding special methods on cohesion measurements that were performed using 20 existing class cohesion metrics. The empirical study applies the metrics that were considered to five open-source systems under four different scenarios, including (1) considering all special methods, (2) ignoring only constructors, (3) ignoring only access methods, and (4) ignoring all special methods. This study empirically explores the impact of including special methods in cohesion measurement for two applications of interest to software practitioners, including refactoring and predicting faulty classes. The results of the empirical studies show that the cohesion values for most of the metrics considered differ significantly across the four scenarios and that this difference significantly affects the refactoring decisions, but does not significantly affect the abilities of the metrics to predict faulty classes.

Highlights

  • The paper empirically addresses whether to include or exclude special methods from cohesion measurements.
  • Constructors must be included and access methods must be excluded when using cohesion metrics in Extract Class refactoring activity.
  • Including special methods does not significantly affect the abilities of the cohesion metrics to detect faulty classes.

Introduction

Over the past decade, object-oriented programming languages, such as C++ and Java, have been widely used in the software industry. The basic unit of design in object-oriented programs is the class. The members of a class include its attributes and methods. Several factors have to be addressed to evaluate the quality of class design, including cohesion, coupling, and complexity (Fenton and Pfleeger, 1998). Class cohesion refers to the extent to which the methods and attributes in a class are related (Briand et al., 1998).

In object-oriented programming, some methods are typically provided to support encapsulation, a key feature of the object-oriented paradigm. These methods do not provide any functionality that contributes to the problem for which the class was developed. Instead, they are used to initialize attributes or to inquire about their values (Deitel and Deitel, 2005). We refer to these methods as special methods, and we classify them into four types: constructors, destructors, access methods, and delegation methods. A constructor is invoked when creating an object of the class, and it typically initializes most or all of the attributes in the class. In contrast, a destructor is invoked at the end of the object life cycle and typically deinitializes most or all of the attributes. Unlike constructors, destructors are not supported by some object-oriented programming languages, such as Java. Access methods are classified as either setters or getters. A setter method sets a single attribute to the reference/value that is passed through the method's parameter. A getter method returns the reference/value of a single attribute. Finally, a delegation method is used to inquire about the status of a single attribute (e.g., whether the value is positive/negative or whether a stack/queue is full/empty). As a result, special methods represent extreme cases with regard to the percentage of class attributes that they reference. That is, constructors and destructors potentially reference a relatively high percentage of attributes, whereas access and delegation methods potentially reference a relatively low percentage of attributes.
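The kinds of special methods described above can be illustrated with a small Java class. This is only a sketch: the class `BoundedStack` and all of its members are hypothetical and are not taken from the studied systems. Java has no destructors, so none is shown.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical example class; not taken from the systems studied in the paper. */
final class BoundedStack {
    private final List<Integer> items; // attribute 1
    private final int capacity;        // attribute 2
    private String label;              // attribute 3

    // Constructor: references (initializes) every attribute of the class.
    BoundedStack(int capacity, String label) {
        this.items = new ArrayList<>();
        this.capacity = capacity;
        this.label = label;
    }

    // Getter: returns the value of a single attribute.
    String getLabel() {
        return label;
    }

    // Setter: assigns a single attribute from the method's parameter.
    void setLabel(String label) {
        this.label = label;
    }

    // Delegation method: inquires about the status of a single attribute.
    boolean isEmpty() {
        return items.isEmpty();
    }

    // Non-special method: references two attributes (items and capacity).
    boolean push(int value) {
        if (items.size() >= capacity) {
            return false; // stack is full, value is rejected
        }
        items.add(value);
        return true;
    }

    // Non-special method: references one attribute (items).
    int pop() {
        if (items.isEmpty()) {
            throw new IllegalStateException("stack is empty");
        }
        return items.remove(items.size() - 1);
    }
}
```

In this sketch the constructor references all three attributes, whereas the getter, the setter, and the delegation method each reference exactly one, which matches the two extremes noted in the text.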

All existing cohesion metrics are directly or indirectly based on observations of the attributes referenced by the methods, although they apply different approaches to measure cohesion. Therefore, in some cases, the inclusion of special methods may influence the cohesion values so strongly that they become clearly incorrect cohesion indicators, in comparison to human intuition. In such cases, the obtained cohesion values could lead to incorrect class quality assessments and refactoring decisions. To solve this problem, special methods must be excluded if the obtained cohesion values indicate class quality incorrectly or lead to incorrect refactoring decisions. Therefore, after a cohesion metric is defined and before the metric is used in practice, the impact of including or excluding special methods on the values obtained using the metric must be theoretically and empirically investigated. Guidelines based on the obtained results must be provided to the practitioners to direct them toward the best scenario in which the metric can be applied.
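To make the influence of special methods concrete, the following sketch computes LCOM1, one of the metrics considered in this paper, in its usual formulation as the number of method pairs that share no attribute. The class model (`exampleClass`), its method names, and the attribute sets are hypothetical and chosen only for illustration; the code assumes Java 9 or later for `Set.of`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class Lcom1Sketch {
    /** LCOM1: number of method pairs that reference no common attribute. */
    static int lcom1(Map<String, Set<String>> attrsByMethod) {
        List<Set<String>> sets = new ArrayList<>(attrsByMethod.values());
        int disjointPairs = 0;
        for (int i = 0; i < sets.size(); i++) {
            for (int j = i + 1; j < sets.size(); j++) {
                if (Collections.disjoint(sets.get(i), sets.get(j))) {
                    disjointPairs++;
                }
            }
        }
        return disjointPairs;
    }

    /** Hypothetical class model: the attributes referenced by each method. */
    static Map<String, Set<String>> exampleClass(boolean withSpecialMethods) {
        Map<String, Set<String>> m = new LinkedHashMap<>();
        m.put("push", Set.of("items", "capacity"));
        m.put("pop", Set.of("items"));
        m.put("describe", Set.of("label"));
        if (withSpecialMethods) {
            m.put("<init>", Set.of("items", "capacity", "label")); // constructor
            m.put("getLabel", Set.of("label"));                    // getter
            m.put("setLabel", Set.of("label"));                    // setter
        }
        return m;
    }

    public static void main(String[] args) {
        System.out.println(lcom1(exampleClass(false))); // prints 2
        System.out.println(lcom1(exampleClass(true)));  // prints 6
    }
}
```

In this toy model the constructor adds no disjoint pairs because it shares an attribute with every other method, while the two access methods add four disjoint pairs. The lack-of-cohesion value therefore triples without any change to the class's non-special methods.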

Several class cohesion metrics that apply different approaches, use different pieces of information, and measure different cohesion aspects have been proposed in the literature. The problem of including/excluding special methods in cohesion measurement has been qualitatively addressed for some of the existing metrics, albeit in brief (e.g., Bieman and Kang, 1995, Briand et al., 1998, Etzkorn et al., 1998, Chae et al., 2000, Zhou et al., 2004). As far as we know, the impact of including/excluding special methods has not been empirically studied for any metric, although this impact can affect the use of the cohesion metrics in the applications of interest for software practitioners. In practice, cohesion is among the quality attributes that are useful only when they are shown to be related to other quality attributes of interest for software users and practitioners, such as fault proneness, reusability, and maintainability (Morasca, 2009). In addition, measuring cohesion is useful when it is demonstrated to be related to design enhancement tasks such as refactoring (Czibula and Serban, 2006, De Lucia et al., 2008, Al Dallal and Briand, 2012). As a result, addressing the problem of including/excluding special methods is not of interest for software practitioners unless the impact of including/excluding special methods in cohesion measurement on other quality attributes and design improvement tasks is investigated, which is the main goal of this paper.

Refactoring is the process of changing an existing object-oriented software code to enhance its internal structure while preserving its external behavior (Fowler, 1999). A class is predicted to require refactoring when a value that is based on design metrics which measure quality attributes, such as cohesion, is found to be less than a certain threshold (e.g., De Lucia et al., 2008, Czibula and Serban, 2006, Al Dallal and Briand, 2012). Although the impact of including/excluding special methods in cohesion measurement can have a crucial impact on the use of cohesion metrics in refactoring activities, as far as we know, this impact has not been empirically studied.

In this paper, we qualitatively discuss the expected impact of including/excluding different types of special methods on cohesion measurement. The goal of this discussion is to demonstrate, with the aid of examples, that the impact of including special methods on class cohesion values greatly depends on the type of the special method considered. In addition, we empirically investigated the impact of including/excluding special methods on the cohesion values obtained by applying several cohesion metrics to 2245 classes of five open-source Java systems. In this empirical study, 20 cohesion metrics that apply different cohesion measurement approaches are considered. In addition, we address the four possible scenarios of (1) excluding all special methods, (2) including constructors and excluding access methods, (3) excluding constructors and including access methods, and (4) including all special methods. Destructors are not considered because they are not supported by Java, and delegation methods are not accounted for because they are difficult to detect automatically. To empirically investigate whether the changes in the cohesion values affect refactoring decisions, several thresholds are selected to examine the percentage of classes for which the refactoring decisions are changed when the special methods are included in the cohesion measurement. A statistical technique was applied to explore whether the changes in the refactoring decisions were significant. The results indicate that a high percentage of the methods in a class are special, and it is therefore important to address the problem of including/excluding special methods. The results also demonstrate that the cohesion values and the corresponding refactoring decisions using most of the cohesion metrics considered are significantly changed when the special methods are included. Together with the qualitative discussion, these results suggest that the access methods must be excluded and the constructor methods must be included when using cohesion metrics to predict the classes that require refactoring.
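The four scenarios and the threshold-based refactoring decision can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the cohesion ratio below (referenced method-attribute pairs divided by all possible pairs) is a stand-in chosen for simplicity, and the class model, the names `ScenarioSketch`, `ClassModel`, and `sample`, and the threshold 0.45 are all hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ScenarioSketch {
    enum Kind { CONSTRUCTOR, ACCESS, OTHER }

    /** A method, reduced to its kind and the attributes it references. */
    static final class Method {
        final Kind kind;
        final Set<String> attrs;
        Method(Kind kind, String... attrs) {
            this.kind = kind;
            this.attrs = new HashSet<>(Arrays.asList(attrs));
        }
    }

    /** A class, reduced to its attribute count and its methods. */
    static final class ClassModel {
        final int attributeCount;
        final List<Method> methods;
        ClassModel(int attributeCount, Method... methods) {
            this.attributeCount = attributeCount;
            this.methods = Arrays.asList(methods);
        }
    }

    /** Illustrative ratio: referenced (method, attribute) pairs over all possible pairs. */
    static double cohesion(ClassModel c, boolean withConstructors, boolean withAccess) {
        int methods = 0;
        int references = 0;
        for (Method m : c.methods) {
            if (m.kind == Kind.CONSTRUCTOR && !withConstructors) continue;
            if (m.kind == Kind.ACCESS && !withAccess) continue;
            methods++;
            references += m.attrs.size();
        }
        if (methods == 0 || c.attributeCount == 0) return 0.0;
        return (double) references / (methods * c.attributeCount);
    }

    /** A class is flagged for refactoring when its cohesion is below the threshold. */
    static boolean needsRefactoring(double cohesion, double threshold) {
        return cohesion < threshold;
    }

    /** Percentage of classes whose decision differs from the baseline (no special methods). */
    static double changedPercent(List<ClassModel> classes, double threshold,
                                 boolean withConstructors, boolean withAccess) {
        if (classes.isEmpty()) return 0.0;
        int changed = 0;
        for (ClassModel c : classes) {
            boolean base = needsRefactoring(cohesion(c, false, false), threshold);
            boolean scenario = needsRefactoring(cohesion(c, withConstructors, withAccess), threshold);
            if (base != scenario) changed++;
        }
        return 100.0 * changed / classes.size();
    }

    /** Hypothetical class with three attributes, one constructor, and two access methods. */
    static ClassModel sample() {
        return new ClassModel(3,
            new Method(Kind.CONSTRUCTOR, "items", "capacity", "label"),
            new Method(Kind.ACCESS, "label"),  // getter
            new Method(Kind.ACCESS, "label"),  // setter
            new Method(Kind.OTHER, "items", "capacity"),
            new Method(Kind.OTHER, "items"));
    }

    public static void main(String[] args) {
        List<ClassModel> classes = Collections.singletonList(sample());
        boolean[][] scenarios = { {false, false}, {true, false}, {false, true}, {true, true} };
        for (boolean[] s : scenarios) {
            System.out.printf("constructors=%b access=%b cohesion=%.3f changed=%.0f%%%n",
                s[0], s[1], cohesion(sample(), s[0], s[1]),
                changedPercent(classes, 0.45, s[0], s[1]));
        }
    }
}
```

For the sample class, the ratio is 0.5 with no special methods, rises when only the constructor is included, and drops below the 0.45 threshold when only the access methods are included, so only that scenario flips the decision. With real data, `changedPercent` would be evaluated over all classes and several thresholds.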

Highly cohesive classes are expected to be less prone to faults. This expectation has been confirmed by several empirical studies (e.g., Briand et al., 1998, Briand et al., 2001, Gyimothy et al., 2005, Aggarwal et al., 2007, Marcus et al., 2008) in which cohesion metrics were involved in models used to predict faulty classes. The impact of including/excluding special methods on the abilities of the cohesion metrics to predict faulty classes has not previously been thoroughly studied or addressed. In this paper, we empirically investigated this impact by applying the same 20 cohesion metrics to classes of the same five open-source Java systems. To perform this study, we (1) collected fault data for the classes in the considered software systems from publicly available fault repositories, (2) obtained the cohesion values for the considered classes using the 20 considered cohesion metrics and the four considered scenarios, (3) statistically analyzed the relationship between the cohesion values and the presence of faults in the classes, and (4) statistically compared the fault prediction results across the four scenarios. The results show that including/excluding special methods in cohesion measurement does not significantly change the abilities of the cohesion metrics considered to predict faulty classes.

In summary, the major contributions of this paper are as follows:

  1. It empirically explores the importance of addressing the problem of including/excluding special methods and the impact of including/excluding special methods on the cohesion values that are obtained from using 20 different metrics on five open-source systems.

  2. It empirically investigates the effects of including/excluding special methods from classes of five open-source systems on refactoring decisions based on applying 20 different cohesion metrics.

  3. It empirically explores the impact of including/excluding special methods from classes of five open-source systems on the ability of 20 different cohesion metrics to detect faulty classes.

This paper is organized as follows. Section 2 reviews related work. Section 3 qualitatively discusses the effects of including/excluding special methods from cohesion measurements on changing the cohesion values. Section 4 demonstrates the empirical effects of including/excluding special methods from cohesion measurements on the cohesion values obtained. Sections 5 and 6 report and discuss the results of the empirical studies that investigate the impact of including/excluding special methods on the refactoring decisions and on the abilities of the metrics to predict faulty classes, respectively. Section 7 lists validity threats to the empirical studies. Finally, Section 8 concludes the paper and discusses future work.

Related work

Several metrics have been proposed in the literature to measure cohesion in object-oriented systems at different abstraction levels, including method metrics (e.g., Al Dallal, 2009) and class metrics (Briand et al., 1998). The class cohesion metrics can be classified according to different perspectives, such as the types of interactions considered, the development phase during which they are applicable, and the types of methods considered. In this paper, we consider 20 metrics, including LCOM1,

Qualitative analysis

The cohesion of a class is determined by the extent to which the attributes and methods of the class are directly or indirectly related. Typically, the existence of constructors and destructors in a class increases the number of direct and indirect relations in a class, for two reasons. First, the constructors and destructors reference most, if not all, of the class attributes, which introduces additional direct relations to the class between each of the constructors and destructors and the

Empirical analysis for the cohesion values

We empirically studied the impact of including special methods on the values that were obtained using the 20 cohesion metrics summarized in Table 1. The goal of this study was to empirically determine whether the changes in the cohesion values are significant, and consequently, to empirically determine whether it is important for software practitioners to pay attention to the problem of including/excluding special methods when using cohesion metrics in supporting software quality decisions.

Impact of including special methods on refactoring decisions

Refactoring aims to improve code maintainability and understandability, and it refers to the process of changing an existing object-oriented software code to enhance its internal structure while preserving its external behavior (Fowler, 1999). Automating design changes, reducing testing efforts, simplifying designs, assisting validation, and experimenting with new designs are among the refactoring benefits identified by Tokuda and Batory (2001). Researchers use metrics that measure quality

Impact of including special methods on predicting faulty classes

Researchers have provided empirical evidence showing that cohesion metrics strongly contribute to models used in predicting the presence of faults in classes (e.g., Briand et al., 1998, Briand et al., 2001, Gyimothy et al., 2005, Aggarwal et al., 2007, Marcus et al., 2008).

Typically, some special methods, such as access methods, are less complex and smaller in size than other (nonspecial) methods. In addition, these special methods can be fully or semi-automated by many existing programming

Threats to validity

The reported empirical studies have several internal and external threats that may restrict the generality and limit the interpretation of the results. These threats are detailed as follows.

Conclusions and future work

This paper empirically addressed whether to include or exclude special methods from cohesion measurements. Two types of special methods were considered, constructors and access methods. The impact of including/excluding each of these special methods on the cohesion values that were obtained using 20 metrics was empirically studied. In the empirical analyses, four scenarios were considered when applying each metric. The empirical study demonstrated the importance of addressing how to deal with

Acknowledgments

The author would like to acknowledge the support of this work by Kuwait University Research Grant WI06/09. In addition, the author would like to thank Anas Abdin and Saqiba Sulman for assisting in collecting the cohesion results.

References (56)

  • J. Al Dallal. Improving object-oriented lack-of-cohesion metric by excluding special methods.
  • Al Dallal, J., 2012. Fault prediction and the discriminative powers of connectivity-based object-oriented class...
  • J. Al Dallal et al. A precise method–method interaction-based cohesion metric for object-oriented classes. ACM Transactions on Software Engineering and Methodology (TOSEM) (2012).
  • L. Badri et al. A proposal of a new class cohesion criterion: an empirical study. Journal of Object Technology (2004).
  • J. Bansiya et al. A class cohesion metric for object-oriented designs. Journal of Object-Oriented Programming (1999).
  • J. Bieman et al. Cohesion and reuse in an object-oriented system.
  • C. Bonja et al. Metrics for class cohesion and similarity between methods.
  • L.C. Briand et al. A unified framework for cohesion measurement in object-oriented systems. Empirical Software Engineering: An International Journal (1998).
  • L.C. Briand et al. Empirical studies of quality models in object-oriented systems. Advances in Computers (2002).
  • L.C. Briand et al. Replicated case studies for investigating quality factors in object-oriented designs. Empirical Software Engineering (2001).
  • H.S. Chae et al. A cohesion measure for object-oriented classes. Software: Practice & Experience (2000).
  • S.R. Chidamber et al. Towards a metrics suite for object-oriented design. Object-Oriented Programming Systems, Languages and Applications (OOPSLA) (1991).
  • S.R. Chidamber et al. A metrics suite for object oriented design. IEEE Transactions on Software Engineering (1994).
  • S. Counsell et al. The interpretation and utility of three cohesion metrics for object-oriented design. ACM Transactions on Software Engineering and Methodology (TOSEM) (2006).
  • I. Czibula et al. Improving systems design using a clustering approach. IJCSNS International Journal of Computer Science and Network Security (2006).
  • P. Deitel et al. Java How to Program (2005).
  • A. De Lucia et al. Using structural and semantic metrics to improve class cohesion.
  • L. Etzkorn et al. A practical look at the lack of cohesion in methods metric. Journal of Object-Oriented Programming (1998).
Jehad Al Dallal received his PhD in Computer Science from the University of Alberta in Canada and was granted the award for best PhD researcher. He is currently working at Kuwait University in the Department of Information Science as an Associate Professor. Dr. Al Dallal has completed several research projects in the areas of software testing, software metrics, and communication protocols. In addition, he has published more than 60 papers in conference proceedings and ACM, IEEE, IET, Elsevier, Wiley, and other journals. Dr. Al Dallal was involved in developing more than 20 software systems. He also served as a technical committee member of several international conferences and as an associate editor for several refereed journals.
