A Bridge Role Metric Model for Nodes in Software Networks

  • Bo Li ,

    89646597@qq.com

    Affiliations Key Laboratory of Intelligent Information Processing, Shan Dong Institute of Business and Technology, YanTai, Shandong, China, Department of Computer Foundation Studies, Shan Dong Institute of Business and Technology, YanTai, Shandong, China

  • Yanli Feng,

    Affiliations Key Laboratory of Intelligent Information Processing, Shan Dong Institute of Business and Technology, YanTai, Shandong, China, Department of Computer Foundation Studies, Shan Dong Institute of Business and Technology, YanTai, Shandong, China

  • Shiyu Ge,

    Affiliation Department of Computer Foundation Studies, Shan Dong Institute of Business and Technology, YanTai, Shandong, China

  • Dashe Li

    Affiliation Key Laboratory of Intelligent Information Processing, Shan Dong Institute of Business and Technology, YanTai, Shandong, China

Abstract

A bridge role metric model is put forward in this paper. Compared with previous metric models, our treatment of a large-scale object-oriented software system as a complex network is inherently more realistic. With modules as nodes and their relations as links in an undirected network, the new model captures the crucial connectivity of a module or hub rather than only its centrality, as in previous metric models. Two previous metric models are described for comparison. In addition, the relationship between the results and the degrees can be fitted well by a power law. The model represents many realistic characteristics of actual software structures, and a hydropower simulation system is taken as an example. This paper contributes to an accurate understanding of the module design of software systems and is expected to be beneficial to software engineering practices.

Introduction

Large-scale software systems have developed quickly with the rapid development of software engineering. Hence, understanding, measuring, and controlling design are significant challenges for designers and have attracted a significant amount of attention. There are many studies on software metric methods, such as property-based [1], function point [2], productivity [3] and combination [4] methods, but “a common approach of using simple regression models to predict software defects [5]–[8] can lead to risk management decisions”. In 2002, Valverde et al. applied complex networks to the measurement of software structures, representing the software structure as a complex network. Scale-free and small-world characteristics were identified [9], and subsequent studies determined that software networks extracted from various software also follow power-law degree distributions [10]–[19], exhibit strong community structure [10], [20], and show some characteristic behaviors of complex networks [13], [21]–[26]. Furthermore, other aspects of software systems have been analyzed: the methodology of dependability in software networks based on three dimensions of structure has been discussed, and the structural stability of software has been analyzed on the dimension of composition. Because design patterns are reusable in object-oriented software systems, they are regarded as typical structures that have a greater effect on the whole [27], [28]. Ref. [29] studied nine large object-oriented software networks and found that the graphs associated with these software networks are self-similar. The same work also studied the time evolution of fractal dimensions during software system growth and found a significant correlation between the complexity metrics and the fractal dimension. Ref. [30] presented a systematic empirical analysis of the statistical properties of communities.

On another research front, some studies have tried to develop a metric for the role of a module in software networks, but few models can accurately describe the “bridge” role of a module. The Weighted OO Software Coupling Network, which assigns weights to nodes, is proposed in [31], where the weight and out-degree follow a power-law distribution. Refs. [32], [33] introduce the main metric parameters of software networks in detail and integrate these parameters into a hierarchical metric set. The analytic results in [34] reveal that most of the parameters of complex systems can also be used to represent properties of software structures; some efficient metrics and methods based on basic parameters of other complex systems are introduced, and a practical example is used to demonstrate the validity and effectiveness of the proposed metrics. Ref. [35] describes some recent algorithms that appear to work as well as some algorithms based on betweenness, which is one of the most important metrics of the centrality of a module in a software network. Ref. [36] introduces another important metric model: closeness. It performs a regular, macroscopic analysis and subsequently uses the method to measure important features and characteristics. The relativity among the integral measure and identities facilitates important proofs for the qualification of software qualities.

This paper is motivated by the above considerations. Refs. [10], [32], [33] have proposed metric parameters and models that represent properties of software structure but are independent of the connectivity role of a module, and the models in [34]–[36] represent the centrality of a module in software networks, which is different from connectivity. Hence, a new model is proposed from a new perspective. Some modules exhibit stronger connectivity than other modules: if a fault occurs in such a module, its neighbors can no longer connect to each other. A “bridge” is used to represent the connectivity of the module in the software network; therefore, a bridge role metric model that can more accurately serve as a metric for this characteristic is proposed. The remainder of this paper is organized as follows. After describing the bridge role metric model in section 2, we compare this model with two previous models in section 3 and analyze the correlation between the results and other fundamental metrics. In section 4, an actual hydropower system is taken as an example to demonstrate the validity of the model, and the implications of design principles for software structure are discussed. In section 5, the conclusion is presented and future studies are proposed.

Methods

Software networks

Software is a system composed of many interacting and collaborating units that reflect coding, design and execution. The extraction of a network from code is displayed in Fig. 1. In particular, some modules are reused by or rely on other modules, and the dependency relations between two modules A and B are of two types: inheritance and association. If A makes reference to B (through either association or inheritance) in its definition, there is an edge directed from A to B, and vice versa. Hence, a software network is defined as a graph whose set of nodes represents the modules and whose set of edges E denotes the relations between modules. Repeated edges between modules are not considered. Software is regarded as an undirected network, so the edge between nodes i and j does not depend on the direction of the reference. Node i is characterized by parameters such as the degree, the closeness, and the betweenness, which are presented in section 3. In this paper, approximately 100 randomly selected software packages (listed in the table in Appendix S1) from the open source community (http://sourceforge.net, http://code.google.com/hosting/ and http://www.oschina.net) are chosen as empirical cases.
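The construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input format (a list of "module references module" pairs), the function names and the rule that a module referencing itself is skipped are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_software_network(dependencies):
    """Build an undirected network without repeated edges.

    `dependencies` is a hypothetical list of (module, referenced_module)
    pairs covering both inheritance and association references.
    """
    adjacency = defaultdict(set)
    for source, target in dependencies:
        if source == target:
            continue  # assumption: a module referencing itself adds no edge
        # The direction is dropped, and sets discard repeated edges.
        adjacency[source].add(target)
        adjacency[target].add(source)
    return {node: set(neighbors) for node, neighbors in adjacency.items()}


def edge_count(adjacency):
    """Each undirected edge is stored twice, once per endpoint."""
    return sum(len(neighbors) for neighbors in adjacency.values()) // 2


# "A" and "B" reference each other and one pair is listed twice; the
# resulting network still holds a single A-B edge.
deps = [("A", "B"), ("B", "A"), ("A", "C"), ("A", "B")]
network = build_software_network(deps)
degrees = {node: len(neighbors) for node, neighbors in network.items()}
```

Storing neighbors in sets is one simple way to honor both rules of the definition at once: the network is undirected, and repeated edges are not considered.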

Figure 1. The extraction from codes to software network.

The process is as follows: The UML class diagram is first abstracted from the source code and subsequently converted to the undirected software network.

https://doi.org/10.1371/journal.pone.0111613.g001

Bridge role metric model

As mentioned above, two metric models measure the centrality of a node in the software network, namely the closeness [36] and the betweenness [35]. In the definition of closeness, N is the number of nodes in the software network, i ≠ j, and the distance between nodes i and j is 1 if there is a direct connection between them; otherwise, it is the sum of the distances of the consecutive links along the path between them. In the definition of betweenness, a path is counted for node i if node i is located on the shortest path between nodes s and t, and the count is taken relative to the number of shortest paths between s and t.
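As an illustration of the betweenness idea, the following Python sketch counts, for a chosen node, the share of shortest paths between all other pairs of nodes that pass through it. It is a sketch under assumptions, not the paper's formula: the ratio of totals used here follows the arithmetic of the Fig. 4 discussion (paths through the node divided by all shortest paths), and the helper names and toy graphs are hypothetical.

```python
from collections import deque
from itertools import combinations


def make_graph(edges):
    """Return an undirected adjacency map {node: set of neighbors}."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def shortest_path_counts(adjacency, source):
    """BFS from `source`: distance and number of shortest paths per node."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adjacency[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                count[nb] = 0
                queue.append(nb)
            if dist[nb] == dist[node] + 1:
                count[nb] += count[node]
    return dist, count


def path_fraction_through(adjacency, node):
    """Share of shortest paths between other node pairs that use `node`."""
    info = {v: shortest_path_counts(adjacency, v) for v in adjacency}
    dist_v, count_v = info[node]
    others = [v for v in adjacency if v != node]
    through = total = 0
    for s, t in combinations(others, 2):
        dist_s, count_s = info[s]
        if t not in dist_s:
            continue  # s and t are not connected at all
        total += count_s[t]
        # `node` lies on a shortest s-t path iff the distances add up.
        if node in dist_s and dist_s[node] + dist_v[t] == dist_s[t]:
            through += count_s[node] * count_v[t]
    return through / total if total else 0.0


# Toy graphs (not the networks of the figures): a star, and the same
# star with one extra edge between two leaves.
star = make_graph([(1, 2), (1, 3), (1, 4)])
star_plus_edge = make_graph([(1, 2), (1, 3), (1, 4), (2, 3)])
center_share_star = path_fraction_through(star, 1)
center_share_extra = path_fraction_through(star_plus_edge, 1)
```

In the star, every shortest path between leaves runs through the center. Adding a single edge between two leaves removes one of the three paths from the center, so the share drops even though the center still joins the remaining leaf to the rest.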

However, closeness and betweenness cannot show the connectivity role effectively; therefore, the bridge role metric model is proposed in this paper and compared with the two metric models above.

The definition is as follows: (1)(2) where node j is any neighbor of node i, and 1 if there exists an edge between node i and node j; otherwise, 0. We now discuss the value of equation (1). We suppose the number of all neighbors of node i is n (1 ≤ n ≤ N − 1); therefore, node i and all its neighbors can be considered as a community as follows: and . We set the number of neighbors (including node i) of node k as and the number of neighbors of node j as . We obtain and . One of the two extreme cases is that when (), = 0), and the other extreme case is that when , i.e., the community is an (n+1)-clique, then . We set Y = , and subsequently, the solutions of equation Y′ = 0 are n = 0.5 and n = 1.5, but n is an integer; hence, the extremal solution is n = 2 and 1.125. When ,; therefore, Y < 1. Finally, another case should be discussed in which the community is neither a star network nor a clique. In these cases, the maximum can be computed with the recurrence method. Suppose that there are edges between neighbors of node , i.e. M = (if M = , the community is an -clique). We then set () to represent the value of the metric model. We obtain () = ; hence, we have and

where means that the node is an isolated one.
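The case analysis above distinguishes a star, a clique and everything in between by the number of edges among the neighbors of a node. A minimal Python sketch of that classification follows. It does not evaluate equation (1); it only counts the edges among the neighbors. The function names and example graphs are hypothetical, and the clique test relies only on the fact that n neighbors that are all pairwise connected have n(n − 1)/2 edges among them.

```python
from itertools import combinations


def make_graph(edges):
    """Return an undirected adjacency map {node: set of neighbors}."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def neighbor_edge_count(adjacency, node):
    """Number of edges whose two endpoints are both neighbors of `node`."""
    return sum(
        1 for a, b in combinations(adjacency[node], 2) if b in adjacency[a]
    )


def community_shape(adjacency, node):
    """Classify the community formed by `node` and its neighbors."""
    n = len(adjacency[node])
    if n == 0:
        return "isolated"
    m = neighbor_edge_count(adjacency, node)
    if m == 0:
        return "star"  # for n == 1, star and clique coincide
    if m == n * (n - 1) // 2:
        return "clique"  # node plus n neighbors form an (n+1)-clique
    return "intermediate"


star = make_graph([(1, 2), (1, 3), (1, 4)])
clique = make_graph([(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)])
partial = make_graph([(1, 2), (1, 3), (1, 4), (2, 3)])
shapes = [community_shape(g, 1) for g in (star, clique, partial)]
```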

The computation of the value can be described with the following algorithm. The hierarchical network is created by placing each node in a hierarchy level according to its centralization. For example, the nodes in the center are placed in the innermost ring, so that the whole network resembles a multi-ring network.

Algorithm 1 (N,M)

  Create Hierarchical Network H(N,M)

  If the number of nodes in the nodes set |N|

  Computing the isolated and nodes (suppose the number of these nodes is , = 1)

  hierarchy = 1

  while |N| do

    |N| = |N|-;

    Computing the nodes of the current hierarchy

    hierarchy = hierarchy+1;

    Removing the nodes of the current hierarchy from |N|;

  end while

  return results
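The layer-by-layer loop of Algorithm 1 can be sketched in Python as follows. The rule that decides which nodes belong to the current hierarchy is an assumption made for this sketch only: the nodes with the smallest remaining degree are peeled off first, so that isolated and peripheral nodes land in the outer rings and central nodes in the inner ones. The function name and the example path graph are hypothetical.

```python
def hierarchy_levels(adjacency):
    """Assign each node a hierarchy index by peeling the network in layers.

    Assumption: the current hierarchy consists of the nodes with the
    smallest degree in what remains of the network.
    """
    remaining = {node: set(nbrs) for node, nbrs in adjacency.items()}
    levels = {}
    hierarchy = 1
    while remaining:
        min_degree = min(len(nbrs) for nbrs in remaining.values())
        layer = [n for n, nbrs in remaining.items() if len(nbrs) == min_degree]
        for node in layer:
            levels[node] = hierarchy
        # Remove the nodes of the current hierarchy from the network.
        for node in layer:
            for nb in remaining.pop(node):
                if nb in remaining:
                    remaining[nb].discard(node)
        hierarchy += 1
    return levels


# A path a-b-c-d-e: the end points form the outer ring and c the center.
path = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
levels = hierarchy_levels(path)
```

On the path, the two end points are removed first, then their former neighbors, and the middle node ends up in the innermost hierarchy, which matches the multi-ring picture described before the algorithm.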

Results and Discussion

Comparisons

How does the bridge role metric model represent the connectivity of a node? Fig. 2 shows several cases in which the number of edges gradually increases, and node 1 is taken as an example to explain the function of the model. In Fig. 2(a), there are no edges between the neighbors of node 1; hence, the neighbors cannot make contact with each other without node 1. In Fig. 2(b), four pairs of nodes can connect to one another without node 1, and in Fig. 2(c), many more such pairs exist. In Fig. 2(d), any node can connect to any other node. It can be concluded that the connectivity of a node in a given community becomes stronger as the Bre value decreases, which has been shown theoretically in section 2.2 with equation (3); here, connectivity means the ability to make other nodes communicate with each other.

Figure 2. Several cases with a fixed set of nodes and a gradually increasing number of edges.

In Figure 2 (a), N = 30, M = 29, and Bre = 0.0345. In Figure 2 (b), N = 30, M = 37, and Bre = 0.0577. In Figure 2 (c), N = 30, M = 275, and Bre = 0.1319. In Figure 2 (d), N = 30, M = 435, and Bre = 0.1332. The value of Bre reflects the connectivity of node 1.

https://doi.org/10.1371/journal.pone.0111613.g002

As mentioned in section 2.2, two previous metric model parameters are the closeness [36] and the betweenness [35]. How does the metric model in this paper work more effectively than these two models? Node 1 is taken as an example, and comparisons are described as follows.

In Fig. 3, there are two networks that have almost the same structure except for a few edges. Node 1 is in the center of both (a) and (b). In (a), nodes 2, 6, and 10 can communicate with each other only through node 1; hence, node 1 acts as a “bridge”. In (b), all other nodes can connect to each other without node 1; therefore, in the latter network, the role of node 1 is not very important. The closeness of node 1 is 0.5 in both networks, which cannot reflect the evidently different role of node 1 as an intermediate node. In contrast, Bre = 0.3333 in the former network and Bre = 0.6533 in the latter. This shows that the bridge role metric model can reflect the connectivity of a node more effectively than the closeness can.
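The notion of a "bridge" used in this comparison can be checked directly: remove the node and see whether its neighbors can still reach each other. The Python sketch below does this with a breadth-first search. It is not the Bre metric, only an illustration of the property the metric is meant to capture, and the function names and the toy edge lists are hypothetical rather than the networks of Fig. 3.

```python
from collections import deque


def make_graph(edges):
    """Return an undirected adjacency map {node: set of neighbors}."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def neighbor_groups_without(adjacency, node):
    """Group the neighbors of `node` by reachability once `node` is removed."""
    unassigned = set(adjacency[node])
    groups = []
    while unassigned:
        start = unassigned.pop()
        seen = {start, node}  # marking `node` as seen blocks paths through it
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for nb in adjacency[current]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        group = {start} | (unassigned & seen)
        unassigned -= group
        groups.append(group)
    return groups


# Toy network: node 1 joins three branches headed by nodes 2, 6 and 10.
base_edges = [(1, 2), (1, 6), (1, 10), (2, 3), (6, 7), (10, 11)]
bridged = make_graph(base_edges)
# Two extra edges let the branches reach each other without node 1.
bypassed = make_graph(base_edges + [(3, 7), (7, 11)])

groups_bridged = neighbor_groups_without(bridged, 1)
groups_bypassed = neighbor_groups_without(bypassed, 1)
```

When the neighbors fall into several groups, the node is the only link between those groups. A single group means that the neighbors can communicate without it, which is the situation described for network (b).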

Figure 3. Comparisons of the value between closeness and bridge role in two nearly identical networks.

(a) N = 13 and M = 18. (b) N = 13 and M = 21, where there are three more edges in this network on the basis of the network in (a). Node 1 is in the center in these two networks.

https://doi.org/10.1371/journal.pone.0111613.g003

We now concentrate on the other previous metric model parameter: betweenness. Again, there are two networks in Fig. 4, where the latter has two more edges than the former. In (a), there are a total of 66 shortest paths between the nodes excluding node 1, of which 54 pass through node 1; therefore, the betweenness of node 1 is 0.8182. Meanwhile, node 1 acts as a bridge that allows nodes 2, 6, and 10 to communicate with each other, where Bre = 0.25. In (b), the connectivity of node 1 for nodes 2, 6, and 10 does not change, and Bre = 0.25; however, the number of shortest paths on which node 1 is located decreases, and the betweenness of node 1 is 0.6667. It should be noted that ,, and are altered because of the two extra edges.

Figure 4. Comparisons of the value between the betweenness and bridge role in two nearly identical networks.

(a) N = 13 and M = 12. (b) N = 13 and M = 16. Node 1 is also in the center in these two networks as shown in Figure 3.

https://doi.org/10.1371/journal.pone.0111613.g004

From the discussion above, it can be concluded that the metric model proposed in this paper reflects the connectivity of nodes more effectively than the closeness or the betweenness.

Simulations

Some studies have revealed that software networks follow power-law distributions over a range of degrees, where the degree is the number of edges attached to a node [36]. It is natural to consider the correlations between the bridge role metric model and other metrics. Fig. 5 shows the correlations of Bre, betweenness, and closeness with the degree in four familiar software networks.

Figure 5. The correlations between the metric model Bre values, closeness, betweenness and the degree K.

The corresponding data are obtained from Evolution-2.6.2 (N = 1445, M = 1129), JeditR-1.35 (N = 822, M = 718), Blender-2.42 (N = 2426, M = 2848) and Azureus_2.5.0.2 (N = 2375, M = 3278), which are well-known software packages. The data points (•,▪,▴) represent measurements of the three metric models.

https://doi.org/10.1371/journal.pone.0111613.g005

Typically, centrality (closeness or betweenness) has a significant correlation with the degree; nevertheless, it can be seen in Fig. 5 that the closeness or betweenness increases, but less markedly, as the degree K increases, and the centrality of a node does not significantly depend on its degree. In particular, the Bre values vary logarithmically with the degree, which indicates that a node plays a less important connectivity role as its degree increases. Meanwhile, there are more edges between the neighbors of the corresponding node. This correlation contributes to an accurate understanding of modules in software engineering practice. If there are reusable modules in a software system, they obey the following engineering principle: if the reuse rate is high, the corresponding module is often redesigned as several additional modules, and the neighbors often use or rely on more than one of these modules, which makes the neighbors “closer”.

The Bre values are closely related to the edges between the neighbors of a node; therefore, there is most likely a correlation between them and another metric model called Clustering Coefficients (CC2) [34], which also depends on these edges. CC2, where is the number of edges among nodes in the -neighborhood of node ( = 1,2). As seen in Fig. 6, there is an approximately linear correlation between CC2 and Bre. The correlation indicates that increased use between parts of neighbors ( = 1,2) will inevitably lead to a decrease of the connectivity of the corresponding module. Because of the scale-free characteristic () mentioned above, it is clear that is not proportional to , which represents the difference between Bre and K from another point of view. The distribution certainly does not reflect the scale-free [26] nature of the software system, which is shown in Fig. 7.

Figure 6. The correlations between the CC2 and Bre values.

The corresponding data are also obtained from Evolution-2.6.2, JeditR-1.35, Blender-2.42 and Azureus_2.5.0.2. The data points • represent measurements of the models.

https://doi.org/10.1371/journal.pone.0111613.g006

Figure 7. The distributions of the Bre values.

The data points • represent P(Bre). The distribution can be plotted with two branches, one of which seems to be most likely proportional to Bre but has few data points; therefore, there is no correlation between Bre and P(Bre).

https://doi.org/10.1371/journal.pone.0111613.g007

To verify the validity of the metric model proposed in this paper for software engineering practices, a hydropower simulation system [34] is taken as an example. The architecture and corresponding networks are shown in Fig. 8. The system was developed by the Embedded Technology Lab at Northeastern University for the Fengman station, which was the earliest established large hydropower station. The software holds two national software copyrights (No. 0009448 and No. 050963) and has been in operation for more than ten years. The metric model in this paper is used for fault detection in the development of version 2.0 of the software. First, modules are sorted based on their Bre values; subsequently, the source code of the modules that have lower values and are not isolated is analyzed. The analysis determined that four modules (the XJ, RD, LP and VoltCurr modules) are fault-prone [37], which leads to overall instability. These modules are basic control units and play significant bridge roles in the system because other modules inherit or use them. The analysis facilitated redesigns to reduce the fault-proneness and enhance stability.
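The screening step described above (sort the modules by their metric value, skip isolated ones, inspect the lowest values first) can be sketched as follows. The module names, metric values and degrees in this Python example are invented for illustration and are not data from the hydropower system; the function name is also hypothetical.

```python
def review_order(metric_values, degrees):
    """Return non-isolated modules sorted by ascending metric value.

    Modules with a lower value play a stronger bridge role, so they are
    placed first in the review queue. A module with degree 0 is isolated
    and is left out.
    """
    candidates = [m for m in metric_values if degrees.get(m, 0) > 0]
    return sorted(candidates, key=lambda m: metric_values[m])


# Hypothetical values for four made-up modules.
metric_values = {"ModA": 0.42, "ModB": 0.05, "ModC": 0.17, "ModD": 0.01}
degrees = {"ModA": 3, "ModB": 12, "ModC": 5, "ModD": 0}

queue = review_order(metric_values, degrees)
```

Here "ModD" has the lowest value but is isolated, so it is excluded, and the review starts with "ModB".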

Figure 8. Architecture and network of the hydropower simulation system.

(a) Architecture. (b) Network (N = 310, M = 850). The software system is developed by Embedded Technology Key Lab in Northeastern University (www.netology.cn) for the Fengman hydropower station in China.

https://doi.org/10.1371/journal.pone.0111613.g008

Conclusions

The contribution of this paper is the proposed bridge role metric model. Because nodes play different connectivity roles in a software network, we use this metric model instead of the two previous metric models, betweenness and closeness. After providing a definition, the range of the metric value was discussed. The function of the metric model is illustrated with different cases as well as theoretically. Comparisons are also carried out, and the analysis indicates that the model can reflect the connectivity more effectively. Furthermore, it is determined that the Bre values vary logarithmically with the degree and are proportional to another metric model, the Clustering Coefficients (CC2), which indicates that a node plays a less important connectivity role as its degree increases. Nevertheless, is not proportional to . To verify the validity of the model in software engineering practice, a hydropower simulation system is taken as an example to detect fault-proneness in modules.

However, further work is still required to improve the application of the model in software structure design. First, fault-proneness can most likely be detected through a combination of this model and others (K, closeness, etc.). Second, the model requires additional validation; we will use software engineering metrics such as coupling and cohesion to support the proposed solution. Additionally, further investigations to extend the metric model to macro- and micro-structures should be carried out so that the role of a node in the entire software network can be estimated more effectively.

The work in this paper could facilitate a better understanding of the role of modules in systems. Actually, because local instability most likely leads to global failures, the structure is very important for designers to predict the fault-proneness of a module. The metric can help us to redesign the structure of software, improve the quality of software, and subsequently shorten the development life cycle.

Supporting Information

Acknowledgments

The authors gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.

Author Contributions

Conceived and designed the experiments: BL YF SG DL. Performed the experiments: BL YF SG DL. Analyzed the data: BL YF SG DL. Contributed reagents/materials/analysis tools: BL YF SG DL. Wrote the paper: BL YF SG DL.

References

  1. Briand LC, Morasca S, Basili VR (1996) Property Based Software Engineering Measurement. IEEE Transactions on Software Engineering 22(1): 68–86.
  2. Furey S (1997) Why we should use function points. IEEE Software 14(2): 28–30.
  3. Arnold M, Pedross P (1998) Software Size Measurement and Productivity Rating in a Large-Scale Software Development Department. In: Proceedings of 1998 International Conference on Software Engineering, Kyoto: 503–506.
  4. Bauer M (1999) Analysing Software Systems by Using Combinations of Metrics. In: Proceedings of ECOOP'99 Workshops, Lisbon: 170–171.
  5. Kpodjedo S, Ricca F, Antoniol G, Galinier P (2009) Evolution and Search Based Metrics to Improve Defects Prediction. In: Proceedings of the 1st International Symposium on Search Based Software Engineering, Windsor, UK: 23–32.
  6. Fenton NE, Neil M (2000) Software metrics: successes, failures and new directions. Journal of Systems and Software 47(2–3): 149–157.
  7. Mens T, Demeyer S (2001) Future trends in software evolution metrics. In: Proceedings of 4th International Workshop on Principles of Software Evolution, Austria: 83–86.
  8. Fenton NE, Krause P, Neil M (2002) Software Measurement: Uncertainty and Causal Modeling. IEEE Software 19(4): 116–122.
  9. Valverde S, Cancho RF, Sole RV (2002) Scale-free networks from optimal design. Europhysics Letters 60: 512–517.
  10. Myers CR (2003) Software systems as complex networks: Structure, function, and evolvability of software collaboration graphs. Physical Review E 68: 1–15.
  11. Valverde S, Sole R (2003) Hierarchical small worlds in software architecture. Working Paper of Santa Fe Institute, SFI/030744.
  12. De Moura APS, Lai YC, Motter AE (2003) Signatures of small-world and scale-free properties in large computer programs. Physical Review E 68: 017102.
  13. Barabási AL, Albert R (1999) Emergence of scaling in random networks. Science 286: 509–512.
  14. Wen L, Dromey G, Kirk D (2009) Software engineering and scale-free networks. IEEE Transactions on Systems, Man, and Cybernetics Part B: Cybernetics 39(4): 845–854.
  15. Concas G, Marchesi M, Pinna S, Serra N (2007) Power-laws in a large object-oriented software system. IEEE Transactions on Software Engineering 33(10): 687–708.
  16. Clauset A, Shalizi CR, Newman ME (2009) Power-law distributions in empirical data. SIAM Review 51(4): 661–703.
  17. Ma YT, He KQ, Li B, Liu J, Zhou XY (2010) A Hybrid Set of Complexity Metrics for Large-Scale Object-Oriented Software Systems. Journal of Computer Science and Technology 25(6): 1184–1201.
  18. Potanin A, Noble J, Frean M, Biddle R (2005) Scale-free geometry in OO programs. Communications of the ACM-Adaptive complex enterprises 48(5): 99–103.
  19. Liu J, He KQ, Ma YT, Peng R (2006) Scale free in software metrics. In: Proceedings of 30th Annual International Computer Software and Applications Conference pp. 229–235.
  20. Subelj S, Bajec M (2011) Community structure of complex software systems: analysis and applications. Physica A 390(16): 2968–2975.
  21. Jenkins S, Kirk SR (2007) Software architecture graphs as complex networks: a novel partitioning scheme to measure stability and evolution. Information Sciences 177(12): 2587–2601.
  22. Li B, Pan WF, Lu JH (2011) Multi-granularity dynamic analysis of complex software networks. In: Proceedings of IEEE International Symposium on Circuits and Systems pp. 2119–2124.
  23. Pan WF, Li B, Ma YT, Qin YY, Zhou XY (2010) Measuring structural quality of object-oriented softwares via bug propagation analysis on weighted software networks. Journal of Computer Science and Technology 25(6): 1202–1213.
  24. Roach C, Menezes R (2011) Using networks to understand the dynamics of software development. Communications in Computer and Information Science 116(2): 119–129.
  25. Dabrowski R, Stencel K, Timoszuk G (2011) Software is a directed multi-graph. In: Proceedings of the 5th European conference on Software architecture pp. 360–369.
  26. Zhang HH, Zhao H, Cai W, Zhao M (2008) Visualization and cognition of large-scale software structure using the k-core analysis. In: Proceedings of the 2008 International Conference on Intelligent Information Hiding and Multimedia Signal Processing, pp. 954–957.
  27. Wang W, Zhao H, Li H, Zhang J, Li P, et al. (2010) Research on LFS Algorithm in Software Network. Journal of Software Engineering and Applications 3(2): 185–189.
  28. Wang W, Zhao H, Li H, Li P, Yao D, et al. (2010) Application of Design Patterns in Process of Large-scale Software Evolving. Journal of Software Engineering and Applications 3(1): 58–64.
  29. Concas G, Locci MF, Marchesi M, Pinna S, Turnu L (2006) Fractal Dimension in Software Networks. Europhysics Letters 76(6): 1221–1227.
  30. Lancichinetti A, Kivelä M, Saramäki J, Fortunato S (2010) Characterizing the Community Structure of Complex Networks. PLoS ONE 5(8): e11976.
  31. Liu J, He KQ, Ma YT, Peng R (2006) Scale Free in Software Metrics. In: Proceedings of the 30th Annual International Computer Software and Applications Conference (1): 229–235.
  32. Liu J, He KQ, Peng R, Ma YT (2006) A Study on the Weight and Topology Correlation of Object Oriented Software Coupling Network. In: Proceedings of the 1st International Conference on Complex Systems and Applications pp. 955–959.
  33. Ma YT, He KQ, Liu J, Yan YL (2005) A Complexity Metrics Set for Large-scale Object-Oriented Software Systems. In: Proceedings of the 6th International Conference on Computer and Information Technology pp. 189.
  34. Cai W, Zhao H, Zhang HH (2007) Static Structural Complexity Metrics for Large-scale Software. Special Issue on Software Engineering and Complex Networks of Dynamics of Continuous, Discrete and Impulsive Systems Series B 14(S6): 12–17.
  35. Newman ME (2004) Detecting community structure in networks. Euro. Phys. J. B 38(2): 321–330.
  36. Peng L (2011) Research on Structure Characteristics and Information Metabolism of Software Network. Shen Yang: Northeastern University. 344 p.
  37. Huang P, Zhu J (2008) Predicting the fault-proneness of class hierarchy in object-oriented software using a layered kernel. Journal of Zhejiang University 9(10): 1390–1397.