Offdiagonal complexity: A computationally quick complexity measure for graphs and networks

https://doi.org/10.1016/j.physa.2006.08.067

Abstract

A vast variety of biological, social, and economic networks shows topologies drastically differing from random graphs; yet their quantitative characterization remains unsatisfactory from a conceptual point of view. Motivated by the discussion of small scale-free networks, a biased link distribution entropy is defined, which takes an extremum for a power-law distribution. This approach is extended to the node–node link cross-distribution, whose nondiagonal elements characterize the graph structure beyond link distribution, cluster coefficient and average path length. From here a simple (and computationally cheap) complexity measure can be defined. This offdiagonal complexity (OdC) is proposed as a novel measure to characterize the complexity of an undirected graph, or network. While OdC is zero for both regular lattices and fully connected networks, it takes a moderately low value for a random graph and shows high values for apparently complex structures such as scale-free networks and hierarchical trees. The OdC approach is applied to the Helicobacter pylori protein interaction network and randomly rewired surrogates.

Introduction

While random graph theory and scale-free network research have a set of standard measures to quantify graph properties, the question of the complexity of a graph is still in its infancy. A ‘blind’ application of other complexity measures (such as those for binary sequences or computer programs) does not account for the special properties shared by graphs, and especially by scale-free graphs. Moreover, some known complexity measures themselves have a high computational complexity.

Following a series of seminal papers [1], [2], [3], [4], [5] since 1999 (see also Ref. [6] for an overview), small-world and scale-free networks have become a hot topic of investigation in a broad range of systems and disciplines. Metabolic and other biological networks, collaboration networks, the WWW, the Internet, etc., have in common that the distribution of link degrees follows a power law and thus has no inherent scale. Such networks are termed ‘scale-free networks’. Compared to random graphs, which have a Poisson link distribution and thus a characteristic scale, they share a number of distinct properties, especially a high clustering coefficient and a short average path length.

Mathematically, a graph (or synonymously in this context, a network) is defined by a (nonempty) set of nodes, a set of edges (or links), and a map that assigns two nodes (the “end nodes” of a link) to each link. In a computer, a graph may be represented either by a list of links, each given as a pair of nodes, or equivalently by its adjacency matrix $a_{ij}$, whose entries are 1 (0) if nodes $i$, $j$ are connected (disconnected). Useful generalizations are weighted graphs, where the restriction of $a_{ij}$ is relaxed from binary values to (usually nonnegative) integer or real values (e.g. resistor values, travel distances, interaction coupling), and directed graphs, where $a_{ij}$ no longer needs to be symmetric, and the link from $i$ to $j$ and the link from $j$ to $i$ can exist independently (e.g. links between webpages, or scientific citations).
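
The equivalence of the two representations can be illustrated with a minimal sketch in plain Python. The function names `adjacency_matrix` and `link_list` and the small example graph are hypothetical and chosen only for illustration; nodes are assumed to be numbered from 0.

```python
def adjacency_matrix(num_nodes, links):
    """Build the symmetric 0/1 adjacency matrix of an undirected graph."""
    a = [[0] * num_nodes for _ in range(num_nodes)]
    for i, j in links:
        a[i][j] = 1
        a[j][i] = 1  # undirected graph: the matrix is symmetric
    return a


def link_list(a):
    """Recover the list of links (each pair once, i < j) from the matrix."""
    n = len(a)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if a[i][j]]


# Hypothetical example: a path 0-1-2 with one extra link 1-3.
links = [(0, 1), (1, 2), (1, 3)]
a = adjacency_matrix(4, links)
recovered = link_list(a)
```

For a weighted graph one would store the weight instead of 1, and for a directed graph one would drop the symmetric assignment.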

Here the discussion is limited to binary undirected graphs, such as an acquaintance network or a railway network as shown below. In the following sections, the link (degree) distribution and the next-order cross-distribution are investigated and taken as a basis for a complexity measure.

Section snippets

Other complexity measures

For text strings (such as computer programs or DNA) there are common complexity measures in theoretical computer science, such as Kolmogorov complexity (and the related Lempel-Ziv complexity and algorithmic information content, AIC) [8]. For example, AIC is defined by the length of the shortest program generating the string. For random structures, and thus also for random graphs, they indicate high complexity. A distinction of complex structured (but still partly random) structures from completely random ones

Node degree correlations: methods of classical statistics

A straightforward mathematical approach to study node–node link correlations, i.e., correlations between degrees of pairs of nodes, is to use rank correlation methods [16] from classical statistics to analyze the link distributions.

Two common rank correlation methods can be described as follows. One considers a list of rank numbers of link numbers (node degrees). For each of the two graphs (A and B) to be compared, there is an (ordered) list of link numbers $(k_1, k_2, \ldots, k_N) = (5, 2, 2, 1, 1, 1)$, and one assigns a

Definition of the offdiagonal complexity (OdC)

Let $g_{ij}$ be the adjacency matrix of a graph with $N$ nodes, i.e., $g_{ij}=1$ if nodes $i$ and $j$ are connected, else $g_{ij}=0$. Then OdC is defined as follows [15].

  • (i)

    For each node $i$, let $l(i)$ be the node degree, i.e., the number of edges (links), $l(i) = \sum_{j=0}^{N-1} g_{ij}$.

  • (ii)

    Let $c_{mn}$ be the number of edges between all pairs of nodes $i$ and $j$ with node degrees $m=l(i)$, $n=l(j)$ and $l(j) \le l(i)$ (ordered pairs), i.e., $c_{mn} = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} g_{ij}\, \delta_{m,l(i)}\, \delta_{n,l(j)}\, H(l(i)-l(j))$. Here $\delta$ is the Kronecker symbol, and $H(x)=1$ for $x \ge 0$ and $H(x)=0$ for $x<0$.
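
Steps (i) and (ii) can be sketched in plain Python as follows. This is a literal transcription of the double sum over ordered pairs and covers only the two steps listed here; the function names and the two small test graphs are hypothetical. Note that, read literally, the condition $l(i) \ge l(j)$ admits both orders of a pair whose degrees are equal, so each such edge contributes twice to a diagonal entry.

```python
def node_degrees(g):
    """Step (i): l(i) is the row sum of the adjacency matrix."""
    return [sum(row) for row in g]


def degree_cross_distribution(g):
    """Step (ii): c[m][n] sums g[i][j] over ordered pairs (i, j)
    with l(i) = m, l(j) = n and l(i) >= l(j)."""
    l = node_degrees(g)
    k_max = max(l)
    c = [[0] * (k_max + 1) for _ in range(k_max + 1)]
    n_nodes = len(g)
    for i in range(n_nodes):
        for j in range(n_nodes):
            if g[i][j] and l[i] >= l[j]:  # H(l(i) - l(j)) = 1
                c[l[i]][l[j]] += 1
    return c


# Hypothetical test graphs: a star (center 0, leaves 1..3) and a triangle.
star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
c_star = degree_cross_distribution(star)
c_triangle = degree_cross_distribution(triangle)
```

In the star, all three links join the degree-3 center to a degree-1 leaf, so the only nonzero entry is offdiagonal. In the triangle all degrees are equal and every contribution falls on the diagonal.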

Application to the Helicobacter pylori protein interaction graph and reshuffling to a random graph

To demonstrate that OdC can distinguish between random graphs and complex networks, the Helicobacter pylori protein interaction graph [20] has been chosen. For different rewiring probabilities $p$ and $10^2$ realizations each, the links have been reshuffled, ending up with a random graph for $p=1$. As can be seen in Fig. 6, rewiring in any case lowers the offdiagonal complexity.
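
The rewiring procedure itself is not spelled out in this excerpt. The following sketch shows one plausible scheme and should be read as an assumption, not as the simulation code used for Fig. 6: each link is, with probability `p`, replaced by a link between two distinct, uniformly drawn nodes that are not yet connected, so the number of links is preserved. The function name `rewire` and its parameters are hypothetical.

```python
import random


def rewire(links, num_nodes, p, rng=random):
    """Return a rewired copy of an undirected link list.

    Assumed scheme (not taken from the source): each link is, with
    probability p, replaced by a link between two distinct nodes drawn
    uniformly at random that are not currently connected.
    """
    current = {tuple(sorted(link)) for link in links}
    for link in sorted(current):  # iterate over a snapshot of the links
        if rng.random() < p:
            current.remove(link)
            while True:
                i, j = rng.sample(range(num_nodes), 2)  # i != j
                new = (min(i, j), max(i, j))
                if new not in current:
                    current.add(new)
                    break
    return sorted(current)


# Hypothetical usage: fully rewire a path on 6 nodes with a seeded generator.
path = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
surrogate = rewire(path, 6, 1.0, random.Random(0))
```

With `p = 0` the link list is returned unchanged. With `p = 1` every link is redrawn, which under this assumed scheme places the same number of links essentially at random.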

Conclusions and outlook

A new complexity measure for graphs and networks has been proposed. The motivation for its definition is twofold. The first observation is that the binning of link distributions is problematic for small networks. The second, following from the first, is that if one replaces the (plain) entropy of the link distribution, which is insignificant for scale-free networks, with a “biased link entropy”, the latter has an extremum where the exponent of the power law is met.

The central idea of OdC is to apply an entropy

Acknowledgments

J.C.C. thanks Christian Starzynski for providing the simulation code for Fig. 6, and an anonymous referee for constructive remarks.

References (20)

  • R. Albert et al., Statistical mechanics of complex networks, Rev. Mod. Phys. (2001)
  • H. Meyer-Ortmanns, Functional complexity measure for networks, Physica A (2004)
  • D.J. Watts et al., Nature (1998)
  • A.L. Barabasi et al., Science (1999)
  • M.E.J. Newman, The structure and function of complex networks, SIAM Rev. (2003)
  • S.N. Dorogovtsev et al., Evolution of networks, Adv. Phys. (2002)
  • H.A. Ceccatto et al., Phys. Scripta (1988)
  • M. Gell-Mann et al., Information measures, effective complexity, and total information, Complexity (1996)
  • M. Gell-Mann, What is complexity?, Complexity (1995)
There are more references available in the full text version of this article.
