Research Article

An a posteriori measure of network modularity

[version 2; peer review: 1 approved, 2 approved with reservations]
PUBLISHED 30 Aug 2013

Abstract

Measuring modularity is important to understand the structure of networks, and has a large number of real-world implications. However, several measures exist to assess modularity, and they give both different modularity values and different module compositions. In this article, I propose an a posteriori measure of modularity, which represents the ratio of interactions between members of the same modules vs. members of different modules. I apply this measure to a large dataset of 290 ecological networks, to show that it gives new insights about their modularity.

Keywords

nodes, network, modularity, interactions

Changes from Version 1

I updated the manuscript in several places to (i) clarify the usefulness of this measure, and (ii) show how it behaves when applied to different methods of community detection. An extensive list of changes to each point raised by the two referees can be found online: https://github.com/tpoisot/ms_qr/issues?page=1&state=closed.

See the author's detailed response to the review by Jochen Fründ

Introduction

Modularity, the fact that groups of nodes within a network interact more frequently among themselves than with other nodes, is an important property of several systems, including genetic1,2, informatics3, ecological4, and socio-economic5 interactions, as well as biogeographic patterns6,7 and disease spread management8. Because of the relevance of modularity for network properties, it is important to assess it correctly. Several methods exist to measure network modularity, some of which rely on the optimization of a given criterion9,10, label propagation11, or a combination of these approaches12,13. These methods return two elements. The first is a value of modularity for the network, most often within the 0–1 interval. Each method often has a threshold value, above which a network is considered to be modular. Increasing values reflect an increasingly modular structure. The second element is a “community partition”, i.e. the attribution of each node to a module.

Recently, Thébault7 showed that different measures of modularity tailored to presence/absence matrices (i.e. networks in which links have no weight) gave roughly equal estimates of the significance of modularity, but differed in the community partition they returned (i.e. the identity of nodes composing each module varied). In such situations, one might look for a way to choose which community partition should be used. As the criterion that is optimized by each method is different, one possible way to compare the different community partitions is to use an a posteriori measure to quantify modularity, which can be applied to a network regardless of the method used to obtain the community partition.

An important feature of modular networks is the occurrence of interactions between nodes of different modules. They contribute to the propagation of disturbances4, flow of information14,15, and cross-regulation of biological processes16, inter alia17. In addition to measuring how modular the network is, determining to what extent modules are connected, and identifying the nodes and edges responsible for connecting modules, thus provides valuable information. In this article, I propose an a posteriori measure of the proportion of interactions established between modules, i.e. edges connecting different communities. I apply this measure to the community partition identified by the Louvain method on 290 ecological networks, and show that it behaves in a similar way to other modularity measures.

The measure

In this contribution I define the realized modularity, termed QR. QR measures the extent to which edges, within a network, are established between nodes belonging to the same module. For E edges in a network, if W of them are established between members of the same module, then

$$Q_R = \frac{W}{E} \tag{1}$$

When there are no between-module links, then W = E and QR takes the maximal value of 1. When between-module interactions are as numerous as within-module interactions, then W = E/2, and QR takes the value of 1/2. To rescale the realized modularity so that these two cases correspond to 1 and 0 respectively, it is expressed as:

$$Q'_R = 2 \times Q_R - 1 \tag{2}$$

The main advantage of QR is that it is agnostic with regard to the measure used to optimize modularity (and even to the method by which the nodes were assigned to modules, which can be arbitrary), as it acts a posteriori, i.e. after nodes have been attributed to modules. It can therefore be used to select the community detection method maximizing modularity. This measure works on most types of networks, as it makes no difference if links are directional, or if the networks are bipartite/unipartite. An illustration of this measure is given in Figure 1. This measure is purposefully simple, (i) so that it makes no assumption about what modularity is, or how it should be optimized, and (ii) because it is not meant to be used to optimize modularity, but to either compare the outcome of different methods, or present the value of modularity in a way that is straightforward to interpret.
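Equations 1 and 2 can be illustrated with a few lines of Python. The sketch below is not the author's implementation: the function name `realized_modularity`, the edge-list and `membership` dictionary formats, and the toy network are assumptions made for illustration only.

```python
def realized_modularity(edges, membership):
    """Return (Q_R, Q'_R) for an edge list and a node -> module mapping.

    Both argument formats are assumptions of this sketch: `edges` is an
    iterable of (node, node) pairs and `membership` maps each node to a
    module label obtained by any community detection method.
    """
    edges = list(edges)
    if not edges:
        raise ValueError("the network has no edges, so Q_R is undefined")
    # W: number of edges whose two endpoints belong to the same module
    within = sum(1 for u, v in edges if membership[u] == membership[v])
    q_r = within / len(edges)   # Equation 1: W / E
    q_r_prime = 2 * q_r - 1     # Equation 2: rescaled value
    return q_r, q_r_prime


# Hypothetical toy network: two triangles joined by one between-module edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
membership = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
q_r, q_r_prime = realized_modularity(edges, membership)
```

Because the function only counts edges, the direction of the links and the bipartite or unipartite nature of the network play no role, which matches the description of the measure given above.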


Figure 1. A cartoon depiction of a modular network with links between modules.

Nodes belonging to the same module share the same color. This network has a modularity (Louvain method) of Q = 0.527. Out of the 36 interactions, 31 are established within modules, and 5 between modules. This gives a QR value of 0.86, and Q′R = 0.72.

A Python implementation of this measure, using the networkx package, is available at https://gist.github.com/tpoisot/4947006. It reads data in the edge list format, and offers additional functions to generate null networks, as detailed in the following section.

Example application: realized modularity in ecological networks

In this section, I analyze the modular structure of a large dataset of 290 ecological networks (187 food webs and 113 host-parasite networks) published in previous meta-analyses18,19. Modularity is an important feature of ecological interaction networks, which is linked to their resilience20,21, stability7, biogeographic structure22, functioning23, and to the evolutionary mechanisms involved in their assembly24. Notably, the occurrence of interactions between and within modules plays a central role in the structure of pollination networks4, and helps buffer the effect of species extinctions21.

The existence of interactions in ecological systems involves a large family of processes, ranging from abundance-related25,26 (abundant species are more likely to interact together) to trait-related27 (pollination depends on the flower and insect having compatible morphologies, predators are constrained by the body size of their prey). The interaction between these different families of mechanisms will drive heterogeneity in interaction strength28. Yet, the analysis of binary matrices (is there an interaction between a pair of species, or not) is still relevant to identify properties that are conserved across systems29, especially given that one could argue that quantitative information on interaction strength is an additional level of information. The systems analyzed in this section are represented by their adjacency matrix, describing the presence or absence of an interaction.

Data and analysis

I used the Louvain method30 to detect modules, due to its speed and efficiency on large networks. The Louvain method works in two steps: first, it optimizes modularity locally, through clustering of neighboring nodes. These clusters are, in the second step, aggregated together, until modularity ceases to increase. This method is known to give values of modularity comparable to what is found using e.g. simulated annealing, and has been observed to give modules that have a functional relevance30. Once the partition was returned by the Louvain method, I recorded its realized modularity Q′R, and its modularity Q (using the Newman and Girvan31 measure).
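To make the quantity Q concrete, the following sketch computes a Newman and Girvan style modularity for a partition that is already known. It does not implement the Louvain optimization itself. It assumes that links are treated as undirected and listed once each, and the function name and input formats are hypothetical.

```python
from collections import defaultdict


def newman_girvan_modularity(edges, membership):
    """Modularity Q of a given partition, treating edges as undirected.

    Uses the per-module form Q = sum_c [ L_c / m - (D_c / 2m)^2 ], where
    m is the number of edges, L_c the number of edges inside module c,
    and D_c the summed degree of the nodes of module c.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        raise ValueError("the network has no edges, so Q is undefined")
    internal = defaultdict(int)    # L_c
    degree_sum = defaultdict(int)  # D_c
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Hypothetical toy network: two triangles joined by one between-module edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
membership = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
q = newman_girvan_modularity(edges, membership)
```

Unlike the realized modularity, this value subtracts the fraction of within-module links expected from the node degrees, which is one reason why the two measures can differ on the same partition.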

For each network, I compared the values of Q and Q′R on the empirical networks to their random estimate using a network null model. Because random networks will by chance display a modular (among other) structure, it is important to compare the empirical measures of Q and Q′R to their random expectations. The null model is defined as follows. For each node n of the network, I measured its degree dn, its number of successors (the number of nodes it links to, or generality in ecological terms, as per32) gn, and its number of predecessors (the number of nodes that link to it, or vulnerability) vn. In each random network, for each pair of nodes (i, j), the probability that i interacts with j is given by

$$P(ij) = \frac{1}{2}\left(\frac{g_i}{d_i} + \frac{v_j}{d_j}\right), \tag{3}$$

and conversely for P(ji). This null model allowed the generation of pseudo-random networks through a Bernoulli process (in each replicate, the occurrence of a link is randomly determined), with the same connectance, and the same distribution of degrees, generality, and vulnerability, as the original one (these properties are also conserved at the node level). For each of the 290 networks, 1000 pseudo-random replicates are generated. For each network, the average values of Q and Q′R across these replicates are estimated along with their 90% confidence interval. When the empirical value lies outside the confidence interval, it can be assumed that the modular structure of the network is different from that expected by chance.
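The null model can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the analysis: links are taken to be directed and unique, the degree of a node is taken to be the sum of its generality and vulnerability, self-links are not drawn, and the function names are hypothetical.

```python
import random
from collections import Counter


def link_probabilities(edges):
    """Equation 3 for every ordered pair of distinct nodes (i, j)."""
    generality = Counter(u for u, _ in edges)     # g: number of successors
    vulnerability = Counter(v for _, v in edges)  # v: number of predecessors
    nodes = sorted(set(generality) | set(vulnerability))
    # Assumption of this sketch: degree = generality + vulnerability.
    degree = {n: generality[n] + vulnerability[n] for n in nodes}
    return {
        (i, j): 0.5 * (generality[i] / degree[i] + vulnerability[j] / degree[j])
        for i in nodes for j in nodes if i != j
    }


def random_network(probabilities, rng):
    """One Bernoulli replicate: each possible link is drawn independently."""
    return [pair for pair, p in sorted(probabilities.items())
            if rng.random() < p]


# Hypothetical three-species food chain with one omnivorous link.
edges = [("a", "b"), ("a", "c"), ("b", "c")]
probs = link_probabilities(edges)
rng = random.Random(42)  # fixed seed so the replicates are reproducible
replicates = [random_network(probs, rng) for _ in range(1000)]
```

In this toy example the probabilities sum to three, the number of links in the original network, so the expected number of links is preserved. Each replicate can then be passed to a community detection method to obtain the null values of Q and Q′R.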

Results

There is a strong, positive relationship between the values of Q′R and the values of Q (Pearson's product-moment correlation coefficient, as implemented in R 2.1533, ρ = 0.64, 288 d.f., p < 10⁻⁶), i.e. networks for which a high modularity is detected tend to have relatively few between-module links (Figure 2). It is worth noting that some Q′R values were negative: in some cases, the best community division resulted in more interactions between than within modules. This result highlights why using an a posteriori measure is useful: other measures of modularity do not reveal the fact that there were more interactions between than within modules. In the dataset examined, most of the networks with a modularity lower than 0.2 had a negative realized modularity. This result suggests that discussing the modularity of such networks makes little sense, as their modules are not more densely linked than other random collections of nodes within the graph. Q and Q′R have different relationships with connectance (Figure 3). Increased connectance values resulted in lower modularity (ρ = –0.61, 288 d.f., p < 10⁻⁶), but had no impact on Q′R. This is a desirable property, as it allows easy comparison of the Q′R values of networks with extremely different connectances.


Figure 2. Relationship between the modularity of the best partition using the Louvain method and the a posteriori realized modularity.

There is a strong, positive relationship between the two variables. Worth noting is the fact that, for some networks, the best partition resulted in negative values of Q′R, i.e. there were more interactions between than within modules. Each dot corresponds to a network.


Figure 3. Relationship between the two measures of modularity and network connectance.

A. Q is negatively affected by connectance, i.e. densely connected networks are less likely to be modular. B. Q′R is not affected by connectance, which allows it to be used to compare different networks. Each dot corresponds to a network.

There is a linear relationship between the deviation from random expectation of Q and Q′R (ρ = 0.78, 288 d.f., p < 10⁻⁶; Figure 4). The deviations (respectively ΔQ and ΔQ′R) are calculated as the empirical value, minus the average of the values on the networks generated by the null model. As an example, a ΔQ less than zero indicates that the empirical network is less modular than expected by chance. Confidence intervals for the average of the null models were typically very narrow (not represented in the figure to avoid cluttering; see associated original dataset), probably owing to the fact that the null model is restrictive on the type of networks which are generated. It is worth noting that for some networks, the null model analyses of the two measures disagree. In a vast majority of these situations, this corresponds to networks having a lower modularity than expected by chance, yet having a higher realized modularity (dots in the upper left corner of Figure 4). Depending on whether the true modularity, or the realized modularity, is the most relevant metric of the processes studied, the interpretation of the null models for these networks will be different.
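The deviation from the random expectation can be computed as in the sketch below. The way the 90% confidence interval was obtained is not detailed in the text, so the normal approximation around the mean used here (z = 1.645) is an assumption, as are the function name and the example values, which are not taken from the dataset.

```python
from math import sqrt
from statistics import mean, stdev


def null_model_deviation(empirical, null_values, z=1.645):
    """Compare an empirical value of Q or Q'_R to its null distribution.

    Returns the deviation (empirical minus the null average), an
    approximate 90% confidence interval around the null average (an
    assumption of this sketch), and whether the empirical value falls
    outside that interval.
    """
    centre = mean(null_values)
    half_width = z * stdev(null_values) / sqrt(len(null_values))
    lower, upper = centre - half_width, centre + half_width
    return {
        "delta": empirical - centre,
        "interval": (lower, upper),
        "outside": not (lower <= empirical <= upper),
    }


# Illustrative values only.
null_q = [0.10, 0.12, 0.11, 0.09, 0.13]
result = null_model_deviation(0.30, null_q)
```

A positive `delta` indicates a network more modular than expected by chance, and a negative one a network less modular than expected. Running the function once on Q and once on Q′R gives the two coordinates of a network in Figure 4.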


Figure 4. Linear relationship between the deviation from random expectation in Q and Q′R.

Networks in the red area are detected as being less modular than expected both by Q′R and Q, while networks in the blue area are detected as being more modular. Although the agreement between the two measures is good (see main text for statistics), some networks are detected as having a higher than expected realized modularity Q′R, despite a lower than expected modularity Q. Each dot corresponds to a network.

Finally, for the unipartite network dataset, I compared the results of three alternative methods of community detection (the walktrap, spinglass, and edge-betweenness methods, as implemented in the igraph library). For each of the unipartite networks, I computed the value of Barber's Q, and Q′R, on the best partition found. The strongest correlation between Q and Q′R was observed for the spinglass method (r = 0.61, df = 165, t = 10.02), and the weakest for the edge-betweenness method (r = 0.04, non-significant at α = 0.05). The walktrap algorithm gave results in between (r = 0.489, df = 165, t = 7.20). For both the walktrap and edge-betweenness methods, several networks had negative values of Q′R, which indicates that the “best” community partition had more links between than within modules. The spinglass method had, by contrast, fewer than 8% of all networks with values of Q′R lower than 0, meaning that this algorithm should be preferred when one wants to group nodes in densely connected clusters.

Conclusions

The Q′R measure presented here allows the estimation of the proportion of interactions established between different modules in a network. This measure can be analyzed in much the same way as other measures of modularity, but is applied a posteriori. As such, it can help choose the “best” community partition according to the property of the network that one wants to maximize. For example, choosing the partition giving the lowest Q′R can help identify which species are more likely to act as connectors between different modules. Ultimately, this information may have some practical relevance as a decision tool. Saavedra et al.5 showed that different nodes contribute differently to overall network properties. In a context in which networks are increasingly being used as management tools to address e.g. conservation or pest management8, knowing the realized modularity, and developing methods to estimate which species have the highest impact on it, can allow the design of efficient policies to increase, or decrease, the ability of network modules to interact.

how to cite this article
Poisot T. An a posteriori measure of network modularity [version 2; peer review: 1 approved, 2 approved with reservations] F1000Research 2013, 2:130 (https://doi.org/10.12688/f1000research.2-130.v2)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Approved: The paper is scientifically sound in its current form and only minor, if any, improvements are suggested
Approved with reservations: A number of small changes, sometimes more significant revisions are required to address specific details and improve the paper's academic merit.
Not approved: Fundamental flaws in the paper seriously undermine the findings and conclusions
Version 2
PUBLISHED 30 Aug 2013
Reviewer Report 05 Dec 2013
Jochen Fründ, Agroecology, Georg-August-University of Göttingen, Göttingen, Germany 
Approved with Reservations
This article suggests a simple intuitive measure of network modularity. The suggested measure, Q'R, is related to established measures but calculated slightly differently. It is proposed as an a posteriori measure, which means it is not suggested to be used ... Continue reading
HOW TO CITE THIS REPORT
Fründ J. Reviewer Report For: An a posteriori measure of network modularity [version 2; peer review: 1 approved, 2 approved with reservations]. F1000Research 2013, 2:130 (https://doi.org/10.5256/f1000research.2230.r2678)
Reviewer Report 02 Sep 2013
Daniel Carstensen, Department of Bioscience, Universidade Federal de São Paulo, São Paulo, Brazil 
Approved
I confirm that I have read this submission and believe that I have an ... Continue reading
HOW TO CITE THIS REPORT
Carstensen D. Reviewer Report For: An a posteriori measure of network modularity [version 2; peer review: 1 approved, 2 approved with reservations]. F1000Research 2013, 2:130 (https://doi.org/10.5256/f1000research.2230.r1638)
Version 1
PUBLISHED 23 May 2013
Reviewer Report 11 Jul 2013
Daniel Carstensen, Department of Bioscience, Universidade Federal de São Paulo, São Paulo, Brazil 
Approved with Reservations
The aim of the author is interesting and relevant. I am intrigued by the development of a method to quickly evaluate different modularity measures, and an a posteriori method might well be a good solution.

Overall the ... Continue reading
HOW TO CITE THIS REPORT
Carstensen D. Reviewer Report For: An a posteriori measure of network modularity [version 2; peer review: 1 approved, 2 approved with reservations]. F1000Research 2013, 2:130 (https://doi.org/10.5256/f1000research.1000.r1055)
Reviewer Report 26 Jun 2013
Carsten F Dormann, Biometry and Environmental System Analysis, University Freiburg, Freiburg, Germany 
Approved with Reservations
The proposed index of modularity is of striking simplicity - and thus likely to be prone to artifacts. In the opening paragraph, Poisot forgot to mention that random networks are also modular. Thus, a Q_R > 0 means, in itself, ... Continue reading
HOW TO CITE THIS REPORT
Dormann CF. Reviewer Report For: An a posteriori measure of network modularity [version 2; peer review: 1 approved, 2 approved with reservations]. F1000Research 2013, 2:130 (https://doi.org/10.5256/f1000research.1000.r1028)