iterativeWGCNA: iterative refinement to improve module detection from WGCNA co-expression networks

Emily Greenfest-Allen; Jean-Philippe Cartailler; Mark A. Magnuson; Christian J. Stoeckert

doi:10.1101/234062

Abstract

Weighted-gene correlation network analysis (WGCNA) is frequently used to identify highly co-expressed clusters of genes (modules) within whole-transcriptome datasets. However, transcriptome-scale networks tend to be highly connected, making it challenging for the hierarchical clustering underlying the WGCNA-based classification to discriminate coherently expressed gene sets without significant information loss from either a priori filtering of the expression dataset or a posteriori pruning of the cluster dendrogram.

Here we present iterativeWGCNA, a Python-wrapped extension for the WGCNA R software package that improves the robustness of detected modules and minimizes information loss. The method works by pruning poorly fitting genes from estimated modules and then re-running WGCNA to refine gene clusters. After refinement, the pruned genes are assembled into a new expression dataset to isolate overlapping modules, and the process is repeated. In doing so, iterativeWGCNA provides unsupervised, unbiased filtering that generates a robust, comprehensive network-based classification of whole-transcriptome expression datasets.
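The control flow described above (prune, re-cluster, then re-analyze the pruned genes) can be illustrated with a minimal sketch. It is not the iterativeWGCNA implementation and does not call WGCNA: `detect_modules` is a hypothetical greedy clustering used as a stand-in for a WGCNA run, the fit of a gene is assumed to be its Pearson correlation with the mean profile of its module, and the thresholds `seed_cut` and `min_fit`, like every function name and the toy data, are illustrative assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def detect_modules(expr, seed_cut=0.5):
    """Hypothetical stand-in for a WGCNA run: each gene joins the first module
    whose seed (first member) it correlates with, else it starts a new module."""
    modules = []
    for gene, profile in expr.items():
        for members in modules:
            if pearson(profile, expr[members[0]]) >= seed_cut:
                members.append(gene)
                break
        else:
            modules.append([gene])
    return [m for m in modules if len(m) >= 2]  # singletons are not modules

def module_profile(expr, members):
    """Mean expression profile of a module (assumed summary of the module)."""
    n = len(expr[members[0]])
    return [sum(expr[g][i] for g in members) / len(members) for i in range(n)]

def refine(expr, min_fit=0.85):
    """Inner loop: prune poorly fitting genes and re-cluster until nothing is pruned."""
    kept = dict(expr)
    while True:
        modules = detect_modules(kept)
        well_fit = set()
        for members in modules:
            centre = module_profile(kept, members)
            well_fit |= {g for g in members if pearson(kept[g], centre) >= min_fit}
        if well_fit == set(kept):  # every retained gene fits its module
            return modules
        kept = {g: p for g, p in kept.items() if g in well_fit}

def iterative_modules(expr, min_fit=0.85):
    """Outer loop: re-analyze the pruned (residual) genes as a new dataset."""
    remaining, result = dict(expr), []
    while remaining:
        modules = refine(remaining, min_fit)
        if not modules:  # no further modules can be isolated
            break
        result.extend(modules)
        classified = {g for m in modules for g in m}
        remaining = {g: p for g, p in remaining.items() if g not in classified}
    return result, sorted(remaining)

# Toy data (invented for illustration): genes as rows, six samples each.
EXPR = {
    "a1": [1, 2, 3, 4, 5, 6],
    "a2": [2, 4, 6, 8, 10, 12],
    "a3": [3, 6, 9, 12, 15, 18],
    "c1": [1, 5, 2, 6, 3, 7],           # loosely follows a1, so it is first grouped with it
    "c2": [0.5, 2.5, 1, 3, 1.5, 3.5],
    "b1": [6, 1, 5, 2, 4, 3],
    "b2": [12, 2, 10, 4, 8, 6],
    "n1": [3, 3, 4, 3, 3, 4],           # fits no module
}

if __name__ == "__main__":
    modules, unclassified = iterative_modules(EXPR)
    print(modules, unclassified)
```

In this toy dataset, `c1` and `c2` are initially absorbed into the module seeded by `a1`, pruned because they fit its mean profile poorly, and then recovered as a module of their own when the residual genes are analyzed as a separate dataset; `n1` is left unclassified.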

The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.