Advantages and limitations of current network inference methods

Key Points

  • Recently, several novel tools for inferring transcriptional networks from expression data have been developed. Computationally inferred interactions offer a useful resource to complement experimental findings, but the direct integration of inference tools into daily laboratory practice remains limited, because the choice of the appropriate network tool is not obvious.

  • Network inference is, mathematically, an underdetermined problem. The large number of theoretically possible interactions between transcription factors (TFs) and their targets far exceeds the number of independent measurements from which the true interactions can be inferred. Inference therefore results in many possible solutions that all explain the data equally well, but only a few of these solutions can be biologically true.

  • Different state-of-the-art tools for network inference deal with underdetermination by using assumptions and simplifications that reduce the number of possible solutions in order to make the problem solvable.

  • The strategy adopted to deal with the inference problem determines which aspects of the transcriptional network are highlighted and the type of research question that can be answered. The outcome of network inference therefore varies greatly between tools.

  • Fair benchmark studies are useful for guiding both users and developers. Most current studies combine validation based on an external standard with medium-throughput experiments to validate the extent to which known interactions can be recovered and reliable new interactions can be inferred.

  • It is likely that no single best method exists, and different methods highlight complementary interaction types. Therefore, ensemble approaches, which aggregate the outcomes of several methods, offer a way to improve both the breadth and the accuracy of the predicted interactions.

  • Future work in the light of novel data generation procedures will be to develop inference methods that exploit high-throughput information about regulation at levels other than transcription to mechanistically explain how genomic variations result in observed expression changes.
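
The ensemble idea mentioned in the Key Points (aggregating the outcomes of several methods) can be illustrated with a minimal sketch. The aggregation rule used here, averaging each edge's rank across methods, is an assumption chosen for simplicity and is not taken from the article; the method names, edges and scores are hypothetical.

```python
def average_rank_ensemble(method_scores):
    """Combine edge scores from several methods by averaging per-method ranks.

    method_scores: dict mapping method name -> dict mapping edge -> confidence
    (higher = more confident). An edge missing from a method gets the worst rank.
    Returns the edges sorted from best (lowest mean rank) to worst.
    """
    all_edges = sorted({edge for scores in method_scores.values() for edge in scores})
    worst_rank = len(all_edges)
    rank_sums = {edge: 0.0 for edge in all_edges}
    for scores in method_scores.values():
        # Rank 1 is the most confident edge of this method; ties broken by edge name.
        ordered = sorted(scores, key=lambda e: (-scores[e], e))
        ranks = {edge: i + 1 for i, edge in enumerate(ordered)}
        for edge in all_edges:
            rank_sums[edge] += ranks.get(edge, worst_rank)
    n_methods = len(method_scores)
    mean_ranks = {edge: rank_sums[edge] / n_methods for edge in all_edges}
    return sorted(all_edges, key=lambda e: (mean_ranks[e], e))


# Hypothetical predictions: (regulator, target) -> confidence score.
predictions = {
    "method_a": {("tf1", "g1"): 0.9, ("tf1", "g2"): 0.4, ("tf2", "g3"): 0.2},
    "method_b": {("tf1", "g1"): 0.7, ("tf2", "g3"): 0.8},
}
consensus = average_rank_ensemble(predictions)
```

Ranks are used instead of raw scores because confidence values from different tools are generally not on a comparable scale.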

Abstract

Network inference, which is the reconstruction of biological networks from high-throughput data, can provide valuable information about the regulation of gene expression in cells. However, it is an underdetermined problem, as the number of possible interactions exceeds the number of independent measurements. Different state-of-the-art tools for network inference use specific assumptions and simplifications to deal with underdetermination, and these influence the inferences. The outcome of network inference therefore varies between tools and can be highly complementary. Here we categorize the available tools according to the strategies that they use to deal with the problem of underdetermination. This categorization provides insight into why a certain tool is more appropriate for a specific research question or data set.
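
A toy illustration of underdetermination, assuming a simple linear model in which a target gene's expression is a weighted sum of regulator expression levels (the model, gene names and numbers are all hypothetical and only serve to make the point concrete): with three candidate regulators but only two experiments, different regulator assignments reproduce the measurements exactly.

```python
# Rows are experiments; columns are expression levels of regulators tf_a, tf_b, tf_c.
regulator_expression = [
    [1.0, 2.0, 3.0],
    [2.0, 1.0, 3.0],
]
# Measured expression of the target gene in the same two experiments.
target_expression = [3.0, 3.0]


def predict(weights, expression_rows):
    """Linear model: target = sum over regulators of weight * regulator expression."""
    return [sum(w * x for w, x in zip(weights, row)) for row in expression_rows]


# Two different candidate "networks" (sets of regulator weights).
solution_1 = [1.0, 1.0, 0.0]  # tf_a and tf_b regulate the target
solution_2 = [0.0, 0.0, 1.0]  # only tf_c regulates the target

fits_1 = predict(solution_1, regulator_expression)
fits_2 = predict(solution_2, regulator_expression)
# Both candidate networks explain the data equally well.
```

Without an extra assumption (for example, preferring the solution with the fewest regulators), the data alone cannot distinguish between the two candidates.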

Figure 1: Categorization of different state-of-the-art methods for module and network inference.
Figure 2: Complementarity in the type of interactions inferred by direct and module-based inference methods.
Figure 3: The different characteristics of interactions inferred by expression-based and integrative network inference methods.
Figure 4: Complementarity in the type of interactions inferred by supervised versus unsupervised network inference methods.
Figure 5: The low overlap of the predictions made by different network inference methods that rely on different strategies.

Acknowledgements

We thank the anonymous reviewers as well as Y. Van de Peer and J. Vanderleyden for their useful comments on the manuscript. R.D.S. is a research assistant of the agency for Innovation by Science and Technology (IWT, Belgium). This work is further supported by the Katholieke Universiteit Leuven (GOA AMBioRICS, GOA/08/011, CoE EF/05/007, SymBioSys and CREA/08/023), by the IWT through the SBO-BioFrame project, by the Interuniversity Attraction Poles (IUAP, Belgium) (BioMaGNet grant P6/25), by the National Fund for Scientific Research (FWO, Belgium) (grant IOK-B9725-G.0329.09) and by the Human Frontier Science Program (grant HFSP-RGY0079/2007C).

Author information

Corresponding author

Correspondence to Kathleen Marchal.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary information S1 (table)

Overview of databases that store information on bacterial transcription regulation (PDF 236 kb)

Related links

DATABASES

Entrez Genome Project

Bacillus subtilis

Escherichia coli

Halobacterium salinarum

Saccharomyces cerevisiae

Salmonella enterica subsp. enterica serovar Typhimurium

Shewanella oneidensis

FURTHER INFORMATION

Kathleen Marchal's homepage

EcoCyc

DBTBS

RegulonDB

Glossary

Module inference

Identifying groups of co-expressed genes from gene expression data using clustering or biclustering algorithms.

Guilt-by-association principle

The assumption that genes with similar functions exhibit similar expression patterns. This allows the function of an unknown gene to be inferred from the function of annotated genes that are co-expressed with the unknown gene.

Expression modularity

Refers to the modular structure of the co-expression network. This network can be broken down into modules, or groups of co-expressed genes, the function of which can be separated from that of other modules.

Top-down network inference

Reverse engineering or de novo reconstruction of the structure of biological networks on a genome-wide scale by exploiting high-throughput data. By contrast, bottom-up regulatory network inference is the construction of a quantitative model from the data using a known, mathematically formalized connectivity network as input; estimating the kinetic parameters of this model from the data allows the dynamic behaviour of the network to be modelled.

Optimization strategy

A strategy used to screen the search space so that the optimal (or almost optimal) solution can be found without having to evaluate all possible solutions.

Search space

All possible solutions that need to be evaluated to find the one that is optimal according to preset criteria. In most inference problems, the number of possible solutions is prohibitively large and cannot be enumerated exhaustively.

Clustering

Grouping of genes that have similar expression patterns across all conditions.

Biclustering

Combining the selection of co-expressed gene sets with a condition selection step to infer the set of conditions that is relevant to the clustered genes.

Motif

TF-binding site or specific sequence tag that is recognized by a TF and is located in the promoter region of a gene.

Classification problem

A problem that can be solved by a system whereby properties or features of known targets and non-targets of a regulator are derived from high-throughput data and used to construct a classifier function (that is, a mathematical function that describes the relationship between the class labels, target versus non-target, and the corresponding properties of the high-throughput data). These classifier functions can then be used to predict whether or not a gene of interest is a target of the studied TF on the basis of its data properties.

Operonic regulator

Regulator dedicated to one specific operon.

De novo motif detection

Computational strategy to identify TF binding sites without any prior information on the sequence of the site. Such a strategy relies on certain subsequences being statistically over-represented in a set of co-regulated genes.
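
The over-representation idea can be sketched with a deliberately simple k-mer count. This is an illustrative assumption, not the algorithm of any specific motif-detection tool (real tools typically use probabilistic models); the sequences, the pseudocount and the scoring ratio are hypothetical.

```python
from collections import Counter


def kmer_presence_counts(sequences, k):
    """For each k-mer, count the number of sequences that contain it at least once."""
    counts = Counter()
    for seq in sequences:
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts


def overrepresented_kmers(coregulated, background, k, pseudocount=1.0):
    """Rank k-mers by the ratio of the fraction of co-regulated promoters containing
    them to the fraction of background promoters containing them."""
    fg = kmer_presence_counts(coregulated, k)
    bg = kmer_presence_counts(background, k)
    scores = {}
    for kmer, n_fg in fg.items():
        fg_frac = (n_fg + pseudocount) / (len(coregulated) + pseudocount)
        bg_frac = (bg.get(kmer, 0) + pseudocount) / (len(background) + pseudocount)
        scores[kmer] = fg_frac / bg_frac
    # Highest enrichment first; ties broken alphabetically.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical promoter fragments; the co-regulated set shares the subsequence TGACG.
coregulated_promoters = ["AATGACGT", "CCTGACGA", "TGACGGGT"]
background_promoters = ["AAAAAAAA", "CCCCGGGG", "ACACACAC", "GTGTGTGT"]
ranking = overrepresented_kmers(coregulated_promoters, background_promoters, k=5)
top_kmer, top_score = ranking[0]
```

In this toy data set the shared subsequence ranks first because it occurs in every co-regulated promoter and in none of the background promoters.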

Precision–recall curve

Customary method of comparing the precision and recall of a network inference method in order to evaluate the performance of an inference algorithm. The precision is the proportion of correctly inferred interactions, according to an external standard, out of the total number of predictions made. The recall is the proportion of the interactions that exist in the real network that are covered by the predictions.
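
The two quantities defined above can be computed as follows. This is a minimal sketch: the function names, the edge lists and the external standard are hypothetical, and the curve is obtained by evaluating the top-k predictions for increasing k.

```python
def precision_recall(predicted_edges, reference_edges):
    """Precision: correct predictions / all predictions.
    Recall: correct predictions / all interactions in the external standard."""
    predicted = set(predicted_edges)
    reference = set(reference_edges)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


def precision_recall_curve(ranked_edges, reference_edges):
    """Precision and recall after each of the top-k predictions, for k = 1..n."""
    return [precision_recall(ranked_edges[:k], reference_edges)
            for k in range(1, len(ranked_edges) + 1)]


# Hypothetical predictions, ordered from most to least confident.
ranked_predictions = [("tf1", "g1"), ("tf1", "g2"), ("tf2", "g3"), ("tf2", "g4")]
# Hypothetical external standard of known interactions.
known_interactions = {("tf1", "g1"), ("tf2", "g3"), ("tf3", "g5")}
curve = precision_recall_curve(ranked_predictions, known_interactions)
```

Because the external standard is itself incomplete, a prediction counted as incorrect here may still be a genuine, as yet undocumented, interaction.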

Cross-validation

Statistical technique that assesses the extent to which a model fitted on a certain data set can also predict the observations made on an independent data set.
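
A common variant is k-fold cross-validation, sketched below under simplifying assumptions: the folds are contiguous and unshuffled, and the "model" is just the mean of the training values. All names and numbers are hypothetical.

```python
def k_fold_indices(n_samples, n_folds):
    """Split indices 0..n_samples-1 into n_folds contiguous folds of near-equal size."""
    fold_sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
                  for i in range(n_folds)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(samples, n_folds, fit, error):
    """Fit on all folds but one, measure the error on the held-out fold, and repeat."""
    errors = []
    for held_out in k_fold_indices(len(samples), n_folds):
        held_out_set = set(held_out)
        training = [s for i, s in enumerate(samples) if i not in held_out_set]
        testing = [samples[i] for i in held_out]
        model = fit(training)
        errors.append(error(model, testing))
    return errors


def fit_mean(values):
    """Toy model: predict the mean of the training values."""
    return sum(values) / len(values)


def mean_squared_error(model, values):
    """Average squared difference between the model's prediction and held-out values."""
    return sum((v - model) ** 2 for v in values) / len(values)


expression_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
fold_errors = cross_validate(expression_values, 3, fit_mean, mean_squared_error)
```

In practice the samples are usually shuffled before splitting, so that each fold is representative of the whole data set.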

About this article

Cite this article

De Smet, R., Marchal, K. Advantages and limitations of current network inference methods. Nat Rev Microbiol 8, 717–729 (2010). https://doi.org/10.1038/nrmicro2419
