Key Points
-
Several novel tools for inferring transcriptional networks from expression data have recently been developed. Computationally inferred interactions are a useful complement to experimental findings, but inference tools are still rarely integrated into daily laboratory practice, because it is not obvious which network tool to choose.
-
Network inference is, mathematically, an underdetermined problem. The large number of theoretically possible interactions between transcription factors (TFs) and their targets far exceeds the number of independent measurements from which the true interactions can be inferred. Inference therefore results in many possible solutions that all explain the data equally well, but only a few of these solutions can be biologically true.
-
Different state-of-the-art tools for network inference deal with underdetermination by using assumptions and simplifications that reduce the number of possible solutions in order to make the problem solvable.
-
The strategy adopted to deal with the inference problem determines which aspects of the transcriptional network are highlighted and which types of research question can be answered. The outcome of network inference therefore varies greatly between tools.
-
Fair benchmark studies are useful for guiding both users and developers. Most current studies combine validation based on an external standard with medium-throughput experiments to assess the extent to which known interactions can be recovered and reliable new interactions can be inferred.
-
It is likely that no single best method exists, and different methods highlight complementary interaction types. Therefore, ensemble approaches, which aggregate the outcomes of several methods, offer a way to improve on the breadth and the accuracy of the predicted interactions.
-
As novel data generation procedures become available, future work will be to develop inference methods that exploit high-throughput information about regulation at levels other than transcription, in order to explain mechanistically how genomic variations result in observed expression changes.
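The ensemble idea in the key points above can be illustrated with a minimal sketch. It assumes that each inference method returns a confidence score per candidate TF-target edge and that the methods are combined by mean rank; this is only one simple aggregation scheme, not necessarily the one used in any of the studies discussed. The edge names and scores are hypothetical.

```python
from statistics import mean


def rank_edges(scores):
    """Map each edge to its rank (1 = most confident) under one method."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {edge: rank for rank, edge in enumerate(ordered, start=1)}


def ensemble_ranking(method_scores):
    """Order edges by their mean rank across methods (lower is better).

    An edge that a method does not predict at all receives a rank one
    worse than the total number of candidate edges (an assumption).
    """
    all_edges = set().union(*(s.keys() for s in method_scores))
    per_method_ranks = [rank_edges(s) for s in method_scores]
    worst = len(all_edges) + 1
    mean_rank = {
        edge: mean(ranks.get(edge, worst) for ranks in per_method_ranks)
        for edge in all_edges
    }
    # Ties are broken alphabetically so the output is deterministic.
    return sorted(all_edges, key=lambda edge: (mean_rank[edge], edge))


# Hypothetical confidence scores from three inference methods.
method_1 = {("tfA", "gene1"): 0.9, ("tfB", "gene1"): 0.5, ("tfA", "gene2"): 0.2}
method_2 = {("tfB", "gene1"): 0.8, ("tfA", "gene1"): 0.7, ("tfB", "gene2"): 0.1}
method_3 = {("tfA", "gene1"): 0.6, ("tfA", "gene2"): 0.4, ("tfB", "gene1"): 0.3}

consensus = ensemble_ranking([method_1, method_2, method_3])
print(consensus)
```

Working on ranks rather than raw scores avoids having to calibrate the score scales of the individual methods against each other.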
Abstract
Network inference, which is the reconstruction of biological networks from high-throughput data, can provide valuable information about the regulation of gene expression in cells. However, it is an underdetermined problem, as the number of possible interactions exceeds the number of independent measurements from which they can be inferred. Different state-of-the-art tools for network inference use specific assumptions and simplifications to deal with underdetermination, and these influence the inferences. The outcome of network inference therefore varies between tools and can be highly complementary. Here we categorize the available tools according to the strategies that they use to deal with the problem of underdetermination. Such a categorization offers insight into why a certain tool is more appropriate for the specific research question or data set at hand.
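A small numerical sketch may help to make the underdetermination concrete. It assumes a deliberately simplified linear model in which the expression of one target gene is a weighted sum of TF expression levels; the data are synthetic and the sizes (5 conditions, 20 candidate TFs) are hypothetical. With fewer measurements than candidate regulators, several different weight vectors reproduce the data equally well.

```python
import numpy as np

rng = np.random.default_rng(0)

n_conditions = 5   # independent expression measurements (hypothetical)
n_tfs = 20         # candidate regulators of one target gene (hypothetical)

# Rows are conditions, columns are TF expression levels (synthetic data).
tf_expr = rng.normal(size=(n_conditions, n_tfs))

# A sparse "true" network: only TFs 0 and 3 regulate the target.
true_w = np.zeros(n_tfs)
true_w[[0, 3]] = [1.5, -2.0]
target_expr = tf_expr @ true_w

# Minimum-norm least-squares solution: it fits the data, but it is not
# the sparse weight vector that generated the data.
w_min_norm, *_ = np.linalg.lstsq(tf_expr, target_expr, rcond=None)

# Adding any null-space direction of tf_expr leaves the fit unchanged.
_, _, vt = np.linalg.svd(tf_expr)
null_vec = vt[-1]  # the trailing rows of vt span the null space here
w_alt = w_min_norm + 3.0 * null_vec


def residual(w):
    """Euclidean norm of the misfit between predicted and observed target."""
    return float(np.linalg.norm(tf_expr @ w - target_expr))


for name, w in [("true", true_w), ("min-norm", w_min_norm), ("alternative", w_alt)]:
    print(name, residual(w))
```

All three weight vectors give a residual that is zero up to floating-point error, so the data alone cannot distinguish them; an additional assumption such as sparsity is needed to prefer one solution over the others.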
Acknowledgements
We thank the anonymous reviewers as well as Y. Van de Peer and J. Vanderleyden for their useful comments on the manuscript. R.D.S. is a research assistant of the agency for Innovation by Science and Technology (IWT, Belgium). This work is further supported by the Katholieke Universiteit Leuven (GOA AMBioRICS, GOA/08/011, CoE EF/05/007, SymBioSys and CREA/08/023), by the IWT through the SBO-BioFrame project, by the Interuniversity Attraction Poles (IUAP, Belgium) (BioMaGNet grant P6/25), by the National Fund for Scientific Research (FWO, Belgium) (grant IOK-B9725-G.0329.09) and by the Human Frontier Science Program (grant HFSP-RGY0079/2007C).
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary information S1 (table)
Overview of databases that store information on bacterial transcription regulation (PDF 236 kb)
Glossary
- Module inference
-
Identifying groups of co-expressed genes from gene expression data using clustering or biclustering algorithms.
- Guilt-by-association principle
-
The assumption that genes with similar functions exhibit similar expression patterns. This allows the function of an unknown gene to be inferred from the function of annotated genes that are co-expressed with the unknown gene.
- Expression modularity
-
Refers to the modular structure of the co-expression network. This network can be broken down into modules, or groups of co-expressed genes, the function of which can be separated from that of other modules.
- Top-down network inference
-
Reverse engineering or de novo reconstruction of the structure of biological networks on a genome-wide scale by exploiting high-throughput data. By contrast, bottom-up regulatory network inference is the construction of a quantitative model from the data using a known, mathematically formalized connectivity network as input; estimating the kinetic parameters of this model from the data allows the dynamic behaviour of the network to be modelled.
- Optimization strategy
-
A strategy used to screen the search space so that the optimal (or almost optimal) solution can be found without having to evaluate all possible solutions.
- Search space
-
All possible solutions that need to be evaluated to find the one that is optimal according to preset criteria. In most inference problems, the number of possible solutions is prohibitively large and cannot be enumerated exhaustively.
- Clustering
-
Grouping of genes that have similar expression patterns across all conditions.
- Biclustering
-
Combining the selection of co-expressed gene sets with a condition selection step to infer the set of conditions that is relevant to the clustered genes.
- Motif
-
TF-binding site or specific sequence tag that is recognized by a TF and is located in the promoter region of a gene.
- Classification problem
-
A problem that can be solved by a system whereby properties or features of known targets and non-targets of a regulator are derived from high-throughput data and used to construct a classifier function, that is, a mathematical function that describes the relationship between the class labels (being a target versus being a non-target) and the corresponding properties of the high-throughput data. These classifier functions can then be used to predict whether or not a gene of interest is a target of the studied TF on the basis of its data properties.
- Operonic regulator
-
Regulator dedicated to one specific operon.
- De novo motif detection
-
Computational strategy to identify TF binding sites without any prior information on the sequence of the site. Such a strategy relies on certain subsequences being statistically over-represented in a set of co-regulated genes.
- Precision–recall curve
-
Customary method of comparing the precision and recall of a network inference method in order to evaluate the performance of the inference algorithm. The precision is the proportion of correctly inferred interactions, according to an external standard, out of the total number of predictions made. The recall is the proportion of the existing interactions in the real network that is covered by the predictions.
- Cross-validation
-
Statistical technique that assesses the extent to which a model fitted on a certain data set can also predict the observations made on an independent data set.
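The precision and recall definitions in the glossary above can be made concrete with a short sketch. It assumes that predictions are given as a list of edges ordered from most to least confident and that the external standard is a set of known interactions; the edge names are hypothetical.

```python
def precision_recall(predicted, gold_standard):
    """Return (precision, recall) of a set of predicted edges."""
    predicted = set(predicted)
    gold = set(gold_standard)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def precision_recall_curve(ranked_edges, gold_standard):
    """Precision and recall after each successive prediction in a ranked list."""
    return [
        precision_recall(ranked_edges[:k], gold_standard)
        for k in range(1, len(ranked_edges) + 1)
    ]


# Hypothetical ranked predictions (most confident first) and known interactions.
ranked = [("tfA", "gene1"), ("tfA", "gene2"), ("tfB", "gene1"), ("tfB", "gene2")]
known = {("tfA", "gene1"), ("tfB", "gene1"), ("tfC", "gene3")}

for precision, recall in precision_recall_curve(ranked, known):
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

Because the external standard is usually incomplete, a prediction that counts against precision here may still be a true but not yet documented interaction.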
De Smet, R., Marchal, K. Advantages and limitations of current network inference methods. Nat Rev Microbiol 8, 717–729 (2010). https://doi.org/10.1038/nrmicro2419
DOI: https://doi.org/10.1038/nrmicro2419