Using MCL to Extract Clusters from Networks

Part of the book series: Methods in Molecular Biology (MIMB, volume 804)

Abstract

MCL is a general purpose cluster algorithm for both weighted and unweighted networks. The algorithm utilises network topology as well as edge weights, is highly scalable and has been applied in a wide variety of bioinformatic methods. In this chapter, we give protocols and case studies for clustering of networks derived from, respectively, protein sequence similarities and gene expression profile correlations.
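
The abstract does not spell out the algorithm itself. As a rough illustration of how topology and edge weights can both drive the clustering, the sketch below follows the general description of MCL as alternating expansion (squaring a column-stochastic matrix) and inflation (raising entries to a power and renormalising columns). It is a minimal pure-Python sketch, not the mcl software: the function names, the added self-loops of weight 1, the convergence tolerance, and the simplified reading of clusters from the nonzero rows of the limit matrix are all assumptions made for illustration.

```python
def normalize_columns(m):
    """Scale each column of a square matrix so that it sums to 1."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m):
    """Expansion: square the matrix, i.e. take two steps of a random walk."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, r):
    """Inflation: raise every entry to the power r, then renormalise columns."""
    return normalize_columns([[x ** r for x in row] for row in m])


def mcl(weights, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster a graph given as a symmetric matrix of non-negative edge weights.

    Hypothetical helper for illustration only; returns a sorted list of
    clusters, each a sorted list of node indices.
    """
    n = len(weights)
    # Assumption: add a self-loop of weight 1 to every node before normalising.
    m = [[weights[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        nxt = inflate(expand(m), inflation)
        delta = max(abs(nxt[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    # Simplified interpretation: each row that still carries flow names the
    # set of nodes (columns) it attracts; identical sets are merged.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Two triangles (nodes 0-2 and 3-5) joined by one weak edge between 2 and 3.
bridged = [
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]
print(mcl(bridged))
```

In this sketch a larger `inflation` value tends to produce finer clusters. Sets read off different rows may overlap, which this simplified interpretation does not resolve.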

Corresponding author

Correspondence to Stijn van Dongen.

Copyright information

© 2012 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

van Dongen, S., Abreu-Goodger, C. (2012). Using MCL to Extract Clusters from Networks. In: van Helden, J., Toussaint, A., Thieffry, D. (eds) Bacterial Molecular Networks. Methods in Molecular Biology, vol 804. Springer, New York, NY. https://doi.org/10.1007/978-1-61779-361-5_15

  • DOI: https://doi.org/10.1007/978-1-61779-361-5_15

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-61779-360-8

  • Online ISBN: 978-1-61779-361-5

  • eBook Packages: Springer Protocols
