Community Finding with Applications on Phylogenetic Networks

  • Conference paper
XV Mediterranean Conference on Medical and Biological Engineering and Computing – MEDICON 2019 (MEDICON 2019)

Part of the book series: IFMBE Proceedings (IFMBE, volume 76)

Abstract

With the advent of high-throughput sequencing methods, new ways of visualizing and analyzing increasing amounts of data are needed. Although some software tools already exist, they either do not scale well or require advanced skills to be useful in phylogenetics.

The aim of this thesis was to implement three community finding algorithms (Louvain, Infomap and Layered Label Propagation, LLP); to benchmark them on two synthetic networks, Girvan-Newman (GN) and Lancichinetti-Fortunato-Radicchi (LFR); to test them on real networks, in particular one derived from a Staphylococcus aureus MLST dataset; to compare two visualization frameworks, Cytoscape.js and D3.js; and, finally, to make everything available online (mscthesis.herokuapp.com).
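
To make the GN benchmark concrete, the following is a minimal Python sketch of a GN-style planted-partition generator. It is not the generator developed in the thesis (which was written in JavaScript and is not shown here); the function name `gn_benchmark`, its parameters, and the defaults (4 groups of 32 nodes, average degree 16, as in the classic GN setup) are assumptions made for illustration. The `mixing` argument is taken to be the expected fraction of a node's edges that leave its own group.

```python
import random
from itertools import combinations


def gn_benchmark(mixing, n_groups=4, group_size=32, avg_degree=16, seed=0):
    """Return (edges, membership) for a GN-style planted-partition graph.

    Hypothetical helper: `mixing` is the expected fraction of each node's
    edges that connect to nodes outside its own group.
    """
    rng = random.Random(seed)
    n = n_groups * group_size
    # Node v belongs to group v // group_size (the planted ground truth).
    membership = [v // group_size for v in range(n)]

    # Split the expected degree into external and internal parts.
    z_out = mixing * avg_degree
    z_in = avg_degree - z_out

    # Edge probabilities chosen so the expected degrees match z_in and z_out.
    p_in = z_in / (group_size - 1)
    p_out = z_out / (n - group_size)

    edges = []
    for u, v in combinations(range(n), 2):
        p = p_in if membership[u] == membership[v] else p_out
        if rng.random() < p:
            edges.append((u, v))
    return edges, membership


if __name__ == "__main__":
    edges, membership = gn_benchmark(mixing=0.2)
    external = sum(1 for u, v in edges if membership[u] != membership[v])
    print("nodes:", len(membership))
    print("average degree:", 2 * len(edges) / len(membership))
    print("fraction of external edges:", external / len(edges))
```

With low mixing the four planted groups are clearly separated; as mixing grows, an increasing share of edges crosses group boundaries and the planted partition becomes harder to recover.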

Louvain, Infomap and LLP were implemented in JavaScript. Unless otherwise stated, the following conclusions hold for both GN and LFR. In terms of speed, Louvain outperformed all others. In terms of accuracy, Louvain was the most accurate in networks with well-defined communities, whereas LLP was the best for higher mixing. In highly mixed GN networks, unlike weakly mixed ones, increasing the resolution parameter is advantageous. In LFR, higher resolution decreases detection accuracy, independently of the mixing parameter. Increasing the average node degree improved partitioning accuracy and suggested that detection by chance was minimized. Generating GN networks with higher mixing or average degree is computationally more intensive, whether with the algorithm developed in the thesis or with the LFR implementation. In the S. aureus network, Louvain was both the fastest and the most accurate at detecting the clusters of seven groups of strains that evolved directly from the common ancestor.
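
The role of a resolution parameter can be illustrated with a small Python sketch. The abstract does not state how the parameter enters each algorithm, so this assumes the common generalized modularity used with Louvain, Q(γ) = Σ_c [L_c/m − γ (d_c/2m)²], where L_c is the number of edges inside community c, d_c is the total degree of its nodes, m is the number of edges and γ is the resolution. The function name `modularity` and the toy graph are hypothetical.

```python
from collections import defaultdict


def modularity(edges, membership, resolution=1.0):
    """Generalized modularity of a partition of an undirected, unweighted graph.

    Assumed form: sum over communities of L_c/m - resolution * (d_c / 2m)^2.
    """
    m = len(edges)
    intra = defaultdict(int)   # L_c: edges with both endpoints in community c
    degree = defaultdict(int)  # d_c: summed degree of the nodes in community c
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    return sum(
        intra[c] / m - resolution * (degree[c] / (2 * m)) ** 2
        for c in degree
    )


if __name__ == "__main__":
    # Toy graph: two triangles joined by a single bridge edge (2, 3).
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    split = [0, 0, 0, 1, 1, 1]   # one community per triangle
    merged = [0, 0, 0, 0, 0, 0]  # everything in a single community
    for gamma in (0.2, 1.0, 2.0):
        print(gamma, modularity(edges, split, gamma), modularity(edges, merged, gamma))
```

For this toy graph the split partition scores 6/7 − γ/2 and the merged one scores 1 − γ, so the split is preferred only when γ exceeds 2/7. Under this formulation, raising the resolution favors more, smaller communities, and lowering it favors fewer, larger ones.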



Author information

Corresponding author

Correspondence to Luís Rita.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Rita, L., Francisco, A., Carriço, J., Borges, V. (2020). Community Finding with Applications on Phylogenetic Networks. In: Henriques, J., Neves, N., de Carvalho, P. (eds) XV Mediterranean Conference on Medical and Biological Engineering and Computing – MEDICON 2019. MEDICON 2019. IFMBE Proceedings, vol 76. Springer, Cham. https://doi.org/10.1007/978-3-030-31635-8_234

  • DOI: https://doi.org/10.1007/978-3-030-31635-8_234

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-31634-1

  • Online ISBN: 978-3-030-31635-8

  • eBook Packages: Engineering, Engineering (R0)
