DOI: 10.1145/1281192.1281280

SCAN: a structural clustering algorithm for networks

Published: 12 August 2007

ABSTRACT

Network clustering (or graph partitioning) is an important task for discovering underlying structure in networks. Many algorithms find clusters by maximizing the number of intra-cluster edges. While such algorithms find useful and interesting structures, they tend to fail to identify and isolate two kinds of vertices that play special roles: vertices that bridge clusters (hubs) and vertices that are only marginally connected to clusters (outliers). Identifying hubs is useful for applications such as viral marketing and epidemiology, since hubs are responsible for spreading ideas or disease. In contrast, outliers have little or no influence and may be isolated as noise in the data. In this paper, we propose a novel algorithm called SCAN (Structural Clustering Algorithm for Networks), which detects clusters, hubs, and outliers in networks. It clusters vertices based on a structural similarity measure. The algorithm is fast and efficient, visiting each vertex only once. An empirical evaluation of the method on both synthetic and real datasets demonstrates superior performance over other methods such as modularity-based algorithms.
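The abstract does not spell out the structural similarity measure. As an illustration only, the sketch below assumes the formulation usually associated with SCAN: the size of the overlap between the closed neighborhoods of two vertices (each vertex counted as its own neighbor), normalized by the geometric mean of the neighborhood sizes. The function names, the adjacency-dict representation, the toy graph, and the precomputed `labels` mapping are all hypothetical. The clustering step itself and its thresholds are omitted, so this is not the authors' implementation.

```python
from math import sqrt

def structural_similarity(adj, v, w):
    """Overlap of the closed neighborhoods of v and w, normalized
    by the geometric mean of their sizes (assumed formulation)."""
    nv = set(adj[v]) | {v}  # closed neighborhood: neighbors plus v itself
    nw = set(adj[w]) | {w}
    return len(nv & nw) / sqrt(len(nv) * len(nw))

def classify_unclustered(adj, labels, v):
    """Label a vertex that belongs to no cluster: 'hub' if its
    neighbors fall in two or more clusters, otherwise 'outlier'."""
    neighbor_clusters = {labels[u] for u in adj[v] if u in labels}
    return "hub" if len(neighbor_clusters) >= 2 else "outlier"

# Toy graph (hypothetical): triangles {a, b, c} and {d, e, f},
# bridged by vertex h; vertex z hangs off a.
adj = {
    "a": ["b", "c", "h", "z"], "b": ["a", "c"], "c": ["a", "b"],
    "d": ["e", "f", "h"], "e": ["d", "f"], "f": ["d", "e"],
    "h": ["a", "d"], "z": ["a"],
}

# Cluster membership is assumed to be already known here.
labels = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

print(structural_similarity(adj, "b", "c"))
print(classify_unclustered(adj, labels, "h"))
print(classify_unclustered(adj, labels, "z"))
```

In this toy graph, `b` and `c` share an identical closed neighborhood, so their similarity is maximal. Vertex `h` touches both triangles and is labeled a hub, while `z` touches only one and is labeled an outlier, matching the roles the abstract describes.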

Supplemental Material

p824-xu-200.mov (mov, 36 MB)
p824-xu-768.mov (mov, 119.8 MB)

    Published in

      KDD '07: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
      August 2007
      1080 pages
      ISBN: 9781595936097
      DOI: 10.1145/1281192

      Copyright © 2007 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • Article

      Acceptance Rates

      KDD '07 paper acceptance rate: 111 of 573 submissions, 19%
      Overall acceptance rate: 1,133 of 8,635 submissions, 13%
