SCAN: a structural clustering algorithm for networks

Authors:
Xiaowei Xu, University of Arkansas at Little Rock
Nurcan Yuruk, University of Arkansas at Little Rock
Zhidan Feng, University of Arkansas at Little Rock
Thomas A. J. Schweiger, Acxiom Corporation

KDD '07: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, August 2007, Pages 824–833
https://doi.org/10.1145/1281192.1281280

Published: 12 August 2007

ABSTRACT

Network clustering (or graph partitioning) is an important task for the discovery of underlying structures in networks. Many algorithms find clusters by maximizing the number of intra-cluster edges. While such algorithms find useful and interesting structures, they tend to fail to identify and isolate two kinds of vertices that play special roles: vertices that bridge clusters (hubs) and vertices that are marginally connected to clusters (outliers). Identifying hubs is useful for applications such as viral marketing and epidemiology, since hubs are responsible for spreading ideas or disease. In contrast, outliers have little or no influence and may be isolated as noise in the data. In this paper, we propose a novel algorithm called SCAN (Structural Clustering Algorithm for Networks), which detects clusters, hubs and outliers in networks. It clusters vertices based on a structural similarity measure. The algorithm is fast and efficient, visiting each vertex only once. An empirical evaluation of the method using both synthetic and real datasets demonstrates superior performance over other methods such as modularity-based algorithms.
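
The abstract does not spell out the similarity measure or the clustering procedure, so the following is only a minimal sketch of the idea, not the authors' implementation. It assumes the structural similarity of two vertices is the size of the overlap of their closed neighborhoods (each vertex counted as its own neighbor) divided by the geometric mean of the neighborhood sizes, and that clusters grow from "core" vertices in a density-based fashion. The names `build_graph`, `structural_similarity` and `scan`, the parameters `eps` and `mu`, and the toy graph are all illustrative assumptions.

```python
from collections import deque
from math import sqrt


def build_graph(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def structural_similarity(adj, v, w):
    """Assumed measure: shared closed neighborhood, normalized by the
    geometric mean of the two closed-neighborhood sizes."""
    nv, nw = adj[v] | {v}, adj[w] | {w}
    return len(nv & nw) / sqrt(len(nv) * len(nw))


def scan(adj, eps=0.7, mu=3):
    """Return (labels, hubs, outliers). eps and mu are assumed parameters:
    a vertex is a core if at least mu vertices of its closed neighborhood
    (itself included) have similarity >= eps to it."""

    def eps_neighborhood(v):
        return {w for w in adj[v] | {v}
                if structural_similarity(adj, v, w) >= eps}

    labels = {}   # vertex -> cluster id
    next_id = 0
    for v in adj:
        if v in labels or len(eps_neighborhood(v)) < mu:
            continue  # already clustered, or not a core vertex
        labels[v] = next_id
        queue = deque([v])
        while queue:
            u = queue.popleft()
            reachable = eps_neighborhood(u)
            if len(reachable) < mu:
                continue  # border vertex: joins the cluster, does not expand it
            for w in reachable:
                if w not in labels:
                    labels[w] = next_id
                    queue.append(w)
        next_id += 1

    # Vertices left without a cluster: a hub touches two or more clusters,
    # an outlier touches at most one.
    hubs, outliers = set(), set()
    for v in adj:
        if v in labels:
            continue
        touched = {labels[w] for w in adj[v] if w in labels}
        (hubs if len(touched) >= 2 else outliers).add(v)
    return labels, hubs, outliers


# Toy graph: two 4-cliques, vertex 8 bridging them, vertex 9 hanging off 0.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),
         (8, 0), (8, 4), (9, 0)]
adj = build_graph(edges)
labels, hubs, outliers = scan(adj, eps=0.7, mu=3)
print(labels, hubs, outliers)
```

This sketch recomputes similarities on demand for readability, so it makes no attempt to match the efficiency claimed in the abstract. The point is only that vertices 8 and 9, which an edge-counting method would have to assign to some cluster, can instead be reported separately as a bridge and as noise.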

Index Terms

Computing methodologies > Machine learning > Learning paradigms > Unsupervised learning > Cluster analysis

Published in
KDD '07: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2007
1080 pages
ISBN: 9781595936097
DOI: 10.1145/1281192
General Chair: Pavel Berkhin (Yahoo!, USA)
Program Chairs: Rich Caruana (Cornell University, USA), Xindong Wu (University of Vermont, USA)
Copyright © 2007 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
community structure
graph partitioning
hubs
network clustering
outliers
Qualifiers
- Article
Acceptance Rates
KDD '07 paper acceptance rate: 111 of 573 submissions, 19%
Overall acceptance rate: 1,133 of 8,635 submissions, 13%