Article

Mining closed relational graphs with connectivity constraints

Authors:
Xifeng Yan, University of Illinois at Urbana-Champaign, Urbana, IL
X. Jasmine Zhou, University of Southern California, Los Angeles, CA
Jiawei Han, University of Illinois at Urbana-Champaign, Urbana, IL

KDD '05: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, August 2005, Pages 324–333
https://doi.org/10.1145/1081870.1081908

Published: 21 August 2005

ABSTRACT

Relational graphs are widely used to model large-scale networks such as biological and social networks. In such graphs, connectivity is critical for identifying highly associated groups and clusters. In this paper, we investigate the problem of mining closed frequent graphs with connectivity constraints in massive relational graphs, where each graph has around 10K nodes and 1M edges. We adopt the concept of edge connectivity and apply results from graph theory to speed up the mining process. Two approaches are developed to handle different mining requests: CloseCut, a pattern-growth approach, and Splat, a pattern-reduction approach. We have applied these methods to biological datasets and found the discovered patterns interesting.
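
To make the notion of edge connectivity concrete, here is a minimal Python sketch. It is not the authors' CloseCut or Splat implementation, only an illustration of the constraint such miners would check. The edge connectivity of an undirected graph is the smallest number of edges whose removal disconnects it. The sketch computes it with the Stoer-Wagner minimum cut procedure, chosen here as one standard option. The function names `edge_connectivity` and `satisfies_connectivity` and the threshold parameter `k` are hypothetical and do not come from the paper.

```python
def edge_connectivity(nodes, edges):
    """Size of a global minimum edge cut of an undirected, unweighted graph.

    Uses the Stoer-Wagner procedure: repeat a "minimum cut phase" that
    orders vertices by how strongly they attach to the growing set, record
    the cut isolating the last vertex, then merge the last two vertices.
    """
    nodes = list(nodes)
    if len(nodes) < 2:
        return 0
    # weight[u][v] = number of edges between (merged) vertices u and v
    weight = {u: {} for u in nodes}
    for u, v in edges:
        if u == v:
            continue  # self-loops never cross a cut
        weight[u][v] = weight[u].get(v, 0) + 1
        weight[v][u] = weight[v].get(u, 0) + 1

    best = float("inf")
    active = set(nodes)
    while len(active) > 1:
        start = next(iter(active))
        # attach[v] = total edge weight between v and the vertices added so far
        attach = {v: weight[start].get(v, 0) for v in active if v != start}
        prev, last = None, start
        cut_of_phase = 0
        while attach:
            nxt = max(attach, key=attach.get)  # most tightly attached vertex
            cut_of_phase = attach.pop(nxt)
            prev, last = last, nxt
            for v, w in weight[nxt].items():
                if v in attach:
                    attach[v] += w
        best = min(best, cut_of_phase)
        # Merge `last` into `prev`, summing parallel edge weights.
        for v, w in weight[last].items():
            if v == prev:
                continue
            weight[prev][v] = weight[prev].get(v, 0) + w
            weight[v][prev] = weight[v].get(prev, 0) + w
        for v in list(weight[last]):
            del weight[v][last]
        del weight[last]
        active.remove(last)
    return best


def satisfies_connectivity(nodes, edges, k):
    """True if the graph's edge connectivity is at least k (hypothetical helper)."""
    return edge_connectivity(nodes, edges) >= k


# A 4-clique is 3-edge-connected; attaching a pendant vertex drops it to 1.
clique = [(a, b) for a in range(4) for b in range(a + 1, 4)]
print(edge_connectivity(range(4), clique))
print(edge_connectivity(range(5), clique + [(3, 4)]))
```

In a mining setting, a check like `satisfies_connectivity` would act as the filter that keeps only patterns whose edge connectivity meets the requested threshold. How the paper integrates that check into pattern growth or pattern reduction is not described in the abstract and is not modelled here.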

Index Terms

Mining closed relational graphs with connectivity constraints
1. Information systems
  1. Information systems applications
    1. Data mining

Published in
KDD '05: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
August 2005
844 pages
ISBN: 159593135X
DOI: 10.1145/1081870
General Chair: Robert Grossman (University of Illinois at Chicago & Open Data Partners, USA)
Program Chairs: Roberto Bayardo (IBM Almaden Research, USA), Kristin Bennett (RPI, USA)
Copyright © 2005 ACM
Publisher: Association for Computing Machinery, New York, NY, United States

Author Tags
closed pattern
connectivity
graph