Abstract
In a previous paper, we laid out the vision of a novel graph query processing paradigm in which, instead of processing a visual query graph after its construction, the system interleaves visual query formulation and processing by exploiting the latency offered by the GUI to filter irrelevant matches and prefetch partial query results [8]. Our recent attempts at implementing this vision [8, 9] show significant improvement in system response time (SRT) for subgraph queries. However, these efforts are designed specifically for graph databases containing a large collection of small or medium-sized graphs. In this paper, we propose a novel algorithm called QUBLE (QUery Blender for Large nEtworks) to realize this visual subgraph querying paradigm on very large networks (e.g., protein interaction networks, social networks). First, it decomposes a large network into a set of graphlets and supergraphlets using a minimum cut-based graph partitioning technique. Next, it mines approximate frequent fragments and small infrequent fragments (SIFs) from them and identifies their occurrences in these graphlets and supergraphlets. Then, the indexing framework of [9] is enhanced so that the mined fragments can be exploited to index graphlets for efficient blending of visual subgraph query formulation and query processing. Extensive experiments on large networks demonstrate the effectiveness of QUBLE.
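The blending idea described in the abstract can be illustrated with a minimal sketch. It is not the QUBLE algorithm: it assumes a toy set of graphlets, indexes them by single labeled edges rather than by mined frequent fragments or SIFs, and the names `GRAPHLETS`, `build_edge_index`, and `BlendedQuery` are hypothetical. It only shows how candidate graphlets can be pruned after every edge the user draws, so that little work remains when Run is pressed. The surviving candidates are a superset of the true matches and would still need verification, which is not shown.

```python
from collections import defaultdict


def edge(a, b):
    """Undirected labeled edge, represented by the labels of its endpoints."""
    return frozenset((a, b))


# Hypothetical toy data: each graphlet is a set of labeled edges.
GRAPHLETS = {
    "g1": {edge("A", "B"), edge("B", "C"), edge("C", "D")},
    "g2": {edge("A", "B"), edge("B", "D")},
    "g3": {edge("B", "C"), edge("C", "D")},
}


def build_edge_index(graphlets):
    """Map every labeled edge to the ids of the graphlets that contain it."""
    index = defaultdict(set)
    for gid, edges in graphlets.items():
        for e in edges:
            index[e].add(gid)
    return index


class BlendedQuery:
    """Prunes candidate graphlets each time the user adds a query edge."""

    def __init__(self, index, all_ids):
        self.index = index
        self.candidates = set(all_ids)  # before any edge, everything qualifies
        self.edges = []

    def add_edge(self, a, b):
        # Runs during GUI latency: keep only graphlets containing this edge.
        e = edge(a, b)
        self.edges.append(e)
        self.candidates &= self.index.get(e, set())
        return set(self.candidates)

    def run(self):
        # Pressing Run only has to return the already-filtered candidates.
        return sorted(self.candidates)


if __name__ == "__main__":
    query = BlendedQuery(build_edge_index(GRAPHLETS), GRAPHLETS)
    print(query.add_edge("A", "B"))
    print(query.add_edge("B", "C"))
    print(query.run())
```

In this sketch the per-edge filtering cost is paid while the user is still drawing, which is the latency the paradigm exploits; the real system replaces the single-edge index with the fragment-based indexes described above.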
Notes
In this paper, we assume an “edge-at-a-time” visual query formulation interface. A more advanced and domain-dependent GUI may support drag and drop of canned patterns or subgraphs (e.g., benzene ring) for composing visual queries. Such a visual query composition interface is beyond the scope of this work.
Duration from the time a user presses the Run icon to the time the user gets the query results [8].
PRAGUE adopts the maximum connected common subgraph (MCCS) for computing similarity between a pair of graphs.
QUBLE is a game where players twist and turn a cube to build words in 60-second rounds using letters worth varying point values. Each player must decide how best to allocate their play time between moving the cube and searching for and calling out each word. In our visual querying paradigm, we also decide how best to manage the time between visual actions and searching for query fragment matches.
An extension of TreeSpan is claimed to support vertex mismatches. However, the details are not discussed in [20].
A video of QUBLE is available at http://youtu.be/4k4XBxxdD_4. It was also demonstrated at SIGMOD 2013 [7].
Note that we do not monitor the action for modifying the query fragment here, as the procedure is the same as in PRAGUE [9].
For clarity, we distinguish between a node in a query (data) graph fragment and a node in action-aware indexes and g-spigs by using the terms “node” and “vertex”, respectively.
Recall that SAPPER [19] only retrieves similarity matches containing the same number of nodes as the query graph (Sect. 2.2). In contrast, QUBLE allows retrieval of similar subgraphs that do not necessarily have the same number of nodes as the query. Hence, the candidate data graphs as well as the result sets generated by these approaches are different and incomparable.
Here, we ignore the “user thinking time.” As the thinking time increases, the latency offered by the GUI at each step increases as well.
Downloaded from http://snap.stanford.edu/data/com-Youtube.html.
References
Bhowmick, S.S. et al.: VOGUE: towards a visual interaction-aware graph query processing framework. In: CIDR (2013)
Cheng, J., Ke, Y., Ng, W., Lu, A.: FG-Index: towards verification-free query processing on graph databases. In: SIGMOD (2007)
Cordella, L.P., Foggia, P., Sansone, C., Vento, M.: An improved algorithm for matching large graphs. In: 3rd IAPR TC-15 workshop on graph-based representations in pattern recognition (2001)
Han, W.-S. et al.: iGraph: a framework for comparisons of disk based graph indexing techniques. In: VLDB (2010)
He, H., Singh, A.K.: Graphs-at-a-time: query language and access methods for graph databases. In: SIGMOD (2008)
Huan, J.P., Wang, W.: Efficient mining of frequent subgraph in the presence of isomorphism. In: ICDM (2003)
Hung, H., Bhowmick, S.S. et al.: QUBLE: blending visual subgraph query formulation with query processing on large networks. In: SIGMOD (2013)
Jin, C. et al.: GBLENDER: towards blending visual query formulation and query processing in graph databases. In: SIGMOD (2010)
Jin, C. et al.: Prague: a practical framework for blending visual subgraph query formulation and query processing. In: ICDE (2012)
Karypis, G., Kumar, V.: A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput. 20(1), 359–392 (1999)
Shang, H. et al.: Connected substructure similarity search. In: SIGMOD (2010)
Tian, Y., Patel, J.: TALE: a tool for approximate large graph matching. In: ICDE (2008)
Xie, Y., Yu, P.S.: CP-Index: on the efficient indexing of large graphs. In: CIKM (2011)
Yan, X., Han, J.: gSpan: graph-based substructure pattern mining. In: ICDM (2002)
Yan, X. et al.: Graph indexing: a frequent structure-based approach. In: SIGMOD (2004)
Yan, X. et al.: Substructure similarity search in graph databases. In: SIGMOD (2005)
Zeng, Z., Tung, A.K.H., Wang, J. et al.: Comparing stars: on approximating graph edit distance. In: VLDB (2009)
Zhang, S. et al.: GADDI: distance index based subgraph matching in biological networks. In: EDBT, pp. 192–203 (2009)
Zhang, S. et al.: SAPPER: subgraph indexing and approximate matching in large graphs. In: VLDB (2010)
Zhu, G., Lin, X., Zhu, K. et al.: TreeSpan: efficiently computing similarity all-matching. In: SIGMOD (2012)
Cite this article
Hung, H.H., Bhowmick, S.S., Truong, B.Q. et al. QUBLE: towards blending interactive visual subgraph search queries on large networks. The VLDB Journal 23, 401–426 (2014). https://doi.org/10.1007/s00778-013-0322-1
DOI: https://doi.org/10.1007/s00778-013-0322-1