Subgraph similarity maximal all-matching over a large uncertain graph

World Wide Web

Abstract

Recently, uncertain graph data management and mining techniques have attracted significant interest and research effort because of potential applications such as protein interaction networks and social networks. As a fundamental problem, subgraph similarity all-matching is widely applied in exploratory data analysis. Its purpose is to find all similarity occurrences of a query graph in a large data graph. Numerous algorithms and pruning methods have been developed for the subgraph matching problem over a certain graph. However, little effort has been devoted to subgraph similarity all-matching over an uncertain data graph, which is challenging because of its high computation cost. In this paper, we define the problem of subgraph similarity maximal all-matching over a large uncertain data graph and propose a framework to solve it. To further improve efficiency, we propose several speed-up techniques: partial graph evaluation, vertex pruning, calculation model transformation, incremental evaluation, and probability upper bound filtering. Finally, comprehensive experiments are conducted on real graph data to test the performance of our framework and optimization methods. The results verify that our solutions outperform the basic approach by orders of magnitude in efficiency.
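To make the problem statement concrete, the following is a minimal brute-force sketch and not the framework proposed in the paper. It rests on assumptions the abstract does not state: each data edge exists independently with a given probability, a "similarity occurrence" is an injective vertex mapping that misses at most `max_missing` query edges, and a match is kept only if the product of its matched edge probabilities reaches `min_prob`. The function name `similarity_matches` and all parameter names are hypothetical. The pruning step is a generic form of probability upper bound filtering: because every probability is at most 1, the product over a partial mapping can only shrink as the mapping grows, so it bounds the final value from above.

```python
def similarity_matches(query_edges, n_query, data_edges, data_vertices,
                       max_missing, min_prob):
    """Enumerate injective mappings of query vertices 0..n_query-1 onto data
    vertices, allowing at most `max_missing` unmatched query edges.

    data_edges maps frozenset({u, v}) to an existence probability
    (assumed independent across edges; this is an illustrative model).
    Returns a list of (mapping_tuple, probability) pairs.
    """
    results = []

    def extend(mapping, missing, prob):
        k = len(mapping)  # index of the next query vertex to place
        if k == n_query:
            results.append((tuple(mapping), prob))
            return
        for v in data_vertices:
            if v in mapping:  # keep the mapping injective
                continue
            new_missing, new_prob = missing, prob
            # Check each query edge that joins vertex k to an earlier vertex.
            for a, b in query_edges:
                if max(a, b) != k:
                    continue
                other = min(a, b)
                p = data_edges.get(frozenset((mapping[other], v)))
                if p is None:
                    new_missing += 1   # query edge absent from the data graph
                else:
                    new_prob *= p      # matched edge must exist
            # Prune: too many missing edges, or the partial product (an upper
            # bound on the final probability) is already below the threshold.
            if new_missing > max_missing or new_prob < min_prob:
                continue
            extend(mapping + [v], new_missing, new_prob)

    extend([], 0, 1.0)
    return results


# Toy uncertain data graph on vertices 0..3 (values chosen for illustration).
data_edges = {
    frozenset((0, 1)): 0.9,
    frozenset((1, 2)): 0.8,
    frozenset((0, 2)): 0.5,
    frozenset((2, 3)): 0.9,
}
data_vertices = [0, 1, 2, 3]
query_edges = [(0, 1), (1, 2)]  # a path on three query vertices

exact = similarity_matches(query_edges, 3, data_edges, data_vertices,
                           max_missing=0, min_prob=0.7)
relaxed = similarity_matches(query_edges, 3, data_edges, data_vertices,
                             max_missing=1, min_prob=0.7)
for mapping, prob in exact:
    print(mapping, round(prob, 3))
```

The exhaustive search is exponential in the query size, which is the cost that the speed-up techniques listed in the abstract are meant to reduce; the sketch only illustrates what is being computed, under the assumptions above.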



Acknowledgments

This work was supported by the National Basic Research Program of China (973 Program) under Grant No. 2012CB316201, the National Natural Science Foundation of China (61472071, 61272179), and the Fundamental Research Funds for the Central Universities (N130404010).

Author information

Corresponding author

Correspondence to Yu Gu.

Cite this article

Gu, Y., Gao, C., Wang, L. et al. Subgraph similarity maximal all-matching over a large uncertain graph. World Wide Web 19, 755–782 (2016). https://doi.org/10.1007/s11280-015-0358-9

  • DOI: https://doi.org/10.1007/s11280-015-0358-9
