Big-Graphs: Querying, Mining, and Beyond

Handbook of Big Data Technologies

Abstract

Graphs are a ubiquitous model for representing objects and their relations. However, the complex combination of structure and content, coupled with the massive volume, high streaming rate, and uncertainty inherent in the data, raises several challenges that call for smarter and faster graph analysis. With the advent of complex networks such as the World Wide Web, social networks, knowledge graphs, genome and scientific databases, the Internet of Things, and medical and government records, novel graph computations are also emerging, including graph pattern matching and mining, similarity search, keyword search, and graph query-by-example. These workloads require both the topology and the content of the network; hence, they differ from classical graph computations such as shortest path, reachability, and minimum cut, which depend only on the structure of the network. In this chapter, we describe the emerging graph querying and mining problems, their applications, and the techniques used to solve them. We emphasize the current challenges and highlight some future research directions.
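
To make the distinction between structure-only and content-aware workloads concrete, the following minimal Python sketch contrasts a reachability test, which looks only at edges, with a brute-force labeled pattern match, which needs both edges and node labels. The toy graph, its node names, its labels, and the helper names `reachable` and `match_pattern` are all hypothetical and chosen for illustration; the exhaustive matcher is exponential and is not one of the techniques surveyed in the chapter.

```python
from collections import deque
from itertools import permutations

# Toy directed graph: adjacency lists (topology) plus node labels (content).
# All node names and labels are made up for illustration.
edges = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": [],
}
labels = {"a": "person", "b": "paper", "c": "person", "d": "venue"}


def reachable(src, dst):
    """Structure-only query: is there a directed path from src to dst?"""
    seen, queue = {src}, deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return True
        for v in edges[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False


def match_pattern(p_edges, p_labels):
    """Content-plus-structure query: brute-force labeled subgraph matching.

    Returns every injective mapping of pattern nodes to graph nodes that
    preserves node labels and pattern edges. Exponential; illustration only.
    """
    p_nodes = list(p_labels)
    results = []
    for cand in permutations(edges, len(p_nodes)):
        m = dict(zip(p_nodes, cand))
        labels_ok = all(labels[m[x]] == p_labels[x] for x in p_nodes)
        edges_ok = all(m[y] in edges[m[x]] for x, y in p_edges)
        if labels_ok and edges_ok:
            results.append(m)
    return results


# Pattern: a person pointing to a paper that points to a venue.
pattern_edges = [("x", "y"), ("y", "z")]
pattern_labels = {"x": "person", "y": "paper", "z": "venue"}

print(reachable("a", "d"))                           # labels are never consulted
print(match_pattern(pattern_edges, pattern_labels))  # labels and edges both matter
```

Relabeling a node changes the answer of `match_pattern` but can never change the answer of `reachable`, which is the sense in which the newer workloads depend on content as well as topology.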



Corresponding author

Correspondence to Arijit Khan.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Khan, A., Ranu, S. (2017). Big-Graphs: Querying, Mining, and Beyond. In: Zomaya, A., Sakr, S. (eds) Handbook of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-49340-4_16

  • DOI: https://doi.org/10.1007/978-3-319-49340-4_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49339-8

  • Online ISBN: 978-3-319-49340-4

  • eBook Packages: Computer Science (R0)
