Generating Preview Tables for Entity Graphs

Authors:
Ning Yan, Huawei U.S. R&D Center, Santa Clara, CA, USA
Sona Hasani, The University of Texas at Arlington, Arlington, TX, USA
Abolfazl Asudeh, The University of Texas at Arlington, Arlington, TX, USA
Chengkai Li, The University of Texas at Arlington, Arlington, TX, USA

SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data, June 2016, Pages 1797–1811, https://doi.org/10.1145/2882903.2915221

Published: 26 June 2016

Editorial Notes

Computationally Replicable. The experimental results of this paper were replicated by a SIGMOD Review Committee and were found to support the central results reported in the paper.

ABSTRACT

Users are tapping into massive, heterogeneous entity graphs for many applications. It is challenging to select entity graphs for a particular need, given abundant datasets from many sources and the oftentimes scarce information for them. We propose methods to produce preview tables for compact presentation of important entity types and relationships in entity graphs. The preview tables assist users in attaining a quick and rough preview of the data. They can be shown in a limited display space for a user to browse and explore, before she decides to spend time and resources to fetch and investigate the complete dataset. We formulate several optimization problems that look for previews with the highest scores according to intuitive goodness measures, under various constraints on preview size and distance between preview tables. The optimization problem under distance constraint is NP-hard. We design a dynamic-programming algorithm and an Apriori-style algorithm for finding optimal previews. Results from experiments, comparison with related work and user studies demonstrated the scoring measures' accuracy and the discovery algorithms' efficiency.
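
As a rough illustration of the size-constrained optimization described in the abstract, the following Python sketch selects at most k tables and at most n attributes in total so that the summed score is maximal, using a small dynamic program. The additive score, the made-up attribute scores, and the names best_preview, tables, k, and n are assumptions introduced for this example; they are not the paper's scoring measures or its algorithms, and the distance-constrained variant is not covered.

```python
from typing import Dict, List, Optional, Tuple

NEG = float("-inf")


def best_preview(tables: Dict[str, List[int]], k: int, n: int) -> Tuple[int, Dict[str, int]]:
    """Choose at most k tables and at most n attributes overall.

    `tables` maps a (hypothetical) table name to its attribute scores.
    A table shown with m attributes contributes the sum of its m highest
    scores. Returns (best total score, {table name: attributes shown}).
    """
    # best[j][b]: best score using exactly j tables and exactly b attributes.
    best = [[NEG] * (n + 1) for _ in range(k + 1)]
    pick: List[List[Optional[Dict[str, int]]]] = [[None] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0
    pick[0][0] = {}

    for name, scores in tables.items():
        ranked = sorted(scores, reverse=True)
        prefix = [0]  # prefix[m] = sum of the m highest scores of this table
        for s in ranked:
            prefix.append(prefix[-1] + s)
        # Go through j downward so row j-1 still excludes the current table,
        # which guarantees every table is used at most once.
        for j in range(k, 0, -1):
            for b in range(n, 0, -1):
                for m in range(1, min(b, len(ranked)) + 1):
                    prev = best[j - 1][b - m]
                    if prev == NEG:
                        continue
                    cand = prev + prefix[m]
                    if cand > best[j][b]:
                        best[j][b] = cand
                        pick[j][b] = {**pick[j - 1][b - m], name: m}

    # The empty preview (score 0) is the fallback when nothing fits.
    top_score, top_pick = 0, {}
    for j in range(k + 1):
        for b in range(n + 1):
            if best[j][b] > top_score:
                top_score, top_pick = best[j][b], pick[j][b]
    return top_score, top_pick


if __name__ == "__main__":
    # Made-up scores for three hypothetical entity types.
    demo = {"Film": [9, 5, 1], "Actor": [6, 4], "Award": [3]}
    print(best_preview(demo, k=2, n=3))
```

With the made-up scores above, a budget of two tables and three attributes is best spent on two Film attributes and one Actor attribute. The sketch only shows how a budget can be split across tables; any real use would substitute the scoring measures defined in the paper.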

Supplemental Material

readme.pdf (58.6 KB)
tabview_reproducibility.zip (2.1 GB)
Data, Experiments

Index Terms

1. Information systems
  1. Data management systems
  2. Information systems applications
    1. Data mining

Published in
SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data
June 2016, 2300 pages
ISBN: 9781450335317
DOI: 10.1145/2882903
General Chairs: Fatma Özcan (IBM Research, USA), Georgia Koutrika (HP Labs, USA)
Program Chair: Sam Madden (Massachusetts Institute of Technology, USA)
Copyright © 2016 ACM
Publisher: Association for Computing Machinery, New York, NY, United States
Author Tags
data exploration
entity graph
knowledge graph
schema summarization
Acceptance Rates
Overall Acceptance Rate: 785 of 4,003 submissions, 20%