DOI: 10.1145/2882903.2915221

Generating Preview Tables for Entity Graphs

Published: 26 June 2016

Editorial Notes

Computationally Replicable. The experimental results of this paper were replicated by a SIGMOD Review Committee and were found to support the central results reported in the paper.

ABSTRACT

Users are tapping into massive, heterogeneous entity graphs for many applications. Given the abundance of datasets from many sources and the often scarce information about them, it is challenging to select the entity graphs that suit a particular need. We propose methods to produce preview tables that compactly present the important entity types and relationships in an entity graph. Preview tables give users a quick, rough preview of the data. They can be shown in a limited display space for a user to browse and explore before she decides to spend time and resources fetching and investigating the complete dataset. We formulate several optimization problems that seek the previews with the highest scores according to intuitive goodness measures, under various constraints on preview size and on the distance between preview tables. The optimization problem under the distance constraint is NP-hard. We design a dynamic-programming algorithm and an Apriori-style algorithm for finding optimal previews. Results from experiments, a comparison with related work, and user studies demonstrate the accuracy of the scoring measures and the efficiency of the discovery algorithms.
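The abstract does not spell out how the Apriori-style search works, so the following is only a rough illustration of the general idea, not the authors' algorithm. It assumes a simplified setting in which each candidate preview table has a single numeric score, a preview's score is the sum of its tables' scores, and the distance constraint requires every pair of chosen tables to be within distance d. The function name `best_preview`, the `scores` and `dist` inputs, and the example entity types are all hypothetical. Because any subset of a valid set is itself valid, candidates of size s can be built level by level from valid sets of size s - 1, as in Apriori.

```python
from itertools import combinations


def best_preview(scores, dist, k, d):
    """Pick k tables with maximum total score such that every pair of
    chosen tables is within distance d (illustrative assumption).

    scores: dict mapping table name -> numeric score
    dist:   dict mapping frozenset({a, b}) -> distance between a and b
    Returns (total_score, set_of_tables), or (None, None) if no valid set exists.
    """
    def close(a, b):
        return dist[frozenset((a, b))] <= d

    # Level 1: every single table trivially satisfies the constraint.
    level = [(t,) for t in sorted(scores)]

    for size in range(2, k + 1):
        valid = set(level)
        next_level = []
        # Join two valid sets that agree on everything but their last element.
        for a, b in combinations(level, 2):
            if a[:-1] != b[:-1]:
                continue
            cand = a + (b[-1],)
            # Apriori pruning: every (size - 1)-subset must already be valid,
            # and the two newly paired tables must satisfy the constraint.
            subsets_ok = all(s in valid for s in combinations(cand, size - 1))
            if subsets_ok and close(a[-1], b[-1]):
                next_level.append(cand)
        level = next_level

    if not level:
        return None, None
    best = max(level, key=lambda s: sum(scores[t] for t in s))
    return sum(scores[t] for t in best), set(best)


# Hypothetical entity types, scores, and pairwise distances.
scores = {"Film": 5, "Actor": 4, "Director": 3, "Award": 2}
dist = {
    frozenset(("Film", "Actor")): 1,
    frozenset(("Film", "Director")): 1,
    frozenset(("Film", "Award")): 1,
    frozenset(("Actor", "Director")): 2,
    frozenset(("Director", "Award")): 2,
    frozenset(("Actor", "Award")): 3,
}
print(best_preview(scores, dist, k=3, d=2))
```

The sketch enumerates all valid sets of size k before choosing the best one, which is adequate for small inputs but does nothing to escape the worst-case cost implied by NP-hardness.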



      • Published in
        SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data
        June 2016
        2300 pages
        ISBN: 9781450335317
        DOI: 10.1145/2882903

        Copyright © 2016 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • research-article

        Acceptance Rates

        Overall Acceptance Rate: 785 of 4,003 submissions, 20%
