DOI: 10.1145/2448496.2448526
research-article

Extracting minimum-weight tree patterns from a schema with neighborhood constraints

Published: 18 March 2013

ABSTRACT

The task of formulating queries is greatly facilitated when they can be generated automatically from some given data values, schema concepts or both (e.g., names of particular entities and XML tags). This automation is the basis of various database applications, such as keyword search and interactive query formulation. Usually, automatic query generation is realized by finding a set of small tree patterns that contain some given labels. More formally, the computational problem at hand is to find top-k patterns, that is, k minimum-weight tree patterns that contain a given bag of labels, conform to the schema, and are non-redundant. A plethora of systems and research papers include a component that deals with this problem. This paper presents an algorithm for this problem, with complexity guarantees, that allows nontrivial schema constraints and, hence, avoids generating patterns that cannot be instantiated. Specifically, this paper shows that for schemas with certain types of neighborhood constraints, the problem is fixed-parameter tractable (FPT), the parameter being the size of the given bag of labels. As machinery, an adaptation of Lawler-Murty's procedure is developed. This adaptation reduces a top-k problem, over an infinite space of solutions, to a prefix-constrained optimization problem. It is shown how to cast the problem of top-k patterns in this adaptation. A solution is developed for the corresponding prefix-constrained optimization problem, and it uses an algorithm for finding a (single) minimum-weight tree pattern. This algorithm generalizes an earlier work by handling leaf constraints (i.e., which labels may, must or should not be leaves). It all boils down to a reduction showing that, under a language for neighborhood constraints, finding top-k patterns is FPT if a certain variant of exact cover is FPT.
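To make the reduction mentioned in the abstract more concrete, here is a minimal sketch of the classic Lawler-Murty partitioning idea on a toy problem. It is not the paper's adaptation: the solution space here is finite (binary vectors with per-position costs), whereas the paper deals with an infinite space of tree patterns. The names `best_with_prefix` and `top_k` and the cost model are assumptions made purely for illustration. The point is only that ranked enumeration needs nothing beyond a routine that finds the best solution extending a fixed prefix.

```python
import heapq

def best_with_prefix(costs, prefix):
    """Toy prefix-constrained optimization (illustrative assumption).

    costs[i] = (cost of choosing 0 at position i, cost of choosing 1).
    Returns the cheapest full solution that starts with `prefix`.
    """
    tail = tuple(0 if c0 <= c1 else 1 for c0, c1 in costs[len(prefix):])
    solution = prefix + tail
    weight = sum(costs[i][bit] for i, bit in enumerate(solution))
    return weight, solution

def top_k(costs, k):
    """Return up to k solutions in order of non-decreasing weight."""
    # Each heap entry is (weight, solution, fixed), where `solution` is the
    # best member of the subspace of solutions sharing solution[:fixed].
    weight, solution = best_with_prefix(costs, ())
    heap = [(weight, solution, 0)]
    results = []
    while heap and len(results) < k:
        weight, solution, fixed = heapq.heappop(heap)
        results.append((weight, solution))
        # Partition the rest of this subspace: agree with `solution` on the
        # first i positions and differ at position i.
        for i in range(fixed, len(solution)):
            prefix = solution[:i] + (1 - solution[i],)
            w, s = best_with_prefix(costs, prefix)
            heapq.heappush(heap, (w, s, i + 1))
    return results

if __name__ == "__main__":
    for weight, solution in top_k([(1, 4), (2, 3), (0, 10)], 3):
        print(weight, solution)
```

Each popped solution splits its subspace into disjoint pieces that together cover everything except the solution itself, so no solution is produced twice and the heap always holds the best remaining candidate of every open piece.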

      • Published in

        ICDT '13: Proceedings of the 16th International Conference on Database Theory
        March 2013
        301 pages
        ISBN:9781450315982
        DOI:10.1145/2448496

        Copyright © 2013 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States
