ABSTRACT
With the growing importance of semi-structured data in information exchange, much research has been done to provide an effective mechanism for matching a twig query in an XML database. A number of algorithms have been proposed recently to process a twig query holistically. Those algorithms are quite efficient for queries with only ancestor-descendant edges. But for queries with mixed ancestor-descendant and parent-child edges, the previous approaches may still produce large intermediate results, even when the input and output sizes are more manageable. To overcome this limitation, in this paper we propose a novel holistic twig join algorithm, namely <i>TwigStackList</i>. Our main technique is to look ahead and read some elements in the input data streams and cache a limited number of them in <i>lists</i> in main memory. The number of elements in any list is bounded by the length of the longest path in the XML document. We show that <i>TwigStackList</i> is I/O optimal for queries with only ancestor-descendant relationships below branching nodes. Further, even when queries contain parent-child relationships below branching nodes, the set of intermediate results in <i>TwigStackList</i> is guaranteed to be a subset of that in previous algorithms. We complement our analysis with experimental results on a range of real and synthetic data to show the significant superiority of <i>TwigStackList</i> over previous algorithms for queries with <i>parent</i>-<i>child</i> relationships.
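The look-ahead idea and the bound on list size can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes a region encoding in which each element carries a (start, end, level) triple and each tag's stream is sorted by start; the abstract does not state this encoding. The names `Element`, `is_ancestor`, `is_parent`, and `look_ahead_ancestors` are hypothetical. The sketch scans a stream ahead of a target element and caches only the elements that contain it. Those elements are nested inside one another, so they lie on a single root-to-node path and the cache cannot exceed the depth of the document.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Element:
    """An XML element under an assumed (start, end, level) region encoding."""
    tag: str
    start: int
    end: int
    level: int


def is_ancestor(a: Element, d: Element) -> bool:
    # a contains d when a's region strictly encloses d's region.
    return a.start < d.start and d.end < a.end


def is_parent(a: Element, d: Element) -> bool:
    # A parent is an ancestor exactly one level above.
    return is_ancestor(a, d) and a.level + 1 == d.level


def look_ahead_ancestors(
    stream: List[Element], pos: int, target: Element
) -> Tuple[List[Element], int]:
    """Scan forward from pos, caching elements that contain target.

    Elements that start before target but end before it are skipped,
    since they cannot be ancestors of target.
    Returns the cached list and the new stream position.
    """
    cached: List[Element] = []
    while pos < len(stream) and stream[pos].start < target.start:
        if stream[pos].end > target.end:
            cached.append(stream[pos])
        pos += 1
    return cached, pos


# Hypothetical document: nested <a> elements, one <b> at level 4.
a_stream = [
    Element("a", 1, 12, 1),
    Element("a", 2, 9, 2),
    Element("a", 3, 4, 3),    # sibling branch, does not contain b
    Element("a", 5, 8, 3),
    Element("a", 10, 11, 2),  # starts after b, never read by the look-ahead
]
b = Element("b", 6, 7, 4)

cached, next_pos = look_ahead_ancestors(a_stream, 0, b)
# Only cached elements need checking for the parent-child edge a/b.
parents = [a for a in cached if is_parent(a, b)]
```

In this example the cache holds the three nested a elements that contain b, and only the innermost one satisfies the parent-child edge. A method that inspects only the head of the a stream would see the outermost a first and could not tell, without reading further, whether any a is a parent of b.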
Index Terms
- Efficient processing of XML twig patterns with parent child edges: a look-ahead approach