Data & Knowledge Engineering
Volume 65, Issue 2, May 2008, Pages 243-265
Including Special Section: 3rd XML Schema and Data Management Workshop (XSDM 2006) – Five selected and extended papers
 
doi:10.1016/j.datak.2007.09.007
Copyright © 2007 Elsevier B.V. All rights reserved.

Processing recursive XQuery over XML streams: The Raindrop approach

Mingzhu Wei (corresponding author), Elke A. Rundensteiner, Murali Mani and Ming Li

CS Department, Worcester Polytechnic Institute, Worcester, MA 01609-2280, USA

Received 5 September 2007; accepted 5 September 2007. Available online 20 September 2007.


Abstract

XML stream applications bring the challenge of efficiently processing queries on sequentially accessible token-based data. To process queries efficiently, we need to keep memory usage low. This in turn requires that we avoid holding data in the query buffer by outputting it at the earliest possible time. In this paper, we propose a new class of stream algebra operators for efficient recursive XQuery stream processing. Our plan generator analyzes the query, and the schema when available, to determine which join operators in the query need recursive join support, so that it can plug in the less expensive just-in-time structural join whenever possible. In particular, we propose two strategies for implementing structural joins: (a) the just-in-time structural join strategy efficiently processes joins over non-recursive XML token streams; and (b) the recursive structural join strategy supports structural joins over recursive XML substreams, though at the added cost of generating and comparing tuple-level IDs. Both structural join strategies are complemented by an automata-driven invocation mechanism that triggers the execution of each join process at the first possible moment, upon recognizing the end of the targeted input stream subelement. Further, we design the StructuralJoin operator itself to be context-aware. At run time, the operator can switch from the efficient just-in-time join strategy for elements that are recognized to be non-recursive to the more powerful ID-based structural join strategy for elements that are identified to be recursive. We incorporate the proposed techniques into the Raindrop stream engine. We also report on experimental studies conducted using the ToXgene benchmark that demonstrate the performance improvements of the techniques.

Keywords: XML; Query optimization; XQuery processing; Recursive query; Structural join
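
The abstract does not spell out how the tuple-level IDs used by the recursive structural join are built or compared. The following is only a minimal sketch of one common way to do it, assuming a region encoding in which each element receives a (start, end, level) triple from a token counter; it is not taken from the paper. The names Region, assign_regions, is_ancestor, id_based_join and SAMPLE_TOKENS are hypothetical. It illustrates why recursive data needs the comparison: when an element such as `a` nests inside another `a`, a buffered `b` can belong to more than one ancestor, so containment must be checked per pair.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# A token is ("start", name) or ("end", name); text tokens are omitted
# in this sketch for brevity.
Token = Tuple[str, str]


@dataclass(frozen=True)
class Region:
    """Hypothetical ID triple for one element (an assumed encoding)."""
    name: str
    start: int   # counter value at the start tag
    end: int     # counter value at the end tag
    level: int   # nesting depth, root = 0


def assign_regions(tokens: Iterable[Token]) -> List[Region]:
    """Assign a Region to every element, emitted in end-tag order."""
    counter = 0
    open_stack: List[Tuple[str, int, int]] = []  # (name, start, level)
    regions: List[Region] = []
    for kind, name in tokens:
        counter += 1
        if kind == "start":
            open_stack.append((name, counter, len(open_stack)))
        elif kind == "end":
            open_name, start_id, level = open_stack.pop()
            if open_name != name:
                raise ValueError(f"mismatched end tag: {name}")
            regions.append(Region(name, start_id, counter, level))
        else:
            raise ValueError(f"unknown token kind: {kind}")
    return regions


def is_ancestor(a: Region, d: Region) -> bool:
    """True if element a strictly contains element d."""
    return a.start < d.start and d.end < a.end


def id_based_join(regions: List[Region], anc: str, desc: str):
    """Pair each `anc` element with the `desc` elements it contains."""
    ancestors = [r for r in regions if r.name == anc]
    descendants = [r for r in regions if r.name == desc]
    return [(a, d) for a in ancestors for d in descendants
            if is_ancestor(a, d)]


# Recursive sample: <a><b/><a><b/></a></a>
SAMPLE_TOKENS: List[Token] = [
    ("start", "a"), ("start", "b"), ("end", "b"),
    ("start", "a"), ("start", "b"), ("end", "b"),
    ("end", "a"), ("end", "a"),
]

if __name__ == "__main__":
    pairs = id_based_join(assign_regions(SAMPLE_TOKENS), "a", "b")
    # The outer a contains both b elements, the inner a only the second.
    print(len(pairs))  # 3
```

On non-recursive input, every buffered `b` belongs to the single `a` that is currently closing, so the containment test becomes redundant; this is presumably the cost that the just-in-time strategy avoids.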

Article Outline

1. Introduction
1.1. Our recursive Raindrop approach
2. Raindrop basics
2.1. Retrieving patterns using automata
2.2. Algebra plan
2.3. Plan execution
2.4. Issues for recursive XML data
3. Recursive-mode operators
3.1. Associating IDs with elements
3.2. Features of recursive Navigate operators
3.3. Features of recursive ExtractUnnest operators
3.4. Features of recursive ExtractNest operators
3.5. The recursive StructuralJoin operator
3.5.1. Invocation mechanism of recursive StructuralJoin
3.5.2. Algorithm for recursive StructuralJoin
3.5.3. Differences between recursive structural join and non-recursive structural join
3.6. Context-aware StructuralJoin operators
4. Optimized plan configuration based on schema analysis
4.1. Motivation of schema analysis
4.2. Optimizing generated plan based on query and schema
4.2.1. Schema analysis algorithm
4.2.1.1. The “For” clause without predicates
4.2.1.2. The “For” clause with existence predicates
4.3. Algorithm of plan generation based on schema analysis
5. Related work
6. Experimental results
6.1. Advantages of early invocation of structural join
6.2. Efficiency of context-aware StructuralJoin
6.3. Advantage of using recursion-free mode operators
7. Conclusion
References
Vitae