Copyright © 2007 Elsevier B.V. All rights reserved.
Assigning semantics to partial tree-pattern queries
Received 12 October 2006;
Abstract
The wide adoption of XML has increased interest in data models based on tree-structured data. Querying capabilities are provided through tree-pattern queries (TPQs). The need to query tree-structured data sources whose structure is not fully known, and the need to integrate multiple data sources with different tree structures, have recently motivated query languages that relax the complete specification of a tree pattern. Assigning semantics to the queries of these languages so that they return meaningful answers is a challenging issue.
In this paper, we introduce a query language which allows the specification of partial tree-pattern queries (PTPQs). The structure of a PTPQ can be specified fully, partially, or not at all. We define index graphs, which summarize the structural information of data trees. Using index graphs, we show that PTPQs can be evaluated through the generation of an equivalent set of “complete” TPQs. We suggest an original approach that exploits the set of complete TPQs of a PTPQ to assign meaningful semantics to the PTPQ language. In contrast to previous approaches, which operate locally on the data to compute meaningful answers (usually by computing lowest common ancestors), our approach operates globally on index graphs to detect meaningful complete TPQs. We implemented and experimentally evaluated our approach on DBLP-based data sets with irregularities. A comparison with previous approaches shows that ours finds all the meaningful answers where the others fail (perfect recall). Further, it outperforms approaches with similar recall in excluding meaningless answers (better precision). Finally, it is superior to, and scales better than, the only previous approach that allows for structural constraints in the queries. Because our approach generates TPQs, it can be easily implemented on top of an XQuery engine.
Keywords: XML; Partial tree-pattern query; Keyword query; Query language semantics; Meaningful answer; Structural summary of XML data
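The idea of evaluating a partially specified query through a structural summary can be illustrated with a minimal sketch. It does not reproduce the paper's definition of an index graph, which is not given in this abstract. It assumes instead a simple summary made of the distinct root-to-node label paths of a data tree, and a partial query of the form "a node labelled X somewhere above a node labelled Y". The tuple encoding of trees, the function names `path_summary` and `complete_paths`, and the sample data are all hypothetical.

```python
from typing import List, Set, Tuple

# Assumed encoding: a data tree node is (label, [child nodes]).
Tree = Tuple[str, list]
Path = Tuple[str, ...]


def path_summary(tree: Tree) -> Set[Path]:
    """Collect every distinct root-to-node label path in the data tree.

    This stands in for a structural summary; it is a simplification,
    not the index graph defined in the paper.
    """
    paths: Set[Path] = set()
    stack = [(tree, ())]
    while stack:
        (label, children), prefix = stack.pop()
        path = prefix + (label,)
        paths.add(path)
        for child in children:
            stack.append((child, path))
    return paths


def complete_paths(summary: Set[Path], ancestor: str, descendant: str) -> List[Path]:
    """Expand the partial constraint 'ancestor above descendant'
    into the fully specified paths that the summary allows."""
    result = []
    for path in summary:
        # The path must end at the descendant label and contain
        # the ancestor label somewhere strictly above it.
        if path[-1] == descendant and ancestor in path[:-1]:
            result.append(path)
    return sorted(result)


# Hypothetical bibliography-like data with an irregularity:
# one book has an editor instead of an author.
data: Tree = ("bib", [
    ("book", [("title", []), ("author", [("name", [])])]),
    ("article", [("title", []), ("author", [("name", [])])]),
    ("book", [("title", []), ("editor", [("name", [])])]),
])

summary = path_summary(data)
candidates = complete_paths(summary, "book", "name")

for path in candidates:
    print("/" + "/".join(path))
```

In this sketch the partial constraint expands to two fully specified paths, one through `author` and one through `editor`. Deciding which of such candidates are meaningful is the part addressed by the paper and is not modelled here.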
Article Outline
- 1. Introduction
- 1.1. The problem
- 1.2. Limitations of previous approaches
- 1.3. Our approach
- 1.4. Contribution
- 1.5. Outline
- 2. Related work
- 3. The partial tree-pattern query language
- 3.1. Data model
- 3.2. Query language
- 4. Evaluating PTPQs using complete TPQs
- 4.1. Index graphs
- 4.2. Complete TPQs for a PTPQ
- 5. Using complete TPQs to exclude meaningless answers
- 6. Analysis of previous approaches and comparison
- 7. Experimental evaluation
- 7.1. Quality
- 7.1.1. Experimental setting
- 7.1.2. Experimental results for keyword queries without structural restrictions
- 7.1.3. Experimental results for keyword queries with structural restrictions
- 7.2. Performance
- 7.2.1. Experimental setting
- 7.2.2. Experimental results
- 8. Conclusion
- References
- Vitae






