Data & Knowledge Engineering
Volume 64, Issue 1, January 2008, Pages 242-265
Fourth International Conference on Business Process Management (BPM 2006) - Four selected and extended papers; 8th International Conference on Enterprise Information Systems (ICEIS' 2006) - Three selected and extended papers
 
doi:10.1016/j.datak.2007.07.002
Copyright © 2007 Elsevier B.V. All rights reserved.

Assigning semantics to partial tree-pattern queries

Dimitri Theodoratos (a, corresponding author) and Xiaoying Wu (a)

(a) Department of Computer Science, New Jersey Institute of Technology, USA

Received 12 October 2006; accepted 24 July 2007. Available online 7 August 2007.


Abstract

The wide adoption of XML has increased interest in data models based on tree-structured data. Querying capabilities are provided through tree-pattern queries (TPQs). Two needs, querying tree-structured data sources whose structure is not fully known and integrating multiple data sources with different tree structures, have recently motivated query languages that relax the complete specification of a tree pattern. Assigning semantics to the queries of these languages so that they return meaningful answers is a challenging issue.

In this paper, we introduce a query language that allows the specification of partial tree-pattern queries (PTPQs). The structure in a PTPQ can be specified fully, partially, or not at all. We define index graphs, which summarize the structural information of data trees. Using index graphs, we show that a PTPQ can be evaluated by generating an equivalent set of "complete" TPQs. We suggest an original approach that exploits the set of complete TPQs of a PTPQ to assign meaningful semantics to the PTPQ language. In contrast to previous approaches, which operate locally on the data to compute meaningful answers (usually by computing lowest common ancestors), our approach operates globally on index graphs to detect meaningful complete TPQs. We implemented and experimentally evaluated our approach on DBLP-based data sets with irregularities. A comparison with previous approaches shows that it finds all the meaningful answers where the others fail (perfect recall). Further, it outperforms approaches with similar recall in excluding meaningless answers (better precision). Finally, it is superior to, and scales better than, the only previous approach that allows structural constraints in the queries. Because our approach generates TPQs, it can be easily implemented on top of an XQuery engine.
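The idea of expanding a partially specified query into complete ones over a structural summary can be illustrated with a small Python sketch. It is a deliberate simplification and not the paper's definitions: the abstract does not define index graphs or PTPQs, so the sketch assumes the summary is just the set of distinct root-to-node label paths of the data tree, and it assumes a partial query is a single path given as labels in ancestor-to-descendant order with the intermediate steps left open. The tree, its labels, and the function names `build_index` and `complete_paths` are hypothetical.

```python
# Hypothetical data tree: each node is a (label, children) pair.
TREE = ("bib", [
    ("book", [("author", [("name", [])]), ("title", [])]),
    ("article", [("author", [("name", [])]), ("title", [])]),
])


def build_index(tree):
    """Summarize a data tree as its set of distinct root-to-node label paths.

    Assumption: this path set stands in for the paper's index graph.
    """
    paths = set()

    def visit(node, prefix):
        label, children = node
        path = prefix + (label,)
        paths.add(path)
        for child in children:
            visit(child, path)

    visit(tree, ())
    return paths


def complete_paths(index, labels):
    """Expand a partial path query into the complete paths of the summary.

    `labels` lists node labels in ancestor-to-descendant order; any number
    of unspecified steps may lie between them. A summary path qualifies if
    it ends at the last query label and contains the query labels in order.
    """
    def is_subsequence(spec, path):
        remaining = iter(path)
        # `lbl in remaining` consumes the iterator up to the first match,
        # so the labels must occur in the given order.
        return all(lbl in remaining for lbl in spec)

    return sorted(
        p for p in index
        if p[-1] == labels[-1] and is_subsequence(labels, p)
    )


index = build_index(TREE)
for path in complete_paths(index, ("bib", "name")):
    print("/".join(path))
```

In this sketch the partial query `("bib", "name")` expands to two complete paths, one through `book` and one through `article`. Each could then be run as an ordinary fully specified query, which mirrors in miniature the abstract's point that the expansion operates on the summary rather than on the data itself.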

Keywords: XML; Partial tree-pattern query; Keyword query; Query language semantics; Meaningful answer; Structural summary of XML data

Article Outline

1. Introduction
1.1. The problem
1.2. Limitations of previous approaches
1.3. Our approach
1.4. Contribution
1.5. Outline
2. Related work
3. The partial tree-pattern query language
3.1. Data model
3.2. Query language
4. Evaluating PTPQs using complete TPQs
4.1. Index graphs
4.2. Complete TPQs for a PTPQ
5. Using complete TPQs to exclude meaningless answers
5.1. A transformation for complete TPQs
5.2. Determining the meaningful complete TPQs
6. Analysis of previous approaches and comparison
7. Experimental evaluation
7.1. Quality
7.1.1. Experimental setting
7.1.2. Experimental results for keyword queries without structural restrictions
7.1.3. Experimental results for keyword queries with structural restrictions
7.2. Performance
7.2.1. Experimental setting
7.2.2. Experimental results
8. Conclusion
References
Vitae