Information Systems
Volume 33, Issues 4-5, June-July 2008, Pages 456-474
Selected Papers from the Tenth International Symposium on Database Programming Languages (DBPL 2005)
 
doi:10.1016/j.is.2008.01.004
Copyright © 2008 Elsevier B.V. All rights reserved.

Efficient memory representation of XML document trees

Giorgio Busatto (a), Markus Lohrey (b, corresponding author) and Sebastian Maneth (c, d)

(a) Department für Informatik, Universität Oldenburg, Germany
(b) Institut für Informatik, Universität Leipzig, Johannisgasse 26, 04103 Leipzig, Germany
(c) National ICT Australia Ltd., Australia
(d) University of New South Wales, Sydney, Australia

Available online 15 January 2008.


Abstract

Implementations that load XML documents and give access to them via, e.g., the DOM, suffer from huge memory demands: the space needed to load an XML document is usually many times larger than the size of the document itself. A considerable amount of memory is needed to store the tree structure of the XML document. This paper presents a technique for representing the tree structure of an XML document efficiently. The representation exploits the high regularity of XML documents by compressing their tree structure, that is, by detecting and removing repetitions of tree patterns. Formally, context-free tree grammars that generate only a single tree are used for tree compression. The functionality of basic tree operations, such as traversal along edges, is preserved under this compressed representation. This makes it possible to execute queries (and in particular, bulk operations) directly, without prior decompression. The complexity of certain computational problems, such as validation against XML types or testing equality, is investigated for compressed input trees.

Keywords: Tree grammar; Compression; In-memory XML representation
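
The idea of removing repeated tree patterns can be illustrated with its simplest special case: sharing identical subtrees, which turns the tree into a DAG. The Python sketch below is an illustration only and is not the BPLEX algorithm named in the outline; the grammars described in the abstract are more general than plain subtree sharing. All names (Tree, to_dag, expand, count_nodes) are hypothetical and do not come from the article. The function count_nodes shows, under these assumptions, how a simple bulk operation can run on the compressed form without decompressing it.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Tree:
    """An unranked, ordered tree node carrying only an element label."""
    label: str
    children: Tuple["Tree", ...] = ()


# A rule is (label, ids of the child rules). Hypothetical representation.
Rule = Tuple[str, Tuple[int, ...]]


def to_dag(root: Tree) -> Tuple[List[Rule], int]:
    """Share identical subtrees; return the rule list and the root's rule id."""
    ids: Dict[Rule, int] = {}
    rules: List[Rule] = []

    def visit(t: Tree) -> int:
        # Children are visited first, so child ids are always smaller
        # than the id of their parent.
        key = (t.label, tuple(visit(c) for c in t.children))
        if key not in ids:
            ids[key] = len(rules)
            rules.append(key)
        return ids[key]

    return rules, visit(root)


def expand(rules: List[Rule], node_id: int) -> Tree:
    """Decompress: rebuild the full tree from the shared rules."""
    label, kids = rules[node_id]
    return Tree(label, tuple(expand(rules, k) for k in kids))


def count_nodes(rules: List[Rule], root_id: int) -> int:
    """Count nodes of the uncompressed tree, working on the rules only."""
    sizes: List[int] = []
    for _label, kids in rules:  # child ids precede parents, see to_dag
        sizes.append(1 + sum(sizes[k] for k in kids))
    return sizes[root_id]


def book() -> Tree:
    return Tree("book", (Tree("title"), Tree("author"), Tree("author")))


# A small, highly regular document: three structurally identical books.
doc = Tree("library", (book(), book(), book()))
rules, root_id = to_dag(doc)
print(len(rules), count_nodes(rules, root_id))
```

In this example the 13-node tree is stored as 4 rules, and expand(rules, root_id) reproduces the original tree. The recursive functions are adequate for small inputs; very deep documents would need an iterative traversal.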

Article Outline

1. Introduction
2. Preliminaries
2.1. Tree grammars
3. The BPLEX algorithm
4. Memory-efficient XML tree representation using BPLEX
4.1. Binary tree model
4.2. Multiary tree model
4.3. DAGs: binary trees versus multiary trees
4.4. SLT Grammars: binary trees versus multiary trees
5. Experimental results
5.1. Performance and parameter tuning
6. Algorithms on SLT grammars
6.1. XML type validation
6.2. Equality test
7. Related work
8. Conclusions and future work
Acknowledgements
References







