SAM: An Efficient Algorithm for F&B-Index Construction

  • Conference paper
Advances in Data and Web Management (APWeb 2007, WAIM 2007)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 4505))

Abstract

Using an index to process structural queries on XML data is a natural approach. The F&B-Index has been proven to be the smallest index that covers all branching path queries. One disadvantage that prevents its wide use is that its construction requires a lot of time and very large main memory, yet few works focus on this problem. In this paper, we propose an effective and efficient F&B-Index construction algorithm, SAM, for DAG-structured XML data. By maintaining only a small part of the index, SAM reduces the space required for construction. By avoiding complex computation when selecting the nodes to process, SAM takes less time than existing algorithms. Theoretical analysis and experimental results show that SAM is correct, effective and efficient.
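
To make the object being built more concrete, the following is a minimal sketch of the partition that an F&B-Index represents: nodes are grouped so that nodes in the same group have the same label, parents in the same groups, and children in the same groups. It uses a naive fixpoint refinement, not SAM, whose steps are not described in this abstract. The function names `fb_partition` and `blocks` and the small DAG are hypothetical and chosen only for illustration.

```python
def fb_partition(labels, edges):
    """Naive forward-and-backward refinement (illustrative, not SAM).

    labels: dict mapping node -> label
    edges:  iterable of (parent, child) pairs forming a DAG
    Returns a dict mapping node -> block id.
    """
    parents = {n: set() for n in labels}
    children = {n: set() for n in labels}
    for p, c in edges:
        children[p].add(c)
        parents[c].add(p)

    # Start from the partition induced by labels.
    block = dict(labels)
    num_blocks = len(set(block.values()))
    while True:
        # A node's signature: its own block plus the sets of blocks
        # its parents and its children currently belong to.
        sig = {
            n: (block[n],
                frozenset(block[p] for p in parents[n]),
                frozenset(block[c] for c in children[n]))
            for n in labels
        }
        ids = {}
        new_block = {n: ids.setdefault(sig[n], len(ids)) for n in labels}
        # Each round only splits blocks, so an unchanged count means stable.
        if len(ids) == num_blocks:
            return new_block
        block, num_blocks = new_block, len(ids)


def blocks(partition):
    """Group nodes by block id and return sorted lists of node names."""
    groups = {}
    for node, b in partition.items():
        groups.setdefault(b, set()).add(node)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical DAG: d1 has two parents (b2 and b3).
labels = {"r": "db", "a1": "a", "a2": "a", "a3": "a",
          "b1": "b", "b2": "b", "b3": "b", "c1": "c", "d1": "d"}
edges = [("r", "a1"), ("r", "a2"), ("r", "a3"),
         ("a1", "b1"), ("a2", "b2"), ("a3", "b3"),
         ("b1", "c1"), ("b2", "d1"), ("b3", "d1")]

print(blocks(fb_partition(labels, edges)))
```

In this example a2 and a3 stay together, as do b2 and b3, while a1 and b1 are split off because b1 has a differently labeled child. The sketch keeps the whole partition in memory and rescans every node each round, which is the kind of time and memory cost the abstract says SAM is designed to reduce.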

Research supported by the Key Program of the National Natural Science Foundation of China (NSFC) under Grant No. 60533110, NSFC under Grant No. 60473075, the National Grand Fundamental Research 973 Program of China under Grant No. 2006CB303000, and the Program for New Century Excellent Talents in University (NCET) under Grant No. NCET-05-0333.

Editor information

Guozhu Dong, Xuemin Lin, Wei Wang, Yun Yang, Jeffrey Xu Yu

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Liu, X., Li, J., Wang, H. (2007). SAM: An Efficient Algorithm for F&B-Index Construction. In: Dong, G., Lin, X., Wang, W., Yang, Y., Yu, J.X. (eds) Advances in Data and Web Management. APWeb 2007, WAIM 2007. Lecture Notes in Computer Science, vol 4505. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72524-4_72

  • DOI: https://doi.org/10.1007/978-3-540-72524-4_72

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-72483-4

  • Online ISBN: 978-3-540-72524-4

  • eBook Packages: Computer Science, Computer Science (R0)
