DOI: 10.1145/1247480.1247564
Article

Massively multi-query join processing in publish/subscribe systems

Published: 11 June 2007

ABSTRACT

There has been much recent interest in XML publish/subscribe systems. Some systems scale to thousands of concurrent queries, but support a limited query language (usually a fragment of XPath 1.0). Other systems support more expressive languages, but do not scale well with the number of concurrent queries. In this paper, we propose a set of novel query processing techniques, referred to as Massively Multi-Query Join Processing techniques, for processing a large number of XML stream queries involving value joins over multiple XML streams and documents. These techniques enable the sharing of representations of inputs to multiple joins, and the sharing of join computation. Our techniques are also applicable to relational event processing systems and publish/subscribe systems that support join queries. We present experimental results to demonstrate the effectiveness of our techniques. We are able to process thousands of XML messages with hundreds of thousands of join queries on real RSS feed streams. Our techniques gain more than two orders of magnitude speedup compared to the naive approach of evaluating such join queries.
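The abstract contrasts evaluating each join query separately with sharing join inputs and join computation across queries. The following is a minimal sketch of that general idea, not the algorithm from the paper. It assumes flat messages represented as Python dictionaries, equality joins between two distinct streams, and no windowing or expiry. The class name `SharedJoinIndex`, its methods, and the feed and field names in the usage note are all hypothetical.

```python
from collections import defaultdict


class SharedJoinIndex:
    """Toy engine in which many equality-join subscriptions share state.

    Assumptions (not from the paper): messages are flat dicts with hashable
    values, each query joins two distinct streams on one field per side,
    and stored messages are never expired.
    """

    def __init__(self):
        # Join signature (stream_a, field_a, stream_b, field_b) -> query ids.
        # Queries with the same signature share a single probe.
        self.queries_by_signature = defaultdict(set)
        # Shared input representation: (stream, field) -> value -> messages.
        self.index = defaultdict(lambda: defaultdict(list))

    def subscribe(self, query_id, stream_a, field_a, stream_b, field_b):
        """Register a query that joins stream_a.field_a = stream_b.field_b."""
        signature = (stream_a, field_a, stream_b, field_b)
        self.queries_by_signature[signature].add(query_id)

    def publish(self, stream, message):
        """Store a message and return (query_id, a_msg, b_msg) matches."""
        results = []
        for (sa, fa, sb, fb), qids in self.queries_by_signature.items():
            # One hash probe per signature, however many queries share it.
            if stream == sa and fa in message:
                for other in self.index[(sb, fb)].get(message[fa], []):
                    results.extend((q, message, other) for q in sorted(qids))
            if stream == sb and fb in message:
                for other in self.index[(sa, fa)].get(message[fb], []):
                    results.extend((q, other, message) for q in sorted(qids))
        # Index the message after probing so it never joins with itself.
        for field, value in message.items():
            self.index[(stream, field)][value].append(message)
        return results
```

As a usage illustration, subscribing two queries with the same signature, for example `("feedA", "title", "feedB", "title")`, and then publishing a message with the same `title` value on each feed yields one match per query from a single index probe. The cost of a publish then grows with the number of distinct join signatures rather than with the number of subscriptions, which is the intuition behind sharing; the paper's actual techniques are more elaborate.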

Published in

SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data
June 2007
1210 pages
ISBN: 9781595936868
DOI: 10.1145/1247480
General Chairs: Lizhu Zhou, Tok Wang Ling
Program Chair: Beng Chin Ooi

Copyright © 2007 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Overall Acceptance Rate: 785 of 4,003 submissions, 20%
