Joining Punctuated Streams

  • Conference paper
Advances in Database Technology - EDBT 2004 (EDBT 2004)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2992))

Abstract

We focus on optimizing stream joins by exploiting constraints that are dynamically embedded in data streams to signal the end of transmission of certain attribute values. These constraints are called punctuations. Our stream join operator, PJoin, uses punctuations to remove no-longer-useful data from its state in a timely manner, which reduces memory overhead and improves probing efficiency. We equip PJoin with several alternative strategies for purging the state and for propagating punctuations to benefit downstream operators. We also present an extensive experimental study that explores the performance gains achieved by purging state, as well as the trade-offs among the purge strategies. Experiments comparing PJoin with XJoin, a stream join operator without a constraint-exploiting mechanism, show that PJoin significantly outperforms XJoin in both memory overhead and throughput.
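To make the idea of punctuation-driven purging concrete, the following is a minimal sketch of a symmetric hash join that purges state when a punctuation arrives. It is not PJoin itself: the class name `PunctuatedHashJoin`, the two fixed stream labels "A" and "B", the single-key punctuations, and the eager purge policy are all assumptions made for illustration. The abstract describes several purge and propagation strategies without detailing them here.

```python
from collections import defaultdict


class PunctuatedHashJoin:
    """Toy symmetric hash join over two streams, "A" and "B", on one key.

    Hypothetical illustration only; a punctuation here means
    "this stream will send no more tuples with this key".
    """

    def __init__(self):
        # Per-stream join state: key -> list of stored payloads.
        self.state = {"A": defaultdict(list), "B": defaultdict(list)}
        # Keys for which each stream has already sent a punctuation.
        self.closed = {"A": set(), "B": set()}

    @staticmethod
    def _other(side):
        return "B" if side == "A" else "A"

    def on_tuple(self, side, key, payload):
        """Probe the opposite state, then store the tuple only if needed."""
        other = self._other(side)
        matches = self.state[other].get(key, [])
        if side == "A":
            results = [(key, payload, m) for m in matches]
        else:
            results = [(key, m, payload) for m in matches]
        # If the other stream has punctuated this key, no future tuple
        # can match, so the new tuple never needs to enter the state.
        if key not in self.closed[other]:
            self.state[side][key].append(payload)
        return results

    def on_punctuation(self, side, key):
        """Eagerly purge opposite-side tuples that can no longer join.

        Returns True when both streams have punctuated the key, i.e.
        when a punctuation for it could safely be propagated downstream.
        """
        other = self._other(side)
        self.closed[side].add(key)
        self.state[other].pop(key, None)
        return key in self.closed[other]

    def state_size(self):
        return sum(len(v) for s in self.state.values() for v in s.values())


join = PunctuatedHashJoin()
join.on_tuple("A", 1, "a1")      # no match yet; "a1" is stored
join.on_tuple("B", 1, "b1")      # joins with "a1"; "b1" is stored
join.on_punctuation("A", 1)      # no more A tuples for key 1: purge "b1"
```

In this sketch a punctuation on one stream shrinks the state of the other stream, which is the source of the memory and probing savings the abstract refers to. A lazier policy that batches purges would trade memory for less purge overhead.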

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ding, L., Mehta, N., Rundensteiner, E.A., Heineman, G.T. (2004). Joining Punctuated Streams. In: Bertino, E., et al. Advances in Database Technology - EDBT 2004. EDBT 2004. Lecture Notes in Computer Science, vol 2992. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24741-8_34

  • DOI: https://doi.org/10.1007/978-3-540-24741-8_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-21200-3

  • Online ISBN: 978-3-540-24741-8

  • eBook Packages: Springer Book Archive
