research-article

High-performance dynamic pattern matching over disordered streams

Published: 01 September 2010

Abstract

Current pattern-detection proposals for streaming data recognize the need to move beyond a simple regular-expression model over strictly ordered input. We continue in this direction, relaxing restrictions present in some models, removing the requirement for ordered input, and permitting stream revisions (modification of prior events). Further, recognizing that patterns of interest in modern applications may change frequently over the lifetime of a query, we support updating of a pattern specification without blocking input or restarting the operator. Our new pattern operator (called AFA) is a streaming adaptation of a non-deterministic finite automaton (NFA) where additional schema-based user-defined information, called a register, is accessible to NFA transitions during execution. AFAs support dynamic patterns, where the pattern itself can change over time. We propose clean order-agnostic pattern-detection semantics for AFAs, with new algorithms that allow a very efficient implementation, while retaining significant expressiveness and supporting native handling of out-of-order input, stream revisions, dynamic patterns, and several optimizations. Experiments on Microsoft StreamInsight show that we achieve event rates of more than 200K events/sec (up to 5x better than simpler schemes). Our dynamic patterns give up to orders-of-magnitude better throughput than solutions such as operator restart, and our other optimizations are very effective, incurring low memory and latency.
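
To make the idea of an NFA whose transitions can read and write a register more concrete, here is a minimal sketch. It is not the authors' AFA operator or algorithm: it assumes strictly ordered input and does not handle revisions or dynamic pattern updates, and every name in it (`Event`, `Transition`, `run`, the states `q0`, `q1`, `qf`) is hypothetical, chosen only for illustration. The example pattern is "an A event followed later by a B event whose value exceeds that A's value", with the register holding the value of the A event.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple


@dataclass(frozen=True)
class Event:
    kind: str
    value: int


@dataclass(frozen=True)
class Transition:
    src: str
    dst: str
    guard: Callable[[Event, int], bool]   # may inspect the register
    update: Callable[[Event, int], int]   # returns the new register


def run(transitions: List[Transition], start: str, finals: Set[str],
        initial_register: int, events: List[Event]) -> List[Tuple[int, int]]:
    """Return (event_index, register) each time a final state is reached.

    Assumes events arrive in order (an illustrative simplification).
    """
    matches: List[Tuple[int, int]] = []
    active: List[Tuple[str, int]] = []  # partial matches: (state, register)
    for i, ev in enumerate(events):
        # Every event may also begin a fresh partial match.
        candidates = active + [(start, initial_register)]
        active = []
        for state, reg in candidates:
            for t in transitions:
                if t.src == state and t.guard(ev, reg):
                    new_reg = t.update(ev, reg)
                    if t.dst in finals:
                        matches.append((i, new_reg))
                    else:
                        active.append((t.dst, new_reg))
    return matches


transitions = [
    # q0 -> q1 on an A event: remember its value in the register.
    Transition("q0", "q1", lambda e, r: e.kind == "A", lambda e, r: e.value),
    # q1 -> q1 on any event: keep waiting, register unchanged.
    Transition("q1", "q1", lambda e, r: True, lambda e, r: r),
    # q1 -> qf on a B event larger than the remembered A value:
    # the register becomes the difference between the two values.
    Transition("q1", "qf", lambda e, r: e.kind == "B" and e.value > r,
               lambda e, r: e.value - r),
]

events = [Event("A", 10), Event("B", 5), Event("A", 7), Event("B", 12)]
matches = run(transitions, "q0", {"qf"}, 0, events)
print(matches)
```

Because the automaton is non-deterministic, both A events remain live as partial matches, so the final B event completes two matches, each with its own register value.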

Published in

Proceedings of the VLDB Endowment, Volume 3, Issue 1-2
September 2010
1658 pages

Publisher

VLDB Endowment

