ABSTRACT
Event processing applications, from financial fraud detection to health care analytics, continuously execute event queries with Kleene closure to extract event sequences of arbitrary, statically unknown length, called Complete Event Trends (CETs). Because CETs share common event sub-sequences, existing approaches either delay responsiveness by recomputing these sub-sequences or require an exorbitant amount of memory to store partial results. To overcome these limitations, we define the CET graph, which compactly encodes all CETs matched by a query. Based on this graph, we define a spectrum of CET detection algorithms ranging from CPU-optimal to memory-optimal. We find the middle ground between these two extremes by partitioning the graph into time-centric graphlets and caching partial CETs per graphlet, which enables effective reuse of these intermediate results. We reveal cost monotonicity properties of the search space of graph partitioning plans. Our CET optimizer leverages these properties to prune significant portions of the search space and produce a partitioning plan with minimal CPU cost that stays within the given memory limit. Our experimental study demonstrates that our CET detection solution achieves up to 42-fold speed-up over state-of-the-art techniques in diverse scenarios, even under rigid memory constraints.
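The trade-off between recomputation and stored partial results can be illustrated with a minimal Python sketch. It is not the paper's algorithm. It assumes a hypothetical event shape `(timestamp, value)`, a caller-supplied `compatible` predicate that decides which later event may follow an earlier one, and it treats every path from an event without predecessors to an event without successors as a stand-in for a CET. Graphlet partitioning and the optimizer are not modeled. The function names are illustrative only.

```python
from typing import Callable, Dict, List, Tuple

Event = Tuple[int, float]  # hypothetical event shape: (timestamp, value)


def build_graph(events: List[Event],
                compatible: Callable[[Event, Event], bool]):
    """Vertices are events; an edge i -> j means event j may follow event i."""
    ordered = sorted(events)  # order by timestamp
    succ: Dict[int, List[int]] = {i: [] for i in range(len(ordered))}
    for i, earlier in enumerate(ordered):
        for j in range(i + 1, len(ordered)):
            if compatible(earlier, ordered[j]):
                succ[i].append(j)
    return ordered, succ


def start_vertices(succ: Dict[int, List[int]]) -> List[int]:
    """Vertices without incoming edges, i.e. possible first events."""
    has_pred = {j for targets in succ.values() for j in targets}
    return [i for i in succ if i not in has_pred]


def trends_by_dfs(ordered, succ) -> List[List[Event]]:
    """Memory-light extreme: keep only the current path, recompute shared suffixes."""
    trends: List[List[Event]] = []

    def walk(path: List[int]) -> None:
        followers = succ[path[-1]]
        if not followers:  # reached an event with no successor
            trends.append([ordered[i] for i in path])
            return
        for j in followers:
            path.append(j)
            walk(path)
            path.pop()

    for s in start_vertices(succ):
        walk([s])
    return trends


def trends_by_caching(ordered, succ) -> List[List[Event]]:
    """CPU-light extreme: store every vertex's suffixes once and reuse them."""
    cache: Dict[int, List[List[int]]] = {}

    def suffixes(i: int) -> List[List[int]]:
        if i not in cache:
            if not succ[i]:
                cache[i] = [[i]]
            else:
                cache[i] = [[i] + tail
                            for j in succ[i]
                            for tail in suffixes(j)]
        return cache[i]

    return [[ordered[i] for i in path]
            for s in start_vertices(succ)
            for path in suffixes(s)]
```

Both functions return the same set of paths. The first holds only one path in memory at a time but revisits shared suffixes, while the second computes each suffix once at the cost of storing all of them. The graphlet approach described in the abstract sits between these two, caching partial results per partition rather than for the whole graph.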
Index Terms
- Complete Event Trend Detection in High-Rate Event Streams