ABSTRACT
With the increasing demand for real-time system monitoring and tracking in various contexts, the volume of time-stamped event data is growing rapidly. Analytics on time-stamped events must run in real time, and the aggregated results must be accurate even when data arrives out of order. Unfortunately, frequent out-of-order arrivals significantly slow down processing and cause large delays in query response. Timon is a timestamped event database that aims to support aggregations and handle late arrivals both correctly (i.e., upholding exactly-once semantics) and efficiently. Our insight is that a broad range of applications can be implemented with data structures and corresponding operators that satisfy associative and commutative properties. Records arriving after the low watermark are appended to Timon directly, allowing aggregations to be performed lazily. To improve query efficiency, Timon maintains a TS-LSM-Tree, which keeps the most recent data in memory and contains a time-partitioning tree on disk for high-volume data accumulated over long time spans. In addition, Timon supports materialized aggregation views and correlation analysis across multiple streams. Timon has been successfully deployed at Alibaba Cloud and is a critical building block for Alibaba Cloud's continuous monitoring and anomaly analysis infrastructure.
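The role of associative and commutative operators can be illustrated with a small sketch. This is not Timon's implementation: the `Agg` class, the `aggregate_by_window` helper, and the fixed-size tumbling windows are all hypothetical names and choices made for illustration. The sketch only shows why such operators let late records be appended and folded in lazily: merging partial aggregates gives the same result regardless of arrival order. Integer values are used because floating-point addition is not exactly associative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agg:
    """Hypothetical partial aggregate over a set of events."""
    count: int
    total: int
    lo: int
    hi: int

    @staticmethod
    def of(value: int) -> "Agg":
        # Aggregate representing a single event.
        return Agg(1, value, value, value)

    def merge(self, other: "Agg") -> "Agg":
        # +, min and max are associative and commutative, so the order
        # in which partial aggregates are merged does not matter.
        return Agg(
            self.count + other.count,
            self.total + other.total,
            min(self.lo, other.lo),
            max(self.hi, other.hi),
        )


def aggregate_by_window(events, window):
    """Fold (timestamp, value) events into per-window aggregates.

    Events may appear in any order; a late event is merged into its
    window exactly like an on-time one. Deduplication (needed for
    exactly-once semantics) is out of scope for this sketch.
    """
    windows = {}
    for ts, value in events:
        key = ts - ts % window  # start of the tumbling window
        part = Agg.of(value)
        windows[key] = windows[key].merge(part) if key in windows else part
    return windows


in_order = [(1, 5), (4, 2), (11, 7), (13, 1)]
out_of_order = [(11, 7), (13, 1), (4, 2), (1, 5)]  # same events, shuffled

# Arrival order does not change the aggregated result.
assert aggregate_by_window(in_order, 10) == aggregate_by_window(out_of_order, 10)
```

Because `merge` is order-insensitive, a store built this way can defer combining late records until query time instead of reprocessing a window whenever an out-of-order event shows up.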
Index Terms
- Timon: A Timestamped Event Database for Efficient Telemetry Data Processing and Analytics