ABSTRACT
Time-series data is increasingly used in the Industrial Internet of Things (IIoT) and in large-scale scientific experiments. Managing time-series data requires a storage engine that can keep up with constantly growing volumes while providing acceptable query latency. While traditional ACID databases favor consistency over performance, many time-series databases with novel storage engines have been developed to provide better ingestion performance and lower query latency. To understand how the unique design of a time-series database affects its performance, we design SciTS, a highly extensible and parameterizable benchmark for time-series data. The benchmark studies the data ingestion capabilities of time-series databases, especially as they grow in size. It also studies the latencies of five practical queries drawn from the scientific-experiment use case. We use SciTS to evaluate the performance of four databases with four distinct storage engines: ClickHouse, InfluxDB, TimescaleDB, and PostgreSQL.
Index Terms
- SciTS: A Benchmark for Time-Series Databases in Scientific Experiments and Industrial Internet of Things