DOI: 10.1145/3538712.3538723
research-article

SciTS: A Benchmark for Time-Series Databases in Scientific Experiments and Industrial Internet of Things

Published: 23 August 2022

ABSTRACT

Time-series data is increasingly used in the Industrial Internet of Things (IIoT) and in large-scale scientific experiments. Managing it requires a storage engine that can keep up with constantly growing data volumes while providing acceptable query latency. While traditional ACID databases favor consistency over performance, many time-series databases with novel storage engines have been developed to provide better ingestion performance and lower query latency. To understand how the design of a time-series database affects its performance, we design SciTS, a highly extensible and parameterizable benchmark for time-series data. The benchmark studies the data ingestion capabilities of time-series databases, especially as they grow in size. It also studies the latencies of 5 practical queries drawn from the scientific experiments use case. We use SciTS to evaluate the performance of 4 databases with 4 distinct storage engines: ClickHouse, InfluxDB, TimescaleDB, and PostgreSQL.

                Published in

                SSDBM '22: Proceedings of the 34th International Conference on Scientific and Statistical Database Management
                July 2022
                201 pages
                ISBN: 9781450396677
                DOI: 10.1145/3538712

                Copyright © 2022 ACM

                Publisher

                Association for Computing Machinery

                New York, NY, United States

                Qualifiers

                • research-article
                • Research
                • Refereed limited

                Acceptance Rates

                Overall Acceptance Rate: 56 of 146 submissions, 38%
