Abstract
Many systems today produce enormous volumes of textual, numerical, geospatial, structured, and unstructured data. This data may serve many business needs, each of which implies specific performance requirements. These requirements typically make it imperative that the application use the most efficient means to store and retrieve the data. In the past, systems architects would implement SQL relational database management systems (RDBMS). Today, however, advances in NoSQL and document storage technologies offer highly scalable non-relational databases that can process and store vast amounts of data with high performance and efficiency. Mixing traditional relational databases with these newer databases creates a database system model capable of serving different business needs. This paper proposes a stream architecture, based on open-source technologies such as Apache Kafka, a NoSQL database, a time series database (TSDB), a relational database, and a document indexing engine, that is capable of feeding the same data to several database systems. The implementation will handle massive incoming data from processed network traffic traces, which will be ingested into several databases through the stream architecture.
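The fan-out idea described in the abstract, one stream feeding the same records to several stores, can be illustrated with a minimal in-memory sketch. It is not the authors' implementation and uses no Kafka client: `Topic`, `Sink`, the three transform functions, and the record fields (`ts`, `src`, `dst`, `bytes`) are all hypothetical names chosen for illustration. Each sink keeps its own read offset, which loosely mirrors how independent consumer groups each read the full stream.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Topic:
    """In-memory stand-in for an append-only stream topic (not the Kafka API)."""
    records: List[dict] = field(default_factory=list)

    def append(self, record: dict) -> None:
        self.records.append(record)


@dataclass
class Sink:
    """One downstream store with its own read offset (hypothetical)."""
    name: str
    transform: Callable[[dict], dict]
    offset: int = 0
    stored: List[dict] = field(default_factory=list)

    def poll(self, topic: Topic) -> int:
        """Consume every record this sink has not seen yet; return the count."""
        new = topic.records[self.offset:]
        self.stored.extend(self.transform(r) for r in new)
        self.offset += len(new)
        return len(new)


def to_document(r: dict) -> dict:
    # Document store: keep the record as-is.
    return dict(r)


def to_point(r: dict) -> dict:
    # Time series store: timestamp, tags, and numeric fields.
    return {"time": r["ts"], "tags": {"src": r["src"]}, "fields": {"bytes": r["bytes"]}}


def to_row(r: dict) -> dict:
    # Relational store: a fixed set of columns.
    return {"ts": r["ts"], "src": r["src"], "dst": r["dst"], "bytes": r["bytes"]}


def run_demo() -> List[Sink]:
    topic = Topic()
    sinks = [
        Sink("document", to_document),
        Sink("timeseries", to_point),
        Sink("relational", to_row),
    ]
    # Assumed shape of a processed traffic-trace record (illustrative only).
    topic.append({"ts": 1, "src": "192.0.2.1", "dst": "198.51.100.7", "bytes": 1500})
    topic.append({"ts": 2, "src": "192.0.2.1", "dst": "198.51.100.9", "bytes": 40})
    for sink in sinks:
        sink.poll(topic)  # every sink reads the whole stream independently
    return sinks


if __name__ == "__main__":
    for s in run_demo():
        print(s.name, len(s.stored))
```

Because each sink tracks its own offset, a slow or temporarily unavailable store can catch up later without affecting the others, which is the property that makes a shared stream attractive for polyglot persistence.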
Acknowledgements
This work is funded by National Funds through the FCT (Foundation for Science and Technology, IP), within the scope of the project Ref. UIDB/05583/2020. Furthermore, we would like to thank the Research Centre in Digital Services (CISeD) and the Polytechnic of Viseu for their support.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Oliveira, L., Brito, J., Cá, F., Wanzeller, C., Martins, P., Abbasi, M. (2022). Multi-DB Data Streaming on Polyglot Systems. In: Reis, J.L., López, E.P., Moutinho, L., Santos, J.P.M.d. (eds) Marketing and Smart Technologies. Smart Innovation, Systems and Technologies, vol 279. Springer, Singapore. https://doi.org/10.1007/978-981-16-9268-0_11
DOI: https://doi.org/10.1007/978-981-16-9268-0_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-9267-3
Online ISBN: 978-981-16-9268-0
eBook Packages: Engineering, Engineering (R0)