DOI: 10.1145/3445945.3445949
Research article

ESTemd: A Distributed Processing Framework for Environmental Monitoring based on Apache Kafka Streaming Engine

Published: 1 March 2021

ABSTRACT

Distributed networks and real-time systems are becoming central components of the Internet of Things (IoT), in which sensors generate huge data streams and existing legacy systems contribute further data sets. This data makes it possible to measure, infer and understand environmental indicators, from delicate ecologies and natural resources to urban environments, through the analysis of heterogeneous data sources (structured and unstructured). In this paper, we propose a distributed framework, the Event STream Processing Engine for Environmental Monitoring Domain (ESTemd), for applying stream processing to heterogeneous environmental data. Our work demonstrates the useful role big data techniques can play in environmental decision support, early warning and forecasting systems. The proposed framework addresses the challenge of data heterogeneity across systems and offers real-time processing of huge environmental datasets through a publish/subscribe method over a unified data pipeline, using Apache Kafka for real-time analytics.
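The publish/subscribe idea in the abstract can be illustrated with a minimal sketch. This is not the ESTemd implementation and it does not use Apache Kafka: the `Broker` class is a hypothetical in-memory stand-in for a topic-based broker, and the topic name, record fields, input formats and the 3.0 m threshold are all assumptions made for illustration. It shows two heterogeneous sources (a structured sensor payload and a legacy CSV line) being normalized into one record shape and published to a single topic that a subscriber analyzes as records arrive.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Record = Dict[str, object]


class Broker:
    """Hypothetical in-memory stand-in for a topic-based broker (not Kafka)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[Record], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Record], None]) -> None:
        # Register a handler that is called for every record on the topic.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, record: Record) -> None:
        # Deliver the record synchronously to each subscriber of the topic.
        for handler in list(self._subscribers[topic]):
            handler(record)


def normalize_sensor(raw: Dict[str, str]) -> Record:
    # Assumed structured source: a dict with "id", "type" and "reading" keys.
    return {
        "source": "sensor",
        "station": raw["id"],
        "metric": raw["type"],
        "value": float(raw["reading"]),
    }


def normalize_legacy(line: str) -> Record:
    # Assumed legacy source: a CSV line of the form "station,metric,value".
    station, metric, value = line.strip().split(",")
    return {"source": "legacy", "station": station, "metric": metric, "value": float(value)}


TOPIC = "environment.readings"  # assumed topic name
broker = Broker()
alerts: List[Record] = []


def flood_watch(record: Record) -> None:
    # Illustrative rule: flag water levels above an assumed 3.0 m threshold.
    if record["metric"] == "water_level_m" and record["value"] > 3.0:
        alerts.append(record)


broker.subscribe(TOPIC, flood_watch)
broker.publish(TOPIC, normalize_sensor({"id": "S1", "type": "water_level_m", "reading": "3.4"}))
broker.publish(TOPIC, normalize_legacy("L7,water_level_m,2.1\n"))

for alert in alerts:
    print(f"ALERT {alert['station']}: {alert['metric']}={alert['value']}")
```

In a real deployment the broker role would be played by a Kafka cluster, with producers and consumers replacing the direct `publish` and `subscribe` calls; the normalization step before publishing is what lets downstream consumers treat all sources uniformly.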


Published in

ICBDR '20: Proceedings of the 4th International Conference on Big Data Research
November 2020
110 pages
ISBN: 9781450387750
DOI: 10.1145/3445945

Copyright © 2020 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Qualifiers: research-article, Research, Refereed limited
