DOI: 10.1145/3404687.3404694
research-article

A Comprehensive Overview of BIG DATA Technologies: A Survey

Published: 30 July 2020

ABSTRACT

With the arrival of the new technological revolution, machines and communication channels such as social media sites are producing data at a rapidly growing rate. Size is therefore the core, and often the only, facet that comes to mind at the mention of big data. This article attempts a comprehensive view of big data technologies, motivated by the fact that industry-driven growth in data has left the academic literature trying to catch up. The paper also offers a unified explanation of big data and of the analytics methods applied to it. A distinguishing characteristic of the paper is its focus on analytics for unstructured data, which makes up more than 90% of big data. Considerable work has been done to deal with complicated big data problems, and the paper analyzes contemporary big data technologies. It further reinforces the need to develop new tools for analytics. It provides not only a global overview of big data techniques but also an assessment based on the Hadoop ecosystem. Finally, it classifies and discusses the features, challenges, and uses of the main technologies.


    • Published in

      ACM Other conferences
      ICBDC '20: Proceedings of the 5th International Conference on Big Data and Computing
      May 2020
      133 pages
      ISBN:9781450375474
      DOI:10.1145/3404687

      Copyright © 2020 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article
      • Research
      • Refereed limited