Big data analytics: Computational intelligence techniques and application areas

https://doi.org/10.1016/j.techfore.2018.03.024

Highlights

  • We highlight the importance of Big Data in modern life and economy.

  • We investigate the benefits of computational intelligence techniques in big data analytics.

  • We present a data modelling methodology called Hierarchical Spatial-Temporal State Machine.

  • We explore the potential of the powerful combination of Big Data and Computational Intelligence.

  • We identify a number of areas where novel applications in real world problems can be developed.

Abstract

Big Data has a significant impact on developing functional smart cities and supporting modern societies. In this paper, we investigate the importance of Big Data in modern life and the economy, and discuss challenges arising from Big Data utilization. Different computational intelligence techniques have been considered as tools for Big Data analytics. We also explore the powerful combination of Big Data and Computational Intelligence (CI) and identify a number of areas where novel applications in real-world smart city problems can be developed by utilizing these tools and techniques. We present a case study for intelligent transportation in the context of a smart city, and a novel data modelling methodology based on a biologically inspired universal generative modelling approach called Hierarchical Spatial-Temporal State Machine (HSTSM). We further discuss various implications of policy, protection, valuation and commercialization related to Big Data, its applications and deployment.

Introduction

The importance of data in our increasingly information-driven economy and society can be summarized in the statement that “big data is the new oil”, as recently quoted by IBM's Chief Executive Officer (Hirsch, 2013). The analogy between the importance of data and the world's reliance on this natural resource has been highlighted by various studies demonstrating the power and impact of Big Data on modern life. This new digital resource can provide the main driving force for building the smart cities of tomorrow. Big Data can be defined as huge sets of data with a structure of increasing variety and complexity. The inherent difficulty in dealing with these large amounts of data results in major challenges concerning their storage and analysis, as well as the cost- and time-efficient delivery of results. Moreover, information should be delivered in an interpretable and easily visualized way (Sagiroglu and Sinanc, 2013). Big Data analytics refers to the techniques utilized in order to examine, process, discover and expose hidden underlying patterns, interesting relations and other insights concerning the application context under investigation.

As pointed out by Hashem et al., Big Data have three main characteristics. Firstly, as the name reveals, the data themselves are numerous. Secondly, it is not possible to categorize the data into regular relational databases, and finally, data streams are created, captured, and analyzed rapidly (Hashem et al., 2015). As Gerhardt et al. mention, Big Data is a revolutionary leap forward from traditional analysis and possesses three main characteristics: volume, variety, and velocity (Gerhardt et al., 2012). Volume refers to the amount of data that is created and stored. Variety is related to the various types of data collected, and velocity can be defined as the speed of data generation, streaming and aggregation (Kaisler et al., 2013). In Kaisler et al.'s study, data value and complexity are also proposed as Big Data characteristics. Data value is a measure of the usefulness of data in decision-making processes, while complexity is a measure of the degree of interdependence and interconnectedness in Big Data structures.

Nowadays, in modern smart cities, there is a growth in the amount of data that can be captured and utilized. Recent advances in hardware and software technologies, such as social media, the internet of things, wearable sensors, mobile technologies, data storage and cloud computing, data mining techniques and machine learning algorithms, have resulted in the ability to easily acquire, analyze and store large amounts of data from different kinds of quantitative and qualitative domain-specific data sources. This data can be harvested from a large number of diverse sources, including emails and online transactions; multimedia information such as video, audio and pictures; large databases containing health records and other information; information captured during a user's interaction with social media, such as posts and status updates; data derived from the search queries or click patterns of a user; physiological data such as heart rate and skin conductivity, as captured from wearable sensors; data derived and extracted from our interaction with our mobile devices or other devices inside a smart home; and data from scientific research, among others (Eaton et al., 2012). It can easily be concluded from the above that in the modern world, data is generated at a record rate (Villars et al., 2011). This fact has aided Big Data's emergence as a very important subject, which has continually gained interest from both academia and industry.

The potential utilization of this huge amount of information to transform the way of life in modern smart cities catapults Big Data and Big Data analysis to the forefront of modern research communities, businesses, and governments (Hashem et al., 2015), delivering the promise of countless new application areas and opportunities, which can emerge under completely diverse application contexts, from transportation (Hashem et al., 2016) to health care (Murdoch and Detsky, 2013). As a result, the benefits arising from this wealth of knowledge and information can affect research in numerous ways. This includes, for example, promoting medical advances by providing evidence in the identification of symptoms and patterns concerning diseases, pandemics or modern health issues, or aiding in the creation of large ground truth databases for scientific fields such as emotion research and affective computing, which are in desperate need of vast amounts of data in order to successfully create emotion models and effective emotion recognition techniques. The modern economy and businesses in the context of a smart city can also greatly benefit from Big Data and Big Data analytics, since they can utilize the data generated from the interaction of users with social networks or smart devices in order to identify the users' preferences, recognize unsatisfied needs of modern clients, or understand the relationships between competing and collaborating organizations, thus creating better and more appealing services and products, or improving already existing ones. Acquiring more nuanced insights into customer preferences and needs will provide modern organizations and businesses with a crucial advantage over competitors (Sagiroglu and Sinanc, 2013). As Hirsch notes, Big Data “is becoming a significant corporate asset, a vital economic input, and the foundation of new business models” (Hirsch, 2013).
The fast evolution of ICT, Big Data analytics, and energy-efficient communication protocols has provided new momentum to e-businesses and global connectivity (Qureshi et al., 2017). Big Data analytics can also facilitate the efforts of governments and smart city local authorities to deliver better services to their citizens. Big Data can aid governments in improving healthcare, public transport, education, and other fields of social life, thus helping to shape a more efficient modern society. For example, data from traffic records can be utilized to improve the public transport services delivered by the state to the population. In particular, IoT-related Big Data applications such as smart homes, combinations of wearable health devices, and smart cities are widely recognized as key factors that can contribute to the economic and social development of developing countries (Mital et al., 2017).

Utilizing Big Data poses a number of challenges, such as security issues and the robust handling and storage of data in emergency situations. Enterprises and organizations store and analyze huge amounts of data, which should be processed in a secure way. This is a very challenging task in the modern environment. Recent studies demonstrate that traditional tools are not able to tackle this problem, for reasons such as the centralised nature of Big Data stores or the enhanced capabilities of attackers to penetrate and neutralise traditional security systems (Sagiroglu and Sinanc, 2013). In order to tackle these challenges, companies should incorporate risk-aware, contextual, and agile security models (Sagiroglu and Sinanc, 2013). As mentioned in Amin et al.'s paper, there is an identified need for lightweight intelligent security protocols that allow only legitimate users to access the sensitive information collected by various smart devices in the IoT environment (Amin et al., 2018).

In order to harvest the advantages of Big Data analytics in an increasingly knowledge-driven society, there is a need to develop solutions that reduce the complexity and cognitive burden of accessing and processing these large volumes of data in both embedded hardware and software-based data analytics (Maniak et al., 2015; Iqbal et al., 2015a, Iqbal et al., 2015b). Big challenges stem from the utilization of Big Data in the real world, since the implementation of real-time applications is becoming increasingly complex. This complexity derives from a variety of data-related factors. One factor is the high dimensionality that a dataset may possess, which increases the difficulty of processing and analyzing the data. The interactions, correlations and causal effects of these high-dimensional data parameters in relation to the behaviours and specific outcomes of these systems are often too complex to be analyzed and understood by human users. Additionally, data can be accumulated from diverse sources and input channels, making online processing very demanding due to the variety of signal inputs that need to be synchronized and the diverse data types that need to be analyzed simultaneously. Furthermore, the collected data often comprises multiple types of inputs that are not always precise or complete, due to various sources of imprecision and uncertainty together with missing data (e.g. malfunctioning or inaccurate sensors). Moreover, there is an inherent need in real-life applications for high-speed storage and processing of data, and for fast retrieval of the corresponding analysis results. Another factor that should be taken into account is that the method utilized for Big Data analytics should extract knowledge from data in an interpretable way. The computational techniques deployed to perform this task should make the underlying patterns that exist in the data transparent to the person who tries to utilize and understand them.
Finally, there is a need for techniques that perform online adaptation and incorporate contextual and user-specific elements in their design and decision-making mechanisms, in a user-friendly and computationally feasible manner. All the above factors should be reflected in the computational and machine learning techniques utilized to process and analyze Big Data, so that successful applications and models can be constructed (Suthaharan, 2014).
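
Two of the data-related factors described above, input channels that must be synchronized and readings that are missing because of faulty sensors, can be illustrated with a minimal sketch. The stream names, timestamps, values and the forward-fill policy below are assumptions chosen for illustration only; they are not taken from the paper or from any particular analytics framework.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# (timestamp in seconds, value or None when the sensor reading is missing)
Reading = Tuple[float, Optional[float]]


def forward_fill(stream: List[Reading]) -> List[Tuple[float, float]]:
    """Replace missing values with the last valid reading; drop leading gaps."""
    filled: List[Tuple[float, float]] = []
    last: Optional[float] = None
    for t, v in stream:
        if v is not None:
            last = v
        if last is not None:
            filled.append((t, last))
    return filled


def sample_at(stream: List[Tuple[float, float]], t: float) -> Optional[float]:
    """Return the most recent value at or before time t, or None if there is none."""
    times = [ts for ts, _ in stream]
    i = bisect_right(times, t)
    return stream[i - 1][1] if i > 0 else None


def synchronize(a: List[Reading], b: List[Reading], clock: List[float]):
    """Align two irregularly sampled streams onto one shared clock."""
    a_filled, b_filled = forward_fill(a), forward_fill(b)
    return [(t, sample_at(a_filled, t), sample_at(b_filled, t)) for t in clock]


# Hypothetical traffic streams: average speed (km/h) and vehicle count.
speed = [(0.0, 50.0), (1.0, None), (2.0, 46.0)]  # reading at t=1.0 is missing
count = [(0.5, 12.0), (1.5, 15.0)]               # sampled at different times
aligned = synchronize(speed, count, clock=[1.0, 2.0])
print(aligned)  # [(1.0, 50.0, 12.0), (2.0, 46.0, 15.0)]
```

Forward-filling is only one possible imputation policy; in a real deployment the choice would depend on how quickly the measured quantity changes and on how long a sensor can be expected to stay silent.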

In the following section, we explore the potential of applying Computational Intelligence techniques to Big Data analytics, and we present recent applications that utilize these techniques for Big Data analytics tasks. In Section 3, we discuss potential areas where intelligent machine learning techniques can be applied to Big Data analytics, resulting in novel and interesting smart city applications. In Section 4, we present our novel data modelling methodology. In Section 5, we consider different aspects of policy, protection, valuation and commercialization related to Big Data. Finally, in Section 6, we discuss the conclusions arising from this study.

Section snippets

Computational intelligence for big data analytics

Machine learning (ML) approaches offer a means for modelling patterns and correlations in data in order to discover relationships and make predictions based on unseen events. ML approaches consist of supervised learning (learning from labeled data), unsupervised learning (discovering hidden patterns in data or extracting features) and reinforcement learning (goal oriented learning in dynamic situations) (Mitchell, 1997). As such, ML approaches can also be categorised into regression techniques,
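
As a rough illustration of the first two learning categories named in this snippet, the sketch below contrasts a supervised fit (labeled targets are given) with an unsupervised grouping (no labels). It uses ordinary least squares and a one-dimensional two-cluster k-means purely as familiar stand-ins; the function names and toy data are assumptions, and nothing here is drawn from the techniques the paper goes on to discuss. Reinforcement learning is left out for brevity.

```python
from statistics import mean
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Supervised: least-squares fit of y = slope * x + intercept from labeled pairs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def two_means(points: List[float], iterations: int = 10) -> Tuple[float, float]:
    """Unsupervised: 1-D k-means with k=2, initialised at the minimum and maximum."""
    lo, hi = min(points), max(points)
    if lo == hi:  # all points identical: there is nothing to separate
        return lo, hi
    for _ in range(iterations):
        near_lo = [p for p in points if abs(p - lo) <= abs(p - hi)]
        near_hi = [p for p in points if abs(p - lo) > abs(p - hi)]
        lo, hi = mean(near_lo), mean(near_hi)
    return lo, hi


# Supervised: inputs xs come with known targets ys (here generated by y = 2x + 1).
slope, intercept = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])

# Unsupervised: only the values are given; two groups emerge from the data.
centre_a, centre_b = two_means([1.0, 1.2, 0.8, 9.0, 9.5, 10.5])
```

With this toy data the fitted line recovers a slope of 2 and an intercept of 1, and the two cluster centres settle near 1 and near 9.67.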

Potential application areas

Big Data and CI can be utilized in order to provide novel applications with scientific and commercial value. In this section we provide examples of opportunities from a variety of different application areas, which can contribute to the creation of a truly intelligent digital environment in the context of a smart city.

Proposed data modelling methodology

Based on the latest discoveries in the field of neuroscience, the core part of our proposed data modelling methodology introduces a novel biologically inspired universal generative modelling approach called Hierarchical Spatial-Temporal State Machine (HSTSM), which has been developed based on an understanding of the structure and functionality of the human brain. The proposed approach is based on a hybrid method incorporating a number of soft computing techniques such as: deep belief networks,

Policy, protection, valuation and commercialization

The possible misuse of Big Data sets, data to power the Internet of Things, personal data held and large corporate commercialization currently features as one of the biggest issues in global commerce. As outlined in this paper, controlling the accessibility of Big Data is of critical importance to the global economy, and countries and territories face a huge challenge to legislate access to these data sources given differences of opinion, ethics and laws governing the distribution and security

Conclusions

In this paper, we explored the potential of Big Data and Big Data analytics to support the development of modern smart cities, and investigated the benefits arising from the utilization of computational techniques, namely deep learning neural networks, evolutionary algorithms and fuzzy logic, in data analytics. We identified and highlighted potential novel smart city applications arising from the vast amount of information offered by modern high-tech societies, and from the deployment of intelligent

References (96)

  • S. Mahmud et al. Cloud enabled data analytics and visualization framework for health-shocks prediction. Futur. Gener. Comput. Syst. (2016)

  • T. Maniak et al. Automated intelligent system for sound signalling device quality assurance. Inf. Sci. (2015)

  • L. Monostori. AI and machine learning techniques for managing complexity, changes and uncertainties in manufacturing. Eng. Appl. Artif. Intell. (2003)

  • A. Ortigosa et al. Sentiment analysis in Facebook and its application to e-learning. Comput. Hum. Behav. (2014)

  • N. Sultan. Making use of cloud computing for healthcare provision: opportunities and challenges. Int. J. Inf. Manag. (2014)

  • A. Tirachini. Estimation of travel time and the benefits of upgrading the fare payment technology in urban bus services. Transp. Res. C Emerg. Technol. (2013)

  • E.I. Vlahogianni et al. A real-time parking prediction system for smart cities. J. Intell. Transp. Syst. (2016)

  • D. Whitley. An overview of evolutionary algorithms: practical issues and common pitfalls. Inf. Softw. Technol. (2001)

  • L.A. Zadeh. Fuzzy sets. Inf. Control. (1965)

  • M.A. Alsheikh et al. Mobile Big Data Analytics Using Deep Learning and Apache Spark. (2016)

  • O. Behadada et al. Big data-based extraction of fuzzy partition rules for heart arrhythmia detection: a semi-automated approach.

  • R. Bekkerman et al. Scaling Up Machine Learning: Parallel and Distributed Approaches. (2011)

  • Y. Bengio. Learning deep architectures for AI. Found. Trends Mach. Learn. (2009)

  • L. Bing et al.

  • R.A. Calvo et al. Affect detection: an interdisciplinary review of models, methods, and their applications. Affect. Comput. IEEE Trans. (2010)

  • I. Campo et al. A real-time driver identification system based on artificial neural networks and cepstral analysis.

  • V. Chang et al. Research investigations on the use or non-use of hearing aids in the smart cities. Technol. Forecast. Soc. Chang. (2018)

  • X.W. Chen et al. Big data deep learning: challenges and perspectives. Access IEEE (2014)

  • J. Chirillo et al. Implementing Biometric Security. (2003)

  • I.H. Chung et al. Parallel deep neural network training for big data on blue gene/q.

  • D.G. Costa et al. A fuzzy-based approach for sensing, coding and transmission configuration of visual sensors in smart city applications. Sensors (2017)

  • F.A. D'Asaro et al. Computational intelligence and citizen communication in the Smart City. Informatik-Spektrum (2017)

  • S. Djahel et al. Toward V2I communication technology-based solution for reducing road traffic congestion in smart cities.

  • F. Doctor et al. U.S. Patent No. 8,515,884. (2013)

  • F. Doctor et al. An intelligent framework for monitoring student performance using fuzzy rule-based linguistic summarisation.

  • F. Doctor et al. A fuzzy embedded agent-based approach for realizing ambient intelligence in intelligent inhabited environments. IEEE Trans. Syst. Man Cybern. Syst. Hum. (January 2005)

  • F. Doctor et al. A fuzzy ambient intelligent agents approach for monitoring disease progression of dementia patients. J. Ambient. Intell. Humaniz. Comput. (2014)

  • F. Dreier. Genetic Algorithm Tutorial. (2002)

  • R. Duggal et al. Improving patient matching: single patient view for clinical decision support using big data analytics.

  • C. Eaton et al. Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data. (2012)

  • Esposito, E., De Vito, S., Salvato, M., Fattoruso, G., & Di Francia, G. (2017). Computational Intelligence for Smart...

  • C. Free et al. The effectiveness of mobile-health technologies to improve health care service delivery processes: a systematic review and meta-analysis. PLoS Med. (2013)

  • R. Fujdiak et al. Using genetic algorithm for advanced municipal waste collection in Smart City.

  • B. Gerhardt et al. “Unlocking Value in the Fragmented World of Big Data Analytics”, Cisco Internet Business Solutions Group.

  • A. Grzywaczewski et al. E-marketing strategy for businesses.

  • T. Gulden et al.

  • V. Gupta et al.

  • A.B. Habtie et al. A neural network model for road traffic flow estimation.

    Rahat Iqbal is Chief Executive Officer of Interactive Coventry Ltd. and a Reader/Associate Professor in the Faculty of Engineering, Environment and Computing at Coventry University. He has a track record of project management and leadership of industrial projects funded by EPSRC, TSB, ERDF and local industries (e.g. Jaguar Land Rover Ltd., Trinity Expert Systems Ltd). He was involved in the project management and development of the EU FP7 project CHIL (Computers in Human Interaction Loop) at the Technical University of Eindhoven, Netherlands. Recently, he has successfully led a project in collaboration with Jaguar Land Rover on a self-learning car for predicting drivers' behaviour for personalization of telematics and optimization of route planning. He has managed many industrial projects in Intelligent Systems, Predictive Modelling, User Behaviour, Information Retrieval and Fault Detection. He has published more than 100 papers in peer-reviewed journals and reputable conferences and workshops. Dr. Iqbal is on the programme committee of several international conferences and workshops. He is also a fellow of the UK Higher Education Academy (HEA). Dr. Iqbal has also edited several special issues of international journals within the field of Information Retrieval and User Supportive systems.

    Faiyaz Doctor is Research Director of Interactive Coventry Ltd. and a Lecturer in the School of Computer Science and Electronic Engineering, University of Essex. He has worked in industry and academia to develop novel Computational Intelligence solutions addressing real-world problems related to smart environments, Energy Optimisation, Predictive Analytics and Collaborative Decision Support. His work has resulted in high-profile innovation awards (best KTP regional finalist 2011, Lord Stafford award for Innovation) and an international patent on improved approaches for Data Analysis and Decision-Making using Hybrid Neuro-Fuzzy and Type-2 Fuzzy Systems: WO/2009/141631. He has led and co-led projects funded through the Newton Fund and Conacyt (Mexico), the TSB in collaboration with local industry (e.g. Jaguar Land Rover Ltd) and through collaborative consultancy with international partners (e.g. Ministry of Labour, Saudi Arabia). Dr. Doctor has published over 50 papers in peer-reviewed international journals, conferences and workshops. He regularly serves on organization and programme committees of several international conferences and workshops in the field of Computational Intelligence. He is also a member of the IEEE and IEEE Computational Intelligence Society.

    Brian More represents Coventry University Enterprise Ltd. as Director of Intellectual Property (IP) and serves on the Board of Directors of Interactive Coventry Ltd. He works as Director for Intellectual Property at Coventry University with responsibility for policy, protection, valuation and commercialization of all forms of IP. He manages a portfolio of 20 patent families, trademarks, designs and copyright. He has 25 years' experience working with Intellectual Property, is an inventor on 6 patents and jointly owns 3 trademarks. Dr. More has been active in starting 15 companies using IP and attracting investment into them. He is passionate about education and training in the IP arena and has developed accredited courses for academic and industry use. He has published 20 peer-reviewed papers on research, innovation and entrepreneurship. He worked at CEA, NPL and BNFL's Company Research Laboratory. He is a Director of 2 companies and sits on 3 national advisory panels. He sits on the Board of Trustees of the Institute of Nanotechnology, prior to which he was Chairman of the Steering Group to the Institute. He has worked for private contractors on the assessment of development proposals in the field of Nanotechnology and worked on EU Framework projects as a commercialization consultant.

    Shahid Mahmud is Chairman and Chief Executive Officer of Interactive Group. He has more than 31 years of professional experience in the field of ICT. He has served on various federal committees of the Government of Pakistan addressing the formulation and implementation of the National Telecom and IT policies. Dr. Mahmud is a 2016 Distinguished Eisenhower Fellow. He is also a Senior Fellow, Global Think Tank Network (GTTN) and Co-Chair for ICT on the Corporate Advisory Council of the National University of Science and Technology (NUST). He is active in several philanthropic activities, working with youth-oriented and community service projects such as Buraq Planetary Society, TRUCE, Begum Mehmooda Welfare Trust and Zubaida Khaliq Memorial Free Hospital. He has been the founder director and shareholder of Paktel Limited, Indus Vision, Pak Globalstar (Pvt) Limited, SHOA (Pvt) Limited, and Shaheen Pay TV (Pvt) Limited. He has also served as a Director of Askari Bank for over six years. In recognition of having spent his entire career promoting IT in Pakistan, Dr. Mahmud was given the Lifetime Achievement Award at the 12th Teradata National IT Excellence Awards for 2014.

    Usman Yousuf is Chief Executive Officer UAE at Interactive Group. His primary responsibilities include developing and implementing business strategy for the Middle East operation, forging key partnerships and alliances, and directing technical and project teams. In addition to being a certified Project Management Professional (PMP), he is also certified in logistics and The Open Group Architecture Framework 9 (TOGAF 9 Level 2). He has completed training in Strategic visioning, time management and negotiation and attended numerous workshops and trainings over the course of his career. He has worked with the United Nations Office for Project Services (UNOPS) in his previous role in business development at a regional IT firm. He is actively involved with the Buraq Planetary Society (www.buraqsociety.org) as Vice Chairman and volunteers his time for the development of Pakistan's youth. He has lived in the UAE for over 12 years and is fluent in English, Urdu and Arabic.

