
1 Introduction

Modern transport systems aim at an optimal flow of goods and people, which significantly impacts the quality of life in any society. Technological developments in transport modes, population growth and changes in population density, increasing urbanisation - all these factors generate new challenges for the management of modern transport systems. Poor traffic organisation causes numerous traffic jams, hindering the efficient use of transport infrastructure and increasing travel time, air pollution, and fuel consumption.

The rapid development of research, technology and information tools for electronics, communication and control systems, sensor networks and the processing of enormous data sets has contributed to a real revolution in modern public and private transport management. It has also supported the development of effective strategies for road infrastructure and management systems and for the efficient use of existing infrastructure. The use of modern Information and Communication Technology (ICT) systems has significantly improved (and in many situations even enabled) the transfer of accurate traffic data and the implementation of control measures, changing the level of uncertainty and randomness that characterised conventional, manually managed transport networks. Successful implementation of Intelligent Transportation Systems (ITS) requires a good understanding of traffic behaviour, both locally and globally, and of the impact of related phenomena and possible traffic anomalies, such as the generation and propagation of shock waves, the initiation of congestion, etc. Human relations are also an essential factor that must be considered in designing such systems.

ITS is nowadays an important global market. Figure 1 shows the calculated and estimated size of this market in the period 2018–2025.

Fig. 1. Global ITS market size (https://marketersmedia.com)

Based on the statistical analysis provided by MarketersMedia (Footnote 1), it is expected that the market will reach US$ 42.6 billion in 2025. The leading players are Efkon AG (Austria, Footnote 2), Hitachi Ltd. (Japan, Footnote 3), Thales S.A. (France, Footnote 4), Roper Industries, Inc. (U.S., Footnote 5), Xerox Corporation (U.S., Footnote 6), Q–Free ASA (Norway, Footnote 7), Kapsch AG (Austria, Footnote 8), Siemens AG (Germany, Footnote 9), Garmin Ltd. (Switzerland, Footnote 10) and TomTom International BV (Netherlands, Footnote 11).

Managing transportation systems, especially in highly urbanized environments, is a very complex problem of optimization and decision making. Simplified mathematical models cannot always accurately capture the high complexity and dynamics of transport systems. For this reason, computer networks, complex data and information transmission and processing systems, and computer simulations are used to enable comprehensive analysis and characterization of traffic flows on a given road network.

Despite the benefits of implementing today’s numerous computer models to support transportation management, modern ITS systems must cope with the secure processing of streaming time-series data generated by connected real-time data sources. Examples of such streaming data sources include sensors in transportation systems and in intelligent cars. The collected data can be preprocessed locally using the resources and IT infrastructure of the institution (customer) directly responsible for transport management in a given area. Often, however, the data and metadata, along with initial analysis results, are sent to external systems (e.g., cloud computing) that can analyze the data more closely and send back alerts on potential threats and anomalies. All this indicates the complexity of ITS systems and their potential vulnerability to attacks at the local (client) and global levels.

It should be noted that a significant challenge for modern ITS systems is their real-time operation. This means that manual adjustment of the parameters of such systems or manual data labelling is impossible, and the whole modelled transport system is characterized by high dynamics of changes in its parameters and thus in the generated data. Such dynamic environments are often prone to concept drift [44], which means that the statistical properties of the target variable, which the model is trying to learn, change over time in unforeseen ways and are hence non-stationary. This requires any intelligent learning system to be able to continuously learn and adapt to system changes.
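The notion of concept drift can be illustrated with a simple streaming detector. The following is a minimal sketch, assuming the Page-Hinkley test as the detection technique (it is not named in this chapter) and purely synthetic travel-time data; the parameter values `delta` and `threshold` are hypothetical and would need tuning for real sensor streams.

```python
class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a data stream."""

    def __init__(self, delta=0.005, threshold=50.0):
        self.delta = delta          # tolerated magnitude of change (assumed value)
        self.threshold = threshold  # alarm threshold (assumed value)
        self.n = 0
        self.mean = 0.0
        self.cumulative = 0.0
        self.minimum = 0.0

    def update(self, x):
        """Consume one observation; return True if drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n            # running mean
        self.cumulative += x - self.mean - self.delta    # cumulative deviation
        self.minimum = min(self.minimum, self.cumulative)
        return self.cumulative - self.minimum > self.threshold


def first_drift_index(stream, detector):
    """Return the index of the first observation that triggers the detector."""
    for i, x in enumerate(stream):
        if detector.update(x):
            return i
    return None


# Synthetic travel times (seconds): a stable regime followed by an abrupt increase.
stable = [50 + (i % 3) for i in range(200)]
shifted = [80 + (i % 3) for i in range(200)]
drift_at = first_drift_index(stable + shifted, PageHinkley())
```

In this sketch the detector stays silent on the stable regime and raises an alarm shortly after the change point, at which moment a learning system could retrain or adapt its model.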

This chapter presents a simple overview of Intelligent Transportation System models, traffic flow models defined at the microscopic and macroscopic levels, and business-based models (see Sect. 2). In Sect. 3, we define and classify the types of threats, attacks and anomalies which can be detected in ITS systems.

ITS systems are usually a part of wider “smart” or “intelligent city” projects. A futuristic vision of such a city is presented in Fig. 2.

Fig. 2. “Clean mobility instead of dirty traffic” - smart city and sustainable transport conceptual model (https://en.wikipedia.org/wiki/Sustainable_transport)

In Sect. 4, we present a real example of such a smart city project implemented in the city of Wolfsburg (Germany), with a special focus on the Wolfsburg ITS system. This system has recently been integrated with the prototype of the GUARD platform developed within the H2020 GUARD project (Footnote 12). The Wolfsburg ITS system was implemented for the ‘smart transport’ case study in this project.

Simple conclusions are drawn in Sect. 5.

2 Intelligent Transportation Models

Modern ITSs are designed to improve the efficiency and safety of conventional transport systems, optimize transport costs, fuel and energy consumption, and improve the environment. ITS are generally an integral part of larger intelligent cities or intelligent environment projects using advanced IT methods and infrastructure, mainly for transport management, data processing and analysis, support and automation of logistics and decision-making systems. Modern ITSs thus open up a new market for IT services for drivers, travellers and infrastructure providers, e.g. as services in computational clouds. Therefore, ITS is rather understood as an umbrella term for many models, systems and applications that have been developed and implemented, not only strictly IT.

Despite a large number of publications in the field of state-of-the-art transportation systems, no exhaustive, multi-faceted taxonomy of these systems has yet emerged. It is also difficult to precisely define the general criteria of this taxonomy without overextending it.

In the early 1990s, the U.S. Department of Transportation accepted a national ITS standard. The ITS architecture identifies variable speed limits as a service package, which consists of two subsystems, as shown in Fig. 3.

Fig. 3. Variable speed limits service package (http://www.iteris.com/itsarch/html/mp/mpatms22.htm)

In this model, the traffic management subsystem supports monitoring and controlling roadway traffic and exchanges the data with the roadway subsystem. Based on it, the following two criteria can be considered for the classification of the ITS systems:

  • intelligent infrastructure and

  • intelligent vehicles.

This simple classification is very general. Intelligent vehicles are equipped with devices and information systems that support the driver. These include satellite navigation systems, intelligent speed adaptation (ISA), adaptive cruise control (ACC), forward collision warning (FCW), pedestrian detection systems (PDS), and lane departure warning (LDW). Therefore, there is also a need to classify the ICT models, methods and infrastructure, which may span both of the above criteria. For example, traffic flow analysis is the basis of many established technologies for evaluating and improving transportation systems, including analytical methods, ICT methods and techniques, and simulation software packages for planning and design, traffic control, traffic safety analysis, and demand management. In addition, business aspects significantly influence both the implementation and design of ITS models and their management.

The following subsections show examples of ITS traffic flow models, ITS models supporting traffic flow models, and business aspects.

2.1 Examples of the Traffic Flow ITS Models

Morrison Hershfield Limited (Footnote 13) has developed a comprehensive analysis of traffic in the vicinity of a quarry located in Dufferin County, Ontario, Canada, and of the impact of that traffic on the service levels, capacity and operation of the surrounding roadways, based on the traffic forecasts performed for the region. This analysis was performed for different scenarios developed based on the number of vehicles (trucks) entering the quarry area and moving in the opposite direction. The results of this analysis were used to optimize the ITS model by, for example, adding offset passing lanes, adding a left-turn lane at an intersection, and adding a new right-turn route at a crossroads (Footnote 14).

The objective of the project developed by Baby and Al–Sahrawi and presented in [3] is to analyze the traffic impact of residential redevelopment in Kuwait City. The analysis was conducted for thirteen planned locations of new roads, and the potential effect of road infrastructure development on fuel consumption was investigated. Models were developed for changes (balanced growth) in fuel consumption, but also in emissions of carbon monoxide (CO), nitrogen dioxide (NO2) and volatile organic compounds (VOC). The model predicts increases in air pollution and indicates possible methods to offset these increases.

URSC Canada (Footnote 15) studied the potential traffic impact and management of the proposed construction of a new thermal processing facility located in the Municipality of Clarington. The project indicates a projected increase of 2–3% in traffic volumes throughout the municipality and the need for investment in additional traffic signals, widening of certain arterials, intersection improvements and ramp terminals. The infrastructure upgrade project is scheduled to be completed by the end of 2023, in sync with the Clarington Energy Business Park (CEBP).

Xu et al. in [4] presented a method for effectively adapting a macroscopic urban traffic network model to improve urban planning and urbanization models in general. The developed model also used speed-density models to estimate travel time on individual road sections and the CORSIM (Footnote 16) system to simulate real traffic. The simulation results obtained with the developed model allowed the development of adequate time forecasts, especially during peak hours and sudden changes in traffic conditions.
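To illustrate how a speed-density relation yields a travel time estimate for a road section, the following minimal sketch assumes the classical linear Greenshields relation v = v_f (1 - k / k_j). This particular relation is an assumption made here for illustration; the chapter does not state which speed-density model is used in [4]. The function and parameter names are hypothetical.

```python
def greenshields_speed(density, free_flow_speed, jam_density):
    """Mean speed (km/h) at a given density (veh/km) under the Greenshields relation."""
    if not 0 <= density <= jam_density:
        raise ValueError("density must lie between 0 and the jam density")
    return free_flow_speed * (1.0 - density / jam_density)


def travel_time_hours(length_km, density, free_flow_speed, jam_density):
    """Estimated time (hours) needed to traverse a road section of the given length."""
    speed = greenshields_speed(density, free_flow_speed, jam_density)
    if speed == 0:
        return float("inf")  # at jam density traffic is at a standstill
    return length_km / speed


# Example: a 3 km section, free-flow speed 60 km/h, jam density 120 veh/km.
free_time = travel_time_hours(3.0, 0.0, 60.0, 120.0)    # empty road
peak_time = travel_time_hours(3.0, 60.0, 60.0, 120.0)   # half of jam density
```

With these illustrative values, the estimated travel time doubles when the density reaches half of the jam density, which is the kind of peak-hour effect such forecasts need to capture.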

The ITS model presented in [5] uses a queuing model of traffic flow in the network. Changes in traffic flow density at different nodes of the road infrastructure determine different levels of congestion on a given road link. The following criteria are defined in this model: capacity, congestion density, and free-flow speed. The implemented intersection stream functions make it possible to model the initiation, propagation and dispersion of traffic queues in the road network, which can be caused by so-called bottlenecks.
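The build-up and dispersion of a queue at a bottleneck can be sketched with a simple deterministic point-queue model. This is only an illustrative simplification under assumed inputs (vehicle arrivals per time interval and a fixed bottleneck capacity); it is not the network queuing model of [5], and it ignores the spatial extent of the queue.

```python
def simulate_bottleneck(arrivals, capacity):
    """Return the queue length (vehicles) at the end of each time interval.

    arrivals: vehicles arriving at the bottleneck in each interval
    capacity: maximum number of vehicles the bottleneck can serve per interval
    """
    queue = 0
    history = []
    for arriving in arrivals:
        # Vehicles that cannot be served in this interval remain queued.
        queue = max(0, queue + arriving - capacity)
        history.append(queue)
    return history


# Demand exceeds capacity in intervals 2 and 3, so a queue forms and later clears.
queue_lengths = simulate_bottleneck([20, 40, 50, 30, 10, 10], capacity=30)
```

In this example the queue is initiated when demand first exceeds capacity, grows while the excess demand persists, and disperses once arrivals fall below capacity.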

Fig. 4. Classification of ML-based ITS models.

2.2 ICT Supporting Models for Traffic Flow

The example ITS models presented in the previous subsection rely on information systems and infrastructure for forecasting transport flows and supporting decision-making. In general, Machine Learning (ML) and artificial intelligence methods are the most commonly used methods in ITS support.

Nguyen et al. in [6] presented a taxonomy of ML-based transportation models. Their classification is shown in Fig. 4.

There are many ML models that span all classes in that taxonomy. Selected representatives of each class are presented below.

Li and Lu in [7] presented a new model for highway traffic volume forecasting based on a combined neural network (NN) consisting of a self-organizing feature map (SOM) and an Elman NN. The SOM network was used to classify the traffic condition and the Elman NN identifies the relationships between input and output data to obtain prediction values. As a case study, the performance of this model was evaluated using actual observational data from a highway in Beijing, China.

Ma et al. in [8] developed a transportation network model by defining traffic conditions on road segments (links). In this model, traffic congestion for a network with a specified number of links and time intervals is expressed as a two-dimensional matrix. The authors used a recurrent neural network combined with a deep restricted Boltzmann machine model to monitor and predict the traffic flow in the transportation network. The presented model is, however, quite complex, and its main limitation is that it needs to automatically learn and infer the spatial dependency (e.g. related road segments) from historical data, which may result in low prediction accuracy.
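The two-dimensional representation of network congestion can be sketched as follows. This minimal example assumes that each matrix entry is a binary flag obtained by comparing the mean speed on a link in a time interval with a threshold; the threshold value and the speed data are hypothetical and serve only to show the data structure.

```python
def congestion_matrix(speeds, threshold_kmh):
    """Build a links x time-intervals matrix of binary congestion flags.

    speeds[i][t] is the mean speed (km/h) on link i during time interval t;
    an entry is 1 if the link is considered congested in that interval.
    """
    return [[1 if speed < threshold_kmh else 0 for speed in link_speeds]
            for link_speeds in speeds]


# Two links observed over three time intervals (illustrative values).
observed_speeds = [
    [55.0, 18.0, 12.0],   # link 0 becomes congested
    [60.0, 58.0, 25.0],   # link 1 slows down but stays above the threshold
]
matrix = congestion_matrix(observed_speeds, threshold_kmh=20.0)
```

A sequence of such matrices, or their rows and columns, is the kind of input a sequence model can be trained on to predict the congestion state in subsequent intervals.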

Fouladgar et al. [9] proposed a deep traffic flow model based on a convolutional neural network (CNN) that considered inflow and outflow information in addition to traffic conditions on a road segment. In this model, the features of the processing data are classified into two general groups, namely, Traffic Condition and Incidents. Firstly, past traffic conditions are passed to the first filter of CNN as the training set. Then, the outcome of the first layer is sent to the second convolutional layer.

Huang et al. in [10] introduced a two-level deep learning ITS model, which included a multi-output regression layer at the top and a deep belief network at the bottom for traffic flow prediction. Such a method is executed for a group of road segments. In this case, the related roads should be grouped, as the overall performance is only improved when jointly trained tasks are related.

Effectively eliminating and relieving traffic congestion is a significant challenge in ITS models. This problem can be addressed by expanding road infrastructure, but this approach is generally costly. Another much cheaper method is to improve traffic management systems, which directly impacts road decongestion and vehicle flow optimization.

Genders and Razavi [11] developed a deep artificial neural network synchronized with a multi-agent system (MAS) to build adaptive control of traffic signals. The agents in the MAS were trained using reinforcement learning to develop an optimal control policy. This method was then evaluated in the SUMO traffic micro simulator.

Travel demand forecasting aims to estimate the number of road users or public transport users in the future. It is one of the most fundamental problems in transportation. Liu and Chen in [12] proposed a deep neural network-based model for demand forecasting in Taipei’s mass rapid transit system. The model considered various explanatory variables, including historical passenger flows as well as directional and holiday factors.

Predicting and managing information about transportation accidents and hazards is another important problem that designers of modern ITSs must face. Chen et al. used a deep denoising autoencoder [13] to model a hierarchical feature representation of data collected from passenger and driver mobility monitoring. The goal of the whole model is to generate a traffic incident risk map based on real-time human mobility input data. The experimental analysis results showed that the model could predict the risk of a traffic accident with a relatively small error in simple scenarios. To be more reliable, the model should also consider other factors, such as land use.

2.3 ITS e-Business Aspects and Models

Most publications present ITS models from the engineering or IT side. Despite standards for ITS developed in some countries such as Japan and the USA, there have been relatively few studies on ITS in a business context and very few publications on business or e-business models in the last 2–3 years.

Osterwalder and Pigneur [1], and then Giannoutakis and Li [2], defined an e-Business model that contains the five main pillars shown in Fig. 5. To be sustainable, each ITS project must adequately address these elements.

Fig. 5. The main pillars in e-Business ITS models.

Products and Services

In ITS models, generating revenue through the sale of products and services occurs at multiple levels. First, ITS system providers build modern infrastructure and provide intelligent vehicles. Then the users of the vehicles (customers) receive factual information about the potential benefits of the ITS system (e.g. travel safety and travel time reduction) and are willing to pay for it. Therefore, it creates business opportunities for more companies to enter the market as intermediaries and service providers. As a result, the ITS system causes economic development in a given area, but it also influences (through modern technologies) the improvement of life quality and natural environment protection.

Infrastructure and Network of Partners

The operation of ITS systems requires the involvement of many parties and users, e.g. government, funding bodies, transport groups, automotive companies, communications technology companies, the energy sector, road users, etc. In addition, the widespread use of the Internet, including for ITS, provides opportunities for companies to develop e-business models that make minimal use of the physical infrastructure. On the other hand, an ITS infrastructure incompatible with intelligent vehicles will not add value and will waste resources. Therefore, intelligent vehicles should be considered as part of the infrastructure when adopting a specific e-Business model.

Relationship Capital

Another essential business aspect, especially for ITS providers, is the relationship with customers and gaining their trust. For ITS companies, the Internet is an ideal marketplace to promote themselves and build a network of trusted customers. The Internet, especially social networks, collects and processes data from and about users.

Financial Aspects

The financial aspects range from adopted product pricing models to methods for efficient use of tangible and intangible assets. In the age of the Internet, the interest of companies is shifting more and more rapidly towards investments in intangible assets (e.g., reputation, supplier network, intellectual property, value information), while tangible or physical assets represent an increasingly smaller percentage of the total enterprise value. This is also the case for ITS vendors. This shift, in turn, also serves to reduce costs and to work more efficiently with the same or even fewer resources.

Stakeholder Credibility

The success of the ITS business model and the sustainability and reliability of the system itself depend largely on the support of all stakeholders. To achieve such support, it is necessary to develop a clear plan for the distribution of potential benefits (including financial returns) between the various interest groups and to outline the expected wider societal and economic benefits.

The main challenge for an ITS company, as implied from the analysis above, is to take advantage of the opportunities offered by the internet. There is growing development in the ITS technology, both for transportation infrastructure and vehicles, and several examples of products and services in the ITS industry that could be integrated with, or make use of the internet and digital technologies. Nevertheless, the ITS sector has not taken off and the opportunities offered by the ICTs have remained rather unexplored. In the following, we present a few examples of ITS technologies, where the internet could create business opportunities.

Advanced Traveler Information Systems (ATIS). ATIS are systems that provide customized information to the user, such as on route selection, options about public transport, information about the destination, and warning messages for potential dangers during travelling [14, 15]. Many of them are GIS-based, like in–vehicle navigation devices and rely on digital technologies to operate. They offer an excellent opportunity for e–Business model development by providing information on third parties’ products (e.g. adverts) or even serving as platforms for added services, such as reporting defects on the road and updating online information about routes. They could also incorporate tourist information for popular destinations, such as sightseeing or online hotel booking suggestions.

Electric Vehicles. With growing environmental concerns, there is increasing research on electric and hybrid vehicles [16]. Digital technologies could contribute to the establishment of online business for these vehicles, for example, by providing online information about charging posts, or by selling online credit for vehicle charging on a pay-as-you-go basis with a registered smartcard.

Electronic Toll Collection. Electronic Toll Collection (ETC), also known as Electronic Payment and Pricing System, is a topic of growing interest [17, 18]. The technology enables congestion charges to be collected automatically by recognizing the vehicle’s registration number. Drivers do not have to stop at toll plazas, nor are cashiers required to collect tolls. Through the internet, public authorities or private companies could enable prepayment of tolls or discounts for frequently used routes by allowing road users to set up online accounts through their websites. ETC could also be a supportive mechanism for tracking down vehicles linked to illegal activities and facilitating law enforcement.

Public Transportation. Public transportation is one of the areas where ITS have already started to have an impact and to revolutionize services [19, 20]. Widely used examples of ITS are passenger information systems at bus stops or train stations, bus-mounted cameras, online bookings and automatic payment systems. There is room for further development of internet-based technologies in public transportation, such as improvements in the integration of traveller information with mobile technology and enforcement of Wi–Fi networks, which could create new business opportunities for third parties.

3 Methods of Detection of Anomalies, Attacks and Threats in ICT Systems

IT infrastructure in ITSs is closely related to the security aspects of integrated ICT systems and IT networks. Anomalies in ICT systems are monitored by dedicated software and may have various causes. They can result from failure, overloading or ineffective management of traffic and transport infrastructure and can also result from external attacks on networks and information systems.

There are various taxonomies and characteristics of the types of threats (attacks). Figure 6 presents threats–attacks classification criteria defined based on the detailed taxonomy presented in [21].

Fig. 6. Criteria of the threats classification in ICT systems

In our taxonomy, we define the following three criteria of categorization of threats in ICT systems:

  1. technique of attack,

  2. threat impact, and

  3. type of attack.

Based on the type, the attacks may be classified into the following two groups:

  1. known attacks, and

  2. unknown attacks (anomalies).

In the rest of this section, we briefly survey the most popular methods which span the classes of both classifications defined above.

3.1 Attack Technique and Threat Impact Criteria

Based on the attack techniques criterion, the following popular models may be used for the classification of threats:

  • Three Orthogonal Dimensional Model was defined by Ruf et al. (Footnote 17). This model decomposes the threat space into sub-spaces according to three orthogonal dimensions (motivation, localization and agent). The threat agent imposes the threat on a specific asset of the system and may be human, technological, or force majeure. The threat motivation is the reason for creating the threat and may be deliberate or accidental. Finally, the threat localization is the origin of the threat, either internal or external.

  • Hybrid C3 Model was developed by Geric et al. in [22]. In this model, three significant criteria are considered, namely:

    • frequency of security threat occurrence,

    • the area affected by the threat (network nodes, users’ data, communication channels, data, operating system),

    • threat’s source.

  • Pyramid Model is presented in [23]. In this model, the threats are classified based on the following factors:

    (i) attackers’ prior knowledge about the system hardware, software, employees and users;

    (ii) critical system components which might be affected by the threat; and

    (iii) damage (loss) in the system or organization (privacy, integrity, …).

  • The Cyber Kill Chain was defined by Lockheed Martin [24]. It splits cyber-attacks into seven phases:

    (i) reconnaissance,

    (ii) weaponization,

    (iii) delivery,

    (iv) exploitation,

    (v) installation,

    (vi) command and control, and

    (vii) actions on objectives.
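As an illustration of how such classification criteria can be combined in practice, the following minimal sketch tags an observed threat with the three orthogonal dimensions and with a Cyber Kill Chain phase. All type and field names (`ThreatRecord`, `reached`, etc.) are hypothetical and introduced here only for illustration; they do not come from the cited models.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum


class Agent(Enum):
    HUMAN = "human"
    TECHNOLOGICAL = "technological"
    FORCE_MAJEURE = "force majeure"


class Motivation(Enum):
    DELIBERATE = "deliberate"
    ACCIDENTAL = "accidental"


class Localization(Enum):
    INTERNAL = "internal"
    EXTERNAL = "external"


class KillChainPhase(IntEnum):
    # Ordered phases, so that the progress of an attack can be compared.
    RECONNAISSANCE = 1
    WEAPONIZATION = 2
    DELIVERY = 3
    EXPLOITATION = 4
    INSTALLATION = 5
    COMMAND_AND_CONTROL = 6
    ACTIONS_ON_OBJECTIVES = 7


@dataclass(frozen=True)
class ThreatRecord:
    name: str
    agent: Agent
    motivation: Motivation
    localization: Localization
    phase: KillChainPhase


def reached(record, phase):
    """True if the observed threat has progressed at least to the given phase."""
    return record.phase >= phase


example = ThreatRecord(
    name="phishing e-mail with a malicious attachment",
    agent=Agent.HUMAN,
    motivation=Motivation.DELIBERATE,
    localization=Localization.EXTERNAL,
    phase=KillChainPhase.DELIVERY,
)
```

Using an ordered enumeration for the phases makes it straightforward to ask, for instance, whether a threat was stopped before exploitation.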

The following two models have been developed based on analysis of the observed or potential threat impacts:

  • STRIDE Model [25], developed by Microsoft, characterizes known threats according to the goals and purposes of the attacks (or the motivation of the attacker). The STRIDE acronym stands for Spoofing identity, Tampering with data, Repudiation, Information disclosure, Denial of service, and Elevation of privilege. It is a goal-based approach, where an attempt is made to get inside the mind of the attacker by rating the threats against.

  • ISO Model (ISO 7498-2) (Footnote 18) defines five major security threat impacts and services as a reference model: destruction of information and/or other resources; corruption or modification of information; theft, removal or loss of information and/or other resources; disclosure of information; and interruption of services.

3.2 Known and Unknown Attacks

Known Threats

Detection of some threats may be based on prior knowledge of the characteristics of the attack and the potential threat impact. Such threats are referred to as “known” threats because they have already been identified and studied. Most of the detection methodologies for known threats are signature-based (SD) (sometimes defined as knowledge-based) techniques [26].

The main aim of the signature-based detection methods is to compare the suspicious payload with specific known attacks, i.e., signatures. Depending on the type of intrusion detection system (IDS), the signatures can correspond to different types of data, e.g., byte sequences in network traffic, known malicious instruction sequences used by malware, etc. It is assumed in the SD scheme that patterns can define malware. Signature-based detection is the most popular technique for IDS systems. However, there are several disadvantages of using SD, such as:

  • Susceptible to evasion - since the signature byte patterns are derived from known attacks, these byte patterns are also commonly known. Hence, they can be evaded by using obfuscation or polymorphic techniques that alter the attack’s payload, such that signatures no longer apply. Those methods can be easily used for computer malware, less so in the case of network attacks. Network attacks or exploits usually take advantage of bugs or vulnerabilities found in software and are bound by specific application protocols.

  • Zero–day attacks - since the signature-based IDS systems are constructed based on known attacks, they cannot detect unknown malware or even variants of known malware. Therefore, they cannot effectively detect polymorphic malware [27], which means that SD does not provide zero-day protection. Signature-based detectors use different signatures for each malware variant. Hence, the volume of the database of signatures grows exponentially.
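The basic mechanism of signature-based detection, and its susceptibility to evasion, can be sketched as a simple byte-pattern search. The signature names and patterns below are hypothetical examples chosen for illustration only; real IDS rule sets are considerably richer than plain substring matching.

```python
# Hypothetical signature database: name -> byte pattern of a known attack.
SIGNATURES = {
    "sql-injection-probe": b"' OR '1'='1",
    "path-traversal": b"../../",
}


def match_signatures(payload, signatures):
    """Return the sorted names of all signatures whose pattern occurs in the payload."""
    return sorted(name for name, pattern in signatures.items() if pattern in payload)


# A payload containing a known pattern is detected ...
plain_attack = b"GET /index.php?id=1' OR '1'='1 HTTP/1.1"
detected = match_signatures(plain_attack, SIGNATURES)

# ... but the same request in URL-encoded form no longer matches the byte pattern.
encoded_attack = b"GET /index.php?id=1%27%20OR%20%271%27%3D%271 HTTP/1.1"
evaded = match_signatures(encoded_attack, SIGNATURES)
```

The second request carries the same attack after a trivial encoding step, yet it produces no match, which illustrates why obfuscated or previously unseen payloads escape purely signature-based detectors.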

SD methodologies are effective and fast in the detection of known attacks and threats (Footnote 19). However, generating new signatures in SD is complex and is usually performed manually by experts. The experts must analyse the attack and identify invariant fragments in the involved flows, using their understanding of the attacked application and the exploited vulnerability. Based on this detailed knowledge, they construct a signature that fully recognises the threat. Manual generation of the signatures is a time-consuming process. The provided experiments show that over 90% of vulnerable systems can be “successfully” infected in that time. Therefore, automated signature generation tools are used in IDS systems to limit the propagation of a new threat in an early phase until a manually created signature is available and can be included in the rule sets permanently [28]. These systems work by searching for common features of suspicious flows not seen in regular, benign traffic. Looking for common characteristic traits of different malicious activities is not specific to worm detection; it is a basis of detection systems in other security applications, see, e.g., [29]. The syntax of generated signatures is generally based on the language provided by the Snort system [30].

Several systems for automatic generation of signatures of zero-day polymorphic worms have been developed: Autograph [31], Polygraph [32], Nebula [33], Hamsa [34] and Lisabeth [35]. Most of them apply relatively simple (computationally inexpensive) heuristic approaches. Another model is proposed in [36], where the generation of multi-set type signatures is formulated as an optimisation problem and solved with a specialised version of a genetic algorithm (GA).
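The idea of searching for features that are common to suspicious flows but absent from benign traffic can be sketched with a naive heuristic: take the longest byte sequence shared by all suspicious flows and reject it if it also occurs in benign traffic. This is a simplified illustration under assumed inputs, not the algorithm of Autograph, Polygraph or any other system cited above; the function name and the sample flows are hypothetical.

```python
def generate_signature(suspicious, benign, min_len=4):
    """Return the longest byte string shared by all suspicious flows and
    absent from every benign flow, or None if no such string exists."""
    if not suspicious:
        return None
    shortest = min(suspicious, key=len)
    # Try candidate substrings of the shortest flow, longest first.
    for length in range(len(shortest), min_len - 1, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            in_all_suspicious = all(candidate in flow for flow in suspicious)
            in_any_benign = any(candidate in flow for flow in benign)
            if in_all_suspicious and not in_any_benign:
                return candidate
    return None


suspicious_flows = [
    b"GET /a?x=AAAA\x90\x90EVILCODE1",
    b"POST /b y=\x90\x90EVILCODE22",
    b"\x90\x90EVILCODEzz",
]
benign_flows = [b"GET /a?x=hello", b"POST /b y=world"]
signature = generate_signature(suspicious_flows, benign_flows)
```

The brute-force search is quadratic in the flow length and serves only to show the principle; practical tools rely on more efficient content-based heuristics.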

Signature-based detection in ICT systems belongs to a broader class of methodologies referred to as Threat Intelligence (Footnote 20). Threat intelligence is frequently used in Security Information and Event Management (SIEM), antivirus, and web technologies such as algorithms inspired by the human immune system for detection and prevention of web intrusions [37, 38]. In those algorithms, malware samples can be used to create a behavioural model to generate a signature, which serves as an input to a malware detector, acting as the antibodies in the antigen detection process. In the case of malicious botnets, a new trend is to use alternative communication channels, i.e., DNS tunnelling or HTTP, instead of IRC to connect command & control (C&C) servers and infected hosts [39].

Another group of methodologies is Stateful Protocol Analysis (SPA). SPA (specification based) uses predetermined profiles that define benign protocol activity. Occurring events are compared against these profiles to decide if protocols are used correctly or not. IDS based on SPA track the state of network, transport, and application protocols. They use vendor-developed universal profiles and therefore rely on their support [40].
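Stateful protocol analysis can be sketched as a check of observed events against a state machine that encodes benign protocol behaviour. The profile below is a hypothetical, heavily simplified mail-like protocol written for illustration; it is not a vendor profile and does not implement any real protocol specification.

```python
# Hypothetical profile: state -> {allowed command: next state}.
PROFILE = {
    "START": {"HELO": "GREETED"},
    "GREETED": {"MAIL": "SENDER_SET"},
    "SENDER_SET": {"RCPT": "RECIPIENT_SET"},
    "RECIPIENT_SET": {"RCPT": "RECIPIENT_SET", "DATA": "TRANSFER"},
    "TRANSFER": {"QUIT": "CLOSED"},
}


def first_violation(commands, profile=PROFILE, start="START"):
    """Return the index of the first command not allowed in the current
    protocol state, or None if the whole session conforms to the profile."""
    state = start
    for index, command in enumerate(commands):
        next_state = profile.get(state, {}).get(command)
        if next_state is None:
            return index  # command is not valid in this state
        state = next_state
    return None


benign_session = ["HELO", "MAIL", "RCPT", "RCPT", "DATA", "QUIT"]
suspicious_session = ["HELO", "DATA"]  # data transfer without sender or recipient
```

Because the detector tracks the protocol state, the same command (`DATA`) is accepted in one session and flagged in another, which a stateless pattern match could not distinguish.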

Intruder traps [41] are set for attackers to prevent data or system infection. Such traps may include honeypot systems, which are often employed to detect, deflect or counteract attempts at unauthorized use of information systems.

Classification Machine Learning (ML) techniques based on supervised learning are successfully used for data classification, considering the unique set of features. Those methodologies may be used to detect known threats using their characteristics for the generation of the validation and testing sets in the learning process. However, together with the other ML techniques, they are much more helpful in detecting anomalies and unknown attacks.

Detection Methods of Unknown Threats and Attacks

All “known threat” detection methodologies defined in the previous section can only detect previously known attack patterns using signatures and rules that describe malicious events and are thus also called black–listing approaches. Known threats can sometimes slip past even the best defensive measures, which is why most security organizations actively look for both known and unknown threats in their environment.

Unknown threats and attacks are not recognized by the IDS based on the collected attack knowledge. One of the possible reasons is that the attacker may use brand new methods or technologies. The “unknown threat” methodologies allow detection of previously unknown attacks, although often with a high false positive rate [42].

The core class of the “unknown threats” detection methodologies is anomaly–based detection (AD). AD (behavior based) approaches learn a baseline of normal system behavior, a so–called ground truth. Against this ground truth all occurring events are compared to detect anomalous system behavior. AD techniques permit only normal system behavior, and are therefore also called white-listing approaches. While black-listing approaches are usually easier to deploy, they depend on the support of vendors. They mostly cannot be applied in legacy systems and systems with small market shares; those are often poorly documented and not supported by vendors.

There exist different types of anomalies that can indicate malicious system behavior [43]:

  • Point anomaly is the simplest form of an anomaly and is often also referred to as outlier, i.e. an anomalous single event. This could be, for example, caused by an anomalous event parameter, such as an unexpected login-name or IP address.

  • Contextual anomaly is an event that is anomalous in a specific context, but it might be normal in another one. This could, for example, be a system login from an employee outside working hours, which would be normal during normal working time.

  • Collective/frequency anomaly usually originates in an anomalous frequency of otherwise normal single events. In an ICT network this could be a database dump, which could be caused by an SQL–Injection. During a database dump, a large number of log lines that refer to normal SQL–Queries are generated. In this case, the single lines are normal, but their high frequency is anomalous.

  • Sequential anomaly represents an anomalous sequence of single events usually categorized as normal. In an ICT network a sequential anomaly can be caused for example by violating an access chain. For example, a normal database server access is usually only allowed via a firewall and a Web server. Therefore, it would be malicious, if someone accesses the database server directly, without accessing the Web server.
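
As a rough illustration of the collective/frequency case, the sketch below flags time windows whose event count is far above a baseline learned from normal operation. The threshold rule (mean plus `k` standard deviations), the function name and the sample counts are assumptions for the example, not a method prescribed by the cited work.

```python
from statistics import mean, pstdev


def frequency_anomalies(baseline_counts, observed_counts, k=3.0):
    """Return indices of observed windows whose event count exceeds
    the baseline mean by more than k baseline standard deviations."""
    threshold = mean(baseline_counts) + k * pstdev(baseline_counts)
    return [i for i, count in enumerate(observed_counts) if count > threshold]


# Hypothetical numbers of SQL query log lines per minute.
baseline = [10, 12, 11, 9, 13]    # normal operation
observed = [11, 14, 250, 12]      # the third window resembles a database dump
flagged = frequency_anomalies(baseline, observed)
```

Each individual log line in the third window may look normal; only the count per window exposes the anomaly, so `flagged` contains index `2`.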

Anomaly detection methods in distributed ICT systems are based on the analysis of data and information flow monitoring results in these systems. Therefore, they have to adapt to system architecture and configuration changes and analyze large amounts of data and information transmitted and generated by devices integrated with the computer system. The most commonly used methods of detecting anomalies include the following: machine learning methods, methods from the general class of artificial intelligence methods and statistical methods of data analysis:

  • Artificial Neural Networks (ANN) - Input data activates neurons (nodes) of an artificial network, inspired by the human brain. The nodes of the first layer pass their output to the nodes of the next layer, until the output of the last layer of the artificial network classifies the monitored ICT networks’ current state [45].

  • Bayesian Networks - Bayesian networks define graphical models that encode the probabilistic relationships between variables of interest and can predict the consequences of actions [46].

  • Clustering – Clustering enables grouping of unlabeled data and is often applied to detect outliers [47]. In particular, clustering can successfully support filtering defence mechanisms in case of DDoS attacks. Robust clustering techniques such as density-based clustering and subspace clustering can be used for evidence accumulation when classifying flow ensembles into traffic classes. Both one-stage and multi-stage techniques can be investigated. A two-stage algorithm that works on single-link packet-level traffic captured in consecutive time-slots is presented in [48]. In the first stage, the changes in traffic characteristics are observed and, at each time–slot, traffic is aggregated using multiple criteria (source and destination IPs, network prefixes, traffic volume measurements, etc.). Flows containing possible attacks are passed to the next stage, which applies the clustering techniques to the suspicious, aggregated flows from stage one.

  • Graph Clustering – A graph is generated based on a malware data analysis and a graph clustering technique [49] can be used to derive common malware behavior. The method to generate a common behavioral graph representing the execution behavior of a family of malware instances by clustering a set of individual behavioral graphs is proposed in [50]. To speed up the malware data analysis by reduction of sample counts, generic hash functions are applied. The generic hash function for portable executable files that generates a per-binary specific hash value based on structural data found in the file headers and structural information about the executables section data is described in [51].

  • Decision Trees – Decision trees have a tree-like structure, which comprises paths that lead to a classification based on the values of different features [52].

  • Hidden Markov Models (HMM) – A Markov chain connects states through transition probabilities. HMM aim at determining hidden (unobservable) parameters from observed parameters [53].

  • Support Vector Machines (SVM) – SVM construct hyperplanes in a high- or infinite- dimensional space, which can then be used for classification and regression. Thus, similar to clustering, SVM can, for example, be applied for outlier detection [54].

  • Ensemble learning - AD based on combination learning (also known as ensemble methods) combines several methods to reach a decision. For example, one can include five different classifiers and use majority voting to decide whether a datum should be considered an anomaly [55].

  • Self-learning – Self-learning systems usually learn a baseline of normal system behavior during a training phase. This baseline serves as ground truth to detect anomalies that expose attacks and especially invaders. Generally, there are three ways how self-learning AD can be realized:

    • Unsupervised - This method does not require any labeled data and is able to learn to distinguish normal from malicious system behavior during the training phase. Based on the findings, it classifies any other given data during the detection phase.

    • Semi-supervised - This method is applied when the training set only contains anomaly–free data and is therefore also called “one-class” classification.

    • Supervised - This method requires a fully labeled training set containing both normal and malicious data.

    Self-learning methods do not require active human intervention during the learning process. While unsupervised self-learning is entirely independent from human influence, for the other two methods the user has to ensure that the training data is anomaly free or correctly labeled. However, it might be difficult to provide training data for semi-supervised learning and even harder for supervised approaches.
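
Two of the ideas listed above, learning a baseline from anomaly-free data and combining several detectors by majority voting, can be sketched together. The three detectors used here (z-score, interquartile range, median absolute deviation), their thresholds and the sample data are assumptions chosen for brevity; the text mentions five classifiers only as an example.

```python
from statistics import mean, median, pstdev, quantiles


def zscore_detector(train, k=3.0):
    """Flag values more than k standard deviations from the training mean."""
    mu, sigma = mean(train), pstdev(train)
    return lambda x: abs(x - mu) > k * sigma


def iqr_detector(train, k=1.5):
    """Flag values outside the usual interquartile-range fences."""
    q1, _, q3 = quantiles(train, n=4)
    spread = q3 - q1
    return lambda x: x < q1 - k * spread or x > q3 + k * spread


def mad_detector(train, k=5.0):
    """Flag values far from the median, measured in median absolute deviations."""
    med = median(train)
    mad = median([abs(v - med) for v in train])
    return lambda x: abs(x - med) > k * mad


def majority_vote(detectors, x):
    """Return True if more than half of the detectors flag x as anomalous."""
    votes = sum(1 for detect in detectors if detect(x))
    return votes > len(detectors) / 2


# Anomaly-free training data (semi-supervised, "one-class" setting).
train = [10, 11, 9, 10, 12, 11, 10, 9, 11, 10]
detectors = [zscore_detector(train), iqr_detector(train), mad_detector(train)]
```

With this baseline, `majority_vote(detectors, 50)` is `True` and `majority_vote(detectors, 11)` is `False`. Voting reduces the influence of any single detector that is poorly suited to the data at hand.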

4 ITS Practical Example – Wobcom Smart City Project with GUARD Support

In this section, we present an example of a real ITS, which is a part of CrownCastle smart city projectFootnote 21 implemented in the city of Wolfsburg in Germany. As part of its digital strategy, the city of Wolfsburg is developing and deploying an ICT infrastructure for building intelligent services that help tackle issues like waste management, parking and metering, pollution and transportation. The infrastructure is based on a modern sensor network covering the city and its surroundings, including several Smart Gateways interconnected by a high–speed, low–latency fibre network throughout the whole city (see the left side of Fig. 7).

Fig. 7.
figure 7

Smart City project for Wolfsburg

The Smart Gateway is a small edge computing installation and consists of 3–5 general-purpose computers. Each gateway has local fast storage, a multi-core multi-tenant CPU, four Ethernet interfaces, and 32 GB RAM per node. Kubernetes acts as the container orchestrator. The first generation of Smart Gateways includes both LoRaWAN and WiFi/LTE interfaces; they implement so-called packet forwarders, protocol/payload decoders, scheduled/recurring (cron type) tasks, and other network functions. The optical backbone connects all Smart Gateways to remote data centres managed by Wobcom, an ICT service provider owned by the municipality.

Wobcom already provides free internet access by WiFi (FreeWolfsburg) in the city. Both the edge installations and central data centres can host applications of third parties that create Smart City services.

The right side of Fig. 7 identifies different digital services involved in the Smart City scenario, highlighting the presence of multiple actors (Wobcom, service providers, citizens, owners of the physical infrastructures). Even though most LoRaWAN applications are not real-time and tolerate some degree of service interruption, the full range of possible services under the scope of Smart City cannot exclude carrier–grade connectivity that provides reliable and robust quality of service, not to mention the need for trust and high–security guarantees. This is especially critical for the most recent computing paradigms, since edge installations do not have enough resources to deploy the same security measures as a normal data centre. Opening access to thousands of devices requires a multi–tenant, self-learning approach. Still, the market for cyber-security tools currently lacks proper solutions that can be effectively applied in a distributed, heterogeneous, multi-tenancy environment. Monitoring the good and fair behaviour of the different network participants is a strict requirement of the European regulations regarding the usage of open-shared frequencies and properties of LoRaWAN.

The most tangible benefit will be improved service level agreements, including robust security features that each actor can define. While infrastructure providers will be primarily concerned about the availability and integrity of their infrastructures, service providers will care about their operation’s reliability, continuity, and trustworthiness, which are essential to delivering high levels of (end) user experience.

4.1 Security Aspects of ITS Integration with the GUARD Project

The main technical challenge for implementing smart city services is the interconnection of heterogeneous components (IoT devices, applications, citizens, processes) in different domains. While a lot of effort has been put into proposing web services, middleware, and service-oriented architectures, security concerns beyond identity management and access control have been largely overlooked. Indeed, existing tools in the market for complex systems, including cloud, IoT, and network assets, are often designed and integrated for each specific scenario, resulting in rigid architectures that cannot be adapted to evolving systems and partially unknown topologies. The introduction of security capabilities in each digital component, accessible through standard interfaces and APIs, represents a ground-breaking evolution in security architectures for smart cities, allowing the composition of dynamic environments where devices, users, applications, and processes can be easily plugged or removed in multi-tenancy environments without requiring the re-design of the cyber–security architecture.

The main aim of the GUARD projectFootnote 22 is to construct an ICT platform, described in detail in Chap. 1, that allows end-to-end assurance and protection of business service chains by assessing the level of trustworthiness of the involved services and tracing data propagation. The GUARD platform can be successfully integrated with local client infrastructures, such as the Wobcom network, for enforcement functionalities by leveraging “programmability” to shape the granularity of context information to the actual needs.

The intelligent local city Wobcom infrastructure has already been integrated with the ICT GUARD platform. Figure 8 shows the block diagram of the local system integrated with the GUARD platform.

Fig. 8.
figure 8

Smart mobility diagram integrating the GUARD platform

In the considered scenarios, the cloud application is hosted on the Wobcom city infrastructure. The IoT device connects to the IBIS-Bus of the buses to collect route information, provides positioning and counts the number of wireless (Wi–Fi) devices in proximity to estimate the passenger count. It also provides LoRaWAN Network Information and is able to respond to commands sent from the management application. The data are collected and processed by the system using the following FIWARE servicesFootnote 23.

Wobcom developed a Fleet Management application for the full integration with the GUARD platform and its security service components. The application consists of smart devices installed on city buses that collect data from the CAN bus, including speed, position, travelling distance, and measures from the engine. This data is transmitted to a LoRa Gateway, which forwards it to the remote LoRa Server, deployed in a remote cloud. Data is then consumed by the Fleet Management application, deployed in the cloud as well, which provides the current position and delay of the bus to the citizens, and is also used for predictive maintenance of the vehicle fleet. A simplified view of the service chain is presented in Fig. 9.

Fig. 9.
figure 9

Architecture of fleet management use case and GUARD SAPs

With the GUARD platform, each digital component of the Wobcom system internally implements its own monitoring, inspection, and enforcement tasks; a common interface exposes these capabilities, together with the description of security properties (vendor, release, updates). Local agents deployed in each component report measurements, events, and logs to a common central framework, which therefore gains deep visibility over the whole chain to detect and analyse even the weakest correlations in the cyber–security context. Beyond the common interface to security capabilities, the great value of GUARD local agents is their programmability, i.e., a large flexibility in defining the local operations (filtering rules, log aggregation, pre-processing tasks, etc.). For example, the amount of traffic generated by IoT devices could be compared with that received by cloud/edge applications, to detect anomalies and possible ongoing volumetric Denial–of–Service (DoS) attacks. Suspicious end devices and gateways could be selectively monitored at the radio/network level, so as to identify incorrect or malicious behaviours, including various kinds of DoS (volumetric, SYN flood, amplification), man–in–the–middle (e.g., rogue diversion of network traffic), tampering, manipulation or alteration of the device (operating system, libraries, applications, data). This helps detect intrusions, especially at the edge, which is one of the weakest links in the chain.
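
The traffic comparison mentioned above can be sketched as follows. This is not GUARD code; the function name `volumetric_alerts`, the per-slot byte counters and the tolerance value are hypothetical and serve only to show the principle.

```python
def volumetric_alerts(sent_by_devices, received_by_app, tolerance=0.2):
    """Return the time slots in which the application receives noticeably
    more traffic than the IoT devices report having sent.

    Both arguments are per-slot byte counts; `tolerance` is the relative
    excess that is still accepted (an assumed value).
    """
    alerts = []
    for slot, (sent, received) in enumerate(zip(sent_by_devices, received_by_app)):
        if received > sent * (1 + tolerance):
            alerts.append(slot)
    return alerts


# Hypothetical measurements reported by local agents, in bytes per slot.
device_bytes = [100, 110, 105]
app_bytes = [98, 112, 900]   # the last slot carries traffic the devices never sent
suspicious_slots = volumetric_alerts(device_bytes, app_bytes)
```

Only the last slot exceeds the tolerated excess, so `suspicious_slots` is `[2]`. A mismatch of this kind suggests traffic injected somewhere between the devices and the application.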

The dichotomy between local and central processing should also be leveraged to improve the resilience of the framework itself. As a matter of fact, some “cron types” might not have an “always connected” requirement, and could work locally for a certain period of time. Local GUARD agents are able to apply pre-defined or fallback enforcement rules in case the central framework cannot be reached, so as to ensure continuity of service even in case of a direct or indirect attack on the framework itself.
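
The fallback behaviour can be illustrated with a minimal sketch. The class `LocalAgent`, its methods and the silence timeout are hypothetical and do not reflect the actual GUARD agent interface.

```python
import time


class LocalAgent:
    """Hypothetical agent that prefers rules from the central framework
    and falls back to pre-defined local rules when contact is lost."""

    def __init__(self, fallback_rules, max_silence_s=60.0, clock=time.monotonic):
        self.fallback_rules = list(fallback_rules)
        self.central_rules = None
        self.max_silence_s = max_silence_s   # assumed timeout, in seconds
        self.clock = clock                   # injectable, which eases testing
        self.last_contact = None

    def update_from_central(self, rules):
        """Store rules pushed by the central framework and note the time."""
        self.central_rules = list(rules)
        self.last_contact = self.clock()

    def active_rules(self):
        """Return central rules while they are fresh, fallback rules otherwise."""
        if self.last_contact is None:
            return self.fallback_rules
        if self.clock() - self.last_contact > self.max_silence_s:
            return self.fallback_rules
        return self.central_rules


# Usage with a simulated clock and made-up rule names.
now = [0.0]
agent = LocalAgent(["drop-unknown-devices"], clock=lambda: now[0])
before_contact = agent.active_rules()
agent.update_from_central(["allow-registered-buses"])
while_connected = agent.active_rules()
now[0] = 61.0                       # central framework silent for too long
after_timeout = agent.active_rules()
```

The agent enforces the fallback rules before the first contact and again once the central framework has been silent for longer than the timeout, so enforcement continues in both cases.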

Another important feature of the GUARD framework is support for multiple users. A single installation is managed by a trusted security provider, offering interfaces to all users that might be affected by security concerns: Wobcom, the municipality, service providers, infrastructure owners, and citizens. Through the external interface, each user can set up its own security service. For example, infrastructure owners may be interested in detecting if their devices get compromised (to avoid responsibility when they are used for other attacks), Wobcom may be interested in avoiding DoS and amplification attacks that overwhelm its ICT infrastructure, and service providers may be interested in the integrity and availability of their applications.

The GUARD platform, integrated with the Wobcom intelligent infrastructure through the local LoRaWAN network, plays the role of a cross–layer Intrusion Detection System (IDS). In general, cross–layer IDS aim to maximize the available information, and therefore raise the detection capability to an optimum level and minimize the false alarm rate at the same time. Therefore, various data sources, such as log data and network traffic data, can be used for intrusion detection. Lightweight, resource-efficient solutions are especially required. A high data throughput is important to enable real-time analysis and evaluation of the collected information and thus timely detection of attacks and invaders [56, 57]. In the case of GUARD, the anomalies defined in Sect. 3.2 are detected by two secure service modules of the platform, namely (i) the Network Anomaly Detection (NAD) module developed by NASKFootnote 24 and described in detail in Chap. 4, and (ii) AMiner with AECIDFootnote 25 developed by the Austrian Institute of Technology (AIT)Footnote 26 and presented in Chap. 5 of this book.

The critical factor ensuring the success of modern ITS systems is highly developed technology and IT infrastructure. It is hard to imagine an intelligent system without IT support. However, it carries the risk of numerous anomalies, and the whole implementation is exposed to hacking attacks, which can result in enormous difficulties in ITS management and can be dangerous for users. Sect. 3 defines the most critical types of threats and presents a brief characterization of selected classes of methods for detecting attacks and anomalies in ITS.

The last part of the chapter contains an example of a real ITS system implemented in the city of Wolfsburg, Germany. This system was integrated with the GUARD platform, developed as part of the European H2020 project.

We believe that we have succeeded in drawing the reader’s attention in this chapter to the most critical challenges and aspects related to the development of ITS systems, which we will all be using in the near or distant future. The example of the GUARD platform and integrated ITS developed by Wobcom shows that this future does not have to be very far away.

5 Conclusions

This chapter introduced the complexity of modelling intelligent transportation systems, especially in highly urbanized areas. Although there are many projects concerning smart cities and smart infrastructures in general, these projects have been implemented in practice only in a few cases. The reasons for these difficulties are manifold: from economic and technological barriers to increasingly restrictive environmental regulations and financial constraints. The most critical problems and challenges faced by ITS systems developers were discussed in Sect.  2.