research-article
Open Access

Computational Intelligence in Security of Digital Twins Big Graphic Data in Cyber-physical Systems of Smart Cities

Published: 10 August 2022

Abstract

This investigation focuses on the application of computational intelligence to the security of Digital Twins (DTs) graphic data of the Cyber-physical System (CPS). The intricate and diverse physical space of CPS in the smart city is mapped in virtual space to construct the DTs CPS in the smart city. Besides, Differential Privacy Frequent Subgraph-Big Multigraph (DPFS-BM) is employed to ensure data privacy security. Moreover, the analysis and prediction model for the DTs big graphic data (BGD) in the CPS is built based on Differential Privacy-AlexNet (DP-AlexNet). Alexnet successfully solves the gradient dispersion problem of the Sigmoid function of deep network structures. Finally, the comparative analysis approach is utilized to verify the performance of the model reported here by comparing it with Long Short-Term Memory, Convolutional Neural Network, Recurrent Neural Network, original AlexNet, and Multi-Layer Perceptron in a simulation experiment. Through the comparison in the root mean square error, the mean absolute error, the mean absolute percentage error, training time, and test time, the model proposed here outperforms other models regarding errors, time delay, and time consumption. In the same environment, the system performs better with multi-hop paths, extra relays, and a high fading index; in that case, the outage probability is minimal. Therefore, the DP-AlexNet model is suitable for processing BGD. Moreover, its speed acceleration is more apparent than that of other models, with a higher SpeedUp indicator. The research effectively combines data mining and data security, which is of significant value for optimizing the privacy protection technology of frequent subgraph mining on a single multi-graph. Besides, the constructed DTs of CPS can provide excellent accuracy and a prominent acceleration effect on the premise of low errors. In addition, the model reported here can provide reference for the intelligent and digital development of smart cities.

1 INTRODUCTION

As science and technology evolve quickly, widely accepted technologies such as big data (BD), artificial intelligence (AI), and blockchain (BC) have accelerated the construction of smart cities. Consequently, infrastructure investment, data center deployment, and urban business scenarios get improved remarkably. Smart city construction has also reformed government-dominated urban governance projects, such as public security, smart transportation, smart tourism, and handy services for the people [1, 2]. Despite the formulated bright prospect during actual smart city construction, various problems and contradictions are intertwined due to the vast area and high population density in China. In this case, the promising Digital Twins (DTs) are anticipated to develop a systematic, targeted, and operable solution for smart city construction.

Diversified services have been introduced into smart cities, such as smart transportation, smart healthcare, and smart tourism; however, these services lack a unified “data floor” that supports universal management, resulting in unbalanced projects and blocked data sharing [3]. DTs can map real entities in smart city construction to a virtual space of the same environmental conditions, gather the refined business perception data of smart cities, and assess urban development and construction comprehensively, dynamically, and accurately. Eventually, both people-oriented layout schemes and dynamic resource allocation schemes can be developed for smart cities [4, 5]. On the one hand, the speed and scale of data generation in the Cyber-Physical System (CPS) are improved through constructing DTs. On the other hand, big data processing and analysis are significantly more complicated. The data of various industries or fields in the DTs of the CPS of the smart city are usually relevant, constituting the big graphic data (BGD) of the DTs of the CPS of the smart city.

BGD is different from traditional graph data processing and analysis approaches with simple relationships. Those traditional approaches usually deal with separate “mini graphs” that are independent of each other. Despite the huge number of graphs, these approaches do not require complicated iterative processes or generate loads of messages; thus, they consume less time and space overheads [6]. Intricate data relationships generated during smart city construction increase BD processing complexity involving network structure and data connection exponentially. Besides, they also lead to double exponential growth in BGD processing and analysis complexity [7]. However, this situation conflicts with the current slow growth of BGD storage and computing. Meanwhile, it is difficult to guarantee the security performance of BGD during processing and analysis. Hence, how to efficiently process and analyze BGD in DTs of smart cities is a common and fundamental issue in BGD computation, also a bottleneck for BGD cognition and value exploration.

In general, the safe and efficient handling of BGD is of great significance to the construction of smart cities. The innovation of the present work lies in the following aspects: (1) a smart city DTs is constructed for the complex and diverse physical space of the smart city; (2) Differential Privacy Frequent Subgraph-Big Multigraph (DPFS-BM) is introduced to ensure data privacy and security; (3) AlexNet is optimized, and a BGD analysis and prediction model for the DTs of the smart city is implemented based on Differential Privacy-AlexNet (DP-AlexNet). To sum up, the BGD analysis and prediction model of CPS in the smart city based on DP-AlexNet constructed here provides an experimental reference for the intelligent and digital development of the smart city in the future.

2 RECENT WORKS

2.1 Development Trend of Smart City DTs

Digital infrastructure is an essential foundation for the development and construction of smart city DTs, also a vital means to increase the value of urban space. Many scholars have researched digitized smart city development. Sepasgozar [2021] applied DTs to smart city construction. DTs could help monitor, understand, and optimize all physical entities’ functions, providing continuous feedback to humans in smart cities, thereby improving the quality of life and the sense of well-being [8]. Francisco et al. [2021] adjusted building energy using smart meters in smart city DTs. This adjustment made smart energy management of large buildings a vital link to smart city construction [9]. Laamarti et al. [2020] proposed an ISO/IEEE11073 standardized digital dual-frame architecture, which could collect data from individual health devices, analyze the data, and deliver feedback to users cyclically. This framework could serve as the basis for developing smart city DTs [10]. Fuller et al. [2020] analyzed the challenges of AI, the Internet of Things (IoT), and DTs. Besides, they explored the usage, challenges, and functions of smart city DTs [11]. White et al. [2021] found that combining large and accurate Building Information Models (BIMs) in smart cities and BD generated by IoT sensors could improve the accuracy and transparency of massive data in smart cities. Meanwhile, a public and open DTs model could make city planning more accurate than in the past [12].

2.2 Research Status of Security Issues of CPS

The continuous emergence of security issues has caused great losses to urbanization and personal privacy. The security problem of CPS is not only a control problem of the physical process but also related to many fields such as computers and communication, and it has become the focus of many researchers. Li et al. [2021] studied optimized control under dual-network interactive cascading failure to support the operation of cyber-physical power systems. In addition, considering the communication constraints including synchronous bandwidth consumption and delay, they adopted the optimal routing strategy based on the publish–subscribe network to reconstruct the transmission of damaged packets during cascading failures [13]. Hussain et al. [2020] used Deep Convolutional Neural Networks to detect real network data against malicious attacks on CPS, providing early detection for distributed denial-of-service attacks planned by botnets that controlled malicious devices. Through simulation, they found that this framework could achieve higher than 91% normal and under-attack cell detection accuracy [14]. Hu et al. [2021] selected various new technologies and application scenarios covering CPS and the IoT for study and analysis, aiming at the design, optimization, implementation, and evaluation of emerging cloud edge solutions for CPS and IoT applications. They finally verified the feasibility of their research [15].

2.3 Research Progress of BGD Query Processing and Optimization

As the internet grows at an unprecedented pace, data in smart cities keep growing; the massive heterogeneous data from multiple sources share close relationships, which can be demonstrated by graphs vividly. Many scholars have explored data acquisition against Coronavirus Disease 2019 (COVID-19) regarding BGD processing and optimization. Villanustre et al. [2021] applied mathematical models to predict the whereabouts of potentially infected people and their likelihood of carrying the disease. The authors believed that this modern computing power enabled the rapid integration of multiple prediction models and promoted the effective control of the COVID-19 epidemic [16]. Wang et al. [2021] proposed a deep hashing framework for large-scale video similarity search based on the generated massive BGD, namely, Unsupervised Deep Video Hashing (UDVH). This framework aimed to learn compact and practical binary codes. Ultimately, they employed three different popular video datasets for experimentation. Results demonstrated that UDVH was far superior to current techniques regarding various evaluation indicators, making UDVH practical in applications [17]. Facing the explosive growth of smart city data, Zhou et al. [2021] designed a multi-stage relational search framework for smart city BGD to automatically extract non-categorical relationships from domain documents. This framework combined semantic maps’ structural information with terms’ context information to recognize the non-axial relationships. Through experiments, the authors proved the framework's performance in recognizing and forecasting smart city BGD [18]. Ma et al. [2020] proposed a Support Multimode Tensor Machine (SMTM) for vast industrial BD in smart cities. They also designed a useful parameter training algorithm. Eventually, experiments on different datasets verified SMTM's superiority over other algorithms in multi-classification, revealing its application potential in industrial BD multi-classification [19].

As per the above works, while data in smart cities keep growing, DTs are constructed for the complicated and diversified reality of smart cities. However, an algorithm to extract features of the massive data acquired in smart city DTs has not been proposed yet. In addition, comprehensive research on the security performance of CPS is insufficient, and the majority of existing works are related to the power CPS. The emergence of BGD provides the possibility for massive data processing and analysis. Therefore, the DTs technology is utilized to construct the DTs of CPS in virtual space for the physical smart city in real space, to extract the characteristics of the BGD, and to analyze the security performance. This is of great significance to the digital and intelligent development of the smart city in various fields.

3 BGD SECURITY ANALYSIS OF DTS OF CPS IN THE SMART CITY

3.1 Computing Requirements of BGD in DTs of CPS in the Smart City

Various new applications and cloud computing techniques keep emerging in the communication field during the construction of smart cities. Correspondingly, data of graph models grows very fast, with billions of vertices and trillions of edges at every turn. Traditional independent “mini graphs” may have loads of graphs and uncomplicated iterative processes; however, they cannot extract massive amounts of messages thoroughly. The structural information comprising these vertices and edges can be regarded as the iceberg tip of the astonishing BGD scale [20, 21]. The graphic data in complex applications such as information physics systems often contain various attribute information on vertices and edges to express complex semantics. These attributes are rich in information and require space overheads. Compared to simple queries and searches based on attributes, statistical analysis algorithms on large graphs often require looping and recurrent operations based on the graph structure until the convergence conditions are reached. Hence, intermediate results should be frequently processed, such as message data generated by communication interaction during parallel iteration. BGD contains three reasoning modes, namely, collaborative classification, link forecasting, and entity analysis, as displayed in Figure 1 [22].

Fig. 1. Reasoning modes in BGD management and analysis.

Big graphs in BGD processing analysis have static structure and attribute information, and they change dynamically in many cases, such as sequence graphs and dynamic graphs. These changes constantly alter the scale and structure of the big graphs with some attributes. Hence, it is necessary to record them in detail. The time and space costs of storing, indexing, searching, and analyzing the data are far beyond the affordability of traditional centralized graph data management. Therefore, techniques for BGD distributed storage, query processing, optimization, mining, analysis, and system implementation assurance have become an urgent database problem, also a very challenging task. As the research goes deeper, only using traditional approaches to explore primary graph management and analysis issues falls far behind current demands. Hot topics keep emerging, such as using BGD to extract language features, hardware-based graph processing, and BGD processing security analysis. These issues have been discussed in the early days. However, as computer software and hardware advance, accurate and efficient approaches to solve these problems bring new opportunities and challenges to the BGD security management and analysis in DTs of CPS of the smart city.

3.2 Intelligent Computing and Forecasting of BGD in CPS of the Smart City

BGD analysis processes data collected in DTs of CPS of the smart city by investigating development demands and the specific situation of the smart city and improved functional requirements of IoT. Figure 2 illustrates the structure of IoT terminals in CPS of smart cities, including the information collection layer, the data integration layer, the computing layer, and the application layer.

Fig. 2. BGD processing and analysis structure in IoT CPS of the smart city.

The collected data require many nodes in the communication network to participate in DTs of CPS of the smart city. Some participating nodes cause other nodes in data graphs to have no labels, without being able to protect privacy or prevent information loss. Hence, some unknown node information shall be forecasted. Moreover, it is significantly vital for BGD information processing in smart cities to use exact values to identify subjective social attributes and quickly select multiple matching subgraphs with better goals from a large matching subgraph set [23]. While processing and analyzing BGD, fuzziness is very useful in handling incredibly limited and inaccurate data, especially uncertain, qualitative, and fuzzy variables. Hence, fuzziness has quite practical applicability to many management and engineering problems of BGD in smart city DTs [24]. The following equation exists for the fuzzy number \( \tilde{A}(\alpha ,C) \): (1) \( \begin{equation} {\rm{BGD}}{\mu _{\tilde{A}}}(x) = \left\{ {\begin{array}{@{}*{2}{c}@{}} {\left\{ {\begin{array}{@{}*{2}{c}@{}} {1 - \frac{{\left| {x - \alpha } \right|}}{C},}&{\left| {x - \alpha } \right| \le C,}\\ {0,}&{otherwise,} \end{array}} \right.}&{C > 0}\\ \qquad{\left\{ {\begin{array}{@{}*{2}{c}@{}} {1,}&\!\!\!\!{x = \alpha }\\ {0,}&\quad\,{otherwise,} \end{array}} \right.}&{C = 0} \end{array}} \right. \end{equation} \)

In Equation (1), \( \alpha \in A \) holds for all \( \alpha \in ( {0,1} ] \), \( \lambda \tilde{A} \) represents a fuzzy number \( \tilde{A}( {\lambda \alpha ,\lambda C} ) \), and \( {\tilde{A}_1} + {\tilde{A}_2} \) denotes another fuzzy number \( \tilde{A}( {{\alpha _1} + {\alpha _2},{C_1} + {C_2}} ) \). Hence, if: (2) \( \begin{equation} \tilde{T}_i^* = {\tilde{A}_0} + {q_{i1}}{\tilde{A}_1} + {q_{i2}}{\tilde{A}_2} + \cdots + {q_{in}}{\tilde{A}_n}, \end{equation} \)

then there is: (3) \( \begin{equation} \tilde{T}_i^*\left( {{\alpha _0} + \sum\limits_{j = 1}^n {{q_{ij}}{\alpha _j}} ,{C_0} + \sum\limits_{j = 1}^n {\left| {{q_{ij}}} \right|{C_j}} } \right). \end{equation} \)
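The fuzzy-number arithmetic of Equations (1) to (3) can be illustrated with a minimal Python sketch. The class name `FuzzyNumber`, the helper `linear_combination`, and the sample values are hypothetical and are not taken from the original work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FuzzyNumber:
    """Symmetric triangular fuzzy number with centre alpha and spread C >= 0."""
    alpha: float
    C: float

    def membership(self, x: float) -> float:
        # Equation (1): triangular membership; a single spike when C == 0.
        if self.C == 0:
            return 1.0 if x == self.alpha else 0.0
        distance = abs(x - self.alpha)
        return 1.0 - distance / self.C if distance <= self.C else 0.0


def linear_combination(a0, coeffs, terms):
    # Equations (2)-(3): centres combine with q_ij, spreads with |q_ij|.
    alpha = a0.alpha + sum(q * a.alpha for q, a in zip(coeffs, terms))
    spread = a0.C + sum(abs(q) * a.C for q, a in zip(coeffs, terms))
    return FuzzyNumber(alpha, spread)


# Illustrative values only (assumption, not data from the paper).
t = linear_combination(
    FuzzyNumber(1.0, 0.5),
    [2.0, -1.0],
    [FuzzyNumber(3.0, 1.0), FuzzyNumber(4.0, 2.0)],
)
print(t, t.membership(5.25))
```

Note that the spread grows with the absolute value of each coefficient, so a negative coefficient still widens the uncertainty of the result.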

Each BGD component collected from the CPS of smart cities can be normal or faulty, where the reliability can be evaluated as the probability that the component functions normally. Subgraphs can be taken as a system to match and process BGD patterns. Subgraph's reliability is determined by the reliability of its component nodes and their connections. Only when all its component nodes are running well, the subgraph can work normally. The reliability of component node i in the communication network is denoted as \( R_i^{(s)}(t) = P( {T_i^{(s)} > t} ),i = 1,2,\ldots,m \), where \( T_i^{(s)} \) represents the lifetime of BGD node i; in subgraph S, this describes the time from \( t = 0 \) to the component node failure [25]. Suppose that the failures of component nodes are independent of each other, which are marked as \( T_1^{(s)} \), \( T_2^{(s)}, \ldots \), and \( T_m^{(s)} \). In that case, all component nodes in S start to work at t = 0. The lifetime of the subgraph is calculated according to: (4) \( \begin{equation} T_{}^{(s)} = \min \left(T_1^{(s)},T_2^{(s)},\ldots,T_m^{(s)}\right)\!. \end{equation} \)

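Because the subgraph survives beyond time t only if every component node does, and the node failures are independent, the reliability of the subgraph can be expressed as: (5) \( \begin{equation} R_{}^{(s)}(t) = P\left( {T_{}^{(s)} > t} \right) = \prod\limits_{i = 1}^m {R_i^{(s)}(t)} . \end{equation} \)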

Sometimes, even there are m component nodes in S, the subgraph can only operate normally while at least k component nodes are running well, where \( k < m \). For example, only when the key components satisfy the needs of service consumers can the composite service run well in the consumption domain BGD of CPS of smart cities; in community communication network BGD of CPS of smart cities, only when community members perform their functions correctly can they settle down in a small community [26]. Mathematically, if the lifetime of component nodes in S is \( T_1^{(s)},T_2^{(s)}, \ldots ,T_m^{(s)} \), then these components are independent and conform to the same distribution, and their reliability is \( R_i^{(s)}(t) = P( {T_i^{(s)} > t} ) \). Then, the reliability of the subgraph can be calculated according to: (6) \( \begin{equation} R_{}^{(s)} = \sum\limits_{j = k}^m {\left( {\begin{array}{@{}*{1}{c}@{}} m\\ j \end{array}} \right){{\left( {R_i^{(s)}(t)} \right)}^j}{{\left( {1 - R_i^{(s)}(t)} \right)}^{m - j}}} . \end{equation} \)
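As a rough illustration of the two reliability models above (all nodes required, and at least k of m identical nodes required), the following Python sketch evaluates both formulas. The function names and the sample reliabilities are hypothetical.

```python
from math import comb, prod


def series_reliability(node_reliabilities):
    # All m nodes must work: product of the individual reliabilities.
    return prod(node_reliabilities)


def k_out_of_m_reliability(r, k, m):
    # At least k of m independent, identically distributed nodes must work
    # (Equation (6)); r is the common node reliability at the time of interest.
    return sum(comb(m, j) * r**j * (1 - r) ** (m - j) for j in range(k, m + 1))


# Illustrative values only (assumption, not data from the paper).
print(series_reliability([0.9, 0.9]))
print(k_out_of_m_reliability(0.9, 2, 3))
```

With the same node reliability, the k-out-of-m subgraph is never less reliable than the all-nodes subgraph, because it tolerates up to m - k node failures.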

Equation (7) indicates the reliability of subgraph S. (7) \( \begin{equation} R_{}^{(s)}(t) = \phi \left( {R_1^{(s)},R_2^{(s)}, \ldots ,R_m^{(s)}} \right)\!. \end{equation} \)

In Equation (7), \( \phi \) represents the known function. Historically, component node i has experienced \( n_i^{(s)} \) tests, in which it has succeeded \( s_i^{(s)} \) times and failed \( f_i^{(s)} \) times, that is, \( n_i^{(s)} = s_i^{(s)} + f_i^{(s)} \), where \( n_i^{(s)} \ge 1,s_i^{(s)} \ge 0,f_i^{(s)} \ge 0 \). Suppose that: (8) \( \begin{equation} Z_{}^{(s)} = \left( {s_1^{(s)},s_2^{(s)}, \ldots ,s_m^{(s)}} \right)\!, \end{equation} \) (9) \( \begin{equation} \theta _{}^{(s)} = \left( {R_1^{(s)},R_2^{(s)}, \ldots ,R_m^{(s)}} \right)\!. \end{equation} \)

Assume that the tests are independent of each other. The distribution of \( Z_{}^{(s)} \) depends on \( \theta _{}^{(s)} \). The probability of observing a particular vector of success counts is denoted as: (10) \( \begin{equation} {P_{\theta _{}^{(s)}}} = P\left( {Z_{}^{(s)} = \left( {i_1^{(s)},i_2^{(s)}, \ldots ,i_m^{(s)}} \right)} \right), \end{equation} \) which, by independence, is a product of binomial terms: (11) \( \begin{equation} {P_{\theta _{}^{(s)}}} = \prod\limits_{k = 1}^m {\left( {\begin{array}{@{}*{1}{c}@{}} {n_k^{(s)}}\\[2pt] {i_k^{(s)}} \end{array}} \right)} {\left( {R_k^{(s)}} \right)^{i_k^{(s)}}}{\left( {1 - R_k^{(s)}} \right)^{n_k^{(s)} - i_k^{(s)}}}. \end{equation} \)
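The product-of-binomials likelihood in Equation (11) can be evaluated directly. The sketch below is a minimal illustration; the function name `outcome_likelihood` and the numbers are hypothetical.

```python
from math import comb


def outcome_likelihood(successes, trials, reliabilities):
    # Probability of seeing i_k successes in n_k independent tests for every
    # node k, given node reliabilities R_k (product of binomial terms).
    p = 1.0
    for i_k, n_k, r_k in zip(successes, trials, reliabilities):
        p *= comb(n_k, i_k) * r_k**i_k * (1 - r_k) ** (n_k - i_k)
    return p


# Illustrative values only: two nodes tested 2 and 3 times.
print(outcome_likelihood([2, 3], [2, 3], [0.9, 0.7]))
```

Summing this likelihood over every possible success vector gives 1, which is a quick sanity check on an implementation.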

To extract and forecast the collected BGD features, Deep Learning is introduced into the BD analysis system of smart city IoT. Convolutional Neural Network (CNN) is an Artificial Neural Network; it is the first model that can successfully train and learn networks with multiple layers [27]. CNN minimizes the requirements on preprocessing and extracts the most expressive features directly from the original data input without manually specifying features. Figure 3 demonstrates how CNN extracts and classifies features of BGD in smart city DTs.

Fig. 3. Process of BGD feature extraction and classification in smart city DTs based on CNN.

The pooling layer of CNN performs down-sampling operations on the input feature maps in length and width dimensions to reduce the model parameters. According to the number of parameters, the data complexity of different scenarios in CPS of smart cities has been reduced, diminishing the over-fitting degree and the probability of local minimum. Moreover, the pooling layer can make the model more robust to translation and distortion of the image.

IoT data in CNN goes through multiple convolutional layers and pooling layers; then, the data is connected via one or more fully connected layers. All neurons in the current layer are connected to those in the last layer. Usually, this layer depends on two 1D network layers. Local information of the convolutional layer or the pooling layer is grouped. The activation function used by all neurons is often the Rectified Linear Unit (ReLU).

AlexNet [28], a deep CNN model, is selected to reduce the calculation amount and enhance the generalization performance, because it has multiple network layers and stronger learning ability. Furthermore, the functional layer of AlexNet's convolutional layer is improved. The operation of “local normalization before pooling” is advanced to “pooling before local normalization.” This improvement has two advantages. First, the generalization ability of AlexNet can be enhanced while the over-fitting can be weakened, which greatly shortens the training time. Second, overlapping pooling before local normalization can reserve more data and weaken redundant information during pooling and accelerate the convergence rate of the smart city DTs forecasting model during training, highlighting its superiority over other pooling approaches. In addition, DPFS-BM is introduced to ensure security and real-time performances. The Differential Privacy (DP) protection mechanism of BGD in smart city DTs is presented in Figure 4.

Fig. 4. DP protection mechanism of BGD in smart city DTs.

3.3 Intelligent Computing and Forecasting of BGD in Smart City

Regarding the current smart city development in the real world, DTs implement the analysis and forecasting function of real smart city BGD in the virtual space. AlexNet with multiple network layers and strong learning ability is selected to extract BGD features collected in the CPS of smart cities. A BGD analysis forecasting model in smart city DTs is designed based on DP-AlexNet while ensuring the model's security performance, as displayed in Figure 5.

Fig. 5. Prediction model of BGD analysis in DTs of CPS of the smart city based on DP-AlexNet.

The privacy security protection process of DPFS-BM in this prediction model is presented in Figure 6.

Fig. 6. Privacy and security protection process of DPFS-BM.

In this prediction model, the tth feature map \( y_t^l(i,j) \) of the lth convolutional layer is sampled using overlapping pooling as shown in Equation (12). (12) \( \begin{equation} \begin{array}{@{}*{1}{c}@{}} {a_t^l(i,j) = \max \big\{ y_t^l(i,j),{i_s} \le i \le {i_s} + {w_c} - 1,}\\ \qquad\,\,\,\,\,\,\,{{j_s} \le j \le {j_s} + {w_c} - 1\big\} } \end{array}. \end{equation} \)

In Equation (12), s is the pooling movement step size, \( {w_c} \) refers to the width of the pooling area, and \( {w_c} > s \).
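Overlapping pooling as in Equation (12) can be sketched in plain Python as follows. The window width 3 and stride 2 are assumptions (the text only requires the window to be wider than the stride), and the function name is hypothetical.

```python
def overlapping_max_pool(feature_map, window=3, stride=2):
    # Max pooling whose windows overlap because window > stride.
    if window <= stride:
        raise ValueError("overlapping pooling needs window > stride")
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for i_s in range(0, rows - window + 1, stride):
        pooled_row = []
        for j_s in range(0, cols - window + 1, stride):
            # Maximum over i_s <= i <= i_s + w_c - 1, j_s <= j <= j_s + w_c - 1.
            pooled_row.append(max(
                feature_map[i][j]
                for i in range(i_s, i_s + window)
                for j in range(j_s, j_s + window)
            ))
        pooled.append(pooled_row)
    return pooled


# Illustrative 5 x 5 feature map with entries 0..24.
fmap = [[r * 5 + c for c in range(5)] for r in range(5)]
print(overlapping_max_pool(fmap))
```

Here neighbouring windows share one row or column, so each output value is computed from partly the same inputs as its neighbour.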

A local normalization layer is added after the first and second pooling layers of AlexNet to standardize the pooled feature map \( a_t^l(i,j) \) into \( c_t^l(i,j) \), where the sum runs over the neighboring feature maps n at the same position: (13) \( \begin{equation} c_t^l(i,j) = a_t^l(i,j)/{\left( {k + \alpha \sum\limits_{n = \max (0,t - m/2)}^{\min (N - 1,t + m/2)} {{{\left( {a_n^l(i,j)} \right)}^2}} } \right)^\beta }. \end{equation} \)

In Equation (13), k, α, β, and m are all hyperparameters set to 2, 0.78, \( 10^{-4} \), and 7, respectively, and N stands for the total number of convolution kernels in the lth convolutional layer. To prevent gradient dispersion [29], the activation function takes ReLU to activate the convolution output \( S_t^l(i,j) \): (14) \( \begin{equation} y_t^l(i,j) = f\left( {S_t^l(i,j)} \right) = \max \left\{ {0,S_t^l(i,j)} \right\}\!\!. \end{equation} \)

In Equation (14), \( f(\cdot) \) represents ReLU. To prevent over-fitting in the fully connected layer, the dropout parameter is set to 0.5. All the feature maps obtained from Equation (12) for l = 5 are reconstructed into a high-dimensional single-layered neuron structure \( {C^5} \); thus, the input \( Z_i^6 \) of the ith neuron in the sixth fully connected layer is: (15) \( \begin{equation} Z_i^6 = W_i^6{C^5} + b_i^6. \end{equation} \)

In Equation (15), \( W_i^6 \) and \( b_i^6 \) are the weight and bias of the ith neuron in the sixth fully connected layer, respectively.
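The ReLU activation of Equation (14) and the local normalization of Equation (13) can be sketched for a single spatial position as follows. The sketch assumes that the sum in Equation (13) runs over neighbouring feature maps and that m/2 is rounded down; the function names and sample activations are hypothetical.

```python
def relu(s):
    # Equation (14): pass positive values, clamp negatives to zero.
    return max(0.0, s)


def local_response_norm(activations, k, alpha, beta, m):
    # activations[t] is the value of feature map t at one fixed (i, j).
    # Each value is divided by a term built from the squared values of up to
    # m neighbouring feature maps (assumption: m/2 is floored).
    n_maps = len(activations)
    normalized = []
    for t, a_t in enumerate(activations):
        lo = max(0, t - m // 2)
        hi = min(n_maps - 1, t + m // 2)
        energy = sum(activations[n] ** 2 for n in range(lo, hi + 1))
        normalized.append(a_t / (k + alpha * energy) ** beta)
    return normalized


# Illustrative activations; hyperparameters in the order listed in the text.
maps = [relu(s) for s in (-1.0, 0.5, 2.0, 3.0)]
print(local_response_norm(maps, k=2.0, alpha=0.78, beta=1e-4, m=7))
```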

To improve the generalization ability, dropout is applied to the output neurons \( {C^l} \) of the sixth and seventh fully connected layers: \( r_j^l \sim bernoulli(dp) \), and \( {\tilde{C}^l} = {r^l}{C^l} \). Therefore, the input of the ith neuron in the seventh and eighth fully connected layers, \( Z_i^{l + 1} \), is \( W_i^{l + 1}{\tilde{C}^l} + b_i^{l + 1} \), where the output of the ith neuron in the sixth and seventh fully connected layers, \( C_i^l \), is \( f( {Z_i^l} ) \), namely, \( \max \{ {0,Z_i^l} \} \). Finally, the output \( {q^i} \) of the ith neuron in the eighth fully connected layer can be obtained according to Equation (16). (16) \( \begin{equation} {q^i} = soft\max (Z_i^8) = \frac{{{e^{Z_i^8}}}}{{\sum\nolimits_{j = 1}^{12} {{e^{Z_j^8}}} }}. \end{equation} \)
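The Bernoulli masking of the fully connected outputs can be sketched as below. The sketch assumes that the Bernoulli parameter is the probability of keeping a neuron (with the stated value 0.5, keeping and dropping are equally likely); the function name and the optional `rng` argument are hypothetical.

```python
import random


def dropout(activations, keep_prob=0.5, rng=None):
    # Draw one Bernoulli(keep_prob) variable per neuron and multiply the
    # activations by the resulting 0/1 mask.
    if rng is None:
        rng = random.Random()
    mask = [1 if rng.random() < keep_prob else 0 for _ in activations]
    masked = [r * c for r, c in zip(mask, activations)]
    return masked, mask


# Illustrative activations with a fixed seed for repeatability.
masked, mask = dropout([0.3, 1.2, 0.0, 2.5], rng=random.Random(0))
print(mask, masked)
```

Many frameworks additionally rescale the kept activations by 1 / keep_prob during training; the text does not mention this, so the sketch leaves it out.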

The cross-entropy loss function suitable for classification is selected as the error function, which can be written as Equation (17). (17) \( \begin{equation} Loss = - \sum\limits_{i = 1}^K {{y_i} \cdot \log \left( {{p_i}} \right)}. \end{equation} \) (18) \( \begin{equation} {p_i} = \frac{{\exp \left( {{{\tilde{y}}_i}} \right)}}{{\sum\nolimits_{j = 1}^K {\exp \left( {{{\tilde{y}}_j}} \right)} }}. \end{equation} \)

In Equations (17) and (18), K denotes the number of categories, \( {y_i} \) describes the true category distribution of the sample, \( {\tilde{y}_i} \) signifies the network output, and \( {p_i} \) represents the classification result after the SoftMax classifier. SoftMax's input is an N-dimensional real number vector, denoted as x, and its output is calculated according to Equation (19). (19) \( \begin{equation} \xi {(x)_i} = \frac{{{e^{{x_i}}}}}{{\sum\nolimits_{n = 1}^N {{e^{{x_n}}}} }},i = 1,2,\ldots,N. \end{equation} \)
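A minimal Python sketch of SoftMax and the cross-entropy loss is given below. It uses the conventional leading minus sign so that the loss is non-negative, and it subtracts the largest logit before exponentiating, which is a standard numerical safeguard that does not change the result. The function names and sample values are hypothetical.

```python
import math


def softmax(logits):
    # Shift by the maximum for numerical stability, then normalize.
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(y_true, logits):
    # y_true is the true category distribution (for example one-hot).
    probs = softmax(logits)
    return -sum(y * math.log(p) for y, p in zip(y_true, probs) if y > 0)


# Illustrative three-class example.
print(softmax([1.0, 2.0, 3.0]))
print(cross_entropy([0, 0, 1], [1.0, 2.0, 3.0]))
```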

Essentially, SoftMax can map an N-dimensional arbitrary real number vector to an N-dimensional vector whose values all fall in the range of (0,1), thereby normalizing the vector. To reduce the computational complexity, the output data volume is reduced to \( 2^8 \) through the μ-law compression conversion with μ = 255, thereby improving the model's forecasting efficiency. (20) \( \begin{equation} f({x_t}) = sign({x_t})\frac{{\ln (1 + \mu \left| {{x_t}} \right|)}}{{\ln (1 + \mu )}},\left| {{x_t}} \right| < 1. \end{equation} \)
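The compression in Equation (20) can be sketched as follows. The `quantize` helper, which maps the compressed value onto 256 integer bins, is an assumption about how the reduced output range might be used; it is not described in the text.

```python
import math


def mu_law_compress(x, mu=255):
    # Equation (20): logarithmic compression of a value with |x| < 1.
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def quantize(y, levels=256):
    # Hypothetical helper: map a value in [-1, 1] to an integer bin 0..levels-1.
    return min(levels - 1, int((y + 1) / 2 * levels))


# Illustrative inputs: small values get a finer share of the output range.
for x in (-0.5, -0.01, 0.0, 0.01, 0.5):
    y = mu_law_compress(x)
    print(x, round(y, 4), quantize(y))
```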

The model reported here is trained through learning rate updating using the polynomial decay approach (Poly) [30], as presented in Equation (21). (21) \( \begin{equation} init\_lr \times {\left( {1 - \frac{{epoch}}{{\max \_epoch}}} \right)^{power}}. \end{equation} \)

In Equation (21), the initial learning rate \( init\_lr \) is 0.0005 (or 5e-4), and power is set to 0.9.
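The polynomial decay schedule of Equation (21) is short enough to state directly in code. The use of 120 as the maximum epoch in the example is an assumption for illustration; the function name is hypothetical.

```python
def poly_lr(epoch, max_epoch, init_lr=5e-4, power=0.9):
    # Equation (21): the learning rate decays from init_lr to 0 at max_epoch.
    return init_lr * (1 - epoch / max_epoch) ** power


# Illustrative schedule over 120 epochs (assumed maximum).
for epoch in (0, 30, 60, 90, 120):
    print(epoch, poly_lr(epoch, 120))
```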

3.4 Simulation Experiment

OPNET is taken as the simulation tool to verify the performance of this prediction model. OPNET builds a cellular network comprising three macro base stations and 15 micro base stations, covering an area of 5 km × 5 km. Each macro base station covers a circle with a radius of 1 km, in which there are five micro base stations. Users are randomly distributed in their respective cells. All sample data are divided into a training dataset and a test dataset in a ratio of 7:3. The proportion of each data type in the two datasets shall be the same. Hyperparameters of AlexNet are set as follows: 120 iterations, 2,000 seconds, and 128 Batch Size. Some state-of-the-art models are included for performance comparison, including Long Short-Term Memory (LSTM) [31], CNN, Recurrent Neural Network [32], AlexNet, and Multi-Layer Perceptron (MLP) [33]. The experimental environment is configured from software and hardware. As for software, the operating system is Linux 64 bit, the Python version is 3.6.1, and the development platform is PyCharm. As for hardware, the Central Processing Unit (CPU) is Intel Core [email protected] 8 Cores, the internal memory is Kingston DDR4 2,400 MHz 16 G, and the Graphics Processing Unit (GPU) is NVIDIA GeForce 1060 6G.
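The 7:3 split with equal class proportions in both datasets corresponds to a stratified split. A minimal sketch, assuming each sample carries a single class label, is shown below; the function name, the seed, and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(samples, labels, train_ratio=0.7, seed=0):
    # Split each class separately so both sets keep the same class proportions.
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)
    train, test = [], []
    for label, group in by_label.items():
        rng.shuffle(group)
        cut = round(len(group) * train_ratio)
        train += [(s, label) for s in group[:cut]]
        test += [(s, label) for s in group[cut:]]
    return train, test


# Toy data: 10 samples of class "a" and 20 of class "b".
samples = list(range(30))
labels = ["a"] * 10 + ["b"] * 20
train_set, test_set = stratified_split(samples, labels)
print(len(train_set), len(test_set))
```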

4 RESULTS AND DISCUSSIONS

4.1 Forecasting Performance Analysis and Comparison

The DP-AlexNet model is compared with LSTM, CNN, RNN, AlexNet, and MLP regarding root-mean-square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE), to validate its security prediction performance. The results are demonstrated in Figure 7. Furthermore, the time these models require for training and testing is compared, as shown in Figure 8.

Fig. 7. Model error (%) variations with iteration times (a) RMSE; (b) MAE; (c) MAPE.

Fig. 8. Time required for training and testing (a) Duration of training; (b) Duration of testing.

The DP-AlexNet model can achieve 4.64% RMSE, 5.34% MAE, and 7.82% MAPE, significantly lower than those of other models. Especially, the difference between DP-AlexNet and MLP in MAE and MAPE is about 2 ∼ 3 times. Compared with AlexNet before improvement, the error performance is also significantly improved. Hence, the proposed BGD analysis prediction model in DTs of CPS of the smart city based on DP-AlexNet can provide stronger forecasting accuracy and robustness.
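For reference, the three error measures used in this comparison can be computed as in the sketch below, following their standard definitions. The function names and the sample values are hypothetical and are not the experimental data.

```python
import math


def rmse(y_true, y_pred):
    # Root mean square error.
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    # Mean absolute error.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    # Mean absolute percentage error, in percent; true values must be non-zero.
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative values only.
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
print(rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```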

According to Figure 8, as iterations continue, the time required first increases and then almost remains unchanged. The DP-AlexNet model reduces the time required for security forecasting remarkably compared with other models. A possible cause is that the improved AlexNet has enhanced generalization capabilities while also accelerating the convergence rate during model training. Therefore, the DP-AlexNet model can achieve higher forecasting accuracy more quickly.

As shown in Figure 9, when the successful transmission probability (denoted as p) is set to 100% and 80%, different λ values exert different influences on the transmission delay. As the distance between relay signal collection nodes extends, the transmission delay for a given λ decreases continuously. Then, situations with fixed p and varying λ are analyzed. The transmission delay decreases as λ increases. The maximal delays are about 564.15 ms and 693.06 ms, respectively. When λ = 0.05, the transmission delay approaches zero. To sum up, the theoretical results agree most closely with the actual transmission delays when λ lies between 0.01 and 0.05.

Fig. 9. Experimental transmission delays at p = 80% and p = 100% (a) p = 100%; (b) p = 80%.

A differential privacy protection algorithm, DPFS-BM, is proposed for mining frequent and complex subgraphs on a multi-graph; it outputs the frequent subgraph patterns together with their noisy supports under privacy protection. The DPFS-BM algorithm is the key to privacy protection processing, and its four sub-processes jointly determine its performance advantages. Process 1: According to the characteristics of the multi-graph, preprocess its multi-edges to obtain the maximum length limit and the maximum frequent multi-edge length. Process 2: Use intelligent truncation to mine frequent seeds. Process 3: Generate a set of candidate subgraphs by expanding the frequent seeds, then estimate the number of frequent subgraphs. Process 4: Use the exponential mechanism to select the frequent subgraph patterns from the candidate subgraph set, add Laplace noise to their true supports, and finally obtain the privacy-protected frequent subgraph patterns and their corresponding noisy supports.
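Process 4 can be illustrated with a short sketch. It is not the DPFS-BM implementation: the function names, the even split of the privacy budget over k selections, and a support sensitivity of 1 are all assumptions made for the example, and the candidate patterns are hypothetical labels rather than real subgraphs.

```python
import math
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def exponential_mechanism(candidates, epsilon, sensitivity, rng):
    """Pick one pattern with probability proportional to exp(eps * support / (2 * sensitivity))."""
    patterns = list(candidates)
    max_support = max(candidates.values())
    # Subtracting the maximum keeps the exponent non-positive and avoids overflow.
    weights = [
        math.exp(epsilon * (candidates[p] - max_support) / (2.0 * sensitivity))
        for p in patterns
    ]
    return rng.choices(patterns, weights=weights, k=1)[0]

def private_frequent_patterns(candidates, k, eps_select, eps_noise,
                              sensitivity=1.0, seed=None):
    """Select k patterns without replacement and release their noisy supports.

    Assumption: each selection uses eps_select / k and each noisy support
    uses eps_noise / k, so the whole release spends eps_select + eps_noise.
    """
    rng = random.Random(seed)
    remaining = dict(candidates)
    released = {}
    for _ in range(min(k, len(remaining))):
        chosen = exponential_mechanism(remaining, eps_select / k, sensitivity, rng)
        true_support = remaining.pop(chosen)
        noise = laplace_noise(k * sensitivity / eps_noise, rng)
        released[chosen] = true_support + noise
    return released

# Hypothetical candidate set: pattern label -> true support.
candidate_supports = {"A-B": 50, "B-C": 40, "A-C": 5, "C-D": 3}
print(private_frequent_patterns(candidate_supports, k=2,
                                eps_select=1.0, eps_noise=1.0, seed=0))
```

Patterns with higher support are more likely to be selected, but any candidate can be chosen, and the released supports differ from the true ones by Laplace noise whose scale grows as the privacy budget shrinks.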

4.2 Outage Probability Analysis

The outage probability is analyzed with respect to three factors: whether the system has multi-hop paths, whether it has direct paths, and how many relays the multi-hop paths contain. The results are shown in Figure 10.

Fig. 10. Influences of different factors on outage probability (a) Whether the system has multi-hop paths; (b) Whether the system has direct paths; (c) Different relay numbers of multi-hop paths.

According to Figure 10(a), regardless of the value of m, the outage probability is always lower when the system has multi-hop paths, and it falls more sharply as m increases. The virtual antennas formed in multi-hop paths enable collaboration, which improves the system performance. As shown in Figure 10(b), regardless of the value of m, the outage probability also drops sharply when the system has direct paths, and again more sharply as m increases. Since direct paths send signals straight from the source node to the terminal node, the loss is reduced and the system performance improves. According to Figure 10(c), different relay numbers in the DP-AlexNet model result in different system outage probabilities. With a fixed m, the more relays there are, the smaller the outage probability and the better the system performance.
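The qualitative trends above can be reproduced with a small numerical sketch. The channel model is not restated in this section, so the sketch rests on two explicit assumptions: m is treated as an integer Nakagami-m fading index, and the available paths are treated as independent, with an outage occurring only when every path is in outage. The function names and numbers are hypothetical.

```python
import math

def link_outage(m, snr_threshold, mean_snr):
    """Outage probability of one link under Nakagami-m fading (integer m >= 1).

    The received SNR is Gamma distributed with shape m and mean `mean_snr`,
    so P(SNR < threshold) = 1 - exp(-x) * sum_{k<m} x^k / k!, x = m * threshold / mean.
    """
    x = m * snr_threshold / mean_snr
    partial_sum = sum(x ** k / math.factorial(k) for k in range(m))
    return 1.0 - math.exp(-x) * partial_sum

def combined_outage(path_outages):
    """Assumed combining rule: independent paths, outage only if all of them fail."""
    return math.prod(path_outages)

# Hypothetical setting: threshold SNR 1, mean SNR 10 on every path.
for m in (1, 2, 3):
    single = link_outage(m, 1.0, 10.0)
    with_extra_path = combined_outage([single, single])
    print(m, single, with_extra_path)
```

Under these assumptions, a larger m lowers the outage probability of each link, and every additional independent path multiplies the overall outage probability by a factor below one, which matches the direction of the trends reported for Figure 10.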

4.3 SpeedUp Indicator Analysis and Comparison

The DP-AlexNet model is compared with AlexNet, LSTM, CNN, RNN, and MLP regarding the training time required under different numbers of nodes and different data volumes. The results are illustrated in Figures 11 and 12.

Fig. 11. Training time required and SpeedUp indicator under different data volumes (a) Training time required; (b) SpeedUp indicator.

Fig. 12. Training time required and SpeedUp indicator under different numbers of nodes (a) Training time required; (b) SpeedUp indicator.

In the BGD scenario, the improved AlexNet is less sensitive to data growth than the other models, which makes it suitable for processing large amounts of data. As the number of computing nodes increases, the acceleration effect becomes much more apparent and the SpeedUp indicator is higher. Hence, this experiment also supports the superiority of the DP-AlexNet model in BGD recognition and classification for the DTs of the smart-city CPS.
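The SpeedUp indicator is not restated in this section. A common definition, assumed here, is the ratio of the training time on the smallest node count to the training time on n nodes. The sketch below uses that definition; the function names and timings are hypothetical and are not the measured values behind Figures 11 and 12.

```python
def speedup(baseline_time, parallel_time):
    """Assumed definition: baseline training time divided by parallel training time."""
    if parallel_time <= 0:
        raise ValueError("parallel_time must be positive")
    return baseline_time / parallel_time

def speedup_curve(times_by_nodes):
    """Map node count -> SpeedUp, using the smallest node count as the baseline."""
    baseline_nodes = min(times_by_nodes)
    baseline_time = times_by_nodes[baseline_nodes]
    return {
        nodes: speedup(baseline_time, elapsed)
        for nodes, elapsed in sorted(times_by_nodes.items())
    }

# Hypothetical training times in seconds for 1, 2, and 4 computing nodes.
timings = {1: 120.0, 2: 66.0, 4: 40.0}
curve = speedup_curve(timings)
print(curve)
```

A model whose curve stays closer to the node count scales better; comparing such curves across models is what the SpeedUp panels summarize.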

5 CONCLUSIONS

As data in the CPS of smart cities grows unprecedentedly fast, techniques such as DTs and BGD analysis show huge application potential. Privacy protection in frequent pattern mining has long been one of the hot issues in data security. In multi-graph data, the interactive information between a pair of vertices is richer, and the mined frequent subgraphs can provide potentially valuable information. Here, DPFS-BM is introduced to ensure data privacy security, and AlexNet is improved to construct a DP-AlexNet-based BGD analysis and prediction model for the DTs of the smart-city CPS. The simulation results show that the RMSE, MAE, and MAPE of the reported model are 4.64%, 5.34%, and 7.82%, respectively, significantly lower than those of the other prediction models. The model also significantly shortens the time required for security prediction. The DP-AlexNet model is verified mainly from two aspects, utility and running time. The experimental results show that the proposed privacy protection method can protect privacy while ensuring data availability. In addition, it achieves high accuracy and notable acceleration. Hence, this model can support the intelligent and digitized development of smart cities.

However, the present work still has some weaknesses. DPFS-BM in the constructed CPS model can only protect the privacy of static BGD. BD development has made data flow a hot topic worldwide, yet mining subgraphs in graph data flows may bring privacy leakage risks. Follow-up research will consider privacy protection for subgraph mining in dynamic scenarios, which is extremely significant for subsequent smart city development.

REFERENCES

  1. [1] Fan C., Jiang Y., and Mostafavi A.. 2020. Social sensing in disaster city digital twin: Integrated textual–visual–geo framework for situational awareness during built environment disruptions. J. Manag. Eng. 36, 3 (2020), 0402000204020006.Google ScholarGoogle ScholarCross RefCross Ref
  2. [2] Fan C., Zhang C. C., Yahja A., and Mostafavi A.. 2021. Disaster city digital twin: A vision for integrating artificial and human intelligence for disaster management. Int. J. Inf. Manag. 56 (2021), 102049102053.Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. [3] Conejos Fuertes P., Martínez Alzamora F., Hervás Carot M., and Alonso Campos J. C.. 2020. Building and exploiting a digital twin for the management of drinking water distribution networks. Urb. Wat. J. 17, 8 (2020), 704713.Google ScholarGoogle ScholarCross RefCross Ref
  4. [4] Lu Q., Parlikad A. K., Woodall P., Don Ranasinghe G., Xie X., Liang Z., and Schooling J.. 2020. Developing a digital twin at building and city levels: Case study of West Cambridge campus. J. Manag. Eng. 36, 3 (2020), 0502000405020011.Google ScholarGoogle ScholarCross RefCross Ref
  5. [5] Yaqoob I., Salah K., Uddin M., Jayaraman R., Omar M., and Imran M.. 2020. Blockchain for digital twins: Recent advances and future research challenges. IEEE Netw. 34, 5 (2020), 290298.Google ScholarGoogle ScholarCross RefCross Ref
  6. [6] Alexopoulos K., Nikolakis N., and Chryssolouris G.. 2020. Digital twin-driven supervised machine learning for the development of artificial intelligence applications in manufacturing. Int. J. Comput. Integ. Manuf. 33, 5 (2020), 429439.Google ScholarGoogle ScholarCross RefCross Ref
  7. [7] Lu Y., Huang X., Zhang K., Maharjan S., and Zhang Y.. 2020. Low-latency federated learning and blockchain for edge association in digital twin empowered 6G networks. IEEE Trans. Industr. Inform. 17, 7 (2020), 50985107.Google ScholarGoogle ScholarCross RefCross Ref
  8. [8] S. M. E. Sepasgozar 2021. Differentiating digital twin from digital shadow: Elucidating a paradigm shift to expedite a smart, sustainable built environment. Buildings 11, 4 (2021), 151.Google ScholarGoogle ScholarCross RefCross Ref
  9. [9] Francisco A., Mohammadi N., and Taylor J. E.. 2020. Smart city digital twin–enabled energy management: Toward real-time urban building energy benchmarking. J. Manag. Eng. 36, 2 (2020), 04019045.Google ScholarGoogle ScholarCross RefCross Ref
  10. [10] Laamarti F., Badawi H. F., Ding Y., Arafsha F., Hafidh B., and El Saddik A.. 2020. An ISO/IEEE 11073 standardized digital twin framework for health and well-being in smart cities. IEEE Access 8 (2020), 105950105961.Google ScholarGoogle ScholarCross RefCross Ref
  11. [11] Fuller A., Fan Z., Day C., and Barlow C.. 2020. Digital twin: Enabling technologies, challenges and open research. IEEE Access 8 (2020), 108952108971.Google ScholarGoogle ScholarCross RefCross Ref
  12. [12] White G., Zink A., Codecá L., and Clarke S.. 2021. A digital twin smart city for citizen feedback. Cities 110 (2021), 103064103068.Google ScholarGoogle ScholarCross RefCross Ref
  13. [13] Li X., Jiang C., Du D., Wang R., Fei M., Li X. and Tian Y.. 2021. Optimization and control of cyber–physical power systems under dual-network interactive cascading failure. Contr. Eng. Pract. 111 (2021), 104789.Google ScholarGoogle ScholarCross RefCross Ref
  14. [14] Hussain B., Du Q., Sun B., and Han Z.. 2020. Deep learning-based DDoS-attack detection for cyber–physical system over 5G network. IEEE Trans. Industr. Inform. 17, 2 (2020), 860870.Google ScholarGoogle ScholarCross RefCross Ref
  15. [15] Hu S., Shi Y., Colombo A., Karnouskos S., and Li X.. 2021. Cloud-edge computing for cyber-physical systems and internet-of-Things. IEEE Trans. Indust. Inform. 11. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  16. [16] Villanustre F., Chala A., Dev R., Xu Lili., LexisNexis J. S., Furht B., and Khoshgoftaar T.. 2021. Modeling and tracking Covid-19 cases using big data analytics on HPCC system platform. J. Big Data 8, 1 (2021), 124.Google ScholarGoogle ScholarCross RefCross Ref
  17. [17] Wang Y., Song J., Zhou K., and Liu Yu. 2021. Unsupervised deep hashing with node representation for image retrieval. Pattern Recog. 112 (2021), 107785.Google ScholarGoogle ScholarCross RefCross Ref
  18. [18] Zhou Y., Wang M., Zhang C, Ren F., Ma X., and Du Q.. 2021. A points of interest matching method using a multivariate weighting function with gradient descent optimization. Trans. GIS 25, 1 (2021), 359381.Google ScholarGoogle ScholarCross RefCross Ref
  19. [19] Ma Z., Yang L. T., and Zhang Q.. 2020. Support multimode tensor machine for multiple classification on industrial big data. IEEE Trans. Industr. Inform. 17, 5 (2020), 33823390.Google ScholarGoogle ScholarCross RefCross Ref
  20. [20] Wang X., Wang Y., Tao F., and Liu A.. 2021. New paradigm of data-driven smart customisation through digital twin. J. Manuf. Syst. 58 (2021), 270280.Google ScholarGoogle ScholarCross RefCross Ref
  21. [21] Shirowzhan S., Tan W., and Sepasgozar S.. 2020. Digital twin and CyberGIS for improving connectivity and measuring the impact of infrastructure construction planning in smart cities 9, 4 (2020), 240. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  22. [22] Sepasgozar S. M.. 2021. Differentiating digital twin from digital shadow: Elucidating a paradigm shift to expedite a smart, sustainable built environment. Buildings 11, 4 (2021), 151154.Google ScholarGoogle ScholarCross RefCross Ref
  23. [23] Autiosalo J., Vepsäläinen J., Viitala R., and Tammi K.. 2019. A feature-based framework for structuring industrial digital twins. IEEE Access 8 (2019), 11931208.Google ScholarGoogle ScholarCross RefCross Ref
  24. [24] Xue F., Lu W., Chen Z., and Webster C. J.. 2020. From LiDAR point cloud towards digital twin city: Clustering city objects based on Gestalt principles. ISPRS J. Photogram. Rem. Sens. 167 (2020), 418431.Google ScholarGoogle ScholarCross RefCross Ref
  25. [25] Liu Y., Zhang L., Yang Y., Zhou L., Ren L, Wang F., and Deen M. J.. 2019. A novel cloud-based framework for the elderly healthcare services using digital twin. IEEE Access 7 (2019), 4908849101.Google ScholarGoogle ScholarCross RefCross Ref
  26. [26] Wang K. J., Lee Y. H., and Angelica S.. 2020. Digital twin design for real-time monitoring–a case study of die cutting machine. Int. J. Produc. Res. 115. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  27. [27] Tao F., Sui F., Liu A., Qi Q., Zhang M., Song B., and Nee A. Y. C.. 2019. Digital twin-driven product design framework. Int. J. Produc. Res. 57, 12 (2019), 39353953.Google ScholarGoogle ScholarCross RefCross Ref
  28. [28] Minerva R., Awan F. M., and Crespi N.. 2021. Exploiting digital twin as enablers for synthetic sensing. IEEE Internet Comput., 11. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  29. [29] Dong R., She C., Hardjawana W., Li Y., and Vucetic B. 2019. Deep learning for hybrid 5G services in mobile edge computing systems: Learn from a digital twin. IEEE Trans. Wirel. Commun. 18, 10 (2019), 46924707.Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. [30] Sun J., Tian Z., Fu Y., Geng J., and Liu C.. 2020. Digital twins in human understanding: A deep learning-based method to recognize personality traits. Int. J. Comput. Integ. Manuf. 114.Google ScholarGoogle Scholar
  31. [31] Elayan H., Aloqaily M., and Guizani M.. 2021. Digital twin for intelligent context-aware IoT healthcare systems. IEEE Internet Things J. 11. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. [32] Nguyen H. X., Trestian R., To D., and Tatipamula M.. 2021. Digital twin for 5G and beyond. IEEE Commun. Mag. 59, 2 (2021), 1015.Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. [33] Dai Y., Zhang K., Maharjan S., and Zhang Y.. 2020. Deep reinforcement learning for stochastic computation offloading in digital twin networks. IEEE Trans. Industr. Inform. 17, 7 (2020), 49684977.Google ScholarGoogle ScholarCross RefCross Ref

          • Published in

            ACM Transactions on Management Information Systems  Volume 13, Issue 4
            December 2022
            255 pages
            ISSN:2158-656X
            EISSN:2158-6578
            DOI:10.1145/3555789
            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 10 August 2022
            • Online AM: 1 April 2022
            • Accepted: 1 February 2022
            • Revised: 1 January 2022
            • Received: 1 July 2021
            Published in tmis Volume 13, Issue 4

            Qualifiers

            • research-article
            • Refereed
