1 False data attack model

The reliable and safe operation of power systems is crucial to the economy and homeland security of a nation [1, 2]. However, the operational security of modern power systems is being challenged by the deep integration of information technologies [3, 4]. To monitor the real-time operating state of a power system, an increasing number of sensors and meters are being installed to collect real-time measurements (such as bus voltages and line power flows), which are then transmitted to the control center. Power system operators make energy management decisions based on the received measurements, which collectively provide the situational awareness of power system operations. These sensors and meters are therefore regarded as the eyes of a power system. Real-time measurements inevitably contain noise caused by mechanical flaws of meters and measuring errors. To detect and screen out potential bad data, state estimation is employed in the energy management system (EMS) to determine the most likely operating condition of the power system from redundant power grid measurements. Random noise is not correlated with the physical characteristics of a power system, so it increases the residual in the state estimation.

In the DC state estimation, the state vector \(\widehat{\varvec{\theta}}\) is estimated by solving the following optimization problem:

$${\widehat{\varvec{\theta}}} = \arg \mathop {\hbox{min} }\limits_{{\varvec{\theta}}} \left\| {{\varvec{z}} - {\varvec{H\theta }}} \right\|_{2}$$
(1)

where \({\varvec{z}}\) is the vector of measurements; \({\varvec{H}}\) is the Jacobian matrix corresponding to the network configuration.

The most popular method to solve this optimization problem is the least squares method [5]. After the best possible estimate of the system state is determined, the residual \(r\) is calculated:

$$r = \parallel{\varvec{z}} - {{\varvec{H}}\widehat{\varvec{\theta}}}\parallel_{2}$$
(2)

If the residual is less than a given threshold value, the estimated state \(\widehat{\varvec{\theta}}\) is regarded as acceptable. Otherwise, the measurements are deemed to contain bad data. Based on this principle of bad data detection, Liu et al. [6] proved that if the injected bad data satisfy

$${\varvec{a}} = {\varvec{Hc}}$$
(3)

where \({\varvec{ a}}\) is the malicious data injected by an attacker and \({\varvec{c}}\) is the corresponding increment in the state vector,

then the overall residual of the power system will remain the same as if there were no bad data. That is, if the original data can pass the bad data detection, the corrupted data will escape detection in the same manner. Therefore, this kind of false data injection attack can hardly be detected by the residual test.
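The effect of condition (3) can be checked numerically. The following is a minimal sketch, assuming a small randomly generated measurement matrix `H` and state bias `c` (both hypothetical and not taken from any test system); it estimates the state by least squares as in (1) and compares the residual (2) before and after the attack.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical DC measurement model: 8 measurements, 3 state variables.
H = rng.normal(size=(8, 3))
theta_true = rng.normal(size=3)
z = H @ theta_true + 0.01 * rng.normal(size=8)   # noisy measurements

def estimate(H, z):
    """Least-squares state estimate and residual norm, as in (1) and (2)."""
    theta_hat, *_ = np.linalg.lstsq(H, z, rcond=None)
    return theta_hat, np.linalg.norm(z - H @ theta_hat)

c = np.array([0.2, -0.1, 0.05])   # state bias chosen by the attacker (assumed)
a = H @ c                         # attack vector satisfying (3)

theta_hat, r_clean = estimate(H, z)
theta_bad, r_attacked = estimate(H, z + a)

print(r_clean, r_attacked)        # the two residuals coincide up to rounding
print(theta_bad - theta_hat)      # the estimate is shifted by c
```

In this sketch the residual is unchanged while the estimated state moves by exactly `c`, which is why a threshold on the residual alone cannot flag the attack.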

DC state estimation provides a fast and simple method to estimate the operating condition of a power system. However, it ignores bus voltage magnitudes, reactive power flows and line losses. To better reflect the operating condition of the power system, AC state estimation is usually used in practical power system applications.

In an AC system, the Jacobian matrix H depends on the current state of the system, so (3) is rewritten as [7]:

$${\varvec{a}} = {\varvec{H}}\left( {{\varvec{x}} + {\varvec{c}}} \right)\cdot\left( {{\varvec{x}} + {\varvec{c}}} \right) - {\varvec{H}}\left( {\varvec{x}} \right)\cdot{\varvec{x }}$$
(4)

We can see that the injected false data a depends on the current state of the power system. In other words, the attacker has to obtain or estimate the system state to construct the undetectable false data a. Since only a small number of phasor measurement units (PMUs) have been installed in power systems, phase angles at most buses are unavailable. Consequently, it is more difficult for an attacker to launch a false data attack against AC state estimation.

Recently, false data injection attacks have attracted intensive research interest. Significant efforts have been dedicated to revealing the wide impacts of false data injection attacks on economic operation [8, 9], transmission line power flows [10,11,12], real-time electricity markets [13,14,15], transient stability [16, 17], real-time topology [18, 19], line outage detection [20], PMU data quality [21, 22], microgrids [23, 24], automatic generation control (AGC) [25, 26] and so on. Note that the existing models have a common drawback that limits their applications. It is known that the Jacobian matrix H is determined by the topology and line parameters of a power network. Thus, an attacker has to obtain the full topology and network parameters of the entire power system to construct the undetectable false data. As will be discussed later, this is an impractical assumption. In this paper, we review the existing literature on false data injection attacks based on both DC and AC power flow models under the assumption of incomplete network information.

The rest of this paper is organized as follows. Section 2 discusses the motivation for developing a local model for false data injection attacks. Sections 3–7 investigate the local attack mechanism (with incomplete network information) based on barrier conditions, blind identification and data-driven approaches. Section 8 further discusses three special cases for the local false data attacks. Finally, Sects. 9 and 10 discuss the future work and conclude the paper.

2 Why do we need a local attack model?

Most of the existing attack models are developed based on the assumption that attackers have access to the full network configuration information including the topology and other physical properties of the targeted power system. However, this assumption is impractical due to the following reasons.

  1) The amount of required power grid data is huge

Today’s power grids are continually expanding to better serve electricity customers. For example, the IEEE 13659-bus system (as shown in Table 1 [27]) has 13659 buses and over 20000 lines. For such a large system, an attacker would need to obtain the reactance and resistance of over 20000 lines.

Table 1 Numbers of buses and lines in different systems

Since an attacker needs to pay a cost to obtain each set of line parameters, it is difficult or even impossible for an attacker with a limited budget to gain full information about the network configuration. In other words, the global models, which are based on full network information, are impractical.

  2) Power grid data are difficult to obtain

On the other hand, the power grid is a critical infrastructure whose reliable and safe operation is very important to the economic prosperity and homeland security of a nation. In particular, the complicated international environment drives attackers to attack power grids, and the number of cyber-attacks against power grids keeps increasing. An attacker can identify the weaknesses of a power system and launch a cyber or physical attack that brings severe consequences to the power system operation. For these reasons, sensitive power system data are carefully protected. As a result, it is difficult for an attacker to access or obtain these data.

Considering the difficulty of obtaining the network parameters of a power system, an intelligent data attack should have the following three properties:

  1) Undetectable. As the EMS uses the bad data detection procedure to remove the interference of false data with the state estimation, the first task for an attacker is to design false data that can escape being identified as bad data. One possible method is to inject false data that obey the physical laws of the power system, such as Kirchhoff's current law (KCL) and Kirchhoff's voltage law (KVL). By doing so, the injected false data will not increase the residual of the state estimation and will eventually avoid being detected.

  2) Reduced requirement of network information. As discussed, an attacker needs to obtain the topology and line parameters of a power grid to make the injected data undetectable. However, it is difficult for an attacker to obtain this information because it is protected for security reasons. So, from the perspective of an attacker, a more practical strategy is to reduce the amount of network information required.

  3) Severe consequence of data attacks. The ultimate goal of a cyber-attack is to cause severe consequences to the power system operation, e.g., transmission line outages, loss of load or increased operation cost. So, an attacker actively searches for the vulnerabilities of a power grid and then launches a cyber-attack against them.

Once these three conditions are met, such an attack can be defined as an intelligent data attack.

3 Attack model based on DC barrier condition

In this section, we first investigate the attack mechanism of false data injection attack based on incomplete network information and DC power flow equations.

3.1 DC model

In reality, an attacker constrained by its capacity/budget can only attack a limited number of measurements in a local region (denoted as region A). In the global model, false data injection initiated in region A will eventually incur changes in the power flows outside that region. That is, the measurements in the outer region also need to be attacked to hide the initial false data injection. Consequently, this attacker needs to get the topologies and line parameters of the network out of region A. So, as shown in Fig. 1, one possible strategy is to ensure that the false data injected into region A will not change the power flows in the outer region.

Fig. 1 Power flow barrier effect for DC case

We proved the following theorem in [28].

Theorem 1

Suppose a power grid is decomposed into two connected regions A and N by a set of lines (tie lines). If an additional injected power \(\Delta{\varvec{P}}_{\text{A}}\) into region A makes the phase angles of all its boundary buses increase or decrease by the same amount

$$\Delta \theta_{r} = \alpha \quad \quad \forall r \in \varOmega_{\text{BA}}$$
(5)

where \(\varOmega_{\text{BA}}\) is the set of boundary buses in region A; \(\Delta \theta_{r}\) is the incremental phase angle at bus r; \(\Delta {\varvec{P}}_{\text{A}}\) is the vector of additional bus power injections in region A.

Then, the power flows in region N remain the same under the false data injection.

$$\Delta {\varvec{F}}_{\text{N}} = 0$$
(6)

where \(\Delta {\varvec{F}}_{\text{N}}\) is the incremental line flow vector in region N.

Theorem 1 states that if all the changes of phase angles at the boundary buses in region A are forced to be the same, no additional power exchange will occur between region A and the outer region. This phenomenon is referred to as the barrier effect of power flows. Constraint (5) is the barrier condition for the DC case.
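The barrier effect can be illustrated on a toy network. The sketch below assumes a hypothetical 6-bus system (bus numbering, reactances and the increments `alpha` and `beta` are all made up for illustration). It picks an angle increment that satisfies (5), lets the buses of region N follow the same shift, since a uniform shift does not alter any flow, and then checks which line flows and bus injections change.

```python
import numpy as np

# Hypothetical 6-bus network. Region A = {0, 1, 2} (bus 0 interior,
# buses 1 and 2 boundary); region N = {3, 4, 5}.
lines = [(0, 1), (0, 2), (1, 2),      # lines inside region A
         (1, 3), (2, 4),              # tie lines
         (3, 4), (3, 5), (4, 5)]      # lines inside region N
x = np.array([0.10, 0.20, 0.25, 0.15, 0.30, 0.20, 0.10, 0.12])  # reactances (p.u.)

n_bus = 6
K = np.zeros((n_bus, len(lines)))     # bus-line incidence matrix
for l, (i, j) in enumerate(lines):
    K[i, l], K[j, l] = 1.0, -1.0

alpha, beta = 0.05, 0.12              # assumed angle increments (rad)
d_theta = np.full(n_bus, alpha)       # boundary buses of A and all of N shift by alpha
d_theta[0] = beta                     # interior bus of A shifts by beta

d_F = (K.T @ d_theta) / x             # incremental DC line flows
d_P = K @ d_F                         # incremental bus injections

print(d_F[3:])   # tie lines and lines in N: no change
print(d_P)       # nonzero only at buses 0, 1, 2 (region A)
```

Only injections inside region A are modified, and they sum to zero, so the attacker needs the reactances of the three lines inside A and nothing from region N.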

We can see that the boundary conditions met by the attacking region A enable an attacker to design undetectable false data (that follow KCL and KVL) based only on the topology and network parameters of the local attack region. There is no need to obtain the network information of the outer region. The local model indicates that an attacker can launch a successful false data attack at a very low cost.

Following our work, a considerable amount of research has been carried out to reveal the attack mechanism based on the barrier effect of power flows.

A bilevel optimization model was proposed in [29] to evaluate the impacts of local false data attacks on the long-term power supply reliability. In the lower level, to avoid requiring access to the topology and line parameters of the entire power network, the authors adopted the barrier conditions to confine the additional power flows to the attacking region. Later, the impacts of local load redistribution attacks with incomplete network information on power supply adequacy were also investigated in [30]. The attack process for modifying the measurements is modeled by a semi-Markov model.

Li et al. in [31] studied the local coordinated cyber-physical attack scheme using the incomplete network information. This is achieved by estimating the reactance of lines in the attacking region and replacing the non-attacking region with its equivalent network. It was demonstrated that such an attack can mask the outage of transmission lines.

Sun et al. [32] proposed a false data proportional attack, in which an attacker could construct the false data that is able to avoid the traditional bad data detection method just based on the topology information of a local region. The line parameters of transmission lines are not needed. In addition, it was proved that the injected false data can be adjusted proportionally when the measurement of a bus and transmission-line data is changed.

Ly et al. [33] employed the local attack scheme proposed in [28] to examine a rerouting strategy for defending against false data attacks in power systems by increasing the power grid topology complexity. An algorithm was developed to evaluate the probability of a successful false data attack for a particular topology and status of circuit breakers.

The authors in [34] introduced a practical attack scheme using limited network information. Considering the limited information of the attacker, a multiple linear regression model was introduced to learn the relationship between the attack region and the outer subnetwork based on historical data. A bilevel optimization problem was set up to identify the most damaging consequences of the attack to the operation of the power system.

Zhang et al. [35] proposed a false data attack model with limited network information, in which an attacker has perfect knowledge of the network information of the targeted subnetwork but has only estimated knowledge of the power transfer distribution factor. It was revealed that such an attack scheme is able to avoid the traditional bad data detection.

3.2 Feasibility theorem

It should be pointed out that the boundary conditions affect the feasibility of the DC power flows in region A when the phase angles at boundary buses are set to the same value. So, it is necessary to develop a simple method to find a feasible region. A feasible attack region is defined as a region in which a nonzero attack vector can be constructed.

Based on graph theory, we introduced the following two theorems in [28] to find a feasible region.

Theorem 2

Suppose a power grid is decomposed into two regions A and N connected by a set of lines (tie lines). Suppose the attacking region A consists of \(\rho\) non-boundary buses and \(\sigma\) boundary buses. If there are at most \(q = \rho - 1\) non-attackable bus injection measurements in region A, then there exists a feasible non-zero attacking vector.

Theorem 2 can also be extended to include the cases where the attacking region A is disconnected and/or the non-attacking region N is disconnected, as shown in Theorem 2E.

Theorem 2E

Suppose a power grid is decomposed into an attacking region A and a non-attacking region N connected by a set of lines (tie lines). Suppose the attacking region A consists of \(\rho\) non-boundary buses and \(\sigma\) boundary buses. The \(\sigma\) boundary buses in A are connected to \(n\) non-attacking islands. If there are at most \(q = \rho + n - 2\) non-attackable bus injection measurements in region A, then there exists a feasible non-zero attacking vector.

According to Theorems 2 and 2E, the only thing that we need to do is to count the number of buses. We use the following example to illustrate the two proposed feasibility theorems.

In Fig. 2, α and β denote the corresponding bus phase angle increments. As shown on the left of Fig. 2, the attacking region includes three boundary buses. There is only one non-boundary bus in the attacking region, so \(\rho = 1.\) The non-attacking region is connected, so \(n = 1\). We can see that \(q > \rho + n - 2\), and thus the condition of Theorem 2 is not satisfied. That is, the attacking region is infeasible. However, as shown on the right of Fig. 2, if we add one attackable bus to the attacking region, \(\rho = 2\) and \(q \le \rho + n - 2\). The condition of Theorem 2 is then satisfied and the attacking region is feasible.

Fig. 2 Illustrative examples of feasibility theorem
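The counting test can be written in a few lines. The sketch below is only an illustration of the inequality in Theorem 2E (which reduces to Theorem 2 when \(n = 1\)); the function name is hypothetical, and the value \(q = 1\) used for the Fig. 2 example is an assumption, since the text states only the inequalities.

```python
def satisfies_theorem_2e(rho: int, n_islands: int, q: int) -> bool:
    """Return True if q <= rho + n - 2, the sufficient condition of Theorem 2E.

    rho       -- number of non-boundary buses in the attacking region
    n_islands -- number of non-attacking islands connected to the region
    q         -- number of non-attackable bus injection measurements
    """
    return q <= rho + n_islands - 2

# Fig. 2 example, assuming q = 1 non-attackable injection measurement.
left = satisfies_theorem_2e(rho=1, n_islands=1, q=1)    # original region
right = satisfies_theorem_2e(rho=2, n_islands=1, q=1)   # one attackable bus added
print(left, right)
```

Because the theorems give a sufficient condition, a `False` result here means only that the counting argument does not certify the region.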

3.3 Optimizing the attacking region

As discussed in Sect. 2, it is difficult for an attacker to obtain the line parameters because they are protected. Although the local DC model reveals that an undetectable attack vector can be constructed using incomplete network information, the following question remains: how much information is needed to launch a successful attack, given that an attacker hopes to perform a false data injection with the same impact but based on as little information as possible?

To address this issue, we proposed a heuristic algorithm in [19, 36] to determine an optimal attacking region such that the required network information can be reduced as much as possible. As shown in Fig. 3 [19], the potential attacking region starts from a small sub-network and gradually expands until the false data injection becomes feasible in the region. According to the boundary conditions, power flows change only in the current attacking region without affecting the power flows in the outer region E, provided that the phase angles at boundary buses are set to the same value.

Fig. 3 Expansion of the attacking region

The whole algorithm can be summarized as follows [36]:

Step 1: Determine an initial attacking region of a target component.

Step 2: Obtain the parameters of all lines in the attacking region.

Step 3: Set the phase angles at boundary buses to the same value and then determine the feasibility of the attacking region.

Step 4: If a feasible attacking region is found, stop. Otherwise, go to Step 5.

Step 5: Expand the attacking region and go to Step 2.

Because the search starts from a small region, the final attacking region does not grow unnecessarily large. Accordingly, the numbers of buses and lines included in the final attacking region are relatively small compared to those required by global models. So, the attacking cost of obtaining the network information can be reduced significantly.
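The expansion loop of Steps 1 to 5 can be sketched on a plain adjacency list. Several choices below are assumptions of this sketch rather than details given in [36]: the initial region is the target bus plus its neighbours, the feasibility test is the counting condition of Theorem 2E, the region grows by one ring of neighbouring buses per iteration, and the names `find_attack_region` and `protected` are hypothetical.

```python
from collections import deque

def components(nodes, adj):
    """Count connected components of the subgraph induced by `nodes`."""
    nodes, seen, count = set(nodes), set(), 0
    for s in nodes:
        if s in seen:
            continue
        count += 1
        seen.add(s)
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in nodes and v not in seen:
                    seen.add(v)
                    queue.append(v)
    return count

def find_attack_region(adj, target, protected):
    """Grow a region around `target` until the Theorem 2E count holds."""
    region = {target} | set(adj[target])            # Step 1: initial region
    while True:
        outside = set(adj) - region
        if not outside:
            return region                           # the whole grid was reached
        # Step 2 (collecting line parameters of `region`) is outside this sketch.
        boundary = {u for u in region if any(v in outside for v in adj[u])}
        rho = len(region - boundary)                # non-boundary buses
        n = components(outside, adj)                # non-attacking islands
        q = len(region & protected)                 # non-attackable injections
        if q <= rho + n - 2:                        # Steps 3-4: feasibility test
            return region
        region |= {v for u in boundary for v in adj[u]}   # Step 5: expand

# Hypothetical 7-bus grid; bus 1 carries a protected injection measurement.
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 4], 3: [1, 4, 5],
       4: [2, 3, 6], 5: [3, 6], 6: [4, 5]}
region_protected = find_attack_region(adj, target=0, protected={1})
region_open = find_attack_region(adj, target=0, protected=set())
print(region_protected, region_open)
```

With the protected measurement the region has to grow by one ring before the count is satisfied; without it the initial three-bus region already passes.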

4 Attack model based on AC barrier condition

In this section, we continue to investigate the attack mechanism against AC state estimation when the network information obtained by attackers is incomplete.

4.1 Local model

In DC power flows, the superposition principle applies. However, when the AC model is considered, incremental power flows depend on the current system state. In other words, the Jacobian matrix H is not constant under various operating conditions. In this sense, the boundary conditions (5) for ensuring the barrier effect of power flows are no longer valid, and new boundary conditions must be developed for the AC case. As shown in Fig. 4 [37], similar to the DC case, the power system is decomposed into the attacking region A and the non-attacking region N.

Fig. 4 Power flow barrier effect for the AC case

The measurements are separated into two parts: \({\varvec{z}}_{1}\) includes all the measurements in the attacking region A excluding the flow measurements on the tie lines; \({\varvec{z}}_{2}\) contains the remaining measurements. After that, we have

$$\left[ {\begin{array}{*{20}c} {{\varvec{z}}_{1} } \\ {{\varvec{z}}_{2} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {{\varvec{H}}_{11} } & {{\varvec{H}}_{12} } \\ 0 & {{\varvec{H}}_{22} } \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {\widehat{\varvec{x}}_{1} } \\ {\widehat{\varvec{x}}_{2} } \\ \end{array} } \right] + \left[ {\begin{array}{*{20}c} {\varvec{e}_{1} } \\ {\varvec{e}_{2} } \\ \end{array} } \right]$$
(7)

where \({\varvec{e}}_{1} ,{\varvec{ e}}_{2}\) are the corresponding random error vectors for \({\varvec{z}}_{1} ,{\varvec{ z}}_{2}\). Different from the DC case, the Jacobian matrices \({\varvec{H}}_{11}\), \({\varvec{H}}_{12} ,{\varvec{ H}}_{22}\) in the AC case depend on the state vector.

After the false data \({\varvec{z}}_{1}^{\varvec{\prime }}\) are injected, the residual of the system is determined by:

$$r^{\prime} = \hbox{min} \parallel\varvec{z^{\prime}} - \varvec{H}\left( {\varvec{x^{\prime}}} \right)\varvec{x^{\prime}}\parallel_{2}$$
(8)

where \(r^{\prime}, \varvec{x^{\prime}}\) are the residual and state variable after the injection of false data.

It was proved in [37] that

$$r^{\prime} \le r^{\prime\prime} = \left\| {\varvec{z^{\prime}} - \varvec{H}\left( {\widehat{\varvec{x}}^{{\prime }} } \right)\widehat{\varvec{x}}^{{\prime }} } \right\|_{2} = \left\| {\varvec{e}_{2} } \right\|_{2} < r = \left\| {\begin{array}{*{20}c} {\varvec{e}_{1} } \\ {\varvec{e}_{2} } \\ \end{array} } \right\|_{2}$$
(9)

where \(\widehat{\varvec{x}}^{{\prime }}\) is the estimated state vector after the injection of false data.

Inequality (9) indicates that the injected false data decrease the overall residual of the power system, since the injected false data conform to the physical laws of the attacked power system. The proposed practical attack model for the AC case is characterized by the following three properties [37]:

  1) The false data injected into the attacking region follow KCL and KVL;

  2) The voltage magnitudes at the boundary buses in the attacking region are set to the values of the corresponding measurements;

  3) The flows on the tie lines are set to the corresponding measurements.

4.2 Estimating phase angle difference

For a pair of buses b and d (see Fig. 5 [37]), if we find a path k that connects these two buses (e.g., \(b, d\)), then it is trivial to prove that (10) holds [37],

$$\mathop \sum \limits_{{l \in S_{k} }} \delta_{l} = \theta_{b} - \theta_{d}$$
(10)

where \(S_{k}\) includes all lines in path k; \(\delta_{l}\) is the angle difference of line l.

Fig. 5 A path connecting two boundary buses

Constraint (10) indicates that one can select a path \(k\) that connects buses b and d, and then sum the angle differences of all lines along the path to determine the value of \(\theta_{b} - \theta_{d}\). By doing so, an attacker does not need to obtain the actual phase angles. However, the challenge here is how to calculate the angle difference of a line without knowing the phase angles at its terminal buses.

Under the DC assumptions, the power flow of line i-j is:

$$p_{ij} = \frac{{\theta_{i} - \theta_{j} }}{{x_{ij} }}$$
(11)

Based on (11), the angle difference of a line is calculated by

$$\theta_{ij} = \theta_{i} - \theta_{j} = x_{ij} p_{ij}$$
(12)

As a general trend, the greater the ratio of reactance \(X\) to resistance \(R\) of a line is, the smaller the error of the DC approximation (12) will be. This motivates us to find a best path that has the largest average ratio of reactance to resistance [37].
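Combining (10) and (12), the angle difference between two boundary buses is the sum of \(x_{l} p_{l}\) along the chosen path. The sketch below is a minimal illustration with made-up per-unit values; the function name and the convention that each flow is measured in the direction of travel from bus b to bus d are assumptions of the example.

```python
def angle_difference_dc(path_lines):
    """Sum x_l * p_l over the lines of a path, following (10) and (12).

    Each entry is (reactance, real power flow), with the flow measured in
    the direction of travel from bus b to bus d.
    """
    return sum(x * p for x, p in path_lines)

# Hypothetical three-line path from bus b to bus d (per-unit values).
path = [(0.10, 0.50), (0.20, -0.10), (0.05, 0.80)]
theta_bd = angle_difference_dc(path)   # radians
print(theta_bd)                        # approximately 0.07
```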

Next, we introduce another approach to estimate the angle difference of a line.

From (12), it can be observed that the line flow is actually determined by the angle difference of the line instead of actual values of phase angles. This provides a method to estimate the angle difference of a line using the associated bus voltage and line flow measurements.

The estimating principle of the angle difference of line i-j is explained as follows.

In the AC case, the real power flow of line i-j is:

$$p_{ij} = V_{i}^{2} g_{ij} - V_{i} V_{j} \left[ {g_{ij} \cos \left( {\theta_{i} - \theta_{j} } \right) + b_{ij} { \sin }\left( {\theta_{i} - \theta_{j} } \right)} \right]$$
(13)

where \(V_{i}\) is the voltage magnitude at bus i; \(g_{ij}\) and \(b_{ij}\) are the series conductance and susceptance of line i-j.

From (13), we have

$$g_{ij} \cos\theta_{ij} + b_{ij} \sin\theta_{ij} = \frac{{V_{i}^{2} g_{ij} - p_{ij} }}{{V_{i} V_{j} }}$$
(14)

Equivalently, (15) holds:

$$\sqrt {g_{ij}^{2} + b_{ij}^{2} } \sin \left( {\theta_{ij} + \omega } \right) = \frac{{V_{i}^{2} g_{ij} - p_{ij} }}{{V_{i} V_{j} }}$$
(15)

where

$$\omega = { \arctan }\frac{{b_{ij} }}{{g_{ij} }}$$
(16)

Combining (15) and (16), we obtain that

$$\theta_{ij} = { \arcsin }\left( {\frac{{V_{i}^{2} g_{ij} - p_{ij} }}{{V_{i} V_{j} \sqrt {g_{ij}^{2} + b_{ij}^{2} } }}} \right) - \omega + 2k\uppi \quad k = 0, \pm 1, \ldots$$
(17)

Mathematically, there are multiple solutions for \(\theta_{ij}\). However, note that

$$- 15^\circ \le \theta_{ij} \le 15^\circ$$
(18)

and

$$\left| {\theta_{ij} + 2k\uppi} \right| > 15^\circ \quad\quad k = \pm 1, \pm 2, \ldots$$
(19)

This implies that (17) has a unique solution that satisfies constraint (18).
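The root selection can be sketched numerically. The code below solves (14) directly instead of evaluating a closed form: it rewrites the left-hand side in amplitude-phase form with a cosine and uses `atan2`, so that the quadrant of the auxiliary angle is handled explicitly. This is an implementation choice of the sketch, and the function name and line data are hypothetical. Roots outside the window (18) are discarded.

```python
import math

def angle_difference_ac(p_ij, v_i, v_j, g, b, limit_deg=15.0):
    """Solve (14) for theta_ij and keep the roots inside the window (18)."""
    rhs = (v_i**2 * g - p_ij) / (v_i * v_j)
    amp = math.hypot(g, b)
    phi = math.atan2(b, g)          # g*cos(t) + b*sin(t) = amp*cos(t - phi)
    ratio = max(-1.0, min(1.0, rhs / amp))   # guard against rounding outside [-1, 1]
    limit = math.radians(limit_deg)
    roots = []
    for sign in (1.0, -1.0):
        t = phi + sign * math.acos(ratio)
        t = math.atan2(math.sin(t), math.cos(t))   # wrap into (-pi, pi]
        if abs(t) <= limit:
            roots.append(t)
    return roots

# Hypothetical line with r = 0.01 p.u. and x = 0.1 p.u.
r, x = 0.01, 0.1
g, b = r / (r**2 + x**2), -x / (r**2 + x**2)
v_i, v_j, theta_true = 1.02, 0.99, 0.08

# Synthetic flow "measurement" generated from (13).
p_ij = v_i**2 * g - v_i * v_j * (g * math.cos(theta_true) + b * math.sin(theta_true))

roots = angle_difference_ac(p_ij, v_i, v_j, g, b)
print(roots)   # a single root, close to 0.08 rad
```

For this line the second mathematical root lies far outside the window and is rejected. Two admissible roots could occur only for lines with a very low reactance-to-resistance ratio.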

Once the angle differences among boundary buses in the attacking region are determined, the attacker can choose one of these boundary buses as the reference bus and set its phase angle to zero. The phase angles of the other buses then follow from the angle differences, which remain unchanged.

5 Attack model based on blind identification

In this section, we review the existing literature on local attack schemes that use incomplete network information based on blind identification techniques.

Anwar et al. [38] showed that a stealthy attack vector can be constructed without any topology information or line parameters of a power grid. They demonstrated that subspace transformation methods applied to the measurement matrix can be used to generate a hidden attack. However, such an attack scheme is only valid when the measurement errors are Gaussian noise. In the presence of gross errors, the injected false data will still trigger the alarm of the bad data detection procedure. To overcome this issue, a technique was developed to keep the false data stealthy even when gross errors exist.

The authors in [39] introduced a strategy to mask the sensitive information of a power grid when solving the multi-party AC optimal power flow problem on a public platform. In the attack model, the attacker has knowledge of the general AC optimal power flow model, but has no knowledge of the topology and parameters of the power grid. It was revealed that the topology information of a power grid may be identified if the rank information of the constraints is inferable.

The authors in [40] studied the principle of blind false data injection attacks using the principal component approximation method, without knowing the Jacobian matrix or the distribution of state variables. Principal component analysis (PCA) is used to transform the measurement vector into a linear combination of several vectors with uncorrelated components. The simulation results demonstrated the stealth of the attack vector generated by the PCA matrix.
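The idea behind a PCA-based blind attack can be illustrated with synthetic data. The following sketch is written in the spirit of this approach and is not the algorithm of [40]: the measurement matrix, noise level and attack coefficients are made up, the model is linear (DC), and the attacker is assumed to know the number of state variables `n`. The attack vector is built only from intercepted measurements, and its effect on the operator's residual (2) is compared with that of a random vector of the same norm.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, T = 10, 4, 2000                        # measurements, states, snapshots

H = rng.normal(size=(m, n))                  # true Jacobian, unknown to the attacker
states = rng.normal(size=(T, n))
Z = states @ H.T + 0.01 * rng.normal(size=(T, m))   # intercepted measurements

# PCA via SVD of the centred data: the first n right-singular vectors
# approximate a basis of the column space of H.
Zc = Z - Z.mean(axis=0)
_, _, Vt = np.linalg.svd(Zc, full_matrices=False)
U = Vt[:n].T                                 # m x n orthonormal basis

def residual(z):
    """Residual (2) as computed by the operator, who knows the true H."""
    theta_hat, *_ = np.linalg.lstsq(H, z, rcond=None)
    return np.linalg.norm(z - H @ theta_hat)

z = Z[-1]
a_blind = U @ np.array([2.0, -1.0, 0.5, 1.5])       # attack built from the PCA basis only
a_random = rng.normal(size=m)
a_random *= np.linalg.norm(a_blind) / np.linalg.norm(a_random)

r_clean = residual(z)
r_blind = residual(z + a_blind)
r_random = residual(z + a_random)
print(r_clean, r_blind, r_random)
```

The blind attack leaves the residual almost unchanged because the estimated basis nearly spans the column space of `H`, whereas the random vector of equal size raises it markedly.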

In [41], Chin et al. analyzed the blind attack scheme against the AC state estimation in a power system. The geometric approach was adopted to relax the strong assumption such that an attacker does not need to obtain the full topology and line parameter information. The criteria for successful AC blind and non-blind false data attacks (FDAs) are derived. Specifically, an attacker can modify the state of the targeted bus if the additional information of the original system states is known to this attacker.

6 Attack model based on data-driven approaches

In this section, we review the existing literature on local attack schemes that use incomplete network information based on data-driven techniques.

In [42], Xie et al. proposed a data driven approach to realize an undetectable false data injection attack with incomplete network information. The principle is to relax the system matrix used for constructing the injected false data. It was proved that such knowledge can be learned by a two-stage approach. In the first stage, a blind identification approach is employed to estimate the incomplete system matrix using a sequence of intercepted meter data. In the second stage, the estimated system matrix is used to construct the attack vector by a sparsity-exploiting method.

Chen et al. in [43] proposed a new strategy of false data injection attacks to disrupt the normal operation of a power system regulated by automatic voltage control (AVC). Such an attack can be launched by an attacker who has little knowledge of the entire power grid. A partially observable Markov decision process is used to determine the optimal attack strategy. Moreover, a Q-learning algorithm with nearest sequence memory is used to realize the real-time data attack.

In [44], the authors proposed an alternative data-driven approach to construct stealthy attacks using only the subspace network information of the measurement signals without any requirement on the prior knowledge of the system states. However, such an attack scheme will fail if the measurement signals contain missing values. In this case, low-rank and sparse matrix approximation techniques are utilized to overcome this issue. By doing so, the injected false data is able to escape from the bad data detection.

Considering the difficulty of obtaining the network information, Kim et al. in [45] proposed a subspace method to learn the system operating subspace from measurements. The feasibility conditions for an unobservable subspace attack are derived under both full and limited measurement assumptions. After the system subspace is estimated, two attack strategies are presented to ensure the impact of such an attack on the operation of the system. The first one is to affect the system state directly by hiding the attack vector in the system subspace. The second strategy induces the operator to remove the normal data.

In [46], the authors further developed a data framing attack strategy which can perturb the state estimation to an arbitrary degree under the condition that only half of the critical measurements are acquired by the attackers. This type of stealthy attack uses the subspace information of power system measurements and exploits normal meter measurements as sources of malicious data. It is shown that the framing attack is able to misguide the operator into removing critical measurements from the framed meters, and the attacker can adjust the degree of disturbance by carefully selecting the attack magnitude.

7 Other local attack models

In Sects. 3–6, we reviewed local attack models that use limited network information of a power grid based on barrier conditions, blind identification and data-driven approaches. Besides these, there are several other local attack models.

The authors in [47] investigated the possibility of launching an undetectable false data injection attack without the prior knowledge of the power grid topology. The results show that the Jacobian matrix of a power grid can be approximately estimated by the linear independent component analysis when the system dynamics are small. Once the Jacobian matrix is estimated, the attacker can use it to design the injected false data that can pass the bad data detection procedure.

Tajer et al. in [48,49,50] investigated the attack strategy for an attacker who only has limited and imperfect information about the power grid. An optimal attack strategy was designed to ensure the economic profit of such an attack while taking into account the bounded errors of the knowledge of the power grid network information.

Bi et al. [51] introduced optimal undetectable data attacks against the DC state estimation with partial topological knowledge. In particular, it was proved that such topological information is not required for constructing the undetectable false data if a power grid has a special structure, e.g., a bridge structure. A Min-Cut method was proposed to minimize the required topological information.

Following the local attack scheme in [28], Deng et al. in [52] proposed an attack model against distribution system state estimation. Specifically, the authors discussed the strategy to avoid having the complete knowledge of the network topology and related parameters. Similarly, it is shown that an attacker can estimate system state only with the knowledge of power flow or power injection measurements. Moreover, it was also demonstrated that the states of buses in a local region can be obtained by accessing a small number of power flow or injection measurements.

8 Discussions on special cases

In this section, we will further discuss three special cases in which the required network information can be further reduced for constructing an undetectable attack vector.

8.1 Tree topology

In general, a power grid is a meshed network. However, it is observed that the number of loops in a power grid is much smaller than the number of buses, so there exist a significant number of one-degree buses. For a tree network as shown in Fig. 6, we have

$$\varvec{F} = \varvec{X}^{ - 1} \varvec{KL}^{{\prime }}\varvec{\theta}= \varvec{A\theta }$$
(20)

where \(\varvec{X}\) is the reactance matrix; \(\varvec{KL}\) is the bus-line incidence matrix; \(\varvec{\theta}\) is the bus angle vector.

Fig. 6 Power sub-network with a tree topology

For a tree-structured network with \(n\) buses, the number of lines is \(n - 1\). It is easy to prove that

$${\text{rank }}\left( \varvec{A} \right) = n - 1$$
(21)

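Equation (21) implies that \(\varvec{A}\), which has one row per line, has full row rank. Hence, for any power flow vector \(\varvec{F}\) there exists a bus angle vector \(\varvec{\theta}\) satisfying (20), and the KVL constraint in the DC power flow equations can be ignored.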

Consequently, only the topology information of a power grid is needed to determine the DC power flows, and the line parameters are not required. Accordingly, the attacking cost is reduced significantly.
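As a minimal sketch of this point, the following Python snippet solves the line flows of a small tree network from KCL alone by repeatedly peeling off one-degree buses, and then recovers bus angles for two different sets of line reactances. The 5-bus topology, the injections, the reactance values and the function names `tree_line_flows` and `bus_angles` are illustrative assumptions and are not taken from the reviewed papers.

```python
from collections import defaultdict


def tree_line_flows(lines, injections):
    """Solve the line flows of a tree network from KCL alone (no reactances).

    lines: list of (from_bus, to_bus) pairs forming a tree.
    injections: dict bus -> net power injection (values must sum to zero).
    Returns dict line_index -> flow, positive in the from->to direction.
    """
    remaining = dict(injections)      # net power each bus still has to export
    incident = defaultdict(set)       # bus -> indices of lines not yet solved
    for idx, (i, j) in enumerate(lines):
        incident[i].add(idx)
        incident[j].add(idx)

    flows = {}
    leaves = [b for b in incident if len(incident[b]) == 1]
    while leaves:
        leaf = leaves.pop()
        if not incident[leaf]:        # its only line was solved from the other end
            continue
        idx = incident[leaf].pop()
        i, j = lines[idx]
        other = j if leaf == i else i
        # KCL at a one-degree bus: its whole net injection leaves on its only line.
        export = remaining[leaf]
        flows[idx] = export if leaf == i else -export
        remaining[other] += export    # the neighbour now has to pass this power on
        remaining[leaf] = 0.0
        incident[other].discard(idx)
        if len(incident[other]) == 1:
            leaves.append(other)
    return flows


def bus_angles(lines, flows, reactances, ref=0):
    """Recover DC bus angles that reproduce the given flows for any reactances."""
    theta = {ref: 0.0}
    pending = list(range(len(lines)))
    while pending:
        for idx in pending:
            i, j = lines[idx]
            drop = reactances[idx] * flows[idx]   # theta_i - theta_j = x * F
            if i in theta and j not in theta:
                theta[j] = theta[i] - drop
            elif j in theta and i not in theta:
                theta[i] = theta[j] + drop
            else:
                continue
            pending.remove(idx)
            break
        else:
            raise ValueError("network is not a connected tree")
    return theta


# Hypothetical 5-bus tree: bus 0 is a generator, the other buses are loads.
lines = [(0, 1), (1, 2), (1, 3), (3, 4)]
injections = {0: 1.0, 1: -0.2, 2: -0.3, 3: -0.1, 4: -0.4}
flows = tree_line_flows(lines, injections)

# Two arbitrary reactance sets: both admit angles that reproduce the same flows.
for reactances in ([0.1, 0.2, 0.3, 0.4], [0.5, 0.1, 0.7, 0.2]):
    theta = bus_angles(lines, flows, reactances)
    for idx, (i, j) in enumerate(lines):
        assert abs((theta[i] - theta[j]) / reactances[idx] - flows[idx]) < 1e-9
```

The flows are obtained without touching any reactance, and a consistent set of angles exists for every choice of reactances, which is the sense in which the KVL constraint is not binding on a tree.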

8.2 Single loop

As shown in Fig. 7, the power network has a single-loop topology, where a single loop is defined as a loop that contains no internal loop.

Fig. 7 A single loop

Since the power grid does not hold a tree topology, the KVL constraint cannot be discarded. That is, an attacker needs to obtain the parameters of these lines to construct the undetectable false data.

The outage of a line can be simulated by injecting a pair of additional power injections at its terminal buses. Accordingly, we further proposed a topology attack scheme in which an attacker injects false data into buses to change the real-time topology of a power grid sensed by the control center [19].

Consider the case where the attacker disconnects one line in the single loop, so that it becomes a tree network. Here, the topology modification is achieved not by a physical attack but by a false data injection attack. That is, an attacker changes the breaker status of one line from 1 to 0 by injecting false measurements.
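The idea can be illustrated with a small sketch under the DC model. It relies on the standard identity that, for the rest of the network, the outage of a line is equivalent to a pair of equal and opposite injections at its terminal buses. The 4-bus loop, its injections and reactances, and the function names `ring_flows` and `outage_injection_pair` are hypothetical, and the sketch is not claimed to reproduce the scheme of [19].

```python
def ring_flows(injections, reactances):
    """DC flows in a single loop; line k runs from bus k to bus (k + 1) % n."""
    n = len(injections)
    # KCL: F_k = F_0 + c_k, where c_k is the cumulative injection at buses 1..k.
    c = [0.0] * n
    for k in range(1, n):
        c[k] = c[k - 1] + injections[k]
    # KVL around the loop, sum_k x_k * F_k = 0, fixes F_0 and needs the reactances.
    f0 = -sum(x * ck for x, ck in zip(reactances, c)) / sum(reactances)
    return [f0 + ck for ck in c]


def outage_injection_pair(injections, reactances):
    """Injection pair at buses 0 and 1 emulating the outage of line 0 (bus 0 to bus 1)."""
    f0 = ring_flows(injections, reactances)[0]
    # Choose delta so that line 0 carries exactly delta: the rest of the loop
    # then sees the same flows as if line 0 were open.
    delta = f0 * sum(reactances) / reactances[0]
    modified = list(injections)
    modified[0] += delta
    modified[1] -= delta
    return delta, modified


# Hypothetical single loop with 4 buses (injections sum to zero).
injections = [1.0, -0.3, -0.5, -0.2]
reactances = [0.1, 0.2, 0.1, 0.2]

delta, modified = outage_injection_pair(injections, reactances)
emulated = ring_flows(modified, reactances)

# Flows of the tree that remains after opening line 0, obtained from KCL only.
tree = [0.0]
for k in range(1, len(injections)):
    tree.append(tree[-1] + injections[k])

# Every remaining line carries the same flow in both descriptions.
assert all(abs(e - t) < 1e-9 for e, t in zip(emulated[1:], tree[1:]))
```

Note that `ring_flows` needs the reactances because of the KVL equation, whereas the post-outage flows in `tree` are computed from the injections and the topology alone.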

In practice, a large number of PMUs have been installed to detect line outages. According to the principle of line outage detection, the goal of the attack is to minimize the residual (22) by injecting false data into a set of measurements [20].

$$r_{k} = { \hbox{min} }\left\| {\Delta\varvec{\theta}_{m,k} - \Delta\varvec{\theta}_{m,k}^{\prime }} \right\|_{2}$$
(22)

where \(\Delta\varvec{\theta}_{m,k}\) is the observed phase angle change vector of the PMU buses after line \(k\) is outaged; \(\Delta\varvec{\theta}_{m,k}^{\prime }\) is the calculated phase angle change vector of the PMU buses after line \(k\) is outaged with false data injection.

By doing so, the residual used by the PMU-based detection method to identify the outaged line will be disrupted.

To sum up, in order to finish the false data attack without line parameters, the following conditions must be satisfied [20]:

1) The false data injected into the attacking region follows KCL and KVL;

2) The injected false data can simulate the outage of one line in this single loop;

3) The injected false data can minimize the residual value in (22) such that the PMU-based line outage detection becomes invalid.

8.3 Single tie line

When the AC power flow model is adopted, as discussed in Sect. 4, an attacker needs to estimate the phase angle differences among boundary buses in the attacking region A. In this section, we further consider a special case where an attacking region is connected to the non-attacking region N through a single tie line.

In this case, one question needs to be reinvestigated: is there a need to estimate the angle difference?

As shown in Fig. 4, suppose that the actual phase angle at boundary bus b is \(\alpha\). If we revise its phase angle to \(\beta\), then the error \(\tau\) will be

$$\tau = \beta - \alpha$$
(23)

So, if we shift the phase angles at all buses in region N by \(\tau\), that is

$$\left\{ {\begin{array}{*{20}l} {\theta_{b}^{\prime } = \theta_{b} + \tau } \hfill \\ {\theta_{d}^{\prime } = \theta_{d} + \tau } \hfill \\ \end{array} } \right.$$
(24)

then we have

$$\theta_{b}^{\prime } - \theta_{d}^{\prime } = \left( {\theta_{b} + \tau } \right) - \left( {\theta_{d} + \tau } \right) = \theta_{b} - \theta_{d}$$
(25)

We can see that the angle difference of tie line b-d is not changed. Because the power flow of a line depends on the angle difference across the line rather than on the actual phase angles at its terminal buses, we can assign an arbitrary value to the phase angle at the boundary bus in region A.
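A short numerical check of (23)-(25) is sketched below. It uses the lossless AC line flow expression with unit voltage magnitudes; the angle and reactance values and the function name `tie_line_flow` are hypothetical.

```python
import math


def tie_line_flow(theta_from, theta_to, reactance, v_from=1.0, v_to=1.0):
    """Active power on a lossless line: V_from * V_to * sin(angle difference) / x."""
    return v_from * v_to * math.sin(theta_from - theta_to) / reactance


theta_b, theta_d = 0.12, 0.05   # hypothetical actual angles (rad) at the tie-line ends
x_bd = 0.25                     # hypothetical tie-line reactance (p.u.)

alpha = theta_b                 # actual angle at boundary bus b
beta = 0.40                     # arbitrarily assigned angle at bus b
tau = beta - alpha              # error defined in (23)

original = tie_line_flow(theta_b, theta_d, x_bd)
shifted = tie_line_flow(theta_b + tau, theta_d + tau, x_bd)   # common shift as in (24)

# The tie-line flow depends only on the angle difference, so it is unchanged.
assert math.isclose(original, shifted, rel_tol=1e-9)
```

Any other value of `beta` gives the same tie-line flow, which is why the boundary angle does not have to be estimated in this special case.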

In fact, since a power grid is a highly sparse network, the ratio of the number of lines to the number of buses is usually below 1.5. As a result, such cases are not uncommon in practice. For example, a sub-network may be connected to another sub-network through an HVDC tie line. In this case, the local attack scheme can be applied.

9 Future work

The required network information is crucial for the success of a false data injection attack. As an extension of the current work, we will discuss some of our future work in this section.

1) Attack mechanism against power dispatch without network parameters

We have reviewed several local attack mechanisms for launching false data injection attacks. However, these models focus on constructing an undetectable attack vector based on incomplete network information. How to ensure that the injected data can significantly impact the operation of power systems, e.g., \(N-k\) contingency analysis [53] and dispatch security [54], has not yet been sufficiently discussed. This will be investigated in our future work.

In addition, most models still require an attacker to obtain the network parameters of the attacking region. Considering that line parameters are more difficult to obtain than the topology, another direction of future work is to investigate attack strategies that do not need any network parameter. For example, it is necessary to investigate whether an attacker can design an effective attack vector based only on the topology of a power grid to significantly disrupt the economic and secure operation of a power system.

2) Local attack mechanism against distribution systems with incomplete network information

A distribution system is characterized by a large number of buses with few meters. In addition, the integration of renewable energies, such as wind power and PV, has significantly increased the uncertainty of the distribution system, which poses a challenge to the accuracy of load forecasting. Consequently, the state estimator has relatively weak situational awareness of the operation of a distribution system. This gives an attacker a better chance to compromise the real-time data. Different from transmission networks, a distribution system usually has a tree topology. As discussed, for a tree-structured power grid, there is no need to estimate the differences of phase angles among boundary buses. Instead, they can be assigned random values to construct the injected false data. Thus, it is of significance to study the local attack mechanism against distribution systems.

3) Effective detection methods

It has been shown that an attacker can launch an effective false data attack with severe consequences for a power system. Moreover, such an attack only requires the attacker to compromise a small number of measurements with a small amount of topology and network information. So, it is necessary to develop effective detection methods to defend against such attacks. The detection method should sufficiently consider the decision-making intelligence of an attacker. That is, the attack intelligence discussed in Sect. 2 might provide clues for developing effective detection methods. On the other hand, increasing the complexity and unpredictability of state estimation could be an alternative countermeasure.

10 Conclusion

Today’s power systems are subject to increasingly frequent cyber-attacks due to the integration of information technologies. The existing models for analyzing malicious behaviors of an attacker share one shortcoming: the full network information of a power grid must be available to the attacker. To address this issue, we performed a literature review on false data injection attacks using incomplete network information based on barrier conditions, blind identification and data-driven approaches. We also discussed several special cases in which an attacker can launch an effective data attack without knowing line parameters. These studies provide a more practical model to analyze the attack behaviors and highlight the cyber risk of power systems, since an attacker may be able to launch a successful false data injection attack after obtaining only a limited amount of network information.