Article

Optimal Control Algorithm for Stochastic Systems with Parameter Drift

School of Electronic Information Engineering, Xi’an Technological University, Xi’an 710021, China
* Author to whom correspondence should be addressed.
Sensors 2023, 23(12), 5743; https://doi.org/10.3390/s23125743
Submission received: 22 May 2023 / Revised: 10 June 2023 / Accepted: 15 June 2023 / Published: 20 June 2023

Abstract

A novel optimal control problem is considered for multiple-input multiple-output (MIMO) stochastic systems with mixed parameter drift, external disturbance and observation noise. The proposed controller can not only track and identify the drift parameters in finite time but also drive the system towards the desired trajectory. However, there is a conflict between control and estimation, which makes the analytic solution unattainable in most situations. A dual control algorithm based on a weight factor and the innovation is therefore proposed. First, the innovation is added to the control goal with an appropriate weight and the Kalman filter is introduced to estimate and track the transformed drift parameters. The weight factor is used to adjust the degree of drift parameter estimation in order to achieve a balance between control and estimation. Then, the optimal control is derived by solving the modified optimization problem. In this strategy, the analytic solution of the control law can be obtained. The control law obtained in this paper is optimal because the estimation of the drift parameters is integrated into the objective function, rather than being split into separate control and estimation parts as in the suboptimal control laws of other studies. The proposed algorithm achieves the best compromise between optimization and estimation. Finally, the effectiveness of the algorithm is verified by numerical experiments in two different cases.

1. Introduction

Parameter drift refers to the change in system parameters when a component exceeds its working life or fails. The occurrence of drift is closely related to the working principle, the operating environment and the physical properties of a component's materials [1], especially in aerospace systems. For example, the gyroscope installed on a satellite is the most important tool for measuring the angular velocity of the satellite attitude. However, due to the strong radiation and lightning of the space environment, as well as the mechanical torsion, degradation and wear caused by vibration, material creep and time stress, the structural parameters of a gyroscope will drift over time [2,3]. The acceleration of drift will seriously affect the navigation accuracy of the inertial platform, which may have serious consequences. If the drift parameters of a gyroscope can be accurately known at any time, the drift can be detected as quickly as possible so that corresponding measures can be taken to reduce the occurrence of accidents. Therefore, it is necessary to study the system control problem involving parameter drift.
Generally, researchers treat parameter drift as an irresolvable uncertainty, and there has been some research on this kind of uncertainty. Wen et al. [4] proposed an adaptive model predictive control strategy for uncertainties in control systems, including modeling dynamics and bounded uncertainties. Aiming at parameter uncertainty and model uncertainty, a sampling-based approximation method is proposed in [5], which consists of two parts, a dual part based on a scenario tree and an exploitation part that computes the open-loop control sequence; the two parts perform their respective functions and finally yield a dual predictive control formulation. In addition to these model structural uncertainties, there are various uncertainties caused by noise. Most current studies only consider Gaussian white noise; few researchers have studied non-Gaussian noise. Ma et al. [6] consider the case where the noise follows a heavy-tailed distribution instead of a Gaussian distribution, and Ma et al. [7] consider the filtering problem of nonlinear stochastic systems with measurement outliers. However, they all have their limitations. Ma L. et al. [7] only study the filtering problem involving measurement outliers, while Ma X. et al. [6,8] consider both filtering and control, but the research object is limited to single input–single output (SISO) systems. Up to now, there have been many research results on the control problems of stochastic linear SISO time-invariant systems involving parameter drift. For fast arbitrary drift phenomena, the drift parameters are viewed as unknown augmented states that are compensated via model error compensation in [9]. For deterministic drift phenomena, a double exponentially weighted moving average feedback controller is designed to compensate for the slow drift of the system in [10]. In addition, Yang et al. [11] proposed a suboptimal dual control method for systems with parameter drift. However, these methods are only for SISO cases. For MIMO systems, the common approach is to decouple them into multiple SISO systems. However, such methods often overlook a large amount of information in the actual system and cannot accurately describe it [12]. Actually, with the rapid development of science and technology, the structures of modern mechanical and aerospace systems have become increasingly complex, with more and more subsystems and functions. The components are interconnected and the subsystems are coupled with each other, which makes a simple linear SISO model far from enough to meet current needs. Therefore, it is urgent to solve the control and estimation problem for MIMO systems with parameter drift.
Presently, numerous works have contributed to research on MIMO systems, such as approximation-based control of nonlinear MIMO systems [13], optimal adaptive tracking control [14], stable adaptive control of nonlinear multivariable systems [15], model predictive control [16], adaptive certainty equivalence control [17,18] and so on. For stochastic systems with parameter and noise uncertainties, the idea of dual control was first proposed by the Soviet researcher Feldbaum in the early 1960s. He and Bar-Shalom (a founder of information fusion) noticed this kind of problem and pointed out in their series of papers that, except for a few ideal cases, the optimal control of such systems generally pursues two conflicting goals. On the one hand, the controller needs to optimize the system; in order to achieve good control effects, the control amount should not be too large, which is the cautious role of the controller. On the other hand, the controller also needs to learn about the uncertainties in the parameters and system states; in order to obtain better estimates, it is hoped that the control amount will be as large as possible to excite rich information in the system [19,20], which is the probing function of the controller. A controller design theory that combines caution and probing is a challenge to the existing LQG theory; this is the so-called dual adaptive control problem of systems with double uncertainty. These two actions conflict with each other and cannot be carried out separately, making it difficult to obtain optimal control solutions in most cases. There have been some research results on this problem, such as adaptive dual control [21,22], robust learning control [23], innovation dual control [24], minimum-variance and cumulant control under complete statistical characterization [25], LQG nominal dual control [26,27] and so on. The above approaches are designed on the basis of the state space model for stochastic systems with unknown parameters, without emphasizing the correlation and coupling between internal and external variables of the system. In [22], a suboptimal dual control strategy is proposed for stochastic systems with parameter uncertainties. However, the designed controller contains two parts: one for control and the other for learning. The learning part adds the trace of the covariance matrix to the control law in a specific way, resulting in a suboptimal control strategy. Therefore, these methods cannot be widely applied. Because the separation principle between the filter and the controller does not hold and the controller and observer are mutually coupled in the MIMO stochastic system, it is a significant challenge to obtain the analytic solution of the controller.
Motivated by the above discussion, for the MIMO system with parameter drift, we design a novel optimal dual control law by viewing the drift parameters as parameters that vary with time over a finite horizon instead of drifting off to infinity. The designed control can track the drift parameters on the one hand and drive the system towards the desired target on the other. The innovations of this paper can be summarized as follows. 1. This paper designs an optimal control law for systems with drift parameters, which replaces the many suboptimal control laws composed of separate control and learning parts, while reducing computational complexity. 2. Due to the drift parameters in the system, the separation principle does not hold. In this paper, the multi-step optimization is transformed into a one-step optimization, and the optimal goal at each moment is achieved online while estimating, ultimately achieving optimal control. 3. Most practical control systems are MIMO systems, and the model in this paper is closer to the actual system. At the same time, the method can also solve the optimal control problem of linearized systems with this type of structure, which gives it a very broad prospect for practical application.
The remainder of the paper is organized as follows. In Section 2, the problem to be solved is presented. In Section 3, we transform the system model and the drift parameters and then estimate and track them using a Kalman filter. A novel dual control law with a learning property is designed for the control problem in Section 4. In Section 5, two numerical simulation examples are used to verify the effectiveness of the proposed algorithm, and the conclusion is presented in Section 6.

2. Problem Statement

Consider a general multi-input multi-output stochastic system model:
$$
\begin{aligned}
z_1(t+1) &= g_{11}(t)z_1(t) + g_{12}(t)z_2(t) + \cdots + g_{1m}(t)z_m(t) + h_{11}(t)u_1(t) + h_{12}(t)u_2(t) + \cdots + h_{1r}(t)u_r(t) + \varepsilon_1(t) \\
z_2(t+1) &= g_{21}(t)z_1(t) + g_{22}(t)z_2(t) + \cdots + g_{2m}(t)z_m(t) + h_{21}(t)u_1(t) + h_{22}(t)u_2(t) + \cdots + h_{2r}(t)u_r(t) + \varepsilon_2(t) \\
&\;\;\vdots \\
z_m(t+1) &= g_{m1}(t)z_1(t) + g_{m2}(t)z_2(t) + \cdots + g_{mm}(t)z_m(t) + h_{m1}(t)u_1(t) + h_{m2}(t)u_2(t) + \cdots + h_{mr}(t)u_r(t) + \varepsilon_m(t)
\end{aligned}
\tag{1}
$$
where $u_j(t)$, $j = 1, 2, \ldots, r$, are the $r$-dimensional control inputs, $z_i(t)$, $i = 1, 2, \ldots, m$, are the $m$-dimensional control outputs, $g_{1i}(t), \ldots, g_{mi}(t)$, $h_{1j}(t), \ldots, h_{mj}(t)$, $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, r$, represent the drift parameters reflecting the physical characteristics of the actual system, and $\varepsilon_1(t), \varepsilon_2(t), \ldots, \varepsilon_m(t)$ are stochastic perturbations acting on the system. In general, they are viewed as mutually independent Gaussian white noises with normal distribution, expressed as $\varepsilon_i \sim N(0, \sigma_i^2)$, $i = 1, 2, \ldots, m$.
For the sake of designing controller, the system (1) can be described by the following r-dimension input and m-dimension output stochastic system model:
$$ z(t+1) = G(t)z(t) + H(t)u(t) + \varepsilon(t) \tag{2} $$
where $z(t) = [z_1(t), z_2(t), \ldots, z_m(t)]^T$ is the output vector composed of the output components $z_i(t)$, $i = 1, 2, \ldots, m$, and $T$ denotes the transpose of a vector. Similarly, $u(t) = [u_1(t), u_2(t), \ldots, u_r(t)]^T$, and the stochastic perturbation is $\varepsilon(t) = [\varepsilon_1(t), \varepsilon_2(t), \ldots, \varepsilon_m(t)]^T$. Since each component is Gaussian white noise, $\varepsilon(t) \sim N(0, \Sigma_\varepsilon)$ with $\Sigma_\varepsilon = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \ldots, \sigma_m^2)$, where $\mathrm{diag}(\cdot)$ denotes a diagonal matrix. For convenience of writing, $a$ is used instead of $a(t)$ in this paper to represent a time-varying parameter with drift. Therefore, the parameters of the system can be rewritten as
$$
G = \begin{bmatrix} g_{11} & g_{12} & \cdots & g_{1m} \\ g_{21} & g_{22} & \cdots & g_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ g_{m1} & g_{m2} & \cdots & g_{mm} \end{bmatrix}, \qquad
H = \begin{bmatrix} h_{11} & h_{12} & \cdots & h_{1r} \\ h_{21} & h_{22} & \cdots & h_{2r} \\ \vdots & \vdots & \ddots & \vdots \\ h_{m1} & h_{m2} & \cdots & h_{mr} \end{bmatrix}
$$
It is worth noting that rewritten model (2) and original model (1) are interchangeable and have the same structure.
Next, the performance index is expressed in linear quadratic form as follows:
$$ J \triangleq E\left\{\sum_{t=0}^{N-1} [z(t+1) - z_r(t+1)]^T Q\, [z(t+1) - z_r(t+1)]\right\} \tag{3} $$
where $z_r(t+1)$, $t = 0, 1, \ldots, N-1$, is the desired output trajectory and $Q = \mathrm{diag}[q_1, q_2, \ldots, q_m]$ is a positive semidefinite diagonal weighting matrix. $E\{\cdot\}$ denotes the mathematical expectation of the error between the actual output and the desired output, conditioned on the information set collected in the past.
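As a concrete reading of the index (3), the following minimal sketch (our own helper, not from the paper; the names are assumptions) evaluates one stage of the quadratic tracking cost; the full index sums this quantity over $t = 0, 1, \ldots, N-1$ and takes the expectation over the noise.

```python
import numpy as np

def stage_cost(z_next, z_ref, Q):
    """One stage of the quadratic tracking index (3): [z - z_r]^T Q [z - z_r]."""
    e = z_next - z_ref
    return float(e @ Q @ e)

# Example with m = 2 outputs and unit weights
Q = np.eye(2)
print(stage_cost(np.array([1.0, -0.5]), np.zeros(2), Q))  # 1.25
```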
For the above model (2), the goal of our research is to find an optimal control that minimizes the output deviation of the system in a statistical sense while effectively dealing with the drift parameters in the system. The control problem to be solved in this paper can be described as the following optimization problem:
$$ (\mathrm{P})\quad \min_{u}\; J \quad \text{s.t.}\quad z(t+1) = G(t)z(t) + H(t)u(t) + \varepsilon(t), \quad t = 0, 1, 2, \ldots, N-1 $$
For the above optimization problem (P), when the parameters representing the characteristics of the system are all known, they can be treated as given constants with no drift at any stage; in that case the problem is solved by the traditional minimum-variance approach, which is quite mature. When the parameters are drifting, however, the problem is new and the existing approaches do not apply. The work of this paper is to derive a control law for the above dual control problem so that the derived control law can effectively predict or track the drift parameters while driving the system towards the desired target.

3. Parameter Prediction

In problem (P), the parameter drift makes it difficult to determine the control law. Therefore, the parameter estimation problem is addressed first. The model (2) is transformed into the following model:
$$ z(t+1) = \Phi(t)\Theta(t) + \varepsilon(t) \tag{4} $$
where $z(t+1)$ is the output vector at time $t+1$, $\Phi(t) = \mathrm{diag}[\phi(t), \phi(t), \ldots, \phi(t)]$, and $\phi(t) = [z^T(t), u^T(t)]$ is a new system vector containing the output components and the control components. $\Theta(t)$ is the parameter vector consisting of all the drift parameters, denoted by $\Theta(t) = [\theta_1^T(t), \theta_2^T(t), \ldots, \theta_m^T(t)]^T$ with $\theta_i(t) = [g_{i1}, g_{i2}, \ldots, g_{im}, h_{i1}, h_{i2}, \ldots, h_{ir}]^T$, $i = 1, 2, \ldots, m$.
In the above model (4), the parameter vector $\Theta$ is assumed to follow a Gaussian distribution with initial mean $\hat{\Theta}(0)$ and initial covariance matrix $P(0)$. The disturbance noise is assumed to be independent of the parameter vector.
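The reparameterisation (4) simply stacks the rows of $G$ and $H$. The sketch below (helper names such as build_phi and build_theta are ours, not from the paper) shows one way to construct $\Phi(t)$ and $\Theta(t)$ with NumPy and to check that $\Phi(t)\Theta(t)$ reproduces $G(t)z(t) + H(t)u(t)$.

```python
import numpy as np

def build_phi(z, u):
    """Regressor phi(t) = [z^T(t), u^T(t)] and block-diagonal Phi(t) of model (4)."""
    phi = np.concatenate([z, u])               # shape (m + r,)
    m = z.shape[0]
    Phi = np.kron(np.eye(m), phi[None, :])     # shape (m, m*(m + r))
    return phi, Phi

def build_theta(G, H):
    """Stack theta_i = [g_i1, ..., g_im, h_i1, ..., h_ir] for i = 1, ..., m."""
    return np.concatenate([np.concatenate([G[i], H[i]]) for i in range(G.shape[0])])

# Sanity check: Phi(t) Theta(t) equals G z + H u
m, r = 2, 2
rng = np.random.default_rng(0)
G, H = rng.standard_normal((m, m)), rng.standard_normal((m, r))
z, u = rng.standard_normal(m), rng.standard_normal(r)
phi, Phi = build_phi(z, u)
assert np.allclose(Phi @ build_theta(G, H), G @ z + H @ u)
```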
In fact, whether in actual aerospace systems, high-speed train operation systems, or large building structures, bridges and ship systems, some components exhibit failure, wear and aging, which are reflected in changes in the system parameters. At the same time, there are also systems whose parameters are ideal values obtained from trials; due to the complexity of the working environment, the one-to-one correspondence between the real parameters and the ideal values does not hold, or the parameters inevitably drift due to production errors, external time stresses and the physical properties of materials. The drift is a dynamic process affected by external noise and the environment. A dynamic model of the drift parameters is therefore established:
$$ \Theta(t+1) = \Gamma(t)\Theta(t) + \xi(t) \tag{5} $$
where $\Gamma$ is the drift coefficient matrix reflecting the amplitude or changing trend of the parameter drift, and $\xi(t)$ is the process noise during the operation of the system. To use the Kalman filter, we assume that $\xi(t)$ is independent of the measurement noise $\varepsilon(t)$ and follows a normal distribution, namely $\xi \sim N(0, \Sigma_\xi)$.
When the drift coefficient is unknown but constant, Equation (5) shows that the drift parameters are stable during the operation of the system, which is an ideal condition; in this case, $\Gamma$ is the identity matrix. When the parameters fluctuate, $\Gamma$ is not the identity matrix. If $\|\Gamma\|_2$ is larger than 1, the values of the system parameters gradually increase over time. For example, in a motor speed control system, as the motor keeps running, its temperature rises over time and the resistance grows as the temperature increases. If $\|\Gamma\|_2$ is smaller than 1, the values of the system parameters gradually decrease over time. For example, as the key measuring device of an aerospace attitude control system, a gyroscope degrades due to the wear and creep caused by the harsh external environment and time stress. When the drifted gyroscope parameter value falls below a preset threshold, the drift coefficient can be adjusted appropriately by modifying its torque to eliminate errors as much as possible. However, because the wear and aging caused by time stress are irreversible, the modified drift parameters still show a decreasing trend; namely, the drift parameters decrease gradually. The above analysis covers the situations in which the parameters are time-varying and, at the same time, describes the tendency of the parametric variation in model (5). Therefore, the above model (5) is general.
Since past control and output information is needed when processing the drift parameters and solving for the optimal control, all information collected by the control law up to the sampling time $t$ is called the real-time information set, that is:
$$ z^t = \{ z_1(0), \ldots, z_m(0), u_1(0), \ldots, u_r(0), z_1(1), \ldots, z_m(1), u_1(1), \ldots, u_r(1), \ldots, z_1(t-1), \ldots, z_m(t-1), u_1(t-1), \ldots, u_r(t-1), z_1(t), \ldots, z_m(t) \} $$
If the initial time of the control system starts from $t = 1$, the initial information set is $z^1 = \{z_1(1), z_2(1), \ldots, z_m(1), u_1(0), u_2(0), \ldots, u_r(0)\}$, which is set up in advance before the system runs.
In the practical system, $z^t$ is known. The parameter estimate, estimation error and estimation covariance matrix are defined as:
$$ \hat{\Theta}(t|t) = E\{\Theta(t) \mid z^t\} \tag{6} $$
$$ \tilde{\Theta}(t|t) = \Theta(t) - \hat{\Theta}(t|t) \tag{7} $$
$$ P(t|t) = E\{\tilde{\Theta}(t|t)\,\tilde{\Theta}^T(t|t)\} \tag{8} $$
where E { · } represents mathematical expectation.
In the system described by (4) and (5), the evolutions of the conditional mean and covariance matrix are given by the standard Kalman filter equations
$$ \hat{\Theta}(t+1|t) = \Gamma\,\hat{\Theta}(t|t) \tag{9} $$
$$ \hat{\Theta}(t+1|t+1) = \hat{\Theta}(t+1|t) + F(t+1)\,e(t+1) \tag{10} $$
$$ e(t+1) = z(t+1) - \Phi(t)\,\hat{\Theta}(t+1|t) \tag{11} $$
$$ F(t+1) = P(t+1|t)\,\Phi^T(t)\left[\Phi(t)P(t+1|t)\Phi^T(t) + \Sigma_\varepsilon\right]^{-1} \tag{12} $$
$$ P(t+1|t) = \Gamma P(t|t)\Gamma^T + \Sigma_\xi \tag{13} $$
$$ P(t+1|t+1) = P(t+1|t) - F(t+1)\,\Phi(t)\,P(t+1|t) \tag{14} $$
where (11) is the new information about the system parameters contained in z ( t + 1 ) , i.e., the innovation sequence.
Equation (10) provides the estimate that tracks the drift parameters at the current sampling instant.
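A minimal NumPy sketch of one filter cycle is given below (function and variable names are ours, not from the paper); following the state-space pair (4) and (5), the process-noise covariance $\Sigma_\xi$ enters the prediction (13) and the measurement-noise covariance $\Sigma_\varepsilon$ enters the innovation covariance in (12).

```python
import numpy as np

def kf_parameter_step(theta_hat, P, Gamma, Phi, z_next, Sigma_xi, Sigma_eps):
    """One cycle of Equations (9)-(14): predict and update the drift-parameter estimate."""
    # Prediction, Eqs. (9) and (13)
    theta_pred = Gamma @ theta_hat
    P_pred = Gamma @ P @ Gamma.T + Sigma_xi
    # Innovation and gain, Eqs. (11) and (12)
    e = z_next - Phi @ theta_pred
    S = Phi @ P_pred @ Phi.T + Sigma_eps
    F = P_pred @ Phi.T @ np.linalg.inv(S)
    # Measurement update, Eqs. (10) and (14)
    theta_new = theta_pred + F @ e
    P_new = P_pred - F @ Phi @ P_pred
    return theta_new, P_new, e
```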

4. Main Results

In general, it is easy to obtain the optimal control sequence $u^*(t)$ by minimizing the performance index at each stage using dynamic programming. However, dynamic programming requires the system parameters to be known; otherwise, the recursion cannot be carried out. In this paper, because of the parameter drift and the uncertainties in the system, it is intractable to obtain the optimal control law by direct dynamic programming. Therefore, we design a novel MIMO dual control optimization algorithm. The designed control law not only tracks the drift parameters of the system but also drives the control system towards the desired target, i.e., it achieves a tradeoff between the control objective and the parameter estimation objective.
Different from [22], the part dealing with the drift parameters is integrated directly into the performance index through a weight factor. In this way, the control law obtained is guaranteed to be optimal, and the amount of calculation is greatly reduced, which is very convenient for application and popularization.
In order to realize the above two objectives, we need to simplify the initial problem. The principle of simplification is that the resulting controller has dual characteristics and admits an analytic solution. To obtain the analytic solution, the general approach is to convert the global optimum into a single-step optimum so that the parameter drift can be taken into account while the optimal control of the system is still obtained through a series of processing steps. Therefore, we transform the overall performance index into a single-step performance index; the new performance index is written as:
$$ J_t = E\left\{ [z(t+1) - z_r(t+1)]^T Q\,[z(t+1) - z_r(t+1)] - \beta\, e^T(t+1)\, Q\, e(t+1) \,\middle|\, z^t \right\}, \quad t = 0, 1, 2, \ldots, N-1 \tag{15} $$
where β is the learning weighting factor.
The first term of Equation (15) reflects the control ability of the controller and guarantees that the system tracks the reference signal in an optimal way. The second term endows the controller with learning ability, driving the predicted output of the model as close as possible to the practical output of the system. The sign of the second term is negative because of the mutual conflict between optimization and estimation. It can be seen that the control law determined by the performance index (15) has dual characteristics.
An appropriately determined weight coefficient $\beta$ ensures that the control law derived from the performance index (15) achieves the best tradeoff between optimization and estimation. In addition, we can obtain the analytic solution of the control law using (15); the specific derivation is as follows:
$$
\begin{aligned}
E\{e^T(t+1)Qe(t+1)\} &= E\{[z(t+1) - \Phi(t)\hat{\Theta}(t)]^T Q [z(t+1) - \Phi(t)\hat{\Theta}(t)]\} \\
&= E\{[\Phi(t)\Theta(t) + \varepsilon(t) - \Phi(t)\hat{\Theta}(t)]^T Q [\Phi(t)\Theta(t) + \varepsilon(t) - \Phi(t)\hat{\Theta}(t)]\} \\
&= E\{[\Phi(t)\tilde{\Theta}(t) + \varepsilon(t)]^T Q [\Phi(t)\tilde{\Theta}(t) + \varepsilon(t)]\} \\
&= E\{\tilde{\Theta}^T(t)\Phi^T(t)Q\Phi(t)\tilde{\Theta}(t)\} + \mathrm{tr}(Q\Sigma_\varepsilon) \\
&= \mathrm{tr}[\Phi^T(t)Q\Phi(t)P(t)] + \mathrm{tr}(Q\Sigma_\varepsilon)
\end{aligned}
\tag{16}
$$
where $\mathrm{tr}(\cdot)$ denotes the trace of a matrix. The cyclic property of the trace, $\mathrm{tr}(ABC) = \mathrm{tr}(BCA) = \mathrm{tr}(CAB)$, is used in Equation (16).
The following equation can be obtained by Equations (15) and (16):
$$ J_t = [\Phi(t)\hat{\Theta}(t) - z_r(t+1)]^T Q\,[\Phi(t)\hat{\Theta}(t) - z_r(t+1)] + (1-\beta)\left[\mathrm{tr}\left(\Phi^T(t)Q\Phi(t)P(t)\right) + \mathrm{tr}(Q\Sigma_\varepsilon)\right] \tag{17} $$
To deal with the optimal control problem, introduce the following partitionings of the observation vector Φ ( t ) , the estimated drift parameter vector θ ^ i ( t ) and covariance matrix P ( t ) :
$$ \hat{\theta}_i^T(t) = [\hat{g}_i^T, \hat{h}_i^T], \quad i = 1, 2, \ldots, m \tag{18} $$
$$ P(t) = \begin{bmatrix} P_{11}(t) & P_{12}(t) & \cdots & P_{1m}(t) \\ P_{21}(t) & P_{22}(t) & \cdots & P_{2m}(t) \\ \vdots & \vdots & \ddots & \vdots \\ P_{m1}(t) & P_{m2}(t) & \cdots & P_{mm}(t) \end{bmatrix} \tag{19} $$
where $\hat{g}_i = [\hat{g}_{i1}, \hat{g}_{i2}, \ldots, \hat{g}_{im}]^T$, $\hat{h}_i = [\hat{h}_{i1}, \hat{h}_{i2}, \ldots, \hat{h}_{ir}]^T$, and $P_{11}(t), P_{22}(t), \ldots, P_{mm}(t)$ are $(m+r)$-dimensional square matrices. Each $P_{ii}$, $i = 1, 2, \ldots, m$, is therefore further partitioned as
$$ P_{ii}(t) = \begin{bmatrix} P_{g_i}(t) & P_{gh_i}(t) \\ P_{gh_i}^T(t) & P_{h_i}(t) \end{bmatrix}, \quad i = 1, 2, \ldots, m \tag{20} $$
where $P_{g_i}(t)$ is an $m$-dimensional square matrix and $P_{h_i}(t)$ is an $r$-dimensional square matrix.
Combining Equations (18)–(20):
$$
\begin{aligned}
\mathrm{tr}[\Phi^T(t)Q\Phi(t)P(t)] &= \mathrm{tr}\left[\mathrm{diag}(\phi^T(t), \phi^T(t), \ldots, \phi^T(t))\, Q\, \mathrm{diag}(\phi(t), \phi(t), \ldots, \phi(t))\, P(t)\right] \\
&= \mathrm{tr}\left[\phi^T(t) q_1 \phi(t) P_{11}(t) + \phi^T(t) q_2 \phi(t) P_{22}(t) + \cdots + \phi^T(t) q_m \phi(t) P_{mm}(t)\right] \\
&= \sum_{i=1}^{m} \mathrm{tr}\left[\phi^T(t) q_i \phi(t) P_{ii}(t)\right] \\
&= \sum_{i=1}^{m} \mathrm{tr}\left[z(t) q_i z^T(t) P_{g_i}(t) + z(t) q_i u^T(t) P_{gh_i}^T(t) + u(t) q_i z^T(t) P_{gh_i}(t) + u(t) q_i u^T(t) P_{h_i}(t)\right]
\end{aligned}
\tag{21}
$$
Substituting Equations (18) and (19) into the first term of the performance index (17):
$$
\begin{aligned}
&[\Phi(t)\hat{\Theta}(t) - z_r(t+1)]^T Q\,[\Phi(t)\hat{\Theta}(t) - z_r(t+1)] \\
&\quad = \sum_{i=1}^{m}\left[\hat{g}_i^T z(t) q_i z^T(t)\hat{g}_i + \hat{g}_i^T z(t) q_i u^T(t)\hat{h}_i + \hat{h}_i^T u(t) q_i z^T(t)\hat{g}_i + \hat{h}_i^T u(t) q_i u^T(t)\hat{h}_i\right] \\
&\qquad - \sum_{i=1}^{m}\left[z_{r_i}^T(t+1) q_i\left(z^T(t)\hat{g}_i + u^T(t)\hat{h}_i\right)\right] - \sum_{i=1}^{m}\left[\left(\hat{g}_i^T z(t) + \hat{h}_i^T u(t)\right) q_i z_{r_i}(t+1)\right] + z_r^T(t+1) Q z_r(t+1)
\end{aligned}
\tag{22}
$$
Combining Equations (21) and (22), the target function is derived as follows:
$$
\begin{aligned}
J_t &= \sum_{i=1}^{m}\left[\hat{g}_i^T z(t) q_i z^T(t)\hat{g}_i + \hat{g}_i^T z(t) q_i u^T(t)\hat{h}_i + \hat{h}_i^T u(t) q_i z^T(t)\hat{g}_i + \hat{h}_i^T u(t) q_i u^T(t)\hat{h}_i\right] \\
&\quad - \sum_{i=1}^{m}\left[z_{r_i}^T(t+1) q_i\left(z^T(t)\hat{g}_i + u^T(t)\hat{h}_i\right)\right] - \sum_{i=1}^{m}\left[\left(\hat{g}_i^T z(t) + \hat{h}_i^T u(t)\right) q_i z_{r_i}(t+1)\right] + z_r^T(t+1) Q z_r(t+1) \\
&\quad + (1-\beta)\left\{\sum_{i=1}^{m}\mathrm{tr}\left[z(t) q_i z^T(t) P_{g_i}(t) + z(t) q_i u^T(t) P_{gh_i}^T(t) + u(t) q_i z^T(t) P_{gh_i}(t) + u(t) q_i u^T(t) P_{h_i}(t)\right] + \mathrm{tr}(Q\Sigma_\varepsilon)\right\}
\end{aligned}
\tag{23}
$$
In order for the controller to minimize the performance index, the control law is obtained from $\partial J_t / \partial u(t) = 0$, which yields
$$
u^*(t) = -\left[\sum_{i=1}^{m}\hat{h}_i q_i \hat{h}_i^T + (1-\beta)\sum_{i=1}^{m} q_i P_{h_i}(t)\right]^{-1}\left\{\sum_{i=1}^{m}\left[\hat{h}_i q_i \hat{g}_i^T z(t) - \hat{h}_i q_i z_{r_i}(t+1)\right] + (1-\beta)\sum_{i=1}^{m} q_i P_{gh_i}^T(t) z(t)\right\}
\tag{24}
$$
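For illustration, the sketch below evaluates the control law (24); the function and argument names are ours, and g_hat, h_hat, P_gh and P_h denote the partitioned quantities of (18)–(20). It is a sketch under these assumptions, not the authors' implementation.

```python
import numpy as np

def dual_control(g_hat, h_hat, P_gh, P_h, q, z, z_ref, beta):
    """Dual control law of Equation (24).

    g_hat, h_hat : lists of estimated vectors g_i (shape (m,)) and h_i (shape (r,))
    P_gh, P_h    : lists of covariance blocks P_ghi (m, r) and P_hi (r, r)
    q            : output weights q_i; z_ref : reference z_r(t+1); beta : learning weight
    """
    m, r = len(g_hat), h_hat[0].shape[0]
    A = np.zeros((r, r))   # matrix factor to be inverted
    b = np.zeros(r)        # vector factor inside the braces
    for i in range(m):
        A += q[i] * np.outer(h_hat[i], h_hat[i]) + (1.0 - beta) * q[i] * P_h[i]
        b += q[i] * h_hat[i] * (g_hat[i] @ z) - q[i] * h_hat[i] * z_ref[i]
        b += (1.0 - beta) * q[i] * (P_gh[i].T @ z)
    return -np.linalg.solve(A, b)
```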
Equation (24) is the optimal controller $u^*(t)$ for problem (P). It shows that the controller depends not only on the estimated parameters and the estimation covariance matrix but also on the value of $\beta$. If the system parameters are constant, the estimation covariance matrix is zero; i.e., the covariance-related terms in (24) vanish. The contribution of this paper is to endow the controller with a learning characteristic by adding the terms involving the estimation covariance matrix. The value of $\beta$ can be determined by the following property.
Theorem 1
(Property). For the optimal control $u^*(t)$ in Equation (24), there exists a constant $\Delta$ such that $0 < \beta < \Delta$.
Proof of Theorem 1.
In the practical system, the parameter learning ability is related to the covariance matrix and the innovation of the Kalman filter. The mathematical expectation of the squared innovation is given by (16). Combining Equations (18)–(20), Equation (23) can be rearranged as
$$
\begin{aligned}
J_t &= \sum_{i=1}^{m}\hat{h}_i^T u(t) q_i u^T(t)\hat{h}_i + (1-\beta)\sum_{i=1}^{m}\mathrm{tr}\left(u(t) q_i u^T(t) P_{h_i}(t)\right) \\
&\quad + \sum_{i=1}^{m}\left[\hat{g}_i^T z(t) q_i u^T(t)\hat{h}_i + \hat{h}_i^T u(t) q_i z^T(t)\hat{g}_i\right] - \sum_{i=1}^{m}\left[z_{r_i}^T(t+1) q_i\left(z^T(t)\hat{g}_i + u^T(t)\hat{h}_i\right)\right] \\
&\quad + (1-\beta)\sum_{i=1}^{m}\mathrm{tr}\left[z(t) q_i u^T(t) P_{gh_i}^T(t) + u(t) q_i z^T(t) P_{gh_i}(t)\right] - \sum_{i=1}^{m}\left[\left(\hat{g}_i^T z(t) + \hat{h}_i^T u(t)\right) q_i z_{r_i}(t+1)\right] \\
&\quad + \sum_{i=1}^{m}\hat{g}_i^T z(t) q_i z^T(t)\hat{g}_i + z_r^T(t+1) Q z_r(t+1) + (1-\beta)\left\{\sum_{i=1}^{m}\mathrm{tr}\left[z(t) q_i z^T(t) P_{g_i}(t)\right] + \mathrm{tr}(Q\Sigma_\varepsilon)\right\}
\end{aligned}
\tag{25}
$$
Calculate the trace of Equation (25):
$$ J_t = \left[\sum_{i=1}^{m} q_i \sum_{i=1}^{m}\hat{h}_i^T\hat{h}_i + (1-\beta)\sum_{i=1}^{m} q_i \sum_{i=1}^{m}\mathrm{tr}\,P_{h_i}(t)\right] u^T(t)u(t) + \zeta_1 + \zeta_2 \tag{26} $$
where $\zeta_1$ denotes the terms linear in $u(t)$ or $u^T(t)$ and $\zeta_2$ denotes the constant terms not containing $u(t)$ or $u^T(t)$. Obviously, in order for Equation (26) to have a minimum with respect to $u(t)$, the coefficient of the quadratic term should be greater than zero:
$$ \sum_{i=1}^{m} q_i \sum_{i=1}^{m}\hat{h}_i^T\hat{h}_i + (1-\beta)\sum_{i=1}^{m} q_i \sum_{i=1}^{m}\mathrm{tr}\,P_{h_i}(t) > 0 \tag{27} $$
namely:
$$ \beta < 1 + \frac{\sum_{i=1}^{m} q_i \sum_{i=1}^{m}\hat{h}_i^T\hat{h}_i}{\sum_{i=1}^{m} q_i \sum_{i=1}^{m}\mathrm{tr}\,P_{h_i}(t)} $$
Suppose $\Delta = 1 + \dfrac{\sum_{i=1}^{m} q_i \sum_{i=1}^{m}\hat{h}_i^T\hat{h}_i}{\sum_{i=1}^{m} q_i \sum_{i=1}^{m}\mathrm{tr}\,P_{h_i}(t)}$.
The controller estimates the drift parameters at each stage, and the $N-1$ steps are accumulated to control the system. Therefore, an upper bound $\Delta$ of $\beta$ valid for all of these inequalities can be obtained by taking the maximum of $\sum_{i=1}^{m}\mathrm{tr}\,P_{h_i}$ in Equation (27). □
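For reference, a short sketch of the bound $\Delta$ computed from the current estimates is shown below (the helper name is ours; it assumes the denominator $\sum_i q_i \sum_i \mathrm{tr}\,P_{h_i}$ is nonzero, i.e., the covariance blocks have not collapsed to zero). The learning weight may then be chosen anywhere in $(0, \Delta)$.

```python
import numpy as np

def beta_upper_bound(h_hat, P_h, q):
    """Upper bound Delta on the learning weight beta, following the proof of Theorem 1."""
    num = sum(q) * sum(h @ h for h in h_hat)
    den = sum(q) * sum(np.trace(P) for P in P_h)
    return 1.0 + num / den

# e.g. beta = 0.5 * beta_upper_bound(h_hat, P_h, q) keeps beta strictly inside (0, Delta)
```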

5. Numerical Experiments

The novel MIMO dual control optimization algorithm can be summarized from the above methods as follows (a code sketch of the closed loop is given after the steps):
Step 1: Initialize and set $t = 0$;
Step 2: Estimate the drift parameters $\Theta$ using the Kalman filter Equations (9)–(14);
Step 3: Calculate the optimal control $u^*(t)$ minimizing the performance index using Equation (24);
Step 4: Apply the control $u^*(t)$ to the system (4);
Step 5: If $t = N-1$, stop; otherwise, set $t = t + 1$ and go back to Step 2.
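To make the five steps concrete, the following sketch runs the closed loop for the structure of the examples below, reusing the helper functions sketched earlier (stage_cost, build_phi, kf_parameter_step, dual_control). The function name run_dual_control and the initialisation choices are ours; the sketch is illustrative and not the authors' implementation.

```python
import numpy as np

def run_dual_control(G_true, H_true, Gamma, N, Sigma_xi, Sigma_eps, q, beta, rng):
    """Closed-loop sketch of Steps 1-5 for an m-output, r-input system.

    G_true(t), H_true(t) return the true (possibly drifting) parameter matrices;
    Sigma_xi must be n x n (n = m*(m+r)) and Sigma_eps m x m.
    """
    m, r = G_true(0).shape[0], H_true(0).shape[1]
    n = m * (m + r)                                   # dimension of Theta
    z = rng.standard_normal(m)                        # initial output
    theta_hat, P = 0.1 * np.ones(n), np.eye(n)        # Step 1: initial estimate and covariance
    cost = 0.0
    for t in range(N):
        # Partition theta_hat and P into the blocks of Eqs. (18)-(20)
        g_hat = [theta_hat[i*(m+r): i*(m+r)+m] for i in range(m)]
        h_hat = [theta_hat[i*(m+r)+m: (i+1)*(m+r)] for i in range(m)]
        P_gh = [P[i*(m+r): i*(m+r)+m, i*(m+r)+m: (i+1)*(m+r)] for i in range(m)]
        P_h = [P[i*(m+r)+m: (i+1)*(m+r), i*(m+r)+m: (i+1)*(m+r)] for i in range(m)]
        u = dual_control(g_hat, h_hat, P_gh, P_h, q, z, np.zeros(m), beta)   # Step 3
        eps = rng.multivariate_normal(np.zeros(m), Sigma_eps)
        z_next = G_true(t) @ z + H_true(t) @ u + eps                          # Step 4
        _, Phi = build_phi(z, u)
        theta_hat, P, _ = kf_parameter_step(theta_hat, P, Gamma, Phi,
                                            z_next, Sigma_xi, Sigma_eps)      # Step 2
        cost += stage_cost(z_next, np.zeros(m), np.diag(q))
        z = z_next
    return cost
```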
Due to the randomness and unknowability of the system drift, and in order to better verify and compare the algorithm in this paper, we treat the drift parameters as time-varying functions. In actual industrial production, due to continuously changing operating conditions, the system structural parameters may not be consistent with the nominal or ideal values. Therefore, we first treat the drift parameters as unknown constants for estimation; for example, when an actuator is stuck, the system parameters drift to a fixed value. After that, the parameters are unknown and change with a certain trend over time, such as in gyroscopes with continuously deteriorating performance. Over the entire lifecycle of a gyroscope, once a certain operating time is exceeded, the performance of the gyroscope components degrades with a certain trend, which can be modeled as unknown time-varying parameters. This trend is currently being studied in the fields of life prediction and health management.
For these two cases, we use two numerical experiments to verify the effectiveness of the designed control law. This section presents the simulation results and the optimal performance for the two cases and analyzes the results. The system equations in all examples fit the MIMO structure of (1), with $r = 2$, $m = 2$.
The performance index:
$$ J = E\left\{[z(t+1) - z_r(t+1)]^T Q\,[z(t+1) - z_r(t+1)]\right\} $$
where $Q = I_2$, with $I_2$ denoting the 2nd-order identity matrix, and $z_r(t+1)$ is a 2-dimensional zero vector.
Since the drift parameters of a practical system with noise fluctuate randomly, the parameters vary at every moment, which makes it difficult to display the estimation ability of the controller designed in this paper. In addition, how the drift arises and the form it takes are beyond the scope of this paper; we only consider its impact on the system after it occurs. Next, let us consider the first case and use the following simulation example to verify the designed control law.
Example 1.
A simple example is given to illustrate the implementation of the MIMO control algorithm proposed in this paper. Consider a 2-dimensional input and 2-dimensional output system (2); the parameter values after the drift are set as follows:
$$ \theta_1 = [0.2,\ 1.8,\ 0.8,\ 0.7]^T; \quad \theta_2 = [0.6,\ 0.5,\ 0.2,\ 1.5]^T $$
In order for the Kalman filter to play its role well, a suitable initial value is needed for the prediction and update. Therefore, the initial estimates are set to $\hat{\theta}_1(0) = [0.1, 0.1, 0.1, 0.1]^T$, $\hat{\theta}_2(0) = [0.1, 0.1, 0.1, 0.1]^T$. The external disturbance and observation noise are Gaussian white noise with mean value 0 and variances $\Sigma_\xi$ and $\Sigma_\varepsilon$, respectively; specifically, $\Sigma_\xi = 0.02$, $\sigma_1^2 = \sigma_2^2 = 0.2$, $\Sigma_\varepsilon = \mathrm{diag}(\sigma_1^2, \sigma_2^2)$. In addition, the initial covariance matrix is set to $P(0) = I_8$, where $I_8$ denotes the 8th-order identity matrix.
The simulation results are shown in Figure 1 and Figure 2. In these two figures, the estimation processes of fixed drift parameters are shown. Since the stochastic system is considered in this paper, the estimation process is different every time it is run, but it will eventually reach the true value without exception. At the same time, through multiple runs, we found that, for each simulation, the true values of drift parameters can be accurately estimated before time t = 15 and remain stable, which proves the effectiveness of the algorithm in the above case.
The above are the simulation results for the first case. In order to make our method more convincing, we also conducted simulations for the second case, in which the unknown drift parameters are no longer constant but are time-varying functions.
Example 2.
Similarly, we consider a dynamic stochastic system with the same structure as (1); the time-varying drift parameters are expressed as follows:
$$
\begin{aligned}
a_{11} &= 20\sin\frac{2\pi t}{100}, & a_{12} &= 50\sin\frac{2\pi t}{80}, \\
a_{21} &= 20\sin\frac{2\pi t}{50}, & a_{22} &= 30\sin\frac{2\pi t}{30}, \\
b_{11} &= 200\sin\frac{2\pi t}{100}, & b_{12} &= 100\sin\frac{2\pi t}{80}, \\
b_{21} &= 100\sin\frac{2\pi t}{80}, & b_{22} &= 200\sin\frac{2\pi t}{100}
\end{aligned}
$$
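For reproducibility, these trajectories can be encoded directly, e.g. as in the sketch below (the function name drift_parameters is ours), returning the matrices $A(t) = [a_{ij}(t)]$ and $B(t) = [b_{ij}(t)]$ at step $t$; they can be passed as G_true and H_true to the closed-loop sketch above.

```python
import numpy as np

def drift_parameters(t):
    """Time-varying drift parameters of Example 2, as listed above."""
    A = np.array([[20 * np.sin(2 * np.pi * t / 100), 50 * np.sin(2 * np.pi * t / 80)],
                  [20 * np.sin(2 * np.pi * t / 50),  30 * np.sin(2 * np.pi * t / 30)]])
    B = np.array([[200 * np.sin(2 * np.pi * t / 100), 100 * np.sin(2 * np.pi * t / 80)],
                  [100 * np.sin(2 * np.pi * t / 80),  200 * np.sin(2 * np.pi * t / 100)]])
    return A, B
```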
For convenience, in this example, external disturbances and observation noise still use the variance value from Example 1. At the same time, the initial covariance matrix P ( 0 ) remains unchanged.
The simulation results are shown in Figure 3, Figure 4, Figure 5 and Figure 6.
Figure 3, Figure 4, Figure 5 and Figure 6 show the learning process of the parameters in the latter case. It can be seen from the figures that the learned parameters can follow the true parameters over time, although some errors exist, which is inevitable. This is because learning is a process, as shown in Figure 1 and Figure 2, and it takes about 15 time steps to learn the true value of each parameter. However, the parameters in this example change with time $t$ and it is impossible to learn the exact values at every moment. Nevertheless, the algorithm in this paper can estimate approximate parameters and follow the variation trend of the parameters. The simulation results show that the algorithm of this paper provides a feasible method for solving such problems.
Subsequently, the value of the performance index under the action of the optimal control law is calculated by Monte Carlo experiments. The values under the action of nominal control and pure control are also calculated. The comparison results are shown in Table 1 and Table 2.
As can be seen from Table 1 and Table 2, the value of the performance index differs under the three control laws. Among them, the performance index value of pure control is the largest, while that of dual control is the smallest. This indicates that the proposed algorithm is clearly better than the other two, which proves the effectiveness of the designed control law.
In this section, the effectiveness of the proposed optimal dual control strategy is verified by performing two different numerical experiments.

6. Conclusions

For stochastic systems involving parameter drift, disturbances and measurement noise, a novel MIMO dual control strategy is designed in this paper. The proposed control algorithm can not only estimate the drift parameters but also carry out the optimal control of the system. Due to the conflict between parameter prediction and the control target, a weighting factor is added to the parameter term to balance this conflict. The real-time information in the optimal controller designed in this paper is the set of output information and control information up to the previous moment. Because of the mutual coupling and complex nonlinear structure of MIMO systems, dual control based on deep learning will be our main research direction in the future.

Author Contributions

X.Z. and C.C. proposed the research idea and conceived the entire research program; X.Z. designed the framework of draft, wrote the original manuscript and implemented the simulation experiments. S.G. reviewed the paper and provided instructional support. C.C. participated in the revisions to the paper and provided funding support. J.H. participated in the verification of experiments’ results and the polishing of language. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported in part by National Key R&D Program of China (Grant No. 2022YFC3803700), in part by the Key R&D Project of the Ministry of Science and Technology of China (Grant No. 2022YFE0123400) and in part by the Natural Science Foundation of Shaanxi Province (Grant No. 2022JQ-667).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Maciej, N. Optimal and suboptimal smoothing algorithms for identification of time-varying systems with randomly drifting parameters. Automatica 2008, 44, 1718–1727.
  2. Nakamura, S. MEMS inertial sensor toward higher accuracy & multi-axis sensing. In Proceedings of the IEEE Conference on Sensors, Irvine, CA, USA, 31 October–3 November; pp. 939–942.
  3. Li, J.; Liu, J.; Zhang, W.D. MEMS based micro inertial measurement system. WSEAS Trans. Circuits Syst. 2006, 37, 691–696.
  4. Wen, Q.; Li, S. Enhanced parameterizable uncertainty to dual adaptive model predictive control. Control Theory Appl. 2019, 36, 1197–1206.
  5. Arcari, E.; Hewing, L.; Schlichting, M.; Zeilinger, M. Dual stochastic MPC for systems with parametric and structural uncertainty. Learn. Dyn. Control PMLR 2020, 120, 894–903.
  6. Ma, X.; Qian, F.; Zhang, S.; Wu, L. Adaptive quantile control for stochastic systems. ISA Trans. 2021, 123, 110–121.
  7. Ma, L.; Wang, Z.; Hu, J.; Han, Q. Probability-guaranteed envelope-constrained filtering for nonlinear systems subject to measurement outliers. IEEE Trans. Autom. Control 2021, 66, 3274–3281.
  8. Ma, X.; Qian, F.; Zhang, S.; Wu, L.; Liu, L. Adaptive dual control with online outlier detection for uncertain systems. ISA Trans. 2022, 129, 157–168.
  9. Jose, A.R.; Rodolfo, S. Stabilization of a class of linear time-varying systems via modeling error compensation. IEEE Trans. Autom. Control 2000, 45, 738–741.
  10. Good, R.; Qin, S.J. Stability analysis of double EWMA run-to-run control with metrology delay. In Proceedings of the American Control Conference, Anchorage, AK, USA, 8–10 May 2002; pp. 2156–2161.
  11. Yang, H.; Gao, S.; Qian, F.; Huang, J. A Suboptimal Dual Control Method for the Stochastic Systems with Parameters Drifting. Asian J. Control 2019, 21, 609–616.
  12. Wang, L.-Y.; Zhao, W.-X. System identification: New models, challenges and opportunities. J. Autom. 2013, 39, 933–942.
  13. Ge, S.S.; Keng, P.T. Approximation-based control of nonlinear MIMO time-delay systems. Automatica 2007, 43, 31–43.
  14. Xue, W.; Shaojie, Z.; Weifang, S. Optimal adaptive tracking control for a class of MIMO uncertain nonlinear systems with actuator failures. In Proceedings of the 36th Chinese Control Conference, Dalian, China, 26–28 July 2017; Volume 6, pp. 26–28.
  15. Ge, S.S.; Hang, C.C.; Zhang, T. Stable adaptive control for nonlinear multivariable systems with triangular control structure. IEEE Trans. Autom. Control 2000, 45, 1221–1225.
  16. Altan, A.; Hacıoğlu, R. Model predictive control of three-axis gimbal system mounted on UAV for real-time target tracking under external disturbances. Mech. Syst. Signal Process. 2020, 138, 106–548.
  17. Karafyllis, I.; Krstic, M. Adaptive certainty equivalence control with regulation triggered finite-time least squares identification. IEEE Trans. Autom. Control 2018, 63, 3261–3275.
  18. Li, C.; Ding, J.; Lewis, F.L.; Chai, T. A novel adaptive dynamic programming based on tracking error for nonlinear discrete-time systems. Automatica 2021, 129, 109–687.
  19. Tse, E.; Bar-Shalom, Y.; Meier, L. Wide-sense adaptive dual control for nonlinear stochastic systems. IEEE Trans. Autom. Control 1973, 18, 98–108.
  20. Feldbaum, A. Dual Control Theory, Parts I and II. Autom. Remote Control 1960, 21, 1033–1039.
  21. Filatov, N.M.; Unbehauen, H. Adaptive Dual Control Theory and Applications; Springer: Berlin/Heidelberg, Germany, 2004.
  22. Qian, F.; Zhang, X.; Liu, L.; Xie, G. Dual Control for Stochastic Linear MIMO Systems with Parameter Uncertainty. IEEE Access 2020, 8, 41860–41869.
  23. Huang, J.; Qian, F.; Xie, G.; Yang, H. Robust learning control for dynamic systems with mixed uncertainties. J. Syst. Eng. Electron. 2016, 27, 656–663.
  24. Milito, R.; Padilla, C.S.; Padilla, R.A.; Cadorin, D. An innovation approach to dual control. IEEE Trans. Autom. Control 1982, 27, 132–137.
  25. Qian, F.; Gao, J.; Li, D. Complete statistical characterization of discrete-time LQG and cumulant control. IEEE Trans. Autom. Control 2012, 57, 2110–2115.
  26. Li, D.; Qian, F.; Fu, P. Optimal nominal dual control for discrete-time LQG problem with unknown parameters. Automatica 2008, 44, 119–127.
  27. Wang, L.; Qian, F.; Liu, J. The PDF shape control of the state variable for a class of stochastic systems. Int. J. Syst. Sci. 2015, 46, 2231–2239.
Figure 1. The estimation process of parameters $a_{11}$, $a_{12}$, $a_{21}$ and $a_{22}$.
Figure 2. The estimation process of parameters $b_{11}$, $b_{12}$, $b_{21}$ and $b_{22}$.
Figure 3. The estimation process of parameters $a_{11}$ and $a_{12}$.
Figure 4. The estimation process of parameters $a_{21}$ and $a_{22}$.
Figure 5. The estimation process of parameters $b_{11}$ and $b_{12}$.
Figure 6. The estimation process of parameters $b_{21}$ and $b_{22}$.
Table 1. Comparison results of Example 1 under different control laws.

Nominal Control    Pure Control    Dual Control (Example 1)
369.112            491.224         289.377

Table 2. Comparison results of Example 2 under different control laws.

Nominal Control    Pure Control    Dual Control (Example 2)
1805.325           2156.771        1765.391