Separation principle in stochastic control

The separation principle is one of the fundamental principles of stochastic control theory, which states that the problems of optimal control and state estimation can be decoupled under certain conditions. In its most basic formulation it deals with a linear stochastic system

Choices of the class of admissible control laws
Linear-quadratic control problems are often solved by a completion-of-squares argument. In our present context we have
: J(u)=\operatorname{E}\left\{ \int_0^T(u-Kx)'R(u-Kx) \, dt\right\} +\text{terms that do not depend on }u,
in which the first term takes the form
: \operatorname{E}\left\{ \int_0^T(u-Kx)'R(u-Kx)\,dt\right\}=\operatorname{E}\left\{\int_0^T[(u-K\hat{x})'R(u-K\hat{x})+\operatorname{tr}(K'RK\Sigma)] \, dt\right\},
where \Sigma is the covariance matrix
: \Sigma(t):=\operatorname{E}\{[x(t)-\hat{x}(t)][x(t)-\hat{x}(t)]'\}.
This identity follows by writing x=\hat{x}+(x-\hat{x}) and noting that the estimation error x(t)-\hat{x}(t) is uncorrelated with every {\cal Y}_t-measurable random variable, so the cross terms vanish. The separation principle would now follow immediately if \Sigma were independent of the control. However, this needs to be established.

The state equation can be integrated to take the form
: x(t)=x_0(t)+\int_0^t \Phi(t,s)B_1(s)u(s) \, ds,
where x_0 is the state process obtained by setting u=0 and \Phi is the transition matrix function. By linearity, \hat{x}(t)=\operatorname{E}\{x(t)\mid {\cal Y}_t\} equals
: \hat{x}(t)=\hat{x}_0(t)+\int_0^t \Phi(t,s)B_1(s)u(s)\,ds,
where \hat{x}_0(t)=\operatorname{E}\{x_0(t)\mid {\cal Y}_t\}. Consequently,
: \Sigma(t)=\operatorname{E}\{[x_0(t)-\hat{x}_0(t)][x_0(t)-\hat{x}_0(t)]'\},
but we need to establish that \hat{x}_0 does not depend on the control. This would be the case if
: {\cal Y}_t ={\cal Y}_t^0:=\sigma\{ y_0(\tau), \tau\in [0,t]\}, \quad 0\leq t\leq T,
where y_0 is the output process obtained by setting u=0. This issue was discussed in detail by Lindquist (1973); see also van Handel and Willems.

The linear class
: ({\mathcal L})\quad u(t)=\bar{u}(t)+\int_0^tF(t,\tau)\,dy(\tau),
where \bar{u} is a deterministic function and F is an L_2 kernel, ensures that \Sigma is independent of the control. However, the proof is far from simple and there are many technical assumptions. For example, C(t) must be square and have a determinant bounded away from zero, which is a serious restriction.
A later proof by Fleming and Rishel is considerably simpler. They also prove the separation theorem with quadratic cost functional J(u) for a class of Lipschitz continuous feedback laws, namely u(t)=\phi(t,y), where \phi:\, [0,T]\times C^n [0,T]\to{\mathbb R}^m is a non-anticipatory function of y which is Lipschitz continuous in this argument. Kushner proposed a more restricted class u(t)=\psi(t,\hat{\xi}(t)), where the modified state process \hat{\xi} is given by
: \hat{\xi}(t)=\operatorname{E}\{ x_0(t)\mid {\mathcal Y}_t^0\}+ \int_0^t \Phi(t,s)B_1(s)u(s)\,ds,
leading to the identity \hat{x}=\hat{\xi}.
Imposing delay
If there is a delay in the processing of the observed data so that, for each t, u(t) is a function of y(\tau); \, 0\leq\tau\leq t-\varepsilon, then {\cal Y}_t ={\cal Y}_t^0, 0\leq t\leq T; see Example 3 in Georgiou and Lindquist.

It has been reported (see Davis and Varaiya, and also Section 2.4 in Bensoussan) that the state process is conditionally Gaussian given the filtration \{{\mathcal Y}_t\}. This fact can be used to show that \hat{x} is actually generated by a Kalman filter (see Chapters 11 and 12 in Liptser and Shiryaev).
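As a numerical illustration of the decomposition \hat{x}=\hat{x}_0+\int\Phi B_1u\,ds used in this section, the following minimal Python sketch runs a scalar discrete-time Kalman filter under two different deterministic control sequences driven by the same noise. The discrete-time model, the function name estimation_errors and all parameter values are assumptions chosen for illustration and are not taken from the cited references. The sketch only covers controls that do not depend on the observations, so it does not address whether feedback changes the filtration.

```python
import numpy as np


def estimation_errors(u, w, v, a=0.9, b=1.0, c=1.0, q=1.0, r=1.0, p0=1.0, x0=0.5):
    """Return x[k] - xhat[k] for the assumed scalar model
    x[k+1] = a x[k] + b u[k] + w[k],  y[k] = c x[k] + v[k],
    where w has variance q, v has variance r and u is a known control."""
    x, xhat, p = x0, 0.0, p0
    errs = []
    for k in range(len(u)):
        y = c * x + v[k]
        gain = p * c / (c * c * p + r)       # Kalman gain
        xhat = xhat + gain * (y - c * xhat)  # measurement update
        p_post = (1.0 - gain * c) * p        # posterior error variance
        errs.append(x - xhat)
        x = a * x + b * u[k] + w[k]          # true state transition
        xhat = a * xhat + b * u[k]           # the known control shifts the estimate too
        p = a * a * p_post + q               # variance recursion contains no u
    return np.array(errs)


rng = np.random.default_rng(0)
n = 200
w = rng.normal(size=n)
v = rng.normal(size=n)

err_zero = estimation_errors(np.zeros(n), w, v)
err_sine = estimation_errors(5.0 * np.sin(0.1 * np.arange(n)), w, v)

# Largest difference between the two error trajectories
print(np.max(np.abs(err_zero - err_sine)))
```

Because the known control enters the state and the estimate in the same way, the two error trajectories should agree up to floating-point rounding, which is the discrete-time counterpart of \Sigma being independent of the control.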
Issues on feedback in linear stochastic systems
At this point it is suitable to consider a more general class of controlled linear stochastic systems that also covers systems with time delays, namely
:\begin{align} z(t) & =z_0(t) + \int_0^t G(t,s)u(s)\,ds \\ y(t) & = Hz(t) \end{align}
with z_0 a stochastic vector process which does not depend on the control. (See also pages 126–128 in Klebaner.) Also, under fairly general conditions (see e.g., Chapter V in Protter), stochastic differential equations driven by martingales with sample paths in D have strong solutions that are semimartingales.

Setting f(z):=g\pi Hz, the feedback system z=z_0+g\pi Hz can be written z=z_0+f(z), where z_0 can be interpreted as an input.

Definition. A feedback loop z=z_0+f(z) is deterministically well-posed if it has a unique solution z\in D for all inputs z_0\in D and (1-f)^{-1} is a system.

This implies that the processes z and z_0 define identical filtrations. Consequently, no new information is created by the loop. However, what we need is that {\cal Y}_t ={\cal Y}_t^0 for 0\leq t\leq T. This is ensured by the following lemma (Lemma 8 in Georgiou and Lindquist).

Key Lemma. If the feedback loop z=z_0+g\pi Hz is deterministically well-posed, g\pi is a system, and H is a linear system having a right inverse H^{-R} that is also a system, then (1-Hg\pi)^{-1} is a system and {\cal Y}_t ={\cal Y}_t^0 for 0\leq t\leq T.

The condition on H in this lemma is clearly satisfied in the standard linear stochastic system, for which H=[0,I], and hence H^{-R}=H'. The remaining conditions are collected in the following definition.

Definition. A feedback law \pi is deterministically well-posed for the system z=z_0+g\pi Hz if g\pi is a system and the feedback system z=z_0+g\pi Hz is deterministically well-posed.

Examples of simple systems that are not deterministically well-posed are given in Remark 12 in Georgiou and Lindquist.
A separation principle for physically realizable control laws
When only deterministically well-posed feedback laws are considered, all admissible control laws are physically realizable in the engineering sense that they induce a signal that travels through the feedback loop. The proof of the following theorem can be found in Georgiou and Lindquist 2013.

Separation theorem. Given the linear stochastic system
: \begin{align} dx & =A(t)x(t)\,dt+B_1(t)u(t)\,dt+B_2(t)\,dw \\ dy & =C(t)x(t)\,dt +D(t)\,dw \end{align}
where w is a vector-valued Wiener process and x(0) is a zero-mean Gaussian random vector independent of w, consider the problem of minimizing the quadratic functional J(u) over the class of all deterministically well-posed feedback laws \pi. Then the unique optimal control law is given by u(t)=K(t)\hat{x}(t), where K is defined as above and \hat{x} is given by the Kalman filter. More generally, if w is a square-integrable martingale and x(0) is an arbitrary zero-mean random vector, then u(t)=K(t)\hat{x}(t), where \hat{x}(t)=\operatorname{E}\{x(t)\mid {\cal Y}_t\}, is the optimal control law provided it is deterministically well-posed.

In the general non-Gaussian case, which may involve counting processes, the Kalman filter needs to be replaced by a nonlinear filter.
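To make the decoupling in the Gaussian case concrete, the following Python sketch computes the control gain K and the Kalman filter gain from two Riccati differential equations that are integrated independently of each other. It is a sketch under stated assumptions rather than the construction used in the cited proof: the cost is assumed to have the standard form J(u)=\operatorname{E}\{\int_0^T(x'Qx+u'Ru)\,dt+x(T)'Sx(T)\}, the weights Q, R, S, the function names control_gain and filter_gain, the plain Euler integration and the scalar test values are all introduced here for illustration.

```python
import numpy as np


def control_gain(A, B1, Q, R, S, T, n_steps):
    """Euler-integrate -dP/dt = A'P + PA - P B1 R^{-1} B1' P + Q backward
    from P(T) = S and return K(t_k) = -R^{-1} B1' P(t_k) on a uniform grid.
    Q, R, S are assumed cost weights; constant matrices are assumed."""
    dt = T / n_steps
    Rinv = np.linalg.inv(R)
    P = S.copy()
    gains = [None] * (n_steps + 1)
    gains[n_steps] = -Rinv @ B1.T @ P
    for k in range(n_steps, 0, -1):
        rhs = A.T @ P + P @ A - P @ B1 @ Rinv @ B1.T @ P + Q
        P = P + dt * rhs  # one Euler step backward in time
        gains[k - 1] = -Rinv @ B1.T @ P
    return np.array(gains)


def filter_gain(A, B2, C, D, Sigma0, T, n_steps):
    """Euler-integrate the filter Riccati equation forward from Sigma(0) = Sigma0
    and return the Kalman gain L = (Sigma C' + B2 D') (D D')^{-1} on the grid."""
    dt = T / n_steps
    DDt = D @ D.T
    DDinv = np.linalg.inv(DDt)
    Sigma = Sigma0.copy()
    gains = []
    for _ in range(n_steps + 1):
        L = (Sigma @ C.T + B2 @ D.T) @ DDinv
        gains.append(L)
        rhs = A @ Sigma + Sigma @ A.T + B2 @ B2.T - L @ DDt @ L.T
        Sigma = Sigma + dt * rhs  # one Euler step forward in time
    return np.array(gains)


# Scalar illustration: dx = u dt + dw_1, dy = x dt + dw_2 (assumed test values)
A = np.zeros((1, 1))
B1 = np.eye(1)
B2 = np.array([[1.0, 0.0]])
C = np.eye(1)
D = np.array([[0.0, 1.0]])
T, n_steps = 2.0, 2000

K = control_gain(A, B1, np.eye(1), np.eye(1), np.zeros((1, 1)), T, n_steps)
L = filter_gain(A, B2, C, D, np.zeros((1, 1)), T, n_steps)
print(K[0, 0, 0], L[-1, 0, 0])
```

Neither function takes the other's output as an argument: the control gain depends only on A, B_1 and the cost weights, while the filter gain depends only on A, B_2, C, D and the initial covariance. In this scalar example both Riccati equations reduce to \dot{p}=1-p^2 in the appropriate time direction, so K(0) and the filter gain at T can be compared with -\tanh(2) and \tanh(2), up to the Euler discretization error.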
A separation principle for delay-differential systems
Stochastic control for time-delay systems was first studied by Lindquist and by Brooks, although Brooks relies on the strong assumption that the observation y is functionally independent of the control u, thus avoiding the key question of feedback. Consider the delay-differential system

Theorem. There is a unique feedback law \pi:\, y\mapsto u in the class of deterministically well-posed control laws that minimizes J(u), and it is given by
: u(t)=\int_{t-h}^t d_s \, K(t,s)\hat{x}(s\mid t),
where K is the deterministic control gain and \hat{x}(s\mid t) := \operatorname{E}\{ x(s)\mid {\cal Y}_t\} is given by the linear (distributed) filter
: d\hat{x}(t\mid t) =\int_{t-h}^t d_s \, A(t,s)\hat{x}(s\mid t) \, dt +B_1u\,dt+ X(t,t)\,dv,
where v is the innovation process
: dv=dy - \int_{t-h}^t d_sC(t,s)\hat{x}(s\mid t)\, dt, \quad v(0)=0,
and the gain X is as defined on page 120 in Lindquist.