Separation principle in stochastic control

Linear-quadratic control problems are often solved by a completion-of-squares argument. In the present context we have
: J(u)=\operatorname{E}\left\{ \int_0^T(u-Kx)'R(u-Kx) \, dt\right\} +\text{terms that do not depend on }u,
in which the first term takes the form
: \operatorname{E}\left\{ \int_0^T(u-Kx)'R(u-Kx)\,dt\right\}=\operatorname{E}\left\{\int_0^T[(u-K\hat{x})'R(u-K\hat{x})+\operatorname{tr}(K'RK\Sigma)] \, dt\right\},
where \Sigma is the covariance matrix
: \Sigma(t):=\operatorname{E}\{[x(t)-\hat{x}(t)][x(t)-\hat{x}(t)]'\}.
The separation principle would follow immediately if \Sigma were independent of the control. However, this needs to be established.

The state equation can be integrated to take the form
: x(t)=x_0(t)+\int_0^t \Phi(t,s)B_1(s)u(s) \, ds,
where x_0 is the state process obtained by setting u=0 and \Phi is the transition matrix function. By linearity, and because the control term is {\cal Y}_t-measurable, \hat{x}(t)=\operatorname{E}\{x(t)\mid {\cal Y}_t\} equals
: \hat{x}(t)=\hat{x}_0(t)+\int_0^t \Phi(t,s)B_1(s)u(s)\,ds,
where \hat{x}_0(t)=\operatorname{E}\{x_0(t)\mid {\cal Y}_t\}. The control terms cancel in x-\hat{x}, and consequently
: \Sigma(t)=\operatorname{E}\{[x_0(t)-\hat{x}_0(t)][x_0(t)-\hat{x}_0(t)]'\},
but we still need to establish that \hat{x}_0 does not depend on the control. This would be the case if
: {\cal Y}_t ={\cal Y}_t^0:=\sigma\{ y_0(\tau), \tau\in [0,t]\}, \quad 0\leq t\leq T,
where y_0 is the output process obtained by setting u=0.

This issue was discussed in detail by Lindquist (1973); see also van Handel and Willems. More generally, the linear class
: ({\mathcal L})\quad u(t)=\bar{u}(t)+\int_0^tF(t,\tau)\,dy,
where \bar{u} is a deterministic function and F is an L_2 kernel, ensures that \Sigma is independent of the control. However, the proof is far from simple and there are many technical assumptions. For example, C(t) must be square and have a determinant bounded away from zero, which is a serious restriction.
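The claim that the estimation error x-\hat{x} is unaffected by the control can be illustrated numerically. The following is a minimal sketch in a discrete-time analogue of the problem, not the continuous-time setting of the article. The system matrices A, B, C, the noise levels, and the feedback gain are hypothetical values chosen only for illustration. The same noise sample path is used for the controlled and the uncontrolled run, so the two error sequences can be compared directly.

```python
import numpy as np


def simulate(feedback_gain, steps=50, seed=0):
    """Simulate x[k+1] = A x[k] + B u[k] + w[k], y[k] = C x[k] + v[k]
    with a Kalman filter and the control u[k] = -feedback_gain @ xhat[k].

    Returns the estimation errors x[k] - xhat[k] along one sample path.
    All numerical values below are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    A = np.array([[1.0, 0.1], [0.0, 1.0]])
    B = np.array([[0.0], [0.1]])
    C = np.array([[1.0, 0.0]])
    q_std, r_std = 0.1, 0.3            # process / measurement noise std
    Q = q_std**2 * np.eye(2)
    Rv = np.array([[r_std**2]])

    P = np.eye(2)                      # prior covariance of the initial state
    x = rng.standard_normal(2)         # initial state drawn from N(0, I)
    xhat = np.zeros(2)                 # prior mean
    errors = []
    for _ in range(steps):
        # Noisy measurement of the current state.
        y = C @ x + r_std * rng.standard_normal(1)

        # Measurement update of the Kalman filter.
        S = C @ P @ C.T + Rv
        L = P @ C.T @ np.linalg.inv(S)
        xhat = xhat + L @ (y - C @ xhat)
        P = P - L @ C @ P
        errors.append(x - xhat)

        # Control computed from the current estimate only.
        u = -feedback_gain @ xhat

        # Time update: the same known term B @ u enters state and estimate.
        x = A @ x + B @ u + q_std * rng.standard_normal(2)
        xhat = A @ xhat + B @ u
        P = A @ P @ A.T + Q
    return np.array(errors)


# Same seed, hence the same noise path, with and without feedback.
err_feedback = simulate(np.array([[1.0, 1.5]]))
err_open_loop = simulate(np.zeros((1, 2)))
print(np.max(np.abs(err_feedback - err_open_loop)))
```

Because the known term B u is added to both the state and its estimate, it cancels in the error, which mirrors the cancellation of the integral term in x-\hat{x} above. The printed difference should be at the level of floating-point rounding. The sketch does not address the continuous-time question of whether {\cal Y}_t={\cal Y}_t^0.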
A later proof by Fleming and Rishel is considerably simpler. They also prove the separation theorem with quadratic cost functional J(u) for a class of Lipschitz continuous feedback laws, namely u(t)=\phi(t,y), where \phi:\, [0,T]\times C^n [0,T]\to{\mathbb R}^m is a non-anticipatory function of y which is Lipschitz continuous in this argument. Kushner proposed a more restricted class u(t)=\psi(t,\hat{\xi}(t)), where the modified state process \hat{\xi} is given by
: \hat{\xi}(t)=\operatorname{E}\{ x_0(t)\mid {\mathcal Y}_t^0\}+ \int_0^t \Phi(t,s)B_1(s)u(s)\,ds,
leading to the identity \hat{x}=\hat{\xi}.

===Imposing delay===
If there is a delay in the processing of the observed data so that, for each t, u(t) is a function of y(\tau); \, 0\leq\tau\leq t-\varepsilon, then {\cal Y}_t ={\cal Y}_t^0, 0\leq t\leq T; see Example 3 in Georgiou and Lindquist, and also Davis and Varaiya and Section 2.4 in Bensoussan.

It can be shown that the state process is conditionally Gaussian given the filtration \{{\mathcal Y}_t\}. This fact can be used to show that \hat{x} is actually generated by a Kalman filter (see Chapters 11 and 12 in Liptser and Shiryaev).

==Issues on feedback in linear stochastic systems==