The generalized singular value decomposition (GSVD) is a matrix decomposition on a pair of matrices which generalizes the singular value decomposition. It was introduced by Van Loan and is extensively used in the study of the conditioning and regularization of linear systems with respect to quadratic semi-norms. In the following, let \mathbb{F} = \mathbb{R} or \mathbb{F} = \mathbb{C}.
== Definition ==
The generalized singular value decomposition of matrices A_1 \in \mathbb{F}^{m_1 \times n} and A_2 \in \mathbb{F}^{m_2 \times n} is
\begin{align} A_1 & = U_1\Sigma_1 [ W^* D, 0_D] Q^*, \\ A_2 & = U_2\Sigma_2 [ W^* D, 0_D] Q^*, \end{align}
where
• U_1 \in \mathbb{F}^{m_1 \times m_1} is unitary,
• U_2 \in \mathbb{F}^{m_2 \times m_2} is unitary,
• Q \in \mathbb{F}^{n \times n} is unitary,
• W \in \mathbb{F}^{k \times k} is unitary,
• D \in \mathbb{R}^{k \times k} is real diagonal with positive diagonal, and contains the non-zero singular values of C = \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} in decreasing order,
• 0_D = 0 \in \mathbb{R}^{k \times (n - k)},
• \Sigma_1 = \lceil I_A, S_1, 0_A \rfloor \in \mathbb{R}^{m_1 \times k} is real non-negative block-diagonal, where S_1 = \lceil \alpha_{r + 1}, \dots, \alpha_{r + s} \rfloor with 1 > \alpha_{r + 1} \ge \cdots \ge \alpha_{r + s} > 0, I_A = I_r, and 0_A = 0 \in \mathbb{R}^{(m_1 - r - s) \times (k - r - s)},
• \Sigma_2 = \lceil 0_B, S_2, I_B \rfloor \in \mathbb{R}^{m_2 \times k} is real non-negative block-diagonal, where S_2 = \lceil \beta_{r + 1}, \dots, \beta_{r + s} \rfloor with 0 < \beta_{r + 1} \le \cdots \le \beta_{r + s} < 1, I_B = I_{k - r - s}, and 0_B = 0 \in \mathbb{R}^{(m_2 - k + r) \times r},
• \Sigma_1^* \Sigma_1 = \lceil\alpha_1^2, \dots, \alpha_k^2\rfloor,
• \Sigma_2^* \Sigma_2 = \lceil\beta_1^2, \dots, \beta_k^2\rfloor,
• \Sigma_1^* \Sigma_1 + \Sigma_2^* \Sigma_2 = I_k,
• k = \textrm{rank}(C).
We also denote \alpha_1 = \cdots = \alpha_r = 1, \alpha_{r + s + 1} = \cdots = \alpha_k = 0, \beta_1 = \cdots = \beta_r = 0, and \beta_{r + s + 1} = \cdots = \beta_k = 1. While \Sigma_1 is diagonal, \Sigma_2 is not always diagonal, because of the leading rectangular zero matrix; instead \Sigma_2 is "bottom-right-diagonal".
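As a small illustrative example (the matrices are our own choice), take A_1 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} and A_2 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}. Then C^* C = I_2, so k = 2 and one may take D = Q = W = U_1 = U_2 = I_2, \Sigma_1 = A_1, and \Sigma_2 = A_2 (with 0_D empty because n = k). This gives r = 1, s = 0, generalized singular value pairs (\alpha_1, \beta_1) = (1, 0) and (\alpha_2, \beta_2) = (0, 1), and indeed \Sigma_1^* \Sigma_1 + \Sigma_2^* \Sigma_2 = I_2.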
== Variations ==
There are many variations of the GSVD. These variations are related to the fact that it is always possible to multiply Q^* from the left by E E^* = I, where E \in \mathbb{F}^{n \times n} is an arbitrary unitary matrix. We denote
• X = ([W^* D, 0_D] Q^*)^*,
• X^* = [0, R] \hat{Q}^*, where R \in \mathbb{F}^{k \times k} is upper-triangular and invertible, and \hat{Q} \in \mathbb{F}^{n \times n} is unitary. Such matrices exist by RQ-decomposition (a numerical illustration is given after the list of variations below),
• Y = W^* D. Then Y is invertible.
Here are some variations of the GSVD:
• MATLAB (gsvd): \begin{aligned} A_1 & = U_1 \Sigma_1 X^*, \\ A_2 & = U_2 \Sigma_2 X^*. \end{aligned}
• LAPACK (ggsvd3): \begin{aligned} A_1 & = U_1 \Sigma_1 [0, R] \hat{Q}^*, \\ A_2 & = U_2 \Sigma_2 [0, R] \hat{Q}^*. \end{aligned}
• Simplified: \begin{align} A_1 & = U_1\Sigma_1 [ Y, 0_D] Q^*, \\ A_2 & = U_2\Sigma_2 [ Y, 0_D] Q^*. \end{align}
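The claim that X^* can always be written as [0, R] \hat{Q}^* can be illustrated with an off-the-shelf RQ decomposition. The following minimal Python sketch is our own illustration: it uses SciPy's scipy.linalg.rq, and the stand-in matrix M and all variable names are arbitrary choices.
<syntaxhighlight lang="python">
import numpy as np
from scipy.linalg import rq

# M stands in for X^* = [W^* D, 0_D] Q^*: an arbitrary k x n matrix with k <= n
# (random here purely for illustration).
rng = np.random.default_rng(0)
k, n = 3, 5
M = rng.standard_normal((k, n))

R_full, Qhat_h = rq(M)                            # RQ decomposition: M = R_full @ Qhat_h
print(np.allclose(M, R_full @ Qhat_h))            # True: M is reconstructed
print(np.allclose(Qhat_h @ Qhat_h.T, np.eye(n)))  # True: Qhat_h plays the role of Q-hat^*
# R_full has the block shape [0, R]: its first n - k columns vanish and the
# trailing k x k block is upper triangular, matching the LAPACK-style variation.
print(np.allclose(R_full[:, :n - k], 0.0))        # True
</syntaxhighlight>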
== Generalized singular values ==
The phrase generalized singular value can be used in reference to slightly different things. In this article, a generalized singular value of A_1 and A_2 is a pair (a, b) \in \mathbb{R}^2 such that
\begin{align} \lim_{\delta \to 0} \det(b^2 A_1^* A_1 - a^2 A_2^* A_2 + \delta I_n) / \det(\delta I_{n - k}) & = 0, \\ a^2 + b^2 & = 1, \\ a, b & \geq 0. \end{align}
Other sources call such a pair (a, b) \in \mathbb{R}^2 a generalized singular value pair and use generalized singular value to refer to the ratio a/b. Regardless of the naming convention, we have
• A_i A_j^* = U_i \Sigma_i Y Y^* \Sigma_j^* U_j^*,
• A_i^* A_j = Q \begin{bmatrix} Y^* \Sigma_i^* \Sigma_j Y & 0 \\ 0 & 0 \end{bmatrix} Q^* = Q_1 Y^* \Sigma_i^* \Sigma_j Y Q_1^*, where Q_1 \in \mathbb{F}^{n \times k} denotes the first k columns of Q.
By these properties we can show that the generalized singular values are exactly the pairs (\alpha_i, \beta_i). We have
\begin{aligned} & \det(b^2 A_1^* A_1 - a^2 A_2^* A_2 + \delta I_n) \\ = & \det(b^2 A_1^* A_1 - a^2 A_2^* A_2 + \delta Q Q^*) \\ = & \det\left(Q \begin{bmatrix} Y^* (b^2 \Sigma_1^* \Sigma_1 - a^2 \Sigma_2^* \Sigma_2) Y + \delta I_k & 0 \\ 0 & \delta I_{n - k} \end{bmatrix} Q^*\right) \\ = & \det(\delta I_{n - k}) \det(Y^* (b^2 \Sigma_1^* \Sigma_1 - a^2 \Sigma_2^* \Sigma_2) Y + \delta I_k). \end{aligned}
Therefore
\begin{aligned} {} & \lim_{\delta \to 0} \det(b^2 A_1^* A_1 - a^2 A_2^* A_2 + \delta I_n) / \det(\delta I_{n - k}) \\ = & \lim_{\delta \to 0} \det(Y^* (b^2 \Sigma_1^* \Sigma_1 - a^2 \Sigma_2^* \Sigma_2) Y + \delta I_k) \\ = & \det(Y^* (b^2 \Sigma_1^* \Sigma_1 - a^2 \Sigma_2^* \Sigma_2) Y) \\ = & |\det(Y)|^2 \prod_{i = 1}^k (b^2 \alpha_i^2 - a^2 \beta_i^2). \end{aligned}
Since a^2 + b^2 = 1 and a, b \geq 0, this expression is zero exactly when a = \alpha_i and b = \beta_i for some i. In some formulations, the generalized singular values are instead defined as the solutions of \det(b^2 A_1^* A_1 - a^2 A_2^* A_2) = 0. However, this only agrees with the definition above when k = n, since otherwise the determinant is zero for every pair (a, b) \in \mathbb{R}^2; this can be seen by substituting \delta = 0 above.
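When k = n, the criterion \det(b^2 A_1^* A_1 - a^2 A_2^* A_2) = 0 is a generalized eigenvalue problem and the pairs can be computed directly. The following minimal Python/NumPy sketch is our own illustration; the matrices and variable names are arbitrary choices.
<syntaxhighlight lang="python">
import numpy as np

# Small full-column-rank pair (our choice), so k = n = 2 and the determinant
# criterion det(b^2 A1^* A1 - a^2 A2^* A2) = 0 applies without the limit.
A1 = np.array([[1.0, 0.0],
               [0.0, 2.0],
               [1.0, 1.0]])
A2 = np.array([[2.0, 1.0],
               [0.0, 1.0]])

# The criterion says (a/b)^2 is an eigenvalue of (A2^* A2)^{-1} (A1^* A1).
ratios_sq = np.linalg.eigvals(np.linalg.solve(A2.T @ A2, A1.T @ A1)).real
a = np.sqrt(ratios_sq / (1.0 + ratios_sq))   # rescale each pair so that a^2 + b^2 = 1
b = np.sqrt(1.0 / (1.0 + ratios_sq))
print(sorted(zip(a, b), reverse=True))       # the generalized singular value pairs
</syntaxhighlight>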
== Generalized inverse ==
Define E^+ = E^{-1} for any invertible matrix E \in \mathbb{F}^{n \times n}, 0^+ = 0^* for any zero matrix 0 \in \mathbb{F}^{m \times n}, and \left\lceil E_1, E_2 \right\rfloor^+ = \left\lceil E_1^+, E_2^+ \right\rfloor for any block-diagonal matrix. Then define
A_i^+ = Q \begin{bmatrix} Y^{-1} \\ 0 \end{bmatrix} \Sigma_i^+ U_i^*.
It can be shown that A_i^+ as defined here is a generalized inverse of A_i; in particular a \{1, 2, 3\}-inverse of A_i. Since it does not in general satisfy (A_i^+ A_i)^* = A_i^+ A_i, it is not in general the Moore–Penrose inverse; otherwise we could derive (AB)^+ = B^+ A^+ for any choice of matrices, which only holds for certain classes of matrices (a small numerical counterexample is given at the end of this section).
Suppose Q = \begin{bmatrix}Q_1 & Q_2\end{bmatrix}, where Q_1 \in \mathbb{F}^{n \times k} and Q_2 \in \mathbb{F}^{n \times (n - k)}. This generalized inverse has the following properties:
• \Sigma_1^+ = \lceil I_A, S_1^{-1}, 0_A^T \rfloor
• \Sigma_2^+ = \lceil 0^T_B, S_2^{-1}, I_B \rfloor
• \Sigma_1 \Sigma_1^+ = \lceil I, I, 0 \rfloor
• \Sigma_2 \Sigma_2^+ = \lceil 0, I, I \rfloor
• \Sigma_1 \Sigma_2^+ = \lceil 0, S_1 S_2^{-1}, 0 \rfloor
• \Sigma_1^+ \Sigma_2 = \lceil 0, S_1^{-1} S_2, 0 \rfloor
• A_i A_j^+ = U_i \Sigma_i \Sigma_j^+ U_j^*
• A_i^+ A_j = Q \begin{bmatrix} Y^{-1} \Sigma_i^+ \Sigma_j Y & 0 \\ 0 & 0 \end{bmatrix} Q^* = Q_1 Y^{-1} \Sigma_i^+ \Sigma_j Y Q_1^*
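The failure of the reverse-order law mentioned above is easy to check numerically for the Moore–Penrose inverse. The following tiny Python/NumPy counterexample is our own illustration; the matrices are arbitrary choices.
<syntaxhighlight lang="python">
import numpy as np

# For this pair the Moore-Penrose reverse-order law (AB)^+ = B^+ A^+ fails.
A = np.array([[1.0, 0.0]])                     # 1 x 2
B = np.array([[1.0],
              [1.0]])                          # 2 x 1
lhs = np.linalg.pinv(A @ B)                    # (AB)^+ = [[1.0]]
rhs = np.linalg.pinv(B) @ np.linalg.pinv(A)    # B^+ A^+ = [[0.5]]
print(lhs, rhs, np.allclose(lhs, rhs))         # ... False
</syntaxhighlight>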
== Quotient SVD ==
A generalized singular ratio (or just "generalized singular value", depending on one's naming convention) of A_1 and A_2 is \sigma_i = \alpha_i \beta_i^+. By the above properties, A_1 A_2^+ = U_1 \Sigma_1 \Sigma_2^+ U_2^*. Note that \Sigma_1 \Sigma_2^+ = \lceil 0, S_1 S_2^{-1}, 0 \rfloor is diagonal and, ignoring the leading zeros, contains the singular ratios in decreasing order.
If A_2 is invertible, then \Sigma_1 \Sigma_2^+ has no leading zeros, the generalized singular ratios are the singular values of the matrix A_1 A_2^+ = A_1 A_2^{-1}, and U_1 and U_2 are its matrices of singular vectors. In fact, computing the SVD of A_1 A_2^{-1} is one of the motivations for the GSVD, as "forming AB^{-1} and finding its SVD can lead to unnecessary and large numerical errors when B is ill-conditioned for solution of equations". Hence the sometimes used name "quotient SVD", although this is not the only reason for using the GSVD.
If A_2 is not invertible, then U_1 \Sigma_1 \Sigma_2^+ U_2^* is still an SVD of A_1 A_2^+ if we relax the requirement of having the singular values in decreasing order. Alternatively, a decreasing-order SVD can be found by moving the leading zeros to the back: U_1 \Sigma_1 \Sigma_2^+ U_2^* = (U_1 P_1) P_1^* \Sigma_1 \Sigma_2^+ P_2 (P_2^* U_2^*), where P_1 and P_2 are appropriate permutation matrices. Since rank equals the number of non-zero singular values, \mathrm{rank}(A_1 A_2^+) = s.
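For an invertible A_2, the relation between the generalized singular ratios and the ordinary singular values of A_1 A_2^{-1} can be checked directly. The following minimal Python/NumPy sketch is our own illustration, reusing the arbitrary example pair from the section on generalized singular values above.
<syntaxhighlight lang="python">
import numpy as np

A1 = np.array([[1.0, 0.0],
               [0.0, 2.0],
               [1.0, 1.0]])
A2 = np.array([[2.0, 1.0],
               [0.0, 1.0]])                    # invertible, so A2^+ = A2^{-1}

# Singular values of A1 A2^{-1} ...
sv = np.linalg.svd(A1 @ np.linalg.inv(A2), compute_uv=False)
# ... equal the generalized singular ratios alpha_i / beta_i, here computed from
# the eigenvalues of (A2^* A2)^{-1} (A1^* A1) as in the earlier sketch.
ratios = np.sqrt(np.sort(np.linalg.eigvals(
    np.linalg.solve(A2.T @ A2, A1.T @ A1)).real)[::-1])
print(sv, ratios, np.allclose(sv, ratios))     # ... True
</syntaxhighlight>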
== Construction ==
Let
• C = P \lceil D, 0 \rfloor Q^* be the SVD of C = \begin{bmatrix} A_1 \\ A_2 \end{bmatrix}, where P \in \mathbb{F}^{(m_1 + m_2) \times (m_1 + m_2)} is unitary, and Q and D are as described,
• P = [P_1, P_2], where P_1 \in \mathbb{F}^{(m_1 + m_2) \times k} and P_2 \in \mathbb{F}^{(m_1 + m_2) \times (n - k)},
• P_1 = \begin{bmatrix} P_{11} \\ P_{21} \end{bmatrix}, where P_{11} \in \mathbb{F}^{m_1 \times k} and P_{21} \in \mathbb{F}^{m_2 \times k},
• P_{11} = U_1 \Sigma_1 W^* by the SVD of P_{11}, where U_1, \Sigma_1 and W are as described,
• P_{21} W = U_2 \Sigma_2 by a decomposition similar to a QR-decomposition, where U_2 and \Sigma_2 are as described.
Then
\begin{aligned} C & = P \lceil D, 0 \rfloor Q^* \\ {} & = [P_1 D, 0] Q^* \\ {} & = \begin{bmatrix} U_1 \Sigma_1 W^* D & 0 \\ U_2 \Sigma_2 W^* D & 0 \end{bmatrix} Q^* \\ {} & = \begin{bmatrix} U_1 \Sigma_1 [W^* D, 0] Q^* \\ U_2 \Sigma_2 [W^* D, 0] Q^* \end{bmatrix}. \end{aligned}
We also have
\begin{bmatrix} U_1^* & 0 \\ 0 & U_2^* \end{bmatrix} P_1 W = \begin{bmatrix} \Sigma_1 \\ \Sigma_2 \end{bmatrix}.
Therefore
\Sigma_1^* \Sigma_1 + \Sigma_2^* \Sigma_2 = \begin{bmatrix} \Sigma_1 \\ \Sigma_2 \end{bmatrix}^* \begin{bmatrix} \Sigma_1 \\ \Sigma_2 \end{bmatrix} = W^* P_1^* \begin{bmatrix} U_1 & 0 \\ 0 & U_2 \end{bmatrix} \begin{bmatrix} U_1^* & 0 \\ 0 & U_2^* \end{bmatrix} P_1 W = I.
Since P_1 has orthonormal columns, ||P_1||_2 \leq 1. Therefore
||\Sigma_1||_2 = ||U_1^* P_{11} W||_2 = ||P_{11}||_2 \leq ||P_1||_2 \leq 1.
We also have, for each x \in \mathbb{R}^k such that ||x||_2 = 1, that
||P_{21} x||_2^2 \leq ||P_{11} x||_2^2 + ||P_{21} x||_2^2 = ||P_{1} x||_2^2 \leq 1.
Therefore ||P_{21}||_2 \leq 1, and
||\Sigma_2||_2 = || U_2^* P_{21} W ||_2 = ||P_{21}||_2 \leq 1.
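This construction can be carried out with standard SVD routines. The following minimal Python/NumPy sketch is our own illustration of the steps above: the function name gsvd_construct, the tolerance handling, and the random test matrices are all our choices, and only the real case \mathbb{F} = \mathbb{R} is exercised.
<syntaxhighlight lang="python">
import numpy as np

def gsvd_construct(A1, A2, tol=1e-10):
    """Sketch of the construction above (real case).

    Returns U1, U2, W, D, Q, Sigma1, Sigma2, k with
    A1 = U1 @ Sigma1 @ [W^* D, 0] @ Q^*  and  A2 = U2 @ Sigma2 @ [W^* D, 0] @ Q^*.
    """
    m1, n = A1.shape
    m2 = A2.shape[0]
    # Step 1: SVD of the stacked matrix, C = P [D, 0] Q^*.
    C = np.vstack([A1, A2])
    P, svals, Qh = np.linalg.svd(C, full_matrices=True)
    k = int(np.sum(svals > tol * svals[0]))      # numerical rank of C
    D = np.diag(svals[:k])
    Q = Qh.conj().T
    # Steps 2-3: only the first k columns of P are needed; split them by rows.
    P11, P21 = P[:m1, :k], P[m1:, :k]
    # Step 4: SVD of P11 gives U1, Sigma1 and W; the singular values are the alpha_i.
    U1, s, Wh = np.linalg.svd(P11, full_matrices=True)
    W = Wh.conj().T
    alpha = np.zeros(k)
    alpha[:s.size] = s
    Sigma1 = np.zeros((m1, k))
    np.fill_diagonal(Sigma1, alpha)
    # Step 5: P21 W = U2 Sigma2.  The columns of P21 W are already orthogonal with
    # norms beta_j (because W^* P21^* P21 W = I - Sigma1^* Sigma1), so U2 is obtained
    # by normalizing the non-zero columns and completing them to a unitary basis,
    # while Sigma2 gets its "bottom-right-diagonal" shape.
    B2 = P21 @ W
    beta = np.linalg.norm(B2, axis=0)
    idx = np.where(beta > tol)[0]                # columns with beta_j > 0
    Sigma2 = np.zeros((m2, k))
    Sigma2[idx + m2 - k, idx] = beta[idx]
    U2 = np.zeros((m2, m2))
    U2[:, idx + m2 - k] = B2[:, idx] / beta[idx]
    rest = np.setdiff1d(np.arange(m2), idx + m2 - k)
    if rest.size:                                # fill the remaining columns of U2
        comp = np.eye(m2) - U2 @ U2.T            # projector onto the missing subspace
        U2[:, rest] = np.linalg.svd(comp)[0][:, :rest.size]
    return U1, U2, W, D, Q, Sigma1, Sigma2, k

# Quick check on a random real pair (shapes chosen arbitrarily).
rng = np.random.default_rng(0)
A1 = rng.standard_normal((4, 3))
A2 = rng.standard_normal((2, 3))
U1, U2, W, D, Q, S1, S2, k = gsvd_construct(A1, A2)
n = A1.shape[1]
X_star = np.hstack([W.T @ D, np.zeros((k, n - k))]) @ Q.T
print(np.allclose(A1, U1 @ S1 @ X_star), np.allclose(A2, U2 @ S2 @ X_star))
print(np.allclose(S1.T @ S1 + S2.T @ S2, np.eye(k)))
</syntaxhighlight>
== Applications ==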