
Rayleigh–Ritz method

The Rayleigh–Ritz method is a direct numerical method of approximating eigenvalues, which originated in the context of solving physical boundary-value problems. It is named after Lord Rayleigh and Walther Ritz. In this method, an infinite-dimensional linear operator is approximated by a finite-dimensional compression, enabling the use of a numerical eigenvalue algorithm.

Naming and attribution
The origin of this technique has been debated by historians. It has been called the Ritz method after Walther Ritz, who published the numerical procedure in 1908 and 1909. Lord Rayleigh wrote a paper congratulating Ritz on his work in 1911, but stated that he himself had used Ritz's method in many places in his book, Theory of Sound (1877), and elsewhere.
Method
Let T be a linear operator on a Hilbert space \mathcal{H} with inner product (\cdot, \cdot). Now consider a finite set of functions \mathcal{L} = \{\varphi_1, \ldots, \varphi_n\}. Depending on the application, these functions may be:
• A subset of an orthonormal basis of the original operator;
• A space of splines (as in the Galerkin method);
• A set of functions which approximate the eigenfunctions of the operator.
One could use the orthonormal basis generated from the eigenfunctions of the operator, which would produce diagonal approximating matrices, but in that case we would already have had to calculate the spectrum. We now approximate T by T_{\mathcal{L}}, which is defined as the matrix with entries

(T_{\mathcal{L}})_{i,j} = (T \varphi_j, \varphi_i).

If a subset of an orthonormal basis was used to build the matrix, the eigenvectors of T_{\mathcal{L}} will be linear combinations of the orthonormal basis functions, and as a result they will be approximations of the eigenvectors of T.
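As a finite-dimensional illustration (a minimal sketch; the random operator, the trial set, and all sizes below are invented for the example), take T to be a large Hermitian matrix and \mathcal{L} a subset of an orthonormal basis, so the compression is just a leading principal submatrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# A "large" Hermitian operator T on the finite-dimensional Hilbert
# space C^n, standing in for an infinite-dimensional operator.
n = 400
B = rng.standard_normal((n, n))
T = (B + B.T) / 2                      # Hermitian, hence real spectrum

# Trial set L: the first m vectors of an orthonormal basis (here the
# coordinate vectors), so (T_L)_{ij} = (T phi_j, phi_i) is the leading
# m x m principal submatrix of T.
m = 50
Phi = np.eye(n)[:, :m]
T_L = Phi.T @ T @ Phi

ritz_vals = np.linalg.eigvalsh(T_L)    # approximate eigenvalues of T
true_vals = np.linalg.eigvalsh(T)

# Every Ritz value lies in the numerical range of T, here the interval
# between the smallest and largest eigenvalues.
assert true_vals[0] <= ritz_vals[0] and ritz_vals[-1] <= true_vals[-1]
```

For a Hermitian T the containment in the last line is the Cauchy interlacing property of compressions; the individual Ritz values converge to points of the spectrum as m grows.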
Properties
Spectral pollution

It is possible for the Rayleigh–Ritz method to produce values which do not converge to actual values in the spectrum of the operator as the truncation gets large. These values are known as spectral pollution. In some cases (such as for the Schrödinger equation), there is no approximation which both includes all eigenvalues of the equation and contains no pollution. The spectrum of the compression (and thus pollution) is bounded by the numerical range of the operator; in many cases it is bounded by a subset of the numerical range known as the essential numerical range.
For matrix eigenvalue problems
In numerical linear algebra, the Rayleigh–Ritz method is commonly applied to approximate the eigenvalue problem A \mathbf{x} = \lambda \mathbf{x} for a matrix A \in \mathbb{C}^{N \times N} of size N using a projected matrix of smaller size m, generated from a given matrix V \in \mathbb{C}^{N \times m} with orthonormal columns. The matrix version of the algorithm is the simplest:
• Compute the m \times m matrix V^* A V, where V^* denotes the complex-conjugate transpose of V.
• Solve the eigenvalue problem V^* A V \mathbf{y}_i = \mu_i \mathbf{y}_i.
• Compute the Ritz vectors \tilde{\mathbf{x}}_i = V \mathbf{y}_i and the Ritz values \tilde{\lambda}_i = \mu_i.
• Output the approximations (\tilde{\lambda}_i, \tilde{\mathbf{x}}_i), called the Ritz pairs, to eigenvalues and eigenvectors of the original matrix A.
If the subspace with the orthonormal basis given by the columns of the matrix V \in \mathbb{C}^{N \times m} contains k \leq m vectors that are close to eigenvectors of the matrix A, the Rayleigh–Ritz method above finds k Ritz vectors that approximate these eigenvectors well. The easily computable quantity \| A \tilde{\mathbf{x}}_i - \tilde{\lambda}_i \tilde{\mathbf{x}}_i \| determines the accuracy of such an approximation for every Ritz pair.
In the simplest case m = 1, the N \times m matrix V turns into a unit column-vector v, the m \times m matrix V^* A V is a scalar equal to the Rayleigh quotient \rho(v) = v^* A v / v^* v, the only solution (i = 1) to the eigenvalue problem is y_i = 1 and \mu_i = \rho(v), and the only Ritz vector is v itself. Thus, for m = 1 the Rayleigh–Ritz method reduces to computing the Rayleigh quotient. Another useful connection to the Rayleigh quotient is that \mu_i = \rho(\tilde{\mathbf{x}}_i) for every Ritz pair (\tilde{\lambda}_i, \tilde{\mathbf{x}}_i), allowing one to derive properties of the Ritz values \mu_i from the corresponding theory for the Rayleigh quotient.
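The four steps above can be sketched in a few lines with NumPy (a hedged sketch; the helper name rayleigh_ritz and the random test matrices are illustrative only):

```python
import numpy as np

def rayleigh_ritz(A, V):
    """Ritz pairs of A on the subspace spanned by the orthonormal
    columns of V: returns Ritz values mu_i and Ritz vectors V y_i."""
    H = V.conj().T @ A @ V              # step 1: m x m matrix V* A V
    mu, Y = np.linalg.eigh(H)           # step 2: small eigenvalue problem
    X = V @ Y                           # step 3: Ritz vectors x_i = V y_i
    return mu, X                        # step 4: the Ritz pairs

rng = np.random.default_rng(1)
B = rng.standard_normal((100, 100))
A = (B + B.T) / 2                       # Hermitian test matrix
V, _ = np.linalg.qr(rng.standard_normal((100, 10)))  # orthonormal columns

mu, X = rayleigh_ritz(A, V)
# The residual norms ||A x_i - lambda_i x_i|| measure the accuracy of
# each Ritz pair, as noted in the text.
residuals = np.linalg.norm(A @ X - X * mu, axis=0)
```

Since A here is Hermitian, every Ritz value returned lies between the smallest and largest eigenvalues of A, consistent with the Rayleigh-quotient bound discussed below.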
For example, if A is a Hermitian matrix, its Rayleigh quotient (and thus every one of its Ritz values) is real and takes values within the closed interval between the smallest and largest eigenvalues of A.

Example

The matrix A = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 1 \\ 0 & 1 & 2 \end{bmatrix} has eigenvalues 1, 2, 3 and the corresponding eigenvectors \mathbf x_{\lambda=1} = \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}, \quad \mathbf x_{\lambda=2} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad \mathbf x_{\lambda=3} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}. Let us take V = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}, then V^* A V = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} with eigenvalues 1, 3 and the corresponding eigenvectors \mathbf y_{\mu=1} = \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \quad \mathbf y_{\mu=3} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, so that the Ritz values are 1, 3 and the Ritz vectors are \tilde{\mathbf x}_{\tilde{\lambda}=1} = \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}, \quad \tilde{\mathbf x}_{\tilde{\lambda}=3} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}. We observe that each Ritz vector is exactly one of the eigenvectors of A for the given V, and that the Ritz values give exactly two of the three eigenvalues of A. The mathematical explanation for the exact approximation is that the column space of the matrix V happens to coincide with the subspace spanned by the two eigenvectors \mathbf x_{\lambda=1} and \mathbf x_{\lambda=3} in this example.
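The example can be checked numerically (a short verification sketch using NumPy):

```python
import numpy as np

A = np.array([[2., 0., 0.],
              [0., 2., 1.],
              [0., 1., 2.]])
V = np.array([[0., 0.],
              [1., 0.],
              [0., 1.]])          # orthonormal columns

H = V.T @ A @ V                   # the projected matrix [[2, 1], [1, 2]]
mu, Y = np.linalg.eigh(H)         # Ritz values 1 and 3
X = V @ Y                         # Ritz vectors

assert np.allclose(mu, [1., 3.])
# Here each Ritz pair is an exact eigenpair of A, so residuals vanish:
assert np.allclose(A @ X, X * mu)
```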
For matrix singular value problems
Truncated singular value decomposition (SVD) in numerical linear algebra can also use the Rayleigh–Ritz method to find approximations to left and right singular vectors of the matrix M \in \mathbb{C}^{M \times N} of size M \times N in given subspaces by turning the singular value problem into an eigenvalue problem.

Using the normal matrix

The definition of the singular value \sigma and the corresponding left and right singular vectors is M v = \sigma u and M^* u = \sigma v. Having found one set (left or right) of approximate singular vectors and singular values by naively applying the Rayleigh–Ritz method to the Hermitian normal matrix M^* M \in \mathbb{C}^{N \times N} or M M^* \in \mathbb{C}^{M \times M}, whichever is smaller, one could determine the other set of left or right singular vectors simply by dividing by the singular values, i.e., u = M v / \sigma and v = M^* u / \sigma. However, the division is unstable or fails for small or zero singular values.
An alternative approach, e.g., defining the normal matrix as A = M^* M \in \mathbb{C}^{N \times N} of size N \times N, takes advantage of the fact that for a given N \times m matrix W \in \mathbb{C}^{N \times m} with orthonormal columns, the eigenvalue problem of the Rayleigh–Ritz method for the m \times m matrix W^* A W = W^* M^* M W = (M W)^* M W can be interpreted as a singular value problem for the M \times m matrix M W. This interpretation allows simple simultaneous calculation of both left and right approximate singular vectors as follows.
• Compute the M \times m matrix M W.
• Compute its thin, or economy-sized, SVD M W = \mathbf{U} \Sigma \mathbf{V}_h, with M \times m matrix \mathbf{U}, m \times m diagonal matrix \Sigma, and m \times m matrix \mathbf{V}_h.
• Compute the matrices of the Ritz left U = \mathbf{U} and right V_h = \mathbf{V}_h W^* singular vectors.
• Output the approximations (U, \Sigma, V_h), called the Ritz singular triplets, to selected singular values and the corresponding left and right singular vectors of the original matrix M, representing an approximate truncated SVD with right singular vectors restricted to the column-space of the matrix W.
The algorithm can be used as a post-processing step where the matrix W is an output of an eigenvalue solver, e.g., LOBPCG, approximating numerically selected eigenvectors of the normal matrix A = M^* M.

Example

The matrix M = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 4 \\ 0 & 0 & 0 & 0 \end{bmatrix} has its normal matrix A = M^* M = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & 9 & 0 \\ 0 & 0 & 0 & 16 \\ \end{bmatrix}, singular values 1, 2, 3, 4 and the corresponding thin SVD M = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 4 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}, where the columns of the first multiplier are from the complete set of the left singular vectors of the matrix M, the diagonal entries of the middle term are the singular values, and the columns of the last multiplier transposed (although the transposition does not change it) \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}^* \quad = \quad \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix} are the corresponding right singular vectors.
Let us take W = \begin{bmatrix} 1 / \sqrt{2} & 1 / \sqrt{2} \\ 1 / \sqrt{2} & -1 / \sqrt{2} \\ 0 & 0 \\ 0 & 0 \end{bmatrix} with the column-space spanned by the two exact right singular vectors \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} corresponding to the singular values 1 and 2. Following step 1 of the algorithm, we compute M W = \begin{bmatrix} 1 / \sqrt{2} & 1 / \sqrt{2} \\ \sqrt{2} & -\sqrt{2} \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}, and on step 2 its thin SVD M W = \mathbf{U} \Sigma \mathbf{V}_h with \mathbf{U} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}, \quad \Sigma = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}, \quad \mathbf{V}_h = \begin{bmatrix} 1 / \sqrt{2} & -1 / \sqrt{2} \\ 1 / \sqrt{2} & 1 / \sqrt{2} \end{bmatrix}. Thus we already obtain the singular values 2 and 1 from \Sigma and, from \mathbf{U}, the corresponding two left singular vectors u as [0, 1, 0, 0, 0]^* and [1, 0, 0, 0, 0]^*. Finally, step 3 computes the matrix V_h = \mathbf{V}_h W^*: \mathbf{V}_h W^* = \begin{bmatrix} 1 / \sqrt{2} & -1 / \sqrt{2} \\ 1 / \sqrt{2} & 1 / \sqrt{2} \end{bmatrix} \, \begin{bmatrix} 1 / \sqrt{2} & 1 / \sqrt{2} & 0 & 0 \\ 1 / \sqrt{2} & -1 / \sqrt{2} & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}, recovering from its rows the two right singular vectors v as [0, 1, 0, 0]^* and [1, 0, 0, 0]^*. The approximations are exact for this W because its column-space coincides with the span of two exact right singular vectors of M.
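The three algorithm steps, applied to this example, can be reproduced with NumPy (a sketch; the helper name ritz_svd is illustrative only):

```python
import numpy as np

def ritz_svd(M, W):
    """Ritz singular triplets of M for a given matrix W with
    orthonormal columns, following the three steps above."""
    MW = M @ W                                               # step 1
    U, s, Vh_small = np.linalg.svd(MW, full_matrices=False)  # step 2: thin SVD
    Vh = Vh_small @ W.conj().T                               # step 3
    return U, s, Vh

M = np.array([[1., 0., 0., 0.],
              [0., 2., 0., 0.],
              [0., 0., 3., 0.],
              [0., 0., 0., 4.],
              [0., 0., 0., 0.]])
c = 1 / np.sqrt(2)
W = np.array([[c,  c],
              [c, -c],
              [0., 0.],
              [0., 0.]])

U, s, Vh = ritz_svd(M, W)
assert np.allclose(s, [2., 1.])              # exact singular values
# The relation M V = U Sigma holds exactly by construction:
assert np.allclose(M @ Vh.conj().T, U * s)
```

The last assertion reflects a property of the construction: M times the computed right singular vectors equals the left singular vectors scaled by the singular values, regardless of how well W was chosen.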
We validate the first Ritz singular triplet: M v = \sigma u, \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 4 \\ 0 & 0 & 0 & 0 \end{bmatrix} \, \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} = \, 2 \, \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, and M^* u = \sigma v, \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 3 & 0 & 0 \\ 0 & 0 & 0 & 4 & 0 \end{bmatrix} \, \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} = \, 2 \, \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}. Thus, for the given matrix W, whose column-space is spanned by two exact right singular vectors, we determine these right singular vectors, as well as the corresponding left singular vectors and the singular values, all exactly. For an arbitrary matrix W, we obtain approximate singular triplets which are optimal, given W, in the sense of optimality of the Rayleigh–Ritz method.
Applications and examples
In quantum physics

In quantum physics, where the spectrum of the Hamiltonian is the set of discrete energy levels allowed by a quantum mechanical system, the Rayleigh–Ritz method is used to approximate the energy states and wavefunctions of a complicated atomic or nuclear system.
The relationship with the finite element method
In the language of the finite element method, the matrix H_{kj} is precisely the stiffness matrix of the Hamiltonian in the piecewise linear element space, and the matrix S_{kj} is the mass matrix. In the language of linear algebra, the value \epsilon is an eigenvalue of the discretized Hamiltonian, and the vector c is a discretized eigenvector.
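As a concrete sketch (assuming, purely for illustration, a free particle in a box with Hamiltonian H = -d^2/dx^2 on [0, 1], Dirichlet boundary conditions, and units in which the exact eigenvalues are (k\pi)^2), the stiffness matrix H_{kj} and the mass matrix S_{kj} for piecewise linear elements on a uniform mesh are tridiagonal, and the generalized eigenvalue problem H c = \epsilon S c can be solved after a Cholesky reduction:

```python
import numpy as np

n = 200                         # number of interior mesh nodes
h = 1.0 / (n + 1)               # uniform mesh width

# Stiffness matrix H_kj = integral of phi_k' phi_j' dx for hat functions
H = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
# Mass matrix S_kj = integral of phi_k phi_j dx
S = (4.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)) * h / 6.0

# Generalized eigenproblem H c = eps S c, reduced to a standard one
# via the Cholesky factorization S = L L^T.
L = np.linalg.cholesky(S)
Linv = np.linalg.inv(L)
eps = np.linalg.eigvalsh(Linv @ H @ Linv.T)

# The lowest discrete eigenvalue approximates pi^2 ~ 9.8696.
assert abs(eps[0] - np.pi**2) < 1e-2
```

The Cholesky reduction is one standard way to handle the mass matrix; dedicated generalized-eigenproblem solvers serve the same purpose.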