===Single-vector version===

====Preliminaries: Gradient descent for eigenvalue problems====
The method performs an
iterative maximization (or minimization) of the generalized
Rayleigh quotient \rho(x) := \rho(A,B; x) := \frac{x^\mathsf{T} A x}{x^\mathsf{T} B x}, which results in finding the largest (or smallest) eigenpairs of A x = \lambda B x. The direction of the steepest ascent, which is the gradient of the generalized Rayleigh quotient, is positively proportional to the vector r := Ax - \rho(x) Bx, called the eigenvector residual. If a
preconditioner T is available, it is applied to the residual, giving the vector w := Tr, called the preconditioned residual. Without preconditioning, we set T := I and so w := r. An iterative method x^{i+1} := x^i + \alpha^i T \left(Ax^i - \rho(x^i) Bx^i\right), or, in short, \begin{align} x^{i+1} &:= x^i + \alpha^i w^i, \\ w^i &:= Tr^i, \\ r^i &:= Ax^i - \rho(x^i) Bx^i, \end{align} is known as preconditioned
steepest ascent (or descent), where the scalar \alpha^i is called the step size. The optimal step size can be determined by maximizing the Rayleigh quotient, i.e., x^{i+1} := \arg\max_{y\in \operatorname{span}\{x^i,w^i\}} \rho(y) (or \arg\min in case of minimizing), in which case the method is called locally optimal.
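For illustration, below is a minimal NumPy/SciPy sketch of this locally optimal preconditioned steepest ascent for the largest eigenpair; the dense-matrix setting, the function name, and its signature are assumptions of the example, not part of any particular implementation.

<syntaxhighlight lang="python">
import numpy as np
from scipy.linalg import eigh

def locally_optimal_psa(A, B, T, x, maxiter=100, tol=1e-8):
    """Locally optimal preconditioned steepest ascent for A x = rho B x
    (a sketch: A, B symmetric, B and T positive definite, dense arrays)."""
    x = x / np.linalg.norm(x)
    for _ in range(maxiter):
        rho = (x @ A @ x) / (x @ B @ x)   # Rayleigh quotient rho(x)
        r = A @ x - rho * (B @ x)         # eigenvector residual
        if np.linalg.norm(r) < tol:
            break
        w = T @ r                         # preconditioned residual w = T r
        S = np.column_stack([x, w])       # basis of the trial subspace span{x, w}
        # Rayleigh-Ritz on span{x, w}: taking the top Ritz vector maximizes
        # the Rayleigh quotient and thus implicitly picks the optimal step size
        _, C = eigh(S.T @ A @ S, S.T @ B @ S)
        x = S @ C[:, -1]                  # Ritz vector of the largest Ritz value
        x /= np.linalg.norm(x)
    return rho, x
</syntaxhighlight>

Here scipy.linalg.eigh solves the 2-by-2 generalized symmetric eigenvalue problem with eigenvalues in ascending order, so the last Ritz vector maximizes the Rayleigh quotient; for minimization one takes the first column instead.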
====Three-term recurrence====
To dramatically accelerate the convergence of the locally optimal preconditioned steepest ascent (or descent), one extra vector can be added to the two-term
recurrence relation to make it three-term: x^{i+1} := \arg\max_{y\in \operatorname{span}\{x^i,w^i,x^{i-1}\}} \rho(y) (use \arg\min in case of minimizing). The maximization/minimization of the Rayleigh quotient in a 3-dimensional subspace can be performed numerically by the
Rayleigh–Ritz method. Adding more vectors (see, e.g., Richardson extrapolation) does not result in significant acceleration but increases computation costs, so it is not generally recommended.
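Under the same illustrative assumptions as the sketch above, only the trial subspace changes; production implementations add safeguards that this sketch omits.

<syntaxhighlight lang="python">
import numpy as np
from scipy.linalg import eigh

def lobpcg_single_vector(A, B, T, x, maxiter=100, tol=1e-8):
    """Single-vector three-term recurrence (a sketch): Rayleigh-Ritz
    over span{x^i, w^i, x^{i-1}} at every step."""
    x = x / np.linalg.norm(x)
    x_prev = None
    for _ in range(maxiter):
        rho = (x @ A @ x) / (x @ B @ x)
        r = A @ x - rho * (B @ x)
        if np.linalg.norm(r) < tol:
            break
        w = T @ r
        # Two basis vectors on the first step, three afterwards
        cols = [x, w] if x_prev is None else [x, w, x_prev]
        S = np.column_stack(cols)
        _, C = eigh(S.T @ A @ S, S.T @ B @ S)  # at most a 3-by-3 problem
        x_prev, x = x, S @ C[:, -1]
        x /= np.linalg.norm(x)
    return rho, x
</syntaxhighlight>

As written, the basis S becomes ill-conditioned when x and x_prev align near convergence, which is precisely the issue addressed in the next subsection.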
====Numerical stability improvements====
As the iterations converge, the vectors x^i and x^{i-1} become nearly linearly dependent, resulting in a precision loss and making the Rayleigh–Ritz method numerically unstable in the presence of round-off errors. The loss of precision may be avoided by substituting the vector x^{i-1} with a vector p^i that may be farther from x^{i}, in the basis of the three-dimensional subspace \operatorname{span}\{x^i, w^i, x^{i-1}\}, while keeping the subspace unchanged and avoiding orthogonalization or any other extra operations. Some LOBPCG implementations further utilize the unstable but efficient Cholesky decomposition of the normal matrix, which is performed only on the individual matrices W^{i} and P^{i}, rather than on the whole subspace. The constantly increasing amount of computer memory allows typical block sizes nowadays in the 10^3–10^4 range, where the percentage of compute time spent on orthogonalizations and the Rayleigh–Ritz method starts dominating.
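Sketched below, under the same assumptions as above, is one common way to realize this substitution: the direction vector p collects only the w- and p-components of the Ritz update, so the next basis {x, w, p} spans the same subspace as {x, w, x^{i-1}} but remains well conditioned; a Cholesky-based B-orthonormalization of an individual block is shown as a helper.

<syntaxhighlight lang="python">
import numpy as np
from scipy.linalg import eigh, cholesky, solve_triangular

def b_orthonormalize(V, B):
    """B-orthonormalize the columns of one block V via a Cholesky
    factorization of the normal matrix V.T B V = R.T R (cheap but
    potentially unstable when V is ill-conditioned)."""
    R = cholesky(V.T @ (B @ V))                      # upper triangular
    return solve_triangular(R.T, V.T, lower=True).T  # equals V @ inv(R)

def lobpcg_single_vector_stable(A, B, T, x, maxiter=100, tol=1e-8):
    """Three-term recurrence with x^{i-1} implicitly replaced by p (a sketch)."""
    x = x / np.linalg.norm(x)
    p = None
    for _ in range(maxiter):
        rho = (x @ A @ x) / (x @ B @ x)
        r = A @ x - rho * (B @ x)
        if np.linalg.norm(r) < tol:
            break
        w = T @ r
        S = np.column_stack([x, w] if p is None else [x, w, p])
        _, C = eigh(S.T @ A @ S, S.T @ B @ S)
        c = C[:, -1]                # coefficients of the top Ritz vector
        # p keeps only the w- and p-components, so it has no x-component;
        # span{x_new, w_new, p_new} still contains the previous iterate
        # (provided c[0] != 0), while the basis stays well conditioned.
        p = S[:, 1:] @ c[1:]
        x = c[0] * x + p            # identical to S @ c
        nrm = np.linalg.norm(x)
        x, p = x / nrm, p / nrm
    return rho, x
</syntaxhighlight>

In a block implementation, the helper would be applied to W^{i} and P^{i} separately before the Rayleigh–Ritz step, rather than to the whole subspace basis.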
====Locking of previously converged eigenvectors====
Block methods for eigenvalue problems that iterate subspaces commonly have some of the iterative eigenvectors converge faster than others, which motivates locking the already converged eigenvectors, i.e., removing them from the iterative loop, in order to eliminate unnecessary computations and improve numerical stability. A simple removal of an eigenvector is likely to result in forming a duplicate of it among the still iterating vectors. The fact that the eigenvectors of symmetric eigenvalue problems are pair-wise orthogonal suggests keeping all iterative vectors orthogonal to the locked vectors. Locking can be implemented in different ways, maintaining numerical accuracy and stability while minimizing the compute costs. For example, LOBPCG implementations separate hard locking, i.e., a deflation by restriction, where the locked eigenvectors serve as a code input and do not change, from soft locking, where the locked vectors do not participate in the typically most expensive iterative step of computing the residuals, but fully participate in the Rayleigh–Ritz method and thus are allowed to be changed by it.
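A sketch of the projection behind hard locking (deflation by restriction), under the illustrative assumption that the locked block Y is already B-orthonormal; the function name is hypothetical.

<syntaxhighlight lang="python">
import numpy as np

def deflate_against_locked(W, Y, B):
    """Make the columns of the active block W B-orthogonal to the locked
    eigenvectors Y, assuming Y.T @ B @ Y = I; applied every iteration,
    e.g., to the residuals, it prevents the active vectors from drifting
    back toward already converged eigenvectors."""
    return W - Y @ (Y.T @ (B @ W))
</syntaxhighlight>

Soft locking, by contrast, needs no extra projection: the locked vectors are simply kept in the Rayleigh–Ritz basis while their residuals are no longer computed.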
====Modifications, LOBPCG II====
LOBPCG includes all columns of the matrices X^{i},\, W^{i}, and P^{i} in the Rayleigh–Ritz method, resulting in an up to 3k-by-3k eigenvalue problem that needs to be solved and up to 9k^2 dot products to compute at every iteration, where k denotes the block size, i.e., the number of columns. For large block sizes k this starts dominating compute and I/O costs and limiting parallelization, where multiple compute devices run simultaneously. The original LOBPCG paper goes further, applying the LOBPCG algorithm to each approximate eigenvector separately, i.e., running the unblocked version of the LOBPCG method for each desired eigenpair for a fixed number of iterations. The Rayleigh–Ritz procedures in these runs only need to solve a set of 3-by-3 projected eigenvalue problems. The global Rayleigh–Ritz procedure for all desired eigenpairs is only applied periodically, at the end of a fixed number of unblocked LOBPCG iterations. Such modifications may be less robust compared to the original LOBPCG. Individually running branches of the single-vector LOBPCG may not follow continuous iterative paths, flipping instead and creating duplicated approximations to the same eigenvector. The single-vector LOBPCG may be unsuitable for clustered eigenvalues, but separate small-block LOBPCG runs require determining their block sizes automatically during the iterations, since the number of the clusters of eigenvalues and their sizes may be unknown a priori.
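A rough sketch of this strategy, reusing the hypothetical single-vector routine from the stability sketch above (all names and the fixed iteration counts are illustrative):

<syntaxhighlight lang="python">
import numpy as np
from scipy.linalg import eigh

def lobpcg_ii(A, B, T, X, outer=10, inner=5):
    """LOBPCG II strategy (a sketch): iterate each column of X by the
    unblocked method, whose Rayleigh-Ritz problems are only 3-by-3, and
    apply a global Rayleigh-Ritz over all k columns periodically."""
    for _ in range(outer):
        # Independent unblocked runs, one per desired eigenpair
        # (lobpcg_single_vector_stable is defined in the earlier sketch)
        for j in range(X.shape[1]):
            _, X[:, j] = lobpcg_single_vector_stable(A, B, T, X[:, j],
                                                     maxiter=inner)
        # The periodic global Rayleigh-Ritz separates the branches and
        # removes duplicated approximations to the same eigenvector
        lam, C = eigh(X.T @ A @ X, X.T @ B @ X)
        X = X @ C[:, ::-1]            # order by descending Ritz value
    return lam[::-1], X
</syntaxhighlight>

==Convergence theory and practice==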