===Rank decomposition===
Let {{tmath| r }} denote the rank of {{tmath| A \in \mathbb{K}^{m\times n} }}. Then {{tmath| A }} can be (rank) decomposed as A = BC where {{tmath| B \in \mathbb{K}^{m\times r} }} and {{tmath| C \in \mathbb{K}^{r\times n} }} are of rank {{tmath| r }}. Then A^+ = C^+B^+ = C^*\left(CC^*\right)^{-1}\left(B^*B\right)^{-1}B^*.
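A minimal sketch of this formula in Python, assuming a rank decomposition is available (here obtained from a truncated SVD, which is only one way to get {{tmath| B }} and {{tmath| C }}; the helper name is hypothetical):

```python
import numpy as np

def pinv_rank_decomposition(A, tol=1e-12):
    """Sketch: A^+ = C*(C C*)^{-1} (B*B)^{-1} B* from a rank decomposition A = BC."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    r = int(np.sum(s > tol * s[0]))     # numerical rank
    B = U[:, :r] * s[:r]                # B is m x r
    C = Vh[:r, :]                       # C is r x n, so A = B C
    Bs, Cs = B.conj().T, C.conj().T
    return Cs @ np.linalg.inv(C @ Cs) @ np.linalg.inv(Bs @ B) @ Bs

A = np.array([[1., 2., 3.],
              [2., 4., 6.],            # rank-deficient: row 2 = 2 * row 1
              [1., 0., 1.]])
print(np.allclose(pinv_rank_decomposition(A), np.linalg.pinv(A)))  # True
```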
===The QR method===
For \mathbb{K} \in \{ \mathbb{R}, \mathbb{C}\}, computing the product A^*A or AA^* and their inverses explicitly is often a source of numerical rounding errors and of computational cost in practice. An alternative approach using the QR decomposition of {{tmath| A }} may be used instead. Consider the case when {{tmath| A }} is of full column rank, so that A^+ = \left(A^*A\right)^{-1}A^*. Then the Cholesky decomposition A^*A = R^*R, where {{tmath| R }} is an upper triangular matrix, may be used. Multiplication by the inverse is then done easily by solving a system with multiple right-hand sides, A^+ = \left(A^*A\right)^{-1}A^* \quad \Leftrightarrow \quad \left(A^*A\right)A^+ = A^* \quad \Leftrightarrow \quad R^*RA^+ = A^*, which may be solved by forward substitution followed by back substitution. The Cholesky decomposition may be computed without forming A^*A explicitly, by alternatively using the QR decomposition of A = QR, where {{tmath| Q }} has orthonormal columns, Q^*Q = I, and {{tmath| R }} is upper triangular. Then A^*A \,=\, (QR)^*(QR) \,=\, R^*Q^*QR \,=\, R^*R, so {{tmath| R }} is the Cholesky factor of A^*A. The case of full row rank is treated similarly by using the formula A^+ = A^*\left(AA^*\right)^{-1} and swapping the roles of {{tmath| A }} and {{tmath| A^* }}.
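The two triangular solves can be sketched as follows (a minimal teaching implementation for the full-column-rank case; production code would call a library triangular solver rather than hand-written loops):

```python
import numpy as np

def solve_lower(L, B):
    """Forward substitution: solve L Y = B for lower-triangular L."""
    n = L.shape[0]
    Y = np.zeros_like(B)
    for i in range(n):
        Y[i] = (B[i] - L[i, :i] @ Y[:i]) / L[i, i]
    return Y

def solve_upper(U, B):
    """Back substitution: solve U X = B for upper-triangular U."""
    n = U.shape[0]
    X = np.zeros_like(B)
    for i in range(n - 1, -1, -1):
        X[i] = (B[i] - U[i, i+1:] @ X[i+1:]) / U[i, i]
    return X

def pinv_qr(A):
    """Pseudoinverse of a full-column-rank A via R^* R A^+ = A^*."""
    _, R = np.linalg.qr(A)                   # reduced QR: A = QR, R upper triangular
    Y = solve_lower(R.conj().T, A.conj().T)  # R^* Y = A^*   (forward substitution)
    return solve_upper(R, Y)                 # R A^+ = Y     (back substitution)

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))              # full column rank (almost surely)
print(np.allclose(pinv_qr(A), np.linalg.pinv(A)))  # True
```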
===Using polynomials in matrices===
For an arbitrary {{tmath| A \in \mathbb{K}^{m\times n} }}, the matrix A^*A is normal and, as a consequence, an EP matrix. One can then find a polynomial p(t) such that (A^*A)^+=p(A^*A). In this case the pseudoinverse of {{tmath| A }} is given by A^+ = p(A^*A)A^*.

===The singular value decomposition (SVD)===
If A = U\Sigma V^* is the singular value decomposition of {{tmath| A }}, then A^+ = V\Sigma^+ U^*. For a
rectangular diagonal matrix such as {{tmath| \Sigma }}, we get the pseudoinverse by transposing it and taking the reciprocal of each non-zero element on the diagonal, leaving the zeros in place. In numerical computation, only elements larger than some small tolerance are taken to be nonzero, and the others are replaced by zeros. For example, in the MATLAB or GNU Octave function pinv, the tolerance is taken to be t = \varepsilon\cdot\max(m,n)\cdot\max(\Sigma), where \varepsilon is the machine epsilon. The computational cost of this method is dominated by the cost of computing the SVD, which is several times higher than matrix–matrix multiplication, even if a state-of-the-art implementation (such as that of LAPACK) is used.

The above procedure shows why taking the pseudoinverse is not a continuous operation: if the original matrix {{tmath| A }} has a singular value 0 (a diagonal entry of the matrix {{tmath| \Sigma }} above), then modifying {{tmath| A }} slightly may turn this zero into a tiny positive number, thereby affecting the pseudoinverse dramatically, as we now have to take the reciprocal of a tiny number.
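The SVD route can be sketched as follows; the default cutoff below mimics the \varepsilon\cdot\max(m,n)\cdot\sigma_{\max} convention (an assumption for illustration; library defaults differ in detail):

```python
import numpy as np

def pinv_svd(A, rcond=None):
    """Sketch: A^+ = V Sigma^+ U^*, inverting only singular values above a cutoff."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    if rcond is None:
        rcond = np.finfo(A.dtype).eps * max(A.shape)
    large = s > rcond * s.max()         # singular values treated as nonzero
    s_inv = np.zeros_like(s)
    s_inv[large] = 1.0 / s[large]       # reciprocal of the retained values
    return Vh.conj().T @ (s_inv[:, None] * U.conj().T)

A = np.array([[2., 0.,    0.],
              [0., 1e-20, 0.]])         # second singular value falls below the cutoff
print(pinv_svd(A)[0, 0])                # 0.5: only the large singular value is inverted
```

The tiny singular value is zeroed rather than inverted, which is exactly the thresholding that makes the numerical pseudoinverse well behaved despite the discontinuity discussed above.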
===Block matrices===
Optimized approaches exist for calculating the pseudoinverse of block-structured matrices.
===The iterative method of Ben-Israel and Cohen===
Another method for computing the pseudoinverse (cf. Drazin inverse) uses the recursion A_{i+1} = 2A_i - A_i A A_i, which is sometimes referred to as the hyper-power sequence. This recursion produces a sequence \left(A_i\right) converging quadratically to the pseudoinverse of {{tmath| A }} if it is started with an appropriate A_0 satisfying A_0 A = \left(A_0 A\right)^*. The choice A_0 = \alpha A^* (where 0 < \alpha < 2/\sigma_1^2(A), with \sigma_1(A) denoting the largest singular value of {{tmath| A }}) has been argued not to be competitive with the method using the SVD mentioned above, because even for moderately ill-conditioned matrices it takes a long time before A_i enters the region of quadratic convergence. However, if started with A_0 already close to the Moore–Penrose inverse and satisfying A_0 A = \left(A_0 A\right)^*, for example A_0 := \left(A^* A + \delta I\right)^{-1} A^*, convergence is fast (quadratic).
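A minimal sketch of the iteration, using the admissible starting value \alpha = 1/\sigma_1^2(A) (so that \alpha A^* A has spectrum in (0, 1]) and a fixed iteration count chosen for illustration:

```python
import numpy as np

def pinv_hyperpower(A, iters=100):
    """Sketch of the Ben-Israel-Cohen recursion A_{i+1} = 2 A_i - A_i A A_i."""
    sigma1 = np.linalg.norm(A, 2)        # largest singular value of A
    X = (1.0 / sigma1**2) * A.conj().T   # A_0 = alpha A* with alpha = 1/sigma_1^2
    for _ in range(iters):
        X = 2 * X - X @ A @ X            # quadratically convergent near the limit
    return X

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))
print(np.allclose(pinv_hyperpower(A), np.linalg.pinv(A)))  # True
```

Once X reaches the pseudoinverse, it is a fixed point of the recursion (2A^+ - A^+AA^+ = A^+), so extra iterations are harmless; the practical issue noted above is how many iterations a poorly conditioned A needs before quadratic convergence sets in.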
===Updating the pseudoinverse===
For the cases where {{tmath| A }} has full row or column rank, and the inverse of the correlation matrix (AA^* for {{tmath| A }} with full row rank or A^*A for full column rank) is already known, the pseudoinverse for matrices related to {{tmath| A }} can be computed by applying the Sherman–Morrison–Woodbury formula to update the inverse of the correlation matrix, which may need less work. In particular, if the related matrix differs from the original one by only a changed, added or deleted row or column, incremental algorithms exist that exploit the relationship. Similarly, it is possible to update the Cholesky factor when a row or column is added, without creating the inverse of the correlation matrix explicitly. However, updating the pseudoinverse in the general rank-deficient case is much more complicated.
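As an illustration of the rank-one update idea (a sketch for the real, full-column-rank case; the helper name is hypothetical): appending a row a^* to {{tmath| A }} changes the correlation matrix to A^*A + aa^*, so its known inverse can be refreshed with one Sherman–Morrison step instead of a full refactorization:

```python
import numpy as np

def append_row_update(A, G, a):
    """Append row a to full-column-rank A, updating G = (A^T A)^{-1} in O(n^2).

    Returns (A_new, G_new, pinv_new) with pinv_new = G_new @ A_new^T."""
    a = np.asarray(a, dtype=float)
    Ga = G @ a
    G_new = G - np.outer(Ga, Ga) / (1.0 + a @ Ga)   # Sherman-Morrison step
    A_new = np.vstack([A, a])
    return A_new, G_new, G_new @ A_new.T            # A^+ = (A^T A)^{-1} A^T

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3))
G = np.linalg.inv(A.T @ A)                          # known inverse of the correlation matrix
A_new, G_new, P = append_row_update(A, G, rng.standard_normal(3))
print(np.allclose(P, np.linalg.pinv(A_new)))        # True
```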
===Software libraries===
High-quality implementations of SVD, QR, and back substitution are available in standard libraries, such as LAPACK. Writing one's own implementation of SVD is a major programming project that requires significant numerical expertise. In special circumstances, such as parallel computing or embedded computing, however, alternative implementations by QR or even the use of an explicit inverse might be preferable, and custom implementations may be unavoidable.

The Python package
NumPy provides a pseudoinverse calculation through its functions matrix.I and linalg.pinv; its pinv uses the SVD-based algorithm.
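A brief usage example of the NumPy routine just mentioned (np.linalg.pinv is the SVD-based implementation; its rcond argument controls the singular-value cutoff):

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.],
              [5., 6.]])
P = np.linalg.pinv(A)                   # SVD-based pseudoinverse
# The result satisfies the Moore-Penrose conditions, e.g. A P A = A and P A P = P:
print(np.allclose(A @ P @ A, A), np.allclose(P @ A @ P, P))  # True True
```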
SciPy adds a function scipy.linalg.pinv that uses a least-squares solver. The MASS package for
R provides a calculation of the Moore–Penrose inverse through the ginv function. The ginv function calculates a pseudoinverse using the singular value decomposition provided by the svd function in the base R package. An alternative is to employ the pinv function available in the pracma package. The
Octave programming language provides a pseudoinverse through the standard package function pinv and the pseudo_inverse() method. In
Julia, the LinearAlgebra package of the standard library provides the Moore–Penrose inverse through pinv(), implemented via the singular-value decomposition. In
Wolfram Mathematica the built-in function PseudoInverse works for both symbolic and numerical matrices. In the numerical case the SVD is used, and a tolerance parameter t is provided to specify the minimum singular value t\, \sigma_{\rm max} that should be retained. By default t = 10^{2-p}, where p is the precision of the matrix.

==Applications==