Computational complexity of matrix multiplication

In theoretical computer science, the computational complexity of matrix multiplication dictates how quickly the operation of matrix multiplication can be performed. Matrix multiplication algorithms are a central subroutine in theoretical and numerical algorithms for numerical linear algebra and optimization, so finding the fastest algorithm for matrix multiplication is of major practical relevance.

Simple algorithms

If A and B are two n \times n matrices over a field, then their product AB is also an n \times n matrix over that field, defined entrywise as

(AB)_{ij} = \sum_{k = 1}^n A_{ik} B_{kj}.

Schoolbook algorithm

The simplest approach to computing the product of two n \times n matrices A and B is to compute the arithmetic expressions coming from the definition of matrix multiplication. In pseudocode:

    input A and B, both n by n matrices
    initialize C to be an n by n matrix of all zeros
    for i from 1 to n:
        for j from 1 to n:
            for k from 1 to n:
                C[i][j] = C[i][j] + A[i][k]*B[k][j]
    output C (as A*B)

This algorithm requires n^3 multiplications and n^3 - n^2 additions of scalars for computing the product of two square n \times n matrices (n multiplications and n - 1 additions for each of the n^2 entries of the product). Its computational complexity is therefore O(n^3), in a model of computation where field operations (addition and multiplication) take constant time (in practice, this is the case for floating point numbers, but not necessarily for integers).

Strassen's algorithm

Strassen's algorithm improves on naive matrix multiplication through a divide-and-conquer approach. The key observation is that multiplying two 2 \times 2 matrices can be done with only seven multiplications, instead of the usual eight (at the expense of 11 extra additions and subtractions). This means that, treating the input n \times n matrices as 2 \times 2 block matrices, the task of multiplying two n \times n matrices can be reduced to seven subproblems of multiplying two n/2 \times n/2 matrices. Applying this recursively gives an algorithm needing O(n^{\log_{2}7}) \approx O(n^{2.807}) field operations.

Unlike algorithms with faster asymptotic complexity, Strassen's algorithm is used in practice. Its numerical stability is reduced compared to the naive algorithm, but it is faster in cases where n > 100 or so and appears in several libraries, such as BLAS. Fast matrix multiplication algorithms cannot achieve component-wise stability, but some can be shown to exhibit norm-wise stability. Strassen's algorithm is very useful for large matrices over exact domains such as finite fields, where numerical stability is not an issue.
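The two algorithms described in this section can be sketched in Python as follows. This is a minimal illustration, not a tuned implementation. It assumes square matrices stored as lists of lists with exact (integer) entries. The helper names `schoolbook`, `strassen`, `split`, `add`, `sub` and the `cutoff` parameter are choices made for this sketch. The seven products are Strassen's original ones, which use 18 block additions and subtractions rather than the 15 of the variant counted in the text. Falling back to the schoolbook loop for odd dimensions is also an assumption of the sketch.

```python
def schoolbook(A, B):
    """Multiply two n x n matrices with the triple loop from the definition."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C


def add(X, Y):
    """Entrywise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def sub(X, Y):
    """Entrywise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split(X):
    """Split an even-dimensional matrix into four equal blocks."""
    h = len(X) // 2
    return ([r[:h] for r in X[:h]], [r[h:] for r in X[:h]],
            [r[:h] for r in X[h:]], [r[h:] for r in X[h:]])


def strassen(A, B, cutoff=2):
    """Strassen's recursion: seven half-size products instead of eight."""
    n = len(A)
    # Small or odd-sized inputs fall back to the schoolbook loop
    # (a simplification made for this sketch).
    if n <= cutoff or n % 2:
        return schoolbook(A, B)
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    M1 = strassen(add(A11, A22), add(B11, B22), cutoff)
    M2 = strassen(add(A21, A22), B11, cutoff)
    M3 = strassen(A11, sub(B12, B22), cutoff)
    M4 = strassen(A22, sub(B21, B11), cutoff)
    M5 = strassen(add(A11, A12), B22, cutoff)
    M6 = strassen(sub(A21, A11), add(B11, B12), cutoff)
    M7 = strassen(sub(A12, A22), add(B21, B22), cutoff)
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)
    # Reassemble the four result blocks into one matrix.
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom


print(strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]], cutoff=1))  # [[19, 22], [43, 50]]
```

In practice the cutoff would be chosen much larger, since the recursion only pays off once the blocks are big enough to amortize the extra additions.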

Matrix multiplication exponent

The matrix multiplication exponent, usually denoted \omega, is the smallest real number for which any two n\times n matrices over a field can be multiplied together using n^{\omega + o(1)} field operations. This notation is commonly used in algorithms research, so that algorithms using matrix multiplication as a subroutine have bounds on running time that can update as bounds on \omega improve.

Using a naive lower bound and schoolbook matrix multiplication for the upper bound, one can straightforwardly conclude that 2 \le \omega \le 3. Whether \omega = 2 is a major open question in theoretical computer science, and there is a line of research developing matrix multiplication algorithms to get improved bounds on \omega.

All recent algorithms in this line of research use the laser method, a generalization of the Coppersmith–Winograd algorithm, which was given by Don Coppersmith and Shmuel Winograd in 1990 and was the best matrix multiplication algorithm until 2010. The conceptual idea of these algorithms is similar to Strassen's algorithm: a method is devised for multiplying two k \times k matrices with fewer than k^3 multiplications, and this technique is applied recursively. The laser method has limitations to its power: Ambainis, Filmus and Le Gall proved that it cannot be used to show that \omega < 2.3725 by analyzing higher and higher tensor powers of a certain identity of Coppersmith and Winograd, nor that \omega < 2.3078 for a wide class of variants of this approach. In 2022, Duan, Wu and Zhou devised a variant breaking the first of the two barriers, with \omega < 2.37188.

Group theory reformulation of matrix multiplication algorithms

Henry Cohn, Robert Kleinberg, Balázs Szegedy and Chris Umans put methods such as the Strassen and Coppersmith–Winograd algorithms in an entirely different group-theoretic context, by utilising triples of subsets of finite groups which satisfy a disjointness property called the triple product property (TPP). They also give conjectures that, if true, would imply that there are matrix multiplication algorithms with essentially quadratic complexity. This implies that the optimal exponent of matrix multiplication is 2, which most researchers believe is indeed the case.

Several of their conjectures have since been disproven by Blasiak, Cohn, Church, Grochow, Naslund, Sawin, and Umans using the slice rank method. Further, Alon, Shpilka and Umans have shown that some of these conjectures implying fast matrix multiplication are incompatible with another plausible conjecture, the sunflower conjecture, which in turn is related to the cap set problem.

Lower bounds for ω

It is known that, under the model of computation typically studied, there is no matrix multiplication algorithm that uses precisely O(n^\omega) operations; there must be an additional factor of n^{o(1)}.

Rectangular matrix multiplication

Similar techniques also apply to rectangular matrix multiplication. The central object of study is \omega(k), the smallest real number c such that an n \times \lceil n^k \rceil matrix can be multiplied by an \lceil n^k \rceil \times n matrix using n^{c + o(1)} field operations. This generalizes the square matrix multiplication exponent, since \omega(1) = \omega.

Since the output of the matrix multiplication problem has size n^2, we have \omega(k) \geq 2 for all values of k. If one can prove for some values of k between 0 and 1 that \omega(k) \leq 2, then such a result shows that \omega(k) = 2 for those k. The largest k such that \omega(k) = 2 is known as the dual matrix multiplication exponent, usually denoted α. It is referred to as the "dual" because showing that \alpha = 1 is equivalent to showing that \omega = 2. Like the matrix multiplication exponent, the dual matrix multiplication exponent sometimes appears in the complexity of algorithms in numerical linear algebra and optimization.

The first bound on α is by Coppersmith in 1982, who showed that \alpha > 0.17227. The current best peer-reviewed bound on α is \alpha \geq 0.321334, given by Williams, Xu, Xu, and Zhou.
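As a small illustration of the recursive idea described earlier in this section: if k \times k matrices can be multiplied with m scalar multiplications by a scheme that does not rely on commutativity, applying the scheme recursively to blocks gives an algorithm with exponent \log_k m. The sketch below only evaluates that formula. The function name `exponent_from_base_case` is hypothetical, and the laser-method bounds quoted above come from much more involved analyses, not from this simple calculation.

```python
import math


def exponent_from_base_case(k, m):
    """Exponent log_k(m) from recursing on a k x k scheme with m multiplications."""
    if k < 2 or m < 1:
        raise ValueError("need k >= 2 and m >= 1")
    return math.log(m) / math.log(k)


# Schoolbook 2 x 2 multiplication (8 products) versus Strassen's scheme (7 products).
for k, m in [(2, 8), (2, 7)]:
    print(k, m, round(exponent_from_base_case(k, m), 3))
```

With eight products the recursion recovers the schoolbook exponent 3, and with seven it gives Strassen's exponent of about 2.807.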

Related problems

Problems that have the same asymptotic complexity as matrix multiplication include determinant, matrix inversion, and Gaussian elimination (see below). Problems with complexity that is expressible in terms of \omega include characteristic polynomial, eigenvalues (but not eigenvectors), Hermite normal form, and Smith normal form.

Matrix inversion, determinant and Gaussian elimination

In his 1969 paper, where he proved the complexity O(n^{\log_2 7}) \approx O(n^{2.807}) for matrix multiplication, Strassen also proved that matrix inversion, determinant and Gaussian elimination have, up to a multiplicative constant, the same computational complexity as matrix multiplication. The proof does not make any assumptions on the matrix multiplication algorithm that is used, except that its complexity is O(n^\omega) for some \omega \ge 2.

The starting point of Strassen's proof is block matrix multiplication. Specifically, a matrix of even dimension 2n \times 2n may be partitioned into four n \times n blocks

\begin{bmatrix} A & B \\ C & D \end{bmatrix}.

Under this form, its inverse is

\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} = \begin{bmatrix} A^{-1}+A^{-1}B(D-CA^{-1}B)^{-1}CA^{-1} & -A^{-1}B(D-CA^{-1}B)^{-1} \\ -(D-CA^{-1}B)^{-1}CA^{-1} & (D-CA^{-1}B)^{-1} \end{bmatrix},

provided that A and D-CA^{-1}B are invertible.

Thus, the inverse of a 2n \times 2n matrix may be computed with two inversions, six multiplications and four additions or additive inverses of n \times n matrices. It follows that, denoting respectively by I(n), M(n) and A(n) the number of operations needed for inverting, multiplying and adding n \times n matrices, one has

I(2n) \le 2I(n) + 6M(n) + 4A(n).
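The block inversion formula above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not Strassen's own presentation. It assumes the dimension is a power of two, that every block that has to be inverted is invertible, and exact rational arithmetic via `fractions.Fraction`. The names `block_inverse`, `matmul`, `add`, `sub` and `neg` are chosen for this sketch. A plain triple-loop product stands in for whichever fast multiplication algorithm is used. Each level performs two recursive inversions, six block multiplications, and four block additions or additive inverses, matching the count in the text.

```python
from fractions import Fraction


def matmul(X, Y):
    """Plain matrix product; a stand-in for any fast multiplication routine."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def neg(X):
    return [[-x for x in row] for row in X]


def block_inverse(X):
    """Invert X recursively via the 2 x 2 block formula (assumes invertible blocks)."""
    n = len(X)
    if n == 1:
        # Raises ZeroDivisionError if the 1 x 1 block is not invertible.
        return [[Fraction(1) / X[0][0]]]
    h = n // 2
    A = [r[:h] for r in X[:h]]
    B = [r[h:] for r in X[:h]]
    C = [r[:h] for r in X[h:]]
    D = [r[h:] for r in X[h:]]
    Ai = block_inverse(A)                    # inversion 1
    CAi = matmul(C, Ai)                      # multiplication 1
    AiB = matmul(Ai, B)                      # multiplication 2
    S = sub(D, matmul(C, AiB))               # multiplication 3, addition 1: D - C A^{-1} B
    Si = block_inverse(S)                    # inversion 2
    AiBSi = matmul(AiB, Si)                  # multiplication 4
    SiCAi = matmul(Si, CAi)                  # multiplication 5
    top_left = add(Ai, matmul(AiB, SiCAi))   # multiplication 6, addition 2
    top_right = neg(AiBSi)                   # additive inverse 1
    bottom_left = neg(SiCAi)                 # additive inverse 2
    top = [r1 + r2 for r1, r2 in zip(top_left, top_right)]
    bottom = [r1 + r2 for r1, r2 in zip(bottom_left, Si)]
    return top + bottom
```

For example, calling `block_inverse` on a symmetric positive definite matrix such as the 4 \times 4 tridiagonal matrix with 2 on the diagonal and 1 next to it satisfies the invertibility assumption at every level, and multiplying the result by the input with `matmul` returns the identity.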
If n=2^k, the recurrence I(2n) \le 2I(n) + 6M(n) + 4A(n) may be applied recursively:

\begin{align} I(2^k) &\le 2I(2^{k-1}) + 6M(2^{k-1})+ 4 A(2^{k-1})\\ &\le 2^2I(2^{k-2}) + 6(M(2^{k-1})+2M(2^{k-2})) + 4(A(2^{k-1}) + 2A(2^{k-2}))\\ &\,\,\,\vdots \end{align}

If M(n)\le cn^\omega and \alpha=2^\omega\ge 4, then using A(n) = n^2 and I(1) = 1 one eventually gets

\begin{align} I(2^k) &\le 2^k I(1) + 6c(\alpha^{k-1}+2\alpha^{k-2} + \cdots +2^{k-1}\alpha^0) + 2^{k+1}(2^k - 1)\\ &\le 2^k + 6c\frac{\alpha^k-2^k}{\alpha-2} + 2^{2k+1}\\ &\le d(2^k)^\omega \end{align}

for some constant d. The last step uses \alpha - 2 \ge 2 and 4^k \le \alpha^k, so that d = 3c + 3 suffices.

For matrices whose dimension is not a power of two, the same complexity is reached by increasing the dimension of the matrix to a power of two, by padding the matrix with rows and columns whose entries are 1 on the diagonal and 0 elsewhere.

This proves the asserted complexity for matrices such that all submatrices that have to be inverted are indeed invertible. This complexity is thus proved for almost all matrices, as a matrix with randomly chosen entries is invertible with probability one.

The same argument applies to LU decomposition, as, if the matrix A is invertible, the equality

\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} I & 0 \\ CA^{-1} & I \end{bmatrix}\,\begin{bmatrix} A & B \\ 0 & D-CA^{-1}B \end{bmatrix}

defines a block LU decomposition that may be applied recursively to A and D-CA^{-1}B, eventually yielding a true LU decomposition of the original matrix.

The argument applies also to the determinant, since it follows from the block LU decomposition that

\det \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \det(A)\det(D-CA^{-1}B).

Minimizing number of multiplications

Related to the problem of minimizing the number of arithmetic operations is minimizing the number of multiplications, which is typically a more costly operation than addition. An O(n^\omega) algorithm for matrix multiplication must necessarily use only O(n^\omega) multiplication operations, but these algorithms are impractical. Improving on the naive n^3 multiplications of schoolbook multiplication, 4\times 4 matrices over \mathbb{Z}/2\mathbb{Z} can be multiplied with 47 multiplications, and 3\times 3 matrices over a commutative ring can be multiplied with 21 multiplications (23 if the ring is non-commutative).

The lower bound on the number of multiplications needed is 2mn+2n-m-2 (for multiplying n\times m matrices with m\times n matrices, obtained with the substitution method, for m \ge n \ge 3), which means that the case n=3 requires at least 19 multiplications and the case n=4 at least 34. For n=2, seven multiplications and 15 additions are optimal, compared to only four additions when eight multiplications are used.
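The arithmetic behind the lower bound quoted above can be checked with a few lines of Python. This is only a sketch that evaluates the formula 2mn+2n-m-2 in its stated range m \ge n \ge 3. The function name `substitution_lower_bound` is hypothetical.

```python
def substitution_lower_bound(n, m):
    """Evaluate 2mn + 2n - m - 2 for n x m times m x n products, m >= n >= 3."""
    if not (m >= n >= 3):
        raise ValueError("the bound is stated for m >= n >= 3")
    return 2 * m * n + 2 * n - m - 2


print(substitution_lower_bound(3, 3))  # 19
print(substitution_lower_bound(4, 4))  # 34
```

For square 3\times 3 and 4\times 4 matrices this gives 19 and 34, leaving a gap to the 21 to 23 and 47 multiplications achieved by the schemes mentioned in the text.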

Source: Wikipedia ↗
