Algorithms exist that provide better running times than the straightforward ones. The first to be discovered was
Strassen's algorithm, devised by
Volker Strassen in 1969 and often referred to as "fast matrix multiplication". It is based on a way of multiplying two 2×2 matrices which requires only 7 multiplications (instead of the usual 8), at the expense of several additional addition and subtraction operations. Applying this recursively gives an algorithm with a multiplicative cost of O(n^{\log_2 7}) \approx O(n^{2.807}). Strassen's algorithm is more complex, and the
numerical stability is reduced compared to the naïve algorithm, but it is faster in cases where n > 100 or so. It is very useful for large matrices over exact domains such as
finite fields, where numerical stability is not an issue.
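To make the recursion explicit, here is a minimal Python sketch, assuming square matrices whose size is a power of two; the function name, the cutoff, and the padding-free setup are illustrative choices, not part of Strassen's original formulation:

<syntaxhighlight lang="python">
import numpy as np

def strassen(A, B, cutoff=64):
    """Multiply square matrices A and B (size a power of two) using
    Strassen's 7-multiplication recursion; fall back to the naive
    product below the cutoff, where recursion overhead dominates."""
    n = A.shape[0]
    if n <= cutoff:
        return A @ B
    k = n // 2
    A11, A12, A21, A22 = A[:k, :k], A[:k, k:], A[k:, :k], A[k:, k:]
    B11, B12, B21, B22 = B[:k, :k], B[:k, k:], B[k:, :k], B[k:, k:]
    # The seven block products (Strassen 1969).
    M1 = strassen(A11 + A22, B11 + B22, cutoff)
    M2 = strassen(A21 + A22, B11, cutoff)
    M3 = strassen(A11, B12 - B22, cutoff)
    M4 = strassen(A22, B21 - B11, cutoff)
    M5 = strassen(A11 + A12, B22, cutoff)
    M6 = strassen(A21 - A11, B11 + B12, cutoff)
    M7 = strassen(A12 - A22, B21 + B22, cutoff)
    # Reassemble the four blocks of the product from the seven M's.
    C = np.empty_like(A)
    C[:k, :k] = M1 + M4 - M5 + M7
    C[:k, k:] = M3 + M5
    C[k:, :k] = M2 + M4
    C[k:, k:] = M1 - M2 + M3 + M6
    return C
</syntaxhighlight>

The base-case cutoff reflects the trade-off noted above: below a few dozen rows the naive product's lower overhead wins, so practical implementations typically switch back to classical multiplication at small sizes.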
Since Strassen's algorithm is actually used in practical numerical software and computer algebra systems, improving on the constants hidden in the big-O notation has its merits. It is known that a Strassen-like algorithm with a 2×2-block matrix step requires at least 7 block matrix multiplications. In 1976 Probert showed that such an algorithm requires at least 15 additions (including subtractions); however, a hidden assumption was that the blocks and the 2×2-block matrix are represented in the same basis. Karstadt and Schwartz computed in different bases and traded 3 additions for less expensive basis transformations. They also proved that one cannot go below 12 additions per step using different bases. In subsequent work Beniamini et al. applied this basis-change trick to more general decompositions than 2×2-block matrices and improved the leading constant for their run times. It is an open question in
theoretical computer science how well Strassen's algorithm can be improved in terms of
asymptotic complexity. The
matrix multiplication exponent, usually denoted \omega, is the smallest
real number for which any two n\times n matrices over a field can be multiplied together using n^{\omega + o(1)} field operations. The current best bound on \omega is \omega < 2.371339, by Alman, Duan,
Williams, Xu, Xu, and Zhou. The conceptual idea of these algorithms is similar to Strassen's algorithm: a way is devised for multiplying two k\times k matrices with fewer than k^3 multiplications, and this technique is applied recursively. However, the constant coefficient hidden by the
big-O notation is so large that these algorithms are only worthwhile for matrices that are too large to handle on present-day computers. Victor Pan proposed so-called feasible sub-cubic matrix multiplication algorithms with an exponent slightly above 2.77, but in return with a much smaller hidden constant coefficient.
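Concretely, a scheme that multiplies two k\times k matrices with m < k^3 scalar multiplications yields, when applied recursively to k\times k blocks, the following bound on the exponent (a standard calculation, shown here for illustration):

\omega \le \log_k m, \qquad \text{e.g. } k = 2,\ m = 7 \ \Rightarrow\ n^{\log_2 7} \approx n^{2.807}.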
Freivalds' algorithm is a simple
Monte Carlo algorithm that, given n\times n matrices A, B, and C, verifies in \Theta(n^2) time whether AB = C.
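A minimal sketch of the idea in Python (the function name and round count are illustrative, and the exact equality test assumes integer matrices):

<syntaxhighlight lang="python">
import numpy as np

def freivalds(A, B, C, rounds=20):
    """Probabilistically check whether AB == C. Each round multiplies
    by a random 0/1 vector, costing O(n^2); a wrong product survives
    all rounds with probability at most 2**-rounds."""
    n = A.shape[0]
    rng = np.random.default_rng()
    for _ in range(rounds):
        r = rng.integers(0, 2, size=n)    # random vector in {0,1}^n
        # A @ (B @ r) uses two matrix-vector products, never AB itself.
        if not np.array_equal(A @ (B @ r), C @ r):
            return False                  # certainly AB != C
    return True                           # AB == C with high probability
</syntaxhighlight>

Each round uses only matrix-vector products, so the check stays quadratic even though recomputing AB from scratch would not be.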
===AlphaTensor===
In 2022,
DeepMind introduced AlphaTensor, a
neural network that used a single-player game analogy to invent thousands of matrix multiplication algorithms, including some previously discovered by humans and some that were not. Operations were restricted to standard arithmetic (normal arithmetic) and the
finite field \mathbb Z/2\mathbb Z (mod 2 arithmetic). The best "practical" (explicit low-rank decomposition of a matrix multiplication tensor) algorithm found ran in O(n^{2.778}). Finding low-rank decompositions of such tensors (and beyond) is NP-hard; optimal multiplication even for 3×3 matrices remains unknown, even in commutative fields.
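To make the tensor view concrete, the sketch below (illustrative, not DeepMind's code) checks that Strassen's scheme is precisely a rank-7 decomposition of the 2×2 matrix multiplication tensor, the kind of explicit decomposition AlphaTensor searches for:

<syntaxhighlight lang="python">
import numpy as np

# Strassen's rank-7 decomposition of the 2x2 matrix multiplication
# tensor: C = W.T @ ((U @ a) * (V @ b)), where a and b are the
# flattened inputs. Fewer rank-one terms = fewer multiplications.
U = np.array([[1, 0, 0, 1], [0, 0, 1, 1], [1, 0, 0, 0], [0, 0, 0, 1],
              [1, 1, 0, 0], [-1, 0, 1, 0], [0, 1, 0, -1]])
V = np.array([[1, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, -1], [-1, 0, 1, 0],
              [0, 0, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1]])
W = np.array([[1, 0, 0, 1], [0, 0, 1, -1], [0, 1, 0, 1], [1, 0, 1, 0],
              [-1, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0]])

rng = np.random.default_rng(0)
A, B = rng.integers(-9, 10, (2, 2)), rng.integers(-9, 10, (2, 2))
m = (U @ A.ravel()) * (V @ B.ravel())   # the 7 scalar multiplications
C = (W.T @ m).reshape(2, 2)
assert np.array_equal(C, A @ B)         # matches the ordinary product
</syntaxhighlight>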
Some algorithms were completely new: for example, (4, 5, 5) was improved to 76 steps from a baseline of 80 in both normal and mod 2 arithmetic.

==Parallel and distributed algorithms==