Although multiplication by the centering matrix is not a computationally efficient way of removing the mean from a vector, it is a convenient analytical tool. It can be used to remove the mean not only of a single vector, but also of multiple vectors stored in the rows or columns of an m-by-n matrix X.

The left multiplication by C_m subtracts the corresponding mean value from each of the n columns, so that each column of the product C_m\,X has zero mean. Similarly, the multiplication by C_n on the right subtracts the corresponding mean value from each of the m rows, so that each row of the product X\,C_n has zero mean. Multiplication on both sides creates a doubly centered matrix C_m\,X\,C_n, whose row and column means are all equal to zero.

The centering matrix provides in particular a succinct way to express the
scatter matrix, S=(X-\mu J_{n,1}^{\mathrm{T}})(X-\mu J_{n,1}^{\mathrm{T}})^{\mathrm{T}}, of a data sample X whose n columns are the observations, where \mu=\tfrac{1}{n}X J_{n,1} is the sample mean. Since X-\mu J_{n,1}^{\mathrm{T}}=X\,C_n, the scatter matrix can be written more compactly as

S=X\,C_n(X\,C_n)^{\mathrm{T}}=X\,C_n\,C_n\,X^{\mathrm{T}}=X\,C_n\,X^{\mathrm{T}},

where the second equality uses the symmetry of C_n and the third its idempotence.

C_n is the covariance matrix of the multinomial distribution, in the special case where the parameters of that distribution are k=n and p_1=p_2=\cdots=p_n=\frac{1}{n}.
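The properties described in this section can be checked numerically. The following is a minimal sketch in Python with NumPy, not a recommended implementation: the helper name `centering_matrix`, the sizes m = 3 and n = 5, and the random test matrix are all illustrative assumptions. It also assumes that the multinomial special case means n trials over n equally likely outcomes, and uses the standard multinomial covariance formula N(diag(p) - p p^T) for N trials.

```python
import numpy as np


def centering_matrix(n):
    """Return C_n = I_n - (1/n) J_n, with J_n the n-by-n all-ones matrix.

    The name is illustrative; it is not taken from any library.
    """
    return np.eye(n) - np.ones((n, n)) / n


# Illustrative sizes and data (assumptions, not from the text).
rng = np.random.default_rng(0)
m, n = 3, 5
X = rng.normal(size=(m, n))

C_m = centering_matrix(m)
C_n = centering_matrix(n)

# Left multiplication by C_m removes the mean of each of the n columns.
col_centered = C_m @ X
# Right multiplication by C_n removes the mean of each of the m rows.
row_centered = X @ C_n
# Multiplication on both sides gives the doubly centered matrix.
doubly_centered = C_m @ X @ C_n

# Scatter matrix from its definition: mu is m-by-1 and equals (1/n) X J_{n,1};
# broadcasting X - mu plays the role of X - mu J_{n,1}^T.
mu = X.mean(axis=1, keepdims=True)
deviations = X - mu
S_definition = deviations @ deviations.T
# Scatter matrix in the compact form X C_n X^T.
S_compact = X @ C_n @ X.T

# Multinomial covariance for n trials and n equally likely outcomes
# (assumed reading of the special case): N * (diag(p) - p p^T) with N = n.
p = np.full(n, 1.0 / n)
multinomial_cov = n * (np.diag(p) - np.outer(p, p))

print(np.allclose(col_centered.mean(axis=0), 0.0))
print(np.allclose(row_centered.mean(axis=1), 0.0))
print(np.allclose(S_definition, S_compact))
print(np.allclose(multinomial_cov, C_n))
```

In practice, subtracting the means directly, as in `X - X.mean(axis=1, keepdims=True)`, avoids forming the n-by-n matrix at all, which is why the centering matrix is mainly of analytical rather than computational interest.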