Let \hat{\mathbf \Sigma} be the sample covariance matrix:

\hat{\mathbf \Sigma} = \frac 1 {n-1} \sum_{i=1}^n \left(\mathbf{x}_i -\overline{\mathbf{x}}\right) \left(\mathbf{x}_i - \overline{\mathbf{x}}\right)'

where we denote transpose by an apostrophe. It can be shown that \hat{\mathbf \Sigma} is a positive semi-definite matrix and that (n-1)\hat{\mathbf \Sigma} follows a p-variate Wishart distribution with n-1 degrees of freedom. The sample covariance matrix of the mean reads \hat{\mathbf \Sigma}_\overline{\mathbf x}=\hat{\mathbf \Sigma}/n. '''Hotelling's ''t''-squared statistic''' is then defined as

t^2=(\overline{\mathbf x}-\boldsymbol{\mu})'\hat{\mathbf \Sigma}_\overline{\mathbf x}^{-1} (\overline{\mathbf x}-\boldsymbol{\mu})=n(\overline{\mathbf x}-\boldsymbol{\mu})'\hat{\mathbf \Sigma}^{-1} (\overline{\mathbf x}-\boldsymbol{\mu}),

which is proportional to the Mahalanobis distance between the sample mean and \boldsymbol{\mu}. Because of this, one should expect the statistic to take low values if \overline{\mathbf x} \approx \boldsymbol{\mu}, and high values if the two differ.
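For concreteness, here is a minimal Python sketch of the computation just defined (the data matrix X, the sample size n = 30, the dimension p = 3 and the hypothesized mean mu0 are made-up inputs for illustration; only NumPy is assumed):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Made-up example data: n = 30 observations of a p = 3 dimensional vector.
n, p = 30, 3
X = rng.normal(size=(n, p))      # rows are the observations x_i
mu0 = np.zeros(p)                # hypothesized mean vector mu

x_bar = X.mean(axis=0)           # sample mean, a p-vector
S = np.cov(X, rowvar=False)      # sample covariance (divides by n - 1)

# t^2 = n (x_bar - mu)' S^{-1} (x_bar - mu)
diff = x_bar - mu0
t2 = n * diff @ np.linalg.solve(S, diff)
print(t2)
</syntaxhighlight>

Note that np.cov normalizes by n - 1 by default, matching the definition of \hat{\mathbf \Sigma} above, and that solving the linear system avoids forming \hat{\mathbf \Sigma}^{-1} explicitly.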
From the distribution,

t^2 \sim T^2_{p,n-1}=\frac{p(n-1)}{n-p} F_{p,n-p} ,

where F_{p,n-p} is the F-distribution with parameters p and n-p. In order to calculate a p-value (not to be confused with the dimension p here), note that the distribution of t^2 equivalently implies that

\frac{n-p} {p(n-1)} t^2 \sim F_{p,n-p} .

Then, use the quantity on the left-hand side to evaluate the p-value corresponding to the sample, which comes from the F-distribution. A confidence region may also be determined using similar logic.
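Continuing the sketch above, the conversion to the F scale and the resulting p-value can be computed as follows (this reuses t2, n and p from the previous block and assumes SciPy for the F-distribution):

<syntaxhighlight lang="python">
from scipy import stats

# Under the null hypothesis, this quantity follows F_{p, n-p}.
f_stat = (n - p) / (p * (n - 1)) * t2

# p-value: probability that an F_{p, n-p} variate is at least this large.
p_value = stats.f.sf(f_stat, p, n - p)
print(f_stat, p_value)
</syntaxhighlight>

A large p-value is consistent with \overline{\mathbf x} \approx \boldsymbol{\mu}, while a small one indicates a discrepancy between the sample mean and the hypothesized \boldsymbol{\mu}.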
==Motivation==

Let \mathcal{N}_p(\boldsymbol{\mu},{\mathbf \Sigma}) denote a p-variate normal distribution with location \boldsymbol{\mu} and known covariance {\mathbf \Sigma}. Let {\mathbf x}_1,\dots,{\mathbf x}_n\sim \mathcal{N}_p(\boldsymbol{\mu},{\mathbf \Sigma}) be n independent identically distributed (iid) random variables, which may be represented as p\times 1 column vectors of real numbers. Define \overline{\mathbf x}=\frac{\mathbf{x}_1+\cdots+\mathbf{x}_n}{n} to be the sample mean with covariance {\mathbf \Sigma}_\overline{\mathbf x}={\mathbf \Sigma}/n. It can be shown that

(\overline{\mathbf x}-\boldsymbol{\mu})'{\mathbf \Sigma}_\overline{\mathbf x}^{-1}(\overline{\mathbf x}-\boldsymbol{\mu})\sim\chi^2_p ,

where \chi^2_p is the chi-squared distribution with
p degrees of freedom. {{math proof|Every positive-semidefinite symmetric matrix \boldsymbol M has a positive-semidefinite symmetric square root \boldsymbol M^{1/2} , and if it is nonsingular, then its inverse has a positive-definite square root \boldsymbol M^{-1/2} . Since \operatorname{var}\left( \overline{\boldsymbol x} \right) = \mathbf\Sigma_\overline{\mathbf x} , we have \begin{align} \operatorname{var} \left( \mathbf\Sigma_\overline{\boldsymbol x}^{-1/2} \overline{\boldsymbol x} \right) & = \mathbf\Sigma_\overline{\boldsymbol x}^{-1/2} \Big( \operatorname{var}\left( \overline{\boldsymbol x} \right) \Big) \left( \mathbf\Sigma_\overline{\boldsymbol x}^{-1/2} \right)^T \\[5pt] & = \mathbf\Sigma_\overline{\boldsymbol x}^{-1/2} \Big( \operatorname{var}\left( \overline{\boldsymbol x} \right) \Big) \mathbf\Sigma_\overline{\boldsymbol x}^{-1/2} \text{ because } \mathbf\Sigma_\overline{\boldsymbol x} \text{ is symmetric} \\[5pt] & = \left( \mathbf\Sigma_\overline{\boldsymbol x}^{-1/2} \mathbf\Sigma_\overline{\boldsymbol x}^{1/2} \right) \left( \mathbf\Sigma_\overline{\boldsymbol x}^{1/2} \mathbf\Sigma_\overline{\boldsymbol x}^{-1/2} \right) \\[5pt] & = \mathbf I_p. \end{align} Consequently (\overline{\boldsymbol x}- \boldsymbol \mu)^T \mathbf\Sigma_\overline{x}^{-1} (\overline{\boldsymbol x}- \boldsymbol \mu) = \left( \mathbf\Sigma_\overline{x}^{-1/2} (\overline{\boldsymbol x}- \boldsymbol \mu) \right)^T \left( \mathbf\Sigma_\overline{x}^{-1/2} (\overline{\boldsymbol x}- \boldsymbol \mu) \right) and this is simply the sum of squares of p independent standard normal random variables. Thus its distribution is \chi^2_p. }} Alternatively, one can argue using density functions and characteristic functions, as follows. {{math proof| To show this use the fact that \overline{\mathbf x}\sim \mathcal{N}_p(\boldsymbol{\mu},{\mathbf \Sigma}/n) and derive the
characteristic function of the random variable \mathbf y = (\overline{\mathbf x}-\boldsymbol{\mu})'{\mathbf \Sigma}_\overline{\mathbf x}^{-1}(\overline{\mathbf x}-\boldsymbol{\mu}) = (\overline{\mathbf x}-\boldsymbol{\mu})'({\mathbf \Sigma} / n)^{-1}(\overline{\mathbf x}-\boldsymbol{\mu}). As usual, let | \cdot | denote the
determinant of the argument, as in | \boldsymbol\Sigma |. By definition of characteristic function, we have: \begin{align} \varphi_{\mathbf y}(\theta) &=\operatorname{E} e^{i \theta \mathbf y}, \\[5pt] &= \operatorname{E} e^{i \theta (\overline{\mathbf x}-\boldsymbol{\mu})'({\mathbf \Sigma}/n)^{-1}(\overline{\mathbf x}-\boldsymbol{\mathbf\mu})} \\[5pt] &= \int e^{i \theta (\overline{\mathbf x}-\boldsymbol{\mu})'n{\mathbf \Sigma}^{-1}(\overline{\mathbf x}-\boldsymbol{\mathbf\mu})} (2\pi)^{-p/2} |\boldsymbol{\Sigma}/n|^{-1/2}\, e^{ -(1/2) (\overline{\mathbf x}-\boldsymbol\mu)' n \boldsymbol\Sigma^{-1} (\overline{\mathbf x}-\boldsymbol\mu) } \, dx_1 \cdots dx_p \end{align} There are two exponentials inside the integral, so by multiplying the exponentials we add the exponents together, obtaining: \begin{align} &= \int (2\pi)^{-p/2}| \boldsymbol\Sigma/n|^{-1/2}\, e^{ -(1/2)(\overline{\mathbf x} - \boldsymbol\mu)' n(\boldsymbol\Sigma^{-1}-2 i \theta \boldsymbol\Sigma^{-1})(\overline{\mathbf x}-\boldsymbol\mu) }\,dx_1 \cdots dx_p \end{align} Now take the term |\boldsymbol\Sigma/n|^{-1/2} off the integral, and multiply everything by an identity I = |(\boldsymbol\Sigma^{-1}-2 i \theta \boldsymbol\Sigma^{-1})^{-1} / n|^{1/2} \;\cdot\; |(\boldsymbol\Sigma^{-1}-2 i \theta \boldsymbol\Sigma^{-1})^{-1} / n|^{-1/2}, bringing one of them inside the integral: \begin{align} &= |(\boldsymbol\Sigma^{-1}-2 i \theta \boldsymbol\Sigma^{-1})^{-1} / n|^{1/2} |\boldsymbol\Sigma/n|^{-1/2} \int (2\pi)^{-p/2} |(\boldsymbol\Sigma^{-1}-2 i \theta \boldsymbol\Sigma^{-1})^{-1} / n|^{-1/2} \, e^{ -(1/2)n(\overline{\mathbf x}-\boldsymbol\mu)'(\boldsymbol\Sigma^{-1}-2 i \theta \boldsymbol\Sigma^{-1})(\overline{\mathbf x}-\boldsymbol\mu) }\,dx_1 \cdots dx_p \end{align} But the term inside the integral is precisely the probability density function of a
multivariate normal distribution with covariance matrix (\boldsymbol\Sigma^{-1}-2 i \theta \boldsymbol\Sigma^{-1})^{-1} / n = \left[ n (\boldsymbol\Sigma^{-1}-2 i \theta \boldsymbol\Sigma^{-1}) \right]^{-1} and mean \boldsymbol\mu, so when integrating over all x_1, \dots, x_p, it must yield 1 per the
probability axioms. We thus end up with: \begin{align} & = \left|(\boldsymbol\Sigma^{-1}-2 i \theta \boldsymbol\Sigma^{-1})^{-1} \cdot \frac{1}{n} \right|^{1/2} |\boldsymbol\Sigma/n|^{-1/2} \\ & = \left|(\boldsymbol\Sigma^{-1}-2 i \theta \boldsymbol\Sigma^{-1})^{-1} \cdot \frac{1}{\cancel{n}} \cdot \cancel{n} \cdot \boldsymbol\Sigma^{-1} \right|^{1/2} \\ & = \left| \left[ (\cancel{\boldsymbol\Sigma^{-1}} -2i \theta \cancel{\boldsymbol\Sigma^{-1}} ) \cancel{\boldsymbol\Sigma} \right]^{-1} \right|^{1/2} \\ & = |\mathbf I_p-2 i \theta \mathbf I_p|^{-1/2} \end{align} where I_p is an identity matrix of dimension p. Finally, calculating the determinant, we obtain: \begin{align} & = (1-2 i \theta)^{-p/2} \end{align} which is the characteristic function for a
chi-squared distribution with p degrees of freedom. \;\;\;\blacksquare }}
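As an informal numerical check of the result proved above (this Monte Carlo experiment is illustrative and not part of the original argument; the parameters \boldsymbol{\mu} and {\mathbf \Sigma} below are made up), the following self-contained Python sketch simulates \mathbf y = (\overline{\mathbf x}-\boldsymbol{\mu})'({\mathbf \Sigma}/n)^{-1}(\overline{\mathbf x}-\boldsymbol{\mu}) with known {\mathbf \Sigma}, then compares its first two moments with those of \chi^2_p and its empirical characteristic function at a single point \theta with the closed form (1-2i\theta)^{-p/2}:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(42)

# Made-up population parameters for the known-covariance case.
p, n, reps = 3, 50, 10000
mu = np.array([1.0, -2.0, 0.5])
A = rng.normal(size=(p, p))
Sigma = A @ A.T + p * np.eye(p)   # an arbitrary positive-definite covariance

# reps independent samples of size n; x_bar has covariance Sigma / n.
X = rng.multivariate_normal(mu, Sigma, size=(reps, n))   # shape (reps, n, p)
d = X.mean(axis=1) - mu                                  # shape (reps, p)

# y_r = n * d_r' Sigma^{-1} d_r, i.e. the quadratic form for each replication.
y = n * np.einsum('rj,jk,rk->r', d, np.linalg.inv(Sigma), d)

# A chi-squared variable with p degrees of freedom has mean p and variance 2p.
print(y.mean(), y.var())          # expect values close to p and 2p

# Empirical characteristic function vs. the closed form (1 - 2i*theta)^{-p/2}.
theta = 0.3
print(np.mean(np.exp(1j * theta * y)))
print((1 - 2j * theta) ** (-p / 2))
</syntaxhighlight>

Both printed pairs should agree up to Monte Carlo error, in line with the chi-squared conclusion of the two proofs.

==Two-sample statistic==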