== Cochran's theorem ==
The following is a special case of Cochran's theorem.
Theorem. If Z_1,\ldots,Z_n are independent, identically distributed (i.i.d.) standard normal random variables, then \sum_{t=1}^n \left(Z_t - \bar Z\right)^2 \sim \chi^2_{n-1}, where \bar Z = \frac{1}{n} \sum_{t=1}^n Z_t.
Proof. Let Z\sim\mathcal{N}(\bar 0,1\!\!1) be a vector of n independent standard normal random variables, and \bar Z their average. Then
\sum_{t=1}^n(Z_t-\bar Z)^2 = \sum_{t=1}^n Z_t^2 - n\bar Z^2 = Z^\top\!\left[1\!\!1 - \tfrac{1}{n}\bar 1\bar 1^\top\right]\!Z =: Z^\top\!M Z,
where 1\!\!1 is the identity matrix and \bar 1 the all-ones vector. The matrix M has one eigenvector b_1 := \tfrac{1}{\sqrt{n}}\bar 1 with eigenvalue 0, and n-1 eigenvectors b_2,\ldots,b_n (all orthogonal to b_1) with eigenvalue 1, which can be chosen so that Q := (b_1,\ldots,b_n) is an orthogonal matrix. Since X := Q^\top\!Z \sim \mathcal{N}(\bar 0, Q^\top\!1\!\!1 Q) = \mathcal{N}(\bar 0, 1\!\!1), we have
\sum_{t=1}^n(Z_t-\bar Z)^2 = Z^\top\!M Z = X^\top\!Q^\top\!M Q X = X_2^2 + \cdots + X_n^2 \sim \chi^2_{n-1},
which proves the claim.
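As an illustration (not part of the original proof), the theorem can be checked by simulation; the following sketch assumes NumPy and SciPy, and all parameter values are illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps = 10, 100_000
Z = rng.standard_normal((reps, n))
# Sum of squared deviations from the sample mean, per replication.
S = ((Z - Z.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)

# Kolmogorov-Smirnov test against chi^2 with n-1 degrees of freedom;
# a large p-value is consistent with the theorem.
print(stats.kstest(S, stats.chi2(df=n - 1).cdf))
```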
== Additivity ==
It follows from the definition of the chi-squared distribution that the sum of independent chi-squared variables is also chi-squared distributed. Specifically, if X_1,\ldots,X_n are independent chi-squared variables with k_1,\ldots,k_n degrees of freedom, respectively, then Y = X_1 + \cdots + X_n is chi-squared distributed with k_1 + \cdots + k_n degrees of freedom.
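A minimal simulation sketch of this property, assuming NumPy and SciPy (the degrees of freedom are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
k1, k2, reps = 3, 5, 100_000
# Sum of independent chi^2(k1) and chi^2(k2) samples.
Y = rng.chisquare(k1, reps) + rng.chisquare(k2, reps)
# Compare against chi^2(k1 + k2); a large p-value supports additivity.
print(stats.kstest(Y, stats.chi2(df=k1 + k2).cdf))
```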
== Sample mean ==
The sample mean of n i.i.d. chi-squared variables of degree k is distributed according to a gamma distribution with shape \alpha and scale \theta parameters: \overline X = \frac{1}{n} \sum_{i=1}^n X_i \sim \operatorname{Gamma}\left(\alpha{=}\tfrac{nk}{2}, \,\theta{=}\tfrac{2}{n}\right) \qquad \text{where } X_i \sim \chi^2(k).
Asymptotically, since a gamma distribution with shape parameter \alpha going to infinity converges towards a normal distribution with expectation \mu = \alpha\theta and variance \sigma^2 = \alpha\theta^2, the sample mean converges towards: \overline X \xrightarrow{n \to \infty} N\left(\mu{=}k, \, \sigma^2{=}\tfrac{2k}{n}\right). Note that we would have obtained the same result by invoking instead the central limit theorem, noting that each chi-squared variable of degree k has expectation k and variance 2k (hence the sample mean \overline{X} has variance \sigma^2 = \tfrac{2k}{n}).
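The following sketch, assuming NumPy and SciPy, compares the simulated sample mean against both the exact gamma law and the normal limit (all parameter choices are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, k, reps = 50, 4, 100_000
# Sample mean of n i.i.d. chi^2(k) variables, repeated `reps` times.
Xbar = rng.chisquare(k, (reps, n)).mean(axis=1)

# Exact law: Gamma(shape = n*k/2, scale = 2/n).
print(stats.kstest(Xbar, stats.gamma(a=n * k / 2, scale=2 / n).cdf))
# Normal limit: N(k, 2k/n), approached as n grows.
print(stats.kstest(Xbar, stats.norm(loc=k, scale=np.sqrt(2 * k / n)).cdf))
```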
== Entropy ==
The differential entropy is given by \begin{align} h &= -\int_0^\infty f(x;\,k) \ln f(x;\,k) \, dx \\ &= \frac k 2 + \ln \left[2\,\Gamma{\left(\frac k 2 \right)}\right] + \left(1-\frac k 2 \right) \psi\!\left(\frac k 2 \right), \end{align} where \psi(x) is the digamma function. The chi-squared distribution is the maximum entropy probability distribution for a random variate X for which \operatorname{E}(X)=k and \operatorname{E}(\ln(X))=\psi(k/2)+\ln(2) are fixed. Since the chi-squared distribution is in the family of gamma distributions, this can be derived by substituting appropriate values in the expectation of the log moment of the gamma distribution. For a derivation from more basic principles, see the derivation in moment-generating function of the sufficient statistic.
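As a sanity check (not from the source), the closed form can be compared against SciPy's numeric entropy; gammaln and digamma are from scipy.special:

```python
import numpy as np
from scipy import stats
from scipy.special import gammaln, digamma

k = 7.0  # illustrative degrees of freedom
# h = k/2 + ln(2 Gamma(k/2)) + (1 - k/2) psi(k/2), using gammaln for stability.
closed_form = k / 2 + np.log(2) + gammaln(k / 2) + (1 - k / 2) * digamma(k / 2)
# scipy's entropy() for a continuous distribution is the differential entropy.
print(closed_form, stats.chi2(df=k).entropy())  # the two values should agree
```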
== Noncentral moments ==
The noncentral moments (raw moments) of a chi-squared distribution with k degrees of freedom are given by \begin{align} \operatorname{E}(X^m) &= k (k+2) (k+4) \cdots (k+2m-2) \\[1ex] &= 2^m \frac{\Gamma{\left(m+\frac{k}{2}\right)}}{\Gamma{\left(\frac{k}{2}\right)}}. \end{align}
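A quick illustrative check, assuming SciPy, that the product form and the gamma-function form agree (k and m are arbitrary choices):

```python
import math
from scipy import stats

k, m = 5, 4
# k(k+2)(k+4)...(k+2m-2): factors k + 2j for j = 0, ..., m-1.
product_form = math.prod(k + 2 * j for j in range(m))
# 2^m * Gamma(m + k/2) / Gamma(k/2).
gamma_form = 2**m * math.gamma(m + k / 2) / math.gamma(k / 2)
# Both should match scipy's raw moment of order m.
print(product_form, gamma_form, stats.chi2(df=k).moment(m))
```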
== Cumulants ==
The cumulants are readily obtained by a power series expansion of the logarithm of the characteristic function: \kappa_n = 2^{n-1}(n-1)!\,k, with cumulant generating function \ln \operatorname{E}[e^{tX}] = -\frac{k}{2} \ln(1-2t).
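A symbolic sketch, assuming SymPy is available, that recovers the cumulants by differentiating the cumulant generating function at t = 0:

```python
import sympy as sp

t, k = sp.symbols('t k')
# Cumulant generating function of chi^2(k).
K = -k / 2 * sp.log(1 - 2 * t)
for n in range(1, 6):
    # n-th cumulant = n-th derivative of K at t = 0.
    kappa_n = sp.diff(K, t, n).subs(t, 0)
    # Verify against the closed form 2^(n-1) * (n-1)! * k.
    assert sp.simplify(kappa_n - 2**(n - 1) * sp.factorial(n - 1) * k) == 0
    print(n, kappa_n)
```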
== Concentration ==
The chi-squared distribution exhibits strong concentration around its mean. The standard Laurent–Massart bounds are: \Pr(X - k \ge 2\sqrt{kx} + 2x) \le e^{-x} and \Pr(k - X \ge 2\sqrt{kx}) \le e^{-x}. One consequence is that, if Z \sim N(0, 1)^k is a Gaussian random vector in \R^k, then as the dimension k grows, the squared length of the vector is concentrated tightly around k with a width of order k^{1/2+\alpha}: applying the two bounds with x = k^{2\alpha} and a union bound gives \Pr\left(\left\|Z\right\|^2 \in \left[k - 2k^{1/2+\alpha}, \; k + 2k^{1/2+\alpha} + 2k^{2\alpha}\right]\right) \geq 1 - 2e^{-k^{2\alpha}}, where the exponent \alpha can be chosen as any value in (0, 1/2). Since the cumulant generating function for \chi^2(k) is K(t) = -\frac k2 \ln(1-2t) and its convex dual is K^*(q) = \frac{1}{2} \left(q - k + k\ln\frac{k}{q}\right), the standard Chernoff bound yields \begin{aligned} \ln \Pr(X \geq (1 + \varepsilon) k) &\leq -\frac{k}{2} \left( \varepsilon - \ln(1+\varepsilon)\right) \\ \ln \Pr(X \leq (1 - \varepsilon) k) &\leq -\frac{k}{2} \left(-\varepsilon - \ln(1-\varepsilon)\right) \end{aligned} where 0 < \varepsilon < 1. By the union bound, \Pr(X \in (1\pm \varepsilon) k) \geq 1 - 2 e^{-\frac k2 \left(\frac{1}{2} \varepsilon^2 - \frac{1}{3} \varepsilon^3\right)}. This result is used in proving the Johnson–Lindenstrauss lemma.
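An empirical sanity check of the Laurent–Massart bounds, assuming NumPy (the values of k and x are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
k, reps = 20, 1_000_000
X = rng.chisquare(k, reps)

for x in [1.0, 2.0, 4.0]:
    # Empirical tail frequencies; each should sit below e^{-x}.
    upper = np.mean(X - k >= 2 * np.sqrt(k * x) + 2 * x)
    lower = np.mean(k - X >= 2 * np.sqrt(k * x))
    print(f"x={x}: upper {upper:.5f} <= {np.exp(-x):.5f}, "
          f"lower {lower:.5f} <= {np.exp(-x):.5f}")
```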
== Asymptotic properties ==
By the central limit theorem, because the chi-squared distribution is the sum of k independent random variables with finite mean and variance, it converges to a normal distribution for large k. For many practical purposes, for k > 50 the distribution is sufficiently close to a normal distribution that the difference can be ignored. Specifically, if X \sim \chi^2(k), then as k tends to infinity, the distribution of (X-k)/\sqrt{2k} tends to a standard normal distribution. However, convergence is slow, as the skewness is \sqrt{8/k} and the excess kurtosis is 12/k. The sampling distribution of \ln(\chi^2) converges to normality much faster than the sampling distribution of \chi^2, as the logarithmic transform removes much of the asymmetry. Other functions of the chi-squared distribution converge more rapidly to a normal distribution. Some examples are (a numerical comparison follows the list):
• If X \sim \chi^2(k) then \sqrt{2X} is approximately normally distributed with mean \sqrt{2k-1} and unit variance (1922, by R. A. Fisher, see (18.23), p. 426 of Johnson).
• If X \sim \chi^2(k) then (X/k)^{1/3} is approximately normally distributed with mean 1-\tfrac{2}{9k} and variance \tfrac{2}{9k}. This is known as the Wilson–Hilferty transformation, see (18.24), p. 426 of Johnson. This normalizing transformation leads directly to the commonly used median approximation k\bigg(1-\frac{2}{9k}\bigg)^3 by back-transforming from the mean, which is also the median, of the normal distribution.
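The following sketch, assuming SciPy, compares the exact median of \chi^2(k) with the medians implied by the two approximations above (k = 10 is illustrative):

```python
import numpy as np
from scipy import stats

k = 10
exact_median = stats.chi2(df=k).ppf(0.5)
# Wilson-Hilferty: back-transform the normal median 1 - 2/(9k) via x = k * z^3.
wh_median = k * (1 - 2 / (9 * k)) ** 3
# Fisher: sqrt(2X) ~ N(sqrt(2k-1), 1), so the implied median is (2k-1)/2.
fisher_median = (stats.norm.ppf(0.5) + np.sqrt(2 * k - 1)) ** 2 / 2
print(exact_median, wh_median, fisher_median)
```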
== Related distributions ==