Analysis of Variance (ANOVA)

In a one-way analysis of variance, the total sum of squares (proportional to \operatorname{Var}(Y)) is split into a “between-group” sum of squares (proportional to \operatorname{Var}(\operatorname{E}[Y\mid X])) plus a “within-group” sum of squares (proportional to \operatorname{E}[\operatorname{Var}(Y\mid X)]). The F-test examines whether the explained (between-group) component is large enough, relative to the within-group component, to indicate that X has a significant effect on Y.
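The split can be checked numerically. The following is a minimal sketch in standard-library Python; the data in `groups` and the helper names `anova_sums_of_squares` and `f_statistic` are made up for illustration and are not part of any particular library.

```python
from statistics import fmean

# Hypothetical observations of Y, grouped by the level of a categorical factor X.
groups = {
    "a": [4.0, 5.0, 6.0],
    "b": [7.0, 9.0, 8.0, 8.0],
    "c": [1.0, 2.0, 3.0],
}


def anova_sums_of_squares(groups):
    """Return (total, between-group, within-group) sums of squares."""
    all_values = [y for ys in groups.values() for y in ys]
    grand_mean = fmean(all_values)
    ss_total = sum((y - grand_mean) ** 2 for y in all_values)
    ss_between = 0.0
    ss_within = 0.0
    for ys in groups.values():
        group_mean = fmean(ys)
        # Between-group part: spread of the group means around the grand mean.
        ss_between += len(ys) * (group_mean - grand_mean) ** 2
        # Within-group part: spread of observations around their own group mean.
        ss_within += sum((y - group_mean) ** 2 for y in ys)
    return ss_total, ss_between, ss_within


def f_statistic(groups):
    """One-way ANOVA F statistic: between mean square over within mean square."""
    n = sum(len(ys) for ys in groups.values())
    k = len(groups)
    _, ss_between, ss_within = anova_sums_of_squares(groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


ss_total, ss_between, ss_within = anova_sums_of_squares(groups)
print(ss_total, ss_between + ss_within)  # the two agree up to rounding
print(f_statistic(groups))
```

Dividing each sum of squares by the total number of observations gives the sample analogues of \operatorname{Var}(Y), \operatorname{Var}(\operatorname{E}[Y\mid X]) and \operatorname{E}[\operatorname{Var}(Y\mid X)].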
Regression and R²

In linear regression and related models, if \hat{Y}=\operatorname{E}[Y\mid X], the fraction of variance explained is
\begin{align} R^2 = \frac{\operatorname{Var}(\hat{Y})}{\operatorname{Var}(Y)} &= \frac{\operatorname{Var}(\operatorname{E}[Y\mid X])}{\operatorname{Var}(Y)} \\[1ex] &= 1 - \frac{\operatorname{E}[\operatorname{Var}(Y\mid X)]}{\operatorname{Var}(Y)}. \end{align}
In the simple linear case (one predictor), R^2 also equals the square of the Pearson correlation coefficient between X and Y.
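As an illustration, the sketch below fits a least-squares line to a small made-up data set and compares the three expressions for R^2. It assumes an ordinary least-squares fit with an intercept, for which the in-sample identities hold exactly when sample variances replace population ones. The data and the helper names `fit_line` and `pearson_r` are hypothetical.

```python
from math import sqrt
from statistics import fmean, pvariance

# Hypothetical paired observations (one predictor).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ~ intercept + slope * x."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


def pearson_r(x, y):
    """Sample Pearson correlation coefficient between x and y."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


slope, intercept = fit_line(x, y)
y_hat = [intercept + slope * a for a in x]
residuals = [b - f for b, f in zip(y, y_hat)]

# Explained variance over total variance.
r2_explained = pvariance(y_hat) / pvariance(y)
# One minus unexplained (residual) variance over total variance.
r2_residual = 1 - pvariance(residuals) / pvariance(y)
# Squared correlation between predictor and response.
r2_correlation = pearson_r(x, y) ** 2

print(r2_explained, r2_residual, r2_correlation)  # agree up to rounding
```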
Machine learning and Bayesian inference

In many Bayesian and ensemble methods, one decomposes prediction uncertainty via the law of total variance. For a Bayesian neural network with random parameters \theta:
\operatorname{Var}(Y) = \operatorname{E}\left[\operatorname{Var}(Y\mid \theta)\right] + \operatorname{Var}\left(\operatorname{E}[Y\mid \theta]\right),
where the first term is often referred to as “aleatoric” (within-model) uncertainty and the second as “epistemic” (between-model) uncertainty.
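For an ensemble, the same decomposition is often approximated by treating each member as one equally weighted draw of \theta. The sketch below assumes each member reports a predictive mean and a predictive variance for a single input. The numbers and the function names `decompose_uncertainty` and `mixture_variance` are illustrative only.

```python
from statistics import fmean, pvariance

# Hypothetical ensemble output for one input: each member gives E[Y | theta]
# and Var(Y | theta). Members are treated as equally likely draws of theta.
member_means = [1.0, 1.2, 0.8, 1.0]
member_variances = [0.10, 0.12, 0.08, 0.10]


def decompose_uncertainty(means, variances):
    """Return (aleatoric, epistemic, total) predictive variance."""
    aleatoric = fmean(variances)  # E[Var(Y | theta)]
    epistemic = pvariance(means)  # Var(E[Y | theta])
    return aleatoric, epistemic, aleatoric + epistemic


def mixture_variance(means, variances):
    """Variance of the equally weighted mixture, computed from second moments."""
    second_moment = fmean([v + m ** 2 for m, v in zip(means, variances)])
    return second_moment - fmean(means) ** 2


aleatoric, epistemic, total = decompose_uncertainty(member_means, member_variances)
print(aleatoric, epistemic, total)
print(mixture_variance(member_means, member_variances))  # matches total up to rounding
```

The second function computes the mixture variance directly, which gives an independent check that the two terms add up to the total.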
Actuarial science

Credibility theory uses the same partitioning: the expected value of process variance (EVPV), \operatorname{E}[\operatorname{Var}(Y\mid X)], and the variance of hypothetical means (VHM), \operatorname{Var}(\operatorname{E}[Y\mid X]). The ratio of explained to total variance determines how much “credibility” to give to individual risk classifications. In non-Gaussian settings, a high explained-variance ratio still indicates significant information about Y contained in X.
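One common way to turn EVPV and VHM into a credibility weight is the Bühlmann model, used here as an assumption because the text does not name a specific model. In that model the credibility factor for n observations is Z = n / (n + EVPV/VHM), which for n = 1 reduces to the explained-to-total variance ratio. The risk classes below are hypothetical and taken to be equally likely.

```python
from statistics import fmean, pvariance

# Hypothetical, equally likely risk classes: (hypothetical mean, process variance).
risk_classes = [(100.0, 400.0), (150.0, 900.0), (200.0, 1600.0)]


def evpv_and_vhm(risk_classes):
    """Return (EVPV, VHM) for equally weighted risk classes."""
    evpv = fmean([variance for _, variance in risk_classes])  # E[Var(Y | X)]
    vhm = pvariance([mean for mean, _ in risk_classes])       # Var(E[Y | X])
    return evpv, vhm


def buhlmann_credibility(risk_classes, n_observations):
    """Buhlmann credibility factor Z = n / (n + EVPV / VHM)."""
    evpv, vhm = evpv_and_vhm(risk_classes)
    return n_observations / (n_observations + evpv / vhm)


evpv, vhm = evpv_and_vhm(risk_classes)
z_one = buhlmann_credibility(risk_classes, 1)
z_ten = buhlmann_credibility(risk_classes, 10)
print(z_one, vhm / (evpv + vhm))  # equal: one observation gives the explained ratio
print(z_ten)                      # more observations give more credibility
```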
Generalizations

The law of total variance generalizes to multiple or nested conditionings. For example, with two conditioning variables X_1 and X_2:
\operatorname{Var}(Y) = \operatorname{E}\left[\operatorname{Var}(Y\mid X_1,X_2)\right] + \operatorname{E}\left[\operatorname{Var}(\operatorname{E}[Y\mid X_1,X_2]\mid X_1)\right] + \operatorname{Var}(\operatorname{E}[Y\mid X_1]).
More generally, the law of total cumulance extends this approach to higher moments.

== See also ==