Consider the sum, Z, of two independent binomial random variables, X ~ B(m_0, p_0) and Y ~ B(m_1, p_1), where Z = X + Y. Then the variance of Z is less than or equal to its variance under the assumption that p_0 = p_1 = \bar{p}, that is, if Z had a binomial distribution with success probability equal to the average of the probabilities of X and Y, weighted by their numbers of trials. Symbolically,

:Var(Z) \leqslant E[Z] \left(1 - \frac{E[Z]}{m_0+m_1}\right).
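As a numerical illustration before the proof, the bound can be checked directly for one hypothetical choice of parameters (m_0 = 10, p_0 = 0.2, m_1 = 15, p_1 = 0.7, chosen arbitrarily); a minimal Python sketch:

```python
# Hypothetical parameters: X ~ B(10, 0.2), Y ~ B(15, 0.7).
m0, p0 = 10, 0.2
m1, p1 = 15, 0.7

ez = m0 * p0 + m1 * p1                           # E[Z] = E[X] + E[Y] = 12.5
var_z = m0 * p0 * (1 - p0) + m1 * p1 * (1 - p1)  # true variance = 4.75
bound = ez * (1 - ez / (m0 + m1))                # binomial-style bound = 6.25

print(var_z, bound, var_z <= bound)              # 4.75 6.25 True
```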
Proof. We wish to prove that

:Var(Z) \leqslant E[Z] \left(1 - \frac{E[Z]}{m_0+m_1}\right).

We will prove this inequality by finding an expression for Var(Z), substituting it on the left-hand side, and then showing that the inequality always holds.

If Z has a binomial distribution with parameters n and p, then the expected value of Z is given by E[Z] = np and the variance of Z is given by Var(Z) = np(1 - p).
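These standard moment formulas can be confirmed numerically; a minimal sketch, assuming SciPy is available and using arbitrary illustrative parameters:

```python
# binom.stats returns the mean and variance of a binomial distribution.
from scipy.stats import binom

n, p = 25, 0.5
mean, var = binom.stats(n, p)
print(mean, var)                 # 12.5 6.25
print(n * p, n * p * (1 - p))    # matches: 12.5 6.25
```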
Letting n = m_0 + m_1 and substituting E[Z] for np gives

:Var(Z) = E[Z] \left(1 - \frac{E[Z]}{m_0+m_1}\right).

The random variables X and Y are independent, so the variance of the sum is equal to the sum of the variances, that is,

:Var(Z) = E[X] \left(1 - \frac{E[X]}{m_0}\right) + E[Y] \left(1 - \frac{E[Y]}{m_1}\right).

In order to prove the theorem, it is therefore sufficient to prove that

:E[X] \left(1 - \frac{E[X]}{m_0}\right) + E[Y] \left(1 - \frac{E[Y]}{m_1}\right) \leqslant E[Z] \left(1 - \frac{E[Z]}{m_0+m_1}\right).
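The variance-additivity step used above can be sanity-checked by simulation; a minimal sketch, assuming NumPy is available and reusing the earlier hypothetical parameters:

```python
# Monte Carlo check that Var(X + Y) = Var(X) + Var(Y) for independent samples.
import numpy as np

rng = np.random.default_rng(0)
m0, p0, m1, p1 = 10, 0.2, 15, 0.7
x = rng.binomial(m0, p0, size=1_000_000)
y = rng.binomial(m1, p1, size=1_000_000)

print(np.var(x + y))                             # approximately 4.75
print(m0 * p0 * (1 - p0) + m1 * p1 * (1 - p1))   # 4.75 exactly
```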
Substituting E[X] + E[Y] for E[Z] gives

:E[X] \left(1 - \frac{E[X]}{m_0}\right) + E[Y] \left(1 - \frac{E[Y]}{m_1}\right) \leqslant (E[X] + E[Y]) \left(1 - \frac{E[X] + E[Y]}{m_0+m_1}\right).

Multiplying out the brackets yields

:E[X] - \frac{E[X]^2}{m_0} + E[Y] - \frac{E[Y]^2}{m_1} \leqslant E[X] + E[Y] - \frac{(E[X]+E[Y])^2}{m_0+m_1}.

Subtracting E[X] and E[Y] from both sides and reversing the inequality gives

:\frac{E[X]^2}{m_0} + \frac{E[Y]^2}{m_1} \geqslant \frac{(E[X]+E[Y])^2}{m_0+m_1}.

Expanding the right-hand side gives

:\frac{E[X]^2}{m_0} + \frac{E[Y]^2}{m_1} \geqslant \frac{E[X]^2 + 2E[X]E[Y] + E[Y]^2}{m_0+m_1}.

Multiplying both sides by m_0 m_1 (m_0+m_1) yields

:(m_0 m_1 + m_1^2) E[X]^2 + (m_0^2 + m_0 m_1) E[Y]^2 \geqslant m_0 m_1 \left(E[X]^2 + 2E[X]E[Y] + E[Y]^2\right).

Subtracting the right-hand side from both sides gives

:m_1^2 E[X]^2 - 2 m_0 m_1 E[X] E[Y] + m_0^2 E[Y]^2 \geqslant 0,

or equivalently

:(m_1 E[X] - m_0 E[Y])^2 \geqslant 0.

The square of a real number is always greater than or equal to zero, so the final inequality holds for every choice of independent binomial distributions for X and Y, and hence so does the original one. This is sufficient to prove the theorem.
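The whole chain of manipulations amounts to a single identity: the gap between the bound and the true variance equals the final perfect square divided by m_0 m_1 (m_0+m_1), which is manifestly nonnegative. A symbolic sketch, assuming SymPy is available and writing a and b for E[X] and E[Y]:

```python
import sympy as sp

m0, m1, a, b = sp.symbols('m_0 m_1 a b', positive=True)  # a = E[X], b = E[Y]

var_sum = a * (1 - a / m0) + b * (1 - b / m1)     # Var(X) + Var(Y)
bound = (a + b) * (1 - (a + b) / (m0 + m1))       # binomial-style bound

square = (m1 * a - m0 * b)**2 / (m0 * m1 * (m0 + m1))
print(sp.simplify(bound - var_sum - square))      # 0: the gap is the square
```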
Although this proof was developed for the sum of two variables, it is easily generalized to sums of more than two. Additionally, if the individual success probabilities are known, then the variance is known to take the form

:\operatorname{Var}(Z) = n \bar{p} (1 - \bar{p}) - n s^2,

where \bar{p} is the average probability and s^2 = \frac{1}{n}\sum_{i=1}^n (p_i - \bar{p})^2. This expression also implies that the variance is always less than or equal to that of the binomial distribution with p = \bar{p}, because the standard expression for the variance is decreased by ns^2, a nonnegative quantity.
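This decomposition is easy to verify against the direct sum of Bernoulli variances; a minimal sketch with arbitrarily chosen trial probabilities (matching the earlier two-group example):

```python
# n Bernoulli trials: 10 with success probability 0.2, 15 with 0.7.
probs = [0.2] * 10 + [0.7] * 15
n = len(probs)
p_bar = sum(probs) / n
s2 = sum((p - p_bar)**2 for p in probs) / n

direct = sum(p * (1 - p) for p in probs)     # sum of Bernoulli variances
formula = n * p_bar * (1 - p_bar) - n * s2
print(direct, formula)                       # both 4.75 (up to rounding)
```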
==Applications==