=== Sums of binomials ===
If X \sim \mathrm{B}(n, p) and Y \sim \mathrm{B}(m, p) are independent binomial variables with the same probability p, then X + Y is again a binomial variable; its distribution is Z = X + Y \sim \mathrm{B}(n + m, p):
\begin{align} \operatorname P(Z=k) &= \sum_{i=0}^k\left[\binom{n}i p^i (1-p)^{n-i}\right]\left[\binom{m}{k-i} p^{k-i} (1-p)^{m-k+i}\right]\\ &= \binom{n+m}k p^k (1-p)^{n+m-k} \end{align}
A binomially distributed random variable X \sim \mathrm{B}(n, p) can be considered as the sum of n Bernoulli distributed random variables. So the sum of two binomially distributed random variables X \sim \mathrm{B}(n, p) and Y \sim \mathrm{B}(m, p) is equivalent to the sum of n + m Bernoulli distributed random variables, which means Z = X + Y \sim \mathrm{B}(n + m, p). This can also be proven directly using the addition rule. However, if X \sim \mathrm{B}(n, p_1) and Y \sim \mathrm{B}(m, p_2) do not have the same probability, then the variance of the sum will be smaller than the variance of a binomial variable distributed as \mathrm{B}(n + m, \bar{p}), where \bar{p} = (np_1 + mp_2)/(n + m) is the mean success probability.
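As a quick numerical check of this identity (a sketch; the values n = 4, m = 6, p = 0.3 are arbitrary), the convolution of the two PMFs can be compared against the \mathrm{B}(n + m, p) PMF directly:

```python
# Numerically verify that the convolution of B(n, p) and B(m, p)
# equals B(n + m, p); parameter values are illustrative.
import numpy as np
from scipy.stats import binom

n, m, p = 4, 6, 0.3
pmf_x = binom.pmf(np.arange(n + 1), n, p)   # P(X = i) for i = 0..n
pmf_y = binom.pmf(np.arange(m + 1), m, p)   # P(Y = j) for j = 0..m
pmf_sum = np.convolve(pmf_x, pmf_y)         # distribution of Z = X + Y
pmf_z = binom.pmf(np.arange(n + m + 1), n + m, p)
print(np.allclose(pmf_sum, pmf_z))          # True
```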
=== Poisson binomial distribution ===
The binomial distribution is a special case of the Poisson binomial distribution, which is the distribution of a sum of n independent non-identical Bernoulli trials \mathrm{B}(p_i).
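A minimal sketch of this relationship (the success probabilities below are arbitrary choices): the Poisson binomial PMF can be built up one trial at a time by dynamic programming, and with identical p_i it collapses to the ordinary binomial PMF.

```python
# Poisson binomial PMF via dynamic programming over the trials; with all
# p_i equal it reduces to the ordinary binomial PMF.
import numpy as np
from scipy.stats import binom

def poisson_binomial_pmf(probs):
    """PMF of the number of successes among independent Bernoulli(p_i) trials."""
    pmf = np.array([1.0])                    # zero trials: P(0 successes) = 1
    for p in probs:
        # Adding one trial keeps the count (prob 1 - p) or bumps it by one (prob p).
        pmf = np.append(pmf, 0.0) * (1 - p) + np.append(0.0, pmf) * p
    return pmf

print(poisson_binomial_pmf([0.2, 0.5, 0.9]))   # non-identical trials
n, p = 5, 0.3                                  # identical trials recover B(n, p)
print(np.allclose(poisson_binomial_pmf([p] * n), binom.pmf(np.arange(n + 1), n, p)))
```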
=== Ratio of two binomial distributions ===
This result was first derived by Katz and coauthors in 1978. Let X \sim \mathrm{B}(n, p_1) and Y \sim \mathrm{B}(m, p_2) be independent. Let T = (X/n)/(Y/m). Then \log(T) is approximately normally distributed with mean \log(p_1/p_2) and variance ((1/p_1) - 1)/n + ((1/p_2) - 1)/m.
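A Monte Carlo sketch of this approximation (the parameters n, m, p_1, p_2 below are arbitrary) compares the simulated mean and variance of \log(T) with the stated values:

```python
# Simulate log(T) and compare with the Katz et al. normal approximation.
import numpy as np

rng = np.random.default_rng(0)
n, m, p1, p2 = 200, 300, 0.4, 0.25
x = rng.binomial(n, p1, size=100_000)
y = rng.binomial(m, p2, size=100_000)
log_t = np.log((x / n) / (y / m))        # X = 0 or Y = 0 is negligible here

print(log_t.mean(), np.log(p1 / p2))                    # both ≈ 0.47
print(log_t.var(), (1/p1 - 1)/n + (1/p2 - 1)/m)         # both ≈ 0.0175
```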
=== Conditional binomials ===
If X \sim \mathrm{B}(n, p) and Y \mid X \sim \mathrm{B}(X, q) (the conditional distribution of Y, given X), then Y is a simple binomial random variable with distribution Y \sim \mathrm{B}(n, pq). For example, imagine throwing n balls to a basket U_X and taking the balls that hit and throwing them to another basket U_Y. If p is the probability to hit U_X, then X \sim \mathrm{B}(n, p) is the number of balls that hit U_X. If q is the probability to hit U_Y, then the number of balls that hit U_Y is Y \sim \mathrm{B}(X, q), and therefore Y \sim \mathrm{B}(n, pq). Since X \sim \mathrm{B}(n, p) and Y \sim \mathrm{B}(X, q), by the
law of total probability,
\begin{align} \Pr[Y = m] &= \sum_{k = m}^{n} \Pr[Y = m \mid X = k] \Pr[X = k] \\[2pt] &= \sum_{k=m}^n \binom{n}{k} \binom{k}{m} p^k q^m (1-p)^{n-k} (1-q)^{k-m} \end{align}
Since \tbinom{n}{k} \tbinom{k}{m} = \tbinom{n}{m} \tbinom{n-m}{k-m}, the equation above can be expressed as
\Pr[Y = m] = \sum_{k=m}^{n} \binom{n}{m} \binom{n-m}{k-m} p^k q^m (1-p)^{n-k} (1-q)^{k-m}
Factoring p^k = p^m p^{k-m} and pulling all the terms that don't depend on k out of the sum now yields
\begin{align} \Pr[Y = m] &= \binom{n}{m} p^m q^m \left( \sum_{k=m}^n \binom{n-m}{k-m} p^{k-m} (1-p)^{n-k} (1-q)^{k-m} \right) \\[2pt] &= \binom{n}{m} (pq)^m \left( \sum_{k=m}^n \binom{n-m}{k-m} \left(p(1-q)\right)^{k-m} (1-p)^{n-k} \right) \end{align}
After substituting i = k - m in the expression above, we get
\Pr[Y = m] = \binom{n}{m} (pq)^m \left( \sum_{i=0}^{n-m} \binom{n-m}{i} (p - pq)^i (1-p)^{n-m-i} \right)
Notice that the sum (in the parentheses) above equals (p - pq + 1 - p)^{n-m} by the binomial theorem. Substituting this in finally yields
\begin{align} \Pr[Y=m] &= \binom{n}{m} (pq)^m (p - pq + 1 - p)^{n-m}\\[4pt] &= \binom{n}{m} (pq)^m (1-pq)^{n-m} \end{align}
and thus Y \sim \mathrm{B}(n, pq) as desired.
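The same conclusion can be checked by simulation; this is a sketch with arbitrary parameters, thinning a \mathrm{B}(n, p) count by keeping each success with probability q:

```python
# Thin a B(n, p) count by keeping each success with probability q and
# compare the empirical distribution with B(n, pq).
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(1)
n, p, q = 50, 0.6, 0.3
x = rng.binomial(n, p, size=200_000)    # balls that hit the first basket
y = rng.binomial(x, q)                  # of those, balls that hit the second
counts = np.bincount(y, minlength=n + 1) / y.size
print(np.max(np.abs(counts - binom.pmf(np.arange(n + 1), n, p * q))))  # small
```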
=== Bernoulli distribution ===
The Bernoulli distribution is a special case of the binomial distribution, where n = 1. Symbolically, X \sim \mathrm{B}(1, p) has the same meaning as X \sim \mathrm{Bernoulli}(p). Conversely, any binomial distribution, \mathrm{B}(n, p), is the distribution of the sum of n independent Bernoulli trials, \mathrm{Bernoulli}(p), each with the same probability p.
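A small sketch of this construction (sample size and parameters are arbitrary), summing n independent Bernoulli(p) indicators to produce \mathrm{B}(n, p) draws:

```python
# Build binomial draws as row-sums of independent Bernoulli(p) indicators.
import numpy as np

rng = np.random.default_rng(2)
n, p = 10, 0.25
bernoulli_trials = rng.random((100_000, n)) < p   # each entry ~ Bernoulli(p)
binomial_draws = bernoulli_trials.sum(axis=1)     # each row-sum ~ B(n, p)
print(binomial_draws.mean(), n * p)               # both ≈ 2.5
print(binomial_draws.var(), n * p * (1 - p))      # both ≈ 1.875
```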
=== Normal approximation ===
[Figure: binomial probability mass function and normal probability density function approximation]
If n is large enough, then the skew of the distribution is not too great. In this case a reasonable approximation to \mathrm{B}(n, p) is given by the normal distribution \mathcal{N}(np,\,np(1-p)), and this basic approximation can be improved in a simple way by using a suitable continuity correction. The basic approximation generally improves as n increases (at least 20) and is better when p is not near to 0 or 1. Various rules of thumb may be used to decide whether n is large enough, and p is far enough from the extremes of zero or one:
• One rule is that both values np and n(1-p) must be greater than or equal to 5. However, the specific number varies from source to source, and depends on how good an approximation one wants. In particular, if one uses 9 instead of 5, the rule implies the 3-standard-deviation rule: the normal approximation is appropriate only if everything within 3 standard deviations of the mean lies within the range of possible values, that is, only if
\mu \pm 3\sigma = np \pm 3\sqrt{np(1-p)} \in (0, n).
To see this, assume that both values np and n(1-p) are greater than 9. Since 0 < p < 1, we easily have that
np \geq 9 > 9(1-p) \quad\text{and}\quad n(1-p) \geq 9 > 9p.
We only have to divide now by the respective factors p and 1-p, to deduce the alternative form of the 3-standard-deviation rule:
n > 9\left(\frac{1-p}{p}\right) \quad\text{and}\quad n > 9\left(\frac{p}{1-p}\right).
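The implication can also be spot-checked numerically; this is a sketch over an arbitrary grid of (n, p) values:

```python
# Check that np >= 9 and n(1-p) >= 9 imply np ± 3·sqrt(np(1-p)) lies in (0, n).
import numpy as np

ok = True
for n in range(1, 500):
    for p in np.linspace(0.01, 0.99, 99):
        if n * p >= 9 and n * (1 - p) >= 9:
            sigma = np.sqrt(n * p * (1 - p))
            ok &= (n * p - 3 * sigma > 0) and (n * p + 3 * sigma < n)
print(ok)  # True
```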
The following is an example of applying a continuity correction. Suppose one wishes to calculate \Pr(X \le 8) for a binomial random variable X. If Y has a distribution given by the normal approximation, then \Pr(X \le 8) is approximated by \Pr(Y \le 8.5). The addition of 0.5 is the continuity correction; the uncorrected normal approximation gives considerably less accurate results.
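A sketch of this comparison (the original leaves n and p unspecified; n = 20, p = 0.5 are assumed here for illustration):

```python
# Compare the exact Pr(X <= 8) with the normal approximation,
# with and without the 0.5 continuity correction.
from scipy.stats import binom, norm

n, p = 20, 0.5
mu, sigma = n * p, (n * p * (1 - p)) ** 0.5
exact = binom.cdf(8, n, p)              # ≈ 0.2517
corrected = norm.cdf(8.5, mu, sigma)    # ≈ 0.2512, close to exact
uncorrected = norm.cdf(8, mu, sigma)    # ≈ 0.1855, considerably worse
print(exact, corrected, uncorrected)
```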
This approximation, known as the de Moivre–Laplace theorem, is a huge time-saver when undertaking calculations by hand (exact calculations with large n are very onerous); historically, it was the first use of the normal distribution, introduced in
Abraham de Moivre's book
The Doctrine of Chances in 1738. Nowadays, it can be seen as a consequence of the
central limit theorem since \mathrm{B}(n, p) is a sum of n independent, identically distributed Bernoulli variables with parameter p. This fact is the basis of a hypothesis test, a "proportion z-test", for the value of p using x/n, the sample proportion and estimator of p, in a common test statistic. For example, suppose one randomly samples n people out of a large population and asks them whether they agree with a certain statement. The proportion of people who agree will of course depend on the sample. If groups of n people were sampled repeatedly and truly randomly, the proportions would follow an approximate normal distribution with mean equal to the true proportion p of agreement in the population and with standard deviation \sigma = \sqrt{\frac{p(1-p)}{n}}.
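A sketch of such a proportion z-test (the null value p_0 = 0.5 and the observed data are assumed for illustration):

```python
# One-sample proportion z-test built on the normal approximation.
from math import sqrt
from scipy.stats import norm

def proportion_z_test(successes: int, n: int, p0: float) -> tuple[float, float]:
    """Two-sided z-test of H0: p = p0 using the sample proportion x/n."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # standard error under H0
    p_value = 2 * norm.sf(abs(z))
    return z, p_value

print(proportion_z_test(117, 200, 0.5))   # z ≈ 2.40, p ≈ 0.016
```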
=== Poisson approximation ===
The binomial distribution converges towards the Poisson distribution as the number of trials n goes to infinity while the product np converges to a finite limit. Therefore, the Poisson distribution with parameter \lambda = np can be used as an approximation to \mathrm{B}(n, p) if n is sufficiently large and p is sufficiently small. According to rules of thumb, this approximation is good if n \geq 20 and p \leq 0.05 such that np \leq 1, or if n > 50 and p < 0.1 such that np < 5, or if n \geq 100 and np \leq 10. Concerning the accuracy of Poisson approximation, see Novak, ch. 4, and references therein.
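A sketch comparing the two distributions for one parameter choice inside these rules of thumb (n = 100, p = 0.02, so np = 2):

```python
# Compare B(n, p) with its Poisson(np) approximation via total variation
# distance; Le Cam's inequality bounds the full distance by n·p² = 0.04.
import numpy as np
from scipy.stats import binom, poisson

n, p = 100, 0.02
k = np.arange(n + 1)
tv_distance = 0.5 * np.abs(binom.pmf(k, n, p) - poisson.pmf(k, n * p)).sum()
print(tv_distance)   # small
```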
=== Limiting distributions ===
• Poisson limit theorem: As n approaches \infty and p approaches 0 with the product np held fixed, the \mathrm{B}(n, p) distribution approaches the Poisson distribution with expected value \lambda = np.
=== Beta distribution ===
Beta distributions provide a family of conjugate prior probability distributions for binomial distributions in Bayesian inference:
P(p;\alpha,\beta) = \frac{p^{\alpha-1}(1-p)^{\beta-1}}{\operatorname{Beta}(\alpha,\beta)}.
Given a uniform prior, the posterior distribution for the probability of success p given n independent events with k observed successes is a beta distribution.
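As a sketch of this Bayesian update (the values n = 10 and k = 7 are assumed purely for illustration), a uniform \operatorname{Beta}(1, 1) prior combined with k successes in n trials yields the posterior \operatorname{Beta}(k + 1, n - k + 1):

```python
# Uniform-prior update: with k successes in n trials, the posterior for p
# is Beta(k + 1, n - k + 1). The data (n = 10, k = 7) are illustrative.
from scipy.stats import beta

n, k = 10, 7
posterior = beta(k + 1, n - k + 1)   # Beta(8, 4)
print(posterior.mean())              # posterior mean (k + 1)/(n + 2) ≈ 0.667
print(posterior.interval(0.95))      # central 95% credible interval for p
```

== Computational methods ==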