
Binomial proportion confidence interval

In statistics, a binomial proportion confidence interval is a confidence interval for the probability of success calculated from the outcome of a series of success–failure experiments. In other words, a binomial proportion confidence interval is an interval estimate of a success probability when only the number of experiments and the number of successes are known.

Problems with using a normal approximation or "Wald interval"
The normal approximation depends on the de Moivre–Laplace theorem (the original, binomial-only version of the central limit theorem) and becomes unreliable when it violates the theorem's premises, as the sample size becomes small or the success probability grows close to either 0 or 1.

Using the normal approximation, the success probability \ p\ is estimated by

p ~ \approx ~ \hat p \pm \frac{ z_\alpha }{ \sqrt{n\ } }\sqrt{ \hat p \left(1 - \hat p \right)\ }\ ,

where \ \hat p \equiv \frac{ n_\mathsf{s} }{ n }\ is the proportion of successes in a Bernoulli trial process and an estimator for p in the underlying Bernoulli distribution. The equivalent formula in terms of observation counts is

p ~ \approx ~ \frac{ n_\mathsf{s} }{ n } \pm \frac{ z_\alpha }{ \sqrt{n\ } } \sqrt{ \frac{ n_\mathsf{s} }{ n } \frac{ n_\mathsf{f} }{ n }\ }\ ,

where the data are the results of \ n\ trials that yielded \ n_\mathsf{s}\ successes and \ n_\mathsf{f} = n - n_\mathsf{s}\ failures. The distribution function argument \ z_\alpha\ is the \ 1 - \tfrac{ \alpha }{2}\ quantile of a standard normal distribution (i.e., the probit) corresponding to the target error rate \ \alpha ~. For a 95% confidence level, the error \ \alpha ~=~ 1 - 0.95 ~=~ 0.05\ , so that \ 1 - \tfrac{ \alpha }{ 2 } = 0.975\ and \ z_{.05} = 1.96 ~.

When using the Wald formula to estimate \ p\ , or just considering the possible outcomes of this calculation, two problems immediately become apparent:

• First, for \ \hat p\ approaching either 0 or 1, the interval narrows to zero width (falsely implying certainty).
• Second, for values of \ \hat p\ sufficiently close to 0 (probability too low), the interval boundaries exceed \ [0 , 1]\ (overshoot). (The same overshoot problem arises when instead \ 1 - \hat p\ is sufficiently close to 0: probability too high / too close to 1.)

An important theoretical derivation of this confidence interval involves the inversion of a hypothesis test. Under this formulation, the confidence interval represents those values of the population parameter that would have large p-values if they were tested as a hypothesized population proportion. The collection of values, \ \theta\ , for which the normal approximation is valid can be represented as

\left\{\quad \theta \quad \Bigg\vert \quad y_{\alpha} ~ \le ~ \frac{\hat p - \theta}{\sqrt{\tfrac{ 1 }{ n }\ \hat p \left(1 - \hat p\right)\ } } ~ \le ~ z_{\alpha} \quad\right\}\ ,

where \ y_{\alpha}\ is the lower \ \tfrac{ \alpha }{ 2 }\ quantile of a standard normal distribution, vs. \ z_{\alpha}\ , which is the upper (i.e., \ 1 - \tfrac{ \alpha }{ 2 }\ ) quantile.

Since the test in the middle of the inequality is a Wald test, the normal approximation interval is sometimes called the Wald interval or Wald method, after Abraham Wald, but it was first described by Laplace (1812).
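As a minimal sketch of the Wald computation and both of its failure modes, the following Python fragment applies the formula above (the function name wald_interval is our own, not from the source):

from scipy.stats import norm

def wald_interval(n_s, n, alpha=0.05):
    # Normal-approximation ("Wald") interval for a binomial proportion.
    p_hat = n_s / n
    z = norm.ppf(1 - alpha / 2)  # 1 - alpha/2 quantile of the standard normal
    half_width = z * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - half_width, p_hat + half_width

print(wald_interval(0, 10))  # (0.0, 0.0): zero-width interval when p-hat = 0
print(wald_interval(1, 10))  # lower bound is negative: overshoot below 0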
Bracketing the confidence interval

Extending the normal approximation and Wald–Laplace interval concepts, Michael Short has shown that inequalities on the approximation error between the binomial distribution and the normal distribution can be used to accurately bracket the estimate of the confidence interval around \ p\ :

\frac{ k + C_\mathsf{L1} - z_\alpha \widehat{W} }{ n + z_\alpha^2 } ~\le~ p ~\le~ \frac{ k + C_\mathsf{U1} + z_\alpha \widehat{W} }{ n + z_\alpha^2 }

with

\widehat{W} ~\equiv~ \sqrt{ \frac{ n k - k^2 + C_\mathsf{L2} n - C_\mathsf{L3} k + C_\mathsf{L4} }{ n }\ }\ ,

and where \ p\ is again the (unknown) proportion of successes in a Bernoulli trial process (as opposed to \ \hat p \equiv \frac{ n_\mathsf{s} }{ n }\ that estimates it) measured with \ n\ trials yielding \ k\ successes, \ z_\alpha\ is the \ 1 - \tfrac{\alpha}{2}\ quantile of a standard normal distribution (i.e., the probit) corresponding to the target error rate \ \alpha\ , and the constants \ C_\mathsf{L1}\ , \ C_\mathsf{L2}\ , \ C_\mathsf{L3}\ , \ C_\mathsf{L4}\ , \ C_\mathsf{U1}\ , \ C_\mathsf{U2}\ , \ C_\mathsf{U3}\ , and \ C_\mathsf{U4}\ are simple algebraic functions of \ z_\alpha ~.

Standard error of a proportion estimation when using weighted data

The appropriate standard error formula depends on the type of weights you are dealing with. For analytic weights, the following calculation can be applied. Let \ X_1,\ \ldots,\ X_n\ be such that each \ X_i\ is i.i.d. from a Bernoulli(p) distribution and weight \ w_i\ is the weight for each observation, with the (positive) weights \ w_i\ normalized so they sum to 1. The weighted sample proportion is

\hat p = \sum_{i=1}^n w_i X_i ~.

Since each of the \ X_i\ is independent from all the others, and each one has variance \ \operatorname{var}\{\ X_i\ \} = p \left(1 - p\right)\ for every \ i = 1 ,\ \ldots ,\ n\ , the sampling variance of the proportion therefore is:

\operatorname{var}\left\{\ \hat p\ \right\} ~=~ \sum_{i=1}^n \operatorname{var}\left\{\ w_i X_i\ \right\} ~=~ p\left( 1 - p \right) \sum_{i=1}^n w_i^2 ~.

The standard error of \ \hat p\ is the square root of this quantity. Because we do not know \ p \left(1 - p\right)\ , we have to estimate it. Although there are many possible estimators, a conventional one is to use \ \hat p\ , the sample mean, and plug this into the formula. That gives:

\operatorname{SE}\left\{\ \hat p\ \right\} ~\approx~ \sqrt{ \hat p \left(1 - \hat p\right) \sum_{i=1}^n w_i^2 ~} ~.

For otherwise unweighted data, the effective weights are uniform, \ w_i = \frac{ 1 }{ n }\ , giving \ \sum_{i=1}^n w_i^2 = \frac{ 1 }{ n } ~. The \ \operatorname{SE}\ becomes \ \sqrt{ \tfrac{ 1 }{ n }\ \hat p \left( 1 - \hat p \right)\ }\ , leading to the familiar formulas and showing that the calculation for weighted data is a direct generalization of them.

If the weights in question are complex sampling design weights, appropriate statistical software (such as the R package survey) needs to be used to obtain standard errors corrected for the survey sampling design. An extension of the Clopper–Pearson interval presented below to complex survey data is also known as the Korn–Graubard confidence interval.
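As a sketch of the weighted-data calculation just described (the function name weighted_proportion_se is our own), in Python:

import numpy as np

def weighted_proportion_se(x, w):
    # Weighted sample proportion and its estimated standard error.
    # x: array of 0/1 outcomes; w: positive analytic weights.
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    w = w / w.sum()                 # normalize weights so they sum to 1
    p_hat = np.sum(w * x)           # weighted sample proportion
    se = np.sqrt(p_hat * (1 - p_hat) * np.sum(w ** 2))
    return p_hat, se

# With uniform weights w_i = 1/n, sum of w_i^2 is 1/n, recovering the unweighted formula.
print(weighted_proportion_se([1, 0, 0, 1, 1], [1, 1, 1, 1, 1]))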
Wilson score interval
The Wilson score interval was developed by E.B. Wilson (1927). It is an improvement over the normal approximation interval in multiple respects: unlike the symmetric normal approximation interval (above), the Wilson score interval is asymmetric, and it does not suffer from the problems of overshoot and zero-width intervals that afflict the normal interval. It can be safely employed with small samples and skewed observations.

The interval also has a direct relationship with significance testing: if the two bounds are computed and a normal distribution is then plotted across each bound, the tail areas of the resulting Wilson and normal distributions, which represent the chance of a significant result in that direction, must be equal. The continuity-corrected Wilson score interval and the Clopper–Pearson interval are also compliant with this property. The practical import is that these intervals may be employed as significance tests, with identical results to the source test, and new tests may be derived by geometry.

Wilson score interval with continuity correction

The continuity-corrected Wilson score interval is

\begin{align} w_\mathsf{cc}^- &= \max \left\{ ~ 0, ~ \frac{2n\hat{p} + z_\alpha^2 - \left[z_\alpha\sqrt{ z_\alpha^2 - \frac{1}{ n } + 4n\hat{p}\left(1 - \hat{p}\right) + \left( 4\hat{p} - 2 \right) ~}+1\right]}{2 \left( n + z_\alpha^2 \right)} \right\}\ ,\\ w_\mathsf{cc}^+ &= \min \left\{ ~~ 1, ~~ \frac{2n\hat{p} + z_\alpha^2 + \left[z_\alpha\sqrt{ z_\alpha^2 - \frac{1}{ n } + 4n\hat{p}\left( 1 - \hat{p} \right) - \left( 4\hat{p} - 2 \right) ~}+1\right]}{2 \left( n + z_\alpha^2 \right)} \right\}\ , \end{align}

for \ \hat p \ne 0\ and \ \hat p \ne 1 ~. If \ \hat p = 0\ , then \ w_\mathsf{cc}^-\ must instead be set to \ 0\ ; if \ \hat p = 1\ , then \ w_\mathsf{cc}^+\ must instead be set to \ 1 ~.

Wallis (2021) identifies a simpler method for computing continuity-corrected Wilson intervals that employs a special function based on Wilson's lower-bound formula. In Wallis' notation, for the lower bound, let

\mathsf{Wilson_{lower}}\!\left(\ \hat{p},\ n,\ \tfrac{ \alpha }{ 2 }\ \right) ~\equiv~ w^- ~=~ \frac{ 1 }{~ 1 + z_\alpha^2/ n ~} \Biggl(\ \hat p + \frac{\; z_\alpha^2}{ 2n } ~~ - ~~ \frac{\ z_\alpha }{ 2n } \sqrt{4n\hat p(1 - \hat p) + z_\alpha^2 ~} ~ \Biggr)\ ,

where \ \alpha\ is the selected tolerable error level for \ z_\alpha ~. Then

w_\mathsf{cc}^- ~=~ \mathsf{Wilson_{lower}}\!\left(\ \max\left\{\ \hat{p} - \tfrac{ 1 }{ 2 n },\ 0\ \right\},\ n,\ \tfrac{ \alpha }{ 2 }\ \right) ~.

This method has the advantage of being further decomposable.

Wilson interval with complex survey data

For complex survey data, the sample size n needs to be replaced with the effective sample size \tilde n = n/{\rm DEFF} that accounts for the design effect of unequal weighting and clustering. It is also reasonable to replace the normal quantiles z_\alpha with Student's t distribution quantiles t_\alpha(d), where d is the sampling design degrees of freedom.
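As a sketch of the formulas above (the function names wilson_interval and wilson_cc_lower are our own), the Wilson interval and Wallis' continuity-corrected lower bound can be computed in Python as:

from scipy.stats import norm

def wilson_interval(p_hat, n, alpha=0.05):
    # Wilson score interval, following Wilson's lower/upper bound formula.
    z = norm.ppf(1 - alpha / 2)
    centre = p_hat + z**2 / (2 * n)
    spread = (z / (2 * n)) * (4 * n * p_hat * (1 - p_hat) + z**2) ** 0.5
    denom = 1 + z**2 / n
    return (centre - spread) / denom, (centre + spread) / denom

def wilson_cc_lower(p_hat, n, alpha=0.05):
    # Wallis' decomposition: apply Wilson's lower-bound formula to
    # max(p_hat - 1/(2n), 0) to get the continuity-corrected lower bound.
    return wilson_interval(max(p_hat - 1 / (2 * n), 0.0), n, alpha)[0]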
Jeffreys interval
The Jeffreys interval has a Bayesian derivation, but good frequentist properties (outperforming most frequentist constructions). In particular, it has coverage properties that are similar to those of the Wilson interval, but it is one of the few intervals with the advantage of being equal-tailed (e.g., for a 95% confidence interval, the probabilities of the interval lying above or below the true value are both close to 2.5%). In contrast, the Wilson interval has a systematic bias such that it is centred too close to p = 0.5.

The Jeffreys interval is the Bayesian credible interval obtained when using the non-informative Jeffreys prior for the binomial proportion p ~. The Jeffreys prior for this problem is a Beta distribution with parameters \left( \tfrac{\!1\!}{ 2 }, \tfrac{\!1\!}{ 2 } \right), a conjugate prior. After observing x successes in n trials, the posterior distribution for p is a Beta distribution with parameters \left( x + \tfrac{\!1\!}{ 2 }, n - x + \tfrac{\!1\!}{ 2 } \right) ~.

When x \ne 0 and x \ne n, the Jeffreys interval is taken to be the 100\left( 1 - \alpha \right)\mathrm{%} equal-tailed posterior probability interval, i.e., the \tfrac{\!1\!}{ 2 }\alpha and 1 - \tfrac{\!1\!}{ 2 }\alpha quantiles of a Beta distribution with parameters \left(x + \tfrac{1}{ 2 },n - x + \tfrac{1}{ 2 }\right) ~. In order to avoid the coverage probability tending to zero when p \to 0 or p \to 1, when x = 0 the upper limit is calculated as before but the lower limit is set to 0, and when x = n the lower limit is calculated as before but the upper limit is set to 1.

Jeffreys' interval can also be thought of as a frequentist interval based on inverting the p-value from the G-test after applying the Yates correction to avoid a potentially-infinite value for the test statistic.
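A minimal Python sketch of this recipe (the function name jeffreys_interval is ours), using the Beta posterior quantiles and boundary adjustments described above:

from scipy.stats import beta

def jeffreys_interval(x, n, alpha=0.05):
    # Equal-tailed credible interval from the Beta(x + 1/2, n - x + 1/2)
    # posterior; lower limit forced to 0 when x = 0, upper to 1 when x = n.
    lower = 0.0 if x == 0 else beta.ppf(alpha / 2, x + 0.5, n - x + 0.5)
    upper = 1.0 if x == n else beta.ppf(1 - alpha / 2, x + 0.5, n - x + 0.5)
    return lower, upper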
Clopper–Pearson interval
The Clopper–Pearson interval is an early and very common method for calculating binomial confidence intervals. This is often called an 'exact' method, as it attains the nominal coverage level in an exact sense, meaning that the coverage level is never less than the nominal 1 - \alpha.

Because of a relationship between the binomial distribution and the beta distribution, the Clopper–Pearson interval can be stated in terms of beta distribution quantiles as

\left(\ B\!\left(\tfrac{\!\alpha\!}{ 2 };x,n - x + 1 \right),\ B\!\left(1 - \tfrac{\!\alpha\!}{ 2 };x + 1,n - x \right)\ \right)\ ,

where x is the number of successes, n is the number of trials, and B\!\left(p;v,w\right) is the p-th quantile from a beta distribution with shape parameters v and w ~. Thus, the interval is \left(p_{\min},p_{\max} \right), where:

\tfrac{\Gamma(n+1)}{\Gamma\!( x )\Gamma\!( n-x+1 )}\int_0^{ p_{\min}}t^{x-1}(1-t)^{n-x}\mathrm{d}\!t ~~ = ~~ \tfrac{\!\alpha\!}{ 2 }\ ,

\tfrac{\Gamma(n+1)}{\Gamma(x+1)\Gamma(n-x)}\int_0^{ p_{\max}}t^{x}(1-t)^{n-x-1}\mathrm{d}\!t ~ = ~ 1 - \tfrac{\!\alpha\!}{ 2 } ~,

as follows from the relation between the binomial distribution's cumulative distribution function and the regularized incomplete beta function. When x is either 0 or n, closed-form expressions for the interval bounds are available: when x = 0 the interval is \left( 0,1 - \left(\tfrac{\!\alpha\!}{ 2 }\right)^{ 1/n } \right) and when x = n it is \left(\left( \tfrac{\!\alpha\!}{ 2 }\right)^{ 1 / n },1\right) ~.

Because the Clopper–Pearson interval is exact in this sense, it is conservative: its actual coverage can be well above the nominal level. In contrast, approximate intervals such as the normal approximation (Wald) interval, the Wilson interval, the Agresti–Coull interval, etc., with a nominal coverage of 95% may in fact cover less than 95%. The beta distribution quantiles can be computed with standard functions such as qbeta in R and scipy.stats.beta.ppf in Python. For example, in Python:

from scipy.stats import beta
import numpy as np

k = 20        # number of successes
n = 400       # number of trials
alpha = 0.05  # target error rate for a 95% interval

# Lower bound B(alpha/2; k, n - k + 1) and upper bound B(1 - alpha/2; k + 1, n - k).
p_u, p_o = beta.ppf([alpha / 2, 1 - alpha / 2], [k, k + 1], [n - k + 1, n - k])
if np.isnan(p_o):  # k = n: closed-form upper bound is 1
    p_o = 1
if np.isnan(p_u):  # k = 0: closed-form lower bound is 0
    p_u = 0
Agresti–Coull interval
The Agresti–Coull interval is another approximate binomial confidence interval. Given n_\mathsf{s} successes in n trials, define

\tilde n \equiv n + z^2_\alpha

and

\tilde p = \frac{ 1 }{\tilde n}\left(\!n_\mathsf{s} + \tfrac{z^2_\alpha }{ 2 }\!\right) ~.

Then, a confidence interval for p is given by

p ~~ \approx ~~ \tilde p ~ \pm ~ z_\alpha\sqrt{ \frac{\!\tilde p\!}{ \tilde n }\left(\!1 - \tilde p \!\right) ~}\ ,

where z_\alpha = \operatorname{\Phi^{-1}}\!\!\left(1 - \tfrac{\!\alpha\!}{ 2 }\right) is the quantile of a standard normal distribution, as before (for example, a 95% confidence interval requires \alpha = 0.05, thereby producing z_{.05} = 1.96 ). According to Brown, Cai, & DasGupta (2001), taking z = 2 instead of 1.96 produces the "add 2 successes and 2 failures" interval previously described by Agresti & Coull.

This interval can be summarised as employing the centre-point adjustment, \tilde p, of the Wilson score interval, and then applying the normal approximation to this point:

\tilde p = \frac{\quad \hat p + \frac{z^2_\alpha\!}{2n} \quad}{\quad 1 + \frac{z^2_\alpha }{ n } \quad} ~.
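A short Python sketch of this construction (the function name agresti_coull_interval is ours):

from scipy.stats import norm

def agresti_coull_interval(n_s, n, alpha=0.05):
    # Wald-style interval applied to the adjusted counts n-tilde and p-tilde.
    z = norm.ppf(1 - alpha / 2)
    n_tilde = n + z**2                    # adjusted number of trials
    p_tilde = (n_s + z**2 / 2) / n_tilde  # adjusted proportion
    half_width = z * (p_tilde * (1 - p_tilde) / n_tilde) ** 0.5
    return p_tilde - half_width, p_tilde + half_width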
Arcsine transformation
The arcsine transformation has the effect of pulling out the ends of the distribution. While it can stabilize the variance (and thus confidence intervals) of proportion data, its use has been criticized in several contexts.

Let X be the number of successes in n trials and let p = \tfrac{ 1 }{\!n\!}\!X ~. The variance of p is

\operatorname{var}\{~ p ~\} = \tfrac{ 1 }{\!n\!}p(1 - p) ~.

Using the arcsine transform, the variance of \arcsin \sqrt{p ~} is

\operatorname{var} \left\{~ \arcsin \sqrt{ p ~} ~\right\} ~ \approx ~ \frac{\operatorname{var}\{~ p ~\}}{4p(1 - p)} = \frac{p(1 - p)}{4np(1 - p)} = \frac{1}{4n} ~,

so the transformed quantity has an approximately constant standard error of \tfrac{1}{2\sqrt{n\,}} regardless of p. The confidence interval itself then has the form

\left(\ \sin^2 \left(\arcsin\sqrt{p} - \frac{ z_\alpha }{2\sqrt{n}} ~\right),\ \sin^2 \left(\arcsin\sqrt{p} + \frac{ z_\alpha }{2\sqrt{n}} ~\right)\ \right)\ ,

where z_\alpha is the 1 -\tfrac{\!\alpha\!}{2} quantile of a standard normal distribution. This method may be used to estimate the variance of p, but its use is problematic when p is close to 0 or 1.
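To make the back-transformation concrete, a Python sketch (the function name arcsine_interval is ours); as noted above, it misbehaves when p is near 0 or 1:

import numpy as np
from scipy.stats import norm

def arcsine_interval(p, n, alpha=0.05):
    # Interval built on the variance-stabilized scale, mapped back via sin^2.
    z = norm.ppf(1 - alpha / 2)
    centre = np.arcsin(np.sqrt(p))      # transformed proportion
    half_width = z / (2 * np.sqrt(n))   # constant SE of 1/(2 sqrt(n))
    return np.sin(centre - half_width) ** 2, np.sin(centre + half_width) ** 2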
t_a transform
Let p be the proportion of successes. For 0 \le a \le 2,

t_a = \log\left(\frac{p^a}{(1 - p)^{2-a}}\right) = a\log p - (2-a)\log(1 - p) ~.

This family is a generalisation of the logit transform, which is the special case with a = 1, and can be used to transform a proportional data distribution to an approximately normal distribution. The parameter a has to be estimated for the data set.
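For illustration (the function name t_a is ours), the transform and its logit special case in Python:

import numpy as np

def t_a(p, a):
    # t_a transform of a proportion, for 0 <= a <= 2.
    return a * np.log(p) - (2 - a) * np.log(1 - p)

# a = 1 recovers the logit transform: t_1(p) = log(p / (1 - p))
print(t_a(0.2, 1.0))  # log(0.25) = -1.386...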
Rule of three — for when no successes are observed
The rule of three is used to provide a simple way of stating an approximate 95% confidence interval for p in the special case that no successes (\hat p = 0) have been observed. The interval is \left(0, \tfrac{3}{n} \right). By symmetry, in the case of only successes (\hat p = 1), the interval is \left(1 - \tfrac{3}{n},1 \right).
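For example, if no successes are observed in n = 30 trials, the rule of three gives the approximate 95% interval (0, 3/30) = (0, 0.10).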
Comparison and discussion
There are several research papers that compare these and other confidence intervals for the binomial proportion. Both Ross (2003) and Agresti & Coull (1998) point out that exact methods such as the Clopper–Pearson interval may not work as well as some of the approximations.