One method is to calculate the posterior probability density function of Bayesian probability theory.

A test is performed by tossing the coin N times and noting the observed numbers of heads, h, and tails, t. The symbols H and T represent more generalised variables expressing the numbers of heads and tails respectively that might have been observed in the experiment. Thus N = H + T = h + t.

Next, let r be the actual probability of obtaining heads in a single toss of the coin. This is the property of the coin which is being investigated. Using Bayes' theorem, the posterior probability density of r conditional on h and t is expressed as follows:

: f(r \mid H = h, T = t) = \frac{\Pr(H = h \mid r, N = h + t) \, g(r)}{\int_0^1 \Pr(H = h \mid p, N = h + t) \, g(p) \, dp},

where g(r) represents the prior probability density distribution of r, which lies in the range 0 to 1.

The prior probability density distribution summarizes what is known about the distribution of r in the absence of any observation. We will assume that the prior distribution of r is uniform over the interval [0, 1]. That is, g(r) = 1. (In practice, it would be more appropriate to assume a prior distribution which is much more heavily weighted in the region around 0.5, to reflect our experience with real coins.)
The probability of obtaining h heads in N tosses of a coin with a probability of heads equal to r is given by the binomial distribution:

: \Pr(H = h \mid r, N = h + t) = {N \choose h} r^h (1 - r)^t.

Substituting this into the previous formula:

: f(r \mid H = h, T = t) = \frac{{N \choose h} r^h (1-r)^t} {\int_0^1 {N \choose h} p^h (1 - p)^t\,dp} = \frac{r^h (1 - r)^t}{\int_0^1 p^h (1 - p)^t\,dp}.
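As an aside, this posterior can also be approximated numerically without evaluating the integral in closed form. The sketch below is only an illustration, not part of the derivation; the grid resolution and the counts h = 7, t = 3 are arbitrary choices.

<syntaxhighlight lang="python">
import numpy as np

# A minimal numerical illustration of the formula above (not part of the
# derivation): the counts h = 7, t = 3 and the grid size are arbitrary choices.
h, t = 7, 3

r = np.linspace(0.0, 1.0, 10001)        # grid of candidate values of r
prior = np.ones_like(r)                 # uniform prior g(r) = 1
likelihood = r ** h * (1.0 - r) ** t    # binomial likelihood (the constant factor cancels)
unnormalised = likelihood * prior

dr = r[1] - r[0]
posterior = unnormalised / (unnormalised.sum() * dr)   # divide by the integral over [0, 1]

print(posterior.sum() * dr)             # ~1.0: the density integrates to one
print(r[np.argmax(posterior)])          # ~0.7: the density peaks at h / (h + t)
</syntaxhighlight>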
The posterior density derived above is in fact a beta distribution (the conjugate prior for the binomial distribution), whose denominator can be expressed in terms of the beta function:

: f(r \mid H = h, T = t) = \frac{1}{\mathrm{B}(h + 1, t + 1)} r^h (1 - r)^t.

As a uniform prior distribution has been assumed, and because h and t are integers, this can also be written in terms of factorials:

: f(r \mid H = h, T = t) = \frac{(h + t + 1)!}{h!\,t!} r^h (1 - r)^t.
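For readers who wish to verify the factorial form, the short sketch below compares it with the density of a Beta(h + 1, t + 1) distribution as implemented in SciPy; the counts h = 7, t = 3 and the evaluation point r = 0.6 are arbitrary choices for the illustration.

<syntaxhighlight lang="python">
from math import factorial

from scipy.stats import beta

h, t = 7, 3        # arbitrary counts for the illustration
r = 0.6            # an arbitrary point at which to evaluate the density

# Factorial form of the posterior density derived above.
f_factorial = factorial(h + t + 1) / (factorial(h) * factorial(t)) * r ** h * (1 - r) ** t

# The same value from the Beta(h + 1, t + 1) density.
f_beta = beta.pdf(r, h + 1, t + 1)

print(f_factorial, f_beta)   # the two agree (about 2.36 at r = 0.6)
</syntaxhighlight>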
=== Example ===

For example, let N = 10, h = 7, i.e. the coin is tossed 10 times and 7 heads are obtained:

: f(r \mid H = 7, T = 3) = \frac{(10 + 1)!}{7!\,3!} r^7 (1 - r)^3 = 1320 \, r^7 (1 - r)^3.

The graph on the right shows the probability density function of r given that 7 heads were obtained in 10 tosses. (Note: r is the probability of obtaining heads when tossing the same coin once.)

The probability for an unbiased coin (defined for this purpose as one whose probability of coming down heads is somewhere between 45% and 55%)

: \Pr(0.45 < r < 0.55) = \int_{0.45}^{0.55} f(r \mid H = 7, T = 3) \, dr \approx 13\%

is small when compared with the alternative hypothesis (a biased coin). However, it is not small enough to cause us to believe that the coin has a significant bias. This probability is slightly higher than our presupposition of the probability that the coin was fair corresponding to the uniform prior distribution, which was 10%. (Using a prior distribution that reflects our prior knowledge of what a coin is and how it acts, the posterior distribution would not favor the hypothesis of bias. However, the number of trials in this example (10 tosses) is very small, and with more trials the choice of prior distribution would be somewhat less relevant.)
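The figures quoted in this example can be checked numerically. The sketch below is only an illustration (the 45%–55% interval is the definition of "unbiased" used above, and SciPy's beta distribution is used for convenience); it also evaluates the posterior mode and mean that are discussed in the next paragraph.

<syntaxhighlight lang="python">
from scipy.stats import beta

h, t = 7, 3
posterior = beta(h + 1, t + 1)     # the Beta(8, 4) posterior from the example

# Posterior probability that the coin is "unbiased" (0.45 < r < 0.55).
print(posterior.cdf(0.55) - posterior.cdf(0.45))   # ~0.13; the uniform prior gives 0.10

# Posterior mode (the MAP estimate) and posterior mean, discussed in the next paragraph.
print(h / (h + t))                 # 0.7
print(posterior.mean())            # (h + 1) / (h + t + 2) = 2/3
</syntaxhighlight>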
With the uniform prior, the posterior probability distribution f(r | H = 7, T = 3) achieves its peak at r = h / (h + t) = 0.7; this value is called the maximum a posteriori (MAP) estimate of r. Also with the uniform prior, the expected value of r under the posterior distribution is

: \operatorname{E}[r] = \int_0^1 r \cdot f(r \mid H = 7, T = 3) \, \mathrm{d}r = \frac{h + 1}{h + t + 2} = \frac{2}{3}.

== Estimator of true probability ==