The following is an example that shows how to compute power for a randomized experiment: Suppose the goal of an experiment is to study the effect of a treatment on some quantity, and so we shall compare research subjects by measuring the quantity before and after the treatment, analyzing the data using a one-sided
paired t-test, with a significance level threshold of 0.05. We are interested in being able to detect a positive change of size \theta > 0. We first set up the problem according to our test. Let A_i and B_i denote the pre-treatment and post-treatment measures on subject i, respectively. The possible effect of the treatment should be visible in the differences D_i = B_i-A_i, which are assumed to be independent and identically
Normal in distribution, with unknown mean value \mu_D and variance \sigma_D^2. Here, it is natural to choose our null hypothesis to be that the expected mean difference is zero, i.e. H_0: \mu_D =\mu_0= 0. For our one-sided test, the alternative hypothesis would be that there is a positive effect, corresponding to H_1: \mu_D = \theta > 0. The
test statistic in this case is defined as: T_n=\frac{\bar{D}_n-\mu_0}{\hat{\sigma}_D/\sqrt{n}} =\frac{\bar{D}_n-0}{\hat{\sigma}_D/\sqrt{n}}, where \mu_0 is the mean under the null hypothesis (so we substitute in 0), n is the sample size (number of subjects), \bar{D}_n is the
sample mean of the difference \bar{D}_n=\frac{1}{n}\sum_{i=1}^n D_i, and \hat{\sigma}_D is the sample
standard deviation of the difference.
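As an illustration, the test statistic above can be computed directly from paired measurements. The following Python sketch uses only the definitions given here (the data values are made up purely for illustration):

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(before, after):
    """T_n = (D-bar_n - 0) / (sigma-hat_D / sqrt(n)) for paired samples.

    `before` and `after` hold the pre- and post-treatment measurements
    A_i and B_i; the differences D_i = B_i - A_i are assumed i.i.d. Normal.
    """
    diffs = [b - a for a, b in zip(before, after)]
    n = len(diffs)
    d_bar = mean(diffs)          # sample mean of the differences
    s_d = stdev(diffs)           # sample standard deviation (n - 1 denominator)
    return d_bar / (s_d / sqrt(n))

# Hypothetical data for six subjects:
before = [10.2, 9.8, 11.1, 10.5, 9.9, 10.8]
after  = [11.0, 10.1, 11.9, 11.2, 10.4, 11.5]
print(round(paired_t_statistic(before, after), 3))
```

The statistic would then be compared against a Student t-distribution with n-1 degrees of freedom, as described next.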
=== Analytic solution ===
We can proceed according to our knowledge of statistical theory, though in practice, for a standard case like this, software exists to compute more accurate answers. Thanks to t-test theory, we know that under the null hypothesis this test statistic follows a
Student t-distribution with n-1 degrees of freedom. If we wish to reject the null at significance level \alpha = 0.05\,, we must find the
critical value t_{\alpha} such that the probability of T_n > t_{\alpha} under the null is equal to \alpha. If n is large, the t-distribution converges to the standard normal distribution (thus no longer involving n) and so through use of the
corresponding quantile function \Phi^{-1}, we obtain that the null should be rejected if T_n > t_{\alpha} \approx \Phi^{-1}(0.95) \approx 1.64\,. Now suppose that the alternative hypothesis H_1 is true, so \mu_D = \theta. Then, writing the power as a function of the effect size, B(\theta), we find the probability of T_n being above t_{\alpha} under H_1:

\begin{align} B(\theta) &\approx \Pr\left( T_n > 1.64 ~\big|~ \mu_D = \theta \right) \\ &= \Pr\left( \frac{\bar{D}_n - 0}{\hat{\sigma}_D/\sqrt{n}} > 1.64 ~\Big|~ \mu_D = \theta \right) \\ &= 1 - \Pr\left( \frac{\bar{D}_n - 0}{\hat{\sigma}_D/\sqrt{n}} \le 1.64 ~\Big|~ \mu_D = \theta \right) \\ &= 1 - \Pr\left( \frac{\bar{D}_n - \theta}{\hat{\sigma}_D/\sqrt{n}} \le 1.64 - \frac{\theta}{\hat{\sigma}_D/\sqrt{n}} ~\Big|~ \mu_D = \theta \right) \end{align}

Here \frac{\bar{D}_n - \theta}{\hat{\sigma}_D/\sqrt{n}} again follows a Student t-distribution under H_1, converging to a standard normal distribution for large n, and the estimate \hat{\sigma}_D will also converge to its population value \sigma_D. Thus power can be approximated as B(\theta) \approx 1 - \Phi\left( 1.64 - \frac{\theta}{\sigma_D/\sqrt{n}} \right). According to this formula, the power increases with the effect size \theta and the sample size n, and decreases with increasing variability \sigma_D. In the trivial case of zero effect size, power is at its minimum (infimum) and equal to the significance level of the test \alpha\,, in this example 0.05. For finite sample sizes and non-zero variability, it is the case here, as is typical, that power cannot be made equal to 1 except in the trivial case where \alpha = 1 so the null is
always rejected. We can invert B to obtain the required sample size: \sqrt{n} > \frac{\sigma_D}{\theta}\left( 1.64- \Phi^{-1} \left( 1- B(\theta)\right) \right). Suppose \theta = 1 and we believe \sigma_D is around 2; then, for a power of B(\theta) = 0.8, we require a sample size n > 4 \left( 1.64- \Phi^{-1} \left( 1- 0.8\right) \right)^2 \approx 4 \left( 1.64+0.84\right)^2 \approx 24.6 .
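The two analytic formulas can be evaluated with a few lines of Python, using the standard library's `statistics.NormalDist` for \Phi and \Phi^{-1} (a sketch of this example, not a general power calculator; the function names are our own). Using full-precision quantiles rather than the rounded values 1.64 and 0.84 gives a bound of about 24.7 instead of 24.6, hence 25 subjects:

```python
from math import sqrt, ceil
from statistics import NormalDist

norm = NormalDist()  # standard normal: supplies Phi (cdf) and Phi^{-1} (inv_cdf)

def approx_power(theta, sigma_d, n, alpha=0.05):
    """Normal-approximation power B(theta) = 1 - Phi(z_alpha - theta/(sigma_d/sqrt(n)))."""
    z_alpha = norm.inv_cdf(1 - alpha)          # approx. 1.64 for alpha = 0.05
    return 1 - norm.cdf(z_alpha - theta / (sigma_d / sqrt(n)))

def required_n(theta, sigma_d, power, alpha=0.05):
    """Smallest integer n whose approximate power reaches `power`."""
    z_alpha = norm.inv_cdf(1 - alpha)
    root_n = (sigma_d / theta) * (z_alpha - norm.inv_cdf(1 - power))
    return ceil(root_n ** 2)

# The worked example: theta = 1, sigma_D = 2, target power 0.8.
print(required_n(1, 2, 0.8))                 # 25 subjects
print(round(approx_power(1, 2, 25), 3))      # slightly above the 0.8 target
```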
=== Simulation solution ===
Alternatively we can use a
Monte Carlo simulation method that works more generally. Once again we return to the distributional assumption on the D_i and the definition of T_n. Suppose we have fixed values of the sample size, variability and effect size, and wish to compute power. We can adopt this process:

1. Generate a large number of samples of size n from the null hypothesis distribution, N(0, \sigma_D^2).
2. Compute the resulting test statistic T_n for each sample.
3. Compute the (1-\alpha)th quantile of the simulated T_n and use that as an estimate of t_\alpha.
4. Now generate a large number of samples of size n from the alternative hypothesis distribution, N(\theta, \sigma_D^2), and compute the corresponding test statistics again.
5. Compute the proportion of these simulated alternative T_n that are above the t_\alpha calculated in step 3 and are thus rejected. This proportion is the estimated power.

This can be done with a variety of software packages. Using this methodology with the values from before, setting the sample size to 25 leads to an estimated power of around 0.78. The small discrepancy with the previous section is due mainly to inaccuracies in the normal approximation.

== Power in different disciplines ==
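The five-step Monte Carlo procedure for this example can be sketched in plain Python (function names and the number of replications are our own choices; a real analysis would typically use a dedicated power routine):

```python
import random
from math import sqrt
from statistics import mean, stdev

def t_stat(sample):
    """The paired-test statistic T_n for one simulated sample of differences."""
    return mean(sample) / (stdev(sample) / sqrt(len(sample)))

def simulated_power(theta, sigma_d, n, alpha=0.05, reps=20_000, seed=1):
    rng = random.Random(seed)  # fixed seed for reproducibility
    # Steps 1-2: test statistics for samples drawn under the null, N(0, sigma_D^2)
    null_stats = sorted(
        t_stat([rng.gauss(0, sigma_d) for _ in range(n)]) for _ in range(reps)
    )
    # Step 3: empirical (1 - alpha) quantile as the critical value t_alpha
    t_alpha = null_stats[int((1 - alpha) * reps)]
    # Step 4: test statistics for samples drawn under the alternative, N(theta, sigma_D^2)
    alt_stats = (
        t_stat([rng.gauss(theta, sigma_d) for _ in range(n)]) for _ in range(reps)
    )
    # Step 5: proportion of alternative statistics exceeding t_alpha = estimated power
    return sum(t > t_alpha for t in alt_stats) / reps

print(round(simulated_power(1, 2, 25), 2))  # close to the 0.78 reported above
```

With theta = 1, sigma_D = 2 and n = 25 this lands near 0.78, subject to Monte Carlo noise from the finite number of replications.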