The jackknife
estimator of a parameter is found by systematically leaving out each observation from a dataset, calculating the parameter estimate over the remaining observations, and then aggregating these calculations.

For example, if the parameter to be estimated is the population mean of a random variable x, then for a given set of i.i.d. observations x_1, \ldots, x_n the natural estimator is the sample mean:

\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i = \frac{1}{n} \sum_{i \in [n]} x_i,

where the last sum uses another way of indicating that the index i runs over the set [n] = \{1, \ldots, n\}.

Then we proceed as follows: for each i \in [n] we compute the mean \bar{x}_{(i)} of the jackknife subsample consisting of all but the i-th data point; this is called the i-th jackknife replicate:

\bar{x}_{(i)} = \frac{1}{n-1} \sum_{j \in [n], j \ne i} x_j, \qquad i = 1, \dots, n.

It can help to think of these n jackknife replicates \bar{x}_{(1)}, \ldots, \bar{x}_{(n)} as approximating the distribution of the sample mean \bar{x}, with the approximation improving as n grows. Finally, the jackknife estimator is obtained by averaging the n jackknife replicates:

\bar{x}_\text{jack} = \frac{1}{n} \sum_{i=1}^n \bar{x}_{(i)}.

One may ask about the bias and the variance of \bar{x}_\text{jack}. From the definition of \bar{x}_\text{jack} as the average of the jackknife replicates, one could try to calculate these explicitly. The bias is a trivial calculation, but the variance of \bar{x}_\text{jack} is more involved, since the jackknife replicates are not independent.

For the special case of the mean, one can show explicitly that the jackknife estimate equals the usual estimate:

\frac{1}{n} \sum_{i=1}^n \bar{x}_{(i)} = \bar{x},

since each x_j appears in exactly n-1 of the replicates, so the double sum \sum_{i} \sum_{j \ne i} x_j reduces to (n-1) \sum_{j} x_j. This establishes the identity \bar{x}_\text{jack} = \bar{x}. Taking expectations gives E[\bar{x}_\text{jack}] = E[\bar{x}] = E[x], so \bar{x}_\text{jack} is unbiased, while taking variances gives V[\bar{x}_\text{jack}] = V[\bar{x}] = V[x]/n. However, these properties do not generally hold for parameters other than the mean.

This simple example of mean estimation merely illustrates the construction of a jackknife estimator; the real subtleties (and the usefulness) emerge when estimating other parameters, such as moments higher than the mean or other functionals of the distribution.

\bar{x}_\text{jack} could be used to construct an empirical estimate of the bias of \bar{x}, namely \widehat{\operatorname{bias}}(\bar{x})_\text{jack} = c(\bar{x}_\text{jack} - \bar{x}) with some suitable factor c > 0. Although in this case we know that \bar{x}_\text{jack} = \bar{x}, so the construction adds no meaningful knowledge here, it still gives the correct estimate of the bias (which is zero).

A jackknife estimate of the variance of \bar{x} can be calculated from the variance of the jackknife replicates \bar{x}_{(i)}:

\widehat{\operatorname{var}}(\bar{x})_\text{jack} = \frac{n - 1}{n} \sum_{i=1}^n (\bar{x}_{(i)} - \bar{x}_\text{jack})^2 = \frac{1}{n(n - 1)} \sum_{i=1}^n (x_i - \bar{x})^2.

The left equality defines the estimator \widehat{\operatorname{var}}(\bar{x})_\text{jack}, and the right equality is an identity that can be verified directly. Taking expectations gives E[\widehat{\operatorname{var}}(\bar{x})_\text{jack}] = V[x]/n = V[\bar{x}], so this is an unbiased estimator of the variance of \bar{x}.
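To make the construction concrete, here is a minimal Python/NumPy sketch (the helper names jackknife_replicates and jackknife_variance are illustrative, not from any particular library) that computes the jackknife replicates of the sample mean, confirms numerically that \bar{x}_\text{jack} = \bar{x}, and checks that the two expressions for \widehat{\operatorname{var}}(\bar{x})_\text{jack} agree:

```python
import numpy as np

def jackknife_replicates(x, estimator=np.mean):
    """Leave-one-out replicates: the estimator applied to all but the i-th point."""
    n = len(x)
    return np.array([estimator(np.delete(x, i)) for i in range(n)])

def jackknife_variance(x, estimator=np.mean):
    """Jackknife variance estimate: (n-1)/n * sum_i (theta_(i) - theta_jack)^2."""
    reps = jackknife_replicates(x, estimator)
    n = len(x)
    return (n - 1) / n * np.sum((reps - reps.mean()) ** 2)

rng = np.random.default_rng(0)
x = rng.normal(size=20)
n = len(x)

reps = jackknife_replicates(x)   # the n jackknife replicates xbar_(i)
x_jack = reps.mean()             # jackknife estimator of the mean

# For the mean, the jackknife estimate coincides with the ordinary sample mean.
print(np.isclose(x_jack, x.mean()))  # True

# The two expressions for the jackknife variance estimate agree, and both equal
# the usual unbiased estimate of Var(xbar) = Var(x)/n.
var_jack = jackknife_variance(x)
print(np.isclose(var_jack, np.sum((x - x.mean()) ** 2) / (n * (n - 1))))  # True
print(np.isclose(var_jack, x.var(ddof=1) / n))                            # True
```

Passing a statistic other than np.mean as the estimator argument applies the same leave-one-out construction to it, which is where the jackknife becomes genuinely useful, although the identities verified above are then no longer guaranteed to hold.

==Estimating the bias of an estimator==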