With insufficiently large samples, the "fixed sample – random properties" approach suggests inference procedures in three steps: {{ordered list
Example. For X following a Pareto distribution with parameters a and k, i.e. F_X(x)=\left(1-\left(\frac{k}{x}\right)^a\right) I_{[k,\infty)}(x), a sampling mechanism (U, g_{(a,k)}) for X with seed U reads: g_{(a,k)}(u)=k (1-u)^{-\frac{1}{a}}, or, equivalently, g_{(a,k)}(u)=k u^{-\frac{1}{a}}, since 1-U is uniform on [0,1] whenever U is.
The link between an observed statistic s and the unknown parameters is expressed by master equations, whose general form in terms of the seeds z_1,\ldots,z_m of the sample is: s = \rho(\boldsymbol\theta;z_1,\ldots,z_m). With these relations we may inspect the values of the parameters that could have generated a sample with the observed statistic from a particular setting of the seeds of the sample. Hence, to the population of sample seeds corresponds a population of parameters. To ensure that this population has clean properties, it is enough to draw the seed values randomly and to involve either
sufficient statistics or, simply,
well-behaved statistics w.r.t. the parameters, in the master equations. For example, the statistics s_1 = \sum_{i=1}^m \log x_i and s_2 = \min_{i=1,\ldots,m} \{x_i\} prove to be sufficient for the parameters a and k of a Pareto random variable X. Thanks to the (equivalent form of the) sampling mechanism g_{(a,k)}, for which \log x_i = \log k - \frac{1}{a} \log u_i, we may read them as s_1 = m\log k - \frac{1}{a} \sum_{i=1}^m \log u_i and s_2 = \min_{i=1,\ldots,m} \{k u_i^{-\frac{1}{a}}\}, respectively.
Example. From the above master equations we can draw a pair of parameters, (a, k), compatible with the observed sample. Because g_{(a,k)}(u)=k u^{-\frac{1}{a}} is decreasing in u, the minimum of the x_i corresponds to the maximum of the u_i, so that \log s_2 = \log k - \frac{1}{a} \log \max \{u_i\}. Combining this with s_1 = m\log k - \frac{1}{a} \sum \log u_i gives the solution: a=\frac{m \log \max \{u_i\} - \sum\log u_i}{s_1-m\log s_2}, k = \exp\left(\frac{a s_1 + \sum\log u_i}{m a}\right), where s_1 and s_2 are the observed statistics and u_1,\ldots,u_m a set of uniform seeds. Transferring to the parameters the probability (density) affecting the seeds, you obtain the distribution law of the random parameters A and K compatible with the statistics you have observed. Compatibility denotes parameters of compatible populations, i.e. of populations that could have generated a sample giving rise to the observed statistics. You may formalize this notion as follows: }}
Definition. For a random variable X and a sample drawn from it, a
compatible distribution is a distribution having the same
sampling mechanism \mathcal M_X=(Z,g_{\boldsymbol\theta}) of
X with a value \boldsymbol\theta of the random parameter \mathbf\Theta derived from a
master equation rooted on a well-behaved statistic
s.
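As a minimal sketch of how such compatible parameters might be generated for the Pareto example above, the following Python code (using NumPy) draws uniform seeds and solves the master equations for (a, k) at each seed vector. It assumes the sampling mechanism g_{(a,k)}(u)=k u^{-1/a}, so that \log x_i = \log k - \frac{1}{a}\log u_i. All function names are illustrative and not taken from the source; the true parameter values, the sample size and the number of replicas are arbitrary choices.

```python
import numpy as np


def pareto_sample(a, k, m, rng):
    """Sampling mechanism (assumed form): x = k * u**(-1/a), u uniform."""
    u = 1.0 - rng.random(m)  # values in (0, 1], so log(u) is finite
    return k * u ** (-1.0 / a)


def sufficient_statistics(x):
    """Return s1 = sum(log x_i) and s2 = min(x_i)."""
    return np.log(x).sum(), x.min()


def solve_master_equations(s1, s2, u):
    """Solve the two master equations for (a, k) along the last axis of u.

    s1 = m*log(k) - (1/a)*sum(log u_i)
    s2 = k * max(u_i)**(-1/a)
    """
    log_u = np.log(u)
    m = log_u.shape[-1]
    sum_log_u = log_u.sum(axis=-1)
    a = (m * log_u.max(axis=-1) - sum_log_u) / (s1 - m * np.log(s2))
    k = np.exp((a * s1 + sum_log_u) / (m * a))
    return a, k


def population_bootstrap(s1, s2, m, n_replicas, rng):
    """Draw n_replicas seed vectors and return the compatible (a, k) pairs."""
    u = 1.0 - rng.random((n_replicas, m))
    return solve_master_equations(s1, s2, u)


# Illustrative run: the "observed" sample is itself simulated here.
rng = np.random.default_rng(0)
x = pareto_sample(a=3.0, k=2.0, m=30, rng=rng)
s1, s2 = sufficient_statistics(x)
a_rep, k_rep = population_bootstrap(s1, s2, m=x.size, n_replicas=5000, rng=rng)

# Empirical 5% and 95% quantiles of the compatible parameters.
a_low, a_high = np.quantile(a_rep, [0.05, 0.95])
k_low, k_high = np.quantile(k_rep, [0.05, 0.95])
print(f"a in [{a_low:.3f}, {a_high:.3f}], k in [{k_low:.3f}, {k_high:.3f}]")
```

Each row of seeds yields one compatible pair, and the empirical distribution of the pairs approximates the joint law of A and K. Under these assumptions a compatible k can never exceed the observed minimum s_2, which offers a quick sanity check on the replicas.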
Example. The distribution law of the Pareto parameters A and K, shown in the figure on the left, is an implementation example of the population bootstrap method. Implementing the
twisting argument method, you get the distribution law F_M(\mu) of the mean
M of a Gaussian variable
X on the basis of the statistic s_M = \sum_{i=1}^m x_i when the variance \Sigma^2 is known to be equal to \sigma^2. Its expression is: F_M(\mu) = \Phi\left(\frac{m\mu-s_M}{\sigma\sqrt{m}}\right), shown in the figure on the right, where \Phi is the
cumulative distribution function of a
standard normal distribution. Computing a
confidence interval for
M given its distribution function is straightforward: we need only find two quantiles (for instance the \delta/2 and 1-\delta/2 quantiles if we are interested in a confidence interval of level 1-\delta symmetric in the tail probabilities), as indicated on the left in the diagram showing the behavior of the two bounds for different values of the statistic
s_M. The Achilles heel of Fisher's approach lies in the joint distribution of more than one parameter, say the mean and variance of a Gaussian distribution. By contrast, with the last approach (and the above-mentioned methods:
population bootstrap and
twisting argument) we may learn the joint distribution of many parameters. For instance, focusing on the distribution of two or more parameters, the figures below report two confidence regions within which the function to be learnt falls with a confidence of 90%. The former concerns the probability with which an extended
support vector machine attributes a binary label 1 to the points of the (x,y) plane. The two surfaces are drawn on the basis of a set of sample points, in turn labelled according to a specific distribution law. The latter concerns the
confidence region of the hazard rate of breast cancer recurrence computed from a censored sample.
== Notes ==