
Confidence distribution

In statistical inference, the concept of a confidence distribution (CD) has often been loosely referred to as a distribution function on the parameter space that can represent confidence intervals of all levels for a parameter of interest. Historically, it has typically been constructed by inverting the upper limits of lower sided confidence intervals of all levels. It was also commonly associated with a fiducial interpretation, although it is a purely frequentist concept. A confidence distribution is not a probability distribution function of the parameter of interest, but may still be a function useful for making inferences.

History
Neyman (1937) introduced the idea of "confidence" in his seminal paper on confidence intervals. Some researchers view the confidence distribution as "the Neymanian interpretation of Fisher's fiducial distributions", which was "furiously disputed by Fisher". It is also believed that these "unproductive disputes" and Fisher's "stubborn insistence" might be the reason that the concept of the confidence distribution was long misconstrued as a fiducial concept and not fully developed under the frequentist framework. Indeed, the confidence distribution is a purely frequentist concept with a purely frequentist interpretation, although it also has ties to Bayesian and fiducial inference concepts.
Definition
Classical definition

Classically, a confidence distribution is defined by inverting the upper limits of a series of lower-sided confidence intervals. Equivalently, a function H_n(\theta) of both the sample and the parameter is a confidence distribution if it satisfies two requirements: (1) for each given sample, H_n(\cdot) is a continuous cumulative distribution function on the parameter space; and (2) at the true parameter value \theta = \theta_0, H_n(\theta_0), as a function of the sample, follows the uniform distribution U[0, 1].

Measure-theoretic definition

A confidence distribution C for a parameter \gamma in a measurable space is a distribution estimator with C(A_p) = p for a family of confidence regions A_p for \gamma with level p, for all levels 0 < p < 1. The family of confidence regions is not unique. If A_p only exists for p \in I \subset (0,1), then C is a confidence distribution with level set I. Both C and all A_p are measurable functions of the data. This implies that C is a random measure and A_p is a random set. If the defining requirement P(\gamma \in A_p) \ge p holds with equality, then the confidence distribution is by definition exact. If, additionally, \gamma is a real parameter, then the measure-theoretic definition coincides with the classical definition above.
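The second requirement, that H_n evaluated at the true parameter value is uniform on (0, 1) across repeated samples, can be checked by simulation. The following sketch (an illustration, not from the article) uses a normal mean with known variance, where H_n(\mu) = \Phi(\sqrt{n}(\mu - \bar{X})/\sigma); the parameter values and sample size are chosen arbitrarily.

```python
# Simulation check of the defining frequentist property of an exact CD:
# at the true parameter value, H_n(theta_0) ~ U(0, 1) over repeated samples.
import random
from statistics import NormalDist

random.seed(0)
MU0, SIGMA, N = 5.0, 2.0, 20          # true mean, known sd, sample size (assumed)
std_normal = NormalDist()

def cd_at(mu, xbar):
    """H(mu) = Phi(sqrt(n) * (mu - xbar) / sigma) for the known-variance normal mean."""
    return std_normal.cdf(N ** 0.5 * (mu - xbar) / SIGMA)

values = []
for _ in range(5000):
    sample = [random.gauss(MU0, SIGMA) for _ in range(N)]
    xbar = sum(sample) / N
    values.append(cd_at(MU0, xbar))   # CD evaluated at the true mean

# If H_n(mu_0) ~ U(0, 1), its average is near 0.5 and about half
# the simulated values fall below 0.5.
print(round(sum(values) / len(values), 3))
print(round(sum(v < 0.5 for v in values) / len(values), 3))
```

The same experiment with a wrong value in place of MU0 would produce values piling up near 0 or 1 rather than a uniform spread.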
Examples
Example 1: Normal mean

Suppose a normal sample X_i ~ N(μ, σ2), i = 1, 2, ..., n is given. Let \Phi be the cumulative distribution function of the standard normal distribution and F_{t_{n-1}} the cumulative distribution function of the Student t_{n-1} distribution. The two functions

H_\Phi(\mu) = \Phi{\left(\frac{\sqrt{n}(\mu-\bar{X})}{\sigma}\right)} \quad\text{and}\quad H_t(\mu) = F_{t_{n-1}}{\left(\frac{\sqrt{n}(\mu-\bar{X})}{s}\right)}

satisfy the two requirements in the CD definition and are confidence distribution functions for μ: H_\Phi when the variance σ2 is known, and H_t when σ2 is unknown and estimated by the sample variance s2.

Example 2: Bivariate normal correlation

Let ρ denote the correlation coefficient of a bivariate normal population and r the sample correlation coefficient. A confidence density for ρ is

\pi (\rho \mid r) = \frac{\nu (\nu - 1)\Gamma(\nu-1)}{\sqrt{2\pi}\,\Gamma(\nu + \frac{1}{2})} \left(1 - r^2\right)^{\frac{\nu - 1}{2}} \left(1 - \rho^2\right)^{\frac{\nu - 2}{2}} \left(1 - r \rho\right)^{-\nu+\frac{1}{2}} F{\left(\frac{3}{2},-\frac{1}{2}; \nu + \frac{1}{2}; \frac{1 + r \rho}{2}\right)},

where F is the Gaussian hypergeometric function and \nu = n-1 > 1. This is also the posterior density of a Bayes matching prior for the five parameters in the binormal distribution. The very last formula in the classical book by Fisher gives

\pi (\rho \mid r) = \frac{(1 - r^2)^{\frac{\nu - 1}{2}} \cdot (1 - \rho^2)^{\frac{\nu - 2}{2}}}{\pi (\nu - 2)!} \,\partial_{\rho r}^{\nu - 2} \left\{ \frac{\theta - \frac{1}{2}\sin 2\theta}{\sin^3 \theta} \right\},

where \cos \theta = -\rho r and 0 < \theta < \pi. This formula was derived by C. R. Rao.

Example 3: Binormal mean

Let data be generated by Y = \gamma + U, where \gamma is an unknown vector in the plane and U has a binormal and known distribution in the plane. The distribution of \Gamma^y = y - U defines a confidence distribution for \gamma. The confidence regions A_p can be chosen as the interior of ellipses centered at \gamma with axes given by the eigenvectors of the covariance matrix of \Gamma^y.
The confidence distribution is in this case binormal with mean \gamma, and the confidence regions can be chosen in many other ways. The argument generalizes to the case of an unknown mean \gamma in an infinite-dimensional Hilbert space, but in this case the confidence distribution is not a Bayesian posterior.
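As a concrete numerical illustration of Example 1, the sketch below (assumed summary statistics, not from the article) builds the known-variance confidence distribution H_\Phi(\mu) and its quantile function. For fixed data, H_\Phi is a genuine cumulative distribution function on the parameter space, and its quantiles reproduce the familiar z-limits for a normal mean.

```python
# The known-variance CD of Example 1: H(mu) = Phi(sqrt(n)(mu - xbar)/sigma).
from statistics import NormalDist

std_normal = NormalDist()

def make_cd(xbar, sigma, n):
    """Return H(mu) and its quantile function H^{-1}(p) for the normal mean."""
    scale = sigma / n ** 0.5
    def cd(mu):
        return std_normal.cdf((mu - xbar) / scale)
    def quantile(p):
        return xbar + scale * std_normal.inv_cdf(p)
    return cd, quantile

# Hypothetical observed summary: xbar = 10.0, known sigma = 3.0, n = 36.
cd, quantile = make_cd(10.0, 3.0, 36)

print(round(cd(10.0), 3))         # H at xbar -> 0.5
print(round(quantile(0.975), 3))  # upper 95% limit, 10 + 1.96 * 0.5 ≈ 10.98
```

Replacing the standard-normal cdf by F_{t_{n-1}} (e.g. via scipy.stats.t) would give H_t for the unknown-variance case in the same way.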
Using confidence distributions for inference
Confidence interval

From the CD definition, it is evident that the intervals (-\infty, H_n^{-1}(1-\alpha)], [H_n^{-1}(\alpha), \infty) and [H_n^{-1}(\alpha/2), H_n^{-1}(1-\alpha/2)] provide 100(1 − α)%-level confidence intervals of different kinds for θ, for any α ∈ (0, 1). Also, [H_n^{-1}(\alpha_1), H_n^{-1}(1-\alpha_2)] is a level 100(1 − α1 − α2)% confidence interval for θ for any α1 > 0, α2 > 0 with α1 + α2 < 1. Here, H_n^{-1}(\beta) is the 100β% quantile of H_n(\theta); equivalently, it solves H_n(\theta)=\beta for θ. The same holds for an asymptotic CD, where the confidence level is achieved in the limit. Some authors have proposed using confidence distributions for graphically viewing which parameter values are consistent with the data, rather than for coverage or performance purposes.

Point estimation

Point estimators can also be constructed given a confidence distribution estimator for the parameter of interest. For example, given Hn(θ), the CD for a parameter θ, natural choices of point estimators include the median Mn = Hn−1(1/2), the mean \bar{\theta}_n = \int_{-\infty}^\infty t \, \mathrm{d}H_n(t), and the maximum point of the CD density \widehat{\theta}_n=\arg\max_\theta h_n(\theta), where h_n(\theta)=H'_n(\theta). Under some modest conditions, one can prove, among other properties, that these point estimators are all consistent. Certain confidence distributions can give optimal frequentist estimators.

Hypothesis testing

One can derive a p-value for a one-sided or two-sided test concerning the parameter θ from its confidence distribution Hn(θ). Denote by p_s(C) = H_n(C) = \int_C \mathrm{d} H(\theta) the probability mass of a set C under the confidence distribution function. This ps(C) is called "support" in CD inference and is also known as "belief" in the fiducial literature. We have:

• For the one-sided test K0: θ ∈ C vs. K1: θ ∈ Cc, where C is of the type (−∞, b] or [b, ∞), one can show from the CD definition that supθ ∈ C P(ps(C) ≤ α) = α.
Thus, ps(C) = Hn(C) is the corresponding p-value of the test.

• For the singleton test K0: θ = b vs. K1: θ ≠ b, one can show from the CD definition that P{θ = b}(2 min{ps(Clo), ps(Cup)} ≤ α) = α. Thus, 2 min{ps(Clo), ps(Cup)} = 2 min{Hn(b), 1 − Hn(b)} is the corresponding p-value of the test. Here, Clo = (−∞, b] and Cup = [b, ∞).

See Figure 1 from Xie and Singh (2011) for a graphical illustration of the CD inference.
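The three uses above can be sketched together for the known-variance normal-mean CD of Example 1 (the summary statistics below are assumed for illustration): an equal-tailed interval from the CD quantiles, the median point estimator Hn−1(1/2), and the two-sided p-value 2 min{Hn(b), 1 − Hn(b)}.

```python
# CD-based interval, point estimate, and p-value for a normal mean, known sigma.
from statistics import NormalDist

std_normal = NormalDist()
xbar, sigma, n = 10.0, 3.0, 36        # hypothetical observed summary
scale = sigma / n ** 0.5

def H(theta):
    """CD: H(theta) = Phi((theta - xbar) / (sigma / sqrt(n)))."""
    return std_normal.cdf((theta - xbar) / scale)

def H_inv(p):
    """Quantile function of the CD."""
    return xbar + scale * std_normal.inv_cdf(p)

alpha = 0.05
ci = (H_inv(alpha / 2), H_inv(1 - alpha / 2))   # equal-tailed 95% interval
median_est = H_inv(0.5)                         # CD median; here equals xbar

def p_value(b):
    """Two-sided p-value for the singleton test K0: theta = b."""
    return 2 * min(H(b), 1 - H(b))

print([round(x, 3) for x in ci])      # ≈ [9.02, 10.98]
print(median_est)                     # 10.0
print(round(p_value(11.0), 4))        # ≈ 0.0455
```

The one-sided p-value for K0: θ ≤ b would simply be 1 − H(b), the CD mass on (b, ∞).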
Implementations
A few statistical programs have implemented the ability to construct and graph confidence distributions.

• R, via the concurve, pvaluefunctions, and episheet packages
• Excel, via episheet
• Stata, via concurve