Normal-inverse-gamma distribution

In probability theory and statistics, the normal-inverse-gamma distribution is a four-parameter family of multivariate continuous probability distributions. It is the conjugate prior of a normal distribution with unknown mean and variance.

Definition

Suppose
: x \mid \sigma^2, \mu, \lambda \sim \mathrm{N}(\mu, \sigma^2 / \lambda)
has a normal distribution with mean \mu and variance \sigma^2 / \lambda, where
: \sigma^2 \mid \alpha, \beta \sim \Gamma^{-1}(\alpha, \beta)
has an inverse-gamma distribution. Then (x,\sigma^2) has a normal-inverse-gamma distribution, denoted as
: (x,\sigma^2) \sim \text{N-}\Gamma^{-1}(\mu,\lambda,\alpha,\beta).
(\text{NIG} is also used instead of \text{N-}\Gamma^{-1}.)

The normal-inverse-Wishart distribution is a generalization of the normal-inverse-gamma distribution that is defined over multivariate random variables.

Characterization

Probability density function

: f(x,\sigma^2\mid\mu,\lambda,\alpha,\beta) = \frac{\sqrt{\lambda}}{\sigma\sqrt{2\pi}} \, \frac{\beta^\alpha}{\Gamma(\alpha)} \, \left( \frac{1}{\sigma^2} \right)^{\alpha + 1} \exp \left( -\frac{2\beta + \lambda(x - \mu)^2}{2\sigma^2} \right)

For the multivariate form, where \mathbf{x} is a k \times 1 random vector,
: f(\mathbf{x},\sigma^2\mid\boldsymbol{\mu},\mathbf{V}^{-1},\alpha,\beta) = |\mathbf{V}|^{-1/2} (2\pi)^{-k/2} \, \frac{\beta^\alpha}{\Gamma(\alpha)} \, \left( \frac{1}{\sigma^2} \right)^{\alpha + 1 + k/2} \exp \left( -\frac{2\beta + (\mathbf{x} - \boldsymbol{\mu})^T \mathbf{V}^{-1} (\mathbf{x} - \boldsymbol{\mu})}{2\sigma^2} \right),
where |\mathbf{V}| is the determinant of the k \times k matrix \mathbf{V}. This equation reduces to the first form if k = 1, so that \mathbf{x}, \mathbf{V}, \boldsymbol{\mu} are scalars and \mathbf{V}^{-1} = \lambda.

Alternative parameterization

It is also possible to let \gamma = 1 / \lambda, in which case the pdf becomes
: f(x,\sigma^2\mid\mu,\gamma,\alpha,\beta) = \frac{1}{\sigma\sqrt{2\pi\gamma}} \, \frac{\beta^\alpha}{\Gamma(\alpha)} \, \left( \frac{1}{\sigma^2} \right)^{\alpha + 1} \exp \left( -\frac{2\gamma\beta + (x - \mu)^2}{2\gamma \sigma^2} \right)
In the multivariate form, the corresponding change would be to regard the covariance matrix \mathbf{V} instead of its inverse \mathbf{V}^{-1} as a parameter.

Cumulative distribution function

: F(x,\sigma^2\mid\mu,\lambda,\alpha,\beta) = \frac{e^{-\frac{\beta}{\sigma^2}} \left(\frac{\beta}{\sigma^2}\right)^\alpha \left(\operatorname{erf}\left(\frac{\sqrt{\lambda} (x-\mu)}{\sqrt{2} \sigma}\right)+1\right)}{2 \sigma^2 \Gamma(\alpha)}
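As a small illustration of the univariate density, the following Python sketch evaluates its logarithm and checks that it factors into the normal density of x given \sigma^2 and the inverse-gamma density of \sigma^2, as in the definition. The function names and the numerical values are illustrative choices, not part of the article, and only the standard library is assumed.

```python
import math

def nig_logpdf(x, sigma2, mu, lam, alpha, beta):
    """Log of the univariate normal-inverse-gamma density f(x, sigma^2 | mu, lambda, alpha, beta)."""
    if sigma2 <= 0 or lam <= 0 or alpha <= 0 or beta <= 0:
        raise ValueError("sigma2, lam, alpha and beta must be positive")
    return (
        0.5 * math.log(lam)
        - 0.5 * math.log(2.0 * math.pi * sigma2)
        + alpha * math.log(beta)
        - math.lgamma(alpha)
        - (alpha + 1.0) * math.log(sigma2)
        - (2.0 * beta + lam * (x - mu) ** 2) / (2.0 * sigma2)
    )

def normal_logpdf(x, mean, var):
    """Log density of N(mean, var)."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)

def invgamma_logpdf(s, alpha, beta):
    """Log density of the inverse-gamma distribution with shape alpha and scale beta."""
    return (alpha * math.log(beta) - math.lgamma(alpha)
            - (alpha + 1.0) * math.log(s) - beta / s)

# Hypothetical parameter values, chosen only for illustration.
x, sigma2, mu, lam, alpha, beta = 0.7, 1.3, 0.2, 2.0, 3.0, 1.5

joint = nig_logpdf(x, sigma2, mu, lam, alpha, beta)
factored = normal_logpdf(x, mu, sigma2 / lam) + invgamma_logpdf(sigma2, alpha, beta)
print(joint, factored)  # the two values should agree up to rounding error
```

Working on the log scale avoids underflow when \sigma^2 is very small or \alpha is large.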

Properties

Marginal distributions

Given
: (x,\sigma^2) \sim \text{N-}\Gamma^{-1}(\mu,\lambda,\alpha,\beta)
as above, \sigma^2 by itself follows an inverse-gamma distribution:
: \sigma^2 \sim \Gamma^{-1}(\alpha,\beta)
while \sqrt{\frac{\alpha\lambda}{\beta}} (x - \mu) follows a t distribution with 2\alpha degrees of freedom.

Proof for \lambda = 1: For \lambda = 1 the probability density function is
: f(x,\sigma^2 \mid \mu,\alpha,\beta) = \frac{1}{\sigma\sqrt{2\pi}} \, \frac{\beta^\alpha}{\Gamma(\alpha)} \, \left( \frac{1}{\sigma^2} \right)^{\alpha + 1} \exp \left( -\frac{2\beta + (x - \mu)^2}{2\sigma^2} \right)
The marginal distribution over x is
: \begin{align} f(x \mid \mu,\alpha,\beta) & = \int_0^\infty d\sigma^2 \, f(x,\sigma^2\mid\mu,\alpha,\beta) \\ & = \frac{1}{\sqrt{2\pi}} \, \frac{\beta^\alpha}{\Gamma(\alpha)} \int_0^\infty d\sigma^2 \left( \frac{1}{\sigma^2} \right)^{\alpha + 1/2 + 1} \exp \left( -\frac{2\beta + (x - \mu)^2}{2\sigma^2} \right) \end{align}
Apart from the normalization factor, the integrand coincides with the inverse-gamma density
: \Gamma^{-1}(z; a, b) = \frac{b^a}{\Gamma(a)}\frac{e^{-b/z}}{z^{a+1}},
with z = \sigma^2, a = \alpha + 1/2, and b = \frac{2\beta + (x - \mu)^2}{2}. Since
: \int_0^\infty dz \, \Gamma^{-1}(z; a, b) = 1, \quad \text{so that} \quad \int_0^\infty dz \, z^{-(a+1)} e^{-b/z} = \Gamma(a) \, b^{-a},
it follows that
: \int_0^\infty d\sigma^2 \left( \frac{1}{\sigma^2} \right)^{\alpha + 1/2 + 1} \exp \left( -\frac{2\beta + (x - \mu)^2}{2\sigma^2} \right) = \Gamma(\alpha + 1/2) \left(\frac{2\beta + (x - \mu)^2}{2} \right)^{-(\alpha + 1/2)}
Substituting this expression and factoring out the dependence on x,
: f(x \mid \mu,\alpha,\beta) \propto_{x} \left(1 + \frac{(x - \mu)^2}{2 \beta} \right)^{-(\alpha + 1/2)}.
The shape of the generalized Student's t-distribution is
: t(x \mid \nu,\hat{\mu},\hat{\sigma}^2) \propto_x \left(1+\frac{1}{\nu} \frac{(x-\hat{\mu})^2}{\hat{\sigma}^2} \right)^{-(\nu+1)/2}.
Matching the two expressions, the marginal distribution f(x \mid \mu,\alpha,\beta) is a t-distribution with 2\alpha degrees of freedom:
: f(x \mid \mu,\alpha,\beta) = t(x \mid \nu=2\alpha, \hat{\mu}=\mu, \hat{\sigma}^2=\beta/\alpha).

In the multivariate case, the marginal distribution of \mathbf{x} is a multivariate t distribution:
: \mathbf{x} \sim t_{2\alpha}(\boldsymbol{\mu}, \frac{\beta}{\alpha} \mathbf{V})

Summation

Scaling

Suppose
: (x,\sigma^2) \sim \text{N-}\Gamma^{-1}(\mu,\lambda,\alpha,\beta).
Then for c > 0,
: (cx,c\sigma^2) \sim \text{N-}\Gamma^{-1}(c\mu,\lambda/c,\alpha,c\beta).

Proof: Let (x,\sigma^2) \sim \text{N-}\Gamma^{-1}(\mu,\lambda,\alpha,\beta) and fix c > 0. Defining Y=(Y_1,Y_2)=(cx,c\sigma^2), observe that the PDF of the random variable Y evaluated at (y_1,y_2) is 1/c^2 times the PDF of a \text{N-}\Gamma^{-1}(\mu,\lambda,\alpha,\beta) random variable evaluated at (y_1/c,y_2/c). Hence the PDF of Y evaluated at (y_1,y_2) is
: \begin{align} f_Y(y_1,y_2)&=\frac{1}{c^2} \frac{\sqrt{\lambda}}{\sqrt{2\pi y_2/c}} \, \frac{\beta^\alpha}{\Gamma(\alpha)} \, \left( \frac{1}{y_2/c} \right)^{\alpha + 1} \exp \left( -\frac{2\beta + \lambda(y_1/c - \mu)^2}{2y_2/c} \right) \\ &= \frac{\sqrt{\lambda/c}}{\sqrt{2\pi y_2}} \, \frac{(c\beta)^\alpha}{\Gamma(\alpha)} \, \left( \frac{1}{y_2} \right)^{\alpha + 1} \exp \left( -\frac{2c\beta + (\lambda/c) \, (y_1 - c\mu)^2}{2y_2} \right). \end{align}
The right-hand expression is the PDF of a \text{N-}\Gamma^{-1}(c\mu,\lambda/c,\alpha,c\beta) random variable evaluated at (y_1,y_2), which completes the proof.
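The marginal and scaling results above can be checked numerically. The sketch below, which assumes only the Python standard library and uses illustrative parameter values, integrates the joint density over \sigma^2 with a trapezoid rule in s = \log \sigma^2 and compares the result with the t density with 2\alpha degrees of freedom and squared scale \beta/(\alpha\lambda) (equivalent to the statement about \sqrt{\alpha\lambda/\beta}(x - \mu)). It then compares the two sides of the scaling identity. The helper names and the integration grid are assumptions made for this example.

```python
import math

def nig_pdf(x, sigma2, mu, lam, alpha, beta):
    """Univariate normal-inverse-gamma density, evaluated via its logarithm."""
    log_f = (0.5 * math.log(lam) - 0.5 * math.log(2.0 * math.pi * sigma2)
             + alpha * math.log(beta) - math.lgamma(alpha)
             - (alpha + 1.0) * math.log(sigma2)
             - (2.0 * beta + lam * (x - mu) ** 2) / (2.0 * sigma2))
    return math.exp(log_f)

def marginal_x_numeric(x, mu, lam, alpha, beta, lo=-20.0, hi=20.0, n=4000):
    """Integrate the joint density over sigma^2 (trapezoid rule in s = log sigma^2)."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        sigma2 = math.exp(lo + i * h)
        weight = 0.5 if i in (0, n) else 1.0
        # d(sigma^2) = sigma^2 ds, hence the extra factor sigma2
        total += weight * nig_pdf(x, sigma2, mu, lam, alpha, beta) * sigma2
    return total * h

def student_t_pdf(x, nu, loc, scale2):
    """Density of the location-scale t distribution with squared scale scale2."""
    log_c = (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
             - 0.5 * math.log(nu * math.pi * scale2))
    return math.exp(log_c - (nu + 1.0) / 2.0 * math.log1p((x - loc) ** 2 / (nu * scale2)))

# Hypothetical values, for illustration only.
mu, lam, alpha, beta = 0.5, 2.0, 3.0, 1.5

# Marginal of x: t with 2*alpha degrees of freedom, location mu, squared scale beta/(alpha*lam).
x = 1.1
numeric = marginal_x_numeric(x, mu, lam, alpha, beta)
closed = student_t_pdf(x, 2.0 * alpha, mu, beta / (alpha * lam))

# Scaling: the density of (c x, c sigma^2) at (y1, y2) versus the claimed NIG density.
c, y1, y2 = 2.5, 0.8, 1.7
lhs = nig_pdf(y1 / c, y2 / c, mu, lam, alpha, beta) / c ** 2
rhs = nig_pdf(y1, y2, c * mu, lam / c, alpha, c * beta)

print(numeric, closed)
print(lhs, rhs)
```

Integrating in \log \sigma^2 rather than \sigma^2 keeps the grid fine near zero, where the integrand changes quickly, without requiring an adaptive quadrature routine.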
Exponential family

Normal-inverse-gamma distributions form an exponential family with natural parameters \textstyle\theta_1=-\frac{\lambda}{2}, \textstyle\theta_2=\lambda \mu, \textstyle\theta_3=\alpha, and \textstyle\theta_4=-\beta-\frac{\lambda \mu^2}{2}, and sufficient statistics \textstyle T_1=\frac{x^2}{\sigma^2}, \textstyle T_2=\frac{x}{\sigma^2}, \textstyle T_3=\log \big( \frac{1}{\sigma^2} \big), and \textstyle T_4=\frac{1}{\sigma^2}.

Information entropy

Kullback–Leibler divergence

The Kullback–Leibler divergence measures the difference between two distributions.
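Returning to the exponential-family form stated earlier in this section, the following sketch checks numerically that the log density equals \theta \cdot T + \log h - A. The base measure h(x,\sigma^2) = (\sigma^2)^{-3/2} / \sqrt{2\pi} and the log-normalizer A = \log\Gamma(\alpha) - \alpha\log\beta - \tfrac{1}{2}\log\lambda are not given in the text; they are assumptions derived here by collecting the terms of the density that are not covered by the listed natural parameters and sufficient statistics. Parameter values are illustrative.

```python
import math

def nig_logpdf(x, sigma2, mu, lam, alpha, beta):
    """Log of the univariate normal-inverse-gamma density."""
    return (0.5 * math.log(lam) - 0.5 * math.log(2.0 * math.pi * sigma2)
            + alpha * math.log(beta) - math.lgamma(alpha)
            - (alpha + 1.0) * math.log(sigma2)
            - (2.0 * beta + lam * (x - mu) ** 2) / (2.0 * sigma2))

def exp_family_logpdf(x, sigma2, mu, lam, alpha, beta):
    """Same log density written as theta . T + log h - A."""
    # Natural parameters and sufficient statistics as listed in the text.
    theta = (-lam / 2.0, lam * mu, alpha, -beta - lam * mu ** 2 / 2.0)
    stats = (x ** 2 / sigma2, x / sigma2, math.log(1.0 / sigma2), 1.0 / sigma2)
    # Assumed base measure and log-normalizer (derived for this example).
    log_h = 1.5 * math.log(1.0 / sigma2) - 0.5 * math.log(2.0 * math.pi)
    log_normalizer = math.lgamma(alpha) - alpha * math.log(beta) - 0.5 * math.log(lam)
    return sum(t * s for t, s in zip(theta, stats)) + log_h - log_normalizer

# Hypothetical values, for illustration only.
args = (0.7, 1.3, 0.2, 2.0, 3.0, 1.5)  # x, sigma2, mu, lam, alpha, beta
direct = nig_logpdf(*args)
via_family = exp_family_logpdf(*args)
print(direct, via_family)  # should agree up to rounding error
```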

Maximum likelihood estimation


Posterior distribution of the parameters

See the articles on normal-gamma distribution and conjugate prior.

Interpretation of the parameters

See the articles on normal-gamma distribution and conjugate prior.

Generating normal-inverse-gamma random variates

Generation of random variates is straightforward:
• Sample \sigma^2 from an inverse-gamma distribution with parameters \alpha and \beta
• Sample x from a normal distribution with mean \mu and variance \sigma^2/\lambda
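The two sampling steps above can be sketched in Python using only the standard library. The function name `sample_nig` and the parameter values are illustrative assumptions. The inverse-gamma draw is obtained as the reciprocal of a gamma variate with shape \alpha and rate \beta; since `random.gammavariate` takes a scale rather than a rate, the scale passed is 1/\beta.

```python
import math
import random

def sample_nig(mu, lam, alpha, beta, rng=random):
    """Draw one (x, sigma2) pair from N-Gamma^{-1}(mu, lam, alpha, beta)."""
    # Step 1: sigma^2 ~ inverse-gamma(alpha, beta), as the reciprocal of a
    # gamma variate with shape alpha and scale 1/beta (that is, rate beta).
    sigma2 = 1.0 / rng.gammavariate(alpha, 1.0 / beta)
    # Step 2: x | sigma^2 ~ N(mu, sigma^2 / lam); gauss takes a standard deviation.
    x = rng.gauss(mu, math.sqrt(sigma2 / lam))
    return x, sigma2

# Hypothetical parameter values and seed, for illustration only.
rng = random.Random(0)
mu, lam, alpha, beta = 0.5, 2.0, 5.0, 4.0
draws = [sample_nig(mu, lam, alpha, beta, rng) for _ in range(20000)]

mean_x = sum(d[0] for d in draws) / len(draws)
mean_sigma2 = sum(d[1] for d in draws) / len(draws)
# For alpha > 1, E[x] = mu and E[sigma^2] = beta / (alpha - 1).
print(mean_x, mu)
print(mean_sigma2, beta / (alpha - 1.0))
```

Passing a seeded `random.Random` instance keeps the draws reproducible without touching the module-level generator.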

Related distributions

• The normal-gamma distribution is the same distribution parameterized by precision rather than variance.
• The normal-inverse-Wishart distribution is a generalization of this distribution that allows for a multivariate mean and a completely unknown positive-definite covariance matrix, whereas in the multivariate normal-inverse-gamma distribution the covariance matrix \sigma^2 \mathbf{V} is regarded as known up to the scale factor \sigma^2.

Source: Wikipedia ↗
