==Estimation of parameters==

===Maximum likelihood estimator===
For determining the maximum likelihood estimators of the log-normal distribution parameters \mu and \sigma, we can use the same procedure as for the
normal distribution. Note that L(\mu, \sigma) = \prod_{i=1}^n \frac 1 {x_i} \varphi_{\mu,\sigma} (\ln x_i), where \varphi is the density function of the normal distribution \mathcal N(\mu,\sigma^2). Therefore, the log-likelihood function is \ell (\mu,\sigma \mid x_1, x_2, \ldots, x_n) = - \sum _i \ln x_i + \ell_N (\mu, \sigma \mid \ln x_1, \ln x_2, \dots, \ln x_n). Since the first term is constant with regard to
μ and
σ, both log-likelihood functions, \ell and \ell_N, reach their maximum with the same \mu and \sigma. Hence, the maximum likelihood estimators are identical to those for a normal distribution for the observations \ln x_1, \ln x_2, \dots, \ln x_n: \widehat \mu = \frac {\sum_i \ln x_i}{n}, \qquad \widehat \sigma^2 = \frac {\sum_i {\left( \ln x_i - \widehat \mu \right)}^2} {n}. For finite
n, the estimator for \mu is unbiased, but the one for \sigma is biased. As for the normal distribution, an unbiased estimator for \sigma can be obtained by replacing the denominator
n by
n−1 in the equation for \widehat\sigma^2. From this, the MLE for the expectation of X is: \widehat{\theta}_\text{MLE} = \widehat{\operatorname{E}[X]}_\text{MLE} = e^{\widehat \mu + {\widehat{\sigma}^2}/{2}}
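As a sketch of these estimators in code (Python with NumPy; the sample and its parameters below are illustrative, not from the text):

```python
import numpy as np

def lognormal_mle(x):
    """MLEs of (mu, sigma^2) for a log-normal sample: the normal-distribution
    MLEs applied to the log-transformed observations ln(x_i)."""
    logs = np.log(np.asarray(x))
    mu_hat = logs.mean()                        # (sum_i ln x_i) / n
    sigma2_hat = ((logs - mu_hat) ** 2).mean()  # biased MLE (denominator n)
    return mu_hat, sigma2_hat

def lognormal_mean_mle(x):
    """MLE of E[X] = exp(mu + sigma^2 / 2), by functional invariance of the MLE."""
    mu_hat, sigma2_hat = lognormal_mle(x)
    return np.exp(mu_hat + sigma2_hat / 2)

# Illustrative data: true mu = 1.0, sigma = 0.5, so E[X] = e^{1.125}
rng = np.random.default_rng(0)
x = rng.lognormal(mean=1.0, sigma=0.5, size=10_000)
mu_hat, sigma2_hat = lognormal_mle(x)
```

With a sample this large, the estimates land close to the true parameters.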
===Method of moments===
When the individual values x_1, x_2, \ldots, x_n are not available, but the sample's mean \bar x and standard deviation s are, then the method of moments can be used. The corresponding parameters are determined by the following formulas, obtained from solving the equations for the expectation \operatorname{E}[X] and variance \operatorname{Var}[X] for \mu and \sigma: \begin{align} \mu &= \ln \frac{ \bar x} {\sqrt{1+ s^2/\bar x^2} } , \\[1ex] \sigma^2 &= \ln\left(1 + {s^2} / \bar x^2 \right). \end{align}
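A minimal sketch of these formulas, assuming only the sample mean and standard deviation are available (the function name is mine):

```python
import math

def lognormal_method_of_moments(xbar, s):
    """Recover (mu, sigma^2) from a sample mean xbar and standard deviation s
    by inverting E[X] and Var[X], per the formulas above."""
    sigma2 = math.log(1.0 + (s / xbar) ** 2)
    mu = math.log(xbar / math.sqrt(1.0 + (s / xbar) ** 2))
    return mu, sigma2
```

Feeding in the exact mean e^{\mu+\sigma^2/2} and standard deviation of a log-normal distribution recovers its parameters exactly, since the two equations are inverted without approximation.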
===Other estimators===
Other estimators also exist, such as Finney's UMVUE estimator, the "approximately minimum mean squared error estimator", the "approximately unbiased estimator", the "minimax estimator", the "conditional mean squared error estimator", and other variations.
===Interval estimates===
The most efficient way to obtain interval estimates when analyzing log-normally distributed data is to apply the well-known methods based on the normal distribution to logarithmically transformed data and then to back-transform the results if appropriate.
====Prediction intervals====
A basic example is given by prediction intervals: for the normal distribution, the interval [\mu-\sigma,\mu+\sigma] contains approximately two thirds (68%) of the probability (or of a large sample), and [\mu-2\sigma,\mu+2\sigma] contains 95%. Therefore, for a log-normal distribution,
* [\mu^*/\sigma^*,\mu^*\cdot\sigma^*]=[\mu^* {}^\times\!\!/ \sigma^*] contains 2/3, and
* [\mu^*/(\sigma^*)^2,\mu^*\cdot(\sigma^*)^2] = [\mu^* {}^\times\!\!/ (\sigma^*)^2] contains 95%
of the probability. Using estimated parameters, approximately the same percentages of the data should be contained in these intervals.
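These multiplicative intervals can be checked by simulation; a sketch assuming NumPy, with illustrative parameters:

```python
import numpy as np

# Illustrative parameters; mu* = e^mu and sigma* = e^sigma are the
# multiplicative (geometric) analogues of the mean and standard deviation.
mu, sigma = 0.5, 0.8
rng = np.random.default_rng(1)
x = rng.lognormal(mean=mu, sigma=sigma, size=100_000)

mu_star = np.exp(mu)        # median (geometric mean)
sigma_star = np.exp(sigma)  # multiplicative standard deviation

# Empirical mass inside the multiplicative 1- and 2-sigma* intervals
inside_1 = np.mean((x >= mu_star / sigma_star) & (x <= mu_star * sigma_star))
inside_2 = np.mean((x >= mu_star / sigma_star**2) & (x <= mu_star * sigma_star**2))
```

The empirical fractions come out near 68% and 95%, matching the normal-scale intervals after exponentiation.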
====Confidence interval for e^\mu====
Using this principle, note that a confidence interval for \mu is [\widehat\mu \pm q \cdot \widehat{\mathop{se}}], where \mathop{se} = \widehat\sigma / \sqrt{n} is the standard error and q is the 97.5% quantile of a t distribution with n-1 degrees of freedom. Back-transformation leads to a confidence interval for \mu^* = e^\mu (the median): [\widehat\mu^* {}^\times\!\!/ (\operatorname{sem}^*)^q] with \operatorname{sem}^*=(\widehat\sigma^*)^{1/\sqrt{n}}
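A sketch of this back-transformed interval, assuming `scipy.stats.t` for the quantile (the function name is mine):

```python
import numpy as np
from scipy.stats import t

def median_ci(x, alpha=0.05):
    """CI for the median mu* = exp(mu): a t interval for mu on the
    log scale, then back-transformed by exponentiation."""
    logs = np.log(np.asarray(x))
    n = len(logs)
    mu_hat = logs.mean()
    se = logs.std(ddof=1) / np.sqrt(n)   # estimated standard error of mu_hat
    q = t.ppf(1 - alpha / 2, df=n - 1)   # e.g. the 97.5% quantile for alpha = 0.05
    return np.exp(mu_hat - q * se), np.exp(mu_hat + q * se)
```

Because exp is monotonic, exponentiating the endpoints of the interval for \mu gives a valid interval for the median e^\mu.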
====Confidence interval for \operatorname{E}[X]====
The literature discusses several options for calculating the confidence interval for \operatorname{E}[X] (the mean of the log-normal distribution). These include the bootstrap as well as various other methods. The Cox method proposes to plug in the estimators \widehat \mu = \frac {\sum_i \ln x_i}{n}, \qquad S^2 = \frac {\sum_i \left( \ln x_i - \widehat \mu \right)^2} {n-1} and use them to construct
approximate confidence intervals in the following way: \mathrm{CI}(\operatorname{E}(X)) : \exp\left(\hat \mu + \frac{S^2}{2} \pm z_{1-\frac{\alpha}{2}} \sqrt{\frac{S^2}{n} + \frac{S^4}{2(n-1)}} \right) We know that {{nowrap|\operatorname{E}(X) = e^{\mu + \frac{\sigma^2}{2}}.}} Also, \widehat \mu is normally distributed with parameters: \widehat \mu \sim N\left(\mu, \frac{\sigma^2}{n}\right) S^2 has a chi-squared distribution (more precisely, (n-1)S^2/\sigma^2 \sim \chi^2_{n-1}), which is approximately normally distributed (via the CLT), with
parameters: {{nowrap|S^2 \dot \sim N\left(\sigma^2, \frac{2\sigma^4}{n-1}\right).}} Hence, {{nowrap|\frac{S^2}{2} \dot \sim N\left(\frac{\sigma^2}{2}, \frac{\sigma^4}{2(n-1)}\right).}} Since the sample mean and variance are independent, and the sum of normally distributed variables is
also normal, we get that: \widehat \mu + \frac{S^2}{2} \dot \sim N\left(\mu + \frac{\sigma^2}{2}, \frac{\sigma^2}{n} + \frac{\sigma^4}{2(n-1)}\right) Based on the above, standard
confidence intervals for \mu + \frac{\sigma^2}{2} can be constructed (using a
pivotal quantity) as: \hat \mu + \frac{S^2}{2} \pm z_{1-\frac{\alpha}{2}} \sqrt{\frac{S^2}{n} + \frac{S^4}{2(n-1)} } And since confidence intervals are preserved under monotonic transformations, we get that: \mathrm{CI}\left(\operatorname{E}[X] = e^{\mu + \frac{\sigma^2}{2}}\right): \exp\left(\hat \mu + \frac{S^2}{2} \pm z_{1-\frac{\alpha}{2}} \sqrt{\frac{S^2}{n} + \frac{S^4}{2(n-1)}} \right) as desired. Olsson (2005) proposed a "modified Cox method" that replaces z_{1-\frac{\alpha}{2}} with t_{n-1, 1-\frac{\alpha}{2}}, which seems to provide better coverage for small sample sizes. A related situation arises when we have two approximately normal random variables (e.g., \hat p_1 and \hat p_2 for a relative risk) and we wish to calculate a confidence interval for their ratio.{{efn|Since this cannot be done directly, we take the logs and use the delta method to argue that each log is itself approximately normal. This trick allows us to pretend that the exponentials are log-normal, and to use that approximation to build the CI. Notice that in the relative-risk case, the median and the mean of the base distribution (i.e., before taking the log) are actually identical, since the variables are originally normal, not log-normal. For example, \hat p_1 \dot \sim N(p_1, p_1(1-p_1)/n) and \ln \hat{p}_1 \dot \sim N(\ln p_1, (1-p_1)/(p_1 n)). Hence, building a CI on the log scale and back-transforming gives \mathrm{CI}(p_1): e^{\ln \hat{p}_1 \pm z_{1-\frac{\alpha}{2}} \sqrt{(1 - \hat{p}_1)/(\hat{p}_1 n)}}. So while we would expect the CI to be for the median, in this case it is actually also for the mean of the original distribution: if \hat p_1 really were log-normal, we would expect \operatorname{E}[\hat p_1] = e^{\ln p_1 + \tfrac{1}{2} (1 - p_1)/(p_1 n)}, but in practice we know that \operatorname{E}[\hat p_1] = e^{\ln p_1} = p_1. The approximation thus lies in the second step (the delta method), and the CI is actually for the expectation, not just the median, because we start from a base distribution that is normal and then apply another normal approximation after taking the log. This means that a large part of the approximation in the CI comes from the delta method.}} However, the ratio of the expectations (means) of the two samples may also be of interest; this requires more work to develop.
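The Cox interval above, with Olsson's t modification as an option, can be sketched as follows (function and variable names are mine; SciPy supplies the quantiles):

```python
import numpy as np
from scipy.stats import norm, t

def cox_mean_ci(x, alpha=0.05, modified=False):
    """Cox interval for E[X] = exp(mu + sigma^2/2) of a log-normal sample.
    modified=True uses Olsson's t_{n-1} quantile in place of z."""
    logs = np.log(np.asarray(x))
    n = len(logs)
    mu_hat = logs.mean()
    s2 = logs.var(ddof=1)  # S^2, with denominator n - 1
    center = mu_hat + s2 / 2
    se = np.sqrt(s2 / n + s2**2 / (2 * (n - 1)))
    q = t.ppf(1 - alpha / 2, df=n - 1) if modified else norm.ppf(1 - alpha / 2)
    return np.exp(center - q * se), np.exp(center + q * se)
```

Since t quantiles exceed the corresponding normal quantiles, the modified interval always contains the unmodified one.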
The ratio of their means is: \frac{\operatorname{E}(X_1)}{\operatorname{E}(X_2)} = \frac{e^{\mu_1 + \sigma_1^2 / 2}}{e^{\mu_2 + \sigma_2^2 /2}} = e^{(\mu_1 - \mu_2) + \frac{1}{2} \left(\sigma_1^2 - \sigma_2^2\right)} Plugging the estimators into each of these parameters also yields a log-normal distribution, which means that the Cox method, discussed above, can similarly be used for this case: \mathrm{CI}\left( \frac{\operatorname{E}(X_1)}{\operatorname{E}(X_2)} = \frac{e^{\mu_1 + \sigma_1^2 / 2}}{e^{\mu_2 + \sigma_2^2 / 2}} \right): \exp\left(\left(\hat \mu_1 - \hat \mu_2 + \tfrac{1}{2}S_1^2 - \tfrac{1}{2}S_2^2\right) \pm z_{1-\frac{\alpha}{2}} \sqrt{ \frac{S_1^2}{n_1} + \frac{S_2^2}{n_2} + \frac{S_1^4}{2(n_1-1)} + \frac{S_2^4}{2(n_2-1)} } \right) To construct a confidence interval for this ratio, we first note that \hat \mu_1 - \hat \mu_2 follows a normal distribution, and that both S_1^2 and S_2^2 have a
chi-squared distribution, which is
approximately normally distributed (via
CLT, with the relevant
parameters). This means that \left(\hat \mu_1 - \hat \mu_2 + \frac{1}{2}S_1^2 - \frac{1}{2}S_2^2\right) \dot \sim N\left((\mu_1 - \mu_2) + \frac{1}{2}(\sigma_1^2 - \sigma_2^2), \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} + \frac{\sigma_1^4}{2(n_1-1)} + \frac{\sigma_2^4}{2(n_2-1)} \right) Based on the above, standard
confidence intervals can be constructed (using a
pivotal quantity) as: \left(\hat \mu_1 - \hat \mu_2 + \frac{1}{2}S_1^2 - \frac{1}{2}S_2^2\right) \pm z_{1-\frac{\alpha}{2}} \sqrt{ \frac{S_1^2}{n_1} + \frac{S_2^2}{n_2} + \frac{S_1^4}{2(n_1-1)} + \frac{S_2^4}{2(n_2-1)} } And since confidence intervals are preserved under monotonic transformations, we get that: \mathrm{CI}\left( \frac{\operatorname{E}(X_1)}{\operatorname{E}(X_2)} = \frac{e^{\mu_1 + \frac{\sigma_1^2}{2}}}{e^{\mu_2 + \frac{\sigma_2^2}{2}}} \right): \exp\left(\left(\hat \mu_1 - \hat \mu_2 + \frac{1}{2}S_1^2 - \frac{1}{2}S_2^2\right) \pm z_{1-\frac{\alpha}{2}} \sqrt{ \frac{S_1^2}{n_1} + \frac{S_2^2}{n_2} + \frac{S_1^4}{2(n_1-1)} + \frac{S_2^4}{2(n_2-1)} } \right) as desired. It is worth noting that naively using the
MLE in the ratio of the two expectations to create a
ratio estimator will lead to a
consistent, yet biased, point estimate (using the fact that the estimator of the ratio is log-normally distributed):{{efn|The formula can be found by treating the estimated means and variances as approximately normal, which indicates that the ratio estimator is itself approximately log-normal, enabling us to quickly obtain its expectation. The bias can be partially minimized by using: \begin{align} \widehat{\left[ \frac{\operatorname{E}(X_1)}{\operatorname{E}(X_2)} \right]} &= \left[ \frac{\widehat{\operatorname{E}}(X_1)}{\widehat{\operatorname{E}}(X_2)} \right] \Big/ \exp\left( \frac{1}{2} \widehat{\left( \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} + \frac{\sigma_1^4}{2(n_1-1)} + \frac{\sigma_2^4}{2(n_2-1)} \right)} \right) \\ &\approx \left[e^{(\widehat \mu_1 - \widehat \mu_2) + \frac{1}{2}\left(S_1^2 - S_2^2\right)}\right] \Big/ \exp\left( \frac{1}{2}\left( \frac{S_1^2}{n_1} + \frac{S_2^2}{n_2} + \frac{S_1^4}{2(n_1-1)} + \frac{S_2^4}{2(n_2-1)} \right) \right) \end{align} }} \begin{align} \operatorname{E}\left[ \frac{\widehat{\operatorname{E}}(X_1)}{\widehat{\operatorname{E}}(X_2)} \right] &= \operatorname{E}\left[\exp\left(\left(\widehat \mu_1 - \widehat \mu_2\right) + \tfrac{1}{2} \left(S_1^2 - S_2^2\right)\right)\right] \\ &\approx \exp\left[{(\mu_1 - \mu_2) + \frac{1}{2}(\sigma_1^2 - \sigma_2^2) + \frac{1}{2}\left( \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} + \frac{\sigma_1^4}{2(n_1-1)} + \frac{\sigma_2^4}{2(n_2-1)} \right) }\right] \end{align}
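The Cox-type interval for the ratio of means derived above can be sketched the same way (again, names are mine):

```python
import numpy as np
from scipy.stats import norm

def ratio_of_means_ci(x1, x2, alpha=0.05):
    """Cox-type CI for E[X1]/E[X2] of two independent log-normal samples:
    a normal interval for (mu1 - mu2) + (sigma1^2 - sigma2^2)/2 on the
    log scale, then back-transformed."""
    l1, l2 = np.log(np.asarray(x1)), np.log(np.asarray(x2))
    n1, n2 = len(l1), len(l2)
    s2_1, s2_2 = l1.var(ddof=1), l2.var(ddof=1)
    center = (l1.mean() - l2.mean()) + 0.5 * (s2_1 - s2_2)
    se = np.sqrt(s2_1 / n1 + s2_2 / n2
                 + s2_1**2 / (2 * (n1 - 1)) + s2_2**2 / (2 * (n2 - 1)))
    q = norm.ppf(1 - alpha / 2)
    return np.exp(center - q * se), np.exp(center + q * se)
```

For two samples drawn from the same distribution, the interval should land close to a ratio of 1.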
===Extremal principle of entropy to fix the free parameter σ===
In applications, \sigma is a parameter to be determined. For growing processes balanced by production and dissipation, the use of an extremal principle of Shannon entropy shows that \sigma = \frac{1}{\sqrt{6}}. This value can then be used to give a scaling relation between the inflexion point and the maximum point of the log-normal distribution. The value \sigma = 1 \big/ \sqrt{6} is used to provide a probabilistic solution for the Drake equation.

==Occurrence and applications==