Log-Normal Distribution

I derive some basic properties of the log-normal distribution.

Let $X$ be a normal random variable with mean $\mu$ and variance $\sigma^2$:

$$X \sim \mathcal{N}(\mu, \sigma^2). \tag{1}$$

Now define $Y$ as

$$Y = \exp(X). \tag{2}$$

We say that $Y$ is log-normally distributed with parameters $\mu$ and $\sigma$, or

$$Y \sim \text{lognormal}(\mu, \sigma). \tag{3}$$

Alternatively, we could say that $\log Y$ is normally distributed,

$$\log Y \sim \mathcal{N}(\mu, \sigma^2). \tag{4}$$

Let’s work through some basic properties of $Y$.

Non-negativity. Perhaps the first thing to observe is that $Y$ is a non-negative random variable (Figure 1). This is because $e^x$ is positive for any value of $x$. Thus, the log-normal distribution often arises in cases where non-negativity is an important property of the data being modeled.

Figure 1. Normal (left) and log-normal (right) distributions, both with parameters $\mu=0$ and $\sigma=1$. The normal distribution's measures of central tendency (mean, median, mode) are all equal, while the log-normal distribution's measures differ because of its skew.
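This is easy to check by simulation. Here is a minimal sketch (with arbitrary standard-normal parameters) that exponentiates normal draws:

```python
import numpy as np

# Draw X ~ N(0, 1) and set Y = exp(X). Every sample of Y is positive
# because exp(x) > 0 for all real x.
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=100_000)
y = np.exp(x)
print(y.min() > 0)  # True
```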

Moments. The second thing to observe is that the parameters $\mu$ and $\sigma^2$ are the mean and variance of $X$, but they are not the mean and variance of $Y$. The mean of $Y$ is

$$\mathbb{E}[Y] = \mathbb{E}\left[\exp(X)\right] = \exp\left\{ \mu + \frac{1}{2} \sigma^2 \right\}. \tag{5}$$

This is just a special case of the $k$-th moment of the log-normal distribution. In general, the $k$-th moment is

$$\mathbb{E}[Y^k] = \mathbb{E}[e^{kX}] = \exp\left\{ k \mu + \frac{1}{2} k^2 \sigma^2 \right\}. \tag{6}$$

See A1 for details. The variance is

$$\mathbb{V}[Y] = \left(\exp\left\{\sigma^2\right\} - 1\right) \exp\left\{2\mu + \sigma^2\right\}, \tag{7}$$

which can also be derived from Equation 6. See A2 for details.
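We can sanity-check Equations 5 and 7 with a quick Monte Carlo estimate. The parameter values below are arbitrary, chosen only for illustration:

```python
import numpy as np

# Monte Carlo check of the mean (Equation 5) and variance (Equation 7)
# of Y ~ lognormal(mu, sigma), with arbitrary illustrative parameters.
rng = np.random.default_rng(0)
mu, sigma = 0.5, 0.8
y = np.exp(rng.normal(mu, sigma, size=1_000_000))

mean_formula = np.exp(mu + 0.5 * sigma**2)
var_formula = (np.exp(sigma**2) - 1) * np.exp(2 * mu + sigma**2)

print(np.isclose(y.mean(), mean_formula, rtol=0.01))
print(np.isclose(y.var(), var_formula, rtol=0.05))
```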

Density functions. The cumulative distribution function (CDF) of $Y$ is

$$F_Y(y) = \Phi\left(\frac{\log y - \mu}{\sigma}\right), \tag{8}$$

where $\Phi(x)$ is the CDF of the standard normal distribution. This is trivial to derive:

$$\mathbb{P}(Y \leq y) = \mathbb{P}(X \leq \log y). \tag{9}$$

Standardizing $X$ then gives Equation 8. We can differentiate Equation 8 to compute the probability density function (PDF) of $Y$, which is

$$f_Y(y) = \frac{1}{y \sigma \sqrt{2\pi}} \exp\left\{ -\frac{1}{2}\left[\frac{\log y - \mu}{\sigma} \right]^2 \right\}. \tag{10}$$

See A3 for details.
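Since the PDF is the derivative of the CDF, we can verify Equation 10 by numerically differentiating Equation 8. The sketch below uses central differences and arbitrary illustrative parameters:

```python
import numpy as np
from scipy.stats import norm

# Compare the closed-form PDF (Equation 10) against a numerical
# derivative of the CDF (Equation 8). Parameter values are arbitrary.
mu, sigma = 0.3, 0.6
y = np.linspace(0.5, 5, 10)
eps = 1e-6

def cdf(t):
    # Equation 8: F_Y(t) = Phi((log t - mu) / sigma).
    return norm.cdf((np.log(t) - mu) / sigma)

# Equation 10.
pdf = np.exp(-0.5 * ((np.log(y) - mu) / sigma) ** 2) / (y * sigma * np.sqrt(2 * np.pi))

# Central differences of the CDF should match the closed-form PDF.
num_pdf = (cdf(y + eps) - cdf(y - eps)) / (2 * eps)
print(np.allclose(num_pdf, pdf, atol=1e-5))  # True
```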

Measures of central tendency. Using the CDF in Equation 8, we can compute the median $m$ of $Y$, which is

$$m := \exp(\mu). \tag{11}$$

See A4 for details. And using the PDF in Equation 10, we can compute the mode $d$, which is

$$d := \exp(\mu - \sigma^2). \tag{12}$$

See A5 for details. Given Equations 5, 11, and 12, we can order these measures of central tendency as

$$\exp(\mu - \sigma^2) \leq \exp(\mu) \leq \exp\!\left(\mu + \frac{1}{2} \sigma^2 \right). \tag{13}$$

This tells us that a log-normal distribution’s measures are ordered left-to-right as mode, median, and then mean (Figure 1, right).
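The ordering in Equation 13 is easy to confirm numerically. Here is a sketch using arbitrary parameter values, with an empirical check of the median via sampling:

```python
import numpy as np

# Mode (Equation 12), median (Equation 11), and mean (Equation 5) of
# Y ~ lognormal(mu, sigma), with arbitrary illustrative parameters.
mu, sigma = 1.0, 0.9
mode = np.exp(mu - sigma**2)
median = np.exp(mu)
mean = np.exp(mu + 0.5 * sigma**2)
print(mode <= median <= mean)  # True

# The sample median of draws of Y should be close to exp(mu).
rng = np.random.default_rng(1)
y = np.exp(rng.normal(mu, sigma, size=500_000))
print(np.isclose(np.median(y), median, rtol=0.01))
```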

Parameterizations. Not only is $\mu$ not the mean of $Y$, it is not even a clean measure of central tendency. This is because $\mu$ shifts $\log y$ rather than $y$. So the dispersion of $Y$ increases as either $\mu$ or $\sigma$ increases (Equation 7 and Figure 2).

Figure 2. Several log-normal distributions with (left) the parameter $\sigma$ fixed and (right) the parameter $\mu$ fixed. We can see that both the central tendency and dispersion of $Y$ depend on $\mu$ and $\sigma$.

Given that $\mu$ and $\sigma$ are not actually the mean and standard deviation of $Y$, we can consider alternative, more natural parameterizations. One choice is to take the exponent of each parameter, so

$$\mu^* = e^{\mu}, \qquad \sigma^* = e^{\sigma}. \tag{14}$$

We have already seen that $\mu^*$ is the median of $Y$, while $\sigma^*$ captures the dispersion of $Y$, although it is not the variance of $Y$.

As a final note, some statistical libraries use different parameterizations. In my mind, it is easiest to think of the “canonical” parameterization as the one used in this post and to then convert to alternative forms as needed. For an example, see A6 for details on SciPy’s parameterization of the log-normal distribution.

Appendix

A1. Moments

We want to find the $k$-th moment of $Y = e^X$ when $X \sim \mathcal{N}(\mu, \sigma^2)$. This means we want to simplify

$$\mathbb{E}[Y^k] = \mathbb{E}[e^{kX}] = \int_{-\infty}^{\infty} \exp\left\{kx\right\} \frac{1}{\sqrt{2 \pi \sigma^2}} \exp\left\{ -\frac{1}{2} \left[ \frac{x - \mu}{\sigma} \right]^2 \right\} dx. \tag{A1.1}$$

We combine the exponent terms into

$$\exp\left\{ -\frac{1}{2} \left[ \frac{x - \mu}{\sigma} \right]^2 + kx\right\}. \tag{A1.2}$$

Then all we need to do is complete the square so that the exponent is again a quadratic in $x$. We can then pull out any terms that do not depend on $x$ and see that the integral must be unity because probabilities are normalized. So let’s write the exponent as

$$\begin{aligned} -\frac{1}{2} \left[ \frac{x - \mu}{\sigma} \right]^2 + kx &= -\frac{1}{2\sigma^2} \left[ x^2 + \mu^2 - 2x\mu - 2 \sigma^2 kx \right] \\ &= -\frac{1}{2\sigma^2} \left[ x^2 + \mu^2 - 2x\mu - 2 \sigma^2 kx + (2 \sigma^2 k \mu + k^2 \sigma^4) - (2 \sigma^2 k \mu + k^2 \sigma^4) \right] \\ &= -\frac{1}{2\sigma^2} \left[ (x - \mu - k \sigma^2)^2 - (2 \sigma^2 k \mu + k^2 \sigma^4) \right]. \end{aligned} \tag{A1.3}$$

So we can rewrite Equation A1.2 above as

$$\exp\left\{ -\frac{1}{2 \sigma^2} \left[ x - \mu - k \sigma^2 \right]^2 \right\} \exp\left\{ k \mu + \frac{1}{2} k^2 \sigma^2 \right\}. \tag{A1.4}$$

The right term does not depend on $x$ and can thus be pulled out of the integral, giving us

$$\mathbb{E}[e^{kX}] = \exp\left\{ k \mu + \frac{1}{2} k^2 \sigma^2 \right\} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2 \pi \sigma^2}} \exp\left\{ -\frac{1}{2 \sigma^2} \left[ x - \mu - k \sigma^2 \right]^2 \right\} dx. \tag{A1.5}$$

The integral must be equal to unity and therefore

$$\mathbb{E}[e^{kX}] = \exp\left\{ k \mu + \frac{1}{2} k^2 \sigma^2 \right\}. \tag{A1.6}$$

We can easily compute the mean and variance of $Y$ using Equation A1.6.
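As a sanity check on Equation A1.6, we can compare the formula against a Monte Carlo estimate of a higher moment, say the third. The parameter values below are arbitrary:

```python
import numpy as np

# Monte Carlo check of Equation A1.6 for the third moment (k = 3).
# The parameter values are arbitrary illustrative choices.
rng = np.random.default_rng(2)
mu, sigma, k = 0.0, 0.4, 3
y = np.exp(rng.normal(mu, sigma, size=2_000_000))

moment_formula = np.exp(k * mu + 0.5 * k**2 * sigma**2)
print(np.isclose((y**k).mean(), moment_formula, rtol=0.02))
```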

A2. Variance

Using A1, we can see that the variance of $Y$ is

$$\begin{aligned} \mathbb{V}[Y] &= \mathbb{E}[Y^2] - \mathbb{E}[Y]^2 \\ &= \exp\left\{ 2\mu + 2\sigma^2 \right\} - \left[ \exp\left\{ \mu + \frac{1}{2}\sigma^2 \right\} \right]^2 \\ &= \left( \exp\left\{ \sigma^2 \right\} - 1 \right) \exp\left\{ 2 \mu + \sigma^2 \right\}. \end{aligned} \tag{A2.1}$$

And we’re done.

A3. Probability density function

Let $\Phi(x)$ and $\varphi(x)$ denote the CDF and PDF of the standard normal distribution, respectively. Then the PDF of $Y$ is

$$\begin{aligned} f_Y(y) &= \frac{d}{dy}\mathbb{P}(Y \leq y) \\ &= \frac{d}{dy} \mathbb{P}(X \leq \log y) \\ &= \frac{d}{dy} \Phi\!\left( \frac{\log y - \mu}{\sigma} \right) \\ &= \varphi\!\left( \frac{\log y - \mu}{\sigma} \right) \frac{d}{dy}\!\left( \frac{\log y - \mu}{\sigma} \right) \\ &= \varphi\!\left( \frac{\log y - \mu}{\sigma} \right) \frac{1}{y \sigma}. \end{aligned} \tag{A3.1}$$

Using the definition of $\varphi(x)$, we have

$$f_Y(y) = \frac{1}{y \sigma \sqrt{2\pi}} \exp\left\{ -\frac{1}{2}\left[\frac{\log y - \mu}{\sigma} \right]^2 \right\}, \tag{A3.2}$$

as desired.

A4. Median

The median $m$ of a random variable is a constant such that

$$\mathbb{P}\left(-\infty \leq Y \leq m\right) = \mathbb{P}\left(m \leq Y \leq \infty \right). \tag{A4.1}$$

In words, half of the probability is on either side of $m$. We can see that in the case that $Y$ is log-normally distributed, we have

$$\begin{aligned} \mathbb{P}\left(-\infty \leq \exp X \leq m\right) &= \mathbb{P}\left( m \leq \exp X \leq \infty\right) \\ &\Downarrow \\ \mathbb{P}\left(-\infty \leq X \leq \log m\right) &= \mathbb{P}\left( \log m \leq X \leq \infty\right). \end{aligned} \tag{A4.2}$$

But for $X$, the median is $\mu$, and therefore we have $\mu = \log m$, which implies that $m = \exp \mu$, as desired.

A5. Mode

To compute the mode $d$ of a distribution, we want to compute

$$d := y^{\star} = \arg\!\max_{y} f_Y(y). \tag{A5.1}$$

To compute this, we take the derivative of the PDF, set it equal to zero, and solve for $y$. In addition, we should confirm that $y^{\star}$ is a local maximum using a second derivative test.

The first derivative is

$$\begin{aligned} f^{\prime}_Y(y) &= -\frac{e^{-\frac{1}{2} ((\log y - \mu) / \sigma)^2}}{y^2 \sigma \sqrt{2\pi}} + \frac{e^{-\frac{1}{2} ((\log y - \mu) / \sigma)^2}}{y \sigma \sqrt{2\pi}} \left(\frac{\mu - \log y}{\sigma}\right) \frac{1}{y \sigma} \\ &= -\frac{1}{y^2 \sigma \sqrt{2 \pi}} e^{-\frac{1}{2} ((\log y - \mu) / \sigma)^2} \left[1 + \frac{\log y - \mu}{\sigma^2}\right]. \end{aligned} \tag{A5.2}$$

Let’s set this equal to zero and solve for $y$:

$$\begin{aligned} 0 &= -\frac{1}{y^2 \sigma \sqrt{2 \pi}} e^{-\frac{1}{2} ((\log y - \mu) / \sigma)^2} \left[1 + \frac{\log y - \mu}{\sigma^2}\right] \\ &\Downarrow \\ 0 &= 1 + \frac{\log y - \mu}{\sigma^2} \\ &\Downarrow \\ y &= \exp(\mu - \sigma^2). \end{aligned} \tag{A5.3}$$

We should confirm this is a maximum with a second derivative test. However, I don’t want to take the derivative of Equation A5.2. See the Book of Statistical Proofs for a complete proof.
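Alternatively, we can check the result numerically by maximizing the PDF in Equation 10 directly. This sketch uses arbitrary parameter values:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Numerically maximize the PDF (Equation 10) and compare the maximizer
# against the claimed mode exp(mu - sigma^2). Parameters are arbitrary.
mu, sigma = 0.5, 0.7

def neg_pdf(y):
    # Negated Equation 10, since minimize_scalar minimizes.
    return -np.exp(-0.5 * ((np.log(y) - mu) / sigma) ** 2) / (y * sigma * np.sqrt(2 * np.pi))

res = minimize_scalar(neg_pdf, bounds=(1e-6, 20), method="bounded")
print(np.isclose(res.x, np.exp(mu - sigma**2), atol=1e-4))
```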

A6. SciPy

SciPy uses a parameter s for $\sigma$ and a parameter scale for $\exp \mu$. This is a SciPy convention in which multiple distributions share the same parameter names (loc, shape, scale, …). Since $\mu$ is not strictly a location parameter (it also affects the dispersion), we can only specify $\mu$ a la Equation 3 through the median $\exp \mu$. I am not sure why the argument for $\sigma$ is named s rather than shape.

Here is a sanity check,

import numpy as np
from scipy.stats import lognorm

mu = 2
sigma = 0.7
m, v = lognorm(scale=np.exp(mu), s=sigma).stats(moments="mv")

print(np.exp(mu + 0.5 * sigma**2))
# 9.440415556460355
print(m.item())
# 9.440415556460353

print((np.exp(sigma**2) - 1) * np.exp(2 * mu + sigma**2))
# 56.352935774951334
print(v.item())
# 56.35293577495132

which confirms our understanding of the parameterization.