Exponential Decay

Many phenomena can be modeled as exponential decay. I discuss this model in detail, focusing on natural exponential decay (base $e$) and various useful properties.

A quantity exhibits exponential decay if it decreases at a rate proportional to its current value. Formally, a quantity $y = f(x)$ decays exponentially if

$$
f(x) = y_0 (1 - r)^x, \quad x \geq 0, \quad r \in (0, 1), \tag{1}
$$

where $y_0 \triangleq y(0)$ is the initial quantity and $r$ is the rate of decay. To keep the discussion focused, this post covers only exponential decay. However, exponential growth, in which a quantity increases at a rate proportional to its current value, is given by Equation 1 with $(1 - r)$ replaced by $(1 + r)$ and $r \gt 0$. In that case, $r$ is the rate of growth.

For an example of exponential decay, let $y_0 = 100$ and $r = 0.5$. Then $y$ begins with an initial value of $100$ and then decreases by half of its current value for each integer increment of $x$:

$$
\begin{array}{c|c}
x & y \\
\hline
0 & 100 (1/2)^0 = 100 \\
1 & 100 (1/2)^1 = 50 \\
2 & 100 (1/2)^2 = 25 \\
\vdots & \vdots
\end{array} \tag{2}
$$

Of course, $x$ need not be an integer. For example, $f(0.2) \approx 87.06$ in the current example. See Figure 1 for exponential decay for several rates $r$ and $y_0 = 1$.

Figure 1. Exponential decay, $f(x) = (1 - r)^x$, for several decay rates $r \in (0, 1)$.
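As a quick check of the worked example above, here is a minimal Python sketch of Equation 1. The function name `decay` and its default arguments are my own choices for illustration, not something from the text.

```python
def decay(x, y0=100.0, r=0.5):
    """Equation 1: y0 * (1 - r) ** x, for x >= 0 and 0 < r < 1."""
    return y0 * (1.0 - r) ** x

# Integer steps reproduce the rows of the table in Equation 2.
table = [decay(x) for x in range(3)]
print(table)  # [100.0, 50.0, 25.0]

# x need not be an integer.
print(round(decay(0.2), 2))  # 87.06
```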

Natural exponential decay

Equation 1 with an integer-valued $x$ is probably how a teacher would explain exponential decay to children. However, beyond an introductory level, exponential decay or growth is conventionally expressed in terms of Euler’s number $e$ and a decay parameter $\lambda$:

$$
f(x) = y_0 e^{-\lambda x}, \quad x \geq 0, \quad \lambda \gt 0. \tag{3}
$$

Here, $\lambda$ controls how fast $y$ decays (Figure 2). This formulation is sometimes referred to as a natural exponential function, the counterpart to the natural logarithm $f(x) = \log_e(x) = \ln(x)$. (We will see why this formulation is “natural” in a moment.)

Figure 2. Exponential decay, $f(x) = e^{-\lambda x}$, for several values of the decay parameter, $\lambda \gt 0$.

We can easily switch between these two parameterizations:

$$
\begin{aligned}
y_0 (1-r)^{x} &= y_0 e^{-\lambda x} \\
&\Downarrow \\
\lambda &= -\ln(1-r).
\end{aligned} \tag{4}
$$

We can plot this relationship (Figure 3) to better understand it. As $r$, the rate of decay, increases toward $1$, the decay parameter $\lambda$ also increases, and it does so without bound.

Figure 3. The relationship between $\lambda$ and $r$.
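The conversion in Equation 4 is short enough to sketch in Python. The helper names `rate_to_lambda` and `lambda_to_rate` are hypothetical; the second is simply Equation 4 solved for $r$.

```python
import math

def rate_to_lambda(r):
    """Equation 4: lambda = -ln(1 - r), for 0 < r < 1."""
    return -math.log(1.0 - r)

def lambda_to_rate(lam):
    """Equation 4 solved for r: r = 1 - exp(-lambda)."""
    return 1.0 - math.exp(-lam)

lam = rate_to_lambda(0.5)
print(lam)  # ln(2), roughly 0.693

# Both parameterizations give the same curve, e.g. at x = 3 with y0 = 1.
x = 3.0
print((1.0 - 0.5) ** x, math.exp(-lam * x))
```

With $r = 0.5$, the decay parameter is $\ln(2)$, which reappears in the halflife discussion.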

Why is the mathematical constant $e$ used in so-called natural exponential decay? According to Wikipedia, the constant $e$ was discovered by Jacob Bernoulli in 1683, when studying compound interest (exponential growth). Thus, $e$ arose from understanding exponential functions.

Here is the problem Bernoulli was thinking about. Imagine that you had $p$ of some currency, say \$10 USD, in an account that pays $100\%$ interest per year. Then after one year, you would have $p (1+1) = 2p$ or \$20 USD. Now imagine that interest compounds once every half-year but pays $50\%$ interest per period. Then after two half-years, you would have $p (1+0.5)^2 = 2.25p$ or \$22.50 USD. Now imagine that interest compounds once every quarter-year but pays $25\%$ interest per period. Then after four quarter-years, you would have $p (1+0.25)^4 \approx 2.44p$ or approximately \$24.41 USD.

The question Bernoulli asked was: what happens when interest compounds continuously? This question is particularly interesting because so many physical systems and phenomena grow or decay in a continuous manner. For example, a cell might split into two cells in a given time period, but the process is not discrete. In reality, the cell is not a single cell and then, in an instant, two cells. Instead, it is continuous, with growth upon growth. (This YouTube video has an excellent visualization of what I mean here.)

Let’s formalize this. Let $n$ be the number of periods, and $1/n$ be the interest rate per period. We want to compute the limit as $n$ approaches infinity. What Bernoulli observed is that this limit is equal to a constant, which is conventionally denoted $e$:

$$
e = \lim_{n \rightarrow \infty} \left(1 + \frac{1}{n}\right)^n. \tag{5}
$$

Thus, with continuous compounding, the amount of money after one year is $pe$. While I won’t prove Equation 5, it is easy and useful to visualize it (Figure 4).

Figure 4. Discrete approximations (yellow step-wise functions) of the continuous exponential function $f(x) = e^{x}$ (blue lines) for an increasing number of time periods $n$. As $n$ increases, the total amount compounded approaches the mathematical constant $e$.
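The convergence in Equation 5 is also easy to watch numerically. The sketch below is not a proof; it just evaluates the compounding factor for a few period counts that I picked for illustration.

```python
import math

def compound(n):
    """Growth factor after one year with n periods at rate 1/n per period."""
    return (1.0 + 1.0 / n) ** n

# n = 1, 2, 4 match the yearly, half-yearly, and quarterly cases above.
for n in (1, 2, 4, 12, 365, 1_000_000):
    print(n, compound(n))

print("e =", math.e)
```

The first three values are $2$, $2.25$, and about $2.44$, and the later ones creep up toward $e$ from below.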

So what is $e$? Like $\pi$, $e$ is a transcendental number, and approximations of it have improved over the centuries. Here are the first $21$ digits of $e$:

$$
e \approx 2.71828\;18284\;59045\;23536\;\dots \tag{6}
$$

So this is one way to think about exponential decay (and growth) of base $e$, and why it is considered a natural formulation: $e$ simply emerges when we compute exponential growth continuously.

It’s worth spending another moment to think about this. Why should the initial rate of growth (the rate of growth that Bernoulli considered) be $r=1$, meaning a doubling every year? One argument is that doubling is the most “natural” rate of growth, since the quantity is always increasing by its current amount. Of course, we could imagine a rate of growth of $2.9$ or $0.76$. However, unity is elegant and simplifies the derivations. Furthermore, we can simply generalize the definition of $e$ in Equation 5 to account for any arbitrary growth rate $\alpha$:

$$
e^{\alpha} = \lim_{n \rightarrow \infty} \left(1 + \frac{\alpha}{n}\right)^n. \tag{7}
$$

See this Wikipedia page for details. This gives us a nice interpretation of the decay parameter $\lambda$ in Equation 3: setting $\alpha = -\lambda$, it is the rate of decay when that decay is continuous! See Figure 5 for some examples.

Figure 5. The exponential function, $f(x) = e^{\alpha x}$, for both negative (decay) and positive (growth) values of $\alpha$.
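Equation 7 can be checked numerically in the same way as Equation 5. In this sketch the function name `compound_rate` and the choice of $n = 10^6$ are mine; the rates $0.76$ and $2.9$ are the ones mentioned above, and $-1$ is added as a decay case.

```python
import math

def compound_rate(alpha, n):
    """Finite-n version of Equation 7: (1 + alpha / n) ** n."""
    return (1.0 + alpha / n) ** n

# Negative alpha models decay, positive alpha models growth.
for alpha in (-1.0, 0.76, 2.9):
    approx = compound_rate(alpha, 1_000_000)
    print(alpha, approx, math.exp(alpha))
```

For $\alpha = -1$ the value approaches $1/e$, which is the decay counterpart of Bernoulli's limit.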

In my mind, Equation 7 and Figure 5 really capture why exponential decay and growth are naturally base $e$. With exponential growth, Euler’s number represents the total amount of the quantity after one unit of time when a $100\%$ growth rate (a doubling, if compounded only once) is instead compounded continuously. With exponential decay, the inverse of Euler’s number ($1/e$) represents the remaining amount of the quantity when a $100\%$ decay rate is applied continuously ($\alpha = -1$ in Equation 7). And we can easily model faster or slower rates of growth or decay by changing $\alpha$.

Halflife

A common way to think about exponential decay is in terms of the decaying quantity’s halflife, which quantifies how fast the decay is occurring. The halflife is the time required for the decaying quantity to be reduced to half its initial value. Here, $x$ represents time, and the halflife is the value of $x$, call this $x_{\textsf{HL}}$, such that

$$
f(x_{\textsf{HL}}) = y_0 / 2. \tag{8}
$$

We can easily solve for $x_{\textsf{HL}}$:

$$
\begin{aligned}
y_0 / 2 &= y_0 e^{-\lambda x_{\textsf{HL}}} \\
1/2 &= e^{-\lambda x_{\textsf{HL}}} \\
-\ln(2) &= -\lambda x_{\textsf{HL}} \\
&\Downarrow \\
x_{\textsf{HL}} &= \frac{\ln(2)}{\lambda}.
\end{aligned} \tag{9}
$$

See Figure 6 for a new version of Figure 2 with the halflives denoted with vertical lines. As we can see, there is a one-to-one relationship between the decay parameter and the halflife. The halflife is nice because it is pretty intuitive. If we know that a process has a halflife of $x_{\textsf{HL}} = 5$, for example, we know that it will decay to half its initial value in $5$ time periods.

Figure 6. Exponential decay for various decay parameters $\lambda$, along with their halflives (vertical dashed lines).

We can also express exponential decay in terms of the halflife parameter, by plugging $\lambda = \ln(2) / x_{\textsf{HL}}$ into Equation 3:

$$
\begin{aligned}
f(x) &= y_0 e^{-\lambda x} \\
&= y_0 e^{-(\ln(2) / x_{\textsf{HL}}) x} \\
&= y_0 \left( \frac{1}{2} \right)^{x / x_{\textsf{HL}}}.
\end{aligned} \tag{10}
$$

We can see that when $x = x_{\textsf{HL}}$, Equation 10 is equal to $y_0 / 2$. When $x \gt x_{\textsf{HL}}$, Equation 10 gives less than $y_0 / 2$, and when $x \lt x_{\textsf{HL}}$, it gives more than $y_0 / 2$.
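To make the halflife formulas concrete, here is a small Python sketch. The function names and the example values ($y_0 = 100$, halflife $5$) are assumptions for illustration only.

```python
import math

def half_life(lam):
    """Halflife for decay parameter lam: ln(2) / lam."""
    return math.log(2) / lam

def decay_from_half_life(x, x_hl, y0=1.0):
    """Decay written in terms of the halflife: y0 * (1/2) ** (x / x_hl)."""
    return y0 * 0.5 ** (x / x_hl)

lam = math.log(2) / 5.0   # decay parameter chosen so the halflife is 5
x_hl = half_life(lam)
print(x_hl)

# One halflife leaves half the initial amount, two halflives a quarter.
print(decay_from_half_life(5.0, x_hl, y0=100.0))
print(decay_from_half_life(10.0, x_hl, y0=100.0))

# The halflife form agrees with y0 * exp(-lam * x) at an arbitrary point.
print(decay_from_half_life(3.0, x_hl, y0=100.0), 100.0 * math.exp(-lam * 3.0))
```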

Mean lifetime

So far, we have thought about exponential decay as a function. However, we can also think of exponential decay as a probabilistic model. This interpretation adds a lot of intuition for how exponential decay behaves in the wild, e.g. how a sample of many radioactive nuclei decays.

Here is the idea. Imagine a hundred people are in a room together, and each person has a fair coin. Every minute, everyone in the room flips their coin. If a person’s coin comes up heads, which happens with probability $p = 0.5$, they must leave the room. Otherwise, they stay. This experiment simulates exponential decay with a halflife of one minute, where the decaying quantity is the number of people in the room (Figure 7).

Figure 7. Simulation of the following experiment: $100$ people each flip a fair coin. For each person, if their coin is heads (probability $p = 0.5$), they must leave the room. Otherwise, they stay. In each subsequent trial, the remaining people flip their coins again to decide whether they must leave or stay. The total number of people in the room follows, in expectation, an exponentially decaying function, $f(x) = y_0 (1 - p)^x$.
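The coin-flip experiment is simple to simulate. The sketch below is one possible implementation, not the code behind Figure 7; the function name, the number of rounds, and the fixed seed are all my own assumptions.

```python
import random

def simulate_room(n_people=100, p_leave=0.5, n_rounds=8, seed=0):
    """Return the number of people in the room after each round of flips."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    remaining = n_people
    counts = [remaining]
    for _round in range(n_rounds):
        # Each person still in the room flips once; heads means they leave.
        leavers = sum(1 for _person in range(remaining) if rng.random() < p_leave)
        remaining -= leavers
        counts.append(remaining)
    return counts

counts = simulate_room()
for t, count in enumerate(counts):
    expected = 100 * (1 - 0.5) ** t  # y0 * (1 - p) ** x
    print(t, count, round(expected, 2))
```

Any single run is noisy, but the simulated counts should track the expected curve, roughly halving each round.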

What’s the probabilistic model here? Let $X_i$ be a random variable denoting how long person $i$ remains in the room. We assume each person has a coin with the same bias $p$, and that all the coin flips, across time and individuals, are independent of each other. So the probability that $X_i$ takes on a value $x$ (the probability that person $i$ remains in the room until time $x$) must be

$$
\mathbb{P}(X_i = x) = (1-p)^{x-1} p. \tag{11}
$$

This is the probability that person $i$’s coin landed on tails (with probability $(1-p)$) for $x-1$ trials before landing on heads (with probability $p$). Clearly, $X_i$ is geometrically distributed. And the continuous analog to the geometric distribution is the exponential distribution. So if person $i$ were to continuously flip their coin, then $X_i$ would be exponentially distributed with density function

$$
\mathbb{P}(X_i = x) = \lambda e^{-\lambda x}. \tag{12}
$$

The density defined by Equation 12 has the form of exponential decay with initial value $y_0 = \lambda$, which is exactly the initial value that makes the total area under the curve equal to one. See A1 for the derivation of the normalizing constant $\lambda$. Furthermore, we know that the expected value of an exponential random variable with parameter $\lambda$ is

$$
\mathbb{E}[X_i] = \frac{1}{\lambda}. \tag{13}
$$

See A2 for a derivation. And this expectation can be viewed as the mean lifetime of all the people in the room.

Mean lifetime is interesting in a few ways. First, it immediately suggests yet another way to think about the decay rate $\lambda$. If $\lambda = 0.1$, for example, then the mean lifetime is $10$. So as $\lambda$ decreases, the expected lifetime increases (for an element in a set with $y_0$ elements).

Second, if we plug $1 / \lambda$ into Equation 3, something fascinating happens. We find that the mean lifetime occurs when the initial quantity $y_0$ has decayed to $1/e$ of its value! Let $x_{\textsf{ML}}$ denote the mean lifetime. Then

$$
f\!\left( x_{\textsf{ML}} \right) = y_0 e^{-\lambda (1 / \lambda)} = \frac{y_0}{e}. \tag{14}
$$

So the halflife is the value $x_{\textsf{HL}}$ such that the initial quantity $y_0$ has decayed to half of its value, while the mean lifetime is the value $x_{\textsf{ML}}$ such that the initial quantity $y_0$ has decayed to $1/e$ of its value (approximately $0.37$, a little more than $1/3$). This is yet another reason why exponential decay with base $e$ is considered special or natural.

Figure 8. Exponential decay for various decay parameters $\lambda$, along with their halflives (lighter vertical dashed lines) and their mean lifetimes (darker vertical dashed lines). Note that the mean lifetimes intersect the exponential curves at $1/e$.

See Figure 8 for examples of both halflife and mean lifetime for various exponentially decaying functions. Note that while the halflife and mean lifetime are close in absolute terms when $\lambda$ is large, this is not true when $\lambda$ is small. The gap between them is $(1 - \ln 2) / \lambda$, so as a process decays more slowly, the difference between the time at which it reaches $1/2$ of its value and the time at which it reaches $1/e$ of its value becomes bigger and bigger.
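The mean lifetime can be checked two ways in Python: analytically, by evaluating the decay curve at $1/\lambda$, and empirically, by sampling lifetimes. The sketch uses the standard library's `random.expovariate`; the values $\lambda = 0.1$ and $y_0 = 100$, the sample size, and the seed are assumptions for illustration.

```python
import math
import random

lam = 0.1
y0 = 100.0

mean_lifetime = 1.0 / lam        # 1 / lambda
half_life = math.log(2) / lam    # ln(2) / lambda

# The decay curve evaluated at the mean lifetime equals y0 / e.
remaining_at_mean = y0 * math.exp(-lam * mean_lifetime)
print(remaining_at_mean, y0 / math.e)

# Empirical check: average many sampled exponential lifetimes.
rng = random.Random(0)
samples = [rng.expovariate(lam) for _ in range(200_000)]
sample_mean = sum(samples) / len(samples)
print(sample_mean, mean_lifetime)

# The gap between mean lifetime and halflife is (1 - ln 2) / lambda.
print(mean_lifetime - half_life, (1.0 - math.log(2)) / lam)
```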

Computing $\lambda$

Imagine that we do not know the decay parameter $\lambda$ (or $x_{\textsf{HL}}$ or $x_{\textsf{ML}}$), but that we have observed two values $y_i$ and $y_j$, at time points $x_i$ and $x_j$, of a quantity that we assume follows exponential decay. Then we can solve for $\lambda$ (and thus $x_{\textsf{HL}}$ and $x_{\textsf{ML}}$) directly. First, we can write the ratio of $y_i$ and $y_j$ as

$$
\frac{y_i}{y_j} = e^{-\lambda (x_i - x_j)}. \tag{15}
$$

Taking the natural logarithm of both sides, we can solve for $\lambda$:

$$
\lambda = \frac{\ln(y_i) - \ln(y_j)}{x_j - x_i}. \tag{16}
$$

Thus, if we know that a process decays exponentially, then we can estimate the decay parameter with only two data points. This is one of the useful properties of exponential decay. Not only does it capture a lot of physical phenomena, it does so with only a single parameter that can be easily estimated.
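Here is a minimal sketch of the two-point estimate in Python. The function name `estimate_lambda` and the synthetic, noise-free data are assumptions for illustration; with real, noisy measurements, two points would give only a rough estimate.

```python
import math

def estimate_lambda(x_i, y_i, x_j, y_j):
    """Two-point estimate: (ln(y_i) - ln(y_j)) / (x_j - x_i)."""
    return (math.log(y_i) - math.log(y_j)) / (x_j - x_i)

# Synthetic observations from a known decay process (no noise).
true_lam, y0 = 0.3, 50.0

def f(x):
    return y0 * math.exp(-true_lam * x)

lam_hat = estimate_lambda(1.0, f(1.0), 4.0, f(4.0))
print(lam_hat)                # recovers 0.3 up to floating-point error
print(math.log(2) / lam_hat)  # implied halflife
print(1.0 / lam_hat)          # implied mean lifetime
```

Note that $y_0$ cancels in the ratio, so the estimate does not require knowing the initial quantity.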

Integration

A final reason (that I’ll mention) that exponential decay with base $e$ is nice is that the derivative of the exponential function $e^x$ is simply $e^x$:

$$
f(x) = e^x \quad \implies \quad f^{\prime}(x) = e^x. \tag{17}
$$

This makes both derivatives and anti-derivatives relatively easy. One useful quantity that is relatively easy to compute is the area under the decay curve between two time points (Figure 9).

Figure 9. A definite integral of an exponential function (yellow shaded region).

In general, the integral between any two points $x_i$ and $x_j$ is

$$
\begin{aligned}
\int_{x_i}^{x_j} y_0 e^{-\lambda x} dx &= \frac{y_0 e^{-\lambda x}}{-\lambda} \Bigg|_{x=x_i}^{x_j} \\
&= \frac{y_0}{\lambda} \left(e^{-\lambda x_i} - e^{-\lambda x_j} \right).
\end{aligned} \tag{18}
$$

This integral simplifies nicely if $x_i = 0$ or $x_j = \infty$. If $x_i = 0$, then Equation 18 is

$$
\frac{y_0}{\lambda} \left(1 - e^{-\lambda x_j} \right). \tag{19}
$$

And if $x_j = \infty$, then Equation 18 is

$$
\frac{y_0}{\lambda} e^{-\lambda x_i}. \tag{20}
$$

Either way, the point is that computing these integrals is relatively straightforward, and effectively amounts to a difference in exponential terms, properly scaled.
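As a sanity check on the closed form, the sketch below compares $\frac{y_0}{\lambda}\left(e^{-\lambda x_i} - e^{-\lambda x_j}\right)$ against a simple midpoint-rule approximation. Both function names and the example parameters are assumptions for illustration.

```python
import math

def decay_integral(x_i, x_j, y0=1.0, lam=1.0):
    """Closed-form integral of y0 * exp(-lam * x) from x_i to x_j.

    x_j may be math.inf, since math.exp(-math.inf) evaluates to 0.0.
    """
    return (y0 / lam) * (math.exp(-lam * x_i) - math.exp(-lam * x_j))

def midpoint_rule(x_i, x_j, y0=1.0, lam=1.0, n=100_000):
    """Numerical approximation of the same integral on a finite interval."""
    h = (x_j - x_i) / n
    return h * sum(y0 * math.exp(-lam * (x_i + (k + 0.5) * h)) for k in range(n))

print(decay_integral(1.0, 3.0, y0=2.0, lam=0.5))
print(midpoint_rule(1.0, 3.0, y0=2.0, lam=0.5))

# Special cases: lower limit 0, and upper limit infinity.
print(decay_integral(0.0, 3.0, y0=2.0, lam=0.5))
print(decay_integral(0.0, math.inf, y0=2.0, lam=0.5))  # y0 / lam
```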

   

Appendix

A1. Exponential distribution’s normalizing constant

We want to solve for the normalizing constant $g$ here:

$$
1 = \int_0^{\infty} \frac{1}{g} e^{-\lambda x} dx. \tag{A1.1}
$$

Since we can easily take the anti-derivative (and derivative) of $f(x) = e^x$, we can easily integrate this to solve for $g$:

$$
\begin{aligned}
g &= \int_0^{\infty} e^{-\lambda x} dx \\
&= \frac{e^{-\lambda x}}{-\lambda} \Bigg|_{x=0}^{\infty} \\
&= \frac{1}{\lambda}.
\end{aligned} \tag{A1.2}
$$

And we’re done.

A2. Mean of exponential distribution

We want to compute the expectation

$$
\mathbb{E}[X] = \int_0^{\infty} x \mathbb{P}(X = x) dx. \tag{A2.1}
$$

We first plug in the exponential distribution’s density function to get

$$
\mathbb{E}[X] = \int_0^{\infty} \lambda x e^{-\lambda x} dx. \tag{A2.2}
$$

We can solve this by integration by parts. Define $u$, $v$, $du$, and $dv$ as

$$
\begin{aligned}
dv &= e^{-\lambda x} dx, \\
v &= -\frac{1}{\lambda} e^{-\lambda x}, \\
u &= \lambda x, \\
du &= \lambda dx.
\end{aligned} \tag{A2.3}
$$

Then we can use the formula for integration by parts,

$$
\int_a^b u dv = uv \Big|_a^b - \int_a^b v du, \tag{A2.4}
$$

which gives us

$$
\begin{aligned}
\int_0^{\infty} \lambda x e^{-\lambda x} dx &= \lambda x \left( -\frac{1}{\lambda} e^{-\lambda x} \right) \Bigg|_0^{\infty} - \int_0^{\infty} - \frac{1}{\lambda} e^{-\lambda x} \lambda dx \\
&= - x e^{-\lambda x} \Big|_0^{\infty} + \int_0^{\infty} e^{-\lambda x} dx \\
&= 0 + \frac{e^{-\lambda x}}{-\lambda} \Bigg|_0^{\infty} \\
&= \frac{1}{\lambda}.
\end{aligned} \tag{A2.5}
$$

If it’s not clear why

$$
x e^{-\lambda x} \Big|_0^{\infty} = 0, \tag{A2.6}
$$

simply rewrite the upper limit as

$$
\lim_{x \rightarrow \infty} \frac{x}{e^{\lambda x}} = \frac{\infty}{\infty} = \text{ ?} \tag{A2.7}
$$

and apply L’Hôpital’s rule.
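For completeness, here is a sketch of that last step, assuming $\lambda \gt 0$. Written with a positive exponent in the denominator, the limit has the indeterminate form $\infty / \infty$, so we differentiate the numerator and denominator separately:

$$
\lim_{x \rightarrow \infty} x e^{-\lambda x}
= \lim_{x \rightarrow \infty} \frac{x}{e^{\lambda x}}
= \lim_{x \rightarrow \infty} \frac{1}{\lambda e^{\lambda x}}
= 0.
$$

At the lower limit, $x e^{-\lambda x}$ evaluated at $x = 0$ is $0 \cdot e^{0} = 0$, so the boundary term vanishes at both ends.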