Deriving the VIX

The VIX is a benchmark for market-implied volatility. It is computed from a weighted average of variance swaps. I first derive the fair strike for a variance swap and then discuss the VIX's approximation of this formula.

The Chicago Board Options Exchange’s (CBOE) Volatility Index (VIX) is a global benchmark for market-implied volatility. However, for me, understanding the VIX’s main formula was tricky. CBOE’s whitepaper (CBOE, 2019) explains the inputs to the formula in detail—how the options are selected, how time is handled, and so forth—but does not explain where the formula comes from. And the original paper describing the VIX, (Whaley, 1993), is rich in stylized facts about the VIX but spare in technical details. So the goal of this post is simply to derive the main VIX formula.

As it turns out, deriving the VIX is nearly entirely about deriving the fair price of a variance swap. This is because the VIX can be viewed as the fair strike of a volatility swap that settles in thirty days. Thus, this post is in large part just a detailed elaboration of John Hull’s excellent note on variance swaps. See (Demeterfi et al., 1999) for a deeper treatment on the topic.

Index calculation

At a high level, the VIX is built from a weighted average of two option-implied variances, $\sigma_1^2$ and $\sigma_2^2$, each computed using a set of S&P 500 (SPX) options that expire in roughly 30 days:

$$ \text{VIX} = 100 \times \sqrt{ \left[ T_1 \sigma_1^2 \left( \frac{N_{T_2} - N_{30}}{N_{T_2} - N_{T_1}} \right) + T_2 \sigma_2^2 \left( \frac{N_{30} - N_{T_1}}{N_{T_2} - N_{T_1}} \right) \right] \frac{N_{365}}{N_{30}}}. \tag{1} $$

Thus, the VIX is a forward-looking estimate of SPX-implied volatility as an annualized percentage. Equation 1 may look complicated, but most of the terms, which I won’t bother defining, are time-based variables designed to properly weight $\sigma_1^2$ and $\sigma_2^2$. Simply put, it is just a bunch of calendar math. We could rewrite Equation 1 as

$$ \text{VIX} = 100 \times \sqrt{w_1 \sigma_1^2 + w_2 \sigma_2^2}, \tag{2} $$

where $w_1$ and $w_2$ are the appropriate time-based weights. Thus, in my mind, the key conceptual linchpin to understanding the VIX is understanding the equation for the option-implied variances $\sigma_1^2$ and $\sigma_2^2$. According to (CBOE, 2019), each $\sigma_j^2$ for $j \in \{1,2\}$ is defined as

$$ \sigma_j^2 = \frac{2}{T_j} \sum_{i} \frac{\Delta K_i}{K_i^2} e^{R_j T_j} Q(K_i) - \frac{1}{T_j} \left[ \frac{F_j}{K_{j,0}} - 1\right]^2. \tag{3} $$

Here, $T_j$ is the time to expiration, $R_j$ is the risk-free interest rate to expiration, $F_j$ is the forward index level, $K_{j,0}$ is the first strike price below $F_j$, $K_i$ is the strike price of the $i$-th option in the selected basket, $\Delta K_i$ is the interval between strikes around $K_i$, and finally $Q(K_i)$ is the midpoint of the bid-ask spread for the option with strike $K_i$. As with Equation 1, we can see that most of this is not conceptually hard. Rather, it amounts to tedious calculations for selecting options and then computing derived quantities based on those options.
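To make the bookkeeping concrete, here is a minimal Python sketch of Equation 3 for a single expiration. The function names and the sample inputs are hypothetical, and the option-selection rules from the whitepaper are not reproduced: the sketch assumes the strikes are already selected and sorted, takes the strike interval to be half the distance between neighboring strikes, and (as a further assumption) uses the gap to the single adjacent strike at the two ends.

```python
import math


def strike_intervals(strikes):
    """Delta K_i for each strike, assuming `strikes` is sorted ascending.

    Interior strikes use half the distance between their two neighbors.
    Assumption: the end strikes use the gap to their single neighbor.
    """
    n = len(strikes)
    deltas = []
    for i in range(n):
        if i == 0:
            deltas.append(strikes[1] - strikes[0])
        elif i == n - 1:
            deltas.append(strikes[-1] - strikes[-2])
        else:
            deltas.append((strikes[i + 1] - strikes[i - 1]) / 2.0)
    return deltas


def term_variance(strikes, mid_quotes, forward, k0, rate, t):
    """Equation 3 for one expiration: option sum minus the forward correction."""
    growth = math.exp(rate * t)  # e^{RT}
    option_sum = sum(
        dk / k**2 * growth * q
        for dk, k, q in zip(strike_intervals(strikes), strikes, mid_quotes)
    )
    return (2.0 / t) * option_sum - (1.0 / t) * (forward / k0 - 1.0) ** 2


# Hypothetical quotes, purely for illustration.
print(term_variance([90.0, 100.0, 110.0], [1.0, 4.0, 1.5], 100.5, 100.0, 0.0, 0.1))
```

Nothing here is specific to SPX. The function only needs strikes, mid quotes, a forward level, and a rate.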

To me, the real question is: where did Equation 3 come from? This is the essence of understanding the VIX. Thus, the goal of this post is to derive Equation 3.
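Before moving on, the calendar math in Equation 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the inputs are treated as the variances $\sigma_j^2$ from Equation 3, time is measured in minutes, $T_j = N_{T_j} / N_{365}$, and the function name and example expirations are hypothetical.

```python
import math

MINUTES_30_DAYS = 30 * 24 * 60     # N_30
MINUTES_365_DAYS = 365 * 24 * 60   # N_365


def vix_from_variances(var_1, var_2, n_t1, n_t2):
    """Equation 1: blend near- and next-term variances into a 30-day index.

    var_1, var_2: annualized variances for the two expirations.
    n_t1, n_t2:   minutes to each expiration, with n_t1 < n_t2.
    """
    t1 = n_t1 / MINUTES_365_DAYS
    t2 = n_t2 / MINUTES_365_DAYS
    w1 = (n_t2 - MINUTES_30_DAYS) / (n_t2 - n_t1)
    w2 = (MINUTES_30_DAYS - n_t1) / (n_t2 - n_t1)
    blended = (t1 * var_1 * w1 + t2 * var_2 * w2) * MINUTES_365_DAYS / MINUTES_30_DAYS
    return 100.0 * math.sqrt(blended)


# Hypothetical expirations 25 and 32 days out.
n_t1 = 25 * 24 * 60
n_t2 = 32 * 24 * 60
print(vix_from_variances(0.04, 0.04, n_t1, n_t2))  # equal variances of 0.04 give about 20
print(vix_from_variances(0.04, 0.09, n_t1, n_t2))  # lands between 20 and 30
```

When both variances are equal, the time weights cancel and the index is just $100$ times the common volatility. This is a quick sanity check that the weights interpolate to a constant 30-day horizon.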

Variance swap

To understand the VIX formula, we need to understand the fair price for a variance swap. We’ll see that this is what the VIX formula is approximating.

A variance swap is a forward contract whose payoff is proportional to the difference between realized variance $\sigma^2_R$ and strike variance $\sigma^2_K$:

$$ \text{payoff} \;\propto\; \sigma^2_R - \sigma^2_K. \tag{4} $$

At contract inception, two parties agree on a strike variance $\sigma^2_K$, and the payoff in notional or dollar terms is proportional to the difference in Equation 4. How should a dealer set the fair price $\sigma^2_K$? One approach is to define the fair price as the expected average total variance:

$$ \sigma^2_K := \mathbb{E}_{\mathbb{Q}}\left[\frac{1}{T} \int_0^T \sigma^2_t dt\right]. \tag{5} $$

Here, the expectation is with respect to the risk-neutral measure $\mathbb{Q}$, as opposed to the true probability measure $\mathbb{P}$ for the underlying physical process. In the derivatives-pricing literature, the risk-neutral measure is a key idea because it is the probability measure under which a contract’s price is exactly equal to its time-discounted expected payoff. If we used another measure to price a derivative, we would allow for arbitrage. For an example of this idea when pricing a call option, see Equation 12 in this post.

So that’s the idea. The fair strike price of a variance swap is simply the expected realized variance under the risk-neutral measure. But it’s not clear how we can actually compute Equation 5. As when pricing options, the big idea is to use a replicating portfolio. We want to model a changing position that is proportional to the desired expected variance, and the fair price of the strike is the cost to construct the replicating portfolio.

Estimating the variance

First, let’s see how to isolate and compute the desired expected variance. Let $B_t$ be a Brownian motion on a probability space with risk-neutral measure $\mathbb{Q}$. We assume the price of SPX satisfies this stochastic differential equation (SDE):

$$ \frac{d S_t}{S_t} = \mu_t dt + \sigma_t dB_t, \tag{6} $$

where $\mu_t$ is the drift of SPX, $\sigma^2_t$ is the variance of SPX, and $B_t$ is Brownian motion. At a high level, this equation states that the SPX return (left-hand side) can be decomposed into some expected return plus uncertainty around that return. See “The Process for a Stock Price” in section 14.3 in (Hull, 1993) for a deeper discussion of this common modeling assumption.

We need to solve this SDE, meaning we would like to find a functional representation of $S_t$ that is a (ideally unique) solution to Equation 6. The problem, however, is that ordinary calculus does not give us the tools to differentiate expressions of the form $f(B_t)$. Speaking very loosely, we can think of Brownian motion as too rough, too crinkly, to be differentiable using ordinary calculus. There is a rich theory for dealing with this problem called stochastic calculus (see (Steele, 2001; Shreve & others, 2004) for mathematical treatments), and the high-level view is that there is a chain rule for stochastic calculus called the Itô–Doeblin lemma (Itô, 1951; Bru & Yor, 2002). This allows us to find a unique solution to the SDE in Equation 6:

$$ S_t = S_0 \exp \left\{\int_0^t \mu_s ds - \frac{1}{2} \int_0^t \sigma_s^2 ds + \int_0^t \sigma_s dB_s \right\}. \tag{7} $$

Wikipedia has a nice derivation of this, albeit with constant drift and variance. The point is: if we assume that SPX follows the SDE in Equation 6, then the functional form of SPX is Equation 7. Notice that $S_t$ in Equation 7 is geometric Brownian motion. Even if you are unfamiliar with stochastic calculus, this representation should make some sense given our assumptions so far. It says that the current value of SPX is equal to the original value, scaled by all the accumulated drift and fluctuations from time $0$ to $t$. We exponentiate these changes since prices are non-negative.

The key point for us is that we can write the differential form of geometric Brownian motion as

$$ d (\log S_t) = \overbrace{\mu_t dt + \sigma_t dB_t}^{dS_t/S_t} - \frac{1}{2} \sigma^2_t dt. \tag{8} $$

In ordinary calculus, $d \log x = dx/x$, but this is not true here since $S_t$ is a stochastic process. Thus, we have an extra term, $(-1/2)\sigma^2_t dt$, that represents the quadratic variation of the Brownian motion. Now if we subtract Equation 8 from Equation 6, we can eliminate both the drift term and the Brownian motion! (Remember, we are looking for a useful representation of $\sigma_t^2$.) This gives us

$$ \frac{d S_t}{S_t} - d (\log S_t) = \frac{\sigma^2_t}{2} dt. \tag{9} $$

Integrating and then scaling both sides by $2/T$, we get

$$ \frac{1}{T} \int_0^T \sigma^2_t dt = \frac{2}{T} \int_0^T \frac{dS_t}{S_t} - \frac{2}{T} \int_0^T d(\log S_t). \tag{10} $$

This is promising! The left-hand side of Equation 10 is our desired total variance, and the right-hand side of Equation 10 is simply a function of the SPX price. We can write the second integral on the right as

$$ \begin{aligned} \int_0^T d (\log S_t) &= \int_0^T \mu_t dt + \int_0^T \sigma_t dB_t - \int_0^T \frac{1}{2} \sigma^2_t dt \\ &= \log S_T - \log S_0. \end{aligned} \tag{11} $$

The first step holds because of Equation 8, and the second step holds because of Equation 7.
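Equation 10 is easy to sanity-check numerically. The sketch below simulates a single path of geometric Brownian motion and compares the two sides, using Equation 11 for the log term. It rests on simplifying assumptions that are not part of the derivation: constant drift and volatility, a discrete sum of simple returns standing in for the integral of $dS_t/S_t$, and a hypothetical function name and parameter values.

```python
import math
import random


def realized_variance_check(mu=0.05, sigma=0.2, t=1.0, n_steps=20000, seed=0):
    """Simulate one GBM path and return both sides of Equation 10."""
    rng = random.Random(seed)
    dt = t / n_steps
    log_ratio = 0.0      # running log(S_t / S_0), i.e. the integral of d(log S_t)
    sum_returns = 0.0    # discrete stand-in for the integral of dS_t / S_t
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)
        d_log = (mu - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * z
        sum_returns += math.expm1(d_log)   # simple return S_{t+dt} / S_t - 1
        log_ratio += d_log
    lhs = sigma**2       # (1/T) * integral of sigma_t^2 dt when sigma is constant
    rhs = (2.0 / t) * sum_returns - (2.0 / t) * log_ratio
    return lhs, rhs


lhs, rhs = realized_variance_check()
print(lhs, rhs)  # the two values should agree up to discretization noise
```

Notice that the drift `mu` barely affects the right-hand side. This is the point of subtracting Equation 8 from Equation 6: only the variance survives.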

And once we take the expectation of Equation 10, the integral of $dS_t/S_t$ simplifies to the log ratio of the forward price $F$ to the current price $S_0$. To see that, consider this derivation:

$$ \begin{aligned} \mathbb{E}_{\mathbb{Q}}\left[ \int_0^T \frac{d S_t}{S_t} \right] &\stackrel{1}{=} \mathbb{E}_{\mathbb{Q}}\left[ \int_0^T \mu_t dt \right] + \mathbb{E}_{\mathbb{Q}}\left[ \int_0^T \sigma_t dB_t \right] \\ &\stackrel{2}{=} \mathbb{E}_{\mathbb{Q}}\left[ \int_0^T \mu_t dt \right] \\ &\stackrel{3}{=} RT \\ &\stackrel{4}{=} \log(F / S_0). \end{aligned} \tag{12} $$

The first step is just Equation 6. In the second step, we use the fact that the expected uncertainty is zero. Intuitively, we can think of the increments of Brownian motion $dB_t$ as having mean zero. In the third step, we use risk-neutrality. The expected drift is simply the risk-free rate over the time-to-maturity. And in the fourth step, we use the definition of the forward price:

$$ F := S_0 e^{RT} \quad \implies \quad \log (F / S_0) = RT. \tag{13} $$

See Chapter 5 in (Hull, 1993) for details on forward prices.

Putting it all together, we can express the fair value of a variance swap as

$$ \sigma^2_K = \mathbb{E}_{\mathbb{Q}}\left[\frac{1}{T} \int_0^T \sigma^2_t dt \right] = \frac{2}{T} \log\left(\frac{F}{S_0}\right) - \frac{2}{T} \mathbb{E}_{\mathbb{Q}} \left[ \log \left( \frac{S_T}{S_0} \right) \right]. \tag{14} $$

Note that the expected log difference can be thought of as a log contract (Neuberger, 1994). But the log contract isn’t a real tradeable asset. So what remains is to find a way to reformulate Equation 14 into something we can trade. This will allow us to construct a portfolio that replicates the strike variance, and the fair price of the variance swap is the cost of constructing the replicating portfolio.

Replicating portfolio

To construct a replicating portfolio, we need the Carr-Madan formula, which is just a special case of Taylor’s formula. Using the notation from that blog post, if we set $g(x) = \log(x)$, $S_t = S_T$, and $\kappa = S_0$, then Carr-Madan’s formula applied to our setting is

$$ \log S_T = \log S_0 + \frac{S_T - S_0}{S_0} - \int_0^{S_0} \frac{\max[0, K - S_T]}{K^2} dK - \int_{S_0}^{\infty} \frac{\max[0, S_T - K]}{K^2} dK. \tag{15} $$

As in that blog post, $\max[0, S_T - K]$ and $\max[0, K - S_T]$ are payoff functions for European-style calls and puts, respectively. This is a pretty cool idea in its own right, as it states that we can represent any log move in the underlying price as an infinite portfolio of calls and puts. Now we’re approaching something that looks like a portfolio that we can use to replicate $\sigma^2_K$!
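Equation 15 is an exact identity for any terminal price, which we can check with a simple numerical integral. The sketch below is illustrative only: the helper names are hypothetical, the midpoint rule is just one convenient quadrature, and the integration limits are tightened to the region where each payoff is nonzero (puts pay nothing for $K < S_T$ and calls pay nothing for $K > S_T$), so no truncation error is introduced.

```python
import math


def midpoint_integral(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def log_via_options(s_t, s_0):
    """Right-hand side of Equation 15 for one terminal price s_t."""
    # Put payoffs are zero below s_t, so start the put leg at min(s_t, s_0).
    put_leg = midpoint_integral(
        lambda k: max(0.0, k - s_t) / k**2, min(s_t, s_0), s_0
    )
    # Call payoffs are zero above s_t, so stop the call leg at max(s_t, s_0).
    call_leg = midpoint_integral(
        lambda k: max(0.0, s_t - k) / k**2, s_0, max(s_t, s_0)
    )
    return math.log(s_0) + (s_t - s_0) / s_0 - put_leg - call_leg


for s_t in (80.0, 100.0, 125.0):
    print(s_t, math.log(s_t), log_via_options(s_t, 100.0))
```

For a given $S_T$, only one of the two legs is active: the put leg when the index finishes below $S_0$ and the call leg when it finishes above.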

Now two observations. First, under the risk-neutral measure, the fair value of a call or a put today is simply its time-discounted expected payoff. So if $c(\cdot)$ and $p(\cdot)$ are two functions that compute the fair value of calls and puts respectively using information today, then we have

$$ \begin{aligned} c(K) &= e^{-RT} \mathbb{E}_{\mathbb{Q}}\left[\max[0, S_T-K]\right], \\ p(K) &= e^{-RT} \mathbb{E}_{\mathbb{Q}}\left[\max[0, K-S_T]\right]. \end{aligned} \tag{16} $$

Second, under the risk-neutral measure, the expected value of $S_T$ is simply the forward price of the asset at time $T$, since

$$ \mathbb{E}_{\mathbb{Q}}[S_T] = S_0 e^{RT} = F. \tag{17} $$

Therefore, we can write the expectation of the ratio $(S_T - S_0) / S_0$ as

$$ \mathbb{E}_{\mathbb{Q}}\left[\frac{S_T - S_0}{S_0}\right] = \frac{F}{S_0} - 1. \tag{18} $$

Putting it all together, we have derived the fair price of a variance swap in terms of things we can actually compute and trade:

$$ \sigma^2_K = \frac{2}{T} \log \frac{F}{S_0} - \frac{2}{T} \left[ \frac{F}{S_0} - 1 \right] + \frac{2}{T} \left[ \int_0^{S_0} \frac{e^{RT} p(K)}{K^2} dK + \int_{S_0}^{\infty} \frac{e^{RT} c(K)}{K^2} dK \right]. \tag{19} $$

The fair strike variance is the expected average variance, under the risk-neutral measure, which we replicate via a portfolio of calls and puts.
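One way to build confidence in Equation 19 is to feed it option prices from a model where the answer is known. The sketch below assumes a Black-Scholes world with constant volatility and no dividends, which is an extra assumption made only for this check; in that world the fair strike should simply equal $\sigma^2$. The function names, the strike range, and the parameter values are hypothetical, and the infinite strike range is truncated to a wide finite interval.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_price(s0, k, r, sigma, t, is_call):
    """Black-Scholes price of a European option on a non-dividend asset."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    if is_call:
        return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)
    return k * math.exp(-r * t) * norm_cdf(-d2) - s0 * norm_cdf(-d1)


def fair_variance(s0, r, sigma, t, k_min, k_max, n=5000):
    """Equation 19, with the strike integrals truncated to [k_min, k_max]."""
    growth = math.exp(r * t)   # e^{RT}
    forward = s0 * growth      # Equation 13

    def leg(a, b, is_call):
        # Midpoint rule over strikes for one of the two integrals.
        h = (b - a) / n
        total = 0.0
        for i in range(n):
            k = a + (i + 0.5) * h
            total += growth * bs_price(s0, k, r, sigma, t, is_call) / k**2
        return h * total

    options = leg(k_min, s0, False) + leg(s0, k_max, True)
    return ((2 / t) * math.log(forward / s0)
            - (2 / t) * (forward / s0 - 1)
            + (2 / t) * options)


# Hypothetical inputs: 20% volatility, 30 days to expiration.
print(fair_variance(100.0, 0.02, 0.2, 30 / 365, 50.0, 200.0))  # close to 0.2**2
```

The strike range of 50 to 200 is far wider than a 30-day move at 20% volatility can plausibly reach, so the truncated tails contribute essentially nothing.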

Approximations

Now let’s derive Equation 3 from Equation 19. Putting the two side by side and aligning terms (and ignoring the index $j$ in Equation 3 for notational brevity), we have:

$$ \begin{aligned} \sigma_K^2 &= \frac{2}{T} \left[ \int_0^{S_0} \frac{e^{RT} p(K)}{K^2} dK + \int_{S_0}^{\infty} \frac{e^{RT} c(K)}{K^2} dK \right] &&+ \frac{2}{T} \log \frac{F}{S_0} - \frac{2}{T} \left[ \frac{F}{S_0} - 1 \right], \\ \\ \sigma^2 &= \frac{2}{T} \sum_{i} \frac{\Delta K_i}{K_i^2} e^{R T} Q(K_i) &&- \frac{1}{T} \left[ \frac{F}{K_{0}} - 1\right]^2. \end{aligned} \tag{20} $$

We can see that the two integrals in Equation 19 are approximated by a sum over a set of options, as defined in (CBOE, 2019). The term $\Delta K_i$ is an approximation of $dK$ using the midpoint rule:

$$ dK \approx \Delta K_i = \frac{K_{i+1} - K_{i-1}}{2}. \tag{21} $$

If we visualize the VIX as approximating a continuous-time integral with a Riemann sum, then we can think of $\Delta K_i$ as the width of each rectangle. And the height of each rectangle is defined by $Q(K_i)$, which is the midpoint between the bid and ask prices. Using the bid would make the rectangle too short, while using the ask would make the rectangle too tall. So we can think of $Q(K_i) \Delta K_i$ as two midpoint approximations that combine to construct a rectangle under the integral in Equation 19.

Finally, note that the VIX formula uses the following approximation:

$$ \frac{2}{T} \log\frac{F}{S_0} - \frac{2}{T} \left[ \frac{F}{S_0} - 1\right] \approx - \frac{1}{T} \left[ \frac{F}{K_0} - 1 \right]^2. \tag{22} $$

I have not seen this pointed out in any official reference, but this StackExchange post observes that it is just a second-order Taylor approximation (see A1),

$$ \log(1+x) \approx x - \frac{x^2}{2}, \tag{23} $$

where

$$ x = \frac{F}{S_0} - 1. \tag{24} $$

This gives us

$$ \begin{aligned} \log \frac{F}{S_0} &= \log\left( 1 + \left[\frac{F}{S_0} - 1 \right] \right) \\ &\approx \left[\frac{F}{S_0} - 1 \right] - \frac{1}{2} \left[\frac{F}{S_0} - 1 \right]^2. \end{aligned} \tag{25} $$

The left term in the second line of Equation 25 cancels with the term from Equation 18. After multiplying by $2/T$, what remains is $-(1/T)\left[F/S_0 - 1\right]^2$, which is the right-hand side of Equation 22 once $S_0$ is replaced by $K_0$. Given that we can trivially compute the left-hand side of Equation 22, it is not clear to me why the second-order Taylor approximation is preferred.
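To get a feel for how tight this approximation is, we can evaluate both sides of Equation 22 directly. In the sketch below, the function names and input values are hypothetical, and $K_0$ is set equal to $S_0$ so that only the Taylor error is measured.

```python
import math


def exact_term(forward, s0, t):
    """Left-hand side of Equation 22."""
    return (2 / t) * math.log(forward / s0) - (2 / t) * (forward / s0 - 1)


def taylor_term(forward, k0, t):
    """Right-hand side of Equation 22, the form used in Equation 3."""
    return -(1 / t) * (forward / k0 - 1) ** 2


# Hypothetical inputs: forward 0.2% above spot, 30 days to expiration.
t = 30 / 365
print(exact_term(100.2, 100.0, t), taylor_term(100.2, 100.0, t))
```

For a forward this close to the spot price, the two values differ only by a term of order $x^3$, which is negligible next to the option sum in Equation 3.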

Finally, note that the VIX computes the forward price $F$ using put-call parity, and uses $K_0$ as an approximation for the spot price $S_0$. This is why $K_0$ is used in the approximation above (Equation 22).

Conclusion

As we have seen, in some sense, there is no real “derivation of the VIX” that is not ultimately a derivation of a variance swap. The VIX main formula (Equation 3) is just a discretized approximation of Equation 19. The final VIX formula (Equation 1) weights two fairly valued variance strikes and then transforms this average variance into an annualized volatility. So the VIX can be thought of as the fair strike price of a volatility swap that settles in thirty days.

   

Appendix

A1. Taylor approximation

In general, a second-order Taylor approximation is

$$ f(x) \approx f(a) + f^{\prime}(a)(x-a) + \frac{f^{\prime\prime}(a)(x-a)^2}{2}. \tag{A1.1} $$

Evaluate this with $f = \log$ and $a = 1$, so that $f(1) = 0$, $f^{\prime}(1) = 1$, and $f^{\prime\prime}(1) = -1$:

$$ \log(x) \approx (x-1) - \frac{(x-1)^2}{2}. \tag{A1.2} $$

So

$$ \log(x+1) \approx x - \frac{x^2}{2} \tag{A1.3} $$

as desired.

  1. CBOE. (2019). CBOE volatility index. White Paper.
  2. Whaley, R. E. (1993). Derivatives on market volatility. The Journal of Derivatives, 1(1), 71–84.
  3. Demeterfi, K., Derman, E., Kamal, M., & Zou, J. (1999). A guide to volatility and variance swaps. The Journal of Derivatives, 6(4), 9–32.
  4. Hull, J. (1993). Options, futures, and other derivative securities (Vol. 7). Prentice Hall Englewood Cliffs, NJ.
  5. Steele, J. M. (2001). Stochastic calculus and financial applications (Vol. 1). Springer.
  6. Shreve, S. E., & others. (2004). Stochastic calculus for finance II: Continuous-time models (Vol. 11). Springer.
  7. Itô, K. (1951). On a formula concerning stochastic differentials. Nagoya Mathematical Journal, 3, 55–65.
  8. Bru, B., & Yor, M. (2002). Comments on the life and mathematical legacy of Wolfgang Doeblin. Finance and Stochastics, 6, 3–47.
  9. Neuberger, A. (1994). The log contract. Journal of Portfolio Management, 20, 74–74.