Deriving the VIX
The VIX is a benchmark for market-implied volatility. It is computed from a weighted average of variance swaps. I first derive the fair strike for a variance swap and then discuss the VIX's approximation of this formula.
The Chicago Board Options Exchange’s (CBOE) Volatility Index (VIX) is a global benchmark for market-implied volatility. However, for me, understanding the VIX’s main formula was tricky. CBOE’s whitepaper (CBOE, 2019) explains the inputs to the formula in detail—how the options are selected, how time is handled, and so forth—but does not explain where the formula comes from. And the original paper describing the VIX, (Whaley, 1993), is rich in stylized facts about the VIX but spare in technical details. So the goal of this post is simply to derive the main VIX formula.
As it turns out, deriving the VIX is almost entirely about deriving the fair price of a variance swap. This is because the squared VIX can be viewed as the fair strike of a variance swap that settles in thirty days, so the VIX itself is the square root of that strike. Thus, this post is in large part just a detailed elaboration of John Hull’s excellent note on variance swaps. See (Demeterfi et al., 1999) for a deeper treatment of the topic.
Index calculation
At a high level, the VIX is the weighted average of two option-implied volatilities, $\sigma_1$ and $\sigma_2$, each computed using a set of S&P 500 (SPX) options that expire in roughly 30 days:
$$ \text{VIX} = 100 \times \sqrt{ \left\{ T_1 \sigma_1^2 \left[ \frac{N_{T_2} - N_{30}}{N_{T_2} - N_{T_1}} \right] + T_2 \sigma_2^2 \left[ \frac{N_{30} - N_{T_1}}{N_{T_2} - N_{T_1}} \right] \right\} \times \frac{N_{365}}{N_{30}} }. \tag{1} $$
Thus, the VIX is a forward-looking estimate of SPX-implied volatility as an annualized percentage. Equation 1 may look complicated, but most of the terms, which I won’t bother defining, are time-based variables designed to properly weight $\sigma_1^2$ and $\sigma_2^2$. Simply put, it is just a bunch of calendar math. We could rewrite Equation 1 as
$$ \text{VIX} = 100 \times \sqrt{ w_1 \sigma_1^2 + w_2 \sigma_2^2 }, \tag{2} $$
where $w_1$ and $w_2$ are the appropriate time-based weights. Thus, in my mind, the key conceptual linchpin to understanding the VIX is understanding the equation for the option-implied volatilities $\sigma_1$ and $\sigma_2$. According to (CBOE, 2019), each $\sigma_i^2$ for $i \in \{1, 2\}$ is defined as
$$ \sigma_i^2 = \frac{2}{T_i} \sum_j \frac{\Delta K_j}{K_j^2} e^{R_i T_i} Q(K_j) - \frac{1}{T_i} \left[ \frac{F_i}{K_0} - 1 \right]^2. \tag{3} $$
Here, $T_i$ is the time to expiration, $R_i$ is the risk-free interest rate to expiration, $F_i$ is the forward index level, $K_0$ is the first strike price less than $F_i$, $K_j$ is the strike price of the $j$-th option in the basket selected, $\Delta K_j$ is an interval between strikes, and finally $Q(K_j)$ is the midpoint of the bid-ask spread for each option with strike $K_j$. As with Equation 1, we can see that most of this is not conceptually hard. Rather, it amounts to tedious calculations for selecting options and then computing derived quantities based on those options.
To me, the real question is: where did Equation 3 come from? This is the essence of understanding the VIX. Thus, the goal of this post is to derive Equation 3.
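To make the calendar math and the strike sum concrete, here is a minimal Python sketch of the two formulas from (CBOE, 2019). The function names, argument names, and toy inputs are my own. The sketch assumes the options have already been selected and sorted by strike, and it uses the simple adjacent-strike difference for the interval at the lowest and highest strikes.

```python
import math

def implied_variance(strikes, quotes, forward, k0, rate, t):
    """sigma^2 = (2/T) * sum(dK_j / K_j^2 * e^{RT} * Q(K_j)) - (1/T) * (F/K_0 - 1)^2."""
    n = len(strikes)
    total = 0.0
    for j, (k, q) in enumerate(zip(strikes, quotes)):
        if j == 0:
            dk = strikes[1] - strikes[0]        # lowest strike: adjacent difference
        elif j == n - 1:
            dk = strikes[-1] - strikes[-2]      # highest strike: adjacent difference
        else:
            dk = (strikes[j + 1] - strikes[j - 1]) / 2.0  # half the neighbor gap
        total += dk / k**2 * math.exp(rate * t) * q
    return 2.0 / t * total - (forward / k0 - 1.0) ** 2 / t

def vix(var1, t1, var2, t2, n_t1, n_t2, n_30=43_200, n_365=525_600):
    """Weight the near- and next-term variances to 30 days (all N's in minutes)."""
    w1 = (n_t2 - n_30) / (n_t2 - n_t1)
    w2 = (n_30 - n_t1) / (n_t2 - n_t1)
    return 100.0 * math.sqrt((t1 * var1 * w1 + t2 * var2 * w2) * n_365 / n_30)

# Toy inputs (hypothetical quotes, not market data).
toy_var = implied_variance([90.0, 100.0, 110.0], [1.0, 2.0, 1.0],
                           forward=100.0, k0=100.0, rate=0.0, t=0.1)
```

One sanity check on the weights: if both terms carry the same variance and each time to expiration is measured as minutes divided by the minutes in a year, the calendar terms cancel and the index is just 100 times the square root of that variance.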
Variance swap
To understand the VIX formula, we need to understand the fair price for a variance swap. We’ll see that this is what the VIX formula is approximating.
A variance swap is a forward contract whose payoff is proportional to the difference between realized variance $\sigma_R^2$ and strike variance $K_{\text{var}}$:
$$ \text{payoff} \propto \sigma_R^2 - K_{\text{var}}. \tag{4} $$
At contract inception, two parties agree on a strike variance $K_{\text{var}}$, and the payoff in notional or dollar terms is proportional to the difference in Equation 4. How should a dealer set the fair price $K_{\text{var}}$? One approach is to define the fair price as the expected average total variance:
$$ K_{\text{var}} = \frac{1}{T} \mathbb{E}_{\mathbb{Q}} \left[ \int_0^T \sigma_t^2 \, dt \right]. \tag{5} $$
Here, the expectation is with respect to the risk-neutral measure $\mathbb{Q}$, as opposed to the true probability measure $\mathbb{P}$ for the underlying physical process. In the derivatives-pricing literature, the risk-neutral measure is a key idea because it is the probability measure under which a contract’s price is exactly equal to its time-discounted expected payoff. If we used another measure to price a derivative, we would allow for arbitrage. For an example of this idea when pricing a call option, see this post.
So that’s the idea. The fair strike price of a variance swap is simply the expected realized variance under the risk-neutral measure. But it’s not clear how we can actually compute Equation 5. As when pricing options, the big idea is to use a replicating portfolio. We want to model a changing position whose value is proportional to the desired expected variance, and the fair price of the strike is the cost to construct the replicating portfolio.
Estimating the variance
First, let’s see how to isolate and compute the desired expected variance. Let $W_t$ be a Brownian motion on a probability space $(\Omega, \mathcal{F}, \mathbb{Q})$ with risk-neutral measure $\mathbb{Q}$. We assume the price $S_t$ of SPX satisfies this stochastic differential equation (SDE):
$$ \frac{dS_t}{S_t} = \mu_t \, dt + \sigma_t \, dW_t, \tag{6} $$
where $\mu_t$ is the drift of SPX, $\sigma_t$ is the volatility of SPX, and $W_t$ is Brownian motion. At a high level, this equation states that the SPX return (left-hand side) can be decomposed into some expected return plus uncertainty around that return. See “The Process for a Stock Price” in (Hull, 1993) for a deeper discussion of this common modeling assumption.
We need to solve this SDE, meaning we would like to find a functional representation of $S_t$ that is an (ideally unique) solution to Equation 6. The problem, however, is that ordinary calculus does not give us the tools to handle expressions of the form $\sigma_t \, dW_t$. Speaking very loosely, we can think of Brownian motion as too rough, too crinkly, to be differentiable using ordinary calculus. There is a rich theory for dealing with this problem called stochastic calculus (see (Steele, 2001; Shreve & others, 2004) for mathematical treatments), and the high-level view is that there is a chain rule for stochastic calculus called the Itô–Doeblin lemma (Itô, 1951; Bru & Yor, 2002). This allows us to find a unique solution to the SDE in Equation 6:
$$ S_t = S_0 \exp \left( \int_0^t \left( \mu_s - \frac{\sigma_s^2}{2} \right) ds + \int_0^t \sigma_s \, dW_s \right). \tag{7} $$
Wikipedia has a nice derivation of this, albeit with constant drift and volatility. The point is: if we assume that SPX follows the SDE in Equation 6, then the functional form of SPX is Equation 7. Notice that $S_t$ in Equation 7 is geometric Brownian motion. Even if you are unfamiliar with stochastic calculus, this representation should make some sense given our assumptions so far. It says that the current value of SPX is equal to the original value, scaled by all the accumulated drift and fluctuations from time $0$ to $t$. We exponentiate these changes since prices are non-negative.
The key point for us is that we can write the differential form of geometric Brownian motion as
$$ d(\log S_t) = \left( \mu_t - \frac{\sigma_t^2}{2} \right) dt + \sigma_t \, dW_t. \tag{8} $$
In ordinary calculus, $d(\log S_t) = dS_t / S_t$, but this is not true here since $S_t$ is a stochastic process. Thus, we have an extra term, $-\frac{1}{2} \sigma_t^2 \, dt$, that arises from the quadratic variation of the Brownian motion. Now if we subtract Equation 8 from Equation 6, we can eliminate both the drift term and the Brownian motion! (Remember, we are looking for a useful representation of $\sigma_t^2$.) This gives us
$$ \frac{dS_t}{S_t} - d(\log S_t) = \frac{\sigma_t^2}{2} \, dt. \tag{9} $$
Integrating from $0$ to $T$ and then scaling both sides by $2/T$, we get
$$ \frac{1}{T} \int_0^T \sigma_t^2 \, dt = \frac{2}{T} \left[ \int_0^T \frac{dS_t}{S_t} - \int_0^T d(\log S_t) \right]. \tag{10} $$
This is promising! The left-hand side of Equation 10 is our desired total variance, and the right-hand side of Equation 10 is simply a function of the SPX price. We can write the right integral as
$$ \int_0^T d(\log S_t) = \int_0^T \left( \mu_t - \frac{\sigma_t^2}{2} \right) dt + \int_0^T \sigma_t \, dW_t = \log \frac{S_T}{S_0}. \tag{11} $$
The first step holds because of Equation 8, and the second step holds because of Equation 7.
And once we take the expectation of Equation 10, the middle integral simplifies to the log ratio of the forward price $F$ with the current price $S_0$. To see that, consider this derivation:
$$
\begin{aligned}
\mathbb{E}_{\mathbb{Q}} \left[ \int_0^T \frac{dS_t}{S_t} \right]
&= \mathbb{E}_{\mathbb{Q}} \left[ \int_0^T \mu_t \, dt + \int_0^T \sigma_t \, dW_t \right] \\
&= \mathbb{E}_{\mathbb{Q}} \left[ \int_0^T \mu_t \, dt \right] \\
&= rT \\
&= \log \frac{F}{S_0}.
\end{aligned}
\tag{12}
$$
The first step is just Equation 6. In the second step, we use the fact that the expected uncertainty is zero. Intuitively, we can think of the increments of Brownian motion as having mean zero. In the third step, we use risk-neutrality. The expected drift is simply the risk-free rate $r$ over the time-to-maturity $T$. And in the fourth step, we use the definition of the forward price:
$$ F = S_0 e^{rT}. \tag{13} $$
See Chapter 5 in (Hull, 1993) for details on forward prices.
Putting it all together, we can express the fair value of a variance swap as
$$ K_{\text{var}} = \frac{2}{T} \left[ \log \frac{F}{S_0} - \mathbb{E}_{\mathbb{Q}} \left[ \log \frac{S_T}{S_0} \right] \right]. \tag{14} $$
Note that the expected log difference $\mathbb{E}_{\mathbb{Q}}[\log(S_T / S_0)]$ can be thought of as a log contract (Neuberger, 1994). But the log contract isn’t a real tradeable asset. So what remains is to find a way to reformulate Equation 14 into something we can trade. This will allow us to construct a portfolio that replicates the strike variance, and the fair price of the variance swap is the cost of constructing the replicating portfolio.
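The pathwise identity behind this section, that average variance equals 2/T times the summed returns minus the log return, can be checked on a simulated path. The sketch below is only an illustration under assumptions of my own: a made-up deterministic volatility curve, a constant drift, and a fine time grid standing in for continuous time. Because the grid is discrete and the path is random, the two numbers agree only approximately.

```python
import math
import random

def realized_vs_replicated(n_steps=100_000, t=30 / 365, mu=0.05, seed=0):
    """Compare (1/T) * int sigma_t^2 dt with (2/T) * (int dS/S - log(S_T/S_0))."""
    rng = random.Random(seed)
    dt = t / n_steps
    log_s = 0.0        # running log(S_t / S_0)
    sum_returns = 0.0  # discrete stand-in for the integral of dS_t / S_t
    sum_var = 0.0      # discrete stand-in for the integral of sigma_t^2 dt
    for i in range(n_steps):
        # Hypothetical time-varying volatility, chosen only for illustration.
        sigma = 0.2 + 0.1 * math.sin(2.0 * math.pi * i / n_steps)
        # Log-price increment with the Ito correction term.
        d_log = (mu - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        sum_returns += math.expm1(d_log)  # simple return over the step
        log_s += d_log
        sum_var += sigma**2 * dt
    realized = sum_var / t
    replicated = 2.0 / t * (sum_returns - log_s)
    return realized, replicated

realized, replicated = realized_vs_replicated()
```

Note that the drift `mu` drops out of the comparison, just as it does in the derivation: it appears in both the summed returns and the log return and cancels in the difference.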
Replicating portfolio
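To construct a replicating portfolio, we need the Carr-Madan formula, which is just a special case of Taylor’s formula. Following that blog post, if we take the function to be the logarithm, evaluate it at the terminal price $S_T$, and expand around the current price $S_0$, then Carr-Madan’s formula applied to our setting is
$$ \log \frac{S_T}{S_0} = \frac{S_T - S_0}{S_0} - \int_0^{S_0} \frac{1}{K^2} (K - S_T)^+ \, dK - \int_{S_0}^{\infty} \frac{1}{K^2} (S_T - K)^+ \, dK. \tag{15} $$
As in that blog post, $(S_T - K)^+$ and $(K - S_T)^+$ are payoff functions for European-style calls and puts, respectively. This is a pretty cool idea in its own right, as it states that we can represent any log move in the underlying price as an infinite portfolio of calls and puts. Now we’re approaching something that looks like a portfolio that we can use to replicate $K_{\text{var}}$!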
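As a quick numerical sanity check of this decomposition, the sketch below rebuilds a log return from a linear term minus a strip of put payoffs below the current price and a strip of call payoffs above it, each weighted by one over the strike squared. The function names, the midpoint integration, and the upper strike cutoff are assumptions of mine, not part of the original derivation.

```python
import math

def midpoint_integral(f, lo, hi, n=20_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def replicate_log_return(s_t, s0, k_max_mult=10.0):
    """Rebuild log(S_T / S_0) from a linear term and put/call payoffs."""
    linear = (s_t - s0) / s0
    # Put payoffs (K - S_T)^+ for strikes below S_0, weighted by 1 / K^2.
    puts = midpoint_integral(lambda k: max(k - s_t, 0.0) / k**2, 0.0, s0)
    # Call payoffs (S_T - K)^+ for strikes above S_0; the infinite upper
    # limit is truncated at k_max_mult * S_0, which is exact if S_T is below it.
    calls = midpoint_integral(lambda k: max(s_t - k, 0.0) / k**2, s0, k_max_mult * s0)
    return linear - puts - calls

down_move = replicate_log_return(80.0, 100.0)   # compare with math.log(0.8)
up_move = replicate_log_return(130.0, 100.0)    # compare with math.log(1.3)
```

For a given terminal price only one of the two strips is nonzero: puts contribute when the price ends below the starting level, calls when it ends above.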
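Now two observations. First, under the risk-neutral measure, the expected payoff of a call or a put is simply its time-adjusted value. So if $C(K)$ and $P(K)$ are two functions that compute the fair value of calls and puts with strike $K$, respectively, using information today, then we have
$$ \mathbb{E}_{\mathbb{Q}} \left[ (S_T - K)^+ \right] = e^{rT} C(K), \qquad \mathbb{E}_{\mathbb{Q}} \left[ (K - S_T)^+ \right] = e^{rT} P(K). \tag{16} $$
Second, under the risk-neutral measure, the expected value of $S_T$ is simply the forward price of the asset at time $T$, since
$$ \mathbb{E}_{\mathbb{Q}} [S_T] = S_0 e^{rT} = F. \tag{17} $$
Therefore, we can write the ratio as
$$ \mathbb{E}_{\mathbb{Q}} \left[ \frac{S_T - S_0}{S_0} \right] = \frac{F}{S_0} - 1. \tag{18} $$
Putting it all together, we have derived the fair price of a variance swap in terms of things we can actually compute and trade:
$$ K_{\text{var}} = \frac{2}{T} \left[ \log \frac{F}{S_0} - \left( \frac{F}{S_0} - 1 \right) + e^{rT} \int_0^{S_0} \frac{1}{K^2} P(K) \, dK + e^{rT} \int_{S_0}^{\infty} \frac{1}{K^2} C(K) \, dK \right]. \tag{19} $$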
The fair strike variance is the expected average variance, under the risk-neutral measure, which we replicate via a portfolio of calls and puts.
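One way to gain confidence in this result is to feed it option prices from a model whose variance is known. The sketch below assumes Black-Scholes prices with a constant volatility, in which case the replicated strike should come out close to the volatility squared. The Black-Scholes pricer, the midpoint integration, and the strike cutoff are illustrative choices of mine and are not part of the VIX methodology.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def bs_price(s0, k, r, sigma, t, kind):
    """Black-Scholes price of a European call or put (no dividends)."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    if kind == "call":
        return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)
    return k * math.exp(-r * t) * norm_cdf(-d2) - s0 * norm_cdf(-d1)

def midpoint_integral(f, lo, hi, n):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def fair_variance_strike(s0, r, sigma, t, n=20_000, k_max_mult=5.0):
    """Fair strike from puts below S_0 and calls above S_0, weighted by 1/K^2."""
    fwd = s0 * math.exp(r * t)
    puts = midpoint_integral(
        lambda k: bs_price(s0, k, r, sigma, t, "put") / k**2, 0.0, s0, n)
    # The infinite upper limit is truncated; far out-of-the-money calls are ~0.
    calls = midpoint_integral(
        lambda k: bs_price(s0, k, r, sigma, t, "call") / k**2, s0, k_max_mult * s0, n)
    log_terms = math.log(fwd / s0) - (fwd / s0 - 1.0)
    return 2.0 / t * (log_terms + math.exp(r * t) * (puts + calls))

k_var = fair_variance_strike(s0=100.0, r=0.01, sigma=0.2, t=30 / 365)
```

With a volatility of 0.2, `k_var` should land close to 0.04, up to integration and truncation error.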
Approximations
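Now let’s derive Equation 3 (the CBOE formula for $\sigma_i^2$) from Equation 19 (the fair strike $K_{\text{var}}$ written in terms of calls and puts). Putting the two side by side and aligning terms (and ignoring the index $i$ in Equation 3 for notational brevity), we have:
$$
\begin{aligned}
K_{\text{var}} &= \frac{2}{T} \left[ \log \frac{F}{S_0} - \left( \frac{F}{S_0} - 1 \right) \right] + \frac{2}{T} e^{rT} \left[ \int_0^{S_0} \frac{1}{K^2} P(K) \, dK + \int_{S_0}^{\infty} \frac{1}{K^2} C(K) \, dK \right], \\
\sigma^2 &= - \frac{1}{T} \left[ \frac{F}{K_0} - 1 \right]^2 + \frac{2}{T} e^{RT} \sum_j \frac{\Delta K_j}{K_j^2} Q(K_j).
\end{aligned}
\tag{20}
$$
We can see that the two integrals in Equation 19 are approximated by a sum over a set of options, as defined in (CBOE, 2019). The term $\Delta K_j$ is an approximation of $dK$ using the midpoint rule, that is, half the distance between the strikes on either side of $K_j$:
$$ \Delta K_j = \frac{K_{j+1} - K_{j-1}}{2}. \tag{21} $$
If we visualize the VIX as approximating a continuous-time integral with a Riemann sum, then we can think of $\Delta K_j$ as the width of each rectangle. And the height of each rectangle is defined by $Q(K_j)$ (scaled by $e^{RT} / K_j^2$), which is the midpoint between the bid and ask prices. Using the bid would make the rectangle too short, while using the ask would make the rectangle too tall. So we can think of $\Delta K_j$ and $Q(K_j)$ as two midpoint approximations that combine to construct a rectangle under the integral in Equation 19.
Finally, note that the VIX formula uses the following approximation:
$$ \log \frac{F}{K_0} \approx \left( \frac{F}{K_0} - 1 \right) - \frac{1}{2} \left( \frac{F}{K_0} - 1 \right)^2. \tag{22} $$
I have not seen this pointed out in any official reference, but this StackExchange post observes that it is just a second-order Taylor approximation (see A1),
$$ \log(1 + x) \approx x - \frac{x^2}{2}, \tag{23} $$
where
$$ x = \frac{F}{K_0} - 1. \tag{24} $$
This gives us
$$
\begin{aligned}
\frac{2}{T} \log \frac{F}{K_0} &\approx \frac{2}{T} \left[ \left( \frac{F}{K_0} - 1 \right) - \frac{1}{2} \left( \frac{F}{K_0} - 1 \right)^2 \right] \\
&= \frac{2}{T} \left( \frac{F}{K_0} - 1 \right) - \frac{1}{T} \left( \frac{F}{K_0} - 1 \right)^2.
\end{aligned}
\tag{25}
$$
The left term in the second line of Equation 25 cancels with the $-\frac{2}{T} \left( \frac{F}{S_0} - 1 \right)$ term from Equation 19 once $K_0$ stands in for $S_0$, leaving only $-\frac{1}{T} \left( \frac{F}{K_0} - 1 \right)^2$, which is exactly the correction term in Equation 3. Given that we can trivially compute the left-hand side of Equation 25, it is not clear to me why the second-order Taylor approximation is preferred.
Finally, note that the VIX computes the forward price $F$ using put-call parity, and uses $K_0$ as an approximation for the spot price $S_0$. This is why $K_0$ is used in the approximation above (Equation 22).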
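To get a feel for how little the Taylor step costs, here is a small sketch comparing the exact log-based term with the squared term used in the VIX formula. The function names and the sample forward level, strike, and maturity are hypothetical values chosen only for illustration.

```python
import math

def log_term_exact(forward, k0, t):
    """(2/T) * [log(F/K_0) - (F/K_0 - 1)], computed without the Taylor step."""
    x = forward / k0 - 1.0
    return 2.0 / t * (math.log1p(x) - x)

def log_term_taylor(forward, k0, t):
    """-(1/T) * (F/K_0 - 1)^2, the second-order approximation used by the VIX."""
    x = forward / k0 - 1.0
    return -(x**2) / t

# Hypothetical inputs: forward slightly above the strike, 30 days to expiry.
exact = log_term_exact(100.3, 100.0, 30 / 365)
approx = log_term_taylor(100.3, 100.0, 30 / 365)
gap = exact - approx
```

Because the forward sits very close to the strike by construction, the gap between the two is driven by the cube of a small number and is tiny relative to typical variance levels.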
Conclusion
As we have seen, in some sense, there is no real “derivation of the VIX” that is not ultimately a derivation of a variance swap. The VIX main formula (Equation 3, the CBOE formula for $\sigma_i^2$) is just a discretized approximation of Equation 19, the fair strike $K_{\text{var}}$ of a variance swap. The final VIX formula (Equation 1) weights two fairly valued variance strikes and then transforms this average variance into an annualized volatility. So the VIX can be thought of as the square root of the fair strike of a variance swap that settles in thirty days.
Appendix
A1. Taylor approximation
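In general, a second-order Taylor approximation of a function $f$ around a point $a$ is
$$ f(x) \approx f(a) + f'(a) (x - a) + \frac{1}{2} f''(a) (x - a)^2. \tag{26} $$
Evaluate this with $f(x) = \log(1 + x)$ and $a = 0$:
$$ f(0) = 0, \qquad f'(0) = \frac{1}{1 + 0} = 1, \qquad f''(0) = -\frac{1}{(1 + 0)^2} = -1. \tag{27} $$
So
$$ \log(1 + x) \approx x - \frac{x^2}{2}, \tag{28} $$
as desired.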
- CBOE. (2019). CBOE volatility index. White Paper.
- Whaley, R. E. (1993). Derivatives on market volatility. The Journal of Derivatives, 1(1), 71–84.
- Demeterfi, K., Derman, E., Kamal, M., & Zou, J. (1999). A guide to volatility and variance swaps. The Journal of Derivatives, 6(4), 9–32.
- Hull, J. (1993). Options, futures, and other derivative securities (Vol. 7). Prentice Hall Englewood Cliffs, NJ.
- Steele, J. M. (2001). Stochastic calculus and financial applications (Vol. 1). Springer.
- Shreve, S. E., & others. (2004). Stochastic calculus for finance II: Continuous-time models (Vol. 11). Springer.
- Itô, K. (1951). On a formula concerning stochastic differentials. Nagoya Mathematical Journal, 3, 55–65.
- Bru, B., & Yor, M. (2002). Comments on the life and mathematical legacy of Wolfgang Doeblin. Finance and Stochastics, 6, 3–47.
- Neuberger, A. (1994). The log contract. Journal of Portfolio Management, 20, 74–74.