The Capital Asset Pricing Model

In finance, the capital asset pricing model (CAPM) was the first theory to measure systematic risk. The CAPM argues that there is a single type of risk, market risk. I derive the CAPM from the mean–variance framework of modern portfolio theory.

When I first learned about the capital asset pricing model (CAPM) (Sharpe, 1964), I thought the main result was non-obvious. To explain my initial confusion, let me first state this result. Let $R_M$ be the return of the market. For example, this might be approximated by the return of the S&P 500 index. Let $R_i$ be the return of asset $i$. And let $r_f$ be the risk-free rate, meaning a rate of return that is achievable without any risk. No such risk-free asset truly exists, but the canonical example is a United States Treasury bill. Under some assumptions which we will discuss, we can prove the following equation:

$$ \mathbb{E}[R_i] - r_f = \frac{\sigma_{i,M}}{\sigma_M^2} \left( \mathbb{E}[R_M] - r_f \right), \tag{1} $$

where $\sigma_M^2$ is the variance of the market and $\sigma_{i,M}$ is the cross-covariance between the market and asset $i$. The coefficient in Equation 1 is often called “beta” and denoted $\beta_i$, giving us

$$ \mathbb{E}[R_i] - r_f = \beta_i \left( \mathbb{E}[R_M] - r_f \right). \tag{2} $$

This is because in a simple linear regression, the slope coefficient $\beta$ is the ratio of the cross-covariance between the response and the predictor to the variance of the predictor. (See Equation 10 in this post for details on this claim.) I assume that $\beta_i$ in Equation 2 gets its name and notation from the standard notation for least squares regression. An attentive reader might notice that the regression’s intercept, often denoted $\alpha$, is zero in Equation 2. We will discuss what this means later.
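The claim that the regression slope equals the covariance divided by the variance can be checked numerically. The following is a minimal sketch on simulated data; the return series, the seed, and the "true" slope of 1.3 are made-up values for illustration, not anything from the CAPM itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated excess returns (hypothetical data for illustration only).
market_excess = rng.normal(0.005, 0.04, size=1_000)
asset_excess = 1.3 * market_excess + rng.normal(0.0, 0.02, size=1_000)

# Beta as the ratio of the covariance to the market variance (Equation 1).
cov = np.cov(asset_excess, market_excess, ddof=1)
beta_ratio = cov[0, 1] / cov[1, 1]

# Beta as the slope of a simple linear regression with an intercept.
# np.polyfit returns coefficients from highest degree to lowest.
beta_ols, intercept = np.polyfit(market_excess, asset_excess, deg=1)

print(beta_ratio, beta_ols)  # the two estimates agree up to floating-point error
```

Both numbers are the same estimator written two ways, so they match to machine precision regardless of the data.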

Now let me explain what is non-obvious to me. What Equation 2 says is that there is a linear relationship between the expected excess rate of return of asset $i$ (excess with respect to the risk-free rate) and the expected excess rate of return of the market. This means that if we had a bunch of historical data with which to estimate $\mathbb{E}[R_i]$ and $\mathbb{E}[R_M]$, we could simply fit a linear regression to Equation 2 to estimate $\beta_i$, which would tell us whether asset $i$ moves roughly one-for-one with the market ($\beta_i \approx 1$), is more sensitive than the market ($\beta_i > 1$), or is less sensitive than the market ($\beta_i < 1$).

In my mind, if the above were true, it would be surprising. However, we can prove it is true with some assumptions (albeit simplistic ones). The goal of this post is to re-derive Equation 2 from first principles, to better understand which assumptions induce this surprising result.

Mean–variance analysis

To understand the CAPM, one must understand modern portfolio theory and the efficient frontier. I will briefly review these topics here, but see those two posts for details.

Recall that modern portfolio theory posits that reward can be defined as the expected return of a portfolio, and its associated risk can be defined as the standard deviation of that return. Let $R_p$ be the return of portfolio $p$, and let $w_n$ denote the portfolio weight for the $n$-th asset. Then one can show that

$$ \begin{aligned} \mu_p = \mathbb{E}[R_p] &= \sum_{n=1}^N w_n \mathbb{E}[R_n], \\ \sigma_p^2 = \mathbb{V}[R_p] &= \sum_{n=1}^N \sum_{m=1}^N w_n w_m \text{Cov}(R_n, R_m), \end{aligned} \tag{3} $$

which implies that the relationship between risk and reward is quadratic. This quadratic relationship is sometimes called the Markowitz bullet (named after the founder of modern portfolio theory, Harry Markowitz), and it is typically diagrammed with $\sigma_p$ on the $x$-axis and $\mu_p$ on the $y$-axis (Figure 1, dashed blue line). This kind of plot is sometimes called the risk-reward spectrum. Each point on the plane represents a portfolio (possibly a single-asset portfolio) with a given risk and associated reward.
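As a concrete instance of Equation 3, here is a small sketch that computes a portfolio's mean and variance both in matrix form and as the explicit double sum. The expected returns, covariance matrix, and weights are hypothetical numbers chosen only for illustration.

```python
import numpy as np

# Hypothetical expected returns and covariance matrix for three risky assets.
mu = np.array([0.08, 0.05, 0.11])
cov = np.array([
    [0.040, 0.006, 0.010],
    [0.006, 0.010, 0.004],
    [0.010, 0.004, 0.090],
])
w = np.array([0.5, 0.3, 0.2])  # portfolio weights, summing to one

# Equation 3 in matrix form: mu_p = w^T mu and sigma_p^2 = w^T Sigma w.
mu_p = w @ mu
var_p = w @ cov @ w

# The same variance written as the explicit double sum in Equation 3.
var_p_sum = sum(w[n] * w[m] * cov[n, m] for n in range(3) for m in range(3))

print(mu_p, var_p, np.sqrt(var_p))
```

The point $(\sqrt{\texttt{var\_p}}, \texttt{mu\_p})$ is where this particular portfolio would sit on the risk-reward spectrum.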

Figure 1. The mean–variance spectrum from modern portfolio theory. The Markowitz bullet (dashed blue line) traces the quadratic relationship between risk and reward when considering portfolios with only risky assets. The quadratic efficient frontier is the top half of this bullet (solid blue line). When we include a risk-free asset in the portfolio, the efficient frontier becomes linear (red line). All portfolios on the linear efficient frontier are convex combinations of a portfolio containing just the risk-free asset (red triangle) and a portfolio containing only risky assets on the quadratic efficient frontier (red circle).

Modern portfolio theory assumes that the rational investor should seek higher reward for the same risk. Since for a given portfolio variance there are generally two portfolios on the bullet with different expected returns, the rational investor should prefer portfolios on the top half of the Markowitz bullet (Figure 1, solid blue line). This region is called the efficient frontier, because any portfolio that is inside of or on the lower half of the Markowitz bullet is inefficient in that an investor could have a higher expected return for the same risk.

Finally, modern portfolio theory proves that if we can construct a portfolio with $N$ risky assets (each $R_i$ is a random variable) and a single risk-free asset (the non-random risk-free rate $r_f$), the efficient frontier is no longer a quadratic curve but a straight line (Figure 1, red line). Furthermore, any portfolio along this line can be represented as a convex combination of a portfolio containing just the risk-free asset (Figure 1, triangle) and the tangency portfolio, which is the portfolio that sits at the intersection of the linear and quadratic efficient frontiers (Figure 1, circle).
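To see why mixing in a risk-free asset produces a straight line, note that the risk-free asset has zero variance and zero covariance with everything else, so both the mean and the standard deviation of the mix are linear in the weight. A minimal sketch, with a hypothetical tangency portfolio and risk-free rate:

```python
import numpy as np

rf = 0.02                    # hypothetical risk-free rate
mu_T, sigma_T = 0.08, 0.20   # hypothetical tangency portfolio mean and risk

# Fraction of wealth held in the tangency portfolio; the rest is risk-free.
a = np.linspace(0.0, 1.0, 6)

mu_p = a * mu_T + (1 - a) * rf   # expected return of the mix
sigma_p = a * sigma_T            # the risk-free leg adds no variance

# Reward per unit of risk for every mix with a > 0.
slopes = (mu_p[1:] - rf) / sigma_p[1:]
print(slopes)  # constant: every mix lies on one line through (0, rf)
```

Every entry of `slopes` equals the Sharpe ratio of the tangency portfolio, which is the slope of the red line in Figure 1.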

In my mind, understanding the linear efficient frontier is critical to understanding the CAPM. This is so important that if it does not make sense now, I strongly suggest you read this section on the efficient frontier with a risk-free asset before proceeding.

Capital market line

The reason understanding the linear efficient frontier is so important for understanding the CAPM is that, at least in my mind, the CAPM is primarily an interpretation of this frontier. First, the CAPM argues that the tangency portfolio is really the market portfolio, or a portfolio containing all the assets in the market. (The market weight $w_i$ might be chosen, for example, in proportion to the market capitalization of asset $i$.) Why? Recall that the tangency portfolio is the portfolio of risky assets that maximizes the Sharpe ratio, defined as

$$ \text{sharpe} = \frac{\mu_p - r_f}{\sigma_p}. \tag{4} $$

See this section for details. The geometric interpretation of the Sharpe ratio is that it is the reward per unit of risk on the risk-return spectrum, and so the tangency portfolio is the portfolio that is maximally efficient in the sense that you cannot get more reward for the same amount of risk. The CAPM argues that the market must be efficient, because if it were not, investors would arbitrage any mispricings and thereby make the market efficient. So all rational investors must hold this maximally efficient portfolio, which is also the tangency portfolio.
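The idea that the tangency portfolio maximizes the Sharpe ratio can be illustrated numerically. The sketch below uses the standard closed-form result from mean–variance analysis that the tangency weights are proportional to $\Sigma^{-1}(\mu - r_f \mathbf{1})$; that formula is not derived in this post, and the inputs are hypothetical numbers. It then compares the result against randomly drawn long-only portfolios.

```python
import numpy as np

def sharpe(w, mu, cov, rf):
    """Sharpe ratio (Equation 4) of a portfolio with weights w."""
    return (w @ mu - rf) / np.sqrt(w @ cov @ w)

# Hypothetical inputs (illustrative numbers only).
mu = np.array([0.08, 0.05, 0.11])
cov = np.array([
    [0.040, 0.006, 0.010],
    [0.006, 0.010, 0.004],
    [0.010, 0.004, 0.090],
])
rf = 0.02

# Tangency weights: proportional to Sigma^{-1} (mu - rf), scaled to sum to one.
raw = np.linalg.solve(cov, mu - rf)
w_tan = raw / raw.sum()

# Random fully invested, long-only portfolios for comparison.
rng = np.random.default_rng(1)
candidates = rng.dirichlet(np.ones(3), size=5_000)
best_random = max(sharpe(w, mu, cov, rf) for w in candidates)

print(w_tan, sharpe(w_tan, mu, cov, rf), best_random)
```

No random candidate beats the tangency portfolio, although with enough draws the best one gets close to it.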

So the “tangency portfolio” is now the “market portfolio”. Second, the CAPM interprets the linear efficient frontier as the “capital market line”. I am not sure why, but I would guess that the name is meant to suggest that the capital markets “live” on that linear efficient frontier, since all investors, being rational, want a portfolio that is a combination of the maximally efficient portfolio (tangency or market portfolio) and the risk-free rate. Where investors want to be on this convex combination depends on their risk preferences. So rather than Figure 1, the CAPM uses Figure 2.

Figure 2. The CAPM provides a specific interpretation of Figure 1. The linear efficient frontier is called the "capital market line" and the tangency portfolio, it is argued, must be the market portfolio.

Deriving the CAPM

We are now ready to derive Equation 2. Consider a portfolio that holds a combination of the market portfolio and a single asset $i$. Let $1-w$ be the market weight and $w$ be the single asset weight. Clearly, the expected return of this portfolio is

$$ \mathbb{E}[R_p] = w \mathbb{E}[R_i] + (1 - w) \mathbb{E}[R_M], \tag{5} $$

where $R_M$ is the rate of return of the market. This follows directly from the linearity of expectation. The variance is a bit more tedious, but it’s just

$$ \begin{aligned} \mathbb{V}[R_p] &= \mathbb{E}\left[(R_p - \mathbb{E}[R_p])^2 \right] \\ &= \mathbb{E}\left[(w R_i + (1 - w) R_M - (w \mathbb{E}[R_i] + (1 - w) \mathbb{E}[R_M]))^2 \right] \\ &= \mathbb{E}\left[(w (R_i - \mathbb{E}[R_i]) + (1 - w) (R_M - \mathbb{E}[R_M]))^2 \right] \\ &= w^2 \sigma_i^2 + (1 - w)^2 \sigma_M^2 + 2w (1-w)\sigma_{i,M}. \end{aligned} \tag{6} $$

You can verify that when $w = 1$, the mean and variance are just the mean and variance for asset $i$, and when $w = 0$, the mean and variance are just the mean and variance for the market.

Notice that these two equations also imply a quadratic relationship. Since a portfolio with a single risky asset must be either on the efficient frontier or inside of it, and since a portfolio holding only the market portfolio must sit at the tangency portfolio, Equations 5 and 6 imply a quadratic curve that is inside the Markowitz bullet and yet touches the tangency portfolio (Figure 3, black line). We can think of this curve as asset $i$’s idiosyncratic frontier. (This is not standard terminology, but it is how I think about it.)

Figure 3. The convex combination of the market portfolio (red dot) and a portfolio containing a single asset (black square) traces out a quadratic curve within the Markowitz bullet (black line). The curve must be tangent to the capital market line at the market portfolio, since the market portfolio is the tangency portfolio.

Now what is the slope of this curve at the tangency or market portfolio? We know it must be the Sharpe ratio of the tangency portfolio: the curve touches the capital market line at that point but can never cross above it, so the two must share a slope there. But we could also derive the slope of the idiosyncratic frontier at the tangency portfolio by representing its slope as a function of $w$, taking its derivative, and then plugging in $w=0$. Let’s do that.

Let $f(\sigma_p) = \mu_p$ be the function that defines the black curve on the risk-reward spectrum. Then the slope of the curve is simply the first derivative, $f^{\prime}(\sigma_p)$. We would like to compute this derivative but in terms of the weight $w$. This way we can plug in $w=0$ to get the slope of the black curve at the tangency or market portfolio. We can do that by expressing the mean and standard deviation as functions of $w$. Define such functions $g(w)$ and $h(w)$ as

$$ \begin{aligned} g(w) &\triangleq \mathbb{E}[R_p] = \mu_p, \\ h(w) &\triangleq \sqrt{\mathbb{V}[R_p]} = \sigma_p. \end{aligned} \tag{7} $$

These are functions of $w$ because of Equations 5 and 6. Then the function $f(\sigma_p) = \mu_p$ can be written as

$$ f(h(w)) = g(w). \tag{8} $$

This is the function that traces the black curve on the risk-reward spectrum, as a function of the weight $w$. Differentiating both sides with respect to $w$ and applying the chain rule gives

$$ \begin{aligned} f^{\prime}(h(w)) h^{\prime}(w) &= g^{\prime}(w), \\ &\Downarrow \\ f^{\prime}(\sigma_p) &= \frac{g^{\prime}(w)}{h^{\prime}(w)}. \end{aligned} \tag{9} $$

So we simply need to compute the ratio of the derivatives of $g(w)$ and $h(w)$ at $w=0$ to compute the desired slope. The derivative of $g(w)$ is

$$ g^{\prime}(w) = \mathbb{E}[R_i] - \mathbb{E}[R_M]. \tag{10} $$

So $g^{\prime}(w)$ is not a function of $w$. The derivative of $h(w)$ is more tedious because of the square root, but it is straightforward. It is

$$ h^{\prime}(w) = \frac{1}{2} \left[ \frac{2 w \sigma_i^2 - 2(1 - w) \sigma_M^2 + 2 \sigma_{i,M} - 4 w \sigma_{i,M}}{\left( w^2 \sigma_i^2 + (1 - w)^2 \sigma_M^2 + 2w (1-w)\sigma_{i,M} \right)^{1/2}} \right]. \tag{11} $$

What happens when we plug $w=0$ into $h^{\prime}(w)$? Every term multiplied by $w$ vanishes, and we get

$$ h^{\prime}(0) = \frac{1}{2} \left[ \frac{-2 \sigma_M^2 + 2 \sigma_{i,M}}{\sqrt{\sigma_M^2}} \right] = \frac{\sigma_{i,M} - \sigma_M^2}{\sigma_M}. \tag{12} $$

So the derivative of the function $f(\sigma_p) = \mu_p$ that traces out the black curve, at the point $w = 0$ where the curve touches the market portfolio, is

$$ f^{\prime}(h(0)) = \frac{g^{\prime}(0)}{h^{\prime}(0)} = \frac{\sigma_{M} (\mathbb{E}[R_i] - \mathbb{E}[R_M])}{\sigma_{i,M} - \sigma_M^2}. \tag{13} $$
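The closed-form derivatives in Equations 12 and 13 are easy to sanity-check with finite differences. The sketch below does so for one set of hypothetical moments (the numbers are arbitrary, chosen only so that the covariance is admissible, and they need not satisfy the CAPM).

```python
import numpy as np

# Hypothetical moments for asset i and the market (illustrative only).
sigma_i2, sigma_M2, sigma_iM = 0.09, 0.04, 0.05
mu_i, mu_M = 0.10, 0.08

def g(w):
    """Portfolio mean as a function of the weight w (Equation 5)."""
    return w * mu_i + (1 - w) * mu_M

def h(w):
    """Portfolio standard deviation as a function of w (square root of Equation 6)."""
    return np.sqrt(w**2 * sigma_i2 + (1 - w)**2 * sigma_M2 + 2 * w * (1 - w) * sigma_iM)

# Central finite differences around w = 0.
eps = 1e-6
g_prime_numeric = (g(eps) - g(-eps)) / (2 * eps)
h_prime_numeric = (h(eps) - h(-eps)) / (2 * eps)

# Closed forms from Equations 12 and 13.
sigma_M = np.sqrt(sigma_M2)
h_prime_closed = (sigma_iM - sigma_M2) / sigma_M
slope_closed = sigma_M * (mu_i - mu_M) / (sigma_iM - sigma_M2)

print(h_prime_numeric, h_prime_closed)
print(g_prime_numeric / h_prime_numeric, slope_closed)
```

With these inputs, $h^{\prime}(0) = 0.05$ and the slope of the curve at the market portfolio is $0.4$, and the numerical estimates agree with both.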

But we also know that the slope of the linear efficient frontier (the capital market line) is the Sharpe ratio of the tangency portfolio. So it must be the case that

$$ \frac{\mathbb{E}[R_M] - r_f}{\sigma_M} = \frac{\sigma_{M} (\mathbb{E}[R_i] - \mathbb{E}[R_M])}{\sigma_{i,M} - \sigma_M^2}. \tag{14} $$

And this is just Equation 1:

$$ \begin{aligned} \frac{\mathbb{E}[R_M] - r_f}{\sigma_M} &= \frac{\sigma_{M} (\mathbb{E}[R_i] - \mathbb{E}[R_M])}{\sigma_{i,M} - \sigma_M^2} \\ \left( \frac{\mathbb{E}[R_M] - r_f}{\sigma_M} \right) \left( \frac{\sigma_{i,M} - \sigma_M^2}{\sigma_M} \right) &= \mathbb{E}[R_i] - \mathbb{E}[R_M] \\ \frac{\sigma_{i,M}}{\sigma_M^2} (\mathbb{E}[R_M] - r_f) - (\mathbb{E}[R_M] - r_f) &= \mathbb{E}[R_i] - \mathbb{E}[R_M] \\ \frac{\sigma_{i,M}}{\sigma_M^2} (\mathbb{E}[R_M] - r_f) &= \mathbb{E}[R_i] - r_f. \end{aligned} \tag{15} $$

And we’re done. This is why I said earlier that the CAPM is primarily an interpretation of the linear efficient frontier: the linear relationship between an individual asset’s expected return $\mathbb{E}[R_i]$ and the market’s expected return $\mathbb{E}[R_M]$ is really an artifact of that frontier. Equations 1 and 15 simply fall out of this fact.
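The derivation can also be checked end to end with numbers. In the sketch below, the market portfolio is assumed to be the tangency portfolio of three risky assets, computed with the standard closed form (weights proportional to $\Sigma^{-1}(\mu - r_f \mathbf{1})$, which is not derived in this post). The inputs are hypothetical. If the argument above is right, Equation 2 should then hold exactly for every asset.

```python
import numpy as np

# Hypothetical inputs (illustrative numbers only).
mu = np.array([0.08, 0.05, 0.11])   # expected returns of three risky assets
cov = np.array([
    [0.040, 0.006, 0.010],
    [0.006, 0.010, 0.004],
    [0.010, 0.004, 0.090],
])
rf = 0.02

# Assumption: the market portfolio is the tangency portfolio.
raw = np.linalg.solve(cov, mu - rf)
w_M = raw / raw.sum()

mu_M = w_M @ mu            # expected market return
var_M = w_M @ cov @ w_M    # market variance, sigma_M^2
cov_iM = cov @ w_M         # covariance of each asset with the market

beta = cov_iM / var_M              # ratio of covariance to market variance
capm_excess = beta * (mu_M - rf)   # right-hand side of Equation 2
actual_excess = mu - rf            # left-hand side of Equation 2

print(beta)
print(capm_excess, actual_excess)  # equal up to floating-point error
```

The market-weighted average of the betas is one, as it should be, since the market has a beta of one with itself.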

Security market line

The main graphical representation of the CAPM is the security market line (Figure 4). To understand this visualization, let’s think about what happens when $\beta_i \in \{0, 1\}$. When $\beta_i = 1$, the expected return of asset $i$ equals the expected return of the market, or $\mathbb{E}[R_i] = \mathbb{E}[R_M]$. And when $\beta_i = 0$, the expected return of asset $i$ equals the risk-free rate $r_f$. If we plot $\beta_i$ on the $x$-axis and expected return $\mathbb{E}[R_i]$ on the $y$-axis, we can draw a straight line between the two points defined above. This line is the security market line. Then we can estimate $\beta_i$ and $\mathbb{E}[R_i]$ for any asset in the market and plot the value against the security market line. An undervalued asset should be above the security market line, since it offers more expected return than its $\beta_i$ warrants, and an overvalued asset should be below it.

Figure 4. The security market line is a graphical representation of the CAPM. The point $(1, \mathbb{E}[R_M])$ is the risk-reward of the market, and the point $(0, r_f)$ is the risk-reward of the risk-free rate. Assets that are undervalued are above the line between these two points, since they provide more reward for the same risk $\beta_i$. Assets that are overvalued are under this line, since one should just hold the market portfolio to get a better return for the same risk.
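The reading of Figure 4 can be sketched in a few lines of code. The function names, the tolerance, and the example numbers below are hypothetical; the only logic is the line through $(0, r_f)$ and $(1, \mathbb{E}[R_M])$ and a comparison against it.

```python
def sml_expected_return(beta, rf, market_return):
    """Expected return on the security market line for a given beta."""
    return rf + beta * (market_return - rf)

def classify(beta, forecast_return, rf, market_return, tol=1e-9):
    """Compare a forecast return with the security market line."""
    gap = forecast_return - sml_expected_return(beta, rf, market_return)
    if gap > tol:
        return "undervalued"   # plots above the line
    if gap < -tol:
        return "overvalued"    # plots below the line
    return "fairly priced"     # plots on the line

rf, market_return = 0.02, 0.08
print(sml_expected_return(1.2, rf, market_return))  # about 0.092
print(classify(1.2, 0.12, rf, market_return))       # undervalued
print(classify(1.2, 0.05, rf, market_return))       # overvalued
```

The vertical gap computed inside `classify` is the quantity usually called alpha.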

Alpha

As I mentioned in the introduction, we could estimate the expectations in Equation 1 from historical data, and then fit a linear regression to estimate the sensitivity of the asset to the market, the term

$$ \beta_i \triangleq \frac{\sigma_{i,M}}{\sigma_M^2}. \tag{16} $$

Notice that the linear regression should have an intercept and an error term, i.e.:

$$ \mathbb{E}[R_i] - r_f = \alpha_i + \beta_i (\mathbb{E}[R_M] - r_f) + \varepsilon_i. \tag{17} $$

In Equation 17, the intercept is $\alpha_i$ and the slope is $\beta_i$. The standard ordinary least squares assumption is that the error terms are zero mean, or $\mathbb{E}[\varepsilon_i] = 0$. However, given the mathematical fact of Equation 1 (and it is a mathematical fact if the assumptions are true), it must be true that

$$ \mathbb{E}[\alpha_i] = 0. \tag{18} $$

This is another way of saying that the market is efficient. In expectation, there is no vertical offset for risky assets on the risk–reward spectrum, no risk-free reward in the market. The only way to a bigger reward is to take on a bigger risk. And $\beta_i$ captures how much market risk a given asset $i$ carries, and therefore how much reward it should earn.
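In practice, a regression like this is usually run on realized excess returns over time rather than on the expectations themselves; that time-series form is an assumption of the sketch below, as are the simulated data and parameter values. The data are generated with a true alpha of zero, so the estimated intercept should be small relative to its standard error.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2_000

# Simulate excess returns under the CAPM: the true alpha is zero (assumed values).
true_beta = 0.8
market_excess = rng.normal(0.004, 0.04, size=n)
asset_excess = true_beta * market_excess + rng.normal(0.0, 0.02, size=n)

# Design matrix with a column of ones so that the intercept (alpha) is estimated.
X = np.column_stack([np.ones(n), market_excess])
coef, *_ = np.linalg.lstsq(X, asset_excess, rcond=None)
alpha_hat, beta_hat = coef

# Standard error of alpha_hat from the usual OLS formula.
residuals = asset_excess - X @ coef
s2 = residuals @ residuals / (n - 2)
alpha_se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[0, 0])

print(alpha_hat, alpha_se, beta_hat)
```

An estimated alpha that is many standard errors away from zero would be evidence against Equation 18 for that asset, or against the assumptions behind it.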

Conclusion

In finance, the CAPM was the first theory to measure systematic risk (Grinold & Kahn, 2000). The CAPM argues that there is a single type of risk, market risk. The risk of asset $i$ is captured by the parameter $\beta_i$, as defined by Equation 2, and the CAPM argues that, in expectation, there is no $\alpha_i$, no idiosyncratic reward that requires no risk. To earn a higher reward, you must expose yourself to the risks of the market. The framework is an extension of modern portfolio theory with some additional assumptions about market efficiency.

  1. Sharpe, W. F. (1964). Capital asset prices: A theory of market equilibrium under conditions of risk. The Journal of Finance, 19(3), 425–442.
  2. Grinold, R. C., & Kahn, R. N. (2000). Active portfolio management.