The Capital Asset Pricing Model

In finance, the capital asset pricing model (CAPM) was the first theory to measure systematic risk. The CAPM argues that there is a single type of risk, market risk. I derive the CAPM from the mean–variance framework of modern portfolio theory.

Published 06 March 2022

When I first learned about the capital asset pricing model (CAPM) (Sharpe, 1964), I thought the main result was non-obvious. To explain my initial confusion, let me first state this result. Let $R_M$ be the return of the market. For example, this might be approximated by the return of the S&P 500 index. Let $R_i$ be the return of asset $i$ . And let $r_f$ be the risk-free rate, meaning a rate of return that is achievable without any risk. No such risk-free asset truly exists, but the canonical example is a United States Treasury bill. Under some assumptions which we will discuss, we can prove the following equation:

$\mathbb{E}[R_i] - r_f = \frac{\sigma_{i,M}}{\sigma_M^2} \left( \mathbb{E}[R_M] - r_f \right), \tag{1}$

where $\sigma_M^2$ is the variance of the market and $\sigma_{i,M}$ is the cross-covariance between the market and asset $i$ . The coefficient in Equation $1$ is often called “beta” and denoted $\beta_i$ , giving us

$\mathbb{E}[R_i] - r_f = \beta_i \left( \mathbb{E}[R_M] - r_f \right). \tag{2}$

This is because in a simple linear regression, the linear coefficient $\beta$ is the ratio of the cross-covariance to the variance. (See Equation $10$ in this post for details on this claim.) I assume that $\beta_i$ in Equation $2$ gets its name and notation from the standard notation for least squares regression. An attentive reader might notice that the regression’s intercept, often denoted $\alpha$ , is zero in Equation $2$ . We will discuss what this means later.
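As a quick sanity check of the claim that the regression slope equals the ratio of the covariance to the variance, here is a minimal sketch in Python with NumPy. The return series are simulated, and the "true" slope of 1.5 and the noise levels are assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated (hypothetical) excess returns; this is not real market data.
market = rng.normal(0.01, 0.04, size=1000)
asset = 1.5 * market + rng.normal(0.0, 0.02, size=1000)

# Beta as the ratio of the covariance to the market variance.
cov = np.cov(asset, market, ddof=1)
beta_ratio = cov[0, 1] / cov[1, 1]

# Beta as the slope of a simple least-squares line (with intercept).
beta_ols, intercept = np.polyfit(market, asset, deg=1)

print(beta_ratio, beta_ols)
```

Up to floating-point error, `beta_ratio` and `beta_ols` agree, which is the identity behind the name "beta".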

Now let me explain what is non-obvious to me. What Equation $2$ says is that there is a linear relationship between the expected excess rate of return of asset $i$ (excess w.r.t. the risk-free rate) and the expected excess rate of return of the market. This means that if we had a bunch of historical data with which to estimate $\mathbb{E}[R_i]$ and $\mathbb{E}[R_M]$, we could simply fit a linear regression to Equation $2$ to estimate $\beta_i$, which would tell us whether asset $i$ is equivalent to the market ($\beta_i \approx 1$), better than the market ($\beta_i \gt 1$), or worse than the market ($\beta_i \lt 1$).

In my mind, if the above were true, it would be surprising. However, we can prove it is true with some assumptions (albeit simplistic ones). The goal of this post is to re-derive Equation $2$ from first principles, to better understand which assumptions induce this surprising result.

Mean–variance analysis

To understand the CAPM, one must understand modern portfolio theory and the efficient frontier. I will briefly review these topics here, but see those two posts for details.

Recall that modern portfolio theory posits that reward can be defined as the expected return of a portfolio, and its associated risk can be defined as the standard deviation on that return. Let $R_p$ be the return of portfolio $p$ , and let $w_i$ denote the portfolio weight for the $i$ -th asset. Then one can show that

$\begin{aligned} \mu_p = \mathbb{E}[R_p] &= \sum_{n=1}^N w_n \mathbb{E}[R_n], \\ \sigma_p^2 = \mathbb{V}[R_p] &= \sum_{n=1}^N \sum_{m=1}^N w_n w_m \text{Cov}(R_n, R_m), \end{aligned} \tag{3}$

which implies that the relationship between risk and reward is a quadratic. This quadratic relationship is sometimes called the Markowitz bullet (named after the founder of modern portfolio theory, Harry Markowitz), and it is typically diagrammed with $\sigma_p$ on the $x$ -axis and $\mu_p$ on the $y$ -axis (Figure $1$ , dashed blue line). This kind of plot is sometimes called the risk-reward spectrum. Each point on the plane represents a portfolio (possibly a single-asset portfolio) with a given risk and associated reward.
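To make Equation $3$ concrete, the following sketch computes the mean and variance of a three-asset portfolio. The expected returns, covariance matrix, and weights are made-up numbers, not estimates from real data.

```python
import numpy as np

# Hypothetical expected returns and covariance matrix for three assets.
mu = np.array([0.05, 0.08, 0.12])
cov = np.array([
    [0.010, 0.002, 0.001],
    [0.002, 0.040, 0.006],
    [0.001, 0.006, 0.090],
])
w = np.array([0.5, 0.3, 0.2])  # portfolio weights, summing to one

mu_p = w @ mu          # sum_n w_n E[R_n]
var_p = w @ cov @ w    # sum_n sum_m w_n w_m Cov(R_n, R_m)
sigma_p = np.sqrt(var_p)

# The same variance written as the explicit double sum in Equation 3.
var_loop = sum(w[n] * w[m] * cov[n, m] for n in range(3) for m in range(3))

print(mu_p, var_p, sigma_p)
```

The point $(\sigma_p, \mu_p)$ computed this way is one point on the risk-reward plane; sweeping over weights would trace out the region bounded by the bullet.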

Figure 1. The mean–variance spectrum from modern portfolio theory. The Markowitz bullet (dashed blue line) traces the quadratic relationship between risk and reward when considering portfolios with only risky assets. The quadratic efficient frontier is the top half of this bullet (solid blue line). When we include a risk-free asset into the portfolio, the efficient frontier becomes linear (red line). All portfolios on the linear efficient frontier are convex combinations of a portfolio containing just the risk-free rate (red triangle) and a portfolio containing only risky assets on the quadratic efficient frontier (red circle).

Modern portfolio theory assumes that the rational investor should seek higher reward for the same risk. Since for a given portfolio variance, there are two portfolios with different expected returns, the rational investor should prefer portfolios on the top half of the Markowitz bullet (Figure $1$ , solid blue line). This region is called the efficient frontier, because any portfolio that is inside of or on the lower half of the Markowitz bullet is inefficient in that an investor could have a higher expected return for the same risk.

Finally, modern portfolio theory proves that if we can construct a portfolio with $N$ risky assets (each $R_i$ is a random variable) and a single risk-free asset (the non-random risk-free rate $r_f$), the efficient frontier is no longer a quadratic curve but a straight line (Figure $1$, red line). Furthermore, any portfolio along this line can be represented as a convex combination of a portfolio containing just the risk-free rate (Figure $1$, triangle) and the tangency portfolio, which is the portfolio that sits at the intersection of the linear and quadratic efficient frontiers (Figure $1$, circle).

In my mind, understanding the linear efficient frontier is critical to understanding the CAPM. This is so important that if it does not make sense now, I strongly suggest you read this section on the efficient frontier with a risk-free asset before proceeding.

Capital market line

Understanding the linear efficient frontier is so important for understanding the CAPM because, at least in my mind, the CAPM is primarily an interpretation of this frontier. First, the CAPM argues that the tangency portfolio is really the market portfolio, or a portfolio containing all the assets in the market. (Market weight $w_i$ might be chosen, for example, based on the trading volume of asset $i$.) Why? Recall that the tangency portfolio is the portfolio of risky assets that maximizes the Sharpe ratio, defined as

$\text{sharpe} = \frac{\mu_p - r_f}{\sigma_p}. \tag{4}$

See this section for details. The geometric interpretation of the Sharpe ratio is that it is the reward per unit of risk on the risk-return spectrum, and so the tangency portfolio is the portfolio that is maximally efficient in the sense that you cannot get more reward for the same amount of risk. The CAPM argues that the market must be efficient, because if it were not, investors would arbitrage any mispricings and thereby make the market efficient. So all rational investors must hold this maximally efficient portfolio, which is also the tangency portfolio.
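Here is a minimal numerical sketch of the tangency portfolio as the Sharpe-ratio maximizer. It assumes the standard closed-form result from mean–variance analysis, which is not derived in this post, that the tangency weights are proportional to $\Sigma^{-1}(\mu - r_f \mathbf{1})$, where $\Sigma$ is the covariance matrix of the risky assets and $\mu$ is their vector of expected returns. All numbers, and the names `sharpe` and `w_tan`, are hypothetical.

```python
import numpy as np

# Hypothetical inputs: expected returns, covariances, and risk-free rate.
mu = np.array([0.05, 0.08, 0.12])
cov = np.array([
    [0.010, 0.002, 0.001],
    [0.002, 0.040, 0.006],
    [0.001, 0.006, 0.090],
])
r_f = 0.02

def sharpe(w, mu, cov, r_f):
    """Sharpe ratio of a portfolio with weights w (Equation 4)."""
    return (w @ mu - r_f) / np.sqrt(w @ cov @ w)

# Assumed closed form: solve cov @ raw = mu - r_f, then normalize.
raw = np.linalg.solve(cov, mu - r_f)
w_tan = raw / raw.sum()

# Compare against many random long-only portfolios of the same assets.
rng = np.random.default_rng(0)
random_w = rng.dirichlet(np.ones(3), size=1000)
best_random = max(sharpe(w, mu, cov, r_f) for w in random_w)

print(sharpe(w_tan, mu, cov, r_f), best_random)
```

None of the random portfolios should beat `w_tan` on reward per unit of risk, which is what "maximally efficient" means here.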

So the “tangency portfolio” is now the “market portfolio”. Second, the CAPM interprets the linear efficient frontier as the “capital market line”. I am not sure why, but I would guess that the name is meant to suggest that the capital markets “live” on that linear efficient frontier, since all investors, being rational, want a portfolio that is a combination of the maximally efficient portfolio (tangency or market portfolio) and the risk-free rate. Where investors want to be on this convex combination depends on their risk preferences. So rather than Figure $1$ , the CAPM uses Figure $2$ .

Figure 2. The CAPM provides a specific interpretation to Figure $1$. The linear efficient frontier is called the "capital market line" and the tangency portfolio, it is argued, must be the market portfolio.

Deriving the CAPM

We are now ready to derive Equation $2$ . Consider a portfolio that holds a combination of the market portfolio and a single asset $i$ . Let $1-w$ be the market weight and $w$ be the single asset weight. Clearly, the expected return of this portfolio is

$\mathbb{E}[R_p] = w \mathbb{E}[R_i] + (1 - w) \mathbb{E}[R_M], \tag{5}$

where $R_M$ is the rate of return of the market. This is just an accounting identity. The variance is a bit more tedious, but it’s just

$\begin{aligned} \mathbb{V}[R_p] &= \mathbb{E}\left[(R_p - \mathbb{E}[R_p])^2 \right] \\ &= \mathbb{E}\left[(w R_i + (1 - w) R_M - (w \mathbb{E}[R_i] + (1 - w) \mathbb{E}[R_M]))^2 \right] \\ &= \mathbb{E}\left[(w (R_i - \mathbb{E}[R_i]) + (1 - w) (R_M - \mathbb{E}[R_M]))^2 \right] \\ &= w^2 \sigma_i^2 + (1 - w)^2 \sigma_M^2 + 2w (1-w)\sigma_{i,M}. \end{aligned} \tag{6}$

You can verify that when $w = 1$ , the mean and variance are just the mean and variance for asset $i$ , and when $w = 0$ , the mean and variance are just the mean and variance for the market.
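That verification can also be done numerically. The sketch below implements Equations $5$ and $6$ directly and checks the endpoints, using hypothetical moments for asset $i$ and the market.

```python
import numpy as np

# Hypothetical moments for asset i and the market (assumed values).
mu_i, mu_m = 0.10, 0.07
sigma_i, sigma_m = 0.30, 0.15
cov_im = 0.027  # covariance between asset i and the market

def mix_mean(w):
    """Equation 5: expected return with weight w on asset i."""
    return w * mu_i + (1 - w) * mu_m

def mix_var(w):
    """Equation 6: variance with weight w on asset i."""
    return (w**2 * sigma_i**2 + (1 - w)**2 * sigma_m**2
            + 2 * w * (1 - w) * cov_im)

# Cross-check Equation 6 against the general quadratic form of Equation 3.
cov2 = np.array([[sigma_i**2, cov_im], [cov_im, sigma_m**2]])
weights = np.array([0.25, 0.75])
var_quadratic_form = weights @ cov2 @ weights

print(mix_mean(0.0), mix_var(0.0))  # market moments
print(mix_mean(1.0), mix_var(1.0))  # asset i moments
```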

Notice that these two equations also imply a quadratic relationship. Since a portfolio with a single risky asset must be either on the efficient frontier or inside of it, and since any portfolio with just the market portfolio must be at the tangency portfolio, then Equations $5$ and $6$ imply a quadratic equation that is inside the Markowitz bullet and yet touches the tangency portfolio (Figure $3$, black line). We can think of this curve as asset $i$’s idiosyncratic frontier. (This is not standard terminology, but it is how I think about it.)

Figure 3. The convex combination of the market portfolio (red dot) and a portfolio containing a single asset (black square) traces out a quadratic curve within the Markowitz bullet (black line). The curve must be tangent to the capital market line at the market portfolio, since the market portfolio is the tangency portfolio.

Now what is the slope (Sharpe ratio) of this curve at the tangency or market portfolio? We know it must be the Sharpe ratio of the tangency portfolio. But we could also derive the slope of the idiosyncratic frontier at the tangency portfolio by representing its slope as a function of $w$ , taking its derivative, and then plugging in $w=0$ . Let’s do that.

Let $f(\sigma_p) = \mu_p$ be the function that defines the black curve on the risk-reward spectrum. Then the slope of the curve is simply the first derivative, $f^{\prime}(\sigma_p)$. We would like to compute this derivative but in terms of the weight $w$. This way we can plug in $w=0$ to get the slope of the black curve at the tangency or market portfolio. We can do that by expressing the mean and standard deviation as functions of $w$. Define such functions $g(w)$ and $h(w)$ as

$\begin{aligned} g(w) &\triangleq \mathbb{E}[R_p] = \mu_p, \\ h(w) &\triangleq \sqrt{\mathbb{V}[R_p]} = \sigma_p. \end{aligned} \tag{7}$

These are functions of $w$ because of Equations $5$ and $6$ . Then the function $f(\sigma_p) = \mu_p$ can be written as

$f(h(w)) = g(w). \tag{8}$

This is the function that traces the black curve on the risk-reward spectrum, as a function of the weight $w$. Then the derivative of $f(\sigma_p)$ as a function of $w$ is

$\begin{aligned} f^{\prime}(h(w)) h^{\prime}(w) &= g^{\prime}(w), \\ &\Downarrow \\ f^{\prime}(\sigma_p) &= \frac{g^{\prime}(w)}{h^{\prime}(w)}. \end{aligned} \tag{9}$

So we simply need to compute the ratio of the derivatives of $g(w)$ and $h(w)$ at $w=0$ to compute the desired slope. The derivative of $g(w)$ is

$g^{\prime}(w) = \mathbb{E}[R_i] - \mathbb{E}[R_M]. \tag{10}$

So $g^{\prime}(w)$ is not a function of $w$ . The derivative of $h(w)$ is more tedious because of the square root, but it is straightforward. It is

$h^{\prime}(w) = \frac{1}{2} \left[ \frac{2 w \sigma_i^2 - 2(1 - w) \sigma_M^2 + 2 \sigma_{i,M} - 4 w \sigma_{i,M}}{\left( w^2 \sigma_i^2 + (1 - w)^2 \sigma_M^2 + 2w (1-w)\sigma_{i,M} \right)^{1/2}} \right]. \tag{11}$

What happens when we plug $w=0$ into $h^{\prime}(w)$ ? Most of the terms cancel, and we get

$h^{\prime}(0) = \frac{1}{2} \left[ \frac{-2 \sigma_M^2 + 2 \sigma_{i,M}}{\sqrt{\sigma_M^2}} \right] = \frac{\sigma_{i,M} - \sigma_M^2}{\sigma_M}. \tag{12}$
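If the algebra feels error-prone, a finite-difference check is a cheap safeguard. The sketch below compares the closed form in Equation $12$ with a central-difference estimate of $h^{\prime}(0)$. The variances and covariance are hypothetical values chosen for illustration.

```python
import math

# Hypothetical risk parameters (assumed values).
sigma_i, sigma_m, cov_im = 0.30, 0.15, 0.027

def h(w):
    """Standard deviation of the asset-market mix (square root of Equation 6)."""
    var = (w**2 * sigma_i**2 + (1 - w)**2 * sigma_m**2
           + 2 * w * (1 - w) * cov_im)
    return math.sqrt(var)

# Central-difference approximation of h'(0).
eps = 1e-6
h_prime_numeric = (h(eps) - h(-eps)) / (2 * eps)

# Closed form from Equation 12.
h_prime_closed = (cov_im - sigma_m**2) / sigma_m

print(h_prime_numeric, h_prime_closed)
```

Note that $w$ is allowed to be slightly negative in the central difference, which corresponds to a small short position in asset $i$.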

So the derivative of the function $f(\sigma_p) = \mu_p$ that traces out the black curve, at the point $w = 0$ where the curve touches the market portfolio, is

$f^{\prime}(\sigma_M) = \frac{g^{\prime}(0)}{h^{\prime}(0)} = \frac{\sigma_{M} (\mathbb{E}[R_i] - \mathbb{E}[R_M])}{\sigma_{i,M} - \sigma_M^2}. \tag{13}$

But we also know that the slope of the linear efficient frontier (the capital market line) is the Sharpe ratio of the tangency portfolio. So it must be the case that

$\frac{\mathbb{E}[R_M] - r_f}{\sigma_M} = \frac{\sigma_{M} (\mathbb{E}[R_i] - \mathbb{E}[R_M])}{\sigma_{i,M} - \sigma_M^2}. \tag{14}$

And this is just Equation $1$ :

$\begin{aligned} \frac{\mathbb{E}[R_M] - r_f}{\sigma_M} &= \frac{\sigma_{M} (\mathbb{E}[R_i] - \mathbb{E}[R_M])}{\sigma_{i,M} - \sigma_M^2} \\ \left( \frac{\mathbb{E}[R_M] - r_f}{\sigma_M} \right) \left( \frac{\sigma_{i,M} - \sigma_M^2}{\sigma_M} \right) &= \mathbb{E}[R_i] - \mathbb{E}[R_M] \\ \frac{\sigma_{i,M}}{\sigma_M^2} (\mathbb{E}[R_M] - r_f) - (\mathbb{E}[R_M] - r_f) &= \mathbb{E}[R_i] - \mathbb{E}[R_M] \\ \frac{\sigma_{i,M}}{\sigma_M^2} (\mathbb{E}[R_M] - r_f) &= \mathbb{E}[R_i] - r_f. \end{aligned} \tag{15}$

And we’re done. This is why I said earlier that the linear relationship between an individual asset’s expected return $\mathbb{E}[R_i]$ and the market’s expected return $\mathbb{E}[R_M]$ is really an artifact of the linear efficient frontier. Equations $1$ and $15$ simply fall out of this fact.
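The result can also be checked numerically. The sketch below builds a tangency portfolio from hypothetical inputs, treats it as the market portfolio, and confirms that every asset satisfies Equation $1$. As an assumption not derived in this post, it uses the standard closed form in which tangency weights are proportional to $\Sigma^{-1}(\mu - r_f \mathbf{1})$, with $\Sigma$ the covariance matrix and $\mu$ the vector of expected returns.

```python
import numpy as np

# Hypothetical inputs: expected returns, covariances, and risk-free rate.
mu = np.array([0.05, 0.08, 0.12])
cov = np.array([
    [0.010, 0.002, 0.001],
    [0.002, 0.040, 0.006],
    [0.001, 0.006, 0.090],
])
r_f = 0.02

# Tangency weights (assumed closed form), playing the role of the market.
raw = np.linalg.solve(cov, mu - r_f)
w_m = raw / raw.sum()

mu_m = w_m @ mu           # E[R_M]
var_m = w_m @ cov @ w_m   # sigma_M^2
cov_im = cov @ w_m        # sigma_{i,M} for each asset i
beta = cov_im / var_m

lhs = mu - r_f            # E[R_i] - r_f
rhs = beta * (mu_m - r_f) # beta_i (E[R_M] - r_f)

print(beta)
print(np.allclose(lhs, rhs))
```

If the "market" were replaced by any portfolio other than the tangency portfolio, `lhs` and `rhs` would generally differ, which is one way to see how much work the tangency assumption is doing.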

Security market line

The main graphical representation of the CAPM is the security market line (Figure $4$). To understand this visualization, let’s think about what happens when $\beta_i \in \{0, 1\}$. When $\beta_i = 1$, the expected return of asset $i$ equals the expected return of the market, or $\mathbb{E}[R_i] = \mathbb{E}[R_M]$. And when $\beta_i = 0$, the expected return of asset $i$ equals the risk-free rate $r_f$. If we plot $\beta_i$ on the $x$-axis and expected return $\mathbb{E}[R_i]$ on the $y$-axis, we can draw a straight line between the two points defined above. This line is the security market line. Then we can estimate $\beta_i$ and $\mathbb{E}[R_i]$ for any asset in the market and plot the value against the security market line. An overvalued asset should be below the security market line, and an undervalued asset should be above it.

Figure 4. The security market line is a graphical representation of the CAPM. The point $(1, \mathbb{E}[R_M])$ is the risk-reward of the market, and the point $(0, r_f)$ is the risk-reward of the risk-free rate. Assets that are undervalued are above the line between these two points, since they provide more reward for the same risk $\beta_i$. Assets that are overvalued are under this line, since one should just hold the market portfolio to get a better return for the same risk.
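As a small sketch of how the security market line could be used, the functions below compute the CAPM-implied expected return for a given beta and compare it with an estimated return. The function names, the tolerance, and the example numbers are all hypothetical.

```python
def sml_expected_return(beta, r_f, mu_m):
    """Expected return implied by the security market line (Equation 2)."""
    return r_f + beta * (mu_m - r_f)

def classify(beta, estimated_return, r_f, mu_m, tol=1e-9):
    """Compare an asset's estimated expected return with the line."""
    gap = estimated_return - sml_expected_return(beta, r_f, mu_m)
    if gap > tol:
        return "undervalued"  # plots above the line
    if gap < -tol:
        return "overvalued"   # plots below the line
    return "on the line"

# Hypothetical inputs: 2% risk-free rate, 7% expected market return.
r_f, mu_m = 0.02, 0.07
print(sml_expected_return(1.2, r_f, mu_m))
print(classify(1.2, 0.10, r_f, mu_m))
print(classify(1.2, 0.06, r_f, mu_m))
```

In practice both $\beta_i$ and $\mathbb{E}[R_i]$ are noisy estimates, so a real comparison would need to account for estimation error rather than a fixed tolerance.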

Alpha

As I mentioned in the introduction, we could estimate the expectations in Equation $1$ from historical data, and then fit a linear regression to estimate the sensitivity of the asset to the market, the term

$\beta_i \triangleq \frac{\sigma_{i,M}}{\sigma_M^2}. \tag{16}$

Notice that the linear regression should have an intercept and an error term, i.e.:

$\mathbb{E}[R_i] - r_f = \alpha_i + \beta_i (\mathbb{E}[R_M] - r_f) + \varepsilon_i. \tag{17}$

In Equation $17$, the intercept is $\alpha_i$ and the slope is $\beta_i$. The standard ordinary least squares assumption is that the error terms are zero mean, or $\mathbb{E}[\varepsilon_i] = 0$. However, given the mathematical fact of Equation $1$ (and it is a mathematical fact if the assumptions are true), it must be true that

$\mathbb{E}[\alpha_i] = 0. \tag{18}$

This is another way of saying that the market is efficient. In expectation, there is no vertical offset for risky assets on the risk–reward spectrum, no risk-free reward in the market. The only way to a bigger reward is to take on a bigger risk. And $\beta_i$ captures how much reward you get per unit of risk, for a given asset $i$ .
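To illustrate what a regression with an intercept would look like, here is a minimal sketch that fits $\alpha_i$ and $\beta_i$ by ordinary least squares. The excess returns are simulated from a model in which the true alpha is zero and the true beta is 0.8; those values and the noise levels are assumptions, so the output only shows that the fit recovers what was put in.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
true_beta = 0.8  # assumed value used to simulate the data

# Simulated (hypothetical) excess returns with zero true alpha.
market_excess = rng.normal(0.005, 0.04, size=n)
asset_excess = true_beta * market_excess + rng.normal(0.0, 0.02, size=n)

# Design matrix with a column of ones for the intercept alpha.
X = np.column_stack([np.ones(n), market_excess])
(alpha_hat, beta_hat), *_ = np.linalg.lstsq(X, asset_excess, rcond=None)

print(alpha_hat, beta_hat)
```

On real data, an estimated intercept that is reliably different from zero would be evidence against the CAPM's assumptions rather than a risk-free reward.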

Conclusion

In finance, the CAPM was the first theory to measure systematic risk (Grinold & Kahn, 2000). The CAPM argues that there is a single type of risk, market risk. The risk of asset $i$ is captured by the parameter $\beta_i$ , as defined by Equation $2$ , and the CAPM argues that, in expectation, there is no $\alpha_i$ , no idiosyncratic reward that requires no risk. To earn a higher reward, you must expose yourself to the risks of the market. The framework is an extension of modern portfolio theory with some additional assumptions about market efficiency.

Sharpe, W. F. (1964). Capital asset prices: A theory of market equilibrium under conditions of risk. The Journal of Finance, 19(3), 425–442.
Grinold, R. C., & Kahn, R. N. (2000). Active portfolio management.