Factor Modeling in Finance
I discuss multi-factor modeling, which generalizes many early financial models into a common prediction and risk framework.
In the 1950s, Harry Markowitz developed modern portfolio theory (Markowitz, 1952). In particular, modern portfolio theory introduced the idea of a risk–reward trade-off between a portfolio’s expected return and its volatility. In its framing, investors want portfolios with high expected returns and low volatility in actual outcomes. This framework is sometimes called mean–variance analysis. However, Markowitz did not define “risk” beyond the notion of portfolio volatility.
Subsequently, several researchers (Sharpe, 1964; Mossin, 1966; Lintner, 1965; Treynor, 1961) extended mean–variance analysis one step further to argue that there is a single explanatory variable, often called a "factor" in both statistics and finance, that explains expected returns: the market. This model became known as the capital asset pricing model (CAPM). The CAPM argues that the only real risk is the market, and so the only real factor is exposure to the market. Formally, let $r_i$ and $r_m$ be random variables denoting the return of asset $i$ and the market $m$, and let $r_f$ be the non-random risk-free rate of return. Then the CAPM states,

$$
\mathbb{E}[r_i] - r_f = \beta_i \left( \mathbb{E}[r_m] - r_f \right).
$$
In words, the expected excess return of asset $i$ is a linear function of the expected excess return of the market. This linear relationship can be derived directly from the mean–variance analysis framework, particularly from the linear efficient frontier. See my post on the CAPM for this derivation. The CAPM is useful because it is both simple and intuitive. There is a single factor, the market, and a single exposure to that factor, $\beta_i$, which captures an asset's sensitivity to the market. A higher (lower) $\beta_i$ means that when the market moves up or down, the return of asset $i$ moves up or down faster (slower).
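As a quick illustration, $\beta_i$ can be estimated by regressing excess asset returns on excess market returns, since $\beta_i = \mathrm{Cov}(r_i, r_m) / \mathrm{Var}(r_m)$. The sketch below uses simulated data (all numbers, including the true beta, are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated example (illustrative numbers, not real market data).
T = 1_000                    # number of return observations
r_f = 0.0001                 # risk-free rate per period
r_m = r_f + 0.0003 + 0.01 * rng.standard_normal(T)  # market returns
true_beta = 1.3
r_i = r_f + true_beta * (r_m - r_f) + 0.005 * rng.standard_normal(T)

# Estimate beta as Cov(excess asset, excess market) / Var(excess market),
# which equals the OLS slope of the excess-return regression.
x = r_m - r_f
y = r_i - r_f
beta_hat = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
```

With a long enough sample and modest idiosyncratic noise, `beta_hat` recovers the simulated beta closely.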
However, empirical evidence has not agreed with the CAPM (Fama & French, 2004). To quote (Chamberlain & Rothschild, 1982),
Few believe that asset returns are well described by their first two moments or that some well-defined set of marketable assets contains most of the investment opportunities available to individual investors. Casual observation is sufficient to refute one of the main implications of the CAPM—that everyone holds the market portfolio.
Subsequent economists have proposed many multi-factor models, meaning models with multiple explanatory variables. An early and famous multi-factor model, for example, is the Fama–French three-factor model (Fama & French, 1993). Formally, let "SMB" denote a "small minus big" factor, where size is measured via market capitalization, and let "HML" denote a "high minus low" factor, where value is measured via the book-to-market ratio (book or accounting value to market value). Then the Fama–French three-factor model is

$$
\mathbb{E}[r_i] - r_f = \beta_{i,\text{mkt}} \left( \mathbb{E}[r_m] - r_f \right) + \beta_{i,\text{smb}} \, \mathbb{E}[\text{SMB}] + \beta_{i,\text{hml}} \, \mathbb{E}[\text{HML}].
$$
Of course, one can extend this logic to any number of macroeconomic indicators, and so there are many other multi-factor models. As an aside, note that the factors themselves are not indexed by $i$. The interpretation of this is that the factors are macroeconomic in nature: exposure to the market, size relative to the market, relative value, interest rates, and so on. Each asset is exposed to or loads on these factors in some quantity, defined by the parameters $\beta_{i,\text{mkt}}$, $\beta_{i,\text{smb}}$, and $\beta_{i,\text{hml}}$, which are indexed by $i$.
A natural extension to multi-factor models with factors specified by some modeler would be a model with latent or unobservable factors. In other words, rather than specifying the factors in advance, can we use multivariate statistics to simply infer the factors? This is precisely what arbitrage pricing theory (APT) does (Ross, 1976). Formally, let $K$ denote the number of latent factors, let $f_k$ denote the $k$-th systematic factor that is common to all assets, and let $\beta_{ik}$ denote asset $i$'s sensitivity or factor loading onto the $k$-th factor. Finally, let $\alpha_i$ denote a linear model's intercept, and let $\varepsilon_i$ denote white noise. Then APT models risky asset returns as

$$
r_i = \alpha_i + \beta_{i1} f_1 + \dots + \beta_{iK} f_K + \varepsilon_i.
$$
APT is quite general, and at this point I think it would be quite clear to a statistician where all this is going: a general framework for multi-factor modeling that looks a lot like linear-Gaussian models such as factor analysis or principal component analysis (PCA).
Multi-factor models
In finance, factor modeling exploits standard methods from multivariate statistics to model returns, variances, and correlations (Rosenberg & McKibben, 1973). To reiterate, a factor is an explanatory variable, and these factors can be observable or unobservable.
Let $N$ be the number of assets, $T$ be the number of time periods, and $K$ be the number of factors. Ideally, $K \ll N$. In words, there are far fewer macroeconomic factors that explain asset returns than there are unique assets. Let $r_{it}$ and $\varepsilon_{it}$ denote the return and white noise of asset $i$ at time $t$. Let $\mathbf{f}_t$ be a $K$-vector of factors at time $t$, and let $\boldsymbol{\beta}_i$ be a $K$-vector of "loadings" or exposures of asset $i$ onto these factors. Note that the factors are common across assets (no index $i$), while the loadings are common across time (no index $t$). Then the multi-factor model is

$$
r_{it} = \boldsymbol{\beta}_i^{\top} \mathbf{f}_t + \varepsilon_{it}.
$$
As with linear regression, we can push the intercept into the dot product by adding a dummy factor that is always one. In finance, $\varepsilon_{it}$ can be interpreted as the idiosyncratic return, or the return of asset $i$ at time $t$ that is not shared across assets or time. Note that all other parts of the return in the model above are common to all assets via the factors.
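To make the notation concrete, here is a small simulation of this model with toy dimensions and randomly generated loadings and factors (all values are illustrative); the dummy-factor trick for intercepts appears at the end:

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, K = 5, 200, 3                      # assets, periods, factors (toy sizes)

B = rng.standard_normal((N, K))          # loadings, fixed across time
F = rng.standard_normal((T, K))          # factors, shared across assets
eps = 0.1 * rng.standard_normal((T, N))  # idiosyncratic returns

# r_{it} = beta_i^T f_t + eps_{it}, computed for all i and t at once.
R = F @ B.T + eps                        # T x N matrix of returns

# Intercepts via a dummy factor that is always one: put each asset's
# alpha (here a common 0.001, purely illustrative) in the first column.
B_aug = np.column_stack([np.full(N, 0.001), B])
F_aug = np.column_stack([np.ones(T), F])
R_with_alpha = F_aug @ B_aug.T + eps
```

The augmented model shifts every return by its asset's intercept while leaving the factor structure unchanged.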
We assume that the idiosyncratic return has zero mean, $\mathbb{E}[\varepsilon_{it}] = 0$, and that these specific returns are uncorrelated with each other,

$$
\mathbb{E}[\varepsilon_{it} \varepsilon_{js}] = \begin{cases} \sigma_i^2 & i = j,\; t = s, \\ 0 & \text{otherwise}. \end{cases}
$$
Furthermore, we assume that the factors are zero mean, $\mathbb{E}[\mathbf{f}_t] = \mathbf{0}$, and are uncorrelated with the idiosyncratic return $\varepsilon_{it}$. This means that their covariance is zero:

$$
\mathrm{Cov}(\mathbf{f}_t, \varepsilon_{it}) = \mathbb{E}[\mathbf{f}_t \varepsilon_{it}] - \mathbb{E}[\mathbf{f}_t] \, \mathbb{E}[\varepsilon_{it}] = \mathbf{0}.
$$

Since $\mathbb{E}[\mathbf{f}_t] = \mathbf{0}$ and $\mathbb{E}[\varepsilon_{it}] = 0$, this implies that the cross term must be zero as well, that

$$
\mathbb{E}[\mathbf{f}_t \varepsilon_{it}] = \mathbf{0}.
$$
Finally, let $\boldsymbol{\Sigma}_t$ denote the covariance of $\mathbf{f}_t$. Since $\mathbf{f}_t$ is zero mean, this implies

$$
\boldsymbol{\Sigma}_t = \mathbb{E}[\mathbf{f}_t \mathbf{f}_t^{\top}].
$$
Ex ante, $r_{it}$ is a random variable. These assumptions induce some probability distribution onto $r_{it}$, and we can derive its first and second moments. The first moment of $r_{it}$ is:

$$
\mathbb{E}[r_{it}] = \boldsymbol{\beta}_i^{\top} \mathbb{E}[\mathbf{f}_t] + \mathbb{E}[\varepsilon_{it}] = 0.
$$
So returns are zero mean if the factors are zero mean. The covariance between returns is

$$
\mathrm{Cov}(r_{it}, r_{js}) = \boldsymbol{\beta}_i^{\top} \mathbb{E}[\mathbf{f}_t \mathbf{f}_s^{\top}] \boldsymbol{\beta}_j + \mathbb{E}[\varepsilon_{it} \varepsilon_{js}].
$$
We cannot simplify $\mathbb{E}[\mathbf{f}_t \mathbf{f}_s^{\top}]$ without some assumptions. Typically, we restrict ourselves to $t = s$, so that this term is the covariance of the factors at time $t$:

$$
\mathrm{Cov}(r_{it}, r_{jt}) = \boldsymbol{\beta}_i^{\top} \boldsymbol{\Sigma}_t \boldsymbol{\beta}_j + \mathbb{E}[\varepsilon_{it} \varepsilon_{jt}].
$$

This is called a cross-sectional analysis, where the "cross-section" is the slice of all assets at time $t$.
As we will see, this covariance equation is a big leap forward, as it explicitly represents the riskiness of two assets through their idiosyncratic risk and their exposure to the factors' risk. What this suggests is that factors are useful not only for predicting asset returns but also for modeling portfolio volatility.
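We can sanity-check this decomposition numerically: simulate returns from the factor model and compare the sample covariance of the returns with the model-implied covariance, stacking the per-asset loadings into a matrix `B`. A minimal sketch with made-up parameters:

```python
import numpy as np

rng = np.random.default_rng(2)
N, K, T = 4, 2, 200_000            # toy sizes; large T so moments converge

B = rng.standard_normal((N, K))    # stacked loadings, one row per asset
Sigma_f = np.diag([2.0, 0.5])      # factor covariance (diagonal here)
Psi = np.diag([0.1, 0.2, 0.3, 0.4])  # idiosyncratic variances

F = rng.multivariate_normal(np.zeros(K), Sigma_f, size=T)
eps = rng.multivariate_normal(np.zeros(N), Psi, size=T)
R = F @ B.T + eps                  # T x N simulated returns

# Model-implied covariance: B Sigma_f B^T + Psi.
model_cov = B @ Sigma_f @ B.T + Psi
# Sample covariance of the simulated returns approaches it as T grows.
sample_cov = np.cov(R.T)
```

The off-diagonal entries of `sample_cov` come entirely from shared factor exposure, since the idiosyncratic returns are uncorrelated across assets.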
Inference
What parameters are we actually estimating in a multi-factor model? It depends on how we frame the problem. Let’s look at several scenarios.
Classic (implicit) factor model
In an implicit factor model, we assume the factor loadings are known and seek to estimate the unknown (implicit) factors. This approach uses a cross-sectional setup and induces the classic formulation of factor analysis. In a cross-sectional analysis, data are grouped by time period, meaning we care about differences between assets within each time period. Thus, we can rewrite the multi-factor model in vector form as

$$
\mathbf{r}_t = \mathbf{B} \mathbf{f}_t + \boldsymbol{\varepsilon}_t,
$$
where

$$
\mathbf{r}_t = \begin{bmatrix} r_{1t} \\ \vdots \\ r_{Nt} \end{bmatrix},
\qquad
\mathbf{B} = \begin{bmatrix} \boldsymbol{\beta}_1^{\top} \\ \vdots \\ \boldsymbol{\beta}_N^{\top} \end{bmatrix},
\qquad
\boldsymbol{\varepsilon}_t = \begin{bmatrix} \varepsilon_{1t} \\ \vdots \\ \varepsilon_{Nt} \end{bmatrix}.
$$

Here, $\mathbf{B}$ is an $N \times K$ matrix of factor loadings.
And we can represent the idiosyncratic return in vector form as

$$
\boldsymbol{\varepsilon}_t \sim \left( \mathbf{0}, \boldsymbol{\Psi} \right),
\qquad
\boldsymbol{\Psi} = \mathrm{diag}(\sigma_1^2, \dots, \sigma_N^2).
$$
While $\boldsymbol{\varepsilon}_t$ is indexed by $t$, $\boldsymbol{\Psi}$ is not. This is a cross-sectional assumption: at each time period, the covariance of the noise does not change. Since $\boldsymbol{\Psi}$ is a diagonal matrix, the error terms are uncorrelated across assets.
The moment equations above can be written in vector form as

$$
\mathbb{E}[\mathbf{r}_t] = \mathbf{0},
\qquad
\mathrm{Cov}(\mathbf{r}_t) = \mathbf{B} \boldsymbol{\Sigma}_t \mathbf{B}^{\top} + \boldsymbol{\Psi}.
$$
Typically, $\boldsymbol{\Sigma}_t$ is assumed to be a diagonal matrix. See A1 for derivations.
Finally, to estimate the factors $\mathbf{f}_t$, we fit $T$ cross-sectional linear regressions, one per time period, regressing $\mathbf{r}_t$ on the known loadings $\mathbf{B}$.
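A sketch of this cross-sectional procedure with simulated data: the loadings are treated as known, and each time period's factor realization is recovered by least squares (all dimensions and values are toy choices):

```python
import numpy as np

rng = np.random.default_rng(3)
N, K, T = 50, 3, 10                    # many assets, few factors, few periods

B = rng.standard_normal((N, K))        # known loadings
F_true = rng.standard_normal((T, K))   # factor realizations to recover
R = F_true @ B.T + 0.01 * rng.standard_normal((T, N))  # T x N returns

# One cross-sectional OLS per time period: f_t = (B^T B)^{-1} B^T r_t.
# lstsq solves all T regressions at once, one per column of R.T.
F_hat = np.linalg.lstsq(B, R.T, rcond=None)[0].T       # T x K
```

Because each regression pools $N = 50$ assets against only $K = 3$ unknowns, the factor estimates are quite precise even with noisy returns.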
Time-series-based (explicit) factor model
In an explicit factor model, we assume the factors are known a priori (explicit) and seek to estimate the unknown factor loadings. In some sense, this is the most intuitive setup and the one most analogous to the models in the introduction. For example, in the Fama–French three-factor model, we know the three factors a priori, and estimating the loadings is equivalent to estimating each asset's three loadings for all assets $i \in \{1, \dots, N\}$.
This requires a slightly different setup, a time-series regression. Instead of the cross-sectional formulation, we represent the problem as

$$
\mathbf{r}_i = \mathbf{F} \boldsymbol{\beta}_i + \boldsymbol{\varepsilon}_i,
$$

where $\mathbf{r}_i$ is a $T$-vector of returns for asset $i$ and $\mathbf{F}$ is a $T \times K$ matrix of factors.
Each of the $K$ columns of $\mathbf{F}$ is a $T$-vector representing the $k$-th factor varying across time. This is a time-series regression rather than a cross-sectional regression because now the dependent variable is the returns of a single asset across time. Any distributional assumption on $\mathbf{r}_i$ is an assumption about this time series. The error terms still have spherical errors, but with respect to time:

$$
\mathrm{Cov}(\boldsymbol{\varepsilon}_i) = \sigma_i^2 \mathbf{I}_T.
$$
Now we fit $N$ linear regressions, one per asset, to estimate each $\boldsymbol{\beta}_i$. Here, the real goal is to assess the goodness-of-fit of the model, given the chosen factors $\mathbf{F}$. If the model has a high coefficient of determination or if the estimated coefficients are statistically significant, then this suggests that the investor has selected useful factors.
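Here is a minimal time-series regression for a single asset with simulated factors (the true loadings and noise level are made up), including the coefficient of determination as a goodness-of-fit check:

```python
import numpy as np

rng = np.random.default_rng(4)
T, K = 500, 3                             # periods and known factors

F = rng.standard_normal((T, K))           # explicit factors, one column each
beta_true = np.array([1.2, -0.4, 0.7])    # illustrative true loadings
r = F @ beta_true + 0.05 * rng.standard_normal(T)  # one asset's returns

# Time-series OLS: regress the asset's T returns on the K factors.
beta_hat = np.linalg.lstsq(F, r, rcond=None)[0]

# Coefficient of determination: fraction of return variance explained.
r_fit = F @ beta_hat
r2 = 1.0 - np.sum((r - r_fit) ** 2) / np.sum((r - r.mean()) ** 2)
```

A high `r2` here simply reflects that the simulated returns were built from these factors; with real data, a high value would suggest the factors were well chosen.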
Statistical analysis
In the final approach, we assume both the factors and the loadings are unknown. To estimate both quantities jointly, we use standard methods in multivariate statistics such as factor analysis or PCA. Formally, factor analysis is the cross-sectional model defined above, while probabilistic PCA is identical to factor analysis except that the idiosyncratic returns have a common variance, i.e.

$$
\boldsymbol{\Psi} = \sigma^2 \mathbf{I}_N.
$$
Inference for factor analysis typically requires assuming that $\mathbf{f}_t$ is multivariate normally distributed with identity covariance, $\mathbf{f}_t \sim \mathcal{N}(\mathbf{0}, \mathbf{I}_K)$ (the factor covariance is not separately identifiable from $\mathbf{B}$), which induces a multivariate normal assumption on the returns:

$$
\mathbf{r}_t \sim \mathcal{N}\left( \mathbf{0}, \; \mathbf{B} \mathbf{B}^{\top} + \boldsymbol{\Psi} \right).
$$
This assumption is not too unreasonable if we assume $\mathbf{r}_t$ is a vector of log returns rather than raw returns. We can then write down the log likelihood,

$$
\log p(\mathbf{r}_1, \dots, \mathbf{r}_T) = -\frac{NT}{2} \log 2\pi - \frac{T}{2} \log \left| \mathbf{B} \mathbf{B}^{\top} + \boldsymbol{\Psi} \right| - \frac{1}{2} \sum_{t=1}^{T} \mathbf{r}_t^{\top} \left( \mathbf{B} \mathbf{B}^{\top} + \boldsymbol{\Psi} \right)^{-1} \mathbf{r}_t,
$$
and use maximum likelihood estimation or expectation–maximization (EM) to infer the parameters $\mathbf{B}$ and $\boldsymbol{\Psi}$. We can then estimate the factors $\mathbf{f}_t$ from the inferred parameters.
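For concreteness, the log likelihood above can be evaluated directly in numpy. This sketch assumes the factor-analysis convention that the factors have identity covariance, so the return covariance is $\mathbf{B}\mathbf{B}^{\top} + \boldsymbol{\Psi}$; it only evaluates the objective, it does not maximize it:

```python
import numpy as np

def factor_model_loglik(R, B, Psi):
    """Gaussian log likelihood of a T x N return matrix R under the
    factor-analysis covariance B @ B.T + Psi (returns assumed zero mean)."""
    T, N = R.shape
    cov = B @ B.T + Psi
    _, logdet = np.linalg.slogdet(cov)                 # stable log |cov|
    quad = np.einsum("ti,ij,tj->", R, np.linalg.inv(cov), R)
    return -0.5 * (T * N * np.log(2.0 * np.pi) + T * logdet + quad)

# Toy check: evaluate the likelihood of simulated returns at the
# parameters that generated them (all values illustrative).
rng = np.random.default_rng(7)
N, K, T = 6, 2, 500
B = 0.5 * rng.standard_normal((N, K))
Psi = np.diag(rng.uniform(0.05, 0.1, size=N))
R = rng.standard_normal((T, K)) @ B.T \
    + rng.multivariate_normal(np.zeros(N), Psi, size=T)
ll = factor_model_loglik(R, B, Psi)
```

An EM or gradient-based routine would iterate on $\mathbf{B}$ and $\boldsymbol{\Psi}$ to increase this quantity.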
Alternatively, we can use PCA. In the PCA-based approach, we first compute the sample covariance matrix of the returns, $\hat{\boldsymbol{\Omega}} = \frac{1}{T} \sum_{t=1}^{T} \mathbf{r}_t \mathbf{r}_t^{\top}$. Then we take the $K$ eigenvectors corresponding to the $K$ largest eigenvalues of $\hat{\boldsymbol{\Omega}}$ as the estimated factor loadings; projecting the returns onto these eigenvectors gives the estimated factors.
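A sketch of the PCA-based approach on simulated returns (toy dimensions; the data are generated from a two-factor model so that the leading eigenvectors are meaningful):

```python
import numpy as np

rng = np.random.default_rng(5)
N, K, T = 20, 2, 5_000

# Simulate returns with a true two-factor structure plus small noise.
B = rng.standard_normal((N, K))
R = rng.standard_normal((T, K)) @ B.T + 0.1 * rng.standard_normal((T, N))

# Sample covariance of returns, then its top-K eigenvectors.
cov = np.cov(R.T)
eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
top = eigvecs[:, -K:][:, ::-1]             # K leading eigenvectors (N x K)

# Project returns onto the leading eigenvectors for factor estimates.
F_hat = R @ top                            # T x K
```

With a genuine low-rank structure, the top $K$ eigenvalues dominate the rest, which is the usual heuristic for choosing $K$ in practice.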
See my previous post on factor analysis for details on fitting factor analysis using EM. Alternatively, see (Tipping & Bishop, 1999) for a discussion of probabilistic PCA.
Risk-factor modeling
So far, we have viewed factors as useful macroeconomic indicators that are correlated with or predictive of asset returns. However, if a factor predicts a return, it is natural to think of it as a risk factor as well. What do I mean? Recall that in the mean–variance analysis framework, the objective is to maximize our portfolio's expected return while minimizing its variance. See my post on mean–variance analysis if this claim does not make sense. Formally, if $\mathbf{w}$ is an $N$-vector of portfolio weights, then the unconstrained objective, for some risk-aversion parameter $\gamma > 0$, is:

$$
\mathbf{w}^{\star} = \arg\max_{\mathbf{w}} \left\{ \mathbf{w}^{\top} \mathbb{E}[\mathbf{r}_t] - \frac{\gamma}{2} \, \mathbf{w}^{\top} \boldsymbol{\Omega}_t \mathbf{w} \right\},
$$
where now $\mathbf{r}_t$ is an $N$-vector of returns for the assets in a portfolio at time $t$ and where $\boldsymbol{\Omega}_t$ is the covariance of those assets at time $t$. We might add constraints such as the weights summing to unity, but the essence of the problem is to maximize the returns of the positions we take and to minimize the risk of those positions, as captured by their variances and covariances.
However, $\boldsymbol{\Omega}_t$ is an $N \times N$ matrix, which can be quite large and quite sparse. Think about how many stocks there are, for example, and how the number of available stocks changes across time. To compute the optimal $\mathbf{w}$, an optimizer may propose many values of $\mathbf{w}$ and thus compute $\mathbf{w}^{\top} \boldsymbol{\Omega}_t \mathbf{w}$ many times. In risk-factor modeling, we replace $\boldsymbol{\Omega}_t$ with the factor decomposition of the covariance:

$$
\boldsymbol{\Omega}_t = \mathbf{B} \boldsymbol{\Sigma}_t \mathbf{B}^{\top} + \boldsymbol{\Psi}.
$$
In fact, we should do this if we believe the factor model is true, since that model states that returns really are linear combinations of the factors. This allows us to rewrite the objective as

$$
\mathbf{w}^{\star} = \arg\max_{\mathbf{w}} \left\{ \mathbf{w}^{\top} \mathbb{E}[\mathbf{r}_t] - \frac{\gamma}{2} \left( \mathbf{w}^{\top} \mathbf{B} \boldsymbol{\Sigma}_t \mathbf{B}^{\top} \mathbf{w} + \mathbf{w}^{\top} \boldsymbol{\Psi} \mathbf{w} \right) \right\}.
$$
Notice that $\boldsymbol{\Psi}$ is a diagonal matrix of idiosyncratic variances, so the right-most term $\mathbf{w}^{\top} \boldsymbol{\Psi} \mathbf{w}$ can be computed with a dot product and scalar–vector multiplication. And typically we assume that the factors are uncorrelated, meaning that $\boldsymbol{\Sigma}_t$ is a diagonal matrix as well. Since $K \ll N$, computing $(\mathbf{B}^{\top} \mathbf{w})^{\top} \boldsymbol{\Sigma}_t (\mathbf{B}^{\top} \mathbf{w})$ is much faster than computing $\mathbf{w}^{\top} \boldsymbol{\Omega}_t \mathbf{w}$ directly. Furthermore, this decomposition elegantly handles the sparsity in estimating $\boldsymbol{\Omega}_t$. Rather than estimating the covariances between all pairs of assets, we model asset correlations through their loadings onto common factors.
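The speedup is easy to see in code. The sketch below (with made-up loadings, variances, and weights) computes the portfolio variance both ways; the factor form never materializes the $N \times N$ covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(6)
N, K = 1_000, 10                       # many assets, few factors

B = rng.standard_normal((N, K))        # factor loadings
sigma_f = rng.uniform(0.5, 2.0, size=K)  # diagonal of factor covariance
psi = rng.uniform(0.01, 0.05, size=N)    # idiosyncratic variances
w = rng.uniform(0.0, 1.0, size=N)
w /= w.sum()                           # weights summing to unity

# Naive: materialize the full N x N covariance, then a quadratic form.
cov_full = B @ np.diag(sigma_f) @ B.T + np.diag(psi)
risk_naive = w @ cov_full @ w

# Factor form: a K-vector of factor exposures, then cheap dot products.
x = B.T @ w                            # portfolio's exposure to each factor
risk_fast = x @ (sigma_f * x) + psi @ w**2
```

Both quantities agree to floating-point precision, but the factor form costs $O(NK)$ per evaluation instead of $O(N^2)$, which matters when an optimizer evaluates the risk term thousands of times.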
Conclusion
In finance, multi-factor modeling generalizes many early financial models into a common framework, where returns are linear functions of macroeconomic explanatory variables, called "factors". In this framework, we can assume we know either the factors or the loadings but not both, in which case inferring the other amounts to a multivariate linear regression. Or we can use standard methods from multivariate statistics, such as factor analysis or PCA, to infer both the factors and loadings jointly. Finally, a natural extension of these ideas is to model risk (portfolio variances and covariances) through the low-rank approximation induced by the factor model.
Appendix
A1. Unconditional moments
The unconditional mean is zero, since $\mathbb{E}[\mathbf{f}_t] = \mathbf{0}$ and $\mathbb{E}[\boldsymbol{\varepsilon}_t] = \mathbf{0}$ by assumption:

$$
\mathbb{E}[\mathbf{r}_t] = \mathbf{B} \, \mathbb{E}[\mathbf{f}_t] + \mathbb{E}[\boldsymbol{\varepsilon}_t] = \mathbf{0}.
$$
The unconditional variance is

$$
\begin{aligned}
\mathrm{Cov}(\mathbf{r}_t)
&= \mathbb{E}[\mathbf{r}_t \mathbf{r}_t^{\top}] \\
&= \mathbb{E}\left[ (\mathbf{B} \mathbf{f}_t + \boldsymbol{\varepsilon}_t)(\mathbf{B} \mathbf{f}_t + \boldsymbol{\varepsilon}_t)^{\top} \right] \\
&= \mathbf{B} \, \mathbb{E}[\mathbf{f}_t \mathbf{f}_t^{\top}] \, \mathbf{B}^{\top} + \mathbf{B} \, \mathbb{E}[\mathbf{f}_t \boldsymbol{\varepsilon}_t^{\top}] + \mathbb{E}[\boldsymbol{\varepsilon}_t \mathbf{f}_t^{\top}] \, \mathbf{B}^{\top} + \mathbb{E}[\boldsymbol{\varepsilon}_t \boldsymbol{\varepsilon}_t^{\top}] \\
&= \mathbf{B} \boldsymbol{\Sigma}_t \mathbf{B}^{\top} + \boldsymbol{\Psi}.
\end{aligned}
$$
The cross terms are zero because $\mathbf{f}_t$ and $\boldsymbol{\varepsilon}_t$ are mean-zero and uncorrelated, so $\mathbb{E}[\mathbf{f}_t \boldsymbol{\varepsilon}_t^{\top}] = \mathbf{0}$.
- Markowitz, H. (1952). Portfolio selection. The Journal of Finance, 7(1), 77–91.
- Mossin, J. (1966). Equilibrium in a capital asset market. Econometrica: Journal of the Econometric Society, 768–783.
- Lintner, J. (1965). The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets. The Review of Economics and Statistics, 222–224.
- Treynor, J. L. (1961). Market value, time, and risk. Time, and Risk (August 8, 1961).
- Fama, E. F., & French, K. R. (2004). The capital asset pricing model: Theory and evidence. Journal of Economic Perspectives, 18(3), 25–46.
- Chamberlain, G., & Rothschild, M. (1982). Arbitrage, factor structure, and mean-variance analysis on large asset markets. National Bureau of Economic Research Cambridge, Mass., USA.
- Fama, E. F., & French, K. R. (1993). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 33(1), 3–56.
- Ross, S. A. (1976). The arbitrage theory of capital asset pricing. In Handbook of the fundamentals of financial decision making: Part I (pp. 11–30). World Scientific.
- Rosenberg, B., & McKibben, W. (1973). The prediction of systematic and specific risk in common stocks. Journal of Financial and Quantitative Analysis, 8(2), 317–333.
- Tipping, M. E., & Bishop, C. M. (1999). Probabilistic principal component analysis. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 61(3), 611–622.