I learned very early the difference between knowing the name of something and knowing something.

Richard Feynman

Bienaymé's Identity

In probability theory, Bienaymé's identity is a formula for the variance of random variables which are themselves sums of random variables. I provide a little intuition for the identity and then prove it.
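
For reference, the identity in its usual statement, for random variables X_1, ..., X_n with finite variances (standard notation, not necessarily the post's):

\[
\operatorname{Var}\!\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \operatorname{Var}(X_i) + \sum_{i \neq j} \operatorname{Cov}(X_i, X_j),
\]

which reduces to the familiar additivity of variance when the variables are pairwise uncorrelated.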

Log-Normal Distribution

I derive some basic properties of the log-normal distribution.
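
For reference, two standard properties of a log-normal variable X = e^Y with Y ~ N(μ, σ²) (this parameterization is the conventional one, not necessarily the post's):

\[
\mathbb{E}[X] = e^{\mu + \sigma^2 / 2},
\qquad
\operatorname{Var}(X) = \left(e^{\sigma^2} - 1\right) e^{2\mu + \sigma^2}.
\]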

High-Dimensional Variance

A useful view of a covariance matrix is that it is a natural generalization of variance to higher dimensions. I explore this idea.
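
One standard way to see the analogy: for a random vector x in R^D with mean μ, the covariance matrix is

\[
\boldsymbol{\Sigma} = \mathbb{E}\!\left[(\mathbf{x} - \boldsymbol{\mu})(\mathbf{x} - \boldsymbol{\mu})^{\top}\right],
\]

which collapses to the scalar variance when D = 1; the diagonal holds each coordinate's variance and the off-diagonal entries hold the covariances.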

Correlation and Hedging

A mean–variance optimizer will hedge correlated assets. I explain why and then work through a simple example.

The Greeks

In finance, the "Greeks" refer to the partial derivatives of an option pricing model with respect to its inputs. They are important for understanding how an option's price may change. I discuss the Black–Scholes Greeks in detail.
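
For orientation, the most common Greeks for an option value V(S, t, σ, r), in standard notation:

\[
\Delta = \frac{\partial V}{\partial S}, \qquad
\Gamma = \frac{\partial^2 V}{\partial S^2}, \qquad
\text{vega} = \frac{\partial V}{\partial \sigma}, \qquad
\Theta = \frac{\partial V}{\partial t}, \qquad
\rho = \frac{\partial V}{\partial r}.
\]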

Deriving the VIX

The VIX is a benchmark for market-implied volatility. It is computed from a weighted average of variance swaps. I first derive the fair strike for a variance swap and then discuss the VIX's approximation of this formula.

Estimating ATM Option Prices

I work through a well-known approximation of the Black–Scholes price of at-the-money (ATM) options.
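
The approximation in question is presumably the Brenner–Subrahmanyam formula: for an at-the-money option with spot S, volatility σ, and time to expiry T, and ignoring rates and dividends,

\[
C_{\text{ATM}} \approx S \sigma \sqrt{\frac{T}{2\pi}} \approx 0.4\, S \sigma \sqrt{T}.
\]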

Proof the Binomial Model Converges to Black–Scholes

The binomial options-pricing model converges to Black–Scholes as the number of time steps over a fixed horizon goes to infinity. I present Chi-Cheng Hsia's 1983 proof of this result.

Binomial Options-Pricing Model

I present a simple yet useful model for pricing European-style options, called the binomial options-pricing model. It provides good intuition into pricing options without any advanced mathematics.

fortunai

I describe the process of using ChatGPT (GPT-3.5) to write a program that uses OpenAI's API. The program generates LLM fortunes à la the Unix command 'fortune'.

Problem Solving with Dimensional Analysis

Dimensional analysis is the technique of analyzing relationships through their base quantities. I demonstrate the power of this approach by approximating a Gaussian integral without calculus.

Estimating Square Roots in Your Head

I explore an ancient algorithm, sometimes called Heron's method, for estimating square roots without a calculator.
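
A minimal sketch of the iteration in Python, under the usual formulation where each guess is replaced by the average of the guess and n divided by the guess; the function and variable names are illustrative:

```python
def heron_sqrt(n, guess=None, iterations=5):
    """Estimate sqrt(n) with Heron's (Babylonian) method.

    Each step replaces the current guess x with the average of x and n / x,
    which converges quadratically to sqrt(n) for any positive starting guess.
    """
    x = guess if guess is not None else n / 2.0  # any positive seed works
    for _ in range(iterations):
        x = 0.5 * (x + n / x)
    return x

print(heron_sqrt(10))  # ~3.162278, vs. the true value 3.16227766...
```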

Carr–Madan Formula

In the options-pricing literature, the Carr–Madan formula equates a derivative's nonlinear payoff function with a portfolio of options. I describe and prove this relationship.
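
In its usual statement, for a twice-differentiable payoff f and any reference point κ ≥ 0, the terminal payoff decomposes as

\[
f(S_T) = f(\kappa) + f'(\kappa)(S_T - \kappa)
+ \int_0^{\kappa} f''(K)\,(K - S_T)^{+}\,dK
+ \int_{\kappa}^{\infty} f''(K)\,(S_T - K)^{+}\,dK,
\]

that is, a bond, a forward, and a continuum of out-of-the-money puts and calls.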

One-Period Binomial Model

The binomial options-pricing model is a numerical method for valuing options. I explore this model over a single time period and focus on two key ideas: the no-arbitrage condition and risk-neutral pricing.
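
As a pointer to where the post is headed: with up and down factors u and d, risk-free rate r, and step size Δt, the no-arbitrage condition is d < e^{rΔt} < u, and the risk-neutral probability and one-period option value are

\[
q = \frac{e^{r \Delta t} - d}{u - d},
\qquad
V_0 = e^{-r \Delta t}\left[q V_u + (1 - q) V_d\right],
\]

where V_u and V_d are the option's payoffs in the up and down states.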

Principal Component Analysis

Principal component analysis (PCA) is a simple, fast, and elegant linear method for data analysis. I explore PCA in detail, first with pictures and intuition, then with linear algebra and detailed derivations, and finally with code.
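
A minimal sketch of the linear-algebra core in Python (NumPy), under the standard formulation: center the data, eigendecompose the sample covariance matrix, and project onto the leading eigenvectors. Names here are illustrative, not the post's code.

```python
import numpy as np

def pca(X, n_components):
    """Project the rows of X (shape [n_samples, n_features]) onto the top
    principal components, via an eigendecomposition of the sample covariance."""
    X_centered = X - X.mean(axis=0)                # center each feature
    cov = np.cov(X_centered, rowvar=False)         # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]              # re-sort descending by variance
    components = eigvecs[:, order[:n_components]]  # top eigenvectors as columns
    return X_centered @ components                 # coordinates in the new basis

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
print(pca(X, n_components=2).shape)  # (100, 2)
```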

Matrices as Functions, Matrices as Data

I discuss two views of matrices: matrices as linear functions and matrices as data. The second view is particularly useful in understanding dimension reduction methods.

Scaling Factors for Hidden Markov Models

Inference for hidden Markov models (HMMs) is numerically unstable. A standard approach to resolving this instability is to use scaling factors. I discuss this idea in detail.

Weighted Least Squares

Weighted least squares (WLS) is a generalization of ordinary least squares in which each observation is assigned a weight, which scales the squared residual error. I discuss WLS and then derive its estimator in detail.
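
For reference, with design matrix X, response y, and diagonal weight matrix W holding the per-observation weights (standard notation, not necessarily the post's), the WLS estimator is

\[
\hat{\boldsymbol{\beta}}_{\text{WLS}} = \left(\mathbf{X}^{\top} \mathbf{W} \mathbf{X}\right)^{-1} \mathbf{X}^{\top} \mathbf{W} \mathbf{y}.
\]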

The Sharpe Ratio

The Sharpe ratio measures a financial strategy's performance as the ratio of its reward to its variability. I discuss this metric in detail, particularly its relationship to the information ratio and t-statistics.
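
In its standard form, with portfolio return R_p and risk-free rate R_f, the ratio is the mean excess return over its standard deviation,

\[
\text{SR} = \frac{\mathbb{E}[R_p - R_f]}{\sqrt{\operatorname{Var}(R_p - R_f)}};
\]

the information ratio has the same shape with a benchmark return in place of R_f.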

How Dangerous Is Biking in New York?

I estimate my probability of serious injury or death from bike commuting to work in New York, using public data from the city's Department of Transportation.

Moving Averages

I discuss moving or rolling averages, which are algorithms to compute means over different subsets of sequential data.
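
A minimal sketch of the simplest variant, a fixed-window simple moving average, in Python; the window size and names are illustrative:

```python
def simple_moving_average(values, window):
    """Return the mean of each trailing `window`-length slice of `values`.

    The i-th output averages values[i - window + 1 : i + 1], so the first
    (window - 1) positions produce no output.
    """
    averages = []
    running_sum = 0.0
    for i, v in enumerate(values):
        running_sum += v
        if i >= window:
            running_sum -= values[i - window]  # drop the value leaving the window
        if i >= window - 1:
            averages.append(running_sum / window)
    return averages

print(simple_moving_average([1, 2, 3, 4, 5], window=3))  # [2.0, 3.0, 4.0]
```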

Square Root of Time Rule

A common heuristic for time-aggregating volatility is the square root of time rule. I discuss the big idea for this rule and then provide the mathematical assumptions underpinning it.
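
Concretely, assuming independent, identically distributed returns, the rule scales a one-period volatility σ_1 to an n-period volatility as

\[
\sigma_n = \sigma_1 \sqrt{n},
\]

for example annualizing a daily volatility with a factor of √252 trading days.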

Exponential Decay

Many phenomena can be modeled as exponential decay. I discuss this model in detail, focusing on natural exponential decay (base e) and various useful properties.
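
For reference, natural exponential decay of a quantity N with decay constant λ > 0 is

\[
N(t) = N_0\, e^{-\lambda t},
\qquad
t_{1/2} = \frac{\ln 2}{\lambda},
\]

where N_0 is the initial value and t_{1/2} is the half-life.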

Factor Modeling in Finance

I discuss multi-factor modeling, which generalizes many early financial models into a common prediction and risk framework.

Research and Adventure

During my PhD, I went hiking alone in a remote region of Iceland. Over the years, I've come to view this trip as analogous to the PhD process. Graduate school was hard, but on the warm days, the views were spectacular.

Conjugate Gradient Descent

Conjugate gradient descent (CGD) is an iterative algorithm for minimizing quadratic functions. CGD uses a kind of orthogonality (conjugacy) to efficiently search for the minimum. I present CGD by building it up from gradient descent.
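
A minimal sketch in Python (NumPy) of the standard conjugate gradient iteration for minimizing (1/2) x^T A x - b^T x with A symmetric positive definite, equivalently solving Ax = b; the tolerance and names are illustrative, not the post's code.

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=None):
    """Minimize 0.5 * x.T @ A @ x - b.T @ x for symmetric positive definite A,
    i.e. solve A x = b, using conjugate (A-orthogonal) search directions."""
    n = b.shape[0]
    x = np.zeros(n) if x0 is None else x0.astype(float)
    r = b - A @ x              # residual, also the negative gradient
    p = r.copy()               # first direction is the steepest-descent direction
    rs_old = r @ r
    for _ in range(max_iter or n):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)      # exact line search along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p  # keep new direction A-conjugate to old ones
        rs_old = rs_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b))  # ~[0.0909, 0.6364]
```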

The Capital Asset Pricing Model

In finance, the capital asset pricing model (CAPM) was the first theory to measure systematic risk. The CAPM argues that there is a single type of risk, market risk. I derive the CAPM from the mean–variance framework of modern portfolio theory.
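
The model's central equation, in standard notation: an asset's expected excess return is proportional to its market beta,

\[
\mathbb{E}[R_i] - R_f = \beta_i \left(\mathbb{E}[R_m] - R_f\right),
\qquad
\beta_i = \frac{\operatorname{Cov}(R_i, R_m)}{\operatorname{Var}(R_m)},
\]

where R_m is the market return and R_f the risk-free rate.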

Generalized Least Squares

I discuss generalized least squares (GLS), which extends ordinary least squares by allowing errors that are heteroscedastic or correlated. I prove some basic properties of GLS, particularly that it is the best linear unbiased estimator, and work through a complete example.
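
For reference, with error covariance matrix Ω assumed known (up to scale), the GLS estimator is

\[
\hat{\boldsymbol{\beta}}_{\text{GLS}} = \left(\mathbf{X}^{\top} \boldsymbol{\Omega}^{-1} \mathbf{X}\right)^{-1} \mathbf{X}^{\top} \boldsymbol{\Omega}^{-1} \mathbf{y},
\]

which reduces to WLS when Ω is diagonal and to OLS when Ω is proportional to the identity.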

Understanding Positive Definite Matrices

I discuss a geometric interpretation of positive definite matrices and how this relates to various properties of them, such as positive eigenvalues, positive determinants, and decomposability. I also discuss their importance in quadratic programming.
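
The defining property, for reference: a symmetric matrix A is positive definite when

\[
\mathbf{x}^{\top} \mathbf{A}\, \mathbf{x} > 0 \quad \text{for all } \mathbf{x} \neq \mathbf{0},
\]

which is equivalent to all eigenvalues of A being positive and to the existence of a Cholesky factorization A = L L^T with L lower-triangular and invertible.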

The Gauss–Markov Theorem

I discuss and prove the Gauss–Markov theorem, which states that under certain conditions, the least squares estimator is the minimum-variance linear unbiased estimator of the model parameters.
