Bienaymé's Identity

In probability theory, Bienaymé's identity is a formula for the variance of a random variable that is itself a sum of random variables. I provide a little intuition for the identity and then prove it.

Let $B_n$ denote a random variable which is itself the sum of $n$ random variables,

$$
B_n := \sum_{i=1}^n X_i. \tag{1}
$$

Bienaymé’s identity, named after the French statistician Irénée-Jules Bienaymé, states that the variance of $B_n$ is

$$
\mathbb{V}[B_n] = \sum_{i=1}^n \mathbb{V}[X_i] + \sum_{i,j=1,\, i \neq j}^n \text{cov}\left[X_i, X_j\right]. \tag{2}
$$

If we think of variance as simply a special case of covariance, since

$$
\mathbb{V}[X_i] = \text{cov}\left[X_i, X_i\right], \tag{3}
$$

then we can write Equation 2 as

$$
\mathbb{V}[B_n] = \sum_{i,j=1}^n \text{cov}\left[X_i, X_j\right]. \tag{4}
$$

In either case, we can visualize this identity as summing the elements of the covariance matrix of the vector $[X_1, \dots, X_n]$ (Figure 1). In Equation 2, we break the sum into the diagonal and off-diagonal elements, while in Equation 4, we denote the diagonal elements with the same notation as the off-diagonal elements.

Figure 1. Visualization of Bienaymé's identity. The matrix represents an $n \times n$ covariance matrix of the vector $[X_1, \dots, X_n]$. The variances or diagonal elements (gold) are captured in the middle sum in Equation 2, while the covariances or off-diagonal elements (gray) are captured in the right sum in Equation 2.
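As a quick numerical sanity check (a sketch of my own, not part of the identity itself), we can sample a correlated random vector and confirm that the variance of the sum matches the sum of all entries of the covariance matrix, as in Equation 4. The covariance matrix `Sigma` and the sample size below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# An arbitrary positive-definite 3x3 covariance matrix (illustrative choice).
Sigma = np.array([
    [ 2.0, 0.6, -0.3],
    [ 0.6, 1.0,  0.4],
    [-0.3, 0.4,  1.5],
])

# Draw many samples of the vector [X_1, X_2, X_3].
X = rng.multivariate_normal(mean=np.zeros(3), cov=Sigma, size=1_000_000)

# Variance of the sum B_n = X_1 + ... + X_n, estimated from the samples.
var_of_sum = X.sum(axis=1).var()

# Sum of all entries of the empirical covariance matrix (Equation 4).
sum_of_cov = np.cov(X, rowvar=False).sum()

print(var_of_sum, sum_of_cov)  # Both close to Sigma.sum() = 5.9.
```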

If we write the covariance in terms of Pearson’s correlation coefficient $\rho$, then the identity in Equation 4 becomes

$$
\mathbb{V}[B_n] = \sum_{i,j=1}^n \rho_{ij} \sqrt{\mathbb{V}[X_i]\, \mathbb{V}[X_j]}. \tag{5}
$$
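Continuing the sketch above (again just an illustration), we can rebuild each covariance from the empirical correlation coefficients and variances and confirm that Equation 5 gives the same total:

```python
# Continuing from the previous sketch: rebuild the covariances from the
# empirical correlation matrix (rho_ij) and the variances (Equation 5).
rho = np.corrcoef(X, rowvar=False)
variances = X.var(axis=0)
eq5_total = (rho * np.sqrt(np.outer(variances, variances))).sum()

print(eq5_total)  # Matches var_of_sum and sum_of_cov above.
```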

An important special case of Bienaymé’s identity is when the random variables are independent or uncorrelated ($\rho_{ij} = 0$ for $i \neq j$). In either case, the covariance between any two distinct random variables is zero,

$$
\text{cov}\left[X_i, X_j\right] = 0, \quad i \neq j. \tag{6}
$$

And so clearly, we have

$$
\mathbb{V}[B_n] = \sum_{i=1}^n \mathbb{V}[X_i]. \tag{7}
$$

This leads to rules such as the square root of time rule or adages such as “variances add”.
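As a rough illustration of this independent case (a sketch with arbitrary numbers, not taken from any particular application), summing $n$ i.i.d. steps with per-step variance $\sigma^2$ gives a total variance of $n\sigma^2$, so the standard deviation of the sum grows like $\sqrt{n}\,\sigma$:

```python
import numpy as np

rng = np.random.default_rng(1)

# n i.i.d. steps with per-step standard deviation sigma (illustrative values).
n_steps, sigma = 250, 0.01
steps = rng.normal(0.0, sigma, size=(100_000, n_steps))
totals = steps.sum(axis=1)

# Variance of the sum is n * sigma^2 (Equation 7), so the standard deviation
# scales like sqrt(n) * sigma -- the square root of time rule.
print(totals.var(), n_steps * sigma**2)        # both close to 0.025
print(totals.std(), np.sqrt(n_steps) * sigma)  # both close to 0.158
```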

With a little thought, Bienaymé’s identity is fairly intuitive. If we add random variables together, we are compounding uncertainty. However, that compounding is smaller when the variables are independent or uncorrelated, since then the covariance terms vanish. In the worst case, all the random variables are perfectly correlated ($\rho_{ij} = 1$), which maximizes the variance of $B_n$. In the best case, all the random variables are perfectly anti-correlated ($\rho_{ij} = -1$), which minimizes the variance of $B_n$.

Finally, proving Bienaymé’s identity really amounts to understanding how to square a sum of terms. In general, it is true that

$$
\left( a_1 + a_2 + \dots + a_k \right)^2 = \sum_{i=1}^k a_i^2 + \sum_{i,j=1,\, i\neq j}^k a_i a_j. \tag{8}
$$

We can see that this is true with a simple proof:

$$
\begin{aligned}
\left( \sum_{i=1}^k a_i \right)^2 &= \left( \sum_{i=1}^k a_i \right) \left( \sum_{j=1}^k a_j \right) \\
&= \sum_{i=1}^k \sum_{j=1}^k a_i a_j \\
&= \sum_{i=1}^k a_i^2 + \sum_{i,j=1,\, i\neq j}^k a_i a_j.
\end{aligned} \tag{9}
$$
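For instance, with $k = 3$ and $(a_1, a_2, a_3) = (1, 2, 3)$, the left-hand side is $(1+2+3)^2 = 36$, the squared terms contribute $1 + 4 + 9 = 14$, and the ordered pairs with $i \neq j$ contribute $2(1 \cdot 2 + 1 \cdot 3 + 2 \cdot 3) = 22$, which indeed total $36$. A minimal numerical check of Equation 8 for an arbitrary vector (illustrative values only):

```python
import itertools
import numpy as np

# Check Equation 8: the square of a sum equals the sum of squares plus
# the sum over ordered pairs (i, j) with i != j.
a = np.array([1.0, -2.0, 3.5, 0.25])
lhs = a.sum() ** 2
rhs = (a ** 2).sum() + sum(a[i] * a[j]
                           for i, j in itertools.permutations(range(len(a)), 2))
print(lhs, rhs)  # Both 7.5625.
```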

We can then apply Equation 8 twice and consolidate terms to derive Bienaymé’s identity,

$$
\begin{aligned}
\mathbb{V}[B_n] &= \mathbb{E}\left[(B_n - \mathbb{E}\left[B_n\right])^2\right] \\
&= \mathbb{E}\left[B_n^2\right] - \mathbb{E}\left[B_n\right]^2 \\
&= \mathbb{E}\left[(X_1 + \dots + X_n)^2\right] - \mathbb{E}\left[X_1 + \dots + X_n \right]^2 \\
&= \mathbb{E} \left[ \sum_{i=1}^n X_i^2 + \sum_{i,j=1,\, i\neq j}^n X_i X_j \right] - \left[ \sum_{i=1}^n \mathbb{E}[X_i]^2 + \sum_{i,j=1,\, i\neq j}^n \mathbb{E}[X_i] \mathbb{E}[X_j] \right] \\
&= \sum_{i=1}^n \mathbb{E} \left[ X_i^2 \right] + \sum_{i,j=1,\, i\neq j}^n \mathbb{E} \left[ X_i X_j \right] - \sum_{i=1}^n \mathbb{E}[X_i]^2 - \sum_{i,j=1,\, i\neq j}^n \mathbb{E}[X_i] \mathbb{E}[X_j] \\
&= \sum_{i=1}^n \left[ \mathbb{E} \left[ X_i^2 \right] - \mathbb{E}[X_i]^2 \right] + \sum_{i,j=1,\, i\neq j}^n \left[ \mathbb{E} \left[ X_i X_j \right] - \mathbb{E}[X_i] \mathbb{E}[X_j] \right] \\
&= \sum_{i=1}^n \mathbb{V}[X_i] + \sum_{i,j=1,\, i\neq j}^n \text{cov}\left[X_i, X_j\right],
\end{aligned} \tag{10}
$$

as desired.