Law of total variance
In probability theory, the law of total variance states that if X and Y are random variables on the same probability space, and the variance of X is finite, then
- <math>\operatorname{var}(X)=\operatorname{E}(\operatorname{var}(X\mid Y))+\operatorname{var}(\operatorname{E}(X\mid Y)).\,</math>
In language perhaps better known to statisticians than to probabilists, the first term is the unexplained component of the variance; the second is the explained component of the variance.
The nomenclature in this article's title parallels the phrase law of total probability. Some writers on probability call this the "conditional variance formula" or use other names.
(The conditional expected value E(X | Y) is a random variable in its own right, whose value depends on the value of Y. The conditional expected value of X given the event Y = y is a function of y; this is where the conventional case-sensitive notation of probability theory, with capital letters for random variables and lower-case letters for their values, becomes important. If we write E(X | Y = y) = g(y), then the random variable E(X | Y) is just g(Y). Similar comments apply to the conditional variance.)
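For a concrete illustration, let Y be the outcome of a fair coin toss, taking the values 0 and 1 with equal probability, and suppose that given Y = 0 the random variable X is equally likely to be 0 or 2, while given Y = 1 it is equally likely to be 2 or 4. Then var(X | Y) = 1 whichever value Y takes, so the unexplained component is E(var(X | Y)) = 1, and E(X | Y) is equally likely to be 1 or 3, so the explained component is var(E(X | Y)) = 1. The law of total variance gives var(X) = 1 + 1 = 2, which agrees with the direct computation: X takes the values 0, 2, 4 with probabilities 1/4, 1/2, 1/4, so E(X) = 2, E(X²) = 6, and var(X) = 6 − 4 = 2.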
Proof
The law of total variance can be proved using the law of total expectation:
- <math>\operatorname{var}(X) = \operatorname{E}(X^2) - \operatorname{E}(X)^2\,</math>
- <math>{} = \operatorname{E}(\operatorname{E}(X^2\mid Y)) - \operatorname{E}(\operatorname{E}(X\mid Y))^2\,</math>
- <math>{} = \operatorname{E}(\operatorname{var}(X\mid Y) + \operatorname{E}(X\mid Y)^2) - \operatorname{E}(\operatorname{E}(X\mid Y))^2\,</math>
- <math>{} = \operatorname{E}(\operatorname{var}(X\mid Y)) + \operatorname{E}(\operatorname{E}(X\mid Y)^2) - \operatorname{E}(\operatorname{E}(X\mid Y))^2\,</math>
- <math>{} = \operatorname{E}(\operatorname{var}(X\mid Y)) + \operatorname{var}(\operatorname{E}(X\mid Y)).\,</math>
The second line applies the law of total expectation to both E(X²) and E(X); the third uses the definition of the conditional variance, E(X² | Y) = var(X | Y) + E(X | Y)²; and the last step recognizes the variance of E(X | Y).
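In the coin-toss example above, the steps of this proof can be followed numerically: E(X²) = 6 and E(X)² = 4; conditioning gives E(X² | Y) equal to 2 or 10 with equal probability, so E(E(X² | Y)) = 6; splitting this as E(var(X | Y)) + E(E(X | Y)²) = 1 + 5 and subtracting E(E(X | Y))² = 4 yields var(X) = 1 + (5 − 4) = 2, as before.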
The square of the correlation
In the case where the conditional expected value E(X | Y) is a linear function of Y, i.e., if we have
- <math>\operatorname{E}(X \mid Y)=a+bY,\,</math>
then the explained component of the variance divided by the total variance is just the square of the correlation between X and Y, i.e., in that case,
- <math>{\operatorname{var}(\operatorname{E}(X\mid Y)) \over \operatorname{var}(X)} = \operatorname{corr}(X,Y)^2.\,</math>
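This follows from the fact that cov(X, Y) = cov(E(X | Y), Y), so that when E(X | Y) = a + bY the coefficient is b = cov(X, Y)/var(Y), and therefore
- <math>\operatorname{var}(\operatorname{E}(X\mid Y)) = b^2\operatorname{var}(Y) = {\operatorname{cov}(X,Y)^2 \over \operatorname{var}(Y)};\,</math>
dividing by var(X) gives cov(X, Y)²/(var(X) var(Y)), which is the square of the correlation.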
Higher moments?
A similar law for the third central moment μ₃ says
- <math>\mu_3(X)=\operatorname{E}(\mu_3(X\mid Y))+\mu_3(\operatorname{E}(X\mid Y))+3\,\operatorname{cov}(\operatorname{E}(X\mid Y),\operatorname{var}(X\mid Y)).\,</math>
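For example, suppose Y takes the values 0 and 1 with equal probability, that X = 0 whenever Y = 0, and that X is equally likely to be 1 or 3 when Y = 1. Both conditional distributions are symmetric, so E(μ₃(X | Y)) = 0, and E(X | Y), taking the values 0 and 2 with equal probability, also has vanishing third central moment. However, var(X | Y) equals 0 or 1 according as Y equals 0 or 1, so cov(E(X | Y), var(X | Y)) = E(E(X | Y) var(X | Y)) − E(E(X | Y)) E(var(X | Y)) = 1 − 1·(1/2) = 1/2, and the formula gives μ₃(X) = 3·(1/2) = 3/2. Direct computation confirms this: X takes the values 0, 1, 3 with probabilities 1/2, 1/4, 1/4, so E(X) = 1 and μ₃(X) = (−1)³·(1/2) + 0³·(1/4) + 2³·(1/4) = 3/2.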
Generalizations for higher moments than the third are messy; for higher cumulants on the other hand, a simple and elegant form exists. See law of total cumulance.