In the theory of stochastic processes, the Karhunen-Loève theorem (named after Kari Karhunen and Michel Loève) states that a centered stochastic process {Xt}t (where centered means that the expectations E(Xt) are defined and equal to 0 for all t) satisfying a technical continuity condition, admits a decomposition
- <math> \mathbf{X}_t = \sum_{k=1}^\infty \mathbf{Z}_k e_k(t). </math>
where Zk are pairwise uncorrelated random variables. Moreover, if the process is Gaussian, then the random variables Zk are Gaussian and stochastically independent. This result generalizes the Karhunen-Loève transform. An important example of a centered real stochastic process on [0, 1] is Brownian motion, and the Karhunen-Loève theorem can be used to provide a canonical orthogonal representation for it. The above expansion into uncorrelated random variables is also known as the Karhunen-Loève expansion.
Detailed formulation
We will formulate the result in terms of real random variables, although it is applicable without change to vector-valued random variables.
If X and Y are random variables, the inner product is defined by
- <math> \langle \mathbf{X}|\mathbf{Y} \rangle = \operatorname{E}(\mathbf{X}\mathbf{Y}) </math>
This is defined if both X and Y have finite second moments, i.e., are square-integrable. Note that the inner product is related to covariance and correlation; in particular, for random variables of mean zero, covariance and inner product coincide. If {Xt}t is a centered process, the covariance function of {Xt}t is
- <math> \operatorname{Cov}_{\mathbf{X}}(t,s) = \langle \mathbf{X}_t | \mathbf{X}_s \rangle = \operatorname{Cov}( \mathbf{X}_t,\mathbf{X}_s). </math>
Note that if {Xt}t is centered and t1 ≤ t2 ≤ ... ≤ tN are points in [a, b], then
- <math> \sum_{k,\ell} \operatorname{Cov}_{\mathbf{X}}(t_k,t_\ell) = \operatorname{Var}\left(\sum_{k=1}^N \mathbf{X}_{t_k}\right) \geq 0. </math>
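This positivity can be checked numerically. The sketch below is an illustration only, not part of the theorem; it assumes NumPy and uses min(t, s) as an illustrative covariance kernel (the kernel of Brownian motion, treated in detail further on).

```python
import numpy as np

# Build the covariance matrix Cov_X(t_k, t_l) on a grid of points in [a, b],
# using min(t, s) as an illustrative covariance kernel.
t = np.linspace(0.1, 1.0, 10)          # t_1 <= ... <= t_N in [0, 1]
C = np.minimum.outer(t, t)

# The double sum equals Var(X_{t_1} + ... + X_{t_N}), hence is non-negative;
# equivalently, the kernel matrix is positive semidefinite.
total = C.sum()
eigenvalues = np.linalg.eigvalsh(C)
print(total, eigenvalues.min())
```

Any jointly continuous covariance function would serve here; the double sum is always the quadratic form of the kernel matrix against the all-ones vector.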
Theorem. Consider a centered stochastic process {Xt}t indexed by t in the interval [a, b] with covariance function CovX. Suppose the covariance function CovX(t,s) is jointly continuous in t, s. Then CovX can be regarded as a positive definite kernel and so by Mercer's theorem, the corresponding integral operator T on L2[a,b] (relative to Lebesgue measure on [a,b]) has an orthonormal basis of eigenvectors. Let {ei}i be the eigenvectors of T corresponding to non-zero eigenvalues and
- <math> \mathbf{Z}_i = \int_a^b \mathbf{X}_t e_i(t) dt. </math>
Then Zi are centered orthogonal random variables and
- <math> \mathbf{X}_t = \sum_{i=1}^\infty e_i(t) \mathbf{Z}_i </math>
where the convergence is in the mean and is uniform in t. Moreover
- <math> \operatorname{Var}(\mathbf{Z}_i) = \operatorname{E}(\mathbf{Z}_i^2) = \lambda_i. </math>
where λi is the eigenvalue corresponding to the eigenvector ei.
In the statement of the theorem, the integral defining Zi can be defined as the limit in the mean of Cauchy sums of random variables:
- <math> \sum_{k=0}^{\ell-1} \mathbf{X}_{\xi_k} e_i(\xi_k) (t_{k+1} - t_k), </math>
where
- <math> a = t_0 \leq \xi_0 \leq t_1 \leq \cdots \leq \xi_{\ell-1} \leq t_\ell = b. </math>
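A discretized version of the whole construction can be sketched numerically. The code below is an illustration assuming NumPy; the Brownian kernel min(t, s), the grid size, and the path count are arbitrary choices. It discretizes the integral operator T, forms the Zi as Riemann sums of the above type, and checks empirically that Var(Zi) ≈ λi and that the Zi are nearly uncorrelated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Discretized sketch of the theorem, using the Brownian kernel min(t, s)
# as a concrete jointly continuous covariance (an illustrative choice).
n, n_paths = 200, 5000
t = np.arange(1, n + 1) / n            # grid on (0, 1]
dt = 1.0 / n
C = np.minimum.outer(t, t)             # Cov_X(t_k, t_l) on the grid

# Discretize the integral operator: (T e)(t) = int Cov(t, s) e(s) ds
# becomes the matrix eigenproblem (C * dt) v = lambda v.
lam, V = np.linalg.eigh(C * dt)
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]
e = V / np.sqrt(dt)                    # eigenfunctions, normalized in L2[0, 1]

# Sample centered Gaussian paths X ~ N(0, C), then form the Cauchy/Riemann
# sums Z_i = sum_k X_{t_k} e_i(t_k) dt approximating the defining integral.
L = np.linalg.cholesky(C + 1e-12 * np.eye(n))
X = rng.standard_normal((n_paths, n)) @ L.T
Z = X @ e * dt

# Empirically, Var(Z_i) matches the eigenvalue lambda_i and the Z_i are
# (nearly) uncorrelated, as the theorem asserts.
emp_var = Z[:, :3].var(axis=0)
corr_12 = np.corrcoef(Z[:, 0], Z[:, 1])[0, 1]
print(emp_var, lam[:3], corr_12)
```

The matrix eigenproblem here is the Nyström discretization of the Mercer eigenproblem; as the grid is refined, its leading eigenpairs converge to those of T.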
Since the limit in the mean of jointly Gaussian random variables is jointly Gaussian, and centered jointly Gaussian random variables are independent if and only if they are orthogonal, we can also conclude:
Theorem. The variables Zi have a joint Gaussian distribution and are stochastically independent if the original process {Xt}t is Gaussian.
In the Gaussian case, since the variables Zi are independent, we can say more:
- <math> \lim_{N \rightarrow \infty} \sum_{i=1}^N e_i(t) \mathbf{Z}_i(\omega) = \mathbf{X}_t(\omega) </math>
almost surely.
Note that by generalizations of Mercer's theorem we can replace the interval [a, b] with other compact spaces C and Lebesgue measure on [a, b] with a Borel measure whose support is C.
Brownian motion
There are numerous equivalent characterizations of Brownian motion. Here we regard it as the centered Gaussian process {Bt} with covariance function
- <math> \operatorname{Cov}_{\mathbf{B}}(t,s) = \min (s,t). </math>
The eigenvectors of the covariance kernel are easily determined. These are
- <math> e_k(t) = \sqrt{2} \sin\left(\left(k - \frac{1}{2}\right) \pi t\right) </math>
and the corresponding eigenvalues are
- <math> \lambda_k = \frac{4}{(2 k -1)^2 \pi^2}. </math>
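These eigenpairs can be verified by quadrature. The sketch below (an illustration assuming NumPy; the grid size is an arbitrary choice) checks that ∫₀¹ min(s, t) ek(s) ds ≈ λk ek(t) for the first few k.

```python
import numpy as np

# Discretized check that e_k(t) = sqrt(2) sin((k - 1/2) pi t) and
# lambda_k = 4 / ((2k - 1)^2 pi^2) solve the eigenvalue problem
#   (T e_k)(t) = integral_0^1 min(s, t) e_k(s) ds = lambda_k e_k(t).
n = 2000
s = (np.arange(n) + 0.5) / n                 # midpoint grid on [0, 1]
ds = 1.0 / n
K = np.minimum.outer(s, s)                   # kernel matrix min(s, t)

errors = []
for k in (1, 2, 3):
    e_k = np.sqrt(2) * np.sin((k - 0.5) * np.pi * s)
    lam_k = 4 / ((2 * k - 1) ** 2 * np.pi ** 2)
    lhs = K @ e_k * ds                       # midpoint-rule quadrature for (T e_k)(t)
    errors.append(np.max(np.abs(lhs - lam_k * e_k)))

print(max(errors))                           # small discretization error
```

The same eigenpairs can of course be derived exactly by differentiating the integral equation twice, which reduces it to e'' = -e/λ with boundary conditions e(0) = 0 and e'(1) = 0.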
This gives the following representation of Brownian motion:
Theorem. There is a sequence {Wi}i of independent Gaussian random variables with mean zero and variance 1 such that
- <math> \mathbf{B}_t = \sqrt{2} \sum_{k=1}^\infty \mathbf{W}_k \frac{\sin\left(\left(k - \frac{1}{2}\right) \pi t\right)}{ \left(k - \frac{1}{2}\right) \pi}. </math>
Convergence is uniform in t and in the L2 norm, that is
- <math> \operatorname{E}\left(\mathbf{B}_t - \sqrt{2} \sum_{k=1}^n \mathbf{W}_k \frac{\sin\left(\left(k - \frac{1}{2}\right) \pi t\right)}{ \left(k - \frac{1}{2}\right) \pi} \right)^2 \rightarrow 0 </math>
uniformly in t.
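Truncating this series gives a simple way to simulate Brownian motion approximately. The sketch below (an illustration assuming NumPy; the truncation level and path count are arbitrary choices) samples paths on a grid and spot-checks the covariance min(s, t).

```python
import numpy as np

rng = np.random.default_rng(1)

# Approximate Brownian motion on [0, 1] by truncating the Karhunen-Loeve
# series: B_t ~ sum_{k=1}^{n_terms} W_k sqrt(2) sin((k-1/2) pi t) / ((k-1/2) pi).
n_terms, n_paths = 500, 20000
t = np.linspace(0.0, 1.0, 101)
k = np.arange(1, n_terms + 1)[:, None]          # column of series indices
basis = np.sqrt(2) * np.sin((k - 0.5) * np.pi * t) / ((k - 0.5) * np.pi)

W = rng.standard_normal((n_paths, n_terms))     # iid N(0, 1) coefficients
B = W @ basis                                   # approximate paths on the grid

# Spot-check the covariance min(s, t): Var(B_1) ~ 1 and Cov(B_0.5, B_1) ~ 0.5.
var_end = B[:, -1].var()
cov_mid_end = np.mean(B[:, 50] * B[:, -1])
print(var_end, cov_mid_end)
```

Note that the truncation error at t = 1 is the eigenvalue tail sum, which decays only like 1/n; in practice this spectral representation converges more slowly pointwise than, say, simulating via independent increments.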
References
- I. Guikhman, A. Skorokhod, ''Introduction à la Théorie des Processus Aléatoires'', Éditions MIR, 1977
- B. Simon, ''Functional Integration and Quantum Physics'', Academic Press, 1979