Cholesky decomposition
|
In mathematics, the Cholesky decomposition, named after André-Louis Cholesky, is a matrix decomposition of a Hermitian positive-definite matrix into a lower triangular matrix and the conjugate transpose of the lower triangular matrix. The lower triangular matrix is the Cholesky triangle of the original, positive-definite matrix.
Any square matrix A can be written as the product of a lower triangular matrix L and an upper triangular matrix U; this is called the LU decomposition. However, if A is Hermitian and positive definite, we can choose the factors such that U is the conjugate transpose of L, and this is called the Cholesky decomposition. Both the LU and the Cholesky decomposition are used to solve systems of linear equations. When it is applicable, the Cholesky decomposition is twice as efficient as the LU decomposition.
Contents |
Definition
Let A be a Hermitian, positive-definite matrix of complex numbers, then A can be decomposed as
- <math>\mathbf{A} = \mathbf{L} \mathbf{L}^{*}<math>
with L a lower triangular matrix with positive diagonal entries, and L* the conjugate transpose of L.
Notes
The Cholesky decomposition is unique: given a Hermitian, positive-definite matrix A, there is only one lower triangular matrix L with positive diagonal entries such that A = LL*. The converse is also true: if A can be written as LL*, with L lower triangular with positive diagonal entries, then A is Hermitian and positive definite.
If the matrix A has only real valued entries, the matrix L is also real. Hence, the conjugate transpose coincides with the transpose and the decomposition simplifies to
- <math>\mathbf{A} = \mathbf{L} \mathbf{L}^{T}. <math>
Applications
The Cholesky decomposition is mainly used for the numerical solution of linear equations Ax = b. If A is Hermitian and positive definite, then we can solve Ax = b by first computing the Cholesky decomposition A = LL*, then solving Ly = b for y, and finally solving L*x = y for x.
Systems of the form Ax = b with A Hermitian and positive definite arise quite often in applications. For instance, the normal equations in linear least squares problems are of this form. It may also happen that matrix A comes from an energy functional which must be positive from physical considerations; this happens frequently in the numerical solution of partial differential equations.
The Cholesky decomposition is commonly used in the Monte Carlo method for simulating systems with multiple correlated variables: The matrix of inter-variable correlations is decomposed, to give the lower-triangular L. Applying this to a vector of uncorrelated simulated shocks, u, produces a shock vector Lu with the covariance properties of the system being modelled.
Computing the Cholesky decomposition
There are various methods for calculating the Cholesky decomposition. The algorithms described below all involve n3/3 flops, where n is the size of the matrix A. Hence, they are twice as cheap as the LU decomposition, which uses 2n3/3 flops.
Which of the algorithms below is faster depends on the details on the implementation. Generally, the first algorithm will be slightly slower because it accesses the data in a more irregular manner.
The Cholesky algorithm
The Cholesky algorithm, used to calculate the decomposition matrix L, is a modified version of the Gauss algorithm.
The recursive algorithm starts with i := 1 and
- <math>\mathbf{A}^{(1)} := \mathbf{A}.<math>
At step i, the matrix A(i) has the following form:
- <math>\mathbf{A}^{(i)}
= \begin{pmatrix} \mathbf{I}_{i-1} & 0 & 0 \\ 0 & a_{i,i} & \mathbf{b}_{i}^{*} \\ 0 & \mathbf{b}_{i} & \mathbf{B}^{(i)} \end{pmatrix}, <math> where Ii−1 denotes the identity matrix of dimension i − 1.
If we now define the matrix Li by
- <math>\mathbf{L}_{i}
- =
\begin{pmatrix} \mathbf{I}_{i-1} & 0 & 0 \\ 0 & \frac{1}{\sqrt{a_{i,i}}} & 0 \\ 0 & - \frac{1}{a_{i,i}} \mathbf{b}_{i} & \mathbf{I}_{n-i} \end{pmatrix}, <math> then we can write A(i) as
- <math>
\mathbf{A}^{(i)} = \mathbf{L}_{i}^{-1} \mathbf{A}^{(i+1)} (\mathbf{L}_{i}^{-1})^{*} <math> where
- <math>
\mathbf{A}^{(i+1)} = \begin{pmatrix} \mathbf{I}_{i-1} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \mathbf{B}^{(i)} - \frac{1}{a_{i,i}} \mathbf{b}_{i} \mathbf{b}_{i}^{*} \end{pmatrix}. <math> Note that bi bi* is an outer product, therefore this algorithm is called the outer product version in (Golub & Van Loan).
We repeat this for i from 1 to n. After n steps, we get A(n+1) = I. Hence, the lower triangular matrix L we are looking for is calculated as
- <math>\mathbf{L} := \mathbf{L}_{1} \mathbf{L}_{2} \dots \mathbf{L}_{n}.<math>
The Cholesky-Banachiewicz and Cholesky-Crout algorithms
If we write out the equation A = LL*, we obtain the following formula for the entries of L:
- <math> l_{i,j} = \frac{1}{l_{j,j}} \left( a_{i,j} - \sum_{k=1}^{j-1} l_{i,k} l_{j,k} \right), \qquad\mbox{for } i
- <math> l_{i,i} = \sqrt{ a_{i,i} - \sum_{k=1}^{i-1} l_{i,k}^2 }. <math>
The expression under the square root is always positive if A is real and positive-definite.
So we can compute the (i, j) entry if we know the entries to the left and above. The computation is usually arranged in either of the following orders.
- The Cholesky-Banachiewicz algorithm starts form the upper left corner of the matrix L and proceeds to calculate the matrix row by row.
- The Cholesky-Crout algorithm starts from the upper left corner of the matrix L and proceeds to calculate the matrix column by column.
Stability of the computation
Suppose that we want to solve a well-conditioned system of linear equations. If the LU decomposition is used, then the algorithm is unstable unless we use some sort of pivoting strategy. In the latter case, the error depends on the so-called growth factor of the matrix, which is usually (but not always) small.
Now, suppose that the Cholesky decomposition is applicable. As mentioned above, the algorithm will be twice as fast. Furthermore, no pivoting is necessary and the error will always be small. Specifically, if we want to solve Ax = b, and y denotes the computed solution, then y solves the disturbed system (A + E)y = b where
- <math> \|\mathbf{E}\|_2 \le c_n \varepsilon \|\mathbf{A}\|_2. <math>
Here, || ||2</sup> is the matrix 2-norm, cn is a small constant depending on n, and ε denotes the unit roundoff.
There is one small problem with the Cholesky decomposition. Note that we must compute square roots in order to find the Cholesky decomposition. If the matrix is real symmetric and positive definite, then the numbers under the square roots are always positive in exact arithmetic. Unfortunately, the numbers can become negative because of round-off errors, in which case the algorithm cannot continue. However, this can only happen if the matrix is very ill-conditioned.
See also
External links
- Chapter 2.9 of "Numerical Recipes in C" (http://www.library.cornell.edu/nr/bookcpdf/c2-9.pdf) gives information about implementation of Cholesky algorithm in C.
- Template:Planetmath reference
References
- Gene H. Golub and Charles F. Van Loan, Matrix computations (3rd ed.), Section 4.2, John Hopkins University Press. ISBN 0-8018-5414-8.de:Cholesky-Zerlegung