Edgeworth series
The Gram-Charlier A series and the Edgeworth series (the latter named in honor of Francis Ysidro Edgeworth) are series that approximate a probability distribution in terms of its cumulants.
Gram-Charlier A series
The key idea of these expansions is to write the characteristic function of the distribution whose probability density function F is to be approximated in terms of the characteristic function of a distribution with known and suitable properties, and then to recover F through the inverse Fourier transform.
Let f be the characteristic function of the distribution whose density function is F, and <math>\kappa_r</math> its cumulants. We expand in terms of a known distribution with probability density function Ψ, characteristic function ψ, and cumulants <math>\gamma_r</math>. The density Ψ is generally chosen to be that of the normal distribution, but other choices are possible as well. By the definition of the cumulants, we have the following formal identity:
- <math> f(t)=\exp\left[\sum_{r=1}^\infty(\kappa_r-\gamma_r)\frac{(it)^r}{r!}\right]\psi(t)\,.</math>
By the properties of the Fourier transform, <math>(it)^r\psi(t)</math> is the Fourier transform of <math>(-1)^r D^r \Psi(x)</math>, where D is the differential operator with respect to x. Thus, we find for F the formal expansion
- <math> F(x) = \exp\left[\sum_{r=1}^\infty(\kappa_r - \gamma_r)\frac{(-D)^r}{r!}\right]\Psi(x)\,.</math>
If Ψ is chosen as the normal density with mean and variance as given by F, that is, mean <math>\mu = \kappa_1</math> and variance <math>\sigma^2 = \kappa_2</math>, then the expansion becomes
- <math> F(x) = \exp\left[\sum_{r=3}^\infty\kappa_r\frac{(-D)^r}{r!}\right]\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]\,.</math>
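As a sketch of how this operator acts, one can keep only the terms linear in <math>\kappa_3</math> and <math>\kappa_4</math>; dropping products such as <math>\kappa_3^2</math> is a truncation choice made here for illustration, not part of the formal series. The symbol <math>He_r</math> for the probabilists' Hermite polynomial is notation introduced for this example only, and Ψ denotes the normal density with mean μ and variance <math>\sigma^2</math>.
- <math> \exp\left[\sum_{r=3}^\infty\kappa_r\frac{(-D)^r}{r!}\right]\Psi(x) \approx \left[1 - \frac{\kappa_3}{3!}D^3 + \frac{\kappa_4}{4!}D^4\right]\Psi(x)\,,</math>
- <math> D^r\Psi(x) = \frac{(-1)^r}{\sigma^r}\,He_r\!\left(\frac{x-\mu}{\sigma}\right)\Psi(x)\,, \qquad He_3(z) = z^3 - 3z\,, \quad He_4(z) = z^4 - 6z^2 + 3\,.</math>
Substituting the second line into the first turns each derivative into a polynomial factor multiplying Ψ, which is why the corrections appear as Hermite polynomials in the standardized variable.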
By expanding the exponential and collecting terms according to the order of the derivatives, we arrive at the Gram-Charlier A series. If we include only the first two correction terms to the normal distribution, we obtain
- <math> F(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]\left[1+\frac{\kappa_3}{\sigma^3}h_3\left(\frac{x-\mu}{\sigma}\right)+\frac{\kappa_4}{\sigma^4}h_4\left(\frac{x-\mu}{\sigma}\right)\right]\,,</math>
with <math>h_3(x) = (x^3 - 3x)/3!</math> and <math>h_4(x) = (x^4 - 6x^2 + 3)/4!</math> (these are Hermite polynomials, up to the factorial normalization). Note that this expression is not guaranteed to be positive, and is therefore not a valid probability density. The Gram-Charlier A series diverges in many cases of interest: it converges only if F(x) falls off faster than <math>\exp(-x^2/4)</math> at infinity (Cramér 1957). When it does not converge, the series is also not a true asymptotic expansion, because it is not possible to estimate the error of the expansion. Therefore, the Edgeworth series (see next section) is generally preferred over the Gram-Charlier A series.
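The loss of positivity can be seen numerically. The following is a minimal Python sketch of the two-term approximation, using the standard normal normalization 1/(σ√(2π)). The function names and the cumulant values (κ3 = 1.5, κ4 = 0 with μ = 0, σ = 1) are hypothetical choices for illustration and do not come from the references.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

def h3(z):
    """Third Hermite polynomial divided by 3!."""
    return (z**3 - 3.0 * z) / 6.0

def h4(z):
    """Fourth Hermite polynomial divided by 4!."""
    return (z**4 - 6.0 * z**2 + 3.0) / 24.0

def gram_charlier_a(x, mu, sigma, kappa3, kappa4):
    """Normal density times the first two Gram-Charlier A correction terms."""
    z = (x - mu) / sigma
    correction = 1.0 + kappa3 / sigma**3 * h3(z) + kappa4 / sigma**4 * h4(z)
    return normal_pdf(x, mu, sigma) * correction

# Illustrative cumulants (assumed values, chosen to make the effect visible).
value_center = gram_charlier_a(0.0, 0.0, 1.0, 1.5, 0.0)
value_tail = gram_charlier_a(-2.5, 0.0, 1.0, 1.5, 0.0)

print("approximation at x = 0:   ", value_center)
print("approximation at x = -2.5:", value_tail)  # negative for these inputs
```

With these inputs the bracketed correction factor at x = −2.5 is below zero, so the truncated series takes a negative value there even though the normal factor is positive.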
Edgeworth series
Edgeworth developed a similar expansion as an improvement to the central limit theorem. The advantage of the Edgeworth series is that the error is controlled, so that it is a true asymptotic expansion.
Let <math>X_i</math> be a sequence of independent and identically distributed random variables, and <math>Y_n</math> the standardized sum
- <math> Y_n = \frac{\sum_{i=1}^n(X_i-\mbox{E}[X_i])}{\sqrt{\sum_{i=1}^n\mbox{var}[X_i]}}.</math>
Further, let <math>F_n</math> be the probability density function of <math>Y_n</math>. By the central limit theorem,
- <math>\lim_{n\rightarrow\infty} F_n(x) = \frac{1}{\sqrt{2\pi}}\exp(-x^2/2)</math>
for every x, as long as the means and variances are finite and the sum of the variances diverges to infinity. (Generally, the conclusion of the central limit theorem concerns the limit of cumulative distribution functions, not of probability density functions, and therefore applies to discrete distributions as well; discrete distributions are not considered in the present context.)
Now assume that the random variables <math>X_i</math> have mean μ, variance <math>\sigma^2</math>, and higher cumulants <math>\kappa_r = \sigma^r\lambda_r</math>. If we expand in terms of the unit normal distribution, that is, if we set
- <math>\Psi(x)=\exp(-x^2/2)/\sqrt{2\pi},</math>
then the cumulant differences in the formal expression of the characteristic function <math>f_n(t)</math> of <math>F_n</math> are
- <math> \kappa_1-\gamma_1 = 0\,,</math>
- <math> \kappa_2-\gamma_2 = 0\,,</math>
- <math> \kappa_r-\gamma_r = \frac{\lambda_r}{n^{r/2-1}}; \qquad r\geq 3\,.</math>
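The power of n in the last line follows from two standard cumulant properties: cumulants add over independent summands, and the rth cumulant scales with the rth power of a rescaling constant. The short derivation below is added for clarity and assumes the summands are independent; the notation <math>\kappa_r(Y_n)</math> for the rth cumulant of the standardized sum is introduced here only to distinguish it from the cumulants <math>\sigma^r\lambda_r</math> of a single summand.
- <math> \kappa_r\!\left(\sum_{i=1}^n X_i\right) = n\,\sigma^r\lambda_r\,, \qquad \kappa_r(Y_n) = \frac{n\,\sigma^r\lambda_r}{(\sigma\sqrt{n})^r} = \frac{\lambda_r}{n^{r/2-1}}\,, \qquad r\geq 3\,.</math>
Since the unit normal distribution has vanishing cumulants of order three and higher, this is also the value of the cumulant difference for each such r.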
The Edgeworth series is developed similarly to the Gram-Charlier A series, except that terms are now collected according to powers of n. Thus, we have
- <math> f_n(t)=\left[1+\sum_{j=1}^\infty \frac{P_j(it)}{n^{j/2}}\right] \exp(-t^2/2)\,,</math>
where <math>P_j(x)</math> is a polynomial of degree 3j. Again, after inverse Fourier transform, the density function <math>F_n</math> follows as
- <math> F_n(x) = \Psi(x) + \sum_{j=1}^\infty \frac{P_j(-D)}{n^{j/2}} \Psi(x)\,.</math>
The first three terms of the expansion are (Cramér 1957)
- <math> F_n(x) = \Psi(x) - \frac{\lambda_3 \Psi^{(3)}(x)}{6\sqrt{n}} +\frac{1}{n}\left[\frac{\lambda_4 \Psi^{(4)}(x)}{24}+\frac{\lambda_3^2 \Psi^{(6)}(x)}{72}\right]+O(1/n^{3/2})\,.</math>
Here, <math>\Psi^{(j)}(x)</math> is the jth derivative of Ψ(x) with respect to x. Blinnikov and Moessner (1998) have given a simple algorithm to calculate higher-order terms of the expansion.
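The three-term expansion can be checked against a case where the exact density is known. The Python sketch below is an illustration under stated assumptions, not the algorithm of Blinnikov and Moessner. It takes the summands to be independent unit-rate exponential variables (so that λ3 = 2, λ4 = 6 and the sum follows a gamma distribution), and it uses the identity that the jth derivative of the unit normal density equals (−1)^j times the jth probabilists' Hermite polynomial times the density. The function names and the choice n = 10 are hypothetical.

```python
import math

def psi(x):
    """Unit normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def edgeworth_pdf(x, n, lam3, lam4):
    """Edgeworth approximation through order 1/n for the standardized sum."""
    # Probabilists' Hermite polynomials: psi^(j)(x) = (-1)^j He_j(x) psi(x).
    he3 = x**3 - 3.0 * x
    he4 = x**4 - 6.0 * x**2 + 3.0
    he6 = x**6 - 15.0 * x**4 + 45.0 * x**2 - 15.0
    term1 = lam3 * he3 / (6.0 * math.sqrt(n))           # order n^(-1/2)
    term2 = (lam4 * he4 / 24.0 + lam3**2 * he6 / 72.0) / n  # order n^(-1)
    return psi(x) * (1.0 + term1 + term2)

def exact_pdf_exponential_sum(x, n):
    """Exact density of the standardized sum of n unit-rate exponentials."""
    s = n + math.sqrt(n) * x  # value of the unstandardized sum
    if s <= 0.0:
        return 0.0
    log_gamma_pdf = (n - 1) * math.log(s) - s - math.lgamma(n)
    return math.sqrt(n) * math.exp(log_gamma_pdf)

n = 10
grid = [-2.0 + 0.1 * k for k in range(51)]  # x from -2 to 3
err_normal = max(abs(psi(x) - exact_pdf_exponential_sum(x, n)) for x in grid)
err_edgeworth = max(
    abs(edgeworth_pdf(x, n, 2.0, 6.0) - exact_pdf_exponential_sum(x, n))
    for x in grid
)

print("max abs error, normal approximation:   ", err_normal)
print("max abs error, Edgeworth through 1/n:  ", err_edgeworth)
```

On this grid the Edgeworth approximation tracks the exact density more closely than the plain normal limit, which is the improvement over the central limit theorem that the expansion is meant to provide. The comparison says nothing about other distributions or about values of x far in the tails.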
Further reading
- S. Blinnikov and R. Moessner (1998). "Expansions for nearly Gaussian distributions". Astron. Astrophys. Suppl. Ser. 130:193-205.
- Harald Cramér (1957). Mathematical Methods of Statistics. Princeton University Press, Princeton.
- D. L. Wallace (1958). "Asymptotic approximations to distributions". Ann. Math. Stat. 29:635-654.