Normal distribution
The normal distribution is one of the most important probability distributions in statistics. Outside probability theory and statistics, notably in physics, it is often called the Gaussian distribution. It is actually a family of distributions of the same general form, differing only in their location and scale parameters: the mean and the standard deviation. The standard normal distribution is the normal distribution with a mean of zero and a standard deviation of one. Because the graph of its probability density resembles a bell, it is often called the bell curve.
History
The normal distribution was first introduced by de Moivre in an article in 1733 (reprinted in the second edition of his The Doctrine of Chances, 1738) in the context of approximating certain binomial distributions for large n. His result was extended by Laplace in his book Analytical Theory of Probabilities (1812), and is now called the Theorem of de Moivre-Laplace.
Laplace used the normal distribution in the analysis of errors of experiments. The important method of least squares was introduced by Legendre in 1805. Gauss, who claimed to have used the method since 1794, justified it rigorously in 1809 by assuming a normal distribution of the errors.
The name "bell curve" goes back to Jouffret, who used the term "bell surface" in 1872 for a bivariate normal with independent components. The name "normal distribution" was coined independently by Charles S. Peirce, Francis Galton and Wilhelm Lexis around 1875 [Stigler]. This terminology is unfortunate, since it reflects and encourages the fallacy that "everything is Gaussian". (See the discussion of "occurrence" below.)
Specification of the normal distribution
There are various ways to specify a random variable. The most visual is the probability density function (plot at the top), which represents how likely each value of the random variable is. The cumulative distribution function is a conceptually cleaner way to specify the same information, but to the untrained eye its plot is much less informative (see below). Equivalent ways to specify the normal distribution are: the moments, the cumulants, the characteristic function, the moment-generating function, and the cumulant-generating function. Some of these are very useful for theoretical work, but not intuitive. See probability distribution for a discussion.
All of the cumulants of the normal distribution except the first two are zero.
Distribution functions
Probability density function
The probability density function of the normal distribution with mean μ and standard deviation σ (or variance σ²) is also known as the Gaussian function:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
These statements are also true for non-standard normal distributions.
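As an illustration, the density can be evaluated directly. The following is a minimal Python sketch, assuming the usual Gaussian density exp(−(x − μ)² / (2σ²)) / (σ√(2π)); the names normal_pdf and area_within are illustrative and do not come from any library. It estimates the area under the curve within one standard deviation of the mean, once for the standard normal and once for a hypothetical distribution with mean 100 and standard deviation 15.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution N(mu, sigma^2) at x."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def area_within(k, mu=0.0, sigma=1.0, steps=2000):
    """Midpoint-rule estimate of the area within k standard deviations of the mean."""
    a, b = mu - k * sigma, mu + k * sigma
    h = (b - a) / steps
    return h * sum(normal_pdf(a + (i + 0.5) * h, mu, sigma) for i in range(steps))

# The area within one standard deviation does not depend on mu or sigma.
standard_area = area_within(1)
shifted_area = area_within(1, mu=100.0, sigma=15.0)
print(standard_area, shifted_area)  # both close to 0.68
```

The two estimates agree because a change of location and scale only shifts and stretches the curve.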
Cumulative distribution function
The cumulative distribution function of the standard normal distribution, evaluated at z, is the probability that a standard normal variable has a value less than or equal to z. Given the probability density function above, the cumulative distribution function has the formula:
$$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\,dt$$
For instance, the probability that a standard normal variable has a value less than 0.12 is equal to 0.54776. The cumulative distribution function of the normal distribution cannot be expressed in terms of elementary functions and has to be calculated using numerical techniques. It is so commonly used that a closely related function is known as "the" error function, erf: the standard normal cumulative distribution function at z equals (1 + erf(z/√2))/2. A special feature of the normal distribution is that the cumulative distribution function is not needed to simulate normal random variables (see below).
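The value quoted above can be reproduced numerically. This is a minimal sketch using the error function from Python's standard math module and the identity that the standard normal cumulative distribution function at z equals (1 + erf(z/√2))/2; the function name standard_normal_cdf is illustrative.

```python
import math

def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal variable Z, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

value = standard_normal_cdf(0.12)
print(round(value, 5))  # 0.54776
```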
Generating functions
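Moment generating function
The moment-generating function of a Gaussian random variable X ~ N(μ, σ²) is defined as the expected value of e^(tX) and can be written as
$$M_X(t) = \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right)$$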
Characteristic function
The characteristic function of a Gaussian random variable X ~ N(μ, σ²) is defined as the expected value of e^(itX) and can be written as
$$\varphi_X(t) = \exp\left(i\mu t - \frac{\sigma^2 t^2}{2}\right)$$
Properties
- If X ~ N(μ, σ²) and a and b are real numbers, then aX + b ~ N(aμ + b, (aσ)²).
- If X₁ ~ N(μ₁, σ₁²) and X₂ ~ N(μ₂, σ₂²), and X₁ and X₂ are independent, then X₁ + X₂ ~ N(μ₁ + μ₂, σ₁² + σ₂²).
- If X₁, ..., Xₙ are independent standard normal variables, then X₁² + ... + Xₙ² follows a chi-squared distribution with n degrees of freedom.
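The three properties above can be spot-checked by simulation. The sketch below uses Python's standard random.gauss generator with arbitrarily chosen, hypothetical parameter values, and compares sample means and variances with the values the properties predict. It checks only the first two moments, so it is a plausibility check rather than a proof of the distributional claims.

```python
import random

def mean_and_variance(values):
    """Sample mean and (population-style) variance of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, variance

rng = random.Random(42)  # fixed seed so the run is repeatable
N = 50_000

# Property 1: X ~ N(1, 0.5^2), a = 2, b = 3, so aX + b should be N(5, 1).
scaled = [2.0 * rng.gauss(1.0, 0.5) + 3.0 for _ in range(N)]

# Property 2: X1 ~ N(1, 0.5^2) and X2 ~ N(-2, 1.5^2), independent,
# so X1 + X2 should be N(-1, 0.25 + 2.25) = N(-1, 2.5).
summed = [rng.gauss(1.0, 0.5) + rng.gauss(-2.0, 1.5) for _ in range(N)]

# Property 3: the sum of 3 squared standard normals is chi-squared with
# 3 degrees of freedom, which has mean 3 and variance 6.
chi2 = [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(3)) for _ in range(N)]

scaled_stats = mean_and_variance(scaled)
summed_stats = mean_and_variance(summed)
chi2_stats = mean_and_variance(chi2)
print(scaled_stats, summed_stats, chi2_stats)
```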
Standardizing Gaussian random variables
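As a consequence of the first listed property, it is possible to relate all Gaussian random variables to the standard normal.
If X is a Gaussian random variable with mean μ and variance σ², then
$$Z = \frac{X - \mu}{\sigma}$$
is a standard normal random variable, Z ~ N(0, 1). Conversely, if Z is a standard normal random variable, then X = σZ + μ is a Gaussian random variable with mean μ and variance σ².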
The standard normal distribution has been tabulated, and the other normal distributions are simple transformations of the standard one. Therefore, if one knows the mean and the standard deviation of a normal distribution, one can use this table to answer all questions about the distribution.
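In code, the table lookup is replaced by a function call, but the standardizing step is the same. The following minimal sketch assumes the standardization Z = (X − μ)/σ and computes the standard normal cumulative distribution function from the error function; the function names and the example values (mean 100, standard deviation 15) are illustrative.

```python
import math

def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal variable Z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), obtained by standardizing first."""
    z = (x - mu) / sigma  # number of standard deviations above the mean
    return standard_normal_cdf(z)

# Hypothetical example: a variable with mean 100 and standard deviation 15.
p = normal_cdf(115.0, mu=100.0, sigma=15.0)  # same as the standard normal CDF at z = 1
print(round(p, 4))  # 0.8413
```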
Generating Gaussian random variables
For computer simulations, it is often necessary to generate values that follow a Gaussian distribution. A standard way to do this is the Box-Muller transform. It requires two uniformly distributed values as input, which can easily be generated by the computer's pseudorandom number generator.
The Box-Muller transform is a beautiful consequence of the third listed property, and of the fact that a chi-squared random variable with two degrees of freedom is an exponential random variable (which can be easily simulated exactly).
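A minimal Python sketch of the basic (trigonometric) form of the Box-Muller transform follows. It assumes the standard formulation, in which the squared radius −2 ln U₁ is exponentially distributed with mean 2 and the angle 2πU₂ is uniform; the function name box_muller and the seed are illustrative choices.

```python
import math
import random

def box_muller(rng):
    """Return two independent standard normal values from two uniform values."""
    u1 = 1.0 - rng.random()  # lies in (0, 1], so log(u1) is finite
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))  # radius^2 is exponential with mean 2
    angle = 2.0 * math.pi * u2               # uniform angle in [0, 2*pi)
    return radius * math.cos(angle), radius * math.sin(angle)

rng = random.Random(7)  # fixed seed so the run is repeatable
samples = []
for _ in range(50_000):
    samples.extend(box_muller(rng))

mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, variance)  # close to 0 and 1
```

A value with mean μ and standard deviation σ is then obtained as σz + μ from a standard normal value z.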
Occurrence
Approximately normal distributions occur in many situations, as a result of the central limit theorem. Simply stated, this theorem says that adding up a large number of small independent variables results in an approximately normal distribution. Therefore, whenever there is reason to suspect the presence of a large number of small effects acting additively, it is reasonable to assume that observations will be normal. This assumption is then subject to empirical test using well-established statistical methods.
It is important to realize, however, that small effects often act as multiplicative (rather than additive) modifications. In that case, the assumption of normality is not justified, and it is the logarithm of the variable of interest that is normally distributed. The distribution of the directly observed variable is then called log-normal.
Finally, if there is a single external influence which has a large effect on the variable under consideration, the assumption of normality is not justified either. This is true even if, when the external variable is held constant, the resulting distributions are indeed normal. The full distribution will be a superposition of normal variables, which is not in general normal. This is related to the theory of errors (see below).
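The contrast between additive and multiplicative effects described above can be illustrated by simulation. The sketch below is only a rough demonstration under assumed, hypothetical settings: 100 small effects drawn uniformly from [−0.1, 0.1], combined either by adding them or by multiplying the factors 1 + effect. Sample skewness is used as a simple symmetry indicator; it is near zero for a normal distribution, but a near-zero value is not by itself a test of normality.

```python
import math
import random

def skewness(values):
    """Sample skewness: third central moment divided by variance^(3/2)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

rng = random.Random(0)  # fixed seed so the run is repeatable
TRIALS = 5000
EFFECTS = 100

sums, products = [], []
for _ in range(TRIALS):
    effects = [rng.uniform(-0.1, 0.1) for _ in range(EFFECTS)]
    sums.append(sum(effects))                               # additive combination
    products.append(math.prod(1.0 + e for e in effects))    # multiplicative combination
logs = [math.log(p) for p in products]

skew_sum = skewness(sums)       # near 0: roughly symmetric
skew_prod = skewness(products)  # clearly positive: long right tail
skew_log = skewness(logs)       # near 0: the logarithm is roughly symmetric
print(skew_sum, skew_prod, skew_log)
```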
To summarize, here's a list of situations where approximate normality is expected. For a fuller discussion, see below.
- In counting problems (so the central limit theorem includes a discrete-to-continuum approximation) where reproductive random variables are involved, such as:
  - Binomial random variables, associated with yes/no questions;
  - Poisson random variables, associated with rare events;
- In physiological measurements of biological specimens:
  - The logarithm of measures of size of living tissue (length, height, skin area, weight);
  - The length of inert appendages (hair, claws, nails, teeth) of biological specimens, in the direction of growth; presumably the thickness of tree bark also falls under this category;
  - Other physiological measures may be normally distributed, but there is no reason to expect that a priori;
- Measurement errors are assumed to be normally distributed, and any deviation from normality must be explained;
- Financial variables:
  - The logarithm of interest rates, exchange rates, and inflation; these variables behave like compound interest, not like simple interest, and so are multiplicative;
  - Stock-market indices are supposed to be multiplicative too, but some researchers claim that they are log-Lévy variables instead of lognormal;
  - Other financial variables may be normally distributed, but there is no reason to expect that a priori;
- Light intensity:
  - The intensity of laser light is normally distributed;
  - Thermal light has a Bose-Einstein distribution on very short time scales, and a normal distribution on longer timescales due to the central limit theorem.
Instances of the central limit theorem
- A binomial distribution with parameters n and p is approximately normal if n is big enough (the approximation is very good if both np and n(1 − p) are at least 5). The approximating normal distribution has mean μ = np and standard deviation σ = √(np(1 − p)).
- A Poisson distribution with parameter λ is approximately normal if λ is big enough (λ > 10 is sufficient). The approximating normal distribution has mean μ = λ and standard deviation σ = √λ.
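Both approximations can be compared with exact probabilities. The sketch below uses hypothetical parameter values (n = 100, p = 0.5 and λ = 20) and adds a continuity correction of 0.5, a common refinement that is an assumption of this example rather than part of the statements above; all function names are illustrative.

```python
import math

def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal variable Z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X binomial with parameters n and p."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def poisson_cdf(k, lam):
    """Exact P(X <= k) for X Poisson with parameter lam."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))

def normal_approx_cdf(k, mu, sigma):
    """Normal approximation to P(X <= k); the +0.5 is a continuity correction."""
    return standard_normal_cdf((k + 0.5 - mu) / sigma)

# Binomial case: mean np, standard deviation sqrt(np(1 - p)).
n, p = 100, 0.5
binom_exact = binomial_cdf(55, n, p)
binom_approx = normal_approx_cdf(55, n * p, math.sqrt(n * p * (1 - p)))

# Poisson case: mean lam, standard deviation sqrt(lam).
lam = 20.0
pois_exact = poisson_cdf(25, lam)
pois_approx = normal_approx_cdf(25, lam, math.sqrt(lam))

print(binom_exact, binom_approx)
print(pois_exact, pois_approx)
```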
Test scores
The IQ score of an individual, for example, can be seen as the result of many small additive influences: many genes and many environmental factors all play a role.
- IQ scores and other ability scores are approximately normally distributed. For most IQ tests, the mean is 100 and the standard deviation is 15.
Physical characteristics of biological specimens
The overwhelming biological evidence is that bulk growth processes of living tissue proceed by multiplicative, not additive, increments, and that measures of body size should therefore at best follow a lognormal rather than a normal distribution. Despite common claims of normality, the sizes of plants and animals are approximately lognormal. The evidence, and an explanation based on models of growth, were first published in the classic book:
- Huxley, Julian: Problems of Relative Growth (1932)
The assumption that linear size of biological specimens is normal leads to a non-normal distribution of weight (since weight/volume is roughly the third power of length, and Gaussian distributions are only preserved by linear transformations), and conversely assuming that weight is normal leads to non-normal lengths. This is a problem, because there is no a priori reason why one of length, or body mass, and not the other, should be normally distributed. Lognormal distributions, on the other hand, are preserved by powers so the "problem" goes away if lognormality is assumed.
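The closure of lognormal distributions under powers follows from the rule for linear transformations of a normal variable. As a short sketch, assume that length L is lognormal and that weight W is proportional to the cube of length, with a hypothetical constant of proportionality c > 0:

```latex
\begin{align*}
\ln L &\sim N(\mu, \sigma^2), \\
W &= c\,L^3, \qquad c > 0, \\
\ln W &= \ln c + 3 \ln L \sim N\!\left(\ln c + 3\mu,\ (3\sigma)^2\right).
\end{align*}
```

So W is again lognormal. The same argument run backwards shows that lognormal weight implies lognormal length.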
- Blood pressure of adult humans is supposed to be normally distributed, but only after separating males and females into different populations (each of which is normally distributed).
- The length of inert appendages such as hair, nails, teeth, claws and shells is expected to be normally distributed if measured in the direction of growth. This is because the growth of inert appendages depends on the size of the root, and not on the length of the appendage, and so proceeds by additive increments. Hence, we have an example of a sum of very many small lognormal increments approaching a normal distribution. Another plausible example is the width of tree trunks, where a new thin ring is produced every year whose width is affected by a large number of factors.
Measurement errors
Repeated measurements of the same quantity are expected to yield results which are clustered around a particular value. If all major sources of errors have been taken into account, it is assumed that the remaining error must be the result of a large number of very small additive effects, and hence normal. Deviations from normality are interpreted as indications of systematic errors which have not been taken into account. Note that this is the central assumption of the mathematical theory of errors.
Financial variables
Because of the exponential nature of interest and inflation, financial indicators such as interest rates, stock values, or commodity prices make good examples of multiplicative behaviour. As such, they should not be expected to be normal, but lognormal.
Mandelbrot, the popularizer of fractals, has claimed that even the assumption of lognormality is flawed.
Lifetime
Other examples of variables that are not normally distributed include the lifetimes of humans or technical devices. Examples of distributions used in this connection are the exponential distribution (memoryless) and the Weibull distribution. In general, there is no reason that waiting times should be normal, since they are not directly related to any kind of additive influence.
Photon counts
Light intensity from a single source varies with time, and is usually assumed to be normally distributed. However, quantum mechanics interprets measurements of light intensity as photon counting. Ordinary light sources, which produce light by thermal emission, should follow a Poisson distribution or Bose-Einstein distribution on very short time scales. On longer time scales (longer than the coherence time), the addition of independent variables generates a Gaussian distribution. Laser light, which is definitely a quantum phenomenon, has an exactly Gaussian light intensity.
Further reading
- See also multivariate normal distribution.


