Prediction interval
In statistics, a prediction interval bears the same relationship to a future observation that a confidence interval bears to an unobservable population parameter.
Example
Suppose one has drawn a sample from a normally distributed population. The mean and standard deviation of the population are unknown except insofar as they can be estimated from the sample. The goal is to predict the next observation. Let n be the sample size; let μ and σ be respectively the unobservable mean and standard deviation of the population. Let X1, ..., Xn be the sample, assumed to consist of independent observations; let Xn+1 be the future observation to be predicted, drawn independently from the same population. Let
- <math>\overline{X}_n=(X_1+\cdots+X_n)/n</math>
and
- <math>S_n^2={1 \over n-1}\sum_{i=1}^n (X_i-\overline{X}_n)^2.</math>
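Provided the observations are independent, the difference between Xn+1 and the sample mean is normally distributed with mean 0 and variance σ²(1 + 1/n), and it is independent of the sample variance. From this it is fairly routine to show that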
- <math>{X_{n+1}-\overline{X}_n \over \sqrt{S_n^2+S_n^2/n}} = {X_{n+1}-\overline{X}_n \over S_n\sqrt{1+1/n}}</math>
has a Student's t-distribution with n − 1 degrees of freedom. Consequently we have
- <math>P\left(\overline{X}_n-A S_n\sqrt{1+(1/n)}\leq X_{n+1} \leq\overline{X}_n+A S_n\sqrt{1+(1/n)}\,\right)=p</math>
where A is the 100(1 − ((1 − p)/2))th percentile, that is, the 100((1 + p)/2)th percentile, of Student's t-distribution with n − 1 degrees of freedom, so that probability (1 − p)/2 lies in each tail. Therefore the numbers
- <math>\overline{X}_n\pm A {S}_n\sqrt{1+(1/n)}</math>
are the endpoints of a 100p% prediction interval for Xn+1.
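The calculation above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function names (`t_pdf`, `t_cdf`, `t_quantile`, `prediction_interval`) are hypothetical, and the t-distribution percentile is obtained by plain numerical integration and bisection so that only the standard library is needed. In practice a statistics library routine such as `scipy.stats.t.ppf` would normally be used for the percentile instead.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    # math.gamma overflows for very large df; adequate for small samples.
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using symmetry about zero.

    steps must be even for Simpson's rule.
    """
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(q, df):
    """Quantile by bisection; assumes 0.5 <= q < 1."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < q:  # widen the bracket until it contains the quantile
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def prediction_interval(sample, p=0.95):
    """Endpoints of a 100p% prediction interval for the next observation.

    Assumes independent draws from a single normal population.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample standard deviation, n - 1 divisor
    a = t_quantile((1 + p) / 2, n - 1)  # the percentile called A in the text
    half_width = a * s * math.sqrt(1 + 1 / n)
    return mean - half_width, mean + half_width
```

For instance, `prediction_interval([1, 2, 3, 4, 5], p=0.95)` returns an interval centred on the sample mean 3. The factor `math.sqrt(1 + 1 / n)` is what distinguishes it from a confidence interval for μ, which would use `math.sqrt(1 / n)` instead.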