Talk:Central limit theorem

I removed this:

An interesting illustration of the central tendency, or Central Limit Theorem, is to compare, for a number of lifts (elevators for those on the left-hand side of the Atlantic), the maximum load and the maximum number of people. For small lifts holding only a few people, the maximum load divided by maximum number of people is usually greater than it is in large lifts holding a larger number of people. This is necessary because some small groups of people who fill the lift may well have several people who are above average weight (just as, on other occasions, other small groups may have several who are well below average weight), whereas the larger the sample (the number of people in the large lift) the nearer the proportion of overweight people will be to the norm for the whole population.

While it is a nice example, it doesn't illustrate the Central limit theorem, whose gist is that the sum is normally distributed. I don't quite know where to put this example though. Maybe in standard deviation or normal distribution? AxelBoldt 21:02 Oct 14, 2002 (UTC)


I've encountered another definition of "the" central limit theorem.

My statistics textbook (Mathematical Statistics with Applications, 6th edition, by Wackerly, Mendenhall III, and Scheaffer) defines it in this way:

If $Y_1, Y_2, \ldots, Y_n$ are iid with mean $\mu$ and standard deviation $\sigma$, then $\sqrt{n}\,(\bar{Y}-\mu)/\sigma$ converges to a standard normal distribution as n goes to infinity. (my paraphrase)

The HyperStat on-line basic statistics text (http://davidmlane.com/hyperstat/sampling_dist.html) says

The central limit theorem states that given a distribution with a mean m and variance $s^2$, the sampling distribution of the mean approaches a normal distribution with a mean (m) and a variance $s^2/N$ as N, the sample size, increases. (quoted directly)

I suppose this follows from the definition given in this article. Nonetheless, it is not identical to the one given in the article.

Is there a general trend for more basic/applied statistics books to use this mean-centric definition, while more advanced/theoretical ones use the definition given in the article? Is the definition given in the article better somehow? (I assume the mean-centric definition can be derived from it, but not vice versa.) Should the article also mention the mean-centric definition, since it seems to be somewhat popular?

--Ryguasu 10:52 Dec 2, 2002 (UTC)

No --- the "mean-centric" version and the "sum-centric" version are trivially exactly the same thing; either can be derived from the other, and it's completely trivial: Just multiply both the numerator and the denominator by the same thing; you need to figure out which thing. Michael Hardy 04:34 Feb 21, 2003 (UTC)
Right. This became obvious to me sometime after posting the question. Nonetheless, I think I'm going to stick in the mean-based formulation at some point; I've found more books using only the mean-based definition, and I imagine that some not so mathematically inclined people who nonetheless have to brush up against the CLT (certain social scientists come to mind) might like having what is not trivial to them pointed out. I agree, however, that unless proofs of the CLT typically involve the mean-based formulation, the one currently given on this page should be presented as more fundamental. --Ryguasu
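
For readers who want the step spelled out, the equivalence described above can be written as follows. This is only a sketch, assuming the notation $S_n = Y_1 + \cdots + Y_n$ and $\bar{Y} = S_n/n$ (the symbol $S_n$ is borrowed from the later discussion on this page), and taking n as the "thing" to multiply by:

```latex
\frac{\sqrt{n}\,(\bar{Y}-\mu)}{\sigma}
  = \frac{\bar{Y}-\mu}{\sigma/\sqrt{n}}
  = \frac{n\bar{Y}-n\mu}{n\,\sigma/\sqrt{n}}
  = \frac{S_n-n\mu}{\sigma\sqrt{n}}
```

The left-hand side is the mean-based statistic and the right-hand side is the sum-based one, so a limit statement about either is a limit statement about the other.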

Maybe I'm getting in over my head here, but do you really need to normalize Sn to say anything precise here? Can't we clarify the first "informal" claim of convergence of Sn by saying, parallel to what AxelBoldt has said for the normalized (i.e. Zn) case

The distribution of $S_n$ converges towards the normal distribution $N(n\mu, \sigma^2 n)$ as n approaches ∞. This means: if F(z) is the cumulative distribution function of $N(n\mu, \sigma^2 n)$, then for every real number z, we have
$\lim_{n\to\infty} \Pr(S_n \le z) = F(z)$.

Is there a lurking desire here to state the non-standard normal part as a corollary, rather than as central to the CLT? That might be ok, although the general-purpose version looks more useful to me.

--Ryguasu 01:18 Dec 11, 2002 (UTC)

The problem is that on one side of your equality you have a limit as n approaches infinity, so that the value of that side does not depend on anything called n, and which CDF you've got on the other side does depend on the value of n. -- Mike Hardy

Actually, the CDF on the right-hand side depends on z, not on n. There are no free ns anywhere. --Ryguasu

It does depend on n, but your notation inappropriately suppresses that dependency. You defined F(z) as the cumulative distribution function of $N(n\mu, \sigma^2 n)$. AxelBoldt 02:23 Dec 14, 2002 (UTC)

Excellent point. Nonetheless, I find it suspicious that someone with more mathematical experience than me can't express the "informal" claim in a rigorous manner. At Talk:Normal distribution, you mentioned "goodness of fit" tests. Couldn't you express the informal version formally, through some limit statement about the results of such a test as the number of samples/trials goes to infinity? --Ryguasu 02:11 Jan 30, 2003 (UTC)

Probably, I don't know. But the version given in the article is also a rigorous statement of the "informal" claim you have in mind. AxelBoldt 00:55 Jan 31, 2003 (UTC)
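
A small numerical sketch may make the dependence on n concrete. It evaluates the CDF of N(nμ, σ²n) at one fixed z for several n, next to the standard normal CDF that appears in the normalized statement. The values of mu, sigma2 and z are arbitrary assumptions chosen for illustration, and normal_cdf is a helper written for this sketch, not something taken from the article.

```python
import math

def normal_cdf(x, mean, var):
    """CDF of N(mean, var) evaluated at x, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))

mu, sigma2 = 1.0, 1.0   # assumed values, for illustration only
z = 5.0                 # one fixed real number

# F(z) for the unnormalized target N(n*mu, n*sigma2): a different number for each n
unnormalized = [normal_cdf(z, n * mu, n * sigma2) for n in (1, 5, 25, 100)]

# Standard normal CDF at z: the right-hand side of the normalized statement has no n in it
standard = normal_cdf(z, 0.0, 1.0)

print(unnormalized)
print(standard)
```

With mu > 0 the standardized argument (z - n*mu)/sqrt(n*sigma2) tends to minus infinity, so the unnormalized values drift toward 0 for any fixed z. That is one way to see why the right-hand side cannot serve as the value of a limit in n.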


How about adding some examples? (This is something most of the math pages are lacking.) How about an illustration involving coin flips? I.e., X_n is defined on the probability space [0, 1] so that X_n is 1 with probability 1/2 and -1 with probability 1/2. A series of graphs and equations could be given.
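
One possible version of the coin-flip illustration, as a minimal sketch: each X_i is +1 or -1 with probability 1/2 (mean 0, variance 1), so the normalized sum is Z_n = S_n / sqrt(n). The code computes the exact distribution of S_n from binomial coefficients and measures the largest gap between the CDF of Z_n and the standard normal CDF. The function names and the chosen values of n are assumptions made for this example.

```python
import math

def coin_sum_pmf(n):
    """Exact distribution of S_n = X_1 + ... + X_n, each X_i = +1 or -1 with probability 1/2.

    Returns a dict mapping each attainable value s to Pr(S_n = s).
    """
    return {2 * k - n: math.comb(n, k) / 2 ** n for k in range(n + 1)}

def standard_normal_cdf(z):
    """CDF of N(0, 1) via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def max_cdf_gap(n):
    """Largest gap between the CDF of Z_n = S_n / sqrt(n) and the standard
    normal CDF, checked just below and at each jump point of Z_n."""
    gap, cdf = 0.0, 0.0
    for s, p in sorted(coin_sum_pmf(n).items()):
        phi = standard_normal_cdf(s / math.sqrt(n))
        gap = max(gap, abs(cdf - phi))   # just below the jump
        cdf += p
        gap = max(gap, abs(cdf - phi))   # at the jump
    return gap

gaps = {n: max_cdf_gap(n) for n in (1, 4, 16, 64, 256)}
for n, g in gaps.items():
    print(n, round(g, 4))
```

The printed gaps shrink as n grows, which is the convergence in distribution that the theorem asserts. Plotting coin_sum_pmf(n) as a bar chart for each n would give the series of graphs suggested above.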


In the article, there is a comment reading, "picture of a distribution being "smoothed out" by summation would be nice". I've created an animated gif to address this comment. Since animated gifs are considered questionable, I am posting it to the talk page to see if others think it's a good idea. (The image has a rather large footprint on the screen. If anyone can easily shrink it, that would be good. With the rather rudimentary image manipulation tools at my disposal, it would be a moderately involved undertaking for me, so I'm not going to do it unless it's a worthwhile effort.)

I also propose the following explanatory text:

The figure below demonstrates the central limit theorem in action. It shows the distribution of the random variable $Y = S_n/\sqrt{n}$ for values of n from 1 to 7. (In this particular case, the random variables $X_i$ have variance equal to 1, so the variance of $S_n$ is equal to n. The factor $1/\sqrt{n}$ scales Y so that its variance is equal to 1 independent of n.)

[Image: clt_in_action.gif]

Any and all comments appreciated. -- Cyan 22:15, 2 Feb 2004 (UTC)

Testing...

[Image: clt_in_action.gif. Caption: The central limit theorem in action]

Yes, using the thumbnail feature would be a quick work-around. I don't know anything about this, but the diagram seems useful to me (it's particularly useful that it pauses between repetitions). You can count along in your head 1 to 7 as the shape of the graph changes, it doesn't rely on captions you need to read at the same time as observing the graph. I give it my uninformed support.  :) (Plus, if this is replicating information already included in the text then that's even better; relying on an animated gif to impart key information rather than to give an example of it would be a bad thing). fabiform | talk 04:32, 4 Feb 2004 (UTC)

An animation can't be printed, and I've always found animated diagrams to be very frustrating, particularly in a case like this. I have to wait for it to come around again if I'm trying to wrap my head around some individual part of it. There's no pause button, no frame forward, no rewind, at least in most browsers. I'd rather see such images side by side in most cases. Perhaps an animation in addition might be neat, but forcing it on readers is to me not friendly.
Here's a quick vertically flattened version (which could float to the side of the body text, for instance). A horizontal version might be better, or break it on two lines. --Brion 09:15, 4 Feb 2004 (UTC)
[Image: Smoothing by summation sample.png]

My $0.02:
a) This is, indeed, an example of an appropriate use of an animated GIF. There's no actual need to change it. However...
b) I actually think that in this particular case the separate pictures are really just as good. I find the animation irritatingly jumpy, and, of course, the constant-time steps are too fast for the early steps (where you might even want to take a moment to visualize the convolution in your head, and notice that you go from two sharp peaks to three blunt peaks to a single broad peak with four bumps), and too slow for the later steps (which all look alike). This is a nit-pick, though.
c) Footprint of the animated version is OK. Note, however, that you could easily reduce the extent of the X axis to +/- 3.5. Maybe by the last iteration there is some data outside those limits and maybe you know it's there, but visually it doesn't matter.
d) The individual thumbnails in Brion's version need a bit of work. They're currently too small and the vertical arrangement isn't very good. You're going to get a million "try this, try that" suggestions, each of which would be a couple of hours' work to try... mine is that you use a table and put them into some kind of comic strip format, maybe two rows of four, maybe four rows of two... yes, you'd need to provide an eighth image but since it would look just the same as the seventh that wouldn't be a problem... you'd need to tinker with the axis labelling, slightly bigger type, perhaps slightly fewer divisions... the axis labels (numbers) do NOT need to be TRULY legible, they should be reduced with antialiased smoothing, it's OK if they look blurry when you enlarge them, but they need to be just legible enough that you think you're seeing 1, 2, 3...
Very appropriate to the subject matter, by the way, and a nice illustration. Good stuff! Dpbsmith 11:37, 4 Feb 2004 (UTC)

Thanks for all the comments, folks! Here's what I'm going to do. As Dpbsmith and Brion suggest, I'm going to create a static image in 2 strips of 4 graphs. I'll play around with the x-axis limits for aesthetic effect, and I'll include a link to the animated gif for those of our readers who want to click on it. The reason to include it at all is that the last few panels will be indistinguishable as static images, but small changes will be apparent in the animated version, thus giving the viewer a sense of the scale of changes in distribution that occur past a certain value of n. -- Cyan 16:04, 4 Feb 2004 (UTC)

I looked at the different proposed diagrams, and I think I prefer the 2 strips of 4 graphs idea. I like the static images better than the animated image. -- It occurs to me that the illustration of the central limit theorem could be expanded by showing two or more different initial distributions, or adding a different distribution each time (not identical). After all the whole point of the theorem is that for a large class of distributions, adding them together brings you to the same limiting distribution. Thoughts? Happy editing, Wile E. Heresiarch 02:47, 18 Mar 2004 (UTC)
Oh, just a minor followup -- maybe it would help if the example shown on the main central limit theorem page was the same as one of (hopefully several) examples shown in illustration of the central limit theorem. I'm thinking the main page could just show the phenomenon, and the illustration page could go into more detail. Thinking out loud, Wile E. Heresiarch 14:08, 18 Mar 2004 (UTC)
Yet another half-baked idea -- maybe the effect of the animation can be sort-of imitated by leaving each plotted line in the succeeding figures, but grayed-out or something like that. So you could see just how much the line is changing, and the old lines won't block out the new ones if we use a lighter/grayer color. Wile E. Heresiarch 14:15, 18 Mar 2004 (UTC)

I have to agree with the no-animation camp. While it does show the progression nicely, having to watch it repeat a few times isn't ideal, and it distracts from the article. The images are great though, and as shown above they work nicely in a line. One other problem with animation is that it can show effects that are not there - the line looks to move which kinda hides the fact that it is a convolution. There might be a case to argue for a link to the animated version, but I would argue it is unnecessary. Good work folks. Mat-C 00:41, 18 Apr 2004 (UTC)

Mat-C, maybe you can look at the figures in Student's t-distribution and tell me what you think -- I attempted to show the progression of the t distribution to the normal distribution by using different colors. How successful was that, do you think? Thanks for any comments, Wile E. Heresiarch 02:53, 19 Apr 2004 (UTC)

Just for those who are wondering, the reason I haven't followed up on producing a set of images is because I discovered that the numerical convolution method I'm using isn't actually converging to a Gaussian. The images above look like Gaussians, but in fact are flatter and have wider tails than a Gaussian actually has. In fact, if I start with a Gaussian, the convolution moves it away from Gaussianity, flattening it and widening the tails. I haven't the time to devote to correcting this problem right now... I may get to it at some less busy time in the future. -- Cyan 05:53, 18 Apr 2004 (UTC)

Hmm, can you tell me a little about how you're going about the convolution, then? The reason that I ask is that I have also computed a numerical convolution (via FFT) for the figures on the illustration of the central limit theorem page, and I'd like to try to make sure those figures don't have the same problem. Thanks for any info. Wile E. Heresiarch 02:53, 19 Apr 2004 (UTC)
I used a two-sided filter algorithm based on MATLAB's built-in one-sided "filter" function (more info on this function here (http://www.mathworks.com/access/helpdesk/help/techdoc/matlab.shtml)). I convolved a vector containing discrete samples of the distribution with the original distribution, and then rescaled it back to standard deviation 1, which involves resampling the distribution so that the discrete grid matches that of the original distribution. Apparently this quick and dirty procedure is affected by some kind of numerical error, because the distribution it converges to is not Gaussian. If you want to check the convergence, why not just plot a Gaussian over your filter-derived distribution? -- Cyan 04:54, 19 Apr 2004 (UTC)
Thanks for your comments. Just a thought -- the problem that you describe might be caused by the discretization effects -- I ran into that when working on another convolution problem and found the convolution result slowly drifting away from the correct result. I think it might be possible to solve the problem without resampling, which could reduce the discretization error. I think I'll post the Octave code which I used to construct the figures -- then it can be inspected and compared, as well as making it possible to "try this at home". Happy editing, Wile E. Heresiarch 02:22, 20 Apr 2004 (UTC)
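
For anyone who wants to try the no-resampling idea at home, here is a minimal pure-Python sketch. It is not the MATLAB filter procedure or the Octave code mentioned above, only an illustration under stated assumptions: the base distribution is a uniform on [-sqrt(3), sqrt(3)] represented as equal probability masses on 101 grid points, the convolution is a direct sum rather than an FFT, and all names (convolve, grid_moments, gauss) are made up for this example. The grid spacing h is never changed, so any rescaling to unit variance can be done by relabelling the x-axis instead of resampling.

```python
import math

def convolve(a, b):
    """Full discrete convolution of two sequences (direct double loop)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def grid_moments(p, h):
    """Grid points, mean and variance of masses p on a centered grid of spacing h."""
    xs = [(i - (len(p) - 1) / 2) * h for i in range(len(p))]
    mean = sum(x * pi for x, pi in zip(xs, p))
    var = sum((x - mean) ** 2 * pi for x, pi in zip(xs, p))
    return xs, mean, var

# Assumed base distribution: equal masses on 2*m + 1 points spanning [-sqrt(3), sqrt(3)].
# The variance of this discretized uniform is 1.02 rather than exactly 1.
m = 50
h = math.sqrt(3.0) / m
base = [1.0 / (2 * m + 1)] * (2 * m + 1)

p = base
for _ in range(6):          # six convolutions give the sum of 7 independent copies
    p = convolve(p, base)

xs, mean, var = grid_moments(p, h)

def gauss(x):
    """Gaussian density with the same mean and variance as the convolved masses."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Compare the density estimate p / h with the matching Gaussian at every grid point.
max_gap = max(abs(pi / h - gauss(x)) for x, pi in zip(xs, p))
peak = gauss(mean)
print(var, max_gap / peak)
```

Because the masses stay on one fixed grid, the variance after each step is the base variance times the number of summands up to rounding error, which gives a simple check that nothing is drifting. Overlaying gauss on p / h, as suggested above, is then a direct visual test of convergence.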

$o(t^2)$

Just a note: $o(t^2)$, $t \to 0$, refers to a function which goes to zero more quickly than $t^2$ (like $t^3$), and not a function 'like' $t^2$, which would be $O(t^2)$. Hence, I have reverted the recent edits that changed $o(t^2)$ to $o(t^3)$. Notably, the article on Big-O notation does not discuss limits other than the limit as $t \to \infty$. However, it should do so! Ben Cairns 06:56, 14 Feb 2005 (UTC).
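
For reference, the two notations for the limit $t \to 0$ can be stated as follows. This is the standard definition, written out here as a sketch rather than quoted from the article:

```latex
f(t) = o(t^2) \ \text{as } t \to 0
  \iff \lim_{t \to 0} \frac{f(t)}{t^2} = 0,
\qquad
f(t) = O(t^2) \ \text{as } t \to 0
  \iff \exists\, C, \delta > 0 : \ |f(t)| \le C\,t^2 \ \text{for } |t| < \delta .
```

For example, $t^3 = o(t^2)$ because $t^3/t^2 = t \to 0$, whereas $t^2$ itself is $O(t^2)$ but not $o(t^2)$. Assuming the passage in question is the usual characteristic-function expansion for a variable with mean 0 and variance 1, the remainder term is written $\varphi(t) = 1 - t^2/2 + o(t^2)$, and replacing it with $o(t^3)$ would claim more than finite variance alone guarantees.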

$o(t^2)$ Reply

Sorry, I had not seen your message (Bjcairns) in the discussion entry. I confused big O with small o. I thought that this o was referring to the higher-order corrections of the Taylor expansion formula. I suppose that you are right, so I changed the article back to its previous version with $o(t^2)$ without being logged in. That IP 143.233.xxx.xxx etc. is mine :) My version is perhaps correct if we consider the big O and not the small one. Theofilatos 17:07, 17 Feb 2005 (UTC)
