The Central Limit Theorem - understanding what it is and why it works
- Published 9 Jul 2024
- The Central Limit Theorem underpins much of traditional inference. In this video Dr Nic explains what it entails, and gives an example using dragons.
0:00 Introduction
0:30 Background
0:40 Sampling Distribution of the mean
1:15 Overview of four aspects
1:50 Dragon example
3:00 Samples of n=4
3:55 Aspect 1 - spread is smaller
4:15 Aspect 2
4:50 Aspect 3
5:26 Aspect 4 - sample size effect
6:10 Important notes
There is a really helpful blogpost here also: creativemaths.net/blog/sampli...
See creativemaths.net/videos/ for all of Dr Nic's videos organised by topic.
You can buy Dragonistics data cards at our shop at: shop.creativemaths.net/produc...
#Statistics #Probability #DrNicStats
With the help of these hilarious pictures, I will finally remember the central limit theorem :)
As a high school math teacher, I can say this is one phenomenal video you put together that is very conceptual, illustrative, engaging, and powerful! Great job!
Wow, thank you!
What a great, informative video!! When learning this theorem from a purely analytic viewpoint, it’s hard sometimes to bring it down to something relatable outside of measure spaces. Thank you!
Great sense of humor, and great methods of teaching. As I always say... not all profs are profs, some are 'Borne' to do it. I wish you'd do more videos on Biostats, as it's a universal problem. Thanks for existing. :)
Thanks for the Video Dr Nic! Helps immensely!
Nice style of teaching
THIS WAS JUST THE VIDEO I WAS LOOKING FOR! Dr. Nic, you're saving lives!! Also, loved the dragon example, kept me from falling asleep for sure :)
Thanks Tiffany! So glad to be saving lives.
Oh my god, I'm so glad I discovered your channel!! Cheers from Chile!!! Already in love with your style of teaching, you and the channel!!!!
And cheers from New Zealand. So glad to be of help.
This is underrated, high-quality mathematical content. Great stuff! I will be sure to share it with others if they inquire about math tutorial videos.
Thanks. Maybe you could think about supporting us by joining the channel. The more support we get, the more videos we can make. :)
@@DrNic SG!
❤ Love Dr Nic’s videos. She’s a star of the statistics world 🌍
Great stuff, doc.
This method of teaching is phenomenal. As a Fantasy geek, I really loved the Dragon stats example!
Thanks for that. My editor got a bit carried away on this one, but I'm happy you like the dragons.
Helps immensely!
thanks. very useful video for understanding the central limit theorem.
"Hi, everybody!"
"Hi, Dr Nick!"
Yes - I know. I'm surprised more people don't comment on that. My editor would say "Hi Dr Nic" as he was editing.
Great animations!
Thanks so much, Dr Nic
Great video! And I love the pictures, they're funny!
Hi Dr Nic! First, many thanks for this video it will certainly stay with me for a while.
Just two small notes:
1) I was super confused by the different ticks on the x axis of the graphs when you were comparing distributions. It threw me off to the point that I initially thought you had changed the scale for this axis. After scratching my head for a while trying to figure out why on earth you would do that, I worked out that the problem is the different ticks. It's probably caused by automated drawing of ticks on the plot, but it hurt my brain a bit until I got there.
2) some of those overlaid images have some small composition issues, generally you don't want your body to be facing away from the center of the frame, I have to admit I cannot recall the last time when I've seen such composition creating such a strong feeling of bizzarness and out of placeness, I had to make up two words in one sentence just to be able to express it ;)
Regardless, I still enjoyed this video a lot, so I'm just sharing my constructive criticism with you, as I believe that addressing these points might help you reach a broader audience with your future videos, which I hope to see coming up soon.
Many thanks again, keep up the good work and best wishes!
Amazing! You put so much effort into this video. A guy who occasionally miscounts their own fingers and toes might finally get it.
Glad it was helpful!
This is the best explanation on this topic I've ever seen. Congratulations, and thank you.
Wow, thank you!
Thanks for being there.
Always!
What a funny way of teaching ! Thanks a lot
Loved the explanation and the pics :) that should help the lesson stick
thank you Dr. Nic :).
And again!
As an ex math teacher, I can say that your video is great. You made it easy to learn. Best wishes.
Wow, thanks!
That's the real and easy way to make a non-mathematician like me understand. Kudos, Dr Nic
That is great to hear. I find concrete examples work much better for me than abstract formulas.
Very well explained, Dr. Nic. - Please enlighten us with more of your valuable videos.
Thank you, I would but I have to keep down a job so there will be a bit of a pause for a few months. If I ever get enough channel members I will be able to make videos full time.
I am such a fan of yours, with all those funny pictures and your simple way of making things understandable.
Awww. That is so nice.
Thanks for the great videos! I will definitely get my class to watch.
Please do! And let me know if you find any gaps in my collection. creativemaths.net/videos/
It is very easy for me to understand. Thank you, ma'am.
Thank You!
You are awesome! Really loved the presentation layout and "expressions" XD. Lovely.
Thanks. Some people find the expressions annoying and my editor did get a bit carried away on this one so I’m happy you like it.
@@DrNic He was ahead of his time in the memes I guess XD
Question...
How is the CLT affected by the standard deviation s when applied?
Thanks for the explanation! So basically the means of random samples taken from a random distribution are all distributed Normally, given that the number of samples taken is big enough?
Or is it, but given that the samples are big enough? Couldn't catch this.
The sample size needs to be big enough. If the underlying distribution of the population is well-approximated by the normal distribution, then even a sample of 20 or so will give a mean that is well modelled by a normal distribution. Otherwise, once the sample is about 30 or so, the distribution of the means will be approximately normal. We only ever take one sample unless we are trying to illustrate something in "teaching world". The sample size is what matters. There is a table in this blogpost you might find helpful: creativemaths.net/blog/sampling-distribution/
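The effect of sample size on shape can be checked with a small simulation. This is only a sketch under assumptions that are not from the video: the population is a made-up right-skewed (exponential) distribution, and the names `skewness` and `sample_mean_skewness` are hypothetical. It measures how lopsided the distribution of sample means is for a few sample sizes; a value near 0 suggests a roughly symmetric, bell-like shape.

```python
import random
import statistics


def skewness(values):
    """Return the population skewness of a list of numbers (0 means symmetric)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def sample_mean_skewness(sample_size, repetitions=5000, seed=1):
    """Skewness of the means of many samples from a skewed population.

    Assumption: the population is exponential with mean 1, chosen only
    because it is clearly non-normal. It is not the dragon data.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(repetitions)
    ]
    return skewness(means)


# Taking many samples is for illustration only; in practice there is one sample.
for n in (4, 30, 100):
    print(f"n = {n:3d}: skewness of sample means = {sample_mean_skewness(n):.2f}")
```

For this particular population the theoretical skewness of the sample mean is 2/√n, so the printed values should shrink towards 0 as n grows, though a strongly skewed population gets there more slowly than a nearly normal one.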
Loved it
Great, thank you very much for sharing.
My pleasure!
I liked the pictures, they were funny and helped me pay attention.
Thanks for that. Not everyone likes them, so it's good to hear from someone who does.
This helped me. I am grateful to you.
Tq.. it is well explained👍🏻
You had me at Dragons.
Oh god this is absolutely amazing 😍
ROFL
Perfect for my stats class, thank you.
Hi Jeremy. Pleased to hear it. I know the Central Limit Theorem can be very confusing for students. You may find others of my videos useful.
@@DrNic We just watched this in our class today for AP Stats. Great vid!
Matthew Ware Great to hear it, Matt. Well done for doing AP Stats.
Great tutorials. Helped me a lot. Thank you. Is that accent from New Zealand?
Sure is - NZ through and through.
Thanks! I just need this for my thesis. I need to understand this!
Brilliant!
Thanks
Very good way to understand a difficult concept. You are the bestest.
I'm glad you think so. I have built on many years of explaining this to come up with this.
@@DrNic thanks a lot. Please carry on, the world appreciates you.
Best video about the central limit theorem !
Thanks. I did a lot of thinking about it.
@@DrNic You can be proud of yourself :)
Hello everybody, I'm Doctor Nick
Thank you so much for the brilliant video. You are the only one on YouTube who pointed out that in real life we usually take a single sample. But how can the central limit theorem be used practically when we only take one sample? And doesn't this affect the conclusions we reach? I.e. if the population is not normally distributed and the sample means are, wouldn't this give us wrong ideas about the population?
Glad it was helpful! The central limit theorem lets us draw conclusions about the population from the sample using the theory of the CLT. You might find this blog post helpful: creativemaths.net/blog/sampling-distribution/
Another question: I can't find the etymology of it. Why is it called "central limit"? Wikipedia says it is because it's central to probability, but what is the "limit"?
A very good question. In mathematics we have the idea of a limit, which is a value that is approached but not exceeded. In the case of the Central Limit Theorem, as we get a bigger and bigger sample, the amount of variation of the sample mean around the central value (the population mean) decreases, so at the limit the variation would be zero and the sample mean would equal the population mean. (These are only my thoughts and may be wrong, but it works for me!)
Thank you
omg i love this person. i understood it and it was so fun
Thanks. Lovely to hear.
Hi Dr. Nic. Does your site have practice problems? Thanks.
Hi Lester, no we don't have practice problems except in our courses for NZ high school students. However we are moving towards channel memberships, and will provide some more content there. Two of the videos have practice questions.
i just fell in love =D
Hi, why is the sampling distribution of the mean not within 1-8? (When I draw 4 cards with replacement, all four cards may be 1, so the mean of this sample is 1.)
Hi - can you give me a time stamp for what you are asking about, as it makes it easier for me to answer.
Hi. Great explanation. I have a fundamental question of how can we generalize for population when the population distribution itself is not known. How does the sampling distribution being normally distributed help achieve this generalization? Thanks
Very good question. I don't have time to do a video right now, so I'll write an explanation. We are not making any claims about the population distribution apart from the value of the mean. The CLT lets us model the distribution of the means (the sampling distribution) as a normal distribution, which means we can now use the normal distribution to create confidence intervals and do hypothesis tests. Let me know if you need more explanation.
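As an illustration of how one sample leads to a confidence interval, here is a minimal sketch. The function name `mean_confidence_interval` and the data are hypothetical, and it assumes the sample is large enough for the normal model of the sample mean to be reasonable (many textbooks use the t distribution instead when the sample is small).

```python
import math
import random
import statistics


def mean_confidence_interval(sample, level=0.95):
    """Approximate confidence interval for the population mean from ONE sample.

    Relies on the CLT: the sample mean is modelled as normal with
    standard error s / sqrt(n). Assumption: n is large enough for that
    model to be reasonable.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # z is about 1.96 for a 95% interval
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * standard_error, mean + z * standard_error


# Hypothetical data: one sample of 100 values from a skewed population.
rng = random.Random(42)
sample = [rng.expovariate(1 / 20) for _ in range(100)]
low, high = mean_confidence_interval(sample)
print(f"sample mean = {statistics.fmean(sample):.1f}, 95% CI = ({low:.1f}, {high:.1f})")
```

Note that nothing here requires the population itself to be normal; only the distribution of the sample mean is modelled as normal.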
If bigger samples lead to less spread in the sampling distribution (aspect 4) then how could the sampling distribution be less spread than the population (aspect 1)? Wouldn't the biggest possible sample just be the entire population? I can't quite tell whether the dragon simulation illustrates this property, considering that the population isn't normally distributed (although the sample means are).
The spread in the sampling distribution is the spread of the **means** from the different samples. The means will always be less spread than the original distribution. This blog post might help explain this: creativemaths.net/blog/sampling-distribution/
The original population does not need to be normally distributed for the sampling distribution to be well modelled by a normal distribution. The dragon simulation was chosen to make sure we did not reinforce the erroneous concept that the population needs to be normal.
Hope this helps
Very nicely explained. Thanks.
Thanks - I did a lot of thinking about this - and many years of teaching it.
@@DrNic I can understand. It is an art to teach things this way.
So if you take one large sample then... What, you can just use the sampling distribution's mean to approximate the population mean, or?
That is correct. You take one sample and use the mean of the sample to approximate the sampling distribution, which can then be used to find confidence intervals and hypothesis tests which are underpinned by the Central Limit Theorem. Watch ua-cam.com/video/tFRXsngz4UQ/v-deo.html to get a fuller idea of statistical inference.
very helpful, and amazing edit :)
Great explanation. One suggestion: at 4:29, keep the scale of your y axes the same across each of those figures. That would make it much easier to compare the spread for sampling distributions when changing the sample_size, n. Only change one variable at a time to make easier comparisons, even in visual plots.
Good point!
Thank you ma'am
No cap this video is lit!
I'm hoping this means this is good. If so...glad to hear it!
Hi teacher. Firstly, many thanks for your video. Secondly, may I know why we always need to take more samples from the population just to form a normal distribution? Thank you and have a good day!
I'm glad you find the videos useful. This video takes lots of samples to show that the normal distribution is a good model for the sampling distribution. It is illustrating a concept that could also be shown by mathematical proof. However when applying the central limit theorem we take one sample and use the theorem to draw conclusions about the population from which the sample is drawn.
I suggest you read this blog post and pay particular attention to the table.
creativemaths.net/blog/sampling-distribution/
Thank you for this nice explanation.
I finally get CLT
Thanks a lot Dr !
That is great to hear. It is a tricky concept.
Wooooow amazing, it's very very impressive that I didn't learn a single thing..
It takes all sorts.
Really clear description- helped a lot thank you!
I do have one question though: As The Central Limit Theorem applies when you take multiple samples from a population, why is it important to understand when in "real life" we usually only take one sample?
In the video I take multiple samples in order to illustrate what is happening. It is a simulation but not what would happen in a real-life scenario. Sometimes people think you have to take multiple samples, so I am clarifying that you would usually take just one sample.
One sample consists of multiple data points
That is correct. And it is a thing that people get confused about, so thank you for pointing that out.
Picture and hand movements are scary AF !
Sanjay Sanjel rude comment.
I can't seem to wrap my head around one aspect of the CLT: Since the initial population has a non-normal distribution, if we take the means of all our samples, we get a normal distribution? Even if we take the whole population as samples, we would still get a normal distribution?
That is an excellent question that made me think for a bit and do some reading. For the CLT to apply, the samples need to be made up of independently randomly selected observations. This implies sampling with replacement, otherwise the probability of an observation being chosen would be dependent on what had gone before. We don't usually sample with replacement, but if the population is much bigger than the sample, this is not a problem. Now your suggestion of taking the whole population as a sample violates this condition, as the sample is not made up of independently chosen random observations. Does that help? If not, ask some more.
@@DrNic That makes sense to me. Thank you very much for your response.
If the underlying population distribution is NOT NORMAL, and we have sample sizes less than 30, let's say the samples are size n = 5. I know the distribution of the sample means will not be normal according to the CLT. However, will the distribution have the same mean as the population mean, and will the variance be equal to the variance of the population divided by 5? Please let me know. Thanks.
The expected value of the sample means with a sample size of 5 will still be the population mean. So yes.
And yes to the variance of the sample means. Sorry to take so long to answer. But, as you say, the distribution will not be well approximated by the normal distribution.
If you are good at using Excel or can program, you could make up a little simulation to see what happens.
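A little simulation of that kind might look like the sketch below. The population values and the function name `simulate_sample_means` are made up for illustration; the only thing taken from the thread is the sample size of 5 and the two claims being checked (same mean, variance divided by 5).

```python
import random
import statistics

# Hypothetical, deliberately non-normal population (not the dragon data).
POPULATION = [1, 1, 1, 2, 2, 3, 5, 8]


def simulate_sample_means(sample_size=5, repetitions=20000, seed=7):
    """Means of many random samples drawn with replacement from POPULATION."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(POPULATION, k=sample_size))
        for _ in range(repetitions)
    ]


means = simulate_sample_means()
print("population mean:         ", statistics.fmean(POPULATION))
print("mean of sample means:    ", round(statistics.fmean(means), 3))
print("population variance / 5: ", statistics.pvariance(POPULATION) / 5)
print("variance of sample means:", round(statistics.pvariance(means), 3))
```

The two pairs of numbers should agree closely, even though a histogram of `means` would still look noticeably skewed at this sample size.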
Hi, Dr Nic!
I have a doubt. As sample size increases, the distribution of the sample means tends to become like a normal distribution, as stated in the 3rd point. But in the last point, you mentioned that as sample size increases, the distribution of the sample means decreases. That confused me, as the two seemed to be opposed.
Sorry - I thought I had answered this before but it must have got lost. The shape becomes more bell-shaped like a normal distribution, but the spread reduces.
@@DrNic thank you so much for the clarification
Did they simulate all of those dragon samples using bootstrapping?
No - we already had the population of dragons, so could take repeated random samples from all the dragons. It was a simulation. In bootstrapping we repeatedly take bootstrap samples from the sample.
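The difference between the two approaches can be sketched in code. Everything here is hypothetical (the population of made-up "strength" scores, the sample size of 40, and both function names); the point is only where each set of means is drawn from.

```python
import random
import statistics


def means_from_population(population, sample_size, repetitions, rng):
    """Simulation: every sample is drawn afresh from the full population."""
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(repetitions)
    ]


def means_from_bootstrap(sample, repetitions, rng):
    """Bootstrap: every resample is drawn, with replacement, from ONE sample."""
    return [
        statistics.fmean(rng.choices(sample, k=len(sample)))
        for _ in range(repetitions)
    ]


rng = random.Random(3)
# Hypothetical population: 5000 made-up "strength" scores from 1 to 8.
population = [rng.randint(1, 8) for _ in range(5000)]
one_sample = rng.sample(population, 40)

simulated = means_from_population(population, 40, 2000, rng)
bootstrapped = means_from_bootstrap(one_sample, 2000, rng)
print("spread of simulated means:   ", round(statistics.stdev(simulated), 3))
print("spread of bootstrapped means:", round(statistics.stdev(bootstrapped), 3))
```

The two spreads should be of similar size, which is why bootstrapping is useful when the full population is not available, as is usually the case outside of teaching examples.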
Anyone else have the urge to play Magic or Yu-Gi-Oh now?
Is this idea accurate:
If I take 2 samples vs 200, the E(Y bar) in the latter is going to be closer to the population mu than in the former. As the quantity of your samples increases, not only does your distribution become more bell shaped, but you also get a more unbiased estimate for the population mean.
I know that if your n gets larger, the probability that mu = your E(y bars) increases. I am applying the same logic to the quantity of samples. Not sure if it's true, but whenever I do problems the E(Y bars) is almost always closer to the mu as I add samples.
I'm not sure if you are talking about a theoretical or practical example. In practice you would only ever take one sample.
In a simulation you could take 2 or 200 samples. If you take a sample of size 200 the likelihood that the expected value of the sample means is close to the population mean increases, as you have reduced the spread of the sample means in the sampling distribution.
Not sure if this is answering your question.
@@DrNic Thanks for your response.
Let's say there are 1 million people in a city. We are interested in height. We define a sample size as 1000 people. We take data measurements on 1000 random people 2 times vs 200 times. In both cases we have a sampling distribution of the sample means, and in both we can calculate the distribution mean E(y bars), the average of the averages. I believe that it is a fact that the E(y bars) in the second case is closer to the population mean than in the first because, as your sample distribution grows, this produces a better estimate. As the quantity of sample groups increases (not the actual size of each group, which I realize also produces a similar result because of the law of large numbers), your estimate becomes less biased. Is this wrong? Is this part of the CLT?
@@nicktanner8231 I'm glad you clarified that - many people confuse number of samples with size of sample. The short answer is that in a real life analysis we would only take one sample. The Central Limit Theorem is a theory, based on the concept of infinite samples. I think what you are saying is probably correct, but not part of the Central Limit Theorem. I made up the table in this blog post to clarify: creativemaths.net/blog/sampling-distribution/
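The distinction between the number of samples and the size of each sample can be illustrated with a short simulation. This is a sketch only: the height figures (mean 170 cm, standard deviation 10 cm) and the function name `spread_of_sample_means` are assumptions made up for the example.

```python
import random
import statistics


def spread_of_sample_means(sample_size, number_of_samples, seed=11):
    """Standard deviation of the means of `number_of_samples` samples.

    Assumption: heights follow a normal distribution with mean 170 cm and
    standard deviation 10 cm. These figures are made up for illustration.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(170, 10) for _ in range(sample_size))
        for _ in range(number_of_samples)
    ]
    return statistics.stdev(means)


# Same sample size, different number of samples: the spread should stay similar.
print(spread_of_sample_means(sample_size=25, number_of_samples=200))
print(spread_of_sample_means(sample_size=25, number_of_samples=2000))
# Larger sample size: the spread should roughly halve when n is quadrupled.
print(spread_of_sample_means(sample_size=100, number_of_samples=2000))
```

Taking more samples only gives a smoother picture of the same sampling distribution; it is the sample size that changes how spread out that distribution is.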
@@DrNic Thank you again for a free response.
@@nicktanner8231 No worries - I like to help. Feel free to join the channel for a month or two in gratitude!
Thank you madam
You’re welcome 😊
I just got reminded of the show Tom Goes to the Mayor LOL
I still don't understand... How is it possible to benefit from the central limit theorem with only ONE sample? If I understood correctly, I have to consider the distribution of the MEAN, and how is it possible to consider a distribution with the mean of only ONE sample? I would obtain a histogram composed of only ONE bar... What sense does that make? For example, if I have only a sample of 100 units taken from the population, where the mean is 4, I will obtain as the sampling distribution of the mean a histogram with only ONE bar at the value 4, with an associated probability of 1.
I'm not managing to figure out what I'm doing wrong.
All good questions. You might find this blog post helpful, in particular the table about the nature of the sampling distribution of the mean: creativemaths.net/blog/sampling-distribution/
If you still have questions after reading that, put them here!
1/8th to the power of 4 , because the sample size is 4, right?
That is correct.
Wouldn't it be better, then, to take multiple small samples rather than fewer large samples if your objective is to have a better picture of the population?
Hi John Smith (Really?)
That is an interesting take on it, and the answer is no: you gain nothing by splitting the data up, and the larger the sample the better. How would you use the different samples? You need to find the mean, so you would find several means and then find the mean of them, which would be the same as finding the mean of the larger sample. Say you had two samples of size 9. The standard error for each would be s/3, and the standard error of the average of the two means would be s/√18, about 0.24s. That is exactly what you would get from a single sample of 18. However, the principle you suggest is used to underpin the idea of bootstrapping, where you use the sample to simulate lots of samples as if it were the population.
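The standard errors in this reply can be worked through numerically. This is a minimal sketch, assuming the two samples are independent, share the same standard deviation s, and are combined by averaging their two means; the function names are hypothetical. Under those assumptions the average of two samples of 9 has the same standard error as one sample of 18, which matches the point that averaging the means is the same as finding the mean of the larger sample.

```python
import math


def standard_error(s, n):
    """Standard error of the mean for one sample of size n with standard deviation s."""
    return s / math.sqrt(n)


def standard_error_of_average(standard_errors):
    """Standard error of the plain average of several independent sample means."""
    k = len(standard_errors)
    return math.sqrt(sum(se ** 2 for se in standard_errors)) / k


s = 1.0  # work in units of s
two_small = standard_error_of_average([standard_error(s, 9), standard_error(s, 9)])
one_large = standard_error(s, 18)
print(f"average of two samples of 9: {two_small:.3f} s")
print(f"one sample of 18:            {one_large:.3f} s")
```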
If you were on rate my professor , i imagine 🌶️🌶️🌶️🌶️🌶️🌶️🌶️🌶️🌶️🌶️🌶️so many dragons but only one dr nic
Awesome video. Could you please explain more about the 3:45 part?
So it's like you have a pile of dragons and take a sample of four and get the mean. Then you take another sample of 4 and get the mean. You do this one thousand times, and that graph shows the one thousand means you got. There were some very big and some very small but most were around the middle values, even with a little sample of size 4.
Also, you did not mention an important aspect of the CLT: the population distribution can be of ANY shape (it need not be normal) and the distribution of the sample means will be normal. Not mentioning this aspect is a key takeaway omitted.
Indeed
the pictures are really distracting lol
Sorry - I think my editor got a bit carried away on this one.
I started out learning more about probability and CLT, then I started playing Dungeons & Dragons.
Then I ended up rolling a lot of dice, which helped me learn a lot about probability and statistics.
@@NoActuallyGo-KCUF-Yourself lol
@@NoActuallyGo-KCUF-Yourself 🤣😂
What’s the practical meaning of this theorem if in real life you draw only one sample anyway?
Why can’t we draw more than one sample from a population in real life?
The theorem helps us to gain perspective over the variation between sample means. It gives us an idea of the range of mean values that are likely from a theoretical population so we can then reflect back and say what our sample is telling us, while taking into account the effect of sampling error (also known as variation due to sampling.)
One sample is made up of a number of observations or data points. If you are going to draw more than one sample you are better to put them all in together to make one larger sample. Say you did take two smaller samples and they gave different results, what would you do with the two results?
What is the optimal age group of students for which this video is intended?
I'd say the really eager/keen/naive/bright-eyed-bushy-tailed uni student (so about 17 or 18 I guess).
It depends on when they need to learn about the Central Limit Theorem. In New Zealand it only appears in some university courses. It IS part of AP statistics in the United States. But any student who is interested in why the normal distribution is so common should find this and the video on the normal distribution useful.
1:40 Aspect 4...
*only has picture of her holding up number 3*
Editor: uhhhhhh.... tHiRd ArM!
Yup. You have to make do with what you've got.
She should be lecturer of the year
Thank you - you are very kind. I did win a teaching award when I was a lecturer at University. It is nice to be appreciated.
good start... the example, why did it need to be so unrealistic.. maths (statistics in particular) is supposed to deal with real life examples. just a suggestion.. but good video overall. thanks
Different things appeal to different people, though this is one of my more divisive videos. Some people love the dragons and the weird popups whereas others don't.
yes if the population has the exact same number of every single representative for the variables then of course we will get a normal distribution as n increases, but what if that is not the case, what is the intuition of CLT?
Interesting question. It does not matter what the distribution of the population is. In fact the uniform distribution is quite a long way from a normal distribution so it really isn't "of course we will get a normal distribution". If the population distribution is very non-normal it can take a bit bigger sample size to approach a normal distribution for the sampling distribution, but it will get there.
@@DrNic hi, thanks very much for the clarification! :)
good video - that accent though!
It’s such a nice change from American, I think.
@@DrNic yis! I agrey!
All that talk with the dragons made it confusing
Sorry
💗🌈💗🌈💗🌈💗🌈💗
Medical cohorts seem problematic given epigenetics overlapping with localized environmental pollutants combined with autophagy and other "phases" the body goes through, whether as it matures, or later becomes compromised and potentially degrades at varying rates.
Maybe?
At 1:23 "Aspect Two" you say "The sampling distribution will be well modeled by the normal distribution". Didn't you mean to say "The sampling distribution OF THE MEANS of several samples will be well modeled by the normal distribution"? If you take only one sample, the distribution of the elements of that sample will not be normal if the population is not normal. But if you take many samples, and look at the distribution of the means of those samples, then the CLT says that distribution (of the sample means) will be normal. Yes?
Yes
Aspect 4 with that extra finger XD
Good spotting!
Gret eexplaneshun in undah teen munutz
Great to hear
Very informative but the person doing hand gestures was distracting. 4/5
Agreed - my editor got a bit carried away that time.
The amount of work that must've gone into this video is whack. I just can't get my head around the fact that she probably posed in front of a camera for this video, or the fact that Dragonistics data cards are an actual thing that exists in real life.
Thanks for recognising this. To be fair, we have had that collection of posed photos for several videos, and we made the Dragonistics data cards to help people learn and teach statistics.