Blessings. I’m a 79 yr old grad student. This stuff is rocking my self image
I'm so impressed with how in-depth, and without choking, you are able to teach statistics. It's a pretty complex subject, and thanks to you we are getting to understand it, and who knows, maybe some of us even like it. THANKS!
Hi! Your teaching style is exceptional! You really know how to approach the weaknesses of the students! Keep on the good work!
Beautifully explained! This is a topic that gets glossed over a lot in statistics courses and I really appreciate the amount of time you devoted to it.
Just commented on your other video, but ended up watching this one too by accident while I was searching for information on degrees of freedom. As a medical student doing research on a biostatistics-heavy subject, I can say you truly are a lifesaver! Stuff like this really helps me keep going instead of needing to go through multiple statistics courses in uni on top of all my other studies and research project work!
Thank you so much sir. Please keep up the good work. I'm learning a lot.
If you already know maths and descriptive statistics, except for degrees of freedom and the use of (n-1) instead of n, the critical explanation for you starts at 13:11 and ends at the 15th minute, but it's kind of explained away without a real explanation. Let me look for the sections about regression etc.
Probably one of the BEST instructors I have run into in my career or lifetime - most people cannot teach statistics - this gentleman is awesome!
so good. also his choice of words to explain concepts is really good
Thank you for actually explaining the meaning behind DF clearly before jumping in to any abstract analysis of numbers. It is frustrating trying to find clear content so thank you for that.
Best explanation of Chi Square so far. Best use of my 27 minutes
It is even cool to watch stats with you as a teacher! Bravo.
Liked the video as soon as I heard his introduction. Summed up my feelings about the topic perfectly.
Your pace of delivering is perfect. Can't thank you enough!
1:26 - Relief, I thought that music was going all the way through the video. Awesome video. Best explanation I've found so far.
I have never seen anyone describe degrees of freedom so clearly, thanks!
The three-dimensional explanation of degrees of freedom in regression was really a light bulb moment. Awesome stuff.
Thanks for putting these together man. I'm not a student; I'm just brushing up on my stats. Your explanations are spot on. Had I had a teacher like you, I wouldn't be brushing up on my stats now (and I had some great mathematics teachers).
From 17:00, for the next minute: that's where it clicked for me, and I (somewhat) understood what degrees of freedom means. Thank you! Great breakdown.
DUUUUUDE!!!!! THANKS FOR ALL THE SIMPLE EXPLANATIONS. Appreciate it a lot.
This is the best video I have found online explaining DF: it is the number of independent pieces of information that exist in a sample for predicting the population. To make a prediction about the population, we need a minimum number of independent pieces of information from the sample; generally, the more df, the more accurate the prediction from the sample.
Wow...this is really awesome...you did in 30 mins what my lecturer couldn't do over the whole semester...LOL. THANK YOU!!
🤣🤣🤣
I'm grateful for your lectures, and can say that this specific topic was always somehow incomplete for me, until now! I'm studying calculus, and statistics is a challenge for me. Thank you, and be always healthy!
Big fan Justin, you really don't know how helpful and mind-blowing these videos are for, like, the whole world. I wish you health and happiness in these extraordinary times. Where are you from, as in country and city? I would be fortunate to meet you someday, you're a great guy!
17:13 Only with the third observation do we have a "degree of freedom", such that the regression line can cut through the points and give us errors as well as parameters.
you are a life saver for stats students...
This is a very helpful explanation on a topic that is all too easily glossed over, but I think it is essential to getting a firm grasp of what we are doing with statistics. Thank you for taking the time to post it.
Detailed coverage, kudos. Finally what I have been looking for. Appreciated.
Probably the best explanation about df on UA-cam, well done!
Yeah? But what's the explanation really?
Happy 2021...Thank you Justin for the immense effort you put into this video...Love from Kerala...🙂
Absolutely amazing explanation of degrees of freedom. You gave a very good, simple and easy example with the urchins, through which what df is and how it's calculated could be immediately understood. Great job.
That is a very explanatory, cool sound, and step-by-step lecture. I really like it because it is very easy even for new beginners of statistics. Good job! many thanks.
Is that you? Nice to see a face to a name. Been watching a few of your videos. Thank you. Hope to use the skills into my retirement... touching 60.
Superb explanation. This was stuck in my head. You just cleared the concepts. Grateful to you.
You are really an excellent teacher! I love the way you explain these concepts!
I wish you were my professor
That's something most of us Indians want: better education
It's better if we adapt the Vedic methods😂😂
He is your professor by choice.
Support the idea
@@pk-uk5lc Hey there, buddy! It ain't about being Indian or American, it's about the individual's ability to simplify things and make them more understandable. There are tons of examples of amazing Indian educators out there, not just on UA-cam.
you changed n to n-1 without any mathematical proof. any other proofs? if we want to inflate the estimate, why not make it n-2? i am assuming there is some robust mathematical reasoning.
We use df primarily when estimating variances, because we know that dividing by n underestimates the population variance. To my knowledge, there is no formal mathematical proof to show that n-1 is necessarily "correct". We can never know it is "correct", because in the real world we never know the true population values.
That said, statisticians have demonstrated the concept using toy data sets for which the population values are defined. With these models, n-1 reliably made the variance estimates better. If n-2 had been better at the job, they would have chosen it.
It's the mathematical equivalent of having thermometers that reliably read 10% colder than reality. We are simply compensating for a known bias in the tool.
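The toy-data-set idea above can be tried directly. Here is a minimal sketch, assuming standard normal data (so the true variance is 1) and samples of size 5; the function name and its parameters are made up for illustration and are not from the video:

```python
import random


def average_variance_estimate(divisor_offset, n=5, trials=20000, seed=0):
    """Average of sum((x - xbar)^2) / (n - divisor_offset) over many samples."""
    rng = random.Random(seed)  # fixed seed so every divisor sees the same samples
    total = 0.0
    for _ in range(trials):
        # Draw a sample from a population whose true variance is 1
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar = sum(sample) / n
        sum_of_squares = sum((x - xbar) ** 2 for x in sample)
        total += sum_of_squares / (n - divisor_offset)
    return total / trials


# Compare dividing by n, n-1 and n-2
for offset in (0, 1, 2):
    estimate = average_variance_estimate(offset)
    print(f"divide by n-{offset}: average estimate = {estimate:.3f}")
```

Under these assumptions the averages should land near 0.8 for n, near 1.0 for n-1, and near 1.33 for n-2, so n undershoots, n-2 overshoots, and n-1 sits on the true value on average.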
The number of degrees of freedom of a sum of squares = the number of independent variables in that sum of squares.
Let SS = sum of (yi - ybar)^2, i = 1, 2, ..., n.
Here SS is the sum of squares of the n elements (y1 - ybar), (y2 - ybar), ..., (yn - ybar).
These elements are not all independent, because sum(yi - ybar) = 0 (which is a condition for dependence among the variables).
So we use n-1 degrees of freedom for SS instead of n.
Please correct me if I am wrong.
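The constraint sum(yi - ybar) = 0 is easy to check numerically. A small sketch with made-up observations (the data and variable names are assumptions for illustration only):

```python
import math

y = [4.0, 7.0, 1.0, 8.0]  # made-up observations
ybar = sum(y) / len(y)
deviations = [yi - ybar for yi in y]

# The deviations from the sample mean always sum to zero
total = sum(deviations)

# So the last deviation is fixed once the first n-1 are known
recovered_last = -sum(deviations[:-1])

print("deviations:", deviations)
print("sum of deviations:", total)
print("last deviation recovered from the others:", recovered_last)
print("matches:", math.isclose(recovered_last, deviations[-1]))
```

Only n-1 of the deviations are free to vary; the last one carries no new information, which is the sense in which SS has n-1 degrees of freedom.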
@@niemand262 there are mathematical proofs as to what gives the best estimate on average (i.e. an unbiased estimate). In terms of standard deviations, it's called "Bessel's correction" and there is a proof as to why we use n-1. As for using n-p, i.e. some other number of degrees of freedom, I THINK these are calculated by seeing how many of the data points are free to vary and still give us the same statistic. For example, if we calculate the mean of [x1, x2, x3], if we vary any of them, we can just move another one so that the mean stays the same. As we can move any of them, we have n=3 degrees of freedom. If we are estimating population variance from a sample without knowing the population mean, we are solving 2 equations (one for mean and one for variance) with n unknowns. As such, we can "replace" one of the n data points in the equation for standard deviation with some function of the sample mean whilst still technically expressing the standard deviation in the same way. As we can do this replacement of 1 of the data points with one of our statistics, we have n-1 degrees of freedom. This could be slightly wrong (I came to this video hoping for a full mathematical explanation) but I'm fairly sure it's the gist of it.
@@henrysorsky thanks for pointing out "Bessel's correction". Your intuitive explanation for degrees of freedom makes sense, but why doesn't the same intuition also apply for the standard deviation of the population? After all, given a set of N observations, if you know the deviations of N-1 points around the mean, you know the value of the Nth deviation.
This was very informative! I will be sharing this with my students.
11:26 Why is using absolute values clunky?
Very helpful. Thank you for helping me master this subject.
At 10:00, isn't it the case that the numerator must be zero as well as the denominator, since x - x_bar is also zero? So it's not "an explosion" but undefined.
This is like the 8th video I'm watching on this channel today!! Where have you been all this while!!!!?
9:10 You said the standard deviation is undefined, but mathematically both numerator and denominator are zero, so why is it still undefined?
0/0 = undefined, not zero, believe it or not!
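Software treats the single-observation case the same way. As a rough illustration, Python's standard `statistics.variance` (which divides by n-1) refuses to return a number for one data point; the wrapper function name below is made up for this example:

```python
import statistics


def sample_variance_or_none(data):
    """Return the n-1 sample variance, or None when it cannot be computed."""
    try:
        return statistics.variance(data)  # sample variance, divides by n-1
    except statistics.StatisticsError:
        # Raised when there are fewer than two data points (0/0 situation)
        return None


print(sample_variance_or_none([2.0, 4.0]))  # two points: a variance exists
print(sample_variance_or_none([5.0]))       # one point: no variance to report
```

With one observation there are zero degrees of freedom left for estimating spread, so there is nothing to compute rather than a spread of zero.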
At 25:20, what is the purpose of the marginal values? The calculation will change if the row totals and column totals are different, right?
Thank you so very much for this thorough and well delivered explanation of a complex concept that many educators try to breeze over. The type of explanation you provided is rare and I can't believe how smoothly and clearly you delivered your content. I am super impressed and very inspired as I tutor statistics to my fellow students. Thank you again. Liked, Subscribed and hit the bell 😊🙏
Thaaaankk you! No one ever properly explained it to me!
A sea urchin has so many spikes?!
Loving your videos so far... Really helpful 🎉
23:38
What confuses me about this line of reasoning is that it requires you to know the total number of samples. You can't derive the number of samples for the missing category if you don't know the total. So isn't the total a piece of information in itself? To me it looks like you have four pieces of information: category 1, category 2, category 3 and the total. So you have 4 pieces of information. But one of those doesn't contribute anything new and is therefore obsolete which would leave you with 4 - 1 = 3 independent pieces of information and not 2. What am I missing?
the best stats explanation that I ever had!!!
When a person who was, in fact, seated in the front row asked this question, the professor said something like this: I don't know how to explain this to you, because you don't know enough statistics. I don't want to insult you, but you won't understand this.
He was an ass in my opinion.
I'm very glad I subscribed to this channel
this explanation is simply fantastic, thank you so much!
Awesome explanation! thanks!
Loved the line about dividing by zero, "mathematically speaking, an explosion." Made me laugh out loud.
I noticed that too but it didn’t cause me to lol...just a chuckle.😏
Could you please explain why we use degrees of freedom to adjust the difference between sample statistics and population parameters? What does that have to do with "independent pieces of information"?
Super clear explanation!
14:20 why do we inflate the estimate by choosing to go from n to n-1. Why not n to n-2 for example?
I'm baffled at all the responses saying how everything is clear to them, just because the denominator n-1 is smaller than n. Well n-2 and n-3 etc are also smaller than n.
There's a mention of Bessel's correction in the comments below but nothing else anywhere in the video about what's really going on here.
I don't understand 7:05: why is the DF of xbar 5 (because of the 5 observations?) and the DF of s 4, and then descending further for skewness and kurtosis?
First time I understand why it’s n-k-1. Thanks!
@14:25 OMG, all this time, I took DoF as an abstract peculiarity of the equation.
So the sample mean is just an estimate of the population mean. By reducing n by 1, you are decreasing the denominator and therefore inflating the variance to get a better estimate of the population variance, thereby accounting for sampling error. Even though I still don't know why you would use 2, 3, or 4 DoF, I feel much more relieved now that this major stumbling block is removed. Thank you for breaking it down!
Great concise presentation!
Much appreciated!👍
Great video, actually. For the first time I can say I understand DFs.
It's so good!!!! It's the way statistics should be taught!
23:53 Ooh, that's a dangerous assumption in 2022 haha. Thanks for the great lecture!
Brilliantly explained
Clearly explained, excellent.
I now fully understood the rationale of the Degrees of Freedom.
Kudos to you Bud! Great Explanation!
Seriously, u are the greatest, I love u man 🤍🤍
Incredible explanation!
So, I have a question. What if the population mean, by chance, happens to lie right at the place where our sample mean is calculated? Then wouldn't we be unnecessarily inflating the variance? Might be a silly question, but I am understanding the concept so well that I just wanted to ask. :)
17:45 The explanation of having k "X" variables was a bit confusing, I had to go through a second time to understand. I am not sure calling all the variables "X" variables is correct???
Wouldn't we just say "the number of variables?" One of the independent variables is perhaps on an X axis, the other independent variable on the Y. The dependent variable is on the Z.
Good explanation everywhere else - the Chi-square examples were especially interesting.
Quite clearly explained.
This is very helpful 😃
great explanation. Keep up your good work!
Fascinating Work you are doing ... Keep it up Plz
Hello, great job on all the explanations. But my question is: I understand that we need to "inflate" the computed variance because the population mean is estimated by x bar, but why should this "inflation" be done by dividing by n-1? Why not divide by (n/2)? I did not see the answer to: where did n-1 come from? I will listen to the rest of the video in case the answer is in the remaining part...
Same exact question! Thank you
Search for Bessel's Correction, there's a tough math deduction behind n-1.
Thank you! Intuitive explanations.
It couldn't be a better explanation!
Unfortunately, when we ask ourselves: "what is, again, a "degree of freedom?", a half-hour response would not come to mind... 😢
What great explanation! I thank you.
The reason n-1 is used in calculating s^2 is that dividing by n-1 gives an unbiased estimator of sigma^2, the variance. You did some wishy-washy hand waving. If you don't want to work through the math, just say it can be shown that if n is used to calculate s^2, its expected value is ((n-1)/n)*sigma^2, so by using n-1 you instead end up with ((n-1)/(n-1))*sigma^2 = sigma^2, which is unbiased.
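For anyone who wants the "it can be shown" step spelled out, here is a sketch of the standard argument (usually filed under Bessel's correction). It assumes the observations x_1, ..., x_n are independent with common mean \mu and variance \sigma^2; those symbols are introduced here and are not from the video:

```latex
\begin{align*}
\sum_{i=1}^{n}(x_i-\bar{x})^2
  &= \sum_{i=1}^{n}(x_i-\mu)^2 - n(\bar{x}-\mu)^2 \\
\mathbb{E}\Big[\sum_{i=1}^{n}(x_i-\mu)^2\Big]
  &= n\sigma^2, \qquad
\mathbb{E}\big[n(\bar{x}-\mu)^2\big] = n\cdot\frac{\sigma^2}{n} = \sigma^2 \\
\mathbb{E}\Big[\sum_{i=1}^{n}(x_i-\bar{x})^2\Big]
  &= n\sigma^2 - \sigma^2 = (n-1)\sigma^2 \\
\mathbb{E}\Big[\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2\Big]
  &= \frac{n-1}{n}\sigma^2, \qquad
\mathbb{E}\Big[\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2\Big] = \sigma^2
\end{align*}
```

By the same algebra, dividing by n-2 would give an expected value of ((n-1)/(n-2))*sigma^2, which overshoots. That is why the correction is n-1 specifically and not just "something smaller than n".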
Great explanation
Justin, I had a doubt.
Is it okay to say, in simple regression analysis, y hat has n-2 dof because beta1 and beta2 are estimates of the population coefficients that we don't know ??
In a regression, when you have three observations (n = 3), the formula gives 1 degree of freedom. What does the number 1 mean (based on your definition at the 6-minute mark of the video)?
I also like to think of n-1 as reminding us that we just have xbar and hence only 1 piece of information.
hi justin, nice and useful video you got here. I'd like to request if you could make one about bias, and what does it mean when you say unbiased estimate. thank you.
Oooo! I like this idea. Might even do a series on bias. Though I'm behind on other series at the moment so STAY TUNED :)
Thanks a lot! This was really helpful! Thanks again sir!
11:40 "using absolute values is clunky, statistically [which is why we use the variance instead]". I'm curious what the clunkiness you're alluding to is.
I was taught that variance/stddev includes outliers a bit more than if we used absolute values, and that that was typically something we wanted in statistics, but that never made much sense to me.
Great video, thank you! I'm a stats tutor in college and the prof for this class definitely handwaved DF.
I think it has to do with absolute value functions not being differentiable everywhere. Finding optima is not as easy as with squared functions, which are differentiable everywhere.
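One way to see that difference, as a rough sketch with made-up data (function names are illustrative): the squared loss has a slope everywhere and a single minimum at the mean, while the absolute loss has kinks at the data points and, for an even number of points, a whole flat stretch of equally good answers.

```python
def squared_loss(c, data):
    """Sum of squared deviations of the data from a candidate centre c."""
    return sum((x - c) ** 2 for x in data)


def squared_loss_slope(c, data):
    """Derivative of squared_loss with respect to c; defined for every c."""
    return sum(-2.0 * (x - c) for x in data)


def absolute_loss(c, data):
    """Sum of absolute deviations; has a kink (no derivative) at each data point."""
    return sum(abs(x - c) for x in data)


data = [1.0, 2.0, 3.0, 10.0]  # made-up data
mean = sum(data) / len(data)

# Squared loss: set the slope to zero and you land on the mean
print("slope of squared loss at the mean:", squared_loss_slope(mean, data))

# Absolute loss: every centre between the two middle points is equally good
for c in (2.0, 2.5, 3.0, mean):
    print(f"absolute loss at c={c}: {absolute_loss(c, data)}")
```

So with squares, calculus hands you one clean answer; with absolute values you need a different kind of search and may not get a unique optimum.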
Great content. I am starting a statistics channel. Any recommendations?
Why isn't the total counted as one of the independent pieces of information? After all, we need that too??
Great explanation, thank you!
13:53 So you're saying we arbitrarily use n-1 to inflate the variance because the population variance (and thus the std dev) may be, or is likely to be, larger? Why not use n-2 then? I guess because you need two data points to get a standard dev.
I would be interested in your perspective on how degrees of freedom should be considered for nuisance parameters.
Amazing content.
But can you say why, at 7:03, there are 5 df for the mean but only 4 df for the std? I am a newbie to stats.
What software are you using to make these excellent videos? They are amazingly crisp and clean! Under the current pandemic and wanting to create better videos, I would love to know what tools you use. If you could share, that would be great! Thank you for making these videos!
obvious that it's Prezi software
@@gazzzada Is it? I have used Prezi and wouldn't have guessed that's what was used to create this video. Thanks.
The current version gives even more options
Great video, as always. I could not find anything specifically about the F-distribution; is it in the pipeline? Thank you
Thanks and a very happy new year.
Nice & illustrative
May I ask what software you use to produce such attractive and informative videos?
@zedstatistics
Dear Sir, when you explained about the n-1 in the denominator of S.D., it was more of an empirical observation. I would like to know if we have a mathematical deduction of this formula.
Thank you