What are degrees of freedom?!? Seriously.
- Published 29 Jun 2024
- See all my videos at www.zstatistics.com/videos/
Ever wondered why lecturers often baulk at the idea of explaining degrees of freedom?? Well... it's a tough topic. But here it is. Presented succinctly in all of its delicious glory.
You'll find that in understanding degrees of freedom, you actually are leaps ahead in understanding statistics itself.
1:13 Introduction
3:23 Degrees of Freedom Intuition (WATCH THIS BIT!)
7:33 Standard deviation and descriptive statistics
15:05 Regression
20:28 Chi-squared goodness of fit test
24:12 Chi-squared test for independence
Blessings. I'm a 79-year-old grad student. This stuff is rocking my self-image.
I'm so impressed with how in-depth and effortlessly you are able to teach statistics. It's a pretty complex subject, and thanks to you we are getting to understand it, and who knows, maybe some of us will even like it. THANKS!
Hi! Your teaching style is exceptional! You really know how to approach the weaknesses of the students! Keep on the good work!
Your pace of delivering is perfect. Can't thank you enough!
Beautifully explained! This is a topic that gets glossed over a lot in statistics courses and I really appreciate the amount of time you devoted to it.
Just commented on your other video, but ended up watching this too by accident while I was searching for information on degrees of freedom. As a medical student doing research on a biostatistics-heavy subject, you truly are a lifesaver! Stuff like this really helps me keep going instead of needing to go through multiple statistics courses at uni on top of all my other studies and research project work!
Thanks for putting these together man. I'm not a student; I'm just brushing up on my stats. Your explanations are spot on. Had I had a teacher like you, I wouldn't be brushing up on my stats now (and I had some great mathematics teachers).
Probably one of the BEST instructors I have run into in my career or lifetime - most people cannot teach statistics - this gentleman is awesome!
Big fan, Justin. You really don't know how helpful and mind-blowing these videos are for, like, the whole world. I wish you health and happiness in these extraordinary times. Where are you from, as in country and city? I would be fortunate to meet you someday - you're a great guy!
Detailed coverage, kudos. Finally what I have been looking for. Much appreciated.
It is even cool to watch stats with you as a teacher! Bravo.
Absolutely amazing explanation of degrees of freedom. You gave a very good, simple and easy example with the urchins, through which what df is and how it's calculated can be immediately understood. Great job.
That is a very clear, calm, step-by-step lecture. I really like it because it is very easy even for complete beginners in statistics. Good job! Many thanks.
Thank you for actually explaining the meaning behind DF clearly before jumping in to any abstract analysis of numbers. It is frustrating trying to find clear content so thank you for that.
This is a very helpful explanation on a topic that is all too easily glossed over, but I think it is essential to getting a firm grasp of what we are doing with statistics. Thank you for taking the time to post it.
Probably the best explanation of df on YouTube, well done!
Yeah? But what's the explanation, really?
Thank you so very much for this thorough and well delivered explanation of a complex concept that many educators try to breeze over. The type of explanation you provided is rare and I can't believe how smoothly and clearly you delivered your content. I am super impressed and very inspired as I tutor statistics to my fellow students. Thank you again. Liked, Subscribed and hit the bell 😊🙏
Happy 2021...Thank you Justin for the immense effort you put into this video...Love from Kerala...🙂
From 17:00, for the next minute: that's where it clicked for me, and I (somewhat) understood what degrees of freedom means. Thank you! Great breakdown.
Superb explanation. This was stuck in my head. You just cleared the concepts. Grateful to you.
You are really an excellent teacher! I love the way you explain these concepts!
Liked the video as soon as I heard his introduction. It summed my feelings about the topic up perfectly.
1:26 - Relief, I thought that music was going all the way through the video. Awesome video. Best explanation I've found so far.
so good. also his choice of words to explain concepts is really good
you are a life saver for stats students...
I'm grateful for your lectures, and can say that this specific topic was always somehow incomplete for me, until now! I'm studying calculus, and statistics is a challenge for me. Thank you, and stay healthy!
This was very informative! I will be sharing this with my students.
I have never seen anyone describe degrees of freedom so clearly, thanks!
I'm very glad I subscribed to this channel
Great concise presentation!
Much appreciated!👍
Loving your videos so far... Really helpful 🎉
Very helpful. Thank you for helping me master this subject.
this explanation is simply fantastic, thank you so much!
Thank you so much sir. Please keep up the good work. I'm learning a lot.
Thanks and a very happy new year.
This is like the 8th video I'm watching on this channel today!! Where have you been all this while!?
This is the best video I have found online to explain DF: it is the number of independent pieces of information that exist in a sample for estimating the population. To make an estimate, we must know the minimum number of independent pieces of the sample needed to generalise to the population. Generally, the more df, the more accurate the estimate from the sample.
Kudos to you Bud! Great Explanation!
It's so good!!!! It's the way of how statistics should be taught!
The three-dimensional explanation of degrees of freedom in regression was really a light bulb moment. Awesome stuff.
Super clear explanation!
Awesome explanation! thanks!
I wish you were my professor
That's something most of us Indians want - better education.
It's better if we adapt the Vedic methods😂😂
He is your professor by choice.
Support the idea
@@pk-uk5lc Hey there, buddy! It ain't about being Indian or American, it's about the individual's ability to simplify things and make them more understandable. There are tons of examples of amazing Indian educators out there, not just on YouTube.
great explanation. Keep up your good work!
Fascinating work you are doing... keep it up, please!
Wow... this is really awesome... you did in 30 minutes what my lecturer couldn't do over the whole semester... LOL. THANK YOU!!
🤣🤣🤣
Thank you! Intuitive explanations.
the best stats explanation that I ever had!!!
Outstanding. Thank you!
Best explanation of Chi Square so far. Best use of my 27 minutes
What great explanation! I thank you.
Thanks a lot! This was really helpful! Thanks again sir!
Incredible explanation!
Great explanation, thank you!
You're really good, thank you
Clearly explained, excellent.
Quite clearly explained.
Is that you? Nice to see a face to a name. Been watching a few of your videos. Thank you. Hope to use the skills in my retirement... touching 60.
Liked and shared. Great content!
Brilliantly explained
Amazing content.
But can you say why, at 7:03, you mean that there are only 5 df for the mean and 4 df for the std? I am a newbie to stats.
Loved the line about dividing by zero: "mathematically speaking, an explosion." Made me laugh out loud.
I noticed that too but it didn’t cause me to lol...just a chuckle.😏
Sir, I would like to meet you some day... you really clarified one of my biggest doubts.
Seriously, you are the greatest, I love you man 🤍🤍
Could you please explain why we use degrees of freedom to adjust the difference between sample statistics and population parameters? What does that have to do with "independent pieces of information"?
amazing video! thank you!!
If you already know maths and descriptive statistics, except degrees of freedom and the use of (n-1) instead of n, the critical explanation for you starts at 13:11 and ends around the 15th minute - but it's kind of explained away without a real explanation. Let me look for the sections about regression etc.
Great video - actually, for the first time I can say I understand DFs.
Wow... Amazing, amazing. So far I had just taken it as a rule of thumb. Now it all makes sense.
You changed n to n-1 without any mathematical proof. Are there any proofs? If we want to inflate the estimate, why not make it n-2? I am assuming there is some robust mathematical reasoning.
We use df primarily when estimating variances, because we know that dividing by n underestimates the population variance. To my knowledge, there is no formal mathematical proof to show that n-1 is necessarily "correct". We can never know it is "correct", because in the real world we never know the true population values.
That said, statisticians have demonstrated the concept using toy data sets for which the population values are defined. With these models, n-1 reliably made the variance estimates better. If n-2 had been better at the job, they would have chosen it.
It's the mathematical equivalent of having thermometers that reliably read 10% colder than reality. We are simply compensating for a known bias in the tool.
The number of degrees of freedom of a sum of squares = the number of independent variables in that sum of squares.
Let SS = sum of (yi - ybar)^2, for i = 1, 2, ..., n.
Here SS is the sum of squares of n elements: (y1 - ybar), (y2 - ybar), ..., (yn - ybar).
These elements are not all independent, because sum(yi - ybar) = 0 (a linear constraint tying them together).
So we use n-1 degrees of freedom for SS instead of n.
Please correct me if I am wrong.
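The zero-sum constraint in the comment above is easy to verify numerically. A minimal sketch with made-up numbers (nothing here comes from the video):

```python
# Deviations from the sample mean always sum to zero, so only
# n - 1 of them are free to vary: the last one is determined.
y = [4.0, 7.0, 1.0, 8.0]
ybar = sum(y) / len(y)                    # 5.0
deviations = [yi - ybar for yi in y]

print(deviations)       # [-1.0, 2.0, -4.0, 3.0]
print(sum(deviations))  # 0.0
```

Pick any three of the four deviations freely and the fourth is forced, which is exactly the "n - 1 independent pieces of information" idea.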
@@niemand262 There are mathematical proofs as to what gives the best estimate on average (i.e. an unbiased estimate). For standard deviations, it's called "Bessel's correction" and there is a proof as to why we use n-1. As for using n-p, i.e. some other number of degrees of freedom, I THINK these are calculated by seeing how many of the data points are free to vary while still giving us the same statistic. For example, if we calculate the mean of [x1, x2, x3] and vary any one of them, we can just move another one so that the mean stays the same. As we can move any of them, we have n=3 degrees of freedom. If we are estimating the population variance from a sample without knowing the population mean, we are solving 2 equations (one for the mean and one for the variance) with n unknowns. As such, we can "replace" one of the n data points in the equation for the standard deviation with some function of the sample mean while still technically expressing the standard deviation in the same way. As we can do this replacement of 1 of the data points with one of our statistics, we have n-1 degrees of freedom. This could be slightly wrong (I came to this video hoping for a full mathematical explanation) but I'm fairly sure it's the gist of it.
@@henrysorsky Thanks for pointing out "Bessel's correction". Your intuitive explanation of degrees of freedom makes sense, but why doesn't the same intuition also apply to the standard deviation of the population? After all, given a set of n observations, if you know the deviations of n-1 points around the mean, you know the value of the nth deviation.
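The bias this thread is debating can be checked with a quick Monte Carlo simulation. A sketch, with arbitrary choices of population (normal with sd 2, so sigma^2 = 4), sample size and replication count:

```python
import random

# Compare the n and n-1 denominators for the sample variance.
random.seed(0)
true_var = 4.0            # we draw from N(0, sd=2), so sigma^2 = 4
n, reps = 5, 200_000

sum_div_n = sum_div_n1 = 0.0
for _ in range(reps):
    x = [random.gauss(0, 2) for _ in range(n)]
    xbar = sum(x) / n
    ss = sum((xi - xbar) ** 2 for xi in x)
    sum_div_n += ss / n          # divide by n: biased low
    sum_div_n1 += ss / (n - 1)   # divide by n-1: Bessel's correction

print(sum_div_n / reps)   # ~3.2, i.e. (n-1)/n * sigma^2: too small
print(sum_div_n1 / reps)  # ~4.0, close to the true variance
```

Dividing by n lands near (n-1)/n times the true variance, while n-1 averages out to the right answer, which is the precise sense in which it is "unbiased".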
This is very helpful 😃
So, I have a question. What if the population mean, by chance, happens to lie right on the place where our sample mean is calculated? Then wouldn't we be unnecessarily inflating the variance? Might be a silly question, but I am understanding the concept so well that I just wanted to ask. :)
23:53 Ooh, that's a dangerous assumption in 2022 haha. Thanks for the great lecture!
What software are you using to make these excellent videos? They are amazingly crisp and clean! Under the current pandemic, wanting to create better videos myself, I would love to know what tools you use. If you could share, that would be great! Thank you for making these videos!
obvious that it's Prezi software
@@gazzzada Is it? I have used Prezi and wouldn't have guessed that's what was used to create this video. Thanks.
The current version gives even more options.
Love it!
what a legend! thank you!
really great video
hi justin, nice and useful video you got here. I'd like to request if you could make one about bias, and what does it mean when you say unbiased estimate. thank you.
Oooo! I like this idea. Might even do a series on bias. Though I'm behind on other series at the moment so STAY TUNED :)
17:13 Only with the third observation do we have a "degree of freedom", such that the regression line can cut through the points to give us the errors and the parameters.
Nice & illustrative
May I ask what software you use to produce such attractive and informative videos?
just amazing...
I also like to think of n-1 as reminding us that we just have xbar and hence only 1 piece of information.
First time i understand why it’s n-k-1. Thanks!
9:10 You said the standard deviation is undefined, but mathematically both the numerator and the denominator are zero, so why is it still undefined?
0/0 = undefined, not zero, believe it or not!
Fantastic!!!
Hi, thanks for the nice explanation!
I have a question about calculating degrees of freedom in the chi-squared test. In population genetics, the degrees of freedom are calculated as "the number of categories - the number of parameters" when we do a chi-squared test for Hardy-Weinberg equilibrium. For example, if there are 6 genotypes (AA, BB, CC, AB, BC, AC), the number of categories (genotypes) is 6, and the number of parameters (alleles) is 3 (A, B, C). So the degrees of freedom are 3. Do you know why this is?
Additional description: the allele frequencies sum to 1, and the expected genotype counts are calculated from products of allele frequencies.
If the observed genotype counts are 30, 40, 30 for AA, AB, BB respectively, the allele frequencies are 0.5, 0.5 for A, B respectively. So the expected counts will be 25 (0.5*0.5*100), 50 (2*0.5*0.5*100), 25 (0.5*0.5*100) for AA, AB, BB.
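The arithmetic in that two-allele example can be sketched as follows. The genotype counts are the ones in the comment; the variable names and the df bookkeeping (categories, minus 1 for the fixed total, minus 1 for the estimated allele frequency) are my reading of the Hardy-Weinberg setup, not something stated in the video:

```python
# Hardy-Weinberg goodness-of-fit sketch for a two-allele locus.
observed = {"AA": 30, "AB": 40, "BB": 30}
total = sum(observed.values())                           # 100

# Allele frequencies estimated from the same data.
p = (2 * observed["AA"] + observed["AB"]) / (2 * total)  # freq(A) = 0.5
q = 1 - p                                                # freq(B) = 0.5

expected = {"AA": p * p * total,       # 25
            "AB": 2 * p * q * total,   # 50
            "BB": q * q * total}       # 25

chi2 = sum((observed[g] - expected[g]) ** 2 / expected[g] for g in observed)
df = len(observed) - 1 - 1  # 3 categories - 1 (fixed total) - 1 (estimated p)
print(chi2, df)             # 4.0 1
```

Each parameter estimated from the data costs a degree of freedom on top of the usual "categories minus one", which is why the df comes out smaller than in a plain goodness-of-fit test.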
Your videos are so useful, thank you so much! One thing I can't get my head around here, though. So, we divide by n-1 (as opposed to n) to account for the variance needing to be larger, as our sample mean is just an approximation of the population mean and the variance about the sample mean is as small as it can be. But we don't know the population mean, so our sample mean could be the same as the population mean, and thus we would be overestimating the variance by dividing by n-1 and not n. Is this true?
Overestimation could occur only if n, the sample size, were greater than the population size, which by definition is impossible. You can't gather more units to measure than the absolute best-case scenario of having 100% of the information. For example, you can't run a questionnaire through all 8 billion people. And even if you could, you couldn't hope they all answered 100% honestly, without biases. If you divided your result by 8 billion and 1, your result would by default be smaller than the true value: dividing by a larger number gives you a smaller value. Anything we measure is less than 8 billion people; hence, all of our results will be larger or smaller than the true value. If we hadn't measured 1 person out of the total population, that 1 person would be the final measurement before the sample mean becomes the population mean. That one last piece of information increases or decreases the sample mean into the population mean.
We divide by n-1, or n-k-1, or (c-1)(r-1) to have our guess be as close to the actual population value as possible. We know that any guess we make is not spot-on the population value. Look at what happens with the deviations (x - mu) in a perfect population of random numbers whose mean is zero:
(3-0) + (2-0) + (1-0) + (0-0) + (-1-0) + (-2-0) + (-3-0) = 3 + 2 + 1 + 0 - 1 - 2 - 3 = 0
All of the values cancel out. We square them to get rid of the negative signs and have a usable metric to assess reality. We probably could have used |absolute value bars| with a similar result.
When your guess passes 0, i.e. our sample mean passes the population mean, statistics under- or overestimates the true mean by default. Once your guess is higher than the mean, the now-squared distances start to grow again:
9 + 4 + 1 + 0 + 1 + 4 + 9
It does not collapse on itself. Graphically it's a parabola, and your estimate sits at its bottom point.
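That parabola picture can be checked directly: the sum of squared deviations, viewed as a function of the centre point, bottoms out at the sample mean. A small sketch using the same numbers as the comment:

```python
data = [3, 2, 1, 0, -1, -2, -3]
xbar = sum(data) / len(data)   # 0.0

def ss(center):
    # Sum of squared deviations about an arbitrary centre point.
    return sum((x - center) ** 2 for x in data)

print(ss(xbar))   # 28.0 -> 9+4+1+0+1+4+9, the minimum of the parabola
print(ss(1.0))    # 35.0 -> larger: the parabola rises on either side
print(ss(-1.0))   # 35.0 -> symmetric on the other side
```

Because the sample mean minimises this sum, squared deviations taken about it are systematically no larger than those about the true population mean, which is the intuition behind inflating the estimate with n-1.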
I guess what you are asking is: will we overestimate the variance when our guess happens to match the true population value perfectly, so that no better guess can be made?
Firstly, you should never assume you can guess the true population value. We live in a real world where nothing is perfect. Statistics is not perfect; it simply strives for perfection. It is improbable to guess any single continuous value precisely. What is the chance that the next person has a height of 173.48763498456437654405 cm? Impossible to pick a sample this precise.
For a sample of n=1 you'd have an undefined sample variance, because with n-1 the variance is divided by zero. How far away is a sample from itself? No distance - it is the mean value of itself. Statistics doesn't deal with absolutes and cannot PROVE anything IS, because we never know the population at any given moment. Something is always left unaccounted for.
Imagine the equation without the -1 part, i.e. a sample of me (n=1) writing one comment (x=1). The mean of that is 1. The variance is (1-1)/1 = 0. No spread from itself. It's nonexistent either way: either you say it's 0, or it's undefined, because you divided the spread by 0.
Your alternative hypothesis, defined by the sample itself, IS the null hypothesis; you've "proven" the alternative hypothesis is the null hypothesis. The null hypothesis is not rejected within a significance level of 1; we can have an underestimation or overestimation of the true population mean.
Having written all this, I still don't think I have answered the question of why an overestimation happens as the sample mean becomes the population mean.
Nice, sir. Thanks!
Thank you!
Great video, as always. I could not find anything specifically about the F-distribution - is it in the pipeline? Thank you
I would be interested in your perspective on how degrees of freedom should be considered for nuisance parameters.
Thanks to you!
@14:25 OMG, all this time I took DoF as an abstract peculiarity of the equation.
So the sample mean is just an estimate of the population mean. Reducing n by 1, you decrease the denominator and therefore inflate the variance to get a better estimate of the population variance, thereby accounting for sampling error. Even though I still don't know why you would use 2, 3, 4 DoF, I feel much more relieved now that this major stumbling block is removed. Thank you for breaking it down!
GREAT!, thanks a lot
Brilliant!
@zedstatistics
Dear Sir, when you explained the n-1 in the denominator of the S.D., it was more of an empirical observation. I would like to know if there is a mathematical derivation of this formula.
Thank you
good explanation
Thanks!
Happy 2019!
To you too YL!
The reason n-1 is used in calculating s^2 is that it makes s^2 an unbiased estimator of sigma^2, the variance. You did some wishy-washy hand waving. If you don't want to work through the math, just say it can be shown that if n is used in the denominator of s^2, the expected value comes out to ((n-1)/n)*sigma^2, so by using n-1 instead you end up with ((n-1)/(n-1))*sigma^2 = sigma^2, which is unbiased.
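The unbiasedness claim can even be verified exactly on a toy population by enumerating every equally likely sample with replacement, rather than simulating. A sketch; the two-value population is just the smallest convenient example:

```python
from itertools import product

population = [0, 2]
mu = sum(population) / len(population)                              # 1.0
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)   # 1.0

n = 2
samples = list(product(population, repeat=n))  # (0,0), (0,2), (2,0), (2,2)

def s2(sample, denom):
    # Sample variance with a chosen denominator.
    xbar = sum(sample) / n
    return sum((x - xbar) ** 2 for x in sample) / denom

avg_n1 = sum(s2(s, n - 1) for s in samples) / len(samples)
avg_n = sum(s2(s, n) for s in samples) / len(samples)

print(avg_n1)  # 1.0 -> E[s^2] equals sigma^2 exactly: unbiased
print(avg_n)   # 0.5 -> ((n-1)/n) * sigma^2: biased low
```

Averaged over all four possible samples, the n-1 denominator recovers sigma^2 exactly, while the n denominator lands at (n-1)/n of it, matching the formula in the comment above.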