Wow. The best explanation I've ever heard. Thank you so much for your great video.
Best explanation I’ve come across, thanks
Everything makes sense after listening to your lecture. Salute to you, sir!
How did I miss this channel!!! Excellent explanation! 👏👏 Thank you!!
Thanks, Jack! Your video was super entertaining and informative and I finally understand how all these terms come together. Thank you thank you thank you soooooo much!
Thank you very much, great explanation!!!:)
Awesome🥰
This video is so helpful!
Extremely good and clear explanation
excellent
This makes me love statistics even more.
I am studying the chi-squared goodness of fit hypothesis test to compare a histogram of a set of samples to the normal distribution with some mean and standard deviation.
Some literature says that to calculate the number of degrees of freedom we must use the expression:
n_of_bars_of_histogram - 1 - n_of_parameters
If I understand your explanation correctly, in that expression the parameters take out 2 degrees of freedom, and there is one more number which cannot vary given a particular mean and standard deviation.
Is this correct?
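To make that bookkeeping concrete, here is a minimal Python sketch (the data, the number of bins, and the distribution parameters are all made up for illustration; `scipy.stats.chisquare` uses k - 1 - ddof degrees of freedom, so passing ddof = 2 accounts for the two estimated parameters):

```python
# Minimal sketch: chi-squared goodness-of-fit of a histogram against a
# normal distribution whose mean and sd are estimated from the same data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)                 # made-up example data
data = rng.normal(loc=10.0, scale=2.0, size=500)

mu, sigma = data.mean(), data.std(ddof=1)      # p = 2 estimated parameters

k = 8                                          # number of histogram bars
counts, edges = np.histogram(data, bins=k)

# Expected counts per bin under the fitted normal, rescaled so the totals
# match (the normal puts some mass outside the outermost bin edges).
expected = len(data) * np.diff(stats.norm.cdf(edges, loc=mu, scale=sigma))
expected *= counts.sum() / expected.sum()

p = 2
chi2, pval = stats.chisquare(counts, f_exp=expected, ddof=p)
print(f"df = {k - 1 - p}, chi2 = {chi2:.2f}, p-value = {pval:.3f}")
```

So yes: the -1 corresponds to the one bin count that is pinned down once the total sample size is known, and the two fitted parameters remove two more degrees of freedom, leaving k - 1 - 2.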
I don’t understand. I wonder if I’m genuinely stupid. Why is the mean free to vary? It seems like only the points we observe are free to vary, and the sample mean is fixed given those points.
The point is not that you could have gone out and collected a measurement for the mean and then figured out the last data point from the mean. That would be silly. The point is that the mathematics doesn't care which came first. The statement "1 + 2 = 3" has 3 numbers, but only 2 of them (any 2) are free to vary. (And if we use the sum "3" in our statistical model, there is only one left that was free to vary.)
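Here is a tiny sketch of the same idea in Python (all the numbers are made up): once the sample mean is treated as fixed, only n - 1 of the values are free, and the last one is forced.

```python
# If the mean is fixed, the last data point is determined by the others.
n = 5
fixed_mean = 10.0
free_values = [8.0, 12.0, 9.5, 11.0]            # n - 1 freely chosen values
last_value = n * fixed_mean - sum(free_values)  # forced to be 9.5 here
assert (sum(free_values) + last_value) / n == fixed_mean
```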
8:21 This is technically not true. If I measure 1 bear and then measure 1 ant, on almost all metrics I can build a classifier that assigns future instances of bears or ants to their corresponding classes. The bottom line is that the less the classes in your data truly differ, the more data you will need to establish a significant difference. But past a certain point, 1 instance of each class is enough to estimate the scale of the effect size we would expect from a larger sample. One data point still estimates the true mean, just worse than a sample large enough to have any variance.
Only because you have a prior model of what a bear is and what an ant is.
If you gave one patient a drug and it cured them and one patient a placebo and they died, you would not be sure about any future efficacy of the drug.
The ONLY reason you know that you could classify ants and bears with an N=1 is because you've seen images of hundreds of bears and ants. So your N is actually hundreds.
And of course what about the amazing water bear lol.
@@dr.jackauty4415 "If you gave one patient a drug and it cured them and one patient a placebo and they died, you would not be sure about any future efficacy of the drug."
Not "sure", but I would have knowledge. Given that you observe your fellow tribesman die after eating a suspicious plant, would you eat the plant even though N = 1? It's just not AS reliable a sample as observing more people eat that plant, but here we stumble to philosophical territory where the importance of the outcome changes how we should conduct inference.
My point is that a data point drawn from a population is more likely to be empirically similar to other instances of that population than not, and thus even one data point tells us something. We just lack variance, which means we can't do fancy sample statistics (because we can't measure the error).
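A quick sketch of that last point (the single value 7.3 is made up): one observation still yields a point estimate of the mean, but the sample standard deviation needs at least two observations, so there is no way to measure the error.

```python
import numpy as np

one = np.array([7.3])   # a single made-up observation
print(one.mean())       # 7.3 -- a (poor) point estimate of the true mean
print(one.std(ddof=1))  # nan (with a RuntimeWarning): zero degrees of
                        # freedom left, so the error cannot be estimated
```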