StatQuest: One or Two Tailed P-Values

StatQuest with Josh Starmer

Published 6 Jan 2025

COMMENTS • 65

@statquest 2 years ago ⁺¹
Support StatQuest by buying The StatQuest Illustrated Guide to Machine Learning!!!
PDF - statquest.gumroad.com/l/wvtmc
Paperback - www.amazon.com/dp/B09ZCKR4H6
Kindle eBook - www.amazon.com/dp/B09ZG79HXC
@bhaskarreddy5920 6 years ago ⁺⁴
Thanks for helping us, and for sure you are going to get more subscribers. You have an amazing voice. Please update videos frequently.
@statquest 6 years ago
You're welcome! Working as fast as I can, I add 2 or 3 videos a month. :)
@timrose4310 4 months ago ⁺¹
2 sided because we need to decide which test to use BEFORE we experiment (otherwise p hacking). Perfect explanation!
@statquest 4 months ago ⁺¹
bam! :)
@hq9248 3 years ago
Hi, Josh. Thanks for the great video! One minor problem at 3:45: it's a probability density function, so the y-axis doesn't mean the probability.
@statquest 3 years ago
You are correct, which is why I did not say "probability". I said "likelihood", which is different. Likelihood refers to the y-axis coordinate for a specific x-axis coordinate.
@hq9248 3 years ago
@@statquest Sorry for the wrong location, I meant 3:22, "probability".
@statquest 3 years ago
@@hq9248 Yeah, I should have said "related to" or something like that.
@Patrick881199 4 years ago ⁺¹
The one tailed p-value test here tests the null hypothesis that the new measurement is better, and we get a p value of 0.03 which is smaller than 0.05, that means the null hypothesis is likely to be wrong. Am I understanding it right? P is low, null must go! Right?
@statquest 4 years ago
Yes.
@kissapeles 6 months ago
Amazing videos! Just a quick question: how did the number of false positives increase from ~500 to ~800 when a one-tailed test was used moving forward? 05:52
@statquest 6 months ago
This is explained at 5:13
@kissapeles 6 months ago ⁺²
@@statquest Ahh I see. I rewatched it. Got it now. Thank you!
@jxaskcijiaxhsic9943 5 months ago ⁺¹
Hmmmm, I'm confused. If the 2-sided p-value only tells you if you can reject the null hypothesis, and the null hypothesis in the video is "no differences between the new and standard treatments", how does it tell you if the new treatment is better or worse? I thought it only tells you if these 2 treatments are significantly different from each other.
@statquest 5 months ago ⁺¹
The p-value only tells you if you can reject the null that there is no difference. To know if the treatment is better or worse, you just look at the means.
@jxaskcijiaxhsic9943 5 months ago ⁺¹
@@statquest 😯 That makes sense, thanks a lot
@nerd0us69 4 months ago
@@statquest This is kind of confusing. If we can look at the means and see that meanA is greater than meanB, it seems logical to t-test for statistical significance of this difference using one-tail test. If we can infer the direction of the effect this way, it seems strange to me to consider a second tail.
Nevertheless, thank you for the great content!
@statquest 4 months ago
@@nerd0us69 The key to understanding what's going on here is to know that we have to decide on significance (is the p-value less than some threshold (usually 0.05)?) before we look at the means. Otherwise we will increase the probability of getting a false positive (getting a significant p-value and rejecting the null hypothesis when we shouldn't). To learn more about this problem, see: ua-cam.com/video/HDCOUXE3HMM/v-deo.html
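The inflation described in this reply can be checked with a small simulation. The sketch below is only an illustration, not the method used in the video: it assumes both groups are drawn from a standard normal distribution (so the null hypothesis is true), and it swaps in a z-test with a known standard deviation of 1 for the t-test so that only the Python standard library is needed. The function names, sample size, and seed are made up for the example.

```python
import random
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_statistic(a, b):
    """z statistic for two equal-sized samples whose true sd is assumed to be 1."""
    n = len(a)
    return (mean(a) - mean(b)) / (2 / n) ** 0.5


def false_positive_rates(n_tests=10_000, n=3, alpha=0.05, seed=0):
    """Fraction of null experiments called significant under two strategies."""
    rng = random.Random(seed)
    two_sided_hits = 0  # two-sided test chosen before seeing the data
    peeked_hits = 0     # one-sided test whose direction is picked after looking at the means
    for _ in range(n_tests):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        z = z_statistic(a, b)
        # Tail area beyond |z|; this is the one-sided p-value in whichever
        # direction the observed means happen to point.
        tail = 1 - STD_NORMAL.cdf(abs(z))
        if 2 * tail < alpha:
            two_sided_hits += 1
        if tail < alpha:
            peeked_hits += 1
    return two_sided_hits / n_tests, peeked_hits / n_tests


two_sided_rate, peeked_rate = false_positive_rates()
print(f"two-sided: {two_sided_rate:.3f}, one-sided after peeking: {peeked_rate:.3f}")
```

Under these assumptions the two-sided rate should land near 5% and the peek-then-one-sided rate near 10%, because a one-sided p-value in the observed direction is half the two-sided p-value.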
@hepcat93 3 years ago
Could someone explain to me why we have a more or less equal number of tests in each p-value "basket" (4:50 in the video)? Since the p-value is related to the probability itself, shouldn't we have more tests with bigger p-values? Sorry for (probably) a silly question.
@statquest 3 years ago ⁺¹
95% of the tests will have p-values > 0.05, so your intuition is correct. However, only 5% will have p-values between 0.05 and 0.1. And only 5% will have p-values between 0.1 and 0.15, etc. This is pretty easy to show using simulations in R. You just select 3 values from a standard normal curve and call that group A, then select 3 values from a standard normal curve and call that group B, and then do the t-test comparing A and B. Do that a lot of times and plot a histogram of the p-values.
@hepcat93 3 years ago
@@statquest Thank you for the answer! That's quite an encouragement for me, a starting learner :D Common sense saves me. But why isn't your exact histogram in the video skewed to the right then? Each of the "baskets" has more or less the same number of tests, though you made 10 000 of them, which is a lot.
@statquest 3 years ago
@@hepcat93 It isn't skewed to the right because the probability of getting a p-value between 0.95 and 1 is still only 5 percent. I encourage you to do the simulation.
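For anyone who wants to try the simulation without R, here is a rough Python sketch of the same idea. It is not the code behind the video: it assumes 3 values per group, and it replaces the t-test with a z-test that treats the standard deviation as known (it is 1 by construction), which keeps the example inside the standard library. The function names, bin count, and seed are invented for illustration.

```python
import random
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def two_sided_p(a, b):
    """Two-sided z-test p-value, assuming both samples have a true sd of 1."""
    n = len(a)
    z = (mean(a) - mean(b)) / (2 / n) ** 0.5
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))


def p_value_histogram(n_tests=10_000, n=3, n_bins=20, seed=1):
    """Count how many null-experiment p-values fall into each equal-width bin."""
    rng = random.Random(seed)
    counts = [0] * n_bins
    for _ in range(n_tests):
        a = [rng.gauss(0, 1) for _ in range(n)]  # group A
        b = [rng.gauss(0, 1) for _ in range(n)]  # group B, same distribution
        p = two_sided_p(a, b)
        # min() keeps a p-value of exactly 1.0 in the last bin
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    return counts


counts = p_value_histogram()
for i, c in enumerate(counts):
    print(f"{i * 0.05:.2f}-{(i + 1) * 0.05:.2f}: {'#' * (c // 20)} {c}")
```

With 10,000 tests and 20 bins, each bin should hold roughly 500 p-values, so the text histogram comes out flat instead of skewed.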
@flowereye3720 3 years ago ⁺¹
Thank you very much Josh
@statquest 3 years ago
You bet!
@ONoesBird 4 years ago ⁺²
I have to say, you are an AWESOME teacher!!
@statquest 4 years ago
Thank you! 😃
@dmitryoshkalo789 6 years ago ⁺³
Thanks for the video! I can't get the point of the experiment with a normal distribution. Why do you call non-overlapping samples a false positive, and why does it have a probability of 5%?
@statquest 6 years ago ⁺¹¹
These are great questions. Let me answer them.
1) The reason why I am doing experiments with a normal distribution is that a lot of "real" experiments use data that comes from normal distributions. For example, the height of a plant is normally distributed. So if I compared the heights of two types of plants, I would be comparing data from a normal distribution. Thus, in this video I use the normal distribution because it accurately represents a lot of potential experiments that people might do. For more details, check out my StatQuests on statistical distributions: ua-cam.com/video/oI3hZJqXJuc/v-deo.html and the normal distribution: ua-cam.com/video/rzFX5NWojp0/v-deo.html
2) To answer your second question: both samples come from the same distribution, so if we do a t-test on those samples and the p-value is < 0.05, then the t-test is suggesting that the samples do not come from the same distribution. This is called a "false positive." With a 0.05 threshold, the t-test is designed so that if you take 2 random samples from the same distribution, 5% of the time it will report a false positive. For more details, check out the StatQuest on P-hacking: ua-cam.com/video/UFhJefdVCjE/v-deo.html
@pushkarwagh9974 1 year ago
Does a two-tailed t-test tell me that the mean of distribution A is greater than the mean of distribution B? Or does it just tell me that they are different?
@statquest 1 year ago
The p-value just tells you that they are different, but you can then look at the means to tell in which direction they differ.
@karthica5251 11 months ago
Wonderful as always, one question though: In the starting example, when we perform a t-test, in order to determine the p-value we would calculate the f-value then randomly generate data (I assume within the max and min values) repeatedly and determine the f-values and plot them in histogram to get the f-distribution and use the original f-value to determine the p-value. In this context, what does it mean to have a 1-sided vs 2-sided test? Confused due to the shape of the f-distribution. Thanks : )
@statquest 11 months ago
So, the "one-sided" vs "two-sided" nomenclature really only applies to symmetric distributions. For non-symmetric distributions, maybe it would be better to say "equal to" or "greater/less than" p-values. The "equal to" would be equivalent to a "two-sided" p-value and the "greater/less than" would be equivalent to a "one-sided" p-value. And, by default, an F-test calculates "equal to" (aka "two-sided") p-values.
@karthica5251 11 months ago
@@statquest Thank you very much : )
@changyongli1332 5 years ago ⁺⁴
good to know about the p hacking
@snakewayne9112 5 years ago ⁺²
After watching the p-value video I watched this one. I can't find videos introducing the t-test on your list. Am I missing something?
@statquest 5 years ago ⁺²
You should watch my videos on linear models. These sound really fancy, but they are simple. Part 1 introduces the concept, and part 2 covers t-tests and anova:
ua-cam.com/video/nk2CQITm_eo/v-deo.html
ua-cam.com/video/NF5_btOaCig/v-deo.html
@fmetaller 6 years ago ⁺³
Could you tell me an experiment in which it's correct to choose a 1-tailed t-test a priori?
@statquest 6 years ago ⁺¹¹
I've been doing statistics for 17 years and I've never been in a situation when a 1 tailed t-test was appropriate. So I can't give you an example from my own experience. I can safely say that 1 tailed t-tests are never appropriate for academic research. However, in a commercial setting there may be a justification for it. Imagine you had 2 different drugs to cure a disease. One is very expensive, one is very cheap. You could use a 1 tailed t-test to show that the cheap drug is no worse than the expensive one. However, even this situation is somewhat artificial. If you had a competing drug, you would still want to know if the cheaper drug was better than the expensive one. So, ultimately, you would still want a 2 tailed test.
@rubenpinnata4626 4 years ago
What I'm understanding is that when you use a 1-tailed test, the Z-score must be above 1.64, while if you use a 2-tailed test, the Z-score must be above 1.96, and in this sense the two-tailed test is more skeptical of H1, so it's good. But how can the same data result in 2 p-values when X, the mean, sd and n have the same values?
@statquest 4 years ago
For more details about how to calculate p-values, check out: ua-cam.com/video/vemZtEM63GY/v-deo.html
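As a rough illustration of the question above, the two cutoffs and the two p-values all come from the same standard normal curve; the only difference is whether one tail or both tails are counted. The sketch below uses Python's `statistics.NormalDist`. The z-score of 1.8 is a made-up value chosen to fall between the two cutoffs, and the one-sided p-value assumes the effect points in the direction that was predicted in advance.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1
alpha = 0.05

# One-tailed: all of alpha sits in one tail. Two-tailed: alpha is split across both.
one_tailed_cutoff = std_normal.inv_cdf(1 - alpha)
two_tailed_cutoff = std_normal.inv_cdf(1 - alpha / 2)


def p_values(z):
    """Return (one-sided, two-sided) p-values for the same z-score."""
    tail = 1 - std_normal.cdf(abs(z))  # area beyond |z| in a single tail
    return tail, 2 * tail


z = 1.8  # hypothetical z-score, not taken from the video
one_sided_p, two_sided_p = p_values(z)

print(f"cutoffs: one-tailed {one_tailed_cutoff:.3f}, two-tailed {two_tailed_cutoff:.3f}")
print(f"z = {z}: one-sided p = {one_sided_p:.4f}, two-sided p = {two_sided_p:.4f}")
```

The data and the z-score are identical in both cases; the two-sided p-value is simply twice the one-sided one, which is why a z-score between about 1.645 and 1.96 is significant at 0.05 only under the one-tailed test.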
@yogeshbharadwaj6200 4 years ago
Thanks for the video, sir. So the learning from this video is that when we do a 1-tailed test, the chance of reporting a 'false positive' is higher than with a 2-tailed test, hence it's always recommended to go with a 2-tailed test whenever possible. Is that the correct understanding? Also, in this video we didn't conclude whether the new treatment is better than the standard one; it was just an example to start with, correct? I just got confused about whether a conclusion was drawn or not. Please help to clarify. Thanks in advance.
@statquest 4 years ago ⁺¹
Here's a video that should answer all of your questions: ua-cam.com/video/JQc3yx0-Q9E/v-deo.html
@yogeshbharadwaj6200 4 years ago ⁺¹
@@statquest Thanks a lot, sir, will watch.
@dengzhonghan5125 3 years ago
Hello Josh, do you have any videos on statistical tests or the t-test?
@statquest 3 years ago
Yes. However, I discuss t-tests in an unusual way - in the context of general linear models, which I think is a better approach. For details, see: ua-cam.com/play/PLblh5JKOoLUIzaEkCLIUxQFjPIlapw8nU.html and all of my other videos can be found here: statquest.org/video-index/
@dengzhonghan5125 3 years ago ⁺¹
@@statquest Thanks for your reply. Will watch that series.
@alifahsanil 4 years ago
Hi Josh. Thank you for your amazing explanations.
I have one question. Let's say I want to know whether treatment A is better than treatment B. First, I test it using a two-tailed test with the null hypothesis "treatment A is not significantly different from treatment B" and then (let's say) from the result I reject the null hypothesis. Second, because I rejected the null hypothesis from the first result, the possibilities are either "treatment A is better than treatment B" or "treatment B is better than treatment A". I test it again using a one-tailed test with the null hypothesis "treatment A is no worse than treatment B" and then (let's say) from the result I fail to reject the null hypothesis. So, based on these two results, I conclude that "treatment A is better than treatment B".
Is that the right way? If not, what is the better way to know whether treatment A is better than treatment B?
Thank you
@statquest 4 years ago ⁺³
You can just do the 2-tailed test without the follow up 1-tailed test. Once you establish that there is a difference, just look at the means. If A is better than B, then you conclude that A is better than B.
@alifahsanil 4 years ago ⁺¹
@@statquest Okay. Thank you very much for the answer :)
@svozild 4 years ago
@@statquest But the 2-tailed test is said to be non-directional; however, you are actually identifying the direction this way. Or am I missing something? Actually, after reading several papers on 1- and 2-tailed tests, the whole story is more unclear to me now than before :-(. See, e.g., www.onesided.org/.
@statquest 4 years ago
@@svozild I have a newer video on p-values that may help you understand the concept of two-sided p-values better: ua-cam.com/video/JQc3yx0-Q9E/v-deo.html
@svozild 4 years ago ⁺¹
@@statquest Thanks a lot! I missed this one, going to watch it carefully. And thank you very much, Josh, such a great work for the community!
@meeravinod2027 3 years ago
Hello Josh, can you mention an example of a test where it only makes sense to do a 1-tailed test?
@statquest 3 years ago ⁺¹
I've never used a one tailed t-test.
@legend_6678 1 year ago ⁺¹
You’re amazing
@statquest 1 year ago
Bam! :)
@benitshetty8492 4 years ago
Please provide videos on hypothesis testing with examples!
@statquest 4 years ago
I plan on doing that soon.
@mangning1107 3 years ago ⁺¹
I hope this statistics series will become more complete and systematic.
@statquest 3 years ago ⁺³
It is getting there. Originally, the purpose of these videos was just to answer questions that my coworkers had. So they would ask a question, and I would make a video. So the videos did not have any systematic approach - they were just individual solutions to individual problems. However, now I've got a whole selection of videos that are relatively comprehensive of most of the basic topics in Stats. Here's the link: statquest.org/video-index/
@jiayiwu4101 4 years ago
I am confused. So why do the two tests have contradictory results?
@statquest 4 years ago ⁺¹
I have a new video that does a better job explaining this concept here: ua-cam.com/video/JQc3yx0-Q9E/v-deo.html
@TheEbbemonster 5 years ago
If your hypothesis beforehand is that your medicine will improve the condition, and you state beforehand that you are going to use a one-sided t-test, I do not see the problem.
@statquest 5 years ago ⁺⁴
Sure, you could do that. However, if you do, you should also explicitly say that you did not test to see if the medicine makes things worse and that that remains a possibility since it was not tested.
