The Essential Guide To Hypothesis Testing

  • Published Feb 1, 2025

COMMENTS • 72

  • @CharlesLampman
    @CharlesLampman 4 months ago +2

    The past half year has been my sprint into grad school prep with an emphasis on ecological modeling: lots of time spent in R and side quests to more fully grasp statistical intuition. You have continued to fuel the fire with these easy-to-understand, well-thought-out videos. I hope this gets a 1/10!

  • @xTurqoise
    @xTurqoise 4 months ago +3

    I'm currently writing my master's thesis in CS and this video is an excellent refresher of my statistics class!
    Great video! Thanks!

    • @very-normal
      @very-normal 4 months ago +1

      Thanks! Good luck with your thesis!

    • @xTurqoise
      @xTurqoise 4 months ago

      @very-normal Thank you! Best regards from Germany.

  • @RomanNumural9
    @RomanNumural9 4 months ago +5

    I'm also a PhD student (math finance, which is mostly statistics), but my background is applied math. Your videos have been very helpful for filling in my knowledge gaps with things that an undergrad in stats might learn! Thank you!

  • @dan_pal
    @dan_pal 2 months ago

    This was amazing, I’ve never seen the p-value being explained that clearly.

  • @xanmos
    @xanmos 4 months ago +5

    Thanks a lot for this video.
    I’ll share this with my students by posting it in our Canvas LMS. :)

  • @EffigyOfficial
    @EffigyOfficial 4 months ago +3

    Couldn't have come at a better time, I've just started studying this

  • @desktus2224
    @desktus2224 4 months ago +6

    My god man, at this point you should do this full time. You have a gift. The content and presentation are arguably immaculate. I am trying to learn statistics, but so often it is assumed that the learner is intuitively familiar with some statistical/mathematical principles, and they are not accurately explained. For example, when I try to learn what a standard error is, many talk about how to calculate the standard error of the mean and not about the concept itself. Likewise, the mathematical notation is all over the place. Can you tell me why regression models represented as equations are often written like so: y_i = b1*x_i + b0 + ε and not like so: Y = beta0 + beta1*X + Ε? I believe using lowercase letters with an index indicates that we are using realizations of the random variables Y and X for observation i. But that does not accurately illustrate the general model, in my opinion. I am so confused about why it is so widespread.

  • @onlyonecjb001
    @onlyonecjb001 4 months ago

    Fabulous video. Thank you very much for the clear explanation. Will now check out your other videos as well.

  • @tylernardone3788
    @tylernardone3788 4 months ago +3

    Someday I hope you do a video on fixed and random effects and mixed models. I think you may be the only man alive that can help me understand those clearly 😂

    • @very-normal
      @very-normal 4 months ago +2

      I talk about it a little bit in my video on “the biggest prize in statistics” actually. You can jump to the section on Nan Laird to get what you want

  • @barronwill2614
    @barronwill2614 4 months ago +3

    Great video, and an interesting perspective on the problem.

  • @weitzun9691
    @weitzun9691 4 months ago

    This is a golden masterpiece! Subscribing without a doubt. Thank you!

  • @bicepjai
    @bicepjai 3 months ago

    My suggestion would be to report the CI (video time: 14:44) as a range for the population parameter rather than as a range for the test statistic. Great videos and keep up the good work.

    • @very-normal
      @very-normal 3 months ago

      the CI is what’s communicated there tho?

    • @bicepjai
      @bicepjai 3 months ago

      @very-normal So given H0, we are 95% confident that the 10% falls between 0.5% and 45.8%, but that can happen only 2.6% of the time, hence we reject H0? I thought we do not reject if the 10% is in the CI.

  • @Impatient_Ape
    @Impatient_Ape 4 months ago +32

    This video slaps.

  • @keepfeatherinitbrothaaaa
    @keepfeatherinitbrothaaaa 2 months ago

    This was very good and clearly communicated. I would only have liked some more detail on calculating the confidence intervals and maybe a step-by-step solution of a concrete example.

  • @schrodingersalphacat1862
    @schrodingersalphacat1862 4 months ago +1

    Awesome video. While in university, I struggled a lot to understand the meaning of the p-value aside from its "straightforward" definition as a conditional probability. The way you phrase your results at the end makes things so much clearer.

  • @ControlAlpha
    @ControlAlpha 4 months ago

    Terrific video! I want to know more about multiple hypothesis testing.

  • @tannermuhlestein9313
    @tannermuhlestein9313 4 months ago

    Great video! Well done.

  • @orpheus2883
    @orpheus2883 4 months ago

    Would you consider making a video on a suggested topic? Regression analysis, given how important it is to many fields, including decision-making.

    • @very-normal
      @very-normal 4 months ago

      yeah! Next next video is on linear regression actually

    • @orpheus2883
      @orpheus2883 4 months ago

      @very-normal That's great. Thank you!

  • @jakeaustria5445
    @jakeaustria5445 4 months ago +1

    Thank You

  • @georgessakr1
    @georgessakr1 4 months ago

    Great video as usual!
    I can't really understand what the importance of the CI is. I understand what you said in the video. But, for instance, in my regression class my professor always wanted me to draw the CI on the graph, and I still do not understand its importance.

    • @very-normal
      @very-normal 4 months ago +1

      The CI can be helpful because it’s easier to see when the null hypothesis should be rejected.
      In linear regression, the null hypothesis of interest is usually that the parameter associated with the covariate is 0. When your teacher asks you to draw the CI, you’re getting a visual check that 0 is not included as a possible value (i.e., there is a discernible association and the “line” isn’t flat). If you’re talking about logistic regression, the usual null is that the odds ratio (the exponentiated parameter) is 1, but the same logic applies.

  • @matteofrattini9133
    @matteofrattini9133 4 months ago

    You rightly said that seeing a p-value alone should always make you wonder what the null hypothesis was, so wouldn't it be better if we presented the result in the order:
    1) Null hypothesis
    2) Test results
    3) CI and p-value
    4) Conclusion
    ?

    • @very-normal
      @very-normal 4 months ago +1

      Yeah that can work as well, the main takeaway was that there’s a set of things that should be mentioned to best communicate about a test. The order may change depending on where this communication takes place

  • 4 months ago

    Loved your videos, man! Do you recommend those two books at @4:33?

    • @very-normal
      @very-normal 4 months ago +1

      Thanks! I don’t think I’d recommend those for learning. They’re more like a reference for people for the most common hypothesis tests

  • @goku-np5bk
    @goku-np5bk 4 months ago

    Great video, as always! Can you use a two-sided t-test instead of the proportion test? Under the null hypothesis, the sample mean would have an expected value of 0.5 if heads is coded as 1 and tails as 0.

    • @very-normal
      @very-normal 4 months ago +1

      You technically can, but with only 10 flips, the results wouldn’t be very reliable. The same thing can be said for the proportion test, but I just used it for teaching purposes.

  • @RepChris
    @RepChris 4 months ago

    I don't think a Chi² test is the appropriate test for the coin flip, since it's a very simple experiment with only two possible results (heads or tails), meaning a binomial test is much more appropriate. Since it makes more assumptions, it should also be able to reject the null hypothesis with less data.
    It's also worth noting that the sample size is quite literally on the border of what you can (at least by common agreement) properly use the Chi² test for: the expected number of occurrences under H0 should be >= 5 for all buckets (or >= 5 for at least 80% of the buckets, with no bucket below 1).
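
    The rule of thumb quoted above can be checked in a few lines. This is only a rough sketch in Python: the function names are made up for illustration, and the thresholds (5, 80%, 1) are simply the ones stated in the comment, not an official standard.

```python
def expected_counts(n, null_probs):
    """Expected number of observations per bucket under H0."""
    return [n * p for p in null_probs]


def rule_of_thumb_ok(expected):
    """Check the thresholds quoted in the comment (hypothetical helper).

    At least 80% of the buckets need an expected count of 5 or more,
    and no bucket may have an expected count below 1.
    """
    share_at_least_5 = sum(1 for e in expected if e >= 5) / len(expected)
    return min(expected) >= 1 and share_at_least_5 >= 0.8


# 10 flips of a coin that is fair under H0: two buckets, heads and tails.
coin_expected = expected_counts(10, [0.5, 0.5])
print(coin_expected, rule_of_thumb_ok(coin_expected))
```

    With 10 flips each bucket has an expected count of exactly 5, so the check only just passes, which is the "on the border" point made above. With 8 flips it would already fail.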

    • @very-normal
      @very-normal 4 months ago

      you’re not wrong, that’s what I was getting at when I was saying there’s a better way to approach the test rather than the proportion test / chi-squared test

  • @psl_schaefer
    @psl_schaefer 4 months ago

    I am wondering if it actually makes sense to report the confidence interval given its non-intuitive definition. With Bayesian credible intervals it’s easier, I think.

    • @very-normal
      @very-normal 4 months ago

      Yeahhhhh that’s what I think too, but it’s kinda like a “how I think it should work” vs “how the world works” type of thing. People will need to use the confidence interval til the end of time, so might as well make sure they know what it is

    • @ucchi9829
      @ucchi9829 4 months ago

      Booooo. Integrals can be challenging to students, should we not use them?

    • @very-normal
      @very-normal 4 months ago

      if I had a nickel for every time someone got confused by a confidence interval, I’d have a nickel 95% of the time

    • @ucchi9829
      @ucchi9829 4 months ago

      @very-normal I don't doubt it. I'm willing to bet they were poorly taught by professors who were poorly taught.

  • @tufonkin2707
    @tufonkin2707 4 months ago

    I was a bit confused about p=0.026, since the exact probability of the observed and more extreme outcomes can be calculated mentally: all heads, all tails, 10 cases of 1 tail and 10 cases of 1 head. That is a total of 22 out of 2^10, so definitely below 0.022 since 1024 > 1000. A calculator gives 0.021.
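
    The mental count above can be reproduced with a short script. This is a minimal sketch in Python, assuming a fair coin under H0 and that the observed result was 1 of one face out of 10 flips, which is what the count of 22 outcomes implies. The function name is hypothetical.

```python
from math import comb


def exact_two_sided_p(heads, flips):
    """Exact two-sided binomial p-value under H0: the coin is fair.

    Counts every outcome at least as far from the expected number of
    heads (flips / 2) as the observed one, then divides by 2**flips.
    """
    expected = flips / 2
    observed_gap = abs(heads - expected)
    extreme_outcomes = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - expected) >= observed_gap
    )
    return extreme_outcomes / 2 ** flips


p_value = exact_two_sided_p(heads=1, flips=10)
print(round(p_value, 4))  # 22 / 1024
```

    The outcomes counted are 0, 1, 9 and 10 heads, which contribute 1 + 10 + 10 + 1 = 22 of the 1024 equally likely sequences.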

    • @very-normal
      @very-normal 4 months ago +1

      Yeah, technically you can calculate an exact p-value for this because you know the number of heads is binomial. I used the Z-test approximation even though it’s not really appropriate here because I wanted to focus on getting the interpretation down
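
      For comparison, here is a rough sketch of the normal (Z-test) approximation in Python. It assumes 1 success in 10 flips and a null proportion of 0.5. The continuity correction is my assumption: the thread does not say which variant the video used, but the corrected version lands close to the 0.026 quoted above, while the uncorrected one does not.

```python
from math import erfc, sqrt


def z_test_p(successes, n, p0=0.5, continuity=True):
    """Two-sided p-value from the normal approximation to the binomial."""
    expected = n * p0
    sd = sqrt(n * p0 * (1 - p0))
    gap = abs(successes - expected)
    if continuity:
        # Continuity correction: shrink the gap by half a count.
        gap = max(gap - 0.5, 0.0)
    z = gap / sd
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)), the two-sided normal tail.
    return erfc(z / sqrt(2))


p_corrected = z_test_p(1, 10)
p_uncorrected = z_test_p(1, 10, continuity=False)
print(p_corrected, p_uncorrected)
```

      Under these assumptions the corrected approximation gives a p-value just under 0.027 and the uncorrected one gives roughly 0.011, so neither matches the exact binomial value of 22/1024 discussed above.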

    • @tufonkin2707
      @tufonkin2707 4 months ago

      @very-normal Once you’ve named the test being used, I’ve got it.
      In general, a very good video 👍
      IMHO, only one important point is missing: the significance level alpha should be chosen prior to testing. It is chosen based on the “costs” of type I and type II errors. You’ve got a very nice example for that with the coin. If those two people were just buddies, then the cost of a type I error, when the coin was indeed fair, is just a bit of temporary tension in the relationship, so 5% is reasonable. But if the punishment for an unfair coin is execution, then guilt should be proved beyond any reasonable doubt, and an alpha of 10^-6 does not seem unrealistic.

    • @very-normal
      @very-normal 4 months ago

      Yeah you’re right, that’s a great point. I ended up being so focused on trying to avoid jargon that I forgot to make that point explicit

    • @tufonkin2707
      @tufonkin2707 4 months ago

      @very-normal Make an additional short about it.

  • @Mystic2122
    @Mystic2122 4 months ago +1

    Omg you’re the goat, I have a review quiz on hypothesis testing this week

  • @ethandavis7310
    @ethandavis7310 4 months ago +1

    Shouldn't you have a correction factor? You studied the coin flips after the sample was already taken because it appeared unlikely from the beginning. If I were cheating and didn't want to get caught I'd also call into question your ability to decide on a reasonable p-val post-hoc.

    • @very-normal
      @very-normal 4 months ago +1

      For my example, there’s no need for a correction, but I think I understand what you’re getting at. That I only had a hypothesis after seeing the data, so I’m doing post-hoc reasoning.
      In this case, it’s not unreasonable to think that coins would be equally likely to land tails or heads. From a Bayesian perspective, nothing tells me I should believe otherwise. You could say I “had” the hypothesis before I saw the data, even though I wasn’t really thinking about it.
      However, corrections would be needed if I wanted to conduct multiple hypothesis tests. The test I performed was for a null with a fair coin, but nothing would stop me from doing another test where the coin hit tails 60% of the time. In that case, I would need to adjust for multiple hypothesis tests because the probability of getting “at least” one Type I error will grow with more tests.
      Hopefully that helps a bit

    • @abhinandandalal8631
      @abhinandandalal8631 4 months ago

      I think @ethandavis7310 is hinting at post-selection inference, not multiple testing correction. Deciding to test a hypothesis after looking at the data is data-peeking, which invalidates the coverage guarantees (or equivalently the Type I error guarantee) provided by the frequentist test.
      The argument that a hypothesis already exists out there but was not thought about is not valid. This issue comes up a lot in, say, p-hacking. One can look at the data and keep looking for hypotheses to test, and just by chance some of these tests may come out significant even if none of the effects are real (for instance, testing 100 hypotheses at 5% significance would yield approximately 5 significant results even if none are real). One can then say, hey, this hypothesis already existed, I just wasn't thinking about it, but that would not be a valid argument. Choosing a hypothesis after peeking at the data requires correction factors or alternative procedures for inference. For example, an easy fix in the slapping example would be this: once you become suspicious that the coin is unfair, you play the game again and then use the new data to conduct the test. Since the hypothesis you chose, although random, was a function of your old data, which is independent of the new data, the inference obtained from the new data would be valid.
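
      The arithmetic behind the "100 hypotheses at 5%" example can be sketched in Python as follows. This assumes every null hypothesis is true and that the tests are independent, which is an idealization not stated in the thread. The function names are made up for illustration.

```python
def expected_false_positives(num_tests, alpha):
    """Expected number of significant results when every null is true."""
    return num_tests * alpha


def familywise_error_rate(num_tests, alpha):
    """Probability of at least one false positive.

    Assumes the tests are independent and every null is true.
    """
    return 1 - (1 - alpha) ** num_tests


false_hits = expected_false_positives(100, 0.05)
fwer = familywise_error_rate(100, 0.05)
print(false_hits, round(fwer, 3))
```

      Under these assumptions about 5 of the 100 tests come out significant on average, and the chance of seeing at least one false positive is above 99%, compared with 5% for a single test.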

  • @tune490
    @tune490 4 months ago

    I thought this video was going to be about the Hubble constant lol :P

  • @aerospacedoctor
    @aerospacedoctor 4 months ago

    What? You will judge me for using Excel! First, I use MATLAB when needed, but I teach students to use Excel because it doesn't need any programming and it is likely on every office computer, so if they need to do a statistical test in the future, they can.
    BTW, great videos, keep them coming. In class (research methods for aviators) I tell students to watch your videos if they need more info.

    • @quillaja
      @quillaja 4 months ago

      You can use Python in Excel now too.

    • @aerospacedoctor
      @aerospacedoctor 4 months ago

      @quillaja another reason just using Excel is fine 😁

    • @quillaja
      @quillaja 4 months ago

      @aerospacedoctor Agreed. Doing stats in Excel is actually fine! Using Excel as a database management system, which is usually what I see, is a travesty.

  • @Idkprobablyyeet
    @Idkprobablyyeet 3 months ago

    I’ve hit an absolute wall with stats. How in the world do y’all understand this?

    • @very-normal
      @very-normal 3 months ago

      a lot of time and examples

  • @bonaldisillico
    @bonaldisillico 4 months ago

    A person who cheats is a 'cheat' NOT a cheater!

  • @chimurawill
    @chimurawill 4 months ago

    :( sowwie

  • @floatingblaze8405
    @floatingblaze8405 4 months ago +1

    Great video. I just wish Bayesian methods were more widespread🥲 (I know that's out of scope for this video, but I do wish the world will change in a way that they'll be in scope in the future)
    Also, I wonder whether all these hypothesis test methods are at least partially applicable in a Bayesian context, or are they fundamentally frequentist?

    • @very-normal
      @very-normal 4 months ago +2

      Next video should sate your curiosity 👌

  • @iconjack
    @iconjack 4 months ago

    I rolled ⚂⚂ and concluded the dice were loaded, P(⚂⚂) = 1/36 ≈ 0.0278

    • @very-normal
      @very-normal 4 months ago

      But how do you know if something is more extreme than ⚂⚂?

    • @iconjack
      @iconjack 4 months ago

      @very-normal I'm just teasing. Great video, thanks for making it.

    • @very-normal
      @very-normal 4 months ago +1

      lol yeah I know, I appreciate you giving time to the channel. I spent too much time trying to figure out how you type the dice character