Introduction to One-Way ANOVA

  • Published Jan 17, 2025

COMMENTS • 32

  • @emadharazi5044 · 4 years ago +6

    No one explains it as clearly as you do. You are literally a life saver. Thank you so much

  • @Mpleo17 · 4 years ago +12

The only words that will keep on ringing in my mind are 'beetweeeeen' and 'withhhhiiiiinnnn'. Great explanation!!

  • @iliassiablis6875 · 9 years ago +2

    I simply cannot find the words to express my gratitude.
    I am in a data science MSc and you are about to save my life...

    • @jbstatistics · 9 years ago

      +ilias siablis You are very welcome Ilias! I hope your MSc studies go very well!

  • @jbstatistics · 12 years ago +2

    Thanks! I'm glad you find them interesting. I may get to logistic regression eventually, but it might take a while.

  • @jbstatistics · 11 years ago +2

    As in many statistical inference scenarios, here the assumption of normally distributed populations becomes less important as the sample size increases.
    So yes, for large sample sizes normality is not an issue. It's always still an assumption of the procedure, it just isn't very important for large sample sizes. And even in small sample size situations, ANOVA can still work well under some violations of the normality assumption, depending on the type of violation.

    • @Robin1997311 · 1 year ago

      In the other video on "Inference for 2 variance ratio", you mentioned that the violation of the normality assumption can lead to very poor results when you use the F-statistic, even for large sample sizes, which means that the normality assumption in the F-statistic is crucial regardless of sample size unlike the t-test and the z-test.
      Since ANOVA also uses the F-statistic, shouldn't it also be the case that the ANOVA test will perform very poorly if the normality assumption is not met, even for large sample sizes? Your comment here seems to contradict that, which I am confused about. Why is it that unlike the two variance ratio test, the assumption of normality will become less relevant in ANOVA as the sample size increases? They both use the same F-statistic right?

    • @jbstatistics · 1 year ago +1

      @@Robin1997311 That's a very good question. One reason is that the numerator of the F statistic is related to the variance *of the sample means*, and the sample means will be approximately normal for large sample sizes.
      When sampling from non-normal populations, the sample variance will get more normal as the sample size increases, it's just very slow (relative to the mean).

    • @Robin1997311 · 1 year ago

      @@jbstatistics Thank you for the answer. I have two follow up questions:
      1. I can see that as the sample size gets larger and larger, both the numerator and the denominator in an F-statistic may become approximately normal. However, what does that have to do with the ANOVA becoming more robust as the sample size increases? How does the numerator and the denominator becoming closer to normal affect the F-statistic? Does it make the F-statistic approximate closer to the actual/theoretical F-distribution you would get if the assumptions weren't violated?
      2. The F-statistic for "Inference for 2 variance ratio" and the F-statistic for "ANOVA" are both ratios of variance. Which means that the effect you described in your reply will apply to both cases. Then my original question remains unanswered because according to your videos the large sample sizes do NOT improve the performance of "Inference for 2 variance ratio" tests while it DOES improve the performance of "ANOVA" (when the normality assumption is violated). I want to know why this distinction exists when both tests are using the same F-statistic, and your answer doesn't explain that distinction since it is a common effect for both tests.
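      The distinction discussed above can be probed numerically. Here is a quick simulation sketch (not from the video; it assumes SciPy's `f_oneway` and hand-rolls the two-sided variance-ratio F test): sample three groups from identical, heavily skewed exponential populations, so every rejection is a type I error, and compare the two tests' rejection rates at a large sample size.

      ```python
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(0)

      def rejection_rates(n, n_sims=2000, alpha=0.05):
          """Sample three groups of size n from identical exponential populations
          (so every rejection is a type I error). Compare one-way ANOVA against
          the two-sample F test for a ratio of variances."""
          anova_rej = var_rej = 0
          for _ in range(n_sims):
              a, b, c = (rng.exponential(size=n) for _ in range(3))
              # One-way ANOVA F test on the three groups
              anova_rej += stats.f_oneway(a, b, c)[1] < alpha
              # Two-sided F test for the ratio of two variances
              f = a.var(ddof=1) / b.var(ddof=1)
              p = 2 * min(stats.f.cdf(f, n - 1, n - 1), stats.f.sf(f, n - 1, n - 1))
              var_rej += p < alpha
          return anova_rej / n_sims, var_rej / n_sims

      anova_rate, var_rate = rejection_rates(n=200)
      print(anova_rate, var_rate)
      ```

      Under these settings the ANOVA rejection rate should land near the nominal 0.05, while the variance-ratio test's rate should sit far above it, consistent with the claims in the videos.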

  • @jbstatistics · 12 years ago +2

    Yes, it's done in LaTeX (a Beamer presentation). Thanks for the offer, but I think I'm going to keep this a one-person show. Cheers.

  • @Cleisthenes2 · 1 year ago +1

    Wait, where did you get that F statistic?

  • @sestinashell · 6 years ago +1

    I'm brushing up on my stats and I wish ANOVA had been taught to me this way in the first place. Thanks for a great video in plain English.

    • @jbstatistics · 6 years ago

      I'm glad to be of help! Thanks for the kind words!

  • @metapsych27 · 9 years ago

    Perfect. Just what I was needing :-) I need explanations of the theory not the math. Thank you so much for this video!

    • @jbstatistics · 9 years ago

      +metapsych27 You are very welcome. I'm glad I could be of help!

  • @isdn888 · 12 years ago +2

    ...keep on going... your tutorials are so interesting~~~ cannot stop watching ^^ what about logistic regression, please make a tutorial video!!! ;)

  • @ASHASHARMA-ef1rb · 4 years ago

    Please do a two-way ANOVA series as well, man. I really appreciate your work.

  • @DenisG631 · 7 years ago

    GREAT! Now it makes perfect sense. Thank you!

  • @nigaryasin5314 · 9 years ago +1

    Thank you so much, these videos are great!

  • @SoccerGurl8P · 6 years ago +6

    Why do profs make this shit so confusing, thank u bro!

  • @nourelislam8565 · 5 years ago

    Awesome explanation

  • @marcoventura9451 · 3 years ago

    We're looking forward to your R tutorial :-)

  • @johnvoyce · 8 years ago

    Excellent video. Keep up the good work.

  • @panagiotisgoulas8539 · 11 years ago +1

    In the assumptions you say that the population should be normally distributed. Can't we avoid that and use the central limit theorem to get the same result?

  • @khalidalsabahy840 · 9 years ago

    thanks a lot for your informative demonstration ..

  • @whetstoneguy6717 · 4 years ago

    Professor--you reference the box and whisker plots and cite the mean. But I thought box and whisker plots display/mark the median. Therefore the "apparent" variances or spread are really about the median. Unless of course your box and whisker plots display/mark the mean and not the median. Your reply is requested. Thank you. Steve G Sept 27, 2020

    • @jbstatistics · 4 years ago

      Yes, the line within the box represents the median. But I'm not sure why you feel that's problematic. I give the sample means under the boxplots, and any way you slice it, it is visually apparent that the within group variability of the groups on the right are larger than that of those on the left.
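      For what it's worth, the median-versus-mean point is easy to check numerically. Here is a small sketch (made-up skewed data, nothing from the video) showing that the boxplot's five-number summary and the sample mean are separate quantities; to mark the mean on a boxplot you have to ask for it explicitly (e.g. matplotlib's `boxplot(..., showmeans=True)`):

      ```python
      import numpy as np

      rng = np.random.default_rng(42)
      # A right-skewed sample, where the mean and the median visibly disagree
      group = rng.exponential(scale=2.0, size=200)

      # The box in a boxplot spans q1..q3 with a line at the median;
      # the sample mean is a separate quantity and is not drawn by default
      q1, median, q3 = np.percentile(group, [25, 50, 75])
      mean = group.mean()

      print(f"median={median:.2f}  mean={mean:.2f}  IQR={q3 - q1:.2f}")
      ```

      For right-skewed data like this, the mean lands above the median, which is exactly why reporting the group means alongside the boxplots, as in the video, is useful.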

  • @isdn888 · 12 years ago

    Thank you... Is the presentation built in TeX/LaTeX? If so, and if you agree, please send me your ideas for the presentation and I'll write the LaTeX code for you. Then you save some time... ;)

  • @kutilkol · 4 years ago

    There is a difference between error and residual! (Look that up)

    • @jbstatistics · 4 years ago +2

      The error row is sometimes called "residuals", and it is the default in R to do so. SSE is the sum of squared residuals. If you're saying there is a difference between the theoretical error terms and the residuals, then sure, but that's not relevant here.
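      To make the terminology concrete: in one-way ANOVA, SSE is exactly the sum of squared residuals, where a residual is an observation minus its own group mean. A small sketch with made-up numbers (not from the video) showing the sum-of-squares decomposition and the resulting F statistic:

      ```python
      import numpy as np

      # Three small made-up groups, purely for illustration
      groups = [np.array([4.0, 6.0, 5.0]),
                np.array([7.0, 9.0, 8.0]),
                np.array([2.0, 3.0, 4.0])]

      all_obs = np.concatenate(groups)
      grand_mean = all_obs.mean()
      k, n_total = len(groups), all_obs.size

      # SSE = sum of squared residuals (each observation minus its own group mean)
      sse = sum(((g - g.mean()) ** 2).sum() for g in groups)

      # SSG (between groups) = group size times squared deviation of group mean
      ssg = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)

      # Total sum of squares decomposes as SST = SSG + SSE
      sst = ((all_obs - grand_mean) ** 2).sum()

      # F statistic: between-group mean square over within-group mean square
      f_stat = (ssg / (k - 1)) / (sse / (n_total - k))
      print(sse, ssg, sst, f_stat)
      ```

      Whether software labels the within-group row "Error" or "Residuals" (R's default), the quantity computed is this same sum of squared residuals.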