ANOVA: Crash Course Statistics #33

  • Published 24 Dec 2024

COMMENTS • 166

  • @crashcourse
    @crashcourse 3 months ago +16

    There is an error in this video - we apologize for any confusion this may have caused. At 4:40, we say that in order to calculate the SSM, you "sum up the squared distances between each point and its group mean." We should have said "sum up the squared distances between each point and the grand mean, or overall mean." Thanks to our audience for pointing this out.

  • @phlippindolfy
    @phlippindolfy 6 years ago +1145

    I'm here in the desperate hopes that this will help me understand stats after a full semester of classes.

    • @xanthe1076
      @xanthe1076 5 years ago +6

      same lol oops

    • @MrHei913
      @MrHei913 5 years ago +108

      Same... Paid school fees to have lectures but ended up depending on YouTube to learn because the lectures are bad :(

    • @TurningoftheTides
      @TurningoftheTides 5 years ago +1

      im with you!

    • @michaeldaugherty1274
      @michaeldaugherty1274 4 years ago +1

      Same

    • @samanthalawson6617
      @samanthalawson6617 4 years ago +16

      Same but she is talking way too fast and now I'm more lost

  • @genericruler
    @genericruler 6 years ago +323

    5:00 The distance between each point and its group mean is the residual error (SSE). The SSM would be based on the difference between each group mean and the grand mean.

    • @genericruler
      @genericruler 6 years ago +22

      10:36 Shouldn't SSE (residual) be the sum of the squared (Xi - Xbar_group) terms?

    • @soulfrench
      @soulfrench 6 years ago +52

      You are totally right. I took a look at my statistics textbook, and it says that SST (total sum of squares) = SSM + SSE, and that SSM is calculated as (difference between each group mean and the grand mean)^2 * (number of observations in that group), summed over the groups. I am surprised by two things. The first is that Crash Course made this kind of big mistake when explaining ANOVA, and the second is that no one seems to have noticed it (judging only by this comment section) except you, Ronan.

    • @soulfrench
      @soulfrench 6 years ago +52

      The explanations of SSM and SSE are exactly the same in this video, which means that at least one of them is wrong.

    • @patrickjane5796
      @patrickjane5796 6 years ago +17

      Had the same doubt, checked the comments for confirmation, and found your reply. Thank you!

    • @andriistadnik6775
      @andriistadnik6775 6 years ago +9

      I guess we have to contact the Crash Course team somehow to tell them about that, in case other people rely on these videos.

  • @llamafromspace
    @llamafromspace 6 years ago +522

    Today I learned that the word ANOVA exists, and that I shouldn't jump halfway into a course.

    • @swampertblaziken1
      @swampertblaziken1 6 years ago +5

      ANOVA - Analysis of Variance

    • @rafnaegels8913
      @rafnaegels8913 6 years ago +2

      This + 9000!!!

    • @lichbanelb
      @lichbanelb 6 years ago +7

      This is a 2nd year statistics topic (at least at my uni), so it's not easy!

    • @LiteralCats
      @LiteralCats 6 years ago +4

      Lol. I'm halfway through the video, and this is what I learnt so far xD *goes back to earlier videos*

    • @lokeshsah7
      @lokeshsah7 5 years ago +2

      I got the same lesson 😂😂

  • @Xman3456
    @Xman3456 6 years ago +786

    I appreciate the effort into making the video, but I was a bit overwhelmed by all the graphics and the speed of the explanations.

    • @philocac5424
      @philocac5424 5 years ago +32

      I watched at 0.75 speed XD

    • @emeraldemperor2601
      @emeraldemperor2601 4 years ago +14

      @@philocac5424 I watched at 0.5 speed holding down space
      beat that 🤣

    • @bilalhasansyed7
      @bilalhasansyed7 4 years ago +1

      Same here

    • @404bidden
      @404bidden 4 years ago +47

      I mean, it's Crash Course lol

  • @km1dash6
    @km1dash6 6 years ago +46

    I'm a grad student studying psychology. In a couple weeks, I have to take a class on ANOVA. This really helps.

  • @desankad.870
    @desankad.870 4 years ago +20

    You make statistics so understandable and not as abstract! I am not so scared of t-tests, z-tests, F-tests and ANOVA anymore! Why do most statistics teachers make it seem so scary? Statistics is great! (esp for curious minds like myself ^_^)

  • @francis112233445566
    @francis112233445566 4 years ago +39

    Great video, although I wouldn’t recommend running three t-tests after the ANOVA without first applying the Bonferroni correction! This means using an alpha level of 0.05 divided by the number of comparisons you’re making (in this case 3). Use this corrected alpha level to determine significance; otherwise you may inflate the family-wise error rate and make a Type I error.
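
Editor's sketch of the Bonferroni correction described in this comment. The function names and the three p-values are hypothetical, chosen only for illustration; they are not results from the video.

```python
def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold after Bonferroni correction."""
    return alpha / n_comparisons


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag each p-value as significant only if it beats the corrected threshold."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return [p < threshold for p in p_values]


# Hypothetical p-values from three pairwise t-tests (made-up numbers).
p_values = [0.010, 0.030, 0.200]
threshold = bonferroni_alpha(0.05, len(p_values))
flags = significant_after_bonferroni(p_values)
```

With three comparisons the threshold drops from 0.05 to about 0.0167, so the hypothetical p-value of 0.030 would count as significant uncorrected but not after the correction.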

  • @anastasiostresinis6899
    @anastasiostresinis6899 5 years ago +16

    This presentation helped me gain further insight into ANOVA. Wish I had you (this goes to the whole cast) as a stats teacher! Big THANKS!!!

  • @Noob___Noob
    @Noob___Noob 6 years ago +183

    ANOVA: I learned it, was tested on it, aced it, but really didn't understand what it is.

    • @Jasuta123
      @Jasuta123 6 years ago +2

      Yah me too ... The software is too complicated...

    • @voltairesarmy6702
      @voltairesarmy6702 6 years ago +2

      I took two classes using ANOVA b4 learning (on my own) the connection to GLMs lol

    • @teunvandenbrand1324
      @teunvandenbrand1324 6 years ago +14

      The intuition I have about ANOVA is that it tests whether the variance between groups exceeds the variance within groups. Maybe that could help.
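
Editor's sketch of that intuition: the F statistic is the between-group variance (mean square for the model) divided by the within-group variance (mean square error). The data are made up, and `f_statistic` is a hypothetical helper written for this note, not code from the video.

```python
from statistics import mean


def f_statistic(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    k = len(groups)                     # number of groups
    n = sum(len(g) for g in groups)     # total number of observations
    grand_mean = mean([x for g in groups for x in g])
    # Between-group sum of squares: group means versus the grand mean.
    ssm = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: points versus their own group mean.
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ssm / (k - 1)
    ms_within = sse / (n - k)
    return ms_between / ms_within


# Hypothetical data: three groups of three observations each.
groups = [[2.0, 3.0, 4.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
f_value = f_statistic(groups)
```

A large F means the group means are spread out far more than the scatter inside each group would suggest.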

    • @oldcowbb
      @oldcowbb 6 years ago +6

      basically my college life

  • @adamheckenberg5861
    @adamheckenberg5861 2 years ago +13

    This is incredible. Taking Stats for Psyc right now, getting a lot harder as it goes on. Thank you!

  • @rhiwright
    @rhiwright 6 years ago +53

    I am so glad SPSS does most of the work and I'm just learning to interpret the results. Interesting that you moved onto ANOVA at the same week my quantitative and qualitative research methods module at uni did :)

  • @JEOGRAPHYSongs
    @JEOGRAPHYSongs 6 years ago +10

    NOVA has been one of my favorite PBS programs for 3 decades now.

  • @Ureyeuh
    @Ureyeuh 4 years ago +8

    This information is spit out insanely fast.

  • @amulyagupta9161
    @amulyagupta9161 4 years ago +15

    I love crash course videos for their simplicity but couldn't make out much from this one

  • @rib_rob_personal
    @rib_rob_personal 6 years ago +23

    Wow. These are coming out right when I need them lol. Taking a hard statistics course.

  • @zackwise1852
    @zackwise1852 6 years ago +32

    I really love all of Crash Course's content, but I've been having a tough time following this series. After rewatching this episode and the previous one multiple times, I'm still confused. In this episode, both SSM and SSE are described as the sum of squares between each point and its group mean, but SSM and SSE are different! If someone could explain this to me I'd really appreciate it.

    • @Mr_Wallet
      @Mr_Wallet 6 years ago +4

      This is the only CC series to date that seems to be geared very specifically at being a class supplement and not necessarily accessible to someone only watching the videos (although Engineering is also skirting the line a little bit). It's been fairly disappointing.

    • @chelseaparlett8069
      @chelseaparlett8069 6 years ago +12

      I'm sorry if there was an error.
      SSE is the sum of the squared distance between each point and its group mean (more generally it's the distance between the data point and the predicted value).
      SSModel is the sum of squared distance between the model prediction and the grand (overall) mean.
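
Editor's sketch of the two definitions in this reply, checking that they add up to the total sum of squares (SST = SSM + SSE). The group labels and ratings are made-up numbers, not data from the video.

```python
from statistics import mean

# Hypothetical ratings for three groups (illustrative values only).
groups = {
    "A": [2.0, 3.0, 4.0],
    "B": [4.0, 5.0, 6.0],
    "C": [7.0, 8.0, 9.0],
}

all_points = [x for values in groups.values() for x in values]
grand_mean = mean(all_points)

# SSE: squared distance between each point and its own group mean.
sse = sum((x - mean(values)) ** 2 for values in groups.values() for x in values)

# SSM: squared distance between each point's prediction (its group mean)
# and the grand mean, counted once per point in the group.
ssm = sum(len(values) * (mean(values) - grand_mean) ** 2 for values in groups.values())

# SST: squared distance between each point and the grand mean.
sst = sum((x - grand_mean) ** 2 for x in all_points)
```

Under these definitions the squared distance from each point to the grand mean is the total (SST), which splits into the model part and the error part.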

    • @zackwise1852
      @zackwise1852 6 years ago +1

      @@chelseaparlett8069 Thanks for the clarification, I think I understand now :)

    • @lenamaas9233
      @lenamaas9233 5 years ago

      @@chelseaparlett8069 thank you!

    • @winnieb3324
      @winnieb3324 5 years ago

      @@chelseaparlett8069 Is the predicted value in SSE basically the predicted mean?

  • @taylorharris8078
    @taylorharris8078 6 years ago +17

    I love Crash Course! But this is not an introductory-level video. There are others that explain ANOVA more simply.

  • @stephenlippi5724
    @stephenlippi5724 6 years ago +55

    You can't run multiple T-tests... this inflates the rate of Type I error!

    • @gardenhead92
      @gardenhead92 6 years ago +9

      They said they'll address that in a future episode

    • @stephenlippi5724
      @stephenlippi5724 6 years ago +18

      That's definitely good because as soon as they said "just run 3 t-tests" I almost fell out of my chair. Doesn't help those students watching this who are now like "oh just run t tests!"

    • @doonce
      @doonce 6 years ago +17

      Ya, you have to do a post-hoc test like Tukey. Otherwise, there's no point in doing the ANOVA in the first place, just do t-tests.

    • @lakudzala195
      @lakudzala195 6 years ago +3

      If you were doing this study, would it be better to do all 3 t-tests with the Bonferroni correction and present all 3 in a paper, or find the t-test that shows the strongest result and only present that one?

    • @voltairesarmy6702
      @voltairesarmy6702 6 years ago +1

      @@lakudzala195 I don't remember the bonferroni(spelling?Lol) correction but just wanted to say, there's a push for presenting confidence intervals in papers. So, regardless of what you end up doing, I suggest using confidence intervals. Also, maybe look for it on google scholar (also related topic: replication in science). (I'm assuming this is a scientific study of some kind. )
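
Editor's sketch of why this thread worries about running several t-tests. It assumes the tests are independent, which is a simplification (pairwise tests on the same groups share data), so treat the number as a rough guide. The function name is hypothetical.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


# Three uncorrected tests at alpha = 0.05, as in the "just run 3 t-tests" case.
fwer_three_tests = familywise_error_rate(0.05, 3)
```

Under that independence assumption, three tests at 0.05 each give roughly a 14% chance of at least one Type I error rather than 5%.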

  • @CMunkMunk
    @CMunkMunk 6 years ago +118

    Hi graphics team, ß ≠ β 😉

  • @kimberlyt7986
    @kimberlyt7986 1 year ago

    Thank you Adriene for helping me pass this class 🙏

  • @bee-stef
    @bee-stef 4 years ago +1

    You are an incredibly eloquent speaker. Thanks for this explanation!

  • @kiou97
    @kiou97 1 year ago

    For the first example and the slope calculation, it should be the opposite (μ1-μ0) in the numerator, just to avoid any confusion with regard to the code names for rainy and non-rainy days. Thanks, Crash Course team, for all your efforts and teaching, and thanks, Adriene, for this particular course, which, for me as an engineer, was a tough lesson after all those years of avoiding statistics courses 😅

  • @sansm5285
    @sansm5285 4 years ago +2

    Hey, I love this series; it's helping me through a semester of Corona-Statistics.
    I just think I might have found a mistake at 10:57 in the model sum of squares: the sum should go from i=1 to k, instead of to n as the figure says.

  • @lianggegou
    @lianggegou 5 years ago +49

    I love the examples but this goes way too fast, I had a hard time following the explanations 😢

    • @greensteve9307
      @greensteve9307 5 years ago +4

      Just watch it on x0.75 then, or pause it and go back.

  • @albyv.4209
    @albyv.4209 5 years ago +1

    I have my intro to bio stats final tomorrow and these videos are my 3 am Hail Mary half court shot.

  • @nightsazrael
    @nightsazrael 6 years ago +3

    A bunny preserve how cool. Also I really have to think hard to understand your videos, but it is always worth it. I never gamble, but life is a gamble and statistics are one of the best ways to make a decision. Not always the right decision, but random chance rules the world.

  • @NavajoMX
    @NavajoMX 6 years ago +2

    Thank you! I've needed this episode for years.

  • @ashleyalexander6885
    @ashleyalexander6885 3 months ago

    Thank you so much for this information!! I was wondering if you guys might be able to add the APA citation for these sources in the description! That would be really helpful!

  • @ЮлияШикова-ф4м
    @ЮлияШикова-ф4м 4 years ago +3

    There was one big mistake....never talk about chocolate in maths 😄
    I was not able to think about calculations but chocolate.
    Overall was pretty clear:)

  • @kylehenderson9489
    @kylehenderson9489 6 years ago +10

    YES. There is largely unpalatable chocolate. I've eaten some.

  • @Jesusiscomingback
    @Jesusiscomingback 4 years ago

    I love Indiana, my family is from there. I’m hype. Love u guys. Thanks for your help. You guys literally help me in every class I have. I go to college online. CTU online. Thanks guys for real.

    • @sudeepjoseph69
      @sudeepjoseph69 4 years ago

      Hey! Shut your mouth, you pig! You are always talking. Learn to live a little.

  • @aaronmarks9366
    @aaronmarks9366 5 years ago

    My favorite statistics documentary series is PBS ANOVA

  • @jonathanblackwell42
    @jonathanblackwell42 6 years ago +7

    ANOVA beat me up in stats class...

  • @IamMathenge
    @IamMathenge 2 years ago

    Thank you. Apparently I am understanding this one year after campus, now in data science.

  • @douglasmaxwell6547
    @douglasmaxwell6547 5 years ago +1

    Brilliant video, thanks for sharing.

  • @tohtine
    @tohtine 6 years ago +7

    I think your explanation for the model sum of squares is incorrect; it should be the sum of squared differences between group means and the overall mean.

  • @INSPirrationalNATURE
    @INSPirrationalNATURE 4 years ago +3

    You're gonna save my master's degree *.*

  • @kierannurmi5488
    @kierannurmi5488 6 years ago +1

    Is there any situation where an F test would say not statistically significant but a T test would? The fact that you said a failed F test means a relationship "probably" doesn't exist seems to imply that it can. What would you do in that case?

  • @tiffanyszymanski5956
    @tiffanyszymanski5956 2 years ago

    This video was super helpful!! Thank you!!! ❤

  • @jamicub39
    @jamicub39 6 years ago +1

    Is it a bumpy or slippery slope? Si there's a variable difference.

  • @NeilNileStudios
    @NeilNileStudios 5 years ago

    Cool, Hill is back. I liked her in econ

  • @grainfrizz
    @grainfrizz 6 years ago +1

    Is it right to say that ANOVA is the same as a t-test, except that ANOVA is for when you have more than 2 groups?

    • @voltairesarmy6702
      @voltairesarmy6702 6 years ago +4

      It's right that the ANOVA is used for cases where a t-test is inappropriate/inadequate because there are more than two groups to compare.

  • @caitlincunningham8944
    @caitlincunningham8944 4 years ago

    Would there be a point in doing an ANOVA for two groups, or would it be easier to just do a T-test?
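
Editor's sketch bearing on this question: with exactly two groups, a one-way ANOVA and a pooled-variance (equal-variance) two-sample t-test carry the same information, because F equals t squared. The data and helper names below are hypothetical.

```python
from math import sqrt
from statistics import mean


def pooled_t(a, b):
    """Two-sample t statistic using a pooled (equal-variance) estimate."""
    ss_a = sum((x - mean(a)) ** 2 for x in a)
    ss_b = sum((x - mean(b)) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (len(a) + len(b) - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / len(a) + 1 / len(b)))


def one_way_f(groups):
    """One-way ANOVA F statistic for a list of groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    ssm = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ssm / (k - 1)) / (sse / (n - k))


# Hypothetical two-group data (made-up numbers).
group_a = [2.0, 3.0, 4.0]
group_b = [4.0, 5.0, 6.0]

t_value = pooled_t(group_a, group_b)
f_value = one_way_f([group_a, group_b])
```

So for two groups either test leads to the same conclusion; ANOVA earns its keep once there are three or more groups.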

  • @danielduvernay3207
    @danielduvernay3207 6 years ago +2

    omg love this video

  • @JasonOlshefsky
    @JasonOlshefsky 6 years ago

    Is there a variation of GLM that relies on median rather than mean? I kind of doubt it because it doesn't work mathematically ... but I have read that medians are a "more accurate" measure of "typical" than means. For instance, in the bunnies example, if one day the sanctuary sent all the bunnies outside on a sunny day and you saw 30 bunnies, it would skew your 1-or-5 general model strongly.

    • @soulfrench
      @soulfrench 6 years ago

      Hey, General linear model and Generalized linear model(GLM) are two different things.

    • @teunvandenbrand1324
      @teunvandenbrand1324 6 years ago +2

      Most of the time you could take a non-parametric test over a parametric test if you're concerned that your data doesn't follow a theoretical distribution. Non-parametric tests are often based on rank. The good thing is that they are robust, the downside is that you lose some statistical power.

  • @dinomoviesnstuff
    @dinomoviesnstuff 11 months ago +5

    Very confusing.

  • @NamithaMariaCherian
    @NamithaMariaCherian 2 years ago

    When calculating SSM, you use the difference between the overall mean and the mean of each group. SSE uses the difference between the observed data and the group means. SSM is explained incorrectly in the video. But otherwise, great content. Thank you.

  • @MasterofPlay7
    @MasterofPlay7 5 years ago

    So if the mean of one or more groups is skewed by outliers or missing values, is ANOVA's result across the groups still valid, given that the parameters for ANOVA are the variances?

  • @Dr.Danger.Communication
    @Dr.Danger.Communication 6 years ago +1

    Cacao bean difference here is an example of the danger of significance testing. I would argue that a mean difference of .17, on the scale being used, is not meaningful.

    • @gardenhead92
      @gardenhead92 6 years ago +1

      I think most people would agree, which is why you should always present your effect size along with your p-value :)

    • @voltairesarmy6702
      @voltairesarmy6702 6 years ago

      Also, presenting confidence intervals is a good idea!

    • @teunvandenbrand1324
      @teunvandenbrand1324 6 years ago

      Also the ratings are on an ordinal scale, not a continuous one (as seen by the discrete values the ratings can take). So applying a non-parametric test might be more useful.

    • @voltairesarmy6702
      @voltairesarmy6702 6 years ago

      @@teunvandenbrand1324 well if we care about that, an ordered logit / probit would work. C:

  • @emilyneufeld673
    @emilyneufeld673 6 years ago +1

    K. I have a question... why exactly do you hate the ever so extraordinary SPONGE???

  • @unleashingpotential-psycho9433
    @unleashingpotential-psycho9433 6 years ago +31

    Statistics is way better than geometry.

  • @vegangelo_29
    @vegangelo_29 4 years ago +1

    So instead of using ANOVA, why not just use multiple T-test?

  • @loganl3746
    @loganl3746 6 years ago

    Yeah, but which potato varieties did best in Martian soil, supplemented with human manure and bacteria cultures?

  • @zeio-nara
    @zeio-nara 5 years ago +2

    It sounds like SSR and SSE are the same thing

  • @Grv28097
    @Grv28097 6 years ago +1

    You are a life saver!

  • @himanshukhandelwal9226
    @himanshukhandelwal9226 6 years ago

    Is it a complete course on statistics? I mean, does it include most of what we need to know about statistics?

  • @voltairesarmy6702
    @voltairesarmy6702 6 years ago +1

    Since it's a Kaggle dataset, did you use R or python to analyze the data? Or did you download it and use Excel, SPSS, Stata, SAS, etc to analyze the data?

  • @Reyesmagos8585
    @Reyesmagos8585 6 years ago +1

    Thank you!!!

  • @BlezzBeats
    @BlezzBeats 4 years ago

    ANOVA is a great tasting chocolate bean.

  • @maftoumiali4412
    @maftoumiali4412 4 years ago

    You're amazing

  • @ezhilarasankandaswamin4339
    @ezhilarasankandaswamin4339 6 years ago +1

    Can you suggest a good book to follow along with the Crash Course series and for further practice?

    • @voltairesarmy6702
      @voltairesarmy6702 6 years ago

      Open Intro Statistics is decent. It's free to get an ebook and has decent resources too. I used it in a class lol

    • @ezhilarasankandaswamin4339
      @ezhilarasankandaswamin4339 6 years ago

      @@voltairesarmy6702 Thanks, I will download it from the web.
      Can you suggest an e-book?

  • @RohaZahidi
    @RohaZahidi 5 years ago +10

    Even though I've taken an entire semester's worth of classes on statistics, these videos are actually even more confusing. You guys focus too much on keeping the videos short and end up explaining nothing at all. There is information, and you make a few good points, but it's nothing one can't get from a regular math website. The visuals are a waste, and it all seems pretty forced, like you're just reading off a screen.

  • @raeidm.raunak4927
    @raeidm.raunak4927 6 years ago +1

    Do a Crash Course History episode on the Bangladeshi war of independence in 1971. I have a project and would love it if you did a video on it.

  • @ikahn17
    @ikahn17 6 years ago +1

    I thought this was going to be about my sous vide circulator lol

  • @toniisaurVODs
    @toniisaurVODs 5 years ago +1

    Using ordinal data is a bad example with the cocoa bean type. You can't use the mean as a measure of central tendency when it has no meaning, i.e. what is the average of strongly agree and disagree? It's also a really bad idea to teach doing multiple t-tests, as it increases the Type I error rate and defeats the whole point of ANOVA. It would have been better to show Tukey’s HSD to determine which means are different.

  • @Cormac_YT
    @Cormac_YT 6 years ago +4

    *NOTIFICATION SQUAD WHERE YOU AT? 🔥💯💪*

  • @researchtech5830
    @researchtech5830 5 years ago

    Nice explanation.

  • @alexmarvin3093
    @alexmarvin3093 4 years ago

    Adriene Hill is the best; no one compares.

  • @mohamedaitkhouyamouh5599
    @mohamedaitkhouyamouh5599 4 years ago

    This is literally better than sharing the bed with my gf. Thank you so much for the work, absolutely mind-blowing.

  • @Stoic_Panda
    @Stoic_Panda 5 years ago +1

    wait was this a 2 way or 1 way ANOVA? Lol what is the difference?

  • @EmilyTotallynotbees
    @EmilyTotallynotbees 4 years ago

    I wanna walk through a bunny preserve to work 🥺

  • @danconrad920
    @danconrad920 6 years ago +1

    Unpalatable chocolate?
    Yeah,...it's called carob

  • @dr.jackauty4415
    @dr.jackauty4415 6 years ago

    Bunny count would not be Gaussian. Probably Poisson or negative binomial.

  • @harrygroundwater2590
    @harrygroundwater2590 1 year ago +1

    Anyone here from ANU?

  • @anikamaynard8132
    @anikamaynard8132 6 years ago +2

    This doesn't make ANOVA easy to understand at all. It doesn't take into consideration that people are only now learning this whole concept...

  • @Palau_Legend
    @Palau_Legend 6 years ago +3

    I like cookies

  • @andresmc210
    @andresmc210 6 years ago

    Please feature bunnies more often.

  • @CosbyAdam
    @CosbyAdam 1 year ago

    9 grand a year to learn more off of a 5yr old YouTube playlist than in my Stats lectures... (I have an exam on this and I am so screwed)

  • @aNytmare
    @aNytmare 6 years ago

    DFTBAQ (hey, did you know you can type DFTBAQ with your left hand only?)

  • @liamc3995
    @liamc3995 5 years ago

    This isn’t John Green.

  • @jonathandominguez300
    @jonathandominguez300 6 years ago +1

    Hey! Explain the story of Scheherazade. Pwease.

  • @nareshchinnam8349
    @nareshchinnam8349 4 years ago

    Very difficult to follow with this speed of the explanation.

  • @sudeepjoseph69
    @sudeepjoseph69 4 years ago +1

    This series has the lowest viewership compared to all other series in cc

  • @iefe65
    @iefe65 5 years ago

    If we can know exactly the statistical significance between different groups by using t tests for every 2 groups, why even bother with the f-test in the first place lol ?

  • @coolhaddool3680
    @coolhaddool3680 5 years ago +2

    Honestly, I have no idea what she is saying.

  • @sarocturtlegaming7306
    @sarocturtlegaming7306 5 years ago +1

    yup nope still confused I miss the dude D: TAKE ME BACK TO SCIENCE

  • @fruitninja8475
    @fruitninja8475 4 years ago

    still can't get it. I'm an idiot. sorry.

  • @Blubgamer
    @Blubgamer 6 years ago +1

    bro!

  • @mamasophie8597
    @mamasophie8597 5 years ago

    UMMMMMMM AM I AN IDIOT OR HOW DO U CALCULATE THE P-VALUE??????

    • @deliveryscooter4154
      @deliveryscooter4154 4 years ago

      bro I feel u I cant find any substantial explanation either :/
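
Editor's sketch for the p-value question in this thread. Statistical software normally looks up the observed F in the F-distribution; the code below swaps in a permutation test instead, because it needs only the standard library and shows what the p-value means: how often shuffled group labels give an F at least as large as the observed one. The data, function names, permutation count, and seed are all hypothetical choices.

```python
import random
from statistics import mean


def f_statistic(groups):
    """One-way ANOVA F statistic (assumes within-group variation is not zero)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    ssm = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ssm / (k - 1)) / (sse / (n - k))


def permutation_p_value(groups, n_permutations=2000, seed=0):
    """Approximate p-value: share of label shuffles with F >= the observed F."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        # Cut the shuffled values back into groups of the original sizes.
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        # Small tolerance so floating-point noise does not hide exact ties.
        if f_statistic(shuffled) >= observed - 1e-9:
            at_least_as_extreme += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical data: three clearly separated groups.
groups = [[2.0, 3.0, 4.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
p_value = permutation_p_value(groups)
```

A small p-value here means that random relabelling rarely produces group differences as large as the ones observed, which is the same idea the F-distribution lookup expresses analytically.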

  • @dyngjean4532
    @dyngjean4532 6 years ago +1

    Lol what people will do for the first comment...

  • @ninasimoneh4030
    @ninasimoneh4030 4 years ago

    Omg I think I'm worse off..

  • @mzms4l422
    @mzms4l422 6 years ago +1

    1st

  • @birdygamer5224
    @birdygamer5224 6 years ago +1

    I think I might understand this a little better if she used a video game example

  • @andreaqui1653
    @andreaqui1653 4 years ago

    this didn't make sense at all fam

  • @rachaelharwood9063
    @rachaelharwood9063 4 years ago

    This is way too fast

  • @ZIlxIM
    @ZIlxIM 6 years ago +1

    .

  • @KristopherStockholm
    @KristopherStockholm 6 years ago +1

    First

  • @danielmclaughlin5573
    @danielmclaughlin5573 6 years ago

    Yes. Of course there is unpalatable chocolate out there. It's called chocolate.

  • @BigYellowJoint1
    @BigYellowJoint1 6 years ago +8

    Just use SPSS