Regression: Crash Course Statistics #32

  • Published Sep 30, 2024
  • Today we're going to introduce one of the most flexible statistical tools - the General Linear Model (or GLM). GLMs allow us to create many different models to help describe the world - you see them a lot in science, economics, and politics. Today we're going to build a hypothetical model to look at the relationship between likes and comments on a trending YouTube video using the Regression Model. We'll be introducing other popular models over the next few episodes.
    Crash Course is on Patreon! You can support us directly by signing up at / crashcourse
    Thanks to the following Patrons for their generous monthly contributions that help keep Crash Course free for everyone forever:
    Mark Brouwer, Kenneth F Penttinen, Trevin Beattie, Satya Ridhima Parvathaneni, Erika & Alexa Saur, Glenn Elliott, Justin Zingsheim, Jessica Wode, Eric Prestemon, Kathrin Benoit, Tom Trval, Jason Saslow, Nathan Taylor, Brian Thomas Gossett, Khaled El Shalakany, Indika Siriwardena, SR Foxley, Sam Ferguson, Yasenia Cruz, Eric Koslow, Caleb Weeks, D.A. Noe, Shawn Arnold, Malcolm Callis, Advait Shinde, William McGraw, Andrei Krishkevich, Rachel Bright, Mayumi Maeda, Kathy & Tim Philip, Jirat, Ian Dundore
    --
    Want to find Crash Course elsewhere on the internet?
    Facebook - / youtubecrashcourse
    Twitter - / thecrashcourse
    Tumblr - / thecrashcourse
    Support Crash Course on Patreon: / crashcourse
    CC Kids: / crashcoursekids

COMMENTS • 188

  • @Waltham1892 6 years ago +787

    BRAIN HURT!!!

    • @MasterofPlay7 5 years ago +15

      nah actually I find it very fun, Z test, T test, F test and ANOVA all have to do with std and variance, which have to do with the mean

    • @nikkid4890 5 years ago +6

      @MasterofPlay7 I also LOVE stats. And this from a person who sold my math textbooks for toffee at school! Once you get it, it's so much fun!

    • @MasterofPlay7 5 years ago +9

      @nikkid4890 yes, we humans are too dumb, most statistical analyses have to do with a straight line (most models are based on y=ax+b) cuz we can only perceive the relationship through a straight line

  • @justynaizabela2495 6 years ago +571

    This series is amazing! I have majored in Statistics and still this series explains everything much better than college classes.

    • @panosshady6168 4 years ago +70

      You majored in statistics? Wow, and here I thought that I hated myself.

  • @samsonthelionhearted6873 4 years ago +118

    You’re going so fast you lost me a bit.

  • @kanitoneko 6 years ago +268

    How can she keep speaking without inhaling!?

  • @RaulPelcastreRealEstate 5 years ago +43

    She speaks a little too fast for me but clearly explained. I like it.

  • @demonika2060 2 years ago +3

    lmao i didn't understand anything

  • @NN_2000 5 years ago +172

    Only crash course can make statistics interesting. Thank you for making quality educational videos for free! :D

  • @sarlut 4 years ago +60

    I swear this series is the reason I am actually doing well in statistics! Wish I had this in my BSC (MSc Student)

  • @klaras5703 5 years ago +152

    the best thing about the video is how the pumpkin and the transformer slowly eat all the candy worms that were on the table during the video

  • @liliadams1 1 year ago +3

    Lost so lost

  • @raposo_debora 4 years ago +61

    This course is sooo good. I'm using the Covid-19 Quarantine to educate myself in Statistics and this Crash Course was THE find. Thanks a lot!

  • @danieldelacruz7642 6 years ago +175

    Who needs a regression calculation when you have "add trendline" in Excel?

    • @Tntpker 6 years ago +30

      who still uses excel in 2018 lol. keep up and learn python noob

    • @avinoamr 6 years ago +3

      The folks that want to work at Microsoft or any of its competitors

    • @alanle18 6 years ago +97

      Tntpker excel will forever be used, it has a great balance between learning curve and power. You must feel good about yourself putting strangers down over the internet.

    • @OlleLindestad 6 years ago +21

      I mean, that's what "add trendline" does. It does a regression on your data. :D

    • @aNytmare 6 years ago

      But a B-spline looks sooo much nicer!!!
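
As noted in this thread, a spreadsheet's "add trendline" performs a regression fit. A minimal Python sketch of that calculation, using the closed-form least-squares formulas for slope and intercept, might look like the following. The data and the names `fit_line`, `comments`, and `likes` are invented for illustration and do not come from the video.

```python
# Hypothetical data, made up for illustration: comments (x) and likes (y).
comments = [10, 25, 40, 60, 85, 100]
likes = [120, 260, 390, 640, 830, 1010]

def fit_line(xs, ys):
    """Ordinary least squares fit of y = m*x + b (closed-form solution)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    m = sxy / sxx
    # Intercept: the fitted line always passes through (x_mean, y_mean).
    b = y_mean - m * x_mean
    return m, b

slope, intercept = fit_line(comments, likes)
print(f"likes = {slope:.2f} * comments + {intercept:.2f}")
```

A linear trendline fitted to the same data should report the same slope and intercept, since it solves the same least-squares problem.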

  • @TilleTheo 4 years ago +12

    It was a bit too fast, but very helpful still! Will watch it a few more times.

  • @OlleLindestad 6 years ago +81

    NOTE: This video uses the abbreviation "GLM" incorrectly (or at least very misleadingly) throughout.
    The general linear model is NOT usually what is meant by "GLM". Instead, GLM stands for generaLIZED linear model, which is a special kind of linear model that (among other things) allows for a response variable that is not normally distributed. (Yes, this is extremely confusing. Don't even get me started on the word "linear", which doesn't even mean "straight lines" in this context.)
    Bottom line: substitute simply "linear model" whenever Adrienne says "GLM" in this video, and you'll be fine.

  • @EconomicalUnicorn 4 years ago +8

    Did anyone notice how as the video goes on, there are less and less lollies (sweets) near the pumpkin lmao

  • @theidiotboy100 1 year ago +8

    I think you guys are the reason people study or stay in school. net positive for humanity. thanks for helping people.

  • @seltsamerjunge3642 6 years ago +6

    Interesting. I just factchecked the theory about the comment-to-likes ratio, and it met pretty well: At the time I've written this, there were 41 comments and 391 likes, which is just the value "4000/100" shown in the diagram... As it turned out, this time it's above the regression line, but with an increase in the y-value by less than 35%

  • @benbernanke4037 4 years ago +15

    These graphical presentations are so good, especially at 8:30 the different sums of squares types

  • @mw79863s 5 years ago +32

    Some unnecessarily confusing parts:
    It would have been helpful to explain that our zero-coefficient line IS the line y='y hat'.
    The point referred to at 7:05 is not highlighted or pointed out (and as it sits far above its distance for SSR it isn't instantly recognizable as connected).
    Positioning of the equations at 8:50 gives strong and erroneous implication that each refers specifically to the diagram above.
    The equation given for F-statistic at 8:58 is then instantly revised as not being correct.
    The correct f-statistic equation is only on screen at 10:07 for a fraction of the time needed to read it - let alone fully digest it.

    • @BastiPROTON 4 years ago +2

      Exactly! I was so confused the whole time, a lot of it makes little sense if you see this stuff for the first time.

    • @jaceychang5785 4 years ago

      I think the F-statistic formulas are wrong at both 8:58 and 10:07 ! At 10:07 the denominator and numerator should be reversed!
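
The comments in this thread disagree about the F-statistic formulas shown on screen. Whatever the video displays, the standard definition for simple linear regression is F = MSR / MSE, where MSR = SSR / df_regression and MSE = SSE / df_error. The sketch below works through that calculation; the data and the function name `regression_f_statistic` are made up for illustration and are not taken from the video.

```python
def regression_f_statistic(xs, ys):
    """Return the sums of squares, mean squares, and F for y = m*x + b."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n

    # Least-squares slope and intercept.
    m = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
         / sum((x - x_mean) ** 2 for x in xs))
    b = y_mean - m * x_mean
    preds = [m * x + b for x in xs]

    sst = sum((y - y_mean) ** 2 for y in ys)            # total variation
    ssr = sum((p - y_mean) ** 2 for p in preds)         # explained by the line
    sse = sum((y - p) ** 2 for y, p in zip(ys, preds))  # left over (error)

    df_regression = 1   # one slope estimated beyond the mean-only model
    df_error = n - 2    # n points minus two estimated parameters
    msr = ssr / df_regression
    mse = sse / df_error
    return {"SST": sst, "SSR": ssr, "SSE": sse,
            "MSR": msr, "MSE": mse, "F": msr / mse}

# Made-up data for illustration only.
result = regression_f_statistic([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(result)
```

With n = 100 data points, `df_error` comes out to 98, which matches the degrees of freedom for SSE quoted elsewhere in these comments.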

  • @alfredgustafsson4708 4 years ago +30

    This is great, especially the explanation of degrees of freedom. I never really understood it through five years of Economics so thank you.

  • @medslarge 6 years ago +9

    Didn’t really understand the degrees of freedom part 🤔

  • @tatsianatati8375 5 years ago +9

    Oh my, too fast for me 🤯🤯🤯

  • @dondacurator 5 years ago +5

    This right here is the most entertaining and intriguing statistical video I've ever watched.. it actually made stats fun, thanks for incorporating art and creativity into this piece, instead of old and boring numbers presented in a monotone, go-to-sleep-now voice

  • @treelight1707 6 years ago +5

    I finally figured out the issue with this series, why it is so hard to follow. The animations are too much, too fast for statistics. I can barely follow through with the examples, or cannot follow at all. Example: the calculations; you can't remove each line before the next. I would want to see what numbers went where, and it is not that long of a calculation that you need to have space. Other than that, I think everything else is fine. Crash course Economics was awesome btw.

  • @Maria-hd4hk 5 years ago +7

    First minute and a half and i've actually learnt so much

  • @danyypao1824 5 years ago +4

    I'm so lost

  • @kowalityjesus 6 years ago +15

    I really appreciate this explanation, but I think you started moving too quick when discussing degrees of freedom. I can't get what you're talking about after listening to it even several times. Specifically the lines she says at 19:13 are completely non-understandable to me. Thanks, though

    • @SyedAriff 5 years ago +2

      19:13 doesn't exist

  • @eritrean_forever 4 years ago +2

    Toooooo fast speaking for a teacher...only reason I have to look for another video!

  • @AliceChused 5 years ago +12

    great videos but why are they SO FAST? take a breath in and breathe out and add a few extra explanatory words…

    • @PJ3721 4 years ago

      and here I want her to talk faster

  • @jeremiahharemza1235 5 years ago +5

    "I know Kung Fu" - Neo
    "Show me." - Morpheus

  • @nikkid4890 5 years ago +3

    Wow! You are brilliant. I'm post-grad and needed to refresh. Brilliant

  • @PhysqueLab 6 years ago +3

    May I ask when the logistic regression video will be uploaded?

  • @saivishnutulugu5014 6 years ago +3

    Can you go over nonlinear data models (exponential, power, etc.) and also Simpson's paradox in the future?

  • @berfeito 6 years ago +2

    Can anyone recommend an exercise book or a site with practice questions for statistics? I feel like I need to practice it on my own. Cheers.

  • @williamkee6578 4 years ago +4

    This video is absolutely helpful! One single video and I understand the contents from 2 hours class.

  • @stankalfon2170 6 years ago +5

    Thank you this helped me so much! Will you do a video on multiple regression and econometrics in general? Keep up the good work you guys rock!

  • @Sagitarria 6 years ago +2

    For the trick or treat example, would it be appropriate to try a logarithmic transformation?

  • @sunsusan2739 4 years ago +1

    I'm confused by the equation at 2:24. Should "increase in likes per comment" be in blue, standing for m instead of x?

  • @tutukkunoor 5 years ago +1

    At 9:33, she says 'The sums of squares for regression (SSR) has one degree of freedom as one degree is consumed in calculating slope of the model line'. How is that o.O

  • @manzurekhoda7013 4 years ago +1

    You should have included nonlinear methods of regression in this video. Anyway, great video.

  • @Malik-jt8hi 5 years ago +1

    Thank god for crash course lol, godsend channel to start to learn a topic when I gotta teach my brother about a topic I’ve never learned myself

  • @granny933 5 years ago +1

    If linear regression is for straight lines, how can regression be used to detect a curve? After all, how common are straight lines in science....

    • @3rl0y 5 years ago +3

      You can apply a transformation or use a different technique altogether, fully depends on what you are researching. As for straight lines in science, perfect lines, no, but approximations are enough. And those are abundant.

    • @granny933 5 years ago

      Baldur thank you.
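
The "apply a transformation" suggestion in this thread can be sketched as follows. If the data follow roughly y = a * exp(k * x), then log(y) = log(a) + k * x is a straight line in x, so an ordinary linear fit on log(y) recovers k and a. The data here are generated from a known curve purely for illustration, and `fit_line` is a hypothetical helper, not something from the video.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares fit of y = m*x + b."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    m = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
         / sum((x - x_mean) ** 2 for x in xs))
    return m, y_mean - m * x_mean

# Synthetic, noise-free data from y = 2 * exp(0.5 * x); illustration only.
xs = [0, 1, 2, 3, 4, 5]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

# Fit a straight line to log(y) instead of y.
log_ys = [math.log(y) for y in ys]
k, log_a = fit_line(xs, log_ys)
a = math.exp(log_a)

print(f"y = {a:.3f} * exp({k:.3f} * x)")
```

Real data would include noise, and fitting on the log scale weights errors differently than fitting the curve directly, so this is a starting point rather than a general recipe.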

  • @mpilosov 6 years ago +2

    At 9:43, do you mean “the mean”?
    In the null model, we are just using the mean of the data (one independent piece of info) to predict the outcome. You say “slope,” but aren’t we not using slope, i.e. setting it to zero?

    • @mikail5682 5 years ago

      It's at 9:34
      "The sums of squares for regression (SSR) has one degree of freedom, because we are using one piece of independent information to estimate our coefficient, the slope"
      Correct me if I'm wrong, but the sentence has to be "...we are using one piece of independent information to estimate SSR, the mean".
      If this is incorrect, please explain why.

  • @NikitaSamourai 5 years ago +3

    i don't understand why the sums of squares for regression has one degree of freedom

    • @NikitaSamourai 5 years ago

      I DON'T WANT TO OPEN TABACHNICK AND FIDELL

    • @iefe65 5 years ago

      same, I don't get it

    • @cycla 5 years ago

      Because only 1 independent variable is used to generate the regression
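
The degrees-of-freedom question raised in this thread (and in several other comments) can be laid out with the standard decomposition for simple linear regression. The notation below is the usual textbook one and is an assumption here, since the video's on-screen symbols are not reproduced in these comments: $\bar{y}$ is the mean of the observed values and $\hat{y}_i$ is the value the regression line predicts for point $i$.

```latex
\begin{align*}
SST &= \sum_{i=1}^{n} (y_i - \bar{y})^2
      && \text{total, } df_T = n - 1 \\
SSR &= \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2
      && \text{regression, } df_R = 1 \\
SSE &= \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
      && \text{error, } df_E = n - 2 \\
SST &= SSR + SSE,
      && n - 1 = 1 + (n - 2)
\end{align*}
```

The mean-only (null) model estimates one parameter, the mean, while the regression line estimates two, the intercept and the slope. SSR measures the improvement gained from that one extra parameter, so it carries one degree of freedom. With n = 100 this gives 99 = 1 + 98, consistent with the 98 degrees of freedom for SSE mentioned in the comments.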

  • @chinmayadhiman3358 5 years ago +1

    my god you talk too much and too fast!

  • @Ureallydontknow 6 years ago +2

    this video is top production quality and expert instruction. thank you so much.

  • @cbottube 6 years ago +1

    *watches Optimus very closely throughout the video*

  • @michaelyoon9355 26 days ago

    I opened the CD cart, I haven't done that since Aprilish. I've regressed.

  • @ArjunTheCreator8 1 year ago

    Guess I'm the only one who noticed the gummy worms slowly disappearing...

  • @iefe65 5 years ago

    I don't get why at 9:35 she says that we only need 1 degree of freedom to calculate the slope. I understand the 98 DF for SSE but I don't get why SSR has only 1 DF

  • @mayankjacky 11 months ago +1

    Thanks

  • @mariamontero5651 1 year ago

    here you have another comment :D

  • @alikhan81 4 years ago +2

    Just to F-up the F-Test, I'm gonna leave a comment without liking the video

  • @greensteve9307 6 years ago +2

    So much clearer than my uni stats lecture!

  • @lagh 5 years ago +1

    Just use SPSS 😂

  • @ravindukarunarathne507 4 years ago +1

    So nice, can keep watching for hours.. Well done

  • @sungkim1397 6 years ago +1

    I am lost +_+

  • @florentinfrank3671 4 years ago +1

    Well done! thanks so much for all the efforts! now i understand better!

  • @TaroQuispe 4 years ago +2

    You guys rock the house, super clear, super helpful!

  • @bartonpaullevenson3427 4 years ago

    I can't give this a like, because I think you took way too long to get to degrees of freedom, and gave the strong impression that F was defined as SSR/SSE for most of the presentation. I would introduce the mean squares MSR and MSE first.

  • @technofeeliak 6 years ago +8

    Statistics are the ultimate rationalization of life's experiences through math. Unfortunately, the government and other organizations can take this oversimplification to back up their fallacies.

    • @KitsuneSoftware 6 years ago +1

      Only when their audience doesn't understand the stats. It's like small print in contracts (who reads those?) or those disclaimers in adverts in tiny print or really fast voices. Lessons like these help us to not be fooled.

  • @unleashingpotential-psycho9433 6 years ago +11

    I remember statistics class in school was very challenging T_T

    • @gardenhead92 6 years ago +3

      Said like a true psychologist

    • @DPMixing 6 years ago +1

      Well it seems easy when you get to watch entertaining, visually-stimulating videos and not have to be assessed for your application of the concepts with homework and exams...😂😂😂

    • @gnometheory3831 6 years ago

      @DPMixing Yup, I am in AP stats with a 97% and watched this for fun to see how dumbed down it is. The answer: very.

  • @milesbrown1889 4 years ago

    Did anyone else notice the bell curve in the background and where she sits positions her as being among the average? how funny is that! I’m not showing off my observational skills at all it’s just an observation.

  • @antoniolupen6138 7 months ago +1

    would be nice to have the dataset in order to be able to replicate the exercise

  • @mariamontero5651 1 year ago

    two comments

  • @expansivegymnast1020 1 year ago

    This series is pointless... until you actually need this stuff for class and then you're thankful to God that it exists. Thanks for everything y'all do!

  • @libbylebyane3681 1 year ago

    Linear Regression is the building block in Artificial Intelligence predictions

  • @Tfin 6 years ago

    Don't forget to factor in the number of dishes there are. You might want dirty dishes as a percentage of all dishes owned and a percentage of space in the sink. Higher numbers are bad for both values. Maybe instead of working out the math and plotting data, you could just do the @#!$ dishes already. I can't even wash them at this point without taking them out to make space.

  • @davidcampos1463 6 years ago

    You mean it's all of us human being random number generators against YouTube's mechanical algorithms. "Of course you realize, this means war!"

  • @kensaville513 11 months ago

    Useful video, thanks. It looks like the alpha value used was 0.5. Should this be 0.05?

  • @itierney 4 years ago +3

    Schoolgirl Error: You started using the term ‘regression’ without defining it.

  • @jaceychang5785 4 years ago

    The F-statistic formulas are wrong at both 8:58 and 10:07 ! Though the calculation is correct.

  • @isaacliu896 6 years ago +1

    Gah, a bit late.

  • @HinamiMel 1 year ago

    every night brings a dream but the day, relentlessly, keeps me awakeee

  • @JEOGRAPHYSongs 6 years ago +2

    There is certainly something to be said for flexibility.

  • @rajdubey4389 6 years ago

    SIR PLZ MAKE VIDEOS ON MATHS IF U WANNA CROSS 10M COZ THERES MILLION OF SAME DEMAND

  • @Flush.103 6 years ago +1

    new video yay!

  • @alexbe3136 4 years ago

    Easter egg alert: The candies disappear while the video goes on :)

  • @Bigbopper01 11 months ago

    ugh!

  • @circleofideas9549 4 years ago

    Ma'am, you are so sweet. Thank you for teaching us.

  • @Abhalerao96 5 years ago

    OPTIMUS!!! You're distracting me!

  • @adamacosta5019 4 years ago

    So lost I want to cry, but seeing @AstroKatie was a nice pick me up

  • @Nshiime 5 years ago

    Hi, at 3:51 is it sum of (observed value minus predicted value)^2 or is it sum of (observed value minus average of values observed)^2?

  • @jasonreviews 6 years ago

    you just explained one portion of machine learning.

  • @dianarinker8429 1 year ago

    what a great explanation!! thank you so much!

  • @missaster1902 4 years ago

    You can’t be like that glasses 👓 guy 🥺

  • @corinneblair8795 5 months ago

    So good! So helpful! Thank You!!

  • @aldorosas1136 6 years ago +2

    Very interesting. One comment though. "The regression line is the one straight line that minimizes the sum of the squared distances of each point to the line" (3:50) can be slightly misleading. It seems to suggest the actual distance from each point to the line, which (except for a horizontal line) would not be vertical. It should say, "...minimizes the sum of the squared vertical distances from each point to the line."
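
The distinction drawn in this comment can be checked numerically. For a line y = m*x + b, the vertical distance from a point is |y - (m*x + b)|, while the perpendicular (shortest) distance is |m*x - y + b| / sqrt(m^2 + 1). The sketch below compares the two; the line, the point, and the function names are hypothetical values chosen for illustration.

```python
import math

def vertical_distance(x, y, m, b):
    """Distance measured straight up or down from the point to the line."""
    return abs(y - (m * x + b))

def perpendicular_distance(x, y, m, b):
    """Shortest distance from the point to the line y = m*x + b."""
    return abs(m * x - y + b) / math.sqrt(m * m + 1)

# Hypothetical line y = 2x + 1 and point (3, 10).
m, b = 2.0, 1.0
px, py = 3.0, 10.0

print("vertical:     ", vertical_distance(px, py, m, b))
print("perpendicular:", perpendicular_distance(px, py, m, b))

# For a horizontal line (m = 0) the two distances coincide.
print("horizontal line:", vertical_distance(px, py, 0.0, b),
      perpendicular_distance(px, py, 0.0, b))
```

Ordinary least squares regression minimizes the squared vertical distances. Minimizing squared perpendicular distances is a different technique, usually called orthogonal regression or total least squares, and generally gives a different line.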

  • @siddharththomas5740 6 years ago

    How many more episodes will there be?

  • @alexdetoxx2633 5 years ago

    why doesn't YouTube have a heart button?

  • @neilcidial-masrysandagesid7796 6 years ago

    Insightful. Will read watch.

  • @emopeterparker7 4 years ago

    hi apitong 👋

  • @mayankjacky 11 months ago

    It was a very comprehensive, concise and crisp presentation on a complex topic. Kudos to the entire team for an excellent effort.

  • @SaraAB98 4 years ago

    Thank you very much ❤

  • @bharathreddy4806 5 years ago

    please add content of full machine learning algorithms

  • @user-py5lm6ip8y 4 years ago

    ca.....n....t...... pr...o..ce..ss

  • @actualprogramming 6 years ago

    Next what? Correlation?

  • @jimivie 4 years ago

    100/100 - great video

  • @thenewnerdtrucker 6 years ago +1

    Get outta here with this linear stuff...it’s all about logistic regression...

  • @brodyreingold5992 1 year ago

    great video