Machine Learning Fundamentals: Bias and Variance
- Published Jul 20, 2024
- Bias and Variance are two fundamental concepts for Machine Learning, and their intuition is just a little different from what you might have learned in your statistics class. Here I go through two examples that make these concepts super easy to understand.
For a complete index of all the StatQuest videos, check out:
statquest.org/video-index/
If you'd like to support StatQuest, please consider...
Buying The StatQuest Illustrated Guide to Machine Learning!!!
PDF - statquest.gumroad.com/l/wvtmc
Paperback - www.amazon.com/dp/B09ZCKR4H6
Kindle eBook - www.amazon.com/dp/B09ZG79HXC
Patreon: / statquest
...or...
YouTube Membership: / @statquest
...a cool StatQuest t-shirt or sweatshirt:
shop.spreadshirt.com/statques...
...buying one or two of my songs (or go large and get a whole album!)
joshuastarmer.bandcamp.com/
...or just donating to StatQuest!
www.paypal.me/statquest
Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
/ joshuastarmer
0:00 Awesome song and introduction
0:29 The data and the "true" model
1:23 Splitting the data into training and testing sets
1:40 Least Squares fit to the training data
2:16 Definition of Bias
2:33 Squiggly Line fit to the training data
3:40 Model performance with the testing dataset
4:06 Definition of Variance
5:10 Definition of Overfit
Correction:
4:06 I say that the difference in fits between the training dataset and the testing dataset is called Variance. However, I should have said that the difference is a consequence of variance. Technically, variance refers to the amount by which the predictions would change if we fit the model to a different training data set.
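The correction above can be made concrete. Below is a hedged Python sketch (not from the video; the sine "true" relationship, the noise level, and the polynomial degrees are made-up assumptions) that estimates variance the way the correction defines it: fit the same kind of model to many independent training sets and see how much the predictions change.

```python
import numpy as np

rng = np.random.default_rng(0)

def predictions_at(x0, degree, n_datasets=200, n_points=20):
    """Fit a polynomial of the given degree to many independent
    training sets and collect each fit's prediction at x0."""
    preds = []
    for _ in range(n_datasets):
        x = rng.uniform(0, 3, n_points)
        y = np.sin(x) + rng.normal(0, 0.3, n_points)  # noisy samples of a made-up "true" curve
        coefs = np.polyfit(x, y, degree)
        preds.append(np.polyval(coefs, x0))
    return np.array(preds)

# Variance = how much the predictions change across training sets.
straight = predictions_at(1.5, degree=1)  # straight line
squiggly = predictions_at(1.5, degree=9)  # squiggly line
print(f"straight-line prediction variance: {straight.var():.4f}")
print(f"squiggly-line prediction variance: {squiggly.var():.4f}")
```

Under these assumptions, the squiggly line's predictions jump around far more from one training set to the next, which is exactly the variance the correction is talking about.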
#statquest #biasvariance #ML
Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/
And at 4:55, why do you say straight line has low variance? That isn't necessarily true since those points on the graph could be anywhere else and if they are farther from the line, the sum of squares could easily be much greater.
@@leif1075 Given this dataset, the straight line has lower variance than the squiggly line. Given another dataset, things could be very different.
@@statquest Ok, so you were only referring to this dataset then? Sorry. What I said is correct in general though, right?
@@leif1075 Regardless of the models and the data, you always have to test to see which one has the least variance.
@@statquest So what I said was correct then?
4 hours of lectures with a lot of complicated math: got nothing
6 minutes with the singing guy: *DOUBLE BAM*
Hooray! :)
Math is important. Go learn the math.
You can't get anywhere without the math
@@fluxqubit ima jus import da python library my G. math is for fools
Bam
Better than lots of courses on Udemy. I really like your humor
Thanks! :)
@@statquest BAMMMM!!!!!!
@@Ex_Arc :)
@@statquest DOUBLE BAMMM
@@prashdash112 Thanks! :)
This guy has united his two passions: Machine Learning and guitar.
Yes! :)
and mice :)
and singing & composing! Loved the intro in this video :)
and saying "Bam"
Josh! How about that transformers video? Eagerly awaiting your humorous and mad explanation skills. Perhaps how it relates to its predecessor models? Key Query Value bit would be great as well. Keep on rocking it.
LOL What a way to present dry material with a dry approach and yet make it interesting and easy to follow :-) Great job!
His dryness went full circle XD
I went from BUMMED to DOUBLE BAM in six and a half minutes. God bless you!
Hooray! :)
I did the same in just over three minutes with increased playback speed! BAM
I wish professors taught like this!! Such clarity - I am so thankful to you.
Wow, thank you!
Wow, this was so straight to the point, with such great visuals, that I managed to figure it all out in one go! Great stuff!
Awesome!!! :)
Notes for myself:
Def. of Bias: The inability of a machine learning method to capture the true relationship is called Bias.
Def. of Variance: The difference in fits between data sets is called Variance.
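A minimal numeric sketch of the notes above (my own made-up data, not the video's): fit a straight line and a high-degree "squiggly" polynomial to the same training set, then compare their sums of squared residuals on the training vs. testing data.

```python
import numpy as np

rng = np.random.default_rng(42)

# Made-up data: a curved "true" relationship plus noise.
x = np.sort(rng.uniform(0, 3, 30))
y = np.sin(x) + rng.normal(0, 0.25, 30)

# Every other point goes to training, the rest to testing.
x_train, x_test = x[::2], x[1::2]
y_train, y_test = y[::2], y[1::2]

def ssr(degree):
    """Sum of squared residuals on training and testing data
    for a polynomial fit of the given degree."""
    coefs = np.polyfit(x_train, y_train, degree)
    train = np.sum((np.polyval(coefs, x_train) - y_train) ** 2)
    test = np.sum((np.polyval(coefs, x_test) - y_test) ** 2)
    return train, test

line_train, line_test = ssr(1)     # straight line: some bias, consistent fits
squig_train, squig_test = ssr(10)  # squiggly line: hugs the training data
print(f"straight line: train SSR {line_train:.3f}, test SSR {line_test:.3f}")
print(f"squiggly line: train SSR {squig_train:.3f}, test SSR {squig_test:.3f}")
```

The squiggly line always wins on the training data (lower bias); whether it also wins on the testing data is exactly what you have to check, and on a different dataset the result can flip.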
M-m-Morty huh? Learning some m-m-machine learning? Your grandpa rick would be p-p-proud of you **burp**, Morty.
Thanks Morty for ur short note, which helps me to understand the definition more clearly. Good luck for ur adventure with ur crazy Grandpa
Thank you Morty
@@BrandonSLockey That doesn't sound like something Rick would say! He'd probably berate Morty for trying to learn this and then go on a soliloquy of how nothing actually will ever matter :D
These two definitions are completely counter-intuitive for me; I have to re-define them for myself constantly. Bias sounds like the model is biased toward the training data, but the bias in the definition is toward the model's assumptions (e.g., a linear model is biased toward linearity). Variance sounds like the model's variation from the training-set data (creating high variance), but the definition refers to the large variance of the error values (i.e., residuals) when the model is fit to new data. Hope this helps if your intuition is similar to mine.
I have watched many of Josh's videos several times. Whenever I find myself trying to remember a concept, I know that a StatQuest video will sort me out in 10 minutes or less
BAM!
After watching more than a hundred videos on machine learning, I find your way of explaining things very easy to understand and digest. Plus, I am really amazed by the way you start your lectures and wait for the 'BAM' to come.
Wow, thanks!
you just did it in a perfect way. I've read blogs, "best ML books", and other resources, but you just nailed this. thank you!
Thank you!
So much quality content on Machine Learning!! I wish I had known about this channel a bit earlier. A must-follow channel for ML & DS enthusiasts. Great job Josh :) Please continue the good work and serve humanity!!
Thank you very much! :)
You're like the postal mailman of online videos. Neither snow nor rain nor heat nor gloom of night can stop StatQuest!
i read it as post malone
you have watched enough Seinfeld , haven't you?
I have paid for courses on edX and also have many free resources available to me through school- nothing has explained Bias and Variance as quickly and efficiently as you have in this video. Thank you, thank you, THANK YOU!
Hooray! I'm glad my video was helpful.
BAM ! Mindblowing how clearly explained these videos are, with even a sense of humour and some home made music. Really nice work, hats off.
Wow, thanks!
This guy is awesome... this video finally explained bias and variance to me. I have watched lots of other videos, but it was this one that taught me the concept.
Awesome!!! Thank you so much! :)
It couldn't have been made easier to understand these concepts. Great job! I hope your journey of making abstruse concepts easy to understand doesn't end here.
I hope not! :)
My master's ML course has been challenging. Getting washed over with lots of math full of Greek letters (I've only taken Calc I) and statistical jargon (I've never taken stats) when I am a simple computer science pleb has made class really hard. These videos make light work of looking past the confusing figures and long-winded, over-technical lectures! Thank you, Josh. Thanks, StatQuest!
Hooray! I'm glad my videos are helpful! :)
How on earth did you get into a masters of ML without more background in relevant subjects?
@@mitchellsteindler I'm looking back at my previous reply and see that it sounds like I'm doing a master's program in ML. What I was trying to say is that I was taking an ML course in my master's program. My program is just computer science :) But I passed my class with an A, with big thanks to these awesome videos!
@@BenStoneking ah okay
I am currently in a trainee program to learn machine learning...my teachers suggested this channel. This is awesome
Welcome aboard!
I have understood not only Bias and Variance, but also even more ML terminology that had been quite difficult for me to understand until this point! Keep it up brother! Very good job :)
Awesome!!! :)
The world of learning is still enjoyable because people like you are still in it
Thank you! :)
The best and most interesting videos combine fundamental statistics and machine learning for beginners. The heavy statistics textbooks are so boring, and after watching your series of videos I have a better understanding of many abstract things. Thanks, tons!!!
From Intro to Statistical Learning with Application in R: I fully grasp the picture of Bias and Variance now. In addition, flexible vs. less flexible techniques are now cemented into my memory; before, I just crammed the terminology without knowing exactly what it meant. I will be a regular visitor to this channel.
*Opens StatQuest Videos* -> Automatically clicks 'Like'
Best, most intuitively understood explanation of this that I've ever seen!
BAM! :)
You are probably the best resource when it comes to understanding the fundamentals of Machine Learning... like it's not even close
Thank you! :)
Very concise and easily understandable video. In the past I have read about this topic in books and seen other videos, but I never understood bias and variance this clearly before.
Thanks! I'm glad my video is helpful. :)
Thank you, Josh, for this wonderful video on Bias and Variance in ML. It was a great visual-heavy explanation and the explanations were made very clear for these two concepts!
Thank you very much! :)
Currently reading Intro to Statistical Learning with Application in R, and I can't tell you the number of times I've loaded up one of your videos to help me understand a concept such as Bias and Variance, because the book does a poor job of explaining things for a broader audience. Please keep it up!
Hooray! One of my long-term goals is to "translate" most of that book into StatQuest videos. This was the first, but I also just put out a video on Ridge Regression and will soon put out a video on Lasso Regression.
Literally doing exactly the same thing
I was searching Bias and Variance for the same reason. Thankfully I found this channel!
Came here for the exact same reason lol
MAN!!! I was reading about the bias and variance trade-off, but not a word got into my head... this video made it beyond clear!! Thanks a ton!!
Hooray! I'm glad the video was helpful. :)
I don't know why I subscribed to your channel long ago, but after a long time searching for an ML course I found you again. After watching the intro to ML, I felt like, wow, I subscribed to a worthy channel.
Hooray!!! :)
Josh , I don't know which i love more, your songs or your lessons on stats. You're amazing.
bam! :)
Wow... I went through many blogs, watched many videos, and asked any number of questions on Quora and other platforms, but your single video (less than 7 minutes) explained it all. Really, thanks man... you've done a great job.
Thank you!!! :)
This is some quality educational content...Keep up the good work brother!!
Definitely gonna buy some merch to support the channel!!
Awesome! Thank you! :)
It's amazing how a 6-minute video did a far (and I mean really far) better job of explaining these concepts than hours spent on articles that did nothing but increase confusion.
Thanks a lot for sharing this... much luv
Wow, thanks!
Thanks man, I do not know what the start was about, but your video really helped me. Thanks
Brilliant, clear, and concise explanation: the best I have seen!!! Congrats and many thanks.
Thank you! :)
Thank you for your work Josh, I learn more from your six-and-a-half-minute videos than I do from six and a half hours of textbooks and classwork
Glad to help!
This is absolutely brilliant, m8: crisp, clear, and very concise. Well done!! You've got one more stats fan now!
Hooray! Thank you very much! :)
Thank you so much for this video at this special moment! I hope you can keep safe during Florence hurricane! Good luck to you and the Carolinas!
Thank you! We got a lot of flooding, but I stayed dry and now the sun is shining again. :)
You have a gift for turning unclear concepts into pretty clear ones. Baaam!
Thank you! :)
Sir, I must say you are a gem. This 6:35 video has taught me what our PhD's 3 hours with 50 slides couldn't... Hats off
bam!
I do love the way you explain and the way you keep people alert to upcoming information
Thank you! :)
Paid thousands of dollars on Udacity, but ALWAYS have to come to your channel for a clear explanation. Love the way you explained all these complicated concepts Josh :) (Btw, we met at IVADO's 100 Days Event haha:) )
Hooray! I'm so glad my videos are helpful and IVADO's 100 Days Event was super cool. :)
Your videos are AMAZING!! Thank you Josh for being such an inspiration :) Have a wonderful weekend! :)
Thank you very much for this video! I am learning a lot from it and it helps me understand what people mean by Bias-Variance tradeoff!
Thank you very much! :)
Dude you are awesome, this is the first video I have seen from your channel. I plan on watching your other videos as well.
Such great visualizations. just wow.
Thank you very much! :)
Amazing video, love the clarity and simplicity.
Great video, very clear. Also, the graphics are intuitive. Thank you!
Thanks! :)
Awesome and very clear explanation!
I have watched many of your videos, and that has compelled me to write a comment: StatQuest is AWESOME!! @Josh Starmer, I am your fan. The way you begin your videos and go about explaining some of the most difficult concepts in Statistics and Machine Learning is GREAT. Many books and tutorials claim to make the complex simple, but rarely do so. This channel truly makes things simple to understand.
I have just one request (I think most of your followers would agree): please write a book on Machine Learning and its application of various algorithms (maybe a series of books).
Thanks so much! If I ever have time, I'll write a book, but right now I only have time to do the videos.
Nothing can be better than this in 6:35 minutes... It drives me crazy... I've stopped watching ML courses from the bigger names... will continue with #statquest. It's a double BAM!!! Love you Josh.
Thank you very much! :)
You should sell these videos as DVD sets. I bet a lot of educators would buy them.
Just found this channel today. Also making my way through ISLR. They have a great video series to go along with the book, but it's still pretty technical. This channel is a godsend. Thank you!
You're welcome!
I got to the point where I check StatQuest first whenever I come across unfamiliar topics. Thank you so much for all of your hard work!!!
Bam! :)
Hands down the best explanation that I have ever seen. Plus the humour is soo good
Thanks!
Great job man! Seriously, you made my journey in data science easier 👍
BAM! :)
very clear, no extra unnecessary "noise". I really enjoyed this lesson.
Awesome, thank you!
Thank God I found this channel! I understand 2-hour lectures in under 10 minutes - Thanks StatQuest!!
Happy to help! :)
I will comment on every single video of yours. Just to show how much I love your teaching style.
bam!
I loved your composition Miss Carolina. You have an amazing voice, Sir!
Thank you very, very much!!!! :)
3:09
psst. I can listen to this all day.
:)
You are a scholar and a gentleman. Thank you for explaining in 6 minutes what my lecturer tried to explain in 2 hours.
You're very welcome!
Such a GREAT video on bias-variance trade-off. Looking forward to your lectures on regularization and boosting~
PERFECT AND CLEAR!
Awesome!
this was so straight to the point, with some great visuals, that I managed to figure it all out in one go! BAM!!!!!
Hooray! :)
Thanks for all your videos, I will go through all of them! You are the best!
Tons of thanks to you... your videos are really nice... please do the video on regularization soon.
I should have the first video on Regularization out in the next week or two. :)
👍
Thank you so much for all of your videos. I'm watching them all in a row. All the subjects are so clearly explained!
Thank you very much from France!
Thank you very much!!! :)
I love that you explained why you square the differences! Most people don't bother explaining that and it always seemed strange to me.
Thanks!
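The squaring point deserves a concrete example: residuals above and below the line cancel when you just add them up, so a plain sum can make a terrible fit look perfect. Squaring makes every miss count, and punishes big misses more. A tiny sketch (the numbers are made up):

```python
# Two fits with the same raw residual sum but very different quality.
good_fit_residuals = [0.5, -0.5, 0.5, -0.5]  # always close to the data
bad_fit_residuals = [4.0, -4.0, 4.0, -4.0]   # always far from the data

# Plain sums cancel out, hiding the difference...
print(sum(good_fit_residuals), sum(bad_fit_residuals))  # 0.0 0.0

# ...but sums of squares do not, and big misses are punished more.
ssr_good = sum(r ** 2 for r in good_fit_residuals)  # 1.0
ssr_bad = sum(r ** 2 for r in bad_fit_residuals)    # 64.0
print(ssr_good, ssr_bad)
```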
You’re on my list of guys I’ll buy a beer for if I ever see in a bar. You, Jeremy Howard, and the folks over at Deep Lizard.
Wow! Thank you very much! :)
Hi Josh! You are the "God of ML and Stats". You really made me fall in love with these subjects.
I had a query: if we split the data into training and testing sets, what % should be assigned to testing? I think it should vary with the amount of data, but is there a rule of thumb?
There are a handful of "rules of thumb". One simple one: if you do 10-fold cross validation, you divide your data into 10 equally sized bins (see the StatQuest on cross validation: ua-cam.com/video/fSytzGwwBVw/v-deo.html ). Another standard is to use 75% for training and 25% for testing, which is the default setting for Python's scikit-learn function train_test_split().
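The 75%/25% rule of thumb from the reply above can be sketched without any libraries. This mirrors what scikit-learn's train_test_split() does by default (shuffle, then hold out 25% of the rows for testing); the function name here simply echoes scikit-learn's:

```python
import random

def train_test_split(data, test_frac=0.25, seed=0):
    """Shuffle the data, then hold out test_frac of it for testing
    (mirroring scikit-learn's default 75%/25% split)."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_frac)
    return shuffled[n_test:], shuffled[:n_test]

data = list(range(20))
train, test = train_test_split(data)
print(len(train), len(test))  # 15 5
```

Shuffling before splitting matters: if the data are sorted (say, by date), taking the first 75% as training would give the two sets systematically different values.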
My new favourite channel to learn the fundamentals of ML. Plus you use R!!! 🔥
One of the best videos I have come across so far
Thank you!
YouTube should give an option to add a thousand likes. Your channel beats the paid ML courses out there, hands down.
Thank you very much! :)
Man, you're very didactic! For each statement there is a 'because', so your students never end up with a question mark in their heads. Besides that, you don't mind repeating the becauses again and again in different ways, and that's what makes things clearer. Why can't other teachers, coaches, and tutors realize that? Triple BAMMM!
Thank you very much!! :)
I can seriously binge-watch this Channel!! Thanks, @JoshStarmer
BAM! :)
I was on a quest to understand Bias and Variance for a long time, until I saw StatQuest. Good work explaining, Josh.
BAM! :)
BAM. Subscribed.
great work, so funny
Very simply and amazingly explained; I've seen many tutorials, but this was by far the best. Thank you :)
Thank you!
This is an excellent machine learning series, and I especially like the song at the start of the video. Thank you statquest❤️
Glad you enjoy it!
Woah, your original songs are beautiful too!
Linear regression (aka least squares), finally; now I can die in peace. You explain things in a very nice way.
Thanks!
the best video so far on bias-variance tradeoff.
Thanks!
You're the biggest statistics nerd I have come across in a while. I love it
Thanks! :)
You are the male version of Phoebe Buffay!!! 😁
Also, can YouTube customize the like button to BAM!!? That would be great.. ;)
That would be awesome! :)
Your explanation of concepts outweighs tons of other videos I've watched on YouTube and other course websites.
Thank you so much; I will subscribe and become a member of your channel to support you.
Wow! Thank you so much for your support!
What an outstandingly simple and intuitive explanation, bravo!
Thank you! :)
StatQuest terminology: a Bam with a high tone means this is the point you should understand. A little bam means something more important is coming. A double bam means that at this point, you should be enlightened.
That's perfect!!! You made me laugh out loud. :)
@@samarthgoel1671 I think Tiny Bam means "boring but important."
@@statquest TRIPLE BAM
Who on the EARTH disliked this video? Probably other content creators...
It's always a mystery why someone doesn't like StatQuest. Maybe they couldn't handle the BAM! :)
@@statquest couldn't agree more xD
All your intro music gives me the feeling that the concepts are easy to understand... thank you for building that confidence.
Hooray! :)
Man, I love how you explained it, so easy to understand, smooth like butter 🔥🔥🔥🔥
Thanks!
I could simply replace my tuition payments with payments for a YouTube Premium subscription. Much cheaper and easier to study :D
bam! :)
Bam ! Double bam!!
Hooray!!!! :)
Wonderful presentation, explanation and the effort you put in visualising every step...Thank you Josh!!
Glad you enjoyed it!
Short and to the point and clear. Thanks
Thanks! :)
perfect video doesn't exist... wait nvm, found it!
bam! :)
ohhh nooooo, I thought that bias meant errors and variance meant variation of the data.
That's similar to what most people think "bias" and "variance" mean in the context of Statistics. Things are a little different in machine learning.