Real-World Python Machine Learning Tutorial w/ Scikit Learn (sklearn basics, NLP, classifiers, etc)

  • Published 14 Jun 2024
  • Practice your Python Pandas data science skills with problems on StrataScratch!
    stratascratch.com/?via=keith
    In this video we walk through a real-world Python machine learning project using the scikit-learn library. We work our way to building a model that automatically classifies text as having either a positive or a negative sentiment, using Amazon reviews as our training data. Full video timeline in the comments!
    Link to Code & Data:
    github.com/keithgalli/sklearn
    Raw Data download:
    jmcauley.ucsd.edu/data/amazon/
    scikit-learn documentation:
    scikit-learn.org/stable/docum...
    Make sure you have scikit-learn installed! To do this, either run "pip install scikit-learn" or use Python through Anaconda.
    Join the Python Army to get access to perks!
    YouTube - / @keithgalli
    Patreon - / keithgalli
    ---------------------------
    Follow me on social media!
    Instagram: / keithgalli
    Twitter: / keithgalli
    To get one of the cool shirts I was wearing:
    / pagandvls
    ---------------------------
    Video outline!
    0:00 - What we will be doing!
    3:40 - Sci-Kit Learn Overview
    6:38 - How do we find training data?
    9:33 - Download data
    11:45 - Load our data into Jupyter Notebook
    16:38 - Cleaning our code a bit (building data class)
    20:13 - Using Enums
    22:50 - Converting text to numerical vectors, bag of words (BOW) explanation
    25:45 - Training/Test Split (make sure to "pip install scikit-learn" !)
    33:45 - Bag of words in sklearn (CountVectorizer)
    40:05 - fit_transform, fit, transform methods
    42:05 - Model Selection (SVM, Decision Tree, Naive Bayes, Logistic Regression) & Classification
    47:50 - predict method
    53:35 - Analysis & Evaluation (using clf.score() method)
    56:58 - F1 score
    1:01:01 - Improving our model (evenly distributing positive & negative examples and loading in more data)
    1:20:36 - Let's see our model in action! (qualitative testing)
    1:22:24 - Tfidf Vectorizer
    1:25:40 - GridSearchCV to automatically find the best parameters
    1:31:30 - Further NLP improvement opportunities
    1:32:50 - Saving our model (Pickle) and reloading it later
    1:36:37 - Category Classifier
    1:39:14 - Confusion Matrix
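
The core steps in the outline above (bag of words, train/test split, fitting a classifier, scoring, F1 and the confusion matrix) can be sketched roughly as follows. This is a minimal sketch, not the code from the video: the tiny `samples` list is hypothetical stand-in data rather than the Amazon reviews, names such as `class_order` are illustrative, and a linear-kernel SVM is just one of the models the outline mentions. It assumes scikit-learn is installed.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import confusion_matrix, f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Hypothetical stand-in data: (review text, sentiment label) pairs.
samples = [
    ("great book, loved every page", "POSITIVE"),
    ("excellent story and great characters", "POSITIVE"),
    ("a wonderful read, highly recommend", "POSITIVE"),
    ("loved it, great writing", "POSITIVE"),
    ("fantastic plot, would buy again", "POSITIVE"),
    ("really enjoyed this, excellent", "POSITIVE"),
    ("terrible book, waste of money", "NEGATIVE"),
    ("boring story and bad characters", "NEGATIVE"),
    ("awful read, do not recommend", "NEGATIVE"),
    ("hated it, bad writing", "NEGATIVE"),
    ("horrible plot, would not buy again", "NEGATIVE"),
    ("really disliked this, terrible", "NEGATIVE"),
]
texts = [text for text, _ in samples]
labels = [label for _, label in samples]

# Hold out a third of the data; stratify keeps both classes in each split.
train_x, test_x, train_y, test_y = train_test_split(
    texts, labels, test_size=0.33, random_state=42, stratify=labels
)

# Bag of words: learn the vocabulary on the training text only,
# then reuse that same vocabulary for the test text.
vectorizer = CountVectorizer()
train_x_vectors = vectorizer.fit_transform(train_x)
test_x_vectors = vectorizer.transform(test_x)

# Fit the classifier on the training vectors and labels.
clf = SVC(kernel="linear")
clf.fit(train_x_vectors, train_y)

# Evaluate on the held-out test set.
predictions = clf.predict(test_x_vectors)
accuracy = clf.score(test_x_vectors, test_y)  # mean accuracy
class_order = ["POSITIVE", "NEGATIVE"]
f1_per_class = f1_score(test_y, predictions, average=None, labels=class_order)
matrix = confusion_matrix(test_y, predictions, labels=class_order)

print("accuracy:", accuracy)
print("F1 per class:", dict(zip(class_order, f1_per_class)))
print(matrix)
```

With a dataset this small the scores are not meaningful; the point is only the order of the calls, in particular `fit_transform` on the training text and plain `transform` on the test text.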
    ---------------------
    If you are curious to learn how I make my tutorials, check out this video: • How to Make a High Qua...
    *I use affiliate links on the products that I recommend. I may earn a purchase commission or a referral bonus from the usage of these links.

COMMENTS • 306

  • @KeithGalli
    @KeithGalli  4 years ago +92

    Video outline!
    0:20 - What we will be doing!
    3:40 - Sci-Kit Learn Overview
    6:38 - How do we find training data?
    9:33 - Download data
    11:45 - Load our data into Jupyter Notebook
    16:38 - Cleaning our code a bit (building data class)
    20:13 - Using Enums
    22:50 - Converting text to numerical vectors, bag of words (BOW) explanation
    25:45 - Training/Test Split (make sure to "pip install scikit-learn" !)
    33:45 - Bag of words in sklearn (CountVectorizer)
    40:05 - fit_transform, fit, transform methods
    42:05 - Model Selection (SVM, Decision Tree, Naive Bayes, Logistic Regression) & Classification
    47:50 - predict method
    53:35 - Analysis & Evaluation (using clf.score() method)
    56:58 - F1 score
    1:01:01 - Improving our model (evenly distributing positive & negative examples and loading in more data)
    1:20:36 - Let's see our model in action! (qualitative testing)
    1:22:24 - Tfidf Vectorizer
    1:25:40 - GridSearchCV to automatically find the best parameters
    1:31:30 - Further NLP improvement opportunities
    1:32:50 - Saving our model (Pickle) and reloading it later
    1:36:37 - Category Classifier
    1:39:14 - Confusion Matrix
    Thank you for watching! Make sure to like & subscribe if you enjoyed :)

    • @shagufta3247
      @shagufta3247 4 years ago +1

      thanks so much
      please make videos on Django python full tutorial using visual studio

    • @girishvenkatachalam8793
      @girishvenkatachalam8793 4 years ago +1

      Thanks man

    • @mimididi8689
      @mimididi8689 4 years ago +1

      Is there any way I could import another random dataset into my trained model and see if it can predict the category from the other database (the one I used to train my model)?
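
One way to approach this question, sketched under assumptions rather than taken from the video: bundle the vectorizer and the classifier into a single scikit-learn pipeline, save it with pickle, reload it, and call `predict` on text from the new dataset. The training texts, category names and `new_texts` below are hypothetical, and `LogisticRegression` is just one of the model types the outline lists.

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data standing in for the original labelled dataset.
train_texts = [
    "battery life on this phone charger is great",
    "the headphones have clear sound and good bass",
    "usb cable stopped working after a week",
    "this novel has a gripping plot and great characters",
    "the author writes beautiful chapters",
    "boring story, I could not finish the book",
]
train_categories = ["ELECTRONICS"] * 3 + ["BOOKS"] * 3

# Bundle the vectorizer and classifier so new text always goes through
# the same fitted vocabulary that the classifier was trained on.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_categories)

# Save and reload in memory; pickle.dump / pickle.load on a file work the same way.
saved_bytes = pickle.dumps(model)
loaded_model = pickle.loads(saved_bytes)

# Hypothetical "other" dataset: unlabelled text from a different source.
new_texts = [
    "the plot of this book was great",
    "my new headphones sound amazing",
]
predicted = loaded_model.predict(new_texts)
for text, category in zip(new_texts, predicted):
    print(f"{category}: {text}")
```

The model can only answer with categories it saw during training, so text from an unrelated domain will still be forced into one of them. Only unpickle files you trust, since loading a pickle can execute arbitrary code.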

    • @joneandrewharris8225
      @joneandrewharris8225 3 years ago

      can you help out with my error in the comments

  • @ManishSharma-xq9be
    @ManishSharma-xq9be 4 years ago +167

    He not only teaches the good stuff but also teaches how to google things and get the job done.
    Keep going, brother! You are awesome.

    • @KeithGalli
      @KeithGalli  4 years ago +32

      My goal is for you guys to be able to do this type of stuff on your own! Thanks for the support man, I appreciate it :)

    • @shuhratjonzikiryaev9685
      @shuhratjonzikiryaev9685 3 years ago +2

      Yes, I agree with you 100%. He is the only person I know on youtube that actually teaches the material so well! I hope to see this channel grow to millions of subscribers.

    • @niteshprajapat7918
      @niteshprajapat7918 3 years ago +1

      yess exactly.... I was confused how to use stackoverflow...but after watching his real world problem tutorial.. I learnt this skill too

  • @mucahitugurlu7324
    @mucahitugurlu7324 3 years ago +49

    You're the reason that I've got an internship at a great company :) Well.. I'm broke now :D but when I earn tons of money (I hope we all do :D) I'll donate to you, Keith!

    • @xyphoes345
      @xyphoes345 2 years ago

      How are you doing now, man?
      Any updates?

    • @joelrichmond6256
      @joelrichmond6256 2 years ago

      Hey man how you doing now

    • @KeithGalli
      @KeithGalli  2 years ago +6

      Doing well now finally! :). Will be back on youtube very soon

    • @uchechukwumazi6512
      @uchechukwumazi6512 2 years ago

      A quick one for those into machine learning: on a scale of 1-10, how sufficiently does this tutorial cover machine learning? I am developing certain skills in data analytics and want to add machine learning into the mix, but I don't want to dive too deep into it. Just what I will need for day-to-day machine learning job requirements.

  • @BennyHarassi
    @BennyHarassi 4 years ago +55

    Please keep uploading, you're one of the best tutorial channels.

    • @KeithGalli
      @KeithGalli  4 years ago +4

      Thank you!! Will do my best

  • @jenn6997
    @jenn6997 4 years ago +27

    I like it when you showed us how you would use online resources, all the Googling and documentation stuff, so that we are not afraid to actually go online ourselves and explore more new functions :) Thanks Keith!! Stay healthy! :)

  • @deufrai1
    @deufrai1 1 year ago +5

    50 y.o. software developer here.
    This is the first hands-on video I have watched on the subject of ML.
    As a first step into the subject, I'm very satisfied with the time I spent with you.
    You covered the basics, from data prep to model save and load.
    Surely a good starting point for further personal explorations.
    Also enjoying your Pandas related content
    Keep up the good work, and maybe use Jupyter's tab-completion, sometimes ;)

  • @alexq5516
    @alexq5516 3 years ago

    This video is super helpful! I have struggled with making my model using sklearn for several days and you just made my day! Thanks!

  • @hollmanbaez1423
    @hollmanbaez1423 4 years ago +2

    You are so good at explaining the hardest things in common language and making them easy for even my grandma to understand.... Thanks so much for making this simple!

  • @mohitkishore8494
    @mohitkishore8494 3 years ago +3

    This is by far the most useful tutorial that I have ever seen. You are an amazing teacher.

  • @Locke19901
    @Locke19901 4 years ago +2

    Keith, this is incredibly helpful. Your teaching style is to be commended. I look forward to more like this for ML.

  • @saptarshisanyal4869
    @saptarshisanyal4869 2 years ago +2

    This one is just one heck of a tutorial. Thanks a ton, Keith. I am a Java Architect with 17 years of extensive experience, looking to shift to ML/Data Science. It took me 3 hours to cover this video. I must say the first hour was really easy to follow, but you probably covered a lot of things in the last 40 minutes.

  • @gannoncondon1864
    @gannoncondon1864 2 years ago +1

    Another great video. Really appreciate minimal slides paired with the 'live' coding feel.

  • @Max-my6rk
    @Max-my6rk 4 years ago +3

    I am always being directed back to Keith's videos, and I stay... just awesome...

  • @aligh18
    @aligh18 3 years ago +2

    Wow Keith, you're an absolute legend! I can't wait to get through your other videos and see your future work :D

  • @jenn6997
    @jenn6997 4 years ago +3

    Phew, finally finished watching this one:) A lot to take in, but super helpful and interesting! Thanks, Keith! :) Gonna start your real-world task with Pandas tomorrow!

  • @asafrozali1353
    @asafrozali1353 3 years ago

    Thanks man, I'm watching your whole data science video series and you are awesome!

  • @merkol8
    @merkol8 2 years ago

    I have implemented my first ML model with your help. Please upload more content, you are amazing. Well done!

  • @johnhutchinson9445
    @johnhutchinson9445 3 years ago

    Thank you for this video! This saved me so much time digging through documentation to try to understand how to implement these libraries!

  • @somshridhar
    @somshridhar 2 years ago

    Really appreciate your efforts. I did not understand anything from my class teacher. Keith taught it very nicely. Thanks a lot

  • @ninjaduck3534
    @ninjaduck3534 3 years ago +1

    Dude you are an excellent educator, thank you so much for this well structured, well explained video!!

  • @FraserMyersMusic
    @FraserMyersMusic 4 years ago +30

    I was waiting for this! You sir, are a legend

    • @dylanyves6331
      @dylanyves6331 2 years ago

      @wise guy I think discrete math would help you grasp this

  • @sunritjana4573
    @sunritjana4573 3 years ago +9

    I have been doing a lot of courses for ML in scikit, I found this last week, and learnt it. And to be honest, I mastered things, which they couldn't cover in the so-called "mega" courses. You're awesome and also really helpful!

    • @rawmetal3052
      @rawmetal3052 2 years ago +5

      This guy is like the human version of W3Schools, his content is simple, succinct and well thought out

  • @vilasjagtap6165
    @vilasjagtap6165 4 years ago

    Great stuff Keith. Really good. Keep doing your bit for all of us. Thanks a lot.

  • @azrulfyz1162
    @azrulfyz1162 3 years ago +1

    Wow, that is one comprehensive tutorial. Thanks for the time and effort.

  • @RobinHagg
    @RobinHagg 2 years ago

    Yes. Been starting out with scikit and all videos are just so so. But your videos are always great

  • @hectordavila6249
    @hectordavila6249 4 years ago

    Thank you man, you are awesome, I really appreciate your videos and how you go through the whole process step by step. Please keep uploading.

  • @jamesriri1810
    @jamesriri1810 3 years ago +1

    @Keith Galli this is really dope. Totally love how you teach the tutorial. Amazing stuff here.

  • @quickpresent8987
    @quickpresent8987 3 years ago

    I only have basic knowledge of Python and C#, thanks for teaching me a machine learning method!!! Please continue uploading these kinds of videos, you are the best teaching channel

  • @lokeshnagarajan7495
    @lokeshnagarajan7495 3 years ago +1

    Amazing video. One won't find such a tutorial on Python and machine learning modules. It's the very video that helped me complete my project.

  • @haraldlons
    @haraldlons 4 years ago +9

    Just watched the video in one sitting. It was great! I learned so much, and I loved that you showed the entire process from data to evaluation of the model. Keep up the good work :)

    • @KeithGalli
      @KeithGalli  4 years ago +2

      Thank you! Glad it was helpful :)

  • @BM-vz2nb
    @BM-vz2nb 4 years ago +2

    Very good and cool Tutorial Keith! Thanks a Ton! Loved it!

  • @saikumargatla4706
    @saikumargatla4706 1 year ago

    Your videos are changing my life

  • @liam4154
    @liam4154 2 years ago

    Great video man! some of the best quality educating on youtube!

  • @lisaw3829
    @lisaw3829 4 years ago

    Watching the tutorial is a kind of enjoyment! Have subscribed and am waiting for more videos.

    • @KeithGalli
      @KeithGalli  4 years ago +1

      Glad to hear it! Thanks for the sub! :)

  • @overgeared
    @overgeared 4 years ago +1

    practical and nicely done. thanks! please do more videos on sklearn, maybe regression & clustering...

  • @kushsheth4801
    @kushsheth4801 2 years ago

    That moment of joy when I saw my model work! It's like magic, too good

  • @gisleberge4363
    @gisleberge4363 8 months ago

    Really helpful, made me realise new possibilities for how to go about text data - thanks 🙂

  • @ssharkwsk9439
    @ssharkwsk9439 4 years ago

    Great videos and series Keith, Kudos to you. Keep it going....

  • @keihinata3740
    @keihinata3740 3 years ago

    Thanks, Keith! I really like how you teach this stuff. Easy to understand and covers all necessary topics. Excellent tutorial. This comment might be considered a 'POSITIVE' sentiment in the model. 😆

  • @amankumarsingh6242
    @amankumarsingh6242 4 years ago +1

    Your videos are superb. I can see your videos and just get started applying it to my project. Thank you👍.

    • @KeithGalli
      @KeithGalli  4 years ago +2

      That's awesome! Glad you have enjoyed :)

  • @vijaykumar-od7kx
    @vijaykumar-od7kx 1 month ago

    Excellent tutorial to learn the fundamentals of scikit-learn

  • @tak68tak
    @tak68tak 4 years ago +1

    Sooo POSITIVE. You really saved me. Thanks a lot!

  • @BarryOGorman
    @BarryOGorman 19 days ago

    Great intro - and commitment to good programming

  • @carlosjacobfield-sierra3759
    @carlosjacobfield-sierra3759 4 years ago +1

    This video was great man, keep it up, you're going places.

  • @prashlovessamosa
    @prashlovessamosa 1 year ago

    Your channel is heaven to me.

  • @safizaidi2787
    @safizaidi2787 4 years ago +1

    Keith man. This is an awesome video. Please make some more videos just like your "Solving real world data science task" video.

  • @briannnnnnnnnn1037
    @briannnnnnnnnn1037 3 years ago +2

    This is great! Looking forward to more ML content like regression, decision trees, SVM.

  • @drglover31
    @drglover31 3 years ago

    KG Intelligence I appreciate your detailed videos on this platform

  • @lfmtube
    @lfmtube 3 years ago

    Very good video! New subscriber and added to my “ Perfect videos” list. Thanks for sharing your knowledge.

  • @haigangzhang8039
    @haigangzhang8039 2 years ago

    Thanks a lot for the great video, I spent a few days to follow through, and learn a lot!

  • @dikshyantthapa3367
    @dikshyantthapa3367 3 years ago +1

    You kept appearing on my thumbnail.. I didn't care at first.. Later for once i opened the data science video.. Man.. It was so useful. The application videos of machine learning, data science were awesome. Thanks Keith ❤️.

    • @KeithGalli
      @KeithGalli  3 years ago +1

      Well I'm happy that you ended up clicking on a video :). Also glad that you have found the videos useful. I appreciate the support!

  • @leasrhythm473
    @leasrhythm473 3 years ago

    very nice video, so well explained for beginners! Thank you so much!

  • @piotrb5161
    @piotrb5161 4 years ago

    It's a good and hard work for... for us! Thank you Keith!

    • @KeithGalli
      @KeithGalli  4 years ago +1

      You're very welcome! Challenging yourself is the best way to learn :)

  • @AmandeepSingh-cv5qz
    @AmandeepSingh-cv5qz 2 years ago

    keith ,you are like an elder brother teaching us how to do sums.thanksssssssssssssssssssssssssss a lottttttttttttttttttttttttttt bruhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh

  • @azrmuradl6420
    @azrmuradl6420 1 year ago

    amazing! expecting more projects like this

  • @sarowarshouvo
    @sarowarshouvo 1 year ago

    Hey Keith, through this video I just completed my first machine learning project. Thanks :)

    • @KeithGalli
      @KeithGalli  1 year ago

      Nice work! Your first of many to come 🤠

  • @tedgq
    @tedgq 4 years ago +1

    Letss goo! Didn't know I wanted this video until it was here

  • @dhruvmk3055
    @dhruvmk3055 4 years ago +3

    Great video, but I tested a few other algorithms on the data-set and they seemed to work even better on the data. The algorithms were: Nearest Centroid Classifier and Stochastic Gradient Descent. Thanks for the video though, really helped me.

  • @DataversePH
    @DataversePH 3 years ago

    This video is so underrated. Should have at least 500K views.

  • @maksimparnyakov6802
    @maksimparnyakov6802 4 years ago

    Thank you very much for the proper explanation. It became clear after your video

  • @sudhakar6933
    @sudhakar6933 3 years ago

    Finally, found another one whose lessons are understandable

  • @sandeepsharma-ph8id
    @sandeepsharma-ph8id 1 year ago

    Awesome learning material. Thanks for making it.

  • @hsrayyar
    @hsrayyar 3 years ago

    You look so young, but your ability is so good. Thanks for your explanation

  • @tobiasksr23
    @tobiasksr23 4 years ago +1

    Awesome. Are you planning on making more of these machine learning videos? It would be great if you could include more about the preprocessing part, maybe trying to get data from a source where it is not ordered and has lots of outliers.

  • @happylearning-gp
    @happylearning-gp 4 years ago +1

    Excellent Tutorial Keith, Thank you very much

    • @KeithGalli
      @KeithGalli  4 years ago +1

      Glad you enjoyed! You are very welcome :)

  • @kihunkim7498
    @kihunkim7498 4 years ago

    i miss your tutorial . good job !!!

  • @mahmoudaldeeb452
    @mahmoudaldeeb452 4 years ago

    Keep going man, you are the best

  • @DJSEWWES
    @DJSEWWES 1 year ago

    big fan of what you are doing keep it up (y)

  • @sreevlogger3847
    @sreevlogger3847 2 years ago

    Really nice and easy to understand

  • @hemanthshankar4520
    @hemanthshankar4520 3 years ago

    i really like the way u explain

  • @kryskoss8410
    @kryskoss8410 4 years ago +2

    I learned more in these 2 hours than my professor taught in 2 weeks. Many thanks!

  • @bosorensen
    @bosorensen 2 years ago

    Wonderfully done! Thank you!

  • @girishvenkatachalam8793
    @girishvenkatachalam8793 4 years ago

    Nice...shall watch full video now

  • @MrTaken-tl4bw
    @MrTaken-tl4bw 3 years ago

    In the first exercise, if any of you feels like laughing a bit, do this:
    if float(review['overall']) < 2:
        print(review['reviewText'] + '\n')
    Also, great video! Didn't know I could enjoy Data Science as much as I am.

  • @gregmaland5318
    @gregmaland5318 3 years ago +1

    Thank you! This was extremely helpful. (POSITIVE)

  • @robiparvez
    @robiparvez 1 year ago

    awesome stuff, bro
    subscribed 🙂

  • @gianniprocida3332
    @gianniprocida3332 2 years ago

    Thanks for the excellent tutorial!!

  • @datastako156
    @datastako156 2 years ago

    Great tutorials! I learned a lot from you. More power!

  • @dibyaranjansahu9971
    @dibyaranjansahu9971 3 years ago

    Great tutorial, loved it

  • @bryantjohnston8663
    @bryantjohnston8663 3 years ago

    Well done. Thanks for this!

  • @lokguanlim7420
    @lokguanlim7420 4 years ago

    Love your tutorial a lot :D

  • @iliassuvanov6690
    @iliassuvanov6690 4 years ago

    Sir, it's a great tutorial, thank you!

  • @jongcheulkim7284
    @jongcheulkim7284 2 years ago

    Thank you so much. it was awesome!!

  • @ishanpatel3086
    @ishanpatel3086 4 years ago +1

    Perfectly done!!💯✨

  • @kylieying2
    @kylieying2 4 years ago +3

    This is awesome!

  • @karthikeyan-ro6ud
    @karthikeyan-ro6ud 3 years ago

    Well done sir..great job🔥

  • @MucahitKatirci
    @MucahitKatirci 3 years ago

    Learned a lot, thank you

  • @melihsafacelik1166
    @melihsafacelik1166 4 years ago +1

    I am like machines. I am always learning... Haven't watched it yet, but I believe you did your best.

    • @melihsafacelik1166
      @melihsafacelik1166 4 years ago

      Edit: I just finished this tutorial and I still stand by my first comment. NOICE. You are the real deal!

  • @americovaldazo4441
    @americovaldazo4441 3 years ago

    Great video. Very didactic. Thank you.

  • @rarethamaren913
    @rarethamaren913 3 years ago +1

    Great tutorial, Keith. You are incredible!!
    Anyway, do you have any book recommendations for studying? I'm still new to machine learning, so it would be nice to read a few books first before I start studying machine learning in practice. Thanks in advance!!

  • @marvinbotchway
    @marvinbotchway 4 years ago +2

    Great tutorial

  • @bboyhusky
    @bboyhusky 2 years ago +1

    Great tutorials ! Sentiment.POSITIVE
    Jupyter-Tip: press esc + numbers 1,2,3 or 4 to create markdown header cells

  • @davidlee5715
    @davidlee5715 4 years ago

    thank you mate! that's amazing!

  • @dryanwarrener
    @dryanwarrener 4 years ago

    Nice! Very informative.

  • @richarddoggies3038
    @richarddoggies3038 3 years ago

    It's a positive comment on your video about how it's cool. Thank you

  • @temilolorunaiyelari3676
    @temilolorunaiyelari3676 11 months ago

    Great tutorial!!!!

  • @tarifgolder4456
    @tarifgolder4456 2 years ago

    wow, that's a great tutorial ❤❤❤

  • @kar.s3390
    @kar.s3390 3 years ago

    Hey Keith, that was awesome
    Try to upload more sklearn ML tutorials

  • @akhileswarv7194
    @akhileswarv7194 4 years ago

    Great tutorial!

  • @Kris-to7vh
    @Kris-to7vh 3 years ago

    Best channel ever to learn any Python library!
    1:05 I wonder what the outcome will be for sarcasm, something like: 'beautiful restaurant that made me puke, recommend'