Machine Learning Tutorial Python - 10 Support Vector Machine (SVM)

  • Published 2 Jul 2024
  • Support vector machine (SVM) is a popular classification algorithm. This tutorial covers some theory first and then goes over Python coding to solve the iris flower classification problem using SVM and the sklearn library. We also cover parameters such as gamma and regularization and how to fine-tune an SVM classifier using them. In essence, a support vector machine draws a hyperplane in n-dimensional space that maximizes the margin between the classification groups (see the short code sketch after this description).
    #MachineLearning #PythonMachineLearning #MachineLearningTutorial #Python #PythonTutorial #PythonTraining #MachineLearningCource #SupportVectorMachine #SVM #sklearntutorials #scikitlearntutorials
    Code: github.com/codebasics/py/blob...
    Exercise: Open the above notebook from GitHub and go to the end.
    Exercise solution: github.com/codebasics/py/blob...
    Topics that are covered in this Video:
    0:00 Introduction
    0:20 Theory (explaining support vector machines using the sklearn iris flower classification problem)
    3:11 What is Gamma?
    4:21 What is Regularization?
    5:27 Kernel
    6:32 Coding (Start)
    18:08 sklearn.svm SVC
    21:41 Exercise (Classify the handwritten digits dataset from sklearn using SVM)
    Do you want to learn technology from me? Check codebasics.io/?... for my affordable video courses.
    Next Video:
    Machine Learning Tutorial Python - 11 Random Forest: • Machine Learning Tutor...
    Popular Playlists:
    Data Science Full Course: • Data Science Full Cour...
    Data Science Project: • Machine Learning & Dat...
    Machine learning tutorials: • Machine Learning Tutor...
    Pandas: • Python Pandas Tutorial...
    matplotlib: • Matplotlib Tutorial 1 ...
    Python: • Why Should You Learn P...
    Jupyter Notebook: • What is Jupyter Notebo...
    Tools and Libraries:
    Scikit learn tutorials
    Sklearn tutorials
    Machine learning with scikit learn tutorials
    Machine learning with sklearn tutorials
    To download the CSV files and code for all tutorials: go to github.com/codebasics/py, click the green button to clone or download the entire repository, and then go to the relevant folder to access the specific file.
    🌎 My Website For Video Courses: codebasics.io/?...
    Need help building software or data analytics and AI solutions? My company www.atliq.com/ can help. Click on the Contact button on that website.
    #️⃣ Social Media #️⃣
    🔗 Discord: / discord
    📸 Dhaval's Personal Instagram: / dhavalsays
    📸 Codebasics Instagram: / codebasicshub
    🔊 Facebook: / codebasicshub
    📱 Twitter: / codebasicshub
    📝 Linkedin (Personal): / dhavalsays
    📝 Linkedin (Codebasics): / codebasics
    🔗 Patreon: www.patreon.com/codebasics?fa...
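
    For reference, a minimal code sketch of the workflow the description outlines (an SVC on the iris dataset, tuned via the kernel, C and gamma parameters discussed in the video). It assumes scikit-learn is installed and is illustrative, not the notebook's exact code:

    # Minimal SVM-on-iris sketch (illustrative, not the notebook's exact code)
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    iris = load_iris()
    X_train, X_test, y_train, y_test = train_test_split(
        iris.data, iris.target, test_size=0.2, random_state=42)

    # kernel, C (regularization) and gamma are the parameters covered in the video
    model = SVC(kernel='rbf', C=1.0, gamma='scale')
    model.fit(X_train, y_train)
    print(model.score(X_test, y_test))  # mean accuracy on the held-out 20%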

COMMENTS • 484

  • @codebasics
    @codebasics  2 years ago +3

    To learn AI concepts in a simplified and practical manner check our course "AI for everyone": codebasics.io/courses/ai-for-everyone-your-first-step-towards-ai
    Do you want to learn technology from me? Check codebasics.io/ for my affordable video courses.

  • @jg12357
    @jg12357 3 years ago +3

    Thanks so much for the detailed video on SVM. This helped me a lot!

  • @favourfadeyi1128
    @favourfadeyi1128 2 years ago +7

    Thank you very much for these videos. They are really helpful. I did the exercise and got 99% when C=4. Any increase in C did not affect the accuracy. Also, any alteration made to gamma and kernel dropped the accuracy drastically. Thank you once again.

  • @sagnikmukherjee8954
    @sagnikmukherjee8954 4 years ago +15

    model = SVC(kernel='rbf', C=4, gamma='scale')
    With the above config, I got a model score of about 99.17%. Test size was 20%, as mentioned (a self-contained version of this setup is sketched after this thread).
    Thank you, these tutorials are amazing! :) cheers!

    • @codebasics
      @codebasics  4 years ago +9

      Again, great job Sagnik. I see that you are on a roll, finishing all the exercises from this playlist. Keep it up :)

    • @KULDEEPSINGH-li6gv
      @KULDEEPSINGH-li6gv 2 years ago

      @@codebasics Does a high model score indicate overfitting? I got a 98% model score with a 60% training size.

    • @Michelle-yk1fc
      @Michelle-yk1fc 1 year ago

      I got 99.25% model score with 70% training size
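
    For context, a self-contained sketch of the exercise configuration compared in this thread (digits dataset, 20% test split, an RBF SVC); the exact score depends on the random split:

    # Sketch of the digits exercise setup discussed above (scores vary with the split)
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    digits = load_digits()
    X_train, X_test, y_train, y_test = train_test_split(
        digits.data, digits.target, test_size=0.2, random_state=0)

    model = SVC(kernel='rbf', C=4, gamma='scale')
    model.fit(X_train, y_train)
    print(model.score(X_test, y_test))  # typically around 0.98-0.99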

  • @aatsw
    @aatsw 3 years ago +1

    A very solid, informative yet concise tutorial. Excellent. Please keep it up.

  • @saidouiazzane2297
    @saidouiazzane2297 4 years ago +8

    What a wonderful tutorial!! Well done and well explained. Thanks a lot, dude, for sharing this valuable knowledge.

  • @junjietan2668
    @junjietan2668 3 years ago +4

    Hello sir, thank you for your videos. They have really helped, right from the beginning of the data science playlist. 😄
    The model with default parameters scores 99.65% on the training set and 99.4% on the test set, whereas changing gamma lowers the accuracy from 99.4% to 75%, which clearly shows that gamma setting is unsuitable for this scenario. Regularization raises the training score to 1.0 while the test set keeps the best accuracy.
    The linear kernel also gives good accuracy.
    Thank you for your guidance.

  • @GlobalDee_
    @GlobalDee_ 3 years ago

    This series is the best I have seen for simple, explicit explanations of machine learning algorithms. Thanks.

  • @aditinagar6688
    @aditinagar6688 4 years ago +36

    Got 1.0 score when C=4 for iris data set. Thank you Sir! Your machine learning Playlist is a boon for beginners like me.

    • @nikitakazankov4099
      @nikitakazankov4099 3 years ago +5

      That's not always a good thing though. In most real life problems, that would mean that your model has become overfitted.

    • @LetsRetainLength
      @LetsRetainLength 3 years ago

      Same here😱

    • @LetsRetainLength
      @LetsRetainLength 3 years ago +1

      @@nikitakazankov4099 Ikrt😏

    • @jay-rathod-01
      @jay-rathod-01 3 years ago

      @@nikitakazankov4099 Though it does make sense. Whenever I see a Russian name, I bow down because of their intelligence.

    • @phungdao9184
      @phungdao9184 3 years ago +2

      @@nikitakazankov4099 Bro, the accuracy is on the test dataset. If it were on the training dataset, then it could indicate overfitting.

  • @joehansie6014
    @joehansie6014 3 years ago +1

    Very, very good tutorial. A gentle introduction to SVM. Thank you.

  • @amandal8404
    @amandal8404 3 years ago

    Very well-explained video. Thank you!

  • @azhaanali1109
    @azhaanali1109 4 years ago

    This is great! Thank you so much for the video

  • @danasharon4752
    @danasharon4752 2 years ago

    Thank you for this great series!

  • @DataScienceHarrison
    @DataScienceHarrison 5 months ago

    Thank you for these videos. They are really helpful. I did the exercise and got 99.17% when C=10. Any increase in C did not affect the accuracy. Also, any alteration made to gamma and kernel dropped the accuracy drastically. Be blessed.

  • @Abhishekpandey-dl7me
    @Abhishekpandey-dl7me 5 years ago +9

    One of the best lectures I have ever watched.

    • @codebasics
      @codebasics  4 years ago +3

      Hey Abhishek.
      Many thanks for your kind words. Stay in touch for more videos and share our channel if you really find it worthwhile.

  • @shubhangiagrawal336
    @shubhangiagrawal336 3 years ago

    Thank you so much for the wonderful job!!

  • @GojoKamando
    @GojoKamando 9 months ago

    Your lectures are so addictive, I am enjoying learning. Thank you so much.

  • @stephenngumbikiilu3988
    @stephenngumbikiilu3988 2 years ago +11

    Thank you so much for your presentation. I have learned a lot.
    Exercise
    Test size=0.2, C=1, kernel='poly'
    Accuracy: 99.17%

  • @amitblizer4567
    @amitblizer4567 1 year ago

    Thank you so much for this clear and helpful explanation. Well done.

  • @dhhan68
    @dhhan68 3 years ago +2

    I was looking for Python code for SVM... Thanks a lot... this was a great help... a very clean and intuitive lecture!

  • @stackflow4007
    @stackflow4007 2 months ago

    Great videos, bro. Finally understanding something :)

  • @BenjaminFunklin
    @BenjaminFunklin 5 years ago +5

    Hello, great videos, loved this series. Can you please do a video on imbalanced datasets in classification problems? Maybe just add onto a previous example you have, but with a case where there are very few "1" or "true" values compared to "0" or "false". Thanks for your consideration!

  • @sikanderayazkhan9996
    @sikanderayazkhan9996 5 years ago +12

    Wow! Brilliant work and a good teaching method as well. Thanks, sir, from Pakistan... keep it up!

  • @kousarjamadar7339
    @kousarjamadar7339 4 years ago +1

    All your concepts are so brilliant and well defined. Because of these videos, my concepts and doubts are now much clearer.

  • @franky0226
    @franky0226 4 years ago +11

    Great! Sir, can you elaborate on plotting the hyperplane (the decision function) in matplotlib?
    I want to see the line that best classifies the data.
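
    A hedged sketch of one way to do this: train on just two iris features so the decision regions can be drawn in 2-D with a meshgrid (illustrative only, not the video's code):

    # Visualizing SVM decision regions on two iris features (illustrative sketch)
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.datasets import load_iris
    from sklearn.svm import SVC

    iris = load_iris()
    X, y = iris.data[:, :2], iris.target      # keep 2 features so the boundary is plottable
    model = SVC(kernel='linear', C=1.0).fit(X, y)

    # evaluate the classifier on a grid covering the feature space
    xx, yy = np.meshgrid(np.linspace(X[:, 0].min() - 1, X[:, 0].max() + 1, 300),
                         np.linspace(X[:, 1].min() - 1, X[:, 1].max() + 1, 300))
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

    plt.contourf(xx, yy, Z, alpha=0.3)               # shaded decision regions
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k')
    plt.xlabel(iris.feature_names[0])
    plt.ylabel(iris.feature_names[1])
    plt.show()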

  • @arijitRC473
    @arijitRC473 5 years ago +4

    With the linear kernel the score is 96.9 percent and with the rbf kernel the score is 40 percent...
    With the gamma value the score is 0.06, and with the regularization value the score is around 45.83 percent.

  • @ashrafulalsabit4970
    @ashrafulalsabit4970 1 year ago

    It was really really helpful, thanks a million.

  • @gajanantayde
    @gajanantayde 2 years ago +2

    Thanks for this wonderful tutorial!!! Got an accuracy of 99.166%.

  • @benb6946
    @benb6946 3 years ago

    When I did the exercise, rbf performed slightly better for me than linear. I believe when you created your notebook, the default gamma was 'auto'. Using the 'scale' option provides much better results than 'auto' for rbf.
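
    A small sketch of the comparison described above (same RBF SVC, gamma='auto' vs gamma='scale' on the digits exercise); the exact gap depends on the split:

    # Comparing gamma='auto' vs gamma='scale' for an RBF SVC (illustrative sketch)
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    X_train, X_test, y_train, y_test = train_test_split(
        *load_digits(return_X_y=True), test_size=0.2, random_state=0)

    for g in ('auto', 'scale'):   # 'scale' has been the sklearn default since 0.22
        score = SVC(kernel='rbf', gamma=g).fit(X_train, y_train).score(X_test, y_test)
        print(g, score)           # 'scale' usually scores much higher on raw pixel data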

  • @codebasics
    @codebasics  4 years ago +2

    Exercise solution: github.com/codebasics/py/blob/master/ML/10_svm/Exercise/10_svm_exercise_digits.ipynb
    Complete machine learning tutorial playlist: ua-cam.com/video/gmvvaobm7eQ/v-deo.html

    • @anujvyas9493
      @anujvyas9493 4 years ago

      I used model = SVC(C=2.0, gamma='auto', kernel='rbf') and got an accuracy of 100%.
      Can you check whether that is right or not?
      I also used random_state=100 in the train_test_split method.

  • @karimjiwa9279
    @karimjiwa9279 3 years ago

    What an awesome tutorial.

  • @oshogarg341
    @oshogarg341 3 years ago +66

    Can you make a video titled "How to determine which classification model to use in ML according to the dataset"?

  • @sidduhedaginal
    @sidduhedaginal 4 years ago +3

    Your teaching skills are immeasurable, and it's very easy to understand; no need to scratch our heads looking at some other training institute.
    I ran the load_digits dataset and found the following scores:
    'rbf' kernel: score 98
    'linear' kernel: score 97

    • @codebasics
      @codebasics  4 years ago

      Siddu, thanks for the compliment and good job on the exercise. 👏👏👏 That is indeed a nice score.

  • @ferrari1
    @ferrari1 2 years ago +1

    Great vid! But it would've been nice if you had plotted the SVM line and scatter plots. Also, running a few predictions would be useful.

  • @ms.mousoomibora9526
    @ms.mousoomibora9526 4 years ago

    Very helpful for beginners!! Thank you so much.

  • @oshogarg341
    @oshogarg341 3 years ago

    Thank you so much Sir! for your machine learning playlist

    • @codebasics
      @codebasics  3 years ago

      I am happy this was helpful to you.

  • @nishattamanna2372
    @nishattamanna2372 4 years ago +3

    @codebasics, sir, could you please make a video with regression models like KNN regression or random forest with a train/test/validation set? Thank you for your amazing videos. I started my machine learning implementation journey with your tutorials. ❤

  • @iradukundapacifique987
    @iradukundapacifique987 4 years ago +1

    After all possible regularisations, my highest accuracy is 99%. Thank you sir

    • @codebasics
      @codebasics  4 years ago +2

      Iradukunda, that's a pretty good score buddy. Good job 👍👌👏

  • @CodeWithPrince
    @CodeWithPrince 4 years ago +1

    great tutorial man👍👍👍

  • @oshogarg341
    @oshogarg341 3 years ago

    You sound a bit tired in this one, but hats off to your efforts!

  • @deepanshutyagi1467
    @deepanshutyagi1467 4 years ago

    thanks for this!!

  • @musthakhahammed6535
    @musthakhahammed6535 2 years ago

    And thank you sir for an awesome playlist

  • @ajmalrasheed5412
    @ajmalrasheed5412 3 years ago

    I ran it on the digits dataset and got 99.16% with SVC,
    while with logistic regression it was 96.38%.
    So kudos to Support Vector Classification.

  • @simonmontenegro4897
    @simonmontenegro4897 7 months ago +1

    Excellent video. I'm reviewing what I learned a year ago in a university deep learning course (I'm a geophysics graduate) with this playlist, without too much math.
    For C=25, kernel='rbf' and gamma='scale', test_size=0.2
    Accuracy = 99.70%

    • @h.m.sazzadquadir1625
      @h.m.sazzadquadir1625 5 months ago

      I used kernel = linear and it gave me an accuracy score of 1.0 :3

  • @shipperification
    @shipperification 3 years ago

    You teach so well... I thought I would never understand ML...

  • @yl95
    @yl95 2 years ago +1

    Thank you very much.

  • @vikhyatshanbhag305
    @vikhyatshanbhag305 4 years ago +11

    When I tried the iris dataset with SVC default values, I got an accuracy of 1.0. The digits dataset with SVC(kernel='linear') gave 98% accuracy.

  • @tuhinkarak4927
    @tuhinkarak4927 4 years ago

    All your videos are just awesome❤❤❤

    • @codebasics
      @codebasics  4 years ago

      Thanks for your kind words of appreciation

  • @aayushsharma2881
    @aayushsharma2881 10 months ago

    Thank you, Dhaval sir. For the exercise I used the default rbf kernel with C=1 and got an accuracy of 0.991668.

  • @roopagaur8834
    @roopagaur8834 4 years ago

    Excellent...!!!! 😀 thanks

    • @codebasics
      @codebasics  4 years ago +1

      Roopa, thanks for the feedback

  • @williamwambua7710
    @williamwambua7710 3 years ago

    I am liking the tutorials. Thanks.

  • @minhaj6211
    @minhaj6211 3 years ago

    Please make a video on the topic "How to choose which ML algorithm for a dataset".
    And thanks for amazing videos, sir.

  • @UttamKumar-zj4qs
    @UttamKumar-zj4qs 2 years ago

    Hello sir, thank you so much for this video. I got 99.25% when I set C=1.
    If I use kernel='rbf', I get 99% accuracy.
    And for kernel='linear', I get 97.7% accuracy.

  • @awesome_harish
    @awesome_harish 4 years ago

    BEST DEMO ON SVM

  • @renjithas5809
    @renjithas5809 4 years ago

    Heartfelt thanks to you, sir.

  • @aviroxi
    @aviroxi 2 years ago

    thank you so much:)

  • @user-fe7kg7jt5w
    @user-fe7kg7jt5w 3 years ago

    For digits I got the highest accuracy of 0.99 with gamma='scale' and C=10.
    Thank you for your video!

    • @codebasics
      @codebasics  3 years ago

      That’s the way to go Коробка, good job working on that exercise

  • @prashantkumarsharma1
    @prashantkumarsharma1 5 years ago +1

    Thanks a lot for uploading. Please try to upload the next videos soon.

  • @rameshpokhriyalnishank7445
    @rameshpokhriyalnishank7445 2 years ago +1

    My score is 76.5 with the gini index model and 75.9 with the entropy model.
    BTW, thanks for the good teaching, sir ji.

  • @praveenkamble89
    @praveenkamble89 3 years ago

    Tried a couple of iterations; finally I got 99.166% accuracy with all default parameters and random_state=1 while defining the train/test data. Thanks a lot, sir.

  • @mridang2064
    @mridang2064 2 years ago +2

    Can't thank you enough bro.💜🙏
    Jai Shree Ram. Hope Ram bhagwaan bless your entire family.

  • @irshadmuhammed7740
    @irshadmuhammed7740 2 years ago +1

    I got a score of 99.4 with C=1 and gamma='scale',
    50 with gamma='auto',
    and 99.7 with gamma='auto' and C=10.
    Thank you, sir, for this series. I am following the tutorial and doing the exercises.

    • @codebasics
      @codebasics  2 years ago

      That’s the way to go irshad, good job working on that exercise

  • @sriharshavardhan299
    @sriharshavardhan299 5 years ago

    ur the best broo

  • @spyzvarun5478
    @spyzvarun5478 2 years ago

    Used hyperparameter tuning here to get 100% on train and 99.72% on test... luckily the data was clean, because I'm not very experienced in data cleaning, and here I didn't even do much data visualization.
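
    For anyone curious, a hedged sketch of hyperparameter tuning with GridSearchCV (not the commenter's exact code; the grid values are just examples):

    # Sketch: grid-searching SVC hyperparameters on the digits exercise
    from sklearn.datasets import load_digits
    from sklearn.model_selection import GridSearchCV, train_test_split
    from sklearn.svm import SVC

    X_train, X_test, y_train, y_test = train_test_split(
        *load_digits(return_X_y=True), test_size=0.2, random_state=0)

    param_grid = {'C': [1, 5, 10, 20],
                  'kernel': ['rbf', 'linear'],
                  'gamma': ['scale', 'auto']}
    search = GridSearchCV(SVC(), param_grid, cv=5)   # 5-fold CV over every combination
    search.fit(X_train, y_train)
    print(search.best_params_, search.score(X_test, y_test))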

  • @kamlakarsapkale319
    @kamlakarsapkale319 5 months ago

    Thank you, sir, for the wonderful explanation. I think higher regularization means a simpler model (5:11).

  • @sanooosai
    @sanooosai 5 months ago

    great thank you

  • @anuragmishra6262
    @anuragmishra6262 4 years ago +5

    I got 98.16% accuracy with C=2, kernel=rbf and gamma=0.001
    Maximum Accuracy: 100%
    Minimum Accuracy: 95 %
    Avg Accuracy: 98.16%

    • @codebasics
      @codebasics  3 years ago

      That’s the way to go Anurag, good job working on that exercise

  • @pappering
    @pappering 4 years ago +1

    Really nice video, thank you so much for such a brilliant video. I got a score of 98.61% with C=1, but I could not apply matplotlib visualization as there are 64 columns; I could not figure out which columns should be selected for visualization.
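
    One hedged way around the 64-column issue: those columns are the 8x8 pixel intensities, so a single row can be shown as an image instead of picking columns (a sketch, assuming matplotlib is available):

    # Sketch: the digits' 64 features are 8x8 pixels, so a row can be viewed as an image
    import matplotlib.pyplot as plt
    from sklearn.datasets import load_digits

    digits = load_digits()
    plt.matshow(digits.images[0], cmap='gray')   # same values as digits.data[0].reshape(8, 8)
    plt.title("label: %d" % digits.target[0])
    plt.show()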

  • @celsocunha1000
    @celsocunha1000 4 years ago

    amazing video

  • @sakshigaikwad8711
    @sakshigaikwad8711 8 months ago

    Thank you

  • @fathoniam8997
    @fathoniam8997 3 days ago

    OK, the exercise is cool. I got the best accuracy score using kernel='rbf'.

  • @minhaoling3056
    @minhaoling3056 2 years ago

    I think it should be: a high C corresponds to low regularization, meaning the classifier penalizes classification errors on the training data more heavily and fits it more closely.
    Vice versa for low C.
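
    A tiny sketch to see the direction of C concretely (in sklearn, regularization strength is inversely proportional to C, so a large C fits the training data more tightly):

    # Larger C = weaker regularization = tighter fit to the training data (sketch)
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    X_train, X_test, y_train, y_test = train_test_split(
        *load_iris(return_X_y=True), test_size=0.2, random_state=1)

    for c in (0.01, 1, 100):
        m = SVC(kernel='rbf', C=c).fit(X_train, y_train)
        print(c, m.score(X_train, y_train), m.score(X_test, y_test))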

  • @jk97322
    @jk97322 5 years ago +1

    Sir, please make videos on unsupervised learning. We have been waiting for them for a long time; hope you will help us.

  • @mk3vinfinate
    @mk3vinfinate 4 years ago

    Can you do some quick videos on exploratory data analysis? Things like custom querying and displaying relations between queried data elements?

  • @dipankarrahuldey6249
    @dipankarrahuldey6249 4 years ago

    1) Which is better to have: a large gamma or a large regularization parameter?
    2) We used only fit() but not fit_transform(). Is that because the rbf kernel performs the transformation itself to scale the features and the target labels?
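
    On question 2, a hedged note: fit_transform belongs to preprocessing transformers such as StandardScaler, not to SVC, and the RBF kernel does not rescale your features for you. A common pattern (a sketch, not the video's code) is to chain a scaler and the SVC in a pipeline:

    # Sketch: scaling via a transformer, then SVC; fit_transform happens inside the pipeline
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    X_train, X_test, y_train, y_test = train_test_split(
        *load_digits(return_X_y=True), test_size=0.2, random_state=0)

    clf = make_pipeline(StandardScaler(), SVC(kernel='rbf', gamma='scale'))
    clf.fit(X_train, y_train)          # the scaler's fit_transform runs on X_train here
    print(clf.score(X_test, y_test))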

  • @saurabh7943
    @saurabh7943 6 months ago +1

    We already worked with the iris data in the logistic regression exercise, where the peak value was also 96%.

  • @darwinchan5573
    @darwinchan5573 4 years ago

    very nice tutorial!

  • @subee128
    @subee128 1 year ago

    Thanks

  • @mucahitugurlu7324
    @mucahitugurlu7324 3 years ago +1

    Thanks for your effort, sir, but there is something I wonder about. When I fit a model, I can't see a description like the one in your Jupyter notebook (C=1, cache_size=200, etc.). Is there any way to see those parameters?
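
    A likely explanation (hedged): newer scikit-learn versions only print parameters that differ from their defaults. Two ways to see the full list:

    # Sketch: showing all SVC parameters in recent scikit-learn versions
    import sklearn
    from sklearn.svm import SVC

    model = SVC(C=1)
    print(model.get_params())                      # dict of every parameter, incl. cache_size
    sklearn.set_config(print_changed_only=False)   # make repr print all parameters again
    print(model)                                   # e.g. SVC(C=1, cache_size=200, ...)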

  • @sdhilip
    @sdhilip 5 years ago +1

    How do you determine whether the data is linearly separable or not when the dataset is very large?

  • @Koome777
    @Koome777 7 months ago +1

    The model score when C=20 is 0.9944444444444445. Varying kernel and gamma gave lower scores. This was my best.

  • @midhunskani
    @midhunskani 4 years ago

    Great tutorial. You explained all the concepts crisply and clearly. Liked and subbed.

  • @arijitRC473
    @arijitRC473 5 years ago +3

    Sir, please make videos on neural networks, anomaly detection and unsupervised learning... I am eagerly waiting... The last video I have seen is random forest... Please upload more.

    • @Surftech09
      @Surftech09 4 years ago

      Hello Arijit,
      Did you manage to implement the neural networks for unsupervised learning?
      I am currently working on DDoS attack detection with the NSL-KDD dataset and need a few clarifications.
      You can reply to me here or at chrisonic64@gmail.com.

  • @ajeniyiajayi
    @ajeniyiajayi 4 years ago

    Very good tutorial. I got 99.9% accuracy using kernel='rbf' and C=1.0.

    • @codebasics
      @codebasics  3 years ago

      That’s the way to go Ajeniyi, good job working on that exercise

  • @nomanshaikhali3355
    @nomanshaikhali3355 3 years ago

    The last example you have given here for practice is the same one we solved with the logistic regression model!! So what will be the difference between them? I am talking about the load_digits problem.

  • @mvcutube
    @mvcutube 3 years ago

    The best

  • @wast9050
    @wast9050 11 months ago

    Do you know a way to look at just one specific data point when you do the prediction at the end?
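
    One hedged way to do this (a self-contained sketch; index 0 is just an example):

    # Sketch: inspecting the prediction for a single test sample
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    X_train, X_test, y_train, y_test = train_test_split(
        *load_iris(return_X_y=True), test_size=0.2, random_state=0)
    model = SVC().fit(X_train, y_train)

    i = 0                                    # whichever sample you want to look at
    print(model.predict(X_test[i:i+1]))      # slice keeps the 2-D shape (1, n_features)
    print(y_test[i])                         # true label, for comparison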

  • @AbhiGupta-bc3wx
    @AbhiGupta-bc3wx 4 years ago

    nice video sir

  • @jenil16
    @jenil16 3 months ago +1

    Logistic regression is giving a 100% score... it's performing better than SVC and also the decision tree.

  • @rakeshdas2636
    @rakeshdas2636 1 year ago

    Thanks, sir. Please upload some new, advanced data science algorithms with practicals. Again, thank you, sir.

  • @hichemhadji348
    @hichemhadji348 3 years ago

    When I try to call model.fit I get this error:
    check_array() got an unexpected keyword argument 'warn_on_dtype'
    I don't understand how to fix it.

  • @ankitachavan4637
    @ankitachavan4637 4 years ago +3

    I got accuracy of 99.72% by keeping kernel='rbf', C=1 and gamma=0.002

  • @anjumrohra9778
    @anjumrohra9778 3 years ago

    Sir, how do you add a legend displaying all three categories with their corresponding markers in the plot?
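
    One hedged way (a sketch, not the video's exact plotting code): scatter each class separately with its own marker and label, then call plt.legend():

    # Sketch: one scatter call per iris class so the legend shows all three markers
    import matplotlib.pyplot as plt
    from sklearn.datasets import load_iris

    iris = load_iris()
    markers = ['o', 's', '^']                          # one marker per class
    for label, marker in zip(range(3), markers):
        subset = iris.data[iris.target == label]
        plt.scatter(subset[:, 0], subset[:, 1], marker=marker,
                    label=iris.target_names[label])    # legend text per class
    plt.xlabel(iris.feature_names[0])
    plt.ylabel(iris.feature_names[1])
    plt.legend()
    plt.show()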

  • @tarikamer3703
    @tarikamer3703 3 years ago

    excellent video

  • @sagarkhanna3339
    @sagarkhanna3339 3 years ago

    In the digits dataset exercise, I got an accuracy score of 99.78% on the test dataset.
    These are the hyperparameters I used: SVC(C=12.0, kernel='poly', gamma='auto')

  • @amirhosseindaneshpour8714
    @amirhosseindaneshpour8714 3 years ago

    Got a score of 99.16 on my test samples with C = anything more than 2 (I wonder why there was no difference between C=2 and C=100; I got 99.16 accuracy for all values of C greater than 2!). I didn't change gamma, or the score would be destroyed. The kernel was 'rbf'.
    Thanks for this amazing tutorial, BTW! :)

    • @codebasics
      @codebasics  3 years ago +1

      Good job Amir, that’s a pretty good score. Thanks for working on exercise

  • @nomanshaikhali3355
    @nomanshaikhali3355 3 years ago

    Kindly upload a video on the mathematics behind this model too!! Thanks.

  • @nomanshaikhali3355
    @nomanshaikhali3355 3 years ago

    Kindly make videos on K-nearest neighbors and on underfitting and overfitting.

  • @ajaimisra
    @ajaimisra 2 years ago

    Got 100% accuracy for the digits dataset with model = SVC(C=7, kernel='rbf')

  • @588kumar
    @588kumar 4 years ago

    Just watching the tutorial, you are not going to learn anything :-) --> We understood your intention sir. A big salute to you.

    • @codebasics
      @codebasics  4 years ago

      Ha ha.. nice. It is very true.