Machine Learning Tutorial Python - 10 Support Vector Machine (SVM)
- Published 2 Jul 2024
- Support vector machine (SVM) is a popular classification algorithm. This tutorial covers some theory first and then walks through Python code that solves the iris flower classification problem using SVM and the sklearn library. We also cover parameters such as gamma and regularization and how to fine-tune an SVM classifier using them. The way a support vector machine works is that it draws a hyperplane in n-dimensional space that maximizes the margin between the classification groups.
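As a quick illustration of the workflow the video follows, here is a minimal sketch of iris classification with sklearn's SVC. The split and parameter values are illustrative assumptions, not the video's exact ones:

```python
# Minimal sketch of SVM classification on the iris dataset.
# Parameter values below are illustrative assumptions.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)

# C controls regularization (larger C = weaker regularization);
# gamma controls how far one training point's influence reaches.
model = SVC(kernel='rbf', C=1.0, gamma='scale')
model.fit(X_train, y_train)
print(model.score(X_test, y_test))
```

On a dataset this small the test score is usually in the high 90s, but the exact number depends on the split.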
#MachineLearning #PythonMachineLearning #MachineLearningTutorial #Python #PythonTutorial #PythonTraining #MachineLearningCourse #SupportVectorMachine #SVM #sklearntutorials #scikitlearntutorials
Code: github.com/codebasics/py/blob...
Exercise: Open the above notebook from GitHub and go to the end.
Exercise solution: github.com/codebasics/py/blob...
Topics covered in this video:
0:00 Introduction
0:20 Theory (Explain support vector machine using sklearn iris dataset flower classification problem)
3:11 What is Gamma?
4:21 What is Regularization?
5:27 Kernel
6:32 Coding (Start)
18:08 sklearn.svm SVC
21:41 Exercise (Classify hand written digits dataset from sklearn using SVM)
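The exercise above (classifying the handwritten digits dataset) could be sketched like this, assuming an 80/20 split; the hyperparameters are illustrative guesses, not the official solution:

```python
# Sketch of the exercise: classify sklearn's handwritten digits
# with SVC. Hyperparameters below are illustrative assumptions.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=42)

model = SVC(kernel='rbf', C=4, gamma='scale')
model.fit(X_train, y_train)
print(model.score(X_test, y_test))
```

Try varying C, gamma, and the kernel yourself before checking the exercise solution.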
Do you want to learn technology from me? Check codebasics.io/?... for my affordable video courses.
Next Video:
Machine Learning Tutorial Python - 11 Random Forest: • Machine Learning Tutor...
Popular Playlists:
Data Science Full Course: • Data Science Full Cour...
Data Science Project: • Machine Learning & Dat...
Machine learning tutorials: • Machine Learning Tutor...
Pandas: • Python Pandas Tutorial...
matplotlib: • Matplotlib Tutorial 1 ...
Python: • Why Should You Learn P...
Jupyter Notebook: • What is Jupyter Notebo...
Tools and Libraries:
Scikit learn tutorials
Sklearn tutorials
Machine learning with scikit learn tutorials
Machine learning with sklearn tutorials
To download the csv files and code for all tutorials: go to github.com/codebasics/py, click the green button to clone or download the entire repository, and then go to the relevant folder to access that specific file.
🌎 My Website For Video Courses: codebasics.io/?...
Need help building software or data analytics and AI solutions? My company www.atliq.com/ can help. Click on the Contact button on that website.
#️⃣ Social Media #️⃣
🔗 Discord: / discord
📸 Dhaval's Personal Instagram: / dhavalsays
📸 Codebasics Instagram: / codebasicshub
🔊 Facebook: / codebasicshub
📱 Twitter: / codebasicshub
📝 Linkedin (Personal): / dhavalsays
📝 Linkedin (Codebasics): / codebasics
🔗 Patreon: www.patreon.com/codebasics?fa...
To learn AI concepts in a simplified and practical manner check our course "AI for everyone": codebasics.io/courses/ai-for-everyone-your-first-step-towards-ai
Thanks so much for the detailed video on SVM. This helped me a lot!
Thank you very much for these videos. They are really helpful. I did the exercise and got 99% when C=4. Any increase in C did not affect the accuracy. Also, any alteration made to gamma and kernel dropped the accuracy drastically. Thank you once again.
model = SVC(kernel = 'rbf', C = 4, gamma = 'scale')
With the above config, I got a model score of about 99.17%. Test size was 20%, as mentioned.
Thank you, these tutorials are amazing! :) cheers!
Again, great job Sagnik. I see you are on a roll, finishing all the exercises from this playlist. Keep it up :)
@@codebasics Does a high model score lead to overfitting? I got a 98% model score with 60% training size.
I got 99.25% model score with 70% training size
A very solid, informative yet concise tutorial. Excellent. Please keep it up.
What a wonderful tutorial!! Well done and well explained. Thanks a lot, dude, for sharing this valuable knowledge.
Hello sir, thank you for your videos. They have really helped, right from the beginning of the data science playlist you have listed. 😄
With default parameters the model scores 99.65% on train and 99.4% on test, whereas setting gamma explicitly lowers the accuracy from 99.4% to 75%, which shows that gamma value is unsuitable for this scenario; regularization improved the train score to 1.0 while the test set retained its best accuracy.
The linear kernel also gave good model accuracy.
Thank you for your guidance.
This series is the best I have seen for simple and explicit machine learning and algorithms. Thanks.
Glad to hear that!
Got 1.0 score when C=4 for iris data set. Thank you Sir! Your machine learning Playlist is a boon for beginners like me.
That's not always a good thing though. In most real life problems, that would mean that your model has become overfitted.
Same here😱
@@nikitakazankov4099 Ikrt😏
@@nikitakazankov4099 Though it does make sense, Whenever I see a Russian name I bow down because of their intelligence.
@@nikitakazankov4099 Bro, the accuracy is on the test dataset. If it were on the training dataset then it would be overfit.
Very very good tutorial. The gentle practice of svm. Thank you
Very well-explained video. Thank you!
This is great! Thank you so much for the video
Thank you for this great series!
Thank you for this. They are really helpful. I did the exercise and got 99.17% when C=10. Any increase in C did not affect the accuracy. Also, any alteration made to gamma and kernel dropped the accuracy drastically. Be blessed.
one of the best lecture I have ever watched
Hey Abhishek.
Great thanks for your kind words. Stay in touch for more videos and share our channel if you really find it worth it.
Thank you so much for the wonderful job!!
Your lectures are so addictive, I enjoy learning. Thank you soooooooooo much.
Thank you so much for your presentation. I have learned a lot.
Exercise
Test size=0.2, C=1, kernel='poly'
Accuracy: 99.17%
Thank you so much for this clear and helpful explanation. well done
I was looking for python code to SVM... Thanks a lot... this was a great help... very clean and intuitive lecture~!
Glad it was helpful!
Great videos, bro. Finally understood something :)
hello great videos, loved this series. Can you please do a video on imbalanced data sets in classifications problems? Maybe just add onto a previous example you have but with a case where there are very few "1" or "true" values compared to "0" or "false" . thanks for you consideration!
Wow! What brilliant work and a good teaching method as well. Thanks, sir, from Pakistan... keep it up!
All your concepts are so brilliant and well defined. Because of these videos, my concepts and doubts are now so much clearer.
Glad you like them!
Great! Sir, Can you elaborate something about plotting the hyperplane (the decision function) in matplotlib
I want to see the best line which classifies the data
With the linear kernel the score is 96.9 percent and with the rbf kernel it is 40 percent...
With that gamma value the score is 0.06, and with the regularization value it is around 45.83 percent.
It was really really helpful, thanks a million.
Thanks!!!!!!!!! for this wonderful tutorial got accuracy 99.166%
When I did the exercise, rbf performs slightly better for me than linear. I believe when you created your notebook, the default gamma was auto. Using the scale option provides much better results than auto for rbf.
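The difference the comment above describes can be checked directly. In sklearn, gamma='auto' is 1 / n_features while gamma='scale' is 1 / (n_features * X.var()), so on unscaled pixel data they behave very differently. A small sketch (the split and random_state are assumptions):

```python
# Compare gamma='auto' (1 / n_features) with gamma='scale'
# (1 / (n_features * X.var())) for an rbf SVC on the digits data.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=1)

scores = {}
for gamma in ('auto', 'scale'):
    clf = SVC(kernel='rbf', gamma=gamma).fit(X_train, y_train)
    scores[gamma] = clf.score(X_test, y_test)
    print(gamma, round(scores[gamma], 3))
```

Because the digits pixels are not standardized, 'auto' yields a much larger gamma than 'scale' and the rbf kernel effectively memorizes the training points, which is why 'scale' scores far better here.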
Exercise solution: github.com/codebasics/py/blob/master/ML/10_svm/Exercise/10_svm_exercise_digits.ipynb
Complete machine learning tutorial playlist: ua-cam.com/video/gmvvaobm7eQ/v-deo.html
I used model = SVC(C=2.0, gamma='auto', kernel='rbf') and got an accuracy of 100%
Can you check that it is right or not?
Also I used random_state = 100 in train_test_split method for random values
What an awesome tutorial.
Can you make a video on title "how to determine which classification model to be used in ML according to dataset" ?
Pycaret answers that query
@@nivedhansenthilkumar964 Thanks
Look up cross validation.
@@nivedhansenthilkumar964 thanks
Yess
Your teaching skills are immeasurable, and it's very easy to understand; no need to scratch our heads looking at some other training institute.
I ran it on the load_digits dataset and found the following scores:
For the 'rbf' kernel, score: 98
For the 'linear' kernel, score: 97
Siddu, thanks for the compliment, and good job on the exercise. 👏👏👏 That is indeed a nice score.
Great vid! But it would've been nice if you had plotted the SVM line and scatter plots. Also, running a few predictions would be useful.
Very helpful for beginners!! Thank you so much.
Thank you so much Sir! for your machine learning playlist
I am happy this was helpful to you.
@codebasics sir, could you please make a video with regression models like KNN regression or random forest with a train/test/validation set? Thank you for your amazing videos. I started my machine learning implementation journey with your tutorials.❤
After all possible regularisations, my highest accuracy is 99%. Thank you sir
Iradukunda, that's a pretty good score buddy. Good job 👍👌👏
great tutorial man👍👍👍
You sound tired in your voice, but hats off to your efforts!
thanks for this!!
And thank you sir for an awesome playlist
I calculated on the digits dataset and got 99.16% with SVC,
while with logistic regression it was 96.38%.
So kudos to Support Vector Classification.
Excellent video. I'm using this playlist to review what I learned a year ago in a deep learning course at university (I'm a geophysics graduate), without seeing too much math.
For C = 25 kernel = rbf and gamma = scale, Test_size = 0.2
Accuracy = 99.70%
I used kernel = linear and it gave me an accuracy score of 1.0 :3
You teach so well... I thought I would never understand ML...
dank je wel
When I tried the iris dataset with SVC default values, I got an accuracy of 1. The digits dataset with SVC(kernel='linear') gave 98% accuracy.
All your videos are just awesome❤❤❤
Thanks for your kind words of appreciation
Thank you, Sir Dhaval. For the exercise I used the plain rbf kernel with C=1 and got an accuracy of 0.991668.
Excellent...!!!! 😀 thanks
Roopa, thanks for the feedback
I am liking the tutorials Thanks
Glad you like them!
Please make a video on the topic "How to choose which ML algorithm for a dataset".
And thanks for amazing videos, sir.
Hello sir, thank you so much for this video. I got 99.25% when I put C=1.
If I use kernel='rbf', I get 99% accuracy.
And for kernel='linear', I got 97.7% accuracy.
BEST DEMO ON SVM
Heartfelt thanks to you, sir.
thank you so much:)
For digits I got highest accuracy value as 0.99 with gamma 'scale' and C=10
Thank you for your video!
That’s the way to go Коробка, good job working on that exercise
Thanks a lot for uploading. Please try to upload the next videos soon.
My score is 76.5 with the gini index model and 75.9 with the entropy model.
BTW thanks for the good teaching, sir ji.
Tried a couple of iterations; finally I got 99.166% accuracy with all default parameters and random_state=1 while defining the train/test data. Thanks a lot, sir.
Nice work!
Can't thank you enough bro.💜🙏
Jai Shree Ram. Hope Ram bhagwaan bless your entire family.
I got a score of 99.4 when C=1 and gamma='scale',
50 when gamma='auto',
and 99.7 when gamma='auto' and C=10.
Thank you sir for this series. I am following the tutorial and doing the exercises.
That’s the way to go irshad, good job working on that exercise
ur the best broo
Used hyperparameter tuning here to get 100% on train and 99.72% on test... luckily the data was clean, because I'm not very experienced in data cleaning, and here I didn't even do much data visualization.
Thank you sir for the wonderful explanation. I think higher regularization means a simpler model (5:11).
great thank you
I got 98.16% accuracy with C=2, kernel=rbf and gamma=0.001
Maximum Accuracy: 100%
Minimum Accuracy: 95 %
Avg Accuracy: 98.16%
That’s the way to go Anurag, good job working on that exercise
Really nice video, thank you so much for such a brilliant video. I got a score of 98.61% with C=1, but I could not apply matplotlib visualization as there are 64 columns; I could not figure out which columns to select for visualization.
amazing video
Thank you
OK, the exercise is cool. I got the best accuracy score using kernel='rbf'.
I think it should be: high C corresponds to low regularization, which means the classifier penalizes classification errors heavily.
Vice versa for low C.
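That point is easy to verify: in sklearn's SVC, C is the inverse of regularization strength, so training accuracy generally rises as C grows. A small sketch, assuming the iris data and a fixed split:

```python
# Show how C (inverse regularization strength) affects fit:
# larger C penalizes misclassified training points more heavily.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X_train, X_test, y_train, y_test = train_test_split(
    *load_iris(return_X_y=True), test_size=0.2, random_state=0)

results = {}
for C in (0.01, 1, 100):
    clf = SVC(kernel='rbf', C=C).fit(X_train, y_train)
    # (train score, test score) for each regularization setting
    results[C] = (clf.score(X_train, y_train), clf.score(X_test, y_test))
    print(C, results[C])
```

With a very small C the model is heavily regularized and underfits, while a very large C fits the training data almost perfectly but risks overfitting on noisier datasets.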
Sir, please make videos on unsupervised learning. We have been waiting for them for a long time; hope you will help us.
Can you do some quick videos on exploratory data analyses? Things like custom querying and displaying relation between queried data elements?
1) Which is better to have: a large gamma or a large regularization parameter?
2) We used only fit() but not fit_transform(); is it because the rbf kernel performs the transformation itself to scale the features and target labels?
We already worked with the iris data in the Logistic Regression exercise, where the peak value was also 96%.
I got a peak value of 1 in LR.
very nice tutorial!
😊👍
Thanks
Thanks for your effort, sir, but there is something I wonder about. When I fit a model, I can't see any description like the one in your Jupyter notebook (C=1, cache_size=200, etc.). Is there any way to see them?
How do we tell whether a dataset is linear or non-linear when we get a very large dataset?
Model score when C=20 is 0.9944444444444445. Varying kernel, gamma gave lower scores. This was my best.
great tutorial.You explained all the concepts crisp and clear. liked and subbed
Sir, please make videos on neural networks, anomaly detection, and unsupervised learning... I am eagerly waiting... The last video I have seen is random forest... Please upload more.
Hello Arijit,
did you manage to implement the neural networks for unsupervised learning?
I am currently working on DDoS attack detection with the NSL-KDD dataset and need a few clarifications.
you can reply to me here or chrisonic64@gmail.com
Very good tutorial. I got 99.9% accuracy using kernel='rbf' and C=1.0.
That’s the way to go Ajeniyi, good job working on that exercise
In this video you have given the last example for practice, but we solved that same example in the Logistic Regression model! So what will be the difference between them? I am talking about the load_digits problem.
The best
Do you know a way to look at only one data point specifically when you do the prediction at the end?
nice video sir
Logistic Regression is giving a 100% score... it's performing better than SVC and also Decision Tree.
Thanks sir, please upload some more advanced data science algorithms with practicals. Again, thank you sir.
When I try to call model.fit, I get this error:
check_array() got an unexpected keyword argument 'warn_on_dtype'
I don't understand how to fix it.
I got accuracy of 99.72% by keeping kernel='rbf', C=1 and gamma=0.002
Sir, how to add legend displaying all the three categories with corresponding markers in the plot?
excellent video
Glad you liked it!
In the digits dataset exercise, I got accuracy score of 99.78% on the test dataset.
These are the hyperparameters I used: SVC(C=12.0, kernel='poly', gamma='auto').
Great score. good job.
Got a score of 99.16 for my test samples with C = anything more than 2 (I wonder why there was no difference between C=2 and C=100; I got 99.16 accuracy for all values of C above 2!). I didn't change gamma or the score would be destroyed! The kernel = 'rbf'.
thanks for this amazing tutorial BTW! :)
Good job Amir, that’s a pretty good score. Thanks for working on exercise
Kindly upload the mathematics behind this model video too!! Thanks
Kindly make videos on K-nearest neighbors and on underfitting and overfitting.
Got 100% accuracy for the digits dataset with model=SVC(C=7, kernel='rbf').
Just watching the tutorial, you are not going to learn anything :-) --> We understood your intention sir. A big salute to you.
Ha ha.. nice. It is very true.