Check out our premium machine learning course with 2 industry projects: codebasics.io/courses/machine-learning-for-data-science-beginners-to-advanced
After watching so many different ML tutorial videos, I have just one thing to say: the way you teach is the best among all of them.
Name any famous one, like Andrew Ng or sentdex: you need prerequisites to understand their videos, while yours are a treat for viewers, explained from the basics and slowly building up. And those exercises are the cherry on top.
Never change your teaching style, sir; yours is the best one.👍🏻
I love that you go through the example the hard way and introduce cross-validation afterwards.
Couldn't ask for a better teacher to teach machine learning. Truly exceptional!!! Thank you so much for all your efforts.
I have never seen anyone who can explain Machine Learning and Data Science so easily.
I used to be scared of Machine Learning and Data Science; after seeing your videos, I am now confident that I can do it by myself. Thank you so much for all these videos....
👏👏👏👏👏
Happy to help
That approach of first doing manually what cross_val_score does in the background, and only then introducing the method: a godsend! Brilliant. Brilliant, I say!
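For anyone who wants to see the two side by side, here is a minimal sketch of that idea on the digits dataset (the model choice is just an example):

import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

digits = load_digits()
X, y = digits.data, digits.target

# the manual way: split, fit, and score each fold yourself
folds = StratifiedKFold(n_splits=3)
manual_scores = []
for train_index, test_index in folds.split(X, y):
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_index], y[train_index])
    manual_scores.append(model.score(X[test_index], y[test_index]))

# the one-liner that does the same work in the background
auto_scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=3)
print(np.average(manual_scores), np.average(auto_scores))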
Thank you, Sir, for this awesome explanation. Iris dataset assignment scores:
Logistic Regression [96.07%, 92.15%, 95.83%]
SVM [100%, 96.07%, 97.91%] (kernel='linear')
Decision Tree [98.03%, 92.15%, 100%]
Random Forest [98.03%, 92.15%, 97.91%]
Conclusion: SVM works best for me.
Pretty ironic and yet amusing at the same time.
what an amazing explanation. Finally! I understood cross validation concept so clearly. Thank You so much.
Glad it was helpful!
He did folds = StratifiedKFold() and said that he would use it because it is better than KFold,
but at 14:20 he used kf.split, where kf is KFold.
I think he forgot to use StratifiedKFold.
Yeah, I noticed that.
Hi, I'm from Malaysia. I came across your videos and I am glad I did. Super easy to understand. I'm currently preparing to learn deep learning; I've already watched your Python and Pandas videos and am now on the ML ones. Thank you for making all of these; you are making our lives easier, sir.
Sincerely, your student from Malaysia.
Exercise solution: github.com/codebasics/py/blob/master/ML/12_KFold_Cross_Validation/Exercise/exercise_kfold_validation.ipynb
Complete machine learning tutorial playlist: ua-cam.com/video/gmvvaobm7eQ/v-deo.html
We need to use mean() with cross-validation to get the average accuracy score; I'm guessing you forgot to add it. Anyway, the video is good and in-depth. Keep producing such videos.
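For example, a quick sketch of taking the mean (assuming the digits data from the video):

import numpy as np
from sklearn.datasets import load_digits
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

digits = load_digits()
scores = cross_val_score(SVC(), digits.data, digits.target, cv=3)
print(scores)         # one accuracy per fold
print(scores.mean())  # the single average accuracy score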
The only channel with pure quality, not beating around the bush. Thanks, Dhaval sir, for your contribution.
Thanks Vishal
My teacher is frustratingly bad. I am learning from your videos so that I can get a good grade in my class. Thank you for taking the time to demonstrate what is happening. When you walked through the example at 10:47, I finally understood.
After parameter tuning, using cross-validation with cv=10 and taking the average:
Logistic Regression = 95.34%
SVM = 97.34%
Decision Tree = 95.34 %
Random Forest Classifier = 96.67 %
Performance = SVM > Random Forest > Logistic ~ Decision
After taking cv=5 and C=6, SVM is 98.67%.
@@manu-prakash-choudhary After 50 splits 😎😎
Score of Logistic Regression is 0.961111111111111
Score of SVM is 0.9888888888888888
Score of RandomForestClassifier is 0.973111111111111
🌹 You are way, way... way better than all of my machine learning professors at school!
Thank you very much. Very nice explanation. My scores, after taking averages, are as follows:
LogisticRegression (max_iter=200) = 97.33%
SVC (kernel = poly) = 98.00%
DecisionTreeClassifier = 96%
RandomForestClassifier (n_estimators=300) = 96.67%
mine too...
Your videos are AMAZING, man!!! I have already recommended them to my colleagues at my university who are taking the machine learning course. They are loving them too...!!! Keep it up, champ!
Mast pelluri, I am glad you liked it and thanks for recommending it to your friends 🙏👍
For parameter tuning this helps. Just play a bit with the indexes, since lists start from 0 but n_estimators starts from 1.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

digits = load_digits()
scores = []       # per-fold scores for each n_estimators value
avg_scores = []   # average score for each n_estimators value
n_est = range(1, 5)  # example; widen the range for a real search
for i in n_est:
    model = RandomForestClassifier(n_estimators=i)
    score = cross_val_score(model, digits.data, digits.target, cv=10)
    scores.append(score)
    avg_scores.append(np.average(score))
    print('avg score: {}, n_estimators: {}'.format(avg_scores[i - 1], i))
avg_scores = np.asarray(avg_scores)  # convert the list to an array
best = np.argmax(avg_scores)
print('Average accuracy score is {} for n_estimators={}, calculated from the following accuracy scores: {}'.format(avg_scores[best], best + 1, scores[best]))
plt.plot(n_est, avg_scores)
plt.xlabel('number of estimators')
plt.ylabel('average accuracy')
plt.show()

44 was the best for me.
Finally a video explaining the X_train, X_test, y_train, y_test. Thank you!
Thanks man! You're really helping me finish my university project in machine learning.
Christian, I am glad to hear you are making progress on your university project 😊 I wish you all the best 👍
LogisticRegression was the best model on the Iris dataset.
I got an accuracy of 97.3%, compared to other models such as SVM and RandomForestClassifier.
Thank you, sir. May God always keep you healthy and happy.
You are my god.
I don't have any words; your teaching style and knowledge are amazing ✨...
Your videos on machine learning are way better than any paid online videos, so keep growing.
Probably the best machine learning tutorials out there... Very good job
Thanks!
Thank you very much for the excellent explanation. I got accuracies SVC=98.04%, RandomForestClassifier(n_estimators=30)=98.04%,
LogisticRegression(max_iter=200)=96.08%.
This is one of the best explanation of Kfold Cross Validation!!!
Thank you so much for sharing this valuable video. :))
😊👍
What an excellent video, thank you! I got lost in other written tutorials, this was finally a clear explanation!
Hey, thanks for the comment. Keep learning. 👍
This is the best video I have watched on Machine learning. Well done!
Glad you liked it!
This is an EXCELLENT explanation. Straightforward and simplified.... Thank you.
Glad it was helpful!
Best explanation... I like the way you give examples using small data to explain how it actually works (10:20).
No one explains like this... keep doing great work.
Glad you liked it
@@codebasics
I have applied K-fold on a linear regression dataset.
I used different activation functions and then got the mean and SE values.
How do I pick the best model from the k folds?
I watched several videos on CV, but your video is the best explained. Thank you, thank you very much, sir; keep uploading videos.
AWESOME, AWESOME..... Excellent video you have created. I've been learning ML for more than a year and have watched almost 400 videos. Your videos are AWESOME.... Please make a complete series on ML... Thanks.
Pankaj, I am happy it has helped you. :) And yes, I am in the process of uploading many new tutorials on ML. Stay tuned!
I am so close to finishing your videos, and then I'm going to hop into your Machine Learning and Data Science projects... 😊😊😊
That is awesome!
The best and simplest explanation of cross-validation I could find after so much searching! Keep up the good work!
Thanks, sir! Your tutorials are really helpful for me. I hope to watch all of them and make my transition from mechanical to AI successful 😊.
Useful for identifying many different types of categories.
At 20:39 of the video I noticed something interesting: by default the cross_val_score() method generated 3 folds... but the default has since changed from 3 to 5 :))
Thanks, man. I was worried when mine was showing 5-fold results; I thought something was wrong with my code.
@@gandharvsaxena8841 Me too, lol; that's why I am getting 5.
Thank you, man!!
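If you want the fold count to stay fixed regardless of scikit-learn version, you can pass cv explicitly; a sketch (the default became 5 in scikit-learn 0.22):

from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

digits = load_digits()
# pin the number of folds instead of relying on the version-dependent default
scores = cross_val_score(DecisionTreeClassifier(), digits.data, digits.target, cv=3)
print(len(scores))  # 3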
You are a great instructor and explain concepts in a very understandable and relatable manner. Thank you
I am happy this was helpful to you.
Great stuff indeed. I'm learning machine learning from scratch and this was very helpful. Keep up the good work, kudos!
You solved one of my biggest confusions..... Thanks a lot, sir.
Very simple and lovely teaching... you are simple and great... thank you so much, sir.
Thanks, Rahul, for your kind words of appreciation.
Thank you so much; I am so grateful for a teacher like you.
In the exercise, the maximum score was obtained by SVM with gamma='auto' and kernel='linear'; the score is array([1., 1., 0.98]) 😀
14:15 - Here instead of kf.split() we should use folds.split(). Am I correct??
Yes. My notebook has a correction. Check that on GitHub link I have provided in video description
Yes, and just to add to it: StratifiedKFold requires both X and y in its split method, since stratification is done based on the y labels.
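A sketch of the corrected loop (variable names follow the video's notebook):

from sklearn.datasets import load_digits
from sklearn.model_selection import StratifiedKFold

digits = load_digits()
folds = StratifiedKFold(n_splits=3)
# StratifiedKFold needs y too, so every fold keeps the class proportions
for train_index, test_index in folds.split(digits.data, digits.target):
    X_train, X_test = digits.data[train_index], digits.data[test_index]
    y_train, y_test = digits.target[train_index], digits.target[test_index]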
Thanks for the very useful and free tutorial series. Salute to you, sir!
The best score in my case is from Logistic Regression: 97.33%.
Excellent Machine Learning Tutorials.
Good job Raahim, that’s a pretty good score. Thanks for working on the exercise
@@codebasics Thanks sir
Sir, SVM performance is higher than the other algorithms after changing the parameter to gamma='scale' for the given digits dataset example.
My results (with final average):
L. Regression --> 97.33%
Decision Tree --> 96.66%
SVM --> 98.00% [THE WINNER]
Random Forest --> 96.66%
right same here
Same, but I tuned SVM with kernel='linear' and got 99.33%.
@@jaihind5092 Pretty good, man!! 👏🏻
So simple. You're a good teacher
Glad you think so!
Excellent explanation of cross-validation and parameter tuning...
Thanks for feedback Subhronil.
The explanation was amazing, sir. I ran cross_val_score; below is the final average result (10 folds):
Logistic Regression - 95%
SVM - 98% [performed best]
Decision Tree - 95%
Random Forest - 96%
Good job, those scores are indeed excellent.
Your tutorials are saving my life.
After watching the video for 25 mins, I realized that the last 5 mins were the most important 😄
Thanks for creating rather authentic content on this topic compared to others. It is much clearer!
Glad it was helpful!
Best Explanation I have ever seen. Outstanding job!
I am happy this was helpful to you
Nice and helpful. A video with a practice session is more helpful than a lecture alone.
😊👍
Thank you. This video solved so many questions at once. Nicely done.
Super clear explanation; I have been searching for this one, and this video made it perfectly clear. Thank you.
Glad it was helpful!
This is the most helpful video regarding this topic. Thank you so much!
OMG!! This is one of your best, sir :) May Lord Shiva bless you for your service.
Using the same datasets makes it a little less interesting, but your tutorials are awesome. Every tutorial has pluses and minuses; yours are more structured, but the minus point is the reuse of the same dataset, which reduces the motivation to keep going.
Thank you very much sir for this very nice explanation. My results are:
Logistic Regression=95.33%
SVM=97.33%
Decision Tree=96.67%
Random Forests(40 estimators)=96.67%
Avg score?
Your videos are really good! The explanation is crisp and succinct! Love your videos! Keep posting! By the way, you may not realize it, but you are changing peoples' lives by educating them! Jai Hind!
Good explanation. Gained some confidence to enhance my skills in this area.
All the best
00:02 K fold cross validation helps determine the best machine learning model for a given problem.
02:20 K-fold cross validation provides a more robust evaluation of machine learning models.
04:36 Classifying handwritten characters into ten categories using different algorithms and evaluating performance using k-fold cross validation.
07:06 K-fold cross validation helps in more robust model evaluation.
09:43 K-fold cross validation divides data into training and testing sets for iterative model evaluation.
12:35 Stratified k-fold ensures uniform distribution of categories for better model training.
15:42 Measuring the performance of models in each iteration
18:29 Parameter tuning in random forest classifier improves scores.
20:46 K Fold Cross Validation helps measure the performance of machine learning models.
23:18 Cross-validation helps in comparing algorithms and finding the best parameters for a given problem.
25:18 K Fold Cross Validation helps in assessing the model's performance.
Crafted by Merlin AI.
Sir, really a very good explanation... I finally understood it very well.
Glad it helped!
Following your tutorials is the best way to learn machine learning techniques. Please upload a video explanation of KNN as well.
Sure I will
Now I understand this concept. Thank you, sir 😃
I am happy this was helpful to you.
Sir, you are doing an amazing job... I am becoming your fan now... 👑
Thank you so much Ayushi 😀
Great. You made things look very easy, and it boosts confidence. Thank you.
Happy to help!
You are the best teacher 😊
Somehow when I tried this, SVM did better than all the other classifiers XD
Same here
Love your teaching pattern, sir.
If I were rich I would have sent you a token of appreciation... Thank you for the content.
No worries! If you feel my videos have benefited you, you can spread the word: share the channel on your LinkedIn, Facebook, etc. That way the maximum number of people can benefit from it.
It is an amazing explanation, great job...
What is the score?
Cross-validation is about validating ONE model.
After validating the model and getting its parameters, you should choose a method to compare it with other models and select the appropriate one.
- Training set: A set of examples used for learning, that is to fit the parameters of the classifier.
- Validation set: A set of examples used to tune the parameters of a classifier, for example to choose the number of hidden units in a neural network.
- Test set: A set of examples used only to assess the performance of a fully-specified classifier.
You can use cross-validation to compare multiple models too. Basically, just run k-fold on multiple models, or on the same model with different parameters, and compare the scores.
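A minimal sketch of that comparison on the iris data (the two models here are just examples):

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

iris = load_iris()
for model in (LogisticRegression(max_iter=200), SVC()):
    scores = cross_val_score(model, iris.data, iris.target, cv=5)
    print(type(model).__name__, np.average(scores))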
thank you for this video. Excellent presentation of the material with clear explanations
Michael, I am happy you found it useful.
Greatly explained man. Thank you
By making a DataFrame (X, y) first:
mean(cross_val_score(LogisticRegression(max_iter=200), X,y))
0.9733
mean(cross_val_score(SVC(kernel='linear'),X,y))
0.98
mean(cross_val_score(RandomForestClassifier(n_estimators=40), X, y))
0.96
by using iris.data and iris.target directly:
np.average(score_lr)
0.95333
np.average(score_svm)
0.98000001
np.average(score_rf)
0.95333333
Sir,
you used KFold (kf) instead of StratifiedKFold (folds) in the video.
Will there be any difference in the scores if we use StratifiedKFold?
There is a slight difference in the scores.
Great video, as usual. Quick question: how were you able to get such low scores for SVM? I ran it a couple of times and was getting scores in the upper 90s. So I set up a for loop, ran 1000 different train_test_split iterations through SVM, and recorded the lowest score. It came back 97.2%!
One tip for avoiding writing the same code to test different models is to store all the models in a list/dict and loop through it.
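For example, a sketch of that pattern (the model dict is just an illustration):

import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

digits = load_digits()
models = {
    'logistic_regression': LogisticRegression(max_iter=1000),
    'svm': SVC(),
    'random_forest': RandomForestClassifier(n_estimators=40),
}
for name, model in models.items():
    scores = cross_val_score(model, digits.data, digits.target, cv=3)
    print(name, np.average(scores))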
I learnt K-fold cross-validation from here!!
Dear Sir
Another great explanation as always.
Thank you very much for that.
By adding the following code, SVM started showing very good scores!
X_train = preprocessing.scale(X_train)
X_test = preprocessing.scale(X_test)
Have I done the correct thing?
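Scaling usually helps SVM a lot, but scaling X_train and X_test separately gives each its own mean and variance. The safer pattern is a pipeline, so the scaler is fit on each training fold only; a sketch:

from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

digits = load_digits()
# inside cross_val_score, the scaler is fit on each training fold
# and then applied to that fold's test portion
pipeline = make_pipeline(StandardScaler(), SVC())
print(cross_val_score(pipeline, digits.data, digits.target, cv=3))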
Really good explanation. You are an expert. I have a question: is it possible to select the test_size in cross-validation? When I use, for example, KFold with 3 splits, it splits the whole data into three parts; but is it possible to make these three splits using 2 parts of data for testing and 7 for training?
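KFold itself fixes the test size at 1/k, but ShuffleSplit lets you pick it directly; a sketch of three splits with roughly 2 parts test to 7 parts train:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import ShuffleSplit, cross_val_score

iris = load_iris()
# three random splits, each holding out ~2/9 of the data for testing
cv = ShuffleSplit(n_splits=3, test_size=2/9, random_state=0)
print(cross_val_score(LogisticRegression(max_iter=200), iris.data, iris.target, cv=cv))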
Thank you sooooo much. You simplified that beautifully.
def avg(nums):
    # sum the scores, then divide by how many there are
    num_avg = 0
    for i in range(len(nums)):
        num_avg = num_avg + nums[i]
    num_avg = num_avg / len(nums)
    return num_avg

This is the code if you want to get the average of a list. To use it, just call:
avg(scores_l)
Wonderful explanation. Great tutorial series.
Taking cv=3 for all cases:
Logistic Regression = 97.33%
Random Forest = 96.66% (n_estimators=60)
Decision Tree ≈ SVC() = 96%
Now I understand what k-fold CV is; thanks, sir.
Thank you very much for your class. It's very useful for beginners.
I am happy you liked it Vishnu :)
You make exquisite content, I'd love to see more!
For me, SVM's score is almost 99 every time.
Hey bro, how are you?
Good to see you.
@@computingpanda1629 Bro, you're here too 😂🤣🤣 Came to study machine learning? 😂
Then maybe you're overfitting the data 😂😂
Same here😅
You explain things very clearly ! Moreover you keep code ready to save time and your videos are of the appropriate length. You follow the presentations you have made and seem to be speaking impromptu and not reading from somewhere. I have joined a course but there things are not very clear, the videos are sometimes 2-3 hours long and I get bored to death. Most importantly you skip the unnecessary detailed mathematics which are not essential for beginners which helps to focus on machine learning (though I am good at mathematics).
Wasi thanks for leaving a well thought out feedback. This helps me a lot. 😊👍
Using the K-fold method, the data was split multiple times into X_trains and y_trains, but each split remained constant across the different models.
Is it the same in the cross_val_score method? Doesn't the splitting happen differently for each model, so the models are basically trained on different X_trains and y_trains?
Thank you so much for the clear explanation.
Your explanations are awesome 👌
Glad you like them!
LogisticRegression = 100%
SVC(kernel='poly') = 97%
DecisionTreeClassifier = 97%
RandomForestClassifier(n_estimators=30) = 97% for every increase in n_estimators
thank you for this series. it is helping me a lot.
The way you teach is awesome! I request you to make tutorials on neural networks if you work in that field. Thank you!
Akshya, I started making videos on neural nets. Check my channel; I've posted the first two already. Once TF 2.0 is stable I will add more.