Check out our premium machine learning course with 2 Industry projects: codebasics.io/courses/machine-learning-for-data-science-beginners-to-advanced
Sir You are amazing, an experience of 25 years is really brilliant, Thanks for Guiding us
Excellent channel to start learning the ML concepts... Way better than almost all the paid courses out there
I can just say that you are a perfect teacher. Thank you very much. This is the best channel to learn all about data science!!!
Thanks a lot for this playlist of such amazing tutorials.
At test_size=0.2, GaussianNB: 97.2% and MultinomialNB: 77.3%
The Gaussian model is more accurate. As mentioned in the video, the Gaussian model is more accurate for cases where the features have continuous values, which is the case for the Wine dataset.
yep , you are right, GaussianNB gave me 100% score.
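For anyone reproducing the Wine exercise, a minimal sketch; the test_size and random_state below are my own choices, not from the video, so exact scores will vary:

```python
# Compare GaussianNB and MultinomialNB on the sklearn wine dataset.
# GaussianNB usually wins here because all 13 features are continuous.
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB, MultinomialNB

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=10)

gnb_score = GaussianNB().fit(X_train, y_train).score(X_test, y_test)
mnb_score = MultinomialNB().fit(X_train, y_train).score(X_test, y_test)
print(gnb_score, mnb_score)
```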
Outstanding video series! Greetings from Turkey. I learn so much from this channel. It's now my primary go-to resource to learn machine learning from scratch.
Solved the exercise, got these answers:
Using Gaussian : 1.0
Using Multinomial : 0.889
Can you please send me the code and dataset vikas.kulshreshtha@gmail.com
What is .values in X_train.values in fit_transform?
I got
100% accuracy with Gaussian NB
96% accuracy with Multinomial NB
Thanks for explaining in a very easy and convenient way :)
All your ML videos are wonderful. Good job. Difficult things explained easily. Thanks
Amazing tutorial, you teach far better than university professors. Following many of your playlist thoroughly !!! Thank you very much
I always recommend your playlist to others, it's really helpful and thanks for this effort.
Amazing !!! Just Amazing ️🔥 The best ML tutorial on YouTube....
Glad it was helpful!
Thank you for this wonderful tutorial
Exercise scores
GaussianNB score - 94.5%
MultinomialNB score - 84.5%
Good job gajanan, that’s a pretty good score. Thanks for working on the exercise
Kindly, sir, help me to detect malicious emails through AI. Any link, etc.?
Gaussian: 1.0
Multinomial: 0.833
Keep up the good work you're doing
I have never found such informative course like this.. Really great job !!!
I think you might be the most valuable resource online for ML beginners.
Gaussian: 100%
Multinomial: 86.1%
Thank you for sharing your knowledge. These ML classes are gold ! 👏🏼👏🏼👏🏼
For comparing the models I used Cross Validation (CV = 4) as you explained in the previous videos.
Average Gaussian Score = 0.9722222222222222
Average Multinomial score = 0.8333333333333333
better approach! thanks for your suggestion
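The comparison above can be sketched like this; a minimal example, where exact averages depend on sklearn's default fold split:

```python
# Average 4-fold cross-validation scores for both Naive Bayes variants
# on the wine dataset, as in the comment above.
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB, MultinomialNB

X, y = load_wine(return_X_y=True)
gnb_avg = cross_val_score(GaussianNB(), X, y, cv=4).mean()
mnb_avg = cross_val_score(MultinomialNB(), X, y, cv=4).mean()
print(gnb_avg, mnb_avg)
```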
My scores are : Multinomial NB = 0.84, Gaussian NB = 0.97. Thank you so much for these videos :)
Great job and great score. ☺️👍
You are one of the best teachers in my life.
thanks Bhavya
I must say, I am getting premium lectures from you, sir.
Thanks for making such great content, free of cost. I'm enjoying .
GaussianNB 0.972 / MultinomialNB 0.94, with MinMaxScaler on the training dataset. This series of tutorials is strongly recommended; it helps me a lot.
You are one of the best teachers I have ever seen
keep rocking
By the way, I don't know what you are suffering from
get well soon, buddy
take care of yourself.👍
I was suffering from Ulcerative colitis. I am doing well now.
@@codebasics Thanks for your reply, sir
May I know where you are from?
@@karthikc8992 He is in US
@@muhammedrajab2301 I learned it before; by the way, thank you for your reply.
Wonderful explanation, sir, thanks for that, and here is my result after execution
GaussianNB : 96.2%
MultinomialNB: 88.8%
Siddu, good job indeed. That's a pretty good score.
Thank you Sir, for these well informed videos on ML.
You, Sir, are our hero!!!
I have a question: how does it find the probability of continuous variables? Can you give me a link to explore?
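GaussianNB handles continuous features by fitting a normal distribution per feature and class. A toy sketch of the density it evaluates; this is my own illustration of the idea, not sklearn's internal code:

```python
import math

def gaussian_likelihood(x, mean, var):
    """Density of x under a normal distribution with the given mean/variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# For each class, GaussianNB stores the per-feature mean and variance seen in
# training, then multiplies densities like this one into the class score.
print(gaussian_likelihood(5.0, mean=5.1, var=0.04))
```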
I don't have any words for your work. Thanks a lot.
Exercise answer:
Gaussian : 1.0
MultinomialNB : 0.889
Sir, you used random_state in your solution. Thank you, sir, I learned something new.
Thanks a lot for the tutorial. Your series is the best because it contains exercises.
My exercise result: GaussianNB = 0.96, MultinomialNB = 0.84. I also applied cross validation =5
Thank you very much for that tutorial!
My results were:
GaussianNB score - 97.2%
MultinomialNB score - 86.1%
Good job Alikhan, that’s a pretty good score. Thanks for working on the exercise
Very nice explanation. Thank you so much, sir, for putting so much effort into making the videos and the exercises.
Thank you for your amazing explanation. I have learned a lot.
Gaussian NB: 100%
Multinomial: 91.11%
From where did you get the dataset?
@@himakshipahuja3015 Check the exercise file and you will see the data set. Please tell me if you can't find it and I will send it to you
Thank you very much, @Stephen Ngumbi Kiilu. I found the dataset.
Your teaching is great sir
Sir very nice teaching and really it's very easy to understand
Thanks a lot for the videos!!! 81% for MultinomialNB and 96% for GaussianNB
Perfect samad. You are really a good student as you are working on all my exercises 😊👌 keep it up 👍
Just love the tutorial sir...........
Hats off to you!!
I don't know why some people have disliked this video. How beautifully he is explaining the M.L algorithms.
Very good demonstration, sir. Keep inspiring us with your great videos.
Thanks Prakash.
Wonderful, sir. It really cleared up the concepts of pipelines and the vectorization method.
Really great videos sir, explained very well.
About the exercise:-
for GaussianNB :- 1.0
for MultinomialNB:- 0.944
with random_state= 7 and test_size=0.2
Great score. Good job 👍👏
GaussianNB : 97.22
MultinomialNB: 86.11
thank you for this video
Wonderful explanation, sir, thanks for that, and here is my result after execution
GaussianNB : 97.77%
MultinomialNB: 73.33%
BernoulliNB: 44.44%
with test size = 25%
I solved the exercise and I got the following score:
I used train_test_split with test_size=0.2 and random_state=123.
These parameters gave me the following results:
GaussianNB score: 1.0(100%)
MultinomialNB score: 0.8889 (88.9%)
dataset shape: (178, 13) [the dataset is pretty small!]
Great job muhammed. Good score indeed
Clean and clear explanation... thank you, sir.
Well, I've just finished the exercise. It's well prepared, thanks for your commitment.
Solved the exercise with the help of the cross_val_score method,
where I found that Gaussian performed better than Multinomial,
as I got the list of their scores, in which
max value of Gaussian = 0.97222222
max value of Multinomial = 0.91428571
Sir, your tutorials help me a lot because your teaching technique is quite familiar and easy for me.
Thanks a lot, sir.
Good job ashutosh, that’s a pretty good score. Thanks for working on the exercise
Thank you so much, sir, from somewhere on Earth: Pakistan.
At 1:45, can we use mapping instead of a lambda function??
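Yes, Series.map works too; a quick sketch, with column names assumed to match the video's spam dataframe:

```python
import pandas as pd

df = pd.DataFrame({'Category': ['ham', 'spam', 'ham'],
                   'Message': ['hi there', 'win a prize', 'see you soon']})

# map with a dict replaces the lambda-based apply one-for-one:
df['spam'] = df['Category'].map({'ham': 0, 'spam': 1})
# equivalent: df['Category'].apply(lambda x: 1 if x == 'spam' else 0)
print(df['spam'].tolist())   # [0, 1, 0]
```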
Your course is great for serving the practical needs of getting started doing ML in Python. For this video, some more explanation of pipelines would help. I understand what they are accomplishing, but not entirely how. Are the .fit methods referring to the underlying functions in the pipeline, or is .fit its own method of the pipeline? How does the pipeline know to use the right transformation method? That didn't seem to be explicitly specified.
Again, thanks so much for this and the other videos.
John
Sir, is it possible to list the vocabulary words that the Naive Bayes algorithm found to have a high probability of indicating spam?
Sir, your videos are great; continue doing your job. I got an accuracy of 97.22 with GNB and 86.122 with MNB for the exercise question.
Very helpful, appreciate all your content!
Sir, you did not give the fit_transform method in the pipeline. You only gave CountVectorizer(), but it automatically did the fit_transform step. How did it do that?
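That is exactly what Pipeline does for you: during fit, it calls fit_transform on every step except the last, then fit on the final estimator. A minimal sketch with made-up toy data:

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

clf = Pipeline([
    ('vectorizer', CountVectorizer()),  # pipeline calls fit_transform here
    ('nb', MultinomialNB()),            # ...then fit on the resulting counts
])

emails = ['win a free prize now', 'meeting at 10am tomorrow',
          'free free offer inside', 'lunch with the team']
labels = [1, 0, 1, 0]   # 1 = spam

clf.fit(emails, labels)        # no manual vectorizing step needed
print(clf.predict(['free prize offer']))
```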
wonderful explanation sir.
Great video, very well explained.. I'm gonna try doing the exercise soon
Excellent tutorial
Thankyou for your efforts. These videos are really helpful
Glad you like them!
Really good one to start
Rashid, I am glad you liked it
Thanks a lot for this course! As a beautiful and clever student I always do your exercises ^) I don't know what would make your course better. Maybe more exercises.
Glad you like them!
Very helpful videos buddy !!!
Thank you so much!
You explained this complex concept so easily..
👍
Awesome tutorial
Thank you very much for the great explanation; my results are:
GaussianNB =96.3%
MultinomialNB=83.33%
Thankyou very much guru ji...
Awesome and so clean..
Glad you like it!
Great video! One doubt though: why did we use X_train_count.toarray()[:3]? I did not understand the 3. Thank you in advance.
It is just for visualization purposes. Printing X_train_count.toarray() alone would have printed all the data points, which are in the thousands I guess, so sir just used the slicing expression "[:3]", which means only 3 data points will be shown, so we can look at the output properly. Get yourself familiarized with pandas slicing and methods like df.iloc[] and df.loc[]; they will be useful.
@@klelck Yes, I have used iloc quite often, but I felt we were just converting here to an array and not printing it, and that the 3 somehow had significance in this specific dataset for data cleaning. Thank you for your reply!
@@klelck What is .values in X_train.values in fit_transform?
your videos are great! good luck.
Thank you, Melika.
Kindly raise your volume.
How do I apply the CountVectorizer to more than one text column?
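One common approach, an assumption on my part rather than something from the video: concatenate the text columns into one string per row, then vectorize that combined column.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

df = pd.DataFrame({'subject': ['free prize', 'team lunch'],
                   'body': ['claim it now', 'see you at noon']})

combined = df['subject'] + ' ' + df['body']   # one string per row
X = CountVectorizer().fit_transform(combined)
print(X.shape)   # (2, vocabulary_size)
```

Alternatively, a ColumnTransformer with one CountVectorizer per column keeps a separate vocabulary for each column.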
Thank you, sir, for your tutorial. I was confused by the CountVectorizer at 4:06; it would have been much better if you had explained it in more detail, like what datatype X_train and X_train_count are, what kind of data is stored in X_train_count, and so on. I figured it out from the shape and type of the numpy arrays, but it would have saved time. Also, why do you first use fit_transform and later just transform for the emails? Can anybody please help me?
Not sure about the first problem, but I can help you solve the second one. Let's first understand the fit(), transform(), and fit_transform() methods.
fit() - The fit method calculates the learning model parameters from the training data. When we use model.fit(x_train, y_train) and so on, it calculates the internal parameters and adjusts their values for our prediction.
transform() - The transform method applies the calculated parameters to our dataset.
fit_transform() - The fit_transform() method applies both fit(), for calculating the parameters, and transform(), for transforming our dataset, in one step.
In the first case, we use fit_transform(x_train) to calculate the parameters from, and transform, the training data; for the test data we apply the parameters that we learned from fit_transform(x_train), so we use transform(x_test). I hope I cleared your doubt.
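The explanation above in a runnable sketch, with toy messages of my own:

```python
from sklearn.feature_extraction.text import CountVectorizer

train = ['free prize inside', 'lunch at noon']
test = ['free lunch']

v = CountVectorizer()
X_train = v.fit_transform(train)  # learns the vocabulary AND transforms
X_test = v.transform(test)        # reuses the learned vocabulary unchanged

# Both matrices share the same columns, so a model fit on X_train
# can score X_test directly.
print(X_train.shape, X_test.shape)
```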
fantastic
GaussianNB score is 1, whereas that for MultinomialNB is 0.866... for the Wine dataset. Hence, GNB is performing better than MNB.
Nice one sir. Thank you so much...
Dhananjay, I am glad you liked it
I also did a regression analysis, which had an R² value of 0.89.
Thank you so much sir. Your videos are really useful
Glad to hear that
How are we deciding whether they are spam or not depending on the occurrence of the words? I didn't understand, sir. Can you please explain?
Sir, can I use TfidfVectorizer?
yes you can
@@codebasics ok sir
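The swap confirmed above is a one-line change in the pipeline; the toy data here is mine:

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

clf = Pipeline([
    ('vectorizer', TfidfVectorizer()),  # tf-idf weights instead of raw counts
    ('nb', MultinomialNB()),
])
clf.fit(['free prize now', 'win free offer', 'lunch at noon', 'see you at one'],
        [1, 1, 0, 0])   # 1 = spam
print(clf.predict(['free prize']))
```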
Well demonstrated, sir!!
Prakhar, I am glad you liked it
good content
Glad you liked it!
Hi, I am getting a lowercase error when evaluating the test data with CountVectorizer.
There is no integer value present, as far as I can tell.
How can I resolve it?
AttributeError: 'int' object has no attribute 'lower'
Have you solved it?
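A likely cause and fix, guessed from the error message with made-up data: one or more entries in the text column are numbers (or NaN) rather than strings, so CountVectorizer cannot call .lower() on them. Casting the column to str before vectorizing resolves it.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

df = pd.DataFrame({'Message': ['free prize', 12345, 'call me later']})

v = CountVectorizer()
# v.fit_transform(df['Message']) would raise:
#   AttributeError: 'int' object has no attribute 'lower'
X = v.fit_transform(df['Message'].astype(str))   # the cast fixes it
print(X.shape)
```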
Hey sir .. is there a video available on Hypothesis testing?
I don't understand why I can't access your git repository and get the code.
What is the main difference between fit_transform and transform?
Sir, can you show how to code Naive Bayes on a categorical dataset like Play Tennis?
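sklearn has CategoricalNB for exactly this case; a sketch with a tiny made-up Play-Tennis-style table, where each categorical column is encoded as integers first:

```python
from sklearn.naive_bayes import CategoricalNB
from sklearn.preprocessing import OrdinalEncoder

# columns: outlook, temperature, wind
X_raw = [['sunny', 'hot', 'weak'], ['rain', 'mild', 'strong'],
         ['overcast', 'hot', 'weak'], ['rain', 'cool', 'weak']]
y = [0, 0, 1, 1]   # play tennis: 0 = no, 1 = yes

enc = OrdinalEncoder()            # maps each category to an integer code
X = enc.fit_transform(X_raw)

clf = CategoricalNB().fit(X, y)
print(clf.predict(enc.transform([['overcast', 'hot', 'weak']])))
```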
Why doesn't the order of the words matter in the count vector?
Thanks for the garble-free explanation, sir. My scores are:
GaussianNB:97.7%
MultinomialNB:80%
BernoulliNB:48.8%
I hope the above-mentioned scores are good. Please comment if a better score can be achieved in another way.
Thanks a lot!! I have finished the exercise, and the results are 94, 83, and 54 with Gaussian, Multinomial, and Bernoulli. Maybe because Bernoulli is more suitable for binary input and Multinomial for discrete values, Gaussian will be the best in this case, right?
Thanks for the video sir
My results are below
Gaussian score : 97.777%
Multinomial score: 88.888%
Good job surya, that’s a pretty good score. Thanks for working on the exercise
Instead of using a lambda function, can't we use LabelEncoder?
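Yes, LabelEncoder works too, with one caveat worth knowing: it assigns codes alphabetically, so here 'ham' happens to become 0 and 'spam' 1, which matches the lambda. A sketch with my own toy frame:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

df = pd.DataFrame({'Category': ['ham', 'spam', 'ham']})
# classes_ are sorted alphabetically: ham -> 0, spam -> 1
df['spam'] = LabelEncoder().fit_transform(df['Category'])
print(df['spam'].tolist())   # [0, 1, 0]
```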
85% for Multinomial and 96% for Gaussian using the mean of 10-fold cross validation
Awesome Anup. you are so fast. Good job :)
@@codebasics All thanks to you for such a nice explanation:)
How does transform help to differentiate an email as spam or not spam?
Sir, we can also use LabelEncoder from sklearn for that category column, right, sir?
GaussianNB: 1
MultinomialNB: 0.85
Okay, sir, I understand, but what if we have null values in the message column? What do we do in that situation?
Drop them, bro. If there is no message, how do we know whether it's spam or not?
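A minimal version of the "drop it" advice, with a toy frame of my own:

```python
import pandas as pd

df = pd.DataFrame({'Category': ['ham', 'spam', 'ham'],
                   'Message': ['hi there', None, 'see you']})

df = df.dropna(subset=['Message'])   # remove rows with no message text
print(len(df))
```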
Hello, I am still new to machine learning, and your course is just brilliant; it helps me a lot. I was doing the exercise and there was an error, and I've got no idea why. My sklearn version is 0.22.2. Here is the error: "Could not find a version that satisfies the requirement sklearn.feature (from versions: )
No matching distribution found for sklearn.feature". Is there any help? I tried to add the package through the terminal with pip install sklearn.feature, but it was no use. Thanks
I know I saw this comment too late but I think the solution for you is in the following link:
==================================================================================
stackoverflow.com/questions/38159462/no-matching-distribution-found-for-install/51445886
==================================================================================
Happy Coding!
GaussianNB is the best one for this dataset; it scores approx 97%.
MultinomialNB had a score of approx 92%.
RandomForestClassifier had a score of approx 97%.
That’s the way to go raj, good job working on that exercise
What is v.transform(email)?
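In the video's naming, v is the fitted CountVectorizer, so v.transform(emails) encodes the new emails as word counts over the vocabulary learned during fit; that is the format model.predict expects. A minimal sketch with data of my own:

```python
from sklearn.feature_extraction.text import CountVectorizer

v = CountVectorizer()
v.fit(['free prize now', 'lunch at noon'])   # learn the vocabulary

emails = ['free lunch now']
email_count = v.transform(emails)   # 1 x vocabulary_size count matrix

print(email_count.toarray())        # counts for each vocabulary word
```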