Sir, you are amazing; 25 years of experience is really brilliant. Thanks for guiding us.
Excellent channel to start learning the ML concepts... way better than almost all the paid courses out there
Solved the exercise, got these answers:
Using Gaussian : 1.0
Using Multinomial : 0.889
Can you please send me the code and dataset vikas.kulshreshtha@gmail.com
Thanks a lot for this playlist of such amazing tutorials.
At test_size=0.2, GaussianNB: 97.2% and MultinomialNB: 77.3%
The Gaussian model is more accurate. As mentioned in the video, the Gaussian model is more accurate for cases where the features have continuous values, which is the case for the Wine dataset.
GaussianNaiveBayes 0.972 / MultinomialNaiveBayes 0.94, with MinMaxScaler on the train dataset. This series of tutorials is strongly recommended. It helped me a lot.
For comparing the models I used Cross Validation (CV = 4) as you explained in the previous videos.
Average Gaussian Score = 0.9722222222222222
Average Multinomial score = 0.8333333333333333
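The cross-validation comparison described above can be sketched roughly as follows (assuming the scikit-learn built-in wine dataset; exact averages depend on the library version and fold splits):

```python
# Sketch of the cv=4 comparison of the two Naive Bayes variants
# on the wine dataset, as described in the comment above.
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB, MultinomialNB

X, y = load_wine(return_X_y=True)

gaussian_scores = cross_val_score(GaussianNB(), X, y, cv=4)
multinomial_scores = cross_val_score(MultinomialNB(), X, y, cv=4)

print("Average Gaussian score:", gaussian_scores.mean())
print("Average Multinomial score:", multinomial_scores.mean())
```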
Exercise solution: github.com/codebasics/py/blob/master/ML/14_naive_bayes/Exercise/14_naive_bayes_exercise.ipynb
Complete machine learning tutorial playlist: ua-cam.com/video/gmvvaobm7eQ/v-deo.html
This is the error I am getting:
AttributeError: 'NoneType' object has no attribute 'lower'
I can just say that you are a perfect teacher, thank you very much. This is the best channel to learn all about data science!!!
Thank you for this wonderful tutorial
Exercise scores
GaussianNB score - 94.5%
MultinomialNB score - 84.5%
Good job gajanan, that’s a pretty good score. Thanks for working on the exercise
All your ML videos are wonderful. Good job. Difficult things explained easily. Thanks
Thanks a lot for the tutorial. Your series is the best because it contains the exercises.
My exercise result: GaussianNB = 0.96, MultinomialNB = 0.84. I also applied cross validation with cv=5.
I got
100% accuracy with Gaussian NB
96% accuracy with Multinomial NB
Thanks for explaining in a very easy and convenient way :)
Outstanding video series! Greetings from Turkey, I learn so much from this channel. It's now my primary go-to resource for learning machine learning from scratch.
I think you might be the most valuable resource online for ML beginners.
Gaussian: 100%
Multinomial: 86.1%
My scores are : Multinomial NB = 0.84, Gaussian NB = 0.97. Thank you so much for these videos :)
Really great videos sir, explained very well.
About the exercise:-
for GaussianNB :- 1.0
for MultinomialNB:- 0.944
with random_state= 7 and test_size=0.2
Exercise answer:
Gaussian : 1.0
MultinomialNB : 0.889
Sir, you used random_state in your solution. Thank you sir, I learned something new.
Gaussian: 1.0
Multinomial: 0.833
Keep up the good work you're doing
Very nice explanation. Thank you so much sir for putting so much effort into making the videos and the exercises.
Wonderful explanation sir, thanks for that, and here is my result after execution:
GaussianNB : 96.2%
MultinomialNB: 88.8%
Amazing tutorial, you teach far better than university professors. Following many of your playlists thoroughly !!! Thank you very much
You are one of the best teachers I have ever seen,
keep rocking.
By the way, I don't know what you are suffering from;
get well soon buddy,
take care of yourself.👍
@@muhammedrajab2301 I learned it before, but thank you for your reply.
Sir, your videos are great, please keep it up. I got an accuracy of 97.22 with GNB and 86.122 with MNB for the exercise question.
I don't know why some people have disliked this video. He explains the ML algorithms so beautifully.
Solved the exercise with the help of the cross_val_score method,
where I found Gaussian performed better than Multinomial.
From the list of their scores:
max value of Gaussian = 0.97222222
max value of Multinomial = 0.91428571
Sir, your tutorial is helping me a lot because your teaching technique is quite familiar and easy for me.
Thanks a lot, sir.
Good job ashutosh, that’s a pretty good score. Thanks for working on the exercise
I always recommend your playlist to others, it's really helpful and thanks for this effort.
Sir, very nice teaching, and it's really very easy to understand.
GaussianNB : 97.22
MultinomialNB: 86.11
thank you for this video
Thank you very much for that tutorial!
My results were:
GaussianNB score - 97.2%
MultinomialNB score - 86.1%
Good job Alikhan, that’s a pretty good score. Thanks for working on the exercise
Thank you for your amazing explanation. I have learned a lot.
Gaussian NB: 100%
Multinomial: 91.11%
@@himakshipahuja3015 Check the exercise file and you will see the data set. Please tell me if you can't find it and I will send it to you
Thank you for sharing your knowledge. These ML classes are gold ! 👏🏼👏🏼👏🏼
I solved the exercise and I got the following scores:
used train_test_split with test_size=0.2 and random_state=123
These parameters gave me the following results:
GaussianNB score: 1.0 (100%)
MultinomialNB score: 0.888888888888888 (88%)
dataset shape: (178, 13) [the dataset is pretty small!]
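The split-and-score approach described above can be sketched like this (assuming the scikit-learn built-in wine dataset and the commenter's stated parameters; exact scores vary with the random_state and library version):

```python
# Sketch of the train/test-split exercise solution described above.
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB, MultinomialNB

X, y = load_wine(return_X_y=True)
print(X.shape)  # the dataset really is small: 178 rows, 13 features

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=123)

scores = {}
for model in (GaussianNB(), MultinomialNB()):
    model.fit(X_train, y_train)
    scores[type(model).__name__] = model.score(X_test, y_test)
    print(type(model).__name__, scores[type(model).__name__])
```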
Thanks for the garble free explanation sir, my scores are:
GaussianNB:97.7%
MultinomialNB:80%
BernoulliNB:48.8%
I hope the above scores are good. Please comment if a better score can be achieved in another way.
I have never found such informative course like this.. Really great job !!!
Your course is great for serving the practical needs of getting started doing ML in Python. For this video, some more explanation of pipelines would help. I understand what they are accomplishing, but not entirely how. Do the .fit calls refer to the underlying estimators in the pipeline, or is .fit its own method of the pipeline? How does the pipeline know to use the right transformation method? That didn't seem to be explicitly specified.
Again thank so much for this and the other videos.
John
I must say I am getting premium lectures from you, sir.
Thank you Sir, for these well informed videos on ML.
Very good demonstration sir, keep inspiring us with your great videos.
Wonderful explanation sir, thanks for that, and here is my result after execution:
GaussianNB : 97.77%
MultinomialNB: 73.33%
BernoulliNB: 44.44%
with test size = 25%
Thanks for making such great content, free of cost. I'm enjoying it.
Thanks a lot for the videos!!!! 81% for MultinomialNB and 96% for GaussianNB.
Perfect samad. You are really a good student as you are working on all my exercises 😊👌 keep it up 👍
I have a question: how does it find the probability of continuous variables? Can you give me a link to explore?
Your teaching is great sir
Sir, you did not give the fit_transform method in the pipeline. You only gave CountVectorizer(), but it automatically did the fit_transform step. How did it do that?
Wonderful sir, really cleared up the concepts of the pipeline and vectorization method.
The GaussianNB score is 1, whereas that for MultinomialNB is 0.866... for the wine dataset. Hence, GNB is performing better than MNB.
Thank you very much for the great explanation. My results are:
GaussianNB =96.3%
MultinomialNB=83.33%
I don't have any words for your work. Thanks a lot.
You, Sir, are our hero!!!
clean and clear explaination...thank you sir
Thanks a lot!! I have finished the exercise and the results are 94, 83 and 54 with Gaussian, Multinomial and Bernoulli. Maybe because Bernoulli is more suitable for binary input and Multinomial for discrete values, Gaussian is the best in this case, right?
Sir, when using fit_transform in CountVectorizer, why did you write X_train.values? What is that supposed to mean? There is no column named "values" in the dataset.
Thanks for the video sir
My results are below
Gaussian score : 97.777%
Multinomial score: 88.888%
Good job surya, that’s a pretty good score. Thanks for working on the exercise
Just love the tutorial sir...........
Hats off to you!!
Great video, very well explained.. I'm gonna try doing the exercise soon
Sir, can you explain why earlier we used fit_transform but while predicting we used only transform?
Thank you so much sir, from somewhere on earth, from Pakistan.
I find it difficult to use the GaussianNB model on the spam dataset. Is it that it won't work the same way as MultinomialNB?
Sir, is it possible to list the vocabulary words that the Naive Bayes algorithm found to have a high probability of spam?
How are we deciding whether they are spam or not depending on the occurrence of the word? I didn't understand, sir. Can you please explain?
Well, I've just finished the exercise - it's well prepared, thanks for your commitment.
GaussianNB is the best one for this dataset; it scores approx 97%.
MultinomialNB had a score of approx 92%.
RandomForestClassifier had a score of approx 97%.
Thank you for the tutorial. I like the way you teach. GaussianNB works better, but I do not know why! Also, the MultinomialNB score for me came out as 0.8444444444444444.
wonderful explanation sir.
Hi, I am getting a lowercase error when evaluating test data with CountVectorizer, and there is no integer value present either. How can I resolve it?
AttributeError: 'int' object has no attribute 'lower'
Why did you choose to use MultinomialNB? Based on the descriptions of the different NB variants, shouldn't we have used BernoulliNB?
Sir, in my wine dataset, for index 19 the predicted value is 2 but the actual target is 0, and yet my model accuracy is 1. How is that possible? Please help.
Sir, GaussianNB works better: it gives 97.8% accuracy whereas MultinomialNB gives 86.7%.
Thank you sir. I have one question: can't we get numerical data for the spam column using the pd.get_dummies method instead of df['spam']=df['Category'].apply(lambda x: 1 if x=='spam' else 0), as we did for the Titanic data's gender column?
I got 100% accuracy. My train/test split is as below.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(df,target,random_state=20,test_size=0.05)
The Gaussian model was the most accurate for me, with 97% accuracy, while Bernoulli was the least accurate with only 19%, which is to be expected since the training dataset has continuous variables and the Bernoulli model works better for binary variables.
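The three-way comparison above can be sketched as follows (assuming the scikit-learn built-in wine dataset; the exact Bernoulli number depends on the split, but it should come out well below the other two, since BernoulliNB binarizes the all-positive continuous features at 0 by default and so throws away nearly all information):

```python
# Sketch comparing the three Naive Bayes variants on the wine dataset.
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

scores = {}
for model in (GaussianNB(), MultinomialNB(), BernoulliNB()):
    model.fit(X_train, y_train)
    scores[type(model).__name__] = model.score(X_test, y_test)
print(scores)  # BernoulliNB lands far behind the other two here
```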
Very helpful, appreciate all your content!
Hello, I am still new to machine learning and your course is just brilliant; it helps me a lot. I was doing the exercise and there was an error, and I've got no idea why. My sklearn version is 0.22.2. Here is the error: "Could not find a version that satisfies the requirement sklearn.feature (from versions: )
No matching distribution found for sklearn.feature". Is there any help? I tried pip install sklearn.feature from the terminal, but it was no use. Thanks
I know I saw this comment too late but I think the solution for you is in the following link:
stackoverflow.com/questions/38159462/no-matching-distribution-found-for-install/51445886
Happy Coding!
Sir, can you show how to code Naive Bayes on a categorical dataset like play tennis?
Sir, I tried running a bagging classifier on this to see if it increased accuracy, but it gives an error saying it expects a 2D array but got a 1D array. What should I do?
I also did regression analysis, which gave an R² value of 0.89.
Okay, sir, I understand. But if we have null values in the message column, what do we do in that situation?
Hello sir, I want to ask: how can I display the data used as testing data for the sentiment data after applying the CountVectorizer? Thank you 😊
I don't understand why I can't access your git repository and get the code.
Sir, we can also use the label encoder from sklearn for that category column, right sir?
Thank you sir for your tutorial. I was confused by the CountVectorizer at 4:06; it would have been much better if you had explained it in more detail, like what datatype X_train and X_train_count are and what kind of data is stored in X_train_count. I learned it from the shape and type of the numpy arrays, but it would have saved time. Also, why do you first use fit_transform and later just transform for the emails? Can anybody please help me?
Not sure about the first problem, but I can help you with the second. To solve it, let's first understand the fit(), transform(), and fit_transform() methods.
fit() - The fit method calculates the learning model parameters from the training data. When we use model.fit(x_train, y_train), it calculates the internal parameters and adjusts the values for our prediction.
transform() - The transform method applies the calculated parameters to our dataset.
fit_transform() - The fit_transform() method applies both fit() for calculating the parameters and transform() for transforming our dataset in one step.
In the first case, we use fit_transform(x_train) to calculate the parameters and transform the training data; for the test data we apply the parameters learned from fit_transform(x_train), so we use transform(x_test). I hope that clears up your doubt.
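The fit/transform distinction explained above can be illustrated with CountVectorizer on a couple of toy texts (the example strings are made up, not the video's dataset): the vocabulary is learned from the training texts only and then reused to encode unseen test texts.

```python
# fit_transform() learns the vocabulary AND encodes the training texts;
# transform() only encodes, reusing the already-learned vocabulary.
from sklearn.feature_extraction.text import CountVectorizer

train_texts = ["free money now", "meeting at noon"]  # toy examples
test_texts = ["free meeting"]

v = CountVectorizer()
train_counts = v.fit_transform(train_texts)  # learn vocabulary + transform
test_counts = v.transform(test_texts)        # transform with same vocabulary

print(sorted(v.vocabulary_))  # vocabulary comes from the training texts only
print(test_counts.toarray())  # test counts use the training vocabulary's columns
```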
What is the main difference between fit_transform and transform? How does transform help to differentiate an email as spam or not spam?
Sir, I have a question regarding the pipeline concept:
if the pipeline converts data into numbers automatically, then why do we have to convert data into numbers manually with OneHot, dummy variables, label encoding, etc.?
Does this apply just to the vectorizer, or in general?
The pipeline depends on the order of the given steps; you can give whatever steps you want in the pipeline.
Look at the sklearn documentation below - it even has usage examples.
scikit-learn.org/stable/modules/generated/sklearn.pipeline.make_pipeline.html
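A minimal sketch of the idea discussed in this thread: Pipeline calls fit_transform on each intermediate step and fit on the final estimator, in the order you list them, which is why no fit_transform appears explicitly in the video's code. The spam/ham strings below are hypothetical toy data, not the video's dataset.

```python
# Pipeline chains the vectorizer and classifier; calling clf.fit runs
# vectorizer.fit_transform then nb.fit, and clf.predict runs
# vectorizer.transform then nb.predict.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

clf = Pipeline([
    ("vectorizer", CountVectorizer()),  # text -> word-count matrix
    ("nb", MultinomialNB()),            # classifier on the counts
])

# Toy spam/ham data (made-up examples): 1 = spam, 0 = ham
emails = ["win a free prize now", "lunch meeting tomorrow",
          "free prize waiting", "see you at the meeting"]
labels = [1, 0, 1, 0]

clf.fit(emails, labels)
predictions = clf.predict(["free prize", "lunch meeting"])
print(predictions)
```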
Thanks a lot for this course! As a beautiful and clever student, I always do your exercises ^) I don't know what would make your course better. Maybe more exercises.
Sir, my exercise result is:
GaussianNB = 0.9722
MultinomialNB = 0.83333
edit: Multinomial Naive Bayes improved to 0.9722 after data scaling, equal to Gaussian Naive Bayes.
Could anyone please tell me the difference between model.score and accuracy_score?
I think score is used after the model is fit, to know how well the model fits, and accuracy_score gives the accuracy rate of predicted vs actual values...
Is this correct? I need clarification...
For a classifier they both produce the same result, just with different syntax:
1. model.score(X_test, Y_test)
2. accuracy_score is a standalone function from sklearn.metrics (not a model method), so you have to compute the predictions first:
Y_test_predicted = model.predict(X_test)
accuracy_score(Y_test, Y_test_predicted)
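The equivalence described above can be checked directly (a sketch assuming the scikit-learn built-in wine dataset; for classifiers, model.score computes exactly the same accuracy as sklearn.metrics.accuracy_score):

```python
# Verify that model.score and accuracy_score agree for a classifier.
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = GaussianNB().fit(X_train, y_train)

acc_from_score = model.score(X_test, y_test)                    # syntax 1
acc_from_metric = accuracy_score(y_test, model.predict(X_test))  # syntax 2
print(acc_from_score, acc_from_metric)
```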
Great video! One doubt though: why did we use X_train_count.toarray()[:3]? I did not understand the 3. Thank you in advance.
It is just for visualization purposes. Printing X_train_count.toarray() alone would have printed all the data points, which are in the thousands I guess, so sir just used the slicing "[:3]", which means only 3 data points (rows) will be shown, so we can look at the output properly. Get yourself familiarized with pandas slicing and methods like df.iloc[] and df.loc[]; it will be useful.
@@swapnshah3234 Yes, I have used iloc quite often, but I felt we were just converting to an array here and not printing it, and that the 3 somehow had significance in this specific dataset for data cleaning. Thank you for your reply!
Very helpful videos buddy !!!
Instead of using a lambda function, can't we use a label encoder?
Where do I get this spam/ham email CSV file?
How do I apply the CountVectorizer on more than one text column?
With min-max scaling and X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.2, random_state=43), I got 1.0 with Multinomial
and 1.0 with Gaussian.
In the exercise there is no need for a pipeline or count vectorizer; directly apply train_test_split and the fit method. Gaussian gives 97% while Multinomial gave 80%.
Check out our premium machine learning course with 2 Industry projects: codebasics.io/courses/machine-learning-for-data-science-beginners-to-advanced