Machine Learning Tutorial Python - 15: Naive Bayes Classifier Algorithm Part 2

  • Published 17 Jan 2025

COMMENTS • 350

  • @codebasics
    @codebasics  2 роки тому +2

    Check out our premium machine learning course with 2 Industry projects: codebasics.io/courses/machine-learning-for-data-science-beginners-to-advanced

  • @vishalgupta3175
    @vishalgupta3175 4 роки тому +21

    Sir You are amazing, an experience of 25 years is really brilliant, Thanks for Guiding us

  • @r21061991
    @r21061991 4 years ago +16

    Excellent channel to start learning the ML concepts... Way better than almost all the paid courses out there.

  • @bechirmariam8684
    @bechirmariam8684 2 роки тому +1

    i can just say that you are a perfect teacher, Thank you very much. This is a best channel to learn all about datascience!!!

  • @kamalsingh1345
    @kamalsingh1345 3 years ago +7

    Thanks a lot for this playlist of such amazing tutorials.
    At test_size=0.2, GaussianNB: 97.2% and MultinomialNB: 77.3%

  • @moeintorabi2205
    @moeintorabi2205 4 years ago +21

    The Gaussian model is more accurate. As mentioned in the video, the Gaussian model is more accurate for cases where the features have continuous values, which is the case for the Wine dataset.

    • @muhammedrajab2301
      @muhammedrajab2301 4 роки тому +1

      yep , you are right, GaussianNB gave me 100% score.
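
A minimal sketch of the comparison discussed in this thread, assuming the exercise uses scikit-learn's built-in Wine dataset (`load_wine`), which matches the (178, 13) shape reported further down. The `test_size` and `random_state` values are arbitrary choices here; the scores change with the split, which is probably why commenters report different numbers.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB, MultinomialNB

# Assumption: the exercise data is sklearn's built-in Wine dataset.
X, y = load_wine(return_X_y=True)

# Arbitrary split settings; different seeds give different scores.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# GaussianNB models each continuous feature with a per-class normal distribution.
gaussian_score = GaussianNB().fit(X_train, y_train).score(X_test, y_test)

# MultinomialNB is designed for counts; it runs here only because
# all Wine features are non-negative.
multinomial_score = MultinomialNB().fit(X_train, y_train).score(X_test, y_test)

print(f"GaussianNB:    {gaussian_score:.3f}")
print(f"MultinomialNB: {multinomial_score:.3f}")
```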

  • @mehmetyigitakn8208
    @mehmetyigitakn8208 2 роки тому +1

    outstanding video series! greetings from Turkey, I learn too much from this channel. It's now my primary go-to resource to learn machine learning from scratch

  • @anujvyas9493
    @anujvyas9493 4 years ago +23

    Solved the exercise, got these answers:
    Using Gaussian : 1.0
    Using Multinomial : 0.889

    • @vikaskulshreshtha
      @vikaskulshreshtha 3 роки тому

      Can you please send me the code and dataset vikas.kulshreshtha@gmail.com

    • @abhijeetvighne4452
      @abhijeetvighne4452 3 years ago

      What is .values in X_train.values in fit_transform?
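
On the `.values` question: for an ordinary object-dtype pandas column, `.values` returns the underlying NumPy array. The sketch below uses a made-up two-message Series (not the tutorial's data) to suggest that `CountVectorizer` produces the same counts either way, since it only needs an iterable of strings.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical stand-in for the tutorial's X_train column.
X_train = pd.Series(["free prize now", "see you at lunch"], dtype=object)

print(type(X_train))         # a pandas Series
print(type(X_train.values))  # the underlying NumPy array

v = CountVectorizer()
counts_from_series = v.fit_transform(X_train)
counts_from_array = v.fit_transform(X_train.values)

# Both inputs are iterables of strings, so the count matrices match.
same = (counts_from_series.toarray() == counts_from_array.toarray()).all()
print(same)
```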

  • @Yash8147
    @Yash8147 2 роки тому

    I got
    100% accuracy with Gaussian NB
    96% accuracy with Multinomial NB
    Thanks for explaining in a very easy and convenient way :)

  • @navneetkaurpopli2766
    @navneetkaurpopli2766 4 роки тому +2

    All your ML videos are wonderful. Good job. Difficult things explained easily. Thanks

  • @charmindesai3730
    @charmindesai3730 3 роки тому

    Amazing tutorial, you teach far better than university professors. Following many of your playlist thoroughly !!! Thank you very much

  • @reemnasser9105
    @reemnasser9105 2 роки тому +2

    I always recommend your playlist to others, it's really helpful and thanks for this effort.

  • @akashbairagi6840
    @akashbairagi6840 3 years ago

    Amazing !!! Just Amazing 🔥 The best ML tutorial on YouTube....

  • @gajanantayde
    @gajanantayde 3 роки тому +1

    Thank you for this wonderful tutorial
    Exercise scores
    GaussianNB score - 94.5%
    MultinomialNB score - 84.5%

    • @codebasics
      @codebasics  3 роки тому

      Good job gajanan, that’s a pretty good score. Thanks for working on the exercise

    • @nisarali1954
      @nisarali1954 2 роки тому

      Kindly Sir, Help me to find a malicious email through AI. any link etc...

  • @larrybuluma2458
    @larrybuluma2458 4 роки тому +1

    Gaussian: 1.0
    Multinomial: 0.833
    Keep up the good work you're doing

  • @ebtehalturki6329
    @ebtehalturki6329 3 роки тому

    I have never found such informative course like this.. Really great job !!!

  • @austinwhite1316
    @austinwhite1316 2 роки тому

    I think you might be the most valuable resource online for ML beginners.
    Gaussian: 100%
    Multinomial: 86.1%

  • @mapa5000
    @mapa5000 Рік тому

    Thank you for sharing your knowledge. These ML classes are gold ! 👏🏼👏🏼👏🏼

  • @moeintorabi2205
    @moeintorabi2205 4 роки тому +10

    For comparing the models I used Cross Validation (CV = 4) as you explained in the previous videos.
    Average Gaussian Score = 0.9722222222222222
    Average Multinomial score = 0.8333333333333333

    • @study_with_thor
      @study_with_thor 3 роки тому

      better approach! thanks for your suggestion
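
The cross-validation comparison described above could look roughly like this. It assumes the built-in Wine dataset and `cv=4` as in the comment; averaging over folds makes the comparison less dependent on one lucky train/test split.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB, MultinomialNB

# Assumption: the exercise data is sklearn's built-in Wine dataset.
X, y = load_wine(return_X_y=True)

# cross_val_score returns one accuracy value per fold.
gaussian_scores = cross_val_score(GaussianNB(), X, y, cv=4)
multinomial_scores = cross_val_score(MultinomialNB(), X, y, cv=4)

print("Average Gaussian score:   ", gaussian_scores.mean())
print("Average Multinomial score:", multinomial_scores.mean())
```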

  • @TheNobody04
    @TheNobody04 3 роки тому +2

    My scores are : Multinomial NB = 0.84, Gaussian NB = 0.97. Thank you so much for these videos :)

    • @codebasics
      @codebasics  3 роки тому +1

      Great job and great score. ☺️👍

  • @bhavyanaik74
    @bhavyanaik74 4 роки тому

    you are one of the best teacher in my life.

  • @akshatsingh6036
    @akshatsingh6036 4 роки тому

    i must say premium lectures i am getting from you sir

  • @behindthecareer
    @behindthecareer 3 роки тому +1

    Thanks for making such great content, free of cost. I'm enjoying .

  • @tsai301103
    @tsai301103 2 years ago +2

    GaussianNaiveBayes 0.972 / MultinomialNaiveBayes 0.94, with MinMaxScaler applied to the training dataset. This series of tutorials is strongly recommended. It helped me a lot.
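
One way to reproduce the MinMaxScaler idea from this comment is to put the scaler and the classifier in a single pipeline, so the scaling is learned from the training data only. This is a sketch under assumptions (built-in Wine dataset, arbitrary split), not the commenter's actual code.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# clip=True keeps scaled test values inside [0, 1], so MultinomialNB
# never sees negative inputs even if a test value is below the training minimum.
scaled_mnb = make_pipeline(MinMaxScaler(clip=True), MultinomialNB())
scaled_mnb.fit(X_train, y_train)

scaled_score = scaled_mnb.score(X_test, y_test)
print(f"MultinomialNB with MinMaxScaler: {scaled_score:.3f}")
```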

  • @karthikc8992
    @karthikc8992 4 роки тому +2

    U r one of the best teacher I have ever seen
    keep rocking
    By the way I don't know from what you r suffering
    get well soon buddy
    take care of yourself.👍

    • @codebasics
      @codebasics  4 роки тому +2

      I was suffering from Ulcerative colitis. I am doing well now.

    • @karthikc8992
      @karthikc8992 4 years ago +1

      @@codebasics thanks for your reply sir
      May I know where you are from?

    • @muhammedrajab2301
      @muhammedrajab2301 4 роки тому +1

      @@karthikc8992 He is in US

    • @karthikc8992
      @karthikc8992 4 роки тому

      @@muhammedrajab2301 I learned it before , by the way thank u for your reply

  • @sidduhedaginal
    @sidduhedaginal 4 years ago +4

    Wonderful explanation sir, thanks for that, and here is my result after execution
    GaussianNB : 96.2%
    MultinomialNB: 88.8%

    • @codebasics
      @codebasics  4 роки тому +1

      Siddu, good job indeed.thats a pretty good score

  • @bhavishyvidya
    @bhavishyvidya 2 роки тому +1

    Thank you Sir, for these well informed videos on ML.

  • @kisiki_kt_na_lage
    @kisiki_kt_na_lage 4 роки тому +3

    You, Sir, are our hero!!!

  • @AlexTechProjects
    @AlexTechProjects 11 months ago +1

    I have a question: how does it find the probability for continuous variables? Can you give me a link to explore?
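
On the continuous-variable question: Gaussian Naive Bayes estimates a mean and variance for each feature within each class, then uses the normal density as the likelihood of a value. The sketch below illustrates this on a tiny made-up one-feature dataset; `gaussian_pdf` is a helper written for this example, and scikit-learn additionally adds a very small smoothing term to the variance.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def gaussian_pdf(x, mean, var):
    """Normal density, used as the likelihood P(x | class)."""
    return np.exp(-((x - mean) ** 2) / (2 * var)) / np.sqrt(2 * np.pi * var)

# Hypothetical data: one continuous feature, two classes.
X = np.array([[1.0], [1.2], [0.8], [3.0], [3.2], [2.8]])
y = np.array([0, 0, 0, 1, 1, 1])

model = GaussianNB().fit(X, y)
print(model.theta_.ravel())  # per-class feature means learned by the model

x_new = 1.1
for label in (0, 1):
    values = X[y == label, 0]
    # Likelihood of x_new under this class's fitted normal distribution.
    likelihood = gaussian_pdf(x_new, values.mean(), values.var())
    print(f"class {label}: likelihood = {likelihood:.4f}")

predicted = model.predict([[x_new]])
print(predicted)
```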

  • @samundraps5730
    @samundraps5730 4 роки тому

    I don't have any words for your work. Thanks a lot.

  • @piyushjha8888
    @piyushjha8888 4 роки тому

    Exercise answer:
    Gaussian : 1.0
    MultinomialNB : 0.889
    Sir u use random state in your solution.Thank you sir i learned something new

  • @userhandle-u7b
    @userhandle-u7b 7 місяців тому

    Thanks a lot for the tuto. Your series is best because it contains the exercises.
    My exercise result: GaussianNB = 0.96, MultinomialNB = 0.84. I also applied cross validation =5

  • @alikhannurkamal4691
    @alikhannurkamal4691 3 роки тому +1

    Thank you very much for that tutorial!
    My results were:
    GaussianNB score - 97.2%
    MultinomialNB score - 86.1%

    • @codebasics
      @codebasics  3 роки тому

      Good job Alikhan, that’s a pretty good score. Thanks for working on the exercise

  • @Taher-p7i
    @Taher-p7i 7 місяців тому

    Very nice explanation, Thank you so much sir for keeping this much effort in making videos and the exercises

  • @stephenngumbikiilu3988
    @stephenngumbikiilu3988 2 роки тому +1

    Thank you for your amazing explanation. I have learned a lot.
    Gaussian NB: 100%
    Multinomial: 91.11%

    • @himakshipahuja3015
      @himakshipahuja3015 2 роки тому

      From where did you get the dataset?

    • @stephenngumbikiilu3988
      @stephenngumbikiilu3988 2 роки тому +1

      @@himakshipahuja3015 Check the exercise file and you will see the data set. Please tell me if you can't find it and I will send it to you

    • @himakshipahuja3015
      @himakshipahuja3015 2 роки тому

      Thank you very much, @Stephen Ngumbi Kiilu. I found the dataset.

  • @sreenivas351
    @sreenivas351 2 роки тому

    Your teaching is great sir

  • @santoshnarwad3528
    @santoshnarwad3528 8 місяців тому

    Sir very nice teaching and really it's very easy to understand

  • @samadhemmaty1796
    @samadhemmaty1796 5 років тому

    Thanks a lot for videos!!!!, 81% for MultinomialNB and 96% for GaussianNB

    • @codebasics
      @codebasics  5 років тому +1

      Perfect samad. You are really a good student as you are working on all my exercises 😊👌 keep it up 👍

  • @Otaku-Chan01
    @Otaku-Chan01 Рік тому

    Just love the tutorial sir...........
    Hats off to you!!

  • @premnathmagi
    @premnathmagi 3 роки тому +1

    I don't know why some people have disliked this video. How beautifully he is explaining the M.L algorithms.

  • @prakharmishra2977
    @prakharmishra2977 5 років тому +2

    very well demonstration sir,keep inspiring us with your great videos.

  • @MotoRider23605
    @MotoRider23605 4 роки тому +1

    Wonderful sir,really cleared the concepts of pipeline and vectorisation method

  • @subhajitadhikary155
    @subhajitadhikary155 4 роки тому

    Really great videos sir, explained very well.
    About the exercise:-
    for GaussianNB :- 1.0
    for MultinomialNB:- 0.944
    with random_state= 7 and test_size=0.2

    • @codebasics
      @codebasics  4 роки тому +1

      Great score. Good job 👍👏

  • @ameysonawane
    @ameysonawane 3 роки тому

    GaussianNB : 97.22
    MultinomialNB: 86.11
    thank you for this video

  • @meetulagrawal6498
    @meetulagrawal6498 3 years ago

    Wonderful explanation sir, thanks for that, and here is my result after execution
    GaussianNB : 97.77%
    MultinomialNB: 73.33%
    BernoulliNB: 44.44%
    with test size = 25%

  • @muhammedrajab2301
    @muhammedrajab2301 4 роки тому +1

    I solved the exercise and I got the following score:
    used train_test_split with test_size=0.2 and random_state=123
    This parameters gave me following results:
    GaussianNB score: 1.0(100%)
    MultinomialNB score: 0.888888888888888(88%)
    dataset shape : (178,13)[Dataset is pretty small!]

    • @codebasics
      @codebasics  4 роки тому +1

      Great job muhammed. Good score indeed

  • @dbreddy6201
    @dbreddy6201 3 years ago

    Clean and clear explanation... thank you sir

  • @study_with_thor
    @study_with_thor 3 years ago

    Well, I've just finished the exercise - that's well-prepared, thanks for your commitment.

  • @ashutoshtiwary7448
    @ashutoshtiwary7448 3 years ago

    Solved the exercise with the help of the cross_val_score method
    and found that Gaussian performed better than Multinomial.
    From the list of scores I got:
    max value of Gaussian = 0.97222222
    max value of Multinomial = 0.91428571
    Sir, your tutorials are helping me a lot because your teaching technique is familiar and easy for me.
    Thanks a lot, sir.

    • @codebasics
      @codebasics  3 роки тому

      Good job ashutosh, that’s a pretty good score. Thanks for working on the exercise

  • @usmanrafiq7080
    @usmanrafiq7080 Рік тому

    thank u so much sir from somewhere on earth from pakistan

  • @gamerboy123-fb
    @gamerboy123-fb 9 months ago +1

    At 1:45, can we use mapping instead of a lambda function?
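
A dictionary passed to `Series.map` should give the same result as an `apply` with a lambda. The column name `Category` and the labels `ham`/`spam` below are assumptions about the tutorial's dataset, used only for illustration.

```python
import pandas as pd

# Hypothetical frame; "Category" with "ham"/"spam" values is an assumption.
df = pd.DataFrame({"Category": ["ham", "spam", "ham"]})

# Lambda-based encoding.
df["spam_lambda"] = df["Category"].apply(lambda x: 1 if x == "spam" else 0)

# Equivalent dictionary mapping.
df["spam_map"] = df["Category"].map({"ham": 0, "spam": 1})

# Equivalent vectorized comparison.
df["spam_compare"] = (df["Category"] == "spam").astype(int)

print(df)
```

Note that `map` yields NaN for any label missing from the dictionary, whereas the lambda sends every non-spam value to 0.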

  • @johnspivack
    @johnspivack 1 year ago

    Your course is great for serving the practical needs of getting started doing ML in Python. For this video, some more explanation of pipelines would help. I understand what they are accomplishing, but not entirely how. Do the .fit methods refer to the underlying functions in the pipeline, or is .fit its own method of the pipeline? How does the pipeline know to use the right transformation method, which didn't seem to be explicitly specified?
    Again, thanks so much for this and the other videos.
    John
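
On the pipeline question: `Pipeline` has its own `fit` and `predict` methods. `fit` calls `fit_transform` on every step except the last and then `fit` on the final estimator; `predict` calls `transform` on the earlier steps and `predict` on the last. The sketch below, on made-up messages, compares a pipeline against the same steps done by hand.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy data: 1 = spam, 0 = ham.
messages = [
    "win a free prize now",
    "free cash offer",
    "lunch at noon?",
    "see you at the meeting",
]
labels = [1, 1, 0, 0]
test_messages = ["free prize offer", "meeting at noon"]

# Pipeline version: one fit call, one predict call.
clf = Pipeline([("vectorizer", CountVectorizer()), ("nb", MultinomialNB())])
clf.fit(messages, labels)
pipeline_pred = clf.predict(test_messages)

# Manual version: what the pipeline does internally.
v = CountVectorizer()
model = MultinomialNB()
model.fit(v.fit_transform(messages), labels)             # fit_transform on training data
manual_pred = model.predict(v.transform(test_messages))  # transform only on new data

print(pipeline_pred, manual_pred)
```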

  • @darkvolaf
    @darkvolaf 4 years ago +2

    Sir, is it possible to list the words that the Naive Bayes algorithm found to have a high probability of indicating spam?

  • @rajadurai7336
    @rajadurai7336 Рік тому

    Sir, Your videos are great continue doing your job. I got an accuracy of GNB of 97.22 and MNB as 86.122 for the exercise question.

  • @amandaahringer7466
    @amandaahringer7466 2 роки тому

    Very helpful, appreciate all your content!

  • @anshulagarwal6682
    @anshulagarwal6682 2 years ago +1

    Sir, you did not call the fit_transform method in the pipeline. You only gave CountVectorizer(), but it automatically did the fit_transform step. How did it do that?

  • @poojadixit5608
    @poojadixit5608 4 роки тому

    wonderful explanation sir.

  • @SandyCoco1
    @SandyCoco1 3 роки тому

    Great video, very well explained.. I'm gonna try doing the exercise soon

  • @subhnand
    @subhnand 4 роки тому

    Excellent tutorial

  • @kajalkhirodkar6521
    @kajalkhirodkar6521 3 роки тому

    Thankyou for your efforts. These videos are really helpful

  • @rashid101b
    @rashid101b 5 років тому

    Really good one to start

    • @codebasics
      @codebasics  5 років тому

      Rashid, I am glad you liked it

  • @ISandrucho
    @ISandrucho 4 роки тому

    Thanks a lot for this course! As a beautiful and clever student I always do your exercises ^) I don't know what would make your course better. Maybe more exercises.

  • @antshant1
    @antshant1 Рік тому

    Very helpful videos buddy !!!

  • @srishtikumari6664
    @srishtikumari6664 4 роки тому

    Thank you so much!
    You explained this complex concept so easily..

  • @harshalbhoir8986
    @harshalbhoir8986 2 years ago

    Awesome tutorial

  • @maruthiprasad8184
    @maruthiprasad8184 3 роки тому

    Thank you very much for great explanation , my results are
    GaussianNB =96.3%
    MultinomialNB=83.33%

  • @kirankumarb2190
    @kirankumarb2190 4 роки тому

    Thankyou very much guru ji...

  • @rambaldotra2221
    @rambaldotra2221 3 роки тому

    Awesome and so clean..

  • @saineshnakra6045
    @saineshnakra6045 4 years ago +3

    Great video! One doubt though: why did we use X_train_count.toarray()[:3]? I did not understand the 3. Thank you in advance.

    • @klelck
      @klelck 4 years ago +2

      It is just for visualization purposes. Printing all of X_train_count.toarray() would have printed every data point, which I guess is in the thousands, so sir used the slice "[:3]" to show only the first 3 data points so we can read the output properly. Get yourself familiar with pandas slicing and methods like df.iloc[] and df.loc[]. It will be useful.

    • @saineshnakra6045
      @saineshnakra6045 4 years ago +1

      @@klelck Yes, I have used iloc quite often, but I felt we were just converting to an array here and not printing it, and that the 3 somehow had significance for data cleaning in this specific dataset. Thank you for your reply!

    • @abhijeetvighne4452
      @abhijeetvighne4452 3 years ago

      @@klelck What is .values in X_train.values in fit_transform?
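
To illustrate the `toarray()[:3]` discussion in this thread: `fit_transform` returns a sparse matrix, `toarray()` converts it to a dense NumPy array, and `[:3]` keeps only the first three rows for display. The messages below are made up for the example.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy messages.
messages = [
    "win a free prize now",
    "free cash offer",
    "lunch at noon?",
    "see you at the meeting",
]

v = CountVectorizer()
X_train_count = v.fit_transform(messages)  # sparse matrix: one row per message

print(X_train_count.shape)  # (number of messages, vocabulary size)

# Dense view of the first 3 rows only; the slice does not change the data.
first_three = X_train_count.toarray()[:3]
print(first_three)
```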

  • @melikabaghi5555
    @melikabaghi5555 3 роки тому

    your videos are great! good luck.

  • @chinmaymathur2430
    @chinmaymathur2430 Рік тому +1

    Kindly raise your volume.

  • @snehasneha9290
    @snehasneha9290 4 years ago +1

    How do I apply the count vectorizer to more than one text column?
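
For more than one text column, one option (among others, such as concatenating the columns into one string) is a `ColumnTransformer` with a separate `CountVectorizer` per column. The column names `subject` and `body` are hypothetical.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical frame with two text columns.
df = pd.DataFrame(
    {
        "subject": ["free prize", "team lunch"],
        "body": ["claim your prize now", "lunch at noon"],
    }
)

# Passing each column name as a plain string hands the vectorizer
# a 1-D column of text, which is what CountVectorizer expects.
ct = ColumnTransformer(
    [
        ("subject_counts", CountVectorizer(), "subject"),
        ("body_counts", CountVectorizer(), "body"),
    ]
)

features = ct.fit_transform(df)
print(features.shape)  # rows x (subject vocabulary + body vocabulary)
```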

  • @NitinKumar-wm2dg
    @NitinKumar-wm2dg 1 year ago +1

    Thank you sir for your tutorial. I was confused by the CountVectorizer at 4:06; it would have been much better if you had explained it in more detail, for example what data types X_train and X_train_count are and what kind of data is stored in X_train_count. I worked it out from the shape and type in NumPy, but it would have saved time. Also, why do you first call fit_transform and later just transform for the emails? Can anybody please help me?

    • @rajadurai7336
      @rajadurai7336 1 year ago

      I'm not sure about the first problem, but I can help with the second. Let's first understand the fit(), transform(), and fit_transform() methods.
      fit() - calculates the model's parameters from the training data. When we call model.fit(x_train, y_train), it calculates the internal parameters used for prediction.
      transform() - applies the calculated parameters to our dataset.
      fit_transform() - runs fit() to calculate the parameters and transform() to transform the dataset in one step.
      So we call fit_transform(x_train) to learn the parameters from the training data and transform it, and for the test data we apply those same learned parameters, so we use transform(x_test). I hope that clears your doubt.
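
For `CountVectorizer` specifically, the "parameters" learned by `fit` are the vocabulary. A small sketch with made-up text: words seen only at transform time are ignored, because the column layout was fixed during `fit_transform`.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical training and test text.
train = ["free prize now", "lunch at noon"]
test = ["free lunch tomorrow"]

v = CountVectorizer()
train_counts = v.fit_transform(train)  # learns the vocabulary, then counts
test_counts = v.transform(test)        # counts using the learned vocabulary only

print(sorted(v.vocabulary_))
# ['at', 'free', 'lunch', 'noon', 'now', 'prize']

print(test_counts.toarray())
# [[0 1 1 0 0 0]]  -> "tomorrow" is not in the vocabulary, so it is dropped
```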

  • @harshamejari
    @harshamejari Рік тому

    fantastic

  • @jenil16
    @jenil16 6 місяців тому +1

    GaussianNB score is 1 whereas that for the MultinomialNB is 0.866... for the WINE dataset. Hence, GNB is performing better that MNB.

  • @dhananjaykansal8097
    @dhananjaykansal8097 5 років тому

    Nice one sir. Thank you so much...

    • @codebasics
      @codebasics  5 років тому

      Dhananjay, I am glad you liked it

  • @gauravlotey4263
    @gauravlotey4263 4 роки тому +1

    i also did Regression analysis, which has r2 value as 0.89.

  • @sulemankhan7775
    @sulemankhan7775 3 роки тому

    Thank you so much sir. Your videos are really useful

  • @deepakavva317
    @deepakavva317 3 роки тому

    How we are deciding wether they are spam or not depending on the occurance of the word. I didn't understand sir. Can u please explain.

  • @mathwithpunam
    @mathwithpunam 2 years ago

    Sir, can I use TfidfVectorizer?
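
`TfidfVectorizer` can be dropped into the same kind of pipeline in place of `CountVectorizer`, since `MultinomialNB` also accepts fractional, non-negative feature values. A sketch on made-up messages; whether it scores better on the real dataset would need to be checked.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy data: 1 = spam, 0 = ham.
messages = [
    "win a free prize now",
    "free cash offer",
    "lunch at noon?",
    "see you at the meeting",
]
labels = [1, 1, 0, 0]

# Same pipeline shape as with CountVectorizer, only the vectorizer changes.
clf = make_pipeline(TfidfVectorizer(), MultinomialNB())
clf.fit(messages, labels)

pred = clf.predict(["free prize offer"])
print(pred)
```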

  • @prakharmishra2977
    @prakharmishra2977 5 років тому

    well demonstration sir!!

    • @codebasics
      @codebasics  5 років тому

      Prakhar, I am glad you liked it

  • @ksantoshkumar4579
    @ksantoshkumar4579 Рік тому

    good content

  • @asadali4153
    @asadali4153 4 years ago +1

    Hi. I am getting a "lower" error when transforming the test data with CountVectorizer.
    As far as I can see, there is no integer value present either.
    How can I resolve it?
    AttributeError: 'int' object has no attribute 'lower'
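
This error usually means at least one entry in the text column is not a string (for example a message consisting only of digits that was parsed as a number). A possible fix, sketched on made-up data, is to cast the column to strings before vectorizing.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical column where one "message" was parsed as an integer.
messages = pd.Series(["free prize now", 12345, "lunch at noon"])

v = CountVectorizer()
try:
    v.fit_transform(messages)
except AttributeError as err:
    # CountVectorizer lowercases each document, which fails on an int.
    print("Fails on mixed types:", err)

# Casting every entry to str avoids the error.
counts = v.fit_transform(messages.astype(str))
print(counts.shape)
```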

  • @adj6375
    @adj6375 4 роки тому

    Hey sir .. is there a video available on Hypothesis testing?

  • @cybercorgi8170
    @cybercorgi8170 Рік тому

    i dont understand why I cant access to your git repository and get codes

  • @FilmLover65
    @FilmLover65 Рік тому

    what is the main difference between fit_transform and transform

  • @naderanavas8145
    @naderanavas8145 1 year ago

    Sir, can you show how to code Naive Bayes on a categorical dataset like play tennis?
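
For purely categorical features, scikit-learn offers `CategoricalNB`, which can be combined with `OrdinalEncoder` to turn category strings into integer codes. The tiny table below is made up in the spirit of the play-tennis example; it is not the classic dataset.

```python
import pandas as pd
from sklearn.naive_bayes import CategoricalNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical play-tennis style data.
df = pd.DataFrame(
    {
        "outlook": ["sunny", "sunny", "overcast", "rain",
                    "rain", "overcast", "sunny", "rain"],
        "windy": ["no", "yes", "no", "no", "yes", "yes", "no", "no"],
        "play": ["no", "no", "yes", "yes", "no", "yes", "yes", "yes"],
    }
)

# OrdinalEncoder maps each category to an integer code;
# CategoricalNB then estimates P(category | class) per feature.
model = make_pipeline(OrdinalEncoder(), CategoricalNB())
model.fit(df[["outlook", "windy"]], df["play"])

new_day = pd.DataFrame({"outlook": ["overcast"], "windy": ["no"]})
pred = model.predict(new_day)
print(pred)
```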

  • @rushangkasundra3956
    @rushangkasundra3956 3 years ago

    Why doesn't the order of the words matter in the count vector?
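
`CountVectorizer` builds a bag-of-words representation: each column is a word and each value is how often it occurs, so position is discarded by construction. The sketch below shows two reordered sentences getting identical rows, and that adding bigrams via `ngram_range=(1, 2)` retains some local order.

```python
from sklearn.feature_extraction.text import CountVectorizer

sentences = ["free prize now", "now prize free"]

# Default unigram counts: same words, same counts, regardless of order.
unigram_counts = CountVectorizer().fit_transform(sentences).toarray()
print(unigram_counts)
same_unigrams = bool((unigram_counts[0] == unigram_counts[1]).all())

# With bigrams, "free prize" and "prize free" become different features.
bigram_counts = CountVectorizer(ngram_range=(1, 2)).fit_transform(sentences).toarray()
same_bigrams = bool((bigram_counts[0] == bigram_counts[1]).all())

print(same_unigrams, same_bigrams)
```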

  • @msuryapavan9517
    @msuryapavan9517 2 роки тому

    Thanks for the garble free explanation sir, my scores are:
    GaussianNB:97.7%
    MultinomialNB:80%
    BernoulliNB:48.8%
    hope, the above mentioned scores are good. Please comment, if any better score can be achieved in any another way.

  • @NguyenucNam_
    @NguyenucNam_ 2 years ago

    Thanks a lot!! I have done the exercise and the results are 94, 83 and 54 with Gaussian, Multinomial and Bernoulli. Maybe that is because Bernoulli is more suitable for binary input and Multinomial for discrete values, so Gaussian is the best in this case, right?

  • @suryavarunkanthpanchanapu5179
    @suryavarunkanthpanchanapu5179 3 роки тому

    Thanks for the video sir
    My results are below
    Gaussian score : 97.777%
    Multinomial score: 88.888%

    • @codebasics
      @codebasics  3 роки тому

      Good job surya, that’s a pretty good score. Thanks for working on the exercise

  • @vipulagarwal9842
    @vipulagarwal9842 4 years ago

    Instead of using a lambda function, can't we use a label encoder?
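
`LabelEncoder` from scikit-learn should work for the target column too. It assigns integer codes in sorted order of the labels, so with the assumed labels `ham` and `spam` the result matches the lambda encoding (ham = 0, spam = 1).

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical label column; "ham"/"spam" values are an assumption.
categories = ["ham", "spam", "ham", "spam"]

encoder = LabelEncoder()
encoded = encoder.fit_transform(categories)

print(list(encoder.classes_))  # ['ham', 'spam'] -> codes 0 and 1
print(encoded.tolist())        # [0, 1, 0, 1]

# inverse_transform recovers the original labels from the codes.
decoded = encoder.inverse_transform(encoded)
```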

  • @anupsigedar8396
    @anupsigedar8396 5 років тому +1

    85% for Multinomial and 96% for Gaussian using mean of 10 fold cross validation

    • @codebasics
      @codebasics  5 років тому +2

      Awesome Anup. you are so fast. Good job :)

    • @anupsigedar8396
      @anupsigedar8396 5 років тому +1

      @@codebasics All thanks to you for such a nice explanation:)

  • @gowthamimaram8119
    @gowthamimaram8119 3 роки тому

    how transform is helping to to differentiate email as spam and not spam?

  • @Jai_Ram.2602
    @Jai_Ram.2602 3 роки тому

    sir we can also use label encdoer from sklearn for that category column right sir?

  • @frankfunk9188
    @frankfunk9188 4 роки тому +2

    GaussianNB: 1
    MultinomialNB: 0.85

  • @saiheroforkings3574
    @saiheroforkings3574 11 months ago

    Okay, sir, I understand. But what do we do if we have null values in the message column?

    • @r0cketRacoon
      @r0cketRacoon 10 months ago

      Drop it, bro. If there is no message, how do we know whether it is spam or not?
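
Dropping rows with a missing message, as suggested in the reply, could be done with `dropna` before vectorizing. The column names `Category` and `Message` are assumptions about the tutorial's CSV.

```python
import pandas as pd

# Hypothetical frame with one missing message.
df = pd.DataFrame(
    {
        "Category": ["ham", "spam", "ham"],
        "Message": ["lunch at noon", None, "see you soon"],
    }
)

print(df["Message"].isna().sum())  # number of missing messages

# Keep only rows that actually have message text.
clean = df.dropna(subset=["Message"])
print(len(clean))
```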

  • @pythonitvet4442
    @pythonitvet4442 4 years ago +1

    Hello, I am still new to machine learning and your course is just brilliant; it helps me a lot. I was doing the exercise and got an error, and I have no idea why. My sklearn version is 0.22.2. Here is the error: "Could not find a version that satisfies the requirement sklearn.feature (from versions: )
    No matching distribution found for sklearn.feature". Is there any help? I tried installing it from the terminal with pip install sklearn.feature, but it was no use. Thanks

    • @muhammedrajab2301
      @muhammedrajab2301 4 роки тому +1

      I know I saw this comment too late, but I think the solution for you is at the following link:
      stackoverflow.com/questions/38159462/no-matching-distribution-found-for-install/51445886
      Happy Coding!

  • @rajkalashtiwari
    @rajkalashtiwari 3 years ago

    GaussianNB is the one for this dataset, it scores approx 97%
    MultinomialNB had a score of approx 92%
    RandomForestClassifier had a score of approx 97%

    • @codebasics
      @codebasics  3 роки тому

      That’s the way to go raj, good job working on that exercise

  • @anushreejha9999
    @anushreejha9999 4 роки тому +1

    What is v.transform(email)