Machine Learning Tutorial Python - 18: K nearest neighbors classification with python code

  • Published 16 Jul 2024
  • In this video we will understand how the K nearest neighbors algorithm works, then write Python code using the sklearn library to build a KNN (K nearest neighbors) model. At the end, I have an exercise for you to practice the concepts you learned in this video.
    Code: github.com/codebasics/py/blob...
    Exercise: github.com/codebasics/py/blob...
    ⭐️ Timestamps ⭐️
    00:00 Theory
    03:51 Coding
    14:09 Exercise
    Do you want to learn technology from me? Check codebasics.io/?... for my affordable video courses.
    Machine learning tutorial playlist for beginners: • Machine Learning Tutor...
    🌎 My Website For Video Courses: codebasics.io/?...
    Need help building software or data analytics and AI solutions? My company www.atliq.com/ can help. Click on the Contact button on that website.
    🎥 Codebasics Hindi channel: / @codebasicshindi
    #️⃣ Social Media #️⃣
    🔗 Discord: / discord
    📸 Dhaval's Personal Instagram: / dhavalsays
    📸 Instagram: / codebasicshub
    🔊 Facebook: / codebasicshub
    📱 Twitter: / codebasicshub
    📝 Linkedin (Personal): / dhavalsays
    📝 Linkedin (Codebasics): / codebasics
    ❗❗ DISCLAIMER: All opinions expressed in this video are my own and not those of my employers.
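
    For reference, a minimal sketch of the sklearn KNN workflow the video describes. The choice of the iris dataset and the parameter values here are assumptions; the actual notebook is at the GitHub link above.

        # Minimal KNN classification sketch with scikit-learn (assumed iris dataset)
        from sklearn.datasets import load_iris
        from sklearn.model_selection import train_test_split
        from sklearn.neighbors import KNeighborsClassifier

        iris = load_iris()
        X_train, X_test, y_train, y_test = train_test_split(
            iris.data, iris.target, test_size=0.3, random_state=1)

        knn = KNeighborsClassifier(n_neighbors=3)   # K = 3 is just a starting point
        knn.fit(X_train, y_train)
        print(knn.score(X_test, y_test))            # mean accuracy on the test split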

COMMENTS • 93

  • @codebasics
    @codebasics  2 years ago +3

    Check out our premium machine learning course with 2 Industry projects: codebasics.io/courses/machine-learning-for-data-science-beginners-to-advanced

  • @flavio4923
    @flavio4923 2 years ago +44

    Just a tip I read from a book: for highly structured data a smaller K is better (like this example, or handwriting/speech recognition), but for noisy data it is recommended to use a bigger K.
    Keep up the great videos!

  • @PythonArms
    @PythonArms 2 years ago +16

    When you said "most important skill" and then said Ctrl+C/Ctrl+V, I lost it, haha. Great video.

  • @Koome777
    @Koome777 7 months ago +4

    I got a score of 0.99444 with k=6, using random_state to test the outcome of each change in K. I've also discovered that sklearn has a module for displaying the confusion matrix without manually building a seaborn/matplotlib heatmap. The module is called ConfusionMatrixDisplay and it only takes the confusion matrix object as its parameter. Thanks Dhaval Patel sir.
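
    A sketch of what this comment describes, assuming the digits exercise; ConfusionMatrixDisplay from sklearn.metrics plots a confusion matrix directly. The k and random_state values below are just examples, not the commenter's exact setup.

        import matplotlib.pyplot as plt
        from sklearn.datasets import load_digits
        from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
        from sklearn.model_selection import train_test_split
        from sklearn.neighbors import KNeighborsClassifier

        digits = load_digits()
        X_train, X_test, y_train, y_test = train_test_split(
            digits.data, digits.target, test_size=0.2, random_state=10)

        knn = KNeighborsClassifier(n_neighbors=6).fit(X_train, y_train)
        cm = confusion_matrix(y_test, knn.predict(X_test))

        # Plot straight from the confusion-matrix array
        ConfusionMatrixDisplay(confusion_matrix=cm).plot()
        plt.show()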

  • @zizoublbs8918
    @zizoublbs8918 2 years ago +8

    I've tested KNN classification on the digits dataset and got an accuracy of 99.44% with n_neighbors=3 and test_size=0.2 for the split (I never look at the solutions). Thank you for your tutorials, they're extremely useful (y)

  • @gusinthecloud
    @gusinthecloud 2 years ago +3

    The best teacher makes a difficult subject simple. Thank you. You are great!!!

  • @vimalradadiya5929
    @vimalradadiya5929 2 years ago +5

    Nice explanation, and sir, please continue this playlist, it's really helpful for gaining knowledge about machine learning.
    Thank you so much sir

  • @marcellodichiera
    @marcellodichiera 2 years ago +1

    Going back to basics Boss!?! You are amazing !!

  • @A0G7
    @A0G7 1 year ago

    The most important skill that you have is to copy the amazing knowledge you have and paste it smoothly into our understanding. That is what I call mastering Ctrl+C and Ctrl+V.

  • @jiyabyju565
    @jiyabyju565 2 years ago

    These lectures tempt me to search for the next upcoming videos... thank you for all this effort...

  • @tianhuicao3297
    @tianhuicao3297 1 year ago

    Thank you so much, Dhaval! I'm watching your videos to survive my DS classes.

  • @FindingInsights
    @FindingInsights 2 years ago

    What an amazing and easy-to-learn-from video. Thank you.

  • @user-bo2eg8dq9c
    @user-bo2eg8dq9c 2 years ago +7

    "How to become a data scientist/ML engineer"
    Views = 500K
    "Tutorial on ML/DS topics"
    Views = 5-10K
    Sums up how much effort everyone is putting in to become an ML/DS guy 🌝

    • @codebasics
      @codebasics  2 years ago

      Ha ha, this is so true 😊

  • @ghzich017
    @ghzich017 2 years ago +3

    7:39 Most relatable statement I've heard so far

  • @alcryton6515
    @alcryton6515 2 years ago +3

    Even my training institute teaches from your code.
    You are really a great teacher

  • @mbogitechconpts
    @mbogitechconpts 2 years ago

    First time here and I just have to subscribe. Very funny but a good teacher. God bless you.

  • @amandaahringer7466
    @amandaahringer7466 2 years ago

    Awesome explanation!!

  • @pouriab9782
    @pouriab9782 2 years ago +1

    I tested n_neighbors between 1 and 100 using cross-validation and plotted the results (see the sketch after this thread). It looks like the score declines as you increase the number of neighbors (the cool thing is it's roughly linear).
    The highest score was at n_neighbors = 3 with test_size = 0.33, and it was 0.9915.
    P.S.: I have watched all the videos in this series and I've got to say you're amazing, sir. Keep making tutorials because you're the best!

    • @zizoublbs8918
      @zizoublbs8918 2 years ago +1

      Try test_size=0.2, I got 0.9944 😀
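
    A minimal sketch of the K sweep described in the comment above, using 5-fold cross-validation on the digits dataset (the range and cv setting are assumptions):

        import matplotlib.pyplot as plt
        from sklearn.datasets import load_digits
        from sklearn.model_selection import cross_val_score
        from sklearn.neighbors import KNeighborsClassifier

        digits = load_digits()
        k_values = range(1, 101)
        scores = [cross_val_score(KNeighborsClassifier(n_neighbors=k),
                                  digits.data, digits.target, cv=5).mean()
                  for k in k_values]

        plt.plot(k_values, scores)
        plt.xlabel('n_neighbors')
        plt.ylabel('mean CV accuracy')
        plt.show()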

  • @60pluscrazy
    @60pluscrazy 2 years ago +5

    Really an outlier in simplification. Awesome 🙏

  • @user-sy4kk3oh1o
    @user-sy4kk3oh1o 2 months ago

    Very easy and meaningful explanation, Thank You Sir

  • @alidakhil3554
    @alidakhil3554 2 years ago

    You are excellent!!!!

  • @VishnuPriya-tz4ls
    @VishnuPriya-tz4ls 1 month ago

    You're the best. Thank you

  • @shylashreedev2685
    @shylashreedev2685 2 years ago +3

    Exercise result: 99.16% score with k=4. Thank you so much sir, I'm trying to solve all your exercises and it is helping me build my confidence in ML

  • @nazmulhaqueomi8121
    @nazmulhaqueomi8121 8 months ago

    Your last comment is very funny, sir.
    You are an amazing teacher, sir.
    Thanks a lot

  • @anjalinair1763
    @anjalinair1763 1 year ago

    Really informative sir

  • @charmilam920
    @charmilam920 2 years ago +2

    Amazing
    Please do it for other algorithms also

  • @sophiamary2522
    @sophiamary2522 2 years ago

    Very useful video, thanks a lot

  • @paulkornreich9806
    @paulkornreich9806 2 years ago +1

    Plotted scores between 1 and 50 and found the highest accuracy, 99.72%, at k = 7 or k = 8. Like everyone else, it sharply declined after that. Used the same parameters for the data split as in the video.

  • @ramandeepbains862
    @ramandeepbains862 2 years ago

    For k=1 there is an overfitting issue; for k=3 I got the best score of 0.985397 on the exercise digits dataset. Compared to SVM, KNN gave the best accuracy for the digits dataset.
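
    A quick way to reproduce this kind of KNN-vs-SVM comparison on the digits dataset (the split and parameters below are assumptions, not the commenter's exact setup):

        from sklearn.datasets import load_digits
        from sklearn.model_selection import train_test_split
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.svm import SVC

        digits = load_digits()
        X_train, X_test, y_train, y_test = train_test_split(
            digits.data, digits.target, test_size=0.2, random_state=0)

        # Fit both models on the same split and compare test accuracy
        for name, model in [('KNN (k=3)', KNeighborsClassifier(n_neighbors=3)),
                            ('SVM (RBF)', SVC())]:
            model.fit(X_train, y_train)
            print(name, model.score(X_test, y_test))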

  • @datastako156
    @datastako156 2 years ago +4

    I'm gonna use my most important skill "Ctrl-C Ctrl-V" hahaha.. that's funny sir

  • @samrozch8419
    @samrozch8419 10 months ago

    nice lecture sir

  • @narendraparmar1631
    @narendraparmar1631 8 months ago

    Thanks😀

  • @bestcomedyjokes4913
    @bestcomedyjokes4913 1 year ago

    pretty good tutorial for free👍

  • @saravanashanmuganathan4692
    @saravanashanmuganathan4692 2 years ago

    Thank u sir

  • @carolinemoraes3704
    @carolinemoraes3704 1 year ago

    Thankss!

  • @useryaya-r5h
    @useryaya-r5h 7 months ago

    Your computer will start sneezing and it will have a fever 🤣🤣 Amazing content ❤💯

  • @shlokdoshi7162
    @shlokdoshi7162 2 years ago

    If I give the KNN algorithm an input list and it predicts the class of each element, how can I print out only the inputs belonging to a particular class?
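
    One way to do this (a sketch, with an assumed iris-based model standing in for your own fitted classifier and input list) is to predict on the whole list and then filter it with a boolean mask:

        import numpy as np
        from sklearn.datasets import load_iris
        from sklearn.neighbors import KNeighborsClassifier

        iris = load_iris()
        knn = KNeighborsClassifier(n_neighbors=3).fit(iris.data, iris.target)

        X_new = iris.data[:20]                 # stand-in for your own list of inputs
        predictions = knn.predict(X_new)

        target_class = 2                       # the class you want to keep
        print(X_new[predictions == target_class])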

  • @shubhamchauhan6916
    @shubhamchauhan6916 2 years ago

    Sir, can you explain how to replace NaN values with KNN when the dataset has both categorical and continuous data points?
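
    The video itself doesn't cover imputation, but scikit-learn's KNNImputer is the usual tool for KNN-based NaN replacement. It works on numeric values only, so a common pattern (assumed here, not from the video) is to map categorical columns to numeric codes first, impute, then map back:

        import numpy as np
        import pandas as pd
        from sklearn.impute import KNNImputer

        # Toy frame: 'city' is categorical, 'age'/'income' are continuous, with NaNs
        df = pd.DataFrame({'city':   ['NY', 'LA', 'NY', np.nan],
                           'age':    [25,   np.nan, 40,  35],
                           'income': [50.0, 60.0,  np.nan, 65.0]})

        # Map the categorical column to numeric codes (NaN stays NaN)
        df['city'] = df['city'].map({'NY': 0, 'LA': 1})

        imputed = KNNImputer(n_neighbors=2).fit_transform(df)
        df_imputed = pd.DataFrame(imputed, columns=df.columns)
        # Imputed category codes can come out fractional; round them back
        # to the nearest code before mapping to the original labels.
        print(df_imputed)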

  • @rajatsharma7899
    @rajatsharma7899 2 years ago +2

    Sir, I have a background in geophysics and I want to do data science in Canada. Is it possible to connect data science with geophysics?

  • @slainiae
    @slainiae 4 months ago

    Perfect score with n_neighbors = 6.

  • @yamrajoli3834
    @yamrajoli3834 2 years ago

    Hello sir, while making the heatmap with seaborn, I think the x-axis label should be "Truth" and the y-axis label "Predicted", but you did exactly the opposite.
    Is it correct as you did it, or is there a mistake? I'm confused about the confusion matrix interpretation.
    Please reply sir
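
    For reference on this convention: scikit-learn's confusion_matrix puts the true labels on the rows and the predictions on the columns, so a heatmap of that array is conventionally labelled with "Predicted" on the x-axis and "Truth" on the y-axis. A small sketch:

        import matplotlib.pyplot as plt
        import seaborn as sns
        from sklearn.metrics import confusion_matrix

        y_true = [0, 0, 1, 1, 2, 2]
        y_pred = [0, 1, 1, 1, 2, 0]

        cm = confusion_matrix(y_true, y_pred)   # rows = true labels, columns = predictions

        sns.heatmap(cm, annot=True)
        plt.xlabel('Predicted')
        plt.ylabel('Truth')
        plt.show()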

  • @sai_sh
    @sai_sh 1 year ago

    Can we use SVM here (6:45), since the data can easily be separated with a hyperplane?

  • @mikettu
    @mikettu 2 years ago

    Great explanation!!! Using the GridSearchCV method, K=2 was the best value. Continue the great job
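
    A sketch of the GridSearchCV approach mentioned here, sweeping n_neighbors on the digits dataset (the range and cv value are assumptions):

        from sklearn.datasets import load_digits
        from sklearn.model_selection import GridSearchCV
        from sklearn.neighbors import KNeighborsClassifier

        digits = load_digits()

        grid = GridSearchCV(KNeighborsClassifier(),
                            param_grid={'n_neighbors': list(range(1, 31))},
                            cv=5)
        grid.fit(digits.data, digits.target)

        print(grid.best_params_, grid.best_score_)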

  • @ogochukwustanleyikegbo2420
    @ogochukwustanleyikegbo2420 10 months ago

    I got an accuracy of 96.38% with K = 4 after working on the exercise

  • @prathampandey9898
    @prathampandey9898 2 years ago

    Can you create videos on Reinforcement Learning?

  • @haleykwok2501
    @haleykwok2501 2 years ago

    😂sir you are humorous!

  • @pheiroijamprishika6414
    @pheiroijamprishika6414 2 years ago

    Sir, can you post about unsupervised learning, specifically the Boltzmann machine and its types:
    i. Restricted Boltzmann machine
    ii. Deep Boltzmann machine
    iii. Deep belief network

  • @ArulPasupathi
    @ArulPasupathi 2 months ago

    Most important skill: Ctrl+C and Ctrl+V, hence proved in this video... 😝

  • @EK-wq7qt
    @EK-wq7qt 2 years ago

    *Hi, I have already added this comment to the Personal Finance video but I'm adding it again to get an answer*
    *NEED HELP*
    I am currently doing the *Personal Finance* project but I am getting an error, and despite searching a lot I could not resolve it. While transforming the data initially, when I change the date column type from text to Date, it shows 2021 at the end of every date, like 1/18/2021, even though the original data was from 2019 (like Jan-19). Please help me with this, I have been stuck here for a long time.

  • @bhaskarg8438
    @bhaskarg8438 2 years ago +2

    I have one doubt: it's confusion_matrix(Truth, Predicted), but in the plt graph we label it the other way around, with the x-axis as Predicted and the y-axis as Truth...
    Can you please clarify? Thank you 🙏

    • @gouravsapra8668
      @gouravsapra8668 2 years ago

      I don't think it matters... you can do it either way... you may try it yourself.

  • @fraudude3841
    @fraudude3841 1 day ago

    When I tried to predict the score for the exercise it showed AttributeError: 'Flags' object has no attribute 'c_contiguous'. Can anyone help?

  • @shifaabid1425
    @shifaabid1425 2 years ago +1

    Most important skill:
    Ctrl + C and Ctrl + V....

  • @orekisato9145
    @orekisato9145 8 months ago

    Please tell me how to do KNN-based error checking on language, step by step, or please share a link?

  • @talharauf3111
    @talharauf3111 2 years ago

    🤩

  • @tinkhanimphasa91
    @tinkhanimphasa91 1 year ago +1

    😂😂😂 I like the joke at the end of the video. I have tried it and below is my classification report:

                  precision    recall  f1-score   support

               0       1.00      1.00      1.00        26
               1       0.89      1.00      0.94        50
               2       1.00      1.00      1.00        38
               3       1.00      0.93      0.96        28
               4       1.00      0.96      0.98        28
               5       0.96      1.00      0.98        43
               6       1.00      1.00      1.00        32
               7       0.95      1.00      0.98        42
               8       1.00      0.88      0.93        40
               9       1.00      0.94      0.97        33

        accuracy                           0.97       360
       macro avg       0.98      0.97      0.97       360
    weighted avg       0.97      0.97      0.97       360

    and my cm is:

    array([[26,  0,  0,  0,  0,  0,  0,  0,  0,  0],
           [ 0, 50,  0,  0,  0,  0,  0,  0,  0,  0],
           [ 0,  0, 38,  0,  0,  0,  0,  0,  0,  0],
           [ 0,  0,  0, 26,  0,  1,  0,  1,  0,  0],
           [ 0,  0,  0,  0, 27,  0,  0,  1,  0,  0],
           [ 0,  0,  0,  0,  0, 43,  0,  0,  0,  0],
           [ 0,  0,  0,  0,  0,  0, 32,  0,  0,  0],
           [ 0,  0,  0,  0,  0,  0,  0, 42,  0,  0],
           [ 0,  5,  0,  0,  0,  0,  0,  0, 35,  0],
           [ 0,  1,  0,  0,  0,  1,  0,  0,  0, 31]], dtype=int64)

  • @eng.mariamalhussainy687
    @eng.mariamalhussainy687 2 years ago

    Hi sir, thanks a lot for this explanation, you are great at explaining. Excuse me sir, is the test = 30% and the train = 70%?

    • @uchenwodo4603
      @uchenwodo4603 2 years ago

      Yes it is... but you can change it to 80:20 if you wish.

  • @Piyush-yp2po
    @Piyush-yp2po 5 hours ago

    99.16% for 3 neighbors

  • @vishnuviswanathmm
    @vishnuviswanathmm 2 years ago +1

    Is it necessary to split the dataset into training and testing sets for KNN, since KNN is a lazy algorithm?

    • @ShawnDypxz
      @ShawnDypxz 2 years ago +2

      He is just doing it to test the model. It's as if he only had the training data, and then he used the test data as unseen data to figure out where those points fall relative to the training points.

    • @vishnuviswanathmm
      @vishnuviswanathmm 2 years ago

      @@ShawnDypxz Makes sense. Thank you

  • @ManusaiSRKian
    @ManusaiSRKian 11 days ago

    4:56

  • @praveendeena1493
    @praveendeena1493 2 years ago

    Hi sir, I want your videos on Rasa chatbot and sentiment analysis

    • @shobhitbishop
      @shobhitbishop 2 years ago

      I can help you, I have worked on building Rasa-based models as well

    • @praveendeena1493
      @praveendeena1493 2 years ago

      @@shobhitbishop thank you sir.

  • @Survivor-xs9gv
    @Survivor-xs9gv 2 years ago

    I was following this tutorial series but unfortunately it is missing some topics. Any idea how many topics are left?

    • @codebasics
      @codebasics  2 years ago +4

      Yes, I need to cover XGBoost, AdaBoost, bagging, boosting, and PCA. But the majority of the topics are already covered in the series.

  • @dataoverview7388
    @dataoverview7388 2 years ago

    Hi, how many ways are there to find the K value? And is the "elbow" method useful or not for finding K, and if not, why?
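
    The elbow method is usually described for choosing k in K-means clustering, but the same idea — plot a score against K and look for the point where improvement flattens out — is often applied to KNN by plotting the error rate against n_neighbors. A sketch (dataset, split, and range are assumptions):

        import matplotlib.pyplot as plt
        from sklearn.datasets import load_digits
        from sklearn.model_selection import train_test_split
        from sklearn.neighbors import KNeighborsClassifier

        digits = load_digits()
        X_train, X_test, y_train, y_test = train_test_split(
            digits.data, digits.target, test_size=0.2, random_state=0)

        # Error rate on the held-out split for each K
        k_values = range(1, 31)
        error_rate = []
        for k in k_values:
            knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
            error_rate.append(1 - knn.score(X_test, y_test))

        plt.plot(k_values, error_rate, marker='o')
        plt.xlabel('n_neighbors (K)')
        plt.ylabel('error rate')
        plt.show()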

  • @JMS-ht3td
    @JMS-ht3td 1 year ago

    now my computer has a fever

  • @praba8478
    @praba8478 2 years ago +1

    Sir, is it mandatory to learn Excel and statistics for data analyst jobs?

  • @Koalasq119
    @Koalasq119 1 year ago

    Sir, I would die for you. I am paying thousands of dollars for some guy to not teach me a goddamn thing.

  • @nihalwaghmare1337
    @nihalwaghmare1337 2 years ago +1

    Sir, I'm a fresher chemical engineer. Can I build a career in data analytics? I'm so confused. Please help

  • @themoneymaker03
    @themoneymaker03 2 years ago

    Dang I caught the virus lol j/k. Thanks great video! 👍

  • @markvincentgallemit9894
    @markvincentgallemit9894 1 year ago

    test_size = 0.2
    k = 7
    score: 0.9972222222222222

  • @brandonsager223
    @brandonsager223 2 years ago

    My computer got the virus, he wasn't lying

  • @elahehgorgin9769
    @elahehgorgin9769 2 years ago

    Why do people keep calling you sir?

    • @zainnaveed267
      @zainnaveed267 2 years ago

      How is this question even related to ML?

  • @rayhansuryatama909
    @rayhansuryatama909 2 years ago

    Ayo, why is Marc Spector teaching artificial intelligence?

    • @codebasics
      @codebasics  2 years ago

      It is my alter ego teaching ML 😎

  • @curiousMan69
    @curiousMan69 3 months ago

    lol virus 😂

  • @princedoshi1594
    @princedoshi1594 1 year ago

    I think, sir, you need to work on how you speak. It seems you are pretty confused yourself.