Training a machine learning model with scikit-learn

  • Published Dec 24, 2024

COMMENTS • 547

  • @dataschool
    @dataschool  3 years ago +10

    Having problems with the code? I just finished updating the notebooks to use *scikit-learn 0.23* and *Python 3.9* 🎉! You can download the updated notebooks here: github.com/justmarkham/scikit-learn-videos

  • @mightyflamelord
    @mightyflamelord 8 years ago +39

    I appreciate the fact that you speak very slowly and express yourself clearly!

    • @dataschool
      @dataschool  8 years ago +7

      Thanks, I try to make it easy for others to understand me! :)

    • @11folders
      @11folders 4 years ago +1

      I totally agree. I don't have to pause the video as frequently while taking notes.

  • @johnlim640
    @johnlim640 3 years ago +3

    This is hands down the best machine learning tutorial. Definitions and concepts are well explained. THANK YOU SO MUCH!

    • @dataschool
      @dataschool  2 years ago

      Thank you for your kind words! 🙏

  • @Aviel777Gergel
    @Aviel777Gergel 3 years ago +5

    It is so awesome that you combine first-class knowledge with the impressive pronunciation of a professional voice actor. It's super clear! Thank you for the series.

    • @dataschool
      @dataschool  3 years ago

      Wow, thank you! 🙏 I really appreciate your truly kind words!

  • @srivathsgondi191
    @srivathsgondi191 2 years ago +2

    Despite this being an old playlist, it is without a doubt still the best one I have found on YouTube so far...

    • @dataschool
      @dataschool  2 years ago

      Thank you so much!

    • @Guinhulol
      @Guinhulol 10 months ago

      Oh yeah! It doesn't get better than that!

  • @edmarkowitz9873
    @edmarkowitz9873 6 years ago +1

    Your videos are so far superior to the commercial products out there that I just can't believe it. I wish I had found them before dumping a small fortune into the "pay-to-play courses." Thank you for sharing this information, and be sure that I will join the Patreon group.

    • @dataschool
      @dataschool  6 years ago

      Wow! Thank you so much for your kind words! :) I look forward to having you in the Data School Insiders group... you can join here: www.patreon.com/dataschool

  • @gsk1740
    @gsk1740 6 years ago +1

    No words to describe how awesome it is... after watching so many tutorials.

    • @dataschool
      @dataschool  6 years ago +1

      Thanks very much for your kind words!

  • @yechihast
    @yechihast 6 years ago +10

    One of the best online ML tutors I have come across, very well thought out; every minute is informative. AND I do support Kevin's slow speech pace; it makes it much easier to comprehend the complex concepts. Thank you Kevin.

  • @Axle_Max
    @Axle_Max 6 years ago +3

    Your ability to explain this topic in simple terms is remarkable. Thank you so much for these videos.

  • @pierrelaurent8284
    @pierrelaurent8284 8 years ago +1

    It's a real pleasure to follow this series: clear, concise, and so well taught. Being a non-native English speaker, I find it 100% understandable. Bravo!

    • @dataschool
      @dataschool  7 years ago

      Wow, thanks so much for your kind words! I really appreciate it.

  • @Dockmark5
    @Dockmark5 6 years ago +1

    Not just educated, but a talented teacher. Fantastic combination

    • @dataschool
      @dataschool  6 years ago +1

      Thanks so much! I really appreciate it! :)

  • @hasyahaven
    @hasyahaven 5 years ago

    Lost for words! Your explanation is aimed at answering every root question so that one clearly understands. Thanks a lot.

  • @spandanhetfield
    @spandanhetfield 9 years ago +1

    You have done an awesome job. I'm the TA for a course on Bioinformatics and I'll be using your videos to teach my students a short primer on getting started with ML just so that they can shed that fear and get down to work :)

    • @dataschool
      @dataschool  9 years ago

      +Spandan Madan That's awesome! Please let me know how it goes!

  • @kushalmiglani2691
    @kushalmiglani2691 8 years ago +3

    You are doing a very good job. The material and tutorials you are providing for free seriously show your dedication to your work and how much you care for those who can't afford such expensive tutorials. Thanks

    • @dataschool
      @dataschool  8 years ago +1

      Thanks so much for your kind words! I'm really glad the tutorials have been helpful to you!

  • @kritikakamra22
    @kritikakamra22 7 years ago +1

    Thanks for making such lucid videos Kevin! You have no idea how helpful these videos are for a novice like me.

    • @dataschool
      @dataschool  7 years ago

      Excellent! That's very nice to hear!

  • @c00kiemonster247
    @c00kiemonster247 8 years ago +18

    This is literally the best tutorial guide on the internet... thank you so much

    • @dataschool
      @dataschool  8 years ago +3

      Wow! What a kind thing to say... thank you!

    • @aquaman788
      @aquaman788 4 years ago

      Me too!!!!!!

  • @ssagga
    @ssagga 8 years ago +10

    Wow, ML suddenly feels a lot less scary. Can't wait to watch the rest of the series.

    • @dataschool
      @dataschool  8 years ago +1

      Excellent! Here's a link to the entire video series, for others who are interested: ua-cam.com/play/PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A.html

  • @garriedaden4168
    @garriedaden4168 9 years ago +2

    Many thanks for this video series. I really like the way you develop the subject in manageable chunks and focus on what is really needed to master the subject.

    • @dataschool
      @dataschool  9 years ago

      Garrie Daden That is excellent to hear, and is exactly what I was trying to do! Thanks for your thoughtful comment.

  • @atiflatif7233
    @atiflatif7233 7 years ago +1

    Thanks so much for making it so easy to understand. I have watched many videos on machine learning and have never felt so confident in applying the concepts. Well done!

    • @dataschool
      @dataschool  7 years ago

      You are very welcome! I'm glad to hear my video was helpful to you!

  • @aditi-ind
    @aditi-ind 6 years ago +2

    Overwhelming! I have been trying to learn these basics for a long time, and I finally found this video series. Thank you so much for such a clear presentation of such a complex (especially for me) topic.

    • @dataschool
      @dataschool  6 years ago

      You are very welcome! I'm so glad to hear it was helpful to you!

  • @SeaCreature_
    @SeaCreature_ 7 years ago +2

    One of the best channels. Nice to see someone speaking so coherently and educationally, compared to other channels. Great job Kevin.

    • @artemkovera5785
      @artemkovera5785 7 years ago +1

      Totally agree with you. It's a great channel. I just published an e-book about machine learning with clustering algorithms. it's available for free for 5 days. Would you like to get a free copy?

    • @SeaCreature_
      @SeaCreature_ 7 years ago

      of course, thank you Artem

    • @artemkovera5500
      @artemkovera5500 7 years ago

      It's here: www.amazon.com/dp/B076NX6KY7. You would really help me if you left a little review on Amazon.

    • @SeaCreature_
      @SeaCreature_ 7 years ago

      Great, thank you, and will do. I don't use Amazon Kindle, but I'll try to figure out how to get around it. Thanks

    • @artemkovera5785
      @artemkovera5785 7 years ago

      You can easily create an account on Amazon if you don't have one (you don't necessarily need to enter your credit card). After that, you will be able to read free e-books on Amazon website through their cloud reader. I regularly read free e-books available on Amazon, and it's very convenient.

  • @KowsalyaSubramanian
    @KowsalyaSubramanian 8 years ago +1

    Thanks Kevin. I like the videos very much. Wish I had known about this series a month back. I dropped my ML course in this semester because the material was very overwhelming. Very useful videos and the material is presented in a very organized manner. Keep up the good work!

    • @dataschool
      @dataschool  8 years ago

      Thank you so much for your kind comments!

  • @dataschool
    @dataschool  6 years ago +28

    *Note:* This video was recorded using Python 2.7 and scikit-learn 0.16. Recently, I updated the code to use Python 3.6 and scikit-learn 0.19.1. You can download the updated code here: github.com/justmarkham/scikit-learn-videos

    • @terryxie1929
      @terryxie1929 6 years ago +3

      Thanks a lot for your work

    • @dataschool
      @dataschool  6 years ago +2

      You're very welcome!

    • @a.n.7338
      @a.n.7338 5 years ago +1

      Hi, I have trained my model using a NN and the model is saved, so how can I use the model to classify images?

    • @tomparatube6506
      @tomparatube6506 5 years ago

      I'm running the latest Anaconda 1.9.7 Jupyter Notebook server 5.7.8, Python 3.6.5, iPython 7.4. Upon hitting Run for the 1st two lines of code:
      "from IPython.display import IFrame
      IFrame('archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data', width=300, height=200)"
      Jupyter doesn't run and output the columns of numbers like in the video, but asks for "iris.data".
      What should I do? Your Pandas videos have helped so much. I'm enrolled at DataQuest but have been considering enrolling in yours too. Thanks Kevin.

    • @aquaman788
      @aquaman788 4 years ago

      @@dataschool Can we also have a lecture for TensorFlow?

  • @pranjalkumar9378
    @pranjalkumar9378 5 years ago +2

    You choose your words very carefully. Awesome teaching 👏

  • @lightningblade9347
    @lightningblade9347 6 years ago +2

    This is the best machine learning video I've ever watched. It's amazing how you broke a complicated topic like machine learning into small sections accompanied by very, very clear explanations. Thank you very much; I hope you continue, since it's been a long time since you posted a video on UA-cam.

    • @dataschool
      @dataschool  6 years ago +1

      Thanks so much for your kind comments! I really appreciate it :)
      P.S. I published 10 videos last month, and will have more in the future!

    • @lightningblade9347
      @lightningblade9347 6 years ago

      Wow thanks for the update, I'm gonna check them right now. bless you.

  • @rayuduyarlagadda3473
    @rayuduyarlagadda3473 6 years ago +1

    This is the best explanation. I have gone through many videos, but this video helped me a lot to understand better... Thank you Markham.

    • @dataschool
      @dataschool  6 years ago

      You're very welcome! Glad it was helpful to you!

  • @RicardoFerrazLeal
    @RicardoFerrazLeal 9 years ago +1

    Best series of machine learning tutorials out there!

    • @dataschool
      @dataschool  9 years ago

      Ricardo Ferraz Leal Wow, thank you! What a kind compliment. I really appreciate it!

  • @victorekwueme3581
    @victorekwueme3581 8 years ago +1

    Your explanations in your videos are easy to understand and very or should I say extremely helpful. Keep it up....

    • @dataschool
      @dataschool  8 years ago

      +Victor Ekwueme Thanks! I spent a lot of time figuring out how to teach this material in the classroom, and so I thought it was important to spread the knowledge using videos as well :)

  • @tuvantran660
    @tuvantran660 4 years ago +1

    Wow, you're the best teacher I've learned from so far. Easy to understand, and the contents are well explained.

  • @saraths9044
    @saraths9044 3 years ago +2

    Please keep on making videos of the same quality. Thank you so much

  • @shobhitsrivastava4496
    @shobhitsrivastava4496 6 years ago +1

    You are one of the best teachers I have ever been taught by!

  • @flamboyantperson5936
    @flamboyantperson5936 7 years ago +2

    Step by step explanation in a clear way. Just love it. Thank you so much.

  • @adityarajora7219
    @adityarajora7219 6 years ago

    Love your Speed and Clarity man.

  • @colmorourke4657
    @colmorourke4657 4 years ago +1

    Simply outstanding work. It's highly structured and clearly explained. I also greatly appreciate the excellent references you link for various sections.

    • @dataschool
      @dataschool  4 years ago

      Thank you so much for your kind words!

  • @dishonfano7599
    @dishonfano7599 5 years ago +1

    My friend...Thanks a lot..This is the best introduction to machine learning I have ever come across...Please do a deep learning tutorial...Again thanks a lot.

    • @dataschool
      @dataschool  5 years ago

      Thanks very much for your kind words! I really appreciate it.

  • @RohitShukla-mm3gz
    @RohitShukla-mm3gz 4 years ago +2

    Such an amazing video. I have never seen such a clear video. I understood many things. Thanks a lot; please make a video on unsupervised learning also.

    • @dataschool
      @dataschool  4 years ago

      Thanks for your suggestion!

  • @guptaachin
    @guptaachin 6 years ago +1

    You are the best Kevin. I always find the most relevant stuff in your videos.

  • @brunofazoli1
    @brunofazoli1 7 years ago +1

    Amazing explanation! I'm so excited to finish the series! Congrats!

    • @dataschool
      @dataschool  7 years ago

      Thanks very much! Glad you are enjoying the series :)

  • @robindong3802
    @robindong3802 7 years ago +1

    you are one of the best instructors online, thank you so much.

    • @dataschool
      @dataschool  7 years ago

      Wow, thanks so much for your kind comment! :)

  • @satyakiguha415
    @satyakiguha415 9 years ago +1

    finding these tutorials very interesting.....do continue putting them up...thanks a lot

    • @dataschool
      @dataschool  9 years ago

      ***** You're very welcome!

  • @sneharane2596
    @sneharane2596 4 years ago +1

    Very well explained, you are a great teacher! Loving this series !

  • @ankrish8692
    @ankrish8692 6 years ago

    This is the best speed to make a beginner understand the terminology one by one... I really appreciate it and am thankful to you for this video. I have seen some videos where I was not able to get what they were saying because of the speed... thanks!!!

  • @ralfmatulat
    @ralfmatulat 8 years ago +1

    This whole series is helpful and fun to watch. Thanks!

    • @dataschool
      @dataschool  8 years ago

      That's excellent to hear. Thanks for watching!

  • @Abhay17291
    @Abhay17291 7 years ago +2

    Thank you for all these videos, Kevin! Very clear and easily understandable.

    • @dataschool
      @dataschool  7 years ago +1

      You're very welcome! :)

  • @tomasemilio
    @tomasemilio 8 years ago +57

    This is great man, I am watching this in x2 speed, haha.

  • @paolosalamon
    @paolosalamon 7 years ago

    One of the best tutorials I have ever seen. I love your speech also.

    • @dataschool
      @dataschool  7 years ago

      Thanks! :)

    • @paolosalamon
      @paolosalamon 7 years ago

      Hi. Are you going to make some new paid course?

    • @dataschool
      @dataschool  7 years ago

      I am continuing to work on both free content and paid content. Stay tuned!

  • @avadhanulasairamaraviteja9271
    @avadhanulasairamaraviteja9271 6 years ago +4

    You're just awesome...best videos in recent times...like your way of explanation and please do continue teaching and sharing your knowledge...peace..

    • @dataschool
      @dataschool  6 years ago

      Thanks so much for your kind words! :)

  • @alitanwir3372
    @alitanwir3372 7 years ago +5

    Kevin, you're a great teacher; your explanations are top notch! Subbed to the channel and the newsletter! Thanks a lot! :)

    • @dataschool
      @dataschool  7 years ago +1

      Wow, thanks so much! Great to hear :)

  • @hemenboro4313
    @hemenboro4313 4 years ago +1

    It's a pretty clear and precise explanation. Thanks for making such videos and keeping us educated, @data school

  • @talkingaboutitinaeasyway5067
    @talkingaboutitinaeasyway5067 5 years ago +1

    Thank you very much. Your videos really help me understand ML deeply.

    • @dataschool
      @dataschool  4 years ago

      That's great to hear! 🙌

  • @victoreirekponor6052
    @victoreirekponor6052 7 years ago

    Mr. Kevin, I really appreciate these tutorials. I hope to become as good as you are some day.

  • @tgbaozkn
    @tgbaozkn 4 years ago +1

    Your pronunciation is awesome; I really understand because of you. Thanks a lot, teacher!

  • @srinidhibandi2313
    @srinidhibandi2313 1 year ago

    It is because of these guys we are able to learn Machine Learning concepts so clearly and easily🎉🎉❤❤

  • @MJ-em_jay
    @MJ-em_jay 8 years ago +6

    Very clear and easy to follow. Thanks!

    • @dataschool
      @dataschool  8 years ago

      Excellent! You're very welcome!

  • @arjunpukale3310
    @arjunpukale3310 5 years ago

    Thank you very much. I wanted to start off with ML and tried many tutorials, but all of them were very fast. You explained each line very nicely.

  • @JoaoVitorBRgomes
    @JoaoVitorBRgomes 3 years ago

    Kevin, you said you don't know how well your model does on new data, but when you test your model with predict on the test data, I think it is standard to evaluate the accuracy (or any other metric) of your model.

    • @dataschool
      @dataschool  3 years ago

      To be clear, if we are talking about truly "new" data, meaning out-of-sample data, then you actually don't know the true target values, and thus there's no way to check how accurate your model was with those samples. Hope that helps!

    • @JoaoVitorBRgomes
      @JoaoVitorBRgomes 3 years ago

      @@dataschool Ah ok, thanks for elaborating. Yes, indeed, e.g. a new client asking for a loan (default or not).

  • @JoeG2324
    @JoeG2324 7 years ago +31

    why would anyone dislike this video?

    • @dataschool
      @dataschool  7 years ago +12

      Ha! I ask myself that same question :)

    • @rmehdi5871
      @rmehdi5871 7 years ago +5

      probably some other e-learn teaching competitors :)

    • @TheAlderFalder
      @TheAlderFalder 5 years ago

      They hate-im, cus they ain't-im. ;)

    • @bardamu9662
      @bardamu9662 5 years ago +1

      @@dataschool Certainly because they are not using the appropriate classifier for high-end teaching videos :-) Congrats for your video series: very instructive, clearly articulated (pondering theory and examples) and with perfect emphasis in critical points. As a well-known French philosopher used to say: "whatever is well conceived is clearly said and the words to say it flow with ease". Bravo Kevin! Very talented teacher!

    • @aquaman788
      @aquaman788 4 years ago +1

      @@bardamu9662 very good ML lecture!!!!

  •  2 years ago

    ¡Gracias!

    • @dataschool
      @dataschool  2 years ago

      Wow, thank you so much Luis! I truly appreciate it! 🙏

  • @udaymallam43
    @udaymallam43 6 years ago +1

    Great explanation, simple & effective, Big Thank you for the videos

  • @thisisgurkaran
    @thisisgurkaran 6 years ago

    Great video man. You are hero of humanity.

  • @sandeepgautam2465
    @sandeepgautam2465 5 years ago +23

    It only worked when I used two square brackets: knn.predict([[3,5,4,2]])

    • @dataschool
      @dataschool  5 years ago +5

      Right. See here for an explanation: www.dataschool.io/how-to-update-your-scikit-learn-code-for-2018/#only2ddataarrayscanbepassedtomodels

    • @hfsbhat
      @hfsbhat 4 years ago

      Thanks Sandeep

    • @yuvaraj2457
      @yuvaraj2457 3 years ago

      op = [[1.77, 2.55],]
      linreg.predict(op)
      This also works. It expects a 2D array, but I don't know why. Adding a comma after the list makes sense.
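
    A minimal, runnable sketch of the two-bracket point discussed in this thread (an illustration only; it assumes scikit-learn 0.19+ and refits a KNN model on the iris data, mirroring the video):
      import numpy as np
      from sklearn.datasets import load_iris
      from sklearn.neighbors import KNeighborsClassifier

      # fit a KNN classifier on the iris data, as in the video
      iris = load_iris()
      knn = KNeighborsClassifier(n_neighbors=5)
      knn.fit(iris.data, iris.target)

      # estimators expect a 2D array of shape (n_samples, n_features),
      # so a single sample with 4 features must have shape (1, 4)
      x_new = np.array([3, 5, 4, 2]).reshape(1, -1)
      print(knn.predict(x_new))  # same result as knn.predict([[3, 5, 4, 2]])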

  • @rishabbamrara5072
    @rishabbamrara5072 6 years ago

    Very very good and easy to learn lectures. Thank you..

  • @omarnassor5259
    @omarnassor5259 8 years ago

    very simple and straight forward, thank you data school.

  • @aamirkhan7201
    @aamirkhan7201 1 year ago +1

    00:04 Introduction to K Nearest Neighbors (KNN) classification model
    02:13 K-nearest neighbors classification model works by selecting the nearest observations and using the most popular response value.
    04:44 KNN is a simple machine learning model that predicts the response value based on the nearest neighbor.
    07:26 The first step is to import the relevant class and the second step is to instantiate the estimator.
    10:11 Training a machine learning model with scikit-learn
    12:44 The predict method returns a numpy array with the predicted response value
    15:05 Different models can be easily trained using scikit-learn
    17:22 Understanding nearest neighbor algorithms and class documentation
    Crafted by Merlin AI.
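
    For reference, a minimal sketch of the four-step workflow outlined in the chapter list above (a rough reconstruction using the iris data and the KNN model from the video, not a verbatim copy of the notebook):
      from sklearn.datasets import load_iris
      from sklearn.neighbors import KNeighborsClassifier

      # step 1: load the feature matrix X and the response vector y
      iris = load_iris()
      X, y = iris.data, iris.target

      # step 2: import and instantiate the estimator with its tuning parameters
      knn = KNeighborsClassifier(n_neighbors=5)

      # step 3: fit the model to the training data
      knn.fit(X, y)

      # step 4: predict the response for a new, out-of-sample observation
      print(knn.predict([[3, 5, 4, 2]]))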

  • @taor412
    @taor412 9 years ago +1

    Thank you a lot for making such a great video for beginners!
    Many thanks!

    • @dataschool
      @dataschool  9 years ago

      +TA OR You're very welcome!

  • @levon9
    @levon9 4 years ago +1

    Very helpful and clear - thank you, including the updated notebooks.
    Toward the end (t=15:40), using the logreg.fit(X, y) function results in a "/usr/local/lib/python3.6/dist-packages/sklearn/linear_model/_logistic.py:940: ConvergenceWarning: lbfgs failed to converge (status=1):
    STOP: TOTAL NO. of ITERATIONS REACHED LIMIT." warning and a result of [0, 0] even with the updated code. Changing max_iter to 500 gets rid of the warning, but it still ends up with [0, 0] rather than [2, 0] as shown in the video. Any suggestions? I'm using Colab notebooks.

    • @dataschool
      @dataschool  3 years ago +1

      The default solver for LogisticRegression has changed from liblinear to lbfgs. If you change it back to liblinear, it will converge. Try: logreg = LogisticRegression(solver='liblinear') before fitting. Hope that helps!
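
    A minimal sketch of the solver fix suggested in the reply above (an illustration only, assuming scikit-learn 0.22+ where the default solver is 'lbfgs'):
      from sklearn.datasets import load_iris
      from sklearn.linear_model import LogisticRegression

      iris = load_iris()
      X, y = iris.data, iris.target

      # newer releases default to 'lbfgs', which may not converge within the
      # default max_iter on this data; 'liblinear' matches the older behavior
      logreg = LogisticRegression(solver='liblinear')
      logreg.fit(X, y)
      print(logreg.predict([[3, 5, 4, 2], [5, 4, 3, 2]]))  # one prediction per row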

  • @khanhdo3988
    @khanhdo3988 8 years ago +1

    Keep up the amazing work!

  • @harshitsharma589
    @harshitsharma589 3 years ago

    Thank you, thank you... I was having some doubts about the concepts and now they're cleared up. I request you to please make a video on data normalization.

  • @sravankumar5017
    @sravankumar5017 4 years ago

    Thank you sir for your great explanation; it gives us confidence that we can learn ML.

  • @transmatter99
    @transmatter99 8 years ago

    You're very helpful and intelligent. thank you for these very polished videos.

  • @oguzcanyavuz8069
    @oguzcanyavuz8069 8 years ago +11

    Hello again, your tutorials are awesome. I have an error here:
    In [8]:knn.predict([3, 5, 4, 2])
    /usr/local/lib/python3.5/dist-packages/sklearn/utils/validation.py:386:
    DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and
    willraise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1)
    if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
    DeprecationWarning)
    Out[8]:array([2])
    But I still get the correct output. I think it is related to NumPy. Where should I use NumPy, and how exactly? Or should I just ignore it?

    • @dataschool
      @dataschool  8 years ago +33

      You bring up a great point! It's a long explanation:
      The 0.17 release of scikit-learn included the following change: "Passing 1D data arrays as input to estimators is now deprecated as it caused confusion in how the array elements should be interpreted as features or as samples. All data arrays are now expected to be explicitly shaped (n_samples, n_features)." Here's what that means:
      When you pass data to a model (an "estimator" in scikit-learn terminology), it is now required that you pass it as a 2D array, in which the number of rows is the number of observations ("samples"), and the number of columns is the number of features. In this example, I make a prediction by passing a Python list to the predict method: knn.predict([3, 5, 4, 2]). The problem is that the list gets converted to a NumPy array of shape (4,), which is a 1D array. Because I wanted scikit-learn to interpret this as 1 sample with 4 features, it now requires a NumPy array of shape (1, 4), which is a 2D array. There are three separate ways to fix this:
      1. Explicitly change the shape to (1, 4):
      import numpy as np
      X_new = np.reshape([3, 5, 4, 2], (1, 4))
      knn.predict(X_new)
      2. Tell NumPy that you want the first dimension to be 1, and have it infer the shape of the second dimension to be 4:
      import numpy as np
      X_new = np.reshape([3, 5, 4, 2], (1, -1))
      knn.predict(X_new)
      3. Pass a list of lists (instead of just a list) to the predict method, which will get interpreted as having shape (1, 4):
      X_new = [[3, 5, 4, 2]]
      knn.predict(X_new)
      Solution #2 is scikit-learn's suggested solution. Solution #3 is the simplest, but also the least clear to someone reading the code.

    • @oguzcanyavuz8069
      @oguzcanyavuz8069 8 years ago

      Thanks for the explanation. I think I will use the second option. Which one are you using? :)

    • @dataschool
      @dataschool  8 years ago +1

      If I'm writing code for myself, I use option #3, otherwise I use option #2.

    • @carlomott
      @carlomott 8 years ago

      Hi, thanks for these videos, they are amazing!
      One thing I noticed: it turns out that with the 0.17 release, if you just type
      >>knn.predict(X_new)
      nothing will be output.
      My workaround is to type
      >>print knn.predict(X_new)
      >>[2]
      But I am not sure it is the best solution...

    • @dataschool
      @dataschool  8 years ago

      Glad the videos are helpful to you!
      I'm using scikit-learn 0.17, and I'm not seeing the behavior you are describing. Are you sure you're running exactly the same code, in exactly the same order I'm running it?

  • @vishwass9491
    @vishwass9491 8 years ago

    Clear-cut and to the point. Thanks.

    • @dataschool
      @dataschool  8 years ago

      +vishwas s You're welcome!

  • @Bena_Gold
    @Bena_Gold 6 years ago

    Just to clarify ... it's a "(n_samples, n_features) matrix" ... not a "feature matrix" as you simply put ... great video ... thumbs up ...

    • @dataschool
      @dataschool  6 years ago +1

      The scikit-learn documentation refers to it as a "feature matrix", thus I do as well. Calling it a "feature matrix" indicates that it's made up of features, and it's 2-dimensional, and it's implied that the other dimension is the samples.

  • @alialsaady5
    @alialsaady5 6 years ago

    Hi, thank you for your explanation, it's very clear. But there is something I don't really understand. At 4:00 you say that the data is represented by 2 numerical features, so you have two axes, X and Y. But what if there were more features, like in the iris dataset? How does KNN work in that case? Does it take the same steps as you explain in this video, just in a 4D space instead of a 2D graph? (See the sketch after this thread.)

    • @dataschool
      @dataschool  6 years ago

      I don't know how to explain this briefly, I'm sorry!

    • @alialsaady5
      @alialsaady5 6 years ago

      @@dataschool That is very unfortunate to hear. Is it possible for us to make an appointment via Skype? I do not get it and need this information for my thesis. I hope you can help me with it.

    • @dataschool
      @dataschool  6 years ago

      I don't work with anyone one-on-one, I'm sorry! However, you are welcome to join Data School Insiders and ask a question during a live webcast or on the private forum: www.patreon.com/dataschool - I prioritize answering questions from Insiders because they are investing in me.
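
    A minimal sketch (not from the video) for the question above about datasets with more than two features: the neighbor search works the same way, only the distances are computed across all four iris features:
      from sklearn.datasets import load_iris
      from sklearn.neighbors import NearestNeighbors

      iris = load_iris()
      X = iris.data  # 4 features per observation

      # the same steps as in the 2D picture, except that the distances are
      # measured in 4-dimensional feature space (Euclidean by default)
      nn = NearestNeighbors(n_neighbors=5)
      nn.fit(X)
      distances, indices = nn.kneighbors([[3, 5, 4, 2]])
      print(indices)    # row numbers of the 5 closest training observations
      print(distances)  # their distances in 4D feature space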

  • @HarshaXSoaD
    @HarshaXSoaD 8 years ago

    Very effective tutorial series

    • @dataschool
      @dataschool  8 years ago

      +HarshaXSoaD Thanks! Glad it's helpful to you.

  • @MrMikeWyn
    @MrMikeWyn 8 years ago

    Great series of videos. Thanks.

    • @dataschool
      @dataschool  8 years ago

      You're welcome! Glad you are enjoying them :)

  • @yashgos
    @yashgos 6 years ago +1

    Thanks for another great video. I have one conceptual question about implementing logistic regression as shown in the video. What I understand is that logistic regression is used where the outcome is binary (for example, A or B). In the iris dataset the outcome can be one of 3 categories, so how does logistic regression work here?

    • @dataschool
      @dataschool  6 years ago

      That's really a mathematical question rather than a conceptual question. Logistic regression can be used for multiclass problems, and a few details are here: scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
      You can learn more about this topic by searching for multinomial logistic regression.
      Hope that helps!
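
    A minimal sketch (not from the video) of the multiclass behavior described in the reply above, assuming a recent scikit-learn release:
      from sklearn.datasets import load_iris
      from sklearn.linear_model import LogisticRegression

      iris = load_iris()
      X, y = iris.data, iris.target  # y contains three classes: 0, 1, 2

      # scikit-learn handles the multiclass case automatically
      # (one-vs-rest or multinomial, depending on the solver)
      logreg = LogisticRegression(max_iter=1000)
      logreg.fit(X, y)

      print(logreg.predict([[3, 5, 4, 2]]))        # one predicted class
      print(logreg.predict_proba([[3, 5, 4, 2]]))  # one probability per class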

  • @dheerajsonawane6738
    @dheerajsonawane6738 7 years ago

    Great videos, great talent, and an easy method of teaching, thanks!

  • @pondapelagechandima8186
    @pondapelagechandima8186 6 years ago

    Hello, thanks for explaining these topics in a clean manner. Could you please do an explanation of categorical and numerical NaN value imputation? (How to handle NaN values in machine learning?)

    • @dataschool
      @dataschool  6 years ago

      Thanks for your suggestion!

  • @adiflorense1477
    @adiflorense1477 4 years ago

    6:01 Sir, is the white area what is called an outlier / noise?

  • @hectoralvarorojas1918
    @hectoralvarorojas1918 7 years ago

    This video series is the best I have watched about scikit-learn so far.
    By the time I finish watching all the videos, I will let you know my comments.
    At this time I am just wondering if you have in mind doing something similar but based on the R-project platform. I mean, going over the principal supervised and unsupervised machine learning methods but in R-project. Are you planning to do this?

    • @dataschool
      @dataschool  7 years ago

      Glad you like the videos! To answer your question, I'm not planning on making any more videos about R at this time.

    • @hectoralvarorojas1918
      @hectoralvarorojas1918 7 years ago

      How about other machine learning models using python scikit-learn like tree models, cluster models, SVM model and Neural networks?

    • @dataschool
      @dataschool  7 years ago +1

      I do plan on covering more machine learning in Python in the future! :)

    • @hectoralvarorojas1918
      @hectoralvarorojas1918 7 years ago

      Great!
      I hope you can cover other algorithms like SVM, Decision Trees, Random Forests, and Discriminant Analysis (DA).
      At the same time, when we use linear regression, logistic regression, and DA, for instance, we sometimes need to tune our models to check whether they are in line with the assumptions. It would be nice if you could consider those topics in the next video series about machine learning and scikit-learn. How can we get the output graphs for all the models too? For instance: tree graphs, ROC curve graphs, etc. I am already working on it by myself (a lot of googling and reading work so far), but it would be great to have this input from you too.
      You are a great teacher, very precise and direct. Besides, you give very good additional support in the notes that follow each video. So, it is very easy to follow your examples and recommendations.
      I hope you can get the new video series done soon.
      My best regards!

    • @dataschool
      @dataschool  7 years ago +1

      Thanks so much for your detailed suggestions, and your kind comments! I really appreciate it and will certainly consider your suggestions.

  • @NazarTropanets
    @NazarTropanets 6 years ago

    You are great teacher! Thank you very much!

  • @marco.nascimento
    @marco.nascimento 5 years ago +1

    Loving the series!

  • @khushalvyas5633
    @khushalvyas5633 5 years ago

    You are awesome! Thanks to you for making these videos.

  • @GenzoVandervelden
    @GenzoVandervelden 8 years ago

    These videos are really great, thanks!

  • @jbowater
    @jbowater 8 years ago +1

    Really enjoying this series! Thanks for creating it. Do you know where I might find the code for making one of the lovely classification maps you show at e.g. 4:15?

    • @dataschool
      @dataschool  8 years ago

      I'm sorry, I don't know how those classification maps were made! If you find a way, feel free to let me know :)

    • @jbowater
      @jbowater 8 years ago

      OK, will do! Thanks for letting me know!

    • @jbowater
      @jbowater 8 years ago

      This solution: scikit-learn.org/stable/auto_examples/neighbors/plot_classification.html
      works straight out of the box for the iris data, though sadly I'm struggling to adapt it to my dataset.

    • @dataschool
      @dataschool  8 years ago

      Thanks for sharing! Keep in mind that this technique will only work with two features.
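
    For anyone looking for a starting point, a rough sketch along the lines of the scikit-learn example linked above; it uses only the first two iris features, assumes matplotlib is installed, and is not the exact code behind the video's figures:
      import numpy as np
      import matplotlib.pyplot as plt
      from sklearn.datasets import load_iris
      from sklearn.neighbors import KNeighborsClassifier

      # use only the first two features so the decision regions can be drawn in 2D
      iris = load_iris()
      X, y = iris.data[:, :2], iris.target

      knn = KNeighborsClassifier(n_neighbors=15)
      knn.fit(X, y)

      # predict the class for every point on a grid covering the feature space
      x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
      y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
      xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.02),
                           np.arange(y_min, y_max, 0.02))
      Z = knn.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

      # colored regions show the predicted class; dots show the training data
      plt.contourf(xx, yy, Z, alpha=0.3)
      plt.scatter(X[:, 0], X[:, 1], c=y, edgecolor='k')
      plt.xlabel(iris.feature_names[0])
      plt.ylabel(iris.feature_names[1])
      plt.show()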

  • @sjljc2019
    @sjljc2019 4 years ago +1

    Hi, I am getting an error saying "STOP: TOTAL NO. of ITERATIONS REACHED LIMIT." and the predictions are coming out as array([0, 0]). Any help would be appreciated.

    • @alenaosipova4660
      @alenaosipova4660 4 years ago +3

      try logreg = LogisticRegression(solver = 'liblinear') instead

    • @thiennguyen9186
      @thiennguyen9186 4 years ago

      @@alenaosipova4660 Thanks a lot, that helps.

    • @dataschool
      @dataschool  4 years ago

      Exactly!

  • @eldert1735
    @eldert1735 5 years ago

    Thank you for the video. It is really well explained. I got several FutureWarnings with LogisticRegression. When I used the fit method, it said that the default solver would change to 'lbfgs' and that I should specify a solver. Also, I got a warning that the default multi_class is going to change to 'auto' and that I have to specify multi_class myself. Even after I specify these two, I get a ConvergenceWarning claiming that lbfgs failed to converge. I am new to machine learning, and I don't know what to do. Can you please tell me what I can do to resolve these warnings?

    • @dataschool
      @dataschool  5 years ago

      You're doing the right thing! I would just try a different solver. Sorry, I know these warnings can be hard to understand.

  • @dimiro1
    @dimiro1 9 years ago

    Very good explanation. Thank you.

    • @dataschool
      @dataschool  9 years ago

      ***** You're very welcome!

  • @jerinjohn9518
    @jerinjohn9518 7 years ago

    Clear and easy to understand. Thanks Kevin.

  • @sebastianli5547
    @sebastianli5547 6 years ago

    This totally saved me. I love you so much.

    • @dataschool
      @dataschool  6 years ago

      Ha! You are very welcome :)

  • @danniecutts6221
    @danniecutts6221 10 months ago

    Great communicator! Thanks!

  • @laurentvw
    @laurentvw 9 years ago +1

    Thanks for the video! I noticed you have code completion in your notebook, though I'm not seeing it in my notebook (I installed the latest version of Jupyter). Does it require some kind of plugin? Looking forward to the next lesson!

    • @dataschool
      @dataschool  9 years ago +2

      Laurent Van Winckel You're welcome! It looks like tab completion requires the readline library. More information is here: ipython.org/ipython-doc/stable/interactive/tutorial.html
      Let me know if you are able to get it working!

    • @laurentvw
      @laurentvw 9 years ago

      Data School Ah the tab key! It's working. Thank you!

    • @dataschool
      @dataschool  9 years ago

      Laurent Van Winckel Ah, I should have mentioned that I was hitting the Tab key! :)

  • @NirajKumar-hq2rj
    @NirajKumar-hq2rj 8 years ago

    Thanks for this great work; I have a question. I am using the Anaconda distribution. While coding in the notebook, how would I get the pop-up list of objects that you get, e.g. when you type "from sklearn." it lists the methods/properties and you select the one you need (in this case you selected "neighbor")? I am not getting that option.

    • @dataschool
      @dataschool  8 years ago

      I am hitting the Tab key to autocomplete. Does that work for you?

    • @NirajKumar-hq2rj
      @NirajKumar-hq2rj 8 years ago

      Oh Yes, Thanks Sir:) My mind froze

    • @dataschool
      @dataschool  8 years ago

      No problem! :)

  • @alanjoseph3190
    @alanjoseph3190 5 years ago +1

    Thank you sir for your video on machine learning. Sir, I got an error saying "expected 2D array, got 1D array instead" when using knn.predict.
    NB: I got the output when it is knn.predict([[3,5,4,2]]) - two [ are used.

    • @salmafrikha7228
      @salmafrikha7228 5 years ago +1

      The same problem for me; if you have resolved this problem, please let me know.

    • @alanjoseph3190
      @alanjoseph3190 5 years ago

      @@salmafrikha7228 Use 2 brackets as given in my comment. It worked for me, but I don't know how he got it to work. Maybe he used NumPy.

    • @dataschool
      @dataschool  5 years ago

      See this blog post for a detailed explanation: www.dataschool.io/how-to-update-your-scikit-learn-code-for-2018/#only2ddataarrayscanbepassedtomodels

  • @houssammetni3319
    @houssammetni3319 4 years ago +1

    When I try the Logistic Regression part and predict (Logreg.predict), I get the wrong array [0,0] and a convergence error. What should I do? I tried changing the max_iter number/verbose... but I have no idea if that is where the problem comes from.

    • @dataschool
      @dataschool  4 years ago +2

      The default solver changed in version 0.22, so you can try solver='liblinear' instead.

  • @bijayamanandhar3890
    @bijayamanandhar3890 3 years ago

    Please explain why you selected KNeighborsClassifier in this case. Why not other classifiers?

    • @dataschool
      @dataschool  3 years ago

      It was just for teaching purposes.

  • @bevansmith3210
    @bevansmith3210 7 years ago

    Thanks Kevin, these are really great!

  • @elilavi7514
    @elilavi7514 9 years ago

    Thanks for the material! One question: in the video you try to solve a classification problem with a regression model. If I recall correctly from the previous video, regression models are good for regression problems and classification models for classification problems. Are there any criteria for when I can confidently choose a regression model to solve a classification problem?

    • @dataschool
      @dataschool  9 years ago +1

      Eli Lavi Actually, "logistic regression" is a classification model (not a regression model), despite its name! That's why I used logistic regression in this case.
      There are some limited circumstances in which regression models can be used to solve classification problems, but it usually doesn't make sense. I wouldn't worry about it for now... that's a very advanced technique.
      Does that answer your question?

  • @SreedharRPingili
    @SreedharRPingili 5 years ago

    Hi Mark,
    How do you prepare the data for prediction? If the training data has one empty column value in a row, how do you replace this value? Can I use KNN for this purpose?

    • @dataschool
      @dataschool  5 years ago

      There's no one answer for this. Yes, you could use KNN, or other imputation methods, or you could drop the row.
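
    A minimal sketch of the KNN-based imputation option mentioned in the reply above (made-up toy data, assuming scikit-learn 0.22+ where KNNImputer is available):
      import numpy as np
      from sklearn.impute import KNNImputer  # added in scikit-learn 0.22

      # toy data with missing values marked as np.nan
      X = np.array([[1.0, 2.0, np.nan],
                    [3.0, 4.0, 3.0],
                    [np.nan, 6.0, 5.0],
                    [8.0, 8.0, 7.0]])

      # each missing entry is filled with the mean of that feature
      # across the n_neighbors closest complete rows
      imputer = KNNImputer(n_neighbors=2)
      print(imputer.fit_transform(X))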