Logistic Regression Explained - Data Pre-Processing, Feature Selection and Interpretation - Part 2

  • Published 7 Nov 2024

COMMENTS • 59

  • @YiannisPi
    @YiannisPi 4 years ago +13

    Hey everyone! Did you like this tutorial? Please let me know your thoughts below!

    • @deepaksurya776
      @deepaksurya776 4 years ago

      Great tutorial. How do we do feature selection on imbalanced data? Should we do it on the original (imbalanced) data or on oversampled data (SMOTE)? Which is better?

    • @YiannisPi
      @YiannisPi 4 years ago

      @@deepaksurya776 Hey Deepak, glad you liked the tutorial. There is no right or wrong answer here; it depends. The approach I follow is to run it on both, see if the results differ, and try to understand why.

    • @deepaksurya776
      @deepaksurya776 4 years ago

      Thanks
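
The "run it on both" approach suggested above can be sketched roughly as follows. This is an illustrative example on synthetic data; it uses simple random oversampling via scikit-learn's `resample` as a stand-in for SMOTE, and all variable names are made up:

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.utils import resample

# Synthetic imbalanced dataset, roughly 10% positives
X, y = make_classification(n_samples=1000, n_features=5, n_informative=3,
                           weights=[0.9, 0.1], random_state=42)
cols = [f"f{i}" for i in range(X.shape[1])]
df = pd.DataFrame(X, columns=cols)
df["target"] = y

# 1) Coefficients fitted on the original (imbalanced) data
model_orig = LogisticRegression(max_iter=1000).fit(df[cols], df["target"])

# 2) Random oversampling of the minority class (simple stand-in for SMOTE)
minority = df[df["target"] == 1]
majority = df[df["target"] == 0]
minority_up = resample(minority, replace=True, n_samples=len(majority),
                       random_state=42)
df_bal = pd.concat([majority, minority_up])
model_bal = LogisticRegression(max_iter=1000).fit(df_bal[cols], df_bal["target"])

# Compare the feature rankings implied by the two fits
rank_orig = np.argsort(-np.abs(model_orig.coef_[0]))
rank_bal = np.argsort(-np.abs(model_bal.coef_[0]))
print("ranking on original data:   ", rank_orig)
print("ranking on oversampled data:", rank_bal)
```

If the two rankings agree, the imbalance probably isn't distorting the selection; if they differ, that is the signal to dig into why.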

    • @joe8ones
      @joe8ones 4 years ago +1

      Can all these steps also be used when one wants to become a machine learning engineer?

    • @YiannisPi
      @YiannisPi 4 years ago +1

      @@joe8ones Yes, of course!

  • @21Gannu
    @21Gannu 4 years ago +2

    You showed me some steps that I had never thought of doing. Thank you for your input. Hope to see more.

    • @YiannisPi
      @YiannisPi 4 years ago

      Glad you liked the video! Will keep them coming!

  • @mvince4030
    @mvince4030 3 years ago +1

    Loved it. I learned from it because it is a full project from start to finish. Thank you for sharing your knowledge.

  • @victorreloadedCHANEL
    @victorreloadedCHANEL 3 years ago +7

    This is a damn useful tutorial!! Thanks for it :D
    It is very well explained, in my opinion.

  • @joebox9999
    @joebox9999 4 years ago +1

    I am amazed by your explanations. There are several good ones, but yours are the best tutorials so far. I also like how your tutorial is very detailed in terms of steps, and how you use the Sklearn manual to find the code for different methods. Thank you

    • @YiannisPi
      @YiannisPi 4 years ago +1

      Thank you very much!! More to come in the coming weeks!

  • @ayencoscolfield3312
    @ayencoscolfield3312 4 years ago

    This guy is awesome. To think I had not come across your videos until now! I am really enjoying your content. Well done and good job

  • @jongcheulkim7284
    @jongcheulkim7284 3 years ago

    Thank you so much. This is great. I like your video because it is very practical and comprehensive so I can understand the whole process. Thank you again.

  • @elmehdiouafiq
    @elmehdiouafiq 2 years ago +1

    Feature Selection at 11:30

  • @victordias8899
    @victordias8899 4 years ago +1

    This is good stuff. Congrats on the video, and keep up the great work!

    • @YiannisPi
      @YiannisPi 4 years ago

      Thanks Victor! Glad you liked it!

  • @umutaltun9049
    @umutaltun9049 3 years ago

    Incredible tutorial! Please keep up the good work.

  • @subhrajitdasgupta3868
    @subhrajitdasgupta3868 3 years ago

    Liked and subscribed, because you put good effort into this video to teach us how to do this in Python step by step, instead of merely jumping to the model-fitting part.

  • @weihongxu989
    @weihongxu989 4 years ago

    Great video on the steps of building a logistic regression! I just wish you could zoom in a bit on the coding screen.

    • @YiannisPi
      @YiannisPi 4 years ago

      Glad you liked it! Noted!

  • @andreasp.189
    @andreasp.189 4 years ago

    Very informative second part tutorial! Keep it up!

  • @sarrahrose8077
    @sarrahrose8077 1 year ago

    Great tutorials!

  • @micahdelaurentis6551
    @micahdelaurentis6551 3 years ago +2

    So sick of little 5-minute oversimplified tutorials. This is the real deal that doesn't skimp. Thanks for sharing!

  • @tamaskg
    @tamaskg 4 years ago

    Hi! Great video! It would be nice if you could make your notebooks a bit larger, though; the text is pretty small.

    • @YiannisPi
      @YiannisPi 4 years ago

      Hey Tamas, yes you are right! I have started doing it in the new videos!

  • @ikizler2016-p7f
    @ikizler2016-p7f 3 years ago

    Great work, thank you! One question: you used feature importance on all of the data before the train/test/validation split. Shouldn't we do it only on the training data? Because we don't know which features are going to be important in data we have never seen before.
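
One common way to address this concern (assuming the goal is to keep held-out data out of the selection step) is to split first and fit the selector on the training portion only. A minimal sketch on synthetic data, not taken from the video:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=8, n_informative=3,
                           random_state=0)

# Split first, then fit the selector on the training portion only,
# so the held-out data never influences which features are kept
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=0)

selector = SelectFromModel(LogisticRegression(max_iter=1000))
selector.fit(X_train, y_train)

X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)  # same columns applied to the test set

print("kept feature indices:", np.flatnonzero(selector.get_support()))
```

The key point is that `transform` applies the same learned column mask to both splits, so the test set is scored with features chosen without ever looking at it.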

  • @anishchakraborty2585
    @anishchakraborty2585 4 years ago +1

    You made my day... thanks

    • @YiannisPi
      @YiannisPi 4 years ago

      Glad you liked the video!

  • @dhristovaddx
    @dhristovaddx 4 years ago +1

    Great video! Very helpful. I was wondering about something, though. I've heard that it's not a good idea to remove dummy variables unless you just drop one, but in your case you didn't use some of them because they weren't important. Doesn't that cause a problem, because it's basically as if you're removing some of the levels of a certain categorical variable?

    • @YiannisPi
      @YiannisPi 4 years ago +2

      Each case is different, so I cannot say what is right and what is wrong. Why is it not a good idea to remove dummy variables? One solution would be to test both with and without the dummy variables and compare the performance.
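
The "test both and compare" suggestion could look roughly like this, using cross-validation on synthetic data; the column subset here is purely illustrative, standing in for the set with some dummy columns dropped:

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=400, n_features=6, n_informative=3,
                           random_state=1)
df = pd.DataFrame(X, columns=[f"f{i}" for i in range(X.shape[1])])

model = LogisticRegression(max_iter=1000)

# Score once with all columns, then with a subset
# (as if some unimportant dummy columns had been dropped)
score_all = cross_val_score(model, df, y, cv=5).mean()
score_subset = cross_val_score(model, df[["f0", "f1", "f2"]], y, cv=5).mean()

print(f"all features: {score_all:.3f}, subset: {score_subset:.3f}")
```

If the subset scores about the same, the dropped levels were carrying little signal; a clear drop suggests they mattered after all.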

  • @louisebuijs3221
    @louisebuijs3221 3 years ago

    It's really inspiring to see you coding :)

  • @denisgalich6097
    @denisgalich6097 2 years ago

    What do we do in case we have an imbalanced data set? Let's say it is an e-commerce website that has a conversion rate of 1-3%?

  • @jeanhospicetiko2043
    @jeanhospicetiko2043 4 years ago

    Great tutorial
    I would like to know how to deal with the outliers found using the boxplot
    Thanks
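
One common way to handle boxplot outliers (not necessarily what the video does) is Tukey's 1.5×IQR rule, which uses the same fences a boxplot draws its whiskers at. A minimal sketch on made-up numbers:

```python
import pandas as pd

# Toy column with two obvious outliers
s = pd.Series([10, 12, 11, 13, 12, 11, 95, -40])

# Tukey's rule: the same fences a boxplot draws its whiskers at
q1, q3 = s.quantile(0.25), s.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Option A: drop the outlier rows entirely
s_dropped = s[(s >= lower) & (s <= upper)]

# Option B: cap (winsorize) them at the fences instead of dropping
s_capped = s.clip(lower, upper)

print("kept after dropping:", list(s_dropped))
```

Whether to drop, cap, or keep outliers depends on whether they are data-entry errors or genuine extreme values; capping preserves the row count, which matters on small datasets.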

  • @raunakmaitra9761
    @raunakmaitra9761 5 months ago

    Sir, I'm encountering a problem while getting the dummies: the values come out as True/False. How do I convert them into numeric 0/1?
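
This is because newer pandas versions return boolean dummy columns by default; passing `dtype=int` (or casting afterwards) gives 0/1 values. A minimal sketch on made-up data:

```python
import pandas as pd

df = pd.DataFrame({"city": ["London", "Paris", "London"]})

# Newer pandas versions return True/False dummies by default;
# dtype=int asks for 0/1 columns directly
dummies = pd.get_dummies(df["city"], dtype=int)

# Equivalent fix after the fact: cast the boolean columns yourself
# dummies = pd.get_dummies(df["city"]).astype(int)

print(dummies)
```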

  • @ayahmamdouh8445
    @ayahmamdouh8445 3 years ago

    Hi! Great video!
    Would you please provide me with the links for the feature importance details in the decision tree and random forest models?
    Thanks in advance :)

  • @navjiwanhira3888
    @navjiwanhira3888 3 years ago

    Good one... thanks

  • @sillieboy2
    @sillieboy2 4 years ago

    If your dependent y variable instead had 3 values (say 0, 1, 2) rather than binary, what would be the interpretation of feature importance in that case?
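
For a three-class target, scikit-learn's logistic regression fits a multinomial model with one coefficient vector per class, so "importance" is read per class rather than as a single vector. A minimal sketch on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Three-class target (0, 1, 2) on synthetic data
X, y = make_classification(n_samples=300, n_features=5, n_informative=3,
                           n_classes=3, random_state=0)

# The multinomial fit learns one coefficient vector per class:
# coef_[k, j] tells you how feature j shifts the log-odds of class k
# relative to the other classes, so "importance" is read per class
model = LogisticRegression(max_iter=1000).fit(X, y)
print("coef_ shape (classes x features):", model.coef_.shape)
```

A feature can therefore be important for distinguishing one class while being irrelevant for another, which is why a single global ranking is less meaningful in the multiclass case.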

  • @beijiazhang6635
    @beijiazhang6635 4 years ago

    What do you consider unbalanced data? How do you deal with it? Actually, I have an unbalanced dataset and I'm wondering how to deal with it using a logistic model. Thanks!

    • @YiannisPi
      @YiannisPi 4 years ago

      It depends on a lot of factors, to be fair. For anything outside of 60-40 it's worth looking at whether you should balance it, but every case is different. To balance datasets you can look at up-sampling or down-sampling, then feed the result into your model and check the performance.

    • @timto3935
      @timto3935 4 years ago

      @@YiannisPi Thank you very much for your tutorials. Please do one on imbalanced datasets.
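
Besides the up- and down-sampling mentioned in the reply above, one resampling-free alternative is `class_weight='balanced'`, which reweights the loss inversely to class frequency instead of duplicating or dropping rows. A sketch on a synthetic dataset with roughly a 2% positive rate (all names illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

# Roughly a 2% positive rate, like a low-converting e-commerce funnel
X, y = make_classification(n_samples=5000, n_features=6, weights=[0.98, 0.02],
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y,
                                                    random_state=0)

# class_weight='balanced' reweights the loss inversely to class frequency,
# an alternative to physically up- or down-sampling the rows
model = LogisticRegression(max_iter=1000, class_weight="balanced")
model.fit(X_train, y_train)

# With heavy imbalance, judge the model with balanced accuracy, not plain
# accuracy (always predicting 0 would already score ~98% accuracy)
score = balanced_accuracy_score(y_test, model.predict(X_test))
print("balanced accuracy:", round(score, 3))
```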

  • @irverybored
    @irverybored 3 years ago +1

    Dummy variable trap? Shouldn't you set drop_first=True in the get_dummies function?

    • @AC-kq6fv
      @AC-kq6fv 3 years ago

      Dummy variable trap +1
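
For reference, a minimal sketch of what `drop_first=True` changes, on made-up data:

```python
import pandas as pd

df = pd.DataFrame({"color": ["red", "green", "blue", "red"]})

# Full one-hot encoding: the k columns always sum to 1, so each column is
# a perfect linear combination of the others -- the dummy variable trap
full = pd.get_dummies(df["color"], dtype=int)

# drop_first=True keeps k-1 columns; the dropped level becomes the baseline
# that the remaining coefficients are interpreted against
reduced = pd.get_dummies(df["color"], drop_first=True, dtype=int)

print(list(full.columns), "->", list(reduced.columns))
```

With regularized solvers like scikit-learn's default, the trap is less fatal than in classical unregularized regression, but dropping one level still makes the coefficients easier to interpret.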

  • @sageguru3654
    @sageguru3654 3 years ago

    How to plot the logistic regression model?
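
For a single feature, "plotting the model" usually means drawing the fitted S-curve of P(y=1 | x) over a grid. A minimal sketch on toy data, with the matplotlib calls shown as comments:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# One-feature toy data so the fitted curve is easy to draw
X = np.array([[0], [1], [2], [3], [4], [5], [6], [7]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

model = LogisticRegression().fit(X, y)

# Evaluate P(y=1 | x) over a fine grid; with matplotlib you would
# render it as:
#   import matplotlib.pyplot as plt
#   plt.scatter(X, y)
#   plt.plot(grid, probs)
#   plt.show()
grid = np.linspace(0, 7, 100).reshape(-1, 1)
probs = model.predict_proba(grid)[:, 1]
print("P(y=1) at x=0:", round(probs[0], 3), "and at x=7:", round(probs[-1], 3))
```

With more than one feature there is no single curve to draw; common substitutes are plotting predicted probability against one feature at a time, or plotting the decision boundary for two features.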

  • @bharathisit1382
    @bharathisit1382 3 years ago

    Best video

  • @agustinleira77
    @agustinleira77 2 years ago

    Maybe it would be better to split this video at the start of the 'logistic regression' section; 45 minutes is a long time for one video. Also, the screen resolution is not very good. I particularly liked part 1. Please try not to speak too quickly in the next videos! Thank you for creating content and sharing it with us ;)