Machine learning feature engineering: Label encoding Vs One-Hot encoding (using Scikit-learn)

  • Published 14 Oct 2024

COMMENTS • 25

  • @TheAnbohan 3 years ago +3

    Hi, thanks for the tutorial. However, the sklearn docs specify that LabelEncoder should be used for the target, not for features: "This transformer should be used to encode target values, i.e. y, and not the input X."

    • @MrMadmaggot 2 years ago

      What are target values? The labels (the y's)?

    • @havryliuk 2 years ago

      You could use integer encoding for the dependent variable, but you probably won't be able to use one-hot encoding, because you would then be predicting more than one output column; I'm not sure that's possible.
      Well, encoding is really used for the independent features (X) as well, because machine learning models can't be trained on labels (strings), only on numbers.

    • @updatascience 1 year ago

      You are right, thanks for your comment. I have uploaded a new video that covers this: When to apply each categorical encoders? Practical examples ua-cam.com/video/JbaELidWCQw/v-deo.html
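
A minimal sketch of the distinction raised in this thread, using hypothetical toy data (the borough and price-band values below are made up for illustration and are not the video's dataset). It assumes the usual split: LabelEncoder for the 1-D target y, OneHotEncoder or OrdinalEncoder for the 2-D feature matrix X.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, OrdinalEncoder

# Hypothetical toy data: one categorical feature column and a string target.
X = np.array([["Bronx"], ["Manhattan"], ["Queens"], ["Manhattan"]])
y = np.array(["cheap", "expensive", "cheap", "expensive"])

# LabelEncoder works on a 1-D array, which is why it suits the target y.
y_encoded = LabelEncoder().fit_transform(y)

# OneHotEncoder and OrdinalEncoder expect 2-D input (samples x features),
# so they are the usual choice for the feature matrix X.
X_onehot = OneHotEncoder().fit_transform(X).toarray()  # one column per category
X_ordinal = OrdinalEncoder().fit_transform(X)          # one integer code per category

print(y_encoded)
print(X_onehot)
print(X_ordinal.ravel())
```

On the question of one-hot encoding a target: scikit-learn also provides LabelBinarizer for that purpose, so it is possible, though whether a given model accepts such a target depends on the model.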

  • @havryliuk 2 years ago +2

    I suppose it would have been useful to mention when to use each type of encoding and what their pros and cons are.

    • @cristianofroes4681 2 years ago

      I was expecting the same: to know when to use each one and why.

    • @updatascience 1 year ago +1

      I have uploaded a new video to cover this: When to apply each categorical encoders? Practical examples ua-cam.com/video/JbaELidWCQw/v-deo.html

    • @updatascience 1 year ago +2

      @cristianofroes4681 I have uploaded a new video to cover this: When to apply each categorical encoders? Practical examples ua-cam.com/video/JbaELidWCQw/v-deo.html

  • @stand4justice4867 2 years ago

    Thank you very much! The video was clear, direct and informative :)

  • @todorp4056 1 year ago

    Thanks, nice tutorials

  • @sasidharansathiyamoorthy6918 3 years ago

    Can someone explain on what basis the labels are generated? Like why did Staten Island get 3, Manhattan 2 etc?

    • @updatascience 3 years ago +1

      Hey Sasidharan, they are generated by the fit_transform function, which assigns each value a unique number; that is the whole idea of label encoding: mapping each value to a unique number.

    • @musyo005 2 years ago +4

      In this technique, each label is assigned a unique integer based on alphabetical ordering: Bronx (0) comes before Brooklyn (1), Manhattan (2), Queens (3) and Staten Island (4).

    • @_danfiz 2 years ago

      @musyo005 thank you!

    • @RamonX69 2 years ago

      @musyo005 how do you specify the label for each value?
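
A short sketch of the two points in this thread, assuming the five borough names mentioned above. It first inspects the sorted order LabelEncoder uses, then shows one possible way to choose the codes yourself: OrdinalEncoder with an explicit `categories` list. The custom order below is hypothetical and only illustrates the mechanism.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OrdinalEncoder

boroughs = np.array(["Queens", "Bronx", "Staten Island", "Manhattan", "Brooklyn"])

# LabelEncoder sorts the unique values; a value's position in classes_ is its code.
encoder = LabelEncoder()
codes = encoder.fit_transform(boroughs)
print(dict(zip(encoder.classes_, range(len(encoder.classes_)))))

# To pick the codes yourself, pass the desired order explicitly.
# This order is a hypothetical example, not something from the video.
custom_order = ["Manhattan", "Brooklyn", "Queens", "Bronx", "Staten Island"]
custom = OrdinalEncoder(categories=[custom_order])
# OrdinalEncoder expects 2-D input, so reshape the single column first.
custom_codes = custom.fit_transform(boroughs.reshape(-1, 1)).ravel().astype(int)
print(custom_codes)
```

A plain dictionary mapping applied to the column would work just as well; the encoder version is mainly useful when the same mapping has to be reapplied to new data.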

  • @cristianofroes4681 2 years ago

    I was expecting to learn when and why to use each one, and also the main differences. Anyway, thanks for the great class.

    • @updatascience 1 year ago +1

      I have uploaded a new video to cover this: When to apply each categorical encoders? Practical examples ua-cam.com/video/JbaELidWCQw/v-deo.html

  • @ameerabdul5444 3 years ago

    Thanks for the tutorial... I like the way you explained it.

  • @yahyamlaouhi9508 3 years ago +2

    Nice video, keep going.

  • @pkstock372 3 years ago +1

    airbnb.shape would be easier to understand.

  • @saunitmarolia3901 3 years ago

    Nicely explained

  • @irfanfulari2122 2 years ago

    Focus on Indian students, you have good scope...