13.4.2 Feature Permutation Importance (L13: Feature Selection)

  • Published Nov 7, 2024
  • Sebastian's books: sebastianrasch...
    This video introduces permutation importance, a model-agnostic, versatile way of computing the importance of features based on a fitted machine learning classifier or regression model.
    Slides: sebastianrasch...
    Random forest importance video: • 13.3.2 Decision Trees ...
    -------
    This video is part of my Introduction to Machine Learning course.
    Next video: • 13.4.3 Feature Permuta...
    The complete playlist: • Intro to Machine Learn...
    A handy overview page with links to the materials: sebastianrasch...
    -------
    If you want to be notified about future videos, please consider subscribing to my channel: / sebastianraschka

COMMENTS • 30

  • @hamzawi2752
    @hamzawi2752 2 years ago +7

    One of the best explanations for this topic, thank you very much. Please continue to explain these concepts, you are a legend in teaching.

    • @SebastianRaschka
      @SebastianRaschka  2 years ago

      Thanks for the kind words! I definitely should make some more videos some time. Haha, it's just so hard to find time these days! Thanks, though, I will keep it in mind!!

  • @mostafabouzari2790
    @mostafabouzari2790 8 months ago +1

    I am writing my master's thesis right now, and your courses are the perfect combination of theory and practice that I need. Thank you for providing these lectures.

    • @SebastianRaschka
      @SebastianRaschka  7 months ago +1

      Awesome to hear that you are getting so much out of these! Best of luck with your thesis!

  • @dr_greg_mouse4125
    @dr_greg_mouse4125 3 months ago

    This is such a great explanation, walking clearly through each step of the algorithm with examples. Thank you!!

  • @erichganz4605
    @erichganz4605 2 years ago +2

    Really amazing, Sebastian! Very thorough explanation! I seldom find such complete videos on YouTube!

  • @annawilson3824
    @annawilson3824 2 years ago +1

    During the estimation of accuracy for the permuted X, y remains the old one, as you said (the code says it is y_val_perm)

  • @hamzajouichat9997
    @hamzajouichat9997 2 years ago +1

    Thank you, Sebastian, for the wonderful videos. I went through your feature selection videos and learned a lot, but one of the practical issues I can't get my head around is dealing with categorical one-hot-encoded features. Is there a method suited for computing the feature importance of the original variable rather than of the encoded sub-features?

    • @SebastianRaschka
      @SebastianRaschka  2 years ago

      Glad you liked it. And coincidentally: yes, there is! I just added support for that in mlxtend about 2 weeks ago.
      You can install the current dev version via
      pip install git+https://github.com/rasbt/mlxtend.git
      and then you can use it as described in Example 3 here: github.com/rasbt/mlxtend/blob/master/docs/sources/user_guide/evaluate/feature_importance_permutation.ipynb
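
      A minimal sketch of that grouped usage (an illustration, not code from the linked notebook; it assumes mlxtend's feature_importance_permutation accepts a feature_groups argument that shuffles the listed columns together, so a one-hot-encoded variable is scored as a single feature):

      import numpy as np
      from sklearn.ensemble import RandomForestClassifier
      from mlxtend.evaluate import feature_importance_permutation

      rng = np.random.RandomState(0)

      # Toy data: 2 numeric columns plus a 3-column one-hot block (columns 2-4)
      X = np.hstack([rng.randn(200, 2), np.eye(3)[rng.randint(0, 3, 200)]])
      y = (X[:, 0] + X[:, 2] > 0).astype(int)

      model = RandomForestClassifier(random_state=0).fit(X, y)

      mean_imp, all_imp = feature_importance_permutation(
          predict_method=model.predict,
          X=X,
          y=y,
          metric="accuracy",
          num_rounds=10,
          feature_groups=[0, 1, [2, 3, 4]],  # treat the one-hot block as one feature
          seed=1,
      )
      print(mean_imp)  # one value per group: column 0, column 1, the one-hot block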

  • @PoojaGaikwad-jm6ll
    @PoojaGaikwad-jm6ll 7 months ago

    What if we have imbalanced data? Can we still use these techniques, permutation importance or feature importance, to select the important features?

  • @riski_al_arsy
    @riski_al_arsy 1 year ago

    Is there a journal article that explains this?

  • @waheedimran6491
    @waheedimran6491 1 year ago

    Hello Sebastian!
    Thanks for the lecture. I have a query: How do we differentiate between leave-one-covariate-out and permutation importance? Secondly, can we call permutation importance explainable AI?
    Thank you.

  • @jamalnuman
    @jamalnuman 8 months ago +1

    Is "permutation" a synonym for "shuffling"?

    • @SebastianRaschka
      @SebastianRaschka  7 months ago +1

      Good question. Yeah, it's just jargon for shuffling here.

    • @jamalnuman
      @jamalnuman 6 months ago

      @@SebastianRaschka
      Does the permutation include removing columns?

    • @SebastianRaschka
      @SebastianRaschka  6 months ago +1

      @@jamalnuman Good question. It doesn't remove columns because if you removed columns, the features would be incompatible with the model and you would have to fit the model again.

    • @jamalnuman
      @jamalnuman 6 months ago

      @@SebastianRaschka
      What is the method called if it includes removing columns? Is it SHAP?
      And what about the inherent feature importance? Is it PFI, SHAP, or neither?

    • @SebastianRaschka
      @SebastianRaschka  6 months ago +1

      @@jamalnuman Believe it or not, it's actually just called "drop-column importance". E.g., see forum.posit.co/t/random-forest-regression-drop-column-importance/185425 :)
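
      A minimal sketch of drop-column importance (an illustration, not code from the thread): unlike permutation importance, each column really is removed and the model is refit from scratch, which is why it is far more expensive.

      from sklearn.base import clone
      from sklearn.datasets import load_iris
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import train_test_split

      X, y = load_iris(return_X_y=True)
      X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

      model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
      baseline = model.score(X_val, y_val)  # accuracy with all columns

      for col in range(X_train.shape[1]):
          keep = [c for c in range(X_train.shape[1]) if c != col]
          # Refit a fresh copy of the model WITHOUT this column
          refit = clone(model).fit(X_train[:, keep], y_train)
          print(f"column {col}: drop-column importance = "
                f"{baseline - refit.score(X_val[:, keep], y_val):.3f}")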

  • @annawilson3824
    @annawilson3824 2 years ago +1

    Freaking subscribed!

  • @shalinianunay2713
    @shalinianunay2713 2 years ago +1

    Wonderful!

  • @chakerbannour5255
    @chakerbannour5255 2 years ago +1

    If we shuffle, we will use the same metric for importance (Gini or IG), so how will the order affect the final accuracy?!

    • @SebastianRaschka
      @SebastianRaschka  2 years ago

      The model is not refit to the shuffled data, it's only evaluated on the shuffled data. But yes, it will affect the accuracy unless the feature is not used by the model. That's basically the idea: the more the shuffle affects the accuracy, the more important that feature is. If it does not affect the accuracy, the model does not rely on that feature for prediction.
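
      A minimal sketch of that procedure (an illustration, not code from the video): the model is fit once, then each feature column is shuffled in turn on the validation set, and the model is only re-evaluated on the shuffled copy, never refit.

      import numpy as np
      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import train_test_split

      X, y = make_classification(n_samples=500, n_features=5, n_informative=2,
                                 n_redundant=0, random_state=0)
      X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

      model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
      baseline = model.score(X_val, y_val)  # accuracy on the intact validation set
      rng = np.random.RandomState(1)

      for col in range(X_val.shape[1]):
          X_perm = X_val.copy()
          rng.shuffle(X_perm[:, col])  # shuffle ONE column in place; no refitting
          print(f"feature {col}: importance = "
                f"{baseline - model.score(X_perm, y_val):.3f}")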

    • @chakerbannour5255
      @chakerbannour5255 2 years ago +1

      @@SebastianRaschka Thanks for the quick answer. What I know is that even if we shuffle the order, our tree will keep looking for the best features (lowest Gini or highest IG), so I am not getting how changing the order of the features will change their importance for the model (even if there is some randomness, I think the change would be really small)!!
      Am I correct, or am I saying stupid things?
      If it is not refit, it makes sense: since we changed the order, we will confuse the model with the wrong distribution of the new feature.

    • @SebastianRaschka
      @SebastianRaschka  2 years ago +1

      @@chakerbannour5255 Oh, it's not changing the importance but estimating the importance. Say you train a decision tree on the Iris dataset. The decision tree only uses 2 features, like petal length and petal width, because the other ones are not useful for optimizing IG. So, if the DT only uses the petal features and you then do the permutation importance procedure, you will find that shuffling sepal length will not affect the accuracy. Because the accuracy difference between shuffled and unshuffled sepal length is 0, it means the feature is not important at all. Conversely, if you shuffle petal length, you may notice a drop of 20% accuracy or so. This indicates that this feature is kind of important :).
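
      A minimal sketch of that Iris thought experiment (an illustration using scikit-learn's built-in permutation_importance, not the code from the course): a tree that relies on the petal features should show near-zero importance for the sepal features.

      from sklearn.datasets import load_iris
      from sklearn.inspection import permutation_importance
      from sklearn.model_selection import train_test_split
      from sklearn.tree import DecisionTreeClassifier

      iris = load_iris()
      X_train, X_val, y_train, y_val = train_test_split(
          iris.data, iris.target, random_state=0)

      tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

      result = permutation_importance(tree, X_val, y_val,
                                      n_repeats=20, random_state=0)

      for name, imp in zip(iris.feature_names, result.importances_mean):
          print(f"{name}: {imp:.3f}")  # sepal features near 0; petal features large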

    • @chakerbannour5255
      @chakerbannour5255 2 years ago

      @@SebastianRaschka Thanks a lot for the clarification

    • @ahmedsaied8373
      @ahmedsaied8373 2 years ago +3

      @@SebastianRaschka Thanks so much. I was looking in the video for how shuffling a feature would imply how important the feature is, and the answer was in the reply:
      "But yes, it will affect the accuracy unless the feature is not used by the model. That's basically the idea: the more the shuffle affects the accuracy, the more important that feature is. If it does not affect the accuracy, the model does not rely on that feature for prediction."