13.3.2 Decision Trees & Random Forest Feature Importance (L13: Feature Selection)

  • Published Nov 7, 2024

COMMENTS • 19

  • @bobrarity • 7 months ago • +5

    Even though this video is pretty old, it helped me; the way you explain things is easy to understand. Subscribed already 😊

    • @SebastianRaschka • 7 months ago • +2

      Nice. Glad it's still useful after all these years!

  • @anikajohn6924 • 2 years ago • +3

    Preparing for an interview and this is a great refresher. Thanks!

    • @SebastianRaschka • 2 years ago • +1

      Glad to hear! And best of luck with the interview!!

  • @MrMAFA99 • 2 years ago • +3

    Really good video, you should have more subscribers!
    Greetings from Germany

  • @kenbobcorn • 1 year ago

    Glad you added Lecture 13 after the fact for those who are interested. Also, do you have a list of the equipment you use for video recording? The new tablet setup looks great.

  • @abdireza1298 • 1 year ago

    Professor Raschka, please allow me to ask: can statistical test procedures be applied to feature importance values (such as those based on Gini impurity)? As in the figure at 13:04, can we compare the values statistically if we obtain the mean and confidence interval of each feature importance (Proline, Flavanoids, etc.) from cross-validation instead of the train-test split (using the Friedman test, a t-test, or the Wilcoxon test)? I do not see any statistical restriction on applying statistical tests to feature importance coefficients, since they are numeric in nature, but I am afraid I missed something, because I have never encountered a paper that statistically tests feature importance coefficients. An expert opinion such as yours is my referee in this case. Thank you, Professor.
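
    (A minimal sketch of what the question describes, assuming scikit-learn's wine dataset and a RandomForestClassifier: collect the impurity-based importances per cross-validation fold so each feature gets a mean and a rough confidence interval, which could then feed a paired test such as the Wilcoxon signed-rank test across folds.)

        import numpy as np
        from sklearn.datasets import load_wine
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import StratifiedKFold

        data = load_wine()
        X, y = data.data, data.target

        cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=123)
        importances = []  # one row of per-feature importances per fold

        for train_idx, _ in cv.split(X, y):
            forest = RandomForestClassifier(n_estimators=200, random_state=123)
            forest.fit(X[train_idx], y[train_idx])
            importances.append(forest.feature_importances_)

        importances = np.array(importances)
        mean = importances.mean(axis=0)
        # rough 95% interval from the fold-to-fold standard deviation
        ci = 1.96 * importances.std(axis=0, ddof=1) / np.sqrt(len(importances))

        for name, m, c in sorted(zip(data.feature_names, mean, ci), key=lambda t: -t[1]):
            print(f"{name:>30s}: {m:.3f} +/- {c:.3f}")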

  • @lekhnathojha8537 • 2 years ago • +2

    Very nicely explained. So informative. Thank you for making this video.

  • @AyushSharma-jm6ki • 1 year ago

    @Sebastian Can DTs and RFs also be used to select features for regression models?
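
    (A minimal sketch, using scikit-learn's diabetes dataset as a stand-in regression problem: a RandomForestRegressor exposes feature_importances_ just like the classifier does, and it can be wrapped in SelectFromModel to pick features for a downstream regression model.)

        from sklearn.datasets import load_diabetes
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.feature_selection import SelectFromModel

        X, y = load_diabetes(return_X_y=True)

        forest = RandomForestRegressor(n_estimators=200, random_state=123)
        selector = SelectFromModel(forest, threshold="median").fit(X, y)

        print("selected feature indices:", selector.get_support(indices=True))
        X_selected = selector.transform(X)  # keeps only features above the median importance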

  • @yashodharpathak189 • 10 months ago

    Which feature selection method is best if a dataset has many categorical variables? I have a dataset that comprises continuous as well as categorical variables. What should the approach be in this case?
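
    (A minimal sketch with made-up column names: encode the categorical columns, then let a random forest rank the encoded categorical and continuous features on the same importance scale. The DataFrame and its columns below are purely illustrative.)

        import pandas as pd
        from sklearn.compose import ColumnTransformer
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.pipeline import Pipeline
        from sklearn.preprocessing import OrdinalEncoder

        # tiny synthetic frame; the column names are hypothetical
        df = pd.DataFrame({
            "color":   ["red", "white", "red", "white", "red", "white"],
            "region":  ["A", "B", "A", "B", "B", "A"],
            "alcohol": [13.2, 12.1, 13.8, 11.9, 14.0, 12.4],
            "proline": [1100, 700, 1200, 650, 1300, 720],
            "target":  [1, 0, 1, 0, 1, 0],
        })
        categorical_cols = ["color", "region"]
        continuous_cols = ["alcohol", "proline"]

        preprocess = ColumnTransformer(
            [("cat", OrdinalEncoder(), categorical_cols)],
            remainder="passthrough",  # continuous columns pass through unchanged
        )
        pipe = Pipeline([
            ("prep", preprocess),
            ("forest", RandomForestClassifier(n_estimators=100, random_state=123)),
        ])
        pipe.fit(df[categorical_cols + continuous_cols], df["target"])

        # importances are ordered: encoded categorical columns first, then the passthrough ones
        print(dict(zip(categorical_cols + continuous_cols,
                       pipe.named_steps["forest"].feature_importances_)))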

  • @cagataydemirbas7259 • 1 year ago

    Hi, when I use RandomForest, DecisionTree, and XGBoost with RFE and look at feature_importances_, they return completely different orders even though all of them are tree-based models. My dataset has 13 columns; with XGBoost one feature's importance rank is 1, the same feature's rank with DecisionTree is 10, and with RandomForest it is 7. How can I trust which feature is better than the others in general? If a feature is more predictive than the others, shouldn't it have the same rank across all tree-based models? I am so confused about this. It's the same with SequentialFeatureSelection.
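
    (A minimal sketch, again on scikit-learn's wine dataset: permutation importance is computed on held-out data and is one common way to compare feature rankings that is less tied to the internals of any single tree-based model than the impurity-based feature_importances_.)

        from sklearn.datasets import load_wine
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.inspection import permutation_importance
        from sklearn.model_selection import train_test_split

        X, y = load_wine(return_X_y=True)
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.3, random_state=123, stratify=y)

        forest = RandomForestClassifier(n_estimators=200, random_state=123)
        forest.fit(X_train, y_train)

        result = permutation_importance(forest, X_test, y_test,
                                        n_repeats=30, random_state=123)
        # feature indices, most important first
        print(result.importances_mean.argsort()[::-1])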

  • @nooy2228 • 1 year ago

    Thank you so much!!! Your videos are super great!

  • @Jhawk_LA • 1 year ago

    Really great!

  • @bezawitdagne5756 • 1 year ago

    I was using a correlation heatmap, p-values, and information gain for feature selection; the values were pretty similar, but when I used the selected features with all the algorithms I was testing, the accuracy decreased. Then I tried random forest feature importance, and the features I got from RF feature importance improved my accuracy. Please help me understand why?

    • @SebastianRaschka • 1 year ago • +1

      I think it may depend on what your downstream model looks like. The correlation method may work better for generalized linear models than for tree-based methods, because tree-based methods already have feature selection built in.
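
      (A minimal sketch on the wine dataset, contrasting a filter-style selector with an embedded random-forest selector in front of a logistic regression; which one helps more depends on the downstream model, as the reply above suggests.)

        from sklearn.datasets import load_wine
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.feature_selection import SelectFromModel, SelectKBest, mutual_info_classif
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score
        from sklearn.pipeline import Pipeline
        from sklearn.preprocessing import StandardScaler

        X, y = load_wine(return_X_y=True)

        selectors = {
            "mutual_info": SelectKBest(mutual_info_classif, k=5),
            "rf_importance": SelectFromModel(
                RandomForestClassifier(n_estimators=200, random_state=123),
                threshold="median"),
        }

        for name, selector in selectors.items():
            pipe = Pipeline([
                ("scale", StandardScaler()),
                ("select", selector),
                ("clf", LogisticRegression(max_iter=1000)),
            ])
            scores = cross_val_score(pipe, X, y, cv=5)
            print(f"{name}: mean CV accuracy {scores.mean():.3f}")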

  • @jamalnuman • 8 months ago • +1

    great

  • @pulkitmadan6381 • 2 years ago