SHAP values for beginners | What they mean and their applications

A Data Odyssey

Published 11 Mar 2023
SHAP is the most powerful Python package for understanding and debugging your machine-learning models. We learn to interpret SHAP values for both continuous and binary target variables. We also explore the applications of SHAP. This includes debugging models, providing human-friendly explanations and data exploration.
NOTE: SHAP course is no longer free but you will still get the XAI course for free :)
SHAP course: adataodyssey.com/courses/shap...
XAI course: adataodyssey.com/courses/xai-...
Newsletter signup: mailchi.mp/40909011987b/signup
Using SHAP to Debug a PyTorch Image Regression Model (no-paywall link): towardsdatascience.com/using-...
Medium: / conorosullyds
Twitter: / conorosullyds
Mastodon: sigmoid.social/@conorosully
Website: adataodyssey.com/

COMMENTS • 43

@adataodyssey 4 months ago
NOTE: SHAP course is no longer free but you will still get the XAI course for free :)
SHAP course: adataodyssey.com/courses/shap-with-python/
XAI course: adataodyssey.com/courses/xai-with-python/
Newsletter signup: mailchi.mp/40909011987b/signup
@lakshman587 7 months ago ⁺¹
This is a very clear video about SHAP!!
@innocentjoseph9084 2 months ago
Excellent explanation, just what I needed. Thank you.
@adataodyssey 2 months ago
I’m glad you found it useful, Innocent :)
@RHONSON100 7 months ago ⁺¹
Wonderful explanation.
@adataodyssey 7 months ago
Thank you for the kind comment!
@dantedt3931 2 months ago
This is awesome!
@adataodyssey 2 months ago
Thanks!
@satk4211 2 months ago
Excellent video ❤❤❤❤❤❤
@adataodyssey 2 months ago
Thank you ☺️ I’m glad it could help
@aakritiiacharya 6 months ago ⁺²
Hey, amazing explanation! I wanted to know more about interpreting the SHAP summary plot in the context of survival analysis.
@adataodyssey 6 months ago
Thanks Aakriti! I don't know anything about survival analysis, I'm afraid... If you are building models using well-known packages (e.g. sklearn, XGBoost) then you should be able to use SHAP. I have this video on the more technical coding details. Let me know if that helps!
ua-cam.com/video/L8_sVRhBDLU/v-deo.html
@fouried96 3 months ago
Love to see a fellow South African in this line of work!
@adataodyssey 3 months ago
Howzit! Will keep the videos coming :)
@fouried96 3 months ago
@@adataodyssey Sweet! I followed you on LinkedIn for any other posts outside of YouTube. I was just curious, how does Ireland's grading system work for master's degrees? I see you have a 1.1. I have no idea what that means, having only studied in SA lol :P
@adataodyssey 3 months ago ⁺¹
@@fouried96 That's 75% or above. They don't distinguish beyond that. The Irish are not so big on grading :D
@fouried96 3 months ago
@@adataodyssey Congrats! I am busy following this SHAP series. I'm looking to find the best features for this Kaggle competition for a multiclass classification problem where I'm using XGBoost. I was wondering, are you on Kaggle?
@hasnainayub2369 4 months ago
Very well explained! I have a question regarding SHAP dependence plots. On the right-hand y-axis, SHAP selects a particular interacting feature by default, and I know we can manually change the interacting feature. Does the default selection by the SHAP explainer tell us that this particular feature is the one that interacts with the main feature the MOST compared to the other features? In other words, can we say that the main feature depends on (or interacts with) the default interacting feature when making predictions?
@adataodyssey 3 months ago
Yes, I wasn't aware of this but it seems like it is true:
shap-lrjball.readthedocs.io/en/latest/example_notebooks/plots/dependence_plot.html
@statistikochspss-hjalpen8335 7 months ago
Does it have to be about prediction?
I just want to understand which features/independent variables are most important when my independent variables are highly correlated. I've heard people talking about "contribution".
@adataodyssey 7 months ago
No, you can also interpret a model used for analysis. In ML, when we say "prediction" we mean the output of the model. We use this term even if we are not trying to predict the future.
@user-nf2zo3yt8j 5 months ago
If I have one-hot encoded my categorical features, how do I know which of the original features are contributing?
@adataodyssey 5 months ago
This is a great question! You have two options (see the articles below). Either you can add up the SHAP values for the individual one-hot encodings or use CatBoost. I also go over these concepts in more detail in my course.
towardsdatascience.com/shap-for-categorical-features-7c63e6a554ea?sk=2eca9ff9d28d1c8bfde82f6784bdba19
towardsdatascience.com/shap-for-categorical-features-with-catboost-8315e14dac1?sk=ef720159150a19b111d8740ab0bbac6d
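The first option in the reply above, adding up the SHAP values of the individual one-hot columns, can be sketched roughly as follows. This is only an illustration and not the code from the linked articles: the helper `sum_onehot_shap`, the column names and the SHAP values are all made up, and it assumes the one-hot columns are named `<feature>_<category>`.

```python
import numpy as np


def sum_onehot_shap(shap_values, column_names, categorical_prefixes):
    """Collapse one-hot SHAP columns into one column per original feature.

    Assumption (hypothetical naming scheme): one-hot columns are called
    "<prefix>_<category>", e.g. "colour_red".
    """
    groups = {}  # original feature name -> indices of its model input columns
    for idx, name in enumerate(column_names):
        owner = name
        for prefix in categorical_prefixes:
            if name.startswith(prefix + "_"):
                owner = prefix
                break
        groups.setdefault(owner, []).append(idx)

    names = list(groups)
    # SHAP values are additive, so summing columns leaves each row's total unchanged.
    combined = np.column_stack(
        [shap_values[:, groups[n]].sum(axis=1) for n in names]
    )
    return names, combined


# Made-up SHAP values for 2 rows and 4 model inputs (not from a real model).
columns = ["age", "colour_red", "colour_blue", "colour_green"]
shap_values = np.array([
    [0.5, 0.25, -0.125, 0.0],
    [-0.25, -0.5, 0.125, 0.125],
])

names, combined = sum_onehot_shap(shap_values, columns, ["colour"])
print(names)     # one entry per original feature
print(combined)  # one column per original feature
```

Because the per-row totals are preserved, the combined values still add up to the same difference between the prediction and the base value as the original ones.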
@youtubeuser4878 2 months ago
Hello. Thanks for the tutorial. Regarding your XAI and SHAP courses, is there an order in which we should take them? Should we take XAI before SHAP or vice versa? Thanks
@adataodyssey 1 month ago
No problem! It is better to take XAI first, then SHAP. XAI covers more of the basics in the field and other useful model-agnostic methods. But the SHAP course still gives some basics, so it is not necessary to do the entire XAI course (or even any of it) if all you care about is learning SHAP :)
@youtubeuser4878 1 month ago
@@adataodyssey Awesome. Thank you.
@mahdihabibi6382 5 months ago
How can we determine which interpretable models are appropriate for our deep learning models? For example, I have a CNN model for Malaria prediction, however, I am unsure whether LIME or SHAP is a better tool for interpreting my model. Could you please guide me through this situation?
@adataodyssey 4 months ago ⁺¹
For deep learning, you might want to look into a model-specific method such as Grad-CAM. Are you using images or tabular data?
If you are using tabular data, I would change the model to XGBoost or random forest. Then use both LIME and SHAP. There are also other methods like ALEs, PDPs, ICE plots and Friedman's H-statistic. It is also a good idea to use multiple methods.
@mahdihabibi6382 4 months ago
Thank you for your reply. @@adataodyssey
@aneesha123able 1 year ago
👏
@keenanosullivan305 1 year ago ⁺¹
Shap means something a little different in South Africa. Love the content though👍🏼
@adataodyssey 1 year ago ⁺¹
Haha shap shap bra!
@teguhprasetyo7505 4 months ago
Can this method be applied in multilabel classification?
@adataodyssey 4 months ago
Yes! I have a video on this exact topic: ua-cam.com/video/2xlgOu22YgE/v-deo.html&lc=UgwSqpAiiG_ho6hDqDd4AaABAg
@shubhanshisinghms7745 1 month ago
Can you make a video on how recruitment decisions are made?
@adataodyssey 1 month ago
Do you mean how automated decisions are made or decisions for data scientists in general?
@weii321 9 months ago
Can SHAP values be used for feature selection?
@adataodyssey 9 months ago
Yes! You can use the mean SHAP plot. I discuss it in this video: ua-cam.com/video/L8_sVRhBDLU/v-deo.html
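The mean SHAP plot ranks features by their mean absolute SHAP value. A rough sketch of that ranking, assuming the SHAP values are already available as a rows-by-features array, could look like this. The helper name, the feature names, the numbers and the 0.5 cut-off are all hypothetical and only illustrate the idea.

```python
import numpy as np


def rank_features_by_mean_abs_shap(shap_values, feature_names):
    """Return (feature, mean |SHAP|) pairs, most important first."""
    mean_abs = np.abs(shap_values).mean(axis=0)
    order = np.argsort(mean_abs)[::-1]
    return [(feature_names[i], float(mean_abs[i])) for i in order]


# Made-up SHAP values for 4 rows and 3 features (not from a real model).
feature_names = ["age", "income", "noise"]
shap_values = np.array([
    [0.5, -1.0, 0.0],
    [-0.5, 2.0, 0.25],
    [1.0, -1.5, -0.25],
    [-1.0, 1.5, 0.0],
])

ranking = rank_features_by_mean_abs_shap(shap_values, feature_names)

# Hypothetical cut-off: keep features whose mean |SHAP| is at least 0.5.
selected = [name for name, score in ranking if score >= 0.5]
print(ranking)
print(selected)
```

Any cut-off like this is a judgment call, so it is worth checking model performance again after dropping features.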
@weii321 9 months ago
@@adataodyssey Thank you for your answer. I have another question, what is the difference between using SHAP values compared to using feature importance for feature selection? Does using SHAP values improve the model's performance more?
@keivansamani3437 5 months ago
I want to be able to understand how the features affect the predictions along a 2D curve where the points are sequential, but it seems SHAP is only useful when there’s a single prediction not a curve :(
@adataodyssey 4 months ago
You could try using PDPs or ICE Plots for this. Or aggregate SHAP values using a dependence plot
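For readers unfamiliar with PDPs and ICE plots, the idea can be sketched by hand: sweep one feature over a grid of values, record the prediction for every row (the ICE curves), and average them (the PDP). This is a minimal sketch with a hypothetical stand-in model `toy_model` and made-up data, not a replacement for a library implementation.

```python
import numpy as np


def toy_model(X):
    """Hypothetical stand-in for a fitted model's predict function."""
    return 2.0 * X[:, 0] + X[:, 0] * X[:, 1]


def ice_curves(predict, X, feature_idx, grid):
    """One curve per row: predictions as the chosen feature sweeps over grid."""
    curves = np.empty((X.shape[0], len(grid)))
    for j, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature_idx] = value  # overwrite the feature for every row
        curves[:, j] = predict(X_mod)
    return curves


# Made-up data: 3 rows, 2 features.
X = np.array([[0.0, 1.0], [1.0, -1.0], [2.0, 3.0]])
grid = np.array([0.0, 1.0, 2.0])

ice = ice_curves(toy_model, X, feature_idx=0, grid=grid)  # shape (rows, grid points)
pdp = ice.mean(axis=0)  # the PDP is the average of the ICE curves
print(ice)
print(pdp)
```

The ICE curves here have different slopes because the toy model contains an interaction between the two features, which is the kind of detail the averaged PDP alone hides.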
@fupopanda 17 days ago
Jumping between what you are explaining and yourself is distracting.
@adataodyssey 16 days ago
Thanks for the feedback!
