Feature Selection using Hierarchical Clustering | Python Tutorial
Published 7 Jun 2024
In this comprehensive Python tutorial, we delve into feature selection for machine learning with hierarchical clustering. We guide you through the essentials of partitioning features into cohesive groups to minimize redundancy in model training. This technique is particularly important as your dataset expands, offering a structured alternative to manual grouping.
What you'll learn:
- The importance of variable clustering algorithms in handling large feature sets.
- Detailed application of hierarchical clustering to form intuitive feature groups, with a focus on Ward’s linkage method.
- Visualising clusters using a dendrogram.
- A comparative analysis highlighting the advantages of hierarchical clustering over other clustering methods.
- Insights into the method's output using correlation heatmaps, demonstrating the formation of homogeneous feature groups.
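The steps in the list above can be sketched roughly as follows. This is a minimal illustration using SciPy, not the code from the video or the companion article: the synthetic data, the column names (`a1` ... `b3`) and the choice of two clusters are all assumptions made for the example. Ward's method merges, at each step, the pair of clusters that gives the smallest increase in within-cluster variance.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

# Synthetic data (an assumption for illustration): two latent signals,
# each observed through three noisy features.
rng = np.random.default_rng(0)
n_rows = 500
signal_a = rng.normal(size=n_rows)
signal_b = rng.normal(size=n_rows)
X = pd.DataFrame({
    "a1": signal_a + 0.1 * rng.normal(size=n_rows),
    "a2": signal_a + 0.1 * rng.normal(size=n_rows),
    "a3": signal_a + 0.1 * rng.normal(size=n_rows),
    "b1": signal_b + 0.1 * rng.normal(size=n_rows),
    "b2": signal_b + 0.1 * rng.normal(size=n_rows),
    "b3": signal_b + 0.1 * rng.normal(size=n_rows),
})

# Standardise each feature, then transpose so that each ROW is a feature.
# linkage() clusters rows, so without the transpose it would group
# observations instead of features.
X_scaled = (X - X.mean()) / X.std()
feature_matrix = X_scaled.T.to_numpy()

# Build the hierarchy with Ward linkage (Euclidean distance between features).
Z = linkage(feature_matrix, method="ward")

# Cut the tree into two feature groups (the number 2 is chosen for this toy data).
labels = fcluster(Z, t=2, criterion="maxclust")
groups = pd.Series(labels, index=X.columns, name="cluster")
print(groups)

# Compute the dendrogram layout without drawing it; "ivl" holds the leaf order.
tree = dendrogram(Z, labels=list(X.columns), no_plot=True)
leaf_order = tree["ivl"]

# Reorder the correlation matrix so features in the same cluster sit together.
corr_ordered = X.corr().loc[leaf_order, leaf_order]
print(corr_ordered.round(2))
```

Dropping `no_plot=True` draws the dendrogram with matplotlib, and `corr_ordered` can be passed to a plotting function such as `seaborn.heatmap` to sense-check that each group is made up of highly correlated features.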
Why is this important?
In data-driven industries, datasets often contain hundreds or thousands of candidate features, and navigating them is a challenge. Dimensionality reduction methods like PCA can shrink the feature set, but the resulting components are hard to interpret. Hierarchical clustering instead groups the original features, so you can keep a concise list of them and still build an interpretable model.
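To make the contrast with PCA concrete, here is a small sketch of how cluster labels might be turned into a short list of original features. The selection rule (keep, from each cluster, the feature most correlated with the target) is an assumption chosen for illustration, not necessarily the rule used in the video. The helper `select_representatives`, the toy data and the hard-coded cluster labels are hypothetical.

```python
import numpy as np
import pandas as pd


def select_representatives(X, y, cluster_labels):
    """Keep, from each cluster, the feature with the highest |correlation| to y."""
    target_corr = X.corrwith(y).abs()
    clusters = pd.Series(cluster_labels, index=X.columns)
    # Within each cluster, idxmax returns the name of the strongest feature.
    return target_corr.groupby(clusters).idxmax().tolist()


# Toy data (an assumption): two signals, each with a clean and a noisy copy.
rng = np.random.default_rng(1)
n_rows = 1000
a = rng.normal(size=n_rows)
b = rng.normal(size=n_rows)
X = pd.DataFrame({
    "a_clean": a + 0.1 * rng.normal(size=n_rows),
    "a_noisy": a + 1.0 * rng.normal(size=n_rows),
    "b_clean": b + 0.1 * rng.normal(size=n_rows),
    "b_noisy": b + 1.0 * rng.normal(size=n_rows),
})
y = pd.Series(a + b + 0.1 * rng.normal(size=n_rows))

# Cluster labels as they might come out of a hierarchical clustering step
# (hard-coded here to keep the example self-contained).
cluster_labels = [1, 1, 2, 2]

selected = select_representatives(X, y, cluster_labels)
print(selected)
```

Unlike principal components, every entry in `selected` is a column from the original dataset, so the resulting model can be explained in terms of features people already understand.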
🚀 Companion article with link to code (no-paywall link): 🚀
medium.com/towards-data-scien...
🚀 Useful playlists 🚀
XAI: • Explainable AI (XAI)
SHAP: • SHAP
Algorithm fairness: • Algorithm Fairness
🚀 Get in touch 🚀
Medium: / conorosullyds
Threads: www.threads.net/@conorosullyds
Twitter: / conorosullyds
Website: adataodyssey.com/
🚀 Chapters 🚀
00:00 Introduction
00:50 What is feature selection?
01:37 Theory of Hierarchical Clustering
06:35 Applying Hierarchical Clustering
09:31 Visualising the Dendrogram
10:43 Feature selection
13:23 Sense-checking clusters
🚀 Free Course 🚀
Signup here: mailchi.mp/40909011987b/signup
XAI course: adataodyssey.com/courses/xai-with-python/
SHAP course: adataodyssey.com/courses/shap-with-python/
Thanks! I was recently reading a LinkedIn post on how to eliminate highly correlated features with hierarchical clustering, but it was not clear. This is much better explained.
Thanks Karthikeya! I'm glad you found it useful. I have another video coming out tomorrow about explaining linear models.
Thank you so much! I was stuck at a hierarchical analysis as I did not know that I need to transpose my dataframe. Great video!
I’m glad you found this useful ☺️