Feature Selection using Hierarchical Clustering | Python Tutorial

  • Published 7 Jun 2024
  • This Python tutorial covers feature selection for machine learning using hierarchical clustering. We walk through partitioning features into cohesive groups to reduce redundancy in model training. The technique becomes more valuable as the number of features in your dataset grows, offering a structured alternative to manual grouping.
    What you'll learn:
    - The importance of variable clustering algorithms in handling large feature sets.
    - Detailed application of hierarchical clustering to form intuitive feature groups, with a focus on Ward's linkage method.
    - Visualising clusters using a dendrogram.
    - A comparative analysis highlighting the advantages of hierarchical clustering over other clustering methods.
    - Insights into the method's output using correlation heatmaps, demonstrating the formation of homogeneous feature groups.
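    The workflow in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the video or the companion article: the data is synthetic, the number of clusters (2) is chosen by construction, and picking the feature most correlated with the target is just one possible selection rule. It uses SciPy's `linkage`, `fcluster` and `dendrogram`.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

# Synthetic data (assumed for illustration): two hidden signals,
# each measured by three noisy, highly correlated features.
rng = np.random.default_rng(0)
n = 500
a, b = rng.normal(size=n), rng.normal(size=n)
X = pd.DataFrame(
    {f"{name}{i}": sig + 0.1 * rng.normal(size=n)
     for name, sig in [("a", a), ("b", b)] for i in (1, 2, 3)}
)
y = pd.Series(a + 2 * b + rng.normal(size=n))

# Standardise, then transpose: linkage() clusters ROWS, so each row
# must be a feature rather than an observation.
X_std = (X - X.mean()) / X.std()
Z = linkage(X_std.T.to_numpy(), method="ward")

# Leaf order of the dendrogram; no_plot=True skips drawing so that
# matplotlib is not required. Drop it to render the tree.
leaf_order = dendrogram(Z, labels=X.columns.tolist(), no_plot=True)["ivl"]

# Cut the tree into a chosen number of clusters (2 here, by construction).
clusters = pd.Series(fcluster(Z, t=2, criterion="maxclust"), index=X.columns)

# One possible selection rule (an assumption): per cluster, keep the
# feature with the highest absolute correlation with the target.
target_corr = X.corrwith(y).abs()
selected = target_corr.groupby(clusters).idxmax().tolist()

# Sense check: correlations inside a cluster should be high.
within = X[clusters[clusters == clusters["a1"]].index].corr()

print(leaf_order)
print(clusters.to_dict())
print(selected)
```

    On real data the number of clusters is usually read off the dendrogram rather than fixed in advance, and `within` would typically be plotted as a correlation heatmap.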
    Why is this important?
    In industry, datasets often contain hundreds or thousands of potential features, and navigating them is a challenge. Dimensionality reduction methods like PCA offer one solution, but the resulting features are hard to interpret. Hierarchical clustering instead keeps a subset of the original features, paving the way for an interpretable model with a concise feature list.
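    To make the interpretability point concrete, here is a small sketch of what a PCA feature looks like. It assumes synthetic data with four correlated features (the names `a1`, `a2`, `b1`, `b2` are hypothetical) and computes the principal components with NumPy's SVD rather than any particular PCA library.

```python
import numpy as np

# Synthetic data (assumed for illustration): two correlated signals,
# each measured by two noisy features.
rng = np.random.default_rng(0)
n = 500
a = rng.normal(size=n)
b = 0.6 * a + 0.8 * rng.normal(size=n)
names = ["a1", "a2", "b1", "b2"]
X = np.column_stack([
    a + 0.5 * rng.normal(size=n),
    a + 0.5 * rng.normal(size=n),
    b + 0.5 * rng.normal(size=n),
    b + 0.5 * rng.normal(size=n),
])
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA via SVD: each row of Vt holds the loadings of one component.
_, _, Vt = np.linalg.svd(X_std, full_matrices=False)
pc1_loadings = dict(zip(names, Vt[0].round(2)))

# The first component is a weighted blend of every original feature,
# so it has no single real-world meaning.
print(pc1_loadings)
```

    A feature picked from a cluster, by contrast, is still one of the original columns and keeps its original meaning.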
    🚀 Free Course 🚀
    Signup here: mailchi.mp/40909011987b/signup
    XAI course: adataodyssey.com/courses/xai-...
    SHAP course: adataodyssey.com/courses/shap...
    🚀 Companion article with link to code (no-paywall link): 🚀
    medium.com/towards-data-scien...
    🚀 Useful playlists 🚀
    XAI: • Explainable AI (XAI)
    SHAP: • SHAP
    Algorithm fairness: • Algorithm Fairness
    🚀 Get in touch 🚀
    Medium: / conorosullyds
    Threads: www.threads.net/@conorosullyds
    Twitter: / conorosullyds
    Website: adataodyssey.com/
    🚀 Chapters 🚀
    00:00 Introduction
    00:50 What is feature selection?
    01:37 Theory of Hierarchical Clustering
    06:35 Applying Hierarchical Clustering
    09:31 Visualising the Dendrogram
    10:43 Feature selection
    13:23 Sense-checking clusters

COMMENTS • 5

  • @adataodyssey 2 months ago

    🚀 Free Course 🚀
    Signup here: mailchi.mp/40909011987b/signup
    XAI course: adataodyssey.com/courses/xai-with-python/
    SHAP course: adataodyssey.com/courses/shap-with-python/

  • @karthikeyapervela3230 2 months ago

    Thanks. I was recently reading a LinkedIn post on how to eliminate highly correlated features with hierarchical clustering, but it was not clear. This is much better explained.

    • @adataodyssey 2 months ago

      Thanks Karthikeya! I'm glad you found it useful. I have another video coming out tomorrow about explaining linear models.

  • @arjendeniz6828 2 months ago

    Thank you so much! I was stuck on a hierarchical analysis because I did not know that I needed to transpose my dataframe. Great video!

    • @adataodyssey 2 months ago

      I’m glad you found this useful ☺️