High Quality, High Performance Clustering with HDBSCAN | SciPy 2016 | Leland McInnes

Enthought

20,546 views

Published Sep 5, 2024

COMMENTS • 20

@DouglasDuhaime 4 years ago ⁺³
Leland is truly a gentleman and a scholar
@kevon217 2 months ago ⁺¹
Been on a Leland yt binge as of late, saw this comment, and truly agree.
@lelandmcinnes9501 8 years ago ⁺¹²
Thanks to the great people at conda-forge, hdbscan is now available as a conda package (which is by far the easiest way to install it):
conda install -c conda-forge hdbscan
@zwitter689 7 years ago
Thanks, very nicely done. I installed hdbscan and am trying to mimic the examples you give, but I can't find the data for the example in "Getting More Information About a Clustering". I like to follow the examples exactly, so a copy of the actual data set you used would be great. Can you help me with this?
@lelandmcinnes9501 7 years ago
It's in the github repository with the notebooks:
github.com/scikit-learn-contrib/hdbscan/blob/master/notebooks/clusterable_data.npy
@zwitter689 7 years ago
Thank you and especially for the quick response.
@chengchu88 6 years ago
Dr McInnes,
thanks for the great video.
I am using HDBSCAN on a large dataset, and I know how to set the 'memory' parameter to cache the expensive computation.
My question is: after I cache the computation during fitting, how do I change min_cluster_size and min_samples and re-label the same data without going through the time-consuming fitting again? Could you provide a few sample lines of Python?
thank you,
Cheng
@elivazquez7582 6 years ago ⁺³
Great video! Great presentation - thanks for doing this!
@enthought 8 years ago
More info on HDBSCAN here: github.com/lmcinnes/hdbscan.
See the complete SciPy 2016 Conference talk & tutorial playlist here: ua-cam.com/play/PLYx7XA2nY5Gf37zYZMw6OqGFRPjB1jCy6.html
@rajeshbalakrishnan2228 4 years ago
Wow! One of the best clustering discussions.
@shyamsbox 6 years ago
Very nice! We will try HDBSCAN.
@grygoriyzolotarov3228 6 years ago ⁺²
What is the font you use in your presentations (very appealing)?
@wexwexexort 3 years ago
great talk!
@rednax3788 7 years ago ⁺²
HDBSCAN IS KING
@Marin-ct5my 3 years ago
HDBSCAN seems to be capable of producing clusters that share overlapping nodes. Since clustering, for me, is about identifying points shared between clusters, what would I have to change in the algorithm to get those? I was surprised that nobody asked about this and that nothing was said about it, despite it being a possible feature of the algorithm.
@karthik-ex4dm 5 years ago
Great video. Since clustering does not work well in high-dimensional spaces, a pairwise distance matrix should be fine when working in high dimensions, right? But even computing the pairwise distances will be computationally expensive in a very high-dimensional space, right? So the best choice would be to find the best features using something like forward feature selection and then run HDBSCAN, right?
@andrewdennis6976 6 years ago ⁺¹
I am running your example code just to play around and keep getting an error:
TypeError: descriptor 'get_metric' requires a 'hdbscan.dist_metrics.DistanceMetric' object but received a 'str'
Unfortunately there is not much documentation on this, so it's hard to find fixes. Any help?
@jennifermew8386 7 years ago ⁺¹
How do you identify noise in HDBSCAN? How does the algorithm tell the difference between outliers and noise?
@ashishkannad3021 6 years ago ⁺²
The points that are not assigned to any cluster are the noise!
@KeshavDial 4 years ago ⁺³
For anyone who was looking for Christian Hennig's PyData talk: ua-cam.com/video/Mf6MqIS2ql4/v-deo.html