More in this series 👇
- Introduction to TDA: ua-cam.com/video/fpL5fMmJHqk/v-deo.html
- Persistent Homology: ua-cam.com/video/5ezFcy9CIWE/v-deo.html
Thanks for following up in the series! Really enjoying it 👌🏾
Thanks for watching!
Thank you for introducing this idea in such a friendly way!
Thanks for producing this - very well done
Thanks! I'm glad you liked it :)
We need persistent homology! Thank you!
Coming very soon!
Much needed, TDA w/o notation nightmare 😂
Glad it helped!
Great TDA example video. However, returns should be estimated using original adjusted closed prices first and then standardized (if necessary), instead of being estimated from standardized prices.
Good call out!
great video
Thanks for watching!
Great video. Do you have any future videos planned illustrating similar pipelines for multimodal or mixed datasets?
That's a good idea. Are there any specific use cases you're interested in?
7:22 data-np.mean... or (data-np.mean...)/np.std(... ?
(data - np.mean(data))/np.std(data)
Here I am "normalizing" the data.
@@ShawhinTalebi Could've sworn I also wrote an answer to this, about how that's standardization and not normalization, but I might've also forgotten to press 'Send'/'Reply'
normalization would be (data - data.min()) / (data.max() - data.min()) (pseudocode)
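To make the difference concrete, here's a minimal NumPy sketch (the values are just illustrative, not from the video):

import numpy as np

data = np.array([2.0, 4.0, 6.0, 8.0])

# standardization: zero mean, unit variance (what the video's snippet does)
standardized = (data - np.mean(data)) / np.std(data)

# normalization (min-max): rescaled to the range [0, 1]
normalized = (data - data.min()) / (data.max() - data.min())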
Hi, thanks for this video, it was very helpful. Where do you set the cover in the Python code?
Glad it was helpful! You can define the cover using the kmapper.Cover() class, then pass it into the .map() method. Since I do not specify it explicitly here, it uses the default parameters of n_cubes=10, perc_overlap=0.5, and limits=None.
Here are some documentation links.
cover - kepler-mapper.scikit-tda.org/en/latest/reference/stubs/kmapper.Cover.html#kmapper.Cover
fit_transform - kepler-mapper.scikit-tda.org/en/latest/reference/stubs/kmapper.KeplerMapper.html#kmapper.KeplerMapper.fit_transform
map - kepler-mapper.scikit-tda.org/en/latest/reference/stubs/kmapper.KeplerMapper.html#kmapper.KeplerMapper.map
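For example, a minimal sketch based on the documentation above (the data and parameter values are placeholders, not from the video's code):

import numpy as np
import kmapper as km

data = np.random.rand(500, 4)                      # placeholder data
mapper = km.KeplerMapper(verbose=0)
lens = mapper.fit_transform(data, projection=[0])  # project onto the first feature

cover = km.Cover(n_cubes=10, perc_overlap=0.5)     # same values as the defaults, but explicit
graph = mapper.map(lens, data, cover=cover)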
When someone says "high-dimensional data", like the "500-dimensional dataset" you mentioned as an example, what do they mean exactly? Does it mean that each data point has 500 components or features to it?
Great question! In data science, dimensions are typically synonymous with features, i.e. the attributes that define individual data points. This is because given N features, we can view data points as living in an N-dimensional space.
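For example, in NumPy a "500-dimensional" dataset is just an array with 500 columns (hypothetical data):

import numpy as np

X = np.random.rand(1000, 500)  # 1000 data points, each with 500 features
print(X.shape)                 # (1000, 500) -> points living in a 500-dimensional space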
I like it
In the Mapper algorithm, I am not sure what the projecting, covering, and clustering of the pre-image have to do with each other. Basically, I'm wondering: couldn't the very first step be some clustering algorithm, and then you could immediately make a graph from that? I'm just not sure what projecting onto lower dimensions, covering, etc. have to do with the eventual clustering step.
Another great question. Here's my understanding.
Mapper generates a graph whose nodes correspond to clusters. While we can get these without running through steps 2-4 on slide 3, we still need to connect these nodes in some way to generate a graph.
This is where the cover comes in. Essentially, Step 3 provides us with a second clustering strategy (i.e. green vs red), which we can use to define the links between the clusters found in Step 4. Namely, two clusters from Step 4 are connected by an edge if they share members according to the Step 2 subsets.
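Here's a rough sketch of that linking rule (not the kepler-mapper internals; the cluster contents are hypothetical):

from itertools import combinations

def mapper_edges(clusters):
    # connect two clusters if they share at least one data point
    edges = []
    for (id_a, pts_a), (id_b, pts_b) in combinations(clusters.items(), 2):
        if pts_a & pts_b:  # shared members come from overlapping cover subsets
            edges.append((id_a, id_b))
    return edges

clusters = {"red_0": {0, 1, 2}, "red_1": {7, 8}, "green_0": {2, 3, 4}}
print(mapper_edges(clusters))  # [('red_0', 'green_0')]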
Fascinating in itself, thanks. But is it also useful? It would help to have some literature that covers the necessary background knowledge. My topology book was printed in the 1960s.
While there have been some interesting use cases, Mapper is still in its infancy. I share more resources in the video description and in the article for this video: medium.datadriveninvestor.com/the-mapper-algorithm-d0842f926658?sk=4b78e5f8f2e8f390b919e8285a97871e
Good video, but the volume is way too low!! When the YouTube ads pop up it's dangerous hehe
Forgive my mediocre editing. Hopefully the quality of my more recent content is better 😅
@@ShawhinTalebi editing is great! just a constructive comment for your next video. Greetings
@@luis2arm Thanks Luis, I appreciate the feedback
It looks like Research Rabbit also uses a similar way to show its search results.
That’s really interesting, I wonder if they use Mapper in the backend
And could you recommend some books or references?
Yes, there are a couple in the video description under "Resources I found helpful".
I would like some help from you.
Happy to help however I can. You can message me here: shawhint.github.io/connect.html
The volume is too low, speak up!
Sorry about that! Hopefully my other videos have better levels 😅