Talks # 4: Sebastien Fischman - Pytorch-TabNet: Beating XGBoost on Tabular Data Using Deep Learning
- Published Oct 14, 2024
- Talks # 4:
Speaker: Sebastien Fischman ( / sebastienfischman )
Title: Pytorch-tabnet: Beating XGBoost on tabular data with deep learning?
Abstract: #DeepLearning has set new benchmarks for Computer Vision, NLP, Speech, and Reinforcement Learning in the past few years.
However, tabular data competitions are still dominated by gradient boosted tree (GBT) libraries like XGBoost, LightGBM and CatBoost.
TabNet is a promising new deep learning architecture based on sequential attention, proposed by Arik & Pfister, that aims to close the gap between GBTs and neural networks.
Pytorch-tabnet is an open source library that provides a scikit-like interface for training a TabNetClassifier or TabNetRegressor. Its ease of use allows any developer to quickly try a #TabNet architecture on any dataset, hopefully setting new benchmarks.
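A minimal sketch of that scikit-like interface on synthetic data (the shapes, split, and max_epochs below are illustrative choices, not values from the talk):

```python
import numpy as np
from pytorch_tabnet.tab_model import TabNetClassifier

# toy tabular data: 1000 rows, 10 numeric features, binary target
X = np.random.rand(1000, 10).astype(np.float32)
y = np.random.randint(0, 2, size=1000)
X_train, X_valid = X[:800], X[800:]
y_train, y_valid = y[:800], y[800:]

clf = TabNetClassifier()  # TabNetRegressor exposes the same interface
clf.fit(
    X_train, y_train,
    eval_set=[(X_valid, y_valid)],  # used for early stopping
    max_epochs=50,
)
preds = clf.predict(X_valid)
```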
Bio: Worked as a Data Scientist in France and Australia on very different topics:
user segmentation based on shopping habits for Woolworths @Quantium
real-time bidding advertising @Tradelab
stock market predictions based on sentiment analysis of social media @SESAMm
AutoML platform with explainable AI @DreamQuark
now working on early-stage cancer detection on new OCT-3D images @DamaeMedical
To give a talk in Talks, fill out this form here: bit.ly/Abhishe...
----
Follow me on:
Twitter: / abhi1thakur
LinkedIn: / abhi1thakur
Kaggle: kaggle.com/abh...
Slides: www.slideshare.net/SebastienFischman/tab-netpresentation
GitHub: github.com/dreamquark-ai/tabnet
Thank you Sebastien for the great Talk!
2:55 Tabnet Paper introduction
4:20 Main ideas from Tabnet
7:21 Architecture
8:55 feature transformer block
10:51 attentive transformer block
14:25 individual explainability intro
15:10 self-supervised learning (pre-training)
17:10 pytorch implementation intro (19:18 fastai wrapper available)
20:59 demo from a notebook
29:34 Kaggle competition notebooks using Tabnet Pytorch
29:55 Code base architecture
32:18 tricky implementation tips!
34:36 future work
40:52 QA session
41:09 explainability
42:30 computing resource
43:50 TabNet parameters explained
47:55 feature selection (from the sparse masks; see the sketch below)
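The explainability stamps above (14:25, 41:09, 47:55) map onto two library calls. A hedged sketch, assuming the fitted clf from the earlier example and the explain() / feature_importances_ API documented in the repo:

```python
# Assumes `clf`, `X_valid` from the TabNetClassifier sketch above.
global_importances = clf.feature_importances_  # one weight per input feature

explain_matrix, masks = clf.explain(X_valid)
# explain_matrix: (n_samples, n_features) aggregated feature usage per sample
# masks: dict of per-decision-step (n_samples, n_features) attention masks;
# near-zero entries mean the feature was not selected at that step
print(explain_matrix.shape, len(masks))
```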
After watching this video, I jumped right into implementing it on some of the kaggle competitions and my research. LGB still works better than Tabnet in most of my implementations. Pytorch-Tabnet is really user-friendly tho if you are new to deep learning for tabular data.
Hi, Tony. Do you know by how much LGB outperforms TabNet, and on what kinds of tasks LGB beats TabNet? Did you tune TabNet's parameters?
Hi from South Africa... have been using TabNet for 2 years now in RStudio... works very well... will give pytorch-tabnet a try.
Actually an in-depth session, and Sebastien answered most of the queries. Great work!
This is the best Talk Session! Learnt a lot and a great explanation. Thanks Abhishek and Sebastien!
Thank you, excellent work for both of you!
Thanks so much guys! It's a perfect architecture (and lecturer). I implemented it easily in a couple of days, works great!
Thank you so much Abhishek for this. I am also extremely happy to see my kernel and my name in your video, even if only for a flash :)
Great talk.
You mentioned being uncertain about the origin of the sqrt(0.5) factor. I believe the authors use it because, given two IID random variables X and Y,
Var(sqrt(0.5) X + sqrt(0.5) Y) = 0.5 Var(X) + 0.5 Var(Y) = Var(X).
In the context of the GLU summation, it is a heuristic to ensure that the variance does not increase.
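A quick numerical check of this argument (a throwaway numpy sketch, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1_000_000)  # IID standard normals
y = rng.normal(size=1_000_000)

z = np.sqrt(0.5) * x + np.sqrt(0.5) * y
print(np.var(x), np.var(z))  # both ~1.0: the scaled sum preserves variance
```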
This is great! Thank you both.
To give a talk in Talks, fill out this form here: bit.ly/AbhishekTalks
Can you please post the link of the code in the description.
@davidvictor7124 All the code is available here: github.com/dreamquark-ai/tabnet
I'll also add all the links and the presentation on this same page, so this is the place to go for any information!
You're doing really great, thanks a lot for such awesome content.
It's strange that the authors call these blocks "transformers" because (if I understand correctly) there is no QKV-style attention here.
I have the same confusion. Do you have a clue?
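For context, the paper's "attentive transformer" is a learned sparse mask over the input features rather than QKV self-attention, which is likely the source of the naming confusion. A rough PyTorch sketch of the idea (dimensions are assumptions, and softmax stands in for the paper's sparsemax):

```python
import torch
from torch import nn

class AttentiveTransformerSketch(nn.Module):
    """Rough sketch of TabNet's attentive transformer: it outputs a mask
    over the input features at each decision step; no Q/K/V matrices."""

    def __init__(self, attention_dim: int, input_dim: int):
        super().__init__()
        self.fc = nn.Linear(attention_dim, input_dim, bias=False)
        self.bn = nn.BatchNorm1d(input_dim)

    def forward(self, priors: torch.Tensor, a: torch.Tensor) -> torch.Tensor:
        # priors encode how much each feature was already used in earlier steps
        logits = self.bn(self.fc(a)) * priors
        # the paper applies sparsemax here to get a truly sparse mask;
        # softmax is only a standard-library stand-in
        return torch.softmax(logits, dim=-1)
```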
Amazing Talk , thanks for sharing , your channel is best :)
Great talk, thanks Abhishek and Sebastien!!! You mentioned a copy of the book; how can I get that? Please share the link.
Sebastien explains it at the end of the talk.
9:17 should be "element-wise multiplication" I guess
Great Session
Really good explanations...
I applied this to the Santander Classification Kaggle dataset and got 81% accuracy without any preprocessing.
Hi Abhishek, I was trying to buy your book but the link said it will be available on 15 July. How can I buy it today? You held a session with Krish...
Hey, do you know how to monitor and fit TabNet based on a metric other than accuracy, say roc_auc_score? I tried looking for this on GitHub, couldn't find it :/
Default monitoring for binary classification is already roc_auc_score; for multi-class it's accuracy; for regression it's MSE. An easy way of changing the early-stopping metric still needs to be added!
@sebastienfischman8671 Is it possible now to use a customized loss function?
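For reference, later pytorch-tabnet releases expose both knobs in fit(): eval_metric takes built-in metric names or Metric subclasses, and loss_fn takes any torch-style loss. A sketch following the repo's README pattern (verify against your installed version):

```python
import torch
from pytorch_tabnet.metrics import Metric
from sklearn.metrics import roc_auc_score

class CustomAUC(Metric):
    def __init__(self):
        self._name = "custom_auc"
        self._maximize = True  # early stopping keeps the highest value

    def __call__(self, y_true, y_score):
        return roc_auc_score(y_true, y_score[:, 1])

# Assumes clf, X_train, y_train, X_valid, y_valid from the earlier sketch.
clf.fit(
    X_train, y_train,
    eval_set=[(X_valid, y_valid)],
    eval_metric=[CustomAUC],                    # monitor AUC, not accuracy
    loss_fn=torch.nn.functional.cross_entropy,  # any (input, target) loss
)
```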
Where to get the notebooks?
added in pinned comments