Talks # 4: Sebastien Fischman - Pytorch-TabNet: Beating XGBoost on Tabular Data Using Deep Learning
- Published Oct 14, 2024
- Talks # 4:
Speaker: Sebastien Fischman ( / sebastienfischman )
Title: Pytorch-tabnet: Beating XGBoost on tabular data with deep learning?
Abstract: #DeepLearning has set new benchmarks for Computer Vision, NLP, Speech, and Reinforcement Learning in the past few years.
However, tabular data competitions are still dominated by gradient boosted tree (GBT) libraries like XGBoost, LightGBM and CatBoost.
TabNet is a promising new deep learning architecture based on sequential attention, proposed by Arik & Pfister, that aims to close the gap between GBTs and neural networks.
Pytorch-tabnet is an open source library that provides a scikit-like interface for training a TabNetClassifier or TabNetRegressor. Its ease of use allows any developer to quickly try a #TabNet architecture on any dataset, hopefully setting new benchmarks.
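A minimal sketch of that scikit-like interface on synthetic data (the shapes, split, and max_epochs below are illustrative choices, not values from the talk):

```python
import numpy as np
from pytorch_tabnet.tab_model import TabNetClassifier

# toy tabular data: 1000 rows, 10 numeric features, binary target
X = np.random.rand(1000, 10).astype(np.float32)
y = np.random.randint(0, 2, size=1000)
X_train, X_valid = X[:800], X[800:]
y_train, y_valid = y[:800], y[800:]

clf = TabNetClassifier()  # TabNetRegressor exposes the same interface
clf.fit(
    X_train, y_train,
    eval_set=[(X_valid, y_valid)],  # used for early stopping
    max_epochs=50,
)
preds = clf.predict(X_valid)
```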
Bio: Worked as a Data Scientist in France and Australia on very different topics:
user segmentation based on shopping habits for Woolworths @Quantium
real-time bidding advertising @Tradelab
stock market predictions based on sentiment analysis of social media @SESAMm
AutoML platform with explainable AI @DreamQuark
now working on early-stage cancer detection on new OCT-3D images @DamaeMedical
To give a talk in Talks, fill out this form here: bit.ly/Abhishe...
----
Follow me on:
Twitter: / abhi1thakur
LinkedIn: / abhi1thakur
Kaggle: kaggle.com/abh...
Slides: www.slideshare.net/SebastienFischman/tab-netpresentation
GitHub: github.com/dreamquark-ai/tabnet
Thank you Sebastien for the great Talk!
2:55 Tabnet Paper introduction
4:20 Main ideas from Tabnet
7:21 Architecture
8:55 feature transformer block
10:51 attentive transformer block
14:25 individual explainability intro
15:10 self-supervised learning (pre-training)
17:10 pytorch implementation intro (19:18 fastai wrapper available)
20:59 demo from a notebook
29:34 Kaggle competition notebooks using Tabnet Pytorch
29:55 Code base architecture
32:18 tricky implementation tips!
34:36 future work
40:52 QA session
41:09 explainability
42:30 computing resource
43:50 TabNet parameters explained
47:55 feature selection (from the sparse masks; see the sketch below)
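The explainability stamps above (14:25, 41:09, 47:55) map onto two library calls. A hedged sketch, assuming the fitted clf from the earlier example and the explain() / feature_importances_ API documented in the repo:

```python
# Assumes `clf`, `X_valid` from the TabNetClassifier sketch above.
global_importances = clf.feature_importances_  # one weight per input feature

explain_matrix, masks = clf.explain(X_valid)
# explain_matrix: (n_samples, n_features) aggregated feature usage per sample
# masks: dict of per-decision-step (n_samples, n_features) attention masks;
# near-zero entries mean the feature was not selected at that step
print(explain_matrix.shape, len(masks))
```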
After watching this video, I jumped right into implementing it on some of the kaggle competitions and my research. LGB still works better than Tabnet in most of my implementations. Pytorch-Tabnet is really user-friendly tho if you are new to deep learning for tabular data.
Hi, Tony. Do you know by how much LGB outperforms TabNet, and on what kinds of tasks LGB beats TabNet? Did you tune TabNet's parameters?
Hi from South Africa... have been using TabNet for 2 years now in RStudio... works very well... will give pytorch-tabnet a try.
Actually an in-depth session, and Sebastien answered most of the queries. Great work!
This is the best Talk Session! Learnt a lot and a great explanation. Thanks Abhishek and Sebastien!
Thank you, excellent work for both of you!
Thanks so much guys! It's a perfect architecture (and lecturer). I implemented it easily in a couple of days, works great!
Thank you so much Abhishek for this. I am also extremely happy to see my kernel and my name in your video, even if only for a flash :)
Great talk.
You mentioned being uncertain about the origin of the sqrt(0.5) factor. I believe the authors use it because, given two IID random variables X and Y,
Var(sqrt(0.5) X + sqrt(0.5) Y) = 0.5 Var(X) + 0.5 Var(Y) = Var(X).
In the context of the GLU summation, it is a heuristic to ensure that the variance does not increase.
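A quick numerical check of this argument (a throwaway numpy sketch, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1_000_000)  # IID standard normals
y = rng.normal(size=1_000_000)

z = np.sqrt(0.5) * x + np.sqrt(0.5) * y
print(np.var(x), np.var(z))  # both ~1.0: the scaled sum preserves variance
```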
This is great! Thank you both.
To give a talk in Talks, fill out this form here: bit.ly/AbhishekTalks
Can you please post the link of the code in the description.
@davidvictor7124 All the code is available here: github.com/dreamquark-ai/tabnet
I'll also add all the links and the presentation on this same page, so this is the place to go for any information!
You're doing really great, thanks a lot for such awesome content.
It's strange that the authors call these blocks "transformers" because (if I understand correctly) there is no QKV-style attention here.
I have the same confusion. Do you have a clue?
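For context, the paper's "attentive transformer" is a learned sparse mask over the input features rather than QKV self-attention, which is likely the source of the naming confusion. A rough PyTorch sketch of the idea (dimensions are assumptions, and softmax stands in for the paper's sparsemax):

```python
import torch
from torch import nn

class AttentiveTransformerSketch(nn.Module):
    """Rough sketch of TabNet's attentive transformer: it outputs a mask
    over the input features at each decision step; no Q/K/V matrices."""

    def __init__(self, attention_dim: int, input_dim: int):
        super().__init__()
        self.fc = nn.Linear(attention_dim, input_dim, bias=False)
        self.bn = nn.BatchNorm1d(input_dim)

    def forward(self, priors: torch.Tensor, a: torch.Tensor) -> torch.Tensor:
        # priors encode how much each feature was already used in earlier steps
        logits = self.bn(self.fc(a)) * priors
        # the paper applies sparsemax here to get a truly sparse mask;
        # softmax is only a standard-library stand-in
        return torch.softmax(logits, dim=-1)
```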
Amazing Talk , thanks for sharing , your channel is best :)
Great talk, thanks Abhishek and Sebastien!!! You mentioned a copy of the book; how can I get that? Please share the link.
Sebastien explains it at the end of the talk.
9:17 should be "element-wise multiplication" I guess
Great Session
Really good explanations...
I applied this to the Santander Classification Kaggle dataset and got 81% accuracy without any preprocessing.
Hi Abhishek, I was trying to buy your book but the link said it will be available on 15 July. How can I buy it today? You held a session with Krish...
Hey, do you know how to monitor and fit TabNet based on a metric other than accuracy, say roc_auc_score? I tried looking for this on GitHub, couldn't find it :/
Default monitoring for binary classification is already roc_auc_score; for multi-class it's accuracy; for regression it's MSE. An easy way of changing the early-stopping metric still needs to be added!
@sebastienfischman8671 Is it possible now to use a customized loss function?
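For reference, later pytorch-tabnet releases expose both knobs in fit(): eval_metric takes built-in metric names or Metric subclasses, and loss_fn takes any torch-style loss. A sketch following the repo's README pattern (verify against your installed version):

```python
import torch
from pytorch_tabnet.metrics import Metric
from sklearn.metrics import roc_auc_score

class CustomAUC(Metric):
    def __init__(self):
        self._name = "custom_auc"
        self._maximize = True  # early stopping keeps the highest value

    def __call__(self, y_true, y_score):
        return roc_auc_score(y_true, y_score[:, 1])

# Assumes clf, X_train, y_train, X_valid, y_valid from the earlier sketch.
clf.fit(
    X_train, y_train,
    eval_set=[(X_valid, y_valid)],
    eval_metric=[CustomAUC],                    # monitor AUC, not accuracy
    loss_fn=torch.nn.functional.cross_entropy,  # any (input, target) loss
)
```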
Where to get the notebooks?
added in pinned comments