Deep Learning for Tabular Data: A Bag of Tricks | ODSC 2020

DataRobot

14,599 views

Published on Jul 21, 2024
Jason McGhee, Senior Machine Learning Engineer at DataRobot, has been spending time applying deep learning and neural networks to tabular data. Although deep learning can be challenging to apply, his research shows how valuable it can be for tabular datasets. In this video (adapted from his presentation at ODSC Boston 2020), Jason shares some important techniques for implementing deep learning on heterogeneous tabular data. Learn more about Jason’s findings and ask him questions at his DataRobot Community post: community.datarobot.com/t5/ai...
Table of Contents
Motivation: 0:15
Impute missing values: 1:37
Prepare categoricals, text, and numerics: 2:49, 3:10, 3:31
Properly validate: 3:54
Establish a benchmark: 5:24
Start with a low capacity network: 6:10
Determine output activation and loss function for classification and regression: 7:17, 8:26
Determine hidden activation: 9:46
Choose batch size: 10:57
Build learning rate schedule: 12:02
Determine number of epochs: 14:35
Track and interpret regression predictions: 15:30
Track metric and/or loss: 16:09
Track and interpret classification predictions: 16:45
Benchmark the network: 17:11
Dealing with discontinuities: 18:16
Tuning the network: 19:31
Handling overfitting vs. underfitting: 20:41
All tricks in one place: 21:35
Music for this video: www.bensound.com.
Stay connected with DataRobot!
Blog: blog.datarobot.com/
Community: community.datarobot.com/
Twitter: / datarobot
LinkedIn: / datarobot
Facebook: / datarobotinc
Instagram: / datarobot
Science & Technology

COMMENTS • 33

@markryan2475 3 years ago ⁺³
This is awesome - so glad to see some serious, methodical work on this.
@michaeljuhasz1162 4 years ago ⁺³
Thank you for this. As a materials science researcher dabbling in applying ML techniques to my datasets, this is great.
@briantroy9403 3 years ago ⁺¹
I’ve also noticed a lack of emphasis on tabular data with respect to NNs. This is a great presentation and very informative. Thanks for putting it together.
@alirezaamani2027 1 year ago
Very interesting point on the suggested loss functions based on the distribution of the target variable. Learned a lot. Thank you.
@ZachMeador 2 years ago ⁺¹
Great use of Grant Sanderson's graphics library
@parmarsuraj99 4 years ago ⁺¹
This helped me a lot. Thank you 🙏😍
@sayedathar2507 2 years ago
Thanks for sharing, this is golden advice!
@rupjitchakraborty8012 1 year ago
Amazing amazing video, I learnt so much. Thank you
@tahamagdy4932 3 years ago ⁺⁶
Great, but the music is very loud.
@venk8t 3 years ago ⁺¹
The background music is a distraction and makes it hard to listen.
@comunedipadova1790 9 months ago
incredible, how to ruin a video
@hungdoan9148 2 years ago
Amazing video!
@vibhatha 3 years ago
Nicely done 👌
@marekglowacki2607 3 years ago ⁺⁴
3Blue1Brown style :)
@robimalco 3 years ago
This is high quality content
@MiguelRaggi 2 years ago ⁺¹
By the way, one-hot encoding and making an embedding are the same thing, except embedding is faster. What do you do with your one-hot encoding? You multiply by a matrix. What happens when you multiply a vector of 0's with only one 1 by a matrix? That's right, you basically choose a column of the matrix. And that column is the embedding.
@Houshalter 6 months ago
My thoughts too. The only exception I could think of is if the software he used treated embedding differently somehow. Maybe regularization or dropout wasn't applied to the embeddings. Maybe normalization on the one hot columns had some beneficial effect.
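The equivalence described in the two comments above can be checked numerically. The following is a minimal NumPy sketch, not anything shown in the video: the matrix `W`, its shape, and the helper names `one_hot`, `via_one_hot`, and `via_lookup` are all assumptions made for illustration. It only shows that multiplying a weight matrix by a one-hot vector selects one column, which is the same result as indexing that column directly.

```python
import numpy as np

# Hypothetical sizes chosen only for illustration.
num_categories, embedding_dim = 5, 3

rng = np.random.default_rng(0)
# Hypothetical weight matrix with one column per category.
W = rng.normal(size=(embedding_dim, num_categories))


def one_hot(index, size):
    """Return a vector of zeros with a single 1.0 at position `index`."""
    v = np.zeros(size)
    v[index] = 1.0
    return v


def via_one_hot(index):
    """Dense route: multiply the matrix by the one-hot vector."""
    return W @ one_hot(index, num_categories)


def via_lookup(index):
    """Embedding route: read the matching column directly."""
    return W[:, index]


# Both routes give the same vector for every category.
for idx in range(num_categories):
    assert np.allclose(via_one_hot(idx), via_lookup(idx))
```

The lookup avoids the multiplications by zero, which is presumably the "faster" part of the original comment. As the reply notes, a given framework may still treat the two paths differently in practice (for example in how regularization or normalization is applied), and this sketch does not model that.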
@idabagusdiazagasatya9900 4 years ago ⁺¹
This is gold, keep it up.
@parmarsuraj99 4 years ago
Yes it is
@mohamedesdairi3044 3 years ago ⁺¹
You helped me A LOT, amazing content and perfect presentation, keep it up
@DataRobot 3 years ago
We are! Reply with topics you'd like to see covered.
@mohamedesdairi3044 3 years ago
@@DataRobot Personally, I think topics like solving challenges in fitting neural networks to a small dataset, debugging why your NN can't perform well on a specific task, and active learning are subjects that don't get the attention they deserve, just like "Deep Learning for Tabular Data", which you did a fantastic job of covering in this video. So thanks again.
@DataRobot 3 years ago
@@mohamedesdairi3044 Hey, would you please ask your question in the DataRobot Community? You'll find a lot of community members who are interested in posting about DataRobot, machine learning, and data science subjects. community.datarobot.com/t5/platform/bd-p/platform-discussions-1
@TheOraware 2 years ago
How to spot a random subset of data from a given set of data?
@mehdiozel517 2 years ago
Thanks, really cool video! But I have a question. You said "set the batch size 1% of dataset". Is this advice meant only for deep learning on tabular datasets, or for other types of data too?
@brianray9715 4 years ago ⁺³
I dig the background music
@anon44492 3 years ago
Great work bro!
@username42 3 years ago
What about 1D sensory data collected from physical and chemical instruments? I know we can still treat them as tabular data, but what about when we have thousands of variables and only hundreds of samples, and the variables are not single identities but are sort of grouping features? How should the data analysis be treated?
@DataRobot 3 years ago
Hey, would you please ask your question in the DataRobot Community? You'll find a lot of community members who are interested in posting about DataRobot, machine learning, and data science subjects. community.datarobot.com/t5/platform/bd-p/platform-discussions-1
@jivan476 1 year ago
Nice vid, but tbh this sounds like a crazy amount of work for something that will only ever tangentially approach boosted trees performance on most tabular datasets
@ThanhNguyen-rz4tf 1 year ago
What is the paper you mentioned?
@morowenka9718 2 years ago
amazing video 🥵
@swarajshinde3950 3 years ago
