The Simplest Encoding You’ve Never Heard Of

Underfitted

3,610 views

Published Sep 5, 2024

COMMENTS • 31

@aiwithaz 1 year ago ⁺¹²
next video: boosting model accuracy by enigma encoding categorical features
@jespermikkelsen7553 1 year ago ⁺¹
This channel is so underrated.
@underfitted 1 year ago
Thanks!
@rajgurubhosale8680 6 months ago
Glad I found this video when I needed it the most! Thank you.
@edmundfreeman7203 1 year ago ⁺⁴
Target encoding has a lot of problems. For instance:
1) Build a data set of character strings AAA, AAB, ... ZZZ.
2) Randomly generate 0s and 1s, 25 per string.
3) Build a model with this.
Unless you are very lucky you'll get a strong model with little overfitting. Any technique that lets you build a strong model from random data is a bad idea.
Really, what you are doing is a very fancy way of putting the target variable into the model, which is a big no-no.
What you could conceivably do is build the target encoding on the test data only.
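The experiment described in this comment can be sketched in a few lines of Python. This is only a rough illustration under assumptions that are not in the comment: the "model" is a plain threshold on the encoded value, 20 of the 25 rows per string are used to build the encoding while 5 are held out, and the name `leakage_demo` is made up for this sketch.

```python
import itertools
import random
import string


def leakage_demo(rows_per_category=25, holdout_rows=5, seed=0):
    """Target-encode purely random labels and compare seen vs. held-out accuracy."""
    rng = random.Random(seed)
    # All three-letter strings AAA, AAB, ..., ZZZ (26**3 categories).
    categories = ["".join(c) for c in itertools.product(string.ascii_uppercase, repeat=3)]

    train_hits = train_total = 0
    test_hits = test_total = 0
    for _ in categories:
        # Random 0/1 targets: the category carries no real signal.
        labels = [rng.randint(0, 1) for _ in range(rows_per_category)]
        test, train = labels[:holdout_rows], labels[holdout_rows:]

        # Plain (unsmoothed) target encoding: mean target of the training rows.
        encoding = sum(train) / len(train)
        # Hypothetical "model": predict 1 when the encoded value exceeds 0.5.
        prediction = 1 if encoding > 0.5 else 0

        train_hits += sum(1 for y in train if y == prediction)
        train_total += len(train)
        test_hits += sum(1 for y in test if y == prediction)
        test_total += len(test)

    return train_hits / train_total, test_hits / test_total


train_acc, test_acc = leakage_demo()
print(f"accuracy on rows used for the encoding: {train_acc:.3f}")
print(f"accuracy on held-out rows:              {test_acc:.3f}")
```

On random labels, the accuracy on the rows that built the encoding should come out noticeably above 0.5, while the held-out accuracy stays close to 0.5. That gap is the target leakage the comment is warning about.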
@underfitted 1 year ago ⁺²
It does have a lot of problems, but that doesn't make it useless. Target encoding works very well in many different situations where One-Hot Encoding becomes problematic.
If you aren't careful, you can overfit with Target Encoding. That's where the smoothing part comes in.
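For readers wondering what "smoothing" can look like: the thread does not spell out the formula used in the video, so the sketch below assumes a common additive (m-estimate) form, where each category mean is pulled toward the global mean with a weight `m`. The function name and the toy values are illustrative only.

```python
from collections import defaultdict


def smoothed_target_encoding(categories, targets, m=10.0):
    """Return {category: smoothed mean target}.

    Assumed formula: (sum_of_targets + m * global_mean) / (count + m).
    A larger m pulls rare categories harder toward the global mean.
    """
    prior = sum(targets) / len(targets)  # global mean of the target
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    return {cat: (sums[cat] + m * prior) / (counts[cat] + m) for cat in counts}


# Made-up toy data: three rows, two categories.
weather = ["SUNNY", "SUNNY", "RAINY"]
happy = [1, 1, 0]
print(smoothed_target_encoding(weather, happy, m=0))  # no smoothing: raw means
print(smoothed_target_encoding(weather, happy, m=2))  # means shrunk toward 2/3
```

With `m=0` this reduces to the plain category mean. With a positive `m`, a category seen only once or twice can no longer take an extreme value such as 0.0 or 1.0, which is how smoothing limits overfitting on rare categories.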
@edmundfreeman7203 1 year ago ⁺¹
@@underfitted Give me a little bit. I owe you a concrete demonstration of what I am talking about, and I'll see if smoothing fixes the problem.
@edmundfreeman7203 1 year ago
@@underfitted I put together a video, ua-cam.com/video/4Zl-juDI2YM/v-deo.html, on why I think target encoding is an antipattern.
@underfitted 1 year ago ⁺⁴
Thanks for the video response. You make great points, and I generally agree with most of what you said. Some of the claims, though, carry less weight in practice. I've seen Target Encoding used in real-life situations with excellent results and no signs of overfitting. Of course, that doesn't make Target Encoding a silver bullet, and you can easily leak the target values if you aren't careful.
@edmundfreeman7203 1 year ago ⁺¹
@@underfitted What could work very well is using a past average of the target, instead of the modeling target.
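One possible reading of "a past average of the target" is sketched below: each row is encoded with the mean target of strictly earlier rows in the same category, so a row's own target never feeds its own encoding. This interpretation, the `prior` fallback for categories with no history, and the function name are all assumptions, not something stated in the comment.

```python
def past_mean_encode(categories, targets, prior=0.5):
    """Encode each row with the mean target of EARLIER rows of its category.

    Rows are assumed to be in chronological order. A row with no history
    for its category receives `prior` (an assumed fallback value).
    """
    sums, counts = {}, {}
    encoded = []
    for cat, y in zip(categories, targets):
        n = counts.get(cat, 0)
        # Use only what was seen before this row.
        encoded.append(sums[cat] / n if n else prior)
        # Update the running totals after encoding the row.
        sums[cat] = sums.get(cat, 0) + y
        counts[cat] = n + 1
    return encoded


# Made-up rows in time order.
print(past_mean_encode(["A", "A", "B", "A"], [1, 0, 1, 1]))
```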
@tshock22 1 year ago
Your production quality is next level. Much appreciated!
@nedafiroz514 1 year ago
Wonderful illustration, thank you so much
@Cosimao564 1 year ago
Every video I have seen from this channel is applicable to my chemometrics work, thank you
@user-vb9jo8xg4s 2 months ago
Fantastic!!!!
@openroomxyz 1 year ago
Thanks for creating these videos in any case.
@fikriansyahadzaka6647 1 year ago ⁺¹
This might not be related to your video, but could you also cover ChatGPT? The internet went crazy in the past 2 weeks because of it. It will be interesting to understand the history of ChatGPT and how it works.
@underfitted 1 year ago ⁺¹
I did a video last week that talks about ChatGPT.
ua-cam.com/video/l_oHZT6yTEs/v-deo.html
@fikriansyahadzaka6647 1 year ago
Ah I see, I missed that video. You are so fast at keeping up with current trends. Keep up the good work!
@openroomxyz 1 year ago
Maybe you could create a video about how to self-learn AI without a degree: the order of things to learn, sources to learn from, how long it would take (approximately), and how you could go about monetizing the knowledge and skill.
@prajwalsyallur712 1 year ago
Thanks for this useful video!🙂
@viswarupmisra 1 year ago
Can you tell me about the camera you are using, the software you use to edit videos, and your setup in general? And how do you create the effects in your videos?
@underfitted 1 year ago ⁺¹
I'm using a Sony FX3. Final Cut Pro. The effects are from LenoFX.
@philtoa334 1 year ago
Nice video.
@afterwork260 1 year ago
How about we just delete the outliers first and then continue with target encoding?
@bobdowling6932 1 year ago ⁺¹
I don’t understand this at all. What’s to stop two different weathers getting the same score because you were happy on the same number of days with each of those weathers?
@underfitted 1 year ago
A couple of things:
1. Keep in mind that this technique is effective with enough data. The example in the video is using 7 rows.
2. Every row with the same value "SUNNY" will get the same encoding. That's precisely the goal. They already have the same value ("SUNNY"), the difference is that they will be getting a numerical value.
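A minimal sketch of the plain (unsmoothed) encoding discussed here may help. The seven rows below are invented for illustration and are not the data from the video; the function name is hypothetical as well.

```python
from collections import defaultdict


def target_encode(categories, targets):
    """Return (mapping, encoded_rows) using the mean target per category."""
    totals = defaultdict(lambda: [0.0, 0])  # category -> [sum, count]
    for cat, y in zip(categories, targets):
        totals[cat][0] += y
        totals[cat][1] += 1
    mapping = {cat: s / n for cat, (s, n) in totals.items()}
    # Every row with the same category receives the same number.
    return mapping, [mapping[cat] for cat in categories]


# Invented toy data: 7 rows, 1 = happy, 0 = not happy.
weather = ["SUNNY", "RAINY", "SUNNY", "CLOUDY", "RAINY", "SUNNY", "CLOUDY"]
happy = [1, 1, 1, 0, 0, 0, 1]

mapping, encoded = target_encode(weather, happy)
print(mapping)
print(encoded)
```

In this toy data, all SUNNY rows share one value, and RAINY and CLOUDY happen to land on the same value because their happy rates match. Nothing in the technique prevents such ties; the encoding only records each category's relationship to the target, not its identity.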
@curtisnewton895 1 year ago ⁺¹
But what if the same text gets the same amount of associated values in another column? They will get the same numeric label.
Why not just divide 1 by the number of text labels and multiply that ratio by the line index?
So simple.
@underfitted 1 year ago
Hey Curtis, I'm having problems understanding the situation you mention: "Same text gets the same amount of associated values in another column." Happy to discuss more. Feel free to hit me up @svpino on Twitter.
@diegofabianledesmamotta5139 1 year ago
If I understand correctly, you mean that two categories could end up with the same associated value, right?
That doesn't seem like a big problem to me. It sounds like you're losing information, but the goal is to make a predictive model, so if it doesn't prevent the model from making good predictions, I think it's OK. @Underfitted do you think that's an issue?
@jackcat3745 1 year ago ⁺¹
He is not smart man.
