Multi-Armed Bandits and A/B Testing
- Published Jul 1, 2024
- Today I'm talking to Sandeep, a PhD student studying Information and Decision Sciences at the University of Minnesota. We talk Multi-Armed Bandits, A/B Testing, and the key differences between the two.
Check out Sandeep's website: sandeepgangarapu.com/
Want to be featured in the next mock interview video? Apply here: airtable.com/shrdQrwKK7xxGLm6l
👉 Subscribe to my data science channel: bit.ly/2xYkyUM
Use the code "datasciencejay" and get 10% off data science interview prep 🔥 : www.interviewquery.com/pricin...
❓ Check out our data science courses: www.interviewquery.com/course...
🔑 Get professional coaching here: www.interviewquery.com/coachi...
🐦 Follow us on Twitter: / interview_query
More from Jay:
Read my personal blog: datastream.substack.com/
Follow me on Linkedin: / jay-feng-ab66b049
Find me on Twitter: / datasciencejay
Learnt something today, thanks! I think for the last example of unlearnai, they will still need to test a few real people with a placebo to validate their model's performance. With a proven working model, they can then test mainly with the real drug for side effects, etc.
Such a great talk sandy! So proud of you
Wow, this was such a great convo! Thanks Sandeep for sharing your wisdom, going to be checking out your other work!
Great content!!
This was a great video!
Hello Sandeep, thank you for the quick rundown. Do you mind telling us how to connect or discuss with you after this session?
Follow-up: so I feel that the Multi-Armed Bandit is a sort of optimization problem under the constraint that it is quite hard and ineffective to perform A/B testing? Do you agree with that notion? Let me know your thoughts.
It's not often you hear a researcher give a high-level talk that regular folks can understand. Great talk. Enjoyed it thoroughly. About that $20 though, what's the algo haha
At the moment, UCB (Upper Confidence Bound) is often used to maximize utility return. But the overall problem is that in a casino the reward is not simply a single state; it is far more complex than the simple one-state bandit setting. The casino example is a mere oversimplification.
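For readers unfamiliar with the UCB the comment mentions, here is a minimal sketch of the classic UCB1 rule on simulated Bernoulli arms. The arm rates, horizon, and `pull_arm` callback are illustrative assumptions, not anything from the talk: each arm is played once, then the arm maximizing mean reward plus an exploration bonus is chosen.

```python
import math
import random

def ucb1(pull_arm, n_arms, horizon):
    """UCB1: after one initial pull per arm, pick the arm maximizing
    empirical mean + sqrt(2 * ln(t) / n_i)."""
    counts = [0] * n_arms      # pulls per arm
    sums = [0.0] * n_arms      # cumulative reward per arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1        # initialization: play each arm once
        else:
            arm = max(
                range(n_arms),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2 * math.log(t) / counts[i]),
            )
        reward = pull_arm(arm)
        counts[arm] += 1
        sums[arm] += reward
    return counts, sums

# Usage: three Bernoulli arms with hidden win rates (assumed values).
random.seed(0)
rates = [0.3, 0.5, 0.7]
counts, sums = ucb1(
    lambda i: 1.0 if random.random() < rates[i] else 0.0,
    n_arms=3, horizon=2000,
)
# The best arm (rate 0.7) should accumulate the most pulls.
```

The exploration bonus shrinks as an arm is pulled more, which is exactly the exploit-versus-explore trade-off that distinguishes bandits from a fixed-split A/B test.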
What's Sandeep's full name/ linkedin
You can't use multi-armed bandits in online experimentation because they cause return-user bias; MABs can only be used once per user. The problem is that bandit machines have a fixed probability of payout, whilst a website user's probability of buying something increases over time. This means that if users are switched into a new variation, that new variation is more likely to record a sale: a flawed experiment!
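The return-user bias the comment describes can be illustrated with a toy simulation. Everything here is an assumption for illustration: a user's conversion probability is modeled as rising with visit count, and a naive adaptive scheme moves everyone from variant A to variant B halfway through their visits. The two variants are identical, yet B measures a higher rate purely because it sees later, "warmer" visits.

```python
import random

random.seed(1)

def purchase_prob(visit):
    # Assumed model: conversion rises with familiarity, capped at 0.5.
    return min(0.05 * visit, 0.5)

n_users, visits_per_user = 5000, 10
conversions = {"A": 0, "B": 0}
exposures = {"A": 0, "B": 0}

for _ in range(n_users):
    for visit in range(1, visits_per_user + 1):
        # Naive reallocation: everyone is switched into variant B
        # after visit 5, mimicking an adaptive bandit shifting traffic.
        variant = "A" if visit <= 5 else "B"
        exposures[variant] += 1
        if random.random() < purchase_prob(visit):
            conversions[variant] += 1

rate_a = conversions["A"] / exposures["A"]
rate_b = conversions["B"] / exposures["B"]
# rate_b > rate_a even though the variants are identical:
# the measured lift is an artifact of when users were exposed.
```

This is a sketch of the bias mechanism only; real bandit deployments mitigate it by randomizing at the user level and keeping each user's assignment sticky, rather than switching returning users between variants.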