Posterior for the Bernoulli using the Conjugate Prior | with example in TensorFlow Probability

  • Published 9 Jul 2024
  • If we observe data from an event modelled by a Bernoulli distribution, we may be interested in finding a posterior distribution over its latent parameter. If we use a conjugate prior, this posterior has a closed-form solution. Here are the notes: raw.githubusercontent.com/Cey...
    The Bernoulli distribution is one of the rare cases in which we can express all associated distributions in closed form: the marginal, the posterior, and the posterior predictive. More sophisticated distributions do not allow for this, since there we run into intractability when applying Bayes' rule.
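    Below is a minimal sketch of the conjugate update described above, using TensorFlow Probability. The coin-flip data and prior parameters are made up for illustration; the video's own notebook may differ.

    import tensorflow_probability as tfp
    tfd = tfp.distributions

    # Hypothetical coin-flip observations: 1 = success, 0 = failure
    data = [1, 0, 1, 1, 0, 1, 1, 1]
    n_successes = sum(data)
    n_failures = len(data) - n_successes

    # Beta(a, b) prior over the Bernoulli parameter theta
    a, b = 1.0, 1.0  # a uniform prior
    prior = tfd.Beta(a, b)

    # Conjugacy: the posterior is again a Beta, with the observed counts
    # simply added to the prior's concentration parameters
    posterior = tfd.Beta(a + n_successes, b + n_failures)

    print(posterior.mean())  # posterior mean of theta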
    -------
    📝 : Check out the GitHub Repository of the channel, where I upload all the handwritten notes and source-code files (contributions are very welcome): github.com/Ceyron/machine-lea...
    📢 : Follow me on LinkedIn or Twitter for updates on the channel and other cool Machine Learning & Simulation stuff: / felix-koehler and / felix_m_koehler
    💸 : If you want to support my work on the channel, you can become a Patron here: / mlsim
    -------
    ⚙️ My Gear:
    (Below are affiliate links to Amazon. If you decide to purchase the product or something else on Amazon through this link, I earn a small commission.)
    - 🎙️ Microphone: Blue Yeti: amzn.to/3NU7OAs
    - ⌨️ Logitech TKL Mechanical Keyboard: amzn.to/3JhEtwp
    - 🎨 Gaomon Drawing Tablet (similar to a WACOM Tablet, but cheaper, works flawlessly under Linux): amzn.to/37katmf
    - 🔌 Laptop Charger: amzn.to/3ja0imP
    - 💻 My Laptop (generally I like the Dell XPS series): amzn.to/38xrABL
    - 📱 My Phone: Fairphone 4 (I love the sustainability and repairability aspect of it): amzn.to/3Jr4ZmV
    If I had to purchase these items again, I would probably change the following:
    - 🎙️ Rode NT: amzn.to/3NUIGtw
    - 💻 Framework Laptop (I do not get a commission here, but I love the vision of Framework. It will definitely be my next Ultrabook): frame.work
    As an Amazon Associate I earn from qualifying purchases.
    -------
    Timestamps
    00:00 Opening
    00:16 Task of inferring parameters from data
    01:30 Graphical Model and joint
    04:10 Deriving the Posterior
    11:20 A conjugate prior
    11:55 TensorFlow Probability
    15:25 End-Card

COMMENTS • 10

  • @bikashkharel5891 · 5 months ago · +1

    Hi, can you please make a tutorial on the interactive plots you make? Your videos are awesome and are helping me a lot… thank you so much.

    • @MachineLearningSimulation · 5 months ago · +1

      Hi, thanks a lot for the kind comment. 😊
      The interactive visualizations are based on streamlit (a Python package that lets you very easily build interactive web applications). I believe there are already a lot of tutorials (both written and video) online (correct me if I'm wrong, though). Are you specifically interested in a video from me? 😊
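      For reference, here is a minimal sketch of such an interactive plot in streamlit (hypothetical code, not the channel's actual source):

      import numpy as np
      import streamlit as st
      from scipy.stats import beta

      # Sliders for the two Beta prior parameters
      a = st.slider("a", 0.5, 10.0, 1.0)
      b = st.slider("b", 0.5, 10.0, 1.0)

      # Plot the Beta pdf over theta (endpoints avoided, since the
      # pdf can diverge there for parameters below 1)
      theta = np.linspace(0.01, 0.99, 200)
      st.line_chart(beta.pdf(theta, a, b))

      Save it as app.py and run "streamlit run app.py".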

    • @bikashkharel5891 · 5 months ago · +1

      @MachineLearningSimulation Thank you for the prompt response. I know the package now and can do a bit of research on it. I am more interested in videos like MCMC from you; even though there are tons on YouTube, they don't clearly explain the concepts along with a Python tutorial like you do.

  • @vamsibalijepally3431 · 2 years ago · +1

    Hi, can you please elaborate on graphical models and joint distributions?

  • @user-or7ji5hv8y · 3 years ago

    Why is p(D) difficult to compute? Can you illustrate with an example, if possible? We assumed a Beta and a Bernoulli for the prior and likelihood, respectively. Can we not simply assume a distribution for D?

    • @MachineLearningSimulation · 3 years ago · +1

      That's a great question! In the case of the Beta-Bernoulli model (which we use here), one can actually find an expression for p(D) (i.e., the probability of the dataset) by marginalization.
      Recall that we have the joint p(D, theta), and we can marginalize over theta to get
      p(D) = ∫ p(D, theta) d theta
      The integration here can only be solved by a trick that relies on a property of the Beta distribution (I will produce a video about this; it's already on my to-do list). In general, you cannot solve such integrals in closed form, which is why I called it "difficult".
      Usually, people also refer to this as the integral being intractable. (I am also planning a video on intractability.)
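      To make the "trick" concrete: assuming a Beta(a, b) prior and a dataset of n Bernoulli observations with h successes (notation introduced here for this comment, not taken from the video), the integral reduces to a ratio of Beta functions:

      p(D) = \int_0^1 p(D \mid \theta) \, p(\theta) \, d\theta
           = \frac{1}{B(a, b)} \int_0^1 \theta^{a + h - 1} (1 - \theta)^{b + n - h - 1} \, d\theta
           = \frac{B(a + h, \; b + n - h)}{B(a, b)}

      Recognizing the integrand as an unnormalized Beta density is exactly the property that makes this model tractable; for most other likelihood-prior pairs no such closed form exists.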

    • @MachineLearningSimulation · 3 years ago · +1

      Some more thoughts on your question: you asked why we can't simply assume a distribution for D.
      D is just the collection of all our observations. In a sense, we do know its distribution: it is a (product of) Bernoulli(s). However, this is only true in the context of the graphical model, i.e., we know the conditional distribution p(D | theta).
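      In symbols, assuming n independent observations x_i ∈ {0, 1}, of which h equal 1 (notation introduced here):

      p(D \mid \theta) = \prod_{i=1}^{n} \theta^{x_i} (1 - \theta)^{1 - x_i} = \theta^{h} (1 - \theta)^{n - h}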

    • @MachineLearningSimulation · 3 years ago · +1

      If you do not want to wait for my next video, you can also check out the slides here: www2.stat.duke.edu/~rcs46/modern_bayes17/lecturesModernBayes17/lecture-1/01-intro-to-Bayes.pdf (page 24). Be aware that the notation is slightly different.

    • @MachineLearningSimulation · 3 years ago · +1

      The new video is now online: ua-cam.com/video/gYvE9S2s2mE/v-deo.html :)