What is a latent variable?

Machine Learning & Simulation

Додати в
- Мій плейлист
- Переглянути пізніше
Поділитися

Поділитися

Вставка

Розмір відео:

Показувати елементи керування програвачем

Автоматичне відтворення

Автоповтор

Опубліковано 31 січ 2025

КОМЕНТАРІ • 14

@indiajackson5959 Рік тому ⁺³
This is really an excellent video! The best examples I've seen on latent variables!
@MachineLearningSimulation Рік тому ⁺¹
Wow, thanks! 😊
Happy, it was helpful.
@rishidixit7939 3 місяці тому ⁺²
I am studying Variational Autoencoder and the introduction of Latent Variable confuses me everytime
@MachineLearningSimulation 2 місяці тому
I can understand. I hope the video gave some clarification 😊
@mohamadroghani1470 3 роки тому ⁺³
Awesome videos...
@MachineLearningSimulation 3 роки тому
@Omar-rc4li 3 роки тому ⁺²
New to this and I have a question... Why do we have to calculate the joint P(T,w) in the first place.
Is it because we need to infer something about 'w', but 'w' itself depends on 'T'. In that case we will have a conditional probability to solve in order to infer i.e. P(w/T)...
We know, P(w/T)=P(w,T)/P(T)....So here we have the joint P(w,T) which we need to solve.......Is this reasoning correct to understand why we need to solve the joint distribution P(w,T) in the first place?
Somewhere i saw that it was the other way round i.e. we need to calculate P(T/w).... but that does not intuitively fit in because in the diagram you made the arrow is from 'w' to 'T'... so "T' depends on 'w'...?????
@MachineLearningSimulation 3 роки тому ⁺¹
Thanks for the comment. I can understand the confusion. :)
Your confusion is most certainly related to the differences between generative and discriminative models. See e.g. here: stackoverflow.com/questions/879432/what-is-the-difference-between-a-generative-and-a-discriminative-algorithm
I will elaborate a bit on this, trying to frame it into the context of these videos. If I got you wrong, and this was not the point of the comment, let me know, :) I would be happy to help.
Generally speaking, one creates Directed Graphical Models/Bayesian Networks with latent and observable nodes in order to model sth. which is either more abstract or pretty close to reality (an example for the latter is a model to predict Covid cases).
The DGM does not necessarily induce for what you want to use it later on. It is just a way to factorize a joint distribution and helpful in querying the likelihood of a data point.
A common application for DGMs is inference, which is predicting the probability distribution over the latent variables (or only one of the potentially many latent variables) given observed data. Here, you of course need the posterior (in our example p(W|T)), as you correctly noted. There are multiple ways to obtain the posterior from the joint, sometimes it is possible to find a closed-form solution like in Mixture Models (e.g. GMM), sometimes you would prefer Variational Inference (also take a look at my video on that topic) or you just use MCMC to obtain statistics on the posterior distribution. For all these, you use your knowledge of the joint distribution to either obtain the full posterior, a surrogate or just some statistics on it.
However, inference is not the only task in Machine Learning. You could use a generative model (i.e. the joint) to generate new data. Think for instance of a generative model of celebrity faces. You could use it to generate previously unseen faces.
Additionally, a joint distribution offers way more insight and flexibility. Think of very complex models with multiple (groups of) latent variables. Sometimes, you might be interested in posteriors over a subset of them. You cannot do this (or at least not as easy) if you only model one particular posterior, as in a discriminative model.
@knowledgedistiller 2 роки тому ⁺¹
@@MachineLearningSimulation So if I understand correctly, we want to compute the joint distribution rather than the conditional distribution p(z|x) (z is latent, x is datapoint), since it is much more flexible, if you have p(x,z) then you can use bayes rule to get p(z|x) - a discriminative model. Or you can generate new data as you said, I think by sampling points of high probability p(x,z=K), where K is a fixed value.
Correct me if I'm wrong. In essence, DGMs are representing a joint probability - the most general distribution over all random variables involved, which can be used to compute more specific distributions (posterior, lileihood). DGMs represent a joint probability because they tell us how to factor a joint distribution, for example, p(x,z) = p(x|z) *p(z) if x depends on z.
@MachineLearningSimulation 2 роки тому
@@knowledgedistiller Yes, you are correct. DGMs are the most flexible ways to model probabilistic relations by providing a factorization of the joint distribution over all involved random variables.
Maybe as a side note: Despite joint distributions being the most flexible, it does not necessarily mean they are the most efficient. For some applications, it might be more reasonable to just model certain specific distributions like posteriors etc. like in discriminative models.
@UpperM3 Рік тому
i can't believe that you have just spelled "happyness" with "Y"...
@MachineLearningSimulation Рік тому ⁺⁵
Thanks for spotting the typo. :)
English is not my mother tongue, so I tend to make mistakes from time to time.
@UpperM3 Рік тому
@@MachineLearningSimulation it's okay, i actually liked your explanation, it helped me study for my exam, thank you. Just pay attention to basic typos
@rishidixit7939 3 місяці тому
Y is latent

Наступне

Автоматичне відтворення

Maximum Likelihood Estimate by Automatic Differentiation | Directed Graphical Models