I just want you to know I really enjoyed the paranormal distribution.
That is the best YouTube lecture I've ever seen.
I enjoyed every minute of this lecture. Thank you!
these students have very deep voices
Interesting to derive the ELBO by starting from Jensen's inequality applied to the log marginal likelihood rather than from the KL between q(z) and the posterior.
Yeah, if she had derived the objective that way, she would have known why it's called variational (both routes are sketched below).
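For anyone comparing the two starting points in this thread, here is a quick sketch of both derivations in generic notation (my own summary, not taken from the lecture slides):

```latex
% Route 1: Jensen's inequality applied to the log marginal likelihood
\log p_\theta(x)
  = \log \mathbb{E}_{q(z)}\!\left[\frac{p_\theta(x,z)}{q(z)}\right]
  \;\ge\; \mathbb{E}_{q(z)}\!\left[\log \frac{p_\theta(x,z)}{q(z)}\right]
  =: \mathcal{L}(q,\theta) \quad \text{(ELBO)}

% Route 2: decompose the log-likelihood using the KL to the true posterior
\log p_\theta(x)
  = \mathcal{L}(q,\theta) + \mathrm{KL}\!\left(q(z)\,\middle\|\,p_\theta(z \mid x)\right)
```

Since the KL term is nonnegative, both routes give the same bound, but route 2 makes explicit that the bound is tight exactly when q(z) matches the true posterior — which is why maximizing the ELBO over a family of q's is a variational method.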
Love the section on "kale divergence"! Thanks UA-cam auto-captioning! 😂
impressive lecture 🤌
Really appreciate the brilliant content. Not sure if it's just me or if the instructor at times isn't completely sure about some of the logic or how to explain it, e.g. around 43:35. It's a bit confusing, to be honest. Still a super nice lecture overall!
why do we want the expectation to be before the log and not inside the log? How does it help us with sampling?
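In case it helps, the standard reason (generic notation, not specific to this lecture): with the expectation outside the log, the objective is an average of per-sample terms, so drawing a few samples from q gives an unbiased Monte Carlo estimate of the objective and of its gradient:

```latex
\mathcal{L} = \mathbb{E}_{q(z)}\!\left[\log \frac{p_\theta(x,z)}{q(z)}\right]
  \;\approx\; \frac{1}{N}\sum_{i=1}^{N} \log \frac{p_\theta(x,z_i)}{q(z_i)},
  \qquad z_i \sim q(z)
```

If the expectation instead sits inside the log, as in log p(x) = log E_{q(z)}[p(x,z)/q(z)], a finite-sample average inside the log gives a biased estimate, and Jensen's inequality is exactly what bounds that gap.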
I think I have an idea for how to estimate the likelihood p(x) given a trained model (the problem mentioned from 20:20 onward). Pick a batch of points to represent p(x|z), and for every point in the batch pick random inputs z drawn from p(z). Then, for every point in the batch, apply gradient ascent with respect to those randomly selected z's. The inputs that converge locally will be the maximum-likelihood locations for that point. After that, we can fit a set of Gaussians to these local maxima; this handles the p(x|z)p(z) product. Finally, instead of integrating, just sum over all those products to get a rough surface for p(x)... (a brute-force Monte Carlo baseline for comparison is sketched after this thread).
Hello. From what I understand, computing the likelihood isn't always that straightforward; it can be intractable in the sense of having no closed form. When you talk about applying gradient ascent to a likelihood, the gradient of log p(x) is called the score function, which is quite commonly used in diffusion models in the reverse process. While your intuition seems on the right path, I think you could try using the Arzelà–Ascoli theorem to make it more rigorous: en.wikipedia.org/wiki/Arzel%C3%A0%E2%80%93Ascoli_theorem
Have a great day!
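Related to the estimation problem this thread is about: below is a minimal NumPy sketch of the brute-force Monte Carlo baseline p(x) ≈ (1/N) Σᵢ p(x|zᵢ) with zᵢ ~ p(z), assuming a standard normal prior and a Gaussian decoder with fixed noise sigma. The toy decoder is made up purely for illustration; this is not the procedure proposed above, just the naive estimator it would be compared against.

```python
import numpy as np

def log_p_x_monte_carlo(x, decoder_mean, n_samples=10_000, latent_dim=2, sigma=0.1):
    """Naive Monte Carlo estimate of log p(x) = log E_{p(z)}[p(x|z)].

    Assumptions (illustrative only):
      - prior p(z) = N(0, I) over `latent_dim` dimensions
      - decoder p(x|z) = N(x; decoder_mean(z), sigma^2 I)
    """
    z = np.random.randn(n_samples, latent_dim)   # z_i ~ p(z)
    mu = decoder_mean(z)                          # decoder means, shape (n_samples, x_dim)
    d = x.shape[-1]
    # log N(x; mu_i, sigma^2 I) for each sample, computed in log space for stability
    log_px_given_z = (
        -0.5 * np.sum((x - mu) ** 2, axis=1) / sigma**2
        - 0.5 * d * np.log(2 * np.pi * sigma**2)
    )
    # log (1/N) sum_i p(x|z_i), via log-sum-exp
    return -np.log(n_samples) + np.logaddexp.reduce(log_px_given_z)

# Toy usage with a made-up linear "decoder" (purely hypothetical):
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(2, 5))                   # latent_dim=2 -> x_dim=5
    decoder_mean = lambda z: z @ W
    x = rng.normal(size=(5,))
    print(log_p_x_monte_carlo(x, decoder_mean))
```

In practice this estimator has very high variance once the latent space is high-dimensional, which is one reason importance sampling with a learned q(z|x) — i.e., the ELBO machinery from the lecture — is preferred.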
Nice lecture.
great job!
Underrated
Hi Stanford online, could you please share the recordings of CS228 probabilistic graphical models and CS236 Deep Generative Models? Thanks a lot in advance.
Can you recommend any reference books for this course?
Thank you!
ahh.. it's a gold mine. :)
Damn genders inequality comes into play everywhere these days
This lecture is taken from ua-cam.com/video/UTMpM4orS30/v-deo.html
It feels too theoretical. It should be more like: how do I do this in practice, why do I do it this way, and why can't it be done with plain neural networks instead of going all the way to variational inference?
lol