Stanford CS330 | Variational Inference and Generative Models | 2022 | Lecture 11

  • Published 24 Nov 2024

COMMENTS • 23

  • @leonardocaliendo2415 · 1 year ago · +13

    I just want you to know I really enjoyed the paranormal distribution.

  • @jonathanr4242 · 5 months ago · +1

    That is the best UA-cam lecture I've ever seen.

  • @parsakhavarinejad · 8 months ago · +1

    I enjoyed every minute of this lecture. Thank you!

  • @dewibatista5752 · 1 month ago

    These students have very deep voices.

  • @heyjianjing · 10 months ago · +4

    Interesting to derive the ELBO with Jensen's inequality on the log marginal likelihood rather than from the KL between q(z) and the posterior.

    • @macx7760 · 5 hours ago

      Yeah, if she had derived the objective that way, she would have known why it's called variational (both routes are sketched just after this thread).
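
    Sketching both routes mentioned in this thread in generic notation (q(z) is the variational distribution; the symbols here may differ from the lecture's slides):

      % Route 1: Jensen's inequality applied to the log marginal likelihood
      \log p(x) = \log \int p(x, z)\, dz
                = \log \mathbb{E}_{q(z)}\!\left[ \frac{p(x, z)}{q(z)} \right]
                \ge \mathbb{E}_{q(z)}\!\left[ \log \frac{p(x, z)}{q(z)} \right] =: \mathcal{L}(q)

      % Route 2: start from the KL between q(z) and the true posterior
      D_{\mathrm{KL}}\big( q(z) \,\|\, p(z \mid x) \big)
                = \mathbb{E}_{q(z)}\!\left[ \log \frac{q(z)}{p(z \mid x)} \right]
                = \log p(x) - \mathcal{L}(q) \ge 0

    Route 2 makes the "variational" name explicit: maximizing the bound \mathcal{L}(q) over a family of distributions q is a calculus-of-variations problem, and the gap to \log p(x) is exactly the KL to the true posterior.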

  • @joebobthe13th · 6 months ago · +5

    Love the section on "kale divergence"! Thanks UA-cam auto-captioning! 😂

  • @SoroushHashemifar · 17 days ago · +1

    impressive lecture 🤌

  • @saderick52 · 1 year ago · +7

    Really appreciate the brilliant content. Not sure if it's just me, or if the instructor at times isn't completely sure about some of the logic or how to explain it, for example at 43:35. It's a bit confusing, to be honest. Still a super nice lecture overall!

  • @infoeangel8974 · 10 months ago · +1

    Why do we want the expectation to be outside the log rather than inside it? How does that help us with sampling? (A small numerical sketch follows this comment.)
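
    A small numerical illustration of why the expectation sits outside the log (my own sketch; the function w below is only a stand-in for the ratio p(x, z)/q(z), not anything from the lecture):

      # With the expectation outside the log, averaging per-sample log-terms gives an
      # unbiased Monte Carlo estimate of E_q[log w(z)], and each sample contributes a
      # term you can differentiate. With the expectation inside the log, the plug-in
      # estimate log(mean of w(z)) is biased low for small sample sizes (Jensen).
      import numpy as np

      rng = np.random.default_rng(0)

      def w(z):
          # stand-in for the positive ratio p(x, z) / q(z)
          return np.exp(-0.5 * z**2) + 0.1

      # "ground truth" from a very large sample of q(z) = N(0, 1)
      z_big = rng.normal(size=1_000_000)
      true_E_log = np.log(w(z_big)).mean()   # E_q[log w(z)]
      true_log_E = np.log(w(z_big).mean())   # log E_q[w(z)]

      n_trials, n_samples = 5000, 10
      est_outside = np.empty(n_trials)       # estimates of E_q[log w(z)]
      est_inside = np.empty(n_trials)        # plug-in estimates of log E_q[w(z)]
      for t in range(n_trials):
          z = rng.normal(size=n_samples)
          est_outside[t] = np.log(w(z)).mean()
          est_inside[t] = np.log(w(z).mean())

      print("E_q[log w]: true %.4f, mean of estimates %.4f" % (true_E_log, est_outside.mean()))
      print("log E_q[w]: true %.4f, mean of estimates %.4f" % (true_log_E, est_inside.mean()))
      # The first estimator matches its target on average; the second undershoots
      # log E_q[w] for small n. This is one practical reason the ELBO keeps the
      # expectation outside the log: each sampled z yields an unbiased term to
      # average and backpropagate through.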

  • @onurozkan1077 · 1 year ago

    I think I have an idea for how to estimate the likelihood p(x) given a trained model (the problem mentioned from 20:20 onward). One thing we could do is pick a batch of points to represent p(x|z), and pick random p(z) inputs for every point in the batch. Then, for each point in the batch, apply gradient ascent with respect to the randomly selected z inputs. The inputs that converge locally will be the maximum-likelihood locations for that point. After that, we can fit Gaussian distributions to a subset of these local maxima; this completes the p(x|z)p(z) multiplication. Finally, instead of integrating, just sum over all those products to get a rough surface for p(x)... (A rough code sketch of this idea appears after this thread.)

    • @sumeet679 · 1 year ago · +1

      Hello. From what I understand, computing the likelihood isn't always that straightforward; it can be intractable in the sense of having no closed form. When you say "apply gradient ascent to the likelihood", the gradient of log p(x) is called the score function, which is quite commonly used in diffusion models in the reverse process. While your intuition seems on the right path, I think you could try using the Arzelà–Ascoli theorem to make it more rigorous: en.wikipedia.org/wiki/Arzel%C3%A0%E2%80%93Ascoli_theorem
      Have a great day!
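
    A rough sketch of the idea proposed in this thread, under illustrative assumptions: a small Gaussian decoder p(x|z) = N(x; mu(z), sigma^2 I) with prior p(z) = N(0, I), and placeholder names throughout (decoder, sigma, the step counts are not from the lecture). It only performs the gradient-ascent-on-z step; turning the resulting local maxima into an estimate of p(x) would still need a volume correction (e.g. a Laplace approximation) or importance sampling, as the reply notes.

      import torch

      torch.manual_seed(0)
      latent_dim, data_dim, sigma = 2, 5, 0.1

      # stand-in for a trained decoder mean mu(z)
      decoder = torch.nn.Sequential(
          torch.nn.Linear(latent_dim, 32), torch.nn.Tanh(),
          torch.nn.Linear(32, data_dim),
      )
      for p in decoder.parameters():
          p.requires_grad_(False)            # the decoder is fixed; only z is optimized

      def log_joint(x, z):
          # log p(x|z) + log p(z), up to additive constants
          mu = decoder(z)
          log_lik = -0.5 * ((x - mu) ** 2).sum(-1) / sigma**2
          log_prior = -0.5 * (z ** 2).sum(-1)
          return log_lik + log_prior

      x = torch.randn(data_dim)                            # a query point
      z = torch.randn(8, latent_dim, requires_grad=True)   # several random restarts of z
      opt = torch.optim.Adam([z], lr=0.05)

      for _ in range(500):                                 # gradient ascent on z
          opt.zero_grad()
          (-log_joint(x, z).sum()).backward()
          opt.step()

      print("best local log p(x, z*) found:", log_joint(x, z).max().item())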

  • @daryoushmehrtash7601 · 7 months ago · +1

    Nice lecture.

  • @haroldsu1696 · 1 year ago · +1

    great job!

  • @Vanadium404 · 1 year ago

    Underrated

  • @henrywong741 · 1 year ago · +3

    Hi Stanford online, could you please share the recordings of CS228 probabilistic graphical models and CS236 Deep Generative Models? Thanks a lot in advance.

  • @sekar3412 · 1 year ago

    Thank you.

  • @hardToGetHandle · 1 year ago

    ahh.. it's a gold mine. :)

  • @spyrosp.551 · 9 months ago

    Damn, gender inequality comes into play everywhere these days.

  • @bharatbajoria · 1 year ago · +3

    This lecture is taken from ua-cam.com/video/UTMpM4orS30/v-deo.html

  • @lpmlearning2964 · 6 months ago · +2

    It feels too theoretical. It should be more like: how do I do this in practice, why do I do it this way, and why can't it be done with neural networks alone, so that we have to go all the way to variational inference?