L6 Diffusion Models (SP24)

  • Published 24 Nov 2024

COMMENTS • 16

  • @martinpernus9511 9 months ago +13

    World-class education material on the webz for free? I love you guys so much

  • @chenxu-s6g 8 months ago

    This is very good work, thank you for the continuous updates in RL, which is the best resource for me to follow and learn from!

  • @faruknane 6 months ago +2

    Hi from TUM. Awesome quality content! I always get astonished by how a lot of recent methods & papers are compiled and summarized into lectures.

  • @cardianlfan8827 9 months ago +2

    This course employs cutting-edge technology Sora in its instruction, creating an immersive experience as if the professor's lecture is happening right before your eyes.

  • @bingbingsun6304 9 months ago +2

    For DDIM, my understanding is: since we can predict the noise, why not use the predicted noise as the noise to add during sampling? ^_^

    • @GenAiWarrior 8 months ago

      In my opinion, the predicted noise might not be accurate enough to feed back into the sampler; as a result it would degrade the quality of the samples.
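
For context, the deterministic DDIM sampler does reuse the predicted noise directly: the same epsilon prediction is used both to estimate x_0 and as the noise re-added at the next (lower) noise level, instead of a fresh Gaussian draw. A minimal sketch of that update, assuming alpha_bar_t and alpha_bar_prev are scalar tensors holding the cumulative product of the alphas at the current and previous timesteps (the names are illustrative, not the lecture code):

```python
import torch

def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev, sigma_t=0.0):
    # Estimate the clean sample x_0 from the model's predicted noise.
    x0_hat = (x_t - torch.sqrt(1 - alpha_bar_t) * eps_pred) / torch.sqrt(alpha_bar_t)
    # Direction term: the *same* predicted noise is re-added at the next
    # noise level rather than a fresh Gaussian sample.
    dir_xt = torch.sqrt(1 - alpha_bar_prev - sigma_t**2) * eps_pred
    # sigma_t = 0 gives fully deterministic DDIM; sigma_t > 0 moves back
    # toward stochastic DDPM-style sampling.
    noise = sigma_t * torch.randn_like(x_t) if sigma_t > 0 else 0.0
    return torch.sqrt(alpha_bar_prev) * x0_hat + dir_xt + noise
```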

  • @harshdeepsingh3872 2 months ago

    Very well explained.

  • @JoshuaMichael-v6c 5 months ago

    Can you please elaborate more on the v-space loss? How does it differ from the epsilon-space loss?
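
A worked comparison may help here; the following uses the standard v-prediction notation from the literature, which may differ slightly from the slides. With a variance-preserving forward process x_t = alpha_t * x_0 + sigma_t * eps, where alpha_t^2 + sigma_t^2 = 1:

```latex
% epsilon-space loss: regress the noise that was added
\mathcal{L}_\epsilon = \big\| \epsilon_\theta(x_t, t) - \epsilon \big\|^2
% velocity target and v-space loss
v_t = \alpha_t\,\epsilon - \sigma_t\,x_0 , \qquad
\mathcal{L}_v = \big\| v_\theta(x_t, t) - v_t \big\|^2
```

The two losses differ mainly in how they weight timesteps: at very high noise levels predicting epsilon is nearly trivial while recovering x_0 from it is ill-conditioned, whereas the velocity target stays well-scaled across all noise levels, which is also one reason v-prediction is popular for distillation.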

  • @siddmathlver3444 9 months ago

    I wish the mic quality was a little better. Otherwise, awesome lecture!

  • @girikkhullar4072 1 month ago

    At 1:24:14 they mention that they have to double-check the loss; could you tell us what they found?

  • @JoshuaMichael-v6c 5 months ago

    How are FID and IS scores calculated?
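
For reference, both metrics are computed from an Inception network's outputs: FID compares the Gaussian statistics of pooled features for real vs. generated images, and IS looks only at the classifier's predicted label distribution on generated images. A minimal numpy sketch of the standard formulas, assuming the features and class probabilities have already been extracted (names and array layouts here are illustrative):

```python
import numpy as np
from scipy import linalg

def fid(feats_real, feats_gen):
    # Frechet Inception Distance between two feature sets (N x D arrays):
    # ||mu_r - mu_g||^2 + Tr(C_r + C_g - 2 (C_r C_g)^{1/2})
    mu_r, mu_g = feats_real.mean(0), feats_gen.mean(0)
    C_r = np.cov(feats_real, rowvar=False)
    C_g = np.cov(feats_gen, rowvar=False)
    covmean = linalg.sqrtm(C_r @ C_g).real  # matrix square root
    return float(((mu_r - mu_g) ** 2).sum() + np.trace(C_r + C_g - 2 * covmean))

def inception_score(probs, eps=1e-12):
    # probs: N x K softmax outputs p(y|x) of an Inception classifier
    # on generated images.  IS = exp( E_x [ KL( p(y|x) || p(y) ) ] )
    p_y = probs.mean(0, keepdims=True)
    kl = (probs * (np.log(probs + eps) - np.log(p_y + eps))).sum(1)
    return float(np.exp(kl.mean()))
```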

  • @sandrocavallari4640 8 months ago +1

    Great material. But is it me, or does slide 80 contain an error in the derivation of the velocity? Shouldn't it be cos(\phi)\epsilon - sin(\phi) x_0?

    • @PieterAbbeel 7 months ago

      Great eye, fixed in the posted slides: sites.google.com/view/berkeley-cs294-158-sp24/home (but the video is still the same)
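
For anyone checking the correction: with the angular parameterization commonly used for v-prediction, the velocity is the derivative of the noised sample with respect to the angle, which gives exactly the ordering noted above (standard notation, not necessarily the slide's):

```latex
x_\phi = \cos(\phi)\, x_0 + \sin(\phi)\, \epsilon ,
\qquad
v_\phi \equiv \frac{d x_\phi}{d\phi} = \cos(\phi)\, \epsilon - \sin(\phi)\, x_0 .
```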

  • @JoshuaMichael-v6c 5 months ago

    Also, we convert 256 x 256 x 3 into 32 x 32 x 4. What would the 4 be here? What does it represent?
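
A likely explanation (hedged, since the slide itself isn't quoted here): in latent diffusion the 256x256x3 RGB image is encoded by a pretrained autoencoder with spatial downsampling factor 8, and the 4 is simply the number of learned latent channels chosen for that autoencoder, an information bottleneck rather than anything RGB-like. The toy encoder below only illustrates the shape bookkeeping and is not the actual VAE from the lecture:

```python
import torch
import torch.nn as nn

# Three stride-2 stages give a spatial downsampling factor of 8
# (256 -> 128 -> 64 -> 32); the final conv maps to 4 latent channels,
# a design choice of the autoencoder (hypothetical layer sizes).
encoder = nn.Sequential(
    nn.Conv2d(3, 64, 3, stride=2, padding=1),    # 256x256x3  -> 128x128x64
    nn.SiLU(),
    nn.Conv2d(64, 128, 3, stride=2, padding=1),  # 128x128x64 -> 64x64x128
    nn.SiLU(),
    nn.Conv2d(128, 256, 3, stride=2, padding=1), # 64x64x128  -> 32x32x256
    nn.SiLU(),
    nn.Conv2d(256, 4, 3, padding=1),             # 32x32x256  -> 32x32x4
)

x = torch.randn(1, 3, 256, 256)  # one RGB image, channels-first
z = encoder(x)
print(z.shape)                   # torch.Size([1, 4, 32, 32])
```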

  • @Aesthetic_Euclides 2 months ago

    I'm struggling to get a good intuition of what a NN trained for this task is learning to do exactly.
    This is my thought process:
    You have some process of going from x_0 to x_t, by sampling epsilon ~ N(0, I) and basically adding it to x_0, with a couple of constants.
    Then you have a NN that, given x_t, approximates the actual sample from N(0, I) that was used (epsilon). And you optimize this NN on this task.
    In the end, you can just give it pure noise, it will give you some other approximation of a sample from N(0, I), and you can subtract it (with some constants) and get a novel image.
    If this is essentially true, I struggle to get an intuition of how, by training a NN to do a great job at this, you actually get great-looking novel images.
    I can see how, by training it to go from x_{t+1} to x_t, the NN will be a denoising function.
    One can imagine the weights of this NN encapsulating different patterns in the pixels of the image, to identify regions of pure noise vs regions of the original image.
    It makes sense that it would be able to do a good job when there are some patterns to detect.
    But what about when it sees a purely noisy image? It would seem to me that whatever patterns it's trying to match, none of them would be there, since it would all be noise. And you would get crap, basically.
    I hope I explained myself, any help is appreciated! Thanks!

    • @Aesthetic_Euclides 2 months ago

      I think an explanation for the pure-noise case is that it "guesses" some denoising step, and you will have your x_{T-1} with a little bit of pattern resembling a super noisy realistic image.
      You repeat this process, and the patterns compound on each other until you get a realistic image.
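
To make the training target and the gradual sampling described in this thread concrete, here is a minimal DDPM-style sketch in PyTorch. The names are assumptions, not the course code: model predicts epsilon, and alphas / alphas_bar are 1-D tensors of alpha_t and its cumulative product. Each sampling step removes only a small fraction of the predicted noise, which is why structure can accumulate gradually even though the very first prediction from pure noise is only a rough guess.

```python
import torch

def train_step(model, x0, alphas_bar):
    # Epsilon-prediction training: noise a clean batch x0 to a random level t
    # and regress the model's output onto the noise that was added.
    t = torch.randint(0, len(alphas_bar), (x0.shape[0],), device=x0.device)
    a_bar = alphas_bar[t].view(-1, 1, 1, 1)
    eps = torch.randn_like(x0)
    x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * eps
    return ((model(x_t, t) - eps) ** 2).mean()

@torch.no_grad()
def sample(model, shape, alphas, alphas_bar):
    # Ancestral sampling: start from pure noise and subtract a little of the
    # predicted noise at each step, re-adding a smaller amount of fresh noise.
    x = torch.randn(shape)
    for t in reversed(range(len(alphas))):
        eps_pred = model(x, torch.full((shape[0],), t, dtype=torch.long))
        a, a_bar = alphas[t], alphas_bar[t]
        x = (x - (1 - a) / (1 - a_bar).sqrt() * eps_pred) / a.sqrt()
        if t > 0:
            x = x + (1 - a).sqrt() * torch.randn(shape)  # simple sigma_t = sqrt(beta_t)
    return x
```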