ControlNet with Diffusion Models | Explanation and PyTorch Implementation

  • Published 16 Dec 2024

COMMENTS • 20

  • @Explaining-AI  4 months ago

    *Official Implementation* : github.com/lllyasviel/ControlNet
    *My Github Implementation* : github.com/explainingai-code/ControlNet-PyTorch

  • @rafa_br34  4 months ago  +4

    Awesome! I love how you first describe the theory and then guide us through the implementation; it makes it so much easier to understand both the concepts and PyTorch in general.

    • @Explaining-AI  4 months ago  +1

      Thank you! Yes, the intention is that an explanation followed by an implementation (even a minimal one) will make the topic clearer.
      Implementing also lets me run experiments to validate my own understanding, so it is beneficial for me as well.

  • @hieuaovan7101  7 days ago

    Nice video, with great images that explain things clearly.

  • @HassanHamidi-v8s  4 months ago

    Wow, thanks! That's super clean and straight to the point!

  • @PrajwalSingh15  4 months ago

    Awesome, thank you for the explanation and implementation man 😁🙌🏼

  • @alexijohansen  4 months ago

    Amazing video!😊

  • @MadaraUchiha-ug9xc  4 months ago  +1

    good stuff

  • @ryanhe5563  4 months ago

    Awesome! I would like to ask how to add text control to DDPM.

    • @Explaining-AI  4 months ago

      For text conditioning in a DDPM UNet, you can use cross-attention. I went through this in the Stable Diffusion video (ua-cam.com/video/hEJjg7VUA8g/v-deo.html). In that video the diffusion model works with latent images, but the same thing can be done in pixel space with the DDPM UNet.
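
A minimal sketch of the cross-attention idea from the reply above, written in PyTorch. The class name `CrossAttentionBlock`, the channel count, the text embedding size (512), and the token count (77) are illustrative assumptions and are not taken from the video or its repository. The random `text_emb` tensor stands in for the output of whatever text encoder is used.

```python
import torch
import torch.nn as nn


class CrossAttentionBlock(nn.Module):
    """Hypothetical block: UNet feature-map pixels attend to text tokens."""

    def __init__(self, channels: int, text_dim: int, num_heads: int = 4):
        super().__init__()
        # channels must be divisible by both the group count and num_heads.
        self.norm = nn.GroupNorm(8, channels)
        # Queries come from image features; keys and values come from text.
        self.attn = nn.MultiheadAttention(
            embed_dim=channels,
            num_heads=num_heads,
            kdim=text_dim,
            vdim=text_dim,
            batch_first=True,
        )

    def forward(self, x: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Flatten the spatial grid into a sequence: (B, C, H, W) -> (B, H*W, C).
        q = self.norm(x).reshape(b, c, h * w).transpose(1, 2)
        out, _ = self.attn(q, text_emb, text_emb)
        # Restore the feature-map layout and add a residual connection.
        out = out.transpose(1, 2).reshape(b, c, h, w)
        return x + out


# Usage with made-up sizes: works the same for pixel-space or latent features.
block = CrossAttentionBlock(channels=64, text_dim=512)
features = torch.randn(2, 64, 16, 16)   # (batch, channels, height, width)
text_emb = torch.randn(2, 77, 512)      # (batch, tokens, text_dim)
conditioned = block(features, text_emb)
print(conditioned.shape)
```

The output keeps the shape of the input feature map, so a block like this can be dropped in after a self-attention or residual layer without changing the rest of the UNet.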

    • @ryanhe5563  4 months ago

      @Explaining-AI Thank you. I have a problem training DDPM. When I try to train on images with a resolution greater than 128, I get a CUDA out-of-memory error. My GPU has 24 GB of video memory. Why does DDPM consume so much memory?

    • @Explaining-AI  4 months ago

      @ryanhe5563 I am not sure what resolution you are training at, but most likely the self-attention layers are causing the out-of-memory problem, since they become very costly at higher resolutions. I would suggest disabling self-attention in the initial encoder layers and the final decoder layers. See whether enabling self-attention ONLY once the spatial size is, say, below 32x32 lets you train without the error.
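
A rough back-of-the-envelope sketch of why self-attention gets so costly at high resolution. It assumes a naive implementation that stores the full attention matrix in float32, one matrix per head and per sample. Fused attention kernels avoid materializing this matrix, and activations kept for the backward pass add more on top. The function names, the UNet depth, and the strict "below 32" threshold are illustrative assumptions.

```python
def attention_matrix_bytes(height, width, batch_size=1, num_heads=1,
                           bytes_per_value=4):
    """Bytes needed for one (tokens x tokens) attention matrix per head and sample."""
    tokens = height * width          # every spatial position is one token
    return batch_size * num_heads * tokens * tokens * bytes_per_value


def attention_resolutions(image_size, num_downsamples, max_attn_size=32):
    """List (spatial_size, use_attention) for each UNet level.

    Attention is enabled only where the feature map is smaller than
    max_attn_size, following the suggestion in the reply above.
    """
    levels = []
    size = image_size
    for _ in range(num_downsamples + 1):
        levels.append((size, size < max_attn_size))
        size //= 2                   # each downsample halves height and width
    return levels


for side in (16, 32, 64, 128, 256):
    mib = attention_matrix_bytes(side, side) / 2**20
    print(f"{side}x{side}: {mib:.2f} MiB per head per sample")

print(attention_resolutions(image_size=256, num_downsamples=4))
```

Doubling the side length multiplies the token count by 4 and the attention matrix by 16, which is why restricting attention to the small feature maps removes most of the memory pressure.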

  • @BhrantoPathik  4 months ago

    Will you please cover the theory and implementation of the YOLO models?

    • @Explaining-AI  4 months ago  +1

      @BhrantoPathik YOLOv1 is the next video I will release, and I will soon follow it up with all the prominent versions in the detection series (ua-cam.com/play/PL8VDJoEXIjppNvOzocFbRciZBrtSMi81v.html).

    • @BhrantoPathik  4 months ago

      I will be waiting for it eagerly.
      I couldn't find any detailed article on the YOLOv8 architecture; it would be immensely helpful if you could point me to some useful docs on it in the meantime.
      Btw, I am really enjoying your high-quality content. It's really helpful.

    • @Explaining-AI  4 months ago

      @BhrantoPathik Glad you found the content helpful :) Regarding YOLOv8, I don't think there is an official paper for it, but this paper was linked in one of the issues on the ultralytics repo (arxiv.org/pdf/2304.00501v6). In it, the authors give some details on the changes from YOLOv5 to YOLOv8 (Figure 17 and Section 16.1).

  • @adelchisilvestri4435  3 months ago

    On MimicPC, I can't select a ControlNet model for preprocessing (it shows "none"). Can you explain why?

  • @swaystar1235  4 months ago

    nice