*Official Implementation* : github.com/lllyasviel/ControlNet
*My Github Implementation* : github.com/explainingai-code/ControlNet-PyTorch
Awesome! I love how you first describe the theory and then guide us through the implementation, it makes it so much easier to understand both the concepts and pytorch in general.
Thank you! Yes, the intention is that an explanation followed by an implementation (even a minimal one) will make the topic clearer.
Implementing also lets me run experiments to validate my own understanding, so it is beneficial for me as well.
Nice video, with great images that explain things clearly.
Thank You :)
Wow, thanks! That's super clean and straight to the point!
Awesome, thank you for the explanation and implementation man 😁🙌🏼
You are most welcome :)
Amazing video!😊
good stuff
Awesome, I would like to ask how to add text control in DDPM
For text conditioning in the DDPM UNet, you can use cross attention. I have gone through this in the Stable Diffusion video - ua-cam.com/video/hEJjg7VUA8g/v-deo.html . In that video the diffusion model works with latent images, but the same thing can be done in pixel space with the DDPM UNet.
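To make the cross attention suggestion concrete, here is a minimal sketch, not the code from the video or the repo. It assumes the text has already been encoded into token embeddings by some text encoder; the class name `CrossAttentionBlock`, the channel count, the 77-token length and the 512-dimensional embedding are all hypothetical choices for illustration. Image features supply the queries, and the text embeddings supply the keys and values.

```python
import torch
import torch.nn as nn


class CrossAttentionBlock(nn.Module):
    """Hypothetical block: UNet feature maps attend to text token embeddings."""

    def __init__(self, channels: int, text_dim: int, num_heads: int = 4):
        super().__init__()
        self.norm = nn.GroupNorm(8, channels)
        # Queries come from the image features; keys and values come from the
        # text embeddings, which may have a different width (kdim / vdim).
        self.attn = nn.MultiheadAttention(
            embed_dim=channels,
            num_heads=num_heads,
            kdim=text_dim,
            vdim=text_dim,
            batch_first=True,
        )

    def forward(self, x: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Flatten the spatial grid into a sequence of H*W tokens: (B, H*W, C).
        q = self.norm(x).reshape(b, c, h * w).transpose(1, 2)
        out, _ = self.attn(q, text_emb, text_emb)
        # Restore the (B, C, H, W) layout and add a residual connection.
        out = out.transpose(1, 2).reshape(b, c, h, w)
        return x + out


# Usage with assumed shapes: a 16x16 feature map and 77 text tokens of width 512.
block = CrossAttentionBlock(channels=64, text_dim=512)
x = torch.randn(2, 64, 16, 16)
text_emb = torch.randn(2, 77, 512)
y = block(x, text_emb)
```

The output keeps the shape of the input feature map, so a block like this could sit after a self attention or ResNet block at whichever UNet resolutions you choose to condition.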
@Explaining-AI Thank you. I have a problem with training DDPM. When I try to train on images with a resolution greater than 128, I get a CUDA out-of-memory error. My GPU has 24 GB of video memory. Why does DDPM consume so much memory?
@ryanhe5563 I am not sure what resolution you are training at, but most likely the self attention layers are causing the out-of-memory problem, as they become very costly at higher resolutions. I would suggest disabling self attention in the initial encoder layers and the final decoder layers. See if enabling self attention ONLY once the spatial size is, say, less than 32x32 lets you train without error.
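A rough back-of-the-envelope sketch of why this helps, under stated assumptions: self attention over an HxW feature map builds an (H*W) x (H*W) score matrix, so its size grows with the fourth power of the side length. The helper names below are hypothetical, the estimate counts only one float32 score matrix per head and per sample, and it ignores activations saved for the backward pass as well as fused attention kernels that avoid materializing the full matrix.

```python
def attn_matrix_bytes(size: int, batch: int = 1, heads: int = 1,
                      bytes_per_el: int = 4) -> int:
    """Bytes for one (H*W) x (H*W) attention score matrix (float32 by default)."""
    tokens = size * size
    return batch * heads * tokens * tokens * bytes_per_el


def attention_plan(image_size: int, num_downsamples: int,
                   max_attn_size: int = 32) -> list[tuple[int, bool]]:
    """Pair each UNet resolution with whether self attention is enabled there.

    Attention is enabled only where the spatial size is below max_attn_size,
    assuming each downsample halves the resolution.
    """
    sizes = [image_size // (2 ** i) for i in range(num_downsamples + 1)]
    return [(s, s < max_attn_size) for s in sizes]


# A 128x128 map has 16384 tokens: one score matrix is 2**30 bytes (1 GiB).
full_res = attn_matrix_bytes(128)
# A 16x16 map has 256 tokens: one score matrix is 262144 bytes (256 KiB).
low_res = attn_matrix_bytes(16)

# Assumed setup: 256x256 input, four downsampling stages.
plan = attention_plan(256, 4)
for size, use_attn in plan:
    print(f"{size}x{size}: self attention {'on' if use_attn else 'off'}")
```

With these assumed settings only the 16x16 level keeps self attention, which is where the score matrix is small enough to be cheap.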
Will you please cover the theoretical and implementation aspects of yolo models?
@BhrantoPathik YOLOv1 is the next video that I will release, and I will soon follow it up with all the prominent versions in the detection series (ua-cam.com/play/PL8VDJoEXIjppNvOzocFbRciZBrtSMi81v.html).
I will be waiting for it eagerly.
I couldn't find any detailed article on the architecture of YOLOv8; it would be immensely helpful if you could point me to some useful docs on it in the meantime.
Btw, I am really enjoying your high-quality content. It's really helpful.
@BhrantoPathik Glad you found the content helpful :) Regarding YOLOv8, I don't think there is any official paper for it, but in one of the issues on the ultralytics repo, this paper was linked - arxiv.org/pdf/2304.00501v6 . In it, the authors give some details regarding the changes from YOLOv5 to YOLOv8 (Figure 17 and Section 16.1).
On mimicpc, I can't select a ControlNet model for preprocessing (it shows "none"). Can you explain why?
nice