Why Does Diffusion Work Better than Auto-Regression?

  • Published 26 Jun 2024
  • Have you ever wondered how generative AI actually works? Well, the short answer is: in exactly the same way as regular AI!
    In this video I break down the state of the art in generative AI - Auto-regressors and Denoising Diffusion models - and explain how this seemingly magical technology is all the result of curve fitting, like the rest of machine learning.
    Come learn the differences (and similarities!) between auto-regression and diffusion, why these methods are needed to perform generation of complex natural data, and why diffusion models work better for image generation but are not used for text generation.
    The following generative models were featured as demos in this video:
    Images: Adobe Firefly (www.adobe.com/products/firefl...)
    Text: ChatGPT (chat.openai.com)
    Audio: Suno.ai (suno.ai)
    Code: Gemini (gemini.google.com/app)
    Video: Lumiere (Lumiere-video.github.io)
    Chapters:
    00:00 Intro to Generative AI
    02:40 Why Naïve Generation Doesn't Work
    03:52 Auto-regression
    08:32 Generalized Auto-regression
    11:43 Denoising Diffusion
    14:19 Optimizations
    14:30 Re-using Models and Causal Architectures
    16:35 Diffusion Models Predict the Noise Instead of the Image
    18:19 Conditional Generation
    19:08 Classifier-free Guidance
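
The "predict the noise" and "classifier-free guidance" chapters can be illustrated with a small sketch. It assumes the common DDPM-style mixing rule (clean signal scaled by sqrt(alpha_bar), noise scaled by sqrt(1 - alpha_bar)), which the description above does not specify and which may differ from what the video uses. All function names here are hypothetical, and no trained network is involved: the "prediction" is simply the true noise, to show that knowing the noise is equivalent to knowing the clean signal.

```python
import math
import random

def add_noise(x0, alpha_bar, rng):
    """Forward process (assumed DDPM-style): mix a clean signal with Gaussian noise."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
          for x, e in zip(x0, eps)]
    return xt, eps

def clean_from_noise(xt, eps_pred, alpha_bar):
    """Invert the mixing rule: the clean-signal estimate implied by a noise prediction."""
    return [(x - math.sqrt(1.0 - alpha_bar) * e) / math.sqrt(alpha_bar)
            for x, e in zip(xt, eps_pred)]

def guided_noise(eps_uncond, eps_cond, scale):
    """Classifier-free guidance: start at the unconditional prediction and
    move toward (or past) the conditional one by a factor of `scale`."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]

rng = random.Random(0)
x0 = [0.2, -0.5, 0.9]                               # toy "image" of three pixels
xt, eps = add_noise(x0, alpha_bar=0.5, rng=rng)
x0_hat = clean_from_noise(xt, eps, alpha_bar=0.5)   # pretend the network predicted eps exactly
```

With scale 0 the guided prediction is purely unconditional, with scale 1 it is purely conditional, and larger scales extrapolate beyond the conditional prediction.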

COMMENTS • 260

  • @doku7335
    @doku7335 1 month ago +269

    At first I thought "oh, another random video explaining the same basics and not adding anything new", but I was so wrong. It's an incredibly clear explanation of diffusion, and starting with the basics makes the full picture much clearer. Thank you for the video!

    • @gonfpv
      @gonfpv 26 days ago +4

      You should check the rest of his videos. All are of sublime quality

    • @pvic6959
      @pvic6959 21 days ago +2

      > makes the full picture much clearer
      hehe did it help denoise

    • @MinoriMirari-fans
      @MinoriMirari-fans 18 hours ago

      I mean it's a bit over simplified...

    • @MinoriMirari-fans
      @MinoriMirari-fans 18 hours ago

      Diffusion these days for example could implement any number of methods.

    • @MinoriMirari-fans
      @MinoriMirari-fans 18 hours ago

      For a more advanced technical perspective you could join this server, where we research and study all forms of AI, especially generative AI prompting, theoretical ways to run computation of AI neural networks, and tandems such as quantum networks. We also help suggest and invent theoretical applications of AI and ways to enhance the systems, etc.

  • @algorithmicsimplicity
    @algorithmicsimplicity 4 months ago +212

    Next video will be on Mamba/SSM/Linear RNNs!

    • @benjamindilorenzo
      @benjamindilorenzo 3 months ago

      great! Also maybe think about the tradeoff between scaling and incremental improvements, in case your perspective is that LLMs also always approximate the data set and therefore memorize, rather than having any "emergent capabilities". So that ChatGPT also does "only" curve fitting.

    • @harshvardhanv3873
      @harshvardhanv3873 1 month ago +2

      I am a student pursuing a degree in AI and we want more of your videos, even for the simplest concepts in AI. Trust me, this channel will be a huge deal in the near future, good luck!!

    • @QuantenMagier
      @QuantenMagier 29 days ago

      Well take my subscription then!!1111

    • @atishayjain1141
      @atishayjain1141 25 days ago

      Where did you learn all this? Also, have you tried to code it yourself?

  • @jupiterbjy
    @jupiterbjy 1 month ago +125

    kinda sorry to my professors and seniors but this is the single best explanation of the logic behind each model. A dozen-minute vid > 2 years of confusion in univ

  • @user-my3dd4lu2k
    @user-my3dd4lu2k 2 months ago +124

    Man I love the fact that you present the fundamental idea with an intuitive approach, and then discuss the optimization.

  • @jasdeepsinghgrover2470
    @jasdeepsinghgrover2470 1 month ago +40

    This is a much better explanation than the diffusion paper itself. They just went all around variational inference to get the same result!

  • @yqisq6966
    @yqisq6966 1 month ago +55

    The clearest and most concise explanation of diffusion models I've seen so far. Well done.

  • @user-fh7tg3gf5p
    @user-fh7tg3gf5p 4 months ago +41

    This genius only makes videos occasionally, and they are not to be missed.

  • @rafa_br34
    @rafa_br34 1 month ago +24

    Such an underrated video, I love how you went from the basic concepts to complex ones and didn't just explain how it works but also the reason why other methods are not as good/efficient.
    I will definitely be looking forward to more of your content!

  • @erfanasgari21
    @erfanasgari21 22 days ago +6

    This is literally the best explanation of the diffusion models I have ever seen.

  • @pw7225
    @pw7225 1 month ago +22

    The way you tell the story is fantastic! I am surprised that all AI/ML books are so terrible at didactics. We should always start at the intuition, the big picture, the motivation. The math comes later when the intuition is clear.

    • @dustinandrews89019
      @dustinandrews89019 27 days ago +6

      I have seen the "math-first, intuition later or never" approach in a lot of teaching. High school and college math, physics and programming classes are rife with this approach. I agree it's sub-optimal for most students. I have some vague ideas about why this approach perpetuates itself and I have seen a lot of gatekeeping around learning in a bottom-up way. It's lovely to see some educators like AlgorithmicSimplicity and Three Blue One Brown break things down in a much more intuitive way that then allows us to understand the maths.

    • @fog1257
      @fog1257 19 days ago

      @@dustinandrews89019 I think the main reason is time. Most university courses are 8 weeks in my case and there simply isn't enough time to explain all the details of the theory behind electronics or math, for example. My learning is terrible when I am just given a formula for a particular problem, it's useless to me. Instead I end up spending days understanding who came up with the formula and why before I derive it myself, and then I will never forget it since it becomes part of my intuition.
      Another reason I've noticed is, sadly, a lack of deeper understanding from some teachers. They themselves only memorized the solution to the problem but they don't really fully understand the problem or the solution; in my opinion they are unfit for teaching. A teacher should never be worried about a student asking why.

  • @Jack-gl2xw
    @Jack-gl2xw 1 month ago +17

    I have trained my own diffusion models and it required me to do a deep dive of the literature. This is hands down the best video on the subject and covers so much helpful context that makes understanding diffusion models so much easier. I applaud your hard work, you have earned a subscriber!

    • @Real-HumanBeing
      @Real-HumanBeing 9 days ago

      You realize these models contain their dataset, right? And that’s the only way they can work.

  • @RicardoRamirez-dr6gc
    @RicardoRamirez-dr6gc 1 month ago +10

    This is seriously one of the best explainer videos i've ever seen. I've spent a long time trying to understand diffusion models and not a single video has come close to this one

  • @GianlucaTruda
    @GianlucaTruda 27 days ago +5

    Holy shit, at 11:03 I suddenly realised what you were cooking! I've been trying to find a way to articulate this interesting relationship between autoregression and diffusion for ages (my thesis developed diffusion models for tabular data). This is such a brilliantly-visualised and intuitively explained video! Well done. And the classifier-free guidance explanation you threw in at the end has got to be some of the most high-ROI intuition pumping I've seen on YouTube.

  • @Veptis
    @Veptis 1 month ago +7

    This is a great explanation of how image decoders work. I haven't seen this approach and narrative direction yet.
    This is now my reference for explaining it to people who have no idea!

  • @HD-Grand-Scheme-Unfolds
    @HD-Grand-Scheme-Unfolds 1 month ago +8

    You truly understand how to simplify... to engage our imagination... to employ naive thoughts or ideas to make comparisons that bring across deeper, more core principles and concepts, making the subject far easier to grasp and get an intuition for. Algorithmic Simplicity indeed... thank you for your style of presentation and teaching. love it love it... you make me know what question I want to ask but didn't know I wanted to ask. YouTube needs your contribution in ML education. please don't forget that.

  • @riddhimanmoulick3407
    @riddhimanmoulick3407 21 days ago +4

    Kudos for an incredibly intuitive explanation! Really loved the visual representations too!!

  • @pseudolimao
    @pseudolimao 1 month ago +22

    this is insane. I feel bad for getting this level of content for free

  • @benjamindilorenzo
    @benjamindilorenzo 3 months ago +8

    Very good job.
    My suggestion is that you explain more about how it actually works, that the model learns to understand complete sceneries just from text prompts.
    This could fill its own video.
    Also it would be very nice to have a video about Diffusion Transformers, which OpenAI's Sora probably is.
    Also it could be great to have a Video about the paper "Learning in High Dimension Always Amounts to Extrapolation".
    best wishes

    • @algorithmicsimplicity
      @algorithmicsimplicity 3 months ago +7

      Thanks for the suggestions, I was planning to make a video about why neural networks generalize outside their training set from the perspective of algorithmic complexity. That paper "Learning in High Dimension Always Amounts to Extrapolation" essentially argues that the interpolation vs extrapolation distinction is meaningless for high dimensional data, and I agree, I don't think it is worth talking about interpolation/extrapolation at all when explaining neural network generalization.

    • @benjamindilorenzo
      @benjamindilorenzo 3 months ago +2

      @@algorithmicsimplicity yes true. It would be great also because this links back to the LLM discussions: whether scaling up Transformers actually brings up "emergent capabilities", or whether this can be explained more simply and less magically by extrapolation.
      Or in other words: either people tend to believe that Deep Learning architectures like Transformers only approximate their training data set, or people tend to believe that seemingly unexplainable or unexpected capabilities emerge while scaling.
      I believe that extrapolation alone explains really well why LLMs work so well, especially when scaled up, AND that LLMs "just" approximate their training data (curve fitting). This is why I brought this up ;)

  • @Frdyan
    @Frdyan 1 month ago +4

    I have a graduate degree in this shit and this is by far the clearest explanation of diffusion I've seen. Have you thought about doing a video running over the NN Zoo? I've used that as a starting point for lectures on NN and people seem to really connect with that paradigm

  • @themodernshoe2466
    @themodernshoe2466 22 days ago +1

    This has been on my watch later for 3 months. Finally got to watching it, glad I did. This is an exceptional explanation of the technologies at play here.

  • @lusayonyondo9111
    @lusayonyondo9111 20 days ago +1

    wow, this is such an amazing resource. I'm glad I stuck around. This is literally the first time this is all making sense to me.

  • @karlnikolasalcala8208
    @karlnikolasalcala8208 1 month ago +4

    This channel is gold, I'm glad I've randomly stumbled across one of your vids

  • @oculuscat
    @oculuscat 1 month ago +6

    Diffusion doesn't necessarily work better than auto-regression. The "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction" paper introduces an architecture they call VAR that upscales noise using an AR model and this currently out-performs all diffusion models in terms of speed and accuracy.

  • @poipoi300
    @poipoi300 21 days ago +2

    This is refreshing to watch in a sea of people who don't know what they're talking about and decide to make "educational" videos on the subject anyways. The simplifications are often harmful.

  • @nasseral-bess564
    @nasseral-bess564 23 days ago +1

    This is actually one of the best, if not the best, deep learning related videos on YouTube
    Thanks for your efforts

  • @londonl.5892
    @londonl.5892 27 days ago +1

    So glad this came across my recommended feed! Fantastic explanation and definitely cleared up a lot of confusion I had around diffusion models.

  • @MeriaDuck
    @MeriaDuck 1 month ago +1

    This must be one of the best and most concise explanations I've seen!

  • @Matyanson
    @Matyanson 1 month ago +3

    Thank you for the explanation. I already knew a little bit about diffusion but this is exactly the way I'd hope to learn. Start from the simplest examples (usually historical) and progressively advance, explaining each optimisation!

  • @CodeMonkeyNo42
    @CodeMonkeyNo42 1 month ago

    Great video. Love the pacing and how you distilled the material into such an easy to watch video. Great job!

  • @banana_lemon_melon
    @banana_lemon_melon 1 month ago +1

    bruh, I loved your content. Other channels/videos usually explain general knowledge that can be easily found on the internet. But you're going deeper into the intrinsic aspects of how the stuff works. This video, and one of your videos about transformers, are really good.

  • @mattshannon5111
    @mattshannon5111 18 days ago +1

    Wow, it requires really deep understanding and a lot of work to make videos this clear that are also so correct and insightful. Very impressive!

  • @wormjuice7772
    @wormjuice7772 16 days ago +1

    This has helped me so much wrapping my head around this whole subject!
    Thank you for now, and the future!

  • @updated_autopsy_report
    @updated_autopsy_report 26 days ago +1

    I really enjoyed this video!! took a lot of notes while watching it too. you have a god tier ability to explain concepts in an easy to follow way

  • @shivamkaushik6637
    @shivamkaushik6637 29 days ago

    Never knew youtube could give random suggestion to videos like these. This was mind blowing. The way you teach is work of art.

  • @justanotherbee7777
    @justanotherbee7777 4 months ago +3

    Even a person with very little background can understand what he describes here. Commenting so that YouTube recommends it to others.
    wonderful video! really good one

  • @mrdr9534
    @mrdr9534 1 month ago +1

    Thanks for taking the time and effort of making and sharing these videos and Your knowledge.
    Kudos and best regards

  • @deep.space.12
    @deep.space.12 16 days ago +2

    If there will be a longer version of this video, it might be worth mentioning VAE as well.

  • @snippletrap
    @snippletrap 18 days ago +1

    Fantastic explanation. Very intuitive

  • @abdelhakkhalil7684
    @abdelhakkhalil7684 1 month ago +1

    This was a good watch, thank you :)

  • @akashmody9954
    @akashmody9954 4 months ago +2

    Great video....already waiting for your next video

  • @kkordik
    @kkordik 23 days ago +1

    Bro, this is amazing!!! Your explanation is so clear, like it

  • @xaidopoulianou6577
    @xaidopoulianou6577 1 month ago +1

    Very nicely and simply explained! Keep it up

  • @vidishapurohit4709
    @vidishapurohit4709 23 days ago +1

    very nice visual explanations

  • @photamasan9661
    @photamasan9661 21 days ago +1

    You’re him 🙌🏽. Thank you so much. Getting this kind of information, or a good explanation, is not easy with all the “BREAKING AI NEWS !😮‼️” on YouTube now.

  • @zlatanonkovic2424
    @zlatanonkovic2424 21 days ago +1

    What a great explanation!

  • @iestynne
    @iestynne 1 month ago +1

    Wow, fantastic video. Such clear explanations. I learned a great deal from this. Thank you so much!

  • @neonelll
    @neonelll 21 days ago +1

    The best explanation I've seen. Great work.

  • @art4eigen93
    @art4eigen93 22 days ago +1

    So simple ! Thank you.

  • @MichaelBrown-gt4qi
    @MichaelBrown-gt4qi 15 days ago

    This is a great video. I have watched videos in the past (years ago) talk about auto-regression and more lately talk about diffusion. But it's nice to see why and how there was such a jump between the two. Amazing! However, I feel this video is a little incomplete when there was no mention of the enhancer model that "cleans up" the final generated image. This enhancing model is able to create a larger image while cleaning up the six fingers gen AI is so famous for. While not technically a part of the diffusion process (because it has no random noise) it is a valuable addition to image gen if anyone is trying to build their own model.

  • @TheTwober
    @TheTwober 20 days ago +1

    The best explanation I have found on the internet so far. 👍

  • @RezaJavadzadeh
    @RezaJavadzadeh 22 days ago +1

    such complete explanations, keep it up thank you

  • @jcorey333
    @jcorey333 4 months ago +7

    This is an amazing quality video! The best conceptual video on diffusion in AI I've ever seen.
    Thanks for making it!
    I'd love to see you cover RNNs.

  • @istoleyourfridgecall911
    @istoleyourfridgecall911 20 days ago +1

    Hands down the best video that explains how these models work. I love that you explain these topics in a way that resembles how the researchers created these models. Your video shows the thinking process behind these models, combined with great animated examples, it is so easy to understand. You really went all out. Only if youtube promoted these kinds of videos instead of brainrot low quality videos made by inexperienced teenagers.

  • @ecla141
    @ecla141 1 month ago +2

    Awesome video! I would love to see a video about graph neural networks

  • @user-er9pw4qh6j
    @user-er9pw4qh6j 1 month ago +2

    Soooo Good!!! Thanks for making it!!!!

  • @iancallegariaragao
    @iancallegariaragao 4 months ago +2

    Great video and amazing content quality!

  • @vasil_astrov
    @vasil_astrov 21 days ago +1

    Thank you! This is great explanation❤

  • @abhijeetvishwasrao
    @abhijeetvishwasrao 25 days ago +1

    Awesome explanation 👏

  • @JordanMetroidManiac
    @JordanMetroidManiac 1 month ago +1

    I finally understand how models like Stable Diffusion work now! I tried understanding them before but got lost at the equation (17:50), but this video describes that equation very simply. Thank you!

  • @alenqquin4509
    @alenqquin4509 9 days ago +1

    A very good job, I have deepened my understanding of generative AI

  • @user-ou3ts4hl7p
    @user-ou3ts4hl7p 7 days ago +1

    Very good video. I got to know the straightforward reason why the diffusion idea emerged and why diffusion is intrinsically better than the auto-regression algorithm.

  • @alaad1009
    @alaad1009 24 days ago +1

    Amazing video !

  • @simonpenelle2574
    @simonpenelle2574 15 days ago +1

    Amazing content I now want to implement this

  • @ArtOfTheProblem
    @ArtOfTheProblem 1 month ago +1

    great work

  • @user-yj3mf1dk7b
    @user-yj3mf1dk7b 1 month ago +1

    nice explanations, although I already knew about diffusion. The examples, from the simplest to the final diffusion model, were a really nice touch.

  • @sanjeev.rao3791
    @sanjeev.rao3791 1 month ago +1

    Wow, that was a fantastic explanation.

  • @marcusbluestone2822
    @marcusbluestone2822 27 days ago +1

    Brilliant explanation. Thank you very much

  • @lialkalo4093
    @lialkalo4093 10 days ago +1

    very good explanation

  • @atifadib
    @atifadib 26 days ago +1

    great video... loved it!

  • @ShubhamSinghYoutube
    @ShubhamSinghYoutube 1 month ago +1

    Love the conclusion

  • @tkimaginestudio
    @tkimaginestudio 1 month ago +1

    Great explanations, thank you!

  • @mojtabavalipour
    @mojtabavalipour 1 month ago +1

    Well done!

  • @hmmmza
    @hmmmza 4 months ago +3

    what a great rare content!

  • @Mhrn.Bzrafkn
    @Mhrn.Bzrafkn 1 month ago +3

    It was too easy understanding👌🏻👌🏻

  • @anatolyr3589
    @anatolyr3589 2 months ago +1

    Great explanation!👍👍, I personally would like to see a video observing all major types of neural nets with their distinctions, specifics, advantages, disadvantages etc. the author explains very well 👏👏

  • @gabrielgraf2521
    @gabrielgraf2521 25 days ago +2

    Boah what a good explanation. I was always wondering how these big NNs like ChatGPT and DALL-E work. Thank you

  • @RobotProctor
    @RobotProctor 1 month ago +1

    Thank you. This video is wonderful

  • @morrisdehaan6679
    @morrisdehaan6679 27 days ago +1

    So good!

  • @joaosousapinto3614
    @joaosousapinto3614 1 month ago +1

    Great video, congrats.

  • @robosergTV
    @robosergTV 25 days ago +1

    good stuff, thanks

  • @1.4142
    @1.4142 4 months ago +4

    Some2 really brought out some good channels

  • @meanderthalensis
    @meanderthalensis 1 month ago +1

    Great video!

  • @vijayaveluss9098
    @vijayaveluss9098 1 month ago +1

    Great explanation

  • @marcinstrzesak346
    @marcinstrzesak346 1 month ago +1

    Very good video. Thank you

  • @ollie-d
    @ollie-d 1 month ago +1

    Solid video!

  • @pon1
    @pon1 1 month ago +1

    Still feels like magic to me 🙌🙌

  • @RobotProctor
    @RobotProctor 1 month ago +2

    I like to think of ML as a funky calculator. Instead of a calculator where you give it inputs and an operation and it gives you an output, you give it inputs and outputs and it gives you an operation.
    You said it's like curve fitting, which is the same thing, but I like thinking the words funky calculator because why not
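
The "funky calculator" picture can be made concrete with a minimal sketch: hand in inputs and outputs, get back an operation. This uses ordinary least squares for a straight line, chosen purely for illustration (neural networks fit far more flexible curves), and the name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept; returns the fitted operation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

# Inputs and outputs go in...
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# ...and an operation comes out, usable on inputs it has never seen.
operation = fit_line(xs, ys)
prediction = operation(10.0)
```

Here the data lie exactly on y = 2x + 1, so the recovered operation maps 10 to 21.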

  • @paaabl0.
    @paaabl0. 1 month ago

    Great video! Focus on the right elements.

  • @fractalinfinity
    @fractalinfinity 7 days ago +1

    I get it now! 🎉 thanks!

  • @khangvutien2538
    @khangvutien2538 1 month ago

    Thank you very much.
    I enjoyed the first part, the first 10 seconds.
    After that, there are too many shortcuts in the explanations, and I struggled to understand them and to explain them again to myself. Still, I subscribed.
    As for suggestions for other videos, I'll check whether you have explained the U-Net already. If not I'd appreciate to have the same kind of explanation about it.

  • @looooool3145
    @looooool3145 2 days ago +1

    i now understand things, thanks!

  • @AurL_69
    @AurL_69 1 month ago +1

    thanks for explaining

  • @demohub
    @demohub 1 month ago +1

    Just subscribed. Great video

  • @sobhhi
    @sobhhi 1 month ago +2

    I think it would help to mention that the auto-regressors may be viewing the image as a sequence of pixels (RGB vectors). Overall excellent video, extremely intuitive.

    • @algorithmicsimplicity
      @algorithmicsimplicity 1 month ago +1

      In general, auto-regressors do not view images as a sequence. For example, PixelCNN uses convolutional layers and treats inputs as 2d images. Only sequential models such as recurrent neural networks would view the image as a sequence.

    • @sobhhi
      @sobhhi 1 month ago

      @@algorithmicsimplicity of course, but I feel mentioning it may help with intuition as you’re walking through pixel by pixel image generation
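
The distinction discussed in this thread can be sketched in a few lines: pixels are still generated one at a time in raster order, but the predictor looks at the partially filled 2D grid rather than a flattened sequence. This is a toy under stated assumptions, not PixelCNN: `toy_predictor` is a hypothetical hand-written rule standing in for a trained convolutional network, and the image is binary.

```python
import random

def toy_predictor(image, row, col):
    """Hypothetical stand-in for a trained network: return P(pixel = 1)
    using only pixels that were already generated (left and above)."""
    neighbours = []
    if col > 0:
        neighbours.append(image[row][col - 1])   # pixel to the left
    if row > 0:
        neighbours.append(image[row - 1][col])   # pixel directly above
    if not neighbours:
        return 0.5                               # first pixel: no context yet
    # Favour agreeing with the neighbours, so samples form smooth patches.
    return 0.1 + 0.8 * sum(neighbours) / len(neighbours)

def generate(height, width, rng):
    """Auto-regressive sampling: fill the grid pixel by pixel in raster order."""
    image = [[None] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            p = toy_predictor(image, row, col)
            image[row][col] = 1 if rng.random() < p else 0
    return image

sample = generate(4, 6, random.Random(0))
```

The predictor receives the 2D grid and reads only the causal neighbourhood, which is the sense in which such a model treats its input as an image rather than as a sequence.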

  • @zephilde
    @zephilde 1 month ago +3

    Great visualisation! Good job!
    Maybe next video on LoRA or ControlNet ?

  • @winstongraves8321
    @winstongraves8321 1 month ago +1

    Great video

  • @infographie
    @infographie 1 month ago +1

    Excellent.

  • @tryptamedia7375
    @tryptamedia7375 1 day ago +1

    So do the recent large world model breakthroughs of Sora, Luma, Runway alpha imply that we've returned to auto regressive? Are they a combo of the two? Amazing video, would love to hear your thoughts!

    • @algorithmicsimplicity
      @algorithmicsimplicity 1 day ago

      From what little they have released publicly, it seems that they are simply diffusion models applied to videos, i.e. they treat videos as a collection of frames, add noise to all frames, take all noisy frames as input and try to predict all clean frames. I don't think there is any auto-regression done, but maybe that will change when they start generating longer videos.

  • @jayantdubey3025
    @jayantdubey3025 22 days ago +1

    In your neural network animations, the traveling highlight starts from the image, goes through the neural net, then to the output pixel. I understand this as information traveling forward. When the highlights reverse direction, does this represent back propagation at the regressed value of the pixel? Great video by the way!

    • @algorithmicsimplicity
      @algorithmicsimplicity 22 days ago

      Yep it's just meant to demonstrate the weights in the network changing based on the error in the predicted value.
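
What "weights changing based on the error" means can be shown with a minimal sketch, assuming a one-weight linear model and a squared-error loss (both chosen for illustration; the video's networks are far larger, and the names below are hypothetical). One gradient-descent step nudges the weight and bias in the direction that shrinks the prediction error.

```python
def predict(w, b, x):
    """Tiny 'network': a single weight and bias."""
    return w * x + b

def sgd_step(w, b, x, target, lr):
    """One update on the loss 0.5 * (prediction - target) ** 2."""
    error = predict(w, b, x) - target
    w_new = w - lr * error * x   # dLoss/dw = error * x
    b_new = b - lr * error       # dLoss/db = error
    return w_new, b_new

w, b = 0.0, 0.0
x, target = 2.0, 1.0
error_before = abs(predict(w, b, x) - target)
w, b = sgd_step(w, b, x, target, lr=0.1)
error_after = abs(predict(w, b, x) - target)
```

After this single step the absolute error drops from 1.0 to 0.5; repeating the step keeps shrinking it.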

  • @mallow610
    @mallow610 1 month ago +2

    Video is a banger