Damn I just realized the final 15 seconds of the video were cut off at the end. I was essentially saying see you in the next one 😅
Looking forward to it, thanks a lot for your outstanding work.
I didn't understand that. Is it a small oversight in the editing or some joke I don't get 🤔
@harriehausenman8623 A mistake while doing the editing unfortunately 😬
@@Deepia-ls2fo happens. kinda funny, actually 😄 reminds me of the editing style of Uri Tuchman 😉
Crossing my fingers that the algorithm decides to promote your videos more. Every video of yours is a gem!
please continue making high quality videos
second that!
Amazing content. This channel will go far.
Hey, I was searching for autoencoders just yesterday, perfect timing!
The one thing that got me is how the examples are actually clear and to the point, while you also take a very nice route from a basic understanding of the problem to the final result.
May I suggest that you write out a thought or a theorem in full words?
What I mean is, maybe at 8:25 where you explain what a posterior mean is, you could write it out with letters, so that people who didn't get it can pause and ponder it while remembering what is said.
For real though, you do such an outstanding job explaining this concept, I learned a lot from those 15 minutes.
You manage to explain just the right parts of the math to guide us to the conclusion without making the video too long. That's an art! Thank you so much for this. Other explanations expect a lot more of the viewer/reader and unless you are an expert you are always looking things up. Whereas you got it in one, including all the necessary tangents. Thank you! ❤
This channel is absolutely incredible! Please keep uploading more amazing content!
Nothing like watching diffusion math à la Gymnopédie No. 1, amazing soundtrack choice haha
Good ol' Erik is always a good fit for calm math content 👍
From the perspective of probability, denoising means letting the neural network learn the noise distribution on the image and move the image towards its likely expectation. The autoencoder projects the image near the low-dimensional manifold and then maps it back to the original space. Can we think of the denoising step as being completed in the bottleneck, with the noise distribution only implicitly predicted? Then, if we add a bypass parameter on the bottleneck to specifically predict the Gaussian noise distribution, can we effectively remove the Gaussian noise from the noisy image?
That's usually what's done in practice: people use a UNet, which has skip connections. The network's weights are then mainly dedicated to estimating the residual noise.
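Here's a minimal sketch of that idea (assuming PyTorch; the architecture and names below are made up and much simpler than a real UNet): the noisy input reaches the decoder through a skip connection, and the network is trained to predict the noise itself, so the denoised image is just the input minus the prediction.

```python
import torch
import torch.nn as nn

class TinyResidualDenoiser(nn.Module):
    """Toy denoiser with one skip connection, trained to predict the noise."""
    def __init__(self, channels: int = 1, hidden: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
        )
        # The decoder also sees the raw noisy input (skip connection),
        # so its weights mostly model the residual noise.
        self.decoder = nn.Sequential(
            nn.Conv2d(hidden + channels, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, channels, 3, padding=1),
        )

    def forward(self, noisy: torch.Tensor) -> torch.Tensor:
        features = self.encoder(noisy)
        return self.decoder(torch.cat([features, noisy], dim=1))

# One training step: the regression target is the noise, not the clean image.
model = TinyResidualDenoiser()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
clean = torch.rand(8, 1, 28, 28)          # stand-in for a batch of images
noise = 0.1 * torch.randn_like(clean)     # Gaussian noise, sigma = 0.1
optimizer.zero_grad()
predicted_noise = model(clean + noise)
loss = nn.functional.mse_loss(predicted_noise, noise)
loss.backward()
optimizer.step()
denoised = (clean + noise) - predicted_noise  # recover the image estimate
```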
@@Deepia-ls2fo Thanks for the reply!
Wow, the manifold hypothesis does indeed make sense to me, but I will need to review probability to understand the proof. Great video.
Great video as always! It was a bit hard to follow in certain areas though... I think visuals really help to convey an intuition for these sort of things, and the visuals were a bit sparse in this video. In my opinion, your autoencoder and variational autoencoder videos struck a perfect balance of math complemented by illustrations. I realize that the artwork is probably what takes the longest when making a video, but it makes them so much more accessible. In any case, keep up the great work - can't wait to see future videos from the channel!
Thanks for watching the videos! I had to produce this video very fast and I'm not 100% satisfied with the result yet. Hopefully I'll have more time in 2025. :)
Amazing clarity as always, thank you!
Your channel gets an unfairly low number of views per video. Just don't give up, and one day your channel will blow up, because you deserve it considering the high quality of the content you provide. Much love
Finally the new video is online 🥳🥳🥳
Thank you for your excellent explanation!
Yet another great video with top animations, thanks a lot
Thank you very much for this, please continue making these 🙏
Great visuals, and a very nice way of explaining the manifold hypothesis using the handwritten digits. If I may say, the second half of the video was a bit too packed and I had some trouble following through. I would also suggest putting the ad at the start or end of the video, because it interrupts the flow of ideas. Very nice work though, and looking forward to more AE videos!
Great content again, thank you very much 🎉
Wow, that was awesome!
The sigma on the right side of Tweedie's equation makes sense, but I do not know how to prove it.
I'll probably detail how we derive the formula in the next video. I didn't want this one to be even more dense in terms of calculations.
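In the meantime, for anyone pausing on it: assuming the usual Gaussian setup from the video, y = x + n with n ~ N(0, σ²I), Tweedie's formula writes the posterior mean as

```latex
\mathbb{E}[x \mid y] \;=\; y \;+\; \sigma^{2}\, \nabla_{y} \log p(y)
```

so the σ² on the right-hand side is just the variance of the added noise, scaling the score of the noisy marginal p(y).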
Are manifolds mode-agnostic? I.e., are they inherently multimodal? So, would a semantic statement, let's say "a barbed tube fitting", lie on the same manifold as pictures of a "barbed tube fitting"? Finally, could you "walk" said manifold to semantically reconstruct the set of all possible valid "barbed tube fittings"? Asking for a friend.
What is sigma in the formula towards the end of the video?
The standard deviation of the Gaussian noise applied to the images!
Such a great video
Can someone tell me how much maths I need to know to actually understand this alchemy?
Basically undergrad with a focus on stats/calculus!
Great video and explanation! I'm currently working on a project and want to do similar animations with Manim. Can I take a look at your code?
I'll be releasing the code very soon, check the video description for the link
And all of that while having Covid... What a boss.
Does knowing all of this benefit you, or is it just for your curiosity? I'm genuinely curious because I might also want to dig deeper into machine learning (I'm not learning this in uni).
This is the theoretical basis of my research work, so I'd even say it's required.
@@Deepia-ls2fo Would it benefit me?
@@erwinschulhoff4464 Not sure it's worth your time if you're an engineer
The second part of the video is a bit hard, I would say. You jump to a priori distributions, which are hard to follow. I like to watch a video and tell myself I don't need another video to understand it, which was not completely the case here, unfortunately! Still amazing work, and looking forward to seeing the rest.
Thanks for the feedback!
Great work! But I am a bit sad because I was learning Manim to do exactly the same...
I like the analogy you draw between the manifold hypothesis and stochastic sampling via Tweedie's formula. I didn't know Tweedie's work, btw. In my case I would have introduced diffusion from physics and Langevin dynamics, where a particle follows a force field towards a potential well (papers from Song et al. and P. Vincent).
Don't hesitate to make your own take on this topic, surely the videos will be different :)
I plan on talking about sampling in the next video, without going all the way to diffusion models though
Another interesting aspect is: what is the purpose of the noise in diffusion models? It is to better learn the gradient (the score) in low-density regions.
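To make that concrete, here is a rough sketch of annealed Langevin sampling (purely illustrative: `score(x, sigma)` is a hypothetical trained network approximating the gradient of log p_sigma(x)). Smoothing the data distribution with noise makes the score informative even in low-density regions, and the noise level is then annealed from large to small:

```python
import torch

def langevin_sample(score, x, sigmas, steps_per_level=50, step_scale=1e-4):
    """Annealed Langevin dynamics: gradient steps along the score plus injected noise."""
    for sigma in sigmas:                  # anneal from large to small noise levels
        step = step_scale * sigma ** 2    # smaller steps at lower noise levels
        for _ in range(steps_per_level):
            grad = score(x, sigma)        # points towards higher-density regions
            x = x + step * grad + (2 * step) ** 0.5 * torch.randn_like(x)
    return x

# Toy call with the exact score of a standard Gaussian, just to show the interface:
sample = langevin_sample(lambda x, sigma: -x, torch.randn(4, 2), sigmas=[1.0, 0.5, 0.1])
```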
(is it supposed to cut out at the end like that?)
Unfortunately this glosses over some rather key points. A naive implementation would give you a blurry reconstruction, because it's the average of all possible underlying noise-free versions. One solution is to compute the gradient and take small steps. EP Simoncelli and Z Kadkhodaie have much better lectures on this.
Any links or keywords to find the lectures you mentioned?
@@baxile I always struggle to paste links; sometimes they get spam filtered. The ones I was thinking of, I had to find them again:
An early one from the channel "Deepmath":
Eero Simoncelli - Making use of the Prior Implicit in a Denoiser
One on the YouTube channel "MITCBMM":
Photographic Image Priors in the Era of Machine Learning
And one on the channel "Simons Institute":
Generalization in Diffusion Models Arises from Geometry-Adaptive Harmonic Representations
@@luke.perkin.inventor That's very helpful! Thanks a lot!!
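For anyone curious, here is a minimal sketch of the "small steps" idea mentioned above (not the exact algorithm from those lectures; `denoiser` is a hypothetical model returning an estimate of the clean image): instead of accepting the one-shot output, which is the posterior mean and therefore a blurry average of all plausible clean images, you move a small fraction of the way towards it at each iteration.

```python
import torch

def iterative_denoise(denoiser, y, n_steps=100, step_size=0.05):
    """Take many small steps towards the denoiser's estimate instead of one big jump."""
    x = y.clone()
    for _ in range(n_steps):
        residual = denoiser(x) - x    # proportional to the score, by Tweedie's formula
        x = x + step_size * residual  # small step in the gradient direction
    return x

# Toy call with a trivial shrinkage "denoiser", just to show the interface:
result = iterative_denoise(lambda x: 0.9 * x, torch.randn(1, 1, 28, 28))
```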
first
Why do all YouTubers have this terrible habit of putting annoying, disturbing and distracting background music? Sorry, thumbs down 👎
I think it's pretty good. No need to be rude and call it annoying. Its volume could be just a couple percent lower, but overall the background music does make the video much more interesting to watch.
great video, but I really wish you wouldn't be part of the 'dark theme' cult.
Thanks, what kind of theme would you prefer? White background?
@@Deepia-ls2fo I really enjoy your channel! Yes indeed. I can't actually watch negatively biased (white-on-black) content and usually use an inverting shader. But when pictures and colors come in, it of course gets weird 😄 Not sure when this trend (using negatively biased graphics) started, I guess the legendary 3B1B and his Manim had something to do with it 🤗
@@harriehausenman8623 That's indeed the default mode of Manim, but thanks for the feedback, it's good to know it can be hard to follow for some people. The styling and color scheme are really something I need to work on.