This is the clearest explanation of convolution that I've ever come across.
Magically clear. Finally got what convolution is all about. I appreciate your hard work.
Thanks Can! I'm happy you found it helpful.
I spent hours trying to learn the concept of convolution and couldn't find an answer to how it works in image processing, but you gave us probably the closest thing to a practical definition.
Thanks Nauman :)
My most sincere kudos. Your way of explaining things is how teaching should be. Looking forward to learning more from you.
This is the best explanation I have found to date on convolutions and the animations were fantastic. Great work.
Thanks Eric :)
Had to watch this a few times to start to get it, but it's a great video. So there are two ways of looking at the convolution. One is where the kernel is the actual thing being convolved and it's saying, 'At each pixel, give me a score for how well the kernel matches at this region.'
The second way is where you're treating the kernel more like a subimage you want to make multiple copies of, as scaled by pixel values in another image. But then to counteract the way kernels pull data into the center of the kernel, you have to reverse the subimage you want to make copies of before you use it as the kernel.
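The two views in this comment can be sketched in one dimension with plain Python. This is only an illustration: the function names `convolve_by_stamping` and `sliding_match_scores` and the toy data are assumptions, not something taken from the video.

```python
# Illustrative sketch only: the function names and data are assumptions.

def convolve_by_stamping(signal, kernel):
    """Second view: stamp a copy of the kernel at every sample, scaled by that sample."""
    out = [0] * (len(signal) + len(kernel) - 1)
    for i, sample in enumerate(signal):
        for j, weight in enumerate(kernel):
            out[i + j] += sample * weight
    return out


def sliding_match_scores(signal, kernel):
    """First view: slide the kernel along the zero-padded signal, multiply and sum."""
    pad = len(kernel) - 1
    padded = [0] * pad + list(signal) + [0] * pad
    n_positions = len(signal) + pad
    return [
        sum(padded[start + j] * weight for j, weight in enumerate(kernel))
        for start in range(n_positions)
    ]


signal = [0, 1, 0, 0, 2, 0]  # two "pixels" with values 1 and 2
kernel = [1, 2, 3]           # asymmetric, so its orientation is visible

stamped = convolve_by_stamping(signal, kernel)
reversed_slide = sliding_match_scores(signal, kernel[::-1])
plain_slide = sliding_match_scores(signal, kernel)

print(stamped)         # [0, 1, 2, 3, 2, 4, 6, 0]
print(reversed_slide)  # [0, 1, 2, 3, 2, 4, 6, 0]
print(plain_slide)     # [0, 3, 2, 1, 6, 4, 2, 0]
```

Sliding the reversed kernel reproduces the stamped copies exactly, while sliding the kernel as-is leaves mirrored copies behind.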
Exactly, this is what I understood too
the best ML tutor out there
Thanks for this video, great explanation!
omg the best explanation
One nice thing to keep in mind is that convolution in the time/spatial domain is a simple multiplication in the frequency domain. And vice versa 😉
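A small numerical check of that statement, as a sketch under stated assumptions: it uses a hand-written discrete Fourier transform from the standard library rather than an FFT library, and the helper names (`dft`, `inverse_dft`, `convolve_direct`, `convolve_via_frequency`) are made up for illustration. Both inputs are zero-padded to the full output length first, because multiplying DFTs corresponds to circular convolution.

```python
import cmath

# Illustrative sketch only: helper names are assumptions, and the O(n^2) DFT
# is for clarity, not speed.

def dft(x):
    """Discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
        for f in range(n)
    ]


def inverse_dft(spectrum):
    """Inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[f] * cmath.exp(2j * cmath.pi * f * t / n) for f in range(n)) / n
        for t in range(n)
    ]


def convolve_direct(a, b):
    """Convolution computed directly in the time/spatial domain."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def convolve_via_frequency(a, b):
    """Same result via pointwise multiplication in the frequency domain."""
    n = len(a) + len(b) - 1
    a_padded = list(a) + [0.0] * (n - len(a))
    b_padded = list(b) + [0.0] * (n - len(b))
    product = [fa * fb for fa, fb in zip(dft(a_padded), dft(b_padded))]
    return [value.real for value in inverse_dft(product)]


a = [1.0, 2.0, 3.0, 4.0]
b = [1.0, 0.0, -1.0]

direct = convolve_direct(a, b)
via_frequency = convolve_via_frequency(a, b)
max_error = max(abs(x - y) for x, y in zip(direct, via_frequency))

print(direct)            # [1.0, 2.0, 2.0, 2.0, -3.0, -4.0]
print(max_error < 1e-9)  # True
```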
What is the idea behind the reversed kernel? What happens to the convolution if we do not reverse it?
Brilliant and awe-inspiring!! Thank you, sir, for such high-quality content.
Thanks Siddhant :)
Convolutions are part of nature as well. We see it in microscopy, where point light sources are convolved by the lenses in the microscope. Diffraction gives each point a roughly Gaussian central blur with some rings around it in the resulting image. The shape is known as an Airy pattern, and its bright central spot as the Airy disc.
In the convolution image at 8:15, it looks like the red diagonal line should be on top and the white one below.
This is really awesome. Keep up the good work :)
Hello, first of all, thank you very much for the video; it's very interesting and well structured. There is one thing I haven't understood, though. For the dog image at 16:57, by my calculations the upper-left edge of its head should produce a red diagonal line in the convolution result instead of a white one, because the edge has much more white spots on its lower-right side and much less on its upper-left side. Why do we get the exact opposite value in the convolution? Have a nice day!
PS: I now resent the use of the word "much" in everyday language, because apparently we can reduce any subject matter to an integer or a series of integers somehow, therefore favoring "many" in all cases.
If you feel motivated to share your work or code it up, I'd be happy to take a look. I love that you are engaging with the topic so deeply.
Thanks for this video. It's very enlightening
Nice explanation! I'd like to see if there's some use for blind deconvolution in ML, e.g. for noise filtering, etc.
Your content is amazing. Will you do any Q&A in your online class?
I am having trouble finding out what “convolution on top of convolution” is.
If you run 20 filters on an image, you get another image with 20 channels. What happens when you run 20 filters on those 20 channels after pooling? Does each of those filters have 20 channels? Do you run 1 filter on 1 channel?
Everyone talks about layer 1, but no one really talks about multi-layer convolution in any great detail. Does your class offer this type of granular analysis?
Thanks shake! We definitely do Q&A in the online classes in the form of running comment threads, one for each 5-10 minute video or post.
Your particular question is one I had too and it drove me crazy until I could find an answer.
Short answer: If the second layer has 20 channels coming into it, then each convolutional kernel will have 20 parts - one for each input channel - and you add the results of all of them together to get a single output channel.
Long answer: You can follow through the one dimensional convolution tutorial and see how it works there in detail (e2eml.school/convolution_one_d.html). We will be stepping through the Python code for doing this in two dimensions in Course 322 (e2eml.school/322) where the handling of multiple input channels and multiple kernels will be made explicit.
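The short answer above can be sketched in one dimension with plain Python. This is a minimal illustration, not the course code: `convolve_1d` and `convolve_multichannel` are hypothetical names, and three input channels stand in for the 20 in the question so the numbers stay readable.

```python
# Illustrative sketch only: names are assumptions, and 3 channels stand in for 20.

def convolve_1d(signal, kernel):
    """Full 1D convolution of one channel with one kernel part."""
    out = [0] * (len(signal) + len(kernel) - 1)
    for i, sample in enumerate(signal):
        for j, weight in enumerate(kernel):
            out[i + j] += sample * weight
    return out


def convolve_multichannel(channels, kernel_parts):
    """One kernel = one part per input channel; per-channel results are summed.

    Assumes all channels share one length and all kernel parts share one length.
    """
    if len(channels) != len(kernel_parts):
        raise ValueError("need exactly one kernel part per input channel")
    per_channel = [
        convolve_1d(channel, part) for channel, part in zip(channels, kernel_parts)
    ]
    # Add the per-channel results position by position: a single output channel.
    return [sum(values) for values in zip(*per_channel)]


channels = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
kernel_parts = [
    [1, 2],      # part applied to input channel 0
    [10, 20],    # part applied to input channel 1
    [100, 200],  # part applied to input channel 2
]

output_channel = convolve_multichannel(channels, kernel_parts)
print(output_channel)  # [1, 12, 120, 200]
```

However many channels come in, one kernel produces one output channel.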
@BrandonRohrer
You are the man! I really appreciate your response and hard work. I will definitely be taking some of your classes in the near future.
I take it that those 20-channel kernels apply the same multiplication across every channel, or can each channel have a different one? I am just wondering how "crazy" kernels get.
@shake6321 Thanks :)
Typically the kernel has a different set of values for each channel, so pretty crazy.
@BrandonRohrer
Wow. So in layer 1 we could have 20 kernels with 1 channel each, which results in a layer 2 image with 20 channels. We then run 20 kernels with 20 channels each, which gives us 400 channels, but each 20-channel image gets summed back into 1 channel?
Is this something that would actually be covered in your course in more detail?
Thanks!
@shake6321 Yes, just like that. And yes, in Course 321 and Course 322 we cover it in all its glorious detail, from pictures to equations to Python code.
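The bookkeeping in this exchange can be checked with a small one-dimensional sketch. It is an illustration under assumptions, not the course code: `apply_layer` and the layout `kernels[o][i]` are hypothetical, and the kernel values are arbitrary. With 20 input channels and 20 kernels there are 400 per-channel kernel parts, and summing within each kernel leaves 20 output channels.

```python
# Illustrative sketch only: names, layout and values are assumptions.

def convolve_1d(signal, kernel):
    """Full 1D convolution of one channel with one kernel part."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, sample in enumerate(signal):
        for j, weight in enumerate(kernel):
            out[i + j] += sample * weight
    return out


def apply_layer(channels, kernels):
    """kernels[o][i] is the part linking input channel i to output channel o."""
    outputs = []
    for kernel in kernels:
        per_channel = [
            convolve_1d(channel, part) for channel, part in zip(channels, kernel)
        ]
        # Sum over input channels: each kernel yields exactly one output channel.
        outputs.append([sum(values) for values in zip(*per_channel)])
    return outputs


n_in, n_out, length = 20, 20, 8
channels = [[float(i + t) for t in range(length)] for i in range(n_in)]
# A different set of values for every (output, input) pair.
kernels = [
    [[float(o + 1), float(i), -1.0] for i in range(n_in)]
    for o in range(n_out)
]

outputs = apply_layer(channels, kernels)
n_parts = sum(len(kernel) for kernel in kernels)

print(n_parts)          # 400
print(len(outputs))     # 20
print(len(outputs[0]))  # 10
```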
Thanks for the knowledge, bruv.
My pleasure, Swaraj
Best video I've watched on convolution. Now how do we represent it in maths?
Very good explanation.
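One common way to write it down, as a sketch in standard notation rather than the notation of the video: the symbols $f$ (signal), $I$ (image), $k$ and $K$ (kernels) are chosen here for illustration, and sign and index conventions for cross-correlation vary between textbooks.

```latex
% 1D discrete convolution of a signal f with a kernel k
(f * k)[n] = \sum_{m} f[m] \, k[n - m]

% 1D cross-correlation (sliding dot product), for comparison
(f \star k)[n] = \sum_{m} f[n + m] \, k[m]

% 2D discrete convolution of an image I with a kernel K
(I * K)[x, y] = \sum_{i} \sum_{j} I[i, j] \, K[x - i, \, y - j]
```

The minus signs in $k[n - m]$ and $K[x - i, y - j]$ are the kernel reversal. Replacing $k[m]$ with $k[-m]$ in the cross-correlation formula turns it into the convolution formula.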
thanks :)
... I was writing my thesis* and just now I started writing a chapter about what convolutional filters I use and saw that this video went up.
What kind of sorcery is this?
*I only work with "classical" image analysis, no machine learning
I'm working on your Acknowledgements right now
The part where you use the kernel to create multiple copies of itself on the image, is that called cross-correlation, or is it just a special form of convolution? (Apologies for any misunderstanding of cross-correlation and convolution on my part.) I ask because I heard cross-correlation is regularly used for image matching.
No worries! It can be quite confusing, because convolution = cross correlation with a reversed kernel. When the kernel is making non-reversed copies of itself, that is vanilla, out-of-the-box convolution. In machine learning implementations it's popular to store the kernel pre-reversed and then just do the cross-correlation step.
@BrandonRohrer Thank you for your explanation.
Flipping the kernel hurts my brain... and my feelings?? It is very counterintuitive that when you flip, you get copies that have the SAME orientation as the original...
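A tiny two-dimensional experiment may make this easier to accept. It is a sketch with made-up names (`cross_correlate_2d`, `flip_2d`, `convolve_2d`, `center_3x3`) and assumes an odd-sized kernel with zero padding at the borders. The image is a single bright pixel. As the sliding window moves right, that pixel moves left inside the window, so an unflipped kernel gets read out back to front, and flipping it first cancels that.

```python
# Illustrative sketch only: names are assumptions; assumes an odd-sized kernel.

def cross_correlate_2d(image, kernel):
    """Slide the kernel over the zero-padded image; multiply and sum at each pixel."""
    image_h, image_w = len(image), len(image[0])
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    half_h, half_w = kernel_h // 2, kernel_w // 2
    out = [[0] * image_w for _ in range(image_h)]
    for r in range(image_h):
        for c in range(image_w):
            total = 0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    rr, cc = r + i - half_h, c + j - half_w
                    if 0 <= rr < image_h and 0 <= cc < image_w:
                        total += image[rr][cc] * kernel[i][j]
            out[r][c] = total
    return out


def flip_2d(kernel):
    """Reverse the kernel top-to-bottom and left-to-right."""
    return [row[::-1] for row in kernel[::-1]]


def convolve_2d(image, kernel):
    """Convolution = cross-correlation with the flipped kernel."""
    return cross_correlate_2d(image, flip_2d(kernel))


def center_3x3(result):
    """Crop the 3x3 patch around the middle of a 5x5 result."""
    return [row[1:4] for row in result[1:4]]


image = [[0] * 5 for _ in range(5)]
image[2][2] = 1  # a single bright pixel in the middle

kernel = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

print(center_3x3(convolve_2d(image, kernel)))
# [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  same orientation as the kernel
print(center_3x3(cross_correlate_2d(image, kernel)))
# [[9, 8, 7], [6, 5, 4], [3, 2, 1]]  mirrored copy
```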
Nice video
Please can you write a formula that performs convolution for me?