I still don't get why it's supposed to help translational invariance. As you say, the convolution is already capable of that. If I move the image contents 50 pixels to the right, the features should also move 50 pixels, given stride 1 and padding. Exactly the same is true for traditional Sobel edge detectors: the edges don't change whether I translate the image before convolving it with the edge-detection filters or not.
No, the convolution helps with translational equivariance, but not with translational invariance. The idea of adding pooling is that you are forcing the exact position of nearby pixel/feature information to be "lost". This means the network is forced to learn more generalizable components of the picture you are showing it.
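Not from the video, but here is a minimal numpy sketch of the equivariance-vs-invariance distinction. The filter, shift size, and pooling window are just illustrative assumptions, not anything the original explanation specifies.

import numpy as np

def conv1d(x, k):
    # stride-1 cross-correlation, zero-padded on the left so output length == input length
    pad = len(k) - 1
    xp = np.concatenate([np.zeros(pad), x])
    return np.array([np.dot(xp[i:i + len(k)], k) for i in range(len(x))])

def max_pool1d(x, size):
    # non-overlapping max pooling with window `size` (assumes len(x) % size == 0)
    return x.reshape(-1, size).max(axis=1)

# toy 1-D "image" with one bright blob, plus the same image shifted right by 1 pixel
x = np.array([0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0], dtype=float)
x_shifted = np.roll(x, 1)

k = np.array([-1.0, 1.0])  # simple difference filter, a 1-D stand-in for a Sobel edge detector

f = conv1d(x, k)
f_shifted = conv1d(x_shifted, k)

# equivariance: shifting the input shifts the feature map by the same amount ...
print(np.array_equal(np.roll(f, 1), f_shifted))   # True

# ... but the feature maps themselves differ, so convolution alone is NOT invariant
print(np.array_equal(f, f_shifted))               # False

# max pooling throws away the exact position inside each window,
# so the 1-pixel shift disappears after pooling
print(max_pool1d(np.abs(f), size=4))              # [1. 1. 0.]
print(max_pool1d(np.abs(f_shifted), size=4))      # [1. 1. 0.]

The pooled outputs only match because the shifted edges land in the same pooling windows; a much larger shift would still change the result, which is why a single conv + pooling stage only gives approximate, local invariance, and it takes several stacked layers to tolerate something like a 50-pixel shift.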
Finally someone explaining pooling. Congrats sir, you're the gigachad of the year.
Glad it was useful 😅