Weighted sums, discrete convolutions and fast transforms (FFT, WHT) are all dot products. Max pooling and ReLU are switching: i.e. for ReLU, f(x)=x is connect, f(x)=0 is disconnect. You have a lot of freedom to mix those elements. For example, you can use fast transforms as fixed systems of dot products together with parametric activation functions (fi(x) = ai*x for x >= 0, fi(x) = 0 otherwise, i = 0 to m) to create Fast Transform neural networks. To stop the first transform from simply taking the spectrum of the input, you can apply a fixed random pattern of sign flips to the input data. Such a net is then: sign flips, transform, activation functions, transform, activation functions, ..., transform. The Walsh-Hadamard transform works well for this.
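
Here is a minimal sketch of that layout in Python/NumPy, just to make the structure concrete. The names (wht, FastTransformNet), the per-element slopes ai being random rather than trained, and the orthonormal scaling are my own assumptions for illustration; it is a forward pass only, not a reference implementation.

    # Sketch only: random untrained slope parameters, forward pass only.
    import numpy as np

    def wht(x):
        # Iterative fast Walsh-Hadamard transform; length must be a power of two.
        x = np.array(x, dtype=float)
        n = x.shape[-1]
        h = 1
        while h < n:
            for i in range(0, n, 2 * h):
                for j in range(i, i + h):
                    a = x[..., j] + x[..., j + h]
                    b = x[..., j] - x[..., j + h]
                    x[..., j] = a
                    x[..., j + h] = b
            h *= 2
        return x / np.sqrt(n)  # orthonormal scaling keeps magnitudes stable across layers

    class FastTransformNet:
        def __init__(self, n, layers, seed=0):
            assert n & (n - 1) == 0, "input length must be a power of two"
            rng = np.random.default_rng(seed)
            # Fixed random sign flips applied once to the input.
            self.signs = rng.choice([-1.0, 1.0], size=n)
            # Parametric slopes ai, one per element per layer (would normally be learned).
            self.slopes = rng.normal(size=(layers, n))

        def forward(self, x):
            x = x * self.signs      # sign flips stop the first transform taking the raw spectrum
            x = wht(x)              # first fixed transform
            for a in self.slopes:
                x = np.where(x >= 0, a * x, 0.0)  # switched activation: fi(x) = ai*x if x >= 0, else 0
                x = wht(x)          # next fixed transform
            return x

    net = FastTransformNet(n=16, layers=3)
    print(net.forward(np.arange(16, dtype=float)))

The only trainable parameters here are the slopes ai; all the "weights" live in the fixed transforms, which is what makes the net cheap (n log n per layer instead of n squared).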