Great format for the video. I liked how you walk through the code. Bonus points for vim.
Your channel is just pure gold!
I am really glad you find the content useful:) Thank you!
Thanks! I liked your style of walking through the code. Keep it up
Thank you for the comment!
Hey, I have a small doubt: in the paper, sin(omega * Wx + b) is given at 4:44, but isn't the SineLayer forward doing sin(omega * (Wx + b))? Let me know if I am understanding it wrong.
Shouldn't that be sin(omega * torch.mm(x, self.linear.weight.t()) + self.linear.bias)?
Great to see that you paid a lot of attention:) I actually did not spot this. I was following the official notebook of the authors, and they do it the same way as I did: github.com/vsitzmann/siren/blob/master/explore_siren.ipynb
You are right that in the paper the omega only multiplies Wx and not the bias. That said, my take would be that in terms of the learnable parameters and the architecture nothing really changes. But who knows:) I would more than encourage you to play around with different setups and see whether there are any major differences.
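To make the two variants concrete, here is a minimal sketch of a SineLayer with both forward passes side by side (the paper_style flag and the constructor signature are illustrative, not taken from the official notebook):

```python
import torch
import torch.nn as nn


class SineLayer(nn.Module):
    """Sketch of the two formulations; the paper_style flag is only illustrative."""

    def __init__(self, in_features, out_features, omega_0=30.0, paper_style=False):
        super().__init__()
        self.omega_0 = omega_0
        self.paper_style = paper_style
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, x):
        if self.paper_style:
            # Paper: sin(omega * Wx + b) -- omega scales only the weight term.
            return torch.sin(
                self.omega_0 * torch.mm(x, self.linear.weight.t()) + self.linear.bias
            )
        # Notebook / video: sin(omega * (Wx + b)) -- omega also scales the bias.
        return torch.sin(self.omega_0 * self.linear(x))
```

Since the only difference is whether the learned bias ends up representing b or b/omega, both variants have exactly the same parameter count and architecture.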
@@mildlyoverfitted Kool.
There is no difference in the results of the two ways of training. I tried training based on the paper's formulation of the sine activation. For some reason, mine is slower than your implementation. I used torch alone for the meshgrid, coordinate generation, and other operations rather than NumPy, and used kornia instead of scipy.ndimage for the Laplacian and spatial_gradient calculation. The results look the same though. Thanks for the video.
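In case it helps anyone reproduce that kind of setup, here is a rough sketch of what a torch-plus-kornia data preparation could look like (the shapes and the dummy image are made up purely for illustration):

```python
import torch
import kornia

# Torch-only coordinate grid in [-1, 1] (no NumPy); side is an arbitrary example size.
side = 256
lin = torch.linspace(-1.0, 1.0, side)
rows, cols = torch.meshgrid(lin, lin, indexing="ij")
coords = torch.stack([rows, cols], dim=-1).reshape(-1, 2)  # (side * side, 2)

# Ground-truth derivatives via kornia instead of scipy.ndimage; img is a dummy
# (1, 1, H, W) grayscale tensor standing in for the real image.
img = torch.rand(1, 1, side, side)
grad = kornia.filters.spatial_gradient(img)         # (1, 1, 2, H, W), both spatial derivatives
lap = kornia.filters.laplacian(img, kernel_size=3)  # (1, 1, H, W)
```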
@@dhawals9176 Mathematically, there's no difference between the two. With the paper version it'll learn some phase b in raw radians; with the notebook version it'll learn b/omega (assuming both train to the same valley in the loss landscape). However, "multiply and accumulate" (a multiplication followed by an addition) is frequently a single operation in hardware (because adding an extra adder to a multiplier is really cheap), while doing the addition before the multiply requires the addition to complete fully and its result to be written back to registers before the multiplication can begin.
That probably explains your speed difference. A MAC costs about as much as a single multiply, while add-then-multiply needs one full pipeline op for the addition before the multiply can start, and only then the sine.
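If you want to check whether the op ordering alone explains the slowdown, one way is to time just the two expressions with torch.utils.benchmark (the sizes below are arbitrary, and this ignores everything else in the training loop):

```python
import torch
import torch.utils.benchmark as benchmark

# Arbitrary sizes just to compare the two expressions in isolation.
x = torch.randn(65536, 256)
w = torch.randn(256, 256)
b = torch.randn(256)
omega = 30.0
env = {"torch": torch, "x": x, "w": w, "b": b, "omega": omega}

t_paper = benchmark.Timer(stmt="torch.sin(omega * x.mm(w) + b)", globals=env)       # sin(omega*Wx + b)
t_notebook = benchmark.Timer(stmt="torch.sin(omega * (x.mm(w) + b))", globals=env)  # sin(omega*(Wx + b))

print(t_paper.blocked_autorange())
print(t_notebook.blocked_autorange())
```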
This is a great paper and I really enjoyed your video. I like the way you explained things and stepped through the code. One question: What is the point of training on anything other than the intensity? Is it just to illustrate that the network can capture these higher frequency aspects?
Thank you very much! That is actually a great question:) In the image example, this supervision on derivatives is more of a toy problem showing how powerful the SIREN is and that it has the capacity to capture signals that have nonzero higher order derivatives. With that being said, training on intensities would be more than enough in the case of images. However, as briefly mentioned in the video, one of the applications of SIRENs is solving partial differential equations and there one does not have the 0th order ground truths (~intensities) and can only supervise on derivatives. Unfortunately, I did not really talk about them in the video.
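For the curious, here is a minimal sketch of what first-order derivative supervision looks like in code (the tanh MLP, the coordinates, and the targets are toy placeholders, not the actual SIREN setup from the video):

```python
import torch
import torch.nn as nn

# Toy stand-ins for illustration: a small tanh MLP instead of a real SIREN,
# and random coordinates / gradient targets instead of a real image.
model = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 1))
coords = torch.rand(1024, 2, requires_grad=True)   # (N, 2) input coordinates
target_grad = torch.rand(1024, 2)                  # (N, 2) ground-truth spatial gradient

output = model(coords)                             # (N, 1) predicted "intensities"

# Differentiate the prediction with respect to the input coordinates.
grad = torch.autograd.grad(
    outputs=output,
    inputs=coords,
    grad_outputs=torch.ones_like(output),
    create_graph=True,  # keep the graph so the loss below can be backpropagated
)[0]

# Supervise only on the first-order derivative -- no intensity (0th-order) term at all.
loss = ((grad - target_grad) ** 2).mean()
loss.backward()
```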
@@mildlyoverfitted Nice thanks for the explanation!
Can SIREN be used to encode audio or frames in applications such as speech-to-video frameworks?
Check out this video that mentions all the applications: ua-cam.com/video/Q2fLWGBeaiI/v-deo.html
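For audio specifically, the setup is the same as for images, just with a 1-D coordinate; here is a rough sketch of how the data could be arranged (the waveform below is random, purely for illustration):

```python
import torch

# Made-up mono waveform; in practice this would be loaded from an audio file.
sample_rate = 16000
waveform = torch.randn(sample_rate * 2)   # 2 seconds of audio, shape (T,)

# Time coordinates in [-1, 1], one per sample, play the role of pixel coordinates.
t = torch.linspace(-1.0, 1.0, waveform.shape[0]).unsqueeze(-1)  # (T, 1) inputs
amplitude = waveform.unsqueeze(-1)                              # (T, 1) regression targets

# A SIREN would then be trained to map t -> amplitude, exactly like the image
# example maps (x, y) coordinates -> intensity.
```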
Bro, amazing content. But you are going way too fast. Please slow down and explain the concepts and what each line of code is doing. Most of the audience isn't that experienced in AI programming.
Point taken. I will try to create more beginner-friendly content in the future. I am still kind of unsure what my audience wants to see, so I definitely appreciate your comment!