SIREN in PyTorch

  • Published 18 Dec 2024

COMMENTS • 17

  • @dhananjayraut
    @dhananjayraut 3 years ago +6

    great format of the video. I liked how you walk through the code. bonus points for vim.

  • @wolfisraging
    @wolfisraging 3 years ago +1

    Your channel is just pure gold!

    • @mildlyoverfitted
      @mildlyoverfitted 3 years ago

      I am really glad you find the content useful:) Thank you!

  • @theodoretsitsimis9973
    @theodoretsitsimis9973 3 years ago +3

    Thanks! I liked your style of walking through the code. Keep it up

  • @dhawals9176
    @dhawals9176 3 years ago +1

    Hey, I have a small doubt. In the paper, the layer is given as sin(omega * Wx + b) (shown at 4:44), but isn't the SineLayer's forward computing sin(omega * (Wx + b))? Let me know if I am misunderstanding.
    Shouldn't it be sin(omega * torch.mm(x, self.linear.weight.t()) + self.linear.bias)?

    • @mildlyoverfitted
      @mildlyoverfitted 3 years ago +1

      Great to see that you paid such close attention:) I actually did not spot this. I was following the authors' official notebook, which does it the same way as I did: github.com/vsitzmann/siren/blob/master/explore_siren.ipynb
      You are right that in the paper omega only multiplies Wx and not the bias. My take is that nothing really changes in terms of the learnable parameters or the architecture. But who knows:) I would more than encourage you to play around with different setups and see whether there are any major differences.
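The two formulations discussed above can be put side by side in a minimal sketch. This is not the code from the video or the authors' notebook: the `scale_bias` flag is a hypothetical switch added for illustration, `omega=30.0` is an assumed default, and the SIREN-specific weight initialization is left out.

```python
import torch
from torch import nn


class SineLayer(nn.Module):
    """Linear layer followed by a sine, in two possible formulations."""

    def __init__(self, in_features, out_features, omega=30.0, scale_bias=True):
        super().__init__()
        self.omega = omega  # assumed default frequency factor
        self.scale_bias = scale_bias  # hypothetical flag, for illustration only
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, x):
        if self.scale_bias:
            # Notebook-style: sin(omega * (W x + b))
            return torch.sin(self.omega * self.linear(x))
        # Paper-style: sin(omega * W x + b), bias added after the scaling
        wx = nn.functional.linear(x, self.linear.weight)  # W x without bias
        return torch.sin(self.omega * wx + self.linear.bias)
```

Both variants hold exactly the same parameters (one weight matrix and one bias vector), so switching the flag changes only how the bias enters the sine.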

    • @dhawals9176
      @dhawals9176 3 years ago +1

      @@mildlyoverfitted Kool.

    • @dhawals9176
      @dhawals9176 3 years ago +1

      There is no difference in the results between the two ways of training. I tried training with the paper's formulation of the sine activation. For some reason, mine is slower than your implementation. I used torch alone for the meshgrid, coordinate generation, and other operations rather than NumPy, and used kornia instead of scipy.ndimage for the Laplace and spatial_gradient calculations. Results look the same tho. Thanks for the video.
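A torch-only coordinate grid, as mentioned in the comment above, might look roughly like the following sketch. The function name `make_coords` is hypothetical, and normalizing the coordinates to [-1, 1] is an assumption; the video's own code may use a different range.

```python
import torch


def make_coords(side_length):
    """Return a (side_length**2, 2) tensor of (row, col) coordinates in [-1, 1]."""
    axis = torch.linspace(-1.0, 1.0, side_length)
    # indexing="ij" keeps the first output aligned with rows, the second with columns
    rows, cols = torch.meshgrid(axis, axis, indexing="ij")
    return torch.stack([rows, cols], dim=-1).reshape(-1, 2)


coords = make_coords(4)  # 16 points, one per pixel of a 4x4 image
```

Each row of `coords` is one pixel location, ordered row by row, so it can be fed to the network as a batch.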

    • @4onen
      @4onen 3 years ago +1

      @@dhawals9176 Mathematically, there's no difference between the two. The paper version trains a phase b in raw radians, while the notebook version trains b/omega (assuming both reach the same valley in the loss landscape). However, "multiply and accumulate" (a multiplication followed by an addition) is frequently a single operation in hardware, because adding an extra adder to a multiplier is really cheap. Doing the addition before the multiplication requires the addition to complete fully and its result to be saved back to the device registers before the multiplication can begin.
      That probably explains your speed difference. A MAC operation costs about the same as one multiply, while add-then-multiply takes one full pipeline op for the addition before the multiply can begin, and only then the sine.
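The equivalence claimed above can be written out in one line. Here b' denotes the notebook-style bias and b the paper-style bias; both symbols are labels chosen for this sketch.

```latex
\begin{align*}
\sin\bigl(\omega\,(W x + b')\bigr) &= \sin(\omega W x + \omega b') \\
                                   &= \sin(\omega W x + b) \quad \text{with } b = \omega b'
\end{align*}
```

So any function one form can represent, the other can too, with b' = b/omega. One caveat: the gradient with respect to b' is omega times the gradient with respect to b, so an optimizer may take differently scaled steps on the bias even though the function class is the same.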

  • @michaelcarlon1831
    @michaelcarlon1831 3 years ago +1

    This is a great paper and I really enjoyed your video. I like the way you explained things and stepped through the code. One question: What is the point of training on anything other than the intensity? Is it just to illustrate that the network can capture these higher frequency aspects?

    • @mildlyoverfitted
      @mildlyoverfitted 3 years ago +2

      Thank you very much! That is actually a great question:) In the image example, supervising on derivatives is more of a toy problem showing how powerful the SIREN is and that it has the capacity to capture signals with nonzero higher-order derivatives. That said, training on intensities would be more than enough in the case of images. However, as briefly mentioned in the video, one application of SIRENs is solving partial differential equations, where one does not have the 0th-order ground truth (~intensities) and can only supervise on derivatives. Unfortunately, I did not really talk about that in the video.
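Supervising on derivatives needs the derivatives of the network output with respect to its input coordinates. A minimal sketch with `torch.autograd.grad` follows; the helper names `gradient` and `laplacian` are hypothetical, and an analytic function stands in for the network output so the result is easy to check by hand.

```python
import torch


def gradient(y, x):
    """dy/dx for outputs y of shape (N, 1) and coordinates x of shape (N, D)."""
    return torch.autograd.grad(
        y, x, grad_outputs=torch.ones_like(y), create_graph=True
    )[0]


def laplacian(y, x):
    """Sum of the unmixed second derivatives of y w.r.t. each coordinate."""
    grad = gradient(y, x)
    lap = torch.zeros_like(y)
    for i in range(x.shape[-1]):
        g_i = grad[..., i]
        second = torch.autograd.grad(
            g_i, x, grad_outputs=torch.ones_like(g_i), create_graph=True
        )[0]
        lap = lap + second[..., i : i + 1]  # keep only d^2 y / d x_i^2
    return lap


coords = torch.rand(8, 2, requires_grad=True)
y = (coords ** 2).sum(dim=-1, keepdim=True)  # stand-in for model(coords)
# Loss on the gradient only; the analytic gradient of the stand-in is 2 * coords
grad_loss = ((gradient(y, coords) - 2 * coords.detach()) ** 2).mean()
```

Because `create_graph=True` keeps the derivative itself differentiable, a loss like `grad_loss` can be backpropagated to the network parameters when `y` comes from a real model.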

    • @michaelcarlon1831
      @michaelcarlon1831 3 years ago +1

      @@mildlyoverfitted Nice, thanks for the explanation!

  • @nuhaaldausari7019
    @nuhaaldausari7019 3 years ago +3

    Can SIREN be used to encode audio or frames in applications such as speech-to-video frameworks?

    • @mildlyoverfitted
      @mildlyoverfitted 3 years ago +1

      Check out this video that mentions all the applications: ua-cam.com/video/Q2fLWGBeaiI/v-deo.html

  • @teetanrobotics5363
    @teetanrobotics5363 3 years ago +2

    Bro, amazing content. But you are going way too fast. Please slow down and explain the concepts and what each line of code is doing. Most of the audience aren't that experienced in AI programming.

    • @mildlyoverfitted
      @mildlyoverfitted 3 years ago +1

      Point taken. I will try to create more beginner-friendly content in the future. I am still kind of unsure what my audience wants to see, so I definitely appreciate your comment!