Hi Alfredo - thanks for your videos. Just a note on the step to zero gradients: I felt a simpler way to think about it is that PyTorch stores the results of past computations, and these past results need to be cleared before future training batches. I found it confusing when you argue that zeroing + L.backward() are conceptually linked, when I don't think they are.
There’s a reason why these previous gradients are stored; I have an entire section about it. To perform backpropagation in PyTorch one needs to execute two commands: zeroing + backward. Backward alone does two things: it computes and accumulates the gradient. So, if it is preceded by zeroing the previous grads, then it just computes the new grads. That’s why I’m insisting that ‘zeroing + backward’ amounts to a single statement, i.e. backpropagation.
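To make the accumulation concrete, here's a minimal sketch with a single made-up scalar parameter (the names w and loss are just for illustration):

import torch

# One scalar parameter, purely for illustration.
w = torch.tensor(1.0, requires_grad=True)

# First backward pass: computes dL/dw = 2 and stores it in w.grad.
loss = 2 * w
loss.backward()
print(w.grad)  # tensor(2.)

# Without zeroing, a second backward pass ADDS to the stored gradient.
loss = 2 * w
loss.backward()
print(w.grad)  # tensor(4.) -- accumulated, not replaced

# Zeroing first means backward() leaves exactly the new gradient.
w.grad.zero_()
loss = 2 * w
loss.backward()
print(w.grad)  # tensor(2.)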
Thank you for making this!
Anytime 😇😇😇
Hello, Alfredo. Thank you for the video! It's nice to spend a Saturday morning watching a lecture. One question: will your book be available for public sale?
It will be for sale in print and free in its digital version.