You explained the adjoint method PERFECTLY.
Hell YEAH! I was fighting with the adjoint method for like 2 weeks and nobody really explained it in detail like this. Thanks a lot, keep going!
Or maybe it just took you 2 weeks to get it and you just happen to be watching this video when it "clicked"?
Best video/blog so far on neural ODEs
Thank you very much, Sir. This is by far the easiest explanation of neural ODEs.
Definitely the best Neural ODE explanation! Thank you sir!
This is very helpful; I appreciate it as it provides a comprehensive review with detailed explanations
I agree with all the previous comments, this was a terrific explanation. I particularly appreciated that you included details of the proof of the adjoint method.
Could you explain the difference between lowercase f and theta? I'm a bit confused as to how they are different.
Thanks for explaining the proof, couldn't find it anywhere else
10:05 I understand that the f is exactly the z() function at the time t, am I right?
Excellent exposition of the paper! Thank you.
This is a truly great explanation!
Awesome explanation!!
Thanks for the video and detailed derivation. There is a question/comment which really puzzles me. First, if L is a number which measures the deviation (e.g. the absolute difference) of the estimate z(t) from the real value, then the mapping z(t) -> L is a functional by definition. We would have, instead, \delta L/\delta z(t) = a(t), naturally defined as a functional derivative. However, as I tried to follow the arguments used in this video (29:28), I realized that it changes a lot (as dz(t+\Delta t)/dz(t) does have its counterpart in the context of functionals). So I was forced to conclude that the notion of a functional is irrelevant here: one defines a number L as a function of another number z(t), which is a function evaluated at a given time but must not be understood as a function of time. In that context, the derivation makes sense. PS: please do not use "partial" in the numerator and "d" in the denominator, as I don't believe this is standard notation.
On second thought, a(t) is indeed reminiscent of the functional derivative defined as a(t)=\delta L/\delta z(t). It is very inviting to state that for z(t) satisfying \dot{z}=f(z,t), one has \dot{a}=-a \partial f/\partial z. Except that, in its variation, the function z(t) is not fixed at both endpoints as the definition of the functional differential requires; therefore, if one follows that path (which mostly should work), one must introduce some proper modifications.
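For anyone who wants to check the relation \dot{a}=-a \partial f/\partial z numerically, here is a minimal sketch. Everything in it is an assumption chosen for illustration and is not taken from the video: scalar dynamics f(z,t)=THETA*z, a loss L=0.5*z(t1)^2 that depends only on the final state, and a fixed-step Euler solver. It integrates the adjoint backward from a(t1)=dL/dz(t1) and compares a(t0) with a finite-difference estimate of dL/dz(t0).

```python
import numpy as np

THETA = 0.7  # hypothetical parameter of the toy dynamics


def f(z, t):
    # Toy dynamics assumed for illustration: dz/dt = THETA * z
    return THETA * z


def df_dz(z, t):
    # Partial derivative of f with respect to z for the toy dynamics
    return THETA


def forward(z0, t0=0.0, t1=1.0, n=2000):
    # Fixed-step Euler integration; returns the whole trajectory z(t0..t1)
    h = (t1 - t0) / n
    zs = [z0]
    for k in range(n):
        zs.append(zs[-1] + h * f(zs[-1], t0 + k * h))
    return np.array(zs)


def loss(z_end):
    # Assumed loss: depends on the final state only
    return 0.5 * z_end ** 2


def adjoint_backward(zs, t0=0.0, t1=1.0):
    # Start from a(t1) = dL/dz(t1) = z(t1), then step da/dt = -a * df/dz
    # backward in time: a(t - h) = a(t) + h * a(t) * df/dz
    n = len(zs) - 1
    h = (t1 - t0) / n
    a = zs[-1]
    for k in range(n, 0, -1):
        a = a + h * a * df_dz(zs[k], t0 + k * h)
    return a


z0 = 1.3
a_t0 = adjoint_backward(forward(z0))

# Central finite difference of the loss with respect to the initial state
eps = 1e-5
fd = (loss(forward(z0 + eps)[-1]) - loss(forward(z0 - eps)[-1])) / (2 * eps)
print(a_t0, fd)
```

The two printed numbers should agree closely, which supports reading a(t) as an ordinary derivative of the number L with respect to the state at time t.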
great explanation! :)
Brilliant! You're really good at explaining I must say. Excellent job! May I please ask what you used for presentation and drawing equations Andriy?
Thanks! I think it was GoodNotes with iPad screen recording and Apple Pencil. (I just cut the surrounding window borders in the final recording.)
28:44, I think the backward equation of the adjoint method might be wrong and the integral term should be negative.
Yes, you are correct.
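For reference, a short derivation of the sign. It assumes the convention that the adjoint satisfies \dot{a} = -a \partial f/\partial z and is integrated backward from t_1 to t_0; the exact form shown on screen at 28:44 is not reproduced here.

```latex
\frac{da(t)}{dt} = -a(t)^{\top}\,\frac{\partial f(z(t), t, \theta)}{\partial z}
\quad\Longrightarrow\quad
a(t_0) = a(t_1) + \int_{t_1}^{t_0} \frac{da(t)}{dt}\,dt
       = a(t_1) - \int_{t_1}^{t_0} a(t)^{\top}\,\frac{\partial f(z(t), t, \theta)}{\partial z}\,dt
```

Under that convention the integral term carries a minus sign, and the parameter gradient has the same sign: dL/d\theta = -\int_{t_1}^{t_0} a(t)^{\top} \partial f/\partial\theta \, dt.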
Good job, very clear explanation!
However, it's sad that you didn't introduce an implementation of the function f. How can we design and implement such a continuous function?
Oh, that function doesn't exist; it's just for explanation purposes. This is basically what the ODE solver does.
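For readers wondering what f could look like in code: in neural ODEs, f is typically a small neural network with weights theta, and the continuous trajectory z(t) is never written down explicitly, since an ODE solver produces it step by step. Below is a minimal sketch under stated assumptions: the names `init_theta`, `f`, and `odeint_euler` are hypothetical, the network is a one-hidden-layer tanh MLP that takes the state and the time as input, and the solver is plain fixed-step Euler rather than the adaptive solvers usually used in practice.

```python
import numpy as np


def init_theta(dim, hidden, seed=0):
    # theta: the learnable parameters of f (hypothetical layout)
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=0.1, size=(hidden, dim + 1)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(scale=0.1, size=(dim, hidden)),
        "b2": np.zeros(dim),
    }


def f(z, t, theta):
    # f: the dynamics dz/dt, here a small MLP evaluated at state z and time t
    x = np.concatenate([z, [t]])
    h = np.tanh(theta["W1"] @ x + theta["b1"])
    return theta["W2"] @ h + theta["b2"]


def odeint_euler(f, z0, t0, t1, theta, steps=100):
    # Fixed-step Euler solver: repeatedly applies z <- z + h * f(z, t, theta)
    z = np.asarray(z0, dtype=float)
    h = (t1 - t0) / steps
    for k in range(steps):
        z = z + h * f(z, t0 + k * h, theta)
    return z


theta = init_theta(dim=2, hidden=8)
z1 = odeint_euler(f, [1.0, 0.0], 0.0, 1.0, theta)
print(z1)
```

In this sketch theta is just the collection of weights, while f is the function that uses those weights to compute dz/dt, which is one way to see the difference between the two symbols.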
illegible hand written scrawl... just like my undergrad days
Thanks for the superb explanation!