Your delivery was brilliant! You gave the right amount of detail. I’m really a fan now. Can’t wait for the next one, and I’m going to watch your other playlists. Thank you.
Truly a hidden gem. I've missed out on a lot by not knowing about this channel, and of course Luis is the best math teacher I've ever had.
The best explanation of the topic on UA-cam. Everything you need to know, explained in less than 15 minutes! Thank you very much
Great delivery! I wish every math teacher was like Luis.
The second part is out, on the Kolmogorov-Arnold Theorem!
ua-cam.com/video/nS2hnm0JRBk/v-deo.htmlsi=ym6OsCVKFgiHhtne
I can't thank you enough, Luis. You make all this stuff look very simple.
and this lesson is free? What a time to be alive!
Thank you, my payment is kind comments like yours. :)
@@SerranoAcademy I'm actually looking for other architectures right now since my models can't get past the 88% AUC ROC ceiling. Hopefully, I can use this to get to that sweet 95-ish %. Thank you again. Please be kind with the maths in your next video. lol
Thank you so much for your efforts to put out such informative videos.
Very well done video, as usual! Great and interesting work!
@@skydiver151 thank you! I’m glad you liked it!
12/5. Checked the channel for #2. Eagerly waiting. Great Video.
excellent explanation. thank you so much
At first glance, a KAN requires more trainable parameters than an MLP, but the paper claims that KANs can compete with, if not outperform, MLPs using a smaller network, and therefore fewer layers. I can't wait to watch the next video; I would like to understand how the initial splines are chosen. For instance, if we go with B-splines, which ones do we take, and how many? Are there other parameters to learn in addition to the knots?
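A hedged sketch of the parameterization as I understand it (the paper's exact setup may differ): each edge carries a spline written as a linear combination of a fixed B-spline basis,

```latex
% Assumed edge parameterization: degree-p B-splines B_{i,p} on a grid of G intervals
\phi(x) = \sum_{i=1}^{G+p} c_i \, B_{i,p}(x)
```

so the quantities learned by gradient descent are the coefficients c_i (one set per edge), while the degree p and the number of grid intervals G are hyperparameters, and the knots are usually placed on a fixed or adaptively refined grid rather than trained directly.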
Thanks!
@khaledal-utaibi2049 thank you so much for your very kind contribution! I really appreciate it ☺️
Thank you very much for the explanations. As always, the best ones on YouTube.
Thank you very much, I'm glad you like them! :)
This was a really good explanation of KANs. 🥳
Clearly explained and illustrated. Thank you.
Thank you! I'm glad you liked it!
This is fantastic! Thank you
Amazing video by amazing teacher
Thank you! :) The next one is coming up soon, and I'm having a lot of fun making it. :)
Sir: Around 11:28 in your video you show quadratic B-splines (my question applies to any spline approximation): three splines that approximate the function of interest. I was unclear on how they will be used. They will not be used as weights for a linear dot product, right? The three splines connecting to x1 will be used to determine what each one outputs, right? If x1's value is 0.3, then the middle one will output 0.3 and the other two will output 0. Am I right? I am confused about how you can use them as weights in the regular sense.
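A minimal sketch of how the basis splines on one edge are typically combined (toy grid and made-up coefficients, not the video's numbers): each basis function is evaluated at the input, and the edge output is a learned linear combination of those values, so at x1 = 0.3 several basis functions are nonzero at once rather than only the middle one.

```python
# Minimal sketch: one KAN edge evaluated as a learned combination of quadratic
# B-spline basis functions, using the Cox-de Boor recursion.

def bspline_basis(i, p, t, x):
    """Value at x of the i-th B-spline basis function of degree p with knots t."""
    if p == 0:
        return 1.0 if t[i] <= x < t[i + 1] else 0.0
    left = right = 0.0
    if t[i + p] != t[i]:
        left = (x - t[i]) / (t[i + p] - t[i]) * bspline_basis(i, p - 1, t, x)
    if t[i + p + 1] != t[i + 1]:
        right = (t[i + p + 1] - x) / (t[i + p + 1] - t[i + 1]) * bspline_basis(i + 1, p - 1, t, x)
    return left + right

degree = 2
knots = [0.0, 0.0, 0.0, 1/3, 2/3, 1.0, 1.0, 1.0]   # clamped knots, 3 bins on [0, 1]
n_basis = len(knots) - degree - 1                   # 5 basis functions here
coeffs = [0.1, -0.4, 0.8, 0.3, -0.2]                # trainable coefficients (made up)
assert len(coeffs) == n_basis

def edge_function(x):
    # The basis values act as features of x; the coefficients play the role of weights.
    return sum(c * bspline_basis(i, degree, knots, x) for i, c in enumerate(coeffs))

# At x1 = 0.3, three of the five basis functions are nonzero, so the edge output
# blends several of them rather than reading off a single spline.
print(edge_function(0.3))
```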
Mathematical beauty is enough to motivate me to watch the rest of this series. But the practical question is whether these networks can perform as well as neural networks on benchmarks, given equal “compute.” That is probably more an empirical question than a mathematical one.
Thank you! I fully agree. I really liked the mathematical beauty, so that is what caught my interest. From what I understand, they perform well compared to regular NNs. But it could go either way; they could become huge, or not. However, my hope is that either way, they'll inspire new architectures coming from the theory of representation of functions, as this is a beautiful field that has remained (until now) unexplored in ML.
Eagerly waiting for the second part!!!❤
Great timing! The second part just came out! :) ua-cam.com/video/nS2hnm0JRBk/v-deo.html
@@SerranoAcademy thank you sir, you're awesome!!!🎉
Thanks prophet.
Why does this video have only 813 views after 4 hours? Subscribe instantly :D
Thank you so much Luis, can't wait for the next chapter 😍
Thank you! :) Yes, super excited for that one, it's coming up soon!
@@SerranoAcademy 😍😍
I think they reinvented the wheel with this one. Existing NNs are already KANs. What they think is new is a misunderstanding of these concepts.
Great video! Very well explained, Peace!
@@Pedritox0953 thank you so much, I’m glad you liked it! Peace! 😊
Seems to me that, instead of training weights that feed into fixed activation functions, KANs are training weights (knot vectors) that define splines. Interested to learn more about the tradeoffs between the two.
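One hedged way to frame the tradeoff (my own summary, not the video's): an MLP edge learns a scalar weight feeding into a fixed nonlinearity, while a KAN edge learns the shape of the nonlinearity itself.

```latex
% MLP neuron: learnable scalar weights w_j and bias b, fixed nonlinearity \sigma
y = \sigma\Big(\sum_j w_j x_j + b\Big)

% KAN neuron: no separate weight matrix; each edge function \phi_j is a learnable spline
y = \sum_j \phi_j(x_j)
```

So the tradeoff is roughly one scalar parameter per edge versus a handful of spline coefficients per edge, in exchange for edge functions that can bend wherever the data needs them to.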
thank you sir
Typo in the KAN depiction: output should be f1(x1) + f2(x2).
Oh thanks! Yeah you're right, the w's should be x's.
Amazing video! Thanks :)
Thank you, I'm glad you liked it!
You are the best!!!
Thank you so much! :)
why do we require 4 basis functions to approximate any linear/quadratic function with 3 bins?
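For what it's worth, a hedged way to get the count (assuming a clamped knot vector, which may not match the video's exact construction):

```latex
% Clamped knot vector: the two end knots are repeated p+1 times, so
\#\{\text{basis functions}\} = \underbrace{(G + 2p + 1)}_{\text{number of knots}} - \; p - 1 = G + p
```

With G = 3 bins, a degree-1 (linear) spline therefore gives 3 + 1 = 4 basis functions, and a degree-2 (quadratic) spline gives 3 + 2 = 5.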
great video😊
Thank you! :)
Ok, this is great. However, doesn't it also demonstrate that KANs and MLPs are equivalent? The spline sections are equivalent to the activation levels, and the choice of B-splines is equivalent to the choice of functions. So aren't the two theories, and the entire architectures, potentially equivalent? Is this just a choice of how to get the same function-approximation system into memory?
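Both families are universal approximators, so in the limit they can represent the same functions; the practical difference is in how the capacity is laid out per edge, which does affect parameter count and (arguably) interpretability. A rough back-of-the-envelope sketch with made-up layer sizes:

```python
# Rough parameter-count comparison for one layer (hypothetical sizes;
# ignores biases, base activations, and any grid parameters).

def mlp_layer_params(n_in: int, n_out: int) -> int:
    # One scalar weight per edge.
    return n_in * n_out

def kan_layer_params(n_in: int, n_out: int, grid_size: int, degree: int) -> int:
    # One spline per edge, each with (grid_size + degree) B-spline coefficients
    # (assuming a clamped knot vector; implementations vary).
    return n_in * n_out * (grid_size + degree)

print(mlp_layer_params(64, 64))                          # 4096
print(kan_layer_params(64, 64, grid_size=5, degree=3))   # 32768
```

Whether that extra per-edge capacity buys better accuracy per parameter seems to be exactly the empirical question.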
❤️🧠
😊💪
🎉
@@carolinalasso ❤️
It is the same as CNN + dense layers, so what is its advantage? Would you please give an example of its advantage?