I've learned this kernel thing in college classes, in Andrew Ng's ML courses, and many other times, but this is literally the best explanation so far. It really blew my mind once I grasped how the Gaussian kernel can help me work in infinite dimensions!
I am a professor of engineering, and I have to say that your chain of thought and the back-drive are just amazing. Also the simplicity of the explanation and the energy in this video. Keep it up.
Our professor just told us that it is an RBF kernel and I was not convinced, but your video helped me believe it. This is amazing, thanks a lot, sir.
Man, you are just amazing! Every time I come across something I don't quite get in machine learning theory, there you are! Thanks a million!
Hey Ritwik, I've been trying to intuitively understand the link between kernels and infinite dimensions for ages now and had not remotely come close to doing it, but your one video has melted the fuzziness away in a trice. Thank you so much!
Thank you, thank you, I just finished my test.
Your Markov videos were amazing and helped me a lot. Thank you again!
Thank you. You warped my fragile little mind. Fresh air. I love the RBF. Well presented. Nice zest too
Thank you for such an excellent explanation! This helps me understand ML models better!
Thanks!
Best explanation. Thank you so much!
Glad it was helpful!
I just got emotional. what a video
Amazing concept with an amazing explanation!! Hats off to you!!
Glad you liked it!
Amazing explanation
Sir you deserve a million subscribers. Hope you get soon what you deserve 😊
Thank you, I have learned a lot about kernel functions.
This is art, really nice explanation.
Love your videos! I have just gone through the SVM and kernel videos. However, I feel a little like I'm on a cliffhanger. That is, I now understand SVM, and I see where you're going with the kernel, but it seems there needs to be a follow-on video to finally link the kernel back explicitly to SVM and show how the kernel is then explicitly used to do the classification. Specifically, what is missing in this video (or more accurately, needed in a follow-on video) is the linkage back to the alphas of the Lagrangian, or to w and b, because in the end that is what defines the discrimination line. That last piece is tantalizingly missing (i.e., hint for the next video ;-). Thanks!
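In case it helps anyone else stuck on that cliffhanger, here is my own summary of the missing link (not from the video, just the standard dual form of the SVM): once the alphas are found, the kernel enters the decision rule directly, so you never need to form w in the high-dimensional feature space:

f(x) = \mathrm{sign}\Big( \sum_i \alpha_i \, y_i \, K(x_i, x) + b \Big)

Only the support vectors have \alpha_i > 0, and b can be recovered from any one of them, so classification only ever needs kernel evaluations against the support vectors.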
Beautiful
Thank you! Cheers!
thank you!!! so good.
Amazing explanation. Keep up the good work!
brilliant!
What has my life become. I genuinely anticipate the release of new math videos smh. Thanks for the great videos though :)
This was fun!
There are so many different ways to explain things. In this case you've based the explanation on the "property of a kernel", which seems so stodgy. Are there maybe other "street math" explanations of why this kernel is so great? For example, e^x is its own derivative. Why does it use the 2-norm? Would a 4-norm be OK too? Why the -1/2? It turns out there are lots of variations of the RBF that work just as well; this canonical edition is often the most efficient.
I think it would be fun to see the RBF in action, applied to a thorny classification problem: why do the operators of the RBF work so well, and what makes the wrong variations work poorly?
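If anyone wants to see it in action, here is a minimal sketch of an RBF-kernel SVM on a non-linearly-separable toy problem, assuming scikit-learn is available; the dataset and the gamma grid are just illustrative choices, not recommendations:

```python
# Minimal sketch: RBF-kernel SVM on a non-linearly-separable toy problem.
# scikit-learn is assumed; the gamma grid is illustrative only.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=500, noise=0.25, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for gamma in [0.1, 1.0, 10.0, 100.0]:
    clf = SVC(kernel="rbf", gamma=gamma, C=1.0).fit(X_tr, y_tr)
    print(f"gamma={gamma:6.1f}  "
          f"train acc={clf.score(X_tr, y_tr):.3f}  "
          f"test acc={clf.score(X_te, y_te):.3f}")
```

A very large gamma makes each Gaussian bump extremely narrow, so the train accuracy approaches 1 while the test accuracy drops, which also speaks to the overfitting question below.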
So cool to understand this infinite-dimension idea. How do you avoid overfitting with such a powerful model?
Good question! I have an SVM kernels coding video coming soon that will answer that.
@ritvikmath Hello sir, can you create a video on what role the hyperparameters play in SVM?
This is probably because we can differentiate/integrate e^x infinitely many times and it always results in the same function, e^x.
Just great! Would you do a few examples (preferably in Python) and make the code available?
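To make that a bit more precise (my own gloss, not from the video): it is the infinite Taylor series of the exponential that supplies the infinitely many terms,

\exp(x_i^\top x_j) \;=\; \sum_{n=0}^{\infty} \frac{(x_i^\top x_j)^n}{n!}

so the implicit feature map stacks suitably scaled monomials of every degree n = 0, 1, 2, ..., which is where the infinite-dimensional feature space comes from.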
4:50 I think the reason why you are able to treat that as a constant (even the terms involving xi) is that xi is normalized, so xi.T @ xi = 1.
Great point! Maybe he forgot to mention this in the video. I think without this condition the definition of the high-dimensional feature vector is not consistent.
Sorry, I was wrong. The term exp(xi^T * xi) is just a scalar, and it's part of the function that defines the high-dimensional feature vector.
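For anyone following along, the decomposition being discussed is, as I understand it (using the video's 1/2 factor),

\exp\big(-\tfrac{1}{2}\lVert x_i - x_j\rVert^2\big) \;=\; \exp\big(-\tfrac{1}{2} x_i^\top x_i\big)\,\exp\big(-\tfrac{1}{2} x_j^\top x_j\big)\,\exp\big(x_i^\top x_j\big)

so exp(-x_i^T x_i / 2) is just a scalar that depends only on x_i and can be folded into the feature map \phi(x_i); no normalization of x_i is required.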
I don't understand how my teacher turns 8 minutes of content into a 1-hour confusing and boring class.
Absolutely brilliant! Could you maybe elaborate on the Gaussian Radial Basis Function? How do the variance and mean fit into the context?
Your comment is two years old, but here's how I tried to make some intuitive sense of it. In regression, Gaussian processes are used as a prior over functions, and it is often said that the kernel of a Gaussian process specifies the "form" of those functions, for example in the sense that a larger lengthscale places more mass on smoother functions. If you sample from a GP with 1-d inputs and an RBF kernel, it looks exactly like this, but that does not really explain why it's the case.
What I did next was look into a kernel smoother. Roughly speaking: you have a bunch of observations of a function f(x) at locations x, and you predict the unknown function value at some location z by computing a linear combination of the RBF kernel times the known function values and normalising that sum. Let's say we know f(x1) and f(x2) and want to predict f(x3). Then
f(x3) ≈ (k(x3,x2)*f(x2) + k(x3,x1)*f(x1)) / (k(x3,x2) + k(x3,x1))
If you try to construct an equation with nice vector-matrix notation, you might get something like
f = C * K_{fy} * y
where f is the prediction of the unknown function values, y are the known function values, and C is a matrix that does the normalisation. When you look at the equation for the posterior mean of a GP in GP regression, it looks something like
mean = K(X_known, X_unknown)^T @ K(X_known, X_known)^(-1) @ y
which is also a linear combination of kernel values and observed function values, here "centred" by the inverse of the kernel matrix evaluated at the locations of the observations. This similarity between the posterior mean and a kernel smoother helps me with the intuition. Of course it's not a solid mathematical explanation, but maybe it's a nice point of view from which to start when looking into it.
Did anyone have their Oppenheimer moment while understanding the RBF kernel? I did.
Doesn't exp(x_i^T x_j) give the same power?
I came here looking for intuition on what RBFs are.
3:43 How can you add the cross terms to get -2 * xi^T * xj? Can anyone help me?
xi^T * xj produces the same result as xj^T * xi, I think :)
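Spelling it out, since xi^T xj is a scalar and equals xj^T xi:

\lVert x_i - x_j\rVert^2 = (x_i - x_j)^\top (x_i - x_j) = x_i^\top x_i - x_i^\top x_j - x_j^\top x_i + x_j^\top x_j = x_i^\top x_i - 2\,x_i^\top x_j + x_j^\top x_j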