Machine Learning Lecture 26 "Gaussian Processes" -Cornell CS4780 SP17

  • Published 15 May 2024
  • Cornell class CS4780. (Online version: tinyurl.com/eCornellML)
    GPyTorch GP implementation: gpytorch.ai/
    Lecture Notes:
    www.cs.cornell.edu/courses/cs4...
    Small corrections:
    Minute 14: it should be P(y,w|x,D) and not P(y|x,w,D); sorry about that typo.
    Also, the variance term at 40:20 should be K** - K* K^-1 K*.
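
    Below is a minimal NumPy sketch of these corrected formulas (my own illustration, not code from the lecture or notes): a zero-mean GP with an RBF kernel, where the posterior mean is K*^T K^-1 y and the posterior covariance is K** - K*^T K^-1 K*. The kernel choice, lengthscale, and toy data are assumptions for illustration; in practice you would use a Cholesky solve (or GPyTorch) rather than an explicit inverse.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    # Squared-exponential kernel k(a, b) = exp(-||a - b||^2 / (2 * lengthscale^2))
    sq_dists = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-sq_dists / (2 * lengthscale**2))

def gp_posterior(X_train, y_train, X_test, noise=1e-2):
    # Posterior mean and covariance of a zero-mean GP at the test inputs.
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))   # K
    K_star = rbf_kernel(X_train, X_test)                              # K*
    K_star_star = rbf_kernel(X_test, X_test)                          # K**
    K_inv = np.linalg.inv(K)
    mean = K_star.T @ K_inv @ y_train                # K*^T K^-1 y
    cov = K_star_star - K_star.T @ K_inv @ K_star    # K** - K*^T K^-1 K*  (the corrected term)
    return mean, cov

# Toy usage
X_train = np.linspace(0, 5, 8)[:, None]
y_train = np.sin(X_train).ravel()
X_test = np.linspace(0, 5, 50)[:, None]
mu, Sigma = gp_posterior(X_train, y_train, X_test)
std = np.sqrt(np.diag(Sigma))    # pointwise uncertainty band, as in the lecture demo
```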

COMMENTS • 102

  • @pandasstory
    @pandasstory 4 years ago +50

    I got my first data science internship after watching all the lectures, and now I'm revisiting them during the quarantine and still benefiting a lot. This whole series is a legend; thank you so much, Professor Kilian! Stay safe and healthy!

  • @jiahao2709
    @jiahao2709 4 years ago +22

    He is the most interesting ML professor that I have ever seen on the Internet.

  • @rshukla64
    @rshukla64 5 years ago +34

    That was a truly amazing lecture from an intuitive teaching perspective. I LOVE THE ENERGY!

  • @yibinjiang9009
    @yibinjiang9009 3 years ago

    The best GP lecture I've found. Simple enough and makes sense.

  • @miguelalfonsomendez2224
    @miguelalfonsomendez2224 3 years ago +2

    amazing lecture in every possible aspect: bright, funny, full of energy... a true inspiration!

  • @horizon2reach561
    @horizon2reach561 4 years ago +12

    There are no words to describe the power of the intelligence in the lecture, thanks a lot for sharing it.

  • @TeoChristopher
    @TeoChristopher 4 years ago +7

    Best prof that I've experienced so far. I love the way he tries to build sensible intuition behind the math. Also, love the sense of humour.

  • @George-lt6jy
    @George-lt6jy 3 years ago

    This is a great lecture, thanks for sharing it. I also appreciate that you took the time to add the lecture corrections.

  • @damian_smith
    @damian_smith 1 month ago

    Loved that "the answer will always be Gaussian, the whole lecture!" moment.

  • @alvarorodriguez1592
    @alvarorodriguez1592 4 years ago +1

    Hooray! Gaussian processes for dummies! Exactly what I was looking for.
    Thank you very much.

  • @htetnaing007
    @htetnaing007 2 years ago

    People like this are truly a gift to mankind!

  • @rajm3496
    @rajm3496 4 years ago +1

    Very intuitive and easy to follow. Loved it!

  • @salmaabdelmonem7482
    @salmaabdelmonem7482 3 years ago +2

    the best GP lecture ever, impressive work (Y)

  • @CibeSridharanK
    @CibeSridharanK 4 years ago +1

    Awesome explanation. That house example explains it in very layman's terms.

  • @saikumartadi8494
    @saikumartadi8494 4 years ago +21

    The explanation was great! Thanks a lot. It would be great if you uploaded videos of the other courses you taught at Cornell, because not everyone is lucky enough to get a teacher like you :)

  • @ylee5269
    @ylee5269 5 years ago +2

    Thanks for such a good lecture and nice explanation. I was struggling to understand Gaussian processes for a while until I saw your video.

  • @ikariama100
    @ikariama100 1 year ago +2

    Currently writing my master's thesis on Bayesian optimization; thank god I found this video!

  • @mostofarafiduddin9361
    @mostofarafiduddin9361 3 years ago +1

    Best lecture on GPs! Thanks.

  • @karl-henridorleans5081
    @karl-henridorleans5081 4 years ago +4

    8 hours of scraping the internet, but the 9th was the successful one. You, sir, have explained and answered all the questions I had on the subject, and raised much more interesting ones. Thank you very much!

  • @benoyeremita1359
    @benoyeremita1359 1 year ago

    Sir your lectures are really amazing, you give so many insights I would've never thought of. Thank you

  • @prizmaweb
    @prizmaweb 5 years ago +2

    This is a more intuitive explanation than the Sheffield summer school GP videos

  • @DJMixomnia
    @DJMixomnia 4 years ago +1

    Thanks Kilian, this was really insightful!

  • @naifalkhunaizi4372
    @naifalkhunaizi4372 2 years ago

    Professor Kilian, you are truly an amazing professor.

  • @erenyeager4452
    @erenyeager4452 3 years ago +1

    I love you. Thank you for explaining why you can model it as a Gaussian.

  • @massisenergy
    @massisenergy 4 years ago +7

    It might have only 112 likes & ~5000 views at the moment I comment, but it will have a profound influence on the people who watch it, and it will stick in their minds!

  • @kiliandervaux6675
    @kiliandervaux6675 2 years ago +1

    The comparison with house prices to explain the covariance was very pertinent. I've never heard it elsewhere. Thanks!

  • @tintin924
    @tintin924 4 years ago +1

    Best lecture on Gaussian Processes

  • @parvanehkeyvani3852
    @parvanehkeyvani3852 1 year ago

    Amazing, I really love the teacher's energy.

  • @danielism8721
    @danielism8721 4 years ago +6

    AMAZING LECTURER

  • @laimeilin6708
    @laimeilin6708 3 years ago +1

    Woo, these are Andrew Ng-level explanations!! Thank you for making these videos. :)

  • @gareebmanus2387
    @gareebmanus2387 2 years ago +4

    Thanks for sharing the excellent lecture. @27:00 About the house's price: the contour plot was always drawn in the first quadrant, but the Gaussian contours should extend over the entire plane. This is actually a drawback of the Gaussian: while we know the house's price can't be negative, and we do not wish to consider the negative range in our model at all, we can't avoid it, since the Gaussian allows nonzero probability for negative price intervals as well (a small numeric sketch follows this thread).

    • @jiageng1997
      @jiageng1997 2 years ago

      Exactly, I was so confused about why he drew it as a peak rather than a ridge.
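
      To make the negative-price point above concrete, here is a tiny sketch with made-up numbers (not from the lecture): a Gaussian has support on the whole real line, so any Gaussian price model assigns nonzero probability to negative prices.

```python
from scipy.stats import norm

# Hypothetical marginal for one house's price: mean $300k, std $100k.
price_model = norm(loc=300_000, scale=100_000)

# Small but nonzero probability of a negative price (about 0.0013 here).
print(price_model.cdf(0))
```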

  • @fierydino9402
    @fierydino9402 4 years ago +1

    Thank you so much for this clear lecture :D It helped me a lot!!

  • @rossroessler5159
    @rossroessler5159 6 months ago

    Thank you so much for the incredible lecture and for sharing the content on YouTube! I'm a first-year Master's student, and this is really helping me self-study a lot of the content I didn't learn in undergrad. I hope I can be a professor like this one day.

  • @hamade7997
    @hamade7997 1 year ago

    Insane lecture. This helped so much, thank you.

  • @gyeonghokim
    @gyeonghokim 1 year ago +1

    Such a wonderful lecture!

  • @sarvasvarora
    @sarvasvarora 3 years ago

    "What the bleep" HAHAH, it was genuinely interesting to look at regression from this perspective!

  • @preetkhaturia7408
    @preetkhaturia7408 3 years ago +1

    Thank you for an amazing lecture, sir!! :)

  • @jaedongtang37
    @jaedongtang37 5 years ago +2

    Really nice explanation.

  • @siyuanma2323
    @siyuanma2323 4 years ago +2

    Looooove this lecture!

  • @clementpeng
    @clementpeng 3 years ago +1

    amazing explanation!

  • @logicboard7746
    @logicboard7746 2 years ago

    The last demo was great for understanding GPs.

  • @galexwong3368
    @galexwong3368 5 years ago +1

    Really awesome teaching

  • @Higgsinophysics
    @Higgsinophysics 2 years ago

    Brilliant and interesting!

  • @rohit2761
    @rohit2761 2 years ago +2

    Kilian is an ML god. Why so few views, when crappy lectures get so many and this gold playlist gets so few? Part of me hopes people don't find it, to keep the competition down. But Kilian is still a god, and this is a gold series. Please upload deep learning as well.

  • @jiahao2709
    @jiahao2709 4 years ago

    Your lecture is really, really good! I have a question: if the input also has noise, how can we use Bayesian linear regression? Most books mention Gaussian noise in the label, but I think it is also quite possible to have some noise in the input X.

  • @Ankansworld
    @Ankansworld 3 years ago +1

    What a teacher!!

  • @CalvinJKu
    @CalvinJKu 3 years ago +4

    Hypest GP lecture ever LOL

  • @SubodhMishrasubEE
    @SubodhMishrasubEE 3 years ago +4

    The professor's throat is unable to keep up with his excitement!

  • @iusyiftgkl7346u
    @iusyiftgkl7346u 4 years ago +1

    Thank you so much!

  • @yuanchia-hung8613
    @yuanchia-hung8613 3 years ago +7

    These lectures definitely have some problems... I have no idea why they are even more interesting than Netflix series lol

    • @sulaimanalmani
      @sulaimanalmani 3 years ago

      Before starting the lecture, I thought this must be an exaggeration, but after watching it, this is actually true!

  • @rorschach3005
    @rorschach3005 3 years ago +1

    Really insightful lecture series, and I have to say I gained a lot from it. An important correction near the beginning: sums and products of normal distributions are not always normal. The sum of two Gaussians is Gaussian only if they are independent or jointly normal. No such rule exists for products, as far as I remember.

    • @kilianweinberger698
      @kilianweinberger698  3 years ago

      Yes, that came out wrong. What I wanted to say is the product of two normal PDFs is proportional to a normal PDF (which is something that comes up a lot in Bayesian statistics).

    • @rorschach3005
      @rorschach3005 3 years ago +1

      @@kilianweinberger698 Thanks for replying. I am not sure I understand what you meant by proportional to a normal. The product of two normals is generally a combination of chi-square variables: XY = ((X+Y)^2 - (X-Y)^2)/4. Please correct me if I am missing something.

    • @fowlerj111
      @fowlerj111 1 year ago

      @@rorschach3005 I had the same reaction and I think I've resolved it. "product of Gaussians" can be interpreted two different ways. You and I considered the distribution of z where z=x*y and x and y are Gaussian. By this definition, z is definitely not Gaussian. KW is saying that if you define the pdf of z to be the product of the pdfs of x and y, normalized, then z is Gaussian. This is the property exploited in the motivating integral - note that probability densities are multiplied, but actual random variables are never multiplied.
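
      For reference, the completing-the-square identity behind this thread (standard Gaussian algebra, not taken from the lecture): the product of two Gaussian densities in the same variable is proportional to another Gaussian density.

```latex
\mathcal{N}(w;\mu_1,\sigma_1^2)\,\mathcal{N}(w;\mu_2,\sigma_2^2)
  \;\propto\;
  \mathcal{N}\!\left(w;\;
    \frac{\sigma_2^2\mu_1+\sigma_1^2\mu_2}{\sigma_1^2+\sigma_2^2},\;
    \frac{\sigma_1^2\sigma_2^2}{\sigma_1^2+\sigma_2^2}\right)
```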

  • @DrEhrfurchtgebietend
    @DrEhrfurchtgebietend 4 years ago

    It is worth pointing out that, while there is no specific model, there is an analytic model being assumed. In this case he assumed a linear model.

  • @kevinshao9148
    @kevinshao9148 4 months ago

    Thanks for the brilliant lecture! One confusion, if I may: at 39:18 you change the conditional probability P(y1...yn | x1...xn) based on data D to P(y1...yn, y_test | x1...xn, x_test). My questions: 1) Before the test data point arrives, do we already have a joint distribution P(y1...yn, x1...xn) based on D? 2) Once the test point comes in, do we need to form another Gaussian N(mean, variance) over (y1...yn, y_test, x1...xn, x_test)? If so, how do we get the covariance term between the test point and each training point? Basically, for prediction with a new x_test, what exact parameters do we get for the y_test distribution (how do we get the mean and variance)? Many thanks!

  • @CibeSridharanK
    @CibeSridharanK 4 years ago

    18:08 I have a doubt: we are not constructing a single line; instead we are comparing with every possible line nearby. Does that mean we are indirectly accounting for the w's using the covariance matrix?

  • @dheerajbaby
    @dheerajbaby 3 years ago

    Thanks for a great lecture. I am a bit confused about the uncertainty estimates. How can we formally argue that the posterior variance at any point is telling us something really useful? For example, suppose we consider a simple setup where the training data is generated as y_i = f(x_i) + N(0, sigma^2), i = 1,...,n, and f is a sample path of the GP(0,k). Then is it possible to construct a high-probability confidence band that traps the ground truth f_i using the posterior covariance and mean functions? After all, if I understood correctly, the main plus point of GP regression over kernel ridge regression is the posterior covariance.

    • @dheerajbaby
      @dheerajbaby 3 years ago

      I actually found all my questions answered in this paper: arxiv.org/pdf/0912.3995.pdf, which is the Test of Time paper at ICML 2020.

  • @vishaljain4915
    @vishaljain4915 1 month ago

    Does anyone know what the question at 14:30 was? Brilliant lecture; easily a new all-time favourite.

  • @dr.vinodkumarchauhan3454
    @dr.vinodkumarchauhan3454 2 years ago

    Beautiful

  • @ejomaumambala5984
    @ejomaumambala5984 4 years ago

    Great lectures! Really enjoyable. There's an important mistake at 40:20, I think? The variance is not K** K^-1 K*, as Kilian wrote, but rather K** - K* K^-1 K*.

    • @kilianweinberger698
      @kilianweinberger698  4 years ago +1

      Yes, good catch! Thanks for pointing this out. Luckily it is correct in the notes: www.cs.cornell.edu/courses/cs4780/2018fa/lectures/lecturenote15.html

  • @vikramnanda2833
    @vikramnanda2833 1 year ago

    Which course should I take to learn data science or machine learning?

  • @pratyushkumar9037
    @pratyushkumar9037 3 years ago +1

    Professor Kilian, I don't understand how you arrived at mean = K* K^-1 y and variance = K** - K* K^-1 K* for the normal distribution.

    • @kilianweinberger698
      @kilianweinberger698  3 years ago +2

      It is just the conditional distribution for the Gaussian ( see e.g. en.wikipedia.org/wiki/Multivariate_normal_distribution#Conditional_distributions , here Sigma is our K)
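
      Spelled out, the conditioning rule from that article, written with the lecture's K notation (transposes made explicit) and assuming a zero prior mean:

```latex
\begin{bmatrix} \mathbf{y} \\ y_* \end{bmatrix}
\sim \mathcal{N}\!\left(\mathbf{0},\,
\begin{bmatrix} K & K_* \\ K_*^\top & K_{**} \end{bmatrix}\right)
\quad\Longrightarrow\quad
y_* \mid \mathbf{y} \;\sim\; \mathcal{N}\!\left(K_*^\top K^{-1}\mathbf{y},\;
K_{**} - K_*^\top K^{-1} K_*\right)
```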

  • @yannickpezeu3419
    @yannickpezeu3419 2 years ago

    Thanks

  • @zvxcvxcz
    @zvxcvxcz 3 years ago +2

    Really making concrete what I've known about ML for some time. There is no such thing as ML, it is all just glorified correlation :P

  • @mutianzhu5128
    @mutianzhu5128 4 years ago +1

    I think there is a typo at 40:18 for the variance.

    • @ejomaumambala5984
      @ejomaumambala5984 4 years ago +1

      Yes, I agree. The variance is not K** K^-1 K*, as Kilian wrote, but rather K** - K* K^-1 K*.

  • @arihantjha4746
    @arihantjha4746 3 years ago +3

    Since p(xi,yi;w) = p(yi|xi;w) p(xi), during MLE and MAP we ignore p(xi), as it is independent of w, to get the likelihood function (the product over i of p(yi|xi;w)). But here, why do we simply start with P(D;w) equal to the likelihood function? Shouldn't P(D;w) be the product over i of p(yi|xi;w) p(xi), where p(xi) is some arbitrary distribution (it is independent of w and no assumptions are made about it), while p(yi|xi;w) is a Gaussian? Since only multiplying a Gaussian with a Gaussian gives us a Gaussian, how is the answer a Gaussian when p(xi) is not a Gaussian?
    Ignoring p(xi) during MLE and MAP makes a lot of sense, as it is independent of theta, but why wasn't it included when writing P(D;w) in the first place?
    Do we just assume that, since the xi are given to us and we don't model p(xi), p(xi) is a constant for all xi? Can anyone help?
    Also, thank you for the lectures, Prof.

    • @kilianweinberger698
      @kilianweinberger698  3 years ago +1

      The trick is that P(D;w) is inside a maximization with respect to the parameters w. Because P(x_i) is independent of w, it is just a constant we can drop:
      max_w P(D;w) = max_w Π_i P(x_i,y_i;w) = max_w (Π_i P(y_i|x_i;w)) * (Π_i P(x_i))
      This last term is a multiplicative constant that you can pull out of the maximization and drop, as it won't affect your choice of w. (Here Π is the product symbol.)

  • @zhongyuanchen8424
    @zhongyuanchen8424 3 years ago +1

    Why is the integral over w of P(y|x,w) P(w|D) equal to P(y|x,D)? Is it because P(w|D) = P(w|D,x)?

    • @kilianweinberger698
      @kilianweinberger698  3 years ago

      P(y|x,w)P(w|D)=P(y,w|x,D)
      If you now integrate out w you obtain P(y|x,D).
      (Here x is the test point, and D is the training data.)
      If you want to make it clearer you can also use the following intermediate step: P(y|x,w)=P(y|x,w,D). You can condition on D here, because y is conditionally independent of D, when x,w are given.
      For the same reason you can write P(w|D)=P(w|D,x) as w does not depend on the test point x (it is only fitted on the training data).
      Hope this helps.
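
      Written out, the marginalization step from the reply above (x is the test input, D the training data, and the two conditional independences are the ones stated in the reply):

```latex
P(y \mid x, D)
= \int P(y, w \mid x, D)\,dw
= \int P(y \mid x, w, D)\,P(w \mid x, D)\,dw
= \int P(y \mid x, w)\,P(w \mid D)\,dw
```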

  • @vatsan16
    @vatsan16 3 years ago +6

    "One line of julia... two lines of python!!" whats with all the python hate professor? :P

    • @zvxcvxcz
      @zvxcvxcz 3 years ago +1

      Oh come on, two isn't so bad, do you know how many it is in assembly? :P

  • @christiansetzkorn6241
    @christiansetzkorn6241 2 years ago

    Sorry, but why a correlation of 10 for the POTUS example? Correlation can only be -1 ... 1?!

  • @hossein_haeri
    @hossein_haeri 3 years ago +1

    What exactly is K**? Isn't it always ones(m,m)?

    • @kilianweinberger698
      @kilianweinberger698  3 years ago

      No, it depends on the kernel function. But it is the inner product of the test point(s) with itself/themselves.

  • @gregmakov2680
    @gregmakov2680 1 year ago

    Hahaha, any student who actually understands this lecture is a genius :D:D:D:D It mixes everything together :D:D so confusing.

  • @sandipua8586
    @sandipua8586 5 years ago +11

    Thanks for the content but please calm down, I'm getting a heart attack

    • @nichenjie
      @nichenjie 5 years ago +1

      Learning GPs is so frustrating T.T

    • @jzinou1779
      @jzinou1779 5 years ago

      lol

  • @sekfook97
    @sekfook97 2 years ago +1

    I just learned that they used Gaussian processes to search for the missing airplane in the ocean. Btw, I am from Malaysia.

  • @zhuyixue4979
    @zhuyixue4979 4 years ago +1

    aha moment: 11:15 to 11:25

  • @maxfine3299
    @maxfine3299 25 days ago

    the Donald Trump bits were very funny!

  • @busTedOaS
    @busTedOaS 3 years ago

    ERRM

  • @bnouadam
    @bnouadam 4 years ago

    This guy has absolutely no charisma and has a controlling attitude. His tone is not fluent.

  • @prathikshaav9461
    @prathikshaav9461 4 years ago +2

    Just binge-watching your course, I love it... Is there a link to the homework, exams, and solutions? It would be helpful.

    • @kilianweinberger698
      @kilianweinberger698  4 years ago +13

      Past 4780 exams are here: www.dropbox.com/s/zfr5w5bxxvizmnq/Kilian past Exams.zip?dl=0
      Past 4780 Homeworks are here: www.dropbox.com/s/tbxnjzk5w67u0sp/Homeworks.zip?dl=0