Lecture 12 - Regularization

  • Published 17 Dec 2024

COMMENTS • 55

  • @StevenSarasin
    @StevenSarasin 9 years ago +73

    This guy is a rock star. These lectures are so satisfying and useful.

  • @ZeeshanAliSayyed
    @ZeeshanAliSayyed 10 years ago +21

    I haven't seen any other class where Regularization was explained in such a depth. I love Prof. Abu-Mustafa's teaching.

  • @pettergustad
    @pettergustad 9 years ago +17

    A great class! Thank you to Yaser Abu-Mostafa and Caltech for making it available to the public!

  • @Nestorghh
    @Nestorghh 12 years ago +11

    Excellent: "you can think of the holy grail of machine learning is to find an in sample estimate of the out of sample error. If you get that, you are done, minimize it and you go home".

  • @woddenhorse
    @woddenhorse 2 years ago +2

    59:49
    "You don't hide behind a great-looking derivation, when the basis of it is shaky"
    Damnn this line is 🔥🔥

  • @Devilathor
    @Devilathor 7 years ago +4

    1:05:49 "Heuristic is heuristic but we are still scientists and engineers"
    The best words ever!

  • @chowdhsk
    @chowdhsk 10 years ago +13

    Small clarification: at 16:00 the professor says Lebesgue of order 1 up to Lebesgue of order Q. I presume he meant Legendre... Lebesgue, to the best of my knowledge, belongs to integration theory...

    • @ZeeshanAliSayyed
      @ZeeshanAliSayyed 10 years ago

      Good observation. He indeed meant Legendre!

  • @luismesagrave
    @luismesagrave 6 years ago

    It's amazing how well this man understands what he is talking about, and how clearly he establishes the key notions... just superb!

  • @sarahchen4385
    @sarahchen4385 9 years ago +6

    So enjoyable! One of the best lectures ever!

    • @nerdmeh5172
      @nerdmeh5172 8 years ago +1

      Same feeling here. Impressive!!

  • @SaeidAbolfazli
    @SaeidAbolfazli 8 years ago +2

    Such wonderful material: non-scary math, simple yet clear visualization, and plain language accompanied by lots of small hints to keep you following. It takes the audience deep into such an important topic in machine learning. Awesome!

  • @AndyLee-xq8wq
    @AndyLee-xq8wq 1 year ago

    This lecture is insanely clear in explaining regularization!

  • @musicaansen5449
    @musicaansen5449 6 years ago +1

    I always wanted to know the intuition behind regularization... one of the best lectures on it, along with Andrew Ng's.

  • @MrWilliamducfer
    @MrWilliamducfer 4 months ago

    His classes are really amazing!

  • @muhammadusama6040
    @muhammadusama6040 6 years ago +1

    Excellent lecture. He makes a lot of sense out of raw math.

  • @somaya2005
    @somaya2005 9 years ago

    Never heard regularization explained in such an interesting way :)
    thanks

  • @mrf145
    @mrf145 10 years ago

    Excellent... for the first time I have understood what regularization is :)

  • @DistortedV12
    @DistortedV12 7 years ago

    Not going to say I understand everything here, but he's a really great instructor.

  • @filipwojcik4133
    @filipwojcik4133 9 years ago +1

    Great lecture (as always!).
    What I find confusing is the content of slide 9 (28:42): where does the derivation come from that
    the gradient of Ein(w_reg) is proportional to -w_reg?
    Do you know? I cannot get this derivation point.

    • @StevenSarasin
      @StevenSarasin 9 years ago +2

      +Filip Wójcik E_in will be minimized by the w closest to w_lin, under the constraint that w lie within some circle not containing w_lin. Convince yourself that w should point in the direction of w_lin: if it didn't, we could rotate it along the circle C until it did, and it would be closer to w_lin. So we want w pointing at w_lin, but the gradient is perpendicular to the ellipse centered at w_lin and points away from the ellipse's interior. Therefore, when w points toward w_lin, the gradient points away from w_lin, and their directions are opposite. Vectors with opposite directions differ by a negative sign only.

    • @Bing.W
      @Bing.W 7 years ago

      The point is like the following. When you are trying to get the minimal Ein by updating w using gradient descent, you always update w with a small move in the opposite direction of Ein'(w), the gradient at w, i.e., w = w - t*Ein'(w). When you reach the minimal point of Ein, it usually requires Ein'(w) = 0; but now with the constraint w^2 ≤ C

    • @Bing.W
      @Bing.W 4 years ago

      @markoiugdefsuiegh When you move w in the _tangential direction_ to the constraint circle, that means you are always _on_ the circle. I guess what you meant is that you move w in the direction of the _tangent line_ that touches the circle at point w. But that is not how it works; you must move w on the circle. Ein'(w) always has two components: one tangential to the circle and one normal. When the minimum is reached while w is on the circle, the tangential component of Ein'(w) becomes zero.
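
    (A worked version of the argument in this thread, written as a sketch of the slide-9 derivation rather than a transcript of it: at the constrained minimum the gradient has no tangential component, so it is normal to the circle and anti-parallel to w_reg, which is the same condition as minimizing the augmented error. Here lambda is the Lagrange multiplier.)

        \min_{w} \; E_{\text{in}}(w) \quad \text{subject to} \quad w^{\mathsf T} w \le C

        \nabla E_{\text{in}}(w_{\text{reg}}) = -2\lambda\, w_{\text{reg}}, \quad \lambda \ge 0
        \;\;\Longleftrightarrow\;\;
        \nabla\big( E_{\text{in}}(w) + \lambda\, w^{\mathsf T} w \big)\big|_{w = w_{\text{reg}}} = 0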

  • @movax20h
    @movax20h 8 years ago

    Tikhonov regularization and weight decay somehow remind me of projection methods for ordinary differential equations (i.e. differential-algebraic equations, or differential equations with constraints or on manifolds).
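
    (For reference, the way the lecture ties the two together, as best I can restate it: the augmented error uses a general Tikhonov regularizer built from a matrix Gamma, and weight decay is the special case Gamma = I.)

        E_{\text{aug}}(w) = E_{\text{in}}(w) + \frac{\lambda}{N}\, w^{\mathsf T} \Gamma^{\mathsf T} \Gamma\, w,
        \qquad \Gamma = I \;\Rightarrow\; \text{weight decay: } E_{\text{in}}(w) + \frac{\lambda}{N}\, w^{\mathsf T} w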

  • @tempvariable
    @tempvariable 5 years ago

    4:35 what does the intensity level between -0.2 and 0.2 correspond to in the image at the bottom right? Thank you

    • @leinadfc01
      @leinadfc01 5 years ago +1

      It's the difference in the out of sample error between a complex and a simpler model

  • @sapnad9854
    @sapnad9854 5 years ago

    In homework #6, Question #2, the out-of-sample (Eout) classification error is required to be calculated. But it looks like the right answer is only obtained if the classification error is calculated in the non-linear transformation domain. I remember the prof mentioned that Eout is always measured on
    the X (input) space, not the Z (transformed) space. Any thoughts on this?

  • @duanddie
    @duanddie 8 years ago

    At 56:34, what technique is he talking about?

  • @brod515
    @brod515 4 years ago

    @35:27 but what is the capital I in the solution for w_reg?
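
    (For context: that capital I is the identity matrix. The weight-decay solution on that slide is w_reg = (Z^T Z + lambda*I)^{-1} Z^T y. Below is a minimal numpy sketch of that formula; the names Z, y and lam are placeholders, not taken from any course code.)

        import numpy as np

        def weight_decay_solution(Z, y, lam):
            # w_reg = (Z^T Z + lam * I)^(-1) Z^T y ; the capital I is the identity matrix
            I = np.eye(Z.shape[1])
            return np.linalg.solve(Z.T @ Z + lam * I, Z.T @ y)

        # toy usage with random placeholder data
        rng = np.random.default_rng(0)
        Z = rng.standard_normal((100, 5))
        y = rng.standard_normal(100)
        print(weight_decay_solution(Z, y, lam=0.1))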

  • @frankd1156
    @frankd1156 3 years ago

    This guy is a legend... I am wondering why he doesn't teach on any MOOC platform like Udacity or Coursera.

  • @attilakun7850
    @attilakun7850 9 years ago +2

    If monomials form a basis of the polynomials, then they are independent, right? Why can't we just use them instead of Legendre polynomials?

    • @dtunkelang
      @dtunkelang 9 years ago +1

      +Attila Kun What's nice about the Legendre polynomials is that they form an orthogonal basis.

    • @Bing.W
      @Bing.W 7 years ago +3

      From the point of view of forming a polynomial model, the coefficients of monomials are not independent. Legendre polynomials ensure that different components of Z are orthogonal, hence the coefficients are independent. It is like constructing a w space with an orthonormal basis. The neat part is that sum{ wi * zi } expresses the vector (w1, ..., wn) of that space.

    • @MrDkhili
      @MrDkhili 7 years ago

      ty
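
    (To make the orthogonality point in this thread concrete, here is a small numpy check, not code from the course, that the inner product of two Legendre polynomials over [-1, 1] vanishes whenever the degrees differ.)

        import numpy as np
        from numpy.polynomial import legendre

        def L(n, x):
            # evaluate the Legendre polynomial of degree n at the points x
            c = np.zeros(n + 1)
            c[n] = 1.0
            return legendre.legval(x, c)

        # Gauss-Legendre quadrature: exact here for polynomial integrands up to degree 19
        nodes, weights = legendre.leggauss(10)

        for m in range(4):
            for n in range(4):
                inner = np.sum(weights * L(m, nodes) * L(n, nodes))  # integral over [-1, 1]
                print(f"<L{m}, L{n}> = {inner:+.4f}")
        # off-diagonal entries come out ~0; the diagonal comes out 2/(2n+1)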

  • @wimthiels638
    @wimthiels638 8 years ago

    Thanks Prof. Yaser, great lessons

  • @chiranjit7798
    @chiranjit7798 8 years ago

    Prof Yaser Thank you.

  • @sidderverse
    @sidderverse 8 years ago

    What a man! Incredible!

  • @raulguarini2342
    @raulguarini2342 6 years ago

    This guy rocks, really

  • @ksawery6568
    @ksawery6568 8 years ago +1

    Amazing, thank you!

  • @always-stay-positive5187
    @always-stay-positive5187 7 years ago

    How are the biases calculated in theory, and for what estimator against which true value?

    • @tempvariable
      @tempvariable 5 years ago

      If you work with synthetic data, you should be able to evaluate them empirically :)

    • @viniciusdeavilajorge5053
      @viniciusdeavilajorge5053 2 years ago

      Watch earlier videos, there is a video called Bias-Variance something
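
    (For reference, the definitions the replies point to, restated from memory of the earlier bias-variance lecture for a noiseless target f: g^(D) is the hypothesis learned from data set D, and g-bar is its average over data sets.)

        \bar g(x) = \mathbb{E}_{D}\big[ g^{(D)}(x) \big]

        \text{bias}(x) = \big( \bar g(x) - f(x) \big)^2, \qquad
        \text{var}(x) = \mathbb{E}_{D}\Big[ \big( g^{(D)}(x) - \bar g(x) \big)^2 \Big]

        \mathbb{E}_{D}\big[ E_{\text{out}}\big( g^{(D)} \big) \big] = \mathbb{E}_{x}\big[ \text{bias}(x) + \text{var}(x) \big]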

  • @mostrotorino
    @mostrotorino 10 years ago +2

    SUPERB!

  • @diwakargautam7446
    @diwakargautam7446 7 years ago

    From where can I download the slides? They are so helpful for me.

    • @Dr_Hope
      @Dr_Hope 7 years ago

      The slides are available on the edX course forum for CS1156

  • @indranilbiswas629
    @indranilbiswas629 2 years ago

    GOD professor ❤️

  • @foxcorgi8941
    @foxcorgi8941 2 years ago

    thank you so much

  • @michaelmellinger2324
    @michaelmellinger2324 2 years ago

    want to punish the noise more than the signal @51m

  • @aminabbasi2936
    @aminabbasi2936 6 years ago

    What an insight!

  • @xesan555
    @xesan555 7 years ago

    Thank you sir, great work

  • @jakubguzowski1352
    @jakubguzowski1352 7 years ago +1

    Let's get carried away, like people get carried away with medicine :)

  • @АламкхуршидСаркер
    @АламкхуршидСаркер 7 years ago

    Could you tell me which university this teacher is from?

  • @brainstormingsharing1309
    @brainstormingsharing1309 4 years ago +1

    👍👍👍👍👍