Lecture 8: Norms of Vectors and Matrices

  • Published Sep 21, 2024
  • MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
    Instructor: Gilbert Strang
    View the complete course: ocw.mit.edu/18...
    YouTube Playlist: • MIT 18.065 Matrix Meth...
    A norm is a way to measure the size of a vector, a matrix, a tensor, or a function. Professor Strang reviews a variety of norms that are important to understand, including S-norms, the nuclear norm, and the Frobenius norm. (The standard definitions are written out just after this description.)
    License: Creative Commons BY-NC-SA
    More information at ocw.mit.edu/terms
    More courses at ocw.mit.edu
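
    For reference, the standard definitions behind those names (these are the usual textbook definitions, written out here for convenience):

    ```latex
    \|v\|_p = \Big(\sum_i |v_i|^p\Big)^{1/p}, \qquad
    \|v\|_S = \sqrt{v^\top S v} \ \ (S \text{ symmetric positive definite}), \qquad
    \|A\|_F = \Big(\sum_{i,j} a_{ij}^2\Big)^{1/2}, \qquad
    \|A\|_{\text{nuclear}} = \sigma_1 + \sigma_2 + \cdots + \sigma_r .
    ```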

COMMENTS • 127

  • @veeecos
    @veeecos 4 years ago +344

    In times of Covid, I hope this makes young people realize why older people are so important. Long live Prof Strang.

    • @justpaulo
      @justpaulo 4 years ago +9

      In times of Covid I hear in the background of the class someone sneezing and nose-blowing and it gives me the chills ...

    • @somadityasantra5572
      @somadityasantra5572 4 years ago +3

      He is the human equivalent of God

    • @godfreypigott
      @godfreypigott 3 years ago +2

      @@somadityasantra5572 Are you saying he doesn't exist?

    • @somadityasantra5572
      @somadityasantra5572 3 years ago +1

      @@godfreypigott You are assuming that I mean God does not exist. But how can you prove or disprove that?

    • @godfreypigott
      @godfreypigott 3 years ago +3

      @@somadityasantra5572 There is no "god" - that is a given. So by saying he is the "human equivalent of god" you are saying that he doesn't exist.

  • @EduardoGarcia-tv2fc
    @EduardoGarcia-tv2fc 4 years ago +64

    I'd say without any doubt that Professor Strang is the best algebra professor in the entire world. I'm sure he has helped tons of students all around the world to understand the beauty of algebra.

  • @lizijian7090
    @lizijian7090 5 years ago +73

    Long live this kindly, mild professor

  • @deeptendusantra670
    @deeptendusantra670 4 years ago +55

    After reading so many texts, finally some actual geometric interpretation of L1 and L2... he explains it so beautifully. Came here only to understand the definition, but his charisma made me watch the whole 50 minutes.

    • @Forced2
      @Forced2 4 years ago +2

      Exactly the same for me

    • @denys22222
      @denys22222 4 years ago

      Haha, I am in the same situation.

    • @minoh1543
      @minoh1543 3 years ago

      the same for me 2222222

    • @pondie5381
      @pondie5381 1 year ago

      EXACTLY the same!!!

  • @JulieIsMe824
    @JulieIsMe824 3 years ago +14

    Best linear algebra course ever! Best wishes for Prof. Strang's health during this horrible pandemic

  • @andrewmeowmeow
    @andrewmeowmeow 3 years ago +8

    What a smart and humble person! Long live Prof. Strang!

  • @rogiervdw
    @rogiervdw 4 years ago +10

    Teaching norms with their R2 pictures is just brilliant. So much insight, even emerging while teaching (sparsity of L1 optimum: it's on the axis!!). An absolute joy to watch & learn from

  • @KirtiDhruv
    @KirtiDhruv 4 years ago +6

    This lecture needs to reach more people asap.
    Total respect for the Professor!

  • @bilyzhuang9242
    @bilyzhuang9242 4 years ago +26

    LONG LIVE PROFESSOR STRANG!!!!!

  • @georgesadler7830
    @georgesadler7830 3 years ago +2

    Dr. Strang, thank you for explaining and analyzing norms. I understood this lecture from start to finish.

  • @atulsrmcem
    @atulsrmcem 1 year ago +1

    I'm currently reading Calculus by Dr. Strang. One of the best books on the subject I have ever come across.

  • @abdowaraiet2169
    @abdowaraiet2169 3 years ago +2

    "You start from the origin and you blow up the norm until you get a point on the line that satisfies your constraint, and because you are blowing up the norm, when it hits first, that's the smallest blow-up possible, that's the min, that's the guy that minimizes" (31:23-31:42). That's 2-D optimization in a nutshell... clear and simple. Thanks very much, Professor Strang.

  • @arnaud5033
    @arnaud5033 2 years ago +1

    Probably this has been said before, so forgive me if I repeat someone else's words.
    I acknowledge that Professor Strang is a good pedagogue. I have learnt some math over the years, and I completely support the use of geometric visualization of properties, as it meets a real learning need. I can say that for me it is easy to see how to derive properties like the one he gave for the assignment on the Frobenius norm. I say this because I may not be the only one thinking it, and I wanted to tell those people that there is more to math here.
    Only recently did I understand the huge degree of humility and teaching wit it takes to pass one's knowledge along. It requires pretending, or honestly feeling, that you are no better than any of your students. For instance, as I witnessed here, Prof. Strang shared the latest cool research topics with his students as if they were his colleagues, and he thanked them for contributing to the course by giving out some answers. That's what allows him to successfully challenge them with assignments like the Frobenius norm - SVD problem. All of it is summarized by Gilbert himself at the very end, at 48:12, when he explains his view of his relationship with the students (such as "We have work to do!", an honest use of the pronoun "we" by the lecturer).
    This 48-minute lecture honestly impressed me in this regard. Today I had the privilege of a double lecture: one in math (which could have been compressed to 15 minutes, since most proofs were skipped) and one in being a better passer-on of knowledge (which could be extended to 10+ years). Hats off!

  • @abdulghanialmasri5550
    @abdulghanialmasri5550 2 years ago +1

    This man does not stop giving, many thanks.

  • @asifahmed1801
    @asifahmed1801 2 years ago

    After passing the linear algebra course, I was kind of disappointed that there was no need to watch your lectures again. But for data analysis you came back, now in HD resolution. So glad to see you, Professor.

  • @diysumit
    @diysumit 3 years ago +2

    Love this man, thanks MIT for looking out for us!

  • @supersnowva6717
    @supersnowva6717 1 year ago

    This lecture just brought my understanding of norms to a whole new level! Thank you so much Professor Strang!

  • @sebah1991
    @sebah1991 1 year ago

    The reason I went from hating math to loving math (especially linear algebra) is Gilbert Strang. What an incredible teacher.

  • @naterojas9272
    @naterojas9272 4 years ago +2

    I highly recommend doing the Frobenius norm proof he mentions. It is elegant and uses some nice properties of linear algebra. If you took 18.06 (or watched the lectures) using the column & row picture of matrix multiplication really helps. I'll finalize my proof and post a link - hopefully I didn't make a mistake ;)

    • @naterojas9272
      @naterojas9272 4 years ago +2

      Maybe I shouldn't post a link... I wouldn't want anyone enrolled in 18.065 to copy it... Hmm......

  • @wangxiang2044
    @wangxiang2044 2 years ago

    Frobenius norm squared = trace of (A-transpose times A) = sum of the eigenvalues of (A-transpose times A) = sum of the squares of the singular values.
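
    That chain of equalities is easy to check numerically; here is a quick sketch with a random matrix:

    ```python
    import numpy as np

    A = np.random.randn(4, 3)

    fro2   = np.linalg.norm(A, 'fro')**2                    # sum of squares of entries
    trace  = np.trace(A.T @ A)                              # trace of A^T A
    eigsum = np.linalg.eigvalsh(A.T @ A).sum()              # sum of eigenvalues of A^T A
    svdsum = (np.linalg.svd(A, compute_uv=False)**2).sum()  # sum of sigma_i^2

    print(np.allclose([trace, eigsum, svdsum], fro2))       # True
    ```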

  • @xingjieli3069
    @xingjieli3069 3 years ago

    Great point comparing the matrix nuclear norm with the vector L1 norm, which tends to find the sparsest winning vector. I guess the nuclear norm may analogously tend to find the 'least' weights (few nonzero singular values) during the optimization.

  • @ashutoshpatidar3288
    @ashutoshpatidar3288 10 months ago +1

    Feeling so emotional watching him teach at the age of 84 😢

  • @hieuphamngoc6258
    @hieuphamngoc6258 3 years ago

    He is such a sweet man and a genius teacher at the same time

  • @jonahansen
    @jonahansen 5 years ago +20

    Dang - He's good!

    • @mgh256
      @mgh256 4 years ago +1

      come on.... He is Gilbert Strang......

  • @karthikeyakethamakka
    @karthikeyakethamakka 2 years ago

    Normalizing a fully connected layer's weight matrix by its largest singular value is called spectral normalization. (The largest singular value equals the largest eigenvalue only when the matrix is symmetric positive semidefinite.)
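
    A minimal sketch of that idea (my own illustration, not from the lecture): estimate sigma1 by power iteration, then divide the weights by it.

    ```python
    import numpy as np

    def spectral_normalize(W, n_iters=50):
        """Scale W so its largest singular value is ~1.

        sigma1 is estimated by power iteration: v converges to the top
        right-singular vector, u to the top left-singular vector.
        """
        v = np.random.randn(W.shape[1])
        for _ in range(n_iters):
            u = W @ v
            u /= np.linalg.norm(u)
            v = W.T @ u
            v /= np.linalg.norm(v)
        sigma1 = u @ W @ v          # Rayleigh-quotient estimate of sigma1
        return W / sigma1

    W = np.random.randn(5, 3)
    print(np.linalg.svd(spectral_normalize(W), compute_uv=False)[0])  # ~1.0
    ```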

  • @nuclearrambo3167
    @nuclearrambo3167 19 days ago

    I would use Laplace's rule of succession in the coin-flipping problem

  • @shvprkatta
    @shvprkatta 3 years ago

    Prof Strang...my respects sir...

  • @fredsmith894
    @fredsmith894 4 years ago +1

    I love Linear Algebra!

  • @HoangLe-rk2ke
    @HoangLe-rk2ke 3 years ago

    Protect him at all costs, MIT

  • @karthikeyakethamakka
    @karthikeyakethamakka 2 years ago

    27:07 Minimizing something subject to a constraint: the Lagrangian formulation.

  • @RohithBhattaram
    @RohithBhattaram 4 years ago

    Good video about norms. Thank you, Prof.

  • @sergiohuaman6084
    @sergiohuaman6084 3 years ago +1

    @44:00 by now Prof. Strang should know that nothing is ever taken out of the tape haha.

  • @jxw7196
    @jxw7196 3 years ago

    This man is brilliant!

  • @xc2530
    @xc2530 1 year ago

    35:00 matrix norm

  • @HardLessonsOfLife
    @HardLessonsOfLife 2 years ago

    Why is L-half not a good norm? Why is p restricted to p >= 1 instead of just p > 0?
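
    One standard answer (a general fact about norms, not something stated in this thread): for p < 1 the triangle inequality fails, which is the same failure as the non-convex unit ball Strang draws. A two-line check for p = 1/2:

    ```python
    def p_norm(v, p):
        # "p-norm" formula, applied even for p < 1 where it is not a real norm
        return sum(abs(t)**p for t in v)**(1.0 / p)

    x, y = (1.0, 0.0), (0.0, 1.0)
    print(p_norm((1.0, 1.0), 0.5))          # ||x + y|| = (1 + 1)^2 = 4.0
    print(p_norm(x, 0.5) + p_norm(y, 0.5))  # ||x|| + ||y|| = 2.0 < 4.0: triangle inequality fails
    ```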

  • @zkhandwala
    @zkhandwala 5 years ago +3

    Compelling lecture (as always), but I'm unsettled about one thing: much of it is based on the fact that the first singular vector of A is the maximizing x in the definition of ||A||2. However, this fact just seems to be mentioned without proof or argument, and accordingly it doesn't feel as though the proof that ||A||2 = sigma1 is complete. Thoughts?

    • @gordongustafson2799
      @gordongustafson2799 5 years ago +8

      I agree. I can give a proof sketch:
      1. A = U Σ V^t by the SVD.
      2. To maximize ||Σy|| for a unit vector y, we would choose y to have all 0's except for a 1 in the position multiplying the largest value in the diagonal matrix Σ, which is sigma1. This effectively scales every component of y by sigma1 (all the other components are 0). Any other choice of y results in some component of y being scaled by a value less than sigma1, and no component scaled by more than sigma1.
      3. U is orthonormal, so ||Uz|| = ||z||
      4. 1 and 3 give us ||Ax|| = ||U Σ V^t x|| = ||Σ V^t x||
      5. Assume ||x|| = 1. V^t is orthonormal, so ||V^t x|| = 1.
      6. Thus, the maximizing value of x satisfies V^t x = y for the y we found in step 2.
      7. This gives x = v1, and ||Ax|| = sigma1.
      8. Since the L2 norm of A is the maximum value of ||Ax||/||x|| over all x's, the L2 norm of A is sigma1 (small leap here, but straightforward)
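
      A quick numerical check of that conclusion (a sketch): v1 achieves the ratio sigma1, and random unit vectors never beat it.

      ```python
      import numpy as np

      A = np.random.randn(4, 3)
      U, S, Vt = np.linalg.svd(A)

      v1 = Vt[0]                                  # top right-singular vector
      print(np.linalg.norm(A @ v1), S[0])         # equal: ||A v1|| = sigma1

      X = np.random.randn(3, 100_000)             # many random directions
      ratios = np.linalg.norm(A @ X, axis=0) / np.linalg.norm(X, axis=0)
      print(ratios.max() <= S[0] + 1e-12)         # True: nothing beats sigma1
      ```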

  • @lololamize
    @lololamize 5 years ago +3

    Does anyone know something more concrete about the Srebro results? Have they been verified already? How general are they? 44:54

  • @Leeidealist
    @Leeidealist 4 years ago +17

    I love him so much
    I don't believe in God but I prayed for his health

  • @zeke-hc3rc
    @zeke-hc3rc 3 years ago

    thank you

  • @namimmsadeghi8475
    @namimmsadeghi8475 3 years ago

    perfect. God bless you...

  • @filialernotina1060
    @filialernotina1060 4 years ago

    Can someone explain to me when we should use the Frobenius norm and when we should use the nuclear norm?

  • @nazarm6215
    @nazarm6215 1 year ago

    So is a sigmoid a norm, or is a norm a sigmoid?

  • @oscarlu9919
    @oscarlu9919 3 years ago

    12:12
    Prof: it's just exploded in importance.
    Me: I just burst out laughing :)

  • @wernerhartl2069
    @wernerhartl2069 3 years ago

    Max ||Ax||/||x|| = Max ||A(kx)||/||kx||, so the ratio is unchanged by scaling and you may as well take ||x|| = 1.
    So you can think of the unit circle ||x|| = 1 with ||Ax|| plotted along each direction; the plot looks like an ellipse, and the point on the ellipse farthest out gives Max ||Ax||/||x||.

  • @TheLegendOfCockpunch
    @TheLegendOfCockpunch 4 years ago

    At 21:30, when Strang is performing the linear transformation of the unit circle, shouldn't the image be rotated 90 degrees?
    It seems like the x stretching is by a factor of 2 and the y stretching by a factor of 3, but he did it the opposite way.
    I am but a humble pupil looking to understand and correct any bugs in my intuition.

    • @oscarys
      @oscarys 4 years ago

      Hi Samuel. Make v2=0 and you will know how far out the ellipse stretches over the v1 axis. Turns out to be sqrt(1/2). Equivalent reasoning tells you that the ellipse stretches sqrt(1/3) on the v2 axis. So the drawing is ok.

    • @TheLegendOfCockpunch
      @TheLegendOfCockpunch 4 years ago

      ​@@oscarys Ah thanks! I was forgetting the bounds of the equation forcing it to equal 1. So those points aren't 2 or 3 but sqrt(1/2) and sqrt(1/3)

    • @oscarys
      @oscarys 4 years ago +1

      @@TheLegendOfCockpunch right! I edited my answer to include the sqrts =)
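
      Writing out the little computation behind this thread (assuming, as the numbers above suggest, the S-norm unit ball 2v1^2 + 3v2^2 = 1 from the lecture):

      ```latex
      2v_1^2 + 3v_2^2 = 1 \;\Rightarrow\;
      \begin{cases}
      v_2 = 0: & v_1 = \sqrt{1/2} \approx 0.707,\\
      v_1 = 0: & v_2 = \sqrt{1/3} \approx 0.577,
      \end{cases}
      ```

      so the larger weight (3) gives the shorter semi-axis, which is why the picture looks "flipped".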

  • @TeejK
    @TeejK 4 years ago

    Holy crap this is a good lecture

  • @charliehou9553
    @charliehou9553 3 years ago

    Long live!

  • @obsiyoutube4828
    @obsiyoutube4828 4 years ago

    sure big professor

  • @sanjaykrkk
    @sanjaykrkk 3 years ago

    Awesome!

  • @raphaelambrosiuscosteau829
    @raphaelambrosiuscosteau829 4 years ago

    How do we actually see that sigma1 is the maximum blow-up factor and that v1 is the vector that gets blown up the most? I initially thought it would be the first eigenvector, and then it would make sense, but then I realised that sigma1 is not an eigenvalue after the professor said it, and I'm struggling a bit with imagining what's happening here

    • @justpaulo
      @justpaulo 4 years ago +1

      Recall the picture Prof. Strang drew when explaining the SVD. Here's a refresher (in slide #25):
      ocw.mit.edu/resources/res-18-010-a-2020-vision-of-linear-algebra-spring-2020/videos/MITRES_18_010S20_LA_Slides.pdf
      As Prof. Strang mentioned, U and V only perform a rotation or possibly a reflection of x, which does not change the norm of x.
      It is Sigma that is responsible for the stretching, and among the sigmas, sigma1 is the biggest; it is therefore the "maximum blow-up factor".
      I hope this helps.
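
      The invariance that reply relies on is one line to verify numerically (a quick sketch):

      ```python
      import numpy as np

      A = np.random.randn(4, 4)
      U, S, Vt = np.linalg.svd(A)
      x = np.random.randn(4)

      # Orthogonal matrices preserve the L2 norm...
      print(np.allclose(np.linalg.norm(U @ x), np.linalg.norm(x)))  # True
      # ...so all the stretching in A = U diag(S) V^T comes from the sigmas:
      print(np.allclose(np.linalg.norm(A @ x),
                        np.linalg.norm(S * (Vt @ x))))              # True
      ```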

  • @prajwalchoudhary4824
    @prajwalchoudhary4824 3 years ago +1

    Is the L0 norm not convex????

    • @ZeroManifold
      @ZeroManifold 1 year ago

      Yes, because the origin point is excluded

  • @avareallymeow
    @avareallymeow 4 years ago +2

    This is some nice chalk

  • @shaurovdas5842
    @shaurovdas5842 4 years ago

    At 32:41, when Professor Strang says the L2 norm of a matrix is 'sigma1', what does he mean by sigma1?

    • @oscarys
      @oscarys 4 years ago +1

      Hi Shaurov. He is referring to the largest singular value in the SVD of A

  • @YNRUIZ69
    @YNRUIZ69 3 years ago

    Cool vid

  • @yupm1
    @yupm1 4 years ago +8

    Whenever I make money I will donate!

    • @jaimelima2420
      @jaimelima2420 4 years ago

      You will make a lot of money, man. On Wall Street, perhaps!

    • @freeeagle6074
      @freeeagle6074 1 year ago

      When you earn 20 dollars, you can donate half a dollar. When you earn 20,000 dollars, you can donate 100 dollars. When you earn 2 billion, you'll leave here and forget about donating forever.

  • @phuongnamphan335
    @phuongnamphan335 5 years ago

    Why is sigma1 the largest singular value? Why does its position relate to being the largest or not? I don't understand

    • @BorrWick
      @BorrWick 5 years ago +1

      Yes, by convention the singular values are ordered from largest to smallest, so sigma1 is the largest.

  • @SimmySimmy
    @SimmySimmy 5 years ago +1

    I've watched the first 6 videos without difficulty, but I'm confused by the definition and geometric meaning of the different norms. Could anyone please tell me which textbook I should read to help me understand? Thanks for your help!

    • @turdferguson3400
      @turdferguson3400 5 years ago +5

      Rewatch the videos, and maybe you'll get it! It has worked for me!

    • @darkwingduck42
      @darkwingduck42 5 years ago

      Linear Algebra and Learning from Data by Gilbert Strang!

  • @quanyingliu7168
    @quanyingliu7168 5 years ago +1

    The phenomenon he mentioned in the first 5 minutes is a very interesting psychological question. Is it about the sequential effects of decision making? Does anyone know the field? Please feel free to share some papers. Thank you.

  • @csl1384
    @csl1384 5 years ago +1

    Is there a link to the notes Prof. Strang keeps alluding to?

    • @NolanZewariligon
      @NolanZewariligon 4 years ago

      @Bob Mama There aren't any lecture notes on that link.

  • @MrFurano
    @MrFurano 2 years ago

    43:49 The "actual humans" statement is still on the tape 🤣

  • @vyvo1473
    @vyvo1473 5 years ago +1

    Why is ||A||2 = max ||Ax||2/||x||2? Can someone help explain? :(

    • @sricharanbattu4502
      @sricharanbattu4502 5 years ago +1

      That is actually the definition of the matrix norm induced by the vector norm

    • @tanweermahdihasan4119
      @tanweermahdihasan4119 5 years ago

      @Rich Caputo shouldn't there be an L2 norm constraint on x? Say, ||x|| = 1.

    • @myoung1445
      @myoung1445 4 years ago

      It's a definition rather than a result

  • @eyobgizaw8362
    @eyobgizaw8362 3 years ago

    How would the shape look for p between 1 and 2?

    • @godfreypigott
      @godfreypigott 3 years ago

      Between the diamond and the circle.
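
      To see it concretely, here is a small sketch that traces the unit ball ||v||p = 1 for a few values of p (matplotlib assumed available):

      ```python
      import numpy as np
      import matplotlib.pyplot as plt

      theta = np.linspace(0, 2 * np.pi, 400)
      d = np.column_stack([np.cos(theta), np.sin(theta)])   # unit directions

      for p in [1, 1.5, 2, 4]:
          r = (np.abs(d) ** p).sum(axis=1) ** (1.0 / p)     # p-norm of each direction
          ball = d / r[:, None]                             # rescale so ||v||_p = 1
          plt.plot(ball[:, 0], ball[:, 1], label=f"p = {p}")

      plt.gca().set_aspect("equal")
      plt.legend()
      plt.title("p-norm unit balls: diamond (p=1) swelling toward the circle (p=2)")
      plt.show()
      ```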

  • @karthikeyakethamakka
    @karthikeyakethamakka 2 years ago

    40:10 Frobenius norm

  • @NolanZewariligon
    @NolanZewariligon 4 years ago +8

    He forgot to finish PCA.

    • @СергейКумейко-й8г
      @СергейКумейко-й8г 4 years ago

      I can offer some available information. The lecture here connects the optimization problem with eigenvectors.
      But sorry, the lecture is in Russian)))
      ua-cam.com/video/W5JLSKcuaQo/v-deo.html

    • @bpc1570
      @bpc1570 4 years ago +1

      How about referring to Andrew Ng's lecture in CS229, which is not in Russian, for English speakers?

    • @justpaulo
      @justpaulo 4 years ago +1

      ua-cam.com/video/ey2PE5xi9-A/v-deo.html

    • @NolanZewariligon
      @NolanZewariligon 4 years ago

      @@justpaulo @BP C
      MVPs.

  • @saleemji9561
    @saleemji9561 3 years ago

    Old people are really our pride

  • @bird9
    @bird9 2 years ago

    Well... how old are these students?

  • @sschmachtel8963
    @sschmachtel8963 5 years ago

    superellipses

  • @thetedmang
    @thetedmang 4 years ago

    I didn't get the part where he minimized the L2 norm geometrically. Why was it that particular point?

    • @yuchenzhao6411
      @yuchenzhao6411 4 years ago +1

      The L2 norm of a vector is its distance from the origin. Since the candidate vectors have to lie on the constraint line, the problem (find the vector satisfying the constraint with minimum L2 norm) becomes: which point on that line is closest to the origin?
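
      For the record, that closest point has a simple closed form (writing the line as c^T x = b): it is the projection of the origin onto the line,

      ```latex
      x^\star = \frac{b}{\|c\|_2^2}\, c ,
      \qquad
      \|x^\star\|_2 = \frac{|b|}{\|c\|_2},
      ```

      and x* is perpendicular to the line, which is exactly the tangency point in Strang's picture.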

  • @matthewpublikum3114
    @matthewpublikum3114 1 year ago

    Are the notes available somewhere?

    • @mitocw
      @mitocw  1 year ago

      There are no lecture notes available for this course; that is because the book (Strang, Gilbert. Linear Algebra and Learning from Data. Wellesley-Cambridge Press, 2018. ISBN: 9780692196380) is basically the lecture notes for the course. See the course on MIT OpenCourseWare for more info and materials at: ocw.mit.edu/18-065S18. Best wishes on your studies!

  • @paquitagallego6171
    @paquitagallego6171 3 years ago

    💖💖💖🙏

  • @bmh18172
    @bmh18172 3 years ago

    RIP. He will be missed.

  • @sumanchaudhary8757
    @sumanchaudhary8757 5 years ago +1

    Can somebody provide lecture notes for this course?

    • @mitocw
      @mitocw  5 years ago +4

      Course materials are available on MIT OpenCourseWare at: ocw.mit.edu/18-065S18. Best wishes on your studies!

    • @NolanZewariligon
      @NolanZewariligon 4 years ago

      @@mitocw There aren't any lecture notes on that link.

    • @sukuya
      @sukuya 4 years ago

      github.com/ws13685555932/18.065_lecture_notes has some summary notes up to lecture 14.

  • @intoeleven
    @intoeleven 4 years ago

    May I ask what the p means?

  • @fraenzo44
    @fraenzo44 1 year ago

    Ć

  • @kevinchen1820
    @kevinchen1820 2 years ago

    20220527 checking in