Jensen's Inequality

  • Published 5 Jun 2024
  • The machine learning consultancy: truetheta.io
    Want to work together? See here: truetheta.io/about/#want-to-w...
    Article on the topic: truetheta.io/concepts/machine...
    Jensen's Inequality appears multiple times in any rigorous machine learning textbook. It's essential for the key principles and foundational algorithms that make this field so productive. In this video, I state what it is, explain why it's important and show why it's true.
    SOCIAL MEDIA
    LinkedIn : / dj-rich-90b91753
    Twitter : / duanejrich
    Enjoy learning this way? Want me to make more videos? Consider supporting me on Patreon: / mutualinformation
    Sources and Learning More
    To see Jensen's Inequality used in the justification for the EM algorithm, see section 11.4.7 of [1]. For its use in Information Theory, see section 2.6 of [2].
    [1] Murphy, K. P. (2012). Machine Learning: a Probabilistic Perspective. MIT Press, Cambridge, MA, USA.
    [2] Cover, T. M. & Thomas, J. A. (2006), Elements of Information Theory 2nd Edition, Wiley-Interscience, NY USA
    Contents
    00:00 - Why Jensen's Inequality is important
    02:01 - Stating the Inequality
    03:30 - Showing the Inequality
    06:36 - Outro
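The inequality the video states, f(E[X]) ≤ E[f(X)] for convex f, is easy to sanity-check numerically. A minimal Python sketch (not from the video; the function x² and a uniform X are arbitrary illustrative choices):

```python
import random

random.seed(0)

# A convex function, chosen purely for illustration.
def f(x):
    return x ** 2

# Sample a random variable X; uniform on [0, 2] is an arbitrary choice.
xs = [random.uniform(0.0, 2.0) for _ in range(100_000)]

mean_x = sum(xs) / len(xs)                  # estimate of E[X]
mean_fx = sum(f(x) for x in xs) / len(xs)   # estimate of E[f(X)]

print(f"f(E[X]) = {f(mean_x):.4f}")   # about 1.0
print(f"E[f(X)] = {mean_fx:.4f}")     # about 4/3
assert f(mean_x) <= mean_fx           # Jensen: f(E[X]) <= E[f(X)]
```

The gap between the two numbers reflects how much the curvature of f interacts with the spread of X; for a constant X the two sides coincide.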

COMMENTS • 128

  • @willbutplural
    @willbutplural 1 year ago +17

    You just explained, in a 10-minute video, a complex topic that I spent 3 hours reading about in a textbook. Your ability to condense and concisely explain these topics in your videos is phenomenal. Great job!

  • @mCoding
    @mCoding 3 years ago +4

    Great intuition, great visualizations! Mathematicians will also say that you need to assume that X is integrable in order for Jensen's inequality to hold. Jensen also has far reaching consequences in theoretical probability, and even analysis in general. Can't wait for more!

    • @Mutual_Information
      @Mutual_Information  3 years ago +2

      This means a lot getting your comment here. Much appreciated!
      And yes! There are unfortunately rigor qualifications that I omit to keep the vid light. In the case of integrability, I hadn’t thought of that, so thanks for pointing it out :)

  • @PolyRocketMatt
    @PolyRocketMatt 5 months ago +1

    Probably one of the most underrated inequalities... Shows up everywhere (mostly Machine-Learning these days, but I also encountered this in neutron transport and rendering of images)

  • @jiayangcheng
    @jiayangcheng 2 years ago +8

    Intuition is indeed what helps at least me to understand (not just short-term memory) a concept, great work, thank you!

    • @sskhdsk
      @sskhdsk 1 month ago

      Humans transform short-term memory into long-term memory through understanding and prediction.

  • @Form74
    @Form74 3 years ago +8

    Thanks for the intuition-nurturing graphics. Very helpful!

  • @xxxxxfirefoxxxxx
    @xxxxxfirefoxxxxx 2 years ago +34

    You basically took an esoteric formula and explained it in a stupid-people friendly way. Thank you

    • @jacoboribilik3253
      @jacoboribilik3253 10 months ago +1

      There's no need to put yourself down in such a way. You can be thankful for the content this guy is putting out on YT by liking, subscribing and hitting the share button... stop dragging yourself over the coals.

  • @SumayaKazi
    @SumayaKazi 3 years ago +7

    Congrats on the launch of your channel and first video, DJ! This was awesome!

  • @karanshah1698
    @karanshah1698 1 year ago +5

    You have no idea how often your explanations blow my mind. It is an "aha" moment every single time, a concept clicks so well! Please keep up this amazing work.

    • @Mutual_Information
      @Mutual_Information  1 year ago +2

      Thank you and I will! I got something big in the works :)

    • @karanshah1698
      @karanshah1698 1 year ago +1

      @@Mutual_Information Do you plan on doing one for EM derivation of GMMs?

    • @Mutual_Information
      @Mutual_Information  1 year ago +1

      @@karanshah1698 EM yes, GMMs, yes eventually. Using them together?? No I didn’t think of that, hm

  • @rangjungyeshe
    @rangjungyeshe 1 year ago +4

    You sure have a gift for teaching! Plus, what a slick production. It takes a lot of hard work and skill to make something look as simple and obvious as you do. Awesome.

  • @debatirthadeb6632
    @debatirthadeb6632 3 months ago +1

    Great explanation. Instead of average, it's better to think of a weighted average; this will easily convey the idea of the formal definition of a convex function : )

  • @hansenmarc
    @hansenmarc 1 year ago +3

    I was curious about Jensen’s inequality, having seen it in the context of EM. You did a great job of providing even more context and explaining the intuition. The animation makes it so easy to understand why it is true. Simply outstanding. This is hands-down the best video I’ve seen on the subject. Thank you! Just subscribed.

    • @Mutual_Information
      @Mutual_Information  1 year ago +1

      Thanks - great to have you! This was my first vid. I've gotten a lot of useful feedback since, but glad this one still lands

  • @anandseth1772
    @anandseth1772 4 months ago +1

    Nicely and Intuitively explained! Thanks

  • @brankojangelovski3105
    @brankojangelovski3105 2 years ago

    nice quality and explanation, really helped me out

  • @EverlastingsSlave
    @EverlastingsSlave 2 years ago +1

    Thanks for such great visuals

  • @LittleBigVlad25
    @LittleBigVlad25 1 year ago +1

    Great visualisation, really good job! Thank you very much!

  • @BGWee
    @BGWee 2 years ago +2

    As a tired econometrics student with a dull lecturer, this helped a bunch, thanks

  • @tferrerd
    @tferrerd 3 years ago +5

    Very interesting. Keep ‘em coming DJ!

  • @user-vr1so7tc7x
    @user-vr1so7tc7x 3 years ago +4

    Great explanation, I am in the process of figuring out cross-entropy and your video helped me with the Jensen's inequality concept! Keep it up!

  • @manoeanna
    @manoeanna 2 years ago +1

    Great video! Thanks for sharing your knowledge!

  • @chnaka7518
    @chnaka7518 1 month ago +1

    Wow. The way you simplified the concept. I was amazed.😍

  • @user-fx1hs8qr6h
    @user-fx1hs8qr6h 7 months ago +1

    Great visualization, thank you very much for your effort to break it down so well!!! :)

  • @manueljenkin95
    @manueljenkin95 2 years ago +2

    Thank you very much for this wonderful presentation. A lot of effort must have gone into getting an animation that feels intuitive.

  • @yennefer415
    @yennefer415 2 years ago +1

    Huh.. it clicked in like 3 seconds after seeing the comparison with line. And why it's true is so obvious now. Amazing. Thanks.

  • @NoNTr1v1aL
    @NoNTr1v1aL 2 years ago +1

    Amazing video! Subscribed.

  • @shubhamjoshi449
    @shubhamjoshi449 9 months ago +1

    Great Video ...Thanks for the efforts you put in these Videos ..🙂

  • @hardy8488
    @hardy8488 3 years ago +7

    Great video, hopefully there are follow-up videos on how Jensen's Inequality becomes an important part of EM, KL divergence and so on.

    • @Mutual_Information
      @Mutual_Information  3 years ago +1

      Yes! The EM algorithm will be covered, but later this year. If you're curious immediately, I linked to some sources in the description where Jensen's Inequality is used. In Cover's book, there is a section "Jensen's Inequality and Its Consequences" which shows how foundational it is for Information Theory.

  • @MrEmilosen
    @MrEmilosen 2 years ago +1

    Perfect presentation, thank you!!

  • @emanuelhuber4312
    @emanuelhuber4312 2 years ago +1

    Awesome video! Really easy to follow

  • @partyhorse420
    @partyhorse420 1 year ago +1

    Amazing explanation!

  • @santroma1
    @santroma1 2 years ago +1

    Great explanation!

  • @rafaelladeira6049
    @rafaelladeira6049 2 years ago +1

    Outstanding explanation!

  • @alexanderk5835
    @alexanderk5835 2 years ago +1

    Very good explanation without keeping it complicated, thanks a lot!

  • @jmbrjmbr2397
    @jmbrjmbr2397 2 months ago +1

    Your channel looks great, thanks!

  • @olegmonogarov7219
    @olegmonogarov7219 2 years ago +1

    Excellent explanation indeed!

  • @descent21iri89
    @descent21iri89 2 years ago +1

    incredibly clear and helpful! thx a lot

  • @KapilSachdeva
    @KapilSachdeva 2 years ago +1

    Brilliant explanation!

  • @jarvis-yu
    @jarvis-yu 3 months ago

    Nice animation making things a lot more intuitive, thanks.

  • @sastryanjaneya5863
    @sastryanjaneya5863 7 months ago +1

    very clear explanation with high energy .. I like it

    • @Mutual_Information
      @Mutual_Information  7 months ago

      lol old video with lots of energy.. I've chilled out a bit since, but thank you

  • @damiangames1204
    @damiangames1204 2 years ago +1

    Nice visualization!

  • @chiragvashist8415
    @chiragvashist8415 1 year ago +1

    You are awesome. I have been binge watching your videos.🖖

  • @ananthakrishnank3208
    @ananthakrishnank3208 4 months ago +1

    Then for a concave function, I expect 'greater than or equal to' instead of 'less than or equal to'.

  • @vasanthakumarg4538
    @vasanthakumarg4538 2 months ago

    Very clear explanation. Keep up the good work

  • @Soedmaelk
    @Soedmaelk 2 years ago +15

    This was an awesome explanation. Thank you! Out of curiosity, how did you make the animation? By the way, that was also really well made!

    • @Mutual_Information
      @Mutual_Information  2 years ago +2

      Hey, thanks! To answer your question, I stitch together a bunch of graphs made in Altair using a personal library. Altair is a very nice plotting library.

    • @asdfasdfuhf
      @asdfasdfuhf 1 year ago

      Looks like he is using manim made by 3blue1brown

  • @simonhradetzky7055
    @simonhradetzky7055 1 year ago +1

    GREAT VISUALISATION TY

  • @salgsalgglas
    @salgsalgglas 1 year ago +1

    This is so good. Thank you.😊

  • @forughghadamyari8281
    @forughghadamyari8281 4 months ago

    Thanks...you've explained it clearly

  • @AKASHSOVIS
    @AKASHSOVIS 2 years ago +1

    You deserve more likes!

  • @Glassful
    @Glassful 3 years ago +1

    You are awesome...this explanation is so cool.

  • @kadenhesse9777
    @kadenhesse9777 3 years ago +2

    This was awesome! Could you make a video about where this is applied? you talked about how it effects ML but could you show an example? Thank you!
    btw the algorithm showed me this video so hopefully ur on the rise! Honored to be this early

    • @Mutual_Information
      @Mutual_Information  3 years ago +1

      Yea I’ll do a video on the EM algorithm, where this shows up. Also, variational inference, eventually.
      And I’m happy to hear that! I hope you’re right but we’ll see.

  • @thorgexyz
    @thorgexyz 3 years ago +1

    Thanks. Very interesting. I listened to a talk from Nassim Taleb where he talked about Jensen's Inequality.

  • @Throwingness
    @Throwingness 2 years ago +1

    Liked and commented. Thank you and more please!

  • @connorshorten6311
    @connorshorten6311 3 years ago +4

    Awesome video!

  • @danialdunson
    @danialdunson 2 years ago +1

    great channel

  • @veri_bilimi
    @veri_bilimi 1 year ago +1

    Amazing! Thank you very much!

  • @randomvideos3628
    @randomvideos3628 2 years ago +1

    Gem of a video...

  • @r.hazeleger7193
    @r.hazeleger7193 19 days ago

    Great vid bruv

  • @angelinag5076
    @angelinag5076 2 years ago +1

    Thanks!

  • @kiarashgeraili8595
    @kiarashgeraili8595 2 years ago +1

    Very Very nice!

  • @simpleworld542
    @simpleworld542 2 years ago +1

    Thank You boss

  • @za012345678998765432
    @za012345678998765432 2 years ago +2

    Just found your channel through your comment on 3b1b's video, very nice explanation. Btw, is the opposite inequality true for concave functions?

    • @Mutual_Information
      @Mutual_Information  2 years ago +1

      Glad you’re here! His channel is a huge inspiration.
      And yep, for concave functions you get the opposite. The negative of a convex function is a concave function, and that reverses the inequality.

  • @Kopakabana001
    @Kopakabana001 3 years ago +1

    Love the videos!

  • @WilliamDye-willdye
    @WilliamDye-willdye 3 years ago +7

    Heh. 4:55 "what mathematicians will ask, and engineers probably won't, is 'why?'." That definitely matches my own experience.

  • @pedramhaqiqi7030
    @pedramhaqiqi7030 8 months ago

    Most OP explanation of all time. My intuition so far had come from showing it with the definition of convexity. This was awesome; relating it to N sampling was the key. Learning Online Convex Opt, any tips :p prof is planning a 50% avg midterm

  • @TheEmT33
    @TheEmT33 2 years ago +1

    great vid! but it would be even better if you could show the formal proof and connect it with the visualization you showed

  • @omololaomotalade8105
    @omololaomotalade8105 2 years ago +1

    Thank you for this..

    • @Mutual_Information
      @Mutual_Information  2 years ago

      For sure - if you'd like, you can do me a solid and tell anyone into ML/stats about the channel :)

  • @inordirection_
    @inordirection_ 3 years ago +9

    I never understood why this inequality was true or what it really meant the first time I saw it, and my engineering prof said don't worry about the intuition, just know how to use it (bleh), but YouTube somehow knew to suggest this to me months later! Thanks for the clear explanation

  • @davidjohnston4240
    @davidjohnston4240 3 years ago +2

    This is relevant to extractor theory used in cryptography (usually as min-entropy, not the Shannon entropy you assume here) - how does your function change the entropy per bit of the input data?

  • @prakhyatshankesi3749
    @prakhyatshankesi3749 1 year ago +1

    Subscribed

  • @alphamikeomega5728
    @alphamikeomega5728 3 years ago +2

    3:19 in and I get it. Thanks!

  • @gjcamacho
    @gjcamacho 10 months ago

    Is it possible you could make a video on Variational Inference and the intuition behind the loss function?

  • @akhilezai
    @akhilezai 3 years ago +4

    Holyshit, this is a great video!

  • @ramirolopezvazquez4636
    @ramirolopezvazquez4636 4 months ago

    Awesome! So ... given a random variable X I can use Jensen's inequality to estimate the local curvature of a function 'f'?

  • @lenishpandey192
    @lenishpandey192 2 years ago +1

    Can't thank you enough!

  • @omridrori3286
    @omridrori3286 2 years ago +1

    My friend, you are amazing. Really, I feel bad about how much time I wasted trying to understand this when you explain it amazingly in 5 minutes.
    Can you also make a video on VAEs and variational inference, ELBO and all that?
    Please, it's really a topic that's hard for a lot of people, and it looks like it's exactly in your domain.
    Please

    • @Mutual_Information
      @Mutual_Information  2 years ago

      Thanks! And I do have plans to make a video on variational inference but it may take a while. There are a few videos in front of it. But it's coming!

  • @Breeezn
    @Breeezn 4 months ago

    Cool inequality! If we take f(x) = x² then this inequality tells us that for any real a, b: a² + 2ab + b²

  • @selinacarter8849
    @selinacarter8849 2 years ago +1

    What software did you use for this cool moving graph thingy??

  • @line8748
    @line8748 10 months ago +1

    If you have a concave function, does the inequality sign just get flipped?
    Thank you very much for the content!

  • @egoreremeev9969
    @egoreremeev9969 3 years ago +1

    So what would happen if the space of points below(!) the function is convex? Will we get the different inequality?

    • @Mutual_Information
      @Mutual_Information  3 years ago +1

      Yep, then that would be a concave function and the inequality would be reversed. A pretty common example of that is the log(x) function.
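A quick numerical sketch of this reversal (my own illustration, not from the video), using log(x) as the concave example the reply mentions and a uniform X chosen arbitrarily:

```python
import math
import random

random.seed(1)

# log is concave, so Jensen's inequality flips: E[log X] <= log E[X].
# Uniform on [0.5, 3] keeps X strictly positive so log is defined.
xs = [random.uniform(0.5, 3.0) for _ in range(100_000)]

mean_x = sum(xs) / len(xs)                        # E[X], about 1.75
mean_log = sum(math.log(x) for x in xs) / len(xs) # E[log X]

print(f"E[log X]  = {mean_log:.4f}")
print(f"log E[X]  = {math.log(mean_x):.4f}")
assert mean_log <= math.log(mean_x)
```

This flipped inequality is exactly why E[log X] ≤ log E[X] shows up in information theory and in the EM lower bound.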

  • @sadiyaahmad6680
    @sadiyaahmad6680 1 year ago +2

    Can you upload some content about expectation maximization with mixed Poisson?

    • @Mutual_Information
      @Mutual_Information  1 year ago

      Hm, sorry but that's unlikely. It's just too specific. The topics I've picked are already fairly niche. If I go into a very specific subtopic, it'll appeal to very few folks (unless there is something particularly fascinating about it)

  • @nikoskonstantinou3681
    @nikoskonstantinou3681 3 years ago +1

    Hey, nice video. I thought we would see the
    f((a+b)/2)

  • @cnidariantide4207
    @cnidariantide4207 2 years ago +1

    I wish I had friends like you! What's on the bookshelf? =)
    The intuition follows immediately from grokking the idea of convex functions!
    Some of the coolest tricks in mathematics come from manipulation of inequalities. There's a brilliant little maths book named "The Cauchy-Schwarz Master Class" which I recommend for anyone wanting to master these dark arcana.
    Sometimes, like in this case, an animation is just unbeatable, however!

    • @Mutual_Information
      @Mutual_Information  2 years ago

      Thank you! I’ve heard that book recommended a few times but have never checked it out. I’ll order it!

  • @carlaparla2717
    @carlaparla2717 1 year ago

    I don't like the chosen function for the visualization because it is everywhere increasing. From this video I'm not convinced that this reasoning would be valid for, say, a parabola segment.

  • @rocamonde
    @rocamonde 2 years ago +2

    The visual explanation is excellent. However, it might seem that it does not hold if one picks a convex function that is below the straight line. For completeness, it is worth remarking that the equality holds for any straight line. Because of this, for all convex functions f(x) there is always a straight line ax+b for which f(x)>=ax+b for all x. (Namely, the line you pick has to be the tangent of the convex function at E[X]).

    • @erikysilvagomes5496
      @erikysilvagomes5496 1 year ago

      My question was not answered by the author, maybe you can help me. My doubt is precisely about the equality holding for any straight line... why is it true? I can understand it holds for linear functions of the type f(x) = c.x, in the sense of addition and multiplication conservation -> hence f(E(x)) = E(f(x)). Functions of the type f(x) = ax+b are straight lines, but not linear functions in this sense, so we cannot prove that f(E(x))=E(f(x)).
      Try to use X ~ exp(k) and the transformation Y = aX + b; you'll see that f(E(x))=a/k+b and E(f(x))=exp(kb/a)*a/k, which do not hold for any straight line, but just for b=0.

    • @rocamonde
      @rocamonde 1 year ago +1

      I’m not sure what you’re saying. The equality is satisfied when f is an affine transformation. This is trivial because E is a linear operator:
      E(f(X)) = E(aX+b) = a E(X)+b=f(E(X))
      where the first step applies the definition that f is affine (a straight line), the second, that E is linear, and the third using again the definition of f being affine.
      In your example you’re mixing up letting f be an exponential or letting it be an affine transformation, so you get something weird. You can’t apply both transformations at the same time, that is not something that Jensen inequality talks about in any way.
      If the function is convex, you get an inequality. If the function is strictly convex (unless X is constant a.s.) you get a strict inequality. And in the opposite edge case that the function is affine (which is also convex), you get exact equality.

    • @erikysilvagomes5496
      @erikysilvagomes5496 1 year ago

      @@rocamonde Of course you are correct, in my example I made a mistake calculating E[f(X)], for that reason the equality does not hold. My approach was to see the theorem from the affine functions properties, but in fact is much more simple to see it from the expectation linearity. Thank you, your explanation was clear!
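The two facts in this thread, exact equality for affine f and the tangent line at E[X] sitting below a convex f, can both be checked numerically. A sketch under illustrative assumptions (f(x) = x² as the convex function, a Gaussian X):

```python
import random

random.seed(2)

def f(x):
    """Convex function chosen for illustration."""
    return x ** 2

def tangent(x, x0):
    """Tangent line to f at x0: f(x0) + f'(x0) * (x - x0), with f'(x0) = 2*x0."""
    return x0 ** 2 + 2 * x0 * (x - x0)

xs = [random.gauss(1.0, 0.5) for _ in range(100_000)]
mean_x = sum(xs) / len(xs)

# 1) Affine functions give exact equality: E[aX + b] == a E[X] + b,
#    because E is a linear operator.
a, b = 3.0, -1.0
lhs = sum(a * x + b for x in xs) / len(xs)
rhs = a * mean_x + b
assert abs(lhs - rhs) < 1e-6

# 2) The tangent at E[X] lies below f everywhere, which is the proof idea:
#    E[f(X)] >= E[tangent(X, E[X])] = f(E[X])  (small tolerance for float error).
assert all(f(x) >= tangent(x, mean_x) - 1e-12 for x in xs)
```

The second check is the tangent-line argument from the thread: since f(x) − tangent(x, x₀) = (x − x₀)² ≥ 0, taking expectations of both sides gives Jensen's inequality directly.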

  • @rangjungyeshe
    @rangjungyeshe 1 year ago

    Thanks!