The Exponential Family (Part 1)

  • Published 15 May 2024
  • The machine learning consultancy: truetheta.io
    Want to work together? See here: truetheta.io/about/#want-to-w...
    The Exponential Family includes almost all of the most frequently encountered distributions. In this video, I show how it functions as such a wide-reaching generalization. But it's more than a generalization. This video will set up part 2, where I cover all the useful properties that follow from this clever mathematical device.
    SOCIAL MEDIA
    LinkedIn : / dj-rich-90b91753
    Twitter : / duanejrich
    Enjoy learning this way? Want me to make more videos? Consider supporting me on Patreon: / mutualinformation
    CONTENTS
    0:00 Introducing the Exponential Family
    1:11 The Normal Distribution as a special case of the Exponential Family
    3:40 Stating the Exponential Family Precisely
    9:00 The Bernoulli Distribution as a case of the Exponential Family
    10:50 The Multinoulli Distribution as a case of the Exponential Family
    14:00 Different Choices, Different Distributions
    Chapter 9 from [1] gives a nice overview of the Exponential Family and its usefulness within Machine Learning. [2] was a useful additional perspective and made clear how to interpret the measure of x. [3] gave me the list of all distributions.
    SOURCES
    [1] K. P. Murphy. Machine Learning: A Probabilistic Perspective, MIT Press, 2012
    [2] M. I. Jordan, Exponential Family: Basics, University of California, Berkeley, people.eecs.berkeley.edu/~jor...
    [3] The Exponential Family, Wikipedia, en.wikipedia.org/wiki/Exponen...

COMMENTS • 82

  • @mCoding
    @mCoding 2 years ago +13

    Can't wait for part 2!

  • @vivekveer3272
    @vivekveer3272 2 years ago +9

    This video is so amazing. Great visuals, really nicely explained.

  • @BlaqueT
    @BlaqueT 1 year ago +4

    This explained the concept and inner workings of the exponential family so much better than my lecturer did. Keep up the amazing work!

  • @Dan-xl8jv
    @Dan-xl8jv 2 years ago +3

    This is amazing. So well explained

  • @chrisha313
    @chrisha313 2 years ago +3

    This is amazingly done, thank you so much!

  • @parthkdoshi
    @parthkdoshi 2 years ago +2

    Lovely video! Thanks a lot. Waiting for part 2..

  • @SamuelLiJ
    @SamuelLiJ 10 months ago +1

    The expression for the expectation and variance in terms of log(Z) shows up again in statistical mechanics. There the energy and energy fluctuations are derivatives of log(partition function), precisely because the distribution over microstates is in the exponential family with t(x)=E and theta=beta. Great stuff!
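
The relationship mentioned in this comment can be checked numerically. Below is a minimal sketch, assuming the Bernoulli case with t(x) = x and h(x) = 1, so that Z(theta) = 1 + exp(theta); the helper names `log_Z`, `moments`, and `finite_diff_derivatives` are illustrative and not from the video. It compares finite-difference derivatives of log Z against the mean and variance of t(x).

```python
import math


def log_Z(theta):
    # Bernoulli with t(x) = x, h(x) = 1, x in {0, 1}:
    # Z(theta) = exp(theta * 0) + exp(theta * 1) = 1 + exp(theta)
    return math.log(1.0 + math.exp(theta))


def moments(theta):
    # Mean and variance of t(x) = x under p(x | theta)
    p = math.exp(theta) / (1.0 + math.exp(theta))
    return p, p * (1.0 - p)


def finite_diff_derivatives(f, theta, eps=1e-4):
    # Central differences for the first and second derivative of f
    first = (f(theta + eps) - f(theta - eps)) / (2 * eps)
    second = (f(theta + eps) - 2 * f(theta) + f(theta - eps)) / eps**2
    return first, second


theta = 0.7
mean, var = moments(theta)
d1, d2 = finite_diff_derivatives(log_Z, theta)
print("E[t(x)]   vs d/dtheta log Z:    ", mean, d1)
print("Var[t(x)] vs d^2/dtheta^2 log Z:", var, d2)
```

In the statistical mechanics reading, theta plays the role of beta (up to the sign convention chosen) and t(x) the role of the energy, so the same two derivatives would give the mean energy and its fluctuations.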

  • @xLyndo
    @xLyndo 2 years ago +3

    Thanks, DJ! I was coincidentally trying to learn about these in Bishop but my eyes just glazed over. These visualizations are great!

  • @PeacefulAnxiety
    @PeacefulAnxiety 2 years ago +2

    The pace for the bullet points was perfect, it's clear the effort to get it to flow so well!

    • @Mutual_Information
      @Mutual_Information  2 years ago +1

      Ha yea getting the timing right is a challenge. Still a work in progress, but glad people are noticing

  • @roydouek
    @roydouek 2 years ago +2

    Your content is gold!

  • @isleep8519
    @isleep8519 2 years ago +2

    eager to see part 2.

  • @JR-iu8yl
    @JR-iu8yl 1 year ago +1

    Glad I found this channel

  • @jaewonlee8147
    @jaewonlee8147 1 year ago +1

    Keep up the great work man!

  • @shuntpics
    @shuntpics 1 year ago +1

    thank you so much for making these videos

  • @user-ne6rs9pv7p
    @user-ne6rs9pv7p 1 year ago +3

    Thank you for delivering the concept so clearly!!!

    • @Mutual_Information
      @Mutual_Information  1 year ago

      Happy to! Nice to see this very technical topic getting some love

  • @dhinas9444
    @dhinas9444 1 year ago +1

    You are a cool guy for this explanation, Mr. Mutual Information!

  • @py5050
    @py5050 1 year ago +3

    I am late to the party. But you deserve so much more credit and attention. This answered all the questions I had and then some.

    • @Mutual_Information
      @Mutual_Information  1 year ago

      The party is just getting started!
      This topic is one of my favorites, but it's quite technical, so there were never any expectations of getting a lot of attention. But those familiar with stats definitely appreciate it

  • @marcegger7411
    @marcegger7411 2 years ago +5

    Here because of mCoding and honestly I’m baffled. Amazing content, amazing lecture, your channel is going to blow up. All the best!

    • @Mutual_Information
      @Mutual_Information  2 years ago +1

      You mCoding folks are likely half my followers! What a bump :)

  • @kimchi_taco
    @kimchi_taco 10 months ago +3

    Is DJ practicing boxing? He naturally guards up constantly.

    • @Mutual_Information
      @Mutual_Information  10 months ago

      Lol I don't like my old videos because they have this awk-as-hell hand shit. I was listening to some garbage 'talk with your hands' advice for YouTubers.. ugh!

  • @abdulrhmanaun
    @abdulrhmanaun 3 months ago +1

    Thank you for your help

  • @Dupamine
    @Dupamine 2 years ago +1

    The details are crazy! But I guess the exponential family is not completed yet. I might come back here a few months later

  • @vyrgill
    @vyrgill 9 months ago +1

    Crazy good video, really! Thank you so much

  • @trinidadcisneros4339
    @trinidadcisneros4339 1 year ago +1

    Wow this was crystal clear! looking forward to watching your other videos thank you

    • @Mutual_Information
      @Mutual_Information  1 year ago +1

      And this is an old one! Glad you like it. I'm actually thinking about reshooting it..

    • @trinidadcisneros4339
      @trinidadcisneros4339 1 year ago

      @@Mutual_Information I'll be looking out for it if you do, keep up the great work!

  • @SohailKhan-zb5td
    @SohailKhan-zb5td 1 year ago +1

    Your explanation is the best on the internet

  • @ilyboc
    @ilyboc 2 years ago +2

    can't wait for part 2 :D

  • @roshinroy5129
    @roshinroy5129 1 year ago +1

    Brother, love from India!

  • @kylekuang1052
    @kylekuang1052 1 year ago +1

    mannn this is phenomenal

  • @camila_braz
    @camila_braz 2 years ago +2

    Thanks a lot!!!

  • @toducanh
    @toducanh 2 years ago +2

    Thank you

  • @benjaminpedersen9548
    @benjaminpedersen9548 8 months ago

    I think you can technically form uniform distributions within the family, it is just uninteresting since you have to choose your set beforehand and the distribution would be effectively parameterless: Choose h to be an indicator function of some set A and t to map everything to 0 or, even simpler, be zero-dimensional. This gives you probability density 1/nu(A) 1_A.
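
The construction in this comment can be written out in one line. This is a sketch assuming 0 < \nu(A) < \infty and the form p(x \mid \theta) = \frac{1}{Z(\theta)} h(x) \exp(\theta^\top t(x)) used elsewhere in this thread:

```latex
h(x) = \mathbf{1}_A(x), \quad t(x) = 0
\;\;\Longrightarrow\;\;
Z = \int \mathbf{1}_A(x)\, e^{0}\, d\nu(x) = \nu(A),
\qquad
p(x) = \frac{1}{\nu(A)}\, \mathbf{1}_A(x).
```

Since t(x) is identically zero, theta drops out entirely, which is why the resulting distribution is effectively parameterless.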

  • @parthkdoshi
    @parthkdoshi 2 years ago +2

    Can you please also do a video on deep exponential families?

  • @desjajjaden49
    @desjajjaden49 1 year ago +1

    U r my HERO!!!!

  • @user-qh8zx7zo2u
    @user-qh8zx7zo2u 9 months ago +1

    woah, this was awesome..

    • @Mutual_Information
      @Mutual_Information  9 months ago

      It's nice when someone can appreciate the harder videos.

  • @Thefare1234
    @Thefare1234 11 months ago +3

    If math stats books had added intuitive explanations like this we would have been able to solve climate change by now.

  • @frecklematt
    @frecklematt 2 years ago +1

    bruh how do you have so few views???? amazing videos!!

  • @user-jp6cc4qw2z
    @user-jp6cc4qw2z 2 years ago +1

    May I ask where or how did you develop these deeper intuitions for the topics discussed on your channel?

    • @Mutual_Information
      @Mutual_Information  2 years ago +2

      It mostly comes from reading about the same topic from multiple sources. When you get different perspectives, it's more likely you'll find a perspective that clicks. So I do that and then when I have a sense, I may write up a little experiment to play with the idea. In this case, I remember trying to come up with a generalized exponential family object and found that was too hard! Turns out, in the general case, it's hard to determine the set of thetas that yields a finite normalizer.. so that work got me more familiar with the family.

  • @II_superluminal_II
    @II_superluminal_II 7 months ago +1

    HOLY SHIT UR CHANNEL IS AMAZING BROTHER, just a CS grad interested in the secrets of AI. P=NP tho

  • @bluearctik3980
    @bluearctik3980 1 year ago +1

    Does it make sense to analogize the exponential family to overloaded functions in computer science? Something like: “given some inputs that determine the search space and sufficient statistics, return the appropriate distribution.” This is a very lucid explanation, by the way - I’ve bookmarked the video for future reference!

    • @Mutual_Information
      @Mutual_Information  1 year ago

      Thank you! And to answer your question, I don't quite see the analogy you're referring to in the first sentence, but the second sentence seems fair to me. If the inputs are the decisions for t(x), h(x) and v(x) and the function you're referring to searches for the theta* according to the data, then returning theta* would amount to returning "the appropriate distribution". So I think you're right

  • @user-or7ji5hv8y
    @user-or7ji5hv8y 2 years ago +2

    Wow, never would have guessed that they were somehow related. Is this the main family that matters or are there many more?

    • @Mutual_Information
      @Mutual_Information  2 years ago +1

      This one matters a lot! And I’m not familiar with any other classes of distributions that come close. A lot of theorems/tools are designed for the exponential family, just b/c they have such convenient properties

  • @Mohamed_Salah8
    @Mohamed_Salah8 6 months ago +1

    Sometimes it takes only one video to make people subscribe 😂.... thank you

    • @Mutual_Information
      @Mutual_Information  6 months ago

      I was actually thinking of re-shooting this one. I'm glad it still works for some!

  • @junhanouyang6593
    @junhanouyang6593 2 years ago +1

    Really good video that helped me understand the concept of the exponential family. However, I just have one small confusion. In the normal distribution case, why does t(x) need to be x and x^2? To me, having x already tells us the x^2 value, and thus knowing x should give us the probability distribution via theta. I know that to form the PDF for the normal distribution you need x and x^2. I just can’t find a good logic behind this.

    • @Mutual_Information
      @Mutual_Information  2 years ago

      I see your point. The way I would think about x and x^2 is they are measures of the same thing.. such that the log-probability depends (basically) **linearly** on these measures, using a fixed parameter vector.
      Let's say you want to only use x. Could you write the log of the normal density as a linear function of *only* x?? That's what you can't do... and that's why x^2 is needed as well.
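
To make this reply concrete, here is a worked expansion of the normal density. It is a sketch under one common convention, h(x) = 1 with the constant \sqrt{2\pi} absorbed into Z(\theta); the video may place that constant in h(x) instead.

```latex
\begin{aligned}
\mathcal{N}(x \mid \mu, \sigma^2)
  &= \frac{1}{\sqrt{2\pi\sigma^2}}
     \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \\
  &= \frac{1}{\sqrt{2\pi\sigma^2}\,\exp\!\left(\frac{\mu^2}{2\sigma^2}\right)}
     \exp\!\left(\frac{\mu}{\sigma^2}\,x - \frac{1}{2\sigma^2}\,x^2\right) \\
  &= \frac{1}{Z(\theta)}\, h(x)\, \exp\!\big(\theta^\top t(x)\big),
\end{aligned}
\qquad
\begin{aligned}
t(x) &= (x,\; x^2), \\
\theta &= \left(\frac{\mu}{\sigma^2},\; -\frac{1}{2\sigma^2}\right), \\
h(x) &= 1, \\
Z(\theta) &= \sqrt{2\pi\sigma^2}\,\exp\!\left(\frac{\mu^2}{2\sigma^2}\right).
\end{aligned}
```

Expanding the square produces both an x term and an x^2 term in the exponent, each with its own coefficient. A single statistic t(x) = x with one parameter cannot reproduce the x^2 term, so both statistics are needed.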

  • @SuperGanga2010
    @SuperGanga2010 1 year ago +1

    Nice lesson, but 4K please! Even if your camera is 1080p, the rendered math would look sharper.

    • @Mutual_Information
      @Mutual_Information  1 year ago

      Yes, that was a more recent change. I was less familiar with camera best practices when I shot this.

  • @Jacob011
    @Jacob011 2 years ago

    Is that a Gaussian process regression model on your desktop background?
    I appreciate the high-quality content and the visuals. I just wouldn't call this p(x|\theta) = 1/Z(\theta) h(x) exp(t(x)\theta) an equation, because asking for a solution of this doesn't make any sense.

    • @Mutual_Information
      @Mutual_Information  2 years ago +1

      Yep, that’s a GP and in fact that’s my next video coming out (sometime near the end of this week).
      And yea, you’re right. It’s an expression, not an equation. I’ll keep that in mind.
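
The expression discussed in this thread, p(x|theta) = 1/Z(theta) h(x) exp(t(x) theta), can be evaluated directly when x takes finitely many values. Below is a minimal sketch under that assumption; the function name `exp_family_pmf` and its arguments are hypothetical and not from the video. The usage lines recover a Bernoulli distribution by setting theta to the log-odds.

```python
import math


def exp_family_pmf(support, h, t, theta):
    """Return {x: p(x | theta)} with p(x | theta) = h(x) * exp(theta . t(x)) / Z(theta).

    Assumes a finite support, so Z(theta) is a plain sum.
    """
    weights = {
        x: h(x) * math.exp(sum(th * tx for th, tx in zip(theta, t(x))))
        for x in support
    }
    Z = sum(weights.values())  # the normalizer Z(theta)
    return {x: w / Z for x, w in weights.items()}


# Bernoulli with success probability 0.3: t(x) = (x,), h(x) = 1, theta = log-odds
p = 0.3
theta = (math.log(p / (1 - p)),)
pmf = exp_family_pmf([0, 1], h=lambda x: 1.0, t=lambda x: (x,), theta=theta)
print(pmf)
```

Swapping in a different support, h, or t gives a different member of the family without changing the function, which is the sense in which different choices yield different distributions.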

  • @martinschulze5399
    @martinschulze5399 4 months ago +1

    Those gestures xD

  • @lethalavidity
    @lethalavidity 2 years ago +1

    3Blue1Brown's little brother ;)

  • @OmegAtlAnt
    @OmegAtlAnt 2 years ago +1

    Suggestion: Set your playback speed to 0.75 if you don't like to feel like a dumb idiot

    • @Mutual_Information
      @Mutual_Information  2 years ago

      ha yea these older videos are too fast. Amateur mistake on my part, but the new stuff is better paced!

    • @OmegAtlAnt
      @OmegAtlAnt 2 years ago +1

      @@Mutual_Information I was feeling frustrated and commented a bit too harshly. It was a great explanation overall.

  • @keggluneq
    @keggluneq 10 months ago +1

    Love your content, but it would be nice if you could sloooooow it down a bit. You feed us a shit-ton of information in 15 minutes. I find that noob creators tend to unnecessarily zip through their material. I personally like the pace of 3b1b & Trefor Bazett.

  • @jacobcaurdy2997
    @jacobcaurdy2997 1 year ago

    I love your content and explanations, but I find you being side-by-side with the blackboard explanations very distracting. I think it’s gotta do with the hand gestures combined with intonations. Maybe that's just me. All love though, your content is superb.

    • @Mutual_Information
      @Mutual_Information  1 year ago

      I agree in fact. My new format (see my most recent video) makes the LaTeX more front and center, and that's how things will be going forward. It's all a work in progress

  • @undisclosedmusic4969
    @undisclosedmusic4969 2 years ago

    Cool video and all, but why are you wasting precious energy by having two monitors and a computer on idle while filming?

    • @Mutual_Information
      @Mutual_Information  2 years ago +1

      Ha never thought of that. If it makes any difference to you, I turn the AC and fridge off since they make noise. So maybe that nets out :)

    • @undisclosedmusic4969
      @undisclosedmusic4969 2 years ago

      @@Mutual_Information I don’t mind, perhaps your wallet and/or the planet do 😅

  • @welcomethanks5192
    @welcomethanks5192 10 months ago +1

    How do I map this form
    en.wikipedia.org/wiki/Exponential_family#:~:text=This%20yields%20the%20canonical%20form
    to your equation?

    • @Mutual_Information
      @Mutual_Information  10 months ago

      eta = theta and log Z(theta) = A(eta)

    • @welcomethanks5192
      @welcomethanks5192 10 months ago

      @@Mutual_Information Why not subtract A(eta)? How do you get 1/A(eta)? exp(x-a) = exp(x)/exp(a)
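
The two forms in this exchange agree once A is read as a log-normalizer, so the prefactor is 1/Z rather than 1/A. A short derivation, using the mapping given in the reply above (eta = theta and A(eta) = log Z(theta)) and the identity exp(x-a) = exp(x)/exp(a) from the last comment:

```latex
\begin{aligned}
p(x \mid \eta)
  &= h(x)\, \exp\!\big(\eta^\top t(x) - A(\eta)\big) \\
  &= h(x)\, \frac{\exp\!\big(\eta^\top t(x)\big)}{\exp\!\big(A(\eta)\big)} \\
  &= \frac{1}{Z(\theta)}\, h(x)\, \exp\!\big(\theta^\top t(x)\big),
\qquad \text{since } \exp\!\big(A(\eta)\big) = \exp\!\big(\log Z(\theta)\big) = Z(\theta).
\end{aligned}
```

Subtracting A(eta) inside the exponent and dividing by Z(theta) outside it are therefore the same operation.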