Gaussian Mixture Models - The Math of Intelligence (Week 7)

  • Published 12 Jan 2025

COMMENTS • 244

  • @tomhas4442
    @tomhas4442 4 years ago +24

    3:44 Intro, Gaussian Distribution, Probability Density Function (PDF)
    7:38 GMM Intro
    9:08 Covariance matrix
    10:15 GMM Definition, K Gaussians
    11:30 How to apply GMM for classification
    12:30 Problem statement, Fitting a GMM model, Maximum Likelihood Estimate (MLE)
    13:58 Similarity to Kmeans clustering algorithm
    16:13 Expectation maximization (EM) algorithm and difference to Gradient Descent
    18:15 When to apply GMM, anomaly detection, clustering, object tracking
    19:30 Coding example with Python
    25:10 EM algorithm workflow in practice, Log Likelihood
    27:54 EM algorithm visual / walkthrough
    36:30 Summary
    great video, many Thanks :)

  • @jericklee8071
    @jericklee8071 6 years ago +4

    From a muddy blur to crystal clear in 30 min, thank you very much for this video Siraj

  • @jayce8978
    @jayce8978 7 years ago +32

    In case you get bad results using Gaussian mixtures, keep in mind that EM optimization only has local convergence properties, just like gradient descent: it can get stuck. Restarting the density estimation with other initial parameters might solve it! :)
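
    For example, a rough sketch with scikit-learn (not the code from the video; n_init restarts EM from several random initializations and keeps the run with the best log-likelihood):

        import numpy as np
        from sklearn.mixture import GaussianMixture

        X = np.random.randn(500, 1)  # stand-in for your data

        # 10 EM runs from different random starts; the best one is kept
        gmm = GaussianMixture(n_components=2, n_init=10, init_params='random').fit(X)
        print(gmm.lower_bound_)  # average log-likelihood of the best run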

  • @antonylawler3423
    @antonylawler3423 7 years ago +1

    Siraj. The depth and range of your knowledge still continue to amaze me.

  • @alinazari6563
    @alinazari6563 4 years ago +1

    I love how passionate you are about this

  • @jinitgandhi1197
    @jinitgandhi1197 5 years ago +3

    Suggestion at 6:45: the y values aren't the probabilities of the x values; intuitively, the probability of any single point under the Gaussian is 0.

  • @RoxanaNoe
    @RoxanaNoe 6 years ago +2

    I watch 4-5 of your videos per day. I'm learning generative models for drug design, Siraj. Watching your videos not only motivates me, it also makes my life & study fun and cool.

  • @CrazySkillz15
    @CrazySkillz15 6 years ago +2

    Thank you! Your videos helped me a lot... I was so lost and confused about this topic that I was on the verge of giving up. Checked out your tutorials that gave a lot of useful information and insights. Thanks a tonne! :) :D Keep up the good stuff

  • @asif7601
    @asif7601 3 years ago

    Very energetic presentation. Kept me attentive throughout the video. Hit the subscribe button 2 minutes in.

  • @getinenglish3472
    @getinenglish3472 4 years ago +1

    Wow! Finally I got my head around this subject. Well done and amazing teaching skills 👏🏻
    Andre

  • @BiranchiNarayanNayak
    @BiranchiNarayanNayak 6 years ago

    Very well explained..... I was lost while our college professor was explaining GMM and EM...

  • @idiocracy10
    @idiocracy10 6 years ago +15

    Warning: when he finger-styles his hair, get ready for a hardcore info dump.
    PS: 3blue1brown's series on linear algebra has THE BEST vid on eigenvector/eigenvalue pairs, no joking.

  • @mykle2069
    @mykle2069 7 years ago +32

    You're the best! You've helped turn this 19 year old from a lazy kid into an inspired workaholic

  • @vg6004
    @vg6004 7 years ago

    This is very helpful for my machine learning exam! Stay awesome, Siraj!

  • @spiderman7616
    @spiderman7616 7 years ago +1

    Hey Siraj!
    Just found your channel and it doesn't cease to amaze. I am learning a lot about AI and ML with your vibrant and enthusiastic expression. My 2 cents would be to talk a tiny bit slower but it is up to you. Congrats and Keep up the Good Work!

  • @slavko321
    @slavko321 7 years ago +1

    The quality of the audience is a reflection of the content :) Thank you for sharing and helping us understand complex subjects in an approachable way (and not dumbing it down :)

  • @I77AGIC
    @I77AGIC 7 years ago +3

    You are getting better and better at explaining these things, Siraj! Keep up the great work, you are helping a lot of people.

  • @browsertab
    @browsertab 6 years ago +37

    The butt kissing ends at 3:40

  • @hammadshaikhha
    @hammadshaikhha 7 years ago +10

    Siraj, I think it would have been helpful if you showed the resulting clusters that you get from the Gaussian mixture model approach on your data. You showed how to model your data using the Gaussian mixture, but I am unclear on how we get the specific clusters (say 2 clusters) from that.
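
    (Once the mixture is fitted, a common way to get hard clusters is to assign each point to the component with the highest posterior responsibility. A rough scikit-learn sketch, not the video's code:)

        import numpy as np
        from sklearn.mixture import GaussianMixture

        X = np.random.randn(500, 1)                  # stand-in for the video's data
        gmm = GaussianMixture(n_components=2).fit(X)

        resp = gmm.predict_proba(X)                  # responsibilities, shape (n, 2)
        labels = resp.argmax(axis=1)                 # hard cluster assignment per point
        # equivalently: labels = gmm.predict(X)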

  • @adarshsrivastava1074
    @adarshsrivastava1074 5 years ago +1

    Great video! Really helpful for data science students.

  • @pandawendao
    @pandawendao 7 years ago +10

    The iteration function is empty, which makes the current code completely random; it should be "mix.Mstep(mix.Estep())" inside that function, as in the sketch below.
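
    A rough sketch of the fix (assuming the notebook's GaussianMixture class really exposes Estep/Mstep as shown in the video):

        def iterate(self, N=1):
            # run one full EM update per pass instead of leaving the body empty
            for _ in range(N):
                self.Mstep(self.Estep())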

    • @Aureole62
      @Aureole62 5 years ago +4

      Like he understands that

  • @vinay1744
    @vinay1744 6 years ago

    Siraj, this is awesome, brother! You gave great reference links; exploring them gave me a full understanding of the concept.
    Rewatching the video after that made complete sense.
    Hope I find a job in ML and DL and can support you on Patreon.

  • @moorsyjam
    @moorsyjam 7 years ago +4

    I got pretty confused around 33:33 with the E step. You've computed wp1 and wp2, which is cool, and then normalised them so their sum is 1 [wp1/(wp1+wp2) + wp2/(wp1+wp2) = (wp1+wp2)/(wp1+wp2) = 1], which makes sense. You then add the log of this sum to self.loglike. But the log of 1 is 0... Which is where you lost me.

    • @emrahyigit
      @emrahyigit 7 years ago +1

      You are right! Siraj should check and fix that with YouTube annotations.

    • @茱莉-x2o
      @茱莉-x2o 2 years ago

      Agree

  • @ethereumnews3873
    @ethereumnews3873 6 years ago

    You are the best source for ML... thanks for your attention and love for AI!!!

  • @siddharthshah7767
    @siddharthshah7767 6 years ago

    Bruh you’re helping me pass my class. Thanks

  • @kabita2301
    @kabita2301 5 years ago +2

    hello, I know this video is a bit old (in internet years :D) but I wanted to leave my positive feedback. I found your video because I am preparing for an exam and your energy gave me that burst of motivation I needed just now. Also, your method was very didactic, you explained something very complex in an understandable and enjoyable manner. Thank you so much!
    Congratulations, best wishes to you!

  • @GugaOliveira70
    @GugaOliveira70 6 years ago +1

    Thank you very much! Your explanation is very good and educational! I'm recommending your channel to my friends too.

  • @마민욱-s9j
    @마민욱-s9j 7 years ago

    Thank you very much for the great video!! Siraj is the god of explanation.

  • @bosepukur
    @bosepukur 7 years ago

    Thank you Siraj for such amazing videos... you really are the best.

  • @rohanghige
    @rohanghige 6 years ago

    Such a good video that I clicked the like button 10 times :)

    • @singlesam41
      @singlesam41 6 years ago +1

      ended up with "no thumbs up" :P

  • @DosellIo1
    @DosellIo1 7 years ago +2

    Great series!!! It even helps me with my AI learning curve at Udacity. Thanks for it. Regards, Tibor

  • @tarekamineafir714
    @tarekamineafir714 6 years ago +1

    Really thanks man, your video helped me a lot with my hyperspectral image classification project.

  • @mathematicalninja2756
    @mathematicalninja2756 7 years ago

    3:45 Siraj, in my information theory class I was told that the Gaussian is the distribution which assumes the least about the data (it maximizes differential entropy for a given variance), so maybe you can include that in your explanation when someone asks why we assume a Gaussian distribution, apart from the central limit theorem.
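
    (For reference, the differential entropy of a Gaussian is h(X) = 0.5 * ln(2 * pi * e * sigma^2), and among all densities with variance sigma^2 this is the maximum, which is the "assumes the least" property mentioned above.)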

  • @MsSmartFox
    @MsSmartFox 4 years ago

    @Siraj, why do you change the formula at 29:54? Instead of sigma^2 you are using abs(sigma).

  • @ntimdomfeh1959
    @ntimdomfeh1959 5 years ago

    👏🏿👏🏿👏🏿👏🏿👏🏿👏🏿👏🏿👏🏿👏🏿👏🏿👏🏿👏🏿👏🏿👏🏿👏🏿 you are a very good teacher. Probably the best video so far on this topic.

  • @ngplradhika
    @ngplradhika 6 years ago +7

    Your accent reminds me of Mitchell from Modern Family(fav character) :')
    Also great video thanks!!

  • @011azr
    @011azr 7 years ago

    Those are really strong motivating words in the beginning :). Thanks.

  • @nicholascantrell1179
    @nicholascantrell1179 7 years ago

    At 4:35, it appears that the score is nonnegative. Although a Gaussian distribution is a close approximation in this case, could a log-normal distribution also be used in a Gaussian Mixture Model? Are there advantages to selecting a Gaussian distribution instead?

  • @kshiteejsheth9416
    @kshiteejsheth9416 7 years ago +1

    Hey Siraj! EM is a heuristic with no guarantees of global convergence. There have been recent algorithms based on the method of moments, random projections, etc. which provably recover the GMM under some assumptions.

  • @TechResearch05
    @TechResearch05 6 years ago

    Clearly explained the concept!!! Great presentation

  • @McMurchie
    @McMurchie 7 years ago +4

    Siraj never fails to inspire, and I agree with his point strongly - we are the most important community in the world today. We all have a common goal of making the world better with the best tech we have to offer. I for one am working on a universal translator, not just for spoken languages but for sign, braille and more. ML and NNs have moved my research forward by at least a decade.

  • @KarfontaAlec
    @KarfontaAlec 7 years ago +4

    Love the motivation at the start, preach!

  • @SubhojeetPramanik406
    @SubhojeetPramanik406 7 years ago +2

    When my friends ask me how to start with machine learning and AI, I tell them Siraj is the way to go! Thanks for making the AI community so cool! Yes we are the COOL GUYS!

  • @teamsarmuliadi6960
    @teamsarmuliadi6960 6 years ago

    You're the real man! Why didn't you come to Indonesia? We also have ML/DL community here. :) Anyway, thanks for your elaboration of GMM, it is indeed helpful and easy to understand. Cheers!

  • @vivilee7290
    @vivilee7290 7 years ago

    Love this video. It's presented so clearly.

  • @rafirahman6628
    @rafirahman6628 7 years ago

    Relating EM to K-means set off an epiphany in my mind. Thanks for that, it really helped clarify EM like it never did in school.

  • @mauropappaterra
    @mauropappaterra 5 years ago +1

    We love you Siraj

  • @prayanshsrivastava1810
    @prayanshsrivastava1810 6 years ago +2

    33:30
    wp1/(wp1+wp2) + wp2/(wp1+wp2) = 1
    log(wp1 + wp2) = log(1) = 0
    How is his model being trained?

    • @prizmaweb
      @prizmaweb 6 years ago

      You guess a theta (model params), and that gives you a probability distribution over the hidden variables. With that known, you maximize the joint probability of X and the hidden variables, which gives you a new theta. Repeat the two steps above, using the new theta model params instead of your guess.
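
      (A minimal runnable sketch of that loop for a 1-D, two-component mixture - toy code, not the notebook's implementation:)

          import numpy as np
          from scipy.stats import norm

          x = np.concatenate([np.random.normal(1, 1, 200),
                              np.random.normal(6, 2, 200)])      # toy 1-D data

          # initial guess for theta = (mixing weights, means, std devs)
          pi, mu, sigma = np.array([.5, .5]), np.array([0., 1.]), np.array([1., 1.])

          for _ in range(50):
              # E-step: responsibilities P(component k | x_i, theta)
              dens = pi * norm.pdf(x[:, None], mu, sigma)        # shape (n, 2)
              resp = dens / dens.sum(axis=1, keepdims=True)
              # M-step: re-estimate theta from the responsibilities
              nk = resp.sum(axis=0)
              pi, mu = nk / len(x), (resp * x[:, None]).sum(axis=0) / nk
              sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)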

    • @muhammadshahzaib9122
      @muhammadshahzaib9122 6 years ago

      We actually try to get the value of log(wp1 + wp2) =1 not (wp1 + wp2) to be 1.

  • @MorisonMs
    @MorisonMs 7 years ago

    You can use gradient descent: it's a standard maximization problem (likelihood).
    The variable here is denoted by theta, where theta (for a GMM) is the means, variances (covariance matrices) and the mixing probabilities for every Gaussian.
    There is nothing stochastic when you have the given data points, and it's no more complex a function than the loss of a network.

  • @rebiiahmed7836
    @rebiiahmed7836 7 years ago

    Hi Siraj Raval, we love you from Tunisia

  • @susmapant605
    @susmapant605 7 years ago

    Great presentation about GMM !! Thanks

  • @onefulltimeequivalent1230
    @onefulltimeequivalent1230 7 years ago +3

    6:45 "the y values are the probabilities for those x values"
    aren't the y values the probability density of the x values, since in a continuous range of x values, the probability for a single value x is 0? Or did I miss something?

    • @hammadshaikhha
      @hammadshaikhha 7 years ago

      Technically speaking you are indeed correct: the probability of any single point occurring on a continuous distribution such as the Gaussian is 0. The y-axis for a normal distribution is density, not probability. I think Siraj just mentioned "probability" as an intuitive way to think about it. We can still use the area under the Gaussian to compute the probability of getting a point in a small neighbourhood of x.
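
      (A quick sketch of the distinction, assuming scipy - the pdf value can exceed 1 because it is a density, while probabilities come from integrating it over an interval:)

          from scipy.stats import norm

          print(norm.pdf(0, loc=0, scale=0.1))                      # ~3.99: a density, not a probability
          print(norm.cdf(0.05, 0, 0.1) - norm.cdf(-0.05, 0, 0.1))   # P(-0.05 < X < 0.05), a probability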

    • @NM-jq3sv
      @NM-jq3sv 7 years ago

      I was wondering if someone would comment about that... Technically speaking it's probability density; we don't get a probability at a single point, only after integrating over an infinitesimally small interval around it.

  • @morakan9956
    @morakan9956 7 years ago

    Love the lecture style! Wish the topic covered the multivariate case as well.

  • @sanzeej91
    @sanzeej91 7 years ago

    Awesome work Siraj

  • @sandeepozarde2820
    @sandeepozarde2820 4 years ago +1

    Can you please control your moving hands data points? Too much distraction.

  • @BahriddinAbdiev
    @BahriddinAbdiev 6 years ago

    I have some questions:
    1. In the end, what have we achieved: a probability distribution over whether people keep playing the game?
    2. Could it cause overfitting if we use too many Gaussian distributions?
    Regards.

  • @valentinocostabile9314
    @valentinocostabile9314 7 years ago

    Great! You smartly resolved my doubts... thanks man =)

  • @mayurkulkarni755
    @mayurkulkarni755 7 years ago +1

    Super tutorial! Thank you so much!

  • @TheStartupKid
    @TheStartupKid 7 years ago +1

    I just loved the energy :D

  • @julioargumedo6722
    @julioargumedo6722 7 years ago

    Hey Siraj thank you. If you ever come to México, you'll have a room, a meal, a beer and a friend :)

  • @nehadureja
    @nehadureja 4 years ago

    Thank you. Very helpful video. :)

  • @ACLNM
    @ACLNM 4 years ago +1

    So... 38 minutes to predict something and he just forgets about the prediction part?
    I'm sorry, but the justification at 36:02 is not enough for my satisfaction.

  • @PabloMartinez-ut8on
    @PabloMartinez-ut8on 7 years ago

    You can visit us in Uruguay! Everyone is welcome in Uruguay and especially, people who motivate the world to be better, like you @siraj!

  • @Selahmescudi
    @Selahmescudi 5 years ago +1

    You are saving me in ML classes dude!
    Thanks a lot

  • @jignareshamwala3401
    @jignareshamwala3401 7 years ago

    5:45 +siraj "whether it's a car or roller coaster that's increasing in velocity reaches a peak then decreases or a soundwave... very likely a Gaussian distribution would be a great model"...????? Isn't the bell curve representative of the frequency of the data, not the data itself??

  • @gabrielcustodiodasilva
    @gabrielcustodiodasilva 7 years ago

    You are amazing, Siraj!

  • @andreeaciontos7090
    @andreeaciontos7090 6 years ago +2

    Thank you Siraj for the great video :D super informative! But damn, sometimes you are overdoing it with the body movements and gestures (16:45). Calm down, it makes it hard to focus!

  • @simonmandlik910
    @simonmandlik910 7 years ago +7

    Where do I get the dataset? It is not mentioned anywhere and is not in the GitHub repository either.

    • @imtryinghere1
      @imtryinghere1 5 years ago +3

      The dataset can be found at: raw.githubusercontent.com/brianspiering/gaussian_mixture_models/master/bimodal_example.csv

  • @hacademicabel
    @hacademicabel 5 years ago

    That was an amazing intro! Great videos man!

  • @nomercysar
    @nomercysar 6 years ago +1

    Thanks for reading theory to me. Couldn't do that by myself

    • @Arik1989
      @Arik1989 6 years ago

      I know you're being sarcastic, but honestly, I'm looking for people to do just that for me, I HATE reading technical material.

  • @mojiheydari
    @mojiheydari 4 years ago

    omg. I just discovered your channel..... sOOOOOOOOOOOO gOOOOOOOOOOOd

  • @kakolelouch5261
    @kakolelouch5261 7 years ago +1

    Hi Siraj, wonderful video! I am wondering: what is the difference between a Gaussian mixture model and the least-squares method from a data-fitting point of view?

  • @CarlosCosta-gs8rb
    @CarlosCosta-gs8rb 7 years ago

    Hi. Great again, Siraj. You're apparently the best at this online. Could we have a video about non-parametric estimation or higher-order statistics, perhaps ICA?

  • @chitralalawat8106
    @chitralalawat8106 5 years ago +1

    Here, are x1, x2... the vectors, or are they the data points of a vector x?

  • @brunoribeiro512
    @brunoribeiro512 6 years ago +2

    Great video. I tried running your code on my terminal and it's giving the error that 'GaussianMixture' object has no attribute 'loglike'; would you (or anyone, for that matter) happen to know why an error like this would occur? Thank you so much.

  • @chasegraham246
    @chasegraham246 7 years ago

    So the probability density function looks more intimidating than it really is. Thanks for explaining it. If you had to choose between a semester of linear algebra or statistics, which would you choose?

  • @bitvox
    @bitvox 7 years ago +2

    Hi, your videos are great! Please cover VGG, AlexNet, and others sometime.

  • @gokulprasad888
    @gokulprasad888 7 years ago

    Thanks Siraj, good one!!

  • @ego_sum_liberi
    @ego_sum_liberi 7 years ago

    Thank you for this great lecture and video...

  • @Vivekagrawal5800
    @Vivekagrawal5800 2 years ago

    Video starts at 03:40

  • @tensorhack5271
    @tensorhack5271 7 years ago

    Hi, I've been following this channel for a while now and love that you create different series. Can you make a small series of basic examples next, so it's easier to learn and get started? With one of your first videos I created an sklearn program that had 50 examples of fruit and car names, and with KNN I got pretty good results, but they are not perfect. Now I want to use deep learning for that and would love to see a series where you give different simple examples like this to compare and get started with the different libraries and algorithms. And yes, you created some beautiful similar content before, but it's not exactly that. Best wishes.

  • @boscojay1381
    @boscojay1381 5 years ago +1

    Hi Siraj, I appreciate your videos and I love your content. I'm working on a project on cross-matching using active learning; what advice would you have for me? I'm trying to build something scalable but not too computationally intense.

  • @shashankesh
    @shashankesh 7 years ago +3

    25:22 EM model

  • @Iceport
    @Iceport 4 years ago

    6:45 y is not the probability. y is the "likelihood" because the probability function is a pdf.

  • @mikkelgedehansen9479
    @mikkelgedehansen9479 4 years ago

    It would be nice to have timestamps, since it is quite impossible to find the bit of information about Gaussian mixture models that I was actually looking for...

  • @kjkunaljindal24
    @kjkunaljindal24 6 years ago

    I believe, the objective is to maximize the likelihood of observed data, not the observed data and the hidden variables.

  • @esakkiponraj.e5224
    @esakkiponraj.e5224 5 years ago +2

    Isn't wp1 + wp2 = 1 always... so self.loglike += log(wp1 + wp2) will be zero?
    Is that true, or is my assumption wrong?
    Kindly explain...

    • @ACLNM
      @ACLNM 4 years ago +1

      He makes mistakes... If only that was the only one... Referring to Variance as Variation... Doesn't know how a Standard Deviation is calculated... omg.

  • @JayanthBagare
    @JayanthBagare 7 years ago

    Hey @siraj, where are you going to be in India? Would love to catch up.

  • @alessandrorosati969
    @alessandrorosati969 2 years ago

    I have a problem with Gaussian mixture models: I don't know how to generate outliers uniformly in the p-parallelotope defined by the coordinate-wise maxima and minima of the 'regular' observations in R.

  • @bkovnkk6105
    @bkovnkk6105 6 years ago

    WE ARE "THE ONE" :) Regards from CN

  • @AddyKanhere
    @AddyKanhere 7 years ago

    Hey Siraj, Where will you be meeting folks in India?

  • @Abhitechno01
    @Abhitechno01 7 years ago

    It's always great and informative to watch and learn from your videos.
    My question is a non-technical one, but please do provide a solution...
    Question: I saw your GitHub profile, and I'm curious what filters you applied to your profile pic (dp)? :p
    PS: I already told you this question was going to be a non-technical one, and yes!!! you have been on my YouTube subscription list from the very beginning.
    Cheers!!!

  • @heathicusmaximus8170
    @heathicusmaximus8170 3 years ago

    Apple sends their hinge prototypes to this guy for testing. If this guy won't wear out hinges, who will?

  • @fuzzypenguino
    @fuzzypenguino 7 years ago

    Siraj's desktop background has the Sierra mountains, but doesn't macOS Sierra not work with TensorFlow and OpenAI and other machine learning stuff?

  • @rage0397
    @rage0397 5 years ago

    Loved the explanation. If I have to model 6 features instead of 2, and use a sliding window approach on my dataframe (I need to find the anomalous windows), how can I modify the weights and the rest of the code? Just looking for direction.

  • @negar21100
    @negar21100 4 years ago

    19:15 where are the links to those repositories?

    • @danielibanez1855
      @danielibanez1855 4 years ago

      You can find them in the notebook Siraj made for this video github.com/llSourcell/Gaussian_Mixture_Models/blob/master/intro_to_gmm_%26_em.ipynb

  • @redafekry3303
    @redafekry3303 4 years ago

    Could you please show an example on 3D data (XYZ points)?

  • @leodong6060
    @leodong6060 7 years ago

    Wondering if you would post the lecture notes/slides somewhere?

  • @zaphbeeblebrox5333
    @zaphbeeblebrox5333 3 years ago

    8:30 "x is the number of data points"? What are you talking about?!

  • @hemilysantos600
    @hemilysantos600 6 years ago

    Hi, how do I change the variance and mean of a Gaussian function in MATLAB? Can you show an example of what the code looks like?