L20.10 Maximum Likelihood Estimation Examples

  • Published 23 Apr 2018
  • MIT RES.6-012 Introduction to Probability, Spring 2018
    View the complete course: ocw.mit.edu/RES-6-012S18
    Instructor: John Tsitsiklis
    License: Creative Commons BY-NC-SA
    More information at ocw.mit.edu/terms
    More courses at ocw.mit.edu

COMMENTS • 35

  • @ireneisme8747
    @ireneisme8747 4 years ago +17

    Thank you for this amazing video!! But I have a quick question: why are you trying to MINIMIZE the negative of that function instead of directly MAXIMIZING it?

    • @thiagoneubauer5190
      @thiagoneubauer5190 4 years ago +9

      I think this is because many optimization algorithms are built to minimize functions rather than maximize them. Thus, it's better to put our equations in that form so they can be more easily plugged into an optimization routine.

    • @carlospinzoncarrera9246
      @carlospinzoncarrera9246 4 years ago +12

      Just note that maximizing -f(x) is equivalent to minimizing f(x). When you apply the logarithm to the likelihood function, you get an expression with only negative terms, so the professor multiplies the whole log-likelihood by (-1) so that the negative log-likelihood has only positive terms. That's why he minimizes this function (all positive terms) rather than maximizing the original (all negative terms). Hope this helps. (See the sketch after this thread.)

    • @mhdadk
      @mhdadk 4 years ago +1

      Hi, the real reason has to do with the Kullback-Leibler divergence. More details here:
      wiseodd.github.io/techblog/2017/01/26/kl-mle/

    • @jebhank1620
      @jebhank1620 2 years ago

      @@carlospinzoncarrera9246 Very helpful, cheers!
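
To make the minimize-versus-maximize point concrete, here is a minimal Python sketch (an illustration under assumed data, not anything from the video): the Bernoulli parameter is fit by handing the negative log-likelihood to a numerical minimizer, since routines like scipy.optimize.minimize_scalar only minimize.

    import numpy as np
    from scipy.optimize import minimize_scalar

    # Hypothetical data: 10 coin tosses, 1 = heads, 0 = tails.
    tosses = np.array([1, 1, 0, 1, 0, 1, 1, 0, 1, 1])

    def neg_log_likelihood(theta):
        # Negative Bernoulli log-likelihood; minimizing this is
        # exactly the same as maximizing the log-likelihood.
        return -np.sum(tosses * np.log(theta) + (1 - tosses) * np.log(1 - theta))

    result = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 1 - 1e-6), method="bounded")
    print(result.x)       # ~0.7, matching the sample fraction of heads
    print(tosses.mean())  # closed-form MLE, for comparison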

  • @anangelsdiaries
    @anangelsdiaries 1 year ago +2

    For the binomial case, why is it that we don't take a product for the likelihood function?
    (Is it because there's only one observation?)
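
One way to see it: the binomial PMF is already the product of n i.i.d. Bernoulli likelihoods, collapsed by counting heads. Written out (a standard identity, not taken from the thread):

    $$\prod_{i=1}^{n} \theta^{x_i}(1-\theta)^{1-x_i} = \theta^{k}(1-\theta)^{n-k}, \qquad k=\sum_{i=1}^{n} x_i,$$

and the binomial coefficient $\binom{n}{k}$ is constant in $\theta$, so including or dropping it does not change the maximizer.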

  • @nicolasbourbaki1872
    @nicolasbourbaki1872 4 years ago +20

    Blasphemous negligence of the chain rule when taking that first derivative to find the maximum-likelihood estimate of the mean in the second example. Good thing the difference is symmetric!
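
For reference, the chain-rule step the comment alludes to (a reconstruction; the slide itself is not shown here):

    $$\frac{\partial}{\partial\mu}(x-\mu)^2 = -2(x-\mu),$$

the dropped constant factor doesn't move the root of the first-order condition, which is why the final estimate comes out right anyway.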

  • @oneaboveall8190
    @oneaboveall8190 5 months ago +1

    Can someone expand the formula to show how he got rid of the exponential while taking the log of exp{-(x - u)^2 / 2v}?

    • @nibrad9712
      @nibrad9712 3 months ago

      Since it's the natural log, taking the log of the exponential of "something" just gives you back that "something".
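
Written out (a reconstruction of the step being asked about, using the Gaussian density from the lecture):

    $$\ln\!\left(\frac{1}{\sqrt{2\pi v}}\, e^{-(x-\mu)^2/(2v)}\right) = -\tfrac{1}{2}\ln(2\pi v) - \frac{(x-\mu)^2}{2v},$$

because $\ln$ and $e^{(\cdot)}$ are inverse functions, so $\ln e^{z} = z$.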

  • @AakarshNair
    @AakarshNair 2 years ago +8

    OMFG this guy is amazing

  • @m.preacher2829
    @m.preacher2829 3 years ago +4

    Why can we directly use the PMF of the binomial to calculate the ML estimate, instead of building the likelihood function as a product like in the second example?

    • @mallakbasheersyed1859
      @mallakbasheersyed1859 3 years ago +1

      Hi, we can use the same approach for the first one, but in the end we'd arrive at the same thing. Consider the case of heads and tails in 5 tosses, where you know the outcome is HHTTH (given) and want to use ML to estimate the probability of heads. Since the tosses are independent events, the likelihood is the product of the per-toss probabilities, so we end up with theta^3 * (1 - theta)^2. Differentiating, setting it equal to 0, and solving gives the probability of heads as 3/5, which is obviously right (it agrees with the usual method). The same thing happens in the first problem, except that order doesn't matter there, so counting the permutations puts an nCk factor in front (which we can drop, since it doesn't depend on theta). See the sketch after this thread.

    • @m.preacher2829
      @m.preacher2829 3 years ago

      @@mallakbasheersyed1859 I read up on the theory of the likelihood function; I think at the time I had misunderstood something about it. Anyway, thanks for the reply.
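
A quick check of that worked example in Python (illustrative only; the sequence HHTTH and the likelihood theta^3 * (1 - theta)^2 come from the comment above):

    import sympy as sp

    theta = sp.symbols("theta", positive=True)
    # Likelihood of the specific sequence HHTTH: three heads, two tails.
    likelihood = theta**3 * (1 - theta)**2
    # First-order condition: set the derivative to zero and solve.
    critical_points = sp.solve(sp.diff(likelihood, theta), theta)
    print(critical_points)  # the interior root theta = 3/5 is the MLE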

  • @clydexu2599
    @clydexu2599 3 years ago +3

    Having watched the video, it's still not clear to me why the optimal value turns out to maximize the likelihood function.

    • @oneandonlyflow
      @oneandonlyflow 2 years ago +2

      Think of the Pk expression at the very top as a curve. If you differentiate it and set the derivative to 0, you get the location of a maximum or minimum. Rearranging to solve for theta then tells you which value of theta gives that max or min (checking the second derivative or the endpoint values confirms it really is a maximum).
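
Continuing the sympy sketch from the earlier thread (same hypothetical setup), the second-derivative check looks like this:

    import sympy as sp

    theta = sp.symbols("theta", positive=True)
    likelihood = theta**3 * (1 - theta)**2
    # A negative second derivative at the critical point theta = 3/5
    # confirms it is a local maximum, not a minimum.
    second_deriv = sp.diff(likelihood, theta, 2)
    print(second_deriv.subs(theta, sp.Rational(3, 5)) < 0)  # True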

  • @jovialjoe_
    @jovialjoe_ 5 months ago

    9:09 How come you can cancel out the 2 and a v? They're both denominators.

    • @quest1606
      @quest1606 2 months ago

      That they're both in the denominator is exactly why they can be cancelled: the same factors appear in every term, so dividing them out doesn't change where the expression equals zero.
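
A plausible reconstruction of the step at 9:09 (the slide itself isn't reproduced here): differentiating the log-likelihood with respect to $\mu$,

    $$\frac{\partial}{\partial\mu}\sum_{i=1}^{n}\left(-\frac{(x_i-\mu)^2}{2v}\right) = \sum_{i=1}^{n}\frac{2(x_i-\mu)}{2v} = \sum_{i=1}^{n}\frac{x_i-\mu}{v},$$

the chain rule brings down a factor of 2 that cancels the 2 in the denominator, and setting the sum to zero lets you multiply through by $v$, removing it as well.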

  • @mehrdadkazemi3969
    @mehrdadkazemi3969 2 years ago

    thank you

  • @porterchien4782
    @porterchien4782 4 months ago

    Why is the variance under the root? Shouldn't it be outside the root(2pi)?

    • @maoam-im7lc
      @maoam-im7lc 2 days ago

      Maybe a bit late, but I think here they set v = sigma^2, so it is the same as putting sigma outside the root. The variance v here is what you get from E[x^2] - E[x]^2, i.e. sigma^2.
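
Spelled out (using the standard Gaussian density; the v = sigma^2 substitution is from the reply above):

    $$\frac{1}{\sqrt{2\pi v}} = \frac{1}{\sqrt{2\pi\sigma^2}} = \frac{1}{\sigma\sqrt{2\pi}},$$

so writing the variance under the root is equivalent to the more familiar form with $\sigma$ outside.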

  • @husseinsleiman6119
    @husseinsleiman6119 2 years ago +1

    Hello, thanks for the effort. I think you made a mistake when you minimized w.r.t. v: the denominator in the sum term must be 4v^2.

    • @husseinsleiman6119
      @husseinsleiman6119 2 years ago +1

      But it doesn't matter; the answer you reach will be the same.
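
For reference, the first-order condition in $v$ works out as follows (a standard derivation of the variance MLE, not a transcription of the slide; whatever constant appears in the intermediate step, the maximizer is unchanged):

    $$\frac{\partial}{\partial v}\left[\frac{n}{2}\ln(2\pi v) + \sum_{i=1}^{n}\frac{(x_i-\mu)^2}{2v}\right] = \frac{n}{2v} - \sum_{i=1}^{n}\frac{(x_i-\mu)^2}{2v^2} = 0 \quad\Longrightarrow\quad \hat{v} = \frac{1}{n}\sum_{i=1}^{n}(x_i-\hat{\mu})^2.$$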

  • @drakoumell
    @drakoumell 2 years ago

    Yo, this guy is awesome, he sounds like Junior from Kim Possible.

  • @seniormuchacho1868
    @seniormuchacho1868 4 years ago +2

    What does that K letter mean?

    • @thiagoneubauer5190
      @thiagoneubauer5190 4 years ago +3

      The number of heads obtained in our experiment. The probability of having exactly k successes (our number of heads) is given by the probability function shown on the slide.
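
That probability function is presumably the binomial PMF (standard form; the slide itself isn't reproduced here):

    $$P(K=k) = \binom{n}{k}\,\theta^{k}(1-\theta)^{n-k}, \qquad k = 0, 1, \dots, n,$$

where $\theta$ is the probability of heads on a single toss and $n$ is the number of tosses.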

  • @pablock0
    @pablock0 1 year ago

    niiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiice I got it

  • @kavourakos
    @kavourakos 2 years ago

    Yannis is great

  • @tommys4809
    @tommys4809 3 months ago

    Micromasters lfg

  • @yaweli2968
    @yaweli2968 4 years ago +10

    If you dislike this video, I'm sorry, you're just weak at advanced maths/statistics. It's not the MIT professor's fault; this isn't for everybody.

    • @jiangchengyu3205
      @jiangchengyu3205 3 years ago +45

      No need to brag about superiority, even though the video truly is as good as can be.

    • @henri1_96
      @henri1_96 3 years ago +11

      oh wow you must be so smort

    • @dijay7821
      @dijay7821 3 years ago +12

      Yeah, it is not for arrogant pricks like you.

  • @yarenlerler67
    @yarenlerler67 1 year ago

    thank you