GLM Intro - 2 - Least Squares vs. Maximum Likelihood

  • Published 15 Nov 2024
  • What is the difference between the Least Squares and the Maximum Likelihood methods of finding the regression coefficients?
    Corrections:
    4:30 - A 2 is missing in the denominator; the factor should be -1/(2*sigma^2)
    Become a member and get full access to this online course:
    meerkatstatist...
    ** 🎉 Special YouTube 60% Discount on Yearly Plan - valid for the 1st 100 subscribers; Voucher code: First100 🎉 **
    “GLM in R” Course Outline:
    Administration
    * Administration
    Up to Scratch
    * Notebook - Introduction
    * Notebook - Linear Models
    * Notebook - Intro to R
    Intro to GLM’s
    * Linear Models vs. Generalized Linear Models
    * Least Squares vs. Maximum Likelihood
    * Saturated vs. Constrained Model
    * Link Functions
    Exponential Family
    * Definition and Examples
    * More Examples
    * Notebook - Exponential Family
    * Mean and Variance
    * Notebook - Mean-Variance Relationship
    Deviance
    * Deviance
    * Notebook - Deviance
    Likelihood Analysis
    * Likelihood Analysis
    * Numerical Solution
    * Notebook - GLM’s in R
    * Notebook - Fitting the GLM
    * Inference
    Code Examples:
    * Notebook - Binary/Binomial Regression
    * Notebook - Poisson & Negative Binomial Regression
    * Notebook - Gamma & Inverse Gaussian Regression
    Advanced Topics:
    * Quasi-Likelihood
    * Generalized Estimating Equations (GEE)
    * Mixed Models (GLMM)
    Why become a member?
    * All video content
    * Extra material (notebooks)
    * Access to code and notes
    * Community Discussion
    * No Ads
    * Support the Creator ❤️
    GLM (restricted) playlist: bit.ly/2ZMSv4U
    If you’re looking for statistical consultation, someone to work on interesting projects, or training workshops, visit my website meerkatstatist... or contact me directly at david@meerkatstatistics.com
    ~~~~~ SUPPORT ~~~~~
    Paypal me: paypal.me/Meer...
    ~~~~~~~~~~~~~~~~~

COMMENTS • 42

  • @MeerkatStatistics
    @MeerkatStatistics  2 years ago

    Full course is now available on my private website. Become a member and get full access:
    meerkatstatistics.com/courses...

  • @verdi0756
    @verdi0756 3 years ago +67

    Jesus, man, those normal distribution drawings on the maximum likelihood line make you deserve the teacher of the year award.

    • @MeerkatStatistics
      @MeerkatStatistics  3 years ago +2

      Haha :-P

    • @vuyiswapriscadlamini6207
      @vuyiswapriscadlamini6207 3 years ago +1

      I swear they do make him deserve it. Thanks so much @Meerkat Statistics

    • @iya3952
      @iya3952 2 years ago

      I was copying down notes and had to redraw those like three times. This man is a true artist.

  • @pedrocolangelo5844
    @pedrocolangelo5844 3 years ago +12

    I've just watched two of your videos and I am amazed at how you can make difficult statistical concepts sound so trivial. One of the best YouTube channels on statistics, for sure!

  • @메호대전
    @메호대전 6 months ago

    I was always perplexed by the two different views of seeing linear regression as a fixed viewpoint and a random viewpoint. Now everything is clear: the two viewpoints correspond to the OLS and MLE methods. What a wonderful explanation you provided!

  • @jiesun31
    @jiesun31 3 years ago +7

    An excellent explanation of an important concept. Good job!

  • @fade-touched
    @fade-touched 3 years ago +1

    Thank you!!! I had a hard time grasping it before I watched your video.

  • @shambo9807
    @shambo9807 7 months ago

    Very clear explanation. Thank you.

  • @jec8303
    @jec8303 1 year ago

    Very much appreciated the video!

  • @rickyqiao3162
    @rickyqiao3162 3 years ago +2

    Thank you for making this series

  • @dinacula5058
    @dinacula5058 2 years ago

    Thank you so much! Fantastic explanation!

  • @MoosaHosseini
    @MoosaHosseini 2 years ago

    well done, I really appreciate your work. thank you so much.

  • @raltonkistnasamy6599
    @raltonkistnasamy6599 6 months ago

    thank u so much brother

  • @user-or7ji5hv8y
    @user-or7ji5hv8y 3 years ago

    Another great explanation

  • @brazilfootball
    @brazilfootball 3 years ago +1

    Thank you so much, this was great!

  • @vuyiswapriscadlamini6207
    @vuyiswapriscadlamini6207 3 years ago

    So helpful, Thank you!

  • @marcoantoniorocha9077
    @marcoantoniorocha9077 2 years ago

    Fabulous!

  • @imrul66
    @imrul66 3 years ago +3

    In practice, how do we know the distribution of y_i? Is this something we just assume, or test?
    1) Does a histogram of our y variable tell us anything about the distribution of y_i? For example, if y "looks" normal in the histogram, is it possible that the y_i are still from a different distribution?
    2) What about the other way? Is it possible to have a, let's say, logarithmic-looking histogram of the y variable while the y_i are still from a normal distribution?
    3) It looks like, in addition to the normality assumption, we also have to assume mu_i = beta_0 + beta_1 * x_i for this to work. Or does the normality assumption imply this? If yes, how?

    • @henri1_96
      @henri1_96 3 years ago +2

      These are good questions! :0 I'm going to have to ask my professor these.

    • @TheBjjninja
      @TheBjjninja 3 years ago +2

      In practice you would study the residuals chart and run a couple of tests to look for heteroscedasticity.
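A rough sketch of such a residual check in Python (illustrative only; the simulated dataset and the absolute-residual correlation diagnostic are my own assumptions, not something shown in the video):

```python
import numpy as np

# Simulated data with homoscedastic (constant-variance) errors
rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 500)
y = 1.0 + 0.5 * x + rng.normal(0, 1.0, 500)

# Fit by least squares and compute residuals
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# A simple heteroscedasticity check: does the residual *spread* grow with x?
# Correlate |residuals| with the predictor; it should be near zero here.
corr = float(np.corrcoef(x, np.abs(resid))[0, 1])
print(round(corr, 3))
```

A residuals-vs-fitted plot would show the same thing visually: a funnel shape (spread widening with x) would flag heteroscedasticity.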

  • @taylordaigle5944
    @taylordaigle5944 4 years ago +3

    I get that there is an equivalence algebraically, but I'm still struggling to see why this is conceptually true. What about the normal distribution makes these two methods equivalent? Does it have something to do with the residuals summing to zero in the linear case, i.e. there should be the same amount above the regression line as below it? And the normal distribution mimics this (even spread about the mean)?

    • @MeerkatStatistics
      @MeerkatStatistics  4 years ago +1

      Look at 6:50 - 7:30 again. Minimizing the least-squares term is exactly like maximizing the ML term that depends on the coefficients, because the two terms are identical except that the ML term has a minus sign in front of it: min F(x) = max -F(x). This only happens for the Normal distribution.
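The equivalence described in this reply can be checked numerically: for Gaussian errors, the negative log-likelihood in the coefficients is a constant plus SSR/(2*sigma^2), so its minimizer is the least-squares solution. A minimal Python sketch (the simulated data and true coefficients are arbitrary choices for illustration):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, n)
y = 2.0 + 3.0 * x + rng.normal(0, 1.5, n)  # linear model with Gaussian noise
X = np.column_stack([np.ones(n), x])

# Least squares: closed-form minimizer of the sum of squared residuals
beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

# Gaussian negative log-likelihood in the coefficients (sigma held fixed);
# equals a constant + SSR / (2 * sigma^2), hence the same minimizer as LS
def nll(beta, sigma=1.5):
    r = y - X @ beta
    return 0.5 * n * np.log(2 * np.pi * sigma**2) + np.sum(r**2) / (2 * sigma**2)

beta_ml = minimize(nll, x0=np.zeros(2)).x
print(np.allclose(beta_ls, beta_ml, atol=1e-3))
```

For any other error distribution the log-likelihood is not a linear function of the SSR, and the two estimators generally differ.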

    • @rikudoukarthik
      @rikudoukarthik 3 years ago +2

      Yes, I think it is because the normal distribution has its mean at the centre of the distribution, where the density is highest, essentially linking the squared distances to the likelihood. In other distributions, this need not be true.

  • @vijanth
    @vijanth 4 years ago

    Very nice indeed

  • @SphereofTime
    @SphereofTime 2 months ago

    5:24 MLE vs LS

  • @chandan4713
    @chandan4713 2 years ago

    I really like the video, as it gives a very different perspective from most other videos on YouTube.
    I have some related questions:
    1. When we are using the maximum likelihood approach in logistic regression, is the maximum likelihood estimator (MLE) assuming a Gaussian distribution of the log-odds identical to the MLE assuming a Bernoulli distribution of the probabilities?
    2. Drawing parallels from this video: does the maximum likelihood approach in logistic regression assume that the mean of the Bernoulli distribution lies along the sigmoid, i.e. an S-shaped curve (as against a straight line in linear regression)?
    Thank you

    • @nizogos
      @nizogos 3 months ago

      In the x-y plot for the normal MLE, the model is a straight line because the mean of y depends linearly on x_i. In GLMs, the mean equals the inverse link applied to the linear predictor: g(mu_i) = x^T b, so mu_i = g^-1(x^T b). So in a GLM your model equation is not necessarily a line; it might be a different curve, and on that curve lie the distribution curves of each y_i. The parallel is: normal distributions along a line vs. whatever distribution along a curve.
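The mean curves this reply describes can be computed directly: the linear predictor is always a straight line in x, and the inverse link bends it. A short Python sketch (the coefficients here are hypothetical, chosen only to illustrate the shapes):

```python
import numpy as np

# Hypothetical coefficients for illustration
b0, b1 = -2.0, 1.0
x = np.linspace(-5, 10, 7)
eta = b0 + b1 * x                  # linear predictor: a straight line in x

mu_identity = eta                  # identity link (ordinary linear model): mean on a line
mu_logit = 1 / (1 + np.exp(-eta))  # inverse logit (logistic regression): S-shaped, in (0, 1)

print(np.round(mu_logit, 3))
```

Plotting mu_logit against x gives the S-curve along which the Bernoulli means lie, answering the question above.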

  • @hex9219
    @hex9219 11 months ago

    awesome

  • @jackfarris3670
    @jackfarris3670 2 years ago

    What would happen if you used a gamma distribution but fit it with OLS instead of maximum likelihood?

  • @ronniefromdk
    @ronniefromdk 3 years ago +1

    So a General Linear Model using MLE is always exactly the same as least squares?

    • @MeerkatStatistics
      @MeerkatStatistics  3 years ago +1

      No. Only in the Normal distribution case...

    • @ronniefromdk
      @ronniefromdk 3 years ago

      @@MeerkatStatistics Thanks for responding. What I think of as a General Linear Model is perhaps what you refer to as a Linear Model in your GLM intro 1 video? I.e. a Generalized Linear Model that only has normally distributed residuals? Or I may be mistaken.

    • @MeerkatStatistics
      @MeerkatStatistics  3 years ago

      @@ronniefromdk In the Normal-distribution case, which is the ordinary linear model, MLE and LS give the same result. If you move to a Generalized/General linear model, this is not the case for any other distribution.

  • @thangnguyenminh7019
    @thangnguyenminh7019 1 year ago

    I think, to be precise, it is the error that follows a normal distribution, not y_i.

    • @MeerkatStatistics
      @MeerkatStatistics  1 year ago +1

      As mentioned in replies to the previous video, if the error is normal, then the y's are as well, since they are a function of the error: y = xb + e (x and the true b's are considered constant).
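The point in this reply can be seen with a quick simulation (a sketch; the numbers are arbitrary): adding the constant x^T b to a normal error just shifts its mean, so y inherits the normal shape.

```python
import numpy as np

rng = np.random.default_rng(1)
e = rng.normal(0.0, 2.0, 100_000)  # normal errors: mean 0, sd 2
xb = 5.0                           # fixed linear predictor for one observation
y = xb + e                         # y is the same normal, just shifted by xb

# Sample mean and sd of y should be close to 5.0 and 2.0
print(round(float(y.mean()), 2), round(float(y.std()), 2))
```

Since a constant shift of a normal random variable is still normal, y_i ~ N(x_i^T b, sigma^2) whenever e_i ~ N(0, sigma^2).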

  • @mexheix
    @mexheix 1 year ago

    I like your videos, but can you do better on the writing?

  • @xingyanglan6836
    @xingyanglan6836 3 years ago +1

    I appreciate the effort, but this video should not be any longer than three minutes.

    • @TalGalili
      @TalGalili 3 years ago

      You can just run it at 1.5 speed or so.

    • @OPT-TD
      @OPT-TD 2 years ago +1

      Do it better if you are able to!