GLM Intro - 2 - Least Squares vs. Maximum Likelihood
- Published 15 Nov 2024
- What is the difference between the Least Squares and the Maximum Likelihood methods of finding the regression coefficients?
Corrections:
4:30 - I'm missing a 2 in the denominator; the coefficient should be -1/(2*sigma^2)
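For reference, this is the relevant part of the Gaussian log-likelihood (standard form, written out here rather than transcribed from the video):

$$\log L(\beta, \sigma^2) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_i - x_i^\top \beta\right)^2$$

Since -1/(2*sigma^2) is a negative constant with respect to beta, maximizing this over beta is the same as minimizing the sum of squares, which is exactly the LS/ML equivalence discussed in the video.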
Become a member and get full access to this online course:
meerkatstatist...
** 🎉 Special YouTube 60% Discount on Yearly Plan - valid for the 1st 100 subscribers; Voucher code: First100 🎉 **
“GLM in R” Course Outline:
Administration
* Administration
Up to Scratch
* Notebook - Introduction
* Notebook - Linear Models
* Notebook - Intro to R
Intro to GLM’s
* Linear Models vs. Generalized Linear Models
* Least Squares vs. Maximum Likelihood
* Saturated vs. Constrained Model
* Link Functions
Exponential Family
* Definition and Examples
* More Examples
* Notebook - Exponential Family
* Mean and Variance
* Notebook - Mean-Variance Relationship
Deviance
* Deviance
* Notebook - Deviance
Likelihood Analysis
* Likelihood Analysis
* Numerical Solution
* Notebook - GLM’s in R
* Notebook - Fitting the GLM
* Inference
Code Examples:
* Notebook - Binary/Binomial Regression
* Notebook - Poisson & Negative Binomial Regression
* Notebook - Gamma & Inverse Gaussian Regression
Advanced Topics:
* Quasi-Likelihood
* Generalized Estimating Equations (GEE)
* Mixed Models (GLMM)
Why become a member?
* All video content
* Extra material (notebooks)
* Access to code and notes
* Community Discussion
* No Ads
* Support the Creator ❤️
GLM (restricted) playlist: bit.ly/2ZMSv4U
If you’re looking for statistical consultation, someone to work on interesting projects, or training workshops, visit my website meerkatstatist... or contact me directly at david@meerkatstatistics.com
~~~~~ SUPPORT ~~~~~
Paypal me: paypal.me/Meer...
~~~~~~~~~~~~~~~~~
Jesus, man, those normal distribution drawings on the maximum likelihood line make you deserve the teacher of the year award.
Haha :-P
I swear they do make him deserve it. Thanks so much @Meerkat Statistics
I was copying down notes and had to redraw those like three times. This man is a true artist.
I've just watched two videos of yours and I am amazed at how you can make difficult statistical concepts sound so trivial. One of the best YouTube channels on statistics for sure!
Thank you! Heart warming!
I was always perplexed by the two different ways of viewing linear regression: a fixed viewpoint and a random viewpoint. Now everything is clear: the two viewpoints correspond to the OLS and MLE methods. What a wonderful explanation you provided!
An excellent explanation of an important concept. Good job!
Thank you!!! I had a hard time grasping it before I watched your video.
Very clear explanation. Thank you.
Very much appreciated the video!
Thank you for making this series
Thank you so much! Fantastic explanation!
well done, I really appreciate your work. thank you so much.
Thank you so much, brother
Another great explanation
Thank you so much, this was great!
So helpful, thank you!
Fabulous!
In practice, how do we know the distribution of y_i? Is this something we just assume, or test?
1) Does a histogram of our y variable tell us anything about the distribution of y_i? For example, if y "looks" normal in the histogram, is it possible that the y_i are still from a different distribution?
2) What about the other way around? Is it possible to have, let's say, a logarithmic-looking histogram of the y variable while the y_i are still from a normal distribution?
3) It looks like, in addition to the normal distribution assumption, we also have to assume mu_i = beta_0 + beta_1 * x_i for this to work. Or does the normal distribution assumption imply this? If yes, how?
These are good questions! :0 I'm going to have to ask my professor these.
In practice you would study the residual plots and run a couple of tests to look for heteroscedasticity.
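A minimal R sketch of those diagnostics (the data and model here are simulated purely for illustration):

```r
# Simulate a simple linear-model dataset (illustrative only)
set.seed(42)
x <- runif(100, 0, 10)
y <- 2 + 3 * x + rnorm(100, sd = 1)
fit <- lm(y ~ x)

# Residuals vs. fitted values: look for curvature or a funnel shape
plot(fitted(fit), resid(fit), xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Q-Q plot: checks the normality assumption on the residuals
qqnorm(resid(fit)); qqline(resid(fit))

# Breusch-Pagan test for heteroscedasticity (needs the lmtest package)
library(lmtest)
bptest(fit)
```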
I get that there is an equivalence algebraically, but I'm still struggling to see why this is conceptually true. What about the normal distribution makes these two methods equivalent? Does it have something to do with the residuals summing to zero in the linear case, i.e. there should be the same amount above the regression line as below it, and the normal distribution mimics this (even spread about the mean)?
Look at 6:50 - 7:30 again. Minimizing the least squares term is exactly like maximizing the ML term (the part that depends on the coefficients), because the two terms are identical except that the ML term has a minus sign in front of it: min F(x) = max -F(x). This only happens for the Normal distribution.
Yes, I think it is because the normal distribution's mean sits at the centre of the distribution, where the density is highest, essentially linking the squared distances to the likelihood. In other distributions, this need not be true.
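A quick R check of this equivalence on simulated data: least squares via lm() and maximum likelihood via glm() with a Gaussian family recover the same coefficients.

```r
# Simulated data (illustrative only)
set.seed(1)
x <- rnorm(200)
y <- 1 + 2 * x + rnorm(200, sd = 0.5)

fit_ls  <- lm(y ~ x)                        # least squares
fit_mle <- glm(y ~ x, family = gaussian())  # maximum likelihood

# Coefficient estimates agree up to numerical tolerance
cbind(LS = coef(fit_ls), MLE = coef(fit_mle))
all.equal(coef(fit_ls), coef(fit_mle))
```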
Very nice indeed
5:24 MLE vs LS
I really like the video, as it gives a different perspective than most of the other videos on YouTube.
I have some related questions:
1. When we are using the maximum likelihood approach in logistic regression, is the maximum likelihood estimator (MLE) assuming a Gaussian distribution of the log-odds identical to the MLE assuming a Bernoulli distribution of the outcomes?
2. Drawing parallels from this video: does the maximum likelihood approach in logistic regression assume that the mean of the Bernoulli distribution lies along the sigmoid, i.e. an S-shaped curve (as opposed to a straight line in linear regression)?
Thank you
In the x-y plot of the normal MLE, the model is a straight line because the mean of y depends linearly on the x_i. In GLMs, the mean is the inverse link function applied to the linear predictor: g(mu_i) = x_i^T * b, so mu_i = g^{-1}(x_i^T * b). So in a GLM your model equation is not necessarily a line; it might be a different curve. On this curve lie the distribution curves of each y_i (the parallel is: normal distributions on a line vs. whatever distribution on a curve).
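A small R illustration of that point (simulated data): in logistic regression the fitted means lie on the sigmoid curve mu_i = g^{-1}(x_i^T * b), not on a straight line.

```r
# Simulate Bernoulli data whose mean follows a sigmoid (illustrative only)
set.seed(7)
x <- rnorm(300)
p <- plogis(-0.5 + 1.5 * x)          # true means on the S-shaped curve
y <- rbinom(300, size = 1, prob = p)

# Bernoulli/binomial GLM with the default logit link
fit <- glm(y ~ x, family = binomial())

# Fitted means = inverse link applied to the linear predictor
eta <- coef(fit)[1] + coef(fit)[2] * x
all.equal(unname(fitted(fit)), unname(plogis(eta)))
```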
awesome
What would happen if you used a gamma distribution but with OLS instead of maximum likelihood?
So a General Linear Model using MLE is always exactly the same as least squares?
No. Only in the Normal distribution case...
@MeerkatStatistics Thanks for responding. What I think of as a General Linear Model is perhaps what you refer to as a Linear Model in your GLM intro 1 video? I.e. a Generalized Linear Model that only has normally distributed residuals? Or I may be mistaken.
@ronniefromdk In the Normal distribution case, which is the ordinary linear model, MLE and LS give the same result. If you move to the Generalized/General linear model, this is not the case for any other distribution.
I think, to be precise, it's the errors that follow a normal distribution, not the y_i.
As mentioned in replies to the previous video, if the error is normal, then the y's are as well, as they are a function of the error: y=xb+e (x and the true b's are considered constant)
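A tiny simulation of that point (all values chosen arbitrarily): for a fixed x and fixed coefficients, y = b0 + b1*x + e is just the normal error shifted by a constant, so it is normal too.

```r
set.seed(3)
x <- 2                      # a fixed covariate value
b <- c(1, 0.5)              # "true" coefficients (arbitrary)
e <- rnorm(10000, sd = 1)   # normal errors
y <- b[1] + b[2] * x + e    # y = constant + normal error

# y should be normal with mean b[1] + b[2]*x = 2 and sd = 1
c(mean = mean(y), sd = sd(y))
shapiro.test(sample(y, 500))  # normality test on a subsample
```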
I like your videos, but can you do better on the writing?
I appreciate the effort, but this video should not be any longer than three minutes.
You can just run it at 1.5 speed or so.
Do it better if you are able to!