6. Maximum Likelihood Estimation (cont.) and the Method of Moments
- Published 15 Oct 2024
- MIT 18.650 Statistics for Applications, Fall 2016
View the complete course: ocw.mit.edu/18-...
Instructor: Philippe Rigollet
In this lecture, Prof. Rigollet continued discussing maximum likelihood estimators and covered the Weierstrass Approximation Theorem (WAT) and a statistical application of the WAT.
License: Creative Commons BY-NC-SA
More information at ocw.mit.edu/terms
More courses at ocw.mit.edu
I've never taken a regular basic statistics course, and it takes me literally a day to fully understand one lecture video. But as the instructor said, I feel much smarter after taking this lecture.
I'd like to say, this video is the best I've ever seen. The instructor's thinking is so clear that he can relate all the critical notions together and paint vivid pictures for us in just a few words.
Method of Moments starts at 32:36
Thank you !
Thanks
@28:50 Why would there be a square root of 2 pi there? I don't get the significance of what he is saying when there are no fudge factors and this is the true asymptotic variance. Why would there be any of that?
@ 19:20, the dotted curve represents our ESTIMATOR for KL(theta, theta*), whereas the solid line is the actual KL(theta, theta*); the values theta and theta* are the minimum points of the estimator and the actual KL divergence, respectively. Can you guys help me verify if I understood correctly? Is the dotted line something else? Or did I interpret the solid line incorrectly? Please help me out here.
Yes, that is what I understood as well. The point of him drawing these two lines was basically to illustrate that if the curve has a very flat bottom, then even if you somehow manage to find the min of the estimator, there is still a chance that you end up pretty far away from the actual parameter theta star.
how is the fisher information used in modern machine learning - especially in practice?
How does his theorem at 30:55 imply that the MLE is just going to be an average?
Would have been nice to put in the description or title that this lecture focuses on the Fisher Information (Matrix), to make it easier to search. I honestly don't know how or why I found this, especially since it was at the bottom of my search results. Relevant MIT videos should be at the top.
17:10 The word is his name Rigollet in French
46:29 The next-to-last row of the matrix on the left side should be x_1^(r1-1), x_2^(r1-1), etc., instead of r-1.
At 41:23 he says that it's actually enough to look only at terms of the form X^k — why is that enough?
Hi, Adam. I hope this answer suits you well.
The reason terms of the form X^k suffice is "linearity". The operation of taking an average is linear, meaning you can pull the constants out.
It is the same reason why constants can "escape" an integral.
If E is the expectation, and there's a polynomial a_0 + a_1 X + a_2 X^2 + ... + a_n X^n, its expectation is
E ( a_0 + a_1 X + a_2 X^2 + ... + a_n X^n ) = a_0 + a_1 E( X ) + a_2 E ( X^2 ) + ... + a_n E ( X^n ).
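This is easy to check numerically. A minimal sketch (my own example, with made-up coefficients, not from the lecture): averaging the polynomial directly gives the same answer as combining the empirical moments linearly, because both are the same sum rearranged.

```python
import random

random.seed(0)

# Sample X ~ Uniform(0, 1) and estimate E[a0 + a1*X + a2*X^2] two ways.
n = 100_000
xs = [random.random() for _ in range(n)]
a0, a1, a2 = 2.0, -3.0, 5.0

# Way 1: average the polynomial directly.
lhs = sum(a0 + a1 * x + a2 * x * x for x in xs) / n

# Way 2: combine the estimated moments E[X] and E[X^2] linearly.
m1 = sum(xs) / n
m2 = sum(x * x for x in xs) / n
rhs = a0 + a1 * m1 + a2 * m2

# Same sums, just rearranged — linearity of the average.
print(abs(lhs - rhs) < 1e-9)
```

So knowing how to estimate each E(X^k) is all you need to estimate the expectation of any polynomial in X.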
@@owenmireles9615 Ah that's right indeed, thank you!
@@owenmireles9615 @ 19:20, the dotted curve represents our ESTIMATOR for KL(theta, theta*), whereas the solid line is the actual KL(theta, theta*); the values theta and theta* are the minimum points of the estimator and the actual KL divergence, respectively. Can you guys help me verify if I understood correctly? Is the dotted line something else? Or did I interpret the solid line incorrectly? Please help me out here.
@@jaspreetsingh-nr6gr Hi, Jaspreet.
Your interpretation seems correct. I'll just emphasize some parts which I think weren't covered in as much detail in the lecture.
That's right, the dotted line represents the estimator for the KL divergence.
However, the relationship between theta and theta* is more subtle... there's a bit more going on.
Throughout the video, they mention that theta* is the true parameter that you're trying to find. To do this, you'd like to minimize a function. That function would be f(X) = KL(P_theta*, P_X). In words, you want to find the parameter X that is the "closest" (under KL divergence) to theta*. The graph of this f(X) is the solid line in the video.
If you had perfect information, then obviously theta* is that minimizer.
However, under real-world conditions, you never have perfect data, and have to resort to an approximation, that being Hat(KL). So, what you're actually trying to minimize now is g(X) = Hat(KL) (P_theta*, P_X). The graph of this g(X) is the dotted line in the video.
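A tiny numerical sketch of the dotted line vs. the solid line (using a Bernoulli model for concreteness — my own stand-in, not the lecture's example): minimizing the empirical curve g, i.e. the average negative log-likelihood, gives an estimate that lands near, but not exactly at, theta*.

```python
import math
import random

random.seed(1)

# True parameter theta* for a Bernoulli model.
theta_star = 0.3
n = 10_000
ones = sum(1 for _ in range(n) if random.random() < theta_star)

def neg_avg_log_lik(theta):
    # hat(KL)(P_theta*, P_theta) up to a constant not depending on theta:
    # -(1/n) * sum_i log p_theta(X_i).  This is the "dotted line".
    return -(ones * math.log(theta) + (n - ones) * math.log(1 - theta)) / n

# Minimize the estimated curve over a grid of candidate parameters.
grid = [k / 1000 for k in range(1, 1000)]
theta_hat = min(grid, key=neg_avg_log_lik)
print(theta_hat)
```

With a flat bottom (small n, or a less informative model), theta_hat can sit noticeably far from theta* even though it exactly minimizes the dotted curve.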
@@owenmireles9615 Understood: using data (for the sample mean), the guarantees given by the LLN and by continuous functions under the LLN ensure hat(KL) reasonably approximates the KL divergence. Thanks Owen, will ping you again if I get stuck on subsequent lectures.
Thank you very much.
Fisher proof is awesome!
41:04
- moment: the expectation of a power, i.e. the k-th moment is E[X^k]
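A minimal sketch of how this turns into an estimator (a hypothetical Exponential example of my own, not from the lecture): estimate the moment by the sample average of X_i^k, then invert the moment map to recover the parameter.

```python
import random

random.seed(2)

# Method of moments for Exponential(lam): E[X] = 1/lam,
# so inverting the first moment gives lam_hat = 1 / mean(X).
lam = 2.0
n = 50_000
xs = [random.expovariate(lam) for _ in range(n)]

m1 = sum(xs) / n      # first empirical moment: (1/n) * sum X_i
lam_hat = 1.0 / m1    # plug the estimated moment into the inverse map
print(lam_hat)
```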
What does the support of P_theta mean, please?
This is way too advanced for me. I can understand the calculus, but when he starts talking about convergence in probability and distribution, I get really lost. Can anyone point me to a book where I can get a better understanding of these topics of inference and convergence?
asymptotic theory?
www.stat.cmu.edu/~siva/705/lec4.pdf
www.stat.cmu.edu/~siva/705/lec5.pdf
www.stat.cmu.edu/~siva/705/lec6.pdf
I found these helpful!
Try Wasserman's "All of Statistics"; it's pretty concise and straightforward, and designed for people coming in from other fields.
@@SrEstroncio So true, I was gonna say the same thing; it explains them very well and in detail.
I have now a clear idea of Fisher
22:50
That was a Harry Potter on a broom entry!
What a doozy. Great lecture.
damn it's hard
I hate when teachers go "who doesn't know this? Go and read about it." LOL
agree
He's so bad at cleaning the board omg
He has a broken leg and MIT has staff that come in and clean after each lecture.
@@imtryinghere1 I honestly think it's more the eraser than his lack of skill
I literally searched "oof moment"