The Beta distribution in 12 minutes!

  • Published 2 Jun 2024
  • This video is about the Beta distribution, a very important distribution in probability, statistics, and machine learning. It is explained using a simple example involving flipping coins.
    One armed bandits video: • Thompson sampling, one...
    Useful links:
    Bayes theorem video: • Naive Bayes classifier...
    Beta distribution (3blue1brown)
    • Why “probability of 0”...
    • Binomial distributions...
    Grokking Machine Learning Book: www.manning.com/books/grokkin...
    40% discount promo code: serranoyt
    Machine Learning Testing and Error Metrics
    www.youtube.com/watch?v=aDW44...
    Announcement: Book by Luis Serrano! Grokking Machine Learning. bit.ly/grokkingML
    40% discount code: serranoyt
  • Science & Technology

COMMENTS • 86

  • @jayraldbasan5354
    @jayraldbasan5354 2 years ago +26

    Thank you for this! I've been waiting for 3blue1brown's 3rd part of the probabilities-of-probabilities series. There are a lot of very comprehensive videos on the beta distribution out there, but this has been the most intuitive so far.

    • @felipec
      @felipec 1 year ago +4

      This is exactly the same reason I came here.

  • @srinivasanbalan2469
    @srinivasanbalan2469 3 years ago +2

    Kudos, Dr. Serrano. You are an excellent teacher, as always! Looking forward to the Thompson sampling video.

  • @sidharthdash2666
    @sidharthdash2666 2 years ago +4

    To Luis Serrano: the most humble and articulate teacher I know so far. I am a self-taught data scientist, learning through videos, blogs, and books, with no academic training in this field. I wish I could roll back time and start learning by watching your videos. You are an institution yourself. I would like to request a series of videos on optimization in general.

  • @rush19772112
    @rush19772112 2 years ago +1

    Dr. Serrano, thank you for your well-explained video. Watching your videos, I understood more in 1 hour than I had in months of studying. That's a talent, and thank you so much for your analysis. EVERYTHING IS MUCH APPRECIATED

  • @wootcrisp
    @wootcrisp 3 years ago +14

    Very good. I aspire to teach with your level of clarity and kindness.

  • @davegoodo3603
    @davegoodo3603 7 months ago +2

    Thank you Luis, that was a very good video. Clearly presented, easy to follow, it has helped me enormously in understanding this distribution. I'm studying Bayesian Statistics so it has helped me with that as well. I've subscribed, so I'll look at other videos in your collection. Much appreciated.👍

  • @kylepalmer6936
    @kylepalmer6936 2 years ago

    This is really quite good; very clear explanation of a complex concept. Please follow through with a video on the gamma function.

  • @bobbyc1120
    @bobbyc1120 2 years ago +22

    For years I've been trying to understand the beta distribution. I'm out of college and studying for a probability exam, and it finally makes sense thanks to your video.

    • @mazmillion451
      @mazmillion451 9 months ago

      out of college and taking an exam

    • @Kornackifs
      @Kornackifs 3 months ago

      ​@@mazmillion451
      Mutually exclusive events

  • @mantidream8179
    @mantidream8179 2 years ago +2

    Great video, you've got a real knack for teaching

  • @sdsa007
    @sdsa007 11 months ago

    I went the extra mile to learn the beta and gamma integrals! But that was just to understand what you explained soooo very well! Thank you!

  • @user-uv4cu4hy7j
    @user-uv4cu4hy7j 2 months ago

    Thank you so much for the video; this is the best beta distribution explanation I've ever seen. I wonder if there is a gamma distribution introduction video planned?

  •  2 years ago

    A very neat way to introduce this topic. Thanks sir

  • @allisonal
    @allisonal 6 months ago

    Clear and methodical exposition. Thank you!

  • @testme2026
    @testme2026 1 year ago

    This is crazy; top professors cannot explain as well as this guy! Thank you.

  • @ihgnmah
    @ihgnmah 2 years ago +6

    For those who don't know why we use the Beta function as the normalizing factor:
    We need to sum up all the probabilities of all the coins. Since we have an infinite number of coins, that summation becomes the area under the distribution curve. In calculus, the area under the curve is given by the integral of the function of that curve. In this case, the function is x^(q-1)*(1-x)^(r-1) (q=8, r=4 for the example in the video; substituting them into the equation, you get x^7*(1-x)^3). The integral of x^(q-1)*(1-x)^(r-1) from 0 to 1 (x is a probability, so it can only take values from 0 to 1) is exactly the Beta function.
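
    A minimal numerical sketch of this normalization (assuming Python with SciPy; a=8, b=4 correspond to the 7-heads / 3-tails example discussed above):

    ```python
    # The normalizing constant for x^(a-1) * (1-x)^(b-1) on [0, 1] is exactly B(a, b),
    # so dividing by it turns the curve into a proper density (the Beta pdf).
    from scipy.integrate import quad
    from scipy.special import beta

    a, b = 8, 4  # 7 heads + 1, 3 tails + 1

    unnorm = lambda x: x**(a - 1) * (1 - x)**(b - 1)   # unnormalized posterior curve

    area, _ = quad(unnorm, 0.0, 1.0)   # area under the curve
    print(area, beta(a, b))            # the two numbers agree: both are B(8, 4)

    pdf = lambda x: unnorm(x) / area   # normalized curve
    total, _ = quad(pdf, 0.0, 1.0)
    print(total)                       # ~1.0: the probabilities now add up to 1
    ```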

  • @omarmohy3975
    @omarmohy3975 2 years ago

    Luis, really great video as always. I would like to ask, just to make sure I understood: if I have a problem where I want to get a likelihood score for 3/5 and for 60/100, and I want the second one to yield a higher score, then I should use the Beta distribution, right?
    Second question: will the probability, i.e. the area under the curve, differ a lot between f(x) with a=7, b=3 and with a=700, b=301?

  • @jbtechcon7434
    @jbtechcon7434 3 years ago

    Another great video and clear explanation.

  • @Lsazeh
    @Lsazeh 1 year ago

    Super clear explanation, thank you!

  • @xxxx3732
    @xxxx3732 10 months ago

    I was stuck on beta because they didn't explain gamma to me; now I know which step they skipped.
    You're the author of Grokking Machine Learning! I was just recommended this book.

  • @tanim980
    @tanim980 1 year ago

    Please make a video about the upper confidence bound in reinforcement learning; you teach so well.

  • @michaeldouglas7641
    @michaeldouglas7641 2 years ago

    Hi Luis. Great video. Have you since made a video on the Gamma function/distribution, as you mentioned you might at 9:08?

  • @dansplain2393
    @dansplain2393 2 years ago

    Really good. I've never seen the intuition of Bayes explained like that: “these have to add to 1”

  • @ukjoeee
    @ukjoeee 3 years ago

    I am looking forward to seeing the next video.

  • @Tyokok
    @Tyokok 1 year ago +1

    Incredible explanation!

  • @LSC69
    @LSC69 2 years ago +1

    Very good video. You explained this much better than my statistics professor. One tiny suggestion though: at 11:16, I think it would be more intuitive to viewers if you didn't vertically scale each probability graph but instead maintained the same scale. That way it is easier to see how a more spread-out distribution results in less probability density around the mode of 0.7.

  • @robharwood3538
    @robharwood3538 2 years ago +1

    Luis, I have a request. It may be something too difficult (I have been trying to find a good, understandable answer for many years now), but on the other hand, it is absolutely *fundamental* to the topics you cover on your channel. The question is: How would one go about calculating (or approximating numerically) values for the Gamma Function without the aid of a black-box tool like a spreadsheet gamma(x) or gammaln(x), or Wolfram Alpha, or anything like that? And not just for special values like for integers n or (n + 1/2), but for a general rational/real number r.
    Everywhere I look there are either a) appeals to an existing black-box function that will return a value for you, without understanding where it came from, b) resorting just to the general properties of the Gamma function (such as Gamma(x + 1) = x * Gamma(x)) and special pre-known values such as Gamma(1/2) being some value involving Pi, I believe, or c) virtually impenetrable mathematical formulas that assume the reader knows way more advanced calculus and advanced theorems that themselves require other advanced theorems to even *understand* what they're *for!*
    It's very frustrating, because the Gamma function is used everywhere when you start digging down into the fundamentals of probability theory, Bayesian statistics (and classical stats, too), information theory, machine learning, and a whole host of other areas.
    But *nowhere* have I found an explanation of how to calculate it that goes from 1) here's the definition and some properties of it, to 2) here's how you can actually calculate it for yourself if you really want to. Is it really that opaque???
    Compare the situation with say the exponential or natural logarithm. If you have the patience, you can use the Taylor series to calculate them for whatever input you like. Likewise for most other common functions you can think of, trig functions, even things like the Normal distribution can be approximated in various fairly straightforward ways.
    Surely someone can open up the black boxes and explain how they work to us non-super-advanced-math folks?!?! Especially for such a fundamentally important function!
    Anyway, if you have any ideas you could mention or perhaps if you get inspired to even make a video about it, I personally would very much appreciate it, and I suspect I'm not the only person in the world who would! 😅

    • @Nxck2440
      @Nxck2440 2 years ago +1

      Gamma is defined as an integral,
      gamma(x) = integral from 0 to infinity: t^(x - 1) * e^(-t) dt
      You can use a substitution to make these bounds finite, e.g. let t = -ln u:
      gamma(x) = integral from 0 to 1: (-ln u)^(x - 1) du
      Now you can use numerical integration. You have to be a bit careful though since the lower bound is improper. A good strategy might be to start at u = 1 and integrate backwards, taking smaller and smaller steps as you get to u --> 0. Writing this as a summation is possible but not very useful as you'd want to program it, and at that point why not just use the black-box function anyway?
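
      A rough sketch of that suggestion (assuming Python with SciPy for the numerical integration, and math.gamma used only as a reference to compare against):

      ```python
      # Compute Gamma(x) by numerically integrating the substituted form
      #   integral from 0 to 1 of (-ln u)^(x - 1) du
      # instead of calling a gamma black box.
      import math
      from scipy.integrate import quad

      def gamma_by_integration(x):
          # The integrand is improper at u = 0 (and at u = 1 when x < 1), but the
          # singularities are integrable and quad's adaptive subdivision copes with them.
          integrand = lambda u: (-math.log(u)) ** (x - 1)
          value, _ = quad(integrand, 0.0, 1.0, limit=200)
          return value

      for x in (0.5, 1.5, 4.5):
          print(x, gamma_by_integration(x), math.gamma(x))  # the two columns should agree closely
      ```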

  • @emanalsuradi4969
    @emanalsuradi4969 2 years ago

    superb!!! The best explanation ever!

  • @Murphyalex
    @Murphyalex 3 years ago +1

    Another great video!

  • @BetoAlvesRocha
    @BetoAlvesRocha 2 years ago

    Hi, mate! Thank you so much for the very clear and kind explanation. By the way, love your accent. Cheers from Brazil! ;)

    • @BetoAlvesRocha
      @BetoAlvesRocha 2 years ago

      Ahhh man! Now I know where I know you from. It's from Udacity; I remember some Machine Learning classes taught by you =)

  • @rajatsharma6137
    @rajatsharma6137 2 years ago

    Please make more probability-related videos... it's really helpful.

  • @tithisarkar2987
    @tithisarkar2987 2 years ago

    It's so helpful.
    Thank you so much, sir.

  • @felipec
    @felipec 1 year ago

    Excellent explanation; one minor comment though. At the end you present 3 graphs of the beta distribution at different parameter values, and it's clear the one with a=701, b=301 is the most likely to land close to 0.7, but you didn't show the *height*, which I believe is the maximum likelihood estimator, and it's much higher in the 3rd one.

  • @scottfan5079
    @scottfan5079 2 years ago

    Very useful, thank you very much!

  • @maxyen9892
    @maxyen9892 1 year ago

    fantastic video!

  • @calluma8472
    @calluma8472 1 year ago

    Best video I have found on this topic - I have been waiting for 3b1b to finish the binomial topic by talking about using the beta. One question: let's say I have a coin and have done 10 throws, so I use the beta distribution with a=8 and b=4. Can I then say something concrete like "p(h) is between 0.6 and 0.8 with 54% confidence" (54% being the cumulative distribution at 0.8 minus the cumulative distribution at 0.6)? And with a=71, b=31, would the confidence be more like 77% for the same range?
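
    For what it's worth, interval probabilities like these are easy to compute from the Beta CDF; a minimal sketch (assuming Python with SciPy), where the first value comes out close to the 54% quoted above:

    ```python
    # P(lo < p < hi) under a Beta(a, b) posterior, via the CDF difference.
    from scipy.stats import beta

    def prob_in_range(a, b, lo=0.6, hi=0.8):
        return beta.cdf(hi, a, b) - beta.cdf(lo, a, b)

    print(prob_in_range(8, 4))    # roughly 0.54, matching the figure in the question
    print(prob_in_range(71, 31))  # the same interval after many more flips
    ```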

  • @samirelzein1978
    @samirelzein1978 3 years ago

    Awesome as usual

  • @nanu-Nina
    @nanu-Nina 1 year ago

    Thank you so much. Now I got it!

  • @tejask5417
    @tejask5417 2 years ago

    You are a genius :) Thank you!

  • @dyoolyoos
    @dyoolyoos 3 years ago +6

    Just have to say, I've watched the binomial distribution vids of 3Bl1Br and your explanation made more sense!👍🏼 Granted we're still waiting for his Part 3; pls help prod him for it!😊

    • @hardwarejunkie9
      @hardwarejunkie9 2 years ago +1

      Absolutely. Hunting around to fill the gaps for Part 3.

  • @thomasward7196
    @thomasward7196 2 years ago

    I had an ohhhhhhh moment while watching this. Thanks for the amazing explanation; I'll make sure to sub. Cheers

  • @cernejr
    @cernejr 3 years ago

    What is the intuitive explanation of the asymmetry of the curve? And why does the curve get more symmetric as the number of coin flips rises? It would be nice to see these discussed/explained.

    • @SerranoAcademy
      @SerranoAcademy  3 years ago +1

      that's a great question, I hadn't thought about the symmetry... I don't know if the one with more flips looks more symmetric simply because it's skinnier, or because it's actually more symmetric for a combinatorial reason. I'll give it more thought to see if it means something, keep me posted if anything comes to mind!

  • @jackpot7041
    @jackpot7041 11 months ago

    very nice video, thanks

  • @kenkinyua7036
    @kenkinyua7036 2 years ago

    Watching from Kenya; this was amazing, thanks!

  • @nitinkapoor4752
    @nitinkapoor4752 7 months ago

    Finally… I have a better understanding of the Beta distribution

  • @anttwo
    @anttwo 16 days ago

    I don't understand something. Why do we multiply the probability of heads by the probability of tails?
    E.g. coin 3 of the second example has P(H)=0.3; why do we multiply it by P(T)=0.7?
    Edit: I also didn't understand why it is still (1-p).

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 1 year ago

    I thought the data was incorporated into the likelihood. But in your last example, with a larger sample, the beta prior reflects the increase in data. I'm not clear on that part.

  • @offswitcher3159
    @offswitcher3159 2 years ago

    Great video

  • @lactobacillus128
    @lactobacillus128 22 hours ago

    Why does the probability of heads start at 0.4, and not at 0.5 for each?

  • @ytseberle
    @ytseberle 2 years ago

    Excellent video. Thanks so much.
    One picky note: sometimes you pronounce it "beta" and sometimes "bee". The Roman letter B (bee) and the Greek letter B (beta) look alike, but aren't pronounced the same.

    • @mahnazakbari2504
      @mahnazakbari2504 1 year ago

      *B* is the area under the curve. *Beta* distribution is the *density* divided by *B* .

  • @vivekgandhi8891
    @vivekgandhi8891 8 months ago

    Why did you multiply the probabilities of the 3 heads and 2 tails?

  • @user-gl8we4tg3l
    @user-gl8we4tg3l 2 months ago

    But this is a probability density function, f(x) = PDF, not the probability of p, so how do we know the real probability of p?

  • @suslamo
    @suslamo 3 years ago

    thank you!

  • @md.abdurrahman7059
    @md.abdurrahman7059 2 months ago

    Good video❤

  • @InglesConConfianza
    @InglesConConfianza 1 year ago

    THANK YOU!!!!!!!

  • @angelsabillon93
    @angelsabillon93 2 years ago

    I wish my statistics professor explained like this

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 1 year ago

    Not sure why the binomial coefficient term would go away.

  • @tormentacristal
    @tormentacristal 1 year ago

    Great!😍

  • @tianyiluo2991
    @tianyiluo2991 1 year ago

    nice video

  • @user-hv3or9rc4c
    @user-hv3or9rc4c 6 months ago

    LOOOOOVE YOOUUUUU❤

  • @franard4547
    @franard4547 2 years ago

    nice exp.

  • @rhuanbarros
    @rhuanbarros 9 months ago

    thank youuuuuu

  • @AmitGuptaGwl
    @AmitGuptaGwl 1 year ago

    Sorry, I couldn't understand why we divided by the sum?

    • @SerranoAcademy
      @SerranoAcademy  1 year ago

      Hi Amit! It's so that the probabilities add to 1, since they always have to add to 1, as we're considering all the possible cases.
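
      A tiny numerical illustration of that normalization step (the coin biases and the 7-heads / 3-tails counts below are illustrative assumptions, not necessarily the exact numbers used in the video):

      ```python
      # Divide by the sum so the posterior probabilities over the candidate coins add to 1.
      coins = [0.1, 0.3, 0.5, 0.7, 0.9]   # hypothetical candidate values for P(heads)
      heads, tails = 7, 3                 # hypothetical observed flips

      # How likely each coin is to produce this particular sequence of flips
      likelihoods = [p**heads * (1 - p)**tails for p in coins]

      # Normalization: divide by the sum
      total = sum(likelihoods)
      posterior = [l / total for l in likelihoods]

      print(posterior)       # the p = 0.7 coin gets the largest share
      print(sum(posterior))  # 1.0
      ```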

  • @pradiptapattanayak8085
    @pradiptapattanayak8085 2 years ago

    Presentation is too good.

  • @user-eo3su7qz7c
    @user-eo3su7qz7c 6 months ago

    Bruh!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    May your days be long and as easy as you just made my life!!!

  • @RajdeepBorgohainRajdeep
    @RajdeepBorgohainRajdeep 2 years ago

    Hello, Sir please make a video on Eigenvectors :)

    • @SerranoAcademy
      @SerranoAcademy  2 years ago +3

      Thanks! There are some eigenvectors in this video: ua-cam.com/video/g-Hb26agBFg/v-deo.html
      But soon there'll be a video on generalized eigenspaces, stay tuned!

    • @RajdeepBorgohainRajdeep
      @RajdeepBorgohainRajdeep 2 years ago

      @@SerranoAcademy Sir, I have seen the above mentioned lesson. Thank you and waiting for the new one :)

  • @maxpaju
    @maxpaju 3 years ago

    Why are they not called meta-distributions?

  • @venready2839
    @venready2839 11 months ago

    The link to your book is broken.

  • @kasraamanat5453
    @kasraamanat5453 2 years ago

    ❤️

  • @antonioruotolo6014
    @antonioruotolo6014 2 months ago

    Too many leaps to be easily understandable.

  • @Coughi3
    @Coughi3 1 year ago

    Thank you for the video; it was very well explained.

  • @gauravms6681
    @gauravms6681 3 years ago

    ❤️