10 years later, nothing has been able to beat this explanation. Very much appreciated!
Thank you! I'm very glad you find these videos useful. I created them mainly with my own students in mind, but I'm very glad that others around the world find them helpful.
had to login and leave a comment, your videos are a gem. thanks for saving my probability and statistics course!
Happy to be of help!
Thanks very much for the compliment. It's not the easiest thing in the world to do well, and there are a number of skills and talents required, so it's not too surprising there can be some poor quality stuff out there. I don't profess to be an expert in videos, but I'm doing my best (on my own) to make the best ones I can. Some of my earlier videos were plagued by some of the negatives you mention, but I've refined my technique a little since then. I'm very glad you find them helpful.
Statistics grad student here. Your videos are always so incredibly clear. It is so helpful to have an example alongside the general mathematical notation to understand when we would use this distribution and to have everything worked out. You are so incredibly helpful. Thank you!
Thanks so much! I'm definitely going to keep going -- I'll be adding new videos in the coming weeks and months. Cheers from Canada.
just know that you're an exceptional teacher and without your videos I would be lost all quarter!!!! Thank you so much!
This playlist is a gem. Came here looking for multinomial distribution after using it for sampling in Pytorch.
Following explanation on the number of possible orderings might help myself (in future) and fellow learners -
Example - Number of ways to permute/order “n” letter word = n!
Suppose word = AAAA. Possible orderings = 4!, but as all letters in the word are the same (not distinct i.e. even if I change the position of any of the A’s it won’t matter), divide by 4!. Suppose word = “AABC” then orderings = 4! / 2!, divide by 2! because letter A is same(not distinct).
Same case here, in the case of sampling 10 Americans. The number of ways to permute/order = 10!, but out of 10 people, 6 are of type O, 2 are of type A, 1 of type B, and 1 of type AB. Hence, the people in each group (O, A, B, AB) are the same (non-distinct). So, 10! / (6! * 2! * 1! * 1!)
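The counting argument above can be checked numerically. This is only a minimal sketch: the helper name `count_orderings` is made up for illustration, and the single letter "X" stands in for blood type AB so that each person is one character.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def count_orderings(word: str) -> int:
    """Distinct orderings of the letters in `word`: n! / (c1! * c2! * ...)."""
    total = factorial(len(word))
    for count in Counter(word).values():
        total //= factorial(count)  # divide out orderings of identical letters
    return total


# Brute-force check on a short word: generate every permutation, keep distinct ones.
assert count_orderings("AABC") == len(set(permutations("AABC")))

print(count_orderings("AAAA"))        # 4!/4! = 1
print(count_orderings("AABC"))        # 4!/2! = 12
print(count_orderings("OOOOOOAABX"))  # 10!/(6! 2! 1! 1!) = 2520
```

The groups of size 1 contribute a factor of 1! = 1, which is why leaving them out of the denominator gives the same number.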
You are awesome. You even showed without replacement which shows you are very thorough in your explanations.
Hands down the best explanation on this concept period. The example was so clear and easy to follow. Where's the "Donate" button. I'll gladly chip in for your efforts.
This 10 min video alone explained a whole week of lecture content 10x better than my professor did.
You are very welcome, and thanks for the compliment! I'm very glad to be of help.
That's correct. There's nothing wrong with analyzing a situation that is truly multinomial as a binomial, if one is interested in only one of the outcomes. For example, we might be interested in the distribution of the number of people with O- blood in a random sample of 100 people. This has a binomial distribution with 2 possible outcomes on each trial (O-, not O-), even though there are many other blood types. There's nothing wrong with doing this, if that's the question of interest.
Yes, X_1 through X_k are random variables, and collectively can be thought of as a k-dimensional random vector.
Not all random vectors have a multinomial distribution of course, and we'd only use the multinomial distribution to find probabilities if the conditions of the multinomial distribution are met.
that's exactly what I was thinking!
Excellent simple intros!
Thank you so much JB!
Very nice voice, very good explanation, clear presentation. Love it!
Thanks for all the compliments! I'm very glad to hear that you love my video! All the best.
jbstatistics do you have a lesson on gamma distribution?
I enjoyed every single video in this playlist. Your approach really helped to understand what random variables are. It is also very explicit in your videos what are possible values of rv given a distribution. It looks so trivial now but was very confusing before. Many thanks for that.
Thank you! People are still using this! please keep going! : D
I passed the quiz,that is why I am here to thank you and it helped
excellent video...I found it incredibly straightforward, clear and complete. You're a great teacher!
Short and Straight to the point. Thank you good sir.
Thanks. All your lectures are great. Spent the whole day going through most.
You are welcome kunal. I'm very glad you find my videos helpful.
Everything about ur teaching is classic....tnx ❣️ Love from India ❣️
Thank you so much for all of your videos, I really appreciate it!!!! They have helped me so much, I wish you were my instructor!
Am I the only one who's watching it in 2021? Must be hard to learn without you, appreciate!
you are a God among shadows, you teach in >30 min what my teachers fail to do in two months
watching this video in 2022 and still its amazingly effective
I do very much appreciate the compliment, but trust me on this, the American education system produces thousands upon thousands of people far more eloquent than I could ever hope to be. I am sure there are many wonderful educators in every country, just as there are some problems with the education system in every country. My roots growing up in a small Northern Ontario town slip through sometimes -- listen carefully for the "we're gunna..." that shows up in some of my videos :)
Good explanation of the basics. Wish you did a sequel with more elaborate examples :) The problem I'm facing now could be reduced to "given a k-sided die, what is the probability of getting each possible value at least once within n throws" and there's no video on youtube that would help tackle this.
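One possible way to approach the "every value at least once" question is inclusion-exclusion over the set of faces that never show up. The sketch below assumes a fair die (each face has probability 1/k), which the comment does not state, and the function name `prob_all_faces_seen` is hypothetical.

```python
from math import comb


def prob_all_faces_seen(k: int, n: int) -> float:
    """P(every face of a fair k-sided die appears at least once in n throws)."""
    # Inclusion-exclusion: choose j faces that are forced to be missing;
    # each throw then lands on one of the other k - j faces.
    return sum((-1) ** j * comb(k, j) * ((k - j) / k) ** n for j in range(k + 1))


print(prob_all_faces_seen(2, 2))   # two coin tosses, one head and one tail: 0.5
print(prob_all_faces_seen(6, 10))  # standard die, 10 throws
```

For unequal face probabilities the same idea works, but the sum has to run over subsets of faces rather than just their count.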
Thanks for great explanation of Multinomial distribution
It is very clear to understand. Thank you for your video.🥰
Yo, I am going to follow all of your videos on statistics along with the book on stats I am currently reading. Thank you, really really helpful.
Thank you for your nice explanation. from Japan.
Thank you for the easy-to-understand video!
Great explanation. From UPM, Malaysia. Thanks a lot.
You are welcome!
Great work man. I use your videos as an introduction before studying, helps a bunch ! Thank you.
You are very welcome! I'm glad to be of help!
Clear explanation. Very useful.
Still here in 2021! (Need this for online classes) Thank you very much!
I'm still here too! You are very welcome.
Very helpful, thank you! You have a good voice for teaching :)
Thanks!
i like your voice very much. it helps drawing my attention
Thanks! I get a lot of views from Malaysia. You folks must know good videos when you see them :) Cheers from Canada.
Pretty clear explanation ! Very helpful
Very clear explanation!
In R use the dmultinom function for your observations and the vector of probabilities
Can you please explain how you get the answer at the part without replacement? I don't get it
Well explained.... Thank you
At 8:23 how come the denominators of 20 don't change because of the "with replacement" nature of this question.
Oh soz nvm i got confused with "with replacement" and "without replacement".
Amazing video!
Is there an efficient way to calculate something akin to:
An urn contains 8 red balls, 3 yellow balls, and 9 white balls. 6 Balls are randomly selected **with replacement**.
What is the probability that at least 2 are red and at least 1 is yellow?
It depends what you mean by "efficient" :) There's no quick formula or anything like that; you have to add the probabilities of all the cases that satisfy the condition. So it can get a little complicated and messy.
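For a problem of this size, "adding up all the cases" is easy to hand to a computer. The following is a sketch rather than a clever method: it loops over every (red, yellow, white) split of the 6 draws that satisfies the condition and sums the multinomial probabilities. The function name `multinomial_pmf` is made up for illustration.

```python
from math import factorial


def multinomial_pmf(counts, probs):
    """P(X_1 = counts[0], ..., X_k = counts[-1]) over n = sum(counts) independent trials."""
    coef = factorial(sum(counts))
    prob = 1.0
    for x, p in zip(counts, probs):
        coef //= factorial(x)  # builds n! / (x_1! ... x_k!)
        prob *= p ** x         # builds p_1^x_1 ... p_k^x_k
    return coef * prob


n = 6
probs = (8 / 20, 3 / 20, 9 / 20)  # red, yellow, white, sampling with replacement

total = 0.0
for red in range(2, n + 1):               # at least 2 red
    for yellow in range(1, n - red + 1):  # at least 1 yellow
        white = n - red - yellow
        total += multinomial_pmf((red, yellow, white), probs)

print(round(total, 6))
```

The number of cases grows quickly with the number of trials and outcomes, so this brute-force sum is only practical for small problems like this one.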
@jbstatistics Ahhh sad...
I have a similar case I wanted to calculate but that has 8 relevant outcomes within hundreds of thousands of trials...
So adding all the cases up that satisfy the condition doesn't seem to be an option
very good explanation
Awesome!
very clear!
you should do your own talk show
Hello. What's the difference between the multinomial distribution without replacement and the hypergeometric? They seem identical.
thank you for your hard work sir
Thank you. This is an excellently produced tutorial.
Thanks! I'm glad to be of help.
Good job.it's great 👍
How can we solve the without replacement calculation (0.18204)?
10:40
is multinomial distribution without replacement similar to hypergeometric distribution?
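When the sampling is without replacement, the counts follow a multivariate hypergeometric distribution rather than a multinomial one. The sketch below uses the urn from this thread (8 red, 3 yellow, 9 white, 6 drawn). The target counts of 2 red, 1 yellow and 3 white are an assumption: they are not stated in the comments, but they are consistent with the (8 choose 2) term and the 0.18204 quoted here.

```python
from math import comb, factorial

urn = {"red": 8, "yellow": 3, "white": 9}
drawn = {"red": 2, "yellow": 1, "white": 3}  # assumed target counts

n = sum(drawn.values())  # 6 balls drawn
N = sum(urn.values())    # 20 balls in the urn

# Without replacement: ways to pick the required balls of each colour,
# divided by the number of ways to pick any n balls from N.
ways = 1
for colour in urn:
    ways *= comb(urn[colour], drawn[colour])
p_without = ways / comb(N, n)

# With replacement: ordinary multinomial probability, for comparison.
coef = factorial(n)
p_with = 1.0
for colour in urn:
    coef //= factorial(drawn[colour])
    p_with *= (urn[colour] / N) ** drawn[colour]
p_with *= coef

print(round(p_without, 5))  # 0.18204
print(round(p_with, 5))     # 0.13122
```

The two answers differ because removing a ball changes the composition of the urn for later draws, so the trials are no longer independent.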
Thank you so much! This video is very helpful!
Outstanding Sir!!
Thanks!
Have you done a video on the exponential distribution?
It was very useful. can i share this video on my Instagram posts??
Hey, I was wondering what possible problems it can cause if a multinomial variable is recoded into binomial and analyzed as such. I think that the only difference is the research question one answers: if kept in a multinomial form, you can find out what the probability is that an individual with certain characteristics chooses option A, B, C. If the variable is recoded, then you can only find out what the probability is that this person prefers option A over all non-A options. Is that correct?
Why haven't you raised (8 choose 2) to the power of 2 at 10:28?
For that part of the calculation, we want the number of ways of picking 2 red balls from 8, which is what (8 choose 2) represents. The square of (8 choose 2) is not a useful number for us in that question.
Nicely explained
Liked, subscribed and shared!
Great explanation. But what I can't wrap my head around is why we need the "different ways to arrange them". Let's say we have the blood example, and we have a sample of 5 people, and 3 of them are O, 2 are A. How is OOOAA not the same as AAOOO, if nowhere in the problem it states there is a specific order to take into consideration when taking the sample?
It is the same as far as we are concerned, and we don't care about the order. We simply want to know the probability of a certain number of occurrences of each of the k outcomes. But that's precisely why we need that multinomial coefficient in front. p_1^x_1...p_k^x_k is the probability of getting *any specific ordering* of x_1 occurrences of outcome 1, x_2 occurrences of outcome 2, etc. That's what that term gives us. We don't care about that in isolation though, so we need to add up the probabilities of all the orderings that get us x_1 occurrences of outcome 1, x_2 occurrences of outcome 2, etc. That's what multiplying by the multinomial coefficient does for us.
It's the same logic as for the binomial distribution. The binomial is a slightly simpler situation, and I go into this notion in a little more detail in my binomial video. (If you don't understand what I'm talking about above, you might find the binomial video helpful.)
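The "add up all the orderings" step can be made concrete by listing the orderings. This is an illustrative sketch for the 3 type O, 2 type A sample asked about above, using the type O and type A probabilities (0.44 and 0.42) that appear elsewhere in this thread. The variable names are made up.

```python
from itertools import product
from math import factorial, isclose, prod

p = {"O": 0.44, "A": 0.42}

# Every ordered sequence of 5 people (types O or A) containing exactly 3 O's.
matching = [seq for seq in product("OA", repeat=5) if seq.count("O") == 3]

# Each specific ordering (OOOAA, AAOOO, ...) has the same probability.
one_ordering = p["O"] ** 3 * p["A"] ** 2

# Adding the probability of every ordering, one by one ...
summed = sum(prod(p[t] for t in seq) for seq in matching)

# ... matches multiplying one ordering's probability by 5!/(3! 2!).
coef = factorial(5) // (factorial(3) * factorial(2))

print(len(matching), coef)                   # 10 10
print(isclose(summed, coef * one_ordering))  # True
```

OOOAA and AAOOO are two of the ten sequences in `matching`; they count as one event for us, which is exactly why their probabilities get added together.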
@@jbstatistics thank you! so my understanding is that for either the binomial or multinomial distributions, it is implied that the events are in succession. Like, in the blood example, we are asking one person, then another, then another, and that's why the orders OOOAA and AAOOO are considered different events?
@@mando1964 It's just a way to visualize it. We could ask them all at exactly the same time, and have them all respond at exactly the same time, and that would not change the situation.
Toss 2 fair coins at the same time. What's the probability heads comes up exactly once? 1/2, and there's a variety of ways to come up with that, with one of them being:
In a conceptual world, where a magic fairy were to glance at the coins, there are 4 outcomes: TT, TH, HT, HH, where the first letter represents the outcome on the coin that the fairy happened to see first. Each of those 4 outcomes has a probability of (1/2)(1/2) = 1/4. But we care not about the order, so TH or HT would get us what we need in our world. Thus 2 of those outcomes in that magical fairy world give us heads exactly once, with each of those outcomes having a probability of 1/4, and thus the probability of getting heads exactly once is 2(1/4) = 1/2.
@@jbstatistics This last paragraph made it clear to me. It reiterated that it all comes down to the many ways HT can occur. Hence why it has a bigger probability than HH or TT. Like why rolling 2 dice, the probability of getting a 7 is way higher, because there are more ways it can occur, correct?
@@mando1964 Yes, it's related to that idea. We only care about getting the 7, but there are many ways of getting 7, so we need to add up the probabilities of all those different ways.
This has helped me memorize concepts for exam p. Do you have a video for convolution, I can't find?
Thank you for this!
I'm glad to be of help!
thank you so much! it's really helpful ❤️
I think it should be random variable Xk instead of Xi because k represents outcomes
There are k random variables: X_i, for i = 1, ..., k. X_k is the kth random variable.
Do we use the multinomial distribution with random vectors? As I understood it, a random vector is a collection of random variables. Is this correct?
best stat channel on youtube?
Absolutely! Why the question mark? :)
thank you for this, really helpful
Very helpful. Thanks.
im still confused as to how the probability is calculated at 6:33
10!/6!2!1!1! By definition is the same as 10!/(6×5×4×3×2×1×2×1×1×1) since 6!=6×5×4×3×2×1 etc. Multiply that by 0.44^6 × 0.42^2 × 0.1 × 0.04 and you will get 0.012902538 which is the answer.
The trials are independent. Remember that if each trial is independent of the others, such as a coin flip, then ---> P(a, b) = P(a) * P(b); that's why you see the right hand side of the equation. The left hand side comes straight out of probability theory, where we can get the number of different ways this event can occur as (the number of permutations if we considered every event distinct) / (the number of ways we can shuffle around the similar events such as the A's)
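The arithmetic described in the two replies above can be reproduced in a few lines. A minimal sketch, using the counts and probabilities quoted in this thread (6 type O, 2 type A, 1 type B, 1 type AB); the variable names are made up for illustration.

```python
from math import factorial

counts = (6, 2, 1, 1)             # O, A, B, AB in the sample of 10
probs = (0.44, 0.42, 0.10, 0.04)  # probability of each blood type

# Number of orderings: 10! / (6! 2! 1! 1!)
coef = factorial(sum(counts))
for x in counts:
    coef //= factorial(x)

# Probability of any one specific ordering: 0.44^6 * 0.42^2 * 0.10 * 0.04
prob_one_ordering = 1.0
for x, p in zip(counts, probs):
    prob_one_ordering *= p ** x

answer = coef * prob_one_ordering
print(coef)              # 2520
print(round(answer, 6))  # 0.012903
```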
at 3:36 you mean up to k, not up to n?
+Srx 30 No, I meant n. Each of the individual random variables can take on any whole numbered value between 0 and n, subject to the constraint that they all must sum to n.
Great video
Thanks Anders! I'm glad to be of help.
Great video thank you for the explanation! I was wondering, wouldn't it be easier to follow if you called the variables Xred, Xyellow and Xwhite instead of X1 X2 and X3?
Thank you a lot, perfectly explained
Dear Mr jbstatistics! Thank you for your presentation. I have a problem with Multinomial Naive Bayes. I can't fully understand the meaning of one fragment in the formula for the probability of a document in the Multinomial Naive Bayes Model.
P(di|Cj) = P(|di|). |di|!. U(P(Wt|Cj)^Nit / Nit !) with i = 1, .., |V|. U is Integration, comment isn't allowed for special symbol so I can't express it.
My question:
P(|di|), what does this probability mean? How to compute it?
Please explain for me! Thank you so much.
Best regards!
Nam.
Thanks
Can we say that for multinomial distributions all possibilities should be 1. MECE and 2. IID?
Awesome! Thank you so so much +jbstatistics! You're truly an amazing statistician!
I'm not so sure about the "amazing statistician" part, but I do my best and I'm glad to be of help!
Thank you so much!!!!! you're wonderful!
good explanation
Lifesaving videos
I'm glad to be of help!
awesome videos, keep them going :DD
Thanks Jacob!
Really Great!
Thank you so so much.
n! / (x1! ... xk!) is the number of orderings that give us x_1 occurrences of outcome 1 and so on. It is nCx, right? So where is (n-x)! in the denominator? Thank you
It is a generalization of nCx beyond just two groups of items (successes and failures) to k groups of items (group 1, group 2, ..., group k). With just two groups, it reduces to n!/(x!(n-x)!)
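Written out, the reduction to two groups looks like this. With k = 2 the counts must sum to n, so calling the first count x forces the second to be n - x (the notation x_1, x_2 follows the formula in the question):

```latex
\frac{n!}{x_1!\,x_2!\cdots x_k!}
\;\xrightarrow{\;k=2,\ x_1 = x,\ x_2 = n-x\;}\;
\frac{n!}{x!\,(n-x)!} = \binom{n}{x}
```

So the (n-x)! is still there in the binomial case; it is simply the factorial of the second group's count.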
is this for grade 9?
Not typically, no.
Thank you ❤
Salute.
Thank you sir !
🔥🔥🔥🔥😍🤩🤩
Thanks 😀
Thanks
nice, very nice
@jbstatistcs magnificent work, arguably the best explanation of probability distributions on YouTube, an evergreen tutorial! Many thanks sir
thanks