The Beta distribution is a conjugate prior for the Bernoulli. We derive the posterior distribution and the (posterior) predictive distribution under this model.
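A minimal sketch of the update described above, assuming a Beta(a, b) prior and 0/1 observations. The function names and the sample data are hypothetical, chosen only for illustration: the posterior is Beta(a + n1, b + n0), and the predictive probability of a 1 is the posterior mean.

```python
def beta_bernoulli_posterior(a, b, data):
    """Return the posterior Beta parameters (a + n1, b + n0) for 0/1 data."""
    n1 = sum(data)             # number of ones (successes)
    n0 = len(data) - n1        # number of zeros (failures)
    return a + n1, b + n0


def predictive_prob_one(a_post, b_post):
    """Posterior predictive p(x_new = 1 | D), i.e. the posterior mean of theta."""
    return a_post / (a_post + b_post)


data = [1, 0, 1, 1, 0]         # hypothetical observations: n1 = 3, n0 = 2
a_post, b_post = beta_bernoulli_posterior(2, 2, data)
print(a_post, b_post)                        # 5 4
print(predictive_prob_one(a_post, b_post))   # 5/9, roughly 0.556
```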
By hypothesis, x_i in {0,1}, so theta^{I(x_i=1)} * (1-theta)^{I(x_i=0)} = theta^{x_i} * (1-theta)^{1-x_i}. The indicator functions here are pure clutter.
Why is the denominator of the mode of theta|D equal to a+b+n-2? If I substitute a+n1 for a and b+n0 for b, I get a+n1+b+n0-2. Or is n just the sum of n0 and n1?
When you say that the posterior is proportional to a Beta distribution, I am not sure why you write "Beta(theta | a + n1, b + n0)" and not just Beta(a + n1, b + n0). Where does the "theta | " part come from?
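One way to read the notation, sketched in code: "Beta(theta | a, b)" is the Beta density evaluated at the point theta, with a and b as its parameters, whereas "Beta(a, b)" names the distribution without saying where it is evaluated. The helper `beta_pdf` below is a hypothetical name written for this illustration, not something from the video.

```python
import math


def beta_pdf(theta, a, b):
    """Density Beta(theta | a, b): theta is the argument, a and b are parameters."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * theta ** (a - 1) * (1 - theta) ** (b - 1)


# Same parameters (a=5, b=4), different arguments theta: different density values.
print(beta_pdf(0.5, 5, 4))   # 2.1875
print(beta_pdf(0.8, 5, 4))
```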
Actually, you are wrong: n_0 + n_1 = n. Think of them as the number of successes (n_1) and failures (n_0) in n Bernoulli trials. When you plug them into the posterior Beta, the denominator a + n_1 + b + n_0 - 2 becomes a + b + n - 2. So he is right.
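A quick numeric check of that substitution, assuming the standard formula for the mode of Beta(a', b'), namely (a' - 1) / (a' + b' - 2) for a', b' > 1. The function name and the numbers are made up for illustration.

```python
def posterior_mode(a, b, n1, n0):
    """Mode of the posterior Beta(a + n1, b + n0); valid when both parameters exceed 1."""
    a_post, b_post = a + n1, b + n0
    return (a_post - 1) / (a_post + b_post - 2)


a, b, n1, n0 = 2, 2, 3, 2      # hypothetical prior and counts
n = n1 + n0                    # n is just the total number of observations

# Both ways of writing the denominator give the same value.
print(posterior_mode(a, b, n1, n0))          # 4/7
print((a + n1 - 1) / (a + b + n - 2))        # 4/7
```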
What is meant by "Indicator" function? (mentioned at time 4:04)
I haven't seen that part, but an indicator function equals 1 if some condition is true and 0 otherwise.
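To make that concrete, here is a small sketch (helper names are hypothetical) showing that for x in {0, 1} the indicator form of the Bernoulli likelihood and the plain exponent form give the same value:

```python
def indicator(condition):
    """I(condition): 1 if the condition holds, 0 otherwise."""
    return 1 if condition else 0


def lik_indicator(theta, x):
    """Bernoulli likelihood written with indicator functions."""
    return theta ** indicator(x == 1) * (1 - theta) ** indicator(x == 0)


def lik_exponent(theta, x):
    """The same likelihood written as theta^x * (1 - theta)^(1 - x)."""
    return theta ** x * (1 - theta) ** (1 - x)


for x in (0, 1):
    print(x, lik_indicator(0.3, x), lik_exponent(0.3, x))
```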
Also, shouldn't this be called the "beta-binomial" model? You are combining a beta pdf with a binomial pmf, not a Bernoulli one.
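On the naming question, one observation that may help: the binomial likelihood of the counts and the Bernoulli likelihood of the full sequence differ only by the binomial coefficient, which does not depend on theta, so both lead to the same Beta(a + n1, b + n0) posterior. A rough check with hypothetical counts and function names:

```python
import math


def bernoulli_seq_lik(theta, n1, n0):
    """Likelihood of one particular 0/1 sequence with n1 ones and n0 zeros."""
    return theta ** n1 * (1 - theta) ** n0


def binomial_lik(theta, n1, n0):
    """Binomial likelihood of observing n1 ones in n1 + n0 trials."""
    return math.comb(n1 + n0, n1) * bernoulli_seq_lik(theta, n1, n0)


# The ratio is the binomial coefficient C(5, 3) = 10 (up to rounding) for every theta.
for theta in (0.2, 0.5, 0.9):
    print(theta, binomial_lik(theta, 3, 2) / bernoulli_seq_lik(theta, 3, 2))
```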