This is by far the best explanation on YouTube! Most of the other RBM videos on YouTube are a waste of time; if you want to learn the basics, this is the best!
totally agree~
we should make sure to promote great content like this so that people don't waste hours on sub-par content! (like apparently we did :D)
Can I ask where I should dig deeper into RBMs? Something more advanced; I have wasted too much time on this topic...
totally a great lecture and presenter
totally agree
My best teachers were always from Iran. Prof Ghodsi is one of them.
Agree with q0x. I have seen many explanations of RBMs on Videolectures and other sites. Prof. Ghodsi's explanation is truly insightful and step by step. Thank you very much, Prof. Ghodsi.
His calm and patient delivery demonstrates his confidence in the field.
Wonderful lecture. I have been following Hugo Larochelle's lectures but got stuck; this helped me. Thank you.
Same. Even 3 years after your comment, I find it relatable.
Thanks a lot, it's really well explained; the best RBM presentation I have seen to date.
A Great Lecture by Prof. Ghodsi.
Very nice explanation of RBMs. I was watching Hugo's lectures and this video clarified a few things skipped there. Prof. Ghodsi explains things intuitively and keeps reminding you of the bigger picture, which is great.
Intuitive and mathematical explanations of the RBM in a simple and nice way.
Thank you very much for sharing this useful lecture and your knowledge as well. I found it very useful and learnt a lot.
Excellent presentation!
I have a question at 8:13
In the first term of the energy function, why do we use "k" instead of "j"?
It's because the number of input nodes may or may not equal the number of hidden nodes, so the sum over visible units and the sum over hidden units need their own index letters. For example, if an RBM has 4 input nodes and 3 hidden nodes, one index runs over 1,2,3,4 and the other over 1,2,3.
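(Not from the lecture, just my own tiny NumPy sketch of the reply above, assuming the usual energy E(v,h) = -b.v - c.h - v.W.h and using k for the visible index and j for the hidden index, as in the energy-function comments further down:)

import numpy as np

rng = np.random.default_rng(0)
n_v, n_h = 4, 3                       # 4 input (visible) units, 3 hidden units
v = rng.integers(0, 2, n_v)           # v_k, k = 1..4
h = rng.integers(0, 2, n_h)           # h_j, j = 1..3
b, c = rng.normal(size=n_v), rng.normal(size=n_h)   # one bias per visible / hidden unit
W = rng.normal(size=(n_v, n_h))       # W[k, j] couples v_k with h_j

# the two bias sums run over different ranges, hence the separate index letters
visible_term = sum(b[k] * v[k] for k in range(n_v))
hidden_term = sum(c[j] * h[j] for j in range(n_h))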
This is amazing! Thank you so much!
How come P(h|v) is easy to compute? Haven't we been saying all along that computing a posterior probability is difficult due to the intractability of computing the evidence, p(v) (according to Bayes' rule)?
Very nice video! Thanks!
Intuitive, perfect
And Thanks for professor. It is very good lecture, concepts are well explained.
this is excellent and helpful!
Very helpful lecture. Where can I find the notes/presentation for this lecture? Thanks in advance.
Can I get a reference for the part where he says that the lower-dimensional space learned by the neural network is proved to span the eigenspace generated by PCA?
Good explanation, thanks.
Thanks for sharing!
Thanks, good presentation.
How long does an RBM need to train on the small MNIST data to get localized features like the ones shown in the slides? I tried running for 2M iterations and I still see global features, not the localized ones in the slide. I'd highly appreciate a response.
Is there any similar video for gaussian RBM?
he is the master of it
This is gold!
Where is the Part 2 of this lecture?
Out of 37k viewers, probably only 1 or 2 or 3 understood the whole class.
I'm sorry, but I think the slide at 27:49 is wrong. Although P(h_j=1|v) = sigmoid(c_j + v.W:j) is correct, the middle line of the first equation implies, by the reasoning presented, that P(h_j=0, v) = exp(0) and P(h_j=1, v) = exp(c_j + v.W:j), which is wrong. What should be stated instead is that P(h_j=1|v) = P(h_j=1, v) / (P(h_j=0, v) + P(h_j=1, v)) = 1 / (1 + P(h_j=0, v)/P(h_j=1, v)) = sigmoid(c_j + v.W:j).
Furthermore, the expressions for P(h|v) and P(v|h) can't be right: not all entries of h or v equal 1, so even though the units are conditionally independent, the products should mix factors P(h_j=1|v) = sigmoid(c_j + v.W:j) for units that are on and P(h_k=0|v) = 1 - sigmoid(c_k + v.W:k) for units that are off. Finally, P(h|v) = product{P(h_i|v), i = 1..n} for a given pair (h, v), and similarly for P(v|h).
www.deeplearningbook.org/contents/generative_models.html
In equation (20.12) there is a hat on top of the P.
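For what it's worth, here is a little NumPy check of that ratio argument (my own sketch, assuming the standard energy E(v,h) = -b.v - c.h - v.W.h; the partition function and the other hidden units cancel in the ratio):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def energy(v, h, b, c, W):
    return -b @ v - c @ h - v @ W @ h

rng = np.random.default_rng(0)
n_v, n_h = 5, 7
v, h = rng.integers(0, 2, n_v), rng.integers(0, 2, n_h)
b, c = rng.normal(size=n_v), rng.normal(size=n_h)
W = rng.normal(size=(n_v, n_h))

j = 2                                    # any hidden unit
h1, h0 = h.copy(), h.copy()
h1[j], h0[j] = 1, 0
p1 = np.exp(-energy(v, h1, b, c, W))     # unnormalized P(h_j=1, other h's, v)
p0 = np.exp(-energy(v, h0, b, c, W))     # unnormalized P(h_j=0, other h's, v)

# P(h_j=1|v) = p1/(p0+p1) = 1/(1 + p0/p1) = sigmoid(c_j + v.W[:,j])
print(p1 / (p0 + p1), sigmoid(c[j] + v @ W[:, j]))   # the two numbers agree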
The lecture is really helpful; the only bad thing is that the notation isn't consistent.
Thank you for the good lesson!!
Thanks a lot. It was really good.
It would be better if the dimensions of each vector and matrix operation were shown. It's confusing how "v" can have subscript "k" when it has "j" items.
At 25:00, the formula for P(h|v) as a product of the P(h_j|v) is not correct.
For the hidden vector h = (h_j), j = 1...n_h:
when h_j = 1 the term is sigmoid(c_j + v.T*W(:,j)),
but when h_j = 0 the term is sigmoid(-c_j - v.T*W(:,j)).
www.deeplearningbook.org/contents/generative_models.html
page 656, equation (20.15) is correct.
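If it helps, a minimal sketch of the corrected product (my own code, not the lecturer's; assumes binary units and the usual c, W from the energy function):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def p_h_given_v(h, v, c, W):
    a = c + v @ W                                         # a_j = c_j + v.T*W(:,j)
    per_unit = np.where(h == 1, sigmoid(a), sigmoid(-a))  # sigmoid(-a_j) = 1 - sigmoid(a_j)
    return per_unit.prod()                                # P(h|v) = prod_j P(h_j|v)

rng = np.random.default_rng(0)
n_v, n_h = 5, 7
c, W = rng.normal(size=n_h), rng.normal(size=(n_v, n_h))
v, h = rng.integers(0, 2, n_v), rng.integers(0, 2, n_h)
print(p_h_given_v(h, v, c, W))   # summing this over all 2**n_h binary h's gives 1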
The third term of the energy function should be sum_{j,k}( v_k * w_{k,j} * h_j ), not sum_{j,k}( v_k * w_{j,k} * h_j ).
The size of the matrix w is (the size of vector v) x (the size of vector h), in this case 5 x 7,
so you should write w_{k,j} rather than w_{j,k}.
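A quick sketch of that indexing (mine, not from the slides), with the 5 x 7 sizes you mention:

import numpy as np

rng = np.random.default_rng(0)
n_v, n_h = 5, 7                        # 5 visible units (index k), 7 hidden units (index j)
v, h = rng.integers(0, 2, n_v), rng.integers(0, 2, n_h)
b, c = rng.normal(size=n_v), rng.normal(size=n_h)
W = rng.normal(size=(n_v, n_h))        # W[k, j], shape 5 x 7 as stated above

# third term written element-wise with w_{k,j}
third = sum(v[k] * W[k, j] * h[j] for k in range(n_v) for j in range(n_h))
E = -b @ v - c @ h - third             # same value as the vectorized -b @ v - c @ h - v @ W @ h
print(np.isclose(E, -b @ v - c @ h - v @ W @ h))   # True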
At this point I don't know what I am doing.