Very detailed and comprehensive explanation! I am a UW student taking CS480 in the 2022 Spring, this video resolves a lot of my confusion. Thank you prof!
For a specific input x, the generated distribution p(z|x) should sit inside N(0,1) but be narrower. Does that mean the many p(z|x) distributions accumulate and empirically become p(z), which is then a Gaussian mixture?
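(Not from the lecture, just how I'd write it out: with training points x_1, ..., x_N, the aggregate of the per-example posteriors is the mixture q(z) = (1/N) * [ q(z|x_1) + ... + q(z|x_N) ], and since each q(z|x_i) is Gaussian this is exactly a Gaussian mixture. The KL term in the ELBO penalizes each component's distance to N(0,1), so empirically the mixture ends up roughly filling out the standard normal prior even though each individual component stays narrow.)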
In VAE, the encoder outputs a mu and sigma. mu and sigma are used to generate h. The decoder uses h to generate the data point X. After training, when the decoder is used as a generator, why are mu and sigma not used to produce h? Instead h is sampled from N(0,1). Won't this output an incorrect X?
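In case a concrete sketch helps, here is a minimal PyTorch-style toy example of the two phases (my own illustrative code and names, not from the lecture; the linear enc/dec are stand-ins for real networks):

    import torch
    import torch.nn as nn

    latent_dim, data_dim = 2, 784

    # Toy stand-ins for the encoder and decoder networks (hypothetical, for illustration only)
    enc = nn.Linear(data_dim, 2 * latent_dim)   # outputs mu and log-variance
    dec = nn.Linear(latent_dim, data_dim)

    # Training-time path: mu and sigma come from the encoder for a specific x
    x = torch.randn(1, data_dim)                 # stand-in for one data point
    mu, log_var = enc(x).chunk(2, dim=-1)
    eps = torch.randn_like(mu)
    h = mu + torch.exp(0.5 * log_var) * eps      # reparameterization: h ~ q(h|x)
    x_recon = dec(h)                             # reconstruction of that same x

    # Generation-time path: there is no input x, hence no mu/sigma to use
    h = torch.randn(1, latent_dim)               # sample h straight from the prior N(0, I)
    x_new = dec(h)                               # a new sample, not a reconstruction of any X

As I understand the answer to the question: the KL term in the training objective pushes the per-example q(h|x) distributions to collectively cover N(0,1), so a sample drawn from the prior lands in a region the decoder has already learned to decode; it produces a new plausible X rather than an "incorrect" one.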
(8:45) Agreed, this seems to be a clear mistake on the Prof's part. (16:20) he seems to realize his mistake but doesn't acknowledge it. From then on he uses both the wrong distribution and the correct one, which is somewhat confusing. (25:10) he partly explains how the two distributions are related in the context of the "reparameterization trick". BTW, going between them implicitly requires a transformation covered in a later lecture on normalizing flows, explained pretty well by a grad student.
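(Spelling out that transformation, since it is standard VAE material and not specific to this lecture: if eps ~ N(0, I) and z = mu + sigma * eps, with mu and sigma output by the encoder for a given x, then z ~ N(mu, sigma^2) = q(z|x). So the standard normal and the per-example Gaussian are related by a deterministic, invertible shift-and-scale map, which is also the simplest instance of the change of variables used in normalizing flows.)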
Very clear explanation of VAE and the reparameterization trick! Thank you Professor!
A very intuitive explanation of VAEs, thanks!
Great teacher. Very clear explanations, thanks!
Thank you for your great lectures!