10:30? Shannon code interpretation: an optimal code built from model p assigns a data point x a code length of about -log2 p(x) bits
11:25 Training a generative model by maximum likelihood is equivalent to optimizing compression; comparing models by likelihood is equivalent to comparing how well they compress the data
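A minimal sketch of the likelihood-compression link in the two notes above (toy numpy example, not from the lecture): the average negative log2-likelihood of the data under a model is its expected Shannon code length in bits per sample, so the higher-likelihood model is also the better compressor.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=10_000)  # "true" data: standard normal

def gaussian_log2_density(x, mu, sigma):
    # log2 p(x) for a univariate Gaussian model
    return (-0.5 * ((x - mu) / sigma) ** 2
            - np.log(sigma * np.sqrt(2 * np.pi))) / np.log(2)

# Two candidate models: the better-fitting one needs fewer bits per sample.
# (For continuous data this is a differential code length; actual bit counts
# would also include a fixed discretization constant.)
for mu, sigma in [(0.0, 1.0), (1.0, 2.0)]:
    bits_per_sample = -np.mean(gaussian_log2_density(data, mu, sigma))
    print(f"model N({mu}, {sigma}^2): {bits_per_sample:.3f} bits/sample")
```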
22:30 A GAN generates samples but does not give likelihoods directly; one workaround is a kernel density approximation fit to generated samples
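A minimal sketch of that workaround (toy numpy/scipy example; the "GAN samples" are just stand-in Gaussian draws): fit a kernel density estimate to generated samples, then score held-out data under it.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
gan_samples = rng.normal(size=(2, 5000))   # stand-in for samples from a trained GAN
test_points = rng.normal(size=(2, 100))    # held-out data we want likelihoods for

# Fit a Gaussian KDE to the generated samples (rows = dimensions, columns = samples).
kde = gaussian_kde(gan_samples)

# Approximate log-likelihood of the test data under the GAN's implicit distribution.
avg_log_lik = np.mean(kde.logpdf(test_points))
print(f"approximate average log-likelihood: {avg_log_lik:.3f}")
```

This estimate is known to degrade badly in high dimensions, which is why it is only a rough approximation.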
33:40? VAE: the marginal likelihood is intractable; estimate it with annealed importance sampling
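Full annealed importance sampling is more involved than fits in a note; the sketch below (toy numpy example with a hypothetical linear-Gaussian model standing in for the VAE) shows the plain importance-sampling estimator of log p(x) that AIS refines by annealing through intermediate distributions.

```python
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(0)

# Toy latent-variable model standing in for a VAE (hypothetical parameters):
#   p(z) = N(0, 1),  p(x | z) = N(w * z, sigma_x^2)
w, sigma_x = 2.0, 0.5

def log_norm(x, mu, sigma):
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

def log_marginal_is(x, mu_q, sigma_q, num_samples=10_000):
    # Plain importance sampling with proposal q(z|x) = N(mu_q, sigma_q^2);
    # AIS refines this by annealing through a sequence of intermediate distributions.
    z = rng.normal(mu_q, sigma_q, size=num_samples)
    log_w = (log_norm(x, w * z, sigma_x)      # log p(x | z)
             + log_norm(z, 0.0, 1.0)          # + log p(z)
             - log_norm(z, mu_q, sigma_q))    # - log q(z | x)
    return logsumexp(log_w) - np.log(num_samples)

x = 1.3
print(f"IS estimate of log p(x): {log_marginal_is(x, mu_q=0.3, sigma_q=0.6):.4f}")
# Exact marginal for this linear-Gaussian model: x ~ N(0, w^2 + sigma_x^2)
print(f"exact log p(x):          {log_norm(x, 0.0, np.sqrt(w**2 + sigma_x**2)):.4f}")
```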
43:25? Sharpness: entropy of the classifier's conditional distribution c(y|x); low entropy means the classifier puts its probability mass on a single y, i.e. high sharpness. Diversity: entropy of the classifier's marginal distribution c(y).
45:50 c(y) is the marginal distribution over predicted labels when the classifier is fed synthetic samples; a generator that emits essentially one sample makes c(y) low-entropy (bad), ideally c(y) is uniform over the possible y; sharpness and diversity are competing (see sketch below)
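A minimal sketch of the sharpness and diversity quantities above (toy numpy example; the class probabilities stand in for a hypothetical pretrained classifier c applied to generated samples). Combining the two entropies at the end gives an Inception-Score-style number, one common way to trade them off.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for c(y|x): per-sample class probabilities from a hypothetical
# pretrained classifier applied to 1000 generated samples, 10 classes.
logits = rng.normal(size=(1000, 10)) * 3.0
c_y_given_x = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

def entropy(p, axis=-1):
    return -np.sum(p * np.log(p + 1e-12), axis=axis)

# Sharpness: low average entropy of c(y|x) means the classifier is confident
# about each generated sample.
avg_conditional_entropy = entropy(c_y_given_x).mean()

# Diversity: high entropy of the marginal c(y) means predictions are spread
# over many classes rather than collapsed onto one.
c_y = c_y_given_x.mean(axis=0)
marginal_entropy = entropy(c_y)

print(f"avg H[c(y|x)] (lower = sharper):       {avg_conditional_entropy:.3f}")
print(f"H[c(y)]       (higher = more diverse): {marginal_entropy:.3f}")

# One common way to combine them (Inception-Score style):
# exp( H[c(y)] - E_x H[c(y|x)] ) = exp of the average KL(c(y|x) || c(y)).
score = np.exp(marginal_entropy - avg_conditional_entropy)
print(f"combined score: {score:.3f}")
```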
54:00 Notation: x and x' are not global variable names, they are bound locally within each expression
55:20? Reproducing kernel Hilbert space mapping: comparing kernel feature embeddings of the two sample sets is equivalent to comparing moments of the distributions
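One standard instance of this idea is the maximum mean discrepancy (MMD). The sketch below (toy numpy example) computes a biased MMD^2 estimate with a Gaussian kernel, i.e. the squared distance between the two sample sets' kernel mean embeddings, which implicitly compares infinitely many moments.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_kernel(a, b, sigma=1.0):
    # k(x, x') = exp(-||x - x'||^2 / (2 sigma^2)), evaluated for all pairs.
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2 * sigma ** 2))

def mmd2(x, y, sigma=1.0):
    # Biased estimate of MMD^2 = E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)],
    # the squared distance between the kernel mean embeddings of x and y.
    return (gaussian_kernel(x, x, sigma).mean()
            + gaussian_kernel(y, y, sigma).mean()
            - 2 * gaussian_kernel(x, y, sigma).mean())

real = rng.normal(loc=0.0, size=(500, 2))        # stand-in for real data
fake_good = rng.normal(loc=0.0, size=(500, 2))   # stand-in for a good generator
fake_bad = rng.normal(loc=2.0, size=(500, 2))    # stand-in for a poor generator

print(f"MMD^2(real, good fake): {mmd2(real, fake_good):.4f}")  # near 0
print(f"MMD^2(real, bad fake):  {mmd2(real, fake_bad):.4f}")   # clearly larger
```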
1:05:20? VAEs have a reconstruction loss embedded in their objective; GANs don't
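A minimal sketch of what "reconstruction loss embedded" means (toy numpy example; the linear encoder/decoder are hypothetical stand-ins for a trained VAE): a VAE maps x to a latent and back, so a per-example reconstruction error is always available, whereas a GAN generator only maps noise to samples and has no encoder to reconstruct with.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for a trained VAE's encoder and decoder means:
# a random linear encoder to a 4-d latent, and its pseudo-inverse as decoder.
W_enc = rng.normal(size=(4, 16))
W_dec = np.linalg.pinv(W_enc)

def encode(x):            # mean of q(z | x); the variance is omitted for brevity
    return x @ W_enc.T

def decode(z):            # mean of p(x | z)
    return z @ W_dec.T

x = rng.normal(size=(8, 16))                 # a batch of held-out "data"
recon_loss = np.mean((x - decode(encode(x))) ** 2)
print(f"per-example reconstruction error: {recon_loss:.3f}")
# A GAN generator maps noise z -> x with no encoder, so there is no analogous
# per-example reconstruction error to report.
```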
1:07:55 Impossible to measure disentanglement in unlabeled data