Stanford CS236: Deep Generative Models I 2023 I Lecture 14 - Energy Based Models

  • Published Nov 28, 2024

COMMENTS • 3

  • @CPTSLEARNER 6 months ago +3

    7:00 Sliced score matching is slower than denoising score matching because it requires taking derivatives
    13:45 The denoising objective favors small sigma, but the smallest sigma is not optimal for perturbing data when sampling
    27:15 Annealed Langevin, 1000 sigmas
    38:50 Fokker-Planck PDE, interdependence of scores; intractable, so treat the loss functions (scores) as independent
    45:00? Weighted combination of denoising score matching losses: estimate the score of the data perturbed at each sigma_i, then take a weighted combination of the resulting losses
    48:15 As efficient as estimating a single non-conditional score network, joint estimation of scores is amortized by a single score network
    49:50? Smallest to largest noise during training, largest to smallest noise during inference (Langevin)
    52:10? Notation: p_sigma_i is equivalent to the earlier q (density of the perturbed data)
    57:20 Mixture denoising score matching is expensive at inference time (Langevin steps), deep computation graph which doesn't have to be unrolled at training time (not generating samples during training)
    1:07:00 SDE describes perturbation iterations over time
    1:08:50 Inference time (largest to smallest noise) is described by the reverse SDE, which depends only on the score functions of the noise-perturbed data densities
    1:12:00 Euler-Maruyama discretizes time to numerically solve the SDE
    1:13:25 Numerically integrating SDE that goes from noise to data
    1:15:00? SDE and Langevin corrector
    1:20:25 Infinitely deep computation graph (refer to 57:20)
    1:21:45 Possible to convert SDE model to normalizing flow and get latent variables
    1:22:00 SDE can be described as ODE with same marginals
    1:23:15 This machinery defines a continuous-time normalizing flow whose invertible mapping is given by solving an ODE; ODE solution paths with different initial conditions can never cross (invertible, hence a normalizing flow); the model is trained not by maximum likelihood but by score matching; a flow with infinite depth (likelihoods can be obtained)
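
    The annealed Langevin procedure summarized above (train on all noise levels, then sample from largest to smallest sigma, per the 27:15 and 49:50 notes) can be sketched as follows. This is a minimal illustration, not the lecture's code: the closed-form `score` function is a stand-in for a trained noise-conditional score network (it is the exact score of unit-Gaussian data perturbed by Gaussian noise), and the schedule constants are assumptions.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def score(x, sigma):
        # Stand-in for a trained noise-conditional score network:
        # exact score of N(0, 1 + sigma^2), i.e. unit-Gaussian data
        # perturbed with Gaussian noise of standard deviation sigma.
        return -x / (1.0 + sigma ** 2)

    def annealed_langevin(n_samples=1000, n_sigmas=10, steps_per_level=200, eps=2e-5):
        # Noise schedule runs from largest to smallest sigma at inference time.
        sigmas = np.geomspace(1.0, 0.01, n_sigmas)
        # Initialize from the broadest perturbed density, N(0, 1 + sigma_max^2).
        x = rng.normal(0.0, np.sqrt(1.0 + sigmas[0] ** 2), size=n_samples)
        for sigma in sigmas:
            # Step size scaled by (sigma_i / sigma_L)^2 so the signal-to-noise
            # ratio of the updates stays roughly constant across levels.
            step = eps * (sigma / sigmas[-1]) ** 2
            for _ in range(steps_per_level):
                z = rng.normal(size=n_samples)
                x = x + step * score(x, sigma) + np.sqrt(2.0 * step) * z
        return x

    samples = annealed_langevin()
    print(samples.mean(), samples.std())  # should be close to N(0, 1) statistics
    ```

    Because each Langevin chain starts from samples of the previous (only slightly broader) noise level, a modest number of steps per level suffices; this is the point of annealing rather than running a single chain at the smallest sigma.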

  • @iamnotPi 11 days ago

    Can anyone suggest some books or courses for understanding SDE? I’m kinda new to this

  • @jchwenger 3 months ago +4

    Thanks for the great lectures! Tiny detail: shouldn't this one be called Noise Conditional Score Networks instead of Energy Based Models? Simply because it's the main topic of the lecture, and it would allow people to find it more easily when searching.