Scalable Bayesian Inference with Hamiltonian Monte Carlo

  • Published 15 Nov 2024

COMMENTS • 11

  • @zool0941
    @zool0941 1 year ago +1

    suuuucchhhh a great talk. really clear, thank you.

  • @nikitagupta9369
    @nikitagupta9369 7 years ago +3

    Loved the talk, helped understand the intuition of HMC. Thanks

  • @ProfessorBeautiful
    @ProfessorBeautiful 8 years ago +1

    Excellent talk; thank you. And yes, to respond to your question at the end, it was that clear.

  • @oflasch
    @oflasch 7 years ago +1

    Great talk, thank you!

  • @deepbayes6808
    @deepbayes6808 7 years ago +1

    Amazing talk.

  • @Samanthaz
    @Samanthaz 4 years ago

    Are your slides available? Perhaps with the lecture transcript for each slide?

  • @stevebez2767
    @stevebez2767 7 years ago

    Brands Hatch=Sim C egg,sam,pools? (Monile Radiation)

  • @alute5532
    @alute5532 1 year ago

    Biased inferences: in the wide-data regime we risk bias.
    Adding a prior as a regularizing system gives extra mathematical structure to help.
    The likelihood and prior together quantify our total information.
    Any statistical question is answered via manipulation of the posterior,
    and resorts to an expectation, reduced to computing an integral.
    We do numerical approximation,
    as calculating it exactly is hard in high dimension D.
    To find an expectation, identify where to focus our computation: where is most of the contribution to the expectation?
    For an interesting density, consider the volume (over that density).
    In high D there are lots of corners; it's hard.
    Volume increases exponentially fast.
    2 competing forces:
    1 Volume: wants us to focus on large radii q
    2 Density: wants us to focus on the mode
    For a Gaussian (i.e. normal) these balance out in the middle.
    The region of concentration is the typical set. Look at the surface around the mode.
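A quick numerical illustration of the typical-set note above (my own sketch, not from the talk; `radius_stats` is a name I made up): for a standard normal in d dimensions, mass concentrates in a thin shell at radius ~ sqrt(d), away from the mode at the origin.

```python
# Sketch (assumption: standard normal target in d dimensions): the distance
# from the mode concentrates around sqrt(d) as d grows -- the typical set.
import math
import random

def radius_stats(d, n=5000, seed=0):
    """Mean and spread of the distance from the mode over n draws."""
    rng = random.Random(seed)
    radii = []
    for _ in range(n):
        r2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(d))
        radii.append(math.sqrt(r2))
    mean = sum(radii) / n
    sd = math.sqrt(sum((r - mean) ** 2 for r in radii) / n)
    return mean, sd

for d in (1, 10, 100):
    mean, sd = radius_stats(d)
    print(f"d={d:3d}  mean radius {mean:.2f}  (sqrt(d)={math.sqrt(d):.2f})  spread {sd:.2f}")
```

The shell stays roughly the same width while its radius grows, which is the "volume vs. density" balance described above.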
    Markov chain: a way of finding and exploring sets like that.
    It's a random function τ:
    after each jump there is a new distribution of points for the next state.
    We get a Markov chain.
    If we can engineer the Markov chain to preserve our target distribution,
    the Markov chain takes us to the typical set (and starts exploring that surface).
    In high dimensions every point is far from the typical set.
    End result: a nice quantification of where the probability really is.
    To compute any function, average it over the Markov chain history, i.e. Markov chain Monte Carlo (MCMC).
    Run long enough, we ensure we always converge to the true expectation
    (always the right answer).
    Q1: how well can we do it?
    Q2: how quickly do we converge to the true expectation? If transitions are expensive, as with wide data,
    we exhaust our computational resources long before we complete the exploration.
    Partial exploration means bias (missing probability); lots of MCMC algorithms are like that.
    Metropolis:
    1 Proposal: add some noise
    2 Decision: accept or reject the proposal (based on where we came from)
    If it's closer to the mode, accept it;
    if it's away from the mode, reject it.
    In high D volume is weird: it doesn't scale, and outside the typical set there is much more volume.
    The only way out is to shrink the size of the perturbation to a really small neighborhood,
    but then we won't go anywhere, just tiny transitions.
    We end up with very inefficient exploration, very poor MCMC.
    So avoid guess-and-check where the acceptance probability is very small;
    use a transition that knows the shape of our surface (how to stay on the contour?).
    Need for automation.
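The two Metropolis steps above (propose by adding noise, then accept/reject) can be sketched as follows. This is my own minimal illustration, not code from the talk; the 1-D standard-normal target and the function names are assumptions.

```python
# Random-walk Metropolis sketch (assumed target: standard normal, 1-D).
import math
import random

def log_target(q):
    return -0.5 * q * q  # unnormalized log density of N(0, 1)

def metropolis(n_samples, step=1.0, q0=0.0, seed=0):
    rng = random.Random(seed)
    q, samples, accepts = q0, [], 0
    for _ in range(n_samples):
        # 1) proposal: perturb the current state with noise
        q_prop = q + rng.gauss(0.0, step)
        # 2) decision: accept with probability min(1, pi(q')/pi(q))
        if math.log(rng.random()) < log_target(q_prop) - log_target(q):
            q, accepts = q_prop, accepts + 1
        samples.append(q)
    return samples, accepts / n_samples

samples, accept_rate = metropolis(20000, step=1.0)
mean = sum(samples) / len(samples)
print(f"estimated mean {mean:.2f}, acceptance rate {accept_rate:.2f}")
```

In one dimension this works fine; the point of the notes is that in high dimensions the acceptance rate collapses unless `step` shrinks, which is exactly the inefficient "tiny transitions" regime.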
    How do we extract info about the surface?
    Hamiltonian MCMC uses differential geometry.
    Use a vector field: assign a direction to each point. If the direction is right, we don't guess anymore!
    Hence every new point leads to another on the same typical set.
    How: look at the density of the target function
    and take the gradient of that function.
    The gradient is also a vector field,
    but if we follow it, it leads to the mode (not useful).
    We can potentially correct the gradient,
    and differential geometry corrects the gradient automatically.
    Physics: a planet's orbit and its gravitational field.
    The missing piece is momentum: transverse motion keeps us from falling in;
    with too much momentum, gravity won't catch us at all!
    Key: add momenta in the right way.
    1 For every parameter q, expand with a momentum p
    2 Lift the target distribution up onto this space
    Find the probability structure pi(q, p):
    how? via a conditional distribution
    (for the momenta).
    End: a joint distribution over momenta and parameters.
    I can always recover the target distribution:
    project it down, i.e. get rid of the momenta.
    Using a symplectic integrator we can bound the errors of the transformation required from exact to approximate dynamics.
    Calculate how accurate the solution is by integrating over all the deviations.
    There is a trade-off between the cost of the algorithm and the step size.
    We end up with a lower bound and an upper bound (on the error); x = average acceptance probability,
    y = cost.
    For almost all models the relationship is bounded between the 2 lines.
    Between 0.6 and 0.8 the solution is nearly flat, near optimal.
    Choose the step size so that the average acceptance probability sits in 0.6-0.8.
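The momentum-augmented scheme with a symplectic (leapfrog) integrator and an accept/reject correction can be sketched like this. Again my own minimal illustration under assumptions not stated in the talk: a 1-D standard-normal target, Gaussian kinetic energy, and fixed step size and path length rather than automated tuning.

```python
# Minimal HMC sketch (assumed target: standard normal; U = -log density).
import math
import random

def U(q):       return 0.5 * q * q   # potential energy, -log N(0,1)
def grad_U(q):  return q             # its gradient

def leapfrog(q, p, step, n_steps):
    # Symplectic integrator: half-step momentum, full-step position, repeat.
    p -= 0.5 * step * grad_U(q)
    for _ in range(n_steps - 1):
        q += step * p
        p -= step * grad_U(q)
    q += step * p
    p -= 0.5 * step * grad_U(q)
    return q, p

def hmc(n_samples, step=0.2, n_steps=10, q0=0.0, seed=0):
    rng = random.Random(seed)
    q, samples, accepts = q0, [], 0
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)               # resample momentum: lift q -> (q, p)
        H0 = U(q) + 0.5 * p * p
        q_new, p_new = leapfrog(q, p, step, n_steps)
        H1 = U(q_new) + 0.5 * p_new * p_new
        # Accept/reject corrects the integrator error; in practice `step` is
        # tuned so the average acceptance probability lands near 0.6-0.8.
        if math.log(rng.random()) < H0 - H1:
            q, accepts = q_new, accepts + 1
        samples.append(q)  # projecting down: we keep q and discard p
    return samples, accepts / n_samples

samples, accept_rate = hmc(5000)
print(f"estimated mean {sum(samples)/len(samples):.2f}, acceptance {accept_rate:.2f}")
```

On this easy target the acceptance probability is close to 1 because the leapfrog error is tiny; the 0.6-0.8 target matters when the step size is pushed up to cut cost.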
    Intuition for how to:
    1 choose the kinetic energy
    2 choose the integration time
    3 choose the step size
    Fully automated.
    Decouples the 2 steps of inference:
    1 Modeling step: we choose prior and likelihood
    2 Computation step: compute those expectations
    No more step-size hand-tuning work;
    changing your model no longer means re-implementing it a different way or re-tuning your priors.
    Autodiff ensures exact computation of the necessary gradients:
    1 control statements (if/else)
    2 probability density functions (PDF, CDF)
    3 linear algebra (addition, multiplication, decomposition)
    4 ODEs (non-stiff and stiff)
    The space is equipped with a Lie group that gives a flow;
    on the typical set it is a measure-preserving flow.
    Adiabatic Monte Carlo:
    for multimodal distributions.