The Key Equation Behind Probability

  • Published Feb 1, 2025

COMMENTS • 309

  • @ArtemKirsanov
    @ArtemKirsanov  5 months ago +23

    Get 4 months extra on a 2 year plan here: nordvpn.com/artemkirsanov. It’s risk free with Nord’s 30 day money-back guarantee!

    • @anthonyrepetto3474
      @anthonyrepetto3474 5 months ago +1

      best explanation of KL-div I've heard on YouTube - thank you!

    • @pebbles7913
      @pebbles7913 4 months ago +1

      I beg you to talk about neuro-symbolic AI models; I find them very fascinating. AlphaGeometry has shown that combining different model types may be the way to go for AGI, similar to the brain.

    • @hyperduality2838
      @hyperduality2838 4 months ago

      Making predictions to track targets and goals is a syntropic process -- teleological.
      Teleological physics (syntropy) is dual to non teleological physics (entropy).
      Syntropy (prediction) is dual to increasing entropy -- the 4th law of thermodynamics!
      Certainty (predictability, syntropy) is dual to uncertainty (unpredictability, entropy) -- the Heisenberg certainty/uncertainty principle.
      Mind (syntropy) is dual to matter (entropy) -- Descartes or Plato's divided line.
      "The brain is a prediction machine" -- Karl Friston, neuroscientist.
      "Always two there are" -- Yoda.

  • @mounirgharsallah1263
    @mounirgharsallah1263 5 months ago +67

    So far, one of the best and clearest videos about entropy and KL divergence.
    Good motion design too.

  • @gonzalopolo2612
    @gonzalopolo2612 4 months ago +39

    Wow, what an amazing video and explanation! I love how you derived KL divergence (also known as relative entropy) from cross-entropy and entropy.
    It's interesting to note that historically, these ideas were actually discovered in the reverse order. Kullback and Leibler introduced the concept of "information gain" or "relative entropy" (now known as KL divergence) in their 1951 paper "On Information and Sufficiency," building on Shannon's earlier work on entropy. The explicit use of cross-entropy as a separate concept came later as far as I know.
    Your explanation really helps in understanding these interconnected ideas. Thank you for this excellent content!
    A variational inference video of this quality would be simply incredible (I really cannot imagine the amount of effort this requires). Again, thank you for this.

    • @ArtemKirsanov
      @ArtemKirsanov  4 months ago +5

      Thank you so much!

  • @TurinBeats
    @TurinBeats 5 months ago +108

    This is Artem Kirsanov's golden year. Posting banger after banger. Much love, your videos are a gem

  • @MathOnMain
    @MathOnMain 5 months ago +46

    The most succinct explanation of entropy I have heard, the explanation of cross-entropy was very insightful too

    • @hyperduality2838
      @hyperduality2838 4 months ago

      Making predictions to track targets and goals is a syntropic process -- teleological.
      Teleological physics (syntropy) is dual to non teleological physics (entropy).
      Syntropy (prediction) is dual to increasing entropy -- the 4th law of thermodynamics!
      Certainty (predictability, syntropy) is dual to uncertainty (unpredictability, entropy) -- the Heisenberg certainty/uncertainty principle.
      Mind (syntropy) is dual to matter (entropy) -- Descartes or Plato's divided line.
      "The brain is a prediction machine" -- Karl Friston, neuroscientist.
      "Always two there are" -- Yoda.

    • @MathOnMain
      @MathOnMain 4 months ago

      @@hyperduality2838 It's not my intention to be rude, but your comment is only very loosely related to mine. May I ask why you typed it as a reply to me? Was your intention to reply to someone else? Were you trying to leave a comment rather than replying to mine? Regardless, I wish you a nice day.

    • @hyperduality2838
      @hyperduality2838 4 months ago

      @@MathOnMain I am just informing you that there is a 4th law of thermodynamics based upon teleology.
      "Philosophy is dead" -- Stephen Hawking.
      Main stream physics and therefore science is currently dominated by teleophobia and eliminative materialism hence a new law based upon teleology is not going to be popular.
      Teleophilia is dual to teleophobia.
      Alive is dual to not alive -- Schrodinger's cat.
      Being is dual to non being creates becoming -- Plato's cat.
      Stephen Hawking accepted the metaphysics of Schrodinger's cat so philosophy is not dead!
      Syntax is dual to semantics -- languages, communication or information (surprise).
      If mathematics is a language then it is dual.
      Stephen Hawking was a mathematician he was using duality.
      Messages in a communication system are predicted into existence using probability -- Shannon's information theory, and making predictions is a syntropic process!
      "We predict ourselves into existence" -- Anil Seth, neuroscientist.
      Predicting ourselves into existence is a syntropic process hence there is a 4th law of thermodynamics!
      Synchronic points/lines are dual to enchronic points/lines.
      Points are dual to lines -- the principle of duality in geometry.
      Objective information (syntax) is dual to subjective information (semantics) -- information is dual.
      All Information (surprise) has structure or form and meaning hence it is dual.
      Absolute truth is dual to relative truth -- Hume's fork.
      I assumed that you are interested in absolute truth.

  • @blaine_stl
    @blaine_stl 1 month ago +1

    This is the absolute best video I’ve come across that explains the concepts of entropy, cross entropy, and KL divergence in the most ground up methodical way, great stuff

  • @andreapanuccio295
    @andreapanuccio295 5 months ago +13

    This video is the new gold standard of stats introductions for dummies; so clear and informative. I'll link it to the next person I find at the beginning of their stats journey! Thanks, bro, for what you do.

    • @saralatreche6948
      @saralatreche6948 3 months ago +1

      This is not an introduction for dummies but an explanation by a brilliant communicator.

  • @finnrobertson2592
    @finnrobertson2592 5 months ago +40

    I'd be very keen for a video about variational inference. Have been loving your content

  • @klikkolee
    @klikkolee 5 months ago +249

    Personally, after the third time someone predicted the die roll, I would be exponentially more surprised than after the first time

    • @George70220
      @George70220 5 months ago +12

      You misunderstand why he said that surprisal is additive

    • @JackDespero
      @JackDespero 5 months ago +58

      @@George70220 Because the notion of surprise is confusing in this context. It is a poor name for it. Humans are very bad at reasoning about random events, and human surprise does not follow an entropic distribution.

    • @Eta_Carinae__
      @Eta_Carinae__ 5 months ago +3

      The additivity allows us to say that H(P,Q) = H(P) + KL(P,Q)
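
The identity in this reply can be checked numerically. Below is a minimal sketch, assuming discrete distributions given as plain lists and natural logarithms; the function names and the two example distributions are illustrative and not taken from the video.

```python
import math

def entropy(p):
    # H(P) = sum_s p_s * log(1 / p_s); states with p_s = 0 contribute nothing
    return sum(ps * math.log(1.0 / ps) for ps in p if ps > 0)

def cross_entropy(p, q):
    # H(P, Q) = sum_s p_s * log(1 / q_s)
    return sum(ps * math.log(1.0 / qs) for ps, qs in zip(p, q) if ps > 0)

def kl_divergence(p, q):
    # KL(P || Q) = sum_s p_s * log(p_s / q_s)
    return sum(ps * math.log(ps / qs) for ps, qs in zip(p, q) if ps > 0)

# Hypothetical example: a fair coin P and a rigged-coin model Q
p = [0.5, 0.5]
q = [0.9, 0.1]

lhs = cross_entropy(p, q)
rhs = entropy(p) + kl_divergence(p, q)
print(lhs, rhs)  # the two values agree up to floating-point rounding
```

Because the logarithm turns the ratio p_s / q_s into a difference of surprisals, the cross-entropy splits term by term into the entropy part and the KL part.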

    • @kamilrichert8446
      @kamilrichert8446 5 months ago +5

      Yea, gambler's fallacy does make quite a difference in real life interpretation of random events

    • @user-qw1rx1dq6n
      @user-qw1rx1dq6n 5 months ago +6

      The main thing is that surprisal is 0 at p = 1 and infinite at p = 0.
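
The two properties discussed in this thread, additivity over independent events and the boundary values at p = 1 and p = 0, can be sketched as follows. This assumes surprisal is defined as log(1/p) with the natural logarithm; the die-guessing numbers are an illustrative assumption.

```python
import math

def surprisal(p):
    # Surprisal of an event with probability p; treated as infinite when p = 0
    if p == 0.0:
        return math.inf
    return math.log(1.0 / p)

# Probability of correctly guessing one fair die roll
p_guess = 1.0 / 6.0

# Three independent correct guesses: probabilities multiply, surprisals add
joint = surprisal(p_guess ** 3)
summed = 3 * surprisal(p_guess)

print(joint, summed)    # equal up to rounding
print(surprisal(1.0))   # a certain event carries no surprise
print(surprisal(0.0))   # an "impossible" event is infinitely surprising
```

The probability of three correct guesses shrinks multiplicatively, while the surprisal grows linearly in the number of guesses, which is the sense of "additive" used here.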

  • @drhxa
    @drhxa 4 months ago +2

    Very impressive how clearly you explained entropy, cross-entropy, and KL divergence with the idea of surprise and great visuals. Well done and thank you for this

  • @FsimulatorX
    @FsimulatorX 5 months ago +3

    This is really good, high-quality content! You have no idea: this is the first time in a very long time that I've been able to watch a math YouTube video from start to finish.

  • @StratosFair
    @StratosFair 1 month ago +1

    Excellent video, from the explanations to the quality of the visualizations and animations, everything was top notch!

  • @PhilipFabianek
    @PhilipFabianek 1 month ago

    Thank you for this video, the explanations are crystal clear. Not only did this make me understand the cross entropy loss better, but it also made me understand quite a lot of the theory surrounding entropy.

  • @andrewgrebenisan6141
    @andrewgrebenisan6141 5 months ago +1

    Brilliant, Artem! Your videos are the crown jewels of ML educational content on YouTube!! So intuitive!

    • @hyperduality2838
      @hyperduality2838 4 months ago

      Concepts are dual to percepts -- the mind duality of Immanuel Kant.
      Making predictions to track targets and goals is a syntropic process -- teleological.
      Teleological physics (syntropy) is dual to non teleological physics (entropy).
      Syntropy (prediction) is dual to increasing entropy -- the 4th law of thermodynamics!
      Certainty (predictability, syntropy) is dual to uncertainty (unpredictability, entropy) -- the Heisenberg certainty/uncertainty principle.
      Mind (syntropy) is dual to matter (entropy) -- Descartes or Plato's divided line.
      "The brain is a prediction machine" -- Karl Friston, neuroscientist.
      "Always two there are" -- Yoda.

  • @zakozakaria
    @zakozakaria 3 months ago

    I am addicted to this channel. I love how you articulate every subject to explain these concepts. Thank you so much, the world really needs more people like you!!! ♥

  • @robertocardenas5006
    @robertocardenas5006 2 months ago +1

    This is unbelievable, I loved it, now I can see how the KL divergence can be used to correct our approximate models, and how we aim at minimizing this divergence to get closer to the hidden distribution, beautiful!

  • @toxicore1190
    @toxicore1190 5 months ago +1

    perfect timing for this content, I was looking for more intuitive explanations of cross-entropy lately, thanks!

  • @brucerosner3547
    @brucerosner3547 4 months ago

    This is the clearest explanation of probability and entropy I've ever seen. Please create more videos.

  • @jimcallahan448
    @jimcallahan448 5 months ago +57

    Cross entropy explains magic shows.
    The magician is trying to get the audience to believe the wrong model ("Nothing up my sleeve") and therefore be surprised (and hopefully delighted) by the outcome.

    • @celaleddinomersaglam6811
      @celaleddinomersaglam6811 4 months ago +1

      Basic Pledge-Turn-Prestige process:)

  • @noagarnett
    @noagarnett 25 days ago

    I need a video about ELBO derivation! No matter how many times I read, I have trouble fully and intuitively understanding. Your way of explaining seems like exactly what I needed - after the (Restricted) Boltzmann machine video (a concept I struggled with for years!) it could be of great value to me. And thanks again for this and the other videos. They're great.

  • @neurosync_x
    @neurosync_x 5 months ago

    This is the most lucid unpacking of these concepts that I've ever seen. Great work as always.
    I would definitely look forward to a variational inference video

  • @samkee3859
    @samkee3859 5 months ago +27

    Keep posting please!

  • @OscarKeats-rs4tk
    @OscarKeats-rs4tk 26 days ago

    Great job on this. Helped me understand better than anything else out there.

  • @JavierFausLlopis
    @JavierFausLlopis 5 months ago

    Perfect presentation, only tiny error at 19:18 in the lower right corner where for a second the word Tails changes to Heads right at the slide change. Thank you very much for your great work, I think we all really appreciate it.

  • @1st_ProCactus
    @1st_ProCactus 5 months ago +7

    This, to me at least, is fundamental as a concept. But seeing people put effort into developing the maths to explain it is amazing.

    • @hyperduality2838
      @hyperduality2838 4 months ago

      Concepts are dual to percepts -- the mind duality of Immanuel Kant.
      Making predictions to track targets and goals is a syntropic process -- teleological.
      Teleological physics (syntropy) is dual to non teleological physics (entropy).
      Syntropy (prediction) is dual to increasing entropy -- the 4th law of thermodynamics!
      Certainty (predictability, syntropy) is dual to uncertainty (unpredictability, entropy) -- the Heisenberg certainty/uncertainty principle.
      Mind (syntropy) is dual to matter (entropy) -- Descartes or Plato's divided line.
      "The brain is a prediction machine" -- Karl Friston, neuroscientist.
      "Always two there are" -- Yoda.

    • @1st_ProCactus
      @1st_ProCactus 4 months ago +1

      @@hyperduality2838 I think you replied to the wrong comment

    • @hyperduality2838
      @hyperduality2838 4 months ago

      @@1st_ProCactus I am just informing you that there is a 4th law of thermodynamics based upon teleology.
      "Philosophy is dead" -- Stephen Hawking.
      Main stream physics and therefore science is currently dominated by teleophobia and eliminative materialism hence a new law based upon teleology is not going to be popular.
      Teleophilia is dual to teleophobia.
      Alive is dual to not alive -- Schrodinger's cat.
      Being is dual to non being creates becoming -- Plato's cat.
      Stephen Hawking accepted the metaphysics of Schrodinger's cat so philosophy is not dead!
      Syntax is dual to semantics -- languages, communication or information (surprise).
      If mathematics is a language then it is dual.
      Stephen Hawking was a mathematician he was using duality.
      Messages in a communication system are predicted into existence using probability -- Shannon's information theory, and making predictions is a syntropic process!
      "We predict ourselves into existence" -- Anil Seth, neuroscientist.
      Predicting ourselves into existence is a syntropic process hence there is a 4th law of thermodynamics!
      Synchronic points/lines are dual to enchronic points/lines.
      Points are dual to lines -- the principle of duality in geometry.
      Objective information (syntax) is dual to subjective information (semantics) -- information is dual.
      All Information (surprise) has structure or form and meaning hence it is dual.
      Absolute truth is dual to relative truth -- Hume's fork.
      I assumed that you are interested in absolute truth.
      Comments, messages, languages are dual and the cells, neurons in your body are all communicating with each other all the time hence you are built out of duality.
      Duality creates reality.

  • @vastabyss6496
    @vastabyss6496 5 months ago

    About a year ago, I was building a neural network library from scratch, but I was having some trouble understanding some of the intuition behind binary cross-entropy. Watching this video was the first time I felt like I actually understood the concepts behind cross-entropy!

  • @F.H-i9c
    @F.H-i9c 5 months ago +1

    Thank you!

  • @srinjandutta121
    @srinjandutta121 5 months ago +1

    You answered a lot of my questions, the ones I had before the video and also the ones I was having while watching it. Thank you ❤❤

  • @indigoriviera
    @indigoriviera 27 days ago

    Incredibly well explained! This was so clear and informative. Thank you for the time and effort you put into your videos.

  • @collinmccarthy
    @collinmccarthy 2 months ago

    This was amazing, thank you so much! I would LOVE to see a video on VAEs / ELBO or the various ways we can use generative models by trying to learn the joint distribution.

  • @vanhoheneim
    @vanhoheneim 3 months ago

    That's one of the best videos I have ever seen on the probabilistic foundations of ML!

  • @bingyanliu6370
    @bingyanliu6370 4 months ago

    Thanks!

  • @xyzct
    @xyzct 5 months ago +1

    Simple to follow, and crystal clear. Well done!

  • @jimcallahan448
    @jimcallahan448 5 months ago +3

    When we get tired of statistical cookbook procedures it is great to go back to the underlying probability logic (it's probability all the way down). This applies not only to machine learning models but to all types of statistical models. For example, in regression models when one studies the "residual" error component. It also explains science stories when an experiment produces a surprising result and first a scientist and then the scientific community realizes that they need a new theory (model) -- a "scientific revolution". A scientific revolution should occur when the K-L divergence of experimental results is large (or else they need to find a mundane explanation for the experimental error).

  • @davide9242
    @davide9242 4 months ago

    Amazing video. It's so rare finding such quality nowadays

  • @andytroo
    @andytroo 4 months ago

    A good example for cross-entropy (or mutual information) is asking "how well aligned are the two images?" when one image is a black square on a white background and the other is a white square on a black background.
    Minimising surprise by some measures means ensuring white=white and black=black, forcing the squares not to touch; other measures force the two squares on top of each other, so that each colour is related to only one colour in the other picture.

  • @SelfBuiltWealth
    @SelfBuiltWealth 5 months ago +1

    OMG, thank you so much! I was looking to understand this subject and searched you up because you are the best out there!❤

  • @AlexBerg1
    @AlexBerg1 5 months ago +2

    What an amazing introduction to this point of view of probability!

  • @MrFarber31
    @MrFarber31 4 months ago +1

    Never skip a Kirsanov video

  • @abdurrezzakefe5308
    @abdurrezzakefe5308 5 months ago

    Great content! I can see multiple concepts used from FEP, greatly presented and explained. Keep up the good work man, hope we get to have a chat some day about these!

  • @vickymar3836
    @vickymar3836 20 days ago

    Best video about cross entropy. Succinct and comprehensive!!

  • @benfrank6520
    @benfrank6520 5 months ago +2

    Your animation is so incredible, I even watched the ad.

  • @tantzer6113
    @tantzer6113 5 months ago +1

    Thank you so much for this fantastically clear and useful video! Question about the formula for cross-entropy: you gave an example that showed that the asymmetry of the formula makes sense (the example of assuming a fair coin versus assuming a rigged coin). However, you didn't "motivate" the formula for cross entropy. I mean, if somebody asked me to derive the formula, I wouldn't know how to explain the choice of the positions of p_s versus q_s in the formula and why they shouldn't be switched. What explains the choice? What explains why we pick one for the weights/coefficients of the probability terms and the other as the arguments of the natural logarithms?

    • @ArtemKirsanov
      @ArtemKirsanov  5 months ago +3

      Thanks! Sure, let me explain.
      When we believe in the model Q, with q_s representing the assumed probability of state s, each time we observe an instance of s, we experience log(1/q_s) units of surprisal. As we continue to observe instances, we might encounter different states (like s2) with different probabilities under our model, resulting in different amounts of surprise (log(1/q_s2), etc.).
      Now, the crucial question is: What generates these samples? What determines which state we're going to observe? Since we're observing the process unfold in the real world, these samples are actually distributed according to P, with each state s having a probability p_s of occurring.
      So, to break down the formula:
      1) The term inside the logarithm (1/q_s) represents our surprise, which only makes sense in the context of a particular belief about probability - our internal model. This is why Q is associated with the log term.
      2) When we calculate the expected value of this subjective surprise, we need to account for how often each state actually occurs. The random variable itself is governed by P, so we use p_s as weighting coefficients.
      In essence, p_s * log(1/q_s) gives us the contribution to the average surprise: how often a state occurs (p_s) multiplied by how surprised we are when it does occur (log(1/q_s)).
      Hope this clarifies things a bit!
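
The explanation above can be illustrated with a small simulation. This is a minimal sketch, not code from the video: samples are drawn from the true distribution P, each one is scored with the surprisal log(1/q_s) under the believed model Q, and the running average is compared with the formula sum_s p_s * log(1/q_s). The distributions, the sample count, and the function names are illustrative assumptions.

```python
import math
import random

def cross_entropy(p, q):
    # Exact formula: weight the model's surprisal log(1/q_s) by the true frequency p_s
    return sum(ps * math.log(1.0 / qs) for ps, qs in zip(p, q) if ps > 0)

def sampled_average_surprise(p, q, n_samples, seed=0):
    # The world generates states according to P ...
    rng = random.Random(seed)
    states = list(range(len(p)))
    draws = rng.choices(states, weights=p, k=n_samples)
    # ... but each observed state is scored with the surprisal under the belief Q
    return sum(math.log(1.0 / q[s]) for s in draws) / n_samples

# Hypothetical setup: the coin is actually fair (P), but we believe it is rigged (Q)
p = [0.5, 0.5]
q = [0.9, 0.1]

exact = cross_entropy(p, q)
estimate = sampled_average_surprise(p, q, n_samples=100_000)
swapped = cross_entropy(q, p)  # roles reversed: a rigged coin scored by a fair-coin belief

print(exact, estimate, swapped)
```

The sampled average approaches the exact value as the number of samples grows, and swapping the roles of P and Q gives a different number, which is why the positions of p_s and q_s in the formula are not interchangeable.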

    • @tantzer6113
      @tantzer6113 5 months ago

      This is a great explanation. Thank you so much!

  • @MichelCarroll
    @MichelCarroll 5 months ago +2

    Keep up the videos. They are really excellent and intuitive

  • @jimlbeaver
    @jimlbeaver 5 months ago +1

    Great job explaining this. I’ve always had difficulty getting an intuition around it

  • @ResmungoCoder
    @ResmungoCoder 5 months ago

    You really know how to explain complex things in simple ways. Thank you for this video!

  • @Carrymejane
    @Carrymejane 5 months ago +27

    You called to my soul when you mentioned entropy in the title

  • @DistortedV12
    @DistortedV12 5 months ago

    I have 4 ideas: 1) causal representation learning, 2) the difference between PCA, ICA and spectral embedding, 3) the difference between singular value decomposition and eigenvalue decomposition, and 4) Bayes' equation. I love how you explain things, and in this video you did indeed beautifully illustrate one of the most important equations in DL. This is coming from a PhD student in machine learning.

  • @cerioscha
    @cerioscha 5 months ago

    Great video, and not an easy one to make. Delighted to see you position and reference the Free Energy Principle.

  • @sudiptochatterjee8996
    @sudiptochatterjee8996 3 months ago

    Really liked the way you motivated the definition of entropy, thanks a lot

  • @strategyschool
    @strategyschool 2 months ago

    Every single person interested in Business Administration (as a student or a professional) should be good at probability and statistics.
    Very instructive video.
    Thanks a lot.

  • @chaowang8229
    @chaowang8229 5 months ago

    Thank you for your video. I think it is very helpful for understanding the basic ideas behind training generative machine learning models. By the way, we are extremely interested in the concepts (latent-variable models, variational inference, ELBO, etc.) that you mentioned at the end of this video. I would be grateful if you made some videos introducing these concepts.

  • @tirthasg
    @tirthasg 5 months ago

    Love your explanations & visualisations. You should consider making a course on the mathematics of ML, and DL. We would definitely tune in!

  • @SN-uc3vr
    @SN-uc3vr 5 months ago +1

    Amazing video! Please make more on topics like variational inference, ELBO, etc!!!

  • @ed.puckett
    @ed.puckett 5 months ago

    Thank you for your clearly presented video. I wanted to comment on my own experience with KL divergence. First, I note that KL divergence is not a metric; it fails the triangle inequality, for example. This leads to discontinuities of a sort when doing inference, for example when trying to infer the so-called "even process" from its outputs, and it causes a more complex model to be inferred when KL divergence is used as a comparator for distributions.
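
The claim that KL divergence is not a metric can be checked on a small example. The sketch below uses three Bernoulli distributions chosen purely for illustration (they are assumptions, not taken from the comment or the video) and shows both a violation of the triangle inequality and the lack of symmetry.

```python
import math

def kl_divergence(p, q):
    # KL(P || Q) = sum_s p_s * log(p_s / q_s), natural logarithm
    return sum(ps * math.log(ps / qs) for ps, qs in zip(p, q) if ps > 0)

def bernoulli(a):
    # Two-state distribution with probability a for the first state
    return [a, 1.0 - a]

# Hypothetical distributions picked to expose the violation
P, Q, R = bernoulli(0.5), bernoulli(0.25), bernoulli(0.1)

direct = kl_divergence(P, R)                         # going straight from P to R
via_q = kl_divergence(P, Q) + kl_divergence(Q, R)    # going through Q

print(direct, via_q)                                 # direct exceeds the two-step sum
print(kl_divergence(P, Q), kl_divergence(Q, P))      # the two directions differ
```

A true distance would require the direct value to be no larger than the two-step sum and the two directions to agree; neither holds here.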

  • @moisesbessalle
    @moisesbessalle 5 months ago +1

    Yes please do videos on Bayesian inference! This video was great!

  • @bordercollie-black
    @bordercollie-black 2 months ago

    Thank you for describing what entropy is. I want to learn about latent-variable models and VAEs. When will you upload videos on those topics?

  • @BenjaminEvans316
    @BenjaminEvans316 3 months ago

    Great animations in the video with a good level of detail.

  • @SorryNothingToSeeHere
    @SorryNothingToSeeHere 5 months ago

    I just recently ran across your videos. I would love to see videos relating to search iterative inference and space reduction. In this video you mention gradient descent, which I have not looked over. I'll browse a bit and look for other optimization topics. Thanks a lot!

  • @davidattias6282
    @davidattias6282 5 months ago

    Underrated YouTube channel. Thanks boss

  • @theodoreshachtman9990
    @theodoreshachtman9990 3 months ago

    Your work is a gift to the world. Thank you!

  • @simonstrandgaard5503
    @simonstrandgaard5503 5 months ago +1

    Beautiful visuals and well presented.

  • @Boredguy112
    @Boredguy112 4 months ago

    Thanks Artem, Please go into Variational inference or even Free energy too! very interested!

  • @NGBigfield
    @NGBigfield 4 months ago

    That's a great way to explain KL-Divergence. I never understood the parameters until now!
    thanks!

  • @nigelsheldon5753
    @nigelsheldon5753 6 days ago

    Thank you very much, I understand it much better now. I am clearly going to check out the other videos, they seem awesome! Time to go back to Stanford CS299 :)

  • @hdcamsit2144
    @hdcamsit2144 3 months ago

    You should do a video that builds on this explanation of entropy and cross-entropy to explain what perplexity is.

  • @hr5651
    @hr5651 5 months ago

    Please do a video on the other techniques (in your outro/conclusion). On YouTube, there are already a lot of videos explaining broadly how generative models work, but not so many on the details of the techniques behind them. It could be interesting.

  • @lush93yt
    @lush93yt 1 month ago

    Thank you very much for your explanation, Artem Kirsanov!

  • @robertmotsch7535
    @robertmotsch7535 5 months ago +1

    A truly excellent presentation!

  • @ellysian
    @ellysian 4 months ago

    Amazing and clear explanation, thank you for the video!
    Since we are optimizing for the cross-entropy and not KL divergence in practice, this means that although optimizing for both functions is the same process, the ideal loss that we are expecting is not 0?
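
The question above can be checked numerically. The following is a minimal sketch, assuming small made-up discrete distributions `p` and `q` (not taken from the video) and natural logarithms. It illustrates the identity H(P, Q) = H(P) + D_KL(P || Q): the cross-entropy is smallest when Q = P, and that minimum equals the entropy H(P), which is generally not 0.

```python
import math

def entropy(p):
    # H(P) = sum_s p_s * log(1 / p_s), in nats
    return sum(pi * math.log(1.0 / pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    # H(P, Q) = sum_s p_s * log(1 / q_s), in nats
    return sum(pi * math.log(1.0 / qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    # D_KL(P || Q) = sum_s p_s * log(p_s / q_s), in nats
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Illustrative distributions (assumed for this example only).
p = [0.7, 0.2, 0.1]   # "true" distribution
q = [0.5, 0.3, 0.2]   # model distribution

h_p = entropy(p)
h_pq = cross_entropy(p, q)
kl = kl_divergence(p, q)
h_pp = cross_entropy(p, p)  # cross-entropy of a perfect model

print(f"H(P)         = {h_p:.4f}")
print(f"H(P, Q)      = {h_pq:.4f}")
print(f"H(P) + KL    = {h_p + kl:.4f}")
print(f"H(P, P)      = {h_pp:.4f}")
```

A perfect model drives the KL term to 0 but leaves H(P) behind. The floor of the loss is 0 only when P itself has zero entropy, for example with one-hot labels.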

  • @Infraredchili
    @Infraredchili 5 months ago +1

    Great video. I'm really interested in latent-variable models, ELBO, VAE and so on

  • @mauroabidalcarrer4083
    @mauroabidalcarrer4083 5 months ago

    Awesome video, can't wait for the next one on the fundamentals of probability

  • @bingyanliu6370
    @bingyanliu6370 4 months ago

    Thank you for answering my long-standing questions

  • @heavenrvne888
    @heavenrvne888 5 months ago

    I'd really like to see a video from you on variational inference

  • @Jamie-my7lb
    @Jamie-my7lb 4 months ago +1

    I’d like to better understand what kinds of operations are allowed with probability distributions. For example I think that we never really have access to the true distribution P in any real world scenario. All we have access to are models, Q. By sampling events from P, we can estimate the cross entropy. However I don’t see how P can ever be actually known.
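
The sampling idea in the comment above can be sketched as follows. This is an illustration under stated assumptions: the dictionaries `p` and `q` and the helper `estimate_cross_entropy` are hypothetical names invented for this example, and `p` is used only to generate samples and to check the answer, standing in for a process we could not inspect in practice.

```python
import math
import random

def estimate_cross_entropy(samples, q):
    # Monte Carlo estimate of H(P, Q): average surprise log(1 / q(x))
    # over samples x drawn from P. P itself never appears in the formula.
    return sum(math.log(1.0 / q[x]) for x in samples) / len(samples)

rng = random.Random(0)  # fixed seed so the run is reproducible

# Hidden "true" distribution (assumed here; unknown in a real setting).
p = {"a": 0.7, "b": 0.2, "c": 0.1}
# Our model of it.
q = {"a": 0.5, "b": 0.3, "c": 0.2}

# All we observe in practice: events sampled from P.
samples = rng.choices(list(p), weights=list(p.values()), k=100_000)

estimate = estimate_cross_entropy(samples, q)
# Exact value, computable only because this toy example knows P.
exact = sum(p[x] * math.log(1.0 / q[x]) for x in p)

print(f"estimated H(P, Q) = {estimate:.4f}")
print(f"exact H(P, Q)     = {exact:.4f}")
```

The estimate needs only samples and the model Q, which is why cross-entropy can be used as a training loss even though P is never known.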

  • @robinbreslin1626
    @robinbreslin1626 4 months ago

    Very good description of entropy and cross entropy (and what on earth KL divergence is).

  • @tildarusso
    @tildarusso 3 months ago

    Very well explained. These essential principles are vital for deep understanding of modern AI technologies.

  • @allenlu2007
    @allenlu2007 5 months ago +1

    Excellent video! Is cross-entropy minimization analogous to finding the lowest energy in a Hopfield network? And for variational inference, as in a variational classifier (or VAE), is minimizing the cross-entropy plus a regularization term analogous to lowest energy plus free energy in a Boltzmann machine?

  • @indigoriviera
    @indigoriviera 14 days ago

    A small confusion -- there is a formula for cross-entropy loss at 24:29 where p_s seems to be missing from the usual term p_s * log(1/q_s). The formula on the slide seems to be minimizing the total "surprise" of the neural net distribution instead of the cross entropy loss. Am I missing something?
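
One possible reading of the point above, offered as an assumption about the slide rather than a statement of what it shows: if the sum runs over N training samples x_1, ..., x_N drawn from P, rather than over all states s, the weight p_s is supplied implicitly by how often each state occurs in the data. In the notation of the comment:

```latex
H(P, Q) \;=\; \sum_{s} p_s \log\frac{1}{q_s}
\;=\; \mathbb{E}_{x \sim P}\!\left[\log\frac{1}{q_x}\right]
\;\approx\; \frac{1}{N} \sum_{i=1}^{N} \log\frac{1}{q_{x_i}},
\qquad x_i \sim P .
```

Under that reading, the average surprise of the model on the data is a sample estimate of the cross-entropy, so minimizing one minimizes the other, and dropping the constant factor 1/N does not change the minimizer.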

  • @ravenecho2410
    @ravenecho2410 5 months ago +1

    Fantastic explanation!!

  • @GeoffryGifari
    @GeoffryGifari 5 months ago +1

    Gotta say, you're really good at teaching

  • @alexjensen990
    @alexjensen990 5 months ago +1

    That was really well done. Thank you for making this video.

  • @mathieudespriee6646
    @mathieudespriee6646 5 months ago

    Great video. Interested in latent-variable content.

  • @bwowekeith4472
    @bwowekeith4472 5 months ago +1

    Thank you. I am very surprised with the explanation. It is a masterpiece

  • @liucara8548
    @liucara8548 5 months ago +1

    Best cross-entropy explanation!

  • @trisinogy
    @trisinogy 4 months ago

    Excellent graphics, superb explanation! Thank you! Keep up the good work! You gained a sub!

  • @ophthojooeileyecirclehisha4917
    @ophthojooeileyecirclehisha4917 4 months ago

    Thank you for your hard work, dedication, mathematics, statistics, science, kindness, and generosity. May GOD reward you

  • @aitjellal
    @aitjellal 5 months ago

    Excellent work, thank you
    Please create a video about the training of generative models

  • @JTan-fq6vy
    @JTan-fq6vy 2 months ago

    At 6:01, the video introduces the notion of a distribution right after the Bayesian definition of probability. I am wondering why only the Bayesian approach provides such a view (of distributions), and not the frequentist one?

  • @TheMikeMassengale
    @TheMikeMassengale 3 months ago

    Time to go read Hitchhiker's Guide again. Nice video!

  • @brunolevy6261
    @brunolevy6261 5 months ago

    brilliant and enlightening explanations - ty

  • @alexandercannon3329
    @alexandercannon3329 2 months ago

    Great Video. One question - I see that the entropy of a fair coin is approx. 0.7 (13:37) but shouldn't it be 1.0? Apologies if I am missing something.
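
A quick check of the two numbers in the question above. This is a minimal sketch, and the suggestion that the video uses natural logarithms is an assumption; it would be consistent with the value of about 0.7, since ln 2 ≈ 0.693. The same fair coin has entropy 1.0 when the logarithm is taken in base 2.

```python
import math

def entropy(p, base=math.e):
    # H(P) = sum_s p_s * log_base(1 / p_s)
    # base=e gives nats, base=2 gives bits.
    return sum(pi * math.log(1.0 / pi, base) for pi in p if pi > 0)

fair_coin = [0.5, 0.5]

h_nats = entropy(fair_coin)          # natural log: about 0.693 nats
h_bits = entropy(fair_coin, base=2)  # log base 2: 1 bit

print(f"fair coin entropy: {h_nats:.3f} nats = {h_bits:.3f} bits")
```

The two values describe the same quantity in different units: one bit equals ln 2 nats.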

  • @dejavus
    @dejavus 18 days ago +1

    Big thanks to you. Now I understand why they always say that there is statistics lying underneath all these deep learning models.

  • @ForgottenInForest
    @ForgottenInForest 5 months ago +1

    Did you omit the minus sign before the log intentionally for simplicity?

    • @ArtemKirsanov
      @ArtemKirsanov  5 months ago

      Yeah, I felt like log(1/p) is conceptually simpler than -log(p), because there are a bit more cognitive steps to recall that log(p) would be a negative number (since p

  • @CouthZero
    @CouthZero 5 months ago

    I missed having some references that we could dig into. Btw, fascinating topic and awesome video!

  • @wubwub616
    @wubwub616 4 months ago

    Can you make more videos about cognitive constructs such as working memory, attention, cognitive control, or the interference of all of them, attentional control working memory?

  • @MultiNeurons
    @MultiNeurons 5 months ago +1

    Very interesting and well explained

  • @thecaveman2871
    @thecaveman2871 5 months ago +1

    Absolutely love your videos man.

  • @PrinceKumar-u4k4y
    @PrinceKumar-u4k4y 5 months ago

    Surprise feels so intuitive, but the maths makes it quantifiable. Keep these videos coming. Thank you👏

  • @MelonHusk7
    @MelonHusk7 28 days ago

    Always good to see Yann.