The Key Equation Behind Probability

  • Published 28 Oct 2024

COMMENTS • 266

  • @ArtemKirsanov
    @ArtemKirsanov  2 months ago +19

    Get 4 months extra on a 2 year plan here: nordvpn.com/artemkirsanov. It’s risk free with Nord’s 30 day money-back guarantee!

    • @anthonyrepetto3474
      @anthonyrepetto3474 1 month ago +1

      best explanation of KL-div I've heard on YouTube - thank you!

    • @pebbles7913
      @pebbles7913 1 month ago +1

      I beg you to talk about neuro-symbolic AI models; I find them very fascinating. AlphaGeometry has shown that combining different model types may be the way to go for AGI, similar to the brain.

    • @hyperduality2838
      @hyperduality2838 1 month ago

      Making predictions to track targets and goals is a syntropic process -- teleological.
      Teleological physics (syntropy) is dual to non teleological physics (entropy).
      Syntropy (prediction) is dual to increasing entropy -- the 4th law of thermodynamics!
      Certainty (predictability, syntropy) is dual to uncertainty (unpredictability, entropy) -- the Heisenberg certainty/uncertainty principle.
      Mind (syntropy) is dual to matter (entropy) -- Descartes or Plato's divided line.
      "The brain is a prediction machine" -- Karl Friston, neuroscientist.
      "Always two there are" -- Yoda.

  • @mounirgharsallah1263
    @mounirgharsallah1263 2 months ago +44

    So far, one of the best and clearest videos about entropy and KL divergence.
    Good motion design too.

  • @gonzalopolo2612
    @gonzalopolo2612 1 month ago +24

    Wow, what an amazing video and explanation! I love how you derived KL divergence (also known as relative entropy) from cross-entropy and entropy.
    It's interesting to note that historically, these ideas were actually discovered in the reverse order. Kullback and Leibler introduced the concept of "information gain" or "relative entropy" (now known as KL divergence) in their 1951 paper "On Information and Sufficiency," building on Shannon's earlier work on entropy. The explicit use of cross-entropy as a separate concept came later as far as I know.
    Your explanation really helps in understanding these interconnected ideas. Thank you for this excellent content!
    A variational inference video of this quality would be simply incredible (I really cannot imagine the amount of effort this requires). Again, thank you for this

    • @ArtemKirsanov
      @ArtemKirsanov  1 month ago +2

      Thank you so much!

  • @TurinBeats
    @TurinBeats 2 months ago +97

    This is Artem Kirsanov's golden year. Posting banger after banger. Much love, your videos are a gem

  • @finnrobertson2592
    @finnrobertson2592 2 months ago +36

    I'd be very keen for a video about variational inference. Have been loving your content

  • @MathOnMain
    @MathOnMain 2 months ago +36

    The most succinct explanation of entropy I have heard; the explanation of cross-entropy was very insightful too

    • @hyperduality2838
      @hyperduality2838 1 month ago

      Making predictions to track targets and goals is a syntropic process -- teleological.
      Teleological physics (syntropy) is dual to non teleological physics (entropy).
      Syntropy (prediction) is dual to increasing entropy -- the 4th law of thermodynamics!
      Certainty (predictability, syntropy) is dual to uncertainty (unpredictability, entropy) -- the Heisenberg certainty/uncertainty principle.
      Mind (syntropy) is dual to matter (entropy) -- Descartes or Plato's divided line.
      "The brain is a prediction machine" -- Karl Friston, neuroscientist.
      "Always two there are" -- Yoda.

    • @MathOnMain
      @MathOnMain 1 month ago

      @@hyperduality2838 It's not my intention to be rude, but your comment is only very loosely related to mine; may I ask why you typed it as a reply to me? Was your intention to reply to someone else? Were you trying to leave a comment rather than replying to mine? Regardless, I wish you a nice day.

    • @hyperduality2838
      @hyperduality2838 1 month ago

      @@MathOnMain I am just informing you that there is a 4th law of thermodynamics based upon teleology.
      "Philosophy is dead" -- Stephen Hawking.
      Main stream physics and therefore science is currently dominated by teleophobia and eliminative materialism hence a new law based upon teleology is not going to be popular.
      Teleophilia is dual to teleophobia.
      Alive is dual to not alive -- Schrodinger's cat.
      Being is dual to non being creates becoming -- Plato's cat.
      Stephen Hawking accepted the metaphysics of Schrodinger's cat so philosophy is not dead!
      Syntax is dual to semantics -- languages, communication or information (surprise).
      If mathematics is a language then it is dual.
      Stephen Hawking was a mathematician he was using duality.
      Messages in a communication system are predicted into existence using probability -- Shannon's information theory, and making predictions is a syntropic process!
      "We predict ourselves into existence" -- Anil Seth, neuroscientist.
      Predicting ourselves into existence is a syntropic process hence there is a 4th law of thermodynamics!
      Synchronic points/lines are dual to enchronic points/lines.
      Points are dual to lines -- the principle of duality in geometry.
      Objective information (syntax) is dual to subjective information (semantics) -- information is dual.
      All Information (surprise) has structure or form and meaning hence it is dual.
      Absolute truth is dual to relative truth -- Hume's fork.
      I assumed that you are interested in absolute truth.

  • @klikkolee
    @klikkolee 2 months ago +220

    Personally, after the third time someone predicted the die roll, I would be exponentially more surprised than after the first time

    • @George70220
      @George70220 2 months ago +8

      You misunderstand why he said that surprisal is additive

    • @JackDespero
      @JackDespero 2 months ago +51

      @@George70220 Because the notion of surprise is confusing in this context. It is a bad name for it. Humans are very bad with random events, and surprise does not follow an entropic distribution.

    • @Eta_Carinae__
      @Eta_Carinae__ 2 months ago

      The additivity allows us to say that H(P,Q) = H(P) + KL(P,Q)
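
      The identity in this reply can be checked numerically. Below is a minimal sketch, assuming natural logarithms and two made-up coin distributions (`P` for the true process, `Q` for the model); the function names are illustrative and do not come from the video.

```python
import math

def entropy(p):
    """Shannon entropy H(P) in nats."""
    return sum(pi * math.log(1.0 / pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """Cross-entropy H(P, Q): average surprisal under model Q when data follow P."""
    return sum(pi * math.log(1.0 / qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """KL(P || Q), computed directly from its definition."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical distributions chosen only for illustration:
# a biased coin P and a fair-coin model Q.
P = [0.9, 0.1]
Q = [0.5, 0.5]

lhs = cross_entropy(P, Q)
rhs = entropy(P) + kl_divergence(P, Q)
print(lhs, rhs)  # the two values agree up to floating-point rounding
```

      The decomposition works because log(1/q) = log(1/p) + log(p/q) for every state, so weighting each term by p and summing splits the cross-entropy into the entropy plus the KL term.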

    • @kamilrichert8446
      @kamilrichert8446 2 months ago +3

      Yeah, the gambler's fallacy does make quite a difference in the real-life interpretation of random events

    • @user-qw1rx1dq6n
      @user-qw1rx1dq6n 2 months ago +5

      The main thing is that it’s 0 at 1 and infinite at 0
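
      The properties discussed in this thread can be illustrated with a short sketch, assuming surprisal is defined as log(1/p) in nats and the die is fair. The probability of three correct calls shrinks multiplicatively, while the surprisal grows additively.

```python
import math

def surprisal(p):
    """Surprisal log(1/p) in nats; treated as infinite for an impossible event."""
    if p == 0:
        return math.inf
    return math.log(1.0 / p)

p_single = 1.0 / 6.0      # correctly calling one roll of a fair die
p_three = p_single ** 3   # three independent correct calls in a row

print(surprisal(1.0))     # a certain event carries no surprise
print(surprisal(0.0))     # an impossible event carries infinite surprise
print(surprisal(p_three), 3 * surprisal(p_single))  # equal up to rounding
```

      Taking the logarithm is what turns the product of independent probabilities into a sum of surprisals.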

  • @andreapanuccio295
    @andreapanuccio295 2 months ago +10

    This video is the new gold standard of stats introductions for dummies; so clear and informative. I'll link it to the next person I find at the beginning of their stats journey! Thanks, bro, for what you do

  • @drhxa
    @drhxa 1 month ago +2

    Very impressive how clearly you explained entropy, cross-entropy, and KL divergence with the idea of surprise and great visuals. Well done and thank you for this

  • @samkee3859
    @samkee3859 2 months ago +25

    Keep posting please!

  • @brucerosner3547
    @brucerosner3547 1 month ago

    This is the clearest explanation of probability and entropy I've ever seen. Please create more videos.

  • @srinjandutta121
    @srinjandutta121 2 months ago +1

    You answered a lot of my questions, the ones I had before the video and also the ones I was having while watching it. Thank you ❤❤

  • @JavierFausLlopis
    @JavierFausLlopis 2 months ago

    Perfect presentation, only a tiny error at 19:18 in the lower right corner where for a second the word Tails changes to Heads right at the slide change. Thank you very much for your great work, I think we all really appreciate it.

  • @toxicore1190
    @toxicore1190 2 months ago +1

    perfect timing for this content, I was looking for more intuitive explanations of cross-entropy lately, thanks!

    • @hyperduality2838
      @hyperduality2838 1 month ago

      Concepts are dual to percepts -- the mind duality of Immanuel Kant.
      Making predictions to track targets and goals is a syntropic process -- teleological.
      Teleological physics (syntropy) is dual to non teleological physics (entropy).
      Syntropy (prediction) is dual to increasing entropy -- the 4th law of thermodynamics!
      Certainty (predictability, syntropy) is dual to uncertainty (unpredictability, entropy) -- the Heisenberg certainty/uncertainty principle.
      Mind (syntropy) is dual to matter (entropy) -- Descartes or Plato's divided line.
      "The brain is a prediction machine" -- Karl Friston, neuroscientist.
      "Always two there are" -- Yoda.

  • @vastabyss6496
    @vastabyss6496 2 months ago

    About a year ago, I was building a neural network library from scratch, but I was having some trouble understanding some of the intuition behind binary cross-entropy. Watching this video was the first time I felt like I actually understood the concepts behind cross-entropy!

  • @jimlbeaver
    @jimlbeaver 2 months ago +1

    Great job explaining this. I’ve always had difficulty getting an intuition around it

  • @moisesbessalle
    @moisesbessalle 2 months ago +1

    Yes please do videos on Bayesian inference! This video was great!

  • @procactus9109
    @procactus9109 2 months ago +6

    To me, at least, this concept is fundamental. But seeing people put effort into building the maths to explain it is amazing.

    • @hyperduality2838
      @hyperduality2838 1 month ago

      Concepts are dual to percepts -- the mind duality of Immanuel Kant.
      Making predictions to track targets and goals is a syntropic process -- teleological.
      Teleological physics (syntropy) is dual to non teleological physics (entropy).
      Syntropy (prediction) is dual to increasing entropy -- the 4th law of thermodynamics!
      Certainty (predictability, syntropy) is dual to uncertainty (unpredictability, entropy) -- the Heisenberg certainty/uncertainty principle.
      Mind (syntropy) is dual to matter (entropy) -- Descartes or Plato's divided line.
      "The brain is a prediction machine" -- Karl Friston, neuroscientist.
      "Always two there are" -- Yoda.

    • @procactus9109
      @procactus9109 1 month ago

      @@hyperduality2838 I think you replied to the wrong comment

    • @hyperduality2838
      @hyperduality2838 1 month ago

      @@procactus9109 I am just informing you that there is a 4th law of thermodynamics based upon teleology.
      "Philosophy is dead" -- Stephen Hawking.
      Main stream physics and therefore science is currently dominated by teleophobia and eliminative materialism hence a new law based upon teleology is not going to be popular.
      Teleophilia is dual to teleophobia.
      Alive is dual to not alive -- Schrodinger's cat.
      Being is dual to non being creates becoming -- Plato's cat.
      Stephen Hawking accepted the metaphysics of Schrodinger's cat so philosophy is not dead!
      Syntax is dual to semantics -- languages, communication or information (surprise).
      If mathematics is a language then it is dual.
      Stephen Hawking was a mathematician he was using duality.
      Messages in a communication system are predicted into existence using probability -- Shannon's information theory, and making predictions is a syntropic process!
      "We predict ourselves into existence" -- Anil Seth, neuroscientist.
      Predicting ourselves into existence is a syntropic process hence there is a 4th law of thermodynamics!
      Synchronic points/lines are dual to enchronic points/lines.
      Points are dual to lines -- the principle of duality in geometry.
      Objective information (syntax) is dual to subjective information (semantics) -- information is dual.
      All Information (surprise) has structure or form and meaning hence it is dual.
      Absolute truth is dual to relative truth -- Hume's fork.
      I assumed that you are interested in absolute truth.
      Comments, messages, languages are dual and the cells, neurons in your body are all communicating with each other all the time hence you are built out of duality.
      Duality creates reality.

  • @neurosync_research
    @neurosync_research 1 month ago

    This is the most lucid unpacking of these concepts that I've ever seen. Great work as always.
    I would definitely look forward to a variational inference video

  • @sudiptochatterjee8996
    @sudiptochatterjee8996 20 days ago

    Really liked the way you motivated the definition of entropy, thanks a lot

  • @FsimulatorX
    @FsimulatorX 2 months ago +2

    This is really good, high-quality content! You have no idea: this is the first time in a very long time that I've been able to watch a math YouTube video from start to finish

  • @SN-uc3vr
    @SN-uc3vr 2 months ago +1

    Amazing video! Please make more on topics like variational inference, ELBO, etc!!!

  • @vanhoheneim
    @vanhoheneim 17 days ago

    That's one of the best videos I have ever seen on the probabilistic foundations of ML!

  • @andytroo
    @andytroo 1 month ago

    A good example for cross-entropy (or mutual information) is asking "how well aligned are the two images?" when one image is a black square on a white background and the other is a white square on a black background.
    By some measures, minimising surprise means ensuring white=white and black=black, which forces the squares not to touch; other measures force the two squares on top of each other, so that each colour is related to only one colour in the other picture.

  • @cerioscha
    @cerioscha 1 month ago

    Great video and not an easy one to make. Delighted to see you position and reference the Free Energy Principle

  • @xyzct
    @xyzct 2 months ago +1

    Simple to follow, and crystal clear. Well done!

  • @AlexBerg1
    @AlexBerg1 2 months ago +2

    What an amazing introduction to this point of view of probability!

  • @hdcamsit2144
    @hdcamsit2144 18 days ago

    you should do a video building on top of this explanation of entropy and cross-entropy to explain what perplexity is.

  • @ResmungoCoder
    @ResmungoCoder 2 months ago

    You really know how to explain complex things in simple ways. Thank you for this video!

  • @jimcallahan448
    @jimcallahan448 2 months ago +2

    When we get tired of statistical cookbook procedures it is great to go back to the underlying probability logic (it's probability all the way down). This applies not only to machine learning models but to all types of statistical models. For example, in regression models when one studies the "residual" error component. It also explains science stories when an experiment produces a surprising result and first a scientist and then the scientific community realizes that they need a new theory (model) -- a "scientific revolution". A scientific revolution should occur when the K-L divergence of experimental results is large (or else they need to find a mundane explanation for the experimental error).

  • @tirthasg
    @tirthasg 2 months ago

    Love your explanations & visualisations. You should consider making a course on the mathematics of ML, and DL. We would definitely tune in!

  • @jimcallahan448
    @jimcallahan448 2 months ago +57

    Cross entropy explains magic shows.
    The magician is trying to get the audience to believe the wrong model ("Nothing up my sleeve") and therefore be surprised (and hopefully delighted) by the outcome.

    • @celaleddinomersaglam6811
      @celaleddinomersaglam6811 1 month ago +1

      Basic Pledge-Turn-Prestige process:)

  • @Cookie82772
    @Cookie82772 1 month ago

    This channel is gold. Suggested topic for a future video: The details of training GANs

  • @NGBigfield
    @NGBigfield 1 month ago

    That's a great way to explain KL-Divergence. I never understood the parameters until now!
    thanks!

  • @andrewgrebenisan6141
    @andrewgrebenisan6141 2 months ago

    Brilliant, Artem! Your videos are the crown jewels of ML educational content on YouTube!! So intuitive!

  • @DistortedV12
    @DistortedV12 2 months ago

    I have 4 ideas: 1) causal representation learning, 2) the difference between PCA, ICA and spectral embedding, 3) the difference between singular value decomposition and eigenvalue decomposition, and 4) Bayes' equation? I love how you explain things, and in this video you did indeed beautifully illustrate one of the most important equations in DL. This is coming from a PhD student in machine learning

  • @davide9242
    @davide9242 24 days ago

    Amazing video. It's so rare finding such quality nowadays

  • @SelfBuiltWealth
    @SelfBuiltWealth 2 months ago +1

    OMG, thank you so much! I was looking to understand this subject and searched you up because you are the best out there!❤

  • @benfrank6520
    @benfrank6520 2 months ago +2

    Your animation is so incredible, I even watched the ad.

  • @Infraredchili
    @Infraredchili 2 months ago +1

    Great video. I'm really interested in latent-variables models, ELBO, VAE and so on

  • @aitjellal
    @aitjellal 2 months ago

    Excellent work, thank you.
    Please create a video about the training of generative models

  • @MrFarber31
    @MrFarber31 1 month ago +1

    Never skip a Kirsanov video

  • @Boredguy112
    @Boredguy112 1 month ago

    Thanks, Artem. Please go into variational inference or even free energy too! Very interested!

  • @ed.puckett
    @ed.puckett 1 month ago

    Thank you for your clearly presented video. I wanted to comment about my own experience with KL-Divergence. First, I note that KL-Divergence is not a metric; it fails the triangle inequality, for example. This leads to a sort of discontinuity when trying to do inference, for example when trying to infer the so-called "even process" from its outputs, and causes a more complex model to be inferred when using the KL-Divergence as a comparator for distributions.
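
    The point that KL divergence is not a metric can be seen on a small case. Below is a minimal sketch, assuming three hypothetical Bernoulli distributions picked only to expose the asymmetry and the triangle-inequality violation; it says nothing about the "even process" inference mentioned above.

```python
import math

def kl_bernoulli(a, b):
    """KL(Bern(a) || Bern(b)) in nats, for 0 < a < 1 and 0 < b < 1."""
    return a * math.log(a / b) + (1 - a) * math.log((1 - a) / (1 - b))

# Hypothetical success probabilities, chosen only for illustration.
p, q, r = 0.5, 0.1, 0.01

direct = kl_bernoulli(p, r)                         # going straight from P to R
via_q = kl_bernoulli(p, q) + kl_bernoulli(q, r)     # detour through Q
reverse = kl_bernoulli(r, p)                        # arguments swapped

print(direct, via_q)    # direct exceeds the detour: triangle inequality fails
print(direct, reverse)  # the two directions differ: not symmetric
```

    A true distance would never let the detour through Q be shorter than the direct path, and it would give the same value in both directions.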

  • @Carrymejane
    @Carrymejane 2 months ago +27

    You called my soul since you mention entropy in the title

    • @janagha1217
      @janagha1217 1 month ago

      I'm here for entropy too, yo!!!

  • @MichelCarroll
    @MichelCarroll 2 months ago +2

    Keep up the videos. They are really excellent and intuitive

  • @chaowang8229
    @chaowang8229 1 month ago

    Thank you for your video. I think it is very helpful to understand the basic ideas behind training generative machine learning models. By the way, we are extremely interested in the concepts (latent-variable models, variational inference, ELBO, etc. ) that you mentioned in the ending of this video. I would be grateful if you plan to make some videos to introduce these concepts.

  • @simonstrandgaard5503
    @simonstrandgaard5503 2 months ago +1

    Beautiful visuals and well presented.

  • @hr5651
    @hr5651 2 months ago

    Please do a video on the other techniques (in your outro/conclusion). On YouTube, there are already a lot of videos explaining globally how generative models work, but not so many on the details of the techniques behind them. It could be interesting.

  • @GeoffryGifari
    @GeoffryGifari 1 month ago +1

    Gotta say, you're really good at teaching

  • @BenjaminEvans316
    @BenjaminEvans316 8 days ago

    Great animations in the video with a good level of detail.

  • @heavenrvne888
    @heavenrvne888 2 months ago

    I'd really like to see a video from you on variational inference

  • @robertmotsch7535
    @robertmotsch7535 2 months ago +1

    A truly excellent presentation!

  • @SorryNothingToSeeHere
    @SorryNothingToSeeHere 2 months ago

    I just recently ran across your videos. I would love to see videos relating to search iterative inference and space reduction. In this video you mention gradient descent, I have not looked it over. I'll browse a bit and look for other optimization topics. Thanks a lot!

  • @bwowekeith4472
    @bwowekeith4472 2 months ago +1

    Thank you. I am very surprised with the explanation. It is a masterpiece

  • @mauroabidalcarrer4083
    @mauroabidalcarrer4083 2 months ago

    Awesome video, can't wait for the next one on the fundamentals of probability

  • @abdurrezzakefe5308
    @abdurrezzakefe5308 2 months ago

    Great content! I can see multiple concepts used from FEP, greatly presented and explained. Keep up the good work man, hope we get to have a chat some day about these!

  • @CouthZero
    @CouthZero 2 months ago

    I missed having some references that we could dig into. By the way, fascinating topic and awesome video!

  • @mathieudespriee6646
    @mathieudespriee6646 1 month ago

    Great video. Interested in latent-variable content.

  • @tantzer6113
    @tantzer6113 2 months ago +1

    Thank you so much for this fantastically clear and useful video! Question about the formula for cross-entropy: you gave an example that showed that the asymmetry of the formula makes sense (the example of assuming a fair coin versus assuming a rigged coin). However, you didn't "motivate" the formula for cross entropy. I mean, if somebody asked me to derive the formula, I wouldn't know how to explain the choice of the positions of p_s versus q_s in the formula and why they shouldn't be switched. What explains the choice? What explains why we pick one for the weights/coefficients of the probability terms and the other as the arguments of the natural logarithms?

    • @ArtemKirsanov
      @ArtemKirsanov  2 months ago +3

      Thanks! Sure, let me explain.
      When we believe in the model Q, with q_s representing the assumed probability of state s, each time we observe an instance of s, we experience log(1/q_s) units of surprisal. As we continue to observe instances, we might encounter different states (like s2) with different probabilities under our model, resulting in different amounts of surprise (log(1/q_s2), etc.).
      Now, the crucial question is: What generates these samples? What determines which state we're going to observe? Since we're observing the process unfold in the real world, these samples are actually distributed according to P, with each state s having a probability p_s of occurring.
      So, to break down the formula:
      1) The term inside the logarithm (1/q_s) represents our surprise, which only makes sense in the context of a particular belief about probability - our internal model. This is why Q is associated with the log term.
      2) When we calculate the expected value of this subjective surprise, we need to account for how often each state actually occurs. The random variable itself is governed by P, so we use p_s as weighting coefficients.
      In essence, p_s * log(1/q_s) gives us the contribution to the average surprise: how often a state occurs (p_s) multiplied by how surprised we are when it does occur (log(1/q_s)).
      Hope this clarifies things a bit!
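
      The reasoning in this reply can be sketched as a small simulation: draw states from P, record the surprise log(1/q_s) felt under the model Q, and average. This is only an illustration under assumed values (a fair coin as P, a "rigged" belief as Q, a fixed random seed); the function names are hypothetical.

```python
import math
import random

def cross_entropy(p, q):
    """Exact H(P, Q) = sum over states of p_s * log(1 / q_s), in nats."""
    return sum(ps * math.log(1.0 / qs) for ps, qs in zip(p, q) if ps > 0)

def sampled_average_surprise(p, q, n_samples, seed=0):
    """Draw states according to P and average the surprise felt under Q."""
    rng = random.Random(seed)
    states = rng.choices(range(len(p)), weights=p, k=n_samples)
    return sum(math.log(1.0 / q[s]) for s in states) / n_samples

# Assumed example: the world is a fair coin, but we believe it is rigged.
P = [0.5, 0.5]
Q = [0.99, 0.01]

exact = cross_entropy(P, Q)
estimate = sampled_average_surprise(P, Q, n_samples=100_000)
swapped = cross_entropy(Q, P)  # roles reversed: rigged world, fair-coin belief

print(exact, estimate)  # the sampled average approaches the exact value
print(swapped)          # a different number, so the formula is not symmetric
```

      Swapping the roles changes which distribution generates the samples and which one defines the surprise, and that is why the two positions in the formula cannot be exchanged.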

    • @tantzer6113
      @tantzer6113 2 months ago

      This is a great explanation. Thank you so much!

  • @alexjensen990
    @alexjensen990 2 months ago +1

    That was really well done. Thank you for making this video.

  • @tildarusso
    @tildarusso 17 days ago

    Very well explained. These essential principles are vital for deep understanding of modern AI technologies.

  • @davidattias6282
    @davidattias6282 1 month ago

    Underrated YouTube channel. Thanks boss

  • @bingyanliu6370
    @bingyanliu6370 1 month ago

    Thank you for solving my long-lasting questions

  • @theodoreshachtman9990
    @theodoreshachtman9990 18 days ago

    Your work is a gift to the world. Thank you!

  • @RexPilger
    @RexPilger 2 months ago

    Nice and concise. Should have a photo of Claude Shannon instead of Rev. Bayes. The latter knew nothing of entropy -- he lived in the century before the physicists and one more before Shannon. (Another commenter caught the misspelling of "Function".)

  • @TheMikeMassengale
    @TheMikeMassengale 23 days ago

    Time to go read Hitchhiker's Guide again. Nice video!

  • @lukarius791
    @lukarius791 2 months ago +1

    Hey Artem,
    I really enjoy your videos.
    Thank you for this!!!
    I have a question:
    My journey at university starts in one month. I have had my own Zettelkasten for one year and want to use it effectively to build my knowledge.
    But I have the feeling that my Obsidian vault, as it is at the moment and the way I use it, is not quite right.
    Could you write about your Zettelkasten workflow and how it has changed over the years (or maybe make a video, it's been over 2 years 😊)?
    Enjoy your life

  • @ravenecho2410
    @ravenecho2410 2 months ago +1

    Fantastic explanation!!

  • @kubanychtakyrbashev549
    @kubanychtakyrbashev549 2 months ago

    Wow, Artem, superb! You are really outdoing yourself. Thank you very much!

  • @PrinceKumar-u4k4y
    @PrinceKumar-u4k4y 2 months ago

    Surprise feels so intuitive, but math makes it quantifiable. Keep these videos coming. Thank you 👏

  • @SimonHajny
    @SimonHajny 2 months ago +1

    This is exactly the kind of video I love. Abstract topics beautifully described! Although perhaps the intro was a bit long and most people watching the video know these basic concepts.

  • @rubncarmona
    @rubncarmona 2 months ago

    Great video, Artem! Please be careful with affirmations and statements like 'always result in noise'. Since we're dealing with scientific and mathematical concepts, being precise with your words is essential.

  • @liucara8548
    @liucara8548 2 months ago +1

    Best cross-entropy explanation!

  • @peki_ooooooo
    @peki_ooooooo 2 months ago

    wow, this metaphor is really beautiful, and this metaphor is really the core idea of diffusion process.

  • @ellysian
    @ellysian 1 month ago

    Amazing and clear explanation, thank you for the video!
    Since we are optimizing for the cross-entropy and not KL divergence in practice, this means that although optimizing for both functions is the same process, the ideal loss that we are expecting is not 0?
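
A small numeric check may help with the question above. Cross-entropy splits as H(P, Q) = H(P) + KL(P || Q), so the best achievable value is the entropy H(P), reached when Q equals P. The sketch below uses made-up distributions `p` and `q` (assumptions for illustration, not values from the video) to show that floor:

```python
import math

def entropy(p):
    """Shannon entropy H(P) in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """Cross-entropy H(P, Q) in nats; assumes q > 0 wherever p > 0."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """KL(P || Q) written as cross-entropy minus entropy."""
    return cross_entropy(p, q) - entropy(p)

# Hypothetical "true" distribution and an imperfect model of it.
p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]

floor = entropy(p)                    # lowest cross-entropy any model can reach
loss_imperfect = cross_entropy(p, q)  # strictly above the floor
loss_perfect = cross_entropy(p, p)    # equals the floor; KL is zero here
```

Under these assumptions the loss of a perfect model is H(P), which is above zero whenever P itself is uncertain. Only the KL term goes to zero.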

  • @MultiNeurons
    @MultiNeurons 2 months ago +1

    Very interesting and well explained

  • @robinbreslin1626
    @robinbreslin1626 1 month ago

    Very good description of entropy and cross-entropy (and what on earth KL divergence is).

  • @brunolevy6261
    @brunolevy6261 2 months ago

    brilliant and enlightening explanations - ty

  • @trisinogy
    @trisinogy 28 days ago

    Excellent graphics, superb explanation! Thank you! Keep up the good work! You gained a sub!

  • @allenlu2007
    @allenlu2007 2 months ago +1

    Excellent video! Is cross-entropy minimization analogous to finding the lowest energy in a Hopfield network? And for variational inference, such as a variational classifier (or VAE), is minimizing the cross-entropy plus a regularization term analogous to the lowest energy plus free energy in a Boltzmann machine?

  • @Gus-AI-World
    @Gus-AI-World 2 months ago

    This is such a beautiful and amazing video. I cannot thank you enough.

  • @ophthojooeileyecirclehisha4917
    @ophthojooeileyecirclehisha4917 1 month ago

    Thank you for your hard work, dedication, mathematics, statistics, science, kindness, and generosity. May GOD reward you

  • @C0DEWARR10R
    @C0DEWARR10R 2 months ago +1

    So beautiful I cried!

  • @Jamie-my7lb
    @Jamie-my7lb 1 month ago +1

    I’d like to better understand what kinds of operations are allowed with probability distributions. For example I think that we never really have access to the true distribution P in any real world scenario. All we have access to are models, Q. By sampling events from P, we can estimate the cross entropy. However I don’t see how P can ever be actually known.
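
The sampling idea in this comment can be sketched in a few lines. The estimator only needs draws from P and the model Q; it never reads P directly. The names `p_true`, `q_model` and `estimate_cross_entropy` are hypothetical, and P is written out here only so the demo can simulate draws and check the estimate against the exact value:

```python
import math
import random

def estimate_cross_entropy(samples, q):
    """Average surprise -log q(x) over samples drawn from the unknown P (nats)."""
    return -sum(math.log(q[x]) for x in samples) / len(samples)

# Demo only: in practice P is hidden and we would just observe the samples.
rng = random.Random(0)
p_true = {"sun": 0.6, "rain": 0.3, "snow": 0.1}
q_model = {"sun": 0.5, "rain": 0.4, "snow": 0.1}

samples = rng.choices(list(p_true), weights=list(p_true.values()), k=100_000)

estimate = estimate_cross_entropy(samples, q_model)
exact = -sum(p_true[x] * math.log(q_model[x]) for x in p_true)
```

With enough samples the estimate settles close to the exact cross-entropy. The entropy of P and the KL divergence, by contrast, cannot be read off this way without some estimate of P itself.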

  • @felipe_marra
    @felipe_marra 27 days ago

    Such an interesting and useful topic and such a great video!

  • @wubwub616
    @wubwub616 1 month ago

    can you make more videos about cognitive constructs such as working memory, attention, cognitive control, or the interference of all of them, attentional control working memory?

  • @JackDespero
    @JackDespero 2 months ago +11

    It is very similar to the idea of Bayes Theorem, which makes sense of course. But it is nice to see it appear from somewhere else.

  • @bosepukur
    @bosepukur 17 days ago

    Wow, what a beautiful presentation

  • @godlyradmehr2004
    @godlyradmehr2004 2 months ago

    KL divergence basically says: OK, just forget about what you already believe (subtract the entropy of distribution P), and then compute the cross-entropy again ❤❤

  • @looldrole
    @looldrole 7 days ago

    Great communication of science!

  • @thecaveman2871
    @thecaveman2871 2 months ago +1

    Absolutely love your videos, man.

  • @JerryDavid-o1n
    @JerryDavid-o1n 2 months ago

    While the entropy explanation is correct, I personally feel it would have helped to include an unfair coin and compare it with the fair coin, in addition to the weird coin that lands on its edge. To some people's surprise, the fair coin has the highest average surprise, which might not be intuitive if someone misunderstands the concept of entropy (Shannon's, or entropy in general).
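
The comparison suggested here is quick to check numerically. This is a minimal sketch, not something from the video: the helper `coin_entropy` and the 0.9 bias are assumptions chosen for illustration.

```python
import math

def coin_entropy(p_heads):
    """Shannon entropy in bits of a coin that lands heads with probability p_heads."""
    total = 0.0
    for p in (p_heads, 1.0 - p_heads):
        if p > 0:  # an impossible outcome contributes nothing
            total -= p * math.log2(p)
    return total

fair = coin_entropy(0.5)     # 1 bit, the maximum for two outcomes
biased = coin_entropy(0.9)   # about 0.47 bits
certain = coin_entropy(1.0)  # 0 bits, nothing left to learn
```

A single tails from the biased coin is more surprising than any flip of the fair coin, but it happens rarely. Averaged over many flips, the fair coin carries the most surprise.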

  • @IakobusAtreides
    @IakobusAtreides 2 months ago +3

    I love these videos so much

  • @sigfridsixsis3255
    @sigfridsixsis3255 2 months ago

    Super clear explanation. Top!❤

  • @Filup
    @Filup 2 months ago

    This would have been soooo good last semester!!!

  • @maestraccivalentin316
    @maestraccivalentin316 1 month ago

    Really good video.
    I'm left confused by something: how do people compute the cross-entropy in order to minimise it, since we don't know the real probability distribution (and it depends on it)?
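
One common way around this, given here as the usual textbook setup rather than something stated in the video: the cross-entropy is an expectation under P, so it can be approximated by an average over observed data points x_1, ..., x_N assumed to be drawn from P. The notation below is illustrative.

```latex
H(P, Q) \;=\; -\sum_{x} P(x)\,\log Q(x)
        \;=\; -\,\mathbb{E}_{x \sim P}\big[\log Q(x)\big]
        \;\approx\; -\frac{1}{N}\sum_{i=1}^{N} \log Q(x_i)
```

The right-hand side needs only the model Q and the samples, so P never has to be written down. It is the average negative log-likelihood of the data under the model.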

  • @Patapom3
    @Patapom3 2 months ago +1

    Very interesting, thank you!

  • @shoshaunagauvin3699
    @shoshaunagauvin3699 24 days ago

    Great video, however your weather example is actually wrong. %pop (rain chance) is given to you in %Area Covered. I.e. for a given regional boundary, 30% of the region will experience X precipitation. (Therefore frequentist view holds)

  • @sapfeartop3499
    @sapfeartop3499 2 months ago +2

    Great video, thanks!)