I beg you to talk about neuro-symbolic AI models; I find them very fascinating. AlphaGeometry has shown that combining different model types may be the way to go for AGI, similar to the brain.
Making predictions to track targets and goals is a syntropic process -- teleological. Teleological physics (syntropy) is dual to non teleological physics (entropy). Syntropy (prediction) is dual to increasing entropy -- the 4th law of thermodynamics! Certainty (predictability, syntropy) is dual to uncertainty (unpredictability, entropy) -- the Heisenberg certainty/uncertainty principle. Mind (syntropy) is dual to matter (entropy) -- Descartes or Plato's divided line. "The brain is a prediction machine" -- Karl Friston, neuroscientist. "Always two there are" -- Yoda.
Wow, what an amazing video and explanation! I love how you derived KL divergence (also known as relative entropy) from cross-entropy and entropy. It's interesting to note that historically, these ideas were actually discovered in the reverse order. Kullback and Leibler introduced the concept of "information gain" or "relative entropy" (now known as KL divergence) in their 1951 paper "On Information and Sufficiency," building on Shannon's earlier work on entropy. The explicit use of cross-entropy as a separate concept came later, as far as I know. Your explanation really helps in understanding these interconnected ideas. Thank you for this excellent content! A variational inference video of this quality would be simply incredible (I really cannot imagine the amount of effort this requires). Again, thank you for this.
Making predictions to track targets and goals is a syntropic process -- teleological. Teleological physics (syntropy) is dual to non teleological physics (entropy). Syntropy (prediction) is dual to increasing entropy -- the 4th law of thermodynamics! Certainty (predictability, syntropy) is dual to uncertainty (unpredictability, entropy) -- the Heisenberg certainty/uncertainty principle. Mind (syntropy) is dual to matter (entropy) -- Descartes or Plato's divided line. "The brain is a prediction machine" -- Karl Friston, neuroscientist. "Always two there are" -- Yoda.
@@hyperduality2838 It's not my intention to be rude, but your comment is only very loosely related to mine; may I ask why you typed it as a reply to me? Was your intention to reply to someone else? Were you trying to leave a comment rather than replying to mine? Regardless, I wish you a nice day.
@@MathOnMain I am just informing you that there is a 4th law of thermodynamics based upon teleology. "Philosophy is dead" -- Stephen Hawking. Mainstream physics, and therefore science, is currently dominated by teleophobia and eliminative materialism, hence a new law based upon teleology is not going to be popular. Teleophilia is dual to teleophobia. Alive is dual to not alive -- Schrödinger's cat. Being is dual to non-being, creating becoming -- Plato's cat. Stephen Hawking accepted the metaphysics of Schrödinger's cat, so philosophy is not dead! Syntax is dual to semantics -- languages, communication or information (surprise). If mathematics is a language then it is dual. Stephen Hawking was a mathematician, so he was using duality. Messages in a communication system are predicted into existence using probability -- Shannon's information theory, and making predictions is a syntropic process! "We predict ourselves into existence" -- Anil Seth, neuroscientist. Predicting ourselves into existence is a syntropic process, hence there is a 4th law of thermodynamics! Synchronic points/lines are dual to enchronic points/lines. Points are dual to lines -- the principle of duality in geometry. Objective information (syntax) is dual to subjective information (semantics) -- information is dual. All information (surprise) has structure or form and meaning, hence it is dual. Absolute truth is dual to relative truth -- Hume's fork. I assumed that you are interested in absolute truth.
This is unbelievable, I loved it! Now I can see how the KL divergence can be used to correct our approximate models, and how we aim at minimizing this divergence to get closer to the hidden distribution. Beautiful!
This video is the new gold standard of stats introductions for dummies; so clear and informative. I'll link it to the next person I find at the beginning of their stats journey! Thanks, bro, for what you do.
Very impressive how clearly you explained entropy, cross-entropy, and KL divergence with the idea of surprise and great visuals. Well done and thank you for this
I am addicted to this channel. I love how you articulate every subject to explain these concepts. Thank you so much, the world really needs more people like you!!! ♥
Concepts are dual to percepts -- the mind duality of Immanuel Kant. Making predictions to track targets and goals is a syntropic process -- teleological. Teleological physics (syntropy) is dual to non teleological physics (entropy). Syntropy (prediction) is dual to increasing entropy -- the 4th law of thermodynamics! Certainty (predictability, syntropy) is dual to uncertainty (unpredictability, entropy) -- the Heisenberg certainty/uncertainty principle. Mind (syntropy) is dual to matter (entropy) -- Descartes or Plato's divided line. "The brain is a prediction machine" -- Karl Friston, neuroscientist. "Always two there are" -- Yoda.
@@George70220 Because the notion of surprise is confusing in this context. It is a bad name for it. Humans are very bad with random events, and surprise does not follow an entropic distribution.
@@1st_ProCactus I am just informing you that there is a 4th law of thermodynamics based upon teleology. "Philosophy is dead" -- Stephen Hawking. Mainstream physics, and therefore science, is currently dominated by teleophobia and eliminative materialism, hence a new law based upon teleology is not going to be popular. Teleophilia is dual to teleophobia. Alive is dual to not alive -- Schrödinger's cat. Being is dual to non-being, creating becoming -- Plato's cat. Stephen Hawking accepted the metaphysics of Schrödinger's cat, so philosophy is not dead! Syntax is dual to semantics -- languages, communication or information (surprise). If mathematics is a language then it is dual. Stephen Hawking was a mathematician, so he was using duality. Messages in a communication system are predicted into existence using probability -- Shannon's information theory, and making predictions is a syntropic process! "We predict ourselves into existence" -- Anil Seth, neuroscientist. Predicting ourselves into existence is a syntropic process, hence there is a 4th law of thermodynamics! Synchronic points/lines are dual to enchronic points/lines. Points are dual to lines -- the principle of duality in geometry. Objective information (syntax) is dual to subjective information (semantics) -- information is dual. All information (surprise) has structure or form and meaning, hence it is dual. Absolute truth is dual to relative truth -- Hume's fork. I assumed that you are interested in absolute truth. Comments, messages and languages are dual, and the cells and neurons in your body are all communicating with each other all the time, hence you are built out of duality. Duality creates reality.
Cross entropy explains magic shows. The magician is trying to get the audience to believe the wrong model ("Nothing up my sleeve") and therefore be surprised (and hopefully delighted) by the outcome.
This is really good, high-quality content!! You have no idea: this is the first time in a very long time that I've been able to watch a math YouTube video from start to finish.
Every single person interested in Business Administration (as a student or a professional) should be good at probability and statistics. Very instructive video. Thanks a lot.
When we get tired of statistical cookbook procedures it is great to go back to the underlying probability logic (it's probability all the way down). This applies not only to machine learning models but to all types of statistical models. For example, in regression models when one studies the "residual" error component. It also explains science stories when an experiment produces a surprising result and first a scientist and then the scientific community realizes that they need a new theory (model) -- a "scientific revolution". A scientific revolution should occur when the K-L divergence of experimental results is large (or else they need to find a mundane explanation for the experimental error).
This is the most lucid unpacking of these concepts that I've ever seen. Great work as always. I would definitely look forward to a variational inference video
About a year ago, I was building a neural network library from scratch, but I was having some trouble understanding some of the intuition behind binary cross-entropy. Watching this video was the first time I felt like I actually understood the concepts behind cross-entropy!
Perfect presentation, only tiny error at 19:18 in the lower right corner where for a second the word Tails changes to Heads right at the slide change. Thank you very much for your great work, I think we all really appreciate it.
I have 4 ideas: 1) causal representation learning, 2) the difference between PCA, ICA and spectral embedding, 3) the difference between singular value decomposition and eigenvalue decomposition, and 4) Bayes' equation. I love how you explain things, and in this video you did indeed beautifully illustrate one of the most important equations in DL. Coming from a PhD student in machine learning.
A good example for cross entropy (or mutual information) is asking "how well aligned are the two images?" when one image is a black square on a white background and the other is a white square on a black background. Minimising surprise by some measures means ensuring white=white and black=black, forcing the squares not to touch; other measures force the two squares on top of each other, so that each colour is related to only one colour in the other picture.
Great video, Artem! Please be careful with affirmations and statements like 'always result in noise'. Since we're dealing with scientific and mathematical concepts, being precise with your words is essential.
This was amazing, thank you so much! I would LOVE to see a video on VAE's / ELBO or the various ways we can use generative models by trying to learn the joint distribution.
Thank you for your clearly presented video. I wanted to comment about my own experience with KL divergence. First, I note that KL divergence is not a metric; it fails the triangle inequality, for example. This leads to a sort of discontinuity when trying to do inference, for example when trying to infer the so-called "even process" from its outputs, and it causes a more complex model to be inferred when KL divergence is used as a comparator for distributions.
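To make the "not a metric" point concrete, here is a minimal sketch. The three coin distributions are hypothetical numbers picked only for illustration, and `kl_divergence` is a small helper written for this example, not something taken from the video.

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) in nats for two discrete distributions given as lists."""
    return sum(p_s * math.log(p_s / q_s) for p_s, q_s in zip(p, q) if p_s > 0)

# Hypothetical coin models (probability of heads, probability of tails).
a = [0.5, 0.5]
b = [0.7, 0.3]
c = [0.9, 0.1]

kl_ab = kl_divergence(a, b)
kl_ba = kl_divergence(b, a)
kl_bc = kl_divergence(b, c)
kl_ac = kl_divergence(a, c)

# Asymmetry: swapping the arguments changes the value.
print(f"KL(a||b) = {kl_ab:.4f}, KL(b||a) = {kl_ba:.4f}")

# Triangle inequality fails: the direct "distance" a -> c is larger
# than the detour a -> b -> c.
print(f"KL(a||c) = {kl_ac:.4f}, KL(a||b) + KL(b||c) = {kl_ab + kl_bc:.4f}")
```

For these particular numbers the direct divergence from `a` to `c` exceeds the sum of the two intermediate divergences, which a true distance function would never allow.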
This is exactly the kind of video I love. Abstract topics beautifully described! Although perhaps the intro was a bit long and most people watching the video know these basic concepts.
Nice and concise. Should have a photo of Claude Shannon instead of Rev. Bayes. The latter knew nothing of entropy -- he lived in the century before the physicists and one more before Shannon. (Another commenter caught misspelling of Function.)
Thank you so much for this fantastically clear and useful video! Question about the formula for cross-entropy: you gave an example that showed that the asymmetry of the formula makes sense (the example of assuming a fair coin versus assuming a rigged coin). However, you didn't "motivate" the formula for cross entropy. I mean, if somebody asked me to derive the formula, I wouldn't know how to explain the choice of the positions of p_s versus q_s in the formula and why they shouldn't be switched. What explains the choice? What explains why we pick one for the weights/coefficients of the probability terms and the other as the arguments of the natural logarithms?
Thanks! Sure, let me explain. When we believe in the model Q, with q_s representing the assumed probability of state s, each time we observe an instance of s, we experience log(1/q_s) units of surprisal. As we continue to observe instances, we might encounter different states (like s2) with different probabilities under our model, resulting in different amounts of surprise (log(1/q_s2), etc.). Now, the crucial question is: What generates these samples? What determines which state we're going to observe? Since we're observing the process unfold in the real world, these samples are actually distributed according to P, with each state s having a probability p_s of occurring. So, to break down the formula: 1) The term inside the logarithm (1/q_s) represents our surprise, which only makes sense in the context of a particular belief about probability - our internal model. This is why Q is associated with the log term. 2) When we calculate the expected value of this subjective surprise, we need to account for how often each state actually occurs. The random variable itself is governed by P, so we use p_s as weighting coefficients. In essence, p_s * log(1/q_s) gives us the contribution to the average surprise: how often a state occurs (p_s) multiplied by how surprised we are when it does occur (log(1/q_s)). Hope this clarifies things a bit!
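The breakdown above can be sketched in a few lines. This is only an illustration: the 99%/1% rigged coin is an assumed example (not necessarily the numbers used in the video), and `cross_entropy` is a helper defined here for the purpose.

```python
import math

def cross_entropy(p, q):
    """Average surprise in nats: outcomes occur with p_s, surprise is log(1/q_s)."""
    return sum(p_s * math.log(1 / q_s) for p_s, q_s in zip(p, q) if p_s > 0)

fair = [0.5, 0.5]       # heads, tails
rigged = [0.99, 0.01]   # assumed rigged coin, for illustration only

# Reality is fair, but we believe the coin is rigged: tails shows up half
# the time and each one costs log(1/0.01) nats of surprise.
believe_rigged_when_fair = cross_entropy(p=fair, q=rigged)

# Reality is rigged, but we believe the coin is fair: every outcome costs
# log(2) nats, however often it occurs.
believe_fair_when_rigged = cross_entropy(p=rigged, q=fair)

print(f"H(fair, rigged) = {believe_rigged_when_fair:.3f} nats")
print(f"H(rigged, fair) = {believe_fair_when_rigged:.3f} nats")
```

Swapping the roles of P and Q changes which distribution sets the weights and which sets the surprise, which is why the two values differ.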
Great content! I can see multiple concepts used from FEP, greatly presented and explained. Keep up the good work man, hope we get to have a chat some day about these!
I’d like to better understand what kinds of operations are allowed with probability distributions. For example I think that we never really have access to the true distribution P in any real world scenario. All we have access to are models, Q. By sampling events from P, we can estimate the cross entropy. However I don’t see how P can ever be actually known.
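One way to picture the "estimate the cross entropy by sampling" idea is the sketch below. It cheats in one respect: `true_p` is written down so that samples can be simulated and the estimate can be checked against the exact value, whereas in a real setting only the samples would be available. All names and numbers are hypothetical.

```python
import math
import random

def estimate_cross_entropy(samples, q):
    """Average surprisal log(1/q[s]) over observed samples, in nats."""
    return sum(math.log(1 / q[s]) for s in samples) / len(samples)

random.seed(0)  # fixed seed so the run is repeatable

true_p = {"heads": 0.7, "tails": 0.3}   # normally hidden from the modeller
model_q = {"heads": 0.6, "tails": 0.4}  # our assumed model

# Draw observations from the (normally unknown) process.
samples = random.choices(list(true_p), weights=list(true_p.values()), k=100_000)

estimate = estimate_cross_entropy(samples, model_q)
exact = sum(true_p[s] * math.log(1 / model_q[s]) for s in true_p)

print(f"estimate from samples: {estimate:.4f} nats")
print(f"exact value using P:   {exact:.4f} nats")
```

The estimator never touches `true_p`; it only needs samples and the model Q, which is why cross-entropy can be used as a training loss even though P itself stays unknown.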
Please do a video on the other techniques (the ones in your outro/conclusion). On YouTube, there are already a lot of videos explaining broadly how generative models work, but not so many on the details of the techniques behind them. It could be interesting.
Thank you for your video. I think it is very helpful for understanding the basic ideas behind training generative machine learning models. By the way, we are extremely interested in the concepts (latent-variable models, variational inference, ELBO, etc.) that you mentioned at the end of this video. I would be grateful if you made some videos introducing these concepts.
likelihood, measure of fit, cross entropy, surprise-at-model, loss function -- all of these are expressing a similar idea, building on their various academic creeds
I just recently ran across your videos. I would love to see videos relating to search, iterative inference and space reduction. In this video you mention gradient descent; I have not looked at it yet. I'll browse a bit and look for other optimization topics. Thanks a lot!
While the entropy explanation is correct, I personally feel that it would have helped to include an unfair coin and compare it with the fair coin, in addition to the weird coin that lands on its edge. To some people's surprise, the fair coin has the most surprise, which might not be intuitive if someone gets the concept of entropy (Shannon's, and in general) wrong.
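The fair-versus-unfair comparison suggested above is easy to check numerically. The sketch below uses a helper named `coin_entropy` and a handful of bias values chosen purely for illustration.

```python
import math

def coin_entropy(p_heads):
    """Shannon entropy in bits of a coin that lands heads with probability p_heads."""
    total = 0.0
    for p in (p_heads, 1 - p_heads):
        if p > 0:  # an impossible outcome contributes nothing
            total += p * math.log2(1 / p)
    return total

# Hypothetical biases, from fair to fully deterministic.
for p_heads in (0.5, 0.7, 0.9, 0.99, 1.0):
    print(f"P(heads) = {p_heads:.2f}  entropy = {coin_entropy(p_heads):.3f} bits")
```

The fair coin gives exactly 1 bit, and every biased coin gives less: a rare tail is very surprising, but it happens too seldom to raise the average.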
Great video, however your weather example is actually wrong. %pop (rain chance) is given to you in %Area Covered. I.e. for a given regional boundary, 30% of the region will experience X precipitation. (Therefore frequentist view holds)
This is probability theory, published 1987. The theory is applied in many fields, such as quantum physics, filter design, economics, ML and administration, because decisions are made using probability. It is an advanced theory that everyone should know.
8:44 In quantum mechanics, there are infinite-dimensional vector spaces. I don't think 10,000-dimensional probability spaces should be much of a problem for humans😅
Amazing and clear explanation, thank you for the video! Since we are optimizing for the cross-entropy and not KL divergence in practice, this means that although optimizing for both functions is the same process, the ideal loss that we are expecting is not 0?
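A small numerical sketch bears on this question, using the standard identity H(P, Q) = H(P) + KL(P || Q). The distributions are made-up numbers and the helper functions are defined here only for illustration.

```python
import math

def entropy(p):
    """H(P) in nats."""
    return sum(p_s * math.log(1 / p_s) for p_s in p if p_s > 0)

def cross_entropy(p, q):
    """H(P, Q) in nats."""
    return sum(p_s * math.log(1 / q_s) for p_s, q_s in zip(p, q) if p_s > 0)

def kl_divergence(p, q):
    """KL(P || Q) in nats."""
    return sum(p_s * math.log(p_s / q_s) for p_s, q_s in zip(p, q) if p_s > 0)

p = [0.7, 0.2, 0.1]        # hypothetical "true" distribution
q_bad = [0.4, 0.4, 0.2]    # hypothetical imperfect model
q_perfect = list(p)        # a model that matches P exactly

# The gap between cross-entropy and entropy is the KL divergence.
gap = cross_entropy(p, q_bad) - entropy(p)

# With a perfect model the KL term vanishes, but the loss does not.
floor = cross_entropy(p, q_perfect)

print(f"H(P,Q) - H(P) = {gap:.4f}, KL(P||Q) = {kl_divergence(p, q_bad):.4f}")
print(f"best achievable cross-entropy = {floor:.4f}, H(P) = {entropy(p):.4f}")
```

So the lowest reachable cross-entropy is H(P), which is zero only when P itself is deterministic, for example when every training label is a one-hot target.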
Best explanation of KL divergence I've heard on YouTube - thank you!
So far... one of the clearest videos about entropy and KL divergence...
Good Motion Design too...
This is Artem Kirsanov's golden year. Posting banger after banger. Much love, your videos are a gem
Thank you so much!
The most succinct explanation of entropy I have heard, the explanation of cross-entropy was very insightful too
I'd be very keen for a video about variational inference. Have been loving your content
I add myself to the request
+1 to that. Thank you so much for the amazing videos!!!!
100%
+1
+1
This video is the new gold standard of stats introductions for dummies; so clear and informative. I'll link it to the next person I find at the beginning of their stats journey! Thanks, bro, for what you do.
This is not an introduction for dummies but an explanation by a brilliant communicator.
Very impressive how clearly you explained entropy, cross-entropy, and KL divergence with the idea of surprise and great visuals. Well done and thank you for this
Keep posting please!
I am addicted to this channel. I love how you articulate every subject to explain these concepts. Thank you so much, the world really needs more people like you!!! ♥
This is the clearest explanation of probability and entropy I've ever seen. Please create more videos.
Brilliant, Artem! Your videos are the crown jewels of ML educational content on YouTube!! So intuitive!
Personally, after the third time someone predicted the die roll, I would be exponentially more surprised than after the first time
You misunderstand why he said that surprisal is additive
@@George70220 Because the notion of surprise is confusing in this context. It is a poor name for it. Humans are very bad with random events, and surprise does not follow an entropic distribution.
The additivity allows us to say that H(P,Q) = H(P) + KL(P,Q)
Yeah, the gambler's fallacy does make quite a difference in the real-life interpretation of random events
The main thing is that it’s 0 at 1 and infinite at 0
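To make the points in this thread concrete, here is a minimal Python sketch (not from the video). It checks that the surprisal of three correct die predictions in a row is three times the surprisal of one, that surprisal is 0 at probability 1 and infinite at probability 0, and that H(P,Q) = H(P) + KL(P,Q) holds numerically. The distributions P and Q and all function names are made up for illustration.

```python
import math

def surprisal(p):
    """Surprise, in nats, of seeing an event we assigned probability p."""
    return math.inf if p == 0 else math.log(1.0 / p)

def entropy(p):
    """Average surprise when outcomes follow p and we believe p."""
    return sum(ps * surprisal(ps) for ps in p if ps > 0)

def cross_entropy(p, q):
    """Average surprise when outcomes follow p but we believe q."""
    return sum(ps * surprisal(qs) for ps, qs in zip(p, q) if ps > 0)

def kl_divergence(p, q):
    """Extra average surprise caused by believing q instead of p (assumes q > 0)."""
    return sum(ps * math.log(ps / qs) for ps, qs in zip(p, q) if ps > 0)

# Independent events: probabilities multiply, so surprisals add.
one_roll = surprisal(1 / 6)
three_rolls = surprisal((1 / 6) ** 3)
print(three_rolls, 3 * one_roll)

# Hypothetical "true" distribution P and model Q over three states.
P = [0.5, 0.3, 0.2]
Q = [0.2, 0.5, 0.3]
print(cross_entropy(P, Q), entropy(P) + kl_divergence(P, Q))
```

So the mathematical "surprise" grows linearly with each extra correct prediction, even if the human feeling of surprise grows much faster.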
To me at least, this is fundamental as a concept. But to see people put effort into making the maths to explain it is amazing.
Concepts are dual to percepts -- the mind duality of Immanuel Kant.
Making predictions to track targets and goals is a syntropic process -- teleological.
Teleological physics (syntropy) is dual to non teleological physics (entropy).
Syntropy (prediction) is dual to increasing entropy -- the 4th law of thermodynamics!
Certainty (predictability, syntropy) is dual to uncertainty (unpredictability, entropy) -- the Heisenberg certainty/uncertainty principle.
Mind (syntropy) is dual to matter (entropy) -- Descartes or Plato's divided line.
"The brain is a prediction machine" -- Karl Friston, neuroscientist.
"Always two there are" -- Yoda.
@@hyperduality2838 I think you replied to the wrong comment
@@1st_ProCactus I am just informing you that there is a 4th law of thermodynamics based upon teleology.
"Philosophy is dead" -- Stephen Hawking.
Mainstream physics, and therefore science, is currently dominated by teleophobia and eliminative materialism, hence a new law based upon teleology is not going to be popular.
Teleophilia is dual to teleophobia.
Alive is dual to not alive -- Schrodinger's cat.
Being is dual to non being creates becoming -- Plato's cat.
Stephen Hawking accepted the metaphysics of Schrodinger's cat so philosophy is not dead!
Syntax is dual to semantics -- languages, communication or information (surprise).
If mathematics is a language then it is dual.
Stephen Hawking was a mathematician, so he was using duality.
Messages in a communication system are predicted into existence using probability -- Shannon's information theory, and making predictions is a syntropic process!
"We predict ourselves into existence" -- Anil Seth, neuroscientist.
Predicting ourselves into existence is a syntropic process hence there is a 4th law of thermodynamics!
Synchronic points/lines are dual to enchronic points/lines.
Points are dual to lines -- the principle of duality in geometry.
Objective information (syntax) is dual to subjective information (semantics) -- information is dual.
All Information (surprise) has structure or form and meaning hence it is dual.
Absolute truth is dual to relative truth -- Hume's fork.
I assumed that you are interested in absolute truth.
Comments, messages, languages are dual and the cells, neurons in your body are all communicating with each other all the time hence you are built out of duality.
Duality creates reality.
Cross entropy explains magic shows.
The magician is trying to get the audience to believe the wrong model ("Nothing up my sleeve") and therefore be surprised (and hopefully delighted) by the outcome.
Basic Pledge-Turn-Prestige process:)
This is really good, high-quality content!! You have no idea, this is the first time in a very long time that I've been able to watch a math YouTube video from start to finish
Every single person interested in Business Administration (as a student or a professional) should be good at probability and statistics.
Very instructive video.
Thanks a lot.
Never skip a Kirsanov video
When we get tired of statistical cookbook procedures it is great to go back to the underlying probability logic (it's probability all the way down). This applies not only to machine learning models but to all types of statistical models. For example, in regression models when one studies the "residual" error component. It also explains science stories when an experiment produces a surprising result and first a scientist and then the scientific community realizes that they need a new theory (model) -- a "scientific revolution". A scientific revolution should occur when the K-L divergence of experimental results is large (or else they need to find a mundane explanation for the experimental error).
Simple to follow, and crystal clear. Well done!
Your animation is so incredible, I even watched the ad.
This is the most lucid unpacking of these concepts that I've ever seen. Great work as always.
I would definitely look forward to a variational inference video
About a year ago, I was building a neural network library from scratch, but I was having some trouble understanding some of the intuition behind binary cross-entropy. Watching this video was the first time I felt like I actually understood the concepts behind cross-entropy!
Perfect presentation, only tiny error at 19:18 in the lower right corner where for a second the word Tails changes to Heads right at the slide change. Thank you very much for your great work, I think we all really appreciate it.
That's one of the best videos I have ever seen on the probabilistic foundations of ML!
Gotta say, you're really good at teaching
What an amazing introduction to this point of view of probability!
perfect timing for this content, I was looking for more intuitive explanations of cross-entropy lately, thanks!
I have 4 ideas: 1) causal representation learning, 2) the difference between PCA, ICA and spectral embedding, 3) the difference between singular value decomposition and eigenvalue decomposition, and 4) Bayes' equation. I love how you explain things, and in this video you did indeed beautifully illustrate one of the most important equations in DL. Coming from a PhD student in machine learning.
You called to my soul by mentioning entropy in the title
I'm here for entropy too, yo!!!
Amazing video. It's so rare finding such quality nowadays
Great job explaining this. I’ve always had difficulty getting an intuition around it
You answered a lot of my questions, the ones I had before the video and also the ones I was having while watching it. Thank you ❤❤
This channel is gold. Suggested topic for a future video: The details of training GANs
you should do a video building on top of this explanation of entropy and cross-entropy to explain what perplexity is.
A good example for cross-entropy (or mutual information) is asking "how well aligned are the two images?" when one image is a black square on a white background and the other is a white square on a black background.
Minimising surprise by some measures means ensuring white=white and black=black, forcing the squares not to touch; other measures force the two squares on top of each other, so that each colour is related to only one colour in the other picture.
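A rough sketch of that image example in plain Python, under my own assumptions: tiny 8x8 binary images (1 = white, 0 = black), a 4x4 square, and two hypothetical scores. `match_fraction` rewards white=white and black=black, while `mutual_information` only asks whether each colour in one image predicts a single colour in the other.

```python
import math
from collections import Counter

def pixel_pairs(img_a, img_b):
    """Pair up pixel values at matching positions in two same-sized images."""
    return [(a, b) for row_a, row_b in zip(img_a, img_b) for a, b in zip(row_a, row_b)]

def match_fraction(img_a, img_b):
    """Fraction of positions where both images have the same colour."""
    pairs = pixel_pairs(img_a, img_b)
    return sum(a == b for a, b in pairs) / len(pairs)

def mutual_information(img_a, img_b):
    """Mutual information (nats) between the colours of matching pixels."""
    pairs = pixel_pairs(img_a, img_b)
    n = len(pairs)
    joint = Counter(pairs)
    marg_a = Counter(a for a, _ in pairs)
    marg_b = Counter(b for _, b in pairs)
    return sum(
        (c / n) * math.log((c / n) / ((marg_a[a] / n) * (marg_b[b] / n)))
        for (a, b), c in joint.items()
    )

def square_image(size, top, left, side, fg, bg):
    """A size x size image with a side x side square of colour fg on background bg."""
    return [
        [fg if top <= r < top + side and left <= c < left + side else bg
         for c in range(size)]
        for r in range(size)
    ]

black_on_white = square_image(8, 2, 2, 4, fg=0, bg=1)
white_on_black_aligned = square_image(8, 2, 2, 4, fg=1, bg=0)
white_on_black_shifted = square_image(8, 4, 4, 4, fg=1, bg=0)

for name, other in [("aligned", white_on_black_aligned), ("shifted", white_on_black_shifted)]:
    print(name, match_fraction(black_on_white, other), mutual_information(black_on_white, other))
```

With the squares on top of each other, no pixel matches in colour, yet mutual information is at its maximum because black always pairs with white; shifting the square raises the colour-match score while mutual information drops.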
You really know how to explain complex things in simple ways. Thank you for this video!
Great video, Artem! Please be careful with affirmations and statements like 'always result in noise'. Since we're dealing with scientific and mathematical concepts, being precise with your words is essential.
This was amazing, thank you so much! I would LOVE to see a video on VAEs / ELBO or the various ways we can use generative models by trying to learn the joint distribution.
Underrated YouTube channel. Thanks boss
Thank you for your hard work, dedication, mathematics, statistics, science, kindness, and generosity. May GOD reward you
Great video and not an easy one to make. Delighted to see you position and reference the Free Energy Principle
Yes please do videos on Bayesian inference! This video was great!
Love your explanations & visualisations. You should consider making a course on the mathematics of ML, and DL. We would definitely tune in!
Keep up the videos. They are really excellent and intuitive
Amazing video! Please make more on topics like variational inference, ELBO, etc!!!
wow, this metaphor is really beautiful, and this metaphor is really the core idea of diffusion process.
Time to go read Hitchhikers Guide again. Nice video!
Thank you for solving my long-lasting questions
Really liked the way you motivated the definition of entropy, thanks a lot
OMG, thank you so much! I was looking to understand this subject and searched for you because you are the best out there! ❤
A truly excellent presentation!
That's a great way to explain KL-Divergence. I never understood the parameters until now!
thanks!
Surprise feels so intuitive, but maths makes it quantifiable. Keep these videos coming. Thank you 👏
Thanks!
Thank you for your clearly presented video. I wanted to comment about my own experience with KL divergence. First, I note that KL divergence is not a metric; it fails the triangle inequality, for example. This leads to a sort of discontinuity when trying to do inference, for example when trying to infer the so-called "even process" from its outputs, and causes a more complex model to be inferred when using KL divergence as a comparator for distributions.
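For anyone who wants to see the "not a metric" point with numbers, here is a small sketch using three biased coins. The probabilities 0.5, 0.25 and 0.05 are arbitrary values I picked, and `kl_bernoulli` is a hypothetical helper that assumes both arguments lie strictly between 0 and 1.

```python
import math

def kl_bernoulli(p, q):
    """KL divergence (nats) from coin P(heads)=p to coin P(heads)=q, for 0 < p, q < 1."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

p, q, r = 0.5, 0.25, 0.05

# Asymmetry: swapping the arguments changes the value.
print("KL(p, r) =", kl_bernoulli(p, r))
print("KL(r, p) =", kl_bernoulli(r, p))

# Triangle inequality: going "directly" from p to r costs more than the detour via q.
direct = kl_bernoulli(p, r)
detour = kl_bernoulli(p, q) + kl_bernoulli(q, r)
print("direct =", direct, " detour =", detour)
```

For these three coins the direct divergence is larger than the sum along the detour, which a true distance would never allow.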
Beautiful visuals and well presented.
Thank you!
Fantastic explanation!!
Very well explained. These essential principles are vital for deep understanding of modern AI technologies.
Awesome video, can't wait for the next fundamentals of probability
Thank you. I am very surprised with the explanation. It is a masterpiece
This is exactly the kind of video I love. Abstract topics beautifully described! Although perhaps the intro was a bit long and most people watching the video know these basic concepts.
Your work is a gift to the world. Thank you!
Great video. I'm really interested in latent-variables models, ELBO, VAE and so on
Best cross-entropy explanation!
Nice and concise. Should have a photo of Claude Shannon instead of Rev. Bayes. The latter knew nothing of entropy -- he lived in the century before the physicists and one more before Shannon. (Another commenter caught the misspelling of "Function".)
Great animations in the video with a good level of detail.
Thank you so much for this fantastically clear and useful video! Question about the formula for cross-entropy: you gave an example that showed that the asymmetry of the formula makes sense (the example of assuming a fair coin versus assuming a rigged coin). However, you didn't "motivate" the formula for cross entropy. I mean, if somebody asked me to derive the formula, I wouldn't know how to explain the choice of the positions of p_s versus q_s in the formula and why they shouldn't be switched. What explains the choice? What explains why we pick one for the weights/coefficients of the probability terms and the other as the arguments of the natural logarithms?
Thanks! Sure, let me explain.
When we believe in the model Q, with q_s representing the assumed probability of state s, each time we observe an instance of s, we experience log(1/q_s) units of surprisal. As we continue to observe instances, we might encounter different states (like s2) with different probabilities under our model, resulting in different amounts of surprise (log(1/q_s2), etc.).
Now, the crucial question is: What generates these samples? What determines which state we're going to observe? Since we're observing the process unfold in the real world, these samples are actually distributed according to P, with each state s having a probability p_s of occurring.
So, to break down the formula:
1) The term inside the logarithm (1/q_s) represents our surprise, which only makes sense in the context of a particular belief about probability - our internal model. This is why Q is associated with the log term.
2) When we calculate the expected value of this subjective surprise, we need to account for how often each state actually occurs. The random variable itself is governed by P, so we use p_s as weighting coefficients.
In essence, p_s * log(1/q_s) gives us the contribution to the average surprise: how often a state occurs (p_s) multiplied by how surprised we are when it does occur (log(1/q_s)).
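Here is a minimal Python sketch of that breakdown, with the fair and rigged coins as the two distributions. The rigged probabilities (0.99 / 0.01) are numbers assumed for illustration, and the function assumes the model never assigns probability 0 to a state that can occur.

```python
import math

def cross_entropy(p, q):
    """Average surprise (nats): states occur with frequency p_s, surprise is log(1/q_s)."""
    return sum(ps * math.log(1.0 / qs) for ps, qs in zip(p, q) if ps > 0)

fair = [0.5, 0.5]       # [P(heads), P(tails)]
rigged = [0.99, 0.01]   # assumed numbers for a heavily rigged coin

# The coin is really fair, but we believe it is rigged: tails shocks us half the time.
believe_rigged_world_fair = cross_entropy(fair, rigged)

# The coin is really rigged, but we believe it is fair: every toss is mildly surprising.
believe_fair_world_rigged = cross_entropy(rigged, fair)

print(believe_rigged_world_fair, believe_fair_world_rigged)
```

Swapping the roles of p_s and q_s gives a different number, which is why the positions in the formula matter.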
Hope this clarifies things a bit!
This is a great explanation. Thank you so much!
Great content! I can see multiple concepts used from FEP, greatly presented and explained. Keep up the good work man, hope we get to have a chat some day about these!
Thank you for describing what entropy is. I want to learn about latent-variable models and VAEs. When will you upload videos on those topics?
I missed having some references we could dig into. Btw, fascinating topic and awesome video!
Thanks, Artem! Please go into variational inference or even free energy too! Very interested!
I’d like to better understand what kinds of operations are allowed with probability distributions. For example I think that we never really have access to the true distribution P in any real world scenario. All we have access to are models, Q. By sampling events from P, we can estimate the cross entropy. However I don’t see how P can ever be actually known.
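A small sketch of that sampling idea, under made-up assumptions: the weather distribution `true_p` stands in for the unknown P and is used only to generate data, while the estimate itself touches nothing but the samples and the model `model_q`. All names and numbers are hypothetical.

```python
import math
import random

def estimate_cross_entropy(samples, q):
    """Average surprisal (nats) of the observed samples under model q (state -> probability)."""
    return sum(math.log(1.0 / q[s]) for s in samples) / len(samples)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hidden in practice; here it only plays the role of "the real world".
true_p = {"sun": 0.7, "rain": 0.2, "snow": 0.1}
model_q = {"sun": 0.5, "rain": 0.3, "snow": 0.2}

samples = rng.choices(list(true_p), weights=list(true_p.values()), k=100_000)
estimate = estimate_cross_entropy(samples, model_q)

# For comparison only: the exact value, which needs the true P we normally do not have.
exact = sum(p * math.log(1.0 / model_q[s]) for s, p in true_p.items())
print(estimate, exact)
```

The estimate needs only observations, which is why cross-entropy can serve as a training loss even though P itself stays unknown.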
Please do a video on the other techniques (from your outro/conclusion). On YouTube there are already a lot of videos explaining broadly how generative models work, but not so many on the details of the techniques behind them. It could be interesting.
It is very similar to the idea of Bayes Theorem, which makes sense of course. But it is nice to see it appear from somewhere else.
Thank you for your video. I think it is very helpful to understand the basic ideas behind training generative machine learning models. By the way, we are extremely interested in the concepts (latent-variable models, variational inference, ELBO, etc. ) that you mentioned in the ending of this video. I would be grateful if you plan to make some videos to introduce these concepts.
KL divergence basically says: OK, just forget about what you're believing (subtract the entropy of distribution P) and now compute the cross-entropy again ❤❤
Very interesting and well explained
That was really well done. Thank you for making this video.
Wow, Artem, superb! You are really knocking it out of the park. Thank you so much!
So beautiful I cried!
Analysts need to learn this day 1
Excellent work, thank you
Please create a video about the training of generative models
Great video. Interested in latent-variable content.
Wow, what a beautiful presentation
I love these videos so much
This would have been soooo good last semester!!!
likelihood, measure of fit, cross entropy, surprise-at-model, loss function -- all of these are expressing a similar idea, building on their various academic creeds
I just recently ran across your videos. I would love to see videos relating to search, iterative inference and space reduction. In this video you mention gradient descent; I have not looked into it yet. I'll browse a bit and look for other optimization topics. Thanks a lot!
While the entropy explanation is correct, I personally feel it would have helped to include an unfair coin and compare it with the fair coin, in addition to the weird coin that lands on its edge. Perhaps to someone's surprise, the fair coin has the highest average surprise (entropy), which might not be intuitive if someone has the concept of (Shannon, and in general) entropy wrong.
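The comparison is quick to sketch in Python. `coin_entropy` is a hypothetical helper, measured in bits, and the biases in the loop are arbitrary examples.

```python
import math

def coin_entropy(p_heads):
    """Entropy in bits of a two-sided coin with the given probability of heads."""
    total = 0.0
    for p in (p_heads, 1.0 - p_heads):
        if p > 0:  # an impossible outcome contributes no average surprise
            total += p * math.log2(1.0 / p)
    return total

for p in (0.5, 0.7, 0.9, 0.99, 1.0):
    print(f"P(heads)={p:.2f}  entropy={coin_entropy(p):.3f} bits")
```

The fair coin comes out at exactly 1 bit, and the entropy falls toward 0 as the coin gets more lopsided: a rare tails is very surprising, but it happens too seldom to raise the average.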
I'd really like to see a video from you on variational inference
Great video, however your weather example is actually wrong. %pop (rain chance) is given to you in %Area Covered. I.e. for a given regional boundary, 30% of the region will experience X precipitation. (Therefore frequentist view holds)
Excellent graphics, superb explanation! Thank you! Keep up the good work! You gained a sub!
This is so beautiful and amazing video. I cannot thank you enough.
Excellent explanation!
It's probability theory, published in 1987. This theory is applied in many fields, such as quantum physics, filter design, economics, ML, and administration, because decisions are made using probability. It is an advanced theory that everyone should know.
8:44 In quantum mechanics, there are infinite-dimensional vector spaces. I don't think 10,000-dimensional probability spaces should be much of a problem for humans😅
Awesome video!
Thank you Artem, this is amazing!
Amazing and clear explanation, thank you for the video!
Since we are optimizing for the cross-entropy and not KL divergence in practice, this means that although optimizing for both functions is the same process, the ideal loss that we are expecting is not 0?
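One way to check this numerically, as a sketch under assumed numbers: take the true distribution to be a biased coin with P(heads) = 0.7 and scan candidate models q on a grid. The helper names are hypothetical.

```python
import math

def entropy(p):
    """Entropy (nats) of a coin with P(heads) = p, for 0 < p < 1."""
    return p * math.log(1 / p) + (1 - p) * math.log(1 / (1 - p))

def cross_entropy(p, q):
    """Cross-entropy (nats) when the coin follows p but the model says q, for 0 < q < 1."""
    return p * math.log(1 / q) + (1 - p) * math.log(1 / (1 - q))

p_true = 0.7                                   # assumed true P(heads)
grid = [i / 100 for i in range(1, 100)]        # candidate models q = 0.01 ... 0.99

best_q = min(grid, key=lambda q: cross_entropy(p_true, q))
best_loss = cross_entropy(p_true, best_q)

print("best q          =", best_q)
print("best loss       =", best_loss)
print("entropy of P    =", entropy(p_true))
print("KL at the best q =", best_loss - entropy(p_true))
```

The cross-entropy bottoms out at q equal to the true p, and its value there is H(P) rather than 0; only the KL part goes to 0, so the floor of the loss is the entropy of the data distribution.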