Intuitively Understanding the Shannon Entropy

  • Published 22 Aug 2024

COMMENTS • 61

  • @maxlehtinen4189
    @maxlehtinen4189 9 months ago +28

    For everyone trying to understand this concept even more thoroughly, towardsdatascience's article "The intuition behind Shannon’s Entropy" is amazing. It gives added insight into why information is the reciprocal of probability.
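
    A quick numeric sketch of that "reciprocal of probability" idea, using the standard self-information I(x) = log2(1/p(x)); the probabilities below are made-up examples, not ones from the video:

        import math

        # Self-information ("surprise") of an outcome with probability p:
        # the rarer the outcome, the more bits of surprise it carries.
        def surprise_bits(p: float) -> float:
            return math.log2(1 / p)

        print(surprise_bits(0.5))    # 1.0 bit   (a fair coin flip)
        print(surprise_bits(1/8))    # 3.0 bits  (one of 8 equally likely outcomes)
        print(surprise_bits(1e-6))   # ~19.9 bits (very unlikely, very surprising)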

  • @bluejays440
    @bluejays440 10 months ago +5

    Please make more videos! This is literally the only time I've ever seen entropy explained in a way that makes sense.

  • @caleblo8498
    @caleblo8498 1 year ago +5

    Some parts of the concept are confusing, but the rethinking process is helpful, as I can surely say:
    the surprise of each possible outcome * its probability, summed over the outcomes, = entropy (the lower the entropy, the less surprising it will be)
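
    A small numeric sketch of that probability-weighted sum; the two distributions here are arbitrary examples chosen only to show the effect:

        import math

        # entropy = sum over outcomes of p_i * log2(1/p_i),
        # i.e. each outcome's surprise weighted by how often it happens
        def entropy(probs):
            return sum(p * math.log2(1 / p) for p in probs if p > 0)

        print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0 bits: uniform, most surprising
        print(entropy([0.9, 0.05, 0.03, 0.02]))   # ~0.62 bits: skewed, less surprising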

  • @zizo-ve8ib
    @zizo-ve8ib 7 months ago

    Bro really explained it in less than 10 mins when my professors don't bother even if it could be done in 5 secs. True masterpiece, this video, keep it up man 🔥🔥🔥

  • @charleswilliams8368
    @charleswilliams8368 5 months ago

    Three bits to tell the guy on the other side of the wall what happened, and it suddenly made sense. Thanks.

  • @xyzct
    @xyzct 2 months ago

    Excellent. Short and sweet.

  • @nicholaselliott2484
    @nicholaselliott2484 8 months ago +1

    Dude, I took information theory from a rigorously academic and formal professor. I'm a little slow, and under the pressure of getting assignments done I couldn't always see the forest for the trees. The sentence "how much information, on average, would we need to encode an outcome from a distribution" just summed up the whole motivation and intuition. Thanks!

  • @MissPiggyM976
    @MissPiggyM976 2 days ago

    Very good!

  • @kowpen
    @kowpen 2 years ago +9

    At 4:54, may I know the reason to consider 10 bits and triples? why not any other combination? Thanks.

    • @adianliusie590
      @adianliusie590 2 years ago +8

      I was just showing arbitrary examples, but I could have chosen many different ones. The triples (when there were 8 outcomes) were to show this could be easily extended to any power of 2, and the 10 outcomes were to show that this also generalises to non-powers of 2.
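
      A sketch of that point, under the usual fixed-length-code reading: 8 equally likely outcomes need exactly log2(8) = 3 bits, while for 10 outcomes the entropy log2(10) ≈ 3.32 bits is no longer a whole number (a fixed-length code has to round up to 4):

        import math

        for m in (2, 4, 8, 10):                 # number of equally likely outcomes
            h = math.log2(m)                    # entropy of the uniform distribution
            fixed = math.ceil(h)                # bits a fixed-length code must use
            print(f"{m} outcomes: entropy {h:.2f} bits, fixed-length code {fixed} bits")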

  • @Sars78
    @Sars78 25 days ago

    Well done, Adian. I just found out (though I'm not surprised at all, in the Shannon sense 🤓) that you're doing a PhD at Cambridge. Congratulations! Best wishes for everything 🙂

  • @caleblo8498
    @caleblo8498 1 year ago +2

    Each slice of probability requires log2(1/p_i) bits to represent, and the overall average (they call it entropy) is the probability-weighted sum over all slices. Each slice of probability is basically one of the expected outcomes, say, getting the combination ABCDEF in a six-letter scramble. (Correct me if I am wrong.)
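
    A sketch of that per-slice view, using a made-up distribution whose probabilities are powers of 1/2 so that every ideal code length is a whole number of bits:

        import math

        probs = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}   # example distribution

        total = 0.0
        for outcome, p in probs.items():
            bits = math.log2(1 / p)          # ideal code length for this outcome
            total += p * bits                # weighted by how often it occurs
            print(f"{outcome}: p = {p}, needs {bits:.0f} bits")

        print(f"average (entropy): {total} bits")   # 1.75 bits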

  • @mathy642
    @mathy642 3 months ago

    Thank you for the best explanation

  • @avatar00001
    @avatar00001 3 months ago

    thank you codexchan

  • @prateekyadav7679
    @prateekyadav7679 1 year ago +3

    What I understand is that entropy is directly related to the number of outcomes, right? So I don't get why we need such a parameter/term when we could simply state the number of outcomes of a probability distribution. What new thing does entropy bring to the table?

    • @derickd6150
      @derickd6150 10 months ago +1

      Consider the case where a biased coin is flipped. There are two outcomes, just like an unbiased coin, but let's say this biased coin has a (0.1)^10000 chance of being heads. Do you have exactly the same information about the outcome beforehand as you do with an unbiased coin?

    • @maxlehtinen4189
      @maxlehtinen4189 9 months ago

      @@derickd6150 yes, it makes sense that a non-uniform distribution should have an effect on the uncertainty of a distribution, but can you explain how the bias affects the outcome via the entropy formula?

    • @derickd6150
      @derickd6150 9 months ago +2

      @@maxlehtinen4189 I'm not sure what you mean by bias here? Edit: Oh right, you're referring to my answer, not something in the video. Yes, well, the entropy formula says something along the lines of: "How many bits do we need to represent the outcome of the coin?" That is a very natural measure of how much information you have about the outcome. If the coin is unbiased, you need one bit. If it is as severely biased as I describe above, and you plug the numbers into the entropy formula, it will essentially tell you "Well... we barely need any bits to describe the outcome, right? We're essentially certain it will be tails." Something intuitively along these lines. Edit 2: to see this, plot y(p) = -p log(p) - (1-p) log(1-p) for p in [0,1]. That is the expression for the entropy of the coin, whatever its bias. You will see that when p is very close to 1 or to 0 (which it is in my example), y(p) is almost 0. This is to say, you need almost no information to represent the outcome; it is just known. You need not transfer any information to someone, on the moon say, for that person to guess that the biased coin I described gives tails. However, when p is 0.5, the entropy is maximised, and so you would need to transfer the most information to someone on the moon to tell them the outcome of the coin, because they cannot use their prior knowledge at all to make any kind of educated guess.
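
      The plot mentioned above is easy to reproduce numerically; a minimal sketch of the binary entropy y(p) = -p*log2(p) - (1-p)*log2(1-p) evaluated at a few biases:

        import math

        def binary_entropy(p: float) -> float:
            """Entropy (in bits) of a coin that lands heads with probability p."""
            if p in (0.0, 1.0):
                return 0.0                       # outcome is certain: no information needed
            return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

        for p in (0.5, 0.9, 0.99, 0.999999):
            print(f"p = {p}: {binary_entropy(p):.6f} bits")
        # p = 0.5 gives 1.0 bit (maximum uncertainty); the entropy falls towards 0
        # as the coin becomes more and more biased.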

  • @AdeshBenipal
    @AdeshBenipal 2 months ago

    Nice video

  • @murilopalomosebilla2999
    @murilopalomosebilla2999 3 years ago +6

    Nice explanation. Keep up the good work, man!

  • @anusaxena971
    @anusaxena971 2 years ago +2

    You CERTAINLY DESERVE MORE VIEWS 👏 👍👍👍👍

  • @mansoor9894
    @mansoor9894 1 year ago +1

    Fantastic job explaining this!

  • @nyx8017
    @nyx8017 8 months ago

    god this is an incredible video thank you so much

  • @RodrigodaMotta
    @RodrigodaMotta 2 years ago +1

    Blew my mind!

  • @user-wi1rj4iw9y
    @user-wi1rj4iw9y 2 years ago

    Thank you for your video. Keep it up! 感谢你的视频. 再接再厉! (Thanks for your video; keep up the good work!)

  • @MaximB
    @MaximB 1 year ago

    Great job. Thank you

  • @chetanwarke4658
    @chetanwarke4658 1 year ago +1

    Simple and precise!

  • @debasishraychawdhuri
    @debasishraychawdhuri 4 months ago

    It does not explain the most important part: how the formula for the non-uniform distribution came about.

  • @sirelegant2002
    @sirelegant2002 7 months ago

    Thank you!

  • @derickd6150
    @derickd6150 10 months ago

    Great video!

  • @tanjamikovic2739
    @tanjamikovic2739 1 year ago

    this is great! i hope you will film more!

  • @lennerdsimon9117
    @lennerdsimon9117 2 years ago +1

    Great video, well explained!

  • @user-cf2yo5qf3h
    @user-cf2yo5qf3h 6 months ago

    Thankkkk youuuuu.

  • @huibosa2780
    @huibosa2780 2 years ago

    Excellent video, thank you!

  • @prateek4546
    @prateek4546 2 years ago

    Wonderful explanation!!

  • @morphos2
    @morphos2 1 year ago

    I didn't quite understand the 4:36 rationale.

  • @Justin-zw1hx
    @Justin-zw1hx 1 year ago

    awesome!

  • @robertwagner5506
    @robertwagner5506 2 years ago

    great video thank you

  • @zgz97
    @zgz97 2 years ago

    beautiful explanation :)

  • @karlzhu99
    @karlzhu99 1 year ago

    Uncertainty is a confusing way to describe this. For the lottery example, wouldn't you be very certain of the outcome?

    • @TUMENG-TSUNGF
      @TUMENG-TSUNGF 1 year ago

      It’s about the numbers, not whether you win the lottery or not.

  • @alixpetit2285
    @alixpetit2285 2 years ago

    Nice video, what do you think about set shaping theory (information theory)?

  • @corydkiser
    @corydkiser 2 years ago

    awesome

  • @aj7_gauss
    @aj7_gauss 9 months ago

    can someone explain the triplets part

  • @azerack955
    @azerack955 1 year ago

    I don't quite understand the very last step. What does summing over all the probability outcomes give us?

    • @vibhanshugupta1729
      @vibhanshugupta1729 1 year ago +1

      That is the way we calculate expectation values. For a random variable X which takes values {x_i}, E(X) = sum_i P(x_i) * x_i.
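
      And entropy is exactly that expectation applied to the surprise log2(1/p(x)); a minimal check with an arbitrary three-outcome distribution:

        import math

        def expectation(values, probs):
            """E[X] = sum_i P(x_i) * x_i."""
            return sum(p * x for x, p in zip(values, probs))

        probs = [0.5, 0.25, 0.25]
        surprises = [math.log2(1 / p) for p in probs]   # the "value" attached to each outcome
        print(expectation(surprises, probs))            # 1.5 bits = the entropy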

    • @AkashGupta-th2nm
      @AkashGupta-th2nm 11 months ago

      Intuitively, you sum over it to get some understanding of the average uncertainty.

  • @AniketKumar-dl1ou
    @AniketKumar-dl1ou 11 months ago

    You should have written H[U(x)] = logM / M
    to better relate the entropy explanation.
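
    For what it's worth, a quick check of the uniform case: each of the M terms in the entropy sum contributes (1/M) * log2(M), and the M terms together add up to log2(M):

        import math

        M = 8
        per_term = (1 / M) * math.log2(M)    # one slice of the sum: (log2 M) / M
        print(per_term)                      # 0.375
        print(M * per_term)                  # 3.0 = log2(M), the entropy of the uniform distribution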

  • @diy_mo
    @diy_mo 9 months ago

    I expected something else, but it's also ok.

  • @whoisray1680
    @whoisray1680 1 year ago +1

    Why 1/p?????????

    • @lsacy8347
      @lsacy8347 1 year ago

      I'm not too sure, but I think it's just expressing the M possible outcomes in bits: considering there are M outcomes with equal probability p, we have p = 1/M -> 1/p = 1/(1/M) = M.

  • @bodwiser100
    @bodwiser100 7 months ago

    I appreciate your effort, but the video is quite confusing. For example, in the example about 8 football teams, you explain why 3 bits are required by flat out stating as a starting premise that 3 bits are required! It's a circular argument.
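
    For anyone who wants the non-circular direction: the 3 comes from needing enough distinct binary labels for 8 equally likely teams, and 2^3 = 8 is exactly enough; a sketch with placeholder team names:

        # 3 bits give exactly 2**3 = 8 distinct labels, one per team
        teams = [f"team_{i}" for i in range(8)]              # placeholder names
        codes = {team: format(i, "03b") for i, team in enumerate(teams)}
        print(codes)   # {'team_0': '000', 'team_1': '001', ..., 'team_7': '111'}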

  • @energy-tunes
    @energy-tunes 1 year ago

    This seems so intuitive; why did it take so long to get "discovered"?

  • @axonis2306
    @axonis2306 2 years ago +1

    Most of your understanding is good, but 4:50 is an unnecessary leap of logic. At level this introductory is probably best to assume outcomes to be at 2^n.

  • @2011djdanny
    @2011djdanny 2 years ago +1

    The example is even more difficult than the concept itself 🤦🏼‍♂️😃
    Nice try, by the way.

  • @mikes9012
    @mikes9012 2 years ago +4

    this sucks, really unintuitive