Revision: up to 2:30
5:05 In a standard communication system, what we want to do is exploit the redundancy that already exists in the real source and get rid of it, shrinking the file (i.e. compressing it first to turn it into an idealized source) so that we can feed it into the channel coding method.
The Bent Coin 7:15
Shannon's Claim: 11:00
Entropy: 19:30
The Weighing Problem Solution: 21:08
The Bent Coin Lottery: 47:15
idk who you are but thank god you exist
Why did David say at 49:40 "if this was a course for real"?
I'm going through his lectures as supplementary material for my uni class, and after hearing him use that phrase I got curious as to what he meant by it.
Sir D Mackay was an inspiration. RIP
Rest in peace David :) Thanks for the lectures.
Noisy channel: the audio should have been cleaned up first! It's distorted, in fact.
Hi Jack, thank you for pointing this out. I wonder if this was on purpose given that the noisy channel is a core part of the course. I just re-listened to the lecture, I think it's mostly an issue of over-saturation and clipping during recording which unfortunately is harder to fix than noise.
@λ 3 well, it's all about noisy channels..
The question as stated is about minimizing the guaranteed number of weighings, not the expected number of weighings, so we really ought to be maximizing the minimum information we can get from each weighing, rather than the entropy.
Can you elaborate?
Applying entropy for this weighing problem is peculiar, since the entropy depends on how we are defining the states of the system. For example, for the first weighing, it seems totally irrelevant whether the scale tips left or tips right. So in this view, it would be preferable to set up the first weighing so that it is equally probable for the scale to tip at all as opposed to being balanced. This would indicate that the most informative (greedy) first weighing is actually the case where we have a 50% chance of leaving the odd ball out, which is to weigh three against three and leave six aside.
However, in that case I think there may be a conflation between the physical state of the scale and the epistemic state of the balls. The correct approach to a greedy solution is to maximize the epistemic entropy, which I believe is achieved by the 4 vs. 4 weighing.
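For anyone who wants to check this numerically, here is a minimal sketch (my own, not from the lecture) that enumerates the 24 equally likely hypotheses (12 balls, each possibly heavy or light) and computes the outcome entropy of candidate first weighings:

```python
from math import log2
from collections import Counter

# 12 balls, exactly one is odd: 24 equally likely hypotheses (ball, heavy/light).
HYPOTHESES = [(ball, sign) for ball in range(12) for sign in (+1, -1)]

def outcome(left, right, hypothesis):
    """Tilt of the scale: +1 left pan down, -1 left pan up, 0 balanced."""
    ball, sign = hypothesis
    return sign * ((ball in left) - (ball in right))

def weighing_entropy(left, right):
    """Entropy (bits) of the outcome distribution of weighing `left` vs `right`."""
    counts = Counter(outcome(left, right, h) for h in HYPOTHESES)
    n = len(HYPOTHESES)
    return -sum(c / n * log2(c / n) for c in counts.values())

for k in (3, 4, 6):
    left, right = set(range(k)), set(range(k, 2 * k))
    print(f"{k} vs {k}: H = {weighing_entropy(left, right):.3f} bits")
# 3 vs 3: H = 1.500 bits
# 4 vs 4: H = 1.585 bits  (log2(3), the maximum)
# 6 vs 6: H = 1.000 bits
```

The 4 vs 4 weighing wins on both counts here: it maximizes the entropy, and it also minimizes the worst-case number of surviving hypotheses (8 of 24, versus 12 of 24 for 3 vs 3 or 6 vs 6), which is what the minimax argument above asks for.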
high-value talk, thank you for uploading
At 46:00 of this excellent video, Professor MacKay agrees with the *brilliant* student question that maximizing entropy (or asking questions which maximize equiprobable outcomes) is *not* always the best strategy in these games.
Why not? Scratching my head in Silicon Valley! Matt
He was talking about information coming from random noise, which would have high Shannon information content but low relevance. In that case, when you're looking at random noise (he gives the example of static on a TV screen), you're maximizing entropy, but that's not an effective strategy (the example game he gave involved the colors of balls on a TV screen, or something like that).
Thanks for uploading these excellent lectures (R.I.P. David).
Thank you so much for uploading these. They're great.
David MacKay was one of the greatest lecturers I have come across in my studies. Glad you enjoyed the content.
The clipping reduces as the lecture goes on, so the information content is regained. I think maybe you shout when you are nervous. More interstitial melodies, perhaps. The beautiful entropy.
yes this works at 36:27
anyone have an example where this sort of greedy strategy fails to produce the best strategy?
exactly. did you get an answer?
After half a year of on-and-off thinking about this problem, I think I have found a suitable example. Shannon has 14 golden balls, 8 with a circle on them and 6 with a square. Exactly one of them is actually Harry Potter's golden snitch from Quidditch. You are allowed the following questions to find out which one it is: 1) Shannon tells you whether the snitch has a circle on it or a square. 2) Shannon separates the balls into two groups of 7, one having only circles, and tells you which group contains the snitch. 3) You choose a group (circles or squares) and divide it into two. Shannon shows you a half that does not contain the snitch.
I will not go too much into it, but I believe question 2) has the highest entropy (1 bit), yet choosing it will actually not help you find the snitch. If you get unlucky, you will not be able to find the snitch in fewer than 5 questions, whereas using only questions 1) and 3) gives an answer in 4 questions.
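To check the entropy claim, here is a tiny sketch (mine, using the commenter's setup of 8 circles and 6 squares) that computes the entropy of the answer to the first two question types:

```python
from math import log2

def entropy_bits(probs):
    """Shannon entropy in bits of a discrete outcome distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

# 14 equally likely balls: 8 with a circle, 6 with a square.
print(f"Q1 (circle or square?): {entropy_bits([8/14, 6/14]):.3f} bits")  # ~0.985
print(f"Q2 (which group of 7?): {entropy_bits([7/14, 7/14]):.3f} bits")  # 1.000
```

Question 2 does extract a full bit, but its groups cut across the circle/square structure (7 circles versus 1 circle plus 6 squares), which seems to be why its answer aligns poorly with what the later questions can act on.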
Repetition (redundancy) is dual to variation -- music.
Certainty is dual to uncertainty -- the Heisenberg certainty/uncertainty principle.
Syntropy (prediction) is dual to increasing entropy -- the 4th law of thermodynamics.
Randomness (entropy) is dual to order (predictability) -- "Always two there are" -- Yoda.
Teleological physics (syntropy) is dual to non teleological physics.
Duality: two sides of the same coin.
Fantastic course! Does it exist in 720p by any chance?
No problem. I found the schematics hard to read too (e.g. around 1:17); they are in the corresponding book, available freely at: www.inference.org.uk/itprnn/book.pdf
Interesting lecture, but I don't understand the second weighing entropy computation. How can 4 good balls and 4 light balls result in a balanced weighing with probability 0.5?
Besides that, there is only one odd ball, and it is either lighter or heavier. How can a heavy ball and a light ball be assumed to appear at the same time?
Any help would be very much appreciated. Thank you very much.
Thanks for your question. Which part of the lecture are you referring to? David's example assumes 12 balls, out of which one is either heavier or lighter than the others. If you are weighing 4 vs 4, there is a 4 / 12 = 1 / 3 probability that the uneven ball is not part of the 8 balls being compared, in which case the scale is even. If it is part of the weighing process (8 / 12) it is equally likely to be on either side, which will make the scale tilt left or right with (1 / 2) * 8 / 12 = 1 / 3 probability.
Thank you for your answer. I mean at 30:30. At the second weighing there are supposed to be only 8 balls, but there are many combinations, e.g. GGGG vs LLLL, GGG vs LLL, etc.
ua-cam.com/video/y5VdtQSqiAI/v-deo.htmlm30s
Thank you very much.
Yes, there are a lot of options. He actually asks for suggestions from the audience to collect a few of them. His point is that we should pick the combination with the largest uncertainty (entropy) in the outcome.
At 34:12, why is the probability 2/8 in the balanced case of HHL vs HHL? And in the case of GGG vs LLL (LHHHH), why are the probabilities 0, 5/8, 3/8?
There are 8 possible candidates for the 'deviant' ball, it could be any of the 4 'heavy' (H) or 'light' (L). Weighing HHL, HHL leaves 2 of the 8 possible candidates on the table. If one of these two is the ball with different weight, the scale balances. Thus 2 / 8 = 1 / 4 is the probability of the scale balancing.
Similar reasoning applies to the GGG, LLL scenario. This option leaves 5 out of 8 candidates on the table, causing a 5 / 8 chance of balancing.
In the remaining cases, one of the LLL balls is the odd one, and that side goes up.
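If it helps, here is a short sketch (mine, not from the lecture) that enumerates the 8 remaining hypotheses and reproduces these outcome probabilities for both candidate second weighings:

```python
from collections import Counter

# After the first 4 vs 4 weighing tips, 8 hypotheses remain:
# four balls possibly heavy (H1..H4) and four possibly light (L1..L4).
HYPS = [(f"H{i}", +1) for i in range(1, 5)] + [(f"L{i}", -1) for i in range(1, 5)]

def distribution(left, right):
    """Probabilities of tilt outcomes (+1 left down, 0 balance, -1 left up)."""
    counts = Counter(sign * ((ball in left) - (ball in right)) for ball, sign in HYPS)
    return {tilt: counts.get(tilt, 0) / len(HYPS) for tilt in (+1, 0, -1)}

# HHL vs HHL: two possibly-heavy and one possibly-light per pan, two candidates aside.
print(distribution({"H1", "H2", "L1"}, {"H3", "H4", "L2"}))  # {1: 0.375, 0: 0.25, -1: 0.375}
# GGG vs LLL: known-good balls never tilt the scale, so the good pan is modelled as empty.
print(distribution(set(), {"L1", "L2", "L3"}))  # {1: 0.375, 0: 0.625, -1: 0.0}
```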
Does that clarify?
@JakobFoerster Thanks for your kindness. Yes, I got it.
I was confused about the meaning of the symbols because I missed the "Possibly".
I find this fascinating and was trying to work out the math. How did he get 1.48 bits in the 5 vs 5 example at 26:22? I tried averaging log base 2 of 5/12, 2/12, and 5/12, divided by 3, but that is clearly wrong. Would appreciate any help. Thanks.
He is calculating the entropy of the possible outcomes, i.e. the probability-weighted average of the information content. In 2/12 of cases the scale is even, in 5/12 it tilts left, and in 5/12 it tilts right. The entropy is the sum over events of p(event) * log(1/p(event)), divided by log(2) to convert to bits:
H = (2/12 * log(12/2) + 5/12 * log(12/5) + 5/12 * log(12/5)) / log(2)
You can type this into google search and reproduce the 1.48 bits result:
www.google.co.uk/search?q=(2%2F12*log(12+%2F+2)+%2B+5%2F12*log(12+%2F+5)+%2B+5%2F12*log(12+%2F+5)+)%2Flog(2)
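Or, equivalently, a couple of lines of Python (my illustration of the same formula):

```python
from math import log2

# Outcome probabilities of the 5 vs 5 weighing: balance, tilt left, tilt right.
probs = [2/12, 5/12, 5/12]
entropy = sum(p * log2(1 / p) for p in probs)
print(f"H = {entropy:.2f} bits")  # H = 1.48 bits
```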
Thank you very much. I forgot to multiply log base 2 of 1/p(x) by p(x). Great to know there are good, helpful people out there...
Now that I think about it, though: how is this an average when I am summing all the bits? Thanks again.
@Stephen Waite He is adding them, but each term is weighted by the corresponding probability. That's the definition of an average: sum(x * P(x)). Glad I could help!
You are summing the information content of each event after multiplying it by the probability of that event occurring.
When you calculate the average of a set of numbers by summing them and dividing by N, you are assuming that they all appear with equal probability, 1/N.
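A short illustration of the difference (mine, reusing the numbers from this thread):

```python
from math import log2

probs = [2/12, 5/12, 5/12]                 # outcome probabilities
surprisals = [log2(1 / p) for p in probs]  # information content of each outcome

print(sum(surprisals) / len(surprisals))              # unweighted mean: ~1.70, not the entropy
print(sum(p * s for p, s in zip(probs, surprisals)))  # probability-weighted mean: ~1.48 bits
```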
Thank you very much, that helped me a lot.
Very good.
Excellent, long live.
RIP.
Sigh ... totally lost ... back to the beginning of the video ...
I sense a major conflict between Shannon's model and the economic creativity theories of Gilder. Reducing redundancy, noise reduction, and lossless transmission run counter to Gilder's idea of creativity and growth. Any responses on this?
Repetition (redundancy) is dual to variation -- music.
Certainty is dual to uncertainty -- the Heisenberg certainty/uncertainty principle.
Syntropy (prediction) is dual to increasing entropy -- the 4th law of thermodynamics.
Randomness (entropy) is dual to order (predictability) -- "Always two there are" -- Yoda.
Teleological physics (syntropy) is dual to non teleological physics.
Duality: two sides of the same coin.
is this in the lecture?