Well, the third line was more of a statement than a question so… only the usual rules of conversation applies,… which I guess still means that it could be a lie if you have an annoying friend.
Hey, I just wanted to add that this is how error correcting codes work! In your example you had 4 bit of data (the initial four questions) and then 3 parity bits (the last three questions). Since the player can only lie once you've shown that it's possible to cover every possible branch with seven yes/no questions (7 total bits). That's called Hamming(7,4), and in fact this code can detect and correct any single bit error! That's awesome, great video!
This effectively means you can simplify the method, right? Your first 4 questions determine the four bits of the hidden number (with one potential error), then you just combine those four questions in three non-complementary pairs (combine = union the sets you're asking about) for the three 'parity bit' questions, right? I feel like that's simpler than having to do new logic for the extra questions. I'm not 100% that turning the original questions into parity-check questions is exactly that unioning procedure but it feels right.
Oh yes, the social annoyance you get when you play 20 questions in the car and ask your friend to provide a numbered list of all people on earth. That’s what people get for being friends with mathematicians. It’s a blessing and a curse…
Modeling problems abstractly is pretty fun but to actually play the game you also need to construct relevant questions quick enough. Typically it’s quite expensive to construct questions that cut the set in half. Usually one of the best ways of doing it is to draw on information you already have. Like, if you figure out the person is an actor you might then try genre or something like that. It is quite expensive to form questions that would halve both the set of actors and the set of athletes at the same time. Well you could ask for gender but people typically start with that and are they a real person and are they alive, and age etc. Given such complications it might be worth it to make sure certain pieces of information are not lies, early. I was thinking that one might try questions that condition on the earlier questions. Like: Q1: Is the person an actor? Yes Q2: Is the person an actor and in fantasy movies? Yes Q3: Is the person an actor and in fantasy movies and in Game of thrones? No Since it can’t be the case that both Q1 and Q2 are lies, Q1 can’t be a lie. Essentially if you manage to get a series of Yes answers you’re safe. That only works on Yes answers. But if you get a No answer you could invert the last question and add it to the conjunction. I’m not sure how to make it optimal exactly but if you’re starting to get an unlikely series of no you might check for the lie, root out where it is and then just play regular 20 questions after that.
I'm not sure if the rules allow it, but what about this: First, do the binary tree for regular 20 questions. Then ask "Have you used your lie yet?" If they answer "Yes" there are 2 possibilities: (a) They used their lie before (b) They have not used their lie before, but lied on this question In either case, you know they used their lie. You can then do a binary search on t, a, b, c, d. Notice that you can't exclude t here since you don't know when the lie occured. If they answer "No", they must be telling tell the truth. That is because if they were lying on this question, it would mean they have used their lie before which means they couldn't lie on this question anymore. In this case, you know the answer is t. The worst-case amount of questions for this is log n + 1 + log log (n+1) (and +1 if you count the final "is it x") Interestingly, this matches your method. The difference is that in yours you essentially double the size of the search tree, but this is only increasing its depth by 1. I can avoid this doubling with my method, but need to introduce one new question, making it the same. Mine has a slight advantage in that it might improve on the average solving time depending on the strategy your opponent uses for the lies. The drawback is of course that you have to reside to such meta questions ^^ PS: I'm pretty sure it's 15 and 26 questions for 1000 and 1000000 categories, or am i wrong?
I was hoping someone would be drawn towards this kind of thinking. I limited the analysis to only using questions that are subsets of the state, simply because it has fairly direct connections to information theory that way. However, that closes the door on some pretty unique and creative solutions such as the one you've found :)
@@axiomath3434 Actually, it does not prevent you from asking that question - asking "have you used your lie yet?" is equivalent to asking "is it not t?"
@@axiomath3434 If we allow this kind of questions I think the best solution would be: Ask: "will you answer 'no' to this question?" If the answer is 'yes' they said the answer will be 'no', but they answered 'yes' so they lied. If the answer is 'no' they said the answer will be 'yes', but they answered 'no' so they lied. They lied either way so now we know the rest of the answers will be true and we can just use binary search to find out the answer.
For each previous question q_n, once you get the corresponding answers a_n you can define a new question p_n to be q_n if a_n was yes and “not q_n” otherwise. Then the question Q: not(p_1 and … and p_n) Is equivalent to “have you lied before” Some meta question can be phrased in terms of regular questions without loss of generality as long as it can depend on the answers.
🤔 What if you ask e.g. "Does it quack XOR will your answer to this question be a lie?" so that regardless of whether they lie, they will emit the answer to "Does it quack?" Probably they'll just stop playing Ulam's game with you, but it might be useful if you're playing against some kind of genie or demon.
6:52 Slight correction, but that is not a binary search tree (BST). It is just a binary tree. A binary tree is just a tree where nodes have at most 2 children. A BST additionally requires that the left subtree contains only nodes lesser than the root, the right subtree contains only nodes greater than the root, and that both subtrees are BSTs. The advantage of a BST is that at every node it can be determined which child to recursively search to find a target node. This property is not present (nor does it appear useful) in the example in the video.
It's still sort of a binary search tree, perhaps in a non-CS context. It's obviously not a data structure either way, but we could just define our ordering function based on what the answers are. In fact, let's do that. Let's assign a key to each entry with k bits, where k is ceil(log(set size)) (base 2 log of course). The first (most significant) bit is set to 1 if the answer to the first question is yes, 0 otherwise. The second bit is set to 1 if the answer to the second question is yes, 0 otherwise. Continue this pattern, and it's pretty clear to see that these keys fulfill the BST property.
To be honest, I hadn't heard of Ulam's game before either until I went searching for a fun video idea. It's been a great time diving deep into it though. And yeah you're absolutely right. You can apply math to anything, even things that don't exist, which is what makes it awesome!
A simpler way of thinking about the optimal strategy for 20Qs is that we want to divide the number of options in half with each question, so when we're trying to narrow down the lie we can ask "did you lie before question Q/2?" where Q=the number of questions asked before the process of narrowing down the lie, then we ask "did you lie before Q/4" or "after Q/4*3" and so on
Yes for sure. I opted to explain that bit the way I did in order to give precedent to the kind of analysis I'd use in the second bit, so viewers were already familiar with that kind of thinking before it went and got more complex. But yes I agree, this is a more natural way to think about it.
I always love reading comments on videos like these that get so hyper focused on the original premise (in this case guessing an animal) rather than understanding the point of the premise is to serve as a jumping point for the mathematics being demonstrated in the video. My favourite ones on this video are a phD dissertation on choice bias, an article about the Linnaeus system, and another asking about a blind person that can't see the list of animals.
According to the paper in the description, the number of moves needed to win if the category size is n is the ceiling of the largest solution to 2^x/(x+1) = n (note: this is for even n, the equation for odd n is very similar). We can solve this equation exactly with the Lambert W function, but I had to look up how to find the asymptotic form. It turns out that x ~ log(n) + log(log(n)) which is pretty intuitive, but the next term is L := log(log(n))/log(n), and after that the next terms are on the order of powers of L.
Most of this makes sense, but there's one thing I'm confused about: 12:20 "If the responder answers yes, for example, we know that the only way a or d can possibly still be the Responder's chosen number is if they lied once in their first 4 questions and then lied again..." The answer is probably standing right in front of me and I'm just slow here, but can you expand a little bit as to why you're so certain of this claim? Regardless, great video; loved the clean visuals!
That's definitely the roughest part of the video. I went through *so* many iterations of the script trying to nail down the best way to say it. If you think back to the section starting at around 10:09, we built a table showing why t, a, b, c, and d are still potential possibilities for the responding player's chosen number. We know that 't' corresponds to the number that *would* be the Responder's number if they told the truth for the 4 questions. This means that if 't' is their number, they still have a lie left that they can use. On the other hand, the only way 'a' could still be the Responder's number is if they lied for the first question we asked. This means that they can't lie again. So when we ask if their number is t, b, or c, they can't answer 'yes' because that would be a second lie. This means that if they *do* answer yes, that their number couldn't possibly be 'a.' This same reasoning can be said about 'd.' I hope this makes some sense. Thanks for taking the time to give me a watch!
@@axiomath3434 Ah, okay! When you put it like that, it makes more sense! Thanks for clearing that up! I'll be looking forward to more videos from this channel!
Here's my take : we're searching for a N-bit binary number (for simplicity we kind of assume N is a power of 2 but this changes little). First ask for each bit individually (this takes N questions). Then ask if the sum mod 2 of the N/2 first bits is the one you computed. If no, there is a lie, and it's within the first half, then continue this way. If yes, there either isn't a lie or it's located in the second half. So ask if the sum mod 2 of the N/4 first bits of the second half is the one you computed. Etc... When you've narrowed it down to a single bit there are two possibilities : either you at some point learned that there was indeed a lie, so you can tell the correct number by flipping that bit, or this was never revealed and you have to ask which makes one more question. Final worst-case complexity is N+log_2(N). I imagine it's optimal. Funnily, if you have to find the object in a set of n objects (N=log(n)), the complexity is log(n)+log(log(n)). Quite cute to see an appearance of log(log(n)) in the wild :)
The math analysis is amazing and enjoyable, kudoos!! But in real life it’s straightforward that I can answers same answer to same questions, so I think a good Ulan game is one that I can lie once, but I can’t lie again on the same questions, that’s what I’d set ulam rules to be, just a thought from a gamer’s viewpoint
Of course, there's some real world factors to consider here. 1. If someone picks a beluga whale and you ask if it's bigger than a microwave oven, they will answer confidently and immediately yes. 2. If someone picks a beluga whale and you ask if it's bigger than a Greenland shark, they will look at their phone or give you the very helpful "I don't know" or "sometimes" or "clarify the question." If they have to do any of these you are definitely extracting more than 1 bit of information from them. And if they don't, you also get more than 1 bit of information. 3. It takes a lot longer to compare entities which are similar in the dimension being compared but don't follow a strict cached ordering when it comes to that quantity. Everyone knows the relative size of an elk and a caribou but what about an alligator and a caribou? Whether they have to think about it / say I don't know / use their phone / etc also tells you whether your reference point is similar in TYPE to the object and not just dimension. Even a small pause to think could be worth extra information. Note also that it's possible for them to lie by taking longer but not shorter to answer a question. So if you're usually answering instantly and they have to think all of the sudden you may be on to something. You know not just the yes or no answer but also whether or not it was ambiguous or uncertain and how long they took to answer. Finally, there is choice bias. Try as you might, you are not gonna pick a random animal. You are either gonna pick an animal that's well known and nonspecific in type, or one which is "weirdly specific" but probably not actually arbitrary from the several hundred to several thousand animals you might be aware of. Biasing questions to the distribution of how people pick things is a very powerful optimization. For much the same reason it's a powerful optimization for guessing passwords.
Following the strategy in the video, the number of questions required is given by ceil(log(n)) + ceil(log( 1+ceil(log(n)) + floor(log( 1+ceil(log(n) )) where n is the total number of possible items and log = log2.
I'll add something here that wasn't included in the video. The answers to the last three questions give the following results: NNN => 6 NNY => 10 NYN => 15 YNN => 13 YYN => 14 YNY => 14 NYY => 14 YYY => 14
Ask a question, ask “was the previous answer a lie?” ,repeat until they say “yes” , then ask on more time to make sure that the “yes” wasn’t a lie, then continue as normal
Animals are fairy well categorized by the Linnaeus system. Much thought would have to be put into what the best question would be based not just on number of animals in each category. In classification there are 7 phyla but cordata includes all animals with backbones. Despite being no where near 1/7th of animal species, I would guess that half or more selected answers will be in thus category. Therefore I would ask "Does it have a backbone?" First.
14:49 Isn't 15 (all yes) impossible for that set of questions (as in the true answer being all yes, if the no on the 3rd question was the lie) since the 4th question (which must be truthful) says it's not 15 (caused by 15 being only 3 layers down in the binary tree rather than 4 so it can't be in the "yes" set of the 4th layer) I don't think it makes any difference(?) assuming it's there for the next step looking at the tree of t, a, b, c, and d for the number of questions, but that's only if the true answer isn't 15 11:01 would all be possible though Cool topic tho I remember this in error correction
This is kinda incorrect though. The optimal strategy is not necessarily the strategy that has the best worst case scenario. It is instead the strategy that, well, wins more often, ie. Guesses the right answer in less than 20 questions more often. Say we have a strategy that has a 51% chance of getting it right in 20 questions, but then they have a 0% chance to get it right in the next 80 questions, and finally the remaining 49% chance to get it right in only 101 questions (ignore how this doesn't make sense). Compare that to a strategy that gets it right 50% of the time in 20 questions and 50% of the time in 21 questions. Even though this strategy's worse case scenario is 80 questions shorter, it's a worse strategy because you only win 50% of the time, instead of the 51% of the time the first strategy gave.
Still, in order to win any realistic sized game (either in 20 questions or the Ulam variant) you need probabilistic knowledge (about the category), even then the result is non-deterministic, as one could rarely choose a very rare choice for an animal or something. Might be a good idea for a video to see how this optimal strategy along optimal probabilistic knowledge (assuming some natural distribution of true answers) would work on either variant. I'm almost curious enough to get going with it by hand but it's almost 2 AM.
There's a bunch of logic puzzles that I could never solve that involves self-references and ANDs and ORs. I wonder if asking those type of questions could make it faster
Your algorithm involves playing a complete game of twenty questions and only then trying to detect the lie. Can you prove that is optimal, rather than some other strategy that incorporates error correction from the beginning? Come to that, are you even claiming it's optimal?
I think I have the ultimate, most meta strategy “When will you lie” Obviously not a great strategy, but it was so horrendously meta that I thought I ought to mention it
One thing I think you should have clarified is the simplification to numbers. Questions which are directly asking "is it X" remain valid but questions like "is it less than 5" don't remain valid in the switch back from numbers to animals. That's ok because the general strategy you have works for any set of answers but it wasn't clear at that point in the video
nice trick. now, to add an extra twist. what if he can lie twice? I believe the rules more or less still apply but will require a few modifications to take into account a second lie.
Also keep in mind with the "asking twice" that just locates the lie. It doesn't tell you what the truth is. If you want the truth you'd need to ask that question thrice.
I would like to point out that this style of 20 questions bears very little resemblance to the actual game. Note that in most cases, there is not a simple, objective number of items that fit in a category. Still, I appreciate this video for the logic. Also: I would consider the best strategy to have the highest mean, not the lowest minimum. At this point, the video would just be 3b1b's video about wordle and everyone would stop watching; but whatever. Isn't the best strategy to brute-force every possible tree and pick the best one?
Youre assuming there is a one-to-one correspondence between a set of integers and any and all arbitrary lists. Perhaps if the list were infinite this may not be true; its not clear. Did you perhaps want to clarify the rules of the game?
Brilliant! I thought about this exact problem, but didn't realize it already had a name, and such an elegant optimal strategy. Fantastically-done video. Hope this gets a zillion more views ;-;
Thanks for the kind words! As for the case with multiple lies, as long as everything else stays constant in the rules, any game with k rules can be solved in a similar way. The trick for building a tree with a certain amount of possible lies is to use a sort of strike system. Every possible choice in your category starts with 0 strikes, and gains a strike when it doesn't fit an answer given by the responding player. You remove an item as a possibility when it gets it's k+1th strike. In order to win the game, you must narrow down the list of possibilities to 1, so in order to keep track of the state of the game, the only information we need to know is the number of strikes each possibility has. Additionally, we can completely determine both the yes and no states given any particular question. From this point on, the optimal strategy can be found in the same way we did in the video. (In fact, the reasoning in the video is nothing but a specific application of this strike tallying method.) I hope this answers your question!
Maybe I'm too sleep deprived or ill to understand but how exactly do I to apply numbers, letters and binary to trying to guess the animal in Ulam's game? Am I wrong to assume that quizzing about seemingly arbitrary numbers to represent the subject matter won't help because that's not in the premise of the game itself? The logic sounds right but the premise feels off. I'm now frazxled.
I'm amazed that even with a lie, you could still win 20 questions for 1000 items. What's the most number of items for which you could still win in 20 questions?
I thought you would bring it up because this may help lower the number of questions (maybe, I'm not that good at math): what if we go all meta and ask about the lie itself?
This essentially changes nothing. Once you have the bits, asking "did you lie? " can be equivalently replaced by "is the sum mod 2 of all digits the one I can compute with your digits?". And "did you lie among bits 1,4,7,8?" can be replaced with "is the sum mod 2 of bits 1,4,7,8 the one I compute?" etc.
Is exactly one of these statements true: "you are using your lie on this question", "the number is bigger than 8"? If they are using their lie on this question and the number is bigger than 8, the true answer is no, but they'll answer yes. If they are using their lie on this question and the number is not bigger than 8, the true answer is yes, but they'll answer no. If they are not using their lie on this question and the number is bigger than 8, the true answer is yes, and they'll answer yes. If they are not using their lie on this question and the number is not bigger than 8, the true answer is no, and they'll answer no. So, no matter if they're lying or not, if the number is bigger than 8, they'll answer 'yes', and if the number is not bigger than 8, they'll answer 'no'. Repeat with every question and you've got yourself the optimal strategy. :)
Is the list of animals a prescribed list that both players can know ahead of time? Does only the one player know the list from which he chose the animal, and the other player is blind? Youve been very vague about the rules of the game.
Is there an easy way to prove you can not do better than logn+loglogn+c? (log base 2) I could come up with a proof assuming you have to specify a list of questions in advance, but the fact that you can base your questions on the answers you have gathered makes this rather difficult. If you have to specify a list of q questions in advance, this gives 2^q possible lists of answers to all the questions. For every item, there are q+1 possible lists of answers the responder could have given. The responder either never lies, or lies for exactly one of the q questions. Thus, as long as 2^q < n(q+1), by the pigeonhole principle, there exists a list of answers for which multiple items could have generated that list. So if 2^q < n(q+1), then there is no list of q questions that you can give in advance to guarantee you will know the chosen item after getting all the answers. Let a = loglogn, so n=2^2^a. Consider q = logn+loglogn = 2^a+a. We have 2^q = 2^2^a * 2^a < n(q+1). This proves you need more than logn+loglogn questions to figure out the chosen item in this variant of the game.
There's a easier solution. Use self referencing to force the correct answer. Ex: Does your answer correctly answer the following question AND . If the responder decides not to lie. Then it devolves into a normal question. If they decide to lie and the real solution is true. They shall reply false. But replying false makes the first statement false. Which makes the entire expression false. That means they actually have to reply true... It's a paradox. The responder have no option but to reply truthfully. Same solution applies for the false case
Let's say you discover an island inhabited by two tribes. One of these tribes always tells the truth, and one always lies. Before you sits a man from one of these tribes, though you don't know which. You can ask him any number of yes or no questions you wish, but you may never ask him a question that you yourself know the answer to. This includes any question that you have enough information to be able to deduce the answer to. How do you learn his tribe?
You should state that differently because semantically your problem doesn't hold. You should change it to Trivial Questions. Like 'Are we on an island?' , because all questions you could ask even the non-trivial ones can have their answers deduced beforehand. But sure the question you ask is, 'Which tribe would the members of the other tribe say you belong to?' The Liar Lies and says that the other tribe would say he's from the Truth Tribe and the Truther would say that the other tribe would say he's a Liar.
answer 'If I asked you "are you from the True Tribe", would you say no?' Suppose he is from the True tribe. He would say yes to the inner question because he is True. Thus he would say no to the outer, also because he is True. Suppose he is from the Liar tribe. He would say yes to the inner question, because he lies. So he should say no to the outer question. However, he's a liar, so he'll say yes instead. Now, because I don't know his tribe, I don't know if he'll say no or yes, so I don't know the answer to his question. So this question correctly identifies the man and is also askable.
There are certainly loads of questions who's answers cannot be deduced beforehand, such as simply asking "do you tell the truth?" We don't have enough information to say one way or another what the actual answer is so we are allowed to ask it. On the other hand, if you were to ask "are we on an island?" The liar would say no and the truther would say yes. This would immediately solve the puzzle, except that the answer to the question is one that we already know the answer to, so we can't ask it. I can't rephrase this idea to not asking "Trivial Questions" because that's not a well defined idea. Interestingly enough, your proposed solution is actually illegal according to the rules. If you ask "would a member of the other tribe say you were a truther," no matter which tribe they are in, the true answer to the question is yes. Your solution hinges on picking out the liar based on deviance from this true answer. Because we can deduce the answer beforehand, your solution is no less illegal than asking if you are on an island.
@@moonlightcocktail To build on this, any question requiring the lie to be doubled gives a truth again since it's a binary question. Anything like "what would you say..." or "what would the members of your own tribe say..." would force the liar to tell the truth again from the lie, meaning both answer truthfully. Then, it's a simple problem of asking an identifying binary question that satisfies the rules.
If you stick within the logical spirit of the problem, no finite sequence of legal questions will enable you to determine his tribe. Suppose you ask a sequence of questions, and before the last answer you don't know what tribe he belongs to, but after the last answer you do. Without loss of generality, when you ask the last question you have deduced that a truth-teller would answer 'yes' and a liar would answer 'no'. But if you know that, you can deduce the true answer to the question is 'yes', which makes the question illegal. It is possible to win within the literal wording of the rules, for instance by asking 'will you answer this question within five seconds'? When you ask the question, you don't know the answer.
It is not correct that you guarantee the correct answer in 4 questions. You need 5. In your example, there is a branch that ends with the numbers 1 and 2, and the question is "is it 2". You only get that correct in 4 guesses if the answer actualy is 2. If it is not, then you have to ask again "is it 1", which is your fifth question. Otherwise you never gave the correct answer.
- Is it a duck?
- Yes.
- So I won.
- No, I lied.
its even more sneaky because you could have lied on the second one...
Well, the third line was more of a statement than a question so… only the usual rules of conversation applies,… which I guess still means that it could be a lie if you have an annoying friend.
Hey, I just wanted to add that this is how error correcting codes work! In your example you had 4 bit of data (the initial four questions) and then 3 parity bits (the last three questions). Since the player can only lie once you've shown that it's possible to cover every possible branch with seven yes/no questions (7 total bits). That's called Hamming(7,4), and in fact this code can detect and correct any single bit error! That's awesome, great video!
Yes! Part of the reason I loved researching this topic was its connections to information theory and error correcting codes
This effectively means you can simplify the method, right? Your first 4 questions determine the four bits of the hidden number (with one potential error), then you just combine those four questions in three non-complementary pairs (combine = union the sets you're asking about) for the three 'parity bit' questions, right? I feel like that's simpler than having to do new logic for the extra questions. I'm not 100% that turning the original questions into parity-check questions is exactly that unioning procedure but it feels right.
Oh yes, the social annoyance you get when you play 20 questions in the car and ask your friend to provide a numbered list of all people on earth.
That’s what people get for being friends with mathematicians. It’s a blessing and a curse…
Modeling problems abstractly is pretty fun but to actually play the game you also need to construct relevant questions quick enough. Typically it’s quite expensive to construct questions that cut the set in half. Usually one of the best ways of doing it is to draw on information you already have. Like, if you figure out the person is an actor you might then try genre or something like that.
It is quite expensive to form questions that would halve both the set of actors and the set of athletes at the same time.
Well you could ask for gender but people typically start with that and are they a real person and are they alive, and age etc.
Given such complications it might be worth it to make sure certain pieces of information are not lies, early.
I was thinking that one might try questions that condition on the earlier questions.
Like:
Q1: Is the person an actor? Yes
Q2: Is the person an actor and in fantasy movies? Yes
Q3: Is the person an actor and in fantasy movies and in Game of thrones? No
Since it can’t be the case that both Q1 and Q2 are lies, Q1 can’t be a lie.
Essentially if you manage to get a series of Yes answers you’re safe. That only works on Yes answers. But if you get a No answer you could invert the last question and add it to the conjunction. I’m not sure how to make it optimal exactly but if you’re starting to get an unlikely series of no you might check for the lie, root out where it is and then just play regular 20 questions after that.
I'm not sure if the rules allow it, but what about this:
First, do the binary tree for regular 20 questions.
Then ask "Have you used your lie yet?"
If they answer "Yes" there are 2 possibilities:
(a) They used their lie before
(b) They have not used their lie before, but lied on this question
In either case, you know they used their lie. You can then do a binary search on t, a, b, c, d. Notice that you can't exclude t here since you don't know when the lie occured.
If they answer "No", they must be telling tell the truth. That is because if they were lying on this question, it would mean they have used their lie before which means they couldn't lie on this question anymore.
In this case, you know the answer is t.
The worst-case amount of questions for this is log n + 1 + log log (n+1) (and +1 if you count the final "is it x")
Interestingly, this matches your method. The difference is that in yours you essentially double the size of the search tree, but this is only increasing its depth by 1.
I can avoid this doubling with my method, but need to introduce one new question, making it the same.
Mine has a slight advantage in that it might improve on the average solving time depending on the strategy your opponent uses for the lies. The drawback is of course that you have to reside to such meta questions ^^
PS: I'm pretty sure it's 15 and 26 questions for 1000 and 1000000 categories, or am i wrong?
I was hoping someone would be drawn towards this kind of thinking. I limited the analysis to only using questions that are subsets of the state, simply because it has fairly direct connections to information theory that way. However, that closes the door on some pretty unique and creative solutions such as the one you've found :)
@@axiomath3434 Actually, it does not prevent you from asking that question - asking "have you used your lie yet?" is equivalent to asking "is it not t?"
@@axiomath3434 If we allow this kind of questions I think the best solution would be:
Ask: "will you answer 'no' to this question?"
If the answer is 'yes' they said the answer will be 'no', but they answered 'yes' so they lied.
If the answer is 'no' they said the answer will be 'yes', but they answered 'no' so they lied.
They lied either way so now we know the rest of the answers will be true and we can just use binary search to find out the answer.
For each previous question q_n, once you get the corresponding answers a_n you can define a new question p_n to be q_n if a_n was yes and “not q_n” otherwise.
Then the question
Q: not(p_1 and … and p_n)
Is equivalent to “have you lied before”
Some meta question can be phrased in terms of regular questions without loss of generality as long as it can depend on the answers.
🤔 What if you ask e.g. "Does it quack XOR will your answer to this question be a lie?" so that regardless of whether they lie, they will emit the answer to "Does it quack?"
Probably they'll just stop playing Ulam's game with you, but it might be useful if you're playing against some kind of genie or demon.
6:52 Slight correction, but that is not a binary search tree (BST). It is just a binary tree. A binary tree is just a tree where nodes have at most 2 children. A BST additionally requires that the left subtree contains only nodes lesser than the root, the right subtree contains only nodes greater than the root, and that both subtrees are BSTs. The advantage of a BST is that at every node it can be determined which child to recursively search to find a target node. This property is not present (nor does it appear useful) in the example in the video.
It's still sort of a binary search tree, perhaps in a non-CS context. It's obviously not a data structure either way, but we could just define our ordering function based on what the answers are.
In fact, let's do that. Let's assign a key to each entry with k bits, where k is ceil(log(set size)) (base 2 log of course). The first (most significant) bit is set to 1 if the answer to the first question is yes, 0 otherwise. The second bit is set to 1 if the answer to the second question is yes, 0 otherwise. Continue this pattern, and it's pretty clear to see that these keys fulfill the BST property.
I hadn’t even heard of Ulam’s game until I watched this, but I was still fascinated the entire time! Everything really is math, huh?
To be honest, I hadn't heard of Ulam's game before either until I went searching for a fun video idea. It's been a great time diving deep into it though. And yeah you're absolutely right. You can apply math to anything, even things that don't exist, which is what makes it awesome!
A simpler way of thinking about the optimal strategy for 20Qs is that we want to divide the number of options in half with each question, so when we're trying to narrow down the lie we can ask "did you lie before question Q/2?" where Q=the number of questions asked before the process of narrowing down the lie, then we ask "did you lie before Q/4" or "after Q/4*3" and so on
Yes for sure. I opted to explain that bit the way I did in order to give precedent to the kind of analysis I'd use in the second bit, so viewers were already familiar with that kind of thinking before it went and got more complex. But yes I agree, this is a more natural way to think about it.
I always love reading comments on videos like these that get so hyper focused on the original premise (in this case guessing an animal) rather than understanding the point of the premise is to serve as a jumping point for the mathematics being demonstrated in the video.
My favourite ones on this video are a phD dissertation on choice bias, an article about the Linnaeus system, and another asking about a blind person that can't see the list of animals.
This is definitely the best comment I could have come across! Happy to know I have company 😁
Really underrated channel bruh! You deserve way more subs
According to the paper in the description, the number of moves needed to win if the category size is n is the ceiling of the largest solution to 2^x/(x+1) = n (note: this is for even n, the equation for odd n is very similar). We can solve this equation exactly with the Lambert W function, but I had to look up how to find the asymptotic form. It turns out that x ~ log(n) + log(log(n)) which is pretty intuitive, but the next term is L := log(log(n))/log(n), and after that the next terms are on the order of powers of L.
The quality and everything is so good. You'll grow to be one of the best out there. Keep it up mate!
Most of this makes sense, but there's one thing I'm confused about:
12:20 "If the responder answers yes, for example, we know that the only way a or d can possibly still be the Responder's chosen number is if they lied once in their first 4 questions and then lied again..."
The answer is probably standing right in front of me and I'm just slow here, but can you expand a little bit as to why you're so certain of this claim?
Regardless, great video; loved the clean visuals!
That's definitely the roughest part of the video. I went through *so* many iterations of the script trying to nail down the best way to say it.
If you think back to the section starting at around 10:09, we built a table showing why t, a, b, c, and d are still potential possibilities for the responding player's chosen number. We know that 't' corresponds to the number that *would* be the Responder's number if they told the truth for the 4 questions. This means that if 't' is their number, they still have a lie left that they can use.
On the other hand, the only way 'a' could still be the Responder's number is if they lied for the first question we asked. This means that they can't lie again. So when we ask if their number is t, b, or c, they can't answer 'yes' because that would be a second lie. This means that if they *do* answer yes, that their number couldn't possibly be 'a.' This same reasoning can be said about 'd.'
I hope this makes some sense. Thanks for taking the time to give me a watch!
@@axiomath3434 Ah, okay! When you put it like that, it makes more sense! Thanks for clearing that up!
I'll be looking forward to more videos from this channel!
Here's my take : we're searching for a N-bit binary number (for simplicity we kind of assume N is a power of 2 but this changes little). First ask for each bit individually (this takes N questions). Then ask if the sum mod 2 of the N/2 first bits is the one you computed. If no, there is a lie, and it's within the first half, then continue this way. If yes, there either isn't a lie or it's located in the second half. So ask if the sum mod 2 of the N/4 first bits of the second half is the one you computed. Etc... When you've narrowed it down to a single bit there are two possibilities : either you at some point learned that there was indeed a lie, so you can tell the correct number by flipping that bit, or this was never revealed and you have to ask which makes one more question.
Final worst-case complexity is N+log_2(N). I imagine it's optimal.
Funnily, if you have to find the object in a set of n objects (N=log(n)), the complexity is log(n)+log(log(n)). Quite cute to see an appearance of log(log(n)) in the wild :)
Nice, I figured out the idea of just "adding the lie to he tree" myself.
Granted, the exact tree design still eluded me.
Great first video! Waiting for the next one!
The math analysis is amazing and enjoyable, kudoos!! But in real life it’s straightforward that I can answers same answer to same questions, so I think a good Ulan game is one that I can lie once, but I can’t lie again on the same questions, that’s what I’d set ulam rules to be, just a thought from a gamer’s viewpoint
Of course, there's some real world factors to consider here.
1. If someone picks a beluga whale and you ask if it's bigger than a microwave oven, they will answer confidently and immediately yes.
2. If someone picks a beluga whale and you ask if it's bigger than a Greenland shark, they will look at their phone or give you the very helpful "I don't know" or "sometimes" or "clarify the question."
If they have to do any of these you are definitely extracting more than 1 bit of information from them. And if they don't, you also get more than 1 bit of information.
3. It takes a lot longer to compare entities which are similar in the dimension being compared but don't follow a strict cached ordering when it comes to that quantity. Everyone knows the relative size of an elk and a caribou but what about an alligator and a caribou? Whether they have to think about it / say I don't know / use their phone / etc also tells you whether your reference point is similar in TYPE to the object and not just dimension. Even a small pause to think could be worth extra information. Note also that it's possible for them to lie by taking longer but not shorter to answer a question. So if you're usually answering instantly and they have to think all of the sudden you may be on to something.
You know not just the yes or no answer but also whether or not it was ambiguous or uncertain and how long they took to answer.
Finally, there is choice bias. Try as you might, you are not gonna pick a random animal. You are either gonna pick an animal that's well known and nonspecific in type, or one which is "weirdly specific" but probably not actually arbitrary from the several hundred to several thousand animals you might be aware of. Biasing questions to the distribution of how people pick things is a very powerful optimization. For much the same reason it's a powerful optimization for guessing passwords.
Following the strategy in the video, the number of questions required is given by
ceil(log(n)) + ceil(log( 1+ceil(log(n)) + floor(log( 1+ceil(log(n) ))
where n is the total number of possible items and log = log2.
I'll add something here that wasn't included in the video. The answers to the last three questions give the following results:
NNN => 6
NNY => 10
NYN => 15
YNN => 13
YYN => 14
YNY => 14
NYY => 14
YYY => 14
Ask a question, ask “was the previous answer a lie?” ,repeat until they say “yes” , then ask on more time to make sure that the “yes” wasn’t a lie, then continue as normal
That is very optimal, indeed, because if The Responder is very smart, he would not lie.
but then, could not the responder simply not lie until the end and reply "no" until the end?
Animals are fairy well categorized by the Linnaeus system. Much thought would have to be put into what the best question would be based not just on number of animals in each category.
In classification there are 7 phyla but cordata includes all animals with backbones. Despite being no where near 1/7th of animal species, I would guess that half or more selected answers will be in thus category.
Therefore I would ask "Does it have a backbone?" First.
at 4:30 i had a thought of asking about eqch bit of the number and some kind of an error correction code
What # of items gives you 20 as the optimal strategy, as that's the name of the game & the max number of questions/guesses you can have?
14:49 Isn't 15 (all yes) impossible for that set of questions (as in the true answer being all yes, if the no on the 3rd question was the lie) since the 4th question (which must be truthful) says it's not 15 (caused by 15 being only 3 layers down in the binary tree rather than 4 so it can't be in the "yes" set of the 4th layer)
I don't think it makes any difference(?) assuming it's there for the next step looking at the tree of t, a, b, c, and d for the number of questions, but that's only if the true answer isn't 15
11:01 would all be possible though
Cool topic tho I remember this in error correction
This is kinda incorrect though. The optimal strategy is not necessarily the strategy that has the best worst case scenario. It is instead the strategy that, well, wins more often, ie. Guesses the right answer in less than 20 questions more often.
Say we have a strategy that has a 51% chance of getting it right in 20 questions, but then they have a 0% chance to get it right in the next 80 questions, and finally the remaining 49% chance to get it right in only 101 questions (ignore how this doesn't make sense).
Compare that to a strategy that gets it right 50% of the time in 20 questions and 50% of the time in 21 questions. Even though this strategy's worse case scenario is 80 questions shorter, it's a worse strategy because you only win 50% of the time, instead of the 51% of the time the first strategy gave.
I was thinking that too.
"Hot Garbage, to borrow from scientific terminology" hahaha yeah thats the best scientific terminology ive heard. well done !
Would a 20 Q strategy be similar to a Guess Who strategy?
Still, in order to win any realistic sized game (either in 20 questions or the Ulam variant) you need probabilistic knowledge (about the category), even then the result is non-deterministic, as one could rarely choose a very rare choice for an animal or something.
Might be a good idea for a video to see how this optimal strategy along optimal probabilistic knowledge (assuming some natural distribution of true answers) would work on either variant.
I'm almost curious enough to get going with it by hand but it's almost 2 AM.
There's a bunch of logic puzzles that I could never solve that involves self-references and ANDs and ORs. I wonder if asking those type of questions could make it faster
Your algorithm involves playing a complete game of twenty questions and only then trying to detect the lie. Can you prove that is optimal, rather than some other strategy that incorporates error correction from the beginning? Come to that, are you even claiming it's optimal?
What a great video!
why is this so underrated
We can ask twice on Question q if they've lied before q
YY-> lied previously
YN-> didn't lie until q
NN-> haven't lied yet
NY-> didn't lie until q+1
I think I have the ultimate, most meta strategy
“When will you lie”
Obviously not a great strategy, but it was so horrendously meta that I thought I ought to mention it
One thing I think you should have clarified is the simplification to numbers. Questions which are directly asking "is it X" remain valid but questions like "is it less than 5" don't remain valid in the switch back from numbers to animals. That's ok because the general strategy you have works for any set of answers but it wasn't clear at that point in the video
nice trick.
now, to add an extra twist.
what if he can lie twice?
I believe the rules more or less still apply but will require a few modifications to take into account a second lie.
Also keep in mind with the "asking twice" that just locates the lie. It doesn't tell you what the truth is. If you want the truth you'd need to ask that question thrice.
I would like to point out that this style of 20 questions bears very little resemblance to the actual game. Note that in most cases, there is not a simple, objective number of items that fit in a category. Still, I appreciate this video for the logic.
Also: I would consider the best strategy to have the highest mean, not the lowest minimum. At this point, the video would just be 3b1b's video about wordle and everyone would stop watching; but whatever.
Isn't the best strategy to brute-force every possible tree and pick the best one?
Amazing video, thankyou
Adjacent questions.
I like how
Youre assuming there is a one-to-one correspondence between a set of integers and any and all arbitrary lists. Perhaps if the list were infinite this may not be true; its not clear. Did you perhaps want to clarify the rules of the game?
Brilliant! I thought about this exact problem, but didn't realize it already had a name, and such an elegant optimal strategy. Fantastically-done video. Hope this gets a zillion more views ;-;
Thanks for the kind words! As for the case with multiple lies, as long as everything else stays constant in the rules, any game with k rules can be solved in a similar way.
The trick for building a tree with a certain amount of possible lies is to use a sort of strike system. Every possible choice in your category starts with 0 strikes, and gains a strike when it doesn't fit an answer given by the responding player. You remove an item as a possibility when it gets it's k+1th strike.
In order to win the game, you must narrow down the list of possibilities to 1, so in order to keep track of the state of the game, the only information we need to know is the number of strikes each possibility has. Additionally, we can completely determine both the yes and no states given any particular question.
From this point on, the optimal strategy can be found in the same way we did in the video. (In fact, the reasoning in the video is nothing but a specific application of this strike tallying method.)
I hope this answers your question!
fantastic video!
Maybe I'm too sleep deprived or ill to understand but how exactly do I to apply numbers, letters and binary to trying to guess the animal in Ulam's game?
Am I wrong to assume that quizzing about seemingly arbitrary numbers to represent the subject matter won't help because that's not in the premise of the game itself?
The logic sounds right but the premise feels off. I'm now frazxled.
whats the maximum number of categories you could have while still being able to win with 20 questions? Really love the video btw
Me who can't even comprehend simple high-school math: 👁👄👁
Also it's me from the disc
What if only one lie can be told, but a lie once told can or even must be repeated?
What if the responder never uses the one lie? Should the rules state one lie must be used?
I'm amazed that even with a lie, you could still win 20 questions for 1000 items. What's the most number of items for which you could still win in 20 questions?
The doubke ask fails if they lie on the second
Is it duck, horse, platypus, crocodile, ... 1582 items later ... , or a cat?
Can you ask "have you lied yet?" in Ulam's game?
I thought you would bring it up because this may help lower the number of questions (maybe, I'm not that good at math): what if we go all meta and ask about the lie itself?
This essentially changes nothing. Once you have the bits, asking "did you lie? " can be equivalently replaced by "is the sum mod 2 of all digits the one I can compute with your digits?". And "did you lie among bits 1,4,7,8?" can be replaced with "is the sum mod 2 of bits 1,4,7,8 the one I compute?" etc.
Is exactly one of these statements true: "you are using your lie on this question", "the number is bigger than 8"?
If they are using their lie on this question and the number is bigger than 8, the true answer is no, but they'll answer yes.
If they are using their lie on this question and the number is not bigger than 8, the true answer is yes, but they'll answer no.
If they are not using their lie on this question and the number is bigger than 8, the true answer is yes, and they'll answer yes.
If they are not using their lie on this question and the number is not bigger than 8, the true answer is no, and they'll answer no.
So, no matter if they're lying or not, if the number is bigger than 8, they'll answer 'yes', and if the number is not bigger than 8, they'll answer 'no'.
Repeat with every question and you've got yourself the optimal strategy. :)
Is the list of animals a prescribed list that both players can know ahead of time? Does only the one player know the list from which he chose the animal, and the other player is blind? Youve been very vague about the rules of the game.
GREAT VIDEO THANKS!
1:52 But once youve identified the lie you dont have to keep asking questions twice.
Is there an easy way to prove you can not do better than logn+loglogn+c? (log base 2)
I could come up with a proof assuming you have to specify a list of questions in advance, but the fact that you can base your questions on the answers you have gathered makes this rather difficult.
If you have to specify a list of q questions in advance, this gives 2^q possible lists of answers to all the questions.
For every item, there are q+1 possible lists of answers the responder could have given. The responder either never lies, or lies for exactly one of the q questions.
Thus, as long as 2^q < n(q+1), by the pigeonhole principle, there exists a list of answers for which multiple items could have generated that list.
So if 2^q < n(q+1), then there is no list of q questions that you can give in advance to guarantee you will know the chosen item after getting all the answers.
Let a = loglogn, so n=2^2^a. Consider q = logn+loglogn = 2^a+a. We have
2^q = 2^2^a * 2^a < n(q+1).
This proves you need more than logn+loglogn questions to figure out the chosen item in this variant of the game.
Hold up how do you only have 20 subscribers?? (21 now btw :))
Slow and steady wins the game :)
Because it's his first video? Just because you start with 0 subs doesn't mean you can't make good videos.
The existence of a 21st subscriber makes it a lot harder (i.e., non-trivial) to win "20 questions" in the category of "AXIOMath subscribers".
There's a easier solution. Use self referencing to force the correct answer. Ex: Does your answer correctly answer the following question AND . If the responder decides not to lie. Then it devolves into a normal question. If they decide to lie and the real solution is true. They shall reply false. But replying false makes the first statement false. Which makes the entire expression false. That means they actually have to reply true... It's a paradox. The responder have no option but to reply truthfully. Same solution applies for the false case
Let's say you discover an island inhabited by two tribes. One of these tribes always tells the truth, and one always lies. Before you sits a man from one of these tribes, though you don't know which. You can ask him any number of yes or no questions you wish, but you may never ask him a question that you yourself know the answer to. This includes any question that you have enough information to be able to deduce the answer to. How do you learn his tribe?
You should state that differently because semantically your problem doesn't hold.
You should change it to Trivial Questions. Like 'Are we on an island?' , because all questions you could ask even the non-trivial ones can have their answers deduced beforehand.
But sure the question you ask is, 'Which tribe would the members of the other tribe say you belong to?' The Liar Lies and says that the other tribe would say he's from the Truth Tribe and the Truther would say that the other tribe would say he's a Liar.
answer
'If I asked you "are you from the True Tribe", would you say no?'
Suppose he is from the True tribe. He would say yes to the inner question because he is True. Thus he would say no to the outer, also because he is True.
Suppose he is from the Liar tribe. He would say yes to the inner question, because he lies. So he should say no to the outer question. However, he's a liar, so he'll say yes instead.
Now, because I don't know his tribe, I don't know if he'll say no or yes, so I don't know the answer to his question. So this question correctly identifies the man and is also askable.
There are certainly loads of questions who's answers cannot be deduced beforehand, such as simply asking "do you tell the truth?" We don't have enough information to say one way or another what the actual answer is so we are allowed to ask it.
On the other hand, if you were to ask "are we on an island?" The liar would say no and the truther would say yes. This would immediately solve the puzzle, except that the answer to the question is one that we already know the answer to, so we can't ask it. I can't rephrase this idea to not asking "Trivial Questions" because that's not a well defined idea.
Interestingly enough, your proposed solution is actually illegal according to the rules. If you ask "would a member of the other tribe say you were a truther," no matter which tribe they are in, the true answer to the question is yes. Your solution hinges on picking out the liar based on deviance from this true answer. Because we can deduce the answer beforehand, your solution is no less illegal than asking if you are on an island.
@@moonlightcocktail To build on this, any question requiring the lie to be doubled gives a truth again since it's a binary question. Anything like "what would you say..." or "what would the members of your own tribe say..." would force the liar to tell the truth again from the lie, meaning both answer truthfully. Then, it's a simple problem of asking an identifying binary question that satisfies the rules.
If you stick within the logical spirit of the problem, no finite sequence of legal questions will enable you to determine his tribe.
Suppose you ask a sequence of questions, and before the last answer you don't know what tribe he belongs to, but after the last answer you do. Without loss of generality, when you ask the last question you have deduced that a truth-teller would answer 'yes' and a liar would answer 'no'. But if you know that, you can deduce the true answer to the question is 'yes', which makes the question illegal.
It is possible to win within the literal wording of the rules, for instance by asking 'will you answer this question within five seconds'? When you ask the question, you don't know the answer.
Okay now do 20 answers.
It is not correct that you guarantee the correct answer in 4 questions. You need 5. In your example, there is a branch that ends with the numbers 1 and 2, and the question is "is it 2". You only get that correct in 4 guesses if the answer actualy is 2. If it is not, then you have to ask again "is it 1", which is your fifth question. Otherwise you never gave the correct answer.