This is great. I majored in mathematics and, still today, dabble around in cryptography just for fun. I really wish these helpful videos were around when I was in school.
Thank you for the awesome video. From your video I learned what it means "vigenere cypher is vulnerable to frequency analysis", which I didn't get from the book Serious Cryptography.
Was just tasked with writing a vigenere crack on Tuesday and due Thursday morning in an unfamiliar programming language. Couldn't really test for accuracy after seeing that the key length was the appropriate size, I trucked on until the end and... IT WORKED! Thanks for the help! :)
damn that's a lotta steps. I am learning cryptography in my free time. I always thought it was interesting, but I never realized how much work it was. The Caesar cipher is way easier. Any hints for saving time, yet still getting an accurate answer?
Computers. Computer programs save time. If you're interested in math and crypto and happen to be into programming, I would recommend also learning to use Maple or Matlab or the like. Good luck with your learning!
So, I do LARP, and last year a guild I'm in found out about a plot which could have serious repercussions for all factions in the game. We tracked the plotters and intercepted a coded message, and the guild launched an investigation. Unfortunately, at one of the events, the guy leading the investigation was killed, and the head of the guild gave me this guy's diary, with his notes on the plot and a copy of the coded message, and asked me to figure it out.
So, I've been scouring the internet trying to figure out how to decrypt this message (for context I don't know anything about cryptography). It's not a Caesar cipher. I'm struggling to guess two-letter words because it seems like there aren't enough vowels in English to make them work. Frequency analysis doesn't seem to yield anything conclusive. Then I heard about polyalphabetic encryption. This was the first video I watched on it, and I've just spent 4 hours working out a key for the message. Finally came up with a two-number key and... *drumroll* it doesn't work when applied to the message.
BUT. I don't feel like it was a waste of time, because I had fun working out the key and trying to solve the puzzle, & even if my guess about the type of cipher was wrong, at least that's Vigenere eliminated from the list. So thanks for helping me out and giving me an evening's entertainment. Great video. :)
Thank you for all the explanations, you really simplified it and it really helped me understand. I'm making a game for my friend and she is going to have to take her time with this
This was super helpful but I’m completely lost on the “frequency” of the letters: A = .10; B = .20; C = .70 but that doesn’t correlate with the frequency in which those letters appear in the example cypher (as A&B appear 4 times and C 3) nor the English alphabet (as each letter only appears once) so where is the “frequency” derived from? Side note: what’s the name of the mathematical proof used when finding the greatest sum of products for that combination of numbers provided? I believe the example was 1, 2, 5 and 1, 2, 5. Thank you!
It is a made up example in this case since we assume a language with only three letters (9:05), but my guess is that it is usually derived through analyzing a large amount of texts in a language and seeing how often each letter appears on average in each of them.
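The guess above, that real frequencies come from counting letters over a large body of text, can be sketched in a few lines of Python. This is only an illustration: the function name `letter_frequencies` and the sample sentence are made up here, and one short sentence is far too little text to give reliable numbers.

```python
from collections import Counter
import string


def letter_frequencies(text):
    """Return the relative frequency of each letter a-z in `text`."""
    # Keep only the 26 basic letters; spaces and punctuation are ignored.
    letters = [ch for ch in text.lower() if ch in string.ascii_lowercase]
    counts = Counter(letters)
    total = len(letters)
    # Letters that never occur get a frequency of 0.
    return {ch: (counts[ch] / total if total else 0.0)
            for ch in string.ascii_lowercase}


# Hypothetical sample; a real table would be built from a very large corpus.
sample = "the quick brown fox jumps over the lazy dog"
freqs = letter_frequencies(sample)
print(round(freqs["o"], 3))
```

The longer the text you feed in, the closer the numbers should settle toward the published tables for that language.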
@ROROROSEEEEE HAPPY BDAY!!!! That is the frequency of usage of letters in the English language. It's constant.... In any examination, they will specify the frequency of each letter in the question itself, no need to memorize it.
It was as clear as mud but it was also very interesting. I think this process could be better explained on an Excel spreadsheet. You should consider a re-do. If you do, I would watch it.
Oh, thank you very much! I had listened to the video on Coursera about this topic more than 5 times and understood nothing until now. Now I get it. Thank you, teacher.
You obviously know what you are doing but you really haven't explained it well enough for my old brain. Perhaps a Kasiski search might be better -- if you had a much longer ciphertext, of course. Thank you for trying so hard to tell us about your way of setting about the search!
Hey awesome video, but if someone doesn’t mind helping I don’t quite understand what it means that the frequencies of a, b, and c are .10, .20, and .70 or how we know that
I don't understand why you only did the shift twice . Like you said 0.60 is the largest but 0.26 is also greater than 0.24 so why not stop there and fill the blank for the first letter of the key as 1?
Those are the "normal" frequencies in the hypothetical language. For the normal frequencies in the English language see: www.math.cornell.edu/~mec/2003-2004/cryptography/subs/frequencies.html
Yeah, she was working with a theoretical example with only three letters. Were she to draw out the frequencies for all 26 letters on a dry erase board, this video would have been painful.
In her pretend language of only three letters you would see the letter C 70% of the time, B 20% and A 10%. She just picked whatever she felt like because it's a pretend language meant to make the example shorter and easier.
@Thuliumify watch 5:52 to 7:57. This is where she explains the principle that she was talking about. If you understand that, you will understand 14:24.
I have this encoded message I'm trying to decode. I've created that table from 5:30 in Excel, but I can't figure out the number. The frequency for coincidences is weird. It goes 0,1,1,0,0,1,1,4,2,1,0,0,0,1,0,1,1,0,0,0,0,0,0. Not sure how to count that.
Hello!😊 Can you teach me how to get the alphabet frequencies (how did you know that a has a frequency of .10 and so on)? I'm stuck with that part. Thank you
You don't have to know; have a chart handy, unless you want to memorize all of them. There is no way of getting an accurate calculation of the frequency without hundreds of thousands of words. So either memorize, or have a chart.
Great explanation. I wanted to know two things. First, what is the math behind finding the key length, i.e. why do we follow that algorithm? Second, there is a slight chance that the key we found may not be correct (a very small chance, as it is all statistical), right? In that case, how do we proceed further?
After getting the keyword size, wouldn't it be more suitable to apply statistical analysis straight to the plaintext? In this case you can generate multiple possible plaintexts and match them to get the most probable one. Afterwards find the keyword. *Also you avoid cases where the keyword is not a word from any language. *I was thinking like this because it's more likely for the plaintext to be in some language than the keyword itself.
This video was really helpful tbf and I understand this a lot better now. One thing I'm struggling to understand is how I can apply this to when we use 26 letters instead of just the A, B and C you used? If my keyword length is 5, that means I would have to do what you did 5 times right? And with 26 letters, that leaves 130 computations and in an exam situation I wouldn't have time for that haha! I've been given part of the plain text though, so can I use this to find my key? Thanks!
+Ethan Conrad Hi Ethan,
1. Yes it takes forever by hand, most people use computer programs. But read point 3! :)
2. "If my keyword length is 5, that means I would have to do what you did 5 times right?" - Exactly.
3. If you already have some of the plaintext AND the key length, it becomes much easier. Say you had a key length of 3 and your plaintext started out "test" and your ciphertext was "UGVUPJ." Then we know that t=U, e=G, and s=V, so we could count and see that t and U are 1 apart, e and G are 2 apart, and s and V are 3 apart in the alphabet. So our key is (1, 2, 3) and we can apply that to the rest of the ciphertext. Now instead say you had (for the same example) only "te" of the plaintext. Remember that you can easily figure out the first 2 key numbers. Then you could calculate the last one as usual and still save lots of time.
I still don't understand one thing. How do you get the frequencies for the letters (in the example a .10 b .20 c .70)? Because as far as I know you would have to collect data and recalculate the frequencies every time till it won't change anymore and that's about an eternity later. Someone please help!
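The known-plaintext shortcut from point 3 can be sketched in Python. This is a rough illustration, not a full attack: the function name `recover_shifts` is made up here, and it assumes letters A-Z only, with the known plaintext starting at the first ciphertext letter.

```python
from typing import List, Optional


def recover_shifts(known_plaintext: str, ciphertext: str,
                   key_length: int) -> List[Optional[int]]:
    """Fill in as many key shifts as the known plaintext allows."""
    shifts: List[Optional[int]] = [None] * key_length
    pairs = zip(known_plaintext.upper(), ciphertext.upper())
    for i, (p, c) in enumerate(pairs):
        # Distance from the plaintext letter to the ciphertext letter.
        shift = (ord(c) - ord(p)) % 26
        slot = i % key_length  # which key position encrypted letter i
        if shifts[slot] is None:
            shifts[slot] = shift
        elif shifts[slot] != shift:
            raise ValueError("plaintext and ciphertext disagree on the key")
    return shifts


print(recover_shifts("tes", "UGV", 3))  # all three key numbers
print(recover_shifts("te", "UG", 3))    # third slot still unknown (None)
```

Any slot that comes back as `None` still has to be found with the usual frequency step.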
Say in this example knowing the key length and hunting for key numbers , you're only using A-B-C to multiply the two frequency tables. If your ciphertext after counting up all the letters uses only 14 while the alphabet is 26 long. Do you simply input 0 for all the missing characters?
+MoralReformXGames If I understand your question correctly, then yes. The reason for counting the letters is to find the frequency of them. If they never occur in your text, then their frequency is 0/X.
Thanks, this was really helpful. What would happen if, instead of a shift, the letters in the key would each map to a permutation of the alphabet? Is it about as easy to break and if so, how would one go about breaking that?
Nice Explanation. But could you make it complete with complete decryption instead of watching other video. Complete the cycle. Additionally, please post the link of first video. Thanks
@Theoretically When you say that the order is important when writing down the frequency, is the first thing I need to write down the frequency of the letters in the (English) alphabet? In the order of the alphabet itself? So A, B, C, D - NOT the order of highest frequency to lowest frequency?
Second question: when I align the ciphertext (in my case every 4th letter) under the alphabet, do I again enter that in alphabetical order, or in frequency order?
Last question for now: do I remove the letters that are not used in the ciphertext altogether, or do I just allow the calculation to be 0? Example:
A B C D E F G H I
A B C D E F G H I
If there are no E's in my ciphertext, do I remove the column altogether (both cipher and alphabet) or just the cipher?
A B C D F G H I
A B C D F G H I
or
A B C D E F G H I
A B C D F G H I
I hope this makes sense, thanks for your help, great demonstration.
Hi, Yes, you write the letters in alphabetical order, not order of frequency, in both cases. Allow the frequency to be 0 rather than removing the letters in the ciphertext. How many E's do you have? Zero, then make the frequency 0/total, which = 0. Basically, you have to make sure you have a full alphabet every time you multiply the actual frequency times the frequency in the ciphertext. Hope that makes sense! Feel free to ask again if that doesn't answer the question.
I would guess you mean shifts... :D Also, I'm not sure, I've just begun learning decryption, but I see you don't have a response, so maybe you just multiply by 0 and you get a zero... After this it's just adding, so it won't make the whole sum zero.
What happens if there are spaces in the ciphered text? Ex. aicbs akova ps sjkhal ...Would I just combine it all and do it like what you did in the video?
Great start. Keep doing this sort of thing. But get a better board, maybe even ask to use a classroom at a local college. At about 15.4 in the video you say to count the A, B, Cs again... we did that in the first round, so is the next round different? If so, how?
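A small Python sketch of that multiply-and-add step, keeping the full alphabet and using 0 for letters that never appear. It uses the three-letter pretend language from the video (A = .10, B = .20, C = .70); the function name `best_shift` and the sample column are made up for illustration.

```python
def best_shift(column, alphabet, expected):
    """Score every shift by summing expected * observed frequencies."""
    n = len(alphabet)
    # Count every letter of the alphabet, in alphabetical order.
    # Letters missing from the column simply get a count of 0.
    counts = [column.count(ch) for ch in alphabet]
    total = sum(counts)
    if total == 0:
        raise ValueError("column contains no letters from the alphabet")
    observed = [c / total for c in counts]
    scores = []
    for shift in range(n):
        # Plaintext letter i shows up as ciphertext letter (i + shift) mod n.
        score = sum(expected[i] * observed[(i + shift) % n] for i in range(n))
        scores.append(score)
    best = max(range(n), key=scores.__getitem__)
    return best, scores


# Hypothetical column with no B at all: its frequency is 0/6 = 0.
shift, scores = best_shift("AAAACC", "ABC", [0.10, 0.20, 0.70])
print(shift, [round(s, 3) for s in scores])
```

Because the zero only wipes out one product in each sum, the other terms still decide which shift scores highest.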
You're an absolute genius, thank you for a fascinating video. Could you tell me if it is possible to find the encryption key for an Android photo file if the owner mistakenly wiped the encryption key from the phone? Yes, it happened to me, and so far nobody has been able to shed some light on this commonly occurring problem with Android phones. By default, from Android 6 upwards all data is encrypted.
What is the best way of analyzing ciphertext of around 40 letters or so? Would this method be useful, or is it too short for the results to be accurate?
Why didn't you use the first encrypted message? Now I am sad Edit: Also , what if the numbers in the frequency are 1 2 3 4 5 6 etc and not 0 1 55 60 2 3 ? What if they don't have a big difference between them ? I don't believe your method works, the positions are not fixed.
It's a proven mathematical concept, not something she came up with. The longer the message is, the more accurate it would be. It's a common theme in cryptography
They were predefined for the alphabet {a,b,c}. Yes, every alphabet (associated with a real language, of course) has its own letter frequencies. For example, check the English letter frequencies. If you want to crack a Vigenere cipher with ciphertext only, you should at least know the language of the plaintext.
So I have one that I do not know what type of cipher it is, but It contains these characters: A,C,G,I,L,S,T,U,V,Y,Q,0,4,8 If someone could help me on how to solve it, I would be very grateful. It is a huge thing over 2000 letters
I understand how the decryption works, but why does it work? Are there any good resources that I can refer to for the explanation behind this attack? I'm guessing step 2 is related to the frequency of each character? But I can't explain why the counting of coincidences gives the key length in step 1.
+Victor Hazali I'm not sure if this directly answers your question, but here is an explanation I gave someone previously. Maybe this will help you understand the "why." (sorry it's quite long)
Why it happens is really quite interesting, but hard to put in words. I'll try with an example: Say the key length was 3, and the key was (2, 9, 5). Then every third letter will be "on the same shift." So in the ciphertext, the 1st, 4th, 7th, 10th, 13th etc letter will be shifted 2 places.
Now we also know that the letters in an alphabet have constant frequencies. For example, in English, "e" is the most common letter. But when we shift it two spaces over (as in the first shift of our key), it becomes "G" in the ciphertext, and hence G will be the most common letter in the 1st, 4th, 7th, 10th, 13th etc position in the ciphertext, not "e." So now we know that G is the most common (for our shift 2 letters), and likewise we could figure out how common all the other letters of shift 2 are in the ciphertext. But when we line up the letters in an offset manner (like we do in practice with all the rows offset from each other), what is the most common overlap of the same letter? Generally when G overlaps on G because it's the most common, right?
Let's take a step back and think about the other shifts as well. If we were to look at the 2nd, 5th, 8th, 11th, etc letter in the plaintext, the "e" would be shifted 9 places to "N", so for the 2nd, 5th, 8th, 11th, etc letter in the ciphertext, "N" would be the most common letter. In the same way, in the 3rd, 6th, 9th, 12th etc letter in the plaintext, the "e" would shift 5 places over to "J", so J would be the most common letter in the 3rd, 6th, 9th, 12th etc place in the ciphertext. Now we have determined that for the first, second, and third shift, the letters G, N, and J, respectively, would be most common (and of course we could find second most common, third most common, etc for each). So how do we use this fact?
Well, any time the letters are lined up in multiples of the key length (multiples of 3), there is the greatest chance of our G's in the top line lining up with G in the offset line (because they are the most common in the 1st, 4th, 7th, 10th, 13th etc position) in the first shift. In contrast, if the top line is, say for example, over a line with an offset of two (not a multiple of 3, the key length), G will be the most common in the 1st, 4th, 7th, 10th, 13th etc position in the top line, but in the offset line, J will be the most common in the 1st, 4th, 7th, 10th, 13th etc position (places still relative to the top line). HENCE, you won't get as many matches, called "coincidences," because the same letter is not the most common in both cases. The number of coincidences will be a little bit lower because there are fewer J's than G's in the top line.
Now remember that while I just talked about how you get the most matches in the first shift, the same idea applies to all other shifts. So, IN SUMMARY: you obtain the most coincidences when the offset is a multiple of the key length, which is what we see in practice.
Sorry, but when you do the multiplication between the alphabet frequency and the frequency in the ciphertext, are you considering only the letters in that part of the ciphertext or all the letters of the alphabet?
Hi, I don't understand how to find the corresponding encryption key. For example, a message M = Mario is Vernam encrypted into ciphertext C = AOAMV. All I know is that the key is 5 letters long.
Thank you so much for your video. It helped me understand more clearly. But I had a question. What if my ciphertext is 500 characters long? Then your method of finding the key length would be much more lengthy. Is there any other way to find the key length?
+Rachna Desai I don't know of any other way of finding key length. Usually people use a computer program to do the work for them. For example you could use Maple as seen in this video: ua-cam.com/video/rnRFze0WTyM/v-deo.html
This is my cipher text here, but trying to find keyword. So to use above method for key, should I only use maximum of 25 enciphered letters for decryption like above?
@@shauryadoger Here to have the best chance to find the key would be to analyse everything, but for it to be shorter you would need a program. This is my problem, I don't know programming! :)
What if I rewrite the ciphertext and check for constants, but I get only one large number, which is 2, and the rest of the numbers are all ones and zeroes? How do I find the key length then?
+MktNinja Because we have to use the 5, 2, and 1 each only twice (and we have to use all three of them). As opposed to in your question, you used the 5 and 2 three times and did not use the 1 at all. Does that answer your question?
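To check that claim, here is a short Python sketch that tries every way of pairing 1, 2, 5 with 1, 2, 5, using each number exactly once per row. The helper name `pairing_sums` is made up for illustration. The general fact behind it is usually called the rearrangement inequality: the sum of products is largest when both lists are sorted the same way.

```python
from itertools import permutations


def pairing_sums(values):
    """Map each reordering of `values` to its sum of products with `values`."""
    return {perm: sum(a * b for a, b in zip(values, perm))
            for perm in permutations(values)}


sums = pairing_sums((1, 2, 5))
for perm, total in sorted(sums.items(), key=lambda item: -item[1]):
    print((1, 2, 5), "x", perm, "=", total)

# The pairing with the largest total.
best = max(sums, key=sums.get)
```

This is why the correct shift stands out: it is the one that lines the big observed frequencies up with the big expected ones.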
Nazmus Salehin I just used the key length from the first message/step and assumed it was the same for the second message (for time's sake). In reality, yes, it does change the shift, so you would have to repeat the process we did with the "VVHQWVV..." message to the "ABAABCC..." message in order to determine the key length.
Nazmus Salehin In the video, 0:37 - 5:41 is about finding the key length by hand. However, there are programs that can be used to do it for you, such as Maple. See: ua-cam.com/video/rnRFze0WTyM/v-deo.html for doing it in Maple. Is there a specific part that you are having trouble with? To summarize doing the process by hand:
1. Write out the ciphertext.
2. Write out the ciphertext again and again, one place over each time.
3. Count the number of coincidences in each row.
4. Count how often "spikes" occur. For example, if the list of coincidences is 23, 15, 60, 17, 5, 45, 12, 27, 75... We can see that the biggest numbers in this list are 60, 45, and 75, which occur every three places (#, #, Big number, #, #, Big number, #, #, big number...). Therefore, we would conclude that this text has a key length of 3.
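The by-hand steps above can be sketched in Python roughly like this. The function name `count_coincidences` is made up for illustration, the text is not wrapped around to the beginning, and the sample ciphertext is artificially periodic so the spike is easy to see.

```python
def count_coincidences(ciphertext, max_offset):
    """For each offset 1..max_offset, count positions where the text
    matches a copy of itself slid over by that many places."""
    # Drop spaces and punctuation so only letters are compared.
    text = [ch for ch in ciphertext.upper() if ch.isalpha()]
    counts = []
    for offset in range(1, max_offset + 1):
        # Pair each letter with the letter `offset` places later (no wrap).
        matches = sum(1 for a, b in zip(text, text[offset:]) if a == b)
        counts.append(matches)
    return counts


# Toy input: a repeating pattern, so matches only occur at offset 3.
print(count_coincidences("ABC ABC ABC", 4))
```

On real ciphertext the counts are noisier, so you look for which offsets spike regularly rather than for a single big number.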
+Theoretically i have been given a challenge cipher in my geometry class, and my teacher said i could use any outside resources. i have been trying to solve this cipher for about 7 hours and I'm wondering if I could send you it and if you could confirm that it is a Vigenere cipher. Thanks!
You seem to know your stuff regarding decryption, so I ask this: would it be possible to find a key in a 752-character hex code where characters are limited to 0 to 9 and A to F? It's part of an alternate reality game that's currently going on called "The Pizza Code Mystery". We have been stuck trying to figure out how to decode this 752-character-long hex code for almost 2 years and we're trying to figure out what method we need to use to crack it.
I've been looking all over to find someone to explain the frequency analysis step. Your video is great and well explained. Keep up the good work.
This was so helpful, used this to help me with my technology assignment and I got an A!
Good Job!
Great 👍👍👍
@@jedidiahwesley8522 and you too
Only found it cuz of ego
@NaiZYAJ , wanna chat for a bit?
Thanks!
Yeah, work out all the complicated steps and make the computer do the iteration. That's why math goes so well with computer science.
make a Python algorithm that’s able to decrypt messages in an instant using the Vigenere Cipher
Multitude of resources .. This is the only one that made me understand :)
+Puneet Kumar I'm so glad you got it!
You are so great at teaching ..
Keep it up..
Amazing way of teaching..
You lost me when you stopped using the primary example, no matter how many times I re-watch this.
+Jesse Harding Exactly...I was very happy with her explanations but when I saw A, B and C I was disappointed
+Valkon katse kala
Even me ,I experience the same pain as you.
Ikr
@@antasmax5480 you can not break this cipher using a text of length
there is a mistake, i am sure someone caught it but on 4:00 on light blue row you have 0 where you should have 1
4:50 wait... that's insane!!! Thank you for your video already. I haven't watched it fully yet, but at this point I'm convinced it's legit.
@@ahmadghaemi2192 Sorry bit late to the party, so that means on a real english cipher, we would use 8.4966% for the letter A ?
@@blijore27 Yeah pretty much
Thank you so much ❣️
I had a very hard time understanding this concept; thanks to you it's done now ✅
@ROROROSEEEEE HAPPY BDAY!!!! I don't know about the general usage of the English language. But, in the example she gave; yes, C is more frequent
Very nicely illustrated! Thank you so much.
What is the math principle behind the first part, the discovering of the length?
Index of Coincidence
im just 14 and not understanding a thing just learning for gravity falls
+PIXELATED- BLOCK In every single Gravity Falls cipher, the key is given to you. Alex isn't a cruel man.
+Dallas Henderson Weirdmageddon part 3, the new one, has a lot of codes and keys that can be decoded.
just?
I'm just ten but I can understand it
LMAO I remember doing that when the show first started airing
michael jordan So glad it helped!!
.10, .20, and .70 is the frequency of those letters in the cipher text without counting every 4?
I have a cypher that the pattern of large numbers is very random. It’s a 50 character message but the coincidences are always low. Never more than 6
Max Donahue same
It's because she found out the length of the key and therefore had to shift it 3 times (the 4th would've been the start of the key again).
at 4:02 why haven't you wrapped the text around to the beginning?
I would like to know that too tbh
Thanks so much for this. I used it to code a program that can decipher any vigenere text without knowing the key! So Cool!
where can i check it out?
@@YamChopp I wrote one years ago here excuse the music ua-cam.com/video/Q3l6-KcULsc/v-deo.html
All understood. Only a single thing: How did you find the value of a, b, c as .10, . 20 & .70. Please elaborate. thanks..
but how 0.1 0.2 and 0.7
only three* letters
Can you please kindly re-explain this part? 14:24
Really good and clever video... Thank you!
@@Fmn-u9k I don't even remember posting that.
Christine Yvonne Mercado me too
hey did you ever find the answer to this I am also confused on this part
You don't have to know them; just keep a chart handy, unless you want to memorize all of them. There is no way of getting an accurate calculation of the frequencies without hundreds of thousands of words. So either memorize them, or have a chart.
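For anyone who would rather build their own chart than look one up, here is a minimal Python sketch of how letter frequencies can be counted from a sample text. It is only an illustration: the function name `letter_frequencies` is made up, and the tiny sample string is invented so that it reproduces the video's pretend language (a = .10, b = .20, c = .70). For a real language you would feed in a very large sample.

```python
from collections import Counter
import string


def letter_frequencies(sample_text, alphabet=string.ascii_lowercase):
    """Return {letter: relative frequency} for every letter of the alphabet.

    Letters that never occur in the sample get a frequency of 0.
    """
    letters = [ch for ch in sample_text.lower() if ch in alphabet]
    if not letters:
        raise ValueError("sample_text contains no letters from the alphabet")
    counts = Counter(letters)
    total = len(letters)
    return {ch: counts[ch] / total for ch in alphabet}


# Invented sample in the three-letter pretend language: 1 a, 2 b's, 7 c's.
freqs = letter_frequencies("cbccacbccc", alphabet="abc")
print(freqs)
```

The longer the sample, the closer the counts get to the language's "normal" frequencies, which is why published charts are built from very large bodies of text.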
this seems like the core complicated method of doing it.
One thing I didn't quite get was how you find the original frequency of the sequence e.g in yours it was a=10 b=20 c=70, how did you get that
How do you find the length of the key if the cipher text is very long?
Great explanation. I wanted to know two things. First, what is the math behind finding the length, i.e. why do we follow that algorithm? Second, there is a slight chance that the key we found is not correct (a very small chance, as it is all statistical), right? In that case, how do we proceed further?
I am a competitive hacker so I followed well thanks it helped a lot!
After getting the keyword size, wouldn't it be more suitable to apply statistical analysis straight to the plaintext?
In this case you can generate multiple possible plaintexts and match them to get the most probable one. Afterwards find the keyword.
*Also, you avoid cases where the keyword is not a word from any language.
*I was thinking like this, because it's more likely for the plaintext to be in some language than the keyword itself.
when you shift the frequency probabilities, does it have to be in order?
This video was really helpful tbf and I understand this a lot better now. One thing I'm struggling to understand is how I can apply this to when we use 26 letters instead of just the A, B and C you used? If my keyword length is 5, that means I would have to do what you did 5 times right? And with 26 letters, that leaves 130 computations and in an exam situation I wouldn't have time for that haha! I've been given part of the plain text though, so can I use this to find my key? Thanks!
+Ethan Conrad Hi Ethan,
1. Yes it takes forever by hand, most people use computer programs. But read point 3! :)
2. "If my keyword length is 5, that means I would have to do what you did 5 times right?" - Exactly
3. If you already have some plaintext AND the key length, it becomes much easier.
Say you had a key length of 3 and your plain text started out "test" and your ciphertext was "UGVJPJ." Then we know that t=U, e=G, and s=V, so we could count and see that t and U are 1 apart, e and G are 2 apart, and s and V are 3 apart in the alphabet. So our key is (1, 2, 3) and we can apply that to the rest of the cipher text.
Now instead say you had (for the same example) only "te" of the plaintext. Remember that you can easily figure out the first 2 key numbers. Then you could calculate the last one as usual and still save lots of time.
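The known-plaintext shortcut described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a 26-letter alphabet and letters only; the function names `recover_shifts` and `decrypt` are made up for the example, and only the first three letters of the example ("tes" and "UGV") are used.

```python
def recover_shifts(known_plaintext, ciphertext, key_length):
    """Work out the key numbers from matching plaintext/ciphertext letters."""
    shifts = []
    for p, c in zip(known_plaintext.lower(), ciphertext.lower()):
        # Distance from the plaintext letter forward to the ciphertext letter.
        shifts.append((ord(c) - ord(p)) % 26)
    return shifts[:key_length]


def decrypt(ciphertext, shifts):
    """Undo the shifts, cycling through the key numbers."""
    out = []
    for i, c in enumerate(ciphertext.lower()):
        shift = shifts[i % len(shifts)]
        out.append(chr((ord(c) - ord("a") - shift) % 26 + ord("a")))
    return "".join(out)


key = recover_shifts("tes", "UGV", key_length=3)
print(key)                  # [1, 2, 3]
print(decrypt("UGV", key))  # tes
```

If you only know part of the key this way, the remaining key numbers still have to come from the frequency step.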
Theoretically Ah i see now! I managed to figure it out so thanks a lot :D
you are a gem
I still don't understand one thing. How do you get the frequencies for the letters (in the example a .10 b .20 c .70)? Because as far as I know you would have to collect data and recalculate the frequencies every time till it wont change anymore and that's about an eternity later. Someone please help!
This was great. I'm going to code this and see if I can make it work.
Did it work?
@NATHALIAAVILAIndiGoddess1147 Heh...gonna need a little more time
@dannuttle9005 so.... did it work ?
Say in this example, knowing the key length and hunting for the key numbers, you're only using A-B-C to multiply the two frequency tables.
Suppose your ciphertext, after counting up all the letters, uses only 14 letters while the alphabet is 26 long.
Do you simply input 0 for all the missing characters?
+MoralReformXGames If I understand your question correctly, then yes. The reason for counting the letters is to find the frequency of them. If they never occur in your text, then their frequency is 0/X.
+Theoretically Thanks a lot xd
Thanks, this was really helpful.
What would happen if, instead of a shift, the letters in the key would each map to a permutation of the alphabet? Is it about as easy to break and if so, how would one go about breaking that?
You can break it by reversing the encryption. It's like you did substitution (another method) and then Vigenere, so now you have to do the reverse operations.
How can I do this with programming? Please help. I can not find any videos on youtube for this.
Fascinating!! Thanks for the video.
I love you, u saved me
I have a kappa cryptogram puzzle I bought at a grocery store could you make a video on how to solve these? Thank you in advance.
All understood. One single thing: why was .50 with the shift of 2 the key? What math principle was that?
So this channel explains ciphers and codes? I'll subscribe if you have more like this
Nice Explanation. But could you make it complete with complete decryption instead of watching other video. Complete the cycle. Additionally, please post the link of first video. Thanks
@Theoretically
When you say that the order is important when writing down the Frequency, the first thing I need to write down is the Frequency of the Letters in the (English) Alphabet? In the order of the Alphabet itself? so A, B, C, D - NOT the order of Highest Frequency to Lowest Frequency?
Second question, when I align the Cipher Text (in my case every 4th letter) under the alphabet, again I enter that in correct alphabetic order? or frequency order?
Last question for now, do i remove the letters that are not used in the cipher text altogether? or do i just allow the calculation to be 0?
Example:
A B C D E F G H I
A B C D E F G H I
If there are no E's in my cipher text do I remove the column altogether (both cipher and alphabet) or just the cipher.
A B C D F G H I
A B C D F G H I
or
A B C D E F G H I
A B C D F G H I
I hope this makes sense, thanks for your help, great demonstration
Hi,
Yes, you write the letters in alphabetical order, not order of frequency in both cases.
Allow the frequency to be 0 rather than removing the letters in the cipher text. How many E's do you have? Zero, then make the frequency 0/total, which = 0.
Basically, you have to make sure you have a full alphabet every time you multiply the actual frequency times the frequency in the cipher text.
Hope that makes sense! Feel free to ask again if that doesn't answer the question.
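Here is a minimal Python sketch of the shift-and-multiply step with a full alphabet, where letters missing from the ciphertext simply count as 0. It uses the video's three-letter pretend language (a = .10, b = .20, c = .70) to stay short. The function names and the sample column are invented for illustration, and "shift" here means the number of places each letter was moved forward.

```python
from collections import Counter

ALPHABET = "abc"
NORMAL = [0.10, 0.20, 0.70]  # "normal" frequencies of a, b, c, in alphabetical order


def column_frequencies(column, alphabet=ALPHABET):
    """Frequencies in alphabetical order; letters that never occur get 0."""
    counts = Counter(column.lower())
    total = sum(counts[ch] for ch in alphabet)
    if total == 0:
        raise ValueError("column contains no letters from the alphabet")
    return [counts[ch] / total for ch in alphabet]


def best_shift(column, normal=NORMAL, alphabet=ALPHABET):
    """Try every shift, multiply the two tables, and keep the largest sum."""
    observed = column_frequencies(column, alphabet)
    n = len(alphabet)
    scores = [
        sum(normal[i] * observed[(i + shift) % n] for i in range(n))
        for shift in range(n)
    ]
    return max(range(n), key=lambda s: scores[s]), scores


# Invented column with 8 a's, no b's and 2 c's.
shift, scores = best_shift("aaaaaaaacc")
print(column_frequencies("aaaaaaaacc"))
print(shift, scores)
```

The zero entry just contributes nothing to each sum; the other products still add up, so the largest total still points at the most likely shift.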
Got the overall idea. It was a good explanation. Still looking for the basic intuition... if possible, can anyone provide a related link?
Thank you very much for the clear explanation!
The math edu link doesn't work btw (well for me at least) so can someone tell me the frequencies for the english abc XD
What if you don't have all letters of the alphabet i.e. a blank column when performing shits?
I would guess you mean shifts... :D
Also, I'm not sure, as I've just begun learning decryption, but I see you don't have a response, so maybe you just multiply by 0 and get a zero... After this it's just adding, so it won't make the whole sum zero.
If I am understanding your question correctly, yes. You are correct, make it zero.
What happens if there are spaces in the ciphered text? Ex. aicbs akova ps sjkhal ...Would I just combine it all do it like what you did it in the video?
Great start. Keep doing this sort of thing. But get a better board, maybe even ask to use a classroom at a local college.
At about 15.4 in the video you say to count the A, B, Cs again... we did that in the first round, so is the next round different? If so, how?
You count every "x" letters, where x is your key length, but start from the next letter along. Then the next. Etc.
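In code, "count every x letters, starting one letter further along each time" amounts to slicing the ciphertext into x groups. A minimal Python sketch, with an invented function name and an invented sample text:

```python
def split_into_columns(ciphertext, key_length):
    """Group the letters that were enciphered with the same key number."""
    letters = [ch for ch in ciphertext if ch.isalpha()]  # drop spaces etc.
    return ["".join(letters[start::key_length]) for start in range(key_length)]


# Invented sample text with a key length of 3.
columns = split_into_columns("ABC ABC AB", 3)
print(columns)  # ['AAA', 'BBB', 'CC']
```

Each group is then counted separately, once per key number.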
Should I be doing one word at a time? Sentence? How should I do this?
Where you get the .10 .20. and .70 from?
Thank you for the video! I just wanted to ask if the shifting and multiplying method has a name? Are there any references for it?
You're an absolute genius, thank you for a fascinating video.
Could you tell me if it is possible to find the encryption key for an Android photo file if the owner mistakenly wiped the encryption key from the phone?
Yes, it happened to me, and so far nobody has been able to shed some light on this commonly occurring problem with Android phones.
By default, from Android 6 upwards, all data is encrypted.
What is the best way of analyzing ciphertext of around 40 letters or so? Would this method be useful, or is it too short for the results to be accurate?
So if I have 200 letters of encrypted text, should I do the process and find the coincidences?
Why didn't you use the first encrypted message? Now I am sad
Edit: Also , what if the numbers in the frequency are 1 2 3 4 5 6 etc and not 0 1 55 60 2 3 ? What if they don't have a big difference between them ? I don't believe your method works, the positions are not fixed.
+Valkon you talk too much
This is not her method. It's proven by complicated concepts of overlapping. It is not guaranteed to decipher it, but it comes pretty damn close.
It's a proven mathematical concept, not something she came up with. The longer the message is, the more accurate it would be. It's a common theme in cryptography
@cosma_one I cried laughing, hahahaha
How can you define the frequencies of a, b, c as 0.1, 0.2, 0.7?
Do all alphabets have their own particular frequencies like this or not, and how can I get them?
They were predefined for the alphabet {a,b,c}. Yes, all alphabets (associated with real languages, of course) have their own letter frequencies. For example, check the English letter frequencies. If you want to crack a Vigenere cipher with the ciphertext only, you should at least know the plaintext language.
So I have one that I do not know what type of cipher it is, but It contains these characters:
A,C,G,I,L,S,T,U,V,Y,Q,0,4,8
If someone could help me on how to solve it, I would be very grateful. It is a huge thing over 2000 letters
sort of got it...but how do i get the averages and the averages of the numbers?
I understand how the decryption works, but why does it work?
Are there any good resources that I can refer to for the explanation behind this attack?
I'm guessing the step 2 is related to frequency of each character? But i can't explain why the counting of coincidences gives the key length for step 1.
+Victor Hazali I'm not sure if this directly answers your question, but here is an explanation I gave someone previously. Maybe this will help you understand the "why." (sorry it's quite long)
Why it happens is really quite interesting, but hard to put in words. I'll try with an example:
Say the key length was 3, and the key was (2, 9, 5). Then every third letter will be "on the same shift." So in the ciphertext, the 1st, 4th, 7th, 10th, 13th etc letter will be shifted 2 places.
Now we also know that the letters in an alphabet have constant frequencies. For example, in English, "e" is the most common letter. But when we shift it two spaces over (as in the first shift of our key), it becomes "G" in the ciphertext, and hence G will be the most common letter in the 1st, 4th, 7th, 10th, 13th etc position in the ciphertext, not "e."
So now we know that G is the most common (for our shift 2 letters), and likewise we could figure out how common all the other letters of shift 2 are in the ciphertext. But when we line up the letters in an offset manner (as we do in practice, with all the rows offset from each other), what is the most common overlap of the same letter? Generally when G overlaps on G because it's the most common, right?
Let's take a step back and think about the other shifts as well. If we were to look at the 2nd, 5th, 8th, 11th, etc letter in the plaintext, the "e" would be shifted 9 places to "N", so for the 2nd, 5th, 8th, 11th, etc letter in the ciphertext, "N" would be the most common letter. In the same way, in the 3rd, 6th, 9th, 12th etc letter in the plaintext, the "e" would shift 5 places over to "J", so J would be the most common letter in the 3rd, 6th, 9th, 12th etc place in the ciphertext.
Now we have determined that for the first, second, and third shift, the letters G, N, and J, respectively would be most common (and of course we could find second most common, third most common, etc for each). So how do we use this fact?
Well, any time the letters are lined up in multiples of the key (multiples of 3), there is the greatest chance of our G's in the top line lining up with G in the offset line (because they are the most common in the 1st, 4th, 7th, 10th, 13th etc position) in the first shift. In contrast, if the top line is, say for example, over a line with an offset of two (not a multiple of 3, the key), G will be the most common in the 1st, 4th, 7th, 10th, 13th etc position in the top line, but in the offset line, J will be the most common in the 1st, 4th, 7th, 10th, 13th etc position (places still relative to the top line). HENCE, you won't get as many matches, called "coincidences," because the same letter is not the most common in both cases. The number of coincidences will be a little bit lower because there are fewer J's than G's in the top line.
Now remember that while I just talked about how you get the most matches in the first shift, the same idea applies to all other shifts. So, IN SUMMARY: you obtain the most coincidences when the shift is at multiples of the key, which is what we see in practice.
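The same argument can be put into numbers. Below is a small Python sketch that works out the chance of two lined-up ciphertext letters matching, using the video's three-letter pretend language (.10, .20, .70) instead of English to keep it short. It assumes the two plaintext letters are independent of each other, which is a simplification, and the function name `match_probability` is made up.

```python
NORMAL = [0.10, 0.20, 0.70]  # pretend-language frequencies of a, b, c


def match_probability(freqs, shift_difference):
    """Chance that two letters match when their key numbers differ by shift_difference."""
    n = len(freqs)
    return sum(freqs[i] * freqs[(i + shift_difference) % n] for i in range(n))


for d in range(3):
    print(d, round(match_probability(NORMAL, d), 2))
# 0 0.54
# 1 0.23
# 2 0.23
```

A difference of 0 is what you get when the offset is a multiple of the key length, so those rows should show roughly twice as many coincidences in this pretend language. With 26 letters the gap is smaller, which is why longer texts give clearer spikes.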
+Theoretically Thanks so much for the explanation!
It's very clear with the example, and I think I understand why it works now.
how do you know the frequencies for the alphabet before hand? is it a universal truth?
Sorry, but when you do the multiplication between the alphabetical frequencies and those in the ciphertext, are you considering only the letters in that part of the ciphertext, or all the letters of the alphabet?
What if I have a ciphertext such as AGVAR, and the length of the key is given as the same as the number of letters in the ciphertext?
Do you have a key? If so, see ua-cam.com/video/oHcJ4QLiiP8/v-deo.html
If not, that ciphertext is really too short to do this type of analysis on.
Can you show how to solve a ADFGX cipher with and without the key?
Hi, I am not understanding how to find a corresponding encryption key. For example, a message M = Mario is Vernam encrypted into ciphertext C = AOAMV. All I know is that the key is 5 letters long.
The first number of the key was 2. What is its corresponding letter, C?
Thank you so much for your video. It helped me understand more clearly. But I had a question. What if my ciphertext is 500 characters long? Then your method of finding the key length would be much more lengthy. Is there any other way to find the key length?
+Rachna Desai I don't know of any other way of finding key length. Usually people use a computer program to do the work for them. For example you could use Maple as seen in this video: ua-cam.com/video/rnRFze0WTyM/v-deo.html
You need to use Kasiski's Method to find the key length. It's much easier and works for any length of the cipher text.
So from this video, does it mean that the length of the key cannot be greater than 25, as 25 is the maximum length of any key in a long ciphertext?
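For reference, the core of Kasiski's method is to look for repeated letter groups in the ciphertext and measure how far apart they are; the key length should divide those distances. A rough Python sketch follows, with invented function names and an invented sample text. Taking the plain greatest common divisor of all distances is a simplification: on real texts accidental repeats can drag it down to 1, so people usually tally which factors show up most often instead.

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def kasiski_distances(ciphertext, seq_len=3):
    """Distances between consecutive occurrences of each repeated letter group."""
    text = "".join(ch for ch in ciphertext.upper() if ch.isalpha())
    positions = defaultdict(list)
    for i in range(len(text) - seq_len + 1):
        positions[text[i:i + seq_len]].append(i)
    distances = []
    for where in positions.values():
        distances.extend(b - a for a, b in zip(where, where[1:]))
    return distances


def guess_key_length(ciphertext, seq_len=3):
    """Simplified guess: the GCD of all distances, or None if nothing repeats."""
    distances = kasiski_distances(ciphertext, seq_len)
    return reduce(gcd, distances) if distances else None


# Invented text in which "ABC" repeats every 6 letters.
print(kasiski_distances("ABCDEFABCXYZABC"))  # [6, 6]
print(guess_key_length("ABCDEFABCXYZABC"))   # 6
```

A result of 6 here only says the key length divides 6, so 2, 3 and 6 would all be candidates to test.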
ivroq likgs pbxmi ojgrl ihebh oevqu lbgkv mpfhf utrkm
sxpww smiyi xrvch anxfd bvgbm iqtlg mpjzx airhn ylbhv
vybnw attvg miypx bttzj tbphv gopiv lcewq kxety rretv
axgnl ttrdp ggamr iphot xkmsp tbetv isjcs ixzgi zdarl
ssstc hbtvd sktsy lmfra xgaxd jvoti fxlyl fhwfm pjskt
hvfmu earwc amwto djrgc sfoto xtjqi qwsph vimqx caliw
ipdrv ynngr ahgam deotw aizfg qxqne mkjbr haxdj vxrvv
xdjhh tmjhz iwpwc emmwx epbga mriph otxml glbdy xbjzf
rhbkg zwbsp lmpjg lctrw mwezn rhkqs kqwsn fmwmz pbpbd
nptpf vgbws ajqrt kdgix qctby iochu tbrmd whoxl jxbrh
rwenx epggt bnwqx qnetd eakoa vmizb ggvhv tjcgs dnmsg
vpbne gxmp
This is my cipher text here, but trying to find keyword. So to use above method for key, should I only use maximum of 25 enciphered letters for decryption like above?
@shauryadoger The best chance of finding the key would be to analyse everything, but to make that quicker you would need a program. This is my problem, I don't know programming! :)
Can this method work to find a Gronsfeld key?
How did you reveal the whole of the letters in the description part?
What is the general rule ?
Isn't row 7 supposed to have 1 coincidence? Just to make sure I'm understanding.
What if I rewrite the ciphertext and check for coincidences, but I get only one large number, which is 2, and the rest of the numbers are all ones and zeroes? How do I find the key length then?
Is your ciphertext too short?
This was great. Thank you for doing this.
Do you have the method/technique name used in your video? Awesome explanation :)
I may not have heard you, but for the largest #, why is not (5x5)+(2x5)+(2x2)=39 an option?
+MktNinja Because we have to use the 5, 2, and 1 each only twice (and we have to use all three of them). In your question, by contrast, you used the 5 and 2 three times and did not use the 1 at all. Does that answer your question?
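To see this concretely, here is a small Python check of every allowed pairing, assuming the numbers being paired are 5, 2 and 1 in each row as in the reply above. Each number is used exactly once per row, so 39 never appears, and the largest total comes from lining the rows up in the same order.

```python
from itertools import permutations

top_row = [5, 2, 1]

# Pair the top row with every possible ordering of the same three numbers.
totals = {
    bottom_row: sum(a * b for a, b in zip(top_row, bottom_row))
    for bottom_row in permutations(top_row)
}

for bottom_row, total in totals.items():
    print(bottom_row, total)

best = max(totals, key=totals.get)
print(best, totals[best])  # (5, 2, 1) 30
```

This is the reason the matching alignment gives the largest sum: big numbers multiplied by big numbers beat any mixed pairing.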
Hi there do you have a series of affine cipher decryption with no keys? Thanks a lot :)
How have you determined the key length in your later example of all A,B and C's ? Because depending on the key length the shift would be different.
Nazmus Salehin I just used the key length from the first message/step and assumed it was the same for the second message (for time's sake). In reality, yes, it does change the shift, so you would have to repeat the process we did with the "VVHQWVV..." message to the "ABAABCC..." message in order to determine the key length.
Can you please show us an effective way of finding the key length in Vigenere ciphers (or maybe the index of coincidence method)? I am at a loss here.
Nazmus Salehin In the video, 0:37 - 5:41 is about finding the key length by hand. However, there are programs that can be used to do it for you, such as Maple. See: ua-cam.com/video/rnRFze0WTyM/v-deo.html for doing it in Maple.
Is there a specific part that you are having trouble with?
To summarize doing the process by hand:
1. Write out ciphertext
2. Write out ciphertext again and again one place over each time
3. Count the number of coincidences in each row.
4. Count how often "spikes" occur.
For example, if the list of coincidences is 23, 15, 60, 17, 5, 45, 12, 27, 75... We can see that the biggest numbers in this list are 60, 45, and 75, which occur every three places (#, #, Big number, #, #, Big number, #, #, big number...). Therefore, we would conclude that this text has a key length of 3.
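The four steps above can be sketched in Python as follows. This is only an illustration: the function name is invented, the sample text is made up, and the sketch does not wrap the text around to the beginning, so later offsets compare fewer letters.

```python
def coincidences_by_offset(ciphertext, max_offset):
    """Count matching letters between the text and itself moved over by 1..max_offset."""
    text = [ch for ch in ciphertext.upper() if ch.isalpha()]
    counts = []
    for offset in range(1, max_offset + 1):
        # zip stops at the shorter sequence, i.e. where the two rows stop overlapping.
        matches = sum(1 for a, b in zip(text, text[offset:]) if a == b)
        counts.append(matches)
    return counts


# Invented sample that repeats every 3 letters.
counts = coincidences_by_offset("ABCABCABC", max_offset=6)
print(counts)  # [0, 0, 6, 0, 0, 3]
```

The spikes land on offsets 3 and 6, so the gap between spikes suggests a key length of 3. On real ciphertext the other offsets will not be zero, just noticeably lower.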
Thanks... Another question: why is that? I mean, why look for big numbers, and why is the key length the gap between consecutive big numbers?
Nazmus Salehin
Why it happens is really quite interesting, but hard to put in words. I'll try with an example:
Say the key length was 3, and the key was (2, 9, 5). Then every third letter will be "on the same shift." So in the ciphertext, the 1st, 4th, 7th, 10th, 13th etc letter will be shifted 2 places.
Now we also know that the letters in an alphabet have constant frequencies. For example, in English, "e" is the most common letter. But when we shift it two spaces over (as in the first shift of our key), it becomes "G" in the ciphertext, and hence G will be the most common letter in the 1st, 4th, 7th, 10th, 13th etc position in the ciphertext, not "e."
So now we know that G is the most common (for our shift 2 letters), and likewise we could figure out how common all the other letters of shift 2 are in the ciphertext. But when we line up the letters in an offset manner (as we do in practice, with all the rows offset from each other), what is the most common overlap of the same letter? Generally when G overlaps on G because it's the most common, right?
Let's take a step back and think about the other shifts as well. If we were to look at the 2nd, 5th, 8th, 11th, etc letter in the plaintext, the "e" would be shifted 9 places to "N", so for the 2nd, 5th, 8th, 11th, etc letter in the ciphertext, "N" would be the most common letter. In the same way, in the 3rd, 6th, 9th, 12th etc letter in the plaintext, the "e" would shift 5 places over to "J", so J would be the most common letter in the 3rd, 6th, 9th, 12th etc place in the ciphertext.
Now we have determined that for the first, second, and third shift, the letters G, N, and J, respectively would be most common (and of course we could find second most common, third most common, etc for each). So how do we use this fact?
Well, any time the letters are lined up in multiples of the key (multiples of 3), there is the greatest chance of our G's in the top line lining up with G in the offset line (because they are the most common in the 1st, 4th, 7th, 10th, 13th etc position) in the first shift. In contrast, if the top line is, say for example, over a line with an offset of two (not a multiple of 3, the key), G will be the most common in the 1st, 4th, 7th, 10th, 13th etc position in the top line, but in the offset line, J will be the most common in the 1st, 4th, 7th, 10th, 13th etc position (places still relative to the top line). HENCE, you won't get as many matches, called "coincidences," because the same letter is not the most common in both cases. The number of coincidences will be a little bit lower because there are fewer J's than G's in the top line.
Now remember that while I just talked about how you get the most matches in the first shift, the same idea applies to all other shifts. So, IN SUMMARY: you obtain the most coincidences when the shift is at multiples of the key, which is what we see in practice.
+Theoretically I have been given a challenge cipher in my geometry class, and my teacher said I could use any outside resources. I have been trying to solve this cipher for about 7 hours, and I'm wondering if I could send it to you and if you could confirm that it is a Vigenere cipher. Thanks!
I was given the letters KCO and was told to use the Vigenere Cipher to decode it, and I seem to be stuck, how could I decode this?
ua-cam.com/video/oHcJ4QLiiP8/v-deo.html
Great video! thank you so much!
this was very helpful..
You are AWESOME!
Helped me very much, thank you
anyone here in 2020? like u know.... during quarantine?
me in 2021
@danielthompson2299 me in 2024
what if the calculated key is not working? how to change the key then?
What if your cipher text is much longer?
love your vids! keep it up ur amazing!
What type of this code???
Very good videos, thank you!
You seem to know your stuff regarding decryption, so I ask this. Would it be possible to find a key in a 752-character hex code where characters are limited to 0 to 9 and A to F? It's part of an alternate reality game that's currently going on called "The Pizza Code Mystery". We have been stuck trying to figure out how to decode this 752-character-long hex code for almost 2 years, and we're trying to figure out what method we need to use to crack it.
How do you get the 125 number. I don't understand that part.
+Ben Andrei D. Prudentino She "counted" 125 B's in the full cipher text. Remember she's only showing a small part of the cipher text in the example.