It's easier to just generate a bunch of keys' N values, determine the product, and do a GCF with the target N. Use the same certificate tool to get the best results.
@@Avorthoren It actually is special to weak primes: if the two factors are not weak primes, the best factoring algorithm for a number with two prime factors would take exponentially more time than the primality test...
No computer generates random numbers. A number that is generated with a mathematical function can never truly be random. We call random numbers Pseudo Random. Generating random numbers for encryption is a lot more complex than just choosing to numbers between A and B.
@Rake1087 you can't generate true random numbers from something deterministic like a turing machine. But on a real computer it's quite doable. Shift noisy ADC numbers or use quantum effects if your structure is low nm. Some servers are used so damn frequently that determinism isn't really a problem. The lifespan of the numbers is just too short and usage to random to keep track and brute force.
While not directly related to this RSA "shortcoming", this video did remind me of another (arguably more fundamental) concern regarding RSA. It is widely known that the difficulty of factorizing the product of two large and random prime numbers, is that makes RSA work. However, when just 1 of those primes isn't truly random, maybe a fixed value or in any way deterministic or part of some limited/predictable pool/rotation of seemingly random possibilities, it becomes rather trivial to break the cipher (to anyone with knowledge about what makes this prime less random). Nothing new there. This "trick" has been used in CTF challenges. However, what has surprised me on several occasions, is that not many people appear to realize that it is this same factorization problem that makes it equally hard/impossible to detect any "weakened" key from the outside. In other words: it is nearly impossible to determine if a RSA initiated encrypted transmission is in fact secure, without access to the private key. From a set of private keys, a pattern in primes might still be spotted (although maybe not trivial). But from just public keys, this is certainly nearly impossible. In a way, you could say that RSA is just as good at hiding compromised security, as it is at providing secure cryptography. One may ponder on whether that is a desirable property of good cryptography. One may also wonder whether that might have been just a side effect, or who knows maybe intentional. The few times I have seen this being brought up, it was instantly discarded as irrelevant. Often in a surprisingly dismissive way. Usually with arguments along the way of: "if you don't trust who generated the keys, all bets are off anyways" or "organizations who take their security seriously, will do such a thing because of X Y Z". However, I can think of at least several scenarios in which a company may like to pretend to use very strong cryptography, while at the same time (secretively) provide a convenient way (e.g. to a 3rd party with "special interests") to eavesdrop on those "secure" transmission without much hassle. While accusations of any such a thing actually happening remain conspiracy theory, it is the nature of this factorization difficulty in RSA that also makes that it would be almost impossible to prove such a thing taking place (unless some closely guarded secret leaks out). Considering the history of the NSA and RSA (both individually and in relationship with each other), it feels to me like this RSA concern deserves more attention than it generally appears to get. Similar concerns (although for totally different technical reason) might be raised regarding the NSA's EC cryptography. But that's just my 2 cents.
Conceptually, that's valid, but there's a lot of overhead. In the wild three letter agencies generally get the private keys from the originator. Heck, if your pockets are deep enough, some companies will sell them after signing a lengthy NDA.
@@ClifBratcher No doubt true, to all you said. But all of that will leave evidence of collusion (which might be legal requirements in 1 jurisdiction, yet at the same time illegal according to another). It is exactly the plausible deniability aspect of this RSA concern, which might make it particularly useful to some parties (which might not be the typical 3 letter agencies; there are other scenarios).
What do you mean when you say, “it is this same factorization problem that makes it equally hard/impossible to detect any ‘weakened’ key from the outside”? I assume that you mean a weakened key to be one comprised of a prime that isn’t truly random or not chosen properly. Perhaps the best way to see if an RSA encryption is weak is to assume the key isn’t random and try various algorithms that take advantage of different mistakes that can occur while creating the key, such as the ones used in the CFT challenges you mentioned. Assuming you have a wide variety of algorithms that check for mistakes, wouldn’t it be relatively easy to check if a key is weak i.e. by factoring n? Maybe you can’t prove with 100% certainty a key isn’t weak, because that might require a potentially infinite amount of algorithms, but after so many attempts wouldn’t it be safe to assume a key is strong with a high probability?
One example, by the way - You have a closed source messaging application that promises to be end-to-end encrypted (*cough cough* Whatsapp). The company wants to read its users' data, or let the government read the data. There are of course loads of methods with a closed source application, but a backdoor like this one (picking intentionally weak keys, or better yet picking strong keys with a specific predictable algorithm) would be comparatively hard to detect. In other words - you might trust the person you're talking to, but that doesn't mean you can trust the software they run. Admittedly this is all pretty niche and unlikely though
Back in school, we did some RSA by hand, generally using primes up to 13. having a p, q = 7, 13 is a bit lower than generally secure, but it illustrates the point nicely.
7:09 an interesting point here, you need to assume that a & b are integers. since b=(p-q)/2 a=(p+q)/2 they might be a fraction like if p=5 & q=2. fortunately "2" is the only even prime. so we are dealing here with two odd primes, so always (p-q) & (p+q) are even.
One more special method, is the One Line Factoring Algorithm proposed by William B. Hart (called HOLF), which can factorize numbers of the form p*(a*p+b), where p is prime and a and b are small. For example, numbers of the form: n = p*next_prime(p*k), where k is small (approx. k
Would love more videos on the challenges of generating good primes. As said its more complex that math.random, and this video shows that bad prime choice can destroy what is essentially an amazingly good encryption algorithm. So would love to know more!!! Its that nice mix of maths and computer science (almost like a hybrid of numberphile and computerphile!) that I think is really fascinating and resonantes with both audiences. Dr. Mike is really great at explaining even really complex topics, so would be great for more!!!
A modified Maurer's algorithm that uses the Miller-Rabin probable primality test and a cryptographically secure PRNG modified to randomly select a number within a given range with uniform probability. You use Maurer's algorithm to build up a prime to the appropriate size from other, smaller, randomly found prime numbers.
I suppose I should mention that the reason you do it this way, instead of just picking some random number uniformly in the appropriate range, checking its primality using Miller-Rabin, discarding it if its composite and repeating until you have two large prime numbers P and Q, is because using Miller-Rabin on numbers of that size becomes infeasible.
Some real systems just use hardware to generate true random numbers. You could generate switching noise by, well run your code, and measure anything (mostly core temp) with way too much precision. Shift the noisy bits and check for prime or initialize a random generator. Some servers simply use quantum effects of properly poorly designed transistors. Or you pay one employee enough to stay loyal and let them pick a number by hand that is used by a generator. An interesting field for a video.
13:35 pick random p, find the nearest prime pick q such that q > [p + 2^n/2] (or you can choose whatever difference is large enough for you), and find the nearest prime above that The important part is if the q you pick doesn't match, you have to generate a completely brand new number each time.
2 роки тому+1
I watch each of these lectures. To learn but also just to relax.
Not sure if anyone has mentioned this, but I love how you've repurposed continuous feed (dot matrix) paper! Also, your writing is so clean and legible, which makes it perfect for this application!
A side note - any modulo with a prime base creates a group under multiplication, which is why you mentioned finding the inverse of phi_n - I think this is an awesome example of pure maths such as group theory actually being used in action! :)
Ah yes, been reading Hanno's blog for years. His "feisty duck" newsletter is also always a good read. How very unsurprising to see Hanno's excellent work make an appearance. :)
Interesting that you only have in the private key, all the reference implementations I've found (and the one ChatGPT spat out) all have a two-part private key, made of (, ) and also using both…
Just another bullet point in the long list of reasons why you should never ever implement cryptography algorithms yourself unless you really, really know what you are doing. Unfortunately many people think that because the formulas behind cryptography algorithms are easy to find, implementing them is also easy. There is a reason why you have specialists for cryptography.
Interesting. I'm surprised that it finds the factors so quickly, even with so many random bits difference. You're effectively searching for the arithmetic mean ((p+q)/2) of the two factors, given the geometric mean (sqrt(pq)). I guess the 'trick' is that for numbers of similar scale, these two values are very close.
It's not surprising at all if the first half of the bits are equal. In fact, that would mean it takes about 2 iterations!!! Let's say a is 333500 (6 digits). Then p and q are numbers of the form 333???, which means b is at max 499 (3 digits) and b² has at max 6 digits (probably 5, but that would only make it faster). Now, when you go through the algorithm, the first candidate a² will have 12 digits (or 11, but lets stick with 12). But (a+1)² will already be about 666000 larger than a²: a 6-digit difference. The square root of a 6-digit number is a 3-digit number - which is already on the order of the maximum value for b. When you try a+2, the value you'd get for b² would be 1334004 (7 digits) - which can't be possible, because we know that b² is a 6-digit number.
Maybe you can use a factor K for N: X^2 + K*N = Y^2 and using K as an iterator. For example: 313 * 113 = 35369 = N, if you pick K as equal 3 you get 3*35369 = K*N = 106107 so take the square root of ceil(3 * 35369) = 326; 326^2 - 3 * 35369 = 169 = 13^2 326 - 13 = 313 = P 326 + 13 = 339 = Q * 3 For example: 17 * 11 = 187, K = 1 -> ceil(sqrt(K*N)) = 14 and 14^2 - 187*1 = 3*3 -> 14 + 3 = 17, 14 - 3 = 11
This is pretty neat method to reduce search space , I will have to give it a run. I would use this against the issuing CA, not the server key. Then re-issue your own certificates. Sign your own firmware images. Some firmware signing techniques will use an intermediate certificate and leaf that are both generated at the time of signing and each time a new image is signed. It would be interesting to analyse the p q generation in such a circumstance.
Hum, so this is a glimpse of what Professor Edward Frenkel was talking about the relations between P and Q in the NSA video... it is frightening to think what math techniques and tools may be under the radar that we will never know about...
2:03 - t's not even necessary to work out or steal the private key, to impersonate a legitimate website or server. Most end users don't even understand what a certificate is, and therefore, when their browser presents a trust warning, they simply click the "Accept" button to continue to the site and hand over their credit card details.
This is exactly the reason why one must never implement cryptographic algorithms as they are explained in textbooks. They lack all these real world tweaks which are necessary to prevent very specialized attacks for specific cases. There even have been smartphone apps in the past (sorry for not remembering the names - they were short lived), where cryptographers essentially criticized exactly that: It was a textbook implementation which does not hold up against real world scenarios, resulting from specific hardware, bad random number generators, limited memory, similar user mindsets, etc...
7:30 If we maintained a list of squares where S]x] = x^2, then ceil(sqrt(N)) = bisect.bisect_left(S, N) which can be computed in O(lgN). I'm guessing the issue is that N is so large that maintaining a sufficiently long list S would be impractical.
There used to be a big problem with some hardware servers with too low entropy to generate p and q at boot. Often 2 different servers where using the same p or the same q. So there's N1=p . q1 And N2=p . q2 Taking the biggest common prime factor of 2 numbers is fast so do that on N1 and N2 to get p then you get q1 and q2 with simple division. 2 servers hacked together...
Yes, for those interested there is more details about this kind of exploit in a fairly famous paper titled Ron was wrong, Whit is right (can easily find a free pdf googling)
Common factor attack: Given two private keys n1 = p * q1, n2 = p * q2, using the same p, then p = gcd(n1, n2), which is easy to compute with Euclid's algorithm, then q = n / p. This actually happened for a large number of public keys in the wild, that had been generated using a faulty prime number generator. This finding was revealed in a paper from February 2012.
When I started following Numberphile I couldn't understand what was all the fuss about with the primes. Then I started following computerphile and it slowly started to make sense.
Can someone give a proper intuitive explanation for this? I just tried it myself, randomising the last 512 bits and it found the result in the first step of ceil(sqrt(n)). How can it pinpoint to the solution so easily.
I wonder if this problem generally occurs if the attacker, for whatever reason, has a good estimate on p/q? In which case it is not so much that choosing p and q similar is inherently weak it is just that it is natural to try for p/q close to 1 and it is more natural to end up with an implementation of choosing p and q close to each other rather than specifically 20% apart. Kinda how 1234 is not a weaker number than any other 4 digit number but it is bad as a password because the human brain can more easily remember it and thus prefers to choose it.
I wonder if you could make this work better on well chosen primes by assuming they will be well chosen. Start at root of N + the root of the root of N and start going both up and down (instead of a=a+1 you have a+g and a=g for g starts at 1 and g=g+1 each iteration.
The good news is since getting the weak primes out of the keys is so trivial, it's equally trivial to make a test at key creation and just reject the keys and start over.
This is brilliant.. and scary ! I would imagine not a lot the RSA libraries will pick a good far apart primes. And what makes this scary is parallelizing this is quite easy.
Actually, it's highly unlikely to randomly generate two 1024 bits, that have the first 512 bits identical. Unless the random number generator is not really random.
I'm 4 minutes in and I have a 3-part question. if P and Q are Prime numbers, How many primes are in the "usable range of primes", and is that few enough that you could precalculate them into an, albeit large, rainbow-table? How large of a drive would you need to store the rainbowtable for current RSA prime sizes? I'd also like to clarify "usable range of primes", because, the generation method for primes, will not give all primes for a range, and is a much, much shorter list.
Quick search reveals: If you were to collect the power of the sun for 32 years in a dyson sphere and input that into a computer at the maximum theoretical efficiancy. Then that computer could count to 2^192. That is still a bit short of the 2^1024 range that RSA seems to use. In Short: Not physically feasible until we have explored a few layers deaper into physics.
@@ShakesB13r That misses part of the nuance of my question. That's true if you're counting every number starting at 2^1, which you most certainly aren't, so that's a massive reduction in calculation time, and second, The method used for generating random numbers is not true randomness. True digital randomness doesn't exist, we use pseudo-random numbers, which vastly reduces the numberspace even further for larger numbers. If you have the method of pseudo-random numbers that was used, and reverse engineer it to know what factors go into generating, such as pulling specific memory addresses and clocks, you can replicate that very limited space of generation, and re-generate the same pseudorandom numbers. in theory... It would still be an enormous number-space, but nowhere near 2^1024, I'd honestly doubt that the actual numberspace exceeds 2^50, because if it pulls from memory addresses, there's a very finite number of possible instructions, and if that memory space was, say, system memory, It's hypothetically possible to get exact strings, particularly if the psudo-random generation is less secure. It's actually why there's that one room full of lava-lamps on a camera being used to generate random numbers that's owned by cloudflare for SSL, and they do that exactly for the reasons I've described above. Which means this method couldn't be used for cracking cloudflare's prime generation, but the end-user does not have access to that level of RNG, they only have PRNG. What I want to know is, if that's why cloudflare is using lava lamps for RNG, how real is that threat of exploiting PRNG? How much can you narrow down a PRNG Numberspace? does PRNG even fill the entire numberspace evenly, or is there an asymetrical distribution that could be exploited? What factors go into PRNG and can that be exploited?
@@therealquade there are n/log(n) primes smaller than n; so there are about 2^2037 primes that are 2048 bit long. No, we can't physically iterate over them. Worse still, there are about 2**267 particles in the observable universe, so we don't have even the amount of matter necessary to store them (let alone keep track of them all) So, while, yes, it's is a shorter list, and yes, it's a much, much shorter list, we are operating with numbers so gigantic that tens of *orders of magnitude* is an insignificant rounding error. As such, shortening that list is completely meaningless compared the (humanly) uncountable void that are the typical numbers used in RSA.
@@666Tomato666 that's Half of my question answered, How about the other half of my question, How much further can that be narrowed down by the predictability of PRNG? Is that not also tens of orders of magnitude of a difference, if not more?
Would love to see examples of broken random number generators or real life libraries that are flawed. Almost all well used algorithms are audited. I guess I'll have to read this paper, then see what uses these algs
I tried many ways, I even modified my iptables rules, firewall restrictions, and all the possible ways, but still I cant get reverse shell. Netcat doesn't listen to my reverse shell, so I stucked in the root me room for more than a week. I need help, please anyone suggest me any ideas to overcome this.
It seems that your starting guess for "a" could actually be anything, and as long as your initial guess is close, this method is a (relatively) fast way to explore out from that point? Is my understanding correct that the lesson learned from this isn't "don't pick two primes that are close in size", but rather "don't pick primes that are near intuitive starting positions"? And then you have the issue that, if you tell all of the RSA algorithm providers to avoid all of the areas that are near intuitive starting positions, you've drastically, and perhaps cataclysmically, reduced the search-space of what 'acceptable' primes are in-use by RSA library providers? I feel like I'm missing something here, posting to be corrected and be a little less wrong.
You could choose two similar (say m, n ≈ 2ˆ256, m-n≈2ˆ128 starting long integers), use simpy.nextprime() on both and verify that the resulting primes are different and actually prime (not just semi-prime). If they coincide, you could just use simpy.nextprime() on one of them several more times.
11:03 Why are a and b so different in length? I thought they were similar in length? Edit: Nevermind. We are talking about p and q, and not a and b :facepalm:
10:14 Ok, so it's not fully compatible with Python, as `^` in Python is the XOR operator and you just called it "to the power of". In Python that would be `**`.
I haven't understood how I would end-up with two close values of p and q. Because when I generate an RSA key pair, I don't choose those numbers (I'm not even aware of them). I might be asked by the keygen program to 'generate some randomness' by moving my mouse (or some such). So what's the arrangement where these values are 'badly chosen'? How could I [arrange that | rely on that | expect that] as a hacker?
It's not user error. It's the people who developed the software that chooses the primes who are at fault when weak n are generated. You need to trust that the software you're using is generating strong n, not yourself
would it be possible to have an n that is the product of several sets of p and q primes and would that make it "easier" for figure out d? just thinking that having a bigger pool of primes to use might make it somewhat quicker somehow.
Yes. The more primes in a number's factorization the easier it is to factor it. This is exactly why RSA only uses n that are the product of TWO large primes
So then how close is 'too close'? Once you know the threshold for 'too close' couldn't you just begin your search outside that boundary? Basically reducing your search space?
Sure, but reducing the search space would only shave off about the amount of time you needed to crack the key if it was weak which by assumption is not that much time. A strong key that would require 100 years to crack doesn't really mind if you shave 1 hour off. And then there is the hilarious situation where the key actually was weak and your algorithm can't capitalize on it because you purposefully excluded checking for it.
Strictly speaking RSA uses the Carmichael totient function, lambda, not the Euler totient function phi. The difference is that Euler's is (p - 1)(q - 1) and Carmichael is LCM((p - 1)(q - 1)).
Every integer that is not congruent to 2 mod 4, can be represented as a difference of two squares. This includes every odd integer. See also OEIS: A042965.
@@drumetul_dacic Every pair of consecutive squares (say, k^2 and (k+1)^2) are an odd number apart: (k+1)^2 = k^2 + 2k + 1; (k+1)^2 - k^2 = 2k + 1. So, to find values a and b for a given odd number n such that n = a^2 - b^2, choose a = (n+1) / 2 and b = a - 1.
no, private key in ECC is any number, and there is no factorisation involved, that being said, you still have to be careful with the random numbers you use for signatures....
Don't you normally compute the Carmichael function λ(n), not the Euler totient function φ(n)? I know both will work, since λ(n) | φ(n), but I'm pretty sure λ(n) is what is actually used. At least, when I learned about it in class, all the papers/textbooks I read used λ(n).
The best lecturer on Computerphile, hands down.
Together with prof. Dave !
pounding off in the comments for the Good Doctor!
100%
@@gloverelaxis ...phrasing.
True!
“A lot of thought goes into generating random primes than I am doing justice in this video” - sounds like a great topic for a future video :)
Yes please! That would be very interesting.
I was going to say exactly the same thing. I was surprised by that statement and now I'm curious
YESSS 😍😍
It's easier to just generate a bunch of keys' N values, determine the product, and do a GCF with the target N. Use the same certificate tool to get the best results.
Yes! Now I REALLY want that video
I find it hilarious that for weak primes, the factoring algorithm runs faster than the primality test.
Nothing special: algorithm works in the assumption that there are exactly two factors for n.
@@Avorthoren That's not an assumption, that's a property of n.
@@Avorthoren It actually is special to weak primes: if the two factors are not weak primes, the best factoring algorithm for a number with two prime factors would take exponentially more time than the primality test...
@@Jooolse Of course they are weak. But what do you mean by "exponentially more" in this context..? Exponentially more by what parameter? :)
@@nerze3157 Read carefully what I wrote.
Always a good day when a Mike Pound video drops!
Proper
Generating random numbers is too important to leave it up to chance.
Get out
Lol
No computer generates random numbers. A number that is generated with a mathematical function can never truly be random. We call random numbers Pseudo Random. Generating random numbers for encryption is a lot more complex than just choosing to numbers between A and B.
I love this comment so much. xD
@Rake1087 you can't generate true random numbers from something deterministic like a turing machine. But on a real computer it's quite doable. Shift noisy ADC numbers or use quantum effects if your structure is low nm.
Some servers are used so damn frequently that determinism isn't really a problem. The lifespan of the numbers is just too short and usage to random to keep track and brute force.
While not directly related to this RSA "shortcoming", this video did remind me of another (arguably more fundamental) concern regarding RSA. It is widely known that the difficulty of factorizing the product of two large and random prime numbers, is that makes RSA work. However, when just 1 of those primes isn't truly random, maybe a fixed value or in any way deterministic or part of some limited/predictable pool/rotation of seemingly random possibilities, it becomes rather trivial to break the cipher (to anyone with knowledge about what makes this prime less random).
Nothing new there. This "trick" has been used in CTF challenges. However, what has surprised me on several occasions, is that not many people appear to realize that it is this same factorization problem that makes it equally hard/impossible to detect any "weakened" key from the outside. In other words: it is nearly impossible to determine if a RSA initiated encrypted transmission is in fact secure, without access to the private key. From a set of private keys, a pattern in primes might still be spotted (although maybe not trivial). But from just public keys, this is certainly nearly impossible.
In a way, you could say that RSA is just as good at hiding compromised security, as it is at providing secure cryptography. One may ponder on whether that is a desirable property of good cryptography. One may also wonder whether that might have been just a side effect, or who knows maybe intentional.
The few times I have seen this being brought up, it was instantly discarded as irrelevant. Often in a surprisingly dismissive way. Usually with arguments along the way of: "if you don't trust who generated the keys, all bets are off anyways" or "organizations who take their security seriously, will do such a thing because of X Y Z". However, I can think of at least several scenarios in which a company may like to pretend to use very strong cryptography, while at the same time (secretively) provide a convenient way (e.g. to a 3rd party with "special interests") to eavesdrop on those "secure" transmission without much hassle.
While accusations of any such a thing actually happening remain conspiracy theory, it is the nature of this factorization difficulty in RSA that also makes that it would be almost impossible to prove such a thing taking place (unless some closely guarded secret leaks out). Considering the history of the NSA and RSA (both individually and in relationship with each other), it feels to me like this RSA concern deserves more attention than it generally appears to get. Similar concerns (although for totally different technical reason) might be raised regarding the NSA's EC cryptography.
But that's just my 2 cents.
Conceptually, that's valid, but there's a lot of overhead. In the wild three letter agencies generally get the private keys from the originator. Heck, if your pockets are deep enough, some companies will sell them after signing a lengthy NDA.
@@ClifBratcher No doubt true, to all you said. But all of that will leave evidence of collusion (which might be legal requirements in 1 jurisdiction, yet at the same time illegal according to another). It is exactly the plausible deniability aspect of this RSA concern, which might make it particularly useful to some parties (which might not be the typical 3 letter agencies; there are other scenarios).
What do you mean when you say, “it is this same factorization problem that makes it equally hard/impossible to detect any ‘weakened’ key from the outside”?
I assume that you mean a weakened key to be one comprised of a prime that isn’t truly random or not chosen properly.
Perhaps the best way to see if an RSA encryption is weak is to assume the key isn’t random and try various algorithms that take advantage of different mistakes that can occur while creating the key, such as the ones used in the CFT challenges you mentioned.
Assuming you have a wide variety of algorithms that check for mistakes, wouldn’t it be relatively easy to check if a key is weak i.e. by factoring n? Maybe you can’t prove with 100% certainty a key isn’t weak, because that might require a potentially infinite amount of algorithms, but after so many attempts wouldn’t it be safe to assume a key is strong with a high probability?
My thoughts exactly.
One example, by the way -
You have a closed source messaging application that promises to be end-to-end encrypted (*cough cough* Whatsapp). The company wants to read its users' data, or let the government read the data. There are of course loads of methods with a closed source application, but a backdoor like this one (picking intentionally weak keys, or better yet picking strong keys with a specific predictable algorithm) would be comparatively hard to detect.
In other words - you might trust the person you're talking to, but that doesn't mean you can trust the software they run.
Admittedly this is all pretty niche and unlikely though
Definitely stealing this method for CTFs. I've been using a number field sieve to factor low-bit modulo numbers. This is awesome!
Back in school, we did some RSA by hand, generally using primes up to 13. having a p, q = 7, 13 is a bit lower than generally secure, but it illustrates the point nicely.
Really nice video. And I am playing with Sage again, after some 10 years. Mr. Euler and Mr. Fermat - it is truly amazing what you guys gave us.
I love this channel. it's encouraged me to start a degree in computer science, I'm just starting second year and I'm really excited.
7:09 an interesting point here, you need to assume that a & b are integers.
since b=(p-q)/2 a=(p+q)/2 they might be a fraction like if p=5 & q=2.
fortunately "2" is the only even prime. so we are dealing here with two odd primes, so always
(p-q) & (p+q) are even.
please post more computerphile. I find my computer science classes incredibly boring, but these videos I cant stop watching. Makes me love computers!
Also when p-1 or p+1 is B-smooth for some small value of B (approx. B
One more special method, is the One Line Factoring Algorithm proposed by William B. Hart (called HOLF), which can factorize numbers of the form p*(a*p+b), where p is prime and a and b are small. For example, numbers of the form: n = p*next_prime(p*k), where k is small (approx. k
the math is way over my head, but again Dr Mike made a complex subject very simple to follow, Thanks you, always look forward to these presentations.
Would love more videos on the challenges of generating good primes.
As said its more complex that math.random, and this video shows that bad prime choice can destroy what is essentially an amazingly good encryption algorithm. So would love to know more!!! Its that nice mix of maths and computer science (almost like a hybrid of numberphile and computerphile!) that I think is really fascinating and resonantes with both audiences.
Dr. Mike is really great at explaining even really complex topics, so would be great for more!!!
A modified Maurer's algorithm that uses the Miller-Rabin probable primality test and a cryptographically secure PRNG modified to randomly select a number within a given range with uniform probability. You use Maurer's algorithm to build up a prime to the appropriate size from other, smaller, randomly found prime numbers.
I suppose I should mention that the reason you do it this way, instead of just picking some random number uniformly in the appropriate range, checking its primality using Miller-Rabin, discarding it if its composite and repeating until you have two large prime numbers P and Q, is because using Miller-Rabin on numbers of that size becomes infeasible.
Some real systems just use hardware to generate true random numbers.
You could generate switching noise by, well run your code, and measure anything (mostly core temp) with way too much precision. Shift the noisy bits and check for prime or initialize a random generator.
Some servers simply use quantum effects of properly poorly designed transistors.
Or you pay one employee enough to stay loyal and let them pick a number by hand that is used by a generator.
An interesting field for a video.
13:35 pick random p, find the nearest prime
pick q such that q > [p + 2^n/2] (or you can choose whatever difference is large enough for you), and find the nearest prime above that The important part is if the q you pick doesn't match, you have to generate a completely brand new number each time.
I watch each of these lectures. To learn but also just to relax.
Not sure if anyone has mentioned this, but I love how you've repurposed continuous feed (dot matrix) paper! Also, your writing is so clean and legible, which makes it perfect for this application!
A side note - any modulo with a prime base creates a group under multiplication, which is why you mentioned finding the inverse of phi_n - I think this is an awesome example of pure maths such as group theory actually being used in action! :)
Ah yes, been reading Hanno's blog for years. His "feisty duck" newsletter is also always a good read. How very unsurprising to see Hanno's excellent work make an appearance. :)
Beside myself that you actually started this comment with "Ah yes." xD Nothing wrong with that.
Interesting that you only have in the private key, all the reference implementations I've found (and the one ChatGPT spat out) all have a two-part private key, made of (, ) and also using both…
Very nice! A video with a practical example! Not only dry theory as most of your videos.
Just another bullet point in the long list of reasons why you should never ever implement cryptography algorithms yourself unless you really, really know what you are doing.
Unfortunately many people think that because the formulas behind cryptography algorithms are easy to find, implementing them is also easy. There is a reason why you have specialists for cryptography.
*OpenJDK* recently learned this the hard way.
@@ExEBoss what do ye mean
@@0xhhhhff See *Computerphile’s* video titled *“Psychic Signatures (Java Vulnerability)”* for explanation.
@@ExEBoss
Not only them. Java in general did.
@@0xhhhhff
They weren't the only ones with these issues.
I have no idea what he’s talking about but I can’t stop watching
Interesting. I'm surprised that it finds the factors so quickly, even with so many random bits difference. You're effectively searching for the arithmetic mean ((p+q)/2) of the two factors, given the geometric mean (sqrt(pq)). I guess the 'trick' is that for numbers of similar scale, these two values are very close.
It's not surprising at all if the first half of the bits are equal. In fact, that would mean it takes about 2 iterations!!! Let's say a is 333500 (6 digits). Then p and q are numbers of the form 333???, which means b is at max 499 (3 digits) and b² has at max 6 digits (probably 5, but that would only make it faster).
Now, when you go through the algorithm, the first candidate a² will have 12 digits (or 11, but lets stick with 12). But (a+1)² will already be about 666000 larger than a²: a 6-digit difference. The square root of a 6-digit number is a 3-digit number - which is already on the order of the maximum value for b. When you try a+2, the value you'd get for b² would be 1334004 (7 digits) - which can't be possible, because we know that b² is a 6-digit number.
The Wikipedia page for Fermat factorisation shows how b grows more quickly than a, which is how it manages to skip through the search space quickly.
Maybe you can use a factor K for N: X^2 + K*N = Y^2 and using K as an iterator.
For example: 313 * 113 = 35369 = N, if you pick K as equal 3 you get 3*35369 = K*N = 106107 so take the square root of ceil(3 * 35369) = 326;
326^2 - 3 * 35369 = 169 = 13^2
326 - 13 = 313 = P
326 + 13 = 339 = Q * 3
For example: 17 * 11 = 187, K = 1 -> ceil(sqrt(K*N)) = 14 and 14^2 - 187*1 = 3*3 -> 14 + 3 = 17, 14 - 3 = 11
This is pretty neat method to reduce search space , I will have to give it a run. I would use this against the issuing CA, not the server key. Then re-issue your own certificates. Sign your own firmware images. Some firmware signing techniques will use an intermediate certificate and leaf that are both generated at the time of signing and each time a new image is signed. It would be interesting to analyse the p q generation in such a circumstance.
Mike pound is back ! Missed you man
Hum, so this is a glimpse of what Professor Edward Frenkel was talking about the relations between P and Q in the NSA video... it is frightening to think what math techniques and tools may be under the radar that we will never know about...
Any plans to have a video on quantum-resistant cryptography? Seems like a good time with the recent White House memo
Yes, please!
What memo?
Unparalleled explanation skills!
I’m not a programmer or coder but when Mr P speaks I listen.
*Dr. P.
I fully agree though. I never miss any of his computerphile videos.
@@generalmanyara when he fixes my dads cancer I’ll call him (and any non medical doctor) Doctor. Until then he’s Mr Pound.
@@AcornElectron Wow. Way to invalidate the amount of work that goes into getting a PhD...
@@IceMetalPunk 10 years is gone like that ‘snap’
@@IceMetalPunk it's not mandatory to call someone by their title. no work was invalidated here
Have you done a video on using quantum computing to break RSA?
Great video!
What about making a video about Peter Shor's algorithm for quantum computers?
2:03 - t's not even necessary to work out or steal the private key, to impersonate a legitimate website or server. Most end users don't even understand what a certificate is, and therefore, when their browser presents a trust warning, they simply click the "Accept" button to continue to the site and hand over their credit card details.
Presumably, someone could "sweep the web", looking for weak Public Keys, and then focus their attacks on those poor unfortunates.
Love these little lectures! More concepts and interesting problems please!
3:04 a Lisp developer was born that second.
This is exactly the reason why one must never implement cryptographic algorithms as they are explained in textbooks. They lack all these real world tweaks which are necessary to prevent very specialized attacks for specific cases.
There even have been smartphone apps in the past (sorry for not remembering the names - they were short lived), where cryptographers essentially criticized exactly that: It was a textbook implementation which does not hold up against real world scenarios, resulting from specific hardware, bad random number generators, limited memory, similar user mindsets, etc...
7:30 If we maintained a list of squares where S]x] = x^2, then
ceil(sqrt(N)) = bisect.bisect_left(S, N)
which can be computed in O(lgN).
I'm guessing the issue is that N is so large that maintaining a sufficiently long list S would be impractical.
There used to be a big problem with some hardware servers with too low entropy to generate p and q at boot. Often 2 different servers where using the same p or the same q.
So there's N1=p . q1
And N2=p . q2
Taking the biggest common prime factor of 2 numbers is fast so do that on N1 and N2 to get p then you get q1 and q2 with simple division.
2 servers hacked together...
Ah okay, so it's not RSA that's broken, it's that it's just not being implemented correctly in the first place.
@@imveryangryitsnotbutter rsa layer itself was done correctly
it's the rng lib below was flawed
Yes, for those interested there is more details about this kind of exploit in a fairly famous paper titled Ron was wrong, Whit is right (can easily find a free pdf googling)
@@NoNameAtAll2 Sounds like a problem I’d also expect to see out of Haskell’s cryptonite lib that most Haskellers use atm.
@@imveryangryitsnotbutter It's always an implementation problem
Common factor attack: Given two private keys n1 = p * q1, n2 = p * q2, using the same p, then p = gcd(n1, n2), which is easy to compute with Euclid's algorithm, then q = n / p.
This actually happened for a large number of public keys in the wild, that had been generated using a faulty prime number generator. This finding was revealed in a paper from February 2012.
Have any RSA client libraries implemented checks to reject certs with weak keys?
I find that if you add phi(n) to the d key, then the new d = old d + phi(n) still works, is that new private key?
"I'm not a genius, it just wasn't difficult". I'm using this
Owh yes, dr Pound is always on point!
When I started following Numberphile I couldn't understand what was all the fuss about with the primes. Then I started following computerphile and it slowly started to make sense.
love this stuff, even though my maths was always garbage.
Six minutes in I had to check, am I on numberphile or computerphile
Can someone give a proper intuitive explanation for this? I just tried it myself, randomising the last 512 bits and it found the result in the first step of ceil(sqrt(n)). How can it pinpoint to the solution so easily.
That looks like it's a pretty handy test algorithm to make sure your encryption is properly secure.
_RSA Hate Him for This One Simple Trick_
Wow, I can't believe what happened next!
I wonder if this problem generally occurs if the attacker, for whatever reason, has a good estimate on p/q?
In which case it is not so much that choosing p and q similar is inherently weak it is just that it is natural to try for p/q close to 1 and it is more natural to end up with an implementation of choosing p and q close to each other rather than specifically 20% apart.
Kinda how 1234 is not a weaker number than any other 4 digit number but it is bad as a password because the human brain can more easily remember it and thus prefers to choose it.
There is a special method for this, called the One Line Factoring Algorithm proposed by William B. Hart.
This was so well explained, wow. Nicely done!
I wonder if you could make this work better on well chosen primes by assuming they will be well chosen. Start at root of N + the root of the root of N and start going both up and down (instead of a=a+1 you have a+g and a=g for g starts at 1 and g=g+1 each iteration.
I love this! Even without understanding all
The good news is since getting the weak primes out of the keys is so trivial, it's equally trivial to make a test at key creation and just reject the keys and start over.
3:30 is there only one such number?
This is brilliant.. and scary !
I would imagine not a lot the RSA libraries will pick a good far apart primes.
And what makes this scary is parallelizing this is quite easy.
Actually, it's highly unlikely to randomly generate two 1024 bits, that have the first 512 bits identical. Unless the random number generator is not really random.
I'm 4 minutes in and I have a 3-part question. if P and Q are Prime numbers, How many primes are in the "usable range of primes", and is that few enough that you could precalculate them into an, albeit large, rainbow-table? How large of a drive would you need to store the rainbowtable for current RSA prime sizes? I'd also like to clarify "usable range of primes", because, the generation method for primes, will not give all primes for a range, and is a much, much shorter list.
Quick search reveals: If you were to collect the power of the sun for 32 years in a dyson sphere and input that into a computer at the maximum theoretical efficiancy. Then that computer could count to 2^192. That is still a bit short of the 2^1024 range that RSA seems to use. In Short: Not physically feasible until we have explored a few layers deaper into physics.
@@ShakesB13r Always nice to refresh my understanding of how large a ~1000 bit number is
@@ShakesB13r That misses part of the nuance of my question. That's true if you're counting every number starting at 2^1, which you most certainly aren't, so that's a massive reduction in calculation time, and second, The method used for generating random numbers is not true randomness. True digital randomness doesn't exist, we use pseudo-random numbers, which vastly reduces the numberspace even further for larger numbers. If you have the method of pseudo-random numbers that was used, and reverse engineer it to know what factors go into generating, such as pulling specific memory addresses and clocks, you can replicate that very limited space of generation, and re-generate the same pseudorandom numbers. in theory... It would still be an enormous number-space, but nowhere near 2^1024, I'd honestly doubt that the actual numberspace exceeds 2^50, because if it pulls from memory addresses, there's a very finite number of possible instructions, and if that memory space was, say, system memory, It's hypothetically possible to get exact strings, particularly if the psudo-random generation is less secure. It's actually why there's that one room full of lava-lamps on a camera being used to generate random numbers that's owned by cloudflare for SSL, and they do that exactly for the reasons I've described above. Which means this method couldn't be used for cracking cloudflare's prime generation, but the end-user does not have access to that level of RNG, they only have PRNG.
What I want to know is, if that's why cloudflare is using lava lamps for RNG, how real is that threat of exploiting PRNG? How much can you narrow down a PRNG Numberspace? does PRNG even fill the entire numberspace evenly, or is there an asymetrical distribution that could be exploited? What factors go into PRNG and can that be exploited?
@@therealquade there are n/log(n) primes smaller than n; so there are about 2^2037 primes that are 2048 bit long. No, we can't physically iterate over them. Worse still, there are about 2**267 particles in the observable universe, so we don't have even the amount of matter necessary to store them (let alone keep track of them all)
So, while, yes, it's is a shorter list, and yes, it's a much, much shorter list, we are operating with numbers so gigantic that tens of *orders of magnitude* is an insignificant rounding error. As such, shortening that list is completely meaningless compared the (humanly) uncountable void that are the typical numbers used in RSA.
@@666Tomato666 that's Half of my question answered, How about the other half of my question, How much further can that be narrowed down by the predictability of PRNG? Is that not also tens of orders of magnitude of a difference, if not more?
10:16 In this pythonic context, the ^ gave me small heart attack. But then I remembered it's not Python.
Aahh man, ComputerPhil the expert does it again!
Would love to see examples of broken random number generators or real life libraries that are flawed. Almost all well used algorithms are audited. I guess I'll have to read this paper, then see what uses these algs
please make a cryptography playlist of Dr. Mike Pound videos
I tried many ways, I even modified my iptables rules, firewall restrictions, and all the possible ways, but still I cant get reverse shell. Netcat doesn't listen to my reverse shell, so I stucked in the root me room for more than a week. I need help, please anyone suggest me any ideas to overcome this.
"jesse we need to install sagemaths"
Mike makes the day
Really great Lecture. Well Done.
It seems that your starting guess for "a" could actually be anything, and as long as your initial guess is close, this method is a (relatively) fast way to explore out from that point? Is my understanding correct that the lesson learned from this isn't "don't pick two primes that are close in size", but rather "don't pick primes that are near intuitive starting positions"? And then you have the issue that, if you tell all of the RSA algorithm providers to avoid all of the areas that are near intuitive starting positions, you've drastically, and perhaps cataclysmically, reduced the search-space of what 'acceptable' primes are in-use by RSA library providers?
I feel like I'm missing something here, posting to be corrected and be a little less wrong.
5:38 I'm relieved that not even Dr. Mike Pound can escape the inevitable sharpie marks on your hands.
Is n the same as N? I didn't catch why there was a change but he did refer to N once as, "our original en." So....?
I love the way he talks
How fast could you calculate 'd' (public key) when using optimized code for QBit Processor? Even when 'p' and 'q' are "difficult" numbers?
How do you get two similar prime numbers in Python?
Wondering how to reproduce this...
You could choose two similar (say m, n ≈ 2ˆ256, m-n≈2ˆ128 starting long integers), use simpy.nextprime() on both and verify that the resulting primes are different and actually prime (not just semi-prime). If they coincide, you could just use simpy.nextprime() on one of them several more times.
@@evgeniipavlov5440 Thanks man. Worked fine!
11:03 Why are a and b so different in length? I thought they were similar in length?
Edit: Nevermind. We are talking about p and q, and not a and b :facepalm:
10:14 Ok, so it's not fully compatible with Python, as `^` in Python is the XOR operator and you just called it "to the power of". In Python that would be `**`.
RSA has been broken. It has a secret back door:
correct horse battery staple
So. How is openssl generating p, q?
I haven't understood how I would end-up with two close values of p and q. Because when I generate an RSA key pair, I don't choose those numbers (I'm not even aware of them). I might be asked by the keygen program to 'generate some randomness' by moving my mouse (or some such). So what's the arrangement where these values are 'badly chosen'? How could I [arrange that | rely on that | expect that] as a hacker?
It's not user error. It's the people who developed the software that chooses the primes who are at fault when weak n are generated. You need to trust that the software you're using is generating strong n, not yourself
Ahh the nightmare is back... Had to do both RSA and quadratic sieving by hand during my discrete maths exam.
would it be possible to have an n that is the product of several sets of p and q primes and would that make it "easier" for figure out d? just thinking that having a bigger pool of primes to use might make it somewhat quicker somehow.
Yes. The more primes in a number's factorization the easier it is to factor it. This is exactly why RSA only uses n that are the product of TWO large primes
Many time he said "There are not 'many' algorithms to do that". I am wondering that, is there some algorithms that can do that?
This video might get a million views :)
Yeah ::
plz ! i hope you make a video about chacha20/xchacha20 cipher with more details !!
If you are being choosy about your "random" numbers then are they really random?
isqrt(n) + 1 -- "so that's the ceiling"
only given that n isn't square, otherwise it's the ceiling + 1 :)
Nope. Ceiling is the smallest integer greater or equal to its argument, so ceil(n) == n if n is an integer
I think he purposefully avoids looking at the camera, at us. And I think it has a positive effect on us focusing on the subject.
So then how close is 'too close'? Once you know the threshold for 'too close' couldn't you just begin your search outside that boundary? Basically reducing your search space?
Sure, but reducing the search space would only shave off about the amount of time you needed to crack the key if it was weak which by assumption is not that much time.
A strong key that would require 100 years to crack doesn't really mind if you shave 1 hour off.
And then there is the hilarious situation where the key actually was weak and your algorithm can't capitalize on it because you purposefully excluded checking for it.
Pls do a video about yubikey or sth similar
Awesome video, however u cant apply this formula u made (N = a² - b²) for all cases... just the one you picked ;)
Strictly speaking RSA uses the Carmichael totient function, lambda, not the Euler totient function phi. The difference is that Euler's is (p - 1)(q - 1) and Carmichael is LCM((p - 1)(q - 1)).
Why is 65537 the best integer?
How do you know that n can be expressed as a^2 - b^2? Is this some math theorem or something?
Every integer that is not congruent to 2 mod 4, can be represented as a difference of two squares. This includes every odd integer. See also OEIS: A042965.
@@drumetul_dacic Every pair of consecutive squares (say, k^2 and (k+1)^2) are an odd number apart: (k+1)^2 = k^2 + 2k + 1; (k+1)^2 - k^2 = 2k + 1. So, to find values a and b for a given odd number n such that n = a^2 - b^2, choose a = (n+1) / 2 and b = a - 1.
But just where does Dr Mike get his sweaters...?
I don't understand one thing though, 1 mod anything is always 1 right? So what's the point of writing 1 mod (euler totient (n))?
0:31 isn't the private key used to read encoded data instead? Because why call it public if it's actualy private.. ✌️💯 good work tho
Does elliptic curves have this problem?
no, private key in ECC is any number, and there is no factorisation involved, that being said, you still have to be careful with the random numbers you use for signatures....
How do we know if our RSA implementation avoids this? Could NSA sneak something in that causes this weakness?
Nice explanation 😄
Don't you normally compute the Carmichael function λ(n), not the Euler totient function φ(n)? I know both will work, since λ(n) | φ(n), but I'm pretty sure λ(n) is what is actually used. At least, when I learned about it in class, all the papers/textbooks I read used λ(n).
The totient function for primes and products of unique primes is extremely simple, and RSA uses 2 primes. I learned it with the Euler totient function