@@edwardian23 hmm. Posted but maybe YouTube didn't like the link. I only know approximately, but you can toggle a bit at, say, 100MHz and then sample it at, say, 101MHz. Sometimes you'll sample it while the voltages are changing and get a random answer.
I never knew that x86 processors had this functionality built in. It might be slow, but if you absolutely need just a few truly random numbers, it sure beats a wall of lava lamps, a "chaotic pendulum", or a radioactive check source next to a Geiger counter. Long term fans of this channel might get the reference. ;)
@simonsomething2620 I was also curious, so I looked it up. (Had a much better response before it got lost in transit) Basically, thermal noise on the processor is measured, converted to a numerical state, and then that value is used as the seed for a pseudo-random number generator.
Hey Valerio, the whole point of the video was for you to explain how hardware could generate a truly random seed, and you didn't mention that at all, either in general or in the case of the x86. For anyone who didn't already know, it was all gobbledegook. The only clue was when the questioner asked if you knew what the source of entropy was within the chip, and you said you didn't know. Anyway, Wikipedia says it's "thermal noise within the silicon". Cheers. 🙂
I rarely make comments, and very rarely negative ones about a video, but honestly I felt like this video was a complete waste of my time. I watched waiting for the part where he would explain the source of the true randomness that the hardware could leverage, but it never came. Thanks for looking it up and providing the answer; I thought it would be something related to the CMB or some electrical voltage wave, but thermal noise in the silicon is even more interesting.
Linux also has a similar system in place, where it collects entropy from all hardware connected to the system into /dev/random and /dev/urandom. I wonder what's slower; using rdseed or using the getrandom() syscall. Both should provide true randomness
I would assume that means that the circuit uses a reverse-biased P/N junction to generate noise (which a reverse biased junction does naturally) and then it clocks that noise through a schmitt trigger to turn it into a stream of 0's and 1's to create the output.
@@aqua-op Lol, yeah I was (and still am, haven't slept) a bit drunk, so I was kinda easy with loose words. But I'm glad because others wouldn't have said it. It wasn't that harsh. He obviously understands what's happening, just didn't explain the important part. It was still interesting when he said such things could have a bias toward one side or one set of numbers. Anyway thanks for your comment. This, (randomness, or lack thereof) is one of my favourite things. 🙂
@@SamMason0 who can blame him :) It's a funny thing, many people don't realise... But the site's called "Compiler Explorer"; I can't call it the Other Name...it's too weird :D
@@hwstar9416 it's Matt's surname... It was just his personal site, but the compiler explorer got kind of popular and took over. He gave an interesting talk about it recently at C++ On Sea
It's also nice to have the same seed create the same random numbers, for example in Minecraft, so I can play in the same world shared by a friend or a website and spawn in the same place, like next to a village. I imagine some other games operate on the same principle in a great variety of implementations, and that many others at least theoretically could.
Wouldn't the player's input cause a variable count of numbers to be pulled from the sequence, so the desired loot gets a different number if it isn't decided at game start but relies on a seed? Say, for example, you go where a monster spawns, which has some properties, and then you go for your loot? The Sims 2 had a very bad random number generator, and you could see the same set of people being chosen. Mixing in any variable data like the time and object identifiers could improve upon it.
@@j7ndominica051 Loot and monsters are not determined from seed only, but all the blocks and structures (villages, pyramids etc.) are fully determined, so it's essentially the same world, it looks the same
Seed numbers generating the same sequence of pseudo-randomness is very useful in things like VFX, for example, when you need to re-render certain effects but don't want to redo everything from zero... Using the same seeds assures you it will still fit together with everything else... For example when calculating particles, wind, fluid sims, etc., but even the noise from Monte Carlo raytracers.
If you need a different seed every time, just use clock() or time(). The true purpose of true random numbers is cryptography. Imagine you use rand() with some seed: no matter how secure the seed is, the sequence itself is deterministic. Thus you can predict it and break some encryptions. But even if the sequence is truly random then, as was stated, it might not be evenly distributed. That means the numbers must belong to a determinable subset of numbers, and it will be easier to brute-force the key; that turns an almost impossible task into a doable task within a couple of months.
@@Raspredval1337 How exactly does the evenness of the number distribution, or the lack thereof, mean that the following numbers belong to a deterministic subset?
@@Raspredval1337 The problem with using time for your seed is that if the attacker knows when your program started, it gives a rather small set of possible seeds to exhaustively try. Trash for cryptography purposes.
The standard C++ library can use a "true random" engine automatically if available in the hardware, so it's really easy to use in modern C++. Also, as you mentioned, it is recommended to use it mainly for randomizing the seed.
Well, in Linux you have /dev/random and /dev/urandom, which gather entropy from hardware: the CPU, hard disks, user input, etc. You could say those are "real" random numbers.
Both of these are circumstance-dependent though. /dev/random on most Linux distros uses true random numbers and blocks reads when "empty", but IIRC it can sometimes be plugged into a pseudorandom number generator that continuously reseeds from hardware entropy (e.g. in environments where hardware entropy is scarce, such as VMs that don't always have access to a dedicated hardware entropy generator). /dev/urandom is non-blocking, and the traditional behaviour is to use a pseudorandom number generator once true entropy is exhausted until more true entropy is generated. Notably, in many BSD implementations /dev/random is just a link to /dev/urandom, so all random numbers generated on BSDs are potentially pseudorandom numbers generated from a true random seed instead of forced true random.
Pseudo-random number generation and seeds were used in computer games all the way back in the '80s. One of the best "adventure"-type games was Akalabeth, and the only way you got the same dungeons and overland maps was to use the same "seed." Which is why, when I was writing games back then, I would just very rapidly (machine coding) generate PR number after PR number while waiting for the player to input. Player input is about as random a factor as you can get. Nobody is going to be able to repeatably respond within milliseconds every time. One of my favorite tricks was to use the refresh register in a Z80 as the seed. Because it was constantly incrementing, I could use it to tweak the randomness of the numbers I was generating, again using the player as the randomizing element. Mainly because the refresh register was a zero-cost (in programming terms) high-speed moving target.
Something I tend to do with my programs is to start out with a "seed" derived from the TOD clock, then generate a small integer, get that many random numbers, then save the last one to a file to be used in future as the "seed" instead of using the TOD clock. The procedure of generating a small integer and then getting that many random numbers and saving the last one as the new seed seems to provide sufficient randomness for most games.
I had a windows-based casino game back in the 90s. In some games, each round started with the same sequence of random numbers, making it easy to cheat.
To be as generic as possible, skipping over how implementations differ: the hardware RNG is a buffer of entropy samples collected from entropy source(s), typically fed by thermal or electrical noise, conditioned for quality, and governed by an entropy health check of sorts. These are analog components on the die. Often this process is so slow that rather than being used directly, the random numbers are actually used to seed pseudo-RNGs, after being conditioned with a Deterministic Random Bit Generator to remove identifiable characteristics of the hardware noise.
@@orbatos You clearly missed the point of the question; you basically didn't say anything at all. Given a biased coin toss, how do you unbias it without knowing the bias a priori? I don't know the answer, but I know that because I don't know the answer, you aren't even close to being informative here, because if you were, that's where you would have gone. You don't even know enough to know what the fundamental issue is ("collect entropy" is not meaningful).
You might want to have a look at stochastic computing, which exploits cosmic noise to perform approximate computing. You will be fascinated by the fact that a simple AND gate can do arithmetic multiplication by exploiting probability.
The real utility difference between RdRand and RdSeed is that numbers from RdRand can be concatenated to make larger random numbers up to 256 bits and still maintain O(2^256) security (128 bits for older chips). With RdSeed you can concatenate the outputs to make as large a number as you like, so 512-bit keys or 1024-bit keys or 10-million-bit keys will support a security strength that large. RdRand offers computational prediction resistance, and O(2^256)+O(2^256) = O(2^257), not O(2^512). This is sometimes referred to as the additive vs multiplicative prediction resistance of RdRand vs RdSeed.
This video is an amazing showcase of speaking with an accent without hindering the understandability of the spoken content whatsoever. I'm astonished by the way he's articulating the sentences in a way that seemingly highlights the premise on its own. Great stuff.
Another situation where it is useful to set a seed is teaching programming or data analysis, so everyone can replicate the expected results and easily verify they did the assignments correctly.
The important difference between a PRNG and a TRNG is not the distribution; you can always fix that for either type. What you want is to have no ability to predict the NEXT number in the sequence with better-than-chance probability.
You're encouraged to extract the entropy from these instructions via some cryptographic primitive, e.g. SHA-hash the output. This would remove any bias/"skew" and get you back to a uniform distribution.
@@ed_halley That is just another way of saying that the important difference is precisely the evenness of the distribution, because you cannot determine the probability of the next number unless you know the distribution.
It seems that a pretty large portion of the numbers displayed after 11:33 begins with the digit 1. I tried to pause the video (pseudo)randomly, and in some instances one third or even half of the numbers start with 1.
It may seem that there is bias, but I believe the numbers go from 0 to 2^31, i.e. 0 to about 2.1 billion. Note that leading zeroes are removed: every number from 1.0B up to 1,999,999,999 starts with a 1, and among the numbers below 1B, about one out of 9 starts with a 1. That works out to roughly half the numbers starting with a 1. The same should be true of the sequence generated earlier.
The function he calls gives 32 random bits, it's then converted to an int that spans between ±2.1 billion and displayed as base 10, that's why there's bias
I like the ending point regarding even distribution of pseudo random numbers which we as humans interpret as random. But technically an infinite sequence of the same number is possible and acceptable as well.
Woah, a wild tr1p! *edit:* speaking to that end part, it's kind of like on the Z80, seeding a PRNG with the R register. It's usually limited to 128 possible starting configurations, but still helpful
To expand on the "no": a pseudo-random number generator usually has guaranteed behaviors on the generated numbers **per seed**, given you take a "large" number of samples. That is, no matter the seed, eventually the generated numbers will approximate a given distribution. Usually bias in PRNGs comes from limited sampling or poor use.
No. Pseudorandom number generators are like mathematical blenders designed to always produce something uniformly distributed. Even if you throw all 0s in there it will blend it up to something uniformly distributed. However, it is important to keep in mind that if your initial seed is constantly nonuniform, such as, it is mostly 0s with just a few bits changed, then this can make it easier for an attacker to guess the seed. Any nonuniform but truly random bit stream can be converted into a uniformly random bit stream using a technique called a von Neumann extractor.
You could use a Bernoulli factory or a Dice Enterprise to generate correctly distributed random numbers from ones with an unknown distribution. For instance, just take two bits and, if they are a head and a tail, say it's a head, and if they are a tail and a head, say it's a tail. If they are both the same, reject. This way, as long as the random bits are independent, you are guaranteed to have a fair coin, as the head probability and the tail probability effectively cancel out. A Dice Enterprise is this concept applied to results with more than two states, i.e. numbers rather than just bits.
The 800 MHz clock was the clock speed of the uncore logic in the Ivy Bridge CPU in 2011, which is the first CPU to have RdRand. That's just the clock that was available in that area of the chip. Since then there have been hundreds of products with many different clock speeds. The Intel DRNG is run from 100 MHz to 2 GHz depending on the chip and its power goals. The low-power SoCs are usually 200 MHz. The high-end Xeons are currently 2 GHz because the Xeons have hard performance requirements to meet.
It seems like all the "random numbers" (the truly generated ones) are of a similar length digit-wise. I'm curious: if they are truly random, wouldn't they be pulled from every conceivable length of digits?
Not sure. Around 10:45, he says that "we don't know what source of entropy the computer is using".. so maybe the answer is "I don't know how it generates random numbers". He says that a typical pseudorandom generator picks numbers with an equal probability distribution, so that every number is equally possible. But this "true" random generator has a skewed probability(?) (which sounds to me like it would make it 'less' random)? Maybe I should just research it instead of typing a reply without knowing anything heh.
As far as I know intel uses thermal noise, but we are unable to check what they are actually doing in hardware, it's just too complicated and miniaturized
Electronic Design: "Understanding Intel's Ivy Bridge Random Number Generator" explains the design. A metastable circuit is constantly flipping its value. Then it is sampled, i.e. the randomness is copied from the metastable node into an unpredictable clocked signal. From that point on, Intel does a bunch of deterministic stuff to assure quality. But at the core of it: have a signal switch value extremely fast (metastable), then clock it. The clocked signal is unpredictable.
All "true randomness" means is statistical independence. A bit stream is truly random if each bit is statistically independent of each other bit, so that P(01) = P(0)P(1) = P(1)P(0) = P(10), meaning it is equally probable that a 0 will occur after a 1 in the bit stream as that a 1 will occur after a 0. Often, true randomness is falsely conflated with uniformity, which is the idea that 0s occur with equal probability as 1s. However, the bit stream 0101010101 is uniform yet clearly not random.

Any truly random but nonuniform distribution can also be easily converted into a uniform distribution by making use of statistical independence: assign pairs of 01 to 0 and 10 to 1 and throw out pairs of 00 and 11. Since 01 and 10 have equal probability of occurring, you are guaranteed to get a uniformly distributed set of random numbers. This only works if the bits are truly random; it will fail if the bits are statistically dependent upon one another, as statistical dependence follows different mathematical laws.

The trick to building a truly random number generator is thus to generate bits which are as independent of each other as possible, where no analyst who sits down and studies the bit stream could find any statistical relations between any of the bits. Often this is done by sampling many different sources: computers often generate random numbers by sampling fan noise, CPU temperature, keyboard and mouse movements, and so on. Often they will take only the least significant bits from these sources as well. If they used the entire CPU temperature, it might be predictably larger in the day and lower at night, due to when people typically use the CPU. If you sample only the tiniest fluctuations in CPU temperature, though, then there is no obvious correlation between what programs are running and those tiny fluctuations.
You can then combine these many different entropy sources into a single bit stream which should produce statistically independent bits, and either throw it directly into a pseudorandom number generator as the seed, or keep sampling until you have enough to turn it into a uniformly random probability distribution and use that as the seed. Cryptographically secure pseudorandom number generators are so well understood that you can write one from scratch in about 5 minutes (like ChaCha20), making it pointless to actually use the true random numbers directly. It makes a lot more sense to use them to seed a pseudorandom number generator.
But if the seed is vulnerable, doesn't that also make the resulting sequence produced by the PRNG also more predictable than a true random sequence? What's the benefit of a potentially compromised blackbox over an accumulation of inputs from various likely random sources that are unlikely to all be compromised at the same time, mic noise, typing timing, mechanical drive statistics, webcam noise, mouse trajectory, IMU noise etc
Yeah, but if the true random number generator has a bias we can't check for, and it's slightly more likely to output 3141579 than any other number, and we use it as a seed, then whatever sequence is generated by the seed 3141579 is still more likely to be drawn, so there's still a bias, no?
Can someone explain to me why we cannot know the source of entropy being used? Are these functions also used for cryptography purposes, and thus knowing it would make them weak, or something along those lines?
That would be very bad cryptography if it could be broken by knowing the general source of entropy. He avoided clouding the demonstration because the details change with each model of processor and are pretty tedious in any case. Mostly it involves collecting the signal noise on analog inputs such as thermal sensors.
How about taking the epoch number to the nearest second at the start of the program and using it as an input into the random number generator? That epoch number is unique and will never be repeated.
I like seeing Compiler Explorer being used as a tool to demonstrate how compilers generate assembly code. It's one of the most useful pieces of free software that ever existed.
Please don’t use auto-generated captions on videos like this when the speaker is going to be hard to understand. Please take the time to write up a transcript and ensure your video is accessible to all. Thank you!
Would using this true random number generator running locally be a good way to create keys for one-time-pads? For perfect encryption in hyperauthoritarian states like North Korea or China?
A few years ago I coded my own version of Breakout. To my great surprise, the game played EXACTLY the same way (provided you didn't miss the ball) each time. Same bricks hit, same scores, basically like watching a replay of the previous game. Upon debug, I realized that always starting the ball from the same position was a contributor to this flaw. Also, the game needed some RND when the ball hit the paddle. All in all, I realized that it is VERY complicated to allow a player to learn a strategy in video games. If you allow too much predictability, the game is ruined because the player can end up in a loop.
I apologise if I am ignorant, but doesn't a seed introduce a bias? I would love to see a huge sample of your code/hardware not generate a somewhat graphical repeating pattern.
I kind of disagree about the unimportance of a true random number. Using Rand() in Excel (as most analysts do) is not random. True randomness is paramount in statistical surveys, where a population should be truly randomly surveyed; otherwise it's just an "as random as we can get it" survey.
You can generate true random numbers in software without RDRAND/RDSEED. This is done by pitting the RTC against the CPU. Set a timer to expire sometime in the future and flip a bit between 0/1 as fast as the CPU will allow until the timer expires. The result is a true random bit. This is because interrupts are not precise: exactly when the interrupt occurs determines the result of the bit-flipping process. In pseudocode: flip_bit() { bit = 0; deadline = time.ms() + 1; while (time.ms() < deadline) bit = !bit; return bit; }
As a programmer my problem has always been more about getting deterministic numbers than actual random. You even need to record seeds to be able to reproduce results. What use is pure randomness?
Before Linux changed the /dev/random implementation, I had an external USB device that would provide an infinite stream of random numbers to /dev/random via kernel module. It had 2 internal monitors to protect against bias, and would brick the device if too much bias crept in (fail secure).
Around 2005 I ran into the issue that /dev/random would stall due to not enough entropy. It took me ages to find out why unit tests with TLS stalled until I moved my mouse. Then I bought a USB "entropykey" that could provide entropy to /dev/random.
@@kirkanos771 For one Linux's /dev/random implementation no longer blocks like that. That's why I mentioned the implementation change. Second, the device I had fed the kernel's entropy pool, and thus /dev/random, such that it never blocked. That was the entire point of my comment. Trying to make a gotcha comment to me when you don't understand what I'm saying is really weird.
@@amihartz To fail secure. It's a source of entropy. If that entropy ever becomes biased, the secure thing to do is make the device unrecoverable and unusable so that no one will ever use a biased entropy source. Because it wouldn't be entropy at that point.
@@klfjoat It depends on what you mean by "biased." Uniformity has nothing to do with security, an incredibly nonuniform source of entropy is still cryptographically secure as long as successive bits in the bit stream are statistically independent of one another. If by "bias" you mean they become statistically dependent, then yes, you should not use such a source, but if it is nonuniform yet satisfies statistical independence then it does not hamper the security. Also, destroying hardware still seems like a drastic decision and massive waste. There are ways to disable hardware without destroying it...
A skewed source of random numbers is not a problem because if it is truly random (statistically independent) then you can just transform it into a uniform distribution using a randomness extractor.
Actually, there's a book for that - "A Million Random Digits with 100,000 Normal Deviates", and it is not cheap. But this again brings the question of the importance of the "true random" when the output is used for a momentary placeholder of no additional value or dependencies.
It wasn't particularly strong, especially by today's standards and considering the limitations of the ROM, but I wrote an encryption/decryption program for my ZX81 where the key was the seed for the PRNG. Today it'd probably be cracked in 2 seconds flat because of the very limited keyspace, but I reckon it was fairly formidable for its time.
My optimisation lecturer mentioned a cool way to generate true random numbers: take a picture of a decaying banana and use the colour of the banana's pixels as your seed.
Technically any set of photos can be used to create random numbers. Using a webcam pointed at anything will do it. Even in the dark. The classic of being pointed at a wall of lava lamps really doesn't _need_ the lava lamps. Why boils down to how these cameras work. You never get exactly the same values out for each pixel.
@@DampeS8NYou don't need the lava lamps, true, but you don't *need* the camera either, since CPUs have built in generators. The reason for the lava lamps is because CloudFlare needs *a lot* of random numbers and they're not just relying on sensor noise but actually relying on the random appearance of the lava lamps themselves to get it.
Wouldn't the resulting pseudo-random numbers also be non-uniformly distributed when the true random (seed) numbers are skewed? In other words, a uniform likelihood function (pseudo random choice) over a skewed prior distribution (true, seed randoms) is still non-uniform?
Came here to comment exactly this. The pseudo random generator will return the same result for the same seed every time, which basically makes it a mapping function where you say "if input is 1, output 9; If input is 3, output 5;etc". Feeding it with a skewed true random number will simply result in another skewed random number. It makes no sense at all to use it as a seed.
@@Kwauhn. a pseudo random number generator repeats its sequence, at which point it is not random, so calculations based on random numbers respond bizarrely
@@DrDeuteron I see. You're saying that using a hardware random seed makes it harder to predict the behavior of a pseudo-random function, right? Like, if you used a language standard RNG function that uses standard seed initialization, it would be easier to guess the function and predict its outcome?
@@Kwauhn. no, I'm saying if you do a large complex simulation or Monte Carlo, and you call the RNG more times than it has unique outputs, your calculation can get really wrong in subtle ways.
@@pierreabbat6157 You're right, of course. Compiling for x86 means that you don't have to worry about it, and iirc early on in the video it's stated that this instruction has been supported for a long time.
@@WalderFreyBasically any Intel processor since 2012 and any AMD processor since 2015 should support it (technically AMD is planning to roll out ARM CPUs as well but I don't think any of those are on the market yet and it wouldn't surprise me if those included an equivalent function since they're initially going to target the server market where CPU RNG is far more important as desktop users need less entropy and generate their own by interacting with the system).
@@pierreabbat6157 My PC runs a 12th gen i7 and it has the instruction, but when I moved the program to my server, which runs a 2nd gen i3, it was crashing. It took me a bit of digging to find that in my cryptography library I had an option to use that instruction and had it turned on, and that CPU did not support the instruction despite being x86.
Yes and no. The benefit of PRNGs (particularly cryptographic PRNGs) is that we have mathematical proofs about their output statistics. The best you can get from a TRNG is a physics-based circuit model and the output from a statistical test suite confirming said model.
There are two components of picking a random number: the random picking and the distribution from which it is picked. E.g. 1, 1, 1, 1, 1 is a random sequence of numbers from a uniform distribution starting and ending at 1. Hardware RNG can guarantee that the process is random, but it's really hard to prove that the distribution is uniform (or of any other particular shape), since the distribution is related to the current physical environment of the processor. PRNGs are mathematical, so their distribution can be determined, but since they're just a deterministic algorithm - we have to pick their initial inputs truly-randomly, otherwise they're predictable.
@@d3line The uniformity of the random number distribution genuinely does not matter in the slightest. If the distribution is truly random, meaning, each successive bit is statistically independent from the next, then it is trivial to convert it into a uniform random number distribution because it follows the statistical law P(A)P(B)=P(B)P(A), so all pairs of AB (01) and BA (10) would be guaranteed to be uniformly distributed even if the overall distribution is not. You can then just map those pairs to 0 and 1 respectively and throw out the pairs AA (00) and BB (11) and you have a uniform distribution. This is called a _von Neumann extractor._
Yes; the range [1, 9] has approximately 10 times fewer numbers than [10, 99], which has approximately 10 times fewer numbers than [100, 999], and so on. Since the numbers with more digits are more represented, they are going to appear more frequently than the ones with fewer digits.
At a low level the RNG actually just spits out a string of ones and zeros, so the digit count is determined by the software implementation and how many random binary digits it asks the hardware for
Doing this does expose the intrinsic entropy your computer has. Quite cool. Will you be exploring physical unclonable functions, using SRAMs or some memristor-like devices?
Question: If you produce a sequence of pseudo-random numbers using some algorithm, then pick a number from that sequence at pseudo-random, does that make the outcome more random, or less random?
@@antiHUMANDesigns it doesn't change anything because your pool of inputs is already truly random. Regardless of how biased a human is, if the initial pool is truly random then it doesn't matter which number the human picks.
@@mareau2193 If the distribution is perfectly uniform both in the sequence and in the picking, I get that the result should be uniformly random. But what if there's a small deviation from uniform distribution in both algorithms, that is, in both the sequence and the picking algorithm? Does the result get twice the deviation, or less of a deviation from perfectly uniform? Or is the deviation as large as that of the algorithm with the largest deviation, or perhaps the one with the smallest? That's more technically what I'm asking.
My main takeaway from this video is that what I thought was an exaggeration of the Italian accent is actually the opposite. It's impossible to comically exaggerate the Italian accent, it turns out.
He didn't mean to say the source is unknown. He meant the statistical properties of the source are unknown... (as for any true random number generator)
Hopefully not for anything cryptographically secure - using the system clock is very predictable if the attacker has a sense of the system uptime (they wouldn't need to check many inputs to just brute force the seed for the PRNG), and an unconnected analogue input would likely be quite biased and generate very small amounts of entropy compared to the thermal noise of a high power desktop CPU
@@bosstowndynamics5488 For cryptography I would use a well tested library. It's Way beyond my skill set. Normally I just want to avoid getting the same series every time.
According to quantum mechanics there is true randomness, not just that we don't understand what is happening. Heisenberg uncertainty principle for example.
@@foible2085 You mean according to your _interpretation_ of quantum mechanics. Quantum mechanics is a deterministic theory just like any other. If you flip a truly random and fair coin an infinite number of times and form a probability distribution of the results, the results are guaranteed to deterministically converge to a distribution of 50%/50%. While each individual flip may be random, the probability distribution is deterministic. Quantum mechanics gives deterministic probability distributions of what ensembles of systems will converge to according to the Born rule. Whether or not a single system (just one sample, or the behavior of just a single particle when the experiment is run only once) is truly random is out of the scope of quantum mechanics itself (as a body of mathematics) and a question for philosophy and interpretation. The uncertainty principle could be due to intrinsic uncertainty in the universe, but there are also interpretations that posit it is caused by interacting with the particle disturbing its state (see Spekkens' toy model for example).
@@foible2085 Depends on how you interpret quantum mechanics. There are ways to interpret quantum mechanics deterministically, but they are unpopular as they rely on presuming that there is some variable which for some reason is hidden from us by the natural world and so we cannot know of it. (If it wasn't hidden from us, then we could go measure it, and then you would have a new theory rather than just an interpretation.) If we can't observe something, it is simpler to just not presume it exists in the first place.
I saw it mentioned in another comment but I'd like to add some context: hardware RNGs like the one presented here are still deterministic, from a physical point of view. They are based on chaotic processes (in this case thermal noise in the processor), which are usually modelled using classical physics, and return results we cannot predict, but that uncertainty only comes from the initial conditions of the system, which serve the same function as the seed in a PRNG. Once those are set, the result is deterministic and, in theory, we can calculate it. This means that, especially in cryptographic scenarios, these random numbers are not "unconditionally secure", because an attacker with infinite computing power could potentially reverse engineer the process and obtain the number. Of course, in practice no one has infinite resources, and there are several levels of security which you can guarantee even with a good PRNG or with any HRNG.

If we are talking about "true" randomness, the only processes that guarantee this are quantum processes. QRNGs are proven to be unconditionally secure because of the inherent randomness of quantum mechanics: no matter how well we can replicate the initial conditions, given the right measurement the result will always be random. That said, as a physicist who has built a lot of QRNGs, I once again reassure everyone that HRNGs are more than "random enough" for even the most secure applications.
I would argue the thing that makes some TRNGs insecure isn't that the source is thermal noise, but the fact that circuit implementations to sample the noise introduce biases.
The thermal noise effects these hardware number generators rely on are created by quantum effects in the silicon though, so if these are deterministic then the entire universe is deterministic and true random number generators don't exist at all.
See for example, Knuth Vol 2, Seminumerical algorithms (Chapter 3). "Anyone who considers arithmetical methods of producing random digits, is, of course, in a state of sin." John Von Neumann (1951). How hard can it be to generate random numbers? Read chapter 3 -- it's very hard! (See also Hotbits - generating random digits through radioactive decay events).
One very critical feature in [m]any modelling system[s] meanwhile... So called "Oranges".. I mean "Rings", ooops, I mean RNGs! A very dear topic to me personally, and a reason for my fascination with applied computing in relation to "ghost in the machine" problems ;)
Nerd comment: glibc has a function called getentropy that provides true random bytes. You can use that if you are on Linux and don't want to use processor architecture specific code. How does it work? It issues a Linux syscall, i.e. it gets the numbers from the OS kernel. The kernel internally uses a processor architecture specific way to get the numbers. So, might be that Linux does what the guy was showing when running on an x86.
My understanding is that Linux still mixes the hardware generator output in with other sources of entropy, including the random timing variations in disk activity, and random fluctuations in inputs like mice and various hardware interrupts. The last time I looked into it in detail though was during the controversy not long after it was first implemented during the whole NSA revelation, when it turned out it was being mixed in in a suboptimal way that meant it was theoretically possible for a specially designed hardware RNG to unrandomise the pool if created by an exceptionally sophisticated attacker.
If you only use it as a seed, you will be seeing numbers from the same sequence. If you let it run long enough, you will start seeing the same sequence of numbers.
It's debatable if these are true random numbers. They may be generated using a high entropy source, but if the universe is deterministic then no true random numbers exist 😄
True, but at the moment our best understanding of physics at the scale used by these systems is using quantum physics, which describes a true random universe, so the debate position that it's deterministic is pretty weak. Plus, even if it is, in order to predict the outcome you would need the "seed" for the entire universe, and fortunately most cybercriminals still have difficulty determining that
How is this better or even different from using the real-time clock at the nanosecond level as a seed? Another source would be any snapshot from your webcam (OK, you need to remove the tape put in front of it to protect your privacy).
Using the clock is a terrible idea. If, for whatever reason, the attacker gets to know approximately *when* you ran the program, they can narrow a brute-force attack on the seed down to a manageable size (even with nanosecond resolution). For instance, TCP packets are timestamped to the nanosecond :-) Please don't do it, it's truly insecure.
These generators aren't supposed to be "better" (I'm assuming you mean more secure) than other sources of quantum effects like sensor noise, they're supposed to be better in the sense of being much higher throughput than other sources of random that are automatically available on a server. It actually used to be a pretty common problem for servers to run out of entropy when doing a lot of cryptographic processing (which for a server applies to most tasks, including serving webpages and even just allowing remote access through tools like SSH if done a lot), which would require the addition of external hardware to generate randomness. That hardware could include a webcam (even covered you could use sensor noise, but you can get more randomness with it uncovered) but then you would have to add a webcam to your server. These days most servers actually get enough entropy just from the CPU, so it's better in the sense that you don't need to have a webcam at all to do it. And the other commenter already addressed why the clock is a bad idea - the entire point of random number generators is to be unpredictable and clocks are the exact opposite of that.
RDRAND uses a metastable entropy source; it clocks entropy at around 3GHz (i.e. random output at 3 billion bits per second) and then performs a lot of cryptographic quality checks. A normal clock is not unpredictable, and you would have problems sampling useful randomness from it at speed. So it is more unpredictable and faster than many naive entropy collection implementations.
My favorite mathematical topic, tbh. I've been obsessed with PRNGs and TRNGs for the better part of three decades. I have some theories of my own. Edit: and having watched the video, I can tell you this does not anywhere cover an actual TRNG. Two computers that are identical in their start time and run time would produce the same result. There are TRNGs that rely on physics, analog components, and even mechanical components to produce truly random, irreproducible results. This video does not contain an example.
I once had a disagreement with my C.S. professor about whether getting numbers from Atari’s POKEY chip on their 8-bit computers was truly random (my position) or pseudo-random (his). POKEY runs independently of the 6502 and updates its 17-bit linear feedback shift register on every clock cycle, while the main CPU is frequently interrupted by ANTIC for video synchronization and by POKEY for external inputs so the number of cycles between reading the random number port often varies. Since there is no seeding and programs are frequently waiting for human input, it’s impossible (or at least impractical) to predict which number will come up when the program looks for one.
It's still a pseudo-random number generator because you could record the delays and recreate it. If it was truly random, given the same state, you would get different numbers. You can't predict it because you're not recording the full state. Doesn't mean it's any less pseudo-y.
It has indeed been re-created … _in emulators._ I can’t imagine how one would re-create the state on the actual hardware using software that’s running natively (i.e. no external circuits that can keep pace with the 1MHz onboard clock.)
On Linux, hardware random numbers feed into the so-called entropy pool, which also uses unpredictable input from the environment as a source for actual randomness. From that, pseudorandom numbers are generated as long as there's enough entropy in the pool.
There's an Italian-style random number generation process: it's done with alphabet vermicelli. However, it's not very practical. Therefore, I won't go into detail.
From what I see/understand, there's RDRAND and RDSEED: RDRAND uses a pseudorandom generator which periodically seeds from a hardware noise source, while RDSEED is a true random generator which is fully driven off the noise source? Certainly interesting to see. Too bad it has worse performance than existing algorithms like the Mersenne Twister, and gives up the ability to reproduce the same results. When evaluated for Monte Carlo usage, Intel's implementation was 20x slower than the Mersenne Twister. And AMD's is even worse, at 3-6 times slower than Intel's implementation...
Technically true random numbers are physically impossible. Hardware that relies on more diverse, real-world input will still give you the same sequence given identical conditions. Producing a random number out of nothing (no input) is not possible.
No. If you amplify the thermal noise in a resistor, or the shot noise in a diode, you get random events that are impossible to duplicate (in other words, you can't get "identical conditions") I suppose if you consider amplification of thermal or shot noise to be "an input", then OK. You can't get truly random numbers out of a deterministic process.
For those wondering: the Intel x86 RDRAND instruction he's talking about gets its seed from sampling thermal noise, so it's technically a pseudorandom generator just like everything else.
Why don’t random generators just use the last 3 digits of the current milliseconds as the seed for a pseudo-random generator? It seems like this would be truly random.
I don't get why truly random numbers are not that useful... I mean, even if they are not "evenly distributed" that's what makes them truly random, right? The "unexpected"... that's what RANDOM means to me.
It would be nice to see a follow-up video explaining how the hardware produces truly random numbers.
Yes. Like the different sources of entropy, as he stated in the video.
Some are temperature and humidity for example
Yes! This! Is it munging environmental measurements, or something like sampling between asynchronous clocks?
@@damian_smith Or cosmic radiation? I know this is sometimes used, because it's incredibly random.
Straight from Numberphile I see
been a long time since i've seen smurfs in the wild
It's often used to seed the OS-level PRNG
A shortwave antenna - literally one tiny quartz diode - is a more reliable TRNG than this gross attempt.
That's because it's computerphile... Not sixty symbols...
I would assume that means that the circuit uses a reverse-biased P/N junction to generate noise (which a reverse biased junction does naturally) and then it clocks that noise through a schmitt trigger to turn it into a stream of 0's and 1's to create the output.
@@aqua-op Lol, yeah I was (and still am, haven't slept) a bit drunk, so I was kinda easy with loose words. But I'm glad because others wouldn't have said it. It wasn't that harsh. He obviously understands what's happening, just didn't explain the important part. It was still interesting when he said such things could have a bias toward one side or one set of numbers. Anyway thanks for your comment. This, (randomness, or lack thereof) is one of my favourite things. 🙂
Overjoyed to see Compiler Explorer being used to show how these things work under the hood!
I was going to ask this question to the general public; this is even better! When his code runs on Compiler Explorer, is he calling a real rdseed?
Interesting he didn't seem to know where the term "Godbolt" comes from
@@SamMason0 who can blame him :) It's a funny thing, many people don't realise... But the site's called "Compiler Explorer"; I can't call it the Other Name...it's too weird :D
@@SamMason0 where does it come from?
@@hwstar9416 it's Matt's surname... It was just his personal site, but the compiler explorer got kind of popular and took over. He gave an interesting talk about it recently at C++ On Sea
An Italian science researcher in a room completely empty apart from a computer and an espresso machine? Yep, checks out 😛
with also an italian dictionary
🤌
Ay, I'm glad-a you liked eet. You wanna some coffee? 😄
It's also nice to have the same seed create the same random numbers in, for example, Minecraft, so I can play in the same world shared by a friend or a website and spawn in the same place, like next to a village. I imagine some other games operate on the same principle in a great variety of implementations, and that many others at least theoretically could.
Also that's why in many games you can't abuse save-loads to get desired rng
I can add the example of Shattered Pixel dungeon and many offshoots of it, for the same reason.
Wouldn't the player's input cause a variable count of numbers to be pulled from the sequence, so the desired loot gets a different number if it isn't decided at game start but relies on the seed? Say, for example, you go where a monster spawns, which has some properties, and then you go for your loot?
The Sims 2 had a very bad random number generator, and you could see the same set of people being chosen. Adding in any variable data like time and object identifiers could improve upon it.
@@j7ndominica051 Easily solved by having the seed used for world generation and a different rng for everything else
@@j7ndominica051 Loot and monsters are not determined from seed only, but all the blocks and structures (villages, pyramids etc.) are fully determined, so it's essentially the same world, it looks the same
True random numbers were friends we made along the way
His Italian-ness is at 11. I actually lol’d when I saw the espresso machine on his desk.
Seed numbers generating the same sequence of pseudo-randomness is very useful in things like VFX, for example, when you need to re-render certain effects but don't want to redo everything from 0... So using the same seeds assures you it will still fit together with everything else... For example when calculating particles, wind, fluid sims, etc., but even the noise from Monte Carlo raytracers.
If you need a different seed every time, just use clock() or time(). The true purpose of true random numbers is cryptography. Imagine you use rand() with some seed: no matter how secure the seed is, the sequence itself is deterministic. Thus you can predict it and break some encryption. But even if the sequence is truly random, then, as was stated, it might not be evenly distributed. That means the numbers must belong to a determinable subset of numbers, and it will be easier to brute-force the key; that turns an almost impossible task into a doable task within a couple of months.
@@Raspredval1337 How exactly does the evenness of the number distribution, or the lack thereof, mean that the following numbers belong to a deterministic subset?
@@Raspredval1337 The problem with using time for your seed is that if the attacker knows when your program started, it gives a rather small set of possible seeds to exhaustively try. Trash for cryptography purposes.
The standard C++ library can use a "true random" engine automatically if available in the hardware, so it's really easy to use in modern C++.
Also, as you mentioned, it is recommended to use it mainly for randomizing the seed.
Sadly, it is implementation defined and not required by the C++ standard to do so.
@@sledgex9 Well, as long as you know whether your IDE does it or not.
Eve Online was seeded on the number "42". Just have to love that.
Well, in Linux you have /dev/random and /dev/urandom, which get entropy from hardware: CPU, hard disk, user inputs, etc. You could say those are "real" random numbers.
Both of these are circumstance dependent though - /dev/random on most Linux distros uses true random numbers and blocks reads when "empty", but iirc can sometimes be plugged into a pseudorandom number generator that continuously reseeds from hardware entropy (eg in environments where hardware entropy is scarce such as VMs which don't always have access to a dedicated hardware entropy generator), /dev/urandom is non blocking and traditional behaviour is to use a pseudorandom number generator once true entropy is exhausted until more true entropy is generated.
Notably, in many BSD implementations /dev/random is just a link to /dev/urandom, so all random numbers generated on BSDs are potentially pseudorandom numbers generated from a true random seed instead of forced true random.
I came for random numbers, I stayed for the strong accent
Pseudo random number generation and seeds were used in computer games all the way back into the 80's. One of the best "adventure" type games was Akalabeth, and the only way you got the same dungeons and overland maps was to use the same "seed."
Which is why when I was writing games back then, I would just very rapidly (in machine code) generate PR number after PR number while waiting for the player's input. Player input is about as random a factor as you can get. Nobody is going to be able to repeatably respond within milliseconds every time.
One of my favorite tricks was to use the refresh register in a Z80 as the seed. Because it was constantly incrementing, I could use it to tweak the randomness of the numbers I was generating, again using the player as the randomizing element. Mainly because the refresh register was a zero-cost (in programming terms) high-speed moving target.
Something I tend to do with my programs is to start out with a "seed" derived from the TOD clock, then generate a small integer, get that many random numbers, then save the last one to a file to be used in future as the "seed" instead of using the TOD clock. The procedure of generating a small integer and then getting that many random numbers and saving the last one as the new seed seems to provide sufficient randomness for most games.
I had a windows-based casino game back in the 90s. In some games, each round started with the same sequence of random numbers, making it easy to cheat.
Would be interesting to hear how the computer collects and uses entropy.
To be as generic as possible, and skipping over how implementations differ: the hardware RNG is a buffer of entropy samples collected from entropy source(s), typically fed by thermal or electrical noise, conditioned for quality and governed by an entropy health check of sorts. These are analog components on the die.
Often this process is so slow that rather than being used directly, the random numbers are actually used to seed pseudo-RNGs, after being conditioned with a Deterministic Random Bit Generator to remove identifiable characteristics of the hardware noise.
alpha scatter from its Plutonium 238 power source.
@@mytech6779 don't mislead the children
@@orbatos You clearly missed the point of the question; you basically didn't say anything at all. Given a biased coin toss, how do you unbias it without knowing the bias a priori? I don't know the answer, but because I don't know the answer, I know you aren't even close to being informative here; if you were, that's where you would have gone. You don't even know enough to know what the fundamental issue is ("collect entropy" is not meaningful).
You might want to have a look at stochastic computing, which exploits cosmic noise to perform approximate computing. You will be fascinated by the fact that a simple AND gate can do arithmetic multiplication by exploiting probability.
02:40 “making the program deterministic and then the bug will appear…”
Threads: hold my beer
Threads: I'm about to end this man's whole career
The real utility difference between RdRand and RdSeed is that numbers from RdRand can be concatenated to make larger random numbers up to 256 bits and still maintain O(2^256) security (128 bits for older chips). With RdSeed you can concatenate the outputs to make as large a number as you like, so 512-bit keys or 1024-bit keys or 10-million-bit keys will support a security strength that large. RdRand offers computational prediction resistance, and O(2^256)+O(2^256) = O(2^257), not O(2^512). This is sometimes referred to as the additive vs multiplicative prediction resistance of RdRand vs RdSeed.
This video is an amazing showcase of speaking with an accent without hindering the understandability of the spoken content whatsoever. I'm astonished by the way he's articulating the sentences in a way that seemingly highlights the premise on its own. Great stuff.
Another situation where is useful to set a seed is for teaching programming or data analysis, so everyone can replicate the expected results and easily verify they did the assignments correct.
I just found out that you made a couple of Computerphile videos!
It was great having you as my Honours Project supervisor :D
Great video!
An interesting followup would be to plot the distribution of the random numbers over a few different tests and compare it to pseudorandom numbers.
The important difference between a PRNG and a TRNG is not the distribution; you can always fix that for either type. What you want is to have no ability to predict the NEXT number in the sequence with better-than-chance probability.
You're encouraged to extract the entropy from these instructions via some cryptographic primitive, e.g. SHA-hash the output. This would remove any bias/"skew" and get you back to a uniform distribution.
@@ed_halley That is just another way of saying that the important difference is precisely the evenness of the distribution. That is because you can not determine the probability of the next number unless you know the distribution.
That's the job of test suites like Dieharder and BigCrush.
@@davidgillies620 Why do both those libraries sound so dramatic
The code is neat, but I don't get the explanation of how the processor makes random numbers.
It seems that a pretty large portion of the numbers displayed after 11:33 begin with the digit 1. I tried to pause the video (pseudo)randomly, and in some instances one third or even half of the numbers start with 1.
It may seem that there is bias, but note that the numbers go from 0 to 2^31, that is, 0 to about 2.1 billion, and leading zeroes are not printed. Every number from 1.0B to 2.0B starts with a 1, which is already almost half the range, and of the numbers below 1B about one in nine starts with a 1 too. That works out to over half the numbers starting with a 1. The same should be true of the sequence generated earlier.
It isn't a distribution that spans some power of 10.
The function he calls gives 32 random bits; they're then converted to an int that spans ±2.1 billion and displayed in base 10, and that's why the leading digits look biased.
I like the ending point regarding even distribution of pseudo random numbers which we as humans interpret as random. But technically an infinite sequence of the same number is possible and acceptable as well.
Woah, a wild tr1p!
*edit:* speaking to that end part, it's kind of like on the Z80, seeding a PRNG with the R register. It's usually limited to 128 possible starting configurations, but still helpful
If the distribution of the seed is non-uniform, won’t that make the distribution of the pseudo-random numbers it generates skewed as well?
No.
To expand on the "no": a pseudo random number generator usually has guaranteed behaviors on the generated numbers **per seed**, given you take a "large" amount of samples. That is, no matter the seed, eventually the generated numbers will approximate a given distribution.
Usually bias in PRNGs come from limited sampling or poor use.
No. Pseudorandom number generators are like mathematical blenders designed to always produce something uniformly distributed. Even if you throw all 0s in there, it will blend them up into something uniformly distributed. However, it is important to keep in mind that if your initial seed is consistently nonuniform, such as mostly 0s with just a few bits changed, this can make it easier for an attacker to guess the seed. Any nonuniform but truly random bit stream can be converted into a uniformly random bit stream using a technique called a von Neumann extractor.
@@amihartz ahh that makes sense, thanks
you could use a Bernoulli Factory or a Dice Enterprise to generate correctly distributed random numbers from ones with an unknown distribution.
For instance, just take two bits and, if they are a head and a tail, say it's a head, and if they are a tail and a head, say it's a tail. If they are both the same, reject.
This way, as long as the random bits are independent, you are guaranteed to have a fair coin, as the head probability and the tail probability effectively cancel out.
A Dice Enterprise is this concept applied to results with more than two states, i.e. numbers rather than just bits.
12 minutes of Mario teaching coding.😂
If my wife's mood could be converted into a number, you would have the perfect random seed generator. It would never be the same.
The 800MHz clock was the clock speed of the uncore logic in the Ivy Bridge CPU in 2011, which is the first CPU to have RdRand. That's just the clock that was available in that area of the chip. Since then there have been hundreds of products with many different clock speeds. The Intel DRNG is run from 100MHz to 2GHz depending on the chip and the power goals for the chip. Those low-power SoCs are usually 200MHz. The high-end Xeons are currently 2GHz because the Xeons have hard performance requirements to meet.
It seems like all the "random numbers" (true generated) are of a similar length digit-wise. I'm curious if they are true-random wouldn't they be pulled from every conceivable length of digit to infinity?
But how does the processor generate the random number? It was mentioned that it uses entropy but that doesn't really explain it.
Johnson nyquist noise, maybe.
It just uses thermal noise.
But how does the true random number generator work? Did I accidentally miss that part?
Not sure. Around 10:45, he says that "we don't know what source of entropy the computer is using".. so maybe the answer is "I don't know how it generates random numbers". He says that a typical pseudorandom generator picks numbers with an equal probability distribution, so that every number is equally possible. But this "true" random generator has a skewed probability(?) (which sounds to me like it would make it 'less' random)?
Maybe I should just research it instead of typing a reply without knowing anything heh.
As far as I know intel uses thermal noise, but we are unable to check what they are actually doing in hardware, it's just too complicated and miniaturized
Electronic Design's "Understanding Intel's Ivy Bridge Random Number Generator" explains the design. A metastable circuit is constantly flipping its value. Then it is sampled, i.e. the randomness is copied from the metastable signal into an unpredictable clocked signal. From that point on, Intel does a bunch of deterministic stuff to assure quality. But at the core of it: have a signal switch value constantly (metastable), then clock it. The clocked signal is unpredictable.
@@randomgeocacher thanks!
All "true randomness" means is statistical independence. A bit stream is truly random if each bit is statistically independent of every other bit, i.e. P(next bit | previous bits) = P(next bit): knowing what came before tells you nothing about what comes next. Often, true randomness is falsely conflated with uniformity, which is the idea that 0s occur with equal probability as 1s. However, the bit stream 0101010101 is uniform yet clearly not random.
Any truly random but nonuniform distribution can also be easily converted into a uniform one by exploiting statistical independence: assign pairs of 01 to 0 and 10 to 1, and throw out pairs of 00 and 11. Since 01 and 10 have equal probability of occurring, you are guaranteed a uniformly distributed set of random bits. This only works if the bits are truly random, because the trick relies on statistical independence and will fail if the bits are statistically dependent on one another; statistical dependence follows different mathematical laws.
The trick to building a truly random number generator is thus to just generate bits which are as independent of each other as possible, where no analyst who sits down and studies a bit stream could find any statistical relations between any of the bits. Often this is done by sampling many different sources, like computers often generate random numbers by sampling fan noise, CPU temperature, keyboard and mouse movements, so on and so forth.
Often they will just take the least significant bits from these sources as well. If they use the entire CPU temperature, then it might be predictable that it would be a larger number in the day and a lower number at night due to when people would typically use the CPU. If you just sample the tiniest fluctuations in CPU temperature, though, then there is not an obvious correlation between what programs are running and those tiny fluctuations in temperature.
You can then combine these many different entropy sources into a single bit stream which should produce statistically independent bits, and either throw it directly into a pseudorandom number generator as the seed, or keep sampling until you have enough to turn it into a uniformly random probability distribution and use that as the seed. Cryptographically secure pseudorandom number generators are so well understood that you can write one from scratch in about 5 minutes (like ChaCha20), making it pointless to use the true random numbers directly. It makes a lot more sense to use them to seed a pseudorandom number generator.
But if the seed is vulnerable, doesn't that also make the resulting sequence produced by the PRNG more predictable than a true random sequence? What's the benefit of a potentially compromised black box over an accumulation of inputs from various likely-random sources that are unlikely to all be compromised at the same time: mic noise, typing timing, mechanical drive statistics, webcam noise, mouse trajectory, IMU noise, etc.?
Yeah, but if the true random number generator has a bias we can't check for, and it is slightly more likely to output 3141579 than any other number, and we use it as a seed, then whatever number is generated from the seed 3141579 would still be more likely to be drawn, so there's still a bias, no?
Can someone explain to me why we cannot know the source of entropy being used? Are these functions also used for cryptographic purposes, and thus knowing it would make them weak or something along those lines?
That would be very bad cryptography if it could be broken by knowing the general source of entropy.
He avoided clouding the demonstration because the details change with each model of processor and are pretty tedious in any case. Mostly it involves collecting the signal noise on analog inputs such as thermal sensors.
How about taking the epoch number to the nearest second at the start of the program and using it as an input to the random number generator? That epoch number is unique and will never be repeated.
I like seeing Compiler Explorer being used as a tool to demonstrate how compilers generate assembly code. It's one of the most useful pieces of free software that ever existed.
What is that red thing on the desk? Tea maker? Good view of it at 8:03.
it's a Nespresso coffee machine. hth
Couldn’t you use the noise on a voltage rail to generate true random numbers?
that's nearly what this does, Intel says they use "heat noise in the silicon".
Is that a curved flat screen? never seen one of them. or is it just camera fish eye?
Very interesting. But does it also work in, for example, Docker containers?
Please don’t use auto-generated captions on videos like this when the speaker is going to be hard to understand. Please take the time to write up a transcript and ensure your video is accessible to all. Thank you!
Would using this true random number generator running locally be a good way to create keys for one-time-pads? For perfect encryption in hyperauthoritarian states like North Korea or China?
Quick question, can we generate true random numbers by emulating an x86 CPU ?
A few years ago I coded my own version of Breakout. To my great surprise, the game played EXACTLY the same way (provided you didn't miss the ball) each time. Same bricks hit, same scores, basically like watching a replay of the previous game. Upon debug, I realized that always starting the ball from the same position was a contributor to this flaw. Also, the game needed some RND when the ball hit the paddle. All in all, I realized that it is VERY complicated to allow a player to learn a strategy in video games. If you allow too much predictability, the game is ruined because the player can end up in a loop.
Shout-out to the one and only Matt Godbolt as always!
I apologise if I am ignorant, but does a seed not introduce a bias? I would love to see a huge sample from your code/hardware not generate a somewhat graphical, repetitive pattern.
Just to be clear on my question: an unbiased seed has to be a truly random seed, which seems like a chicken-or-the-egg type question.
I kind of disagree about the unimportance of a true random number. Using Rand() in Excel (which most analysts do) is not random. True randomness is paramount in statistical surveys where a population must be truly randomly sampled; otherwise it's just an "as random as we can get it" survey.
You can generate true random numbers in software without RDRAND/RDSEED. This is done by pitting the RTC against the CPU. Set a timer to expire sometime in the future and flip a bit between 0/1 as fast as the CPU will allow until the timer expires. The result is a true random bit, because interrupts are not precise: exactly when the interrupt occurs determines the result of the bit-flipping process.
In pseudocode: flip_bit() { bit = 0; then = time.ms() + 1; while (time.ms() < then) { bit = !bit; } return bit; }
As a programmer my problem has always been more about getting deterministic numbers than actual random. You even need to record seeds to be able to reproduce results.
What use is pure randomness?
Cryptography, mostly
Before Linux changed the /dev/random implementation, I had an external USB device that would provide an infinite stream of random numbers to /dev/random via kernel module. It had 2 internal monitors to protect against bias, and would brick the device if too much bias crept in (fail secure).
Around 2005 I ran into the issue that /dev/random would stall due to not enough entropy. It took me ages to find out why unit tests with TLS stalled until I moved my mouse. Then I bought a USB "entropykey" that could provide entropy to /dev/random.
@@kirkanos771 For one, Linux's /dev/random implementation no longer blocks like that; that's why I mentioned the implementation change. Second, the device I had fed the kernel's entropy pool, and thus /dev/random, such that it never blocked. That was the entire point of my comment. Trying to make a gotcha comment when you don't understand what I'm saying is really weird.
...why brick the device? Why not just shut it down or something?
@@amihartz To fail secure. It's a source of entropy. If that entropy ever becomes biased, the secure thing to do is make the device unrecoverable and unusable so that no one will ever use a biased entropy source. Because it wouldn't be entropy at that point.
@@klfjoat It depends on what you mean by "biased." Uniformity has nothing to do with security, an incredibly nonuniform source of entropy is still cryptographically secure as long as successive bits in the bit stream are statistically independent of one another. If by "bias" you mean they become statistically dependent, then yes, you should not use such a source, but if it is nonuniform yet satisfies statistical independence then it does not hamper the security. Also, destroying hardware still seems like a drastic decision and massive waste. There are ways to disable hardware without destroying it...
A skewed source of random numbers is not a problem because if it is truly random (statistically independent) then you can just transform it into a uniform distribution using a randomness extractor.
What is the chair that he is on?
Dude's hoodie looks so comfy.
I thought the same thing lol
Actually, there's a book for that - "A Million Random Digits with 100,000 Normal Deviates", and it is not cheap. But this again brings the question of the importance of the "true random" when the output is used for a momentary placeholder of no additional value or dependencies.
It wasn't particularly strong, especially by today's standards, and considering the limitations of the ROM, but I wrote an encryption/decryption program for my ZX81 where the key was the seed for the PRNG. Today it'd probably be cracked in 2 seconds flat because of the very limited keyspace, but I reckon it was fairly formidable for its time.
Borat explaining random number generation. Very Nice!
My optimisation lecturer mentioned a cool way to generate true random numbers: taking a picture of a decaying banana and using the colour of the banana's pixels as your seed.
Technically any set of photos can be used to create random numbers. A webcam pointed at anything will do it, even in the dark. The classic setup of pointing it at a wall of lava lamps really doesn't _need_ the lava lamps. The reason boils down to how these cameras work: you never get exactly the same values out for each pixel.
@@DampeS8N You don't need the lava lamps, true, but you don't *need* the camera either, since CPUs have built-in generators. The reason for the lava lamps is that CloudFlare needs *a lot* of random numbers, and they're not just relying on sensor noise but actually on the random appearance of the lava lamps themselves.
Wouldn't the resulting pseudo-random numbers also be non-uniformly distributed when the true random (seed) numbers are skewed? In other words, a uniform likelihood function (pseudo random choice) over a skewed prior distribution (true, seed randoms) is still non-uniform?
Came here to comment exactly this. The pseudo random generator will return the same result for the same seed every time, which basically makes it a mapping function where you say "if input is 1, output 9; If input is 3, output 5;etc". Feeding it with a skewed true random number will simply result in another skewed random number.
It makes no sense at all to use it as a seed.
The equally distributed is a property of the PRNG regardless of the set used
@@MatthisDayer I agree, but the PRNG is applied over a skewed set of numbers. The resulting set should also be skewed.
@@thomaswolfe9490 they would be skewed to a set of still equally distributed sequences
What's the advantage of using a truly random seed? Does it have some use in cryptography or something?
Back in the day, you could cycle the sequence, and the errors that that causes can be really hard to notice.
@@DrDeuteron Not sure I follow, can you elaborate a little?
@@Kwauhn. a pseudo random number generator repeats its sequence, at which point it is not random, so calculations based on random numbers respond bizarrely
@@DrDeuteron I see. You're saying that using a hardware random seed makes it harder to predict the behavior of a pseudo-random function, right? Like, if you used a language standard RNG function that uses standard seed initialization, it would be easier to guess the function and predict its outcome?
@@Kwauhn. No, I'm saying if you do a large complex simulation or Monte Carlo run and you call the RNG more times than it has unique outputs, your calculation can get really wrong in subtle ways.
Can you detect at runtime if your processor is x86-based and supports this instruction?
Code compiled for x86 won't run on ARM or POWER. That's a compile-time check. What language are you writing code in?
@@pierreabbat6157 You're right, of course. Compiling for x86 means that you don't have to worry about it, and iirc early on in the video it's stated that this instruction has been supported for a long time.
@@WalderFrey Basically any Intel processor since 2012 and any AMD processor since 2015 should support it. (Technically AMD is planning to roll out ARM CPUs as well, but I don't think any of those are on the market yet, and it wouldn't surprise me if those included an equivalent function, since they're initially going to target the server market where CPU RNG is far more important; desktop users need less entropy and generate their own by interacting with the system.)
CPUID is used to look for X86 feature flags like RDRAND support.
@@pierreabbat6157 My PC runs a 12th gen i7 and it has the instruction, but when I moved the program to my server, which is running a 2nd gen i3, it was crashing. It took me a bit of digging to find that in my cryptography library I had turned on an option to use that instruction, and that CPU did not support it despite being x86.
So at the end of the day, is it not advised to use the "true" random number generator, since it is so random that we can't be sure it's random?
Yes and no. The benefit of PRNGs (particularly cryptographic PRNGs) is that we have mathematical proofs about their output statistics. The best you can get from a TRNG is a physics-based circuit model and output from a statistical test suite confirming said model.
There are two components of picking a random number: the random picking and the distribution from which it is picked. E.g. 1, 1, 1, 1, 1 is a random sequence of numbers from a uniform distribution starting and ending at 1.
Hardware RNG can guarantee that the process is random, but it's really hard to prove that the distribution is uniform (or of any other particular shape), since the distribution is related to the current physical environment of the processor.
PRNGs are mathematical, so their distribution can be determined, but since they're just a deterministic algorithm - we have to pick their initial inputs truly-randomly, otherwise they're predictable.
@@d3line The uniformity of the random number distribution genuinely does not matter in the slightest. If the distribution is truly random, meaning each successive bit is statistically independent of the next, then it is trivial to convert it into a uniform distribution, because independence gives P(AB) = P(A)P(B) = P(B)P(A) = P(BA): the pairs AB (01) and BA (10) are guaranteed to be equally likely even if the overall distribution is not. You can then just map those pairs to 0 and 1 respectively, throw out the pairs AA (00) and BB (11), and you have a uniform distribution. This is called a _von Neumann extractor._
in shader programming, you don't have random numbers or states... making this topic really difficult
Can anyone speak to why all of the numbers were of similar numbers of digits? Shouldn't we expect more variety in number of digits?
Because there are more than 2 billion numbers with 9 or 10 digits. But only 100 million with 8 or fewer digits.
There are 10 one-digit numbers, 90 two-digit numbers, 900 three-digit numbers; the more digits, the more numbers.
Yes; the range [1, 9] contains approximately 10 times fewer numbers than [10, 99], which contains approximately 10 times fewer than [100, 999], and so on. Since numbers with more digits are more numerous, they appear more frequently than those with fewer digits.
He programmed it that way.
At a low level the RNG actually just spits out a string of ones and zeros, so the digit count is determined by the software implementation and how many random binary digits it asks the hardware for
Doing this does expose the intrinsic entropy your computer has. Quite cool. Will you be exploring physical unclonable functions, using SRAMs or memristor-like devices?
This video is supposed to be on “true” random number generation.
And you didn’t talk about how we generate TRNs🤦
Question: If you produce a sequence of pseudo-random numbers using some algorithm, then pick a number from that sequence at pseudo-random, does that make the outcome more random, or less random?
It doesn't change anything.
@@misterhat5823 That's what I'd guess, as well.
@@antiHUMANDesigns It doesn't change anything, because your pool of inputs is already truly random. Regardless of how biased a human is, if the initial pool is truly random then it doesn't matter which number the human picks.
@@mareau2193 If the distribution is perfectly uniform both in the sequence and in the picking, I get that the result should be uniformly random. But what if there's a small deviation from uniform distribution in both algorithms? That is, in both the sequence and the picking algorithm.
Does the result get twice the deviation, or less of a deviation from perfectly uniform? Or is the deviation as large as that of the algorithm with the largest deviation, or perhaps the smallest?
That's more precisely what I'm asking.
Holy hairpieces the wedding planner from Father of the bride is real.
My main takeaway from this video is that what I thought was an exaggeration of the Italian accent is actually the opposite. It's impossible to comically exaggerate the Italian accent, it turns out.
Didn't pay attention to his name nor the room he was in. For me (I'm from Brazil) he sounded like from some place in eastern Europe.
Let's hear you speaking in Italian, or Spanish, or French, or any other language. Your language accent would get in the way...
BORAT! His accent reminds me of Borat speaking. That's it!
This seems quite a superficial and unsatisfying video. The entropy source is known: it's processed thermal noise within the silicon. Talk about that.
He didn't mean to say the source is unknown. He meant the statistical properties of the source are unknown... (as for any true random number generator)
I agree, the whole point of the video is not mentioned, except when the guy asks and he says it's unknown. Poor.
I'm a C++ dev, I *have* used a true random number generator a little bit.
Interesting, but I think I'll stick to using the system clock as a seed, or an unconnected analog input if it's a microcontroller project.
Hopefully not for anything cryptographically secure. Using the system clock is very predictable if the attacker has a sense of the system uptime (they wouldn't need to check many inputs to just brute-force the seed for the PRNG), and an unconnected analogue input would likely be quite biased and generate very little entropy compared to the thermal noise of a high-power desktop CPU.
@@bosstowndynamics5488 For cryptography I would use a well-tested library; it's way beyond my skill set. Normally I just want to avoid getting the same series every time.
Bringing up the possibly distribution-skewed hardware implementation is a good point.
Calling something "truly random" just means that we truly don't know how it is determined.
According to quantum mechanics there is true randomness, not just that we don't understand what is happening. Heisenberg uncertainty principle for example.
@@foible2085 You mean according to your _interpretation_ of quantum mechanics. Quantum mechanics is a deterministic theory just like any others. If you flip a truly random and fair coin an infinite number of times and formed a probability distribution of the results, the results are guaranteed to deterministically converge to a distribution of 50%/50%. While each individual flip may be random, the probability distribution is deterministic. Quantum mechanics gives deterministic probability distributions of what ensembles of systems will converge to according to the Born rule. Whether or not a single system (just one sample, or the behavior of just a single particle when the experiment is only ran once) is truly random is out of the scope of quantum mechanics itself (as a body of mathematics) and a question for philosophy and interpretation. The uncertainty principle could be due to intrinsic uncertainty in the universe but there are also interpretations that posit it is caused by interacting with the particle disturbing its state (see Spekkens' toy model for example).
@@foible2085 Depends on how you interpret quantum mechanics. There are ways to interpret quantum mechanics deterministically, but they are unpopular as they rely on presuming that there is some variable which for some reason is hidden from us by the natural world and so we cannot know of it. (If it wasn't hidden from us, then we could go measure it, and then you would have a new theory rather than just an interpretation.) If we can't observe something, it is simpler to just not presume it exists in the first place.
A true random number is a number that is completely unpredictable and cannot be predicted based on past or present values.
When programming the seed, I slammed my hand on the number pad.
There, it is random now
Random or broken? 😄
@@blucat4 probably both
@@blucat4 can't trace back the sequence if I don't know my own seed! *WHACK*
Works great.
@@XenoTravis Is that the new With_Hand_Attack_Computer_Keyboard method? 😄
ARM has a TRNG in many modern microcontrollers, and it can be called from Arduino code.
Utterly disappointed, the computer doesn’t use quantum physics to create truly random numbers
I saw it mentioned in another comment but I'd like to add some context: hardware RNGs like the one presented here are still deterministic, from a physical point of view.
They are based on chaotic processes (in this case thermal noise in the processor), which are usually modelled using classical physics, and return results we cannot predict, but that uncertainty only comes from the initial conditions of the system, which serve the same function as the seed in a PRNG. Once those are set, the result is deterministic and we can calculate it, in theory.
This means that, especially in cryptographic scenarios, these random numbers are not "unconditionally secure", because an attacker with infinite computing power could potentially reverse engineer the process and obtain the number.
Of course, in practice no one has infinite resources, and there are several levels of security which you can guarantee even with a good PRNG or with any HRNG.
If we are talking about "true" randomness, the only processes that guarantee it are quantum processes. QRNGs are proven to be unconditionally secure because of the inherent randomness of quantum mechanics: no matter how well we can replicate the initial conditions, given the right measurement the result will always be random.
That said, as a physicist who has built a lot of QRNGs, I once again reassure everyone that HRNGs are more than "random enough" for even the most secure applications.
I would argue the thing that makes some TRNGs insecure isn't that the source is thermal noise, but the fact that circuit implementations to sample the noise introduce biases.
The thermal noise effects these hardware number generators rely on are created by quantum effects in the silicon though, so if these are deterministic then the entire universe is deterministic and true random number generators don't exist at all.
See for example, Knuth Vol 2, Seminumerical algorithms (Chapter 3). "Anyone who considers arithmetical methods of producing random digits, is, of course, in a state of sin." John Von Neumann (1951). How hard can it be to generate random numbers? Read chapter 3 -- it's very hard! (See also Hotbits - generating random digits through radioactive decay events).
One very critical feature in [m]any modelling system[s] meanwhile... So called "Oranges".. I mean "Rings", ooops, I mean RNGs!
A very dear topic to me personally, and a reason for my fascination with applied computing in relation to "ghost in the machine" problems ;)
Nerd comment: glibc has a function called getentropy that provides true random bytes. You can use that if you are on Linux and don't want to use processor-architecture-specific code. How does it work? It issues a Linux syscall, i.e. it gets the numbers from the OS kernel. The kernel internally uses a processor-architecture-specific way to get the numbers. So it might be that Linux does what the guy was showing when running on an x86.
My understanding is that Linux still mixes the hardware generator output in with other sources of entropy, including the random timing variations in disk activity, and random fluctuations in inputs like mice and various hardware interrupts. The last time I looked into it in detail though was during the controversy not long after it was first implemented during the whole NSA revelation, when it turned out it was being mixed in in a suboptimal way that meant it was theoretically possible for a specially designed hardware RNG to unrandomise the pool if created by an exceptionally sophisticated attacker.
@bosstowndynamics5488 You are right. Linux mixes multiple different entropy sources.
This dude reminds me of Borat after attending a couple Computer Science lectures in the US&A.
If you only use it as a seed, you will be seeing numbers from the same sequence. If you let it run long enough, you will start seeing the same sequence of numbers.
Quantum computers may be able to ignore the "starting state" issue that is creating this problem.
Anyone who has played the X-COM series since 1993 understands this concept...
It's debatable if these are true random numbers. They may be generated using a high entropy source, but if the universe is deterministic then no true random numbers exist 😄
True, but at the moment our best understanding of physics at the scale used by these systems is using quantum physics, which describes a true random universe, so the debate position that it's deterministic is pretty weak. Plus, even if it is, in order to predict the outcome you would need the "seed" for the entire universe, and fortunately most cybercriminals still have difficulty determining that
How is this better than, or even different from, using the real-time clock at the nanosecond level as a seed? Another source would be any snapshot from your webcam (OK, you need to remove the tape you put in front of it to protect your privacy).
Using the clock is a terrible idea. If, for whatever reason, the attacker gets to know approximately *when* you ran the program, they can narrow a brute-force attack on the seed down to a manageable size (even with nanosecond resolution). For instance, TCP packets are timestamped to the nanosecond :-) Please don't do it; it's truly insecure.
These generators aren't supposed to be "better" (I'm assuming you mean more secure) than other sources of quantum effects like sensor noise, they're supposed to be better in the sense of being much higher throughput than other sources of random that are automatically available on a server. It actually used to be a pretty common problem for servers to run out of entropy when doing a lot of cryptographic processing (which for a server applies to most tasks, including serving webpages and even just allowing remote access through tools like SSH if done a lot), which would require the addition of external hardware to generate randomness. That hardware could include a webcam (even covered you could use sensor noise, but you can get more randomness with it uncovered) but then you would have to add a webcam to your server. These days most servers actually get enough entropy just from the CPU, so it's better in the sense that you don't need to have a webcam at all to do it.
And the other commenter already addressed why the clock is a bad idea - the entire point of random number generators is to be unpredictable and clocks are the exact opposite of that.
You don't need to remove the tape, just bump up the sensitivity to the point you can capture thermal noise
RDRAND uses a metastable entropy source; it clocks entropy at around 3 GHz (i.e. random output at 3 billion bits per second) and then performs a lot of cryptographic quality checks. A normal clock is not unpredictable, and you would have problems sampling useful randomness from it at speed. So it is more unpredictable and faster than many naive entropy collection implementations.
Great video.
My favorite mathematical topic, tbh. I've been obsessed with PRNG and TRNG for the better part of three decades. I have some theories of my own. Edit: and having watched the video, I can tell you this does not anywhere cover an actual TRNG. Two computers that are identical in their start time and run time would produce the same result. There are TRNG that rely on physics, analog components, and even mechanical components to produce truly random, irreproducible results. This video does not contain an example.
I once had a disagreement with my C.S. professor about whether getting numbers from Atari’s POKEY chip on their 8-bit computers was truly random (my position) or pseudo-random (his). POKEY runs independently of the 6502 and updates its 17-bit linear feedback shift register on every clock cycle, while the main CPU is frequently interrupted by ANTIC for video synchronization and by POKEY for external inputs so the number of cycles between reading the random number port often varies. Since there is no seeding and programs are frequently waiting for human input, it’s impossible (or at least impractical) to predict which number will come up when the program looks for one.
It's still a pseudo-random number generator because you could record the delays and recreate it. If it was truly random, given the same state, you would get different numbers. You can't predict it because you're not recording the full state. Doesn't mean it's any less pseudo-y.
It has indeed been re-created … _in emulators._ I can’t imagine how one would re-create the state on the actual hardware using software that’s running natively (i.e. no external circuits that can keep pace with the 1MHz onboard clock.)
On Linux, hardware random numbers feed into the so-called entropy pool, which also uses unpredictable input from the environment as a source for actual randomness. From that, pseudorandom numbers are generated as long as there's enough entropy in the pool.
I can't tell where this guy is from
There's an Italian-style random number generation process: it's done with alphabet vermicelli. However, it's not very practical. Therefore, I won't go into detail.
From what I see/understand, there's RDRAND and RDSEED: RDRAND uses a pseudorandom generator which is periodically seeded from a hardware noise source, while RDSEED is a true random generator driven directly off the noise source?
Certainly interesting to see. Too bad it has worse performance and reproducibility for the same results when compared to existing algorithms like the Mersenne Twister. When evaluated for Monte Carlo usage, Intel's implementation was 20x slower than the Mersenne Twister. And AMD is even worse, at 3-6 times slower than Intel's implementation...
If it is a true random, how is the distribution not uniform?
Technically true random numbers are physically impossible. Hardware that relies on more diverse, real-world input will still give you the same sequence given identical conditions.
Producing a random number out of nothing (no input) is not possible.
No. If you amplify the thermal noise in a resistor, or the shot noise in a diode, you get random events that are impossible to duplicate (in other words, you can't get "identical conditions")
I suppose if you consider amplification of thermal or shot noise to be "an input", then OK. You can't get truly random numbers out of a deterministic process.
But keep a copy of the million random digits book around in case of a power outage. Good to have a backup plan 😉
For those wondering: the Intel x86 RDRAND instruction he's talking about gets its seed from sampling thermal noise, so it's technically a pseudorandom generator just like everything else.
Why don't random generators just use the last 3 digits of the current milliseconds as the seed for their pseudo-random generator? It seems like this would be truly random.
I don't get why truly random numbers are not that useful... I mean, even if they are not "evenly distributed" that's what makes them truly random, right? The "unexpected"... that's what RANDOM means to me.