Floating Point Numbers (Part2: Fp Addition) - Computerphile

  • Published 22 Oct 2024

COMMENTS • 70

  • @StefanH
    @StefanH 5 years ago +33

    I'm quite surprised there is no video on regular expressions yet. Would love one about the history of it and why it is so cryptic

    • @jamegumb7298
      @jamegumb7298 4 years ago +1

      There are different types of regex. Do not forget about that part.

  • @amkhrjee
    @amkhrjee 2 years ago

    The animations on every Computerphile video are one of the most underrated yet most important components. They make the explanations much easier to grasp by visually conveying what the speaker means. In this video especially, the animations were spot on!

  • @robspiess
    @robspiess 5 years ago +7

    I find the easiest way to learn about floating point is via 8-bit floating point. While impractical for actual use, it's helpful to be able to actually see the whole domain. There's a PDF by Dr. William T. Verts which lists a value for each of the 256 combinations.
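
    A minimal sketch along those lines (not from the comment or the PDF), assuming a hypothetical 1-4-3 layout: sign bit, 4 exponent bits with a bias of 7, 3 fraction bits. The exact format Verts uses may differ, but the point stands that the whole 256-value domain fits on one screen.

        def decode_minifloat(byte):
            # Hypothetical 1-4-3 layout: sign, 4-bit exponent (bias 7), 3-bit fraction.
            sign = -1.0 if (byte >> 7) & 1 else 1.0
            exp  = (byte >> 3) & 0xF
            frac = byte & 0x7
            if exp == 0x0:                     # zero and the subnormals
                return sign * (frac / 8) * 2.0 ** -6
            if exp == 0xF:                     # infinities and NaNs
                return sign * float("inf") if frac == 0 else float("nan")
            return sign * (1 + frac / 8) * 2.0 ** (exp - 7)

        for i in range(256):                   # list every representable value
            print(f"{i:08b} -> {decode_minifloat(i)}")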

  • @cmscoby
    @cmscoby 5 years ago +1

    Thank you for explicitly covering this topic. Better than anything else I've found online.

  • @cacheman
    @cacheman 5 years ago +5

    I can't remember if they've done one on radix sorting, but understanding the representational bit pattern of floats is very helpful for sorting them with that family of algorithms.

    • @antonf.9278
      @antonf.9278 10 months ago

      Radix sort is designed around integers but positive floats have the same ordering as ints and can therefore be treated as such for sorting purposes.

    • @cacheman
      @cacheman 10 months ago

      @@antonf.9278 Only one of the two major classes of radix sorts, Least-Significant-Digit/Bit, could correctly be classified as "designed around integers".
      I'm not sure what to make of the rest of your comment; you're only confirming the usefulness of understanding the bit-representation, because without this knowledge, you would not be able to prove your assertion... except by exhaustive testing I guess.
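
      One standard way to make this concrete (a sketch, not necessarily what either commenter had in mind): view the binary32 bit pattern as an unsigned integer, then flip the sign bit of non-negative values and invert all bits of negative values. That gives a key whose unsigned order matches numeric order for all finite floats.

          import struct

          def float_bits(x):
              # Raw bit pattern of x as a binary32, read as an unsigned 32-bit int.
              return struct.unpack("<I", struct.pack("<f", x))[0]

          def radix_key(x):
              # Positive floats already order like unsigned ints; the XORs fold the
              # negatives in below them with the right (reversed) ordering.
              b = float_bits(x)
              return b ^ 0x80000000 if b < 0x80000000 else b ^ 0xFFFFFFFF

          xs = [3.5, -1.25, 0.0, 2e-40, 1e30, -7.0]
          assert sorted(xs) == sorted(xs, key=radix_key)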

  • @kaustubhmurumkar2670
    @kaustubhmurumkar2670 5 years ago +4

    Computerphile is so underrated!!

  • @BGBTech
    @BGBTech 5 years ago

    FWIW: I did an FPU for an experimental CPU core I was working on (targeting an FPGA). It normally works with Double, but only has an ~64-bit intermediate mantissa (for FADD), mostly because the FADD unit was also being used for Int64->Float conversion (reusing the same normalizer; otherwise it could have been narrower). The rest of the bits just "fell off the bottom". Similar goes for FMUL, which only produced a 54-bit intermediate result (with a little bit-twiddling, mostly to fix up rounding). Similarly: FDIV was done in software; rounding was hard-coded; it used "denormal as zero" behavior; ... Most of this was to make it more affordable (albeit not strictly IEEE conformant; most code wouldn't notice).

  • @V1ruzZW2G
    @V1ruzZW2G 5 years ago +15

    Did anyone notice that he wrote the first two 0s on the table at 3:35 :D

  • @thiswasleft27
    @thiswasleft27 5 years ago +15

    Very informative! Thank you for this explanation.

  • @RossMcgowanMaths
    @RossMcgowanMaths 3 years ago

    Fascinating subject. I have simulated 32-bit floating point addition, subtraction and multiplication in Excel VBA, then built the 'circuits' in Logisim. Implementing rounding, subnormals and special values, then testing, is quite involved and can really waste a lot of time. I chased 1s and 0s for months. My coding skills are basic but I got things working well (I think?). To comprehend it mathematically first is the way forward.

  • @matsv201
    @matsv201 5 years ago +15

    There have been quite a few processors historically where the FPU cheated, not having the full 48 bits needed but going for something much smaller, say 36 or 38 bits, rounding off the last ones.
    People who wrote software, especially in the 90s, had to be very careful with this and not trust it too much. This was also one reason why 64-bit became very popular: even if you do cheat, it becomes more accurate anyway.
    Sadly it is quite common to this day for software developers to use 64-bit when it's really not needed. This is especially problematic with GPU acceleration, since some cards emulate 64-bit and run at much less than half speed.
    Also worth saying: 16-bit floating point is actually quite a bit more accurate than people think, and twice as fast on most modern CPUs and some modern GPUs.
    There even exist 8-bit floating point formats, four times as fast. While they are really inaccurate and have a very slim range, when they can be used the performance gain is huge.

    • @godarklight
      @godarklight 5 years ago

      I'm probably wrong, but isn't the hardware FPU for x64 a fixed size bigger than a 64bit double? (Like 80 or 96 or something)? Someone once tried to tell me that 32bit floats had to be cast and were slower than native 64bit float stuff

    • @tanveerhasan2382
      @tanveerhasan2382 5 years ago

      @@godarklight Regarding the last part, maybe that's why most programming languages treat decimals as 64-bit double precision instead of single precision by default? Because, as you said, using single precision is actually detrimental to performance?

  • @valuedhumanoid6574
    @valuedhumanoid6574 5 years ago

    These guys are so good with explaining things to us not so smart people. Well done mate.

  • @lisamariefan
    @lisamariefan 5 years ago

    The explanation is nice and explains why floats are coarse like they are.

  • @JaccovanSchaik
    @JaccovanSchaik 5 years ago +4

    Multiplication isn't really simpler for floats though, because multiplying the mantissas for floats is pretty much the same as multiplying two integers. It's just that the extra step (adding the exponents) is almost trivial.

    • @Para199x
      @Para199x 5 years ago +10

      I think the point was that it was (at least conceptually) simpler than addition of floats. Not that multiplying floats is easier than integers

    • @RossMcgowanMaths
      @RossMcgowanMaths 3 years ago

      All floating point operations are conceptually simple for simple cases, but add in subnormal numbers, rounding and special cases, then testing for errors, and you will soon understand the complexities.
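
      A toy sketch of the decomposition being discussed, using Python's math.frexp/ldexp in place of real bit fiddling: multiply the significands, add the exponents, renormalize. No IEEE rounding or special cases are handled.

          import math

          def toy_fmul(a, b):
              ma, ea = math.frexp(a)   # a = ma * 2**ea, with |ma| in [0.5, 1)
              mb, eb = math.frexp(b)
              m = ma * mb              # multiply the significands
              e = ea + eb              # add the exponents (the nearly trivial step)
              if abs(m) < 0.5:         # the product of two significands can dip below 0.5
                  m *= 2.0
                  e -= 1
              return math.ldexp(m, e)  # reassemble m * 2**e

          assert toy_fmul(1.5, -2.25) == 1.5 * -2.25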

  • @ryananderson8817
    @ryananderson8817 5 years ago +2

    My machine organization class is doing this as an assignment right now thank you

    • @PoluxYT
      @PoluxYT 5 years ago

      Machine organization is a neat name for the subject. Mine is called "Computer organization".

  • @alen7648
    @alen7648 5 years ago +4

    Can you do a video about: rounding and rounding-errors?

  • @enantiodromia
    @enantiodromia 5 years ago

    Amazing... A Computerphile video that uses pen and paper to visualize addition, and not a nice CGI... In the year 2019...

  • @Adamarla
    @Adamarla 5 years ago

    You did the "Double Dabble" video to explain going from bit representation to a string. Could you do a video explaining how to do it for floating point?

  • @JakubH
    @JakubH 5 years ago +6

    what about infinities and NaNs? will there be another video?

    • @kc9scott
      @kc9scott 5 years ago

      Yes, something on that topic would be interesting. While from the standpoint of using FP numbers, Inf and Nan are really nice to have, I imagine that they add a lot of special-case checking into the FP implementation.

    • @totlyepic
      @totlyepic 5 years ago +2

      They're just reserved bit patterns. For anything you reserve like that, you just have to build special checks into the hardware.

    • @EebstertheGreat
      @EebstertheGreat 5 years ago

      Having literal + and - infinities is nice for improper integration in R. At least, I think the infinities there are IEEE 754 infinities.

  • @brahmcdude685
    @brahmcdude685 3 years ago

    even more great stuff. how can i thank you????????????

  • @U014B
    @U014B 5 years ago +4

    2:23 Don't tell Numberphile you said that!

  • @eLBehmo
    @eLBehmo 3 years ago

    Can you please continue this series with decimal floating point math (IEEE 754-2008). You would be the first, for sure ;)

  • @seabo9566
    @seabo9566 10 days ago

    Why are there zero videos on YouTube that have a negative-number example? Why do we always pick the most ideal situation?

  • @YouPlague
    @YouPlague 5 years ago

    Why do you need a 48-bit register for addition? The result is 24 bits anyway and you only preserve the most significant bits, so the lower ones will always get discarded, won't they? So why do the actual addition on those?
    I would presume you only shift to make the exponents the same, then add two 24-bit registers together, then normalize, and you're done with the mantissa.

    • @zombiedude347
      @zombiedude347 5 years ago

      You only need to do a "48-bit" calculation if the addition turns into a subtraction. However, you can't just discard the rest, as those bits are required for rounding.
      You shift, keeping all the bits, then add the first 24 bits as normal (the rest would be unnecessarily added to zero). Re-normalize if needed.
      Then you keep the first 25 bits, replacing the rest with a 26th bit equal to zero if they are all zero, 1 otherwise.
      Assuming the most common rounding (ties to even), you then check the 24th, 25th, and 26th bits and round away from zero if they are (0.11/1.10/1.11), rounding towards zero otherwise.
      If it weren't ties to even, but ties always away, it would instead round away if the 3 bits were (0.10/0.11/1.10/1.11).
      In ceiling "rounding" (positive+positive) or floor "rounding" (negative+negative), you round away for (0.01/0.10/0.11/1.01/1.10/1.11).
      However, in truncate mode, ceiling mode (negative+negative), or floor mode (positive+positive), you don't round away for any combination of the bits.
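
      A small sketch of that "ties to even" decision, operating on an integer significand that still carries its extra low-order bits (a guard bit plus a sticky OR of everything shifted off). The names are illustrative, and renormalizing after the increment is left out.

          def round_nearest_even(sig, extra_bits):
              keep   = sig >> extra_bits                # bits we are allowed to keep
              lsb    = keep & 1                         # lowest kept bit, for the tie case
              guard  = (sig >> (extra_bits - 1)) & 1    # first discarded bit
              sticky = (sig & ((1 << (extra_bits - 1)) - 1)) != 0  # OR of the rest
              if guard and (sticky or lsb):             # past halfway, or a tie with an odd LSB
                  keep += 1                             # round away from zero
              return keep

          assert round_nearest_even(0b1011, 2) == 0b11  # 10.11 -> rounds up to 11
          assert round_nearest_even(0b1010, 2) == 0b10  # 10.10 -> tie, LSB even, stays 10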

  • @TheTwick
    @TheTwick 5 years ago +9

    Noob question: when they measure FLOPS (on a computer) are they performing additions, or subtractions, or multiplications...?

    •  5 years ago +4

      That doesn't really matter, since those operations are often considered as taking one cycle (at least on x86 when considering vector instructions). For example you can do 1 FLOP (addition/multiplication) or 2 FLOPs per cycle (FMA - fused multiply add) - times the width of the vector unit times the number of execution ports times the number of cores etc.

    • @lotrbuilders5041
      @lotrbuilders5041 5 years ago

      I think they expect a normal distribution for some type of program

  • @APaleDot
    @APaleDot 5 years ago +1

    Why do they use an offset of 127 in the exponent to put 0 in the center of the range, rather than just storing an int8 using two's complement? Isn't the math for addition simpler using two's complement?

    • @Roxor128
      @Roxor128 5 years ago +2

      Using the offset approach allows reserving bit patterns for 0, infinity, and not-a-numbers. Without a reserved pattern for 0, you can't store it due to the implied 1 before the radix point. Infinities and NaNs can be useful for figuring out if something went wrong.
      For IEEE 754 floats, an exponent field with all bits 0 and a mantissa of all bits 0 encodes zero. Note that I didn't mention any restrictions on the sign bit, which gives +0 and -0 values.

    • @fllthdcrb
      @fllthdcrb 5 years ago +1

      Not sure, but as far as I understand, basically, it allows for the bit patterns to be lexicographically compared (as long as they're regular positive numbers). You wouldn't get that if the exponent is in two's-complement, as negative exponents would make the number appear greater under such a comparison method.
      Another nice thing about this is that zero gets encoded with all zeros: sign bit is 0, which makes it positive (yes, positive!); exponent is all 0s, which is interpreted as -∞; and significand (or "mantissa", as it's informally called) is all 0s, which means 1.000000.... Thus, you get 1.000000... × 2^-∞, which is +0 (strictly speaking, it's infinitesimal). A bit off-topic, but... flip the sign bit, and you get -0. The zeros in floating-point are signed for this reason, and if you ignore exceptions, dividing them into some positive, non-zero value yields an infinity of the same sign.
      EDIT: Also, the exponent bias doesn't really complicate addition. Remember that to line up the significands, you only need to shift one by the _difference_ between the exponents, which will be the same with or without a bias. Now, multiplication does have a slight complication here, as you need to add their true values together. But it's really a very small price to pay, in the scheme of things.
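
      The stored fields being discussed are easy to inspect from Python by viewing a value as binary32 with struct; a quick sketch (the helper name is illustrative) showing the +127 offset and the reserved all-zeros pattern for zero:

          import struct

          def decode_binary32(x):
              bits = struct.unpack("<I", struct.pack("<f", x))[0]
              sign     = bits >> 31
              exp_bits = (bits >> 23) & 0xFF     # exponent stored with a +127 offset
              frac     = bits & 0x7FFFFF         # fraction, with the leading 1 implied
              return sign, exp_bits, frac

          assert decode_binary32(1.0)  == (0, 127, 0)   # true exponent 0, stored as 127
          assert decode_binary32(-2.0) == (1, 128, 0)   # true exponent 1, stored as 128
          assert decode_binary32(0.0)  == (0, 0, 0)     # the reserved all-zeros pattern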

  • @Bibibosh
    @Bibibosh 5 years ago +3

    This is mentally stimulating!
    PART 3!!!!
    PART 3!!!!
    PART 3 and i dont pee!

  • @Tatiana-jt9hd
    @Tatiana-jt9hd 5 years ago

    so this is numberphile’s sister...

  • @hyf4229
    @hyf4229 5 years ago

    Actually the FADD operation is more complicated than what he says in the video. You must take care of denormalized numbers. Besides, adding two floating point numbers could lead to POSITIVE INF or NEGATIVE INF. If you subtract +INF from +INF, you should generate a qNaN result. All these factors make the hardware that executes FADD really complicated and slow.

    • @RossMcgowanMaths
      @RossMcgowanMaths 3 years ago

      Agreed. Take simple addition/subtraction: the two numbers can each be positive or negative, giving 4 combinations; add or subtract gives you 8; one being greater or less than the other gives you 16. Write a simple equation that takes care of all 16 combinations, giving the correct absolute value and the correct sign. Then add in subnormals, rounding and special cases, then write a test script to check validity, simulate in code, design and build in a CMOS layout, test... Not as simple as adding two numbers. You have to add and subtract every number adhering to IEEE 754. And that's just adding and subtracting.
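
      Python doubles follow IEEE 754 on typical hardware, so the overflow and inf-minus-inf behaviour mentioned above can be checked directly in the interpreter; a tiny sketch:

          import math

          x = 1e308 + 1e308                 # overflows binary64: the result is +inf
          assert math.isinf(x) and x > 0

          y = float("inf") - float("inf")   # no sensible value: a quiet NaN comes back
          assert math.isnan(y)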

  • @elgalas
    @elgalas 5 years ago

    It's time for a React/Vue video... These are driving the web today!

    • @rogerbosman2126
      @rogerbosman2126 5 years ago

      No. This channel covers concepts and technologies, not frameworks. React/Vue is 100% not interesting in this regard, it's just a very popular (and great tbh) implementation of known stuff

  • @alexloktionoff6833
    @alexloktionoff6833 1 year ago

    It's not so simple... In IEEE 754 all exponents are biased; moreover, exponents 0 and ~0 are reserved for special meanings. For addition and multiplication, the hardware /*or software ;)*/ must use additional bits to make the implicit 1 explicit, plus one more for the carry, and then adjust the exponent while handling underflow/overflow corner cases. I can't imagine how this multi-step operation could be done in one cycle.
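
    A rough sketch of those steps for two same-sign binary32 operands: make the implicit 1 explicit, align on the larger exponent, add, and bump the exponent if a carry pops out. The names are illustrative; bias handling, rounding and underflow/overflow are left out.

        FRAC_BITS = 23

        def add_same_sign(frac_a, exp_a, frac_b, exp_b):
            sig_a = (1 << FRAC_BITS) | frac_a      # make the implicit 1 explicit
            sig_b = (1 << FRAC_BITS) | frac_b
            if exp_a < exp_b:                      # align on the larger exponent
                sig_a, sig_b = sig_b, sig_a
                exp_a, exp_b = exp_b, exp_a
            sig = sig_a + (sig_b >> (exp_a - exp_b))
            exp = exp_a
            if sig >> (FRAC_BITS + 1):             # carry out: shift right, exponent + 1
                sig >>= 1
                exp += 1
            return sig & ((1 << FRAC_BITS) - 1), exp   # hide the leading 1 again

        # 1.5 * 2**0 + 1.5 * 2**0 = 1.5 * 2**1: the carry bumps the exponent.
        assert add_same_sign(1 << 22, 0, 1 << 22, 0) == (1 << 22, 1)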

  • @hrnekbezucha
    @hrnekbezucha 5 years ago +1

    Long live fixed point arithmetic

  • @kc9scott
    @kc9scott 5 years ago +1

    3:11 he says "shift this one place to the left", when he's really shifting it right.

    • @kc9scott
      @kc9scott 5 years ago

      It looks like you could either shift the higher-exponent number left, or shift the lower-exponent number right, whichever is easier to implement.

  • @Bibibosh
    @Bibibosh 5 years ago +4

    please make an entire series about binary mathematics. .....
    010111001001100101010101 hahahaah what number did i write? ... now multiply it by 00101 !!!!!!!!! MIND BLOWN

  • @miroslavhoudek7085
    @miroslavhoudek7085 5 years ago +1

    Now go and implement it on TIS-100.

    • @miroslavhoudek7085
      @miroslavhoudek7085 5 years ago +1

      @@merrickryman4853 I can't beat like 50% of the content in that game :-o Really got me to think about my skills gap

  • @josepablogil4943
    @josepablogil4943 1 year ago

    Didn't know Philip Seymour Hoffman was into computers.

  • @clonatul1
    @clonatul1 5 years ago

    What about negative exponents?

    • @rwantare1
      @rwantare1 5 years ago +3

      Before the exponent is encoded, a 127 offset is added (for 32-bit floats), so 2^0's exponent gets stored as 127. Therefore 2^-20 would get stored as 107. So you can have negative exponents all the way down to -126 (-127 + 127 = 0, which is reserved for zero).

    • @adriancruzat2711
      @adriancruzat2711 5 years ago +1

      In this particular 32 bit example, 127 is added to all exponents in order to allow for negative exponents. So for example if you have 2^1 it would be represented as 128 -> '10000000'. For something like 2^-1 it will be 127 + (-1) so 126 -> '01111110'. As for adding them together the process is the same. Say that you want to add 1.0 x 2^2 + 1.0 x 2^-1 (4 + 1/2). You would still shift the smaller number to the right the same number of spaces as the difference (3 spaces) and you would add 1.0 x 2^2 + 0.001 x 2^2 = 1.001 x 2^2 (4.5)
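
      The same worked example in a few lines, with the significands held as integers carrying 23 fraction bits and the hidden 1 made explicit (a sketch only; no rounding or bias handling):

          sig_a, exp_a = 1 << 23, 2      # 1.0 * 2**2
          sig_b, exp_b = 1 << 23, -1     # 1.0 * 2**-1

          shift = exp_a - exp_b          # 3: slide the smaller number's bits right
          total = sig_a + (sig_b >> shift)
          value = total / (1 << 23) * 2 ** exp_a
          assert value == 4.5            # 1.001 * 2**2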

    • @clonatul1
      @clonatul1 5 years ago

      @@adriancruzat2711 Wouldn't it be easier just to use the 2nd bit as the sign for the exponent and have 7 bits to represent the value? It's basically the same thing without the offset.

    • @adriancruzat2711
      @adriancruzat2711 5 years ago +2

      @@clonatul1 In theory, yes, you could represent it using the second digit as a sign digit for the exponent (or, even better, as two's complement to avoid having two values for 0 [-0 and +0]). However, as far as I can tell, the comparison between two floating points is much easier when the exponent is encoded using the bias (offset).

    • @matsv201
      @matsv201 5 years ago

      That is pretty much (sorta, not quite) how you make division in floating points

  • @GilesBathgate
    @GilesBathgate 5 years ago

    42!

  • @skuzzbunny
    @skuzzbunny 5 years ago

    For all you "zero" aficionados out there.....!!!D

  • @BigMcLargeChungus
    @BigMcLargeChungus 5 years ago +2

    Wow I never knew multiplication was more complex than addition for floating-points.