Why Is This Happening?! Floating Point Approximation
- Published 22 Nov 2022
An alternate solution is to use fixed-point arithmetic. For instance, when doing math on money, don't try to use dollars and fractions, but instead use cents. In the example problem, 0.6 would be represented as integer 60, and 0.7 as integer 70. These can be added with perfect precision in a finite number of bits to get 130 cents, or 1.3 exactly. Not only is the precision better (i.e. perfect), but it uses integer arithmetic, which is often faster than floating-point, and orders of magnitude faster than Decimal. This works fine with addition and subtraction, which is the most common math done on money, but still has a round-off issue with multiplication (used in interest calculations).
The old Visual Basic had a Currency data type which was a 64-bit integer with an assumed decimal point and four digits to the right of the decimal point. Modern cryptocurrency like Bitcoin uses something like 10 digits after the decimal point. A transaction would be presented to humans as 'move 1.3 bitcoins', but the actual block in the chain says 'move 13,000,000,000 satoshi'.
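The cents idea above is easy to sketch in Python; the helper names here are just for illustration:

```python
# Fixed-point money: store whole cents as integers, format only for display.
def to_cents(dollars: int, cents: int) -> int:
    return dollars * 100 + cents

def fmt(cents: int) -> str:
    return f"{cents // 100}.{cents % 100:02d}"

a = to_cents(0, 60)   # $0.60 stored as the integer 60
b = to_cents(0, 70)   # $0.70 stored as the integer 70
total = a + b         # exact integer addition: 130
print(fmt(total))     # -> 1.30
```

Integer addition here is exact by construction; the rounding question only comes back when you multiply (e.g. interest), as the comment notes.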
It helps to avoid mistakes due to floating point issues
People fuss about COBOL, but it has had fixed-point numbers available since the day it came out (COMP-3). Actually it's Packed Decimal.
That won't be better: with binary fixed point you reserve n bits for the fractional part (an implied division by 2^n), and you still won't be able to represent 0.6 or 0.7 exactly. And unless you use a very weird division factor (something other than 2^n), precision will be lost at that division.
I'm missing two small things: 1) Decimal computation is slower and uses more memory than floats do. This might or might not matter to you, based on your application (e.g. physics simulation vs finances). 2) The float precision might be fine, and just the final print could be better. Use the shortest round-trip representation, or round before printing.
Funny you'd suggest rounding before print.
1. You'd still have to round in decimal notation, as the video fairly clearly demonstrates why you can't round 1.3 in pure raw hardware binary.
2. Just to add insult to confusion, floating point hardware by default uses a round to even rule...
0.5 rounds to 0
1.5 rounds to 2
2.5 rounds to 2
3.5 rounds to 4
4.5 rounds to 4
...
Seems counterintuitive huh? Go ahead and try it with raw assembly instructions and see what you get...
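You don't need raw assembly to see this; Python's built-in round() follows the same round-half-to-even rule:

```python
# Python's round() uses banker's rounding (round half to even),
# matching the IEEE 754 default rounding mode described above.
print([round(x) for x in (0.5, 1.5, 2.5, 3.5, 4.5)])
# -> [0, 2, 2, 4, 4]
```

Each value here is exactly representable in binary (they're all multiples of 0.5), so this really is the tie-breaking rule at work, not a representation error.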
@@southernflatland Sorry, I was not clear: I suggest rounding to n decimal places (mathematically) before printing. For example 0.1+0.2=0.30000000000000004, then round (mathematically) to, say, 4 places, then print "0.3". Btw hardware has many rounding modes, such as away from zero, towards zero, to +inf, to -inf, to even. In x87 assembly, you can choose several of them (not sure if all).
@@MarekKnapek Indeed you're correct about floating point assembly having multiple rounding modes. I was just pointing out that to the best of my understanding, the hardware defaults to round-to-even mode if left unconfigured otherwise.
If you care about performance or memory then you would not be using Python to begin with :)
@@ABaumstumpf I can't recall what you speak of, can you refresh my cache?...
Just a couple of things I feel you should have mentioned:
1. It's not just the lack of precision. If you enter "1.3", you get back "1.3", not "1.299999999999998" or whatever. Why? Because if you look at the floating-point representation, you'll see the very last bit of the mantissa is in fact a 1, not a 0, breaking the pattern. It ends up rounded up, since the remainder is over half. This ends up being interpreted as being closest to 1.3. However, what you had in the double-precision example, as well as the result of 0.7+0.6, had a 0 at the end, which is no longer seen as 1.3. So basically, it comes down to two things: (1) operations, even addition and subtraction, on numbers with inexact representations (and some operations on exact representations as well) are subject to rounding errors, and (2) the conversion to decimal is extremely sensitive (possibly too much) if you don't limit the precision.
2. Decimal floating-point is great, *IF* you need the exactness of staying within a decimal representation *more* than you need speed. So, for instance, financial calculations benefit from something like decimal types, or fixed-point. But if a loss of precision of a few bits is a perfectly acceptable tradeoff for crunching numbers faster, then you should definitely stick with binary floating-point, since it actually fully uses the hardware, and just limit the precision you output. You shouldn't simply say, "Use decimal types", without explaining the tradeoff and when it's appropriate to use each type.
Great explanation that was both detailed and concise; that definitely deserves a like and subscribe from me
Thank you for this great video, I was struggling to really understand the issue with this until I watched it!
Provided a problem, explanation and solution. Love it, keep going!
Thank you! Clear, Precise and Succinct.
An explanation on how decimal works to fix the problem would be very interesting.
I think it just multiplies and divides, e.g.:
0.7 -> (7 / 10)
0.6 -> (6 / 10)
((6 + 7) / 10)
(13 / 10) -> 1.3
So it just splits it up into "Nenner" and "Zähler" and then, for addition and subtraction, only executes calculations on the "Zähler". If it is multiplication or division, it is also done on the "Nenner".
So multiplication of 0.6 and 0.7 is:
0.6 -> (6 / 10)
0.7 -> (7 / 10)
((6 * 7) / (10 * 10))
(42 / 100) -> 0.42
Because of "Nenner" and "Zähler":
I'm German and I don't know the scientific words in English. If you have a "Bruch" (a fraction, the division with the horizontal line), the "Zähler" (lit. counter) is above the line and the "Nenner" (namer) is below.
In:
1
----
3
(so one third)
1 is the "Zähler" and 3 is the "Nenner".
@@jojojux cool. "Zähler" would be numerator and "Nenner" would be denominator.
@@mahmudoloyede881 thanks :)
@@jojojux A "Bruch" would be the numerator divided by the denominator, so you were right about the line: it literally just means divide, or "upper number divided by lower number", if you were still curious.
@@hengry2 thank you :)
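The numerator/denominator approach described in this thread is exactly what Python's built-in fractions module does (Decimal actually works differently, but the idea of exact rational arithmetic is the same):

```python
from fractions import Fraction

# Addition: common denominator, add numerators. Note Fraction
# auto-reduces, so 6/10 is stored as 3/5 internally.
print(Fraction(6, 10) + Fraction(7, 10))   # -> 13/10, i.e. exactly 1.3

# Multiplication: multiply numerators and denominators.
# 42/100 is reduced to 21/50.
print(Fraction(6, 10) * Fraction(7, 10))   # -> 21/50, i.e. exactly 0.42
```

Because both numerator and denominator are arbitrary-precision integers, no rounding ever happens until you convert back to a float.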
best explanation ive ever seen about this topic, you are very underrated and deserve at least 10 mil
Keep up the shorts so more people can find you. Love the clarity
this was greatly educational and entertaining, thank you!
Note: having a 32 bit or 64 bit computer/OS has very little to do with precision of numbers used. You can use 64 bit floats on 32 bit machine and vice versa. It's all down to the programmer.
Examples:
1. Most game engines will use 32 bit floats for object positions because you don't usually need extra precision and you'll have to convert to 32 bit for GPU rendering anyway.
2. JavaScript will use a 64 bit float for its "Number" data type regardless of what OS or hardware it's running on.
Financial software has always used integer math (in cents not dollars) to avoid floating point problems. But you still need to be careful when calculating interest rates and taxation.
underrated channel, this helped me SO much
Very clear explanation! Also would love to know the font you are using, I like the nostalgia of it!
thanks for the amazing video! What theme do you use on vscode?
I really really love your content, keep it up :)
You are a great teacher! Amazing explanation
Hey dude, could you tell us your color scheme and font for VScode?
They're super clean
I’m doing a physics simulation and drawing it in a turtle window and this is exactly what I’m missing, thanks!
If only computers supported a decimal-based floating point, like in many scientific and graphing calculators. Some computers do, but I mean like as a value type.
Great explanation 😊
Good overall view. But a few factual errors.
The reason for adding 127 in the IEEE 754 standard instead of 128 as used in some older floating point formats is because an exponent of 0 or 255 are special. The effect is that the binary exponent ranges from -126 to +127, not -127 to +128.
An exponent of 0 is for handling 0 and denormalized numbers. From your point of view, the implied and not-stored leading bit for denorms is a 0 instead of the 1 you would see for other exponents. And an exponent of 255 is for handling infinity and Not a Number (NaN), to indicate and propagate errors such as square roots of negative numbers, dividing 0 by 0, etc.
Other than that, good job.
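The same reserved exponent patterns exist in 64-bit doubles (11 exponent bits, bias 1023, all-ones value 2047), and you can inspect them from Python by reinterpreting the raw bits; a quick sketch:

```python
import struct

def fields(x: float):
    """Return (sign, biased exponent, mantissa) of a 64-bit double."""
    bits = struct.unpack('>Q', struct.pack('>d', x))[0]
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)

print(fields(1.3)[1])            # 1023: biased form of exponent 0
print(fields(float('inf'))[1])   # 2047: the reserved all-ones exponent
print(fields(0.0))               # (0, 0, 0): the reserved all-zeros case
```

For single precision the same trick works with format codes '>I'/'>f', an 8-bit exponent mask of 0xFF, and bias 127 as described above.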
The issue of not being able to exactly represent a fraction is universal to any number base you use. The issue happens whenever the divisor has an uncanceled prime factor that's not in the base you're using. Base 10 (or decimal) has the prime factors of 2 and 5, so any division with a divisor having just those prime factors will eventually terminate. Base 2 (or binary) has only the prime factor of 2, so a lot of fractions that terminate in decimal will not terminate in binary. But as you said, there's lots of fractions that won't terminate in decimal either such as divide by 3.
If only I had known this trick when I did my examinations in school. I found this out only 3 months ago, in the 1st year of my university course. And I didn't even realise why I needed to know about the mantissa and so on.
So helpful, thanks :)
What is the best way to deal with this issue in java (eclipse)?
Also, why do integers not seem to have this problem? Shouldn't a 3 be hard to represent no matter what?
Then what does the Decimal library put on the table to get it right, if there is no gain even from increasing the number of bits that you can use for accuracy?
what vsc theme are you using?
whats ur vsc theme?
Damn best explanation ever, do you mind explaining how the built in library works.
What theme/colour do you use on Visual Studio Code?
What's the theme you're using?
Would be cool to then explain what decimal does to make it 'work' ...
Thanks for your sharing
Very nicely explained 👏
I can't describe how much I liked that video. Really, great job. It's important to teach code monkeys using Python something more about binary numbers and representation. Great example at the end too!
Aren't all coders code monkeys?
I've never seen birds or snails typing code.
@@lorax121323 nope, nocturnal coders are owls
If your exponent were -127, you would get into a "denormal" representation. For most output, however, it is sufficient to use a display precision that is less than the stored precision.
can you please tell what font are you using?
I program a little in C++ and I just tested it: using the g++ compiler this wasn't a problem for me, both when I used floats and doubles. Why is this?
Make a video explaining how the decimal package works
In C++ this doesn't occur, unless we use "printf" and manually increase the number of decimal places shown, which bypasses the default rounding and exposes the error.
Does it affect the training of neural networks in Python?
hi mate, whats the editing software u use to show all the tables etc?
Hello, I just used PowerPoint
That was very cool! In trying that all out, my phone calculator worked perfectly and so did my Windows calculator.
It's too bad the floating point standard is not BCD.
That's because they use the decimal version of floats. Decimal floats use DPD (not BCD, which is really naive) for the mantissa.
Easier way imo would just be to represent the money as an int in cents and just add the decimal later. 60 + 70 = 130 cents
Sorry, what font do you use for code?
what theme is this?
What program are you using to Write your code can you send me the link
it's all about the strings, babyyyyyyyy
how does the Decimal function fix the problem?
Always fun haha doing ints to floats without losing anything...
what is that python extension?
So do I always have to use the Decimal library?
I understand the method used to represent Floating point, but to my mind it would make calculations a lot easier if the decimals were converted to integers first by multiplying by exponents ie. 0.6x10^1 + 0.7x10^1 = 6+7 = 13x10^-1 = 1.3 Obviously I'm missing something as the computer industry doesn't use this system, except maybe financial software.
that's why you should never use floats. If you really need to, then you should probably use libraries that store numbers in numerator/denominator form (this is probably what Decimal does).
Can we have a brief explanation on how the decimal function works?
This is just my guess: it converts the fraction into strings (or chars, to be exact) and processes them one by one as integers, just like when we do addition in elementary school. E.g. 0.6 turns into '0', '.' and '6', and 0.7 turns into '0', '.' and '7'; then it adds the '7' and the '6' and so on.
Is that correct though? Cmiw!
@@rizalardiansyah4486 Sounds like a good idea, not sure if correct though.
@@rizalardiansyah4486 I'd think they'd just convert from string to float and add trailing decimal values afterwards, offsetting by tens of course.
Each digit is stored in a 4-bit nibble. The values run from 0000 (0) to 1001 (9); other data holds the length of the digit array, the sign, and the decimal point position. Math is done (as another said) like you learned in school: if adding, align the decimal points and go column by column. Subtraction is the same. Multiplication and division and higher functions are harder but can be done. Many scientific calculators work on the decimal principle and the math routines are well known and optimized.
Why don't the last few bits get rounded off when converting back to decimal?
Is there any way to convert binary scientific notation to decimal scientific notation? I mean like, a number
L.LLL* 2 ^ LL
to
1.5 * 10 ^ 1
The workaround is at 4:58
Does this only happen in python ? Coz I don't see such issue in c++
So when dealing with a lot of decimal data, should we systematically use this to avoid the floating point problem? Let's say for example we're multiplying and dividing vectors and matrices with thousands of numbers inside...
Should we not care, or systematically define our vectors as Decimals when declaring them?
If you are dealing with a high precision application such as accounting or certain engineering applications, I would recommend using the Decimal library.
@@b001 thank you for your response 😁😁it is in fact high precision engineering... Digital image correlation and I've been having some troubles moving from octave to python 🙏this would really help thank you
@@lainiwakura3503 "it is in fact high precision engineering"
Then you shouldn't do the calculations in python to begin with.
@@ABaumstumpf well that's what I said to my supervisor and his response was " not really sure about that tho " 🤷🤷🤷
I feel like this could be solved with the round function in python
Could you clarify the part about multiplying the decimal side by 2, etc.? What's the rationale behind this? Thanks
If you take the fractional part of the number and multiply it by two, checking whether the result reaches 1, you find out binary digit by binary digit which powers of two (0.5, 0.25, 0.125, ...) fit inside the fraction. If the doubled value reaches 1, that binary digit is a 1 and you subtract the 1 before doubling again; otherwise the digit is a 0. Try this exercise with 0.75 and 0.25 and see how it works out.
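That repeated-doubling procedure can be written out directly. Using Fraction keeps the arithmetic exact so the repeating pattern is visible (the function name is mine):

```python
from fractions import Fraction

def frac_to_binary(x: str, bits: int = 20) -> str:
    """Binary expansion of a decimal fraction by repeated doubling."""
    f = Fraction(x)          # exact, e.g. '0.6' -> 3/5
    digits = []
    for _ in range(bits):
        f *= 2
        if f >= 1:           # a power of two fits: emit 1, remove it
            digits.append('1')
            f -= 1
        else:
            digits.append('0')
        if f == 0:           # expansion terminated
            break
    return '0.' + ''.join(digits)

print(frac_to_binary('0.75'))  # -> 0.11 (terminates)
print(frac_to_binary('0.6'))   # -> 0.10011001100110011001 (repeats forever)
```

0.75 terminates because it is 3/4; 0.6 cycles through the same remainders and never ends, which is exactly what forces the round-off in the video.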
Hi Sir,
In the code below I don't understand how lst1 ends up with the element 6 appended, when we are appending only to lst2.
Can you please clarify? Thanks
lst1=[1,2,3,4,5]
lst2=lst1
lst2.append(6)
print('lst1- ',lst1)
print('lst2- ',lst2)
Output-
lst1- [1, 2, 3, 4, 5, 6]
lst2- [1, 2, 3, 4, 5, 6]
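What's happening there is aliasing, not a floating point issue: lst2 = lst1 copies the reference, not the list, so both names point at the same object. A small sketch:

```python
lst1 = [1, 2, 3, 4, 5]
lst2 = lst1          # no new list is created: both names refer to one object
lst2.append(6)
print(lst1)          # [1, 2, 3, 4, 5, 6] - visible through either name

lst3 = lst1.copy()   # a shallow copy is an independent list
lst3.append(7)
print(lst1)          # [1, 2, 3, 4, 5, 6] - unchanged this time
print(lst3)          # [1, 2, 3, 4, 5, 6, 7]
```

Use lst1.copy(), list(lst1), or lst1[:] whenever you want changes to one list not to show up in the other.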
Why is it printing 1.3 directly on my env?
Any idea?
why is it not just decimal, but the point is floating?
Why does this problem seem to always happen in addition? What will happen if x=1.3; print(x);?
Does anyone know what font is he using?
Excellent insight and explanation. But, big picture, given the compute capacity of chips and the talent of the engineers, this is a problem that should be solved and no longer an issue. Period. As we move ahead with using larger and larger data sets to predict with more and more accuracy, these issues are a serious drag.
thanks
I like your vscode/vscodium editor font very much, can you tell it's name please :)
i think its synthwave '84
I think the font is Brass Mono
EDIT: Nvm the guy below me is probably right
I think the font is Comic Sans
Why don't computers store 1.3 as 13 and then move the exponent by minus 1?
Thank you
Why does it work in C++?
could you give a tutorial on how you got your code to appear in the output panel at the bottom instead of the terminal? whenever I click the run button, it runs my code in the terminal and not the output panel. how would i change this?
It is the default that it is shown in the "Terminal" panel. What happens if you press "Terminal" > "New Terminal" in the top bar of VSCode?
So two questions: 1. How does the Decimal function work? 2. What about a continued fraction representation?
I think (that's how I'd do it) it converts the numbers to fractions.
eg. Addition:
0.7 + 0.6
0.7 -> 7/10
0.6 -> 6/10
(6 + 7) / 10 = 13/10 = 1.3
Eg. Multiplication:
0.6 * 0.7
0.6 -> 6/10
0.7 -> 7/10
(6 * 7) / (10 * 10) = 42/100 = 0.42
This dude deserves a Nobel prize. Very clear explanation.
I agree. I've seen a lot of explanations that just say "you make a trade off between space efficiency and precision" but I never understood WHY you lose precision and why even small numbers get messed up. It makes sense that you just can't represent every number with floating point binary the way that it's made.
Wow that was the best explanation of IEEE 754 I have ever seen!
But I'm confused what the decimal library does different and why it can represent decimal numbers without these issues.
Same, I'm guessing it just stores the integer and decimal separately? Perhaps using up more memory?
Based on my very brief look at the library, it appears it just stores each decimal digit as its own entry in a tuple. (I'm no expert, I could easily be reading this wrong)
class DecimalTuple(NamedTuple):
    sign: int
    digits: tuple[int, ...]
    exponent: int
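That tuple is indeed what Decimal exposes through its as_tuple() method: a sign, the decimal digits, and a base-10 exponent, so 1.3 is stored as +13 * 10^-1 with no binary conversion involved:

```python
from decimal import Decimal

print(Decimal('1.3').as_tuple())
# -> DecimalTuple(sign=0, digits=(1, 3), exponent=-1), i.e. +13 * 10^-1

# Because the digits are base 10, this addition is exact:
print(Decimal('0.6') + Decimal('0.7'))  # -> 1.3
```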
You can just use math.isclose(), which compares two values with a specified relative or absolute tolerance.
math.isclose(0.3 + 0.7, 1.0)
What are you saying at 3:06? Right of the left-most 1? What left-most 1? Where is the number? There is no number there. Am I not seeing something here?
I finally gained some knowledge to brag on🤣
Why wouldn't you multiply both numbers until they're natural numbers?
or just use round(num, 2); it's in the standard Python library, meaning you don't need to install or import anything.
What if you need more than 2dp of precision? I think that's the point here, rather than whether or not it looks nice.
The decimal package is also a part of Python standard library, IIRC.
2:51 awww fuck.... he just mentioned THE MANTISSA...........Shit just got real
Or another syntax:
x = 0.6
y = 0.7
z = (10 * x + 10 * y) / 10
Instead of 1.2(9), we have 1.3 ;)
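This trick happens to work here because 10 * 0.6 and 10 * 0.7 round to exactly 6.0 and 7.0, but that's luck of the rounding, not a guarantee for every value:

```python
x, y = 0.6, 0.7
print(x + y)                    # 1.2999999999999998
print((10 * x + 10 * y) / 10)   # 1.3

# Why it works: both products land on exact integers.
print(10 * x == 6.0, 10 * y == 7.0)   # True True
```

With other inputs the scaled products may themselves carry rounding error, so this shouldn't be relied on as a general fix.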
I just don't understand why the computer doesn't add the numbers ignoring the decimal point, then refactor it in after. It would probably take longer to make it "0.6 + 0.7" > "6 + 7" > "13" > "1.3" and you'd need more space to store the information of the decimal point, but you'd never run into the issue of infinite numbers.
Because the computer isn't working in decimal internally - it's working in binary floating point. There is no 0.6 in the computer here, there's a floating point number that's very close to 0.6
Floating point works very fast in computers, and it isn't inherently worse than decimal. Floating point can't represent 0.3 exactly, but then decimal can't represent all numbers either, as the video said you can't write 1/3 exactly as a decimal.
And there are lots of applications where 1.29999999999999982236431605997495353221893310546875 is more than close enough to the answer to 0.6+0.7. If you're calculating anything about the real world you know that any measurements you're using are almost certainly far less precise than that to start with.
@@barneylaurance1865 yeah I acknowledged that, but the thing is, the computer might not know what 0.3 is, but it absolutely knows what 3 is: 11. It also knows that One is: 1... Both those numbers are in binary.... Nothing other than "it'd take 2 memory slots instead of one" stops the computer from storing something like 0.3 as "3, 1 decimal point."
I understand most processes defaulting to very close approximations of numbers in binary, it just baffles me that, for finances, we don't do this.
Yes there are reasons for it, mostly the fact that an abundance of memory and processing power is mostly a new fact where earlier computers didn't have that, I was just pointing out that it's weird that, for instance, we don't have a few OS made specifically for financing where floating numbers are, well, the actual numbers.
@@mordirit8727 floating point is only one of many ways programmers can choose to store numbers. If they know what they're doing they generally won't use floating point for finance. They can do something like what you said - if 0.6 means £0.60 then they can systematically write the program to store that as the integer number 60 instead of any floating point number, and make sure all output routines in the program display it as £0.60.
Or if you want to use two memory slots you can have a system where you use two integers, and store the number as a fraction, 6 / 10 (or simplified to 3/5). The point is there are lots of options, what the video shows here isn't how computers work universally, it's just one very widely used system, that's good for some applications and bad for others.
Since we're on UA-cam video processing is probably a very good example of where floating point is a good choice. If you want to process a video file, to make the colours look better / or more realistic, make it brighter or darker or whatever you might have to multiply the numbers for all the pixels. You want it do be done quickly and fairly precisely, but if it's one part in a billion brighter or darker or redder or bluer than it should be no-one will care.
@@barneylaurance1865 yeah most I've seen on financing works around the issue by using integers to count cents instead of floats for higher currency (although some applications require fractions of cents it's always possible to just shift to "thousandths of a dollar", for instance), only ever turning things into decimal fractions when giving feedback to the user but doing all the internal math with integers
I just understood x and y cause I'm a biology student
but why does Python show 1.3 when I simply enter it in the terminal?
Whats the name of your theme?
SynthWave '84
Ive had this issue and in this situation I would’ve done (7+6)/10 or ((0.7*10)+(0.6*10))/10
What's the approach in c or cpp?
float x = 0.6, y = 0.7;
printf("%f\n", x+y);
result is 1.300000
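C++'s printf("%f") defaults to 6 decimal places, which hides the error; asking any language for more digits reveals the same stored value. Shown here in Python:

```python
total = 0.6 + 0.7
print(f"{total:.6f}")    # 1.300000 - what C's printf("%f") shows
print(f"{total:.17f}")   # 1.29999999999999982 - the error was there all along
print(repr(total))       # 1.2999999999999998 - shortest round-trip form
```

So C++ isn't computing a different answer; it's just printing fewer digits. In C, printf("%.17f", x+y) exposes the same thing.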
Why don't floating point numbers use a lot of zeros?
I got this question on my midterm and got it wrong 😢
x = 0.7
y = 0.6
print(x+y//1)
Output : 0.7
Why ??
haha, why am I getting 1.3 by default
lol i had that in my class like a week ago
This makes me want to never code in my life 😂
x = 7
y = 6
print((x+y)/10)
It's even weird with Decimals too. Check the difference between these 2 similar looking code snippets:
x = Decimal('0.6') ; y = Decimal('0.7') ; print(x+y)
x = Decimal(0.6) ; y = Decimal(0.7) ; print(x+y)
That's because when you use 0.6 without the quotes, certain steps happen in order. First, the 0.6 is converted to a binary 64-bit float, at which point the round-off discussed in the video happens. Then, that already-rounded number is converted to a Decimal object. When '0.6' is passed as a string, it is never converted to 64-bit float. I haven't looked at how Decimal works, but I presume that its constructor has a special case to handle strings to convert directly to the internal form that Decimal uses.
@@kwan3217 Thanks for the explanation, I knew that.
In quote it’s simply doing string manipulation
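The difference is easy to see directly: Decimal(0.6) inherits the float's rounding error because the binary conversion happens before Decimal ever sees the value, while Decimal('0.6') is exact:

```python
from decimal import Decimal

print(Decimal('0.6') + Decimal('0.7'))  # -> 1.3, exact decimal arithmetic
print(Decimal(0.6))                     # the float's true value, not 0.6
print(Decimal(0.6) + Decimal(0.7))      # not 1.3 - the error carried over
```

This is why the usual advice is to construct Decimal from strings (or integers), never from float literals.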
It certainly casts doubt on the claim that 0.99999... is equal to 1!
Wouldn’t changing 0.6 to 0.60 fix this?
Nope, because the issue is the conversion to base 2, not the calculations themselves.
People in the comments are true genii