how floating point works

  • Published 31 May 2024
  • a description of the IEEE single-precision floating point standard
    / hbmmaster
    conlangcritic.bandcamp.com
    seximal.net
    / hbmmaster
    / janmisali

COMMENTS • 1K

  • @aliasd5423
    @aliasd5423 2 years ago +1482

    Fun fact about the imprecision of floating points: if you’re in an infinitely generated video game, going out further and further will eventually lead to your player “teleporting” around instead of walking, since the subtlety of your individual steps cannot be represented when your player coords are so big. That's why some infinitely generated games move the world around the player instead of the player around the world; that way everything is processed close to the coordinates 0,0,0 for the greatest precision
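To make the "teleporting" effect concrete, here is a minimal Python sketch. It assumes player coordinates are stored as single-precision floats; the helper names `to_f32` and `walk` are made up for illustration, and real engines differ in the details.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def walk(start: float, step: float, n: int) -> float:
    """Take n steps of size `step` from `start`, rounding to float32 after each one."""
    pos = to_f32(start)
    for _ in range(n):
        pos = to_f32(pos + step)
    return pos

near = walk(0.0, 0.1, 10)          # close to the origin: ends up very close to 1.0
far = walk(16_777_216.0, 0.1, 10)  # at 2**24 neighbouring float32 values are 2.0 apart
print(near, far)
```

Near the origin the ten small steps add up as expected. At 2**24 every step rounds back to the starting value, so the position only changes once a step is large enough to reach the next representable float.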

    • @samagraarohan2513
      @samagraarohan2513 1 year ago +229

      That makes so much sense in the context of minecraft! As you go farther away from 0,0,0 the game starts bugging more. Neat!

    • @voidify3
      @voidify3 1 year ago +113

      The Outer Wilds isn't infinitely generated but it uses the latter system. If you move away from the solar system for long enough, and bring up the map, the map will get flickery and glitchy

    • @1unar_eclipse
      @1unar_eclipse 1 year ago +90

      @@samagraarohan2513 Obligatory that's mostly on Bedrock. Java has distance effects too, but Bedrock's are far more apparent (see: falling through the world lol).

    • @hillaryclinton2415
      @hillaryclinton2415 1 year ago +37

      Real life works this way

    • @keiyakins
      @keiyakins 1 year ago +29

      assuming it's using floats yeah. there are other ways to represent numbers, but they have other problems - usually being slow to work with and taking a lot of ram as they get longer (on either side of the radix point)

  • @codahighland
    @codahighland 2 years ago +2641

    For anyone curious: The generalization of a decimal point is known as a radix point.

    • @ferociousfeind8538
      @ferociousfeind8538 2 years ago +56

      Is that pronounced [ɹadɪks] or [ɹeidɪks]?

    • @codahighland
      @codahighland 2 years ago +112

      @@ferociousfeind8538 It's [ɹeidɪks] in American English, at least. I couldn't say what the appropriate Latin pronunciation would be.

    • @Cheerwine091
      @Cheerwine091 2 years ago +40

      Rad!

    • @doublex85
      @doublex85 2 years ago +34

      Hmm, I'm not sure I ever heard anyone say it aloud so I usually read it to myself as /rædɪks/. Both Merriam-Webster and Cambridge only offer /reɪdɪks/, which is a bit distressing.

    • @stevelknievel4183
      @stevelknievel4183 2 years ago +125

      @@ferociousfeind8538 I don't think I've ever seen someone ask about pronunciation in the comments section on YouTube and then someone else give an answer where both commenters know and use IPA. It stands to reason that it would be on this channel though.

  • @nickbb6185
    @nickbb6185 2 years ago +1377

    ive often found the "pretend that youre inventing it for the first time" method of teaching to be really effective, and i feel like this video is just such an excellent case study. math is not my strong suit but i still found it easy to follow because of that framing and the wonderful visualizations. thank you!

    • @mihailmilev9909
      @mihailmilev9909 1 year ago +15

      I know right. I find this works really well for me for math and science concepts like quantum physics, relativity, and number theory that I've been looking at recently. It's like the whole style of math explained like numberphile, 3Blue1Brown and Matt Parker!

    • @mihailmilev9909
      @mihailmilev9909 1 year ago +5

      Also "Physics Explained", found him recently, which is what put me on this theoretical physics streak. PHENOMENAL channel. Criminally not known enough. Though only praise and praise in his comment sections

  • @nerdporkspass1m1st78
    @nerdporkspass1m1st78 1 year ago +403

    4:43 The moment I realized this incredible fluid and clean visualization was actually RECORDING EXCEL when he typed in the function blew my mind. I can’t imagine how these videos are made.

    • @schhhh
      @schhhh 11 months ago +20

      excel is pretty neat, to put it lightly

  • @emilyrln
    @emilyrln 2 years ago +256

    "And that's a really good question."

    "The other problem with-"
    😂
    This wasn't just informative and really well-explained, but funny to boot!

    • @Vespyro
      @Vespyro 1 year ago +7

      I lost it at this part!

  • @doommustard8818
    @doommustard8818 2 years ago +1484

    The reason 0 represents positive and 1 represents negative has to do with the fact that signed/unsigned standards with overlapping range will be the same if you do this for integers and the standard was carried over to floating point.
    So for 8 bit integers the range from 0->127 can be represented in both signed and unsigned standards. Because it would be convenient to represent them the same way, someone made the decision to do that. The result is that positive numbers in signed integers have a leading 0 and negatives have a leading 1.

    • @Henrix1998
      @Henrix1998 2 years ago +70

      However that's not how CPUs actually do it, they use 2's complement where for example 95 and -95 have (almost) no binary digits in common at all

    • @keiyakins
      @keiyakins 2 years ago +92

      @@Henrix1998 true, but with how two's complement works the first bit is zero for positive integers and one for negative. It still leads to "eh just copy how it works with ints for ease of use"

    • @luelou8464
      @luelou8464 2 years ago +39

      @@Henrix1998 For an 8 bit signed integer you basically just take an 8 bit unsigned integer and relabel the values from 128 to 255 as -128 to -1. Thanks to overflow errors, 255 + 1 = 0 in an unsigned representation, so you can reuse the same addition circuits to give -1 + 1 = 0. In binary both are just 11111111 + 00000001 = 00000000.
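The relabelling described above can be sketched in a few lines of Python. The function names `add8` and `as_signed8` are illustrative only, and Python integers are masked to 8 bits to imitate an 8-bit adder.

```python
def add8(a: int, b: int) -> int:
    """Add two 8-bit patterns like an 8-bit adder: keep only the low 8 bits."""
    return (a + b) & 0xFF

def as_signed8(bits: int) -> int:
    """Relabel the patterns 128..255 as -128..-1 and leave 0..127 alone."""
    return bits - 256 if bits >= 128 else bits

# The same adder output reads as 255 + 1 = 0 (unsigned) or -1 + 1 = 0 (signed).
wrapped = add8(0b11111111, 0b00000001)
print(wrapped, as_signed8(0b11111111))
```

The adder never needs to know which interpretation is in use; only the final relabelling differs.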

    • @corylong5808
      @corylong5808 2 years ago +14

      I'm assuming it was done that way in integer (and then floating point) because programmers often make the default state of a Boolean negative, and the modified state of a Boolean positive. For instance, if something went wrong with the assignment of the Boolean, it would have the default and/or more common value and more likely to be correct and fail silently.

    • @notnullnotvoid
      @notnullnotvoid 2 years ago +8

      @@corylong5808 No, it's entirely unrelated to booleans. Booleans don't have anything to do with positive or negative numbers, they can only store true and false. Usually that's represented canonically as 0=false and 1=true (so only the least significant bit changes, not the sign bit which is the most significant bit). And generally any non-zero value is interpreted as true.

  • @kalleguld
    @kalleguld 2 years ago +532

    Fun fact: Some JavaScript engines (JavaScriptCore in Safari, for example) use the NaN-space to hold other datatypes such as integers, booleans and pointers. That way everything is a double.

    • @kkard2
      @kkard2 2 years ago +124

      i can't find confirmation for this, but if true this really is some cursed knowledge

    • @keiyakins
      @keiyakins 2 years ago +87

      That is both ingenious and incredibly evil.

    • @dzaima
      @dzaima 2 years ago +130

      So does Firefox's. It's a common technique, called NaN-boxing.

    • @metroidisprettycool119
      @metroidisprettycool119 2 years ago +11

      Ew

    • @bb010g
      @bb010g 2 years ago +58

      @@dzaima NaN-boxing, my favorite sport. LuaJIT uses it too.
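For readers wondering what NaN-boxing looks like in practice, here is a toy Python sketch. The tag layout (`TAG_INT` at bit 48, a 32-bit payload) is invented for illustration and is not the layout of any particular engine.

```python
import math
import struct

QNAN = 0x7FF8_0000_0000_0000     # exponent all ones plus top mantissa bit: a quiet NaN
TAG_INT = 0x0001_0000_0000_0000  # hypothetical tag bit meaning "payload is an int32"
BOX_MASK = QNAN | TAG_INT

def double_to_bits(x: float) -> int:
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def bits_to_double(bits: int) -> float:
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def box_int(n: int) -> int:
    """Hide a 32-bit integer in the unused mantissa bits of a NaN."""
    return BOX_MASK | (n & 0xFFFF_FFFF)

def unbox(bits: int):
    """Return the boxed int if the tag is present, otherwise the plain double."""
    if bits & BOX_MASK == BOX_MASK:
        n = bits & 0xFFFF_FFFF
        return n - (1 << 32) if n & 0x8000_0000 else n
    return bits_to_double(bits)

print(unbox(box_int(-5)), unbox(double_to_bits(1.5)))
```

Every value is one 64-bit word: ordinary doubles are stored as themselves, and anything else lives in bit patterns that arithmetic would only ever read as NaN.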

  • @WhirligigStudios
    @WhirligigStudios 2 years ago +781

    Another reason for 0 being positive and 1 being negative: (-1)^0 is positive, and (-1)^1 is negative. More technically, it's because the multiplication of signs acts as addition mod 2, with positive or 0 as the identity, respectively. (So when multiplying floating-point numbers, the sign bit is just the XOR of the sign bits of the multiplicands.)
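The XOR claim is easy to check. This Python sketch uses doubles (Python's native float) rather than single precision, and the helper name `sign_bit` is illustrative. NaN results are left out, since the standard does not pin down their sign.

```python
import struct

def sign_bit(x: float) -> int:
    """Return the top bit of the 64-bit IEEE 754 encoding of x."""
    return struct.unpack("<Q", struct.pack("<d", x))[0] >> 63

pairs = [(2.0, 3.0), (-2.0, 3.0), (2.0, -3.0), (-2.0, -3.0), (-0.0, 5.0)]
for a, b in pairs:
    # The sign of the product is the XOR of the operands' sign bits.
    assert sign_bit(a * b) == sign_bit(a) ^ sign_bit(b)
print("sign of product == XOR of sign bits for", len(pairs), "pairs")
```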

    • @NF30
      @NF30 2 years ago +23

      Wow I never thought of it like that, neat! Also don't think I've ever heard anyone use the word "multiplicand" before. Also I know you

    • @LARAUJO_0
      @LARAUJO_0 2 years ago +18

      First time I've seen someone use "multiplicand" instead of "factor"

    • @killerbee.13
      @killerbee.13 2 years ago +57

      @@LARAUJO_0 If you want to get really particular about it, they're not synonyms. A factor is something that divides something else, usually in some 'even' way, but a multiplicand is just an operand to a multiplication operator. You wouldn't say that 1.5 is a factor of 6, but 1.5 × 4 = 6.
      But the word "operand" or "argument" is usually clear enough in context that "multiplicand" is uncommonly used. Even "term" often does the trick just fine.

    • @killerbee.13
      @killerbee.13 2 years ago +6

      @@lapatatadelplato6520 yeah, since it's commutative there's no significant difference. It's not like divisor/dividend

    • @kazedcat
      @kazedcat 2 years ago +8

      In Matrix multiplication it is not commutative. So left multiply and right multiply give different result. So the distinction is needed.

  • @CommonCommiestudios
    @CommonCommiestudios 2 years ago +101

    This takes "Hey Siri, what's 0÷0?" to a whole new level

    • @farmerchuck7294
      @farmerchuck7294 2 years ago +15

      a n d y o u a r e s a d b e c a u s e y o u h a v e n o f r i e n d s

  • @bammam5988
    @bammam5988 2 years ago +815

    Floating-point is a very carefully-thought-out format. Here's my favorite float tricks:
    * You can generate a uniform random float between 1.0 and 2.0 by putting random bits into the mantissa, and using a certain constant bit pattern for the exponent/sign bits. Then you can subtract 1 from it to get a uniform random value between 0 and 1. This is extremely fast *and* has a much more uniform distribution compared to the naive ways of generating random floats, like "rand(100000) / 100000.0f"
    * GPUs can store texture RGBA pixels in a variety of formats, and some of them are floating-point. But engineers try as hard as possible to cram lots of data into a very small space, which leads to some esoteric formats. For example, the "R11G11B10" format stores each of the 3 components as an *unsigned* float, with 11 bits each for the Red and Green floats, and 10 for the Blue float, to fit into a 32-bit pixel. Even weirder is "RGB9_E5". In this format, each of the three color channels uses a 14-bit unsigned float, 5 bits of exponent and 9 of mantissa. How does this fit into 32 bits? Because they share the same exponent! The pixel has 5 bits for the exponent, and then 9 bits for each mantissa.
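The first trick in the list above can be sketched as follows. This is a minimal Python version assuming the standard float32 layout (1 sign bit, 8 exponent bits, 23 mantissa bits); the name `random_unit_float` is made up for illustration.

```python
import random
import struct

ONE_BITS = 0x3F800000  # float32 encoding of 1.0: sign 0, exponent field 127, mantissa 0

def random_unit_float(rng: random.Random) -> float:
    """Fill the 23 mantissa bits with random bits to get [1, 2), then subtract 1."""
    bits = ONE_BITS | rng.getrandbits(23)
    one_to_two = struct.unpack("<f", struct.pack("<I", bits))[0]
    return one_to_two - 1.0

rng = random.Random(0)
samples = [random_unit_float(rng) for _ in range(1000)]
print(min(samples), max(samples))
```

Every result is a multiple of 2**-23, so the values are evenly spaced across [0, 1). The trade-off is that only 2**23 distinct outputs are possible.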

    • @nucular_sr
      @nucular_sr 2 years ago +18

      So if the colors in RGB9_E5 share the same exponent, then wouldn't each color have roughly the same brightness, each less than twice as much as each other? So the colors would always look pretty unsaturated.

    • @kkard2
      @kkard2 2 years ago +52

      ​@@nucular_sr ​ no, because you would scale the other components' mantissa.
      e.g. if you want very bright red with very small green value, you set high exponent, high red mantissa and low green mantissa
      ofc this way you lose precision with very bright colors, but that's the point

    • @nucular_sr
      @nucular_sr 2 years ago +18

      @@kkard2 So are you saying the first digit of the mantissa can be a 0 whatever the exponent is, unlike with usual floating points?

    • @kkard2
      @kkard2 2 years ago +6

      ​@@nucular_sr hmm, tbh i just think it's that way, fast google searches failed me and left with unconfirmed information...

    • @nucular_sr
      @nucular_sr 2 years ago +2

      @@kkard2 Because if it's not like that then my point still stands, the highest the mantissa can be is 1.11111111 in binary or about 1.998 in decimal, and the lowest it can be is 1, which is exactly what i was saying
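On the open question in this thread: as far as I understand the OpenGL shared-exponent extension, the 9-bit mantissas have no implicit leading 1, so a channel can be far smaller than its neighbours. The Python sketch below assumes that, plus an exponent bias of 15 and a bit layout with red in the low bits and the exponent in the top 5 bits. Treat those details as assumptions to check against the specification; the function names are illustrative.

```python
def pack_rgb9e5(r: int, g: int, b: int, e: int) -> int:
    """Pack three 9-bit mantissas and one shared 5-bit exponent into 32 bits."""
    return (r & 0x1FF) | ((g & 0x1FF) << 9) | ((b & 0x1FF) << 18) | ((e & 0x1F) << 27)

def decode_rgb9e5(packed: int) -> tuple:
    """Decode to floats: channel = mantissa * 2**(exponent - 15 - 9), no implicit 1."""
    r = packed & 0x1FF
    g = (packed >> 9) & 0x1FF
    b = (packed >> 18) & 0x1FF
    e = (packed >> 27) & 0x1F
    scale = 2.0 ** (e - 15 - 9)
    return (r * scale, g * scale, b * scale)

# Very bright red next to a tiny green: the green mantissa is simply small.
print(decode_rgb9e5(pack_rgb9e5(511, 1, 0, 24)))
```

Under these assumptions a saturated color is representable; the cost is that the dim channels get very few significant bits when one channel is bright.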

  • @aggressivelymidtier825
    @aggressivelymidtier825 2 years ago +650

    I've always thought I'd love to have jan Misali as a teacher but after watching them explain this completely foreign and complex topic to me so well, they'd be so overqualified for any teacher's pay or compensation. Like, the w series was in depth, but this just has so many moving parts and you communicated them so well!

    • @alternateaccount9510
      @alternateaccount9510 1 year ago +26

      Let me tell you why you are objectively wrong. This is in no disrespect to jan Misali, firstly. But anyway, these videos contain a pretty large gap between each in-depth one. Going as far as a few months. While teachers have to prepare a lesson within different days in a week, at least once every two weeks but it depends what age you are. I expect you are at least thirteen which probably means at least five lessons every two weeks. While jan Misali has months for one lesson. And within preparing these lessons, the teacher has a short gap to produce a one hour lesson about a topic. And this is also to different people, around 30 people usually that they have to teach it to. This means there will be many hold ups probably. While with jan Misali, he has months to produce a lesson. This means making and perfecting scripts, perfecting slides and graphics. And it is to more than 30 people but not at one time. He is able to keep going without interruptions and put all attention on you (technically). This is why there might seem to be a big difference. And you thinking this does not show how great jan Misali is, it shows how underappreciated teachers are.

    • @gillianargyle6144
      @gillianargyle6144 1 year ago +18

      @@alternateaccount9510 So, the TL;DR of this is "Misali has more time for each lesson, and there's a slim-to-none chance that he could cover a unique subject for 100+ days, multiple times per day."
      Am I correct?

    • @iantaakalla8180
      @iantaakalla8180 1 year ago +11

      That is correct. In fact, if he were a teacher he would be as good as any other teacher simply because the challenges of arranging a lesson for 30 people daily such that they all equally understand and do not resort to rote memorization is so slim that he would fall to the teaching quality of any other teacher at best.

    • @iantaakalla8180
      @iantaakalla8180 1 year ago +9

      He does at least know what he is talking about and his interests are clear, but to teach something to random people directly in a nonstop basis to the level he does is impossible.

    • @mihailmilev9909
      @mihailmilev9909 1 year ago +1

      @@iantaakalla8180 unfortunately it seems so 😔. Naive was I for thinking that we can teach everyone like this I guess

  • @jezerozeroseven
    @jezerozeroseven 2 years ago +111

    IEEE 754 is amazing in how often it just works. "MATLAB's creator Dr. Cleve Moler used to advise foreign visitors not to miss the country's two most awesome spectacles: the Grand Canyon, and meetings of IEEE p754" -William Kahan

  • @seneca983
    @seneca983 2 years ago +302

    11:23 "The caveat only being that they get increasingly less precise the closer you get to zero."
    That isn't the only caveat. Subnormals often have a separate trap handler in the processor which can slow down processing quite a bit if a lot of subnormals appear.
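The precision caveat quoted from the video can be seen directly. This Python sketch uses doubles (Python's native float) rather than the single-precision format from the video, and it does not demonstrate the slowdown mentioned above, only the loss of precision.

```python
import sys

smallest_normal = sys.float_info.min          # 2**-1022 for doubles
smallest_subnormal = smallest_normal / 2**52  # 2**-1074, the last step before zero

# Subnormals are evenly spaced 2**-1074 apart, so halving an odd multiple of that
# step cannot be represented exactly and has to round.
x = 3 * smallest_subnormal
round_trip_ok = (x / 2) * 2 == x
print(smallest_normal, smallest_subnormal, round_trip_ok)
```

For normal numbers, halving and doubling is exact. Down here the halved value rounds to a neighbouring step, so the round trip fails.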

    • @Kaepsele337
      @Kaepsele337 2 years ago +71

      Thank you, you just answered my question of why my monte carlo integrator has wildly different run times for different values of the parameters even though it should perform the exact same operations.

    • @RyanLynch1
      @RyanLynch1 2 years ago +4

      good tidbit/nit! thank you!

    • @seneca983
      @seneca983 2 years ago +13

      @@Kaepsele337 Wow, I didn't expect my comment to be this useful but I'm glad that I could help you.

    • @lifthras11r
      @lifthras11r 2 years ago +30

      Note that this is in many cases an implementation artifact. There _are_ implementation techniques for performant subnormal handling, but since they are supposed to happen only sparingly actual implementations have less incentive to optimize them. That's why subnormal numbers hit much harder in newer machines than in older machines. (IEEE 754 does have an additional FP exception type for subnormals, but traps are frequently disabled for the same reason that signaling NaN is unpopular.)

  • @bidaubadeadieu
    @bidaubadeadieu 2 years ago +63

    Wow the information density here is high! This 18 minute video is essentially the first lecture of the semester in the Numerical Analysis course I took in my senior year as a math major, except that lecture was an hour long! The professor used floating point as a motivation to talk about different kind of errors in lecture 2 (i.e., round-off vs. truncation) which honestly was a pretty effective framing.

  • @camwoodstock
    @camwoodstock 2 years ago +27

    "I’ve been jan Misali, and much like infinity minus infinity, I too am not a number."
    guess we need to reconsider our definition of a "misalian"... and write an apology to our math teacher.

  • @agcummings11
    @agcummings11 2 years ago +11

    I already know how floating point works but I’m watching this anyway because I have PRINCIPLES and they include watching every Jan misali video

    • @Duiker36
      @Duiker36 2 years ago

      I mean, it's at least as important as the origins of Carameldansen.

  • @Hyreia
    @Hyreia 2 years ago +140

    Shout out to my favorite underused format the fixed-point. An old game creation library I used had them. Great for 2D games when you wanted subpixel movement but not a lot of range ("I want it to move 16 pixels over 30 frames"). I found the lack of precision perfect so that you don't get funny rounding errors and have things not line up after moving them small spaces.

    • @sourestcake
      @sourestcake 2 years ago +24

      They're also good for avoiding bugs caused by the changing precision in floating-point formats. Sadly they have worse optimization potential in modern CPUs.

    • @ferociousfeind8538
      @ferociousfeind8538 1 year ago +13

      I also feel like minecraft should be using a fixed-point position format. Half of all the number space is completely wasted at the world's origin, and then the precision is stretched to its limit at quintillions of blocks out. Does minecraft need more than 4096 possible positions within a block (in X, Y, and Z directions each of course)? I don't really think so. Which leaves 19 bits for macro-block positions (and 1 for the sign of course) which is 524,288 blocks with 32 bits. Or, with 64 bits as I'm pretty sure it already does use, you can go out to 2,251,799,813,685,248 (or 2.251×10^15, or 2.2 quadrillion) blocks in any direction, before the game either craps out or loops your world. Which I think is a fine amount of space, and even 500,000 blocks was fine- you'd run out of computer memory doing ordinary minecraft things or run out of interest in the world before you actually naturally explored that far. But with 2 quadrillion blocks in any direction, there is no way you'll get out there ever, without teleporting there to see what it's like

    • @ferociousfeind8538
      @ferociousfeind8538 7 months ago +1

      Yeah, games with limited scale don't need the sliding scale of the floating point number system. Minecraft does not need to specify 1/65536th of a block, it probably doesn't functionally need anything more than, like, 1/4096th of a block (or 1/256th of a single pixel), which would nominally squash the range the floating point coordinate system afforded you, but expand the usable space significantly and give you a hard cutoff where the fixed point numbers top out (or bottom out negatively)
      In fact, using 64 bits for positions (floats are SO last year) and a fixed-point implementation down to 1/4096th of a block, you get a range up to 2.2517 * 10^15 blocks in any direction from spawn, all which behave as well as current vanilla minecraft between 2048 and 4096 blocks away from spawn (where the minimum change is already 1/4096)
      And, of course, I couldn't just NOT mention the other half of the floating-point numbers that Minecraft straight up cannot use- half of the available numbers are positions between 0 and 1! Unparalleled precision used for nothing!
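The numbers in this proposal can be checked with a short Python sketch. It assumes a sign bit plus 12 fractional bits as described above; `to_fixed`, `from_fixed` and `f32_spacing` are illustrative names, not anything from the game's code.

```python
import struct

FRAC_BITS = 12
SCALE = 1 << FRAC_BITS  # 4096 sub-positions per block

def to_fixed(blocks: float) -> int:
    """Store a position as a whole number of 1/4096-block steps."""
    return round(blocks * SCALE)

def from_fixed(raw: int) -> float:
    return raw / SCALE

# Bit budget: one sign bit, 12 fractional bits, the rest for whole blocks.
MAX_BLOCKS_32 = 1 << (32 - 1 - FRAC_BITS)
MAX_BLOCKS_64 = 1 << (64 - 1 - FRAC_BITS)

def f32_spacing(x: float) -> float:
    """Gap between the float32 value x and the next larger float32 (positive finite x)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    here = struct.unpack("<f", struct.pack("<I", bits))[0]
    nxt = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return nxt - here

print(MAX_BLOCKS_32, MAX_BLOCKS_64, f32_spacing(3000.0) == 1 / SCALE)
```

The sketch agrees with the figures in the comment: 524,288 blocks for 32 bits, about 2.25 × 10^15 for 64 bits, and a float32 step of exactly 1/4096 between 2048 and 4096.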

    • @ferociousfeind8538
      @ferociousfeind8538 7 months ago +1

      Lmfao I am constantly harping on this I guess

  • @toricon8070
    @toricon8070 2 years ago +535

    this is a good video!! I like the way you explain how this came to exist. it is a human thing made by humans, and as such it is messy and flawed but it _works_ . and I love that. it was created for a purpose, and it serves that purpose well.
    you didn't do this, but I've seen people say that "computers can't store arbitrary-precision numbers", which frustrates me, because computers _can_ do that, they just need a different format. these tools are freedoms, not restrictions. if you want to perform a different task, then find different tools. and yeah you probably won't ever need this much precision, but like things like finance exists, where making a $0.01 error in a billion-dollar field is cause for concern, at the least. arbitrary-precision numbers do require arbitrary memory, but they're definitely possible.
    I think 0 is positive and 1 is negative because 0 is "default" and one is "special", like how main() returns 0 on success in C. also it could be because of how signed integers are stored, where 11111101 is -3 b/c of two's complement (I think?), which is mathematically justified by 2-adic numbers.
    anyway good video, as always.
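As a small illustration of the "different format" point, Python's standard library has exact rational arithmetic. This sketch uses `fractions.Fraction`; it is one possible arbitrary-precision representation, not the only one.

```python
from fractions import Fraction

# Binary floating point cannot store 0.1, 0.2 or 0.3 exactly, so this comparison fails.
float_sum_matches = (0.1 + 0.2 == 0.3)

# Exact rationals grow as needed instead of rounding.
exact_sum = Fraction(1, 10) + Fraction(2, 10)
tiny = Fraction(1, 3) ** 50  # denominator is 3**50, stored exactly

print(float_sum_matches, exact_sum, tiny.denominator)
```

The cost is the one mentioned above: the numerator and denominator take as much memory as the value requires.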

    • @KaneYork
      @KaneYork 2 years ago +5

      In finance, you can instead use a (int64, currency) pair with the integer representing micros. (1000, USD) is interpreted as one tenth of a cent (one thousand micro dollars).

    • @canaDavid1
      @canaDavid1 2 years ago +12

      @@KaneYork that is a fixed-point system.

    • @strangejune
      @strangejune 2 years ago +14

      You would definitely not use a floating point number for finance, the inaccuracies will cause trouble. You'd use a couple integers (like 200 dollars and 53 cents).

    • @canaDavid1
      @canaDavid1 2 years ago +8

      @@strangejune you'd store the value in cents (or even millicents). Divide by 100 to get the dollar amount, the rest is the cents

    • @NYKevin100
      @NYKevin100 2 years ago +10

      Using int64 for finance is questionable; there have been instances of hyperinflation where the signed 64-bit limit (~9.2 quintillion) could have been exceeded in some cases. In particular, the first Zimbabwe dollar (ZWD) had a total of 25 zeros knocked off during successive redenominations (i.e. 10^25 ZWD = 1 ZWL), but 9.2 quintillion is less than 10^19, so you would be unable to represent 1 ZWL as an int64 of ZWD. Making matters worse, 1 ZWL was still way too small to be useful. If your financial database was never converted from ZWD to ZWL, you would be overflowing all the time. The safest way to represent money is as an arbitrary-precision integer, not as an int64.
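The two ideas in this thread, integer cents and the int64 ceiling, can be sketched together in Python, whose integers are arbitrary-precision by default. The helper names are illustrative, and `split_cents` assumes a non-negative amount.

```python
INT64_MAX = 2**63 - 1  # largest value a signed 64-bit integer can hold

def split_cents(total_cents: int) -> tuple:
    """Split a non-negative amount in cents into (dollars, cents)."""
    return divmod(total_cents, 100)

def fits_int64(n: int) -> bool:
    return -2**63 <= n <= INT64_MAX

print(split_cents(20053))    # 200 dollars and 53 cents
print(fits_int64(10**18))    # fits comfortably
print(fits_int64(10**25))    # the redenomination factor from the comment does not
```

Python keeps computing correctly past 2**63, which is the behaviour the comment recommends. A fixed-width int64 column would need an explicit overflow check like `fits_int64`.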

  • @Dawn_Cavalier
    @Dawn_Cavalier 2 years ago +78

    If I had to guess, 0 being positive and 1 being negative is a holdover from 2's complement binary representation.
    For those uninitiated, 2's complement binary representation (2C) is a way to represent positive and negative whole numbers in binary that also uses the leading bit as the signed bit. To showcase why this format exists here's an example of writing a -3 in 2C using 8 bits.
    Step 1: Write |-3| in binary
    |-3| = 3 = (0000 0011)
    Step 2: Invert all of the bits
    Inv(0000 0011) = 1111 1100
    Step 3: Add 1
    1111 1100
    + 0000 0001
    1111 1101
    -3 = (1111 1101)2C
    Converting it back is the reverse process (shown here with a different value, 1111 1111)
    Step 1: Subtract 1
    1111 1111
    - 0000 0001
    1111 1110
    Step 2: Invert all of the bits
    Inv(1111 1110) = 0000 0001
    Step 3: Convert to base of choice, 10 in this example, and multiply by -1
    0000 0001 = 1
    1 * -1 = -1
    (1111 1111)2C = -1
    The advantage of this form is that addition works the same regardless of sign of both numbers or the order. It does this by using the overflow to discard the negative sign if the result would be positive.
    Example: -3 + 5
    1111 1101
    + 0000 0101
    0000 0010
    2 = (0000 0010)2C
    Example: -3 + 2
    1111 1101
    + 0000 0010
    1111 1111
    -1 = (1111 1111)2C
    It's ingenious how effortlessly this integrates with existing computer operations and doesn't have glaring issues, such as One's Complement having a duplicate zero or requiring operational baggage like more naïve negative systems.
    To go back to the original statement, this system only works if a leading one marks the negative numbers, because if it were inverted, 0 would be (1000 0000)2C. This is not only unsatisfying to look at, but dangerous when you consider most bits in a computer are initialized as zero, which would be read in this hypothetical system as -128.
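The walkthrough above translates almost line for line into Python. The sketch masks everything to 8 bits to imitate a byte-wide register; `negate8`, `add8` and `to_signed8` are illustrative names.

```python
def negate8(bits: int) -> int:
    """Two's complement negation in 8 bits: invert every bit, then add 1."""
    return ((bits ^ 0xFF) + 1) & 0xFF

def add8(a: int, b: int) -> int:
    """8-bit addition; the carry out of the top bit is discarded."""
    return (a + b) & 0xFF

def to_signed8(bits: int) -> int:
    """Read an 8-bit pattern as a signed value (leading 1 means negative)."""
    return bits - 256 if bits & 0x80 else bits

minus_three = negate8(0b0000_0011)
print(format(minus_three, "08b"))                       # bit pattern for -3
print(to_signed8(add8(minus_three, 0b0000_0101)))       # -3 + 5
print(to_signed8(add8(minus_three, 0b0000_0010)))       # -3 + 2
```

The same `add8` handles both examples; the sign of the result falls out of the discarded carry, as the comment describes.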

    • @hughcaldwell1034
      @hughcaldwell1034 2 years ago +3

      This was in the back of my mind, too.

    • @angeldude101
      @angeldude101 8 months ago

      Using a single bit to represent the sign is convenient for thinking about and implementing the numbers, but ultimately binary integers are _modular_ integers, which don't actually have any concept of sign. Ascribing a sign based on the first bit gives odd behavior specifically for 0 = -0 being "positive" and 128 = -128 being "negative" when really both are neither.

  • @thesaladballs
    @thesaladballs 2 years ago +17

    well, this is surreal. i was just asking this the other day in regard to FLOPS.
    i’m learning more about computers, and now i understand what T-FLOPS stands for, but then i thought... “well what the fuck is a floating point?!” looked it up and didn’t wanna read 5 paragraphs of math that day... because i was very inebriated.
    but now i get a jan misali video about the thing i wanted to learn?!! best day of my life

  • @WishMakers
    @WishMakers 2 years ago +8

    5:43 - 5:51 is like a masterclass thesis in the realm of comedic timing and I just needed you to know that
    anyway this video is awesome

  • @denverbeek
    @denverbeek 1 year ago +5

    Microprocessor/IC enthusiast here.
    5:33
    The reason that a 1 in the sign bit of a signed integer is negative and a 0 is positive is that it saves a step when doing two's complement, which is how computers do subtraction (basically, you can turn any subtraction problem into an addition problem by flipping the bits of the number being subtracted and adding 1, so the hardware only needs an adder).

  • @leow2996
    @leow2996 2 years ago +2

    That barely audible mouse click to stop recording right after a nonsensical parting one-liner is just *chef's kiss*

  • @keiyakins
    @keiyakins 2 years ago +6

    "unless you're doing something *really* silly like [unix time_t]" cracked me up :3

  • @FernandoGarcia-hz1gp
    @FernandoGarcia-hz1gp 2 years ago +9

    Here i am, once again
    When i got out of all my maths courses i swore to never come back, but am i gonna sit through however much time jan Misali needs to talk about some cool random math thing?
    Yes, yes i am

  • @nucular_sr
    @nucular_sr 2 years ago +83

    I think the reason that the first bit being a 0 represents positive numbers and 1 for negative is so that it's consistent with signed integer formats. When you add two signed ints (or byte, long, etc) you just go from right to left, adding bitwise and carrying the 1 to the next digit if you need to. The leading bit of a signed int is 0 if it's positive so that, to add two ints, you can just treat the leading bit like another digit in the number. For example, in signed bytes, 1+2=3 is done as 00000001+00000010=00000011, but if a leading 1 represented a positive number then if you tried to apply the same bitwise addition step to each digit, you would get 10000001+10000010=00000011. Since you're adding two leading 1's together, they add to 0, which means the number becomes negative. When using a 0 for the leading digit of a positive number, you don't have this problem when adding together negative numbers, since you'll have a 1 carried to the leading digit (so for example, (-1)+(-2)=(-3) is done as 11111111+11111110=11111101) unless the numbers are so negative that you get an integer underflow, which is an unavoidable problem. Because this convention makes logical sense for signed ints, it makes sense that it would be used for floats for consistency.

    • @steffahn
      @steffahn 2 years ago +12

      One useful result of the sign bit being 0 meaning positive is that this way the "ordinary zero", i.e. "positive zero" value consists entirely of zero-bits. So if some piece of memory is zero-initialized, and then interpreted as floating point numbers, those become zero-initialized, too.

  • @MCLooyverse
    @MCLooyverse 2 years ago +54

    I always see the sign as `isNegative`. Also, I recently did a thing in Desmos where I needed to know the quadrant a point was in, so I had Q(p) = N(p.x) + 2N(p.y); N(x) = {x < 0: 1, 0} (which generates a non-standard quadrant index, but it's maybe better.), where N(x) is the same mapping.
    Also, with this sign encoding, we have that the actual sign is (-1)^s ((-1)^0 = 1, (-1)^1 = -1)
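The (-1)^s view fits naturally into a decoder for the whole format. This Python sketch splits a single-precision value into its three fields and rebuilds it; it assumes the standard layout (1 sign bit, 8 exponent bits with bias 127, 23 mantissa bits) and only handles normal numbers. The function names are illustrative.

```python
import struct

def decode_f32(x: float) -> tuple:
    """Return (sign, exponent field, mantissa field) of x as a float32."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    return sign, exponent, mantissa

def rebuild_normal(sign: int, exponent: int, mantissa: int) -> float:
    """Rebuild a normal number (exponent field 1..254) from its fields."""
    return (-1) ** sign * (1 + mantissa / 2**23) * 2.0 ** (exponent - 127)

fields = decode_f32(-6.5)  # -6.5 = -1.625 * 2**2
print(fields, rebuild_normal(*fields))
```

Zero, subnormals, infinities and NaN use the reserved exponent fields 0 and 255 and would need separate branches.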

    • @matthewhubka6350
      @matthewhubka6350 2 years ago +1

      The sign function also exists in desmos btw. Might not work as well though since sign(0)=0

  • @quietsamurai1998
    @quietsamurai1998 2 years ago +20

    This is without question the best explanation of floating point numbers I've ever seen. I wish this video was around when I was taking my freshman CS classes and we had to memorize the structure of floats, since actually walking through the whole process of making compromises and design decisions behind the format really gives you a deep understanding of the reasoning behind the format.

  • @RedLuigi235
    @RedLuigi235 2 years ago +45

    I think perhaps 0 as + and 1 as - could be a case of default (off) versus the only possible changed state (on)/ underspecification (+ by default) vs specified case (-)

    • @GroundThing
      @GroundThing 2 years ago +2

      It also works for fixed point or Integer addition, with 2s complement. If you add a negative number to a positive number if the negative number's absolute value is larger than the positive number the resulting number won't have an overflow carry so the sign bit will still be 1 (so still negative). If the positive number is larger than the negative number's absolute value, you will have an overflow carry into the sign bit, which will then become 0, as 1+0+1 = 0 with a carry out (which is usually ignored). As a result, the writers of the IEEE floating point standard probably went with the same convention for the sign bit as prior signed binary representations.

  • @Quxxy
    @Quxxy 2 years ago +10

    In regards to signed zero, I recall an explanation in a paper a long, *long* time ago that showed a graph of a 2D function that was correct with signed zero, and incorrect without it. It kills me that I can't find the damn thing, but the point is that signed zero is *required* for some functions to get sensible results with floating point. I think it comes down to correctly representing sign at inflection points.
    Oh, also: another small, practical advantage of the way the sign bit is: it means "all zero bits" means 0.0. This is handy for contexts where data is initialised to zero: you end up with your floats all being 0.0 by default, rather than -0.0.

    • @rossjennings4755
      @rossjennings4755 2 years ago +1

      One interesting way that having negative zero can come in handy is with functions that have "branch cuts", which is something that's normally associated with complex numbers, but there's an analogous thing with the inverse tangent function that doesn't require complex numbers. In C (and most other programming languages), taking the inverse tangent of +infinity gives +π/2, and similarly the inverse tangent of -infinity is -π/2. So if you have a function that computes arctan(1/x) and x ends up being a negative number that's too small to be represented, the fact that it underflows to -0 instead of +0 can save you from being off by π.
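Here is a small Python sketch of the arctan(1/x) example. One caveat: plain Python raises `ZeroDivisionError` for `1.0 / 0.0` instead of returning infinity, so the helper `ieee_reciprocal` (a made-up name) emulates the IEEE 754 result at zero.

```python
import math

def ieee_reciprocal(x: float) -> float:
    """1/x with IEEE 754 behaviour at zero: the sign of the zero picks the infinity."""
    if x == 0.0:
        return math.copysign(math.inf, x)
    return 1.0 / x

tiny_negative = -1e-200 * 1e-200  # too small to represent: underflows to -0.0
tiny_positive = 1e-200 * 1e-200   # underflows to +0.0

print(math.atan(ieee_reciprocal(tiny_negative)))  # close to -pi/2
print(math.atan(ieee_reciprocal(tiny_positive)))  # close to +pi/2
```

Both products compare equal to zero, yet the sign bit that survives the underflow is what keeps the result on the correct side.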

  • @dnys_7827
    @dnys_7827 2 years ago +21

    I've had a vague sense of what floats are from running into problems with them in computational design software, and the feeling of getting a clear overview on something you've only been vaguely familiar with is so good. great video.

  • @redtaileddolphin1875
    @redtaileddolphin1875 2 years ago +117

    Oh hell yeah I’ve heard floating points explained a few times and still don’t really get it so I’m really happy you did a video on it. Your brain seems to work similar to mine, or at least you’re very good at explaining, so your videos work very well with my brain.
    The current second place video for explaining this topic to me is the video about quake’s fast square root hack

  • @notnullnotvoid
    @notnullnotvoid 2 years ago +15

    I always found it silly that IEEE-754 gives us 1/0=INF, but not INF*0=0, this is the first time I've had a plausible explanation for why it might have been designed that way. Thanks!

    • @cmyk8964
      @cmyk8964 2 years ago +1

      0 × ∞ is either a discontinuity or an exception in normal math too though. Anything × ∞ is ±∞, but 0 × anything is 0.

    • @danielbishop1863
      @danielbishop1863 2 years ago +2

      INF*0 is one of the standard "indeterminate forms" in calculus, which is why it evaluates to NaN.
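
      For anyone who wants to poke at this, a small Python sketch. Python floats are IEEE doubles on common platforms, and the variable names are just for illustration.

```python
import math

inf = math.inf

overflowed = 1e308 * 10   # too big for a double, so the product becomes inf
zero_times_inf = inf * 0.0
inf_minus_inf = inf - inf
doubled = inf * 2.0       # a determinate form: still infinity

print(overflowed, zero_times_inf, inf_minus_inf, doubled)
```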
      en.wikipedia.org/wiki/Indeterminate_form

    • @cmyk8964
      @cmyk8964 2 years ago +1

      @@danielbishop1863 Exactly. 0 can be an infinitesimal in normal math too.

  • @youranforit
    @youranforit 2 years ago +45

    what a cool video!!! i always love that i can understand your content even though i have no background in it. thinking of numbers as approximations like this really is so fascinating and unique

  • @sugar_700
    @sugar_700 2 years ago +59

    0 meaning positive means that it's possible to zero out the memory (such as with `memset` or `calloc`) to get floating point value +0.

    • @seneca983
      @seneca983 2 years ago +7

      If the sign bit worked the other way around then that would set the value to -0.0. In most cases that difference would probably be harmless though sometimes it could cause problems.

    • @notnullnotvoid
      @notnullnotvoid 2 years ago +7

      Oh, that's a good point. Especially since the same holds for integers. So zeroing memory gives you the same value of +0 in both data types.
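
      A short Python sketch of the zeroed-memory point from this thread, using `struct` to reinterpret four zero bytes the way a `memset` or `calloc` buffer would be read back.

```python
import math
import struct

zeroed = bytes(4)  # four zero bytes, like freshly zeroed memory

(as_float,) = struct.unpack("<f", zeroed)  # read the bytes as a 32-bit float
(as_int,) = struct.unpack("<i", zeroed)    # read the same bytes as a 32-bit int

# Negative zero differs only in the top (sign) bit.
neg_zero_bits = struct.pack(">f", -0.0)

print(as_float, as_int, neg_zero_bits.hex())
```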

  • @MitalAshok
    @MitalAshok 2 years ago +11

    As someone also weirdly into floating point numbers, I appreciate this video.
    0 for the sign bit being positive is carried over from previous sign/value representations of numbers. It has the benefit that "all zero bits" is the same as positive zero, the expected "default" float value. Also, you can think of it as multiplying the rest of the number by (-1)**(sign bit).
    And I tend to think of floats themselves as precise, but operators need to round to the closest precise value. This makes rounding modes make sense. But your interpretation is great for stuff like 1/0==Infty.
    But I hate hearing the phrase "Infinity is a concept": You say "Infinity is not a real number", but that sounds like the non-maths meaning of "real" (i.e., it exists). Of course, it can be a "real" number in certain domains. In IEEE-754, it isn't a concept, but an entity that "exists".
    Like "engineering notation" 1.72e-3 == 1.72 * 10**-3 == 0.00172, there is a "precision notation" 1.001p4 == 1.001_2 * 2**4 = 10010_2
    And about NaNs: All of those values are not used in real life. You might rarely see the difference between signalling and quiet NaNs, but the entire payload isn't used. In JavaScript, there is only 1 NaN value for example. Some modified floating point formats (Like the ARM 16 bit float) reuse the NaN encodings as just larger numbers.
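
    A hedged Python sketch of the `p` notation and the (-1)**(sign bit) view. Note that Python's `float.fromhex` writes the mantissa in hexadecimal, so binary 1.001 becomes `0x1.2`. `decode_f32` is a made-up helper that handles normal numbers only.

```python
import struct

value = float.fromhex("0x1.2p4")  # 1.001 in binary is 1.125; times 2**4
print(value, (18.0).hex())

def decode_f32(bits):
    """Decode a normal single-precision bit pattern (no zeros, subnormals, inf or NaN)."""
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return (-1) ** sign * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)

# Bit pattern of -18.0 stored as a 32-bit float.
bits = struct.unpack(">I", struct.pack(">f", -18.0))[0]
print(format(bits, "032b"), decode_f32(bits))
```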

  • @Maxuvious
    @Maxuvious 2 years ago +8

    I went from knowing nothing about floating point numbers to literally all of it (with the exception of how it's applied) in 17 minutes. I am very impressed! Great video

  • @DominoPivot
    @DominoPivot 2 years ago +12

    Thanks, this is a very straightforward explanation and I'm probably going to link it to any neophyte programmer asking me a question about floating points :)
    Or anyone who shouts that JavaScript is a bad language when the flaws they're complaining about are really just flaws with the IEEE floating point standard that JS doesn't encapsulate.

    • @keiyakins
      @keiyakins 2 years ago +7

      I mean, "all numbers are double precision floats, deal with it" is a pretty awkward design decision. On the other hand the thing was thrown together in like a week and a half originally and when you take that into consideration it's pretty good.

    • @notnullnotvoid
      @notnullnotvoid 2 years ago +3

      Having no integer data type is FAR from the only reason why javascript is a terrible programming language.

    • @drdca8263
      @drdca8263 2 years ago +1

      @@notnullnotvoid that doesn’t preclude many of the complaints made from being mistaken though

    • @DominoPivot
      @DominoPivot 2 years ago

      @@notnullnotvoid Plenty of languages with a high level of abstraction have only one number type. 64bit floating point numbers are pretty good at representing 32 bit integers accurately, so when dealing with human-scale numbers it's rarely a problem. But JS could have done a better job handling NaN, Infinity and -0 for sure.
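
      A quick Python check of that claim. Python floats are 64-bit doubles, and the names below are only illustrative.

```python
max_u32 = 2**32 - 1

# Every 32-bit integer survives a round trip through a double unchanged.
samples = (0, 1, 2**31 - 1, 2**31, max_u32)
exact = all(int(float(n)) == n for n in samples)

# Integers stop being exact once they need more than 53 significant bits.
limit = 2**53
collides = float(limit) == float(limit + 1)

print(exact, collides)
```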

    • @jangamecuber
      @jangamecuber 2 years ago

      @@notnullnotvoid eg. The Date system

  • @EDoyl
    @EDoyl 2 years ago +62

    doubly recommend the tom7 video. It's interesting how the different NaN values tell you what sort of NaN it is, but NaN never equals NaN if you compare them. Even if they're the same sort of NaN they don't equal each other. Even if they're literally the same bits at the same location in memory it doesn't equal itself.
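
    The self-inequality is easy to try in Python. This sketch also shows `math.isnan`, the usual way to test for NaN since `==` cannot do it.

```python
import math

nan = float("nan")
same_object = nan  # literally the same value at the same place in memory

equal_to_itself = nan == nan
differs_from_itself = nan != nan
alias_equal = same_object == nan

print(equal_to_itself, differs_from_itself, alias_equal, math.isnan(nan))
```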

  • @doublex85
    @doublex85 2 years ago +8

    Hey jan Misali, take a look at "Posit numbers" if you haven't already. I feel like they'll be aligned with your interests. They feel like floating point but without all the ad-hoc design-by-committee kludges.
    No negative zero, only one exception value (NotAReal) instead of trillions, more precision near zero but larger dynamic range by trading bits between the fraction part and the exponent part using the superexponential technique you actually joked about in this video. They're super cool.

  • @areadenial2343
    @areadenial2343 1 year ago +8

    I hope you do a video about the balanced ternary number system sometime, it's very cool and has lots of unique properties! Having the digits 1, 0, and -1, it can naturally represent negative numbers. Truncation is equivalent to rounding, so repeated rounding will not result in loss of precision. Converting a number to negative simply involves swapping the 1 digit with -1 and vice versa. Similar to binary, having only digits with a magnitude of 1 simplifies multiplication, allowing you to use a modified version of the shift-and-add method (flip to negative if -1, shift, and add). Some early computers used balanced ternary, such as the Setun computer at Moscow State University, and a calculating machine built by Thomas Fowler. Fowler said in a letter to another mathematician that "I often reflect that had the Ternary instead of the denary Notation been adopted in the Infancy of Society, machines something like the present would long ere this have been common, as the transition from mental to mechanical calculation would have been so very obvious and simple."

  • @JNCressey
    @JNCressey 2 years ago +3

    There is also a decimal floating point. It works nice with giving "sensible" results, like 0.1+0.2=0.3, but it's a bit slower to process.

    • @danielbishop1863
      @danielbishop1863 2 years ago +1

      For those who need an explanation: In binary, the fraction one-tenth is the recurring "decimal" 0.00011001100110011001100..., so has to be rounded off in order to be stored in floating point. To be ultra-precise, it's stored as 0.100000001490116119384765625 (13421773/2^27) in 32-bit "single precision", or 0.1000000000000000055511151231257827021181583404541015625 (3602879701896397/2^55) in 64-bit "double-precision".
      Base-ten arithmetic has a similar issue where (1/3)*3 = 0.333333333333333 * 3 = 0.999999999999999 instead of an exact 1.
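
      The exact stored values quoted in this thread can be checked with Python's `fractions` and `decimal` modules. This is a sketch using Python's software decimal type, not the IEEE decimal hardware formats.

```python
import struct
from decimal import Decimal
from fractions import Fraction

# Exact rational value of 0.1 as a 64-bit double.
double_tenth = Fraction(0.1)

# Round 0.1 to single precision first, then take its exact value.
single_tenth = Fraction(struct.unpack("<f", struct.pack("<f", 0.1))[0])

binary_sum_ok = 0.1 + 0.2 == 0.3
decimal_sum_ok = Decimal("0.1") + Decimal("0.2") == Decimal("0.3")

print(double_tenth, single_tenth)
print(Decimal(0.1))  # the full decimal expansion of the double
print(binary_sum_ok, decimal_sum_ok)
```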

  • @fernandobanda5734
    @fernandobanda5734 2 years ago +24

    The weirdest thing you'll face when dealing with floating point is when you try to make games and one of the tips is "As you get farther from the (0, 0, 0) coordinate, physics gets less precise".
    It makes a lot of sense. If you need to spend more bits in the integer part of the number, the decimal part will be less precise. It's just so impractical for this purpose.
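
    A small Python sketch of why distance from the origin costs precision, assuming positions are stored as 32-bit floats. `to_f32` is an illustrative helper that rounds a Python double to single precision.

```python
import struct

def to_f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

step = 0.1

near = to_f32(1.0 + step)          # close to the origin the step registers
far = to_f32(4194304.0 + step)     # at 2**22, float32 values are 0.5 apart
farther = to_f32(16777216.0 + 1.0) # at 2**24, even a whole unit is lost

print(near, far, farther)
```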

    • @huckthatdish
      @huckthatdish 2 years ago

      That makes total sense but I’ve never worked in games so it never occurred to me. How interesting. What kind of data type are the coordinates? I’d assume it’s at least a double these days but like is double precise enough for large open world games or do you need more bits?

    • @kkard2
      @kkard2 2 years ago +7

      ​@@huckthatdish in games most often float is used, due to higher performance on GPU (transforming vertices by matrix, etc.)
      open world games usually use floating world origin, which shifts entire world closer to (0, 0, 0) (implementations vary)

    • @qwertystop
      @qwertystop 2 years ago +7

      @@huckthatdish Generally, games with large enough areas to make this a problem handle it by setting the origin coordinate at the player's location and moving the rest of the world around them, and not simulating the parts of the world that the player isn't currently in. That, or loading zones; any given loaded area can have its own fixed origin.
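
      A toy one-dimensional sketch of the floating-origin idea in Python. Everything here (the `World` class, `REBASE_DISTANCE`, the starting point) is hypothetical and only meant to show the principle, not how any real engine does it.

```python
import struct

def to_f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

class World:
    """Hypothetical 1-D world: exact integer origin plus a small float32 offset."""
    REBASE_DISTANCE = 1024.0  # illustrative threshold, not taken from any engine

    def __init__(self, origin=0):
        self.origin = origin  # exact integer position of the local frame
        self.player = 0.0     # float32 position relative to the origin

    def move(self, dx):
        self.player = to_f32(self.player + dx)
        if abs(self.player) >= self.REBASE_DISTANCE:
            shift = int(self.player)  # whole units to fold into the origin
            self.origin += shift
            self.player = to_f32(self.player - shift)

    def absolute(self):
        return self.origin + self.player

START = 2**24  # float32 values are 2.0 apart just above this point

naive = to_f32(float(START))
world = World(origin=START)
for _ in range(8):
    naive = to_f32(naive + 0.125)  # each small step is lost to rounding
    world.move(0.125)

print(naive - START)             # the naive float32 position never moved
print(world.absolute() - START)  # the rebased position advanced
```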

    • @qwertystop
      @qwertystop 2 years ago +3

      On the other hand, Outer Wilds has a whole miniature solar system and needs to run physics everywhere even when you're not around. I'm not sure how that handles it, but I am very impressed that they did.

    • @huckthatdish
      @huckthatdish 2 years ago +1

      @@kkard2 interesting. Still surprised with how far draw distances are these days everything loaded at once can be simulated with just float, but I know the tricks and hacks to make it all work are myriad. Very interesting

  • @ala8193
    @ala8193 2 years ago +11

    Floating-point numbers are interesting. I would recommend looking into posits and unums and other such alternatives for numeric computations.

  • @strangeWaters
    @strangeWaters 2 years ago +7

    numbers as regions is a thing that shows up in locale theory (aka "pointless topology"), where you don't worry about there being "points" and only work with hunks of space that can contain each other (aka some sort of algebraic lattice, i forget the precise formalism). Imo it's a lot more physically realistic too -- all physical measurements have degrees of uncertainty, nothing is 100% certain. It's just that classical math is very uncomfortable with any kind of uncertainty. (Everything has to be a total function! Anything that doesn't fully determine its output is Not Allowed!)
    The blog "graphical linear algebra" does some really interesting graphical-algebraic development of basic arithmetic and convinced me that it's often natural to deal with one-to-many relations in basic math -- that is, have operations that produce ranges instead of points -- like "nan" (when it's used in the sense of "could be anything" rather than "outside context problem"). They do stuff like turning addition "backwards" -- thinking of "reverse addition" as something that consumes an input and produces a *constraint* that its two outputs sum to the input. There's a nice visual formalism where you literally turn a little circuit diagram backwards to indicate this, it's cute.

    • @jonassattler4489
      @jonassattler4489 1 year ago

      There is a very clever trick with floating point numbers which allows you to make mathematically true calculations possible with just floating point numbers alone.
      The idea is that floating point numbers allow you to round in different directions and give you a result which you can choose to be either lower or higher than the result in the real numbers.
      Using this idea you can represent a real number as an interval, and calculations with intervals give you other intervals with the property that the mathematically true result is guaranteed to be in the interval.
      This gives something similar to the structure you talk about. A number in that setting isn't just a point, but an entire range of values which can be operated on like numbers. Additionally this also gives you a measure of how accurately you are calculating. If the resulting interval is wide the uncertainty is high, if it is small your errors in the calculation are low.
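
      A minimal Python sketch of interval addition. One substitution to be clear about: Python does not portably expose the hardware rounding modes, so this widens each bound by one step with `math.nextafter` (Python 3.9+) instead of using true directed rounding. The helper names are made up.

```python
import math
from fractions import Fraction

def widen(lo, hi):
    """Push both bounds outward by one representable step."""
    return (math.nextafter(lo, -math.inf), math.nextafter(hi, math.inf))

def around(x):
    """Interval guaranteed to contain the real number that x was rounded from."""
    return widen(x, x)

def add(a, b):
    """Interval sum whose bounds enclose every possible true sum."""
    return widen(a[0] + b[0], a[1] + b[1])

total = add(around(0.1), around(0.2))

# Fractions compare exactly, so this checks the true value 3/10 is enclosed.
contains = Fraction(total[0]) <= Fraction(3, 10) <= Fraction(total[1])
width = total[1] - total[0]

print(total, contains, width)
```

      The width of the result is the accuracy measure mentioned above: a wide interval means the computation has lost a lot of certainty.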

  • @jasmijnwellner6226
    @jasmijnwellner6226 2 years ago +59

    I now understand subnormal numbers, I thought I never was going to get them. Thanks!
    By the way, have you heard of posits? It's kinda like a "floating floating point system" like you mentioned at 5:10, and it's really fascinating (albeit harder to understand than the standard IEEE 754 format).

    • @kered13
      @kered13 2 years ago +1

      Thank you for mentioning posits! I remembered reading about them, but I could not remember what they were called. They're a really clever format.

  • @DeWillpower
    @DeWillpower 2 months ago

    thank you for making this video! my code teacher told me basically everything i needed to know but also move to a more important subject, while your video on youtube is a more cozy place where i could learn more

  • @EebstertheGreat
    @EebstertheGreat 2 years ago +2

    By the way, fixed-point arithmetic is used somewhat frequently in computer systems. It basically works the same way as integer arithmetic, but interpreted as some decimal or binary fraction. For instance, your banking app might internally use signed fixed-point binary-coded decimal numbers with 11 digits, 2 of which are after the decimal point. That way, it can be sure all of its calculations are exact, and it can apply a rigidly-defined set of rounding rules. Your account can never have half a cent in it, after all. However, you can of course be much more efficient with a binary floating-point format like IEEE 754.
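
    A rough Python sketch of the fixed-point idea, storing money as an integer count of cents. It uses plain integers and not binary-coded decimal, and it only handles non-negative amounts. `to_cents` and `format_cents` are hypothetical helpers.

```python
def to_cents(text):
    """Parse a non-negative amount like '12.34' into an integer number of cents."""
    whole, _, frac = text.partition(".")
    return int(whole) * 100 + int(frac.ljust(2, "0")[:2])

def format_cents(cents):
    """Render an integer number of cents with two digits after the point."""
    return f"{cents // 100}.{cents % 100:02d}"

fixed_total = sum(to_cents(p) for p in ["0.10"] * 10)  # exact integer arithmetic
float_total = sum([0.10] * 10)                         # binary floating point

print(format_cents(fixed_total), float_total)
```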

  • @Elizabeth-vh6il
    @Elizabeth-vh6il 2 years ago +3

    This is the best explanation of IEEE 754 I've ever seen. Much easier to follow than the textbook I originally learned this stuff from 20 years ago. The only things I can remember that you didn't mention are (a) that the FPU has a certain number of extra hidden bits (3?) to minimize the rounding errors applicable to the results of intermediate steps of sequences of calculations, (b) we had a whole 6 week lecture course that I didn't manage to understand (I was a CompSci student but the module was run alongside math majors) about what happens when the system breaks down (perhaps when subnormals aren't enough?) and how to perform operations in the correct order to minimize errors because formulae that should be mathematically equivalent aren't always (c) I can't remember the difference between a 'Signalling NaN' and a non-signalling one, (d) the whole issue with rounding errors, equality checking and 'epsilon', and (e) there was some hype at the time around a then new-ish IEEE *decimal* standard that was supposed to replace the standard double precision binary format and fix all the problems but I don't know if it ever gained much traction at all (obviously the binary formats are still ridiculously popular).

  • @whatelseison8970
    @whatelseison8970 2 years ago +3

    Daaamn Jan! You really blew up since last I saw you. Nice job! This video was a lot of fun. I like to think you also did the entire thing in one take. I know you said at the end you're not a number but as far as I'm concerned, you're number one!

  • @SomeTomfoolery
    @SomeTomfoolery 2 years ago +2

    I've never wondered about how floating point worked before, but I couldn't have asked for anything better.

  • @rkvkydqf
    @rkvkydqf 2 years ago +1

    You broke my understanding of math and programming in just 15 minutes.

  • @DumbMuscle
    @DumbMuscle 2 years ago +4

    If you watched this and enjoyed it and want to see a funky thing that floating points let you do - go look up the Fast Inverse Square Root algorithm.

  • @BLiZIHGUH
    @BLiZIHGUH 2 years ago +8

    Great video as always! And I'm always happy to see Tom7 getting a plug as well :) You both exist in the same realm of "obscure but incredibly entertaining content about niche subjects"

  • @ritzgaming1819
    @ritzgaming1819 1 year ago

    this guy is a teacher but entertaining
    teaching me useless stuff that entertains me and does not bore me like every other teacher in my school.

  • @Frankium
    @Frankium 2 years ago

    this channel has instantly turned from my favourite linguistic and conlang channel to my favourite computer science channel

  • @SolomonUcko
    @SolomonUcko 2 years ago +11

    Posits (type III unums) use a kind of "floating floating point" by having a variable-precision exponent and mantissa, allowing them to reduce precision for very large and small values in exchange for increasing precision for numbers near 1 and increasing range.

  • @Yotanido
    @Yotanido 2 years ago +3

    NaN is an absolute scourge. You do some operation that results in a NaN. It doesn't error, you just get a NaN as a result.
    Any operation on a NaN is also a NaN. By the time this causes an issue in your program, it might be somewhere completely different and now you need to figure out where the NaN originated, before it infected all the other floats.
    Honestly, I'd prefer it if things just errored out when encountering a NaN. It would make debugging so much easier and in the vast majority of cases, things are going wrong once a NaN shows up anyway.
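
    A small Python sketch of both behaviours: NaN quietly spreading through later arithmetic, and a wrapper that fails loudly at the point of origin. `checked` is a hypothetical helper, not a standard library function.

```python
import math

def checked(value, where):
    """Raise as soon as a NaN appears, naming the place that produced it."""
    if math.isnan(value):
        raise ArithmeticError(f"NaN produced in {where}")
    return value

nan = math.inf - math.inf    # no error here, just a NaN
spread = (nan + 1.0) * 0.0   # every later operation is NaN as well

try:
    checked(nan, "inf - inf")
    caught = None
except ArithmeticError as err:
    caught = str(err)

print(spread, caught)
```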

    • @Chloe-ju7jp
      @Chloe-ju7jp 2 years ago

      just have checks for if things are NaN after doing something if it's a problem

    • @Yotanido
      @Yotanido 2 years ago

      @@Chloe-ju7jp Sure, you can do that. But then you need to do a NaN check everywhere and you might forget.
      Or you make a function for each operation, which will make longer expressions look absolutely daft and hard to read.
      NaN is an exceptional state. It should throw an exception. (Or whatever error mechanism the language you are using happens to have)

    • @rsa5991
      @rsa5991 2 years ago

      @@Yotanido On x86, hardware supports signaling on FP errors. It is controlled by MXCSR register. Your compiler might have some function or setting to control it.

  • @RaeSan_Art
    @RaeSan_Art 1 year ago +1

    i love all your videos because i get so into the topic and your style of delivering comedy that I totally don't realize that I don't know what you're talking about until like 5 minutes after you lost me lmao

  • @michaeltan7625
    @michaeltan7625 1 year ago +1

    That was a very good explanation. Personally, I always considered finding the best compromise to be one of the cornerstones of engineering, and this system is a really good example of it.
    Also, I find the whole "every number is a range" thing much easier to digest by thinking about it as scientific notation with limited significant figures. Then it does make sense that 1.00000 * 10^15 + 1 still rounds to 1.00000 * 10^15 if you are hypothetically limited to 6 significant figures.
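
    The limited-significant-figures picture can be tried directly with Python's `decimal` module. This sketch assumes a precision of six digits, since 1.00000 has six significant digits.

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 6  # keep only six significant digits in results
    big = Decimal("1.00000E15")
    bumped = big + 1           # the +1 falls below the last kept digit
    small_sum = Decimal("1.00000") + Decimal("0.00001")  # this one still fits

print(bumped, small_sum)
```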

  • @santoast24
    @santoast24 2 years ago +3

    Once again, jan Misali has taken something I don't give two hoots in Hell about, and convinced me to sit through a 20 (almost) minute long video, and enjoy every second of it (and learn some stuff that, even though I don't care about it, I will happily carry with me forever)
    Impressive

  • @Chubby_Bub
    @Chubby_Bub 2 years ago +3

    This video was really interesting and helpful, but the best thing I can contribute to the comments is that without a specified radix (base) the point is called a “radix point”.

  • @abrahammekonnen
    @abrahammekonnen 2 years ago +1

    Thank you for your explanatory videos. I always appreciate how clear, effective, and entertaining your videos are.

  • @nanometer6079
    @nanometer6079 1 year ago +1

    I love that my interest in computer science and other esoteric internet things like homestuck and undertale somehow converge on this channel lol

  • @TheThirdPrice
    @TheThirdPrice 1 year ago +4

    9:29 legit made me laugh out loud

  • @Erin-ks4jp
    @Erin-ks4jp 2 years ago +9

    The idea of numbers secretly being ranges of numbers reminds me of a thing from combinatorial game theory, where some values have the property of being "confused" with other numbers, though that is a very strict and particular mathematical idea rather than the real-world-oriented compromise involved here with floating point.

    • @IllidanS4
      @IllidanS4 2 years ago +3

      Enter the surreal numbers!

    • @iambad
      @iambad 1 year ago

      @@IllidanS4 Nice. I had not heard of surreal numbers before.

  • @synodicseason
    @synodicseason 2 years ago +2

    babe wake up new jan misali video

  • @gaeel330
    @gaeel330 2 years ago +2

    It's equally truthful and painful to call Tom7's video the "logical conclusion" of NaN and Infinity shenanigans.
    Thanks for this video, it's a fun and easy to follow explanation of the format, I particularly appreciate how you insist that floating point numbers represent a range of numbers.
    ni li pona, a!

    • @gaeel330
      @gaeel330 2 years ago

      nampa pi sike telo li pona ala. lipu tawa li pona.

  • @gamerdomain6618
    @gamerdomain6618 2 years ago +3

    2:46
    The universal term for such dividers between the fractional and whole components, such as decimal points, is the radix point.

    • @manioqqqq
      @manioqqqq 1 year ago

      quad dio null radix hex non

  • @BryndanMeyerholtTheRealDeal
    @BryndanMeyerholtTheRealDeal 2 years ago +4

    When you type 1 vigintillion and you get a number like 1 vgnt 57 quaddec 857 tredec 959 duodec 942 undec 726 dec 969 non 827 oct 393 hep 378 hex 689 quin 175 quad 40 tril 438 bil 172 mil 647 tsnd 424

  • @ChefSalad
    @ChefSalad 1 year ago +1

    In the middle of the video, he mentions an "Inception" that you could do to have a "floating floating point" system, where you first determine how many bits go into the exponent and then determine where the binary point goes. This system actually exists. Sort of. People have invented it, although I'm not aware of any serious implementations. It's called "tapered floating point", and it's really neat.
    One of the more interesting parts about tapered floating point is a weird side effect of tapering and normalization. Normal IEEE floating point tailors its numbers around 0. That is, if you look at where all the possible numbers you can represent lie on a number line, the densest region centers on 0. Tapered floating point tailors its numbers around 1. Furthermore, it does it in a more precise way. With ordinary floating point, when you get close to zero, eventually you run out of accuracy and have to resort to denormal numbers, which lie evenly spaced on the number line. With tapered floating point, there are no denormals; instead you just make all of your bits exponent, make that exponent negative (exponent numbers are signed in tapered floating point) with the largest possible magnitude and then make the mantissa have no bits. This'll make your number 2^(exp) (remember that exp is negative), which is a very small number. The next smallest number ends up being 2^(exp+1), followed by 2^(exp+2), 2^(exp+3),...,2^(exp/2), 1.5*2^(exp/2), 2^(exp/2+1), etc. This means the spacing of the smallest numbers is exponential (in these cases, the numbers having a mantissa of NULL. Ooops, all exponent!)
    This is potentially really neat for science/math applications, but unfortunately there's no hardware support for doing math on these numbers, which makes them very slow. Also, I've never seen a software implementation of them either. I think that's because since storage space is so cheap, you're usually better off using arbitrary precision floating point numbers. That's where the total number of bits for the number isn't set ahead of time, but instead you fix the number of mantissa bits to whatever precision you desire, and then use as many bits as required for the exponent, using a special data structure designed just for this purpose. And, if you want the center of precision around a number other than 0, you just use an offset. Look at the C/C++ library mpir for an implementation of that.

  • @skyshoesmith6098
    @skyshoesmith6098 1 year ago

    This is better than the explanation given in my first year of my computer science degree, with one important omission:
    If two approximations are very good, subtracting one from the other might STILL yield a massively disproportionate difference. For example, a quadrillion-and-one minus a quadrillion is one. But single-precision floating point would return 0. So that's wrong by a factor of NaN.
    If you don't want a factor of NaN in your multiplications/divisions, this is a (real world!) problem.
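
    A short Python sketch of that subtraction, assuming the values are held in 32-bit floats. `to_f32` is an illustrative helper that rounds a double to single precision. Doubles happen to have enough bits for this particular case.

```python
import struct

def to_f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

quadrillion = to_f32(1e15)
quadrillion_and_one = to_f32(1e15 + 1)  # rounds to the same float32 value

single_diff = quadrillion_and_one - quadrillion
double_diff = (1e15 + 1) - 1e15

print(single_diff, double_diff)
```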

  • @aaronspeedy7780
    @aaronspeedy7780 2 years ago +7

    0:10 Wow, I didn't realize that non-programmers really only hear about floats in the context of precision errors

    • @accuratejaney8140
      @accuratejaney8140 2 years ago +2

      Yeah, I first heard about floating point numbers in reference to the fact that if you go far enough in Minecraft Bedrock, you can fall through the world (Java doesn't have the problem in the allowed +30 million to -30 million playspace because it uses doubles), and I next heard them in reference to Super Mario 64's "parallel universes" bug and how, if you go far enough, you can't enter certain parallel universes because they're not near enough to a floating point number.

  • @Skyb0rg
    @Skyb0rg 2 years ago +5

    I think another important part of the philosophy behind NaN and Infinity being standard things is for detecting where errors happen. Having 1/0 be “the largest number in the system” may cause you to run your program, and get back a meaningful, but wrong number. Getting back NaN or Infinity signals to the programmer they made a mistake.

    • @iantaakalla8180
      @iantaakalla8180 2 years ago

      But what if you wanted to calculate a thing that was effectively infinity, and infinity was the correct answer? What number system would you use then, since the floating-point system basically says infinity is a number too big to calculate? Or am I asking a bad question?
      Like, for example, the evaluation of limits?

    • @rsa5991
      @rsa5991 2 years ago +2

      @@iantaakalla8180 Some floating point procedures accept infinity as a valid value. For example, in many math libraries, arctangent of +infinity = pi/2.

    • @Skyb0rg
      @Skyb0rg 2 years ago +1

      @@iantaakalla8180 The getting infinity as a result would not be an error condition in that use case

  • @ManOfDuck
    @ManOfDuck 2 years ago

    Great video. I love how you broke it down in a way that didn't just explain what it did, but also why it did it, coming from the perspective of the person inventing it. Presenting it like that makes it seem way more intuitive and allows a deeper understanding, rather than just memorization. Solid stuff!

  • @DanDart
    @DanDart 1 year ago +2

    Oh, thank you! I was getting really irritated with the philosophy, and now I finally understand what IEEE 754 was going for. I was getting bogged down in the real-ness and when you explained approximations, how -0, inf, NaN, etc could be a thing because it's just a range/not a real number, it made sense finally.

  • @downwardtumble4451
    @downwardtumble4451 2 years ago +4

    0 is positive and 1 is negative because of how signed integers work. If you were to subtract 1 from 00000000, it would underflow and get you 11111111, which is -1.
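
    That wraparound in a few lines of Python, assuming an 8-bit register. Python integers are unbounded, so the `& 0xFF` mask stands in for the hardware dropping bits above the eighth.

```python
wrapped = (0 - 1) & 0xFF  # subtract 1 from 00000000 and keep only 8 bits

# Read the same eight bits back as a signed two's-complement byte.
as_signed = int.from_bytes(bytes([wrapped]), "big", signed=True)

print(format(wrapped, "08b"), as_signed)
```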

  • @rGunti
    @rGunti 2 years ago +4

    Fun fact: When parsing a text (aka a string) to a number, some systems will output NaN if the input cannot be parsed to a number.
    For example in JavaScript, Number.parseFloat('dQw4w9WgXcQ') returns NaN.

    • @clarise-lyrasmith3
      @clarise-lyrasmith3 2 years ago +3

      you did NOT just hide the rick roll link as a string, I'm glad that the meme is still around :D

    • @Brivalia
      @Brivalia 1 year ago

      Do you have that committed to memory or did you copy it

    • @clarise-lyrasmith3
      @clarise-lyrasmith3 1 year ago

      @@Brivalia I have seen the link so many times that I shudder when a youtube link begins with "dQw"

  • @gingganggoolie
    @gingganggoolie 2 years ago

    I love this format, and that you use it so much. Like explaining the history of a letter. Knowing WHY something is the way it is is one of the most effective ways I know to remember something long term

  • @tanyaomrit1616
    @tanyaomrit1616 2 years ago

    jan Misali: These five bits for where the point should go allow us to do something very clever.
    Me, listening to this in the background for the third time, processing everything kinda on autopilot, and also having seen the Lidepla video: Uh oh that can't be good

  • @vgtcross
    @vgtcross 2 years ago +7

    2:30 got me laughing real hard😂

  • @ALUMOX
    @ALUMOX 2 years ago +4

    nanpa?????!?!???!??!??!?!???!?!

  • @GameTornado01
    @GameTornado01 2 years ago +1

    After I had to convert numbers into floating point format and back by hand in an exam this year, I feel really superior watching this.

  • @Benny_Blue
    @Benny_Blue 2 years ago +2

    1) I really liked your choice for the last second of this video!
    2) I wish I had this 6 months ago, before I took the Computer Science course that taught me all this.
    3) I feel like you did Two’s Complement a small disservice with the info on screen at 5:27 (the first bullet). I get *why* you did it that way - this is a video about floating point, after all. But for a one-sentence summary, it just feels _unworkable_ - especially when compared to the other two accompanying it. I don’t know what a better one would be, but it feels like for what you have, such a key (and cool!) concept for the corresponding integer standard needs to at least be *named*.
    But regardless of that, this is a GREAT video. It took my Professor three classes and two weeks to get to the end of what you covered completely in 17 minutes. You should be proud!

  • @mattguy1773
    @mattguy1773 2 years ago +4

    This would have helped me succeed computer science

  • @paper2222
    @paper2222 2 years ago +3

    nanpa li ike aaaa

  • @bristlebrick
    @bristlebrick 2 years ago +1

    Just the other day I found myself thinking "I wonder how floating point works. I should like, do some research or something."
    So this was a really nice video to have pop up in my feed.
    Thank you!

  • @turingsghost
    @turingsghost 1 year ago +1

    you are my favorite youtuber because your next video is pretty much always about a niche phenomenon in a random very specific field and somehow it's always something that i've also noticed and that i want to learn more about. or sometimes wario ware which is also fantastic

  • @guiAstorDunc
    @guiAstorDunc 1 year ago +4

    5:43 yeah this is kinda binary in a nutshell from my understanding
    Everyone just agreed that’s how it should be and it’s too late to change it now

  • @NoNameAtAll2
    @NoNameAtAll2 2 years ago +3

    I wonder if this video will mention that f32 was made with that amount of exponent bits to fit Avogadro's number (big) and the gravitational constant (small)

    • @NoNameAtAll2
      @NoNameAtAll2 2 years ago +1

      or maybe not gravitational, but "h" quantum constant that I forgot the name of

    • @FreeFireFull
      @FreeFireFull 2 years ago +3

      @@NoNameAtAll2 Planck constant

    • @anatali121
      @anatali121 2 years ago +5

      @@ThomasTheThermonuclearBomb I think the gravitational constant is 6.6743 × 10^-11; 9.8 m/s^2 is what you get when you plug the gravitational constant in with Earth's mass and radius, though!

    • @Sylocat
      @Sylocat 2 years ago +1

      Ah, so that's why. I always wondered why they didn't use 7-bit exponents instead.

    • @killerbee.13
      @killerbee.13 2 years ago

      @@ThomasTheThermonuclearBomb That's called "acceleration due to gravity" (average at approximately sea level), or lowercase g, and it's not actually a constant, it technically depends on what part of the world you're on, but the difference is too small to care unless you manufacture scales. The gravitational constant is a fundamental constant used in general relativity, and it's spelled capital G.
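
      The range claim in this thread can be checked with a rough sketch. It only tests whether the two constants fall inside the normal range for a given exponent width; it says nothing about whether that was the actual design rationale. The helper names `to_f32` and `normal_range` are made up for illustration.

```python
import struct

AVOGADRO = 6.02214076e23   # Avogadro constant, 1/mol
PLANCK = 6.62607015e-34    # Planck constant, J*s

def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def normal_range(exponent_bits):
    """Return (smallest normal magnitude, exclusive upper bound) for an
    IEEE-style binary format with the given number of exponent bits."""
    bias = 2 ** (exponent_bits - 1) - 1
    smallest_normal = 2.0 ** (1 - bias)
    upper_bound = 2.0 ** (bias + 1)   # every finite value is below this
    return smallest_normal, upper_bound

for bits in (7, 8):
    lo, hi = normal_range(bits)
    fits = lo <= PLANCK and AVOGADRO < hi
    print(bits, "exponent bits:", lo, "to just under", hi, "| both fit:", fits)

# Both constants survive a round trip through binary32 with small relative error.
print(to_f32(AVOGADRO), to_f32(PLANCK))
```

      With 8 exponent bits both constants are normal numbers; with a hypothetical 7-bit exponent neither would be, since the normal range would only run from about 2^-62 to just under 2^64.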

  • @gauravbyte3236
    @gauravbyte3236 1 year ago +1

    keeping this as a reminder to come back, as I didn't understand it in one go

  • @Silas_MN
    @Silas_MN 2 years ago

    I learned about a lot of this stuff from one of my college classes I took last term, but I still came away from this with a better understanding of the subject than I had before. great work!

  • @ingwerschorle_
    @ingwerschorle_ 2 years ago +6

    numbers aaahh

    • @kijete
      @kijete 2 years ago +2

      you were first

    • @ingwerschorle_
      @ingwerschorle_ 2 years ago +2

      @@kijete and i will not make a big deal of it

  • @potato_nuggetz6675
    @potato_nuggetz6675 2 years ago +3

    at 8:03 at the bottom right it says 5.88x10^039 instead of 5.88x10^-39 which is quite humorous i think

  • @Zoltorion
    @Zoltorion 2 years ago

    I just sat up last night thinking about and looking at the wiki page for floating point numbers in the middle of the night and so now it's blowing my mind that you released this video at basically the same time that was happening. Crazy coincidence and great video as always.

  • @lifthras11r
    @lifthras11r 2 years ago +1

    I've seen far too many explanations about floating point numbers and I think this video is the most balanced one. There are for example too many guides that start from 0.1 + 0.2 = 0.30000000000000004 (this is a property of _any_ binary number system, not just floating point or IEEE 754 in particular). In addition this gets subnormal numbers and negative zero pretty much correct, something that is frequently missing or sometimes wrong in most other guides. Frankly, as a patron and FP semi-expert [1] I got to see early drafts of this video for comments and I had nothing to add on. Great.
    [1] I'm the original author of a float-to-decimal conversion algorithm in the Rust standard library.
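
    The 0.1 + 0.2 point is easy to reproduce. A minimal sketch, assuming Python, whose `float` is IEEE 754 double precision (binary64) rather than the single-precision format from the video:

```python
from decimal import Decimal
from fractions import Fraction

total = 0.1 + 0.2

# Neither 0.1 nor 0.2 has a finite binary expansion, so both are rounded
# before the addition even happens.
print(total == 0.3)   # False
print(repr(total))    # 0.30000000000000004

# Decimal(float) shows the exact value that is actually stored for 0.1.
print(Decimal(0.1))

# The stored value is a fraction whose denominator is a power of two,
# which is why this is a property of binary formats in general.
stored = Fraction(0.1)
print(stored.denominator == 2 ** 55)   # True
```

    The same rounding happens in single precision; only the number of digits that survive changes.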

  • @Vaaaaadim
    @Vaaaaadim 2 years ago +4

    Great video! It's nice to see a new perspective on a topic such as this.
    I had not personally scrutinized the floating point format spec extremely closely. I knew how the bits were assigned to the sign bit, exponent, and mantissa, and that's pretty much it. I didn't know exactly how infinities, NaNs, +0 and -0 were specified, and I wasn't aware of subnormals at all.
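
    For anyone else who only knew the three fields, here is a small sketch of how the special cases fall out of them in single precision. The function name `decode_f32` is made up for illustration; the 1/8/23 bit split is the standard binary32 layout.

```python
import struct

def decode_f32(x):
    """Split the binary32 encoding of x into (sign, exponent, mantissa, kind)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, stored with a bias of 127
    mantissa = bits & 0x7FFFFF       # 23 bits
    if exponent == 0xFF:
        # All-ones exponent is reserved for infinities and NaNs.
        kind = "infinity" if mantissa == 0 else "NaN"
    elif exponent == 0:
        # All-zeros exponent is reserved for zeros and subnormals.
        kind = "zero" if mantissa == 0 else "subnormal"
    else:
        kind = "normal"
    return sign, exponent, mantissa, kind

for value in (1.0, -0.0, 1e-40, float("inf"), float("nan")):
    print(value, decode_f32(value))
```

    Note that +0 and -0 differ only in the sign bit, and that 1e-40 is below the smallest normal single-precision value, so it is stored as a subnormal.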
    Another thing that you might find interesting to look into is how computers do arithmetic (addition, subtraction, multiplication, division) at a logic gate level and/or software level. Counter-intuitively, doing multiplication fast is not straightforward and has quite a bit of research behind it.
    This video showcases the first multiplication algorithm that is better than the standard grade school method of multiplication: ua-cam.com/video/cCKOl5li6YM/v-deo.html
    And you can find actual usages of Karatsuba multiplication and Toom-Cook multiplication in the OpenJDK implementation of Java's BigInteger class.
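
    To give a feel for the Karatsuba idea, here is a toy sketch on Python integers. It is not how OpenJDK implements it; the cutoff of 16 and the bit-based split are arbitrary choices for illustration, and it only handles non-negative inputs.

```python
def karatsuba(x, y):
    """Multiply two non-negative integers using three recursive
    multiplications per split instead of the schoolbook four."""
    if x < 16 or y < 16:
        return x * y   # small enough: fall back to the built-in multiply

    # Split both numbers at the same bit position.
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask

    low = karatsuba(x_lo, y_lo)
    high = karatsuba(x_hi, y_hi)
    # (x_lo + x_hi)(y_lo + y_hi) - low - high == x_lo*y_hi + x_hi*y_lo
    cross = karatsuba(x_lo + x_hi, y_lo + y_hi) - low - high

    return (high << (2 * half)) + (cross << half) + low

a = 12345678901234567890
b = 98765432109876543210
print(karatsuba(a, b) == a * b)   # True
```

    The saving comes from computing the cross term with one multiplication and two subtractions, which is what brings the cost below the quadratic schoolbook method for large inputs.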

  • @telotawa
    @telotawa 2 years ago +3

    nanpa mute

  • @lizzzylavender
    @lizzzylavender 2 years ago

    I love the variety of topics jan Misali covers, and I love that it's all the same topics I'm interested in (and in the same WAY I'm interested in them)

  • @cormanec210
    @cormanec210 1 year ago

    Jan misali is the reason we need the ability to subscribe to playlists.