This Function Destroys Programs: MS-BASIC's VAL()

  • Published 7 Jan 2025

COMMENTS • 410

  • @MichaelDoornbos 1 year ago +83

    That was great. I suspect it took quite a bit longer than 20 minutes to work that out. Even system creators in 1977 had to say “we better check for this, you just know some user is gonna try it”

    • @rocketman475 1 year ago

      Doornbos = Thornbush

    • @MichaelDoornbos 1 year ago

      @rocketman475 I'm aware.

    • @AndreasDelleske 1 year ago +8

      Error culture wasn't a thing back then: whether the code would still fit into ROM was the main consideration.

    • @jnharton 1 year ago +8

      @AndreasDelleske It's also worth noting that "crashing your computer" was less of a huge problem back then, especially when all you are doing is running one program at a time and everything else it relies on is in ROM.
      Still super annoying of course, but all you needed to do was hit reset to get back to a clean slate and the BASIC prompt.

    • @marisakirisame867 1 year ago

      Yes, I tried this too and it got corrupted.

  • @D0Samp 1 year ago +18

    This reminds me of PHP authentication bypasses where password hashes starting with "0e" were compared using the == operator, which recognized both sides as the numeric value 0 with an exponent.
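
The comparison described above can be modelled roughly in Python. This is a sketch under stated assumptions, not PHP's actual implementation: `php_loose_equals` is a hypothetical helper, the numeric-string pattern is simplified, and only the string-versus-string case is covered.

```python
import re

# Simplified pattern for a "numeric string" (assumption: no whitespace handling).
_NUMERIC = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")

def php_loose_equals(a: str, b: str) -> bool:
    """Model of a loose == between two strings (hypothetical helper)."""
    if _NUMERIC.fullmatch(a) and _NUMERIC.fullmatch(b):
        # Both sides look like numbers, so they are compared by value:
        # "0e123" and "0e456" both mean zero times a power of ten.
        return float(a) == float(b)
    return a == b
```

Two different hashes that both happen to look like `0e` followed only by digits compare equal in this model, which is why a strict, type-aware comparison is the usual fix.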

  • @stevethepocket 1 year ago +56

    Something interesting I finally realized, when you showed a reversed "@" being used to represent the ASCII 0 null character in the memory dump: The reversed symbols used by BASIC's quotes-mode to indicate control characters weren't chosen at random; they're derived by taking the PETSCII code, looking up the _screen_ code for that number, and adding either 128 (if it's low enough) or 64 (if it's not). This happens to work because all of the control codes are either under 32, or between 128 and 192. So you get normal ASCII characters for the former and key-front symbols for the latter. Clever of Super Snapshot's monitor to extend that scheme to codes that can't be typed.
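
The rule in this comment can be written as a tiny Python function. It is a sketch only: `quote_mode_screen_code` is a hypothetical name, and the upper bound of 160 for the second range is an assumption (the comment says "between 128 and 192").

```python
def quote_mode_screen_code(petscii: int) -> int:
    """Screen code of the reversed glyph shown for a control code in quote mode."""
    if 0 <= petscii < 32:
        return petscii + 128   # reversed "@", letters, and so on
    if 128 <= petscii < 160:   # assumed extent of the upper control-code range
        return petscii + 64    # reversed key-front graphics symbols
    raise ValueError("not a control code in this simplified model")
```

For example, code 0 maps to screen code 128 (the reversed "@" seen in the memory dump), and code 147 (clear screen) maps to 211.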

  • @PaoloBergamo 1 year ago +6

    Where was YouTube 40 years ago when I needed it?

  • @jasejj 1 year ago +16

    The best implementation of the VAL statement I've seen is in Sinclair Spectrum BASIC. You can put full mathematical formulae into the string, and it will evaluate it as the program is run. So something like:
    10 INPUT A$
    20 FOR X=0 TO 256: PLOT X, VAL A$: NEXT X
    And enter say X*2, or even ((SIN X)*80)+80 (to centre the plot on the screen vertically and expand it so it fills the screen), this simple program will plot any mathematical function on the screen. Obviously there's no sanity checking and it will return errors for some graphs, but it's an insanely powerful tool which I don't believe was ever documented by Sinclair either. I believe what is happening is that Sinclair is re-using the BASIC interpreter's line parser to implement the VAL statement itself.
    This bug of course does not exist in ZX BASIC as the code is unrelated. Regarding the VIC's version of this bug, I wonder if it behaves any differently with an 8k memory expansion as the memory map changes.
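
The idea of handing a string to the interpreter's own expression evaluator can be sketched in Python. This is only an analogy: `sinclair_style_val` and `plot_points` are hypothetical names, Python's `eval` stands in for the BASIC expression parser, and functions need parentheses here (`SIN(X)` rather than `SIN X`).

```python
import math

def sinclair_style_val(expr: str, variables: dict) -> float:
    """Evaluate an arithmetic expression string against named variables."""
    allowed = {"SIN": math.sin, "COS": math.cos, "SQR": math.sqrt, "PI": math.pi}
    # eval with no builtins plays the role of re-entering the expression parser.
    return eval(expr, {"__builtins__": {}}, {**allowed, **variables})

def plot_points(expr: str, xs=range(256)):
    """Return (x, y) pairs; a real program would PLOT each pair instead."""
    return [(x, sinclair_style_val(expr, {"X": x})) for x in xs]
```

As in the BASIC version, nothing checks whether the result fits on the screen, so the caller has to deal with out-of-range or undefined values.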

    • @SalivatingSteve 1 year ago +4

      Reminds me how in C++ you can feed system terminal command strings directly into cin.

    • @NuntiusLegis 1 year ago +1

      On the C64 and other CBM machines, you can use programmed direct mode to enter formulae at run time.

    • @marksilverman 1 year ago +1

      @SalivatingSteve Fun fact: you can also overflow memory in C++ 😅

    • @CartoType 1 year ago

      I used the Sinclair VAL function to implement the cell recalculation for a spreadsheet program I wrote and sold back in 1982. I was very lucky and made enough in royalties to pay the deposit for my first flat.

    • @silkwesir1444 11 months ago

      @NuntiusLegis Well it's not really run-time, is it? It's a clever cheat (using invisible text on the screen and poking stuff into the keyboard buffer manually), where it seems you are still in the program but actually you have stepped out for a moment. And if the user doesn't play nice with it, it all goes kaput.

  • @mudi2000a 1 year ago +18

    Very interesting! As soon as you read from the book how the VAL function works I guessed correctly how the bug happens.
    I think this channel really offers the best in depth videos about the old Commodore machines, you have such a deep knowledge and also very good explaining skills!

  • @GeoffSeeley 1 year ago +8

    And this is why we build unit tests now. Thanks Robin!

  • @Dwedit 1 year ago +13

    On an unrelated note, VAL is also broken if you run QBasic in MSVC builds of DosBox. Asking for VAL("5") gives you 4.99999... instead of 5. MSVC (Microsoft Visual C++) 64-bit compilers do not support 80-bit floating point numbers, and do not support inline-assembly that would use the legacy x87 floating point instructions that support the 80-bit numbers.

    • @williamdrum9899 1 year ago

      Why are they using floats lol

    • @neilscales 1 year ago

      @williamdrum9899 AFAIK all values in MS BASIC are floats internally.

    • @williamdrum9899 1 year ago +1

      @neilscales Ewww

    • @neilscales 1 year ago +1

      @williamdrum9899 When you only have a few hundred bytes (not kilobytes) to implement floating point mathematics (including sin/cos/tan/sqrt etc.), on an 8-bit CPU that can only add or subtract, you don't have the luxury of separate routines for integers. I think people forget how clever Bill Gates and Steve Wozniak were at fitting complicated code into tiny spaces.

    • @williamdrum9899 1 year ago

      @neilscales Fair point. I guess it was make everything a float or don't support them at all

  • @GadgetUK164 1 year ago +8

    Very interesting bug! Always a joy to watch your deep dives on the C64 and VIC-20 =D

  • @retrozmachine1189 1 year ago +11

    Bug is present in the L2 ROM BASIC in TRS-80 MI/III. Some 40 years on from owning these machines and I'm still finding things out about them. Not that I really did a lot with BASIC back then.

  • @retrozmachine1189 1 year ago +19

    It's not the only bit of oddness that Microsoft did in BASIC. I have a vague memory of Z80 instruction reuse in the TRS-80's L2 ROMs. Pass through the ROM via one path and you get a particular instruction but another part of the ROM jumps into the same location + 1, halfway through the same instruction sequence, which decodes to a different instruction. It's probably present in any Z80 (and perhaps 8080) MS BASIC of the same version.

    • @danielmewes 1 year ago +2

      The MS 4k 8080 Basic has such a behavior as a bug when the memory check overflows.
      It overflows if you run it on a machine with the full 64KB of RAM. It causes it to go into a loop where every second time, it executes the loop with a one-byte offset that alters the meaning of the instructions, and causes it to jump differently to get back to the correct offset again next time, going back and forth between the two ways to interpret the same code.

    • @melkiorwiseman5234 1 year ago +6

      This was done in particular for the error printing routine. The A register needed to be loaded with the error number and all other registers would be ignored. Using a LD A,#nn instruction followed by a JR $nn instruction to skip the other LD A instructions would take 4 bytes per error message. The TRS-80 BASIC reduced that to 3 bytes by "hiding" the 2-byte LD A,#nn instruction inside the data for a LD HL,#nnnn instruction. Jumping to a particular LD A instruction would load the appropriate error number into A and then the CPU would do all of the following LD HL instructions, which would have no effect on A, before hitting the routine to actually print the error message. I called that "instruction hiding".
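
The "instruction hiding" layout described above can be traced with a toy decoder in Python. This is a sketch, not the TRS-80 ROM: `run_from` and `ERROR_TABLE` are hypothetical, only the two opcodes named in the comment are decoded (0x3E for LD A,n and 0x21 for LD HL,nn), and a RET byte stands in for the shared error-printing routine.

```python
def run_from(code: bytes, pc: int):
    """Decode from pc and return (final A, trace); handles only LD A,n and LD HL,nn."""
    a = None
    trace = []
    while pc < len(code):
        op = code[pc]
        if op == 0x3E:            # LD A,n: 2 bytes
            a = code[pc + 1]
            trace.append(f"LD A,{a}")
            pc += 2
        elif op == 0x21:          # LD HL,nn: 3 bytes, swallowing the next LD A,n
            trace.append("LD HL,nn")
            pc += 3
        else:                     # anything else: we have reached the shared handler
            break
    return a, trace

# Entry points at offsets 0, 3 and 6; 0xC9 (RET) stands in for the handler.
ERROR_TABLE = bytes([0x3E, 1, 0x21, 0x3E, 2, 0x21, 0x3E, 3, 0xC9])
```

Entering at offset 0, 3 or 6 leaves 1, 2 or 3 in A respectively; every later `LD A` is consumed as the operand of an `LD HL`, so each extra error costs three bytes, as the comment says.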

    • @damouze 1 year ago +3

      There are quite a few of those in the MSX BIOS and BASIC ROMs as well.

    • @flatfingertuning727 1 year ago +1

      @melkiorwiseman5234 A really nice technique on the 8080/Z80 which was used in the C128's ROM but not that of the C64 was placing arguments for a called routine immediately after the CALL (or JSR) instruction. Indeed, the 8080/Z80 has an instruction that feels like it was designed for that purpose, simultaneously reading the stacked PC into HL while storing the old HL to the stack. A function can thus read data from HL, incrementing HL after each byte, and then restack HL while retrieving its old value.

    • @c128stuff 1 year ago +1

      @melkiorwiseman5234 A very similar approach is used in the Commodore KERNAL for I/O and file-related errors to save one byte per error.
      You do a JMP to the right LDA # instruction, which is followed by a BIT instruction which has the next LDA instruction as its operand, effectively skipping it.

  • @jack002tuber 1 year ago +22

    Fascinating. This shows the structure of BASIC in memory. There's a line number in there, there's a pointer to the next line, there's special data for the last line. Neat stuff. Type a program, go into the ML monitor and look around. It would be simple to store an ML routine right in BASIC to use, provided no one goes in and changes the program afterward.
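
The in-memory layout mentioned here (next-line pointer, line number, tokenized text, zero terminator, and a zero pointer after the last line) can be walked with a short Python sketch. `walk_basic_lines`, its `base` parameter and the `PROGRAM` bytes are hypothetical illustrations; the example assumes a C64-style start address of $0801 and the PRINT token $99.

```python
import struct

def walk_basic_lines(mem: bytes, start: int, base: int):
    """Yield (line_number, body_bytes) by following next-line pointers.

    base is the machine address that mem[0] corresponds to.
    """
    addr = start
    while True:
        (next_addr,) = struct.unpack_from("<H", mem, addr - base)
        if next_addr == 0:                    # a 00 00 link marks the end of the program
            return
        (line_no,) = struct.unpack_from("<H", mem, addr - base + 2)
        body_start = addr - base + 4
        body_end = mem.index(0, body_start)   # line text ends at the first zero byte
        yield line_no, mem[body_start:body_end]
        addr = next_addr

# 10 PRINT "HI" stored at $0801, followed by the end-of-program marker.
PROGRAM = bytes([0x0B, 0x08, 0x0A, 0x00, 0x99, 0x22, 0x48, 0x49, 0x22, 0x00,
                 0x00, 0x00])
```

Because the text of each line is taken to end at the first zero byte, a stray zero written into the middle of a line makes this walker cut the line short, which is the kind of corruption the video is about.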

    • @renakunisaki 1 year ago +4

      I was imagining hiding a copyright string...

    • @whitslack 1 year ago +5

      Another of Robin's recent videos explicitly described the structure of BASIC lines in memory. I think it was the one about why the number of BASIC bytes free shown in the startup banner is exactly the number it is.

    • @skilletpan5674 1 year ago +1

      Yes, as a 10 or 11 year old I once wrote a simple Applesoft parser. You could usually find a token list or table that had the number of each token (keyword), and with a little more research or messing around you could find the starting address of BASIC and start to work out how to decompile it.
      Sometimes you'd want to move the address of your BASIC program, for things like copy protection, or maybe because you wanted some kind of disk loader.

    • @tramadol42 1 year ago +2

      That's how we used to put ML routines into BASIC programs (mostly in REM lines)... Ahh, memories...

    • @EyMannMachHin 1 year ago +1

      I do remember a math function plotter in Commodore Basic on the VIC-20, that let you enter any function (eg. x^2+4x) and would use character set manipulation to actually plot the function like it's hires graphics. Instead of you having to change the line where the function was in the code and you having to rerun the program every time you changed the function, you would simply input the function as a string. It had one REM line with the maximum characters a line could have and it would parse your input and put it into that space as code to run in the plot loop. Really nifty thinking there.

  • @tabachanker8716 1 year ago +10

    Super interesting! Never knew this bug existed before. Two things I tried after this video on a C64:
    5 a$="1e39" and 10 ? val(a$):rem bug?. I thought there would be no bug, since the val() routine should use the variable a$ in memory, right? Nope, the bug appears on line 5 now! The val() routine uses the string defined on line 5!
    Then I tried to build a string, so I changed line 5 to: 5 a$="1"+"e39". Now val() uses the string stored in memory and the bug doesn't appear on any BASIC line.
    This tells me that C64 BASIC may store string constants with a pointer directly to the line where the definition is provided. Only when the string is built does it use memory space in the string heap.

    • @melkiorwiseman5234 1 year ago +4

      All versions of BASIC that I've used store strings in this way. If it's a string literal in the program, the variable pointer contains the length of the string plus the address where the string literal is stored inside the program.

    • @flatfingertuning727 1 year ago

      @melkiorwiseman5234 The C128 BASIC uses one 64K bank of memory for holding the user program, and the other for variables and strings. String literals always have to be copied to the second bank.

  • @countzer0408 1 year ago +2

    What happens if you RUN the program again after the overflow error? Do you get a different error?

  • @atomcode 11 months ago

    Dear Robin! I really appreciate your videos! Many thanks for your efforts in presenting things in such detail. It is always interesting. 👍

  • @PeranMe 1 year ago

    Thanks Robin, another great video! Thank you for taking the time to dig into these mysteries!

  • @HelloKittyFanMan 1 year ago +2

    Wow, that's a pretty weird bug that I had no idea was running around inside Commodore 64s! And then even weirder that basic BASIC (ha!) is such a direct port (but with a few proprietary things modded in by the computer brands) that not only are ports in different models of Commodore affected, but in different brands using the 6502 variants, but EVEN in systems using whole different styles of CPU (instruction sets, etc.) like the Zee-80, and it's been overlooked in so many cases until we got to some level of consumer IBM. So of course it's gonna be in Altair BASIC as observed via terminal, too.

  • @AiOinc1 1 year ago +9

    Interesting, this is also the cause of a stack leak I suspect. Does the character ever get pulled back off the stack? Do this enough times and it might overflow the stack.

    • @8_Bit 1 year ago +7

      I just experimented with that now and it seems there's no stack leak, so it must be getting cleaned up somehow.

    • @gcewing 1 year ago

      The 6502 stack wraps around in a 256 byte area, so even if there was a leak you might not notice. But it's likely that the stack pointer gets reset whenever BASIC bails out due to an error anyway.
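
The wrap-around behaviour mentioned here can be modelled in a few lines of Python. `Stack6502` is a hypothetical toy class: it models only the 8-bit stack pointer and the 256-byte stack page, not the rest of the CPU.

```python
class Stack6502:
    """Stack page model: a push stores at SP, then SP decrements modulo 256."""

    def __init__(self):
        self.page = bytearray(256)
        self.sp = 0xFF

    def push(self, value: int) -> None:
        self.page[self.sp] = value & 0xFF
        self.sp = (self.sp - 1) & 0xFF   # wraps from $00 back to $FF

    def pull(self) -> int:
        self.sp = (self.sp + 1) & 0xFF
        return self.page[self.sp]
```

Pushing more than 256 bytes never faults in this model; it silently overwrites the oldest entries, which is why a slow leak could go unnoticed for a while.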

    • @lostwizard 1 year ago

      On the 6809 version at least, the stack is completely reset on error. That clears any GOSUB/RETURN frames, FOR loop records, expression evaluation intermediate results, and anything else something has stashed on the stack.

  • @ChrisCromwellHP 1 year ago +1

    I wonder if this bug was corrected in the Tandy COCO 3 Color Basic? 🤔

  • @subethasoftware 1 year ago

    I was getting some Deja Vu here, thinking “didn’t Robin already post a video about this?” I had to see what prompted me to try it on the CoCo - your Twitter post. Nice to see a video!

  • @Green_House 1 year ago +1

    The Val function stops reading the string at the first character that it can't recognize as part of a number.
    Symbols and characters that are often considered parts of numeric values, such as dollar signs and commas, are not recognized.

  • @rager-69 1 year ago

    I don't know if you have any control over where ads are placed, but sometimes YT videos pop up at magical places. For example, in this video when you ran the overflow example at 14:48, it went to an ad, as if the bug is so powerful it forced an ad. It made me chuckle.

  • @RedPillRachel 1 year ago

    In Amstrad CPC's Locomotive BASIC, it does exactly this, as an integer or float. A similar function, ASC, returned the ASCII code of a single char.

  • @damouze 1 year ago +1

    I checked it in OpenMSX and the bug does not seem to affect MSX BASIC 1.0, which is from 1983, either. It also did not give an overflow error, so I tried it with 1E99, which did give an overflow error, but indeed did not corrupt the BASIC program.
    It occurs to me that this is a very dirty way of putting a machine code payload into a BASIC program. I'm not sure if BASIC will save the payload to disk/cassette or load it from disk/cassette when requested, but suppose it will. Then the payload would be there, but it would be invisible to the user, obfuscated by that stray NUL character.

  • @codahighland 1 year ago +2

    I have to wonder why the substitution was even necessary. A " character, being non-numeric non-whitespace, would have terminated the routine anyway.

    • @Lord-Sméagol 1 year ago

      I just fixed the bug in Nascom ROM BASIC 4.7 by having VAL save the byte and address so that an Overflow Error intercept can repair it. But seeing your comment, I just tried stopping it clearing the " ... and initial tests show that it is behaving properly ... another conditional assembly option to add to my source code :)

    • @Lord-Sméagol 1 year ago

      Update: some more testing: temporary strings don't appear to have termination bytes:
      I just tried A$="12":B$="34":? VAL(B$) ... and got 3412 !!!
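
The 3412 result is what you would expect if strings are packed back to back in a downward-growing string space and the parser is not stopped by a terminator. A minimal Python sketch of that assumption (`StringHeap` and `unbounded_digits` are hypothetical names, not the Nascom ROM's routines):

```python
class StringHeap:
    """Toy string space that grows downward, with no terminator between strings."""

    def __init__(self, size: int = 32):
        self.mem = bytearray(size)
        self.top = size                  # the next string is stored just below this

    def store(self, text: str):
        data = text.encode("ascii")
        self.top -= len(data)
        self.mem[self.top:self.top + len(data)] = data
        return self.top, len(data)       # descriptor: (address, length)

def unbounded_digits(mem, addr: int) -> str:
    """Read digits from addr until a non-digit, ignoring the descriptor's length."""
    end = addr
    while end < len(mem) and chr(mem[end]).isdigit():
        end += 1
    return bytes(mem[addr:end]).decode("ascii")
```

Storing "12" and then "34" places "34" immediately below "12", so a scan that starts at the second string runs straight on into the first one.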

  • @JamesJones-zt2yx 1 year ago +1

    Had to try BASIC09. It printed 0 for VAL("1e39"), not causing an error--which would concern me in serious numeric code. The program doesn't get corrupted.

  • @daniellomblock6216 1 year ago

    Love your videos - so much detail and background info. Subscribed!

    • @8_Bit 1 year ago

      Thank you!

  • @tedthrasher9433 1 year ago +1

    I can confirm that the bug exists in the AppleSoft basic that is in rom on my ROM01 Apple IIGS (1986) and in ProDOS BASIC 1.5 (1992). On an Apple IIGS this corrupts all kinds of things. When the shutdown command is issued from GS/OS after testing for this bug in AppleSoft, it causes an overflow error on the smart port and the computer completely hangs. Even a soft reset doesn’t work. GSoft BASIC (1999) does not have the bug, but trying to print A returns “inf.” There is no overflow error when the VAL statement is evaluated in GSoft BASIC.

  • @davidellsworth4203 1 year ago +2

    Are there any platforms on which the bug is fixed (by undoing the NUL-terminator overwrite upon an overflow error, instead of copying the string), but pressing the equivalent of Ctrl-Break with perfect timing can still result in program corruption, by breaking out of the machine language routine that handles VAL (a race condition)?
    [Edit: I retract the following as it has been brought to my attention that that solution would only work for string literals and not for strings in the heap.] It's strange that they used a NUL termination overwrite at all, though. Wouldn't the character it overwrites always be a non-numeric character anyway (a quotation mark) which terminates VAL's evaluation? It should be possible to fix the bug merely by getting rid of that overwrite, and still operating on the string in situ.

    • @NuntiusLegis 1 year ago +1

      If I understand it correctly: Only if a string literal in the code is evaluated, but VAL can also process string variables where the string data is in string memory without quotation marks or other terminating characters, and then the next string can start with a numeric character.

    • @WY.C64-Guy 1 year ago

      You mean like:
      10 a$="4321"+chr$(34)+"9876"
      20 print val(a$)
      ?

  • @RaquelFoster 1 year ago +1

    Nice work! The part that seems crazy is that somebody actually wrote a book which describes the BASIC interpreter in plain English with literal per-instruction granularity!

  • @isaactanner6403 1 year ago +1

    Hi all! I tested it in WebMSX and it worked fine. No bugs and no errors; the value of the variable A is printed as "1E+39", exactly this way.
    Scientific notation for the biggest numbers. I have the real machine and will try it next week!
    My MSX is a Brazilian Gradiente MSX 1, upgraded to MSX 2 with a 256k mapper (64 resident).

  • @63801170 1 year ago +1

    The Commander X16 modern computer uses C64 BASIC v2 (with extensions) and also has the bug included.

    • @chromosundrift 1 year ago

      I wonder if their license agreement forbids them from fixing it!

    • @flatfingertuning727 1 year ago

      If license agreements weren't a problem, there are many places where a few tweaks to the BASIC interpreter could massively improve performance without breaking compatibility with any code that doesn't rely upon the addresses of routines in memory. The floating-point shift-right loop is sometimes used on positive values in FAC1, sometimes used on positive values in FAC2, sometimes used on potentially-negative up-to-32-bit values in INT, and sometimes used on potentially-negative values in cases requiring a 16-bit result. Having separate routines to handle these different uses might increase the ROM size by a few dozen bytes, but enormously improve performance. When executing "FOR I=128TO255:NEXT", the existing interpreter spends about a quarter of the overall execution time in one of the aforementioned shift loops, and when processing PEEK and POKE statements even more time is spent in those loops. Having special 16-bit versions of those loops could probably more than double the performance of something like "FOR I=1024 TO 2023:POKE I,PEEK(I+D):NEXT", without having to make any major changes to the overall design of the interpreter.

  • @gcewing 1 year ago +2

    At least it's not as egregious as the arithmetic bug in the BASIC interpreter I wrote for my kit-built Z80 system, whereby adding any negative power of 2 to itself gave 0. It was a surprisingly long time before I noticed it, and it didn't happen to cause a problem in any of the programs I wrote, so I never got around to fixing it. (The whole thing was hand-written and hand-assembled on the back of old line printer paper, so making any changes to it was rather a pain!)

  • @rotordave81 1 year ago +2

    Merry Christmas Robin. I'm looking forward to my annual watching of your C64 Christmas video come Monday :)
    Thanks for another year of knowledge, fun and pedantry (sorry, I repeat myself).
    (This video's title is a little clickbaity, surely it *can* destroy programs?)

    • @8_Bit 1 year ago +1

      Thanks Dave :) Does the title seem dishonest? VAL() can, has, and does destroy programs as demonstrated in this video. Does it have to destroy programs every time it's used to make the qualifier unnecessary? Hmmmm.

    • @8_Bit 1 year ago +2

      Now that I made the thumbnail does it seem better? "This Function (When Used In The Way Shown In The Thumbnail) Destroys Programs: MS-BASIC's VAL()"

    • @8_Bit 1 year ago

      You were right about the clickbait, and it's really working. This is the fastest one of my videos has got to 10,000 views in ages!

    • @rotordave81 1 year ago

      @8_Bit In that case, great! I was just being pedantic :) I guess, after all, if you say a telephone sanitiser sanitises telephones, it doesn't mean they sanitise all telephones. Maybe you could even title it "this function reduces distraction" since it destroys programs.

  • @mikegarland4500 1 year ago +1

    Interesting video as always. Thanks!

  • @HelloKittyFanMan 1 year ago

    Interesting video, thanks! Happy Christmas, Robin! 🎄🎅

  • @VintageGearFreak 1 year ago +1

    Does VAL() also evaluate expressions like "4+5" ?

    • @8_Bit 1 year ago +4

      No, unfortunately, it'll stop when it hits the + sign. It converts a number in text form, consisting of the digits 0-9, and characters "." and "E" for decimals and scientific notation. It does parse "-" or "+" but only at the beginning of the number, or after the "E".
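
The grammar described in this reply can be approximated with a regular expression in Python. This is a sketch, not the ROM routine: `val_prefix` is a hypothetical helper, stripping all spaces is an assumption based on the remark elsewhere in the thread that the parser skips spaces, and Python's float range is far wider than MS BASIC's, so no overflow error is modelled.

```python
import re

# Optional sign, digits with an optional ".", optional E exponent with its own sign.
_PREFIX = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")

def val_prefix(text: str) -> float:
    """Return the value of the leading numeric part of text, or 0 if there is none."""
    compact = text.replace(" ", "")   # assumption: spaces are skipped everywhere
    m = _PREFIX.match(compact)
    return float(m.group(0)) if m else 0.0
```

With this model, `val_prefix("4+5")` stops at the `+` and gives 4.0, matching the answer above.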

  • @watchmakerful 1 year ago +1

    Why does it perform these actions in the program memory itself? Isn't it safer to copy this string somewhere else in the memory and then process it?

    • @8_Bit 1 year ago +5

      Yes, it really should do this somewhere else in memory, with an extra byte at the end for the null terminator. I can only guess that they did this sort of hacky solution just to save ROM code space, RAM at runtime, and CPU cycles which were always in short supply in those days.
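
The in-place behaviour discussed in this exchange can be simulated in Python. This is a sketch of the mechanism as described in the thread, not the ROM code: `val_in_place`, `Overflow` and `MAX_VALUE` are hypothetical, and the limit of about 1.7e38 is an assumed stand-in for the float format's range.

```python
class Overflow(Exception):
    pass

MAX_VALUE = 1.7e38   # assumed approximate upper limit of the float format

def val_in_place(mem: bytearray, addr: int, length: int) -> float:
    """Terminate the string in place, parse it, then put the saved byte back."""
    end = addr + length
    saved = mem[end]
    mem[end] = 0                          # overwrite whatever follows the string
    stop = mem.index(0, addr)             # the parser runs until the zero byte
    value = float(bytes(mem[addr:stop]).decode("ascii"))
    if abs(value) > MAX_VALUE:
        raise Overflow()                  # the error exit skips the restore below
    mem[end] = saved
    return value
```

Running it on the bytes of `A=VAL("1E39")` with the string starting at offset 7 raises `Overflow` and leaves a zero byte where the closing quote used to be; with `"42"` the quote is restored and the line is unchanged.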

    • @NuntiusLegis 1 year ago

      @8_Bit I am glad they did, because I won't lose sleep over this bug. :-)

  • @beakt 1 year ago +1

    So the only computer you showed where it's fixed from that era was the IBM PC. Do you think that IBM engineers inspected the code they licensed from Microsoft before bestowing upon it the IBM label, and, using proper techniques, discovered what no one had thought of? Or maybe just noticed the simple fact that a function which should have only returned a value instead had a global effect (even if temporary), and provisions in the error handling did not reverse the effect.

  • @greggoog7559 1 year ago +6

    Very interesting. I totally expected you to fix the bug in the BASIC interpreter at the end 😄 regarding why they did it this way -- I think it really does make sense from a CPU cycles point... a program with a lot of VAL()s would probably be significantly slower with all the copying going on.
    The best fix would probably have been to make an exception in the error subroutine to put the quote character back.

    • @flatfingertuning727 1 year ago

      A really huge portion of the time spent in VAL(), or floating-point numeric evaluation in general, is spent in a sequence of instructions which perform a sign-extending right shift in situations where the operand is going to be positive. Patching that routine to simply use an LSR instruction will cause floating-to-integer conversions to fail for negative operands, but make it much faster. Creating a separate routine that uses LSR would greatly improve the performance of many BASIC programs. The time required to make an extra copy of a string pales in comparison.

    • @greggoog7559 1 year ago

      @flatfingertuning727 Interesting, thanks... I'm by no means an expert on C64 BASIC implementation details or 6502 assembly in general. I did meddle with it a lot from the age of 7 or so 😄 but I'm now in my 40s and all I do now is occasionally use an emulator. But it does make sense that the actual floating point evaluation is so much slower that the string copy wouldn't matter. Thanks for pointing it out!

    • @chromosundrift 1 year ago

      Wouldn't a good fix be to store the endbyte or length of the string on the stack or in zeropage, rather than splatting a zero terminator in what could be a basic listing? Other solutions that come to mind don't seem as simple.
      I don't think putting a fix in the error routine would be ideal since the quote problem is specific to VAL()'s parsing of a number rather than all causes of overflow error, e.g. the result of arithmetic.
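
The alternative suggested here, using the string's length as the stop condition instead of writing a terminator, can be sketched in Python. `val_bounded` is a hypothetical helper and it leaves out the real parser's handling of trailing junk.

```python
def val_bounded(mem, addr: int, length: int) -> float:
    """Parse exactly `length` bytes starting at addr; memory is never written."""
    end = addr + length                   # the end address would live on the stack or in zero page
    text = bytes(mem[addr:end]).decode("ascii")
    return float(text)
```

Since nothing is stored into memory, an error raised partway through has nothing to clean up, and a neighbouring string in the heap is never read past the end.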

    • @davidellsworth4203 1 year ago +2

      Wouldn't the best fix just be to not do any NUL terminator overwrite at all, and still read the string in-place? The character that would be overwritten would always be a non-numeric character anyway (a closing quote) which terminates VAL's evaluation anyway, wouldn't it?
      If the kind of fix you described were used, it might still leave a race condition in which pressing Ctrl-Break with perfect timing could leave the NUL there, corrupting the program... unless all error handlers would restore the overwritten character, or Ctrl-Break handling is done by polling instead of an interrupt.

    • @flatfingertuning727 1 year ago +1

      @davidellsworth4203 Given the statements "GET A$:GET A$:GET A$", if they read the characters "X", "2", and "1" in that order, the three bytes starting at the address of A$ would be "12X", with no other intervening bytes.

  • @bjbell52 1 year ago

    I tried it on an Atari 800 (emulator) running Microsoft Basic. It comes back with a message "overflow" cr "overflow" .

  • @csbruce 1 year ago +5

    1:47 It makes complete sense that the task of parsing a number would be reused. In fact, it's odd that Commodore BASIC has separate parsing for line numbers. However, there are artifacts from reusing the BASIC-code parser, since, as you show, it skips over spaces.
    5:35 Yeah, that's an ugly hack - just poke a $00 in after the string and don't bother to clean it up on error.
    8:39 This would seem to be more of a problem on the expanded VIC-20 and C64. On the unexpanded VIC, the $00 is tossed into the memory right after the evaluated string, which happens to be screen RAM, and it works okay. But on an expanded VIC, following the RAM is normally empty space and on the C64, it's ROM, so the parsing behaviour really depends on what's read out of that location. What if it reads a digit out of that spot?
    It's interesting that evaluating the TI$ keeps reusing the same spot at the top of the string heap without needing a garbage collection.
    10:58 That's unexpected. I thought string literals were just referenced to the raw BASIC code or input-buffer space where they reside.
    20:20 What happens if you try to edit that line? BASIC should be confused about the length of that line and make the garbage after it into a new line.
    21:58 A hack way to solve it would be to store $22 instead of $00. BASIC wouldn't get mangled, though the screen or heap might, but they're volatile anyway.

    • @8_Bit 1 year ago +3

      Aha, on the C64 location $A000 (ROM) contains $94 which VAL will just ignore (and quit evaluating). Fortunate! If there had been a $34 there (for example) then we'd see some interesting results.
      String literals in immediate mode seem to be put on the heap immediately. Perhaps that's because doing something like A$="TEST" in immediate mode requires it, so rather than distinguishing based on the use, it's only checking if we're RUNning or not.

    • @fllthdcrb 1 year ago +3

      "What happens if you try to edit that line? BASIC should be confused about the length of that line and make the garbage after it into a new line."
      I just tried that. Something like that happens. Except if I edit the affected line itself, it just moves all the real lines after it to be immediately after the edited line, resulting in the garbage just disappearing. It's only if I edit any _other_ line that the garbage becomes its own line, including a garbage line number, but minus the first two bytes that get overwritten with a proper next-line pointer.

    • @flatfingertuning727 1 year ago +2

      @8_Bit In Applesoft, on a 48K machine when not using DOS, RAM is followed by the value of the keyboard input byte, with the high bit set if it hasn't been read yet, but cleared if it has. If a program performs e.g. GET A$:PRINT VAL(A$) and the user types a digit, the digit will be reflected many times in the output value.

    • @flatfingertuning727 1 year ago +2

      Parsing line numbers directly as integers avoids the need to perform a floating-point-to-integer conversion after the fact. What's more curious is that the code wasn't used to handle the leading portion (which might be the whole thing) of floating-point numeric constants. Parsing a constant like 13579 requires converting the digit 1 to a floating-point value by shifting it left 7 places and adjusting the exponent, adding that to zero, then multiplying by ten by incrementing the exponent, copying that value to the second floating-point accumulator, adding two more to that exponent, and adding the two floating-point accumulators, then converting the digit 3 to a floating-point value by shifting it left 6 places and adjusting the exponent, adding that to the previous value, multiplying that result by ten, converting the digit 5 to a floating-point value by shifting left 5 places and adjusting the exponent, adding that to the previous value, etc. Processing as much of a conversion as possible using integer math, converting that result to a floating-point value, and then using the general-purpose recipe to handle anything that was left over, could likely roughly double the performance of a lot of code that uses many floating-point constants.
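
The two strategies contrasted in this comment can be put side by side in Python. Both functions are hypothetical illustrations; the operation counts only tally floating-point steps in this toy model and say nothing about real 6502 cycle costs.

```python
def parse_digits_float_steps(digits: str):
    """Accumulate in floating point: one multiply and one add per digit."""
    acc, float_ops = 0.0, 0
    for ch in digits:
        acc = acc * 10.0 + float(int(ch))
        float_ops += 2
    return acc, float_ops

def parse_digits_int_first(digits: str):
    """Accumulate as an integer, then convert to floating point once at the end."""
    acc = 0
    for ch in digits:
        acc = acc * 10 + int(ch)
    return float(acc), 1
```

For "13579" both return 13579.0, but the first performs ten floating-point operations where the second performs a single conversion, which is the saving the comment is pointing at.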

    • @csbruce
      @csbruce Рік тому

      @@flatfingertuning727: When dealing with a manually entered line, what does taking an extra millisecond matter? I was thinking more of a mode where the floating-point parser disallows all non-digit characters ("+", "-", "e", ".") (since you wouldn't want «10E=.5» to be misinterpreted).
      The biggest speedup with numbers would be to represent integers as integers for simple operations. I.e., have a special exponent like $00 to mean the mantissa holds a 16-bit unsigned integer. Most variables are small integers, especially in BASIC games. Operations like load, store, add, subtract, and convert to "integer" would be done with uint16 arithmetic, and if an over/underflow occurs or any other operation is requested, the FAC is converted to floating point first. You'd also want number parsing and INT() to detect if the result is a uint16.

  • @glenm9376
    @glenm9376 Рік тому +1

    Great insight thanks. So now I need to boot up just to see what happens when you put the @ symbol in a line.

    • @stevethepocket
      @stevethepocket Рік тому +3

      It wouldn't do anything because the "@" shown here isn't a literal @; it's a reverse symbol used to represent an untypeable ASCII character. Like the ones that appear when you type a color code or a cursor key or something inside quotes, except there's no key combination I'm aware of that will let you input a null.
      Though you did get me wondering what happens if I trick the parser into letting me type control codes outside of quotes. I tried it just now and... it's very interesting! I'm going to have to tell Robin about it because it sounds like another fun subject for a video.

  • @mdpenny42
    @mdpenny42 Рік тому

    FWIW, doesn't seem to affect Acorn's BASIC (as in "BBC BASIC" version 2 on a BBC Micro) - checked on an emulator.
    Then again, Acorn's BASIC was their own development, rather than extended from an extant version of Microsoft BASIC.

  • @HardDriveGuruOfficial
    @HardDriveGuruOfficial 11 місяців тому +1

    Okay, what's the song during the patron list? It's a bop for real!

  • @insectodium206
    @insectodium206 10 місяців тому

    This val() bug is present on the Oric as well
    ( Oric Extended Basic 1.1 (c) 1983 Tangerine )

  • @aceenterprise
    @aceenterprise Рік тому +2

    That's very interesting, especially about the null character to end the basic line like that. I wonder if any program/game ever took advantage of the null character on a line of code to hide additional code behind that?

    • @flatfingertuning727
      @flatfingertuning727 Рік тому +2

      Although Robin didn't test this, saving and loading a program which contains an embedded zero like that, which is at least four bytes from the end of a line, would result in the third and fourth bytes after the zero being interpreted as a line number of a line that would extend until the next zero byte. If this line number was larger than the following line number, this would wreak havoc on any GOTO statements which should target lines that appear after the out-of-sequence line but have a lower number, as well as efforts to edit the program.

    • @Curt_Sampson
      @Curt_Sampson Рік тому

      @@flatfingertuning727 Are you sure about that? I thought that all versions of MS-BASIC used the offset-to-next-line value at the start of the tokenised BASIC line to determine where the next line starts.

    • @flatfingertuning727
      @flatfingertuning727 Рік тому

      @@Curt_Sampson That's what they do during execution, but to allow for the possibility that a program might be loaded at a different address from where it had been stored, versions of BASIC for the Commodore VIC-20, C64, and Apple II (and probably many others as well) scan all BASIC lines to find zero bytes and fix them. Actually, loading and saving the program isn't necessary to trigger this behavior. Adding a new line to the program can have the same effect, since it's simpler to generate line links from scratch than to try to apply relative offsets to existing line links. It's a bit of a shame, though, that MS didn't precede each line with a length byte, since traversing line links stored that way would be faster than traversing line links stored as pointers, since only one link-related byte would need to be fetched to traverse each link.

    • @Curt_Sampson
      @Curt_Sampson Рік тому

      @@flatfingertuning727 As for the scanning for zero bytes, yes, you are right that in the pre-5.x MS-BASICs the LNKPRG routine is called immediately after a LOAD, and that does simply scan for zeros. (I've checked only the C64 6502 versions, but I assume the others are the same.) That wouldn't work in later variants such as MSX-BASIC because the tokenisation of numbers can produce $00 bytes in the middle of the line; I don't at the moment recall how the line linker in that version dealt with this.
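The relinking behaviour discussed in this thread can be sketched in Python. This is a simplified model, not a disassembly of LNKPRG: the stored-line layout (2-byte link, 2-byte line number, body, $00 terminator) follows the description above, but the helper names are made up, the load address is an assumption, and the line bodies are left as plain text rather than tokens so the result is easier to read.

```python
BASE = 0x0801  # assumed load address (the C64's usual start of BASIC)


def build_line(number, body):
    """One stored line: placeholder link, line number, body, $00 terminator."""
    return bytes([0xFF, 0xFF, number & 0xFF, number >> 8]) + body + b"\x00"


def relink(mem, base=BASE):
    """Rebuild every line link by scanning for $00 terminators.
    Returns the line numbers the scan believes it found."""
    found = []
    pos = 0
    while mem[pos + 1] != 0:  # a zero link high byte marks end of program
        found.append(mem[pos + 2] | (mem[pos + 3] << 8))
        end = pos + 4  # skip link and line number, then hunt for the zero
        while mem[end] != 0:
            end += 1
        nxt = base + end + 1
        mem[pos] = nxt & 0xFF  # write the fresh link, low byte first
        mem[pos + 1] = nxt >> 8
        pos = end + 1
    return found


program = bytearray(
    build_line(10, b'A=VAL("1E39"):REM X') + build_line(20, b"REM") + b"\x00\x00"
)
before = relink(program)

# Simulate the VAL() bug: the closing quote is left as a zero byte.
closing_quote = program.index(b'"', program.index(b'"') + 1)
program[closing_quote] = 0
after = relink(program)
print(before, after)
```

In this model the bytes after the stray zero become a phantom line: the first two are overwritten by a link, and the next two ("R" and "E" here) are read as its line number, which matches the garbage-line behaviour described above.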

    • @Curt_Sampson
      @Curt_Sampson Рік тому

      @@flatfingertuning727 Argh. For some reason my first reply vanished, leaving only my second. As I was mentioning in the first reply, no, on the 8080 (for which MS-BASIC was originally written) using a length byte instead of following pointers would have been slower, since the 8080 has good capabilities for loading and following 16-bit pointers, but poor capabilities for doing arithmetic on them.

  • @sandcat-maurice
    @sandcat-maurice Рік тому +1

    Around @18:50 it shows quotation marks as well as '@' in your Monitor. I am using SuperMon64, and none of these symbols are shown. Instead on those locations I just see a dot.
    For instance, your Monitor shows "1E39@): on line 0809 whereas my Monitor shows .1E39.):

    • @8_Bit
      @8_Bit  Рік тому +2

      Yes, the Super Snapshot monitor shows symbols for a wider range of hex values than SuperMon64. I believe SuperMon is showing just the regular CHR$() range of characters and shows a dot when there's nothing printable, while Super Snapshot has extra symbols to print for the "unprintable". It's a somewhat arbitrary decision, but it can be useful.

    • @sandcat-maurice
      @sandcat-maurice Рік тому

      Aha, all clear now. Thank you for the explanation @@8_Bit !

  • @Thiesi
    @Thiesi Рік тому

    Great video as usual, and please thank your son for providing the lyrics for this banger of an outro.

  • @turnkit
    @turnkit Рік тому

    This bug is also in the TRS-80 Model III BASIC.
    I think Radio Shack tried to claim the Model 4 BASIC wasn't Microsoft's anymore since they modified parts to get out of licensing. Something like that. Would be curious to see if the bug is still there.

    • @turnkit
      @turnkit Рік тому

      The DOS based TRS-80 Model 4 BASIC 01.01.01 for TRSDOS Version 6, Copyright 1984 by Microsoft, licensed to Tandy Corp., works properly. (The Model 3 ROM BASIC did not.)

    • @turnkit
      @turnkit Рік тому

      The ROM based BASIC for the TRS-80 Model 4 fails though. This BASIC claims to be "(C) '80 Tandy" but yet still has exactly the same bug as Microsoft's. Maybe because Tandy really didn't create their own BASIC but just stole Microsoft's code?

  • @DavidAsta
    @DavidAsta Рік тому +1

    I'm running MS BASIC v4.7 on my homebrew Z80 computer. It's a modified version of the MS BASIC from 1978 that came with the NASCOM 2 computers. The whole disassembly was published in a magazine in 1983. And I can confirm it has the bug. Thanks a lot Robin, at least now I know there is a bug to be solved 😀 BTW, I've also tested with my MSX (MSX BASIC v1.0 from 1983). The bug does not happen. PRINT A prints 1E+39

    • @gcewing
      @gcewing Рік тому +1

      This suggests that numbers have a wider range in your version. To test for the bug properly, you would need to use a large enough number to trigger an overflow.

    • @Curt_Sampson
      @Curt_Sampson Рік тому +1

      @@gcewing I've tested with 1e69, which does overflow and print the error message, but the bug is not there.
      MSX-BASIC is based on v5.x of Microsoft BASIC, which is substantially different in terms of parsing from the versions up to 4.x. (Among other things, numbers are parsed differently because they're tokenised.)

    • @Lord-Sméagol
      @Lord-Sméagol Рік тому +2

      I found this video yesterday, great detail!
      Having disassembled Nascom ROM BASIC all those years ago, and fixing many of the bugs, and optimizing it, I just HAD to fix this bug :)
      I think this method should be easy enough to implement on other BASICs (6502, 8080, Z80, 6800, 6809 ...)
      Make VAL save the address and original byte somewhere safe.
      If VAL succeeds, clear the saved byte to indicate no further action is needed.
      Change all the jumps to the overflow error to go to a check routine,
      so when an overflow error occurs:
      If the byte saved by VAL is zero, simply continue to the overflow error; otherwise (VAL didn't complete), restore the byte using the saved address.
      Clear the saved byte to signal all is done and continue to the overflow error.
      Not too bad: 26 bytes of Z80 to fix it. :)
      I also looked at MS BASIC-80 [5.21].
      It doesn't have a problem; Overflow is not a fatal error, it gives a warning, so VAL repairs the 'damage' it did.
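The repair scheme outlined above can be modelled in a few lines of Python. This is only a simulation under stated assumptions: the `Machine` class, its method names, and the `MAX_FLOAT` constant (an approximate limit for the 5-byte float format) are all hypothetical, and Python's `float()` stands in for the ROM's number parser.

```python
MAX_FLOAT = 1.70141183e38  # approximate limit, assumed for illustration


class OverflowBasicError(Exception):
    pass


class Machine:
    def __init__(self, memory):
        self.mem = bytearray(memory)
        self.saved_addr = 0
        self.saved_byte = 0  # zero means "nothing to restore"

    def parse_zero_terminated(self, addr):
        end = self.mem.index(0, addr)
        value = float(self.mem[addr:end].decode("ascii"))
        if abs(value) > MAX_FLOAT:
            raise OverflowBasicError("?OVERFLOW ERROR")
        return value

    def val(self, addr, length, patched):
        end = addr + length
        original = self.mem[end]
        if patched:  # the fix: remember what was overwritten, and where
            self.saved_addr, self.saved_byte = end, original
        self.mem[end] = 0  # temporary terminator
        value = self.parse_zero_terminated(addr)
        self.mem[end] = original  # normal path puts the byte back
        self.saved_byte = 0  # signal that no repair is pending
        return value

    def overflow_check(self):
        """Patched error path: repair memory before reporting the error."""
        if self.saved_byte != 0:
            self.mem[self.saved_addr] = self.saved_byte
            self.saved_byte = 0


def run(machine, addr, length, patched):
    try:
        return machine.val(addr, length, patched)
    except OverflowBasicError:
        if patched:
            machine.overflow_check()
        return None
```

With `patched=False` the closing quote of an overflowing literal stays zeroed after the error; with `patched=True` the memory image comes back byte-for-byte identical.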

  • @ge97aa
    @ge97aa Рік тому +2

    I found that bug in my own reverse engineering of the BASIC ROM. Gotta make sure you clean up your mess before allowing BASIC to raise an error. There are few places in the BASIC ROM where not doing so causes problems.

    • @chromosundrift
      @chromosundrift Рік тому +1

      I wonder how feasible it is to run cleanup code for these types of exception cases. I think it may be better not to have situations that require cleanup. Data destruction is a pretty big deal. I'd be curious how they actually did fix it.

    • @ge97aa
      @ge97aa Рік тому +1

      It wouldn't have been trivial. You're right, it's better not to require the cleanup in the first place.

    • @flatfingertuning727
      @flatfingertuning727 Рік тому

      @@chromosundrift In most cases, the assumption was that if a program died because of an error, it wouldn't matter if things were left in an awkward state, and some of today's compilers go out of their way to exploit such assumptions. When processing a function like "unsigned mul(unsigned short x, unsigned short y) { return x*y; }", it may treat the code as an invitation to identify what inputs to the calling function would cause x to exceed INT_MAX/y, and omit from machine code for the calling function any portions (such as array-bounds checks) that would only be relevant if such inputs were received.

  • @chromosundrift
    @chromosundrift Рік тому +1

    So would the doublequote still be on the stack at $01ff ? Also I wonder if there are any other cases of basic functions altering the program memory temporarily such that other exception flows lead to program corruption. I know that BASIC memory layout was sometimes hacked to obscure program listings or to implement other forms of selfmod. Interesting branch of analysis, thank you Robin.

    • @melkiorwiseman5234
      @melkiorwiseman5234 Рік тому +2

      As I understand it, the error routine resets the stack pointer. The double quote mark would still be in memory, but not technically "on the stack" since the stack wouldn't be pointing to it.

    • @chromosundrift
      @chromosundrift Рік тому

      @@melkiorwiseman5234 makes sense thanks

  • @dougjohnson4266
    @dougjohnson4266 Рік тому +1

    Does Simons BASIC have the same issue with VAL()?

    • @NemanjaVuj
      @NemanjaVuj Рік тому +1

      Yes it does... It is a basic extender after all, and it relies on basic and kernal ROMs.

  • @TrossachsPhoto
    @TrossachsPhoto Рік тому

    Given the " is inserted onto the stack, then the code Overflow errors, does that mean " is still on the stack, and the SP is still "wrong". Could you fill up the stack and cause a Stack Overflow with this?

    • @melkiorwiseman5234
      @melkiorwiseman5234 Рік тому +3

      The error routine resets the stack pointer as part of its "clean up" following any error.

    • @TrossachsPhoto
      @TrossachsPhoto Рік тому

      @@melkiorwiseman5234 Thanks for the explanation!

  • @localroger
    @localroger Рік тому +1

    Implementing VAL for floating point math is actually very tricky if you try to properly catch all the edge case errors, and on those early computers every byte of interpreter code was a byte that wasn't available for the application programmer to use. This was probably a deliberate decision to reduce code size in the interpreter as long as it worked in normal situations.

    • @chromosundrift
      @chromosundrift Рік тому

      This prompts an interesting question, is there a comparably compact implementation of VAL() that doesn't overwrite the string terminator? I haven't tried it but I think storing the length or endpoint on the stack or in zeropage may prove to be as compact. I don't claim it would also be as fast but slightly slower VAL() speed is probably a tradeoff I would prefer.

    • @localroger
      @localroger Рік тому

      I suspect this was just an early, lazy solution that worked "well enough" that it got propagated through generations of releases until probably someone more serious at IBM complained about it.@@chromosundrift

    • @NuntiusLegis
      @NuntiusLegis Рік тому

      @@chromosundrift I prefer the faster version because I don't feel the need to overflow my computer with ridiculously high numbers.

    • @flatfingertuning727
      @flatfingertuning727 Рік тому

      @@chromosundrift Using the same code for VAL and for processing floating-point constants within a program, while allowing it to use length-counted strings that could be followed by irrelevant digit characters, would require incorporating logic to handle the length count within the code that parses strings stored in code. On the other hand, given how inefficient that code is already (given a loop "FOR I=256 to 511:Q=32768:NEXT", more than 70% of the loop's execution time would be spent evaluating the constant 32768), adding 7 extra cycles to the cost of processing each digit wouldn't be noticeable. The cost of code to process such checks would be offset by eliminating code to save, modify, and restore the byte after the string, so the total code cost would probably be about the same.

  • @stuartmcconnachie
    @stuartmcconnachie Рік тому +3

    This code for handling VAL is all horrendously complicated given that basic already has a built in expression evaluation routine for calculating things like A$ = “HELLO” : A$ = A$ + “ WORLD”.
    For example BBC BASIC just passes the expression inside the VAL parentheses to the expression evaluator, which returns the result and type (real, integer, string). That means you aren’t limited to string constants and single string variables inside the VAL (you can put expressions there as well), and there’s no need to jump through this adding and removing of termination bytes malarkey, either.
    Presumably things like VAL(A$+B$) and VAL(“1”+A$) (contrived examples, but you get the point) are invalid in variants of MS BASIC also?

    • @NuntiusLegis
      @NuntiusLegis Рік тому

      It works on the C64.

    • @gcewing
      @gcewing Рік тому +3

      I'm pretty sure you can put expressions there. The code in question is run after the expression has been evaluated. If you read the comments, it talks about a "string descriptor" on the "temporary string descriptor stack". I think what's happening is that when the expression happens to be just a string literal, the string descriptor resulting from the evaluation points into the program memory. I suspect that if you triggered an overflow using a more complicated expression, some other part of memory would get corrupted that wasn't so noticeable.
      The zero-swapping thing seems to be a hacky way of re-using an existing routine they had lying around for converting a string to a number. A cleaner way would have been to refactor that routine to take an address and a length instead of relying on zero-termination.
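As a sketch of the "address and length" refactoring suggested above, the following Python function parses a number from a counted region and never writes to memory. It is a hypothetical illustration, not how any MS BASIC does it, and it simplifies real VAL() behaviour (for example, it does not skip embedded spaces).

```python
def val_counted(mem, addr, length):
    """Parse the longest numeric prefix of mem[addr:addr+length].
    Read-only: no terminator is ever poked into memory."""
    text = bytes(mem[addr:addr + length]).decode("ascii", "replace")
    n = len(text)
    i = 0
    if i < n and text[i] in "+-":  # optional sign
        i += 1
    digits_start = i
    while i < n and text[i].isdigit():  # integer part
        i += 1
    if i < n and text[i] == ".":  # optional fraction
        i += 1
        while i < n and text[i].isdigit():
            i += 1
    stop = i
    if not any(c.isdigit() for c in text[digits_start:stop]):
        return 0.0  # no digits at all: VAL-style result of zero
    if i < n and text[i] in "eE":  # optional exponent, only if digits follow
        j = i + 1
        if j < n and text[j] in "+-":
            j += 1
        k = j
        while k < n and text[k].isdigit():
            k += 1
        if k > j:
            stop = k
    return float(text[:stop])
```

Because the length bounds the scan, digits that happen to follow the string in memory are never consumed, and an error raised afterwards cannot leave a stray terminator behind.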

    • @NuntiusLegis
      @NuntiusLegis Рік тому

      @@gcewing It seems Val DOES use address and length with string data on the string heap, the "hack" is used with string data in the code, which, at least in case of a literal not assigned to a variable, does not have a string descriptor with address and length I think.
      Anyway in the following example the overflow error does not corrupt the code:
      10 a$="1e+3"+"9": rem comment
      20 print val(a$)
      The concatenation in line 10 forces the storage of the string data on the string heap.

  • @junker15
    @junker15 Рік тому

    I wonder what happens to the stack when the VAL subroutine encounters an unexpected error.
    I have the assembly listing in front of me, and that byte they replace to terminate the string is pushed onto the stack, and if there's no error, then it's pulled off and put back.
    But if there's an error, BASIC's error routine seems to bail through NEWSTT, which might make it all good as far as BASIC's concerned, but leave that tiny bit of junk still on the stack (NEWSTT saves the stack pointer, but doesn't seem to do anything else related to the stack). I don't imagine it being enough to overflow the stack, but "I don't think it'll matter" is a great start for some truly insidious bugs to happen.

    • @NuntiusLegis
      @NuntiusLegis Рік тому

      The bug described in the video is extremely unlikely to happen anyway. I would be surprised if it had ever been a practical problem.

  • @FrankvanderBilt-h1c
    @FrankvanderBilt-h1c 9 місяців тому

    Hi Robin. Nice video. You made a video about a VTECH PRECOMPUTER 1000. I've got the 2000 and the Precomputer Prestige computer for kids. They also got the bug. Looks like they have the same BASIC as the PC 1000?

  • @HelloKittyFanMan
    @HelloKittyFanMan Рік тому

    Oh, I just remembered a question that I should have asked the other day: I remember that the computer will express these exponents (at least in BASIC) as having a + or -, as in "1E+39." If my things were set up then I would try this myself. But what about that: if you include the + or a -, does the same thing happen?
    (Wait until after Christmas to try it, if you want.)

    • @MrDiodera
      @MrDiodera Рік тому

      No - because (at least MS BASIC) doesn't care about "too small" numbers (so entering 1E-99 just becomes 0 without an error).

  • @CrazyMan_Engineer
    @CrazyMan_Engineer Рік тому

    Who created the first version that had this bug? Was it intentional to stop theft of the basic software?

  • @timewave02012
    @timewave02012 Рік тому

    You know you've spent too much time thinking about IEEE 754 floating point when you immediately recognize the significance of 1e39.
    I think the first floating point bug I fixed was when my employer's ATE software suite started handling NaN incorrectly after MSVC changed its floating point behavior.

  • @richardl6751
    @richardl6751 Рік тому

    Microsoft QBASIC from DOS 6.22 gives an overflow error but doesn't change the program. Also making the variable double precision a# allows over 1e300.

  • @CafeenMan
    @CafeenMan Рік тому

    Just tried it in VB6. It just shows the text, but changes its formatting.
    MsgBox Val("1E39")
    Displays 1E+39

  • @erwinvandenberg1815
    @erwinvandenberg1815 Рік тому +1

    I think the simplest way to fix this bug is to add the end of string character as parameter to the val function. So instead of assuming it is a zero terminated string, the eos character defines the end. This way the memory has not to be set to zero and repaired afterwards (!) and the bug will never appear.

    • @WY.C64-Guy
      @WY.C64-Guy Рік тому +1

      Actually, reading the text, there's no need for the 00 byte... If the function encounters *anything* that is *not* +, -, E, or 0-9, it stops.
      Just leave the quote character... That's the easiest solution.

    • @flatfingertuning727
      @flatfingertuning727 Рік тому

      @@WY.C64-Guy The VAL() function is intended to be usable with strings that are stored in the heap without any other characters around them. Thus, if code performed "GET A$" three times, reading "X", "3", and "1", and then called "VAL(A$)", it would receive a pointer to the start of the byte sequence "13X". Including more information on the heap around strings, however, could greatly improve garbage-collection performance, and probably wouldn't have cost much: if a program has few line strings, adding one or two bytes each wouldn't amount to much, and if a program has many strings but couldn't afford the storage to handle extra couple bytes each, it would likely be rendered unusable by the existing slow garbage collector. If adding the extra pointers reduces free space by 80%, but their existence could cut GC times by 90%, that would be a net win. If pointers were stored MSB first during a GC cycle, and otherwise were stored as a zero byte followed by the length, putting a zero at the end of the heap would have eliminated the need to zero-terminate strings before calling "val".

    • @erwinvandenberg1815
      @erwinvandenberg1815 Рік тому

      @@WY.C64-Guy Good one. I made a small test program and it seems to work:
      10 FORP=40960TO49152:POKEP,PEEK(P):NEXT:REM COPY BASIC
      20 POKE47061,234:POKE47062,234:REM REMOVE STORE 0
      30 POKE1,54:REM TURN BASIC ROM OFF

    • @botsjeh
      @botsjeh Рік тому +1

      @@WY.C64-Guy The Val code is also used to parse numbers from strings that are not delimited by quotes,
      Also the reading of a non-zero and non-digit (or E or .) will cause an error situation that should force the program to stop in other situations, like reading numbers in a DATA statement.

  • @what9418
    @what9418 Рік тому

    What happens if you restore the double quote character from the editor? Will the editor terminate the line or will the missing text show up again?

    • @melkiorwiseman5234
      @melkiorwiseman5234 Рік тому +1

      Editing any line and then entering it will cause the entire original line to be erased and replaced with the line you just typed, so even if it doesn't crash due to the erroneous line terminator, it will replace the current line with whatever you entered, and won't keep anything you don't explicitly type. So the following colon and remark will be lost unless you type them in again before hitting Enter.

    • @what9418
      @what9418 Рік тому

      @@melkiorwiseman5234 👍

  • @sonicunleashedfan124
    @sonicunleashedfan124 Рік тому +3

    Apple 1’s basic has it worse. Line 1 completely disappeared after running this program

  • @stewiegriffin6503
    @stewiegriffin6503 Рік тому

    3:50 why is there space before 200 ?

    • @8_Bit
      @8_Bit  Рік тому

      Many BASICs print a blank space where the negative sign would go. I actually made a full video just about this subject :) ua-cam.com/video/TmadXH8nidY/v-deo.html

  • @davidhand9721
    @davidhand9721 Рік тому

    I've used Basic dialects that didn't use the *+* operator for concatenation, and I've never liked the string+string syntax since. I think it was the *&* operator. It doesn't make a lot of sense to have numeric addition and string concatenation share a syntax because there's no case where the operations are interchangeable. The only thing you can do by changing the type of the variables alone is create a quiet error. It really frustrates me now when I use languages like Python that do the same thing.

    • @NuntiusLegis
      @NuntiusLegis 11 місяців тому

      It is quite intuitive to use + here. A few years later it would have been praised as clever "operator overloading".

  • @CrazyBossDK
    @CrazyBossDK Рік тому

    In Memotech Basic (Memotech MTX500, 512 and RS128) you just get "Overflow" and nothing else. You don't need to use the VAL version, though you can; Memotech Basic accepts PRINT 1E39 itself. Both versions give an Overflow.

  • @KitsuneFuzzy
    @KitsuneFuzzy Рік тому

    Not sure if this is an emulator thing.
    But I got curious: since everything of line 10 still seems to be in memory what would happen if you run the program to break it as shown in the video.
    Then list the broken program, with the "removed" REM comment, and put back the quotation marks from where they were removed.
    Immediately after placing the " it seems to break arrow key navigation through the code and instead inputs symbols.
    That only seems to happen if you put quotation marks back, any other symbol (at least tested with letters) broke nothing.

    • @silkwesir1444
      @silkwesir1444 11 місяців тому +1

      That is actually a feature of the screen editor. It automatically detects if you are within quotes and then these control characters for special keys like arrow keys, function keys and so on can be input and displayed like that. Pretty neat actually, can get confused sometimes though...

  • @philp4684
    @philp4684 Рік тому +1

    19:23 The singular form of "parentheses" is "parenthesis", not "parenthese".

    • @8_Bit
      @8_Bit  Рік тому +1

      Since I won't remember that, I'll just go back to calling them (round) brackets which is the usual Commonwealth term for them.

  • @Robert08010
    @Robert08010 Рік тому

    Just for laughs I tried this on QB64 and I don't know whether I got an error or if I am just misinterpreting its results. The program did not crash or get corrupted. However when I gave it Print Val("1e39"), it returned 1d+38. I can't tell if this is it giving it back to me in hexadecimal or some other notation or if it's a similar glitch.

    • @silkwesir1444
      @silkwesir1444 11 місяців тому

      Probably not a glitch, that is actual notation that exists. When I saw it first I thought too it was a typo, but apparently not.

    • @Robert08010
      @Robert08010 11 місяців тому

      Isn't that a different value? @@silkwesir1444

  • @markrosenthal9108
    @markrosenthal9108 Рік тому +1

    This was not a bug, it was a feature. In those days, the machine code size of the interpreters was critical. With as little as 4K of memory, the routines for parsing had to be very small if you wanted enough space to enter that Trek game you were typing in from Creative Computing. Bill Gates had some difficult trade-off decisions to make for MS Basic. You could make your conversion logic strict or forgiving and quirky. The one thing it couldn't be was large.
    A completely different numeric conversion issue happens with floating point numbers even on large computers with compilers. Have you ever seen code testing equality checking for a difference less than something like 0.000001? This comes from the way floating point numbers are represented in binary and associated rounding errors during calculations. This is why Cobol was used for business instead of Basic. With the extra space, you could afford to store numbers as decimal digits. Slower than floating-point, but more accurate.
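The tolerance-based equality check and the decimal alternative mentioned above can be shown with Python's standard library. This is just an illustration of the general idea; the tolerance of 0.000001 is taken from the comment, and the specific values are arbitrary.

```python
import math
from decimal import Decimal

# Binary floating point: 0.1 and 0.2 have no exact representation,
# so their sum is not bit-for-bit equal to 0.3.
total = 0.1 + 0.2
exactly_equal = (total == 0.3)

# The usual workaround: compare against a small tolerance instead.
close_enough = math.isclose(total, 0.3, abs_tol=0.000001)

# Decimal arithmetic stores the decimal digits themselves, so amounts
# such as 0.10 + 0.20 compare exactly (at the cost of speed).
decimal_equal = (Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))

print(exactly_equal, close_enough, decimal_equal)
```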

    • @NuntiusLegis
      @NuntiusLegis Рік тому

      If Cobol doesn't use floating point numbers, I guess it should also replace C and C++ in business.

    • @markrosenthal9108
      @markrosenthal9108 Рік тому +1

      @@NuntiusLegis Cobol has floating point as well. It's best used for scientific and engineering applications. C and C++ eventually addressed the lack of decimal data types with third-party libraries. Java has a standard decimal data type, but it is somewhat awkward to use.

    • @NuntiusLegis
      @NuntiusLegis Рік тому

      @@markrosenthal9108 Science and engineering don't care about rounding errors? I'd rather lose a millionth of a penny in my bank account than live with nuclear power plants having glitches. ;-)
      I wonder if there is software for the C64 using the CPU's BCD mode to avoid rounding errors in calculations.

    • @8_Bit
      @8_Bit  Рік тому

      I can accept the flashing @ symbol on the screen as a feature (I called it a quirk) but the corruption of programs is absolutely a bug, not a feature.

    • @NuntiusLegis
      @NuntiusLegis Рік тому +1

      @@8_Bit But with the gain in speed and memory with this "hacked" solution every time val() is used and the unlikeliness of this bug to occur, I can accept it as almost a feature.

  • @roysainsbury4556
    @roysainsbury4556 Рік тому

    The TRS-80 Model-I and Model-III have this bug (I tried it with the TRS32 emulator), but not the Model-4, as this used a version of Extended BASIC. Same goes for MBASIC in CP/M. The actual start address of a BASIC program on these machines isn't fixed, but there's a pointer somewhere that tells BASIC where it starts. This allowed for tricks like appending a program onto another, but that's a whole different story!

    • @Lord-Sméagol
      @Lord-Sméagol Рік тому

      The BASICs with the bug simply bail out of any function that causes overflow, printing Overflow Error / ?OV Error. The stack pointer is reset and BASIC returns to immediate mode (This prevents VAL restoring the original byte).
      MBASIC uses an overflow flag, so it always restores the byte. Using an overflow flag is much cleaner, it also makes error trapping (ON ERROR GOTO ) easier.

  • @awilliams1701
    @awilliams1701 Рік тому +3

    I actually ran into 100+100 = 100100 yesterday. It's a problem I run into with javascript a lot. The way I fix it is I subtract 0. So I do "100"-0+100 = 200. I also recently learned thanks to our code scans that javascript scoping is shit compared to what I'm familiar with. I was getting "variable already declared" errors in the scanner. I'm like.....does javascript not have proper scopes? NOPE!!

    • @G1itcher
      @G1itcher Рік тому +1

      Use let rather than var.

    • @awilliams1701
      @awilliams1701 Рік тому +1

      @@G1itcher I'm not using val or let. It's just adding variables with numbers. The numbers are numbers, but sometimes the variables are strings and sometimes they are numbers.

    • @whitslack
      @whitslack Рік тому +1

      Rather than subtracting zero, just use the unary plus operator to convert the first operand into a number. +"100"+100 == 200.

    • @awilliams1701
      @awilliams1701 Рік тому

      @@whitslack I'm not testing against 200, I'm adding 100 to it and then using the 200 result later. or are you saying that +"100" = 100?

    • @whitslack
      @whitslack Рік тому

      @@awilliams1701 +"100" evaluates to 100. Applying the unary plus operator to a string value produces a number value.

  • @choppergirl
    @choppergirl Рік тому +7

    Man this is brutal, we did not know about this bug back in the day. It would have never occurred to me that the E for exponent would blow things up, because we rarely used numbers that way and stuck to integer math, which was faster and just better programming practice. Always use an integer as much as you can for speed, and even better if it's a one- or two-byte integer for real speed.

    • @mudi2000a
      @mudi2000a Рік тому +1

      On the C64 due to implementation details in BASIC Integer was actually slower.

    • @choppergirl
      @choppergirl Рік тому

      @@mudi2000a I seriously doubt that. Regardless, Real women programmed in TinyMon when on the C=64 anyway. It's all integer basic there. The closer you can get to the bare silicon of the machine the better.

    • @mudi2000a
      @mudi2000a Рік тому +5

      @@choppergirl the problem with CBM Basic integers is that they actually were converted to floating point and back internally at least for calculations because the BASIC has only float math routines. Now I didn’t verify if what I remember is true but I am relatively confident. Of course you could use a BASIC compiler which usually has proper integer math or just assembly.
      EDIT: there is a video on this very channel which confirms what I wrote: ua-cam.com/video/wo14rDnGUbY/v-deo.htmlsi=zF2hmi2dReIaoaVo

    • @choppergirl
      @choppergirl Рік тому

      @@mudi2000a Well I'll have to test it some day with a timing routine. All us girls were spending so much time writing in Compute's TinyMon on the 64 back in the day, that Commodore took notice and built it and a Sprite editor into the C128's ROMs. By then though I had already moved on my BASIC Programming exploits to Microsoft QuickBasic both on the Macintosh and PC AT's.

    • @4rumani
      @4rumani Рік тому

      @@choppergirl Well, you're wrong, "woman"

  • @rricci
    @rricci Рік тому

    I have 2 questions.
    1. You never put in the closing quote in the last examples. Could that cause an error?
    2. What would have happened had you added "5 REM THIS IS A BUGGY PROGRAM"? Which line would have gotten corrupted?

  • @williamsquires3070
    @williamsquires3070 Рік тому

    What if you try it on Apple Integer BASIC?

  • @CRCO1975
    @CRCO1975 Рік тому

    I tried this in TI Extended BASIC on the 99/4A just for grins, knowing it isn't a Microsoft derived BASIC. The TI allows for values up to 1E127. 1E128 results in an overflow. (The computer only displays exponents up to 99, but maintains them up to 127.) Extended BASIC was used because TI BASIC doesn't allow for multiple statements per line.
    I learned a few things by accident doing this (or remembered something I knew long ago).
    Overflow errors in TI BASIC don't stop a program from running but just print a warning message. Extended BASIC allows that to be overridden and actually can stop on a warning if you tell it to do that.
    Leaving off the quotes around the string value in VAL caused interesting behavior:
    10 A=VAL(1E128)::REM REST OF LINE
    20 REM LINE 20
    >RUN
    * WARNING:
    NUMERIC OVERFLOW IN 10
    * STRING-NUMBER MISMATCH IN 10
    So it parsed the value first, declared it an overflow, then determined I hadn't entered a valid string into the function and stopped the program. 2 error messages for the price of 1!

  • @Waeffel
    @Waeffel Рік тому +1

    The Commander X16 has got that bug, too. I have just tried it on the emulator.

  • @monkybros
    @monkybros Рік тому

    Just tested this on two Japanese micros, Fujitsu's FM-7 (1981 6809 machine with a Microsoft copyright for its BASIC) and Sega's SC-3000 (1983 Z80 machine with a BASIC interpreter written by Mitec, but pretty much 1:1 function offerings to the Tandy BASIC levels II and III), and neither exhibits this behaviour.

  • @whitslack
    @whitslack Рік тому +1

    Why would they bother to NUL-terminate the string at all since the number parser simply stops when it encounters a non-numeric character? The closing quote character is a non-numeric character, so they could have simply left it in place and let the parser stop on a quote rather than on a NUL.

    • @8_Bit
      @8_Bit  Рік тому +1

      The terminator is necessary when evaluating string variables as they're stored next to each other with no separators. Rather than determine the context, it just puts the zero there whether it's necessary or not.
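To illustrate the point about string variables sitting next to each other with no separators, here is a minimal Python sketch. It is only a model, not the ROM code: the `heap` contents, the `(start, length)` descriptors and both function names are made up for illustration. It shows why a parser that simply stops at the first non-numeric character needs a temporary terminator when the next string begins with a digit.

```python
# Hypothetical model: two string values packed back to back in a heap,
# each located only by a (start, length) descriptor, with no separator byte.
heap = bytearray(b"1234")
a_desc = (0, 2)   # A$ = "12"
b_desc = (2, 2)   # B$ = "34"


def parse_until_non_digit(memory, start):
    """Read digits from start until a non-digit byte or the end of memory."""
    pos = start
    while pos < len(memory) and chr(memory[pos]).isdigit():
        pos += 1
    return int(memory[start:pos].decode("ascii") or "0")


def val_with_terminator(memory, desc):
    """Temporarily write a 0 byte just past the string, parse, then restore."""
    start, length = desc
    end = start + length
    if end >= len(memory):          # nothing follows the string
        return parse_until_non_digit(memory, start)
    saved = memory[end]
    memory[end] = 0
    try:
        return parse_until_non_digit(memory, start)
    finally:
        memory[end] = saved         # always put the original byte back


naive = parse_until_non_digit(heap, a_desc[0])   # runs on into B$
terminated = val_with_terminator(heap, a_desc)   # stops at the end of A$
print(naive, terminated)
```

Without the terminator, the parse of A$ swallows the digits of B$ as well. Note that this sketch restores the byte in a `finally` block, so it does not reproduce the bug itself.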

    • @whitslack
      @whitslack Рік тому

      @@8_Bit I see! Have you made a video about the layout of the BASIC string heap in memory (and how BASIC performs garbage collection)?

    • @8_Bit
      @8_Bit  Рік тому +1

      I at least partially covered it on the PET (which is more-or-less the same) in the video "A Pre-Rogue-Like: Fixing 1979's DUNGEON for the Commodore PET" as the garbage collection was related to the bug.

    • @8_Bit
      @8_Bit  Рік тому +1

      Link to video if UA-cam doesn't censor: ua-cam.com/video/VG0ODzV48fI/v-deo.html

    • @whitslack
      @whitslack Рік тому

      @@8_Bit Awesome! Thank you!

  • @aresaurelian
    @aresaurelian Рік тому

    I have a memory of trying VAL() as a kid and getting this error when doing some elaborate mathematical wizardry. It annoyed me.

  • @brianwild4640
    @brianwild4640 Рік тому +2

    The bug is not in the 8-bit Atari BASIC; it shows 1.0E+39, or for the 39 nines, 9.99999999E+38.

    • @8_Bit
      @8_Bit  Рік тому +1

      Atari released multiple versions of BASIC over the years, which one are you using?

    • @brianwild4640
      @brianwild4640 Рік тому +1

      @@8_Bit Back, sorry about the delay, I just had all my machines out. It works on rev A, rev B and rev C BASIC, and on the 800, 800XL and 130XE (which it would, as the OS doesn't matter). So it works on all revs of BASIC. About the only thing good about Atari's slow BASIC, lol.

    • @8_Bit
      @8_Bit  Рік тому +1

      @@brianwild4640 Do you have the Atari Microsoft BASIC cart? It'll probably bug on that one.

    • @Longuncattr
      @Longuncattr Рік тому

      @@8_Bit I checked, and Atari Microsoft BASIC does *not* show the bug. It does print "OVERFLOW" twice in a row, though, like on the IBM 5155 you showed.

    • @8_Bit
      @8_Bit  Рік тому

      @@Longuncattr Interesting! That's actually surprising as from what I've read it's based on the same code used in the Apple and Commodore versions of Microsoft BASIC.

  • @faenethlorhalien
    @faenethlorhalien Рік тому

    Wow. I remember using it on the Speccy and never had any issues with it. Obvs even fewer on GWDOS on the pc.

    • @8_Bit
      @8_Bit  Рік тому +1

      Yeah, as far as I know Speccy's BASIC has no connection to Microsoft's so it suffers from a completely different set of bugs :) (I might finally make some Speccy videos in 2024, so watch out) ;)

  • @IllidanS4
    @IllidanS4 Рік тому

    It's so surreal to see a function get its argument right from the actual line of the program. That is something you could never see in modern languages.

    • @NuntiusLegis
      @NuntiusLegis 11 місяців тому

      You could never see such caring for the most efficient solution in modern bloatware.

  • @barcoboy2
    @barcoboy2 Рік тому

    Very interesting. After running the program, if you try to insert a line between 10 and 20, or add a line to the top or bottom of the program, line 10 gets changed to:
    10 PRINT VAL("1E39
    8335 SHOW BUG
    Fixing line 10 or even making it shorter or longer does not result in any corruption that I can see.

    • @WY.C64-Guy
      @WY.C64-Guy Рік тому

      I think that's because the linker has to execute to make a correct linked list to include the new line(s), and runs across the garbage left behind.

  • @ericswanson2527
    @ericswanson2527 Рік тому

    It's fine in MS Basic on a 1982 Epson HX-20. OV error, but no truncation.

  • @agpxnet
    @agpxnet Рік тому

    Nice catch. It's absurd that the BASIC interpreter has to modify the program (just to terminate the string with a null) to execute the VAL function. This is done by the following code:
    B7CF A0 00 LDY #$00
    B7D1 B1 24 LDA ($24),Y
    B7D3 48 PHA
    B7D4 98 TYA
    B7D5 91 24 STA ($24),Y
    B7D7 20 79 00 JSR $0079
    B7DA 20 F3 BC JSR $BCF3
    B7DD 68 PLA
    B7DE A0 00 LDY #$00
    B7E0 91 24 STA ($24),Y
    First the original character is saved on the stack (B7D1 - B7D3), then it's set to zero (B7D4 - B7D5). The routine at BCF3 converts the null-terminated string to float, and then the original character is restored (B7DD - B7E0). It's a trick to avoid copying the string to another area. In case of overflow, however, the routine doesn't return to the caller (B7DD), so the program line is cut. Another approach would be to pass the termination character to the routine at BCF3.
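The save / zero / parse / restore sequence described above can be modelled in a few lines of Python. This is a rough sketch, not a translation of the ROM: the program text is kept as plain untokenized ASCII, `parse_number` is a simplified stand-in for the routine at BCF3, and all names plus the approximate float limit are assumptions made for illustration. `val_buggy` skips the restore when the parser bails out with an overflow; `val_with_cleanup` shows one way the restore could be guaranteed.

```python
MAX_FLOAT = 1.70141183e38  # approximate upper limit of the C64's floats


class BasicOverflow(Exception):
    """Stands in for BASIC's ?OVERFLOW ERROR, which abandons the statement."""


def parse_number(memory, start):
    """Simplified stand-in for the ROM parser: read text up to the 0 byte."""
    end = memory.index(0, start)
    value = float(memory[start:end].decode("ascii"))
    if abs(value) > MAX_FLOAT:
        raise BasicOverflow(value)
    return value


def val_buggy(memory, start, length):
    end = start + length
    saved = memory[end]                  # remember the byte after the string
    memory[end] = 0                      # overwrite it with a terminator
    value = parse_number(memory, start)  # an overflow never comes back here
    memory[end] = saved                  # so this restore is skipped
    return value


def val_with_cleanup(memory, start, length):
    end = start + length
    saved = memory[end]
    memory[end] = 0
    try:
        return parse_number(memory, start)
    finally:
        memory[end] = saved              # restored even on overflow


def listing(memory):
    """Show the line the way LIST would: everything up to the first 0 byte."""
    return bytes(memory).split(b"\x00")[0].decode("ascii")


def run(val_function):
    program = bytearray(b'10 PRINT VAL("1E39"):REM SHOW BUG\x00')
    start = program.index(b'"') + 1      # first character inside the quotes
    try:
        val_function(program, start, 4)
    except BasicOverflow:
        pass
    return listing(program)


buggy_listing = run(val_buggy)
fixed_listing = run(val_with_cleanup)
print(buggy_listing)
print(fixed_listing)
```

In the buggy variant the closing quote stays replaced by a zero byte, so the listing ends right after `1E39`. With the cleanup variant the line survives intact.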

    • @NuntiusLegis
      @NuntiusLegis Рік тому +1

      I don't find it absurd because it is a fast and memory-saving solution, which is important on an 8-bit system, and that bug is quite unlikely to occur. I use VAL quite a lot (I like to use string arrays like C structs to store text and numbers and use VAL if a calculation is needed), so I am glad this works fast enough in most cases, and I never had an overflow error.

    • @agpxnet
      @agpxnet Рік тому +1

      @@NuntiusLegis As I wrote, it would have been enough to pass the termination character ($22) to the routine and you can avoid this hack.

    • @NuntiusLegis
      @NuntiusLegis Рік тому

      @@agpxnet What do you mean by "termination character"? I think there usually is no such thing in C64 BASIC, that's why a zero is set for this particular routine.

    • @agpxnet
      @agpxnet Рік тому

      @@NuntiusLegis I mean the character that a string ends with. The routine at BCF3 scans through all locations of the string until it hits a zero (which is why that hack modifies the code). However, a literal string doesn't end with a zero, but with the quotation mark character (ASCII = $22). If we could pass the end-of-string character to the BCF3 routine, for example via a register (rather than it being hardcoded as 0), the hack would not be necessary.

    • @NuntiusLegis
      @NuntiusLegis Рік тому

      @@agpxnet So you mean the quotation mark, which only works with literals, but VAL must also process string variables where the string data lies in string memory without any separating characters, as far as I know.

  • @RSCuber
    @RSCuber Рік тому

    Checked on an Altair (clone). Extended basic on the Altair gives the overflow error without corrupting memory. 16k ROM and 8k BASIC do the bug as described.

  • @fuzzix
    @fuzzix Рік тому

    7:30 While Microsoft released fixed BASICs, it looks like Commodore still didn't shell out for an updated version and backported fixes themselves?
    Great investigation, cheers!

    • @williamdrum9899
      @williamdrum9899 Рік тому

      Probably to save money on the licensing

    • @joehoy9242
      @joehoy9242 Рік тому

      @@williamdrum9899 - CBM (specifically Jack Tramiel) didn't do licensing; he insisted on buying a cut of MS BASIC outright in 1977 that incurred no further cost, and that CBM would be able to brand and modify themselves. This was done when the PET was in development - and because Micro-Soft were very much a fledgling company at the time and needed the money, they sold it lock, stock and barrel for about 50 thousand dollars (yup, you read that right), but they didn't read the fine print, which said CBM could use it on any derived technology, not just the PET itself. Microsoft then had to watch as that version of BASIC was used on the VIC-20 (the first home machine to ship a million units) and, to add insult to injury, the C64 (most successful home computer of its generation, shipping around 8 million units), knowing that they would not see a red cent from any of it. Tramiel pretty much took them to the cleaners, and it wasn't until they lucked out leveraging BillG's parents' relationship with IBM that the company's future was certain. There's a reason MS (and Apple) don't like to talk much about CBM!

  • @kevinwnz
    @kevinwnz Рік тому +1

    The commander x16 also has this basic bug

  • @oleimann
    @oleimann 4 місяці тому

    Try:
    10 A$="1E39":REM STRING PREPARED
    20 A=VAL(A$):REM STRING-TO-NUMBER
    30 REM LAST LINE
    RUN
    LIST
    output:
    10 A$="1E39
    20 A=VAL(A$):REM STRING-TO-NUMBER
    30 REM LAST LINE
    The poke to clear it is POKE 2061,34 on the C64 (that's where the closing quote is)
    This happens because when you set a string from a quoted string, it just refers to that actual location in basic program memory :)
    However, if you use:
    10 A$="1E"+" 39" :REM STRING PREPARED
    It won't be affected, since the full String is in String memory (top of basic memory)
    Also fun with DATA:
    10 DATA 1E39,12345:REM STRINGS PREPARED
    15 CLR:READ A$: REM RESETS AND PICKS UP
    20 A=VAL(A$):REM STRING-TO-NUMBER
    30 REM LAST LINE
    RUN
    LIST
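The address in POKE 2061,34 can be checked with a bit of arithmetic. The sketch below assumes the default C64 layout (program text starting at 2049, i.e. $0801), a 4-byte header per line (2-byte link plus 2-byte line number), and that "=" is stored as the single token byte $B2. The variable names are made up for illustration.

```python
BASIC_START = 2049      # $0801, where BASIC program text starts on a stock C64
LINE_HEADER_BYTES = 4   # 2-byte link to the next line + 2-byte line number
EQUALS_TOKEN = 0xB2     # "=" is stored as a single token byte

# Line 10 as stored after the header: A$="1E39" (only up to the closing quote).
line_body = b"A$" + bytes([EQUALS_TOKEN]) + b'"1E39"'

opening_quote = line_body.index(b'"')
closing_quote = line_body.index(b'"', opening_quote + 1)
poke_address = BASIC_START + LINE_HEADER_BYTES + closing_quote
poke_value = ord('"')   # character code of the quotation mark

print(f"POKE {poke_address},{poke_value}")
```

This only holds when line 10 is the first line of the program and is typed exactly as shown. Any earlier line, or a different variable name, shifts the address.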

  • @cheater00
    @cheater00 Рік тому

    hey Robin, I have an idea for a video, or series of videos, for you. you usually show a thing on the c64 and go deep into how it works and why. wouldn't it be fun if you compared some other computers? like, take a simple thing, like integers, and their overflow modes. compare basic implementations and see why they differ, and how that is grounded in the hardware it's running on. or how line drawing functions compare. or how strings work. or how the value of pi**2 changes. or what various control structures each computer has. this would be a great chance to whip out some really obscure basic machines or even clones and knockoffs from the soviet union, china, or south america... let me know if you like the idea!

  • @stefankrause5138
    @stefankrause5138 Рік тому

    Has this bug been tested on the commander x 16? 😛

    • @8_Bit
      @8_Bit  Рік тому

      Some other comments have said the bug exists on the X16 too :)

  • @allenhuffman
    @allenhuffman Рік тому

    When you finally get around to watching a video, and find yourself referenced...

  • @WxAxNxDxExRxExR
    @WxAxNxDxExRxExR Рік тому +1

    BASIC bugs aside, how did you get your vintage computers to look perfect and brand new (if not better )???

  • @allenhuffman
    @allenhuffman Рік тому

    It dawns on me that the "run BASIC out of ROM" stuff would not work with VAL due to this. I expect C= had this as well, but we had programs that could take a tokenized BASIC program and put it in a ROM-Pak, and run like a cartridge. (Unless they copied it in to program RAM, I suppose.)

    • @NuntiusLegis
      @NuntiusLegis Рік тому

      What do you mean? BASIC works out of ROM with VAL for more than four decades now on the C64.

    • @allenhuffman
      @allenhuffman Рік тому

      @@NuntiusLegis The program is in RAM, and it is the program code that is getting modified here.

    • @NuntiusLegis
      @NuntiusLegis Рік тому

      @@allenhuffman Ah, sorry, now I understand. A workaround would be to pass the parameter for VAL in a concatenated string (variable), which is always stored in the string heap in RAM.

    • @allenhuffman
      @allenhuffman Рік тому

      @@NuntiusLegis Very good! Yes, we use the trick of adding +"" to force that on the CoCo BASIC. I wrote up a multipart series on how the string memory works, but the C= BASIC appears to work differently. Good to know that part would be the same. X=VAL("1E39"+"") might work.

    • @NuntiusLegis
      @NuntiusLegis Рік тому

      @@allenhuffman It would be nice to have someone examine that in detail on the C64 (I am not good at analyzing assembly code). I am not sure what VAL does with data on the string heap, if it is modified there or if VAL (always) uses the length entry in a string descriptor, or if a descriptor only exists for variables or also for concatenated literals.

  • @HelloKittyFanMan
    @HelloKittyFanMan Рік тому +1

    The E actually means _"times_ ten to the specified power."

    • @williamdrum9899
      @williamdrum9899 Рік тому +1

      Hmm I guess I'm used to thinking of it as a hexadecimal value

    • @HelloKittyFanMan
      @HelloKittyFanMan Рік тому +1

      @@williamdrum9899: Yeah, it's (dec.) 14 when in notations like "$3E7B" or "E4"; but means " *E* xponent" (with the already-understood "*10") in notations like "1E39" or "5E+26" or "18E-9," where the environment is assumed to be decimal (as it is in BASIC).
      But that raises the question: Normally I've seen big numbers expressed by the computer as "[x]E+/-[y];" I mean always with a + or -. So that reminds me to ask Robin to try this with the + or - and see if that still triggers the bug. (If not today, then to wait until after Christmas to address it, of course.) Happy Christmas! 🎄🎅
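For anyone wanting to see the two readings of "E" side by side, here is a small Python sketch. It uses Python's own number parsing, not the C64's, and the float limit is the approximate largest value Commodore BASIC can hold. It says nothing about whether the signed form triggers the bug on real hardware; it only shows that "1E39" and "1E+39" denote the same number and that this number is beyond the limit.

```python
C64_MAX_APPROX = 1.70141183e38  # approximate largest C64 BASIC float

# Decimal scientific notation: mantissa times ten to the given power.
samples = ["1E38", "1E39", "1E+39", "5E+26", "18E-9"]
parsed = {text: float(text) for text in samples}
fits = {text: abs(value) <= C64_MAX_APPROX for text, value in parsed.items()}

# Hexadecimal notation: here E is just the digit with value 14.
hex_digit_e = int("E", 16)
hex_e4 = int("E4", 16)

for text in samples:
    print(text, parsed[text], "fits" if fits[text] else "overflows")
print(hex_digit_e, hex_e4)
```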