About Commodore 64 BASIC Abbreviations

  • Published 28 Aug 2024

COMMENTS • 126

  • @8_Bit
    @8_Bit  5 years ago +7

    Index:
    0:00 Jeff’s Question
    1:42 Robin’s Intro
    2:05 Transactor Magazine
    3:37 BASIC abbreviation examples
    5:07 About crunching
    6:02 More unusual abbreviations
    8:03 No abbreviation for INPUT
    8:50 Examining the buffer
    10:39 Karl Hildon’s Complete Inner Space Anthology
    12:12 Jim Butterfield’s SuperChart
    12:52 About keyword tokens in BASIC memory
    17:02 Inspecting the BASIC keyword table in ROM
    19:19 The bizarre gosuB
    22:20 Using BRKs to inspect the crunch; about vectors
    26:48 An example of crunching
    29:13 Disassembling the tokenization routine
    35:00 Finally!
    36:25 Ending

  • @csbruce
    @csbruce 5 years ago +25

    6:29 "pR" is "print#" because if two tokens have matching prefixes, the longer one needs to come first in the search table so that the shorter one doesn't permanently "hide" the longer one.
    8:21 If the high bit is set in the input text, it effectively acts as a wildcard that matches the first entry found with the common prefix, so "iN" and "inpU" match "input#". You can't seem to set the high bit on the last character of a token without getting weirdness.
    10:00 As I understand it, there are only three $00 bytes that end a BASIC program: the $00 that terminates the last line and then $00 $00 as the "null pointer" link that indicates there are no more lines. The fourth $00 in your display is probably the content of your previous example lines overwriting each other.
    10:40 The Anthology is my go-to reference. The reprint included the C128 memory map.
    19:02 If "input" came before "input#", then "input#" would always be parsed as the token "input" followed by the character "#" and the "input#" token would never be found using the simple linear-search method implemented in the token cruncher.
    23:58 Lots of BASIC wedges cheap out and intercept errors or take over the zero-page lexical scanner, but proper BASIC extensions take over the various extension vectors and process the extension keywords in the same way as stock BASIC.
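
    The ordering point made at 6:29 and 19:02 can be sketched in a few lines of Python. This is a hypothetical first-match scanner (the name `first_match` and the two-entry tables are made up for illustration), not the ROM routine itself; it only shows why the longer keyword has to come first when the search is a simple linear one.

```python
def first_match(text, keywords):
    """Return (keyword, rest) for the first table entry that prefixes text."""
    for keyword in keywords:
        if text.startswith(keyword):
            return keyword, text[len(keyword):]
    return None, text


# Longer keyword first: INPUT# is found as a single token.
print(first_match("INPUT#1,A$", ["INPUT#", "INPUT"]))

# Shorter keyword first: INPUT wins and "#" is left over as a stray character,
# so the INPUT# entry can never be reached.
print(first_match("INPUT#1,A$", ["INPUT", "INPUT#"]))
```

    With the second ordering the `INPUT#` entry is dead weight, which is the "hiding" effect described above.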

    • @cigmorfil4101
      @cigmorfil4101 4 years ago +4

      8:21 The abbreviations and weirdness are caused by a side effect of the crunch routine and how it detects the end of a token.
      Thinking back to when the routine was written, the processor was limited in the quantity of memory it could access and so code had to be as compact as possible. The routine uses a few neat tricks to save bytes - with modern machines having so much plentiful memory the art of small, efficient code has gone out of the window, along with the appreciation of the need for it.
      One of these is the way end of tokens are identified in memory and how it is detected to leave the registers ready for more work.
      By setting the top bit and using unsigned arithmetic, if the two characters are equal, ie when one is subtracted from the other the result is 0, then there is a perfect match and the next characters are checked. However, if one of the two has its top bit set then the result is 0x80 regardless of which had the top bit set.
      This is useful as the count so far can be added (ORed) to this to give the count plus 128 as the required token.
      Why not just start with the count at 128?
      The count is initialised by using the offset (Y = 0 -> start of token table) into the token table, which is then decremented by 1 ready for the token check loop. If 128 had been stored, it would require an extra 2 bytes to load a register with this value. As Y needs to be set to 0xff, the DEY could be removed and another LDY# put in, for a total extra of 3 bytes!
      The advantage of using SEC, SBC instead of CMP (one byte shorter) is that not only is exact testing achieved (Z=1) but also testing that one of the two (expected to be the token table) has the top bit set comes down to a simple CMP - using 10 bytes; to check that the characters were the same, except for bit 7; to check if bit 7 was set on the token table would take more bytes.
      Using a C style string token table would require an extra byte for each token to contain the terminating 0 - an extra 76 (if I've counted correctly) bytes of memory required!
      Incidentally, microsoft could have saved a byte (and 2 clock cycles for every byte pair tested) by instead of using SEC, SBC ,Y they had used EOR ,Y reducing the token identifying check to 9 bytes.
      A few bytes later they have STA ,Y LDA ,Y. This may seem odd at first but the intent is to set the flags which the STA doesn't do. The same could have been achieved by either ORA #0 or CMP #0 instead of the LDA instruction which would have been a byte, and 2 clock cycles, shorter.
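
      The claim that EOR could stand in for SEC/SBC in the match test can be checked by brute force. The sketch below models only the 8-bit result of each operation (the function names are hypothetical, and processor flags other than "result is zero" are ignored), then compares the two verdicts for every byte pair.

```python
def sbc_diff(a: int, b: int) -> int:
    """8-bit result of SEC / SBC: (a - b) modulo 256."""
    return (a - b) & 0xFF


def eor_diff(a: int, b: int) -> int:
    """8-bit result of EOR: bitwise exclusive or."""
    return a ^ b


def same_verdict(a: int, b: int) -> bool:
    """True if both methods agree on 'equal' (0) and 'differs only in bit 7' ($80)."""
    s, e = sbc_diff(a, b), eor_diff(a, b)
    return (s == 0) == (e == 0) and (s == 0x80) == (e == 0x80)


# 'B' typed unshifted against the table's final 'B' with bit 7 set, both ways round.
print(hex(sbc_diff(0x42, 0xC2)), hex(sbc_diff(0xC2, 0x42)))
print(all(same_verdict(a, b) for a in range(256) for b in range(256)))
```

      Both results are 0 for identical bytes and $80 when only the top bit differs, so the later ORA with the token count would work the same way under either instruction.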

    • @BillAnt
      @BillAnt 1 year ago

      ​@@cigmorfil4101 - For negative values you'd do
      EOR #$FF ; For one's complement
      SEC
      ADC #$00 ; Increment by adding 1 via the carry flag

    • @cigmorfil4101
      @cigmorfil4101 1 year ago +1

      @@BillAnt
      The 6502 (precursor of the 6510) was quite a neat hardware design. It was a bit confusing for Z80 programmers in that the carry had to be set, not cleared, before a SBC:
      To do a subtraction the 6502 used 2s complement and addition: it did exactly what you said: create the 1s complement and use the carry to convert to 2s complement on the add - thus the need to set the carry before a subtraction - with multi-byte numbers where a borrow is required shown by the carry being unset and so leaving the 1s complement, 1 short, automagically doing the borrow.
      SBC: A - O == A + not(O) + C
      A SUB op code would have been nice: it would do a SBC but automagically pre-set the C flag - saving a byte of memory and 1 or 2 clock cycles. Interestingly, the CMP instruction does a subtraction to set the flags but does not store the result, which means it must internally effectively set the carry flag. Similarly an ADD op code to automagically also pre-clear the carry for an ADC would have been useful.
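
      The identity SBC: A - O == A + not(O) + C, and the negate sequence from the earlier reply, can be modelled in Python. This is a minimal sketch of binary mode only: decimal mode and the overflow flag are left out, and the function names are hypothetical.

```python
def sbc(a: int, operand: int, carry: int):
    """Model 6502 SBC in binary mode: A + (operand XOR $FF) + C.

    Returns (result, carry_out). carry_out == 0 means a borrow occurred.
    """
    total = a + (operand ^ 0xFF) + carry
    return total & 0xFF, 1 if total > 0xFF else 0


def negate(a: int) -> int:
    """EOR #$FF / SEC / ADC #$00: one's complement plus the carry."""
    return ((a ^ 0xFF) + 1) & 0xFF


def sub16(x: int, y: int) -> int:
    """Two-byte subtraction: SEC once, then let the carry chain the borrow."""
    lo, carry = sbc(x & 0xFF, y & 0xFF, 1)
    hi, _ = sbc(x >> 8, y >> 8, carry)
    return (hi << 8) | lo


print(sbc(5, 3, 1))          # carry set beforehand: plain subtraction
print(sbc(3, 5, 1))          # result wraps, carry cleared to signal a borrow
print(hex(sub16(0x0100, 0x0001)))
```

      In `sub16` the low byte leaves the carry clear, so the high byte's SBC comes out one short, which is the "automagic" borrow described above.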

  • @KetilDuna
    @KetilDuna 4 years ago +2

    Suddenly I remember how I expanded the c64 basic by altering these vectors, intercepting the tokenize, detokenize and run processes and adding my own lame stuff. It began with a book outlining the work, and off i went to do slow stuff fast mostly for costly things like scrolling (move, fill, colors), sprite and hires, and even stubbed my toes on the sid. Happy days! Thank you for these videos.

  • @lostindesolation2810
    @lostindesolation2810 5 years ago +7

    All these years thinking it was an intentional feature. Thanks a lot, really appreciate these vids even though I rarely use basic these days.

  • @noland65
    @noland65 5 years ago +17

    Some notes on this…
    1) Fun fact: The PET 2001 ROM V.1.0 (the one without the hex monitor) didn't have a "GO" token, this came only with ROM 2.0. If you try "gosuB" with this one, it will just tokenize the PETSCII-character values for "gosu", character by character, missing the "B", since any bare characters with the sign-bit set ($80 and higher) will be ignored. (This is also, why there's the special case for the pi-character, $FF, in the tokenizer routine.)
    2) The magical difference $80 works only, because it's actually the absolute difference - which is the really clever bit in this, or the crucial bug, depending on your point of view. Negative $80 is the same as positive $80 in 8-bits! So it doesn't matter, which of the two operands is bigger, it's all about the absolute offset in PETSCII values. This wouldn't work with any other offset. (Also, it's just the sign-bit set, quite handy for coding.)
    3) The "mid$" mystery also works with "gotO", resulting in "mid$t", but no other token.
    4) On the "gosuB"/"mid$u" bug
    What actually seems to happen is that the matching routine fails on matching "gosub" as both the parsed string and the token have the sign-bit set on the last character and there is no difference, indicating to the tokenizer that there's still another character to match. At the same time, we also hit the end of a word, resulting in the pointers into the input-string and the keyword table to be reset in order to check for the next token, but without incrementing the keyword/token index. (As for the token matching, we're still looking into the same keyword, since we had a perfect match on the last character, indicating more to come.) As a result, the keyword-index lags behind by 1 as the search resumes with an attempt to match "return". While the token for "return" is 14 in actuality, this is now 13, and so on. When the search eventually finds a match, we're still a token count behind, resulting in "mid$".
    The same bug applies to "gotO", starting over with the index for "goto" being reused for "run" (now 9 instead of 10). This works only, because "go" is a substring of "gosub" and "goto" and there will be eventually a match using the degraded token index.
    Somewhat related, if we try to input "10 input" ( is the graphics character resulting from "#" with the sign-bit set), we get neither the "input#" token nor the token for "input", but the PETSCII character sequence "input" (not tokenized). The same is probably true for "open#" with the sign-bit set on the last character.
    Edit: Actually, we just spill over into the next keyword as the uppercase "B" in the input buffer matches the last character of "gosuB" in the keyword table and the pointer into the input buffer will be reset only as we fail the match on the next keyword ("return").
    More on this: www.masswerk.at/nowgobang/2019/commodore-basic-tokenizing

    • @cigmorfil4101
      @cigmorfil4101 4 years ago

      It's not a matter of not increasing the token counter, it's a matter of recognising where the end of the token is.
      When a match fails the crunch routine *looks forward from the current position* to find the next byte in the keyword table with its top bit set which is recognised as the end of the token we were looking at.
      In the case of the video what we have is the input line of "gosuB[null]" being matched against "gosuBreturN..." there is a perfect match with "gosuB" so crunch thinks it's still looking at the token "gosuBreturN". The next characters mismatch: a null from the input against 'r' of the token. As the mismatch is not 128 (indicating that the characters are the same except for the high bit set), it *looks forward* from the 'r' in the token table to find the next character with the high bit set to mark the end of the token and increment the token count.
      By shifting the 'B' of "gosuB" it effectively made the two tokens "gosuB" and "returN" following it into a single token.
      The result is that instead of checking for "returN" after the failed attempt at "gosuB", it starts at the next token of "reM" with the token count at "returN", which is the correct next token to check. Thus when it matches "gO" it is counting one token less than it ought and inserts the previous token (mid$), followed by copying "suB", but with the 'B' being deliberately dropped (see below).
      If it were not for the extra "gO" token at the end of the table the effect would be to copy character(s) into the tokenadised line to be interpreted as a "let ".
      I suspect that you could use this as a trick to get a keyword as a variable, eg entering
      10 returNn=returNn+1
      may list as:
      10 return=return+1
      and run without error, setting the variable "re" to 1 more than its previous value (which would be 0 if unused until that point).
      The reason the 'B' is missing is simple: the basic uses bytes with the top bit set as a token; to avoid any confusion any byte, except the pi symbol, input with the top bit set is deliberately ignored (after the token matching). The pi symbol is interpreted as a token to mean pi, the ratio of a circle's circumference to its diameter (3.14...).
      It is probably accurate to say that the "abbreviation" of keywords is not a bug but a side-effect of the method used to check the tokens for equality and top bit being set to mark the end of a keyword.
      Also, Microsoft could have saved a byte in the program code, and for every character put in the tokenadised line 2 clock cycles by using the Z80 style trick of using ORA #0 instead of LDA ,Y immediately after the STA ,Y to set the flags (specifically the Z flag).
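
      The behaviour discussed in this thread can be reproduced with a small Python model of the keyword search. It is a simplified sketch, not a port of the ROM: quotes, DATA and REM handling are left out, running off the end of the table is not modelled, and the names `build_table`, `encode` and `crunch` are made up for illustration. Lower-case letters in the input stand for unshifted keys and upper-case letters for shifted keys (bit 7 set).

```python
KEYWORDS = [
    "END", "FOR", "NEXT", "DATA", "INPUT#", "INPUT", "DIM", "READ", "LET",
    "GOTO", "RUN", "IF", "RESTORE", "GOSUB", "RETURN", "REM", "STOP", "ON",
    "WAIT", "LOAD", "SAVE", "VERIFY", "DEF", "POKE", "PRINT#", "PRINT",
    "CONT", "LIST", "CLR", "CMD", "SYS", "OPEN", "CLOSE", "GET", "NEW",
    "TAB(", "TO", "FN", "SPC(", "THEN", "NOT", "STEP", "+", "-", "*", "/",
    "^", "AND", "OR", ">", "=", "<", "SGN", "INT", "ABS", "USR", "FRE",
    "POS", "SQR", "RND", "LOG", "EXP", "COS", "SIN", "TAN", "ATN", "PEEK",
    "LEN", "STR$", "VAL", "ASC", "CHR$", "LEFT$", "RIGHT$", "MID$", "GO",
]


def build_table(words):
    """Pack keywords back to back, bit 7 set on each last character, 0 at the end."""
    out = bytearray()
    for word in words:
        raw = word.encode("ascii")
        out += raw[:-1]
        out.append(raw[-1] | 0x80)
    out.append(0)
    return bytes(out)


TABLE = build_table(KEYWORDS)


def encode(text):
    """Lower case = unshifted key; upper case = shifted key (bit 7 set)."""
    out = bytearray()
    for ch in text:
        code = ord(ch.upper())
        out.append(code | 0x80 if ch.isupper() else code)
    out.append(0)  # end-of-line marker in the input buffer
    return bytes(out)


def crunch(text):
    """Return the list of bytes the simplified cruncher would store."""
    buf, out, x = encode(text), [], 0
    while buf[x] != 0:
        a = buf[x]
        if a & 0x80:                       # stray shifted character: dropped
            if a == 0xFF:                  # except pi, which is kept
                out.append(a)
            x += 1
            continue
        if a == 0x20 or 0x30 <= a < 0x3C:  # space, digits, ':' and ';' pass through
            out.append(a)
            x += 1
            continue
        if a == 0x3F:                      # '?' is special-cased as PRINT
            out.append(0x99)
            x += 1
            continue
        start, count, y = x, 0, 0
        while True:
            if TABLE[y] == 0:              # ran past "GO" mid-compare: not modelled
                raise NotImplementedError("search ran off the keyword table")
            diff = (buf[x] - TABLE[y]) & 0xFF
            if diff == 0:                  # exact match, keep comparing
                x += 1
                y += 1
                continue
            if diff == 0x80:               # only bit 7 differs: token found
                out.append(0x80 | count)
                x += 1
                break
            x = start                      # mismatch: rewind the input pointer,
            count += 1                     # bump the token count once,
            while True:                    # and skip forward to the next bit-7 byte
                y += 1
                if TABLE[y - 1] & 0x80:
                    break
            if TABLE[y] == 0:              # end of table: store the plain character
                out.append(buf[x])
                x += 1
                break
    return out


for line in ["gosub", "goS", "gosuB", "1 gosuBreturn", "pR", "iN"]:
    print(line, [hex(b) for b in crunch(line)])
```

      In this model "gosuB" fuses GOSUB and RETURN into one table entry, so the count is one short when GO finally matches and the stored token is $CA (MID$) followed by the characters S and U, with the shifted B dropped.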

  • @MrGoatflakes
    @MrGoatflakes 5 years ago +2

    Wow, who would have suspected that such a simple thing that seems to be a deliberate choice to aid productivity actually was a subtle bug that opens up how tokenisation in BASIC is performed? Amazing!

  • @rga218
    @rga218 5 years ago +3

    Thanks so much for this video (and the others you've made). I'm just getting into retrocomputing after picking up a C64 recently. I have been having a lot of fun with it, and have learned a lot from your terrific videos.

  • @dwhxyz
    @dwhxyz 5 years ago +4

    I think a great part 2 to this video would be showing how to add additional BASIC commands via wedging CHRGET. This is something the 11-year-old me found to be magic even though I had a good understanding of machine code and 6502. It still amazes me that I managed to learn 6502 with only a machine code monitor and no books, documentation or teacher. I just worked it out and strangely felt more comfortable in 6502 than BASIC! I think I still have somewhere my original handwritten decimal, hex, PET ASCII and mnemonics lookup tables, similar to what was in your book.

    • @BillAnt
      @BillAnt 1 year ago

      The best way to learn is by trial and error; that's what sticks best in your memory because you figured it out yourself. Of course a good reference guide of mnemonics and opcodes is a must. For me ML was more of a necessity due to its speed and interrupts.

  • @BillSzczytko
    @BillSzczytko 5 years ago

    I hadn't heard of The Transactor until this video. I am devastated I couldn't get it back in the day. The articles are fantastic. I've been reading a ton of them since this video. Thank you sir!

  • @75slaine
    @75slaine 5 years ago +2

    Wow, great video Robin. Some serious research went into that, thank you !!

  • @LeftoverBeefcake
    @LeftoverBeefcake 5 years ago

    I just looked up Karl Hildon’s Complete Inner Space Anthology and found a PDF version. And I gotta say WOW! It's basically a big book of "cheat sheets" that covers an absolute ton of stuff and it even covers the Plus/4 and C16. Thanks for cluing me into this amazing book, this is going to come in handy for a couple different projects I have going on. Hopefully I can snag a copy on eBay as this is valuable reference material to add to the collection.

  • @billkeithchannel
    @billkeithchannel 4 years ago +1

    I bought Jim Butterfield's book to learn Assembly to directly program the graphics on my C128 in 80 column 2MHz mode.

  • @HeyBirt
    @HeyBirt 5 years ago +1

    Great episode once again Robin! The quirk with 'gosuB' is really interesting, I had not run across that ever being mentioned before.
    Tidbit: I seem to recall that at the time MS 6502 BASIC was being written, Paul Allen was concentrating on 8080/Z80 BASIC, Bill Gates was doing 6502, and the only other employee MS had at the time did not code. Bill's last bit of code was the OS/BASIC for the Tandy Model 100 (8085 based).

    • @8_Bit
      @8_Bit  5 years ago

      I’ve heard various stories. I just looked more and this article mentions Richard Weiland may be the main author: www.pagetable.com/?p=43
      Thanks again for prompting the episode!

    • @HeyBirt
      @HeyBirt 5 years ago

      That is by far the best explanation of the history of MS BASIC I have seen! Thanks!

    • @8_Bit
      @8_Bit  5 years ago

      The guy that runs pagetable.com does amazing work.

  • @PhilReynoldsLondonGeek
    @PhilReynoldsLondonGeek 4 years ago

    There was a book with an adventure game in it (forget what, precisely) published in the UK, and in the notes for C64, it required one GOSUB (goS) and one RETURN (reT) to be abbreviated purely due to line length.

  • @AgentFriday
    @AgentFriday 5 years ago +4

    gosuB -> mid$su explanation...
    Norbert Landsteiner beat me to the punch on this one, but since I stepped through the ROM to figure this out I wanted to post my own short dedicated explanation...
    The crunch routine actually identified the GO token. However, by that time the token counter that indexes which token is being compared to had already "slipped a cog," so it was off by one. Every time there is a mismatch, the routine skips ahead to the end of the token (the shifted last letter), and increments the counter. Since the B in gosuB matched exactly, the routine was fooled into thinking it was comparing against a single token called gosuBreturn, which failed... but it only resulted in a single increment rather than 2.

    • @WaxxyOne
      @WaxxyOne 4 years ago +1

      So how would the line "1 gosuBreturn" be tokenized?

    • @markjreed
      @markjreed 3 years ago

      @@WaxxyOne As GOSUB. The high-bit B causes the comparison to continue until it gets to the high-bit N in the keyword table at which point it returns the token number. Which is off by one from the standpoint of "return", so you get gosub. If you entered `1 gosuBreturNrem` you'd get the same thing, likewise `1 gosuBreturNreMstop`. You can string together as many consecutive keywords from the table as you like...

  • @JohnnyLutz
    @JohnnyLutz 2 years ago

    Good work Robin. Very interesting!

  • @Vector_Ze
    @Vector_Ze 1 year ago

    I know it must seem odd to people who aren't touch typists, but for me it takes more time to use the 2-key shortcuts than it does to type the whole 'word' out. Been typing for more than 40 years.

  • @darkstatehk
    @darkstatehk 2 years ago

    Superb video I was hooked all the way through thank u…

  • @15743_Hertz
    @15743_Hertz 5 years ago +6

    Cool bug! Mine is when you put a color character in a rem statement and it gets parsed. For instance:
    10 REM (shift insert/backspace) (ctrl-1)

    • @3DPDK
      @3DPDK 5 years ago +5

      Try this:
      10 rem "[shift M] [cursor up] [cursor right * 7] [ctrl 1] THIS IS MY COMMENT IN BLACK
      Enter this line and then cursor back up to the shifted 'M', which you'll notice in graphics mode is a '\' This character has been used for "escape" in the automated typewriter world to send format commands to the typewriter since the 1950s. Now turn on reverse ( [ctrl 9] ) and retype the shifted 'M' and enter the line again. Now list the line. The reason for the cursor movements is because the '\' also includes a 'return' and 'line feed' which puts your cursor on the next line.
      The reason your line lists as the ATN key word (arc tangent) is you have directly typed in the token, a reversed spade or screen code 193, without quotations. The only time this works is when typed in a REM statement. The parser to enter a line into program memory no longer tries to tokenize after a REM but records what you type verbatim. Any character above 128 (high bit set) is a key word token in Commodore BASIC. When LISTed, the LIST parser recognizes this as a key word token because it's above 128 and not in quotes, and prints the key word rather than the 193 token. If it was in quotes LIST would print a reversed spade.
      Only if you include a reversed shifted 'M' or '\' and after an open quote does the LIST routine actually perform the format characters, including color changes, rather than print them as tokens. NOTE if you include a closing quote, LIST will print the closing quote, so if you want a colored REM comment without quotes do not include a closing quote.

    • @davehx
      @davehx 5 years ago

      @3DPDK If Rube Goldberg did C64 programming!! Brilliant workaround though and very well explained. Thank you.

  • @cigmorfil4101
    @cigmorfil4101 4 years ago

    As a result of the tokenadising routine, you have to use pR for print-hash; if you try to use ?# it looks the same, but when executed it will fail with a syntax error, as the line contains [print][#] and the interpreter sees the hash as the first character of what to print, and hash is an illegal character.
    Print# is a separate token and is seen as [print#] which is followed by a number or variable (as in [print#][4][,][a][$]).
    (In the above the square brackets surround each byte of the tokenadised program.)
    As the tokenadising routine starts at the beginning of the token table, it will match with the first "abbreviation" it finds. Also the # is part of the token so it comes down to the order of the keywords in the table as to which gets the "abbreviation".
    Print was given a separate abbreviation of "?" as it was commonly used a lot, but I think that the idea actually came from a prior BASIC (somewhere in the depths of my mind I seem to remember that the first BASIC book I used was non machine specific and used the "?" abbreviation for print). Also, with this specific check done first, the print keyword itself does not need to be near the beginning of the keyword table slowing down the check for other keywords.
    But it is more likely to do with the fact that for two very similar keywords (print and print#, input and input#) if the non-file version was first, the file version (with the #) would never get tokenadised as a match would be found at the non-file version - this allows for spaces to be omitted in the input line as the input was limited in length (on a 40 column PET to 2 lines, giving 80 characters including those for the line number).
    Thus the file version (with the hash) comes before the non file version (without the hash) in the token table. And explains why input has no "abbreviation". To give print its abbreviation a special case is made if the input matches a "?".

  • @brucetungsten5714
    @brucetungsten5714 3 years ago +1

    Top notch stuff - as always.

  • @MeriaDuck
    @MeriaDuck 5 years ago +1

    So those abbreviations are a quirk, which saves us typing long basic commands, and saves instructions in the tokenizer. Very cool!
    Would you please explain one day how updating basic lines works, and inserting lines between others. That must be rather interesting.
    Did, for example, saving and reloading speed things up? Or is saving just a simple memory dump of the crunched BASIC code RAM?
    So many questions!

  • @TheHighlander71
    @TheHighlander71 5 years ago

    Very enjoyable episode, thanks.
    I noticed in that crunch routine that the BVS mnemonic was used. Now there's a command I've never used before.
    If only I had had access to magazines like The Transactor...learning about assembly would have been so much quicker!

    • @8_Bit
      @8_Bit  5 years ago +1

      Yes, I think BVS and BVC are very rarely used by most people. Transactor is amazing; if you haven't already, there's a couple archives of the issues online, check them out.

    • @TheHighlander71
      @TheHighlander71 5 years ago

      @@8_Bit I will thanks.
      I've been looking for a bit into this load"*",8 behaviour. The answer appears to be in the 1541 code (which is interesting to see because it uses the same $0200 buffer apparently in the 1541 memory as BASIC does on the C64). Loading the "last used" file looks more like a bug than an intended feature going by what I see in the 1541 disassemblies.

  • @00Skyfox
    @00Skyfox 5 years ago

    You worked with Jerri on the C64DTV? That's awesome! I have two of them. I have plans to modify one to have all the C64 ports, soldered to points on the board, to make a full working C64 out of it.
    Excellent and very thorough explanation about BASIC commands here as always, so thanks! And I was wondering if you could do a video explaining how the C64 determines when it's trying to divide by zero so it knows to throw an error.

  • @NeilRoy
    @NeilRoy 5 years ago

    Hey, fellow Canuck here from Kingston. I think what may be happening with the "go su" being interpreted as "mid$" is possibly due to missing the "to" after that. I don't think "go" by itself can do anything and the interpreter may not know what to do with it so it, as you pointed out, uses the last command it was on.
    I think "goS" is the short form for "gosub".
    It's funny, even after all these years, I still have all of these short forms memorized as I programmed the C64 quite a bit back in the day. I ran a BBS off of my C64 using EBBS at the time as well. Good times.

  • @billkeithchannel
    @billkeithchannel 4 years ago +1

    I was a fan of Softside Magazine and back in '99 I was able to contact Scott Adams (the coder not the cartoonist) about his Say Yoho! series. It was cool to exchange emails with him.

    • @8_Bit
      @8_Bit  4 years ago +1

      Cool! Most years I see Scott Adams at the Midwest Gaming Classic in Milwaukee, and he's very friendly and approachable.

    • @billkeithchannel
      @billkeithchannel 4 years ago

      @@8_Bit That is cool. Hard to believe another 20 years have passed. Back then I was head coder on a MUD and his adventuring advice was helpful.

  • @75slaine
    @75slaine 5 years ago

    What a cameo-tastic week we’re getting.

  • @dwhxyz
    @dwhxyz 5 years ago +1

    Another interesting thing to show might be how people used to hide BASIC commands, like turning "10 SYS4096" into "10 HELLO". If I recall, this was done by inserting special chars into the REM statement at the end of the line? Also, you could do something to stop the program being fully listed and stop after the first line; again, this may have been done with a REM statement.

    • @SpearM3064
      @SpearM3064 5 years ago +1

      Yeah. You add a REM statement and a couple of spaces. Then you use a machine language monitor to replace the spaces with delete characters. In this case, you would have the line:
      10 SYS4096:REM(11 spaces)HELLO
      Then you'd replace those spaces ($20) with delete ($14) characters. When you tried to LIST the program, you'd see 10 HELLO
      If you wanted to stop the program being fully listed, you'd add a Shift-L to the end of the line. Then if you tried to LIST the program, you'd see 10 HELLO and then a SYNTAX ERROR.
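
      The delete-character trick can be illustrated with a toy "screen" in Python. This is only a sketch under simple assumptions: the listing is treated as a flat stream of character codes, $14 removes the previously printed character, and the function name `render` is hypothetical. It does not model the real screen editor or the Shift-L behaviour.

```python
DELETE = 0x14  # PETSCII delete, as given in the comment above


def render(codes):
    """Print codes onto a one-line 'screen', honouring delete characters."""
    screen = []
    for code in codes:
        if code == DELETE:
            if screen:
                screen.pop()       # erase the character to the left
        else:
            screen.append(chr(code))
    return "".join(screen)


# What LIST would send out: line number, statement, then the doctored REM text.
statement = b"SYS4096:REM"         # 11 characters to erase
listed = b"10 " + statement + bytes([DELETE]) * len(statement) + b"HELLO"
print(render(listed))
```

      Eleven deletes are needed because "SYS4096:REM" is eleven characters long, which matches the eleven spaces in the recipe.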

  • @wecontrolthevideo
    @wecontrolthevideo 5 years ago

    I used to watch “Bits and Bytes.” I live in the Detroit area and we could see it on TVO CH 32 across the river in Windsor, Ont. All the episodes are on YouTube, as well as “The Academy” episodes on computers. There are a lot of the Commodore books and Compute! magazine available on archive.org

  • @fnjesusfreak
    @fnjesusfreak 4 years ago

    Allen and Davidoff co-wrote the Z80/8080 interpreter, the 6502 one is credited to Ric Weiland (with versions 1.1 and later being co-credited to Gates).

  • @SamuliTuomola_stt
    @SamuliTuomola_stt 5 years ago

    excellent stuff Robin! really enjoyed the explanation.

  • @bxdanny
    @bxdanny 2 years ago

    The shortcuts using shifted characters (which added 128 to the base character's code) were kind of an accidental by-product of the way the keywords were parsed or "crunched". But what I've never understood is why Commodore made PRINT# and INPUT# separate tokens, meaning ?# wouldn't work for PRINT#, although it looked right when listed.

  • @8bittimes
    @8bittimes 5 years ago

    This "feature" is actually a side effect of saving some memory bytes in BASIC ROM. Today - in modern systems - Strings are stored with an ending zero to terminate it. Microsoft, when they wrote the BASIC, cleverly optimized away this terminating byte for every command name in the ROM table by setting the - otherwise unused - bit 7 just on the last character of a token

    • @8_Bit
      @8_Bit  5 years ago +1

      Yes, and if they had used another opcode or two to make sure it's detecting the high bit from the command table, then we wouldn't have had the shortcuts at all. Except for ? which is separately detected.
      It's also interesting to note that some parts of BASIC use 0-terminated strings, and others use the high bit, and the reason there are 3 spaces in the ?BREAK ERROR is because they print the BREAK string using the wrong print routine; it expects the high-bit to be set, but it's a null-terminated string! I show this in another video.
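
      The space saving from marking the last character with bit 7 instead of appending a zero byte can be sketched as follows. The three-word table and the function names `pack` and `unpack` are illustrative assumptions, and every word is assumed to be non-empty 7-bit ASCII.

```python
def pack(words):
    """Store words back to back, setting bit 7 on each final character."""
    out = bytearray()
    for word in words:
        raw = word.encode("ascii")
        out += raw[:-1]
        out.append(raw[-1] | 0x80)
    return bytes(out)


def unpack(table):
    """Split a bit-7-terminated table back into words."""
    words, current = [], []
    for byte in table:
        current.append(chr(byte & 0x7F))
        if byte & 0x80:            # bit 7 marks the end of this word
            words.append("".join(current))
            current = []
    return words


words = ["END", "FOR", "NEXT"]
table = pack(words)
zero_terminated_size = sum(len(w) + 1 for w in words)
print(len(table), zero_terminated_size, unpack(table))
```

      The packed form is shorter by exactly one byte per word, which is where the per-keyword saving mentioned in the comments comes from.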

  • @robertlock5501
    @robertlock5501 1 year ago

    Fascinating stuff...

  • @Mustakari
    @Mustakari 5 years ago +3

    Please do a video about adding keywords to basic. Like how it's done in Simon's Basic.
    I don't know, I'm not an expert on the topic but maybe there is space for them if you abbreviate error messages and rearrange things a bit.

    • @SpearM3064
      @SpearM3064 5 years ago +4

      That's somewhat easy to explain, Mustakari. Commodore BASIC is tokenized (as you know). The tokens start at $80 (128) and end at $CB (203), except for pi, which is $FF (255). This leaves room in the table for 51 more tokens between $CC (204) and $FE (254). So one way to add keywords to BASIC is to modify the tokenizer routine to point to a new table that includes your additional commands. Of course, you also have to modify the interpreter to provide vectors for your new commands so it jumps to the correct routine when it sees one of your new commands. There are other ways to do it, too, but this is the method that Simon's Basic used.
      Another way is to wedge into the CHRGET routine, which reads the next character. The DOS Wedge included on the disk that came with the 1541 disk drive did this. It would look for a special character (in this case, @ or / or % or >) and jump to another routine if it saw one of these characters. There was a program published in COMPUTE! Magazine that added extra commands this way.

    • @Mustakari
      @Mustakari 5 years ago

      Thanks Mr. SpearM75503. I guess with this information I could (with great effort) pull it off to make my own customized rom but I would still like to see a video about this.

  • @duckyvirus
    @duckyvirus 5 years ago

    A little behind on these videos, but it seems I've no code to convert into the GIT lol.. Robin is great at explaining things... I've used the C64 for a lot longer than I'll admit and even I am learning things

  • @armin.hierstetter
    @armin.hierstetter 4 years ago

    I think that the go/to/mid$ mess might have something to do with the fact that the "go" token is the last in the token list (if I spot that correctly). Maybe the "go" token was chosen randomly; it could maybe also be "xy". It might just mark the end of the list. Don't have the assembly in front of me, so I am just making wild guesses.

  • @SomeBorkedAccount
    @SomeBorkedAccount 5 years ago

    This video is pure ASMR. Don't think that was the intent, but still

    • @8_Bit
      @8_Bit  5 years ago

      Didn't intend ASMR, but I have deliberately worked on how I mic and process the audio in my videos to capture the distinctive sounds of using these retro computers. So, thanks :)

  • @retroelectrons2
    @retroelectrons2 5 years ago +1

    L.F. Sander's book "Tips and Tricks for Commodore" (1989): I found it as a PDF.

  • @rudolfrieder186
    @rudolfrieder186 5 years ago +2

    Very interesting! I always thought abbreviations were unique in Atari BASIC. There they use a period sign like L. for LIST

    • @8_Bit
      @8_Bit  5 years ago +1

      I need to play around with my Ataris more! I know very little about their BASIC.

    • @pauljs75
      @pauljs75 4 years ago +1

      @@8_Bit Revision C of Atari Basic (XE computers) is likely the best, since it seems to support the most shorthand stuff. (And it was worth learning when most programs had to be typed out from the back pages of the magazines.) The previous basic versions in XL computers had fewer abbreviated shortcuts, and supposedly have some bugs that were exploited in ways that on some rare instances needed workarounds in the XEs in order to work. In the overall scheme of things the different basic revisions are something like "99.9%" compatible though.
      And on the topic of Atari - if you ever get to it... The Pokey sound. Favors square waves, and one more voice channel than SID for a total of 4. Doesn't have the filters, but did some other things differently. Could also be used to make and play really low resolution audio samples if you're clever enough. (Examples are rare though, since RAM back then was at a premium. So it'd be like a one or two second clip.)

    • @E.T.S.
      @E.T.S. 2 years ago

      For your information, the Sinclair ZX81 had abbreviations as a default. I never owned one, but as a teen I wondered about some program listings in computer magazines, next to impossible to read.
      Try an emulator to see how it works. The keys are BASIC commands, and these were printed above and below the actual keys, indicating the use of the shift key and such. Pretty clever since it only had 1KB, but also pretty difficult to operate with a PC keyboard. The later Sinclair Spectrum variants had additional commands on the keys as well.

    • @cigmorfil4101
      @cigmorfil4101 1 year ago

      @E.T.S. By using keywords (direct token input) the ZX81 was also doing some syntax checking, and saving on having to have a tokenising routine.
      The RAM may have been 1K (1024 bytes) but the ROM was 4K or 8K.
      ----8

  • @AllanHjberg
    @AllanHjberg 2 years ago

    Hi Robin, great show, love your explanations :-).
    At 24:24 you explain the IMAIN vector. I don't really understand why the book states that it "runs" after the Ready prompt is printed but before the input. That seems wrong to me; it looks like it runs right after the input.
    Following the topic, could you perhaps explain how a cartridge wedges itself in (since the ROMs don't live in the $300 address range)?

  • @raymitchell9736
    @raymitchell9736 4 years ago

    It was so long ago when I wrote VIC-20 video games... I wish I could remember the details of the tricks you didn't cover in this video. I knew about fitting longer lines of BASIC by using the abbreviations... but I also learned a few other tricks. You can embed cursor controls inside REM statements: you type a single quote, then you can force cursor actions when the program is listed, such as clearing the screen, changing colors, or backspacing over the code to change the appearance of what is listed. Something even caused a "?Syntax error" during listing and halted the listing! Check out my game Muncher and you'll see the trickery I did... Also, if you look at the filename, it has a Control-C embedded in it, and for "copy protection" I check for that character and I would SYS64802... I also poke BASIC's top of memory where I store the graphics so BASIC won't corrupt them, but if you want to make a copy of the game you have to restore the pointers to TOP of "memory" -- but if you just copy the tape ... you've got it... DOH! Later on I learned how to shim BASIC in the C64 to add my own commands to BASIC and had a blast... I got into a lot of what you covered here in the video. Good stuff. Brings back a lot of memories, I'm glad you're posting this !!!

  • @garthmacleod
    @garthmacleod 5 years ago

    I'm really loving how YouTube is exposing me to Canadian Tech Tubers!

  • @RetroRelixRestorer
    @RetroRelixRestorer 5 years ago

    Enjoyed that. Interesting that at $A582 the X index isn't reset, so it must be initialised prior to calling - unless I missed something?

  • @MrJinXiao
    @MrJinXiao 5 years ago +1

    Super interesting! You left the lowpass filter off this episode :-( . That's OK, I'm just running my laptop's audio through a software filter. Still, it'd be nice... ;-)

    • @magnustveten492
      @magnustveten492 5 years ago

      MrJinXiao huh, so random when you get reminded you're getting older. :) I guess my ears have a low-pass filter built in now.

    • @8_Bit
      @8_Bit  5 years ago

      Sorry, I was sure I ran the lowpass filter but apparently not. Thanks for watching, I'm glad you've got a workaround there! :)

  • @andrewlankford9634
    @andrewlankford9634 4 years ago

    I thought the BASIC programs, when saved to disk, were "tokenized". Instead of sequential ASCII files, they would replace keywords like PRINT, GOSUB etc. with two character tokens.

  • @be236
    @be236 5 years ago

    Yeah, I remember these short cuts... the famous one is using ? for PRINT.. heh.

  • @gjermundification
    @gjermundification 5 years ago

    Are you sure that the lack of input and strange gosub is not due to the hardware offset bug? of xx00 is always the one xx below?

  • @AiOinc1
    @AiOinc1 4 years ago

    Can I get a link to that annotated disassembly of Microsoft's 6502 basic? That seems like something fun to look at.

    • @8_Bit
      @8_Bit  4 years ago

      I can't remember which one I was referring to, but there's a bunch of great links here at the awesome pagetable.com: www.pagetable.com/?p=793

  • @ChristmasEve777
    @ChristmasEve777 4 years ago

    BASIC could have been optimized so much more for speed. I almost wish I could go back in time and write some assembly code for the C64 that would revolutionize CBM BASIC. If I write it at this point, it would only be used by people like us (nostalgists and old school hobbyists). I know you were only discussing the translation of line input to the tokenized form (including abbreviations) but it got me thinking about how the actual lines of BASIC were interpreted. Consider this:
    FORI=1024TO2023:POKEI,32:NEXT
    ^^^ this line could EASILY run at the same speed as hand-coded 6502 assembly, turning the screen to all spaces in the blink of an eye. The only problem is you would have to keep a source version of the program in memory and a compiled version in memory at the same time. And memory was so much more limited then than it is now.

    • @BillAnt
      @BillAnt 1 year ago

      That's what BASIC compactor/cruncher programs like BLITZ! did, it took a BASIC listing and turned it into a pseudo machine code using their own interpreter instead of the built-in BASIC one. This was a way to speed up BASIC programs and also give it some snooping protection by making it un-listable.

  • @InsanePsychoRabbit
    @InsanePsychoRabbit 4 years ago

    In Z80 machine code, $00 is NOP, which I honestly think makes more sense than NOP being $EA

    • @8_Bit
      @8_Bit  4 years ago

      I can go either way on it. Having $00 as BRK is handy when debugging on real hardware, as you can point the BRK interrupt vector at your machine language monitor (aka debugger) and have a much better chance of finding out where and why your program went wrong. $00 as NOP does seem sensible though.

  • @dr.ignacioglez.9677
    @dr.ignacioglez.9677 2 years ago

    I LOVE C64 👍🥂🎩

  • @semibiotic
    @semibiotic 3 years ago

    Interesting. Does this also work in i8080/Z80 MS BASIC?

  • @insoft_uk
    @insoft_uk 4 years ago

    Kind of like typing in the token instead of the text: the text gets parsed into tokens, whereas if it's already a token it just gets skipped.
    I'll have to watch the whole video, nice side effect though.

  • @BillAnt
    @BillAnt 1 year ago

    How does BASIC know what the last line of the listing is? Each tokenized line terminates with a $00, so how does it know that the following leftover/random garbage is not another line? Is there a special last-line delimiter code, or a 16-bit line counter/pointer stored somewhere else?

    • @8_Bit
      @8_Bit  1 year ago +1

      Each line begins with a 16-bit pointer to the next line. If there is no next line, it points to $0000. So effectively there are three zero bytes in a row at the end of a BASIC program. The 0 terminator, then the two zeros of the final pointer (to nowhere).
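
The link-pointer layout described above can be sketched in a few lines of Python. This is only an illustration: the two-line program, the `walk_lines` helper and the `LOAD_ADDRESS` name are made up for the example, and the bytes assume the usual C64 start of BASIC at $0801.

```python
# Hypothetical two-line program, laid out as it would sit in memory from $0801:
#   10 PRINT "HI"
#   20 GOTO 10
LOAD_ADDRESS = 0x0801  # assumed start of BASIC text on a stock C64

PROGRAM = bytes([
    0x0C, 0x08,                          # link: next line starts at $080C
    0x0A, 0x00,                          # line number 10 (low byte, high byte)
    0x99, 0x20, 0x22, 0x48, 0x49, 0x22,  # PRINT token, space, "HI"
    0x00,                                # end-of-line terminator
    0x15, 0x08,                          # link: next line would start at $0815
    0x14, 0x00,                          # line number 20
    0x89, 0x20, 0x31, 0x30,              # GOTO token, space, "10"
    0x00,                                # end-of-line terminator
    0x00, 0x00,                          # final link of $0000: no more lines
])


def walk_lines(memory, start=LOAD_ADDRESS):
    """Follow the link pointers and return (line_number, tokenized_bytes) pairs."""
    lines = []
    address = start
    while True:
        offset = address - start
        link = memory[offset] | (memory[offset + 1] << 8)
        if link == 0:
            break  # a link of $0000 marks the end of the program
        number = memory[offset + 2] | (memory[offset + 3] << 8)
        end = memory.index(0, offset + 4)  # the line's $00 terminator
        lines.append((number, bytes(memory[offset + 4:end])))
        address = link
    return lines


for number, content in walk_lines(PROGRAM):
    print(number, content.hex(" "))
```

The last three bytes of `PROGRAM` are the three zeros mentioned above: the terminator of line 20 followed by the two-byte link to nowhere.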

    • @BillAnt
      @BillAnt 1 year ago

      @8_Bit - Thank you Robin, it makes perfect sense now. I had a hunch that there must be a series of terminator bytes at the end.

  • @alexsandrosschneidinger5215
    @alexsandrosschneidinger5215 2 years ago

    BASIC is good for program research

  • @abitobsessed6722
    @abitobsessed6722 5 years ago

    Top Notch

  • @christopherlawley1842
    @christopherlawley1842 4 years ago

    The OSI BASIC was fundamentally flawed. The garbage collection routine had the wrong length for floating point (I think) storage. So if it was called, the machine went into a permanent loop. :-(
    30 years later and I'm still annoyed.

  • @fuzzybad
    @fuzzybad 5 years ago

    I didn't know there was a C64 game Macbeth. Is that where someone got their handle from? :)

  • @manicsorceress2181
    @manicsorceress2181 5 years ago

    Thanks for another interesting video. I must admit, though, that this was a quite complicated one. Especially towards the end. Not so easy to understand (for my little understanding). Nonetheless, thumbs up.

    • @8_Bit
      @8_Bit  5 years ago

      Yes, I think this video tested the limits of my explanatory abilities, and quite possibly the limits of patience of my audience too :) Thanks for hanging in there.

  • @travisrecyclingja4675
    @travisrecyclingja4675 5 years ago

    Hello my friend!!! One question
    What is a transactor machine?

    • @8_Bit
      @8_Bit  5 years ago +1

      Transactor was a technical magazine about computers made by Commodore. One of Commodore's first computers was called the PET, which was supposedly an acronym for "Personal Electronic Transactor".

    • @travisrecyclingja4675
      @travisrecyclingja4675 5 years ago +1

      Thanks a lot for that information, really appreciate it... I saw that in the little research I did.
      But what I really want to know is: is there a machine called a "transactor machine"? I can't find it online.
      I saw Transactor magazine instead.

    • @travisrecyclingja4675
      @travisrecyclingja4675 5 years ago

      Is there a machine called a transactor machine? And thanks for your help.

  • @tenminutetokyo2643
    @tenminutetokyo2643 2 years ago

    I assume you've seen Dave Bradley The Commodore Man's channel.

    • @8_Bit
      @8_Bit  2 years ago

      Yes, and I've met Dave in person in Toronto before at TPUG events.

  • @retroelectrons2
    @retroelectrons2 5 years ago

    LUV!!!!! my dtv(nts) !!!!

  • @DonEdward
    @DonEdward 4 years ago

    I was about to comment on run being r u. But as you indicated that any letter can be shifted, it occurred to me... this is a bit like tab complete, isn't it? Just shift a letter and the C64 will presume a likely command and fill in the rest.
    I used to use these all the time to load games. I thought I needed the fast load cartridge to do them. Apparently they are stock.
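
The "tab complete" comparison can be made concrete with a small Python model. It is a simplified sketch, not the ROM routine: `expand` is a hypothetical helper, lowercase stands for unshifted letters and uppercase for a shifted one, and `KEYWORDS` holds only the first 26 entries of the BASIC V2 keyword table (tokens $80 to $99), assumed to be in ROM order. The first table entry that starts with the typed letters wins, which is why `pR` gives PRINT# rather than PRINT.

```python
# First 26 keywords of the CBM BASIC V2 table (tokens $80-$99), assumed ROM order.
KEYWORDS = [
    "END", "FOR", "NEXT", "DATA", "INPUT#", "INPUT", "DIM", "READ", "LET",
    "GOTO", "RUN", "IF", "RESTORE", "GOSUB", "RETURN", "REM", "STOP", "ON",
    "WAIT", "LOAD", "SAVE", "VERIFY", "DEF", "POKE", "PRINT#", "PRINT",
]
FIRST_TOKEN = 0x80  # token value of the first table entry


def expand(abbrev):
    """Return (keyword, token) for an abbreviation such as "pO", or None.

    Model: unshifted letters followed by exactly one shifted letter, matched
    against the table from the top; the first longer keyword that starts with
    those letters is taken.
    """
    if len(abbrev) < 2 or not abbrev[:-1].islower() or not abbrev[-1].isupper():
        return None  # this sketch only handles the "shift the last letter" form
    typed = abbrev.upper()
    for index, keyword in enumerate(KEYWORDS):
        # The keyword must be longer than what was typed; shifting the final
        # letter of a complete keyword is the odd case this model skips.
        if len(keyword) > len(typed) and keyword.startswith(typed):
            return keyword, FIRST_TOKEN + index
    return None


for sample in ("rU", "lO", "pR", "iN", "goS"):
    print(sample, expand(sample))
```

The model deliberately returns None for cases like `gosuB`, where the shifted letter is the keyword's last character; the real cruncher does something stranger there, as the video shows.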

  • @oldofftime
    @oldofftime 5 years ago

    My heart is melting. Thank you...

  • @Lion_McLionhead
    @Lion_McLionhead 4 years ago

    Brought back bad memories of C64 BASIC. Such a slow interpreted language with illegible abbreviations & line crunching made lions never like the modern love affair with interpreted languages.

  • @user-hx9gu5nh9p
    @user-hx9gu5nh9p 5 years ago

    I never understood why the C64 didn't let you enter a machine language monitor without the need for third-party software.

    • @8_Bit
      @8_Bit  5 years ago

      Yes, it's a bit of an oversight. Fortunately other options were available shortly, including Jim Butterfield's excellent (and free) SuperMON.

    • @csbruce
      @csbruce 5 years ago

      The C64 and VIC-20 didn't have a machine-language monitor because their 16K of ROM was completely full of other stuff.

    • @user-hx9gu5nh9p
      @user-hx9gu5nh9p 5 years ago

      Brutikus Red Okay, I get it, but if they had included one, or mentioned it somewhere as general info in the BASIC manual, there would probably have been about 10 times more assembly coders and software... At 9 years old I had no clue such external tools were necessary. I was reading the magazines I had access to and still it was mentioned nowhere. By the time I got to know about it I was already a PC user. The late 80s/90s were really vague about information!

  • @elbiggus
    @elbiggus 2 years ago

    The Atari 8-bit family has a much simpler (and intentional) abbreviation system - GR. expands to GRAPHICS, CO. expands to COLOR, etc. - but for some reason it has two for print; PR. expands to PRINT but oddly ? stays as ? even though it's functionally identical to PRINT. (As an aside, Atari BASIC is *awful* - while it's much easier to produce graphics and sound than Commodore's effort it's mind-bogglingly slow despite the hardware being slightly faster due to a number of what I can only describe as "pessimisations"; the way it handles FOR/NEXT and GOTO statements is one of the dumbest things I've ever seen, and I've seen a *lot* of dumb things!)

  • @ShesSometimesDoubleChocolate.
    @ShesSometimesDoubleChocolate. 5 years ago

    LOL, "parenthese"? Haha, what's that supposed to be? Maybe you meant "parenthesIS," which is the singular form of "parentheses." Right?
    Also, brackets are these: [], not these: ().

    • @darrenfoulds
      @darrenfoulds 5 years ago +2

      We were taught to call parentheses "left banana" and "right banana" :)

    • @ShesSometimesDoubleChocolate.
      @ShesSometimesDoubleChocolate. 5 years ago

      Hahaha, @darrenfoulds, yeah, right! I guess if you're just tots or something like that!

    • @JohnDlugosz
      @JohnDlugosz 5 years ago

      @darrenfoulds how about "wax" and "wane"?

    • @Pandamad
      @Pandamad 5 years ago

      I was taught in school that they are all forms of brackets, round brackets (or parentheses) Square brackets [ ] Curly brackets {braces} and angle brackets ⟨chevrons⟩

    • @JohnDlugosz
      @JohnDlugosz 5 years ago +1

      @Pandamad I think that is chiefly British. In order of operations, they say "Brackets, Multiplication..." which indicates that parentheses are called brackets in grade school.