Printing Binary in BASIC and Assembly on Commodore 64

8-Bit Show And Tell

Published 11 Sep 2024

COMMENTS • 151

@casaderobison2718 2 years ago ⁺²⁵
I feel like you could get the machine language faster by replacing SYS 828,A with SYS X,A after predefining X to 828. I think reinterpreting 828 every iteration is where you're losing speed.
@landsgevaer 2 years ago ⁺¹²
Variation on the last BASIC code that could be a decent bit faster than the other BASIC versions: precalculate the 16 4-bit nibbles in an array of strings, and print those in the loop using lookup on INT(A/16) and (A AND 15).
(Then you only precalc 16 short strings instead of 256 longer ones, cutting that time down from 20 sec to approx 1 sec I estimate, at the - hopefully limited - cost of two fairly simple calculations in the loop.)
@frankcatweazle3611 2 years ago ⁺⁵
Make a precalc for only 4 bits (a nibble), then take the top 4 bits (high nibble) and print them, and then the lower 4 bits (low nibble). The high nibble is int(val/16) and the low nibble is (val and 15).
@paulgroke3555 1 year ago
I tried exactly that - comes out at 15.82s. I didn't measure setup time since I hard-coded the array values - should be fast enough 😊 IMO the best compromise overall.
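The nibble-table idea suggested in the comments above can be sketched outside BASIC. This is only an illustration in Python, not anyone's actual program from the thread; the names `NIBBLES` and `byte_to_binary` are made up for the example.

```python
# Illustrative sketch of the nibble-table approach (names are hypothetical).
# Only 16 four-character strings are precalculated instead of 256 eight-character ones.
NIBBLES = [format(n, "04b") for n in range(16)]


def byte_to_binary(value: int) -> str:
    """Return the 8-character binary string for a value in 0..255."""
    high = value // 16  # INT(A/16) in the BASIC versions
    low = value & 15    # A AND 15
    return NIBBLES[high] + NIBBLES[low]


if __name__ == "__main__":
    for value in (0, 5, 170, 255):
        print(value, byte_to_binary(value))
```

The trade-off is the one the commenters describe: two small calculations per value in the loop, in exchange for a much smaller and faster precalculation.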
@mastertravelerseenitall298 2 years ago ⁺⁵
"I understand if you think that's cheating, but it's all PURE BASIC..."
-8bit Show & LEGEND!
@JohnnyWednesday 2 years ago ⁺⁶
I know I have no business suggesting this - but I would love, loooooove - to see you work on a line drawing routine? nothing fancy, just a Bresenham line or something - but it would be a nice 'simple' concept with which to explore the memory layout for things like pixel access
@markboulton954 2 years ago ⁺⁵
I think the screen scrolling is the thing that slows it down the most. If the machine language was writing directly to screen memory rather than sending characters as a screen print, it would be so much faster. When reaching the bottom line just going back to the top line.
@jjeeeekk 2 years ago ⁺⁵
Thank you Robin, for the patient and detailed description of all the various approaches. I like and enjoy this style of presentation very much.
@StevenIngram 2 years ago ⁺⁶
When I was trying to get gains related to checking bit states on the Color Computer 2, I managed to eke out a little bit more using DEF FN. I don't know how DEF FN works in Commodore BASIC, but on the CoCo, you only have to define a function once, at which point it's pre-tokenized and never has to be parsed again during the execution of the code. So calling the (preparsed) function is faster than parsing the equations every iteration of the loop. At least in CoCo BASIC. :)
@lasskinn474 2 years ago ⁺¹
That's pretty neat
@StevenIngram 2 years ago ⁺²
@@lasskinn474 Yeah, the real trick is figuring out how to utilize (or work around) a user-defined function that can only be passed 1 value. My solution used linked lists: multiple arrays in which the values I want to operate on share the same index. So then by passing the index value to the function, it references multiple variables (locations in different arrays) in the calculation despite only receiving a single value. ;) I thought I was so clever when I worked that out back in the day. LOL And all for marginal gains, because BASIC is so darned slow. HAHAHAHA
@markjreed 2 years ago ⁺³
@@StevenIngram At least in Commodore BASIC, the parameter to a DEF FN is dynamically scoped. So if you have FNA(X) that calls FNB(Y) then inside B, X will refer to the value passed into A.
DEF FN doesn't seem to save time on Commodore, though. I defined a complex function and called it 1000 times and compared it to doing the same calculation directly in the body of the loop. The function version took 1m44s and the direct version took 1m41s, so if anything the function call/return overhead slowed things down.
Which seems to fit with the overall philosophy of the C= BASIC interpreter: it does hardly any advance tokenization. Keywords get converted, and that's it. Not even decimal numbers get parsed; they are just left as sequences of ASCII digits, so I'm not surprised that the function isn't precompiled in any meaningful way.
@StevenIngram 2 years ago
@@markjreed Interesting. Also, I'll be first to admit that my memory of things may be faulty. I'm getting old. :D
@markjreed 2 years ago
@@StevenIngram I didn't contradict anything you said; I would not be at all surprised if DEF FN actually saved time on the CoCo. The fact that the post-Level-I TRS-80 BASICs were also made by Microsoft doesn't mean they had the same quirks as the Commodore versions. :)
@talideon 2 years ago ⁺³
6:40: I'd probably load an index register with 8, AND the accumulator with 128, printing "1" if it's nonzero and "0" otherwise, then ROL, CLC, decrement the index register, and loop until the index register is zero.
It'll be interesting to see how close to optimal that is.
@markjreed 2 years ago ⁺²
No need to AND, just use the carry out from the shift:
sta temp
ldx #$08
loop:
asl temp
lda #$30 ; start with '0'
adc #$00 ; this adds 1 to make it '1' if the bit shifted out by the ASL was a 1
jsr $ffd2
dex
bne loop
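For readers who don't read 6502 assembly, the shift-and-carry loop above can be modelled roughly as follows. This is a Python sketch of the same idea, not a cycle-accurate emulation; the function name is hypothetical, and the carry flag is simply represented by a local variable.

```python
def bits_by_shifting(value: int) -> str:
    """Model of the loop above: shift left 8 times, emitting '0' + carry each time."""
    temp = value & 0xFF                   # sta temp
    digits = []
    for _ in range(8):                    # ldx #$08 ... dex / bne loop
        carry = (temp >> 7) & 1           # the bit that ASL shifts out lands in carry
        temp = (temp << 1) & 0xFF         # asl temp
        digits.append(chr(0x30 + carry))  # lda #$30 / adc #$00 gives '0' or '1'
        # jsr $ffd2 would print the character at this point
    return "".join(digits)


if __name__ == "__main__":
    print(bits_by_shifting(0xA5))
```

Because the most significant bit is shifted out first, the digits come out in the usual left-to-right order without any masking.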
@code_explorations 2 years ago ⁺⁴
This was great. I’m going to show my students in time, who have learned about binary and BASIC. Now I want them to learn different algorithmic approaches and see some assembly language.
I’d love to see a follow-up video showing some of the approaches mentioned by other commenters.
@quantass 2 years ago ⁺¹
I REALLY HOPE Robin isn't losing interest in posting new content. A Great channel with Great Content shall not die, damn-it !
@michielboland628 2 years ago ⁺⁴
Maybe replace "sys 828, a" with "sys p, a" where p is initialized to 828 before the loop.
@asosa9502 2 months ago ⁺¹
As others have mentioned, the best way in BASIC is to pre-calculate all 4-bit strings. I tested it and got 1.05 seconds of pre-calculation time and 16.25 seconds of normal run time. Slightly slower run time than the precalculated 8-bit strings, but 20x faster pre-calculation time and 16x less memory usage.
@RemnantCult 2 years ago ⁺⁴
Love the videos! I'm getting into the C64 and other classic computers as a hobby and these videos are informative and exciting.
@redlinechaser7942 2 years ago
Me too! Wanted one as a kid in the 80s but my parents wouldn’t buy one. Got an Atari instead. I didn’t know what I was missing. So I’m learning too now, and documenting the process. Are you going to get a real Commodore 64 too?
@3vi1J 2 years ago ⁺²
Your BASIC5 example points out something I had completely forgotten about C=64 BASIC: as variables go, I != I(0). I guess I've spent too much time in C since my youth, but I totally expected it to have a bug when you ran it due to reuse of the letter I for the variable and the array (I expected the I variable to be stomping on the value in I(0)).
@LordRenegrade 2 years ago
Yeah, when I got back into C64 stuff, the "different types of variables have different namespaces" thing was weird. I, I%, I$, I() are all unique! Of course it kinda makes sense when you consider that they can only be two actual characters - ABCDEF$ and ABC$ are both actually just AB$. But not AB, AB% or AB(0).
(also note that the different types of arrays are also unique - I%(), I(), and I$() are also unique from each other)
@davidmcgill1000 2 years ago ⁺³
Would've figured you'd optimize out the expensive operations.
Like an exponent would only be needed for random access of a bit, but you're using them sequentially. Can add to build the lookup, or a more expensive divide and use them directly.
@elbiggus 2 years ago ⁺⁵
13:27 Not sure that excluding precalculation from your timing is entirely fair; by that metric creating an array B$() that contains all 256 answers then doing *FOR X = 0 TO 255:PRINT X, B$(X):NEXT* would be a winner even if you used your 96 second method to generate it.
@henrikherranen2610 2 years ago ⁺²
To me the thing to do would be to precalc a nibble table, i.e. DIM B$(16), then print each 8-bit value as two nibbles. Precalc time would be about 1.2 seconds, and print time would only be slightly slower than the fastest 13.2 seconds, perhaps 14-17 seconds. The total execution time would be less than 20 seconds, which I think would be hard to beat with Basic.
@henrikherranen2610 2 years ago ⁺²
Actually, I tried the nibble method. The printing goes now like this:
140 PRINT A,
150 PRINT B$(A/16);
160 PRINT B$(A AND 15)
End result:
PRECALC: .783333333
TIME: 16.3166667
So the total is around 18.1 seconds.
Edit: I see csbruce already did an even faster version of the nibble method. Kudos!
@csbruce 2 years ago ⁺¹
@@henrikherranen2610: The pre-calc can be completely unrolled since there are only 16 values, and multiplication is faster than division (even on the newest CPUs).
@henrikherranen2610 2 years ago ⁺¹
@@csbruce You're absolutely right. I saw your elegant implementation and decided there was nothing I could do better, so I just left my implementation at that point. In hindsight, I should have multiplied by a variable containing 1/16, as you did, instead of dividing by the constant 16. That would have made my routine faster, as I would have saved both the parsing of the constant inside the loop and, of course, the slow division.
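The reason multiplying by 1/16 is a safe substitute for dividing by 16 is that 1/16 is a power of two and therefore exactly representable in binary floating point, so no rounding error creeps in. A small Python check of that claim follows; it uses Python's floats, and I am assuming the same reasoning carries over to Commodore BASIC's binary floating-point format. The function names are hypothetical.

```python
SIXTEENTH = 1 / 16  # exactly representable in binary floating point


def high_nibble_by_multiply(a: int) -> int:
    """Truncate a * (1/16), mirroring an array index like H$(A*S)."""
    return int(a * SIXTEENTH)


def high_nibble_by_divide(a: int) -> int:
    """Truncate a / 16, mirroring an array index like B$(A/16)."""
    return int(a / 16)


if __name__ == "__main__":
    mismatches = [a for a in range(256)
                  if high_nibble_by_multiply(a) != high_nibble_by_divide(a)]
    print("mismatches:", mismatches)
```

Both forms select the same table entry for every byte value, so the choice between them is purely about speed.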
@redlinechaser7942 2 years ago ⁺¹
Your vids and comments are always a wealth of information! Best!
@TrollingAround 2 years ago ⁺¹
For a bit of a speed bump, use "JSR $E716" to output a char to the screen rather than "JSR $FFD2".
This will bypass/avoid the indirect JMP at $FFD2:
FFD2: 6C 26 03 JMP ($0326)
$0326 is a vector which (by default) points to $F1CA
Code at $F1CA checks to see if we are outputting to the default device:
F1CA: 48 PHA
F1CB: A5 9A LDA $9A ; Default Output Device (3)
F1CD: C9 03 CMP #$03
F1CF: D0 04 BNE $F1D5
F1D1: 68 PLA
F1D2: 4C 16 E7 JMP $E716 ; Output to Screen
So we can avoid that by going straight to $E716.
You could of course bypass the output device checking by changing the vector at $0326 to point to $E716:
Poke 806,22
Poke 807,231
Then all basic PRINT statements as well as all machine code JSR $FFD2 will jump via the vector to $E716 (output to screen) without the checks to see if we should be outputting to a different device (which of course removes the ability to PRINT to other output devices), you would have to restore the vector before doing any of that.
But this still uses an indirect JuMP - which is 5 cycles.
JSR $E716 will save you 5+3+3+2+2+4+3=22 cycles per print character call (over JSR $FFD2) - not massive but something!
To be honest, if I needed a decimal-to-binary string routine for BASIC using SYS, I would store the input parameters in memory (directly at the addresses where your routine stores them). Poking 2 or 3 bytes is much faster than (though not as convenient as) evaluating parameters. I would also store the result in a string variable, which can then be printed or used however needed.
Since the object here is to write a fast decimal to binary print routine it would be winning to write a faster 'string to screen' routine.
Sorry for the long post; TL;DR: for a bit of a speed bump, use "JSR $E716" to output a char to the screen rather than "JSR $FFD2".
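The arithmetic in the comment above can be checked with a few lines of Python. The pairing of each cycle count with an instruction is my reading of the sum 5+3+3+2+2+4+3 using standard 6502 timings (branch not taken), so treat it as an assumption. The last function is a rough estimate that counts only the eight binary digits on each of 256 lines, not the decimal number or the newline.

```python
# Assumed mapping of the comment's cycle counts onto the skipped instructions.
SKIPPED_INSTRUCTIONS = [
    ("JMP ($0326)", 5),
    ("PHA", 3),
    ("LDA $9A", 3),
    ("CMP #$03", 2),
    ("BNE (not taken)", 2),
    ("PLA", 4),
    ("JMP $E716", 3),
]
CYCLES_SAVED_PER_CHAR = sum(cycles for _, cycles in SKIPPED_INSTRUCTIONS)


def vector_bytes(address: int) -> tuple[int, int]:
    """Split a 16-bit address into (low byte, high byte) for two POKEs."""
    return address & 0xFF, address >> 8


def digits_only_saving(lines: int = 256, digits_per_line: int = 8) -> int:
    """Cycles saved if only the binary digits go through the shorter path."""
    return lines * digits_per_line * CYCLES_SAVED_PER_CHAR


if __name__ == "__main__":
    print("cycles saved per character:", CYCLES_SAVED_PER_CHAR)
    print("POKE 806 and 807 with:", vector_bytes(0xE716))
    print("cycles saved over 256 lines of 8 digits:", digits_only_saving())
```

At roughly one million cycles per second, that estimate comes to a few hundredths of a second over the whole run, which fits the commenter's "not massive but something".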
@jjeeeekk 2 years ago ⁺³
I'm usually using this approach (0.2 sec pre-calc time and 23.8 sec run-time - in summary it's even faster than the full table driven or goto approach):
5 ti$="000000"
10 dimb$(128)
20 b7=128:b6=64:b5=32:b4=16:b3=8:b2=4:b1=2:b0=1
30 data128,64,32,16,8,4,2,1
40 for b= 0 to 7:reada:b$(a)=chr$(49):next
50 b$(0)=chr$(48)
90 pt=ti
100 :
110 print chr$(147) chr$(5);
120 ti$="000000"
130 for x = 0 to 255
140 a=x
150 print a,
160 printb$(aandb7)b$(aandb6)b$(aandb5)b$(aandb4)b$(aandb3)b$(aandb2)b$(aandb1)b$(aandb0)
910 next
920 print "time:"ti/60
930 print "precalc:"pt/60
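The trick in the program above is that the result of the AND is used directly as an array index: `a AND 128` is either 128 or 0, so a sparse table with "1" stored at indexes 128, 64, ..., 1 and "0" at index 0 avoids any comparison. A rough Python equivalent, with hypothetical names, looks like this:

```python
MASKS = (128, 64, 32, 16, 8, 4, 2, 1)

# Sparse lookup table, like DIM B$(128): only indexes 0 and the eight masks are used.
TABLE = [""] * 129
TABLE[0] = "0"
for mask in MASKS:
    TABLE[mask] = "1"


def to_binary(a: int) -> str:
    """Index the table with (a AND mask), which is either 0 or the mask itself."""
    return "".join(TABLE[a & mask] for mask in MASKS)


if __name__ == "__main__":
    for a in (0, 1, 128, 170):
        print(a, to_binary(a))
```

Most of the 129 slots are never read; the table is oversized so that the masked value itself can serve as the index.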
@csbruce 2 years ago
Single-letter variable names are faster to parse than two-character names.
@jjeeeekk 2 years ago ⁺¹
@@csbruce Yes, indeed, if focusing on speed only. But for typical use in a subroutine it might be not a good idea to glob 8 single character names (out of 26) with constants which are just needed for this routine. 😉
@TheSimTetuChannel 2 years ago ⁺¹
That was a great exercise to illustrate the tradeoff between execution speed and code size!
@HeyBirt 2 years ago ⁺¹
On the ML version you can use a table of nibble wide strings for the binary representation of the nibble. This would only take 16*4 bytes (or I guess 16x5 bytes if you delimited the string so you could print it directly). This would also make it easy to go from hex to binary.
I can remember doing an any base to any base conversion program on the C64 in BASIC back in the day and I'm sure it was very slow..
@WY.C64-Guy 2 years ago
FWIW - I did my own variation on Example "BASIC10", and put in a slight advantage by blanking the VIC II. That yielded an improvement of about 1.5 seconds in the precalc stage on my NTSC machine. I also agree as others have said that the screen scrolling was chewing up time, so I wanted to see how much keeping the line at the home position (top of screen) would improve things... Improved by about 8 seconds on my machine! (Display time measured at 5.233 seconds.)
@The.Doctor.Venkman 2 years ago
A bit of a shock that including the SYS routine was longer overall! Thanks, Robin - Nice video!
@TheTRONProgram 8 months ago
Interesting exercise! I got 26.12 seconds printing strings with this routine compared to 36 seconds printing numbers:
20 for x=0to255:printx;
25 a=x
30 if a>127 then a=a-128:print "1";:goto40
35 print"0";
40 if a>63 then a=a-64:print"1";:goto60
45 print"0";
60 if a>31 then a=a-32:print"1"; :goto70
65 print"0";
70 if a>15 then a=a-16:print"1";:goto72
71 print"0";
72 if a>7 then a=a-8 :print "1";:goto74
73 print"0";
74 if a>3 then a=a-4 :print "1";:goto76
75 print"0";
76 if a>1 then a=a-2 :print"1" ;:goto78
77 print"0";
78 if a>0 then print"1";:goto80
79 print"0";
80 rem
200 print
210 next x
@comchia4306 2 years ago
That first program in the video reminds me of a sample input program in the book Assembly Lines, which prints characters with the Apple II paddles. I guess it's a good style of test program, programmer and I/O device alike.
@Mr76Pontiac 2 years ago ⁺³
The only other basic (no pun) update I would have done is instead of having the computer update the entire screen for each line, "HOME" the cursor at each print statement. Clear the screen once at the start, then at each print, just HOME, then print the result. There is a LOT of time spent on redrawing the screen.
The other thing I might have tried is just do appending to a string and do one print of that string. So instead of calling the print many times during:
ti$="000000"
print "{clrscr}"
for L=0 to 255
S$=""
for B=7 to 0 step -1
{do whatever steps to figure 0 or 1}
S$=S$+{result} [[[AFTER POST EDIT]]]]
next B
print "{home}"L,S$
NEXT L
print "Time:" TI/60
I personally think your current process for timing these routines is a bit (no, not a pun) invalid, since your times include how long it takes to draw the screen, not just how long it takes to calculate the bits. Printing one entry at a time does make for a straightforward comparison between ALL the revisions, but I think the pure ASM version would destroy every other version if the cursor were HOMEd instead of new lines being printed.
Too busy at work to kick up VICE to play (Or go over and type it in on my 128 in 64 mode) so what I've got here is just theory.
@markjreed 2 years ago
I had the same thought; scrolling the screen wastes a lot of time. Maybe you want to see the surrounding context of the joystick values as they change, so there's value there in the end result, but keeping the display on a single line is a lot faster. I wrote a simple one-byte version of the print-byte-in-binary ML routine; a BASIC FOR loop SYSing to it as in the video took about 14 seconds, but adding a CHR$(19) after each SYS chopped the runtime down to under 5 seconds.
@OscarSommerbo 2 years ago
I can see myself at age 11 getting to Basic5 without much effort and calling it good. I did lack books and mentors to get into assembly on the 64. I kind of decoded those DATA/POKE programs found in magazines, but I lacked the knowledge to make big changes, that worked.
@Kris_M 2 years ago ⁺³
Faster! Faster!
@PSL1969 2 years ago
Hey Robin, Great episode! I thought to myself "Here we go again" with the assembly optimization, which is always fun, but I was surprised that Basic won 😎👍
BTW: I was making a note writer prg, with encryption, and I came up with an insane way to xor all bits in two bytes together without using any registers, I was hoping it could be used for something else, when I saw this video, but I don't think so 😂
@bierundkippen720 2 years ago
A small tip for you: You can explain things much better by supplying a small example which shows how the particular stuff works.
@TimStCroix 2 years ago ⁺³
There's an optimization to your BASIC programs that you didn't use, assigning your constants to variables.
Commodore BASIC, being a floating point language, converts all constants to FP each and every time it comes across one. So assigning constants to variables outside of loops will speed things up.
Now, with variables BASIC has to search through the list to find the right one but in my experiments I discovered that with one digit constants it's faster if the variable for that constant is one of the first 12 assigned. For 2 or more digits the conversion to FP is very slow so even if it's at the end of a long variable list it should still be faster. For real-world programs the variable list can be kept shorter by reassigning variables (carefully) when needed.
Obviously, debugging such a real-world program would be a nightmare, so procedures should be developed to make it easier.
Here's a demonstration program, tested in VICE. BTW, I just learned you can copy/paste from your text editor into VICE... neat. You have to use the menu bar 'Edit' 'Paste' option, you can't just paste into the window. It works copying from here, too.
10 a=1 : b=2 : c=3 : d=4 : e=5 : f=6 : g=7 : h=8 : j=9 : k=10 : l = 11 : m = 12
15 n = 13 : z=0
20 ? : ? "one digit constant took";
30 t0 = ti
40 for i = 1 to 1000
50 z=1
60 next
70 ? ti - t0 "jiffies" : ? : ? "two digits took";
80 t0 = ti
90 for i = 1 to 1000
100 z=11
110 next
112 ? ti - t0 "jiffies" : ? : ? "three digits took";
114 t0 = ti
116 for i = 1 to 1000
118 z=111
119 next
120 ? ti - t0 "jiffies" : ? : ? "first variable took";
130 t0 = ti
140 for i = 1 to 1000
150 z=a
160 next
170 ? ti - t0 "jiffies" : ? : ? "twelfth variable took";
180 t0 = ti
190 for i = 1 to 1000
200 z=m
210 next
220 ? ti - t0 "jiffies" : ? : ? "thirteenth variable took";
230 t0 = ti
240 for i = 1 to 1000
250 z=n
260 next
270 ? ti - t0 "jiffies"
Edited 8 hours later: Assigning t0 to ti before the print in the one digit timing test skewed the result. Corrected.
Also added timing for three digits. Benefits of numbering by 10 demonstrated.
And further testing shows that BASIC converts the digit 0 faster than 1 in two and three digit constants: 10 vs. 11 and 100 vs. 111. Weird. I wonder if other digits will change the result, too. But I think I've bothered everyone enough about this weirdness.
Great channel.
@stalinvlad 2 years ago
Good tip about the constants, my 19 seconds became 18.166
@maxusboostus 2 years ago ⁺²
The ROM screen scroll-up routine is very slow. Do an unrolled scroll routine over 1024-2023; I think that would improve the speed.
@BG101UK 2 years ago ⁺¹
I'd crunch that BASIC program and reduce line numbers as much as possible (preferably single digits). I found the latter especially makes a difference, particularly where FOR/NEXT loops are involved. I wrote a simple joystick/digital mouse controller for a few of my attempts at GUI programs and it was extremely sluggish when at its original high line numbers (tacked on the end of the BASIC programs).
Obviously that ran much better when I converted it into assembly .. and assembly was the only way to write a proportional mouse driver as I then ran these under interrupt.
@LordRenegrade 2 years ago
NB: FOR/NEXT doesn't care about line numbers (the NEXT returns to a real address that's pushed to the stack), it's only GOTO (or "THEN ") that cares. And the length of the line number would only matter a little bit to GOTO; the position it exists at in the program matters much more as the lines are stored in a singly linked list .... and BASIC seeks through a huge portion of the list when finding the target of a GOTO (it has a shortcut to determine whether to search from the start of the program or from the current line, but it's still slow).
Also be careful when benchmarking as the order of variable declaration also matters (they're stored in an array and searched sequentially every time there's a variable reference). Minor changes in variable order could result in significant speed gains or loss.
MS-BASIC (which V2 BASIC is a variant of) is just awful for performance, especially in that era. BBC BASIC is much faster!
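To make the GOTO cost described above concrete, here is a simplified Python model of a line search over a program stored in line-number order. It is a sketch only: the rule used for the shortcut (search forward from the following line when the target is greater than the current line number, otherwise start from the first line) is my assumption about how the shortcut behaves, and the function and variable names are hypothetical.

```python
def goto_cost(lines: list[int], current: int, target: int) -> int:
    """Count how many program lines are inspected to find `target`.

    `lines` is the program's line numbers in stored (ascending) order.
    Assumed shortcut: forward jumps start at the line after `current`,
    everything else starts from the top of the program.
    """
    start = lines.index(current) + 1 if target > current else 0
    for inspected, number in enumerate(lines[start:], start=1):
        if number == target:
            return inspected
    raise LookupError("undefined line number")


if __name__ == "__main__":
    program = list(range(10, 1010, 10))    # 100 lines: 10, 20, ..., 1000
    print(goto_cost(program, 980, 990))    # short forward jump
    print(goto_cost(program, 990, 980))    # short backward jump near the end
    print(goto_cost(program, 1000, 10))    # backward jump to the first line
```

In this model a short backward jump near the end of a long program is the expensive case, because the search restarts from the first line; that is consistent with the advice to keep frequently targeted lines near the top.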
@HAGSLAB 2 years ago
Great in-depth video as always!
@raymitchell9736 2 years ago ⁺³
Well... I think you had at least 3 or 4 more levels to explore. You can save some interpreting time in the precompute by stepping the bits downward, so your print loop doesn't need the "step -1". Or better yet, you can use the fact that a AND B(b) returns either the bit's value or 0, so ((a AND B(b))=0) returns -1 or 0. Since CHR$(49) is an ASCII (PETSCII) '1', PRINT CHR$(49+((A AND B(b))=0)) will produce an ASCII '1' or '0'. My hunch is that it should give you a speed-up without the IF statements/GOTO. Finally, I wonder about using integer variables such as B% so you're not calling floating-point routines, except that integer variables cannot be used as the index in a FOR loop.
To save on precompute memory/time you could potentially do this all in one line, which saves the interpreter from needing to process multiple lines:
10 for a = 0 to 255
20 b=256 : for i = 0 to 7 : b=b/2 : print chr$( 49 + ((a and b)=0) ); :next: print
30 next
Tested with a C64 emulator: it took 42 seconds, but maybe you can test on your computer to see if it's any faster. Maybe these ideas can speed up your BASIC programs. As for precomputing every string: on a VIC-20 that would probably eat up your memory. Yes, it's a size-vs-speed tradeoff, but I don't think it's an acceptable one for this kind of computer. We are so spoiled with gigabytes of memory that we sometimes forget how tiny 38K, or 3.5K on a VIC-20, really is. I know you know, and your goal was to find the fastest time at any cost, but then there's what is practical.
@markjreed 2 years ago ⁺¹
Integer variables in Commodore BASIC don't help in general because the interpreter converts them to floating point anyway whenever doing any arithmetic on them. Their presence is basically a leftover; actual support for integer operations was dropped from the shipped ROMs to save space.
@raymitchell9736 2 years ago
@@markjreed Okay, that's good to know... and now that you mention it I sort of recall that! But I believe integers help in variable storage, e.g. in an array? It's been a long time since I've poked around in the guts of Commodore BASIC... I feel so out of touch with this... I used to write my own custom system wedges (just for fun). I had a BASIC extension for the C64 back in the 1980s where I added some graphics and disk commands... now I have little memory of it. The insights I needed to do that came from a couple of great books from Abacus that did deep dives into the BASIC ROMs with annotation and examples... it was so much fun!
@markjreed 2 years ago ⁺¹
@@raymitchell9736 Yes. Each scalar variable takes up the same amount of space in the variable table no matter what type it is; numeric values fit in the table itself while strings have pointers. But the array table is nothing but names and pointers while the actual values are elsewhere and packed as closely as they can be. So an array of integers only takes two bytes per entry instead of the 5 (IIRC) taken by a float.
@Reboot_TV 2 years ago ⁺²
I was wondering as the basic program with precalculation prints strings containing all 8 characters at once, would it be faster to create a string with all characters in machine language and then use a single print instead of printing each character individually? Or does printing a string just iterate over the characters and print them one by one anyway?
@brianharris2114 1 year ago
The reason you're getting glitchy data is the pulse width. I suspect you would get a much better response if you overprinted the data to prevent the scrolling, which is where most of the pulses are being missed.
@xbzq 1 year ago ⁺¹
A faster BASIC approach is to first precalc nibbles, that is, a table with 16 entries, then precalc the full bytes from that. That's going to be much faster than precalculating bit by bit for all 256 bytes.
@Tricob1974 2 years ago ⁺¹
Seeing as CHR$(48) is the Zero character, and CHR$(49) is the 1 character, you can exploit it a bit. In Line 10, I did: A=128: FOR B=1 TO 8: C(B)=A: A=A/2: NEXT. I blanked TI$ in Line 20. In line 30, I did: FOR A=0 TO 255: PRINT A,. In line 40, I did: FOR B=1 TO 8: PRINT CHR$(49+((A AND C(B))=0));. Lastly, in Line 50, I did: NEXT: PRINT: NEXT. That's probably as fast as you can get without the SYS command. It also takes less memory than the other non-SYS versions.
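The `CHR$(49+((A AND C(B))=0))` trick relies on Commodore BASIC comparisons evaluating to -1 when true and 0 when false. Python's booleans behave differently (True is 1), so the sketch below mimics the BASIC convention explicitly; the helper names are hypothetical.

```python
def cbm_equals(left: int, right: int) -> int:
    """Mimic a Commodore BASIC comparison: -1 for true, 0 for false."""
    return -1 if left == right else 0


def bit_char(value: int, mask: int) -> str:
    """CHR$(49 + ((A AND mask) = 0)): 49 is '1'; a clear bit subtracts 1, giving '0'."""
    return chr(49 + cbm_equals(value & mask, 0))


def to_binary(value: int) -> str:
    masks = [128 >> i for i in range(8)]  # 128, 64, ..., 1, like the C() array
    return "".join(bit_char(value, mask) for mask in masks)


if __name__ == "__main__":
    print(to_binary(5))
```

When the tested bit is set, the comparison with zero is false (0) and the character stays at code 49; when the bit is clear, the comparison is true (-1) and the code drops to 48.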
@radman999 2 years ago ⁺¹
Would love a new podcast episode!
@RandomBitzzz 2 years ago
Great video! Really interesting.
For your basic 10, personally I would call it a lookup rather than a conversion.
Did you play with the idea of storing a pre-calc table in an seq file and loading it into an array, then doing a lookup? I would think it would be such a small file that it wouldn't eat much time.
This reminds me of stories I've heard of old micro programmers storing pre-calculated info and doing lookups rather than spending CPU time on computations.
@PeterBrodersen 2 years ago
As several others suggested, I would probably have pre-calculated a nibble instead of a byte and then used two operations (integer division and binary AND) to get the first and the second nibble. Of course it depends on how often the final routine is used. It's all a balance between pre-calculation and usage. But cutting the pre-calculation down to 1/16 of the possible values seems like a nice balance. I do guess a lot of time is spent on scrolling all of the text on the screen when the bottom line is reached. This might be disproportionate to the other time spent.
By the way, my nephew talked about a mouse he got for Christmas where he can swipe-click (a sort of automated clicking for a short period of time; very useful in a bunch of games he plays). I mentioned how it wasn't that different as to about 30 years ago when I got a joystick with autofire for Christmas, and he simply asked: What's a joystick?
Ouch. I feel old.
@markjreed 2 years ago
Conditional NEXTs are tricky and asking for trouble; I would always have just a single `NEXT` per loop and replace IF...THEN ...:NEXT with IF...THEN...:GOTO .. Which also saves a statement over your `...:NEXT:GOTO`.
@andrewgillham1907 2 years ago
Great videos, always enjoy them. One question I have is about the use of 828. I know that is the tape buffer, but the book I have says 820-827 (8 bytes before the tape buffer) and 1020-1023 (4 bytes after the tape buffer) are unused. Why not start at 820 and have 204 bytes available? Is 820-827 actually used somewhere? Or is my book out of date? (Mastering Machine Code on Your Commodore 64 by Mark Greenshields p. 213)
@kespeth2 2 years ago ⁺¹
did you try using integer index variables instead of floating point (the default) ones? I'll bet that'll save more time, although adding all those %'s would be annoying...
@LordRenegrade 2 years ago ⁺²
Unfortunately, early MSBASICs (like that found on the C64/128/Vic/PET) didn't have real integer support. They just convert the integers to floats and carry out the calculations using the fp routines, so they're actually slower than float-float operations. The only real purpose for integers in those BASICs is to store data compactly: they're much smaller than the floats in memory.
@csbruce 2 years ago ⁺¹²
1:49 What's the name of the font you use for subtitles in your videos?
3:32 Since BASIC includes the critical operation to support this, the obvious way to fill in the program (at this point in the video) is:
10 A=0:Z=48:J=128:K=64:L=32:M=16:N=8:O=4:P=2:Q=1
200 PRINTCHR$(Z-((AANDJ)>.));
210 PRINTCHR$(Z-((AANDK)>.));
220 PRINTCHR$(Z-((AANDL)>.));
230 PRINTCHR$(Z-((AANDM)>.));
240 PRINTCHR$(Z-((AANDN)>.));
250 PRINTCHR$(Z-((AANDO)>.));
260 PRINTCHR$(Z-((AANDP)>.));
270 PRINTCHR$(Z-((AANDQ)>.))
Run time: 27.40 seconds.
Since the method is non-destructive, you can also apply it directly to the index variable, reducing the run time to 26.98 seconds. You could also stuff three binary digits into each PRINT statement.
8:00 Using «PRINTRIGHT$(STR$((AANDJ)>.),1);» also makes my program slower, I guess because there's two string functions and STR$() might use division by 10.
9:14 OMG - you're using exponentiation in optimized code! That's even slower than division. I'll use my first emoji: 😱 !
12:59 Given that you're running the loop lots of times, I'd say it's more reasonable to include the pre-calculation time.
14:02 You could also unroll the loop and use constants.
15:41 Here's my "pre-calculation" version (pre-calc included in run time). Instead of processing one bit at a time, it does four bits at a time (hexits):
121 DIM A,X,H$(15):F=15:S=1/16
122 H$(0)="0000":H$(1)="0001":H$(2)="0010":H$(3)="0011"
123 H$(4)="0100":H$(5)="0101":H$(6)="0110":H$(7)="0111"
124 H$(8)="1000":H$(9)="1001":H$(10)="1010":H$(11)="1011"
125 H$(12)="1100":H$(13)="1101":H$(14)="1110":H$(15)="1111"
200 PRINTH$(A*S);H$(AANDF)
Run time: 14.95 seconds!
19:32 Perhaps they'd be called "Pessimizations".
21:40 The first four statements are useful only 6% of the time and are dead weight 94% of the time.
23:00 I think you're obligated to include the pre-calculation time this time, considering that the pre-calculation is a complete run of the main loop!
24:27 Most of the work of the second part of the method is scrolling the screen, which is included in all the other methods. This makes the first part seem faster.
34:23 It would be easy enough to assign 828 to a variable.
37:08 How about an AI joystick that just plays the whole game for you?
37:52 When your songs go "Maximum Reverb", it might be helpful to have subtitles.
@8_Bit 2 years ago ⁺¹⁰
I use the Microgramma font most of the time, which was a favourite of both Commodore and many sci-fi films, beginning around the release of 2001: A Space Odyssey.
Thanks for all the great observations. I could pretty much remake the video following all your ideas. But... I probably won't :)
@redlinechaser7942 2 years ago ⁺²
You should make videos too.
@MattKasdorf 2 years ago ⁺³
hexits? I don't get it... we're printing out binary digits... so four at a time would be... nybblits?
{re-rereading} Oh, I think I get it now.
@elbiggus 2 years ago ⁺²
(Rats, should've watched the rest of it first; it's in there already. I got a slightly slower time, but maybe that's a PAL machine thing?)
If we're excluding precalculations from the time then the "fastest" way is to go
10 DIM B$(255)
20 REM FILL B$ WITH THE BINARY REPRESENTATIONS USING WHATEVER METHOD YOU WANT
...
120 TI$="000000"
130 FOR X=0 TO 255
140 PRINT X,B$(X)
150 NEXT
920 PRINT "TIME:"TI/60
Run time 13.65s!
@markjreed 2 years ago ⁺³
@@MattKasdorf four bits = 1 digit in hexadecimal (base 16), hence "hexit".
The whole reason programmers use hex (and used to use octal) is because the digits map directly to binary bits, no matter where they are in the larger hexadecimal number. For example, $30F6 in binary is just the binary for 3=0011 followed by 0=0000 followed by F=1111 followed by 6=0110: 0011000011110110. No division required.
You can't do that with base 10; seeing a 2 somewhere in a decimal number doesn't tell you anything about the underlying bit pattern. So you have to do the actual base conversion arithmetic.
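To illustrate the digit-by-digit mapping described above, here is a minimal Python sketch (Python rather than BASIC, purely for illustration). The names `NIBBLES` and `hex_to_binary` are made up for this example; the point is only that each hex digit can be looked up independently, with no division.

```python
# Each hex digit corresponds to exactly four bits, so the conversion
# is a per-digit table lookup rather than base-conversion arithmetic.
NIBBLES = {format(d, "X"): format(d, "04b") for d in range(16)}


def hex_to_binary(hex_digits: str) -> str:
    """Return the bit pattern for a string of hex digits (no $ prefix)."""
    return "".join(NIBBLES[ch] for ch in hex_digits.upper())


print(hex_to_binary("30F6"))  # 0011000011110110
```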
@benjaminsmith3151 2 роки тому
Great video! I'm not so familiar with the Commodore specifics, but I've done a ton of assembly optimization and some of it in 6502. My first expectation, as I think you discovered in the video, is that the BASIC interpreter is calling the same character output routine as the assembly call. Because the screen is scrolling, this might mean that the entire screen buffer of 40xWhatever is being rewritten after each newline. That depends on the hardware, but things weren't so sophisticated back then and I don't think they did anything fancy in the text screen buffering. Is there a Commodore BASIC technique that writes OVER the same line instead of scrolling? If not, there isn't much you could do to go faster. On the other hand, if you directly manipulate the same line in the screen buffer it could save thousands of cycles per line outputted. This would be independent of how much text is on the screen, and you might even be able to hard code it with pokes.
@manicsorceress2181 2 роки тому
I've tested the program with the precalculated binary array. The pre-calculation takes 20.783 seconds for you and 21.1 seconds for me? How can that be? I even removed all spaces but it's still slower. I use a PAL machine. Are you using an NTSC machine that runs faster?
@DavidWonn 2 роки тому
Perhaps I missed it, but what does the rightmost button do? Is it an alias to the left button, or up, or...?
@crocolierrblx9365 2 роки тому ⁺¹
nice!
@stefankrautz9048 2 роки тому
Positioning the cursor with POKE 214,x and then printing saved 6 seconds for me. Seems like scrolling the screen slows it down too :)
@00Skyfox 2 роки тому
Do the first 3 bits of 56320 serve any purpose in the C64? If not, seems like that would be a waste of registers.
@8_Bit 2 роки тому ⁺³
All 8 bits of that register are used when reading the keyboard matrix; it's 8x8 = 64 keys. The joysticks and keyboard share those same lines; that's why you can sometimes make the C64 type things by wiggling the joysticks around.
@jacekzielina1359 2 роки тому ⁺²
Great job! BTW: what is line 140, A=X, good for? Why not just use X in the following commands?
@MattKasdorf 2 роки тому
At 3:08 he explains.
@jacekzielina1359 2 роки тому
@@MattKasdorf thanks, I missed that
@allenhuffman 2 роки тому
I played with BASIC bit printing during #SepTandy, which led me to doing routines to bit shift and rotate left and right, as well as AND, OR, etc. Such a rabbit hole for things BASIC has no business doing. (I did not do one in assembly, but I think I want to now.)
@markboulton954 2 роки тому
I haven't gotten around to trying this in an emulator, but I wonder if you could do something like:
FOR X=0 TO 7: ? CHR$(48-(A>127)); : A=(A*2)+(256*(A>127)): NEXT X
@stalinvlad 2 роки тому
I did one with the 15 nibble strings in an array "0000","0001"..."1111" AND with 240 divide by 16 and then AND with 15 (f0 & then 0f), got the time down to 19 seconds. I would supply a listing but vice is all new to me
@stalinvlad 2 роки тому
It is 16 nibbles....
@evansdm2008 2 роки тому
Scrolling is probably so slow that it makes the 6502 version unable to make a dent in it.
@LordRenegrade 2 роки тому
My own "printing out the joystick" thing in assembly just overwrites eight spots on the screen - that saves an enormous amount of CPU time as I'm reading an 8-bit integer and writing eight bytes and I don't need ANY of the KERNAL routines.. I basically store the joystick value somewhere, BIT it, display a one or zero depending on what it is, and then ASL the stored value.
It's not really optimized at all, but it's updating both joysticks at least eight times a frame it seems. That would be a runtime of about 0.5333s (256 values / 8 = 32 frames, 32 frames / 60 jiffies = 0.5333...).
@8_Bit 2 роки тому ⁺²
Yes, it could be much faster - I was specifically trying to replicate a BIN$() type BASIC function that would just use the regular C64 PRINT/CHROUT routines.
@mikegarland4500 Рік тому
Just a question: What's the purpose of having the extra A=X? Could you not just replace all the A's with X instead? I can't figure out if there's an advantage to doing it this way or not; I'm probably missing something..
{sorry, in case it's needed: reference 32:27}
@8_Bit Рік тому ⁺¹
I explain it back around 3:00; some of the algorithms modify A so I keep the framework the same for consistent benchmarking. Obviously they can't be modifying X directly as that would mess up the main test loop!
@mikegarland4500 Рік тому
@@8_Bit ahh, thank you. I figured it was something probably pretty obvious. I somehow missed it, and when I went back, I didn't go back far enough.
@snakefriesia6808 2 роки тому
dang.. you have a big box Howard The Duck .. isn't that completely rare now ??
@bierundkippen720 2 роки тому
SYS 2^15 is easier to remember for running Turbo Macro Pro.
@kennethsimmons9017 2 роки тому
Would the COMAL 80 cartridge work? See the COMAL Handbook, 2nd ed., pages 437-439.
@jefferystone1 2 роки тому
"Epyx Programmers' BASIC Toolkit" also has a BIN$() function.
@retronicksprogrammingchann2337 2 роки тому
Robin - you do such amazing stuff with the 64, how about doing some pc x86/gwbasic/qbasic - maybe a second channel 16-Bit Show And Tell
@8_Bit 2 роки тому ⁺¹
Thanks, I have done a few videos touching on x86, probably not exactly what you're looking for but in case you want to check them out I made a new playlist: ua-cam.com/play/PLvW2ZMbxgP9yiLpZ0s-l8aqb72o4tLHYi.html
In the 386sx laptop video I show a simple x86 game I wrote years ago, and the Chevy video digs into the PC version a little.
@retronicksprogrammingchann2337 2 роки тому
I checked out the videos, and in the laptop video you mentioned that you could probably optimize the code and cut it in half. That sounds like a challenge to yourself and maybe an interesting video: show the things you did wrong and how to improve the code. Maybe you can put in your videos ideas to do in the future. I will probably watch them a few times as I am learning x86 right now.
@stuartmcconnachie Рік тому
Can you improve the precalc time for the basic version with…
DIM N$(15)
N$(0)="0000"
…
N$(15)="1111"
DIM B$(255)
I=0
FOR H=0 TO 15
FOR L=0 TO 15
B$(I)=N$(H)+N$(L):I=I+1
NEXT
NEXT
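As a rough model of the idea in the listing above, here is a short Python sketch (Python only so the logic can be checked quickly; the names `nibble`, `table`, and `to_binary` are hypothetical). It builds the 256-entry table from 16 nibble strings, and also shows the lookup-only variant others in the thread describe, which skips the big table entirely.

```python
# The 16 four-bit strings, "0000" through "1111" (the N$ array).
nibble = [format(n, "04b") for n in range(16)]

# Build all 256 eight-bit strings by pairing a high and a low nibble,
# mirroring the nested FOR H / FOR L loops (the B$ array).
table = [nibble[h] + nibble[l] for h in range(16) for l in range(16)]


def to_binary(value: int) -> str:
    """Lookup-only variant: INT(A/16) for the high nibble, A AND 15 for the low."""
    return nibble[value >> 4] + nibble[value & 15]


print(table[165], to_binary(165))  # both give 10100101
```

The trade-off is the one discussed in the comments: the full table costs setup time and memory but needs a single lookup per value, while the nibble version needs almost no setup but does two small calculations per value.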
@logiciananimal 2 роки тому
memoization can be very powerful..
@JacGoudsmit 2 роки тому
I wonder if the BASIC version would be faster if you use % to tell the interpreter that A is an integer, and if you use loop unrolling. Also as someone already pointed out, you can generate a 0 or 1 with CHR$.
49 Q%=48
50 PRINT(CHR$(Q%-(A%>127)));: A%=A% AND 127
51 PRINT(CHR$(Q%-(A%>63)));: A%=A% AND 63
52 PRINT(CHR$(Q%-(A%>31)));: A%=A% AND 31
53 PRINT(CHR$(Q%-(A%>15)));: A%=A% AND 15
54 PRINT(CHR$(Q%-(A%>7)));: A%=A% AND 7
55 PRINT(CHR$(Q%-(A%>3)));: A%=A% AND 3
56 PRINT(CHR$(Q%-(A%>1)));: A%=A% AND 1
57 PRINT(CHR$(Q%+A%))
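The listing above relies on Commodore BASIC comparisons evaluating to -1 when true and 0 when false, so 48 minus the comparison gives the character code for "0" or "1". A small Python model of that trick follows; `cbm_gt` and `unrolled_bits` are hypothetical helper names, and Python is used only to make the logic easy to check.

```python
def cbm_gt(a: int, b: int) -> int:
    """Model a Commodore BASIC comparison: -1 when true, 0 when false."""
    return -1 if a > b else 0


def unrolled_bits(a: int) -> str:
    """Mirror lines 50-57: test against a threshold, then mask that bit off."""
    out = []
    for threshold in (127, 63, 31, 15, 7, 3, 1):
        out.append(chr(48 - cbm_gt(a, threshold)))  # 48 is "0", 49 is "1"
        a &= threshold                              # A%=A% AND threshold
    out.append(chr(48 + a))                         # last bit is 0 or 1 already
    return "".join(out)


print(unrolled_bits(5))  # 00000101
```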
@mikechappell4156 2 роки тому
Sadly, while integer math is theoretically faster than floating point on the C64, it takes longer to parse the extra characters. I don't think C64 BASIC allowed you to default variables to integer either. (I think TRS-80 BASIC had DEFINT but I don't recall Commodore BASIC having it. I could be mixing it up with the little Fortran I used to know.)
You're probably better off using a variable array for 1, 3, 7, 15, 31, 63, 127 than making the parser interpret the numbers twice on each line. Unfortunately, the more you attempt to optimize the code, the less readable it gets.
I loved coding on the C64 and trying to optimize here and there in the 80s, but I definitely prefer coding solutions in readable C to obfuscated BASIC.
@TimStCroix 2 роки тому
@@mikechappell4156 Commodore BASIC is floating point down to its core. When adding two integer variables and assigning the result to a third integer variable, BASIC will first convert the integers to FP, add them together, then convert the sum to an integer to assign it to the result variable.
It's best to avoid integers except in very specific instances involving integer arrays. On second thought, avoid integers, period.
@NotaWizard 2 роки тому ⁺¹
I would like everyone in the world to think of the BNE mnemonic as short for Bacon 'N Eggs.
@moshly64 2 роки тому ⁺¹
It's the international airport code for Brisbane airport, now I'm gonna call it bacon 'n eggs airport.
@SIDCIAVIC 2 роки тому ⁺³
Printing to the screen is always the bottleneck.
@mikechappell4156 2 роки тому ⁺¹
Scrolling the screen is where it gets nasty.
@SIDCIAVIC 2 роки тому
I prefer to buffer screen codes 1000 at a time, and then flip up and down between them with a nice fast move in assembly.
@stevethepocket 2 роки тому ⁺¹
@@mikechappell4156 Honestly, for the purposes of joystick interpretation (or something similar), printing a home character at the end of each line would make it faster _and_ make more sense than a carriage return.
@mikechappell4156 2 роки тому ⁺¹
@@stevethepocket I tried it and it cut run time down to about 56%.
@garywhittaker6451 5 місяців тому
Can you please point me to where I can get a 4 axis only joystick. As you know Ms Pac-Man is unplayable with an 8 axis joystick. Thanks.
@8_Bit 5 місяців тому
I only know of a few Commodore/Atari compatible 4-way joysticks:
Newport Controls Prostick II or III
Zircon International Z-Stick
Kraft Maze Master
Kraft Switch Hitter
Each of these is switchable between 4-way and 8-way. Since they're all old and probably used joysticks, I'd confirm with the seller that the 4-way function is still working.
If you're handy, search for "true 4-way joystick" online and you'll find some actual 4-way arcade joystick mechanisms, but you'd have to mount them in a box and wire them up yourself.
@cooltaylor1015 2 роки тому
I wonder if a C64 could look up a conversion table faster than it can do the math...
It's a shame, because it has already internally done the opposite conversion
@magnusjahn5342 2 роки тому
Hi, good man Robin, any new easter eggs found in 2022?
@CandyGramForMongo_ 2 роки тому ⁺¹
What’s faster, str( or val(?
@markjreed 2 роки тому
Apparently VAL. 1000 VALs on my C128 took 10 seconds; 1000 STR$'s, 17 seconds.
@CandyGramForMongo_ 2 роки тому
@@markjreed All string functions are hogs, that I remember. Hey, thanks for doing that. That’s interesting.
@markjreed 2 роки тому
@@CandyGramForMongo_ Sure. In each case the argument was a literal, btw; STR$(1234) and VAL("1234"). Replacing those with preassigned variables shrinks the difference a little: 1000 VALs still take 9.5 seconds (572 jiffies) while 1000 STR$'s take only 13.4 seconds (801 jiffies).
@GeoffSeeley 2 роки тому ⁺¹
Ah yes, the good old days where code optimization was a requirement. It's a bit of a lost art these days.
@RandomBitzzz 2 роки тому
I took a C class in 1994, and the teacher flat out said don't bother optimizing your code. He said they'd just build faster machines to run it.
@faceeasyg 2 роки тому
The BASIC part is really nice, showing different approaches to the same problem, kudos. But... I may have got it wrong, but I have the feeling that in the assembly example you were using the exact same BASIC print routine, which (who would have thought) ran at the exact same speed whether it was called from BASIC or from ML, so the conclusion that "the BASIC print routine is fast, just as fast as machine language" is wrong. In both cases the BASIC print routine was tested. You can test the time difference easily with PRINT CHR$(X) for X from 0 to 255 vs. ldx #$00 / start: txa / sta $0400,x / inx / bne start; then you'll compare the BASIC PRINT routine against a print routine optimized for the task in assembly. With your time measuring method the ML code will measure exactly zero, so the BASIC PRINT routine will mathematically take infinitely more time. Now that's a completely different result than the video's conclusion.
@8_Bit 2 роки тому
I probably didn't make it clear enough that I was looking to create the equivalent of a BIN$() function that would play nice with BASIC and the KERNAL operating system, so that means using the official $FFD2 CHROUT character print routine. One could certainly store directly to the screen much more quickly.
@suchaluch5615 2 роки тому ⁺¹
Well, there is ugly style in the asm-File... Just by skimming through it:
@27:11 you see:
jsr convert16
rts
Why would you ever type it like this? You could save a byte (the rts) and some cycles by just replacing the 2 lines with
jmp convert16
@8_Bit 2 роки тому ⁺³
The assembly version certainly isn't fully optimized, and I find that particular optimization makes the intent less clear and the code less readable. But it's certainly a good optimization to use when every byte and cycle matter.
@Chris-op7yt Рік тому
the old style joysticks were better, in my opinion, than the newly fashioned ones with micro-switches, which are impossible to do special game combos with.
old style joysticks were spring and contacts, arranged such that you could easily do 360 continuous movements if desired. this includes diagonals being valid positions, that should be easy to accomplish without hitting a vertical or horizontal first, as those might be other moves being triggered in game.
@huntabadday2663 2 роки тому ⁺¹
This is what I have created
It takes the remainder of the value and adds it to a string variable so it isn't reversed
10 TI$="000000"
20 A=0
30 FOR T=0TO255:PRINT A,
35 I=A:S$=""
36 :
40 FOR J=1 TO 8
50 M=I-INT(I/2)*2 : REM I % 2
60 I=INT(I/2) : REM I >> 1
70 S$=MID$(STR$(M),2,1)+S$ : REM PREPEND
80 NEXT
81 :
90 PRINT S$:A=A+1:NEXT
100 PRINT"TIME: ";TI/60
it takes about 73 seconds to complete while expanded across multiple lines
Unexpanded: S$=MID$(STR$(I-INT(I/2)*2),2,1)+S$ : I=INT(I/2) and takes 67 seconds to complete which is pretty close to the first routine
@ncurtis1970 2 роки тому
10 ti$="000000"
20 for n=0to255:x=n:b$="":b=128
30 ifb>xthenb$=b$+"0":goto50
40 b$=b$+"1":x=x-b
50 b=b/2
60 if1
@ncurtis1970 2 роки тому
I believe the vic (pal) is a teeny bit faster than a 64
@David_JA_Noble 2 роки тому
you should be using a bread bin c64
@talideon 2 роки тому ⁺¹
Why would you want to type on the less ergonomic machine?
@8_Bit 2 роки тому ⁺¹
Just look back a few videos for "Simple Commodore 64 Disk Protection" or "Epyx 1984 Preview Disk", I'm using a breadbin in those videos. I happen to like both models.
@robertlock5501 2 роки тому
ah, good ol' screwtoob bots at it again deleting yet another comment... :-/
@sidoldboy3442 2 роки тому
Without looking at what you did until I did this I managed 18.71667 in basic
5 DIM A$(16)
... Fill the array with A$(0)="0000" to A$(15)="1111"
99 TI$="000000"
100 FOR X = 0 TO 255
110 PRINT X,
120 :
130 IF X < 16 THEN PRINT A$(0);:PRINTA$(X AND 15)
140 IF X > 15 THEN PRINT A$(X/16);:PRINT A$(XAND15)
200 NEXT
210 PRINT "TIME:"TI/60
@sidoldboy3442 2 роки тому
btw: I've not written C64 basic before, but I have written C128 Basic about 35 years ago LUL....
another improvement to 15.883333
5 DIM A$(16)
... Fill the array with A$(0)="0000" to A$(15)="1111"
90 Y=16:Z=15
99 TI$="000000"
100 FORX=0TO255
110 PRINTX,
130 PRINTA$(X/Y);:PRINTA$(XANDZ)
200 NEXT
210 PRINT "TIME:"TI/60
@TheUtuber999 Рік тому ⁺¹
I paused the video after the first example and gave it a try. Not the speediest example by a longshot, but minimal code, for what it's worth...
10 t=ti:forx=0to255:a=x:forb=1to8
20 ifaand128thenprint"1";:goto40
30 print"0";
40 a=a*2:nextb:print,x:nextx
50 print(ti-t)/60"seconds"
38.5666667 seconds
(edit) Here's a hybrid version using Basic and ML...
10 t=ti
100 forx=0to18:readn:poke828+x,n:next
110 forx=0to255:sys828,x:printx:next
115 print(ti-t)/60"seconds"
120 data 32,155,183 :rem jsr $b79b
130 data 160,8 :rem ldy #8
140 data 138 :rem txa
150 data 10 :rem asl a
160 data 170 :rem tax
170 data 169,48 :rem lda #$30
180 data 105,0 :rem adc #0
190 data 32,210,255 :rem jsr $ffd2
200 data 136 :rem dey
210 data 208,243 :rem bne -13
220 data 96 :rem rts
12.8666667 seconds
This revision will let you call SYS 828,number where number can be an 8-bit or 16-bit unsigned value, and it will print the binary equivalent...
100 forx=0to42:readn:poke828+x,n:next
110 a=0:print:input"dec:";a
120 ifa>0thensys828,a:goto110
200 data 32,253,174 :rem jsr $aefd
210 data 32,138,173 :rem jsr $ad8a
220 data 32,247,183 :rem jsr $b7f7
230 data 166,21 :rem ldx $15
240 data 240,8 :rem beq +8
250 data 32,87,3 :rem jsr $0357
260 data 169,32 :rem lda #$20
270 data 32,210,255 :rem jsr $ffd2
280 data 166,20 :rem ldx $14
290 data 32,87,3 :rem jsr $0357
300 data 96 :rem rts
310 data 160,8 :rem ldy #8
320 data 138 :rem txa
330 data 10 :rem asl a
340 data 170 :rem tax
350 data 169,48 :rem lda #$30
360 data 105,0 :rem adc #0
370 data 32,210,255 :rem jsr $ffd2
380 data 136 :rem dey
390 data 208,243 :rem bne -13
400 data 96 :rem rts
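For readers who don't speak 6502, the core of both listings above is the ASL A followed by LDA #$30 / ADC #0 loop: each shift pushes the top bit into the carry flag, and adding the carry to $30 ("0") yields "0" or "1". Here is a hedged Python model of that loop and of the 16-bit wrapper; the function names are made up for this sketch, and it assumes decimal mode is off, as it normally is when a SYS is called from BASIC.

```python
def byte_bits(value: int) -> str:
    """Model the 8-iteration loop: ASL A, then LDA #$30 / ADC #0 / JSR $FFD2."""
    x = value & 0xFF
    out = []
    for _ in range(8):                 # LDY #8 ... DEY / BNE
        carry = (x >> 7) & 1           # ASL A moves bit 7 into the carry flag
        x = (x << 1) & 0xFF            # remaining bits shift left (TAX saves them)
        out.append(chr(0x30 + carry))  # $30 plus carry is "0" or "1"
    return "".join(out)


def word_bits(value: int) -> str:
    """Model the 16-bit version: print the high byte only when it is non-zero."""
    hi, lo = (value >> 8) & 0xFF, value & 0xFF  # the bytes read from $15 and $14
    prefix = byte_bits(hi) + " " if hi else ""  # BEQ skips high byte and space
    return prefix + byte_bits(lo)


print(byte_bits(165))  # 10100101
print(word_bits(258))  # 00000001 00000010
```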
@flaviobruno 2 роки тому
What about this 'Loop unrolling" with something like a 'String builder' approach?
No cache required, and a decent 25 secs. execution time in pure BASIC...
-----------------------------------------------------------------
10 printchr$(147)chr$(5)
20 ti$="000000"
30 fori=0to255
40 printi,
50 print"0";:if(iand128)thenprint"{left}1";
60 print"0";:if(iand64)thenprint"{left}1";
70 print"0";:if(iand32)thenprint"{left}1";
80 print"0";:if(iand16)thenprint"{left}1";
90 print"0";:if(iand8)thenprint"{left}1";
100 print"0";:if(iand4)thenprint"{left}1";
110 print"0";:if(iand2)thenprint"{left}1";
120 print"0";:if(iand1)thenprint"{left}1";
130 print
135 next
145 printti/60
-----------------------------------------------------------------
The {left} syntax comes from CBM prg Studio and stands for the 'cursor left' control character
@robertlock5501 2 роки тому
Here's a converter utility I did back in 2014 for the TI: it's on paste bin with the address /QzrgsEc7 (i'd happily share the link but it'll get deleted)
