CGA Graphics Programming: Even faster circles!

PCRetroProgrammer


Published 20 Apr 2024
We figure out how to draw really fast circles on the IBM CGA adapter, totally smashing our record from the last video.
Code for this episode:
github.com/wbhart/PCRetroProg...
MartyPC (by dbalsom aka GloriousCow):
github.com/dbalsom/martypc

COMMENTS • 31

@erajoj 24 days ago ⁺¹³
Fond memories of early graphics and assembler :)
@theALFEST 27 days ago ⁺¹²
Thanks. Optimizing assembly code for speed is my passion.
@_ttaneff_ 23 days ago ⁺¹¹
YouTube's algorithm at its best - subscribe! :)
Brought back so many memories of my first steps in programming - on IBM/Apple clones in the early '90s (an Eastern Europe thing).
I miss those (failed) teenage attempts at 3D rasterization on an 80286/CGA... and 30 years later, I still look at the assembly output of my code and itch for (micro)optimizations.
Thanks, I will have fun watching your videos!
@pcretroprogrammer2656 23 days ago
Yeah the algorithm did surprisingly well finding people to watch this one. It's by far the largest number of viewers for a video on this channel.
@zyansheep 21 days ago ⁺⁷
Me clicking on this vid thinking CGA stood for "conformal geometric algebra" 😭
@dsgowo 15 days ago
Same, it didn't help that there were circles here too
@GloriousCow 26 days ago ⁺³
Really been having fun watching this series. Seeing it in the hands of a pro makes all the work I put into MartyPC feel worth it. I have a number of features in mind that might be helpful for benchmarking routines.
@pcretroprogrammer2656 26 days ago
Thanks for the very kind comment and for all your hard work on MartyPC!
@MsDuketown 25 days ago ⁺²
Straight-up facts! Good video, thnx.
@procactus9109 24 days ago ⁺²
Awesome
@webgpu 22 days ago ⁺²
oh that prompt ... I spent a dozen years looking at "c:\>" (until win95 -- win 3.1 still required DOS), so many days dealing with autoexec.bat & config.sys .... (my trauma from the "command line" days is so huge I never touched Linux...)
@anon_y_mousse 23 days ago ⁺⁴
What language is that example code in? Why do you have multiple 1-bit shifts in a row in the assembly instead of combining them?
@pcretroprogrammer2656 23 days ago ⁺⁵
That is the language Julia. I use it because it is pretty close to pseudocode and readable enough. Note that in some videos I use a drawpixel function, which is not an actual command in Julia. I just made that up for the presentation.
To combine multiple 1-bit shifts on the 8086/8088, one had to put the shift count in the CL register. There was no multibit shift by an immediate value. The problem with using CL is that it takes up an 8-bit register which we are using for other things, and it wasn't really faster anyway. So typically, unless you want to shift by a variable number of bits rather than a constant number, you use individual shifts. Each shift by a bit can then be counted as 4 cycles, typically.
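To illustrate the reply above: a constant shift written as a chain of single-bit shifts gives the same result as one multi-bit shift. This is only a minimal Python sketch that models a 16-bit register, not code from the video; the names `shl1` and `shl_const` are hypothetical.

```python
MASK16 = 0xFFFF  # model a 16-bit register


def shl1(value):
    """Model one single-bit left shift: bits shifted out of the register are lost."""
    return (value << 1) & MASK16


def shl_const(value, count):
    """Shift by a constant by chaining single-bit shifts, as unrolled code would."""
    for _ in range(count):
        value = shl1(value)
    return value


# Four chained single-bit shifts match one shift by four (within 16 bits).
assert shl_const(0x1234, 4) == (0x1234 << 4) & MASK16
```

In real 8086 code the loop would simply be written out as `count` consecutive shift instructions, which leaves CL free for other uses.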
@anon_y_mousse 22 days ago
@pcretroprogrammer2656 If you're stuck with an 8086, could you perhaps use some of the segment registers to store data? Only problem is that you make multiple function calls, so it'd be difficult to use ss, and you appear to have data all over making ds and cs difficult.
@theALFEST 15 days ago
@pcretroprogrammer2656 usually it's more like 8 cycles, because each opcode byte fetch takes 4
@TheAndreArtus 24 days ago ⁺⁴
Have you looked into Bresenham's algorithms?
@pcretroprogrammer2656 24 days ago ⁺¹
Sure. I implemented Bresenham's line drawing one on this channel.
The midpoint circle algorithm, which I generalise here for ellipses, is itself a generalisation of Bresenham's line drawing algorithm.
There are very many versions of it on the web.
I haven't looked into any Bresenham algorithms other than his line drawing routine and the various generalisations to circles and ellipses though. I'm aware there are generalisations to general conics, but these are pretty incomplete as far as actually usable scan conversion goes.
Do you have some specific Bresenham algorithm in mind?
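For readers who have not seen it, the textbook integer midpoint circle algorithm mentioned above can be sketched as follows. This is the plain circle version in Python, not the ellipse generalisation or the optimised assembly from the video; the function name `midpoint_circle` is hypothetical.

```python
def midpoint_circle(r):
    """Return the set of (x, y) pixels on a circle of radius r centred at the origin."""
    points = set()
    x, y = r, 0
    d = 1 - r  # integer decision variable for the midpoint test
    while x >= y:
        # One computed octant point gives eight pixels by symmetry.
        for px, py in ((x, y), (y, x)):
            points.update({(px, py), (-px, py), (px, -py), (-px, -py)})
        y += 1
        if d < 0:
            d += 2 * y + 1        # midpoint inside the circle: keep x
        else:
            x -= 1                # midpoint outside: step x inwards
            d += 2 * (y - x) + 1
    return points


pixels = midpoint_circle(3)
assert (3, 0) in pixels and (2, 2) in pixels
```

Only additions, subtractions and doublings appear in the loop, which is why the method suits processors without fast multiplication.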
@TheAndreArtus 24 days ago ⁺¹
@pcretroprogrammer2656 Those are the ones. I implemented versions in ASM (with the Turbo Pascal calling convention) in the 90s (based on descriptions in Richard Ferraro's book "Programmer's Guide to the EGA and VGA Cards" and Chris D. Watkins' code and descriptions [various books]), and they were quite fast compared to the standard graphics library that came with TP 5.5. Of course I am going by how I experienced it ~30 years ago; it would probably feel slow today.
We had to do bit plane switching for some [4 bit EGA/VGA] modes, which complicated things a bit (you needed to know whether you were in packed display or bit plane mode) beyond the bit masking required by having multiple pixels per byte. Of course you don't want to address a single pixel at a time (masking & switching) when drawing the horizontal runs (top and bottom 1/4).
This was the first video of yours I've watched; it made me a bit nostalgic.
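The masking idea for horizontal runs can be sketched as below. This is a minimal Python sketch that assumes the CGA 320x200 four-colour layout (four 2-bit pixels per byte, leftmost pixel in the top two bits, 80 bytes per scan line) rather than the EGA/VGA planar modes described in the comment; `fill_run` is a hypothetical name.

```python
def fill_run(row, x0, x1, colour):
    """Fill pixels x0..x1 (inclusive) of one 80-byte scan line with a 2-bit colour."""
    fill = (colour & 3) * 0x55                        # colour copied into all four slots
    first, last = x0 // 4, x1 // 4                    # bytes holding the run's ends
    left_mask = 0xFF >> ((x0 % 4) * 2)                # pixels from x0 to the end of its byte
    right_mask = (0xFF << ((3 - x1 % 4) * 2)) & 0xFF  # pixels from the byte start to x1
    if first == last:
        mask = left_mask & right_mask
        row[first] = (row[first] & ~mask & 0xFF) | (fill & mask)
        return
    row[first] = (row[first] & ~left_mask & 0xFF) | (fill & left_mask)
    for i in range(first + 1, last):
        row[i] = fill                                 # whole bytes need no masking
    row[last] = (row[last] & ~right_mask & 0xFF) | (fill & right_mask)


row = bytearray(80)
fill_run(row, 1, 9, 3)
assert row[:3] == bytes([0x3F, 0xFF, 0xF0])
```

Only the two end bytes need a read-modify-write; every byte in between is written whole, which is the saving over plotting the run one pixel at a time.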
@cocusar 27 days ago ⁺¹
One of the things I really like about this code is the fact of the havoc it wrecks on the PC because of disabling the dram refresh, and it only keeps its code alive because it's the only thing run by the processor! Could you squeeze in some logic in that refreshed area to do a fancy demo?
@pcretroprogrammer2656 27 days ago ⁺¹
I think this comment ended up on the wrong video, as this code doesn't do anything fancy like turn DRAM refresh off.
But to answer your question, yes you can definitely do a fancy demo effect with DRAM refresh off. There is plenty of room to do all sorts of cool things. I intend to do an example of this on the channel fairly soon.
@cocusar 27 days ago ⁺¹
@pcretroprogrammer2656 Ah, I thought this code also messed with the dram because you were only using registers, but I did get confused with the previous video. But in any case, your reply is what I was looking for. Hoping to see that in the future!
@andrewdunbar828 26 days ago
poor havoc. The reason to use registers is because they're always the fastest.
@RLstavista 22 days ago ⁺¹
I don't have enough brain bandwidth for this... 🤪
@alexloktionoff6833 13 days ago ⁺¹
Could multiplication by 50 via LEA be faster?
@pcretroprogrammer2656 13 days ago ⁺¹
On the 8086, LEA doesn't have the multiplications by a constant that the 386 has, for example. But you can use it for adding various registers. So far I've never found a situation where I could get anything more out of it, but it is presumably possible.
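Without a scaled LEA, a constant multiply such as the ×50 asked about is usually decomposed into shifts and adds. The following is a minimal Python sketch of that decomposition on a 16-bit value, using 50 = 32 + 16 + 2; it is not taken from the video's code, and `times_50` is a hypothetical name.

```python
MASK16 = 0xFFFF  # model a 16-bit register


def times_50(x):
    """Multiply by 50 with shifts and adds only, modulo 2**16."""
    x2 = (x << 1) & MASK16     # x * 2
    x16 = (x2 << 3) & MASK16   # x * 16 (three single-bit shifts on an 8086)
    x32 = (x16 << 1) & MASK16  # x * 32
    return (x2 + x16 + x32) & MASK16


assert times_50(7) == 350
```

Each intermediate result is reused for the next shift, so the whole product needs five single-bit shifts and two additions.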
@stanb1455 22 days ago ⁺¹
could something like GlaBIOS' CGA optimizations make things even faster?
@pcretroprogrammer2656 22 days ago ⁺¹
I don't believe so. Drawing pixels is the only thing it really does faster, but using the BIOS to draw pixels is the last thing you want to do if you want high performance code. It is always going to be faster to write directly to video RAM and to simply update the information for drawing pixels as you go, rather than recomputing it every pixel (which is what the BIOS basically has to do).
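The difference between recomputing and updating the pixel information can be sketched as follows. This Python sketch assumes the CGA 320x200 four-colour memory layout (80 bytes per scan line, odd lines in a second bank at offset 0x2000, four pixels per byte); all function names are hypothetical, and it illustrates the idea rather than the routine from the video.

```python
ROW_BYTES = 80     # bytes per scan line in 320x200 four-colour mode
ODD_BANK = 0x2000  # odd scan lines live in a second bank


def pixel_address(x, y):
    """Full recomputation, as a per-pixel plot routine has to do it."""
    offset = (y >> 1) * ROW_BYTES + (y & 1) * ODD_BANK + (x >> 2)
    shift = (3 - (x & 3)) * 2  # bit position of the pixel inside its byte
    return offset, shift


def step_right(offset, shift):
    """Incremental update when moving one pixel to the right."""
    if shift == 0:
        return offset + 1, 6   # wrapped into the next byte
    return offset, shift - 2


def step_down(offset, y):
    """Incremental update when moving from line y to line y + 1."""
    if y & 1:
        return offset - ODD_BANK + ROW_BYTES  # odd -> next even line
    return offset + ODD_BANK                  # even -> following odd line


# Walking along a line with step_right agrees with recomputing every pixel.
off, sh = pixel_address(0, 0)
for x in range(1, 320):
    off, sh = step_right(off, sh)
    assert (off, sh) == pixel_address(x, 0)
```

The incremental steps need only an addition or a small constant change, whereas the full recomputation needs a multiply (or several shifts and adds) for every pixel.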
@Hiphopasaurus 21 days ago ⁺¹
I bet it would be interesting to see the difference in speed between direct writes and using the BIOS routines to draw pixels. That and benchmarking the different BIOS ROMs doing it would be neat too.
@pcretroprogrammer2656 21 days ago ⁺¹
@Hiphopasaurus From experience, the standard BIOS routine is VERY slow, and the accelerated routines are only a couple of times faster.
@dr_jazza 9 days ago
OSU
@supercompooper 23 days ago ⁺¹
Ellipses are like communist rectangles
