Great video. What is the EEPROM program you are using at 13:33 to debug whether your VGA generation is working? Also, how can you write a paper on something that is interesting to you? For instance, I am interested in the transmission and reception of analog audio and picture (CQUAM Stereo, PAL, NTSC, etc.), but this is old and everyone wants digital nowadays. Are there any guidelines on how to express this differently so it would seem like something new and exciting, rather than gibberish written by some lunatic who is living in the past? Sorry for the weird question, I hope you understand it. TL;DR: how do you take some old concept and turn it into something new and exciting that could be good enough for a research paper?
The image at 13:33 is software generated, using the unsigned int EPROM [1024][1024]; array to display the image. This is tickled and downloaded into the 27C322 EPROMs. If you're interested in a research paper on raster generation try: Regan, M. and Miller, G.S., 2017. The problem of persistence with rotating displays. IEEE transactions on visualization and computer graphics, 23(4), pp.1295-1301. Which BTW was named runner-up best paper and was then given as an invited talk at SIGGRAPH 2017 ieeevr.org/2017/awards/ Cheers
Nice build, I wondered about doing this a while ago as I had a need for a static image, the only thing extra I would have needed was a PAL encoder chip and different timings.
Thank you very much for making this video! I'm a computer engineering student and am currently making a VGA circuit that is small and compact to go with my other projects. Say, do you know how I could make a hardware text mode for these circuits?
I got this far with my thinking after watching your Apple II video generation series, and worked out that you could interleave EPROMs to generate full-resolution VGA, and then skipped past that to a RAM framebuffer. I even went the next step of working out making each pixel be a palette entry of 2048 colors. It fell down rapidly after that because designing a tiler engine is non-trivial at any resolution without resorting to CPLDs/FPGAs, and that's a bit beyond my ken for now.
Cool video, though I’m not sure I entirely understand. Is that starman frame-buffer stored in the EPROM? Or is it coming live from somewhere else? Wonder what it’s like to bit-bang DVI, considering most non-CRT monitors have digital circuitry in them anyhow.
Yep, Starman is in the same EPROM, but it's really only there for testing. It's the address lines that are useful. If you feed them into an SRAM instead, you can update the image. HDMI has a different format, but pixels are still sent one at a time.
@@DrMattRegan ah so I did understand it properly. Though along with your frame buffer you’d need to fill the RAM with the first few bits of the “next address”. I can see why Ben Eater used combinatorial logic and counters to control the blanking periods, since the majority of the work is just “address++” anyhow. Synchronising the timers did look a bit of a pain though. A compromise method might be to use a simple binary counter for incrementing pixels and a second binary counter for incrementing memory addresses through RAM, and an EEPROM to look at that pixel address and output signals for V and H blanking (where the RAM address isn’t incremented) and for resetting the counters. That way you can store memory much more densely, and get 3*5-bit colour.
The RAM shouldn't need addresses, just the ROM. It might also make sense to have a divide by 16 counter in front of the EPROM. I did wire up a counter chain in the apple 2 wire-by-wire series, but it is a pain.
Well, given the design that you gave, the memory structure is a linked list. I'm sure you used it to simplify the writing of the video memory on the CPU side by using row, column addressing which is convenient for addressing pixels. To make it contiguous, you'd have to set the pointer of the "next address" portion into the next address. Now, if you were not including the H-Sync and V-Sync as part of the contiguous region, I would imagine you'd need to have an additional counter, either as circuit or some sort of "microcode" that would drive the syncs and possibly the front and back porches via run-length encoding which would take up less memory. If doing that, I'd probably eschew the linked list pointer and pack 2, 2, 2, 1, 1 for R,G,B,H,V, as 8 bits. I imagine you could use the lower ROM for the actual video memory, later to be replaced with RAM. Then might could treat the other ROM as a state machine to do the counter, which while still practically the linked-list in practice, wouldn't be used as addressable to CPU. With that sort of design, might could have multiple ROMs that represent different "video modes" and use some sort of I/O address on the bus to select which ROM via a pair of latches to instantly change the resolution. Either way, would need to be careful of only updating lines that aren't scheduled for immediate render since it could create artifacts (I assume the cause of the CGA render bugs from my childhood); although, I suppose it could be a "feature" if exploited for controlled special effects. Speaking of latching, might could have multiple resistor banks coming off the R,G,B links and use another I/O to swap them using the lines currently tied to ground so you can vary the palette a bit and a few AND gates (at least in theory). I hope my thoughts on it were in the ballpark.
Hmmm... that is pretty complicated, but yes it acts like a linked list. The image is there mainly to test to see if it works, the address is what you are interested in. I would put all the active video below a certain address and the inactive video above that address, say 0x80000. At the end of the active area you jump to an address above 0x80000 then go back to the low address for the next active region of a scan line, starting from where you left off.
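A rough sketch of the layout described in this reply, written as the kind of PC-side table generator the thread talks about. The 800 x 525 timing totals, the contiguous packing of active pixels, the placement of blanking words and all function names are assumptions made for illustration, not the author's actual code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed 640x480 timing totals (800 x 525 positions including blanking).
constexpr std::uint32_t kActiveW = 640, kActiveH = 480;
constexpr std::uint32_t kTotalW = 800, kTotalH = 525;
constexpr std::uint32_t kInactiveBase = 0x80000;  // blanking words live at or above this
constexpr std::uint32_t kRomWords = 1u << 20;     // 20 address lines

// Map a raster position to a ROM address (hypothetical layout).
std::uint32_t addressOf(std::uint32_t x, std::uint32_t y) {
    assert(x < kTotalW && y < kTotalH);
    const std::uint32_t hBlank = kTotalW - kActiveW;
    if (y < kActiveH) {
        if (x < kActiveW) {
            return y * kActiveW + x;                        // visible pixel, packed contiguously
        }
        return kInactiveBase + y * hBlank + (x - kActiveW); // horizontal blanking
    }
    // Vertical blanking: whole lines stored after the horizontal blanking words.
    return kInactiveBase + kActiveH * hBlank + (y - kActiveH) * kTotalW + x;
}

// Build the "next address" field for every raster position.
std::vector<std::uint32_t> makeNextTable() {
    std::vector<std::uint32_t> next(kRomWords, 0);  // unused words simply point at 0
    for (std::uint32_t y = 0; y < kTotalH; ++y) {
        for (std::uint32_t x = 0; x < kTotalW; ++x) {
            std::uint32_t nx = x + 1, ny = y;
            if (nx == kTotalW) {
                nx = 0;
                ny = (y + 1) % kTotalH;
            }
            next[addressOf(x, y)] = addressOf(nx, ny);
        }
    }
    return next;
}

// Follow the list from address 0 and count clock edges until it returns to 0.
std::uint32_t frameLength(const std::vector<std::uint32_t>& next) {
    std::uint32_t address = next[0];
    std::uint32_t edges = 1;
    while (address != 0 && edges < kRomWords) {
        address = next[address];
        ++edges;
    }
    return edges;
}
```

With this layout the last visible pixel of line 0 jumps up to 0x80000, and the last blanking word of line 0 jumps back down to address 640, which is where the active data left off.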
@@DrMattRegan I think I was overthinking it for the moment as I was trying to account for different video modes and possible different palettes. I think I recall you doing something similar now for the longer video series (I shall have to go back and watch again). I will take the opportunity to ask a dumb question, is my convoluted thought actually feasible to have one bank containing the CPU addressed memory seen as 16k, perhaps as bytes, but then have the parallel linked list that maps the next address? I'm working under the idea that the software would not need to concern itself with that detail and be able to directly write the color information on the opposite clock state as the CPU.
Yes i think what you are saying is correct, you basically treat the video address that points into video memory as another attribute, like i'm currently treating RGB. It would certainly work. With the Apple 2 and ZX speccy, they are essentially 1-bit per pixel (although the speccy has an extra 1/8th bit per pixel), and with some clever re-arrangement, i can get the address space down to 64 K. That will leave several bits for mode selection. I would probably use an octal flip-flop triggered by V-SYNC to change the mode bits.
That EPROM isn’t available / in stock is it? I looked for some DIP EEPROMs and the ones available ( X28C512P-12 ) are pretty low capacity (64K x 8 bit) and slow (120 ns). Also I was starting to wonder how I’d build something that can support higher resolutions. Maybe some kind of RAM instead and you’d copy the ROM to RAM (with a CPU?) at startup to get the video working. Or using multiple EEPROMs and somehow divide the clock between them! I have an FPGA (Digilent Arty, with VGA PMOD hooked up to it) and it can put out 1080p VGA signal (148.5 MHz pixel clock), maybe even higher. But I hate the tool chain and dependencies involved (Vivado in my case, for Xilinx chips).
You can find the 27c322 pretty easily on ebay. All of the constants in the tinyVGA horizontal specifications are divisible by 16, so you can put a divide by 16 counter for the least significant bits. This will reduce the EPROM size by a factor of 16. To go to a higher resolution such as 1080p i think you'll have to persist with the FPGAs unfortunately. You'll have a hard time getting through hole technology with 7400 series logic to go that fast.
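The divisibility claim can be checked at compile time. This is a small sketch; the four horizontal values are the commonly published 640x480 at 60 Hz figures and are assumed here rather than taken from the thread, and the names are made up for the example.

```cpp
#include <cassert>

// Horizontal timing in pixels (assumed values): visible area, front porch,
// sync pulse, back porch.
constexpr int kVisible = 640;
constexpr int kFrontPorch = 16;
constexpr int kSyncPulse = 96;
constexpr int kBackPorch = 48;
constexpr int kLine = kVisible + kFrontPorch + kSyncPulse + kBackPorch;  // 800
constexpr int kDivider = 16;

constexpr bool divisibleBy16(int value) { return value % kDivider == 0; }

static_assert(divisibleBy16(kVisible) && divisibleBy16(kFrontPorch) &&
              divisibleBy16(kSyncPulse) && divisibleBy16(kBackPorch),
              "every horizontal boundary falls on a multiple of 16 pixels");

// Horizontal states the ROM has to hold per scan line.
constexpr int statesPerLine(bool withPrescaler) {
    return withPrescaler ? kLine / kDivider : kLine;
}
```

With the prescaler the ROM steps through 50 states per line instead of 800, which is the factor of 16 mentioned above. In that arrangement the ROM would only carry timing, so per-pixel image data would have to come from somewhere else.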
Since the whole frame is one big linked list, the words can be in any arbitrary order, including contiguous. The chosen order just makes it simple to generate the data.
Seems like a huge pain in the ass to generate NTSC with code. I still use an old 80s B&W television set that does all of this, purely analog, with a few handfuls of components and some good old fashioned high voltage to watch my old black and white movies. Sometimes nothing beats simplicity.
As soon as the first bit of addressing goes, you'll probably lose the regular HSYNC, so it would scramble or the monitor would drop the image. Just a guess.
Someone tested exactly the above. They filled the EPROM with all 1's (or zeros, whatever) and displayed the bit map. Then they put the chip under a UV eraser light (as used back then) and watched the chip data "erase" on screen. Today you could use a UV laser pen to erase sections selectively and watch it erase on screen!
@@LunaWuna The engineer is named Jens Schönfeld. He designs retro products for the Amiga and other computers. I searched using [ jens schönfeld amiga eprom ] but it did not find anything. He told me and others of his experience doing the above at an Amiga computer festival over 20 years ago. Sorry - I don't have his new email address...
Using an EPROM for the address rather than a counter seems expensive. You could probably bit bang out the signal using a Cortex M0 nowadays. I did a greyscale NTSC video overlay for drones years ago using an M0. It ran fine.
If you have a look at this playlist ua-cam.com/play/PLjQDRjQfW-84aOLT33kzoZghRofK-uL1F.html (90 mins watch time) it may be a little more apparent why use an EPROM. Let me know what you think.
@@DrMattRegan It seems to work like the 'math box' in the Atari Battlezone coin-op used to transform vertices. I am not sure how you are going to take this static image burned to ROM and turn it into a programmable image without 3*640*480= 921K bits worth of memory. I ended up using tiles of 8x8 pixels, like early computers did, to reduce the memory footprint. Here is a flight demo of the M0+ OSD project. ua-cam.com/video/MtMT04ecjL4/v-deo.html
@@moddaudio the image is just there as a test. I’m planning to use it with the Turing6502 to make an Apple 2 etc. This already uses the 27c322, so I can squeeze the raster generator into the same ROM as the FSA for the CPU
Hello Dr. Regan. It seems like the 27c322 has been discontinued for some time. Do you know where to get it or any other good replacement for the memory chip? Thank you very much
They are still available on ebay. The tricky thing is the speed, these are 50ns parts and you will probably want to use something in that category. SRAM is an option, but you will need a microcontroller to program it on power-up.
4:44 I don't think I could understand it clearly, but why set the horizontal sync, and the vertical sync at the INactive region instead of simply setting them at the *active* region?
They control where the image is located on the screen. If you put them in the middle of the active region the image would appear in the corners and there would be a big blank horizontal and vertical bar in the middle of the screen!
I’ve gone through comments and no one’s mentioned the obvious mistake at 4:21: “Of this 1K address space…” Obviously 20 address lines is at least a 1 Meg address space, each row being 1 KBytes, for 1024 rows.
Correct, it is meant to be 1 meg. Unfortunately, no matter how many times i review a video, some things slip through. Maybe one day i can afford an editor.
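The arithmetic behind the correction, as a tiny sketch. The row and column split mirrors the 1024 x 1024 array mentioned elsewhere in the thread; the constant and function names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kAddressLines = 20;                // as stated in the comment above
constexpr std::uint32_t kWords = 1u << kAddressLines;      // 1,048,576 locations
constexpr std::uint32_t kRows = 1024;
constexpr std::uint32_t kWordsPerRow = 1024;

static_assert(kWords == kRows * kWordsPerRow,
              "20 address lines cover 1024 rows of 1K locations each");

// Row goes in the upper 10 address bits, column in the lower 10.
constexpr std::uint32_t addressOf(std::uint32_t row, std::uint32_t column) {
    return row * kWordsPerRow + column;
}
```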
@@DrMattRegan I certainly appreciate the effort in the series. I am primarily a software developer, but between the series on this channel and Ben Eater's I've learned a lot.
Actually i'm targeting these videos as low level hardware from a software perspective, so i often use software to explain what i'm doing. I've run into a lot of people who are expert level software developers and who know it so well they are a little bored, so they want to step outside their domain a little and know a bit more about what happens under the hood and get closer to the silicon.
Hmmmm. Could I get a motherboard with no CPU to do video out by violating the BIOS chip..? Need to look into what exactly requires the CPU, and what it can do without one. Curious if a modern chipset could be powerful enough to run Doom. I feel like it’s a necessity for a good CMPE undergrad to get Doom running on something new and I don't think anyone’s done an empty motherboard yet.
NVidia requires a CPU to configure the chip. Only custom older designs work without a CPU. Even those that use the 6845 require a CPU to initialize it.
So I guess, for "changeable" video, the data should come from RAM, but the pointer and all the other stuff can still come from a little EEPROM? I mean the whole thing can come from RAM, but that sounds a bit wasteful. Very interesting idea!
@@DrMattRegan I feel the most gain is less parts and using that free timing of eeprom. But maybe also less latency? I can maybe imagine more functionality too.
Am I missing something here... okay, yes, I am.... An Eeprom stores the image and the code... but what's running the code? Where's the processor ??? And then, the Eeprom's binary output is just dumping right out to those HC whatever chips which turn logic level into something VGA can use ?
Well, the EPROM is storing the horizontal count (and vertical count too). No external count. Imagine the output of an adder connected to a set of flip flops, then the output of the flip flops goes back to one adder input and the number one is the other adder input. Can you see how that would count up? Well, the adder is combinatorial logic, so we can replace it with an EPROM.
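The adder-plus-flip-flops picture in this reply can be simulated in a few lines. This is a toy sketch with a hypothetical 4-bit, 16-word ROM; the names are invented for the example and the real circuit is of course far wider.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kStates = 16;  // hypothetical 4-bit counter, 16 ROM words

using Rom = std::array<std::uint8_t, kStates>;

// "Program" the ROM: the word stored at address a is a + 1, wrapping to 0.
// This lookup table does the job the adder did.
Rom makeCounterRom() {
    Rom rom{};
    for (int a = 0; a < kStates; ++a) {
        rom[a] = static_cast<std::uint8_t>((a + 1) % kStates);
    }
    return rom;
}

// Simulate `ticks` rising clock edges. On each edge the flip-flops latch
// the ROM output, which then becomes the next ROM address.
std::uint8_t runCounter(const Rom& rom, std::uint8_t start, int ticks) {
    assert(start < kStates);
    std::uint8_t state = start;
    for (int i = 0; i < ticks; ++i) {
        state = rom[state];
    }
    return state;
}
```

Nothing here executes instructions. The only moving part is the register being reloaded from the table on every clock edge, and changing the table contents changes the counting sequence without touching the wiring.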
@DrMattRegan thanks for replying but i'm still not getting it. Maybe I have always misunderstood how an eprom works. You have an address bus, a data bus and r/w pins to pull or push data so how is that running code ? I just watched the ben eater video which has a lot of detail and i understood that.. but his counter is done on external chips so the eprom is just there to output a byte for the display bitmap from an address given by the external counter but you just have eeproms and no counter chips. Code with no processor makes no sense to me.. like a recipe without a chef :-) Sorry if I'm frustrating you..I'm frustrating me too :-)
@@jacklewis100 close. The eprom contains the data to be displayed and the address of the next pixel. On the positive edge of clock the next pixel address is latched into the 374s and this becomes the current pixel address. Then we do another look up. For all addresses outside the scan, we count up until we hit a location which is part of the raster count (I could have also set the next pixel location to be 0). That way it doesn’t matter what the flip-flops hold on power up.
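The power-up behaviour described here can be tried on a toy table. In this sketch a hypothetical 32-word ROM uses addresses 0 to 19 as the raster loop and lets every other address count upward until it wraps into the loop; the sizes and names are assumptions chosen only to keep the example small.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kRomWords = 32;     // hypothetical toy ROM size
constexpr std::uint32_t kRasterWords = 20;  // addresses 0..19 form the raster loop

// Build the next-address field for every word in the toy ROM.
std::vector<std::uint32_t> makeNextTable() {
    std::vector<std::uint32_t> next(kRomWords);
    for (std::uint32_t a = 0; a < kRomWords; ++a) {
        if (a < kRasterWords) {
            next[a] = (a + 1) % kRasterWords;  // inside the raster: walk the loop
        } else {
            next[a] = (a + 1) % kRomWords;     // outside: count up, wrapping to 0
        }
    }
    return next;
}

// Clock edges needed before the latched address lands inside the raster loop.
int edgesUntilInRaster(const std::vector<std::uint32_t>& next,
                       std::uint32_t powerUpState) {
    assert(powerUpState < kRomWords);
    int edges = 0;
    std::uint32_t state = powerUpState;
    while (state >= kRasterWords) {
        state = next[state];
        ++edges;
    }
    return edges;
}
```

Whatever value the flip-flops happen to hold at power-up, the walk reaches the loop after a bounded number of clock edges, so no reset circuit is needed. Pointing every unused word straight at 0, the alternative mentioned in the reply, would get there in a single edge.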
@@DrMattRegan ahh... the penny just dropped... or rather my brain took a wrong turn earlier on so that confused everything thereafter... I've retraced my brain steps now. The code was confusing me... i couldn't understand how code was being executed but now I realise the code was run on a pc just to generate the eeprom code! Duh! Brilliant. Thank you.
@@jacklewis100 excellent. Glad you stuck it out. The official name for the structure is a finite state machine. In this case, the state is the screen location.
Question. Why does the order of D0-D15 matter? If you write the data in any decent order, you should get the data out the same way. It's almost the same for the address lines, but since each address line could enable a bank, the reading delay won't be the same. This is very important for DRAMs since their addresses are multiplexed. Any reasonable arguments?
You are right, from an electrical perspective you can mix up the address and data lines as much as you want on these EPROMs, it's 50ns access no matter what. The problem with the 27C322 is if you write LSByte then MSByte into the file that you program the EPROM with, it will come out on the pins as D0 : D8 : D1 : D9 : D2 : D10 etc... Now you can wire up the 27C322 correctly into your circuit so that this would not matter. The difference comes at debug time when you are running a logic probe up and down these pins. It's just so much easier if the pin ordering is contiguous. Before i started using this swizzle routine, i just found it hard to read off hexadecimal values with the logic probe. So i do take advantage of the fact that i can mix up the data lines without penalty.
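The swizzle routine itself is not shown in the thread, so the following is only a guess at its shape. It assumes the data pins run down the package in the order D0, D8, D1, D9, ... D7, D15 as described above, and it moves logical bit k onto whichever data line sits at physical position k. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed pin order down the package: D0, D8, D1, D9, ..., D7, D15.
// Physical position k in that order is data line (k / 2) + 8 * (k % 2).
constexpr int dataLineAtPosition(int k) { return (k / 2) + 8 * (k % 2); }

// Place logical bit k on the data line found at physical position k, so a
// probe run down the pins reads the bits in order.
std::uint16_t swizzle(std::uint16_t logical) {
    std::uint16_t stored = 0;
    for (int k = 0; k < 16; ++k) {
        if (logical & (1u << k)) {
            stored = static_cast<std::uint16_t>(stored | (1u << dataLineAtPosition(k)));
        }
    }
    return stored;
}

// Inverse mapping, handy for checking a dump read back from the chip.
std::uint16_t unswizzle(std::uint16_t stored) {
    std::uint16_t logical = 0;
    for (int k = 0; k < 16; ++k) {
        if (stored & (1u << dataLineAtPosition(k))) {
            logical = static_cast<std::uint16_t>(logical | (1u << k));
        }
    }
    return logical;
}
```

Because the EPROM does not care which data line carries which bit, the permutation costs nothing electrically. It only has to be applied consistently when the programming file is generated.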
I read that the lowest data rate for hdmi was 250 MHz. You would need to provide 4x balanced 2.5V logic signals and the signals are 8b10b encoded from the VGA signals. It is then up to the monitor as to which screen resolution and refresh rate is supported. Also hdmi is a licensed technology, so maybe digital video via display port instead. Anyway I think using an eprom is a novel approach to this solution.
Good question. I did use a normal counter here ua-cam.com/video/qbzzzkNPICI/v-deo.html But watch this video ua-cam.com/video/mPkAgXJOoSc/v-deo.html to see why i might use a Finite State Machine as a raster generator
hm, if you used static ram instead of a rom, could add in sprites, cursors , text data etc by using extra memory for sprite and char lookup data. would also give you a direct way to update image data. if you offset the address bus via a secondary 'rom' lookup, text or image scrolling
Yeah, ultimately there will be SRAM for image storage. The ROM image is just for debugging. I'm planning on doing some more with the design including 3D rotation.
I watched the Ben Eater vids before I watched yours. After that, it makes me wonder why the Apple ][ and C=64 never did this back then. I remember VGA graphics being easy to program because it was just a linear array, no special addressing modes needed. Aka the Graphics Gems books on how to do basics like DDA line drawing and fast fills etc. Way before graphics-specific GPUs, these new tricks on old hardware could have jumpstarted audio-specific cards too (DMA memory from disk, sound samples).
It is a novel approach. Clearly quite simple and flexible, but also wasteful (~90% of memory is either unused or wasted on next address values). Still, its simplicity is why it is nice. Using binary counters and comparators for row and column, plus an adder (if the number of columns is not a power of 2) to fit it in a contiguous region, with no next addresses, would be easy too though. Another option is to still have an EPROM for generating control signals, but with the RGB data stored separately. Like James Sharman's 8-bit computer.
Yes, admittedly wasteful. My aim is for someone not too familiar with 7400 series logic to look at it and say "Oh, is that all that's happening". These big EPROMs are actually very cheap and kind of fun. You can do things in unusual ways. If you haven't seen it already you might want to look at ua-cam.com/play/PLjQDRjQfW-84aOLT33kzoZghRofK-uL1F.html for some more large EPROM madness. Don't forget to watch the last 5 minutes of the 3rd (and final) video in the playlist.
Loved my GeForce GTX 460 a lot. Really stable card and surprisingly capable for the time. Thank you for your work on that!
Yeah, NVIDIA was a great company to work for. Very interesting experience.
@@DrMattRegan I just missed you. I joined NVIDIA right as Kepler (GTX 680) was taping out. I've since moved on, but it was a fun place to work for sure.
@@DrMattRegan I bought a 480 to play video games on back in grade 8 around 2018 cus it was cheap lol
It was a good card. Never knew you worked on that lol
Out of interest (for my benefit), to anyone who had just discovered this channel for the first time, how did you find it?
YouTube suggested it.
@@xDR1TeK cool. Thanks.
Hackaday article.
great, thanks for letting me know!
YouTube suggestion. My guess is because I have "retro-computing" and "computer history" playlists and I tend to watch these in full more than other videos
This video appeared in my recommended videos. Just from the title and screenshot, not yet having watched your video, I recognized that you might be using a ROM sequencer, a technique which I love and I myself used before in the early 80s to create fast complex control circuits in place of using a slower and more expensive microprocessor where the algorithm was straightforward and repetitive. Such designs were common in tech companies in Southern California that I worked for in those days. A few years later I was using PALs to do similar things. I might be wrong, as I'm writing this before having watched your video as I wanted to reply to your request as to how I found your video. 😊
Cool. Check out the Turing6502 videos where I build a CPU using similar concepts.
The way you configured latch, the way you drive the RGB outputs, and even the control signals using only ROM data and avoiding as much logic calculation in the components as possible were very helpful. Thanks!
I used this technique (loosely) to make a ballistics bullet speed display with selectable fps/mps display on an LCD. it was also just an EPROM and some latches at its heart. It's a lovely technique, simple and reliable. Thanks for sharing this.
Yep, an EPROM with some latches (plus a feedback loop) forms a finite state machine, one of the fundamental levels of computing! en.wikipedia.org/wiki/Automata_theory
This was very interesting for me. I have been scouring youtube for videos on how to design VGA circuits for retro computers. I have seen this type of formula before. First time I have seen it using eproms. It gave me a couple aha moments so I will need to study this video several times. Thanks for all your hard work.!
Glad you enjoyed it! You might want to work your way through the Apple ][ wire-by-wire series starting at video 8 ua-cam.com/video/qbzzzkNPICI/v-deo.html
@@DrMattRegan thanks
EPROMS were used for programmable logic for a long time
@@asicdathens can you recommend a good book on this subject?
Check Ben Eater's videos as well. He does something similar. Then goes on to connect it to his 6502 computer.
Congratulations on getting so many viewers for this video. Your material deserves an audience. I liked how you stored the next pixel address in the EEPROM. That simplified your circuit compared to similar projects I've seen. They usually use external counter IC's plus logic to reset the counters. Your approach has a much smaller part count.
Thanks Martin. I'm planning on integrating it into the CPU design. In the Turing6502, the eprom is actually unused while clock is high, so i'm going to use that for the raster generator. The plan is to get the rulebook for the CPU and the raster generator into 2 eproms.
Just found it. I'll be doing something similar in the next few weeks but rather than encoding an FSA using EPROMs, I've got a counter-based solution in mind. Just like you, all on breadboards, so many fun and games in the countless hours tracking down bad connections, mis-wiring, defective/toasted components, missing jumpers and bent legs and the like...😂
Really like your approach and explanations though. Please keep going!😊
Welcome. Good luck with the build.
Great application of this technique. Woz built his Apple II floppy disk controller around a 256-byte PROM-based logic sequencer similar to this. The rest is software running on the Apple II host to manipulate an 8-bit command register. It saved dozens of chips and made the disk controller very inexpensive compared to contemporary ones. I'm sure you already knew this but it is a fascinating example of lateral thinking nonetheless.
Thanks for the feedback.
If you haven't watched these yet, you might want to take a look at the Turing6502 playlist ua-cam.com/play/PLjQDRjQfW-84j-jLvrbEeDvGl0QrhX9p7.html
where the 6502 is mainly implemented as state machines programmed into an EPROM (software)
@@DrMattRegan I have watched the first few and they blew my mind. Frankly, I've struggled a bit to understand it but I'll persevere. It's incredibly interesting.
If you're a Ben Eater fan it may be worth watching ua-cam.com/play/PLjQDRjQfW-84aOLT33kzoZghRofK-uL1F.html
particularly the second video. It gets the least number of views, but i think it's one of the more important videos on the channel.
Thanks Ben Eater... as I can now enjoy a video such as this, and follow along with most of it. Thanks Matt for your pleasant walk through style and the detailed information.
Glad you enjoyed it! You might want to try this series ua-cam.com/play/PLjQDRjQfW-85S5QkX8wZbkqichM6TLYYt.html
@@DrMattRegan Great! The first 2 mins of the 1st in the series has me hooked... preparing the popcorn and ready for a binge session! Cheers!
4:24 excellent explanation!
Nice explaining!
Man I wish I had you during my college years . Great explanation !!
Glad you liked it!
Really great video and nice implementation. If only I had this kind of info when I tried bit banging basic VGA video on microcontrollers in the mid 1990s :) Would have saved me many weeks of hair pulling !!
Glad you enjoyed it!
Wow, it's literally been decades since I've seen anyone write GDI code :)
Yeah, SetPixel is about the slowest function there is, but i'm not really interested in speed here. Just the simplest way to put pixels on the screen.
13:15 this is why I often use simple constexpr functions for doing odd bit manipulation like this.
You could have something like the following (ignore the bad name):
constexpr int getColorBits(int value, int color){ return (r&0xe0)
Yes, it's a good idea. Makes the code much easier to read.
I should do that before i release code to the wild.
If you use constexpr though, will Visual Studio evaluate it if you hover over it? I haven't tried it.
The way i do it, you can highlight the entire expression and VS will evaluate that, or
you can highlight the individual components of an expression and VS will evaluate that sub-portion.
@@DrMattRegan I have checked in JetBrains Rider and it can evaluate constexpr, so it is definitely possible for IDEs to evaluate these constexprs
It was fun. Love it. Figured out what's actually going on (how it's EPROM?) when you ran VS2022 )
P.S. Regarding the timing issue at 13:54, it's not actually correct: as I can see in the M27C322 spec, the ACCESS TIME is 80ns and the PROGRAMMING TIME is 50μs/word, so it's even worse than I thought: it's actually 80 ns for reading. But I think this limit might be overcome by 'multiplying' this scheme. The idea is to split it into 2 parallel 'nodes' and read every second pixel from the second one while the first one is still being processed. It should give you exactly the 40ns per pixel you need. It might ) I need to think it over; thanks again for the fun!
Glad you enjoyed it. It's amazing what a simple finite state automaton can do. The chip (27C322) has an access time of 50ns. At full VGA we need 40ns/pixel but half VGA is 80ns/pixel. I didn't try it, but I suspect full VGA will fail.
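The timing argument above can be checked with a little arithmetic. This is a rough sketch, assuming the standard 25.175 MHz pixel clock for 640x480@60 and ignoring latch setup time and propagation delay; the function names are made up for the example:

```cpp
// Pixel period in nanoseconds for a pixel clock given in MHz.
constexpr double pixelPeriodNs(double pixelClockMHz) {
    return 1000.0 / pixelClockMHz;
}

// True if one ROM lookup fits inside one pixel period.
// Simplification: latch setup time and propagation delay are ignored.
constexpr bool lookupFits(double accessTimeNs, double pixelClockMHz) {
    return accessTimeNs <= pixelPeriodNs(pixelClockMHz);
}

constexpr double kVgaPixelClockMHz = 25.175;                 // standard 640x480@60 clock
constexpr double kHalfRateClockMHz = kVgaPixelClockMHz / 2;  // one lookup per two pixels
```

With these numbers the period is about 39.7 ns at full rate and about 79.4 ns at half rate, so a 50 ns part fits only at half rate. Interleaving two ROMs, as suggested above, gives each ROM the same two pixel periods per lookup.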
Interesting, reminds me of how James Sharman started his VGA project. Except he used a ROM to handle the sync timings only but not the frame buffer.
At first, but he did eventually use counters.
"Ok, that doesn't seem to work." Lol.
As a fellow Australian programmer, this just sounded so familiar haha.
Welcome. Yes, I had a big spiel in another video about our use of -a and -er at the end of words. Very confusing to some!
Poor Ben Eater. His card is now basically obsolete. Miniaturization caught up with him
Great video
What is the EEPROM program u are using at 13:33 to debug if your VGA generation is working?
Also how can you write a paper on something that is interesting to you?
For instance: I am interested in transmission and receiving of analog audio and picture (CQUAM Stereo, PAL, NTSC, etc)
But this is old and everyone wants digital nowadays, so are there any guidelines on how to express this differently so it would seem like something new and exciting, rather than gibberish that some lunatic who is living in the past is writing about
Sorry for the weird question, I hope you understand it
TL;DR: how to take some old concept and turn it into something new and exciting that could be good enough for a research paper?
The image at 13:33 is software generated, using the
unsigned int EPROM [1024][1024];
array to display the image. This is tickled and downloaded into the 27C322 EPROMs.
If you're interested in a research paper on raster generation try:
Regan, M. and Miller, G.S., 2017. The problem of persistence with rotating displays. IEEE transactions on visualization and computer graphics, 23(4), pp.1295-1301.
Which BTW was named runner-up best paper and was then given as an invited talk at SIGGRAPH 2017
ieeevr.org/2017/awards/
Cheers
Well, blow me down with a feather, but I’m impressed!!
Glad you like it!
Nice build, I wondered about doing this a while ago as I had a need for a static image, the only thing extra I would have needed was a PAL encoder chip and different timings.
I'm working on the next video now, you might like it.
thank you very much for making this video! I'm a computer engineering student and am currently making a VGA circuit that's small and compact to go with my other projects. Say do you know how I could make a hardware text mode for these circuits?
You might want to have a look at this video
ua-cam.com/video/0drvAVJF66I/v-deo.html
it adds text to the raster generator.
@@DrMattRegan sounds good! Thank you doctor
interesting yt recommended, very interesting
I got this far with my thinking after watching your Apple II video generation series, and worked out that you could interleave EPROMs to generate full-resolution VGA, and then skipped past that to a RAM framebuffer. I even went the next step of working out making each pixel be a palette entry of 2048 colors.
It fell down rapidly after that because designing a tiler engine is non-trivial at any resolution without resorting to CPLDs/FPGAs, and that's a bit beyond my ken for now.
Oh cool, did you get the basic raster generator without the tile engine running?
@@DrMattRegan some noise, breadboards aren't ideal for 25MHz signal I suspect
Yeah you will struggle to get 25MHz. That is FPGA territory.
Now this is the kind of content that gets me rock hard.
OK, but it's meant to be a SOFTware approach
@@DrMattRegan Just gotta figure out how to hook up my floppy!
Cool video, though I’m not sure I entirely understand. Is that starman frame-buffer stored in the EPROM? Or is it coming live from somewhere else?
Wonder what it’s like to bit-bang DVI, considering most non-CRT monitors have digital circuitry in them anyhow.
Yep, Starman is in the same EPROM, but it's really only there for testing. It's the address lines that are useful. If you feed them into an SRAM instead, you can update the image.
HDMI has a different format, but pixels are still sent one at a time.
@@DrMattRegan ah so I did understand it properly. Though along with your frame buffer you’d need to fill the RAM with the first few bits of the “next address”. I can see why Ben Eater used combinatorial logic and counters to control the blanking periods, since the majority of the work is just “address++” anyhow. Synchronising the timers did look a bit of a pain though.
A compromise method might be to use a simple binary counter for incrementing pixels and a second binary counter for incrementing memory addresses through RAM, and an EEPROM to look at that pixel address and output signals for V and H blanking (where the RAM address isn’t incremented) and for resetting the counters. That way you can store memory much more densely, and get 3*5-bit colour.
The RAM shouldn't need addresses, just the ROM. It might also make sense to have a divide by 16 counter in front of the EPROM. I did wire up a counter chain in the apple 2 wire-by-wire series, but it is a pain.
When it's a green monochrome monitor it's referred to as a Rasta beam.
Well, given the design that you gave, the memory structure is a linked list. I'm sure you used it to simplify the writing of the video memory on the CPU side by using row, column addressing, which is convenient for addressing pixels. To make it contiguous, you'd have to set the pointer of the "next address" portion to the next address. Now, if you were not including the H-Sync and V-Sync as part of the contiguous region, I would imagine you'd need to have an additional counter, either as a circuit or some sort of "microcode" that would drive the syncs and possibly the front and back porches via run-length encoding, which would take up less memory. If doing that, I'd probably eschew the linked list pointer and pack 2, 2, 2, 1, 1 for R,G,B,H,V, as 8 bits. I imagine you could use the lower ROM for the actual video memory, later to be replaced with RAM. Then might could treat the other ROM as a state machine to do the counter, which while still practically the linked-list in practice, wouldn't be used as addressable to CPU. With that sort of design, might could have multiple ROMs that represent different "video modes" and use some sort of I/O address on the bus to select which ROM via a pair of latches to instantly change the resolution. Either way, would need to be careful of only updating lines that aren't scheduled for immediate render since it could create artifacts (I assume the cause of the CGA render bugs from my childhood); although, I suppose it could be a "feature" if exploited for controlled special effects. Speaking of latching, might could have multiple resistor banks coming off the R,G,B links and use another I/O to swap them using the lines currently tied to ground so you can vary the palette a bit and a few AND gates (at least in theory). I hope my thoughts on it were in the ballpark.
Hmmm... that is pretty complicated, but yes it acts like a linked list. The image is there mainly to test to see if it works, the address is what you are interested in. I would put all the active video below a certain address and the inactive video above that address say 0x80000. At the end of the active area you jump to an address above 0x80000 then go back to the low address for the next active region of a scan line, starting from where you left off.
@@DrMattRegan I think I was overthinking it for the moment as I was trying to account for different video modes and possible different palettes. I think I recall you doing something similar now for the longer video series (I shall have to go back and watch again). I will take the opportunity to ask a dumb question, is my convoluted thought actually feasible to have one bank containing the CPU addressed memory seen as 16k, perhaps as bytes, but then have the parallel linked list that maps the next address? I'm working under the idea that the software would not need to concern itself with that detail and be able to directly write the color information on the opposite clock state as the CPU.
Yes i think what you are saying is correct, you basically treat the video address that points into video memory as another attribute, like i'm currently treating RGB. It would certainly work. With the Apple 2 and ZX speccy, they are essentially 1-bit per pixel (although the speccy has an extra 1/8th bit per pixel), and with some clever re-arrangement, i can get the address space down to 64 K. That will leave several bits for mode selection. I would probably use an octal flip-flop triggered by V-SYNC to change the mode bits.
Neat! Now I wanna try something like this too
You should! If you want the raw data for the EPROMs let me know, i'll post it on GitHub
That EPROM isn’t available / in stock is it? I looked for some DIP EEPROMs and the ones available ( X28C512P-12 ) are pretty low capacity (64K x 8 bit) and slow (120 ns). Also I was starting to wonder how I’d build something that can support higher resolutions. Maybe some kind of RAM instead and you’d copy the ROM to RAM (with a CPU?) at startup to get the video working. Or using multiple EEPROMs and somehow divide the clock between them! I have an FPGA (Digilent Arty, with VGA PMOD hooked up to it) and it can put out 1080p VGA signal (148.5 MHz pixel clock), maybe even higher. But I hate the tool chain and dependencies involved (Vivado in my case, for Xilinx chips).
You can find the 27c322 pretty easily on ebay. All of the constants in the tinyVGA horizontal specifications are divisible by 16, so you can put a divide by 16 counter for the least significant bits. This will reduce the EPROM size by a factor of 16.
To go to a higher resolution such as 1080p i think you'll have to persist with the FPGAs unfortunately. You'll have a hard time getting through hole technology with 7400 series logic to go that fast.
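The divide-by-16 point can be checked directly. This is a small sketch using the 640x480@60 horizontal figures as listed on tinyVGA; the constant and function names are made up for the example:

```cpp
// 640x480@60 horizontal timing in pixels, as listed on tinyVGA.
constexpr int kVisible    = 640;
constexpr int kFrontPorch = 16;
constexpr int kSyncPulse  = 96;
constexpr int kBackPorch  = 48;
constexpr int kLineTotal  = kVisible + kFrontPorch + kSyncPulse + kBackPorch;  // 800

constexpr bool divisibleBy16(int n) { return n % 16 == 0; }

// Horizontal states the ROM must distinguish once a prescale counter
// supplies the low bits of the column.
constexpr int horizontalStates(int prescale) { return kLineTotal / prescale; }
```

Every boundary falls on a multiple of 16, so with a divide-by-16 counter the ROM only has to sequence 800 / 16 = 50 horizontal states per line instead of 800.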
This was very cool. Thank you so much!
Glad you liked it!
Since the whole frame is one big linked list, the words can be in any arbitrary order, including contiguous. The chosen order just makes it simple to generate the data.
Correct, when i use it in the Apple 2 wire-by-wire series, i adjust the order to match the memory order that Woz used.
Seems like a huge pain in the ass to generate NTSC with code. I still use an old 80s B&W television set that does all of this, purely analog, with a few handfuls of components and some good old fashioned high voltage to watch my old black and white movies. Sometimes nothing beats simplicity.
NTSC is more painful than VGA. I have a video on it for the Apple 2 ua-cam.com/video/yQhCEc0dCsI/v-deo.html
I wonder what it would look like if you erased it while it was displaying an image
As soon as the first bit of addressing goes, you'll probably lose the regular HSYNC, so it would scramble or the monitor would drop the image. Just a guess.
Someone tested exactly the above. They filled the EPROM with all 1's (or zeros, whatever) and displayed the bit map. Then they put the chip under a (back then) UV eraser light and watched the chip data "erase" on screen.
Today you could use a UV Laser pen to erase sections selectively and watch it erase on screen!
@@joeteejoetee do you have a link? Sounds interesting :D
@@LunaWuna The engineer is named Jens Schönfeld. He designs retro products for the Amiga and other computers. I searched using [ jens schönfeld amiga eprom ] but it did not find anything. He told me&others of his experience doing the above at an Amiga Computer festival over 20 years ago. Sorry - I don't have his new email address...
Was just in my recommended 👍
Your channel is 10/10
Seems like using an EPROM for the address rather than a counter is expensive. You could probably bit bang out the signal using a Cortex-M0 nowadays. I did a greyscale NTSC video overlay for drones years ago using an M0. It ran fine.
If you have a look at this playlist
ua-cam.com/play/PLjQDRjQfW-84aOLT33kzoZghRofK-uL1F.html
(90 mins watch time) it may be a little more apparent why use an EPROM. Let me know what you think.
@@DrMattRegan It seems to work like the 'math box' in the Atari Battlezone coin-op used to transform vertices.
I am not sure how you are going to take this static image burned to ROM and turn it into a programmable image without 3*640*480 = 921K bits worth of memory. I ended up using tiles of 8x8 pixels, like early computers did, to reduce the memory footprint. Here is a flight demo of the M0+ OSD project. ua-cam.com/video/MtMT04ecjL4/v-deo.html
@@moddaudio the image is just there as a test. I’m planning to use it with the Turing6502 to make an Apple 2 etc. This already uses the 27c322, so I can squeeze the raster generator into the same ROM as the FSA for the CPU
Vga? From EEPROM?
Ben Eater did this, it was pretty interesting 👍
I believe his image was in an EEPROM, but he used a counter for raster sequence generation
Is the integration video out yet? I tried to find it on YT, but didn't see it.
No but we're getting close. Once i debug the video code in the current ZX Spectrum build, i'll integrate the Turing Z80 with the spectrum video EPROM
Darn. You're my hero!
Glad you enjoy it.
Hello Dr. Regan. It seems like the 27c322 has been discontinued for some time. Do you know where to get it or any other good replacement for the memory chip? Thank you very much
They are still available on ebay. The tricky thing is the speed, these are 50ns parts and you will probably want to use something in that category. SRAM is an option, but you will need a microcontroller to program it on power-up.
If you can handle surface mount there are parallel NOR flash chips which would achieve the same function.
4:44 I don't think I could understand it clearly, but why set the horizontal sync, and the vertical sync at the INactive region instead of simply setting them at the *active* region?
They control where the image is located on the screen. If you put them in the middle of the active region the image would appear in the corners and there would be a big blank horizontal and vertical bar in the middle of the screen!
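To make the placement concrete, here is a small sketch that labels each column of one 640x480@60 scan line, assuming the usual 640 / 16 / 96 / 48 split (visible, front porch, sync, back porch). The enum and function names are made up for the example:

```cpp
enum class HRegion { Active, FrontPorch, Sync, BackPorch };

// Classify a column (0..799) of one 640x480@60 scan line.
constexpr HRegion classifyColumn(int col) {
    return col < 640 ? HRegion::Active       // visible pixels
         : col < 656 ? HRegion::FrontPorch   // 16 blank pixels before the pulse
         : col < 752 ? HRegion::Sync         // 96-pixel sync pulse
                     : HRegion::BackPorch;   // 48 blank pixels after the pulse
}
```

The sync pulse sits entirely inside the blanked columns, with a porch on either side, so the monitor can use it as a position reference without it cutting through the picture.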
Excellent!
Glad you liked it!
How feasible would it be to translate this model to DVI? I guess the speeds required would require much faster parts.
Not really sure. Probably tricky with finite state machine approach.
Very very nice!!!
Thank you! Cheers!
Awesome video.
Glad you enjoyed it
I’ve gone through comments and no one’s mentioned the obvious mistake at 4:21: “Of this 1K address space…” Obviously 20 address lines is at least a 1 Meg address space, each row being 1 KBytes, for 1024 rows.
Correct, it is meant to be 1 meg. Unfortunately, no matter how many times i review a video, some things slip through. Maybe one day i can afford an editor.
Glad you liked it. If you're a Ben Eater fan you might like this ua-cam.com/play/PLjQDRjQfW-85S5QkX8wZbkqichM6TLYYt.html playlist
oh ... good idea.
Many many thanks
Why did you leave NVIDIA and what are you doing now?
Nvidia was a great company to work for. I actually left to go to medical school, so it was a career change. I now work in medicine.
@@DrMattRegan That is one amazing career path! Thanks for sharing.
Computer architecture has gone back to being a hobby, but i'm really enjoying it again.
@@DrMattRegan I certainly appreciate the effort in the series. I am primarily a software developer, but between the series on this channel and Ben Eater's I've learned a lot.
Actually i'm targeting these videos as low level hardware from a software perspective, so i often use software to explain what i'm doing.
I've run into a lot of people who are expert level software developers and who know it so well they are a little bored, so they want to step outside their domain a little and know a bit more about what happens under the hood and get closer to the silicon.
7:51 with one more eprom could you then get 8 bits per colour ?
Haha, yes i have some Video DACs, but no 44 pin plcc sockets.
💥👍💥
Hmmmm. Could I get a motherboard with no CPU to do video out by violating the BIOS chip..? Need to look into what exactly requires the cpu, and what it can do without one. Curious if a modern chipset could be powerful enough to run doom. I feel like it’s a necessity for a good CMPE undergrad to get doom running on something new and I dont think anyone’s done an empty motherboard yet.
NVidia requires a CPU to configure the chip. Only custom older designs work without a CPU. Even those that use the 6845 require a CPU to initialize it.
❤❤❤❤❤
Glad you like it!
Great!!👏👏
Thank you! Cheers!
@@DrMattRegan 👍👍
Awesome
Glad you like it.
C128 80 column, any chance? NTSC RGBI
Haha, not for now.. I'm trying to get a ZX spectrum and an Apple 2e machine working.
nice job!
Thank you! Cheers!
So I guess, for "changeable" video, the data should come from RAM, but the pointer and all the other stuff can still come from a little EEPROM? I mean the whole thing can come from RAM, but that sounds a bit wasteful.
Very interesting idea!
Yep. Ultimately the eprom is just to generate the address. Any guesses as to why I use an eprom instead of regular counters?
@@DrMattRegan I feel the most gain is less parts and using that free timing of eeprom. But maybe also less latency? I can maybe imagine more functionality too.
Am I missing something here... okay, yes, I am.... An Eeprom stores the image and the code... but what's running the code? Where's the processor ??? And then, the Eeprom's binary output is just dumping right out to those HC whatever chips which turn logic level into something VGA can use ?
Well, the EPROM is storing the horizontal count (and vertical count too). No external count. Imagine the output of an adder connected to a set of flip flops, then the output of the flip flops goes back to one adder input and the number one is the other adder input. Can you see how that would count up? Well, the adder is combinatorial logic, so we can replace it with an EPROM.
@DrMattRegan thanks for replying but i'm still not getting it. Maybe I have always misunderstood how an eprom works. You have an address bus, a data bus and r/w pins to pull or push data so how is that running code ? I just watched the ben eater video which has a lot of detail and i understood that.. but his counter is done on external chips so the eprom is just there to output a byte for the display bitmap from an address given by the external counter but you just have eeproms and no counter chips. Code with no processor makes no sense to me.. like a recipe without a chef :-) Sorry if I'm frustrating you..I'm frustrating me too :-)
@@jacklewis100 close. The eprom contains the data to be displayed and the address of the next pixel. On the positive edge of clock the next pixel address is latched into the 374s and this becomes the current pixel address. Then we do another look up. For all addresses outside the scan, we count up until we hit a location which is part of the raster count (I could have also set the next pixel location to be 0). That way it doesn’t matter what the flip-flops hold on power up.
@@DrMattRegan ahh... the penny just dropped... or rather my brain took a wrong turn earlier on so that confused everything thereafter... I've retraced my brain steps now. The code was confusing me... i couldn't understand how code was being executed but now I realise the code was run on a pc just to generate the eeprom code! Duh! Brilliant. Thank you.
@@jacklewis100 excellent. Glad you stuck it out. The official name for the structure is a finite state machine. In this case, the state is the screen location.
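The "next pixel address" idea from this thread can be sketched as a function that computes what would be stored in the next-address field of each ROM word. This is an illustration under assumptions, not the generator code from the video: it assumes rows are 1024 words apart in a 1M-word ROM, that blanking shares the same rows (800 columns by 525 lines in total), and the names `Raster`, `nextAddress` and `clocksToReach` are made up for the example.

```cpp
#include <cstdint>

// Hypothetical geometry: rows are 2^strideBits words apart in a
// 2^addrBits-word ROM; only cols x rows of those words form the raster.
struct Raster {
    std::uint32_t cols;        // total columns per line, including blanking
    std::uint32_t rows;        // total lines per frame, including blanking
    std::uint32_t strideBits;  // log2 of the address distance between rows
    std::uint32_t addrBits;    // number of address lines
};

// The "next address" field that would be stored at addr.
constexpr std::uint32_t nextAddress(const Raster& g, std::uint32_t addr) {
    const std::uint32_t mask = (1u << g.addrBits) - 1;
    const std::uint32_t col = addr & ((1u << g.strideBits) - 1);
    const std::uint32_t row = addr >> g.strideBits;
    if (col >= g.cols || row >= g.rows)
        return (addr + 1) & mask;                            // outside the scan: count up
    if (col + 1 < g.cols) return addr + 1;                   // next pixel on this line
    if (row + 1 < g.rows) return (row + 1) << g.strideBits;  // start of the next line
    return 0;                                                // end of frame
}

// Each clock edge latches the next address; count edges until target is reached.
constexpr int clocksToReach(const Raster& g, std::uint32_t start, std::uint32_t target) {
    int n = 0;
    std::uint32_t a = start;
    do { a = nextAddress(g, a); ++n; } while (a != target);
    return n;
}

constexpr Raster kVga{800, 525, 10, 20};  // 640x480@60 totals in a 1M-word ROM
constexpr Raster kToy{10, 6, 4, 7};       // tiny raster for experimenting
```

In the toy raster a full frame takes 10 x 6 = 60 clocks to return to address 0, and a power-up address outside the scan (for example 127) rejoins the raster by counting up, which is why the initial flip-flop contents do not matter.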
Cool !!!!!!
Question. Why does the order of D0-D15 matter? If you write the data in any decent order, you should get the data out the same way. It's almost the same for the address lines, but since each address line could enable a bank, the reading delay won't be the same. This is very important for DRAMs since their addresses are multiplexed. Any reasonable arguments?
You are right, from an electrical perspective you can mix up the address and data lines as much as you want on these EPROMs, it's 50ns access no matter what.
The problem with the 27C322 is if you write LSByte then MSByte into the file that you program the EPROM with, it will come out on the pins as D0 : D8 : D1 : D9 : D2 : D10 etc...
Now you can wire up the 27C322 correctly into your circuit so that this would not matter.
The difference comes at debug time when you are running a logic probe up and down these pins.
It's just so much easier if the pin ordering is contiguous.
Before i started using this swizzle routine, i just found it hard to read off hexadecimal values with the logic probe. So i do take advantage of the fact that i can mix up the data lines without penalty.
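The actual swizzle routine isn't shown in the thread, so the following is only a sketch of what such a bit permutation could look like. It assumes the pin order described above (D0, D8, D1, D9, ... D7, D15 along the package); the function names are made up for the example:

```cpp
#include <cstdint>

// Assumed physical pin order along the package: D0, D8, D1, D9, ... D7, D15.
// pinToDataBit(k) is the chip data bit found at the k-th pin in that order.
constexpr unsigned pinToDataBit(unsigned k) {
    return (k % 2 == 0) ? k / 2 : 8 + k / 2;
}

// Rearrange a word so that logical bit k appears on the k-th pin.
constexpr std::uint16_t swizzle(std::uint16_t logical) {
    std::uint16_t chip = 0;
    for (unsigned k = 0; k < 16; ++k)
        if (logical & (1u << k))
            chip |= static_cast<std::uint16_t>(1u << pinToDataBit(k));
    return chip;
}

// Inverse mapping, useful for checking a dump read back from the chip.
constexpr std::uint16_t unswizzle(std::uint16_t chip) {
    std::uint16_t logical = 0;
    for (unsigned k = 0; k < 16; ++k)
        if (chip & (1u << pinToDataBit(k)))
            logical |= static_cast<std::uint16_t>(1u << k);
    return logical;
}
```

Applying `swizzle` to every word before programming would make a logic probe walked along the pins read bits 0 to 15 in order, at the cost of the file no longer matching the logical values byte for byte.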
So, using sram instead of eeprom here and you have a display adapter :)
Correct. The image in the EPROM is just for testing.
have you ever considered making a low resolution HDMI circuit or is that not possible with these parts?
In theory yes, but I think the minimum HDMI frequency is 125MHz from memory, so you'd need much faster parts.
I read that the lowest data rate for HDMI was 250 MHz. You would need to provide 4x balanced 2.5V logic signals and the signals are 8b10b encoded from the VGA signals. It is then up to the monitor as to which screen resolution and refresh rate is supported. Also HDMI is a licensed technology, so maybe digital video via DisplayPort instead. Anyway I think using an EPROM is a novel approach to this problem.
Very good points. I think i would use an existing solution for HDMI
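A rough check on the figure quoted above, assuming the usual TMDS arrangement in which each data channel carries 10 transmitted bits per pixel clock. The function name is made up for the example:

```cpp
// Per-channel serial bit rate in Mbit/s for a pixel clock given in MHz,
// assuming 10 transmitted bits per pixel clock on each data channel.
constexpr double tmdsBitRateMbps(double pixelClockMHz) {
    return pixelClockMHz * 10.0;
}
```

At a 25 MHz pixel clock this gives 250 Mbit/s per channel, which is consistent with the 250 MHz mentioned above and far beyond what 50 ns parts can sequence bit by bit.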
well hello...
Cheers
How much bitcoin can it mine?
Haha, interestingly, i had thought of writing a bitcoin miner for this machine ua-cam.com/video/7hP4BTWvrGw/v-deo.html
Ben eater did something similar
James Sharman's series is worth watching too.
@@DrMattRegan I'll look into him
Why don't just use a counter? 😅
Good question.
I did use a normal counter here ua-cam.com/video/qbzzzkNPICI/v-deo.html
But watch this video ua-cam.com/video/mPkAgXJOoSc/v-deo.html
to see why i might use a Finite State Machine as a raster generator
hm, if you used static ram instead of a rom, could add in sprites, cursors , text data etc by using extra memory for sprite and char lookup data.
would also give you a direct way to update image data. if you offset the address bus via a secondary 'rom' lookup, text or image scrolling
RAM instead of ROM also gives you a way to have a BIOS type init to change screen resolution etc on-the-fly.
chatGPT and similar AI tools could use this method too
Yeah, ultimately there will be SRAM for image storage. The ROM image is just for debugging. I'm planning on doing some more with the design including 3D rotation.
I watched the Ben Eater vids before I watched yours. After that it makes me wonder why the Apple ][ and C=64 never did this back then.
I remember VGA graphics being easy to program because it was just a linear array. No special addressing modes needed.
See the Graphics Gems books on how to do basics like DDA line drawing and fast fills etc., way before graphics-specific GPUs.
these new tricks to old hardware could have jumpstarted audio specific cards too (dma memory from disk, sound samples)
It is a novel approach. Clearly quite simple and flexible, but also wasteful (~90% of memory is either unused or wasted for next address values). Still, its simplicity is why it is nice.
Using binary counters and comparators for row and column, plus an adder (if the number of columns is not a power of 2) to fit it into a contiguous region, with no next addresses, would be easy too tho. Another option is to still have an EPROM for generating control signals, but with the RGB data stored separately, like James Sharman's 8-bit computer.
Yes, admittedly wasteful. My aim is for someone not too familiar with 7400 series logic to look at it and say "Oh, is that all that's happening"
These big EPROMs are actually very cheap and kind of fun. You can do things in unusual ways. If you haven't seen it already you might want to look at ua-cam.com/play/PLjQDRjQfW-84aOLT33kzoZghRofK-uL1F.html for some more large EPROM madness. Don't forget to watch the last 5 minutes of the 3rd (and final) video in the playlist.
So many loops