When I saw this ASIC project idea, my first thought was "I wonder if I could make a VGA CRTC and a triangle rasterizer", and you go and just stick a full 3D engine in a chip...
Hehe, well I'm proud of what I accomplished, but don't be intimidated: I'm cheating and not rendering triangles/polygons... such is the beauty of this classic ray casting algorithm. However, someone else did this submission to TT05 which IS a triangle rasterizer: tinytapeout.com/runs/tt05/390/ -- There's the possibility this could be combined with what my design does, and because FPGAs/ASICs can put hardware in parallel, these two could cooperate quite well, I suspect, if extended to support more parametric control.
@@fooglestuff ... I was wondering how you'd get the framebuffer or buffers to all synchronize... is it possible to run multiple of these on-chip projects together? It'd be neat if they had their own high speed bus, one could be a QVGA framebuffer and another could be a triangle rasterizer and another could be a sprite engine etc.
It seems to me from the projects I've seen that the device isn't that dynamic, but running multiple chips could possibly work.
The information seems to get into the (asic -> custom die) via a handful of GPIOs, so I wonder if an architecture similar to the original NES PPU but serialized could work well... actually now that I'm rationalizing that in my head, it would just look like a serialized OpenGL... tell it to clear a buffer, set up transforms, shove triangles into it... etc..
hmmmmmmmmmm - if only there was time in the world to do this sorta stuff!!
2:23 LOL, an SI banana from NIST
That made me audibly laugh out loud, thanks for that 😂
Haha, thanks for commenting. I'm glad someone got the nuances of that :)
I'm jealous. I have to make do with uncertified bananas.
@@SnakebitSTI that sounds like a T-shirt waiting to happen.
Great work! You should have a look at the MiSTer FPGA project to see if you are interested there. The project is a hardware FPGA that recreates many PCs and gaming consoles. Your project might work well hand in hand with it.
I have never come across a software FPGA, have you?
Maybe at some point I'll get a DE10 and MiSTer, but for now I haven't used more than about 25% of the resources of my humble DE0-Nano :) I'm even thinking of trying a smaller FPGA (one of the cheap Lattice-based boards perhaps) to see how well my design still works with more constraints. There is still an appeal, however, in trying to make the design function as though it is a device used by a retro PC or gaming console (e.g. akin to the Super FX chip).
Really enjoy these low-level embedded electronics designs; it's amazing we can contribute to our own ASICs these days. Hats off to you for your effort!
Thank you so much! I've enjoyed watching other people's low-level designs and similar labours of love over the years, so I'm glad to be able to offer one of my own (which will have newer versions with more features in the future). And yes, I agree that this is amazing stuff to be able to grasp now. I find it quite addictive, as my prior video about TT02 mentions in the conclusion ;)
This is such a cool idea! I've given some thought to how to do a simple GPU for 3D in an FPGA before but never thought of doing something like this! Nice work!
Thank you! I'd like to see a wide variety of techniques that people can get to fit in constrained hardware, so if you do attempt something then be sure to let me know.
Algorithm jackpot! You are such a joy to listen to.
Thank you! I'll try to keep it up :)
Excellent video, good pace, packed with technical details, nice visuals and great production
Thank you! Hopefully it won't be too long before I can share some more, including newer revisions of this design.
I really like how simple it is to implement and use VGA. I wonder if there would be enough shared interest to place shared framebuffer/VGA driver logic on the chip that all designs could have access to
I think there would be, and in fact for things besides video projects, but the challenge is routing a bunch of extra shared signals for all of the Tiny Tapeout tiles… I’m sure TT would support more IOs too if it wasn’t for the routing room required. Maybe there is a neat compromise though… hmm…
Amazing how I searched for this topic (kinda like it) and found out you are the only one who made a video about it lol
I love seeing stuff like this myself so I was surprised and disappointed I didn't see too much on hardware ray casting (though there is a little bit) -- but that encouraged me to push this design and share it. I'm glad to see there are people like me who enjoy the same thing :)
Great presentation. I independently pursued a different Comanche-style ray-caster (and also planned on turning the monitor), but I wanted textures and bailed (for now) as I couldn’t compress it enough to fit. I made a little triangle renderer instead for TT05.
I look forward to seeing where you take this.
Thanks Tommy! I came across your TT05 triangle renderer, and I'm pretty keen to see it in action! I suspect your method and this classic ray caster could also be combined to make a more-sophisticated (and more "game ready") design. I'd love to see you get along further with the Comanche-style renderer; it's another novel concept.
This is essentially a video game on a chip!
I'd definitely like to think so :) Or at least, it's the beginnings of one. The "game" is more on whatever controlling CPU/MCU you decide to hook up to it, but future versions might include more game logic (or even simply a future TT run might be an implementation of game logic to drive this device).
Very cool! I tried to experiment with something similar, but ran out of time. Then I tried switching to a beam-racing raster renderer on a full Caravel, but also ran out of time.
I know the feeling! Always could use more time. I hope you do manage to get something put together though, even if you don't submit it. While the Google-sponsored Open MPWs look like they might be over, that's where TT continues to provide ways we can access this sort of tech and manufacturing. So... hang in there! Give it another shot. And if you do, please share :)
Wow! Neat project. Hope it comes back from the fab and works as intended.
Hopefully I won't have to wait too much longer. Even if it doesn't work as intended (which is possible -- there's lots of variability in the synthesis and characteristics that I hadn't accounted for very well in this first version), there's a lot to be discovered from trying to debug it. Stay tuned!
Great video Anton!
Thanks Matt! Couldn’t have got there without you :)
Do you know the projects FlexGripPlus and GPLGPU? They could be a good set for a GPU.
Wow! I am happily your 995th subscriber.
Congrats! I’m proud to now be over 1,000 but I won’t forget your contribution :) Hope you like what’s coming up soon!
Would be awesome to try it on a CRT. 60 fps @ 60 Hz will look so much better in motion: 16 times lower motion blur than a 60 Hz sample-and-hold LCD or OLED, in fact.
Oh, I can't believe I didn't think of that, especially since my earlier videos are about an IBM PS/1 restoration, and that machine includes an ideal VGA CRT monitor :) Thanks for the idea! I'll try this in a future video.
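(For anyone wondering about that 16x figure: a 60 Hz sample-and-hold LCD/OLED displays each frame for the full ~16.7 ms, while a CRT phosphor only glows for roughly a millisecond before fading, so the eye-tracking motion blur ends up on the order of 16 times lower.)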
Very cool!
Thank you! Glad you liked it :)
640x480 VGA with 16 colors? Literally the way that GOD intended. RIP Terry. It would be nice if you could run TempleOS on this hardware, as it would be a very nice add-on to this open-source OS.
Oh I forgot about TempleOS. Well, this is probably in the same spirit of the "one person's crazy devotion", but to be clear, this design is more of a peripheral that I suppose TempleOS (or one of the forks?) could be adapted to use, rather than run on... and also it actually supports up to 64 colours so I'm not sure Terry would be happy about that :)
❤
Interesting project! A dream come true for any FPGA hobbyist. How much better is ASIC performance than FPGA? Let's say, if we use the same clock speed, is the worst negative slack much more forgiving in an ASIC?
Sometimes the YouTube algorithm hits gold. And just when I bought a Basys 3 a couple of weeks ago.
Do you have any books or tutorials to recommend?
Glad to have you along! If you happen to try this design on your Basys 3, I'd love to hear about your results (good or bad). I'd be open to a GitHub pull request for targeting a Basys 3 (or any Xilinx). See the "de0nano" directory for an FPGA devboard target as an example. As for your request for recommendations...
* See the "Raycasting" heading of: lodev.org/cgtutor/index.html
* Game Engine Black Books by Fabien Sanglard: fabiensanglard.net/gebb
* 8bitworkshop (inc. Steven Hugg's Verilog book): 8bitworkshop.com/
* JS ray casting tutorial: github.com/vinibiavatti1/RayCastingTutorial/wiki
* OLD ray casting tutorial; still interesting: permadi.com/1996/05/ray-casting-tutorial-table-of-contents/
I don't think Bresenham would work for spheres, but ray marching might work?
I was thinking more that if you wanted to render a circle instead of a flat (square) wall, then you could track the advancing of the algorithm with each line -- though it is complicated by the fact that not necessarily all lines of a circle would be rendered. I haven't thought it through completely :) It's also interesting, though, that you can adapt the algorithm to effectively support rendering of "columns" (horizontally-oriented cylinders) instead of blocks. I haven't tried this, but check out: lodev.org/cgtutor/raycasting4.html#Shapes
As for ray marching, I would like to try more sophisticated algorithms like this but I suspect this will require much larger TT area than I can currently afford :) Nevertheless, I will try pushing what I can already do, and look for further optimisations and hacks.
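For anyone curious what "the algorithm" actually looks like in software, here's a rough C++ sketch of the classic grid-DDA ray step (in the style of the lodev tutorial mentioned above). The names and the fixed map size are just illustrative; this is not actual raybox-zero code:

```cpp
#include <cmath>
#include <cstdio>

const int MAP_W = 16, MAP_H = 16;
int worldMap[MAP_W][MAP_H]; // 0 = empty cell, nonzero = wall type

// Cast one ray from (posX,posY) along (dirX,dirY); assumes dirX/dirY nonzero.
// Returns the perpendicular wall distance; the on-screen wall column height
// is then simply screenHeight / distance (the perspective divide).
double castRay(double posX, double posY, double dirX, double dirY) {
    int mapX = (int)posX, mapY = (int)posY;
    double deltaX = std::fabs(1.0 / dirX), deltaY = std::fabs(1.0 / dirY);
    int stepX = (dirX < 0) ? -1 : 1, stepY = (dirY < 0) ? -1 : 1;
    double sideX = ((dirX < 0) ? (posX - mapX) : (mapX + 1.0 - posX)) * deltaX;
    double sideY = ((dirY < 0) ? (posY - mapY) : (mapY + 1.0 - posY)) * deltaY;
    int side = 0;
    // March cell by cell until we leave the map or hit a wall:
    while (mapX >= 0 && mapX < MAP_W && mapY >= 0 && mapY < MAP_H
           && worldMap[mapX][mapY] == 0) {
        if (sideX < sideY) { sideX += deltaX; mapX += stepX; side = 0; }
        else               { sideY += deltaY; mapY += stepY; side = 1; }
    }
    return (side == 0) ? (sideX - deltaX) : (sideY - deltaY);
}

int main() {
    worldMap[8][4] = 1;                       // drop in one wall block
    double d = castRay(2.5, 4.5, 1.0, 0.001); // ray heading roughly +X
    std::printf("distance=%f column height=%f\n", d, 480.0 / d);
}
```

That final perspective divide is also why the "height" dimension is the "fake" one, as discussed elsewhere in these comments.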
@@fooglestuff I haven't actually coded up any of this stuff except for the Bresenham line and circle, but I've read and watched quite a lot. Ray marching is what the shader guys use. I can't see how a ray technique would mesh with a Bresenham technique, which is to do with pixels/cells. I think vertical cylinders should be easyish from the point you've achieved so far!
I suppose you'd get a feel for how much TT you'd need by playing with it in software?
Thanks for this, it's real cute and got a lot of novel potential.
I have an RPi 4+ and dev tools on Mac, Linux and Windows... what would be the best method to get this running as close to all-hardware as possible?
I don't have access to a Pico, and even at their incredible value, I can't afford one. If I could somehow make this into the OS on the Pi, like the Amiga PiStorm where it runs on the metal, then that'd be great, but I'm not real familiar with doing my own electronics *yet*. Still, every episode of Adrian's Digital Basement and Jan Beta teaches me something new!
Is there a way to contact you about a very specific idea that this would be an incredible boon for, and potentially something that would add value to the lives of so many people?
Don't get me wrong, I won't hog the idea; it needs must be FOSS because of its scope, but I'd rather just seem ignorant instead of opening my klep and proving it at first 😀
Thanks again, gonna keep a good eye on this!
Now do a Duke3D polygonal sector rasterizer! Then it's on to Quake!!!!
I'll get right on it! As I always say... sleep and a day job and regular adherence to 24-hour days are for the weak ;) In all seriousness, I will try to push this further as I can, but I encourage others with more time and expertise to perhaps learn from my approach, but then create their own approach in an HDL of choice. Use Tiny Tapeout as a way to get any such designs made if you like, but there's still always the fast prototyping pathway using cheap FPGAs or even C(++) simulation as I've done with Verilator.
Amazing project and report. Could you share with us how you get the SPI in and VGA out of the Tiny Tapeout? I plan to port your code to an ice-pico board, which has an RP2040 and an iCE40UP5K.
Sounds great! I'd be happy to consider a pull request also if you get it to work. I'd also suggest you open a GitHub issue on the repo to request more information and I can get back to you with extra detail and Q&A later. For starters, the version featured in this video is tagged "1.0" github.com/algofoogle/raybox-zero/tree/1.0 - the "main" branch is newer and fine to work from too but be aware it has an extra SPI controller for accessing an external texture memory. You don't necessarily need to worry about the Tiny Tapeout interface when porting (though I can give you pointers on that if you'd like, too) -- instead, the "de0nano" subdirectory includes a Verilog file for wrapping the main module ("rbzero") to target an FPGA board (in this case my DE0-Nano) so that's probably a good place to see what the interfaces are. If it helps, SPI is not essential to see the design in action, so you can do that after getting the basic VGA output going. Please keep me posted!
NOTE: I edited my reply to correct the URL, so it no longer goes to a 404 :)
@@fooglestuff I only found RGB 5:0. So EGA? S-Video would be funny if we could use a 10 MHz color carrier and a mixer on the PCB to get to VGA.
@@ArneChristianRosenfeldt I'd love to try NTSC or PAL output using the analog capabilities of TT06+, but even putting the VGA DAC in the chip would be great... i.e. each of the R, G, and B channels having a single output but at least 4-bit resolution.
I was referring to the extreme-constraint pin count configurations on GitHub. I have the feeling that 8-bit R2R DACs are no problem for the process in the fab. Amiga had that for sound. I love how RGB applies barebones to both CRT electron guns and LCD subpixels. Actually, I like ClearType. Maybe we could build a sprite engine which respects subpixel order after rota-zoom of sprites/textures. I learned that phones and Windows apply ClearType while they write to the framebuffer. This breaks on screen rotation and with screenshots. I wonder if upscalers work this way (for and in TVs).

Composite would mostly be for fun. We could write "composite", but then show an HD output. Another LoL project would be EGA with PWM to create a lot of colors. Would be cool if, while racing the beam, we accumulate photons from pixels in the framebuffer, and then with some hysteresis the EGA output goes up a bit. But then again this is weird because EGA monitors also have a DAC inside of them, which gives me less barebones access to the electron guns. Also EMI is quite high.

I also dream of a 3DO-like point cloud. I don't know how modern point clouds do it, but it looks like for solids I would have to clear the polygon as black in the framebuffer to obscure the background and can then write all the texels (with a displacement constraint to not occlude stuff). No translucency. Gives this old SEGA System32 arcade look. Then, while racing the beam in interlaced mode, points close to the scanline (weighted based on delta-y) add to the accumulator.
Maybe a tile can have stabilized power and can use PWM for audio? I still want to understand 1-bit DACs. I think they carry over the fraction from one pulse to the next. But in the end, people gave up. Sound goes digitally to the iPods or over HDMI. The actual DAC sits with the speaker. I wonder if every channel (bass, etc.) has its own.
@@fooglestuff I wonder if the Atari Jaguar audio DAC would work. They did not connect it on the PCB, but the Jerry chip did output 4 PWM channels, 2 per stereo channel. The least significant one goes through an RC circuit. Then, at the trailing edge of the main PWM, a mixer blips over to the voltage in the capacitor. Now we need a band-pass filter to compare blip voltages to one step of the main pulse length. Calibration to adjust a factor for a digital multiplier for the fraction value.
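On the 1-bit DAC question a couple of comments up: the "carry the fraction over" intuition is basically a first-order error-feedback (delta-sigma style) modulator. A tiny illustrative C++ sketch of that idea only; it's not how any particular chip mentioned here actually does it, and the names are made up:

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Convert signed 16-bit samples to a 1-bit pulse stream. The quantisation
// error of each pulse is carried into the next sample ("carry the fraction"),
// so the average pulse density tracks the signal.
std::vector<int> oneBitDac(const std::vector<int16_t>& samples) {
    std::vector<int> bits;
    int32_t error = 0; // leftover carried from pulse to pulse
    for (int16_t s : samples) {
        int32_t want = (int32_t)s + error;
        int bit = (want >= 0) ? 1 : 0;
        int32_t produced = bit ? 32767 : -32768; // what each pulse "means"
        error = want - produced;                 // carry the shortfall forward
        bits.push_back(bit);
    }
    return bits; // low-pass filter this (e.g. a simple RC) to recover audio
}

int main() {
    // A constant +25% full-scale input should give ~5 ones in every 8 pulses.
    std::vector<int16_t> in(16, 8192);
    for (int b : oneBitDac(in)) std::printf("%d", b);
    std::printf("\n");
}
```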
awesome!!!!!!!!!!!!!!!!!!!!!!!!1
Your exclamations pwn how truly serious you are, so thank you!!!one1 ;)
There seems to be a confusion between shader and texture. Ah, portrait mode like in those C64 demos.
What's the rectangle at top left with the flashing pixels?
I wanted to create my own processor and/or GPU on an FPGA, but had to abort that project when I found out it is not something a student can afford.
And since then YT *keeps* pushing me these recommendations. 😡
Nice for you that you have made something at least, I guess... 😐
How far out of your budget was e.g. a $15 tang nano 9k board?
@@0LoneTech I had set my minimum requirement at 114k LEs, which is what the 15-year-old DE2-115 I used at campus had. As a student, my limit was ~150 euros. I don't care about fancy features or interfaces. A few GPIOs and the raw LEs are all I need.
But after the cheap boards of ~$15 end, the next boards are all around 800 bucks, which isn't something I could spend on a personal hobby project...
@@blinking_dodo As a student, you could probably get that board for the academic price, currently $423. But that's really quite large; almost specifically picked to rule out midrange chips. You'd get a much better deal at e.g. LFE5UM5G-85F-EVNG, with about 85k LEs for $150. The devkit I bought as a student (after first bitbanging onto a cheap CPLD) has about 13k LEs, and it was never particularly tight.
@@blinking_dodo I had about the same goals and a similar budget way back in the day, and got a 13k LE board, which never felt particularly tight. Today you can get a Lattice based 85k board for that price, but the DE2 is over $400 even with the academic price you should get as a student.
@@0LoneTech Do you think a CPU+GPU fits inside 85k LE?
And the pricing of those DE2 boards is scuffed, as they are *barely* supported (you need the oldest software version, and even then that kit is on the border of obsolete).
All their innovation has gone into adding extra features, not price reduction, it seems.
I have read that in Tiny Tapeout the max gate count is around 1000. Could you share how many gates your Wolf3D used? Thanks 😊
A single Tiny Tapeout tile can accommodate on the order of 1000 standard cells (arguably some are more complex than a single gate), but you can buy up to 16 tiles for one design, and you get less “waste” in the margins this way. This design was about 7300 standard cells in 8 tiles, but I could’ve made it fit in 6 (a 2x3 arrangement is offered now, but it wasn’t yet in TT04).
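(For a sense of scale: 7300 cells across 8 tiles works out to roughly 7300 / 8 ≈ 910 cells per tile, which lines up with the "order of 1000" figure above.)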
@@fooglestuff Thanks a lot. Would love to see your Wolf3D on the ASIC. I have followed your git journal; detailed and awesome information. It would help a lot of other devs.
Excellent video. I'm an electronics engineer, a KiCad fan, and I love learning open-source VLSI design. If you can, I'd appreciate your help getting started.
Need help on Mesa3D implementation?
Do you mean, say, Mesa bindings to create a driver for this hardware? I don't really know Mesa, but I suspect my design is not (yet) suitable given it's not real 3D (and also the "world" itself is static). Let's see what the future holds, though! At some stage I'd like to try putting this chip (or a future one) on, say, an ISA card and actually make a PC game that uses it, and then take that idea further to the point that Mesa bindings might actually become more useful :)
@@fooglestuff Tell us when it's ready :) This could be helpful for the open-source community, who need to learn how to make a GPU from scratch.
I saw this on the Zero to ASIC channel lol
Why don't you try the Cube engine, and use external memory like parallel PSRAM? That is a very efficient, advanced software FPS engine.
What is the advantage of an ASIC vs a PLD in this case, where it's a relatively small build anyway?
For this design specifically, there isn't an advantage to using an ASIC -- it was to prove that I could do it using sky130 standard cells, and that the open-source workflow could make it work end-to-end, and prove to myself that I could get this done with very limited prior experience. In other words: for fun and learning but also the thrill of having it in real silicon that I designed. As for sky130 more generally, though, you might be interested in how it benefits designers now in the analog domain: ua-cam.com/video/ypXynGz8Heo/v-deo.html
How may I help?
That ASIC includes a tiny rick roll.
Not this one (TT04) but TT02 does:
ua-cam.com/video/QMsmkDeqELg/v-deo.html - See Bitluni’s video about that same chip: ua-cam.com/video/DdF_nzMW_i8/v-deo.html - great stuff!
Can I contact you in any way? I have so many stupid questions about that.
For now, I'd suggest you open a GitHub "Issue" on my repository, requesting documentation on the things you'd like to know: github.com/algofoogle/raybox-zero/issues -- I wouldn't consider any questions to be stupid :) Then I can better-understand what people want to know, and not only answer you, but aim to do it in a way that other people can benefit.
This is awesome and very exciting! I have a question, however: why does it seem like ASICs are so far behind modern CPUs? It'd seem like specialized hardware would make this a lot faster.
It was interesting before I looked at the course price.
You don’t need to study the Zero to ASIC course to get on board with Tiny Tapeout. Many people have learned the tools and submitted designs without it. Z2A was just MY starting point and I got a lot out of it.
@@fooglestuff oh I see!
Sorry about that, I definitely misunderstood there
each of the three dimensions is dimension-like but at least one must be fake
Sounds deep, man :) But yes, you're right... the "height" (diminishing visual distance from floor to ceiling) is fake.
@@fooglestuff The "textures" are much more fake than the height though. They're just painted on and have no texture d-: The height is diminishing by normal perspective projection, it's just that the height of everything is uniform. The height of the table I'm sitting at is also uniform so I hope it's not fake. (-:
OMG those eyes match that shirt on you! You should wear them every day, they really fit you well. guy, non gay, FYI
Not that there’s anything wrong with that! It’s half just the reflection of my monitor, btw ;)
@@fooglestuff The color looked kind of impossible, but still, funny how it matched the shirt.
And the YouTube algorithm strikes again
In a good way, I hope :)
@@fooglestuff absolutely!
Just ask Carmack for help, why not.
Let's clear something up.
FPGAs are not ASICs.
FPGAs are field programmable. ASICs are mask programmable.
If the way the design gets onto the chip is via masks being made and used in the fab plant then it's a mask programmable device and it's not an FPGA.
If the design is put into the chip when the chip is on a board, then it's field programmable.
Yep, thanks for offering that clarification. This was tested on an FPGA until I felt it met my requirements (or rather, until I ran out of time), and in parallel I synthesised/simulated it multiple times with OpenLane using sky130 standard cells, which are used to create the masks that set the design in stone (so to speak). Some designs submitted to TT are for ASIC manufacture of (simple) FPGAs (or even tiny eFPGAs as part of some larger design), and it will be interesting to see how this grows.