"It is hard to reverse engineer the design choices made during Quake's development since, as is often the case with highly optimised graphics code, John Carmack did it"
I love the Quake software renderer. I used to play QuakeWorld at 320x200 in software rendering well into the mid-2000s. I wish more people knew that QuakeWorld is still around and is an INCREDIBLE FPS. It's been updated through many open source projects. Just google it. It's a great throwback! :)
@@goqsane I didn't play that much Q1 or QW, as at that time I was still bound to a dial-up internet connection, so I couldn't waste that much money ^^. Though I played and still play a lot of Q3 / Defrag and later QuakeLive. For me id Tech3 was another big milestone. A number of big games were made with that engine (RtCW, MOHAA, JediKnight2, CoD, ...).
I owned Abrash's Black Book as a 90s kid. It was inspiring and a fun read. It wasn't dry and boring, he spiced it up with a lot of philosophy and personal ramblings. I wish I'd never gotten rid of it.
One of the things I always praise about the book when talking about it to others is these extra chunks of wisdom that aren't tied to any technology. The tech is severely obsolete and teaching just about it wouldn't be as useful, but the wisdom he shares is just so valuable, and you can get something useful out of the book instead of just reading about old tech for curiosity's sake.
@@StereoBucket I'm doing a lot of graphics programming these days, and the thing that surprises (and pleases) me is that I'm finding ways that these techniques *aren't* so obsolete as we suppose. To be specific - yes, it's all obsolete IF you're rendering polygon meshes in the conventional ways, which have progressed so far since then that there's no relation to these old algorithms. BUT, if you're doing something experimental, maybe using the GPU in ways that aren't conventional, maybe rendering things that aren't triangle meshes, then suddenly a lot of these old "tricks" begin to apply again, and what's old becomes new again. It's exciting.
⚠ WARNING: This video contains flashing images. There's a lot of details I omitted to keep things brief. For example, there's slightly different behavior if a sub-model (door, lift, button, etc) is being drawn. Also I haven't spoken about texture mapping yet --- I think that topic warrants its own video.
A great follow-up to the texture video would be to discuss what changes Carmack had to implement/omit to add hardware rendering for the 3Dfx and later GPUs.
The worldspace clipping of BSP entities against the world geometry is actually very relevant to this video, because it eliminates most z-fighting. Z-fighting can still occur between different BSP entities, as in one of the maps of the Contract Revoked mod. Texture mapping, however, should indeed be in a different video.
7:43 The thick book with the Quad sound is way funnier than it has any right to be. I love these videos so much! Your visualizations and explanations are always nice and clean. I would love to see you go more in depth with the exceptions (like you mentioned doors and buttons in a comment), but sadly I understand the video has to end at one point...
It always genuinely amazes me how much 3D game engines manage to process on each frame... and you do a fantastic job explaining it all in such an easily understandable way too with the visualisations. I'm glad I stumbled on your channel!
For every frame the CPU has to sort through 3 binary trees, then do the edge spanning, then do texture mapping, combining the texture and lightmap to determine the final color of a pixel. I'm surprised the CPUs of the day managed 20+ fps. Edit: and since the physics ran every frame instead of on a set tickrate, physics was run every frame too.
Gamedevs then: "We'll employ black magic to make a complex 3D game work on a glorified calculator" Gamedevs now: "Why does my visual novel lag when there are three character portraits on the screen?"
I have literally seen newer "boomer shooters" with all 2D sprites and basic geometry and baked lighting perform about as poorly as these first 3D FPS games did when they came out. Except on modern machines.
@Creative yeah, basically not knowing the engine deeply enough and not knowing how to optimize it well enough. When developers DO know what they're doing, the performance can be absolutely limitless
Seeing how those spans were generated from the active edge list was truly awesome. I've never seen a behind the scenes of a software renderer before, but this is the kind of stuff I love.
Just discovered your amazing channel last night and already a new video. Keep up the awesome content! Also your visualization techniques are some of the best I’ve seen. Love the attention to detail.
Cool video! It's still hard to wrap my head around BSP. As I understand it, an infinite plane is used to split the brushes into ever smaller parts. It's hard to imagine how this splitting is actually coded.
This series finally helped me understand how the BSP tree and PVS work together to make the culling tests less expensive. I had a rough idea of what the PVS was doing and what the BSP leaves were, but hadn’t wrapped my head around the recursive division of space into convex volumes of empty or solid space, how you can use the tree to quickly figure out what leaf you’re in, culling out most leaves that aren’t possible to see, testing the bounding box of a leaf against the camera frustum to completely stop tests against its child leaves. It’s all so brilliantly elegant. The span solution of projecting lines into screenspace and testing whether they’re the left or right edge of a span is insanely smart too, it seems like the kind of overcomplicated hack solution I’d try to create when I was younger… except it’s smart and it actually works. Fantastic video. I’d love to see what kind of messy trigonometry had to be done to generate the world geometry from the BSP tree in the future.
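For anyone who wants the leaf lookup described above in concrete form, here is a minimal Python sketch. It is not Quake's code: the `Node`, `Leaf`, `find_leaf` and `potentially_visible` names, the three-leaf toy tree and the dictionary-based PVS are all assumptions made for illustration (the real engine works on numbered leaves and a precomputed visibility table from the map compiler).

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    name: str  # hypothetical label; a real engine would use a leaf index


@dataclass
class Node:
    normal: tuple  # splitting plane normal (nx, ny, nz)
    dist: float    # plane satisfies dot(normal, p) == dist
    front: Union["Node", Leaf]
    back: Union["Node", Leaf]


def find_leaf(node, point):
    """Walk down from the root, choosing a side at each splitting plane."""
    while isinstance(node, Node):
        side = sum(n * p for n, p in zip(node.normal, point)) - node.dist
        node = node.front if side >= 0 else node.back
    return node


def potentially_visible(pvs, leaf):
    """Look up the precomputed set of leaves that might be seen from `leaf`."""
    return pvs.get(leaf.name, set())


# Toy map: the plane x = 0 splits the world, then y = 0 splits the east half.
tree = Node((1, 0, 0), 0.0,
            front=Node((0, 1, 0), 0.0,
                       front=Leaf("north-east"), back=Leaf("south-east")),
            back=Leaf("west"))

# Made-up visibility data standing in for the map compiler's output.
pvs = {
    "west": {"west", "south-east"},
    "south-east": {"south-east", "west", "north-east"},
    "north-east": {"north-east", "south-east"},
}

camera_leaf = find_leaf(tree, (5.0, 3.0, 0.0))
print(camera_leaf.name, sorted(potentially_visible(pvs, camera_leaf)))
```

The point of the sketch is only that the lookup costs one dot product per tree level, after which every leaf outside the returned set can be skipped before any frustum test is attempted.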
I think strategically this kind of optimization does as much in the CPU memory space as possible to reduce the amount of writing into the frame buffer that needs to be done, which was pretty slow on an unaccelerated graphics card.
Hey Matt! Thank you so much for your deep dive videos on Quake. I have been digging into Quakespasm source code for a while now and these videos give me a good overview before I go into more details by myself.
I'm such a big fan of your videos! I can only imagine how long it takes to understand, summarize, and visualize these very complex topics. As a long time Quake junkie, it's quite fascinating to take a peek behind the curtain. Keep them coming!
I could never visualize what sorted spans were when reading about them in the black book. I guess I tripped myself over by trying to think of them as a vertical construct so it obviously couldn't click. I put off trying to understand them for a few years for when I'd eventually get around to visualizing the renderer myself, but never got around to it. Can't say I'm happy about being beaten to the punch, but I am enjoying this series, keep it up! I can finally see what spans meant.
The Black Book is one of my favorite things from childhood. It was always just out of reach but I still learned things that helped me in my career. Good stuff.
Michael Abrash wrote a wonderful article on scan-line rendering for Dr Dobbs Journal, complete with a code project (not Quake ofc, but demonstrating scanline rendering)
Great breakdown, thank you. Having done my own reading of the Black Book, and other of Abrash's articles (which seem to be the most time capsule-y articles available on the Quake engine development) I'm struck by how pragmatic many of the choices made in building the engine were. For example, the PVS doesn't eliminate overdraw by itself, but it greatly reduces it. Carmack knew it was stupid to try to get the PVS perfect (in fact, fine-grained PVS is still an open problem today). PVS as implemented in Quake gets the set of polygons low enough that span buffers can be used -- span buffers being, even for the time, a slightly outdated algorithm that was designed in a time when writing a pixel was very, very slow, to completely eliminate the remaining overdraw. Overdraw in drawing the world was very expensive because every pixel had to be textured and lit, and world polygons tend to be large. But then at model drawing time, Carmack just said fuck it, the polys are small enough that we'll just use the z-buffer.
Wow the span algorithm reminds me of an algorithm I made for my masters to rasterize convex shapes in 3d. Which ended up being used as a way to view frustum cull a spatial hash.
Thanks for making more videos! You have been one of my favorite technical channels i have stumbled upon in the past few years and the recent resurgence of videos is well appreciated!
Please put this together with your BSP and PVS videos in a single playlist to make it easier to share! Perhaps with the lighting video at the end, too, if you prefer.
if you just build the convex space adjacency tree, you get all the visible faces for free, for ray casting (inside a convex polygon space partition, you always see the inside walls)
A much simpler approach that comes to mind is to keep a list of occupancy spans while plotting polygons front to back. Then there will be no more than a handful spans per each line, and at their size, they might as well be an array. What speaks against that? One thing and a big one is integration of entities (enemies items etc) into the rendering. That you have discarded environment depth data from your buffer before you need it again to occlude the entities by the environment. And another is cache optimisation; that you have execution alternating between various domains of code like BSP traversal, clipping, triangle setup and rasterisation, this means code and hot data get pushed out of the cache regularly. Cache optimisation is a big must to make something run fast on a P5.
Hello, to render enemies etc a depth buffer is indeed used, it is just that for the main level only depth writes are required, which were apparently much faster than depth reads. Your idea with spans sounds like it should work. I've no doubt it was considered by Carmack, Abrash, and co; it'd be interesting to know why they went with the AEL/APL approach. Edit: Chapter 66 of Abrash's book talks about this in the Sorted Spans section.
@@MattsRamblings I'm certain they tried more than a few things. By drawing one surface at a time and having no interleaving tasks, good texture fetch locality is achieved, and cache can be utilised especially if they use swizzled texture format.
@@Ehal256 Once you start increasing geometric resolution, you're going to need better methods than just drawing one whole surface at a time to achieve better cache locality. But either is better than rendering the whole spanbuffer line by line top to bottom. Of course one can come up with tile based methods as well and so on but they can also be hit or miss.
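To make the front-to-back occupancy-span idea from the start of this thread concrete, here is a minimal Python sketch for a single scanline. It illustrates the alternative being proposed, not the AEL/APL method Quake shipped with; the function name `clip_and_insert` and the half-open `(start, end)` pixel intervals are assumptions made for this example.

```python
def clip_and_insert(occupied, x0, x1):
    """Clip the span [x0, x1) against pixels already covered on this scanline.

    `occupied` is a sorted list of disjoint half-open (start, end) intervals.
    Returns (visible, new_occupied): the parts of the span that still need
    drawing, and the updated coverage list.
    """
    visible = []
    cursor = x0
    for start, end in occupied:
        if end <= cursor:
            continue          # interval lies entirely left of what remains
        if start >= x1:
            break             # interval lies entirely right of the new span
        if start > cursor:
            visible.append((cursor, start))  # gap before this interval
        cursor = max(cursor, end)
        if cursor >= x1:
            break
    if cursor < x1:
        visible.append((cursor, x1))

    # Merge the newly drawn pieces into the coverage list.
    merged = []
    for start, end in sorted(occupied + visible):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return visible, merged


# Spans arrive nearest-first, so anything already covered is hidden.
coverage = []
for span in [(10, 20), (5, 15), (0, 30), (12, 18)]:
    drawn, coverage = clip_and_insert(coverage, *span)
    print(span, "->", drawn)
```

Each pixel is written at most once, which is the zero-overdraw property being discussed. The sketch keeps no depth information at all, which is exactly the objection raised above about occluding entities afterwards.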
The spans probably are better for avoiding memory cache misses as likely they are stored next to each other in memory...well...it's usually done that way anyhow, along widths rather than heights.
I used to make levels on an editor called QOOLE - I wish I'd known then some of the stuff you've put in your videos as it explains why some things just wouldn't behave themselves.
When I watch your videos on Quake I usually try to recognize which parts are relevant to modern graphics programming and which aren't. I suppose if you are taking advantage of the graphics card and modern graphics APIs then everything in this video after the "Back-face culling" chapter won't be particularly useful since you can have your graphics API automatically perform lots of tests to prevent overdraw.
Yes and no. Yes, because programming on the CPU, you don't need to worry about such. However, no because that's actually not far off what the GPU does, so understanding Quake's rasterization helps in understanding how a GPU works. While the algorithms might differ (sometimes significantly), the biggest difference (by far) is that Quake is single-threaded (it wasn't until after 64-bit became commonly available that multi-core CPUs were commonly available), while GPUs do the vast majority of the work in parallel.
Excellent as usual. If i might give some constructive criticism: Consider red-green colour blindness in your visualisation, perhaps switching to blue and orange. Also check that your audio is normalised, it is very quiet in places.
Holy crap... I can see why Carmack & Romero mentioned that Doom was kind of the sweet spot where the average person can reasonably understand how the engine renderer works and how building out a custom level works, where Wolfenstein was too simple to create anything all that interesting and Quake is just that smidge too complex for most people to wrap their heads around.
Sometimes I wonder if stuff like this can be threaded to make it faster. Not that it really matters anymore, but as an academic experiment for those old dual and quad Pentium Pro setups.
Very nice video. While watching it, I was thinking which modern tech would be equally impressive. Maybe it's nanites in UE5. IMO the best presentation about them is called "Nanite | Inside Unreal". It's almost 3 hours long, but the stuff is not terribly complicated, these people simply tell too much information: history, requirements, their reasons to make various decisions, etc.
The book does describe the AEL / APL span algorithm: Check out Chapters 66 and 67. There are some differences compared with the final version in the GPL release, so refer to source for the true method. It was the best technique they found, in terms of maximizing the worst case performance, that is to say, still performing well on the hardest scenes.
What about Z-fighting? I find it interesting that the Z-fighting traces bleed across horizontal scans consistently and, if the camera is kept still, the Z-fighting pattern does not update. You can get certain camera angles where one texture completely obscures the other, while at other camera angles there's a diagonal streak. Wouldn't Z-fighting be very costly rendering-time-wise with this algorithm?
I’m not sure if Z-fighting would really apply here given how the world geometry is generated from the CSG brushes. The span optimization is only done to the BSP geometry, which (if I understand correctly) already has hidden surface removal applied to it. Levels are paper-thin shells that have no exterior.
Quake was software rendered, development started before Doom was released, and they likely added minimal OpenGL support by replacing the end of the graphics pipeline to draw polygons using OpenGL. IdTech3 (Quake 3 Arena) likely was similar but more optimized to work with 3D hardware. IdTech4 (Doom 3) was an overhaul of the architecture. Sadly that failed because it used a patented algorithm and the engine was very dependent on those shadow volumes. IdTech3 was a very long-lived engine.
@@Biel7318 The Quake 2 engine is basically the same as Quake 1: a software renderer where the end of the rendering pipeline works on the graphics card. It is easy to make an engine with zero overdraw, like raytracing the whole image. It is not necessarily a good idea to have zero overdraw, as it is faster to let the GPU do the work than to use the CPU to minimize overdraw.
@@gruntaxeman3740 Independent of its optimality when rendering, Carmack did make an attempt at zero overdraw with Quake 2, but using the GPU instead of the CPU, unlike Quake 1. I'm just asking how different that method is from Quake 1's.
I don't know if it was actually done this way in Quake 2, but with a GPU it's better to cull at the level of BSP nodes and not worry about a little bit of overdraw where one node overlaps behind another.
Imagine going to all this trouble, only for OpenGL and Direct3D to arrive shortly after. I often felt sorry for the poor person tasked with building the software renderer that most people never used by the time 3D graphics cards were more commonplace.
Not quite: sprites use transparency (binary). However, yes, translucent water is a problem, and not just because of the overdraw elimination, but the palette too (though the water could be dithered for a probably rather ugly translucency, but only as a second pass, which is how the GL renderer does it (minus the dithering)).
@Bill Currie well that's quake2 water, but quake2 was designed with OpenGL acceleration in mind, so the different passes were expected. But Q1 is entirely solid render. If any, with transparent passes, the lines don't get discarded. Instead, they have a chance to get properly sorted and have correct transparency🤔
@@santitabnavascues8673 Quake had transparent water in OpenGL as a hack, but yeah, it might have come after Quake 2 was released. Certainly the software renderer was not designed with transparent water in mind. However, that's really after the bsp stage of the rendering pipeline. The bsp tree traversal itself is barely changed between software and OpenGL, and even my Vulkan renderer's bsp stage is very similar.
@Bill Currie the bsp is enough to render transparencies well, the way it is traversed sorts the polygons, Quake3 renders them just fine. Good luck with your renderer!🙂
@@santitabnavascues8673 Of course it is, I never said otherwise. The BSP tree is how the original legacy OpenGL renderer, and both the modernish OpenGL and Vulkan renderers (both of which I wrote), support transparency, because the BSP tree supports not only depth sorting, but also texture sorting. It's the way spans are handled that (at least at first glance) makes transparency awkward. I don't feel like digging into getting software transparency working as I have other things I want to work on (e.g., shadows, diegetic UI, ...).
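The claim in this thread that traversing the BSP tree sorts the polygons can be sketched in a few lines of Python. This is an illustrative toy, not code from any Quake renderer: the `BspNode` class, the `back_to_front` generator and the string face labels are all assumptions made for the example. At each node the far side is visited first, then the faces lying on the splitting plane, then the near side, which yields an order suitable for blending translucent surfaces.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BspNode:
    normal: tuple                       # splitting plane normal
    dist: float                         # plane satisfies dot(normal, p) == dist
    faces: list = field(default_factory=list)  # faces lying on the plane
    front: Optional["BspNode"] = None
    back: Optional["BspNode"] = None


def back_to_front(node, camera):
    """Yield faces so that nothing yielded later is hidden by an earlier face."""
    if node is None:
        return
    side = sum(n * c for n, c in zip(node.normal, camera)) - node.dist
    if side >= 0:
        near, far = node.front, node.back
    else:
        near, far = node.back, node.front
    yield from back_to_front(far, camera)   # other side of the plane first
    yield from node.faces                   # then the plane's own faces
    yield from back_to_front(near, camera)  # camera's side last


# Three parallel surfaces along the x axis (hypothetical labels).
root = BspNode((1, 0, 0), 0.0, faces=["wall at x=0"],
               front=BspNode((1, 0, 0), 10.0, faces=["glass at x=10"]),
               back=BspNode((1, 0, 0), -10.0, faces=["wall at x=-10"]))

print(list(back_to_front(root, (15.0, 0.0, 0.0))))
```

Swapping the two recursive calls gives front-to-back order instead, which is the more useful direction when the goal is to avoid overdraw rather than to blend.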
I've been working with Unreal Engine 5 a lot, and I love some of the modern optimisation methods possible with powerful GPUs and multi-core CPUs; it's hard to explain just how many advantages these have. But it gets frustrating when you see "new culling tech" being advertised when the problem was already solved 30 years ago on less powerful hardware, in a more efficient way. If I had the time to go through the Unreal Engine code and make some overhauls, I think it'd just end up using the same solutions found in Quake (though some problems can be solved better with modern hardware than they could be with the methods used in Quake). In short, Quake did optimisation better than multi-billion-dollar companies, 30 years earlier, and it's very frustrating.
Definitely a case of "yes, but no" as methods for transparency; in particular the lighting and shadows around them, have changed a lot and require greater efforts to prevent drawing errors. That said, having an expanded set of rendering options with a hierarchy of what other methods they rely on being parsable so the desired balance of older optimizations to realism could be achieved on demand would be a wet dream, especially if it could be injected into earlier titles. The specular mapping of the mid 2000's that carried on way too long is something I'd probably tweak in every title I could😅
@@Xeogin I completely agree. I love 3D graphics, and things like ray tracing are pretty much a must-have IMO, but binary space partitioning gave such big performance improvements (in indoor spaces) that it's kinda stupid not to use it.
@@hughjanes4883 BSP is highly focused on static geometry, and it also requires a fair bit of precomputation. While it's still very effective on modern CPUs (see Ironwail), I can see why it fell out of favor.
@@hughjanes4883 BSP goes very bad very quickly with complexity. Cross cutting planes are a minor problem in Quake but they get drastically worse once you start doing intricate geometry. It was a suitable solution for the time, not for all times.
Yeah, but think of the kind of content we handle nowadays. Most of it is dynamic and changing, whereas Quake levels are fully static. Also, current content has millions, almost literally, of triangles, so an active line list would become a large structure, and having ever smaller triangles (such as in the Nanite system) would make this structure even larger than an equivalent depth buffer. Sorting through them and referencing polygons would pose a computational complexity that would make their utility rather limited, putting it completely outside the scope of real-time rendering. But the core optimizations are still there: for example, many current engines do a depth prepass precisely to prevent the overdraw of complex pixel shaders.
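The depth prepass mentioned at the end of that comment can be illustrated with a toy Python model that just counts how many times an expensive shader would run. It is a simplification under stated assumptions: fragments are plain `(pixel, depth)` tuples, smaller depth means closer, and the two function names are invented for this sketch rather than taken from any engine or graphics API.

```python
import math


def shaded_without_prepass(fragments):
    """Depth-test and shade in submission order; hidden work still gets done."""
    depth = {}
    shaded = 0
    for pixel, z in fragments:
        if z < depth.get(pixel, math.inf):
            depth[pixel] = z
            shaded += 1      # expensive shader runs, may be overwritten later
    return shaded


def shaded_with_prepass(fragments):
    """Pass 1 fills depth only; pass 2 shades just the surviving fragments."""
    depth = {}
    for pixel, z in fragments:
        depth[pixel] = min(z, depth.get(pixel, math.inf))
    return sum(1 for pixel, z in fragments if z == depth[pixel])


# Worst case for pixel 0: three surfaces submitted back to front.
frags = [(0, 5.0), (0, 3.0), (0, 1.0), (1, 2.0)]
print(shaded_without_prepass(frags), shaded_with_prepass(frags))
```

The goal is the same one Quake's spans serve, paying for the costly per-pixel work once per pixel, although the mechanism here is a depth buffer rather than sorted edges.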
I can't watch this video, for the same reason i am leaving this comment, but i want to commend you on the best flashing lights warning i have seen on youtube c:
"It is hard to reverse engineer the design choices made during Quake's development since, as is often the case with highly optimised graphics code, John Carmack did it"
cool to see you here! :)
I love the Quake software renderer. I used to play QuakeWorld at 320x200 in software rendering well into the mid-2000s. I wish more people knew that QuakeWorld is still around and is an INCREDIBLE FPS. It's been updated through many open source projects. Just google it. It's a great throwback! :)
@@goqsane I didn't play that much Q1 or QW, as at that time I was still bound to a dial-up internet connection, so I couldn't waste that much money ^^. Though I played and still play a lot of Q3 / Defrag and later QuakeLive. For me id Tech3 was another big milestone. A number of big games were made with that engine (RtCW, MOHAA, JediKnight2, CoD, ...).
@@goqsane which would you recommend? It’s a PvP FPS, right? Which is most fun and most played?
That's wild.
I owned Abrash's Black Book as a 90s kid. It was inspiring and a fun read. It wasn't dry and boring, he spiced it up with a lot of philosophy and personal ramblings. I wish I'd never gotten rid of it.
One of the things I always praise about the book when talking about it to others is these extra chunks of wisdom that aren't tied to any technology. The tech is severely obsolete and teaching just about it wouldn't be as useful, but the wisdom he shares is just so valuable, and you can get something useful out of the book instead of just reading about old tech for curiosity's sake.
@@StereoBucket I'm doing a lot of graphics programming these days, and the thing that surprises (and pleases) me is that I'm finding ways that these techniques *aren't* so obsolete as we suppose.
To be specific - yes, it's all obsolete IF you're rendering polygon meshes in the conventional ways, which have progressed so far since then that there's no relation to these old algorithms.
BUT, if you're doing something experimental, maybe using the GPU in ways that aren't conventional, maybe rendering things that aren't triangle meshes, then suddenly a lot of these old "tricks" begin to apply again, and what's old becomes new again. It's exciting.
@@delphicdescant Thank you! This comment inspired me to read the book and it's actually fascinating
@@ribethings That's cool to hear. I'm glad writing stuff in youtube comments actually made a small difference for somebody.
⚠ WARNING: This video contains flashing images.
There's a lot of details I omitted to keep things brief. For example, there's slightly different behavior if a sub-model (door, lift, button, etc) is being drawn. Also I haven't spoken about texture mapping yet --- I think that topic warrants its own video.
A great follow-up to the texture video would be to discuss what changes Carmack had to implement/omit to add hardware rendering for the 3Dfx and later GPUs.
Did you forget to pin this comment?
@@tiagotiagot Lost the pin due to edit.
@Creative Unreal (software rendering) did checkerboard dithering with the u,v coordinates in screen space.
7:43 The thick book with the Quad sound is way funnier than it has any right to be
I love these videos so much! Your visualizations and explanations are always nice and clean. I would love to see you go more in depth with the exceptions (like you mentioned doors and buttons in a comment), but sadly I understand the video has to end at one point...
it's even funnier when you consider the effects the book has on the reader
I really hope this gets more views. Your skilled visual explanations are amazing.
When will Quake be on GBA?
Gamedevs then: "We'll employ black magic to make a complex 3D game work on a glorified calculator"
Gamedevs now: "Why does my visual novel lag when there are three character portraits on the screen?"
@@sh-creative Autism?
To be fair Carmack was light years ahead of every one, back in the day.
I have literally seen newer "boomer shooters" with all 2D sprites and basic geometry and baked lighting perform about as poorly as these first 3D FPS games did when they came out. Except on modern machines.
@Creative yeah, basically not knowing the engine deeply enough and not knowing how to optimize it well enough. When developers DO know what they're doing, the performance can be absolutely limitless
@@Nothyn More like glorious calculator.
Cool video! It's still hard to wrap my head around BSP. As I understand it, an infinite plane is used to split the brushes into ever smaller parts. It's hard to imagine how this splitting is actually coded.
Thanks. Yes, the map compiler is certainly not trivial to describe and is worthy of its own video (probably many).
@@MattsRamblings Please make a video on it cause even after 2-3 months of research I still cant wrap my head around it 😭
Thanks, a very solid video. The span optimization is nifty, but your visualizations here are even niftier
I love this! This is great to learn about so thank you for sharing! :)
I think strategically this kind of optimization does as much in the CPU memory space as possible to reduce the amount of writing into the frame buffer that needs to be done, which was pretty slow on an unaccelerated graphics card.
Oh it tends to be quite quick on basic PCI cards and many VLB cards. But that was not something to rely upon.
Fantastic video, happy to see someone else finally talk about all the weird odds and ends of old games that I enjoy so much
TL:DR a wizard did it
Thanks for taking the time to make these videos. They’re fascinating.
What a wonderful explanation of the Active Edge and Polygon List techniques. Bravo!
I could never visualize what sorted spans were when reading about them in the black book. I guess I tripped myself over by trying to think of them as a vertical construct so it obviously couldn't click. I put off trying to understand them for a few years for when I'd eventually get around to visualizing the renderer myself, but never got around to it. Can't say I'm happy about being beaten to the punch, but I am enjoying this series, keep it up! I can finally see what spans meant.
yes that span thing was not explained really well by Abrash
Binged your channel yesterday and now I'm treated to another one, awesome
I barely understood what the main takeaway was the first time, but after a couple of rewatches everything makes sense from the perspective given.
This is the best video on Quake's software renderer visibility I've ever seen. Congrats.
John Carmack truly is a special person, I can't believe someone could be this much of a genius
A lot of this was done by Michael Abrash as well. And, there are many other games with equally impressive (or more so) tech!
I can only hope that there are many more jawdropping visuals being made for the next video in this amazing series.
Just watched this for the 20th time and finally understood. SO SATISFYING!
Clean, clear, to the point.
Thank you very much!
I really appreciate the way you presented the book at the end, very funny.
Great video as usual!
You can still see the Quake 2 software renderer in the source code (Quake 2 being the last Quake to support software rendering).
It's genius code with a lot of assembly optimizations.
It's amazing how many calculations are made on each single frame when playing
Great presentation!
This series continues to explain the raison d'être of each of the lumps I'm seeing in BSPs; in this case, the vertices.
Wow the span algorithm reminds me of an algorithm I made for my masters to rasterize convex shapes in 3d. Which ended up being used as a way to view frustum cull a spatial hash.
Thanks for making more videos! You have been one of my favorite technical channels I have stumbled upon in the past few years, and the recent resurgence of videos is well appreciated!
Please put this together with your BSP and PVS videos in a single playlist to make it easier to share! Perhaps with the lighting video at the end, too, if you prefer.
if you just build the convex space adjacency tree, you get all the visible faces for free, for ray casting (inside a convex polygon space partition, you always see the inside walls)
A much simpler approach that comes to mind is to keep a list of occupancy spans while plotting polygons front to back. Then there will be no more than a handful of spans per line, and at that size, the list might as well be an array. What speaks against that?
One thing, and a big one, is the integration of entities (enemies, items, etc.) into the rendering: you have discarded the environment depth data from your buffer before you need it again to occlude the entities by the environment.
And another is cache optimisation: if execution alternates between various domains of code like BSP traversal, clipping, triangle setup and rasterisation, code and hot data get pushed out of the cache regularly. Cache optimisation is a big must to make something run fast on a P5.
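For anyone wanting to play with the occupancy-span idea from this comment, here is a minimal Python sketch of one scanline's bookkeeping. To be clear, this is not Quake's AEL/APL edge-sorting method; it only illustrates the simpler "keep a list of covered intervals while drawing front to back" proposal. The function name `insert_span` and the half-open `(start, end)` interval convention are assumptions made for the example.

```python
def insert_span(occupied, x0, x1):
    """Clip the half-open span [x0, x1) against what is already covered.

    `occupied` is a sorted list of disjoint (start, end) intervals for ONE
    scanline (hypothetical layout, not Quake's data structure). Returns the
    pieces of the new span that are still visible and updates `occupied`
    in place so later (farther) polygons are clipped against them too.
    """
    visible = []
    cursor = x0  # leftmost x of the new span not yet classified
    for start, end in occupied:
        if end <= cursor:
            continue          # interval lies entirely left of what remains
        if start >= x1:
            break             # interval lies entirely right of the new span
        if start > cursor:
            visible.append((cursor, start))  # gap before this interval
        cursor = max(cursor, end)
        if cursor >= x1:
            break
    if cursor < x1:
        visible.append((cursor, x1))         # tail past the last interval

    # Merge the newly visible pieces into the occupancy list.
    merged = []
    for start, end in sorted(occupied + visible):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    occupied[:] = merged
    return visible


# Usage: spans arrive nearest-first; only the returned pieces get drawn.
line = []
first = insert_span(line, 10, 20)   # nothing covered yet
second = insert_span(line, 5, 15)   # right half hidden by the first span
third = insert_span(line, 0, 30)    # only the two outer pieces survive
```

Every pixel is written at most once, but note that the list holds no depth, which is exactly the entity-occlusion objection raised in the comment.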
Hello, to render enemies etc a depth buffer is indeed used, it is just that for the main level only depth writes are required, which were apparently much faster than depth reads. Your idea with spans sounds like it should work. I've no doubt it was considered by Carmack, Abrash, and co; it'd be interesting to know why they went with the AEL/APL approach. Edit: Chapter 66 of Abrash's book talks about this in the Sorted Spans section.
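The write-only versus read-and-compare distinction in this reply can be sketched roughly like this in Python. It is a simplified illustration, not the engine's code: the function names are made up, depth is one constant per span, and smaller `z` is assumed to mean nearer, which may not match the depth representation the real renderer uses.

```python
def draw_world_span(color, zbuf, y, x0, x1, z, texel):
    """World spans are already overdraw-free, so depth is only WRITTEN."""
    for x in range(x0, x1):
        color[y][x] = texel
        zbuf[y][x] = z        # no read, no compare


def draw_entity_pixel(color, zbuf, y, x, z, texel):
    """Entity pixels must READ the stored depth and compare before writing."""
    if z < zbuf[y][x]:        # assumption: smaller z is nearer
        color[y][x] = texel
        zbuf[y][x] = z
        return True
    return False


# Usage: a 1x4 framebuffer with a wall at depth 10, then two entity pixels.
W, H = 4, 1
color = [[None] * W for _ in range(H)]
zbuf = [[float("inf")] * W for _ in range(H)]
draw_world_span(color, zbuf, 0, 0, W, 10.0, "wall")
in_front = draw_entity_pixel(color, zbuf, 0, 1, 5.0, "monster")   # drawn
behind = draw_entity_pixel(color, zbuf, 0, 2, 20.0, "monster")    # rejected
```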
@@MattsRamblings I'm certain they tried more than a few things. By drawing one surface at a time and having no interleaving tasks, good texture fetch locality is achieved, and cache can be utilised especially if they use swizzled texture format.
@@SianaGearz Quake 2 sorts the output span list by surface (lit textures) to optimize cache hit rate, but I don't think Quake does this.
@@Ehal256 Once you start increasing geometric resolution, you're going to need better methods than just drawing one whole surface at a time to achieve better cache locality. But either is better than rendering the whole spanbuffer line by line top to bottom. Of course one can come up with tile based methods as well and so on but they can also be hit or miss.
I studied the everliving crap out of that Black Book in high school
Could you explain how the objects are rendered?
The spans probably are better for avoiding memory cache misses as likely they are stored next to each other in memory...well...it's usually done that way anyhow, along widths rather than heights.
I used to make levels on an editor called QOOLE - I wish I'd known then some of the stuff you've put in your videos as it explains why some things just wouldn't behave themselves.
wow, another one
edit: great one
They explain to us the "truth", because they hide the real truth that Quake rendering engine is Black Magic
When I watch your videos on Quake I usually try to recognize which parts are relevant to modern graphics programming and which aren't. I suppose if you are taking advantage of the graphics card and modern graphics APIs then everything in this video after the "Back-face culling" chapter won't be particularly useful, since you can have your graphics API automatically perform lots of tests to prevent overdraw.
Yes and no. Yes, because programming on the CPU, you don't need to worry about such. However, no, because that's actually not far off what the GPU does, so understanding Quake's rasterization helps in understanding how a GPU works. While the algorithms might differ (sometimes significantly), the biggest difference (by far) is that Quake is single-threaded (it wasn't until after 64-bit became commonly available that multi-core CPUs were commonly available), while GPUs do the vast majority of the work in parallel.
Excellent as usual. If i might give some constructive criticism: Consider red-green colour blindness in your visualisation, perhaps switching to blue and orange. Also check that your audio is normalised, it is very quiet in places.
I'll bear this in mind in future vids, thanks for bringing it to my attention
Holy crap... I can see why Carmack & Romero mentioned that Doom was kind of the sweet spot where the average person can reasonably understand how the engine renderer works and how building out a custom level works, where Wolfenstein was too simple to create anything all that interesting and Quake is just that smidge too complex for most people to wrap their heads around.
Sometimes I wonder if stuff like this can be threaded to make it faster. Not that it really matters anymore, but as an academic experiment for those old dual and quad Pentium Pro setups.
Very nice video. While watching it, I was thinking about which modern tech would be equally impressive. Maybe it's Nanite in UE5. IMO the best presentation about it is called "Nanite | Inside Unreal". It's almost 3 hours long, but the material is not terribly complicated; the presenters simply cover a lot of information: history, requirements, their reasons for making various decisions, etc.
Very nice❤
Very interesting!
it looks so good in lowres
Is there any modern "Black book" of comparable technical details?
Great video. Volume was low though.
Did the book mention the span algorithm you described, and did Carmack say it was more efficient than the regular approach?
The book does describe the AEL / APL span algorithm: Check out Chapters 66 and 67. There are some differences compared with the final version in the GPL release, so refer to source for the true method. It was the best technique they found, in terms of maximizing the worst case performance, that is to say, still performing well on the hardest scenes.
@@MattsRamblings That's awesome, thanks
What about Z-fighting? I find it interesting that the Z-fighting traces bleed across horizontal scans consistently, and if the camera is kept still, the Z-fighting pattern does not update. You can get certain camera angles where one texture completely obscures the other, while at other camera angles there's a diagonal streak. Wouldn't Z-fighting be very costly, rendering-time-wise, with this algorithm?
I’m not sure if Z-fighting would really apply here given how the world geometry is generated from the CSG brushes. The span optimization is only done to the BSP geometry, which (if I understand correctly) already has hidden surface removal applied to it. Levels are paper-thin shells that have no exterior.
@@MaximumADHD If I remember correctly, some issues with Z-fighting in levels appeared with GLQuake that did not exist with the software renderer.
How are models and your viewmodel affected by lightmaps? Do they sample a point under them and get colored accordingly?
How did they adapt the software renderer's approach of no redrawing to the OpenGL renderer of GLQuake or Quake 2?
Quake was software rendered, development started before Doom was released, and they likely added minimal OpenGL support by replacing the end of the graphics pipeline to draw polygons using OpenGL.
id Tech 3 (Quake 3 Arena) was likely similar but more optimized to work with 3D hardware. id Tech 4 (Doom 3) was an overhaul of the architecture. Sadly that failed because it used a patented algorithm and the engine was very dependent on those shadow volumes. id Tech 3 was a very long-lived engine.
@@gruntaxeman3740 Keep in mind that in the Quake 2 engine Carmack made it a point not to overdraw anything. Now how did he do it?
@@Biel7318
The Quake 2 engine is basically the same as Quake 1: a software renderer where the end of the rendering pipeline runs on the graphics card.
It is easy to make an engine with zero overdraw, for example by raytracing the whole image. It is not necessarily a good idea to have zero overdraw, as it is faster to let the GPU do the work than to use the CPU to minimize overdraw.
@@gruntaxeman3740 Independent of how optimal it is for rendering, Carmack did aim for zero overdraw with Quake 2, but using the GPU instead of the CPU, unlike Quake 1. I'm just asking how different that method is from Quake 1's.
I don't know if it was actually done this way in Quake 2, but with a GPU it's better to cull at the level of BSP nodes and not worry about a little bit of overdraw where one node overlaps behind another.
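As a rough picture of what "culling at the level of BSP nodes" could mean, here is a small Python sketch that tests a node's axis-aligned bounding box against a set of view-frustum planes. It is not taken from any Quake source: the names `box_outside_plane` and `box_visible`, and the convention that planes face inward (a point `p` is inside when `dot(normal, p) >= dist`), are assumptions for illustration.

```python
def box_outside_plane(mins, maxs, normal, dist):
    """True if the whole box lies on the outside of one inward-facing plane."""
    # Pick the box corner that reaches furthest along the plane normal;
    # if even that corner is outside, every corner is.
    corner = [maxs[i] if normal[i] >= 0 else mins[i] for i in range(3)]
    return sum(n * c for n, c in zip(normal, corner)) < dist


def box_visible(mins, maxs, planes):
    """Conservative test: the box is kept unless some plane rejects it fully."""
    return not any(box_outside_plane(mins, maxs, n, d) for n, d in planes)


# Usage: a single hypothetical "frustum" plane keeping the half-space x >= 0.
planes = [((1.0, 0.0, 0.0), 0.0)]
hidden = box_visible((-5, -1, -1), (-1, 1, 1), planes)     # fully at x < 0
straddling = box_visible((-1, -1, -1), (1, 1, 1), planes)  # crosses x = 0
```

If a node's box is rejected, its whole subtree can be skipped; whatever survives is handed to the GPU, and its depth test absorbs the leftover overdraw.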
That's genius !!!
Imagine going to all this trouble, only for OpenGL and Direct3D to arrive shortly after. I often felt sorry for the poor person tasked with building the software renderer that most people never used by the time 3D graphics cards were more commonplace.
Fkn John Carmack, he's an alien if there are any living on earth.
this is one of these videos where you need to watch it 20 times to be able to replicate it.
Finally everyone can make non laggy 3D engine for calculators!
I wrote a doom renderer for my Casio, works fine
@@thewhitefalcon8539 the talk was about rasterization of polygons, not raycasting
@@Z_Z.t Doom doesn't use raycasting, it's an interesting hybrid of polygon rasterization but restricted to mostly 2/2.5D.
@@Ehal256 Whatever, Doom is a different and simpler engine, so some of the technologies shown in the video are not used there, because there's no need for them.
I wish I had a big brain to understand this.
What music was used in this video?
Hello, the music is:
White Hex - Searching For You
The 129ers - I Can't Remmber I Can't Recall
Nico Staf - Smooth and Cool
Dan Henig - Midnight
Matt, what tools are you using to render your videos?
Hello, mostly Blender and a tonne of Python scripts.
...That's a lot of words to say "black magic is how it works".
Quake 2's software renderer is even better.
Can you do the same thing with "Minecraft"?! I want to see how "Minecraft" look like on Software renderer!
👍👍👍
In short, the scanline algorithm is used as a depth prepass... but then... quake can't use transparent textures?
Not quite: sprites use transparency (binary). However, yes, translucent water is a problem, and not just because of the overdraw elimination, but the palette too (though the water could be dithered for a probably rather ugly translucency, but only as a second pass, which is how the GL renderer does it (minus the dithering)).
@Bill Currie well that's quake2 water, but quake2 was designed with OpenGL acceleration in mind, so the different passes were expected. But Q1 is entirely solid render. If any, with transparent passes, the lines don't get discarded. Instead, they have a chance to get properly sorted and have correct transparency🤔
@@santitabnavascues8673 Quake had transparent water in OpenGL as a hack, but yeah, it might have come after Quake 2 was released. Certainly the software renderer was not designed with transparent water in mind. However, that's really after the bsp stage of the rendering pipeline. The bsp tree traversal itself is barely changed between software and OpenGL, and even my Vulkan renderer's bsp stage is very similar.
@Bill Currie the bsp is enough to render transparencies well, the way it is traversed sorts the polygons, Quake3 renders them just fine. Good luck with your renderer!🙂
@@santitabnavascues8673 Of course it is, I never said otherwise. The BSP tree is how the original legacy OpenGL renderer, and both the modernish OpenGL and Vulkan renderers (both of which I wrote), support transparency, because the BSP tree supports not only depth sorting, but also texture sorting.
It's the way spans are handled that (at least at first glance) makes transparency awkward. I don't feel like digging into getting software transparency working as I have other things I want to work on (e.g. shadows, diegetic UI, ...).
Also: The visualizations are nice but some parts warrant an epilepsy warning, IMHO.
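The depth sorting that the BSP tree gives you, as discussed in this exchange, can be sketched in a few lines of Python. This is a toy tree, not the Quake BSP file layout: the `Node` class, its fields and the `back_to_front` function are hypothetical names, and faces are stored directly on the splitting node for simplicity.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    normal: Tuple[float, float, float]   # splitting plane normal
    dist: float                          # plane equation: dot(normal, p) = dist
    faces: List[str] = field(default_factory=list)  # faces lying on the plane
    front: Optional["Node"] = None       # subtree on the normal's side
    back: Optional["Node"] = None        # subtree on the opposite side


def back_to_front(node, eye, out):
    """Append faces to `out` farthest-first, as seen from `eye`."""
    if node is None:
        return
    side = sum(n * e for n, e in zip(node.normal, eye)) - node.dist
    if side >= 0:
        near, far = node.front, node.back
    else:
        near, far = node.back, node.front
    back_to_front(far, eye, out)    # everything beyond the plane first
    out.extend(node.faces)          # then the faces on the plane itself
    back_to_front(near, eye, out)   # finally the eye's own side


# Usage: three parallel walls at x = -5, 0 and 5.
tree = Node((1, 0, 0), 0, ["wall0"],
            front=Node((1, 0, 0), 5, ["wall5"]),
            back=Node((1, 0, 0), -5, ["wall-5"]))
order = []
back_to_front(tree, (10, 0, 0), order)
```

Swapping the two recursive calls yields front-to-back order, which is the order that suits overdraw elimination; back-to-front is the order that makes blended transparency come out right.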
please epilepsy warning
Wow, this was too technical for me
I've been working with Unreal Engine 5 a lot and I love some of the modern optimisation methods possible with powerful GPUs and multi-core CPUs; it's hard to explain just how many advantages these have. But it gets frustrating when you see "new culling tech" being advertised when the problem was already solved 30 years ago on less powerful hardware, in a more efficient way. If I had the time to go through the Unreal Engine code and make some overhauls, I think it'd just end up using the same solutions found in Quake (though some problems can be handled better with modern hardware than with the methods used in Quake).
In short, Quake did optimisation better than multi-billion-dollar companies, 30 years earlier, and it's very frustrating.
Definitely a case of "yes, but no" as methods for transparency; in particular the lighting and shadows around them, have changed a lot and require greater efforts to prevent drawing errors. That said, having an expanded set of rendering options with a hierarchy of what other methods they rely on being parsable so the desired balance of older optimizations to realism could be achieved on demand would be a wet dream, especially if it could be injected into earlier titles. The specular mapping of the mid 2000's that carried on way too long is something I'd probably tweak in every title I could😅
@@Xeogin I completely agree. I love 3D graphics, and things like ray tracing are pretty much a must-have IMO, but binary space partitioning gave such big performance improvements (in indoor spaces) that it's kinda stupid not to use it.
@@hughjanes4883 BSP is highly focused on static geometry, and it also requires a fair bit of precomputation. While it's still very effective on modern CPUs (see Ironwail), I can see why it fell out of favor.
@@hughjanes4883 BSP goes very bad very quickly with complexity. Cross cutting planes are a minor problem in Quake but they get drastically worse once you start doing intricate geometry. It was a suitable solution for the time, not for all times.
Yeah, but think of the kind of content we handle nowadays. Most of it is dynamic and changing, whereas Quake levels are fully static. Also, current content has millions, almost literally, of triangles, so an active line list would become a large structure, and having ever smaller triangles (as in the Nanite system) would make this structure even larger than an equivalent depth buffer. Sorting through them and referencing polygons would add computational complexity that would make their utility rather limited, putting the approach completely outside the scope of real-time rendering. But the core optimizations are still there; for example, many current engines do a depth prepass precisely to prevent the overdraw of complex pixel shaders.
I can't watch this video, for the same reason i am leaving this comment, but i want to commend you on the best flashing lights warning i have seen on youtube c:
Sorry that you can't watch the video, I'll certainly pay closer attention to this in the future.
That's genius. Never knew Quake's engine completely eliminates overdraw; I always thought it checks if a pixel was already drawn.
Ok here I was lost
Too complex video