If the moving source followed a path that didn't have to keep track of state, like an ellipse or something, then yes it could be modified with minimal effort to also calculate the position based on where the starting position would be at a certain time. However, even if you're trying to do something that does keep track of state (like having particles follow the player) the other advantages of the system, plus the fact that you likely wouldn't have too many particles rendered like this, would keep it feasible. I'm answering from my own knowledge, so it could end up being faster or slower than I figure.
I'm glad GPU-based particles have become the norm now; they work in a similar way. Instead of making the CPU do all this math, let the GPU, which is built for multiplying with its floating-point units, do it. Super fast, and it allows for more calculations like vector fields and letting particles react to light and shadow. But not every game engine has a good particle system.
Yep. Same as how random numbers are (usually) generated. You have a seed value, and an algorithm that calculates the new value based on its phase in a cycle.
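For instance, a classic linear congruential generator: the whole "random" stream is regenerated from one small seed, much like a particle regenerated from its start time. (A minimal sketch; the constants are the well-known Numerical Recipes ones.)

```python
def lcg(seed):
    """Deterministic pseudo-random stream: the same seed always
    reproduces the same sequence, so nothing needs to be stored."""
    state = seed
    while True:
        # Numerical Recipes LCG constants, modulus 2^32
        state = (1664525 * state + 1013904223) % 2**32
        yield state / 2**32  # normalise to [0, 1)

# two generators with the same seed produce identical streams
gen_a = lcg(42)
gen_b = lcg(42)
```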
GPUs have a lot of slower, weaker cores (the RTX line has 2000-4000, compared to the 4-12 in a standard CPU) that are designed for "parallel" tasks - doing a huge number of small, quick calculations rather than a smaller number of larger, more complicated calculations. Like, say, rendering every polygon in a frame.
GPU architecture is designed to perform a very limited set of instruction (or calculation) types, all specific to geometry calculation. Most of that math can also be calculated in parallel, hence it is easy to multithread. That's why GPUs have hundreds or thousands of cores. CPU architecture is designed to perform many types of instructions, or should I say, many different types of calculation. And most instructions can't be multithreaded, because by the nature of the instruction itself it can't be split in two parts. For example, I can't split a routine that waits for the player's input to generate an outcome into two parallel tasks. However, I can easily split the rendering of a scene into many parallel tasks. 3D rendering is just a lot of repetitive geometry math, whereas game logic (or any non-gaming logic) is complex and does many different and unique things. A GPU will always do the same math; all that changes are the parameters of the calculation and the results. A CPU will always do very different instructions, which change not only the parameters and the results but what is actually calculated to begin with.
Except ignore the previous answers, because that's not how the PS2 GPU worked. It was a fixed-function hardware GPU with no programmable shader cores. Rest assured, the GPU in the PS2 could keep up with the output of the VPU, as the raw speed of the GPU was upwards of 50 million triangles per second. The actual speed depended on which per-pixel features you had enabled and the size of the triangles. With all features enabled it could still manage an estimated 15 million triangles per second. These numbers are approximate and are based on an average triangle size of around 50 pixels. My main point is that the fixed-function hardware was more than fast enough to draw this number of particles per second.
Regenerating data is faster than loading, iterating and storing it. GPUs are similar, using a look up table can be slower than computing something from scratch in many cases.
This is by no means a dig at the explanation here, which is great, but every time he says maths to me he may as well be saying magic. No, I never learned my times tables.
Hah, that's weird. Although I'm not a game programmer and I don't know anything about the inside of a PS2, the way you solved the problem is the only way I would consider implementing it, and I was wondering why. I have a background in parallel and distributed simulation, so "3D games" seem quite similar. It sounds like this is what GPUs evolved into.
Need to find a device that is well documented and has compilers readily available. MVG did a video on how to program the GameBoy. Couldn't care less about the gameboy tho XD
I made something similar with Unity's DOTS and shader graph on my phone: ships flying and shooting at the player. By doing all the transformation inside the shader and calculating the same math on the CPU, I could skip all the CPU's reading and writing of the ships' and missiles' positions. But then I got limited by the number of triangles my phone could handle....
So if I understood this correctly, and it's possible I didn't, but the math is basically like a standard curve of a graph. The whole y = mx+b principle? If so then that is pretty clever. I don't know why that wouldn't be more common because that's the method I'd use if I had to solve that problem.
Of course, one drawback of this is that the particles cannot interact with the world, i.e. to do lighting etc. So all that stuff has to be thought through by the CPU in advance if the particles are expected to interact with the world. Kind of like how player models are rendered by the GPU but the hitboxes are handled by the CPU. (correct me if I am wrong of course)
Correct, no collision, and particles are unlit. This entire process only works because it is effectively a pure, deterministic vertex shader, and there is no pixel shader... On a modern GPU this technique could still be applied in a vertex or geometry shader and get per-pixel lighting in the pixel shader. Collision would still be an issue that would elude this technique.
I love these videos. A little question here: are these "impossible" code tricks the reason for the idea that the PSX and PS2 were very hard platforms to develop software for, or were their GPU/CPU just way too different from any other hardware? It's a shame the PC can't benefit from almost any tricks; the only ones I am aware of are Doom (fast 3D illusion), Quake ("low" requirements for real 3D and some lighting) and Quake 2 or 3 ("free" "dynamic" lighting).
The truth is most PC games up until the late 90s had various implementations of software rendering, as GPUs were still transitioning from vendor-specific APIs to more universal APIs. So you'll have a hard time looking for comparable features and hardware "tricks", as everything was mostly standard in feature sets while also being diverse in power, meaning games were built to look nice on the devs' computers with options to reduce quality for entry-level consumer-grade machines, still much like today. That being said, advancements were mostly made in pursuit of higher polygon counts, screen and texture resolution, and basic effects such as transparency, lighting, physics, reflections, etc... Most advancements were made by independent groups from the demoscene that later joined or formed game companies, like our host here, so you might want to look at productions from the time on pouet.net, for example. If I had to name one notable feature or "trick" that blew my mind back in the day, it would be using the CPU for voxel-based terrain rendering, like in Comanche, Delta Force or Outcast.
@@TheSliderW You're entirely correct, though I can think of one clever use of PC hardware undertaken in Wolfenstein 3d, at least (and I assume Doom): on VGA hardware you can disable chain 4 (which is a feature that makes video memory look linear, even though it's really planar) and then use the map mask register so that anywhere that you have up to four suitably-aligned columns with the same pixel contents you can write them in the amount of time it'd otherwise take to write only one. Which happens a reasonable amount of the time when walls are wide and use low-resolution textures. I guess you could argue other Mode X effects too, but those are mostly about the previous generation of 2d gaming.
I'm no expert, but the limitation should effectively be that there's a really limited range at which the particles can be "random", e.g. each particle in an emitter will definitely have a certain colour at X age. Each particle will have the same rotation at X age. You would need to limit the time they could exist, obviously, or else it'll eat memory and processing time. They basically function in a black box, too, so there's no physical interaction between the particles and the 3D terrain or world anywhere. These mean it'd be very difficult to get the convincing visual effect of, say, a splash animation with droplets scattering on the ground. It wouldn't look random enough, and they would phase through the ground if they didn't "evaporate" before that. The use case would be for ephemeral effects like sparks, fire or light. To extend from this, he didn't hint whether or not the particle is still drawn if it's hidden/occluded by something (which wastes video memory, but that calculation might waste processing time, so it's a trade-off). The GPU may have its own system for that, though. I'm not shit-talking; it's a pretty stunning bit of coding. Maybe if @CodingSecrets comes along he can correct me or explain things that were glossed over in the video.
Honestly, this is pretty much the approach used in modern engines (Aside from the fact that they support many more parameters). Modern vector units are now often integrated into modern CPUs directly, so the back and forth between them takes a lot less time, as well as the simple fact modern CPUs are insanely faster than those of the PS2. For some modern particle systems, the GPU instead acts as a sort of VPU, but again it has a lot more memory and so can take in a lot more data. The GPU's parallelism, support for relatively generic code plus the fact it's already where you want to output your triangles anyway makes it incredibly fast, with the main cost that you can't do complex collision calculations since you'd otherwise need to send results back to the CPU.
I guess one of the limitations of this approach is that it is tied to the frame rate of the game. The particles will slow down if the frame rate dips below its target, and if the game runs uncapped above the targeted frame rate, the particle simulation will speed up. The frame rate needs to be constant with no variation for it to work. You also need to design your particles with knowledge of the frame rate you are targeting, be it 30 or 60 fps. If you designed them for 60 fps but later in development the game is capped to 30 fps, the particles you have made will run at half the speed of what you had in mind, which means you have to go back and double the variables for each particle. The same can happen in reverse too, doubling the speed of the particle simulation and requiring you to halve the values.
@@yopeaceable The particles here are being interpolated so they can be decoupled from any frame rate. The maximum number of key frames is fixed though (I think the video said a table of 64?).
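That kind of keyframe-table interpolation by particle age might look roughly like this (a hedged sketch; the 64-entry table size comes from the video, but the function and names are my own):

```python
def sample_table(table, age, lifetime):
    """Linearly interpolate a fixed keyframe table by normalised age,
    so any frame time between keyframes gets a smooth in-between value."""
    t = max(0.0, min(1.0, age / lifetime)) * (len(table) - 1)
    i = int(t)                        # lower keyframe index
    frac = t - i                      # blend factor toward the next keyframe
    j = min(i + 1, len(table) - 1)    # clamp at the last entry
    return table[i] + (table[j] - table[i]) * frac

# e.g. a 64-entry brightness ramp fading 1.0 -> 0.0 over the lifetime
ramp = [1.0 - k / 63 for k in range(64)]
```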
some games, even today, absolutely waste processing power and are poorly optimized, especially on phones, which were never meant to do any kind of gaming. Devs would do well to watch these videos and be inspired to optimize their games in novel ways, include novel game mechanics, and crunch down the data usage and load times in their creations. That said, if something is possible then by definition it is != impossible ;)
This channel just makes me think that back in the "good ol' days", programmers got to flex and circumvent hardware limitations. AFAIK in the modern day, doing something like this on the PS4/XBone and onward is practically illegal and might prevent your game from being published. Correct me if I'm wrong!
So, I've now seen twice that you say "in a previous video", but I'm watching your videos in upload order and there is no previous video like the ones you mention. What's up? Did some stuff get taken down?
The way described to batch the data would heavily imply no. That said, it might have been possible to first calculate transforms by processing the attractors/deflectors in the view. Then it's just a matter of whether all the data from a previous batch is automatically purged upon receiving new instructions or data, because if not, you could reuse the previously generated transform tables, using something like the distance between that system and the current vertex to create a mapped value for interpolation (like they did for the color and shape data) to keep the transform table slim... but I still can't imagine that would allow for a crazy number of transforms while still having the bandwidth to do operations on so many emitters and emitted particles.
So you're simplifying things by figuring out all the different properties from only one variable (along with this one variable being able to change all the other properties differently based on their initial starting values)?
Is this kind of optimization thinking dying in the industry as a whole among programmers and developers? Deeply fascinating stuff, but I can't help but feel it's rapidly becoming a lost art with the general-purpose computing power at our disposal. nice video~
It's dying in the industry as a whole because the industry programming constraints are wide in scope. This art of optimising the software process is useful for programming a single hardware platform; lessons about specific ideas can be shared between hardware platforms but the code itself cannot be shared. When the industry is required to target multiple hardware platforms at the same time for their game title, they're not spending the time to think about these nifty tricks for each hardware platform. Unless there was some constraint that mandated a certain game feature to be performant at interactive rates, they will choose to rely on processing workflows that target the lowest common denominator between the hardware platforms.
I want the source code for this. And I want to make it run on the other 6th gen consoles, like the XBOX and GameCube. It would be pretty difficult, though, because the hardware is completely different.
Well, on the Gamecube you don't have to worry about the T&L part, only the physics. The Gamecube CPU has SIMD instructions, so one can apply that as well.
I remember them jerry rigging their E3 demo, they added a TON of RAM into a devkit and streamed pre-transformed geometry into the GS. All the CPU was doing was DMA transfers. I guess that's where that magic theoretical peak number comes from.
I've rewritten this comment about a half dozen times at this point because I'm not trying to come off as disrespectful, but I'm wondering what your thoughts are about devs who get excited for better graphics/sound/story/etc. and then forget to make the game actually fun to play. Is it an easy trap to fall into? Have you ever had that problem? I like better graphics etc., obviously, but there's no point in beefing up the visuals if the game isn't fun in the first place.
I dunno why but this channel always feels like it's missing something at the end of each explanation. Like, there always seems to be just one tiny bit of information missing that makes it impossible for me to understand, yet sooooo close.
As someone who's hopeless when it comes to any kind of coding or programming, these videos do a good job of explaining what's going on in a comprehensible way, even if I still don't "get" most of the theory behind them. Nice job.
I remember when my PC used to struggle to run the particle fountain in Windows Media Player.
Did it speed up when you full screened it? This lowered the resolution and used a lower color depth, which usually improved performance (also it didn't render window controls, which also caused slowdown)
I remember when the Bubbles screensaver was too fancy for peasants like me
Data oriented programming is making a comeback.
This intro music embodies the sound of delving into the hidden universe of magical coding secrets
I just realized Gamehut is also Coding Secrets.
Big Brain Moment
I did wonder why there hadn't been any activity on the Game Hut channel recently, now I know why, it was because it's all happening on this channel now.
woosh
@@zappafurious don't need any of that.
That was the Codest Secret of them all.
as a "modern" embedded programmer, this stuff is amazing. I wonder how many square wheels we reinvent every day
***NEWS FLASH***
TT Games was put in charge of the Large Hadron Collider.
They dreamed out a novel approach allowing time travel
This can be done very simply with horizontal interrupts
@@TheThundererVidz which is why most assholes went with loops
They could do that. They certainly have the talent. They would make a Lego game instead.
@@CasperUK31 My theory's they were cursed and the only way to lift it is to make the same game 100 times
In fact, if we just *mirror* the collider along each axis, then the collider only takes up a quarter of the space
Oh, these remind me of writing modern gpu particle systems: do anything you can to avoid having to remember very much about each individual particle. We have also gotten a lot of work out of storing a small static ID value for particles along with their age. This can be used to add a lot per-particle variety and can still be quite small (8 bits age, 8 bit id). You can use the id value as a random seed to change physics properties, maximum lifetimes, or colors, and you can hash it so that the properties don't obviously correlate. This obviously makes the particles double the size in memory, but it's usually worth it :D
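The id-as-seed trick might look roughly like this (the hash mixing constants and the property names/ranges here are my own invention, just to show how different salts decorrelate the properties):

```python
def hash8(x, salt):
    """Cheap integer mix: different salts give unrelated-looking
    outputs for the same id, so properties don't visibly correlate."""
    h = ((x ^ salt) * 0x45d9f3b) & 0xFFFFFFFF
    h ^= h >> 16
    return h & 0xFF  # fold back down to 8 bits

def particle_props(pid):
    """Derive several decorrelated properties from one 8-bit id."""
    return {
        "lifetime": 1.0 + hash8(pid, 0x11) / 255 * 2.0,  # 1..3 seconds
        "speed":    0.5 + hash8(pid, 0x77) / 255,        # 0.5..1.5
        "hue":      hash8(pid, 0xAB) / 255,              # 0..1
    }
```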
Must make for some interesting math if you are doing something like a particle that bounces off the floor; you would need the full integral for it to regenerate the position from only time.
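For the special case of a perfectly elastic bounce you can dodge the full integral, since every arc repeats: fold time into one arc with a modulo. (A sketch under that no-energy-loss assumption; the names are mine, and with restitution below 1 each arc shortens geometrically and the closed form gets much messier.)

```python
G = 9.81  # gravity, m/s^2

def bounce_height(v0, t):
    """Height of a particle launched upward at v0 from the floor and
    bouncing elastically. Every arc is identical, so instead of
    integrating bounce by bounce, wrap time into the current arc."""
    period = 2 * v0 / G            # time between floor contacts
    tau = t % period               # time within the current arc
    return v0 * tau - 0.5 * G * tau * tau
```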
Really clever. :) Loved this! This is probably my favorite part of development, figuring out novel solutions in code to custom hardware.
The formula look-up table technique in the VPU reminds me of some techniques I used to approximate a sine-wave throughout a given range on some specialized hardware where the math instructions were tremendously slow. That way, all the specialized hardware does is what it is made to do, and you don't waste cycles or memory on things it doesn't need to do.
This system would have been a blast to work on.
I love you sharing your past development history here, it's fantastic.
rendering particles in real time using the main processor:
nah, that is too time consuming
letting the tiny chip do simple maths:
that works
honestly, this is the kind of outside-the-box workaround stuff I love about old game development. There were so many limitations that there was no option except being ingenious, resourceful and creative
@obsoleteUbiquity I never said it's not done as much, I just said back then it was the only option
It IS a thing of the past. The last non-standard console was the Wii U, and even the Wii U is somewhat similar to a standard x86 pipeline. There isn't a trick to be pulled with modern games, be it PC, console or mobile. Put simply: the disk stores data, RAM stores logic data, the CPU processes logic, and the GPU stores video data and processes 2D/3D logic.
Game data is first loaded from disk by the CPU, then stored in RAM. Texture data is transferred from RAM to the CPU, then transferred on to GPU RAM (VRAM). Game logic is processed on the CPU, and then the CPU sends draw calls to the GPU. The GPU finally converts the frame to a 2D still and sends it out through the cable.
To be honest, the only "trick" developers can do is use the GPU's huge math-processing power for other stuff rather than calculating 3D geometry and shaders. However, what you can do with a GPU besides 2D/3D-related stuff is very, very limited. Also, designing a game around GPU-processed logic is problematic because you would need to code it to work on all the mainstream GPUs (Nvidia, AMD and Intel), and the performance will vary greatly based on the GPU itself, if there is a GPU to begin with.
@@akaDL you could optimise CPU usage by properly taking caches into account. Check out Data-Oriented Design if you're interested. But other than that, there aren't many tricks with the hardware that I can think of
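As a toy illustration of the data-oriented idea: keep each field in its own contiguous array (struct-of-arrays) so a pass that only needs one field streams through memory instead of hopping across whole objects. Python lists only gesture at the real cache behaviour, so treat this as a layout sketch, not a benchmark:

```python
# Array-of-structs: each particle is one object, fields interleaved.
aos = [{"x": float(i), "y": 0.0, "age": 0.0, "color": 0xFFFFFF}
       for i in range(4)]

# Struct-of-arrays: one array per field, the layout data-oriented
# design favours. A pass that only updates `x` never touches the
# other fields at all.
soa = {
    "x": [float(i) for i in range(4)],
    "y": [0.0] * 4,
    "age": [0.0] * 4,
    "color": [0xFFFFFF] * 4,
}

def advance_x_soa(particles, dt, vx):
    """Update only the x positions, reading one contiguous array."""
    particles["x"] = [x + vx * dt for x in particles["x"]]
```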
Nowadays you can also have this level of complexity; you just have to avoid using game engines like Unity. But many things have changed: instead of a VPU you need to use shaders, for example.
@@akaDL Wasn't the Wii U a beefed up Wii which was itself a beefed up Gamecube?
Re-uploaded to correct title card
Anytime, literally
How fast is the graphics chip in the PS2 again?
@@vyor8837 that depends if travelers tales programs for it or not
Rebrand, huh?
Why are the videos on a new channel, if I may ask? I so really like these videos, but why this move?
I have not a clue of any of this but along with the background tones repeating, I find it very relaxing.
So, you are using the vector processing unit to do vector processing? Genius! ;)
Actually, yes.
I mean it was a vector processing unit for the ps2
probably had less than a kb of cache
uh, I think you missed the point. They are using the vector CPU in both scenarios. The difference is that in one they are using a simple start time and a scalar product to search lookup tables, versus resubmitting the changes from the CPU, where the changes for each particle are made.
It's actually pretty smart when you consider the functions of modern shader languages, which essentially do the same job as this vector unit did back then. Let me tell you, making batch breaks on a shader from the CPU is easily one of the most expensive parts of modern graphics.
All your videos are really amazing, it's interesting to see how every last bit of performance possible (and even more than that) was squeezed out of the hardware with clever tricks.
Sort of resembles building the particle system as a vertex shader
Exactly my thought, awfully like GPU particle systems but before that
Too bad I didn't have a VPU to calculate my taxes, cheers from jail.
😂
Damn I wish someone would use these kind of skills and start sell awesome games for the Raspberry Pi or create some stunning tech demos for it at least.
That someone could be you!
man, I love your videos. I've been in the game industry for some years now, mainly working on system software for consoles, but I'm finally working at a game studio now. I'm hoping that in the coming years I can learn as much as you have over the years. You've worked on games I still play to this day. It's crazy lol
this is such an iconic visual element of the PS2 era
Woo hoo! That was really exciting seeing another instance of brilliant people getting the most out of video game hardware.
3:07 Speaking of Jak and Daxter, the game recently got decompiled and recompiled into OpenGoal project, made into an actual PC port made from scratch. Jak and Daxter can be built and played natively on x86 systems now instead of emulating MIPS architecture
Reminds me of the old school Winamp visualization plugins.
Transforming a sequential system to a combinatorial one, that's a highly effective and very difficult kind of optimization. Awesome!
I did a small particle engine myself, so the video was perfectly clean to me. Thanks for detailed overview. Congrats to clever usage of available resources.
Just imagine if the creativity of those generations were used today instead of putting too much trust in raw compute performance.
I always loved how the particles were handled in the 6th gen of Consoles for this Very reason.
Good video!
Damn that was clever programming
Whoa this is clever. I have to watch this a couple of times to fully understand it. Thanks for uploading :-I)
After I started coding games I realized right away how much I needed math, I need to open my high school books again xD
Technically you just need to know how to read the formulas. The computer can do a lot of the heavy lifting for you in coding.
@@SheepUndefined and that's how you end up with inefficient code
@@den2k885 And that's how you end up with premature optimization.
@@SheepUndefined If you are tackling the mathematical part of the problem it is not premature, it is a design requirement.
@@den2k885 I'm not really sure what you imagine we're talking about but if you're somehow avoiding the computer doing anything, that sounds a bit like hard coding for optimization to me. or worse, magic numbers. Do you usually avoid using the trig functions in your language to avoid having the computer do math for you?
Love that thumbnail! When Lego Han Solo shoots Optimus Prime and trying to save Crash Bandicoot!
I love how efficient and creative this is. It's beauty and elegance for nerds!
This way of problem solving is great; it's given me a lot to ponder over.
This is phenomenal!
Please go through more of what you've done! I know you've done various, but keep on goin :)
So you basically created a deterministic physics engine. That's an amazing approach to solving the memory issue.
RIP Billy Mays
I know it might be just me, but not only do I love the grit and passion put into the older systems, I miss the 'push'. The Genesis/SNES every now and then had teams that would 'push' it to the limit, just for the sake of it. The PS2 had Shadow of the Colossus, perhaps the last stallion in the barnhouse. In the last 15 years ingenuity has taken a dive. PS3 becomes PS4 becomes PS5 like an expensive domino effect. But... has trading ingenuity for ho-hum CGI sexiness made us any happier in the long run?
Very interesting! It reminds me of the interpolation used in game networking, where they can determine where network-managed objects should be at a given timestamp. It works with any frame rate even if it was only getting data about objects at a low rate (as low as 20 per second) because it interpolates between the 2 latest packets. A little bit different, as you cannot predict in advance where objects will be (especially players), but still similar in the sense that it uses math to figure it out from a timestamp. Thanks for sharing!
This is great! Well done to you and your team!
I always wondered how the lego series rendered so well on the ps2
1 second of time passes for the player, but days, weeks or months pass for the screen.
Please keep doing these videos! It's teaching me so much!
and my supercomputer drops frames trying to render a couple sprites in browser games
Modern browsers are the most inefficient and bloated software ever to be created, because of "it has to work with whatever the user makes" philosophy which kept adding complexity and there were never any revisions because it would break old web pages. At this point, a web browser is an entire OS that exists just to stream apps off the internet instead of downloading them.
@@michaelbuckers this is a really interesting premise. Is there anywhere I can watch/read more about this?
@@DavySolaris you can start at w3c.
I don't code, but I can follow the logical explanation every time.
You guys are crazy and very smart.
Sony's approach to PS2 and PS3 design seems to be the same.
“Now this is currently rendering just 32 particles, so lets emit a few more” me: 👁👁i see..... “okay, so now we have over 1000 particles” me: 0_0
Reminds me of the onin mini game in Jak 2 or the particles used for geyser rock when it's in the background in Jak 1
One of the secrets to coding reality.
I miss GameHut, it was a better name, and it had a really good logo
That channel still exists dawg, this is just for coding secrets
Limitations spark creativity.
6:35 That's why I subscribed
You are amazing
fantastic
That's incredible! But can this particle engine support a moving source?
If the moving source followed a path that didn't have to keep track of state, like an ellipse or something, then yes it could be modified with minimal effort to also calculate the position based on where the starting position would be at a certain time. However, even if you're trying to do something that does keep track of state (like having particles follow the player) the other advantages of the system, plus the fact that you likely wouldn't have too many particles rendered like this, would keep it feasible.
I'm answering from my own knowledge, so it could end up being faster or slower than I figure.
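To make the stateless idea above concrete, here's a minimal sketch (my own, with made-up names and constants, not anything from the video): both the emitter's elliptical path and each particle's ballistic motion are pure functions of time, so nothing has to be stored or updated per frame.

```python
import math

def emitter_pos(t, cx=0.0, cy=0.0, rx=4.0, ry=2.0, speed=1.0):
    """Emitter travels an ellipse -- derivable from time alone, no stored state."""
    a = t * speed
    return (cx + rx * math.cos(a), cy + ry * math.sin(a))

def particle_pos(t, birth_time, vx, vy, gravity=-9.8):
    """Particle position = where the emitter WAS at birth, plus ballistic motion."""
    age = t - birth_time
    ex, ey = emitter_pos(birth_time)   # recompute the spawn point from birth time
    return (ex + vx * age, ey + vy * age + 0.5 * gravity * age * age)
```

Because both functions depend only on time, any particle can be evaluated at any moment, in any order, with no per-frame bookkeeping.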
@@ericadigiulio9639 I'm pretty sure that following a player isn't a problem.
Watching the explanation, I can only think about how Burning Rangers was made considering the limitations of the Saturn.
Cool! I did something similar for calculating the positions of particles in Blender 3D Game Engine v2.60 in 2013, but in the vertex shader.
Did you work on Wrath of the Cortex? That is one of my favourite childhood games!
I'm glad GPU-based particles have become the norm now; it works in a similar way. Instead of making the CPU do all this math, let the GPU, which is built for multiplication with its floating-point units, do it. Super fast, and it allows for more calculations like vector fields and letting particles react to light and shadow. But not every game system has a good implementation.
So does that mean every particle goes through the same colors in order?
Yep. Same as how random numbers are (usually) generated. You have a seed value, and an algorithm that calculates the new value based on its phase in a cycle.
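A toy sketch of that cycle idea (my own illustration, not the game's actual code): every particle walks the same palette in order, and a small seed just offsets its phase, so the color is always recomputable from age alone.

```python
# Hypothetical fire palette: yellow -> orange -> red -> smoke grey
PALETTE = [(255, 200, 0), (255, 120, 0), (200, 40, 0), (80, 80, 80)]

def particle_color(seed, age, rate=8.0):
    """Every particle steps through the same palette in order; the seed
    only offsets where in the cycle it starts, so no state is stored."""
    phase = (seed + int(age * rate)) % len(PALETTE)
    return PALETTE[phase]
```

Two particles with different seeds show different colors at the same instant, yet each one's color can be regenerated at any time from its seed and age.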
DAVE DOOTS!! WOO, I think I worked with this Brother TOM DOOTS, after he finished being a Tester for TT
Cool story bro
Dootson, not Doots - so close
okay, so that saves cpu time. but why can the gpu render all of that??
GPUs have a lot of slower, weaker cores (the RTX line has 2000-4000, compared to the 4-12 in a standard CPU) that are designed for "parallel" tasks - doing a huge number of small, quick calculations rather than a smaller number of larger, more complicated calculations. Like, say, rendering every polygon in a frame.
GPU architecture is designed to execute a very limited set of instruction (or calculation) types, all of which are geared toward geometry calculation. Also, most of the math can be computed in parallel, hence it is easy to multithread. That's why GPUs have hundreds or thousands of cores.
CPU architecture is designed to handle many types of instructions, or should I say, many different types of calculation. And most instructions can't be multithreaded, because by the nature of the instruction itself it can't be split in two. For example, I can't split a routine that waits for player input to generate an outcome into two parallel tasks. However, I can easily split the rendering of a scene into many parallel tasks.
3D rendering is just a lot of repetitive geometry math.
Whereas game logic, or any non-gaming logic, is complex and does many different and unique things.
A GPU will always do the same math; all that changes are the parameters of the calculation and the results.
A CPU will always execute very different instructions, which change not only the parameters and results, but what is actually calculated to begin with.
Except ignore the previous answers, because that's not how the PS2 GPU worked. It was a fixed-function hardware GPU with no programmable shader cores. Rest assured, the GPU in the PS2 could keep up with the output of the VPU, as the GPU's raw speed was upwards of 50 million triangles per second. The actual speed depended on what per-pixel features you had enabled and the size of the triangles. With all features it could still manage an estimated 15 million triangles per second. These numbers are approximate and are based on an average triangle size of around 50 pixels. My main point is that the fixed-function hardware was more than fast enough to draw this number of particles per second.
In short: 1) Compute from scratch using age-based formulae, 2) use LUT (look-up tables) with interpolation for visuals.
Regenerating data is faster than loading, iterating and storing it. GPUs are similar, using a look up table can be slower than computing something from scratch in many cases.
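The LUT-with-interpolation half of that summary could be sketched like this (my own sketch; the table here is a hypothetical 5-entry alpha curve, while the tables in the video were larger):

```python
def lut_lookup(table, t):
    """Sample a keyframe table at normalized time t in [0, 1],
    linearly interpolating between the two nearest entries."""
    t = min(max(t, 0.0), 1.0)
    pos = t * (len(table) - 1)
    i = int(pos)
    if i >= len(table) - 1:
        return table[-1]
    frac = pos - i
    a, b = table[i], table[i + 1]
    return a + (b - a) * frac

# e.g. an alpha curve: fade in fast, then fade out slowly
ALPHA_LUT = [0.0, 1.0, 0.8, 0.4, 0.0]
```

A small table plus interpolation trades a tiny bit of math for a lot of memory, which is exactly the balance the reply above is describing.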
This is by no means a dig at the explanation here, which is great, but every time he says maths to me he may as well be saying magic.
No, I never learned my times tables.
give it a shot
Hah, that's weird.
Although I'm not a game programmer and I don't know anything about the inside of a PS2, the way you solved the problem is the only way I would consider implementing it and I was wondering why.
I have a background in parallel and distributed simulation, so '3D games' seem quite similar.
It sounds like this is what GPUs evolved into.
I am tempted to make this in an OpenGL shader.
Everything nowadays is done in high-level languages. But I really should look into some of the lower-level stuff some day. Seems very interesting.
You'd need to find a device that is well documented and has compilers readily available. MVG did a video on how to program the Game Boy. Couldn't care less about the Game Boy tho XD
Me :
For(i=0, i++, i
I made something similar with Unity's DOTS and shader graph on my phone.
Ships flying and shooting at the player. By doing all the transformation inside the shader and calculating the same math on the CPU,
I could skip all the CPU's reading and writing of the ship and missile positions.
But then I got limited by the number of triangles my phone could handle....
So if I understood this correctly, and it's possible I didn't, the math is basically like a standard curve on a graph. The whole y = mx + b principle?
If so then that is pretty clever. I don't know why that wouldn't be more common, because that's the method I'd use if I had to solve that problem.
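That intuition is roughly right: constant-velocity motion is literally y = mx + b, with the spawn position as b and the velocity as m; gravity just adds a quadratic term in age. A minimal sketch (hypothetical names, my own illustration):

```python
def height(age, y0, vy, g=-9.8):
    """Closed-form height: the 'y = mx + b' line (b = y0, m = vy)
    plus a quadratic gravity term -- no per-frame state needed."""
    return y0 + vy * age + 0.5 * g * age * age
```

With gravity zeroed out it collapses to exactly the straight line: height(age, y0, vy, g=0.0) == y0 + vy * age.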
So if you give a processor the state of the universe at its beginning, and its laws, you can prove determinism :)
it has to be specifically the ps2's vector processing unit though
no other processor can handle such power
Allocation table FTW
Of course, one drawback of this is that the particles cannot interact with the world, i.e. to do lighting etc. So all that stuff has to be thought through by the CPU in advance if the particles are expected to interact with the world. Kind of like how player models are rendered by the GPU but the hitboxes are handled by the CPU. (correct me if I am wrong of course)
Correct, no collision, and particles are unlit. This entire process only works because it is a purely functional and deterministic vertex shader, and there is no pixel shader... On a modern GPU this technique could still be applied in a vertex or geometry shader, with per-pixel lighting done in the pixel shader. Collision would still be an issue that would elude this technique.
So, useful for some very programmery-looking sparks and smoke and flames and stuff?
I love these videos.
A little question here. Are these "impossible" code tricks the reason for the idea that "PSX and PS2 were very hard platforms to develop software for", or were their GPU/CPU just way too different from any other hardware?
It's a shame the PC can't benefit from almost any tricks; the only ones I'm aware of are Doom (fast 3D illusion), Quake ("low" requirements for real 3D and some lighting) and Quake 2 or 3 ("free" "dynamic" lighting).
The truth is most PC games up until the late 90s had various implementations of software rendering, as GPUs were still in transition from vendor-specific APIs to more universal ones. So you'll have a hard time looking for comparable features and hardware "tricks", as everything was mostly standard in feature sets while also being diverse in power: games were built to look nice on the devs' computers, with options to reduce quality for entry-level consumer machines, much like today. That being said, advancements were mostly made in pursuit of higher polygon counts, screen and texture resolution, and basic effects such as transparency, lighting, physics, reflections, etc. Most advancements were made by independent groups from the demoscene that later joined or formed game companies, like our host here, so you might want to look at productions from the time on pouet.net, for example.
If I had to name one notable feature or "trick" that blew my mind back in the day, it would be using the CPU for voxel-based terrain rendering like in Comanche, Delta Force or Outcast.
@@TheSliderW You're entirely correct, though I can think of one clever use of PC hardware undertaken in Wolfenstein 3d, at least (and I assume Doom): on VGA hardware you can disable chain 4 (which is a feature that makes video memory look linear, even though it's really planar) and then use the map mask register so that anywhere that you have up to four suitably-aligned columns with the same pixel contents you can write them in the amount of time it'd otherwise take to write only one. Which happens a reasonable amount of the time when walls are wide and use low-resolution textures.
I guess you could argue other Mode X effects too, but those are mostly about the previous generation of 2d gaming.
this euclideon type stuff?
What were the limitations of this approach?
I'm no expert, but the limitation should effectively be that there's a really limited range in which the particles can be "random," e.g. each particle in an emitter will definitely have a certain colour at age X, and each particle will have the same rotation at age X.
You would need to limit the time they could exist, obviously, or else it'll eat memory and processing time.
They basically function in a black box, too, so there's no physical interaction anywhere between the particles and the 3D terrain or world.
So it'd be very difficult to get the convincing visual effect of, say, a splash animation with droplets scattering on the ground. It wouldn't look random enough, and they would phase through the ground if they didn't "evaporate" before that.
The use case would be for ephemeral effects like sparks, fire or light.
To extend from this, he didn't hint at whether the particle is still drawn if it's hidden/occluded by something (which wastes video memory, but that calculation might waste processing time, so it's a trade-off). But the GPU may have its own system for that.
I'm not shit-talking. Pretty stunning bit of coding. Maybe if @CodingSecrets comes along he can correct me or explain things that were glossed over in the video.
Honestly, this is pretty much the approach used in modern engines (Aside from the fact that they support many more parameters). Modern vector units are now often integrated into modern CPUs directly, so the back and forth between them takes a lot less time, as well as the simple fact modern CPUs are insanely faster than those of the PS2. For some modern particle systems, the GPU instead acts as a sort of VPU, but again it has a lot more memory and so can take in a lot more data. The GPU's parallelism, support for relatively generic code plus the fact it's already where you want to output your triangles anyway makes it incredibly fast, with the main cost that you can't do complex collision calculations since you'd otherwise need to send results back to the CPU.
I guess one of the limitations of this approach is that it is tied to the frame rate of the game. The particles will slow down if the frame rate dips below its target, and if the game runs uncapped above the targeted frame rate, the particle simulation will speed up. The frame rate needs to be constant, with no variations, for it to work. You also need to design your particles with knowledge of the frame rate they are targeting, be it 30 or 60 fps. If you designed them for 60 fps but later in development the game is capped to 30 fps, the particles you made will run at half the speed of what you had in mind, which means you have to go back and double the variables for each particle. The same can happen in reverse too, doubling the speed of the particle simulation, thus needing to halve the values.
@@yopeaceable The particles here are being interpolated so they can be decoupled from any frame rate. The maximum number of key frames is fixed though (I think the video said a table of 64?).
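The difference between stateless age-based evaluation and per-frame integration can be shown with a toy comparison (my own sketch, made-up numbers): the age-based version gives the same trajectory at any sample rate, while the integrated version stays correct only if dt matches the real frame time.

```python
def pos_from_age(age, v=3.0):
    # Stateless evaluation: the same age always yields the same position,
    # no matter how often (or rarely) you sample it.
    return v * age

def pos_integrated(frames, dt, v=3.0):
    # Per-frame accumulation: correct only if dt matches the real frame time.
    p = 0.0
    for _ in range(frames):
        p += v * dt
    return p
```

If a game designed around dt = 1/60 gets capped to 30 fps, the integrated particle covers only half the distance in one real second, while the age-based one is unaffected.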
7:20 Alpha is greys or light, I think right? What does Jibber mean?
Alpha has to do with transparency (or opacity, however you prefer to think of it)
Some games, even today, absolutely waste processing power and are poorly optimized, especially on phones, which were never meant to do any kind of gaming.
Devs would do well to watch these videos and be inspired to optimize their games in novel ways, include novel game mechanics, and also crunch their data usage and load times down in their creations.
That said, if something is possible then by definition it is != impossible ;)
currently trying to recreate this in Blender haha
This channel just makes me think that back in the "good ol days", programmers got to flex and circumvent hardware limitations.
AFAIK in the modern day, doing something like this on the PS4/XBone and onward is practically illegal and might prevent your game from being published. Correct me if I'm wrong!
Huh... though, how would it work on other systems? Is the (possible) lack of VPUs why the GameCube version of Wrath of Cortex runs slower?
But can you run it at native 4k, 120 fps? As we all know, that's all the REALLY matters.
So, I've seen twice now that you mention "in a previous video", but I'm watching your videos in upload order and there is no previous video like the ones you mention. What's up? Did some stuff get taken down?
Fantavision
Mad props for the great solution :)
However I would assume that it did not have attractors/deflectors for the particles?
The way described to batch the data would heavily imply no.
That said, it might have been possible to calculate the transforms first by processing the attractors/deflectors in the view. Then it's really just a matter of whether all the data from a previous batch is automatically purged upon receiving new instructions or data, because if not, you could reuse the previously generated transform tables and use something like the distance between that system and the current vertex to create a mapped value for interpolation, as they did for color and shape data, to keep the transform table slim... but I still can't imagine that would allow for a crazy number of transforms while still having the bandwidth to do operations on so many emitters and emitted particles.
This just in: PS2 is faster than your pc
Where is the "previous video" (Particle Systems for the PS2)? I can't find it.
Check my GameHut channel
So you're simplifying things by figuring out all the different properties from only one variable (along with this one variable being able to change all the other properties differently based on their initial starting values)?
Is this kind of optimization thinking dying out among programmers and developers in the industry as a whole? Deeply fascinating stuff, but I can't help but feel it's rapidly becoming a lost art with the general-purpose computing power at our disposal. Nice video~
It's dying in the industry as a whole because the industry programming constraints are wide in scope. This art of optimising the software process is useful for programming a single hardware platform; lessons about specific ideas can be shared between hardware platforms but the code itself cannot be shared. When the industry is required to target multiple hardware platforms at the same time for their game title, they're not spending the time to think about these nifty tricks for each hardware platform. Unless there was some constraint that mandated a certain game feature to be performant at interactive rates, they will choose to rely on processing workflows that target the lowest common denominator between the hardware platforms.
Xenoblade gives me hope for the art of optimization. XCX on the Wii U was a fucking MIRACLE.
Heyy ! Can you show us your ps2 devkit? Is it software or hardware?
Oooo reupload
How much if any work did you do on the GameCube?
Couldn't you also use this to create keyframe animation of 3D models? Seems incredibly versatile.
I want the source code for this. And I want to make it run on the other 6th gen consoles, like the XBOX and GameCube. It would be pretty difficult, though, because the hardware is completely different.
Well, on the Gamecube you don't have to worry about the T&L part, only the physics. The Gamecube CPU has SIMD instructions, so one can apply that as well.
And Sony claimed PS2 did 60,000,000 tris a second. Lol.
I remember them jerry rigging their E3 demo, they added a TON of RAM into a devkit and streamed pre-transformed geometry into the GS. All the CPU was doing was DMA transfers. I guess that's where that magic theoretical peak number comes from.
Your a wizard Harry.
You're*
@@jamess.7811 yer a lizard hairy
I'm guessing because of legal issues, you can't show the actual code itself?
I've rewritten this comment about a half dozen times at this point because I'm not trying to come off disrespectful, but I'm wondering what your thoughts are about devs who get excited about better graphics/sound/story/etc. and then forget to make the game actually fun to play. Is it an easy trap to fall into? Have you ever had that problem? I like better graphics, obviously, but there's no point in beefing up the visuals if the game isn't fun in the first place.
Lol, 666 at 2:45
no need for 2 triangles per particle
Ah, that 4KB memory.... Sony really wasted a huge amount of the vector units' processing power because of some cost-cutting decisions...
I dunno why but this channel always feels like it's missing something at the end of each explanation. Like, there always seems to be just one tiny bit of information missing that makes it impossible for me to understand, yet sooooo close.