My favorite was a mailing list message in a billion dollar company that went something along the lines of: " Okay, this is embarrassing, I was having sex with my girl friend (yes I have one, Rob!) and in the middle of that I had an epiphany on how to solve this issue. I interrupted the sex for a second, wrote down the note, and resumed. When done, I tried this and it worked perfect. I can't remember why it works, those cells now belong to my girlfriend who will probably now use them to take over the world or some such, I don't know, but this bug was here for five weeks and now it's gone, so I don't want to here anyone complaining about the fix not being documented. Just accept that it is and move on. "
I have some code commented by me to me that reads. /* * What the actual FUCK is this clusterfuck? * * SORT THIS SHIT OUT MAN * * Comment the code * * There is a bug possibly related to this code. * * * Coordinate data is loaded correctly and lookups work correctly until you are handling data near 0,0,0 where the data is still loaded correctly but returns data at 0,0,0 when looking at itself. * * * * Don't you fucking dare delete this until you have done your job properly. * */ The code commented is still there because the drive I was on went pop right after I deleted the backups to set the drive up for raid thinking I had time to do that and make a backup without worrying about the main drive dying. 8 year old drive chooses that moment to go.
"The vessel that houses interdimensional being John Carmack" "Texas Based technocryptic John Carmack" "The resident of the binding between space and time that holds our reality together John Carmack". -Civvie 11
I don't want to complain, I want to be constructive. I feel like your video simultaneously assumed the viewer understood a lot of high-end programming and math, but also talked down to them like a child. You had a lot of high concepts that you didn't really explain well surrounded by a lot of other concepts you explained down to a child. You kept jumping from one to the other and it was really annoying; I felt like you were trying to subtly insult me. If you want to be technical, be technical. If you want to explain things for the layman, then explain things for the layman. But don't do both.
simple solution: dont mind being talked to like a child. we are all stupid. so are you. its in our genes. too much self esteem is like too much sugar. rotting.
Yeah, I learnt game coding in the late 90's... And let's just say, at that point in time EVERYTHING was hacked and abused for speed. Bit shifting integers around was the norm because it was generally much faster than multiplication and division. (well, until you got the pentium that could do integer multiply/divide in a single cycle, at which point that no longer mattered.) In fact, bit shifts are the defacto way to implement multiply and divide routines in assembly if for whatever reason you're working with a CPU that has no hardware multiply/divide circuitry whatsoever. The key point being that a right shift is a divide by 2, and a leftshift is a multiply by two. Depending on what you're working with, a shift instruction may be a single bit (say a 6502). each shift is one bit at a time, but it takes 2 cycles. On some systems, like the x86, you can specify how many shifts to perform as a single instruction. An interesting aside here, for optimisation on old hardware, especially where each shift instruction is a single bit, is that moving a value around in memory by one location is equivalent to shifting it 8 bits. (so, for instance a 65186, which can read 16 bit values, and can perform shifts and rotates directly in memory, but only a single bit per instruction, could simply read a value and write it back in a slightly different memory location to mimic an 8 bit shift.) Of course, this is a well known and relatively straightforward thing to do with integers, but when you start applying it to floating point numbers, well, that's a whole other can of worms... Then again, there are a bunch of other common solutions to problems with square roots. A common solution in general to anything complex is a lookup table. Pre-compute a bunch of values then the instruction in question becomes a memory lookup. But, when doing this you're trading memory for speed. It's plausible when you have 256 values, but the more values needed, the more dubious it gets. If you had 32 bit values, and needed to access them using a 16 bit entry you'd need 65536x4 bytes - 256 kilobytes for a table. If you wanted to use a lookup table for a 16x16 multiply you'd have a 4 byte result, and 4 billion entries. - that's a 16 gigabyte lookup table! - if you have a system with 16 gigabytes of memory lying around, chances are you don't need a lookup table for something like this. Though... It might surprise you to learn that the hardware equivalent of a lookup table allows you to replace pretty much any CPU instruction with a memory chip instead, and this is guaranteed to work at a constant speed, unlike many implementations of actual functions using logic gates. Other tricks you can pull are contextual. Consider comparing the distance between two objects - if you have X and Y coordinates for each object, the distance between them derives from pythagoras; Distance = Sqrt ( (X1 - X2)^2 + (Y1 - Y2)^2 ) So if you you wanted to test whether this was within a certain radius of a point, for say, collision, you'd calculate the distance between these points, then compare it to the distance. But, you've got a square-root in there, which is a pain to calculate. Well, what if your test distance was the square of the distance? Comparing the square of two distances gives the same result as comparing the distances directly, yet now you don't have to do any square root calculations! Sounds pretty good, right? Well, there IS a downside, as there nearly always is, when it comes to tricks. The square of the distance rises exponentially. 1^1, 2^2 = 4, 10^2 = 100, 100^2 = 1000. Because of this, if you're testing large distances, there is a possibility you'll get an overflow, which will create a bunch of weird bugs if you're not careful. But it's otherwise a surprisingly simple trick to avoid the need to use square roots...
@@navi-charlotte I think there's a fine line between the two. Optimization is trying to reduce computational load by figuring out how to get rid of reduntant calculations. I think both lookup tables and the distance example in the OP are just optimizations. A hack relies on using side effects and arguably bit-shifting exponents is a hack, albeit a hack that has been used so often it doesn't seem like using a side effect anymore. The WTF line from the code is definitely a hack - by the time you comment your code this way it's fairly obviously not something the language itended as reasonable code and it wasn't an intended effect of that line to work as it does. And if it's not an intended effect, it's a side effect and hence using it is a hack.
Reminds me a bit of that time when I needed to find all buildings in a certain range around some sample point in a geographical database. The database had several 100K buildings which each had several vertices. Finding these buildings would actually take around 30s if I remember correctly. Sounds fine, but I also needed to process many tens of thousands of points every night. 30s just would not cut it. What I found out is that this database had a spatial index I could use to speed up bounding box operations dramatically. It took like 6 hours to generate this index, but that only needed to happen once. Instead of looking for everything in a circle around the sample point, I looked for everything in a square, and later filtered those results down. The time dropped from 30s to around 20ms per sample. That's 1500x faster by changing a single line (the query) and adding a 5 line for-loop to filter that result down. That did the trick. I could process a whole day's worth of data in about 2 hours. And like you mentioned, a lot of it is caused by avoiding square roots, while the other part is trading CPU time for memory. I don't remember how much memory the index used, but it was not a problem for a relatively modern desktop PC. Memory is so cheap these days, there is a lot you can get away with... That project is still the most dramatic improvement in performance I've done in the real world. I'm still really proud of how it worked out. It's not something I get to do very often anymore, but I think optimization problems are really satisfying to work on.
@@Niosus I can beat that. I worked on a statistical problem and a straight forward implementation of the maths resulted in crazy computation times for my use case - on the order of 10^28 years. I had a function that got called n^n times (my largest n was 27, my advisor had one in the 50s) and it got called with tuples. But I noticed that it did not depend on the order or multiplicity of the numbers. So I got that down to 2^n calls (basically calling it for the power set of an original set, rather than all n-tuples you could get from that set). That got my computational time down to about a week, but my advisors data would have still taken about 1 million years. Then I worked out a way to split the problem in some cases (and most real life examples allow that) and instead of a single problem with the original n you could work on problems of size m and n-m. I analyzed my advisors data in about an hour. Total factor ends up somewhere around 10^35.
Quake 3 lighting was done by pre-computed lightmaps. The real-time lights weren't even as real as they were in Q1 and Q2. In this case, they just projected a light texture on the Z(Q3's up) axis and did some sort of blend mode on top of everything else. The blend mode wasn't truly additive, nor was it a straight multiply. It just sort of scaled the existing color up by the RGB value of the light.
none-lightmapped dynamic lighting in Quake was lambert shading on the models. They don't require a inv-sqrt either, they use N.dot(L) per vertex. This video is quite ignorant unfortunately.
She wasn't wrong. What she says applies mostly to illumination of dynamic objects (like weapons, enemies or players), and sometimes static geometry (walls, floors, ceilings, water) ones that were wrapped (textured) with precomputed light maps. Quake had moving lights (for example light from rockets from rocket launchers, or from player holding quad damage). These lights interacted with precomputed lightmaps on static geometry too.
The trick of bit-shifting to do mul and div in the innerloops of Amiga-demos in the demo scene were common in the late 80's early 90's. Often combined with sin and cos lookup tables that was calculated and stored first thing on execution.
Actually, when I was coding Amiga Demos for a well known scene group, you used to avoid shifting also. The problem with 680x0 shifting is that it didn't have a barrel shifter, so the more shifts you did, the more expensive it was. So for something like lsl.w #2, d0 its timing was 6 + 2n. Where n is the number of shifts. It was 8 + 2n for a long shift. Anyway, you can achieve much better performance with add.w d0, d0 add.w d0, d0. Same as you would use lea to add to address registers rather than using add or subtract. These are all very common coding practices for demo coders, the same as putting arithmetic instructions directly after starting a blit, because you could essentially get them for free. And also Sin and Cos tables were rarely if ever calculated, they were always precomputed and we'd got tools for doing just that and included as dc.w tables.
@@thewelder3538 You are probably right - it is so long ago and I might mix up CPU's. I just remember I was working on a simple 3D-engine for the Amiga and seem to remember I was pouring over minutiae in the assembly-code optimizing each and every line. I do seem to remember that I did some sort of pre-comp with subsequent storing of sin/cos in memory. Another thing I remember was some super-quick/cheap way of determining the facing of a polygon (for bf culling)- which at first seemed like a complex operation but end up being only a few lines of code with cheap instructions.
A trick I've used to allow quick plotting of polar coordinates is to exploit the fact that for values of r in the range 0 to 2, one can compute r*cos(theta) and r*sin(theta) by computing d=arcos(r-1) and then computing cos(theta+d)+cos(theta-d) and sin(theta+d)+sin(theta-d). Use table lookups for the trig functions and no multiplies will be required,
Do we know if id hired someone from the demoscene or did they just hear about this trick from it and not just understand it (hence the WTF)? One up for the EU (EEC??) side of the pond!
"Lies"? It is skill, and all about speed/efficiency and something programmers always have to contend with. Same applies to traditional cell animation .. there's a reason they had layers (vs re-drawing everything for every frame). To have a game run and cleverly mask limitations while keeping the user in-game was art. It's not deceitful puritan, it's an art-form.
I'd rather call it a craft than art. You can't just splotch stuff somewhere, see if it works and be done. You really need to know how stuff works. Artists rarely do, they usually only care about results ;-)
Secondly, they likely used gourad shading. That narrows things down to calculating that for only each vertex, and then it fills in the blanks between the vectors.
3:40 I recall doing Lambert shading even earlier, using integer math. I'm the architect and co-author of the Viewpoint graphics library for DOS (and bare metal), which was the _fastest_ graphics library in its day. Consecutive pixels on the same surface will have values very close together -- that is, it doesn't change much from one pixel to the next. So, Only one iteration of Newton's method is necessary if you use the previous value as a starting guess. Oh, and the slowest part is the Conditional Jump instructions. So avoiding the conditional loop "am I done yet or need another pass?" makes a big difference.
It was a pretty informative and well done video, if i could suggest one thing is that slow the pace on the puns, they feel somewhat forced Aside from that, great video!
The whole point of the video (the magic number) was mentioned in about 10 seconds at the end followed by "I don't know how this works". Could I suggest that you research how it works and then tell us? That would have been a lot more informative.
I wrote this comment in another thread, but I summarize what I could understand from the code. Mind you I came to the conclusion before looking at the wiki page and it does coincide with what is written there as well. I don't see why he doesn't understand it, it's simple to understand. Any function can be represented as a polynomial sum so y=f(x) can be transformed into y = a0 + a1*x + a2*(x^2) + a3*(x^3) + a4*(x^4) + ... The problem arises that square roots and esp. inverse square roots and have an infinite number of sums. But if we stop at some x^n we will have a result at some precision. So for 1/sqrt(x) they discarded all members above the second. The task has now came down to calculate a0 and a1 in order for y to have the least difference from the exact value. They have calculated that the most appropriate values are a0 = 0x5f375a86 a1 = -0.5 which is y = 0x5f375a86 - 0.5*x and is the same as i = 0x5f375a86 - (i >> 1); However this approximation isn't enough precision so they use one step of Newton's iteration x = x * (1.5f - xhalf * x * x). All of those casts in the code were to cast the floats into an int for faster calculation then you recover the decimal precision approximation with the later half of code. Once you understand that the hex constant 0x5f375a86 is really just the integer representation of sqrt(2^127) it becomes more clear why this approximation works in this case.
Pretty much all of Q3's effects and features are really designed to hit GL 1.1 compliance with little extension reliance - most extension use is for optimizations. The most interesting to me is the fake fog system they have. Doing that ensures they've got fog on anything they can put an alphablended surface on, ensuring visual consistency between all those vendors and their varying quality of OpenGL ICDs in the consumer 3d wild west of the '90s (3dfx, 3dlabs, intel, nvidia, ati, rendition, powervr, matrox, trident, s3) Q3 did save a bunch of cpu power by having a precalculated lightgrid though, rather than evaluating point lights everywhere (it only did that for the few dynamic lights in the world). However they still use the slow sqrt calculations for this. q_rsqrt's only used for speeding up a few non-essential shader effects that have nothing to do with lighting. Like shiny environment maps, wavy normals (ctf flags really use them), specular alpha (which q3 barely has at all - xaero's pants are this) and flares (which q3 dummied out). OA does abuse these effects gratuitously so the use of OA footage is on point at least :P other than shaders it's also used for projecting "marks" on the walls. (scorches, bullet holes, blood stains) My only real criticism of the video is that the overbrights didn't get captured so your OA footage looks dark - another fake thing id did to get more volumnous higher range lighting by automatically cranking your brightness up while making non-lit elements darker. Another extension dodge there (using combine operations is a more commonly acceptable way of doing this on slightly more modern multitexturing-capable 3d hardware - OA's recent engine updates on git use a blended quad or a fragment shader to do this now, a quick and dirtry fix reaction to all the dark OA problem). The only other big Quake coding topic I can think of is the surfacecache in the old quake *software* renderer - responsible for saving so much lightmap blending with the lightmap's texels filtered, and going further explainingthe abrash assembly magic would be a huge feat. or maybe the span drawer that is infamously pentium optimized
The key to such constants, including this particular one, is to find a series which is equivalent or is close enough approximation of the function you need in the scope of interest; so that instead of calculating that function (1/sqrt(x**2+y**2+z**2) or 1/sqrt(x**2) here) you add values, instead of computing through multiplication, division etc. Search Taylor series, for example - that might be of help to someone who needs to do similar 'sorcery' on their behalf
All game code is "lying". All of games are basically a stage magic act. - You're talking about Quake 3 Arena, not Quake -Also not 30FPS, never ever 30FPS. Even when Quake 1 was new >50 was the standard. When Quake's multiplayer popularity grew, 100FPS became the desired one, and 100Hz monitors were often cited for helping the greatest players. -iDSoft didn't come up with the method, it was Silicon Graphics. -It's a floating point DECIMAL. More accurately, single-precision floating point decimal. Please don't keep repeating the "30 times per second", it was never a thing in computer games. Even Wolf3D's updatecycle is 50 times a second. -I "get" it that there's nothing much to get. It's bitshifting the single precision to work around its irregularities predictably. People have been doing this since the early 90's. The magic number just ensures that the vector is normalized. Don't take the criticism as a deterrent. Your enthusiasim is great and entertaining, and i just accidentally stumbled onto this video, but there are quite a few points that are off by such a margin i have to comment on them.
You're correct, although with Doom there was a 35Hz update cycle hardcoded in the engine, until numerous source ports fixed that, allowing for rendering any number of frames your machine is capable of, per second. gldoom and chocolate doom must do this, if I recall correctly
*_- You're talking about Quake 3 Arena, not Quake_* She's showcasing and looking at the Q3A Source., but the function in question was originally written for ID Tech 1 (aka Quake 1) … albeit in ASM *_Also not 30FPS, never ever 30FPS. Even when Quake 1 was new >50 was the standard. When Quake's multiplayer popularity grew, 100FPS became the desired one, and 100Hz monitors were often cited for helping the greatest players._* I'm not even sure where to begin with this. 50Hz was *never* a Computer Display Standard... and I don't mean "Well outside of Europe it was uncommon", I literally mean *NEVER* … 60Hz was the Commodore Standard, 72Hz was the IBM/PC Standard. This changed with the introduction of VESA XGA (1994) where it was Standardised to 60-75Hz. Of course Displays could strictly speaking support Half Rates (i.e. 31 - 38Hz in 15.75KHz stepping) meaning the Full Supported Range would usually be 31 - 75Hz; adaptive to the input Signal but most would just use VESA VBE (or DOS32X) that would basically use a fixed 60 / 72Hz. Wolfenstein used 70Hz, Doom used 35Hz, Quake used 30Hz, Quake 2 on-ward used OpenGL (Desktop Refresh, Default: 60Hz) This is the first time I've *EVER* heard of "100FPS became desired and 100Hz monitors were often cited helping the greatest players" … which I played Quake and Quake 2 in early "eSports" (before it was call that) Competitive Tournaments. It sounds to me like the same bullshit you tend to hear from people about "PlayStation Games ran at 60 Frame/s, why can't Modern Games" … good god that always makes me laugh hard, because most PlayStation Games were barely capable of 30 Frames/s; and even then the Consoles output to NTSC (60Hz) and PAL (50Hz)., and had to support both without it affecting Gameplay and Audio like it did on 4th Gen Consoles / Computers. You have to keep in mind that Scanline (Interlace / Progressive) Vs. Parallel Raster are two *very* different approaches to Display Image Output, just as Masked CRT Vs. Individual Pixel LED … again have very different outputs. Hook up a Console with a Dual Output (Old CRT and New LED) and compare them side-by-side... you'll very quickly understand why we Nostalgically felt the games looked and ran better of "Native" Hardware., and in many respects it's because CRT Displays just naturally Resolved a lot of the Modern Issues we have with Games. *_-iDSoft didn't come up with the method, it was Silicon Graphics._* That's merely where Carmack sourced the approach., it was a fairly common Performance "Hack"... and the original version of it uses Fixed-Point Floats. There's an example of it in the Amiga 1000 Software Development Toolkit from 1985. *_-It's a floating point DECIMAL. More accurately, single-precision floating point decimal._* "Single Precision" is a Modernised Term. Floating Points are broken down into 3 Terms., Significant, Base and Exponent; which can either be expressed via IEEE, Real or Fixed. This is further expanded via Binary, Decimal, Hexadecimal and Octagonal (and I bet you've never even heard of the last one, but most Computers even today still support such in their FPU; as well, they've not really been touched in terms of base Design since the Late-80s) You're specifically referring to IEEE 754 approach., which wasn't specified until 1985 and wasn't widely adopted until 1990... it is a 32-bit DFP (Decimal) approach. *_Please don't keep repeating the "30 times per second", it was never a thing in computer games. Even Wolf3D's updatecycle is 50 times a second._* As noted above., Quake (ID Tech 1) uses 30Hz as the Base... Wolfenstein 3D and Catacombs 3D both used 70Hz... It's important to know this, as the Audio (MUS Format) doesn't encode the Tempo, and so in order to get the correct playback speed for Conversion; you actually have to know the Refresh Cycle for each Specific Game to rebuild the Tempo. *_-I "get" it that there's nothing much to get. It's bitshifting the single precision to work around its irregularities predictably. People have been doing this since the early 90's. The magic number just ensures that the vector is normalized._* It's called "Fast Approximation", and it's been commonplace much further back than the 90s. Even Floating Point Units themselves,. are by their nature Fast Approx. rather than Strict Precision … all the technique pointed out in the Video does, is essentially switch what would in most cases take a quite costly number of Cycles (MUL/DIV were typically 140 - 180 Cycles) to a more Fast Approximation result that can be handled in a different way; in this case with Bit Shifting (which typically takes 20 - 37 Cycles) while producing "Good Enough" Results for what is being Calculated. Is it "Lying" … eh … that's an academic argument at best. I mean look at it like this., a CLUT (Colour Look Up Table) provides access to the full 24bit Colour Range; while only providing 256 Colours for said task … and if you don't mind thrashing the Video DMA, with the added latency; you can switch it for each Surface Calculated allowing for the Display of > 256 Total Colours / 4 Billion. In either case it's still going to be quicker and more efficient than a 3 Byte Colour.
Kanakotka Good God, what planet were you on where people were playing Quake at 50 fps? “When Quake was new” was 1996. 3D accelerator cards were new, and even my 3dfx Voodoo card got an unbelievable 30 fps running GLQuake at 640x480. Maybe you’re confusing this with refresh rate. Hopefully her enthusiasm won’t be deterred by people like you, because I learned something interesting today. If you “have to comment on them,” it shows how fragile your own ego is that you feel compelled to put others down.
That's not really a lie. They just used some bit level hacking and a bit of magic to get a really good initial guess. The real question here of course is where did they get the magic number from. I'm guessing that whoever wrote the comment was a different person from who came up with the number. Knowing how this sort of things happen in my experience, I think that somebody came up with the algorithm for the initial guess as a dirty placeholder, then somebody else adjusted the magic number during testing while trying to optimize it, then they discovered that it works with only one iteration, but didn't know why and how you can possibly justify it. Hence the comment.
Kenneth McCall Ha. That's some sick math skills. So it has quite a history. EDIT: Holy shit, one of the guys involved in the history of that algorithm has the Turing Award for his work in numerical analysis. That's some serious brain power that went into this little algorithm.
Part of this is misleading. This method doesn't "light an entire game arena", it's only used for dynamic lights and entities. All of the static lighting is calculated when the map is compiled and stored in a light map, and that calculation uses regular old sqrt(), not Q_rsqrt(). It's still impressive, but it's not as impressive as you're claiming.
@@meshuggahdave5607 The devs know exactly what their code did and why it works. It's a myth and a joke because anyone with half a brain knew how this worked when they looked at it :-/
@@Folsomdsf2 well that's what you'd assume when they're getting paid and putting out games like quake... I thought this was a fishy video but now I know. Thanks
I agree, it was mostly just rubbish with about 10 seconds devoted to the actual point of interest which was this calculation that is never explained. The presentation is fine but the script sucks.
@@irlrp but you had time to watch and comment on this video? Weird thing to say. I would have got bored reading your comment but I didn't have time to because my time weighs more than gold...
Firstly: I love that this video exists... so thank you. It's lovely in-depth analysis. Awesome. That said: This often feels like a dad trying to use all the hip words. In this case, they are used correctly so the issue may be just be delivery. More problematically: I'm pretty sure the described method of lighting is ray-tracing; which is not what Quake was doing for lighting. Quake used surface-based lighting. The polygons were pre-lit (and even then, not with ray-tracing); and then, at runtime, the polygon's texture is tiled into a buffer, with each texel lit according to the weighted average intensities of the four nearest light map points, as shown in Figure Three. If dynamic lighting is needed, the light map is modified accordingly before the buffer, is built. Then the polygon is drawn with perspective texture mapping, with the surface serving as the input texture, and with no lighting performed during the texture mapping. Also: the video is wrong about precision. Floating point and non-floating point numbers have the same precision (precision is determined by the number of significant digits); the difference is the ability to change scale while maintaining preceision. 12345 and 12.345 are equally precise as one another; but without Floating point, you'd end up with 12345 and 12; which are not equally precise.
if you don't interrupt yourself every second for a cheap joke this may be a very informative and fun video. I watched it to the end, but by minute 5 it was an andurance. It's okay to make jokes every once and then, but not all the time, and less in a video that aims to be informative
@@Kalisparo Yeah, if I didn't know this was a personal channel I would think it was one of those UA-cam networks with the presenters reading off of cue-cards and trying WAY too hard with the fake enthusiasm/stale jokes etc.
And here is a fact: Carmack was not the first to invent the inverse square root. A similar piece of code was written for PC back in 96m by specialists from 3dfx. =)
This video completely failed to explain it's core topic and instead overexplained a select few components of a single math problem that was only tangentially related to the lighting system. This really needs to be completely redone or removed. You should have at least breifly mentioned light mapping and prebaked lighting (aka pre calculated lighting), fading light maps, sliding light maps, drop shadows (which doubled as drop lighting in Quake 1 for weapon flash and explosion effects), and a few other key effects that together created the lighting used in the Quake games.
"they had to get a bit shifty..... *bit shifty*" hahahahahha On a serious note though, the code is not difficult. 0x5f3759df is the floating point representation of the square root of (2 to the power of 127). And i >> 1 means: shift i 1 bit to the right, effectively cutting i in half (as explained in the video). So what happens is this: x2 = number * 0.5F // Remember half of the input number, will be used later y = number // Copy the input number into y i = * (long *) &y // Trick the compiler into storing the floating point input as a long. This conversion is necessary, because you can't bitshift a floating point number, or subtract a floating point number from a hexadecimal number. i = 0x5f3759df - (i >> 1) // Subtract half of the input from the square root of (2 to the power of 127) y = * (float *) &i // Now that the bitshift and subtraction is done, convert it back from long to a floating point number. This is already an approximation, but the result is refined in the next line: y = y * (1.5 - (x2 * y * y)) // This method of getting closer to the square root is called Newton's method, look it up. The longfloat conversion is tricky and depending on the compiler it may have a different outcome, because the byte order may be different from what you expect, and then a bitshift will mess up the value. In modern C we would use what is called a union instead. In a union, a float and a long can share the same memory location, so the conversion is not necessary anymore. This will not solve the byte order problem, so you still need to be careful and test it well if you want to use this trick.
In case this will one day will be remade. Say that this is about quake and fast inverse square root in the first 20 seconds. Keep most of info, but remove all humour, and keep pace fast. Use dos footage. Keep code on the screen longer. Add more graphs.
I learnt nothing new in this video, and not only that but i used the bit shifting in the 80s, i still do today ! Especially in javascript. That bit of code in the source of Quake can be found in Doom and Wolfenstein 3D aswell. The whole method has been used on the demoscene for years before those games. Early 90s games also used trig tables as integers. The technic was known as the fixed dot calculation. This video is really trivial.
Most of the lighting (like that on walls etc) was generated ahead of time. When you created a level, you needed to also generate the lighting, so it was not all generated each pass, much of what you see on static objects, like walls, floors and objects that don't move are generated when the level is created and that is it. Doors are a special case, watch them when they open and close and you should see some static lighting examples if I recall (and my memory us fuzzy on that). The only lighting I can think of which would need to be adjusted during game play would be that from say, weapon fire and what not.
I miss those days when computers weren't powerful enough and programmers needed to be clever to squeeze every drop of performance. These days no one bothers with code optimization.
Man the coders in the 90's were legends they had to work with limitations and went to the extent of hacking the hardware (prince of persia did) and here i am trying to figure out how unity works.
I thought the jokes were okay. The bad part is the self-referencing the joke. That's a joke-killer every time. Trust your audience enough to recognize the joke by themselves via their shared culture. If they don't share that culture, it's another learning experience. Either way, win-win.
This is a pretty old topic, im pretty sure he's on record of saying he doesn't remember where that came from. Can't say i doubt him, i don't remember what i wrote 2 months ago.
Carmack understood how it worked at a theoretical level IIRC what he didn't understand is how anyone ever discovered it in the first place and that is from a guy with an insane knowledge of esoteric coding approaches and seeing how to trick things up in ways other people would never think of. I've spent waaay too much time in the Quake 2 source code and still find bits that blow your mind when you understand the angle they approached it from.
Bitshifting a floatpoint number IS evil though. I will accept using bitwise AND and XOR on floating point, but only only the sign. You have to put some limits on what kind of evil magic people do with floating point, and often it even ends up slower because even the CPU goes: WTF? I thought this was an floating point, and now I have to route it over to the integer ALU? STALL STALL
There's nothing at all wrong with a technique like this if it's a good fit for the hardware. (And at the time, it was) Bit shifting a floating point seems like a meaningless and broken operation but the result is just used as an initial guess for a couple Newton iterations. The real relevant piece there is that the exponent is bit-shifted and negated, which gets you on the right order of magnitude for an inverse square root. The rest just helps massage it for a little bit better results.
Seems like this was pushed by youtube recently. Not a fan of the humor, not sure if maybe I'm not a target demographic you were shooting for. As a game developer I found this fascinating but the presentation leaves to be desired. This was in 2016 tho soooo yeah hmmm. The Information quality is superb and the explanation is great. Just not a fan of "Look at me I'm edgy" jokes.
You are wrong. That number is not some magic number. And it is not unknown as to why and how it works. I spent about 5 minutes writing the function in code when i got about half way done of trying to see what the f was going on i noticed that. This is not some sorcery the number is not chosen at random it is chosen because of how that number is represented in binary and how computers convert floats to longs. No magic here just smart people realizing that converting a long to a float gives the results they need. If anyone knows what i am talking about and feels like they can explain it better than me please be my guest. But i really wish you would have explained how this really works and that the magic number is not so mystical as it appears at first glance. My bet is how that number was found was not by going 1,2,3,4.... but by determining what they needed in binary then that number just so happened to be its hex representation. Not mystical please i urge you to relook at the code and you can see what is going on here and if you explained it better the whole comment at the end about coming up to solutions that may not be right in front of your face these "dirty tricks" would be a lot more meaningful if you would just explain how it works. Instead of going nobody knows if i can find the solution i know other people do as well so who else knows how this works? the solution if you would.
It is sort of a magic number in that the way it works is somewhat orthogonal to the meaning of the underlying data. The number 0x5f3759df (1011111001101110101100111011111 in binary, this is important, examine how it has a pattern about it) is a correction piece. For IEEE 754 numbers stored contiguously in memory and treated as a signed integer (0 + Exp + Mant), the result of a singular Newton's Approx., the bit pattern from the magic number can be used to "repair" the odd/even signing of the exponent and restore the overall mantissa for the float by ones-complimenting i/2 (where i is int representation of the float to be invsqrt) and XOR'ing it with 0x5f3759df then dropping the MSB carry. Whether it's a magic number or not is up to the viewer. That's a subjective term, not a scientific one. I consider it magic because it's less a number and more of a blueprint, and due to the fact it can be used numerically makes it magic. Also, nowadays people should look into using extensions to SSE on x86 platforms and something like THUMB on ARM for doing triple (or more!) fast inverse approximations.
Magic number is programming jargon. It's used to describe a constant that appears directly in code instead of being declared as a variable or a constant. Did you see the code? That's exactly how the number appeared. It's a magic number.
Nice video, but um, every C developer should know that bit shifting is equivalent to multiplication or division in powers of 2. It's not a hack - it's how binary works. I'm also pretty sure lighting wasn't done that way back then (I don't even think they had shaders). Most of the "lighting" was static / pre-rendered, not calculated on the fly. Take a look at the other characters in this video - no shadows. Wanna know why? It's because they were not calculating "every single light beam, 30 frames per second". Most shadows back then were also just an alpha mask. The closest thing to "raytracing" back then was actually in the way Doom and Wolf3D (and blake stone etc, ROTT and others) were rendered. Quake was (one of ?) the first "true" 3d engines, which used proper 3d structures and perspective projection to 2d, allowing for more vertically stacked rooms etc. All of that said, these games were phenomenal for their time - and I sunk many hours into Quake 1 and 2.
"Nice video, but um, every C developer should know that bit shifting is equivalent to multiplication or division in powers of 2. It's not a hack - it's how binary works." Doing so on integers is totally normal yes. But on floating point numbers? I'd wager that a big number of C developers don't even know the bit level representation of floating point numbers without looking it up. And as the video explain, the manipulation doesn't really do what you actually want, so you have to use a magic number to clean it up. It's a hack. "not calculated on the fly." I didn't get the impression that she was saying this anyway. It's a relatively short video. You'll have to simplify, so of course the details are wrong. Quake 3 does have some dynamic lights. You don't calculate the lighting with raytracing, but some of the calculations for the dynamic light themselves can roughly speaking be similar .
It's her style, I think she's adorable. Not everyone on the internet needs to be emulating the super-edgy style. It was informative, well-produced, and she has personality. I subscribed immediately.
You know what ELSE we had in the 90s?? Fax machines and video arcades and dial-up and floppy disks and FM Synthesizers and grunge music and Buffy the Vampire Slayer and slightly shoddy but very efficient inverse square root calculations!
5:23 Dividing power by -2 will give you inverse sqare root, but it can not be implemented in floating point format - you won't get inverse sqare root by dividing mantisa by -2 !! as shown in 5:30, you are confusing number raised to a power with number times 2 raised to a power. And dividing is not expensive, especially by 2, and last thing - you didnt exlpain how this piece of code work, even roughly. If someone is interested how it works there is wiki >> "Fast inverse sqare root"
I see the point you're making however, I wouldn't call it lies. Every physical model is just an approximation of reality :) Not mentioning its discrete computational representation ;p
This video needs to be better researched. The floating point math, especially relating to the fast inverse square method was understood at the time. A simple youtube search reveals what is going on here. Your video is disingenuous to not explain this math when that is the point of this video.
Newton's method was, not the weird integer math and constant value they used. Just forget Newton's method - it's a distraction. The constant value and the combination of two floating point operations into one faster integer operation is the interesting bit. I remember a numerical analyst couldn't figure out how they came up with it. He tried experimenting to see if the constant was the best value to use - it turns out it wasn't. But who came up with it is actually still a mystery as far as I know. The last I heard it was someone at Nvidia. Oh yeah, just remembered. The numerical analyst also tried coming up with a theoretical best constant to use, and in practice, for some reason, it didn't work as well as the one id used, which really added to the mystery. Wish I could remember the guy's name. He was no slouch.
The line with the "// evil floating point bit level hacking" comment isn't actually the bit shifting bit of the equation, the part you mentioned not understanding is actually along the lines of what you just described with the Mantissa and Exponent values. The first bit takes a floating point number (whatever number we're trying to sqrt) and we''re assigning that to the Float named "y", we then give our Long named "i" the value of "i" multiplied by the value at "y" (y's value is treated as a Long rather than Float here)....THEN our "WTF" line assigns "i" the value of the Mantissa, which is located at the memory address 0x5f3759df, and SUBTRACTS the value of "i" AFTER it's been bitshifted/divided by 2 using bit shifting.... I think the main "wtf" moment for US is really thinking how the original programmer came up with that specific memory address, how our Mantissa gets stored there, and why it consistently happens at that memory address rather than being a bit more random... John Carmack wasss a big part of ID so it wouldn't surprise me if this was his doing in some way, it's definitely not impossible to understand hardware well enough to do things that SEEM strange like this, especially with a language like C where you kinda need to know how things like memory work since you're doing all the management yourself, as opposed to something more abstracted like C++ which allows for some auto-management of resources so the programmer doesn't HAVE to understand it inside and out wholeheartedly just to create something stable
I want to correct you on certain details. Neither John Carmack nor anyone at Id invented the inverse square root algorithm used in the Quake III source code. The original author is Greg Walsh, albeit using a different-and more precise-hex value. Another note is that the game didn't run at 30 frames a second by default, but rather at 60 fps. Now, this depends on different factors such as the refresh rate of the monitor and most CRT monitors used in the 90s were 75Hz and some even more. Another factor is the graphics card but most people who played Quake at the time had upgraded anyway as Quake III required an OpenGL-compliant GPU to run and didn't include a software renderer. It doesn't invalidate your points, just a clarification.
Alright, I had to pause to write this comment. I'm BRAND NEW to this channel. This is my first video of yours I've EVER seen. But I'm 3:23 in to this video at the time of paying and DEFINITELY subscribing. But the video is from 2016 and I just need to find out if the channel is STILL getting current updates!? lol... I have no idea why THIS older video came up in my suggested feed randomly today out of the blue, but I'm glad it did. I can say this much: 1. I can already tell this content is going to be explained in a way a geek (but non-coder) like me can understand (to me, your comment about Carmack / "geek quotient" hints at this). 2. I MUCH prefer a video with at least SOME editing and production value to "live streamers" these days. 3. Your pacing and cadence of narration in general is EXCELLENT. I really hope you're still making new videos! Subscription button clicked, time to finish this video, then check out the rest of the channel! :)
I saw a video covering this in more detail. The “random number they pulled” is actually a memory address they use to change the type from a “float” to a “long” to do the bit-hacking.
@@nosville22 68k had one of the most impressive instruction sets out of any CPU I've seen, it was well designed, especially compared to the x86. It was a bit rubbish (IMHO) at performance though with even the most basic of opcodes taking several cycles.
Nice video, but the trick is not needed anymore. Pentium 3 launched in 1999 introduced SSE instruction set, it has RSQRTPS instruction that calculates more precise approximation (maximum relative error is 0.037%). It takes less time than Q3 trick, and calculates 4 values at once. That instruction is supported in all modern CPUs ever since, even cell phones have similar one, FRSQRTE in armv8, VRSQRTE in armv7.
Yes and no. On some microcontrollers, that have specific purposes like calculation of space vector pwms, this method is still pretty useful. Btw. what is described in the video is called the "quake magic number".
I really want to know how the fuck that "what the fuck" part works. Pull a number out of thin air and it's fixed? WHAT?! I'm glad we have GPUs that can use Fresnel's equations to do rendering based on actual physics now, and not these approximations. Not only does it look amazingly better (to the point that it's easily a revolutionary change in gaming) but it doesn't require a mindfuck to figure-out how it works. O.o
What's happening is that by interpreting the value of the mantissa as an integer, you're sneakily getting a really cheap approximation of its logarithm base 2. More specifically the calculation that matters is: log2(1 + m) ≈ m + σ where m is the mantissa and σ is a constant chosen to minimize the error (about 0.045). Plug that into the final algorithm and do some juggling, and you end up using this value in the code: 1.5 * 2^23 * (127 - σ) = 1597463007 = 0x5f3759df
No idea how I ended up here, but I loved what I found. I don't even know if you're still making videos. *edit* You don't make videos anymore. This is a massive loss.
I keep being mesmerized at the ways programmers came up with solutions to the limitations they had back then, I mean, if someone told me to make a game with the same constraints these guys had to follow I wouldn't be able to do anything, I really have to go back and look through some of the solutions they came up with back then, I think there is much to learn about not only programming in general but more specifically about problem solving!
Back in the old days game programmers used tons of cpu-time and memory saving tricks, and dividing/multiplying by two by shifting the value is probably one the most well-known ones. The interesting part is how you exploit it when you need to divide/multiply by a number which is not a power of two. That's what would've been interesting.
I did a sarcastic reference to this video and then I tried to do my own, it was a disaster. Now I know how hard it is to do it, so I better don't do jokes about things I can barely do myself. I still feel cringe though.
Gameplay here isnt from Quake 3, but is from Open Arena, an opensourced arena shooter which tries to emulate Quake 3 styled gameplay while utilizing the improved IOQuake3 engine, with further improvements also extending to their lightmap methods for lighting . Its getting a massive visual overhaul sometime in the future .Its absolute free to download and fun as the classic to play with compatibility with custom Quake 3 maps, models and some mods to boot!
This trick you're talking about was coded by Mike Abrash in Quake. It was not Carmack in Q3 Arena like all the b-roll footage. He'd already left Id. Check out his book, he was an epic x86 assembly coder!
calculating actual bouncing light rays (photons) like that is called raytracing--aka that thing that we are JUST NOW starting to do for real in games with the NVidia RTX cards
Once worked with a thing called a "super compiler" - it made functions like this using ... genetic algorithms. You gave it some working "not so fast code" in C, and a bunch of tests that defined "winner". It then bred assembler no one could understand. Sorcery, care of our course mentor Mr Ken Thompson, a god amongst us.
First video ive seen from your channel and I already subbed: good editing, well prepared, well explained even tho im not a coding expert.. but facts like that about "hacking" the system to make something like this from ID Software happening is really cool! Theres people complaining about the puns and stuff (the hate) but hey dont worry about this u cant please everybody :p Stay true to you, I really like this kind of content and even tho your channel isnt that big I can see the quality and the effort youre putting into it !!!! Have a nice day :)
Um programmers at the time all knew it wasn't pure 3d just as doom wasn't. We even had a term for it... 2.5d. though that also applied to gold box games etc.
Wondering why the magical unnamed hexadecimal constant , 0x5f3759df, does what it does? The answer is scarily simple. 0x5f3759df is a hexidecimal representation of 1597463007 (in decimal notation). 1597463007 seconds is exactly how many seconds it takes for a beam of light to reflect off a metallic surface in a zero gravity inverse matter pressure environment.
That's actually OpenArena, not Quake 3 Arena, though it's based on the same source code. If you think Quake was hacked up, look at Doom's source code sometime. Fracunit is very fun.
This was pretty far from talking about Quake 3 lighting. You started with a pretty deceiving premise when mixing raytracing concepts with what's going on in id's engine. The light in Quake 3 is hardly general in use. For environments they use 2 modes that have nothing to do with what you explained, and for characters, or VFX there's a different way they do it. If i go into explaining them, i need a video longer than this one, and it's already pretty well documented by others. Let there be lies, indeed....
I gave this video a try thinking it's gonna be a warmup of a discussion that happened on twitter or something. The depth got me by surprise, gonna subscribe 😊
Now draw a perfect circle on a pixel display with no gaps and no overlapping points, using only integer addition and subtraction. I was given the algorithm with no comments. It took me months to figure out how it worked.
That's actually footage of a game called Open Arena, which is a community created free and open source version of Quake 3, using the same engine as Quake 3.
@GroinMischief Yeah - that means "open source". Maybe they tinkered alot under the hood of OpenArena netcode but graphically/gameplay-wise it didn't improve much or at all (at least I didn't noticed any real diffrence except for diffrent maps and assets remodel/reskin). It plays just like Quake 3 and if you have Quake 3/Quake Live already, It won't give u anything diffrent.
I still play Q3A and Quake Live. Quake Champions also but much less. For it's time, 1999, the gfx were quite stunning. And in QL the IDTech 3 engine still holds up pretty well, although it has been further optimised.
A comment I once found in some code:
// I was drunk when I wrote this and have no idea how it works, but it does.
My favorite was a mailing list message in a billion dollar company that went something along the lines of:
" Okay, this is embarrassing, I was having sex with my girl friend (yes I have one, Rob!) and
in the middle of that I had an epiphany on how to solve this issue. I interrupted the sex for
a second, wrote down the note, and resumed. When done, I tried this and it worked perfect.
I can't remember why it works, those cells now belong to my girlfriend who will probably now
use them to take over the world or some such, I don't know, but this bug was here for five weeks
and now it's gone, so I don't want to here anyone complaining about the fix not being documented.
Just accept that it is and move on. "
And then people shit on tech priests because they only burn incense instead of cracking the code of everything they maintain
I've written that comment more than once....
Like a drunk writer
I have some code commented by me to me that reads.
/*
* What the actual FUCK is this clusterfuck?
*
* SORT THIS SHIT OUT MAN
* * Comment the code
* * There is a bug possibly related to this code.
* * * Coordinate data is loaded correctly and lookups work correctly until you are handling data near 0,0,0 where the data is still loaded correctly but returns data at 0,0,0 when looking at itself.
* *
*
* Don't you fucking dare delete this until you have done your job properly.
*
*/
The code commented is still there because the drive I was on went pop right after I deleted the backups to set the drive up for raid thinking I had time to do that and make a backup without worrying about the main drive dying. 8 year old drive chooses that moment to go.
John Carmack is a rogue AI and you can't convince me otherwise, at this point.
Ill die on this hill
"The vessel that houses interdimensional being John Carmack"
"Texas Based technocryptic John Carmack"
"The resident of the binding between space and time that holds our reality together John Carmack".
-Civvie 11
So Samuel Hayden was a self-insert character?
@@carlsonraywithers3368 urm well ackshully John Charmack hadz no purt in za dezelopjent oz za mozurn DOOM chames
ua-cam.com/video/Q8H8EM0Cw4o/v-deo.html
I don't want to complain, I want to be constructive.
I feel like your video simultaneously assumed the viewer understood a lot of high-end programming and math, but also talked down to them like a child. You had a lot of high concepts that you didn't really explain well surrounded by a lot of other concepts you explained down to a child. You kept jumping from one to the other and it was really annoying; I felt like you were trying to subtly insult me.
If you want to be technical, be technical. If you want to explain things for the layman, then explain things for the layman. But don't do both.
Also, what the hell does this have to do with lies? Seriously, that description is just some poorly shoe-horned clickbait.
simple solution: dont mind being talked to like a child. we are all stupid. so are you. its in our genes. too much self esteem is like too much sugar. rotting.
You might want to disengage that snowflake mode...
I think you got triggered AF
I want to know what's at that address
Yeah, I learnt game coding in the late 90's...
And let's just say, at that point in time EVERYTHING was hacked and abused for speed.
Bit shifting integers around was the norm because it was generally much faster than multiplication and division. (well, until you got the pentium that could do integer multiply/divide in a single cycle, at which point that no longer mattered.)
In fact, bit shifts are the defacto way to implement multiply and divide routines in assembly if for whatever reason you're working with a CPU that has no hardware multiply/divide circuitry whatsoever.
The key point being that a right shift is a divide by 2, and a leftshift is a multiply by two.
Depending on what you're working with, a shift instruction may be a single bit (say a 6502). each shift is one bit at a time, but it takes 2 cycles.
On some systems, like the x86, you can specify how many shifts to perform as a single instruction.
An interesting aside here, for optimisation on old hardware, especially where each shift instruction is a single bit, is that moving a value around in memory by one location is equivalent to shifting it 8 bits. (so, for instance a 65186, which can read 16 bit values, and can perform shifts and rotates directly in memory, but only a single bit per instruction, could simply read a value and write it back in a slightly different memory location to mimic an 8 bit shift.)
Of course, this is a well known and relatively straightforward thing to do with integers, but when you start applying it to floating point numbers, well, that's a whole other can of worms...
Then again, there are a bunch of other common solutions to problems with square roots.
A common solution in general to anything complex is a lookup table.
Pre-compute a bunch of values then the instruction in question becomes a memory lookup.
But, when doing this you're trading memory for speed.
It's plausible when you have 256 values, but the more values needed, the more dubious it gets.
If you had 32 bit values, and needed to access them using a 16 bit entry you'd need 65536x4 bytes - 256 kilobytes for a table. If you wanted to use a lookup table for a 16x16 multiply you'd have a 4 byte result, and 4 billion entries. - that's a 16 gigabyte lookup table! - if you have a system with 16 gigabytes of memory lying around, chances are you don't need a lookup table for something like this.
Though... It might surprise you to learn that the hardware equivalent of a lookup table allows you to replace pretty much any CPU instruction with a memory chip instead, and this is guaranteed to work at a constant speed, unlike many implementations of actual functions using logic gates.
Other tricks you can pull are contextual.
Consider comparing the distance between two objects - if you have X and Y coordinates for each object, the distance between them derives from pythagoras; Distance = Sqrt ( (X1 - X2)^2 + (Y1 - Y2)^2 )
So if you you wanted to test whether this was within a certain radius of a point, for say, collision, you'd calculate the distance between these points, then compare it to the distance.
But, you've got a square-root in there, which is a pain to calculate.
Well, what if your test distance was the square of the distance? Comparing the square of two distances gives the same result as comparing the distances directly, yet now you don't have to do any square root calculations!
Sounds pretty good, right?
Well, there IS a downside, as there nearly always is, when it comes to tricks.
The square of the distance rises exponentially.
1^1, 2^2 = 4, 10^2 = 100, 100^2 = 1000.
Because of this, if you're testing large distances, there is a possibility you'll get an overflow, which will create a bunch of weird bugs if you're not careful.
But it's otherwise a surprisingly simple trick to avoid the need to use square roots...
It is called optimization, not hacking and abusing. Everything had to be optimized for speed.
@@navi-charlotte same shit (?) Taking one thing once intended for a determined job and apply it to do other stuff
@@navi-charlotte I think there's a fine line between the two. Optimization is trying to reduce computational load by figuring out how to get rid of reduntant calculations. I think both lookup tables and the distance example in the OP are just optimizations. A hack relies on using side effects and arguably bit-shifting exponents is a hack, albeit a hack that has been used so often it doesn't seem like using a side effect anymore. The WTF line from the code is definitely a hack - by the time you comment your code this way it's fairly obviously not something the language itended as reasonable code and it wasn't an intended effect of that line to work as it does. And if it's not an intended effect, it's a side effect and hence using it is a hack.
Reminds me a bit of that time when I needed to find all buildings in a certain range around some sample point in a geographical database. The database had several 100K buildings which each had several vertices. Finding these buildings would actually take around 30s if I remember correctly. Sounds fine, but I also needed to process many tens of thousands of points every night. 30s just would not cut it.
What I found out is that this database had a spatial index I could use to speed up bounding box operations dramatically. It took like 6 hours to generate this index, but that only needed to happen once.
Instead of looking for everything in a circle around the sample point, I looked for everything in a square, and later filtered those results down. The time dropped from 30s to around 20ms per sample. That's 1500x faster by changing a single line (the query) and adding a 5 line for-loop to filter that result down. That did the trick. I could process a whole day's worth of data in about 2 hours. And like you mentioned, a lot of it is caused by avoiding square roots, while the other part is trading CPU time for memory. I don't remember how much memory the index used, but it was not a problem for a relatively modern desktop PC. Memory is so cheap these days, there is a lot you can get away with...
That project is still the most dramatic improvement in performance I've done in the real world. I'm still really proud of how it worked out. It's not something I get to do very often anymore, but I think optimization problems are really satisfying to work on.
@@Niosus I can beat that. I worked on a statistical problem and a straight forward implementation of the maths resulted in crazy computation times for my use case - on the order of 10^28 years. I had a function that got called n^n times (my largest n was 27, my advisor had one in the 50s) and it got called with tuples. But I noticed that it did not depend on the order or multiplicity of the numbers. So I got that down to 2^n calls (basically calling it for the power set of an original set, rather than all n-tuples you could get from that set). That got my computational time down to about a week, but my advisors data would have still taken about 1 million years. Then I worked out a way to split the problem in some cases (and most real life examples allow that) and instead of a single problem with the original n you could work on problems of size m and n-m. I analyzed my advisors data in about an hour. Total factor ends up somewhere around 10^35.
Quake 3 lighting was done by pre-computed lightmaps. The real-time lights weren't even as real as they were in Q1 and Q2. In this case, they just projected a light texture on the Z(Q3's up) axis and did some sort of blend mode on top of everything else. The blend mode wasn't truly additive, nor was it a straight multiply. It just sort of scaled the existing color up by the RGB value of the light.
Yeah, I remember dynamic lighting was introduced much later with some mods.
none-lightmapped dynamic lighting in Quake was lambert shading on the models. They don't require a inv-sqrt either, they use N.dot(L) per vertex. This video is quite ignorant unfortunately.
it's possible that it was done for make the game faster for multi?
Stop mansplaining
She wasn't wrong. What she says applies mostly to illumination of dynamic objects (like weapons, enemies or players), and sometimes static geometry (walls, floors, ceilings, water) ones that were wrapped (textured) with precomputed light maps. Quake had moving lights (for example light from rockets from rocket launchers, or from player holding quad damage). These lights interacted with precomputed lightmaps on static geometry too.
"what are you gonna do?" wait a decade or two and play quake raytraced :p
As we speak
your dream comes true
Hey, can you read my future too?! I need to know..
Tadaa.... welcome to the future.
This aged like a fine wine
The trick of bit-shifting to do mul and div in the innerloops of Amiga-demos in the demo scene were common in the late 80's early 90's. Often combined with sin and cos lookup tables that was calculated and stored first thing on execution.
Actually, when I was coding Amiga Demos for a well known scene group, you used to avoid shifting also. The problem with 680x0 shifting is that it didn't have a barrel shifter, so the more shifts you did, the more expensive it was. So for something like lsl.w #2, d0 its timing was 6 + 2n. Where n is the number of shifts. It was 8 + 2n for a long shift. Anyway, you can achieve much better performance with add.w d0, d0 add.w d0, d0. Same as you would use lea to add to address registers rather than using add or subtract. These are all very common coding practices for demo coders, the same as putting arithmetic instructions directly after starting a blit, because you could essentially get them for free. And also Sin and Cos tables were rarely if ever calculated, they were always precomputed and we'd got tools for doing just that and included as dc.w tables.
@@thewelder3538 You are probably right - it is so long ago and I might mix up CPU's. I just remember I was working on a simple 3D-engine for the Amiga and seem to remember I was pouring over minutiae in the assembly-code optimizing each and every line. I do seem to remember that I did some sort of pre-comp with subsequent storing of sin/cos in memory. Another thing I remember was some super-quick/cheap way of determining the facing of a polygon (for bf culling)- which at first seemed like a complex operation but end up being only a few lines of code with cheap instructions.
I appreciate what you shared.
A trick I've used to allow quick plotting of polar coordinates is to exploit the fact that for values of r in the range 0 to 2, one can compute r*cos(theta) and r*sin(theta) by computing d=arcos(r-1) and then computing cos(theta+d)+cos(theta-d) and sin(theta+d)+sin(theta-d). Use table lookups for the trig functions and no multiplies will be required,
Do we know if id hired someone from the demoscene or did they just hear about this trick from it and not just understand it (hence the WTF)? One up for the EU (EEC??) side of the pond!
"Lies"? It is skill, and all about speed/efficiency and something programmers always have to contend with. Same applies to traditional cell animation .. there's a reason they had layers (vs re-drawing everything for every frame). To have a game run and cleverly mask limitations while keeping the user in-game was art. It's not deceitful puritan, it's an art-form.
I'd rather call it a craft than art. You can't just splotch stuff somewhere, see if it works and be done. You really need to know how stuff works. Artists rarely do, they usually only care about results ;-)
Relax your tism
Every single object? I think not. Anything that didn't move likely had that info baked into the texture.
Secondly, they likely used gourad shading. That narrows things down to calculating that for only each vertex, and then it fills in the blanks between the vectors.
Q3 game used to reflect the floor and walls at the same time .
Crazy Tubex We're talking about Q1
No , this is Q3
Crazy Tubex Did... did you even watch the video? By the time Q3 was out, the hardware limitations for calculating light wasn't a thing anymore.
Interesting, but I was expecting Quake, not Quake 3!
Wtf is quake 6? You mean Quake Champions?
bronze
Wouldn't Quake Champions be more like Quake 5? Unless you are counting Enemy Territory, which I think of as Quake 1.5
quake not q3
me too
I was expecting mipmapping or something
Kind of interesting content though
"they had to get a bit shifty with their code" that was gold, thank you
I was really expecting Quake 1, software rendered 320x200 graphics to show that
I'm that kind of guy that just want technical information in the most direct way possible.
Then go google it. UA-cam is for people who wants to be famous, not for technical information.
We prefer boobanimals.
@Jon Don lol your problem with his comment was that he briefly stated he was a guy?
@Jon Don seriously when people say stuff like this they sound like a feminazi
@Jon Don
Nobody cares what anybody cares about, to be fair. With that point, their comment as well as your and my replies are pointless. :D
3:40 I recall doing Lambert shading even earlier, using integer math.
I'm the architect and co-author of the Viewpoint graphics library for DOS (and bare metal), which was the _fastest_ graphics library in its day. Consecutive pixels on the same surface will have values very close together -- that is, it doesn't change much from one pixel to the next. So, Only one iteration of Newton's method is necessary if you use the previous value as a starting guess.
Oh, and the slowest part is the Conditional Jump instructions. So avoiding the conditional loop "am I done yet or need another pass?" makes a big difference.
This video was so good, I did not even flinch at the puns. Not a BIT.
It was a pretty informative and well done video, if i could suggest one thing is that slow the pace on the puns, they feel somewhat forced
Aside from that, great video!
I agree. Great video, but the puns are forced. Very informative.
i would suggest not to look away behind the camera so obviously, use sunglasses like casey neistat :D
The whole point of the video (the magic number) was mentioned in about 10 seconds at the end followed by "I don't know how this works".
Could I suggest that you research how it works and then tell us? That would have been a lot more informative.
en.wikipedia.org/wiki/Fast_inverse_square_root
Very interesting article about it, where the writer actually tried to track down the inventor/etc
www.beyond3d.com/content/articles/8/
I wrote this comment in another thread, but I summarize what I could understand from the code. Mind you I came to the conclusion before looking at the wiki page and it does coincide with what is written there as well. I don't see why he doesn't understand it, it's simple to understand. Any function can be represented as a polynomial sum so y=f(x) can be transformed into y = a0 + a1*x + a2*(x^2) + a3*(x^3) + a4*(x^4) + ... The problem arises that square roots and esp. inverse square roots and have an infinite number of sums. But if we stop at some x^n we will have a result at some precision. So for 1/sqrt(x) they discarded all members above the second. The task has now came down to calculate a0 and a1 in order for y to have the least difference from the exact value. They have calculated that the most appropriate values are a0 = 0x5f375a86 a1 = -0.5 which is y = 0x5f375a86 - 0.5*x and is the same as i = 0x5f375a86 - (i >> 1); However this approximation isn't enough precision so they use one step of Newton's iteration x = x * (1.5f - xhalf * x * x). All of those casts in the code were to cast the floats into an int for faster calculation then you recover the decimal precision approximation with the later half of code. Once you understand that the hex constant 0x5f375a86 is really just the integer representation of sqrt(2^127) it becomes more clear why this approximation works in this case.
Pretty much all of Q3's effects and features are really designed to hit GL 1.1 compliance with little extension reliance - most extension use is for optimizations. The most interesting to me is the fake fog system they have. Doing that ensures they've got fog on anything they can put an alphablended surface on, ensuring visual consistency between all those vendors and their varying quality of OpenGL ICDs in the consumer 3d wild west of the '90s (3dfx, 3dlabs, intel, nvidia, ati, rendition, powervr, matrox, trident, s3)
Q3 did save a bunch of cpu power by having a precalculated lightgrid though, rather than evaluating point lights everywhere (it only did that for the few dynamic lights in the world). However they still use the slow sqrt calculations for this.
q_rsqrt's only used for speeding up a few non-essential shader effects that have nothing to do with lighting. Like shiny environment maps, wavy normals (ctf flags really use them), specular alpha (which q3 barely has at all - xaero's pants are this) and flares (which q3 dummied out). OA does abuse these effects gratuitously so the use of OA footage is on point at least :P other than shaders it's also used for projecting "marks" on the walls. (scorches, bullet holes, blood stains)
My only real criticism of the video is that the overbrights didn't get captured so your OA footage looks dark - another fake thing id did to get more volumnous higher range lighting by automatically cranking your brightness up while making non-lit elements darker. Another extension dodge there (using combine operations is a more commonly acceptable way of doing this on slightly more modern multitexturing-capable 3d hardware - OA's recent engine updates on git use a blended quad or a fragment shader to do this now, a quick and dirtry fix reaction to all the dark OA problem).
The only other big Quake coding topic I can think of is the surfacecache in the old quake *software* renderer - responsible for saving so much lightmap blending with the lightmap's texels filtered, and going further explainingthe abrash assembly magic would be a huge feat. or maybe the span drawer that is infamously pentium optimized
Let me say straight. Carmack is true genius and a good guy. And you are just trying to syphon his credits, shame !
The key to such constants, including this particular one, is to find a series which is equivalent or is close enough approximation of the function you need in the scope of interest; so that instead of calculating that function (1/sqrt(x**2+y**2+z**2) or 1/sqrt(x**2) here) you add values, instead of computing through multiplication, division etc. Search Taylor series, for example - that might be of help to someone who needs to do similar 'sorcery' on their behalf
Thank you! I had to plow through quite a bit of complaints before someone finally had something to share regarding that number.
All game code is "lying". All of games are basically a stage magic act.
- You're talking about Quake 3 Arena, not Quake
-Also not 30FPS, never ever 30FPS. Even when Quake 1 was new >50 was the standard. When Quake's multiplayer popularity grew, 100FPS became the desired one, and 100Hz monitors were often cited for helping the greatest players.
-iDSoft didn't come up with the method, it was Silicon Graphics.
-It's a floating point DECIMAL. More accurately, single-precision floating point decimal.
Please don't keep repeating the "30 times per second", it was never a thing in computer games. Even Wolf3D's updatecycle is 50 times a second.
-I "get" it that there's nothing much to get. It's bitshifting the single precision to work around its irregularities predictably. People have been doing this since the early 90's. The magic number just ensures that the vector is normalized.
Don't take the criticism as a deterrent. Your enthusiasim is great and entertaining, and i just accidentally stumbled onto this video, but there are quite a few points that are off by such a margin i have to comment on them.
You're correct, although with Doom there was a 35Hz update cycle hardcoded in the engine, until numerous source ports fixed that, allowing for rendering any number of frames your machine is capable of, per second. gldoom and chocolate doom must do this, if I recall correctly
lel absolutely every single one of your "points" is semantics at best
*_- You're talking about Quake 3 Arena, not Quake_*
She's showcasing and looking at the Q3A Source., but the function in question was originally written for ID Tech 1 (aka Quake 1) … albeit in ASM
*_Also not 30FPS, never ever 30FPS. Even when Quake 1 was new >50 was the standard. When Quake's multiplayer popularity grew, 100FPS became the desired one, and 100Hz monitors were often cited for helping the greatest players._*
I'm not even sure where to begin with this.
50Hz was *never* a Computer Display Standard... and I don't mean "Well outside of Europe it was uncommon", I literally mean *NEVER* … 60Hz was the Commodore Standard, 72Hz was the IBM/PC Standard. This changed with the introduction of VESA XGA (1994) where it was Standardised to 60-75Hz.
Of course Displays could strictly speaking support Half Rates (i.e. 31 - 38Hz in 15.75KHz stepping) meaning the Full Supported Range would usually be 31 - 75Hz; adaptive to the input Signal but most would just use VESA VBE (or DOS32X) that would basically use a fixed 60 / 72Hz.
Wolfenstein used 70Hz, Doom used 35Hz, Quake used 30Hz, Quake 2 on-ward used OpenGL (Desktop Refresh, Default: 60Hz)
This is the first time I've *EVER* heard of "100FPS became desired and 100Hz monitors were often cited helping the greatest players" … which I played Quake and Quake 2 in early "eSports" (before it was call that) Competitive Tournaments. It sounds to me like the same bullshit you tend to hear from people about "PlayStation Games ran at 60 Frame/s, why can't Modern Games" … good god that always makes me laugh hard, because most PlayStation Games were barely capable of 30 Frames/s; and even then the Consoles output to NTSC (60Hz) and PAL (50Hz)., and had to support both without it affecting Gameplay and Audio like it did on 4th Gen Consoles / Computers.
You have to keep in mind that Scanline (Interlace / Progressive) Vs. Parallel Raster are two *very* different approaches to Display Image Output, just as Masked CRT Vs. Individual Pixel LED … again have very different outputs.
Hook up a Console with a Dual Output (Old CRT and New LED) and compare them side-by-side... you'll very quickly understand why we Nostalgically felt the games looked and ran better of "Native" Hardware., and in many respects it's because CRT Displays just naturally Resolved a lot of the Modern Issues we have with Games.
*_-iDSoft didn't come up with the method, it was Silicon Graphics._*
That's merely where Carmack sourced the approach., it was a fairly common Performance "Hack"... and the original version of it uses Fixed-Point Floats.
There's an example of it in the Amiga 1000 Software Development Toolkit from 1985.
*_-It's a floating point DECIMAL. More accurately, single-precision floating point decimal._*
"Single Precision" is a Modernised Term.
Floating Points are broken down into 3 Terms., Significant, Base and Exponent; which can either be expressed via IEEE, Real or Fixed.
This is further expanded via Binary, Decimal, Hexadecimal and Octagonal (and I bet you've never even heard of the last one, but most Computers even today still support such in their FPU; as well, they've not really been touched in terms of base Design since the Late-80s)
You're specifically referring to IEEE 754 approach., which wasn't specified until 1985 and wasn't widely adopted until 1990... it is a 32-bit DFP (Decimal) approach.
*_Please don't keep repeating the "30 times per second", it was never a thing in computer games. Even Wolf3D's updatecycle is 50 times a second._*
As noted above., Quake (ID Tech 1) uses 30Hz as the Base... Wolfenstein 3D and Catacombs 3D both used 70Hz...
It's important to know this, as the Audio (MUS Format) doesn't encode the Tempo, and so in order to get the correct playback speed for Conversion; you actually have to know the Refresh Cycle for each Specific Game to rebuild the Tempo.
*_-I "get" it that there's nothing much to get. It's bitshifting the single precision to work around its irregularities predictably. People have been doing this since the early 90's. The magic number just ensures that the vector is normalized._*
It's called "Fast Approximation", and it's been commonplace much further back than the 90s.
Even Floating Point Units themselves,. are by their nature Fast Approx. rather than Strict Precision … all the technique pointed out in the Video does, is essentially switch what would in most cases take a quite costly number of Cycles (MUL/DIV were typically 140 - 180 Cycles) to a more Fast Approximation result that can be handled in a different way; in this case with Bit Shifting (which typically takes 20 - 37 Cycles) while producing "Good Enough" Results for what is being Calculated.
Is it "Lying" … eh … that's an academic argument at best.
I mean look at it like this., a CLUT (Colour Look Up Table) provides access to the full 24bit Colour Range; while only providing 256 Colours for said task … and if you don't mind thrashing the Video DMA, with the added latency; you can switch it for each Surface Calculated allowing for the Display of > 256 Total Colours / 4 Billion.
In either case it's still going to be quicker and more efficient than a 3 Byte Colour.
Kanakotka Good God, what planet were you on where people were playing Quake at 50 fps? “When Quake was new” was 1996. 3D accelerator cards were new, and even my 3dfx Voodoo card got an unbelievable 30 fps running GLQuake at 640x480. Maybe you’re confusing this with refresh rate.
Hopefully her enthusiasm won’t be deterred by people like you, because I learned something interesting today.
If you “have to comment on them,” it shows how fragile your own ego is that you feel compelled to put others down.
@@neeyotube When pings were higher than your fps. Great times!
That's not really a lie. They just used some bit level hacking and a bit of magic to get a really good initial guess. The real question here of course is where did they get the magic number from. I'm guessing that whoever wrote the comment was a different person from who came up with the number.
Knowing how this sort of things happen in my experience, I think that somebody came up with the algorithm for the initial guess as a dirty placeholder, then somebody else adjusted the magic number during testing while trying to optimize it, then they discovered that it works with only one iteration, but didn't know why and how you can possibly justify it. Hence the comment.
en.wikipedia.org/wiki/Fast_inverse_square_root
Kenneth McCall
Ha. That's some sick math skills. So it has quite a history.
EDIT: Holy shit, one of the guys involved in the history of that algorithm has the Turing Award for his work in numerical analysis. That's some serious brain power that went into this little algorithm.
@@ellisgl ''though investigation has shed some light on possible methods'' hehe, nice pun in the wiki article, last line of first paragraph
Part of this is misleading. This method doesn't "light an entire game arena", it's only used for dynamic lights and entities. All of the static lighting is calculated when the map is compiled and stored in a light map, and that calculation uses regular old sqrt(), not Q_rsqrt(). It's still impressive, but it's not as impressive as you're claiming.
As a mapper for Q3 back in the day, I was a tad confused...but lightmaps!
I'm not sure this person actually knows how video games work :-/
So you're saying this women is claiming the devs don't know what the code does but they actually do?
@@meshuggahdave5607 The devs know exactly what their code did and why it works. It's a myth and a joke because anyone with half a brain knew how this worked when they looked at it :-/
@@Folsomdsf2 well that's what you'd assume when they're getting paid and putting out games like quake...
I thought this was a fishy video but now I know.
Thanks
Fun fact: The SSE instruction rsqrtss is much faster and more accurate than this, and has been available in Intel processors since about 1999.
I am sorry but i have to say that this video was really annoing to watch because it was almost pure filler
Yeah, I'm reluctantly forced to agree. Doesn't really get started until 3 minutes in. The script could have really used some tightening.
I agree, it was mostly just rubbish with about 10 seconds devoted to the actual point of interest which was this calculation that is never explained.
The presentation is fine but the script sucks.
True. Good info but took a while to get to the point
I disagree, i haven't got the time to get bored
@@irlrp but you had time to watch and comment on this video? Weird thing to say.
I would have got bored reading your comment but I didn't have time to because my time weighs more than gold...
Firstly: I love that this video exists... so thank you. It's lovely in-depth analysis. Awesome.
That said: This often feels like a dad trying to use all the hip words. In this case, they are used correctly so the issue may be just be delivery.
More problematically: I'm pretty sure the described method of lighting is ray-tracing; which is not what Quake was doing for lighting. Quake used surface-based lighting. The polygons were pre-lit (and even then, not with ray-tracing); and then, at runtime, the polygon's texture is tiled into a buffer, with each texel lit according to the weighted average intensities of the four nearest light map points, as shown in Figure Three. If dynamic lighting is needed, the light map is modified accordingly before the buffer, is built. Then the polygon is drawn with perspective texture mapping, with the surface serving as the input texture, and with no lighting performed during the texture mapping.
Also: the video is wrong about precision. Floating point and non-floating point numbers have the same precision (precision is determined by the number of significant digits); the difference is the ability to change scale while maintaining preceision. 12345 and 12.345 are equally precise as one another; but without Floating point, you'd end up with 12345 and 12; which are not equally precise.
if you don't interrupt yourself every second for a cheap joke this may be a very informative and fun video. I watched it to the end, but by minute 5 it was an andurance. It's okay to make jokes every once and then, but not all the time, and less in a video that aims to be informative
She’s not very funny
absolutely beautiful. I've always been fascinated with different numerical systems and how they translate into each other. Lovely!
The puns!! Oh god the puns!!! *cuts ears off*
Well it was 2016.
i nearly died from the overacting... If trying too hard needs a video, this is it.
@@positronundervolt4799 lol. UA-cam seems to be bitshifting its algos.
@@Kalisparo Yeah, if I didn't know this was a personal channel I would think it was one of those UA-cam networks with the presenters reading off of cue-cards and trying WAY too hard with the fake enthusiasm/stale jokes etc.
I really like the fact that she enjoys talking about the subject and knows what she's doing.
And here is a fact: Carmack was not the first to invent the inverse square root. A similar piece of code was written for PC back in 96m by specialists from 3dfx. =)
The fast inverse square root wasn't Carmack's doing.
This video completely failed to explain it's core topic and instead overexplained a select few components of a single math problem that was only tangentially related to the lighting system. This really needs to be completely redone or removed.
You should have at least breifly mentioned light mapping and prebaked lighting (aka pre calculated lighting), fading light maps, sliding light maps, drop shadows (which doubled as drop lighting in Quake 1 for weapon flash and explosion effects), and a few other key effects that together created the lighting used in the Quake games.
"they had to get a bit shifty..... *bit shifty*" hahahahahha
On a serious note though, the code is not difficult. 0x5f3759df is the floating point representation of the square root of (2 to the power of 127). And i >> 1 means: shift i 1 bit to the right, effectively cutting i in half (as explained in the video). So what happens is this:
x2 = number * 0.5F // Remember half of the input number, will be used later
y = number // Copy the input number into y
i = * (long *) &y // Trick the compiler into storing the floating point input as a long. This conversion is necessary, because you can't bitshift a floating point number, or subtract a floating point number from a hexadecimal number.
i = 0x5f3759df - (i >> 1) // Subtract half of the input from the square root of (2 to the power of 127)
y = * (float *) &i // Now that the bitshift and subtraction is done, convert it back from long to a floating point number. This is already an approximation, but the result is refined in the next line:
y = y * (1.5 - (x2 * y * y)) // This method of getting closer to the square root is called Newton's method, look it up.
The longfloat conversion is tricky and depending on the compiler it may have a different outcome, because the byte order may be different from what you expect, and then a bitshift will mess up the value. In modern C we would use what is called a union instead. In a union, a float and a long can share the same memory location, so the conversion is not necessary anymore. This will not solve the byte order problem, so you still need to be careful and test it well if you want to use this trick.
bless you, may you never know a production deployment on a friday
@@romansmirnov961 Haha thanks :D that's the nicest blessing I've ever had
In case this will one day will be remade.
Say that this is about quake and fast inverse square root in the first 20 seconds.
Keep most of info, but remove all humour, and keep pace fast.
Use dos footage.
Keep code on the screen longer.
Add more graphs.
bob by yeah righto
Quake 3 doesn't run in Dos.
Games have bigger performance issues these days, an L3 cache miss on my machine is ~20 times slower than a standard fsqrt call.
What's with all the cringe gamer lingo?
It's very strange, I'm not sure how this ELI5 / r/fellowkids video is recommended
Cringing.
Gamers come out with lots of cringe lingo so probably trying to sound cool with the gamers.
Let me explain how ID lied but I don't understand how it works... I'll never get these 8 minutes back.
I learnt nothing new in this video, and not only that but i used the bit shifting in the 80s, i still do today ! Especially in javascript.
That bit of code in the source of Quake can be found in Doom and Wolfenstein 3D aswell.
The whole method has been used on the demoscene for years before those games.
Early 90s games also used trig tables as integers. The technic was known as the fixed dot calculation.
This video is really trivial.
Most of the lighting (like that on walls etc) was generated ahead of time. When you created a level, you needed to also generate the lighting, so it was not all generated each pass, much of what you see on static objects, like walls, floors and objects that don't move are generated when the level is created and that is it. Doors are a special case, watch them when they open and close and you should see some static lighting examples if I recall (and my memory us fuzzy on that). The only lighting I can think of which would need to be adjusted during game play would be that from say, weapon fire and what not.
>talking about light rendering in the quake 3 engine (or rather ioq3 running open arena), yet using the low lighting settings
*shrug*
I made my own 3d engine this year. If you put a cube and a lightsource the engine works at 60fps. If you wanna put 2 cubes the whole thing lags. Haha
when I was in college my instructor (ex epic dev) told me games are all smoke and mirrors....I miss you Mr Fissler.
RIP, best instructor at Wake Tech
why wouldn't they be? if the observer perceives the image that looks great, what's wrong with doing optimizations in the background?
@@ChristopherGray00 it was my first semester at SGD so I had no idea how games got made or what went into them.
I miss those days when computers weren't powerful enough and programmers needed to be clever to squeeze every drop of performance. These days no one bothers with code optimization.
Yup, just grab unity download some free assets and slap together some poorly optimized shit that struggles on super computers and throw it on Steam.
In 2016 he's the Oculus Rift guy
***** He's working for Oculus
And a rocket scientist for a hobby, literally.
Man the coders in the 90's were legends they had to work with limitations and went to the extent of hacking the hardware (prince of persia did) and here i am trying to figure out how unity works.
I can't watch this, the jokes are unbearable.
Sounds like it was made for kids...
I love the jokes
jokes aren't bad, but it's just not natural. she's forcing it. the video is good though.
@@filipenicoli_ I think it was..
I thought the jokes were okay. The bad part is the self-referencing the joke. That's a joke-killer every time. Trust your audience enough to recognize the joke by themselves via their shared culture. If they don't share that culture, it's another learning experience. Either way, win-win.
I highly doubt that John Carmack doesn't understand the code. And bitshifting is not evil or lying.
This is a pretty old topic, im pretty sure he's on record of saying he doesn't remember where that came from.
Can't say i doubt him, i don't remember what i wrote 2 months ago.
Carmack understood how it worked at a theoretical level IIRC what he didn't understand is how anyone ever discovered it in the first place and that is from a guy with an insane knowledge of esoteric coding approaches and seeing how to trick things up in ways other people would never think of.
I've spent waaay too much time in the Quake 2 source code and still find bits that blow your mind when you understand the angle they approached it from.
Bitshifting a floatpoint number IS evil though.
I will accept using bitwise AND and XOR on floating point, but only only the sign. You have to put some limits on what kind of evil magic people do with floating point, and often it even ends up slower because even the CPU goes: WTF? I thought this was an floating point, and now I have to route it over to the integer ALU? STALL STALL
There's nothing at all wrong with a technique like this if it's a good fit for the hardware. (And at the time, it was)
Bit shifting a floating point seems like a meaningless and broken operation but the result is just used as an initial guess for a couple Newton iterations. The real relevant piece there is that the exponent is bit-shifted and negated, which gets you on the right order of magnitude for an inverse square root. The rest just helps massage it for a little bit better results.
Seems like this was pushed by youtube recently. Not a fan of the humor, not sure if maybe I'm not a target demographic you were shooting for. As a game developer I found this fascinating but the presentation leaves to be desired. This was in 2016 tho soooo yeah hmmm. The Information quality is superb and the explanation is great. Just not a fan of "Look at me I'm edgy" jokes.
All good jokes are edgy. But I can see how you don't like them not being a fan and all.
0:00 "I play a lot of shooters . . . buutttt have just started so recently" as a picture of a controller comes up. . .
You are wrong. That number is not some magic number. And it is not unknown as to why and how it works. I spent about 5 minutes writing the function in code when i got about half way done of trying to see what the f was going on i noticed that. This is not some sorcery the number is not chosen at random it is chosen because of how that number is represented in binary and how computers convert floats to longs. No magic here just smart people realizing that converting a long to a float gives the results they need. If anyone knows what i am talking about and feels like they can explain it better than me please be my guest. But i really wish you would have explained how this really works and that the magic number is not so mystical as it appears at first glance. My bet is how that number was found was not by going 1,2,3,4.... but by determining what they needed in binary then that number just so happened to be its hex representation. Not mystical please i urge you to relook at the code and you can see what is going on here and if you explained it better the whole comment at the end about coming up to solutions that may not be right in front of your face these "dirty tricks" would be a lot more meaningful if you would just explain how it works. Instead of going nobody knows if i can find the solution i know other people do as well so who else knows how this works? the solution if you would.
It is sort of a magic number in that the way it works is somewhat orthogonal to the meaning of the underlying data. The number 0x5f3759df (1011111001101110101100111011111 in binary, this is important, examine how it has a pattern about it) is a correction piece. For IEEE 754 numbers stored contiguously in memory and treated as a signed integer (0 + Exp + Mant), the result of a singular Newton's Approx., the bit pattern from the magic number can be used to "repair" the odd/even signing of the exponent and restore the overall mantissa for the float by ones-complimenting i/2 (where i is int representation of the float to be invsqrt) and XOR'ing it with 0x5f3759df then dropping the MSB carry.
Whether it's a magic number or not is up to the viewer. That's a subjective term, not a scientific one. I consider it magic because it's less a number and more of a blueprint, and due to the fact it can be used numerically makes it magic.
Also, nowadays people should look into using extensions to SSE on x86 platforms and something like THUMB on ARM for doing triple (or more!) fast inverse approximations.
en.wikipedia.org/wiki/Fast_inverse_square_root
Magic number is programming jargon. It's used to describe a constant that appears directly in code instead of being declared as a variable or a constant. Did you see the code? That's exactly how the number appeared. It's a magic number.
This is really good information presented intelligently and humorously. Why so many dislikes?
I feel like the same information is being repeated 3 times before we get in depth look. I guess the script could use one more pass of refinement ;)
I love your videos and how well you explain these concepts.
Nice video, but um, every C developer should know that bit shifting is equivalent to multiplication or division in powers of 2. It's not a hack - it's how binary works. I'm also pretty sure lighting wasn't done that way back then (I don't even think they had shaders). Most of the "lighting" was static / pre-rendered, not calculated on the fly. Take a look at the other characters in this video - no shadows. Wanna know why? It's because they were not calculating "every single light beam, 30 frames per second". Most shadows back then were also just an alpha mask. The closest thing to "raytracing" back then was actually in the way Doom and Wolf3D (and blake stone etc, ROTT and others) were rendered. Quake was (one of ?) the first "true" 3d engines, which used proper 3d structures and perspective projection to 2d, allowing for more vertically stacked rooms etc. All of that said, these games were phenomenal for their time - and I sunk many hours into Quake 1 and 2.
"Nice video, but um, every C developer should know that bit shifting is equivalent to multiplication or division in powers of 2. It's not a hack - it's how binary works."
Doing so on integers is totally normal yes. But on floating point numbers? I'd wager that a big number of C developers don't even know the bit level representation of floating point numbers without looking it up. And as the video explain, the manipulation doesn't really do what you actually want, so you have to use a magic number to clean it up. It's a hack.
"not calculated on the fly."
I didn't get the impression that she was saying this anyway. It's a relatively short video. You'll have to simplify, so of course the details are wrong. Quake 3 does have some dynamic lights. You don't calculate the lighting with raytracing, but some of the calculations for the dynamic light themselves can roughly speaking be similar .
by the time the video had reached 7 seconds i had to pause it and take a step back to think about my life and confirm that i actually just heard that
This is interesting, but she's trying waaaaay too hard to be funny and it ends up being annoying. Great video other than that.
Right on spot
It's her style, I think she's adorable. Not everyone on the internet needs to be emulating the super-edgy style. It was informative, well-produced, and she has personality. I subscribed immediately.
You know what ELSE we had in the 90s?? Fax machines and video arcades and dial-up and floppy disks and FM Synthesizers and grunge music and Buffy the Vampire Slayer and slightly shoddy but very efficient inverse square root calculations!
5:23 Dividing power by -2 will give you inverse sqare root, but it can not be implemented in floating point format - you won't get inverse sqare root by dividing mantisa by -2 !! as shown in 5:30, you are confusing number raised to a power with number times 2 raised to a power. And dividing is not expensive, especially by 2, and last thing - you didnt exlpain how this piece of code work, even roughly. If someone is interested how it works there is wiki >> "Fast inverse sqare root"
I see the point you're making however, I wouldn't call it lies. Every physical model is just an approximation of reality :) Not mentioning its discrete computational representation ;p
Muydeemer do you even clickbait bro..
the thumbnail is enticing, the content informative and the speaker an ye olde schoole gamer
sub c:
This video needs to be better researched. The floating point math, especially relating to the fast inverse square method was understood at the time. A simple youtube search reveals what is going on here. Your video is disingenuous to not explain this math when that is the point of this video.
Newton's method was, not the weird integer math and constant value they used. Just forget Newton's method - it's a distraction. The constant value and the combination of two floating point operations into one faster integer operation is the interesting bit.
I remember a numerical analyst couldn't figure out how they came up with it. He tried experimenting to see if the constant was the best value to use - it turns out it wasn't. But who came up with it is actually still a mystery as far as I know. The last I heard it was someone at Nvidia.
Oh yeah, just remembered. The numerical analyst also tried coming up with a theoretical best constant to use, and in practice, for some reason, it didn't work as well as the one id used, which really added to the mystery. Wish I could remember the guy's name. He was no slouch.
@@brandonkellner4053 there are many "initial guess" values you can use. It depends how many passes the NAlg you use.
@@tehf00n The initial guess is hard coded and there is only pass.
Only super nerds care about this. I just came here to watch a cute nerd girl talk about Quake....
The line with the "// evil floating point bit level hacking" comment isn't actually the bit shifting bit of the equation, the part you mentioned not understanding is actually along the lines of what you just described with the Mantissa and Exponent values. The first bit takes a floating point number (whatever number we're trying to sqrt) and we''re assigning that to the Float named "y", we then give our Long named "i" the value of "i" multiplied by the value at "y" (y's value is treated as a Long rather than Float here)....THEN our "WTF" line assigns "i" the value of the Mantissa, which is located at the memory address 0x5f3759df, and SUBTRACTS the value of "i" AFTER it's been bitshifted/divided by 2 using bit shifting....
I think the main "wtf" moment for US is really thinking how the original programmer came up with that specific memory address, how our Mantissa gets stored there, and why it consistently happens at that memory address rather than being a bit more random...
John Carmack wasss a big part of ID so it wouldn't surprise me if this was his doing in some way, it's definitely not impossible to understand hardware well enough to do things that SEEM strange like this, especially with a language like C where you kinda need to know how things like memory work since you're doing all the management yourself, as opposed to something more abstracted like C++ which allows for some auto-management of resources so the programmer doesn't HAVE to understand it inside and out wholeheartedly just to create something stable
I want to correct you on certain details.
Neither John Carmack nor anyone at Id invented the inverse square root algorithm used in the Quake III source code. The original author is Greg Walsh, albeit using a different-and more precise-hex value.
Another note is that the game didn't run at 30 frames a second by default, but rather at 60 fps.
Now, this depends on different factors such as the refresh rate of the monitor and most CRT monitors used in the 90s were 75Hz and some even more. Another factor is the graphics card but most people who played Quake at the time had upgraded anyway as Quake III required an OpenGL-compliant GPU to run and didn't include a software renderer.
It doesn't invalidate your points, just a clarification.
Alright, I had to pause to write this comment. I'm BRAND NEW to this channel. This is my first video of yours I've EVER seen. But I'm 3:23 in to this video at the time of paying and DEFINITELY subscribing. But the video is from 2016 and I just need to find out if the channel is STILL getting current updates!? lol... I have no idea why THIS older video came up in my suggested feed randomly today out of the blue, but I'm glad it did.
I can say this much: 1. I can already tell this content is going to be explained in a way a geek (but non-coder) like me can understand (to me, your comment about Carmack / "geek quotient" hints at this). 2. I MUCH prefer a video with at least SOME editing and production value to "live streamers" these days. 3. Your pacing and cadence of narration in general is EXCELLENT.
I really hope you're still making new videos! Subscription button clicked, time to finish this video, then check out the rest of the channel! :)
Came for Quake. Not found! Was Quake 3!! LIES!!!!!
It's not even Quake 3, it's Open Arena. Same but different but still same.
I saw a video covering this in more detail. The “random number they pulled” is actually a memory address they use to change the type from a “float” to a “long” to do the bit-hacking.
At least we have dedicated math coprocessors now, remember programming a 68K CPU used fixed point maths, fun days
I still do tricky fixed point math in batch scripts in 2019 😭
I'm no programmer bit I've read the 68k specs and it looked like a bitch to write anything.
@@nosville22 68k had one of the most impressive instruction sets out of any CPU I've seen, it was well designed, especially compared to the x86. It was a bit rubbish (IMHO) at performance though with even the most basic of opcodes taking several cycles.
Nice video, but the trick is not needed anymore.
Pentium 3 launched in 1999 introduced SSE instruction set, it has RSQRTPS instruction that calculates more precise approximation (maximum relative error is 0.037%). It takes less time than Q3 trick, and calculates 4 values at once. That instruction is supported in all modern CPUs ever since, even cell phones have similar one, FRSQRTE in armv8, VRSQRTE in armv7.
Yes and no. On some microcontrollers, that have specific purposes like calculation of space vector pwms, this method is still pretty useful. Btw. what is described in the video is called the "quake magic number".
I really want to know how the fuck that "what the fuck" part works. Pull a number out of thin air and it's fixed? WHAT?!
I'm glad we have GPUs that can use Fresnel's equations to do rendering based on actual physics now, and not these approximations. Not only does it look amazingly better (to the point that it's easily a revolutionary change in gaming) but it doesn't require a mindfuck to figure-out how it works. O.o
What's happening is that by interpreting the value of the mantissa as an integer, you're sneakily getting a really cheap approximation of its logarithm base 2. More specifically the calculation that matters is:
log2(1 + m) ≈ m + σ
where m is the mantissa and σ is a constant chosen to minimize the error (about 0.045). Plug that into the final algorithm and do some juggling, and you end up using this value in the code:
1.5 * 2^23 * (127 - σ) = 1597463007 = 0x5f3759df
Dasyati
Thanks. The equations are easier to understand. Most explanations of this are long-winded diatribes of an esoteric nature. >.
I dunno how this video made it to my feed...but it made my day! Awesome!
No idea how I ended up here, but I loved what I found. I don't even know if you're still making videos.
*edit*
You don't make videos anymore. This is a massive loss.
I keep being mesmerized at the ways programmers came up with solutions to the limitations they had back then, I mean, if someone told me to make a game with the same constraints these guys had to follow I wouldn't be able to do anything, I really have to go back and look through some of the solutions they came up with back then, I think there is much to learn about not only programming in general but more specifically about problem solving!
never watched a video of you before love the enthusiasm!
This video could be boiled down to a 30 sec short. Saved us all a lot of processor time.
Nooooooo, not the puns. The terrible, horrible, torturous puns T_T *dies*
+Vally123 If you can't stand any more puns, try sitting down. ;)
And remember, it was all just a 'bit' of 'light' banter....
You are awesome! haha
Vally123 your grammar is worse.
GameWorld feel free to correct my grammar, then, I don't see anything wrong with it.
Vally123 sorry, I read it wrong.
Back in the old days game programmers used tons of cpu-time and memory saving tricks, and dividing/multiplying by two by shifting the value is probably one the most well-known ones.
The interesting part is how you exploit it when you need to divide/multiply by a number which is not a power of two. That's what would've been interesting.
I don't know how I came here, but it was really fun and illustrative video. I'm a subscriber now :D
Thanks! New videos will be coming out next week......(well, next week if you read this when I write it). :)
Jessica i want you to be my girlfriend. Please don't tell me you have a husband and bunch of kids.
That's probably the last time I'll see a floppy disc for 20 more years
Where do lies come in to this?
Our computer science professor showed this and asked to explain why Carmack is a genius.
I did a sarcastic reference to this video and then I tried to do my own, it was a disaster. Now I know how hard it is to do it, so I better don't do jokes about things I can barely do myself. I still feel cringe though.
The name is "cringe". You can get used to it.
Gameplay here isnt from Quake 3, but is from Open Arena, an opensourced arena shooter which tries to emulate Quake 3 styled gameplay while utilizing the improved IOQuake3 engine, with further improvements also extending to their lightmap methods for lighting . Its getting a massive visual overhaul sometime in the future .Its absolute free to download and fun as the classic to play with compatibility with custom Quake 3 maps, models and some mods to boot!
....Kind of a roundabout way of saying "I don't know how this works".. Don't you think?
This video could've easily been 3 minutes.. Not 8..
This trick you're talking about was coded by Mike Abrash in Quake. It was not Carmack in Q3 Arena like all the b-roll footage. He'd already left Id. Check out his book, he was an epic x86 assembly coder!
calculating actual bouncing light rays (photons) like that is called raytracing--aka that thing that we are JUST NOW starting to do for real in games with the NVidia RTX cards
Once worked with a thing called a "super compiler" - it made functions like this using ... genetic algorithms. You gave it some working "not so fast code" in C, and a bunch of tests that defined "winner". It then bred assembler no one could understand. Sorcery, care of our course mentor Mr Ken Thompson, a god amongst us.
As a PC gamer, it hurts every time you say "30 fps". I guess you mean "at least 60 fps"...
First video ive seen from your channel and I already subbed: good editing, well prepared, well explained even tho im not a coding expert.. but facts like that about "hacking" the system to make something like this from ID Software happening is really cool! Theres people complaining about the puns and stuff (the hate) but hey dont worry about this u cant please everybody :p Stay true to you, I really like this kind of content and even tho your channel isnt that big I can see the quality and the effort youre putting into it !!!! Have a nice day :)
Um programmers at the time all knew it wasn't pure 3d just as doom wasn't. We even had a term for it... 2.5d. though that also applied to gold box games etc.
Always appreciate learning more about my favorites, thanks!
That information was really interesting. I can't stand your delivery or jokes though.
Every joke delivery was a joke in its own. Which was funny. Joke recursion.
M Reese a joke within a joke you say? Joeception
Wondering why the magical unnamed hexadecimal constant , 0x5f3759df, does what it does? The answer is scarily simple. 0x5f3759df is a hexidecimal representation of 1597463007 (in decimal notation). 1597463007 seconds is exactly how many seconds it takes for a beam of light to reflect off a metallic surface in a zero gravity inverse matter pressure environment.
That's actually OpenArena, not Quake 3 Arena, though it's based on the same source code.
If you think Quake was hacked up, look at Doom's source code sometime. Fracunit is very fun.
I’m in the comments trying to figure out why 2k people gave this a dislike. Seemed like a pretty innocuous video.
This was pretty far from talking about Quake 3 lighting. You started with a pretty deceiving premise when mixing raytracing concepts with what's going on in id's engine.
The light in Quake 3 is hardly general in use. For environments they use 2 modes that have nothing to do with what you explained, and for characters, or VFX there's a different way they do it.
If i go into explaining them, i need a video longer than this one, and it's already pretty well documented by others.
Let there be lies, indeed....
I gave this video a try thinking it's gonna be a warmup of a discussion that happened on twitter or something. The depth got me by surprise, gonna subscribe 😊
Just popped up in my feed. Great content but jeeeeeez some of the script is cringy.
Now draw a perfect circle on a pixel display with no gaps and no overlapping points, using only integer addition and subtraction. I was given the algorithm with no comments. It took me months to figure out how it worked.
that's one old ass alpha version of Quake 3!
That's actually footage of a game called Open Arena, which is a community created free and open source version of Quake 3, using the same engine as Quake 3.
open arena is just Q3 with chaged assets, nothing else - good when you just want to check "gameplay" of it
i was just testing you guys *cough* you passed.
@TGcitizen: You might be surprised what the difference of "open source" entails. Spoiler: Mods, features, code changes, new assets, new maps.
@GroinMischief Yeah - that means "open source". Maybe they tinkered alot under the hood of OpenArena netcode but graphically/gameplay-wise it didn't improve much or at all (at least I didn't noticed any real diffrence except for diffrent maps and assets remodel/reskin). It plays just like Quake 3 and if you have Quake 3/Quake Live already, It won't give u anything diffrent.
How have I only just discovered your channel? This is awesome!
This was just awful, like someone's mom trying to be hip
She's a coder, she's more hip than you'll ever be
Maybe her hips hurt.
judgmental teens are more awful than hip moms.
I still play Q3A and Quake Live. Quake Champions also but much less. For it's time, 1999, the gfx were quite stunning. And in QL the IDTech 3 engine still holds up pretty well, although it has been further optimised.