The next game I make that is tiled, I'll see if I can include this. I have only ever implemented raycasting for non-tiled things using a different method, but this seems simpler for tiled. (Maybe a tad slower though for large simple scenes)
2:12 I remember trying to make my own bad ray engine and the way I implemented it was to go out half the length, check for collision, then if there was one, go back a quarter, then check again, go forward an eighth and so on. It was way faster than the step-by-step shown here, but still pretty slow due to the engine I used. Edit: the method I used is called ray marching apparently
Found your channel from the procedural universe video. I'm very interested in your lessons now. When will we get a video on saving state in a procedural universe?
I want to thank you so much for giving me the idea. I just started to write a program whose main goal is RayCasting. I have resources to write the code, but I really wanted to understand what is happening underneath it. It would be a lie to say that I now understand it 100%, but I have the idea of how it works and it's all thanks to you. Subscribe and Like from me, really great job.
I substituted the "fast inverse sqrt" into the olc::PGE version; pretty interesting results, but overall it does give a frame-rate boost, especially when casting many rays.
That's a good idea! For the fastest response, you'll probably want precomputed look-up tables of dx,dy for each vector direction. Not a problem with today's large-memory computers, but it was back then. Just make sure that you do the right first step, since it is a fractional step.
I don't think it's good compared to DDA. DDA lines work similarly to conservative rasterisers in that every pixel the line crosses is marked (perfect for a ray marcher); Bresenham skips pixels as it's optimised for 1 pixel per column (assuming a more horizontal slope). See 18:49 for an example of a DDA covering more pixels
Even though I figured this out a long time ago myself, it's still engaging to watch you explain it with your Bob-Ross style. You could teach the world's children to become programmers. And DDA FTW!
I would love to see a good example of collision when a player takes up 2x2 units of tiles, running against a jagged square slant (like \ or /), so that the player is pushed to the side of the slant while moving both up and left or right.
This will be of interest to anyone trying to compile PGE v2.14 on the command line with VSCode. It turns out that the order in which the libraries appear on the command line matters. This works:
g++ -std=c++20 fastray.cpp -o fastray -static -luser32 -lgdi32 -lopengl32 -lgdiplus -lstdc++fs -lShlwapi -ldwmapi
If you're really stuck with library link order problems, there's this from Stack Overflow: "Another alternative would be to specify the list of libraries twice:
gcc prog.o libA.a libB.a libA.a libB.a -o prog.x
Doing this, you don't have to bother with the right sequence since the references will be resolved in the second block."
I tried my hand at a 3D implementation. Can't tell if my variables were borked, or if my math was. Either way, it ran FAR too slowly to be useful. Might end up revisiting it at a later date once I am using a shader instead of the CPU.
Hello Javi, nice work! I would like to understand better all the formulas that you used in the video so I can implement it in my own code. Do you have any recommendation? Thanks again
@@javidx9 Hello Javi, do you know how to scale sprites when texturing the walls? My grid size is designed for 8 units but my sprites are 128*128. Do you know if it's possible?
If `direction` is a normalized vector then `vRayUnitStepSize` can be calculated as `{ 1 / abs(vRay.x), 1 / abs(vRay.y) }`, knowing the fact that `sqrt(vRay.x^2 + vRay.y^2)` is equal to `1`
I wonder if you can use a modified form of Bresenham's line algorithm or Xiaolin Wu's line algorithm. Both would probably be a bit faster than DDA (maybe 2x faster?). Also, neither requires the computation of a square root, only a single division.
Raycasting must both hit every tile and tell you how far away that hit was. I think you might be able to do something similar by testing all the x intersections, then all the y intersections, then taking the shortest, but I doubt it would be any better.
Seems very fast indeed. But not quite [52 CPU clock cycle] fast. I'll have a lot more optimisation to do, but I think it might be doable with this as a base...
Love your videos. Can you please also do floor casting in a 3D raycaster? I am looking for an efficient way to map the floor without having to cast one ray for each pixel on screen.
I gave this a go using the pen & paper explanation you gave. I've almost got it working and I decided to compare what I have with the code you've written. I tried substituting my own variables into the algorithm with no success. If I understand correctly, what is being displayed is an "illusion" and all of the vectors and map coordinates are actually way up in the top left corner of the screen. I think there was a similar thing going on in the A* video as well. Is that correct? I can see the benefit, specifically where the map has no actual coordinates. That's like 2000 x and y variables that don't need to be created. I was gonna ask if I should strive for this approach, but I think I've answered my own question. Do all games really only take place in the top left corner of the screen? Damn, that's crazy!
I have a suggestion for an interesting topic for a video. There is a 128-bit decimal in C#. What is the best way to create a 128-bit double, float or decimal in C++? I did read about quad-precision floats and doubles. I did possibly see a couple of projects on GitHub but I haven't looked into it that much. There is apparently an 80-bit long double which is supported by non-Microsoft compilers, but that fact limits its usefulness a bit, especially if you use Visual Studio.
Amazing video, thank you! Since the direction vector is normalized to begin with, there's no need to do a sqrt, is there? You can simply take 1 / dx and 1 / dy to get the step size per unit length in those directions.
Thanks! I'm not sure I follow; suppose a scenario where dx=1 and dy=1, then the length is sqrt(2), which is greater than unity, so normalisation is still required?
@@javidx9 in this case, the vector that you're using to calculate vRayUnitStepSize is already normalized, so you know that the hypotenuse's length is 1. So if you divide all sides of the triangle by x, you get a triangle whose x side is 1 and whose hypotenuse is 1/x. Just to be sure I double-checked the math in my code and both equations come out to the same answer for me. I liked seeing the math behind the general case too, just thought it was cool that there was a simplification - that's all!
Why not use the line formula y = (dy/dx)x + c and increment x by one unit? You can use integers instead of floats, arrays instead of vectors, and avoid the successive distance comparison. You'll need a corresponding x = (dx/dy)y + c but that should be no problem?
Well, for 3D worlds that aren't saved as tiles you could leave the original file as is and generate a temporary folder and sub-files, with the world split into cells and each file representing a cell. The name can be as simple as a hex string of X-Y-Z, and each file would hold just IDs and modifications for common objects which are stored elsewhere. The positions stored in each file should be normalised to the cell dimensions, the player position could also be normalised to the cell, and the cell loaded around the player would be stored in simple integers. This immediately resolves the infinity problem and simplifies grabbing which cells to load and draw.
As another note, this method means you can straight up start with the furthest you're willing to draw (such as 0xFFFFFFFF), decrement by cell size and attempt to load: a successful load results in content, a fail results in just the sky & maybe water if at the default sea level.
Another note: this can work for the "randomly generated" objects by having the seed at the top of the cell file and then modifications to each sequential object underneath.
btw Javid, if you're gonna include this in a video, don't put the messages themselves in the vid, just link 'em under the vid, you can just call me a random programmer :)
This is basically just a Bresenham line drawing algorithm. I was hoping for some kind of intelligent space partitioning algorithm that avoids the redundancy between adjacent pixels.
@@javidx9 thanks for the reply. I think I've got the hang of it now. I just need to look into the exact details of calculating the y and x lengths to see which is shorter. Then I have tiles to check if a wall is there.
Honestly, this seemed like the obvious thing to do. After you described the problem statement, and before you presented the algorithm, I was thinking "if the default approach is to take tiny steps, then something better would be to only compute intersections with the gridlines". Though I didn't think through the details of how you would go about that. The walking along as a vector with x or y component magnitude aspect of it is clever.

Hm, it also generalizes well into the 3-dimensional case. If Minecraft uses raycasting for whatever reason, but doesn't use this algorithm, perhaps it should. Though this would be complicated by things that are not grid aligned, like mobs and entities. Perhaps that would be most applicable to the Minecraft thing with RTX raytracing, though if it was implemented well then they probably used something like this algorithm or better anyway. Maybe this algorithm could be put to good use with non-axis-aligned things if objects in your world are sparse and you have a spatial partitioning scheme recording which objects exist in which voxels, thereby allowing you to narrow down possible collisions by first marching along the grid, then checking for potential intersections when the algorithm enters a grid cell that contains part of an object.

One overkill option you could maybe do is have a lookup table where the indices are the x and y delta of one point and another (assuming they're grid aligned, and exist on a 1024x768 screen). You can have a bit-vector representing whether to step next in the x (0) or y (1) direction. According to my calculations the lookup table would need about 1.3 GB, so probably not worthwhile, but yeah. If your grid size is a lot smaller it could maybe make sense? Oh, perhaps you could use a fixed-point representation instead, sacrificing a bit of accuracy - or perhaps you won't even lose precision if your source and destination are snapped to some grid (like the pixel coordinates), and you ensure the zoom level of the grid is snapped to some convenient values. This could improve the performance, since floating point operations are slower.
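For anyone who wants to convince themselves of the simplification discussed above, here is a small stand-alone check of the two formulas (the variable names merely mirror the idea, they aren't the video's code):

    #include <cmath>
    #include <cstdio>

    int main()
    {
        // Any normalised direction, here built from an arbitrary angle.
        float angle = 0.7f;
        float dx = std::cos(angle), dy = std::sin(angle);

        // General form: works even if the ray direction isn't normalised.
        float stepXGeneral = std::sqrt(1.0f + (dy / dx) * (dy / dx));
        float stepYGeneral = std::sqrt(1.0f + (dx / dy) * (dx / dy));

        // Simplified form: valid only because dx*dx + dy*dy == 1.
        float stepXSimple = std::fabs(1.0f / dx);
        float stepYSimple = std::fabs(1.0f / dy);

        std::printf("x: %f vs %f\ny: %f vs %f\n",
                    stepXGeneral, stepXSimple, stepYGeneral, stepYSimple);
    }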
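Purely as an illustration of the naming scheme described above (the function name and the ".cell" extension are invented here, not taken from the video or the comment):

    #include <cstdio>
    #include <string>

    // Build a file name for a world cell from its integer cell coordinates,
    // using a fixed-width hex string of X-Y-Z as suggested above.
    std::string cellFileName(int cx, int cy, int cz)
    {
        char buf[40];
        std::snprintf(buf, sizeof(buf), "%08X-%08X-%08X.cell",
                      (unsigned)cx, (unsigned)cy, (unsigned)cz);
        return buf;
    }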
I always treated vertical collision points and horizontal collision points separately. Dividing your normalized walk direction vector by dx gives you a vector with x component equal to 1. Walk along these steps to get a vertical collision. Repeat the same with y to get a horizontal collision. Finally choose the closer one.
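A minimal sketch of the "two separate walks" idea above, assuming a normalised direction and a map() lookup that aren't defined here; the horizontal pass is the mirror image, and the real hit is the nearer of the two:

    #include <cmath>

    bool map(int ix, int iy);              // assumed: returns true if the tile is solid
    constexpr float MAX_DIST = 64.0f;      // assumed search limit

    // Walk only the vertical gridline crossings (x = integer) of a ray starting
    // at (px, py) with normalised direction (dx, dy); returns the hit distance.
    float castVertical(float px, float py, float dx, float dy)
    {
        if (dx == 0.0f) return MAX_DIST;                        // never crosses one
        float stepX  = (dx > 0) ? 1.0f : -1.0f;
        float firstX = (dx > 0) ? std::floor(px) + 1.0f : std::ceil(px) - 1.0f;
        float t      = (firstX - px) / dx;                      // ray length to first crossing
        float dt     = std::fabs(1.0f / dx);                    // ray length per unit of x
        for (float x = firstX; t < MAX_DIST; x += stepX, t += dt)
        {
            int ix = (int)std::floor(x) - (dx < 0 ? 1 : 0);     // tile entered at this crossing
            int iy = (int)std::floor(py + t * dy);
            if (map(ix, iy)) return t;
        }
        return MAX_DIST;
    }

    // castHorizontal() is the same with x and y swapped; the wall you actually
    // hit is at std::min(castVertical(...), castHorizontal(...)).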
That's literally THE article that I read for learning raycasting
I think it's one of the best resources out there, certainly.
Me too lol
Me too
me too
Me too, years and years ago
I figured this out by myself while making a 2D platformer for a homework assignment, and thought it was the stupidest solution possible, but now I'm kind of proud of myself
Occam's razor for you
Same thing!
But I did it for 3D, now that's a bit more stupid ...
@@antoninperonnet6138 well, worry not, you have AABB implemented now then!
Be proud. _I_ nudged Javi in this direction with one of my comments on his "code a DOOMlike engine as a Windows console app" video he made last year. _You_ made a working implementation, which takes more work.
EDIT: This one, ua-cam.com/video/xW8skO7MFYw/v-deo.html
@9:06 It's very difficult for me to build the picture from the maths alone. Firstly, I thought that when dx=1, dy/dx=dy, which would mean the problem is solved... but actually we are finding the distance, not the gradient, so we still need to calculate that.
The other thing that confused me was from the previous part: sin(RayAngle) gives one component of a unit vector, which means the hypotenuse is 1 and x gets some value (i.e. EyeX), and the same is true for cos(RayAngle), which gives the y value to update (EyeY). So in the previous video that diagonal is moved by 1 unit (or maybe 0.1 units).
But this time we want to move in the x or y direction so that we know the hypotenuse distance. Does that mean we need to check whether sin(RayAngle) < cos(RayAngle), or do we need to compare nTestX < nTestY, where
nTestX = playerX + distanceToWall * EyeX
nTestY = playerY + distanceToWall * EyeY
to know which direction to go in, x or y? I think we must compare nTestX < nTestY.
I implemented this in Minecraft for my custom entity hitbox and motion system, it has allowed me to make physics-based movements for my mobs. It's really quite something.
Is it faster than the raycasting API provided by Spigot?
@@jaylawlmc1915 My implementation is in a data pack, so that is unlikely.
@@Dominexis oh interesting! I didn't know you could code things like raytracing into data packs. That's cool
@@jaylawlmc1915 If you want to learn more, you could hop on my Discord server and we can discuss how it works. I'd love to share it. The link is in the description of every one of my videos.
@@jaylawlmc1915 At least Bukkit's BlockIterator seems to use something similar, with added rounding or so.
"C for cimplicity"
C for complicated.
@@deleted_handle C for coping.
So far so good
That's amazing! I got into computer graphics in university and at that time we didn't visualize anything and concentrated mostly on the math aspects. I spent a few weeks in my semester break and coded a few computer graphics algorithms on my own. At the beginning it was really frustrating but then I really got into it and nowadays it is one of my favorite computer science topics. I wish they had taught us that beautiful subject like you did. It's super intuitive with a little bit of drawing. Thanks man, I really appreciate your great content!
Yeah, maths with a bit of sparkle goes a long way 🤣 Thanks Marc!
Before this, my raycaster was taking small little steps and it was SUUPER laggy. I was about to quit until I tried your method and it ran butter smooth! What a relief.
I don't know how to express how perfect your tutorials are. You are so patient in explaining things and I really love it. You are my hero in programming.
you precalculate the rise and run (you also calculate the *first* rise and run, as your position could be anywhere inside the square at first)
you set a variable to your X coordinate (call it "A")
in the loop:
--you get the Y coord of the next wall up
--you go right one square, and go up by the rise
--if you've crossed the Y coordinate of the next wall up:
----you increment "A" by the run
----do a wall check at (X - "A", Y = the Y coord of the next wall up)
--do a wall check at (X = current X pos, Y = current Y pos)
do this until you've found a wall
also flip some things to check other quadrants
also be careful with precision because rounding errors accumulate here
I'm just about to make a voxel raycast engine using octrees and signed distance functions, and was trying to think of how to do this. Thank you for the video.
Thanks for the video! I started to look into DDA ray casting and found the article you've mentioned, but couldn't understand it. So I started to reverse-engineer the RayCastWorld extension of the PGE, but my implementation wasn't working properly. You've saved me lots of headaches :)
heh yeah, it's a little different in RCW because it yields coordinate-perfect wall planes, and is selective about the side of the tile hit.
This video has been super helpful for me in implementing block selection in my Minecraft clone. I've tried twice to adapt the method to 3D and everything tends to work except when I aim down the planes made by any two axes. This second time I checked someone else's implementation of DDA and found a comment in their code where they note that truncating the ray origin into a vector of integers will have issues on the negative axes. Turns out this was exactly the source of my issue, and now, after applying a component-wise flooring, everything works fine.
to demonstrate why just truncating doesn't work: consider the case very close to an axis' origin: tile #0 contains the positions from 0 to 0.999..., tile #1 contains 1.0 to 1.999..., tile #2 contains 2.0 to 2.999...
seems like the perfect case for truncating, just lop off the decimal portion. But what about the negative side? If you simply truncate, then while 0.2 correctly corresponds to tile #0, -0.6 ALSO corresponds to tile #0, even though it is meant to be tile #-1. If you instead floor the position, then -0.6 correctly rounds to -1
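A tiny stand-alone demonstration of the difference described above (just an illustration, not code from the video or the Minecraft clone):

    #include <cmath>
    #include <cstdio>

    int main()
    {
        float positions[] = { 0.2f, -0.6f, -1.4f };
        for (float p : positions)
        {
            int truncated = (int)p;                // rounds toward zero
            int floored   = (int)std::floor(p);    // rounds toward negative infinity
            std::printf("%5.1f -> trunc %2d, floor %2d\n", p, truncated, floored);
        }
        // -0.6 truncates to tile 0 (wrong) but floors to tile -1 (right).
    }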
Wow you helped me so much! I never really understood the DDA algorithm so I couldn't implement it into my C# raycasting engine. The way you explained it really helped and I've got it working with my engine now. The speed has improved greatly from about 11fps to 500fps. Thanks!
Probably can make it even faster using the branchless version from Shadertoy, and even then it's possible to optimize it further by doing early termination, applying a mask without if statements, etc. There are many ways to implement it.
ah cool, I used this method but in 3d for voxels recently.
Haha me too! I've got it to real-time raycasting for small scenes on CPU, but want to improve my algorithm before porting to GPU. There are some clever ways of exploiting progressive rendering, to skip empty areas, and some shadow caching tricks I want to explore first.
you definitely want some sort of acceleration structure to skip empty space instead of marching it. You can start with simple octree raycasting and then go in the direction of SVOs, which are far more efficient on big datasets (>1024^3)
@@Alexander_Sannikov I've tried using octrees, but this kind of optimized ray marching code is so efficient that the overhead of navigating the tree is often more expensive than just "dumb" marching. Obviously this does depend on scene complexity & other factors; a mostly empty scene will always be faster with an octree. I'm thinking of using a '64-tree' in combination with a progressively rendered depth buffer: render at low resolution, use the lowest depth value of adjacent pixels to inform a guaranteed voxel-free frustum for that region of the render, render at the next highest resolution, skipping those regions, etc...
@@benjaminmiller3620 for any two given algorithms, one of which has O(log(N)) complexity and the other one has O(N) complexity, there exists M large enough so that for any N > M, the log-algorithm will be faster than the linear one. However, it also means that for any N < M it will be slower. Octree algorithms do have a significant constant hiding in that O()-notation, which is why they will be slower on small datasets. However, there will always be a threshold in dataset size so that it'll be faster for datasets greater than that threshold.
That hidden constant in O(log N) also greatly depends on your implementation details. For example, if you're using pointers to store the octree, it's a no-go -- you need to store it flattened. Even something as simple as rearranging your tree breadth-first vs depth-first makes no change from an algorithmic point of view, but has a drastic impact on cache coherency and overall performance.
This speeds things up from the old "step a tiny amount" method by several orders of magnitude, I reckon. Incidentally, it's remarkably easy to replace the wall finding code from your "First Person Shooter" videos with this.
I think I even have enough cycles left to repeat the raycasting process a couple more times now - I can have different tiles/blocks rendering above (or below) the "zero" height.
Also, I've tried Lode, Sage3D, Permadi and Vinibiavatti1's tutorials and while Vinibiavatti1's comes close, your tutorials are the only ones I've been able to follow and implement well and correctly.
Thank You Javid!
Now I'm off to make "RayCraft" (TM) 😀
I don't think you need to repeat the ray casting, just store more data in the ray (distance, block side, block height, block texture, maybe a block array). If you do define different levels (in height) of blocks, I think you do need to cast for each level.
@@PaulSpades You are correct; by the end I had the concept of cubes rather than walls and floors, so using a 3D array and walking each level was the way to go for me. But if the ground were just a heightmap, then yes, one cast for all levels would be more efficient.
Oh neat, just 12 hrs ago I posted a video of a Wolfenstein-ish renderer using this raycasting method that I ported entirely into a compute shader.
I'm glad you're putting this out now; the command prompt FPS was really neat, but I was very dissatisfied with that raycasting method. You have a clear way of communicating, and I appreciate that you code in a real language :P
Nice, this is how I learned raycasting as a kid, and subsequently programming as a whole from that project. I ended up incorporating everything I learned into that project in some way or another. The funny part is how you can manage to get this done in such a small amount of code, yet in Turbo and Borland C++ the size of this project, even before I started adding a bunch of features, was rather large in order to get it to run at a decent speed, needing things like DMA pixel transfer, special video modes and screen buffers, and fixed-point mathematics.
I was just working on something with the PGE and I saw this pop up in my notifications. Glorious days!
lol thank you Robben, make sure you share your PGE adventures!
I decided to watch this while being half asleep. The moment you explained how it works, my mind was blown away by how brilliant this algorithm is. Your explanation is so intuitive that even my half-asleep brain can understand it. Love this video
I could have used this for my 2D game with rope bending mechanics. I went the "several small increments" route for checking collision of the rope with the tiles. It was veeeery complicated to keep it bug free.
Nice explanation, I did something in my free time where I need to do multiple raycasts in 3D space from the same point to just slightly different target points and adapted the algorithm to advance every ray one step at a time, so every ray roughly checks blocks in the same area.
I made something similar for octrees some time ago. Glad to see how you would have done it
I have been looking for this exact solution for months as an aside to a voxel project I'm working on! I even used your A* algo for it (albeit heavily modified). Thanks for making such digestible content, it really helps beginner developers such as myself understand concepts that seem just a little out of reach :)
Used this for my Minecraft clone and it's working FLAWLESSLY thank you!
So I've been working on a game engine and this was LITERALLY EXACTLY what I needed for one of the issues in my code. Massive legend.
Using DDA to do longer jumps requires multiplication, which is a slower operation than many adds, especially on older computers (and 8-bit computers don't even have multiplication), so adding might be just as fast, as the only way to speed this up on such a CPU is with massive table lookups to do the multiplication, often on a system that is low on memory. I have done a raycaster on a C64 but will look more into whether I can use elements of DDA to speed it up even more, or if the multiplication tables will take too much space. But thanks for explaining the algorithm with such precision.
Fixed-point only requires addition and bitshifts IIRC in each loop iteration with the exception of a single multiplication and division outside the loop. A code snippet from my codebase which can rasterize over 1 billion projected 3D lines per second with the added cost of a hierarchical Z-Buffer depth test per pixel on the CPU (the code shown below is the simpler 2D version which accepts 2-vectors, but I use an extended version in 3D which accepts floating-point 3-vectors which then get converted into fixed-point within the function with the Z value interpolated at each step using bitshifts).
// Calls plot(x, y) for each pixel of a line drawn from p1 to p2 using fixed-point DDA.
// (Parts of this snippet were eaten by the comment formatting; the loop body is
// reconstructed to match the surrounding description, so treat it as a sketch.)
template <class Plot>
void plot_line(Plot plot, Vec2i p1, Vec2i p2)
{
    const Vec2i64 slen = p2 - p1;
    constexpr int32_t sign[2] = {-1, 1};
    const bool a_small = abs(slen.y) < abs(slen.x);   // index of the minor axis
    const bool a_large = !a_small;                    // index of the major axis
    const int64_t end_val = slen[a_large];
    const int64_t step = sign[slen[a_large] >= 0];
    const int64_t x_len = slen[a_large] * step;
    const int64_t slope = div_nz(slen[a_small] << 32ll, x_len);  // 32.32 fixed point
    Vec2i64 pt;
    pt[a_large] = p1[a_large];
    pt[a_small] = int64_t(p1[a_small]) << 32ll;
    for (int64_t i = 0; i <= x_len; ++i)
    {
        Vec2i64 out;
        out[a_large] = pt[a_large];
        out[a_small] = pt[a_small] >> 32ll;
        plot((int32_t)out.x, (int32_t)out.y);
        pt[a_large] += step;
        pt[a_small] += slope;
    }
}
Can use 32-bit instead with 16-bit shifts, or even 16-bit with 8-bit shifts, or 8-bit with 4-bit shifts depending on acceptable precision/range and hardware demands. 'div_nz' is just a function to perform division with a check to avoid division by zero, returning zero in that case. Not sure how well it'd perform on C64 with the single division and multiplication per line (not per step), but it's something I lean on heavily to rasterize wireframes on very high-resolution 3D meshes (millions of polygons each) in real-time and faster than the GPU can do it (at least consumer-grade NVIDIA and AMD GPUs are surprisingly slow to rasterize lines; I think because they actually convert each line to a triangle strip and rasterize them as triangles).
It'd need to be adapted a bit for raycasting to avoid possible ray leaks (the above doesn't plot any doubles as pixel artists call them) but the same fixed-point concept should apply to change the requirements for division and multiplication each iteration into bitshifts. I keep hearing that Bresenham is faster than DDA for line rasterization but I think that's assuming floating-point and not fixed-point. I've yet to find any Bresenham implementation beating this at least on today's hardware.
There should really be a specific name for this variation of the DDA algorithm. It doesn't help running into the generic version all over the web when the point is that, in a cell-sized grid space, you can just step to each axis intersection rather than by some small amount along the direction of the vector. Thanks for this, it's incredible!
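For reference while reading the thread, a compact sketch of the grid-stepping DDA being discussed (my own variable names and map() lookup, not the video's code, and the direction is assumed to be normalised):

    #include <cmath>

    bool map(int ix, int iy);   // assumed: returns true if the tile is solid

    bool castRay(float px, float py, float dx, float dy, float maxDist, float& hitDist)
    {
        int   ix = (int)std::floor(px),      iy = (int)std::floor(py);
        int   stepX = (dx < 0) ? -1 : 1,     stepY = (dy < 0) ? -1 : 1;
        float unitX = std::fabs(1.0f / dx),  unitY = std::fabs(1.0f / dy);  // length per cell
        // Ray length needed to reach the first vertical / horizontal gridline.
        float lenX = (dx < 0 ? px - ix : ix + 1 - px) * unitX;
        float lenY = (dy < 0 ? py - iy : iy + 1 - py) * unitY;
        float dist = 0.0f;
        while (dist < maxDist)
        {
            // Step into whichever neighbouring cell the ray reaches first.
            if (lenX < lenY) { ix += stepX; dist = lenX; lenX += unitX; }
            else             { iy += stepY; dist = lenY; lenY += unitY; }
            if (map(ix, iy)) { hitDist = dist; return true; }
        }
        return false;
    }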
You could talk about the raycasting technique used in Doom or Duke Nukem, I think it would make a great video, honestly
It took me an entire month to understand the lodev article... if only you had made this video earlier, I could have saved time
Niice! Just realized you can also use the same DDA algo to draw an antialiased line.
you can generate a distance field in O(n) time and use that to skip tiles based on how close the nearest filled tile is.
thank you, I've used this to implement raycasting in my engine. I've tested it a lot and it's amazing what these magical 40 lines of code can do
Since we are talking about straight lines we know that they have one constant angle to the x-axis and one constant angle to the y-axis. This means that for one ray you need to compute 1/cos(theta_x) and 1/cos(theta_y) only once and those are your increments in lengths of the line when we increment x and y by 1 respectively, no square roots needed since they are expensive to compute.
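A tiny sketch of the precomputation described above (names are illustrative; axis-parallel rays give an infinite step on one axis, which still compares correctly with IEEE floats):

    #include <cmath>

    struct RaySteps { float perUnitX, perUnitY; };

    // Computed once per ray: measuring theta from the x axis, cos(theta) and
    // sin(theta) are the x and y components of the unit direction, so 1/|cos|
    // and 1/|sin| are the ray lengths gained per unit step in x and y.
    RaySteps computeSteps(float theta)
    {
        return { std::fabs(1.0f / std::cos(theta)), std::fabs(1.0f / std::sin(theta)) };
    }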
There's another tutorial that's been around for a while, like Lode's... though it recommends using pre-calculated look up tables for sin and cos... www.permadi.com/tutorial/raycast/rayc8.html
so no overhead of square roots
Exactly what I needed for my roguelike game!
Once saw this being used for voxel tree traversal, had no idea what its called. Thank you.
15:00 when you shaded in the cells you checked, it shows that this is just a low-resolution pixelated line. Look at the fast algorithm for drawing a line: depending on the slope you always move either x or y; after a certain number of pixels you move 1 in the other axis.
I've done that in x86 (16-bit) assembly language, back in the late 80's through early 90's.
So, super-fast code for that is around, and even if you're not writing directly in ASM and keeping track of register usage, it still minimizes the number of values needed and avoids expensive operations like division and avoids branching.
When you think about it, the Bresenham line algorithm is the DDA algorithm with optimizations based on the quadrant the direction is in, so you could benefit from some of those same optimizations. (Plus the Bresenham line algorithm uses an error threshold of 0.5 to change direction to keep the line thin. You'd need to change that to 0.)
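An integer-only sketch along those lines (a widely used grid-traversal variant of Bresenham; the visit() callback and endpoint convention are assumptions here), which visits every cell a segment between two cell coordinates passes through:

    #include <cstdlib>

    void visit(int x, int y);   // assumed: whatever per-cell check you want to run

    void traverseGrid(int x0, int y0, int x1, int y1)
    {
        int dx = std::abs(x1 - x0), dy = std::abs(y1 - y0);
        int x = x0, y = y0;
        int n = 1 + dx + dy;              // total number of cells the segment touches
        int xInc = (x1 > x0) ? 1 : -1;
        int yInc = (y1 > y0) ? 1 : -1;
        int error = dx - dy;              // > 0: the next crossing is a vertical gridline
        dx *= 2;
        dy *= 2;
        for (; n > 0; --n)
        {
            visit(x, y);
            if (error > 0) { x += xInc; error -= dy; }
            else           { y += yInc; error += dx; }
        }
    }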
When I developed this method independently, all those years ago (heh, as many others did, it seems), I realized it's the 2D generalization of Bresenham's. (Which is to say the same thing, merely going the other direction!) You keep two error accumulators, increment and conditionally-decrement each by their respective slopes -- importantly, the slope doesn't need to be normalized anymore -- and just let it spin!
The importance of denormal is, for creating a 3D view, you project a view plane relative to the view angle, and this all happens with basic 2D geometry. Which, I didn't know complex arithmetic or linear algebra at the time, but it's exactly that, as it happens. There's no trig involved whatsoever (beyond the one sin/cos pair to start with), no divide by zero, no janky hacks for special angles or quadrants -- it just works!
And the thing about denormal slope is, it gives you the perspective-corrected distance automatically. Dividing the view plane into angles, say, is a whole different projection and needs a complicated correction to look right (to preserve straight lines). This does of course mean, you can't go to 180° FOV, but that's typical of perspective projection.
DDA/Bresenham has linear complexity: the longer the line you're casting, the more expensive it gets. For large-scale data (say, for rendering voxels) you need faster average-case convergence than O(N).
There are multiple approaches to do that, like octrees, kd-trees, BVHs, SVOs, brick maps, but that's beside the point; I just wanted to note that DDA is certainly not a fast solution from an asymptotic analysis standpoint. It's surely fast to implement though, and works well for small scenes with low-resolution grids.
Note, you can use a very similar algorithm to implement exact time-step-less collision physics with AABBs and circles/spheres.
In fact, with just _one_ more little coordinate substitution, you can make the object's position an implicit value computed directly from the current time based on object data that only needs to be updated (and synchronized across threads!) when an object actually collides.
Can you expand on how it could be used on AABBs? I know how AABB vs AABB is done (add the size of one box onto the other and then raycast), but I simply can't imagine how it can be done with this algorithm
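To illustrate the "grow the box, then raycast" step mentioned above, here is a generic slab-test sketch (struct and function names are mine; the grid/DDA part of the question isn't addressed here, and axis-parallel rays lean on IEEE infinity behaving in the min/max, which works in the common cases but has known edge cases):

    #include <algorithm>

    struct Box { float x, y, w, h; };   // x, y is the minimum corner

    // Intersect a ray (origin ox,oy, direction dx,dy) with a box that has already
    // been expanded by the moving object's half-size; tHit is the entry time.
    bool rayVsBox(float ox, float oy, float dx, float dy, const Box& b, float& tHit)
    {
        float tx1 = (b.x - ox) / dx,       tx2 = (b.x + b.w - ox) / dx;
        float ty1 = (b.y - oy) / dy,       ty2 = (b.y + b.h - oy) / dy;
        float tNear = std::max(std::min(tx1, tx2), std::min(ty1, ty2));
        float tFar  = std::min(std::max(tx1, tx2), std::max(ty1, ty2));
        if (tNear > tFar || tFar < 0.0f) return false;   // miss, or box entirely behind
        tHit = tNear;
        return true;
    }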
This is basically the same algorithm used in the Wolf 3D source. Although John Carmack used goto to switch between x steps and y steps. I read the code in Game Engine Black Book: Wolfenstein 3D, but the source code is also available free online. I read about it while playing around with your first console FPS video. :-)
very good explanation of the concept of raycasting with dda algorithm, thank you by the way!
I have a different method: scrap the ray casting and instead just calculate the x,y of the furthest point you want to draw, then skim through all the cells between that point and the player within a certain range of that point. While going through those cells you fill in a separate array with the closest points to the player that hold an object; those are what you actually draw. For an easy-to-understand image, imagine a square with the player at any corner looking at the opposite corner. You just start skimming from that opposite corner through every cell in the square towards the player, with each diagonal between the player and the opposite corner serving as a column. The edge columns will always have the adjacent corners as their closest cell, and the opposite corner will always be the furthest cell that can be declared to hold something.
btw the x,y of the furthest point can be calculated by this method: y = ceil(dist * normalized angle), x = ceil(dist - y); you can then just calculate the furthest cells that are in range by using the normalised angle and -(0 - normalised angle). Haven't directly checked any of that but I don't see any particular issues with it
I'd seen this before somewhere and was too lazy to look it up again :D
With the ol Voxel stuff this is invaluable! thank you.
I just wanna say thank you SO MUCH for this tutorial! You teach the topic so understandably and simply!
Maybe I can finally implement that collision algorithm I've been struggling with
The explanation is crisp as always, I could go and implement right after the theory!
I am actually surprised that the code you are showing actually seems to work. When I implemented this a few years ago, the first thing I encountered was that my program aborted with a division-by-zero exception whenever dx or dy was 0, so basically whenever my line was parallel to one of the coordinate axes. I had to implement special handling for those two cases to get the algorithm to work, which was easy enough to do, as you are simply going along one axis then.
Just wanted to add this here in case someone encounters the same issue as I once did. :)
Well infinity is a valid floating point number and the maths still works out when it is used, so should be ok.
@@javidx9 Yeah, but NaN is not infinity... well, we all assumed something.
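A tiny sketch of why that usually works out, assuming a typical IEEE-754 platform with floating-point exceptions left at their defaults (the abort described above is what you get with integer division, or with FP exceptions enabled); the variable names are purely illustrative:

```cpp
#include <cmath>
#include <cstdio>

int main()
{
    // A ray parallel to the y axis: dx is exactly zero.
    float dx = 0.0f, dy = 1.0f;

    // With IEEE-754 floats, dividing a non-zero value by zero yields +/-infinity
    // rather than aborting, so the step-size calculation survives this case.
    float stepX = 1.0f / std::fabs(dx);   // +inf: we never cross a vertical grid line
    float stepY = 1.0f / std::fabs(dy);   // 1.0

    // The DDA comparison "which accumulated length is smaller?" still behaves:
    // infinity never wins, so the x axis simply never gets stepped.
    std::printf("stepX=%f stepY=%f isinf=%d\n", stepX, stepY, (int)std::isinf(stepX));

    // NaN only appears for 0/0, which this step-size calculation never produces.
    return 0;
}
```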
Excellent video. I code my own games for fun every once in a while, and old-school 2D is my bread and butter. Although I never actually finish a game, I learn more each time. I started out with an Ultima clone, then Super Mario, then Mega Man, and eventually played with some raycasting. It amazes me how useful trig is for games.
That's very interesting! It could be useful in 3D tile-based games as well, for example raycasting in Minecraft.
4:47, coming back to this a year later and I still think it's faster to use normalisation and virtual/imaginary boxes, though I admit I may suck at explaining it. I'll try again in list format.
1. Imagine the space between the ray source and the ray target as the area of your imaginary square
2. Normalise the angle to your target into just the space in that quadrilateral - for example: ((angle / 360) % 4) * 4
3. Decide which axis the angle should represent - for example: x = (angle < 0.5) ? angle : 1.0 - angle, y = vice versa
4. Multiply both normalised values against the distance between the ray source and the scene borders to get the ray targets position (if you didn't have it already and were working from just an angle)
5. Get the distance between the ray source vs target, this is the limits of your imaginary box from point 1
6. Divide the larger edge by the smaller, so width by height or vice versa; this value is how many units (pixels, if it helps with imagery) you must process along the larger edge before moving up 1 unit along the smaller edge
7. Everything that shows up in those units is hit by the ray; do whatever with it. You can identify the other visible objects in a view cone by just iterating up to half the cone's width in units from the centre ones you find, then using a normalised semi-circle's points to identify the remaining ones at the cone's front (or should I call it the base? feels like the pointed end should be the base though)
**Edit:** Btw to apply the same math to 3d space just pretend the z axis is an x axis and do the math for each x axis separately, you'll end up with 2 y values, to get the mid point between them in the 3d cone's view, just multiply their normalised values against each other
Typically these types of visibility tests benefit from terminating as quickly as possible (finding the nearest occluder ASAP and stopping the search). If you want everything that intersects the ray or other primitive like view cone or a circular/spherical search parameter without the occlusion, I think best bet is to use a spatial index and use it to find all things that intersect the primitive in logarithmic time.
@@darkengine5931 I didn't say you can't use an index table with this. I'm saying that you use the size of the scene to determine how far to jump the x and y indices in the table before reading the next index and switching axis. In other words, because you know how far to jump, you skip the "is dist 1 greater than dist 2" check and keep adding the values until you pass the scene boundaries or find an index that is not 0.
@@zxuiji I see! But I think #7 sounds a bit expensive, since, if I understand correctly, you're performing a parameter search (something that I think would require a miniature loop or some additional inner branching). DDA achieves the equivalent of #7 by exactly determining which grid cell to check at each marching step, when all we have is a grid and the slope from #6, with just a tiny bit of arithmetic each step and no need for any inner loop. If I'm misunderstanding, could you describe #7 with some pseudocode along with the overall loop?
The next game I make that is tiled, I'll see if I can include this. I have only ever implemented raycasting for non-tiled things using a different method, but this seems simpler for tiled. (Maybe a tad slower, though, for large simple scenes.)
Best video ever on Raycasting really! I've finally understood it, all thanks to you!
2:12 I remember trying to make my own bad ray engine, and the way I implemented it was to go out half the length, check for collision, then if there was one, go back a fourth, then check again, go forward an eighth, and so on. It was way faster than the step-by-step approach shown here, but still pretty slow due to the engine I used.
edit, the method I used is called ray marching apparently
Zeno's paradox at its finest: you'll be eternally computing it and never reach that edge.
Found your channel from the procedural universe video. I'm very interested in your lessons now. When will we get a video on saving state in a procedural universe?
I think that combining this raycasting technique with quadtree rather than a static cell size would lead to an interesting implementation
Nvidia has an article like this on doing it with sparse voxel octrees; in fact, that's what I'm working on at the moment.
I want to thank you so much for giving me the idea. I just started writing a program whose main goal is raycasting. I have resources to write the code, but I really wanted to understand what is happening underneath it. It would be a lie to say that I now understand it 100%, but I have an idea of how it works, and it's all thanks to you. Subscribe and Like from me, really great job.
Just Joined your channel to show my support for your awesome videos :)
Hey thanks Python++ 😄 much appreciated!
This is one of the most useful videos on your channel, only bested by balls and capsules! :)
This is just absolutely genius
I substituted "fast inverse sqrt" into the olc::PGE, with pretty interesting results; overall it does give a frame rate boost, especially when casting many rays.
If you follow the link to the source you'll see an even faster implementation commented out that just uses the abs() function.
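For anyone unfamiliar with the term, "fast inverse sqrt" here refers to the classic Quake III-style bit trick. The sketch below is just that well-known version for reference; it is not the code from the video or the linked source.

```cpp
#include <cstdint>
#include <cstring>

// Classic "fast inverse square root" approximation (Quake III style).
// Trades accuracy for speed; on modern CPUs a hardware rsqrt or a plain
// 1.0f / std::sqrt(x) is often just as fast, so profile before swapping it in.
float fast_inv_sqrt(float x)
{
    float halfX = 0.5f * x;
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));   // reinterpret the float's bits
    bits = 0x5f3759df - (bits >> 1);        // magic-constant initial guess
    std::memcpy(&x, &bits, sizeof(x));
    x = x * (1.5f - halfX * x * x);         // one Newton-Raphson refinement step
    return x;
}
```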
Good tutorial !! I have a problem with the intersection circle, which is drawn in the center of the square and not on the edge.
Wouldn't the Bresenham line-drawing algorithm be a better implementation for game development, based on its ability to be implemented using only integers?
That's a good idea! For the fastest response, you'll probably want precomputed lookup tables of dx,dy for each vector direction. Not a problem with today's large-memory computers, but it was back then. Just make sure that you do the first step correctly, since it is a fractional step.
@@ImGelling Modified Bresenham, obviously. Optimized for Raycasting.
I don't think it's good compared to DDA. DDA lines work similarly to conservative rasterisers in that every pixel the line crosses is marked (perfect for a ray marcher), whereas Bresenham skips pixels, as it's optimised for one pixel per column (assuming a more horizontal slope). See 18:49 for an example of a DDA covering more pixels.
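To make the contrast concrete, here is a minimal sketch of the kind of grid traversal being discussed, in the spirit of the video's method but with invented names (`castRay`, a toy `isSolid` map) and the normalised-direction shortcut for the step sizes: every cell the ray crosses gets tested, in order, which is exactly the conservative behaviour Bresenham doesn't give you.

```cpp
#include <cmath>
#include <cstdio>

// Returns the distance to the first solid cell hit, or -1 if nothing is hit
// within maxDistance. isSolid(x, y) answers whether a grid cell is a wall.
template <typename IsSolid>
float castRay(float px, float py, float dirX, float dirY,
              float maxDistance, IsSolid isSolid)
{
    int mapX = int(std::floor(px)), mapY = int(std::floor(py));

    // Distance along the ray to cross one whole cell in x or y (assumes
    // (dirX, dirY) is normalised; division by zero yields +inf, which is fine).
    float stepSizeX = std::fabs(1.0f / dirX);
    float stepSizeY = std::fabs(1.0f / dirY);

    int stepX = (dirX < 0) ? -1 : 1;
    int stepY = (dirY < 0) ? -1 : 1;

    // Distance along the ray from the start point to the first x/y grid line.
    float rayLengthX = (dirX < 0) ? (px - mapX) * stepSizeX : (mapX + 1.0f - px) * stepSizeX;
    float rayLengthY = (dirY < 0) ? (py - mapY) * stepSizeY : (mapY + 1.0f - py) * stepSizeY;

    float distance = 0.0f;
    while (distance < maxDistance)
    {
        // Walk along whichever axis currently has the shorter accumulated length.
        if (rayLengthX < rayLengthY) { mapX += stepX; distance = rayLengthX; rayLengthX += stepSizeX; }
        else                         { mapY += stepY; distance = rayLengthY; rayLengthY += stepSizeY; }

        if (isSolid(mapX, mapY)) return distance;  // every crossed cell gets tested
    }
    return -1.0f;
}

int main()
{
    const char* map[4] = { "####", "#..#", "#..#", "####" };
    auto isSolid = [&](int x, int y) { return x < 0 || y < 0 || x >= 4 || y >= 4 || map[y][x] == '#'; };

    float d = castRay(1.5f, 1.5f, 1.0f, 0.0f, 100.0f, isSolid);  // looking along +x
    std::printf("hit at distance %f\n", d);                      // expects 1.5
}
```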
Even though I figured this out a long time ago myself, it's still engaging to watch you explain it with your Bob-Ross style. You could teach the world's children to become programmers. And DDA FTW!
Fantastic video, thank you for doing this.
I would love to see a good example of collision when a player takes up 2x2 units of tiles, running against a jagged square slant (like \ or /) so that the player is pushed to the side of the slant, moving both up and to the left or right.
Such an excellent video! Hats off to you sir!
This will be of interest to anyone trying to compile PGE v2.14 on the command line with VSCode.
It turns out that the order in which the libraries appear in the command line matters.
This works:
g++ -std=c++20 fastray.cpp -o fastray -static -luser32 -lgdi32 -lopengl32 -lgdiplus -lstdc++fs -lShlwapi -ldwmapi
If you're really stuck with library link-order problems, try this from Stack Overflow:
"Another alternative would be to specify the list of libraries twice:
gcc prog.o libA.a libB.a libA.a libB.a -o prog.x
Doing this, you don't have to bother with the right sequence since the reference will be resolved in the second block."
I did this to draw ropes using 8x8 sprites for NES without knowing it had a name :p Or at least a very similar technique.
Thanks very much for this video! I'm gonna make some tests with this raycast method :P
Amazing video, amazing voice!
Thank you so much!
Don't you have this for 3D as well? I would love to see it, as I am currently struggling with it :D
I was the 0x100 viewer, or at least that is what the counter showed, nice video!
I tried my hand at a 3D implementation. Can't tell if my variables were borked or if my math was. Either way, it ran FAR too slowly to be useful. Might end up revisiting it at a later date once I am using a shader instead of the CPU.
Also, Wolfenstein 3D did not use square roots. It was integer-only calculations back then, folks.
Love that ray casting article
Hello Javi, nice work. I would like to better understand all the formulas that you used in the video so I can implement them in my own code. Do you have any recommendation? Thanks again.
I have a video called "Maths for aspiring game developers" which covers the basics.
@@javidx9 Hello Javi, do you know how to scale sprites when texturing the walls? My grid size is designed for 8 units but my sprites are 128*128. Do you know if it's possible?
Thank you for the information.
This could be handy; I usually used pseudo-quadtrees for this kind of task.
If `direction` is a normalised vector, then `vRayUnitStepSize` can be calculated as `{ 1 / vRay.x, 1 / vRay.y }`, using the fact that `sqrt(vRay.x^2 + vRay.y^2)` is equal to `1`.
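A small sketch of that simplification, assuming `vRay` really is normalised (and taking absolute values so the step sizes stay positive in every quadrant). The general form is roughly the one derived in the video, as far as I recall; the function names here are made up.

```cpp
#include <cmath>

struct Vec2 { float x, y; };

// General form: works even if vRay is not normalised.
Vec2 stepSizeGeneral(Vec2 vRay)
{
    return { std::sqrt(1.0f + (vRay.y / vRay.x) * (vRay.y / vRay.x)),
             std::sqrt(1.0f + (vRay.x / vRay.y) * (vRay.x / vRay.y)) };
}

// Simplified form when |vRay| == 1: sqrt(1 + (y/x)^2) = sqrt(x^2 + y^2)/|x| = 1/|x|.
Vec2 stepSizeNormalised(Vec2 vRay)
{
    return { std::fabs(1.0f / vRay.x), std::fabs(1.0f / vRay.y) };
}
```

Both return the same values for unit-length input, since sqrt(1 + (y/x)^2) = sqrt(x^2 + y^2)/|x|, which is 1/|x| when the length is 1.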
I wonder if you can use a modified form of Bresenham's line algorithm or Xiaolin Wu's line algorithm. Both would probably be a bit faster than DDA (maybe 2x faster?). Also, neither requires the computation of a square root, only a single division.
Raycasting must both hit every tile and tell you how far away that hit was. I think you might be able to do something similar by testing all the x intersections then all the y then taking the shortest, but I doubt it would be any better.
Seems very fast indeed. But not quite [52 CPU clock cycle] fast. I'll have a lot more optimisation to do, but I think it might be doable with this as a base...
You could probably make it way faster using the branchless version from Shadertoy, and even then it's possible to optimise it further by doing early termination and applying the mask without if statements, etc. There are many ways to implement it.
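For context, here's a minimal sketch of what a "branchless" inner step can look like in 2D. This is my own paraphrase of the Shadertoy-style idea rather than anyone's actual shader, and the struct and function names are invented: the axis comparison is turned into a 0/1 mask that is multiplied in, rather than an if/else.

```cpp
// One "branchless" DDA step in 2D: instead of "if (lenX < lenY) ... else ...",
// compute a 0/1 mask and apply it to both axes. The comparison still exists,
// but compilers typically emit conditional moves rather than a branch.
struct DDAState
{
    float lenX, lenY;         // accumulated ray length to the next x / y grid line
    float stepSizeX, stepSizeY;
    int   mapX, mapY;         // current grid cell
    int   stepX, stepY;       // -1 or +1 per axis
};

inline void ddaStepBranchless(DDAState& s)
{
    int takeX = (s.lenX <= s.lenY) ? 1 : 0;   // 1 if we advance in x, 0 if in y
    int takeY = 1 - takeX;

    s.mapX += takeX * s.stepX;
    s.mapY += takeY * s.stepY;
    s.lenX += takeX * s.stepSizeX;
    s.lenY += takeY * s.stepSizeY;
}
```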
Love your videos. Can you please also do a floor casting in 3D raycaster? I am looking for an efficient way to map floor without having to cast one ray for each pixel on screen.
Thanks Harshit, my RayCastWorld "engine" does floor and ceiling sample still with columnar rays. ua-cam.com/video/Vij_obgv9h4/v-deo.html
I gave this a go using the pen & paper explanation you gave. I've almost got it working and I decided to compare what I have with the code you've written. I tried substituting my own variables into the algorithm with no success. If I understand correctly, what is being displayed is an "illusion" and all of the vectors and map coordinates are actually way up in the top-left corner of the screen. I think there was a similar thing going on in the A* video as well. Is that correct?
I can see the benefit, specifically where the map has no actual coordinates. That's like 2000 x and y variables that don't need to be created. I was gonna ask if I should strive for this approach, but I think I've answered my own question. Do all games really only take place in the top-left corner of the screen? Damn, that's crazy!
I have a suggestion for an interesting topic for a video. There is a 128-bit decimal in C#. What is the best way to create a 128-bit double, float, or decimal in C++? I did read about quad-precision floats and doubles, and possibly saw a couple of projects on GitHub, but I haven't looked into it that much. There is apparently an 80-bit extended-precision double which is supported by non-Microsoft compilers, but that limits its usefulness a bit, especially if you use Visual Studio.
I don't get 7:04. Could someone explain to me what cos(theta) and sin(theta) would mean in code, or even in plain maths? I can't picture it, please help.
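In case it helps anyone with the same question: cos(theta) and sin(theta) are just the x and y components of a unit-length direction vector pointing along the angle theta. A tiny sketch, with purely illustrative names and assuming theta is measured from the +x axis in radians:

```cpp
#include <cmath>
#include <cstdio>

int main()
{
    float theta = 0.5f;                 // ray angle in radians, measured from the +x axis

    // A unit-length direction vector pointing along that angle.
    float dirX = std::cos(theta);
    float dirY = std::sin(theta);

    // Walking a distance d along the ray from (px, py):
    float px = 10.0f, py = 4.0f, d = 3.0f;
    float x = px + dirX * d;
    float y = py + dirY * d;

    std::printf("dir = (%f, %f), point at distance %f = (%f, %f)\n", dirX, dirY, d, x, y);
}
```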
Very nice explanation. Thanks.
Amazing video thank you!
Since the direction vector is normalised to begin with, is there any need to do a sqrt? You can simply compute 1 / dx and 1 / dy to get the step size per unit length in those directions.
Thanks! I'm not sure I follow, suppose a scenario where dx=1 and dy=1 then the length is sqrt(2) which is greater than unity, so normalisation is still required?
@@javidx9 In this case, the vector that you're using to calculate vRayUnitStepSize is already normalised, so you know that the hypotenuse's length is 1. So if you divide all sides of the triangle by x, the x side becomes 1 and the hypotenuse becomes 1/x. Just to be sure, I double-checked the math in my code and both equations come out to the same answer for me.
I liked seeing the math behind the general case too, just thought it was cool that there was a simplification - that's all!
Interesting! (Eventually, I went "Wait a minute; that's just Bresenham's line algorithm with extra steps." :grin: )
Yes, it's Bresenham + permadeath. XD
Well, it actually is if you use it for a roguelike...
thanks a lot, i really needed this video!
i love this channel even though I understand shit .... feelin productive :D
Why not use the line formula y = (dy/dx)x + c and increment x by one unit? You can use integers instead of floats, arrays instead of vectors, and avoid the successive distance comparison.
You'll need a corresponding x=() but that should be no problem?
I was also confused by this.
Well, for 3D worlds that aren't saved as tiles, you could leave the original file as is and generate a temporary folder of sub-files, with the world split into cells and each file representing a cell. The name can be as simple as a hex string of X-Y-Z, and each file would hold just IDs and modifications for common objects which are stored elsewhere. The positions stored in each file should be normalised to the cell dimensions; the player position can also be normalised to its cell, and the cells loaded around the player stored as simple integers. This immediately resolves the infinity problem and simplifies grabbing which cells to load and draw.
As a note, the cell size can be decided by the local draw distance, with the remote draw distance being what you see beyond the cell the player is in.
As another note, this method means you can start straight away at the furthest point you're willing to draw (such as 0xFFFFFFFF), decrement by the cell size, and attempt to load: a successful load results in content, a failure results in just the sky and maybe water if at the default sea level.
Another note: this can work for the "randomly generated" objects by having the seed at the top of the cell file and then the modifications to each sequential object underneath.
Final note: the ray stuff would involve only one step; after that it's just loading cells towards the player.
Btw Javid, if you're gonna include this in a video, don't put the messages themselves in the vid, just link 'em under the vid; you can just call me a random programmer :)
And here I was hoping this was about discrete dipole approximation.
This is basically just a Bresenham line-drawing algorithm. I was hoping for some kind of intelligent space-partitioning algorithm that avoids the redundancy between adjacent pixels.
Thank you
I need to rewatch this video.. somehow I missed the part of finding the collision with a wall.
The array is either wall or not wall; since you know your position in the array, you know whether it's a wall or not.
@@javidx9 Thanks for the reply. I think I've got the hang of it now. I just need to look into the exact details of calculating the y and x lengths to see which is shorter; then I check the tiles to see if a wall is there.
Honestly, this seemed like the obvious thing to do. After you described the problem statement, and before you presented the algorithm, I was thinking "if the default approach is to take tiny steps, then something better would be to only compute intersections with the gridlines", though I didn't think through the details of how you would go about that. The way you walk along by whichever of the x or y component lengths is shorter is clever.
Hm, it also generalizes well to the three-dimensional case. If Minecraft uses raycasting for whatever reason but doesn't use this algorithm, perhaps it should, though this would be complicated by things that are not grid-aligned, like mobs and entities. Perhaps that would be most applicable to the Minecraft RTX raytracing, though if that was implemented well then they probably used something like this algorithm or better anyway.
Maybe this algorithm could be put to good use with non-axis-aligned things if objects in your world are sparse and you have a spatial partitioning scheme recording which objects exist in which voxels, thereby allowing you to narrow down possible collisions by first marching along the grid, then checking for potential intersections when the algorithm enters a grid cell that contains part of an object.
One overkill option you could maybe do is have a lookup table where the indices are the x and y delta between one point and another (assuming they're grid-aligned and exist on a 1024x768 screen). You could have a bit-vector representing whether to step next in the x (0) or y (1) direction. According to my calculations the lookup table would need about 1.3 GB, so probably not worthwhile, but yeah.
If your grid size is a lot smaller it could maybe make sense?
Oh, perhaps you could use a fixed-point representation instead, sacrificing a bit of accuracy. Or perhaps you won't even lose precision, if your source and destination are snapped to some grid (like the pixel coordinates) and you ensure the zoom level of the grid is snapped to some convenient values. This could improve performance, since floating-point operations are slower. (A minimal fixed-point sketch follows at the end of this thread.)
For sparse objects, you just check 'em. They're sparse!
@@javidx9 true.
Maybe then, sparse collections of many objects.
well, then to answer my own question. If the collections are sparse, you can check against a few bounding boxes, doh!
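Following up on the fixed-point suggestion a few comments up, here is a minimal 16.16 fixed-point sketch, purely illustrative (the helper names are made up), of how the accumulated ray lengths could be kept in integers:

```cpp
#include <cstdint>
#include <cstdio>

// 16.16 fixed point: upper 16 bits hold the integer part, lower 16 the fraction.
// Accumulating the DDA ray lengths in this format keeps everything in integer
// registers, at the cost of a fixed (bounded) precision.
using fixed = std::int32_t;
constexpr int   FP_SHIFT = 16;
constexpr fixed FP_ONE   = 1 << FP_SHIFT;

constexpr fixed  toFixed(double v)  { return fixed(v * FP_ONE); }
constexpr double toDouble(fixed v)  { return double(v) / FP_ONE; }

int main()
{
    // Example: a per-cell step size of 1.25 accumulated over four grid crossings.
    fixed step = toFixed(1.25);
    fixed acc  = 0;
    for (int i = 0; i < 4; ++i)
        acc += step;                        // pure integer addition each DDA step

    std::printf("accumulated length = %f\n", toDouble(acc));   // prints 5.000000
}
```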
God content
Ok, I joined your Discord. My first one. Any others you might recommend?
I always treated vertical collision points and horizontal collision points separately. Dividing your normalised walk-direction vector by dx gives you a vector with an x component equal to 1. Walk along in these steps to get a vertical collision. Repeat the same with y to get a horizontal collision. Finally, choose the closer one.
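A minimal sketch of that two-pass idea, assuming a normalised direction with non-zero components. It uses the classic "test vertical grid lines, then horizontal grid lines, keep the nearer hit" layout rather than the video's single interleaved loop, and the names are invented for illustration.

```cpp
#include <cmath>

// isSolid(x, y) answers whether grid cell (x, y) is a wall; cells are 1 unit wide.
// Returns the ray distance to the nearest wall hit, or maxDist if nothing is hit.
template <typename IsSolid>
float castTwoPass(float px, float py, float dx, float dy, float maxDist, IsSolid isSolid)
{
    float best = maxDist;

    // Pass 1: crossings of vertical grid lines (integer x values).
    {
        int   stepX = (dx > 0) ? 1 : -1;
        float xLine = (dx > 0) ? std::floor(px) + 1.0f : std::ceil(px) - 1.0f;
        float dt    = 1.0f / std::fabs(dx);       // ray distance per unit moved in x
        float t     = (xLine - px) / dx;          // distance to the first vertical line
        while (t < best)
        {
            int cellX = int(xLine) + ((dx > 0) ? 0 : -1);   // cell entered when crossing xLine
            int cellY = int(std::floor(py + dy * t));
            if (isSolid(cellX, cellY)) { best = t; break; }
            xLine += stepX;
            t     += dt;
        }
    }

    // Pass 2: crossings of horizontal grid lines (integer y values), symmetric to pass 1.
    {
        int   stepY = (dy > 0) ? 1 : -1;
        float yLine = (dy > 0) ? std::floor(py) + 1.0f : std::ceil(py) - 1.0f;
        float dt    = 1.0f / std::fabs(dy);
        float t     = (yLine - py) / dy;
        while (t < best)
        {
            int cellX = int(std::floor(px + dx * t));
            int cellY = int(yLine) + ((dy > 0) ? 0 : -1);
            if (isSolid(cellX, cellY)) { best = t; break; }
            yLine += stepY;
            t     += dt;
        }
    }
    return best;   // the closer of the two hits, or maxDist if neither pass found a wall
}
```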
This is not DDA, this is WOO.
How can we know the face of the box where the intersection happens? Also, can this work in 3D?
Detecting the face just requires some creative if/else action based upon the ray direction, and yes I believe this will work just as well in 3D.
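A small sketch of the kind of if/else being described, assuming the cast loop records which axis the final step moved along before the hit. The enum and names are invented, and the same idea extends to 3D by adding two more faces for the z axis.

```cpp
enum class Face { MinX, MaxX, MinY, MaxY };   // which side of the hit cell was struck

// lastStepWasX: true if the final DDA step that entered the hit cell moved in x.
// dirX, dirY:   the ray direction used for the cast.
Face hitFace(bool lastStepWasX, float dirX, float dirY)
{
    if (lastStepWasX)
        return (dirX > 0) ? Face::MinX : Face::MaxX;   // moving +x enters through the cell's low-x side
    else
        return (dirY > 0) ? Face::MinY : Face::MaxY;   // moving +y enters through the cell's low-y side
}
```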