Perspective Projection Matrix (Math for Game Developers)
- Published 22 May 2024
- In this video you'll learn what a projection matrix is, and how we can use a matrix to represent perspective projection in 3D game programming.
You'll understand the derivation of a perspective projection matrix in 3D computer graphics. The matrix I'll derive is the one used by left-handed coordinate systems like DirectX (OpenGL uses a right-handed system).
In perspective projection, objects that are far away appear smaller, and objects that are close to us appear bigger. We will learn how to perform this type of 3D projection using a projection matrix.
This is a very useful tool for 3D programmers, game developers, and other computer graphics professionals. Understanding the math behind how projection is achieved is important, and I want you to own this knowledge at the end of this video!
For comprehensive courses on computer science, programming, and mathematics, visit: www.Pikuma.com.
Enjoy!
To those still scratching their heads after watching the end of this video and putting the perspective divide in your [vertex] shader here is the conclusion he left out:
The output of the vertex shader is expected to be in clip space, not NDC space (that is, with the projection matrix applied to the camera-space vec4, but without the perspective divide). Then, after the vertex stage has run, the perspective divide is handled implicitly by the pipeline: x, y, and z are divided by the w value stored in the vertex-stage output, and the resulting depth is what ends up in the depth buffer.
Also, contrary to his statements, both modern OpenGL and DirectX assume clip space to be a left-handed coordinate system. Vulkan assumes clip space to be right-handed (with the positive Y axis facing down). The real difference is that OpenGL's clip space expects the near and far planes to be mapped into the -1 to 1 range, whereas DirectX and Vulkan expect them to be mapped into the 0 to 1 range.
This was *exactly* what I was looking for! Most tutorials are either way too simplistic, are focused on using specific game engines, or are so incredibly advanced that you need a PhD level to understand them. Your tutorial hit that exact sweet spot in the middle, where you get complete, detailed info, but can still understand it without already having ten years of experience. This vid made me subscribe.
Is this a joke, I just started learning rendering stuff and you just posted about the subject I was searching for. Your channel is a gem keep up the good work and thank you soo much for this quality content!
That feeling when you find that one video that answers all of your questions👍👌👌👌 thanks for the amazing explanation
Thank you for the great video! By the way, congratulations for completing the new physics tutorial. Another one that I really want to do!
I can imagine that the last few weeks have been tough. It is often like that on the last mile of a project, right?
Do you already have an idea of what topic you will do next?
I was super lucky to find this channel. After picking up some C/C++ and linear algebra basics I will definitely proceed to a paid course about CG, game engines, and raycasting, to finally start my computer graphics journey after 20 years of procrastination :)
This is the best video I've seen explaining how this perspective projection matrix works. Thank you so much! 😊
Excellent explanation! Very well done and paced. I learned a lot!
Very nice and detailed explanation. Thank you very much!
this video is just something else, it explains so well a really complex thing.
Wonderfully explained. Clear and thorough
Your channel is amazing, muito obrigada!!
Took me weeks of learning this in school and it's summarized into one short video. Great job
That's the best video I've seen in the last few months. I subscribed to your channel and liked the video. Keep it up, you did a great job.
All I can say after watching this video is awesome. You are the best by far compared to other videos on this topic.
great explanation. I was learning shader stuff and it helped a lot
thanks so much, this was very useful and handy for me
An amazing explanation!
Thank you!
I think you dropped this man 👑
Thank you so much!
this is the best video for beginner
great work and great explanation! thank you very much!
I have a question: how can I determine the angle of a normal with respect to the projection? I mean, how can I determine whether a side of a cube is currently visible to the camera?
Glad it was helpful! 🙂
You deserve 1 trillion views.
Awesome ♥♥
I'm already saving for the 2D Physics game course :)
Just saying, I'm enjoying the Raycasting course while at it
No rush. 🙂 Enjoy every minute of it!
Fuck yesss, I have yet to watch but I'm glad to see this!
@pikuma looking at your equation, if z = znear, it looks like the outputted z value is going to be 0. From your explanation of NDC, shouldn't it equal -znear?
Is there any logic/intuition behind the lambda expression? I'm looking everywhere for derivation of that, but can't find any info.
This is really well explained. Do you go into more depth in your course "3D Computer Graphics"?
Sure. We cover a simplified (but complete) pipeline.
@Pikuma Please correct me if I'm wrong, but if you define aspect as height / width, then you don't need to invert the fov formula. I.e. when aspect = h/w, then fov = tan(angle / 2). Inversion is needed if aspect = width / height.
Hi @AmarelSWTOR. That's a good question. I thought the inversion was needed to correctly set the FOV angle as "inversely proportional" to how we scale the screen x and y. The FOV I'm using in the code is the vertical FOV (with aspect = h/w).
Is this true?
@@Dannnneh it's true
14:27 In this part you keep switching between whether the range is 0 to 1 or the range -1 to 1
Hi. I think I meant "0 to 1" for values that are in front of us.
@@pikuma Hello. Thanks for answering back. I'm having a bit of an issue that I can't really solve though. Normally the camera in view space is oriented to face towards the -Z direction. I have a near clipping plane of 1 and a far clipping plane of 100. This would mean that all the points that should fit inside the clip box should have a z value between -1 and -100 since I'm facing the -Z direction. For some reason tho it's reversed so only the values between 1 and 100 are inside the clipping box (have a final z value between 0 and 1).
I just figured out how to fix this while typing the comment but I'm going to post it anyway because it might help someone in the future. I fixed it by using negative values for my near and far z planes. So instead of Znear being 1 it's equal to -1 and likewise Zfar is equal to -100 instead of 100. I have no idea whether it's normal to use negative values for clipping planes but that's what I've done to fix my issue.
Edit: DON'T DO THIS. The x and y components are going to flip sign (negative becomes positive and vice versa)
Please correct me if I am wrong: the normalization of z to 0 and 1, considering that what we are seeing is between znear and zfar, should be z' = (z - znear)/(zfar - znear), shouldn't it?
I wonder how I could apply this with regards to VR and canted or non-parallel displays, as those view matrices are completely different both from a flat screen and each eye. Like for instance, in the left eye m20 is the inverse of m02 and m00 is the same value as m22. This is extremely hard to figure out
Hm, are you sure it's not simply a different handedness or system (like OpenGL) that achieves something similar but using different matrix entries?
Can you kindly point out the resource you're using? 🙂
I understand this, however I still don't get how you pass from 3D coordinates to 2D. Since the screen is actually a 2D raster, once you have the 3D coordinates, how do you represent the points in 2D? I'm stuck with this. Great videos btw, by far the best ones I've seen about computer graphics.
After the perspective divide (where we divide both x's and y's by z), we simply plot a pixel at point (x,y) on the screen. It's almost like we forget there was a z, and we draw the x & y on the 2D screen.
@@pikuma Thanks for the response! this actually clarified a lot for me.
It seems the calculation you have done for the perspective projection matrix assumes Z from 0 to 1 (to get Zf/(Zf-Zn) and -ZfZn/(Zf-Zn)) and not -1 to 1. For -1 to 1 the values would be (Zf+Zn)/(Zf-Zn) and -2ZfZn/(Zf-Zn).
Great tutorial. I understood a lot, but my tiny brain still can't handle some things.
I still don't understand why we need zfar and znear... what would happen if we didn't use them?
Hi krystof. They define what is visible in terms of depth. What is the closest and the furthest z value we will consider for the projection in the screen? Everything outside znear and zfar we won't consider.
Another goal of the near and far planes is clipping (see my video about the stages of the graphics pipeline), so we don't try to project vertices that are too close to the eye-point. If we tried to project and divide by zero (the division by z in the perspective divide), that would be a problem.
Brilliant. Thanks
19:40 I'm not sure why, but I don't understand the yellow part with the minus sign. Is it multiplied by w as well, or is it just added without multiplication?
(Is the negative value just added to z, as in x*0 + y*0 + z*(far/(dist)) + (-(far/(dist))), or is it x*0 + y*0 + z*(far/(dist)) + w*(-(far/(dist)))?)
Good question. The element [3,3] of the matrix multiplies Z, and the element [3,4] multiplies w (which is 1); its negative value is then added to the [3,3]*Z term.
It is:
(zfar/(zfar-znear)) * Z - (znear*(zfar/(zfar-znear))) * 1
All this will be stored in the final vector Z component.
@@pikuma Oh thanks for explanation. I'm learning this because I'm trying to render things without OpenGL or anything.
I'm still having problems with transforming Vector in 3D space to normalized Vector of the screen.
@@SkrovnoCZ Hm, I see. I believe OpenGL uses a different perspective projection matrix than the one I mentioned. All the steps are still basically the same, but the handedness and the final normalization are a bit different.
For a breakdown of OpenGL's way of doing projection, this website is great:
www.songho.ca/opengl/gl_projectionmatrix.html
@@pikuma Thanks. I'm not using OpenGL because I don't understand it. I'm just doing printf() of a prepared string which will output 3D shapes.
You are explaining it great, btw.
Do you also have a video about "clipping" in image space? (When a triangle is on the borders of the image space, the 3 points of that triangle become 4, cutting the remaining part out of bounds, so the triangle becomes a quadrilateral?)
@@SkrovnoCZ Sure. All the pipeline is covered in our lectures at pikuma.com, including clipping. Although we do frustum clipping in world space in our code.
I need to understand some things:
1 - after the projection, can I use the X and Y normally, without Z?
2 - looking at the projection matrix function parameters, the 'aspect' is H/W... understood... but what are the best values for 'fov', ZNear, and ZFar?
Hi Joaquim. After the projection we usually render points (x,y) on the screen.
That works ok, but in reality GPUs store the old z (depth) value with the projected point because it is useful for certain computations later (like texture mapping our polygons on the screen considering perspective). We need the z (actually 1/z) for that!
Znear and Zfar you can manually choose for your game as you want. Some games use a very big Zfar, while others use a very small Zfar (and we can see less objects at the distance).
Back in the day of slow machines we used small Zfar values (clipping objects to improve CPU performance). Sometimes we even added a *fog* effect to mask that visible/aggressive far clipping.
@@pikuma Thanks so much for everything. Do you have a video that uses a projection with dots/lines? I need to learn more things ;)
@@joaquimjesus6134 Sure. I have the lectures on 3D graphics at pikuma.com that cover a complete software renderer.
@@pikuma Thank you so much. One more thing: what is the best 'fov'?
@@joaquimjesus6134 Same thing, some games like 60 degrees, other games use 90 degrees... it depends on what's the angle of opening you want and how many objects you want to see inside your 'field of view' in your game.
There's no correct answer. 🙂
Abraços!
I have seen this video a few times and I will watch it more. There's one thing that you don't mention: do I need to convert degrees to radians? Does the computer use radians instead of degrees?
Most graphics frameworks expect values to be in radians. Degrees are only used to display or input angles from the user via UI. In programming, it's usually all done in radians.
@@pikuma thank you so much for all
dude, your brazilian is pretty strong, i can tell it by the tone
I'm glad. 🙂
Why would you divide `result.z / result.w` at the very end? What is the point of "perspective shrinking" the distance factor? Seems an unnecessary step, particularly if z was already normalized.
The normalization of the z values (value between 0 and 1) happens *after* the perspective divide.
I need to understand something. Does the projection matrix receive the vertices of the world objects already normalized, or does the matrix take care of normalizing them into the range -1 to 1?
The projection matrix receives the values as they are in world space (not normalized), and the normalization of x's, y's, and z's (between -1 and 1) happens as we multiply by the projection matrix and also after the perspective divide (which performs the division by w).
@@pikuma When it refers to world values, does it refer to values that are outside the range -1 and 1? For example, I can put an arbitrary value for a vertex, maybe (5.0, 2.0, 3.0), and then the matrix will take care of normalizing it so that it is within that range (-1 to 1).
@@stevenriofrio7963 Yes, world space is basically any value in the 3D world... (0,0,-3.6), or (-4.5, 5.8, 47.0), etc.
@@pikuma The last question. If I provide one of my vertices with a z coordinate of (60.0) and my "zfar" is 20.0, will it not be seen on the screen? . Thank you very much for responding.
@@stevenriofrio7963 There's a little more to it, and it involves something called clipping. There is a stage where we clip all the triangles to only have objects inside the view frustum. The clipping happens at the top, bottom, left, right, and also the near and far planes. That's why vertices outside znear and zfar get discarded (clipped out of our final view) and we only render objects inside the view (between -1 and 1).
I’m embarrassed, but why do we multiply the x component by the whole aspect ratio, instead of multiplying x by screen width and y by screen height? The unadjusted screen is a unit square, and we’re just stretching the square to fit the (rectangular) monitor. What am I missing?
Because it's a ratio, like 1.5 to 1. The y is always multiplied by 1 and the x by 1.5. Because it's always a something-to-one, we drop the 'to one' part.
@@neoncyber2001 so like y = 1 and x = 1(x/y). Where y is 1?
I don't understand lambda much. Why don't we divide zfar/znear to get the ratio? Why zfar/(zfar-znear)?
15:36 is what I don't get. How is λ derived? I've spent 3 days over this and still don't get it. Everything else is pretty easy
can relate lol
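One way to derive λ (a sketch, using the video's left-handed 0-to-1 depth convention): the matrix can only store an affine expression λz + b in the z slot, and the pipeline later divides it by w = z, so we solve for the two constants that send znear to 0 and zfar to 1.

```latex
\frac{\lambda z_n + b}{z_n} = 0 \;\Rightarrow\; b = -\lambda z_n
\qquad
\frac{\lambda z_f + b}{z_f} = 1 \;\Rightarrow\; \lambda z_f - \lambda z_n = z_f
\;\Rightarrow\; \lambda = \frac{z_f}{z_f - z_n}, \quad
b = -\frac{z_f\, z_n}{z_f - z_n}
```

That division by z is also why the plain linear map (z - z_n)/(z_f - z_n) doesn't appear: it would only be correct if there were no perspective divide afterwards.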
What are the values of fov, znear and far? How can I get them?
You can pick them yoursef. Some games use a FOV of 60°, others 50°, etc.
Znear and Zfar the same thing. Some games have a znear of 100, others 1000.
It's up to you, the programmer.
@@pikuma okk got you! Thank u!
Eyes on the prize! This guy is a national pride haushs
I was wondering, isn't the following matrix correct?
projectionMatrix = [
[aspectRatio * FOV, 0, 0, 0],
[0, FOV, 0, 0],
[0, 0, lambda, 1 ],
[0, 0, -lambdaOffset, 0],
]
since we have to subtract the lambdaOffset for the Z component, wouldn't it be better if it was in the 3rd column? (, '-')a
I just spotted the difference!
I was doing the vector[1x4] · matrix[4x4];
you're doing the matrix[4x4] · vector[4x1].
But unfortunately, my rendering is still all messed up.
I don't understand anything about normalizing z.
:D
(zFar / (zFar - zNear)) will never be between 0 and 1... Think about it: (10/(10-1)) or (100/(100-20)).
My understanding is that this z "normalization" will happen after the perspective divide, placing the z values between 0 and 1 (in front of us in a left-handed system). Or simply -1 and 1 in most APIs.
@@aprile1710 I was able to build an engine without this part. I use the following matrix below, then I perform perspective division. Works just fine without this step. (See link below)
// Perspective Projection Matrix
float persp[4][4] = {
{aspect * 1/tan(fov/2), 0, 0, 0},
{0, 1/tan(fov/2), 0, 0},
{0, 0, 1, 0},
{0, 0, -1, 0}
};
ua-cam.com/video/IO9sT3t2fSc/v-deo.html&ab_channel=AlexFish
its a shame he ruined the video by flashing his satan hands all the way through it.
...Excuse me?
Why do we multiply x by the aspect ratio but not y?
this is exactly what i wonder
Because it is a ratio between the x and y axes, specifically the width and the height, which means one of them is the base that is always 1.0 (100%), while the other is expressed as a percentage of that base. For instance, a screen 500 pixels high and 1000 pixels wide has an aspect ratio of 0.5:1.

The reason you only multiply x and not y is, once again, that it is a ratio: if you multiplied both of them, nothing would change, and that is not what we want. Assuming an object originally comes from a space that is considered square, the distribution of values across the x-axis and y-axis is equal, but that is not the case in screen space, because the width and height are not equal.

For example, take a square with one of its vertices at (0.5x, 0.5y). Converted to screen space without multiplying by the aspect ratio, what happens? 0.5y is 50% of the height and 0.5x is 50% of the width, and 50% of 500 pixels and 50% of 1000 pixels are clearly not the same. However, if you now multiply x by 0.5 (the aspect ratio we just calculated), 0.5 × 0.5 = 0.25, and 25% of 1000 pixels is indeed equal to 50% of 500 pixels; thus the square is now rendered correctly in screen space.