Just bought your course!! Pretty cool to find someone talking/teaching NeRFs, since LLMs and Diffusion models stormed and got all the attention haha
Thank you so much! I am glad that you like the content, and I hope you will like the course. Great videos about NeRF will be released soon :)
Awesome. Please keep working in this field.
Thank you! I have a few upcoming videos related to NeRF, and will produce more if people are interested.
My apologies, I mean the argument "dataset" used in lines 11 to 36 in the test function. Does it take the dataset in the LLFF format from the pkl file? I don't get it, thanks!!
Hi, thank you for your question, and excuse me for the delayed answer. Yes, it is using the data from the pkl file, which I generated directly from the NeRF data to make things easier. It can be downloaded from the GitHub link.
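For readers wondering how such a pre-processed pkl file is typically consumed, here is a minimal, hypothetical sketch. The file name and the per-row layout (ray origin, ray direction, pixel colour) are assumptions for illustration, not a confirmed description of the file shared above.

    # Hypothetical loading code: the layout [origin(3), direction(3), rgb(3)]
    # per row is an assumption, not the confirmed format of the shared file.
    import pickle

    import numpy as np
    import torch

    with open("training_data.pkl", "rb") as f:        # hypothetical file name
        data = np.asarray(pickle.load(f), dtype=np.float32)

    dataset = torch.from_numpy(data)                   # assumed shape: [n_rays, 9]
    ray_origins    = dataset[:, 0:3]                   # assumed: ray origin per pixel
    ray_directions = dataset[:, 3:6]                   # assumed: normalised ray direction
    ground_truth   = dataset[:, 6:9]                   # assumed: RGB colour of the pixel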
Thank you very much! I learned the point of NeRF from your video.
Glad to hear it, thank you! :)
Does the code you've shared have the variable "dataset" defined? I don't see it. Is the output of the code a PNG file with the rendered image? Then is it possible to get a mesh? Thanks for your assistance.
Hi @businessplaza6212, thank you for your question. The GitHub code has several variables named "dataset" in different functions, so I am not sure I understand your first question; could you please rephrase it? Yes, the output is a rendered 2D image. It is possible to get a mesh, and I explain how to do it in my course. Otherwise, you may also be interested in this notebook from the initial NeRF paper: github.com/bmild/nerf/blob/master/extract_mesh.ipynb.
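For readers who just want the general idea behind the linked notebook: the usual recipe is to query the trained network's density on a regular 3D grid and run marching cubes on it. The sketch below assumes a model(points, directions) call that returns colour and density; your own network's interface, scene bounds and threshold may differ.

    # A rough sketch of grid-density + marching-cubes mesh extraction.
    # `model`, the scene bounds and the density threshold are assumptions.
    import numpy as np
    import torch
    from skimage import measure

    N = 128                                            # grid resolution
    t = np.linspace(-1.5, 1.5, N, dtype=np.float32)    # assumed scene bounds
    xyz = np.stack(np.meshgrid(t, t, t), axis=-1).reshape(-1, 3)

    sigmas = []
    with torch.no_grad():
        for chunk in torch.from_numpy(xyz).split(65536):
            dirs = torch.zeros_like(chunk)             # sigma does not depend on direction
            _, sigma = model(chunk, dirs)              # assumed (colour, density) output
            sigmas.append(sigma.squeeze(-1).cpu())
    sigma_grid = torch.cat(sigmas).reshape(N, N, N).numpy()

    # threshold to tune for your scene; returns vertices and triangle faces
    verts, faces, _, _ = measure.marching_cubes(sigma_grid, level=30.0)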
Thank you for your fast reply! Your work is great! I'm wondering about the "dataset" variable that you use in line 106. Where is it defined? Could you clarify, please? I will buy your course, as I'm working on a NeRF thesis for my MSc in ML. Did you combine the transforms JSON file from COLMAP into a pkl file?
Hi, I am so sorry I forgot to answer. Most questions are already answered in other comments. Do you still need clarifications?
I am planning to buy your course, but will I be able to generate the mesh from the capture!?
Hi, thank you for your question. Unfortunately, not in high quality. We discuss the ray marching algorithm and use it to extract a mesh from NeRF. However, the mesh is not high quality and does not have colours. If you want a coarse mesh, that is fine, but if you have high expectations for the quality of the mesh and need colours, then you would need more advanced algorithms than the ones used in the course.
How would you add the coarse and fine networks improvement?
Hi, thank you for your comment. I am planning to add a video about it. I hope I can release it in the near future
@@papersin100linesofcode I subscribed, thank you
Absolutely great video! It really helped clear up the paper, seeing things implemented so straightforwardly. I have a few questions. What type of GPU did you use to train this model? When creating the encoding, you initialize your out variable to have the position vector placed in it (making the output [batch, ((3 * 2) * embedding_pos_dim) + 3], adding that trailing +3). Was there a reason for doing that? I mean, adding it surely doesn't hurt. Batching the image creation is also a great idea for smaller GPUs. Thanks again for such a great video!
Hi, thank you for your great comment!
1) I should have used a P5000 or RTX5000.
2) I am not sure which line you are referring to?
I understand the 10*6 for the positional encoding, but why did you add 3 to it? Posencdim*6+3?
Hi, thank you for your question. This is because we concatenate the position to the positional encoding. This is not mentioned in the paper, but done in their implementation.
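For anyone else puzzled by the +3, here is a minimal sketch of the idea: the raw coordinate is kept and the sin/cos features are appended to it, giving 3 + 3*2*L output channels.

    # Minimal sketch: the raw input x is concatenated with its sin/cos
    # features, which is where the trailing "+3" in the output size comes from.
    import torch

    def positional_encoding(x, L):
        # x: [batch, 3] positions (or directions)
        out = [x]                                  # keep the raw coordinates -> the "+3"
        for j in range(L):
            out.append(torch.sin(2 ** j * x))
            out.append(torch.cos(2 ** j * x))
        return torch.cat(out, dim=-1)              # [batch, 3 + 3 * 2 * L]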
@@papersin100linesofcode Oh, I understand. Thanks a lot!
Hey, I have a small question: NeRF takes a 5D input, position and view direction. Is there a way to get the view direction from a rotation matrix (3x3)?
Hi, thank you for your question. Do you mean the camera-to-world matrix (c2w)? If so, yes, and actually the direction is already computed from it most of the time. The direction is computed from the camera, using the 3x3 rotation part of its c2w matrix.
@@papersin100linesofcode Yes, can you please tell me the formula that is used to get them?
@@aditya-bl5xh you may be interested in this script github.com/kwea123/nerf_pl/blob/master/datasets/ray_utils.py. I will soon make a video about it
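For completeness, here is a hedged sketch of the usual formula (the same idea as in the linked ray_utils.py): build per-pixel directions in the camera frame from the intrinsics, then rotate them into world space with the 3x3 rotation part of the c2w matrix. The -z forward convention assumed here is the common NeRF/OpenGL one; check what your data uses.

    # Sketch of ray directions from intrinsics + c2w (conventions may differ
    # in your data; this assumes the camera looks along -z in its own frame).
    import torch

    def get_rays(H, W, focal, c2w):
        i, j = torch.meshgrid(torch.arange(W, dtype=torch.float32),
                              torch.arange(H, dtype=torch.float32),
                              indexing="xy")                         # i: x pixel, j: y pixel
        dirs = torch.stack([(i - 0.5 * W) / focal,                   # camera-frame directions
                            -(j - 0.5 * H) / focal,
                            -torch.ones_like(i)], dim=-1)            # [H, W, 3]
        rays_d = dirs @ c2w[:3, :3].T                                # rotate into world space
        rays_o = c2w[:3, 3].expand(rays_d.shape)                     # camera centre for every pixel
        return rays_o, rays_d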
@@papersin100linesofcode thanks! Appreciated
This is so nice. Just bought your course
Thank you so much! You can download it here drive.google.com/drive/folders/18bwm-RiHETRCS5yD9G00seFIcrJHIvD-?usp=sharing. You will understand in the course how it was generated :)
A practical question: how do people figure out the viewing angle and position for a scene that's been captured without that dome of cameras? The dome of cameras makes it easy to know the exact viewing angle and position, but what about just a dude with one camera walking around the scene taking photos of it from arbitrary positions? How do you get theta and phi in practice?
Hi Jeffrey, thank you for your question. In practice, people use COLMAP (an open-source pipeline) to estimate the camera parameters.
The camera parameters can also be learned (have a look at my video about NeRF-- if you are interested)
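And once COLMAP (or a learned method) has given you a camera-to-world matrix, theta and phi are just the spherical coordinates of the camera's viewing direction. A small sketch, assuming the usual convention that the camera looks along -z in its own frame:

    # Viewing angles from an estimated camera-to-world matrix (sketch).
    import numpy as np

    def viewing_angles(c2w):
        d = -c2w[:3, 2]                            # forward axis in world coordinates
        d = d / np.linalg.norm(d)
        theta = np.arctan2(d[1], d[0])             # azimuth
        phi = np.arcsin(np.clip(d[2], -1.0, 1.0))  # elevation
        return theta, phi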
@@papersin100linesofcode thank you! do MIP-NeRF and Zip-NeRF also use COLMAP?
@@jeffreyalidochair MIP-NeRF and Zip-NeRF can be seen as algorithms that take as input pictures together with their camera parameters, which can be estimated in several ways. But yes, in the real data from those papers the camera parameters are specifically estimated with COLMAP.
Great video! Could you tell me how much time it took for the model to train approximately?
Thank you! About 24 hours
You skipped the coarse/fine logic from the paper. Were you able to get decent results without it?
Hi, thank you for your question. The results I show at the beginning of the video are without it. To me, these are decent results, although they would be better with the hierarchical volume sampling strategy. I think I will make a video about it in the near future :)
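Until that video exists, here is a hedged sketch of the missing piece: the coarse pass's weights along each ray are turned into a probability distribution, and extra "fine" samples are drawn where the weights are large (inverse-transform sampling, as in the paper). This is not the video's code, just the idea.

    # Sketch of hierarchical (coarse-to-fine) sampling via inverse-transform
    # sampling of the coarse weights.
    import torch

    def sample_fine(bins, weights, n_fine):
        # bins: [n_rays, n_coarse + 1] bin edges, weights: [n_rays, n_coarse]
        pdf = weights + 1e-5
        pdf = pdf / pdf.sum(dim=-1, keepdim=True)
        cdf = torch.cumsum(pdf, dim=-1)
        cdf = torch.cat([torch.zeros_like(cdf[:, :1]), cdf], dim=-1)   # [n_rays, n_coarse + 1]

        u = torch.rand(cdf.shape[0], n_fine, device=cdf.device)        # uniform samples in [0, 1)
        idx = torch.searchsorted(cdf, u, right=True).clamp(1, cdf.shape[-1] - 1)

        cdf_lo, cdf_hi = torch.gather(cdf, 1, idx - 1), torch.gather(cdf, 1, idx)
        bin_lo, bin_hi = torch.gather(bins, 1, idx - 1), torch.gather(bins, 1, idx)
        frac = (u - cdf_lo) / (cdf_hi - cdf_lo + 1e-8)
        return bin_lo + frac * (bin_hi - bin_lo)                        # [n_rays, n_fine] new depths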
Great video thank you very much!
I am glad you like it. Thank you for your comment!
Does it generate the sample in 16 epochs?
Thank you for your question. The model is trained for 16 epochs, and then it can be used for rendering.
@@papersin100linesofcode I have tried it; is it normal that it generates white images at the beginning? Also, why do you set the last delta to almost infinity? Besides, I think that using this makes the weight sum always 1, so the last regularization makes no sense... Correct me if I am wrong!
@LearningEnglish Do the images remain white with more training? The deltas are the distances to the following sample, so for the last sample the distance to the next one is, in theory, infinity. We take the exponential of the negative of delta, which does not lead to exploding values.
I hope this is clear. If not, do not hesitate to ask me questions.
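To make the delta discussion concrete, here is a small sketch of the weight computation. The last delta is a huge stand-in for infinity; since it only appears inside exp(-sigma * delta), nothing explodes, the corresponding transmittance term simply goes to zero.

    # Sketch of the volume-rendering weights; the 1e10 plays the role of the
    # "almost infinite" last delta mentioned above.
    import torch

    def render_weights(sigma, t_vals):
        # sigma: [n_rays, n_samples] densities, t_vals: [n_rays, n_samples] sample depths
        deltas = t_vals[:, 1:] - t_vals[:, :-1]
        deltas = torch.cat([deltas, 1e10 * torch.ones_like(deltas[:, :1])], dim=-1)
        alpha = 1.0 - torch.exp(-sigma * deltas)                       # per-sample opacity
        trans = torch.cumprod(torch.cat([torch.ones_like(alpha[:, :1]),
                                         1.0 - alpha + 1e-10], dim=-1), dim=-1)[:, :-1]
        return alpha * trans                                           # weights, summing to at most 1 per ray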
Can you explain a PyTorch implementation of Mip-NeRF or Zip-NeRF? The GitHub repos are very hard to understand.
Thank you for the suggestion! I will try to add them
@@papersin100linesofcode thanks!
Great video! Can I get the dataset?
Thank you for your comment! You should have access to the data now; excuse me for the delay. I have removed the authorization requirement, so anyone can access it directly from now on.
Can you share the link for the dataset?
Done