For the part starting at 1:01:42: first of all, NeRF++ only deals with the parametrization, NOT the other two problems AT ALL. But yes, in general these are possible improvements to NeRF. For example, one can use (multi-view) depth prediction to constrain the geometry so that NeRF learns the correct geometry. However, I don't understand the point of near-field culling. If I understand correctly, this just "ignores" empty space based on sparse geometry from other algorithms such as COLMAP, but since you only ignore it during training, you don't know what the network will learn in this region. It might learn nothing, or even pure noise, and when you change the viewing angle the artifacts appear. A more reasonable approach, for me, would be to explicitly enforce sigma = 0 in this region. Btw, NeRF DOES deal with this: it has near and far clipping planes that clip the ray starting and ending points.
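To make concrete what I mean, here's a minimal sketch (function names are my own, assuming a PyTorch-style NeRF implementation): the first function only ever samples between the near and far planes, which is what NeRF's clipping does, and the second is the alternative I'm suggesting, a penalty that drives sigma toward zero at points that sparse geometry marks as empty, instead of just never supervising the network there:

```python
import torch

def sample_along_rays(rays_o, rays_d, near, far, n_samples):
    # Evenly spaced samples restricted to [near, far] (NeRF additionally
    # jitters these): points before `near` or beyond `far` are never queried.
    t = torch.linspace(0.0, 1.0, n_samples, device=rays_o.device)
    z_vals = near + (far - near) * t                                  # (S,)
    pts = rays_o[:, None, :] + rays_d[:, None, :] * z_vals[None, :, None]
    return pts, z_vals                                                # (N, S, 3)

def empty_space_loss(sigma, empty_mask, weight=0.01):
    # Explicitly push predicted density toward zero at sample points flagged
    # as empty by, e.g., COLMAP's sparse reconstruction (empty_mask is 0/1).
    return weight * (sigma * empty_mask).abs().mean()
```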
If you look at the results on the NeRF project site, they seem much more photorealistic. At first I thought this was because the views don't stray far away, but look at the 360 pinecone example. It looks much better than the results for the truck. Is this purely due to cherry-picking? If so, call me old-fashioned, but that is borderline deceptive from the NeRF authors. From the NeRF results alone, one would think the method is pure magic. Also, the view-dependent part that is only fed in at the back of the network shocked me. It seems VERY important indeed, yet the paper casually brushes it off! Anyway, thank you for this very clear presentation.
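For anyone curious about that view-dependence detail: in the paper's MLP, the positionally encoded viewing direction is concatenated only after the density has already been predicted, so it can modulate color but not geometry. A rough sketch of that wiring (layer sizes from the paper; module names are mine, and I've omitted the skip connection):

```python
import torch
import torch.nn as nn

class TinyNeRF(nn.Module):
    def __init__(self, pos_dim=63, dir_dim=27, width=256):
        super().__init__()
        # 8-layer trunk sees only the encoded 3D position.
        self.trunk = nn.Sequential(
            nn.Linear(pos_dim, width), nn.ReLU(),
            *[m for _ in range(7) for m in (nn.Linear(width, width), nn.ReLU())],
        )
        self.sigma_head = nn.Linear(width, 1)   # density: position only
        self.feature = nn.Linear(width, width)
        # The viewing direction enters only here, at the back of the network.
        self.rgb_head = nn.Sequential(
            nn.Linear(width + dir_dim, width // 2), nn.ReLU(),
            nn.Linear(width // 2, 3), nn.Sigmoid(),
        )

    def forward(self, x_enc, d_enc):
        h = self.trunk(x_enc)
        sigma = torch.relu(self.sigma_head(h))
        rgb = self.rgb_head(torch.cat([self.feature(h), d_enc], dim=-1))
        return rgb, sigma
```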
I agree with your analysis. There is some pretty bad cherry-picking in the NeRF results. This has become a trend in computer vision: authors do it for publicity, needlessly creating the impression that they have solved the problem. A really bad practice for the progress of science.
At 25:30 you show the NeRF outputs on the truck scene, and then again at 1:05:00, but the results are completely different, and comparable to Free View Synthesis. It seems like the first example wasn't properly optimized, to make Free View Synthesis look better than it is...
He addresses this a few minutes later, at 1:06:43, where he says they gave the original NeRF more training samples and training time, because NeRF++ uses two NeRFs and essentially gets twice the sample budget thanks to the homogeneous parameterization. I'll probably have to read their paper to understand that better, but there's your answer :D It seems like he was trying to give the original NeRF the best chance of winning.
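If it helps, my understanding of why there are two networks (a sketch of the idea, not their actual code): points inside the unit sphere are handled by one NeRF as plain (x, y, z), while points outside are remapped to a bounded 4D representation for a second NeRF, so each network draws its own samples along the ray:

```python
import torch

def outer_parametrization(pts):
    # NeRF++-style inverted sphere parametrization for points with ||p|| > 1:
    # a point p is represented as (p/r, 1/r), which stays bounded even as the
    # scene extends to infinity; 1/r acts like an inverse depth.
    r = pts.norm(dim=-1, keepdim=True)
    return torch.cat([pts / r, 1.0 / r], dim=-1)   # (..., 4)
```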
Impressive results! The analysis of NeRF is also really insightful, thank you!
Thanks a lot for sharing! Kudos for pointing out the limitations, and for the approachable explanations!
Your NeRF failure mode analysis is excellent.
Great to listen to you, my friend, well done!
Great work guys, keep it up
What software do you use?
Is the software developed by you?
Plot twist: he doesn't exist, it's an AI bot.