NERFs (No, not that kind) - Computerphile

  • Published 3 Feb 2025

COMMENTS • 137

  • @Computerphile
    @Computerphile  1 year ago +106

    The previous version of this video had some scenes which were insensitive in the light of the tragic events in Prague. We apologize if the video’s content was insensitive given the context of the tragedy. It was never our intention to cause any harm or offense and we were not aware of the tragedy at the time of the video's edit & subsequent release. Those parts of the video have since been removed -Sean

    • @azrobbins01
      @azrobbins01 1 year ago +13

      Do you think you will make the original version public after some time has passed? I left a comment on the other version and it shows as still being there, but the video is private.

    • @jme_a
      @jme_a 1 year ago +54

      The fact you felt the need to do this highlights a big issue in today's society.

    • @bectionary
      @bectionary 1 year ago +2

      Out of interest, do you remember how long the original video was?

    • @arinc9
      @arinc9 1 year ago +15

      @@jme_a How much sensitivity is too much sensitivity? Guess we'll never know.

    • @jub4346
      @jub4346 1 year ago +20

      Who finds nerf fights offensive? All you did was play on words.

  • @Guus
    @Guus 1 year ago +62

    Dang I kinda expected you guys to explain more in depth how it actually worked. I feel like Lewis was about to get into the good stuff but was then a bit cut off to just give a basic demonstration instead. Would love to see a longer video starring Lewis where he can take all the time he likes to explain it further, here or on his own channel :)

  • @derrickobara6806
    @derrickobara6806 1 year ago +162

    Is it possible, considering Lewis's specialty, we could call him a nerf herder?

    • @klaxoncow
      @klaxoncow 1 year ago +14

      Though he needs to look a bit more disheveled, I feel, to truly be a scruffy nerd herder.

  • @stephenmurray7495
    @stephenmurray7495 1 year ago +15

    I do love Dr P's enthusiasm. He seems more like the mischievous student himself

  • @omnipedia-tech
    @omnipedia-tech 1 year ago +9

    One of the coolest things about NERFs is how they handle reflections within the image, where you can control the viewpoint to actually enter inside the scene within the reflection like it is its own little mini-universe.

  • @AloisMahdal
    @AloisMahdal 1 year ago +8

    The "bad" angles are also kinda awesome, though.
    I could see this being used artistically.

    • @kevinkor2009
      @kevinkor2009 1 year ago +2

      It could represent flying through a dream or a multiverse and finally snapping into focus when you reach a destination.

    • @OrangeC7
      @OrangeC7 3 months ago +1

      @@kevinkor2009 A memory would make a lot of sense, especially since conceptually speaking it's the same idea: you wouldn't be able to see things you never saw in the first place

  • @gabrigamer00skyrim
    @gabrigamer00skyrim 1 year ago +1

    When I saw the thumbnail I was expecting the video to be about Neural Radiance Fields. When they said (no, not that kind) I was then expecting a video of Dr. Mike playing with dart guns.
    Happy to get the former but sad for not having the latter

  • @Locut0s
    @Locut0s 1 year ago +6

    Having played around with ray tracers in the 1990s, povray and the like, as well as some 3d modelling from that era… all of this is just insane to see.

  • @TheZaxanator
    @TheZaxanator 1 year ago +45

    Great video, NeRF seems like a really interesting technology. I'd love a follow-up about Gaussian splatting

  • @Tospaa
    @Tospaa 1 year ago +4

    I see Dr Mike Pound, I click like. That simple.
    Really good content, thank you all!

  • @aelolul
    @aelolul 1 year ago +1

    Good timing. I was just playing with a demo of SMERF which builds on the technique. I'd love a deeper dive on these techniques!

  • @realeques
    @realeques 1 year ago

    As a software engineer I'm so glad that I can just harvest such knowledge

  • @grantpottage
    @grantpottage 1 year ago

    Really enjoyed this video. The discussion was quite interesting, and I appreciated the insights shared. The duo of both Mike and Lewis brought an engaging and insightful presence to the conversation that added to the overall enjoyment.

  • @amadzarak7746
    @amadzarak7746 1 year ago

    I’ve been waiting for this one! This is great

  • @hieattcatalyst4540
    @hieattcatalyst4540 1 year ago +2

    This video on Neural Radiance Fields is mind-blowing! Kudos to Lewis for demystifying the complexities with such clarity. Seriously, I'm hooked! Wondering, though, how these fields might revolutionize CGI or virtual environments? Can't wait to dive deeper into this fascinating realm!

  • @SC-fk6bb
    @SC-fk6bb 1 year ago +1

    Best part of the video: Dr Mike watching the student with a suspicious look 😂😂

  • @TomSnyder--theJaz
    @TomSnyder--theJaz 1 year ago +2

    Well done, Lewis
    Cheers
    (Watch out Mike, Lewis is a very good presenter ;)

  • @CallousCoder
    @CallousCoder 1 year ago +2

    This is so much better than the rubber dart variant 😊

  • @isaacg1
    @isaacg1 1 year ago +2

    Wow, perfect timing! Was looking at this earlier. Would love to see that follow-up on Gaussian splatting

  • @jacejunk
    @jacejunk 1 year ago +3

    Thanks for covering this subject. Could you cover Gaussian Splatting in the future? I think the rendering description would be easier to understand for novices.

  • @KSanofficial
    @KSanofficial 5 months ago

    Is the location/pose of the camera known when the model is trained, or does it estimate each picture's position in space? I know there is some literature about uncertainty in camera poses with NeRF, but I guess it makes rendering much more complicated? Amazing video, I'll start my internship in surface reconstruction soon and this is so nice to watch!

  • @diophantine1598
    @diophantine1598 1 year ago +3

    The current state of the art NeRFs are actually much better than this. There’s also Gaussian Splatting which is faster to generate, faster to render, and even higher quality. This field of research is very new and exciting.

  • @Yupppi
    @Yupppi 1 year ago +1

    Well this is an interesting change of pace, having a PhD student explain something new and cool to the supervisor (and the internet).
    Is this video fast-forwarded? Something about their movement and pace doesn't feel natural. Somehow 0.75x feels more natural.

  • @RRobert99
    @RRobert99 1 year ago +12

    I wonder if google might use tech like this at some point to increase their 3D coverage on maps. If you put together images from street view and satellite images I imagine you could get a decent enough result to show most places in 3D like they already have for bigger cities.

    • @cyvan9759
      @cyvan9759 1 year ago

      Exactly what I was thinking. It will likely be challenging considering how many pictures are needed to get a good result

    • @kushagrano1
      @kushagrano1 1 year ago +2

      They already are

    • @alvesvaren
      @alvesvaren 1 year ago +2

      At least Apple Maps already does this. You can move between "frames" in street view and you can see it try to reconstruct it. It looks really cool

    • @tsunghan_yu
      @tsunghan_yu 9 months ago

      They use it for Immersive View.

  • @Veptis
    @Veptis 1 month ago

    I am really excited about the prospect of dimensional generalization: you not only have the location and rotation of the camera (rD and rO), but you can add more dimensions, be it time or even additional concepts like the location and rotation of a light or some object, etc.
    I wonder what has been done and what the training objective is. It's the sort of interaction between computer graphics and ML which I would love to study.
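
To make the comment above concrete, here is a minimal PyTorch sketch of a radiance field with one extra input dimension (time) alongside position and view direction. The class name, layer sizes, and encoding are illustrative assumptions, not a specific published model:

```python
import torch
import torch.nn as nn

def positional_encoding(x: torch.Tensor, n_freqs: int = 6) -> torch.Tensor:
    """Lift each raw coordinate to [x, sin(2^k x), cos(2^k x)] features."""
    feats = [x]
    for k in range(n_freqs):
        feats.append(torch.sin((2.0 ** k) * x))
        feats.append(torch.cos((2.0 ** k) * x))
    return torch.cat(feats, dim=-1)

class DynamicRadianceField(nn.Module):
    """NeRF-style MLP whose input also includes a time coordinate."""
    def __init__(self, n_freqs: int = 6, hidden: int = 256):
        super().__init__()
        # 3 position dims + 3 view-direction dims + 1 time dim, each encoded.
        in_dim = 7 * (1 + 2 * n_freqs)
        self.n_freqs = n_freqs
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),              # -> (r, g, b, sigma)
        )

    def forward(self, xyz, view_dir, t):
        inp = torch.cat([xyz, view_dir, t], dim=-1)
        out = self.mlp(positional_encoding(inp, self.n_freqs))
        rgb = torch.sigmoid(out[..., :3])      # colour in [0, 1]
        sigma = torch.relu(out[..., 3:])       # non-negative density
        return rgb, sigma
```

Training would then proceed exactly as for a static NeRF, except every training photo also carries its timestamp, so the objective is still "reproduce the observed pixels", just over one more input axis.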

  • @jaffarbh
    @jaffarbh 1 year ago +1

    One handy trick is to increase the shutter speed (and then de-noise) to minimise blur.

  • @rseichter
    @rseichter 1 year ago +1

    Well, this is slightly more advanced than what we were able to do in the 1990s using the "Stuttgart Neural Network Simulator" (SNNS). 🤓

  • @maximecourchesne5986
    @maximecourchesne5986 1 year ago

    Very cool! Not sure what you meant when you said that your camera can collect data points through the tree though

  • @manfreddellkrantz9954
    @manfreddellkrantz9954 1 year ago +6

    How did you get the camera positions considering he just went around with his phone?

    • @U014B
      @U014B 1 year ago

      I have the same question. Maybe it's from accelerometer readings? Don't know how that would work with an actual camera, though.

    • @sotasearcher
      @sotasearcher 1 year ago +1

      Usually the program COLMAP is used to get the camera positions

    • @GoldSrc_
      @GoldSrc_ 1 month ago +1

      It's just frames from a video, nothing complex.
      Basically just structure from motion like with photogrammetry.
      With enough photos, you can reconstruct any object.
      Then, once you have your object, you can move freely.
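
For readers wondering what that structure-from-motion step looks like in practice, here is a minimal sketch using COLMAP's pycolmap bindings. The paths are placeholders and the exact function names can differ between pycolmap versions; treat it as an outline of the usual pose-recovery step, not the pipeline used in the video:

```python
import pathlib
import pycolmap

images = pathlib.Path("frames")   # stills extracted from the walkaround video
out = pathlib.Path("colmap_out")
out.mkdir(exist_ok=True)
db = out / "database.db"

pycolmap.extract_features(db, images)   # detect local features in each frame
pycolmap.match_exhaustive(db)           # match features between image pairs
maps = pycolmap.incremental_mapping(db, images, out)  # solve poses + points

recon = maps[0]                         # first (usually only) reconstruction
for image in recon.images.values():
    # Every registered image now has a solved pose; these become the "known"
    # camera positions that the NeRF is trained against.
    print(image.name, image.projection_center())
```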

  • @GamingShiiep
    @GamingShiiep 10 months ago

    I appreciate the video, but, maybe it's just me, I still struggle to understand the "point analogy" shown at around 5:42 (and earlier). "Checking at which distance the observation hits the object" (or similar) implies that you'd be using images with depth information already present, making use of something like focal distance and "out of focus" testing. But it obviously isn't, so how does it work?
    I'm currently starting to read a bit about it for my masters, so I know that there's A LOT of math behind it. However, the concepts are hardly ever explained, even visually, in a way that makes actual sense. Maybe in a few weeks I'll come back and understand what you're actually trying to explain.
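
Since the sampling idea trips a lot of people up, here is a minimal NumPy sketch of it, with `field` standing in for the trained network. Note there is no depth input anywhere: the network predicts a density at each sampled point, and plausible depths emerge during training only because densities that disagree with the training photos increase the loss. The function names and the toy scene are purely illustrative:

```python
import numpy as np

def render_ray(origin, direction, field, near=0.1, far=6.0, n_samples=64):
    """Alpha-composite colour along one camera ray, NeRF-style."""
    ts = np.linspace(near, far, n_samples)        # depths to sample at
    points = origin + ts[:, None] * direction     # 3D points along the ray
    rgb, sigma = field(points)                    # predicted colour, density

    delta = np.diff(ts, append=1e10)              # spacing between samples
    alpha = 1.0 - np.exp(-sigma * delta)          # opacity of each segment
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = trans * alpha                       # visibility of each sample
    return (weights[:, None] * rgb).sum(axis=0)   # final pixel colour

# Toy stand-in for the network: a red unit sphere of uniform density.
toy = lambda p: (np.tile([1.0, 0.0, 0.0], (len(p), 1)),
                 (np.linalg.norm(p, axis=1) < 1.0) * 5.0)
print(render_ray(np.array([0.0, 0.0, -3.0]), np.array([0.0, 0.0, 1.0]), toy))
```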

  • @articgadgets
    @articgadgets 1 year ago

    I am looking forward to a video on Gaussian Splatting!

  • @aryaamootaghi3248
    @aryaamootaghi3248 1 year ago

    Based on what was explained, it looked to me more like tomographic ray tracing in the end.
    Also, I think it might work with only two camera positions, with the rest estimated as part of the unknowns in the system of equations 🤔?!

  • @Ohmriginal722
    @Ohmriginal722 1 year ago +1

    That looks like it's using a much older version of the NeRF algorithm; there are a lot of more recent NeRF papers with more impressive results which run and train much faster, like Instant NGP

  • @vermeul1
    @vermeul1 1 year ago

    Great new presenter!

  • @surferriness
    @surferriness 1 year ago

    What a trippy 3D scene, imagine the video games you could make.
    Tris count is probably not so nice

  • @arech1778
    @arech1778 8 months ago

    I wonder how it would work with 360 cameras, as you can compensate for the distortion and have a lot more anchors

  • @stevenmathews7621
    @stevenmathews7621 1 year ago +1

    Love Dr Pound
    very sweet man
    i'm laughing hysterically at the notion that
    off camera he's like a dictatorial a*hole
    camera back on, back into sweet guy mode
    lol, childish, love it 🤣

  • @YandiBanyu
    @YandiBanyu 1 year ago +5

    What is the advantage of this vs traditional photogrammetry?

    • @andybrice2711
      @andybrice2711 1 year ago +1

      It's vastly more photoreal by being somewhat "impressionistic" about the details. Converting a scene into a textured mesh is often a complex process for a sub-optimal format. Like trying to recreate a realistic painting in Lego bricks.

    • @YandiBanyu
      @YandiBanyu 1 year ago

      ​@@andybrice2711 I see, thanks for the explanation

  • @AgentM124
    @AgentM124 1 year ago

    Just imagine letting a human look around a scene for only a few minutes and have them 'visualize' it in their head. That would probably "look" somewhat like that shown here.

  • @ezracramer1370
    @ezracramer1370 1 year ago

    Impressive, thank you very much!

  • @Primalmoon
    @Primalmoon 1 year ago +1

    What kind of camera or system did Lewis use in order to get such great position data for each camera image? When I've previously played around with any kind of computer vision from random cameras, it feels like I need to get everything precisely measured, and any inaccuracies would mess everything up. And cameras with GPS would still be very vulnerable to slight noise. But the NeRF viewer seemed to have all of the images located smoothly and continuously with no massive outliers.

  • @kipandcop1
    @kipandcop1 1 year ago +2

    Something I've always wondered with NeRFs and similar systems is how you know the point clouds you are training from, produced using things like COLMAP, are correct and tuned properly. When running NeRFs and Gaussian splatting myself I was very surprised that the "done thing" is to just put your images through COLMAP to produce the point clouds, and the NeRF and Gaussian splat part is more of a rendering technique for said point clouds. Is COLMAP the be-all and end-all of extracting 3D points from RGB images, so it's not something that people care about improving? If it was possible to get point clouds out of 3D rendering software in the right format (which I assume shouldn't be that difficult?) like Blender for a synthetic scene, could you basically get a perfect NeRF of that scene?

    • @JustThomas1
      @JustThomas1 1 year ago +2

      For most implementations of NeRF, the use of COLMAP is for the camera solve, to the best of my knowledge, and there are some versions that utilize trained camera positions. Most versions I've dug into don't start with a COLMAP point cloud.
      That being said, for Gaussian splatting you are largely correct, although GS utilizes an extensive amount of optimized densification, and as a result, depending on the amount of densification occurring, you may actually end up with very little of the original point cloud being retained.
      Regarding "Is COLMAP the be-all and end-all of extracting 3D points from RGB images": there are various methods that produce better results than COLMAP in niche scenarios, but few are as versatile as COLMAP.
      Regarding your last comment, I believe the answer is "mostly yes", in that with a proper dataset of RGBD images you can probably generate much, much better NeRF results. Granted, the whole point of NeRF is to do a "close 'nuff" job, and if you have perfect data I don't understand why you wouldn't go with a photogrammetry approach, unless your scene depended on the mirror obscura done by NeRF or the spherical harmonics done by GS.

    • @Jack-gl2xw
      @Jack-gl2xw 1 year ago

      Typically COLMAP is just used to determine the position of the camera in 3d space for each photo. Once each photo is labeled with its position, you can train a NeRF

    • @kipandcop1
      @kipandcop1 1 year ago

      @@JustThomas1 thanks for the thoughtful response! As someone who's basically blindly followed guides for Instant NGP and Gaussian splatting, the use of COLMAP has often left me wanting more explanation, after the guides will (very fairly) just walk you through the steps to use it with no explanation. An interesting thing with the synthetic 3D scene is that for the full path tracing of Blender Cycles, an individual frame of a scene can take many minutes to render, but of course gives you "perfect" path tracing. Although it's a very niche use case, I can imagine times where having that scene freely explorable in a NeRF or GS would be beneficial. Gaussian splats especially, which can be rendered on CPU in WebGL at a decent frame rate, meaning you could render 200 frames or whatever once, put them through a GS model and have lots of users explore it in decent enough detail on CPU, to their hearts' content. Again, quite a niche use case, but I think an interesting one nonetheless

    • @kipandcop1
      @kipandcop1 1 year ago

      @@Jack-gl2xw thanks for the response. Yeah, I have been mainly left wondering about it after blindly following tutorials for Instant NGP and Gaussian splatting, where it is what they walk you through using with no mention of parameters. That's fair enough of course, as they are guides for beginners like myself, but it has left me wanting more explanation of why it's used so readily. Additionally, to a novice like myself, getting the correct 3D points out of photos seems like a difficult thing to optimise across every scene, and when I'm left with a lackluster Gaussian splat or NeRF, I'm often left wondering if it was the model training itself or the COLMAP step that "went wrong"/didn't optimise correctly

    • @Jack-gl2xw
      @Jack-gl2xw 1 year ago +2

      @@kipandcop1 You can manually verify the COLMAP outputs in nerfstudio like in the video (as in, visually check and make sure they look like they are lined up correctly). If COLMAP messed up badly, it should be obvious in your outputted NeRF, because the incorrectly positioned image or images will be floating and won't mesh with the scene. As for general quality, I'm not sure without seeing your data/results. Try better lighting and more photos. I've had some great results with NGP and Gaussian splats with just my phone and running COLMAP on the images.
      Edit: another thing, make sure your images overlap with each other. This is how COLMAP works. If none of your images cover the same area, COLMAP won't be able to figure out the images' positions relative to each other. I recommend recording a video then running COLMAP in sequential mode

  • @PotatoNemo
    @PotatoNemo 5 months ago

    So if we have a depth sensor, then the NeRF would work better, right?

  • @aame6643
    @aame6643 1 year ago

    I'd love a video on Gaussian Splatting, it's supposed to be better than NeRFs?

  • @hdaalpo
    @hdaalpo 1 year ago

    I thought I recognized this technique! Corridor Crew did a video on this from a VFX perspective. They used what was available a year ago, so it's likely a tad dated. Any plans to try to add real world users as part of the refining process?

  • @robchr
    @robchr 1 year ago +2

    How is this different from photogrammetry?

  • @ZT1ST
    @ZT1ST 1 year ago

    @4:57; The way it's described here, it sounds like "Tracert for Ray Tracing".

  • @yppahpeek
    @yppahpeek 1 year ago +2

    Is this how Google constructs 3D images for Google Earth? I've been wondering that for ages

    • @zer0k4ge
      @zer0k4ge 1 year ago

      You mean terravision?

    • @quillaja
      @quillaja 10 months ago +1

      I'd imagine most of Google Earth's 3D is from lidar datasets.

    • @tsunghan_yu
      @tsunghan_yu 9 months ago +1

      NeRF came out in 2020. So Google Earth used something else. Google does use NeRF for Immersive View in Maps.

  • @UnderstandingCode
    @UnderstandingCode 1 year ago

    Yes! Love it

  • @dave8is8beast
    @dave8is8beast 1 year ago

    I'm curious if 360-degree images would help in providing a better set of images to train off of

    • @andybrice2711
      @andybrice2711 1 year ago

      I'd guess probably not, because more pixels of your available resolution will be used up with repetitive images of the environment around your scene, rather than high-resolution data of the objects within it.

  • @georgedyson9754
    @georgedyson9754 1 year ago

    Seems a bit like an X-ray CT scanner or an MRI tied to a neural network

  • @duytdl
    @duytdl 1 year ago

    How does the neural network (alone) figure out the distances etc?

  • @bengoodwin2141
    @bengoodwin2141 1 year ago

    Would it ever make sense to use this process to generate data, throw away anything beyond some distance of a target object, then use some other system to generate a 3D model?

  • @Roxor128
    @Roxor128 10 months ago

    I wonder how well this would work if you fed it a series of 360-degree views from Google Street View going along a road?

    • @pcooper-chi
      @pcooper-chi 10 months ago

      Google has hinted that they're working on something like this. Their latest NeRF model (SMERF) can scale to arbitrarily large scenes. Would be pretty cool to navigate Street View in a high-res 3D model...

    • @Roxor128
      @Roxor128 10 months ago

      @@pcooper-chi If it works, it'd be really useful for game developers that want to set their games in a real place. Grab the street view imagery for the relevant area and generate a first-draft model to build upon.

  • @Iswimandrun
    @Iswimandrun 1 year ago

    So you presumably train it on a GPU. Can you deploy the NeRF on a TPU such as a Coral, or on an Intel Compute Stick?

  • @iamavataraang
    @iamavataraang 1 year ago

    Is this what the PolyCam app uses?

  • @mlguy8376
    @mlguy8376 1 year ago +1

    Are Lewis and Mike related? They talk the same, with the same mannerisms. I don't think I started to talk like my own supervisor 😂

  • @esbenablack
    @esbenablack 1 year ago

    Could it complement point clouds for things like building scanning, for use in Building Information Modeling (BIM)?

  • @olivermorris4209
    @olivermorris4209 1 year ago +1

    No computer desk is complete without a sandwich toaster

    • @michaelwilson5742
      @michaelwilson5742 1 year ago

      Yup, he can look forward to a conversation about that in the new year 😀

  • @tsunghan_yu
    @tsunghan_yu 9 months ago

    This is a bit too high level for the channel. But I appreciate the demo!

  • @becomingdave
    @becomingdave 1 year ago

    I'm from South Africa and I'm making a living from what I have learnt here

  • @gobdovan
    @gobdovan 4 months ago

    8:45 floaters

  • @evarlast
    @evarlast 1 year ago

    He starts drawing on tractor-feed green bar paper? Does that stuff even exist anymore?

  • @biomatrix8154
    @biomatrix8154 1 year ago

    To reconstruct bloody crime scenes, I guess they'd use Gaussian splattering.

  • @YuTv1408
    @YuTv1408 1 year ago

    Diffusion in Materials science and physics is very similar to Cs diffusion

  • @Bluedragon2513
    @Bluedragon2513 1 year ago

    This could be one of the quicker ways to create 3D models

  • @SirKenchalot
    @SirKenchalot 1 year ago +1

    9:41 Great attempt at controlling your mouth there bro; remember, it's a family show.

    • @tjsm4455
      @tjsm4455 9 months ago

      haha nice catch

  • @djstr0b3
    @djstr0b3 1 year ago +5

    Dr Mike, you should start producing an online course for the AI subjects that you have discussed.

  • @MagruderSpoots
    @MagruderSpoots 1 year ago

    What hardware is this running on?

  • @MikePaixao
    @MikePaixao 1 year ago

    Now if you reverse engineer the process and generate a field of color data based on camera angle and some higher-dimensional maths, you have forward predictive rendering 😀

  • @julianmeredith9168
    @julianmeredith9168 1 year ago

    Just gonna throw this out there, I have a few reasons why, but Lewis seems very… AI?!

  • @user-gb3rd6wk7z
    @user-gb3rd6wk7z 1 year ago

    The guy on the right looks like Hugh Grant.

  • @YuTv1408
    @YuTv1408 1 year ago

    Nerfs sounds like Nerds...

  • @Yitzh6k
    @Yitzh6k 1 year ago

    Are people who are very capable with this technique called "NERF Guns"?

  • @rudiklein
    @rudiklein 1 year ago

    The new SME was great, not nerfous at all.

  • @kbrizy1
    @kbrizy1 1 year ago

    Doesn't look bad at all. Looks like good maps in 3D

  • @adricortesia
    @adricortesia 1 year ago +1

    It's basically what Megascans does for gaming. They scan, for example, a rock in high detail, and you can import that rock into your game engine.

  • @davidberger5745
    @davidberger5745 1 year ago

    Please publish videos on fast-moving fields quicker; it's completely outdated now.

  • @zwanz0r
    @zwanz0r 1 year ago +1

    You guys look quite nerfous. Very nerfwracking. Too bad the tree got nerfed.

  • @Lion_McLionhead
    @Lion_McLionhead 1 year ago

    Still looks like a turd but cheaper than lidar scanning.

  • @infectedrainbow
    @infectedrainbow 1 year ago

    impressive ways of generating new views of...what? stupid auto CC :(

  • @mertakyaz5359
    @mertakyaz5359 1 year ago

    Hello Computerphile, I love your content. Can someone please explain what a DAG is and how it can disrupt blockchain tech in a video? It's from graph theory

  • @maxrs07
    @maxrs07 1 year ago +3

    Why is this called ML? At this rate, anything a computer does should be called ML

    • @ParadiZE3D
      @ParadiZE3D 1 year ago +5

      Because of the underlying optimization algorithms

    • @Jack-gl2xw
      @Jack-gl2xw 1 year ago +1

      A neural network is learning to represent the scene. Machine learning is broadly described as learning patterns and predictions from data (i.e. training). Here, the data is the RGB photos, and it is training a NeRF to represent the scene

    • @maxrs07
      @maxrs07 1 year ago

      @@Jack-gl2xw It's not learning anything; it's doing what the algorithm told it to do and storing it in memory, just like anything else you do on the computer. Things should only be called ML if they then do the above backwards, but AFAIK NeRF has no backward algorithm.

    • @sotasearcher
      @sotasearcher 1 year ago

      @@maxrs07 Yes, the original NeRF uses an MLP (multilayer perceptron), which is just a fully connected neural network, and uses a variant of backpropagation. BTW, an algorithm doesn't need a backward pass to be ML; it just needs to learn a function. Just look at decision trees

    • @Jack-gl2xw
      @Jack-gl2xw 1 year ago +1

      @@maxrs07 NeRFs still use the backpropagation algorithm. The rendering method of NeRFs is differentiable, thus trainable. I get what you're saying about how it seems like ML is in everything these days, but this is literally a neural network.
      For extra info, the NeRF model takes in four inputs (x, y, z, direction) and outputs (r, g, b, alpha, and radiance... I think). From these direct inputs/outputs of the model you can see how it is ML and how the model is being trained. Check out the full NeRF paper if you are interested
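
To back that up with code: the compositing along each ray is differentiable, so gradients flow from a plain MSE loss on pixels all the way back into the MLP weights. Below is a compact PyTorch sketch of one training step, with illustrative shapes and layer sizes (not the original paper's architecture):

```python
import torch
import torch.nn as nn

mlp = nn.Sequential(nn.Linear(6, 128), nn.ReLU(),
                    nn.Linear(128, 4))       # (xyz, dir) -> (r, g, b, sigma)
opt = torch.optim.Adam(mlp.parameters(), lr=5e-4)

def train_step(points, dirs, deltas, target_rgb):
    """points/dirs: (rays, samples, 3); deltas: (rays, samples);
    target_rgb: (rays, 3) ground-truth pixels taken from the photos."""
    out = mlp(torch.cat([points, dirs], dim=-1))
    rgb = torch.sigmoid(out[..., :3])        # predicted colour per sample
    sigma = torch.relu(out[..., 3])          # predicted density per sample

    alpha = 1 - torch.exp(-sigma * deltas)   # differentiable compositing...
    trans = torch.cumprod(torch.cat(
        [torch.ones_like(alpha[:, :1]), 1 - alpha[:, :-1]], dim=1), dim=1)
    pred = ((trans * alpha)[..., None] * rgb).sum(dim=1)  # rendered pixels

    loss = ((pred - target_rgb) ** 2).mean() # ...so the photo-matching loss
    opt.zero_grad(); loss.backward(); opt.step()  # trains by backpropagation
    return loss.item()
```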

  • @ankurgajurel02
    @ankurgajurel02 1 year ago

    FIRST

    • @Jake28
      @Jake28 1 year ago

      WAIT WHAT?? I thought it said 11 months lmao

  • @infectedrainbow
    @infectedrainbow 1 year ago

    I can't find a definition for wasterize.

  • @infectedrainbow
    @infectedrainbow 1 year ago

    You should have had the younger guy speak all of the lines. He's understandable.