Layered Depth Light Field Volumetric Video

  • Published Nov 27, 2024

COMMENTS • 59

  • @rubi-w- 7 months ago +1

    DUDE THIS LOOKS AWESOME! Can’t wait for it to become a consumer technology, easy to use, to capture as much detail of your life as possible.

    • @JoshGladstone 7 months ago +1

      Thanks!! I hope you're right!

    • @rubi-w- 7 months ago +1

      @@JoshGladstone Maybe with AI it will be possible?

    • @JoshGladstone 7 months ago +2

      @@rubi-w- This technique already uses AI, but it's a fairly active area of research and there are advancements every year. The main issue isn't really the capture technology, it's demand. Volumetric content requires a different technology to view than a normal screen, so there's not much demand for it right now. Maybe as VR/MR headsets get smaller and more popular.

    • @rubi-w- 7 months ago +1

      @@JoshGladstone Thank you for the clarification ^^ Gonna force my friends to buy a VR/MR headset. Already own a Quest 3 but am planning on switching to the Quest Pro. (Maybe it’s better to wait or save up for something else?)

    • @JoshGladstone 7 months ago +1

      @@rubi-w- Unless you specifically want the face and eye tracking, I think Quest 3 is better than Quest Pro

  • @matchboxgiant 1 year ago +1

    Finally got Cake Player on my Quest 2. So cool to walk inside your volumetric videos.

    • @JoshGladstone 1 year ago

      Great, so glad it worked! Thanks for checking it out! :)

  • @davidpacheco5501 1 month ago +1

    You're probably all over this already, but it's so much easier to sync videos now: you can use GoPro's timecode sync feature with all the cameras, then use a Python script to automatically clip all the videos to start and end at the same time based on the timecode metadata.
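
A minimal sketch of the kind of script described above, assuming each camera writes a "timecode" tag that ffprobe can read and that all cameras share one frame rate; the filenames, FPS value, and tag layout are assumptions, and the stream-copy trim snaps to keyframes, so a re-encode would be needed for frame accuracy:

```python
import json
import subprocess

FPS = 30.0  # assumed shared frame rate; the timecode's frame field counts at this rate

def read_start_seconds(path):
    """Return a file's starting timecode (HH:MM:SS:FF tag) as seconds."""
    probe = subprocess.run(
        ["ffprobe", "-v", "quiet", "-print_format", "json",
         "-show_format", "-show_streams", path],
        capture_output=True, text=True, check=True,
    )
    info = json.loads(probe.stdout)
    tags = info["format"].get("tags", {})
    for stream in info["streams"]:          # GoPros often store it on a tmcd stream
        tags = {**stream.get("tags", {}), **tags}
    h, m, s, f = (int(x) for x in tags["timecode"].replace(";", ":").split(":"))
    return h * 3600 + m * 60 + s + f / FPS

def sync_clips(paths):
    """Trim every video so they all begin at the latest shared timecode."""
    starts = {p: read_start_seconds(p) for p in paths}
    latest = max(starts.values())
    for path, start in starts.items():
        # Stream copy snaps to keyframes; drop "-c copy" to re-encode frame-accurately.
        subprocess.run(
            ["ffmpeg", "-y", "-ss", f"{latest - start:.3f}", "-i", path,
             "-c", "copy", path.replace(".MP4", "_synced.MP4")],
            check=True,
        )

sync_clips(["cam1.MP4", "cam2.MP4", "cam3.MP4", "cam4.MP4", "cam5.MP4"])
```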

    • @JoshGladstone 1 month ago

      @@davidpacheco5501 I've seen that! Looks pretty neat. If I was going to replace all the cameras I'd probably give it a shot.

    • @davidpacheco5501 1 month ago

      @@JoshGladstone Yeah, having to buy five GoPros if you want to replace the setup is not ideal.

  • @weevilman 1 year ago +2

    Veeery cool stuff, well demonstrated, great project, good on you!

  • @AndyGaskin 1 year ago +1

    The distortions and artifacts are cool, artistically. Looks like a time ripple effect. Could be put to good use in a movie where some detective has to "scan and enhance" surveillance footage.

  • @Mac_Daffy 1 year ago +1

    Very inspiring work. Your progress looks substantial. Thanks for explaining it in such detail.

    • @JoshGladstone 1 year ago +1

      I appreciate that! Thanks for watching! :)

  • @erickgeisler 11 months ago +1

    This is really cool. I wonder if you could calibrate COLMAP to know your rig, so you essentially don't need structure from motion; that way, scaling to more than 5 cameras would cost less processing. Also, you could try in-painting on each layer to help minimize artifacts. I suspect having a larger distance between each camera could yield better results. Just a thought. Really cool stuff. Great work. I'm going to download Cake Player and start playing with it. You could sync all cameras with a timecode slate running on a phone or tablet.
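For context, COLMAP does document a known-poses workflow in its FAQ: extract and match features as usual, then skip the SfM mapper and triangulate against a hand-written sparse model holding the rig's calibrated intrinsics and poses. A rough sketch of that idea follows; all calibration values, paths, and image names are placeholders, and in practice the camera/image IDs must line up with what feature_extractor writes into the database.

```python
import subprocess
from pathlib import Path

# Made-up rig calibration, measured once per camera:
# (camera_id, fx, fy, cx, cy, qw, qx, qy, qz, tx, ty, tz, image_name)
RIG = [
    (1, 1600, 1600, 960, 540, 1, 0, 0, 0, -0.10, 0, 0, "cam1.png"),
    (2, 1600, 1600, 960, 540, 1, 0, 0, 0, -0.05, 0, 0, "cam2.png"),
    (3, 1600, 1600, 960, 540, 1, 0, 0, 0,  0.00, 0, 0, "cam3.png"),
    (4, 1600, 1600, 960, 540, 1, 0, 0, 0,  0.05, 0, 0, "cam4.png"),
    (5, 1600, 1600, 960, 540, 1, 0, 0, 0,  0.10, 0, 0, "cam5.png"),
]

def write_known_pose_model(model_dir: Path) -> None:
    """Write a COLMAP text model with fixed poses and no 3D points yet."""
    model_dir.mkdir(parents=True, exist_ok=True)
    with open(model_dir / "cameras.txt", "w") as f:
        for cam_id, fx, fy, cx, cy, *_ in RIG:
            f.write(f"{cam_id} PINHOLE 1920 1080 {fx} {fy} {cx} {cy}\n")
    with open(model_dir / "images.txt", "w") as f:
        for cam_id, _, _, _, _, qw, qx, qy, qz, tx, ty, tz, name in RIG:
            # IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME, then an empty 2D-points line.
            f.write(f"{cam_id} {qw} {qx} {qy} {qz} {tx} {ty} {tz} {cam_id} {name}\n\n")
    (model_dir / "points3D.txt").touch()

write_known_pose_model(Path("sparse_known"))
Path("sparse_out").mkdir(exist_ok=True)
for cmd in (
    ["colmap", "feature_extractor", "--database_path", "db.db", "--image_path", "frames"],
    ["colmap", "exhaustive_matcher", "--database_path", "db.db"],
    # Triangulate with poses held fixed instead of running full SfM ("colmap mapper").
    ["colmap", "point_triangulator", "--database_path", "db.db",
     "--image_path", "frames", "--input_path", "sparse_known",
     "--output_path", "sparse_out"],
):
    subprocess.run(cmd, check=True)
```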

    • @JoshGladstone 11 months ago

      Thanks! At the moment any movement whatsoever requires a completely new COLMAP solve to yield good results, so every shot does need its own solve. Still working on that, though.
      My earlier rig did have a wider baseline between cameras, and that did seem to help with objects further away, since there were greater parallax differences, but it then struggled with closer objects. This rig is my attempt to split the difference and also have something more portable.
      Ideally sync would be at the sensor level, but really it's only an issue with fast-moving subjects at the moment. If I'm able to somehow get moving camera shots to work, it could be an issue there too, but for now sync is alright for what it is. I also need to experiment with shooting at higher frame rates to try and get the sync closer.

  • @NimaZeighami 1 year ago +2

    Soooooo sick!

  • @KuyaQuatro 10 months ago

    Finally watched this; really impressive work! Would love to check it out sometime if possible! Also, in the example footage I recognize that corner of Sunset & Wilcox; I used to work close by, and there were ALWAYS car accidents at that particular intersection haha

  • @Thats_Cool_Jack 1 year ago +2

    This is really cool. I've always wondered if it would be possible to use the depth pass to create background masks that could be content-aware filled with Stable Diffusion or something to minimize distortion. Might not be worth it, though, with how much SD can vary between frames.
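To illustrate the idea (this is not the video's actual pipeline): threshold the depth pass into a foreground mask, then hand the frame and mask to an inpainting backend to synthesize a clean background layer. Here cv2.inpaint stands in for a Stable Diffusion inpainting model, which would consume the same image-plus-mask pair; the filenames, depth convention, and threshold are made up.

```python
import cv2
import numpy as np

# One video frame plus its single-channel depth pass (0-255; here larger = nearer,
# so flip the comparison if your depth convention is inverted).
frame = cv2.imread("frame_0001.png")
depth = cv2.imread("depth_0001.png", cv2.IMREAD_GRAYSCALE)

# Near pixels are foreground: the holes a complete background layer needs filled.
NEAR_THRESHOLD = 128  # made-up split point between layers
foreground_mask = (depth >= NEAR_THRESHOLD).astype(np.uint8) * 255

# Dilate so the fill overlaps occlusion edges, where ghosting artifacts live.
foreground_mask = cv2.dilate(foreground_mask, np.ones((9, 9), np.uint8))

# Classical inpainting as a stand-in: an SD inpainting model would take the same
# (image, mask) pair but hallucinate richer texture, at the cost of the
# frame-to-frame variation mentioned above.
background_plate = cv2.inpaint(frame, foreground_mask, 5, cv2.INPAINT_TELEA)
cv2.imwrite("background_0001.png", background_plate)
```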

    • @JoshGladstone 1 year ago

      Check out lifecastvr.com, I believe they're using RGBD + Stable Diffusion to inpaint a static background layer.

  • @importon 1 year ago +4

    Neat! Looks like you're doing something similar to Debevec's format of geometry layers with displacement from the depth map. What GitHub project are you using to get the depth maps and in-painting? You still keeping your cards close to your chest?

    • @importon 1 year ago

      I think you meant "Post a comment if you have any thoughts or questions ...... that I may or may not answer"

    • @JoshGladstone 1 year ago

      Yes! It is very similar to the Google Layered Mesh representation, although the implementation and playback part is pretty different. You can PM me on Instagram if you want more info.

    • @importon 1 year ago +1

      That wasn't the part of my comment with the question mark 🙂 @@JoshGladstone

  • @roknovak9991 5 months ago

    Really cool work! I have some more technical questions about parts of the process.
    You mentioned a neural network that generates the layers. Is it a publicly available network and model, or is it your own custom network that you trained yourself? Also, how did you decide on 8 layers?
    And about the compositing in Unity: do you deform each layer along the Z axis based on the depth map?

  • @DavidAddis 1 year ago +1

    Quite amazing work really, well done! It's impressive what you can achieve with just 5 camera angles. Would you get some benefit from moving the cameras further apart?
    Funnily enough I just watched the shell rendering video from Acerola a few days ago - seems like a (coincidentally?) similar technique.
    I think if you can clear up some of the artifacts, platform holders will start to get interested!

    • @JoshGladstone 1 year ago +1

      Thanks, I appreciate the kind words!! Actually, my first rig was a 1 ft by 1 ft square, so the cameras were much further apart, and it was able to get good results at further distances, but closer captures were challenging because a lot of the image was out of frame and there was no overlap. This rig is an attempt to be able to capture closer subjects. I even had one version with the cameras spaced even closer, so this is sort of the middle ground between the two. I'm still optimizing, but yeah, for distant subjects it's definitely better to space the cameras out more.
      It would probably improve things to use more cameras, but without a better way to control all of them, it's a balancing act between capture quality, usability, and portability. It would really help if an action camera company got back into the wired array game, but that hasn't been a thing for years. It's an extremely niche market. That's one of the reasons most camera arrays are studio only.

  • @EmanX140 1 year ago +1

    Have you looked at Gaussian splatting? What are your thoughts on that?

    • @JoshGladstone 1 year ago +2

      Yes, Gaussian splatting is very cool! In fact, the footage of my GoPro rig is a Luma AI Gaussian splat!
      One of the issues with Gaussian splatting and NeRFs in general is the number of views needed to get good results, which is a challenge for video content. There has been some work on NeRF video (Neural 3D Video Synthesis, DyNeRF), but real-time playback is a challenge. It would need to be baked down into another format that can be streamed/downloaded and played back, and then you can lose a lot of the view-dependent effects, i.e. the NeRFiness. It is an active area of research though!

  • @eagleed99 8 months ago +2

    Were you able to get the Lume Pad version up for download?

    • @JoshGladstone 8 months ago

      I submitted it, but it still needs some fixes before it's able to be published, unfortunately.

    • @JoshGladstone 8 months ago

      The Lume Pad 2 version is now live in the Leia App Store!

  • @SirJoeYama 1 year ago +1

    Have you tried to capture volumetric videos using the Oculus Quest 3 stereo cameras and depth sensor?

    • @JoshGladstone 1 year ago +1

      I haven't, but it wouldn't work with this technique. It probably would with my previous stereoscopic multiplane project, though. But the depth sensor wouldn't factor into that. I can't think of any projects that take stereoscopic images + depth as input.

  • @SoundGuy 5 months ago

    I'm wondering if super-sparse light fields like this always require more than 2 cameras, or whether you can use already-filmed stereo videography to make things like this.
    And then I'm wondering if you can use AI to clean up the scene, eliminating those ghosts, making more perfect layers, and also guessing or imagining the missing information. There are plenty of cool upscalers out there already available.

    • @JoshGladstone 5 months ago

      At the moment, it needs more than two cameras. But my previous project was based on stereo pairs; check out my older video "Multiplane Video - Volumetric Video with Machine Learning". I even converted some historical stereo photographs to volumetric, which is also available on the channel.
      It's certainly possible that future projects could produce similar results from stereo images. In fact, there are projects currently in the works to produce similar results from a single image.
      Hell, there are even current generative AI projects that aim to produce similar results from zero images! Crazy!

  • @AndriiShramko 1 year ago +1

    Great! Did you test Gaussian splatting for volumetric video? Why did you choose this method and not splatting?

    • @JoshGladstone 1 year ago

      Splatting tends to need a lot more views to get good results. It's also an open question how to play back Gaussian splats in real time as video. This format, on the other hand, can be wrapped into an MP4 and streamed on mobile hardware.

    • @JoshGladstone 1 year ago

      I did use Gaussian splats for the video, though! The shot of the camera rig is a Luma AI capture.

  • @dlawrence 1 year ago +1

    Nice work, Josh! Have you played with spatial video from the iPhone 15 Pro? It will be enabled in the next iOS release. When this happens it will be one of the most mainstream spatial video capture formats out there. It would be so cool to have a player that could display this video.

    • @JoshGladstone 1 year ago +1

      Thanks! I haven't seen any samples yet, but Apple's spatial video is just stereoscopic 3D wrapped into HEVC. There's no volumetric or 6DoF aspect to it, so once the format is parsed it should be relatively simple to view on any VR headset.

    • @Naundob 1 year ago

      @@JoshGladstone That was my guess as well. But rumor has it that it will incorporate depth from LiDAR and/or some AI trickery to give it some 6DoF flavor. In fact, it would be rather disappointing if Apple ended up with only good old 3D video.

    • @JoshGladstone 1 year ago

      @@Naundob The frame it's displayed in has some 6DoFiness to it, but it's definitely just 3D wrapped into HEVC in such a way that it's backwards compatible with 2D playback. There's more info here: developer.apple.com/videos/play/wwdc2023/10071/
