Artificial Intelligence For the Stereographer

  • Published Aug 14, 2021
  • Originally presented at the NSA Virtual 3D-Con 2021 (3d-con.com), 8/15/21
    The collected links: WorldOfDepth.com/tutorials/AIf...
    Many modern developments in AI deal with the 3rd dimension, from self-driving cars that perceive and navigate their environments in 3D, to programs that can build 3D models of scenes simply by analyzing 2D videos of them. Though some of these AIs are accessible only to researchers and tech companies, other AI applications, like upscaling images or colorizing black-and-white photos, are immediately useful and available to all. This workshop will show you how to use AIs of special interest to stereographers, via the free tool Google Colab, to do things like creating depth maps, stereopairs, animations, and stereovideos, even from 2D images.
    UPSCALING IMAGES
    Recommended: BigJPG.com
    Neural Image Super-Resolution by Christian Ledig et al. *not covered in this workshop: github.com/fukumame/superresol...
    • colab.research.google.com/gith...
    COLORIZING BLACK & WHITE IMAGES
    Website version: deepai.org/machine-learning-mo...
    DeOldify by Jason Antic et al.: github.com/jantic/DeOldify
    • Artistic version (for more interesting details and vibrance)
    Customized for direct uploading: colab.research.google.com/dri...
    Original: colab.research.google.com/gith...
    • Stable version (better for landscapes and portraits)
    Original: colab.research.google.com/gith...
    CREATING DEPTH MAPS FROM 2D IMAGES
    Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging by S. Mahdi H. Miangoleh et al.:
    yaksoy.github.io/highresdepth/
    • Original: colab.research.google.com/gith...
    • Bug fix version (if the above gives you an error): colab.research.google.com/dri...
    Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer by René Ranftl et al.: github.com/isl-org/MiDaS
    • Customized for direct uploading: colab.research.google.com/dri...
    • Original: colab.research.google.com/gith...
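    As a rough sketch (not part of the workshop links above), MiDaS can also be run outside Colab via torch.hub; the model and transform names below follow the isl-org/MiDaS README, and it assumes torch, timm, and opencv-python are installed:
    import cv2
    import numpy as np
    import torch

    # Load a MiDaS model and its matching input transform from torch.hub
    model_type = "DPT_Large"  # alternatives per the README: "DPT_Hybrid", "MiDaS_small"
    midas = torch.hub.load("intel-isl/MiDaS", model_type)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    midas.to(device).eval()
    transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
    transform = transforms.dpt_transform  # use transforms.small_transform for MiDaS_small

    # Run the model on one image and resize the prediction back to the input size
    img = cv2.cvtColor(cv2.imread("input.jpg"), cv2.COLOR_BGR2RGB)
    with torch.no_grad():
        prediction = midas(transform(img).to(device))
        prediction = torch.nn.functional.interpolate(
            prediction.unsqueeze(1), size=img.shape[:2],
            mode="bicubic", align_corners=False,
        ).squeeze()

    # Normalize to an 8-bit grayscale depth map (MiDaS outputs inverse relative depth,
    # so nearer objects come out lighter)
    depth = prediction.cpu().numpy()
    depth = (255 * (depth - depth.min()) / (depth.max() - depth.min())).astype(np.uint8)
    cv2.imwrite("depth.png", depth)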
    USING DEPTH MAPS TO ANIMATE 2D IMAGES
    3D Photography using Context-aware Layered Depth Inpainting by Meng-Li Shih et al.: shihmengli.github.io/3D-Photo-...
    • colab.research.google.com/dri...
    • Documentation/Settings: github.com/vt-vl-lab/3d-photo-...
    EXTRAPOLATING DEPTH MAPS FROM 2D VIDEOS
    Consistent Video Depth Estimation by Xuan Luo et al.: roxanneluo.github.io/Consisten...
    • Consistent Video Depth...
    • colab.research.google.com/dri... *NEW / not covered in this workshop
    Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes by Zhengqi Li et al.: www.cs.cornell.edu/~zl548/NSFF/
    • Neural Scene Flow Fiel...
    The original stereophotos were captured with:
    • my DIY catadioptric rig (see worldofdepth.com/daily/201023....)
    • twin Sony RX100 II rig
    • 3D LG Thrill
    • sequentially with an iPhone SE
    The stereos were edited using:
    • StereoPhoto Maker: stereo.jpn.org
    • ImageMagick.org
    • Photopea.com
    • ffmpeg.org (for videos)
    WorldOfDepth.com
    / worldofdepth
    Written/produced by Gordon Au © 2021
  • Science & Technology

COMMENTS • 45

  • @dragonflyK110
    @dragonflyK110 6 months ago +1

    "Originally presented at the NSA Virtual 3D-Con 2021, 8/15/21 "
    I'm not gonna lie, for a minute or so I was very confused about why the National Security Agency would have a 3D conference :)
    Anyway, thank you for this video; it was quite educational for somebody getting back into 3D tech after not paying much attention to it over the last decade or so. And despite the video's age, it seems to have held up quite well. Though if you know of any relevant AI models that have been released since this video, I'd love to know, as I'm currently trying to learn as much as I can about this topic.
    Thank you again for the time you put into this video.

    • @WorldofDepth
      @WorldofDepth  4 months ago

      Thanks for the appreciation! In my tests, the best AI depth estimator is still MiDaS, but the new version 3.1, released after this video. There is a very new one called Marigold (huggingface.co/spaces/toshas/marigold), but in my first tests, it's not as good as MiDaS v3.1.

    • @dragonflyK110
      @dragonflyK110 4 months ago

      @@WorldofDepth Thank you for the response. I have done quite a bit of research since that comment, so I have actually heard about Marigold.
      Have you tried out Depth Anything? It's even newer than Marigold and, in my testing, is much better than MiDaS. It has a HF space if you want to try it out.

  • @BrawlStars-jd7jh
    @BrawlStars-jd7jh 1 year ago

    really cool stuff, thanks for sharing!

    • @BrawlStars-jd7jh
      @BrawlStars-jd7jh 1 year ago

      I have a problem: when I run the last process in the 3D Photo Inpainting notebook, it says
      "TypeError: load() missing 1 required positional argument: 'Loader'"
      I already uploaded the depth map and the base image.
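      That error usually comes from newer PyYAML versions, where yaml.load() now requires an explicit Loader; a minimal sketch of the usual fix, assuming the notebook loads its config with a bare yaml.load() call on the repo's argument.yml:
      import yaml

      # Breaks under PyYAML >= 6.0, because Loader is no longer optional:
      # config = yaml.load(open("argument.yml", "r"))

      # Fix 1: pass a Loader explicitly
      config = yaml.load(open("argument.yml", "r"), Loader=yaml.FullLoader)
      # Fix 2: use safe_load, which needs no Loader argument
      config = yaml.safe_load(open("argument.yml", "r"))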

  • @user-fv6nc7qi2x
    @user-fv6nc7qi2x 2 years ago +2

    When are you uploading something about the AIs you talked about at the end? Instantly subbed

    • @WorldofDepth
      @WorldofDepth  2 years ago

      Thank you! This workshop was for the NSA 3D convention, so I may revisit this topic with those other AIs and newer ones for next year's Con. In the meantime, I recommend checking out Ugo Capeto's YT channel for reviews of additional AIs.

  • @WorldofDepth
    @WorldofDepth  2 years ago +1

    Note that as of 8/17/21, the 3D-Photo-Inpainting Colab Notebook is NOT working due to missing files. I’ve opened a new issue report with the researchers on GitHub and will update this comment with developments.

    • @WorldofDepth
      @WorldofDepth  2 years ago +1

      8/18/21: Files restored and working again :) Reference: github.com/vt-vl-lab/3d-photo-inpainting/issues/131

  • @jimpvr3d289
    @jimpvr3d289 2 years ago +1

    VERY interesting! But if the depth AI produces only depth maps... what produces the missing pixel information (like the clouds behind Mr. Rogers' head)? Is StereoPhoto Maker doing that, and why doesn't the AI do that also?
    Thank you

    • @WorldofDepth
      @WorldofDepth  2 years ago +3

      The painting-in of those kinds of background spaces is exactly what the 3D-Photo-Inpainting AI does; the zoom-in animation at 22:42 is an example. It does that based on everything it has learned from massive amounts of training data. The SPM animation at 23:25 does do some amount of inpainting as well, but I think it's based on straight calculations and copying existing pixels, rather than on AI, and I think it's not as smooth.

  • @CabrioDriving
    @CabrioDriving 2 years ago +1

    Do you know how to convert a 2D photo to 3D in a way that feels like standing 1 or 2 feet from a huge 3D world-window, with proper, deep depth to the scene, without everything looking flat and too wide? I was thinking about some mathematical function to convert the image to a different "lens angle", but I'm not sure if that's the right direction. There is also a formula for 2D-to-3D conversion with a camera focus point variable, a near-plane cut variable, and a far-plane cut variable. I wonder which parameter of that (or some other formula?) you need to manipulate to get a 3D photo with ideal depth and the scene as close to you as possible. Imagine the 3D photo as a huge window onto your garden, running from floor to ceiling, with you standing right next to it. Typically I notice that videos/photos are best viewed at about 10 feet of virtual (perceived) distance in VR, but that kills the feeling of presence in that world; it's just like seeing some window with 3D, too far away.

    • @WorldofDepth
      @WorldofDepth  2 years ago +1

      What you describe sounds more like VR 180° 3D images to me. To have a feeling of standing close to a 3D scene, I think that's the only option, unless you render in full 3D. I don't work in 180° or 360°, but check out the 3D-Con workshops and Special Interest Groups about VR, from both this year and last year.

    • @CabrioDriving
      @CabrioDriving 2 years ago +1

      @@WorldofDepth I wasn't thinking about the super-wide angle of VR180, but let's say 100-110°, just with deep scene depth. I will study whether it is possible to do what I described. Cheers

  • @metamind095
    @metamind095 2 years ago +2

    Can I use MiDaS to convert 2D video to 3D somehow? What, in your opinion, is the best tool for 2D-to-3D video conversion? Thx for the video.

    • @WorldofDepth
      @WorldofDepth  2 years ago +1

      It's possible to do that with MiDaS frame by frame, but the resulting video will flicker, so I think it's best to use tools made specifically for video instead. “Consistent Video Depth Estimation” is near the bottom of my collected links (see video description), plus another, but I haven't used them myself. There are surely other similar AIs out there now as well.
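      A rough sketch of that frame-by-frame route (expect the flicker mentioned above); it assumes ffmpeg is on the PATH, and depth_for_image() stands in as a hypothetical helper that runs a single-image depth model such as MiDaS on one frame:
      import glob
      import os
      import subprocess

      os.makedirs("frames", exist_ok=True)
      os.makedirs("depth", exist_ok=True)

      # 1) Split the video into numbered frames
      subprocess.run(["ffmpeg", "-i", "input.mp4", "frames/%05d.png"], check=True)

      # 2) Estimate depth one frame at a time -- each frame is handled independently,
      #    which is exactly why the result flickers
      for frame in sorted(glob.glob("frames/*.png")):
          depth_for_image(frame, os.path.join("depth", os.path.basename(frame)))

      # 3) Reassemble the depth maps into a video at the source frame rate
      subprocess.run(["ffmpeg", "-framerate", "30", "-i", "depth/%05d.png",
                      "-c:v", "libx264", "-pix_fmt", "yuv420p", "depth.mp4"], check=True)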

    • @Rocketos
      @Rocketos 1 year ago

      @@WorldofDepth Can you upload a tutorial for using Consistent Video Depth Estimation? Please

    • @WorldofDepth
      @WorldofDepth  1 year ago +1

      @@Rocketos I don't have one to upload. Perhaps if I have time in the future.

    • @Rocketos
      @Rocketos 1 year ago

      @@WorldofDepth thanks i love your workshops

  • @MichaelBrownArtist
    @MichaelBrownArtist 2 years ago

    12/16/21 MiDaS v.3 - Failed at the second step (load a model): ModuleNotFoundError: No module named 'timm'

    • @WorldofDepth
      @WorldofDepth  2 years ago

      Hmm, I just tried the ‘upload version’ notebook and it worked fine. Did you miss the first code box, above “Uploading Your Image”? That is the step that installs timm. If you did run that, it may be that the bandwidth limit for a certain external file was reached, and it was temporarily unavailable, but it should notify you if that's the problem.
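      For reference, that first code box is essentially a Colab cell that installs the missing module, something along the lines of:
      # Hypothetical first notebook cell: install the timm dependency MiDaS needs
      !pip install timm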

    • @MichaelBrownArtist
      @MichaelBrownArtist 2 years ago +1

      @@WorldofDepth, maybe I did miss it. I'll try again. Thanks for preparing such a great presentation.

    • @MichaelBrownArtist
      @MichaelBrownArtist 2 years ago +1

      Your suspicion was correct. I missed the first step (install timm). I was able to run it, but the final depth map was very tiny: 284x217 px. Not sure why.

    • @WorldofDepth
      @WorldofDepth  2 years ago +1

      @@MichaelBrownArtist Ah, good. But yes, MiDaS v.3 outputs very small depth maps. I would recommend trying 1) starting with a larger input image, 2) using BMD + MiDaS v2.1 as a possible alternative, which outputs at original size, and 3) upscaling the v3 depth map and using something like a symmetrical nearest-neighbor interpolation method to smooth it. Ugo Capeto recommends the latter; I don't have a program which offers that method, so I've used ImageMagick and the "Kuwahara edge-preserving noise filter" with pretty good results.
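      For option 3, a sketch of the kind of command meant here, assuming ImageMagick 7 ("magick" on the PATH, with the -kuwahara option available); the resize percentage and radius are just illustrative:
      import subprocess

      # Upscale the small MiDaS v3 depth map, then smooth the enlarged depth steps
      # with ImageMagick's Kuwahara edge-preserving noise filter
      subprocess.run([
          "magick", "depth_small.png",
          "-resize", "400%",     # illustrative: scale back up toward the original size
          "-kuwahara", "3",      # illustrative radius; larger = smoother
          "depth_big.png",
      ], check=True)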

    • @MichaelBrownArtist
      @MichaelBrownArtist 2 years ago

      @@WorldofDepth , thank you.

  • @CabrioDriving
    @CabrioDriving 2 years ago

    So you have 2D images + depth maps. Which software/GitHub project should you use to produce the two stereo images? StereoPhoto Maker is not good, in my opinion.

    • @WorldofDepth
      @WorldofDepth  2 years ago

      I use SPM, which, by the way, produces much smoother stereopairs if you appropriately upscale your 2D image + depth map first. For example, if you're using an SPM deviation value of 60 = 6% of image width, then for a 256-level depth map you need a 4267px-wide image.
      You could also take advantage of the 3PI AI and produce a panning video with it, then extract a stereopair. That's not time-efficient, but it would have the best inpainting.

    • @CabrioDriving
      @CabrioDriving 2 years ago +1

      @@WorldofDepth Hi. Thanks for your time and valuable answer. What I've noticed is that SPM has problems with depth recognition even though the depth map looks OK visually. It also tears apart surfaces like faces when you produce 3D, even with a deviation of 25 or 30 (the default), and the produced images look downscaled in quality. I have a depth map in which I can see the depth is represented correctly, yet SPM makes the foreground flat and the background correct, or produces vertically sliced/layered depth, or renders some things flat and others correct. I have spent a lot of time on this software (even with Google AI working from my disk) and never produced a great 3D photo out of depth maps made with MiDaS or LeRES and some other AI software. That is why I asked about some other project for producing correct images. Good point on 256-level depth maps and image size.

    • @WorldofDepth
      @WorldofDepth  2 years ago

      @@CabrioDriving Almost every AI-produced depth map needs manual corrections, I think, especially if it's an image with people's faces at any significant size. As I say in the video, it can be convenient if you're producing 2D animations that are more forgiving of the depth map, but otherwise it's always going to be work, at this stage of the tech…
      The sliced/layered depth problem you mention is exactly the image size issue I mentioned. Upsizing for stereopair generation and then downsizing back to original size should smooth those areas.

    • @CabrioDriving
      @CabrioDriving 2 years ago +1

      @@WorldofDepth Thank you for your priceless comments. 1. How did you calculate that needed resolution of 4267 pixels wide? 2. What should the image width be for 3% deviation? 3. What deviation % do you suggest for the best effect?

    • @WorldofDepth
      @WorldofDepth  2 years ago +1

      ​@@CabrioDriving 1) If you use an 8-bit grayscale depth map that is normalized to go all the way from pure black to pure white, it has 256 different depth levels. If you want to differentiate all of those in a stereopair, then those levels will correspond to horizontally shifting parts of the original image between 0 and 255 pixels. If the maximum shift you want ( = deviation) is 6%, that means 255 pixels must be 6% (or less) of your image width, so WIDTH * .06 = 255, and WIDTH = 4250px. (My original number was slightly off.)
      2) By the same method, you need an 8500px-wide image if deviation is 3%.
      3) It depends on the picture and the intended use. 3% was the old rule of thumb, and that's good for TV size, I think. For laptop or phone screen size, I probably use 4.5-6%, or rarely up to 8% for a very deep scene. For display via a projector at wall size, maybe you'd want to go down to 1.5%.
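      The same arithmetic as a tiny Python sketch, with the deviation given as a fraction of image width and an 8-bit (256-level) depth map assumed:
      def min_width_for_deviation(deviation_fraction, depth_levels=256):
          # Maximum horizontal shift is depth_levels - 1 pixels (255 for 8-bit),
          # and that shift should be at most deviation_fraction * image width.
          max_shift_px = depth_levels - 1
          return max_shift_px / deviation_fraction

      print(min_width_for_deviation(0.06))  # 4250.0 px wide for 6% deviation
      print(min_width_for_deviation(0.03))  # 8500.0 px wide for 3% deviation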

  • @importon
    @importon 1 year ago

    Have you lost interest in this stuff? Why no new content?

    • @WorldofDepth
      @WorldofDepth  1 year ago

      As you can see, this video workshop was part of NSA Virtual 3D-Con 2021, and other videos of mine were made for similar conferences and some smaller regional events. I presented again at the 2022 3D-Con last month, but unfortunately, the conference was not recorded.
      If you want to see the latest 3D I'm producing, follow @WorldOfDepth on Instagram, and/or check WorldOfDepth.com (though I need to update that more!).