Real time practical monocular 3D perception | ACV Meetup

  • Published 30 Jun 2024
  • Patreon : / membership
    Advances in monocular depth estimation through supervised and self-supervised learning make it possible to infer a semantically coherent depth map. However, monocular depth estimation still lacks some features needed for practical embedded robotics applications. In such constrained scenarios, the scale and shift ambiguity, the computational requirements, and the lack of temporal stability are some of the barriers to production. In this talk, I introduce ongoing work to include spatio-temporal inductive biases in the models. I first show how the flexibility of transformer architectures could improve model efficiency at run time. Then, I take a quick tour of how we can leverage multi-view geometry, through optical expansion and depth from motion, to solve the inherent scale ambiguity of monocular models.
    00:00 Intro
    02:35 Hardware
    03:13 Mammal perception
    06:54 State of the art
    10:00 Opportunities
    11:00 Guided Attention
    14:50 Efficient depth refinement
    17:30 Recurrent formulation
    22:00 Intermediate Summary
    23:15 Optical expansion
    24:40 Motion parallax
    26:10 Solving scale ambiguity
    27:00 Conclusion
    28:00 Questions
    [Chronique d'une IA]
    Spotify : open.spotify.com/show/5yTTTKf...
    Amazon music : music.amazon.fr/podcasts/5097...
    Apple Podcasts: podcasts.apple.com/us/podcast...
    [About me]
    Visual Behavior : visualbehavior.ai
    Personal : www.thibaultneveu.ai/
    Github : github.com/thibo73800
    Linkedin : / thibaultneveu
    Twitter : / thiboneveu
  • Science & Technology
