Real time practical monocular 3D perception | ACV Meetup
Вставка
- Опубліковано 30 чер 2024
- Patreon : / membership
Advanced in monocular depth estimation through supervised and self-supervised learning makes it possible to infer a semantically coherent depth map. However monocular depth estimation still lacks some features for practical embedded robotics applications. In such constraints scenarios, the scale and shift ambiguity, the computation requirements and the lack of temporal stability are some of the barriers to production. In this talk, I introduce ongoing work to include spatio-temporal inductive bias in the models. I first introduce how the flexibility of transformers architecture could improve models efficiency at run time. Then, I take a quick tour of how we can leverage multi-view geometry through optical expansion and depth from motion to solve the inherent scale ambiguity of monocular models.
00:00 Into
02:35 Hardware
03:13 Mammal perception
06:54 State of the art
10:00 Opportunities
11:00 Guided Attention
14:50 Efficient depth refinement
17:30 Recurrent formulation
22:00 Intermediate Summary
23:15 Optical expansion
24:40 Motion parallax
26:10 Solving scale ambiguity
27:00 Conclusion
28:00 Questions
[Chronique d'une IA]
Spotify : open.spotify.com/show/5yTTTKf...
Amazon music : music.amazon.fr/podcasts/5097...
Apple Podcasts: podcasts.apple.com/us/podcast...
[About me]
Visual Behavior : visualbehavior.ai
Perso : www.thibaultneveu.ai/
Github : github.com/thibo73800
Linkedin : / thibaultneveu
Twitter : / thiboneveu - Наука та технологія