[CVPR 2021] DeepVideoMVS: Multi-View Stereo on Video with Recurrent Spatio-Temporal Fusion

  • Published Feb 9, 2025
  • Paper: openaccess.the...
    Code: github.com/ard...
    DeepVideoMVS is a learning-based online multi-view depth prediction approach on posed video streams, where the scene geometry information computed in the previous time steps is propagated to the current time step. The backbone of the approach is a real-time capable, lightweight encoder-decoder that relies on cost volumes computed from pairs of images. We extend it with a ConvLSTM cell at the bottleneck layer, and a hidden state propagation scheme where we partially account for the viewpoint changes between time steps.
    This extension adds only a small overhead in computation time and memory consumption over the backbone, while improving the depth predictions significantly. As a result, DeepVideoMVS achieves highly accurate depth maps with real-time performance and low memory consumption. It produces noticeably more consistent depth predictions than our backbone and existing methods throughout a sequence, which is reflected in less noisy reconstructions.
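To make the ConvLSTM extension concrete, here is a minimal sketch of a convolutional LSTM cell like the one placed at the encoder-decoder bottleneck. This is an illustrative implementation, not the paper's code: the class and argument names are my own, and the paper's viewpoint-dependent warping of the hidden state between time steps is omitted here (the previous hidden state is reused as-is).

```python
import torch
import torch.nn as nn


class ConvLSTMCell(nn.Module):
    """Minimal convolutional LSTM cell (illustrative sketch, not the paper's code).

    Replaces the matrix multiplies of a standard LSTM with 2D convolutions,
    so the hidden and cell states keep the spatial layout of the feature map.
    """

    def __init__(self, in_channels: int, hidden_channels: int, kernel_size: int = 3):
        super().__init__()
        self.hidden_channels = hidden_channels
        # One convolution produces all four gate pre-activations at once.
        self.gates = nn.Conv2d(
            in_channels + hidden_channels,
            4 * hidden_channels,
            kernel_size,
            padding=kernel_size // 2,
        )

    def forward(self, x, state=None):
        if state is None:
            # Zero-initialize hidden and cell states on the first time step.
            b, _, h, w = x.shape
            zeros = x.new_zeros(b, self.hidden_channels, h, w)
            state = (zeros, zeros)
        h_prev, c_prev = state
        # NOTE: in DeepVideoMVS, h_prev/c_prev would first be warped to
        # partially account for the viewpoint change; omitted in this sketch.
        i, f, o, g = torch.chunk(
            self.gates(torch.cat([x, h_prev], dim=1)), 4, dim=1
        )
        c = torch.sigmoid(f) * c_prev + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)


if __name__ == "__main__":
    cell = ConvLSTMCell(in_channels=8, hidden_channels=16)
    state = None
    # Feed a short stream of bottleneck features, carrying state across steps.
    for _ in range(3):
        features = torch.randn(2, 8, 12, 16)
        out, state = cell(features, state)
    print(out.shape)  # torch.Size([2, 16, 12, 16])
```

The recurrent state is spatial (a feature map, not a vector), which is what lets scene geometry computed at earlier time steps be propagated forward with little extra compute or memory.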
