DINO: Self-Supervised Vision Transformers

  • Published 13 Jun 2024
  • DINO is a self-supervised method that trains on two different augmented views of the same image. Without labels, it learns to focus on the objects in an image and to produce representations that separate image categories. It outperforms prior self-supervised methods across a range of vision tasks and reaches 80.1% top-1 accuracy on ImageNet in linear evaluation with a ViT-B backbone.
    Paper link: arxiv.org/abs/2104.14294
    Table of Contents:
    00:00 Introduction
    03:45 Knowledge Distillation
    05:13 DINO
    07:40 Multi-crop training
    12:31 Avoiding Collapse
    16:06 Results
    Icon made by Freepik from flaticon.com
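
The description mentions two augmented views, knowledge distillation, and avoiding collapse. As a rough illustration only (not code from the video or from the official repository), the sketch below computes a DINO-style loss for one pair of views in plain Python. The function names (`log_softmax`, `dino_loss`, `ema_update`), the toy logits, and the temperature and momentum values are assumptions chosen for this example; the centering and sharpening of the teacher output follow the general recipe described in the paper.

```python
import math


def log_softmax(logits, temperature):
    """Numerically stable log-softmax of logits scaled by 1/temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_total for x in scaled]


def dino_loss(student_logits, teacher_logits, center,
              student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between teacher and student distributions.

    The teacher output is centered (subtract a running mean) and
    sharpened (low temperature). Temperatures are illustrative values.
    """
    centered = [t - c for t, c in zip(teacher_logits, center)]
    teacher_probs = [math.exp(v) for v in log_softmax(centered, teacher_temp)]
    student_log_probs = log_softmax(student_logits, student_temp)
    return -sum(p * lq for p, lq in zip(teacher_probs, student_log_probs))


def ema_update(old, new, momentum):
    """Exponential moving average: momentum * old + (1 - momentum) * new."""
    return [momentum * o + (1 - momentum) * n for o, n in zip(old, new)]


# Toy outputs for two augmented views of the same image (hypothetical values).
student_out = [2.0, 0.5, -1.0]   # student network on view 1
teacher_out = [1.8, 0.7, -0.9]   # teacher network on view 2
center = [0.0, 0.0, 0.0]

loss = dino_loss(student_out, teacher_out, center)
# Update the running center from the teacher output (momentum is an assumption).
center = ema_update(center, teacher_out, momentum=0.9)
```

In actual training the student is updated by gradient descent on this loss, while the teacher weights are an exponential moving average of the student weights; here `ema_update` is applied only to the center to keep the sketch short.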

COMMENTS • 10

  • @yiqian22 7 months ago

    As always, thank you very much for the clear explanation - I truly appreciate it! 👏

  • @ericsy78 9 months ago

    This is a great video. I really appreciate the dedication in each video you post. I learn a lot from watching your videos, and they have always been helpful to me.

    • @soroushmehraban 9 months ago

      Thanks for the feedback! It’s my pleasure

  • @alihadimoghadam8931 9 months ago

    Great video, as always 🤘

  • @AshishJain-iw5md 8 months ago +1

    Very informative!!!

  • @pulakgautam3536 8 months ago

    I love your channel!

    • @soroushmehraban 8 months ago

      Thanks for the kind comment! This is really encouraging. Will try my best to come up with more paper reviews in the future.