Vision Transformer (ViT) - An Image is Worth 16x16 Words: Transformers for Image Recognition

  • Published 18 Dec 2024

COMMENTS •

  • @suke933
    @suke933 2 years ago +1

    Once again, a great way of illustrating the latest research. Thanks a lot.

  • @VikashVerma-c3v
    @VikashVerma-c3v 1 year ago +1

    Nice explanation

    • @AIBites
      @AIBites  1 year ago

      Thank you Vikash! 😊

  • @BiranchiNarayanNayak
    @BiranchiNarayanNayak 2 years ago +1

    Excellent explanation. I love it.

  • @sdsgnitromax8632
    @sdsgnitromax8632 4 years ago

    Hi! Could you please elaborate on task transfer? You gave an example with classification of dogs and cats as task 1 and horses and elephants as task 2. How does knowledge transfer work here?

    • @AIBites
      @AIBites  4 years ago

      Thanks for your comments. To elaborate: appearance-wise, they are all four-legged creatures, so the knowledge that the classes horses and elephants look similar to the classes dogs and cats should transfer from task 1 to task 2. I spoke more from the perspective of meta-learning, where we train in episodes and each episode is a task (see the sketch below). Hope it makes sense now. Or perhaps the example wasn't the best.

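To make the episodic idea concrete, here is a minimal sketch of meta-learning-style training where each episode is one small classification task. The `sample_task()` helper and the tiny model are hypothetical placeholders, not from the video.

```python
import torch
import torch.nn as nn

def sample_task():
    """Hypothetical helper: returns (images, labels) for one binary task,
    e.g. episode 1 = dogs vs. cats, episode 2 = horses vs. elephants."""
    x = torch.randn(16, 3, 224, 224)   # a small batch of images
    y = torch.randint(0, 2, (16,))     # binary labels for this task
    return x, y

# One shared model; its weights carry knowledge (e.g. generic
# "four-legged animal" features) from one task to the next.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=7, stride=4),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 2),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

for episode in range(100):             # each episode = one task
    x, y = sample_task()
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
```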
  • @user-or7ji5hv8y
    @user-or7ji5hv8y 4 years ago +2

    This is a very good presentation.

    • @AIBites
      @AIBites  4 years ago

      Thank you very much!

  • @lisabecker3246
    @lisabecker3246 3 years ago

    Thanks for the great video! Do you mean BERT instead of BIRT when you mention the class token?

    • @AIBites
      @AIBites  3 years ago

      Yes, that's a good spot, Lisa. I meant BERT! :) (See the sketch of the class token below.)

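For the curious, here is a minimal sketch of how ViT prepends a learnable class token to the patch embeddings, analogous to BERT's [CLS] token; the module name and default dimensions (196 patches, width 768, as in ViT-Base at 224x224 resolution) are illustrative.

```python
import torch
import torch.nn as nn

class PatchEmbedWithCLS(nn.Module):
    """Prepend a learnable [class] token to the patch tokens and add
    position embeddings, as in ViT (analogous to BERT's [CLS])."""
    def __init__(self, num_patches: int = 196, dim: int = 768):
        super().__init__()
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, dim))

    def forward(self, patch_tokens: torch.Tensor) -> torch.Tensor:
        b = patch_tokens.shape[0]                       # (batch, 196, 768) in
        cls = self.cls_token.expand(b, -1, -1)          # one [class] token per image
        tokens = torch.cat([cls, patch_tokens], dim=1)  # (batch, 197, 768)
        return tokens + self.pos_embed

embed = PatchEmbedWithCLS()
out = embed(torch.randn(2, 196, 768))   # -> shape (2, 197, 768)
```

The classifier head then reads the transformer output at index 0, exactly where BERT reads its [CLS] representation.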
  • @godwinrayan4110
    @godwinrayan4110 3 years ago

    Great video! Would be nice if you could also post one about DETR and deformable DETR :)

    • @AIBites
      @AIBites  3 years ago +1

      Thanks Godwin. I have a video on DETR; will do deformable DETR at some point :)

  • @vipulmehra1925
    @vipulmehra1925 3 years ago

    How do you do regression with a Vision Transformer?

    • @AIBites
      @AIBites  3 years ago

      Thanks for your question. It is the same as with any neural network or CNN architecture: instead of training the output with a softmax cross-entropy loss, you train with an L1 or L2 loss (see the sketch below).

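A minimal sketch of that setup, assuming torchvision's vit_b_16 (the heads and hidden_dim attributes belong to torchvision's VisionTransformer; any ViT backbone works the same way):

```python
import torch
import torch.nn as nn
from torchvision.models import vit_b_16

# Swap the classification head for a single continuous output and
# train with an L2 (MSE) or L1 loss instead of softmax cross-entropy.
model = vit_b_16(weights=None)
model.heads = nn.Linear(model.hidden_dim, 1)   # regression head

loss_fn = nn.MSELoss()                         # L2; nn.L1Loss() for L1
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

x = torch.randn(4, 3, 224, 224)                # dummy batch of images
target = torch.randn(4, 1)                     # continuous targets

opt.zero_grad()
loss = loss_fn(model(x), target)
loss.backward()
opt.step()
```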
  • @Holman57
    @Holman57 4 years ago

    Very cool

    • @AIBites
      @AIBites  4 years ago

      Thank you very much!

  • @navinbondade5365
    @navinbondade5365 3 years ago

    Great video! Can you make a coding video?

    • @AIBites
      @AIBites  3 years ago

      Yeah, sure, in future videos! ☺