DeiT - Data-efficient image transformers & distillation through attention (paper illustrated)

  • Published 18 Dec 2024

COMMENTS •

  • @amirhosseinfarzaneh9771
    @amirhosseinfarzaneh9771 3 years ago +3

    Great job. You started off very well by dissecting the numbers in the tables. I wish I could see more of the paper's tables analyzed. I also liked the final conclusion pointing out that a CNN teacher still has to be trained in the end.

    • @AIBites
      @AIBites  3 years ago +1

      Thanks for the feedback, Amir. I will try to cover more results tables in the future. As a general trend, viewership drops quite a lot once the concept has been explained, hence the lighter emphasis on the results and conclusion.

  • @LiamWu-b3g
    @LiamWu-b3g 1 year ago +1

    Great explanation

  • @ddas8554
    @ddas8554 2 years ago +1

    How is the distillation token initialized, and what is its purpose? Looking at the code, it doesn't seem to take inputs from the output of a CNN.

    • @oguzhanercan4701
      @oguzhanercan4701 2 years ago

      I agree with you; that part looks like a mistake, but I'm not sure.

    • @zulfawijaya4809
      @zulfawijaya4809 1 month ago

      I also don't see the distillation token used in the formula, although the paper states that the best performance is obtained with the class + distillation token.
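
On the distillation-token question: a minimal sketch (loosely following the official DeiT/timm implementation, simplified here) of how the token is typically created. It is just a learnable embedding, initialized like the class token; the CNN teacher's output never feeds into it directly and the teacher only shows up in the distillation loss that supervises the head attached to this token.

```python
import torch
import torch.nn as nn


class DeiTTokens(nn.Module):
    """Sketch of how DeiT prepends its two extra tokens.

    Both the class token and the distillation token are learnable
    embeddings; neither receives the teacher's output as an input.
    """

    def __init__(self, embed_dim: int = 768):
        super().__init__()
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.dist_token = nn.Parameter(torch.zeros(1, 1, embed_dim))

    def forward(self, patch_embeddings: torch.Tensor) -> torch.Tensor:
        b = patch_embeddings.shape[0]
        cls = self.cls_token.expand(b, -1, -1)
        dist = self.dist_token.expand(b, -1, -1)
        # Transformer input sequence: [CLS, DIST, patch_1, ..., patch_N]
        return torch.cat((cls, dist, patch_embeddings), dim=1)
```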

  • @yefetbentili6198
    @yefetbentili6198 3 years ago +1

    Very helpful, would love to see more content like this.

    • @AIBites
      @AIBites  3 years ago +1

      Thank you, Yefet! Quite encouraging to keep going :)

  • @rahilmehrizi5434
    @rahilmehrizi5434 3 years ago

    Very clear explanation, thank you!

  • @markgicharu
    @markgicharu 3 years ago +1

    Very interesting.
    Waiting patiently to see its use cases in industry and its value proposition over current methods that utilize various CNNs.
    @AI Bites Keep up the good work

    • @AIBites
      @AIBites  3 years ago

      Thank you 🙂

  • @ReDream
    @ReDream 3 years ago

    Very clear and concise! Great video!

    • @AIBites
      @AIBites  3 years ago

      Glad you enjoyed it!

  • @Zach-0414
    @Zach-0414 1 year ago

    Thank you for your great work. I would like to ask a question: is the distillation loss computed between the softmax of the teacher and the softmax of the student model, or between the softmax of the teacher and the ground truth? In the paper it seemed to be the former, but in your video presentation it appeared to be the latter. Could you please confirm? If I have misunderstood you, I am sorry.

    • @AIBites
      @AIBites  1 year ago +1

      Hey Zach, great question. The papers are always right, as they are the source of information for my videos. When we do reading groups at university, it's quite normal for people to disagree on something, since not everything is 100% clear from the paper. So I feel you are right on this occasion :)

    • @Zach-0414
      @Zach-0414 1 year ago

      @AIBites Thanks for your kind reply.
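
For reference on Zach's question, here is a rough Python sketch of the soft-distillation objective as the paper writes it (the default tau and lam values below are illustrative, not necessarily the paper's settings): the distillation term compares the teacher's and the student's temperature-softened softmax outputs, while the ground-truth labels only enter through the ordinary cross-entropy term. The paper's hard-label variant instead replaces the KL term with cross-entropy against the teacher's argmax prediction.

```python
import torch
import torch.nn.functional as F


def soft_distillation_loss(student_logits: torch.Tensor,
                           teacher_logits: torch.Tensor,
                           targets: torch.Tensor,
                           tau: float = 3.0,
                           lam: float = 0.5) -> torch.Tensor:
    # Ordinary cross-entropy: student predictions vs. ground-truth labels.
    ce = F.cross_entropy(student_logits, targets)
    # Distillation term: KL divergence between the temperature-softened
    # softmax of the STUDENT and that of the TEACHER (no ground truth here).
    kd = F.kl_div(
        F.log_softmax(student_logits / tau, dim=-1),
        F.softmax(teacher_logits / tau, dim=-1),
        reduction="batchmean",
    ) * (tau * tau)
    return (1.0 - lam) * ce + lam * kd
```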

  • @anshumansinha5874
    @anshumansinha5874 8 months ago +1

    I think you've got the temperature effect wrong. At higher temperatures the softmax over the logits becomes softer and more spread out.

    • @AIBites
      @AIBites  8 months ago

      Okay, maybe I realized it in hindsight! :)

    • @anshumansinha5874
      @anshumansinha5874 8 months ago

      @AIBites What does that mean?
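
A quick sanity check of the temperature point above, with arbitrary example logits: dividing the logits by a larger temperature before the softmax spreads the probability mass out, while a smaller temperature sharpens it.

```python
import torch

# Arbitrary example logits, only to illustrate the effect of temperature.
logits = torch.tensor([2.0, 1.0, 0.1])

for tau in (0.5, 1.0, 5.0):
    print(tau, torch.softmax(logits / tau, dim=-1))
# tau = 0.5 -> sharply peaked on the largest logit
# tau = 5.0 -> close to uniform: higher temperature softens the distribution
```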

  • @goldfishjy95
    @goldfishjy95 3 years ago

    This is incredible, thank you so much! subscribed!

  • @Fatima-kj9ws
    @Fatima-kj9ws 3 years ago

    Amazing, thank you very much

  • @yabezD
    @yabezD 2 months ago

    Where do you draw these kinds of charts? Could you tell me? It'll be helpful.

    • @AIBites
      @AIBites  29 days ago

      I think it was Keynote, if I recall correctly; it's been a long time since I made this video.

    • @yabezD
      @yabezD 29 days ago

      @AIBites Thanks for sharing. Any others that are Android-specific?

  • @egeres14
    @egeres14 3 years ago

    Love your videos 🥰

    • @AIBites
      @AIBites  3 years ago +1

      Thank you! Encouraging comments like yours keep me going :)

  • @mustafabuyuk6425
    @mustafabuyuk6425 3 years ago

    Good explanation.

    • @AIBites
      @AIBites  3 years ago

      Thank you Mustafa!