Swin Transformer - Paper Explained

  • Published 12 Jun 2024
  • Brief explanation of swin transformer paper.
    Paper link: arxiv.org/abs/2103.14030
    Table of Contents:
    00:00 Intro
    00:13 Patch Embedding
    02:56 Swin transformer block
    03:57 W-MSA
    05:14 SW-MSA
    08:56 Masked MSA implementation
    14:58 Patch Merging
    16:12 Stages
    18:28 Image classification result
    19:12 Relative position bias
    Icon made by Freepik from flaticon.com

COMMENTS • 23

  • @VedantJoshi-mr2us 3 days ago +1

    By far one of the best and most complete Swin Transformer explanations on the entire Internet.

  • @SizzleSan 11 months ago +1

    Thorough! Very comprehensible, thank you.

  • @failuredocumentary 1 year ago +2

    Really informative; it helped me a lot to understand many concepts here. Keep up the good work!

  • @antonioperezvelasco3297 7 months ago

    Thanks for the good explanation!

  • @omarabubakr6408 10 months ago

    That's the most illustrative video on Swin Transformers on the Internet!

    • @soroushmehraban 10 months ago

      Glad you enjoyed it 😃

    • @omarabubakr6408 10 months ago

      @soroushmehraban Yes, absolutely, thanks so much! Although I have a quick question more related to PyTorch: at 12:49, in line 239 of the code, (1) what does the -1 mean and what exactly does it do to the tensor, and (2) where does the [4, 16] come from? The 4 isn't mentioned in the reshaping. Thanks in advance.
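      The `-1` part of the question is standard PyTorch/NumPy reshape semantics (the `[4, 16]` depends on the video's specific code, which isn't shown here). A minimal sketch, using NumPy in place of PyTorch, of how `-1` is inferred:

```python
import numpy as np

# -1 tells reshape to infer that one dimension so the total
# element count stays the same.
t = np.arange(64)        # 64 elements in a flat array
a = t.reshape(4, -1)     # the -1 is inferred as 64 / 4 = 16
print(a.shape)           # (4, 16)
```

      PyTorch's `tensor.reshape(4, -1)` and `tensor.view(4, -1)` behave the same way: at most one dimension may be -1, and it is computed from the remaining sizes.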

  • @rohollahhosseyni8564 9 months ago

    Very well explained, thank you Soroush.

  • @siarez 1 year ago

    Great video! Thanks

  • @akbarmehraban5007 1 year ago

    I enjoyed it very much.

  • @proteus333 7 months ago

    Amazing video!

  • @kundankumarmandal6804 5 months ago

    You deserve more likes and subscribers

  • @user-sw4hm4hh6h 10 months ago

    Perfect description.

  • @EngineerXYZ. 5 months ago

    Why does the channel count increase from C to 4C after merging?

    • @soroushmehraban 5 months ago +1

      Because we downsample the width by 2 and the height by 2. That means a 4x downsampling in spatial resolution, which we transfer to the channel dimension. It's just a simple tensor reshaping, so the total element count is preserved.
      For example, a 10x10x2 tensor has 200 elements.
      After merging it's 5x5x8, which is still 200 elements.
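
      The reshaping described above can be sketched as follows (a hypothetical NumPy toy example mirroring the 10x10x2 case, not the video's code; note that Swin's actual patch merging follows this concatenation with a linear layer mapping 4C down to 2C):

```python
import numpy as np

# Toy feature map: H=10, W=10, C=2 (200 elements total).
x = np.arange(10 * 10 * 2).reshape(10, 10, 2)
H, W, C = x.shape

# Split H and W into 2x2 neighborhoods: (H/2, 2, W/2, 2, C).
merged = x.reshape(H // 2, 2, W // 2, 2, C)

# Move the two neighborhood axes next to the channels,
# then flatten them into the channel dimension: C -> 4C.
merged = merged.transpose(0, 2, 1, 3, 4).reshape(H // 2, W // 2, 4 * C)

print(merged.shape)               # (5, 5, 8)
print(x.size == merged.size)      # True: 200 elements before and after
```

      Spatial resolution drops 4x (10x10 to 5x5) while channels grow 4x (2 to 8), so no information is discarded by the reshape itself.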