Usually, an image classification model is used at the beginning of object detection pipeline, and it’s called backbone. Most object detection pipelines use ResNet as backbone. I assume that you want to replace ResNet with Swin, just like what they did in the paper (section 4.2). If that’s the case, your best option is to use MMDetection library. They already included Swin as a backbone on their github: github.com/open-mmlab/mmdetection/blob/cfd5d3a985b0249de009b67d04f37263e11cdf3d/mmdet/models/backbones/swin.py
you said the 200 epochs test u ran is not a proper wxperiment to judge the quality of this transformer architecture. So other than increasing the c vaue back to 96, what other things should I look into to experiment and get the best performance out of this architecture
The settings I used in the video were simple just to have a taste of this transformer. In my opinion, a proper experiment would be replicating the results in the paper on ImageNet-1K dataset (the ones in table 1). This way we can judge the model, and then look for improvement.
@@mashaan14 im sorry. which table? (Also, big thanks your videos and replies have been a big help. however, any chance I can ask somewhere more convenient than yt comments?)
Table 1 on page 6 of swin transformer paper. You can text me on twitter or linkedin, whichever is convenient to you. twitter.com/mashaan_14 linkedin.com/in/mashaan
Hi sir
Im a student who is studying on it
i would like to use swin transformer on object detection from my project
how can i accomplish
thank you sir
Usually, an image classification model is used at the beginning of object detection pipeline, and it’s called backbone. Most object detection pipelines use ResNet as backbone.
I assume that you want to replace ResNet with Swin, just like what they did in the paper (section 4.2). If that’s the case, your best option is to use MMDetection library. They already included Swin as a backbone on their github:
github.com/open-mmlab/mmdetection/blob/cfd5d3a985b0249de009b67d04f37263e11cdf3d/mmdet/models/backbones/swin.py
is the notebook where you test the model with C = 48 and 200 epochs available somewhere, I would really like to check it out
here you go:
github.com/mashaan14/UA-cam-channel/blob/main/notebooks/2024_08_19_swin_transformer.ipynb
@@mashaan14 thankssss
you said the 200 epochs test u ran is not a proper wxperiment to judge the quality of this transformer architecture. So other than increasing the c vaue back to 96, what other things should I look into to experiment and get the best performance out of this architecture
The settings I used in the video were simple just to have a taste of this transformer. In my opinion, a proper experiment would be replicating the results in the paper on ImageNet-1K dataset (the ones in table 1). This way we can judge the model, and then look for improvement.
@@mashaan14 im sorry. which table?
(Also, big thanks your videos and replies have been a big help. however, any chance I can ask somewhere more convenient than yt comments?)
Table 1 on page 6 of swin transformer paper.
You can text me on twitter or linkedin, whichever is convenient to you.
twitter.com/mashaan_14
linkedin.com/in/mashaan
@@mashaan14 Okay thanks