Image Classification on a Custom Dataset Using FasterViT

  • Published 1 Oct 2024
  • Fast Vision Transformers with Hierarchical Attention
    Learn to perform image classification on a custom dataset using the FasterViT model.
    GitHub: github.com/Aar...
    Dataset added in GitHub repo: github.com/Aar...
    Email: aarohisingla1987@gmail.com
    FasterViT
    FasterViT is a fast vision transformer model developed by NVIDIA.
    FasterViT (Faster Vision Transformer) is a variant of the Vision Transformer (ViT) architecture, designed to address some of the performance and efficiency challenges associated with traditional transformer models in image classification tasks.
    Traditional Vision Transformers apply the transformer architecture, originally developed for natural language processing tasks, to image data. ViTs divide an image into patches, flatten them, and then process these patches as a sequence using a transformer model. While ViTs have shown promising results in image classification, they often require significant computational resources and have long inference times due to their complexity.
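The patch step described above can be sketched as follows. This is a minimal illustration, assuming a 224×224 RGB image and 16×16 patches; the function name `image_to_patch_sequence` is hypothetical and is not taken from the video's notebook or from the FasterViT code.

```python
import numpy as np


def image_to_patch_sequence(image, patch_size):
    """Split an (H, W, C) image into non-overlapping patches and flatten each one."""
    height, width, channels = image.shape
    if height % patch_size or width % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    rows, cols = height // patch_size, width // patch_size
    # View the image as a grid: (rows, patch, cols, patch, C).
    patches = image.reshape(rows, patch_size, cols, patch_size, channels)
    # Bring the two grid axes together: (rows, cols, patch, patch, C).
    patches = patches.transpose(0, 2, 1, 3, 4)
    # One row per patch, each flattened to patch_size * patch_size * C values.
    return patches.reshape(rows * cols, patch_size * patch_size * channels)


# Assumed input size for illustration only.
image = np.arange(224 * 224 * 3, dtype=np.float32).reshape(224, 224, 3)
sequence = image_to_patch_sequence(image, patch_size=16)
print(sequence.shape)  # (196, 768)
```

Each of the 196 rows is one flattened patch. A real ViT would then project every row to an embedding and add position information before the transformer layers.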
    FasterViT is designed to be more computationally efficient than standard ViTs. This is achieved through architectural changes that reduce the number of parameters and floating-point operations (FLOPs) required for inference.
    Sparse Attention Mechanisms: Incorporating sparse attention mechanisms can help reduce the computational load by focusing the model's attention on the most relevant parts of the input.
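To make the cost argument concrete, here is a rough sketch that counts the query-key pairs one attention layer has to score. It assumes window-restricted attention as one simple way of limiting attention. It is not FasterViT's actual mechanism, and the token and window sizes are illustrative assumptions.

```python
def attention_pair_count(num_tokens, window_size=None):
    """Count the query-key pairs scored by one attention layer.

    With window_size=None every token attends to every token.
    Otherwise tokens attend only within non-overlapping windows.
    """
    if window_size is None:
        return num_tokens * num_tokens
    if num_tokens % window_size:
        raise ValueError("num_tokens must be divisible by window_size")
    num_windows = num_tokens // window_size
    return num_windows * window_size * window_size


tokens = 196  # e.g. a 14 x 14 grid of patches (assumed)
full = attention_pair_count(tokens)
windowed = attention_pair_count(tokens, window_size=49)
print(full, windowed, full / windowed)  # 38416 9604 4.0
```

Restricting attention to four windows of 49 tokens cuts the number of scored pairs by a factor of four in this toy setting. The saving grows as the token count increases.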
    #computervision #transformers #nvidia #imageclassification

COMMENTS • 28

  • @jasonchen6623 3 months ago +1

    Can you compare it with yolov8-cls? Which classification model is better, FasterViT or yolov8-cls?
    Thanks.

  • @Alphaz-db8vc 2 months ago

    Is this good for a 1k custom dataset?

  • @abdelrahimkoura1461 3 months ago

    When I execute cell number 3 (load FasterViT), I get the error message "cannot import name '_update_default_kwargs' from 'timm.models._builder' (C:\anaconda3\Lib\site-packages\timm\models\_builder.py)". Why does this happen, and how can I solve it?

    • @fouziaanjums6475 3 months ago

      Same here

    • @CodeWithAarohi 3 months ago

      I have added the steps to the README file. Please follow them: github.com/AarohiSingla/FasterViT

    • @imrankhan-el2zp 1 month ago

      @CodeWithAarohi Same error after rewriting and installing FasterViT, etc.
      ImportError: cannot import name '_update_default_kwargs' from 'timm.models._builder' (/usr/local/lib/python3.10/dist-packages/timm/models/_builder.py)
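A small diagnostic sketch for the ImportError reported in this thread. It assumes the error means the installed timm release does not expose the private helper `_update_default_kwargs`. The functions `has_symbol` and `installed_version` are hypothetical helpers written for this illustration. The sketch only reports what is installed and does not replace the README steps.

```python
import importlib
from importlib import metadata


def has_symbol(module_name, symbol):
    """Return True if module_name imports cleanly and defines symbol."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers a missing package as well as a failing import inside it.
        return False
    return hasattr(module, symbol)


def installed_version(package):
    """Return the installed version string of package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("timm version:", installed_version("timm"))
print(
    "has _update_default_kwargs:",
    has_symbol("timm.models._builder", "_update_default_kwargs"),
)
```

If the second line prints False while a timm version is reported, the installed timm release likely does not provide the name that the import expects, so the timm version is the first thing to compare against the README steps.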

  • @Polly-7890 3 months ago

    Ma'am, thanks for your videos. Could you suggest how to extract titles from images, and also how to detect the language directly from an image?

  • @palurikrishnaveni8344 3 months ago

    Nice video, ma'am.
    Can you show the predicted results using explainable AI methods such as Grad-CAM, Grad-CAM++, or other heatmaps?

  • @aneerimmco 3 months ago

    Thank you, Ma'am. I learned something new.

  • @nakulmali1413 3 months ago

    Ma'am, thanks for all your videos. Please upload a video on object classification using YOLOv8 that runs model training, validation, and testing from a Python script. I am waiting for a video on this topic.

  • @sandeepnadipalli2004 3 months ago

    Please provide the custom dataset in your GitHub.

    • @CodeWithAarohi 3 months ago +1

      Done: github.com/AarohiSingla/FasterViT/tree/main/RockPaperScissorsDataset

  • @aiforeveryone 3 months ago

    Excellent

  • @arnavthakur5409 3 months ago

    Nicely explained video

  • @pifordtechnologiespvtltd5698 3 months ago

    Amazing

  • @396me 3 months ago

    Why does FasterViT exist?

    • @CodeWithAarohi 3 months ago +1

      FasterViT exists to make Vision Transformers faster, more efficient, and less resource-intensive. It addresses issues like high computational cost, long training time, and large memory usage, making these models more practical for real-world applications such as real-time image processing and running on devices with limited power.

    • @science.20246 3 months ago

      What do you think if we use HOG volume as features and we apply head detection on

  • @mehdismaeili3743 3 months ago

    Excellent.

  • @soravsingla8782 3 months ago

    Awesome