[Code] How to use Facebook's DETR object detection algorithm in Python (Full Tutorial)

  • Published May 28, 2024
  • Watch me as I struggle my way up the glorious path of using the DETR object detection model in PyTorch.
    Original Video on DETR: • DETR: End-to-End Objec...
    Their GitHub repo: github.com/facebookresearch/detr
    My Colab: colab.research.google.com/dri...
    OUTLINE:
    0:00 - Intro
    0:45 - TorchHub Model
    2:00 - Getting an Image
    6:00 - Image to PyTorch Tensor
    7:50 - Handling Model Output
    15:00 - Draw Bounding Boxes
    20:10 - The Dress
    22:00 - Rorschach Ink Blots
    23:00 - Forcing More Predictions
    28:30 - Jackson Pollock Images
    32:00 - Elephant Herds
    Links:
    YouTube: / yannickilcher
    Twitter: / ykilcher
    BitChute: www.bitchute.com/channel/yann...
    Minds: www.minds.com/ykilcher
  • Science & Technology
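
The "Handling Model Output" and "Draw Bounding Boxes" steps in the outline come down to turning DETR's raw outputs into labelled pixel-space boxes. The following is a minimal pure-Python sketch of that idea, not the code from the video or the Colab. It assumes the usual DETR output layout: one row of class logits per query whose last entry is the "no object" class, and one box per query as normalized (cx, cy, w, h). The function names and the 0.7 threshold are illustrative choices.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cxcywh_to_xyxy(box, img_w, img_h):
    """Convert a normalized (cx, cy, w, h) box to pixel (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return (
        (cx - w / 2) * img_w,
        (cy - h / 2) * img_h,
        (cx + w / 2) * img_w,
        (cy + h / 2) * img_h,
    )

def postprocess(pred_logits, pred_boxes, img_w, img_h, threshold=0.7):
    """Keep queries whose best real class scores at least `threshold`.

    Assumption: the last logit of each query is the "no object" class,
    so it is dropped before picking the best label.
    """
    detections = []
    for logits, box in zip(pred_logits, pred_boxes):
        probs = softmax(logits)[:-1]  # drop the trailing "no object" entry
        score = max(probs)
        label = probs.index(score)
        if score >= threshold:
            detections.append((label, score, cxcywh_to_xyxy(box, img_w, img_h)))
    return detections
```

With a real model, `pred_logits` and `pred_boxes` would be the per-image rows of `outputs["pred_logits"]` and `outputs["pred_boxes"]`, converted to lists, and the resulting pixel boxes can be passed to whatever drawing routine is in use.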

COMMENTS • 86

  • @anjandash_ 4 years ago +6

    Wow! I just finished the GPT3 video, and just now saw this uploaded! Cool!

  • @slackstation 3 years ago +1

    Awesome. Gonna keep this bookmarked as a quick reference to Colab and the various python tools. Great work.

  • @jakevikoren 4 years ago +7

    Absolutely LOVE all your content and am so excited to see you playing with code. I would be stoked if you made more videos of implementing (or just forking and playing with) some of the research papers you explore (in PyTorch). Many thanks for all you contribute to this fascinating field

    • @YannicKilcher 4 years ago +4

      Yea the problem is the coding videos take much longer to make, but once I have an efficient process I can make more.

  • @vladimirblagojevic1950 4 years ago +1

    Man, these coding videos are a joy to watch... and learn from!

  • @manjit862 3 years ago +3

    Man, you're so humble. Thanks for the content.

  • @MatsErikPistol 4 years ago +2

    I really like your videos. Your selected papers overlap very well with my selection. Thanks!

    • @YannicKilcher 4 years ago +6

      I know, I regularly read your diary while you sleep.

  • @victorirekponor6858 4 years ago

    I seriously don't know why anyone would dislike this gem of a video. Thanks, Yannic. I spend most of my day on your channel, learning and regurgitating knowledge. Much love from Nigeria!

  • @raghebalghezi9532 4 years ago

    Great content as usual! Thank you so much. Keep up the good work.

  • @walkwithfeel9265 1 year ago

    This tutorial was very interesting and full of fun. Thank you

  • @christianleininger2954 4 years ago +1

    Great video! I (and maybe some others) would like it if you took an RL paper and implemented the algorithm. That would be cool.

  • @jayvnathan6496 3 years ago +2

    That was really a fun video. Thanks

  • @KarolMajek 4 years ago +1

    Thanks, I just sirajed it and will show inference on my channel (video input)

  • @anaghajoshi7856 3 years ago +4

    You put out great content! Can't get enough of these. Thanks!
    Although, when I input an image in your code, it always detects 15 objects regardless of how many are in the picture.
    Any idea why this might be happening?
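
One possible explanation, offered as a guess rather than a diagnosis of the notebook: the released DETR models emit a fixed set of 100 query slots per image, so if the code keeps a fixed top-k of them (as in the "Forcing More Predictions" part of the outline), the number of drawn boxes equals k no matter what the image contains. Selecting by a confidence threshold instead lets the count vary with the image. The sketch below contrasts the two selection rules on plain Python lists; the function names and the example scores are made up for illustration.

```python
def select_top_k(scores, k):
    """Indices of the k highest scores: always returns k items (if available)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]

def select_above(scores, threshold):
    """Indices of scores at or above the threshold: the count depends on the data."""
    return [i for i, s in enumerate(scores) if s >= threshold]

# Illustrative per-query confidence scores (not real model output).
scores = [0.95, 0.10, 0.88, 0.05, 0.02]
print(select_top_k(scores, 3))     # three indices, even though one score is low
print(select_above(scores, 0.7))   # only the confident queries
```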

  • @onlyfeelingsss 4 years ago +2

    Yannic, you're a beast!

  • @linlinzhao9085 4 years ago +2

    In addition to the great content, I like the Vim keymap in notebooks as well :))

  • @zhehuang3849 3 years ago

    Thanks for the awesome video. Can you make more coding walk-through videos like this? I'd love to see some on contrastive learning.

  • @samm9840 3 years ago +1

    Thanks for the video. Any experience with using it for real-time detection?

  • @deepak010 4 years ago

    Great... Thanks for the video.
    Very helpful 👍

  • @kanaadpathak 4 years ago +7

    Can you please do a video on how we can train our own custom detector?

  • @yuanjunchai9210 3 years ago +1

    Nice share! Your glasses are very cool!

  • @anheuser-busch 4 years ago

    "Now we are going to use the requests library, always a pleasure" had me cracking up!

  • @audrius0810 4 years ago +1

    Hey, awesome work on the explanation of the paper and now this! Just found your channel searching for exactly this and it is an instant favorite in terms of how relevant the topics you touch are and how intuitive the explanations.
    I have a question - do you know how transfer learning would work in terms of taking their pre-trained model and using a custom number of outputs for fewer classes? I've kind of succeeded by just editing the hard coded num_classes in their github, but that is for training from scratch and I don't have the capacity for that.
    Again, appreciate what you're doing!

    • @YannicKilcher 4 years ago +2

      I see your problem: the checkpoint weights aren't going to match in all places. There are multiple approaches, the easiest one is: load all weights that match, randomly initialize the ones that don't match. More sophisticated are things like model surgery.

    • @audrius0810 4 years ago

      Heh, funny how my mind had already steered towards the complicated one. Thanks for the advice!
      I'm doing this for work and we've actually already decided for a more plug-and-play solution to tune a 101 layer retinanet via detectron2. Guess I'll just secretly hope that it's too expensive runtime-wise and I'll get to revisit DETR.
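
The "load all weights that match, randomly initialize the ones that don't" approach from the reply above can be sketched as a small filter over the checkpoint. This is only an illustration under stated assumptions: it works on any mapping whose values expose a `.shape`, the `FakeTensor` stand-in exists purely so the sketch runs without PyTorch, and the parameter names and shapes in the demo are illustrative rather than copied from a real checkpoint.

```python
from collections import namedtuple

def filter_matching_weights(checkpoint_state, model_state):
    """Split checkpoint entries into those that fit the model and those that don't.

    An entry is kept only if the model has a parameter with the same name
    and the same shape; everything else is reported as skipped.
    """
    kept, skipped = {}, []
    for name, tensor in checkpoint_state.items():
        target = model_state.get(name)
        if target is not None and tuple(tensor.shape) == tuple(target.shape):
            kept[name] = tensor
        else:
            skipped.append(name)
    return kept, skipped

# Stand-in for a tensor so this sketch runs without PyTorch (hypothetical helper).
FakeTensor = namedtuple("FakeTensor", "shape")

# Illustrative names and shapes: the classification head differs because the
# new model was built with fewer classes than the pretrained one.
checkpoint = {
    "backbone.weight": FakeTensor((64, 3)),
    "class_embed.weight": FakeTensor((92, 256)),
}
model_state = {
    "backbone.weight": FakeTensor((64, 3)),
    "class_embed.weight": FakeTensor((5, 256)),
}
kept, skipped = filter_matching_weights(checkpoint, model_state)
print("loaded:", list(kept), "left at random init:", skipped)

# With PyTorch, the kept entries would then be loaded non-strictly, e.g.
#   model.load_state_dict(kept, strict=False)
# so that the skipped parameters keep their fresh random initialization.
```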

  • @Konstantin-qk6hv 2 years ago

    Wow, cool how-to! You're good at Python

  • @fatihbaltac1482 4 years ago +1

    Very good !

  • @Hoxle-87 3 years ago

    Great video, thanks

  • @patrickjdarrow 4 years ago +1

    The paper said segmentation was easily accomplished, right? That'd make an awesome complementary vid.

  • @Kram1032 3 years ago +1

    Honestly, predicting cake for so many things is pretty reasonable

  • @Ruhgtfo 3 years ago

    Whoa, this is really powerful

  • @JohnXu66 3 years ago

    😂 I love your video and it is very fun.

  • @khushpreetsandhu9874 4 years ago +4

    Very informative, thank you for the video! Can you suggest how to perform custom object detection using this architecture?

    • @YannicKilcher 4 years ago

      If by custom, you mean with your own classes, then you'd need to re-train the model with a custom dataset.

    • @khushpreetsandhu9874 4 years ago +1

      @@YannicKilcher, I mean using my own custom dataset. Any suggestions on how we can do that with this architecture? Should we follow a general image classification approach or an object detection approach (where we need an annotations file)?

  • @quebono100 4 years ago

    Love you :)

  • @hk2780 4 years ago +1

    Such a cool channel.

  • @ZobeirRaisi 4 years ago

    Great, Thanks

  • @patilvikii 4 years ago +1

    We call it sirajing! Epic

  • @LaoZhao11 4 years ago +2

    First step: buy cool sunglasses.
    Nice tutorial, thanks!!!

  • @snippletrap 3 years ago +1

    Shaga-boom, you now have a model

  • @frosty9392 1 year ago

    for uber clear text readability on virtually any image:
    white fill + black outline (of the text)
    no clue if that is doable with what you were using, but wanted to point it out just in case

  • @pradyumnareddy5415 4 years ago +1

    I love new gangsta Yannic

  • @waxwingvain 4 years ago +1

    This is how it's done, james bond

  • @ckalas 4 years ago

    How does this relate to the detectron2 framework from FAIR? It seems like they will add DETR to it at some point, right?

  • @soumyadrip 4 years ago

    🔥🔥🔥

  • @soumyajitpodder3516 4 years ago +2

    Can you make a video on how to re-train the model to detect new classes using a custom dataset ?

    • @YannicKilcher 4 years ago

      That's going to be a bit longer, but in essence, it should be easy and they provide the code to train it

  • @dippatel1739 4 years ago

    I legitimately thought you would implement it from scratch, just like the last video on PyTorch Lightning. But anyway, this is also good content.

    • @YannicKilcher 4 years ago +1

      Sorry about that. In my defense, the title says "how to use ..." :D

    • @dippatel1739 4 years ago

      @@YannicKilcher Yeah. Great work, and thanks for the content.

  • @nicolasugrinovic2886 3 years ago +1

    Do you know anything about how to speed up training and/or fine-tuning with few GPUs?

    • @YannicKilcher 3 years ago

      Not really. I guess first thing to do is plumb until your usage is at 100%

  • @1907hasancan 1 year ago

    I have a dataset and I want to train DETR on it. Can I use this video to train on my dataset? What is your opinion?

  • @nark4837 12 days ago

    Hi Yannic, I do not understand the sizing of the images. Is it a fixed-size input? Surely it is, since it is a CNN? If not, can you explain how and why?

  • @crimythebold 4 years ago +1

    sirajing :)

  • @quangainguyen6209 4 years ago +1

    Can I use this pretrained model to make a model predicting custom objects? and if then, how?

    • @YannicKilcher 4 years ago +1

      You'd have to train it yourself

    • @quangainguyen6209 3 years ago

      @@YannicKilcher Thank you. Then how should I prepare my dataset? I want to predict masked vs. non-masked faces.

  • @ahmedmansour5591 4 years ago +1

    Nice tutorial, but I need to know how to suppress multiple bounding boxes for the same object.

    • @YannicKilcher 4 years ago

      With this method that shouldn't be a problem

    • @ahmedmansour5591 4 years ago

      @@YannicKilcher I think non maximum suppression should be used

    • @ahmedmansour5591 4 years ago

      I tried it with many images, and multiple bounding boxes appear
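
For context on the exchange above: DETR is trained with a set-based matching loss that is meant to make duplicate suppression unnecessary, which is presumably what the reply refers to. If overlapping boxes still show up in practice (for example after lowering the confidence threshold), generic greedy non-maximum suppression is one fallback. The sketch below is a plain-Python, class-agnostic version on (x1, y1, x2, y2) pixel boxes; it is not part of the video's code, and the 0.5 IoU threshold is an arbitrary illustrative default.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes from highest to lowest score and keep a box
    only if it does not overlap an already kept box by more than the threshold.
    Returns the indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

Applying this per class rather than across all classes would avoid suppressing two different objects that happen to overlap heavily.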

  • @chrisngandimoun5630 4 years ago +1

    Hello, can we use this for custom object detection, please?

    • @YannicKilcher 4 years ago

      you can, but you'll have to train it yourself

  • @akshayisalkar2571 4 years ago +1

    Could you please share Wildlife animal IR/thermal dataset for my study?

    • @YannicKilcher 4 years ago

      I just typed Elephants into google images :D

    • @jakevikoren 4 years ago

      Here is a large collection of camera trap datasets. Not sure if any are IR/thermal but perhaps helpful!
      lila.science/datasets

  • @anantakusumap2459 4 years ago +1

    Can I fine-tune the model from Torch Hub?

    • @YannicKilcher 4 years ago

      Yes, I'm pretty sure you can. What you import is just a regular torch.nn.Module.

  • @hojoonsong3761 3 years ago +2

    19:05 The white text simply didn't show up because you didn't load the cell after formatting the code LOL

  • @prasannautube1 4 years ago +1

    Thank you for this wonderful demo, Yannic Kilcher. I have learned a lot from this video. Thank you so much again.
    I would like to share something back with others who want to play with this model. I have created a simple util class, based on the code written by Yannic, for anyone who wants to play with this model.
    Gist: gist.github.com/balaprasanna/c6b6e5dba63a338c53a211baad48cb19
    Example usage:
    imageurl = "5.imimg.com/data5/GM/EM/MY-38731446/selection_143-500x500.png"
    model = DETRModel(imageurl)
    model.detect()

  • @sparknews2218 1 year ago

    How do we train this on our own data?

  • @moshehoshen4411 4 years ago

    Hi. Fun and educational at the same time. Any idea about training my own model? Has anybody around done it?

    • @YannicKilcher 4 years ago

      No, but the code is there for you to do it

  • @hk2780 4 years ago +1

    Could I request a paper and live-coding review? I'd like to learn about a semantic segmentation paper from Facebook Research called PointRend: github.com/facebookresearch/detectron2/tree/master/projects/PointRend . It also provides a Google Colab example. Also, are you interested in any papers from CVPR 2020?

    • @YannicKilcher 4 years ago

      I don't look much at conference proceedings if I don't attend personally, since that research is usually already half a year out of date.

    • @hk2780 4 years ago

      @@YannicKilcher OK, that makes sense. Nowadays there are tons of papers. How about next week's CVPR papers? Are you interested in any of them?