[Code] How to use Facebook's DETR object detection algorithm in Python (Full Tutorial)

  • Published 24 Dec 2024

COMMENTS • 87

  • @GIS-programming 4 years ago +1

    I seriously don't know why anyone would dislike this gem of a video. Thanks, Yannic, I spend most of my day on your channel, learning and regurgitating knowledge. much love from Nigeria!

  • @anjandash_ 4 years ago +7

    Wow! I just finished the GPT3 video, and just now saw this uploaded! Cool!

  • @jakevikoren 4 years ago +7

    Absolutely LOVE all your content and am so excited to see you playing with code. I would be stoked if you made more videos of implementing (or just forking and playing with) some of the research papers you explore (in PyTorch). Many thanks for all you contribute to this fascinating field

    • @YannicKilcher 4 years ago +4

      Yea the problem is the coding videos take much longer to make, but once I have an efficient process I can make more.

  • @slackstation 4 years ago +1

    Awesome. Gonna keep this bookmarked as a quick reference to Colab and the various python tools. Great work.

  • @MatsErikPistol 4 years ago +2

    I really like your videos. Your selected papers overlap very well with my selection. Thanks!

    • @YannicKilcher 4 years ago +6

      I know, I regularly read your diary while you sleep.

  • @manjit862 4 years ago +3

    Man, you're so humble, thanks for the content.

  • @vladimirblagojevic1950 4 years ago +1

    Man, these coding videos are joy to watch...and learn!

  • @jayvnathan6496 4 years ago +2

    That was really a fun video. Thanks

  • @Kram1032 4 years ago +1

    Honestly, predicting cake for so many things is pretty reasonable

  • @walkwithfeel9265 2 years ago

    This tutorial was very interesting and full of fun. Thank you

  • @kanaadpathak 4 years ago +7

    Can you please do a video on how we can train our own custom detector?

  • @snippletrap 4 years ago +1

    Shaga-boom, you now have a model

  • @linlinzhao9085 4 years ago +2

    In addition to the great content, I like the Vim keymap in notebooks as well :))

  • @anheuser-busch 4 years ago

    "Now we are going to use the requests library, always a pleasure" had me cracking up!

  • @KarolMajek 4 years ago +1

    Thanks, I just sirajed it and will show inference on my channel (video input)

  • @yuanjunchai9210 4 years ago +1

    Nice share! Your glasses are very cool!

  • @LaoZhao11 4 years ago +2

    first step. buy a cool sunglasses
    nice tutorial, thanks!!!

  • @christianleininger2954 4 years ago +1

    Great video! I (and maybe some others) would like you to take an RL paper and implement the algorithm; that would be cool.

  • @frosty9392 1 year ago

    for uber clear text readability on virtually any image:
    white fill + black outline (of the text)
    no clue if that is doable with what you were using, but wanted to point it out just in case
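
A minimal, library-agnostic sketch of the white-fill-plus-black-outline idea above. It assumes a hypothetical `draw_text(xy, text, fill)` callback that wraps whatever drawing API is in use; nothing here is taken from the video's code.

```python
from typing import Callable, List, Tuple

Point = Tuple[int, int]


def outline_offsets(width: int = 1) -> List[Point]:
    """Offsets of every pixel within `width` of the origin, excluding the origin."""
    return [
        (dx, dy)
        for dx in range(-width, width + 1)
        for dy in range(-width, width + 1)
        if (dx, dy) != (0, 0)
    ]


def draw_outlined_text(
    draw_text: Callable[[Point, str, str], None],
    xy: Point,
    text: str,
    width: int = 1,
) -> None:
    """Draw `text` in white with a black outline.

    `draw_text(xy, text, fill)` is a hypothetical callback, not a real
    library function; adapt it to the drawing API at hand.
    """
    x, y = xy
    # Black copies shifted in every direction form the outline.
    for dx, dy in outline_offsets(width):
        draw_text((x + dx, y + dy), text, "black")
    # The white copy is drawn last so it sits on top of the outline.
    draw_text((x, y), text, "white")
```

If the drawing is done with Pillow, recent versions of `ImageDraw.text` also accept `stroke_width` and `stroke_fill` arguments, which should give the same effect in a single call.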

  • @nark4837 7 months ago

    Hi Yannic, I do not understand the sizing of the images, is it a fixed size input, surely it is since it is a CNN? If not, can you explain how and why?

  • @khushpreetsandhu9874 4 years ago +4

    Very Informative, Thank you for the video !!! Can you suggest how do we perform custom object detection using this architecture ?

    • @YannicKilcher 4 years ago

      If by custom, you mean with your own classes, then you'd need to re-train the model with a custom dataset.

    • @khushpreetsandhu9874 4 years ago +1

      @YannicKilcher, I mean using my own custom dataset. Any suggestions on how we can do it using this architecture? Should we follow the general image classification approach or the object detection approach (where we need an annotations file)?

  • @anaghajoshi7856 4 years ago +4

    You put out great content! Can't get enough of these. Thanks!
    Although, when I input an image in your code, it always detects 15 objects regardless of how many are in the picture.
    Any idea why this might be happening?

  • @pradyumnareddy5415 4 years ago +1

    I love new gangsta Yannic

  • @waxwingvain 4 years ago +1

    This is how it's done, james bond

  • @shoebumx 4 years ago +2

    Yannic, you're a beast!

  • @Konstantin-qk6hv 3 years ago

    Wow, cool howto! You're good in python

  • @audrius0810 4 years ago +1

    Hey, awesome work on the explanation of the paper and now this! Just found your channel searching for exactly this and it is an instant favorite in terms of how relevant the topics you touch are and how intuitive the explanations.
    I have a question - do you know how transfer learning would work in terms of taking their pre-trained model and using a custom number of outputs for fewer classes? I've kind of succeeded by just editing the hard coded num_classes in their github, but that is for training from scratch and I don't have the capacity for that.
    Again, appreciate what you're doing!

    • @YannicKilcher 4 years ago +2

      I see your problem: the checkpoint weights aren't going to match in all places. There are multiple approaches, the easiest one is: load all weights that match, randomly initialize the ones that don't match. More sophisticated are things like model surgery.

    • @audrius0810 4 years ago

      Heh, funny how my mind had already steered towards the complicated one. Thanks for the advice!
      I'm doing this for work and we've actually already decided for a more plug-and-play solution to tune a 101 layer retinanet via detectron2. Guess I'll just secretly hope that it's too expensive runtime-wise and I'll get to revisit DETR.
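
The "load all weights that match, randomly initialize the ones that don't" approach from the reply above can be sketched as a small filter over a checkpoint dictionary. This is an illustrative sketch, not the DETR repository's code: the function name `filter_matching_weights` is made up, and the only assumption about the values is that they expose a `.shape` attribute, as PyTorch tensors do.

```python
from typing import Any, Dict, List, Tuple


def filter_matching_weights(
    checkpoint: Dict[str, Any], model_state: Dict[str, Any]
) -> Tuple[Dict[str, Any], List[str]]:
    """Split checkpoint entries into loadable weights and skipped keys.

    An entry is loadable when the model has a parameter with the same
    name and the same shape. Values only need a `.shape` attribute.
    """
    matched: Dict[str, Any] = {}
    skipped: List[str] = []
    for name, tensor in checkpoint.items():
        target = model_state.get(name)
        if target is not None and tuple(target.shape) == tuple(tensor.shape):
            matched[name] = tensor
        else:
            # Missing in the model or differently shaped (e.g. a class
            # head sized for a different number of classes).
            skipped.append(name)
    return matched, skipped
```

With PyTorch, the result would typically be used as `matched, skipped = filter_matching_weights(checkpoint_state, model.state_dict())` followed by `model.load_state_dict(matched, strict=False)`, so every skipped parameter keeps its fresh initialization. The shape check matters because `strict=False` tolerates missing keys but still raises on size mismatches.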

  • @raghebalghezi9532 4 years ago

    Great content as usual! Thank you so much. Keep up the good work.

  • @patrickjdarrow 4 years ago +1

    The paper said segmentation was easily accomplished, right? That'd make an awesome complementary vid.

  • @hojoonsong3761 4 years ago +2

    19:05 The white text simply didn't show up because you didn't load the cell after formatting the code LOL

  • @patilvikii 4 years ago +1

    We call it sirajing! Epic

  • @soumyajitpodder3516 4 years ago +2

    Can you make a video on how to re-train the model to detect new classes using a custom dataset ?

    • @YannicKilcher 4 years ago

      That's going to be a bit longer, but in essence, it should be easy and they provide the code to train it

  • @samm9840 4 years ago +1

    Thanks for the video. Any experience with using it for real-time detection?

  • @quangainguyen6209 4 years ago +1

    Can I use this pretrained model to make a model predicting custom objects? and if then, how?

    • @YannicKilcher 4 years ago +1

      You'd have to train it yourself

    • @quangainguyen6209 4 years ago

      @YannicKilcher Thank you. Then how should I prepare my dataset? I want to predict masked vs non-masked faces.

  • @ahmedmansour5591 4 years ago +1

    Nice tutorial, but I need to know how to suppress multiple bounding boxes for the same object.

    • @YannicKilcher 4 years ago

      With this method that shouldn't be a problem

    • @ahmedmansour5591 4 years ago

      @YannicKilcher I think non-maximum suppression should be used

    • @ahmedmansour5591 4 years ago

      I tried it with many images, and multiple bounding boxes appear
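
The thread leaves open whether duplicate boxes should occur with DETR at all. For reference, here is a minimal pure-Python sketch of the greedy non-maximum suppression the commenter suggests. It is a generic post-processing step, not part of the code shown in the video, and it assumes boxes in `(x_min, y_min, x_max, y_max)` format and ignores class labels.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(
    boxes: Sequence[Box], scores: Sequence[float], iou_threshold: float = 0.5
) -> List[int]:
    """Return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any
        # higher-scoring box that was already kept.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

For tensors, `torchvision.ops.nms(boxes, scores, iou_threshold)` provides a library implementation of the same greedy procedure.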

  • @fatihbaltac1482 4 years ago +1

    Very good !

  • @1907hasancan 1 year ago

    I have a dataset and I want to train on it with DETR. Can I use this video to train on my dataset? What is your opinion?

  • @Ruhgtfo 3 years ago

    Whoa, this is really powerful

  • @zhehuang3849 4 years ago

    Thanks for the awesome video. Can you make more of this kind of coding walk-through video? I'd like to see some on contrastive learning.

  • @chrisngandimoun5630 4 years ago +1

    Hello, can we use that for custom object detection, please?

    • @YannicKilcher 4 years ago

      you can, but you'll have to train it yourself

  • @nicolasugrinovic2886 4 years ago +1

    Do you know anything about how to speed up training and/or fine-tuning with few GPUs?

    • @YannicKilcher 4 years ago

      Not really. I guess first thing to do is plumb until your usage is at 100%

  • @JohnXu66 3 years ago

    😂 I love your video and it is very fun.

  • @akshayisalkar2571 4 years ago +1

    Could you please share Wildlife animal IR/thermal dataset for my study?

    • @YannicKilcher 4 years ago

      I just typed Elephants into google images :D

    • @jakevikoren 4 years ago

      Here is a large collection of camera trap datasets. Not sure if any are IR/thermal but perhaps helpful!
      lila.science/datasets

  • @sparknews2218 2 years ago

    How do we train this on our own data?

  • @Hoxle-87 3 years ago

    Great video, thanks

  • @anantakusumap2459 4 years ago +1

    Can I fine-tune from torch hub?

    • @YannicKilcher 4 years ago

      Yes, I'm pretty sure you can. What you import is just a regular torch.nn.Module.

  • @ckalas 4 years ago

    How does this relate to the detectron2 framework from FAIR? It seems like they will at some point add DETR to it, right?

  • @deepak010 4 years ago

    Great .... Thanks for video ..
    Very helpful 👍

  • @hk2780 4 years ago +1

    so cool channel.

  • @ZobeirRaisi 4 years ago

    Great, Thanks

  • @dippatel1739 4 years ago

    I legitimately thought that you would implement it from scratch, just like the last video on PyTorch Lightning. But anyway, this is also good content.

    • @YannicKilcher 4 years ago +1

      Sorry about that. In my defense, the title says "how to use ..." :D

    • @dippatel1739 4 years ago

      @YannicKilcher Yeah. Great work, and thanks for the content.

  • @sonOfLiberty100 4 years ago

    Love you :)

  • @soumyadrip 4 years ago

    🔥🔥🔥

  • @0兒-y4c 2 months ago

    you are god

  • @crimythebold 4 years ago +1

    sirajing :)

  • @moshehoshen4411 4 years ago

    Hi. Fun and educating at the same time. Any idea about training my own model? Has anybody around done it?

    • @YannicKilcher 4 years ago

      No, but the code is there for you to do it

  • @prasannautube1 4 years ago +1

    Thank you for this wonderful demo Yannic Kilcher. I have learned a lot from this video. Thank you so much again.
    I would like to share something back with others who want to play with this model. I have created a simple util class, based on the code written by Yannic, for anyone to play with this model.
    Gist: gist.github.com/balaprasanna/c6b6e5dba63a338c53a211baad48cb19
    Example usage:
    imageurl = "5.imimg.com/data5/GM/EM/MY-38731446/selection_143-500x500.png"
    model = DETRModel(imageurl)
    model.detect()

  • @hk2780 4 years ago +1

    Could I request a paper and live-coding review? I'd like to know about a semantic segmentation paper from Facebook research. The title is PointRend: github.com/facebookresearch/detectron2/tree/master/projects/PointRend . It also provides a Google Colab example. Also, are you interested in any papers from CVPR 2020?

    • @YannicKilcher 4 years ago

      I don't look much at conference proceedings if I don't attend personally, since that research is usually already half a year out of date.

    • @hk2780 4 years ago

      @YannicKilcher OK, that makes sense. Nowadays there are tons of papers. How about next week's CVPR papers? Are you interested in any of them?