[Code] How to use Facebook's DETR object detection algorithm in Python (Full Tutorial)
- Published 28 May 2024
- Watch me as I struggle my way up the glorious path of using the DETR object detection model in PyTorch.
Original Video on DETR: • DETR: End-to-End Objec...
Their GitHub repo: github.com/facebookresearch/detr
My Colab: colab.research.google.com/dri...
OUTLINE:
0:00 - Intro
0:45 - TorchHub Model
2:00 - Getting an Image
6:00 - Image to PyTorch Tensor
7:50 - Handling Model Output
15:00 - Draw Bounding Boxes
20:10 - The Dress
22:00 - Rorschach Ink Blots
23:00 - Forcing More Predictions
28:30 - Jackson Pollock Images
32:00 - Elephant Herds
Links:
YouTube: / yannickilcher
Twitter: / ykilcher
BitChute: www.bitchute.com/channel/yann...
Minds: www.minds.com/ykilcher - Science and Technology
Wow! I just finished the GPT3 video, and just now saw this uploaded! Cool!
Awesome. Gonna keep this bookmarked as a quick reference to Colab and the various python tools. Great work.
Absolutely LOVE all your content and am so excited to see you playing with code. I would be stoked if you made more videos of implementing (or just forking and playing with) some of the research papers you explore (in PyTorch). Many thanks for all you contribute to this fascinating field
Yea the problem is the coding videos take much longer to make, but once I have an efficient process I can make more.
Man, these coding videos are joy to watch...and learn!
Man, you're so humble. Thanks for the content.
I really like your videos. Your selected papers overlap very well with my selection. Thanks!
I know, I regularly read your diary while you sleep.
I seriously don't know why anyone would dislike this gem of a video. Thanks, Yannic, I spend most of my day on your channel, learning and regurgitating knowledge. much love from Nigeria!
haters gonna hate :D
Great content as usual! Thank you so much. Keep up the good work.
This tutorial was very interesting and full of fun. Thank you!
Great video! I (and maybe some others) would like it if you took an RL paper and implemented the algorithm. That would be cool.
That was really a fun video. Thanks
Thanks, I just sirajed it and will show inference on my channel (video input)
You put out great content! Can't get enough of these. Thanks!
Although, when I input an image in your code, it always detects 15 objects regardless of how many are in the picture.
Any idea why this might be happening?
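One possible explanation, not confirmed in this thread: DETR emits a fixed number of candidate predictions per image, each scored against the real classes plus a "no object" class. If the notebook keeps a fixed number of top-scoring candidates, the count never changes with the image. A minimal sketch of filtering by a confidence threshold instead, in plain Python so it runs without PyTorch; the function name `keep_confident`, the 0.7 threshold, and the toy numbers are all assumptions for illustration:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def keep_confident(pred_logits, pred_boxes, threshold=0.7):
    """Keep only candidates whose best real-class probability exceeds `threshold`.

    Assumes the LAST entry of each logit row is the "no object" class.
    Returns a list of (class_index, score, box) tuples.
    """
    kept = []
    for logits, box in zip(pred_logits, pred_boxes):
        probs = softmax(logits)[:-1]  # drop the trailing "no object" class
        score = max(probs)
        if score > threshold:
            kept.append((probs.index(score), score, box))
    return kept

# Toy data: 3 candidates, 2 real classes + "no object" (hypothetical values).
logits = [[4.0, 0.0, 0.0], [0.0, 0.0, 5.0], [0.2, 0.1, 0.0]]
boxes = [(0.5, 0.5, 0.2, 0.2), (0.1, 0.1, 0.1, 0.1), (0.3, 0.3, 0.4, 0.4)]
detections = keep_confident(logits, boxes)
for class_index, score, box in detections:
    print(class_index, round(score, 3), box)
```

With a threshold, the number of reported objects depends on the image rather than on a fixed top-k setting.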
Yannic, you're a beast!
In addition to the great content, I like the Vim keymap in notebooks as well :))
Thanks for the awesome video. Can you make more coding walk-through videos like this? I'd love to see one on contrastive learning.
Thanks for the video. Any experience with using it for real-time detection?
Great .... Thanks for video ..
Very helpful 👍
Can you please do a video on how we can train our own custom detector?
Nice share! Your glasses are very cool!
"Now we are going to use the requests library, always a pleasure" had me cracking up!
Hey, awesome work on the explanation of the paper and now this! Just found your channel searching for exactly this and it is an instant favorite in terms of how relevant the topics you touch are and how intuitive the explanations.
I have a question: do you know how transfer learning would work in terms of taking their pre-trained model and using a custom number of outputs for fewer classes? I've kind of succeeded by just editing the hard-coded num_classes in their GitHub repo, but that is for training from scratch, and I don't have the capacity for that.
Again, appreciate what you're doing!
I see your problem: the checkpoint weights aren't going to match in all places. There are multiple approaches, the easiest one is: load all weights that match, randomly initialize the ones that don't match. More sophisticated are things like model surgery.
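The "load all weights that match" approach can be sketched in PyTorch roughly as follows. This is a minimal illustration under stated assumptions, not the DETR repository's own code: `load_matching_weights` and `make_model` are hypothetical helpers, and the two small `nn.Sequential` networks stand in for a pretrained checkpoint and a model with a smaller classification head.

```python
import torch
from torch import nn

def load_matching_weights(model, checkpoint_state):
    """Copy checkpoint tensors whose name AND shape match the model.

    Everything else keeps the model's fresh (random) initialization.
    Returns (loaded_names, skipped_names).
    """
    model_state = model.state_dict()
    matching = {
        name: tensor
        for name, tensor in checkpoint_state.items()
        if name in model_state and model_state[name].shape == tensor.shape
    }
    skipped = sorted(set(checkpoint_state) - set(matching))
    # strict=False tolerates the keys that are absent from `matching`.
    model.load_state_dict(matching, strict=False)
    return sorted(matching), skipped

def make_model(num_classes):
    # Hypothetical stand-in: a shared "body" plus a class-dependent "head".
    return nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, num_classes))

pretrained = make_model(num_classes=10)  # plays the role of the checkpoint
custom = make_model(num_classes=3)       # fewer classes, so the head differs

loaded, skipped = load_matching_weights(custom, pretrained.state_dict())
print("loaded:", loaded)
print("randomly initialized:", skipped)
```

The skipped layers then need to be trained on the new dataset, since only the shape-compatible layers carry over pretrained values.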
Heh, funny how my mind had already steered towards the complicated one. Thanks for the advice!
I'm doing this for work and we've actually already decided for a more plug-and-play solution to tune a 101 layer retinanet via detectron2. Guess I'll just secretly hope that it's too expensive runtime-wise and I'll get to revisit DETR.
Wow, cool how-to! You're good at Python.
Very good !
Great video, thanks
The paper said segmentation was easily accomplished, right? That'd make an awesome complementary vid.
Honestly, predicting cake for so many things is pretty reasonable
Whoa, this is really powerful.
😂 I love your video and it is very fun.
Very informative, thank you for the video! Can you suggest how to perform custom object detection using this architecture?
If by custom, you mean with your own classes, then you'd need to re-train the model with a custom dataset.
@YannicKilcher, I mean using my own custom dataset. Any suggestions on how to do that with this architecture? Should we follow the general image classification approach or the object detection approach (where we need an annotations file)?
Love you :)
so cool channel.
Great, Thanks
We call it sirajing! Epic
First step: buy cool sunglasses.
nice tutorial, thanks!!!
Shaga-boom, you now have a model
for uber clear text readability on virtually any image:
white fill + black outline (of the text)
no clue if that is doable with what you were using, but wanted to point it out just in case
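If the drawing is done with Pillow (an assumption; the thread does not say which library the notebook uses), `ImageDraw.text` accepts `stroke_width` and `stroke_fill`, which gives white text with a black outline. A minimal sketch on a blank gray canvas; the label text and coordinates are made up, and depending on the Pillow version and font the outline may only render with a TrueType font:

```python
from PIL import Image, ImageChops, ImageDraw

# Hypothetical canvas standing in for the photo being annotated.
background = (120, 120, 120)
image = Image.new("RGB", (200, 80), color=background)
draw = ImageDraw.Draw(image)

# White fill plus black outline keeps the label legible on most backgrounds.
draw.text(
    (10, 10),
    "cat 0.98",            # made-up label for illustration
    fill="white",
    stroke_width=2,        # outline thickness in pixels
    stroke_fill="black",   # outline color
)

# Bounding box of the pixels that changed, or None if nothing was drawn.
changed = ImageChops.difference(image, Image.new("RGB", image.size, background)).getbbox()
print(changed)
```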
I love new gangsta Yannic
This is how it's done, james bond
How does this relate to the Detectron2 framework from FAIR? It seems like they will add DETR to it at some point, right?
No clue, but I'm sure it does.
🔥🔥🔥
Can you make a video on how to re-train the model to detect new classes using a custom dataset ?
That's going to be a bit longer, but in essence, it should be easy and they provide the code to train it
I legitimately thought you would implement it from scratch, just like in the last video on PyTorch Lightning. But anyway, this is also good content.
Sorry about that. In my defense, the title says "how to use ..." :D
@YannicKilcher Yeah. Great work, and thanks for the content.
Do you know anything about how to speed up training and/or fine-tuning with few GPUs?
Not really. I guess first thing to do is plumb until your usage is at 100%
I have a dataset and I want to train DETR on it. Can I use this video to train on my dataset? What is your opinion?
Hi Yannic, I do not understand the sizing of the images. Is it a fixed-size input? Surely it is, since it is a CNN? If not, can you explain how and why?
sirajing :)
Can I use this pretrained model to build a model that predicts custom objects? And if so, how?
You'd have to train it yourself
@YannicKilcher Thank you. Then how should I prepare my dataset? I want to predict masked vs. non-masked faces.
Nice tutorial, but I need to know how to suppress multiple bounding boxes for the same object.
With this method that shouldn't be a problem
@YannicKilcher I think non-maximum suppression should be used.
I tried it with many images, and multiple bounding boxes appear.
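For readers who do see duplicates and want to try the non-maximum suppression suggested here, a minimal greedy, class-agnostic sketch in plain Python follows. It is a generic post-processing step, not part of the code shown in the video; the function names, the 0.5 IoU threshold, and the toy boxes are assumptions. Boxes are taken to be (x1, y1, x2, y2) corner coordinates. With PyTorch installed, `torchvision.ops.nms` offers a tensor-based equivalent.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score and keep a box only if it
    does not overlap an already kept box by more than `iou_threshold`.
    Returns the indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

# Toy example: the first two boxes cover nearly the same region.
boxes = [(10, 10, 110, 110), (12, 12, 112, 112), (200, 200, 260, 260)]
scores = [0.9, 0.8, 0.95]
kept = nms(boxes, scores)
print(kept)
```

Raising the confidence threshold on the predictions is another option worth trying first, since low-scoring candidates are a common source of extra boxes.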
Hello, can we use this for custom object detection, please?
you can, but you'll have to train it yourself
Could you please share Wildlife animal IR/thermal dataset for my study?
I just typed Elephants into google images :D
Here is a large collection of camera trap datasets. Not sure if any are IR/thermal but perhaps helpful!
lila.science/datasets
Can I fine-tune the model from Torch Hub?
Yes, I'm pretty sure you can. What you import is just a regular torch.nn.Module.
19:05 The white text simply didn't show up because you didn't load the cell after formatting the code LOL
Thank you for this wonderful demo Yannic Kilcher. I have learned a lot from this video. Thank you so much again.
I would like to share something back with others who want to play with this model. I have created a simple util class, based on the code written by Yannic, for anyone who wants to try it out.
Gist: gist.github.com/balaprasanna/c6b6e5dba63a338c53a211baad48cb19
Example usage:
imageurl = "5.imimg.com/data5/GM/EM/MY-38731446/selection_143-500x500.png"
model = DETRModel(imageurl)
model.detect()
Have you tried it with video?
How do we train this on our own data?
Hi. Fun and educating at the same time. Any idea about training my own model? Has anybody around done it?
No, but the code is there for you to do it
Could I request a paper and live coding review? I'd like to know more about a semantic segmentation paper from Facebook Research. The title is PointRend: github.com/facebookresearch/detectron2/tree/master/projects/PointRend . It also provides a Google Colab example. Also, are you interested in any papers from CVPR 2020?
I don't look much at conference proceedings if I don't attend personally, since that research is usually already half a year out of date.
@YannicKilcher OK, that makes sense. Nowadays there are tons of papers. How about next week's CVPR papers? Are you interested in any of them?