I seriously don't know why anyone would dislike this gem of a video. Thanks, Yannic, I spend most of my day on your channel, learning and regurgitating knowledge. much love from Nigeria!
haters gonna hate :D
Wow! I just finished the GPT3 video, and just now saw this uploaded! Cool!
Absolutely LOVE all your content and am so excited to see you playing with code. I would be stoked if you made more videos of implementing (or just forking and playing with) some of the research papers you explore (in PyTorch). Many thanks for all you contribute to this fascinating field
Yea the problem is the coding videos take much longer to make, but once I have an efficient process I can make more.
Awesome. Gonna keep this bookmarked as a quick reference to Colab and the various python tools. Great work.
I really like your videos. Your selected papers overlap very well with my selection. Thanks!
I know, I regularly read your diary while you sleep.
Man, you're so humble, thanks for the content.
Man, these coding videos are a joy to watch... and learn!
That was really a fun video. Thanks
Honestly, predicting cake for so many things is pretty reasonable
This tutorial was very interesting and full of fun. Thank you
Can you please do a video on how we can train our own custom detector?
Shaga-boom, you now have a model
In addition to the great content, I like the Vim keymap in notebooks as well :))
"Now we are going to use the requests library, always a pleasure" had me cracking up!
Thanks, I just sirajed it and will show inference on my channel (video input)
Nice share! Your glasses are very cool!
First step: buy cool sunglasses
nice tutorial, thanks!!!
Great video! I (and maybe some others) would like you to take an RL paper and implement its algorithm, that would be cool
for uber clear text readability on virtually any image:
white fill + black outline (of the text)
no clue if that is doable with what you were using, but wanted to point it out just in case
Hi Yannic, I do not understand the sizing of the images, is it a fixed size input, surely it is since it is a CNN? If not, can you explain how and why?
Very informative, thank you for the video!!! Can you suggest how we can perform custom object detection using this architecture?
If by custom, you mean with your own classes, then you'd need to re-train the model with a custom dataset.
@YannicKilcher, I mean using our own custom dataset. Any suggestions on how we can do that with this architecture? Should we follow the general image classification approach or the object detection approach (where we need an annotations file)?
You put out great content! Can't get enough of these. Thanks!
Although, when I input an image in your code, it always detects 15 objects regardless of how many are in the picture.
Any idea why this might be happening?
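One thing worth checking for the "always 15 objects" question above: DETR emits one prediction per object query, each with an extra "no object" class in the last position, so the number of detections depends entirely on how the outputs are filtered. A rough sketch of the two usual filters, in plain Python, with hypothetical function names and toy logits (this is not the code from the video):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def best_class(logits):
    """Most likely real class and its probability, ignoring the
    trailing 'no object' class."""
    probs = softmax(logits)[:-1]
    idx = max(range(len(probs)), key=probs.__getitem__)
    return idx, probs[idx]

def keep_confident(pred_logits, threshold=0.9):
    """Keep only queries whose best real class exceeds the threshold.
    The number of results varies with the image."""
    kept = []
    for q, logits in enumerate(pred_logits):
        cls, p = best_class(logits)
        if p > threshold:
            kept.append((q, cls, p))
    return kept

def keep_top_k(pred_logits, k):
    """Keep the k most confident queries. The number of results is
    always k (if there are at least k queries), whatever the image shows."""
    scored = [(q, *best_class(logits)) for q, logits in enumerate(pred_logits)]
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:k]

# Toy logits for three queries; the last entry of each is "no object".
toy_logits = [
    [10.0, 0.0, 0.0],  # confident: class 0
    [0.0, 0.0, 10.0],  # confident: no object
    [0.0, 5.0, 0.0],   # fairly confident: class 1
]
confident = keep_confident(toy_logits, threshold=0.9)
top3 = keep_top_k(toy_logits, k=3)
```

If the count never changes, a fixed top-k selection (or a threshold that is too low) is a plausible culprit, though that is a guess without seeing the modified code.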
I love new gangsta Yannic
This is how it's done, james bond
Yannic, you're a beast!
Wow, cool how-to! You're good at Python
Hey, awesome work on the explanation of the paper and now this! Just found your channel searching for exactly this and it is an instant favorite in terms of how relevant the topics you touch are and how intuitive the explanations.
I have a question - do you know how transfer learning would work in terms of taking their pre-trained model and using a custom number of outputs for fewer classes? I've kind of succeeded by just editing the hard coded num_classes in their github, but that is for training from scratch and I don't have the capacity for that.
Again, appreciate what you're doing!
I see your problem: the checkpoint weights aren't going to match in all places. There are multiple approaches, the easiest one is: load all weights that match, randomly initialize the ones that don't match. More sophisticated are things like model surgery.
Heh, funny how my mind had already steered towards the complicated one. Thanks for the advice!
I'm doing this for work and we've actually already decided for a more plug-and-play solution to tune a 101 layer retinanet via detectron2. Guess I'll just secretly hope that it's too expensive runtime-wise and I'll get to revisit DETR.
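The "load all weights that match, randomly initialize the ones that don't" approach from the reply above can be sketched roughly as follows. The helper name, the parameter names, and the shapes are all hypothetical, and `SimpleNamespace` objects stand in for tensors so the snippet runs without PyTorch; anything with a `.shape` attribute works.

```python
from types import SimpleNamespace

def split_matching_weights(checkpoint_state, model_state):
    """Split checkpoint entries into those that fit the new model
    (same name and same shape) and those that do not."""
    matched, skipped = {}, []
    for name, tensor in checkpoint_state.items():
        target = model_state.get(name)
        if target is not None and tuple(tensor.shape) == tuple(target.shape):
            matched[name] = tensor
        else:
            skipped.append(name)
    return matched, skipped

# Hypothetical pretrained checkpoint with a 92-way classification head.
checkpoint = {
    "backbone.weight": SimpleNamespace(shape=(64, 3)),
    "class_head.weight": SimpleNamespace(shape=(92, 256)),
    "class_head.bias": SimpleNamespace(shape=(92,)),
}
# Hypothetical new model with the head resized for fewer classes.
new_model_state = {
    "backbone.weight": SimpleNamespace(shape=(64, 3)),
    "class_head.weight": SimpleNamespace(shape=(4, 256)),
    "class_head.bias": SimpleNamespace(shape=(4,)),
}

matched, skipped = split_matching_weights(checkpoint, new_model_state)
print("loaded from checkpoint:", sorted(matched))
print("left at fresh init:   ", sorted(skipped))

# With real PyTorch objects the same idea would be:
#   matched, skipped = split_matching_weights(checkpoint, model.state_dict())
#   model.load_state_dict(matched, strict=False)
# strict=False lets the skipped parameters keep their fresh initialization.
```

Only the resized head ends up in `skipped`, so the rest of the network starts from the pretrained weights and only the new head has to be learned from scratch.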
Great content as usual! Thank you so much. Keep up the good work.
The paper said segmentation was easily accomplished, right? That'd make an awesome complementary vid.
19:05 The white text simply didn't show up because you didn't load the cell after formatting the code LOL
We call it sirajing! Epic
Can you make a video on how to re-train the model to detect new classes using a custom dataset ?
That's going to be a bit longer, but in essence, it should be easy and they provide the code to train it
Thanks for the video. Any experience with using it for real-time detection?
Can I use this pretrained model to make a model that predicts custom objects? And if so, how?
You'd have to train it yourself
@YannicKilcher Thank you. Then how should I prepare my dataset? I want to predict masked vs. non-masked faces.
Nice tutorial, but I need to know how to suppress multiple bounding boxes for the same object.
With this method that shouldn't be a problem
@YannicKilcher I think non-maximum suppression should be used.
I tried it with many images, and multiple bounding boxes appear.
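For reference, the non-maximum suppression mentioned above can be sketched in plain Python. This is a generic greedy NMS applied as post-processing, not part of DETR or of the code in the video (DETR is designed so that it should not be needed); boxes are assumed to be `(x1, y1, x2, y2)` tuples and the function names are illustrative.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes from highest to lowest score and keep a
    box only if it does not overlap an already kept box too much.
    Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

# Two heavily overlapping boxes and one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores, iou_threshold=0.5)
print(kept)
```

In practice this is usually run per class, so that overlapping boxes of different classes do not suppress each other.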
Very good !
I have a dataset and I want to train DETR on it. Can I use this video to train on my dataset? What is your opinion?
Whoa, this is really powerful
Thanks for the awesome video. Can you make more of this kind of coding walk-through video? I'd love to see some on contrastive learning.
Hello, can we use this for custom object detection, please?
you can, but you'll have to train it yourself
Do you know anything about how to speed up training and/or fine-tuning with few GPUs?
Not really. I guess first thing to do is plumb until your usage is at 100%
😂 I love your video and it is very fun.
Could you please share a wildlife animal IR/thermal dataset for my study?
I just typed Elephants into google images :D
Here is a large collection of camera trap datasets. Not sure if any are IR/thermal but perhaps helpful!
lila.science/datasets
How to train this on our own data
Great video, thanks
Can I fine-tune the model from torch hub?
Yes, I'm pretty sure you can. What you import is just a regular torch.nn.Module.
How does this relate to the detectron2 framework from FAIR? It seems like they will add DETR to it at some point, right?
No clue, but I'm sure it does.
Great... Thanks for the video.
Very helpful 👍
Such a cool channel.
Great, Thanks
I legitimately thought that you would implement it from scratch, just like in the last video on PyTorch Lightning. But anyway, this is also good content.
Sorry about that. In my defense, the title says "how to use ..." :D
@YannicKilcher Yeah. Great work, and thanks for the content.
Love you :)
🔥🔥🔥
You are a god
sirajing :)
Hi. Fun and educational at the same time. Any idea about training my own model? Has anybody around done it?
No, but the code is there for you to do it
Thank you for this wonderful demo Yannic Kilcher. I have learned a lot from this video. Thank you so much again.
I would like to share something back with others who want to play with this model. I have created a simple util class, based on the code written by Yannic, that makes it easy to try the model.
Gist: gist.github.com/balaprasanna/c6b6e5dba63a338c53a211baad48cb19
Example usage:
imageurl = "5.imimg.com/data5/GM/EM/MY-38731446/selection_143-500x500.png"
model = DETRModel(imageurl)
model.detect()
Have you tried it with video?
Could I request a paper review with live coding? I'd like to learn about a semantic segmentation paper from Facebook Research called PointRend: github.com/facebookresearch/detectron2/tree/master/projects/PointRend . The project also provides a Google Colab example. Also, are you interested in any papers from CVPR 2020?
I don't look much at conference proceedings if I don't attend personally, since that research is usually already half a year out of date.
@YannicKilcher OK, that makes sense. Nowadays there are tons of papers. How about the papers at next week's CVPR? Are you interested in any of them?