~ Timeline for watching again later ~
00:01 Intro
01:17 What is YOLO?
03:13 Architecture of YOLO v3
05:28 Input
07:27 Detections at 3 Scales
09:28 Detection Kernels
12:02 Grid Cells
14:23 Anchor Boxes
18:25 Predicted Bounding Boxes
21:41 Objectness Score
Conclusion
I was coding YOLOv3 from an Indian YouTuber's tutorial, and now here I am learning the true nature of YOLO. It helps a lot with this OCR project, where I can ignore images that were not intended to be uploaded to the server.
I regret that I didn't find this gem earlier! I had to go through 5-6 papers and hours of reading to understand these topics, but your video made it very clear and specific. Please make more quality content like this. Thanks a lot.
Thank you for the feedback! Will do!
This is one of the simplest and most articulate explanations of YOLOv3. Thank you very much for this video, and please keep up the good work.
Thank you for the feedback! Will do!
I have seen a lot of videos on CNNs, mostly crap. But your video is a gem. I appreciate the effort you have put into making this video. The diagrams are a great help in understanding the architecture. Thanks again.
Spent multiple hours trying to read through various papers in order to understand some of these topics. I should've stumbled upon your channel and this video much earlier. Love the fact that everything is explained to the point. You've earned yourself a subscriber in me. Can't stress this enough, but please put out more videos like these along the lines of computer vision. Well done mate, and once again, THANK YOU SO MUCH!
Thank you for the feedback! Will do!
Great video!
We need more detailed explanation videos like this; every other video I've watched gives the same few lines of explanation of YOLO that can be found all over the internet.
one of the best explanations of YOLO!
This video really covers the details of YOLOv3! It helps me a lot! Thanks!
This is one of the best I have seen. Thank you.
Really great detailed explanation. I don't get exactly how the ground truth values are determined for grid cells close to the centre grid cell of an object. Would you be able to explain this?
Hey, can someone explain to me why the detection happens at layers 82, 94 and 106? Is there any mathematical background, or is it a fixed parameter of YOLOv3?
You should do another video for YOLOv4
I enjoyed your video. Thank you for putting in the effort. Could you comment on the receptive field of YOLOv3? For example, if I put in a shape=(416,416,3) image, then, as you said, YOLOv3 decimates by 32 to produce an output feature map at layer 82 of shape=(13,13,255). This is shown quite clearly in your video (15:50 mark). My question is: what is the receptive field for the first cell in that output feature map (i.e. the top-left cell, of shape=(1,1,255))? To ask another way, what portion of the original 416x416x3 image is mapped to that 1x1x255 feature cell?
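As a rough illustration of the question above: a minimal Python sketch (not from the video), assuming a 416x416 input and the stride-32 head at layer 82, of the input patch that each output cell nominally corresponds to. Note that the true receptive field of a cell is much larger than this 32x32 patch because of the many stacked convolutions in Darknet-53; the function name here is purely illustrative.

```python
# Illustrative sketch only (not from the video): which 32x32 patch of a
# 416x416 input the cell at (row, col) of the 13x13 stride-32 feature map
# directly maps to. The real receptive field is considerably larger.

def cell_to_input_region(row, col, stride=32):
    """Return (x_min, y_min, x_max, y_max) of the nominally mapped input patch."""
    return (col * stride, row * stride, (col + 1) * stride, (row + 1) * stride)

if __name__ == "__main__":
    # The top-left (1,1,255) cell of the 13x13x255 map at layer 82
    print(cell_to_input_region(0, 0))                 # (0, 0, 32, 32)
    # Grid sizes of the three detection scales for a 416x416 input
    for stride in (32, 16, 8):
        print(f"stride {stride}: {416 // stride}x{416 // stride} grid")
```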
Thanks for the detailed and easy-to-understand video. I love it.
It is explained with a lot of diagrams, so even though I am not very good at English, I was able to understand it. Thank you
One question: are the ground truth bounding boxes and anchor boxes used here interchangeably?
Hi, thank you for the explanation. I have one question: how is the objectness score calculated during inference? There is no ground truth to refer to, so on what basis is the objectness score measured?
Thank you so much for this amazing video. Just one question: at 23:58, why would you define the t_0 inside the sigmoid? In the loss function of YOLOv3 they directly use p_0, so I would like to know why! Is this just to make sure that p_0 is between 0 and 1? Does this t_0 appear somewhere in the model when we implement it? Thanks in advance to anyone who replies :)
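On the point raised above, a tiny illustrative sketch (an assumption about the intent, not the video author's code) of why wrapping the raw output t_0 in a sigmoid is useful: t_0 is an unbounded real number, and the sigmoid squashes it into (0, 1) so that p_0 can be read as a probability.

```python
# Illustrative only: sigmoid maps any real-valued t_0 into (0, 1),
# which is what allows p_0 to be interpreted as an objectness probability.
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

for t_0 in (-5.0, 0.0, 2.5):
    p_0 = sigmoid(t_0)
    print(f"t_0 = {t_0:5.1f}  ->  p_0 = {p_0:.3f}")
```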
Should the input image for detection be the same size as the training images used in model fitting? Or how big can an input image be?
Hello there,
There is no need to resize images before training, or before testing after training. The framework (e.g. the YOLO framework on GitHub) will take care of resizing. Moreover, individual images, both for training and testing, can also be of different dimensions.
@@valentynsichkar Thanks for the reply. In my case my test image is 20,000 x 20,000 in size (a drone photo mosaic) and the model cannot detect anything. Only when I split the input image into tiles of the same size as the training images does it work. According to you, I think I can make bigger tiles for detection, but I just want to know the limit on input size.
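For readers facing the same issue, a rough sketch of the tiling approach described above: split a very large image into network-sized tiles (with some overlap so objects on tile borders are not cut) and run detection per tile. The tile size and overlap values are assumptions for illustration, not values prescribed by YOLOv3.

```python
# Rough tiling sketch (illustrative assumptions, not from the video):
# split a huge image into fixed-size tiles before running detection,
# keeping (x, y) offsets so detections can be mapped back to the mosaic.
import numpy as np

def make_tiles(image, tile_size=416, overlap=64):
    h, w = image.shape[:2]
    step = tile_size - overlap
    for y in range(0, max(h - overlap, 1), step):
        for x in range(0, max(w - overlap, 1), step):
            yield x, y, image[y:y + tile_size, x:x + tile_size]

if __name__ == "__main__":
    mosaic = np.zeros((2000, 2000, 3), dtype=np.uint8)  # stand-in for a large photo mosaic
    print(sum(1 for _ in make_tiles(mosaic)), "tiles")
```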
Great video! Can you please come up with more videos?
Hello dear,
I hope you are okay. I want to ask you a few questions:
1. Can I apply some edits to the YOLO equations to get better detection?
2. Can you recommend some videos that explain everything about YOLOv4?
3. How can I write these equations in Python?
I hope you answer me. Thank you.
I've read some articles where they improve YOLOv3 by adding an equation; you should search for some, maybe it could help you.
Thank you for the thorough explanation, sir, much appreciated. Keep it up, it is great. Cheers, sir.
Is it possible to integrate the YOLO algorithm with an Arduino or Raspberry Pi using a webcam?
Thank you, very helpful. Can you make a series on deep learning please?
Thanks for the feedback! For sure, will do!
Thank you for this super explanation. I have a question regarding the objectness score. As you explained mathematically, p_0 = sigmoid(t_o) = P(object) * IoU. My question is: how do we obtain this P(object), the predicted probability? Thanks in advance for your support.
Yes, it is the probability predicted by the network.
@@bharath5666 Can I find out how the network predicts P(object), either mathematically or somewhere in the code?
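To make the reply above concrete, a hedged sketch (COCO-style output layout assumed, not the video author's code) of how a raw YOLOv3 head output is typically decoded. Each of the 3 anchors per cell predicts [t_x, t_y, t_w, t_h, t_o, 80 class scores], i.e. 3 * (5 + 80) = 255 channels, and P(object) is simply the sigmoid of the t_o channel; the network learns to output t_o, so nothing is measured against ground truth at inference time.

```python
# Illustrative decode of a raw stride-32 YOLOv3 output (COCO layout assumed).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

raw = np.random.randn(13, 13, 255)          # stand-in for the layer-82 output
pred = raw.reshape(13, 13, 3, 85)           # split the 255 channels per anchor
objectness = sigmoid(pred[..., 4])          # P(object) for every cell and anchor
class_scores = sigmoid(pred[..., 5:])       # independent per-class sigmoids in v3
print(objectness.shape, class_scores.shape) # (13, 13, 3) (13, 13, 3, 80)
```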
I can just follow the others. This video is very helpful. Did you publish a paper? I would like to cite you for my project.
How many classes can YOLO detect?
It depends on how many classes it is set up for during training. For instance, YOLO trained on the COCO dataset detects and classifies 80 classes.
@@valentynsichkar yolo v3 ?
It doesn't matter which algorithm. As mentioned in the message above, it depends on what number of classes is specified for training. It can be YOLO v2, v3, v4 or any other algorithm.
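To put a number on the answer above: the class count only changes the size of the detection kernels, filters = B * (5 + C), with B = 3 anchor boxes per scale in YOLOv3. A small sketch:

```python
# filters = B * (5 + C): 5 = (t_x, t_y, t_w, t_h, t_o), C = number of classes.
def detection_filters(num_classes, boxes_per_scale=3):
    return boxes_per_scale * (5 + num_classes)

for c in (1, 20, 80):   # e.g. a custom single-class set, PASCAL VOC, COCO
    print(f"{c} classes -> {detection_filters(c)} filters")   # 18, 75, 255
```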
Amazing Explanation of Yolo v3. Thank you very much.
Legitimately the clearest video I could find on this topic, amazing! Thanks a lot and keep up the great work Valentyn! :-)
Thank you for the feedback! Will do!
Thanks for the great explanation!!
Thanks a lot. Explained neatly.
Please make videos on V4 and V5 too.
Well explained 👍
great explanation & presentation!!!
very well explained
hats off sir. thank you very much for such a nice briefing.
perfect explanation thanks
Nice explanation!! Thank you
Great. Thank you, it helps me a lot!
Thank you so much sir. It's a very useful and great explanation!
New to machine learning, and I'm wanting to create an object detector for video games. What are some good resources to start learning? I know the basics of neural networks and their functions. I've bought your course and will be starting to learn from that.
Good explanation. Thank you sir
Great tutorial, thanks !
Thanks a lot. Please make a video on YOLOv4.
Thanks for the comment. Will do!
Explained very well.... great
Great presentation
Nicely explained
Great video, thanks for this.
Simply top-notch. Everything became clear right away. At least now it's clear what those anchors actually are.
Nice explanation
Thanks. It is excellent!
Very helpful thanks!
Perfect!
Perfecto!
thank you
Thanks sir
Thanks 🌹🌹🌹🌹
🍀🍀🍀🍀🍀🇮🇶
Amazing explanation!! Thank you
Really awesome explanation!
thanks a lot
Very well explained