Your tutorials are awesome!
😃💪
Hi, firstly thanks for making these videos, I've learned a lot.
So I've been trying to do a small project out of interest.
The idea is to try and detect characters/objects in a video game and what they are currently doing.
So I'm in the middle of doing annotations in CVAT, and after thinking it through, each character can go through roughly 55 different types of animations (like walking, running, punching, etc.).
So what I'm doing in CVAT is using 1 label/class per character and adding all 55 possible animations in the attributes section, rather than splitting all of them out into their own label/class. I thought this would help down the road when I try to do real-time detection, since putting all the animation types in attributes means I am only detecting 7 labels/classes on the screen at any one time.
So after going through your videos and reading up more, I think I may have gotten the wrong idea: if I just fed my dataset into YOLOv8, it would only tell me where my characters are on the screen, but not what they are actually doing.
So my question is: what am I missing, and how should I proceed next? Is my dataset almost totally useless, since attributes won't help me achieve what I want? Or is there another model out there that can do what I want (but is probably slower)?
Hey Ernest, I am glad you find the videos useful 😊. I think you could solve this problem in many different ways. You could use an object detector to detect the character only, and then an image classifier to classify the action the character is doing. Or you could try keypoint detection, and then something like a random forest where you input all the x, y positions of the keypoints. Off the top of my head, that is what I would do in this project! 💪🙌
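The first suggestion above (detect the character, then classify the action on the cropped box) could be sketched roughly as follows. This is only a minimal illustration, not the method from the video: `detect_then_classify`, `toy_detector`, and `toy_classifier` are hypothetical names, the frame is a toy grayscale grid rather than a real game screenshot, and the two stand-in functions would be replaced by a trained detector (for example YOLOv8) and a trained image classifier. Under this setup, the per-box animation attributes from CVAT could plausibly serve as the labels for the classifier's training crops.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (x1, y1, x2, y2) in pixels
Frame = List[List[int]]           # toy grayscale image: rows of pixel values


def crop(frame: Frame, box: Box) -> Frame:
    """Cut the region inside the box out of the frame."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in frame[y1:y2]]


def detect_then_classify(
    frame: Frame,
    detector: Callable[[Frame], Sequence[Box]],
    classifier: Callable[[Frame], str],
) -> List[Tuple[Box, str]]:
    """Stage 1 finds character boxes, stage 2 labels the action in each crop."""
    results = []
    for box in detector(frame):
        action = classifier(crop(frame, box))
        results.append((box, action))
    return results


# Hypothetical stand-ins so the sketch runs without any trained model.
def toy_detector(frame: Frame) -> Sequence[Box]:
    # A real detector would predict these boxes from the pixels.
    return [(0, 0, 2, 2), (2, 2, 4, 4)]


def toy_classifier(patch: Frame) -> str:
    # A real classifier would be trained on crops labelled with the animation.
    mean = sum(sum(row) for row in patch) / (len(patch) * len(patch[0]))
    return "punching" if mean > 127 else "walking"


frame = [
    [255, 255, 0, 0],
    [255, 255, 0, 0],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
]
for box, action in detect_then_classify(frame, toy_detector, toy_classifier):
    print(box, action)
```

Keeping the two stages behind plain callables means the detector only ever has to learn the small set of character classes, while the 55 animations are handled by the second model.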
@ComputerVisionEngineer 🙏 alright thanks, I'll keep trying then and see how far I can go. Probably use YOLOv8 for object detection and then classification at the same time.
Can you show us how to leverage this training with TensorFlow, GANlib, or others, and then use that fine-tuned model to generate images from text based on that training? It looks like StyleGAN and CycleGAN would be useful to leverage, but are there open-source ones that can apply that refined, trained model to generate images from text?
Sure, I am going to make some tutorials on generative models in the future! 💪
Complete TensorFlow tutorial next plzzz😍😍🙏🙏🙏🙏
I am unable to load the model. Are there some other dependencies I have to install?
Thank you for making this video. I have one image per class and 500 classes. Can I use the webcam to create more images from the single image I have, so the model has more to analyze? My images are already saved to my drive, so I can't just hold them in front of the webcam. Any suggestions? Thank you.
Hello sir, how do I embed the webcam, please?
How can I make it predict a true negative result? For example, if I insert an image other than cloudy, rain, shine, or sunrise, it should show "no target image".
Hey, I guess you could do that by comparing the confidence score with some threshold (which you can set to 0.5, for example), something like:
if confidence_score < threshold: return "no target image"
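A slightly fuller sketch of that threshold check, assuming the model returns one score per class (for example softmax probabilities). The function name `predict_or_reject` and the `CLASS_NAMES` list are made up for illustration, and 0.5 is just the example value from the reply, not a tuned setting.

```python
from typing import Sequence

# Assumed class order; a real model defines its own.
CLASS_NAMES = ["cloudy", "rain", "shine", "sunrise"]


def predict_or_reject(
    scores: Sequence[float],
    class_names: Sequence[str] = CLASS_NAMES,
    threshold: float = 0.5,
) -> str:
    """Return the top class, or a rejection message if its score is too low."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    if scores[best] < threshold:
        return "no target image"
    return class_names[best]


print(predict_or_reject([0.05, 0.85, 0.05, 0.05]))  # confident prediction
print(predict_or_reject([0.30, 0.30, 0.20, 0.20]))  # nothing clears the threshold
```

Keep in mind this is a heuristic: a classifier can still give a high score to an unrelated image, so the threshold usually needs tuning on real examples, and training an extra "other" class is a common alternative.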
@ComputerVisionEngineer thank you so much for the reply!!! I have fixed my problem
Super super video!
Thank you very, very much.
😃 Glad you enjoyed it! 💪
@ComputerVisionEngineer you don't speak too fast for me. I understand almost everything. You are very helpful for my personal document detection project.
Thank you so much. I am a farmer just trying to figure out how to use computer vision, but I don't have much knowledge of the coding part. This is very useful; I tried it and got some results. Can you show how to run it on an Android phone?
Glad it was helpful! I will try to do another video on how to take a model trained with Teachable Machine onto a mobile device. 🙌
Is there an object detection version of it?
Sadly, there is not. An object detection version of it would be super cool! 😃💪
Thank you very much for sharing. Is there any similarly simple and user-friendly tool like Teachable Machine, but PyTorch-based, please?
Not that I am aware of.
Can you make a video on training a TensorFlow Lite model for Raspberry Pi using a custom dataset?
Hey, that sounds like a great idea for a future video, I will try to do it! 🙌
Btw, you can export the model in TensorFlow Lite format from Teachable Machine.
@ComputerVisionEngineer I got accuracy = 0 when converted to TFLite. TF works but it's slow.
@microcontroller_garage5387 Oh, I see. I will try to do a video on TF Lite and Raspberry Pi.
Tried this 100 times, can never get past the "preparing training data" part. It just stays like that for two hours, then freezes and crashes.
Thank you sir
You are welcome! 😃🙌
I liked your accent very much!!! Where are you from?
Thank you! I am from Uruguay. 😃🙌
Nice, brother, continue what you are doing! @ComputerVisionEngineer