Get expert guidance and insider tips and tricks: create stunning images, learn to fine-tune diffusion models, and pick up advanced image-editing techniques like in-painting, Instruct Pix2Pix, and many more. Join our Kickstarter campaign now! bit.ly/3JYh7A6
📚 LINK TO BLOGPOST: learnopencv.com/yolov7-pose-vs-mediapipe-in-human-pose-estimation/ ▶ LINK TO YOLO MASTERCLASS PLAYLIST: ua-cam.com/play/PLfYPZalDvZDLALsG9o-cjwNelh-oW9Xc4.html
It was a real pleasure to watch such a clear and concise comparison. Excellent video 👍
Glad you liked it @John. More videos incoming!
Awesome comparison, it reduced my work drastically.
We felt the same while working with both YOLOv7 and MediaPipe: everyone should know about this comparison! Glad you found it useful.
Great video comparison between YOLOv7 and MediaPipe, man. Good thing I saw this video in my UA-cam feed.
+1 Sub 👍
Awesome, thank you!
great explanation! thanks from Argentina
Nice Video! the test on many cases was so helpful!
Thank you Yohanes!
Great Video!! Thank you for the super informative video, was looking for the right pose estimation to use for my dance project and this really helped!
Glad it was helpful!
great work, thanks!
thank you so much, this video is very helpful
Glad it is helpful!
I like your sharing. It is clear and easy to understand.
Thank you, glad you liked it 😊
Mediapipe does support multiperson detection now
Great Video sir. Thank you for sharing.
You are very welcome
Many thanks for this great video! You mentioned that one can use any object detection model for yolo pose - could you elaborate on that? How could one plug in the smallest version of yolov7?
You would need to retrain the network with a different backbone.
The authors trained the pose model with the YOLOv7-W6 backbone. You can train it with a different YOLOv7 variant; what you need is a config (.yaml) file corresponding to the smaller model. You can then train the model using the commands given here: github.com/WongKinYiu/yolov7/tree/pose
I doubt it would give accurate results for smaller models. I would use mediapipe if I don't need multi-person pose estimation.
Very good explanation. Hi Sir. I have been following your tutorial on how to train a custom YOLOv5 object detector, as I am doing a school project on vehicle detection. I am getting an error when training my model. Would you be able to help with this, please?
I fell in love with MediaPipe a year ago when I worked on facial pose estimation… but YOLOv7 just outperforms it when it comes to faces
Hi Maxim
Are you talking about face Detection or Facial Landmarks Detection using YOLOv7?
Hey!
I’m talking about Facial Landmarks Detection. I fine-tuned and used ensemble instead
Great, do you have a repo you could share?
@@LearnOpenCV I worked with medical sensitive data(
No Issues!
Hi, thanks for the nice job in the video ... I'm doing face landmark alignment on single images (3 consecutive images). Is YOLO better than MediaPipe?
Thanks for the kind words Geoff!
YOLO does not provide enough keypoints for face landmark alignment. MediaPipe has a dedicated face mesh model that gives 468 3D landmark points on the face. You can check out our blog post on creating Snapchat filters using MediaPipe to learn how to use the different points for your application.
learnopencv.com/create-snapchat-instagram-filters-using-mediapipe/
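For anyone who wants to try this quickly, here is a minimal sketch of reading the face mesh points through MediaPipe's legacy `mp.solutions.face_mesh` API. The helper names `to_pixel_coords` and `face_landmarks_from_image` are made up for this example, and it assumes `mediapipe` and `opencv-python` are installed. The landmarks come back normalized to [0, 1], so they have to be scaled by the image size.

```python
from types import SimpleNamespace


def to_pixel_coords(landmarks, width, height):
    """Convert normalized landmarks (x, y in [0, 1]) to integer pixel coordinates."""
    return [(int(lm.x * width), int(lm.y * height)) for lm in landmarks]


def face_landmarks_from_image(image_path):
    """Run Face Mesh on one image and return pixel coordinates for the first face.

    Hypothetical helper for illustration. The imports are deferred so the
    conversion function above can be used without mediapipe installed.
    """
    import cv2
    import mediapipe as mp

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    height, width = image.shape[:2]

    # static_image_mode=True treats every input as an unrelated still image.
    with mp.solutions.face_mesh.FaceMesh(
        static_image_mode=True, max_num_faces=1
    ) as face_mesh:
        results = face_mesh.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

    if not results.multi_face_landmarks:
        return []
    return to_pixel_coords(results.multi_face_landmarks[0].landmark, width, height)


# Stand-in landmarks so the conversion can be checked without a model.
demo = [SimpleNamespace(x=0.5, y=0.25), SimpleNamespace(x=0.0, y=0.5)]
print(to_pixel_coords(demo, width=640, height=480))  # [(320, 120), (0, 240)]
```

With a real photo you would call `face_landmarks_from_image("your_image.jpg")` (placeholder path) and index into the returned list to pick the points your application needs.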
Can we tweak MediaPipe to work even when the upper part of the body is not visible?
The pose solution model consists of two models. The detection model (that detects the body), and the landmark model (that maps the landmarks). If you can make the detection model detect the body without its upper part, theoretically, the solution will work.
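To make the two-stage idea concrete, here is a simplified sketch of a detector feeding a landmark model. This is not MediaPipe's actual code: `estimate_pose`, `roi_to_image`, and the two stub models are hypothetical names used only for illustration, and a real pipeline also handles rotation and tracking. The point is that the landmark stage only sees whatever region the detector hands it.

```python
from typing import Callable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height in pixels
Point = Tuple[float, float]


def roi_to_image(points: List[Point], roi: Box) -> List[Point]:
    """Map points normalized to the ROI back to full-image pixel coordinates."""
    x0, y0, w, h = roi
    return [(x0 + px * w, y0 + py * h) for px, py in points]


def estimate_pose(
    image,
    detect_body: Callable[[object], Optional[Box]],
    predict_landmarks: Callable[[object, Box], List[Point]],
) -> List[Point]:
    """Stage 1 finds the body region; stage 2 predicts landmarks inside it."""
    roi = detect_body(image)
    if roi is None:
        return []  # no body found, so the landmark model is never run
    return roi_to_image(predict_landmarks(image, roi), roi)


# Stub models standing in for real networks (assumptions for this demo).
def lower_body_detector(image) -> Box:
    return (100, 200, 50, 100)  # pretend only the lower body was detected


def stub_landmarks(image, roi: Box) -> List[Point]:
    return [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]


pose = estimate_pose(None, lower_body_detector, stub_landmarks)
print(pose)  # [(100.0, 200.0), (125.0, 250.0), (150.0, 300.0)]
```

Swapping in a different `detect_body` is the part you would have to retrain or replace; whether the landmark model then gives sensible points on a partial body is a separate question.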
nice video
Thank you so much!
Excellent
Thank you. Glad you liked it.
Are you sure Mediapipe doesn't support Multi-person? pls verify once
As of the January 2024 update, MediaPipe does support multi-person pose, but limited to 5 people at a time.
For further info check out:
developers.google.com/mediapipe/solutions/vision/pose_landmarker/
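A minimal sketch of what that looks like with the Pose Landmarker from the MediaPipe Tasks API, where the `num_poses` option sets the maximum number of people. The helper names `detect_poses` and `pose_bounding_boxes` are made up for this example, and `model_path` must point to a pose landmarker `.task` model file that you download yourself.

```python
from types import SimpleNamespace


def pose_bounding_boxes(pose_landmarks, width, height):
    """Return one (x_min, y_min, x_max, y_max) pixel box per detected person."""
    boxes = []
    for person in pose_landmarks:
        xs = [lm.x * width for lm in person]
        ys = [lm.y * height for lm in person]
        boxes.append((int(min(xs)), int(min(ys)), int(max(xs)), int(max(ys))))
    return boxes


def detect_poses(image_path, model_path, max_people=5):
    """Run the Pose Landmarker on one image (hypothetical helper).

    Imports are deferred so the box helper above works without mediapipe.
    """
    import mediapipe as mp
    from mediapipe.tasks import python as mp_python
    from mediapipe.tasks.python import vision

    options = vision.PoseLandmarkerOptions(
        base_options=mp_python.BaseOptions(model_asset_path=model_path),
        num_poses=max_people,  # upper bound on the number of people returned
    )
    image = mp.Image.create_from_file(image_path)
    with vision.PoseLandmarker.create_from_options(options) as landmarker:
        result = landmarker.detect(image)

    # result.pose_landmarks holds one list of normalized landmarks per person.
    return pose_bounding_boxes(result.pose_landmarks, image.width, image.height)


# Stand-in data for two people, so the helper can be checked without a model.
fake_people = [
    [SimpleNamespace(x=0.25, y=0.5), SimpleNamespace(x=0.5, y=0.75)],
    [SimpleNamespace(x=0.5, y=0.25), SimpleNamespace(x=0.75, y=0.5)],
]
print(pose_bounding_boxes(fake_people, width=800, height=400))
# [(200, 200, 400, 300), (400, 100, 600, 200)]
```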
What about in images
As mentioned in the summary section, for images it's better to use YOLOv7 or other pose models, since MediaPipe is optimized for real-time performance, which makes it more suitable for video inference.
Hope that helps!
@@LearnOpenCV So for multi-person, which is better than YOLOv7?
@@dj.qb91 For Multiperson we're checking out MMPose next -> github.com/open-mmlab/mmpose. You may also check it out and compare with YOLOv7. Check this out for getting started: mmpose.readthedocs.io/en/v0.29.0/get_started.html#inference-with-pre-trained-models
@@LearnOpenCV thanks 🙏🏾
👍👍
Thank You!
Yolo+mediapipe
Hoho
I'm not sure what that means, but I'm hoping you liked it! 😊