Seriously, you don't need to pay these people for API garbage. This is old; look up panoptic segmentation. Stop giving these people your money and your data. That's all an API is: an application programming interface. In other words, you're just paying them when you don't need to.
Thanks for the wonderful video! Is it possible to annotate specific objects (with labels) in a few frames of a video (fixed perspective) and keep tracking those objects in the entire video?
Nice video! I have one question: can you please suggest which is better out of these, "SAM" or "YOLOv6-v3", for real-time detection in terms of accuracy? My requirement is to detect car parts (e.g. a Michelin tire).
Thank you in advance.
If you want to run in real time, then you can't use SAM. It will be too slow.
@@Roboflow: Thank you very much for your quick response! For our specific requirement to detect car parts (e.g. wheel type, alloy wheels or not, specific accessories, etc.) after a captured image is uploaded (taken from a mobile camera), can you please suggest the best algorithm based on your vast experience in this area? Do you recommend YOLOv6-v3, GroundingDINO, or any other? Tons of thanks to you again in advance!
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
in ()
----> 1 mask_annotator = sv.MaskAnnotator(color_map='index')
TypeError: MaskAnnotator.__init__() got an unexpected keyword argument 'color_map'
I got this error. May I know what went wrong?
Hi! I just fixed the notebook. Feel free to try it. :)
Guess he meant `sv.MaskAnnotator(color_lookup=sv.ColorLookup.INDEX)`; newer supervision versions replaced the `color_map` argument with `color_lookup`.
Thanks so much for the clear video! Are you planning to also integrate it with some tools to get an output that also includes labels for each mask?
We already did. Take a look here: colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/automated-dataset-annotation-and-evaluation-with-grounding-dino-and-sam.ipynb
Thanks so much for this video!! Is there another way to draw the bounding box (in a single Python file where you just run your main function) that doesn't require Jupyter widgets? Oh btw, liked and subscribed, you guys are awesome!
Thanks for the like and sub ;) As for your question: that was the only interactive way that I could come up with. But if you don't want to do it in an interactive way, then you have plenty of options.
Can we annotate a polygon shape instead of a rectangle using SAM?
Help please, I need to learn computer vision, but I struggle a lot. Is the OpenCV certificate worth it? It's around 1200 US dollars. Thanks
Take a look here: github.com/SkalskiP/courses. In general, the Internet is full of free resources. It is not worth paying 1200 USD for a course like that.
How can I tag objects in a picture using SAM? For example, in the picture that was used in the video (a man holding a dog), I want to identify all the objects in the picture, like man, dog, building, etc.
Stay tuned for our video tomorrow. 🔥 I’m going to show how to auto annotate images with Grounding DINO and SAM
Great video... thanks Piotr and Roboflow for all the great videos you generate. I am resuming my interest in CV thanks to you!
This is big! If I managed to convince you even a little bit I am proud of myself.
Is there any way you can create an auto-labeler using SAM? (SAM would take care of everything with no human intervention). My specific need would be to label lane markings, but for an entire dataset of raw images.
Forgot to mention, great video! Is there any functionality with SAM where you can give it a few examples of what the label is and then it will assume the labels for the dataset. Thanks!
Yes! Stay tuned to our next vid. We are doing full auto dataset generation and generation of masks from boxes. Should be on Monday.
How can I extract the segmented object produced by SAM?
You can find masks inside the `sv.Detections` object: `detections.mask` (singular).
@@Roboflow Could you show me how to use 'detections.masks', please? I try to use it and got AttributeError: 'Detections' object has no attribute 'masks'
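For anyone hitting this: in current supervision versions the attribute is `detections.mask` (singular), a boolean array of shape `(N, H, W)`. A minimal numpy sketch of pulling a segmented object out of an image; the image and mask here are tiny dummies standing in for real SAM output:

```python
import numpy as np

# dummy 4x4 RGB image and a single boolean mask (stand-ins for real SAM output)
image = np.arange(4 * 4 * 3, dtype=np.uint8).reshape(4, 4, 3)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True  # pretend SAM segmented this 2x2 region

# keep only the masked pixels; everything else goes black
cutout = np.where(mask[..., None], image, 0)

# crop to the mask's bounding box
ys, xs = np.where(mask)
x1, y1, x2, y2 = xs.min(), ys.min(), xs.max() + 1, ys.max() + 1
cropped = cutout[y1:y2, x1:x2]

print(cropped.shape)  # (2, 2, 3)
```

The same indexing works on real `detections.mask[i]` arrays; only the shapes are bigger.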
It is really the best video ever :) I am making a great project using sv.
This is so kind! Thank you very much!
You should try it with data taken in an underwater marine context. Lots of models struggle with that.
Cool idea! 💡 As I work on the next vid, I'll try to take that into consideration.
Trying to work on the same use case with coral segmentation
Since SAM is trained on photos, any idea how well it does on synthetic images, like artwork or games? Cheers
Great question. Unfortunately I didn’t experiment with those :/ sorry
Hi, I am in the process of learning this SAM model following your video; this is very helpful. I am planning to use this model to segment characters in historical documents. According to your knowledge, will it be possible, or a waste of time?
SAM is not really good at document segmentation
Hey, great explanation! I have a question: can we do multiple tasks within one bounding box, for example different layers of liquid in a vial inside a bounding box? If yes, can you explain how? Thanks!
So how many SAMs Dics are you expecting to come in and out of your model? You just seem to really enjoy SAMs Dics, but I suppose using research from Michael J Black, about 10 years ago, and it makes sense why you really enjoy utilizing SAMs Dics
This still doesn't work in live action, does it? Like if I connected it to a camera or a VR headset like the Meta Quest Pro / Pico 4 and used their cameras for AR powered by SAM. That would definitely be awesome!
You should get a few fps. But if you want 30 fps, then we are not there yet.
@@Roboflow Hmm I see, but the fact that it's there already is awesome in itself! The future is here and It's really exciting I love it
@@unknown-wm9ru true that!
I got this error:
AttributeError Traceback (most recent call last)
in ()
----> 1 box_annotator = sv.BoxAnnotator(color=sv.Color.red())
2 mask_annotator = sv.MaskAnnotator(color=sv.Color.red(), color_lookup=sv.ColorLookup.INDEX)
3
4 detections = sv.Detections(
5 xyxy=sv.mask_to_xyxy(masks=masks),
AttributeError: type object 'Color' has no attribute 'red'
I'll look into it. Check back in a few hours!
Thanks a lot Linus!
Hello! Can anybody explain how I can evaluate this model after training? What commands can I run?
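No single official eval command comes to mind, but segmentation quality is usually measured with mask IoU against ground-truth masks. A minimal, dependency-light sketch; the toy masks below are fabricated just to show the metric:

```python
import numpy as np

def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection-over-union of two boolean masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter) / float(union) if union else 0.0

pred = np.zeros((4, 4), dtype=bool); pred[0:2, 0:2] = True  # 4 px
gt   = np.zeros((4, 4), dtype=bool); gt[1:3, 1:3] = True    # 4 px, 1 px overlap

print(mask_iou(pred, gt))  # 1/7 ≈ 0.1428
```

Averaging this over a labeled validation set gives a simple mean-IoU score.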
Great video!! I do have a question. How do we use MaskAnnotator to annotate only one specific mask instead of the entire set of masks in sam_result?
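One approach, sketched with dummy data: the automatic mask generator's `sam_result` is a list of dicts (with keys like `'segmentation'` and `'area'`), so you can filter that list down to the mask you want before handing it to the annotator. The values below are fabricated stand-ins for real SAM output:

```python
import numpy as np

# stand-ins for SAM automatic mask generator output
sam_result = [
    {"segmentation": np.ones((4, 4), dtype=bool), "area": 16},
    {"segmentation": np.zeros((4, 4), dtype=bool), "area": 0},
    {"segmentation": np.eye(4, dtype=bool), "area": 4},
]

# keep only the largest mask, then annotate just that one
largest = max(sam_result, key=lambda m: m["area"])
print(largest["area"])  # 16

# or pick whichever masks contain a specific pixel (row=2, col=2)
hits = [m for m in sam_result if m["segmentation"][2, 2]]
print(len(hits))  # 2
```

Pass the filtered list (instead of the full `sam_result`) to whatever converts it into detections for annotation.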
Great video! Do you think you could share the Jupyter notebook?
It is linked in description ;) all our demo notebooks are open sourced
I can't wait to see how it can be used for annotations
Stay tuned for the RF update ;) We also plan to drop one more video, probably Friday/Monday, where we will dive deep into auto-annotation in Colab.
We have released a feature enabling you to use SAM in Roboflow to label images as you mentioned: blog.roboflow.com/label-data-segment-anything-model-sam/
Let us know what you think!
For [Single Image Bounding Box to Mask] what should be changed if we have more than 1 class and want to see detection for all classes?
Hi it is Peter from the video 👋🏻 So you want to have multiple boxes converted into multiple masks?
@@Roboflow Yes. so two-part question: (1) current script returns one mask at a time. How can I change it so it returns all detected masks? (2) Let's suppose I have 5 classes. How should I do so that all detections for all 5 classes are shown?
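A rough sketch of one way to do (1): loop over all the boxes and stack the resulting masks into a single `(N, H, W)` array, keeping a parallel `class_ids` array for (2). The `predictor.predict(...)` call is left as a comment because it needs a loaded SAM model; the dummy mask just stands in for its output:

```python
import numpy as np

boxes = np.array([
    [10, 10, 50, 50],
    [60, 20, 90, 80],
])  # one xyxy box per detection, any number of classes
class_ids = np.array([0, 3])  # e.g. 5 classes -> ids in 0..4

masks = []
for box in boxes:
    # with a real predictor (segment-anything's SamPredictor, after set_image):
    # mask, _, _ = predictor.predict(box=box, multimask_output=False)
    mask = np.zeros((100, 100), dtype=bool)  # dummy stand-in for SAM's output
    mask[box[1]:box[3], box[0]:box[2]] = True
    masks.append(mask)

masks = np.stack(masks)  # shape (N, H, W): one mask per box, all classes kept
print(masks.shape)  # (2, 100, 100)
```

Building detections from the stacked `masks` plus `class_ids` then shows every class at once instead of one mask at a time.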
Is it possible to get a segmented image without passing its bounding box?
You don't need to pass a box. If you don't pass any prompt, the whole image gets segmented.
The first tutorial using SAM should be a mask of yourself, like the one you showed at 0:36 of this video. But thanks anyway!
Just convince my boss to do it and you’ll have it ;)
@@Roboflow I like your boss. Now, when you get time for that, (imho) please keep a few pointers in mind:
a - do it with an image sequence, video, or both
b - show a progress bar / status (tqdm / queue)
c - allow giving the area/prompt in one go
d - how to achieve consistency, etc., if the subject moves out or something comes in front (if possible)
Please let us know anytime when the SAM/ Roboflow integration is accomplished 😊
It will for sure be part of our weekly newsletter! ;)
We have released a feature enabling you to use SAM in Roboflow to label images as you mentioned: blog.roboflow.com/label-data-segment-anything-model-sam/
Let us know what you think!
How can I train with different color masks rather than black-and-white masks?
Very nice video. Next video -> Grounded Segment Anything !! 👏
🚨 SPOILER ALERT: That's the plan!
Second that! +1
@@abdshomad I think we should have something by Friday/Monday
You said it’s real time ready - but what IS real time ready is only the prediction for various prompts on the *same* image. Generating the embeddings for each new image is actually really slow (multiple seconds depending on image size and hardware) and cannot be done in browser or in real time. This makes it less useful for live/video analysis, of course, but it’s still great to generate segmentation mask training sets! Thanks for the video
Hi 👋🏻 It's Peter from the video. You are right; I really think I shouldn't have said that, because it is confusing. The decoding part is very fast and can be executed in real time, but the encoding is quite slow. It can be faster if you use the ViT-B version instead of ViT-H, but still… I really wish I had been more precise. Apologies for that.
There is a fully ONNX-optimized and quantized model out there that is faster. Still not ultra fast, but at least 1 FPS on my RTX 2080 Ti, which is not bad. Semi real-time.
@@mattizzle81 thanks for that info I was actually not aware of that… but still I think it would be much cleaner without that sentence in the vid :)
Please make a video with a custom dataset.
First of all, thank you so much for the content; amazing contribution to the community! I wonder if it is possible to implement the negative point prompt in the SAM model, similarly to how it can be done on the website, where you can choose several points belonging to the object that you are interested in as well as points that do not belong to it. Some help would be amazing!!
Thanks in advance!!
Hi, thanks a lot for those kind words :) As for your question about implementing the negative point prompt: I was looking for any project that would implement that functionality, and I didn't find anything :/
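For the record, the original `segment-anything` repo's `SamPredictor.predict` does accept negative points via `point_labels` (1 = foreground, 0 = background), if I remember the API correctly. A sketch of just the prompt arrays; the predictor call is commented out since it needs a loaded model:

```python
import numpy as np

# two positive clicks on the object, one negative click on the background
point_coords = np.array([[120, 80], [140, 95], [30, 30]])  # (x, y) pixels
point_labels = np.array([1, 1, 0])  # 1 = foreground, 0 = background

assert point_coords.shape == (3, 2)
assert point_labels.shape == (3,)

# with a loaded SamPredictor (after predictor.set_image(image)):
# masks, scores, logits = predictor.predict(
#     point_coords=point_coords,
#     point_labels=point_labels,
#     multimask_output=False,
# )
```

The website's positive/negative clicks map directly onto these two arrays.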
Does SAM segment all objects in the scene very well when there is an occlusion?
One use case is the annotation of eye tracking data. Per video frame one would like to annotate whether a person is looking at other people or objects in the environment. One could use YOLO and bounding boxes, but these are less precise than regions.
Can you make a video on MedLSAM ( medical localize and segment anything model) ?
How can this model be used to detect the type and quantity of inventory in a shop?
Hi it is Peter from the video. I think that if you look for type and quantity of inventory in a shop you will be much better off with using detection models like YOLOv8 or YOLO-NAS.
Hi thanks for the great tutorial! How can I download the masks created using SAM and upload them into roboflow?
Do you do freelancing? My academic project is "solar panel detection and counting using SAM."
Nope. We do not do freelancing. :/
Can it be used to segment and label objects in a video stream from a live camera? I've been reading a lot of the feedback and people are saying it's computationally heavy and will run too slow at a meaningful refresh rate. I noticed the Meta advertisement was doing it in realtime and labeling and tracking stuff. What are your thoughts on this? Is it easier to stick to OpenCV/Yolo for a live video feed?
> I noticed the Meta advertisement was doing it in realtime
Could you point me to the resource?
@@Roboflow At about 0:06 in your video. Looks like real time?
@@tomtouma I wish! That was just a lot of work online to produce it. It is not real time.
@@Roboflow Could you explain what you mean by "work online"? Do you mean the Meta team recorded a video and then post-processed the video offline? Also, is there a way to pass video frames (even if it is very slow like 1Hz or slower) to SAM and have it segment the image frames? I want to then run some python scripts to get me geometry information about these segmented masks.
Just thought I'd bump this.
Really solid video, loved the intro haha
This was a great explanation and so was your blog entry. You gained another subscriber today. Thank you!
Hi, it is Peter from the video :) That's awesome to hear! Thanks a lot!
Thanks for your video, I have learnt a lot from you, but this time, each time I try to follow your steps, I encounter this error:
---------------------------------------------------------------------------
OutOfMemoryError: CUDA out of memory. Tried to allocate 14.40 GiB (GPU 0; 15.90 GiB total capacity; 6.53 GiB already allocated; 7.95 GiB free; 7.12 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Is that happening in notebook?
try using parallel gpu
@@Roboflow
Yes, on both Google Colab and Kaggle
@@saharabdulalim how to do that?
@@husseinali-yx7uf Is that happening with your own image?
Hi sir, I'm a beginner. I saw your computer vision videos, but they are all combined and mixed together. Can you please arrange the videos in order, one by one? That way we can understand easily. Thank you.
Hi :) Are you only interested in auto annotation videos? Or all of them?
How can I do semantic segmentation labeling using SAM?
Let's do this on a superstore dataset
send a link :D I'll take a look
Thank you! Can SAM handle 3D images? Any advice on how to approach it?
Very nice work. How do you see SAM being used in practice? You see this as a model to be integrated in a tool to generate training data for your task or being your final model for a certain task?
Hi 👋! It is Peter from the video. I think we will see broad use of SAM in image and video editors. But I also think it will be the default feature in all major annotation tools. It is a bit too slow for real-time usage, but we will transfer the knowledge it provides into the datasets that we use for training real-time models. What is your prediction?
@@Roboflow Yes, I totally agree with you on the fact that it will be the default tool to annotate data. Since you think this is too slow, what is, in your opinion, the current state-of-the-art model for real-time semantic segmentation?
@@ruiteixeira2324 Hahaha, hard question. According to Papers with Code, that would be the latest version of YOLOv6.
Hi Rui - we have released a feature enabling you to use SAM in Roboflow to label images as you mentioned: blog.roboflow.com/label-data-segment-anything-model-sam/
Let us know what you think!
The code doesn't work. Dislike.
How can I try it with my own images?
I am not a programmer, but I want to learn.
Great !
Wow, that's really great. I've been waiting for that…
I’m super happy you like it!
Great video! Thank you! What hardware are you running this on?
Thank you for that ❤ I have a question: when I make annotations for more than one object in an image and pass them to SAM, it only masks one object (like the brain cell in your project) and doesn't mask the other objects in the same image. I made it one class for all.
I think it's because of xyxy[0], but what if I want to pass multiple masks for multiple objects?
@@saharabdulalim I'm actually working on the next vid right now, and we will cover more general auto-annotation use cases. The vid should be out tomorrow.
How do we swap in our own videos?
Great! Could you give a talk on the possibility of object detection with SAM?
Hi 👋🏻 Could you please explain a bit more what you mean? It is a segmentation model. Would you like to convert masks into boxes?
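In case mask-to-box conversion is what's wanted: supervision ships `sv.mask_to_xyxy`, and the underlying logic is just a min/max over the mask's true pixels. A dependency-free numpy sketch:

```python
import numpy as np

def mask_to_xyxy(mask: np.ndarray) -> np.ndarray:
    """Bounding box [x1, y1, x2, y2] around a boolean mask's true pixels."""
    ys, xs = np.where(mask)
    return np.array([xs.min(), ys.min(), xs.max(), ys.max()])

mask = np.zeros((10, 10), dtype=bool)
mask[2:5, 3:8] = True  # true pixels span cols 3..7, rows 2..4

print(mask_to_xyxy(mask))  # [3 2 7 4]
```

Boxes produced this way can then feed any detector-style pipeline.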
Great. Thanks.
Thanks a lot 🙏
Nice. Thanks.
Thank you! 🙏
Thank you so much
Great video! Please create a video explaining how SAM works internally.