Great video! I wonder if you could speak to the data visualization. How do you create the masks? And say I only have the YOLOv8 annotation format (.png and corresponding .txt), any recommendation on how to visualize it?
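One way to look at an existing YOLOv8 segmentation .txt file is to turn its normalized polygons back into pixel coordinates and draw them over the image. A minimal sketch, assuming each label line is a class index followed by normalized x y pairs; the function name `yolo_label_to_polygons` is made up for illustration:

```python
def yolo_label_to_polygons(label_text, img_w, img_h):
    """Turn YOLO segmentation label text into (class_id, [(x_px, y_px), ...]) tuples."""
    polygons = []
    for line in label_text.splitlines():
        parts = line.split()
        if len(parts) < 7:  # class id plus at least 3 (x, y) pairs
            continue
        class_id = int(parts[0])
        coords = [float(v) for v in parts[1:]]
        # Coordinates are stored as fractions of the image width and height
        points = [
            (round(x * img_w), round(y * img_h))
            for x, y in zip(coords[0::2], coords[1::2])
        ]
        polygons.append((class_id, points))
    return polygons
```

The pixel points can then be converted to an int32 NumPy array and passed to `cv2.polylines` or `cv2.fillPoly` to draw the outline or a filled mask on top of the image.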
it might seem simple to you, but I'd love to see just one example of utilizing a webcam. It seems using the webcam is the most interesting use of YOLO, instead of cycling thru a bunch of still jpg images.
When creating the labels, do you divide by the dimensions of the mask or of the whole image? I am trying to adapt your label creation process to handle multiple masks in one image.
Hi Philippe, awesome tutorial! I really like your style 😊 And I have a question for you. What is the best way to make a dataset for damage detection on cars, machined products, or imperfections/dirt on railroads? Semantic segmentation like you did in this video or object detection like you did with the alpacas? Regards and keep up your great work. Dom😊
My dataset has two classes, and after using your Python file to convert the masks I found that the txt file has only one class (the class labeled 0), although the image clearly has two objects of two different classes. How can I fix this error?
Hello sir, I just watched your video and it is great. A simple query about training: should we also include images that don't have ducks, or is it fine to use only images with ducks? Which is best for fine-tuning the model to detect ducks in images? In simple words, for the purpose of training, should the dataset contain only images with ducks in them, or a mixture of images with and without ducks?
I ended up exporting as COCO then using roboflow to convert to YOLO format. Now I'm just using roboflow to annotate. I like it better and it actually lets me export to YOLOv8.
hey, your way is working perfectly!!, but when I am taking multiple objects it is classifying all of them as one label. I believe the problem is in masks_to_polygon.py, I did the same thing as instructed by you for config.yaml. Can you tell me where I can be wrong
I trained the model in Colab and then downloaded it, but when I use Spyder for prediction with that last weights model I get an error saying no attribute 'data'. How can I solve this?
would the segment anything model be better for this task? I'm trying to segment plants from a herbarium collection, they are full dried plants pressed on white paper sheets and scanned into digital images, but there is a paper label with collection data and a stamp getting in the way of my automatic segmentation attempts. Im a bit confused on what would be the best method to accomplish the task of extracting the plant from the background ( I also may want to segment pieces of the plant, like leaves, flowers, stems). So far it seems to me the best method would be to train YOLO to detect the plant and draw a bounding box around it and then use SAM to make a mask of the plant inside the box (or multiple masks for the pieces of the plant) . Does this make sense?
@@ComputerVisionEngineer I do now. I trained it to draw bounding boxes around the plants using your other tutorial video, it performs really well. Now I'm going to try to use the bounding boxes as prompts for SAM (segment anything model) to extract detailed masks of the plants. Wish me luck!
Hi Felipe, I have been working with the files you shared and adapted them to my needs. I did all the labeling in CVAT beforehand, but I have a question, because the training does not seem to be working. In the "config.yaml" file there are two lines you did not explain: "nc:1" (which I assume is the number of classes created in CVAT) and "names:['...']" (which I assume are the names given to the classes in CVAT). Assuming this, I adapted it to my needs (nc:7 names: ['Sin arandela', 'Arandela OK', 'Arandela rota', ...ETC ]), but in the run - weights file, in the image collage it produces, only the first label, "Sin arandela", appears. Can you tell me what I might be doing wrong? I trained for 100 epochs, with 187 photos in TRAIN and 46 photos in VAL.
Hi Julio, 187 + 46 images seem like too few to train an algorithm of this kind, especially considering that you have 7 classes. Did you adapt the masks to work with 7 classes?
@@ComputerVisionEngineer Hi Felipe. How many do you think would be a good base for developing a good script? The problem is that right now I am the only person on the project, so I cannot collect a very large number, at least until I can show results and get an additional person assigned. As for adapting the masks, I do not understand what you mean; I am just getting started in this field. I am not sure whether this answers your question, but throughout the labeling in CVAT I used all the labels, with roughly 2 to 3 labels per image.
@@juliogomez6065 The tutorial in this video is for single-class semantic segmentation. For multi-class segmentation you need to study the yolov8 documentation to see how to build the masks. The masks I used in the video are binary and only work for single-class segmentation (white=object, black=background). As for the number of images, it all depends... but I would suggest at least a few thousand; for example, in this video I use ~3000 images for single-class segmentation.
Thank you so much for your lovely content. Indeed, it is very informative. However, I have a question about data handling. I noticed in the images folder you shared, the duck images are in both the train and val folders. Shouldn't it be that the train folder contains only duck images and the val folder is with non-duck images? Looking forward to your clarification. Thanks!
The val folder contains the validation data, this is how you validate the model. If the model detects ducks, it's appropriate to use images with ducks as validation data.
many thanks for your prompt response but I have a big challenge of using a webcam to detect ducks using the following line: model.predict(source=0, show=True, conf=0.2) it has a huge lag. can you help me how to resolve this to be real-time detection?
I followed everything exactly but for some reason my val_batch0_pred has no segmentations on it. Even though the val_batch0_labels is segmented perfectly. I think this is probably the reason why I'm getting "AttributeError: 'NoneType' object has no attribute 'data'" when I try running the code. The object I'm trying to detect and the images given are very simple and easy, the model should not be struggling with this at all. What can I do?
@@ComputerVisionEngineer no, I'm using my own data set, which is a lot smaller, because I don't have many images of the thing I'm trying to detect. It's a proprietary pH indicator test, so not many images exist and getting more is not an option. I have 6 images for training and 4 for validation. I tried with 10, 50 and 100 epochs but still not a single detection on val_batch0_pred
@@ComputerVisionEngineer I've seen other people on github who have had more images and everything have the same issue, but none of them really got an answer. Or at least not one that is relevant in my case. My validation pictures are very similar to the training ones so the model should have no issues, idk what's wrong.
@@fawazmirza4646 oh I see. 10 images is usually not enough to train this type of model. Training for that many epochs on 6 images will produce overfitting.
So what do you suggest I do with the small data I have? What machine learning method, if any should I try? Or is there a way to make yolov8 work for my case?
I have a question, I see you used the masked label (txt) data for training, What is the process to train the model directly using mask and original samples without any txt data on YoloV8? I have mask image but don't have any text data.
You need to convert the masks into the txt files in order to train the model with yolov8. I have a Python script in this project's github repository that may help you to do that. 🙌
hello! thank you for your video! I have a question regarding using the prediction to predict segmentation from the image. From my results, It states it indicates 2 ducks in my image (which has 2 ducks) however, the outcome only displays 1 image segmentation. What should I do if I want both image segmentations to be predicted? Thank you!
If I am not mistaken when hosted locally you don't need an account with cvat, but each user needs to create an account in your locally hosted cvat app. 🙌
I have a question: How do you download the images from google datasets? Can you make a video explaining that process? Seems like a dumb process, but I really don't know how to do that
I am currently preparing a Python script to download a semantic segmentation dataset from the google open images dataset. It will be available in my Patreon soon. 🙌
i fixed it. it was because of the mask transformation to yolo files. the txt files had all 0 as the class (the very first number of the .txt file). and i manually changed 0's according to the images with the same name @@ComputerVisionEngineer
@@akifakbulut765 using a GPU would be a good way to try to reduce the execution time, if you don't have a GPU in your local computer you could consider using something like an EC2 instance from AWS.
Thanks for the video! Do you have a masks_to_polygons script that would also work for multiple segmentation classes? Or do you know where I would find one? Have been looking for ages..
I don't have a multiclass masks_to_polygons script, but I think you could create one taking my one class script as baseline. Maybe chatgpt can help you adapting the script to multiclass. 💪💪
I'd annotate my images in Inkscape or Illustrator, using paths as the masks, save it to SVG, then just convert SVG to YOLO format. Straight text-to-text conversion, more or less. All the info you need to normalize the vertices is in the SVG. To create multiple classes, you could group the classes, or probably the better thing would be to give each path a custom xml attribute for the object class.
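The SVG idea above could look roughly like this. This is only a sketch under several assumptions that are not from the comment: the shapes are plain `<polygon>` elements rather than curved paths, the root `width` and `height` are unitless pixel values, no transforms are applied, and the class index lives in a custom attribute that is called `data-class` here purely for illustration.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"


def svg_polygons_to_yolo(svg_text, class_attr="data-class"):
    """Return one YOLO segmentation line per <polygon> in the SVG document."""
    root = ET.fromstring(svg_text)
    width = float(root.get("width"))    # assumed to be plain pixels
    height = float(root.get("height"))
    lines = []
    for poly in root.iter(SVG_NS + "polygon"):
        class_id = int(poly.get(class_attr, "0"))
        # "points" holds "x1,y1 x2,y2 ..."; commas and spaces are interchangeable
        values = [float(v) for v in poly.get("points").replace(",", " ").split()]
        normalized = []
        for x, y in zip(values[0::2], values[1::2]):
            normalized += [x / width, y / height]
        lines.append(" ".join([str(class_id)] + [f"{v:.6f}" for v in normalized]))
    return lines
```

Curved paths would first need to be flattened into straight segments, which this sketch does not attempt.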
Edit the 'nc' field to your number of classes, and edit the 'names' field so it contains all your class names. In case of multiclass segmentation you also need to edit your masks. 🙌
Hi and thank you for your video! I have noticed that, when annotating a dataset of multiple images with CVAT with two labels, the export phase goes wrong and not all the segmentation masks are created correctly. Some of them contain for example just one class of objects even though I had previously annotated objects of different classes in that picture. Do you know how to solve it? Is there any other annotation tool that allows to export the images of the segmented masks? Thank you
Hey, I just tried to do it and everything seems fine with a couple of images. I annotated two labels and exported it as 'Segmentation mask 1.1', are you using this export format?
Very well done , however my code still errors that . The only things you aren't explaining very well is WHAT goes inside each of the directories. You've explained that its " images to train the model" and " to validate the model", however I cant tell if IMAGES\TRAIN contains 1) images the masks were trained from 2) images of the masks 3) or bulk unknown images to be analyzed IMAGES\VAL you said contains "images to validate the training model" , however I dont know which images those might be - 1)original bulk of all ducks known and unknown, 2) images of masks 3) or just the images that were used to create the masks) ...all same above questions with LABELS\TRAIN LABELS\VAL (you never mentioned that any files are inserted into this directory, or if this is the output ) Then, are any of the folders we created empty? finally, it would be great to see the results found in your runs\detect\train folder.
Thank you for the tutorial, it's one of the best i've seen in yolo. Would you be able to provide me some support on how to get the RGB masks from the inferences crop results? Cheers
This is one of the most informative I have seen on this topic. Yes, I would also like to know how to crop out the original region given by the predicted mask. I have been having a hard time with that @@ComputerVisionEngineer
That part of the duck is just called the webbed foot (palmeado). I have some questions I would like to ask you, computer vision related, because I don't know who else could I ask
Thanks a lot for the tutorial, however, I seem to run into the same problems as @dmitrium12. Somehow the runs/segment/train folder does not show mask predictions, and thus the graphs with train/loss and val/loss are just a dot in the middle of the graph. I have used your dataset and followed every step.
If evaluation plots are only a dot in the middle of the graph it means you are training for only 1 epoch. Increase the number of epochs and you should be able to see a different plot. 🙌
Hello! I found your video very interesting, and it's helping me a lot in my new job as a vision engineer. I managed to train the duck-segmenting algorithm following your steps - amazingly clear! I can see how it makes some batch predictions in the 'runs' folder for some of the images in 'val'. However, when I import the model 'last.pt' and I try to make predictions, I consistently get 'no detections' and 'masks: None'. Do you know what could be going on? Thanks a million😊
Hey Guillem, I am glad the video is helping you in your new job! 😃 How many epochs did you train the model for? Are you using the exact same dataset as I use in the video?
@@ComputerVisionEngineer Hi, I am getting the same error as stated above. No masks are generated for the predictions after 10 epochs using the same dataset and code you have given. Not sure what's going wrong.
@@sarthakdas815 I faced the same issue but solved it. That was because I used the masks that are in the "SegmentationObject" folder, we should use the masks that are in the "SegmentationClass" folder.
hello, im new to computer vision and I have a question. what is the most suitable algorithm/s or method/s for image steganalysis to detect the changed pixels in the stego image? i want to segment only the changed pixels in the stego image? can I use semantic segmentation also for this kind of problem?
Hey, I don't think that is a problem you can solve with semantic segmentation 🤔, but you can try! 😃 Regarding what are the most suitable methods for image steganalysis, I recommend you do a {Google, Github, Google Scholar} search, it is a field I haven't been involved in. 💪💪
@@ComputerVisionEngineer Okay, I'll search. Thank you. Btw, I really appreciate your effort in making really valuable videos related to CV for free. I learned so much from your channel. This is one of the best channels with real-world implementations for CV that I've seen on YouTube. Keep going 💪💪!!
Here, if we want to infer on an image, how do we do it? I tried doing:
from ultralytics import YOLO
# Load a model
model = YOLO('/content/yolov8n-seg.pt')  # load an official model
model = YOLO('/content/runs/segment/train/weights/best.pt')  # load a custom model
# Predict with the model
results = model('image.jpg')  # predict on an image
Output:
image 1/1 /content/gdrive/MyDrive/segmentation/data/images/train/11-03-22-ROHAN SANGHVI-DAUGHTER_S BEDROOM_page-0001.jpg: 480x640 (no detections), 10.6ms
Speed: 0.6ms preprocess, 10.6ms inference, 0.5ms postprocess per image at shape (1, 3, 640, 640)
It ran successfully but I cannot see where the result is saved. If the code is wrong, please tell me how I can change it for inference on a single image. Thank you.
@@ComputerVisionEngineer yes, after doing that the masks of those shapes can be seen. What should I do if I want the segmentation drawn on my test image or actual image, so that the bounding box can be seen in my output together with the segmented part?
@@vishalpahuja2967 I see, you would like a visualization as the one in the thumbnail, right? img + mask on top + bounding box, is that it? you can visualize the mask on top of the image by applying an overlay, take a look on how to do that, and about the bounding box take a look at my video on object detection with yolov8 + object tracking, the first part is about how to get bounding boxes with a yolov8 model and how to draw the bounding box on the image. 💪🙌
Another awesome tutorial, showing all the necessary steps! Wish you all the best.
Thank you! 😊🙌
This is a very cool manual, thank you for it, this is exactly what I wanted to see. I have always been surprised in your channel that you post all these materials and videos for free, because you could sell them.
Hey, glad you find it helpful! I really enjoy sharing my knowledge of computer vision with everyone! 😃💪 Selling courses is not a bad idea, though. Maybe I will do it in the future. 😊
Did you enjoy this video? Try my premium courses! 😃🙌😊
● Hands-On Computer Vision in the Cloud: Building an AWS-based Real Time Number Plate Recognition System bit.ly/3RXrE1Y
● End-To-End Computer Vision: Build and Deploy a Video Summarization API bit.ly/3tyQX0M
● Computer Vision on Edge: Real Time Number Plate Recognition on an Edge Device bit.ly/4dYodA7
● Machine Learning Entrepreneur: How to start your entrepreneurial journey as a freelancer and content creator bit.ly/4bFLeaC
Learn to create AI-based prototypes in the Computer Vision School! www.computervision.school 😃🚀🎓
I must say that only your videos are helping me with computer vision project. All others do not work. Thank you, from Serbia
You're doing an amazing job! This is a really good video. Keep making more videos like this one!
Thank you for your support! I will keep on making videos like this one! 😃🙌
Wow! was waiting for this video.
Thank you!
😃 You are welcome! 💪🙌
Hi, thank you for your video, but I have a problem here. The labels in the val part should be the same as the labels we did for the images in the same directory, right? Should we convert them from the image format to numerical format in the code? Or when I downloaded the dataset, the labels for the val part came in this format: "duck 56.96 124.29848700000001 1023.36 421.58926299999996" for each image. Is this the correct format? When I did the second one, I got an error in VS Code. When I did the first one, I got the following output in Google Colab:
"Epoch GPU_mem box_loss seg_loss cls_loss dfl_loss Instances Size
10/10 0G 0 0 75.79 0 0 640: 100%|██████████| 5/5 [01:23
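The quoted "duck 56.96 ..." line looks like a named bounding box in pixel units, not a YOLOv8 segmentation label, which is a class index followed by normalized x y pairs for the polygon. Zero instances in the training log may mean no usable labels were read. A small sanity check along these lines can flag lines that do not fit the format; the helper name `is_yolo_seg_line` is made up for illustration:

```python
def is_yolo_seg_line(line):
    """Check that a label line is '<class index> x1 y1 x2 y2 ...' with values in [0, 1]."""
    parts = line.split()
    # One class index plus an even number of coordinates, at least 3 points
    if len(parts) < 7 or len(parts) % 2 == 0:
        return False
    if not parts[0].isdigit():  # the class must be an integer index, not a name
        return False
    try:
        coords = [float(v) for v in parts[1:]]
    except ValueError:
        return False
    return all(0.0 <= v <= 1.0 for v in coords)
```

Running it over every line of every .txt file in the labels folders is a quick way to spot files that still hold the downloaded annotation format.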
Thank you! Your tutorials are awesome!
Nice class sir. You explained to finetune YOLO in a simple way. Thank you
You are welcome! 😃🙌
please , how can i get the code at 11:40 in the video? type in by myself?
Great tutorial, everything explained very well. You saved me :D
😃 I am glad you enjoyed it! 🙌
Can you give me the steps for downloading the dataset and the annotation masks from Open Images?
I have the same question
@@ialbornoz Did you ever figure out how to do it?
@@felipe_gf ...
Thank you for this video.
I have a multiclass problem 10 classes + background. How can I convert the masks to yolo labels considering the right arrangement of the labels?
Did you get an answer? Please can you help me with it? I also have a multi-class problem and can't seem to find the code to convert the masks to labels. Please!
Also, how can we add background? I mean, if it's not a separate class, how can we specify it as background?
@@haseebkhawaja1050 I exported my annotations as a JSON file. You can then import the data from the JSON file into the Roboflow annotator, and from the Roboflow annotator the .txt files for yolov8 can be exported. You will also find a lot of code online that can transform the JSON file to the yolov8 .txt format or to a binary mask.
i can make a hot encoded with classes numbers and a label map. class 1: 1 , class 2: 2. 1:(0,0,0) 2:(255,255,255)
HI, I have a project wherein, I have to segment multiple classes, how do i go about it? What changes do I need to make in the code?
Actually, this one bothers me too.
If you find anything on it, please help. I found one solution, ROBOFLOW, which generates the labels automatically after annotating, so there is no need to go through masks and then convert them to labels (polygons).
Add classes in the code. If you have multiple objects to detect, then you need to add more classes in the code.
You need to edit the config file, in which nc = number of labels and names = names of your labels, like
nc : 4
names : ['duck' , 'cat' , 'cow' , 'horse' ]
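To keep `nc` and `names` from drifting apart, the config text can be generated from a single list of class names. A minimal sketch, assuming the usual `path` / `train` / `val` / `nc` / `names` keys of an Ultralytics dataset YAML; the dataset root and folder names below are placeholders, not values from the video:

```python
def build_data_config(root, names, train="images/train", val="images/val"):
    """Build the text of a dataset config where nc always equals len(names)."""
    lines = [
        f"path: {root}",
        f"train: {train}",
        f"val: {val}",
        f"nc: {len(names)}",
        "names: [" + ", ".join(f"'{n}'" for n in names) + "]",
    ]
    return "\n".join(lines) + "\n"


# Placeholder dataset root; replace with your own absolute path
config_text = build_data_config("/path/to/data", ["duck", "cat", "cow", "horse"])
```

The position of each name in the list is its class index, so it has to match the first number on each line of the label .txt files.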
Use the color of each class in the mask to decide which class index each polygon gets:
import os
import cv2
import numpy as np

# Map each mask color to a class index.
# cv2.imread returns channels in BGR order, so the tuples are (B, G, R).
color_to_label = {
    (25, 25, 77): 0,   # dark red
    (61, 245, 61): 1,  # green
    (189, 18, 0): 2,   # blue when read as BGR
    # Add more colors and labels as needed for more classes
}
input_dir = 'myinput_path'
output_dir = 'myoutput_path'
for j in os.listdir(input_dir):
    image_path = os.path.join(input_dir, j)
    # Load the color mask
    mask = cv2.imread(image_path)
    H, W = mask.shape[:2]
    polygons = []
    for color, label in color_to_label.items():
        # Create a binary mask for the current color
        binary_mask = cv2.inRange(mask, np.array(color), np.array(color))
        contours, _ = cv2.findContours(binary_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        for cnt in contours:
            # Skip tiny contours, which are usually noise
            if cv2.contourArea(cnt) > 200:
                polygon = [label]
                for point in cnt:
                    x, y = point[0]
                    # Normalize by the mask width and height
                    polygon.append(x / W)
                    polygon.append(y / H)
                polygons.append(polygon)
    # Save the polygons with labels, one polygon per line
    with open(os.path.join(output_dir, os.path.splitext(j)[0] + '.txt'), 'w') as f:
        for polygon in polygons:
            f.write(' '.join(map(str, polygon)) + '\n')
Dear Tutor,
Greetings! I am getting the following error while running the code at time stamp 44:35
'for j, mask in enumerate(result.masks.data):
AttributeError: 'NoneType' object has no attribute 'data'
Can you please help me out ?
Hey, it may be that you are not detecting any objects in that image. Have you tried with other images?
@@ComputerVisionEngineer I tried on three or four other images in val set, but I am getting the same error
Same problem for me
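As the reply above suggests, `result.masks` is `None` when the model detects nothing in an image, so accessing `.data` fails. A guard like the following avoids the crash and makes the "no detections" case explicit. The `SimpleNamespace` objects are only stand-ins for real prediction results so the sketch runs without a trained model:

```python
from types import SimpleNamespace


def masks_or_empty(result):
    """Return the list of masks for one result, or [] when nothing was detected."""
    masks = getattr(result, "masks", None)
    if masks is None:
        return []
    return list(masks.data)


# Stand-ins for real results: one without detections, one with a single mask
no_detection = SimpleNamespace(masks=None)
one_detection = SimpleNamespace(masks=SimpleNamespace(data=["mask_0"]))

for j, mask in enumerate(masks_or_empty(no_detection)):
    print(j, mask)  # never reached: the loop body is skipped
```

If every validation image comes back empty, the guard only hides the symptom; the model itself is not detecting anything, which points back at the training data, the labels, or the number of epochs.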
Thank you so much for the label creation code!!
thank you for the video again amazing job
You are welcome! Glad you enjoyed it! 😀💪
Thank you so much for your kind tutorial! Hope everything goes well for you!
😊 You are welcome! I am glad you enjoyed it. 🙂🙌
Wow. you are amazing bro. Thank you so much for teaching me this!!!
Thanks for the nice tutorial. :D
Thanks for the detailed explanation.
You are welcome! 😃
When I'm exporting the annotated files using segmentation mask 1.1, the zip I'm getting is just a single text file. Any idea what else I can do?
Thank you so much for this wonderful video. It helped me so much.
Thank you. Awesome tutorial.
Thank you! 😄
hey there does this code work for two classes or more than 1 class (mask to data points etc labels) please help how can we modify it to include more than 1 class please
If I have a segmented mask can I zoom in or out the masked object while the unmasked remaining image stays same ?
I need samples of C. elegans nematodes. Are they existing in the dataset?
this was very informative. Thank you
Hey, do you have a video or some tips for Yolov11 Segmentation?
Getting this error what could be the issue.
IndexError: list index out of range when i train my .yaml file
@18:50 starting with `epoch=1` to make sure everything is well established is a Golden Rule !
Yeah, 100% agreed! 😃
please sir i have an error and i didnt find the solution
its not possible to find the images, i dont know if the problem is on my config file. i have a big problem with it
In the config file, what is the "nc" variable referring to? Also, what are the instructions for the "names" variable? Does it contain object names? Do we decide what we want to call the object?
nc is the number of classes, yes you can name the objects whatever you want, the names won't affect the training process 🙌
@@ComputerVisionEngineer Thank you
Hi, I was just wondering how you installed the specific segmentation masks of ducks from open images, I couldn't figure it out on my own. I would greatly appreciate it if you lend me a hand
Hey, take a look at the 'Annotations and metadata' section of this page storage.googleapis.com/openimages/web/download_v7.html 🙌
@@ComputerVisionEngineer Should I only download the mask data? If so, after that have you written a script which creates a list of the image ID's and extracts the necessary annotations. Like in your object detection video, could you share it with me?
I have resorted to another solution after I couldn't figure out the exact thing I wanted to make: I downloaded all the zip files containing all the masks, all mask files have a notation like __.png for example: 00a3d94534a1b356_m0k4j_97c014cd.png I then wrote a script in python to filter only the masks with the wanted class label utilising multiprocessing and multithreading, the script saves the files as .png omitting the other parts. After running the script, I was able to obtain the binary masks only for my selected class. Now I'm going to annotate them using the polygonization script which you've provided, I might modify it a little. Anyway, thanks for your help!
@@Di0n-r5u Glad you solved it!
@@Di0n-r5u hey can you please share the script
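The original script was not shared in the thread, but the filtering step it describes can be sketched in a few lines. This assumes, based only on the example filename above, that mask names have three underscore-separated fields with the class label code in the middle; the function name and the non-matching names in the usage example are made up:

```python
import os


def select_masks(filenames, label_code):
    """Keep only the .png mask files whose middle field equals label_code."""
    selected = []
    for name in filenames:
        stem, ext = os.path.splitext(name)
        parts = stem.split("_")
        if ext.lower() == ".png" and len(parts) == 3 and parts[1] == label_code:
            selected.append(name)
    return selected


names = [
    "00a3d94534a1b356_m0k4j_97c014cd.png",  # example from the comment above
    "00a3d94534a1b356_mxxxx_11111111.png",  # made-up name with another label code
    "readme.txt",
]
wanted = select_masks(names, "m0k4j")
```

In practice `names` would come from `os.listdir` on the folder with the extracted masks, and the selected files would be copied to a separate folder before running the polygon conversion.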
Can you help me with exporting this model? I'm facing an error while exporting the model to tflite or pb format.
How can we get a .json file from this annotated image that carries the coordinates of the polygon mask in text format?
Does YOLO support 3D image segmentation? I have some 3D images in .obj format and want to segment them.
Great. What if we have 2 objects? Can you help with that too?
I need the code for polygons to mask. Can I get it, please?
Could I use A111 inpaint anything to create the masks for the training?
Not sure if I understand, but no, I don't think you could do that.
@@ComputerVisionEngineer ok, thanks
How can I label 2 classes and configure the training? We hope to receive feedback from you! Thank you, from Viet Nam
thanks you are very helpful, keep going bro
Thank you for your support!! 😃🙌
Thank you for this cool instruction!! I am annotating mice in a cage group-housed. Sometimes, animals go under the bedding and are scarcely visible. In case of no annotation made for an image, I do not get an annotation file created for the image. As a result, I have a mismatch in the number of image files and annotation files. Would this be a problem? I could delete the images that do not have any annotation. But I would rather keep them since no annotation is also a kind of annotation I guess. What would be your suggestion? Thank you in advance!!
Hi Philippe! I already tried without deleting any images. So the mismatch stayed. But it worked :)
Hey, thanks for the tutorials. I am new to computer vision. Currently, I am preparing a data set on the dry wall construction process. We have to detect dry wall stages like: Stud Installation, Gypsum Panelling, electrical works and plastering. However, I have some confusion about labelling the data. My question is: do I need to label each object in the image at the same time? Or should I focus on a single object in each image? Besides, we have only 250 images from the construction sites; are these enough for training?
What could be the reason for the following issue:
for j, mask in enumerate(result.masks.data):
AttributeError: 'list' object has no attribute 'masks'
If no detections were found, result may be an empty list. Print result and see what it looks like. Let me know how it goes. 🙌
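A note on the error above: in current ultralytics releases, calling the model returns a list with one result object per input image, so `.masks` has to be read from an element of that list (for example `results[0]`), and `.masks` itself is `None` when nothing was detected. The sketch below guards against both cases; `iter_masks` is a hypothetical helper name, not part of the library.

```python
def iter_masks(results):
    """Yield (image_index, mask_index, mask) for every predicted mask."""
    # Accept a single result object as well as a list of them.
    if not isinstance(results, (list, tuple)):
        results = [results]
    for i, result in enumerate(results):
        masks = getattr(result, "masks", None)
        if masks is None:  # no detections for this image
            continue
        for j, mask in enumerate(masks.data):
            yield i, j, mask


# Typical use (commented out because it needs ultralytics and trained weights):
# results = model(image_path)
# for i, j, mask in iter_masks(results):
#     print(i, j, mask.shape)
```

The same guard also avoids the "'NoneType' object has no attribute 'data'" error that shows up when a prediction contains no detections.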
Great video! I wonder if you could speak to the data visualization. How do you create the masks? And say I only have the Yolov8 annotation format (.png and corresponding .txt), any recommendation on how to visualize it by chance?
Thanks for the response earlier. How can I extend it to a dataset with multiple segmentation classes?
You would need to edit the config file and create the annotations accordingly. I may do a video about multiclass detection in the future.
it might seem simple to you, but I'd love to see just one example of utilizing a webcam.
It seems using the webcam is the most interesting use of YOLO, instead of cycling thru a bunch of still jpg images.
Ok, thank you for your feedback, next time I work with Yolo I will use a Webcam. 🙌
Can I use the CVAT annotation website for YOLOv5?
When creating the labels, do you divide by the dimensions of the mask or of the whole image?
I am trying to adapt your label creation process to handle multiple masks in one image.
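On the normalization question: YOLO segmentation labels store one object per line, as a class ID followed by polygon coordinates divided by the width and height of the whole image. If the mask has the same size as the image, dividing by the mask dimensions gives the same numbers. Below is a minimal sketch for several objects in one image; the helper names are hypothetical, and the polygons are assumed to be already extracted as pixel coordinates.

```python
def polygon_to_label_line(class_id, polygon, img_w, img_h):
    """Format one polygon (list of (x, y) pixel points) as a YOLO label line."""
    coords = []
    for x, y in polygon:
        coords.append(f"{x / img_w:.6f}")  # x normalized by image width
        coords.append(f"{y / img_h:.6f}")  # y normalized by image height
    return " ".join([str(class_id)] + coords)


def objects_to_label_text(objects, img_w, img_h):
    """Build the content of a label file from (class_id, polygon) pairs."""
    lines = [
        polygon_to_label_line(class_id, polygon, img_w, img_h)
        for class_id, polygon in objects
        if len(polygon) >= 3  # a polygon needs at least three points
    ]
    return "\n".join(lines)


objects = [
    (0, [(0, 0), (320, 0), (320, 240)]),        # first object, class 0
    (1, [(320, 240), (640, 240), (640, 480)]),  # second object, class 1
]
print(objects_to_label_text(objects, img_w=640, img_h=480))
```

Each mask in the image becomes its own line, so multiple masks (and multiple classes) only require calling the first function once per object.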
You are a hero
Thank you! 😃🙌
Good stuff, thanks
Hi Philippe, awesome tutorial! I really like your style 😊 And I have a question for you. What is the best way to make a dataset for damage detection on cars, machined products, or imperfections/dirt on railroads? Semantic segmentation like you did in this video or object detection like you did with the alpacas? Regards and keep up your great work. Dom😊
My dataset has two classes, and after using your Python file to convert it, I found that the txt file contains only one class (the class labeled 0), although the image clearly has two objects of two classes. How can I fix this error?
Hi, what Python file to convert do you mean?
Hello sir, I just watched your video and it is great. A simple query about training: should we also include images that don't have ducks, or is it fine to use only images with ducks? Which is best for fine-tuning the model to detect ducks in images? In simple words, should the training dataset contain only images with ducks, or a mixture of images with and without ducks?
Amazing video 👍🏻👍🏻👍🏻
😀💪
But how do we export from cvat into yolo format if we have more than just one class?
I will try to make a video about multi class image segmentation in the future 🙌
I ended up exporting as COCO then using roboflow to convert to YOLO format. Now I'm just using roboflow to annotate. I like it better and it actually lets me export to YOLOv8.
Hey, your way is working perfectly! But when I use multiple objects, it classifies all of them as one label. I believe the problem is in masks_to_polygon.py; for config.yaml I did the same thing you instructed. Can you tell me where I might be going wrong?
Hello, my weights folder is empty. How can I overcome this problem? It may be an issue with my training code.
Probably your training process is not being completed. Do you see any error? Have you tried to train the model from a google colab?
@@ComputerVisionEngineer Thank you very much. No, I didn't try Colab; I used Spyder.
Thank you, I used Colab and I got the weights.
I trained the model in Colab and then downloaded it, but when I use Spyder for prediction with the last weights, I get an error saying no attribute 'data'. How can I solve this?
Hello, can we use Spyder for this without using Colab?
I am not able to download the dataset.
would the segment anything model be better for this task? I'm trying to segment plants from a herbarium collection, they are full dried plants pressed on white paper sheets and scanned into digital images, but there is a paper label with collection data and a stamp getting in the way of my automatic segmentation attempts. Im a bit confused on what would be the best method to accomplish the task of extracting the plant from the background ( I also may want to segment pieces of the plant, like leaves, flowers, stems). So far it seems to me the best method would be to train YOLO to detect the plant and draw a bounding box around it and then use SAM to make a mask of the plant inside the box (or multiple masks for the pieces of the plant) . Does this make sense?
It makes sense, although you would need a yolo model trained on detecting plants, do you have one?
@@ComputerVisionEngineer I do now. I trained it to draw bounding boxes around the plants using your other tutorial video, it performs really well. Now I'm going to try to use the bounding boxes as prompts for SAM (segment anything model) to extract detailed masks of the plants. Wish me luck!
@@mpfmax0 Good luck! Let me know how it goes! 😃
Hi Felipe, I have been working with the files you shared and adapted them to my needs. I did all the labeling in CVAT beforehand, but I have a question because the training does not seem to be working. In the "config.yaml" file there are two lines you did not explain: "nc:1" (which I assume is the number of classes created in CVAT) and "names:['...']" (which I assume are the names assigned to the classes in CVAT). The problem is that, assuming this, I adapted it to my needs (nc:7 names: ['Sin arandela', 'Arandela OK', 'Arandela rota', ...ETC ]), and in the "run - weights" output, in the image collage it produces, only the first label, "Sin arandela", appears. Could you tell me what I might be doing wrong? I trained for 100 epochs, with 187 photos in TRAIN and 46 photos in VAL.
Hi Julio, 187 + 46 images seems like too few to train an algorithm of this type, especially considering you have 7 classes. Did you adapt the masks to work with 7 classes?
@@ComputerVisionEngineer Hi Felipe. How many do you think would be a good base for developing a good script? The problem is that right now I am the only one on the project, so I cannot have a very large number, at least until I can show results and get an additional person assigned. As for adapting the masks, I do not understand what you mean; I am just getting started in this field. I am not sure if this answers your question, but throughout the labeling in CVAT I used all the labels, with roughly 2 to 3 labels per image.
@@juliogomez6065 The tutorial in this video is for single-class semantic segmentation. For multi-class segmentation you need to study the yolov8 documentation to see how to build the masks. The masks I used in the video are binary and only work for single-class segmentation (white=object, black=background). As for the number of images, it all depends... but I would suggest at least a few thousand; for example, in this video I use ~3000 images for single-class segmentation.
Is there any reference on how to save the segmented object as it's own image?
Do you mean saving the mask you get at the output?
what is speed on segmentation, will I be able to use it on live video?
You can make it work in roughly real time, if I'm not mistaken. Let me know how it goes. 🙌
Thank you so much for your lovely content. Indeed, it is very informative. However, I have a question about data handling. I noticed in the images folder you shared, the duck images are in both the train and val folders. Shouldn't it be that the train folder contains only duck images and the val folder is with non-duck images? Looking forward to your clarification. Thanks!
The val folder contains the validation data, this is how you validate the model. If the model detects ducks, it's appropriate to use images with ducks as validation data.
Many thanks for your prompt response, but I have a big challenge using a webcam to detect ducks with the following line:
model.predict(source=0, show=True, conf=0.2)
It has a huge lag.
Can you help me resolve this so that detection runs in real time?
I followed everything exactly but for some reason my val_batch0_pred has no segmentations on it. Even though the val_batch0_labels is segmented perfectly. I think this is probably the reason why I'm getting "AttributeError: 'NoneType' object has no attribute 'data'" when I try running the code. The object I'm trying to detect and the images given are very simple and easy, the model should not be struggling with this at all. What can I do?
Hey, are you using the same dataset as I did in the video? How many epochs are you training?
@@ComputerVisionEngineer No, I'm using my own dataset, which is a lot smaller because I don't have many images of the thing I'm trying to detect. It's a proprietary pH indicator test, so not many images exist and getting more is not an option.
I have 6 images for training, and 4 for validation. I tried with 10, 50 and 100 epochs but still not a single detection on val_batch0_pred.
@@ComputerVisionEngineer I've seen other people on GitHub who had more images and still have the same issue, but none of them really got an answer, or at least not one that is relevant in my case.
My validation pictures are very similar to the training ones, so the model should have no issues. I don't know what's wrong.
@@fawazmirza4646 oh I see. 10 images is usually not enough to train this type of model. Training for that many epochs on 6 images will produce overfitting.
So what do you suggest I do with the small data I have? What machine learning method, if any should I try? Or is there a way to make yolov8 work for my case?
I have a question, I see you used the masked label (txt) data for training, What is the process to train the model directly using mask and original samples without any txt data on YoloV8? I have mask image but don't have any text data.
You need to convert the masks into the txt files in order to train the model with yolov8. I have a Python script in this project's github repository that may help you to do that. 🙌
Thank you brother. I found the file named as "masks_to_polygons". ❤
Hello! Thank you for your video! I have a question about using the model to predict the segmentation of an image. My results indicate 2 ducks in my image (which has 2 ducks); however, the output only displays 1 segmentation. What should I do if I want both segmentations to be predicted? Thank you!
Hey, when you say the outcome only displays 1 image segmentation you mean it only covers one of the two ducks?
@@ComputerVisionEngineer Yes. Can the code detect 2 ducks instead, or does it only segment one duck?
This is very cool, thanks a lot! Could you make a similar video for the SAM?
no attribute called data... Error
8:33 - It's called duck feet. The arms of the duck are the wings.
Oh right the arms are the wings. 'Duck feet', ok, noted. Thank you! 🙌
do you need an account with cvat even when you host it locally?
If I am not mistaken when hosted locally you don't need an account with cvat, but each user needs to create an account in your locally hosted cvat app. 🙌
What if there is more than one class? Will the same method to convert to polygons work?
If there is more than one class, the same script will not work; you would need to adjust it to deal with multi-class masks. 🙌
@@ComputerVisionEngineer Any idea how we can do that? I tried but was not able to find a concrete solution.
@@kagadevishal5008 I also tried many things, but in the end went with Roboflow, which does this automatically.
Hi! Thanks for the video, it's helping a lot!!!
I have a question,
What version of tensorboard and numpy do you use?
I noticed that images\train has around 1800 files - unlike in the video, and labels\train has 3965 files. Is that an issue?
Hey, you should have the same number of images and label files. Yolov8 will probably trigger an error in any other case.
I have a question: How do you download the images from google datasets? Can you make a video explaining that process? Seems like a dumb process, but I really don't know how to do that
I am currently preparing a Python script to download a semantic segmentation dataset from the google open images dataset. It will be available in my Patreon soon. 🙌
@@ComputerVisionEngineer
I am predicting watermelons, pineapples and blackberries. My model can predict the objects but call them all watermelons. Do you have any idea?
Take a look at your training data. Perhaps you need to train the model with more data. 🙌
I fixed it. It was because of the mask conversion to YOLO files: the txt files all had 0 as the class (the very first number of the .txt file), and I manually changed the 0s according to the images with the same name. @@ComputerVisionEngineer
Hi, thank you for this video. can you convert the Yolo label to a binary mask?
Hey Zeinab, yes, it is possible! 😀
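One way to do this conversion, as a rough dependency-free sketch: parse each label line back into pixel coordinates, then mark every pixel whose center falls inside the polygon (even-odd rule). The function names are hypothetical. For real image sizes this pure-Python loop is slow, and a polygon fill routine such as OpenCV's `cv2.fillPoly` would be the usual choice.

```python
def parse_label_line(line, img_w, img_h):
    """Return (class_id, polygon in pixel coordinates) for one label line."""
    parts = line.split()
    class_id = int(parts[0])
    values = [float(v) for v in parts[1:]]
    polygon = [
        (values[i] * img_w, values[i + 1] * img_h)
        for i in range(0, len(values) - 1, 2)
    ]
    return class_id, polygon


def point_in_polygon(px, py, polygon):
    """Even-odd rule: count polygon edges crossed by a ray going right."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > py) != (y2 > py):  # edge spans the ray's y coordinate
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def label_to_mask(label_text, img_w, img_h):
    """Rasterize all polygons of a label file into a 0/255 binary mask."""
    mask = [[0] * img_w for _ in range(img_h)]
    for line in label_text.splitlines():
        if not line.strip():
            continue
        _, polygon = parse_label_line(line, img_w, img_h)
        for y in range(img_h):
            for x in range(img_w):
                if point_in_polygon(x + 0.5, y + 0.5, polygon):
                    mask[y][x] = 255
    return mask


mask = label_to_mask("0 0.25 0.25 0.75 0.25 0.75 0.75 0.25 0.75", 8, 8)
for row in mask:
    print("".join("#" if v else "." for v in row))
```

The class ID is ignored here because the output is a single binary mask; for multi-class masks it could be used to pick a different pixel value per class.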
Each frame reading interval is 53 milliseconds in ultralytics. How can we reduce this interval to 33 milliseconds?
Hey, do you mean the inference is taking 53 ms per frame?
@@ComputerVisionEngineer Yes, the elapsed time between each frame is 53 milliseconds
@@akifakbulut765 are you using a GPU?
@@ComputerVisionEngineer No
@@akifakbulut765 using a GPU would be a good way to try to reduce the execution time, if you don't have a GPU in your local computer you could consider using something like an EC2 instance from AWS.
Is it mandatory to convert masks to polygons, or can we label directly with the polygon template and train on that?
Converting masks to polygon is necessary in order to do semantic segmentation with yolov8. 🙌
Thanks for the video! Do you have a masks_to_polygons script that would also work for multiple segmentation classes? Or do you know where I would find one? Have been looking for ages..
I don't have a multiclass masks_to_polygons script, but I think you could create one using my one-class script as a baseline. Maybe ChatGPT can help you adapt the script to multiclass. 💪💪
I'd annotate my images in Inkscape or Illustrator, using paths as the masks, save it to SVG, then just convert SVG to YOLO format. Straight text-to-text conversion, more or less. All the info you need to normalize the vertices is in the SVG. To create multiple classes, you could group the classes, or probably the better thing would be to give each path a custom xml attribute for the object class.
Just a query, If I wanted to train it on multiple classes, how would I go about editing the config file?
Edit the 'nc' field to your number of classes, and edit the 'names' field so it contains all your class names. In case of multiclass segmentation you also need to edit your masks. 🙌
@@ComputerVisionEngineer please can you share code for masks to labels also please for multi class. Help would be much appreciated
Hi and thank you for your video! I have noticed that, when annotating a dataset of multiple images with CVAT with two labels, the export phase goes wrong and not all the segmentation masks are created correctly. Some of them contain for example just one class of objects even though I had previously annotated objects of different classes in that picture. Do you know how to solve it? Is there any other annotation tool that allows to export the images of the segmented masks?
Thank you
Hey, I just tried to do it and everything seems fine with a couple of images. I annotated two labels and exported it as 'Segmentation mask 1.1', are you using this export format?
You can just manually correct the .txt files. The first number in the file represents the class; all of them might be 0 in your case.
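Instead of editing each file by hand, the class ID could be rewritten with a few lines of Python. This is a minimal sketch with a hypothetical helper name; it sets the same class for every object in a label file, so it only fits files where all objects belong to one class.

```python
def set_class_id(label_text, class_id):
    """Replace the leading class ID on every non-empty label line."""
    out = []
    for line in label_text.splitlines():
        parts = line.split()
        if parts:
            parts[0] = str(class_id)  # the first number is the class
            out.append(" ".join(parts))
    return "\n".join(out)


original = "0 0.1 0.2 0.3 0.4 0.5 0.6\n0 0.5 0.5 0.6 0.6 0.7 0.5"
print(set_class_id(original, 2))
```

Applied to a folder, this would be wrapped in a loop that reads each .txt file, picks the class for that image, and writes the result back.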
Great explanation 👍👍
Thank you! Glad you enjoyed the video. 😊🙌
Great job, man! Thanks a lot! Btw, does this segmentation project work in c2.capread? Or can I use it to segment objects in video?
C2.capread? What do you mean?
@@ComputerVisionEngineer Oh, sorry, I was mistaken. Does it work with cv2.VideoCapture?
Sure! You can read frames using cv2.VideoCapture and then input each frame into the model to get the mask 🙌
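The frame-by-frame approach described in this reply could look roughly like the sketch below. It assumes `opencv-python` and `ultralytics` are installed; the function name `segment_video` and the weights path in the usage comment are illustrative assumptions, not code from the video.

```python
def segment_video(source, weights, conf=0.25):
    """Run a YOLOv8 segmentation model on every frame of a video source."""
    # Imported here so the function can be defined without these packages.
    import cv2
    from ultralytics import YOLO

    model = YOLO(weights)
    cap = cv2.VideoCapture(source)  # 0 = default webcam, or a video file path
    try:
        while True:
            ok, frame = cap.read()
            if not ok:  # end of stream or camera error
                break
            # One frame in, one result out; masks are in result.masks (or None).
            result = model(frame, conf=conf, verbose=False)[0]
            annotated = result.plot()  # frame with masks and boxes drawn on it
            cv2.imshow("segmentation", annotated)
            if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()


# Example call (assumed weights location, adjust to your own run folder):
# segment_video(0, "runs/segment/train/weights/last.pt")
```

How close this gets to real time depends mostly on the hardware and the model size, so a GPU or a smaller model is the first thing to try if the display lags.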
@@ComputerVisionEngineer thanks for getting back to me! So I'll try to use it on livestream)
Very well done, however my code still errors.
The only thing you aren't explaining very well is WHAT goes inside each of the directories. You've explained that it's "images to train the model" and "to validate the model"; however, I can't tell if IMAGES\TRAIN contains
1) images the masks were trained from
2) images of the masks
3) or bulk unknown images to be analyzed
IMAGES\VAL
you said it contains "images to validate the training model"; however, I don't know which images those might be: 1) the original bulk of all ducks, known and unknown, 2) images of masks, 3) or just the images that were used to create the masks
...all same above questions with
LABELS\TRAIN
LABELS\VAL (you never mentioned whether any files are inserted into this directory, or if this is the output)
Then,
are any of the folders we created empty?
finally, it would be great to see the results found in your runs\detect\train folder.
Hi, you can download the data from the github repository if I am not mistaken.
@@ComputerVisionEngineer the images are not in github, nor any folders
Is the validation dataset a separate data set?? Is it necessary?
It is not absolutely necessary, but it is a good practice to use a different dataset as validation set.
Thank you for the tutorial, it's one of the best i've seen in yolo. Would you be able to provide me some support on how to get the RGB masks from the inferences crop results? Cheers
Do you mean how to crop the original rgb image in the region given by the predicted mask?
This is one of the most informative I have seen on this topic. Yes, I would also like to know how to crop out the original region given by the predicted mask. I have been having a hard time with that
@@ComputerVisionEngineer
@@anisiobinzubechi I will try to make a video about it soon.
That part of the duck is just called the webbed foot (palmeado). I have some questions I would like to ask you, computer vision related, because I don't know who else could I ask
Oh, webbed foot! 🦆 Cool, thank you!
Sure, you can ask me on discord. 💪🙌
@@ComputerVisionEngineer I didn't know about the Discord! On my way! Thanks!!
@ComputerVisionEngineer Can you make a video on fine-tuning the SAM model (Segment Anything Model) on a custom dataset?
I will try to.
Thanks a lot for the tutorial. However, I seem to run into the same problems as @dmitrium12. Somehow the runs/segment/train folder does not show mask predictions, and the graphs with train/loss and val/loss are just a dot in the middle of the graph.
I have used your dataset and followed every step.
sorry meant @guillemcobos1987
If evaluation plots are only a dot in the middle of the graph it means you are training for only 1 epoch. Increase the number of epochs and you should be able to see a different plot. 🙌
Hello! I found your video very interesting, and it's helping me a lot in my new job as a vision engineer. I managed to train the duck-segmenting algorithm following your steps - amazingly clear! I can see how it makes some batch predictions in the 'runs' folder for some of the images in 'val'. However, when I import the model 'last.pt' and I try to make predictions, I consistently get 'no detections' and 'masks: None'. Do you know what could be going on? Thanks a million😊
Hey Guillem, I am glad the video is helping you in your new job! 😃 How many epochs did you train the model for? Are you using the exact same dataset as I use in the video?
@@ComputerVisionEngineer Hi, I'm getting the same error as stated above. No masks are generated for the predictions after 10 epochs, using the same dataset and code you have given. Not sure what's going wrong.
@@sarthakdas815 I faced the same issue but solved it. That was because I used the masks that are in the "SegmentationObject" folder, we should use the masks that are in the "SegmentationClass" folder.
Hi, thanks for the video, it's helping a lot. Can we crop the segmented object instead of taking a mask? Do you have another video for this?
Hey, yes it is possible. I don't have another video for that.
And it would be cool to have a small cv2 script where you wave a picture of a duck in front of the cam and it highlights the duck.
Yeah, sounds like a good way to test the model.
Hi,
Can you make a video of getting output such as on your thumbnail?
Thank you!
Hey, next time I make a video about semantic segmentation I will make the output to look like that 💪
Hello, I'm new to computer vision and I have a question. What are the most suitable algorithms or methods for image steganalysis, to detect the changed pixels in a stego image? I want to segment only the changed pixels in the stego image. Can I also use semantic segmentation for this kind of problem?
Hey, I don't think that is a problem you can solve with semantic segmentation 🤔, but you can try! 😃 Regarding what are the most suitable methods for image steganalysis, I recommend you do a {Google, Github, Google Scholar} search, it is a field I haven't been involved in. 💪💪
@@ComputerVisionEngineer Okay, i'll search. thank you. btw, I really appreciate your effort in making really valuable videos related to CV for free. I learned so much from your channel. this is one of the best channels with real-world implementations for CV that I've seen on UA-cam. keep going 💪💪!!
@@thisurawz 😃 Thank you so much for your support! 💪🙌
Can anyone share the dataset?
Hey, I will upload the dataset shortly 💪
Thanks a Lot
have you seen meta's SAM
I have! Although I haven't tested so far. I should make a video about it later on! 💪
@@ComputerVisionEngineer thank you for your videos. they are truly amazing and very very educative and fun to watch
@@oi4252 😊 I am so happy you enjoy them!
If we want to run inference on an image, how do we do it?
I tried doing:
from ultralytics import YOLO
# Load a model
model = YOLO('/content/yolov8n-seg.pt') # load an official model
model = YOLO('/content/runs/segment/train/weights/best.pt') # load a custom model
# Predict with the model
results = model('image.jpg') # predict on an image
Output:
image 1/1 /content/gdrive/MyDrive/segmentation/data/images/train/11-03-22-ROHAN SANGHVI-DAUGHTER_S BEDROOM_page-0001.jpg: 480x640 (no detections), 10.6ms
Speed: 0.6ms preprocess, 10.6ms inference, 0.5ms postprocess per image at shape (1, 3, 640, 640)
It ran successfully, but I cannot see where the output is saved.
If the code is wrong, please tell me how I can change it to run inference on a single image.
Thank you.
Hey, take a look at the tutorial. In the last chapter I show you how to make predictions with the model you trained. 🙌
@@ComputerVisionEngineer Yes, after doing that the masks of those shapes can be seen. What should I do if I want the segmentation drawn on my test image or actual image, so that the bounding box can be seen in my output along with the segmented part?
@@vishalpahuja2967 I see, you would like a visualization like the one in the thumbnail, right? Image + mask on top + bounding box, is that it? You can visualize the mask on top of the image by applying an overlay; take a look at how to do that. As for the bounding box, take a look at my video on object detection with yolov8 + object tracking; the first part is about how to get bounding boxes with a yolov8 model and how to draw them on the image. 💪🙌