There isn't a tutorial specifically covering timestamps. For more comprehensive guidance, feel free to inquire in our community! GitHub Issues: github.com/ultralytics/ultralytics/issues GitHub Discussions: github.com/orgs/ultralytics/discussions Thanks, Ultralytics Team!
@@user-firebender To achieve this, you would need to modify the internal code. For more effective responses to your queries, we recommend posting them on our GitHub: github.com/ultralytics/ultralytics/issues
Certainly, it's feasible. You can use the code below to display only selected class labels.
```python
import cv2
from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator, colors

model = YOLO("yolov8n.pt")
names = model.names

cap = cv2.VideoCapture("path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"
w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
video_writer = cv2.VideoWriter("ultralytics.avi", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

while cap.isOpened():
    success, im0 = cap.read()
    if success:
        results = model.predict(im0, show=False)
        boxes = results[0].boxes.xyxy.cpu().tolist()
        clss = results[0].boxes.cls.cpu().tolist()
        annotator = Annotator(im0, line_width=4, example=names)
        if boxes is not None:
            for box, cls in zip(boxes, clss):
                if names[int(cls)] == "person":  # class name whose bbox you want to display
                    annotator.box_label(box, label=names[int(cls)])
        cv2.imshow("ultralytics", im0)
        video_writer.write(im0)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
        continue
    print("Video frame is empty or video processing has been successfully completed.")
    break

cap.release()
video_writer.release()
cv2.destroyAllWindows()
```
Thanks, Ultralytics Team!
Below is a code snippet for obtaining the coordinates of Oriented Bounding Boxes with Ultralytics YOLOv8.
```python
import cv2
from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator, colors

# Initialize YOLOv8 OBB model
model = YOLO("yolov8n-obb.pt")
names = model.names

# Open video file
cap = cv2.VideoCapture("Path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"

while cap.isOpened():
    success, im0 = cap.read()
    if success:
        # Make predictions on each frame
        results = model.predict(im0, show=False)
        pred_boxes = results[0].obb

        # Initialize Annotator for visualization
        annotator = Annotator(im0, line_width=2, example=names)

        # Iterate over predicted oriented boxes and draw them on the image
        for d in reversed(pred_boxes):
            box = d.xyxyxyxy.reshape(-1, 4, 2).squeeze()
            print("Bounding Box Coordinates : ", box)
            annotator.box_label(box, names[int(d.cls)], color=colors(int(d.cls), True), rotated=True)

        # Display annotated image
        cv2.imshow("ultralytics", im0)

        # Check for key press to exit
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
        continue
    break

# Release video capture and close windows
cap.release()
cv2.destroyAllWindows()
```
Thanks
Hello, a student here. I trained a YOLOv8m object detection model in Google Colab, ran predictions on images and videos, and I'm getting good results so far. However, I'm interested in how I could draw inferences from the video output. For instance, could I somehow get some observation tables with: classes (the objects detected) and detections (how many of them were detected throughout the video)? Would love to hear how I should proceed with this! I've been reading the documentation but I haven't figured it out yet. Thanks in advance!
If you want to get information about the predictions, i.e. class names, objects detected, and so on, you can work with the `model.predict` method; it provides all the information you need, but you have to format it according to your needs, e.g.:
```python
results = model.predict(im0, show=False)
boxes = results[0].boxes.xyxy.cpu().tolist()
clss = results[0].boxes.cls.cpu().tolist()
```
Thanks, Ultralytics Team!
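As a rough illustration of turning that into a summary table (not an official utility; the weights and video path are placeholders, and note this tallies per-frame detections, not unique tracked objects), you could accumulate counts per class name across the whole video:
```python
from collections import Counter

import cv2
from ultralytics import YOLO

model = YOLO("yolov8m.pt")  # or your custom-trained weights
cap = cv2.VideoCapture("path/to/video.mp4")  # placeholder path
counts = Counter()

while cap.isOpened():
    success, frame = cap.read()
    if not success:
        break
    results = model.predict(frame, verbose=False)
    # Accumulate one count per detected box, keyed by class name
    counts.update(model.names[int(c)] for c in results[0].boxes.cls.tolist())

cap.release()

# Print a small observation table: class vs. number of detections
print(f"{'Class':<20}{'Detections':>12}")
for name, n in counts.most_common():
    print(f"{name:<20}{n:>12}")
```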
I am an ME student, but somehow I need to work with YOLOv8 to graduate. I need to detect defects on 3D-printed objects with YOLOv8. I built my own custom dataset and trained on it. Unfortunately it's not detecting any defects. Now I am trying to train a new model with twice as many images and 100 epochs. Hope it will work. I am working on Colab.
Thank you for sharing your experience. We wish you the best of luck with your project, and we're here to assist you with any questions or issues that may arise. Thanks
Thanks! Certainly, to detect people using phones in a video or camera stream, utilize YOLOv5 or YOLOv8. You can train the model on a dataset with images or frames showing people with phones. Then, deploy the model for real-time detection.
There is currently no direct method to achieve this. The recommended approach is to retrain the model by fine-tuning it with annotations for all classes. This process will enable the model to detect all the specific objects you are interested in. Regards, Ultralytics Team!
Really? But that takes more time, doesn't it? In more detail, our project is a self-driving car: one model for a dataset about bumps, another model for a dataset about signs and traffic lights, another model for cars and pedestrians, and finally segmentation to detect the lane. But all models use YOLOv8. @@Ultralytics
Maybe then you can try running the multiple models using multi-threading; we have provided detailed information about this here: docs.ultralytics.com/modes/track/ Thanks, Ultralytics Team!
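A minimal sketch of that multi-threading idea, assuming two model files and a shared video source (all file names below are placeholders to adapt; the linked docs cover the tracking variant in more detail):
```python
import threading

import cv2
from ultralytics import YOLO


def run_model(weights, source):
    """Run one YOLO model on one source inside its own thread."""
    model = YOLO(weights)  # each thread gets its own model instance
    cap = cv2.VideoCapture(source)
    while cap.isOpened():
        success, frame = cap.read()
        if not success:
            break
        results = model.predict(frame, verbose=False)
        # ... handle results for this stream (bump model, sign model, etc.) ...
    cap.release()


# Placeholder weights/sources for illustration only
threads = [
    threading.Thread(target=run_model, args=("bumps.pt", "road.mp4"), daemon=True),
    threading.Thread(target=run_model, args=("signs.pt", "road.mp4"), daemon=True),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```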
@@Ultralytics Thank you for sharing. I ran the full code from the link provided, but there is an error that says: ... ValueError: too many values to unpack (expected 4) Why is that? How do I debug it? 😃
It sounds like there might be an issue with the unpacking of values in the code. Ensure you're using the latest versions of `torch` and `ultralytics`. If the issue persists, please share the specific line causing the error for more detailed help. 😊 For more guidance, check our documentation: docs.ultralytics.com/
Let's say I have a YOLO model trained on my custom dataset, and I have its weights in my runs folder as best.pt. Now I want to pass a single image and predict the classes present in it, along with the count, and I need to display the image with the bounding boxes that the model predicted. I am getting the count, but I don't know how to get the bounding boxes. Please help.
Sorry to hear that! Could you provide more details about the issue? Make sure you're using the latest versions of `torch` and `ultralytics`. You can also check our documentation docs.ultralytics.com/ for troubleshooting tips. 😊
Hey, I am working on a real-time object detection project which shows the number of cars in a parking lot and the number of empty spaces. What I want is to check whether a space is empty or not from a function, to perform some task. How can I extract the output in the form of text, in real time?
Hey there! For real-time object detection and extracting outputs as text, you can use the `predict` function in YOLOv8 to get the detection results. You can then process these results to count cars and determine empty spaces. Make sure you're using the latest versions of `torch` and `ultralytics`. For more details, check out our documentation: docs.ultralytics.com/modes/predict/. If you need further assistance, feel free to ask! 🚗📊
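As a rough sketch of that idea (the total number of spaces, the video source, and the COCO "car" class assumption are all placeholders to adapt), you can count detected cars per frame and derive the empty spaces as text:
```python
import cv2
from ultralytics import YOLO

TOTAL_SPACES = 50  # assumed capacity of the parking lot
model = YOLO("yolov8n.pt")
cap = cv2.VideoCapture("parking.mp4")  # placeholder source

while cap.isOpened():
    success, frame = cap.read()
    if not success:
        break
    # Class 2 is "car" in the COCO classes used by the pretrained model
    results = model.predict(frame, classes=[2], verbose=False)
    cars = len(results[0].boxes)
    print(f"Cars: {cars}, Empty spaces: {max(TOTAL_SPACES - cars, 0)}")

cap.release()
```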
You can use the code snippet below to retrieve the bounding box positions:
```python
import cv2
from ultralytics import YOLO

model = YOLO('yolov8n.pt')  # Load a pre-trained or fine-tuned model

# Process the image
source = cv2.imread('path/to/image.jpg')
results = model(source)

# Extract results
for box in results[0].boxes.xyxy.cpu():
    x_min, y_min, x_max, y_max = box.tolist()
    print("Position of Bounding box:", (x_min, y_min, x_max, y_max))
```
This snippet uses Ultralytics YOLOv8 to process an image, then iterates through the detected boxes and prints their positions.
Hello! 🌟 To predict parcels using a live video feed from your camera, you can use the `predict` mode in the Ultralytics YOLOv8 model. First, ensure you have the latest versions of `torch` and `ultralytics` installed. Then, you can run a command like this:
```bash
yolo predict model=path/to/your/model.pt source=0
```
This command will use your webcam (source=0) for live predictions. For more details, check out the Ultralytics documentation docs.ultralytics.com/modes/predict. If you encounter any issues, please share specific error messages or code snippets. Happy detecting! 📦✨ For more resources, visit our FAQ docs.ultralytics.com/help/FAQ/.
Yes, you can extract the output of Ultralytics YOLOv8 by following our docs: docs.ultralytics.com/usage/python/#__tabbed_3_2 The code snippets are available in the docs. Thanks Ultralytics Team!
@Ultralytics I don't think that's what muhammad meant. I'm pretty sure that he wanted the code that was being shown on THIS video. Not some generic example code. I would also like to request the code that was shown on this video. I'm just getting started on using YOLO, and it would help my understanding by a lot if I could replicate this. Thanks for the wonderful video tho! Much appreciated 👏 👏
We regularly update the code to enhance the user experience. The code provided above is the latest and can be used to extract the detection outputs easily. Thanks, Ultralytics Team!
You can achieve this effortlessly by leveraging the principles of instance segmentation. For coding implementation, feel free to explore our documentation page at: docs.ultralytics.com/guides/instance-segmentation-and-tracking/#__tabbed_1_1
Got it! To isolate and crop the segmented area, you can follow these steps:
1. Load the model and run inference:
```python
from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")
results = model.predict(source="path/to/your/video/frame.jpg")
```
2. Generate a binary mask and draw contours:
```python
import cv2
import numpy as np

img = np.copy(results[0].orig_img)
b_mask = np.zeros(img.shape[:2], np.uint8)
contour = results[0].masks.xy[0].astype(np.int32).reshape(-1, 1, 2)
cv2.drawContours(b_mask, [contour], -1, (255, 255, 255), cv2.FILLED)
```
3. Isolate the object using the binary mask:
```python
isolated = cv2.bitwise_and(img, img, mask=b_mask)
```
This will give you the original cropped image based on the segmentation shape. For more detailed steps, check out our guide: docs.ultralytics.com/guides/isolating-segmentation-objects/
Thank you for your kind words! 😊 You can find all the code and resources you need in the Ultralytics YOLO GitHub repository: Ultralytics YOLO GitHub github.com/ultralytics/ultralytics. For detailed documentation and tutorials, visit our official docs: Ultralytics Documentation docs.ultralytics.com/. If you have any specific questions or run into issues, feel free to ask here or open an issue on GitHub. Happy coding! 🚀
You can utilize the mentioned code below to do this.
```python
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
cap = cv2.VideoCapture("path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"

while cap.isOpened():
    success, im0 = cap.read()
    if success:
        results = model.predict(im0, show=False)
        clss = results[0].boxes.cls.cpu().tolist()
        if clss:
            for cls in clss:
                print(f"label {model.names[int(cls)]}")
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
        continue
    break

cap.release()
cv2.destroyAllWindows()
```
Is there a way to use the counting method it has, but instead of using the center of the bounding box, it uses its bottom? I'm a bit stuck with this. I want to take the center, extract the bounding box height property, divide it by 2, and then move the center down so I'm technically touching the floor of the box and get a more accurate reading of collision with a trigger that I'm using...
Yes, you can modify the counting method to use the bottom of the bounding box. You can achieve this by adjusting the center coordinates. Here's a quick formula: `bottom_y = center_y + (height / 2)`. This will give you the bottom y-coordinate of the bounding box. For more details, you can check our documentation on object counting: docs.ultralytics.com/guides/object-counting/. If you need further assistance, feel free to ask! 😊
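For example, a minimal sketch of that adjustment using the `xywh` box format (the image path is a placeholder):
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
results = model.predict("path/to/frame.jpg", verbose=False)  # placeholder image

for box in results[0].boxes.xywh:
    center_x, center_y, width, height = box.tolist()
    bottom_y = center_y + height / 2  # shift the reference point to the box floor
    print(f"Bottom-center point: ({center_x:.1f}, {bottom_y:.1f})")
```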
You can use the code below to extract the output of Ultralytics YOLOv8.
```python
import cv2
from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator, colors

model = YOLO("yolov8n.pt")
names = model.model.names

cap = cv2.VideoCapture("Path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"

while cap.isOpened():
    success, im0 = cap.read()
    if success:
        results = model.predict(im0, show=False)
        boxes = results[0].boxes.xyxy.cpu().tolist()
        clss = results[0].boxes.cls.cpu().tolist()
        annotator = Annotator(im0, line_width=3, example=names)
        if boxes is not None:
            for box, cls in zip(boxes, clss):
                annotator.box_label(box, color=(255, 144, 31), label=names[int(cls)])
        cv2.imshow("ultralytics", im0)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
        continue
    print("Video frame is empty or video processing has been successfully completed.")
    break

cap.release()
cv2.destroyAllWindows()
```
Thanks
To see the content of your `.pt` file and check the accuracy of your model, you can follow these steps:
1. Load the Model: Use the `ultralytics` YOLO class to load the `.pt` file.
2. Check Model Accuracy: Evaluate the model on a validation dataset to get accuracy metrics.
Here's a concise example:
```python
from ultralytics import YOLO

# Load the model
model = YOLO("path/to/your/model.pt")

# Print model architecture
print(model.model)

# Evaluate the model on a validation dataset
results = model.val(data="path/to/your/dataset.yaml")
print(results)
```
For more details, check out our documentation docs.ultralytics.com/modes/predict/.
How is the frame rate dynamically put onto the window? In my while(True) loop with OpenCV, I use the putText() method on every frame, but it seems to stay at 30 despite visible slowdowns at times. How do I make the frame rate account for the processing time of my YOLO model?
By default, we don't provide support for displaying the frames per second (FPS) on the frame. However, in the context of YOLO and OpenCV, dynamically updating the frame rate on the window involves accounting for the processing time of your YOLO model: instead of assuming a fixed frame rate, calculate the actual frames per second based on the time the YOLO model inference takes. Thanks, Ultralytics Team!
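A minimal sketch of that calculation (a webcam source is assumed): time each iteration around inference and overlay the resulting FPS with `cv2.putText`.
```python
import time

import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
cap = cv2.VideoCapture(0)

while cap.isOpened():
    success, frame = cap.read()
    if not success:
        break
    start = time.time()
    results = model.predict(frame, verbose=False)
    annotated = results[0].plot()
    # Actual throughput for this frame, inference included
    fps = 1.0 / max(time.time() - start, 1e-6)
    cv2.putText(annotated, f"FPS: {fps:.1f}", (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    cv2.imshow("YOLOv8", annotated)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```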
I am working on a project using MediaPipe for pose estimation. MediaPipe only supports single-person pose estimation, but I want to make it multi-person. I was going to use YOLOv8's object detection and loop the MediaPipe pose estimation code through each bounding box, but I am not sure how I could do that. Is there a way to run the MediaPipe code through each bounding box of a human instead of the whole frame, and then put it back together in one frame?
Thanks for sharing your feedback. You can directly use the Ultralytics YOLOv8 Pose Model, which supports multi-person pose estimation in a single frame. For more information, please visit: docs.ultralytics.com/tasks/pose/
Great idea! You can definitely use YOLOv8 for detecting multiple people and then apply MediaPipe to each detected bounding box. Here's a simple approach (see the sketch below):
1. Use YOLOv8 to detect people and get bounding boxes.
2. Crop each bounding box from the original frame.
3. Run MediaPipe pose estimation on each cropped image.
4. Overlay the pose results back onto the original frame.
Make sure your YOLOv8 and MediaPipe versions are up to date for best performance. If you need more guidance, check out the Ultralytics Docs docs.ultralytics.com/ for YOLOv8 setup. Good luck with your project! 🚀
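As a rough, minimal sketch of those four steps (not an official integration; the video source is a placeholder and the MediaPipe calls assume the standard `mediapipe` Pose solution API):
```python
import cv2
import mediapipe as mp
from ultralytics import YOLO

detector = YOLO("yolov8n.pt")
pose = mp.solutions.pose.Pose(static_image_mode=True)
cap = cv2.VideoCapture("people.mp4")  # placeholder source

while cap.isOpened():
    success, frame = cap.read()
    if not success:
        break
    # Detect only people (COCO class 0)
    results = detector.predict(frame, classes=[0], verbose=False)
    for box in results[0].boxes.xyxy:
        x1, y1, x2, y2 = map(int, box.tolist())
        crop = frame[y1:y2, x1:x2].copy()
        if crop.size == 0:
            continue
        mp_result = pose.process(cv2.cvtColor(crop, cv2.COLOR_BGR2RGB))
        if mp_result.pose_landmarks:
            # Draw the landmarks on the crop, then paste the crop back into the frame
            mp.solutions.drawing_utils.draw_landmarks(
                crop, mp_result.pose_landmarks, mp.solutions.pose.POSE_CONNECTIONS
            )
            frame[y1:y2, x1:x2] = crop
    cv2.imshow("Multi-person pose", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```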
Sorry, but I once watched a video and it showed shorter lines of code. Can you tell me the difference between these two pieces of code?
```python
import cv2
from ultralytics import YOLO
# from ultralytics.models.yolo.detect.predict import DetectionPredictor
import time

model = YOLO("best.pt")
results = model.predict(source="0", show=True)
print(results)
```
The code remains largely the same, but when the video was created, YOLOv8 had just been introduced. Since then, we've significantly optimized the code, resulting in reduced lines of inference code compared to what was presented in the video. Thank you.
Hello there! I'm building a simple program that uses YOLOv8 to detect a person and then calls a function (the function connects to a Pixhawk, but I already wrote it). I just need the code that triggers the function when a person is detected.
You can add a check once a person is detected; sample code is provided below. We hope this helps :)
```python
import cv2
from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator, colors

model = YOLO("yolov8s.pt")
cap = cv2.VideoCapture("Path/to/video/file.mp4")
w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
out = cv2.VideoWriter("Ultralytics.avi", cv2.VideoWriter_fourcc(*"MJPG"), fps, (w, h))

while True:
    ret, im0 = cap.read()
    if not ret:
        break
    annotator = Annotator(im0, line_width=3)
    results = model.predict(im0)  # For prediction
    boxes = results[0].boxes.xyxy.cpu()
    clss = results[0].boxes.cls.cpu().tolist()
    for box, cls in zip(boxes, clss):
        if model.names[int(cls)] == "person":
            print("Person Detected!!! Ultralytics !!!")
            # ... Execute your logic because a person is detected ...
        annotator.box_label(box, label=model.names[int(cls)], color=colors(int(cls), True))
    out.write(im0)
    cv2.imshow("Ultralytics", im0)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

out.release()
cap.release()
cv2.destroyAllWindows()
```
Regards, Ultralytics Team!
I want to combine YOLOv8 with a ResNet50 classifier to run my custom-trained model, and if it detects certain classes it invokes the classifier and prints the classifier's output on the bounding boxes instead of the detector's class. Are there any resources on this?
Officially, we do not offer support for backbone modification, but you can study the YOLOv8 architecture and customize the classifier to suit your requirements. For additional details, please refer to our documentation: docs.ultralytics.com/models/yolov8/
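We don't ship this as a built-in pipeline, but as a rough sketch of the two-stage idea (the weights file names, trigger class IDs, and ImageNet-style preprocessing below are assumptions to adapt; a recent torchvision is assumed):
```python
import cv2
import torch
from torchvision import models, transforms
from ultralytics import YOLO

detector = YOLO("best.pt")  # your custom-trained YOLOv8 detector (placeholder)
classifier = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()
preprocess = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
TRIGGER_CLASSES = {0}  # hypothetical detector class IDs that should invoke the classifier

frame = cv2.imread("image.jpg")  # placeholder image
results = detector.predict(frame, verbose=False)

for box, cls in zip(results[0].boxes.xyxy, results[0].boxes.cls):
    if int(cls) not in TRIGGER_CLASSES:
        continue
    x1, y1, x2, y2 = map(int, box.tolist())
    crop = cv2.cvtColor(frame[y1:y2, x1:x2], cv2.COLOR_BGR2RGB)
    with torch.no_grad():
        logits = classifier(preprocess(crop).unsqueeze(0))
    label = int(logits.argmax())  # map this index to your classifier's own label names
    cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
    cv2.putText(frame, str(label), (x1, y1 - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)
```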
We offer support for the Jetson Nano. You can follow our quickstart guide to get started: docs.ultralytics.com/guides/nvidia-jetson/ Almost the same steps can be used for other embedded devices, except the installation steps, which can differ for each device. Thanks
At the moment, we don't directly support bounding box sorting based on height/width. However, feel free to adapt it for your specific use case. Thanks, The Ultralytics Team!
Hi, could you please lend me a hand? I'm trying to export the results and send them to an Excel or CSV file, but I can't seem to get the right code. I already exported the model to TorchScript, but it's been impossible to resolve.
Sure, you can extract the bounding boxes and classes using the code below.
```python
import cv2
from ultralytics import YOLO

model = YOLO("yolov8s.pt")
cap = cv2.VideoCapture("path/to/video/file.mp4")

while cap.isOpened():
    success, frame = cap.read()
    if success:
        results = model.track(frame)
        boxes = results[0].boxes.xyxy.cpu()
        clss = results[0].boxes.cls.cpu().tolist()
        for box, cls in zip(boxes, clss):
            print(f"Bounding box {box}, class name {model.names[int(cls)]}")
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    else:
        break

cap.release()
cv2.destroyAllWindows()
```
Thanks, Ultralytics Team!
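To get those results into a CSV file (which Excel can open), one rough approach is to append a row per detection using Python's built-in `csv` module; the output file name and video path below are placeholders:
```python
import csv

import cv2
from ultralytics import YOLO

model = YOLO("yolov8s.pt")
cap = cv2.VideoCapture("path/to/video/file.mp4")
frame_idx = 0

with open("detections.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["frame", "class", "x1", "y1", "x2", "y2"])
    while cap.isOpened():
        success, frame = cap.read()
        if not success:
            break
        results = model.predict(frame, verbose=False)
        for box, cls in zip(results[0].boxes.xyxy.tolist(), results[0].boxes.cls.tolist()):
            writer.writerow([frame_idx, model.names[int(cls)], *[round(v, 1) for v in box]])
        frame_idx += 1

cap.release()
```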
Hey, can you please help me? I have my custom-trained model (best.pt); it detects two things, person and headlight. Now I want the output according to these conditions: 1. If the model detects only a headlight, return 0. 2. If the model detects only a person, return 1. 3. If the model detects both a headlight and a person, return 0.
Your inquiries appear to be completely technical. We suggest submitting them to the Ultralytics Discussion section for more effective assistance. You can do so at github.com/orgs/ultralytics/discussions. Thanks Ultralytics Team!
I have run the command `model.predict(source="C:\\Users\\User\\Documents\\Bandicam\\check.mp4", stream=True, save=True, imgsz=320, conf=0.5)` and got this: ()/////////////////// Where does the output video get saved when stream=True? Please help.
We highly recommend upgrading the Ultralytics package, and hopefully this will address your issue.
```bash
pip install -U ultralytics
```
Thanks, Ultralytics Team!
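One thing worth noting: with `stream=True`, `model.predict()` returns a generator, so nothing is processed (or saved) until you iterate over it. A minimal sketch; by default, saved detection videos land under `runs/detect/predict*`:
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # or your custom weights
results = model.predict(source="check.mp4", stream=True, save=True, imgsz=320, conf=0.5)

# Iterating the generator drives both the inference and the saving of the annotated video
for r in results:
    pass  # inspect r.boxes here if needed

# The annotated video is written under runs/detect/predict* by default
```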
Hey, can you help me with how to get confidence? Let's say I want to get only those detections from my predicted images whose confidence is above 0.8; how can I do that?
Sure, you can simply use the `conf=0.8` argument with the prediction command, e.g.:
```bash
yolo predict conf=0.8 source="path/to/video.mp4" ...
```
Thanks, Ultralytics Team!
You're welcome! 😊 If you have any more questions, feel free to ask. For more details, check out our FAQ docs.ultralytics.com/help/FAQ/. Happy coding! 🚀
Thank you so much for the kind words and feedback! 😊 I'll definitely keep that in mind for future videos. It's great to hear you're enjoying the content, and I'm thrilled to have you along for the journey. Cheers to learning and growing together!
Hey, I have a model which does inference at a 768 x 448 image size, so I can't use the predicted image directly because of its low resolution, and I have to show images at a high resolution of 1920 x 1080. I am able to extract these results, but when I plot them at the original size (1920 x 1080), the masks don't come out properly; I mean the masks end up a bit outside of the bounding boxes on the higher-resolution images. I also tried resizing the masks according to the original image, but that didn't work. How can I fix this?
Hey! It sounds like the issue might be with the scaling of the masks when resizing to the original image size. Ensure that the scaling factor is applied consistently to both the bounding boxes and the masks. If you need more detailed guidance, please check out our documentation on handling inference results: docs.ultralytics.com/guides/instance-segmentation-and-tracking/. Also, make sure you're using the latest versions of `torch` and `ultralytics`. If the problem persists, please provide more details about your approach. 😊
I trained a model in Google colab and exported it in '.tflite' format. Now, working with Visual Studio code to check the model. It is not working. And I cannot comprehend the problem. It says that I am giving the wrong input. When I give a single image 'image.jpg' it gives the error: ValueError: Cannot set tensor: Dimension mismatch. Got 800 but expected 3 for dimension 1 of input 0. And if I give the image for first preprocessing and then infer by the model. It gives the error: ValueError: Cannot set tensor: Dimension mismatch. Got 3 but expected 800 for dimension 3 of input 0...
Hi there! 👋 It sounds like you're encountering an issue with input dimensions for your `.tflite` model. To help you better, could you please share more details, such as the exact preprocessing steps you're using and the shape of the input tensor expected by your model? In the meantime, ensure you're using the latest versions of `torch` and `ultralytics`. You can update them with:
```bash
pip install --upgrade torch ultralytics
```
For more guidance, check out our documentation docs.ultralytics.com and the common issues guide docs.ultralytics.com/guides/yolo-common-issues/. If you need further assistance, feel free to provide additional details here. 😊 Unfortunately, we can't offer private support, but we're here to help in the comments!
Directly there is no support for this feature, but you can use third party tools to convert PyTorch (.pt) to Darknet (.weights) format. The currently available formats are mentioned at the following link: docs.ultralytics.com/modes/export/#arguments
Certainly, you can retrieve the label using the code snippet below:
```python
import cv2
from ultralytics import YOLO

model = YOLO("yolov8s.pt")
frame = cv2.imread("path/to/image.jpg")

results = model.predict(frame, verbose=False)
boxes = results[0].boxes.xywh.cpu()
clss = results[0].boxes.cls.cpu().tolist()
names = results[0].names

for box, cls in zip(boxes, clss):
    x, y, w, h = box
    label = str(names[int(cls)])
    # .....
    # .....
```
Hello. Is it possible to make the model predict just one label? Imagine that I have my paper-scissors-rock prediction model, but when it comes to connecting the webcam to make predictions in real time, I just want the model to predict the "Rock" on screen.
Yes, you can configure your model to predict only one specific label, like "Rock," in real-time inference. Use the `classes` argument in the `model.predict()` method to filter predictions by class ID. For example, set `classes=[ID_for_Rock]` where `ID_for_Rock` corresponds to the class index for "Rock" in your dataset. This ensures the model will only output predictions for that class. For more details on prediction arguments, check here: Predict - Ultralytics YOLO Docs docs.ultralytics.com/modes/predict/#arguments.
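For instance, a minimal sketch assuming a webcam source and a hypothetical class index of 2 for "Rock" (check your dataset's `data.yaml` for the real index):
```python
from ultralytics import YOLO

model = YOLO("best.pt")  # your paper-scissors-rock model (placeholder)
ROCK_ID = 2  # hypothetical class index for "Rock"; adjust to your dataset

# Only "Rock" detections will be returned and drawn on screen
results = model.predict(source=0, classes=[ROCK_ID], show=True, stream=True)
for r in results:
    pass  # consume the stream; r.boxes contains only "Rock" boxes
```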
Hey! Can you please explain how to get the coordinates of the detected bounding boxes in an image? And one more thing: how can I change the save directory to one single folder, not runs > predict1, predict2, etc.? Thanks.
Right, you can use the code below to extract the bounding boxes from an image; additionally, you can store the output in a specific directory.
```python
import os

import cv2
from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator, colors

model = YOLO("yolov8s.pt")
output_dir = "test"
os.makedirs(output_dir, exist_ok=True)

image = cv2.imread("path/to/image.png")
results = model.predict(image, show=False)  # optionally pass classes=[...] to filter classes
boxes = results[0].boxes.xyxy.cpu().tolist()
clss = results[0].boxes.cls.cpu().tolist()

annotator = Annotator(image, line_width=2, example=model.names)
for box, cls in zip(boxes, clss):
    annotator.box_label(box, color=colors(int(cls), True), label=model.names[int(cls)])
    print("Bounding Box Coordinates : ", box)

cv2.imwrite(os.path.join(output_dir, "output.png"), image)
```
Thanks
I saw that during training and validation it applies some preprocessing (normalization + scaling by 255). But when we use .predict, do we have to do it manually, or is it implemented?
Great question! When you use `.predict` with Ultralytics YOLOv8, the preprocessing steps like normalization and standardization are automatically handled for you. No need to do it manually! For more details, you can check out our documentation docs.ultralytics.com/modes/predict/. 😊🚀
You can display the tracking ID by calling `model.track`, for example:
```bash
yolo track model=yolov8n.pt source="ua-cam.com/video/LNwODJXcvt4/v-deo.html" conf=0.3 iou=0.5 show=True
```
Thanks, Ultralytics Team!
The `results` object contains all the detection outcomes for a given frame. By indexing into `results`, you can access the data within it, allowing you to later process or extract bounding boxes, class IDs, tracking IDs, and more.
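For example, a short sketch of pulling boxes, class IDs, and tracking IDs out of a tracked frame (the video path is a placeholder):
```python
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
cap = cv2.VideoCapture("path/to/video.mp4")

while cap.isOpened():
    success, frame = cap.read()
    if not success:
        break
    results = model.track(frame, persist=True, verbose=False)
    boxes = results[0].boxes
    # boxes.id is None until the tracker assigns IDs
    track_ids = boxes.id.int().tolist() if boxes.id is not None else []
    for box, cls, tid in zip(boxes.xyxy.tolist(), boxes.cls.tolist(), track_ids):
        print(f"ID {tid}: {model.names[int(cls)]} at {box}")

cap.release()
```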
Hi! Yes, you can save each frame with the segmentation mask overlayed by using OpenCV to draw and save the frames. After running the model's `predict()` method, you can use OpenCV functions like `cv2.imwrite()` to save the frames. For a detailed guide on handling segmentation results, check out the Isolating Segmentation Objects docs.ultralytics.com/guides/isolating-segmentation-objects/ documentation. 😊
@@Ultralytics Oh yeah, one more thing: in this video the guy uses it for object detection tracking: ua-cam.com/video/wuZtUMEiKWY/v-deo.html, but I plan to use it for segmentation tracking on a custom dataset, so do I just replace the yolov8.pt model with yolov8-seg.pt? Please let me know what you think!
Yes, exactly! 👍 To use segmentation tracking, simply replace the `yolov8.pt` model with `yolov8-seg.pt`. This model is designed for segmentation tasks, enabling it to overlay and track masks across frames. Be sure to adjust your data and settings as needed for your custom dataset. Check out the Instance Segmentation and Tracking Guide docs.ultralytics.com/guides/instance-segmentation-and-tracking/ for more details! 🚀
Hey, I am searching for how to control the way the model saves the output images in runs/detect/predict. Is there a way to change it? I used the save_dir attribute in the model.predict() function, but the model still saved it in the default way. Also, is it possible to get the total count of the detections predicted by the model in a run?
To change the save directory, pass the `project` and `name` arguments to `model.predict()`; the output is written to `project/name`. `save_dir` itself isn't an accepted predict argument, which is likely why it was ignored. For counting detections, you can iterate over the `Results` objects and sum up the detections. If issues persist, ensure you're using the latest versions of `ultralytics` and `torch`. For more details, check out the predict mode documentation docs.ultralytics.com/modes/predict/. 😊
@@Ultralytics I am using the save_dir parameter correctly; I don't see why it is not working. Iterating over the results gives the number of images which had the desired object detected. I need the total objects detected across all the images; is there a way to do it?
If `save_dir` isn't working, try the `project` and `name` arguments instead and ensure you're using the latest package versions. For counting total detections, iterate over `results` and sum up `len(result.boxes)` for each `result`; this gives the total number of detected objects. If issues persist, consider checking the documentation or reaching out on our Discord for community support: ultralytics.com/discord.
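Putting both pieces together, a minimal sketch (the weights, image folder, and directory names are placeholders):
```python
from ultralytics import YOLO

model = YOLO("best.pt")

# project/name control where annotated outputs are written (my_runs/experiment1 here)
results = model.predict(source="path/to/images", save=True, project="my_runs", name="experiment1")

# Sum detections over all processed images
total = sum(len(r.boxes) for r in results)
print(f"Total objects detected: {total}")
```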
How can I print the confidence scores for every class id for an image? Say I have 6 classes and a single image. I want to see what the confidence is for every label.
Absolutely, you can utilize the provided code to display the confidence score for each bounding box.
```python
from ultralytics import YOLO

# Load a pre-trained YOLOv8n model
model = YOLO('yolov8n.pt')
names = model.model.names

# Perform inference on 'bus.jpg' with specified parameters
results = model.predict("ultralytics.com/images/bus.jpg", verbose=False, conf=0.5)

# Process detections
boxes = results[0].boxes.xywh.cpu()
clss = results[0].boxes.cls.cpu().tolist()
confs = results[0].boxes.conf.float().cpu().tolist()

for box, cls, conf in zip(boxes, clss, confs):
    print(f"Class Name: {names[int(cls)]}, Confidence Score: {conf}, Bounding Box: {box}")
```
Hope this helps. Thanks.
@@Ultralytics What if I want to save the output from the terminal to a .txt file? I tried using save_txt=True, but the .txt displays only numbers; it doesn't display a class name or any string.
To save the results with class names and confidence scores to a `.txt` file, you can write the file yourself with the class names included. Here's how you can do it:
```python
from ultralytics import YOLO

# Load a pre-trained YOLOv8n model
model = YOLO('yolov8n.pt')
names = model.model.names

# Perform inference on 'bus.jpg' with specified parameters
results = model.predict("ultralytics.com/images/bus.jpg", verbose=False, conf=0.5)

# Save results to a .txt file
txt_file = "output.txt"
with open(txt_file, "w") as f:
    for result in results:
        boxes = result.boxes.xywh.cpu()
        clss = result.boxes.cls.cpu().tolist()
        confs = result.boxes.conf.float().cpu().tolist()
        for box, cls, conf in zip(boxes, clss, confs):
            f.write(f"Class Name: {names[int(cls)]}, Confidence Score: {conf}, Bounding Box: {box}\n")

print(f"Results saved to {txt_file}")
```
This script will save the results to `output.txt` with class names, confidence scores, and bounding box coordinates. For more details, you can refer to the Ultralytics documentation docs.ultralytics.com/reference/engine/results/.
How can I display a real-time message such as 'Person detected' within the frame when a person is identified? For example, if I am running a program in real-time and it detects a person, how do I show the message 'Person detected' directly on the frame?
Sure, the provided code allows for the direct display of 'Person Detected' on the frame in case a person is identified in the video frame.
```python
import cv2
from pathlib import Path
from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator

# Load the YOLOv8 model
model = YOLO('yolov8n.pt')  # pre-trained model
model = YOLO('path/to/best.pt')  # fine-tuned model

# Path to video
video_path = "path/to/video.mp4"
if not Path(video_path).exists():
    raise FileNotFoundError(f"Source path {video_path} does not exist.")

names = model.model.names
cap = cv2.VideoCapture(video_path)

while cap.isOpened():
    success, frame = cap.read()
    if success:
        results = model.predict(frame)
        boxes = results[0].boxes.xyxy.cpu().numpy().astype(int)
        classes = results[0].boxes.cls.tolist()
        confidences = results[0].boxes.conf.tolist()
        annotator = Annotator(frame, line_width=2, example=str(names))
        for box, cls, conf in zip(boxes, classes, confidences):
            if names[int(cls)] == "person":
                annotator.box_label(box, "Person Detected", (255, 42, 4))
        cv2.imshow("YOLOv8 Detection", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    else:
        break

cap.release()
cv2.destroyAllWindows()
```
what would I add to my code (using live video cam) to get the coordinates of the boxes? I'm planning on extracting these coordinates and creating a custom segmentation dataset and pairing it to the rest of my project. My code so far:
```python
from ultralytics import YOLO

model = YOLO("yolov8l")
results = model.predict(source="0", show=True, conf=0.5)
results.show()
```
You can get the bounding box coordinates using the code below.
```python
import sys

import cv2
from ultralytics import YOLO

cap = cv2.VideoCapture(0)
model = YOLO("yolov8n.pt")

if not cap.isOpened():
    print("Error reading video file")
    sys.exit()

while cap.isOpened():
    success, frame = cap.read()
    if success:
        results = model.predict(frame)
        boxes = results[0].boxes.xywh.cpu()
        clss = results[0].boxes.cls.cpu().tolist()
        names = results[0].names
        for box, cls in zip(boxes, clss):
            x, y, w, h = box
            label = str(names[int(cls)])
            # ......
            # ......
```
@@Ultralytics Thank you so much, but I'm actually looking to migrate to a segmentation project and need to export every bounding box that is detected from my custom dataset. How would I do so (export what is segmented and where, in a way that is readable by an external source)? Also, do you have any methods/tips for using this data for pathfinding (self-driving car)? I need a way to export the data provided by the segmentation model and find a route that can avoid certain segmentations (like grass).
@@bitmapsquirrel6869 seems like your query is more technical, we would recommend asking your queries on our GitHub Issue Section: github.com/ultralytics/ultralytics/issues
Hello there! New to YOLO and trying to do a project here. I made my own YOLOv8 model following you guys' videos and had a question. I want to make a simple Python script where the model is loaded via YOLO and is giving the output as it should with detection_output. However, I want to add "if" logic: if the model detects a specific class from my dataset, it will print something. How am I supposed to do that, like we do with a variety of other open-source models, e.g. from cvzone or something? Please help.
You can achieve this by iterating through the detection output of your YOLOv8 model in Python and checking for specific classes. Here's a basic example:
```python
for detection in detection_output:
    if detection['class'] == specific_class_index:
        print("Detected specific class! Do something...")
```
Replace `specific_class_index` with the index of the class you're interested in. This allows you to execute custom actions based on the detected classes.
@@Ultralytics
```python
from ultralytics import YOLO
import numpy

# load a pretrained YOLOv8n model
model = YOLO("path:\to\yolov8_custom.pt")

# predict on an image
detection_output = model.predict(source=0, conf=0.25, save=False, show=True)

# Display tensor array
print(detection_output.probs)

# Display numpy array
print(detection_output[0].numpy())

for detection in detection_output:
    if detection['class'] == breadboard:
        print("working")
```
This is my full code, sir. I don't know what I did wrong, but the output in PyCharm is this:
```
0: 480x640 (no detections), 273.0ms
0: 480x640 (no detections), 269.0ms
0: 480x640 1 breadboard, 269.0ms
0: 480x640 1 breadboard, 269.0ms
0: 480x640 (no detections), 272.0ms
0: 480x640 (no detections), 270.0ms
0: 480x640 (no detections), 268.0ms
0: 480x640 1 breadboard, 272.0ms
```
Is there anything that needs to be modified?
It looks like you're on the right track, but a few adjustments are needed. The `detection_output` from YOLOv8 is a list of `Results` objects, and you need to access the `boxes` attribute to get the detected classes. Here's an updated version of your code:
```python
from ultralytics import YOLO

# Load a pretrained YOLOv8 model
model = YOLO("path/to/yolov8_custom.pt")

# Predict on a source (webcam in this case)
detection_output = model.predict(source=0, conf=0.25, save=False, show=True)

# Iterate through the detection results
for result in detection_output:
    for box in result.boxes:
        if int(box.cls) == breadboard_class_index:  # Replace with the actual class index for 'breadboard'
            print("Detected breadboard! Do something...")
```
Make sure to replace `breadboard_class_index` with the actual index of the 'breadboard' class in your dataset. For more details on how to use YOLOv8 in Python, you can refer to our documentation: YOLOv8 Python Usage docs.ultralytics.com/usage/python/.
I am having a problem with YOLOv8: when I run it, it draws bounding boxes everywhere with a bunch of different classes against my blank wall. What is the fix for this inaccuracy?
It sounds like your model might be overfitting or not trained properly. Here are a few steps you can take to address this:
1. Check Your Dataset: Ensure your training dataset is well-annotated and diverse. Poor-quality data can lead to inaccurate predictions.
2. Verify Training Settings: Make sure your training configuration (e.g., learning rate, batch size) is appropriate for your dataset.
3. Regularization Techniques: Consider using techniques like data augmentation to improve model generalization.
4. Evaluate Model Performance: Monitor metrics like precision, recall, and mAP during training to ensure the model is learning correctly.
For more detailed troubleshooting, check out our guide on common YOLO issues: YOLO Common Issues docs.ultralytics.com/guides/yolo-common-issues/. If you need further assistance, please provide more details about your training setup and dataset.
It sounds like there might be a discrepancy between your CLI and Python setups. Here are a few things to check:
1. Environment Consistency: Ensure that the Python environment you're using matches the one used in the CLI. Check versions of `torch`, `ultralytics`, and other dependencies.
2. Model Configuration: Verify that the model configuration and weights used in Python are the same as those in the CLI.
3. Inference Settings: Ensure that inference settings like confidence threshold and image size are consistent between CLI and Python.
For more detailed troubleshooting, you can refer to our guide on common YOLO issues: YOLO Common Issues docs.ultralytics.com/guides/yolo-common-issues/. If the problem persists, please share more details about your Python script and the specific inaccuracies you're encountering.
Great content. Is it possible to see the entire code that you used? I have searched the GitHub but I don't see this code. Also, is it possible you have a video on extracting the bounding box picture for a conf
If you wish to retrieve the bounding box coordinates, confidence score, and class name for each object, you can employ the code below. It extracts bounding boxes with a confidence score greater than or equal to 0.5.
```python
from ultralytics import YOLO

# Load a pre-trained YOLOv8n model
model = YOLO('yolov8n.pt')
names = model.model.names

# Perform inference on 'bus.jpg' with specified parameters with conf=0.5
results = model.predict("ultralytics.com/images/bus.jpg", verbose=False, conf=0.5)

# Process detections
boxes = results[0].boxes.xywh.cpu()
clss = results[0].boxes.cls.cpu().tolist()
confs = results[0].boxes.conf.float().cpu().tolist()

for box, cls, conf in zip(boxes, clss, confs):
    print(f"Class Name: {names[int(cls)]}, Confidence Score: {conf}, Bounding Box: {box}")
```
Hope this helps. Thanks.
@@Ultralytics Thank you for the response. My intention is to run this with my webcam. I have added 'source=0'. It then asks for 'stream=True', then shows repetitive warnings without bringing up the image window to show the results.
@@stefangraham7117 With 32GB of RAM, memory shouldn't be an issue. Let's ensure you're using the latest versions of `torch` and `ultralytics`. You can update them using:
```bash
pip install --upgrade torch ultralytics
```
If the issue persists, please share the exact warnings you're seeing. For more details on setting up a security alarm system with YOLOv8, check out our guide: docs.ultralytics.com/guides/security-alarm-system/.
I want to label each object (bounding box) in the txt file to help train the YOLO model. But I don't know what the exact format is. Can you give me the recipe?
Sure! For YOLO models, each image's label is stored in a `.txt` file with the same name as the image. Each row in the file represents one object in the format:
```
class x_center y_center width height
```
Here's what each part means:
- `class`: The class index of the object (zero-indexed).
- `x_center`, `y_center`: Center of the bounding box, normalized (0-1) by image width and height.
- `width`, `height`: Dimensions of the bounding box, also normalized.
Example:
```
0 0.5 0.5 0.4 0.3
1 0.3 0.4 0.2 0.1
```
This example defines two objects: one of class 0 and one of class 1. For more details, check the Ultralytics YOLO Format Guide docs.ultralytics.com/datasets/detect/.
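If it helps, here is a tiny sketch of converting a pixel-space box into one of those normalized rows (the image size and box values are made-up examples):
```python
def to_yolo_row(cls, x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel-space box (x1, y1, x2, y2) into a YOLO label row."""
    x_center = (x1 + x2) / 2 / img_w
    y_center = (y1 + y2) / 2 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return f"{cls} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Example: a 200x150 box centered at (320, 240) in a 640x480 image, class 0
print(to_yolo_row(0, 220, 165, 420, 315, 640, 480))  # -> "0 0.500000 0.500000 0.312500 0.312500"
```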
@@Ultralytics Thanks for the nice explanations. When fine-tuning YOLO-World, I ran the following command: yolo task=detect mode=train model=yolov8x-worldv2.pt imgsz=640 data=.yaml epochs=300 batch=32 device='cuda:0' name=yolov8x-worldv2. After that, I replaced the checkpoint yolov8x-worldv2.pt with the best checkpoint that was just trained (best.pt) and continued fine-tuning. However, the program threw an error: TypeError: forward() missing 1 required positional argument: 'guide'. How can I fix this?
It seems like the issue arises because `yolov8x-worldv2.pt` uses a custom model architecture that includes a `guide` argument in its `forward()` method. When continuing fine-tuning with `best.pt`, it may not properly load the custom model logic. To resolve this:
1. Ensure that the `best.pt` checkpoint is being loaded with the exact same architecture as `yolov8x-worldv2.pt`. If `yolov8x-worldv2.pt` depends on custom modifications, use its original model definition or ensure the `guide` parameter is correctly handled.
2. Re-run fine-tuning with the `--resume` flag instead of manually swapping checkpoints. Example:
```bash
yolo task=detect model=yolov8x-worldv2.pt imgsz=640 data=.yaml epochs=300 batch=32 device='cuda:0' name=yolov8x-worldv2 --resume
```
Check out Ultralytics Docs docs.ultralytics.com/tasks/detect/ for more training insights. If the issue persists, confirm the compatibility of the custom model architecture in your setup.
@@Ultralytics The best.pt checkpoint was obtained from the training run with the best performance using the initial checkpoint of YOLOWorld yolov8x-worldv2.pt. Therefore, I ensure that it shares the same architecture. What is the mechanism of the --resume argument? Does it use the checkpoint from the previous fine-tuning session to continue fine-tuning (I assume it does not)? Or is there an alternative, such as adding the --guide argument?
The `--resume` argument in Ultralytics YOLO is designed to continue training seamlessly from a previously saved checkpoint by restoring the model weights, optimizer state, and training parameters, including the learning rate scheduler and epoch number. It ensures that all aspects of training are resumed exactly as they were at the point of interruption. You can specify the checkpoint to resume from using `model=`.

However, in your case with YOLOWorld, the `forward()` method requiring a `guide` argument indicates that the architecture involves custom modifications. The `--resume` mechanism won't automatically handle such architecture-specific parameters unless they are part of the checkpoint's configuration. If you're encountering the `TypeError` related to `guide`, ensure that:
1. The checkpoint (`best.pt`) and architecture (`yolov8x-worldv2.pt`) are fully compatible.
2. The `guide` argument required by the custom model is explicitly passed during training or fine-tuning.
For your scenario, adding the `--guide` argument manually is necessary if the `forward()` method explicitly depends on it. Unfortunately, `--resume` alone does not automatically infer custom arguments like `guide`. For more details on checkpoint management and resuming training, visit the Trainer Docs docs.ultralytics.com/reference/engine/trainer/.
Great video. But just wondering: YOLOv8 has a count which can be viewed via a CLI command. Is there any example where I would be able to return the count in JSON?
Hi @@Ultralytics, for example, I have an image of 5 cars and 1 truck. How can I get a response saying how many cars and trucks it found within the image?
Got it! You can use Ultralytics YOLOv8 to count objects and return the results in JSON format. Here's a concise example:
```python
import json
from collections import Counter

from ultralytics import YOLO

# Load the model
model = YOLO("yolov8n.pt")

# Perform inference
results = model("path/to/your/image.jpg")

# Count detections per class name (e.g. {"car": 5, "truck": 1})
counts = Counter(model.names[int(cls)] for cls in results[0].boxes.cls.tolist())

# Convert to JSON
json_counts = json.dumps(counts)
print(json_counts)
```
This will give you a JSON response with the count of each object class. For more details, check out our object counting guide docs.ultralytics.com/guides/object-counting/.
Hi there! To extract section areas using the YOLO model, you can start by labeling the relevant sections as bounding boxes on your document. After that, you can train a YOLOv8 model using these labeled bounding boxes.
Is it possible to only get 1 result for each class based on the highest conf value? For example, I have 2 classes, "fruit" and "peduncle". I only want 1 fruit and 1 peduncle from the detection result, each with the highest confidence value.
Yes, you can achieve that by filtering the detection results to keep only the highest confidence value for each class. You can use the `results` object to access the detections and then apply your filtering logic. For more details on handling results, check out our documentation docs.ultralytics.com/. 😊
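A minimal sketch of that filtering (the weights and image path are placeholders): keep, for each class, only the box with the highest confidence.
```python
from ultralytics import YOLO

model = YOLO("best.pt")  # model trained on "fruit" and "peduncle"
results = model.predict("path/to/image.jpg", verbose=False)

best = {}  # class name -> (confidence, box)
boxes = results[0].boxes
for box, cls, conf in zip(boxes.xyxy.tolist(), boxes.cls.tolist(), boxes.conf.tolist()):
    name = model.names[int(cls)]
    if name not in best or conf > best[name][0]:
        best[name] = (conf, box)

for name, (conf, box) in best.items():
    print(f"{name}: conf={conf:.2f}, box={box}")
```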
Hello, is there a method to generate a video where I can detect faces? Specifically, I'd like to take an input video, identify faces within it, and produce an output video consisting of cropped face segments with consistent dimensions. Thanks.
Yes, you can do this. After detecting the face, you can crop it and write it to the output video file while ensuring that each cropped face is resized to the same dimensions. The pseudocode is shown below.
```python
face_crop = im0_detected[int(y1):int(y2), int(x1):int(x2)]
face_resized = cv2.resize(face_crop, (416, 416))
videowriter.write(face_resized)
```
You're welcome! If resizing is distorting the results, you might want to maintain the aspect ratio by padding the cropped faces. Check out our detailed guide on object cropping for more tips: docs.ultralytics.com/guides/object-cropping/. Good luck! 😊
model.fuse() in ultralytics is used to optimize inference performance by combining certain operations, such as convolutions and batch normalization, into a single fused operation for efficiency.
Hi there! Thanks for your suggestion! Extracting bounding box locations from YOLOv8 is a great topic. In the meantime, you can check out our documentation on this at YOLOv8 Docs docs.ultralytics.com. Make sure you're using the latest versions of `torch` and `ultralytics` for the best experience. Stay tuned for more tutorials! 🚀
You can obtain the class names by using the following code after loading the model:
```python
from ultralytics import YOLO

model = YOLO('yolov8n.pt')
classes_names = model.names
```
Thanks, Ultralytics Team!
You're welcome! If you have any more questions, feel free to ask. Happy coding! 😊 For more details, check out our AzureML Quickstart Guide docs.ultralytics.com/guides/azureml-quickstart/. Thanks, Ultralytics Team!
You can use the mentioned code below to display the resultant image.
```python
from PIL import Image
from ultralytics import YOLO

# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')

# Run inference on 'bus.jpg' and 'zidane.jpg'
results = model(['bus.jpg', 'zidane.jpg'])  # results list

# Visualize the results
for i, r in enumerate(results):
    # Plot results image
    im_bgr = r.plot()  # BGR-order numpy array
    im_rgb = Image.fromarray(im_bgr[..., ::-1])  # RGB-order PIL image

    # Show results to screen (in supported environments)
    r.show()

    # Save results to disk
    r.save(filename=f'results{i}.jpg')
```
For more information, you can explore our Predict docs available at: docs.ultralytics.com/modes/predict/#plotting-results
To implement multi-stream Object Tracking, you can refer to the Ultralytics Docs: docs.ultralytics.com/modes/track/#multithreaded-tracking. Please keep in mind that if you wish to perform object detection on multiple streams instead, you can replace 'track' with 'predict'.
If you encounter any code-related issues, please feel free to open an issue in the Ultralytics GitHub Repository here: github.com/ultralytics/ultralytics/issues
Hi, I want to figure out where the live detected results are stored (I am using a webcam), and I want it to speak out the results using pyttsx3 or any TTS engine. My code so far is given below; I am planning on integrating it with the rest of the project.
```python
from ultralytics import YOLO

model = YOLO("yolov8l")
results = model.predict(source="0", show=True, conf=0.5)
results.show()
```
Thanks in anticipation of help, and thanks for keeping YOLO free for all!
The live detection results will be saved under the 'runs/detect/predict' folder by default. Ensure that you include the save argument to store the output results. The modified code is provided below.
```python
from ultralytics import YOLO

model = YOLO("yolov8l")
results = model.predict(source="0", show=True, conf=0.5, save=True)
```
Thank you.
if you could just talk a little slower lol, without swallowing half the words and slurring through. but the video by itself was super cool, excited to try it out!
Thank you for the feedback! We'll slow down the speech for better clarity next time. We are glad you found the video cool, and we hope you enjoy trying it out!
Appreciate you sharing your experience! We'll have a discussion with the creator and make the necessary updates. You can expect clearer voice and more effective sound in the upcoming videos. Thanks Ultralytics Team!
I tried to run code for object detection in real time on my local PC, but the result is wrong, because it draws the rectangle in another place. This is the code:
```python
from ultralytics import YOLO
import cv2
import numpy as np

model = YOLO("yolov8n.pt")
labels = model.names
COLORS = np.random.randint(0, 255, size=(len(labels), 3), dtype="int64")
cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if not ret:
        print("Cant detect camera")
        break
    results = model.track(frame, stream=True)
    for result in results:
        classes_names = result.names
        for box in result.boxes:
            if box.conf[0] > 0.6:
                [x1, y1, x2, y2] = box.xyxy[0]
                x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)
                label_index = int(box.cls[0])
                label_class = classes_names[label_index]
                color = (
                    int(COLORS[label_index][0]),
                    int(COLORS[label_index][1]),
                    int(COLORS[label_index][2]),
                )
                cv2.rectangle(frame, (x1, y1), (x2, y2), color, 2)
                cv2.putText(
                    frame,
                    f"{classes_names[int(label_index)]}{box.conf[0]}:.2f",
                    (x1, y1),
                    cv2.FONT_HERSHEY_SIMPLEX,
                    1,
                    color,
                    2,
                )
    cv2.imshow("Frame", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```
It looks like there might be an issue with how the bounding box coordinates are being processed or drawn. Let's ensure that the coordinates are correctly extracted and used. Also, make sure you're using the latest versions of `torch` and `ultralytics`. Here's a refined version of your code:
```python
from ultralytics import YOLO
import cv2
import numpy as np

# Load the YOLOv8 model
model = YOLO("yolov8n.pt")

# Get class labels
labels = model.names

# Generate random colors for each label
COLORS = np.random.randint(0, 255, size=(len(labels), 3), dtype="int64")

# Open the webcam
cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if not ret:
        print("Can't detect camera")
        break

    # Run inference
    results = model(frame, stream=True)

    for result in results:
        for box in result.boxes:
            if box.conf[0] > 0.6:
                x1, y1, x2, y2 = map(int, box.xyxy[0])
                label_index = int(box.cls[0])
                label_class = labels[label_index]
                color = tuple(map(int, COLORS[label_index]))

                # Draw the bounding box
                cv2.rectangle(frame, (x1, y1), (x2, y2), color, 2)
                cv2.putText(
                    frame,
                    f"{label_class} {box.conf[0]:.2f}",
                    (x1, y1 - 10),
                    cv2.FONT_HERSHEY_SIMPLEX,
                    0.5,
                    color,
                    2,
                )

    # Display the frame
    cv2.imshow("Frame", frame)

    # Break the loop if 'q' is pressed
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```
Make sure to check the following:
1. Ensure you have the latest versions of `torch` and `ultralytics` installed.
2. Verify that your webcam is working correctly.
For more detailed information on using the `predict` mode, you can refer to the Ultralytics YOLOv8 documentation docs.ultralytics.com/modes/predict/. If the issue persists, please provide more details about the environment and any error messages you encounter.
Oh finally, an in depth tutorial for yolov8
Thanks
You're welcome. Let us know what else you'd like to see!
what im trying to do is detect realtime and then put my cursor on the detected object, if possible ,when the pose detects the head and its features are marked as green, i would like to get the cordinates of nose keypoint and put my cursor on that as do it as fast and as efficiently as possible . can you please help me . also how do i use my screen as source. also how would the tracking work , can i make my mouse cursor follow the tracked path
For real-time object detection and cursor control:
- Use OpenCV for object detection.
- Extract nose keypoint coordinates from pose detection.
- Employ PyAutoGUI for efficient cursor placement.
- Capture screen frames with OpenCV.
- Implement real-time cursor tracking for a responsive user experience.
Thanks
@@Ultralytics actually i did try but i couldn't get anywhere . then i asked gpt4 but its was useless
from ultralytics import YOLO
import win32api
from mss import mss
import numpy as np
# Load model, initialize MSS, define screen area
model, sct, monitor = YOLO('yolov8n-pose.pt'), mss(), {'left': 880, 'top': 400, 'width': 800, 'height': 800}
while True:
# Capture screenshot, run inference
screenshot = np.array(sct.grab(monitor))[..., :3]
results = model(screenshot, show=True)
# Check if any detections were made
if len(results.pred) > 0:
# Get the first detection's keypoints
keypoints = results.pred[0].keypoints
# Check if the detection has keypoints
if keypoints:
# Get the nose keypoint (assuming the first keypoint is the nose)
nose_keypoint = keypoints.xy[0]
# Move the cursor to the nose keypoint
win32api.SetCursorPos((int(nose_keypoint[0]), int(nose_keypoint[1])))
the last part , of checking detection and all is from gpt and i don't know what conccution it made but its not working please help me
thank you
Alright, for more technical queries, you can refer to our GitHub Issues section: github.com/ultralytics/ultralytics/issues
@@cv3174 yup and its completed and works amazing. if you train it with game data and proper keypoints it will work even better .but i didn't cause i was satisfied with default one
That sounds impressive! If you need any further assistance or want to explore more about object detection and tracking, check out our detailed guides and documentation: docs.ultralytics.com/guides/vision-eye/. Happy coding! 🚀
can i get the full code for finding the coordinates of the bounding box from the webcam? The given code is not working
You can use the code below to extract the bounding box coordinates from the webcam. For more information, you can explore our docs: docs.ultralytics.com/modes/predict/#working-with-results
```python
import cv2
from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator, colors
model = YOLO("yolov8s.pt")
cap = cv2.VideoCapture(0)
assert cap.isOpened(), "Error reading video file"
while cap.isOpened():
success, im0 = cap.read()
if success:
results = model.predict(im0, show=False, classes=[108])
boxes = results[0].boxes.xyxy.cpu().tolist()
clss = results[0].boxes.cls.cpu().tolist()
annotator = Annotator(im0, line_width=2, example=model.names)
if boxes is not None:
for box, cls in zip(boxes, clss):
annotator.box_label(box, color=colors(int(cls), True),
label=model.names[int(cls)])
print("Bounding Box Coordinates : ", box)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
continue
print("Video frame is empty or video processing has been successfully completed.")
break
cap.release()
cv2.destroyAllWindows()
```
Thanks
do you have a tutorial on how to extract a txt file containing the timestamps at which each object is detected for video input?
There isn't a tutorial specifically covering timestamps. For more comprehensive guidance, feel free to inquire in our community!
GitHub Issues: github.com/ultralytics/ultralytics/issues
GitHub Discussion: github.com/orgs/ultralytics/discussions
Thanks
Ultralytics Team!
@@Ultralytics sorry if this sounds stupid but which file should i modify so i can get the desired output (txt containing object and timestamp)?
@@user-firebender You are advised to modify the internal code. For more effective responses to your queries, we recommend posting them on our GitHub: github.com/ultralytics/ultralytics/issues
The tracking configuration shares properties with Predict mode, so a lot of this applies to tracking as well.
Certainly, numerous predict mode arguments are supported in tracking mode.
Thanks
Ultralytics Team!
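For instance, a minimal sketch of passing predict-style arguments straight to `model.track` (the video path is a placeholder):
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# conf, iou, classes and show work the same way as in Predict mode
results = model.track(source="path/to/video.mp4", conf=0.3, iou=0.5, classes=[0], show=True)
```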
Is it possible to filter my outputs such that only persons are shown on screen? If so, how can I achieve this? Thank you.
Certainly, it's feasible. You can utilize the provided code to showcase exclusively designated class labels.
"""
from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator, colors
import cv2
model = YOLO("yolov8n.pt")
names = model.names
cap = cv2.VideoCapture("path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"
w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
video_writer = cv2.VideoWriter("ultralytics.avi", cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
while cap.isOpened():
success, im0 = cap.read()
if success:
results = model.predict(im0, show=False)
boxes = results[0].boxes.xyxy.cpu().tolist()
clss = results[0].boxes.cls.cpu().tolist()
annotator = Annotator(im0, line_width=4, example=names)
if boxes is not None:
for box, cls in zip(boxes, clss):
if names[int(cls)] == "person": # Here class name whose bbox you want to display
annotator.box_label(box, label=names[int(cls)])
cv2.imshow("ultralytics", im0)
video_writer.write(im0)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
continue
print("Video frame is empty or video processing has been successfully completed.")
break
cap.release()
video_writer.release()
cv2.destroyAllWindows()
"""
Thanks
Ultralytics Team!
Thank you for the video👍. How can you get the coordinates of an oriented bounding Box from an image?
Below is the provided code snippet for obtaining the coordinates of Oriented Bounding Boxes using Ultralytics YOLOv8.
```python
from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator, colors
import cv2
# Initialize YOLOv8 model
model = YOLO("yolov8n-obb.pt")
names = model.names
# Open video file
cap = cv2.VideoCapture("Path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"
while cap.isOpened():
success, im0 = cap.read()
if success:
# Make predictions on each frame
results = model.predict(im0, show=False)
pred_boxes = results[0].obb
# Initialize Annotator for visualization
annotator = Annotator(im0, line_width=2, example=names)
# Iterate over predicted bounding boxes and draw on image
for d in reversed(pred_boxes):
box = d.xyxyxyxy.reshape(-1, 4, 2).squeeze()
print("Bounding Box Coordinates : ", box)
annotator.box_label(box, names[int(d.cls)], color=colors(int(d.cls), True), rotated=True)
# Display annotated image
cv2.imshow("ultralytics", im0)
# Check for key press to exit
if cv2.waitKey(1) & 0xFF == ord('q'):
break
continue
break
# Release video capture and close windows
cap.release()
cv2.destroyAllWindows()
```
Thanks
Hello, a student here.
I trained a yolov8m object detection model in google Collab. Ran predictions on images and videos. Getting good results so far.
However, I'm rather interested in how I could make inferences out of the video...
For instance, I am interested in whether I could somehow get some observation tables with: classes (the objects detected) and detections (how many of them were detected throughout the video).
Would love to hear how I should proceed with this! I've been reading the documentation but I haven't figured it out yet. Thanks in advance!
If you want to get information about the prediction, i.e. class names, objects detected, and so on, you can work with the `model.predict` method; it provides all the information you need, but you'll need to format it according to your needs, e.g.
```
results = model.predict(im0, show=False)
boxes = results[0].boxes.xyxy.cpu().tolist()
clss = results[0].boxes.cls.cpu().tolist()
```
Thanks
Ultralytics Team!
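As a rough sketch of how such an observation table could be built on top of that (the video path is a placeholder; `collections.Counter` aggregates detections per class across all frames):
```python
from collections import Counter

import cv2
from ultralytics import YOLO

model = YOLO("yolov8m.pt")  # or your custom-trained best.pt
cap = cv2.VideoCapture("path/to/video.mp4")  # placeholder path
totals = Counter()

while cap.isOpened():
    success, frame = cap.read()
    if not success:
        break
    results = model.predict(frame, verbose=False)
    # Count every detection in this frame by class name
    for cls in results[0].boxes.cls.cpu().tolist():
        totals[model.names[int(cls)]] += 1

cap.release()

# Simple observation table: class -> total detections across the video
for name, count in totals.items():
    print(f"{name}: {count}")
```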
I am an ME student, but somehow I need to work with YOLOv8 to graduate. I need to detect defects on 3D printed objects with YOLOv8. I built my own custom dataset and trained it. Unfortunately it's not detecting any defects. Now I am trying to train a new model with twice as many images and 100 epochs. Hope it will work. I am working on Colab.
Thank you for sharing your experience. We wish you the best of luck with your project, and we're here to assist you with any questions or issues that may arise.
Thanks
Excellent Tutorial. Can you please help me to find the person who are using phones in a video or camera stream. Thanks.
Thanks! Certainly, to detect people using phones in a video or camera stream, utilize YOLOv5 or YOLOv8. You can train the model on a dataset with images or frames showing people with phones. Then, deploy the model for real-time detection.
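As a minimal sketch with the COCO-pretrained model (assuming COCO class indices 0 = person and 67 = cell phone; a custom-trained model would be more reliable), one simple heuristic is to flag a person whose box contains a detected phone:
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # COCO-pretrained model

# Assumes COCO indices: 0 = person, 67 = cell phone
results = model.predict("path/to/frame.jpg", classes=[0, 67])

boxes = results[0].boxes.xyxy.cpu().tolist()
clss = results[0].boxes.cls.cpu().tolist()
persons = [b for b, c in zip(boxes, clss) if int(c) == 0]
phones = [b for b, c in zip(boxes, clss) if int(c) == 67]

# Flag a person if any detected phone's center lies inside their bounding box
for px1, py1, px2, py2 in persons:
    for fx1, fy1, fx2, fy2 in phones:
        cx, cy = (fx1 + fx2) / 2, (fy1 + fy2) / 2
        if px1 <= cx <= px2 and py1 <= cy <= py2:
            print("Person using a phone detected at box:", (px1, py1, px2, py2))
```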
if i make more than one model using yolov8 and want to combine them or multi-task them to work in real time, how can i do this?
There is currently no direct method to achieve this. The recommended approach is to retrain the model by fine-tuning it with annotations for all classes. This process will enable the model to detect all the specific objects you are interested in.
Regards,
Ultralytics Team!
really? but it takes more time, doesn't it? In more detail, our project is a self-driving car: one model for a dataset about bumps, another model for a dataset about signs and traffic lights, another model for cars and pedestrians, and finally segmentation to detect the lane. but all models use yolov8 @@Ultralytics
is there any way to combine these models? and thanks again for your reply
In that case, you can try running multiple models using the multi-threading concept; we have provided detailed information about this here: docs.ultralytics.com/modes/track/
Thanks
Ultralytics Team!
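As a rough sketch of the multi-threading idea (the weight files and video path are placeholders for your own models and sources; each thread runs its model independently):
```python
import threading

from ultralytics import YOLO


def run_model(weights, source):
    """Run one YOLOv8 model on its own source in a separate thread."""
    model = YOLO(weights)
    # stream=True yields results frame by frame instead of accumulating them
    for result in model.predict(source=source, stream=True):
        print(weights, "->", len(result.boxes), "detections")


# Hypothetical weight files for the different sub-tasks
threads = [
    threading.Thread(target=run_model, args=("bumps.pt", "path/to/video.mp4")),
    threading.Thread(target=run_model, args=("signs.pt", "path/to/video.mp4")),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```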
thanks @@Ultralytics
We are creating Colab notebooks that will include the code for our videos; we will share them soon! Thanks
The colab notebooks are already released and available at: github.com/ultralytics/ultralytics?tab=readme-ov-file#notebooks
released??
Oops, my mistake! 😅 We haven't released them yet. Stay tuned for updates! 🚀
hello, where can i get the full code? i only see until line 50 in this video, i copy everything and the code isnt working for me
Hi there, please find it here: github.com/niconielsen32/YOLOv8-Class
@@Ultralytics Thank you for sharing. I ran the full code from the link provided, but there is an error that says:
...
ValueError: too many values to unpack (expected 4)
Why is that? How to debug it? 😃
@@FernandaZ-u7c i have also got same error. did you find the fix?
It sounds like there might be an issue with the unpacking of values in the code. Ensure you're using the latest versions of `torch` and `ultralytics`. If the issue persists, please share the specific line causing the error for more detailed help. 😊
For more guidance, check our documentation: docs.ultralytics.com/
let's say i have a yolo model, trained on my custom dataset. i have its weights in my run folder as best.pt. now i want to pass a single image and predict the classes present in it, along with the count. i need to display the image with the bounding boxes that the model predicted. i am getting the count, but don't know how to get the bounding boxes, please help
You can get the bounding box coordinates with object counts using the below code!
'''
from ultralytics import YOLO
from ultralytics.solutions import object_counter
import cv2
model = YOLO("yolov8n.pt")
cap = cv2.VideoCapture("path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"
counter = object_counter.ObjectCounter() # Init Object Counter
region_points = [(20, 400), (1080, 404), (1080, 360), (20, 360)]
counter.set_args(view_img=True,
reg_pts=region_points,
classes_names=model.names,
draw_tracks=True)
while cap.isOpened():
success, im0 = cap.read()
if not success:
break
tracks = model.track(im0, persist=True, show=False)
boxes = tracks[0].boxes.xyxy.cpu()
for box in boxes:
print("Bounding Box Value : ", box)
im0 = counter.start_counting(im0, tracks)
print("In Counts : {}, Out Counts : {}".format(counter.in_counts, counter.out_counts))
cap.release()
cv2.destroyAllWindows()
'''
Thanks
Ultralytics Team!
I wrote all the code as shown in the video but nothing happened
Sorry to hear that! Could you provide more details about the issue? Make sure you're using the latest versions of `torch` and `ultralytics`. You can also check our documentation docs.ultralytics.com/ for troubleshooting tips. 😊
hey, i am working on a real-time object detection project which shows the number of cars in a parking area and the number of empty spaces. what i want is to check whether a space is empty or not from a function, to perform some task. how can i extract the output as text in real time?
Hey there! For real-time object detection and extracting outputs as text, you can use the `predict` function in YOLOv8 to get the detection results. You can then process these results to count cars and determine empty spaces. Make sure you're using the latest versions of `torch` and `ultralytics`. For more details, check out our documentation: docs.ultralytics.com/modes/predict/. If you need further assistance, feel free to ask! 🚗📊
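As a minimal sketch (assuming a COCO-pretrained model where class 2 = car, and an assumed total number of spaces; the printed text can be passed to any function you like):
```python
import cv2
from ultralytics import YOLO

TOTAL_SPACES = 20  # assumed capacity of the parking area
model = YOLO("yolov8n.pt")
cap = cv2.VideoCapture(0)

while cap.isOpened():
    success, frame = cap.read()
    if not success:
        break
    results = model.predict(frame, classes=[2], verbose=False)  # COCO class 2 = car
    cars = len(results[0].boxes)
    empty = max(TOTAL_SPACES - cars, 0)
    # Real-time text output you can feed to another function
    print(f"Cars: {cars}, Empty spaces: {empty}")
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
```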
Great video, thanks! I would like to know how I can get the results with masks when predicting.
See docs.ultralytics.com/modes/predict/#masks for getting masks results from Segment models :)
Hello. What must I extract to the output of the Ultralytics YOLOv8 Model if I want to know the position of the objects being detected?
You can utilize the provided code snippet to retrieve the bounding box position as follows:
```
from ultralytics.utils.plotting import Annotator
from ultralytics import YOLO
import cv2
model = YOLO('yolov8n.pt') # Load a pre-trained or fine-tuned model
# Process the image
source = cv2.imread('path/to/image.jpg')
results = model(source)
# Extract results
annotator = Annotator(source, example=model.names)
for box in results[0].boxes.xyxy.cpu():
x_min, y_min, x_max, y_max = box.tolist()
print("Position of Bounding box:", (x_min, y_min, x_max, y_max))
```
This code snippet utilizes Ultralytics' YOLOv8 model to process an image and extract bounding box results. It then iterates through the detected boxes and prints their positions.
i wanted to see the full code but could not find it 😥
The code is available in our docs: docs.ultralytics.com/usage/simple-utilities/
hello, I have trained my data sets for parcels, may I ask how can I predict the parcels using the live video feed of my camera?
Hello! 🌟 To predict parcels using a live video feed from your camera, you can use the `predict` mode in the Ultralytics YOLOv8 model. First, ensure you have the latest versions of `torch` and `ultralytics` installed. Then, you can run a command like this: ```yolo predict model=path/to/your/model.pt source=0 ``` This command will use your webcam (source=0) for live predictions. For more details, check out the Ultralytics documentation docs.ultralytics.com/modes/predict. If you encounter any issues, please share specific error messages or code snippets. Happy detecting! 📦✨ For more resources, visit our FAQ docs.ultralytics.com/help/FAQ/.
Is this complete code snippet available that is used in video?
Yes, you can extract the output of Ultralytics YOLOv8 by following our docs: docs.ultralytics.com/usage/python/#__tabbed_3_2
The code snippets are available in the docs.
Thanks
Ultralytics Team!
@Ultralytics I don't think that's what muhammad meant. I'm pretty sure that he wanted the code that was being shown on THIS video. Not some generic example code. I would also like to request the code that was shown on this video. I'm just getting started on using YOLO, and it would help my understanding by a lot if I could replicate this. Thanks for the wonderful video tho! Much appreciated 👏 👏
We regularly update the code to enhance the user experience. The code provided above is the latest and can be used to extract the detection outputs easily.
Thanks
Ultralytics Team!
how can i get only the segmentation area, not the mask? i want the original cropped image from the video frame
You can achieve this effortlessly by leveraging the principles of instance segmentation. For coding implementation, feel free to explore our documentation page at: docs.ultralytics.com/guides/instance-segmentation-and-tracking/#__tabbed_1_1
@@Ultralytics yes, but i need the segmentation-shape crop image, not the bounding box crop image
Got it! To isolate and crop the segmented area, you can follow these steps:
1. Load the model and run inference:
```python
from ultralytics import YOLO
model = YOLO("yolov8n-seg.pt")
results = model.predict(source="path/to/your/video/frame.jpg")
```
2. Generate a binary mask and draw contours:
```python
import cv2
import numpy as np
img = np.copy(results[0].orig_img)
b_mask = np.zeros(img.shape[:2], np.uint8)
contour = results[0].masks.xy[0].astype(np.int32).reshape(-1, 1, 2)
cv2.drawContours(b_mask, [contour], -1, (255, 255, 255), cv2.FILLED)
```
3. Isolate the object using the binary mask:
```python
isolated = cv2.bitwise_and(img, img, mask=b_mask)
```
This will give you the original cropped image based on the segmentation shape. For more detailed steps, check out our guide: docs.ultralytics.com/guides/isolating-segmentation-objects/
Great video. Can i get the codes?
Thank you for your kind words! 😊 You can find all the code and resources you need in the Ultralytics YOLO GitHub repository: Ultralytics YOLO GitHub github.com/ultralytics/ultralytics. For detailed documentation and tutorials, visit our official docs: Ultralytics Documentation docs.ultralytics.com/. If you have any specific questions or run into issues, feel free to ask here or open an issue on GitHub. Happy coding! 🚀
Hello, I want to extract the label without the rest of the details... How can I do it? I use yolov8
You can utilize the mentioned code below to do this.
```python
import cv2
from ultralytics import YOLO
model = YOLO("yolov8n.pt")
cap = cv2.VideoCapture("path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"
while cap.isOpened():
success, im0 = cap.read()
if success:
results = model.predict(im0, show=False)
clss = results[0].boxes.cls.cpu().tolist()
if clss:
for cls in clss:
print(f"label {model.names[int(cls)]}")
if cv2.waitKey(1) & 0xFF == ord('q'):
break
continue
break
cap.release()
cv2.destroyAllWindows()
```
is there a way to use the counting method it has, but instead of using the center of the bounding box, it uses its bottom? i'm a bit stuck with this. i want to take the center, extract the bounding box height property, divide it by 2, and then move the center down so i'm technically touching the floor of the box and get a more accurate reading of collision with a trigger that i'm using...
Yes, you can modify the counting method to use the bottom of the bounding box. You can achieve this by adjusting the center coordinates. Here's a quick formula: `bottom_y = center_y + (height / 2)`. This will give you the bottom y-coordinate of the bounding box. For more details, you can check our documentation on object counting: docs.ultralytics.com/guides/object-counting/. If you need further assistance, feel free to ask! 😊
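As a minimal sketch of that shift, assuming you work from the `xywh` box format (center plus width/height):
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
results = model.predict("path/to/image.jpg")

# xywh gives the box center plus width/height, which makes the shift easy
for x, y, w, h in results[0].boxes.xywh.cpu().tolist():
    bottom_point = (x, y + h / 2)  # shift the center down by half the height
    print("Bottom-center of box:", bottom_point)
```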
I want this episode's code, not the entire code
You can use mentioned code below to extract the output of Ultralytics YOLOv8.
```python
import cv2
import numpy as np
from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator, colors
model = YOLO("yolov8n.pt")
names = model.model.names
cap = cv2.VideoCapture("Path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"
while cap.isOpened():
success, im0 = cap.read()
if success:
results = model.predict(im0, show=False)
boxes = results[0].boxes.xyxy.cpu().tolist()
clss = results[0].boxes.cls.cpu().tolist()
annotator = Annotator(im0, line_width=3, example=names)
if boxes is not None:
for box, cls in zip(boxes, clss):
annotator.box_label(box, color=(255, 144, 31), label=names[int(cls)])
if cv2.waitKey(1) & 0xFF == ord('q'):
break
continue
print("Video frame is empty or video processing has been successfully completed.")
break
cap.release()
cv2.destroyAllWindows()
```
Thanks
@@Ultralytics how to see the content of the .pt file and how to know the accuracy of my .pt file i.e the accuracy of my dataset to show any other
To see the content of your `.pt` file and check the accuracy of your model, you can follow these steps:
1. Load the Model: Use the `torch` library to load the `.pt` file.
2. Check Model Accuracy: Evaluate the model on a validation dataset to get accuracy metrics.
Here's a concise example:
```python
import torch
from ultralytics import YOLO
# Load the model
model = YOLO("path/to/your/model.pt")

# Print model architecture
print(model.model)

# Evaluate the model on a validation dataset
results = model.val(data="path/to/your/dataset.yaml")
print(results)
```
For more details, check out our documentation docs.ultralytics.com/modes/predict/.
How is the frame rate dynamically put onto the window? In my while(True) loop with OpenCV, I use the putText() method on every frame but it seems to stay at 30 besides visible slowdowns at times. How do I make the framerate account for processing time from my YOLO model?
By default, we don't provide support for displaying the frames per second (FPS) on the frame. However, in the context of YOLO and OpenCV, dynamically updating the frame rate on the window involves accounting for the processing time of your YOLO model. Instead of assuming a fixed frame rate, you can calculate the actual frames per second based on the time the YOLO model inference takes.
Thanks
Ultralytics Team!
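As a minimal sketch of that idea (timing each iteration around the inference call and drawing the measured FPS with `putText`):
```python
import time

import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
cap = cv2.VideoCapture(0)

while cap.isOpened():
    success, frame = cap.read()
    if not success:
        break
    start = time.time()
    results = model.predict(frame, verbose=False)
    annotated = results[0].plot()
    # FPS based on actual per-frame processing time, not the camera's nominal rate
    fps = 1.0 / max(time.time() - start, 1e-6)
    cv2.putText(annotated, f"FPS: {fps:.1f}", (20, 40), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    cv2.imshow("Frame", annotated)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```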
I am working on a project using MediaPipe for pose estimation. MediaPipe only supports single-person pose estimation, but I want to make it multi-person. I was going to use YOLOv8's object detection and loop the MediaPipe pose estimation code through each bounding box, but I am not sure how I could do that. Is there a way to run the MediaPipe code through each bounding box of a human instead of the whole frame, and then put it back together in one frame?
Thanks for sharing your feedback. You can directly use the Ultralytics YOLOv8 Pose Model, which supports multi-person pose estimation in a single frame. For more information, please visit: docs.ultralytics.com/tasks/pose/
Great idea! You can definitely use YOLOv8 for detecting multiple people and then apply MediaPipe to each detected bounding box. Here's a simple approach:
1. Use YOLOv8 to detect people and get bounding boxes.
2. Crop each bounding box from the original frame.
3. Run MediaPipe pose estimation on each cropped image.
4. Overlay the pose results back onto the original frame.
Make sure your YOLOv8 and MediaPipe versions are up to date for best performance. If you need more guidance, check out the Ultralytics Docs docs.ultralytics.com/ for YOLOv8 setup. Good luck with your project! 🚀
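As a rough sketch of that crop-and-estimate loop (assuming COCO class 0 = person and MediaPipe's standard Pose API; adjust to your own setup):
```python
import cv2
import mediapipe as mp
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
pose = mp.solutions.pose.Pose(static_image_mode=True)

frame = cv2.imread("path/to/frame.jpg")
results = model.predict(frame, classes=[0])  # COCO class 0 = person

for x1, y1, x2, y2 in results[0].boxes.xyxy.cpu().int().tolist():
    crop = frame[y1:y2, x1:x2]
    # MediaPipe expects RGB input
    pose_result = pose.process(cv2.cvtColor(crop, cv2.COLOR_BGR2RGB))
    if pose_result.pose_landmarks:
        mp.solutions.drawing_utils.draw_landmarks(
            crop, pose_result.pose_landmarks, mp.solutions.pose.POSE_CONNECTIONS
        )
        # Paste the annotated crop back into the original frame
        frame[y1:y2, x1:x2] = crop

cv2.imwrite("multi_person_pose.jpg", frame)
```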
Sorry, but I once watched a video and it showed shorter lines of code. Can you tell me the difference between these two pieces of code?
import cv2
from ultralytics import YOLO
#from ultralytics.models.yolo.detect.predict import DetectionPredictor
import time
model = YOLO ("best.pt")
results= model.predict(source="0", show=True)
print(results)
The code remains largely the same, but when the video was created, YOLOv8 had just been introduced. Since then, we've significantly optimized the code, resulting in reduced lines of inference code compared to what was presented in the video. Thank you.
@@Ultralytics I understand your answer to mean that the above two pieces of code are equivalent and have the same effect. Is that correct?
@@k23vanthuan98 Yes, for inference the latest code is available at: docs.ultralytics.com/modes/predict/
Hello there! I'm building a simple program that uses yolov8 to detect a person and then call a function (the function should connect to a pixhawk, but I already wrote it).
I just need the code where it triggers the function when a person is detected.
You can add a check once a person is detected; the sample code is provided below. We hope this will help :)
```python
import cv2
from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator, colors
model = YOLO("yolov8s.pt")
cap = cv2.VideoCapture("Path/to/video/file.mp4")
w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH,
cv2.CAP_PROP_FRAME_HEIGHT,
cv2.CAP_PROP_FPS))
out = cv2.VideoWriter("Ultralytics.avi",
cv2.VideoWriter_fourcc(*"MJPG"), fps, (w, h))
while True:
ret, im0 = cap.read()
if not ret:
break
annotator = Annotator(im0, line_width=3)
results = model.predict(im0) # For prediction
boxes = results[0].boxes.xyxy.cpu()
clss = results[0].boxes.cls.cpu().tolist()
for box, cls in zip(boxes, clss):
if model.names[int(cls)] == "person":
print("Person Detected!!! Ultralytics !!!")
# ... Execute your logic because a person is detected ...
annotator.box_label(box, label=model.names[int(cls)], color=colors(int(cls), True))
out.write(im0)
cv2.imshow("Ultralytics", im0)
if cv2.waitKey(1) & 0xFF == ord("q"):
break
out.release()
cap.release()
cv2.destroyAllWindows()
```
Regards
Ultralytics Team!
I want to combine the yolo8 with a resnet50 classifier, to run my custom trained model, and if it detects certain classes it invokes the classifier and prints the output of the classifier as bounding boxes instead of the detectors class. Are there any resources on this?
Officially, we do not offer support for backbone modification, but you have the flexibility to comprehend the YOLOv8 architecture and subsequently customize the classifier to suit your requirements. For additional details, please refer to our documentation: docs.ultralytics.com/models/yolov8/
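Unofficially, a rough sketch of the two-stage route (run your detector, then hand each crop to a torchvision ResNet-50; the trigger class IDs and weight files are placeholders for your own models):
```python
import cv2
import torch
from torchvision import models, transforms
from ultralytics import YOLO

detector = YOLO("yolov8n.pt")  # your custom detector in practice
classifier = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
classifier.eval()

preprocess = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

TRIGGER_CLASSES = {0}  # detector class IDs that should invoke the classifier (placeholder)

frame = cv2.imread("path/to/image.jpg")
results = detector.predict(frame)

for box, cls in zip(results[0].boxes.xyxy.cpu().int().tolist(), results[0].boxes.cls.cpu().tolist()):
    if int(cls) in TRIGGER_CLASSES:
        x1, y1, x2, y2 = box
        crop = cv2.cvtColor(frame[y1:y2, x1:x2], cv2.COLOR_BGR2RGB)
        with torch.no_grad():
            logits = classifier(preprocess(crop).unsqueeze(0))
        print("Classifier label index for this box:", int(logits.argmax()))
```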
How do I use the result of a prediction to control an output device, for example a servo motor? Thank you very much and have a nice day.
We offer support for the Jetson Nano. You can follow our QuickStart guide to get started: docs.ultralytics.com/guides/nvidia-jetson/
Almost the same steps can be used for other embedded devices, except the installation steps, which can differ for each device. Thanks
@@Ultralytics thank you very much.
You're welcome! If you have any more questions, feel free to ask. Have a great day! 😊
Is there a built-in method for sorting the results of the boxes by their height or is there somewhere in yolo that i can implement this internally?
At the moment, we don't directly support bounding box sorting based on height/width. However, feel free to adapt it for your specific use case.
Thanks,
The Ultralytics Team!
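For example, a minimal sketch of sorting the detections by box height outside the library, with no internal changes:
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
results = model.predict("path/to/image.jpg")

boxes = results[0].boxes.xyxy.cpu().tolist()
clss = results[0].boxes.cls.cpu().tolist()

# Sort detections by bounding-box height (y2 - y1), tallest first
detections = sorted(zip(boxes, clss), key=lambda bc: bc[0][3] - bc[0][1], reverse=True)
for box, cls in detections:
    print(model.names[int(cls)], "height:", box[3] - box[1])
```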
hi, could you please lend me a hand? I'm trying to export the results and send them to an excel or csv file, but i can't seem to get the right code. i already exported it to torchscript but it's been impossible to resolve
Sure, you can extract the bounding boxes and classes using mentioned code below.
```python
import cv2
from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator, colors
model = YOLO("yolov8s.pt")
cap = cv2.VideoCapture("path/to/video/file.mp4")
while cap.isOpened():
success, frame = cap.read()
if success:
results = model.track(frame)
annotator = Annotator(frame, line_width=4)
boxes = results[0].boxes.xyxy.cpu()
clss = results[0].boxes.cls.cpu().tolist()
for box, cls in zip(boxes, clss):
print(f"Bounding box {box}, class name {model.names[int(cls)]}")
if cv2.waitKey(1) & 0xFF == ord("q"):
break
else:
break
cap.release()
cv2.destroyAllWindows()
```
Thanks
Ultralytics Team!
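Building on that, a minimal sketch of writing the same values to a CSV file with Python's `csv` module (the output filename is arbitrary):
```python
import csv

import cv2
from ultralytics import YOLO

model = YOLO("yolov8s.pt")
cap = cv2.VideoCapture("path/to/video/file.mp4")

with open("detections.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["frame", "class", "x1", "y1", "x2", "y2"])
    frame_idx = 0
    while cap.isOpened():
        success, frame = cap.read()
        if not success:
            break
        results = model.predict(frame, verbose=False)
        for box, cls in zip(results[0].boxes.xyxy.cpu().tolist(), results[0].boxes.cls.cpu().tolist()):
            writer.writerow([frame_idx, model.names[int(cls)], *[round(v, 2) for v in box]])
        frame_idx += 1

cap.release()
```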
hey, can you please help me.........
I have my custom trained model (best.pt); it detects two things, person and headlight. Now I want the output according to these conditions: 1. If the model detects only a headlight, return 0; 2. If the model detects only a person, return 1; 3. If the model detects both a headlight and a person, return 0.
Your inquiries appear to be completely technical. We suggest submitting them to the Ultralytics Discussion section for more effective assistance. You can do so at github.com/orgs/ultralytics/discussions.
Thanks
Ultralytics Team!
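As a rough sketch of that kind of rule (the class names are assumed to match your custom model's names exactly):
```python
from ultralytics import YOLO

model = YOLO("best.pt")  # your custom model
results = model.predict("path/to/image.jpg")

detected = {model.names[int(c)] for c in results[0].boxes.cls.cpu().tolist()}

# 1. only headlight -> 0, 2. only person -> 1, 3. both -> 0
if "person" in detected and "headlight" not in detected:
    output = 1
else:
    output = 0
print(output)
```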
Great video! any chance of getting this code file?
Thanks :) All code samples are available in Ultralytics Docs: docs.ultralytics.com/modes/predict/
i have run the command (model.predict(source="C:\\Users\\User\\Documents\\Bandicam\\check.mp4", stream=True,save=True, imgsz=320, conf=0.5))
and got this ()///////////////////
where does the output video get saved when stream=True? ...please help
We highly recommend upgrading the Ultralytics package, and hopefully, this will address your issue.
```pip install -U ultralytics```
Thanks,
Ultralytics Team!
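One more thing worth checking, as a sketch of a likely cause: with `stream=True`, `predict` returns a generator, so nothing is processed or saved until you iterate over it (the video path below is a placeholder):
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# With stream=True the call returns a generator; iterate it so frames are processed and saved
results = model.predict(source="path/to/check.mp4", stream=True, save=True, imgsz=320, conf=0.5)
for r in results:
    pass  # saved outputs should then appear under runs/detect/predict*
```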
hey, can you help me with getting the confidence? let's say i want to get only those predictions on my images whose confidence is above 0.8, how can i do that?
Sure, you can simply use conf=0.8 argument with prediction command i.e
```yolo predict conf=0.8 source="path/to/video.mp4" ....```
Thanks
Ultralytics Team!
@@Ultralytics Thank you :
You're welcome! 😊 If you have any more questions, feel free to ask. For more details, check out our FAQ docs.ultralytics.com/help/FAQ/. Happy coding! 🚀
i want to draw the center lines of the bounding boxes of objects detected and tracked by yolov8 in live video. tell me how to do it, with code
You can utilize the provided code to draw centroids of objects and implement object tracking over time.
```python
from collections import defaultdict
import cv2
import numpy as np
from ultralytics import YOLO
model = YOLO('yolov8n.pt')
video_path = "Path/to/video/file.mp4"
cap = cv2.VideoCapture(video_path)
track_history = defaultdict(lambda: [])
while cap.isOpened():
success, frame = cap.read()
if success:
results = model.track(frame, persist=True, show_conf=False)
boxes = results[0].boxes.xywh.cpu()
track_ids = results[0].boxes.id.int().cpu().tolist()
annotated_frame = results[0].plot()
for box, track_id in zip(boxes, track_ids):
x, y, w, h = box
track = track_history[track_id]
track.append((float(x), float(y)))
if len(track) > 30:
track.pop(0)
points = np.hstack(track).astype(np.int32).reshape((-1, 1, 2))
cv2.polylines(annotated_frame, [points], isClosed=False, color=(230, 230, 230), thickness=2)
cv2.imshow("YOLOv8 Tracking", annotated_frame)
if cv2.waitKey(1) & 0xFF == ord("q"):
break
else:
break
cap.release()
cv2.destroyAllWindows()
```
Thanks
Mate! great tutorial, Thank you
Thank you so much for the kind words and feedback! 😊 I'll definitely keep that in mind for future videos. It's great to hear you're enjoying the content, and I'm thrilled to have you along for the journey. Cheers to learning and growing together!
Hey, I have a model which does inference at a 768 x 448 image size, so I can't use the predicted image directly because of its low resolution; I have to show images at a high resolution of 1920 x 1080. I am able to extract these results, but when I plot them onto the original size (1920 x 1080), the masks are not coming out properly. I mean the masks end up a bit outside of the bounding boxes on the higher-resolution images. I also tried resizing the masks according to the original image, but that didn't work. How can I fix this?
Hey! It sounds like the issue might be with the scaling of the masks when resizing to the original image size. Ensure that the scaling factor is applied consistently to both the bounding boxes and the masks. If you need more detailed guidance, please check out our documentation on handling inference results: docs.ultralytics.com/guides/instance-segmentation-and-tracking/. Also, make sure you're using the latest versions of `torch` and `ultralytics`. If the problem persists, please provide more details about your approach. 😊
Hello, excellent job, very good videos! Where can I download the episode codes so I can practice?
Sure, all codes are available in our docs: docs.ultralytics.com/modes/predict/
Thanks
Ultralytics Team!
Can we get a similar video for Yolo v9??
Yes it's coming soon, maybe in next 2 weeks :)
I trained a model in Google colab and exported it in '.tflite' format. Now, working with Visual Studio code to check the model. It is not working. And I cannot comprehend the problem. It says that I am giving the wrong input. When I give a single image 'image.jpg' it gives the error: ValueError: Cannot set tensor: Dimension mismatch. Got 800 but expected 3 for dimension 1 of input 0. And if I give the image for first preprocessing and then infer by the model. It gives the error: ValueError: Cannot set tensor: Dimension mismatch. Got 3 but expected 800 for dimension 3 of input 0...
What is the image size at which you exported the YOLOv8 model to tflite format? Thanks
@@Ultralytics Umm. I don't know. How can i find out?
For more details, you can check out our documentation at: docs.ultralytics.com/
Hi there! 👋 It sounds like you're encountering an issue with input dimensions for your `.tflite` model. To help you better, could you please share more details, such as the exact preprocessing steps you're using and the shape of the input tensor expected by your model? In the meantime, ensure you're using the latest versions of `torch` and `ultralytics`. You can update them with: ` pip install --upgrade torch ultralytics ` For more guidance, check out our documentation docs.ultralytics.com and the common issues guide docs.ultralytics.com/guides/yolo-common-issues/. If you need further assistance, feel free to provide additional details here. 😊 Unfortunately, we can't offer private support, but we're here to help in the comments!
Good morning! How can I convert the YOLOv8 .pt file to .weights? Can you help me with this one? Thank you very much.
Directly there is no support for this feature, but you can use third party tools to convert PyTorch (.pt) to Darknet (.weights) format. The currently available formats are mentioned at the following link: docs.ultralytics.com/modes/export/#arguments
is there a way i can get only the label of the detected object? i need to use it with text-to-speech
Certainly, you can retrieve the label using the provided code snippet:
```python
from ultralytics import YOLO
import cv2

model = YOLO("yolov8s.pt")
frame = cv2.imread("path/to/image.jpg")  # or any frame from your video stream
results = model.predict(frame, verbose=False)
boxes = results[0].boxes.xywh.cpu()
clss = results[0].boxes.cls.cpu().tolist()
names = results[0].names
for box, cls in zip(boxes, clss):
x, y, w, h = box
label = str(names[int(cls)])
#.....
#.....
```
Hello. Is it possible to make the model predict just one label? Imagine that I have my paper-scissors-rock prediction model, but when it comes to connecting the webcam to make predictions in real time, I just want the model to predict the "Rock" on screen.
Yes, you can configure your model to predict only one specific label, like "Rock," in real-time inference. Use the `classes` argument in the `model.predict()` method to filter predictions by class ID. For example, set `classes=[ID_for_Rock]` where `ID_for_Rock` corresponds to the class index for "Rock" in your dataset. This ensures the model will only output predictions for that class.
For more details on prediction arguments, check here: Predict - Ultralytics YOLO Docs docs.ultralytics.com/modes/predict/#arguments.
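For example, a minimal sketch assuming the class index for "Rock" in your dataset is 2 (replace it with the real index from your data.yaml):
```python
from ultralytics import YOLO

model = YOLO("path/to/your/rock_paper_scissors.pt")  # hypothetical custom model

# Only predictions for the "Rock" class index are returned; 2 is an assumed index
results = model.predict(source=0, show=True, classes=[2])
```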
hey !!!!
can you please explain how to get the coordinates of the detected bounding boxes in an image? and one more thing: how can i change the save directory to one single folder, not runs>predict1, predict2, etc.
thanks
Right, you can use the code below to extract the bounding boxes from an image; additionally, you can store the output in a specific directory.
```python
import cv2
import os
from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator, colors
model = YOLO("yolov8s.pt")
output_dir = "test"
image = cv2.imread("path/to/image.png")
results = model.predict(image, show=False, classes=[108])
boxes = results[0].boxes.xyxy.cpu().tolist()
clss = results[0].boxes.cls.cpu().tolist()
annotator = Annotator(image, line_width=2, example=model.names)
for box, cls in zip(boxes, clss):
annotator.box_label(box, color=colors(int(cls), True),
label=model.names[int(cls)])
print("Bounding Box Coordinates : ", box)
cv2.imwrite(os.path.join(output_dir, "output.png"), image)
```
Thanks
I saw that during training and validation some preprocessing is applied (normalization / scaling by 255). But when we use .predict, do we have to do it manually, or is it already implemented?
Great question! When you use `.predict` with Ultralytics YOLOv8, the preprocessing steps like normalization and standardization are automatically handled for you. No need to do it manually! For more details, you can check out our documentation docs.ultralytics.com/modes/predict/. 😊🚀
how to show tracking id?
You can display the tracking ID by calling `model.track`, for example:
```yolo track model=yolov8n.pt source="ua-cam.com/video/LNwODJXcvt4/v-deo.html" conf=0.3 iou=0.5 show=True```
Thanks,
Ultralytics Team!
What exactly are results? why did you use just results[0], instead of all of results?
The `results` object contains all the detection outcomes for a given frame. By indexing into `results`, you can access the data within it, allowing you to later process or extract bounding boxes, class IDs, tracking IDs, and more.
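For instance, a minimal sketch: `predict` returns one `Results` object per input image or frame, so for a single image the list has exactly one element:
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
results = model.predict("path/to/image.jpg")  # list with one Results object per input image

print(len(results))           # 1 for a single image
print(results[0].boxes.xyxy)  # bounding boxes for that image
```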
Hi, nice video, but a quick question: can I save each frame of the output along with the segmentation mask overlaid on top? If so, how? Thanks for your help!
Hi! Yes, you can save each frame with the segmentation mask overlayed by using OpenCV to draw and save the frames. After running the model's `predict()` method, you can use OpenCV functions like `cv2.imwrite()` to save the frames. For a detailed guide on handling segmentation results, check out the Isolating Segmentation Objects docs.ultralytics.com/guides/isolating-segmentation-objects/ documentation. 😊
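As a minimal sketch of that flow, using `plot()` to render the masks and `cv2.imwrite()` to save each frame (paths and filenames are placeholders):
```python
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")
cap = cv2.VideoCapture("path/to/video.mp4")
frame_idx = 0

while cap.isOpened():
    success, frame = cap.read()
    if not success:
        break
    results = model.predict(frame, verbose=False)
    annotated = results[0].plot()  # frame with masks and boxes drawn on top
    cv2.imwrite(f"frame_{frame_idx:05d}.jpg", annotated)
    frame_idx += 1

cap.release()
```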
@Ultralytics Thanks for the help! I'll have a look at the documentation and let you know if theres anything else, but yeah thanks👍
You're very welcome! 😊 Feel free to reach out if you have more questions-happy experimenting with YOLOv8! 🚀
@@Ultralytics Oh yh one more thing: in this video the guy uses it for object detection tracking: ua-cam.com/video/wuZtUMEiKWY/v-deo.html but I plan to use it for segmentation tracking on a custom datasetet so do i just replace the yolov8.pt model with yolov8-seg.pt? please let me know what you think!
Yes, exactly! 👍 To use segmentation tracking, simply replace the `yolov8.pt` model with `yolov8-seg.pt`. This model is designed for segmentation tasks, enabling it to overlay and track masks across frames. Be sure to adjust your data and settings as needed for your custom dataset. Check out the Instance Segmentation and Tracking Guide docs.ultralytics.com/guides/instance-segmentation-and-tracking/ for more details! 🚀
Hey, I am trying to find out how to control the way the model saves the output images in runs/detect/predict. Is there a way to change it? I used the save_dir attribute in the model.predict() function, but the model still saved it in the default way. Also, is it possible to get the total count of the detections predicted by the model in a run?
To change the save directory, make sure you're using the `save_dir` parameter correctly in `model.predict()`. If it's not working, double-check for typos or version updates. For counting detections, you can iterate over the `Results` objects and sum up the detections. If issues persist, ensure you're using the latest versions of `ultralytics` and `torch`. For more details, check out the predict mode documentation docs.ultralytics.com/modes/predict/. 😊
@@Ultralytics I am using the save_dir parameter correctly. I don't see why it is not working. Iterating over the results gives number of images which had the desired object to be detected. I need total objects detected in all the images, is there a way to do it?
If `save_dir` isn't working, ensure you're using the latest package versions. For counting total detections, iterate over `results` and sum up `len(result.boxes)` for each `result`. This will give you the total number of detected objects. If issues persist, consider checking the documentation or reaching out on our Discord for community support: ultralytics.com/discord.
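As a minimal sketch of that total-count idea across a whole run (the source folder is a placeholder):
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
results = model.predict(source="path/to/images/", stream=True)

# Sum detections over every Results object the run produces
total = sum(len(r.boxes) for r in results)
print("Total objects detected:", total)
```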
How can I print the confidence scores for every class id for an image? Say I have 6 classes and a single image. I want to see what the confidence is for every label.
Absolutely, you can utilize the provided code to display the confidence score for each bounding box.
"""
from ultralytics import YOLO
# Load a pre-trained YOLOv8n model
model = YOLO('yolov8n.pt')
names = model.model.names
# Perform inference on 'bus.jpg' with specified parameters
results = model.predict("ultralytics.com/images/bus.jpg", verbose=False, conf=0.5)
# Process detections
boxes = results[0].boxes.xywh.cpu()
clss = results[0].boxes.cls.cpu().tolist()
confs = results[0].boxes.conf.float().cpu().tolist()
for box, cls, conf in zip(boxes, clss, confs):
print(f"Class Name: {names[int(cls)]}, Confidence Score: {conf}, Bounding Box: {box}")
"""
Hope this helps. Thanks.
@@Ultralytics What if I want to save the output from the terminal to a .txt file? I tried using save_txt=True, but the .txt contains only numbers; it didn't include the class names or any other string.
To save the results with class names and confidence scores to a `.txt` file, you can write the results out yourself instead of relying on `save_txt`. Here's how you can do it:
```python
from ultralytics import YOLO
# Load a pre-trained YOLOv8n model
model = YOLO('yolov8n.pt')
names = model.model.names

# Perform inference on 'bus.jpg' with specified parameters
results = model.predict("ultralytics.com/images/bus.jpg", verbose=False, conf=0.5)

# Save results to a .txt file
txt_file = "output.txt"
with open(txt_file, "w") as f:
    for result in results:
        boxes = result.boxes.xywh.cpu()
        clss = result.boxes.cls.cpu().tolist()
        confs = result.boxes.conf.float().cpu().tolist()
        for box, cls, conf in zip(boxes, clss, confs):
            f.write(f"Class Name: {names[int(cls)]}, Confidence Score: {conf}, Bounding Box: {box}\n")
print(f"Results saved to {txt_file}")
```
This script will save the results to `output.txt` with class names, confidence scores, and bounding box coordinates.
For more details, you can refer to the Ultralytics documentation docs.ultralytics.com/reference/engine/results/.
How can I display a real-time message such as 'Person detected' within the frame when a person is identified? For example, if I am running a program in real-time and it detects a person, how do I show the message 'Person detected' directly on the frame?
Sure, the provided code allows for the direct display of 'Person detected' on the frame in case a person is identified in the video frame.
```
import cv2
from pathlib import Path
from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator
# Load the YOLOv8 model
model = YOLO('yolov8n.pt')  # pre-trained model
# model = YOLO('path/to/best.pt')  # or load your fine-tuned model instead
# Path to Video
video_path = "path/to/video.mp4"
if not Path(video_path).exists():
raise FileNotFoundError(f"Source path {video_path} does not exist.")
names = model.model.names
cap = cv2.VideoCapture(video_path)
while cap.isOpened():
success, frame = cap.read()
if success:
results = model.predict(frame)
boxes = results[0].boxes.xyxy.cpu().numpy().astype(int)
classes = results[0].boxes.cls.tolist()
confidences = results[0].boxes.conf.tolist()
annotator = Annotator(frame, line_width=2, example=str(names))
for box, cls, conf in zip(boxes, classes, confidences):
if names[int(cls)] == "person":
annotator.box_label(box, "Person Detected", (255, 42, 4))
cv2.imshow("YOLOv8 Detection", frame)
if cv2.waitKey(1) & 0xFF == ord("q"):
break
else:
break
cap.release()
cv2.destroyAllWindows()
```
@@Ultralytics Thank you so much. Is there any method to convert the detection to speech, e.g. if a person is detected, a voice output says 'person detected'?
Third-party tools can be utilized for speech processing.
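For example, a minimal sketch using the third-party pyttsx3 library (any text-to-speech library would do; this one runs offline):
```python
import pyttsx3
from ultralytics import YOLO

engine = pyttsx3.init()
model = YOLO("yolov8n.pt")

results = model.predict("path/to/image.jpg")
detected = {model.names[int(c)] for c in results[0].boxes.cls.cpu().tolist()}

if "person" in detected:
    engine.say("Person detected")
    engine.runAndWait()
```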
what would I add to my code (using live video cam) to get the coordinates of the boxes? I'm planning on extracting these coordinates and creating a custom segmentation dataset and pairing it to the rest of my project
My code so far:
from ultralytics import YOLO
model = YOLO("yolov8l")
results = model.predict(source="0", show=True, conf=0.5)
results.show()
You can get the bounding box coordinates using the provided code.
"""
from ultralytics import YOLO
import cv2
import sys
cap = cv2.VideoCapture(0)
model = YOLO("yolov8n.pt")
if not cap.isOpened():
print("Error reading video file")
sys.exit()
while cap.isOpened():
success, frame = cap.read()
if success:
results = model.predict(frame)
boxes = results[0].boxes.xywh.cpu()
clss = results[0].boxes.cls.cpu().tolist()
names = results[0].names
for box, cls in zip(boxes, clss):
x, y, w, h = box
label = str(names[int(cls)])
#......
#......
#......
"""
@@Ultralytics thank you so much, but I'm actually looking to migrate to a segmentation project and need to export every bounding box that is detected from my custom dataset. how would I do so (export what is segmented and where, in a way that is readable by an external source)? Also, do you have any methods/tips for using this data for pathfinding (self-driving car)? I need a way to export the data provided by the segmentation model and find a route that avoids certain segmentations (like grass)
@@bitmapsquirrel6869 seems like your query is more technical, we would recommend asking your queries on our GitHub Issue Section: github.com/ultralytics/ultralytics/issues
Hello there! New to yolo and am trying to do a project here.
I made my own yolov8 model seeing you guys' videos and had a question
Well, i want to make a simple python script where the model is loaded with model = YOLO(...) and it's giving the output as it should with detection_output.
However, if i want to make "if" conditions, like if the model detects a specific class from my dataset it will print something, how am i supposed to do that, like we do with a variety of other open source models like the ones from cvzone or smth?
Plz help
You can achieve this by iterating through the detection output of your YOLOv8 model in Python and checking for specific classes. Here's a basic example:
```python
for detection in detection_output:
if detection['class'] == specific_class_index:
print("Detected specific class! Do something...")
```
Replace `specific_class_index` with the index of the class you're interested in. This allows you to execute custom actions based on the detected classes.
@@Ultralytics
from ultralytics import YOLO
import numpy
# load a pretrained YOLOv8n model
model = YOLO("path:\to\yolov8_custom.pt")
# predict on an image
detection_output = model.predict(source=0, conf=0.25, save=False, show=True)
# Display tensor array
print(detection_output.probs)
# Display numpy array
print(detection_output[0].numpy())
for detection in detection_output:
if detection['class'] == breadboard:
print("working")
----------------------------------------------------------
this is my full code, sir. idk what id did wrong but the output in pycharm is this-
0: 480x640 (no detections), 273.0ms
0: 480x640 (no detections), 269.0ms
0: 480x640 1 breadboard, 269.0ms
0: 480x640 1 breadboard, 269.0ms
0: 480x640 (no detections), 272.0ms
0: 480x640 (no detections), 270.0ms
0: 480x640 (no detections), 268.0ms
0: 480x640 1 breadboard, 272.0ms
is there anything that needs to be modified?
It looks like you're on the right track, but there are a few adjustments needed. The `detection_output` from YOLOv8 is a list of `Results` objects, and you need to access the `boxes` attribute to get the detected classes. Here's an updated version of your code:
```python
from ultralytics import YOLO
# Load a pretrained YOLOv8 model
model = YOLO("path/to/yolov8_custom.pt")

# Predict on an image (webcam in this case)
detection_output = model.predict(source=0, conf=0.25, save=False, show=True)

# Iterate through the detection results
for result in detection_output:
    for box in result.boxes:
        if box.cls == breadboard_class_index:  # Replace with the actual class index for 'breadboard'
            print("Detected breadboard! Do something...")
```
Make sure to replace `breadboard_class_index` with the actual index of the 'breadboard' class in your dataset.
For more details on how to use YOLOv8 in Python, you can refer to our documentation: YOLOv8 Python Usage docs.ultralytics.com/usage/python/.
i am having a problem with yolov8 where when i run yolov8 it draws bounding boxes everywhere and a bunch of different class against my blank wall, what is the fix to this inaccuracy?
It sounds like your model might be overfitting or not trained properly. Here are a few steps you can take to address this:
1. Check Your Dataset: Ensure your training dataset is well-annotated and diverse. Poor-quality data can lead to inaccurate predictions.
2. Verify Training Settings: Make sure your training configuration (e.g., learning rate, batch size) is appropriate for your dataset.
3. Regularization Techniques: Consider using techniques like data augmentation to improve model generalization.
4. Evaluate Model Performance: Monitor metrics like precision, recall, and mAP during training to ensure the model is learning correctly.
For more detailed troubleshooting, check out our guide on common YOLO issues: YOLO Common Issues docs.ultralytics.com/guides/yolo-common-issues/.
If you need further assistance, please provide more details about your training setup and dataset.
@@Ultralytics when i run it in command line it works fine however when i run it in python the inaccuracies start happening
It sounds like there might be a discrepancy between your CLI and Python setups. Here are a few things to check:
1. Environment Consistency: Ensure that the Python environment you're using matches the one used in the CLI. Check versions of `torch`, `ultralytics`, and other dependencies.
2. Model Configuration: Verify that the model configuration and weights used in Python are the same as those in the CLI.
3. Inference Settings: Ensure that inference settings like confidence threshold and image size are consistent between CLI and Python.
For more detailed troubleshooting, you can refer to our guide on common YOLO issues: YOLO Common Issues docs.ultralytics.com/guides/yolo-common-issues/.
If the problem persists, please share more details about your Python script and the specific inaccuracies you're encountering.
Great content. Is it possible to see the entire code that you used? i have searched the github but i don't see this code. Also, is it possible you have a video on extracting the bounding box picture for a given conf threshold?
If you wish to retrieve the bounding box coordinates, confidence score, and class name for each object, you can employ the provided code below. The code enables you to extract bounding boxes with a confidence score greater than or equal to 0.5.
"""
from ultralytics import YOLO
# Load a pre-trained YOLOv8n model
model = YOLO('yolov8n.pt')
names = model.model.names
# Perform inference on 'bus.jpg' with specified parameters with conf=0.5
results = model.predict("ultralytics.com/images/bus.jpg", verbose=False, conf=0.5)
# Process detections
boxes = results[0].boxes.xywh.cpu()
clss = results[0].boxes.cls.cpu().tolist()
confs = results[0].boxes.conf.float().cpu().tolist()
for box, cls, conf in zip(boxes, clss, confs):
print(f"Class Name: {names[int(cls)]}, Confidence Score: {conf}, Bounding Box: {box}")
"""
Hope this helps. Thanks.
@@Ultralytics thank you for the response. my intention is to run it with my webcam. i have added 'source=0'. it then asks for 'stream=True', then shows repetitive warnings without bringing up the image window to show the results.
@@stefangraham7117 This could be due to insufficient memory.
@@Ultralytics i mean the system i am using has 32GB of RAM. could there be a part of the code i can tweak to limit the memory usage?
@@stefangraham7117 With 32GB of RAM, memory shouldn't be an issue. Let's ensure you're using the latest versions of `torch` and `ultralytics`. You can update them using:
```bash
pip install --upgrade torch ultralytics
```
If the issue persists, please share the exact warnings you're seeing. For more details on setting up a security alarm system with YOLOv8, check out our guide: docs.ultralytics.com/guides/security-alarm-system/.
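One more sketch of a likely cause (not an official diagnosis): with `stream=True`, the call returns a generator, so you need to loop over it for frames to be processed and the window to appear:
```python
from ultralytics import YOLO

model = YOLO('yolov8n.pt')

# stream=True returns a generator; iterating it processes the webcam frame by frame
for result in model.predict(source=0, stream=True, conf=0.5, show=True):
    boxes = result.boxes.xywh.cpu()
    confs = result.boxes.conf.float().cpu().tolist()
    for box, conf in zip(boxes, confs):
        print(f"Confidence Score: {conf}, Bounding Box: {box}")
```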
I want to label each object (bounding box) in the txt file to help train the YOLO model. But I don't know what the exact format is. Can you give me the recipe?
Sure! For YOLO models, each image's label is stored in a `.txt` file with the same name as the image. Each row in the file represents an object in the format:
`<class> <x_center> <y_center> <width> <height>`
Here’s what each part means:
- `class`: The class index of the object (zero-indexed).
- `x_center`, `y_center`: Center of the bounding box, normalized (0-1) based on image width and height.
- `width`, `height`: Dimensions of the bounding box, also normalized.
Example:
```
0 0.5 0.5 0.4 0.3
1 0.3 0.4 0.2 0.1
```
This example defines two objects: one of class 0 and one of class 1. For more details, check the Ultralytics YOLO Format Guide docs.ultralytics.com/datasets/detect/.
@@Ultralytics Thanks for the nice explanations. When fine-tuning YOLOWorld, I ran the following command: yolo task=detect mode=train model=yolov8x-worldv2.pt imgsz=640 data=.yaml epochs=300 batch=32 device='cuda:0' name=yolov8x-worldv2. After that, I replaced the checkpoint yolov8x-worldv2.pt with the best checkpoint that was just trained (best.pt) and continued fine-tuning. However, the program threw an error:
TypeError: forward() missing 1 required positional argument: 'guide'.
How can I fix this?
It seems like the issue arises because `yolov8x-worldv2.pt` uses a custom model architecture that includes a `guide` argument in its `forward()` method. When continuing fine-tuning with `best.pt`, it may not properly load the custom model logic.
To resolve this:
1. Ensure that the `best.pt` checkpoint is being loaded with the exact same architecture as `yolov8x-worldv2.pt`. If `yolov8x-worldv2.pt` depends on custom modifications, use its original model definition or ensure the `guide` parameter is correctly handled.
2. Re-run fine-tuning with the `--resume` flag instead of manually swapping checkpoints. Example:
```bash
yolo task=detect model=yolov8x-worldv2.pt imgsz=640 data=.yaml epochs=300 batch=32 device='cuda:0' name=yolov8x-worldv2 --resume
```
Check out Ultralytics Docs docs.ultralytics.com/tasks/detect/ for more training insights. If the issue persists, confirm the compatibility of the custom model architecture in your setup.
@@Ultralytics The best.pt checkpoint was obtained from the training run with the best performance using the initial checkpoint of YOLOWorld yolov8x-worldv2.pt. Therefore, I ensure that it shares the same architecture. What is the mechanism of the --resume argument? Does it use the checkpoint from the previous fine-tuning session to continue fine-tuning (I assume it does not)? Or is there an alternative, such as adding the --guide argument?
The `--resume` argument in Ultralytics YOLO is designed to continue training seamlessly from a previously saved checkpoint by restoring the model weights, optimizer state, and training parameters, including learning rate scheduler and epoch number. It ensures that all aspects of training are resumed exactly as they were at the point of interruption. You can specify the checkpoint to resume from using `model=`.
However, in your case with YOLOWorld, the `forward()` method requiring a `guide` argument indicates that the architecture involves custom modifications. The `--resume` mechanism won't automatically handle such architecture-specific parameters unless they are part of the checkpoint's configuration.
If you're encountering the `TypeError` related to `guide`, ensure that:
1. The checkpoint (`best.pt`) and architecture (`yolov8x-worldv2.pt`) are fully compatible.
2. The `guide` argument required by the custom model is explicitly passed during training or fine-tuning.
For your scenario, adding the `--guide` argument manually is necessary if the `forward()` method explicitly depends on it. Unfortunately, `--resume` alone does not automatically infer custom arguments like `guide`. For more details on checkpoint management and resuming training, visit the Trainer Docs docs.ultralytics.com/reference/engine/trainer/.
I noticed in your previous comment that you would have colab notebooks. Do you have notebooks now? Where is the address?
You can extract the output of Ultralytics YOLOv8 using mentioned code below.
"""
from ultralytics import YOLO
# Load the YOLOv8 model
model = YOLO('yolov8n.pt')
names = model.model.names
# Perform inference on an image
results = model('ultralytics.com/images/bus.jpg')
# Extract bounding boxes, classes, names, and confidences
boxes = results[0].boxes.xyxy.tolist()
classes = results[0].boxes.cls.tolist()
confidences = results[0].boxes.conf.tolist()
# Iterate through the results
for box, cls, conf in zip(boxes, classes, confidences):
x1, y1, x2, y2 = box
confidence = conf
detected_class = cls
name = names[int(cls)]
"""
Thanks
Ultralytics Team!
Great video. But just wondering: YOLOv8 has a count that can be viewed via a CLI command. Is there an example of how to return that count in JSON?
We are uncertain about which count you're inquiring about - whether it's the object counting module or the total count of objects within a frame.
Hi @@Ultralytics, for example I have an image of 5 cars and 1 truck. How can I get a response that says how many cars and trucks it found within the image?
It's the total count of objects within a frame.
Got it! You can use Ultralytics YOLOv8 to count objects and return the results in JSON format. Here's a concise example:
```python
from ultralytics import YOLO
from collections import Counter
import json

# Load the model
model = YOLO("yolov8n.pt")

# Perform inference
results = model("path/to/your/image.jpg")

# Count detections per class name
class_ids = results[0].boxes.cls.tolist()
counts = Counter(model.names[int(cls)] for cls in class_ids)

# Convert to JSON
json_counts = json.dumps(dict(counts))
print(json_counts)
```
This will give you a JSON response with the count of each object class. For more details, check out our object counting guide docs.ultralytics.com/guides/object-counting/.
Hi, how do I extract the section areas in the version of YOLO that can detect sections?
Hi there! To extract section areas using the YOLO model, you can start by labeling the relevant sections as bounding boxes on your document. After that, you can train a YOLOv8 model using these labeled bounding boxes.
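Once such a model is trained, the pixel area of each detected section can be computed from its bounding box coordinates. A minimal sketch (the weights and image path below are placeholders):
```python
from ultralytics import YOLO

# Placeholder weights/image; swap in your trained section-detection model
model = YOLO("path/to/best.pt")
results = model("path/to/document.jpg")

for box in results[0].boxes:
    x1, y1, x2, y2 = box.xyxy[0].tolist()
    area = (x2 - x1) * (y2 - y1)  # area in pixels
    print(model.names[int(box.cls[0])], round(area, 1))
```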
is it possible to only get 1 result for each class based on the highest conf value? For exampls, I have 2 classes "fruit" and "peduncle". I only want 1 fruit and 1 peduncle from the detection result with the highest confidence value.
Yes, you can achieve that by filtering the detection results to keep only the highest confidence value for each class. You can use the `results` object to access the detections and then apply your filtering logic. For more details on handling results, check out our documentation docs.ultralytics.com/. 😊
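As a minimal sketch, assuming a model trained on your `fruit` and `peduncle` classes (the paths below are placeholders), you could keep only the top-confidence box per class like this:
```python
from ultralytics import YOLO

# Placeholder weights/image; swap in your own
model = YOLO("path/to/best.pt")
results = model("path/to/image.jpg")
boxes = results[0].boxes

# Keep only the highest-confidence detection for each class
best_per_class = {}
for i in range(len(boxes)):
    cls = int(boxes.cls[i])
    conf = float(boxes.conf[i])
    if cls not in best_per_class or conf > best_per_class[cls][1]:
        best_per_class[cls] = (boxes.xyxy[i].tolist(), conf)

for cls, (xyxy, conf) in best_per_class.items():
    print(model.names[cls], xyxy, f"{conf:.2f}")
```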
Hello, is there a method to generate a video where I can detect faces? Specifically, I'd like to take an input video, identify faces within it, and produce an output video consisting of cropped face segments with consistent dimensions.
thanks
Yes, you can do this. After detecting a face, crop it and write it to the output video file, making sure each cropped face is resized to the same dimensions. Pseudocode is shown below.
```python
# Crop the detected face from the annotated frame using its bounding box
face_crop = im0_detected[int(y1):int(y2), int(x1):int(x2)]
# Resize every crop to the same dimensions so the output video is consistent
face_resized = cv2.resize(face_crop, (416, 416))
# Write the resized crop to the output video
videowriter.write(face_resized)
```
@@Ultralytics thanks for the response! But the resize function is distorting the results; I'll try to find a way. Thanks anyway.
You're welcome! If resizing is distorting the results, you might want to maintain the aspect ratio by padding the cropped faces. Check out our detailed guide on object cropping for more tips: docs.ultralytics.com/guides/object-cropping/. Good luck! 😊
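If you want to try that, here is a minimal letterbox-style helper, a sketch assuming OpenCV; the function name and target size are only illustrative:
```python
import cv2

def resize_with_padding(img, size=416, pad_color=(0, 0, 0)):
    """Resize to size x size while preserving aspect ratio, padding the remainder."""
    h, w = img.shape[:2]
    scale = size / max(h, w)
    resized = cv2.resize(img, (int(w * scale), int(h * scale)))
    top = (size - resized.shape[0]) // 2
    bottom = size - resized.shape[0] - top
    left = (size - resized.shape[1]) // 2
    right = size - resized.shape[1] - left
    return cv2.copyMakeBorder(resized, top, bottom, left, right,
                              cv2.BORDER_CONSTANT, value=pad_color)

# Usage: face_padded = resize_with_padding(face_crop, 416)
```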
where is the code used in this video?
Hi there, please find it here: github.com/niconielsen32/YOLOv8-Class
Why do you use model.fuse()?
model.fuse() in Ultralytics optimizes inference performance by folding certain operations, such as convolution and batch normalization layers, into single fused operations.
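A minimal sketch of where the call typically sits in an inference script (the image path is a placeholder):
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
model.fuse()  # fold Conv + BatchNorm into fused layers for faster inference
results = model("path/to/image.jpg")
```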
HEY @Ultralytics please make a video for extracting the locations of the bounding boxes which are given by the yolov8 model
Hi there! Thanks for your suggestion! Extracting bounding box locations from YOLOv8 is a great topic. In the meantime, you can check out our documentation on this at YOLOv8 Docs docs.ultralytics.com. Make sure you're using the latest versions of `torch` and `ultralytics` for the best experience. Stay tuned for more tutorials! 🚀
Just wanted to ask, how to get the class name of the live result. Thank you very much.
You can obtain the class names by using the following code after loading the model:
```python
from ultralytics import YOLO

# Load the model; model.names maps class indices to class names
model = YOLO('yolov8n.pt')
classes_names = model.names
```
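To get the class name for each detection in a live stream, you can map the predicted class indices through `model.names`. A minimal sketch, assuming webcam source `0`:
```python
from ultralytics import YOLO

model = YOLO('yolov8n.pt')

# stream=True yields one Results object per webcam frame
for result in model.predict(source=0, stream=True, show=True):
    detected = [model.names[int(cls)] for cls in result.boxes.cls.tolist()]
    print(detected)
```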
Thanks
Ultralytics Team!
Can you give me this code please? thank you
Sure, you can use the code below to extract the output of Ultralytics YOLOv8 object detection.
```python
from ultralytics import YOLO

# Load the YOLOv8 model
model = YOLO('yolov8n.pt')

# Perform inference on an image
results = model('ultralytics.com/images/bus.jpg')

# Extract bounding boxes, classes, names, and confidences
boxes = results[0].boxes.xyxy.tolist()
classes = results[0].boxes.cls.tolist()
names = results[0].names
confidences = results[0].boxes.conf.tolist()

# Iterate through the results
for box, cls, conf in zip(boxes, classes, confidences):
    x1, y1, x2, y2 = box
    confidence = conf
    detected_class = cls
    name = names[int(cls)]
```
Thanks
Ultralytics Team!
@@Ultralytics thank you
You're welcome! If you have any more questions, feel free to ask. Happy coding! 😊
For more details, check out our AzureML Quickstart Guide docs.ultralytics.com/guides/azureml-quickstart/.
Thanks,
Ultralytics Team!
Can we get the code you are using, so that we can understand how to run it in Python?
We are creating Colab notebooks that will include the code for our YouTube videos; we will share them soon! Thanks
When will the Google Colab notebooks be available? @@Ultralytics
Notebooks will be available at the end of this week! Thanks
How do I extract the resulting image or video, and how do I display it?
You can use the code below to display the resulting images.
```python
from PIL import Image

from ultralytics import YOLO

# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')

# Run inference on 'bus.jpg' and 'zidane.jpg'
results = model(['bus.jpg', 'zidane.jpg'])  # results list

# Visualize the results
for i, r in enumerate(results):
    # Plot results image
    im_bgr = r.plot()  # BGR-order numpy array
    im_rgb = Image.fromarray(im_bgr[..., ::-1])  # RGB-order PIL image

    # Show results to screen (in supported environments)
    r.show()

    # Save results to disk
    r.save(filename=f'results{i}.jpg')
```
For more information, you can explore our Predict docs available at: docs.ultralytics.com/modes/predict/#plotting-results
I use YOLOv8 to predict on multiple streams, but there seems to be no way to know which stream a result came from. Anyone know how to deal with this?
To implement multi-stream Object Tracking, you can refer to the Ultralytics Docs: docs.ultralytics.com/modes/track/#multithreaded-tracking.
Please keep in mind that if you wish to perform object detection on multiple streams instead, you can replace 'track' with 'predict'.
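If you run each stream in its own thread, the results stay tied to the source that produced them. A minimal sketch (the sources below are just hypothetical examples):
```python
import threading

from ultralytics import YOLO

def run_stream(source, stream_id):
    # Each thread gets its own model instance, so results stay tied to stream_id
    model = YOLO("yolov8n.pt")
    for result in model.track(source=source, stream=True, show=False):
        print(f"stream {stream_id}: {len(result.boxes)} objects")

# Hypothetical sources: a local webcam and an RTSP URL
sources = [0, "rtsp://example.com/stream"]
threads = [threading.Thread(target=run_stream, args=(src, i), daemon=True) for i, src in enumerate(sources)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```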
the code isn't working for me 😢
If you encounter any code-related issues, please feel free to open an issue in the Ultralytics GitHub Repository here: github.com/ultralytics/ultralytics/issues
Can I get a link for this code to refer?
Sure! You can find the code and more details on our GitHub page: Ultralytics GitHub github.com/ultralytics/ultralytics 🚀
Hi, I want to figure out where the live detection results are stored (I am using a webcam), and I want it to speak out the results using pyttsx3 or any TTS engine; my code so far is given below. I am planning on integrating it with the rest of my project.
from ultralytics import YOLO
model = YOLO("yolov8l")
results = model.predict(source="0", show=True, conf=0.5)
results.show()
thanks in anticipation of help! and a thanks for keeping yolo free for all !
The live detection results will be saved under the `runs/detect/predict` folder. Make sure you include the `save=True` argument to store the output. The modified code is provided below.
```python
from ultralytics import YOLO

# Load the model
model = YOLO("yolov8l.pt")

# Run webcam inference; save=True writes the annotated output to disk
results = model.predict(source="0", show=True, conf=0.5, save=True)
```
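To speak the detections out loud, here is a minimal sketch assuming `pyttsx3` is installed (this is not an official Ultralytics integration, just one way to wire it up):
```python
from ultralytics import YOLO
import pyttsx3

# Text-to-speech engine (pyttsx3 is assumed to be installed)
engine = pyttsx3.init()
model = YOLO("yolov8l.pt")

# stream=True yields one Results object per webcam frame
for result in model.predict(source="0", show=True, conf=0.5, stream=True):
    names = {model.names[int(cls)] for cls in result.boxes.cls.tolist()}
    for name in names:
        engine.say(name)
    engine.runAndWait()
```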
Thank you.
hello, can you release the full code?
You can get access to all YOLOv8 code at github.com/ultralytics/ultralytics
if you could just talk a little slower lol, without swallowing half the words and slurring through. but the video by itself was super cool, excited to try it out!
Thank you for the feedback! We'll slow down the speech for better clarity next time. We are glad you found the video cool, and we hope you enjoy trying it out!
Thanks more
Welcome 😊
please slow down 😭
Thank you for your feedback. We will pass the request for a slower speaking pace on to the presenter.
Can we get another person to talk on these videos please? This guy speaks way too fast and it is sometimes hard to understand what he is talking about.
Appreciate you sharing your experience! We'll have a discussion with the creator and make the necessary updates. You can expect clearer narration and better audio in upcoming videos.
Thanks
Ultralytics Team!
@@Ultralytics Thanks for listening. Just take things a little slower please. Thanks again.
@alfred You do have an option to watch the video in 0.5x speed
@@shyamsaseethar2602 Very clever. Will try.
I tried running real-time object detection code on my local PC, but the result is wrong because the rectangles are drawn in the wrong place. This is the code:
from ultralytics import YOLO
import cv2
import numpy as np

model = YOLO("yolov8n.pt")
labels = model.names
COLORS = np.random.randint(0, 255, size=(len(labels), 3), dtype="int64")

cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        print("Cant detect camera")
        break
    results = model.track(frame, stream=True)
    for result in results:
        classes_names = result.names
        for box in result.boxes:
            if box.conf[0] > 0.6:
                [x1, y1, x2, y2] = box.xyxy[0]
                x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)
                label_index = int(box.cls[0])
                label_class = classes_names[label_index]
                color = (
                    int(COLORS[label_index][0]),
                    int(COLORS[label_index][1]),
                    int(COLORS[label_index][2]),
                )
                cv2.rectangle(frame, (x1, y1), (x2, y2), color, 2)
                cv2.putText(
                    frame,
                    f"{classes_names[int(label_index)]}{box.conf[0]}:.2f",
                    (x1, y1),
                    cv2.FONT_HERSHEY_SIMPLEX,
                    1,
                    color,
                    2,
                )
    cv2.imshow("Frame", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
It looks like there might be an issue with how the bounding box coordinates are being processed or drawn. Let's ensure that the coordinates are correctly extracted and used. Also, make sure you're using the latest versions of `torch` and `ultralytics`. Here's a refined version of your code:
```python
from ultralytics import YOLO
import cv2
import numpy as np

# Load the YOLOv8 model
model = YOLO("yolov8n.pt")

# Get class labels
labels = model.names

# Generate random colors for each label
COLORS = np.random.randint(0, 255, size=(len(labels), 3), dtype="int64")

# Open the webcam
cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if not ret:
        print("Can't detect camera")
        break

    # Run inference
    results = model(frame, stream=True)

    for result in results:
        for box in result.boxes:
            if box.conf[0] > 0.6:
                x1, y1, x2, y2 = map(int, box.xyxy[0])
                label_index = int(box.cls[0])
                label_class = labels[label_index]
                color = tuple(map(int, COLORS[label_index]))

                # Draw the bounding box
                cv2.rectangle(frame, (x1, y1), (x2, y2), color, 2)
                cv2.putText(
                    frame,
                    f"{label_class} {box.conf[0]:.2f}",
                    (x1, y1 - 10),
                    cv2.FONT_HERSHEY_SIMPLEX,
                    0.5,
                    color,
                    2,
                )

    # Display the frame
    cv2.imshow("Frame", frame)

    # Break the loop if 'q' is pressed
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```
Make sure to check the following:
1. Ensure you have the latest versions of `torch` and `ultralytics` installed.
2. Verify that your webcam is working correctly.
For more detailed information on using the `predict` mode, you can refer to the Ultralytics YOLOv8 documentation docs.ultralytics.com/modes/predict/. If the issue persists, please provide more details about the environment and any error messages you encounter.