ComfyUI: Yolo World, Inpainting, Outpainting (Workflow Tutorial)

  • Published Jun 11, 2024
  • This tutorial focuses on Yolo World segmentation and advanced inpainting and outpainting techniques in ComfyUI. It has 7 workflows, including Yolo World instance segmentation, color grading, image processing, object/subject removal using LaMa / MAT, inpainting plus refinement, and outpainting.
    ------------------------
    JSON File (YouTube Membership): www.youtube.com/@controlaltai...
    Yolo World Efficient Sam S CPU/GPU Jit Download: huggingface.co/camenduru/Yolo...
    Fooocus Inpaint Model Download: huggingface.co/lllyasviel/foo...
    LaMa Model Download: github.com/Sanster/models/rel...
    MAT Model Download: github.com/Sanster/models/rel...
    Inference Install (ComfyUI Portable Instructions):
    Go to the python_embeded folder. Right-click and open a terminal.
    Command to Install Inference:
    python -m pip install inference==0.9.13
    python -m pip install inference-gpu==0.9.13
    Command to Uninstall Inference:
    python -m pip uninstall inference
    python -m pip uninstall inference-gpu
    Command to Upgrade Inference to Latest Version (not compatible with Yolo World):
    python -m pip install --upgrade inference
    python -m pip install --upgrade inference-gpu
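    To confirm which version is currently installed (a quick check from the same python_embeded terminal; pip show simply prints the package metadata, including the version):
    python -m pip show inference
    python -m pip show inference-gpu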
    ------------------------
    TimeStamps:
    0:00 Intro.
    01:01 Requirements.
    03:21 Yolo World Installation.
    04:32 Models, Files Download.
    05:09 Yolo World, Efficient SAM.
    12:50 Image Processing, Color Grading.
    19:09 Fooocus Inpaint Patch.
    21:56 Inpaint & Refinement.
    24:17 Subject, Object Removal.
    31:58 Combine Workflow.
    33:17 Outpainting.
  • Howto & Style

COMMENTS • 179

  • @nocnestudio7845
    @nocnestudio7845 1 day ago

    Great tutorials, good job 💪... 28:12 is a very heavy operation, one frame takes forever... GIMP is faster.

  • @AI3EFX
    @AI3EFX 3 months ago +6

    Crazy. Time for me to put my ComfyUI cape back on. Great video!!!

  • @LuckRenewal
    @LuckRenewal 2 months ago +1

    I really love your videos, you explain everything very well!

  • @brgresearch
    @brgresearch 2 months ago +1

    Brilliant explanations. Thanks for making this video, it is so useful, and you have a great mastery of the subject.

    • @brgresearch
      @brgresearch 2 months ago

      My only gripe, as I'm replicating these workflows, is that perhaps the seed numbers you use could be simpler to replicate, or perhaps pasted in the description? That way we could easily get the exact same generations that you did. Right now, not only is the seed long and complicated, but it's not always clear, as in the case of the bear-on-the-street seed 770669668085503, which even on a 2K monitor (the easiest frame I could find was at 22:16) was really hard to make out due to the 6's looking like 8's. Still replicable, but for ease of following along, an easier seed would be helpful. Thank you again for making this, I'm halfway through replicating the workflows and I'm beginning to understand!

    • @controlaltai
      @controlaltai 2 months ago +1

      @brgresearch The seed number I used is random. Don't use the same seed: it's not CPU-generated and will still give different results if you are not using the same GPU as mine. Use any random seed and keep randomising it. You are supposed to replicate the workflow intent rather than the precise output. Meaning, the workflow is supposed to do x with y output; at your end it should still do x, just with z output. I hope that makes sense.

    • @controlaltai
      @controlaltai 2 months ago +1

      Also, if you need the seed for any video, just send an email or comment on the video and I will post it for you. I prefer not to post it in the description, as someone without a 4090 will get a different output.

    • @brgresearch
      @brgresearch 2 months ago

      @@controlaltai thank you for the clarification. I did not know that the hardware will also affect the generation. My thought was to try to follow along as exactly as possible, so that I would get the same results and be able to make the changes you made in a similar manner, especially with the seam correction example, because I did not want to get a different bear! I completely understand that it's okay to get a z output, even if yours is y, as long as the workflow process arrives at the same type of result. I'm practicing with the workflow today, and it's really amazing what can be accomplished with this workflow. Thank you so much again, and really appreciate the work and education you are doing.

  • @Artishtic
    @Artishtic 13 days ago

    Thank you for the object removal section

  • @ericruffy2124
    @ericruffy2124 3 months ago

    Amazing as always, thanks for all the details Mali~

  • @webu509
    @webu509 2 months ago +5

    The good news is that the ComfyUI Yolo World plugin is great. The bad news is that the author of this plugin has made many plugins and never maintains them.

    • @haljordan1575
      @haljordan1575 2 months ago

      That's my least favorite thing about self-run image workflows.

  • @freke80
    @freke80 2 months ago

    Very well explained! ❤

  • @user-cb1dm8cy1s
    @user-cb1dm8cy1s 2 months ago

    Great. Very detailed.

  • @sanchitwadehra
    @sanchitwadehra 2 months ago

    Dhanyavad (thank you)

  • @runebinder
    @runebinder 3 months ago +2

    Excellent video. I want to try and use ComfyUI as much as I can, but inpainting and outpainting have been better for me in other UIs; hopefully this will help. I've also only just realised you can zoom in and out of the canvas in the Mask Editor, from watching you do it while fixing the edge of the mask after adding the polar bear lol.

  • @rsunghun
    @rsunghun 3 months ago

    This is so high level 😮

  • @oliviertorres8001
    @oliviertorres8001 3 months ago +8

    The results are amazing. But the learning curve to understand (and not only copy/paste) all these workflows seems like a very long journey... Nevertheless, I subscribed immediately 😵‍💫

    • @brgresearch
      @brgresearch 2 months ago +2

      I had the same reaction. I can't imagine how these workflows were first created, but I'm grateful that eventually, because of these examples, I might understand it.

    • @blacksage81
      @blacksage81 2 months ago +2

      If you put the time in, you will understand. Also, I suggest finding a specific use case. In other words "Why am I in this space, what do I want the AI to help me create?" For me, it was consistent characters, so learning masking and inpainting is great for me so I can ensure likeness and improve my training dataset.

    • @oliviertorres8001
      @oliviertorres8001 2 months ago

      @@blacksage81 For me, it's to build my own workflow to sell virtual home staging upfront to my clients. I'm a real estate photographer. Of course, it's worth struggling a little to nail inpainting at a high skill level 🧗

    • @brgresearch
      @brgresearch 2 months ago

      @@blacksage81 this is really good advice. For me, I'm trying to create a filter tool for photos with controlnet and to be able to do minor photo repairs using masking and inpainting. ComfyUI is such a flexible tool in that regard, but at the same time, it's amazing to see how some of the workflows are created.

  • @francaleu7777
    @francaleu7777 3 months ago

    Thank you, great!

  • @swipesomething
    @swipesomething 9 days ago +1

    3:37 After I installed the node, I had the error "cannot import name packaging from pkg_resources". I updated the inference and inference-gpu packages and it worked, so if anybody has the same error, try updating inference and inference-gpu.

  • @jeffg4686
    @jeffg4686 2 months ago

    Thanks for the comment the other day. I had already deleted my post before I saw it, so unfortunately the tip wasn't left for others (because of my deletion).
    The tip (leaving it for others here) was to delete the specific custom node folder if you have problems loading an addon - in certain cases anyway.
    I had an idea for NN model decoders.
    The idea is simple: pass in a portion of the image that is pre-rendered and that you want unchanged in the final image.
    So the decoder would basically do its magic right after the noise is generated.
    Right on top of the noise, the decoder overlays the image you want included (transparent in most cases).
    It could have some functionality in the NN decoder for shading your image clips - both lighting applied to them as well as shadows.
    This might even need a new adapter "type" - but I just haven't gotten deep enough into it yet (sorry if you're reading this as I keep correcting it, it's like 4:48 am... it's pretty bad writing...)
    If you have direct contacts at Stability AI, you might reach out and suggest something about including pre-renders directly into the noise at the beginning of the denoise process.

  • @liwang-pp7dj
    @liwang-pp7dj 3 months ago

    Awesome

  • @geraldwiesinger630
    @geraldwiesinger630 3 months ago

    Wow, awesome results. What resolution are the images? Would this work on 4K images as well, or would it be necessary to downscale or crop the inpaint region first?

    • @controlaltai
      @controlaltai 3 months ago

      It's advisable to downscale to near SDXL resolution, then upscale using ComfyUI or Topaz.

  • @croxyg6
    @croxyg6 2 months ago

    If you have a missing .jit file error, go to SkalskiP's Hugging Face and find the jit files there. Place them in your custom nodes > efficient sam yolo world folder.
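    For reference, the node looks for the files in the custom node's own folder; a sketch of the expected layout (the folder name matches the one in the error traces in this thread, and the CPU file name is assumed from the "Efficient Sam S CPU/GPU Jit" download - verify against your install):
    ComfyUI\custom_nodes\ComfyUI-YoloWorld-EfficientSAM\
        efficient_sam_s_cpu.jit
        efficient_sam_s_gpu.jit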

  • @root6572
    @root6572 20 days ago

    Why are the model files for inpainting not in safetensors format?

  • @ucyuzaltms9324
    @ucyuzaltms9324 1 month ago

    Thanks for this incredible tutorial. I have a problem: I want to use the images that come from Yoloworld ESAM, but there are box and text overlays; how can I remove them?

    • @controlaltai
      @controlaltai 1 month ago

      Yolo World doesn't do anything except segment. You process further by removing objects and use those images. Why would you want to use the images from Yoloworld ESAM directly?

  • @mikrodizels
    @mikrodizels 1 month ago

    Quick question: if I need to add LoRAs to the workflow, should they come before the Self-Attention Guidance + Differential Diffusion nodes or after? Does it make a difference?

    • @controlaltai
      @controlaltai 1 month ago

      I add the LoRA after Self-Attention Guidance and Differential Diffusion. To be honest, I have not tested any other order.

  • @moviecartoonworld4459
    @moviecartoonworld4459 3 months ago

    Thank you for the always great lectures. I am leaving a message because I have a question.
    If you uncheck both mask_combined and mask_extracted in Yoloworld ESAM and run it, I get the error "Error occurred when executing PreviewImage: Cannot handle this data type: (1, 1, 15), |u1". Is there a solution? You can check and run them separately, but if you run with both turned off, the error appears.

    • @controlaltai
      @controlaltai 3 months ago +1

      Thank you! So basically, if you pass the mask on to another node, that node cannot handle multiple masks. If Yolo, for example, detects more than 1 mask, you would get this error when passing it on. For that, you should select an extracted mask value or combine the masks. Only a singular mask image should be passed on.
      If you are getting the error without passing it on, then let me know, something else is wrong, as I double-checked now and I don't get that error.

    • @moviecartoonworld4459
      @moviecartoonworld4459 3 months ago

      Thank you for the answer!! @@controlaltai

  • @user-rk3wy7bz8h
    @user-rk3wy7bz8h 2 months ago

    First of all, thank you.
    The question now is which inpaint method is better: using VAE Encode and then refining with Preview Bridge,
    or
    working directly with VAE Encode & Inpaint Conditioning without any refinement. I want to know how to get the best results.
    :) Appreciate it

    • @controlaltai
      @controlaltai 2 months ago +1

      Hi, so basically I recommend both. VAE Encode is for replacement and replaces way better than VAE Encode & Inpaint Conditioning. However, during extensive testing I found that in some cases the latter can also replace much better, but in many cases I had to keep re-generating with random seeds. I would go with the first method, then try the second, because the second does not always replace an object and its success depends on a variety of factors like the background, the object mask, etc. For minor edits go with the second; for major edits like complete replacement, try the first method, then the second.

  • @Make_a_Splash
    @Make_a_Splash 2 months ago +1

    Hi, thanks for the video. Does this work for SD1.5?

    • @controlaltai
      @controlaltai 2 months ago

      The Fooocus inpaint patch is only for SDXL. Yolo World and color grading don't require any checkpoint.

  • @hotmarcnet
    @hotmarcnet 2 months ago

    When the workflow passes the ESAM Model Loader, there is an error:
    """
    Error occurred when executing ESAM_ModelLoader_Zho:
    PytorchStreamReader failed reading zip archive: failed finding central directory
    """

    • @controlaltai
      @controlaltai 2 months ago

      I have no idea what this error is. Is this on Comfy portable? Windows OS? Or a different environment?

  • @user-rk3wy7bz8h
    @user-rk3wy7bz8h 1 month ago

    Hello 👋 Can I use the replace method with refinement to inpaint a face, to give a woman shorter hair?
    I tried it and it looked bad, with blurred hair and a masking line.

    • @controlaltai
      @controlaltai 1 month ago

      Yeah, do one at a time. Don't mask the entire face, just above the eyebrows and down to the chin; keep the mask within the facial borders. The masking line has to be refined.
      Why it is blurred I have no idea. That depends on your KSampling and checkpoint.
      I suggest you look at Face Detailer for hair. It has an automated workflow specifically for hair.
      ua-cam.com/video/_uaO7VOv3FA/v-deo.htmlsi=b_6LSljm0SjLYXvq

    • @user-rk3wy7bz8h
      @user-rk3wy7bz8h 1 month ago

      I appreciate your very fast answer, thank you a lot, I will take your advice. ❤

  • @subashchandra9557
    @subashchandra9557 29 days ago

    The issue with this is that when you try to inpaint large pictures, it cannot inpaint accurately at all. Were you able to figure out how to downscale just the masked region so that its max width is 768 or 1024, so that it can inpaint effectively?

    • @controlaltai
      @controlaltai 23 days ago

      Downscaling only the mask area is possible. The workflow is different, however, and depending on the image it may or may not work. The inpainting works because the AI needs the surrounding pixel data, so depending on the mask you have to select enough surrounding pixel data to get correct results.

  • @martintmv
    @martintmv 2 months ago

    I checked the link for the Yolo World Efficient Sam S CPU/GPU Jit in the description and the model there is marked as unsafe by HuggingFace... where can I download it from?

    • @controlaltai
      @controlaltai 2 months ago

      Please recheck - the .jit files are safe. The other file is marked unsafe... yolow-v8_l_clipv2_frozen_t2iv2_bn_o365_goldg_pretrain.pth
      You can download it from another source here:
      huggingface.co/spaces/yunyangx/EfficientSAM/tree/main

    • @martintmv
      @martintmv 2 months ago

      @@controlaltai thanks

  • @alvarocardenas8888
    @alvarocardenas8888 2 months ago

    Is there a way to make the mask for the mammoth automatically? Like putting a mask where the woman was before, with x padding?

    • @controlaltai
      @controlaltai 2 months ago

      Yes, you can. Try the Create Rectangular Mask node from the Masquerade nodes. Use some math nodes to feed the size directly from the image source into the width and height inputs, and just define the x and y coordinates and the mask size.

  • @runebinder
    @runebinder 3 months ago

    I'm trying out the Yolo World Mask workflow, but I'm getting this error when I get to the first Mask to Image node: "Error occurred when executing MaskToImage:
    cannot reshape tensor of 0 elements into shape [-1, 1, 1, 0] because the unspecified dimension size -1 can be any value and is ambiguous". I haven't changed any of the settings, and I'm using a decent image with not too much in it (res 1792 x 2304) and the prompt "shirt", which is showing in WD14. Not sure what settings I need to change. I have tried altering the confidence but that hasn't helped, and tried both the Yolo L & M models. Any ideas?

    • @controlaltai
      @controlaltai 3 months ago +1

      That error is when it cannot detect any segment. Try with confidence 0.01 and iou 0.50. If it still cannot detect anything, you need to check your inference version. When you launch Comfy, do you get a message in the command prompt that your inference is on a lower version (the latest version is 0.9.16)? If you get that warning, then all dependencies are good. If you don't get that warning, it means you are on the latest inference, on which this node does not work.
      The WD14 is not what Yolo sees; that's just there to help you, the two are unrelated. I put it there because I was testing some images at low resolution where I could not see the objects but the AI could.
      Let me know if 0.01 / iou 0.50 works or not.

    • @runebinder
      @runebinder 3 months ago

      Thanks. I’ll check in a bit and let you know how I get on.

    • @runebinder
      @runebinder 3 months ago

      @@controlaltai copied everything out of the Command Prompt window and into a Word doc so I could use Ctrl+F to search for Inference and I get the warning that I'm on 0.9.13 and it asks me to update, so looks good on that front. Tried the same image but used Face as the prompt this time as it's a portrait shot and figured that would be easy for it to find and it worked, thanks for your help :)

  • @stepahinigor
    @stepahinigor 2 months ago

    It's stuck on Blur Masked Area. I see issues on GitHub but can't find a clear solution, something about the PyTorch version :(

    • @controlaltai
      @controlaltai 2 months ago +1

      Yeah, I bypassed it by having the code run via GPU... had to make modifications to the node.

  • @nkofr
    @nkofr 2 months ago

    Hi! I'm trying to understand what the point of preprocessing with LaMa is if the samplers then use a denoise of 1.0?

    • @controlaltai
      @controlaltai 2 months ago +1

      Hi, the samplers' checkpoints are not trained to remove the object that well. LaMa is very well trained; however, it's not perfect. The point here is to use LaMa to accurately remove the subject/object and then use Fooocus inpainting to guide and fix the image to perfection.

    • @nkofr
      @nkofr 2 months ago

      @@controlaltai Yes, but my understanding was that setting denoise to 1.0 is like starting from scratch (not using anything from the denoised area), so if the denoise is set to 1, my understanding is that what LaMa has done is completely ignored. No??

    • @controlaltai
      @controlaltai 2 months ago +1

      @nkofr Not really. We are using the Fooocus inpaint models with the inpaint conditioning method, not the VAE Encode method. This method is basically for fine-tuning, whereas VAE Encode is for subject replacement. A denoise of 1 here is not the same as a denoise of 1 in general sampling; the value comparison is apples to oranges. The denoise value is also not a hard rule and depends on the distortion caused by the LaMa model. So no, a denoise of 1 will not undo LaMa's work; you can actually see in the workflow that it uses the base left by LaMa and reconstructs it. The thing is, MAT and LaMa work on complicated images and the reconstruction they do is beautiful, but for such complexity we just need to fine-tune it. Hence we use the fine-tune method.

    • @nkofr
      @nkofr 2 months ago

      @@controlaltai Ok, thanks, that makes sense!
      (What you call "fine tune" is the pass with Fooocus inpaint.)
      Have you heard about LaMa with Refiner? Any idea on how to activate the Refiner for LaMa in ComfyUI?
      Where do you get all that knowledge from? :)

    • @controlaltai
      @controlaltai 1 month ago +1

      No idea on how to activate refiner for lama in comfyui at the moment.

  • @mikrodizels
    @mikrodizels 2 months ago

    Does this only work with SDXL models? I have only tried outpainting for now. I want to outpaint my epicrealism_naturalSinRC1VAE created images; everything seems to work in the previews, but in the final image, after going through the sampler, the outpainted area is just noise. I included the same LoRA and custom VAE I previously used to generate my images in this workflow as well.

    • @controlaltai
      @controlaltai 2 months ago

      The Fooocus patch only works with SDXL checkpoints.

    • @mikrodizels
      @mikrodizels 2 months ago

      @@controlaltai Oh ok, got it. Is there an outpaint workflow that would work like this for SD 1.5?

    • @controlaltai
      @controlaltai 2 months ago

      In Comfy, all you have to do is remove the Fooocus patch. However, you have seen the difference when applying Fooocus. I suggest you switch to any SDXL checkpoint; even Turbo/Lightning will give good results.

    • @mikrodizels
      @mikrodizels 2 months ago

      @@controlaltai Got it to work without Fooocus for 1.5, seamless outpaint, but the loss of quality with each queue (the image becomes more grainy and red) is unfortunately inescapable no matter what. You are correct, time to try the Lightning Juggernaut, cheers

  • @haljordan1575
    @haljordan1575 2 months ago

    What if you wanted to replace an object with an existing one, or inpaint it?

    • @controlaltai
      @controlaltai 2 months ago

      The tutorial covers that extensively. Please check the video.

  • @ankethajare9176
    @ankethajare9176 1 month ago

    Hey, what if my image is more than 4096 pixels for outpainting?

    • @controlaltai
      @controlaltai 1 month ago

      Hey, you may run into out-of-memory issues on consumer-grade hardware; SDXL cannot handle that resolution. You can outpaint in smaller steps and do more runs rather than going beyond a 1024 outpaint resolution.

  • @user-rk3wy7bz8h
    @user-rk3wy7bz8h 2 months ago

    I need help please: I don't see the models in the Load Fooocus Inpaint node. I downloaded all 4 and placed them in models > inpaint models.

    • @controlaltai
      @controlaltai 2 months ago +1

      The location is ComfyUI_windows_portable\ComfyUI\models\inpaint
      and not ComfyUI_windows_portable\ComfyUI\models\inpaint\models
      After putting the models there, close everything including the browser and restart.
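      For reference, a sketch of the expected layout with the four downloaded files (filenames as listed on the lllyasviel Fooocus inpaint download page - verify against what you actually downloaded):
      ComfyUI_windows_portable\ComfyUI\models\inpaint\
          fooocus_inpaint_head.pth
          inpaint.fooocus.patch
          inpaint_v25.fooocus.patch
          inpaint_v26.fooocus.patch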

    • @user-rk3wy7bz8h
      @user-rk3wy7bz8h 2 months ago

      @@controlaltai Thank you.
      The problem was solved after I renamed the folder to (inpaint) instead of inpaint models.
      I appreciate your fast answer ;) Keep it up, I like you

  • @saberkz
    @saberkz 2 months ago

    How can I contact you for some workflow help?

  • @neoz8413
    @neoz8413 2 months ago

    Import failed and the log file says it can't find Supervision, how do I fix this please?

    • @controlaltai
      @controlaltai 2 months ago

      Go to the ComfyUI python_embeded folder, open a terminal and try:
      python -m pip install supervision
      If that does not work, then try this: python -m pip install inference==0.9.13

  • @yklandares
    @yklandares 2 months ago

    As you wrote, I downloaded the models, but in the node where we select (yolo_world/l) - in general, are they supposed to load themselves? But no, I have this error.
    I got an error when loading Yoloworld_ModelLoader_Zho:
    Can't get attribute 'WorldModel'

    • @controlaltai
      @controlaltai 2 months ago +1

      Yes, the dev designed it so that the "yolo_world/l" models load automatically.
      However, you have to download the .jit files into the custom node folder's root directory. Otherwise you get an error for the models and they do not load automatically.

    • @yklandares
      @yklandares 2 months ago

      Error occurred when executing Yoloworld_ModelLoader_Zho:
      Can't get attribute 'WorldModel' on
      File "G:\NEUROset\ComfyUIPort\ComfyUI\execution.py", line 151, in recursive_execute
      output_data, output_ui = get_output_data(obj, input_data_all)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "G:\NEUROset\ComfyUIPort\ComfyUI\execution.py", line 81, in get_output_data
      return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "G:\NEUROset\ComfyUIPort\ComfyUI\execution.py", line 74, in map_node_over_list
      results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "G:\NEUROset\ComfyUIPort\ComfyUI\custom_nodes\ComfyUI-YoloWorld-EfficientSAM\YOLO_WORLD_EfficientSAM.py", line 70, in load_yolo_world_model
      YOLO_WORLD_MODEL = YOLOWorld(model_id=yolo_world_model)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "G:\NEUROset\ComfyUIPort\python_embeded\Lib\site-packages\inference\models\yolo_world\yolo_world.py", line 36, in __init__
      self.model = YOLO(self.cache_file("yolo-world.pt"))
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "G:\NEUROset\ComfyUIPort\python_embeded\Lib\site-packages\ultralytics\engine\model.py", line 95, in __init__
      self._load(model, task)
      File "G:\NEUROset\ComfyUIPort\python_embeded\Lib\site-packages\ultralytics\engine\model.py", line 161, in _load
      self.model, self.ckpt = attempt_load_one_weight(weights)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "G:\NEUROset\ComfyUIPort\python_embeded\Lib\site-packages\ultralytics
      n\tasks.py", line 700, in attempt_load_one_weight
      ckpt, weight = torch_safe_load(weight) # load ckpt
      ^^^^^^^^^^^^^^^^^^^^^^^
      File "G:\NEUROset\ComfyUIPort\python_embeded\Lib\site-packages\ultralytics
      n\tasks.py", line 634, in torch_safe_load
      return torch.load(file, map_location="cpu"), file # load
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "G:\NEUROset\ComfyUIPort\python_embeded\Lib\site-packages\torch\serialization.py", line 1026, in load
      return _load(opened_zipfile,
      ^^^^^^^^^^^^^^^^^^^^^
      File "G:\NEUROset\ComfyUIPort\python_embeded\Lib\site-packages\torch\serialization.py", line 1438, in _load
      result = unpickler.load()
      ^^^^^^^^^^^^^^^^
      File "G:\NEUROset\ComfyUIPort\python_embeded\Lib\site-packages\torch\serialization.py", line 1431, in find_class
      return super().find_class(mod_name, name)
      @@controlaltai

    • @controlaltai
      @controlaltai 2 months ago +1

      Make sure something is masked. Also ensure that when multiple objects are masked, only one is passed through to the next node. That can be done via mask combined or mask extracted (selection).

    • @yklandares
      @yklandares 2 months ago

      I didn't sleep for two days and agonized over the process, and eventually placed the two .jit models not just in a folder but in one named yolo_world @@controlaltai

  • @baseerfarooqui5897
    @baseerfarooqui5897 2 months ago

    Hi, very informative video. I am getting this error while running: "AttributeError: type object 'Detections' has no attribute 'from_inference'"

    • @controlaltai
      @controlaltai 2 months ago

      Thank you! Is it detecting anything? Try a lower threshold.

    • @baseerfarooqui5897
      @baseerfarooqui5897 2 months ago

      @@controlaltai Already tried, but nothing happened.

    • @controlaltai
      @controlaltai 2 months ago

      Check the inference version.

    • @baseerfarooqui5897
      @baseerfarooqui5897 2 months ago

      Can you please elaborate? Thanks @@controlaltai

  • @user-ey3cm7lf1y
    @user-ey3cm7lf1y 9 days ago

    I tried to install inference==0.9.13
    but I got an error.
    Should I downgrade my Python version to 3.11?

    • @controlaltai
      @controlaltai 8 days ago +1

      I suggest you back up your environment, then downgrade. It won't work unless you're on 3.11.
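      A quick way to confirm which Python the portable build is running before downgrading (same python_embeded terminal as in the description; this just prints the interpreter version):
      python --version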

    • @user-ey3cm7lf1y
      @user-ey3cm7lf1y 8 days ago

      @@controlaltai Thank you,
      I solved the problem on 3.11.

  • @iangregory9569
    @iangregory9569 2 months ago

    Do you have any basic masking, compositing videos?

    • @controlaltai
      @controlaltai 2 months ago +1

      Not yet. However, a basic series of 10 to 15 episodes, maybe more, will be on the channel, covering things slowly part by part and explaining every aspect of Comfy and Stable Diffusion. We just don't have an ETA on the first episode...

    • @iangregory9569
      @iangregory9569 2 months ago

      @@controlaltai Sorry, I guess what I mean is: what "mask node" would I use to layer two images together, like in Photoshop, Fusion, or AE - so a 3D-rendered apple with a separate alpha channel comped onto a background of a table? There are so many mask nodes that I don't know which is the most straightforward to use for such a simple job, thanks.

    • @controlaltai
      @controlaltai 2 months ago +1

      It works differently here. Say you have an apple and a bg: you use a mask to select the apple, cut the apple out and then paste it onto another bg. This can be done via the Masquerade nodes' Cut By Mask and Paste By Mask functions. To select the apple, manual masking is messy; you can use Grounding DINO, CLIPSeg or Yolo World - all three would suffice. In between you can add a Grow Mask node, Feather Mask, etc. to refine the mask and selection.

    • @iangregory9569
      @iangregory9569 2 months ago

      Thank you! @@controlaltai

  • @yklandares
    @yklandares 2 months ago

    Don't ask me what kind of "yolo world" it is.
    Is it obvious where you need to go? As you wrote, I downloaded them, but for yolo_world/l do I need to put them somewhere? In general, they are supposed to download themselves, but no. It says this:
    I got an error when loading Yoloworld_ModelLoader_Zho:
    Can't get attribute 'WorldModel' in
    File "G:\NEUROset\ComfyUIPort\ComfyUI\execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)

    • @controlaltai
      @controlaltai 2 months ago +1

      I cannot understand your question. Rephrase.

    • @yklandares
      @yklandares 2 months ago

      As I wrote, I downloaded the models, but in the node where we select (yolo_world/l) - in general, are they supposed to load themselves? But no, I have this error.
      I got an error when loading Yoloworld_ModelLoader_Zho:
      Can't get attribute 'WorldModel' @@controlaltai

    • @controlaltai
      @controlaltai 2 months ago

      Yes, the models load themselves.

  • @kikoking5009
    @kikoking5009 2 months ago

    Great video, very helpful.
    I just have an issue removing an object.
    I have a picture of 4 men, and I wanted to remove 1.
    In the step at the end, after the KSampler, I have the issue that the face details of the other people change a bit when I look at it in the Image Comparer (rgthree).
    Can I remove 1 person without changing other details?

    • @controlaltai
      @controlaltai 2 months ago +1

      Thanks! This is a bit complicated, so I have to try this. Are you seeing this issue after the first or the second KSampler? Also, the approach would depend on how the interaction is in the image. If you can send a sample image, I can try and let you know if it is successful.

    • @kikoking5009
      @kikoking5009 2 months ago

      @@controlaltai I see the issue in both KSamplers. I don't know how to send a sample image, and here on YouTube I can only write text.

    • @controlaltai
      @controlaltai 2 months ago +1

      @@kikoking5009 send an email to mail @ controlaltai . com (without spaces)

    • @controlaltai
      @controlaltai 2 months ago +1

      I cannot reply to you from whatever email you sent from: "Remote server returned '550 5.4.300 Message expired -> 451 Requested action aborted; Reject due to policy restrictions". I need the photo sample of the 4 people along with your workflow. Send an email from an account where I can reply back to you.

    • @kikoking5009
      @kikoking5009 2 months ago

      @@controlaltai I tried and sent it with another email. If it didn't work, I really don't know. By the way, I am thankful that you answer my questions and try to help.
      Best of luck

  • @mariusvandenberg4250
    @mariusvandenberg4250 2 months ago

    Fantastic node, thank you. I am getting this error:
    Error occurred when executing Yoloworld_ESAM_Zho:
    'WorldModel' object has no attribute 'clip_model'

    • @eric-rorich
      @eric-rorich 2 months ago

      Me too... there is already a ticket open, it should be fixed soon.

    • @controlaltai
      @controlaltai 2 months ago +1

      Are you on inference 0.9.13 or the latest 0.9.17?

    • @eric-rorich
      @eric-rorich 2 months ago

      @@controlaltai inference package version 0.9.13

    • @controlaltai
      @controlaltai 2 months ago

      Are the jit models downloaded? When does this error happen - always or occasionally?

    • @mariusvandenberg4250
      @mariusvandenberg4250 2 months ago

      @@controlaltai Yes, I am. I reran python -m pip uninstall inference and then python -m pip install inference==0.9.13

  • @Spinaster
    @Spinaster 2 months ago

    Thank you for your precious tutorial.
    I followed every step but I still get the following error:
    "Error occurred when executing Yoloworld_ESAM_Zho:
    type object 'Detections' has no attribute 'from_inference'
    File "D:\AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 152, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "D:\AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 82, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "D:\AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 75, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "D:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-YoloWorld-EfficientSAM\YOLO_WORLD_EfficientSAM.py", line 141, in yoloworld_esam_image
    detections = sv.Detections.from_inference(results)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^"
    Any suggestions?
    🙏

    • @controlaltai
      @controlaltai 2 months ago

      Okay, multiple things can be wrong:
      1. Check inference is 0.9.13.
      2. Check if the jit models are downloaded correctly.
      3. The object may not be detected; for that, select some other keyword or reduce the threshold.
      4. Multiple objects are selected and are getting passed on to the mask node. Only a single mask can be passed; for this, use mask combined or select a value from mask extracted.

  • @edba7410
    @edba7410 1 month ago

    May I get the JSON files for this lesson?

    • @controlaltai
      @controlaltai 1 month ago

      Everything is explained in the video. There are 6 to 7 workflows; you can build them yourself.

    • @edba7410
      @edba7410 1 month ago

      @@controlaltai I tried, but I don't get the same results as you. Maybe I can't catch some points and I'm connecting the nodes incorrectly.

  • @user-pg9wy3qn4c
    @user-pg9wy3qn4c 8 days ago

    Does this method work with videos?

    • @controlaltai
      @controlaltai 7 days ago

      It does indeed in my testing, but the workflow is very different. I took a plane take-off video, removed the plane completely and reconstructed the video. I did not include it in the tutorial as it was becoming too long.

  • @andrejlopuchov7972
    @andrejlopuchov7972 2 months ago

    Can that work with AnimateDiff?

    • @controlaltai
      @controlaltai 2 months ago

      Yup. I had planned to showcase it, however I could not fit it in as the video went too long and I had to cut so many concepts, so I thought it would be a separate video altogether. Yolo World works for real-time video detection.
      Basically, I was able to take a video of a plane lifting off, get the plane mask, make it disappear and have the whole video without the plane, with only the camera and other elements moving. I still have to iron it out.
      Other ideas include using these workflow techniques for things like color grading a video in Comfy.
      So you have a short video of a person dancing: use a similar technique to isolate and change the colors of, say, the clothing and re-stitch everything.
      Everything shown in the video can be applied to AnimateDiff; just the workflows would be slightly different.

  • @Pauluz_The_Web_Gnome
    @Pauluz_The_Web_Gnome 2 months ago

    Luckily it's not this cumbersome! Nice

  • @godpunisher
    @godpunisher 2 months ago

    This....this rivals Adobe.

  • @manolomaru
    @manolomaru 1 month ago

    ✨👌😎😯😯😯😎👍✨

    • @controlaltai
      @controlaltai 1 month ago

      Thank you!!

    • @manolomaru
      @manolomaru 1 month ago

      @@controlaltai Hello Malihe 👋🙂
      ...Yep, I installed Pinokio to avoid dealing with the other way of installation. But unfortunately I'll have to do it that way.
      Thank you so much for your time, and ultrasuperfast response 👍

  • @petpo-ev1yd
    @petpo-ev1yd 2 months ago

    Where should I put the yolo models?

    • @controlaltai
      @controlaltai 2 months ago

      Which yolo models are you talking about...? Check here: 4:34

    • @petpo-ev1yd
      @petpo-ev1yd 2 months ago

      @@controlaltai I mean yolo_world/l or yolo_world/m

    • @petpo-ev1yd
      @petpo-ev1yd 2 months ago

      @@controlaltai Error occurred when executing Yoloworld_ModelLoader_Zho:
      Could not connect to Roboflow API. Here is the error.

    • @controlaltai
      @controlaltai 2 months ago

      Did you download the .jit files?

    • @petpo-ev1yd
      @petpo-ev1yd 2 months ago

      @@controlaltai Yes, I did everything you said.

  • @Mehdi0montahw
    @Mehdi0montahw 2 months ago +1

    top

  • @arthitchyoutube
    @arthitchyoutube 2 months ago

    The workflow is not too hard to learn. It takes up too much of my machine's resources if I put every process I want into one workflow; I need to separate them to finish the image. 😢

    • @controlaltai
      @controlaltai 2 months ago

      Remove the tagger node. The tagger node needs to run on CPU rather than GPU - there is a trick to do that; on GPU it takes minutes, on CPU seconds. For the rest, it is just what it is. Splitting it is a good idea.

  • @sukhpalsukh3511
    @sukhpalsukh3511 3 months ago

    Please share workflow 🥺

    • @controlaltai
      @controlaltai 3 months ago +3

      It's already shared with members. Also, nothing is hidden in the video; you can create it from scratch if you do not wish to be a member.

    • @sukhpalsukh3511
      @sukhpalsukh3511 3 months ago +1

      @@controlaltai Thank you, a really advanced but simple tutorial, appreciate your work.

  • @silverspark8175
    @silverspark8175 2 months ago +2

    Avoid using Yolo World - it has outdated dependencies and you will most probably have issues with other nodes.
    Also, Segm_Detector from Impact-Pack detects objects much more accurately.

    • @35wangfeng
      @35wangfeng 2 months ago

      agree with you

    • @controlaltai
      @controlaltai 2 months ago +2

      I know, the dev doesn't respond. I am trying to find a way to update the dependencies myself and will post if I am successful. The techniques I show in the video are not possible via DINO or CLIPSeg. As of now, the best solution is to just have another Comfy portable installed with Yolo and inpainting inside. I am finding this is now becoming very common; for example, Comfy 3D is a mess and requires completely different dependencies, same with plenty of other stuff. With Miniconda I can manage different environments instead of separate installs, but I should get around to making a tutorial for that - this way we can still use new stuff without compromising the main go-to workflow.

  • @yklandares
    @yklandares 2 months ago +1

    Please reply to the subscriber :)

  • @stephaneramauge4550
    @stephaneramauge4550 16 days ago

    Thanks for the video,
    but slow down, at least to half speed!!!
    It's really painful to pause and rewind to see where you're clicking.
    It's a tutorial, not a Formula One race!

    • @controlaltai
      @controlaltai 9 days ago

      Thank you for the feedback. Will take it into consideration.

  • @viniciuslacerda4577
    @viniciuslacerda4577 2 months ago

    Error: efficient_sam_s_gpu.jit does not exist

    • @controlaltai
      @controlaltai 2 months ago

      Check the requirements section of the video. You need to download the two .jit files into the custom node's yolo folder.

  • @l_Majed_l
    @l_Majed_l 2 months ago

    Keep up the good work 👍 but can you tell me why it does not mask anything in my workflow, please: {
    "last_node_id": 8,
    "last_link_id": 7,
    "nodes": [
    {
    "id": 3,
    "type": "Yoloworld_ModelLoader_Zho",
    "pos": [
    -321,
    49
    ],
    "size": {
    "0": 315,
    "1": 58
    },
    "flags": {},
    "order": 0,
    "mode": 0,
    "outputs": [
    {
    "name": "yolo_world_model",
    "type": "YOLOWORLDMODEL",
    "links": [
    2
    ],
    "shape": 3,
    "slot_index": 0
    }
    ],
    "properties": {
    "Node name for S&R": "Yoloworld_ModelLoader_Zho"
    },
    "widgets_values": [
    "yolo_world/l"
    ]
    },
    {
    "id": 1,
    "type": "LoadImage",
    "pos": [
    -662,
    142
    ],
    "size": {
    "0": 315,
    "1": 314
    },
    "flags": {},
    "order": 1,
    "mode": 0,
    "outputs": [
    {
    "name": "IMAGE",
    "type": "IMAGE",
    "links": [
    3,
    7
    ],
    "shape": 3,
    "slot_index": 0
    },
    {
    "name": "MASK",
    "type": "MASK",
    "links": null,
    "shape": 3
    }
    ],
    "properties": {
    "Node name for S&R": "LoadImage"
    },
    "widgets_values": [
    "srg_sdxl_preview_temp_rtriy_00012_ (1).png",
    "image"
    ]
    },
    {
    "id": 4,
    "type": "ESAM_ModelLoader_Zho",
    "pos": [
    -298,
    255
    ],
    "size": {
    "0": 315,
    "1": 58
    },
    "flags": {},
    "order": 2,
    "mode": 0,
    "outputs": [
    {
    "name": "esam_model",
    "type": "ESAMMODEL",
    "links": [
    1
    ],
    "shape": 3
    }
    ],
    "properties": {
    "Node name for S&R": "ESAM_ModelLoader_Zho"
    },
    "widgets_values": [
    "CUDA"
    ]
    },
    {
    "id": 5,
    "type": "PreviewImage",
    "pos": [
    653,
    50
    ],
    "size": [
    210,
    246
    ],
    "flags": {},
    "order": 5,
    "mode": 0,
    "inputs": [
    {
    "name": "images",
    "type": "IMAGE",
    "link": 4
    }
    ],
    "properties": {
    "Node name for S&R": "PreviewImage"
    }
    },
    {
    "id": 7,
    "type": "PreviewImage",
    "pos": [
    1115,
    197
    ],
    "size": [
    210,
    246
    ],
    "flags": {},
    "order": 7,
    "mode": 0,
    "inputs": [
    {
    "name": "images",
    "type": "IMAGE",
    "link": 6
    }
    ],
    "properties": {
    "Node name for S&R": "PreviewImage"
    }
    },
    {
    "id": 6,
    "type": "MaskToImage",
    "pos": [
    703,
    371
    ],
    "size": {
    "0": 210,
    "1": 26
    },
    "flags": {},
    "order": 6,
    "mode": 0,
    "inputs": [
    {
    "name": "mask",
    "type": "MASK",
    "link": 5
    }
    ],
    "outputs": [
    {
    "name": "IMAGE",
    "type": "IMAGE",
    "links": [
    6
    ],
    "shape": 3,
    "slot_index": 0
    }
    ],
    "properties": {
    "Node name for S&R": "MaskToImage"
    }
    },
    {
    "id": 2,
    "type": "Yoloworld_ESAM_Zho",
    "pos": [
    61,
    85
    ],
    "size": {
    "0": 400,
    "1": 380
    },
    "flags": {},
    "order": 4,
    "mode": 0,
    "inputs": [
    {
    "name": "yolo_world_model",
    "type": "YOLOWORLDMODEL",
    "link": 2
    },
    {
    "name": "esam_model",
    "type": "ESAMMODEL",
    "link": 1,
    "slot_index": 1
    },
    {
    "name": "image",
    "type": "IMAGE",
    "link": 3
    }
    ],
    "outputs": [
    {
    "name": "IMAGE",
    "type": "IMAGE",
    "links": [
    4
    ],
    "shape": 3,
    "slot_index": 0
    },
    {
    "name": "MASK",
    "type": "MASK",
    "links": [
    5
    ],
    "shape": 3,
    "slot_index": 1
    }
    ],
    "properties": {
    "Node name for S&R": "Yoloworld_ESAM_Zho"
    },
    "widgets_values": [
    "fire",
    0.1,
    0.1,
    2,
    2,
    1,
    true,
    false,
    true,
    true,
    true,
    0
    ]
    },
    {
    "id": 8,
    "type": "WD14Tagger|pysssss",
    "pos": [
    -275,
    445
    ],
    "size": {
    "0": 315,
    "1": 220
    },
    "flags": {},
    "order": 3,
    "mode": 0,
    "inputs": [
    {
    "name": "image",
    "type": "IMAGE",
    "link": 7
    }
    ],
    "outputs": [
    {
    "name": "STRING",
    "type": "STRING",
    "links": null,
    "shape": 6
    }
    ],
    "properties": {
    "Node name for S&R": "WD14Tagger|pysssss"
    },
    "widgets_values": [
    "wd-v1-4-convnext-tagger",
    0.35,
    0.85,
    false,
    false,
    "",
    "solo, food, indoors, no_humans, window, fire, plant, potted_plant, food_focus, pizza, tomato, rug, stove, fireplace"
    ]
    }
    ],
    "links": [
    [
    1,
    4,
    0,
    2,
    1,
    "ESAMMODEL"
    ],
    [
    2,
    3,
    0,
    2,
    0,
    "YOLOWORLDMODEL"
    ],
    [
    3,
    1,
    0,
    2,
    2,
    "IMAGE"
    ],
    [
    4,
    2,
    0,
    5,
    0,
    "IMAGE"
    ],
    [
    5,
    2,
    1,
    6,
    0,
    "MASK"
    ],
    [
    6,
    6,
    0,
    7,
    0,
    "IMAGE"
    ],
    [
    7,
    1,
    0,
    8,
    0,
    "IMAGE"
    ]
    ],
    "groups": [],
    "config": {},
    "extra": {},
    "version": 0.4
    }

    • @controlaltai
      @controlaltai 2 months ago +1

      Thanks! Please link the JSON file on Google Drive or something... will check it out for you.

  • @barrenwardo
    @barrenwardo 3 months ago

    Awesome