SDXL ComfyUI Stability Workflow - What I use internally at Stability for my AI Art

  • Published 29 Sep 2024
  • Since we released Stable Diffusion SDXL to the world, I might as well show you how to get the most from the models, as this is the same workflow I use on a daily basis at stability.ai. In this video I show you the basics of using our models to generate your best AI artwork. You will need some of the custom nodes from Civitai, but you can choose the package that works best for you, as they are all pretty similar.
    We will start with a basic workflow and then complicate it with a refinement pass, and then we will add another special twist I am sure you will enjoy (a minimal sketch of the finished graph follows the links below). #stablediffusion #sdxl #comfyui
    Grab some of the custom nodes from civit.ai: civitai.com/ta...
    Grab the SDXL model from here (OFFICIAL): (bonus LoRA also here)
    huggingface.co...
    The refiner is also available here (OFFICIAL):
    huggingface.co...
    Additional VAE (only needed if you don't plan to use the built-in version)
    huggingface.co...
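
    For readers who would rather script the graph than click it together, here is a minimal sketch of the plain base pass in ComfyUI's HTTP "prompt" API format, written in Python. The node class names and input names are stock ComfyUI; the checkpoint filename, prompt text, seed, and server address are assumptions to swap for your own.

        import json, urllib.request

        # Each key is a node id; inputs reference other nodes as [node_id, output_index].
        graph = {
            "1": {"class_type": "CheckpointLoaderSimple",      # outputs: MODEL=0, CLIP=1, VAE=2
                  "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},
            "2": {"class_type": "CLIPTextEncode",              # positive prompt
                  "inputs": {"clip": ["1", 1], "text": "a robot in a garden"}},
            "3": {"class_type": "CLIPTextEncode",              # negative prompt (keep it short!)
                  "inputs": {"clip": ["1", 1], "text": "text, watermark"}},
            "4": {"class_type": "EmptyLatentImage",
                  "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
            "5": {"class_type": "KSampler",
                  "inputs": {"model": ["1", 0], "positive": ["2", 0], "negative": ["3", 0],
                             "latent_image": ["4", 0], "seed": 4, "steps": 20, "cfg": 8.0,
                             "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0}},
            "6": {"class_type": "VAEDecode",
                  "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
            "7": {"class_type": "SaveImage",
                  "inputs": {"images": ["6", 0], "filename_prefix": "sdxl_base"}},
        }

        # Queue the job on a locally running ComfyUI instance.
        req = urllib.request.Request("http://127.0.0.1:8188/prompt",
                                     data=json.dumps({"prompt": graph}).encode("utf-8"),
                                     headers={"Content-Type": "application/json"})
        urllib.request.urlopen(req)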

COMMENTS • 308

  • @TedWillingham 1 year ago +38

    I would love it if you could go over some of those settings in advanced detail - like "oh, I fiddle with more conditioning steps when I want X", etc. There are so many superstitious people out there giving bunk advice that your level-headed breakdown would be super valuable!

    • @sedetweiler 1 year ago +17

      Great idea! I will have to ponder where to start! :-)

  • @Pfaeff 1 year ago +26

    Why are there width and height values in the CLIPTextEncodeSDXL node, what is the difference between width and target_width, and why is one of them 4096?

    • @courtneyb6154 1 year ago +5

      Great questions, and hopefully Scott can take the time to explain. Building out the workflow is a great first step, but not knowing what everything does so that you can fine-tune it is lame.

  • @Aaabii 1 year ago

    Thank you very much. I prefer ComfyUI over A1111 and you are my go-to channel for my purposes.

  • @courtneyb6154 1 year ago

    Excellent video Scott. If you could do some of us a favor and go into detail about what everything is and how it works within the CLIPTextEncode nodes, that would be of tremendous value. I have scoured the net and am only able to find limited info about the options, and nothing I have found has explained how or why they work. Building out the workflow is a great first step, but not knowing how to fine-tune is lame 😂 Thanks!!!!

  • @RyanLeeLab 1 year ago

    Where do we save the workflow.json file? The custom_nodes folder? Or can we just manually create the workflow and save it there?

    • @sedetweiler 1 year ago

      I just keep them in a handy place and drag and drop in the ones I need.

  • @me.shackvfx5911 1 year ago +33

    I've grown to understand and enjoy ComfyUI more than the one I was using before, thanks to your videos. I really appreciate you and the effort you put into making these tutorials. One of these days you could show us how to train SDXL 1.0 or its LoRA with our faces. Thanks :)

    • @sedetweiler 1 year ago +8

      Great to hear! Training will be coming soon! Cheers!

  • @centurionstrengthandfitnes3694 2 months ago +1

    I found this overwhelming. I've followed along step-by-step twice, but I just don't understand why we have three KSamplers by the end, nor why we have CLIPTextEncodeSDXL and CLIPTextEncodeSDXLRefiner boxes. I'm not hearing explanations of what any of these things do for us on our way to a good final image. Maybe it's just me, though?

  • @GuitarWithMe100 1 year ago +1

    I'm still confused about what the CLIPTextEncodeSDXL does, and how does the value 4096 affect it?

    • @sedetweiler 1 year ago

      That was the initial conditioning prior to scaling, so we just prefer that for the refiner.

  • @spiralofhope 11 months ago +1

    I was able to follow the tutorial well. I'm a bit confused by the three separate seeds. I can adjust the first (the conditioner/initializer) and get changed results; do I care about the others? In a previous video you said it wouldn't matter much in that context. Is that also true here?

  • @kaareej 2 months ago +1

    Why set width and height to 4096? Why not the same as target width and height?

  • @bigbo1764 1 year ago +1

    I'm curious, how would I implement a LoRA in this setup? I tried inserting 2 LoRA nodes after the checkpoint nodes and connecting them like I would in SD 1.5, but it seems not to register the existence of my LoRA and just skips over it. My checkpoints are connected to the LoRA nodes only, except for the VAE, which is used for the decoding. What am I doing wrong and how exactly do I fix this?
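
    A minimal sketch of the usual LoRA wiring in stock ComfyUI (API format; the LoRA filename is a placeholder). A common cause of the symptom above is leaving the text encoders on the checkpoint's CLIP: LoraLoader patches both the MODEL and the CLIP, so the sampler and both prompt encoders must take their inputs from the LoraLoader, with only the VAE still coming from the checkpoint.

        # "1" is assumed to be a CheckpointLoaderSimple node (outputs: MODEL=0, CLIP=1, VAE=2)
        lora = {"class_type": "LoraLoader",
                "inputs": {"model": ["1", 0],                   # MODEL from the checkpoint
                           "clip": ["1", 1],                    # CLIP from the checkpoint
                           "lora_name": "my_lora.safetensors",  # placeholder filename
                           "strength_model": 1.0,
                           "strength_clip": 1.0}}
        # Downstream: the KSampler takes its model from the LoraLoader (output 0),
        # the CLIPTextEncode nodes take their clip from the LoraLoader (output 1),
        # and VAEDecode still takes its vae from the checkpoint (output 2).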

  • @Adreitz7 1 year ago +5

    Thanks for this look at the setup that Stability uses internally. I'm not so familiar with Comfy, but I've been using and enjoying SDXL through Invoke, which has a similar Nodes capability. I have a few questions and comments:
    1. What are the Original and Target W/H actually doing for the CLIP conditioning nodes and what is the logic to setting those values? I played around with it, testing various combinations, and the only thing I could confidently say is that setting Original W/H smaller than 1024 causes the image to become blurry. I couldn't see any specific benefit to any other value, as I tried 1024, 4096, and 40960 for Original and between 64 and 40960 for Target -- setting different values made the image different, but not obviously better or worse. I settled on just setting them the same as the output image dimensions.
    2. Why are there two prompt inputs for the base text encoder node when you provide the same input to both? Invoke calls one input the prompt and the other the style. What effects are caused by, e.g. separating your prompts into a prompt and a style and sending them independently to the two inputs, switching the inputs (so prompt goes to the "style" input and vice versa), setting them both the same, or leaving one or the other blank? I've found that if I prompt the base model for a roller coaster in the first input, I get a roller coaster. But if I prompt "roller coaster" for the first input and "photograph" for the second, I get anything BUT a roller coaster -- ruined buildings, abstract paintings, etc.
    3. Connected with #2, Invoke's refiner conditioning node only includes a "style" input, but I've found that only giving it a style prompt can cause the refiner to do weird things (like making architecture look like it's made of tent fabric).
    4. You've indicated that initializing the noise with the refiner is an interesting idea, which it is, but have you seen any consequences other than just making the images different? Does it provide any actual benefit?
    5. I've experimented with higher resolution SDXL generations. I'm on a Mac and there are some apparent generation bugs with Invoke on MPS (about 1856 square and above it becomes debilitating). But I've noticed that my scenes at higher resolution (photographic sci-fi style architecture) tend to become wide angle and taken from a high vantage point, almost as if the resolution setting is correlated with the position and zoom of the virtual camera. Has Stability done any experiments at higher resolutions than 1024x1024?
    6. Is there a benefit or danger to sending the same noise seed to both the base and refiner?

    • @bobbyboe 10 months ago +1

      Good questions... I would also like to know the answers. Did you understand why there is a field of dimensions in a node that is supposed to provide only text?

  • @Darkwing8707 1 year ago +3

    Why did you choose 4096 for the height and width in the conditioners?

    • @digitalbear3831 1 year ago +1

      I'd like to know that one too

    • @AdamDesrosiers 1 year ago +2

      I would also like to know what these conditioner numbers do. And somehow, I've been happier with outputs when I set those numbers to 2048. But why? I don't know what they are doing.

    • @sedetweiler 1 year ago +1

      The refiner was initially conditioned at that size prior to scaling, so we tend to use that size.

    • @jonnyfat 1 year ago

      @@sedetweiler Thanks for this tutorial - great reference. Great to have tutorials on this by someone who knows what they're talking about :-) I picked up on the size thing too - so it's 4096 for the base and 1024 for the refiner? Thanks!

    • @petec737 7 months ago

      @@sedetweiler "We tend to use that size" isn't really an answer. The only reason you'd have those numbers different is if you want to CROP a portion of the image... so in your case it's like wanting to crop a 4096x4096 OUT OF a 1024x1024 image, which obviously is not how the math works :)
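
    Some background for this thread: SDXL's text encoders take size and crop "micro-conditioning" alongside the prompt. Here is a hedged sketch of how the node's inputs line up (ComfyUI API format; the 4096 follows the video, and the interpretation is the SDXL report's size conditioning rather than anything stated on this page): width/height describe the nominal source-image size the output should look like it came from, crop_w/crop_h the training crop offsets, and target_width/target_height the resolution you are actually generating.

        node = {"class_type": "CLIPTextEncodeSDXL",
                "inputs": {"clip": ["1", 1],                 # CLIP output of the checkpoint loader
                           "text_g": "a robot in a garden",  # prompt for the large OpenCLIP-G encoder
                           "text_l": "a robot in a garden",  # prompt for the smaller CLIP-L encoder
                           "width": 4096, "height": 4096,    # size conditioning, per the video
                           "crop_w": 0, "crop_h": 0,         # "uncropped" conditioning
                           "target_width": 1024,             # what you are actually rendering
                           "target_height": 1024}}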

  • @technoprincess95 1 year ago +1

    Would you mind sharing this workflow through a Google Drive? ❤

  • @A.polon.i.a 6 months ago

    Great video Scott. I wonder, could you explain how to change the image size? What do I have to alter to produce an image of 832 x 1216, for example? Or point me to a future video that explains it, as I'm only on ep. 2. Thanks 💖

  • @gloorbit5471 10 months ago +1

    Given that this video is now four months old, can I assume that your checkpoint is now named differently? The one that was downloaded when I installed Searge's script yesterday has the VAE included in the filename, like so: sd_xl_base_1.0_0.9vae.safetensors, and a refiner named correspondingly.

    • @sedetweiler 10 months ago +1

      Sure, feel free to rename them. I do because they all have generic names and need to be changed to keep things sane.

  • @florentraffray1073 6 months ago +1

    Thanks for these tutorials, great to have an in-depth dive into the UI.
    I'm a little confused about the start/end steps and steps in the KSampler.
    In your second sampler in the chain, if you start at step 3 and do 12 steps, wouldn't that leave you at step 15 as the starting point for the next one?

    • @sedetweiler 6 months ago

      There are some advantages to skipping steps in some cases. It all has to do with the residual noise.

  • @archielundy3131 8 months ago +5

    A million thanks for these. As finicky and frustrating as the program is for beginners, your calm expertise is just what's needed.

  • @V_2077 6 months ago +1

    Mine has nowhere near as much detail as this. I'm using the PonyXL model; is it an issue with the model?

    • @sedetweiler 6 months ago

      Probably. Models that are "adjusted" can also have massive amnesia if they are not well done or are overly focused in one area.

  • @gameplayfirst-ger 1 year ago +2

    How is there any noise left during handover to the refiner if you don't use the "end_at_step" parameter? Don't you get images without any noise from the base sampler if you don't limit the end in any way?
    Your base preview image confirms that you don't have any noise left after the base, which doesn't match the workflow described in the SDXL documentation.
    And why do you overlap steps? For example, you do 12 steps in the base but start at step 12 in the refiner, instead of starting at step 13.

  • @CMak3r 1 year ago +8

    Prompt switching can be realized with an additional KSampler that renders the first steps with a completely different prompt. For example, you may want to create a triangular composition, or a symmetrical image, and that can be done in the early steps of a generation. Good for abstract art. I also like that in ComfyUI the seed can be fixed while the base model and refiner generate on different seeds (see the sketch after this thread).

    • @zacharykrevitt7560 11 months ago

      Good idea! Just tried this out and it worked in an interesting way. Essentially prompting an init image.
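
    A minimal sketch of this composition-then-detail idea with two chained KSamplerAdvanced nodes (ComfyUI API format; the node ids, prompts, and the 4/20 split are illustrative assumptions). The first sampler adds the noise and runs the early, layout-setting steps on the composition prompt; the second continues from the same latent on the main prompt without adding fresh noise.

        composition = {"class_type": "KSamplerAdvanced",
            "inputs": {"model": ["1", 0], "positive": ["comp_prompt", 0], "negative": ["neg", 0],
                       "latent_image": ["latent", 0],
                       "add_noise": "enable", "noise_seed": 4, "steps": 20, "cfg": 8.0,
                       "sampler_name": "euler", "scheduler": "normal",
                       "start_at_step": 0, "end_at_step": 4,        # early steps set the layout
                       "return_with_leftover_noise": "enable"}}     # hand the noisy latent onward
        detail = {"class_type": "KSamplerAdvanced",
            "inputs": {"model": ["1", 0], "positive": ["main_prompt", 0], "negative": ["neg", 0],
                       "latent_image": ["2", 0],                    # latent from the sampler above
                       "add_noise": "disable", "noise_seed": 4, "steps": 20, "cfg": 8.0,
                       "sampler_name": "euler", "scheduler": "normal",
                       "start_at_step": 4, "end_at_step": 20,
                       "return_with_leftover_noise": "disable"}}    # finish to a clean latent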

  • @angryDAnerd 1 year ago +4

    Excellent tutorial, thanks! I got SDXL up and running with the refiner. If you have the time I'd like to see you make a video explaining how Stable Diffusion works and explain exactly what the program is doing as it sends the data through the nodes in Comfy so I can have a greater conceptual understanding of what is happening. Believe me I could watch hours of technical stuff lol.

  • @DarnSylon 1 year ago +2

    When you added the third or 'pre-sampler', why did you not pass the noise information as you had done with the first of the two samplers? I messed with that setting on the first two and didn't notice much of a change. Thank you for the videos and instructions; they are extremely helpful. Also, you suggest not adding things like extra fingers to the negative prompt. What is your method of avoiding extra fingers or limbs, etc.?

  • @Maxime_motion 1 year ago +1

    13:41 "The V trick again" is not clear to me. Ctrl+V only copies the node, not the links.

    • @sedetweiler 1 year ago

      Gotta hold down shift to get the links to paste as well. Control-Shift-V

  • @rajansharma8887 2 months ago

    I have a question related to naming: why is it showing "SXDL\sd_xl_base_1.0.safetensors" instead of "sd_xl_base_1.0.safetensors"? I'm really confused about the naming convention.

  • @ethanhorizon 4 months ago

    Thanks for the tutorial! Is the "noise seed" in KSampler Advanced the same as "seed" in KSampler? You set the noise seed to 4; what's the meaning of that number? What if I leave it at zero?

  • @VandreBorba 9 months ago

    What is the CLIPTextEncodeSDXL? Is it a different kind of encoder that SDXL needs?

  • @xevenau 11 months ago +1

    Quick question: why was the last KSampler added without a preview?

    • @sedetweiler 11 months ago

      It wasn't on purpose. I just add them for their maths, not the previews.

    • @xevenau 11 months ago +1

      @@sedetweiler thank you!

  • @KlausMingo 1 month ago

    I experimented with your workflow and found that to really see what refinement was done, you should leave BOTH samplers at 20 steps; then on the refinement sampler you can start at 12. This way you can properly compare the differences. Whereas when you do only 12 steps on the first sampler, you end up with significant changes in the refined image.

  • @KlausMingo 1 month ago

    I appreciate the clear steps and explanations. Almost all SDXL models found on Civitai, for example, do not have refiners, so should I use the official refiner?

  • @rrkred3561 2 months ago

    OK, I was a ComfyUI HATER, but once I learned more and more about it, I started noticing my images improved beyond what A1111 output. The thing is, yeah, A1111 is much simpler, so getting nearly the same or better out of ComfyUI requires some work, but the skill ceiling and options in ComfyUI are so much higher, which is a good thing: it means it can create VERY good pieces of art if the workflow is done right.

  • @Luxcium 7 months ago

    I paused the video to get more information, and ChatGPT linked me back to this video without me even asking anything related (I asked something about SLI vs non-SLI when using NVIDIA and ComfyUI).

  • @merttkn 5 months ago

    When I try to install sd_xl_1.0.safetensors I get an error which says something like "wrong json extension" after waiting a long time for it to install. Can anyone help with this? Thanks.

  • @RobertWildling 9 months ago

    Hmmm... at around 14:15, when you add the first refiner with the 3 steps, shouldn't the last refiner's "start_at_step" be changed to 15?

  • @jaredbeiswenger3766 1 year ago +1

    I'm curious what's happening with your 2nd refiner when it starts at step 12 while the base model is also running to step 15. Are the 2 models alternating steps (acting simultaneously) or do they still run discretely? I'm curious if the starting step is logically useful or if it's straight voodoo magic.

    • @m4dbutt3r 1 year ago +1

      Yes I was just going to comment that the math does not seem to add up in that 3 sampler version at the end (first refiner: start 0, steps 3; base model: start 3, steps 12; 2nd refiner: start 12 [??? why not 15??] steps 20). I tried it at both 12 and 15 and actually liked the 12 better, but that may have been a coincidence and in fact it doesn't really matter. Very curious what is actually happening if you "mess up" these numbers. If I really mess with them, most of the time it comes out black and white (In one iteration I forgot to change the numbers when I copied over the second refiner to make the first, and I got beautiful, but black and white, versions of my images!!). Voodoo magic indeed.

    • @jaredbeiswenger3766 1 year ago +1

      @@m4dbutt3r appreciate this. Would be nice to know if this power can be harnessed for good
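
    To make the arithmetic in this thread concrete, here are just the step windows of the three-sampler chain as the commenters above read them off the video (a sketch; whether the middle boundary should be 12 or 15 is exactly the open question here, and Scott's replies further down say the boundaries are exclusive, so stopping at 12 on one sampler and starting at 12 on the next does not repeat a step):

        # (sampler,            start_at_step, end of its window, return_with_leftover_noise)
        chain = [
            ("refiner pre-pass", 0,  3,  "enable"),   # roughs in the initial latent
            ("base model",       3,  12, "enable"),   # keeps residual noise for the handover
            ("2nd refiner",      12, 20, "disable"),  # as wired in the video; the disputed boundary
        ]
        for name, start, end, leftover in chain:
            print(f"{name}: denoises steps [{start}, {end}), leftover noise: {leftover}")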

  • @dreaminspirer 1 year ago +1

    Thanks so much for the video.
    I have BASE Steps and TOTAL Steps primitives, so I'm trying to use a Primitive node to feed the PRERUN steps to the 1st refiner (let's call it the PRERUN KSampler), but I bumped into a problem.
    - Feeding "steps" into the PRERUN KSampler is fine, but I cannot feed this "steps" INT to "start_at_step" on the BASE KSampler. They're both INT, but perhaps ComfyUI considers "steps" and "start/end_at_step" different types. 😒
    - The other way around is feeding "end_at_step" on the PRERUN sampler, feeding that value to "start_at_step" on BASE, and feeding all the KSamplers the same "steps" value. But for some reason the PRERUN KSampler needs to be fed the exact number of steps, otherwise the result is nothing but NOISE. 😒
    Please help, thanks again.

    • @sedetweiler 1 year ago +2

      I have also noted that, and I think it is a bug. That should work just fine. I got around it by using a math node, since that was the end goal anyway.

    • @dreaminspirer 1 year ago

      @@sedetweiler That's exactly what I found. Derfuu VAR nodes and MATH nodes did the trick without any problem.
      Having said that, I found the PRERUN step count should not be more than 3 or it's all crap :)
      Thanks again, and please keep sharing the quirky tricks to play with ComfyUI.
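
      For what it's worth, in the raw API format the widget-type distinction disappears: "steps", "start_at_step", and "end_at_step" are all plain integers, so when scripting a workflow one variable can drive all of them (a sketch of the idea only, not of the Derfuu nodes):

          prerun_steps = 3    # keep this small, per the comment above
          total_steps = 20
          prerun = {"start_at_step": 0,            "end_at_step": prerun_steps, "steps": total_steps}
          base   = {"start_at_step": prerun_steps, "end_at_step": total_steps,  "steps": total_steps}
          print(prerun, base)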

  • @nonetrix3066 1 month ago

    With different aspect ratios, e.g. 896 x 1152, what should I change target_width, target_height, and width/height to inside the SDXL CLIP text encode? I am super confused about these; I can't seem to get them right and I end up with nightmare-fuel deformed images. I would like these explained; I don't see why they even exist.

  • @kick851 4 months ago

    For the CLIPTextEncodeSDXL: if my latent image is 768x1280, do I still use 4096 for width and height? And what about the target width and height?

  • @jameswilkinson150 4 months ago

    🎨🖌️I’m an artist and I’d love to use this to create variants of my work and also generate animations. Is this possible using this? Sorry if it’s a dumb question but I’m totally new to this. 🖌️🎨

  • @larryross9380 5 months ago

    Perhaps things have changed since this was published nine months ago, because this workflow just gave me dark, abstract images. But I learned a lot about how to build out a workflow! Thanks!

  • @팽도리-v6s 2 months ago

    Your videos are amazing, thank you for your great contents!!

  • @leonardliu3256 11 months ago

    The opposite of Walgreens should be CVS or Rite Aid lol, JK

  • @tronprogram8749 3 months ago

    I was getting a really odd error where ComfyUI would show "pause" in the terminal whenever it wanted to load something for the refiner pass. If this is happening to you, enable the page file and set it to a decent value. For some reason (I don't know why), it just breaks if you don't use a page file on Windows.

  • @deafponi 8 months ago

    Hi there Scott, thank you for the excellent tut. I must admit though, my robots did not look anything close to how refined yours came out. I wonder if I missed anything somewhere...

  • @arturabizgeldin9890 1 year ago

    Looks like you disabled passing noise from the initial refiner KSampler that does the initial 3 steps.

  • @lechatsportif124 9 months ago

    Couple of things: isn't it recommended that the refiner actually be started at 80% of the total steps? Also, is conditioning via the refiner really a thing, or did you just kind of mess around with it? You didn't select pass-on noise, so I'm not sure what that means.
    Thank you for the tutorials, they are great!

  • @petec737 7 months ago

    In all seriousness, it makes no sense to generate an image in 3 steps, pass that as the latent for generating another image in 12 steps, then pass that image to generate the final image in 20 steps.
    You'll get BETTER and FASTER results by just going all in with 35 steps in one go.

  • @nicologuarnieri3503 8 months ago

    I have watched and done the same thing as you over and over again, but my ComfyUI fails at the green SDXL node and gives me error "G". Can someone please help me fix it?

  • @Zizos 1 year ago

    I just downloaded the official base and refiner, but it seems I got the VAE version from somewhere else in the past.
    What's the difference? I get that the VAE is built into the model. Does this mean you get to delete the VAE Decode node or some other node?
    Can you just keep the VAE version and follow your workflow with no difference in results, or at least no loss in quality?
    As in the last step you showed, you can first generate a blank latent and then feed it into the base and refiner... Seems like you can do all sorts of tricks like that to experiment with the resulting image. I wonder if it makes sense. If I get it right, the latent creates base noise ignoring the models, so that you can get something a bit outside the box (model). Is that right?
    Thank you for the tutorial. I have lots of stuff to learn.

  • @alexlindgren1 11 months ago

    I'm aware that SD doesn't take account of spatial relationships, but I want to be able to replace, for example, a sofa in an existing image with an image of another sofa, and I'm not sure how to take on that challenge with SD. Do you have any suggestions on where to start? I don't want to manually mask each image; I want the AI to recognize what part of the image is a sofa and mask it for me. I should just provide the image of the sofa and the "base image" of the living room.

  • @Kelticfury 1 year ago

    So what is the end goal here? I mean, the nodes are cool and all, but if I want to actually produce something I can use, it is way, way faster to just use Auto1111 than to try to reinvent the wheel with some wonky strung-together workflow. I am trying here; I have spent hours just getting a half-assed approximation of what I can do already, without the additional headache. I think you have convinced me that ComfyUI just isn't for me. At least not in its current state.

  • @kseniyagerasimenko7767 9 months ago

    For some reason my image, which was prompted as black and white, became blue after adding a second refiner before sampling.

  • @scottmahony4742 8 months ago

    Models, the refiner, etc.: where can I find definitions for all these variables?

  • @beatemero6718 1 year ago

    Can someone tell me how to use the command line to disable xformers in ComfyUI? I just can't figure it out and have found nothing on the internet.

  • @kenjix7316 1 year ago

    Why exactly don't we change the return-leftover-noise setting for the first sampler (the 2nd refiner one added at the end)?

  • @TissaUnderscore 6 months ago

    Should I use a refiner with a custom model? For example, if I use Juggernaut XL?

  • @rsunghun 1 year ago +3

    I was waiting for this. These tools are very difficult for ordinary people to figure out how to use. Thank you for the video!

  • @reekster30 1 year ago +1

    Wow - great tutorial, dude. I've only recently gotten into Comfy and wondered why all the ControlNets were failing last week :D All the new ones install thanks to your videos, and I'm loving all the SDXL videos... fun times ahead (but I really need a PC gaming rig for speed) haha.
    Out of interest, what kind of PC setup would you recommend for quicker generation/processing? Massive 128 GB of RAM and something like an RTX 4090? :D
    Thanks for your videos - amazing.

  • @ianwilliams7740 1 year ago

    On that third sampler you added, you kept return-with-leftover-noise set to disable. Does that mean you use up all the noise in those 3 early steps? What's the thought behind not setting that to enable?

  • @novantha1 1 year ago +3

    Huh. I wonder what would happen if you had dedicated models for a variety of tasks (hands, eyes, hair, reflections, contrast, and so on) and fed a few steps from each of them in a daisy chain until you got to the first "true" sampler...
    Truly the possibilities are endless; thanks for the food for thought and the hard work!

    • @sedetweiler 1 year ago +2

      That's a great idea, and we do have those as loras. It's fun to combine them to help get what you want.

    • @tripleheadedmonkey6613 1 year ago +1

      That is an interesting idea. The mixture-of-experts approach is proving to be the more effective of what we have developed recently.
      Not to mention that you could also combine this with prompt-blending syntax to ensure that each part of the processing focuses entirely on one subject in the prompt while still maintaining an overall mixed composition.
      If, for simplification purposes, you set up 5 samplers, each with an equal number of steps, 4 for the limbs and 1 for the head/torso, and then set up a prompt blend that focuses 20% of the processing on each limb, etc., it may have even better results.

    • @tripleheadedmonkey6613 1 year ago

      And yeah, using LoRA chains would mean we could have a separate model output for each limb while maintaining the same initial model, allowing fewer resources to be used compared to multiple dedicated models.

    • @tripleheadedmonkey6613 1 year ago

      I think I'm going to play around with this now, actually xD Minus the dedicated limb LoRAs, of course.

  • @TomMaiaroto 1 year ago +2

    I'm new to all of this ComfyUI stuff and really love your videos. Thanks! Maybe this is obvious to folks, but one thing I recently learned was the ability to condition after one KSampler ran, so you can continue to refine your final image. It ended up being an alternative (or another tool in the toolbelt) to inpainting. I wasn't just refining, I was adding to or dramatically changing the final image - all without losing the "base" starting point, which was "locked down" in that the seed was fixed and the cfg and steps didn't change, etc. So it was a very non-destructive compositional workflow. If I wanted to add an object to the image, I could do that through a second prompt applied to a second KSampler.
    I could also introduce new LoRAs later on in those steps. I'm going to continue to experiment with this strategy and go through this more than once. So instead of a long prompt followed by a smaller corrective one, do more of a build up of prompts. Start simple and continue to add on to it so that elements within the image can be independently adjusted, removed, or re-arranged. Again, a more compositional approach during image generation to hopefully reduce the amount of work in post (or a series of very similar images that can be worked together in post processing). This could get a bit messy too, but maybe not if they are arranged left to right in a linear fashion building up the scene.

    • @sedetweiler 1 year ago

      That's great! It is a lot of fun adding into the pipeline. It's what we do internally as well when testing models and playing with new ideas. Cheers!

  • @michaelroper87 7 months ago

    What does it mean if it never leaves the load-checkpoint stage?

  • @brucehunter8235 11 months ago

    Why do I get a different robot from yours? Just curious. I thought if the seed was the same, I should get the same image.

  • @tripleheadedmonkey6613 1 year ago +1

    One question I had: is there any reason why you recommend using the VAE from the refiner, when there is only one version of the VAE (barring custom fixes for FP16) publicly available?
    If I choose to merge the fixed FP16 base VAE into the refiner, am I getting the same experience as you (besides FP16 vs FP32 differences)?

  • @Yggdrasil777 1 year ago +3

    I have been in love with ComfyUI since I found it (coming from Unreal Blueprints, a very familiar system). I am currently working out some torch issues with my system, but I generate whenever I can. It is great to see you building out the workflow and explaining the nodes that you use and why. Very informative, and THANKS for the tip with the shift-click to copy nodes AND connections. NICE!

    • @sedetweiler 1 year ago

      Great to hear! I am really happy with the nodes, but I hope they really update to things like docking, etc. Cheers!

    • @digitalbear3831 1 year ago +1

      Same here; since I come from Houdini, I just love the node spaghetti.

    • @sedetweiler 1 year ago +1

      Yusss! I also used Houdini as well as Substance Designer and I am hoping to get into nested nodes here as well. Cheers!

  • @14MTH3M00N 9 months ago +2

    Love your disgust for the negative prompt lists haha. Relatable stuff.

    • @sedetweiler 9 months ago +4

      (((((((((extra arms!))))))))) :-)

  • @bavoch8960 5 months ago +1

    You skipped the explanation of important parameters.

  • @lioncrud9096 1 year ago +1

    Any tips on adding an upscaler?

  • @goodie2shoes 6 months ago

    The refiner seems to add limbs 😞 in my images.

  • @audiogus2651 1 year ago

    Think I missed an important step around 7:15 lol

  • @Smashachu 10 months ago +1

    Hmm, I'm messing around with rendering the first 2-3 steps as something I know SDXL is trained very well on: for example, "a brown horse racing" as the positive prompt for the first 3 steps, then using a negative prompt for the brown, with the new color being purple via (purple horse:1.3). It's been working very well, especially for harder-to-generate things; it's like it erases the colors and redraws them now that there's a rough shape. I'd love to see how it works out in combination with ControlNet to maintain consistency in textures and shapes.

    • @sedetweiler 10 months ago

      That method can also help with LoRA images that are not as strong as you prefer. It's a great workflow. 🥂

  • @shallowandpedantic2320 1 year ago +1

    Thanks. If you're looking for recommendations, a video focused on comparing upscalers and incorporating upscaling into this kind of workflow might help people. Seems like a nice next step. Appreciate what you've shared so far.

  • @RikkTheGaijin 1 year ago

    LOL all this work and the image looks like shit.

  • @HellBoundSinner 1 year ago

    Love the videos... but I am having an issue and I can't figure it out. When it gets to the refiner KSampler Advanced, it throws the following error. Everything is linked correctly and I made sure all the input data is correct. I keep getting this: mat1 and mat2 shapes cannot be multiplied (77x2048 and 1280x768)

    • @sedetweiler 1 year ago

      That happens when you are mixing checkpoints, LoRAs, or other goodies that are trained at different resolutions. Basically, they are not compatible.

  • @dxnxz53 4 months ago +1

    It blew my mind that you can load an entire workflow from the image! Thanks for the great content.

  • @Vestu 1 year ago

    Hmm, my results got a LOT worse when I added that third KSampler. Like monotone and disfigured. Probably wired something wrong there 🤔 Anyone else?
    Anyway, REALLY good results until that point, and I saved the workflow. Thank you.

    • @sedetweiler 1 year ago +1

      Just check where you are starting each sampler. It should be fine, but that might be overkill.

  • @MisterKerstov 1 year ago +1

    Thanks for this really concise and helpful tutorial. Just one thought: you did not enable "return with leftover noise" for the "initial conditioning" node. Wouldn't it make sense to do so?

    • @sedetweiler 1 year ago +1

      It actually returns so much that things go sideways. Give it a try. I have not found that to work well.

  • @JoeOliveTree0026 1 year ago

    I'm using Google Colab, but mine doesn't have the "Manager" button. Any idea how I can add that?

    • @bigbo1764 1 year ago

      It's an external git repository that you have to install; I'm not sure how it would work in Colab.

  • @tomschuelke7955 9 months ago

    In the CLIPTextEncodeSDXL you dial the width and height up to 4000-something... What is that for? I didn't understand it. My first thought was: what kind of graphics card can handle 4096 by 4096 pixels in one step? But that's something else, yes... so what is it?

    • @sedetweiler 9 months ago

      The CLIP encoder was trained at 4096 square. This isn't the resolution of the image, just the SDXL CLIP.

  • @AltimaNEO 1 year ago

    So I've been using a workflow that was on the ComfyUI GitHub in their examples page. I'm struggling to figure out how many steps I should be giving the refiner.

    • @sedetweiler 1 year ago

      I would start with 32 in the base and 8 in the refiner

  • @ariahrism 1 year ago

    Question - if we say "start at step 0, end at step 3", shouldn't the next refiner logically start at step 4? Or would step 3 be correct?
    Also, you set "steps" to 12, but shouldn't steps remain "20" with "end at step" set to "12", so that it prematurely ends with remaining noise to pass on?

    • @sedetweiler 1 year ago

      It's ordinal, so it's correct to stop at 12 on one and start at 12 on the next.

  • @dkf-nl1703 1 year ago

    At 14:20, doesn't the second sampler go up to step 15? And as a result, shouldn't the third sampler start at 15? Thanks for a great video!

    • @sedetweiler 1 year ago +1

      They are exclusive, the step start is correct.

  • @lukeovermind 1 year ago +1

    Fantastic! I am looking at some advanced workflows, however with no real explanation of how they work. I want to use them, but I don't know what some of the nodes and flows do! Still, I found a lot of value in your vids, and at this stage I am happy to just play with and learn Comfy and put off creating art projects/ideas with SDXL for the time being.
    That 3rd sampler is neat! I tried to see if you can use the latent upscale method from your previous video with the SDXL base and refiner; it didn't work, but that is the beauty of Comfy! You get to try stuff.

    • @sedetweiler 1 year ago +1

      I also think it is a pretty great way to learn how all of this works together. It really is limitless!

  • @ysy69 1 year ago +1

    Hi Scott, really appreciate you giving us the most recent update on SDXL. Do you know how to fine-tune a model using SDXL 1.0 and Dreambooth? Is this something you could create a tutorial video on for us?

    • @sedetweiler 1 year ago +2

      That is coming soon. It is going to be easier to train, results-wise, but we are still getting the methodology together.

    • @ysy69 1 year ago

      @@sedetweiler 🙏🙏 Looking forward to it… do you know if the new dataset should be at a minimum of 1024 by 1024?

  • @imperfectmammal2566 1 year ago +1

    Thank you so much! Even though I couldn’t understand much, it helped me get started with comfy.

    • @sedetweiler 1 year ago

      You’re welcome 😊 Just keep working with it and it will start to click into place.

  • @TomSweeney-ov8qs 1 year ago

    Do you have any videos (or recommendations for other videos) that go in depth on debunking the negative prompt urban legends you mention?

    • @sedetweiler 1 year ago

      No, but I should make one. It's just terrible what people pass on as the perfect negative. Do they think the model was trained on "bad anatomy" and "extra fingers?"

  • @FlorinGN 1 year ago +1

    I am definitely going to search for a good upscale workflow on your channel.

  • @hleet 1 year ago +1

    WOW! That's a super ComfyUI tutorial there! Thanks. I never knew there was this new CLIP node for SDXL!
    The only drawback I find in ComfyUI is the way it manages workflows. I mean, when you want to change your original workflow, you need to save a local file, and if you want to do something else (like inpainting) you have to redo ALL of your workflow and save it to a file, then switch by loading one workflow or another depending on what you want to do. Definitely not fond of this way of managing workflows. They could have done some kind of "favorite" workflow system: five or more ready-made workflows that you could customize afterwards, save as your favorite custom workflows, and switch between whenever you like. It would skyrocket the use and adoption of ComfyUI!

    • @sedetweiler 1 year ago +1

      I just drop the json you get from using "save" into the interface and it loads. But I do agree that would be nice.

    • @hleet 1 year ago

      @@sedetweiler Ooh! Nice, another tip! Drag and drop of the json just works too! I might be able to explore more versatile stuff with ComfyUI now :)

  • @matthewharrison3813 1 year ago +1

    Thanks for the great video. Could you please talk more about the CLIP encoder width and height and the target width and height? What do they do, and is there any documentation? Why are you using a different value for the target than for the base?

    • @4richis 10 months ago

      I would love to see an answer to this as well.

  • @skylightikab443 11 months ago +1

    Thanks! For a non-native English speaker this was a good tutorial. It was very helpful! :)

  • @clonosaurios 11 months ago +1

    Thank you for your video! I learned that ComfyUI is awesome :)

  • @JohnSundayBigChin 1 year ago +1

    Hi Scott, I'm rewatching the whole series again; you have done a good job. I have a question in this particular episode about the sampler: why do you have the possibility of using denoise with the KSampler but not with the advanced KSampler? Do they work differently?

    • @sedetweiler 1 year ago +1

      It was to simplify things. When you start at a later step with the advanced sampler, you are "skipping" some of the pieces you do not want to denoise, so it is the same thing but harder to explain.

    • @JohnSundayBigChin 1 year ago +1

      When you did img2img in one of the videos, I saw that you used the common KSampler because you needed the denoiser. Now everything is much clearer to me; thank you very much for answering.

  • @lakislambrianides7619 1 year ago +1

    This is a great video, congrats. Very informative, very thorough, and you left no doubts. Can't wait for the next step!

  • @ImAlecPonce 1 year ago +1

    Thanks!!! These boxes are actually starting to make sense.

  • @filosofiahoy4105 1 year ago

    Your chain gives me a multiplication error...

  • @imperfectmammal2566 1 year ago +1

    Can you tell me how to use the LoRA offsets that came with SDXL in Comfy?

    • @sedetweiler 1 year ago +2

      Yes, I will post a video on that and it is SUPER easy to do! Cheers!

  • @sollekram 1 year ago

    I get "LoRA key not found" when I run through a LoRA model.

    • @bigbo1764 1 year ago

      Yeah, I'm also having issues with using LoRAs on top of the checkpoints.

  • @johnmorrison3465 1 month ago

    This video has some great insights on how to process the original image. I have a few FYIs to add. For those of us stuck with low-VRAM rigs that have to run SD 1.5 (for now 😢), the verbose negative prompt is essential - for SDXL it is worthless, like Scott says. For those on a Mac, this web interface uses Ctrl just like Windows. If you prefer PNG over JPG and don't want to share metadata, open the image in an editing app (Photoshop, GIMP, etc.), export as PNG, and make sure "include metadata" is not selected. Thanks for all the great content Scott - you never disappoint 👍.

  • @maddocmiller6475 1 year ago

    Do you know which sampler ClipDrop uses, and for how many steps? Especially in the SDXL 0.9 days. Would love to know.

    • @sedetweiler 1 year ago +1

      I believe it is dpmpp sde GPU.

    • @maddocmiller6475 1 year ago

      @@sedetweiler Interesting; it seems like a very good sampler, which I had never used. Thanks for the info, much appreciated! 🤓👍