Get my Workflow here: www.patreon.com/posts/is-super-flux-to-114327248
Thanks, but can you post the workflow somewhere other than Patreon? It's banned in some countries...
👋
Feels like just greed if you promote something in a free community then want us to pay. :(
Nice. The image of the woman still got flux chin and flux plastic skin though. Lol.
@@SteelRoo Feels more like you are too lazy to build the stuff I show you for free and want everything handed to you on a silver platter. This is my job.
okay the zoom into the eye with the human standing there was actually insane.
Nice! Great job Olivio!
Love the flow and super happy to finally give you a small token of support per month!
This is the most useful video you've ever made. And I generally find all of your videos useful. Thanks, Olivio!!
This is crazy. I tried your method and the results came out very well. Really amazing. I think there is almost no need to use additional nodes to add details to the face and hands; it's almost finished as is. Thank you very much for this secret... Always loved by your Thai fans.
I found with testing just now that having all three set to BETA gives my final result a fake/waxy skin appearance, but switching the middle step to KARRAS kept the realistic skin look throughout.
Excellent find. For me, the final result almost resembled an illustration. Your recommendation fixed this perfectly.
The method works extremely well when it comes to details, but for some reason I get fine horizontal stripes in my image after the last KSampler (Upscaling). Does anyone have any idea what is causing this?
I'm getting the same striping, something that hasn't been a problem with other workflows
@@kivisedo It's the upscaling; Flux gets wacky when you go over 2 megapixels. I solved it by running the image through SD Ultimate Tiled Upscale for 4 steps with 0.15 denoising.
@@equilibrium964 Exactly what I saw, and I did the same to fix it. Works fine with a little extra mask blur (16), tile padding (48), and a low denoise.
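Collecting the banding fix from this exchange in one place: a minimal sketch of the reported settings as a plain Python dict. The values (4 steps, 0.15 denoise, mask blur 16, tile padding 48) are only what the two commenters above mention, not official defaults, and the node name is the one they use.

```python
# Banding fix reported in this thread: run the upscaled image through an
# SD Ultimate tiled upscale pass with a very low denoise. These values are
# simply the ones the commenters above mention, gathered for convenience.
ultimate_upscale_fix = {
    "steps": 4,           # short refinement pass, not a full re-render
    "denoise": 0.15,      # low denoise so composition and details stay put
    "mask_blur": 16,      # extra blur to hide the tile seams
    "tile_padding": 48,   # larger padding also helps the seams
}
```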
Great video! Thanks for the shoutout. Always happy to help test and improve on things! And the workflow results are looking clean too :D
GENIUS!
Thank you very much! I recreated the workflow on my RTX 3050 (8GB VRAM, 32GB RAM), and the result was WOW. The whole generation process took 10.40 minutes. I repeated the generation using the same nodes but added a LoRA and changed the model to FP8, and the process took only 8 minutes. FP8 is much faster on my system than the Q8 GGUF.
And why exactly is the upscaler inserted in between? Unfortunately, I don't understand that. It would be very nice if someone could explain this to me. 🙂
I really need to test FLUX lol !!! But I'll try your trick with SDXL too. Great video as usual !!!
How does it handle text generation? Obviously when you don't ask for text, it can be forgiven that any text generated is gibberish, but what happens if you prompt for text generation?
I followed the video to build it as I'm not subscribed to his patreon, so can't 100% say I've done it all correctly, but text generation on a T-shirt worked fine for me on my version.
How does the final image resemble the prompt so well, even though cfg is set to 1.0 for each step?
The 4x upscaler makes a weird squared pattern in the final image. Do you have any idea how to fix it?
I found out it was a resizer issue, so which resizer do you use? I mean which node, not which model.
Hey it's very interesting but how do I add a denoise strength to that KSampler Advanced??
Why do all girls generated by Flux have that same chin?
I can't say for certain in regards to the chin, but I suspect the females (in particular) were trained off professional models and stock-photo-style models. They all have that cleaned-up, professionally airbrushed look with the base models. You really have to prompt and/or use LoRAs and dedicated checkpoints to get away from that. It may also explain the chin in some way.
No, only if you don't know how to prompt properly / tune your generation parameters / don't know how to train a LoRA.
@@Elwaves2925 helpful answer
@@devnull_ I know how to do all of this, but I wonder why the default always has this; it's just very specific.
I hate that default girl's face with Peter Griffin's chin too... :)
There are two LoRAs that fix that bony face problem:
"Chin Fixer 2000" and "AntiFlux - Chin".
Or you can use any Asian face LoRA, because Asian women rarely have that defect.
For example "Flux - Asian Beauties - By Devildonia".
Every time there is a generation of a Caucasian woman in Flux, the same one shows up. She made her appearance in this video as well.😅
Not every time, but if you put any keywords similar to beautiful / very beautiful, you are going to get that same generic look. Flux does similar things with lighting and perspectives too.
And if you do get that same face even without asking for beautiful face, you can try lowering your Flux guidance value, or use a LoRA.
Overfitted ?
Meet Mrs. Flux
@@bentontramell I tried to post a paragraph about overfitting but it was censored!
Basically a 3-pass with noise injection after the first KSampler and an upscale after the second. It gave me a gridded image because on the first KSampler I set the finish step to 10 but the total steps to 20 (where you set a total of 10), and the second KSampler couldn't converge starting at 10 and finishing at 20 out of a total of 20. Which means you are obliged to use a KSampler Advanced and not a KSampler Custom with a SplitSigmas node, because that one only does the first thing I described. How unfortunate. Gonna try this other approach with Turbo.
the upscaler model you use is not marked as safe in huggingface
just download the safetensors variant.
One question where I'm not fully understanding something. The original empty latent is 1304x768, but in the Image Resize node the resize width and height are 1024x1536. It seems this would switch the image from landscape to portrait mode and distort the image because of the different aspect ratio but all images are about the same, following the aspect ratio of the first image. Why does this work?
No, it uses the scale factor of 0.5, not the pixel values, because it is in "rescale" mode.
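To make that concrete, here is a tiny sketch of the arithmetic under the assumption stated in the reply above: in "rescale" mode the node applies the 0.5 factor and ignores the width/height fields, so a 4x upscaler model followed by a 0.5 rescale is a net 2x and the original aspect ratio is kept. The 1304x768 size comes from the question above.

```python
# 4x upscaler model followed by a 0.5 rescale = net 2x, aspect ratio unchanged.
def net_size(width, height, model_scale=4, rescale_factor=0.5):
    scale = model_scale * rescale_factor   # 4 * 0.5 = 2
    return int(width * scale), int(height * scale)

print(net_size(1304, 768))   # (2608, 1536) -- still landscape, same ratio as the latent
```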
It doesn't seem like this would be very fast, as it has 3-4 samplers. But I do like that the workflow focuses on highest quality; it is similar to Nerdy Rodent's latest, but he also used custom scheduler sigmas to give more control over the generation (like helping deal with the turkey skin).
Because the model doesn't load and unload between samplers, it is equivalent in time to doing 30 steps on one sampler, but it finalizes the overall composition earlier, so you can see the result of the composition sooner and cancel the render process if you don't like it. This is mainly how it is faster.
Plus it just gives better render results comparatively. I'll have to check out Nerdy Rodent's latest workflow too, though.
The fast part is having a full preview of the composition after 10 steps that stays the same. So (depending on GPU) after 10 seconds you know if you will go on or cancel. And then you get much better details if you keep going.
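Pulling together the numbers scattered through this thread (the "10 20 30 method" mentioned in a reply further down), here is a minimal Python sketch of the three-stage plan as plain data, assuming the usual KSampler Advanced start/end-step fields. It is reconstructed from the comments, not copied from the actual workflow file, so treat the exact values as illustrative.

```python
import random

# Three KSampler Advanced passes, each with a larger total step count,
# continuing where the previous one stopped. add_noise is enabled on all
# three and the seeds differ per stage (both per replies in this thread).
# The 4x upscale + 0.5 rescale sits between stage two and stage three.
stages = [
    {"name": "stage 1 (composition preview)", "steps": 10, "start_at_step": 0,  "end_at_step": 10},
    {"name": "stage 2 (detail pass)",          "steps": 20, "start_at_step": 10, "end_at_step": 20},
    {"name": "stage 3 (after the upscale)",    "steps": 30, "start_at_step": 20, "end_at_step": 30},
]
for stage in stages:
    stage["add_noise"] = True
    stage["seed"] = random.randrange(2**32)   # a different seed for every stage
    print(f'{stage["name"]}: steps {stage["start_at_step"]}-{stage["end_at_step"]} '
          f'of {stage["steps"]}, seed {stage["seed"]}')
```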
Interesting idea; I followed the video along to build it. It seems to give backgrounds a lot more detail and less of the blurred bokeh effect, which I really like. I did get the faint grid pattern I've found with Flux and was surprised by it for such a small upscale, but I added an SDXL pass at the end with a denoise of 0.1 and that fixed the issue and results in better skin detail :)
Honest question, could this accurately be described as a "2x HR Fix"? Instead of a 30-step gen, it's 10 steps, followed by a 10-step "HR fix", followed by another 10-step "HR fix"?
Maybe, but keep in mind that this does 10-20 out of 20, not just 10 extra steps. That's important for how Flux works, because if you did a second 0-10 out of 10 you would simply get the same image again.
@@OlivioSarikas Thanks for the reply! I’ve been contributing to Forge, and your video has me thinking that HR Fix has untapped potential, that perhaps a “loops” parameter would yield these results. One thing that doesn’t make sense in this workflow (to me) is how much noise is being added to the latent outputs on the next sampling. It’s just a true/false value… I would think this should be similar to “denoising strength” in WebUIs where a lower value adds less noise to the latent output, and a higher value adds more. In regards to your reply… if each of those 3 KSampler nodes generated the total steps (from 0 to X) , without feeding in a latent input, would the result images be drastically different from each other?
I tried this and ran into "Warning: Ran out of memory when regular VAE encoding, retrying with tiled VAE encoding." on the third KSampler. It's progressing but is taking 30 minutes for 30 steps. I'm on a 3090 24GB, btw.
Did you skip the image resize 0.5 node? I had the same on 3080Ti 16gb.
@@lexmirnov @nesimatbab That's probably it. It should be less than 160 seconds, and I am using FP16 on my 3090. Still, there's no doubt there are a lot of pixels to push, considering it's still quite a large final image.
The workflow seems quite ingenious. I tried it but I keep getting bands/stripes on the final render, after the upscale. These are not so obvious until the last step, but after the upscale they are quite annoying. No matter what I did, I can't get rid of them completely.
Every subsequent image after the first one has a thicker and thicker outline. They look like they were drawn with a thick Sharpie 😅. Do you have any idea how to fix it?
Use different seeds per ksampler
@@OlivioSarikas 😮 I'm surprised how you know the cause just from 2 sentences and even have a solution. Thank you so much for the quick reply 🫰. I'll check it out
In general very nice details. But how do you get rid of the banding artifacts?
Once I tracked down all the pieces this works well. One suggestion to make the workflows clearer. Double click on node titles and change them to something descriptive like Old Style Advanced KSampler, Stage 1 Advanced KSampler, etc.
Second suggestion - once I got this working I simplified the workflow using the Anything Everywhere and Anything Everywhere3 nodes along with some filtering by node coloring to get rid of all the lines in the graph. Matter of opinion though, to some it might obscure the logic of the workflow.
Have you tried this with SD35 yet? I've been trying to and having zero luck.
I've given up on trying to use negative prompts with Flux.
Great workflow! Works perfectly. One question: I see in the video that the LoRA is not connected. In case I would like to use one, where does the LoRA node need to be plugged in? On the CLIP input of the positive prompt node?
injecting latent noise gives much better results
He had all of them enabled for adding noise
Wait, how do you do that? I tried enabling leftover noise, noise injection... it has zero effect x)
@@tetsuooshima832 Latent Vision on YouTube (creator of IPAdapter, InstantID, PuLID, and many more).
Hello, thanks for the video. I wanted to ask where you got your upscale model from; I cannot find it in the Comfy model manager. Thanks.
Same here, cannot find it. Used search and Google. Nothing.
Hi, I got your workflow from Patreon, but I don't have all the same files (LoRAs, upscalers, etc.). Do you have links for them?
@OlivioSarikas, can you do a version with a LoRA for Flux?? Please!
This is for Flux.
@@OlivioSarikas Yes, I mean your workflow, can you do a version with a LoRA? I don't know if I have to put a LoRA for each KSampler.
@@davidmanas95 you mean this? ua-cam.com/video/jfbqlSaRIPI/v-deo.html
@OlivioSarikas Have you tested whether FLUX schnell works any better with this workflow?
Thanks, Olivio. What about for img2img... how should I handle the denoise of the first pass? I typically use .65 denoise in a regular single pass workflow...
Cheers!
First test I did went really well, then it started overcooking the image on all subsequent tests.
It seems very situational: great when it works, awful when it doesn't.
Make sure you have different seeds on the different KSamplers. Also, you might have to test different step counts with different community-trained models.
@@OlivioSarikas It was the seed issue. I didn't see your reply to this comment but did get a reply on your Discord. Having the same seed for 0 to 30 and 0 to 10, then different seeds for 10 to 20 and 20 to 30, makes the 10 20 30 method work again. Cheers.
What are the two nodes called after Unet Loader and DualCLIPLoader?
Do you mean the small blank ones? They are called Reroute. They are passive nodes used to extend the output of a node closer to where it is needed, especially where there are a lot of connections to the output. The connection passes through them and they have no effect. Their use is optional.
Seed question: do you run the same seed for all 3 samplers (setting the seed widget to an input) or generate them separately, like you do in this workflow? And the way you run it, is there any point in using separate nodes for seed generation?
Separate, because the same seed will introduce problems with the image.
Genius!🎖🎖
Where would you recommend inserting the LoRA? Beginning, middle, end, all of the above?
Try different ways, but I used it at the first KSampler so it doesn't interfere with the rest. It might get to be too much if you use it on all three.
@@OlivioSarikas thank you, I'll try it out!
Did I miss this, or was there talk of time savings? If so, what is the time comparison of this method vs the normal method? Side note: if you want to do fast iterations in Flux, you can render in 512x512. When you get something you like, just Hi-res fix it by 2X and make it 1024x1024. If you set the Denoise to 0.35 and the hires steps to ~15, it looks almost identical to the 512x512 version. (Note: I'm talking about using it in Forge, but you could just activate an upscale if you did it in ComfyUI)
The time saving is to have a full image after 10 steps and then cancel if you don't like it. 512x512 gives a different composition and fewer details, so you will get a worse image in the end.
@@OlivioSarikas Ah ok, gotcha, you just cancel if you don't like it. In my 512x512 testing, I'm not seeing fewer details or composition changes when using hi-res fix in Forge. I can't post examples here, unfortunately.
Okay, you are now the next one patreontube I can afford to support! Well done!!
Btw, The next time you want to say that, it's called a dirtypull windowslide.
nice, looks way better
Hey Oldie, follower from Nepal... Dhantaran
Would using "Scale to Megapixels" 2.0 be more efficient than going to 4x and then back down to 2x?
So weird. My output looks like absolute garbage. And my workflow is running about 4 times slower than usual. Did Comfy update something today? 😢
Are you running on Windows? I recently discovered that Nvidia drivers for Windows (since Oct 2023) allow system RAM to be used to supplement GPU VRAM. I have found that it runs about 4 times slower. (But on the flip side, it lets me do things I wouldn't have been able to with only 16G VRAM)
Is this also available for Forge? :)
AI Image generation is basically a glorified denoiser. I'm wondering if too much noise was removed in the first sampler. Would be interested to see the results if you did steps 1-10 of a max of 12 (or up to 15) for example for the first sampler. This way you have an overlap but you're still letting the second sampler not go to waste as much. The way you have it now, the second sampler is nothing more than full image inpaint with a very low denoising strength.
The catch is that the image on which the entire process is based has been generated in an accelerated manner in only 10 steps, which increases the chance of alterations and malformations in the hands, eyes, anatomy, etc., which the remaining refinement passes will not later be able to correct.
How long does this take? Also, can auto1111 do this as well with other models? How much vram does it use? What about doing the flux control net upscaler? Or doing supir?
Shouldn't sampler 2 start at step 11 and the 3rd at step 21?
Interesting..! 😎
@OlivioSarikas Thank you so much for this!! Any tips on how to speed this up on an M1 MacBook Pro? I have followed this example with the exception of using the safetensors version instead of the GGUF version.
It's going rather slow, though.
At first I was like "wait, WHY does this work?" But then I noticed each KSampler has a different seed, and it all clicked, because by changing the seed it triggers it to do different things in each part of the image than it would have and that's what introduces the extra detail. That's actually kinda genius! 🤯 I wonder if adding noise between KSamplers would help too? 🤔 Come to that, I wonder what would happen if you had a different KSampler for every single individual step? 🤯
I'm about to go to bed but now I'll have trouble sleeping since I want to try that first thing in the morning lol 😆
Thanks a lot, regardless of the insomnia! 💪
Are you making the workflow available to non patreons at some point?
I recreated it. It's simple.
Do you get better results with 3 different seeds, or can you use the same seed in the three steps?
Three different seeds seem better, because the same seed will enhance errors over time.
@@OlivioSarikas I really like your channel; I always learn something new from your videos. Thanks for sharing. 🙏
Love your work and this dope workflow. I'm calling it Flux Cascade in my build. Thx for sharing 😃🤙🏾
I use NF4, fast and highly detailed. It's not LoRA compatible, but I never use it anyway.
Wow! awesome!
This is insane! I had the same idea and was working on a workflow when this popped up. This gives way better results in a 10th of the time my solution was getting! Thank you so much for sharing this Olivio
This is actually an impressive workflow solution great job!
Interesting idea. How long does it take to run with 4090? Have you tried skipping a step ie starting at 11 instead of 10? or injecting noise?
RTX 3060 / 12GB - first two steps take 1 minute each. Last step takes 4.5 minutes with minor improvements compared to step 2
@@ernstaugust6428 Thanks for the info. I built a similar WF and it's running in about 2 minutes total on my box with a 4090. I added more steps and a 4th stage so it gets sampled twice after the upscale.
Amazing workflow! Very easy to follow and thank you for walking through each node step by step. I managed to replicate your results!
I'm not sure why, but I don't have the same results as you. After my first pass (10 steps), the image is already very realistic. After the 2nd pass (20 steps) the image has more details, but it is overcooked (too much contrast, weird colors) and starts looking like a painting. After the 3rd pass it's basically the same, so the end result after 30 steps is worse than after 10 steps. I used the same models as you (for the GGUF and the upscale). I'm not sure why that is.
To answer my own comment (maybe it can be helpful to others): I initially thought that you were using the same noise_seed for every sampler (which produces this overcooked effect). With a different noise seed for each sampler, it's much better :)
Make sure you use a different seed on each image
@@OlivioSarikas Yes, I fixed that, and it works well for characters, but I realized that for scenery the second pass tends to make the image look 'fake' (compared to the first pass). I'm losing lots of detail (textures); the image looks too 'clean', with strong contrasts and saturated colors. I'm trying to add some extra conditioning for the 2nd pass to keep it realistic, and testing different parameters, but no success so far.
love the breaking bad reference!!!
Could this work with img2img?
Technically yes, but it might change the details because of the 10-step first render. But give it a try.
@@OlivioSarikas If you start with the 3rd stage (or something similar) maybe this can be used like Magnific?? Just a thought. I'm thinking, upscale, inject noise, and denoise from a late stage??
"In German, the word for "windscreen wiper" is "Scheibenwischer." It's a compound word made up of "Scheibe," which means "windscreen" or "windshield," and "Wischer," which means "wiper." So, literally translated, "Scheibenwischer" means "windscreen wiper" or "windshield wiper.""
Very interesting and creative workflow. I don't use GGUF models though. Is this trick useful for someone like me that uses FP8 models? I did a couple of quick tests with a fine-tuned model (STOIQO New Reality FLUX) and I didn't see any perceivable difference in the amount of details and quality of textures doing this in 3 stages instead of doing all steps in 1 stage.
you can also use it with the other models, but you need to change the model loader
@@OlivioSarikas I'm afraid you misunderstood my question. Also, I already used the appropriate loader when I did my test with the FP8 model. My point was, If I use a "normal" FP8/FP16 model, is there any benefit to this 3-stage workflow instead of using just 1 ksampler? As I already mentioned, I did not notice a difference in the quality of the images when doing it in 3 stages vs 1 stage when using the FP8 model STOIQO New Reality FLUX.
This works incredibly well; combined with adjusting the early block weights in a LoRA, one can achieve some very fine detail at distance. Thanks Olivio.
I am a bit gutted that you have just shown what I had figured out with SDXL and Flux; I do very similar workflows with 3 passes and uncontrolled image-back-to-latent passes to do just this, and I consistently get better images for it too.
Is your SDXL workflow available anywhere? I'd be curious to try it out if it is.
Why are you a bit gutted? I don't get it... Is it because someone had the same idea as you? But why would that be bad? 🤷♂️ probably happens more than you'd think
No offense, but on principle I hate subscriptions and finding out there is a paywall at the end!
It's a reward for people who support me. I show the full workflow for free in the video
All of the nodes are visible and explained; there is no paywall or secret. Don't be lazy: create your own workflows with what you have learned and expand upon them. It's up to you.
"No offense, but even though I found your work very useful, and I would definitely benefit from it, I don't see why I should recognize you in any way or form"
@@OlivioSarikas Workflow much appreciated, but I still hate the system of subscriptions and paywalls! Also, it's not laziness! If I had to manually create every workflow I encounter, it would be a real headache! :)
@@chilldowninaninstant Lol, yes that is super lazy. With gen AI one doesn't even have to learn to draw or paint for years, simply learn to operate software and understand some concepts to get nice-looking images. And here a YouTuber spoon-feeds people how to do some specific thing, and still some folks complain.
Joined your Patreon but you don't reply to questions there, it seems.
That's kind of an awesome workflow. Thanks. Have you tried this method with SDXL or even SD 1.5? I wonder if the quality would also be improved on older txt2img generators.
This is pure genius, thank you very much Olivio.
So simple. I wonder WHY it works though?
Unfortunately it doesn't work well for everything: a 10-step render in the first stage using a regular model will frequently produce messed-up results (people with additional limbs and so on). So yeah, while this method really improves details with the same number of steps, it breaks things a lot as well :(
8:22 Thank our sponsors, Rionlard and Toribor
So, iterative upscaling?
More like iterative resampling. Upscale is optional.
Dude, you do deliver. Real impressive neat trick.
Great idea, but doesn’t work with consistent characters, unfortunately
I give this video a Mmuah! out of 10.
I'm getting into Flux kind of late, but this is a super helpful trick. Getting fast previews is key. I was using turbo and doing smaller renders to test different settings, but this method is much better.
I don't know if people are appreciating the algebra on the upscales:
30steps + Upscale
vs
20steps + Upscale + 10steps
It's the same amount of processing, but this method puts 10 generative steps after the upscale, which is the trick to better upscales in general.
Thanks for sharing!
0-10 in computer calculations is 11 steps, as is 10-20: 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20. That's 11 numbers, so your KSampler should be set to 11. Otherwise, you never reach the final step. Unless you don't want to reach the final step. Also, isn't this exactly how the SDXL refiner works?
Gonna try your theory tomorrow. Interesting!
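On the step-counting question above: if the start/end fields behave like a half-open range (end exclusive, the way Python's range works; that is an assumption here, not a confirmed detail of the node), then 0-10 runs 10 steps and the three passes cover 30 distinct steps with no overlap. A tiny sketch of that counting:

```python
# Counting under a half-open [start, end) assumption: each pass runs
# end - start steps, and 0-10, 10-20, 20-30 cover steps 0..29 exactly once.
passes = [(0, 10), (10, 20), (20, 30)]

steps_run = [step for start, end in passes for step in range(start, end)]
assert steps_run == list(range(30))          # every step appears exactly once
print([end - start for start, end in passes], "->", len(steps_run), "steps total")
```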
Cool, I will have to try this!
Sorry for that but you really need to check it out!
They are promising a new SANA model soon, what do you know?
The workflow shown in the video can be reproduced manually, I've tried it myself. So if you want to learn from scratch, you can follow the workflow as demonstrated in the video. However, if you prefer a simpler option and want to support, you can check out the provided link. By the way, thank you Olivio Sarikas.
Someone share the workflow, I don't want to spend money on a subscription for the sake of one file)
just watch it and build it, not hard at all. When you do this you will learn how to make stuff yourself instead of begging for handouts.
Ugh, just watch it and build it. stop being lazy. Or spend 3 dollars... you don't have 3 dollars?
@@the_one_and_carpool He literally shows you the workflow. By building it yourself you learn. I can't believe you need someone to give it to you; it's so simple. And you're wrong, he's asking a really fair price for his Patreon; most ask much more. The really wrong thing is that you expect people to give you stuff for free when you don't offer anything except complaints.
A normal face, as witches go )
How can I replicate it using "SamplerCustom" or "SamplerCustom Advanced"??
Thank u very much bro !
You need the advanced sampler to be able to start at a middle step. A denoise percentage lets you stop at a position but doesn't let you start in the middle, so you have to use a sampler node that allows you to set the start step. Just use the advanced sampler and add the Flux Guidance node after your text encoder.
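For the denoise questions elsewhere in this thread, a rough mental model (an approximation under my own assumption, not an exact mapping for every scheduler): a WebUI-style denoise value d on an s-step schedule behaves roughly like skipping the first (1 - d) * s steps, which is what the advanced sampler's start step does explicitly.

```python
# Rough equivalence between a "denoise" slider and an explicit start step:
# denoise d over s total steps ~ skip the first (1 - d) * s steps.
def approx_start_step(total_steps: int, denoise: float) -> int:
    return int(total_steps * (1.0 - denoise))

print(approx_start_step(30, 0.65))   # 10 -> roughly like starting at step 10 of 30
print(approx_start_step(20, 0.5))    # 10 -> half the schedule skipped
```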
Love it!
Still slow if you don't have a 4090, they need to make a more accessible model
Why did you choose to decode the image and use an upscaler model on that, rather than upscale the latent, inject a small amount of noise and then use that for your 3rd sampling stage?
Pixel upscalers are more powerful than latent upscalers
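For the alternative the question above describes (upscale the latent and inject a little noise, instead of decoding and running a pixel-space upscaler model), here is a minimal torch sketch of the idea. The latent shape, channel count, and noise strength are assumptions for illustration only; it is not the workflow from the video, which decodes and upscales in pixel space as the reply above says.

```python
import torch
import torch.nn.functional as F

def latent_upscale_with_noise(latent: torch.Tensor, scale: float = 2.0,
                              noise_strength: float = 0.1, seed: int = 0) -> torch.Tensor:
    """Upscale a (B, C, H, W) latent by interpolation and add a little fresh
    noise, so the next sampling stage has something to refine instead of
    just smoothing out the interpolation."""
    up = F.interpolate(latent, scale_factor=scale, mode="bilinear", align_corners=False)
    noise = torch.randn(up.shape, generator=torch.Generator().manual_seed(seed))
    return up + noise_strength * noise

# Dummy latent for illustration (channel count and size are made up here).
dummy = torch.randn(1, 16, 128, 192)
print(latent_upscale_with_noise(dummy).shape)   # torch.Size([1, 16, 256, 384])
```

The main trade-off: the latent route skips a decode/encode round trip, while the pixel-space route lets a trained upscaler model add detail before the final sampling stage.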
❤
first!!!
Second lol
how taxing is this on hardware?
Not more than Flux usually. But because you can cancel after the first KSampler if you don't like the result, you actually save a lot of time and power.
@@pingwuTKD Sorry, I don't have a Mac, but you can ask in my Discord.