Stable Diffusion Running on an NVIDIA RTX 4090 (Speed Test) Automatic 1111 (Vlads SD.Next)
- Published 10 Aug 2023
- This short video is for anyone curious to see what sort of speed you can get with an NVIDIA RTX 4090 in Stable Diffusion. I'm currently using the Automatic 1111 (Vlad's fork) setup, with xformers turned off based on Vladmandic's suggestion that it's unnecessary. I don't know if I am getting the speed I should, but I am happy with the performance.
github.com/vladmandic/automatic
Thanks for the benchmark. the 4090 is really incredible
You're welcome. I struggled to find any demonstration like this before buying a 4090 so I thought it would help people see the speed.
@@gohan2091 would you consider having more than one in the PC to make it even faster? Wondering if it would even fit. I am considering assembling the ultimate machine for Stable Diffusion. Can you share the rest of your PC specification?
@@edwincloudusa no, one PC is enough. My PC is 64GB DDR4, AMD Ryzen 5950X, Windows 11.
@@gohan2091 He means dual cards, I think
@@hansa9159 I don't think dual cards would be sensible. One 4090 is plenty
thanks 👍
Ty for the video. I'm really saving money to buy a 4090; it's too expensive for me. I hope I can buy it.
I hope you can buy one. It's a very nice card for gaming and AI. To save money you don't need to get an Overclocked edition, mine is a standard Zotac.
It's interesting to see that it generates several images at the same time. My A1111 only generates images one after the other, and now with Forge-UI it runs at around 3.5-3.9 it/s with a 3050 Ti (4 GB VRAM) on an AMD laptop.
Thanks for the video
Anyone can generate several images at the same time. It's due to the batch size slider (not batch count)
@@gohan2091 Thanks
Omg that is the speed of Gods, can you make video of another configuration settings speed test?
What other configuration?
More steps
@@surrealechoai how many? more steps doesn't necessarily mean better quality. I could do some hires fix latent scaler maybe?
@@gohan2091 Could you try next time with 30 steps DPM++ SDE Karras and Hi-Res Fix being 4x Ultra Sharp, at 15? I've seen great results with these settings. What do you think?
That's so awesome thank you! How did you get it to generate multiple images at a time? I tried the same settings but they still seemed to generate 1 at a time.
Take a look at the sliders that I adjust in the video. You have batch count and batch size. Batch size is how many images you're generating at the same time, so a batch size of 8 would create 8 images together. Batch count is how many batches you're generating in sequence (one after another). For example, batch count 10 with batch size 1 would create 10 images one at a time, whereas batch count 1 with batch size 10 would create 1 lot of 10 images (all 10 created together at the same time).
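To make the two sliders concrete, here is a minimal sketch of the arithmetic only. The function names `plan_batches` and `total_images` are made up for illustration and are not part of A1111 or SD.Next.

```python
def plan_batches(batch_count, batch_size):
    """Return the size of each batch, in the order the batches would run.

    Hypothetical helper: it only models the arithmetic of the two sliders,
    it does not call any Stable Diffusion code.
    """
    return [batch_size] * batch_count


def total_images(batch_count, batch_size):
    """Total images produced: batches run in sequence, images within a batch together."""
    return batch_count * batch_size


# Batch count 10, batch size 1: ten sequential runs of one image each.
print(plan_batches(10, 1), total_images(10, 1))

# Batch count 1, batch size 10: a single run that produces ten images together.
print(plan_batches(1, 10), total_images(1, 10))
```

Both settings end with the same number of images; the difference is how many are produced in the same pass.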
@@gohan2091 Is there any difference between both methods? I mean, is there any positive side of one method over the other? Thanks!
Also interested if anyone knows
@@CyborgMedia
That's a rocket
My GTX970 can generate 1 image in that time.
How much time
@@David-wp4vu All the time that took them to generate all those images + more
How come I only get 2.8 it/s?
I have an RTX 4090 and Intel's 13900K, not overclocked, and mine does not run that fast. What settings are you using?
Slower than my dual core PC!!
try quad core
@@Hoffmanpack Why should I try a quad-core? I currently have a dual-core PC, which is much faster than an RTX 4090 build.
Interesting info on Reddit: the driver version changes output speed.
537.13 - 682.11 seconds
531.79 - 297.95 seconds
Not sure if it affects the 4090, but after that post I need to test.
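For scale, the two timings quoted from Reddit can be compared directly. This is just arithmetic on the numbers above; the variable names are made up, and what workload was timed is not stated in the comment.

```python
# Timings as quoted in the comment above (seconds, lower is better).
seconds_by_driver = {
    "537.13": 682.11,
    "531.79": 297.95,
}

slow = seconds_by_driver["537.13"]
fast = seconds_by_driver["531.79"]

# How many times longer the run took on the newer driver.
slowdown_ratio = slow / fast

# Extra time relative to the older driver, as a percentage.
extra_percent = (slow - fast) / fast * 100

print(f"537.13 took {slowdown_ratio:.2f}x as long as 531.79 (+{extra_percent:.0f}%)")
```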
Do you have a link to the reddit?
Just tested with SD 1.5; for me the new driver works better on the 4090.
So maybe the issue only affects older graphics cards.
My result: 1024x1024, 100 steps, DPM++ 2M SDE Karras
537.13 -> 6.83
531.79 -> 6.19
Reddit post: New NVidia driver released, with possible slowdown issue resolved?
@@pastuh are you getting similar speeds to what I get on the video if you replicate the settings I used? Just curious
After testing, I see my single images are generated much faster.
Batch 4 is faster; 8/16 I would say are identical.
MSI Afterburner set to power limit: 60%
Automatic1111 latest update
COMMANDLINE_ARGS=--opt-sdp-attention --no-half-vae
@@pastuh interesting. And with the same settings as I'm using? I'm on Vlads fork of A1111. Any ideas why yours is faster? I'm not using xformers either
When I bought this computer with a 7900 XTX, I wasn't interested in AI image generation. I just tried it the other day.
It's the most excruciating experience possible... most of the A1111 forks hardly work at all, and the couple that I can get functioning produce one image every 5+ minutes...
Try nod.ai Shark local installation. That's what I used when I had an AMD 6700 XT. Much faster generation speeds
Or maybe install Linux on a cheap SSD and use it for the AI stuff. The 7900 XT is pretty fast on ROCm, even a little faster than Shark. You can look at the bench video I made to get an idea.
How much did the whole PC setup with the 4090 cost you?
The 4090 was £1500, the rest of the computer excluding the monitor was around £2500. This is in GBP (British pounds).
can you show some faceswapping and controlnet speeds
Sure, ua-cam.com/video/8s8z3Q6AXCs/v-deo.html
Why can you generate a random person each time? I use the same prompt and settings but always get a female only.
Look at the prompt I'm using. Type the prompt you're using here in the chat. Is it the same?
@@gohan2091 A{child|teenager|adult|elderly}{male|female},{Mexican|German|Swedish|American|Japanese|English|Thai}wearing casual in a {mall|park|library|woods|stadium}+,RAW photo,(high detail skin:1.2) 8k uhd, dslr, soft lighting,high quality,film grain, Fujifilm XT3
nsfw,naked,nude,cartoon,painting,illustration,(worst quality,low quality,normal quality:2)BadDream UnrealisticDream
@@qwew789 do you have the Dynamic Prompts extension installed? You need that to be able to use wildcards and curly brackets.
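Roughly, the curly-bracket syntax means "pick one of these options for each image". The sketch below imitates that idea in plain Python. It is not the extension's actual code, `expand_variants` is a made-up name, and it only handles simple, non-nested `{a|b|c}` groups.

```python
import random
import re

# Matches one non-nested group such as {male|female}.
VARIANT = re.compile(r"\{([^{}]*)\}")


def expand_variants(prompt, rng=None):
    """Replace each {a|b|c} group with one randomly chosen option.

    Illustrative only: the real extension supports more syntax than this.
    """
    if rng is None:
        rng = random.Random()

    def pick(match):
        options = match.group(1).split("|")
        return rng.choice(options)

    return VARIANT.sub(pick, prompt)


template = "A {child|teenager|adult|elderly} {male|female} in a {mall|park|library|woods|stadium}"

# Each call can produce a different person and place.
for _ in range(3):
    print(expand_variants(template))
```

Without something doing this substitution, the curly-bracket text goes to the model unexpanded, so the prompt no longer varies from image to image.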
@@gohan2091 i do not have this, thanks for answer👍
no second pass? no hires fix? ummm....
Definitely a lot faster than an RTX 4070 Ti
VRAM usage in task manager please
I've already made the video. I won't be redoing it, sorry
What are the max picture dimensions? Can it do 1080p? 4k? Thanks
You should be using the maximum resolution of the 1.5 model which is usually 512 or 768 pixels. You can latent upscale it by 2x via hi-res fix which just takes seconds. You can generate 4k images via upscaling but as the original generation? It's possible but the results may not be good.
Ty
yeah that makes more fun. its a 24 GB GPU for 1800€....
what is Vlads fork setting?
I don't understand what you're asking
Can you also try this using sdxl 1.0?
I don't have SD XL properly setup. I have it installed and it works in A1111 but I tend to use 1.5 models at the moment. It is fast though.
It's very good, it's great, it's fast, it would be nice if you made one with a professional Quadro card
Lol I don't have a quadro card to test :(
@@gohan2091 ok friend, great, we still love your videos.
Why doesn't my 4090 run at that speed?
I don't know. Are you using A1111 or SD.Next (Vlad's)? Are you using xformers? Or SDP? What version of Python do you have? Are your NVIDIA drivers updated? Things like that.
@@gohan2091 fixed it...
I had an AMD card previously, so I had installed a Stable Diffusion version for AMD.
So I uninstalled everything
and installed the NVIDIA version...
So the problem was the Stable Diffusion software, which wasn't compatible.
@@Mente_Fugaz great :) enjoy!
@@Mente_Fugaz how could you tell?