Stable Diffusion Performance Optimization Tutorial

  • Published 19 Dec 2024

COMMENTS • 77

  • @tonycelentano2071
    @tonycelentano2071 1 year ago +36

    I love that Arnold Schwarzenegger is giving AI tips.

  • @AI-ByteBard
    @AI-ByteBard 1 year ago +11

    Thanks man! This dramatically increased my performance!

  • @quizkraftstudio
    @quizkraftstudio 7 months ago

    Installing xformers was the big one for me. I have a 3070 Ti and went from 22 minutes for a single 1920 x 1080 image to 50 seconds. It was pretty fast as it was at 512 x 512, but for some reason 1080p images slowed it to a crawl. Thank you!

  • @ralkeon87
    @ralkeon87 1 year ago +2

    If I remove --medvram, image generation slows down drastically (it goes from 1 min to 20-28 min). What could be the issue?

  • @linkernetir
    @linkernetir 8 days ago

    thanks for telling me --medvram reduces speed.

  • @Lell19862010
    @Lell19862010 1 year ago +2

    Isn't it sufficient to right-click the .bat file and choose Edit to open it in Notepad?

  • @JuxGD
    @JuxGD 1 year ago +1

    For editing .bat files, can't you just Shift + right-click and then choose "Open with"?

    • @JuxGD
      @JuxGD 1 year ago +1

      Or click "Show more options" and then "Open with".
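
For reference, whichever way you open webui-user.bat in Notepad, the flags go on the COMMANDLINE_ARGS line. A minimal sketch, assuming a default Automatic1111 install (your PYTHON/GIT/VENV_DIR values may differ):

    @echo off

    set PYTHON=
    set GIT=
    set VENV_DIR=
    rem performance flags go here, e.g. --xformers --medvram
    set COMMANDLINE_ARGS=--xformers

    call webui.bat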

  • @petertremblay3725
    @petertremblay3725 13 days ago

    I have Forge UI with an RTX 3060 12 GB, and under settings in Stability Matrix I have CUDA malloc and CUDA stream. Should I activate both?

  • @wadimek116
    @wadimek116 7 months ago +1

    Wow, thank you! From waiting 60 seconds for a 512x512 on Pony models, it now takes 10 seconds. RTX 3070 Ti.

  • @kawabalik5080
    @kawabalik5080 4 months ago

    The problem is it takes extremely long even though I only entered 2 prompts. If I enter several prompts, it doesn't work at all.

  • @ArtificialChange
    @ArtificialChange 9 months ago

    What if I have a 2060 Super and a 4060, and only the 4060 is being used? I would like to set up my 2060 Super to run Automatic1111, and then if I feel like it I can also run ComfyUI separately.
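
If the goal is to pin Automatic1111 to one GPU and leave the other free for ComfyUI, one option is the --device-id argument (together with a separate --port so both UIs can run at once). A sketch, assuming the 2060 Super is CUDA device 1; the index may differ on your system:

    rem webui-user.bat: run Automatic1111 on GPU index 1 and a non-default port
    set COMMANDLINE_ARGS=--device-id 1 --xformers --port 7861

    rem Alternative: hide every GPU except index 1 from this process
    rem set CUDA_VISIBLE_DEVICES=1

    call webui.bat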

  • @chaos_artifical_intelligence
    @chaos_artifical_intelligence 1 year ago +2

    Don't the arguments '--opt-split-attention' and '--xformers' conflict with each other? You can only choose one of them.

    • @Archive-pg2zn
      @Archive-pg2zn 1 year ago +2

      Hi there, the documentation does not mention any compatibility issues. In fact, using the "--opt-split-attention" option may enhance the performance of the "--xformers" command line argument. Could you provide me with a source? I'm very interested in optimizing performance.

    • @Albedowo
      @Albedowo 1 year ago

      You confused "--opt-sdp-attention" with "--opt-split-attention".

    • @chaos_artifical_intelligence
      @chaos_artifical_intelligence 1 year ago +1

      @@Archive-pg2zn
      --xformers
      --opt-split-attention-v1
      --opt-sub-quad-attention
      I think those 3 can't work together, as the code is set up with conditions so that only one of them is actually applied.

    • @chaos_artifical_intelligence
      @chaos_artifical_intelligence 1 year ago

      @@Albedowo Check out the code, you'll find they're incompatible with each other.
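
In other words, as the thread above describes, these flags each select a cross-attention implementation and only one of them ends up being applied per run, so it makes sense to pick one and benchmark. A sketch of two alternative setups to compare (not flags to combine):

    rem Option 1: xformers attention
    set COMMANDLINE_ARGS=--xformers

    rem Option 2: split-attention v1 (comment out Option 1 before trying this)
    rem set COMMANDLINE_ARGS=--opt-split-attention-v1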

  • @gamingweeb54
    @gamingweeb54 6 months ago

    My laptop freezes every time I hit generate 😢
    Please help

  • @JpresValknut
    @JpresValknut 1 year ago

    Works way better now. Yet it feels like my RTX 3080 still does not run at 100%. Not that that means much, but the GPU constantly stays at 60 degrees with barely any cooling, yet it moves between 80% and 100% 3D usage in Task Manager. I adjusted everything in the global settings of the Nvidia Control Panel, yet it seems like there is still a lot of headroom left. Comparing it to other benchmarks confirms this. Is there anything more you can advise?

  • @iresolvers
    @iresolvers 7 months ago

    Is there a system info extension for ComfyUI?

  • @YoshiYo-qr1nk
    @YoshiYo-qr1nk 1 year ago +1

    I'm really struggling to find good settings for the RX 6600. Do you have an idea where I could find the best command line?

    • @joshuad4772
      @joshuad4772 5 months ago

      Just drop it, Radeon cards aren't made for AI stuff.

    • @darkromano_
      @darkromano_ 2 months ago

      Use Stable Diffusion ZLUDA for better performance on AMD.

  • @anon3253
    @anon3253 1 year ago +2

    Thank you! My images used to take 3 minutes; now it's down to 20 seconds.

  • @hanznif9365
    @hanznif9365 1 year ago

    Hi, I'm using an RTX 2060 Super. My problem is that SD is using a lot of my RAM, not my GPU. How do I make it use the GPU? Thanks.

  • @SpacenSpooks
    @SpacenSpooks 1 year ago +2

    Hello, I'm using a Windows Surface 4 with an AMD Ryzen 5 (integrated GPU?), but my CPU is running at 100%, my RAM is at 90%, and images are taking 20 minutes to generate (an hour for one image was the worst). Which tutorial would best suit my computer's needs? (I'm a complete n00b to this, BTW)

    •  8 months ago

      are you still having issues?

  • @JohnRustles
    @JohnRustles 8 months ago

    How do I restart Automatic1111 on Windows? What's the command?
    Thank you for the video!

  • @supersonicman4353
    @supersonicman4353 10 months ago

    Can someone please help? It's using 100% of my CPU, giving me super slow performance, and not using my GPU at all. My specs are a 7600 CPU and a 6800 XT GPU, as well as 32 GB of DDR5 RAM.

  • @NonexistentIndividual-e1l
    @NonexistentIndividual-e1l 1 year ago

    My PC keeps shutting down after running Stable Diffusion for a while. I'm using Windows with an NVIDIA card. Any suggestions?

    • @Archive-pg2zn
      @Archive-pg2zn 1 year ago

      Hello there! This is most likely a power delivery issue. Remove these flags if you have set them as command-line arguments: "--opt-split-attention-v1 --opt-sub-quad-attention". See if your computer still crashes. Be sure you have updated the NVIDIA drivers for your specific GPU. Try installing Automatic1111 on WSL2; it is very easy and improves performance in some cases. Tutorial: ua-cam.com/video/sfQvP5VGxKI/v-deo.html Use a program like CPU-Z to see the power draw and usage of your GPU and CPU, and check whether your computer crashes while it is at 99% or 100% utilization. It may be that your CPU is not being cooled properly, causing a thermal shutdown, although this is unlikely. There is a GitHub issue that specifically addresses the problem you mentioned: github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/11598

  • @OceanBoyJj
    @OceanBoyJj 1 year ago

    Do you recommend xformers and other settings on an RTX 3090?

    • @itizjuan
      @itizjuan 1 year ago

      xformers IS recommended on an RTX 3090.

  • @philippeheritier9364
    @philippeheritier9364 1 year ago +2

    Thanks, it runs much faster now.

  • @kam6576
    @kam6576 1 year ago

    My performance is good when generating pictures, but when I switch between checkpoints it takes 5-10 minutes. Do you know how to make it faster?

    • @SpookySpiralz
      @SpookySpiralz 1 year ago

      Use an SSD.

    • @pixelpirate6
      @pixelpirate6 1 year ago

      Switching between checkpoints also takes a bit longer for me because I'm using a hard drive. The first image generation/iteration is usually slower when you load your checkpoint because it needs to be loaded into VRAM. That's what I've heard people say.

    • @SupremacyGamesYT
      @SupremacyGamesYT 1 year ago

      @@SpookySpiralz You don't need to have the models/checkpoints on an SSD, nor A1111's installation. I have everything on a 7200 RPM HDD. Switching between them takes 2-3 minutes depending on their size in GB.

    • @jithel7948
      @jithel7948 1 year ago

      @@SupremacyGamesYT Yes, but on an SSD my switches take about 15 seconds.

  • @brianp235
    @brianp235 1 year ago +1

    Excellent video! Thanks for making this!

  • @remusveritas739
    @remusveritas739 1 year ago +1

    You showed it all very simply and clearly and I still don't get it, I'm so stupid man...

  • @akselkarlsson5229
    @akselkarlsson5229 1 year ago

    You are a Wizard. Thanks for making the videos so easy to follow so dum dums like me can make higher res pictures :)

  • @rezzancicek4864
    @rezzancicek4864 1 year ago

    Thank you bro for all you do. I have a GTX 1660 Super but extremely low performance. Can you help me if I share my PC over Discord or AnyDesk?

    • @Archive-pg2zn
      @Archive-pg2zn 1 year ago

      Hi, thank you! Try installing Stable Diffusion this way: ua-cam.com/video/sfQvP5VGxKI/v-deo.html Be sure to install the latest Nvidia Drivers.

    • @rezzancicek4864
      @rezzancicek4864 1 year ago

      @@Archive-pg2zn I already have the newest Nvidia drivers, bro :(

  • @Viraltubs
    @Viraltubs 1 year ago +1

    BRO, YOU NEED TO TEACH US ABOUT ERRORS!! MEMORY ERRORS ETC.

  • @TairMukashev
    @TairMukashev 3 months ago

    Thank you!!!

  • @tuurblaffe
    @tuurblaffe 1 year ago

    KEK! The bloat and spyware of Windows is no match for your performance. Get a custom ISO if you need Windows, but don't go messing with all the spyware and bloatware.

  • @khristiankhouri8732
    @khristiankhouri8732 1 year ago

    Hi, I tried out the new args and removed --medvram on an RTX 3060 Ti, and I get a memory error when trying to 2x upscale in img2img from 512x768 to 1024x1536. What's the deal?
    Args are --xformers --opt-channelslast --upcast-sampling --opt-split-attention --no-half-vae --medvram --vae-path "models\VAE\vae-ft-mse-840000-ema-pruned.safetensors"

    • @Archive-pg2zn
      @Archive-pg2zn 1 year ago

      Hi there, it might be worth trying a different approach. Instead of specifying the VAE via the command-line arguments, you could try using the UI. If the VAE is in the VAE folder (it appears that this is already the case on your computer), go to "Settings" > "Stable Diffusion" > "SD VAE", choose the VAE from the drop-down menu, and click "Apply Settings". Additionally, try using the following command line arguments: "--xformers --upcast-sampling --opt-sub-quad-attention --opt-channelslast --medvram --no-half-vae". Hopefully this will help you avoid the memory error. Note that upscaling the generated image requires a lot of memory, so I suggest generating all your images without the upscaling feature. After the generation, restart Automatic1111 with the "--lowvram" argument and batch upscale all your images. You can let this run while you're away. Let me know if it works!

    • @kdzvocalcovers3516
      @kdzvocalcovers3516 1 year ago

      Use xformers only, do not use any other args, and especially do not use medvram if possible. My 3070 Ti is at 85 percent load (was at 100 percent), 6.9 GB/8 GB, fast and efficient. 32 768x768 images with hires fix takes less than 3 minutes; 16 images at 1.8 seconds.
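
Putting the long reply above into concrete form: a sketch of the suggested webui-user.bat arguments, with the VAE selected through the UI instead of --vae-path, plus a separate low-VRAM session for batch upscaling. The flag choice follows the reply above; adjust to your card:

    rem Everyday generation (select the VAE under Settings > Stable Diffusion > SD VAE)
    set COMMANDLINE_ARGS=--xformers --upcast-sampling --opt-sub-quad-attention --opt-channelslast --medvram --no-half-vae

    rem Separate upscaling session: restart with --lowvram and batch-upscale the finished images
    rem set COMMANDLINE_ARGS=--xformers --no-half-vae --lowvram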

  • @oseaniic
    @oseaniic 26 days ago

    TY

  • @mentocthementalist5917
    @mentocthementalist5917 3 months ago

    TYVM

  • @AkshayTravelFilms2
    @AkshayTravelFilms2 1 year ago

    Thanks Man :)

  • @procrastonationforever5521
    @procrastonationforever5521 1 year ago +1

    Why on Earth would you need to rename the .bat file to .txt? What a lamer...

  • @campfirecult4375
    @campfirecult4375 1 year ago

    🔥

  • @distorter85
    @distorter85 1 year ago

    Did as you showed, no improvements at all )))

  • @antoinefeuerstein9817
    @antoinefeuerstein9817 11 months ago

    Such a shame... you are making a tutorial on Windows, bro! The title implies a general tutorial that also covers server-side setups; Windows is such trash in terms of performance! :P

  • @Statvar
    @Statvar 1 year ago +1

    Xformers helps it go faster for sure, but I keep running into an error when I use img2img at 768 by 512 saying, "modules.devices.NansException: A tensor with all NaNs was produced in VAE. This could be because there's not enough precision to represent the picture. Try adding --no-half-vae commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check."
    Whenever I use txt2img it works fine. I'm using an RTX 2070 Super, btw. Thanks for the informative video.

    • @Statvar
      @Statvar 1 year ago

      Never mind, I just added --no-half-vae along with xformers and that might've fixed it for now. I also noticed that using different command line arguments slightly changes the image even with the same seed. Not necessarily a bad thing, just something I noticed.

    • @Archive-pg2zn
      @Archive-pg2zn 1 year ago

      @@Statvar Great job finding the solution to the error by adding the "--no-half-vae" command line argument. You're absolutely right that changing the command line arguments can slightly change the image. With "--xformers" you trade a tiny bit of precision for generation speed, although most of the time the difference is negligible. Keep up the good work!
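
For anyone hitting the same NansException, the combination that worked in this thread is simply both flags together. A minimal sketch:

    rem xformers for speed, --no-half-vae to keep the VAE in full precision and avoid NaN errors
    set COMMANDLINE_ARGS=--xformers --no-half-vae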

  • @pinielka
    @pinielka 1 year ago +1

    Hi, first of all thanks for this video! I HAVE A PROBLEM and need your help, please.
    If I use the user.bat with lowvram, everything works properly but very slowly. If I use it without lowvram I get this error: "stable diffusion error PYTORCH_CUDA_ALLOC_CONF"
    my system info is: cpu: Intel64 Family 6 Model 151 Stepping 2, GenuineIntel
    system: Windows
    release: Windows-10-10.0.22621-SP0
    python: 3.10.6
    device: NVIDIA GeForce GT 1030 (1) (compute_37) (6, 1)
    cuda: 11.8
    cudnn: 8700
    ram: free:8.42 used:7.39 total:15.82
    gpu: free:0.99 used:1.01 total:2.0
    gpu-active: current:0.01 peak:1.07
    gpu-allocated: current:0.01 peak:1.07
    gpu-reserved: current:0.02 peak:1.69
    gpu-inactive: current:0.02 peak:0.18
    events: retries:0 oom:0
    utilization: 0
    can i do something to work SD faster?

    • @Archive-pg2zn
      @Archive-pg2zn 1 year ago +1

      Hi! I'm glad you found my video helpful. One solution you could try is to use these flags: "--medvram flag --xformers --opt-split-attention". Try lowering the resolution at which you're rendering pictures. Don't use any upscalers, because they allocate extra VRAM. Run the benchmark to see which flags work the best. If you're still experiencing slow performance, it might be worth considering upgrading your graphics card to one with at least 4 GB of VRAM. The GT 1030 only has 2 GB of VRAM, which is lower than the recommended VRAM of 4 GB for Stable Diffusion. You may use Google Colab. Their free tier has high performance graphics cards. The GTX 970 and 960 4 GB Edition should offer solid performance too. I hope this helps!

    • @pinielka
      @pinielka 1 year ago

      @@Archive-pg2zn Thank you so much for all the information! 👍👍 I will check it out and reply.

    • @pinielka
      @pinielka 1 year ago

      @@Archive-pg2zn With these flags: "--medvram flag --xformers --opt-split-attention" it gives me an error: flag

    • @Archive-pg2zn
      @Archive-pg2zn 1 year ago +1

      @@pinielka I made a typo. "--medvram flag --xformers --opt-split-attention" should actually be "--medvram --xformers --opt-split-attention", without the "flag". If these flags result in a "cuda alloc" error, your best bet is to use your original flags "--lowvram --xformers --opt-split-attention". If you really want to speed up the generation process, consider upgrading your graphics card to one with at least 4 GB of VRAM.

    • @pinielka
      @pinielka 1 year ago

      @@Archive-pg2zn thanks for your reply! I will check it out and let you know
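
For very low-VRAM cards like the GT 1030, a sketch of the arguments discussed in this thread. The PYTORCH_CUDA_ALLOC_CONF line is an optional extra (a standard PyTorch allocator setting, not something from the video) that can reduce fragmentation-related allocation errors:

    rem 2 GB card: --lowvram is the safe choice; --medvram may hit CUDA allocation errors
    set COMMANDLINE_ARGS=--lowvram --xformers --opt-split-attention

    rem Optional: let PyTorch split large allocations to fight fragmentation
    set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128

    call webui.bat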

  • @Jenova867
    @Jenova867 7 months ago

    Did everything in the video and image generation still takes a long time. It may even be slower because I took out --medvram.