CPU Thread Per Core Clarification @ 17:35 - There is a mistake in the number of CPU threads per core. It's not logical processors but threads per core, which is 2 for the AMD 7950X3D. Go with the default setting of 2. Some Intel processors don't have hyper-threading; there it should be set to 1. All consumer-grade CPUs are at either 1 or 2 threads per core.
Also as mentioned in comments by @quercus3290
"Increasing Network Rank (Dimension) will have extreme impacts on lora training times, this should probably be mentioned."
What do you mean by extreme impacts on training times? As in, will it make the training faster, or longer by hours?
For a simple LoRA, you can always start with a Network Rank of 32 and an Alpha of 1, and adjust as needed.
Regularization images: 10:20 If you have 134 training images you'll have 134 regularization images. That's simple.
1- Nb. of reg. images = nb. of training images.
2- Name of each reg.image = name of its corresponding image in the training dataset (01.png = 01.png)
3- The prompt for each reg. image has to be the exact caption that you wrote for the corresponding training image, BUT minus the trigger word (first/last word in your caption, character/style).
4- The seed for each reg. image has to be the seed that you define in Kohya > Parameters > Seed (ex: 12345).
5- The sampler for each reg. image must be DDIM.
6- The checkpoint for each generated reg.image must be the checkpoint used for training.
7- The size (pixels) of EACH reg.image has to be the same as its corresponding image in the training (which forces to crop cleverly images for the dataset, using multiples of 16 rather than random wild crops).
8- Each regularization image needs its own .txt caption. Once the dataset is done, copy all .txt captions into the reg. images folder and batch remove (BooruDatasetTagManager, Notepad++, etc.) the trigger word from all .txt files.
Same size, same prompt, same name, same dimensions, etc. And THEN regularization images make sense. There's nothing random about them ("I don't know how many I need", "I heard that...", "people disagree on the number", etc.). You make them manually one by one, specifying the prompt and the size each time. It's a precise part of the dataset, as precise as the training images and their own captions.
I've read loads of crap about how to make them. People simply copy-paste rubbish from forum to forum, when they should simply read one source: the original science paper... because those guys actually know what they're talking about. Reg. images give impressive LoRA results, but they have to be made properly, not randomly ;)
Cheers. Great tutorial; thanks for taking the time to video-composite + voice-over all that so nicely.
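Step 8 above can be sketched in Python. This is only an illustrative helper, not from the video; the function names and the comma-separated caption format are assumptions:

```python
from pathlib import Path

def strip_trigger(caption: str, trigger: str) -> str:
    """Remove the trigger word from a comma-separated caption."""
    parts = [p.strip() for p in caption.split(",")]
    return ", ".join(p for p in parts if p and p != trigger)

def write_reg_captions(train_dir: Path, reg_dir: Path, trigger: str) -> None:
    """Copy every training caption into the reg. images folder, minus the trigger word."""
    reg_dir.mkdir(parents=True, exist_ok=True)
    for txt in sorted(train_dir.glob("*.txt")):
        (reg_dir / txt.name).write_text(strip_trigger(txt.read_text(), trigger))
```

For example, strip_trigger("mychar, dark skin, curly hair", "mychar") returns "dark skin, curly hair", and each reg. caption keeps the same file name as its training counterpart.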
That makes way more sense. I've downloaded so many different sets from people, and they are absolutely the worst quality and make terrible LoRAs😂
The character in my LoRA has dark skin and a certain hairstyle, so I didn't put it in the caption as it's present in all the images. But then this means that when removing the keyword for the regularisation images, none of that info is present, so it generates people who look nothing like my character. Is this still correct? In the video he seems to say what matters is making generic versions of your character.
For regularization images, use keywords similar to your character. For example, I used all curly-hair regularization images.
To anyone wondering, there is still an active conversation around whether or not regularization images help. For certain cases they obviously do, per what people say about their results. There are cases where people say they help, cases where they are unhelpful, and cases where people don't notice a difference. At this point the only way to know if it's what you need is to try it with and without. If the base model already has a strong understanding of the class you're training, you may not need them.
For point 6, does this mean regularisation images have to be AI-generated and not just some images found elsewhere?
Very detailed. I wasn't working on it, so I skipped the details, but later on these details will surely come in handy.
Thank you for the guide. I was able to follow along for most of it but I think an update is due. The layout of kohya ss shown in the video and the current layout is different. The differences aren't extreme but pretty significant I reckon.
Hi, yeah sure, in AI an update is always due. However, some key concepts were explained in the video, and those will remain consistent. But about the UI, yeah, I agree. Will put it on the bucket list. Thanks.
@@controlaltai thank you. I hope to see the updated video soon!
Nice work mate!
Hi, can you share how to resume training from the last checkpoint?
Increasing Network Rank (Dimension) will have extreme impacts on lora training times, this should probably be mentioned.
Thanks, message pinned.
what is the ideal number for RTX 3070 card?
Hi, which setting number are you talking about?
@@controlaltai Network Rank
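To put a rough number on the rank discussion above: LoRA learns two low-rank matrices, A (d_in × r) and B (r × d_out), per adapted weight, so the trainable parameter count, and with it step time and file size, scales linearly with rank. A toy calculation (the 768×768 projection shape and the layer count are made-up round numbers for illustration, not SDXL's actual architecture):

```python
def lora_params(d_in: int, d_out: int, rank: int, layers: int) -> int:
    """Trainable parameters for LoRA pairs A (d_in x rank) and B (rank x d_out)."""
    return layers * (d_in * rank + rank * d_out)

small = lora_params(768, 768, rank=32, layers=100)
large = lora_params(768, 768, rank=128, layers=100)
print(large / small)  # 4.0 -> 4x the rank means 4x the trainable weights
```

So going from rank 32 to rank 128 roughly quadruples what the optimizer has to update every step, which is why training time and the saved .safetensors file grow with it.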
Thanks for the great video. So what about the learning rate input for SD 1.5?
Thank you! The learning rate value should not make a difference whether it's SD 1.5, SDXL, or even SD3. The core concept explained is general and does not change with the base model.
Can you also upload a video on how to make a checkpoint?
Hello, excellent tutorial!! I really thank you very much for taking the time. Could you please help me with the ending? After finally training the LoRA: which file should I paste into the C:\Users\\stable-diffusion-webui\models\Lora folder?
I don't have a safetensors file like the other LoRAs; it only generates JSON files.
Hi, the folder you set up before starting the process should contain a .safetensors file. If it does not, something has gone wrong. The LoRA is saved in safetensors format.
Hello! I have the same problem. Did you figure out the solution?
Another great one. I'm becoming a bona fide experienced intermediate level user in great part due to your help. Really the best. BTW, is the voice in the videos an AI or is it a human speaking?
Welcome, and thank you! I am glad you find the video helpful. We are a team of two who work on these videos. About the voice, tbh what does it really matter? Our main objective is to give a professional presentation-style video with clear voice-overs for easy understanding, to help people leverage AI. Thank you again for the comment and support!!
You're welcome! I asked because it's a perfect voice for the tutorials and sounds great every time. It made me wonder if txt to speech had gotten that good, so I was curious. The whole presentation is top notch so great job to both of you. @@controlaltai
@@controlaltai What's wrong with just saying it's AI.
There is nothing wrong with it. The channel name is ControlAltAI. Not related to this comment or yours, but I received, let's just say, weird comments in the early days of the channel, which makes you slightly diplomatic in replying. Nowadays you can clone your own voice; it's still AI.
@20:01 and @21:02, you mentioned the learning rate for text encoder twice. Could you clarify this please?
Okay, I was explaining the learning rate calculation in general, and at 21:02 the rate for SDXL. Go with the values given at 21:02. Non-SDXL models may have different recommended values.
@@controlaltai 4e-7 for learning rate or for text encoder learning rate?
@DivineDragonCrisis Both; only the U-Net rate is different.
Hello friend, is there going to be anything new on LoRA training software for people? I haven't heard anything in a long time; maybe there is some new software.
Hi, the core concept is the same. There is no point in making tutorials with different software. I made sure to explain the basic core concepts in the last video. These hold true. Once you understand what each function does, it can be applied to any user interface.
@@controlaltai Can I ask you a question: what is the maximum number of photos you need to upload for training on a real person, full body? I have 250 photos of my wife; I trained on a 1.5 model and everything is cool. I wrote 150_gen and got top quality. Now I'm busy with SDXL and I'm curious whether this amount will give a style or a character, or whether I need to cut down to 100 photos? Thanks
@K-A_Z_A-K_S_URALA 100-150 is enough. For the face only, about 100; go higher for full body, poses, facial expressions, etc.
Hello, is there any way to continue training after a stop?
To install Kohya I go!!
I hope the tutorial was useful....:)
Have you tried to trigger the LoRA without the trigger word? I noticed there is no difference in A1111 SD 1.5. I can trigger it without using the trigger word, just by adding my LoRA.
I did around 10 to 12 models for the video, but in SDXL only. For me, when I didn't use a trigger it was hit or miss on every generation batch. So I actually don't know how it is for SD 1.5. But I guess it also depends on the trigger term used, as well as whether captioning was done or not. If you have used captions, the model gets trained with certain keywords, say 1girl for example, and whenever you use one it will use the trained images from the defined LoRA.
I have an Nvidia RTX 3060 and installed the CUDA toolkit and cuDNN. The training is still very slow, about 70 s per iteration. It would take about 30 hours to complete. I am training based on an SDXL model and using the same exact parameter settings you have. Is it normal for it to be that slow?
Yeah, SDXL training is slower. You will get faster training times with faster GPUs.
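As a rough sanity check on numbers like the ones above: total time ≈ steps × seconds per iteration, where steps = images × repeats × epochs / batch size. A sketch with made-up example values (not the video's actual settings):

```python
def estimated_hours(images: int, repeats: int, epochs: int,
                    batch_size: int, sec_per_it: float) -> float:
    """Rough wall-clock estimate: total steps times seconds per iteration."""
    steps = images * repeats * epochs // batch_size
    return steps * sec_per_it / 3600

# 1540 steps at ~70 s/it comes out to roughly 30 hours:
print(round(estimated_hours(110, 14, 1, 1, 70), 1))  # 29.9
```

This also shows why cutting repeats or using a faster GPU (lower s/it) shrinks the total time proportionally.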
I got Kohya from Pinokio; would it affect me in any way since I did not install it manually? Also, I don't have an RTX card, so it might be using my CPU. I went this way because I was going through instructions with ChatGPT on setting up Stable Diffusion with ComfyUI through the Anaconda prompt, but in the end I needed an RTX card, so I set it up the Pinokio way.
You need an Nvidia GPU. About Pinokio, I am not aware; I have never used it.
Clicked start training, but then it just said done and doesn't even show an error.
make buckets
number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)
bucket 0: resolution (1024, 1024), count: 440
mean ar error (without repeats): 0.0
Need more info to understand the issue. Paste the whole command prompt error.
[Dataset 0]
loading image sizes.
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:00
Do I have to run accelerate config every time before I start working? I'm getting so many errors and crashes.
The actual configuration is one-time, option 4; other times you just use option 5 to launch. Accelerate config has to complete successfully the first time without errors. Errors during training are different and may be a result of wrong configuration or settings.
Hey man, I installed Kohya and finished training the model.
How can I use it now?
Do I install A1111, and how do I link the model that I just trained?
After installing A1111, put it in the checkpoint or Lora folder, depending on whether you trained a LoRA or a checkpoint.
@@controlaltai Ok, thanks, so I just place the folder named "model" in the Lora model folder, right?
No. The trained model will be a single file; I don't know what you trained. You have to put that file in the checkpoint or Lora folder in A1111.
@@controlaltai I trained a LoRA, so I'm guessing I have to put the JSON source file in the Lora folder?
But its only 5 KB is that right?
Excuse me for asking so many questions, I'm new to this😅
@0AD8 If you trained a LoRA, Kohya_ss saves it as a .safetensors file. You have to put that file in the Lora folder of A1111.
Sorry it's so annoying to hear a robot voice for 30 min. It's annoying because your content seems good
I miss my 30 minutes of Hindi tutorials
Prepare training data does not work; it just gives errors and makes hundreds of folders...
Hi, does it matter whether I use Nvidia or AMD graphics cards?
Yes, please check the GitHub page for AMD requirements; do not follow the video tutorial's setup part. AMD has limitations and requires some different dependencies. The video tutorial is for Nvidia GPUs only.
github.com/bmaltais/kohya_ss?tab=readme-ov-file
while installing kohya ss I'm getting this error
ModuleNotFoundError: No module named 'pkg_resources'
D:\Ai\k training\kohya_ss>
Is there any way that I can fix this?
I'm on Windows 10; my GPU is an RTX 2060.
Check your Python version; most likely you are using the wrong version of Python. (pkg_resources is provided by the setuptools package, so running pip install setuptools inside the venv may also fix it.)
I could never get bitsandbytes to work on Windows; I just got errors and only managed to fix a few.
You'll find the option to install it at the time of setup; you can also run the setup.bat file.
Why am I almost always getting the same face with the same settings as you have?
What do you mean exactly? What have you trained, face, style, person?
Your videos are so great, but I have an error. Error caught was: No module named 'triton'
Try this (note the trailing dot on the last command):
git clone https://github.com/openai/triton.git
cd triton/python
pip install cmake  # build-time dependency
pip install -e .
Wow, thank you. Tonight I will test it 😉
If I try to make a clothing LoRA, should I do the same as in the video? Is there any difference?
The number of images is higher. I have not tried a style LoRA yet. Try with 100-150 images and see how it goes. The rest of the basic principles remain the same.
I absolutely hate that you skipped from the download to magically having a new tab. Where did it come from? Why did you not include it in your video?
Can you tell me what video timestamp you are referring to so I can check and explain?
I didn't understand the threads-per-core setting on the CPU. Doesn't your CPU only have 2 threads per core at 16 cores?
Go with 2 only. Mistake in the video; 2 should be the default, unless there is no hyper-threading, like in some consumer-grade Intel CPUs.
When I click any buttons in Kohya GUI it doesn't work at all. I'm on MacOS. Can anyone help?
I don't have a Mac, but here are the official instructions for MacOS: github.com/bmaltais/kohya_ss?tab=readme-ov-file#linux-and-macos
24:35, how much VRAM is low?
Depends on what model you are training. SDXL needs 10 to 12 GB; 6 GB is cutting it too close.
Really nice! Did you see the application Ideogram?
Thank you! Yes I have for some time now.
I watch all your videos and learn some skills; I use it for fun when I have time. I watched other YouTubers too; because I'm French I don't understand everything, but I understand the essentials.
@@alpha-fraguii6754 Thank you so much for the support. I wanted to make a video on Ideogram. The problem is I cannot do proper testing, they have everything open. I will probably try and make one if you like. Let me know. I can give my own twist to it.
@@controlaltai Yes why not
Hi brother. I have an error during installing kohya. It said:
Error running pip: install --upgrade torch==2.0.1+cu118 torchvision==0.15.2+cu118 --index-url.
Does that matter? How can I fix it?
Try this: open the kohya_ss folder, right-click and open a terminal,
type .\setup.bat and press enter,
then select 2 for install, and tell me what it says exactly.
I tried following your tutorial but I keep getting this error when I launch the training. Any idea what might cause this?
CalledProcessError: Command '['D:\\Stable_Diffusion\\Lora\\kohya_ss\\venv\\Scripts\\python.exe'
You have to install Kohya in a separate folder, outside Stable Diffusion. Kohya is completely separate from SD A1111. Don't put it under the Lora folder; Kohya creates its own venv environment. If it's under the SD folder, there would be conflicts.
You should have two folders: one kohya_ss and one stable-diffusion.
@@controlaltai I will give it another try!
same issue, folders are in separate locations
@@officaltradingden Are you getting the exact same error? Check for dependencies and ensure everything is compatible and up to date.
Number of CPU threads per core is wrong; 24 in total could be fine, but that is not what the field accepts. In most cases it's 2.
That actually depends on your CPU specs, but yeah, the default is 2. For me, I did see higher usage when increasing the number, but it's not optimized and made little difference. However, when I upgraded to Torch 2 and the latest cuDNN, the boost was noticeable.
github.com/bmaltais/kohya_ss/wiki/LoRA-training-parameters#number-of-cpu-threads-per-core
However, your CPU has 16 cores and 32 threads, hence 2 threads per core, and that's probably why the default value is 2: the thread count is often double the core count.
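To make the corrected rule concrete (the counts below are the 7950X3D's published spec, used here only as an example):

```python
# Threads per core = logical processors / physical cores, NOT the total thread count.
physical_cores = 16      # AMD 7950X3D has 16 physical cores...
logical_threads = 32     # ...exposed as 32 logical processors via SMT
threads_per_core = logical_threads // physical_cores
print(threads_per_core)  # 2 -> the value the Kohya field expects, not 32
```

The same arithmetic gives 1 for Intel chips without hyper-threading, which matches the advice elsewhere in this thread.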
Im getting this error when trying to train the lora.
AppData\Local\Microsoft\WindowsApps\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\python.exe: can't open file 'X:\\IA\\kohya\\kohya_ss-24.1.4\\sd-scripts\\sdxl_train_network.py': [Errno 2] No such file or directory
Probably there is a permissions issue. Make sure the Kohya directory has read and write permissions for your system. Also worth checking, since the error points at the WindowsApps Python stub: the Microsoft Store Python alias may be shadowing your real Python install (Settings > Manage app execution aliases).
Hello, thanks for the tutorial; it's the first time I saw a good explanation of the parameters. I'm training now using different GPU servers, but I also want to train on my MacBook Pro M1 Pro 16 GB. I managed to train SD 1.5 at 4.22 it/s, but when I try SDXL I get 100.22 s/it. Is it impossible to train SDXL on an M1? Thanks
Hi, and thank you. SDXL is very, very heavy, so yeah, a discrete GPU with dedicated VRAM is recommended. Basically, the M1 or any processor with an integrated GPU won't cut it. You can still train if the VRAM is there, but it will go slower than SD 1.5. Just remember the M1 Pro does not have any dedicated VRAM. My advice would be to have a system with at least 8 GB of VRAM. I obviously don't have that Mac environment, so I have not tested it, but you can definitely give it a try.
@@controlaltai thanks! I hope they don't launch soon a SDXL 2.0 even more heavy 😱😱😱. I will be waiting for your new tutorials, maybe about the data set preparing for people, body and face.
Type 0 for a laptop? You know some laptops have 30x0/40x0 RTX GPUs... right?
I meant laptops not having 30x0/40x0 cards. People who don't have 30x0 or 40x0 cards can only use fp16. When applying fp16, many laptop users have reported errors during training; they have to select zero in that case.
please help! getting errors like:
RuntimeError: Error(s) in loading state_dict for SdxlUNet2DConditionModel:
Unexpected key(s) in state_dict: "input_blocks.1.1.norm.bias........
and
CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['Y:\\stable-diffusion\\kohya\\kohya_ss\\venv\\Scripts\\python.exe', './sdxl_train_network.py', '--pretrained_model_name_or_path=Y:/COSMOS/models/epicrealism_naturalSinRC1VAE.safetensors', '--train_data_dir=Y:\\COSMOS\\models\\LoRA\\img', '--reg_data_dir=Y:\\COSMOS\\models\\LoRA\\reg', '--resolution=512,512', '--output_dir=Y:\\COSMOS ... ] '--noise_offset=0.0']' returned non-zero exit status 1.
Why does the resolution say 512 for an SDXL model? Have you selected the correct parameters? Double-check those.
There is only a limited amount of real images, so it's a good idea to utilize Stable Diffusion to create images for training.
Mine goves me errors ubfortunately
What errors are you getting?
@MisterWealth: "Mine goves me errors ubfortunately"
... If yaurs gove you errars it's prabobly becouze you con't wraite anythong praporly.
Why is this in an AI Voice? Makes it really hard to sit and watch hearing that monotone computer voice drone on.
a billion more girly-pictures for the internet, 💤
Slowly it gets really boring, not to say annoying.
What's boring? Being a pervert with only one idea in mind? Maybe try to think about something other than boobs, and you'll realize the problem is not the tool but yourself.
PS D:\SDXL\Lora Training\kohya_ss> gui.bat
gui.bat : The term 'gui.bat' is not recognized as the name of a cmdlet, function, script file, or operable program.
Check the spelling of the name, or if a path was included, verify that the path is correct and try again.
At line:1 char:1
+ gui.bat
+ ~~~~~~~
+ CategoryInfo : ObjectNotFound: (gui.bat:String) [], CommandNotFoundException
+ FullyQualifiedErrorId : CommandNotFoundException
Suggestion [3,General]: The command gui.bat was not found, but does exist in the current location. Windows PowerShell does not load commands from the current location by default. If you trust this command, instead type: ".\gui.bat". See "get-help about_Command_Precedence" for more details.
Don't type gui.bat on its own in PowerShell. Just run the file from the folder (double-click it), or type ".\gui.bat" as the error message itself suggests. Alternatively, at the location you posted, type ".\setup.bat", press enter, then select option 5 to start the Kohya_ss GUI in the browser.
I’d like to see the gpu usage🥹🥹🥹
It starts at 100%, then stays around 90% when choosing AdamW. However, with AdamW8bit it's lower. During training, the RAM usage also goes high.