SDXL 1.0 vs SD 1.5 Character Training Using LoRA (Kohya_ss) for Stable Diffusion: Comparison and Guide

  • Published Jan 30, 2025

COMMENTS • 137

  • @Jisamaniac
    @Jisamaniac 1 year ago +16

    This man watched Vikings for years, waited for next-gen AI, got himself a YT channel, and said: my time has come.

  • @Irokz
    @Irokz 1 year ago

    I have an RTX 4090 and damn, SDXL is fast:
    5-second generation for 1 image, instead of like 3 minutes with my old card, haha.
    Speed was my main problem with SDXL, and that's why I never used it before.
    But now that speed isn't the problem anymore, there is another issue that keeps me on SD 1.5.
    It's true that realism is better in SDXL, even though, with a complex prompt and a good checkpoint, you can do super-realistic things with SD 1.5 too.
    But the lack of checkpoints and the lack of LoRAs is the worst part.
    Also, it seems like SDXL LoRAs are harder to combine and give more artifacts than SD 1.5 ones; I don't know if that's only me.
    Anyway, for characters I use FaceSwapLab combined with a LoRA; it works incredibly well, because using a LoRA alone doesn't give me the realism I want.
    But using both of them is incredible: the body fits, so the face fits too.

    • @AI-HowTo
      @AI-HowTo 1 year ago

      SDXL training usually gives slightly more realistic results than SD 1.5, but SD 1.5 results are smoother and contain less grain. As you mentioned, the biggest issue is speed, and in SD 1.5 one must combine different techniques to achieve good results. There is no single solution for anything thus far, so having a workflow is a must to achieve quality results.

  • @Beauty.and.FashionPhotographer
    @Beauty.and.FashionPhotographer 4 months ago

    Mac settings are different; very, very few people have even been able to launch Kohya so far. Would you by any chance know what parameters would work for Kohya running on a Mac?

    • @AI-HowTo
      @AI-HowTo 4 months ago

      Unfortunately no, but the principles in this video remain true regardless of changes in the environment or some parameters (how captioning works, how SD trains, the approach to image selection, and so on).

  • @___x__x_r___xa__x_____f______
    @___x__x_r___xa__x_____f______ 8 months ago

    Hi, the settings seem to have changed for AdamW8bit at 0.0001; the model seems to overfit. Have you noticed a change?

    • @AI-HowTo
      @AI-HowTo 8 months ago

      I have not done any recent training, and usually the algorithm is fixed, so the learning rate is unlikely to be applied differently; not sure. Anyway, if you see things overfit quickly, then using a smaller number of steps could be better. Besides, a learning rate smaller than 0.0001 doesn't make much sense, I think, so we usually consider increasing it, not decreasing it, to learn faster for instance... not sure if any recent changes in Kohya have made things different.

  • @dr1nkm1lk
    @dr1nkm1lk 1 year ago +1

    How can I contact you? I want to train a LoRA with images of myself/a person to create images with the epicrealism model, basically profile pictures. I've been running tests on a RunPod installation but am not getting good results :(

    • @dr1nkm1lk
      @dr1nkm1lk 1 year ago

      Of course I would pay

    • @AI-HowTo
      @AI-HowTo 1 year ago

      Sorry, I'm not considering any work for the time being, thank you.

    • @dr1nkm1lk
      @dr1nkm1lk 1 year ago +1

      @@AI-HowTo damn! Ok :)

  • @joelapablaza7722
    @joelapablaza7722 1 year ago

    Awesome content! It would also be cool if you left the specs of your PC in the description, so we know what you're working with and can roughly guess what to expect with our own PCs.

    • @AI-HowTo
      @AI-HowTo 1 year ago +1

      Thanks, will do so; I'll paste it here and in the description too:
      Adapter Type: NVIDIA GeForce RTX 3070 Laptop GPU, 8 GB VRAM
      Physical Memory (RAM): 16.0 GB
      Processor: AMD Ryzen 7 5800H, 3201 MHz, 8 cores, 16 logical processors
      SDXL training was performed online on RunPod (24 GB RTX 3090) because it didn't work on my laptop.

  • @foreropa
    @foreropa 2 months ago

    Great video, thanks a lot!! How long did the training take with SD 1.5? I don't recall hearing you say it. And the SDXL model on RunPod, how much time? And was it easy to configure? I haven't used it.

    • @AI-HowTo
      @AI-HowTo 2 months ago +1

      You are welcome.
      On RunPod it took around 50 minutes to achieve stable results, using an RTX 3090. RunPod usually comes with Stable Diffusion preinstalled; one must choose a suitable pod for that.
      Last year I made this video ua-cam.com/video/arx6xZLGNCA/v-deo.html for RunPod; it is possible that things have changed since then too, gotten easier that is, with models preinstalled.
      On my laptop's RTX 3070 with similar image sizes, I think it took more than 3-4 hours, if I am not mistaken, for 1024-size images.
      This depends largely on how many images are used and on the image sizes.

    • @foreropa
      @foreropa 2 months ago

      @@AI-HowTo Thank you SO much for your answer, I will see that other video. Take care!

  • @sgbishan3717
    @sgbishan3717 1 year ago

    I don't really understand the "undesired tags". If my face has a mole and I always want the mole to be there, do I put the word 'mole' there? What about a scar on the face?

    • @AI-HowTo
      @AI-HowTo 1 year ago

      Definitely don't include mole in the captions if that is a trait of the face... any trait of the character that you want absorbed by the LoRA should not be included in the captions... if you include them, you will also have to include them in the prompting, and that makes prompting more complex and less effective... remove 1girl, mole, scar, lips, nose, eye color, hair color, realistic, and everything that relates to your character and repeats across all the images, and just caption everything else.
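
      A minimal sketch of that caption cleanup in Python (the dataset path and tag list are hypothetical; fill TRAIT_TAGS with your own character's fixed traits):

          import pathlib

          # Strip tags that describe fixed character traits (they should be
          # absorbed by the LoRA, not prompted) from every caption .txt file
          # sitting next to the training images.
          TRAIT_TAGS = {"1girl", "mole", "scar", "lips", "nose",
                        "blue eyes", "blonde hair", "realistic"}

          for txt in pathlib.Path("dataset/img/20_xyzkw woman").glob("*.txt"):
              tags = [t.strip() for t in txt.read_text(encoding="utf-8").split(",")]
              kept = [t for t in tags if t.lower() not in TRAIT_TAGS]
              txt.write_text(", ".join(kept), encoding="utf-8")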

    • @erans
      @erans 1 year ago

      @@AI-HowTo what about tags like "standing" "looking away" "looking at viewer" and different items of clothing?

    • @AI-HowTo
      @AI-HowTo 1 year ago

      If your images are always looking at the viewer, then don't include that in the caption... if your character is always wearing the same dress and you want that dress to be part of the LoRA, then don't describe the dress either... anything that repeats in all the images and that you want to be part of your LoRA, don't caption it.

    • @erans
      @erans 1 year ago

      @@AI-HowTo Thanks for the reply. The problem is that in most of the photos the character does look at the viewer, and only looks away in 3-4 photos.
      Should I include "looking away" in those 3-4 photos and nothing on the subject (looking at viewer) in the rest?

    • @AI-HowTo
      @AI-HowTo 1 year ago

      Yes, that's possible: don't include looking at viewer, since it is the default, and only in those few write "looking away". It is not a 100% exact science, but I think this is the most logical approach.

  • @rendomone
    @rendomone 1 year ago

    Are regularisation images required? Some say they are not. Is there any difference if we use them or not?

    • @AI-HowTo
      @AI-HowTo 1 year ago

      Nope, reg images are not required, because SD already knows the concept of a person... I prefer to train both with and without them and see which model is better, because it is a purely experimental process... my best results in private LoRAs were produced with regularized LoRAs; they became richer, adapted to prompts more, and produced better results.

    • @rendomone
      @rendomone 1 year ago +1

      @@AI-HowTo In my limited practice it is the opposite. It would be interesting to see a comparison.

  • @jackzhang891
    @jackzhang891 1 year ago

    Great tutorial, thank you. For generating photos of a particular person (face + body), do you think Dreambooth is outdated now that we can use LoRAs?

    • @AI-HowTo
      @AI-HowTo 1 year ago

      I don't think so; it is just that LoRAs are more practical, achieve a good level of accuracy, are faster to train and simpler, and we can generate as many as we want and reuse them efficiently for objects/people.

  • @juridjatsenko2013
    @juridjatsenko2013 1 year ago

    CalledProcessError: Command '['C:\\Users\\grome\\stable-diffusion-webui\\cd kohya_ss\\venv\\Scripts\\python.exe',
    The script activates fine from the folder. I can't locate the command line within Kohya to change the command path. What am I doing wrong?

    • @AI-HowTo
      @AI-HowTo 1 year ago +1

      Not sure... but if you are trying to run the script directly from the command line with something like (accelerate ...params...), then first you must activate the venv from inside the Scripts folder... then go up two levels using cd.., cd.. and run the script from there, so the script runs from the Kohya folder, not from the Scripts folder.

    • @juridjatsenko2013
      @juridjatsenko2013 1 year ago

      It happens when I run Kohya trying to make a LoRA. I get this after it has classified all the images, together with a low-memory error, even though I have a 3070.
      Would running the script before starting to make the LoRA fix it? Or is there anywhere within Kohya to point toward the exact script location to avoid the error?

    • @AI-HowTo
      @AI-HowTo 1 year ago

      Not sure; try setting precision to fp16 and save precision to fp16, use xformers and gradient checkpointing if necessary, and try again (a 3070 should work with bf16), but not sure; possibly something is missing in your installation, for instance.
      Usually we only run a script that was made by someone else, or to avoid using the GUI, but I have always used the GUI. I also have a 3070; it works without memory problems or errors for SD 1.5. Try setting the parameters yourself in the GUI and avoid using scripts when possible... also check your images in case some of them have large resolutions, which may cause errors too... and try with 1 image until you figure out the source of the error...

  • @kritikusi-666
    @kritikusi-666 1 year ago

    How do you go about training a model that works with your own pictures? Do I just start taking selfies daily for a couple of days and then adjust the width/height to 1024? I keep getting bad details around the eye area. Any way to improve this?

    • @AI-HowTo
      @AI-HowTo 1 year ago

      Same as here; it's better if someone else takes the photos, since selfies are not so good (the angle of view is off)... just have someone take the pictures from different angles, in good natural room lighting, with some full-body shots, and train normally.
      I also use Photoshop to smooth the images slightly; for example, if there are wrinkles in the face or some features that I don't want to appear in my LoRA, Photoshop can hide them. I use Photoshop's neural filters for this purpose, which are part of Photoshop 2021 and above; this helps improve the quality of the LoRA greatly.
      Use a good regularization set on a good checkpoint, such as Photon or MajicMix, if you want pretty images, and follow the same steps as here.

  • @LTyphoon
    @LTyphoon 1 year ago

    Errors keep appearing when using the SDXL1.5 model. Is there any solution?

    • @AI-HowTo
      @AI-HowTo 1 year ago +1

      Errors usually happen when your GPU doesn't have enough VRAM, as mentioned in the video. The best we can do is turn on gradient checkpointing and xformers, and if the VRAM is only 12GB, use the Adafactor optimizer... I used RunPod for training this SDXL LoRA because my VRAM was only 8GB, which is not sufficient for Kohya SDXL training.
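
      For reference, a hedged sketch of what such a low-VRAM SDXL launch can look like with the kohya-ss sd-scripts command line, wrapped in Python (model path and dataset folders are placeholders; the flags are existing sd-scripts options, but good values depend on your setup):

          import subprocess

          cmd = [
              "accelerate", "launch", "sdxl_train_network.py",
              "--pretrained_model_name_or_path", "sd_xl_base_1.0.safetensors",
              "--train_data_dir", "dataset/img",    # e.g. contains 20_xyzkw woman/
              "--reg_data_dir", "dataset/reg",      # optional regularization set
              "--output_dir", "output",
              "--network_module", "networks.lora",
              "--optimizer_type", "Adafactor",      # lighter on VRAM than AdamW8bit
              "--mixed_precision", "fp16",
              "--save_precision", "fp16",
              "--gradient_checkpointing",           # trades speed for VRAM
              "--xformers",                         # memory-efficient attention
              "--network_train_unet_only",          # skip the text encoder to save VRAM
          ]
          subprocess.run(cmd, check=True)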

  • @ysy69
    @ysy69 1 year ago +1

    Thank you for this useful comparison and tutorial. From your experiments, which LoRA do you prefer (1.5 or XL), and in your view, what are the key advantages of your preference?

    • @AI-HowTo
      @AI-HowTo 1 year ago +10

      Thus far SD 1.5 looks smoother, is faster, and is good enough for characters, producing wonderful results at 1024x1024 with checkpoints such as Photon/MajicMix, while SDXL is more realistic in things like skin details/moles/wrinkles, but a lot slower unfortunately. For now, I continue using SD 1.5 because of my GPU; in the future, once I get a better GPU and better community checkpoints exist, I will use SDXL... SDXL, on the other hand, is better with anatomy, that is: body, hands, fitting a character on a bike/horse, and it understands objects better, so for complex scene prompting, SDXL is superior.

    • @ysy69
      @ysy69 1 year ago +1

      @@AI-HowTo thank you!

    • @AI-HowTo
      @AI-HowTo 1 year ago

      You are most welcome.

    • @ysy69
      @ysy69 1 year ago

      @@AI-HowTo Hello, have you made more discoveries since a month ago in the art of fine-tuning SDXL models, be it LoRAs or Dreambooth?

    • @AI-HowTo
      @AI-HowTo 1 year ago +1

      Nope, I'm currently working in another field and won't finish before June 2024, so I doubt I will be posting videos about training, and possibly about this subject in general; not sure yet... The training process is a procedure, so following the principles mentioned in this video is all that a person really needs; the rest is largely experimental, and the same parameters that work for one dataset may not work 100% optimally for another, but the principle remains the same... I expect there will soon be better, faster, more accurate training methods, even better than Dreambooth, that will give people better results.

  • @PiotrGarryWysocki
    @PiotrGarryWysocki 1 year ago

    Can I allow some sunglasses in the dataset?

    • @AI-HowTo
      @AI-HowTo 1 year ago +1

      Sure :) I made this model for educational purposes only and have no use for it, but if I develop it further later, I will add glasses and improve the full-body images with higher-resolution ones.

  • @kvartz
    @kvartz 1 year ago

    Is it important to remove incorrect-anatomy images from the regularisation images? In other words, how does one prevent disfigured hands and fingers, eyes not aligning, etc.?

    • @AI-HowTo
      @AI-HowTo 1 year ago +2

      Yes, it is better to have good class images; it will improve the quality of the images, based on what I have tested recently, and it can also improve the color of your character... for instance, if your class images have doll-like skin/faces, it makes your character smoother and better looking... avoid having any kind of deformation in the class set; while its effect will be limited, even if it has a 1% effect you should remove it... Kohya ss is definitely learning from the class set too, based on what I have seen in my recent tests.

    • @kvartz
      @kvartz 1 year ago

      @@AI-HowTo thanks a lot for the answer!

    • @AI-HowTo
      @AI-HowTo 1 year ago

      you are welcome

  • @stefanvozd
    @stefanvozd 1 year ago

    Can you use img2img with an image of clothing to make your LoRA person wear it, or would you need to make a LoRA of that piece of clothing too?
    Thanks for the video, great stuff.

    • @AI-HowTo
      @AI-HowTo 1 year ago

      This is very difficult to do directly, even with ControlNet (reference+openpose or other ControlNet combinations and masking), because Stable Diffusion doesn't understand geometry. Accurate dressing is done using Blender 3D software; in Stable Diffusion you get something approximate, unless you trained your clothes into a LoRA, for instance as in this video ua-cam.com/video/wJX4bBtDr9Y/v-deo.html, and even then, SD might miss some fine details... img2img in general is used to change image style and do inpainting, that is: changing part of the image.

  • @RyokoChanGamer
    @RyokoChanGamer 1 year ago

    I read your answer under the previous video and I'm doing it that way; I always put the trigger word in the prompt... but I'm about to give up on these regularization images. I'm training a character right now; it's already on the 4th epoch and the samples show an image that has 0% of what I'm training... I have 20 images being processed, of different sizes but good quality, at 100 repeats, and I'm using 900 random 512x512 images of girls (even my character) generated on the same checkpoint... What's wrong? I've seen videos saying you need 100 regularization images per training image, and I've seen videos saying 10... I'm confused.

    • @fi5h81
      @fi5h81 1 year ago +1

      I got the same: if I use regularization images, the likeness of the subject is very low.

    • @RyokoChanGamer
      @RyokoChanGamer 1 year ago

      @@fi5h81 I'm totally lost; I think the solution will be to give up on these images and focus on training without them... the training I mentioned is still running; it was estimated at 5 hours... it's been 5 epochs and not even 0.1% of the character has appeared yet... when I train without these regularization images, it takes less than half the time and I can see results.

    • @AI-HowTo
      @AI-HowTo 1 year ago +2

      You should do what works for you... training can work really well, and faster, even without regularization... there is no requirement of 100 reg images per 1 training image... all of these are made up by the trainers... there are no rules like 100 reg images / 3000 steps... all are made up... training is entirely experimental... I suggest you reduce the 100 repeats down to 20 and increase the number of epochs... I usually never use more than 20 repeats per image, and often 10 when I have more images, and it works really well both with reg and without... as mentioned in the video, Kohya will use (repeats x image count) reg images only, so if you use for instance 100 repeats and have 20 images, then you need 2000 reg images; otherwise your class images will be repeated, since Kohya uses 2-factor regularization, so if it doesn't find the extra reg images it will repeat them, which increases the effect of the reg set over your character and takes longer to train... I also strongly recommend using real-world regularization images... but once again, regularization is not necessary for a character LoRA; it only has a positive effect on flexibility and can make the LoRA richer for inpainting.
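
      To make that arithmetic concrete, a small sketch (batch size 1 assumed; the doubling with a reg set follows the two-factor behaviour described above):

          def training_plan(num_images: int, repeats: int, epochs: int, use_reg: bool):
              """Rough kohya-ss step math at batch size 1: each training-image
              step is paired with a regularization-image step when a reg set
              is used, roughly doubling the step count."""
              base = num_images * repeats * epochs
              steps = base * 2 if use_reg else base
              reg_needed = num_images * repeats if use_reg else 0
              return steps, reg_needed

          # 20 images at 100 repeats needs 2000 reg images, as noted above:
          print(training_plan(20, 100, 1, True))   # -> (4000, 2000)
          # the suggested alternative, 20 repeats with more epochs:
          print(training_plan(20, 20, 5, True))    # -> (4000, 400)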

    • @RyokoChanGamer
      @RyokoChanGamer 1 year ago

      @@AI-HowTo Thank you very much, you are saving me, and thank you for your patience with a beginner... I'm not comfortable just asking you questions; I'm also reading articles on the internet about it... I will do the tests you recommended tomorrow... at the moment I'm testing 2 epochs of a LoRA I just created... I'd just like one suggestion this time: what is the best way to test a LoRA? The one I'm testing is a complete mess at certain checkpoints, works fine at others, and at some I need to change a lot so it doesn't get blurred... usually at weight 1 it never works, and below 0.7 the character already loses its look, so I'm testing just between 0.7, 0.8 and 0.9...

    • @AI-HowTo
      @AI-HowTo 1 year ago +1

      Totally fine, we are all beginners in this area and learning; you should ask when you have a question, even if no one replies you lose nothing, that is how we all learn... Character LoRAs are often delicate and don't work properly on other checkpoints; they work best on the same checkpoint they were trained on... A character LoRA that is not overfitted should work perfectly at weight 1... anything less and the character will gradually lose its features... but when you want to create full-body shots, we may use 0.8 or lower and then automatically repaint the face with the After Detailer extension... if you are not getting good results at weight 1, then most likely you are training with a large number of repeats, such as 50+, which is why using a smaller number of repeats, such as 20 or less, is necessary... use smaller repeats and you can increase the number of epochs as much as you like.

  • @divye.ruhela
    @divye.ruhela 8 months ago

    Subbed! Very good tutorial! I know this is an old video, but I have a few queries.
    Is it harder to create/train a 'realistic character LoRA' if the original dataset contains AI-generated images created on realistic checkpoints, instead of a real person's photos? I guess what I mean to ask is: can a LoRA created using an AI-generated dataset achieve such realism?
    PS. Also, what would be the best checkpoint to create such an AI-generated dataset? TIA!

    • @AI-HowTo
      @AI-HowTo 8 months ago

      Yes, as long as the training dataset is of good quality and does not contain deformations in the eyes or fingers; even the slightest deformation can be amplified after training if it repeats. As for the best checkpoint, not sure at the time being, unfortunately; previously, for 1.5 I got the best results with MajicMix v4, even for Western characters despite the checkpoint being Asian, and for SDXL, Juggernaut XL; not sure now... I think in general the principles of training do not change over time, so the video is still good to rely on for training.

  • @huevonesunltd
    @huevonesunltd 1 year ago

    I normally use "easy lora training scripts" for LoRA training on 1.5, mainly because I have already tweaked my settings for the best results.
    Can I still use that to train SDXL?

    • @AI-HowTo
      @AI-HowTo 1 year ago

      I think you can do that partially, but it's not practical: load the script, then go back, check SDXL, and fill in the SDXL-related parameters, image sizes, noise offset, etc... there are some SDXL presets already in newer versions of Kohya which populate lots of settings for you, such as the noise offset 0.0357, among other optimization parameters such as the Adafactor optimizer... scripts are just an alternate way of storing the JSON file, included as part of Kohya, so you can export it as JSON too, or save a version that fits your data for future use.

  • @DanielSchweinert
    @DanielSchweinert 1 year ago +1

    Thank you! Perfectly explained! Btw. do you have captions for your regularization images?

    • @AI-HowTo
      @AI-HowTo 1 year ago +2

      You are welcome; nope, just the class name, woman.

  • @lilillllii246
    @lilillllii246 1 year ago

    Did you actually apply "1girl, blonde hair, blue eyes, solo, realistic, looking at viewer" like in your video? Also, in order to make the regularization images as similar as possible to the photos being trained on, shouldn't we prepare something similar to the training images? I'm asking because I see what look like AI images in your video. Please understand, my English is bad.

    • @AI-HowTo
      @AI-HowTo 1 year ago +1

      Yes: 1girl, blonde hair, blue eyes, solo, realistic, looking at viewer, and any feature that is specific to our character; keeping these out of the captions helps absorb those features into the LoRA... so in the tags we describe everything else, the things we want to be able to change.
      The regularization set must not be similar to our subject; it must be different, but from the same class, so that we pick up small features from the reg set and make our LoRA richer... a reg set is also not a must, but I found it gives more flexible results.

    • @lilillllii246
      @lilillllii246 1 year ago

      @@AI-HowTo drive.google.com/drive/folders/1N139LyAgXfRN1hDrCjfkQixtroP5RQzR?usp=sharing Thank you so much for your reply. I followed everything exactly as you did, starting with the female model, but the results are so different.

    • @lilillllii246
      @lilillllii246 1 year ago

      @@AI-HowTo drive.google.com/drive/folders/1Rd2scCO-tBk_0_XZxGwZubOH09LIQscE?usp=sharing Take a look when you get a chance. I'm pretty sure I did the exact same thing as you, but it's so different.

    • @AI-HowTo
      @AI-HowTo 1 year ago

      I just checked them. Your only problem is the resizing of the images: you are resizing them the wrong way, which stretches the images and causes distortion in the training process. The training seems to progress correctly given how stretched these images are... you need to maintain the aspect ratio when resizing, or use better resizing/cropping software, that is all... I also suggest removing "long hair" and "smile" from the captions, but that is another matter.

    • @AI-HowTo
      @AI-HowTo 1 year ago +1

      So given your incorrect image resizing, the output is logical... also make sure all your images are of good quality; some of them are really not... you also need more reg images, e.g. 20 (images) x 20 (repeats) = 400 images; that would be better.
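
      A minimal Pillow sketch of an aspect-preserving resize plus center crop (hypothetical paths; assumes a square 1024 target):

          from PIL import Image

          def center_crop_resize(src: str, dst: str, size: int = 1024) -> None:
              """Resize to size x size without stretching: scale the short
              side to `size`, then center-crop the long side."""
              img = Image.open(src).convert("RGB")
              w, h = img.size
              scale = size / min(w, h)
              img = img.resize((round(w * scale), round(h * scale)), Image.LANCZOS)
              w, h = img.size
              left, top = (w - size) // 2, (h - size) // 2
              img.crop((left, top, left + size, top + size)).save(dst)

          center_crop_resize("raw/photo01.jpg", "dataset/img/20_xyzkw woman/photo01.jpg")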

  • @ЕвгенийВладимиров-л3к

    Have you tried using nude photos as regularization images? Is it worth doing?

    • @ЕвгенийВладимиров-л3к
      @ЕвгенийВладимиров-л3к 1 year ago

      It's just that an archive of very high-quality nude photos is easier to find on a torrent than clothed ones.

    • @AI-HowTo
      @AI-HowTo 1 year ago

      Nope, didn't try. I think having such pictures will increase the likelihood of producing similar pictures in generations; they will work too, of course, since they are of the person or woman class; however, I don't think the model will be richer or more flexible when it comes to fashion or clothing styles.

    • @ЕвгенийВладимиров-л3к
      @ЕвгенийВладимиров-л3к 1 year ago

      @@AI-HowTo How many photos do you need, in different poses and with different facial expressions, so that the model is as flexible as possible and does not resist the prompt? And the same for regularization images: if there are 10,000 maximally diverse ones, will it be better? (I'm training XL)

    • @AI-HowTo
      @AI-HowTo 1 year ago

      For regularization images: we only need (number of repeats x image count) regularization images, so if we have 30 images and we are doing 20 repeats, Kohya will only use 600 reg images and ignore the rest...
      As for how many images we need to train: what matters most is having high-quality images. One image per pose/expression/angle of the face should be enough, but I have seen that having more images generally produces better results... so around 20 total images is suitable, 40 images may be ideal to cover more poses/expressions, and up to 100 for a character or object is good if it helps capture the face/body from all angles with different clothing and poses... for a style: 100-400 images is good... so there is no set rule for this.

  • @TentationAI
    @TentationAI 1 year ago +1

    Hope a good realistic checkpoint will be released for SDXL :)

    • @AI-HowTo
      @AI-HowTo 1 year ago

      I hope so too! Without community checkpoints or LoRAs, realistic results in SDXL are limited.

    • @AlexanderGarzon
      @AlexanderGarzon 1 year ago +1

      @@AI-HowTo I think there are some already released on civitai

    • @AI-HowTo
      @AI-HowTo 1 year ago

      Yes; still, it will take a while for them to be tested and filtered by the community so we know which is better than which... the biggest problem for wide SDXL adoption thus far seems to be speed, not quality; it is slower than SD 1.5 and requires more GPU to train in comparison.

  • @ApexArtistX
    @ApexArtistX 1 year ago

    Why is captioning not working?

    • @AI-HowTo
      @AI-HowTo 1 year ago

      If it is not working for you, then possibly the Kohya version you are using needs you to explicitly set the caption extension to .txt in the parameters section, under the Caption Extension field... it used to take .txt by default if I am not mistaken; now the default seems to be .caption, so if your captions are text files you should type .txt... that might be the case.
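
      A quick sanity check along those lines (hypothetical dataset path; assumes the Caption Extension field, or the equivalent --caption_extension option, is set to .txt):

          import pathlib

          # Flag any training image that lacks a matching .txt caption file.
          img_dir = pathlib.Path("dataset/img/20_xyzkw woman")
          for img in img_dir.iterdir():
              if img.suffix.lower() in {".jpg", ".jpeg", ".png", ".webp"}:
                  if not img.with_suffix(".txt").exists():
                      print("missing caption:", img.name)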

  • @MrXRes
    @MrXRes 1 year ago

    I have a 3080 Ti and my training process is so much slower! It's 120.77 s for each step and I can't solve the problem :(

    • @AI-HowTo
      @AI-HowTo 1 year ago

      Sorry, I've never experienced this issue, and it should not happen; a 3080 Ti is powerful, almost equal to a 3090 even if it has less VRAM, and SDXL should run even with 12GB, so your training speed should match this video or be a little slower... you can try a different optimizer such as Adafactor, which is better for 12GB, instead of AdamW8bit; see this video ua-cam.com/video/RT2jj-5t8x8/v-deo.html ... make sure the Gradient Checkpointing option is on, since your VRAM is less than 24GB; also, if it still doesn't work, use the --network_train_unet_only option shown in ua-cam.com/video/RT2jj-5t8x8/v-deo.html, and hopefully things work better; otherwise, there must be a driver issue in your system.

  • @김태우-o5h3f
    @김태우-o5h3f 1 year ago

    What's your graphics card for training XL?

    • @AI-HowTo
      @AI-HowTo 1 year ago +1

      Mine is an RTX 3070, OK for training 1024x1024 SD 1.5 but not OK for SDXL, unfortunately; we need at least 12GB for SDXL training... that's why I used RunPod for making this video only... then generated the images locally on my 8GB graphics card for both SDXL and SD 1.5.

    • @AI-HowTo
      @AI-HowTo 1 year ago

      You are welcome; turning gradient checkpointing on will allow you to run SDXL training on 12GB, together with the Adafactor optimizer, because it requires less GPU memory than AdamW8bit.

    • @김태우-o5h3f
      @김태우-o5h3f 1 year ago

      @@AI-HowTo Ah, thanks for the reply! I also own a 3070 Ti... no luck with XL, I guess :(

  • @lilillllii246
    @lilillllii246 1 year ago

    I trained it with SD 1.5, doing everything the same as you, and it came out like a cartoon. Why is that?

    • @AI-HowTo
      @AI-HowTo 1 year ago +1

      If you are training and prompting on a realistic checkpoint, you should not get a cartoonish person; I have never gotten that kind of result with any LoRA before... double-check which checkpoint you are training on, and which one you are generating the results with.

  • @Philinnor
    @Philinnor 1 year ago

    The overlays at the end are hiding the conclusion image...

    • @AI-HowTo
      @AI-HowTo 1 year ago

      Thank you for mentioning it; updated, I removed them from the conclusion section.

  • @gokalpkocakk
    @gokalpkocakk 1 year ago

    Thanks for the video! Why does it take so much more time with XL? I hope it'll be as fast as 1.5 :/

    • @AI-HowTo
      @AI-HowTo 1 year ago +1

      SDXL uses a different UNet structure, which requires more time to generate images and to train... also, even in SD 1.5, if I train on 1024 images it is about 4 times slower than on 512 images (a 1024x1024 image has four times the pixels of a 512x512 one); however, on SD 1.5, training on 1024-resolution images usually produces studio-quality results, so I don't have to do many repeats... for this model, results were great on the first attempt because the data was 1024x1024.

  • @m.a6416
    @m.a6416 1 year ago

    How does training work if, say, the male character you are using is always wearing some cultural headgear (Native American chieftain, Mongolian skullcap, Viking helmet, Thai crown dress, Arab male shawl, etc.)? Like, say I've got 50 images of an individual in which he's wearing the same thing on his head in all 50. What kind of class images should I use: regular photos of men, or do I curate a new set of class photos of men all wearing the head garb? Some head accessories are easier for the AI models to understand without specific training (Viking helmet), given how commonly they are used, but others not so much. I had many issues with various ethnic headgear, most notably that of Vietnam, Thailand, Mongolia, and the Arab world.

    • @AI-HowTo
      @AI-HowTo 1 year ago +1

      We use normal regularization images of men; we can also use men with various types of helmets, different from yours, in the class images...
      True, it is easier for SD to improve on something it has seen before than on something new... most importantly, don't mention anything related to the headgear in the captions, or related to the dress if the dress repeats and you want it to be part of the final results. You don't even need to use man in the captions; just use a trigger word absorbing all your character's features... the class (in the folder name), on the other hand, is man; see the folder sketch below.
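
      A sketch of that folder convention (names are hypothetical; the leading number is the repeat count, and the trigger word precedes the class):

          import pathlib

          # Kohya-style dataset layout: <repeats>_<trigger> <class>
          #   dataset/img/20_xyzkw man  -> the 50 headgear photos + captions
          #   dataset/reg/1_man         -> ordinary photos of men (class set)
          for d in ("dataset/img/20_xyzkw man", "dataset/reg/1_man"):
              pathlib.Path(d).mkdir(parents=True, exist_ok=True)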

    • @m.a6416
      @m.a6416 1 year ago

      @@AI-HowTo Thank you

  • @K-A_Z_A-K_S_URALA
    @K-A_Z_A-K_S_URALA 1 year ago +1

    Watched it, thanks for your work!

    • @AI-HowTo
      @AI-HowTo 1 year ago

      You are most welcome

    • @K-A_Z_A-K_S_URALA
      @K-A_Z_A-K_S_URALA 2 months ago +1

      Thanks, I watched it a year ago) Your answer to my request was: "usually when we use a large number of pictures, such as hundreds, it becomes more of a style training... there is however no max number; some say around 1000 is the max, but one can also train with around 10 pictures and get good results using a mix of techniques and lots of trial and error... this video ua-cam.com/video/vA2v2IugK6w/v-deo.html is a better example of character training with lots of useful notes... for me, I got better results with 30+ pictures, up to around 80 pictures, for people; it captures more changes and details, but it can be done even with around 10... and new SD versions can train on smaller numbers." I remember when I was training a year ago on the 1.5 model, I did it by throwing in folders of photos, with a lot in each folder, named like 100_body... 80_face... 80_fullbody, set Network Rank 128 and Network Alpha 1, and lo and behold, the model was super obedient, flexible, with super response and quality... it's been exactly a year; I've been waiting for people to start training SDXL, but there's still not much information, so I want to try training now, and I don't know where to start...))) thank you for the answer!

  • @ashish-lk9lx
    @ashish-lk9lx 1 year ago

    Hey, can you provide regularization images for men? Please.

    • @AI-HowTo
      @AI-HowTo 1 year ago +1

      Sorry, I have none that I can share at the time being, because images must be checked for copyright issues before being shared publicly... you can just generate some using SD, or collect some from Freepik, or real pictures from Google, which is the best approach, then crop them manually... when used on your own computer, it doesn't matter whether they are copyrighted or not, since the pictures are not used directly nor memorized, so the generated models will have no copyright issues.

  • @AlexMay-r1y
    @AlexMay-r1y 1 year ago

    Do you have a Discord or a community one can join?

    • @AI-HowTo
      @AI-HowTo 1 year ago

      Sorry, no, not at the moment.

  • @guangyuniu785
    @guangyuniu785 1 year ago

    Great tutorial; would you mind sharing your reg dataset?

    • @AI-HowTo
      @AI-HowTo 1 year ago +2

      Thank you, I will make it available; I need to check whether it has any copyrighted content, just in case, because it has some real images as well... I will make it available on the same video or send you a link as soon as possible.

    • @guangyuniu785
      @guangyuniu785 1 year ago

      @@AI-HowTo Can you send me a link? I just want to check out the difference between using reg data and not.

    • @AI-HowTo
      @AI-HowTo 1 year ago

      Sorry, for the LoRA training of Olivia I won't make the LoRA available, due to copyright issues and some complaints I got about the video... for this video of Katheryn, civitai.com/models/126328/katheryn-winnick-xyzkwv1sdxl, you can see three links there, but all are regularized... I did several models recently and got better results with regularization, but in some cases training without class images seems to work as well... personally, I always use regularization for models that I want to be better and more flexible; it takes more time to train, but when a regularization set is used, results are often better.

    • @guangyuniu785
      @guangyuniu785 1 year ago

      @@AI-HowTo Thanks for your advice, it helps a lot!!!

    • @AI-HowTo
      @AI-HowTo 1 year ago

      you are welcome

  • @ac1rajat
    @ac1rajat 1 year ago

    Make a Discord community, please.

    • @AI-HowTo
      @AI-HowTo 1 year ago

      Sorry, not planning on any for the next few months at least.

  • @sheleg4807
    @sheleg4807 1 year ago

    You skip so much that it's hard to follow what you're doing; half of the clicks you make are skipped and impossible to follow.

    • @AI-HowTo
      @AI-HowTo 1 year ago

      I was trying to shorten the video as much as possible, but point taken, thanks for the input.

  • @WifeWantsAWizard
    @WifeWantsAWizard 1 year ago

    Yeah, try to get a render from the side. Or from behind. "LoRA" is a patch, not a solution. And it only works if your subject is always taking a selfie.

    • @AI-HowTo
      @AI-HowTo 1 year ago +2

      If the training data has side views, the results will be really good... this one has a limited dataset... for long shots, using After Detailer is the only solution for LoRAs, which, like you said, are more like a patch for a certain object/character... LoRAs and even SD itself have many shortcomings, but there are many ways around them.

    • @fi5h81
      @fi5h81 1 year ago +1

      It works from the side too; I made many LoRAs like that. You need side photos in your dataset.