😕LoRA vs Dreambooth vs Textual Inversion vs Hypernetworks

  • Published 24 Jan 2025

COMMENTS • 350

  • @infocyde2024
    @infocyde2024 2 years ago +238

    The thing about textual inversions is that they create embeddings that are cross-compatible with the base models. A textual inversion trained with SD 1.5 will work with all 1.5-based models, and here is the kicker: you can combine them without having to do any model merging. That is HUGE.

    • @lewingtonn
      @lewingtonn 2 years ago +29

      yeah, the flexibility of textual inversion is a big factor, also it's really cool conceptually!!

    • @zyin
      @zyin 2 years ago +17

      The video really should have mentioned this, it's an incredible advantage for embeddings that was just left out.

    • @neilslater8223
      @neilslater8223 2 years ago +26

      Yes, combining two, three or more Dreambooth models is possible, but it takes time and generates yet another 2GB+ model that you need to save somewhere.
      Textual inversions, by contrast, can be used flexibly within the prompts in any combination, including weighting them and using them as negative prompts, all on the fly with no extra file management.
      However, textual inversion cannot learn to output things that the base model is not able to do at all. So depending on the base model, it may not be possible to train a textual inversion for a specific concept.
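A rough way to picture the points in this thread: a textual inversion only adds rows to the text encoder's token lookup table, so several of them can be loaded side by side and weighted in the prompt while the model file stays as it is. The sketch below is a toy with made-up three-number vectors; the token names `<my-corgi>` and `<my-style>` and the helpers `add_embeddings` and `encode` are hypothetical, not any real UI's API.

```python
# Toy vocabulary: token -> embedding vector. All values are made up for illustration.
base_vocab = {
    "photo": [0.2, 0.1, 0.0],
    "of": [0.0, 0.0, 0.1],
    "dog": [0.9, 0.3, 0.1],
}


def add_embeddings(vocab, learned):
    """Return a new lookup table with the learned tokens added; the original is untouched."""
    merged = dict(vocab)
    merged.update(learned)
    return merged


def encode(prompt, vocab, weights=None):
    """Look up each token and scale it by an optional per-token prompt weight."""
    weights = weights or {}
    return [[weights.get(tok, 1.0) * x for x in vocab[tok]] for tok in prompt.split()]


# Two independently trained embeddings (hypothetical names and values).
my_corgi = {"<my-corgi>": [0.8, 0.5, 0.2]}
my_style = {"<my-style>": [0.1, 0.7, 0.6]}

# Stack both on the same base vocabulary; no model merging is involved.
vocab = add_embeddings(add_embeddings(base_vocab, my_corgi), my_style)
vectors = encode("photo of <my-corgi> <my-style>", vocab, weights={"<my-style>": 0.5})
```

Because only the lookup table changes, any model that shares the same text encoder could in principle read the same vectors, which is the compatibility point made above.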

    • @infocyde2024
      @infocyde2024 2 years ago

      @expodemita I do not think they are compatible between 1.4/1.5 and 2.0/2.1. 2.0 and 2.1 should be compatible with each other.

    • @alexandrmalafeev7182
      @alexandrmalafeev7182 1 year ago +2

      @infocyde2024 2.0 and 2.1 are for sure

  • @simonbronson
    @simonbronson 2 years ago +67

    Much appreciated, having someone clever distil all of this dense information down and explain it succinctly and with so much enthusiasm is so refreshing!

  • @NukerOfFace
    @NukerOfFace 1 year ago +4

    Superb video. I don't think I've ever seen a tutorial/explanation for anything that is this good.

  • @KalebWyman
    @KalebWyman 2 years ago +37

    Thanks for explaining these so well, your visual diagrams are great!

  • @GayanZmith-vy1ql
    @GayanZmith-vy1ql 1 year ago +12

    I'm a total beginner to AI, and I suck at math, but you somehow managed to clear up a shit ton of confusion. I was hooked on Dreambooth tutorials and trust me, you don't want that. I literally thought I was not going to be able to get started simply because of the massive resources it required.
    Trust me, you are really good at explaining things :)
    Really appreciate the help

    • @glasco_
      @glasco_ 1 year ago +1

      I’ve been trying to install dream booth for 3 days now. No success. Ready to walk in front of a bus

  • @tomm5765
    @tomm5765 2 years ago +15

    Thanks for your hard work putting this together, very helpful to evolve my understanding of the different approaches. Much appreciated!

  • @takeuchi5760
    @takeuchi5760 2 years ago +8

    Thanks so much for this. Very underrated channel, literally was thinking something like this would be really helpful.

  • @AleOnYouTube
    @AleOnYouTube 1 year ago +2

    you deserve more subscribers, only channel I found that actually delivers what you need to know

  • @metamon2704
    @metamon2704 1 year ago +9

    You explained that amazingly, very easy to understand - also things move fast because it seems like LoRA is now the most popular.

  • @kulusic1
    @kulusic1 2 years ago +45

    Textual inversion is far better on 2.1 than 1.5, and I think that's why it doesn't get the same love Dreambooth receives. You can also speed up textual inversion training if you spend a few minutes getting the initialization text right so the vectors start relatively close to their final resting place. The best part, imo, is that you can combine many embeddings together, something which Dreambooth doesn't really allow.
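The initialization tip can be illustrated with a toy optimisation. This is only a sketch under strong assumptions: the "model" is reduced to plain gradient descent on the distance between the new token's vector and a made-up target, and `steps_to_converge` plus all the numbers are hypothetical. It shows why a start vector borrowed from a related word tends to need fewer steps than one borrowed from an unrelated word.

```python
def steps_to_converge(start, target, lr=0.1, tol=1e-4, max_steps=10_000):
    """Gradient descent on squared distance to the target; only the embedding vector moves."""
    vec = list(start)
    for step in range(max_steps):
        loss = sum((v - t) ** 2 for v, t in zip(vec, target))
        if loss < tol:
            return step
        # The gradient of the squared error for each component is 2 * (v - t).
        vec = [v - lr * 2 * (v - t) for v, t in zip(vec, target)]
    return max_steps


target = [0.8, 0.5, 0.2]            # where training "wants" the new token to end up (made up)
init_from_dog = [0.9, 0.3, 0.1]     # initialised from a related word
init_from_sign = [-0.7, 0.9, -0.6]  # initialised from an unrelated word

near = steps_to_converge(init_from_dog, target)
far = steps_to_converge(init_from_sign, target)
print(f"related init: {near} steps, unrelated init: {far} steps")
```

Real training is noisier and higher-dimensional, so treat the step counts as an intuition aid rather than a prediction.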

    • @sommeliereroguro
      @sommeliereroguro 1 year ago +8

      How can you get the initializing text right before the training?

    • @alefratat4018
      @alefratat4018 1 year ago +1

      @sommeliereroguro By running image-to-text, I suppose?

    • @nathanbollman
      @nathanbollman 1 year ago

      Ironically I haven't been able to run Dreambooth yet. I switched to Linux for AI... something is broken with PyTorch 2.0 and CUDA 11.7, and the only thing affected is Dreambooth training. Turn on gradient checkpointing and it can't train; turn it off and I can't make it to the first epoch without running out of 24GB of VRAM. I hope this gets fixed soon.

    • @sub-jec-tiv
      @sub-jec-tiv 1 year ago +1

      Totally agree. Suuper crucial to be able to call multiple embeddings in a prompt!

  • @jackzhang891
    @jackzhang891 1 year ago +4

    Hey Koiboi. Great video. When you made this video, as you said yourself, LoRA was still very new and the stats are probably not accurate. Now that a good amount of time has passed, I would love to watch an updated analysis video on the effectiveness of LoRA compared to Dreambooth and Textual Inversion.
    Either way, this is the most informative video I've watched so far comparing these fine-tuning models. Liked and subbed 👍.

  • @TheTruthIsGonnaHurt
    @TheTruthIsGonnaHurt 1 year ago +2

    Liked and Subscribed, Thank you for all the hard work!

  • @CameronRule
    @CameronRule 2 years ago +18

    One interesting piece of data is Lora has quite a high faves per download rating while only being out for a short period of time

    • @lewingtonn
      @lewingtonn 2 years ago +6

      yeah, I saw that too.... good sign!

  • @JunaidAzizChannel
    @JunaidAzizChannel 7 months ago +2

    Man casually delivers a masters degree course with a research thesis in 20 minutes

  • @jitgo
    @jitgo 1 year ago +4

    All different now! LoRA is by far the best all round method now and hugely gaining popularity... Great video by the way, excellent explanations!

  • @WarAnakin
    @WarAnakin 1 year ago +1

    i don't usually comment on videos, but you dear sir deserve an applause for the level of research that you have achieved. Not only that, but you explained so that even a cat would understand it.

  • @anthonyaddo
    @anthonyaddo 1 year ago +2

    Such an EXCELLENT video. Very very well researched and perfectly presented. Thanks for sharing all your findings and appreciate the time it took.

  • @AC-zv3fx
    @AC-zv3fx 2 years ago +37

    LoRA works only with an extension, and many people don't know how to use it yet, hence the lower ratings. Great video btw! A visual comparison would have been great as well! As far as I can remember, there was one in the LoRA blog post, showing how textual inversion may be less flexible than Dreambooth or LoRA, while the latter two showed comparatively similar results.

    • @Avenger222
      @Avenger222 2 years ago +5

      Auto added compatibility now! But it was only added recently. (I still use the extension, I find the drop-down much easier to use than how auto implemented it, plus it gives you the ability to tweak the weight of both U-Net and the Text Encoder -- super cool!)

    • @artavenuebln
      @artavenuebln 1 year ago

      i did everything i should do and i never get lora to run. it was no issue with the textual inversion, tho.

    • @glitter_fart
      @glitter_fart 1 year ago +1

      controlnet has almost made lora obsolete for anything other than oddities

  • @LuisPereira-bn8jq
    @LuisPereira-bn8jq 2 years ago +3

    That was a really helpful video that definitely saved me a bunch of time trying to understand these differences by myself :P

    • @lewingtonn
      @lewingtonn 2 years ago +1

      saving people time makes me super happy, thanks!

  • @Animes4ever1
    @Animes4ever1 2 years ago +2

    Awesome comparison mate, great addition with the statistics, thanks a lot

  • @toastypanda2963
    @toastypanda2963 1 year ago

    Great explanation! I've learned more about how AI art works from this video alone than all my previous watched videos combined. Everyone tends to say how to configure things without explaining how it works.

  • @mattecrystal6403
    @mattecrystal6403 2 years ago +27

    I've been messing with Loras and they seem to work really well. You can also do a good amount of mix and matching with loras whereas a full model checkpoint only allows you to use that one model at a time. if I had a fruits lora and a vegetables lora, then I could just turn them both on to get fruits and vegies in my random prompt that doesn't ask for fruits or vegies. If I later just want fruit then I could just remove the vegies lora.
    I think loras are going to be big going forward, most people just don't know about them yet.

    • @treyslider6954
      @treyslider6954 1 year ago

      I get the feeling that Textual Inversion is the go-to for when you have a new idea you want to teach the model (like a specific character or subject), and Lora is great for when you have a concept you don't want to stop and explain to the model, or may have difficulty doing so. They're very similar things, but not quite the same.
      For example; loras are great for mimicking a specific art style, because instead of having to describe "I want a painted animation style like this specific style, but with eyes drawn just so", you can train a lora and then just say "" at the end of your prompt, and since it isn't actually part of the prompt, this clears up tokens for describing the actual thing you want depicted in that style.
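For readers unfamiliar with how a LoRA is invoked "outside" the prompt: in the AUTOMATIC1111 web UI the tag has the shape `<lora:name:weight>`, and it is stripped out before the remaining text is tokenised. The parser below is a minimal sketch of that idea, not the web UI's own code; `split_prompt` and the LoRA name `paintedStyle` are hypothetical.

```python
import re

# Tag shape assumed here: <lora:name> or <lora:name:weight>.
LORA_TAG = re.compile(r"<lora:([^:>]+)(?::([0-9.]+))?>")


def split_prompt(prompt):
    """Return (prompt text without tags, list of (lora name, weight))."""
    loras = [
        (name, float(weight) if weight else 1.0)
        for name, weight in LORA_TAG.findall(prompt)
    ]
    text = LORA_TAG.sub("", prompt)
    text = " ".join(text.split())  # collapse whitespace left behind by removed tags
    return text, loras


text, loras = split_prompt("a castle on a hill <lora:paintedStyle:0.8>")
```

Only `text` would go on to the text encoder, which is why the tag does not use up prompt tokens.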

    • @ArbJunkAgeG
      @ArbJunkAgeG 1 year ago +1

      This is exactly how I feel about LoRA. It's disappointing that people don't seem to grasp how beneficial LoRAs can be.

    • @tbuk8350
      @tbuk8350 1 year ago

      @treyslider6954 And also, as described in the Automatic1111 docs, Textual Inversion can't teach COMPLETELY new concepts.
      The example they gave is that if you trained a model that only knew how to make apples on images of bananas, it wouldn't learn what a banana is, it would just make long yellow apples (in the best-case scenario). Because it's not actually changing model weights, it's better for teaching a style than a new subject, because unless the subject is very similar to something it's seen, it can't learn it.
      LoRAs can teach a model something it's never seen before, because they are directly inserting weights into the model, meaning it's actually modifying the model and not the input going into it.
      Basically, Textual Inversion for simple styles, LoRA for anything complicated.
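The apples-and-bananas argument can be made concrete with a toy frozen model. In this sketch (all numbers and the names `model` and `best_loss` are made up for illustration), the frozen weights `W` can never produce a non-zero third output, so tuning only the input embedding reaches a target inside the model's range but gets stuck on one outside it.

```python
# Frozen "model": maps a 2-number embedding to a 3-number output.
# The third row is all zeros, so the third output is always 0 whatever the input.
W = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]


def model(e):
    """Apply the frozen weights to an embedding."""
    return [sum(w * x for w, x in zip(row, e)) for row in W]


def best_loss(target, steps=500, lr=0.1):
    """Tune only the embedding by gradient descent and return the final squared error."""
    e = [0.0, 0.0]
    for _ in range(steps):
        err = [o - t for o, t in zip(model(e), target)]
        # d/de_j of sum(err_i ** 2) is 2 * sum_i err_i * W[i][j].
        grad = [2 * sum(err[i] * W[i][j] for i in range(3)) for j in range(2)]
        e = [x - lr * g for x, g in zip(e, grad)]
    return sum((o - t) ** 2 for o, t in zip(model(e), target))


reachable = best_loss([0.5, -0.3, 0.0])    # inside what the frozen model can output
unreachable = best_loss([0.5, -0.3, 1.0])  # needs a third output the model cannot make
```

Changing `W` itself, which is what Dreambooth and LoRA do in different ways, is the only way to close the remaining gap in the second case.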

  • @Apothis1
    @Apothis1 1 year ago

    Really appreciate this. So many videos show how to do this stuff, but not how it works, and especially not how it works dumbed down to a level I can understand. Very cool, thank you

  • @xhinker
    @xhinker 1 year ago +1

    LoRA actually doesn't insert additional layers; LoRA adds additional weights to the checkpoint model weights.

  • @moneyjuice
    @moneyjuice 2 years ago +4

    I love your videos, always on point !

  • @Philip8888888
    @Philip8888888 1 year ago

    Wow. Thanks for this video, esp. the first part which gave just enough detail to understand the trade-offs and underlying approaches.

  • @m3dia_offline
    @m3dia_offline 1 year ago

    I love it, love your promises on what we are going to get from your video at the very starting few seconds of the video itself, keep it going man, love your channel and your energy.

  • @j.clayton7672
    @j.clayton7672 1 year ago

    Awesome. As someone who was too lazy to look up the papers, and too stupid to understand them, I truly appreciate your video. I actually understood it.

  • @RemitheDreamfox
    @RemitheDreamfox 1 year ago

    You explained this so well. My smooth brain couldn't understand these different methods for the longest time \uwu/

  • @kateryna_phototalk
    @kateryna_phototalk 2 months ago

    Insane, amazingly clear explanation 👏

  • @fun7704
    @fun7704 1 year ago

    This was a very informative video in fact, thank you! And I like your very dramatic delivery of the content! :)

  • @ParanoidAmerican
    @ParanoidAmerican 1 year ago

    This video is exactly what I needed, and you went about it in the best way possible. Thanks for this

  • @bardiashahrestani3291
    @bardiashahrestani3291 1 year ago +1

    My understanding is that LoRAs train specific layers of the model and store them rather than injecting new layers. Injecting new layers would make the model config incompatible with the model itself.

  • @yo252yo
    @yo252yo 1 year ago

    this is the best video about the topic ive ever seen, thanks so much

  • @errrorproduction
    @errrorproduction 1 year ago

    Really great video! I finally understand the differences. Just the conclusion is already out of date, since we're moving so incredibly fast. LoRA is the most popular format on Civitai now. Understandable, since training is the quickest, even though TI's end result is much smaller.

  • @ArtfulRascal8
    @ArtfulRascal8 2 years ago +3

    The fact that you don't have 10x more subscribers or views boggles me. I guess not enough sex and drama. I hate to be cynical but holy S*t this is an important subject, and you break things down so normies like me can understand. Thank you sincerely.

    • @lewingtonn
      @lewingtonn 2 years ago

      thanks so much dude!! I made a few off-topic passion-project videos that my audience didn't really understand, so I think youtube doesn't trust my content... something like that.
      Quality audience < quantity audience!

    • @ArtfulRascal8
      @ArtfulRascal8 2 years ago +1

      @lewingtonn YouTube is tyrannical these days. I guess with the amount of videos being posted every day they have to do something. But one would think searching would solve the issue of relevance and quality, yet "the algorithm" obviously chooses who it vets, and who it vets is obviously etc etc etc. We could have this conversation for hours, maybe even days lmao. But no, really, thanks for your content man. Seriously.

    • @lewingtonn
      @lewingtonn 2 years ago

      @ArtfulRascal8 sounds like you probably know more about this than me lmao, but thanks honestly!

  • @adriangpuiu
    @adriangpuiu 2 years ago +5

    the conclusion is simple. use kohya ss to extract the lora deltas from checkpoints ..... thus you end up with 1 base model and plenty of lora files that are few MB in size
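Why the extracted files come out so small follows from simple arithmetic: a full weight matrix stores one number per input/output pair, while a rank-r factorisation stores only two thin matrices. The sketch below assumes a single 768-by-768 layer and rank 4 purely for illustration; neither number is taken from the comment or from any specific model.

```python
def full_params(d_out, d_in):
    """Numbers stored for a full weight matrix (or a full fine-tuned delta)."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Numbers stored for a rank-r factorisation: B is d_out x r, A is r x d_in."""
    return rank * (d_out + d_in)


d_out, d_in, rank = 768, 768, 4  # assumed sizes, for illustration only
full = full_params(d_out, d_in)
low_rank = lora_params(d_out, d_in, rank)
print(f"full: {full}, rank-{rank}: {low_rank}, ratio: {full / low_rank:.0f}x")
```

Summed over every adapted layer, that ratio is roughly what separates a multi-gigabyte checkpoint from a file of a few megabytes, with the exact size depending on the rank and on which layers are included.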

  • @xhinker
    @xhinker 1 year ago

    Nice video, even though I watched it 6 months later, lots of things happened, your video is still extremely helpful (except the LoRA part 😊)

  • @ModestJoke
    @ModestJoke 2 years ago +1

    "SKS" is a type of rifle. The point of Dreambooth is to overwrite what the model knows about a given word, either partially or completely. You can add new dog breeds to the model by training pictures of them under the generic class "dog" without destroying all the other kinds of dogs the model knows if you only train a little bit. Or you can make every dog you produce be your dog if you overtrain it. The point of choosing "sks" is not to use a word the model doesn't know. The point is to use a word you don't care about overwriting completely, and then to train it enough that it works in your desired prompt. You could train "a photo of a dog person" to be pictures of you if you train it long enough. You're much better off training it to use a word with some meaning to you, like a misspelling of a name, a "l33t $p34k" spelling of it, or something else that's not real yet has meaning to you. That way you can have different strings of text for different subjects or styles and put them all in the *same* model. If you always use "sks" or "ohwx", then you need an entire checkpoint per subject, and that's a bad idea.

  • @AB-wf8ek
    @AB-wf8ek 2 years ago +2

    Thanks a ton for this breakdown, I've been struggling with this same question for a few weeks now. I had already come to a similar conclusion myself, but this was very validating.
    Dreambooth is preferred, but the model sizes make it cumbersome and challenging to test different versions. With textual inversion, the file sizes are insignificant, and you can stack them on top of each other, making them very flexible.
    I haven't actually evaluated embeddings (textual inversion) for quality yet because the animation notebook I use didn't support them, but the developer just made it compatible, so I'm looking forward to testing it out more.

  • @0xjeph
    @0xjeph 4 months ago +1

    LoRA does not add new layers to the original model. Instead, it introduces additional weights in a low-rank decomposition format and integrates them into the existing layers of the model.
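A small numeric check of this description, as a sketch with made-up 2-by-2 numbers: the low-rank pair can be applied as a side path next to the frozen weights, or folded into them once as W' = W + scale * B A, and both give the same output. The helper names `matvec` and `matmul` are just local utilities, not a library API.

```python
def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a, b):
    """Multiply matrix a (n x r) by matrix b (r x k)."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


W = [[1.0, 2.0], [3.0, 4.0]]  # frozen layer weights, shape 2 x 2
A = [[0.5, -1.0]]             # down-projection, shape rank x 2 with rank = 1
B = [[2.0], [0.0]]            # up-projection, shape 2 x rank
scale = 0.5
x = [1.0, 1.0]

# Side path: y = W x + scale * B (A x)
y_side = [wx + scale * d for wx, d in zip(matvec(W, x), matvec(B, matvec(A, x)))]

# Merged weights: W' = W + scale * (B A), then y = W' x
BA = matmul(B, A)
W_merged = [[w + scale * d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W, BA)]
y_merged = matvec(W_merged, x)
```

The layer count and shapes are identical before and after merging, which is why a merged LoRA loads like any other checkpoint.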

  • @barryjones6479
    @barryjones6479 2 years ago +1

    Great video and explanation! I really want TI to be the future but I agree, the quality of dreambooth training is usually better.

    • @lewingtonn
      @lewingtonn 2 years ago

      thanks for the data point!

  • @swannschilling474
    @swannschilling474 2 years ago +1

    Thanks for the input, good research!!

  • @tljstewart
    @tljstewart 1 year ago

    ok you had me @00:27 , would be cool to see a video on civitai

  • @BlancheNuit
    @BlancheNuit 1 year ago

    That is the type of quality content that I'm digging for.
    I want to understand Stable Diffusion and everything related. But my attention span and programming knowledge aren't enough for me to just read papers about it. So I need videos, with visuals and easy explanations. And your video was perfect. Liked + Subscribed :)

  • @ronenbecker1873
    @ronenbecker1873 1 year ago

    You're an absolute legend. Great video

  • @wendellkwang3724
    @wendellkwang3724 1 year ago

    what a great list of checkpoints you have, a man of culture 🤣

  •  1 year ago

    Thank You a lot. This has been a really good explanation that I felt missing.

  • @fredingham1855
    @fredingham1855 1 year ago

    Outstanding job explaining these concepts! Well done!

  • @dreamingtulpa
    @dreamingtulpa 1 year ago

    Why am I only now seeing this? Great video and thanks for the feature ❤

  • @Unstable_Stories
    @Unstable_Stories 1 year ago

    I greatly appreciate this video sir! It is really helpful for me to have context of how things actually work behind the scenes to make mental connections and improve how I interact with the external program.

  • @ksottam
    @ksottam 2 years ago

    Loved this breakdown. You need more followers!

  • @crustysoda
    @crustysoda 2 years ago +6

    Thank you for model explanation. Really loved your content so far.
    At the end of civitai comparison, I’m curious if we split data to use cases, object embedding vs style embedding would have different performance/preference.

    • @lewingtonn
      @lewingtonn 2 years ago +2

      that's a super hard question to answer :(

  • @zynexis
    @zynexis 2 years ago +2

    from what I gather at this point (may be wrong, don't know the exact details) this is how i view the various techniques:
    dreambooth:
    easy to use and see clear results due to typical aggressive training settings
    easy to overtrain, turning model into 1 trick pony
    can contaminate rest of model if overtrained
    merge can transfer contamination
    probably still good for merging overall
    textual inversion:
    works with several models with same base
    model doesn't learn anything
    cannot be included in a merge
    'tricking' a model to output a result based on what it knows without understanding
    'plug in' solution for specific objects/concepts
    hypernetworks:
    does not need mixing into model before use, unlike lora
    can be swapped and scaled on the fly in webui (req same base model)
    cannot be included in a merge
    LoRA:
    small file size but needs to be merged into another model (with same base)
    probably best for merging without affecting model broadly
    (no idea how lora merging affects actual model, are new nodes inserted?)
    finetuning:
    keep model stable while learning new concepts
    probably the most solid/slow/steady
    please feel free to add to list to or correct me

    • @lewingtonn
      @lewingtonn 2 years ago

      yeah, sounds very accurate to me
      The only thing I would mention is that I think that LoRA merging and hypernetwork merging can be done in exactly the same way, it's just that at the moment AUTOMATIC1111 does them differently

    • @zynexis
      @zynexis 2 years ago

      @lewingtonn that would make sense, if they both operate on those intermediary nodes
      it raises questions of how well LoRA/hypernetworks merge when several models are merged and how well they handle it
      it seems the fewer subnodes are maybe more specialized in what they do to the underlying model. Maybe it just magically works out xD
      guess it would be similar to merging 2 hypernetworks and running the merged result on a model

  • @neocaron87
    @neocaron87 2 years ago +3

    That was absolutely awesome. Thanks for that. I wish you'd do a deep-dive tutorial on the most recent update of Dreambooth in Automatic1111; some settings seem to have a major impact on training without being covered much. (Gradient anyone? XD)

    • @Atomizer74
      @Atomizer74 2 years ago

      Yeah, every time I grasp the settings a bit better, new settings get added.

  • @AndyGilleand
    @AndyGilleand 2 years ago

    I just started training Lora in the dreambooth extension in Automatic1111 and the files it's outputting for me for SD2.1 are about 5 MB pt files, nowhere near what you said Lora sizes should be.

  • @baseddoggie
    @baseddoggie 9 months ago +1

    I'm not sure the numbers about Dreambooth downloads are accurate. It seems he got that number from the number of "checkpoints" (aka full size models) downloaded from civit ai, but I'm not so sure most of those are made with Dreambooth, a lot (if not the majority) are model merges which is not the same thing. Just thought I'd mention that.

  • @dv8silencermobile
    @dv8silencermobile 1 year ago

    You are really good at explaining this stuff. Thanks!

  • @StunMuffin
    @StunMuffin 4 months ago

    The best explanation on YouTube🎉❤

  • @midnightCirc
    @midnightCirc 1 year ago

    lora is my go-to. Being able to hotswap and combine styles/likeness/scenes on the fly and being able to adjust weights is SO powerful.

  • @TheAnna1101
    @TheAnna1101 1 year ago

    Thanks for making such great and informative video. Keep up the good work

  • @tbuk8350
    @tbuk8350 1 year ago

    This video is incredibly helpful. I'm probably going to use either LoRA or Dreambooth, as Textual Inversion can't teach brand new subjects as well as you can by directly inserting or modifying weights in the model.

  •  1 year ago

    thanks for making those complex concepts easy to understand!

  • @матвейлапушинский

    Incredible explanation! Thanks a lot.

  • @danielaston6560
    @danielaston6560 1 year ago

    This video is dope. Super clear and informative. Thank you!!!

  • @grahamulax
    @grahamulax 1 year ago

    This is the best video. You mentally collapsed at the end and I could relate so much hahah. Textual inverse IS THE COOLEST!...Now excuse me while I use some dreambooth.

  • @jeronimogauna7508
    @jeronimogauna7508 11 months ago

    Best video I've ever seen.
    Best vibes!
    Thanks so much

  • @zentrans
    @zentrans 2 years ago +2

    Sounds like your explanation could have been simpler, but I could be wrong. Textual inversion seems to be a method by which you discover which textual parameters (which are not necessarily human-readable) need to be input in order to get images resembling your input images, while Dreambooth creates new associations within the network. The more associations it makes, the more integrated the concept is, and the more intricate/creative your prompts can get while keeping attention on the desired feature.

    • @lewingtonn
      @lewingtonn 2 years ago +1

      I don't know about the bit about more associations = more integrated, but yeah, sounds right to me!

    • @zentrans
      @zentrans 2 years ago

      @lewingtonn if I'm correct, textual inversion should allow you to do selective editing with img2img. Can you try that and compare to your previous attempts?

    • @zentrans
      @zentrans 2 years ago

      @lewingtonn btw this would have been great for your Greta VS Tate memestream

    • @lewingtonn
      @lewingtonn 2 years ago +1

      @zentrans you're 100% correct but that would take a LOT of time lol... I need to find a place to live first 😂

  • @friendofai
    @friendofai 2 years ago +1

    Really great video, thanks for sharing all your research!

  • @rickguzman9463
    @rickguzman9463 1 year ago

    THANK YOU THANK YOU THANK YOU!! Great video. Great insight.

  • @cinematic_monkey
    @cinematic_monkey 1 year ago

    What I was looking for in that video was the comparison of usability in different scenarios. Which model is good for faces which one for style transfer etc. I'm missing that, other than that quite comprehensive comparison. Good job!

  • @tahreezzmurdifin52
    @tahreezzmurdifin52 2 years ago +1

    you are awesome

  • @martinchen9667
    @martinchen9667 1 year ago

    brilliant video, thank you for all the efforts!

  • @Exaltar
    @Exaltar 2 years ago +1

    You're a goddamn genius, been watching your videos for the last 2 days. I love your content but I feel like a total moron because I know you're explaining things in the best way possible for a layman like myself.

    • @lewingtonn
      @lewingtonn 2 years ago +1

      hahaha that's super high praise dude, I'm glad you find my stuff helpful!

  • @Grifter
    @Grifter 1 year ago

    I've used all these methods besides Dreambooth. In my experience of training on a specific person, LoRA has given me the best results, and it's also the quickest of the methods I've tried, which is a bonus. You can also use them on any model and mix them together, etc. The only problem I've had is using it to produce two different people at the same time, as you can't go over a total weight of 1.0 (more realistically about 0.8), and the more you use together, the lower the weight you have to use for each. But that can be solved using inpainting, or probably other methods as well.
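One way to see why stacked LoRAs interact is that each one adds its own weighted delta to the same base weights, so the changes pile up. The sketch below uses made-up 2-by-2 numbers and rank-1 LoRAs; `apply_loras`, `person_a`, and `person_b` are hypothetical names, and the 0.5 and 0.8 multipliers only echo the comment's rule of thumb rather than any enforced limit.

```python
def outer(col, row):
    """Rank-1 matrix built from an up-projection column and a down-projection row."""
    return [[c * r for r in row] for c in col]


def apply_loras(W, loras):
    """Return W plus the sum of multiplier * (up x down) over all LoRAs; W is not modified."""
    merged = [row[:] for row in W]
    for multiplier, up, down in loras:
        delta = outer(up, down)
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += multiplier * value
    return merged


W = [[1.0, 0.0], [0.0, 1.0]]         # frozen base weights (made up)
person_a = ([1.0, 0.0], [0.0, 1.0])  # hypothetical rank-1 LoRA as (up, down)
person_b = ([0.0, 1.0], [1.0, 0.0])  # a second hypothetical rank-1 LoRA

only_a = apply_loras(W, [(0.8, *person_a)])
both = apply_loras(W, [(0.5, *person_a), (0.5, *person_b)])
```

With both active, the base weights are pushed in two directions at once, which may be part of why lower per-LoRA multipliers tend to behave better in practice.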

  • @JohnSmith-he5xg
    @JohnSmith-he5xg 1 year ago +1

    Do you have a link to the Excalidraw document? Was impressed by how clean that looked and wanted to check it out

    • @lewingtonn
      @lewingtonn 1 year ago +1

      excalidraw is great but you can only have one document going at a time so there's no link I'm afraid

    • @JohnSmith-he5xg
      @JohnSmith-he5xg 1 year ago

      @lewingtonn No worries. Great video!

  • @badradish2116
    @badradish2116 1 year ago +1

    could you please do a part 2 where you
    - explain aesthetic gradients for educational purposes, and maybe provide data on user feedback like you did at the end for the others.
    - explain lycoris, which from what i understand is lora + 4 random good ideas, but id love to see someone on your level break it down a bit better.
    - give us updated data on the other forms now that more feedback is available (you mentioned not having a big enough sample size to judge the newest tech).
    that would be insanely helpful. thanks!

  • @jichenzhang4385
    @jichenzhang4385 1 year ago

    Very nice introduction! Thank you!

  • @VitaNova83
    @VitaNova83 1 year ago

    Absolutely incredible video, thank you!

  • @kazimozden4010
    @kazimozden4010 1 year ago

    Thank you for an informative and engaging video!

  • @Beef_Supreeeme
    @Beef_Supreeeme 1 year ago

    You have to respect the effort in making this video.

  • @takocain
    @takocain 1 year ago

    That was an insanely good explanation. Thank you!

  • @ytchen6748
    @ytchen6748 1 year ago

    What a great video! Thanks for your academic sharing and empirical results❤

  • @ddude2
    @ddude2 1 year ago +1

    Amazing video with the explanation of the heuristics. Have you updated your spreadsheet with the usage now that 4 months have passed, and would you change your opinion based on the quantitative data from Civitai?

  • @dalefunk2709
    @dalefunk2709 1 year ago

    Textual inversion makes WAY more sense. To put it plainly, it's training the word, or "changing the definition" of what that word means to the model. So like you said, "sks" might put out a sign or something, but by the end the model understands the word now means "corgi". It's like how normal words become slang words in real life, or phrases change their meanings as language develops.

  • @jondargy
    @jondargy 1 year ago

    Very nice summary- thank you 🙏

  • @kirollosmalek1365
    @kirollosmalek1365 1 year ago

    man you're a hero

  • @suryaprasathramalingam2421
    @suryaprasathramalingam2421 10 months ago

    thanks for the short explanation. Loved it!

  • @kernsanders3973
    @kernsanders3973 1 year ago +3

    In my experience with LoRA vs hypernetworks, hypernetworks seem to be more versatile; they work so much better with different models. I trained a hypernetwork for an anime character with an anime model as the base and was able to use it fine on realistic models, where LoRA would struggle. Usually, trying an anime character on a realistic model with LoRA would start causing a breakdown in anatomy or make the output look very deep-fried. I've seen a standalone LoRA trainer that doesn't use the Automatic WebUI that I might try out to see if it produces better results. But so far hypernetworks seem to be king between the two. Also, LoRA models don't seem to work well with VAEs, whereas I have not had problems with hypernetworks and VAEs.
    I do wish they would incorporate better hypernetwork and textual inversion management into the WebUI. I don't always want ALL my embeddings to load when starting up the WebUI. Some interface where one could enable and disable the embeddings would be a time saver, almost like the extension enable/disable page. At the moment I have to manually move them in and out of the embeddings folder. If I knew how to create extensions for the WebUI, I would give it a shot. For hypernetworks, I wish they got the same treatment in the WebUI that LoRA gets. Again, if I knew how to create extensions I would create something like that.

    • @aquilesbaezta4354
      @aquilesbaezta4354 1 year ago

      Hello, I tried to train anime with textual inversion and the result is always something similar to a Picasso painting. Do you know what I could be doing wrong? i use this video for reference ua-cam.com/video/2ityl_dNRNw/v-deo.html

    • @kernsanders3973
      @kernsanders3973 1 year ago

      @aquilesbaezta4354 If you are training anime, you want to train on a model that is anime-based. AnythingV3, AnythingV4 or AnythingV4.5 are good base models. I would highly recommend training a hypernetwork instead. It uses less VRAM and you can switch them on and off with no problem in the settings under Extra Networks. Make sure to switch CLIP to 0 and turn the VAE off when training. Look for Nerdy Rodent on YouTube. He has a good video on hypernetwork training. Use all the settings he does when setting up the hypernetwork.

    • @kernsanders3973
      @kernsanders3973 1 year ago

      @aquilesbaezta4354 Also, before training starts, go through your fileword files to make sure they accurately describe the accompanying training image: not only what the subject is but also what the style of the anime training image is, for example whether it's cel-shaded, a screencap or a digital painting. Remember to load the VAE again once training is done, before prompting.

  • @alexkfridges
    @alexkfridges 2 years ago +1

    God tier content

    • @lewingtonn
      @lewingtonn 2 years ago +1

      not as good as a real life headcrab, but it's something haha

  • @NetworkDirection
    @NetworkDirection 1 year ago

    Hey, you're that guy from IT Masters!

  • @takif8756
    @takif8756 1 year ago

    Great tutorial mate, thank you!

  • @TurboSkibidiFun
    @TurboSkibidiFun 2 years ago

    This is so well taught man thank you so much

  • @Roughneck7712
    @Roughneck7712 2 years ago +17

    Great video! Personally, I like textual inversion and feel that - ultimately - that's where most will end up gravitating to for training. HOWEVER, I really wish someone would create clear instructions on image captioning best practices when preparing the datasets for training images ... HINT HINT!

    • @lewingtonn
      @lewingtonn 2 years ago +6

      haha I'll chuck 'er on the backlog!

    • @magenta6
      @magenta6 2 years ago

      Aitrepreneur has a very good tutorial on this. ua-cam.com/video/2ityl_dNRNw/v-deo.html

  • @caschque7242
    @caschque7242 1 year ago

    Really good guide. One constructive critical point: when calculating for a trend of data: do it by time, not in total. Dreambooth was the first one, so you biased the numbers in favor, simply because Dreambooth existed for longer. For the favorites, you could do Favorites/Downloads.
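The two normalisations suggested here are easy to compute. The sketch below uses entirely made-up counts and dates (none are from the video or from Civitai), and `per_day` and `favorite_rate` are hypothetical helper names; it only shows how dividing by days available and by downloads can flip a ranking based on raw totals.

```python
from datetime import date


def per_day(total, first_seen, as_of):
    """Average count per day since the method first appeared."""
    return total / (as_of - first_seen).days


def favorite_rate(favorites, downloads):
    """Favorites per download, a rough proxy for how much users liked what they tried."""
    return favorites / downloads


as_of = date(2023, 2, 1)
# Made-up numbers: an older method with large totals, a newer one with small totals.
older = {"downloads": 90_000, "favorites": 4_500, "first_seen": date(2022, 11, 3)}
newer = {"downloads": 12_000, "favorites": 1_800, "first_seen": date(2023, 1, 22)}

for name, m in (("older", older), ("newer", newer)):
    rate = per_day(m["downloads"], m["first_seen"], as_of)
    liked = favorite_rate(m["favorites"], m["downloads"])
    print(f"{name}: {rate:.0f} downloads/day, {liked:.2f} favorites/download")
```

With these invented figures the newer method trails on totals but leads on both normalised measures, which is the kind of bias the comment is warning about.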

  • @mlcat
    @mlcat 1 year ago

    Very clear explanation, thank you!

  • @lionroot_tv
    @lionroot_tv 2 years ago

    This is great. Thank you for sharing your knowledge, and about Excalidraw.

  • @austinliu9218
    @austinliu9218 1 year ago

    clearly explained, much appreciated!

  • @MarcusStreips
    @MarcusStreips 1 year ago

    Nicely done. I know from experience that training Dreambooth requires at least 10 GB of VRAM, so it's not accessible to everyone. I am definitely going to check out the other methods.

  • @Funzelwicht
    @Funzelwicht 1 year ago

    Awesome explanation for everyone!