Useful Resources
How to install Stable Diffusion Forge UI on Windows (Nvidia GPU)
ua-cam.com/video/zqgKj9yexMY/v-deo.html
Settings and Tips and Tricks for Forge UI
ua-cam.com/video/zqgKj9yexMY/v-deo.html
How to get 260+ Free Art Styles for Stable Diffusion A1111 and Forge UI (The styles.csv download link is on the pinned comment of that video)
ua-cam.com/video/UyBnkojQdtU/v-deo.html
In this video I am using the model: Juggernaut X RunDiffusion (version 10) from CivitAI
civitai.com/models/133005?modelVersionId=456194
You download it and place it in the folder webui\models\Stable-diffusion
Outpaint Tutorial for Forge UI
ua-cam.com/video/5_dOevJRzEI/v-deo.html
Inpaint Tutorial for Forge UI
ua-cam.com/video/srvek4ucH-A/v-deo.html
If you have any questions you can post them in Pixaroma Community Group facebook.com/groups/pixaromacrafts/
or Pixaroma Discord Server discord.gg/a8ZM7Qtsqq
Finally a pro. No bs, no intro, straight to the point. Subscribed
I love that. Finally a YouTuber that respects my time. Instantly subscribed
Just wow, I cannot stop watching 💣
Great video guide with clear explanations and usability.
Thank you so much for your support, I really appreciate it ☺️
Wow wow wow. Fantastic video that doesn't have a goofy voice and those quickly paced captions. Thanks!!
Example of one of my prompts:
Cinematic photo of a Celtic woman, with pale skin, fiery red hair cascading over her shoulders, and bright blue eyes. She wears a woolen cloak fastened with a bronze brooch and is adorned with silver bracelets. Behind her, misty forests and ancient standing stones rise in the background, ultra realistic, Shot with a Nikon F3 and a 35mm ƒ2 lens, using Kodak Portra 400 film stock
Great explanation and good tips, thank you so much. I can just copy what is said above: no bullshit, no ads, no intro, just straight to the point
After a long search finally an amazing video.
This might just be one of the best videos I've learned from, thank you.
Love it, learned a lot of new tricks. What are the specs you are running, GPU, processor, RAM? Yours generates pretty fast.
I speed up things in the video, but it still goes pretty fast usually. I have this:
- CPU Intel Core i9-13900KF (3.0GHz, 36MB, LGA1700) box
- GPU GIGABYTE AORUS GeForce RTX 4090 MASTER 24GB GDDR6X 384-bit
- Motherboard GIGABYTE Z790 UD LGA 1700 Intel Socket LGA 1700
- 128 GB RAM Corsair Vengeance, DIMM, DDR5 (4x32GB), CL40, 5200MHz
- SSD Samsung 980 PRO, 2TB, M.2
- SSD WD Blue, 2TB, M2 2280
- Case ASUS TUF Gaming GT501 White Edition, Mid-Tower, White
- CPU Cooler Corsair iCUE H150i ELITE CAPELLIX Liquid
- PSU Gigabyte AORUS P1200W 80+ PLATINUM MODULAR, 1200W
- Microsoft Windows 11 Pro 32-bit/64-bit English USB P2, Retail
Great video! I'm new to Stable Diffusion and never used a lot of those options!
Would you be able to make a longer video going over how to use all the built in stuff forge comes with? (The whole area with LayerDiffuse, controlnet, dynamic thresholding, etc)
It's too much information for one video, but it's split across multiple videos. For most of the stuff, check ua-cam.com/video/zqgKj9yexMY/v-deo.html ua-cam.com/video/q5MgWzZdq9s/v-deo.html ua-cam.com/video/c03vp7JsCI8/v-deo.html ua-cam.com/video/5_dOevJRzEI/v-deo.html ua-cam.com/video/srvek4ucH-A/v-deo.html. As for dynamic thresholding, I didn't find it so useful because it kind of changes the colors. For ControlNet, the SDXL models don't seem as good as the v1.5 models, so I mostly use the Canny model; you can see it in my sketch video or cartoon videos.
@pixaroma Once again, you knocked it out of the park. You are in the major leagues. :)
What would you say is the best model for SD, and its settings? I've downloaded 1000 over the last year and tried merging a few; I'm always on the lookout for the "perfect" model that has Midjourney quality for both nsfw/sfw photos, mostly portraits, but also creative mockups. While I have a few go-tos, it can still be frustrating going back and forth just to get one that can handle what you want it to do. I just want to get to a point where I turn it on, have all the settings saved just the way I want, and prompt away without the back and forth.
In the last months I have been using only the Juggernaut XL models. Right now I am using the latest version, Juggernaut_X_Rundiffusion10 civitai.com/models/133005?modelVersionId=456194, but older versions from 7 to 9 also work OK; the latest usually has more training. I also like that they always give the settings you can use in the description, and it's good as a general model because it can do anything. It's also the highest rated SDXL model of the last month on CivitAI.
Recommended settings:
Res: 832*1216 (for portrait, but any SDXL resolution will work fine) - I usually just use something between 1024 and 1216, whatever fits better for the ratio I need.
Sampler: DPM++ 2M Karras
Steps: 30-40
CFG: 3-7 (less is a bit more realistic)
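If you prefer scripting to the UI, here is a minimal sketch of those same settings using the Hugging Face diffusers library; the checkpoint path is an assumption, point it at wherever you saved the Juggernaut file:

```python
# Minimal sketch: the recommended Juggernaut X settings via diffusers.
# The safetensors path below is a placeholder; adjust it to your folder.
import torch
from diffusers import StableDiffusionXLPipeline, DPMSolverMultistepScheduler

pipe = StableDiffusionXLPipeline.from_single_file(
    "webui/models/Stable-diffusion/Juggernaut_X_RunDiffusion.safetensors",
    torch_dtype=torch.float16,
)
# DPM++ 2M Karras = multistep DPM-Solver with Karras sigmas
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config, use_karras_sigmas=True
)
pipe.to("cuda")

image = pipe(
    prompt="cinematic photo of a Celtic woman, fiery red hair, misty forest",
    width=832, height=1216,      # recommended portrait resolution
    num_inference_steps=35,      # steps 30-40
    guidance_scale=5.0,          # CFG 3-7; lower is a bit more realistic
).images[0]
image.save("juggernaut_test.png")
```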
Unfortunately there’s no such thing as perfect do it all model. Midjourney actively juggles multiple models as it renders.
If you want to replicate Midjourney, the closest you can get is ComfyUI with Python scripts, dynamically choosing models based on image context and CLIP, along with IPAdapter.
There are tons of models but each does specific things well. Best you can hope for is “decent in everything” or AMAZING in specific things.
Juggernaut is good for realistic fantasy images.
Pony is versatile for fantasy art.
I just wanna say that you're amazing, man
Great video. Wish I had found you months ago, it would have saved me a lot of time. Liked and subscribed.
Very informative and no nonsense. Subbed and liked!
Great video mate! Super informative, and straight to the point!
Nice video! Why is it so difficult to find any tutorial that shows how to use Stable Diffusion to add text to an existing image? Can you help?
I don't have one for Forge, but for ComfyUI I will do one next week. Usually people just generate with AI and use other tools like Photoshop for adding text.
Informative, as always. Thank You
Can you tell me what the minimum hardware requirements are to run Forge WebUI, please?
Windows operating system and an Nvidia card with at least 4GB of VRAM to run older models like 1.5; you need more VRAM, like 6-8GB, to run the latest SDXL models. I got it to work on 6GB of VRAM but didn't test it on 4GB.
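If you want to check what your card reports before installing anything, a quick sketch with PyTorch (assuming you have a CUDA build of torch installed):

```python
# Quick check of reported GPU VRAM with PyTorch (assumes an Nvidia card
# and a CUDA-enabled build of torch).
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    vram_gb = props.total_memory / 1024**3
    print(f"{props.name}: {vram_gb:.1f} GB VRAM")
    # Rough rule of thumb from the reply above:
    if vram_gb >= 6:
        print("Should handle SDXL models.")
    elif vram_gb >= 4:
        print("Should handle SD 1.5 models.")
else:
    print("No CUDA GPU detected.")
```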
@@pixaroma Thank you for your time
Thank you for the video, quick question: when I try to create an x/y/z script to generate multiple photos like you, for example with different STEPs, I do get several photos generated, but I don’t have the captions to identify which photo contains which setting. Also, at the end, I don’t see all the photos lined up for comparison at a glance. I only see one photo, and I have to go into my folder to see the others. However, I have enabled "draw legend." Is there something I need to adjust in the settings? Thank you very much for your help.
I just tested in the latest version: I selected the XYZ plot, for X type I put Steps, for X values I put 20,21,22,23, and I enabled draw legend. When I generate, on the interface I get a single image, but in the output folder I get 4 different images without the legend text on them. On the interface I can open that big image with the legend and save it from there; I am not sure why it is not saved with the rest, but if you click on that long image with the legend to open it, in the top left corner there is a save button that will save the big image to the folder, or you can just right-click, choose "save image as", and put it where you want.
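If the grid with the legend ever refuses to save, you could also stitch your own labeled comparison from the images in the output folder; a rough sketch with Pillow, where the filenames and labels are placeholders:

```python
# Minimal sketch: build a labeled comparison strip from saved outputs.
# Filenames and labels are placeholders; adjust them to your folder.
from PIL import Image, ImageDraw

files = ["00001.png", "00002.png", "00003.png", "00004.png"]
labels = ["steps 20", "steps 21", "steps 22", "steps 23"]

images = [Image.open(f) for f in files]
w, h = images[0].size
caption_h = 40  # space above each image for its caption

strip = Image.new("RGB", (w * len(images), h + caption_h), "white")
draw = ImageDraw.Draw(strip)
for i, (img, label) in enumerate(zip(images, labels)):
    strip.paste(img, (i * w, caption_h))
    draw.text((i * w + 10, 10), label, fill="black")  # caption per column
strip.save("steps_comparison.png")
```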
Nice video and great description
Thanks for your efforts
Awesome mate!
Thank you so much for sharing. Your tutorial series are greatly helpful for the starters.
Great, helpful video, thank you.
very good video, thanks a lot this is a gold mine
Could you make a complete guide/tutorial about "Regional Prompter" extension for AUTO and how to get 2 characters interacting? Thx in advance.
I didn't play too much with it yet; I am still waiting for SD3, maybe it can do things better.
@@pixaroma Can you make a video about making short animations? Like the SVD thing?
@@KAVaviation I have an SVD video, but it's for the older version of Forge. Right now there are not many good local video models; I am waiting for a good one, maybe the guys who did Flux will make a nice one for video. Until then I am using online generators like KlingAI and others.
Awesome video, liked
Excellent video. Glad I watched it. Liked and Subbed.
Thank you ☺️
Good video, found this all out the hard way lol
subscribed.
Love these videos
Hi man, I'm your new subscriber, may I ask something? I just bought a laptop with an RTX 3070 with 8GB of VRAM and I want to install Stable Diffusion Forge, but I'm still afraid there might be a virus, and it seems like using the GPU for SD can really heat it up. I'm asking for your opinion on this as I'm still new to this, thanks in advance! Success always for you!
Hmm, I never heard of a problem like that. As for safety, when you download models from the internet make sure they use the safetensors extension instead of ckpt. I have an older computer with 6GB of VRAM and it still works; I never had a problem. If you are worried you can test it: run a game and run Stable Diffusion and see what temperature the card reaches, but the video card should handle this kind of thing.
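For context on why safetensors is the safer choice: .ckpt files are Python pickles, and unpickling a malicious file can execute arbitrary code, while safetensors only stores raw tensor data. A minimal sketch of the difference (filenames are placeholders):

```python
# Why safetensors over ckpt, in code (filenames are placeholders).
import torch
from safetensors.torch import load_file

# .ckpt is a pickle: unpickling can execute code embedded in a malicious
# file, which is the virus risk discussed above (newer torch versions
# restrict this by default for that very reason).
risky = torch.load("model.ckpt", map_location="cpu")

# .safetensors stores raw tensors only; loading it cannot run code.
safe = load_file("model.safetensors")
```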
@@pixaroma thanks for your opinion man, really appreciate it! This really helped me in making a decision 🙏🙏🙏
Thank you for the video. Really nice.
Very good!
I'm having difficulty getting two subjects to interact. As an example, I want two characters to simply "shake hands"; what ensues instead is a disembodied horror show. Advice?
It can be hard sometimes. I either do inpainting, or I combine them in Photoshop and do an image-to-image pass to blend them better. The Flux model is definitely better at that, but it depends on how you prompt; sometimes you get lucky if you add a lot of detail, so just saying "shake hands" might not be enough. You can ask ChatGPT to describe it in better detail, so with Flux you can do something like: Two characters, one a tall, broad-shouldered man in a formal black suit with neatly combed hair and sharp features, and the other a slender woman in a stylish business outfit with her hair in a neat bun, stand facing each other in a softly lit office space, their hands extended in a firm yet respectful handshake, the man's confident grip meeting the woman's graceful, slightly forward-leaning posture, as both exchange subtle expressions of calm professionalism, signaling agreement or partnership in this formal yet amicable interaction.
Love it !!
Good video thanks for the information
What voice software did you use in making this video?
VoiceAir Ai
Thanks for this
Flashbang at 5:18
Sorry
@@pixaroma The video was great and made sense at the same time 👍. I tried it somewhat, not 1:1 but similar. What should I tell Stable Diffusion if I want a sketch style, or something like an anime style or cartoon style?
@@PetrusiliusZwack you can actually use styles, ready-made prompts that you add on top of your own prompts. I did a few videos about that on my channel, both for Forge and a new one for ComfyUI.
Any tips to make SDXL model loading faster? SD 1.5 is faster because it still uses a 512 base model, but SDXL takes longer, like 3 times longer. I'm using an RTX 4060 8GB.
Usually SDXL models are also larger than 1.5, like 3 times larger, maybe that's the cause. I don't have any tips for that; I have an RTX 4090 and I didn't notice any difference :) Plus I haven't used 1.5 since SDXL appeared; for me 512px is too small an image size.
To generate the images quicker, do I need a better GPU, CPU, or RAM?
A better GPU with more VRAM, preferably an Nvidia RTX series card with more video RAM.
Hey, I was wondering, is it possible to automate the process of prompting in the text field in SD? If so, how? My best guess is that you use wildcards here.
I didn't try any methods; I usually just copy and paste from ChatGPT, because I sometimes use images to get prompts. But I saw there is an extension that adds the ChatGPT API, so you have something like ChatGPT inside Stable Diffusion. You can read more about it, but I didn't test it: github.com/hallatore/stable-diffusion-webui-chatgpt-utilities
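Another option, without any extension: A1111 and Forge expose a web API when you launch them with the --api flag, so you can batch prompts from a script. A minimal sketch (the prompt list is just an example):

```python
# Minimal sketch: batch prompts through the Forge/A1111 web API.
# Launch the UI with the --api flag first; this uses the standard
# /sdapi/v1/txt2img endpoint.
import base64
import requests

prompts = ["a red fox in snow", "a castle at sunset"]  # example prompts

for i, prompt in enumerate(prompts):
    r = requests.post(
        "http://127.0.0.1:7860/sdapi/v1/txt2img",
        json={"prompt": prompt, "steps": 30, "width": 1024, "height": 1024},
    )
    r.raise_for_status()
    png = base64.b64decode(r.json()["images"][0])  # images return as base64
    with open(f"auto_{i}.png", "wb") as f:
        f.write(png)
```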
Thank's a lot 💚💚💚
ooo k ay ! Next step, making Weird Science with this thumbnail o0
😂🧪🧬👩🔬
I like Forge but the creator of it has jumped ship. I know there's a couple of branches that are working but I just don't see it sticking around long term. I switched back to my combo of Auto, Comfy and fooocus
Yeah, it's missing some updates; we'll see what happens in the long term :)
Hey mate, what specs does your PC have? e.g. GPU
My PC:
- CPU Intel Core i9-13900KF (3.0GHz, 36MB, LGA1700) box
- GPU GIGABYTE AORUS GeForce RTX 4090 MASTER 24GB GDDR6X 384-bit
- Motherboard GIGABYTE Z790 UD LGA 1700 Intel Socket LGA 1700
- 128 GB RAM Corsair Vengeance, DIMM, DDR5 (4x32GB), CL40, 5200MHz
- SSD Samsung 980 PRO, 2TB, M.2
- SSD WD Blue, 2TB, M2 2280
- Case ASUS TUF Gaming GT501 White Edition, Mid-Tower, White
- CPU Cooler Corsair iCUE H150i ELITE CAPELLIX Liquid
- PSU Gigabyte AORUS P1200W 80+ PLATINUM MODULAR, 1200W
- Microsoft Windows 11 Pro 32-bit/64-bit English USB P2, Retail
- Wacom Intuos Pro M
Did you try Juggernaut? I think not all models support inpainting; for Juggernaut XL I saw in the description on CivitAI that they added inpainting, so maybe that is the cause.
@@pixaroma I mean, is there a way to use normal models to inpaint? I don't know why, but when I'm using Krita AI Diffusion made by Acly, normal models can inpaint, yet in normal Stable Diffusion Forge it's impossible to do it.
Sorry, I don't know all the technical stuff. In Forge UI I used Juggernaut but didn't try others, and I think there are other inpainting models. So if it didn't work, either the model is not compatible or it's a bug in the interface.
THX, much appreciated :)
Thanks!!
Great vid... what causes anatomical mutations, and how do we address this formidable non-intelligent conundrum?
In my opinion the models need more training. They have problems with anything that has a lot of possible combinations: the fingers on a hand can be in so many positions, and looking at a hand from different angles it sometimes seems to have 4 fingers, sometimes 3, depending on the pose; you can hold objects, and each finger bends at multiple joints, and I think that is what confuses the model. Plus, when the model is trained they try to censor it, so it misses some things about how anatomy actually looks. It tries to find patterns, and the more training it has, the better the images it creates, with fewer mutations. It also doesn't know how to count very well. So I think they need to train it in more ways: to make it understand how things look in 3D from different angles, and probably some physics, like how gravity affects things, how objects interact, collisions, etc. But in the future they will probably figure out how to do that.
Phenomenal
Tell me how to make a character but in different poses... for example (plant, put, delight) and so on.
Without training a LoRA model it's not so easy, and even with a LoRA it's not perfect. You can also try extensions like ReActor, but those work better with photorealistic images. There are options with ControlNet and IP-Adapter, but I didn't manage to get consistent results with SDXL models; I saw others using them with SD 1.5 models. From the prompt alone it is hard to get it right. You can also try inpainting to keep the face or head and change everything else. You can get similar results if you accurately describe how the hair looks, how the character is dressed, and so on; try a few generations and look for one that's similar.
@@pixaroma and did you use ControlNet?
I don't use ControlNet for faces; I tried but didn't get consistent results 😃 There are ComfyUI workflows that work, I saw them online, but I mostly use Forge, and I use ControlNet to get contours and poses and to convert sketches, so I mostly use the Canny model.
@@pixaroma which VAE do you like?
I use the automatic VAE :)
This is a Goody 💪💪
How come your Stable Diffusion behaves like a trained dog? I have everything the same but never get what I want.
😂 I don't always get what I want, but with enough tries and the right prompts I get close enough. It depends on the image; there are still things it can't do right no matter what you try.
❤👌🏻
Create a Discord server, no one uses Facebook these days (at least I don't).
With Midjourney and games seeing a rise, I am confident the upcoming generation has Discord accounts.
I created one today, but I don't have much experience with it; I will work on it over the next days.
discord.gg/a8ZM7Qtsqq
0:30 Yes, that's a trained image. Nothing was "generated"; it's copy-paste from other trained data. It can't do a spaceship if it wasn't trained on spaceships; it will make a spaceship-looking toaster because it was trained on both and can combine them. Nothing is generated from 0%; it's not intelligence.
It's "this is an image, this is a caption": if you ask for something similar to a caption it was trained on, you get that image, or combined with others that match. Nothing is generated; it knows nothing beyond what it was trained on, and even then it doesn't really know anything, it's just programmed to do this.
And it's sold as the AI we imagine from movies, when it's just "this is this, so the answer is this"; we already know the answer, we trained it xD
Same as good old Jarvis.
In my opinion it can only do what it was trained on, but the billions of combinations for each different prompt are what make it interesting; it's like having unlimited variations of something. I could not do a job I wasn't trained for either: I learned different things and now I make a mix of what I learned. AI can just produce those millions of combinations that we don't have enough years in our lives to try :) We'll see; in the future, with more training, it will get more advanced.
How is your generation so fast lol? If I dare go over 612x612 it just stops and dies, even with lower-end models.
You need an Nvidia RTX card with a lot of VRAM; I have an RTX 4090 with 24GB of VRAM. I do a 1024px image in 4-5 seconds; if I use a Hyper model it can do it in one second or so. I do speed up the video so you don't wait, but it's quite fast anyway.
@@pixaroma I tried using Forge UI and it is what I needed fr. I have an RTX 3070 8GB, but Forge sped it up to 7s per image, so thanks for the tutorial :D
hmmmm uniforms...
Didn't find the word when I did the tutorial 😂
Hands and feet *smdh*, hands and feet.
😂 Yeah, it has more problems with hands than a fine-tuned SDXL; we'll see what happens. I saw there is a problem with fine-tuning it because of the license. Not sure what features it brings; if not, we stick with SDXL.