So much for the early release email I was supposed to get when SD3 came out. Thanks Sebastian!
I think I beat the email by 30 minutes, give or take. Happy to help!
Some SD1.5 paint/drawing styles can do landscapes with a lot more detail per resolution than current SDXL models. In the end you use them all, each for the parts they excel at.
I suppose SD3 will take some months to lift off - we'll see how people use it and whether some good models appear, maybe in 6 months. We'll see.
Yeah, I agree. I see far too many people comparing SD3 with fine-tuned models as if it's a fair comparison. (I like how Sebastian compared to base models here. Much better IMO.)
It will take time to see the true strength of the new stuff (much the same happened with SD1.5 too).
Great comment yo!
@@JustArtsCreations I agree - it's rough, just like when SDXL first came out, especially with lots of great 1.5 fine-tuned models out there at the time.
But the problem is the license - there's no commercial license in version 3.
@@szach-i-mat You must have missed the new license the other day. It's exactly the same as the other models now.
Thanks, Sebastian, for comparing Stable Diffusion, Midjourney and DALL-E 3 in such detail. Your video helps me a lot in making an informed decision about which app meets my needs.
Great introduction. Can't wait to see what the community models bring to the table now.
That mid journey character cracked me up.
Love his cinematic lighting! lol
he needs his own show
"Heay guuys i can doo perrtty imagess"
rofl *_*
great now do it for flux
Thank you. I tried it and it works really well. Waiting on A1111 to do a patch update so we can test the models there as well.
Running the medium version on my 1070 just fine here. Love it!
By the way you were by far the first to upload a video about this so ty
You're welcome! How's the speed?
@@sebastiankamph It's not actually all that bad - the steps take about 9 seconds on average for me, so it's really not far off from SDXL.
@@JustArtsCreations Running it on a 1070 Ti, it's pretty much the same.
@@JustArtsCreations what's the Vram usage while generating with SD3 2gb if i may ask can you check it while its generating in task manager ? Thanks!
@@Eleganttf2 Oh, it maxed out at 8 GB. I should also say that other than my GPU being a 1070, it's a beast of a machine, so the rest of the system is doing a lot of the heavy lifting (i9-14900K, 64 GB RAM) - but VRAM is at 8 GB.
I have been watching your videos since I discovered Stable Diffusion in January - many thanks for all the info you are bringing to the community. The first part of this video with the role-playing was awesome and so funny, well done :) On Reddit and on the UD Discord, people seem quite mad at Stability AI though, regarding the performance of SD3 and its absurd license.
2B or not 2B?... that is the question.
Best comment
Damn, I just posted that myself. I guess I should have checked first. 🙂
no 2b nier :( gotta stick to pony xl for that
Sebastian,
SD3 is GREAT …for text interpretation. I use AI to build graphics/posters for my screenplays.
I need to describe more than one person in my prompt. In SDXL, clothing A would 'bleed' into clothing B - characters were always wearing the other's clothing. I had to submit multiple prompts - I mean a lot - just to get close.
SD3 fixes that. Last night I started resubmitting all my graphics for my show bible. Accurate characters every time.
wait, is it only for Comfy? is there no automatic1111 version yet? im running forge. how do i install??
There's a GitHub issue open on the WebUI Forge repo for SD3 support. This is the last comment, posted 8 hours ago as of the time I write this reply: "@huchenlei could help with this on the dev branch, when he'll have time." ~dan4ik94
So people are working on it, but it might take some time.
Lol same using Forge here
Comfy's almost always quickest to adapt to changes, and the devs working on SD3 are using it themselves to test their shit - support was out before SD3 dropped. There are probably lots of people working on Automatic support right now. Forge is always really slow to update, but will probably deliver a great, efficient implementation if they ever do.
If you can't wait, you'll have to deal with the spaghetti, I guess.
Kudos on the scene roleplaying the different models! That was great, definitely left me laughing 😆
Glad you enjoyed it! Wanted to try something different and I had fun doing it :D
The midjourney bit at the end cracked me up xd
Missing from the tests: actually trying the examples from the research paper in SD3 to see if it can actually do those things.
🎯 Key points for quick navigation:
00:00 *🆕 Introduction to Stable Diffusion 3*
- Overview of Stable Diffusion 3 release,
- Instructions for downloading and starting usage,
- Comparison of the 2B model with the 8B model.
02:00 *🛠️ Key Features and Enhancements*
- Enhanced text prompt understanding and resolution capabilities,
- Introduction of the 16-channel VAE for better detail retention,
- Compatibility with various image sizes.
04:00 *💻 Performance and Requirements*
- Differences in resource requirements between 2B and 8B models,
- Benefits of the 2B model for most users,
- Explanation of diminishing returns with higher capacity models.
06:00 *📊 Research Insights and Comparisons*
- Summary of research findings on improved autoencoders,
- Comparison of FID scores across different channel configurations,
- Examination of perceptual similarity metrics.
08:00 *🖼️ Image Quality and Generation Comparisons*
- Visual comparisons between SDXL, MidJourney, and DALL-E models,
- Discussion on text rendering and image detail differences,
- Analysis of various prompts and their outcomes.
12:00 *📥 Downloading and Using Stable Diffusion 3*
- Steps for downloading models and setting up,
- Overview of different download options and encoders,
- Initial generation examples and settings configuration.
15:00 *🎮 Final Thoughts and Next Steps*
- Encouragement to start using Stable Diffusion 3,
- Mention of future content and continued exploration,
- Closing remarks and invitation for viewer feedback.
Made with HARPA AI
I guess it depends on what kinds of prompts you're looking for. If you go with a more accurate description such as "transparent acrylic pig statue, a small opaque pig statue inside the bigger acrylic statue", you'll get much better results with DALL-E.
A very short description is more like "I'm feeling lucky".
3.0 is horrible. It's broken. Censored. No commercial license. No one will fine-tune it. This is 2.0 all over again. All they showed on Twitter was the 8B, not this one. So sad :(((
Spread the word about what is in the terms of agreement. They are insane.
@@eliparrish9145 Not obvious for now. We need to wait for their answer.
XL is still the only way.
@@thinghy3 Exactly 🎉
The problem with SD3 is that you can't make LoRAs or checkpoints with it for commercial purposes, even with the paid Creator's licence - the ToC for enterprise is not visible.
Here's the relevant part of the Creator's License:
In Section 2(b)(ii), except as expressly permitted in the agreement, You cannot "modify or prepare any derivative work based upon the Stability Technology or any component thereof".
Additionally, the definition of "Derivative Work(s)" in Section 1(d) includes "any modifications to a Core Model, and any other model created which is based on or derived from a Core Model or a Core Model's Output(s)."
So based on these provisions, you are not allowed to train or refine Stability's existing models to create derivative works.
So let's say someone puts a checkpoint or LoRA on CivitAI - every single one will be for non-commercial use only, even if you have a paid licence. Afaik, nobody uses the base models, so unless something changes I don't see much use for SD3 even if we get the 8B param model.
I appreciate the download information being at the end of the video, rather than up top.
A little off-topic, but when and where do we still find SD 2.0 and SD 2.1 useful? Are there specific use-cases where one of these is a better choice than 1.5 or SDXL? As mentioned, 1.5 can still accomplish a lot, has great speed using LCM, uses fewer resources, and has more complete tools and models. Seems like the most facile workflow today would use SD 1.5 for speedy exploration and near-realtime painting, then apply a newer larger model for image refinement. But maybe the 2.x models have specific talents that make them worth including…?
Sadly, not really. They're dead now. What they could do better than 1.5, SDXL now does (and now SD3 too).
Love the skit. Great work.
Have you seen Cascade finetunes? No? Wanna know why? No commercial license. The Pony guy already said there will never be a Pony 3.0 (2B) with this license.
I liked the fact that you explained a bit of the new tech behind the version number, but it could be interesting (in the future though) to get info on the 3 CLIPs, what T5 actually is, the model file specifics, and more technical stuff than just how to install (there are READMEs).
Hope you get to try it a bit first 😇
lovely cant wait to try it out thank you
Excellent video, what great news! Now I'm going to wait for you to release a video on how to make a LoRA with SD3 so I can create my LinkedIn profile picture 😅
This new format is fire!
Glad you like it! Do you think actor Seb should make a comeback for future videos?
@@sebastiankamph Definitely! 👌🏼
I just generated some images locally using the same prompt set I've used to test since 1.5 and they're better than even the SD3 API version. Beautiful lighting, better faces, but hands are still a problem. I just posted my results on the facebook SD page.
That SD3 license is a slap in the face to all those that supported SD for the open-source community. SD3 or Adobe? I can't tell them apart now. The license is a bill-and-cuffs combo.
Can we use it in normal A1111? I'm not really comfortable with ComfyUI at all...
Yeah same, just wait for A1111 to patch it
Lets goooo I waited for your video!! thanks bro :D
Great video. I like the skit.
I'm getting "Error while deserializing header: HeaderTooSmall" with any of the models, do you know if I have to update something? (RTX 3080 10Gb). Thanks!
Try another UI for SD3 - it's called StableSwarmUI or something like that.
Will you make a video for Automatic1111 as well in the future?
Once A1111 updates, you can just drop the file in the /models/Stable-diffusion folder.
Thanks a lot!! Cannot wait to get my hands on it! 😊
Hey Sebastian, what's the best version to download for general use please.... normal, with clips, with clips & t5? I'm still fairly new so I'd appreciate your advice.
Probably the normal one, then use the clips separately, running CLIP only (no T5). So basically what I did in Swarm at the end.
@@sebastiankamph Thank you... so the 4gb model then?
Woah SD3 is here! Heck yes
We can all hear the cooling fans of GPUs in the whole world run wild!
Happy generating, folks!
My little office is already much warmer. Yours too?
SD 3 has now officially made all concerns about climate change obsolete - Chapeau :-)
@@sebastiankamph Absolutely! I am experimenting with it as much as I can without things like pcm, lcm, hyper, turbo etc. out yet...
Hey, at 14:07 how did you connect the two clips without having to go back to the main node? Was there a hotkey? That looks handy.
I think that was just an accidental cut in edit. I dragged it two times. But that feature would be fantastic, maybe it exists.
@@sebastiankamph oh okay got ya haha that makes sense what perfect timing ! Thanks though for the reply eh
Hi there. Is Stable Diffusion 3 free and with private generations? Can it be used for creating stock images?
Hey, yes, this is correct! If you run it locally it's free
@@sebastiankamph great! All that's left to do is wait for step-by-step instructions from the good folks on how to install SD3 on Mac or Windows :)
@@sebastiankamph Not quite understanding this... if I install it locally on my company computer and use it, for example, for architectural images, is it free? Or is it only free if I use it non-commercially?
@@tomschuelke7955 It's free for non-commercial use. Otherwise you need to buy a license.
@@tomschuelke7955 It's free for non-commercial use only. For commercial use, you need a licence, the cheapest of which limits you to 6,000 images/month and less than $1M in revenue. If you want completely free, use SDXL or SD 1.5.
I'm so sad... They completely gutted anatomy because they repeated their mistake from 2.x with an overaggressive filter. At least it can do landscapes...
vote for leftism, get this
@@fabgeb667 Bro... this is non-partisan around the entire world. This has nothing to do with "leftism". Both the left and right are in favor of regulating AI around the entire world. Who do you think nudity offends? 🤣
@@Avenger222 Censorship means the end of the free world, and it has been underway for a couple of years now.
I've downloaded the file and put it in the models folder, but it's not working when I'm selecting the checkpoint.
And the clips in the clips folder? (Unless you use swarm). What's happening?
Hooray !
😊👍😁🙂
Emad, Robin Rombach, Andreas Blattmann, Patrick Esser and Dominik Lorenz & team = Amazing!
Finally here, and horrendous corporate garbage license! Go away SD3!
I really liked the comedy part as a good explanation of the differences. Thank you for this video! :)
My English is bad so I may have missed something, but I did not understand the point of comparing the images of SDXL, Midjourney and DALL-E, given that the video talks about SD3. Why not compare with SD3? I don't understand this video.
But thanks for the good news
He showed SD3 images from the SD3 webpages first, and afterwards tried the same prompts with the other models.
@@tomschuelke7955 Oh ok thanks !
I installed SwarmUI to test SD3, but I just get terrible results with the same prompts that worked in older models. I guess there are some aspects in SwarmUI (I used ForgeUI previously) or in SD3 that I overlooked.
No... the model is just a dumpster fire. Just look at r/StableDiffusion.
Fr
LOL, that was a good laugh, you should do more stuff like that. 😀
You are an unbelievably good actor...
Really excited! So many even better models to come for the community. Now how do I run this? :D
I've been using SDXL and generating at 1080x1280 just fine - no bad hands or weird things, and excellent quality too.
In case you're too afraid to download the 8B and 16B encoders, they both have a low-VRAM mode that I've run on a 3070 (8GB). I have 32GB of system memory that gets maxed out when loading the 16B encoder, so if you have less RAM you may not want to bother. They both produce similar images, at least in low-VRAM mode, so I'd stick to the 8B encoder.
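If you're hitting the same memory ceiling outside of a UI, here's a minimal sketch of one workaround, assuming the Hugging Face diffusers library and its StableDiffusion3Pipeline (not necessarily what the commenter above was running, and the repo id and settings are just illustrative): skip loading the big T5 encoder entirely and rely on the two CLIP encoders.

```python
# Sketch only: run SD3 Medium without the large T5-XXL text encoder to save memory.
# Assumes the Hugging Face diffusers library and access to the gated
# stabilityai/stable-diffusion-3-medium-diffusers checkpoint; adjust for your setup.
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    text_encoder_3=None,   # skip T5-XXL; only the two CLIP encoders are used
    tokenizer_3=None,
    torch_dtype=torch.float16,
)
pipe.enable_model_cpu_offload()  # keep idle submodules in system RAM instead of VRAM

image = pipe(
    "a photo of a red fox reading a newspaper in a park",
    num_inference_steps=28,
    guidance_scale=7.0,
).images[0]
image.save("sd3_no_t5.png")
```

Prompt adherence drops a bit without T5, but it's a reasonable trade on 8GB cards.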
I'm new to this; I didn't understand what you did in the interface and how you're supposed to configure ComfyUI.
I managed to follow the docs but I get the following error when trying to "Queue Prompt"
AttributeError: 'NoneType' object has no attribute 'tokenize'
What is the name of the extension for ComfyUI that adds a performance monitor to the panel on the right?
Crystools
Hey, do you know which EXACT version of Stable Diffusion the Perchance AI art plugin uses?
umm... you didn't mention how to get the nodes for these example workflows. Installing missing nodes via the manager doesn't seem to work. What did I miss here?
They're in Comfy by default. Just make sure you have the latest version.
found out about the release from your vid. Thanks!
I would like to point out that the license for SD3 is completely useless in its current state. This needs to be sorted out before people will invest ANY effort into SD3. This is very, very disappointing. Just an example excerpt from this mess: Creator License, $20 per month, with the number of images generated limited to 6,000/month. I wonder what Stability wants to achieve with it. How do they even plan to enforce this? Too many nasty questions.
All while using the open-source community as their personal honeybees.
You should have done more comparisons with actual SD3 :)
What is "safe"? More accurately, what was unsafe about previous versions?
I am excited to give this a try, but I am really looking forward to the AD model! :D
you the guy in the video?
Running the Medium version on a 3060 Ti: 1024x1024 in under 25 seconds at 28 steps. Still does weird things with limbs...
What was your speed with SDXL?
Why are they still releasing 20+ step models when we have 5-step models?
@@AgustinCaniglia1992 Faster usually means worse quality, and LoRAs rarely work with them, so there's less variety and less unique art.
@@godlesschannel7730 I have been using the DreamShaper, RealVis and Juggernaut Lightning versions and the quality is amazing. LoRAs work well in my experience as well.
In my SD Models folder there is no Clips folder. Where do the 4 clip files go?
/models/clip/
Can it be used with Forge UI? I'm not familiar with ComfyUI.
Not yet, I tried.. Will probably be updated soon
How do I get this for Forge UI or A1111? Which files should I download?
Not available yet, needs an update from ui devs.
Sir you are the winner
You will probably have to use T5+CLIP to get the prompt adherence from the paper.
SD3 is a model, not an actual update of Stable Diffusion "itself"? I'm so confused, please someone explain it to me.
Stability AI - The parent company
Diffusion - an AI image learning technique which converts noise (nonsense dots) into a coherent image after several passes through its model.
Stable Diffusion - The name of Stability AI's diffusion model.
DALL-E - the name of OpenAI's (the ChatGPT owner) diffusion model
There have been three main releases of Stable Diffusion (there are actually more, but the following three are the most relevant)
SD 1.5 - the start of it becoming mainstream. Many derivative models are based on it (512x512 pixels, ~900M param model)
SDXL - an improvement in text generation, image quality and resolution (1024x1024 pixels, ~3bn+ param model)
SD 3 - there are multiple models; the one released today is SD3 Medium, so called because it has 2B params (the API uses the larger 8B param model, and their research paper concentrates on the 8B param model). The Medium model is smaller than SDXL, but it should generate similar-quality images because of better training and a better architecture. The main improvement in the SD3 models is prompt adherence, i.e. it will draw more complicated prompts accurately (because it has better text encoders - these are the things which translate your prompt into something the model can understand), and it has the best text generation.
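To make the prompt-adherence point concrete, here's a minimal sketch of running the 2B Medium model locally, assuming the Hugging Face diffusers library and its StableDiffusion3Pipeline (the repo id, prompt and settings are just illustrative, not from the video):

```python
# Sketch only: generate with SD3 Medium (2B) on a multi-subject prompt with text,
# the kind of thing older models tend to mix up. Assumes the Hugging Face diffusers
# library and access to the gated stabilityai/stable-diffusion-3-medium-diffusers repo.
import torch
from diffusers import StableDiffusion3Pipeline

# The three text encoders (2x CLIP + T5) are loaded automatically and turn the
# prompt into the conditioning the diffusion model actually understands.
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=torch.float16,
).to("cuda")

prompt = ("a wizard in a red robe and a knight in blue armor standing side by side, "
          "holding a wooden sign that says 'SD3 medium'")

image = pipe(prompt, num_inference_steps=28, guidance_scale=7.0).images[0]
image.save("sd3_medium_test.png")
```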
Try comparing between the models that they benchmark against themselves in their paper: SD3, DALL-E 3 and Ideogram. Especially in typography, Ideogram 1 is king. (page 10)
StabilityAI is so back!
The Dall-E pig-inside-a-pig generations remind me of that jontron episode where he checks out cursed Frozen flash games (Disney bootlegs episode for anyone who's curious).
Should I move away from Automatic1111 and Forge? I noticed he didn't show either in his examples.
Just wait for the update for em
How can I get the ModelSamplingSD3 node? I don't have it 🥺
Were the comparison images made with SD3 or SDXL?
Wooooo!, It's here!
I checked these four prompt examples in Ideogram and it smashed all four of them. Moreover, it did the frog and the pretzel better and more accurately than DALL-E or SD3.
What file should we be downloading for A1111?
Not sure as A1111 is not yet supported.
Its not supported in a1111 yet. Wait for a1111 update.
Can I use SD 1 on an iPad 9?
Not as a local install. But you can use cloud solutions like ThinkDiffusion. Or install it on a local PC and use that IP.
Does Sebastian mostly use Comfy now? Haven't seen much Auto1111 lately.
Not much happening with a1111
Comfy just allows so much more creative freedom in how you use SD.
Auto is basically doing the same few manual steps over and over again, maybe mixed up in a different order.
Comfy, on the other hand, is a source of unlimited content potential.
Was expecting a tutorial on how to install and setup SD3.
See my "How to install comfy" in the description, then download the model from SD3 video and use the workflows provided.
After using SD3, should we delete SDXL?
No, you can keep it if you want. Up to you.
Request: a video on OneTrainer LoRA training for 8GB systems, with settings for us potato users.
Include the cmd torch install and the versions of Python, Nvidia drivers etc. needed to run it. I have tried - the LoRA gets made, but it's not working or training correctly.
That MJ blue/red light!! 🤣
10:50 All of them were wrong. A translucent pig inside another pig would just be a picture of a pig.
My man, read it again. The prompt is "Translucent pig. Inside is a smaller pig". None of them got it right, but it looks like all of them have better prompt adherence than you 😅, no offense of course ✌️
I can't wait to train this.
What are you going to train it on first?
@@sebastiankamph bwoops?
@@goodie2shoesbobs and vagene
@@sebastiankamph People's faces and royalty free photography as well as my own. I do stable diffusion photoshoot packages and I have a feeling these extra layers are going to provide precisely the boost in flexibility I've been looking for.
@@xilix Sorry to disappoint. It doesn't work with humans. They destroyed anatomy. Not suitable for LoRA/Dreambooth of humans.
That kinda didn't age well.....but then again, in the near future, it may have..... Great video regardless. It still helped me understand the differences between the different models a little better.
Here's a money idea! Just remember me, whoever hits it big. Once AI can consistently generate hands, it will be even more difficult to distinguish AI from non-AI art. The idea: make something that can analyze images in depth and easily tell the difference between AI and real art.
Anyone know the Fooocus Version he is talking about?
I know RuinedFooocus is working on it as we speak
Will this work in A1111?
Not yet.
SD3 is an absolute joke! :D
Even my standard Pony models create better limbs.
holy shit this release is so scuffed
I hear that it's free to download, but I'm hearing many others say it's going to cost $20 a month to use.
Thank you
You're welcome!
Does it work with ZLUDA?
Gave it a quick shot, and it worked out of the box to my surprise. But it has a hefty bias - not as extreme as the Google one, but there seems to be a strong preference for "people of color". I also got twisted limbs basically instantly, deformed hands, and even people without faces, which may be due to some censorship effect. But we'll see what comes out of this. As long as there's no ControlNet support, its usefulness is limited at best anyway, I'd say. Otherwise, a 2B model also seems to be a good choice, considering that's right in the middle between SD 1.5 at about 1B and SDXL at 3.5B.
A bit disappointing that the SD3 results miss the mark by so much and that you didn't include them in the side-by-side comparison. Also, what's the point of the snarky comments at MJ? It clearly had the best results in your own tests...
Taking a wild guess this won't work on A1111.
Sadly not yet as their last update was 2 weeks ago. But I'm sure it won't be too long.
@@sebastiankamph Surprised I was able to get SD3 running on Comfy with a GTX 1660 Ti.
Please talk about the license and the limitations it imposes for any kind of use other than hobbyist, even for you as a YouTuber.
10:55 You can see the DALL-E one also has a lot of diversity in the filter with the black wizard lol. Censorship and wokeness should also be taken into consideration.
Do people still use Fooocus?
Can't we run it on AMD? 😢
Hmm, how would you be able to see the translucent pig, if it were inside the smaller pig? ;)
Now......how do we train it?