By the time journalists finish their articles describing the shortcomings of AI, people have fixed them. The pace is ridiculous.
Funny, because I've been telling people how crazy this wave of tech is. No longer year to year but week to week.
It's going to be a game changer for Midjourney if they can get text to generate accurately before the other AI image generators catch up to Midjourney's image quality.
I'm sure I heard of an alternative that already caught up. Here's the thing: the moment you have created an AI product, others can use its output to create an improvement cheaper and faster than you originally did.
I feel like getting accurate text generation on images is easier than generating realistic images that can fool humans. So I think Midjourney will get there first.
@@GamingDad Yes, I think this is the reason AI is good at generating test cases.
Better yet, Leonardo AI does it for free and better.
There is already an alternative that gets the same quality but can already write text. Its name escapes me, but if you Google it you will find it; it was released like 2 weeks ago.
The phrase "cherry selected" is some next level colloquialization. Love the videos, brother.
Slooooooowly getting closer. One would think text outputs would be possible by now, but it makes sense given the way AI tech works... Cool post Matt. Informative as always.
I always love your content and appreciate the work you’re doing! In other news - I will now be referring to my feet as “leg hands”
Advanced prompting helps a lot for Stable Diffusion. More detailed prompts and negative prompts can get things close to Midjourney's quality.
I'm going to do a deeper dive on Deep Floyd. Just since recording this video this morning, I've learned some better prompting tricks for it. Stay tuned. :)
Yes, negative prompts are very important; I use a full list of them. Proper prompting also depends on the version: the latest XL needs fewer prompt terms but more accurate keywords. Modifiers and artist names can also help the result.
Textual embeddings like "EasyNegative" make this super easy; you can end up with really tiny prompts that give you a ton of detail, especially when combined with LoRAs.
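For anyone wanting to try that combo locally, here's a rough sketch of what it looks like with Hugging Face diffusers. The checkpoint, embedding file, and LoRA names below are just placeholders for whatever you've actually downloaded:

```python
# Sketch: tiny positive prompt + "EasyNegative" embedding + a LoRA, via diffusers.
# All file/model names here are example placeholders; substitute your own.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Bind the EasyNegative textual embedding to a trigger token.
pipe.load_textual_inversion("embeddings/EasyNegative.safetensors", token="EasyNegative")

# Load any SD-1.5-compatible LoRA from a local folder.
pipe.load_lora_weights("loras", weight_name="my_style_lora.safetensors")

image = pipe(
    prompt="portrait photo of an astronaut, 85mm, detailed",
    negative_prompt="EasyNegative",  # one token standing in for a long negative prompt
    num_inference_steps=30,
).images[0]
image.save("out.png")
```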
I follow a couple of YouTubers whose main focus is AI but none of them, not a single one, can match your enthusiasm. This is the main reason why you're at the top of my list.
I've been looking for something like DeepFloyd. More great content from my favorite AI news site.
Hey Matt! Content idea - tutorials on AI-enabled robotics, tutorials on DIY robots we can make at home to do important things like passing the butter
@@ari-enby you pass the butter
I can't explain how much I loooove this channel, plus your thumbnails are always on point.
Thanks @mreflow! As soon as you released that video, MidJourney 5.1 came out. It seems to be doing a bit better with text now but could still use improvement. I did hit a proper "HELLO" on the first try with the prompt: A sign that says "HELLO".
Finally!! I've waited all err, month for this.
Wow! Now I can generate images of president Biden saying "It's Joever" in a fraction of the time!
Interesting how different models have different strengths and weaknesses. Clearly it's not possible to just combine different models.
Makes you realize they're more than just a simple function that takes inputs and spits out outputs.
Great and very helpful video. Thanks Matt
I hope they allow an option to leave in the alien text... It created a dream-like feeling.
I don't think the existing models will disappear. Just like you can still use things like Disco Diffusion or MidJourney V.1 if you want, I imagine you'll always be able to go back to old models or even blend new with old.
I've been waiting for this since I first used Midjourney 😮
I suspect Matt's gravestone will have the quote "This is as bad as it gets."
Thanks for the update Matt.
Thanks Matt! 🙏🏼
Technical question, because you seem to talk at 1.2x speed. Do you actually speed up your videos? 😅
On topic: cool new step in AI. I knew it would come, but actually seeing it work is still surprising.
I will now refer to my feet as leg hands. Thanks Matt!
Midjourney plus inpainting with local SD can give some good text.
Never regret watching. Thanks Matt
Monkey leg hands are also called monkey feet
I can call em whatever I want!
Can't wait to have this in Stable Diffusion.
Great work again Matt. Top notch.
well... our free AI image editing application has already got AI text generation, and it works on 8GB cards; any SD 1.5 model is free and can be used commercially 🙂
I'm glad that you read messages
4th Kardashian and Abe wedding photo. Undeniable proof AI is getting humor!
Paris Hilton and Einstein wedding photos...lol!
None of the paper quilling looked like paper quilling. But I think everything else looked pretty good. Maybe not on par with MidJourney's resolution quality, but it's free? 😃
Love your videos Matt! This is my 14th day of asking you to play the Banjo on-stream since you have it in the background of your videos
Confession: I can play guitar, bass, and ukulele but never learned the banjo. It was a gift and I've never learned to play it. :(
You won't believe what I created using DeepFloyd. It's a dude wearing a baseball cap 🧢 with text written on it, but I added him into a full scene. Bing AI can do text as well sometimes.
Awesome video! Thank you!
When are we getting decent hands/fingers and feet/toes with Stable Diffusion? That's what I'm keeping an eye on.
I have to try that one!! Thanks for bringing this up, homie 🐒!!
You're not getting very good results because having a negative prompt is almost more important than having a positive prompt. I like to think of it as: the positive prompt is about what the image will be, but the negative prompt is what decides the quality. So just write a paragraph of things you don't want, like "bad quality, worst quality, blurry, bad hands, too many fingers, mutated, mutation, misspelled" and so on. You pretty much can't hurt anything with too long of a negative prompt, because if you add something obscure that it doesn't recognize, it won't match anything anyway and will have no effect.
Make it a paragraph literally. That text box is way too small for how large the prompts should be.
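To make that concrete, here is the kind of boilerplate "paragraph" negative prompt being described. The exact word list is just an illustration; there's nothing canonical about it, so tune it to your model:

```python
# An illustrative boilerplate negative prompt of the kind described above.
# Obscure terms the model doesn't recognize simply have little effect,
# so a long list like this is low-risk to reuse everywhere.
NEGATIVE_PROMPT = (
    "bad quality, worst quality, low quality, blurry, out of focus, "
    "bad anatomy, bad hands, too many fingers, extra limbs, "
    "mutated, mutation, deformed, disfigured, "
    "watermark, signature, misspelled, jpeg artifacts"
)

# Works with any SD pipeline or UI that accepts a negative prompt, e.g.:
# image = pipe(prompt="...", negative_prompt=NEGATIVE_PROMPT).images[0]
```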
Love your content Matt
I keep asking this, and I'll keep asking until I get an answer lol. Why can't you prompt the image generator AI, then once it creates an image that is close, ask it to make small or single adjustments? For example: you prompted it to make an image of a human made of leaves or something, and it had a funky nose and mouth. Why can't you just tell it to keep the image and only change the nose and mouth? I run into this with any AI image generator I use.
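What you're describing is basically inpainting: keep the image, mask just the nose and mouth, and regenerate only the masked region. A minimal sketch with Hugging Face diffusers, assuming you've saved the original image and drawn a mask (white where it should be redone); the inpainting checkpoint name is just one example:

```python
# Sketch: "keep the image, change only the nose/mouth" via inpainting.
# image.png is the generation to keep; mask.png is white over the region
# to regenerate and black everywhere else.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

init_image = Image.open("image.png").convert("RGB").resize((512, 512))
mask_image = Image.open("mask.png").convert("RGB").resize((512, 512))

result = pipe(
    prompt="natural human nose and mouth, detailed face",
    image=init_image,
    mask_image=mask_image,
).images[0]
result.save("fixed.png")
```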
I recently had an artist get upset with me since I no longer use her service. Each design was costing me anywhere from $50-200. After the introduction of AI, I am able to pump out similar content (in fact, better content). I feel for her, but such is life. As a small business, my interest is to save as much as possible and put it back into the business. I can only see this becoming a bigger issue for artists going forward.
She’s not entitled to your money. F her.
I don't know how interested I am unless I can use Stable Diffusion 1.5.
"Monkey leg hands" aka feet 😂
Anyone know the reason why we can't train a LoRA to understand a word or sentence we are trying to use? It wouldn't be great, because we would have to make pretty big files as far as LoRAs tend to go, but I figure if we trained something on a large flash drive or something... I wonder how that would go.
1:57 Top G president
Those Stable Diffusion images of Kim Kardashian and Abraham Lincoln are not really showing the best of what Stable Diffusion can do. Stable Diffusion often requires very, very long prompts to get quality results, and conversely, Midjourney gives better results for shorter prompts, but consequently, you have less control over the initial generation.
I often use the same boilerplate prompt text in almost every one of my Stable Diffusion prompts, so it's not even that big of a deal to have large prompts.
The point is, I think it's unfair to compare Stable Diffusion and Midjourney using the same exact prompt.
Also SD 2.x especially requires more specific prompting even compared to SD 1.5 because you can't use real world artist styles by name. You have to describe it manually.
I agree but I was basing it off of their example prompt from their own demonstration. This was not designed to be a prompting guide. Just a quick overview of what's now available. I'll be diving deeper into prompting in future videos. I've also made several past videos about prompting with Stable Diffusion.
PS. I use Stable Diffusion to make my thumbnails. So I've definitely put it through its paces.
Do you use any of SD's custom models and LoRAs/LyCORIS? If you haven't, try out Deliberate, ReV Animated and similar models, and some LoRAs as well. If you do everything well and take your time, the images will definitely have better quality than Midjourney's and be closer to your imagination.
@@mreflow it is made by AI
Good googly moogly this stuff is happening quickly.
What are negative prompts used for?
You plug in keywords that you want to make sure are not in your image.
@@mreflow Thanks buddy, I had a feeling that was the case, but wasn't 100% sure. Thanks again for the great content, you're on top of it and it's awesome!
"I, for one, welcome our AI robotic overlords" is the text I added to the AI generated portrait of myself I recently created.
😁
exciting times ✨
I think in about 2 years we will have something like MJ with the text quality of DeepFloyd as open source.
I'd guess even sooner. Within the next 6 months, I'd guess.
Before 2024
@@mreflow I don't know if it would really be that fast, but I absolutely adore how positive I felt reading your comment.
Thanks
Rishabh
@@friendlyvimana You underestimate the pace at which AI is improving.
3 to 6 months.
You know that Stable Diffusion needs more detailed prompts for quality output.
What if we combined Stability's text ability and Midjourney's hand ability and mixed them together?
Any idea on when will Midjourney have an API?🤔
The fonts used in Midjourney appear to be more stylish and sophisticated, whereas the fonts in Stable Diffusion seem rather generic.
The way one company can't do what the other can makes me think that they're keeping secrets from each other to stay on top of the game/race. Am I somehow correct??
Can you make a video tutorial on how to install Stable Diffusion on a local machine?
Matt, I love your content and you are doing an amazing job overall presenting these new AI tools and models, but the fact that you are comparing Midjourney to vanilla SD while using SD without giving any negative prompts, using no trained models, and generally not prompting in the correct way for SD kinda misinforms people. Vanilla Midjourney is better than vanilla SD, but a fully custom SD setup plus the correct way of prompting is miles ahead of Midjourney. It is just hard to get working right, and that is why it is mostly meant for professionals and people that take AI image generation seriously.
Finally holy crap
Yep, those are the monkey leg hands... hahaha.
It's great, but I want to compare it to what Adobe Firefly can do.
I trained PlaygroundAI to write letters in graffiti style. 😉
Been using DeepFloyd for a while; it requires multiple inputs!
Dude, if you slide the slider to the right it will spell it EVERY time... just press where it says "advanced options".
1:55 Abraham Lincoln by Balenciaga
It's just speculation, but it seems like Midjourney uses a modified version of Stable Diffusion for image generation, and they also trained their models on massive amounts of carefully selected images, which is why the difference between the base SD model and Midjourney output is so huge. So probably after the public release of SDXL, Midjourney will get even better.
MidJourney claims on almost every office hours call that they are not using Stable Diffusion. They are using their own proprietary method.
@@mreflow so the trick is to call them during non office hours..🎉
Yes, Midjourney is nice and all. But how about OpenJourney in SD versus Midjourney?
After like ten attempts I made it write "negligent" correctly!
Is there any way to run it in Discord like Midjourney? Thanks
Imagine dunking on Stable Diffusion while not using negative prompts, LoRAs, specialized models, nothing.
SD is better in almost every way imaginable if you know how to use it.
DeepFloyd is open source, so who knows if Midjourney uses it, like how they used Stable Diffusion before.
Deep Floyd is not currently open source.
This is directly from the Stability website: "DeepFloyd IF is a state-of-the-art text-to-image model released on a non-commercial, research-permissible license that provides an opportunity for research labs to examine and experiment with advanced text-to-image generation approaches. In line with other Stability AI models, Stability AI intends to release a DeepFloyd IF model fully open source at a future date."
What about Microsoft's new Designer tool? It does text and it's kinda like Canva...
Thank you.
Is this working locally yet?
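For those asking about running it locally: DeepFloyd IF is gated behind accepting its license on the Hugging Face Hub, and it runs as a multi-stage cascade, which is the "multiple inputs" thing mentioned above. A minimal sketch of the first two stages following the documented diffusers usage (VRAM needs are hefty even with CPU offload):

```python
# Sketch: DeepFloyd IF locally via diffusers (requires accepting the
# DeepFloyd license on the Hugging Face Hub and logging in first).
import torch
from diffusers import DiffusionPipeline

# Stage I: 64x64 base generation (this is where the text ability lives).
stage_1 = DiffusionPipeline.from_pretrained(
    "DeepFloyd/IF-I-XL-v1.0", variant="fp16", torch_dtype=torch.float16
)
stage_1.enable_model_cpu_offload()  # helps fit on consumer GPUs

prompt = 'a sign that says "HELLO"'
prompt_embeds, negative_embeds = stage_1.encode_prompt(prompt)
image = stage_1(
    prompt_embeds=prompt_embeds,
    negative_prompt_embeds=negative_embeds,
    output_type="pt",
).images

# Stage II: upscale to 256x256, reusing the same text embeddings.
stage_2 = DiffusionPipeline.from_pretrained(
    "DeepFloyd/IF-II-L-v1.0", text_encoder=None,
    variant="fp16", torch_dtype=torch.float16,
)
stage_2.enable_model_cpu_offload()
image = stage_2(
    image=image,
    prompt_embeds=prompt_embeds,
    negative_prompt_embeds=negative_embeds,
    output_type="pil",
).images
image[0].save("hello.png")
```

There's also a third x4 upscaler stage for full resolution, but the first two stages are enough to test the text legibility.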
finally 😁😁😁😁😁😁
It's not even been 2 months since GPT-4, and all of this is getting so confusing 😵
Tried them, all dreadful at the moment; they will hopefully get better.
Midjourney uses Stable Diffusion 🤭
False. MidJourney has developed their own model. They used Stable Diffusion at one time but no longer do (according to them on their weekly office hours calls).
Can you update this video? It seems they closed the model
Hey Matt
The scientific term for "leg hands" is "feet".
I'm pretty sure monkeys have leg-hands though.
Bring on the memes!
Like for Kanye stalking Kim 🤣 🤣
"Sometimes it takes a few generations for you to get to what you want," meant something totally different like 4 years ago.
I can hear the souls of artists leaving their bodies now with this news.
If anything, comics and manga are next.
DeepFloyd is paused... lol
Installed the extension for that in the web UI. Saw it needs money to work. Deleted it instantly.
Sometimes I think I can easily use Photoshop to add text to AI pictures.
Monkey leg hands :) i usually call them feet
I think the monkey leg hands are called feet
"Finally"? It's been like 5 months since all of this became public.
any update on this?
It spells like I do, lol!!
Why don't you ever use BlueWillow?
WOLF in German is LOBO 😂😂😂
Came here for the info... not a Kartrashian.
4:26 says DIⵎT.
When text is available - an entire industry will become obsolete overnight. However, I also feel that the industry is currently vastly overpriced - I will lose no sleep.
Monkey leg hands lol
🐒monkey feet-hands 😆