In my experience, Dall-e builds off of more realistic elements, but I feel like it gets tripped up by the differing styles of its sources and can look like a patchwork. Midjourney always creates a unified style, and that gives it an aesthetic baseline.
But SD is outright better than MJ across the board, which is why MJ is trying to catch up with SD now; meanwhile SD is about to release a new update too. SD is king right now.
SD is not that good!
@@nasionalsb You have to learn prompt engineering, and it gets even easier in v1.5, SD is fantastic and far surpasses Dall-E with the right prompts.
@@xbon1 Ive used all of the most common ones and MJ just feels better.
Great breakdown of the major variables when using both tools! Kudos! 👏👏👏
Thanks Kevin, glad you found it helpful!
I sometimes use "Trending on Artstation" with DiscoDiffusion - it seems to give colors a more vibrant look, with a preferences towards solid swaths of color; I call it almost a claymation effect.
Ya, the prompt may work with discodiffusion, but doesn't seem to affect midjourney. I suspect many of the people using it with midjourney are using it because they got it to work with disco first :)
Regarding 9) “text and image prompting” Try this: Put your image reference in an editor, scale it down and place it in a corner of the canvas, then export as a transparent png. In Dall-e, load the png, choose edit, and erase any small part of the transparent background. Then just use basic text prompts with no reference to a style or to the corner image. Dall-e will fill in the blank space nicely. As a last step, you may want to edit the result you like and then erase stuff from the reference corner. Done! Dall-e has you covered.
Haha, nice trick. Sort of a variation of the one where I loaded in the full-size image and erased most of it, replacing it with the text prompt. Will give this alternative a try!
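For anyone who wants to script the prep step of that trick, here's a rough sketch using the Pillow library. The filenames, canvas size and scale factor are placeholder assumptions of mine, not part of the original tip, so adjust them to whatever the Dall-e editor actually expects.

from PIL import Image

# Placeholder values; Dall-e's edit canvas is square, e.g. 1024x1024.
CANVAS_SIZE = 1024
SCALE = 0.25  # how small the reference ends up in the corner

# Load the reference image and shrink it down, keeping its aspect ratio.
ref = Image.open("reference.png").convert("RGBA")
ref.thumbnail((int(CANVAS_SIZE * SCALE), int(CANVAS_SIZE * SCALE)))

# Start from a fully transparent canvas so the empty area reads as fill-in space.
canvas = Image.new("RGBA", (CANVAS_SIZE, CANVAS_SIZE), (0, 0, 0, 0))

# Paste the shrunken reference into the bottom-right corner.
canvas.paste(ref, (CANVAS_SIZE - ref.width, CANVAS_SIZE - ref.height), ref)

# Export as a transparent png, ready to upload to the Dall-e editor.
canvas.save("corner_reference.png")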
Woah, great video mate!
You rock, always on the cutting edge, someday we will get a 3d Model from this.
Thanks terrablader, and yes, getting 3d models would be super helpful. Not sure when that will be possible, but the tech is growing fast.
Thanks again Neil for a great video.
I've been using them both for a while and would agree with all your observations.
Just wanted to point out, with Midjourney you use the --iw command to change how similar the variation is to the input image. Default is 0.25. So if you want something really similar to your input image, you can do like --iw 1 (or I think it even goes higher).
Yup, that's the parameter I've been using. However, with all the tests I've done, even --iw 100 still produces images that are pretty different from the original image. I'd love a 0 to 100 scale, where 0 doesn't use the image at all and 100 produces an almost exact duplicate of the image with only the most minor of tweaks, and of course all the values in between.
I personally use the "--stylize" parameter to avoid too much interpretation from the source or text prompt.
@@ArtOfSoulburn MJ devs have said that image prompts don't work like that, the software takes inspiration from the image but doesn't "start from it", though they've hinted that they may add that functionality at some point
@@northwind6199 Yes, from what I understand there's an init image and an image prompt. The first kind starts with your image and then works from there. The second type starts with a pure noise image and then tries to head towards your image. MJ can currently only do the second. Not sure from a technical perspective which of these two is necessary to get what I'm looking for, but would love better control over how close the final result is to an image, either in shape / composition or details. And the current method is ok, but could be a lot better.
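To make that distinction a bit more concrete, here's a toy sketch of the two approaches. This is not Midjourney's actual code; denoise_step is a made-up stand-in for the real diffusion model, and the parameter names are just for illustration.

import numpy as np

def denoise_step(x, prompt):
    # Stand-in for one denoising step guided by the text prompt.
    return x * 0.95

def init_image_generation(init_image, prompt, steps=50, strength=0.6):
    # "Init image" style: start FROM the user's image. Add some noise to it,
    # then denoise for only part of the schedule, so the original composition
    # and shapes survive into the result.
    x = init_image + np.random.normal(scale=strength, size=init_image.shape)
    for _ in range(int(steps * strength)):
        x = denoise_step(x, prompt)
    return x

def image_prompt_generation(ref_image, prompt, steps=50, image_weight=0.25):
    # "Image prompt" style (reportedly what MJ does): start from PURE noise
    # and, at every step, nudge the result toward the reference image as well
    # as the prompt. The reference inspires the output but is never the
    # starting point.
    x = np.random.normal(size=ref_image.shape)
    for _ in range(steps):
        x = denoise_step(x, prompt)
        x = x + image_weight * (ref_image - x)
    return x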
Dall-e is also good for making abstract signage and other "greebling" (for lack of a better term) for artworks and designs. As an ACP 3D modeller I'm loving Dall-e for making tons of concept art that I can model and practice my 3D skills on.
Over the past few weeks I've done very extensive testing on dalle2 and a little on midjourney (10k+ images, specifically recreating various anime styles).
How they behave in response to prompts is extremely different (like night and day).
Dalle2 very clearly seems to me to have been trained on something like google images, or very close to it.
Whereas midjourney very clearly seems to have been trained mostly on art and paintings.
Dalle2 seems to compare how similar every word is to every other word and tries to keep the similarity between all words as high as possible.
Because of this, the context of every word in your prompt relative to every other word will make a big difference to the output.
Whereas midjourney is a bit more of a mystery to me; it definitely doesn't work like this. It seems to be more subject oriented, maybe?
I find that to get the best out of dalle2 I'm using the entire prompt limit. Because most of its training data is in a realistic context, I need to fill it with keywords relating to images of similar style but different subject, so that when I give it a subject in the prompt it can accurately translate it to the style. The more artists, shows and content of a similar style you add to the prompt, the better. Also, the ordering, positioning etc. of the words doesn't seem to have an effect in dalle2, so your prompt usually ends up as a mass conglomeration of words.
Whereas Midjourney seems to absolutely hate a mass conglomeration of words, and much prefers more straightforward and more grammatically ordered prompts.
I've been able to get much better results from dalle2 compared to midjourney, at least with anime.
Though dalle2 does struggle with full-body anime art: it's extremely good at face and/or upper-body shots, and shots from the neck down, but for anything involving both head and legs I've yet to get perfectly clean results. The results I'm getting for upper-body and face shots, with the correct prompting, are completely indistinguishable from top pixiv artists. But with dalle2 it becomes very tricky to get the subject to do anything specific without deteriorating the quality of the styling, since it has no concept of whether something is a style, subject, object, verb etc. All words are treated the same.
Also, if you want an image of multiple objects interacting in specific ways, it won't understand which object is related to which words.
For example, a red pen and a black stick could just as easily give you a black pen and a red stick. Putting short white hair in the prompt will also make your character a little shorter and their clothes a little shorter... etc.
Dalle2 is also very trigger-happy about disallowing any keyword that can remotely be used in a negative context. Which means that if you're trying to create a negative mood in your art, maybe you want a depressed, unhappy character, it's not allowed; everything must be happy as Larry, otherwise dalle2 says no. So it does have a very apparent and obvious bias towards positive and happy moods, and it will give you better results if you make your subject a happy one. Which is often very limiting if you want a character to look upset or depressed etc.
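A toy way to picture the "every word compared to every other word" idea from the comment above is pairwise cosine similarity between word embeddings. The vectors below are made up, and whether Dall-e actually treats prompts this way is the commenter's speculation; real systems use a learned text encoder such as CLIP.

import numpy as np

# Made-up 3-d "embeddings"; a real text encoder produces high-dimensional
# vectors, but the pairwise-similarity idea is the same.
embeddings = {
    "anime":      np.array([0.9, 0.1, 0.0]),
    "cel-shaded": np.array([0.8, 0.3, 0.1]),
    "warrior":    np.array([0.3, 0.4, 0.8]),
}

def cosine(a, b):
    # Cosine similarity: 1.0 means the two word vectors point the same way.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Compare every word in the prompt against every other word.
prompt = ["anime", "cel-shaded", "warrior"]
for i, w1 in enumerate(prompt):
    for w2 in prompt[i + 1:]:
        print(f"{w1} vs {w2}: {cosine(embeddings[w1], embeddings[w2]):.2f}")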
Thanks for the post, sounds like you've also been doing some very detailed testing. I've observed many of the same things you have; there are obviously some very different results due to the dataset each used, and then I assume the AI itself has been given biases by the programmers to weight certain things higher (from what they say of Dall-e, many of the biases that were added are to undo other biases). I'm starting to play with Stable Diffusion now too, and am running into a whole new set of biases and differing results. I suspect as more and more of these AIs get made, each will end up with a sort of "house style", which perhaps people will go to for specific types of projects. Anyways, thanks for your post, I think getting more specific testing out there in the world can demystify some of this stuff, even if only a little.
Cool, this video featuring faces from Dalle 2 prompted me to search about the rule, and it's allowed now. So thanks for the info :)
Very useful and instructive comparison. Thank you !
Thanks, glad you enjoyed it!
Very detailed video and explanation! It was really interesting to see how different the two generators are given similar input. I'll be curious to see how these AIs evolve over time and what type of options artists will have to use them. Looking forward to more videos on this topic from you!
I'm always curious, when we see some of the photoreal AI solutions, whether there are photos online that would match up to specific parts of those AI-generated images. Is the AI kind of photobashing its interpretations together to generate images? It's really crazy how this is such a hot topic in the concept world right now, for better or worse...
Ya, the link between how it "learns" from other artwork and the results it paints is a little nebulous. I get the impression that it does photobash, but not in a traditional way; I doubt a tree in its painting will match the same tree in one of its training images, but maybe a branch or a leaf will look very similar to what it learned from. But to truly wrap my head around it, I'd need to have access to a dataset, make it small, and see how close the AI images get. Anyways, glad you liked the video!
@@ArtOfSoulburn As I understand it, most AIs process images by pixel relation: they don't look at the whole picture at once but at how a few pixels relate to each other, so imagine photobashing but using only 1% of each original image at a time.
Amazing breakdown! Thanks man!
Thanks Ten, glad you've enjoyed it!
Excellent
Excellent content, as always
Thanks Lucas!
Point number 9 is fascinating, I had no idea it worked like that with dalle2. I'll be experimenting with this straight away.
Cool, glad you found that part interesting, and hope you find your tests informative!
I see lots of Syd Mead, Jean Giraud/Moebius & Geof Darrow + impressionist painters like Monet influence in Midjourney. It's really inescapable for Midjourney. They need to broaden their input prompt or whatever they're adding to their AI.
If you don't define the style you want to use, Midjourney's output tends toward a certain mix of styles that gets repetitive very soon. But you can always influence the style of a picture.
A magnificent exploration. Thanks.
Glad you enjoyed it!
Thank you for this video. I just started making youtube videos with ai and music, and this gave me much more insight :) Subbed
Thanks, glad you've enjoyed it!
This video is extremely helpful, thank you
Thanks, glad you liked it!
great work
Neil Blevins.... it has been a while! hehe I can still remember your creepy creations and your wealth of free scripts back in the early 2000s :-) Nice to catch up.
Hey Olivier, always happy to chat with a longtime fan :) It has been a long and winding road, but yup, still making creepy creations, just with tools that have evolved for 20 years :)
great watch.. thanks for sharing!
Glad you found it interesting Chad!
Great content. Breakdowns like this are very useful. Could you try how high image reference influenced the overall results? Or how to use “artists artstyle” effectively?
Glad you like it. I'll let others speak to using "In The Style Of" effectively, as I'm not terribly interested in emulating another artist's style. And can you explain a little more what you mean by "high image reference"? I'm not sure I fully understand the question.
Hey I know that guy. Great video and tests. Thank you for the disciplined tests.
Haha! Hey Sam! thanks, glad you dug them!
Interesting that "octane render" seemed to take the "Oct" in the alien prompt as if it meant octopus.
Great video, I'm learning a lot, thanks.
Interesting, maybe it is just me, but I am seeing machine learning as very similar to human bias in subtle ways; I really had not considered the relationship before. You are what you know and how it was presented to you.
It learned by looking at our artwork, so yes, I agree, it's going to have the same biases we have unless you work hard to train it otherwise.
I think Tom Scott did a video a while back on how AI can't be a solution to remove human bias because it can't escape the biases of its creators.
Is it possible to have it generate different views of the same object, like the different viewports in a 3D app? Front view, top view, side view, etc?
Not reliably yet. But hopefully one of them will work on it soon.
12:42 this should be the other way around, right? More photorealistic from Dall-e and a painted style from midjourney.
So what I meant in that part is while dall-e defaults to photoreal and midjourney defaults to painterly, with the right prompt you can invert it. For example, if you ask Dall-e to make an oil painting, it can produce something that looks painterly. But if you don't specify, it'll default to photoreal.
I wouldn't consider "Detailed" to be that random or different. Compared to the defaults, it produced close-ups for everything except the alien heads. As for why there are no close-ups for the alien heads, I'd guess it is because it also seems to focus its "Detailed" close-ups on eyes/heads when present, and the alien heads are already just close-ups of heads with giant eyes.
To me the interesting thing is that "detailed" produces very different results for midjourney and dall-e, which goes to show the differences in either the software or the training set. It's fine that it interprets detailed as a closeup, it's just something to note to give you better control over the final results.
also, try "photo of" - I did "photo of an alien in a shopping mall, 1985" very cool results on DALLE
Can we just appreciate Midjourney's art at 12:08
Sure, it's a cool image.
So, no one is interested in the photos of people sitting NEXT TO chairs? That woman seems to be sitting on an invisible chair!! I like these kinds of "failures", they're trippy and fun.
Haha! Oh I enjoy them too, we used to love looking at renders that failed at work where say the person's skin didn't render but their eyes and teeth did :)
Space is dark, so dall-e nailed the background too.
You're confirming what I think of dall-e: it has been made dull by a corporate mentality, it's hard to access, and it doesn't let you do what you want.
I'll turn to midjourney and stable diffusion when it's available.
Regarding camera views in Midjourney. I've tried 'from the waist up' and it just seemed to ignore it.
Yup, I assume if it had training images using those words it might be a different story.
Hi, I'm quite new to AI art but as a product designer it's quite interesting. Can anyone tell me if it's even possible to get very clean renderings? For example if I ask for a nintendo gameboy, it looks good but the buttons are not round and the texture is not quite smooth. I do see these kinds of pictures on social media but I can't believe that they do this only with AI?
Right now these tools are still pretty simplistic, so getting super clean results is quite difficult, and will likely require specific training to get exactly what you're looking for. For example, I suspect you'd need hundreds of gameboys in the training data and tweaks to the model to analyze and really copy that sort of detail, and Nintendo might not be too happy from a legal perspective having their product given such attention. That said, things have improved a lot in the past month since this video was made, might be worth giving Midjourney V4 a try for example and see if it's doing a better job.
From what I've noticed, Dall-E is good with less information. Typing phrases like "viewed in" muddies up what it looks for. Giving it direct lens sizes, angles, etc. seems to work best.
Interesting! At one point you talked about using Midjourney for early concepts and DALL-E for later refinement. I'm not sure if I understood the workflow suggested. Do you mean using DALL-E as a later refinement of the Midjourney concepts once satisfied with them? Or later refinement in the sense of taking the final art you made inspired/ref'd by Midjourney's output and asking for DALL-E variations to get references for your own variations? Or maybe both?
So what I meant by that was that midjourney's variations tend to be in some cases quite different from the original, whereas dall-e variations are closer to the original. So you might use midjourney at the earlier stage of production where you want more inventive variations. And if you want only slight variations, it may be best to use Dall-e's variations and edit tool. Ideally I'd love both to have a slider going from "Very Different" to "Very Similar" so I could have control over this, but right now neither does this very well. Let me know if that makes sense!
@@ArtOfSoulburn oh, yeah, I see! Thank you!
"Octane-render" made Dall-e think alien should be more octopus-like XD
Otter?!?! That's clearly a ferret. LOL!
Haha! It's a fuzzy mammal and the video was recorded off the cuff, so you'll have to forgive my mislabeling :)
So as Batman is about to fall into the public domain, could someone redesign the costume, props, etc. and copyright those (not the character)?
Can we talk about the woman who is also a chair at 6:13?
Oh sometimes you get really wacky results :) Pretty much par for the course :)
The keywords hyperreal and 4k get you the best results in midjourney
Adding "Pixar style" in DallE does a lot.
Haha, I'm sure it does. However as a former Pixar artist it would feel super wrong using that in my prompts :)
@@ArtOfSoulburn it basically turns everything into cute rendered characters 😀
GOOD INFO
Thanks!
midjourney definitely does its painterly look intentionally. they have a guiding layer with specific "aesthetic" images that it tries to guide towards (hence the common purple/orange color scheme)
they've introduced new, more photorealistic modes using the --testp param. excited to see these models / pipelines evolve
How can you say "detailed" didn't do very much? Based on every prompt the detailed version shows more detail on surfaces, for example the texturing and stitching on the stuffed animals, much more small graphic details on the robots, and texture patterns on the chair seats all missing from the "regular" prompt.
So I get into a more detailed analysis on the Midjourney graphic in my other video ua-cam.com/video/5PGjVWU599s/v-deo.html If you've watched that and still disagree, that's totally fine, but I stand by my subjective analysis.
Dall-e has the “problem” (not really, but it limits creativity considerably) of very strict codes of conduct, where if you ask for things like “bones”, “stab”, or any prompt assuming violence, then that can cause you to lose access to the program.
Yup, ran into that. Some odd ones too. I asked for "Warts" and it gave me a censored warning, so I changed the word to "Lumps" and it was ok. I can guess why it may not have liked the word warts, but I was looking for warts like the kind you get on the face.
MJ is now being inundated with pervs trying to make porn..reported someone who spent hours creating a very specific aged girl (17) in very specific lacey lingerie...yuck
@@ArtOfSoulburn likely it just censors anything with "war" in it
I think your face images are being skewed by the word "portrait", which automatically is going to drive more side-on images. Try "frontal full face image of a..." or something like that. But yeah, I've found that DallE2 was definitely trained by scraping more photostock websites; I assume MJ didn't use those as much. I'd recommend checking StabilityDiffusion out too, it's kind of a middle ground between MJ and DE, but has been trained on the LAION dataset, so you can actually see what the dataset returns for a given keyword. I kind of prefer the MJ visual aesthetic, to be honest, out of all of the ones I've tried.
Interesting, thanks for the tip, removing portrait and doing 10 tests, I did indeed get consistent frontal views. However, removing portrait didn't seem to help the 3/4 test much. See some results here: neilblevins.com/temp/faces.jpg But again thanks for the note, I wouldn't have thought the word portrait would cause some issues. I'll give stabilityDiffusion a try too, thanks!
@@ArtOfSoulburn I think a huge part of this will be figuring out tools to develop a better understanding of prompts. A huge amount of community effort is going into exactly this, a ton of artists doing all sorts of studies. Definitely check out stable diffusion though, it's got a really interesting feature where you can give a seed value and it'll generate the same basic scene (same structure), but allow you to change the prompt slightly to get a variation on a scene you like. Once I get access to the model weights, I'm planning on doing some prompt morphing (same seed image, prompt changing over time). Will also be building it into a version of PureRef. You can see a version of the prompts changing with the same seed in the latest video on my channel.
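For reference, the fixed-seed, changing-prompt workflow described above looks roughly like this with the Hugging Face diffusers library. The model name, seed and prompts are just example assumptions, not what the commenter actually used.

import torch
from diffusers import StableDiffusionPipeline

# Example setup; requires a GPU and the model weights downloaded from the hub.
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe = pipe.to("cuda")

seed = 1234  # reusing the same seed keeps the overall scene structure similar
prompts = [
    "a lone lighthouse on a cliff at sunset, oil painting",
    "a lone lighthouse on a cliff at sunset, in a thunderstorm, oil painting",
]

for i, prompt in enumerate(prompts):
    # Re-seed the generator for every prompt so each run starts from the same noise.
    generator = torch.Generator("cuda").manual_seed(seed)
    image = pipe(prompt, generator=generator).images[0]
    image.save(f"variation_{i}.png")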
@@zoombapup As well as understanding prompts better, I think the datasets these AIs are being trained on could also use some tweaking. Right now most of them are aiming for the internet meme market, not the professional concept art market. So at least for the kind of work I do, I think a lot of this will just be play until a company that actually wants to sell to this market steps up and provides the necessary tools.
@@ArtOfSoulburn Yeah, I suspect a lot of fine-tuned models with specific requirements in mind will pop up. I can see a concept art specific model being really popular. Honestly I think MJ is basically that anyway. But you can always train your own (there are loads of tools from the LAION community in terms of datasets, things like LAION Aesthetic, which I think formed a big part of stablediffusion). It's a fascinating area, but still really underexplored for practical usage.
@@zoombapup To note, MJ being for concept art really depends on your definition of concept art. As a professional concept artist, 99% of the stuff you see in the "art of" books, the big environment vista painting for example, is about 1% of the actual job. Most of the job is refining small details, or showing the same design from several angles. Plus some of these AIs are being advertised as "use this and it'll replace the need for a concept artist", and those AIs will obviously not get usage in the professional concept art community because they're not looking to work with the artist but to replace them. I think the next video I do will be a discussion of exactly what a professional concept artist would want from these AIs. I've touched on the subject in all my videos, but maybe doing a nice focused lesson would inspire some of the tech folk to consider making something more targeted.
For 3/4 view I usually ask for "head to torso" to get better results
So when I say 3/4 view, I'm talking about rotating around the figure, seeing the figure halfway between front and side, not 3/4 of their height. Maybe it thinks that's what I want :)
I was kinda pissed DALLE2 doesn't have Boris Vallejo style.. damnit!
Midjourney creates nightmares. I've spent some time playing with it and after a while the uncanny valley really got to me. I still like it but it hurts my brain.
Why didn't you put them side by side for AB comparison bro? Woulda been better. Still interesting content, thanks.
I'm confused, there's a bunch of AB comparisons in the video. Do you mean the giant keyword sheets? For those there's just not enough screen real estate to have them both on screen at the same time. But feel free to check them out in more detail here: www.neilblevins.com/art_lessons/midjourney_vs_dalle/midjourney_vs_dalle.htm the links to the two images are at the bottom.
Is this really in 720p? That is the highest res I can select. Kind of surprised for an art video has such poor res.
A detail picture means a close-up of a larger picture...
Yup, whereas most people in midjourney use it to add "detail" to their painting. It's fascinating how the same words can mean such different things depending on your training data.
Can you quote curve or bezier? Quote angle? Skin, atmosphere, chromatic aberration?
I find midjourney more aesthetically pleasing and better at rendering 'artistic' images.
Would be nice if DALL-E 2 did eyes a bit better.
Not just guns. I had Dall-E come back with a content violation when I asked for a woman with a bow and arrow.
Interesting. I guess no Brave fan art then. I got a similar warning for using the word "warts".
The no gun thing is understandable but also annoying when trying to make characters holding guns; it works well with staves and swords though. Swords seem to be very generic, and it has no concept of context, sometimes showing a person as a staff member when asking for a staff weapon.
“You can breathe a sigh of relief, I don’t think DALL-E will be replacing your artwork anytime soon.“ I wouldn’t be so sure of that. Remember that AI generated art is advancing at an exponential pace, which means it doubles in power and ability every few months!!
My comment was meant to be a little silly, if you know Doug Chiang's artwork and saw what it produced in his "style", it's a pretty big delta to cross :)
The thing people forget is that it doesn't have to be perfect. Just "good enough". A lot of people will be put out of work by this software, particularly once it improves further. This is only the "beginning" so to speak. And there's no saying what it will do to all the other areas that are not creative fields. Doctors, teachers, you name it. You can already see AI giving prescriptions and grading school essays, and in both examples it's as good as humans. In some cases even better. Almost no profession is safe from this automation. And it will lead to a drastically reduced workforce. AI is to humans what cars have been to horses. It makes us obsolete.
@@CrniWuk That is certainly the worry. For doctors it needs to do a little more than "good enough", but for artists, good enough might be fine with the public, which means, as you say, a lot of lost jobs. Really not sure how any of this will pan out.
@@ArtOfSoulburn Despite how much these have evolved, somehow I still doubt they'll ever get beyond a "good enough" state. More specifically, when you want something exact, it's much more feasible to communicate that with a human being than to roll the dice n times and hope you get what you're looking for.
I can only hope my comment doesn't turn into a lie in the near future.
Do you think these apps will take your job away???
I think they will allow fewer people to do more faster, and so there may indeed be fewer jobs available. At which point I will make beer professionally :)
@@ArtOfSoulburn I hear yah buddy, I am also a Freelance illustrator :(
Maybe one day these algorithms will create art from scratch instead of making collages out of stolen artwork.
That will likely require true sentience to work, since the current algorithms are all based on training datasets. But a great first step would be making sure these datasets are all images that are copyright free.
From what I've seen, DallE has no sense of composition. Even in the example of the teddy bear, all four images generated by DallE have either the ears or feet cropped out or just look ugly. Midjourney on the other hand really knows how to present the prompt.
Ya, I wonder if that's because Dall-e used more photographs in its dataset, and so many photos might be cropped, whereas midjourney used a lot of paintings which tend to have better compositions. Either way, your analysis is spot on.
@@ArtOfSoulburn Ah that makes sense.
I prefer DALLE aesthetic
I am strongly against restrictions that limit users for no meaningful reason, so I have no interest in supporting DALL-E at all. No guns? That just shows me they have no positive regard for people, treating everyone like children. Such restrictions tend to be absurd even when children are involved, let alone adults.
When a company has so little respect and care for people, it tells me they will also have nothing against doing anything that would screw with people.
To be fair, I do not know how Midjourney operates. However, if they are more laid back and more reasonable in how they offer their product, then I would rather use their AI even if it was inferior.
For artists I think the restrictions are a bad idea. However, I can sorta get behind the fact they don't want their software being used by average people to create deep fakes that help to destabilize society. So I'm conflicted on the issue.
6:57 THAT IS ELON MUSK. ELON IS AN ALIEN CONFIRMED.
And sorry to continue to nitpick, BUT your claim that words like "Photography" and "Octane Render" and "Unreal Engine" do more to add detail than "Realistic" or "Photoreal" neglects the background, and ignores the fact that the subjects more closely match the regular prompts than "Detailed" or "Macro Photography", which both seem to offer far more intricacy in surface texture. Or am I crazy?
Most people got something from this video. Much love!
How do you consider yourself a concept artist while having an ai system literally make the art for you?
These are all experiments; I've never used it for my professional concept art job, and considering the copyright issues, I don't foresee using it for my job anytime soon.
@@ArtOfSoulburn I see, I apologize if I was disrespectful.
I appreciated the video, you did a good job.
How do you get a job like that if you don't mind me asking?
@@petermontgomery2874 Well there are plenty of ways to work at becoming a concept artist. I have over 25 years of experience, and the industry was very different when I started, so my path isn't really copyable anymore. But if you have some skill at drawing or painting or in 3d or photobashing, you can make artwork to create a portfolio and try to get into a school; several specialize in concept art, such as Art Center in California or FZD in Singapore. You could pay for a mentor, a number of big concept artists do online mentorships, for example you could check out schoolism. Or you could watch videos and work on your skills on your own, making image after image after image and posting them on social media or sites like artstation to get feedback. If you decide to pursue it, best of luck!
Looks like Midjourney's more for me.
Um, AI doesn't create or "try" to do any. Clearly they were not "trained" on the keywords((there were no keywords classifying what you said or it was limited or not what it should have been). AI is just a massive interpolation based on the data it is fed. It is only as good as the data it is trained on. If you want a AI to fit to those terms it the inputs must be classified accordingly. It is very likely that when the images were scraped from the internet they simply took keywords surrounding the image(e.g., do a google search). In fact, it is very likely that the way these images are classified is that people are using google's search engine to tag the images and we know how poor that is.
Hmmmm concept art.... What is the item intended to do, is it military or civilian, armoured or not, what is its power source, how far will it go, do you have a certain style for the design: Gothic, insectoid, slab shaped... Give me a ring next Wednesday and I will have it drawn up and a rough 3D print!! Use a piece of software for concept design and get the sort of low imagination shit in the latest dune movie..... Balloons lifting off 500+ ton harvesters ha ha ha he ha ha ho he ha ha... Oh that level of design stupidity gets me every time.
Hehe. Well, in my own mind I make a distinction between concept art and concept design (although so many, including myself by accident sometimes, use the terms interchangeably). Art can be more nebulous and be about big shapes, lighting and mood. And design is the part you describe, a more practical view on what it does, how it does it, the details, functionality instead of form. Right now if these pieces of software do have a place it would be in concept art, it would need entirely new methods if it wanted to try its hand at design.
@@ArtOfSoulburn Indeed, I see the truth of it (Dune, the good one). I did a bit in my early 20's designing a few 25mm/1/35th sci-fi kit ranges and was offered an interview at Pinewood. One of my heroes, Martin Bower (Space 1999, The Nostromo, the Refinery, the Narcissus, Outland - everything), once said sci-fi modelling will only look authentic if every piece looks like it's supposed to be there. I suppose I'm just old school.
@@thebritishengineer8027 There's definitely some truth to that. Another quote I've seen is "It doesn't matter if it works as long as it looks like it would work". :) And agreed, love Martin's work, I became aware of him when he did a giant alien photo dump about 10 years ago. Amazing work.
In my opinion Midjourney is way better and cooler.
You accidentally reverse engineered the design process for the transformers movies.
Haha! Michael Bay was ahead of his time.
I'm on the dall e 1 trial list right now. I tried a lot of things but didn't like any of it, it all seemed low budget and junky.
People always deny it and claim these AIs are genuinely creative, but these systems are just stolen images that get remixed. A system with true creativity (which would be an ability to mix the things it knows to create something new) wouldn't have such limitations.
I suppose it depends on your definition of creative, but for me something can't be creative unless it's sentient. So while it can create astounding results, the AI isn't creative by my definition.
They're both VERY censorious, to the point where it THINKS you mean something, and even if you don't, it still reports you.
distort it
I learned that dall-e loves furries :D
DALL-E looks awful. It appears to be trained on product images and stock photography.