Explained simply: How does AI create art?
- Published 13 Jan 2023
- AI text-to-art generators explained simply with pen and paper in under 6 minutes. Suitable for complete beginners. No math or coding knowledge needed.
===FOLLOW ME (techie_ray)===
Instagram: / techie_ray
TikTok: / techie_ray
Personal website: www.techieray.com/
How does AI generate images
How does AI create art
How does AI draw art
How do text-to-art generators work
How does text-to-art generation work
Text-to-art generation explained
AI-generated images explained
Generative art explained
How does Dalle 2 work
How does Stable Diffusion work
How does Midjourney work
Dalle 2 explained
Stable Diffusion explained
Midjourney explained - Science & Technology
Hands down the clearest explanation of how AI art works.
Thank you 🤗
What an awesome video! Finally, an unbiased and purely factual look at how it all works. Thank you so much. I'm a digital artist and it is hard not to feel unsettled by how this technology got to where it is today, on the backs of those who may not know that their images were being used in such a manner. As much as I hate the concept of it, I didn't wanna turn a blind eye and be stuck in an echo chamber passing it off as "advanced photobashing." I really appreciate your video for being so easy to understand... and so cool that you broke it down on pen and paper!
Wow thanks so much for the donation and your kind comment, much appreciated! I'm very happy to know that you found my explanation helpful :)
best explanation of ai art on youtube
This has to be one of the best explanations about this topic. Awesome video.
Hope to see more "explained simply" videos related to AI/ML etc. You are great at telling complex things simply & clearly👍👍👍
thank you!! More to come :)
Surprised to know you aren't Indian given how clear your explanation is.
Thanks mate, more content, please!
What brand mechanical pencil is that?
Well explained. Easily the best explanation of the algorithm on YouTube.
But the thing is, how does the AI learn to adjust colors, shading and sunlight on different elements in an image? On which elements is it trained to do that? Perhaps a global element, like if someone prompts sunlight or moonlight?
That's a good question. It comes down to the labelling of training images. For example, every image that has sunlight is likely to have the word "sunlight" in its label. So every time the model sees the word "sunlight" alongside a consistent pattern of bright pixels in a column, it can infer that whenever "sunlight" is mentioned it has to diffuse that particular visual.
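The association described in the reply above can be sketched as simple co-occurrence counting. This is a toy illustration, not the actual training algorithm: the 4x4 "images" and captions below are entirely made up, and real models learn these correlations via gradient descent rather than counting.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: each "image" is a 4x4 grid of brightness values plus a caption.
# Images captioned with "sunlight" contain a bright column (a "sun ray").
def make_image(has_sunlight):
    img = rng.uniform(0.0, 0.3, size=(4, 4))
    if has_sunlight:
        img[:, 2] = rng.uniform(0.8, 1.0, size=4)  # bright column
    return img

dataset = [
    (make_image(True), "sunlight over hills"),
    (make_image(True), "sunlight on the sea"),
    (make_image(False), "a dark cave"),
    (make_image(False), "night sky"),
]

def has_bright_column(img, threshold=0.7):
    # A column whose average brightness exceeds the threshold.
    return bool((img.mean(axis=0) > threshold).any())

# How often does the word "sunlight" co-occur with the bright-column pattern?
both = sum(1 for img, cap in dataset
           if "sunlight" in cap and has_bright_column(img))
word_only = sum(1 for img, cap in dataset if "sunlight" in cap)

print(both / word_only)  # 1.0: the pattern always accompanies the word
```

A model that repeatedly observes this kind of correlation can then "paint in" the bright column whenever the prompt mentions sunlight.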
myyy meeennn, dude this is the best explanation ever, I have watched hours and hours and you did it so simple, props my dude, Pura Vida from Costa Rica
Appreciate the kind comment, thank you!
Great job! Clear explanations, congrats!
Thank you!!!
Damn, the best explanation you will find on YouTube and it's on his phone haha
Thank you so much mr ray
Nice video, bravo
Amazing explanation.....Thank you very much !!!
Thank you!!! 🥰
this is gold!!!!!
Best
I don't understand why the diffusion process is needed. Why make it fuzzy and then clear?
Good question! To explain simply: it's mathematically easier for the AI to guess the right shapes/colours starting from random colours than to draw new shapes/colours on a blank white canvas.
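The iterative denoising in that reply can be sketched numerically. This is a toy illustration only: the 4-pixel "image" is invented, and the "model" here is a stand-in that already knows the target, just to show the mechanics of removing noise a fraction at a time (a real diffusion model predicts the noise with a trained neural network).

```python
import numpy as np

rng = np.random.default_rng(1)

# The clean "image" (4 pixel brightnesses) the model associates with the prompt.
target = np.array([0.9, 0.1, 0.5, 0.8])

# Reverse diffusion starts from pure random noise, not a blank canvas.
x = rng.uniform(0.0, 1.0, size=4)

# Each step, the model predicts the noise in x and strips away a fraction of it.
# Here the "prediction" is simply the gap to the target -- a stand-in for what
# a trained network would estimate from the prompt.
for step in range(50):
    predicted_noise = x - target
    x = x - 0.1 * predicted_noise

print(np.abs(x - target).max() < 0.01)  # True: noise removed step by step
```

Each pass shrinks the remaining noise by 10%, so after 50 steps the random start has converged to the prompt's image; this gradual refinement is why generating from noise is easier than drawing from scratch.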
@@techieray wow interesting. Do you know why?
How does it understand context though?
Great question. It doesn't actually understand context the way we interpret context. Mechanically, it only interprets associations between visuals based on historical patterns. These "associations" are laid out in the latent space - think of a plane with billions of dimensions, where each dimension relates to a certain attribute, e.g. size, colour, angle, style, etc. And there's a visual at every point in that plane. Visuals that relate to each other sit closer together on a particular plane (e.g. "car" and "airplane" would be close on an axis about transportation, while "airplane" and "bird" would be close on an axis about sky). That's how image associations work in a nutshell, and by extension, context.
There's also the attention mechanism (from the "Attention Is All You Need" paper) that works out the context/objective of a text prompt. This context/objective then tells the model which associations in that visual plane are relevant.
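The "close on an axis" idea from the reply above can be sketched with toy embeddings. The three coordinates and their meanings are invented for illustration (real latent spaces have hundreds of dimensions with no human-readable labels), but the distance comparison works the same way.

```python
import numpy as np

# Toy 3-dimensional "latent space". The axes roughly mean
# [transportation-ness, sky-ness, living-thing-ness]; coordinates are made up.
embeddings = {
    "car":      np.array([0.9, 0.1, 0.0]),
    "airplane": np.array([0.9, 0.8, 0.0]),
    "bird":     np.array([0.1, 0.9, 0.9]),
}

def cosine(a, b):
    # Cosine similarity: 1.0 means same direction, 0.0 means unrelated.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# "car" and "airplane" are close because they share the transportation axis;
# "airplane" and "bird" are close because they share the sky axis;
# "car" and "bird" share almost nothing, so their similarity is lowest.
print(cosine(embeddings["car"], embeddings["airplane"]))
print(cosine(embeddings["airplane"], embeddings["bird"]))
print(cosine(embeddings["car"], embeddings["bird"]))
```

The same word can be near different neighbours on different axes at once, which is how one latent space encodes many overlapping associations.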
So this explains how AI generates *_instructions,_* but it doesn't explain how AI makes arbitrary decisions. For example, if my prompt is, "Show me a screenshot from a 1970s scifi tv series," I don't specify the nature of the screen shot. So how does the AI "decide" what will make up the image? In this case, my result was a woman in her 30s with blonde hair and a sci-fi uniform seemingly made of alligator skins. Her uniform is red, as is her lipstick. She's sitting at a glass desk gazing to her left, as if speaking to someone off-camera. There's a small glass of water on the desk and a blue folder.
^ How does the AI come up with all of that? I didn't tell it do any of that.
Great question! This is all learnt during the training process, where the model has seen many examples of images with labels/captions relating to "sci-fi". It's likely that these sample images depicted a blonde woman, alligator skins, a glass desk, etc, and that these features are recorded within the latent space of the model.
But to be fair, the model doesn't really understand the relationship between these features. It just knows that these features relate to the "sci-fi" prompt, but not necessarily why. This is why the same prompt can give rise to different outputs/images that depict more or less the same features (but arranged in different ways)!
Nice, except TikTok logo over video.
Thank you for this video. I disagree with AI art and I am doing an argument essay, so I wanted to know how it works. Great explanation!
Thanks for watching, I'm glad my explanation was helpful :)
Can't wait to send this to the 'aI Is BlAtantLY copY and paStiNG Art' crowd.
But it does
You stupid or what
Did you even watch the video @@Knight00519
So it’s not stealing anyone’s art. What a shocker! It’s almost like we know that but some people refuse to accept it 😑
haha this is actually a pretty contentious question from a legal perspective! While technically the AI is not directly reproducing an existing image in its output, the fact that the AI was pre-trained on loads of copyrighted art is an issue from a legal perspective. And that's where the "stealing" argument comes in. This comes down to a broader debate about whether there should be an exception allowing AI companies to scrape art from the public internet to train their AI.
@@techieray I'm no legal expert, but it's clear that any data scraping falls under the fair use doctrine, as it's transformative, not a copy and paste, as your video clearly suggested!
@@PlatanosConAqua you raise a good point about fair use, though that doctrine only applies in the US. Different rules apply in other jurisdictions like UK and Australia.
@@techieray Real question: if all that happens is that the AI learns how images work based on the scraped photos, how is it any different from a human learning how to draw based on those images? The technical process is different but they produce the same result.
This explanation stops right where it begins to get interesting. Pixels, the representation of words and colours through numbers, and libraries are all issues below the level of AI. AI starts when a user can input any natural-language prompt, not just something confined to certain essential terms like "beside" or "above". I'd have been interested in the question of how an algorithm can understand an individual image description for which it won't find a match in its libraries, and create a matching image on that basis by itself. Nevertheless, it's certainly helpful to recapitulate the more basic steps. AI builds on these, after all.
amazing! so easy to understand :o the only doubt I have about this is how the AI can interpret the correlation between the words? like what's the process for it to know that "pikachu eats a big strawberry on a cloud" means that it has to generate an image of a pikachu on top of a cloud with a strawberry in its mouth?
thank you! In the video, I only explained the reverse diffusion process for the pikachu and the strawberry. But the same process applies to all of the other words in the prompt. For example, the AI would have seen a bunch of images of "eat" (e.g. people eating, animals eating, etc) to the point that it learns "eat" should involve a piece of food being close to or inside a mouth. Same thing with prepositions like "on" (where the model learns X has to be on top of Y) and adjectives like "big", etc.
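The idea that a preposition like "on" corresponds to a learnable spatial pattern can be sketched with a toy geometric check. The bounding boxes and the rule itself are invented for illustration; a real model learns this pattern implicitly from millions of captioned images rather than from an explicit rule.

```python
# Each object is a bounding box: (x_min, y_min, x_max, y_max), y increasing upward.
# Coordinates are made up for the pikachu-on-a-cloud example.
pikachu = (2.0, 5.0, 4.0, 7.0)
cloud   = (1.0, 3.0, 6.0, 5.0)

def is_on_top_of(a, b, tolerance=0.5):
    """True if box a rests on box b: they overlap horizontally
    and a's bottom edge sits near b's top edge."""
    horizontal_overlap = a[0] < b[2] and b[0] < a[2]
    resting = abs(a[1] - b[3]) <= tolerance
    return horizontal_overlap and resting

print(is_on_top_of(pikachu, cloud))  # True
print(is_on_top_of(cloud, pikachu))  # False: "on" is not symmetric
```

After seeing "X on Y" captions paired with this geometry over and over, the model's latent space ends up encoding the same asymmetric relationship without anyone writing the rule down.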