Creating Art with AI - Ep. 2.1 - Steps

  • Published 26 Oct 2024

COMMENTS • 2

  • @royalstingray822 5 months ago

    I think one of the best ways to think about steps is like solving a jigsaw puzzle. The AI starts with a big jumble of pieces with seemingly no coherence and then moves a number of those pieces at a time to try to create the puzzle image. If you give the AI, say, 5 steps, but it's a 512-piece puzzle (a 512 x 512 image is 512 pixels wide and 512 pixels high), it has to move the pieces in big handfuls. It's only got 5 moves to solve the puzzle and 512 pieces to move, so it moves roughly 100 at a time. It might be able to group colours and rough textures together to make more educated guesses at what should go where, but it's generally a pretty crude approach. Results often sort of resemble the completed puzzle image, but are usually some way off the mark. By contrast, if you give the AI 10,000 steps, it now has to cut the pieces up into much smaller pieces so it has enough to move at every step. This means the AI can get carried away creating details which shouldn't exist, since it's too focused on tiny pieces to see the big picture. Somewhere in between will be the perfect step count, where the AI assembles the puzzle like a human would, moving each piece in turn until it's perfectly arranged.
    Now obviously this isn't quite how software like Stable Diffusion works, but it is a simple way to understand how step count can influence generation. Low step counts force the AI to make bolder decisions about what it puts where. Tiny variance in the initial 'noise' can lead to huge variance in the final picture, as the AI has to make very significant decisions at each step even though it doesn't really have much information to go on. On the other hand, high step counts force the AI to make very small changes - the problem here really is luck. A string of 'bad' decisions is much harder for the AI to undo before it finishes. Take the 'wizard portrait' example in this video - at 50 steps the hair is defined and clearly hair. This is because it has jumped to the most logical conclusion about what should be there and then not had the chance to keep guessing. By 100 steps it has guessed itself into the wrong answer, by continuing to chase its own 'refinement' of individual strands of hair, which has led to these more tendril-like spiky things. At 150 steps I wouldn't be surprised if the hair becomes a helmet of sorts.
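
    The jigsaw arithmetic in the comment above can be sketched in a few lines of Python. This is only a toy illustration of the analogy, not how Stable Diffusion actually schedules its steps; the function name `pieces_per_move` is made up for this example.

```python
def pieces_per_move(total_pieces: int, steps: int) -> float:
    """Toy jigsaw analogy: how many pieces must be moved per step
    if every step moves an equal share of the puzzle.

    Hypothetical helper for illustration only; it does not model
    a real diffusion sampler.
    """
    if steps <= 0:
        raise ValueError("steps must be positive")
    return total_pieces / steps


if __name__ == "__main__":
    # 512 pieces, as in the comment's 512-piece puzzle.
    for steps in (5, 512, 10_000):
        share = pieces_per_move(512, steps)
        print(f"{steps:>6} steps -> {share:g} pieces per move")
```

    With 5 steps each move has to shift about 100 pieces at once, with 512 steps it is one piece per move, and with 10,000 steps each move covers only a small fraction of a piece, which is the "cutting the pieces up" case from the comment.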

  • @kryless7775 1 year ago

    Well explained, ty