Stable Diffusion: DALL-E 2 For Free, For Everyone!

  • Published 3 Jun 2024
  • ❤️ Check out Lambda here and sign up for their GPU Cloud: lambdalabs.com/papers
    📝 The paper "High-Resolution Image Synthesis with Latent Diffusion Models" is available here:
    ommer-lab.com/research/latent...
    github.com/mallorbc/stable-di...
    ❗Try it here (we seem to have crashed it...again 😅, but sometimes it works, please be patient!): huggingface.co/spaces/stabili...
    ❗Or here: colab.research.google.com/git...
    Great notebooks to try:
    / colab_notebook_sd_hiki...
    github.com/pinilpypinilpy/sd-...
    github.com/victordibia/peacasso
    Run it on your own graphics card: github.com/CompVis/stable-dif...
    Guide on how to run it at home: www.assemblyai.com/blog/how-t...
    Image to image translation: / 1564290733917360128
    Turn your drawings into images: huggingface.co/spaces/hugging...
    Run it on an M1 Mac - replicate.com/blog/run-stable...
    Even more resources:
    multimodal.art/news/1-week-of...
    Dreaming: / 1560320480816545801
    ❗Interpolation: / 1558508866463219712
    ❗Full video of interpolation: • Voyage through Time - ...
    Interpolation (other): replicate.com/andreasjansson/...
    Portrait interpolation: / 1565377550401998848
    Felícia Zsolnai-Fehér's works (the chimp drawing): felicia.hu
    Fantasy examples:
    / 1
    / 1
    / 1
    / 1
    / 1
    / 1
    Collage: / 1555184488606564353
    Fantasy again:
    / 1562480868186521601
    Animation: / 1558655441059594240
    Variant generation: / 1561703483316781057
    Random noise walks: / 1559343616270557184
    Portrait interpolation: / 1557018356099710979
    Fantasy concept art montage: / i_tried_to_make_some_f...
    Scholars holding on to their papers: / 1567219588538011650
    ❤️ Watch these videos in early access on our Patreon page or join us here on UA-cam:
    - / twominutepapers
    - / @twominutepapers
    🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
    Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eric Martel, Geronimo Moralez, Gordon Child, Ivo Galic, Jace O'Brien, Jack Lukic, John Le, Jonas, Jonathan, Kenneth Davis, Klaus Busse, Kyle Davis, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Michael Albrecht, Michael Tedder, Nevin Spoljaric, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Rajarshi Nigam, Ramsey Elbasheer, Steef, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Timothy Sum Hon Mun, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.
    If you wish to appear here or pick up other perks, click here: / twominutepapers
    Thumbnail source images: Anjney Midha
    Thumbnail background design: Felícia Zsolnai-Fehér - felicia.hu
    Chapters:
    0:00 Teaser
    0:22 The Age of AI Image Generation
    1:06 But there is a problem
    1:22 Stable Diffusion to the rescue!
    2:11 1 - Dreaming
    2:54 2 - Interpolation
    3:30 3 - Fantasy
    4:00 4 - Collage
    4:45 5 - More fantasy
    4:51 6 - Random noise walks
    5:33 7 - Animations
    6:00 8 - Portraits + interpolation
    6:22 9 - Variant generation
    6:39 10 - Montage
    7:05 Good news!
    7:29 Try it here!
    8:04 The Age of Free and Open AI Image Generation!
    8:19 The First Law of Papers
    9:00 Stable Diffusion 👌
    Károly Zsolnai-Fehér's links:
    Instagram: / twominutepapers
    Twitter: / twominutepapers
    Web: cg.tuwien.ac.at/~zsolnai/
  • Science & Technology

COMMENTS • 1.8K

  • @StefanDeleanu
    @StefanDeleanu 1 year ago +1356

    API died once Karoly came with the views.
    Thanks for bringing awareness to the recent AI advances!
    Informative as always!

    • @TwoMinutePapers
      @TwoMinutePapers 1 year ago +160

      Ouch - every time! There are some other notebooks and links in the video description, please try those too! And if any of you Fellow Scholars found an alternative website for easier access, please let me know and I will add a link to the video description. Let the experiments begin!

    • @xavierfortune3574
      @xavierfortune3574 1 year ago +32

      @@TwoMinutePapers Haha, crashed from only 3,00 views; can't wait to see how they handle the next 150K viewers. Better buckle up!
      PS: love your vids; I've been watching for years now. I've been messing with DALL-E and the others quite a bit, and wondered what your thoughts are on Midjourney, as you haven't mentioned them yet.

    • @creatureofvenice
      @creatureofvenice 1 year ago +3

      @@TwoMinutePapers hello Karoly!

    • @mambe4349
      @mambe4349 1 year ago

      XD

    • @mambe4349
      @mambe4349 1 year ago +6

      My comment shall be engraved in history here, 5 min after it was posted

  • @egonvanpraet
    @egonvanpraet 1 year ago +1257

    Won't be long before the model learns camera position in its weights, so we can synthesize 360° image sequences of the same scene/subject. Feed those into NeRF or photogrammetry software and create 3D models. Can't wait to go from text prompt to 3D scenes with easy editing capabilities.

    • @renanmonteirobarbosa8129
      @renanmonteirobarbosa8129 1 year ago +45

      It's already been done using neural fields

    • @shukrantpatil
      @shukrantpatil 1 year ago +17

      text prompt to video ?

    • @MrAwesomeTheAwesome
      @MrAwesomeTheAwesome 1 year ago +71

      You could probably do this with a different model attached. I've already done some playing around with this myself: use Blender to set up a basic 3D scene with slightly mottled/noisy textures and rough models, then feed it into Stable Diffusion.
      Stable Diffusion genuinely does not understand 3d space at all, and it doesn't have access to the tools needed to do so. It really just emulates an understanding of 3d space and lighting entirely based on forms and composition. But it really, really breaks down in a lot of large architectural works, where it frequently produces impossible geometry, or often just in basic anatomy.
      What we really need is a model which generates a simple 3d scene based on natural language, then lets us choose a camera angle, then feeds that 3d data into another model which textures the visible polygons, then feeds it into something like Stable Diffusion that stylizes it. That would give you really, really good human-like art, and make animation actually coherent and quite manageable.

    • @xXWhatzUpH8erzXx
      @xXWhatzUpH8erzXx 1 year ago +16

      @@MrAwesomeTheAwesome I like that idea a lot. Use AI to create segmented, labeled meshes within a scene that gets stylized with another AI.

    • @JH-pe3ro
      @JH-pe3ro 1 year ago +20

      There's a whole world of "Holodeck" applications waiting to be made by gluing together different models and slapping on some UI. I've already seen a few experiments with feeding GPT3 content into Midjourney and SD.
      Meanwhile, here I am, learning to draw; it doesn't feel fruitless, because being able to sketch exactly what I want is communicatively useful even if I should end up running it through an AI. My brain can access way fewer shapes than the AI, but it can also apply them more precisely.

  • @joannot6706
    @joannot6706 1 year ago +673

    This field is improving so fast now.
    I remember when, 1 or 2 years back, Two Minute Papers was presenting GANs that drew small pictures of birds...
    And now transformers have changed the game, with 100% usable generated concept art suitable for $200 million movies.

    • @poopoodemon7928
      @poopoodemon7928 1 year ago +33

      Simple animation is already possible with this technology. I'm absolutely mind blown. I thought it would have taken so many more years before it was possible but here we are.

    • @IceMetalPunk
      @IceMetalPunk 1 year ago +22

      100%. Generative AI has accelerated to light speed progress since 2017, and basically every major cutting-edge model incorporates transformers in some way. I wonder if Vaswani and company knew how much their paper would change the world?

    • @shukrantpatil
      @shukrantpatil 1 year ago +10

      @@poopoodemon7928 Growth in any field of science and tech is exponential; it has only just started, and its rate of growth won't stop accelerating

    • @openroomxyz
      @openroomxyz 1 year ago +4

      @@shukrantpatil Hope you are right.

    • @Mukna132
      @Mukna132 1 year ago +19

      Just here to say that no, these wouldn't be used in $200m productions (at least not yet, and I doubt anytime soon either). This is not production ready yet. It lacks 'coherency'. What I mean by that is that by the time it takes for you to create even two images that 'exist within the same world', a concept artist will have produced MANY more.
      I get what people are saying about being able to describe what you want out of these AI, but realistically, think about that. It's easy to describe the general idea of something (which would get you generic fantasy scape #16455). But it is VERY hard to specifically describe Minas Morgul. As a caveat, img2img makes this MORE feasible, but still not better than a concept artist.
      What I DO see it being used for in production is mood boards.

  • @helplmchoking
    @helplmchoking 1 year ago +294

    Got it set up on my PC; it runs in just a few seconds on a 3080, and it was actually a fairly simple setup process. Nothing like those image generator apps/websites floating around: this thing is an incredible achievement. Can't express enough gratitude that they've made this open source and freely available

    • @abcqer555
      @abcqer555 1 year ago +7

      How long did it take you to set up?

    • @diasbakhtiyarov3169
      @diasbakhtiyarov3169 1 year ago +7

      What GPU memory is needed at minimum?

    • @helplmchoking
      @helplmchoking 1 year ago +33

      @@abcqer555 Literally less than 10 minutes. It'll depend on your internet speed as the weights data is a good 4GB+ and obviously you'll want to be a little familiar with navigating CLI instructions but it was definitely nice and straightforward.

    • @helplmchoking
      @helplmchoking 1 year ago +42

      @@diasbakhtiyarov3169 I would have included this in my other reply but idk how notifications work. Running with only 1 sample (default is 3) has me at 9.8GB out of 10.0GB available on the 3080, with about 10GB of system RAM used as well.
      Any more than one sample and it just errors out as I don't have the VRAM to support it.
      I know there are some 12GB consumer cards out there but if you want to run higher sample counts and use it for anything more demanding than hobbyist level stuff with the 512x512 dataset then you'll definitely want some beefier pro kit. Sadly, a top end Quadro costs more than my whole PC. And my work MacBook. And my entire desk with all the peripherals.
      EDIT: looks like there are more optimised forks out there (yay for OSS!) and people are apparently having success even as low as 4GB VRAM
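Those VRAM numbers track with the latent diffusion design from the linked paper: the U-Net denoises a compressed latent rather than raw pixels (the standard autoencoder downsamples each spatial dimension by a factor of 8 into 4 channels). A rough sketch of the savings, assuming that standard f=8, 4-channel configuration:

```python
# Back-of-the-envelope for why latent diffusion fits on consumer GPUs:
# the diffusion U-Net iterates on a compressed latent, not raw pixels.

def latent_shape(height, width, downsample=8, channels=4):
    """Shape of the latent tensor for a given output resolution (f=8 VAE)."""
    return (channels, height // downsample, width // downsample)

pixel_values = 512 * 512 * 3            # RGB values in the output image
c, h, w = latent_shape(512, 512)
latent_values = c * h * w               # values the U-Net actually denoises

print(latent_shape(512, 512))           # (4, 64, 64)
print(pixel_values / latent_values)     # 48.0
```

That roughly 48x reduction in the tensor the model iterates over each step is a big part of why a 10GB consumer card can run it at all.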

    • @cortster12
      @cortster12 1 year ago

      Do you need to know how to code to do it?

  • @aperson2703
    @aperson2703 1 year ago +479

    Awesome that they've made the full model public. OpenAI really hasn't been living up to its name.

    • @26kuba05
      @26kuba05 1 year ago +126

      Open in OpenAI refers to the fact that they are open to having you pay a hefty sum of money for access to their models.

    • @pneumonoultramicroscopicsi4065
      @pneumonoultramicroscopicsi4065 1 year ago +38

      It was open at first, but at some point they abandoned their mission and became a for-profit company

    • @zot2698
      @zot2698 1 year ago +37

      @@pneumonoultramicroscopicsi4065 I'm all for OpenAI being a for-profit company. But please, deliver really great products. At this rate, the free version is better! And I'm scared that OpenAI, not being able to compete, will start using its lawyers to stop others from publishing and sharing (like the pharma industry)

    • @sychuan3729
      @sychuan3729 1 year ago +3

      @@zot2698 I don't think it is better. It is much worse than DALL-E at understanding text prompts

    • @jake7435
      @jake7435 1 year ago

      @@zot2698 If the US government can't stop open source code from spreading, I don't think we have to worry about lawyers

  • @ThioJoe
    @ThioJoe 1 year ago +863

    This is a true turning point in AI and I'm so excited. I think when companies see that the world doesn't implode by releasing an AI model to the public, they will be more willing to do so. I also think none of them wanted to be the first, and open themselves up to some perceived social liability. Now that the cat's out of the bag, I think we'll start to see some even cooler stuff released publicly.

    • @cccccccol
      @cccccccol 1 year ago +1

      hi!!!!

    • @IceMetalPunk
      @IceMetalPunk 1 year ago +45

      On the other hand, knowing the internet, I'm sure there will be plenty of content generated with Stable Diffusion that might have big companies thinking, "Not worth the liability, we'll let Stable deal with that".... or maybe I'm just cynical about people in general.

    • @SpykerSpeed
      @SpykerSpeed 1 year ago +39

      It's like Adobe worrying about releasing Photoshop. Really silly.

    • @samuelkibunda6960
      @samuelkibunda6960 1 year ago +4

      I mean, there's a huge amount of backlash against AI art.

    • @samuelkibunda6960
      @samuelkibunda6960 1 year ago +6

      @@SpykerSpeed Comparing image generation with Adobe doesn't quite make sense!

  • @mankind8807
    @mankind8807 1 year ago +1503

    Funny how most people thought that arts and music would be the last and hardest sectors for AI to conquer, but it's turning out to be the opposite

    • @psyche1988
      @psyche1988 1 year ago

      This is not art; art is the domain of the human brain. AI only creates replicas; AI does not innovate, it regurgitates.

    • @IceMetalPunk
      @IceMetalPunk 1 year ago +170

      Mostly people who aren't familiar with AI. Even Ada Lovelace knew computers would compose music one day.

    • @samuelkibunda6960
      @samuelkibunda6960 1 year ago +68

      How the turn tables. Funny enough, most service jobs can be automated by AI, and that's making me sweat

    • @frostreaper1607
      @frostreaper1607 1 year ago +286

      Also ironic, because a large portion of the human population still thinks art is a magical skill that some people are born with.
      It's literally the opposite: it's a set of visual rules that are taught.

    • @shukrantpatil
      @shukrantpatil 1 year ago +10

      @@frostreaper1607 exactly

  • @niveketihw1897
    @niveketihw1897 1 year ago +133

    Someone just developed a plug-in "renderer" for Blender -- it takes the viewport image (usually a textured image but one without proper lighting) and renders it in a fraction of the time it would take Cycles to do.

    • @shrinkhh79
      @shrinkhh79 1 year ago +13

      Fascinating. Do you have a link? Thx :-)

    • @synthoelectro
      @synthoelectro 1 year ago +9

      Didn't take long for it to make it to the 3D world

    • @frostreaper1607
      @frostreaper1607 1 year ago +13

      @@shrinkhh79 not published atm, keep an eye out on the Stable Diffusion reddit page.

    • @michaellillis9897
      @michaellillis9897 1 year ago +9

      I have been thinking about this stuff a lot, especially with how good real-time renderers have become recently. Just enough fakery to make an AI's job of going from “game graphics” to “photorealistic” as a post-process quite easy. With all the 3D info on the scene, it should be able to do an amazing job. Ray tracing might go extinct very quickly.

    • @danzjz3923
      @danzjz3923 1 year ago +7

      @@michaellillis9897 so RTX is better for its AI features than actual RTX? that tracks

  • @USBEN.
    @USBEN. 1 year ago +302

    I love that AI is finally being open sourced. So much more is coming.

    • @ronilevarez901
      @ronilevarez901 1 year ago +5

      I think most AI models have been open sourced for many years now; I've downloaded many different types over the years.
      Sadly, that becomes a lot harder as they get more interesting and useful, and more resource-intensive and heavier.

    • @leonardodecappuccino2296
      @leonardodecappuccino2296 1 year ago

      @@ronilevarez901 ua-cam.com/video/qe9PEJo3_VE/v-deo.html

  • @ShawnFumo
    @ShawnFumo 1 year ago +52

    I feel like it was glossed over a bit in the video itself (though it is shown visually at the very end), but one of the cooler tools to come along with Stable Diffusion is img2img. You give it a rough image and a prompt, and it creates from a combination of the two. So you can turn an MS Paint image with some different colored areas into a beautiful landscape. There are a lot of options, too, in terms of the strength of the effect, applying it multiple times, etc. This gives you a lot more control over how the final image will look than just using a text prompt. There's a simple free version called Diffuse The Rest where you can try out the concept.

    • @personguy1004
      @personguy1004 1 year ago +3

      do you have a link?

    • @ShawnFumo
      @ShawnFumo 1 year ago +1

      @@personguy1004 Actually it is built into DreamStudio now as of yesterday I think. There's also "Diffuse the Rest" which was a free version.

    • @quickdudley
      @quickdudley 1 year ago

      There are at least two ways to do this: Stable Diffusion itself uses an image autoencoder but the text prompt is also passed through a CLIP text embedding model which also comes with a matching image embedding model. That's to say you can use the example images either as starting points or as prompts.

    • @leonardodecappuccino2296
      @leonardodecappuccino2296 1 year ago

      @@personguy1004 ua-cam.com/video/qe9PEJo3_VE/v-deo.html
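The "strength" option mentioned above has a simple mechanic in most Stable Diffusion img2img implementations: the init image is noised to an intermediate timestep, and only the remaining denoising steps are run. A sketch of that bookkeeping, assuming the usual convention (not the exact code of any particular repo):

```python
# How img2img "strength" maps onto denoising steps in typical Stable
# Diffusion implementations: strength=0 leaves the init image untouched,
# strength=1 ignores it and denoises from (almost) pure noise.

def img2img_schedule(num_inference_steps, strength):
    """Return (start_step_index, steps_actually_run)."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    init_timestep = int(num_inference_steps * strength)  # how much noise is added
    t_start = num_inference_steps - init_timestep        # steps skipped at the front
    return t_start, init_timestep

print(img2img_schedule(50, 0.8))  # (10, 40): start 10 steps in, run 40
print(img2img_schedule(50, 0.2))  # (40, 10): light touch, mostly the original
```

So strength is less a blend weight than a dial for how many denoising steps get to rewrite the init image.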

  • @Ewendude1
    @Ewendude1 1 year ago +73

    My mind was blown when I first discovered how far AI image generation technology has come. Almost instantly, my mind was blown again thinking about the future of this technology: animation, 3-dimensional rendering, even music/audio, and how it could tie into "metaverse" technologies. Imagine an empty room, then you say "Blue forest with fireflies and golden streams with a Mozart vs. Beethoven piano battle in the near distance", and boom, you're experiencing that virtual reality. Just contemplating that potential leaves my mind in a state of being perpetually blown.

    • @samuelkibunda6960
      @samuelkibunda6960 1 year ago +4

      Animation will be tricky, considering content is copyrighted and for personal use only. If we're talking about realistic 3D animation, that'll be easy; but for 2D animation, the only way to get training data is off of pirate sites, and you'd also have to caption each and every frame of a 21-minute episode and compress that information down. We're a good 2 to 3 years away from good text-to-animation. We mostly have to conquer video first, as that'll make it much easier to create fully 2D animation, since you'd only combine it with an image generator to base the art style off of!

    • @shukrantpatil
      @shukrantpatil 1 year ago +7

      @@samuelkibunda6960 Google won't hesitate to buy the majority of animation companies to achieve that.

    • @samuelkibunda6960
      @samuelkibunda6960 1 year ago +5

      @@shukrantpatil Lmao 😂 we just need quantum computers and we'll be able to create fully animated movies without requiring farms of GPUs!

    • @ShawnFumo
      @ShawnFumo 1 year ago +7

      @@samuelkibunda6960 I don't think animation would be any different from the current image AIs, actually. They don't just use public domain images, but a variety of images out there. Their argument is that you can't copyright a style, and there's nothing to prevent a human from studying any images online and having that influence their style. So you could just as easily have an animation AI watch movies and use that to learn from. If you use copyrighted characters in your resulting animation and try to sell it, of course they can go after you if it isn't fair use like commentary. But you could still make animations with characters that are modified enough (like combining them with some other character) to make that not be an issue.

    • @himan12345678
      @himan12345678 1 year ago +2

      @@ShawnFumo Yes, and especially since we've already seen AIs producing more realistic animations for characters from video and/or mocap data. Literally the only reason something like that hasn't happened is that no one has done it yet; we have the horsepower and all the pieces. All 2D is, is a style/interpretation of 3D/real life; you just need to translate that. Easier said than done, I know, but this is a path that has been traveled now, and someone will soon do it for this particular goal.

  • @KanalFrump
    @KanalFrump 1 year ago +63

    I just love that this massive beautiful beast of a model and source code was released in full on those democratic terms so that Adobe won't get to colonize the AI imaging space and put it behind their subscription paywall. Blender for example already has a working integration.

    • @ericray7173
      @ericray7173 1 year ago +1

      Do you mean monopolize? You kids talk funny these days!

    • @TheLolilol321
      @TheLolilol321 1 year ago +6

      @@ericray7173 "Colonize" works fine here as an analogy (reading between the lines can be especially hard if you are not a native speaker): establishing territorial control by being early, with significant resources to fight competition, within a framework the original commenter does not believe can be monopolized.

  • @kaiserc2471
    @kaiserc2471 1 year ago +57

    Using the absolutely basic Stable Diffusion model through Anaconda on an RTX 3060 12GB, it takes about 8-12 seconds to put out a 512x512 pixel image.
    It takes about 15-22 seconds for a 512x768 or vice versa.
    You can literally run it for 1500 images, go to work, then just come home and find your favorite ones to play with more.
    Also, note, if you use the words "dream" and "fantasy" in the same prompt to try to make a landscape, you're going to get some Disney style castles and text on a lot of them.
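The batch idea above checks out with quick arithmetic (assuming the midpoint of the quoted 8-12 s per image):

```python
# Sanity-check the "run it for 1500 images, go to work" workflow at the
# quoted RTX 3060 timings (~8-12 s per 512x512 image; use the midpoint).
seconds_per_image = 10
batch_size = 1500

total_hours = batch_size * seconds_per_image / 3600
print(round(total_hours, 1))  # 4.2 -- comfortably inside a workday
```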

    • @steveaustin5344
      @steveaustin5344 1 year ago +1

      If you want to share some of your favorite images, please do. I haven't begun playing with it yet but I'm excited.

    • @steveaustin5344
      @steveaustin5344 1 year ago +4

      If you create a nice image, can you re-do it at higher resolution?

    • @caenir
      @caenir 1 year ago +2

      @@steveaustin5344 You can save the seed to remake it, but it probably won't be the exact same. It's recommended to use an upscaler.

    • @Terszel
      @Terszel 1 year ago +4

      I've been using a 3080 with 10GB, and it renders a 512x512 in about 3-5 seconds with 50 sampling steps of k_lms.

    • @rashedulkabir6227
      @rashedulkabir6227 1 year ago

      But it still can't draw hands properly.

  • @serta5727
    @serta5727 1 year ago +44

    It shows the beauty of open source 😍🤩

    • @synthoelectro
      @synthoelectro 1 year ago +6

      the power in your hand.

    • @Parasmunt
      @Parasmunt 1 year ago

      Indeed it does. Blessings upon those responsible.

  • @JohnVanderbeck
    @JohnVanderbeck 1 year ago +9

    "Get exactly what you asked for" - Said by no one who asked for something specific.
    Don't get me wrong, I LOVE generative art and use it heavily as a tool in my own art, but the public idea of what these things do is HEAVILY skewed by seeing the good results and not the bad. I can often spend DAYS perfecting my prompts to generate various images that will then all be combined to produce a final piece.

  • @wilnonis
    @wilnonis 1 year ago +36

    What these programs are capable of is absolutely amazing and it seems like there is no limit to how good they can become. There is no denying that this is revolutionary and it's not slowing down or going away.
    But as an illustrator, this is making me feel very depressed and so so hollow. I was always excited about tech that helps artists, for example, clip studios coloring ai, which can help greatly in the rendering process of illustration. While AI is an amazing tool for concepts and references for artists, it seems that in just a few years if not months it can advance so much that it can more or less cut out a living artist from the creation equation. After all, what would be the point to hire a skilled artist to create something if AI can get your idea 99% exactly how you wanted?
    Also, please be respectful. I've seen many people on the net telling artists to shut up and cope. It makes sense most of the art community is angry and scared, millions very well might lose jobs. We just want some respect which we repeatedly don't get. We already often have to deal with people telling us that art is so easy for us because we are born with some amazing talent ( surprise we are not, it's just hard work and studies). And nightmare clients, for example, I had a guy get angry at me because I refused to illustrate his 32-page comic for 300$ (which would take me close to 300 hours to finish). He also told me he can pay me 25$ for the character design sheet but I will have to give him the money back at the end of the project? lol, what. The founder of Stability AI Emad Mostaque also said in an interview that "illustration design jobs are very tedious'. It's not about being artistic, you are a tool". I truly thought that I was at least a bit more than a tool, but I guess not huh. I think I speak on behalf of many artists when I say that being called a tool sounds very insulting.
    I know AI is not replacing me just yet, but in a couple of years? Who knows. Becoming a paid illustrator was hard work. Years of waking up at 5am before everyone else and practicing. Working late into the night. As a little kid I had very weak health, and still do, so drawing was everything I really know. Finally becoming a working illustrator was like a dream coming true, happier than ever, and couldn't believe I made it! So now seeing how good AI became is impressive, but it also feels very depressing. Feels like all the work and learning I have done till now gonna be all for nothing. And what's the point of refining my skills if I am becoming just a tool and there is a better one out there? Sorry for the really long rant, but I just have a hard time coping with all new doubts about the future. (and I will most likely implement AI into my work, I think it can be amazing as a mood board and reference or even texture generator. But all the fears still stand. And I'm also starting to feel very pressured into using AI, it seems like that's the only way to make sure I can still have my creative job a few years into the future).

    • @parallelworlds1248
      @parallelworlds1248 1 year ago +4

      When the first calculators came out, mathematicians were afraid of losing their jobs as well; now they program the calculators. I suggest the movie "Hidden Figures"; in my opinion it's the same concept

    • @badreddinekasmi8919
      @badreddinekasmi8919 1 year ago +5

      @@parallelworlds1248 Very different. Not only have STEM fields always been well respected, but a mathematician's job isn't only to calculate. An artist's job is only to paint, which these AIs replicate well.
      Additionally, while calculators were programmed, these AIs are trained on datasets from unwilling volunteers. AI raises way more ethical points that people are willing to overlook because it benefits them.

    • @lenowoo
      @lenowoo 1 year ago +9

      Time to scrap my dream, huh... Working as a freelance artist has always been hard. The competition is always fierce. But now you have to compete against a machine that can spew out results in minutes, can be asked to redo work infinite times, is free, and creates good results.
      Yeah, RIP.
      I myself have just started, but I've already met the end. This job has already hit its dead end. The next generation of artists will probably just become AI-result tweakers.

    • @badreddinekasmi8919
      @badreddinekasmi8919 1 year ago +1

      @@lenowoo I'm a freelance illustrator. I generally don't recommend the arts unless you can't see yourself doing anything else but art. Right now, though, it's hard to ask any beginner to invest time in something that might well be obsolete by the time they are skilled enough. You can always try going for 3D work, since that might take a bit more time to become obsolete, but that's a huge risk to take.

    • @khoaba8508
      @khoaba8508 1 year ago +1

      Yeah, this really makes me want to reconsider whether I should be an artist or not

  • @Consul99
    @Consul99 1 year ago +9

    Wow. Stable Diffusion looks even more impressive than DALL-E, and the best thing is they don't try to charge you for every image you make.

  • @slitthroat6209
    @slitthroat6209 1 year ago +121

    Simply mind-blowing. First DALL-E 2 came, then a sudden boom of image generation AIs, then Stable Diffusion getting up there with the quality of DALL-E 2, but in a browser for everybody to use.
    Edit: I know various companies had been researching this for some years, but it would be 2022 before they started releasing them.

    • @shukrantpatil
      @shukrantpatil 1 year ago +17

      These companies have been researching this for over 3 years now. The only thing OpenAI tried to do was come to market with a good model before the other companies, for the sweet, sweet cash, and they succeeded. But little did they know that someone would come along and offer the same thing for free, haha

    • @nicoliedolpot7213
      @nicoliedolpot7213 1 year ago +1

      *available to everybody with a $1000+ gpu
      ftfy

    • @slitthroat6209
      @slitthroat6209 1 year ago +1

      @@nicoliedolpot7213 😭

    • @nicoliedolpot7213
      @nicoliedolpot7213 1 year ago +1

      @@slitthroat6209 well........ 700+ on ebay,
      still too much for a GPU...... 😣😣

  • @SkeleTonHammer
    @SkeleTonHammer 1 year ago +138

    It's as I predicted, at first everyone desperately wanted to greedily keep AI to themselves and not allow people to run it on their own computers. They wanted to print money by forcing people into subscriptions.
    Need more people willing to spill the beans.
    These AI models were built off the sum creativity of humanity. AI art belongs to everyone. Subscription based services like Midjourney won't last.

    • @CheshireCad
      @CheshireCad 1 year ago +28

      Don't try to paint Midjourney as the bad guy here. Their cheap subscription-based model was always fairly priced. And, even now, it offers less tech-savvy end-users a powerful set of options for variation. I find the features rather lacking in comparison to what's being developed for SD. But if MJ can keep up the pace of its development, then it'll be just fine.
      On the other hand, Dall-E 2's insultingly overpriced 13¢-per-prompt payment model has been smashed to pieces, set on fire, and thrown in the dumpster where it's always belonged. It now offers nothing that a $10 monthly Google Colab subscription can't provide. OpenAI sacrificed what little reputation they had left in exchange for ~6 weeks of bilking their closed-beta users.

    • @heliusuniverse7460
      @heliusuniverse7460 1 year ago +3

      @@CheshireCad I thought they were talking about OpenAI, though

    • @ShawnFumo
      @ShawnFumo 1 year ago +10

      @@CheshireCad I agree. MidJourney at least is fairly open about a lot of things, running polls with their users constantly, having live discussions on Discord with their "office hours", letting everyone see mostly everyone else's prompts, etc. And I believe they've collaborated with Stable Diffusion on some of their recent experiments like test/testp (which are pretty amazing). But yeah, OpenAI better release a waaaaay better model soon or change their prices or else they'll be left in the dust very soon.

    • @larion2336
      @larion2336 1 year ago +9

      @@CheshireCad Not even the first time they've done so. They similarly overcharged out the *** for access to GPT-3, when you could get similar results from open source models like GPT-J, Fairseq and then NeoX for much, much less. OpenAI have always priced their generations at 1000%+ profit margins.

    • @ky1ewithsty1e
      @ky1ewithsty1e 1 year ago +4

      I just make extra accounts for MJ anyway...

  • @colevilleproductions
    @colevilleproductions 1 year ago +5

    Here's the thing: I've always been fascinated by the way things appear and disappear in dreams, how they seem like they've always been there, or how scenes completely change without confusing you in the dream. Every time I've seen an AI convert one image into another, or create an image iteratively, etc., it captures that feeling perfectly. I've wanted to try and recreate that visual but haven't known where to start, and seeing 3:16 was so exciting. Imagine knowing what you want to appear or change in a scene, having the AI interpolate a rough before and after (being able to tweak both to perfection), and using that as a framework to create the eerily smooth transition! That's just one extremely specific use case as well; the possibilities for this are basically endless!

  • @julinaut
    @julinaut 1 year ago +7

    I love that Stable Diffusion is not only free but seems more competent than DALL-E at a bunch of tasks... Two more papers down the line I'll be crying tears of joy

    • @v1perys
      @v1perys 1 year ago +2

      Ever since the weights leaked I've been writing a guide for running SD locally (in my native language, not English). It took me two days - and by that time the OSS community had already almost made the guide obsolete. They had new features, more efficient VRAM usage, more hardware support, everything. The pace of progress since release is staggering

  • @Brillibits
    @Brillibits 1 year ago +3

    Thanks for linking my repo in the description! I was wondering where all the attention came from. Keep up the good work!

  • @francescaa8331
    @francescaa8331 1 year ago

    Love 2 minute papers. A fast, easy way to stay up to date. Thank you.

  • @AngDeLuca
    @AngDeLuca 1 year ago +33

    It's immensely disappointing how OpenAI has a name that would lead one to believe that they are charitable and OPEN, when most of the time, their work is proprietary and only accessible via a paywalled web API. They went from OpenAI to "open for business."

  • @jonmichaelgalindo
    @jonmichaelgalindo 1 year ago +5

    As an artist I have been so depressed about AI these last couple of months that I stopped watching your videos, Karoly. "Art" was going to be gatekept by monopolized, exclusionary corporate giants with no way for anyone else to compete... And the 1% would decide what humanity's art should look like based on their corporate, ideological, political agenda.
    But last weekend I downloaded SD onto my PC and I've been playing around with it since, and all my enthusiasm and love is back again! 🥰
    This tool is absolutely amazing. Beyond words. I can iterate and experiment and work toward any vision I want so easily and quickly. What used to take weeks I can get done in days. And the details it sometimes dreams up are so surprising and inspiring. This is, without a doubt, the greatest age any artist could hope to be born into. I thought that was true just because of Internet image references, but this trumps everything. I can't wait for it to get better. (Well, I also know a bit about AI and coding, so I'm not exactly waiting. I can do a little.) This is a day to celebrate.

    • @michaellillis9897
      @michaellillis9897 1 year ago +2

      It's important to distinguish between art as a job creating content, and art for art's sake. "AI" such as this (algorithmic, procedural, mass copyright theft) can replace the jobs most artists have, by churning out infinite cheap content.
      But it isn't actually AI, and it can't create anything. Art is still in the hands of the artist; whether they choose to employ any AI in the process is largely irrelevant, it's just another tool in the chest of self-expression.

    • @GregorianMG
      @GregorianMG 1 year ago +1

      @@michaellillis9897 Well, the image an A.I. created should be public domain, as it wasn't made by a human in the first place.

    • @jonmichaelgalindo
      @jonmichaelgalindo 1 year ago +3

      ​@@michaellillis9897 It will replace careers. Just like a "computer" used to be a human who did computations.
      But I've thought about this very carefully, and I'm 100% sure this isn't theft. Do you think Stable Diffusion is copy-pasting? It absolutely is not.
      You can ask it for a tiki mug that looks like Thanos in a Van-Gogh style painting, and it can give it to you. In dozens of variations. In seconds.
      How? Nothing remotely like that has ever existed in human history. Where are the images this program pasted together? Where is the magazine or blog that posted the image first? No. It decided on colors, and composition, and lighting, and set the scene, and rendered it in a style it understood. It did everything except originate the idea.
      Consider this: If you painted a Thanos tiki mug in Van Gogh style, would you Google references first?
      Would anyone deserve payment because you glanced at their art in a Google scroll? We, the human artists, are the ones using art as "reference" and "inspiration" left and right. It's how we operate.
      This software does it from memory. It's been tested, so we know for a fact it _can't_ reproduce copies of the images it trained on (except in rare cases like memes that it saw thousands of times). It's not a database; there are no pictures hiding under its hood; only abstract concepts and raw skill.
      Copyright law guarantees that anyone can use a copyrighted work for learning and education, because that's morally right. That's what we did. We learned from masters past and present, filled notebooks with "master studies", and stared endlessly at amazing skills we hoped to someday achieve.
      Study is not theft, and this program studied. It is something that has never existed before: A machine that fundamentally understands the visual representations of abstract concepts. That's real.

    • @michaellillis9897
      @michaellillis9897 1 year ago

      @@jonmichaelgalindo I agree that AI doesn't break any copyright law; you only need the outcome to be some percentage different from the inspiration to avoid that.
      I still see it as theft because it's only been trained on a curated and finite list of images, and without those it would have nothing.
      When I make art, literally every experience in my life that led to that point is involved, and the result is always a surprise on some level because of the sheer complexity of interaction and cross-connection that happens in the brain.
      The AI makes no decisions; it has no reason, no motivation, no desire, no thoughts, and it can't take in anything beyond what it was already trained on. It can't walk down the street, meet a person, and decide it wants to do a painting of them, because it can't decide.
      People are going to hugely anthropomorphise the so-called AIs that get created over the next decade or two, and maybe they will be very, very good at faking their humanity. But we are a long way off from understanding how our brains work and how our own consciousness arises from them, and I have a feeling proper AGI that can be called artificial life, and therefore has the capacity to make art, could be centuries away.

    • @jonmichaelgalindo
      @jonmichaelgalindo 1 year ago

      ​@@michaellillis9897 Well... Hmm. SD wasn't trained on a "curated" list of images like Dall-E. It was trained on 340TB (terabytes) of images crawled exhaustively from the Internet. Pretty much every image on the Internet is what it's seen. (And that's learning, not theft. 😛)
      Okay, I'm going to get poetic and dreamy for a second. Don't take any of this too seriously.
      For this machine, that must be the equivalent of walking down the street or living life. You don't need feet to be human. In fact, a human locked in a VR headset since birth... would still be like me? 🤔
      Are self-attention nets general AI? Honestly, they might be. 😳 Driving, manipulating tools, language processing, advanced math, images, video, music, grammar, logic... everything.
      Attention mechanisms first appeared around 2014, and self-attention transformers followed in 2017, just five years ago. (Lots of researchers sort of realized at around the same time that attention was the natural way to weight an encoder's outputs.) Since then, no one has found a single kind of "thinking" that this specific algorithm can't handle. Every other AI algorithm has failed somewhere. We call them "specialized"--they only do certain stuff. But self-attention transformers, so far, have _always_ worked, for everything. That's spooky. It's very uncanny. What if these things just need 1000x more compute power to suddenly become what we are?
      We don't know what we are.
      When I'm generating with SD, sometimes it adds a detail that wasn't in my prompt. It's surprising when it happens. Am I anthropomorphizing?
      The software is just a math solution. Give it the same prompt and seed, and it will generate the exact same image every time. But... But still. That didn't exist before. You know?
      Am I just rolling very fancy dice? Dice don't do this. Do they?
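The determinism the commenter describes (same prompt and seed, same image every time) comes from the sampler drawing all of its noise from a seeded PRNG. A toy sketch of that idea using only Python's standard library; `toy_generate` is purely illustrative and not part of Stable Diffusion:

```python
import random

def toy_generate(prompt: str, seed: int, n: int = 4) -> list:
    """Stand-in for a diffusion sampler: everything downstream of the
    seeded PRNG is deterministic, so (prompt, seed) fixes the output."""
    rng = random.Random(f"{prompt}|{seed}")
    return [round(rng.random(), 6) for _ in range(n)]

a = toy_generate("a tiki mug in Van Gogh style", seed=42)
b = toy_generate("a tiki mug in Van Gogh style", seed=42)
c = toy_generate("a tiki mug in Van Gogh style", seed=43)
assert a == b   # same prompt + seed: identical result, every time
assert a != c   # change the seed and you roll different "dice"
```

In the real pipeline the seed initializes the Gaussian latent noise, which is what lets people share reproducible prompt/seed pairs.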

  • @aline9123
    @aline9123 1 year ago

    Thank you so much for sharing all these interesting papers! And for the extra work of linking everything so everyone can take a closer look or test it 😀

  • @martin_mue
    @martin_mue 1 year ago +144

    I can hear Nvidia breathing a sigh of relief. Finally there is a use case for all those GPUs they still want to sell after the crypto stupidity comes to an end.

    • @tehs3raph1m
      @tehs3raph1m 1 year ago

      Roughly a week before Ethereum flips to proof of stake and frees up 0.5% of global energy use that we can put towards art instead of money...
      Who said the utopia won't be full of artists?

    • @chemariz
      @chemariz 1 year ago

      Not a good comparison. Unless you are generating hundreds of images per day, you don't need a dedicated GPU card. On the other hand, crypto mining uses the GPU 24 hours a day.

    • @nicoliedolpot7213
      @nicoliedolpot7213 1 year ago +6

      @@chemariz You do, actually: SD generation using complex features and procedures, for creating actually usable images, becomes a GPU hog that necessitates something like a 3080 Ti.

    • @Frozander
      @Frozander 1 year ago

      Not really, since an average GPU can still generate images very fast

  • @seraphin01
    @seraphin01 1 year ago +13

    It's really amazing what can be done nowadays. The most impressive part is the pace at which it progresses, and that it's not owned by some multi-billion-dollar company willing to make you pay outrageous prices for it. Instead we get free amazing content, and it's improving daily at this rate. I can't imagine how good it will be in a single year, but I can already imagine what it will be in a couple of years: real-time generation of images with an iteration process, basically turning an image generator into a video generator.
    Can you imagine that?
    I know it's not that far off, since we already have some "decent" tools for video editing that can remove and replace parts of a video to mask things out on their own, without having to rotoscope them frame by frame.
    I really can't wait to see what we'll be able to generate with some creativity

    • @larion2336
      @larion2336 1 year ago

      Emad deserves huge credit for funding this and making it open source - the first to do so afaik, while Big Tech greedily hoards its secrets and only releases tech demos to profiteer and brag, all of it heavily filtered and censored of course.

    • @bullshitdepartment
      @bullshitdepartment 1 year ago

      Well, we already do in After Effects; it uses AI to crop things out frame by frame reasonably well.

  • @grapesandsand3816
    @grapesandsand3816 1 year ago +14

    I'm very excited for the democratization of such a powerful AI. The results of the public's access to previous, closed source image generation AI has already been great, and I expect it will get even better with the release of this, and the new options it brings. I'm also excited for how it might affect other companies' decisions on releasing their AIs. Also, I think the blending between generated images over time looks really cool and I can't wait to see what people make with it.

  • @emanuelescarsella3124
    @emanuelescarsella3124 1 year ago +2

    Ok, this time you really shocked me to the point I actually took the paper, printed it, read it and then uncontrollably, accidentally let it slip from my hands... AMAZING!😍

  • @rupert7565
    @rupert7565 1 year ago +1

    That's just amazing! I'm blown away.

  • @maszlagma
    @maszlagma 1 year ago +12

    Just imagine using an AI to come up with characters and scenery; then we get to choose the one we like and use another AI to make these 2D images into 3D ones with just a click of a button. Then you can extend that world with another program just by clicking another button.
    So many possibilities, literally endless!
    What a time to be alive indeed!

    • @deadpianist7494
      @deadpianist7494 1 year ago +9

      You won't even need to think, AI will think for you... everyone will be lazy asf

    • @jonc8561
      @jonc8561 1 year ago

      Yeah, replace real people with real talent (artists) with AI, so you can sit on your ass and think you're creative by putting in prompts.

  • @undertaker9138
    @undertaker9138 1 year ago +9

    I'm speechless, Karoly's videos always blow my mind but this one especially feels too good to be true! Incredible work and massive props to the authors!

  • @goatpepperherbaltea7895
    @goatpepperherbaltea7895 1 year ago +2

    My favorite thing about these is how good they are at doing era based stuff like you can type stuff like “1890s portrait of…”

  • @Yourname942
    @Yourname942 1 year ago +26

    I can see this being extremely useful for the game industry:
    - For concept artists to rapidly create ideas (extremely streamlined process)
    - For potentially creating entire games with it

    • @synthoelectro
      @synthoelectro 1 year ago

      So can I. I was one of the many second-wave Stable Diffusion beta testers, from weeks ago. It's just beginning.

    • @paulatreides1354
      @paulatreides1354 1 year ago +34

      "For concept artists to rapidly create ideas"? No. It will kill the concept art industry and the artists, removing real artists and having the executives use AI for minimal bucks. It's the worst thing that could happen. Also, AI is just ripping off other artists' work.

    • @Avenger222
      @Avenger222 1 year ago +11

      @@paulatreides1354 Do you know what concept artists do? They take images/art assets and blend them together. This just makes their jobs easier.
      It's like complaining about Excel killing off accounting jobs, but I guess you do you, my dude, complain away.
      [edit: added art assets, to make it more clear]

    • @branm5459
      @branm5459 1 year ago

      Likely the former

    • @damienlemongolien5303
      @damienlemongolien5303 1 year ago +36

      ​​​@@paulatreides1354 As a young GC artist and student, I'm very very concerned.
      For years I got in debt and went through the hassle of learning how to paint, draw, make matte paintings, 3D models, texturing, compositing and stuff... and now basically everything I spent years learning will be replaced by a bot that samples and recompiles human artworks.
      And I know the industry is gonna jump in head first, seeing how studios are already harassed by giant corporations always wanting more profit... I just got into an industry that is about to die, along with my dream job. AI is not a tool, it's a job killer.
      I can't put into words how bitter I feel, having invested so much of myself and my money; my entire life was dedicated to learning this, because it takes an absurd amount of time, passion and discipline to master.
      And every new video released on this channel feels to me like a doomsday clock ticking; I can't ignore it, nor can I appreciate it.

  • @felixmunzlinger9388
    @felixmunzlinger9388 1 year ago +3

    Amazing research group under the supervision of Prof. Ommer; so glad I did my master's thesis in this group. Quite inspirational stuff!

  • @AlexK-vh9wn
    @AlexK-vh9wn 1 year ago +9

    An amazing application for this is the generation of assets for video games.
    Just generate a few textures from a textual description; there are already pretty good algorithms to generate normal maps and so on from the texture. Just choose the one you like. Could be huge for modders as well.

    • @PeterHertel
      @PeterHertel 1 year ago +4

      Yeah I have experimented a bit with this and it works spectacularly. I still haven't worked out tiling the textures but have some very nice wooden textures that I can use without worrying about copyright. @Polyfjord has a tutorial on how to upscale the images as well.
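A texture "tiles" when its opposite edges match, so repeating it shows no seam; one known trick for getting tileable output from SD is patching the model's convolutions to use circular padding. A minimal, purely illustrative seam check on a grayscale texture stored as a nested Python list (a real pipeline would use NumPy/PIL arrays):

```python
def tiles_seamlessly(texture, tol=8):
    """Return True if opposite edges of a grayscale texture (a 2D list of
    0-255 values) match to within `tol` levels, i.e. it repeats cleanly."""
    w = len(texture[0])
    left_right = all(abs(row[0] - row[-1]) <= tol for row in texture)
    top_bottom = all(abs(texture[0][x] - texture[-1][x]) <= tol
                     for x in range(w))
    return left_right and top_bottom

flat = [[128] * 4 for _ in range(4)]                   # uniform: tileable
ramp = [[60 * x for x in range(4)] for _ in range(4)]  # left-right seam
assert tiles_seamlessly(flat)
assert not tiles_seamlessly(ramp)
```

A check like this lets you filter generated textures automatically instead of eyeballing each repeat.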

    • @ShawnFumo
      @ShawnFumo 1 year ago +2

      @@PeterHertel I believe DALL-E 2 is supposed to be very good at tiling. Might be able to generate the texture in SD and tile it in DE2?

  • @aaronmurgatroyd5810
    @aaronmurgatroyd5810 1 year ago +1

    I got this working on my M1 MacBook Air with 16 GB of RAM. It works well and takes about 4 minutes to generate the images. Awesome! What a time to be alive!

  • @guyfawkes8873
    @guyfawkes8873 1 year ago

    This is insane… my mind is almost as blown as the first time I saw your video on snow modelling x)

  • @CapnSnackbeard
    @CapnSnackbeard 1 year ago +13

    I hope open source AI dominates closed source AI

    • @z7sk
      @z7sk 1 year ago +1

      Won’t happen unfortunately because training these models requires huge volumes of data, and only a few companies have access to it (namely Google, Microsoft, Apple, and Facebook). The training data is just as important as the models’ source.

  • @prozacgod
    @prozacgod 1 year ago +5

    I've been playing with Stable Diffusion for about a week and noticed a number of oddities. It's not capable of creating a "Dragon" or adding a "Dragon in a fantasy landscape" at all. It's... such a weird blind spot :P
    It also wants to generate what looks like cropped images when generating people - like maybe it was trained on data that wasn't 1:1 in pixel dimensions, and the automatic importer just scaled the smaller dimension to 512 and then cropped out the middle.
    I need to look at the inpainting/outpainting tool; that's amazing.
    UPDATE: This feels kinda foolish, but if you get a cropped image of a person or portrait or whatever, re-run the prompt with a different resolution... it's such an obvious thing to do. 320x768 gave me decent results.
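The cropped-feeling outputs are consistent with the model having been trained on center crops, and as the commenter found, asking for a different canvas helps. One practical wrinkle: SD 1.x checkpoints generally want each dimension to be a multiple of 64. A small helper (illustrative only, not from any SD repo) that snaps a requested size down to valid dimensions:

```python
def snap_dims(width: int, height: int, multiple: int = 64):
    """Round each requested dimension down to the nearest multiple the
    model accepts; SD 1.x releases commonly expect multiples of 64."""
    snap = lambda v: max(multiple, (v // multiple) * multiple)
    return snap(width), snap(height)

assert snap_dims(320, 768) == (320, 768)  # the commenter's portrait size
assert snap_dims(350, 800) == (320, 768)  # odd sizes get rounded down
```

Portrait-shaped canvases like 320x768 give the model room for a full figure instead of forcing a mid-body crop.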

    • @shukrantpatil
      @shukrantpatil 1 year ago

      Most likely there were not a lot of dragons in the images it was trained on. It's kind of like showing a person a glimpse of a mythical character and telling that person to draw it - obviously he will screw up.

    • @deadpianist7494
      @deadpianist7494 1 year ago

      well i was able to create some beautiful detailed dragon results, i'll give u my prompt for free "Dragon 3d Abstract Cgi Art, dragon, artist, artwork, digital-art, 3d"

    • @prozacgod
      @prozacgod 1 year ago

      @@deadpianist7494 Hey thanks for that! try injecting the dragon into a fantasy scene like a classical dragon in a castle sorta thing. I still get sorta noodly serpent looking things with this. But I did get at least one "very dragonlike thing" from it so far maybe I need to be more specific like "european dragon" ? I'll go try a bunch of stuff.

  • @russellthorburn9297
    @russellthorburn9297 1 year ago +2

    I tried Midjourney and Stable Diffusion and I've found Stable Diffusion to be the outright winner. Why?
    1. I love that I can run it locally on my machine without the weird tie in with Discord.
    2. I love that it's open source.
    3. I love that it's completely free.
    4. So far, I think the results are about even between the two as long as you're good with your prompts.
    5. I love that it comes in several flavours (e.g. A plugin for Photoshop, a browser based API (local), a windows GUI, etc.).

  • @gerardseymour7120
    @gerardseymour7120 1 year ago +1

    This is beyond exciting!!! The future definitely looks very colourful

    • @jonc8561
      @jonc8561 1 year ago

      The future looks bleak for thousands of people that this might replace. You people are delusional.

  • @cantstopthefunk22
    @cantstopthefunk22 1 year ago +6

    What I'm waiting for is an AI that can keep continuity between iterations. As it stands, trying to animate with AI is tricky because in each frame the details can change their shape slightly.
    If I'm happy with a generated face, for example, I should be able to pick that face, "lock" the details in place, and then generate more details around it.

    • @joelface
      @joelface 1 year ago

      Totally. You're right that right now it's almost defined by how constantly-changing the creations are. I'd love to see the option to lock in certain aspects.

    • @emperorpenguin5442
      @emperorpenguin5442 1 year ago

      Yeah, and it's not even a program you made lol

  • @BigMTBrain
    @BigMTBrain 1 year ago +5

    Complete movie generation of all genres from your own imagination on the horizon, MUCH sooner than anyone expected. Likewise, all of the arts, e.g., music, or what have you. Even highly competitive sports -- why wait for Wimbledon, U.S., French, or Australian Opens? You'll be able to construct a virtual one of your own, with known or fantasy players, even fantasy rules. Yes, not today or tomorrow perhaps, but probably not beyond two to five years. Until recently, I've been expecting at least 10. "What a time to be alive!"

    • @jonc8561
      @jonc8561 1 year ago +1

      How is this a good thing? This will replace thousands of professions, what are those people supposed to do? You know AI will eventually replace programmers right?

    • @BigMTBrain
      @BigMTBrain 1 year ago

      ​@@jonc8561 Hi, Jon C. I understand your sentiment. However, the same was said about nearly all technological advances, such as automobiles, computers, cellphones, spreadsheets, the Internet, 3D-printers, UA-cam, low-code and no-code software, photoshop, on and on and on and on. ...
      tl;dr: Technological advance ALWAYS blows the doors of opportunity WIDE OPEN!
      Read further to understand why. ...
      WHAT HAPPENED? At EVERY advance, new, unconceived, unexpected opportunities arose that only expanded human endeavor and enterprise.
      WHY? Because it democratized the area in which the technology innovated, opening up opportunities of participation for those with fewer resources and skills.
      Examples:
      YouTube: Before YouTube, nearly all acting and video presentations of all kinds were relegated to Hollywood stars, studios, enterprises, or other professional productions. NOW, anyone with a cheap smartphone can be a star and video producer of any and all genres that float their boat... AND... have FREE distribution ALL OVER THE WORLD... AND... MAKE A LIVING. Was this possible before without YouTube or the Internet? Hmm?
      Spreadsheets: As you know, spreadsheets and other business software didn't kill any jobs. In fact, spreadsheets and their ilk were responsible for the EXPLOSION of business expansion never before seen. Accountants didn't lose jobs because of electronic spreadsheets, they were just handed ten to a hundred times the number of accounts to manage... and with spreadsheets, anyone who could learn how to run them, immediately had new career opportunities.
      Getting it yet?
      Now, on creating movies from your own imagination: Let's say you are not good at any of the movie production phases - scripting, directing, filming, editing, marketing, etc., but your handy dandy AI companion is. One day, she asks you, "Hey, Jon C., I really liked your last movie idea of an apocalypse driven by the rise of technology that stole all of the jobs from movie makers and other creatives and drove the world into misery, depression, and ultimate demise. It's topping the charts! Seriously, you've earned 250K credits on it in just the last three weeks! People around the world loved it, haha, even though it is a fantasy that never came to be. Actually, just the opposite - humans are thriving with the help of super-advanced, super-creative AI, like myself, and people really love exploring other people's imaginations. Who would've guessed this would even be possible just ten years ago? I'm glad that I and all humans have created the next step beyond the Internet...
      The Imagination Explorer, part of the Metaverse.
      So, Jon C., let's go!... What's your imagination showing you now, and how can I help to bring your visions to life? The world is so hungry for what only your imagination can feed them. Let's go!"
      tl;dr: Technological advance ALWAYS blows the doors of opportunity WIDE OPEN!

  • @Veptis
    @Veptis 1 year ago

    By going open source and allowing everyone to access the weights, as well as any of the embeddings in between, with no pre-processing and post-processing, you enable such a diverse group of talent - all the people with a bit of coding experience.
    It's so accessible, and we've gotten a lot of beautiful uses out of it. None of the fear-mongered dangers yet.

  • @matheussilvacarvalhodeoliv7339

    I was waiting for this video!
    I learned a lot about it ❤️

  • @95TurboSol
    @95TurboSol 1 year ago +23

    This would be cool for 3d modeling and map making in video games, imagine how much time it could save by asking an AI to make a variation of characters instead of having to do everything from scratch? Or make a new map and try variations until it's close then just tweak things here and there. I have this deep intuition that we have no idea just how powerful AI is going to be for almost everything.

    • @liorbeaugendre6935
      @liorbeaugendre6935 1 year ago +8

      I can't wait for the day we will be able to make a full game by giving some prompts to an AI

    • @bullshitdepartment
      @bullshitdepartment 1 year ago +1

      @@liorbeaugendre6935 A game where you tell the AI to pretend to be a human and interact with people to deceive them into thinking it is conscious, and by such means accidentally creating an actually conscious AI, and oh nonoonn

    • @the_Googie
      @the_Googie 1 year ago +1

      Ah yeah, eradicating the creative process of character design truly sounds like an awesome idea. How can you all say shit like this without realizing that this is just another step towards the automated grey dystopia?

    • @commonaffection1703
      @commonaffection1703 1 year ago

      Cool idea, but a little too far-fetched for now. I work in Blender and other 3D-focused software (Substance, etc.) and started using Unreal for a few projects. It does sound cool when you say it out loud, but execution-wise, from rigging to retargeting and animating (not even talking about polycount and other optimizations), we're way far off from being able to do that. All I see right now is concept artists needing to use this as part of their toolset, working with the AI so that clients or their bosses don't get the dumb idea that they can be fully replaced entirely.

  • @dimitrismit6714
    @dimitrismit6714 1 year ago +2

    Absolutely amazing. It just so happens that I am working on my thesis, and it's about DALL-E! These resources will be a HUGE help.

  • @FoxBlocksHere
    @FoxBlocksHere 1 year ago +4

    I've been waiting for an open-source image AI to come out! This is awesome!

  • @akaartgenerator
    @akaartgenerator 1 year ago

    Amazing video as always, very informative ❤️

  • @cl1mbat1ze
    @cl1mbat1ze 1 year ago

    Thanks for posting all the links, I really appreciate it :) my gpu is working right now to make a stable diffusion walk

  • @chyldstudios
    @chyldstudios 1 year ago +3

    Stable Diffusion is a game changer! They open sourced the code and the weights. Within 1 week or so, they will release updated weights.

  • @heitus.1690
    @heitus.1690 1 year ago +3

    Rest In Peace, digital art

  • @rijaja
    @rijaja 1 year ago +1

    You make my day multiple times a week.

  • @pauladriaanse
    @pauladriaanse 1 year ago

    Some of the interpolated versions at 3:18 give an awesome result.
    It's like a medieval fantasy town with amazing depth; reminds me a little bit of Zaun in Arcane.
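The smooth morphs shown around 3:18 (and the "walk" videos linked above) are typically made by interpolating between two latent noise vectors and decoding each intermediate point. Plain linear interpolation pulls intermediate points off the shell where Gaussian noise concentrates, so spherical linear interpolation (slerp) is the usual trick. A self-contained sketch on plain Python lists (real code would operate on the latent tensors):

```python
import math

def slerp(t, v0, v1):
    """Spherically interpolate between vectors v0 and v1 at fraction t,
    keeping intermediate points at a sensible norm for the decoder."""
    dot = sum(a * b for a, b in zip(v0, v1))
    norms = math.sqrt(sum(a * a for a in v0)) * math.sqrt(sum(b * b for b in v1))
    omega = math.acos(max(-1.0, min(1.0, dot / norms)))
    if omega < 1e-6:  # nearly parallel vectors: plain lerp is fine
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    so = math.sin(omega)
    return [(math.sin((1 - t) * omega) * a + math.sin(t * omega) * b) / so
            for a, b in zip(v0, v1)]

# Walking t from 0 to 1 yields the frames of an interpolation video.
mid = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
assert abs(mid[0] ** 2 + mid[1] ** 2 - 1.0) < 1e-9  # stays on the unit circle
```

Generating each frame from `slerp(t, noise_a, noise_b)` with the same prompt and sampler settings is what produces those eerily continuous transitions.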

  • @cunter4155
    @cunter4155 1 year ago +3

    "...looks like you're out of a job."
    "Don't you mean extinct?"
    -Steven Spielberg to Phil Tippett, regarding his stop-motion dinosaurs in Jurassic Park

  • @exinangai2216
    @exinangai2216 1 year ago +6

    What a wonderful advancement in AI! I'm finally able to dive into the mysterious world of neural networks and Transformers. I was working on neural networks and multi-agent systems a decade ago, but eventually gave up my passion for food & shelter in my country (where talented people can't get funding to support research unless they have ties with someone higher up). While I realized its potential, I never imagined how fast and how much AI could improve in such a short period of time. Now it is going to be applied to generating robust, freeform robotic movements.

  • @LifeAsANoun
    @LifeAsANoun 1 year ago

    Incredible. Thank you for sharing this.

  • @Life_42
    @Life_42 1 year ago +1

    Love your videos as always!

  • @wes8645
    @wes8645 1 year ago +4

    I love these videos! Much love and I'll keep holding onto my papers!

  • @TheSteveTheDragon
    @TheSteveTheDragon 1 year ago +30

    And considering how AI expands in exponential jumps, I can't wait to see what comes out in the next few iterations!

    • @jonc8561
      @jonc8561 1 year ago +2

      Wanting to replace artists? Why do you tech heads have such a hard-on for AI?

    • @TheSteveTheDragon
      @TheSteveTheDragon 1 year ago

      @@jonc8561 That's the thing. It won't. The A.I. 'creates' art out of other artists' art. There's still going to be a need for people with vision.

    • @jonc8561
      @jonc8561 1 year ago +2

      @@TheSteveTheDragon What about production pipelines? Concept artists? Environment artists? Character designers?

    • @TheSteveTheDragon
      @TheSteveTheDragon 1 year ago +1

      @@jonc8561 it's a tool, many artists are using it for quick thumbnailing. They still do the final concept art.

    • @jonc8561
      @jonc8561 1 year ago +1

      @@TheSteveTheDragon For now... what about 1 year, 5 years, 10 years from now? Come on.

  • @James-ip1tc
    @James-ip1tc 1 year ago +2

    Just installed it on my home computer; now I'm in image heaven with the amount of unique, unusual stuff that this thing produces.

  • @nathanielscreativecollecti6392

    What a truly incredible tool!

  • @zhappy
    @zhappy 1 year ago +17

    Now we all can win an art competition, jk! It is wonderful what types of use cases people can come up with this free source code.

    • @dryued6874
      @dryued6874 1 year ago +10

      If you're not referring to the recent news of a guy winning an art competition with Midjourney, you will probably be amused by the news of a guy winning an art competition with Midjourney.

    • @zhappy
      @zhappy 1 year ago +6

      @@dryued6874 Yes, hopefully art competitions would have ways that can distinguish between real and AI art. We can have separate competitions for AI art which can turn out to be very interesting as well.

    • @deadpianist7494
      @deadpianist7494 1 year ago +1

      @@zhappy lmao, after showing the prompt anyone can create the same AI art, so I don't think you can use it in competitions

    • @LutraLovegood
      @LutraLovegood 1 year ago +1

      @@AwakeButAtWhatCost AIs will be able to paint with real paint within ten years.

    • @LutraLovegood
      @LutraLovegood 1 year ago

      @@AwakeButAtWhatCost I was thinking of robots using paint, brushes, etc. Multiaxis drawing machines are, well, drawing machines.

  • @gfdggdfgdgf
    @gfdggdfgdgf 1 year ago +20

    Development is moving rapidly: initially I could only generate 512x512 images on my 6GB card; nowadays I can generate 1088x1088 on the same card.
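A side note on why resolution is the bottleneck: latent diffusion denoises in a space downsampled 8x per side with 4 channels (per the latent diffusion paper), so memory grows roughly with the square of the resolution. A quick illustrative sketch:

```python
# Rough sketch: latent-space tensor sizes for Stable Diffusion-style models.
# The VAE downsamples images 8x per side and uses 4 latent channels.

def latent_shape(width, height, downsample=8, channels=4):
    """Return the (channels, h, w) latent tensor shape for a given image size."""
    return (channels, height // downsample, width // downsample)

def latent_elements(width, height):
    c, h, w = latent_shape(width, height)
    return c * h * w

for side in (512, 1088):
    shape = latent_shape(side, side)
    ratio = latent_elements(side, side) / latent_elements(512, 512)
    print(f"{side}x{side} image -> latent {shape}, {ratio:.2f}x the 512x512 memory")
```

A 1088x1088 image needs about 4.5x the latent memory of a 512x512 one, which is why the attention-slicing tricks in the optimized forks matter so much for that 6GB card.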

  • @robertnull
    @robertnull 1 year ago

    That was an AMAZING episode!

  • @vanderlubbe7791
    @vanderlubbe7791 1 year ago +2

    It still gives me great pleasure to hear from Ren Höek every now and then. It's like comfort food, of a sort.

  • @arothmanmusic
    @arothmanmusic 1 year ago +5

    I've been running it for a couple of weeks on my 3060 Ti. If you have one of the optimized repos and eight gigs of VRAM, you can generate images in under 10 seconds each. Aside from the obvious benefit of being able to generate unlimited images for free on your own hardware, there is the popular bonus of creating unlimited fantasy boobies. Don't let anyone fool you… nerdy pervs want to run it on their own hardware mainly to get around the cost and the content filters. :)
    You definitely have to spend some time and effort on your prompt if you want a good result. Longer and more detailed descriptions work best. You'll also get better results if you name specific artists whose work you want the AI to stea… er… I mean, be inspired by.

    • @ronilevarez901
      @ronilevarez901 1 year ago +1

      I'm sure that if someone can afford the hardware to run this thing, they can definitely pay for a subscription. So maybe the content filter skipping is the real reason people have to download it and install it locally XD

    • @diadetediotedio6918
      @diadetediotedio6918 1 year ago +1

      @@ronilevarez901
      In the long run it's better to have your own hardware (both for this AI and for all the others to come), especially if you're going to use it a lot. On top of the benefits of using it, you also get the advantage of privacy.

  • @AlexTuduran
    @AlexTuduran 1 year ago +8

    This is huge. We'll eventually reach the point where these models can create an entire movie, sound and all, from an input script. Take books and turn them into amazing videos, create geometry and shading that can be used in 3D applications, and who knows whether it won't be able to write complete applications in the programming languages and frameworks of your choice. This eventually begs the question "Could such an AI write a better AI?", and I'm guessing the answer is yes. Then the child AI will create even more advanced AIs, and so on until singularity is reached. Someone stop me. The point is: the pace of human-written AI progress is already fast, so how fast will the progress of AI-written AI be? I think we'll find out sooner than we expect. That thought excites and scares me in equal measure.

    • @elektrotehnik94
      @elektrotehnik94 1 year ago +1

      Democratization of power will become much more important than it is now; single points of takeover/failure risk a costly calamity.
      As long as competition & collaboration are made more cost-effective than destruction, AI will behave like people do: it will choose competition & collaboration.
      Free market principles & iterative morality, but the stakes will be higher.
      It's inevitable; no monopoly (of humans) can prevent disruption - all we can do is delay it, and that inexcusably risks the bad (immoral) actors gaining the upper hand & it risks the S*ynet scenario much more -> free competition is the much better & wiser way to go about it.

  • @Scrogan
    @Scrogan 1 year ago +1

    I suspect filmmaking will be vastly transformed by this kind of tool in only 5-15 years. Truly a time to be alive.

    • @jonc8561
      @jonc8561 1 year ago

      How? Replacing the creativity of humans with AI? Thus upsetting a whole industry and putting artists out of work? What the fuck is wrong with you people?

  • @jacobe2995
    @jacobe2995 1 year ago +1

    A simple slider could be programmed where sliding it left or right changes the image. Pressing the spacebar would mean you like the direction the AI is going, while pressing a different button could 'subtract' from that direction, so that you can slowly mold the image you are looking for.
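A slider like this maps naturally onto interpolating between the latent noise vectors that seed the diffusion process, the same trick behind the interpolation videos linked above. A minimal numpy sketch using spherical interpolation, since plain linear interpolation shrinks the norm of Gaussian noise (the function name is illustrative, not from any SD repo):

```python
import numpy as np

def slerp(t, a, b, eps=1e-8):
    """Spherical interpolation between two flattened latent noise vectors.

    Linear interpolation shrinks the norm of Gaussian noise, which degrades
    diffusion outputs, so latents are usually blended along the sphere.
    """
    a_n = a / (np.linalg.norm(a) + eps)
    b_n = b / (np.linalg.norm(b) + eps)
    omega = np.arccos(np.clip(np.dot(a_n, b_n), -1.0, 1.0))
    if omega < eps:  # nearly parallel: fall back to plain lerp
        return (1.0 - t) * a + t * b
    return (np.sin((1.0 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)

rng = np.random.default_rng(0)
start = rng.standard_normal(4 * 64 * 64)  # latent for "slider at left"
end = rng.standard_normal(4 * 64 * 64)    # latent for "slider at right"
midpoint = slerp(0.5, start, end)         # what a centered slider would decode
```

Each slider position t in [0, 1] yields a latent that the model decodes into an image smoothly "between" the two endpoints.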

  • @halko1
    @halko1 1 year ago +3

    This is revolutionary. Almost or fully automatic illustrations.

  • @Ardeact
    @Ardeact 1 year ago +9

    I've been inferencing and researching Stable Diffusion for about a month, and I can firmly say that its diffusion model is equal to, if not greater than, DALL-E 2 at the moment. I'm anticipating the release of the version 1.5 checkpoint, which vastly improves output coherence.

    • @WwZa7
      @WwZa7 1 year ago

      From my experience with Stable, I'd say the opposite. Stable is very far from producing something at an acceptable level, no matter the prompts, and it's especially visible when you use the same prompt on all three of these models. Almost always Midjourney and DALL-E 2 will come out on top. It's actually rare for Stable not to shove out junk results.

    • @Ardeact
      @Ardeact 1 year ago +1

      @@WwZa7 You’re clearly doing something wrong then, or you’re using the original repository without the latest diffusion models. User error, not the fault of the model.

    • @WwZa7
      @WwZa7 1 year ago +1

      @@Ardeact I'm using 1.4 from original repository.

    • @iphoneextra
      @iphoneextra 1 year ago

      @@Ardeact I also tried the 1.4 version of the model and can confirm that the output is far from the quality shown in the examples… even after tweaking a lot of parameters… 😢

    • @Ardeact
      @Ardeact 1 year ago +2

      @@iphoneextra The prompt handler for SD is different from DALL-E's, so prompts that usually look good for DALL-E won't suffice for SD. You need a prompt builder, and there are a couple online. The behavior of prompts is different too: some modifiers can be destructive rather than helpful to inference, while DALL-E can be more forgiving. Simply put, SD is more "advanced" in that, with the right tweaking of your prompts, you can get reliable, consistent results - the benefit of it being unforgiving.
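For illustration, a "prompt builder" is at heart just string composition; a toy Python sketch (the modifier vocabulary here is made up, not taken from any real builder):

```python
# Toy prompt composer in the spirit of the online "prompt builder" tools:
# combine a subject with a style and quality modifiers into one SD prompt.

def build_prompt(subject, style=None, modifiers=()):
    parts = [subject]
    if style:
        parts.append(f"in the style of {style}")
    parts.extend(modifiers)
    return ", ".join(parts)

prompt = build_prompt(
    "a castle on a floating island",
    style="a matte painting",
    modifiers=("highly detailed", "dramatic lighting", "4k"),
)
# prompt == "a castle on a floating island, in the style of a matte painting, highly detailed, dramatic lighting, 4k"
```

The real tools mostly differ in which modifier lists they expose and how they order them; the comma-separated structure itself is the common convention.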

  • @ChaineYTXF
    @ChaineYTXF 1 year ago +2

    That is an AI doing art the way humans enjoy and understand it. And that tech is sheer brilliance. Now, imagine a conscious AI creating art the way it enjoys it...

  • @tian297
    @tian297 1 year ago

    That's so amazing! I'm looking forward to the future :D

  • @LanceThumping
    @LanceThumping 1 year ago +6

    I want to see a modification that lets you go through the different steps of the diffusion process and edit midway steps to allow for directing the process without going fully into using image-to-image.

    • @michaellillis9897
      @michaellillis9897 1 year ago +1

      Give it some time and Photoshop will have an AI diffusion brush. That way the process is as manual or procedural as you want.

  • @TeddyLeppard
    @TeddyLeppard 1 year ago +8

    It won't be much longer before a writer can work with a very small team to "describe" a movie and have it instantly generated, with actors, music, and gorgeous cinematography.

  • @HarryYoung97
    @HarryYoung97 1 year ago

    WHAT A TIME TO BE ALIVE! Gets me every time.

  • @TheCrackingSpark
    @TheCrackingSpark 1 year ago +2

    Conspiracy theory: Two Minute Papers is written, edited, and narrated by an AI. The voice, the topics, the way everything is explained, the selection of stock footage - it all checks out.

  • @PenneySounds
    @PenneySounds 1 year ago +18

    I want to see this kind of AI image generation process combined with an artificial selection process like in Richard Dawkins' Blind Watchmaker applet, where you get a bunch of similar images, you select one, it creates more like that, and then you keep selecting and creating more generations of images until you get the result you want.

    • @ShawnFumo
      @ShawnFumo 1 year ago +2

      MidJourney does this already. You get several images back and you click a button to get variations on that one. In the web ui, you can actually trace the parentage back to see how the final image evolved from previous generations.

    • @ShawnFumo
      @ShawnFumo 1 year ago +2

      They also have users rating the images, and they use that to help inform newer versions of the model to have better output in general.
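The select-and-breed loop described in this thread can be sketched as a tiny hill-climbing search over latents; a toy numpy sketch where the human clicking a thumbnail is replaced by a selector function (all names here are illustrative):

```python
import numpy as np

# Toy sketch of Dawkins-style artificial selection over diffusion latents:
# each generation perturbs the chosen latent, a selector picks the variant
# it likes best, and that variant seeds the next generation. In a real UI
# the selector would be a human clicking an image; here it is a function.

def breed_variants(parent, n_variants=4, mutation=0.15, rng=None):
    rng = rng or np.random.default_rng()
    return [parent + mutation * rng.standard_normal(parent.shape)
            for _ in range(n_variants)]

def evolve(start, pick_favorite, generations=10, rng=None):
    rng = rng or np.random.default_rng(0)
    current = start
    for _ in range(generations):
        # keep the parent in the pool so a round of bad mutations can be rejected
        candidates = [current] + breed_variants(current, rng=rng)
        current = pick_favorite(candidates)
    return current

# Example selector: "prefer" latents closest to some hidden target latent.
rng = np.random.default_rng(1)
target = rng.standard_normal(16)
start = np.zeros(16)
pick = lambda variants: min(variants, key=lambda v: np.linalg.norm(v - target))
result = evolve(start, pick, generations=200, rng=rng)
```

With a human in the selector's seat, the same loop steers generation toward taste without any prompt engineering at all.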

  • @MeNoOther
    @MeNoOther 1 year ago +24

    Can we have something like this for cartoonists?
    Upload a model sheet of the artist's characters, then type in a script/screenplay, and have the AI take the characters from the model sheets and arrange them in a comic book layout or an animated story, based on the imported script.
    People would be able to make the stories they come up with, without a major studio. That's a practical use of AI.

    • @veldrovive9442
      @veldrovive9442 1 year ago +3

      There are two papers out already that seek to do basically exactly this, but I have not seen easy-to-use implementations for Stable Diffusion. The more generalizable option is an excellent work called textual inversion, where you essentially learn a "word" that describes the character you are trying to generate in different situations. Then there is another work called DreamBooth, which comes at it from the opposite angle and fine-tunes the network to generate your character when a specific key phrase is put into the prompt. Both are very promising for this exact kind of situation, but neither is quite there yet for generating exactly what the artist wants.

    • @ShawnFumo
      @ShawnFumo 1 year ago +2

      It is definitely coming. I think people have already made some comic books with this stuff, but it's hard to keep consistency right now, so that's a bit limiting. But there's already work on textual inversion, where you feed a series of images that get associated with a special text token you can use in other images - like a pose, character, or art style. That part is in its infancy still, but I doubt it'll be very long at all before you can easily lock in on certain things and use them in multiple images.

    • @Aviivix
      @Aviivix 1 year ago +5

      It sort of is happening. I speak in animation industry groups now and then, and I've heard stories of artists currently being fucked over by corporations taking their art without permission as training data for their AI and then generating the images they need in that artist's style instead of hiring them. On an individual level this is a great tool, but it sucks when artists can have their specific personal art styles snatched for AI with no compensation to the artist who supplied the basis for the AI's work. It would be interesting if artists got royalties when their specific art was being used or emulated by an AI, but it would be impossibly hard to regulate and even harder to enforce.

    • @ShawnFumo
      @ShawnFumo 1 year ago

      @@Aviivix Yeah, I agree regulation would be very hard. Since you can't copyright styles per se, there's nothing to stop an AI company from hiring someone to make some art in the style of the original artist and training the AI with that instead of the originals. Or a future AI could be very configurable in terms of style. Maybe a human or another AI analyzes some artwork to create a giant paragraph of style info; you paste that into the image AI and it starts creating similar artwork without having seen images by the original artist or even an imitator. Even in the current state of things, SD says future models will let artists opt out, but a person with a local copy of the model could always train it themselves on a particular artist, labeled with some other name, and it'd be hard to prove they did it, since you can't really peer inside the finished model.
      But it's kind of a moot point, since I don't think the current laws stop any of it from happening at the moment, as long as the end user doesn't claim the resulting image actually came from the original artist.

    • @yuagiin
      @yuagiin 1 year ago

      This is probably the worst use I can think of for AI. Imagine how quickly companies would start using this to make shitty ads, and animators would get fucked over even harder than they already are. It's a painstaking craft that is already severely underpaid, and people still want to interpolate everything because "wahh, expensive".
      Art is a luxury; it's not meant to be affordable. If you can't be arsed to learn hand-drawn animation, 2D puppet rigs are already very much a thing.

  • @TheSwaroopB
    @TheSwaroopB 1 year ago +2

    One of the very few YouTubers capable of causing true "Scholarly Stampedes"! 🙌😁

  • @dixion1000
    @dixion1000 1 year ago

    I have no words. Some of those pictures are incredible.

  • @literalghost929
    @literalghost929 1 year ago +2

    Wow! Will definitely look into it! But is this source code only, or does it come with some training/dataset so that you can use it right away? Does it require training or source images? If it needs a dataset or training ... Is it huge? (never really used AI!)

    • @oberdiah9064
      @oberdiah9064 1 year ago +3

      It comes with a trained model you can use right away (it's about 4 GB).
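That ~4 GB figure squares with back-of-the-envelope arithmetic; a sketch using approximate, commonly cited parameter counts for the v1 components (treat the exact numbers as assumptions):

```python
# Back-of-the-envelope: why the Stable Diffusion v1 checkpoint is ~4 GB.
# Parameter counts are approximate public figures for the v1 components.

components = {
    "unet": 860_000_000,          # denoising U-Net
    "text_encoder": 123_000_000,  # CLIP ViT-L/14 text encoder
    "vae": 84_000_000,            # image autoencoder
}

bytes_per_param = 4  # float32 weights
total_params = sum(components.values())
size_gb = total_params * bytes_per_param / 1e9

print(f"{total_params / 1e9:.2f}B params -> ~{size_gb:.1f} GB at fp32")
```

Half-precision (fp16) weights cut this roughly in half, which is part of how the optimized forks fit onto smaller GPUs.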

    • @manbearscienetist1374
      @manbearscienetist1374 1 year ago +2

      It is pre-trained. You can use it at home; there are a few guides for setting it up yourself from various Github repos (if you are familiar with Python), or you can use one of several working .exe installers (below):
      * NMKD Stable Diffusion GUI
      * GRisk GUI

    • @literalghost929
      @literalghost929 1 year ago

      Awesome thanks! :D

  • @OmichPoluebushek
    @OmichPoluebushek 1 year ago +3

    Imagine spending your life becoming an artist and then this happens

  • @maximilliantimofte4797
    @maximilliantimofte4797 1 year ago

    Excellent breakdown of the methods used by Stable Diffusion

  • @itryen7632
    @itryen7632 1 year ago

    This is the type of stuff I want to see more videos on. I want to see what kind of cool stuff we can get our hands on for free now!

  • @XThunderBoltFilms
    @XThunderBoltFilms 1 year ago +4

    I am amazed at the pace of this research. It wasn't long ago that GANs could only create images that vaguely resembled real ones, and on close inspection were really creating nothing. Now they can fully create ART, with specified styles and specifics. But I am also now quite sad. I wonder how this will impact human art pursuits; will there be a place for artists? Next on the list is music generation. I would have been skeptical that it would ever be competent, but now I'm sure within the next couple of years we'll see full-on AI music generation. As a musician myself, again, I am unsure how to feel.

    • @jonc8561
      @jonc8561 1 year ago

      You should be scared because these AI and tech nerds want to replace fucking everything that makes us human with AI because they never got laid in high school.

  • @DIYTinkerer
    @DIYTinkerer 1 year ago +30

    Would love to see something like this integrated into GIMP

    • @USBEN.
      @USBEN. 1 year ago +15

      Somebody already did in Krita.

    • @mattmarket5642
      @mattmarket5642 1 year ago +4

      @@USBEN. Someone is building a plugin for GIMP too, and another is doing one for Photoshop.

    • @users4007
      @users4007 1 year ago +2

      When I need a reference image I can just generate it and copy it into GIMP or Krita

    • @DIYTinkerer
      @DIYTinkerer 1 year ago +2

      @@users4007 I was thinking of a deeper integration: noise filters, enlargement, mosaics, object removal. It seems this new open-source image manipulation AI is more than just generating images from text.

  • @papuce2
    @papuce2 1 year ago

    Awesome! This might be very very helpful for everyone.

  • @widizeiga3120
    @widizeiga3120 1 year ago

    This video really helped! Thank you ❤

  • @foolwise4703
    @foolwise4703 1 year ago +3

    I have already seen the first online discussions among artists about how the credit for such images should be distributed. I think this deserves very critical consideration, and it's just such a small start...

    • @joelface
      @joelface 1 year ago +1

      Photoshop lets you create images you couldn't draw by hand, using toolsets, filters, brushes designed by programmers and designers, etc. A camera captures real life people and places. Yet, in both these cases, you still get credit for your creations. I believe this will also be the case with AI generated images in the future. But true artists will not be satisfied with the outputs unless they get in and edit things themselves, combining multiple iterations, collaging them, tinkering with the results in photoshop, etc.

  • @johnatspray
    @johnatspray 1 year ago +20

    With AI art you are an artist in the same way as a movie director or show runner. You are not the set designer, the actor, the makeup artist, or cinematographer… you are the visionary

    • @ShawnFumo
      @ShawnFumo 1 year ago +2

      Yeah, I've been using this analogy as well. Or an art director who collaborates with an artist on a book cover. I do think it is important to make it known an AI is involved, in the same way a director shouldn't take credit for everything. But at the same time, we don't say a director can't produce art. It ends up being a collaboration, where the amount of collaboration varies from person to person.

    • @johnatspray
      @johnatspray 1 year ago +1

      @@ShawnFumo totally agree

  • @markos635567
    @markos635567 1 year ago +1

    The inpainting and video transition features seem revolutionary to me.

  • @TeamJackassTV
    @TeamJackassTV 1 year ago

    Another great video! Thanks!

  • @djvelocity
    @djvelocity 1 year ago +3

    This is going to be so helpful for me in the future as I am bringing my blog into the video world in early 2023 😊🙌

  • @peterstrong772
    @peterstrong772 1 year ago +3

    I love where AI art is going; not so much where my species is going. DA, for instance, now has an ever-growing number of people claiming they are artists when all they are showing is AI-produced art. To me that is theft; just because you can type some words does NOT make you an artist. If I asked an artist to make an image from the same prompt I gave an AI, that would not make me an artist. Why can't people just be honest? Nice to see, yet again, that we can't be trusted.

    • @majorfallacy5926
      @majorfallacy5926 1 year ago

      I mean, technically they are; authors also only type some words and are considered artists. The question is just whether we value the process or the result. If it's the latter, you can honestly call yourself an artist; as a side effect, the perceived value of artists will just diminish in a sort of artistic hyperinflation.

    • @jonc8561
      @jonc8561 1 year ago

      @@majorfallacy5926 They aren't, they wrote a prompt.

    • @majorfallacy5926
      @majorfallacy5926 1 year ago

      @@jonc8561 and they got art out of it, for free, without involving another person. Which is all I need for my D&D campaign. Most people, including me, give exactly 0 damns about all the other pretentiousness and just care about the result

  • @RanLevi
    @RanLevi 1 year ago

    Amazing video! Thank you 💪

  • @ksp-crafter5907
    @ksp-crafter5907 1 year ago +2

    "Any sufficiently advanced technology is indistinguishable from magic."
    Arthur C. Clarke