AI that makes thumbnails (or any image)

  • Published 9 Jun 2024
  • A video about using AI to generate YouTube thumbnails. I explore the classic GAN method and compare it with a newer method called diffusion. One turns out to be better than the other!
    Reviewed by Andrew Carr: / andrew_n_carr
    Disclaimer: All thumbnails were deleted after use. I do not aggregate YouTube data.
    The losses are not exactly inverse for the generator and discriminator because the two are not trained on the same data.
    LINKS
    Twitter: / max_romana
    Discord: / discord
    Patreon: / emergentgarden
    The life engine: thelifeengine.net
    SOURCES
    Original GAN Paper: proceedings.neurips.cc/paper/...
    Face interpolation: • StyleGAN2 Interpolatio...
    BigGAN Paper: arxiv.org/abs/1809.11096
    thispersondoesnotexist.com
    Flower Gan: / 1527890938386857984
    Katydid: • The Katydid (Leaf Bug)
    Mantis: • Praying Mantis Hunts a...
    Diffusion Paper: arxiv.org/abs/2006.11239
    Diffusion beats GANs: arxiv.org/abs/2105.05233?curi...
    Blog Post: gretel.ai/blog/diffusion-mode...
    Diffusion explanation: • Diffusion Models | Pap...
    Diffusion Visualization: / 1537042940475883520
    Water Diffusion: • demo - (hot and cold w...
    Dall-E 2: openai.com/dall-e-2/
    Imagen: imagen.research.google/
    CogView: arxiv.org/abs/2105.13290
    Parti: parti.research.google/
    TIMESTAMPS
    (0:00) Intro
    (0:32) The Goal
    (1:11) The Data
    (2:15) Latent Image Generators
    (3:03) GANs
    (4:35) GAN training
    (7:45) Diffusion
    (8:55) Diffusion training
    (10:43) ☆Generated thumbnails☆
    (13:58) Diffusion beats GANs
    (15:27) Conclusion
    (16:28) Outro
    MUSIC
    • Closed Circuits
  • Science & Technology

COMMENTS • 61

  • @0xdecaf
    @0xdecaf 1 year ago +15

    Fantastic video, this channel deserves much more attention. You have a real talent for breaking down complex ideas and making them easy to understand. Thanks!

  • @markmarketing7365
    @markmarketing7365 1 year ago +16

    It would be super awesome to have a GAN trained to do camouflage. In fact, there are papers that describe this already. They train a GAN with one NN to colour a triangle at a random position on a random background, and a second NN to try to detect it. As a result the triangles take on patterns that are harder to make out. There are cool websites where you can try to spot these triangles yourself.
    I've always wanted to do a few variants of this. Firstly I'd love to see this done, but with a "poisonous" triangle added to the image along with the camouflaged triangle, one with some very distinct pattern. Then the spotting NN is penalised for detecting the position of that triangle. It would be awesome to see if aside from camouflage, mimicry would evolve - and which one would be more likely.
    Secondly a variant where some parameter influences the contour of the triangle as well, like a frayed edge, would be cool. I'm sure you'd get some crazy good results.

    • @darkwise8628
      @darkwise8628 1 year ago

      hey, can you please post a link to such a website? thanks in advance!

  • @HotDiceMiniatures
    @HotDiceMiniatures 1 year ago +6

    What an amazing video. Thank you Emergent Garden. What a great channel name btw. You're awesome.

  • @Crasterius
    @Crasterius 1 year ago +6

    This is an awesome project. I hope you can take this to the next level.

  • @SantosEnoque
    @SantosEnoque 1 year ago

    🎉 I am glad YouTube recommended your channel 🔥🔥🔥🔥 this is something else 🎉

  • @MakerBen
    @MakerBen 1 year ago +3

    This is super neat! Amazing explanation of diffusion!

  • @sohamjobanputra2914
    @sohamjobanputra2914 1 year ago +2

    Idea#1: AI that can generate great comments.
    Idea#2: AI that can generate a script for a movie/stories.
    Idea#3: AI that can generate tips for demotivated people.
    Idea#4: AI that can tell approximately when human civilization will end.
    Idea#5: AI that can generate ideas like this.

  • @Jennn
    @Jennn 1 year ago

    Omg. This is the best sick day ever. Thank you sir for taking the time to teach us all that you have! I just cannot stop watching your content!

  • @ninjalacoon
    @ninjalacoon 1 year ago +1

    You know what. I am realizing that the process of diffusion is a lot like Reddit's "r/place" event that they have done in the past. People would contribute pixel colors individually to anywhere on the page canvas and it was always amazing to me to see how the image would "evolve" over time. People would organically arrive at recognizable images by seeing the patterns that would emerge from others that had laid down pixel colors before them. As the images began to take shape, in an iterative way more pixels would come to fill in the gaps and hone it into a final form that would resemble a flag, a person's face, a logo, etc. In this case, however, it was an image that was collectively well known to the people that participated. Obviously a lot of the images were coordinated and didn't undergo this process, but there were a lot of areas where this seemed to be the case.

  • @literailly
    @literailly 1 year ago +1

    Awesome video and explanation, thank you!

  • @franzfungis4264
    @franzfungis4264 1 year ago

    Love the presentation!

  • @HM-rf2ov
    @HM-rf2ov 1 year ago +3

    My experience with GANs is exactly the same. After solving many bugs and issues, I end up with "mode collapse".

  • @FrostCraftedMC
    @FrostCraftedMC 1 year ago +1

    Okay, seeing the thumbnail change to what it is now made me watch this. I'm making this comment before I watch because I want to say that it made me wonder if you gave the AI access to the YouTube statistics to try and learn to make better thumbnails for this video. If you did, that's cool, and that's why I wanted to watch this. But if you didn't, I'd love your input on whether that's a good idea or even a possible one.

    • @andreivlasenko527
      @andreivlasenko527 1 year ago

      Yes, it sounds like you could extend a diffusion model with something like this. A diffusion model has some internal measure of how good an image is (it is trained to measure noise, or unrealisticness), so you could try to put such statistics in that space, so that worse thumbnails are worse in the same sense that noisy images are worse, but that could be tricky. You could also use conditioning, as in text- or class-conditioned diffusion, to get a "slider" for generating better or worse thumbnails. One of the real problems is how to get such statistics: you cannot just use the number of views or something like that, because it depends on so many things, like channel popularity, trends, the YouTube recommendation system and so on. Something like the rate of clicks per view of the thumbnail would probably be good, but that's internal YouTube info that we're not going to get.

  • @Bjarkediedrage
    @Bjarkediedrage 1 year ago +9

    Great video!! I sort of agree with the other comment. I'd say this video is deserving of a little better title that reflects its educational value and content. I only clicked because I was subscribed and I wanted to see if I had to unsubscribe. I'm picky!^^ and I think I've come to associate some clickbait with poor quality video content, and this is definitely not that!

  • @Kram1032
    @Kram1032 1 year ago

    larger datasets may not even be necessary. You can accomplish an increase in diversity by deduplicating the data you already got, potentially actually increasing performance with a smaller dataset!
    Deduplication may be tricky. But one method might be to train up a purposefully relatively small network to simply distinguish images. If it thinks two images are the same, chances are, the images are really similar.
    And to further improve this, you can train up *multiple* such networks: if more than half of them think two images are the same, the images are too similar and should be picked at random as a group, i.e. you group up "similar" images, then randomly select groups, and finally randomly select an image from each group.
    Alternatively, more easily, you can just discard all but one of the images of each group to shrink your dataset down to only sufficiently unique thumbnails.
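
    The grouping idea in this comment could look roughly like the sketch below. It is only an illustration under stated assumptions: the small trained networks are replaced by hypothetical threshold "judges" on toy 2-D feature tuples, and every name here (`group_by_majority_vote`, `sample_balanced`, `keep_one_per_group`, `make_threshold_judge`) is made up for the example, not taken from the video or any library.

```python
import random
from itertools import combinations


def group_by_majority_vote(items, judges):
    """Merge items into groups when more than half of the judges call them 'the same'."""
    parent = list(range(len(items)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in combinations(range(len(items)), 2):
        votes = sum(1 for judge in judges if judge(items[a], items[b]))
        if votes * 2 > len(judges):  # strict majority
            parent[find(a)] = find(b)

    groups = {}
    for i in range(len(items)):
        groups.setdefault(find(i), []).append(items[i])
    return list(groups.values())


def sample_balanced(groups, rng):
    """Pick a random group first, then a random item inside it."""
    return rng.choice(rng.choice(groups))


def keep_one_per_group(groups):
    """The simpler alternative: discard all but one item of each group."""
    return [group[0] for group in groups]


def make_threshold_judge(threshold):
    """Hypothetical stand-in for a small trained network (assumption, not a real model)."""
    def judge(x, y):
        return sum(abs(p - q) for p, q in zip(x, y)) <= threshold
    return judge


# Toy "thumbnails": 2-D feature tuples instead of real images.
thumbnails = [(0.0, 0.0), (0.05, 0.0), (1.0, 1.0), (1.0, 0.95), (5.0, 5.0)]
judges = [make_threshold_judge(t) for t in (0.04, 0.1, 0.2)]

groups = group_by_majority_vote(thumbnails, judges)
deduplicated = keep_one_per_group(groups)
one_sample = sample_balanced(groups, random.Random(0))
```

    Sampling a group first and then an image inside it keeps near-duplicates from dominating training batches, while `keep_one_per_group` shrinks the dataset instead. Note that comparing every pair is quadratic in the number of images, so a real dataset would likely need a cheaper pre-filter.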

  • @pooriaarab
    @pooriaarab 1 year ago +1

    Can you share a link to the dataset generated or a tutorial on how to do it?
    Also, is it feasible to create YouTube thumbnails with the current state of the art of AI?

  • @realastropulse
    @realastropulse 1 year ago +2

    Just a little after this was released, the most impressive diffusion-based image generator yet was open-sourced. Stable Diffusion is the most promising AI image creator yet, at least until Parti's techniques are perfected and data researchers go back to the drawing board.

    • @gw6667
      @gw6667 6 months ago

      Who's Parti?

  • @OMGitshimitis
    @OMGitshimitis 1 year ago +1

    Is there a hybrid approach? Like using diffusion to generate images and then a GAN to tune that model? I'm not a computer scientist and I may be either saying something super stupid or super obvious but I'm genuinely curious.

  • @literailly
    @literailly 1 year ago +1

    Can you similarly walk the latent space with a diffusion model by modifying the input noise?

  • @CathrinMachinArt
    @CathrinMachinArt 1 year ago

    amazing video

  • @frost7423
    @frost7423 1 year ago +1

    this is the first time i click on a click baity thumbnail and get good content

  • @kalilinuxhikida9216
    @kalilinuxhikida9216 1 year ago +1

    So with thumbnails you could include the text of the video, so it would have to make a good thumbnail and text prompt

  • @RafaelSCalsaverini
    @RafaelSCalsaverini 1 year ago +1

    What happens if you use a diffusion network as generator for a GAN?

  • @JadeFoxy
    @JadeFoxy 1 year ago

    As long as diffusion models cannot generate samples in one forward pass, I think GANs have a reason to exist in use cases where synthesis speed is an important factor.

  • @fungi42021
    @fungi42021 1 year ago

    very cool

  • @Graverman
    @Graverman 1 year ago

    great video

  • @jnotjequel
    @jnotjequel 1 year ago +2

    can't wait to play the new MIAECROOFT: MRGROOTBU update 13:48

  • @pelodofonseca6106
    @pelodofonseca6106 1 year ago

    If it takes 24 hours to train a batch of images, how do Wombo and DALL-E generate images in less than a minute?

  • @CmdrTigerKing
    @CmdrTigerKing 1 year ago

    so generating the perfect image is like a slot machine

  • @DerfaelB
    @DerfaelB 1 year ago +2

    600/10

  • @andreivlasenko527
    @andreivlasenko527 1 year ago +1

    Seems like my comment got deleted because of the arXiv link. I was saying you could try StyleGAN XL, as it showed quite good performance with diverse datasets like ImageNet and trains relatively fast (despite its big size). My second piece of advice is to use fine-tuning instead of training from scratch; it's much faster and more stable for GANs.

    • @EmergentGarden
      @EmergentGarden 1 year ago

      Oh yes, the best way to do it would be to fine-tune a big pretrained model like StyleGAN. But I'd rather do that with a diffusion model first, and maybe StyleGAN for comparison.

  • @enesmahmutkulak
    @enesmahmutkulak 1 year ago

    Hi, can you share your datasets? Or is it from Kaggle?

  • @yorkwestenhaver8680
    @yorkwestenhaver8680 1 year ago +2

    Fucking Great video man!!

  • @iDrewa
    @iDrewa 1 year ago

    Imagen is pronounced I-mi-gen. Great vid!

  • @dunar1005
    @dunar1005 9 months ago

    5:02 is that by brute force? So the discriminator will see 1,000,000 pure noise pictures, and when, by chance, the generator generates two black pixels beside each other, it will decide that it prefers that picture?

  • @Landee
    @Landee 1 year ago

    gg !

  • @CharlesVanNoland
    @CharlesVanNoland 1 year ago +1

    The more I've learned about neural networks, particularly while watching Machine Learning Street Talk, the more I doubt when people say "these people do not exist" about the deepfaked face images. I believed it blindly before, but now I am concerned that it's basically just interpolating between faces, which is pretty great regardless, but I think someone needs to take the input images the network was trained on and compare them to the best most concise outputs it generates, and see which faces it most resembles - if not matches almost perfectly. Sure, with a GAN it's encoding down to a much lower dimensional latent variable, and then decoding back up to image resolution, but that still could just mean that it's just showing us faces it's learned, and interpolating between them within the latent variable space. At any rate, I'd just like to see comparisons between the "random" outputs and the actual images that the network is trained on.

    • @willguggn2
      @willguggn2 1 year ago

      That's the point of latent spaces of human faces … ? Given the right parameters it should be able to generate every possible picture of a human face within and outside the training data.

  • @michaelmam1490
    @michaelmam1490 1 year ago

    12:23 Does anyone know if the Chinese text is intelligible?

  • @programorprogrammed
    @programorprogrammed 1 year ago

    Looking at the patreon, we must all be broke

  • @CmdrTigerKing
    @CmdrTigerKing 1 year ago

    This video's script was entirely AI generated, then read by an AI, and pictures created by an AI

  • @goranjosic
    @goranjosic 1 year ago +1

    I recently played with Stable Diffusion beta 1.5. They have a trial with some points for everyone to try their model, and my impression is that their diffusion model is really great only for generating artistic images and paintings. In many other situations it looks either too artificial or overfitted (it copies too much, that is my impression), and all faces look awful, especially when compared to Midjourney and their model and images.
    I guess more work is needed on this type of neural network...?! _I'm not an expert, just a hobby programmer_

  • @errantwashere
    @errantwashere 1 year ago +1

    Errant was here

  • @RafaelSCalsaverini
    @RafaelSCalsaverini 1 year ago

    Have you generated Adam Neely?

  • @petergibbons607
    @petergibbons607 1 year ago

    i have a friend that is an "artist" (not really that good but thinks they are), and she is pissed about this AI thing, it's pretty funny, sucks to be an artist now :D (or soon when this thing gets really REALLY good)

  • @alan2here
    @alan2here 1 year ago

    annealing
    diffusion
    waveform collapse

  • @raptordarwish887
    @raptordarwish887 1 year ago +1

    Make a thumbnail of a bot sitting in front of the computer making/editing a thumbnail

  • @Agony.
    @Agony. 1 year ago +2

    As if YouTubers' jobs weren't easy and lazy enough.

  • @artem945
    @artem945 1 year ago +1

    So the discriminator is basically solving the Turing test...

  • @HighlyRegardted
    @HighlyRegardted 1 year ago

    Unfortunately…Oil is still the oil of the 21st century …

  • @ellinikoptero7355
    @ellinikoptero7355 1 year ago

    Man, love the vid, but for real change the thumbnail. Your current just falls too much into the uncanny valley, as you too know and I didn’t click on the vid for many hours, even considering unsubbing because I thought it was junk cluttering up my “subscribed” feed.

    • @caesaroftampa1266
      @caesaroftampa1266 1 year ago

      Worst possible reaction ever. I did not have the same reaction, I saw it and was intrigued. Reminded me of some of VSauce's thumbnails!

    • @EmergentGarden
      @EmergentGarden 1 year ago +1

      Not the reaction I was going for lol! I'll be messing around with the thumbnail/title, I figured uncanny ones would catch the eye but they can also freak people out.