Advanced Style transfer with the Mad Scientist node

  • Published 28 Jul 2024
  • We are talking about advanced style transfer, the Mad Scientist node and Img2Img with CosXL-edit. Upgrade the IPAdapter extension to be able to use all the new features. Workflows are available in the example directory.
    Discord server: / discord
    Github sponsorship: github.com/sponsors/cubiq
    Support with paypal: www.paypal.me/matt3o
    Twitter: / cubiq
    00:00 Intro
    00:23 Style Transfer Precise
    02:03 Mad Scientist Node
    05:35 Advanced Blocks Tweaking
    07:27 CosXL Edit
  • Science & Technology

COMMENTS • 194

  • @latentvision
    @latentvision  1 month ago +18

    Since I made this video I added a "precise style transfer" node to the IPAdapter. You can use that instead of fiddling with the Mad Scientist. It also works with SD1.5 (to some extent).
    Also, since I've been asked quite a few times now... sorry, we do not have exact data on what each block does. Blocks 3 and 6 are pretty strong, so they were easy to pin down, but other layers also have some impact on both the composition and the style. Some seem to affect text, others the background, others age. At the moment there doesn't seem to be a "definitive guide". I would have told you otherwise 😅

    • @flankechen
      @flankechen 23 days ago

      Thanks a lot! So in SD1.5, which block is for style and which is for composition?

    • @CaraDePatoGameplays
      @CaraDePatoGameplays 17 days ago

      This intrigued me. I'm going to run a lot of tests to see what the blocks besides 3 and 6 do.
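
A quick way to run that kind of block-by-block test is to sweep one block at a time. The sketch below only builds the comma-separated index:weight strings the Mad Scientist node accepts; the render_with_layer_weights() helper is hypothetical (wire it into your own ComfyUI workflow), and whether block indexing starts at 0 or 1 is an assumption to verify against the node.

```python
# Sketch: probe one cross-attention block at a time by generating an
# "index:weight" string per block. render_with_layer_weights() is a
# hypothetical helper that would queue your ComfyUI workflow with the
# string plugged into the Mad Scientist node's layer_weights input.

def probe_blocks(num_blocks: int = 12, weight: float = 1.0) -> None:
    for block in range(num_blocks):
        layer_weights = f"{block}:{weight}"  # e.g. "6:1.0" to isolate one block
        print(f"block {block} -> layer_weights = '{layer_weights}'")
        # render_with_layer_weights(layer_weights)  # hypothetical call

probe_blocks()
```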

  • @MarcSpctr
    @MarcSpctr 1 month ago +73

    This guy is literally the equivalent of what Piximperfect is to Photoshop.
    I doubt even the people who worked on SDXL had any idea that this much control could be gained over the models.
    Like, seriously, wtf????
    Amazing work.

    • @saschamrose6498
      @saschamrose6498 1 month ago +4

      I would say more like what Video Copilot is to After Effects.

    • @latentvision
      @latentvision  1 month ago +47

      nnaah I guess that the difference is just that I actually share what I find

    • @GG-hh1sl
      @GG-hh1sl 1 month ago +2

      @@latentvision lol

    • @DarioToledo
      @DarioToledo 1 month ago +1

      Unm3sh

    • @rhaedas9085
      @rhaedas9085 1 month ago +5

      @@latentvision Share, and explain. You're like that one teacher that didn't just show you the math formula, but showed why it was important and how to use it practically.

  • @831digital
    @831digital 1 month ago +67

    Best ComfyUI channel on YouTube.

  • @leolis78
    @leolis78 1 month ago +36

    Matteo, your work is amazing! You are our Dr. Brown. Our mad scientist who will give 1.21 Gigawatts to the AI to take us to the future. We love you!!! 😄😄😄

    • @latentvision
      @latentvision  1 month ago +10

      just doing my part!

    • @ooiirraa
      @ooiirraa 1 month ago +3

      ​@@latentvision and we are doing our part loving you and being grateful 🎉

    • @caseyj789456
      @caseyj789456 1 month ago +3

      Yeah, you are our mad scientist 😂 ❤ Thank you Mateo!

  • @TriNguyenKV
    @TriNguyenKV 26 days ago +3

    when it comes to teaching and concise explaining, you are the GOAT!!!! Thank you so much, please keep doing this. Thank you!

  • @DarkGrayFantasy
    @DarkGrayFantasy 1 month ago +6

    As always, amazing work Matt3o!
    For those interested in the cross-attention indexes, this is what they target (a small helper based on this list follows the thread below):
    1) General Structure
    2) Color Scheme
    3) Composition
    4) Lighting and Shadow
    5) Texture and Detail
    6) Style
    7) Depth and Perspective
    8) Background and Environment
    9) Object Features
    10) Motion and Dynamics
    11) Emotions and Expressions
    12) Contextual Consistency

    • @stefansotra2934
      @stefansotra2934 1 month ago +1

      Where did you get this info?

    • @DarkGrayFantasy
      @DarkGrayFantasy 1 month ago

      @@stefansotra2934 Research really, nothing more...

    • @ceegeevibes1335
      @ceegeevibes1335 29 days ago +1

      wow cool... thanks!

    • @walidflux
      @walidflux 12 days ago

      Is 12 the 0.0 index? If there is a clearer description of all of these, please link it.
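
Taking the (unverified) list above as a working hypothesis, a small helper can turn named aspects into the index:weight string the node expects. The labels are copied from the comment above, the maintainer says there is no definitive guide, and the 1-based indexing is an assumption.

```python
# Unverified block labels reported in the comment above; treat as hypotheses.
BLOCK_LABELS = {
    1: "general structure",         2: "color scheme",
    3: "composition",               4: "lighting and shadow",
    5: "texture and detail",        6: "style",
    7: "depth and perspective",     8: "background and environment",
    9: "object features",           10: "motion and dynamics",
    11: "emotions and expressions", 12: "contextual consistency",
}
INDEX_BY_LABEL = {label: idx for idx, label in BLOCK_LABELS.items()}

def layer_weights(**weights: float) -> str:
    """Build an 'index:weight' string, e.g. layer_weights(composition=0.5, style=1.0)."""
    return ", ".join(
        f"{INDEX_BY_LABEL[name.replace('_', ' ')]}:{value}"
        for name, value in weights.items()
    )

print(layer_weights(composition=0.5, style=1.0))  # -> "3:0.5, 6:1.0"
```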

  • @lonelyeyedlad769
    @lonelyeyedlad769 1 month ago +6

    Great work as usual, M! I am happy to see that the group experimentation with the UNET layers has led to the development of a node that will give us more control over our generations. Thank you for your continued efforts in this field!

  • @user-ck5sh2um3b
    @user-ck5sh2um3b 1 month ago +10

    You are a mad scientist haha thank you so much Matteo

    • @latentvision
      @latentvision  1 month ago +3

      mad for sure, scientist not so much 😅

    • @user-ck5sh2um3b
      @user-ck5sh2um3b 1 month ago +1

      @@latentvision haha 😂 keep up the great work I love your content.

  • @puoiripetere
    @puoiripetere 20 hours ago

    Matteo, you are truly incredible with your work... 🎉

  • @moviecartoonworld4459
    @moviecartoonworld4459 1 month ago

    "While keeping up with the influx of new features is important, I'm reminded again of the value of in-depth understanding of a single function. Thank you as always."

  • @HiProfileAI
    @HiProfileAI 1 month ago

    I love the idea of targeting the conditioning of various layers and being able to direct each layer with this kind of control over the cross-attention. Thank you Matteo for your continued work and expertise. You give us a lot to play with and work with. The implications of the kind of control we can have over image creation and manipulation will last for years. Continued blessings and appreciation to you, good sir. 🙏🏾👍🏾

  • @dck7048
    @dck7048 1 month ago +1

    Image gen is a tech that seemed like science fiction a couple of years ago, but to have refined it to the point where people in their homes can casually do generations like 7:19 is nothing short of outstanding. Thanks as always.

  • @autonomousreviews2521
    @autonomousreviews2521 1 month ago +1

    Love what you're doing for the community - thank you for your time and for sharing :D

  • @mariopt
    @mariopt 1 month ago +2

    Thanks a lot for this new node, really appreciate it.

  • @GG-hh1sl
    @GG-hh1sl 1 month ago +2

    Just found the node today and was wondering about its use - thanks for sharing the knowledge!

  • @fukong
    @fukong 1 month ago +3

    God of IPAdapter

  • @jasonchen1139
    @jasonchen1139 1 month ago +1

    Incredible content! Your work is undoubtedly the best!

  • @urbanthem
    @urbanthem 1 month ago

    Thanks a thousand, Matteo. Your last statement is something I say time and time again: we use only a fraction of the potential in what's already out there. Brilliantly proving that point.

  • @ysy69
    @ysy69 1 month ago

    Thank you. Exactly, we become conditioned to chase the new shiny toy rather than fully learning and enjoying the old ones. So much can be done with this, looking forward to...

  • @SerginMattos
    @SerginMattos 1 month ago +1

    Your work is amazing!

  • @Showdonttell-hq1dk
    @Showdonttell-hq1dk 1 month ago

    This is so incredibly cool! Thank you very much. I can't even imagine how nerve-wracking and exciting the coding was for this. :)

  • @euroronaldauyeung8625
    @euroronaldauyeung8625 1 month ago +2

    Genius hacking of the cross-attention and a perfect explanation of the indexing.

  • @rsunghun
    @rsunghun 1 month ago +1

    Absolutely amazing 😮

  • @legendaryanime69
    @legendaryanime69 1 month ago +2

    Always waiting for your great videos, they help me a lot! Thanks

  • @abdellahla6159
    @abdellahla6159 1 month ago +1

    Great node, thanks a lot 😁

  • @karlwang4837
    @karlwang4837 1 month ago

    It was amazing, thank you for the work you have done for the community, I really appreciate it

  • @Sedtiny
    @Sedtiny 1 month ago +3

    Thank you again, my lord

  • @johnriperti3127
    @johnriperti3127 1 month ago +1

    Thanks Matteo, this is so good!

  • @Firespark81
    @Firespark81 1 month ago

    This is awesome! ty!

  • @ceegeevibes1335
    @ceegeevibes1335 1 month ago +1

    love love love this, going MAD!!!!

  • @walidflux
    @walidflux 1 month ago +1

    Again, blowing minds !!!!

  • @openroomxyz
    @openroomxyz 1 month ago +2

    Thanks, that's cool. Amazing findings that will help the community.

  • @dreammaking516
    @dreammaking516 1 month ago +4

    Insanely cool. Also, I just realized you are Italian as well 😂🔥

  • @Nairb932
    @Nairb932 1 month ago +1

    Keep up the good work man

  • @huwhitememes
    @huwhitememes 1 month ago +2

    Awesome, Bro

  • @igorkotov8937
    @igorkotov8937 1 month ago +1

    Thank you!

  • @nerdbg1782
    @nerdbg1782 1 month ago

    This builds on your previous experimental node where you asked for some help from the community. Glad to see they helped you decipher the layers

    • @latentvision
      @latentvision  1 month ago +1

      Not to take anything away from the wonderful community, but you've been distracted 😄 Style and Composition was released months ago, way before the prompt injection.

    • @nerdbg1782
      @nerdbg1782 1 month ago

      ​@@latentvision I was speaking about block weights, this one: ua-cam.com/video/OrST6Nq1NUg/v-deo.htmlsi=VyhskRDQS5m8JFMX
      Anyhow, it's nice to see the two combined, regardless of whether it is a new feature or not. Good stuff, in either case 🙂

  • @majic_snap
    @majic_snap 1 month ago

    My understanding is that Precise generally weakens the weights of more layers, but style has always been a mystery in neural networks, although you have already done so well. I hope you can bring us more surprises; thank you for your contributions! The name 'Mad Scientist' is simply fantastic.

  • @ryanontheinside
    @ryanontheinside 1 month ago +1

    this is awesome thank you

  • @YING180
    @YING180 1 month ago

    so cool and you are our mad scientist

  • @jibcot8541
    @jibcot8541 1 month ago

    Very cool, I need to play with IPAdapter more often, but I am often too busy just improving prompts and upscale workflows!

  • @jensenkung
    @jensenkung 2 days ago +1

    7:20 my jaw literally dropped

  • @mycelianotyours1980
    @mycelianotyours1980 1 month ago

    Thank you so much!

  • @nelsonporto
    @nelsonporto 1 month ago +1

    GENIUS

  • @GoblinWar
    @GoblinWar 1 month ago

    Cos-XL is so tight, I'm a huge fan

  • @madmushroom8639
    @madmushroom8639 1 month ago +1

    Very cool! Would love to see some coding sessions. Maybe you could explain your code a bit. More info about the vector sizes, layers etc :)

    • @latentvision
      @latentvision  1 month ago +2

      I was thinking about that... not sure how much interest there would be on that though

    • @madmushroom8639
      @madmushroom8639 1 month ago

      @@latentvision Yeah, maybe, but your "ComfyUI: Advanced Understanding (Part 1)" video actually performed really well I think, and there you went into more detail. That, plus some code examples of what is going on behind the scenes with your knowledge, would be awesome!
      Maybe a small poll could show if it's worth your time :)

  • @yvann.mp4
    @yvann.mp4 1 month ago

    Amazing, thanks a lot

  • @Mika43344
    @Mika43344 1 month ago +1

    W O W!!! AMAZING!

  • @swannschilling474
    @swannschilling474 1 month ago +2

    I'll take the blue pill!! 😁 Thanks so much for this one!! 💊

  • @jccluaviz
    @jccluaviz 1 month ago

    Thank you, Dr. Matteo.
    I think I need one of your pills to make my days shine.
    Again, extraordinary work.

  • @Billybuckets
    @Billybuckets 1 month ago +1

    Until I use this a *lot*, I will have no idea what the different UNet blocks do. Maybe you could put a Note node in the pack that contains an estimation of the relative contribution of each block to style, composition, and anything else that might be useful.
    A++ work as always. Best SD channel around.

    • @latentvision
      @latentvision  1 month ago

      unfortunately we don't know exactly what the blocks do

  • @bgmspot7242
    @bgmspot7242 1 month ago +2

    Nice❤❤

  • @BubbleVolcano
    @BubbleVolcano 1 month ago

    Nice work! ❤It's awesome to see real progress on the U-net layer. But having too many parameters can make it tough to get started, even for someone like me who's been at it for over a year. It's just too challenging for ordinary people. If we change the filling parameter to four simple options like ABCD, it might be easier to promote. Ordinary people aren't into the process; they're all about the end result.

  • @Alice-Coro
    @Alice-Coro 1 month ago +8

    Amazing video. You do a great job at explaining complex ideas. I've learned so much from your videos.

  • @miguelitohacks
    @miguelitohacks 1 month ago

    HOLY SHIT, this is powerful!

  • @divye.ruhela
    @divye.ruhela 1 month ago +1

    Impeccable naming, we're all a little mad by now 🤣

  • @krio_gen
    @krio_gen 1 month ago +3

    Unbelievable.

    • @latentvision
      @latentvision  1 month ago +3

      believe it!

    • @krio_gen
      @krio_gen 1 month ago +2

      @@latentvision )))
      I dived in head first. I feel like a Mad Scientist)

  • @MrGingerSir
    @MrGingerSir 1 month ago +3

    This is awesome! Are you planning on making a version that works with embeds?

  • @johnsondigitalmedia
    @johnsondigitalmedia 1 month ago +2

    Awesome work! Do you have the info on the other 10 control index points?

  • @gsMuzak
    @gsMuzak 1 month ago +5

    You're the man, thanks for all these tutorials!

  • @noxin7
    @noxin7 1 month ago +1

    Mateo, this is amazing work with the Mad Scientist node. My only question (not criticism) is whether you plan to convert the index:weight string into widgets for ease of use, or is there something that prevents that?

  • @glassmarble996
    @glassmarble996 1 month ago

    you have so many secrets matteo :D

  • @kenwinne
    @kenwinne 1 month ago +1

    Matteo, thank you for bringing us IPAdapter, which gives us solid ground to combat the uncertainty generated by large models. I personally like your explanations of the basic theory. Although your course is less than 10 minutes, I have studied it repeatedly for several hours. If you have time, please explain in detail the specific functions and applications of the 12 cross-attention layers. Thank you very much for your efforts, thank you!

  • @lucagenovese7207
    @lucagenovese7207 1 month ago +1

    07:20 that stuff is fucking insane.

  • @GG-hh1sl
    @GG-hh1sl 1 month ago +3

    How about a widget setting in the IPAdapter node to set the strength of each layer, with a short label of its function?

    • @latentvision
      @latentvision  1 month ago +3

      we don't know exactly what the function of each layer is, unfortunately

  • @StudioOCOMATimelapse
    @StudioOCOMATimelapse 1 month ago

    Very good as always, Matteo.
    Can you explain all the indexes, please?
    I've only identified three of them:
    3: Reference image
    5: Composition
    6: Style

  • @SouthbayCreations
    @SouthbayCreations 1 month ago +1

    Great video, thank you! Where can we find this node?

  • @michail_777
    @michail_777 1 month ago +2

    And one more question. Where can I find an explanation of the index/Cross Attention?

  • @sephia4583
    @sephia4583 1 month ago +2

    Is there any similar way to apply a LoRA style to only specific layers? Maybe we could apply a negative weight to the composition layer (e.g. layer 3) and a positive weight to the style layer (e.g. layer 6)?

  • @ElevatedKitten-sr6yi
    @ElevatedKitten-sr6yi 1 month ago +1

    🤯

  • @kinai_4414
    @kinai_4414 1 month ago

    Damn, that's impressive.
    Could the same logic be applied to a LoRA node in the future?

  • @nomand
    @nomand 1 month ago

    Incredible. Apart from style and composition, has the community reached a consensus on what specific qualities of the image the other indexes affect?

  • @alxleiva
    @alxleiva 1 month ago +2

    You named that node after yourself, right? You're truly a mad scientist bringing us the best discoveries! Thank you Mateo

  • @MikeTon
    @MikeTon 5 days ago

    Amazing and insightful work! A question regarding sponsorship: do you have a preference between GitHub and Patreon? I'm getting so much value here that I want to support you meaningfully, and I will default to GitHub support if there's no preference.

    • @latentvision
      @latentvision  5 days ago

      hey thanks! I don't use patreon because I don't have time to push updates. Either github or paypal at the moment!

  • @kallamamran
    @kallamamran 1 month ago

    Wow... You should expose the layers as weight handles and name them for what they do :D

  • @4rrxw794
    @4rrxw794 1 month ago

    🤩🤩

  • @aidiffuser
    @aidiffuser 1 month ago

    Hello man, thanks for sharing this amazing improvement in control! Did something change in the style transfer and composition between the release from 2 days ago and this one? I can't seem to reproduce the same results :(
    Or is there a way to reproduce the exact same layer weights of that previous release within the Mad Scientist node?

    • @latentvision
      @latentvision  1 month ago

      no, style and composition should be the same. if you have issues please post an issue on the official repository, possibly with before/after images

  • @ParrotfishSand
    @ParrotfishSand 1 month ago

    🙏

  • @neofuturist
    @neofuturist 1 month ago +2

    UPDATE ALL THE NODES!!!!
    thanks Matteo

  • @gsMuzak
    @gsMuzak 1 month ago +2

    A newbie question (maybe): index 3 is composition and 6 is style, but what are the others? I don't remember if you have already talked about them in your other IPAdapter videos.

    • @rhaedas9085
      @rhaedas9085 1 month ago +2

      Look at his video from a few weeks ago about prompting the individual UNet blocks; that's what's going on here. There's still a lot to figure out, and some blocks may still be dependent on others, so it's not as clear-cut as these.

    • @gsMuzak
      @gsMuzak 1 month ago

      @@rhaedas9085 thanks

  • @manojkchauhan
    @manojkchauhan 1 month ago

    Hey Matteo, Just finished your ComfyUI tutorial - seriously impressive stuff! 👍❤
    Your breakdown of advanced features with practical examples is super motivating. I'm excited to put these into action and unlock the full potential of ComfyUI. Thanks for sharing your knowledge!

  • @DanielVagg
    @DanielVagg 1 month ago

    Great video. Top notch content, as always

  • @AIFuzz59
    @AIFuzz59 1 month ago

    Do you have a list of what the other index layers are? We are experimenting with this now

    • @latentvision
      @latentvision  1 month ago

      no, it's difficult to understand. some are subject-specific, for example (e.g. they work with people but not with landscapes)

  • @AI.Absurdity
    @AI.Absurdity 1 month ago +1

  • @denisquarte7177
    @denisquarte7177 1 month ago

    "We fail to understand what we already have" - cries in GLIGEN conditioning

  • @ooiirraa
    @ooiirraa 1 month ago

    Woooow, wow 🎉 you are amazing. This is just soooo cool.
    Why doesn't the negative prompt go with a minus? It would be 3:-2.5, 6:1, and that way the syntax could be consistent everywhere. And people would be able to pass positive and negative weights as much as they want (a parsing sketch follows this thread).

    • @latentvision
      @latentvision  1 month ago +1

      I need to think about it, technically you can send a negative value to the positive embeds so it's not that simple

    • @ooiirraa
      @ooiirraa 1 month ago

      @@latentvision then it could be a letter like 3:n2.5, 6:1 or 3:2.5n, 6:1 or 3:neg2.5, 6:1 (to make it 100% transparent)
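
For what it's worth, the plain-minus syntax proposed in this thread parses cleanly. A minimal sketch, assuming the comma-separated index:weight format and simply splitting negative weights into their own bucket (not the node's actual behaviour):

```python
# Minimal parser sketch for an "index:weight" string such as "3:-2.5, 6:1".
# Positive and negative weights are separated so they could, in principle,
# be routed to positive/negative embeds; the real node may not accept this.

def parse_layer_weights(spec: str) -> tuple[dict[int, float], dict[int, float]]:
    positive: dict[int, float] = {}
    negative: dict[int, float] = {}
    for pair in spec.split(","):
        pair = pair.strip()
        if not pair:
            continue
        index, value = pair.split(":")
        weight = float(value)
        (positive if weight >= 0 else negative)[int(index)] = weight
    return positive, negative

print(parse_layer_weights("3:-2.5, 6:1"))  # -> ({6: 1.0}, {3: -2.5})
```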

  • @afrosymphony8207
    @afrosymphony8207 1 month ago

    please is the prompt injection node out yet???

  • @quotesspace1713
    @quotesspace1713 1 month ago +2

    Thanks, that's really cool 🙏🙏.
    But is this just me? I found almost everything too advanced and couldn't understand what was going on, but I would really love to understand it in depth so that I can add my own ideas and share them. I do have some knowledge of ComfyUI, but this is...

  • @tofu1687
    @tofu1687 1 month ago +1

    ... It feels like SD3 is going to have a very hard time

  • @pedrogorilla483
    @pedrogorilla483 26 days ago

    Did anyone ever figure out what each block of the UNet does? When I was obsessively trying to understand how Stable Diffusion works, I went deep into it but could never get a straight answer. Also, what processes are involved in each block? If I remember correctly, each block has layers within it, with ResNets and other things above my pay grade. If anyone can point me to a resource I'd appreciate it 🙏

  • @GiovaniFerreiraS
    @GiovaniFerreiraS 27 days ago

    Is this an evolution of the block-by-block prompting thing? I remember you saying in that video that nothing stopped you from using images.

    • @latentvision
      @latentvision  27 days ago

      the technology is the same but technically we did this before the prompt injection. Visual embeddings are easier to evaluate

  • @gimperita3035
    @gimperita3035 1 month ago

    Thank you! I discovered CosXL recently - and it struggles on my 4080 - and this was released with perfect timing.

  • @nkofr
    @nkofr 1 month ago

    Hi, thanks, wonderful! I just don't understand the point of this custom node having a "weight_type" field if we modify the layers' weights in the bottom input field. Is "weight_type" overridden by the values in the input field?

    • @latentvision
      @latentvision  1 month ago

      "style transfer precise" uses a different strategy to apply the embeds. You need to use it only if you want to do the style transfer thing. If you want to experiment with blocks you can select whatever and it will be overwritten (except again "precise")

    • @nkofr
      @nkofr 1 month ago

      @@latentvision Thank you Matteo, that's awesome! Grazie

  • @flankechen
    @flankechen 24 days ago

    Amazing work. Has anyone tested Mad Scientist with SD1.5? How does injecting attention into a specific block work there?

    • @latentvision
      @latentvision  24 days ago

      I made a new "precise style transfer" node that should work with SD1.5 and makes the whole process simpler

  • @Freezasama
    @Freezasama 1 month ago

    How do I use Mad Scientist? I can't find it :/

  • @zhenyu2714
    @zhenyu2714 1 month ago

    Hi Matteo! I wonder: if I choose different weight types and set all layers to 0 except the sixth layer at 1, I find the result is exactly the same as the default style transfer. Does that mean style transfer is the sixth layer at 1 and the other 11 layers at 0, and style transfer precise is the third and sixth layers at 1 and the other 10 at 0??

    • @latentvision
      @latentvision  1 month ago

      precise is negative composition (layer 3) and positive style (layer 6)
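
Going by that reply, a Mad Scientist approximation of the precise mode would put a negative weight on block 3 (composition) and a positive weight on block 6 (style); the magnitudes below are placeholders, not the node's real internals.

```python
# Rough "style transfer precise"-like setting as a layer_weights string:
# negative composition (block 3), positive style (block 6). The values are
# guesses; the built-in node may scale or apply the embeds differently.
layer_weights = "3:-1.0, 6:1.0"
```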

  • @context_eidolon_music
    @context_eidolon_music 1 month ago +1

    Your 666th like is from me. I don't know what I'd do without your brilliant work. Thank you.

  • @baseerfarooqui5897
    @baseerfarooqui5897 1 month ago

    Hi, thanks for this great tutorial. I'm getting an error while executing: "IPAdapter object has no attribute 'apply_ipadapter'". I tried using SD1.5 checkpoints as well as SDXL, but I'm getting the same error.

    • @latentvision
      @latentvision  1 month ago

      maybe it's an older version, or an old workflow, or simply browser cache

  • @calvinherbst304
    @calvinherbst304 1 month ago

    dying to know what the other index blocks are!

  • @michail_777
    @michail_777 1 month ago

    Hi Mateo, and thank you. I'm using the Mad Scientist node, and thanks to the clarification I've become more aware of how to use it. I also have one question about the "IPAdapter Encoder" node: it has an input for a mask. The point is that both the input image and the mask should be connected to this node. When using only the input image in the "IPAdapter Encoder" node, the output image adopts the style/whatever. But when I also connect an input mask (I tried just a colored map, an image, a half-painted image), the IPAdapter Encoder node has no effect on the generated image at all. Could you please explain how to use the mask in the "IPAdapter Encoder" node?

    • @latentvision
      @latentvision  1 month ago

      I'm sorry I'm not sure I completely understand, maybe join my discord or post a discussion with some screenshots in the IPAdapter repository

    • @michail_777
      @michail_777 1 month ago

      @@latentvision Yeah, I already wrote to L2 (quick help).