This new type of illusion is really hard to make

Steve Mould

Додати в
- Мій плейлист
- Переглянути пізніше
Поділитися

Поділитися

Вставка

Розмір відео:

Показувати елементи керування програвачем

Автоматичне відтворення

Автоповтор

Опубліковано 25 січ 2025

КОМЕНТАРІ • 2,1 тис.

@SteveMould 4 місяці тому ⁺³³⁸
That was a real Parker Twisty Square.
The sponsor is Jane Street. Find out about their internship at: jane-st.co/internship-stevemould NOTE THE URL ON SCREEN IS INCORRECT! This is the correct URL. I'd call it a Parker URL but Matt got it right.
@blobbo. 4 місяці тому ⁺²
ok
@thetechfury 4 місяці тому ⁺¹
I like your video Steve Mould. Keep up the good work. ^w^
@yuyurolfer 4 місяці тому ⁺¹
Nice how you and Matt uploaded your linked videos at the same time
@h1234e1234 4 місяці тому
Server not found. Maybe AI was not so much of a hype. XD Joking :P
@thetechfury 4 місяці тому
@@yuyurolfer Indeed.
@CriticalMonkey623 4 місяці тому ⁺⁶⁷⁰⁶
Okay, hear me out. THIS is AI art. Not people using AI to just generate whatever they put in a prompt. But actual human creativity and ingenuity using AI as a tool to create something which previously would have been extremely difficult, if not impossible. There are a lot of ethical and aesthetic problems with generative AI in its current state, but this is the first time I've seen something made with AI and thought "that's beautiful".
@candycryptid2832 4 місяці тому ⁺¹⁵⁶
I agree!
@bl4cksp1d3r 4 місяці тому ⁺⁵⁴⁴
it is interesting, yeah, I will argue that in this specific case AI is DEFINITELY used as a tool to find a solution. My problem from day one always was with people who say they are AI artists. But that's clearly not what this video is about
@KindOfWitch 4 місяці тому ⁺⁹
yass queenses this is totes the stuff
@taffy4801 4 місяці тому ⁺¹¹³
A novel solution to a novel problem. Well put.
@candycryptid2832 4 місяці тому ⁺⁶
@@bl4cksp1d3r I AGREE
@PixelSodaCreates 4 місяці тому ⁺¹⁰⁰⁷
The rabbit/duck illusion got a serious glow-up
@oliviervancantfort5327 4 місяці тому ⁺⁵⁴
Too bad the cover image of the video was edited to make the transformation more dramatic. The left rabbit ear on the second cube was basically erased on the duck image...
@MollyHJohns 3 місяці тому ⁺³
@@oliviervancantfort5327 oof now that you pointed it out 😢
@alexholker1309 4 місяці тому ⁺¹³⁹⁶
15:14 Bias and hallucination in the context of generative AI aren't simply human fallibilities, they're the mechanism by which it functions: you're handing an algorithm a block of random noise and hoping it has such strong biases that it can tell you exactly what the base image looked like even though there never was a base image.
@truejim 4 місяці тому ⁺¹⁴⁴
Well said. Also: bias and hallucination are so commonplace in our own neural networks (our brains) that we even given them categories and names, such as “over generalization”, “confirmation bias”, “sunk cost fallacy”, or the catch-all “brain fart”. All neural networks (including our own) apply learned patterns in contexts where the learned pattern shouldn’t be applied. That’s why (to your point) the neural network driving diffusion can denoise noise that was never there in the first place.
@istinaanitsi3342 4 місяці тому ⁺²
это переход к изображению которое было в другой параллельной реальности и ваш мозг может существовать сразу в нескольких таких если его тренировать к непредвзятости, а то что это можно воспроизвести на компьютере впечатляет меньше чем древнекитайский язык в котором эта опция обязательна к применению, вы просто зациклены на вашем языке и это делает вас способными к удивлению
@truejim 4 місяці тому ⁺⁵
@@istinaanitsi3342 I think that’s the premise of Blake Crouch’s novel “Dark Matter”. 😀
@istinaanitsi3342 4 місяці тому ⁺¹
@@truejim не читал, но словосочетание темная материя просто говорит о неспособности науки понять мир, поэтому они заменяют знания темными словами
@WavyCats 4 місяці тому ⁺²²
@@truejim Very true. The ability for humans to recognise faces, even in places where there is no face, can be said to be one of our biases, yet a useful one at that, which makes me wonder whether hallucination and bias in reasoning is not merely a flaw, but something that may have inadvertently assisted in our survival throughout history.
@Neptutron 4 місяці тому ⁺⁸²⁰
Hey Steve and Matt, thank you guys for featuring our research - it was a lot of fun working with you! I'm Ryan Burgert, the author of Diffusion Illusions - I'll try to answer as many questions as I can in the comments!
@lmeeken 4 місяці тому ⁺²⁰
One thing I wasn't clear on. They describe taking the first images of two iterative prompt responses, flipping and layering them, and then using that single image as the first step in two different prompts (in this case, for penguin and giraffe). But how do you end up with a single image, rather than two different images that just used the same starting point?
@I.swareitsnotpersonal 4 місяці тому ⁺⁸
@Neptutron hey, I’m just wondering from an artist perspective, how this might be used to make artworks. I’ve made previous comment about it. I just wanted to say your work sounds amazing and looks amazing! Although 😅I’m a little worried about people wanting to steal and profit from the other artist’s artwork. 👍
@venanziadorromatagni1641 4 місяці тому ⁺¹⁵
I just wished they hadn’t used Midjourney pics.
That company is pretty exploitive, both towards copyright holders AND to their customers.
@koalachick8029 4 місяці тому ⁺⁴
For the diffusion array: could I put in a bunch of images and a "goal" image and have the machine output the correct arrays?
@Zutia 4 місяці тому ⁺¹⁹
Hi, important ethical question.
Can you say with 100% certainty that your copy of Stable Diffusion is entirely divorced from stolen artwork?
@gakulon 4 місяці тому ⁺⁶¹⁷
Loved the Matt Parker jumpscare in the image sequence
@doq 4 місяці тому ⁺²⁰
I literally pushed pause right on that frame and lost it 💀
@arnabbiswasalsodeep 4 місяці тому ⁺²⁴
Not a jumpscare but easter egg. Also its Maths Parker²
@AnasHart 4 місяці тому ⁺²⁹
2:52 at 0.25x speed
@larshofmann7516 4 місяці тому ⁺³⁵
Ah yes, the Parker Scare
@WillBinge 4 місяці тому ⁺¹
I thought I saw him
@mccoydtromb 4 місяці тому ⁺²¹³
I would love to hear this kind of illusion done with audio, such as reversing the audio file and hearing different text, or a piece of music!
@blackwing1362 4 місяці тому ⁺¹¹
Or something with the Yanny or Laurel thing but on purpose.
@danklemonsoda 4 місяці тому ⁺¹⁵
4 different sounds when overlayed making a completely different one would be cool
@KINIIKIO 3 місяці тому
⁠⁠⁠⁠@@blackwing1362 if u increase/decrease the pitch of that audio, you will be able to hear each word on purpose
@jasondashney 3 місяці тому ⁺³
I'm not sure that would work because these images can be based on something that vaguely sort of kind of resembles a penguin or giraffes, but I don't think our brains give us the same leeway for sounds. I don't think there's a pareidolia for sounds, is there?.
@danklemonsoda 3 місяці тому ⁺⁵
@@jasondashneywords, sentences, we can derive words from really distorted sounds
@davetech9403 4 місяці тому ⁺¹⁸⁵
Those blocks would sell really well in gift shops. Especially in Zoos.
@WitchOracle 3 місяці тому ⁺³
I would buy so many of them for real (I like to have a basket of fidgets, puzzles, and tactile art pieces on my coffee table and these would fit right in)
@vicnaum 3 місяці тому ⁺¹
@@WitchOracle I think there is a method you can 3D print it in-place (no assembly), and also transfer a color to the the first layer from a piece of ink-jet printed paper (it was on TeachingTech channel I think)
@protocol6 4 місяці тому ⁺²⁰⁴
A Mould-Parker crossover video about double image illusions in which you create several of them and you didn't do one that morphed from Parker to Mould?
@hundredfireify 4 місяці тому ⁺⁵
This is pushing it too far imho
@freemanmarco3373 4 місяці тому ⁺⁸
@@hundredfireify nah😭😭😭. we need that
@megaing1322 4 місяці тому ⁺³
I don't think the tech is currently up to that, since the models don't have a concept of steve or Matt.
@dside_ru 4 місяці тому ⁺¹
@@megaing1322one word: embeddings.
@megaing1322 4 місяці тому
@@dside_ru One word: Models
More random words you want to throw at me with no real relation to what I said?
@etunimenisukunimeni1302 4 місяці тому ⁺²²⁹
This wasn't a video about how diffusion models work and are trained... but you still managed to explain both better than the majority of videos on YT about the subject. Can you make a video explaining how you became so damn good at explaining things?
Oh, and this is the coolest application of image generators I've seen to date. Brilliant idea leveraging the intermediate diffusion steps to sneakily steer the result into multiple directions simultaneously!
@theblinkingbrownie4654 4 місяці тому ⁺²
Im not him but I'll guess it's due to how many years he has been explaining such a variety of topics.
@-danR 4 місяці тому ⁺²
This was the _least_ illuminating Steve Mould video I have ever seen. Most of them are exceptionally lucid, even in a single pass.
I lost my bearings past the "keep adding noise..." stage.
@etunimenisukunimeni1302 4 місяці тому ⁺⁸
@@-danR Can't blame you, it's a weird process that seems completely backwards the first time you learn about it. It sounds so stupid that first they make this giant model that can remove noise only to put most of it back in, but it's the only way to iterate enough to get a clear picture.
Anyway, it was only background information, and it doesn't really matter if it didn't become crystal clear how all of it works - the important thing is that the model is trained to be good at removing noise from a grainy picture. If you then start from a random mess and tell the model it's an extremely noisy picture of a cat, it will make into a picture of a cat by taking the supposed noise away. And because it happens in steps, you can alternate the subject between a cat and a dog in every other step, and it becomes both a cat and a dog in the end (obviously oversimplified)
@TheClintonio 4 місяці тому ⁺¹⁹⁷
This is a really educational video on AI which _should_ help most people understand and realise that these LLM and diffusion models are not General AI (ie; "truly intelligent") and just simple mathematical models. I studied AI and ML long before LLMs became a thing and have always been aware of this but convincing people of it is very hard in a short timeframe.
@Zutia 4 місяці тому ⁺²³
Honestly, as long as this shit continues to be trained by stealing work from actual human artists I don't care. I'm genuinely disappointed in Matt and Steve.
@DaveEtchells 4 місяці тому ⁺¹⁹
How do you know that we aren’t simply somewhat more complex mathematical models? 😉
@theninja4137 4 місяці тому ⁺³⁶
@Zutia what is and isn't stealing in this context is something that still needs to be established.
As the training images are not directly used, but just statistics on them (the training images are not actually stored in the final model, it's therefore impossible for it to "copy-paste" parts of them into the output image), so it doesn’t conflict with the current copyright. And if we change copyright in that regard, we also need to consider what that implies for artists being inspired by each other
@FENomadtrooper 4 місяці тому ⁺⁸
I don't see many people calling them general AI, but I do run into hordes of people on the internet vehemently claiming that an LLM is not even a type of AI at all.
@raccoon1302 4 місяці тому ⁺⁵¹
@@Zutia being disappointed in Steve for covering an extremely interesting and relevant application of a novel technology is quite frankly nuts. touch grass
@zero01101 4 місяці тому ⁺⁸
this is absolutely the best explanation of the u-net and text encoder and how they work together i've ever heard
@dekumarademosater2762 4 місяці тому ⁺⁴¹⁷
So a person could do this too - rough outline sketch of penguin, of a giraffe; flip one, work out an average rough from both; flip one back, do more detail on both, flip one. Repeat till you're happy or you give up.
But some people just do it in their head - amazing!
@rayscotchcoulton 4 місяці тому ⁺⁷⁸
Was thinking the same thing. With enough trial and error with both your original image and whatever secondary image that sort of manifests itself, this seems absolutely doable. It feels like an artist expression that humans could absolutely be trained in, but just haven't really ever largely pursued.
@cmmartti 4 місяці тому ⁺⁸¹
It's a common trend to do this with names or words in fancy script, so that it reads the same flipped upside down. I've seen a bunch on UA-cam and he does it in a few seconds (I couldn't tell you what it's called, it was something I saw in passing).
@mistrsportak9940 4 місяці тому
@@rayscotchcoultonWith enough trial and error, a monkey can write the Hamlet
@seav80 4 місяці тому ⁺⁷⁰
@@cmmartti those are called ambigrams
@cmmartti 4 місяці тому ⁺⁷
@@seav80 There we go!
@APrettyGoodChannel 4 місяці тому ⁺⁷¹
The reason some text models struggle with counting the number of r characters in a word like strawberry is because they don't see the word, they receive a vector which was trained to represent the different meanings of the word when looked at through different filters, similar to these illusions, which is what attention QKV projections do (extracting information from the vector which is layered in there). Sometimes the vector would have managed to store information about a word such as spelling and rhyming which the model can use, but oftentimes not, it depends on chance with how often things appear in the training data. The model could count it if the word was split into individual letters with spaces between them, because each would encode into a unique vector.
@LarryFain-y9w 4 місяці тому
write words the way they sound
so the AI can say them easier
@RamsesTheFourth 4 місяці тому ⁺⁶
@@LarryFain-y9w Phonetic consistency in english language would be great news for all non-english native speakers.
@Pandora_The_Panda 4 місяці тому ⁺¹
Wouldn't work because english has too many different accents and dialects, unfortunately.
@anchpop 4 місяці тому ⁺²
not quite, the model receives a stream of tokens which are not semantically meaningful. a model whose tokens mapped 1-1 with english characters would have no problem counting the number of r characters in strawberry. what you are referring to is a part of the model that converts chunks of the tokens stream into token embeddings
@RamsesTheFourth 4 місяці тому
@@Pandora_The_Panda Only one accent and dialect would be standarad others would not. Or each country would have their own standard.
@KeplersDream 4 місяці тому ⁺⁴⁰
It's like a sci fi version of the old Mad Magazine 'fold-in' pictures, if anyone remembers them.
@frankhooper7871 4 місяці тому ⁺²
73 years old and remember them well 😊
@stigcc 4 місяці тому ⁺²
I 'member!
@truejim 4 місяці тому ⁺⁹⁷
4:30 Minor nit. I don’t think the token embedding is really embedding based on semantics. It’s embedding based on how humans have used tokens in our writing. Since we tend to use semantically similar tokens in linguistically similar ways, the embedding does tend to cluster semantically similar tokens near each other. But it will also cluster tokens that aren’t semantically similar, merely because they’re used in the same way linguistically. For example “the” and “his” will be near each other in the embedding space not because they’re similar in meaning, but because they’re interchangeable in many sentences.
@muschgathloosia5875 4 місяці тому ⁺⁸
What else is semantics then? The model is essentially doing what linguists do but using raw statistics instead of pattern recognition.
@truejim 4 місяці тому ⁺¹
@@muschgathloosia5875 A purely semantic embedding would cluster tokens based only on similar *meaning*. Embeddings such as Word2Vec cluster tokens based on how the token is used in written English. So two tokens can be embedded near each other because they have similar meaning, *or* because they’re interchangeable in a sentence. “I ate his pie” vs “I ate that pie”. The words ‘his’ and ‘that’ don’t mean similar things, yet they’re still clustered near each other. The neural network is being trained on how words are used, not what they mean. It just so happens that words with similar meaning are also often interchangeable in a sentence.
@marigold2257 4 місяці тому ⁺²²
@@muschgathloosia5875it’s not understanding the semantics because the way it arranges things has nothing to do with semantics and everything to do with frequency of use together, if for every string of words I had a dice I could roll that would give me a a word to write down then I could generate sentences, if that dice was weighted via analysis of how often words are used together then my writing would look human and because of how language works it would look like semantic understanding, but I don’t understand anything I’m just rolling a dice based on frequency of word use
@coolguyflex 4 місяці тому
@@marigold2257LLMs are not Markov chains. They capture very complex and subtle relations between words. An LLM works by analyzing it's training data and representing it numerically in a way so it can reuse it to satisfy prompts. But the training process forces the model to be efficient with its organisation. The model is unable to learn all word patterns, so it has to instead find and learn subtle higher order concepts that are simple to memorize but can be used to satisfy many prompts.
It's like getting a kid to solve a thousand exam questions. They can't possibly learn all answers, so they will be forced to pick up patterns in answers, allowing them to answer questions they haven't seen before. These patterns will be artificacts of the way the questions are framed, as well as real knowdlege about the subject of the test.
It's difficult to examine exactly what the model knows, but it's possible to show that at it organizes it's knowledge in a way that encodes concepts similar to our semantic concepts. For example age may be represented as a geometric direction where words further along that direction are semantically older. Does that mean the model "understands" age? That's a philosophical question. But it means the model can use the concept of age in ways similar to what we do.
People often take poor math abilities as an example that the LLM isn't actually reasoning like we are. I think that's mostly training artefacts. There is not enough pressure on the model to learn mathematical concepts, so it instead learns shortcuts to produce plausible answers. However, concepts like age, sex, size are quite well represented because they are very useful useful to answer the types of prompts the model was trained for.
@muschgathloosia5875 4 місяці тому ⁺²
@@marigold2257 I'm not claiming it has any 'understanding' I'm just saying that the vector of tokens created is probably relevant to semantics more than just happenstance. I'm not putting any merit on the output of a generative model just the intermediary organization of the data.
@amarissimus29 4 місяці тому ⁺²⁰
In the settings of automatic1111, you can enable a clipskip slider right up top next to your model, vae, etc. Very useful if you're playing around with CLIP, especially when you've got novel length prompts. Doesn't really help you understand how the vector spaces really work, but it does help you to pretend to understand how they work.
@Dialethian 4 місяці тому ⁺²⁶⁷
Oh the overlap with mundane cryptography could be interesting. The order of words could be scrambled between two outputs.
The idea of synthesizing sound that says different things if you understand different languages is kinda horrifying.
@ianmoore5502 4 місяці тому ⁺⁴³
Or sounds which mean the same thing in multiple languages.
What a time to be alive!
@Gabu_ 4 місяці тому ⁺¹¹
That's already a real thing!
@noahjacobs7486 4 місяці тому ⁺³⁶
We could create infinite laurel/yanny prompts or images that have hidden details for color blind individuals
@istinaanitsi3342 4 місяці тому ⁺¹
это будет проблема для английского языка, в русском языке заложена защита от такой глупости, дети умеют пользоваться этим в играх на русском языке
@istinaanitsi3342 4 місяці тому
@@minhuang8848 просто ваш мозг вас разыгрывает
@eler90 4 місяці тому ⁺³⁰
Just don't forget that "but it eorks either way" means actually that scientists have tried I would assume thousands of ideas regarding the network architectures, hyperparameters etc. and only some ideas have worked so well that they allowed for the next step. Showcasing results is one thing, developing the models another. It's hard work.
@jonathanlevi2458 4 місяці тому ⁺¹
Dude, was deep and understandable. Thanks!
@jenshaglof8180 4 місяці тому ⁺¹⁷
Steve! This video actually taught me how text-to-image AI works. I've seen many videos about it but it still seemed like magic to me. Now, I actually understand the underlying process. Thank you so much!!!
@BarnabyPine 4 місяці тому ⁺¹¹⁵
Salvador Dali has a painting which looks like a woman in a dress going through a door in some kind of cubic world. When you go to take a picture of it, it looks like a pixelated Abraham Lincoln
@alexanderhalpin 4 місяці тому ⁺⁷
That is an example of a hybrid image
@RFC3514 4 місяці тому ⁺²⁹
That's basically a highpass / lowpass image (close up you see the fine details, further away you only see the big blocks). They're not hard to make. There's one in this video at 14:36 (not pixelated, but it's still the same highpass / lowpass concept).
P.S. - I'm pretty sure the woman in Dali's Lincoln painting isn't in a dress, unless the fabric is incredibly thin. 😉
@maxnami 4 місяці тому
Canadian artist Bob Gonsalves used to do that kind of paintings. Search for his works on internet.
@ghislainbugnicourt3709 4 місяці тому ⁺¹²
@@RFC3514 I agree with you but "they're not hard to make" is misleading. Some are hard to make. One example I like from Dali is The Hallucinogenic Toreador, where the same effect is used but with a smaller scale difference. I believe that's much harder to make, and that's without even considering the artistic aspect.
@caesarpizza1338 4 місяці тому
@user-gt5df8yt1v What painting is this?
@SimonJ57 4 місяці тому ⁺⁶
Seeing the pair have 3 different images (maybe a 4th) depending on the other squares orientation absolutely Blew, my, mind.
And I would love to buy some.
@macronencer 4 місяці тому ⁺⁷
The one where you combine the four transparencies together is a very cool new form of steganography. Excellent!
@lordfly88 4 місяці тому ⁺¹
Wow! That was incredible! It went from the most mind bending optical puzzles, to such a fantastic explanation of the whole thing. This is what UA-cam is truly meant for.
@Tigrou7777 4 місяці тому ⁺¹⁴⁵
The idea of generating images by removing noise is just as crazy as LLMs that generate text by predicting the next word (these are gross simplifications, but that's basically what it is).
@istinaanitsi3342 4 місяці тому ⁺⁴
так работает резчик по камню или дереву, что в этом особенного
@vectoralphaSec 4 місяці тому ⁺²³
AI is just mathematical magic. It's amazing.
@istinaanitsi3342 4 місяці тому
@@vectoralphaSec математика основа мира, а для вас просто мусор видимо
@nio804 4 місяці тому ⁺¹¹
It's even weirder than text prediction because the image model is trained to predict what noise was *added* to an image to make it noisier, and then by running that "backwards" on random noise you just happen to get an unreasonably efficient image generator.
@istinaanitsi3342 4 місяці тому ⁺²
@@nio804 это не предсказание а угадывание общих шаблонов загрязнения сигналов, уверен - работает только в заранее заданных условиях, обычный саморекламный трюк
@ex5tube 4 місяці тому ⁺³⁴⁰
I'm a software engineer and a midjourney user, and I've watched maybe 50 - 100 videos on LLM and generative AI.
In 17 minutes you managed to provide the best simple explanation for how generative AI works with LLMs to produce images from prompts.
Steve, you should teach a paid course on this stuff.
@ScottiStudios 4 місяці тому ⁺⁴
I was going to comment the same thing. Such a compact and simple yet comprehensive explanation. Well done.
@istinaanitsi3342 4 місяці тому ⁺¹
ой ой ой
@evansdm2008 4 місяці тому ⁺⁴
Yes, same. I’m a software engineer also. Exactly as Steve says, I feel satisfied with that explanation.
@Just_A_Dude 4 місяці тому ⁺¹⁰
I'm saving this video for the next time someone calls generative AI a "collage tool cut-and-pasting other people's images."
@kevinbrown2701 4 місяці тому ⁺¹
I'm NOT a software engineer (I can barely string a string together), and yet it still made sense to me! I was left with one burning question though: Where can I buy these things to show other people?
@wholelottavideo8381 4 місяці тому ⁺³³
Amazing stuff. A word about your video editing. You have to give viewers enough time to assimilate the starting image before progressing to the secondary. Probably an extra second would do. When editing, you know what you are looking at, but a viewer doesn't. Wanted to rewind and pause all the time.
@gaelonhays1712 4 місяці тому ⁺³
I've been doing drawings that do this for years; this was really cool to see.
_This_ is what AI is meant to be used for. It's not gonna take over every human job, because humans will always find ways to use it that it couldn't think of on its own.
@MrBelles104 4 місяці тому ⁺⁷
I saw both video thumbnails pop up in my feed, noting the similarities, and I loved the opportunity you had to collab with Stand-up Maths!
@chaos.corner 4 місяці тому ⁺¹⁶
The topic puts me in mind of ambigrams. I've created a few and it's all about getting enough features to trigger the word recognition in one orientation without destroying the recognition in the other direction. And vice versa.
@BrightBlueJim 4 місяці тому
Which is what I hate, hate, Hate, Hate, HATE about AI. What used to be a clever thing is now something you can make just by writing the appropriate prompt.
@natchu96 4 місяці тому ⁺¹
@BrightBlueJim Well...doing normally time-consuming tasks extremely quickly is pretty much what computers were created for...
@gameeverything816 4 місяці тому ⁺⁵
Whoa whoa @ 11:01 you just gonna gloss over that?! That was awesome! I wanna see more of that, that was wild!
@Cyrribrae 3 місяці тому ⁺¹
Yea that was the part that really blew my mind and was only briefly mentioned. Just so much cool stuff that just was out of reach before
@Yahboykatra 4 місяці тому ⁺³
Love how this video describes generative ai images so well! Appreciate the video!
@JimCoder 4 місяці тому ⁺¹⁴
I suspect our own minds are filtering noise from those images to make sense of them. Then from another perspective that same noise becomes signal, yielding a different perceived image. Fascinating stuff reminiscent of Hofstadter's Godel Escher Bach.
@alquinn8576 4 місяці тому ⁺²
I wonder how my cat sees the world. Sometimes i think very different from me since they don't have the higher level concepts to make sense of nearly all of the human artifacts around them; i.e. it doesn't fit into their umwelt. I think the closest i came to understanding what that was like was when i overdosed on edibles and tried using my smart phone but nothing on it made any sense (I was trying to google what to do if you overdose on edibles, but I couldn't tell the app icons apart from one another).
@BrightBlueJim 4 місяці тому
The cover of which is what I was reminded when looking at the 3D robot dog: the book cover art is a 3D figure that appears as a 'G' in one orientation, an 'E' in another, and a 'B' in another.
@Tiniuc 4 місяці тому ⁺⁶
12:10 dear god, a jigsaw puzzle with multiple answers!!!
@voradorhylden3410 4 місяці тому ⁺²
This is awesome. This is art. Something awe inspiring and flips how you look at things. Forces a new perspective. Nicely done!
@BrightBlueJim 4 місяці тому ⁺¹
But it's not art. The root word for "art" is the same as that for "artifact" and "artificial", which means (to me) that for something to be art, it must be man-made. Which makes the AI itself art, but not the picture. Sort of.
@SnakeSolidPL 4 місяці тому ⁺²⁹
1:01 poor rabbit being called trash by Steve
@NotBroihon 4 місяці тому ⁺⁵
😢
@Jus10Ed 4 місяці тому ⁺³
It's cute. Kind of looks like a stained glass window.
@MerchantMarineGuy 4 місяці тому ⁺²⁷
5:12 sports…..what??!??
@torchy_ 4 місяці тому
...jewish people...?????
@jackpoco 4 місяці тому
And again at 7:44
@zacharydefeciani7890 4 місяці тому ⁺⁸⁵
I saw that hidden matt parker at 2:52
@standupmaths 4 місяці тому ⁺⁵³
Highlight of the video.
@yuyurolfer 4 місяці тому ⁺⁵
@@standupmaths It's him!
@zacharydefeciani7890 4 місяці тому ⁺⁷
@standupmaths there better be a steve mould in your video somewhere 😉
@legoworks-cg5hk 4 місяці тому ⁺¹
@@standupmathswhy don't you have that tick?
@loudej Місяць тому
This is the comment I was looking for
@maybud60 4 місяці тому
Steve, it takes extraordinary talent to break down complex ideas into digestible pieces. Respect! Fascinating stuff.
@MikkoRantalainen 4 місяці тому
Great video! I consider this video to be mostly about creative visual hack that depends on human visual understanding but it also happens to be one of the best introductions to noise diffusion image generators, too.
@robadkerson 4 місяці тому ⁺⁸⁶
You could have a puzzle that's a different picture no matter how you put it together
@jimburton5592 4 місяці тому ⁺²⁹
I actually made one of those once. Didn't even take too long to design, and each of the possible ways to assemble the puzzle resulted in a unique image. Granted, it was only a 1 piece puzzle. But hey, it's a proof of concept, right?
@robadkerson 4 місяці тому ⁺⁶
@@jimburton5592 nice! Not every arrangement has to work, you could even "seek" different solutions
@bobbob0507 4 місяці тому ⁺⁸
In other words, a normal puzzle
@KillerKatz12 4 місяці тому ⁺⁴
@@bobbob0507 Well no because normal puzzles only make it possible for you to have one solution so you don’t get confused why your picture doesn’t look right. Basically the pieces only fit with certain pieces even if you try to jam them into a different one it will be slightly off size.
@moonrock41 4 місяці тому
You'd probably need to limit the number of pictures to two, but it would still be considerably more challenging since you'd need to determine which picture the pieces you've assembled are intended for.
@izzard 4 місяці тому ⁺¹²
6:14 Heeeyyy… I thought we weren't using Lenna anymore?!
@BrightBlueJim 4 місяці тому
What do you mean, "we"?
@apppples 4 місяці тому ⁺⁴
@@BrightBlueJim people who respect the wishes of exploited women whose images were used without their consent is a pretty good stand in for the word "we" in this context
@ker6349 4 місяці тому ⁺⁸
@@BrightBlueJimpeople who understand that there are a significant number of better test images, including those which are made and distributed with the permission of the subject of the photograph. Lenna was publicly fine with it for a while IIRC but now she thinks it's unnecessary for a variety of reasons
@74Gee 4 місяці тому ⁺⁵
Your description of diffuser, large language and clip models, and how they relate/interact was the best I've heard so far.
I can only imagine the enlightening journey it took to explain this so succinctly.
@theolaa 20 днів тому
Positively fascinating! I know it was simplified, but your explanation of generative AI was really great.
@oliviervancantfort5327 4 місяці тому ⁺⁸
Too bad the cover image was edited. The left rabbit ear has basically disappeared on the duck image...
@alex.g7317 4 місяці тому ⁺³
At 2:00 I have never understood generative AI more. I love this explanation.
@Levaaant 4 місяці тому ⁺¹⁶
0:27 can we get the link to that please?
@Brightguy858 4 місяці тому
Its github
@iropiupiu9642 4 місяці тому ⁺¹
I want the link too (._.)
@miguelabro 3 місяці тому
Get this comment up
@DaleHawkins 4 місяці тому
Thanks!
@kyar0s539 4 місяці тому ⁺¹
THANK YOU. This was really mindbreaking and inspiring. Love your channel, and loved this video. Love this kind of reflexion+ type where you take a very complex AI subject and decompose it bit by bit.
@G33v3s 4 місяці тому ⁺³⁰
You need to get Vi Hart in on this action with a hexaflexagon that has actual images on each orientation
@zackcinq-mars2129 4 місяці тому
Wow, completely agree! That would be so cool!
@Commentator-jh7wl 4 місяці тому
I wish her brain hadn't melted years ago :(
@MrMattie725 4 місяці тому ⁺⁵
Are we ignoring that the rotated first draft giraf at 11:38 was already the most stereotypical penguin image one would think of? :o
@rai2880 4 місяці тому
And the reverse penguin was also a giraffe
@moncef2733 3 місяці тому
I think it's an editing mistake they swapped the 2 images without noticing because it was too blurry xD
@NickDClements 4 місяці тому ⁺²⁸
0:40 Those are SKEWBITS!, by Make Anything! Well, the auxetic cube he first modeled that led to SKEWBITS. Your original 'Self-assembling material' video inspired him to try and make an auxetic cube that he could 3D print. He made the files available for download, someone else then printed them, used it for this purpose, and now they are in this video. UA-cam is amazing!
@landsgevaer 4 місяці тому
This is my Video Of The Year!
Excellent explanation of generative image AI with a pretty neat application too. Loved it! 💛
@satoriwabisabi 8 днів тому
I learned so much from this video. I understood things about AI text to image, I never understood before. Thank you.
@DangerDurians 4 місяці тому ⁺²⁵
Stopping it half way is exactly how you would do it with physical media
Do a sketch, re orient, edit sketch, repeat
@skilletborne Місяць тому
Right??? A real artist could have done it, but they were too lazy to
@Karak-_- 18 днів тому
@@skilletborne An artist could do it too, but none of them did.
@Mykasan 4 місяці тому ⁺⁸
can you do a epilepsy warning for 2:46. i'm not particularly sensitive to rapid light changes but i know some that are.
@RjWolf3000 4 місяці тому ⁺⁸
That rotating set that created 3 or more images is interesting. Could ai generate a bunch of layers where rotated coukd show an animated scene. That could make a really interesting sign or clock with a mechanical animation.
@joshlake1882 4 місяці тому ⁺²
I had the same idea, I’d love to make a rotating layer display.
@BrightBlueJim 4 місяці тому ⁺¹
I don't know how you do it, but every video of yours I see is fantastic.
@Coksnuss 4 місяці тому
What an excellent video that explains quite accurately (enough) how generative models work at a fundamental level.
@illusion-xiii 4 місяці тому ⁺⁴
Is it just me, or around 1:40 does it really look like the illusion is going to resolve into Yoda for a moment?
@GreenCat188 4 місяці тому ⁺¹
Nyoda Cat?
@happyvirus6590 Місяць тому ⁺³
1:24 cool one, that is.
@Lampe2020 4 місяці тому ⁺³
6:35 You've accidentally created a demon cat XD
@Nicola-cg1rg 4 місяці тому
Great explanation of diffusion models and how text prompts work! One of your better videos of late!
@CuJixBeatZ 4 місяці тому
now thats a great explanation of how diffusion models work with the noise! Felt like i learned something new
@maxlibz 4 місяці тому ⁺²⁷
2:46 epilepsy warning
@resourceress7 4 місяці тому ⁺²
Yes, PLEASE PIN this comment.
Thanks
@DavidTheHypnotist 4 місяці тому ⁺¹
Thank you! That gave me a freaking headache!
@oliparkhouse 4 місяці тому ⁺⁴
@SteveMould Would you mind editing in a 'flashing imagery warning' at the start of the video. UA-cam's editor should allow a text box to be input ahead of the section with flashing, and shouldn't require you to re-upload. Thanks
@maxlibz kudos for putting the warning up. UA-cam showed the comment just before the flashing began. Though I'm not epileptic flashing imagery can trigger or worsen my migraines. Your effort has made a difference already. Thank you!
@yeahiagree1070 4 місяці тому
@@oliparkhouse epilepsy is not a fashion accessory for you to wear to make yourself more interesting. Shut up
@piotrcthlu 4 місяці тому
Thank you, this should be Pinned.
@Simple_But_Expensive 4 місяці тому ⁺⁶⁰
You mentioned not training with human data to eliminate bias, but I have seen mathematical arguments that bias is unavoidable.
There were several papers and videos, but the only one I remember was an episode of Nova discussing how use of AI in predictive law enforcement in Oakland, California led to heavy handed responses in one neighborhood while ignoring rising crime in another.
Admittedly, the math was way over my head, but it seemed pretty convincing.
The problem basically lies not in the training data itself, but in the selection of training data.
Something along the lines of having university students select a set of images of men. The students unconsciously biased the data set by selecting a majority of younger more attractive and apparently more affluent white men by 58%.
Another example was Google’s AI refusing to show any white men in images of the founding fathers of the USA. (Which is confusing because they were all old white men. Talk about bias!)
Trying to select the data completely randomly only proved that we can only generate pseudo random numbers, yielding pseudo random sets.
The bias can be minimized, but never completely eliminated.
In the end, any AI will be a reflection of us, both the good and the bad in all of us. That is what is scary about AI.
@DanKaschel 4 місяці тому ⁺⁴
I think this overstates the severity of the problem.
Sometimes AI is thought of as a really sophisticated calculator, and indications that its answers might be incorrect are an existential threat.
But AI is maybe more like... Marketing. We get iteratively better at creating AI that will achieve our goals, and with time we will build more and more expertise at accelerating that process. The fact that AI in its current form is not capable of solving certain problems perfectly is scary in the sense that we can't cure cancer with medicine. It's unfortunate, but not necessarily unsolvable and certainly not intrinsic (except to specific approaches).
@Gabu_ 4 місяці тому ⁺⁶
Your very first point is "It isn't a problem with the training data, it's just a problem with the training data"... Maybe think a bit longer on your argument.
@LesenundDenken 4 місяці тому
The internet, like history, art, and pretty much any human cultural artifact, are all humanity's Caliban's mirror.
@Yottenburgen 4 місяці тому ⁺²
The google thing was likely instructional bias tbh rather than something trained into it. But that really just points into bias on both parts, what you put in and what is already inside of it.
@Yottenburgen 4 місяці тому ⁺⁴
@@Gabu_ Human training data is different, it means random quality datasets that humans have a 100% hand in creating, it matters what you put in but every dataset even if it isn't explicitly human consolidated is biased.
Even if a LLM were to create its own dataset it would still be human based as it inherited a human bias.
@justpassnthru 4 місяці тому ⁺³⁴
I remember, back in the 70's, there was a drawing of a "prom queen" with crown and all but when turned upside down was a picture of an old woman. It was a classic. Very simplistic compared to this but the same idea.
@almendratlilkouatl 4 місяці тому ⁺¹
oh yeah, and if you put it on the side you can see the beatles and aleister crowley riding a whale on the pyramid of Tolotsin the ancient god of fire and the sun and if you fold it at 33 degrees you get the masonic token to unlock the next level
@ranjitkonkar9067 4 місяці тому ⁺²
I remember that. Still used as an optical illusion example. Except that you didn't have to turn it around, did you? Just took a shift in perspectives to suddenly start seeing the other one.
@michalpifko3516 4 місяці тому
@@ranjitkonkar9067 There are two commonly used optical illusions that show a young/old woman. One involves rotating the image (the one OP was talking about) and it often comes with text that says "before 6 beers/afer 6 beers". The other it the one you are probably remembering (you can see a profile of an old woman or a young woman looking away from the picture).
@unfa00 4 місяці тому ⁺²
Steve, your clear explanation makes me want to try and make such a puzzle myself.
My idea is I could model something and animate it so I can easily switch between two different states and paint digitally.
Like painting on 4 separate cards while seeing them all juxtaposed.
It seems possible to do manually with digital painting.
Way, way harder to do with purely physical tools I guess.
I'd wager an artist could make these, maybe even better than the AI can.
The drawings that portray one thing, and then another thing when upside down, have been made by human artists already. The process you've described on how AI does it makes it seem to me like I could do it, even being mediocre at painting/drawing.
@dibbidydoo4318 4 місяці тому ⁺¹
the upside thing is the classic but there's more complicated tricks you can do with Genai.
@johannesstephanusroos4969 4 місяці тому
@@dibbidydoo4318 Could you tell me how, please? I have a friend who's obsessed with ducks, so anything that changes from a duck to something else and back would be amazing. I'd really like to make one for them
@zynskeyfolf 3 місяці тому
A wild unfa spotted
@HolyGarbage 3 місяці тому
This was incredibly fascinating. Thank you Steve and Matt.
@khutikhuti 4 місяці тому ⁺¹⁴
Sneaking in a Matt Parker pic in the images there 😂👌
@andrewcullen7671 4 місяці тому ⁺²¹
In the storied traditions of computational neuroscience, this video is a competent procedural explanation for the process of visual imagination. I wrote about this in my Master's thesis because I have aphantasia, and wanted to understand what other people could do, that I struggle with. In most people, the brain can generate real visual images in the occipital lobe based on words from the temporal lobe, eyes closed, no visual data. This process is how people have visual hallucinations - the brain generating visual data based on low-quality information. This is also why hallucinations are more common in one's peripheral vision and low light.
People with aphantasia, including some hyperverbal autistic people, often require high quality visual data, so they can't imagine anything with their eyes closed, even picturing something that happened earlier that day, or their loved one's face. But the process of visual imagination works very much like diffusion. If a person pictures an apple, they may get a fuzzy red blob at first, and then the brain fills in more and more details based on previous experiences with apples. if I try this, I just think of the definition of an apple. Weirdly, I'm an abstract surrealist painter and art teacher - no visual imagination. I can't remember what my mom looks like.
@glennac 4 місяці тому ⁺²
You would have been a great case study for Oliver Saks or V. S. Ramachandran. Both have written fascinating books about neuroscience and the many divergent ways the brain functions in certain individuals.
May I ask, if you can’t “picture” your mother visually when you two are apart, with what cues do you rely on to establish that relationship? Do you “hear” or recall her voice? Are there behavioral mannerisms of hers that reinforce your relationship with her when you two are apart?
Thank you for sharing your experience. 🙏🏼
@WindsorMason 4 місяці тому
Fascinating!
@Gna-rn7zx 4 місяці тому ⁺¹³
Thumbnail is a bit misleading... the duck image was altered.
@baki2200 4 місяці тому ⁺²
The rabbit image was altered too! Look at the duck beak
@AnaSnyder-j4x 4 місяці тому
This content is always full of useful and practical knowledge.
@DumitruCoretchiD 3 місяці тому
Super interesting Steve! Thanks for explaining this.
@Nakatoa0taku 4 місяці тому ⁺⁶
Your sponsor sounds like insider trading with extra steps 😂
@rayscotchcoulton 4 місяці тому ⁺¹¹
AI thoughts and comments aside, the angel-statue-to-Yoda transformation at 1:24 is absurdly clean and made me laugh out loud
@jarydm87 4 місяці тому ⁺¹⁹
What about a generative ai song that sounds legible and good played forward and in reverse?
@frostden 4 місяці тому ⁺¹
JOIN THE NAVY!
@jarydm87 4 місяці тому ⁺²
YVA NETH NIAJ
@dibbidydoo4318 4 місяці тому
it's quite similar to that sora video where you can choose the end frame of the video and the start frame of the video so you can create a loop.
@tarravento 26 днів тому
I've seen this trick in Matt Parker's channel some time ago but I completly forgotted about it. Very interesting concept 👍
@3mpt7 4 місяці тому
Thanks!
@Grim_Beard 4 місяці тому ⁺¹¹
04:00 Sorry, Steve, but this is a very misleading explanation of Large Language Models (LLMs). LLMs do _not_ 'understand' text, and they _don't_ have semantic knowledge (e.g. that 'blue boat' means that the boat is blue). The model doesn't know what a boat is, or what blue is, or what it means for a boat to be blue. All it knows is that certain words (actually tokens, which might be words, parts of words, or combinations of words) go together at certain frequencies. LLMs do not have 'meanings', just probabilities of tokens occurring together.
@Grim_Beard 4 місяці тому ⁺²
@Singularity606 Unsure why you think I feel "so strongly about this". I just thought Steve, who generally likes to give accurate information, might want to, you know, give accurate information. He can't correct errors if no-one points them out.
Also unsure why you're giving misinformation about LLMs, which do _not_ have semantic knowledge. The fact that a prompt like 'blue boat' can be used to generate an image of a blue boat does not mean that either the LLM or the diffusion model has any semantic knowledge. No more than a checkout recognising a barcode as belonging to a banana and displaying a price means that the till knows what a 'banana' is or has any concept of either food or money.
@Grim_Beard 4 місяці тому ⁺¹
@Singularity606 No, I'm talking about meaning not 'qualia' (which is a silly concept invented by a philosopher who doesn't understand cognitive neuroscience or psychology). You know what a boat is, what it does, how it works, where you're likely to find one, what it's used for, and so on. To you, 'boat' is not just a token that appears in some sentences, it _means_ something. LLMs don't have that. In an LLM 'boat' is just a token, that is statistically associated with other tokens.
@Grim_Beard 4 місяці тому
@Singularity606 The word is literally just a token in the LLM's data set. The LLM has no understanding of meaning, it only (1) calculates statistical associations between tokens in training and then (2) uses them to generate output. This is not controversial, it's very basic, fundamental stuff about how LLMs work.
@alansmithee419 4 місяці тому ⁺²
@@Grim_Beard
It's very basic, fundamental stuff about how LLMs are *trained.*
That does not necessarily tell us anything about how it actually performs that task internally within the model. AIs are often called a black box for this reason, and we are perpetually confused as to just *how* they perform so well. Perhaps the reason for this is that understanding is not so difficult to achieve as we'd expect.
If you ask the LLM what a boat is it will tell you.
If you ask the LLM what will happen if a broken boat is placed in water it will tell you.
If you ask the LLM what a good tool for moving items over seas is it will tell you (it's a boat).
These imply understanding of some form to me, even if it is not the exact same as the understanding we have. Yes internally it's "just a token." But it knows the relationship of that token to other tokens and how they can be put together to form coherent messages, and it can derive information about the world from these relationships. That is language, and (to me) that is understanding. Even if it is not a language any human speaks, being more numerical in nature, it remains a language with meaningful syntax and the ability to perform the task of any human language. The LLM understands this language, and we simply translate for it on either side of the process.
Words in the human brain are "just electrical signals" that we know the relationship of to other electrical signals and how they interact with each other to allow us to form coherent messages, and we can derive information about the world from these electrical signals. We have more types of data than the AI, but that doesn't inherently mean that we understand and they don't, just that they understand less or differently.
Ultimately the only way you can claim that AI doesn't understand (or does, my above statement that they do is just as subjective as your statement that they don't) is to first provide a solid definition of what you mean by "understanding." The word has no set definition, so unless you tell people what specifically you mean when you say that you are not communicating your thoughts in their full form. And in any case you cannot state this not understanding as being a known fact that others are incorrect about. They are simply using a different definition of this ill defined word to you. They are not wrong.
@shiuay6165 4 місяці тому
@@alansmithee419Thank you very much, that's exactly what I wanted to respond to this comment and yours saved me quite some time !
I find it weird that people will go and "correct" people like that, while being so horribly confident in their "knowledge", saying things like "this is basic knowledge/facts about LLMs". This guy even has 10 likes wtf, how can anyone not think a minute about defining what is "semantics", "understanding" or even "knowing" before arguing if current LLMs have such things.
Guys, please define the terms you are using before asking if LLMs have those !
@tdata545 4 місяці тому ⁺⁸
With the Duck and Rabbit, I can see both and where both transforms in each form. But these overlays are crazy.
@anoobis117 4 місяці тому ⁺²¹
It's not really an illusion in my opinion. It's just a fancy way of putting images together creatively. An illusion would imply there is some sort of visual trickery involved to make you think what you're seeing is something else, or that it exploits the visual cortex to produce hallucinatory artifacts. This does not do either.
@JimC 4 місяці тому ⁺⁶
I agree with you completely. But what do we call it instead? I can't think of another word.
@handsbasic 4 місяці тому
i think we conclude that all visual perception is an illusion because of our object recognition meat “software.” i don’t think it’s such a radical conclusion.
@Foxmasker 4 місяці тому
@@JimCa double image? Idk
@StevenGallman-g9c 4 місяці тому
You explain even the most difficult concepts so well.
@Mokodokococo 2 місяці тому
Hey @SteveMould! Thank you for everything you do!
There's is something I'd like to know about. I've no idea if it's an area of research.
To start with an example, water is a good subject for this behaviour. I think that you already made some video around the subject I'm about to describe.
So water is flowing along a river and you put some kind module in the course of the water. The river will obviously show changes downstream but upstream as well (eg. making pond/lake, or taking another route altogether).
To give you another example and you may find a pattern there. I seem to remember seeing somewhere that somehow, a ray of light may take an entirely different path based on what is on its way.
The subject is not about taking a different path but more generally how something downstream can affect something upstream.
Hopefully it'll reach your eyes and you'll find it interesting enough to make a video about it.
@ramadrian248 4 місяці тому ⁺⁴
12:04 You're welcome
@CountJeffula 4 місяці тому ⁺⁴
I wonder what Plato would think of the fact that we are quite literally creating a Theory of Forms where abstract ideas are no longer merely figments of human imagination, but destinations in a multidimensional vector space that can be visited repeatedly and used in increasingly novel ways. I’m sure Aristotle would need to think on it for a while given his views on Plato’s theory.
@miauzure3960 4 місяці тому ⁺¹⁰
why there is no "AI" in the title!! this is one of the best explainations of AI diffusion
@IMakeUnitaleThings 4 місяці тому ⁺⁷
Probably because AI and images is usually a bad thing and would minimize views, like putting nfts into the title or something like that
@PeetSneekes 4 місяці тому
You know, I never want to watch these videos, but when I do, I’m mesmerized, fascinated and happier. Thank you!
@RebeccaHess-p7o 4 місяці тому
This video made everything clearer and easier to grasp.
@TheLurker 4 місяці тому ⁺²⁵
Hey! Just a heads up that this video uses the Lenna image at 6:14. This is a playboy centerfold that was used for decades as a test image in digital image processing, but it's generally frowned upon to use it now, because it's a vestige of misogyny from the 1970s in tech. Its use has also historically privileged lighter skin tones over darker ones.
It's worth going and reading about the history of this image and how it got into such wide use, and why folks consider it harmful in this day and age if you want to know more.
@Hyperlooper 4 місяці тому ⁺¹⁰
Bring back lenna
@morphentropic 4 місяці тому ⁺¹
Which folks?
@alquinn8576 4 місяці тому ⁺⁸
i'm here for the flippy image stuff, not this woke BS
@Hyperlooper 4 місяці тому ⁺³
@@morphentropicyou know, "folks". Same ones who don't mind your cat getting eaten.
@jay_13875 4 місяці тому ⁺⁴
Nobody asked
@harrygreenfeld4964 4 місяці тому ⁺⁷
As long as all the data that gets scraped gets due credit or paid as necessary, not a problem.
@cmyk8964 4 місяці тому ⁺³
Yes, that’s the biggest problem with GenAI in its current state. It’s created from mostly pirated copyrighted works or sensitive personal data.
@TetanusSnowfall 4 місяці тому
Considering one of the illusions had Yoda something tells me even the most ethical proponents of the tech aren't interested in that.
@hurrdurr7861 4 місяці тому ⁺²
Your old world ideas of ownership of visuals is long gone.
@jeremiasrobinson 4 місяці тому ⁺²²
I made images that morph into each other using mirrors and anamorphism. As the viewer changes their position, the image morphs.
@jeremiasrobinson 4 місяці тому ⁺¹
I feel like I had to take a similar approach to these robots.
@VitorMiguell 4 місяці тому
You posted it somewhere?
@jeremiasrobinson 4 місяці тому ⁺¹
@@VitorMiguell I have videos of prototypes. I was living outdoors when I was making these, and they ended up only lasting for a few days each time I made them because if the temperature or humidity or something changes then the little boxes I made these in changed shape just enough to mess it up. I plan on making a better one soon, though, as now I live in a house and I can potentially make one large enough to put your head inside of the get the proper morphing effect. The problem with looking at it from outside of something is that when you move, something blocks your view, so there is an interruption in the morph. Keep that in mind as you look at this prototype. ua-cam.com/video/-stSuKmsee8/v-deo.html
@Splarkszter 4 місяці тому ⁺¹
BANGER.
I LEARNED A LOT. THANK YOU!!!
@xyznihall 4 місяці тому
Great video. Loved the explanation of diffusion models!
@brixiu5 4 місяці тому ⁺⁵
I'm a little bit annoyed that the thumbnail is so obviously edited. The duck on the left has part of the square erased to make it look like the tool was better than it actually was.
@seekyunbounded9273 4 місяці тому ⁺⁴
15:00 its about how its handled , if its handled by turning words into tokens it litterly cant see what the word is made of and will just rely on probability of what the input text thought it
@redandpigradioshows 4 місяці тому ⁺¹³
This video recaptures the fascination I had for AI before the investment bubble killed it, one day the bubble will pop and we'll be back to this kind of applications
@gaggix7095 4 місяці тому ⁺¹⁰
This stuff is like 1 year old. Nothing has changed, you can still use SD on your PC.
@LexanPanda 4 місяці тому
I hit the bell on your channel years ago and watch every video, but this one didn't show up in my notifications, nor was it recommended alongside other videos like your videos usually are to me. I'm glad this was a collab with Matt or I may have gone quite a while without seeing it.
@asdocneter 4 місяці тому ⁺¹
The one @14:37 (room/pig) that works at different zoom levels also works (understandably) when squinting.

Наступне

Автоматичне відтворення