Sam Sartor
Joined 16 Aug 2015
AI Generated Art (Journal Club Presentation)
This is a presentation I gave on October 6th 2022 at the William & Mary Graduate Student Association's weekly Journal Club.
I talk about the past, present, and future of computers spitting out pretty pictures. Specifically, how computers generate art and the extent to which this is an extension of human creativity. This includes recent AI projects like DALL-E and Stable Diffusion, but also more traditionally generated art: fractals, Minecraft worlds, and the "wave function collapse" algorithm.
Demo Links:
- There are countless ways to run Stable Diffusion for free. For this presentation I set up github.com/AUTOMATIC1111/stable-diffusion-webui.
- lexica.art is an archive of previously made AI art and the corresponding prompts.
- dreamfusion3d.github.io uses Stable Diffusion to make 3D art.
- oskarstalberg.com/Townscaper is an online demo of the Townscaper game.
Views: 626
Videos
Why neural networks aren't neural networks
99K views, 3 years ago
There is a better way to understand how AIs sort data, process images, and make decisions! Made for the 2021 Summer of Math Exposition: www.3blue1brown.com/blog/some1 Source code available here: gitlab.com/samsartor/nn_vis The background music is an excerpt of the endless ambient generative music system "At Sunrise," available at generative.fm/music/alex-bainter-at-sunrise
The youtube algorithm blessed me today what a video!
I am learning new things each day... thanks, man, for sharing this perspective. I never thought of transforming the data; I was used to rotating the lines! 👍
Hi there, this idea of looking at what the deep learning model is doing internally is really intuitive. I wanted to know if there is any underlying connection between transformations and fine-tuning the weights and biases with backpropagation.
In the sense that fine-tuning/training is how you work out what the transformations (linear weights + biases) should be, yes! But I pretty much ignored training with backprop in this video because there are already really good explanations of it on YouTube, and I just wanted to explain what the model itself actually consists of.
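To make that connection concrete, here is a minimal sketch (my own toy example in numpy, not the code behind the video) of backprop nudging the two linear transforms of a tiny 2 → 3 → 1 tanh network on a made-up dataset:

```python
# Toy example: gradient descent adjusts the linear transforms that the
# video visualizes. 2 inputs -> 3 hidden tanh units -> 1 sigmoid output.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                          # 200 points in 2D
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1).astype(float)    # label: outside the unit circle

W1, b1 = rng.normal(size=(2, 3)) * 0.5, np.zeros(3)    # first linear transform
W2, b2 = rng.normal(size=(3, 1)) * 0.5, np.zeros(1)    # second linear transform
lr = 0.5

for step in range(2000):
    h = np.tanh(X @ W1 + b1)                            # transform, then nonlinearity
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))                # sigmoid output probability

    # Backprop: work out how each weight and bias should change
    dp = (p.squeeze() - y)[:, None] / len(X)            # cross-entropy gradient at the logits
    dW2, db2 = h.T @ dp, dp.sum(0)
    dh = dp @ W2.T * (1 - h ** 2)                       # tanh derivative
    dW1, db1 = X.T @ dh, dh.sum(0)

    W1 -= lr * dW1
    b1 -= lr * db1
    W2 -= lr * dW2
    b2 -= lr * db2

print("accuracy:", ((p.squeeze() > 0.5) == y).mean())
```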
AMAZING! The animations and explanation are perfect! Thank you for your hard work
Please keep sharing your knowledge with us, you are just awesome, and we love your content.🙏
Amazing ! Please keep it up, we have so much yet to learn.
brilliant!!!
Even in the face of the almost insurmountable and fascinating amount of "ummmmmmm" during the speech... this is an incredible breakdown. Fantastic job! Despite the internet's data being hijacked for commercial purposes (images/personal/location), it seems like, similar to old high school typing classes, the ramifications of what you're becoming a part of just from posting something on Reddit or Facebook, or getting your first smartphone (despite how cool it is), need to be explained to users in almost a classroom setting, because how their data is being extrapolated is becoming so overwhelming to your average user. I always think about my mom saying she took a typing class in college and her family asking why she would need to learn how to type; the same question may arise from asking why you would need to know how AI works. Granted it's a different time, but relatable. I feel like 98% of people who would randomly watch this video would be completely blown away, or at least ignorant of how this technology is evolving. As an artist I'm still adding new plugins or going to certain apps for certain functionality (for my real job, not just making dumbass cartoons on YouTube), but I still feel like I'm lagging behind. Additionally, I hope there's a way to explain that the technology wielded at 39:00 is comfortably used by certain groups of people based on a morally loose justification, and that morally loose justification is leading to more and more questionable places. And generally, like your explanation of denoising a photograph or a render engine, it's almost impossible to escape, and I say that using voice recognition because I'm not physically able to type this comment.
This is seriously the best NN video I've seen, and yes that's after watching 3B1B's series
Great video (and I am only ~3 min into it)! Ty
From this perspective an ANN resembles an SVM with a learned kernel
Just awesome!!! I have been thinking about neural networks this way, and about how the key word has stuck to this statistical process. You illustrated it beautifully, just amazing!!
wow, you are so talented, man!
It's all just Math!
I can't believe my highschool math is this important
I thought the same thing as well! I love ML and statistics!! :)
clean
Hello, what is shown at 6:55 in the 3D plot? Is it what's in one node of the hidden layer, or all 3 nodes? Because one node doesn't make sense to me: we activate these 3 nodes independently and sum them afterwards, which is shown at 7:03. What does it mean at 6:54 that "the three new numbers are nonlinearly altered"?
Because the middle layer has 3 nodes, the plot is 3 dimensional (to visualize the 3 x,y,z values for every sample). When the network has 2 nodes, all the points are on a plane. And when it has one node, all the points are on a line. I did mess with the basis vectors a bit to improve clarity. The nonlinearity in this example is the tanh function, which is applied to each value in-between the linear transforms.
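For anyone who wants to poke at that themselves, here is a tiny numpy sketch (my own illustration with made-up weights, not the video's source code) of a 3-node middle layer mapping each 2D sample to a 3D point, linear transform first and then tanh:

```python
# A 3-node middle layer means every 2D input point becomes a 3D point:
# one linear transform (2x3 matrix plus bias), then tanh on each value.
import numpy as np

rng = np.random.default_rng(1)
points_2d = rng.normal(size=(500, 2))          # the input samples

W = np.array([[ 0.8, -0.3,  1.1],              # 2x3 matrix: one column per hidden node
              [ 0.2,  0.9, -0.5]])
b = np.array([0.1, -0.2, 0.0])

linear_3d = points_2d @ W + b                  # "three new numbers" per sample
hidden_3d = np.tanh(linear_3d)                 # nonlinearly altered, value by value

print(hidden_3d.shape)                         # (500, 3): one 3D point per sample
```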
brilliant video
absolutely beautiful visualizations
Does one transformation map to one layer in the NN? I can see that if the input is in 2D, transforming it into a 3D representation corresponds to a layer that takes two inputs and outputs three. But if several transformations are from 2D to 2D, can they be encapsulated into one layer, or do you have to do them one after another (i.e. in different layers)?
That's a really good question! Sadly, it doesn't have a very clear answer. Different people draw the borders between layers in different places. Traditionally, the nonlinearities inside a neural network are basic functions, and are exactly the same for every layer. For example, the linear transformations in this video are all completely different, but each is punctuated by a tanh function. If you have 10 2D linear transformations, each followed by tanh, that would count as 10 layers. Sadly it isn't usually that simple. Although a matrix multiply followed by tanh makes perfect sense when transforming data through 2, 3, or even 100 dimensions, a modern image model might transform the data across a hundred million dimensions at a time. That requires more specialized kinds of linear transformations, sometimes grouped together in odd ways. For example, I'm currently working with a network that repeatedly transforms the data through a depthwise 7x7 linear layer, a normalization layer, a 1x1 linear layer, a GELU nonlinear function, another 1x1 linear layer, and a final skip connection. You could argue that those all amount to just one very convoluted linear transform punctuated with the single GELU nonlinearity. But since the individual parts are important for performance, reliability, and accuracy, they are also called "layers" and the whole combination is more often called a "block". A ConvNeXt block to be specific. 6 layers per ConvNeXt block, 12 ConvNeXt blocks per ConvNeXt backbone, 2 backbones in the actual U-Net, and some miscellaneous stuff on the side.
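For readers who'd like that block spelled out, here is a rough PyTorch sketch of a ConvNeXt-style block following the sequence described above. The channel count and the 4x expansion in the 1x1 layers are my own assumptions for illustration, not details from the comment:

```python
import torch
import torch.nn as nn

class ConvNeXtishBlock(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # depthwise 7x7 "linear layer": each channel is filtered independently
        self.dwconv = nn.Conv2d(dim, dim, kernel_size=7, padding=3, groups=dim)
        self.norm = nn.LayerNorm(dim)            # normalization layer
        self.pw1 = nn.Linear(dim, 4 * dim)       # 1x1 linear (pointwise)
        self.act = nn.GELU()                     # the single nonlinearity
        self.pw2 = nn.Linear(4 * dim, dim)       # another 1x1 linear

    def forward(self, x):                        # x: (batch, dim, H, W)
        skip = x
        x = self.dwconv(x)
        x = x.permute(0, 2, 3, 1)                # channels-last for LayerNorm/Linear
        x = self.pw2(self.act(self.pw1(self.norm(x))))
        x = x.permute(0, 3, 1, 2)                # back to channels-first
        return x + skip                          # final skip connection

block = ConvNeXtishBlock(dim=96)
print(block(torch.randn(1, 96, 32, 32)).shape)   # torch.Size([1, 96, 32, 32])
```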
Amazing video! Thank you.
This is top notch! Hope you make more educational videos
But the process is similar and we do get something similar to what our brains do, not coincidentally.
came from your other video. Thanks for posting this
Why isn't it equivalent? What do neurons do differently?
I loved the video! Just one question though: why not simply use a GLM instead of doing all that complicated data transformation?
learned a lot. great presentation Sam.
help to explain the success of the deep Neural Nets.
A different look at the problem that is refreshing.
Thanks for the video. What I would add: you "reduce" the neural network to "simple statistics", and you explain very trivial examples of decision making (very nice) that should show us that AI is not real intelligence. But: neural networks are a "complete basis" for any function between two vector spaces, so any information processing that can be done can be done with a neural network. At 8:23 you show what artificial networks ignore, but none of those things adds a new way of processing information. The reason is that NNs are complete; we cannot "add" anything. The points at 8:23 could make a NN that simulates the brain's information processing more complex, but not impossible in itself! To me, the blueberry-strawberry example shows how a very low level of intelligence works, say an IQ of 0.1, but it's not "proof" that AI isn't a form of intelligence. And you can see it differently: take away from a human being all of its ability to make "simple" decisions and process information in ways that a function of any kind could. What would remain would be pure "human intelligence", but what is that? Even a quantum computer adds nothing to what can be computed with a universal Turing machine; it is only faster!
You made the world a better place with this video!
Like n. 7001! ;) Brilliant! The best visual explanation of NNs I have encountered so far. The visuals are extremely helpful in getting the gist of what a feed-forward neural network does. It's important to point out - and this would have spiced up the ending of the video too ;) - that there are other types of neural networks that are more similar to how the brain works. Hebbian learning and recursion are involved in these other types of neural networks, for which a simplification in the terms used in the video would not be quite so straightforward. It would actually be great to see a follow-up video on these kinds of NN!
They are also extreme simplifications of the processes that occur in a brain, and fundamentally they end up being used as if they were the current "artificial neural networks" when they depend on statistical methods such as backpropagation.
Thank you! I didn't have the vocabulary to say this when trying to explain this to people...
nobody has claimed they would be. just brain inspired. which is quite amazing
Wonderful! Thanks so much!
We've been lied to! But how is AI even considered an intelligence!?
Currently AIs are "narrow" intelligences, meaning they make specific sorts of predetermined decisions. Humans are "general" intelligences. I think we'll have general AIs in the next decades, but that is hotly debated. Edit: also "lied to" is going a bit far lol. I just think there are better metaphors.
@samsartorial Hmm, agreed, but I think we should also research AIs beyond current networks, with whatever works best and is more efficient.
No, translation isn't a linear transformation. We call a function a linear transformation when it preserves T(xa + yb) = xT(a) + yT(b).
That's true! The technical term is an affine transformation. As I've mentioned before, I wanted to avoid overloading the viewer with terminology, since neural networks can always replicate affine transformations given an extra linear dimension. We tend to include it anyway just because it makes optimization easier.
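A quick numpy sketch of that trick (my own illustration, not from the video): an affine map, here a rotation plus a translation, becomes a purely linear map once each point carries a constant 1 as an extra coordinate:

```python
import numpy as np

theta = np.pi / 6
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
t = np.array([2.0, -1.0])                      # the translation (the bias)

p = np.array([1.0, 1.0])
affine = R @ p + t                             # affine: linear part plus a shift

# The same map as one 3x3 linear transform acting on (x, y, 1)
A = np.eye(3)
A[:2, :2] = R
A[:2, 2] = t
homogeneous = A @ np.append(p, 1.0)

print(affine, homogeneous[:2])                 # identical results
```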
4:20 why can't you just use a parabolic function?
Thank you for asking this question! I learned a lot while trying to answer it. You could absolutely fit a single parabola to that particular example. But the goal is to find a process that can draw a decision boundary through any data set, no matter how complicated. We call any such process a "universal approximator". I thought that almost any sufficiently long sequence of linear and nonlinear transformations was capable of universal approximation, and that parabolas weren't used as nonlinear transformations for some practical reason. They extrapolate badly? But looking into it further, Allan Pinkus proved in 1999 that polynomial transformations aren't sufficient. Our nonlinear function has to be exponential or trigonometric, or piecewise, or something else similarly weird. Sigmoid functions like tanh were the first to be proven universal, and I used them in this video because they looked nice, but these days everyone uses a piecewise function called ReLU.
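As a small numerical illustration of that idea (my own toy example with arbitrary settings), a single layer of tanh units combined by a linear readout can already approximate a fairly wiggly 1D function:

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(-2, 2, 400)
target = np.abs(np.sin(3 * x))                 # an arbitrary bumpy target function

# 50 tanh "neurons" with random linear transforms in front of them
w = rng.normal(scale=3.0, size=50)
b = rng.normal(scale=3.0, size=50)
features = np.tanh(np.outer(x, w) + b)         # shape (400, 50)

# Fit only the final linear readout by least squares
coef, *_ = np.linalg.lstsq(features, target, rcond=None)
approx = features @ coef

print("worst-case error of the fit:", np.max(np.abs(approx - target)))
```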
I find it intellectually dishonest to draw two circles, call them neurons, then draw a line between them and call it a neural link network, and artificial intelligence. So far the majority use of this so-called artificial intelligence is to steal our humanity, to sort us like berries to take advantage, and control us. To call this computer process intelligence is an insult to our real intelligence. Am I to ooh-and-ahh over this emerging evil?
hope your PhD is going well!!
What a *fantastic* video! I've watched a large number of videos on artificial neural networks over the past few years... yet I learned such a lot from this one! Such a (shockingly) clean perspective on how these systems work. The choice of the examples and the clarity of the writing and animation are just superb. If you didn't win, it's a travesty.
There's a chance that after I watch this video I will think that neural networks are more like neural networks than I initially did.
Thanks for debunking the "Neural Network".
This video blew my goddamn mind. Please make more 😃
Awesome, thank u for that masterpiece.
Also, the people in these comments are definitely lying; this would literally be a small part of the first day of an ML course, and it's just at par level at best. They never took an ML course.
You make a big statement and then give very little to back it up. You gave a great explanation and argument for why current neural networks could model anything that a brain could produce, but then your takeaway is that because it doesn't use the chemical interactions you just argued were unnecessary, it can't learn. Looks like you understand the concept but not fully. What about these chemical and dendritic processes do you believe would cause behavior that is impossible to model with neural networks?
I never said that behavior in the brain is impossible to model with neural networks. The basic theory of NNs tells us that any dendritic process is modelable by a deep enough network. The thesis of this video was "the biological metaphor confuses the idea of neural networks as iterated linear and nonlinear transformations derived through a statistical process". If neural networks are fundamentally emulations of nature, then what is the point of multi-head attention? What about the SPADE block used in img2img tasks? What the heck is it emulating? Neural networks absolutely _could_ emulate nature, but they don't _have to_ emulate nature. When we finally create a model that can truly learn from its experiences, it doesn't have to do it in the same way nature does.
I can make an even bigger claim: it is not possible to simulate even a single neuron on a digital computer. Neurons are physical, real beings; they change their shapes and create magnetic fields, and there are good theories out there that even state that they interact directly with the quantum nature of our universe. So good luck modeling that.
Very interesting, viewing the content from another direction.
I don't know why the Avocado Armchair made me laugh so much, I just did not expect it I guess