Neural Network learns the Mandelbrot set [Part 1]
- Published Jun 1, 2024
- Every frame is rendered by passing each pixel's x, y coordinates through a neural network, which outputs a value in (0, 1) that is used as the pixel's value.
This network is trained to approximate the value of the Mandelbrot function evaluated at x, y, so the image improves as the network learns.
The color does not indicate anything about the approximation, I just think it looks cool.
See the code for more details:
github.com/MaxRobinsonTheGrea...
This is an overfit problem, or rather, there is no such thing as "over" fitting. You can only ever come short of an accurate fit.
Ultimately it would require larger networks, longer training times, ever diminishing learning rates, and larger samples of training data points to improve the approximation, but it is interesting to see just how well the network can fit the set with pragmatic limitations.
The final attempt took about 10 hours to train on my poor MX150, so I don't think I'll be scaling up anytime soon.
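For context, the training target is the escape-time value of the Mandelbrot iteration at each coordinate. A minimal sketch of such a target function (the `max_iter` cutoff and the normalization are my assumptions, not necessarily the author's exact code):

```python
def mandelbrot_value(x, y, max_iter=50):
    """Escape-time target in [0, 1]: iterate z -> z^2 + c and return the
    normalized iteration at which |z| exceeds 2; points that never
    escape within max_iter steps get 1.0 (treated as inside the set)."""
    c = complex(x, y)
    z = 0j
    for i in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return i / max_iter
    return 1.0
```

The network is then simply fit to map (x, y) to this value by regression.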
Timestamps:
(0:00): Intro
(0:21): First attempt
(0:51): Second attempt
(1:52): Third attempt
Music: / user-82623269
"Smooth Mandelbrot set isn't real, it can't hurt you"
screenshot moment
it is complex! (infinitely complex)
The most beautiful aspect to me is the flickering effect of the ill-formed Mandelbrot set; it resembles the distortion caused by convection currents of hot air...
it would be interesting to see ML try and "rediscover" the formula to draw it, rather than trying to imitate the likeness of the set
if the goal is just to overfit on the mandelbrot set, you can use an ELM which you can fit to it in a single step
set random weights and a random bias with shapes (2, nodes) and (nodes,), then create your X and y data (the inputs and outputs),
then compute a "beta" by doing this (Python-ish pseudocode):
```
beta = pinv(sigmoid(X @ weights + bias)) @ y
```
where pinv could be torch.linalg.pinv or numpy.linalg.pinv or the equivalent Moore-Penrose pseudoinverse in whatever language/library,
and sigmoid could be any implementation of the sigmoid function.
the @ symbol in python does matrix multiplication and works in Python 3.5+ (in numpy you can replace it with "X.dot(weights) + bias" if you want, and same for torch)
and X and y are the coordinates as a matrix (X shape is: (however many datapoints, 2)) (and y shape is: (however many datapoints,) if it's monochrome, or (however many datapoints, 3) for RGB)
then you can inference this model for new data like
```
def predict(x):
    return sigmoid(x @ weights + bias) @ beta
```
where x is your coordinates as a vector. Depending on the number of nodes it should be able to fit almost perfectly in a single step, and as always, the more data points the better. The only caveat is that you need tons of memory because you can't exactly batch it, though you can fit separate ELMs on separate batches and then average their predictions together as an ensemble.
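Putting the whole recipe above together, here is a runnable sketch in NumPy. The node count, data size, and the unit-circle target (a stand-in for Mandelbrot values so the example stays self-contained) are my choices, not part of the original comment:

```python
import numpy as np

rng = np.random.default_rng(0)
nodes = 512

# Random hidden layer: fixed at initialization, never trained
weights = rng.normal(size=(2, nodes))
bias = rng.normal(size=nodes)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: unit-circle membership as a stand-in for Mandelbrot values
X = rng.uniform(-2.0, 2.0, size=(5000, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 1.0).astype(float)

# Closed-form output weights via the Moore-Penrose pseudoinverse
beta = np.linalg.pinv(sigmoid(X @ weights + bias)) @ y

def predict(x):
    return sigmoid(x @ weights + bias) @ beta
```

The single `pinv` solve replaces the whole gradient-descent loop: only the output layer is learned, which is what makes the fit one-step.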
Interesting, elm = extreme learning machine? I'll look more into this when I revisit the project.
@EmergentGarden that's right! And I'm glad to hear it.
Trying to find the author's discussion of activation functions. Best I have ever seen.
That flickering, jiggling behavior actually looks neat.
It looks like it's burning, I like it.
Awesome! What kind of data are you feeding the NN? Is it images of Mandelbrot sets? Or simply vectors of numerical inputs to the Mandelbrot function?
the NN is queried on a set of inputs and the loss is the difference between its output and what the Mandelbrot function outputs
@randomsnow6510 that's not the question Ilias asked. Plus the loss is very likely to be something other than just the difference (that's not even a distance); maybe MSE?
It would be interesting to see if an RNN could learn the Mandelbrot iteration rule.
If there were a way of working that out formulaically, it would save a hell of a lot of processor cycles! Especially at deep zooms.
I think one improvement you could make would be how training points are selected. Say, give it 100 points at 0 iterations, 100 points at 1 iteration, 100 points at 2 iterations, 100 points at 3 iterations ... 100 points at 2000 iterations, and 500000 points that are actually in the set, or something like that.
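That stratified scheme could be prototyped with rejection sampling over the escape-iteration count. A hypothetical sketch (the bucket size, `max_iter`, and draw cap are placeholders, much smaller than the comment's numbers):

```python
import numpy as np

def escape_iters(x, y, max_iter=8):
    """Iteration at which z -> z^2 + c escapes |z| > 2, or max_iter if it never does."""
    c = complex(x, y)
    z = 0j
    for i in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return i
    return max_iter

def stratified_sample(per_bucket=5, max_iter=8, max_draws=200_000, seed=0):
    """Collect up to per_bucket points for each escape-iteration bucket 0..max_iter."""
    rng = np.random.default_rng(seed)
    buckets = {k: [] for k in range(max_iter + 1)}
    for _ in range(max_draws):
        x, y = rng.uniform(-2.0, 2.0, size=2)
        k = escape_iters(x, y, max_iter)
        if len(buckets[k]) < per_bucket:
            buckets[k].append((x, y))
        if all(len(v) >= per_bucket for v in buckets.values()):
            break
    return buckets
```

One caveat: deep-iteration buckets get geometrically rarer, so a real implementation would need something smarter than uniform rejection sampling near the boundary.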
The interesting thing about the mandelbrot set, as with many other complex systems, is that they can be described with a very simple expression. As far as I can tell, the neural network here is not learning the mandelbrot set any more than I am if I were to tediously memorize a subset of it one number at a time (with, of course, a lot of rounding-error). Indeed, the neural network is *incredibly* inefficient at this task - 20 layers, 400 neurons each and we get a fuzzy visual copy of something which can be expressed with essentially three terms.
It reminds me of the recent findings that neural networks with 175 billion parameters were only able to perform 3-digit arithmetic with 80% accuracy. Despite their inherent overwhelming complexity, they are generally unable to identify even the simplest rules underlying a set of observations. In some ways, this limitation is quite an interesting (but kind of obvious) characteristic of their learning behaviour.
10 months and until now only 13 comments. Realizing I'm into some niche stuff.
1:56 At the same time, you don't need a heater in your room :D
what do you mean by "can attempt to learn"?
How do you visualise this? Thanks in advance.
It's an interesting approach, but I think a better approach would be a CNN combined with an LLM and/or GANs. So, the LLM "understands" the functional relationship of the Mandelbrot algorithm, and the CNN can create/"understand" the pictorial constructions of the Mandelbrot set. Maybe the GANs can create new Mandelbrot-like sets or algorithms.
Ahahaha, I'm actually trying to do the same thing.
Just 3 layers, 16 neurons each.
I get an interesting color gradient, but the shape doesn't really look like the Mandelbrot set.
I am going to code this on my TI-84 or die trying.
You could give the neural network access to formulas for creating the patterns, and with enough attempts it would have found the original calculation method z = z^2 + c. A formula that short isn't that complex if you think about it.
I wonder if someone could simulate dementia with AI? Like inputting music, removing neurons one by one, and then showing the final result.
Yes, you can't really remove neurons, but you can set their weights to zero to simulate a missing neuron. I wonder if it will at least try to adapt or if it will fuck it up permanently.
What an interesting question!
Look at Nexpo's 'The Disturbing Art of A.I.' video, in which he mentions an experiment testing just that, via slowly removing neurons from a Stable Diffusion network and looking at the resulting effect. The relevant part starts at around 15:50, and is quite short, but still interesting.
It’s basically what dropout does :)
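Zeroing weights to mimic a dead neuron, as suggested above, can be sketched in a few lines (a hypothetical helper; the layer shape is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)

# A single dense layer: 16 hidden neurons feeding 8 outputs.
# "Removing" hidden neuron i = zeroing row i of its outgoing weights.
W = rng.normal(size=(16, 8))

def ablate(W, dead_neurons):
    """Return a copy of W with the outgoing weights of dead_neurons zeroed."""
    W = W.copy()
    W[list(dead_neurons), :] = 0.0
    return W

W_damaged = ablate(W, {0, 3, 7})
```

The difference from dropout is that dropout zeros random neurons per training step (and rescales), whereas this ablation is permanent for the forward passes that follow.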
Congrats, you just overfit a huge network on a complex shape...
If you can't zoom in and make it render finer details, it's overfitting
i have a formula for you: z = z^2 + c
3:24 ITS, NOT IT'S. DO YOU HAVE THE WRONGEST GRAMMAR?