Hi Jeff, your channel is a hidden treasure!
Thanks! Working to "unhide" the treasure. 😁
This is what I wanted to do with Deep Dream....cheers!
Where can I find the video about how to generate images with GANs?
Hey Jeff, I have one question; I'm kind of a beginner with Python, so I don't know if this is a dumb question.
How am I supposed to install torch 1.8.1 or older in a Google Colab notebook? Since the notebook uses Python 3.10 by default, I cannot install torch versions older than 2.x.x. Am I supposed to create a virtual environment with an older Python version (e.g. 3.8.3)? Or is there an easier way to do this?
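One workaround (a sketch I haven't verified on the current Colab image; the deadsnakes PPA and version pins are assumptions) is to install an older Python side by side and pip-install the old torch into a venv:

```shell
# Run each line in a Colab cell with a leading "!".
# Get Python 3.8 from the deadsnakes PPA (not in the default repos).
add-apt-repository -y ppa:deadsnakes/ppa
apt-get update -qq
apt-get install -y python3.8 python3.8-venv python3.8-distutils
# Create a venv and install the old torch into it.
python3.8 -m venv /content/py38
/content/py38/bin/pip install torch==1.8.1
# Then launch training with the venv's interpreter, not the kernel's:
# /content/py38/bin/python train.py ...
```

Note the notebook kernel itself stays on Python 3.10; only scripts you launch through the venv's interpreter see the old torch.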
Can you train StyleGAN on labeled data (would that be called a “conditional” GAN?)? My idea is: if we already have a big dataset with characteristics we care about (hair color, male/female, etc., for the faces), can’t we train it so that each of those characteristics corresponds to certain latent vector elements? (Then we wouldn’t need to search for the right direction in the latent space to deform the picture a certain way.)
I know this was a while ago but did you figure out the answer to this? Would be helpful!
@@sebb1510 I only did a few experiments in Keras on the MNIST dataset and, yes, it is possible to use one or a few elements of the latent vector as a condition (e.g., the digit class can be represented by the value of z[0] and then manipulated the way you want). But the loss function needs to take that into account; basically it will consist of different loss terms with different weights. It's all fun to play with, though.
@@KlimovArtem1 thanks, I am just learning, so a lot of this goes over my head, but I will come back to this comment in the future.
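As a toy illustration of the idea in this thread (a sketch only; a real conditional StyleGAN uses a learned label embedding, and every name here is made up), the usual input trick is to concatenate a one-hot label onto the latent vector:

```python
import numpy as np

def conditional_latent(z, label, num_classes):
    """Append a one-hot class label to the latent vector z, so the
    generator can learn to associate those slots with the attribute."""
    onehot = np.zeros(num_classes)
    onehot[label] = 1.0
    return np.concatenate([z, onehot])

rng = np.random.default_rng(0)
z = rng.standard_normal(512)
v = conditional_latent(z, label=3, num_classes=10)  # shape (522,)
```

The discriminator is shown the same label, and the training loss then has to score realism per class, which is where the weighted loss terms mentioned above come in.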
Is there a command to generate images?
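If you're using NVlabs' stylegan2-ada-pytorch (an assumption), the repo ships a generate.py; a typical invocation looks like this, with the snapshot path being a placeholder:

```shell
python generate.py --outdir=out --trunc=0.7 --seeds=0-3 \
    --network=/content/drive/MyDrive/results/network-snapshot-000100.pkl
```

`--seeds` accepts a comma-separated list or a range, and `--trunc` trades diversity for image quality.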
Thanks, Jeff. Can you please share the dataset, or the utility you used to collect it?
Sure, I use this: github.com/jeffheaton/pyimgdata
Thanks for the video Jeff, I've learned a lot from your playlist/tutorials on ML and GANs!
Now I have a question. I've been working with StyleGAN2 on Google Colab, and recently Colab made an A100-SXM4-40GB GPU available. Theoretically that's good news, since it should be their fastest available GPU, only it seems to require a higher version of PyTorch, which in turn (after importing in Colab) doesn't seem compatible with the StyleGAN2 code. Is this a problem you or your students have encountered as well? And since I'm a programming rookie, is there a solution for it? Thank you in advance!
Thanks, Jeff, another great video as usual. One quick question: if one training can't finish in 12/24 hours in Colab, is there a way to continue train it in the next session? I hope you can have a video to talk about that. Thanks again.
I'd say probably by saving the actual state of the network training and resuming it afterwards using the Keras APIs.
@@sebastianbejarano350 Thanks for your comments. The point that confuses me is: after the session times out, all the files, including saved states/models, will be gone. We could save the states/intermediate models to Google Drive, but we'd have to save them periodically (every 10 minutes?). How can we achieve that? Using a Keras callback? Would you please point me to a little more detail? Thanks again!
@@freebooks4456 Yes, in fact you can upload files to your Google Drive during your Colab session; Keras has checkpoint callbacks where you can do that :) I don't have the docs link handy at the moment, but I'm pretty sure the feature exists (I've used it).
@@sebastianbejarano350 Thanks for the details. One thing: if we put the files in Google Drive, training will be very slow because of all the network activity when reading batches; most of the time I found it even slower than training on my local PC. The better way is probably to upload all the files to Colab and train there... Or we could just write the intermediate models to Google Drive?
Hi @jeff. Are you considering a tutorial like this one but using AWS? Maybe it would be useful for not getting kicked off after a few hours, hehe.
Hey, I appreciate the tutorial, but I am having one issue. In the "Perform Initial Training" section I am getting "ModuleNotFoundError: No module named 'click'", so I installed click via the terminal, but it still shows the same issue.
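A likely cause (an assumption here: the terminal's pip and the notebook kernel can point at different Python environments) is that click landed in the wrong environment; installing from inside a notebook cell usually fixes it:

```shell
pip install click   # in a Colab cell: !pip install click
```

After installing, restart the runtime so the kernel picks up the new package.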
Jeff, I've just got an A6000, any chance you can look at a comparison between training on a V100 and an A6000? I'm curious how they compare.
Can you please post a link to the utility you mentioned for collecting images? And great video!
Thank you for the video! May I ask how many images I should put in the dataset to get a decent result?
For a decent model, I'd say between 700 and 900 should be enough.
Thank you for the informative video Jeff! I am trying to run this notebook and having a problem with training, right after tick 0 I get these errors:
/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py:477: UserWarning: This DataLoader will create 3 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
cpuset_checked))
/usr/lib/python3.7/multiprocessing/semaphore_tracker.py:144: UserWarning: semaphore_tracker: There appear to be 34 leaked semaphores to clean up at shutdown
len(cache))
Am I doing something wrong or is there a way to fix this?
Just disregard that warning. The ticks are ticking along and a fakes.....png is generated every 10 ticks, so all should be good.
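If you'd rather make the warning go away than ignore it, I believe stylegan2-ada-pytorch's train.py accepts a DataLoader worker override (check `python train.py --help` to confirm the flag on your copy; the other arguments are placeholders):

```shell
python train.py --workers=2 --outdir=results --data=dataset.zip --gpus=1
```

Colab's free tier reports 2 CPUs, hence the suggested value of 2.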
4k gans sure! I want to see
Hi Jeff, I'm wondering, can I use a MacBook Pro and macOS to train StyleGAN?
You will need a CUDA device, so I do not believe you can.
I have a problem; I think some libraries need to be updated. It throws me this error:
RuntimeError: derivative for aten::grid_sampler_2d_backward is not implemented
Something has changed since you made the video; any solution to fix it?
Anyway, very nice job!
I was facing the same issue. The thing that helped: go to the file grid_sample_2d under torch_utils in your Google Drive copy of the repo (it's part of what gets cloned from git, so it's easy to miss). Then go over to NVIDIA's StyleGAN3 repo (yes, StyleGAN3, you read that right) and copy and paste their version of the file with the same name; it should look similar. This worked and resolved all the PyTorch/Python compatibility issues we faced. Hope this helps!
That file name doesn't come up, but I changed everything that included 'grid' and it's still not working. Do you still have the repository or the old grid_sample_2d.py file by any chance? :D @@GanufacturingSeniorDesign
@@GanufacturingSeniorDesign Hi, same issue, and trying this solution. But where do I find a hidden file? My Google Drive is as good as empty, just the training images I put in there myself.
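For anyone hunting for the file: in NVlabs' stylegan2-ada-pytorch the shim in question appears to be torch_utils/ops/grid_sample_gradfix.py (the name above is approximate), inside the cloned repo directory rather than a hidden Drive file. A sketch of replacing it, and its sibling conv2d_gradfix.py, with the StyleGAN3 versions (URLs assume the repo's default branch):

```shell
# Run from the root of the cloned stylegan2-ada-pytorch repo.
wget -O torch_utils/ops/grid_sample_gradfix.py \
    https://raw.githubusercontent.com/NVlabs/stylegan3/main/torch_utils/ops/grid_sample_gradfix.py
wget -O torch_utils/ops/conv2d_gradfix.py \
    https://raw.githubusercontent.com/NVlabs/stylegan3/main/torch_utils/ops/conv2d_gradfix.py
```

StyleGAN3's versions of these shims were updated for newer PyTorch releases, which is why the swap resolves the `grid_sampler_2d_backward` error.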
Unfortunately I get this error message: "RuntimeError: derivative for aten::grid_sampler_2d_backward is not implemented"
I was facing the same issue. The thing that helped: go to the file grid_sample_2d under torch_utils in your Google Drive copy of the repo (it's part of what gets cloned from git, so it's easy to miss). Then go over to NVIDIA's StyleGAN3 repo (yes, StyleGAN3, you read that right) and copy and paste their version of the file with the same name; it should look similar. This worked and resolved all the PyTorch/Python compatibility issues we faced. Hope this helps!
Did you manage to do this in the Colab? Or locally?
Thanks Jeff, I learn something new from each of your videos! How could we generate videos from this training data? I've only seen it work linked to the Nvidia training data. Cheers
You can. I have an example of generating video at this notebook: github.com/jeffheaton/pretrained-gan-70s-scifi
@@HeatonResearch Thanks Jeff!
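The usual recipe behind those videos (a sketch; the function name here is my own) is to interpolate between latent vectors and run each interpolated latent through the trained generator, one frame per step:

```python
import numpy as np

def lerp_latents(z0, z1, steps):
    """Evenly spaced linear interpolation between two latent vectors.
    Feeding each row through the generator yields one video frame."""
    ts = np.linspace(0.0, 1.0, steps)[:, None]
    return (1.0 - ts) * z0 + ts * z1

# e.g. 60 frames morphing between two random seeds:
frames = lerp_latents(np.random.RandomState(1).randn(512),
                      np.random.RandomState(2).randn(512), steps=60)
```

Chaining several such segments through a sequence of random latents, then encoding the rendered frames with ffmpeg, gives the familiar morphing videos.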
How many circuit board images did you train on? I am trying to do the same but with faces.
Hi, were you able to train with faces?
Thanks man
There is no option for background execution.
Checkpointing takes up to an hour? Why so long, what is it doing during that operation?
I've always thought that could be an area NVIDIA optimizes better in StyleGAN. Most of that time is the evaluation to give you the FID score, which is why one approach is to simply not evaluate and just visually inspect the fake images being generated.
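Concretely, with NVlabs' stylegan2-ada-pytorch (an assumption about which repo is in use) that "don't evaluate" approach is a launch flag; the other arguments here are placeholders:

```shell
python train.py --outdir=results --data=dataset.zip --gpus=1 --metrics=none
```

With metrics disabled, the per-tick cost drops to just training plus writing the fakes grid and snapshot.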
Is 10k images enough?
I've trained on less; make sure you are using augmentation. The more the better, though; 30K or so is nearer to optimal in my own experience.
@@HeatonResearch thank u so much
Bro thinks he's Al from Toy Story 💀
Hah, yes I can see it. I am totally going to have to do a video of all of my alleged doppelgängers some day. Tobias Funke has also been suggested, even more so before I grew a beard.
@Jeff Heaton I didn't know you would read the comments. I hope I didn't offend you.
Jai Shri Ram
Getting this error when trying to execute the "Perform Initial Training" code: cannot import name 'notf' from 'tensorboard.compat' (/usr/local/lib/python3.7/dist-packages/tensorboard/compat/__init__.py)
I have the same problem. This is the error message: Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/tensorboard/compat/__init__.py", line 42, in tf
from tensorboard.compat import notf # noqa: F401
ImportError: cannot import name 'notf' from 'tensorboard.compat' (/usr/local/lib/python3.7/dist-packages/tensorboard/compat/__init__.py)
I was successfully running Jeff's Colab notebook many times, then this error started showing up about 5 days ago or so.
@arkoudos, I found a solution to my problem (which sounds similar to yours) in another set of comments. The solution was suggested by @Devdabomber; I'm just pasting it here. You need to manually uninstall tensorboard before you run the training:
!pip uninstall -y tensorboard
(The -y skips the confirmation prompt, which would otherwise leave the notebook cell waiting for input.)
@@oz1178 thanks mate i will do that