Windows 10, Training a GAN on your Own Images: StyleGAN2 ADA PyTorch

  • Published 3 Nov 2024

COMMENTS • 91

  • @oxfordsculler8013
    @oxfordsculler8013 3 years ago +3

    Thanks Jeff. Definitely a fan of PyTorch, it seems to be the way things are going. Appreciate that you do this on Windows.

  • @sam---sam
    @sam---sam 2 years ago +1

    Cheers for this, as someone with zero machine learning experience this resource is great for possibly getting the ball rolling

  • @IainTait
    @IainTait 3 years ago +5

    Thanks Jeff, that was incredibly clear and EXACTLY what I needed. Would love to see more on the latent vector space for sure! Especially outputting to image sequences / video :-)

  • @Silpheedx
    @Silpheedx 3 years ago +6

    Thank you so much for this. If you happen to have some follow up videos on how to control the vectors that control the output that would be amazing

  • @blesfemy
    @blesfemy 3 years ago +1

    Just found your channel. Falling in love with all the videos. Thank you for a thorough explanation! You've got a sub from me.

    • @HeatonResearch
      @HeatonResearch  3 years ago

      Welcome aboard! Thank you for subscribing!

  • @hoaxuan7074
    @hoaxuan7074 3 years ago +1

    If you want the value 1 out of a dot product you have some choices. You can set one element of the input vector to 1 and the corresponding element of the weight vector to 1. Adding an equal amount of noise to all the input elements then gives 1 unit of noise at the dot product output. Or you can make all the input elements 1 and all the weight elements 1/d (d = dimensions) to get 1 out. In the first case a lot of the noise is simply cut out; the second case is even better, because the noise is averaged out, as the variance equation for linear combinations of random variables tells you. In both cases the angle between the input and weight vectors is zero. As you increase that angle toward 90 degrees, the length of the weight vector has to increase for a constant input vector to keep getting 1 out, and the output gets very noisy.
    If you want to create an associative memory, that's useful information; even if you don't, it helps you understand dot products a bit better.
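
    A quick numerical sketch of that averaging effect (my own illustration, assuming NumPy; the dimension and noise level are arbitrary):

    import numpy as np

    d = 256
    rng = np.random.default_rng(0)
    x = np.ones(d)                                   # input vector of all 1s

    w_single = np.zeros(d)
    w_single[0] = 1.0                                # case 1: a single matching element
    w_average = np.full(d, 1.0 / d)                  # case 2: every weight is 1/d

    noise = rng.normal(0.0, 0.1, size=(10_000, d))   # equal noise on every input element
    out_single = (x + noise) @ w_single              # only one element's noise gets through
    out_average = (x + noise) @ w_average            # noise is averaged; std shrinks by sqrt(d)

    print(out_single.mean(), out_single.std())       # ~1.0, ~0.1
    print(out_average.mean(), out_average.std())     # ~1.0, ~0.1 / 16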

  • @thejakinator
    @thejakinator 3 years ago

    Excellent video! Thank you so much, Jeff.

  • @tolgakarahan
    @tolgakarahan 3 years ago +1

    Thank you for sharing your knowledge. Keep doing good work.

  • @davidstamatis1142
    @davidstamatis1142 3 years ago

    Jeff you are GANLORD -- THANK YOU

  • @tpocomp
    @tpocomp 3 years ago

    Thank You Jeff! Very much appreciated

  • @jimbog6376
    @jimbog6376 3 years ago +2

    This was super useful! Now I don't have to deal with copyright in my projects; I can just make my own images!

  • @Doincus
    @Doincus 3 years ago +6

    How would you generate a latent walk animation using the method at 29:39 and 30:36? Jeff brings it up, but I don't know how you would input it to generate a smooth interpolation between images.
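
    For anyone else stuck here, below is a rough sketch of one way to do a linear walk between two seeds. It is not from the video: it assumes you run it from the stylegan2-ada-pytorch folder (so dnnlib and legacy import), and the .pkl filename is a placeholder for your own snapshot.

    import numpy as np
    import torch
    import PIL.Image
    import dnnlib
    import legacy  # ships with the stylegan2-ada-pytorch repo

    device = torch.device('cuda')
    with dnnlib.util.open_url('network-snapshot-000800.pkl') as f:    # placeholder pickle
        G = legacy.load_network_pkl(f)['G_ema'].to(device)

    z0 = torch.from_numpy(np.random.RandomState(0).randn(1, G.z_dim)).to(device)
    z1 = torch.from_numpy(np.random.RandomState(1).randn(1, G.z_dim)).to(device)
    label = torch.zeros([1, G.c_dim], device=device)                  # unconditional model

    with torch.no_grad():
        for i, t in enumerate(np.linspace(0.0, 1.0, 60)):             # 60 in-between frames
            z = (1.0 - float(t)) * z0 + float(t) * z1                 # simple linear blend
            img = G(z, label, truncation_psi=0.7, noise_mode='const')
            img = (img.permute(0, 2, 3, 1) * 127.5 + 128).clamp(0, 255).to(torch.uint8)
            PIL.Image.fromarray(img[0].cpu().numpy(), 'RGB').save(f'frame{i:03d}.png')

    Stitch the frames into a video afterwards (e.g. with ffmpeg); interpolating in W instead of Z generally looks smoother.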

  • @SubhankarChoudhury
    @SubhankarChoudhury 3 years ago +2

    I wish I had a professor like you in College. Learning would have been so much fun. Really love your content. And one more thing, is that a 3D printer running somewhere in your room? Feels like I can hear the stepper motors :D

  • @dodi453
    @dodi453 2 years ago +1

    If you're having issues during image conversion, try adding the columns "bit depth" and "bit rate" to the detailed list view of the images in their folder to see if there are any that don't match the bulk of the other images.
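
    If you prefer to script that check, a small Pillow sketch along these lines can flag the odd images out (the folder path is just a placeholder):

    from collections import Counter
    from pathlib import Path
    from PIL import Image

    folder = Path(r"C:\data\my_images")            # placeholder dataset folder
    modes = Counter()
    for path in sorted(folder.iterdir()):
        try:
            with Image.open(path) as im:
                modes[im.mode] += 1
                if im.mode not in ("RGB", "L"):    # dataset_tool.py wants RGB or grayscale
                    print("check:", path.name, im.mode)
        except OSError:
            print("not an image:", path.name)
    print(modes)                                   # e.g. Counter({'RGB': 6400, 'RGBA': 3})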

  • @kaegancasey5291
    @kaegancasey5291 2 years ago

    You are a god.

  • @sepeslurdes1918
    @sepeslurdes1918 3 years ago +1

    Hi Jeff
    Thanks for all your videos! This was my first training. I am getting these FID values:
    315.36 48.98 24.11 17.39 15.57 15.67 15.79
    So I think my training has converged, and I stopped the training. Is it worth waiting longer?
    The quality of the generated images is surprisingly good from the very first pickle. The dataset was very curated, about 6.5k images, well centered, pretty homogeneous, downscaled to 256x256.

  • @Cauthon007
    @Cauthon007 3 years ago +1

    Is there a last command that's missing at 31:26 to actually resume the training?
    EDIT: For those curious, the Windows command would be:
    python train.py --data [YourDir] --outdir [Your output Dir] --resume="C:\[Your File Path mine is in C]\00001-Ganoutput-auto1\network-snapshot-000800.pkl"

  • @pagsful
    @pagsful 3 years ago +2

    Thank you for this great video. I've trained my GAN for 8 days with my 3060 and I'm very happy with the results.
    I want to make an animation of the evolution of the results; how can I do that?
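
    One simple option (my own sketch, not from the video): train.py keeps writing fakesXXXXXX.png preview grids into the results folder, and you can stitch those into a video. This assumes imageio and imageio-ffmpeg are installed, and the run-folder path is a placeholder:

    import glob
    import imageio  # pip install imageio imageio-ffmpeg

    run_dir = r"C:\gan\results\00000-mydata-auto1"          # placeholder run folder
    frames = sorted(glob.glob(run_dir + r"\fakes0*.png"))   # skips fakes_init.png

    with imageio.get_writer("training_evolution.mp4", fps=8) as writer:
        for path in frames:
            writer.append_data(imageio.imread(path))        # one snapshot grid per frame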

  • @elderizm
    @elderizm 3 years ago +1

    Thanks Jeff. It takes hours for only three images. This shouldn't be that long. What am I doing wrong?

  • @davidstamatis1142
    @davidstamatis1142 3 years ago +1

    Hi Jeff, I watched your chat on controlling features with latent vectors on Google Colab with faces. Can you walk through the method of controlling and analyzing features via latent vector adjustments in this Windows context with our own PKL files? Thanks!!

  • @bigdreams5554
    @bigdreams5554 3 years ago

    Windows? Friends don't let friends do ML in Windows, lol. Great video, appreciate the walkthrough. Now I can play some video games while StyleGAN is running in the background ;)

    • @HeatonResearch
      @HeatonResearch  3 years ago +4

      Hah, I am there, but lots of my subscribers do this in Windows.

  • @rahimnealyakoob5968
    @rahimnealyakoob5968 3 years ago +2

    Need a WSL2 video on how to train GANs, Jeff!

  • @matthewoldach6387
    @matthewoldach6387 3 years ago +1

    Great video, but it would be nice to mention that you need a graphics card with a minimum of 12GB (my NVIDIA RTX 3070 only has 8GB!). I'm wondering if it's possible to add another GPU and a "bridge" to run AMD's Crossfire? Would that effectively give you 16GB of VRAM? Probably just cheaper to spin up an AWS instance FWIW lol 🤷

  • @amineleking9898
    @amineleking9898 3 years ago +3

    Hi professor, many thanks for the content; I am learning so much from you.
    A question: how do you handle processing very large datasets without running into a memory problem?
    I tried to look for a video on the subject on your channel but couldn’t find any.
    Many thanks

  • @JordanService
    @JordanService 3 years ago

    This is a great video -- thanks so much. Will this run on the RTX 2000 series? I am using a 2060 Super. I am excited about this because I like PyTorch, and install issues have been a huge problem for me in the past when setting up ML projects.

  • @lethalcompclips
    @lethalcompclips 3 years ago +1

    When I attempt to train my images I get this error hundreds of times
    No module named 'upfirdn2d_plugin'
    warnings.warn('Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:

    ' + str(sys.exc_info()[1]))
    Setting up PyTorch plugin "upfirdn2d_plugin"... Failed!
    C:\Users\isaka\repos\stylegan2-ada-pytorch\torch_utils\ops\upfirdn2d.py:34: UserWarning: Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:
    Followed by a memory error like this
    RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 4.00 GiB total capacity; 2.61 GiB already allocated; 0 bytes free; 2.68 GiB reserved in total by PyTorch)
    Any suggestions?

    • @CResearch286
      @CResearch286 6 months ago

      I know this was 3 years ago, so for anyone who happens to come across this: it means your GPU is out of memory. Lower the batch size with --batch, or try to use lower-resolution images, e.g. --cfg paper256 for 256x256.
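
      For example, something along these lines (paths and values are placeholders; --batch, --cfg and --snap are existing train.py options, and paper256 expects a 256x256 dataset):

      python train.py --outdir=C:\gan\results --data=C:\gan\dataset.zip --cfg=paper256 --batch=8 --snap=10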

  • @chinokyou
    @chinokyou 2 years ago

    Hi Jeff, it's a great video. But it looks like CUDA only supports Visual Studio 2019 16.x and 2017 15.x.

  • @TheZaer79
    @TheZaer79 2 years ago

    Hi Mr. Heaton! Thanks for this tutorial! I followed your instructions step by step, but an error message appears when I want to generate: "image dimensions after scale and crop are required to be square". How can I resolve this problem? Thanks in advance for your reply!

  • @princeofexcess
    @princeofexcess 3 years ago

    So this network has random seeds for the generator. If I want to train the network to do style transfer from one style to another, I cannot use StyleGAN2 ADA? I believe I remember using StyleGAN for this task before they removed it from StyleGAN2 ADA.

  • @arrows78910
    @arrows78910 3 years ago +1

    I have a GTX 1070; training works but is incredibly slow in comparison to what you guys run on. Why are newer cards so hard to buy!!

  • @Mesenqe
    @Mesenqe 3 years ago

    Thank you for this tutorial. I have one issue, but I don't know if I can ask here.
    Training for 25000 kimg... it stops after one tick. Any ideas?
    "
    tick 0 kimg 0.0 time 1m 10s sec/tick 9.6 sec/kimg 2403.47 maintenance 60.1 cpumem 4.02 gpumem 3.54 augment 0.000
    Evaluating metrics...
    "

  • @kitarolivier
    @kitarolivier 3 years ago +1

    Thank you for your help installing on Windows :)
    Did you notice that projected faces with PyTorch (for me on Windows 10, RTX 3070) are less accurate than on TensorFlow (Ubuntu, GTX1050i)? For example, the eyes are not the right color (blue in the original & TensorFlow picture, brown in PyTorch), using the same FFHQ.pkl and vgg16.pt

    • @sharjeelkhalidchaudhry3314
      @sharjeelkhalidchaudhry3314 3 years ago

      I am trying this on an RTX 2070 8GB, 32 GB RAM, on just 500 images, and getting this error; please help :)
      python.exe .\dataset_tool.py --source C:\Users\skcha\tempp\DT --dest C:\Users\skcha\tempp\DT_TEMP
      0%| | 0/554 [00:00

    • @sepeslurdes1918
      @sepeslurdes1918 3 years ago +1

      @@sharjeelkhalidchaudhry3314 Your images are required to be square with power-of-two dimensions: 32x32, 64x64, 128x128, 256x256, 512x512, and so on. You need to resize your images to one of those sizes.
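
      If you have a lot of images, a small Pillow sketch like this can batch-resize them (folder paths are placeholders; note that a plain resize stretches non-square images, so crop first if that matters):

      from pathlib import Path
      from PIL import Image

      src = Path(r"C:\gan\raw_images")                     # placeholder folders
      dst = Path(r"C:\gan\resized_512")
      dst.mkdir(parents=True, exist_ok=True)

      for path in sorted(src.iterdir()):
          with Image.open(path) as im:
              im = im.convert("RGB")                       # drop alpha / palette modes
              im = im.resize((512, 512), Image.LANCZOS)    # square, power-of-two size
              im.save(dst / (path.stem + ".png"))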

  • @aladdinabudula8193
    @aladdinabudula8193 2 years ago +1

    Hi Jeff. Thanks for the walkthrough. However, even though I have followed your steps to the letter, I still can't get the same result as yours. I get this error message every time: "RuntimeError: derivative for aten::grid_sampler_2d_backward is not implemented". In the end, I don't have any pkl model file. Please help. Thanks.

  • @Silpheedx
    @Silpheedx 3 years ago +1

    When I try to kick off the training I get this error: " warnings.warn('Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:

    ' + traceback.format_exc())"
    Is that because I installed cuda_11.4.1? Or any ideas why I'm getting that error?

    • @iluleet
      @iluleet 3 years ago

      I faced the same problem here as well
      The " upfirdn2d. Falling" is really frustrating.

    • @waifuct
      @waifuct 2 years ago

      @@iluleet Late answer, but to help other travelers - The fix for this is to install ninja - "pip install ninja"

  • @jwc8963
    @jwc8963 3 years ago

    I am running on Win 10, RTX 3080, CUDA 11.1, with a 200-image 512×512 dataset, but it still hits OOM🤦
    Tried to allocate 66.00 MiB (GPU 0; 16.00 GiB total capacity; 13.20 GiB already allocated; 49.00 MiB free; 14.40 GiB allowed; 13.60 GiB reserved in total by PyTorch)
    I have tried several methods suggested by other programmers but it's still not working. How should I modify it?

  • @bishwasapkota9621
    @bishwasapkota9621 3 years ago

    Thanks Jeff for this video. I am trying to generate images using my trained model, but I am getting an error such as: generate.py: error: unrecognized arguments: --outdir=.generated_images --trunc=0.7 --seeds. Any idea?

  • @giocia1988
    @giocia1988 3 years ago

    Thank you so much for this video!
    I have a question, is it possible to create your own dataset starting from images with a non-square shape,
    for example 1024x512?

    • @HeatonResearch
      @HeatonResearch  3 years ago +1

      Yes, it could be done, but not easily; it would require modification of the StyleGAN2 source code and a restructuring of their neural network.

    • @makeandbreak127
      @makeandbreak127 3 years ago

      I just started playing with this, and if you add "--transform=center-crop --width=512 --height=512" as options after the line of code for dataset_tool.py, it will crop all your images to a square shape (assuming your subject is in the center of the image).
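
      As a full example, with placeholder paths (those flags are part of dataset_tool.py in the stylegan2-ada-pytorch repo):

      python dataset_tool.py --source=C:\gan\raw_images --dest=C:\gan\dataset --transform=center-crop --width=512 --height=512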

  • @mstrefford
    @mstrefford 3 years ago +2

    Thanks Jeff. When I run with one of the pretrained Nvidia models, I get this error:
    No module named 'upfirdn2d_plugin'
    warnings.warn('Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:

    ' + str(sys.exc_info()[1]))
    I'm running the latest dev version of Windows 10 (pre-release 210123-1645), WSL2, I have an RTX 3070 card, and I've installed CUDA 11.2 as per your instructions.
    Any ideas?

    • @mstrefford
      @mstrefford 3 years ago

      I've noticed it actually does generate output, and you also get the same error. I'm assuming this isn't a big issue then?

    • @mamounlahlou777
      @mamounlahlou777 3 years ago

      I got the same error. Did you manage to find any solution?

    • @mstrefford
      @mstrefford 3 years ago

      @@mamounlahlou777 I didn't fix it, but I noticed that it still works; I'm assuming it falls back to an alternative implementation. If you watch the video, Jeff also gets that error (you have to look quickly, it's only on screen for a split second).

    • @yuhanglee6828
      @yuhanglee6828 3 years ago

      @@mstrefford did you fix this error?

    • @mstrefford
      @mstrefford 3 years ago

      @@yuhanglee6828 No, but it didn't stop it from working.

  • @XavierCliment
    @XavierCliment 2 years ago

    Excellent tutorial. How can I convert pkl files to pth?
    Thank you

    • @HeatonResearch
      @HeatonResearch  2 years ago

      PKL is what StyleGAN2 supports out of the box; you might have to modify the NVIDIA code to support other formats.
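
      One possible route, sketched here as an assumption rather than an official workflow: load the pickle with the repo's own legacy helper and save just the weights with torch.save (the pickle name is a placeholder, and you still need the repo's network classes to rebuild the model from the .pth later):

      import torch
      import dnnlib
      import legacy  # run from the stylegan2-ada-pytorch folder so these imports resolve

      with dnnlib.util.open_url('network-snapshot-000800.pkl') as f:   # placeholder pickle
          G = legacy.load_network_pkl(f)['G_ema']
      torch.save(G.state_dict(), 'generator.pth')                      # weights only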

  • @karmeleustarroz3694
    @karmeleustarroz3694 1 year ago

    14:56
    In this step I get this error:
    PackagesNotFoundError: The following packages are not available from current channels:
    - python-3.11
    I have installed Python 3.11; could someone help me?

  • @DoUGotMusic
    @DoUGotMusic 3 years ago

    I am currently on step 6, converting the images. I am getting an error stating that the argument is an invalid choice; choose from display, compare, create_mnist, etc. Is this because my images are not all the same resolution? If so, what is the best way to obtain a dataset that maintains proper resolution? I got my dataset using the Flickr API and searched for dogs.

  • @mationplays1500
    @mationplays1500 2 years ago

    Thanks for the great video but I get an error when training:
    RuntimeError: CUDA out of memory. Tried to allocate 128.00 MiB (GPU 0; 6.00 GiB total capacity; 4.17 GiB already allocated; 0 bytes free; 4.27 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
    Is it possible to train with 6 GiB, or is it not enough memory?

    • @HeatonResearch
      @HeatonResearch  2 years ago +1

      Might work at 256x256, and it's good to make the batch size smaller. You are running out of memory for sure, and 6GB is right at the minimum of what StyleGAN would want.

    • @mationplays1500
      @mationplays1500 2 years ago

      @@HeatonResearch Thanks for the reply, but how do you change the batch size? On my laptop I only have 4 GB; is it still possible to train with a lower resolution and batch size? Thanks

  • @mationplays1500
    @mationplays1500 2 years ago

    Why do the pictures have to be square in dataset_tool.py?

  • @jennymild2886
    @jennymild2886 3 years ago

    Thank you Jeff for this awesome video! I had tried to get my RTX 3090 to work with StyleGAN2 for ages, and now it's finally working. I am making art videos of the images that I generate and would like to ask: what would be a good way to make these videos out of the images?
    I noticed that I can't use the .pkl files that I make with PyTorch with the TensorFlow version of StyleGAN2. Do you know why? Is there a way to make them work? I have a render_video.py that works fine with .pkl files I created earlier with TensorFlow and would like to use a similar kind of "program" to make the videos.
    Thank you for doing these videos and helping out noobies like me :) Cheers!

  • @Alee94
    @Alee94 2 years ago

    I am getting the error "Torch not compiled with CUDA enabled" when I run train.py.
    Does anybody have a solution, please?
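
    That error usually means a CPU-only PyTorch build is installed. A quick check (my own sketch; the version suffix is just an example):

    import torch
    print(torch.__version__)           # a CPU-only pip build shows something like 1.7.1+cpu
    print(torch.cuda.is_available())   # must print True before train.py can use the GPU

    If it prints False, reinstall PyTorch with the CUDA variant that matches your driver (the install selector on pytorch.org gives the exact command).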

  • @peelthebananna9827
    @peelthebananna9827 3 years ago +1

    How much VRAM do you use when running larger datasets?

    • @HeatonResearch
      @HeatonResearch  3 years ago

      For 90% of what I do, around 12GB is a sweet spot. But now that I have an A6000, I will perform some tasks that show how to use the larger RAM.

  • @Doincus
    @Doincus 3 years ago

    Hey, whenever I try to convert my images as you do at 22:22 I get the error "Input images must be stored as RGB or grayscale", even though I've checked that all my images are in fact already RGB. I'm really confused about why this is happening at this step.

    • @JayanWarden
      @JayanWarden 3 years ago

      Are they PNGs? Make sure that they are strictly RGB (24-bit), not RGBA (32-bit). Alternatively, you can convert all the PNGs to 100% quality JPGs; those should work.

  • @meminesis
    @meminesis 3 years ago

    What if I had an RTX 3070 (8GB)? How long would training last with a dataset of 300 images, like a week?

    • @HeatonResearch
      @HeatonResearch  3 years ago +2

      The number of images does not matter; it's how many kimg you train to. Resolution counts for a lot too. With 8GB, you're probably best looking at 256x256, maybe trained to 1000 or 3000 kimg. I will be using a 3060 for a few weeks soon, and will post a video on its results.

    • @meminesis
      @meminesis 3 years ago

      ​@@HeatonResearch On training my first GAN (1024px) I noticed that the FID score acts in a strange way: in a week it went from 344 down to 71, but lately, with almost 2000 kimg done, FID still fluctuates between 70 and 80... Why doesn't FID get progressively lower as kimg progresses? Isn't this supposed to be linear? And the fakes visually don't get any better, of course.

  • @qwe-de7xd
    @qwe-de7xd 3 years ago

    19:47
    git : The term 'git' is not recognized as the name of a cmdlet, function, script
    file, or operable program. Check the spelling of the name, or if a path was
    included, verify that the path is correct and try again.
    At line:1 char:1

    • @HeatonResearch
      @HeatonResearch  3 years ago

      Try this first... git-scm.com/book/en/v2/Getting-Started-Installing-Git

    • @sepeslurdes1918
      @sepeslurdes1918 3 years ago

      Another option: once on GitHub, just click the green "Code" button and then "Download ZIP", then uncompress the zip into a temp folder. Voilà! No need to install the command-line "git" tool.

  • @DuppyTree
    @DuppyTree 3 years ago +1

    I have an RTX 2070 with 32 GB of RAM; is doing this a good idea?

    • @HeatonResearch
      @HeatonResearch  3 years ago +2

      That should work.

    • @DansuB4nsu03
      @DansuB4nsu03 3 years ago +2

      32 GB is more than enough. The minimum accepted/barely-working amount is about 12 GB.

  • @unknownfilmmaker777
    @unknownfilmmaker777 2 years ago

    Can someone tell me which GAN does video-to-video on Windows?

  • @yuhanglee6828
    @yuhanglee6828 3 years ago

    assert len(self.block_resolutions) == len(self.num_block)
    AssertionError
    help!

  • @dyst1575
    @dyst1575 3 years ago

    19:20 I'm lost after you do it with SSH instead of HTTPS. git clone <url> gives me an error and I can't continue the tutorial: "The term 'git' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again."
    Edit: it worked, I just had to install git, but I know nothing about coding. Solution: ua-cam.com/video/_ZhruB2Vmdk/v-deo.html

  • @HalkerVeil
    @HalkerVeil 3 years ago +1

    Why can't they just make a simple install setup exe?
    Is that SO hard? XD
    My 12 year old nephew can do that. He can also make a simple button to click instead of doing all these command prompts.

    • @HeatonResearch
      @HeatonResearch  3 years ago

      IKR.. I agree totally. In some ways, Docker is that. But then on Windows, getting Docker running with a GPU is its own epic journey.

  • @thetafferboy
    @thetafferboy 3 years ago

    Thanks for a brilliant tutorial. I tried a sample of 2,876 1024x1024 images of dogs. train.py runs without error, but it only runs for 10 minutes and just generates a single 00000-dogs-test-auto1 directory, so I only have a single pkl file and terrible fakes. Any ideas what could be causing this? I am trying to run it on a 1080Ti :-)

    • @vgaggia
      @vgaggia 2 years ago

      Dunno if you still care, but you need to train the pkl for days/weeks before the quality is good.

  • @mattclark7511
    @mattclark7511 3 years ago

    (PART 6: Convert images) I have cropped my dataset to 1:1 (a square) because it doesn't work otherwise, but the error I'm getting is that it has to be a power of two. Any help would be appreciated. Thanks

    • @donaldufuah3399
      @donaldufuah3399 3 years ago

      Hello Matt, usually to get StyleGAN running without that error, your input to the network has to be a power of two, meaning you should have, say, 32x32, 64x64, etc. If you're doing transfer learning with the model, try resizing your input to 256x256, 512x512, or 1024x1024. I hope this helps.