GPUs in Kubernetes the easy way? NVIDIA GPU Operator overview!

  • Published 25 Oct 2024

COMMENTS • 30

  • @Hossein118
    @Hossein118 11 months ago +4

    It's a bit eerie because, 3 years after the video was posted, all of a sudden I opened it for the first time and he correctly reminds me that I am watching it on a Tuesday.

    • @NullLabs
      @NullLabs  11 months ago +1

      I'm good! All this ML has paid off!

    • @TRFAD
      @TRFAD 3 months ago

      Same here. lol

  • @MrBillyClanton
    @MrBillyClanton 4 years ago +5

    Could you do a video on installation?

    • @NullLabs
      @NullLabs  4 years ago +1

      absolutely! Thanks for asking!

    • @julianmuller3869
      @julianmuller3869 2 years ago

      If you have Helm installed do:
      helm install --wait --generate-name \
        -n gpu-operator --create-namespace \
        nvidia/gpu-operator
      Check with:
      kubectl get pods -n gpu-operator
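To sanity-check the operator after the install above, you can run a small CUDA pod that requests a GPU. A minimal sketch, assuming NVIDIA's publicly available vector-add sample image (the image tag is an assumption and may need updating):

```yaml
# Hypothetical verification pod: requests 1 GPU via the operator's device plugin
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vectoradd
spec:
  restartPolicy: OnFailure
  containers:
  - name: cuda-vectoradd
    image: nvidia/samples:vectoradd-cuda11.2.1   # sample image; verify the tag
    resources:
      limits:
        nvidia.com/gpu: 1
```

Apply it with kubectl apply -f, then check kubectl logs cuda-vectoradd; if the GPU is reachable, the sample should report a passing test.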

  • @JazzTechie
    @JazzTechie 1 year ago +1

    Another easy non-DaemonSet approach is to edit the nodes to add labels and then use affinity or topology spread constraints, but I guess the DaemonSet approach is for when you're running a giant bazillion-node cluster and you want it done automatically when you join in more nodes.
    Do you know if any special setup needs to happen on the hosts prior to rolling out the nvidia image to it?
    Also - any recommendations on which driver containers to use (like what repo/image the daemonset is trying to pull)?

    • @NullLabs
      @NullLabs  1 year ago +1

      You are right. Another consideration is if your cluster is changing size, e.g., you scale up once a week to do a full retrain, etc. This makes adding GPU nodes easier. This is an older video though; maybe I should update it to be a little more modern.
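The label-plus-selector approach from this thread can be sketched as follows, assuming a hypothetical node label gpu=true applied with kubectl label node <node-name> gpu=true:

```yaml
# Hypothetical pod pinned to labeled GPU nodes via nodeSelector
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pinned-pod
spec:
  nodeSelector:
    gpu: "true"          # illustrative label; any key/value works
  containers:
  - name: app
    image: nvidia/cuda:12.2.0-base-ubuntu22.04   # image tag is an assumption
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
```

The GPU Operator's DaemonSets do essentially this automatically, keying off labels set by node-feature-discovery as nodes join.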

  • @aakarshitagarwal2115
    @aakarshitagarwal2115 1 year ago +1

    With an ARM image, EKS still needs to support GPUs on it, for which only the NVIDIA DaemonSet can be utilized.

    • @NullLabs
      @NullLabs  1 year ago

      Thanks for the info! I have not used an ARM image on EKS! Thanks!

  • @mohamed_akram1
    @mohamed_akram1 11 days ago +1

    But how do you make multiple pods access the same GPU?

    • @NullLabs
      @NullLabs  9 days ago +1

      To do this you require special virtualization software from NVIDIA. At least that is the way I know of! :)
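For what it's worth, newer releases of the GPU Operator can also time-slice a GPU across pods without vGPU licensing, via the device plugin's sharing config. A sketch of the documented ConfigMap format (the name, namespace, and replica count here are illustrative):

```yaml
# Hypothetical time-slicing config: one physical GPU advertised as 4 replicas
apiVersion: v1
kind: ConfigMap
metadata:
  name: time-slicing-config
  namespace: gpu-operator
data:
  any: |-
    version: v1
    sharing:
      timeSlicing:
        resources:
        - name: nvidia.com/gpu
          replicas: 4
```

With this referenced from the operator's devicePlugin.config, each GPU shows up as 4 schedulable nvidia.com/gpu resources; note there is no memory isolation between the sharing pods.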

  • @shashanks2412
    @shashanks2412 3 years ago +1

    Hello, I have configured Kubeflow on a k8s cluster. However, I have recently added GPU nodes to my Kubernetes cluster, which is on-premise. I need to make sure the notebook servers in Kubeflow make use of the GPU nodes… how do I achieve it?

    • @NullLabs
      @NullLabs  3 years ago +1

      Hey! I see you have also joined the discord server and asked the question there! I have responded there as it's a bit easier to have a back and forth! Hope you found these videos useful and thank you for watching!

    • @shashanks2412
      @shashanks2412 3 years ago

      @@NullLabs yes thank you

  • @realsushi_official1116
    @realsushi_official1116 4 years ago +1

    So 1 GPU for multiple Docker containers is a dead-end idea?

    • @NullLabs
      @NullLabs  4 years ago

      I'm actively pursuing this. At this point with NVIDIA, I don't think it is possible. There are things like www.nvidia.com/en-us/data-center/virtual-gpu-technology/ but that requires both extra software and hardware licensing, and as NVIDIA does not sponsor me... it is very expensive. Now you can run multiple containers, just not in parallel.

    • @NullLabs
      @NullLabs  4 years ago +1

      If I figure anything out in this area you all will be one of the first to know.

    • @MrBillyClanton
      @MrBillyClanton 4 years ago +1

      Yes, unless you are using A100
      www.nvidia.com/en-us/technologies/multi-instance-gpu/

    • @NullLabs
      @NullLabs  4 years ago +1

      @@MrBillyClanton This is correct. And even if you have the A100, it is my understanding you have to pay/license for some extra software to do this. It is a shame, honestly. Though three 2060 Supers are about the same price as a 2080, so really it's best to get three of those, which lets you run 3 containers. (The same holds true for previous models as well.)

    • @V1nc3nt20
      @V1nc3nt20 3 years ago

      @@NullLabs Do you have any news about this? Because it seems to be possible "by accident": github.com/NVIDIA/gpu-operator/issues/28
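For MIG-capable cards like the A100 mentioned above, a pod requests a MIG slice as its own named resource. A minimal sketch (the mig-1g.5gb profile name is an example and depends on how the GPU is partitioned):

```yaml
# Hypothetical pod requesting one MIG slice instead of a whole GPU
apiVersion: v1
kind: Pod
metadata:
  name: mig-example
spec:
  restartPolicy: OnFailure
  containers:
  - name: cuda
    image: nvidia/cuda:12.2.0-base-ubuntu22.04   # image tag is an assumption
    command: ["nvidia-smi", "-L"]
    resources:
      limits:
        nvidia.com/mig-1g.5gb: 1   # profile name depends on your MIG layout
```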

  • @Dchau360
    @Dchau360 2 years ago

    Is it possible to run multiple driver versions? Have 2 containers where each has a different driver version?

    • @NullLabs
      @NullLabs  2 years ago

      I think that the operator can do this, though I have not tried it directly. I have put it on my list of things to check in the future.
      You CAN do this if you install them manually and expose them to the cluster manually. This I know works (and thus I am pretty sure the operator can do it... if it does not, I doubt it would be hard to update it). Mind you, you would need two separate nodes, as these are not the drivers "in" the container but on the system we are talking about. So not even an A100 with GPU virtualization would support two sets of drivers on the same GPU.

  • @avirupghosh7856
    @avirupghosh7856 3 years ago

    V V V Slow

  • @saurabhmalhotra110
    @saurabhmalhotra110 2 years ago

    A video without a demo is useless.

    • @NullLabs
      @NullLabs  2 years ago

      Kind of, maybe. Unless you did not know this existed. Also, if you want to get me enough money to have double GPUs to make this work, I would gladly update it with a demo!

    • @toshumalhotra4335
      @toshumalhotra4335 2 years ago

      @@NullLabs Yes, you can use k8s on Docker and at least make a good demo on sharing a single GPU with multiple containers; hope this doesn't cost you.