Installing DUAL Tesla P100 GPU on Dell PowerEdge R720 Server with Driver Installation

  • Published 30 Oct 2023
  • In this comprehensive UA-cam tutorial, we'll guide you through the entire process of installing not one but two powerful and cost-effective Tesla P100 GPUs on your Dell PowerEdge R720 server, from start to finish. We cover everything, including the essential parts you'll need, step-by-step installation instructions, and the crucial driver installation to ensure your GPUs work seamlessly in tandem. Whether you're a seasoned IT professional or a server enthusiast, this video provides valuable insights to help you harness the full potential of your server's performance with dual P100 GPUs. Don't miss out on this essential guide to supercharge your server capabilities! Subscribe for more tech tutorials and stay tuned for more in-depth AI/ML/DL related content.
    📚 Additional Resources:
    AI/ML/DL GPU Buying Guide 2023: Get the Most AI Power for Your Budget
    • AI/ML/DL GPU Buying Gu...
    AI/ML/DL with the Dell PowerEdge R720 Server - Energy, Heat, and Noise Considerations
    • AI/ML/DL with the Dell...
    Throttle No More: My Strategy for GPU Cooling in Dell PowerEdge
    • Throttle No More: My S...
    Installing Tesla P100 GPU on Dell PowerEdge R720 Server with Driver Installation
    • Installing Tesla P100 ...
    Dell PowerEdge R720XD GPU Upgrade: Installing Tesla P40 with NVIDIA Drivers
    • Dell PowerEdge R720XD ...
    Dell PowerEdge R720 GPU Deep Learning Upgrade: Installing Dual Tesla P40s with NVIDIA Drivers
    • Dell PowerEdge R720 GP...
    Other UA-cam Video That Describes Cabling Issues in More Detail
    • Discussing power cavea...
    Links to Parts I Used
    BETTER CABLING OPTION: a.co/d/hccc8m8
    www.amazon.com/dp/B07MV4CYMV?...
    www.amazon.com/dp/B07M9X68DS?...
    www.ebay.com/itm/203107893431...
    HOW TO GET IN CONTACT WITH ME
    🐦 X (Formerly Twitter): @TheDataDaddi
    📧 Email: skingutube22@gmail.com
    💬 Discord: / discord
    Feel free to connect with me on X (Formerly Twitter) or shoot me an email for any inquiries, questions, collaborations, or just to say hello! 👋
    HOW TO SUPPORT MY CHANNEL
    If you found this content useful, please consider buying me a coffee at the link below. This goes a long way in helping me through grad school and allows me to continue making the best content possible.
    Buy Me a Coffee
    www.buymeacoffee.com/TheDataD...
    As a cryptocurrency enthusiast, I warmly welcome donations in crypto. If you're inclined to support my work this way, please feel free to use the following addresses:
    Bitcoin (BTC) Address: bc1q3hh904l4uttmge6p58kjhrw4v9clnc6ec0jns7
    Ethereum (ETH) Address: 0x733471ED0A46a317A10bf5ea71b399151A4bd6BE
    Should you prefer to donate in a cryptocurrency other than Bitcoin or Ethereum, please don't hesitate to reach out, and I'll provide you with the appropriate wallet address.
    Thanks for your support!
  • Science & Technology

COMMENTS • 38

  • @ultraplexplextor 6 months ago +1

    Thanks for some great videos. I have 2 PowerEdge R720s with dual cards (GRID K2, 4 x GPU) per server.
    I'd like to upgrade to some newer cards.
    Can you make a video about heat and performance issues with a Tesla upgrade?

    • @TheDataDaddi 6 months ago +1

      Hi there! Thanks for the feedback. I really appreciate it. Absolutely! That is actually next on my list of videos to make. I will also measure power consumption under normal conditions and at max load. So far, though, I can tell you from a qualitative perspective that there have been no issues with throttling due to heat.

    • @rdsii64 5 months ago

      How loud is your R720? I also have an R720. With no graphics cards, my server is whisper quiet since it only runs one service. The air conditioner is louder. Because I am space constrained, my homelab rack is in my office next to my desk. What I'm afraid of is that if I offload the heavy lifting to a Tesla card, the noise will be unbearable, since I have to sit next to my rack.

    • @TheDataDaddi 5 months ago +1

      @rdsii64 At idle or under minor loads the server is relatively quiet, like you said, but the minute I start training a model or doing anything that requires more than minimal resources, the noise becomes bothersome. You could turn the fans down, but this might harm your system if done long term. You could also invest in a nice pair of noise-canceling headphones. I did this for a while when my server was in my room, and it made it bearable to live with for a year. Also, if you have just one R720, the noise is tolerable (imo). However, if you have multiple, it's too much to have in your room. Unfortunately, I have not found a great solution to this. My solution eventually was moving to a bigger place so I could put the servers in the basement. In your case, it may be better to build your own rig that runs more quietly, even though it will likely be more expensive. Hope this helps!

  • @many151000 1 month ago +1

    Friend, I have a problem when connecting a Tesla M10 to my R720 server: when I plug in the power cable, the power supplies start blinking amber. Do you know why that is...

    • @TheDataDaddi 1 month ago

      Hi there. Thanks so much for the question.
      Is the server able to boot when you have the M10 in?

  • @computersales 6 months ago +2

    Interesting, I never noticed that the dual-slot riser 3 only supports 150 W out of the power connector. It is weird because the dual-slot riser 2 power connector can provide 225 W. It doesn't seem like it would be a limitation of the riser slot on the motherboard side of things. Also, I'm so glad I don't have to deal with those PCIe 8-pin to EPS 12V adapters. They don't look fun to install.

    • @TheDataDaddi 6 months ago +1

      Really appreciate the comment! Yeah, I didn't realize that about riser 3 either. Also, it might not be an issue. I honestly never tried it with the 2-slot riser 3. I just didn't want the GPU to be underpowered. I replaced riser 3 with a single-slot riser that has 225 W (I believe), which I got from eBay. That seemed to do the trick. Yeah, the PCIe 8-pin to EPS 12V adapters are a pain in the ass. I wish there was a better method, but I haven't found one short of making your own custom cable.

    • @computersales 6 months ago

      @TheDataDaddi I think if anything it would trigger some sort of power overdraw limit. I may have to try for the fun of it. The cheaper aftermarket cables should work fine with the R720, but they won't work with the R730 due to slightly different wiring. Otherwise, making your own is the best option.

    • @TheDataDaddi 6 months ago +1

      @computersales Yeah, that would be interesting. If you do, please let me know the results! Yeah, I agree. If I had the time, I would likely go that route instead.
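
As a rough sanity check on the wattage figures in this thread, the sketch below adds the riser connector ratings quoted above (150 W and 225 W) to the 75 W that a PCIe x16 slot can supply, then compares the total with a 250 W board power, which is NVIDIA's listed figure for the PCIe Tesla P100. The function name is hypothetical, and treating the slot and connector budgets as simply additive is an assumption, not something taken from Dell documentation.

```python
# Assumptions (not from the video): the slot supplies the PCIe maximum of 75 W,
# and the slot and connector budgets simply add together.
PCIE_SLOT_WATTS = 75
P100_BOARD_POWER_WATTS = 250  # NVIDIA's listed board power for the PCIe P100


def power_headroom(connector_watts, board_watts=P100_BOARD_POWER_WATTS):
    """Return available watts minus board power (negative means a shortfall)."""
    return PCIE_SLOT_WATTS + connector_watts - board_watts


for label, watts in [("150 W connector", 150), ("225 W connector", 225)]:
    print(f"{label}: headroom {power_headroom(watts):+d} W")
```

Under these assumptions the 150 W connector leaves the card 25 W short at full board power, while the 225 W connector leaves 50 W to spare, which is consistent with the choice to swap the riser.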

  • @IntenseGrid 4 months ago +1

    Could it be that the cable that came with the server didn't work in the 3rd PCIe riser because the EPS power connector on that riser was keyed for 150 W, while the cable was keyed for 225 W?

    • @TheDataDaddi 4 months ago

      Hi there. Do you mean the cable that came with the GPU? If so, that certainly could be the case. I'm just not sure why, because that would underpower the GPU by default.

    • @IntenseGrid 4 months ago +1

      @TheDataDaddi In your first video of the series you mentioned that NVLink can give you extra bandwidth between GPUs. Mike Adams just bought some A100 40GB units (unsure if cards or modules), but I was thinking that is overkill for the model he is training, especially if you can get decent bandwidth between GPUs with a pair of P40s or P100s running the same workload. I'm a noob to AI, so I would love your opinion.

    • @TheDataDaddi 4 months ago

      @IntenseGrid Yeah, I would say that for most hobbyists NVLink might be overkill. However, I have never used it myself, so I cannot say how much it would speed up training. Now, if he is training on TBs worth of data over many GPUs, then the extra bandwidth for data transfer could prove invaluable. So I would say (albeit fairly blindly) that it really depends on the application and the amount of data transfer that has to happen. For most applications and AI/ML/DL hobbyists, I would say that PCIe transfer works just fine. The caveat is that I have not been able to test this yet, so it is hard for me to give a real answer here. Getting my hands on a Gen 1 NVLink and setting it up in my home lab for testing is high on my list of videos to make, so one day I will hopefully be able to answer your question much more completely.

  • @GOLTURBO555 2 months ago +1

    Dual GPUs? How? RAID in the BIOS? Or does the software use both cards?

    • @TheDataDaddi 2 months ago

      Software uses both cards naturally once the drivers are installed.
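
To make that answer concrete: no BIOS-level pairing is involved, and each card typically shows up to software as its own device once the NVIDIA driver is loaded. The sketch below is one way to confirm that both cards are visible by parsing `nvidia-smi --query-gpu=index,name,memory.total --format=csv,noheader`. The helper names and the `SAMPLE` text are hypothetical and only illustrate the output format; the real names and memory sizes depend on the cards and driver version.

```python
import shutil
import subprocess

QUERY = ["nvidia-smi", "--query-gpu=index,name,memory.total", "--format=csv,noheader"]


def parse_gpu_list(csv_text):
    """Parse 'index, name, memory' CSV lines into a list of dicts."""
    gpus = []
    for line in csv_text.strip().splitlines():
        index, name, memory = [field.strip() for field in line.split(",")]
        gpus.append({"index": int(index), "name": name, "memory": memory})
    return gpus


def list_gpus():
    """Return the GPUs the driver reports, or [] if nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True)
    if result.returncode != 0:
        return []
    return parse_gpu_list(result.stdout)


# Illustrative sample only, not output captured from the video.
SAMPLE = "0, Tesla P100-PCIE-16GB, 16384 MiB\n1, Tesla P100-PCIE-16GB, 16384 MiB\n"

# Fall back to the sample text on machines without a working NVIDIA driver.
for gpu in list_gpus() or parse_gpu_list(SAMPLE):
    print(gpu["index"], gpu["name"], gpu["memory"])
```

If two entries come back, frameworks can then address the cards individually, and it is up to the training code to spread work across them.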

  • 13 days ago

    In which OS did you run the nvidia-smi command?

    • @TheDataDaddi 12 days ago

      Hi there. Thanks for the question!
      I am using Ubuntu 22.04 for all of my servers at the current moment. Might switch over to NixOS soon though.

  • @user-or9ir7dp5v 4 months ago +1

    I think the speed of riser 3 is x8, although the slot is x16 in length.

    • @TheDataDaddi 4 months ago +1

      You may be right. I was unclear on this and could never find a definitive answer. However, I swapped out riser 3 mostly for the wattage, not for more PCIe lanes.

    • @user-or9ir7dp5v 4 months ago +1

      @TheDataDaddi Just refer to page 103 of the user manual. There are two types of riser card: the default one only supports x8 speed, and the alternate one supports x16.

    • @TheDataDaddi 4 months ago +1

      @user-or9ir7dp5v Right, I was just unclear as to whether that was a constraint of the motherboard for riser 3, or whether it was just that the riser 3 variant I originally had has 2 slots, which would in theory split the x16 across both slots as x8 each. Anyway, thanks so much for the feedback. I will go reread that page in the manual.
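
For a sense of what x8 versus x16 means in practice, here is a small worked calculation of theoretical one-direction link bandwidth. It assumes the slot runs at PCIe 3.0 rates (8 GT/s per lane with 128b/130b encoding); the function name is hypothetical, and real-world throughput will be lower because of protocol overhead.

```python
# Assumption: the link runs at PCIe 3.0 rates (8 GT/s per lane, 128b/130b encoding).
GT_PER_SECOND = 8.0
ENCODING_EFFICIENCY = 128 / 130


def pcie3_bandwidth_gb_s(lanes):
    """Theoretical one-direction bandwidth in GB/s for a PCIe 3.0 link."""
    return GT_PER_SECOND * ENCODING_EFFICIENCY * lanes / 8  # bits -> bytes


for lanes in (8, 16):
    print(f"x{lanes}: {pcie3_bandwidth_gb_s(lanes):.2f} GB/s")
```

Under that assumption an x8 link tops out at roughly 7.88 GB/s and an x16 link at roughly 15.75 GB/s, so the riser choice mostly matters for workloads that move a lot of data between host and GPU.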

  • @IntenseGrid 4 months ago +1

    I didn't see the link to the custom cable maker.

    • @TheDataDaddi 4 months ago +1

      This is the video I watched from the guy who makes his own cables. I believe he sells them on eBay, but I am not 100% sure. You could reach out to him and see.
      ua-cam.com/video/qC7UdfQPMVI/v-deo.htmlsi=3X0NX9aFRxiqaJNr

  • @Mark300win 5 months ago +1

    What are you using these dual P100s for?

    • @TheDataDaddi 5 months ago +2

      At the moment, I am using them for a computer vision project I am working on. I am using Meta's detectron2 framework to fine-tune various pre-trained object detection models (mostly of the Faster R-CNN variety) for my particular application. Currently, I am fine-tuning with thousands of annotated 1920 × 1080 images of webpages. The goal is to create an object detection model that can reliably detect various parts of a webpage. This model will be used in various other downstream tasks.

    • @Mark300win 5 months ago

      @TheDataDaddi That's awesome! I'm torn on whether to go with the P40 (24 GB VRAM) or the P100 (12 GB VRAM)... What do you recommend?

    • @TheDataDaddi 5 months ago +2

      @Mark300win I really struggled with this as well. My recommendation is the following. The P40 is going to be the better choice for most people. It has more VRAM, which is more and more important these days as models get larger, and better single-precision performance, which is what really matters for most applications. However, the P100 has greater memory bandwidth and substantially greater double-precision performance. So, if you had a particular application that required double precision, I would choose the P100. Also, if you wanted something more general purpose that could handle all projects, I would probably choose the P100. Overall, for most people the P40 will likely be the better choice. I will be honest, though: I have not had a chance to fully test both sets of GPUs, so I cannot comment more specifically yet. This performance comparison is high on my list. As soon as I get a chance, I am going to make a video on it.

    • @Mark300win 5 months ago

      @TheDataDaddi Yeah, the P40 vs P100 question is a very hot topic, and I haven't seen any video about it yet.

    • @GOLTURBO555 2 months ago +1

      @TheDataDaddi Can I game on these servers?

  • @fantomgaming9018 6 months ago +1

    It doesn't work

    • @TheDataDaddi 6 months ago

      Hey there. Sorry to hear that it is not working for you. If you tell me what your issue is, I'd be happy to help you troubleshoot.

  • @blender_wiki 4 months ago +1

    Nice video. However, in 2023 the P100s are basically junk cards, not even decent for encoding; they don't support H.265 at 10-bit.

    • @TheDataDaddi 4 months ago +1

      Hi there. First, I would like to say I appreciate the feedback. However, in this case I must disagree with you. I think it depends on what your application is. If you are using this as a gaming GPU or trying to work with large video datasets that might benefit from H.265 10-bit encoding, I would agree there are better choices, albeit more expensive ones. However, as a cost-effective, general-purpose machine learning GPU, it is still very relevant in my opinion. These cards would be especially relevant for those who are just getting into AI/ML/DL, are budget constrained, or both.