NVIDIA just made Local AI 3X Faster on Windows??

  • Published Oct 28, 2024

COMMENTS • 58

  • @harmonichemispheres1080 5 months ago +7

    this channel is so underrated. thanks for all the work!

    • @aifluxchannel 5 months ago

      Thank you! Glad we're making content that's engaging outside of many other cringe "ai info" channels. :))))

  • @fontenbleau 5 months ago +27

    but we want faster on Linux! 🥺

    • @aifluxchannel 5 months ago +4

      Hopefully this is coming soon with the new open source / "unified" driver they've been teasing. www.phoronix.com/news/NVIDIA-R560-Open-Default

    • @RickySupriyadi 5 months ago

      unified? oh no what will happen to Legacy gpu...

    • @levieux1137 5 months ago

      clearly, they're focused on windows because of gaming but with the move to AI they should start to think about moving to the platform where people use AI instead...

  • @BoominGame 5 months ago +1

    So are the 555 drivers available for Ubuntu 24? I have 2 x RTX A4000, so this is great news.

    • @aifluxchannel 5 months ago +1

      In theory yes! Although this isn't the final version that will be included with the next kernel update.

    • @BoominGame 5 months ago

      @aifluxchannel The only question is whether the rest of the AI libraries will follow suit. It's already hard to get everything working as is with mature drivers.

  • @SiCSpiT1 5 months ago +3

    I've been running LLMs on Win10 with a 3070 for months now, and as long as the full model can fit into the VRAM buffer the output is nearly instant, unless I'm trying to do something weird. This update is more than likely for the Turing cards.

    • @aifluxchannel 5 months ago

      The performance improvement will definitely vary depending on your hardware config. That said, 3070 is a decent GPU for local AI!

    • @joe_limon 5 months ago +1

      I think the speed is something that will end up making local agentic systems much more powerful. We are on the cusp of open-source AI systems being released that are given time to process and think before they respond. Tripling the processing speed essentially triples the amount they can think before they respond.

  • @southcoastinventors6583 5 months ago +1

    Can't wait: the AI games will get so good they play themselves, because who has time for that? Better off just paying the sub. I imagine the main reason most people stay off Linux is that certain programs like Adobe products, as well as a bunch of games, don't really run well. But with Windows 10 being discontinued and all these spyware-like programs being added, which they promise they will not have access to, it's certainly getting easier for people to switch.

    • @aifluxchannel 5 months ago

      Blockade Labs and their work on "dynamic" ai generated skyboxes are currently one of my biggest obsessions. AR/VR is generally incredibly cringe, but the idea of walking through basically a dream that morphs into something new as long as it's out of your field of view is incredible.
      But also, AI NPCs (true generative AI, not just legacy AI) will be super cool to see grow.

    • @BoominGame 5 months ago

      I installed Windows 10, worked on it for 2 months, then it broke. I transferred my WSL setup onto bare-metal Ubuntu and I am not going back to Windows, ever. Someone will integrate an AI (Ollama) into Ubuntu very soon, I hope.

  • @Leto2ndAtreides 5 months ago +4

    Local saves aren't spyware. Anyway, I've been using Ollama on my Windows, and it's been pretty fast... For 7B models.
    ... Where you legit feel bad for not having an M3 Max with its memory shared between the CPU and GPU.
    ... I want that 128GB RAM for my local AI models ...

    • @aifluxchannel 5 months ago +1

      More ram = more better llm performance!

  • @mattelder1971 5 months ago +1

    Interesting, they released these drivers for RTX cards, but not for Quadro cards. Those are still on a slightly older version. Hopefully they bring these improvements to the Quadro line as well.

    • @aifluxchannel 5 months ago

      Quadro cards are still technically supported, but I don't think we'll be seeing optimizations for these. Far more gaming GPUs exist from that era than the quadro era.

    • @mattelder1971 5 months ago

      @aifluxchannel They only seem to have released new Game Ready drivers, but no Studio drivers. Many people doing local AI are using Quadro cards. RTX and Quadro aren't different "eras", they exist simultaneously.

    • @BoominGame 5 months ago

      Quadro are for virtualisation not so much for AI calculations, old tensors if any, dissociated vram where a 32 gig will be in fact 4x8 on 4 different dies...

  • @lightweight1889 18 days ago +1

    GT 1030 +300% performance?

  • @ohardest 5 months ago +1

    Really excited about this. I have an RTX 2080 I want to try it on.

    • @aifluxchannel 5 months ago

      2080 is still a great GPU for ai!

  • @Cjak001 5 months ago +1

    So this isn't in the Studio Driver? I guess I'll switch over then

    • @aifluxchannel 5 months ago

      Not sure what you mean by studio, but I don't think this is the full version intended to bring linux updates to the official kernel driver.

    • @Cjak001 5 months ago +1

      @aifluxchannel Well, the article specifically mentioned Nvidia's Game Ready drivers for the improvement, but I asked because I was using their Studio drivers, which are more geared towards software and "creative apps". So basically I didn't know if the changes affected those drivers too.

    • @peterpui7219 5 months ago +2

      The Nvidia Studio Driver supports this feature as well as the Nvidia Game Ready Driver.

  • @motess5304 5 months ago +3

    🤣 I need to know just what the hell this guy is doing in Windows that it breaks. The only time I've ever had my Windows system dump is every time a stick of RAM dies on me or gets sketchy and needs replacing. That is literally it. Reminds me of back in the day, and I'm dating myself, running the original Pro Tools on a Mac IIci, with people asking how I could do music production on it as it was so unstable and always crashing, and I was like, what are you doing? It never ever crashed on me, like ever. 🤦🏿‍♂

    • @aifluxchannel 5 months ago

      Literally, basic parts of the OS just break. haha

    • @InnocentiusLacrimosa 5 months ago

      @aifluxchannel Just does not happen to me. I was running Linux as my main OS in the early 90s. Been mainly using Windows since, with occasional Linux. Windows is really stable.

    • @martin777xyz 5 months ago

      @aifluxchannel Given you use Windows so infrequently, it's probably some issue in the initial setup you've never paid attention to.

  • @ozgurdenizcelik 5 months ago +1

    few weeks ago i started to use wsl and it's kinda confusing what to do now

    • @aifluxchannel 5 months ago +1

      WSL is basically a 50-ish% implementation of a real linux kernel. Windows also kneecaps how it can access underlying resources like memory and GPU. Generally, I'd recommend just buying another SSD and dual-booting with linux.

    • @ozgurdenizcelik 5 months ago

      @aifluxchannel great idea, i believe i can buy a sata

    • @ozgurdenizcelik 5 months ago

      would dividing my ssd into c and d solve this problem ?

    • @ozgurdenizcelik 5 months ago

      would dividing my ssd into c and d and using d as Linux solve my problem?

    • @BoominGame 5 months ago +1

      Ask ChatGPT, but I suggest you install Ubuntu on a separate drive. It's much more stable and less of a hassle to work with all your environments.

  • @FatalKeystroke 5 months ago +1

    Will this do anything for a Tesla card?

  • @johnkost2514 5 months ago +2

    INT4 quantization is a boon to local LLM(s). This is an interesting performance uplift.

    • @aifluxchannel 5 months ago +2

      Definitely agree on this point - phi 3 is definitely pushing the boundaries of what is possible with and without quants this small.
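
Several comments in this thread hinge on whether a quantized model fits in VRAM. As a rough illustration only, the sketch below estimates the memory taken by the weights alone at a given bit width. The function name `weight_memory_gib` is hypothetical, and the estimate deliberately ignores the KV cache, activations, and runtime overhead, so real usage will be higher.

```python
def weight_memory_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights only, in GiB.

    Assumption: every parameter is stored at the same bit width and
    nothing else (KV cache, activations, framework overhead) is counted.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Model sizes mentioned in the thread: 7B and 70B.
    for name, params in [("7B", 7), ("70B", 70)]:
        for bits in (16, 4):
            size = weight_memory_gib(params, bits)
            print(f"{name} at {bits}-bit: ~{size:.1f} GiB of weights")
```

Under these assumptions, going from 16-bit to 4-bit weights cuts the weight footprint to a quarter, which is why INT4 makes larger models practical on consumer cards.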

  • @Tom_Neverwinter 5 months ago +1

    just one tiny problem. nvidia with multiple gpus typically fails. it never detects the other cards and installs the appropriate drivers

    • @aifluxchannel 5 months ago

      This is why we use linux ;)

    • @BoominGame 5 months ago

      I worked on Windows with 2 RTXs, no problem, but yes, Ubuntu is faster and handles this much better. It's more when you start playing with different packages like xformers, bitsandbytes, triton, protobuf, etc. that you will struggle reconciling Torch and CUDA with all that crowd.

  • @average_snmp_user 5 months ago +1

    we all know that you are talking about the titan RTX

  • @leeme179 5 months ago +3

    yay just switched to windows from ubuntu😆

    • @aifluxchannel 5 months ago

      Noooooooo!

    • @leeme179 5 months ago

      @aifluxchannel 😆 I switched without knowing about this driver, and actually I updated my driver today and already have the 555 version. I wanted to try WSL on Windows, as I was tired of restarting to game on Windows 😝

    • @leeme179 5 months ago

      @aifluxchannel I had a 4090 and recently bought a Tesla P40 for £200. I just managed to get all the parts to get it working today, and running Llama 70b 4bit across both GPUs I was getting 6 tokens per second using LLM studio

    • @Korodarn 5 months ago

      @leeme179 I assume online gaming that requires the dumb kernel level anti-cheat? Because I have almost zero problems with most games on Linux these days. Even on Hyprland it works pretty well, especially with 555 removing the flickering in games like Minecraft. I had to be on 535 before that, but it worked fine, just not perfectly.
      I don't think anyone should support the games running kernel level anti-cheat. It's a huge security hole, and it's not even really all that effective against committed cheaters. There are better options that don't require them to take control of the players computer at such a level.

    • @leeme179 5 months ago

      @Korodarn you are correct, it's games using kernel level anti-cheat. I recently did some digging as to how good they are, and it does not look good, to the point that game developers have either given up or don't care. The only thing that seems to be working (sadly) is kernel level anti-cheat loading at boot, like Vanguard

  • @GerryPrompt 5 months ago +1

    Driver go fastorrrrrrr 😂

    • @aifluxchannel 5 months ago

      We can hope for more linux improvements too!

  • @RickySupriyadi 5 months ago +1

    what about linux :(
    oh mine isn't rtx lol....
    nevermind.....