this channel is so underrated. thanks for all the work!
Thank you! Glad we're making content that's engaging outside of many other cringe "ai info" channels. :))))
but we want faster on Linux! 🥺
Hopefully this is coming soon with the new open source / "unified" driver they've been teasing. www.phoronix.com/news/NVIDIA-R560-Open-Default
unified? oh no what will happen to Legacy gpu...
clearly, they're focused on windows because of gaming but with the move to AI they should start to think about moving to the platform where people use AI instead...
So are the 555 drivers available for Ubuntu 24? I have 2 x RTX A4000, so this is great news.
In theory yes! Although this isn't the final version that will be included with the next kernel update.
@@aifluxchannel The only thing is will the rest of the AI libraries follow suit? It's already hard to get everything working as is with mature drivers.
I've been running LLMs on Win10 with a 3070 for months now, and as long as the full model can fit into the VRAM buffer the output is nearly instant, unless I'm trying to do something weird. This update is more than likely for the Turing cards.
The performance improvement will definitely vary depending on your hardware config. That said, 3070 is a decent GPU for local AI!
I think the speed is something that will end up making local agentic systems much more powerful. We are on the cusp of releasing open-source AI systems that are given time to process and think before they respond, so tripling the processing speed essentially triples the amount they will be able to think before they respond.
Can't wait: the AI games will get so good they play themselves, because who has time for that? Better off just paying the sub. I imagine the main reason most people stay off Linux is that certain programs, like Adobe products, don't really run well, along with a bunch of games. But with Windows discontinuing Windows 10 and adding all these spyware-like programs, which they promise they will not have access to, it's certainly getting easier for people to switch.
Blockade Labs and their work on "dynamic" ai generated skyboxes are currently one of my biggest obsessions. AR/VR is generally incredibly cringe, but the idea of walking through basically a dream that morphs into something new as long as it's out of your field of view is incredible.
But AI NPCs (true generative AI, not just legacy AI) will also be super cool to see grow.
I installed Windows 10, worked on it for 2 months, then it broke. I transferred my WSL setup onto a bare-metal Ubuntu and I am not going back to Windows, ever. Someone will integrate an AI - ollama - into Ubuntu very soon, I hope.
Local saves aren't spyware. Anyway, I've been using Ollama on my Windows, and it's been pretty fast... For 7B models.
... Where you legit feel bad for not having an M3 Max with its memory shared between the CPU and GPU.
... I want that 128GB RAM for my local AI models ...
More ram = more better llm performance!
Interesting, they released these drivers for RTX cards, but not for Quadro cards. Those are still on a slightly older version. Hopefully they bring these improvements to the Quadro line as well.
Quadro cards are still technically supported, but I don't think we'll be seeing optimizations for these. Far more gaming GPUs exist from that era than the quadro era.
@@aifluxchannel They only seem to have released new Game Ready drivers, but no Studio drivers. Many people doing local AI are using Quadro cards. RTX and Quadro aren't different "eras", they exist simultaneously.
Quadro cards are for virtualisation, not so much for AI calculations: old tensor cores if any, and dissociated VRAM where a 32 GB card will in fact be 4x8 on 4 different dies...
GT 1030 +300% performance?
Really excited about this. I have an RTX 2080 I want to try it on.
2080 is still a great GPU for ai!
So this isn't in the Studio Driver? I guess I'll switch over then
Not sure what you mean by studio, but I don't think this is the full version intended to bring linux updates to the official kernel driver.
@@aifluxchannel Well, the article specifically mentioned Nvidia's Game Ready drivers for the improvement, but I asked because I was using their Studio drivers, which are more geared towards software and "creative apps". So basically I didn't know if the changes affected those drivers too.
The Nvidia Studio Driver supports this feature, as does the Game Ready Driver.
🤣 I need to know just what the hell this guy is doing in Windows that it breaks. The only time I've ever had my Windows system dump is every time a stick of RAM dies on me or gets sketchy and needs replacing. That is literally it. Reminds me of back in the day, and I'm dating myself, but I was running the original Pro Tools on a Mac IIci and people were asking how I could do music production on it, as it was so unstable and always crashing, and I was like, what are you doing? It never ever crashed on me, like ever. 🤦🏿‍♂️
Literally, basic parts of the OS just break. haha
@@aifluxchannel Just does not happen to me. I was running Linux as my main OS in the early 90s. Been mainly using Windows since, with occasional Linux. Windows is really stable.
@@aifluxchannel Given you use Windows so infrequently, it's probably some issue in the initial setup you've never paid attention to.
few weeks ago i started to use wsl and it's kinda confusing what to do now
WSL is basically a 50-ish% implementation of a real linux kernel. Windows also kneecaps how it can access underlying resources like memory and GPU. Generally, I'd recommend just buying another SSD and dual-booting with linux.
@@aifluxchannel great idea i believe i can buy a sata
would dividing my ssd into c and d solve this problem ?
would dividing my ssd into c and d and using d as Linux solve my problem
Ask ChatGPT, but I suggest you install Ubuntu on a separate drive. It's much more stable and less of a hassle to work with all your environments.
Will this do anything for a Tesla card?
Which version?
INT4 quantization is a boon to local LLM(s). This is an interesting performance uplift.
Definitely agree on this point - Phi-3 is pushing the boundaries of what is possible with and without quants this small.
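For anyone wondering what 4-bit quantization actually does to the weights, here's a minimal sketch. It assumes the simplest possible scheme (symmetric, one scale for the whole tensor); the function names are made up for illustration, and real INT4 formats used by local LLM runtimes are group-wise and more involved than this.

```python
# Illustrative only: hypothetical helpers, not any library's API.

def quantize_int4(weights):
    """Map floats to signed 4-bit codes in [-8, 7] using a single scale."""
    max_abs = max(abs(w) for w in weights)
    # Assumption: symmetric scheme, largest magnitude maps to code 7.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int4(codes, scale):
    """Recover approximate floats from the 4-bit codes."""
    return [c * scale for c in codes]


weights = [0.12, -0.53, 0.98, -1.40, 0.07, 0.66]
codes, scale = quantize_int4(weights)
restored = dequantize_int4(codes, scale)

# Worst-case rounding error is about half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print("codes:", codes)
print("scale:", scale)
print("max error:", max_error)
```

Each weight ends up as one of 16 integer codes plus a shared scale, which is where the memory savings come from; the price is the rounding error printed at the end.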
just one tiny problem. nvidia with multiple gpus typically fails. it never detects the other cards or installs the appropriate drivers
This is why we use linux ;)
I worked on Windows with 2 RTXs no problem, but yes, Ubuntu is faster and handles this much better. It's more when you start playing with different packages like xformers, bitsandbytes, triton, protobuf etc. that you will struggle reconciling Torch and CUDA with all that crowd.
we all know that you are talking about the titan RTX
Wait and see ;)
yay just switched to windows from ubuntu😆
Noooooooo!
@@aifluxchannel 😆I switched without knowing about this driver, and actually I updated my driver today and already have the 555 version. I wanted to try WSL on Windows, tired of restarting to game on Windows 😝
@@aifluxchannel I had a 4090 and recently bought a Tesla P40 for £200. I just managed to get all the parts to get it working today, and running Llama 70B 4-bit across both GPUs I was getting 6 tokens per second in LM Studio.
@@leeme179 I assume online gaming that requires the dumb kernel level anti-cheat? Because I have almost zero problems for most games in Linux these days. Even on hyprland it works pretty good, especially with 555 removing the flickering on games like minecraft. I had to be on 535 before that, but it worked fine, just not perfect.
I don't think anyone should support games running kernel-level anti-cheat. It's a huge security hole, and it's not even really all that effective against committed cheaters. There are better options that don't require them to take control of the player's computer at such a level.
@@Korodarn You are correct, it's games using kernel-level anti-cheat. I recently did some digging into how good they are, and it does not look good, to the point that game developers have either given up or don't care. The only thing that seems to be working (sadly) is kernel-level anti-cheat that loads at boot, like Vanguard.
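Side note on the 4090 + Tesla P40 setup mentioned above: a rough back-of-the-envelope check shows why a 70B model at 4 bits needs both cards. This is only a sketch. It counts weight storage alone (no KV cache, no runtime overhead), assumes exactly 4 bits per weight, and the helper name is made up for illustration. Both cards have 24 GB of VRAM.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Assumption: weights only; KV cache and runtime overhead are ignored.
    """
    return num_params * bits_per_weight / 8 / 1e9


# VRAM per card in GB.
GPUS_GB = {"RTX 4090": 24, "Tesla P40": 24}

needed = weight_memory_gb(70e9, 4)   # 70B parameters at 4 bits each
total = sum(GPUS_GB.values())

fits_single = any(needed <= vram for vram in GPUS_GB.values())
fits_combined = needed <= total

print(f"weights need ~{needed:.0f} GB, combined VRAM is {total} GB")
print("fits on one card:", fits_single)
print("fits across both:", fits_combined)
```

Real quant formats usually store a bit more than 4 bits per weight once scales are included, so treat the number as a lower bound.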
Driver go fastorrrrrrr 😂
We can hope for more linux improvements too!
what about linux :(
oh mine isn't rtx lol....
nevermind.....
What kind of GPU do you use!?
@@aifluxchannel an old GTX 1080