4:50 Small correction on the specs: the P40 can only do ~0.19 TFLOPS in FP16, since its FP16:FP32 ratio is 1:64.
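As a quick sanity check on that figure, a back-of-the-envelope sketch in Python (the ~11.76 FP32 TFLOPS rating is the P40's public spec-sheet number, not something from the video):

# P40 FP16 throughput derived from its FP32 rating and the 1:64 ratio
fp32_tflops = 11.76             # P40 boost-clock FP32 rating (approximate)
fp16_tflops = fp32_tflops / 64  # Pascal GP102 runs FP16 at 1/64 rate
print(round(fp16_tflops, 2))    # -> 0.18, in line with the ~0.19 above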
Right, sorry about that 😅
I was searching for a comparison like this for hours, then your video got recommended. Thank you so much.
Glad I could help with my comparison!
Are you planning to create an AI Server for yourself?
@@AI-HOMELAB I am thinking of creating an AI server for AI image generation (Flux).
@blaxt1215 That should work with both cards, but I'd probably go with the 3090.
@@AI-HOMELAB Yes, I was also thinking of going with it.
Like others in the comments, I was also sent your way because I have an AI home server somewhere in my future, and I am looking to understand which hardware makes the most sense. Appreciate the video. I'll be tuning in for any future builds. Hopefully you continue to look at different options, from GPUs to CPUs, and keep making similar videos. Thanks! Subbed.
Thank you very much! I'll for sure create more videos with different builds! I hope to upload something new every week or so. (I'm also working and studying - that's why my uploads don't follow a fixed schedule.)
If you ever build your first rig, let me know how it went. If you have any questions, don't hesitate to ask here in the comments. ✌️
By the way, I guess you're from Canada: you have a beautiful country. I traveled from Calgary to Vancouver Island just a few weeks ago. Stunningly beautiful, and the people are so kind to everyone! 😃
Nice comparison, thanks!
Thank you for your positive feedback! =D
Great. Thanks for that.
Thanks for this!
You are welcome! Thank you for watching and your nice comment! =)
Why are you running such small LLM models with 96GB of VRAM?
Because I was comparing the P40 to the 3090, of which I only have one. ✌️
The Chinese retailers on eBay and AliExpress prolly woke up to the increased demand and inflated the prices for the P40 and P100 by 100%, up to $300. No one should buy them at that price imo. Same goes for the P4, MI50s, etc. Also remember that you need a server chassis to cool them, or some random blower hack, which will cost you extra.
The NVIDIA RTX 4060 Ti 16GB, with its PCIe x8 link and 128-bit bus, isn't something to consider either. I'm getting used Intel A770 16GBs for $200 and will see what Battlemage brings.
P40s are priced insanely high right now. The RTX 4060 Ti can be an option if you only use it for inference (bandwidth shouldn't be a problem if you don't want to train a model; see the rough estimate at the end of this comment).
The AMD MI60 can still be bought for around 300 USD (32GB VRAM, great bandwidth and performance).
There are modded 2080 Tis with 22GB VRAM being sold from the Bay Area.
The P100 only has 16GB of VRAM but can still be found for around 200 USD.
There are some viable options. But it's not as easy as it used to be...
I did keep away from the Intel Arcs as they use Vulkan, which does not seem to be well optimized.
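To put the inference-bandwidth point in rough numbers, a minimal back-of-the-envelope sketch (the 288 GB/s value is the 4060 Ti 16GB's published memory bandwidth; the model size is a made-up example):

# Rough ceiling on single-stream token generation: each generated token
# streams (roughly) the whole model through GPU memory once.
bandwidth_gb_s = 288.0  # RTX 4060 Ti 16GB memory bandwidth (spec sheet)
model_size_gb = 8.0     # e.g. a ~13B model at 4-bit quantization (assumed)
print(f"~{bandwidth_gb_s / model_size_gb:.0f} tokens/s upper bound")  # ~36

Even behind the narrow 128-bit bus, that's comfortably chat speed, which is why the bus hurts training far more than plain inference.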
@@AI-HOMELAB Just got a used 3090 for 550€, which is a bargain. I'll prolly just keep an eye on these.
Would love to see the P106/P104/P102 mining cards tried with LLMs. Cheap as dirt, but still a decent 1080 Ti-era chip, with some caveats like the cut-down bus.
What about multiple P40s versus a single 3090? AI is a parallel workload.
Not necessarily. ML is not inherently a parallel workload. GGUF inference, for instance, splits the weights across the cards and then calculates the matrices serially, so there isn't a massive boost. But if you use vLLM from GitHub, you can parallelize your workload; I am planning on testing that. vLLM is planning on supporting GGUF eventually.
But remember, ML is a huge field - not just application-wise. Many workloads have to be done serially and are quite hard to make asynchronous. There is, for instance, only one project I know of that tries to use multiple GPUs asynchronously for CNNs:
(github.com/czg1225/AsyncDiff)
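For the vLLM route, a minimal sketch of what I mean (the model name is just an example, and this assumes vLLM is installed and two CUDA GPUs are visible):

# Shard one model across two GPUs with vLLM's tensor parallelism,
# so both cards work on every token instead of taking turns.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-2-13b-hf",  # example model, untested here
          tensor_parallel_size=2)             # split the weights over 2 GPUs
params = SamplingParams(max_tokens=64)
out = llm.generate(["Explain tensor parallelism in one sentence."], params)
print(out[0].outputs[0].text)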
Greets,
Simon
I have both, and I watercooled both. The 3090 needs an active backplate. I basically just have the P40 dedicated to AI art tasks without having to switch what my 3090 is doing, like crypto mining or games.
What about the 2080 Ti vs the 3090?
Would be interesting! Especially with those modded 22GB versions being sold from the US. Unfortunately we don't have one ourselves, but I hope to get one in my hands. I am not sure if we are "big enough" yet to ask those companies for testing samples, but I am thinking about it. I'd expect around 50% of the performance of an RTX 3090, given the 2080 Ti has half the CUDA cores at a slightly lower core clock.
At this point I wouldn't recommend buying anything without tensor cores, even for a home lab.
Very fair point - I'd only recommend the P40 for LLMs below 30B today (and only if you can get it for under 200 USD). It takes some adjustments to get it working in a more modern environment, but they are still usable. ✌️
I'd love to cooperate with you. Let me know if there's anything I could do. I'm studying the mathematics behind LLMs and stuff like that, so it'd be a fun thing to do!
Hey K1,
I already work together with a friend of mine from my master's. If you just want to see us run some experiments you are interested in, we can of course test your ideas.
Greets
Simon & Andreas
@@AI-HOMELAB Thank you very much for your reply. Yeah, I'd love to get involved if you think it's possible. Thank you. I am studying for my bachelor's, btw - not as expert as you guys.
Neither of us is really an expert. We just try stuff out and hope to show something useful. But as I said, we are still quite a small channel and are trying to find our direction. For the moment we are happy as we are. But if you have an idea you would like us to test, or some Git repository, we will of course try it out for you. ✌️
Greets,
Simon & Andreas
Guess you read my comment. Full stop. NVLinked 3090s are the bomb. Thanks for the credit there, sir. No need, you are behind the curve.
Hey
First of all: sorry for not answering your first comment. I was getting lots of comments on the first video - honestly more than I thought I would.
Now to your statement: I own a 3090, a 3060, some P40s, some M40s, and some K80s. As I also said in the first video, these will be my first test videos. There were honestly no bad intentions in not mentioning your comment; I had planned this content and forgot about your suggestion.
The reason I didn't answer straight away back then was that I don't own an NVLink bridge or two 3090s, so it was kind of outside my expertise.
And by the way: You seem to have a really nice setup. ✌️
I am just starting doing this as a hobby. No bad intentions here. 🙂