I find it funny how Americans can't just accept defeat
damn is 😂🎉
Just take the L, bro
Communism 1
Capitalism 0
Actually China is CAPITALIST but
Deepseek rocks 🤘🏾!
Monopolistic Capitalism 0
State-led Capitalism 1
The level of cope from this guy, criticizing DeepSeek for being open source. Come on
i think deepseek hurt the feelings of the biggest openai fanboy
most effective murica propaganda
Yes, it can totally sustain itself - it needs far fewer resources to keep running, and the full 671B model has been run locally on just a few M2 Ultras. That's almost a counterpoint: it's OpenAI and Anthropic that carry the massive sustainability costs. This video is pure copium.
fr
we're not closer to AGI. correct. But those guys found ways to use AI tech more efficiently. And that's the amazing part.
And released to the public. With a permissive license.
“I don’t like it” “it’s not American” so “it’s going to fail”.
Bro, I’m sorry but you lost credibility with this video…
ok so a lot wrong here. For one, the inference cost is so low because, in part, they use a VERY sparse MoE model trained natively in 8-bit precision (rather than the standard 16 or 32). Not only are memory requirements lower (which is great for the VRAM-constrained cards in China), but so is the compute needed for a forward pass, since only ~37 of 671 billion parameters are active per token. Compare this to models like GPT-4, which are likely running hundreds of billions of parameters at a time, and it makes sense how the model is so much cheaper and more scalable.
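To make the "only a few percent of parameters active" point concrete, here's a toy sketch of top-k MoE routing (purely illustrative; the dimensions, expert count, and layer shapes are made up and are not DeepSeek's actual architecture):

```python
import torch
import torch.nn as nn

class SparseMoE(nn.Module):
    """Toy mixture-of-experts layer: only top_k of n_experts run per token."""
    def __init__(self, d_model=512, n_experts=64, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)   # scores every expert per token
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])
        self.top_k = top_k

    def forward(self, x):                             # x: (tokens, d_model)
        scores = self.router(x)                       # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, -1)    # keep only k experts per token
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):                # run ONLY the chosen experts
            for e in idx[:, slot].unique():
                mask = idx[:, slot] == e              # tokens routed to expert e
                out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

x = torch.randn(10, 512)          # 10 tokens
y = SparseMoE()(x)                # each token only touched 2 of 64 experts
```

With top_k=2 of 64 experts, only ~3% of the expert weights do work for any given token, which is the same reason ~37B active out of 671B total makes the forward pass so cheap.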
As for the practicality of running locally, many people have made partial quants as low as 1.58-bit, using ternary weights for many of the layers, only requiring ~80GB of RAM for the full model and runnable at OK-ish speed off CPU due to the small number of active parameters. Quants do of course result in lower performance, and running this would still be slow on most consumer hardware. Luckily, they also released a series of distills in sizes like 1.5b, 7b, 8b, 14b, 32b, and 70b, all of which are runnable on consumer hardware depending on your specs. I have a 3060 12GB and can run all but the 70b at decent speed (and the 70b slowly), and the 14b and 32b actually aren't that far off the full model in terms of benchmark performance (although obviously they'll have less world knowledge at such a small size).
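For intuition on "ternary": each weight is snapped to {-1, 0, +1} times a per-tensor scale, which is log2(3) ≈ 1.58 bits of information per weight, hence the name. A toy absmean-style round-trip (just a sketch for intuition, not the actual quant code those repos ship) looks like:

```python
import torch

def ternary_quantize(w, eps=1e-5):
    """Toy absmean ternary quantizer: weights -> {-1, 0, +1} * scale."""
    scale = w.abs().mean().clamp(min=eps)   # single scale for the whole tensor
    q = (w / scale).round().clamp(-1, 1)    # snap each weight to -1, 0, or +1
    return q, scale                         # q packs into 2 bits + one fp scale

w = torch.randn(4096, 4096)                 # pretend this is a layer's weights
q, s = ternary_quantize(w)
print((w - q * s).abs().mean())             # reconstruction error of the round-trip
```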
Overall this family of models is a big win for open source and the fight against big tech. I've fully switched my workflow over to open-source models for the first time (only occasionally using Gemini when I need big context lengths for a problem, since that is hard to achieve on consumer hardware for now). Looking forward to RWKV models next!
Can you link it
1.58bit inference using ternary weights
Why even make this video? These are all non-arguments. You note that while DeepSeek is cheaper, it being open source is no silver bullet because of backend complexity barriers to entry. But the payment for that reduction in complexity is, as you noted seconds earlier, cheaper than what OpenAI is asking. Would love to know what motivated you to write, edit, and upload this.
Lol the turbocope on channels like this is hilarious.
Just accept the fact that the United States is no longer leading the AI race 😂🎉
It's not that deep bro
The US may have lost the lead🐋
title: debunked, video: I'm skeptical... lol
Meh. What did you debunk?
The hype.
I wonder how much time you've cost humanity creating and watching this stupid video
Overhyped or not, it IS free and it IS better than GPT-4. Everything else is mere speculation.
All your arguments about sustainability are the exact questions you should be asking about OpenAI and the corporate approach Americans have. The fact that you bring these up about DeepSeek, while ignoring the amount of money OpenAI puts into its work and the amount of money it puts into monopolizing it, says a lot about your horse blinders. And no, I fucking hate the CCP and am deeply skeptical of them, but at least it's open-sourced and 100 times cheaper so far. Nah, you fucked up on this one.
I don't trust the CCP but
Deepseek's research is goated.
bruh
This video was great. Not many people willing to break things down to the lowest levels.
Just as Google said at the beginning of this hype. Big tech has no moat.
your country would be unstoppable if you knew how to learn from your mistakes
OpenAI sicced a hydra on DeepSeek and flew him out
This seems like absolute nonsense, especially since you can run the model locally. Server maintenance is thus a non-issue. Also, how difficult do you think self-hosting is?? It's extremely trivial for anyone who can follow a basic tutorial.
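Case in point, with Ollama the whole "self-hosting" story for one of the distills is basically this (assuming the deepseek-r1 tags on the Ollama model library and its Python client; pick whichever size fits your VRAM):

```python
# pip install ollama, and have the Ollama server running (https://ollama.com)
# First pull a distill sized for your hardware, e.g.: ollama pull deepseek-r1:14b
import ollama

response = ollama.chat(
    model="deepseek-r1:14b",   # swap for 1.5b/7b/8b/32b depending on your GPU
    messages=[{"role": "user", "content": "Explain mixture-of-experts briefly."}],
)
print(response["message"]["content"])
```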
unsubbed.
Cope harder