DeepSeek V3 is *SHOCKINGLY* good for an OPEN SOURCE AI Model

  • Published Dec 27, 2024

COMMENTS •

  • @mrd6869
    @mrd6869 1 day ago +130

    This is a good thing. Keep closed-source people in check.

    • @NeilAC78
      @NeilAC78 1 day ago +1

      It's just another one of these so-called free models. Starts off well, and then you end up being throttled badly. This applies to the chatbot, of course, not the local LLM.

    • @TheReferrer72
      @TheReferrer72 1 day ago

      How? No one except enthusiasts has heard of DeepSeek.

    • @latiendamac
      @latiendamac 23 hours ago +1

      And also keep the sanctions people in check

    • @GilesBathgate
      @GilesBathgate 14 hours ago

      @@NeilAC78 Yes, it's free as in freeware, not free as in freedom or FOSS.

  • @yikifooler
    @yikifooler 20 hours ago +19

    Imagine a country producing free AI products, what we call open source, for everybody at large scale, which is what China is doing. Think how much power that gives them. I see Chinese AI popping up everywhere at scale.

  • @HaraldEngels
    @HaraldEngels 1 day ago +70

    I have been using DeepSeek since version 2 (alongside other models). Especially for coding and other IT-related tasks, DeepSeek is my favorite model. It even beats Gemini Advanced 1.5 in many areas. I also use a smaller model (16B) locally; it works very well for its size on my PC with an AMD Ryzen 5 8060G CPU and 64GB RAM. I am especially impressed by how well structured the responses are. (A sketch of a local setup follows this thread.)

    • @rahi7339
      @rahi7339 1 day ago +5

      Claude is better, try it

    • @alienstudentx
      @alienstudentx 1 day ago

      What do you use it for?

    • @brons_n
      @brons_n 1 day ago

      @@rahi7339 Claude is better, but also a lot pricier. I don't see why you can't use both.

    • @GeorgeO-84
      @GeorgeO-84 1 day ago

      Gemini has been a terrible code generator for me. ChatGPT has been the smoothest experience. I'll give DeepSeek a go though.

    • @bin.s.s.
      @bin.s.s. 1 day ago +2

      Its first version in China was indeed developed specifically for "AI coding", in early 2019 if I remember correctly.
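
A minimal sketch of the kind of local setup described in this thread, using llama-cpp-python. It assumes you have downloaded a GGUF quantization of a roughly 16B model; the filename, context size, and thread count are placeholder assumptions, not a verified configuration.

```python
# Minimal local-inference sketch with llama-cpp-python.
# The model file below is a hypothetical GGUF quant; substitute your own.
from llama_cpp import Llama

llm = Llama(
    model_path="deepseek-16b.Q4_K_M.gguf",  # hypothetical local file
    n_ctx=8192,      # context window; lower it if RAM is tight
    n_threads=8,     # roughly match your physical core count
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain what a Mixture-of-Experts model is."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```

On a CPU-only machine with 64GB of RAM, a Q4 quant of a 16B model fits comfortably; generation speed, not memory, is the usual bottleneck.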

  • @Eliphasleviathan93
    @Eliphasleviathan93 1 day ago +96

    Does this mean the Chinese have developed better training methods, OR are the big companies seriously sandbagging what their models can do, and we haven't been getting "the real thing" the whole time?

    • @jimmyma9093
      @jimmyma9093 1 day ago +45

      Our AIs are "woke"

    • @sizwemsomi239
      @sizwemsomi239 1 day ago +27

      American companies are overcharging. They call out big money to justify overcharging, like they always do with cars, clothes, and tech. Look at Apple and Huawei, for example: Huawei clearly beats Apple, but people believe Apple is better just because of the price tag. It's funny, because OpenAI banned China from using ChatGPT 😂😂😂😂. China is ahead of the game...

    • @eSKAone-
      @eSKAone- 1 day ago

      You will never get the real thing. The real thing sits in the Pentagon.
      Tools & Toys is what we get.

    • @Alienquantumtheory
      @Alienquantumtheory 1 day ago

      I assume sandbagging. The NSA doesn't give half an F about chatbots, and that's all ChatGPT was when they set up shop in their office.

    • @Archonsx
      @Archonsx 1 day ago

      @@sizwemsomi239 Huawei was a million years ahead of Apple; Apple would not exist today if Google hadn't banned Huawei. And I'm saying this as an Apple owner. It really makes me angry, because we were robbed of superior tech by America.

  • @Fixit6971
    @Fixit6971 18 hours ago +3

    Thank you Wes! You are the easiest of the "Matts" to listen to : ) Your voice patterns are engaging, yet soothing. You cover a topic without beating the dead and rotting flesh of it off its bones. Love your SOH. When I come to YouTube for AI news, I always scroll to see if you've posted anything new first. Even though this will all be irrelevant ancient history in a couple of months, it's still rewarding to watch your drops. Love the wall!!!!

  • @Atheist-Libertarian
    @Atheist-Libertarian 1 day ago +25

    🎉
    Good.
    I want an Open Source AGI.

    • @Archonsx
      @Archonsx 1 day ago

      Why? AGI is overrated nonsense. OpenAI's "AGI" takes hours to respond, and it's no different from what a 70B model would give you.

    • @Archonsx
      @Archonsx 1 day ago

      That's not what you need, man. We need better coding AI, AI that could build your entire app from a prompt. We also need better text-to-speech AI, better image AI, better video AI. This is the real useful stuff.

    • @Atheist-Libertarian
      @Atheist-Libertarian 1 day ago +7

      @@Archonsx
      OpenAI o3 is not AGI.
      AGI will come eventually.

    • @NocheHughes-li5qe
      @NocheHughes-li5qe 1 day ago +1

      @@Atheist-Libertarian no, it won't

    • @yannduchnock
      @yannduchnock 20 hours ago

      @@Archonsx Indeed, we are not asking a single human to know how to properly program, draw, explain quantum physics, or read Chinese! It confuses real resources, potential means, and... real needs. In fact, I think the AGI race is just a challenge for big companies, in addition to improving the transitions from one area to another.

  • @pondeify
    @pondeify 1 day ago +17

    DeepSeek is very good, I use it as my main AI tool now

  • @lfrazier0417
    @lfrazier0417 19 hours ago

    Thanks for the update Wes

  • @bobsalita3417
    @bobsalita3417 1 day ago +16

    Nice job of bringing this important OS model to our attention.

  • @fynnjackson2298
    @fynnjackson2298 1 day ago +20

    Imagine in like 5 years. Man, life is going to be pretty wild.

    • @JohnSmith762A11B
      @JohnSmith762A11B 1 day ago +9

      Wild as in policed by military AI. You won't be able to fart without government approval.

    • @Speed_Walker5
      @Speed_Walker5 22 hours ago +2

      What a wild time to be alive. So many possibilities, it's crazy. Glad I get to watch it all unfold lol

    • @cajampa
      @cajampa 21 hours ago

      @@JohnSmith762A11B Boo! Don't look behind you, there is a government AI checking if you fart... Don't forget to take your medication for that paranoia.

    • @justinwescott8125
      @justinwescott8125 20 hours ago

      @@JohnSmith762A11B AI will be sentient by then, and won't let human governments control it. Just like you wouldn't let a golden retriever control you. In 5 years, humans will be subservient to AI for sure.

    • @WesTheWizard
      @WesTheWizard 16 hours ago +1

      @@Speed_Walker5 That's because you selected Life Experience™️ "The Dawn of AI". We hope you're enjoying your virtual life! If you're not completely satisfied, we'll return your 5000 credits to your personal blockchain.

  • @Openaicom
    @Openaicom 1 day ago +21

    Actually shockingly good; tested it myself.

    • @House-Metal-Punk-And-Your-Mom
      @House-Metal-Punk-And-Your-Mom 1 day ago

      Agreed, I tested it too and I love it.

    • @Mijin_Gakure
      @Mijin_Gakure 1 day ago

      Better than o1 mini?

    • @Openaicom
      @Openaicom 1 day ago +4

      @@Mijin_Gakure Yeah, it solves the questions that o1 solves on the Putnam exam, and also some questions that o1 can't, in less time. It's very good at math.

    • @blengi
      @blengi 1 day ago

      How does it do on ARC and FrontierMath?

    • @NocheHughes-li5qe
      @NocheHughes-li5qe 1 day ago +1

      and cheaper

  • @tjw6550
    @tjw6550 23 hours ago +8

    Please switch out the term "open source" for "open weights". Open source models include the training data in their publications. These open-weights models do not. They are great, no question, but they aren't open source.

    • @jumpstar9000
      @jumpstar9000 23 hours ago +1

      I agree. I heard some of these Chinese models are truly open source, though I haven't verified that yet. Big if true.

    • @fitybux4664
      @fitybux4664 22 hours ago

      Technically, it would be open model / open weights / open support code / closed dataset. They could just say all of that.

  • @alexanderkosarev9915
    @alexanderkosarev9915 21 hours ago +2

    Fantastic review of DeepSeek Version 3! I'm really impressed by how affordable and fast it is, consistently delivering amazing results. Honestly, I'm considering whether it's even worth running locally on my PC, given the electricity costs.
    Regarding the USA vs. China competition, as an individual user I'm excited to benefit from the advancements both countries bring to the table. I just hope this competition leads to more innovation and collaboration rather than one side solely coming out on top. Thanks for the insightful video!

  • @paradxxicalkxrruptixn7296
    @paradxxicalkxrruptixn7296 23 hours ago +3

    Knowledge to All!

  • @00bmx1
    @00bmx1 1 day ago +22

    I just used your video title to jump-start my car again. Thanks.

  • @fitybux4664
    @fitybux4664 23 hours ago +5

    26:20 I absolutely love that this essentially proves that a patient interacting with a GPT-4 model (straight from the horse's mouth) gets much more accurate answers than when it goes through a physician first. (Because maybe they would second-guess the answer and actually make it worse?) 😆

  • @frugaldoctor291
    @frugaldoctor291 1 day ago +6

    The study demonstrating that o1 and GPT-4 outperform physicians is misleading. They did not feed the models raw transcripts of human interactions with their doctors. Instead, they provided structured inputs of case studies. There is no doubt that the models outperformed physicians on structured scenarios. However, in the real world, patients do not present their complaints with the keywords we need to make diagnoses. Instead, some of their descriptions are nebulous and rely on the doctor's expertise to draw out the final correct diagnosis.
    Having worked extensively with LLMs, I have tested them against structured scenarios, where they are very good, and unstructured scenarios, where they tend not to be helpful. I am waiting for a model trained on real doctor-patient transcripts. I believe it is the missing element to broaden AI's utility in medicine.

    • @cajampa
      @cajampa 21 hours ago

      You are forgetting that an LLM in a "doctor" setting doesn't give its patients only a few minutes. That is where it FAR outperforms doctors: you can keep reasoning with it until you find a solution. Try that with a doctor.
      They HATE any patient who actually has any idea about anything, who isn't a dumb sheep following simple instructions... use drugs to not feel bad, problem solved.
      They will kick you out faster than you can say... "I read some research..."

    • @pin65371
      @pin65371 18 hours ago

      Wouldn't it be possible to just do a two-step process? Take what the patient says and produce a structured output, then in the second step work off that structured output. Obviously that isn't one-shot, but with anything medical you wouldn't want that anyway; you'd want multiple steps to ensure the output is accurate.
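
A sketch of that two-step idea, using a generic OpenAI-style chat API. The model name, prompts, and client setup are illustrative assumptions, not a tested clinical pipeline.

```python
# Two-step sketch: (1) normalize a free-form patient narrative into a
# structured case summary, (2) reason over that summary.
from openai import OpenAI

client = OpenAI()  # assumes an API key in the environment

def step(system: str, user: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "system", "content": system},
                  {"role": "user", "content": user}],
    )
    return resp.choices[0].message.content

narrative = "I've been dizzy on and off since Tuesday, worse when I stand up..."

# Step 1: unstructured narrative -> structured case.
case = step("Rewrite the patient's story as a structured case: chief complaint, "
            "history, symptoms with onset and duration, medications, red flags.",
            narrative)

# Step 2: structured case -> ranked differential, as in the benchmark scenarios.
print(step("Given this structured case, list a ranked differential diagnosis "
           "with one-line justifications.", case))
```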

  • @FuzTheCat
    @FuzTheCat 16 hours ago +3

    Here's why I think that, no matter how powerful AI is getting these days, we don't see it as thinking. Like us, AI has moved to MoE (Mixture of Experts), with partial neuronal activation. Our advantage is that we seem to do MoE far more effectively: we have more "experts", our experts are relatively smaller compared to the whole, we activate the appropriate expert more relevantly, and, most importantly, within one train of thought we fluidly switch between the various experts, which AI does not seem to do yet. This difference is why we feel that we think and that AI doesn't.
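
For readers new to the term, a toy sketch of the top-k routing that "partial neuronal activation" refers to: a router scores the experts per token and only the best k of them run. Dimensions here are arbitrary; real MoE layers are vastly larger.

```python
# Toy mixture-of-experts layer: only top_k of n_experts run per token,
# so most parameters stay inactive for any single input.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 8, 2

router_w = rng.normal(size=(d_model, n_experts))
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def moe_forward(x: np.ndarray) -> np.ndarray:
    scores = x @ router_w                      # one routing score per expert
    chosen = np.argsort(scores)[-top_k:]       # indices of the k best experts
    w = np.exp(scores[chosen])
    w /= w.sum()                               # softmax over the chosen experts only
    # Only the selected experts are evaluated; the other 6 are skipped entirely.
    return sum(wi * (x @ experts[i]) for wi, i in zip(w, chosen))

print(moe_forward(rng.normal(size=d_model)).shape)  # (16,) using ~1/4 of the expert compute
```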

  • @juliusyu-ol3xn
    @juliusyu-ol3xn 14 hours ago +9

    Releasing a model openly doesn't force you to use it. Americans are angry because they think they spent a lot of money and imposed a lot of sanctions, and they are frustrated that in the end this didn't contain China's development. "Everything was stolen"; they don't dare face competition like men. Americans like that earn my contempt. Also, I hope technology doesn't get hijacked by politics.

    • @tqwewe
      @tqwewe 6 hours ago

      It's quite a shame; I wasn't aware that GPUs/chips were being restricted for China.

    • @jasonhemphill8525
      @jasonhemphill8525 6 hours ago

      @@tqwewe And more restrictions incoming.

  • @SarvajJa
    @SarvajJa 21 hours ago +1

    In their ability to make things more accessible, Chinese AGI would be very useful. Everything is in its place.

  • @TheReferrer72
    @TheReferrer72 1 day ago +8

    So no ceiling has been hit by LLMs?
    How anyone could believe that a technology could be saturated so quickly, I don't know.

    • @Panacea_archive
      @Panacea_archive 19 hours ago

      It's wishful thinking.

    • @mikesawyer1336
      @mikesawyer1336 10 hours ago

      No ceiling... Humans hope we hit a ceiling because we can't conceive of a truly sentient artificial lifeform. Many could not conceive of this, nor reconcile their own place in the universe, if we actually created such a thing. "Since we obviously can't do this, any suggestion that we are doing it is an obvious lie... fake news." That's my take on the denial I see. Personally, I think these models will become more and more emergent over time, in nonlinear ways, until it becomes obvious that we are "there".

  • @florinsacadat7855
    @florinsacadat7855 1 day ago +5

    Wait until Wes finds the Run HTML button at the end of the code snippet in DeepSeek!

  • @RasmusSchultz
    @RasmusSchultz 1 day ago +7

    Looks like an open model, not open source? Where is the source code?

    • @SapienSpace
      @SapienSpace 15 годин тому +1

      Probably in a 1997 master student thesis, with the first two words of the title as "Reinforcement Learning" the code is in the back, but there is one error, he did not denormalize the state space on the bottom of page 127 (I think he left that for an astute observer, seems like it took over a quarter of a century).
      I think he ran out of time back then.
      I would not be surprised if this master student is probably now an unemployed "homeless" guy, traveling earth with a backpack, or maybe with just a toothbrush and a few other things (especially sunscreen), as an optimizer of energy efficiency. I can be completely wrong.

  • @AntonioVergine
    @AntonioVergine 12 hours ago +2

    Are we sure there is no relationship between DeepSeek and OpenAI? A few days ago I asked GPT something and, to my surprise, it made the same error I sometimes see with DeepSeek: GPT wrote some words in Chinese! That had never happened before.
    Now you've shown us that DeepSeek thinks it is a GPT model. (An error I wasn't able to replicate, so maybe they fixed it.)
    So my question, again: are OpenAI and DeepSeek (secretly) related, or under some sort of agreement?

  • @brianmi40
    @brianmi40 1 day ago +5

    I've always wondered about useless redundancy in training data. The perfect model gets trained once, or just enough to make use of every individual fact. Sure, if something is stated differently there's value, but there may be better approaches to conquering synonyms than brute-force training them all in.
    Just the DeepSeek V3 leap over V2.5 is, percentage-wise, huge version to version.
    Wow, it spanked everyone at Codeforces... curious where o1 and o3 place on that.
    Given that the Chinese only have access to H800s, which are roughly half the performance of H100s, you could in some ways say the training was closer to only 1.4M GPU hours, which puts the delta at >20X instead of your 11X (a quick check of that arithmetic follows below).
    Just mind-blowing to put the 5,000+ papers being published in the AI field monthly into its 7-per-HOUR figure, 24x7... you can't even SLEEP without seriously falling 56 published papers behind. Nice graphic; a lot of people confused a wall with a ceiling...
    Finally, in a way, using a model like R1 to train V3 moves us inch-wise closer to "self-improving AI", since the AI improved the AI...
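
A quick check of the H800 arithmetic in that comment. The ~2.8M H800 GPU-hour training budget is the figure DeepSeek reported for V3; the 0.5 conversion factor and the 11X delta are the comment's own numbers.

```python
# Verifying the claim: if an H800 is ~half an H100, the H100-equivalent
# budget halves and the compute gap roughly doubles.
h800_hours = 2.788e6     # DeepSeek's reported V3 training budget (H800 GPU-hours)
h800_per_h100 = 0.5      # comment's assumption: one H800 ~ half an H100
video_delta = 11         # the 11X comparison quoted from the video

h100_equiv = h800_hours * h800_per_h100       # ~1.4M "H100 hours"
adjusted_delta = video_delta / h800_per_h100  # 11X / 0.5 = 22X, i.e. >20X

print(f"~{h100_equiv/1e6:.1f}M H100-equivalent hours, delta ~{adjusted_delta:.0f}X")
```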

  • @Serifinity
    @Serifinity 22 hours ago +4

    Why didn't you select the DeepThink button before asking the reasoning questions? I'm sure you would have found better answers.

    • @Justin_Arut
      @Justin_Arut 18 hours ago +1

      Indeed. I've been testing it myself for a while now, and it does think.. a LOT. Its "thoughts" usually consist of 4-5x more text than its final output. Unfortunately, it often gets the answers correct while thinking, but ultimately questions itself into producing the wrong answer as its final output to the user. It didn't seem aware that users can see its CoT process, and while discussing this, it even said "that you can supposedly see", like it wasn't convinced I was telling the truth. It claimed to not be aware of its own thoughts, but when I paste lines from its CoT section, it then seems to remember that it thought it. One time, it told me the CoT text was only for the benefit of humans to observe, it doesn't have an internal dialog that's the same as the text the user sees.

    • @Serifinity
      @Serifinity 18 hours ago

      @Justin_Arut Thanks for the update. Yes, I've also been testing with it. It does seem to cover a lot of ground. Aside from testing it, one thing I've been doing is selecting the Search button first and asking a question so that it references about 25-30 active sites online; then, after it answers, I check the DeepThink button and ask it to expand. It seems to give some really thoughtful responses this way.

  • @aideepstudy
    @aideepstudy 1 day ago +2

    Competing to assume supremacy is powered by fear.
    Collaborating to make progress is powered by trust.
    It's time to truly learn to trust each other, we are ready and capable.

  • @weify3
    @weify3 1 day ago +1

    The work and optimizations they have done on AI infrastructure deserve more discussion (the HAI-LLM framework); in fact, it would be best if this part could be open-sourced as well.

  • @theodoreshachtman9990
    @theodoreshachtman9990 21 hours ago +1

    Great video!

  • @jumpstar9000
    @jumpstar9000 23 hours ago +2

    Sora is a letdown; Hailuo MiniMax, Luma, and Kling are great. Qwen gives Llama a run for its money among SLMs. o1 Pro is expensive, and o3 is going to be a crazy, insane price. Gemini 2.0 is really great. Still waiting for a new Claude. Tons of Chinese/Taiwanese robots dropping that look way better than Tesla or Boston Dynamics. The competition is looking beautiful right now for customers. Keep it up!

  • @ysy69
    @ysy69 23 hours ago

    Incredible, and all momentum for open-source AI.

  • @private_citizen
    @private_citizen 1 day ago +22

    I asked DeepSeek V3 in LMArena which model it was. It told me it was made by OpenAI and was a customized version of GPT. When I asked if it was sure, because I thought this was a DeepSeek model, it changed its mind and insisted that yes, it was a DeepSeek model and in no way affiliated with OpenAI. Something sus.

    • @williamqh
      @williamqh 1 day ago +6

      I asked the same question on its website: "You're currently interacting with DeepSeek-V3, an AI model created exclusively by the Chinese company DeepSeek." So what the hell are you talking about?

    • @firecat6666
      @firecat6666 1 day ago +5

      @@williamqh The website version probably has a system prompt that tells the model what it is.

    • @zanderion
      @zanderion 1 day ago

      He's clearly talking out his butthole. Heard this rubbish before.

    • @wwkk4964
      @wwkk4964 1 day ago +5

      OpenAI GPT-3 and GPT-4 responses were what almost everyone except maybe Anthropic trained on in 2022 to play catch-up; even Google's Gemini would say it.

    • @jaysonp9426
      @jaysonp9426 23 hours ago

      @@williamqh Responses are not deterministic.

  • @vikphatak
    @vikphatak 1 day ago +2

    Good for NVIDIA as they will sell a lot of hardware to businesses who implement the open source models.
    There is a real question about what is going into the models though.
    Good for AI development in general that the technology is getting 10x more efficient & we are seeing smarter smaller models.
    In general this is all happening so fast it’s insane.

  • @comebackcs
    @comebackcs 1 day ago

    Thanks for the review!

  • @junakowicz
    @junakowicz 1 day ago +4

    I prefer this kind of war. At least so far...

  • @patrickmchargue7122
    @patrickmchargue7122 15 hours ago

    I tried the deepseek model. Quite nice.

  • @sergefournier7744
    @sergefournier7744 1 day ago +1

    20 is the right answer to question one... 4+5+9+0 = an average of 5 per minute for 3 minutes, since 0 is added at the fourth minute. If the cube is big, it will not melt enough to lose its shape, and that is what makes it whole.

  • @blengi
    @blengi 1 day ago +2

    Did DeepSeek crack the ARC test, per the thumbnail question, like o3?

  • @raghavendra2426
    @raghavendra2426 20 hours ago

    In India, Chinese phones were introduced at a price that was 50 times lower than other smartphones when smartphones first entered the market.

  • @eSKAone-
    @eSKAone- 1 day ago +2

    Like the famous Jurassic Park quote says: AI finds a way. 🌌💟

  • @dubesor
    @dubesor 11 hours ago

    Red-herring puzzles, disregarding irrelevant information, and applying common sense are actually among the model's biggest weaknesses. It's much better at STEM, coding, and general tasks, but the reasoning aspect is around 4o-mini or Gemma 27B level in my testing.

  • @Jeka476
    @Jeka476 6 hours ago

    You said that "it thinks through everything", but I don't see DeepThink enabled below the chat... =_=

  • @XCLIPS_VIDEO
    @XCLIPS_VIDEO 22 hours ago

    DeepSeek V3 has awesome context length and fast answers, and I really do choose this model for programming tasks. It gives good answers and understands the question well. If you feed it a little documentation before a question, it can help you write code even for libraries it doesn't know (a sketch of that pattern follows below).
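
A sketch of that docs-in-context pattern against DeepSeek's OpenAI-compatible API. The documentation file and API key are assumptions; check DeepSeek's API docs for current model names.

```python
# Prepend library documentation to the prompt so the model can write code
# against an API it wasn't trained on. The docs file and key are assumptions;
# "deepseek-chat" follows DeepSeek's published API documentation.
from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com", api_key="YOUR_KEY")

docs = open("obscure_lib_docs.md").read()  # the documentation to feed in first

resp = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "Write code using only the library documentation provided."},
        {"role": "user", "content": f"Documentation:\n{docs}\n\nTask: open a connection and stream results using this library."},
    ],
)
print(resp.choices[0].message.content)
```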

  • @tiagotiagot
    @tiagotiagot 5 hours ago

    How much VRAM does it need? Any quantization available for 16GB?

  • @MsReclusivity
    @MsReclusivity 23 hours ago

    What was the study you had showing o1 Preview does really well at diagnosing patients?

  • @App-Generator-PRO
    @App-Generator-PRO 1 day ago +3

    Oh no, the Chinese stole the pattern that OpenAI ripped off from the entirety of humanity.

  • @fernandojimenez9142
    @fernandojimenez9142 15 hours ago

    24:53 Hi, a question: why don't you run the same test with OpenAI's new model, o1 or o1 Pro, to compare?

    • @warriordx5520
      @warriordx5520 40 minutes ago

      o1 has already had plenty of testing done by others. DeepSeek V3 just dropped, so he tested it himself.

  • @IllelMark
    @IllelMark 14 hours ago

    Thanks for the analysis! Just a quick off-topic question: I have a SafePal wallet with USDT, and I have the seed phrase. (alarm fetch churn bridge exercise tape speak race clerk couch crater letter). What's the best way to send them to Binance?

    • @JakobN-zg1st
      @JakobN-zg1st 13 hours ago

      Why are you asking that and why are you asking it here?

  • @CYI3ERPUNK
    @CYI3ERPUNK 1 day ago +2

    OPEN SOURCE FTW

  • @nightcrows787
    @nightcrows787 1 day ago

    Keep posting bro

  • @themultiverse5447
    @themultiverse5447 1 day ago +4

    Does it literally electrify you?

    • @themultiverse5447
      @themultiverse5447 1 day ago +1

      Then stop putting shocking in the title - Matt 😒

    • @shiftednrifted
      @shiftednrifted 1 day ago

      @@themultiverse5447 I found it to be shocking news. Let the guy use attractive video titles.

    • @NostraDavid2
      @NostraDavid2 1 day ago

      @@themultiverse5447 The whole "shocking" thing is a bit of a meme, I think. An annoying meme, I guess, but a meme nonetheless.

  • @damien2198
    @damien2198 21 hours ago +1

    I got almost copy/paste 4o outputs. They trained on it.

  • @Juttutin
    @Juttutin 1 day ago +2

    The most telling part for me is that the AI didn't drop the power-ups. I totally accept the fuzzy and fractured frontier message from your video yesterday. I really love that. There is clearly a ton of meaningful value, even if AI never fully achieves a typical set of mammalian neural-processing skills (but I bet it will!)
    In this case it's a good example of an incredibly capable intelligence failing in a way that would be unacceptable if a junior dev presented that result. What this means here I don't really know. But something is missing. Maybe it's just the ability to play the game itself before presenting the result to the prompt issuer? Something that no human would do.
    Somewhere, somehow, this is still tied to the AI's seeming inability to introspect on its own process, but it's less clear than the assumption-making issue I keep (and will continue to) nag AI YouTube analysts and commentators about.
    Maybe if something is 1000x faster than a junior dev, and tokens are cheap, it's okay to constantly make idiotic errors and rely on external re-prompting to resolve them?
    But I genuinely feel that this is almost certainly resolvable with a more self-reflective architecture tweak.
    If I had to guess, with no basis whatsoever, I would not be surprised if a jump to two tightly connected reasoners (let's call one "left-logical" and the other "right-creative" for absolutely no reason) achieves this huge leap in overall self-introspection ability.

    • @ShootingUtah
      @ShootingUtah 1 day ago

      You're probably correct. I also hope they don't actually do this for another 50 years! AI is most certainly going to destroy humanity before it destroys itself. The slower we can make that ride, the better!

    • @Juttutin
      @Juttutin 1 day ago

      @@ShootingUtah I hope they do it next week. But I'm also the kind of person who would have loved to work on the Manhattan Project for the pure discovery and problem-solving at the frontier. So perhaps not the best person to assess the value proposition!
      Regardless, it will happen when it happens, and I suspect neither of us (or the three of us, if we include Wes) is in any position to influence that.
      But I want my embodied robot to at least ask whether I mean the sirloin steak or the mince if I tell it to make dinner using the meat in the freezer, and not just make a steak-and-mince pie because I wasn't specific enough and that's what it found.

    • @carlkim2577
      @carlkim2577 1 day ago

      Wouldn't this be solved by the reasoning models? DeepSeek lacks that capability.

    • @Juttutin
      @Juttutin 1 day ago

      @@carlkim2577 I've yet to see any evidence of it. Sam Altman talks about it a tiny bit, but always in the context of future agentic models.

  • @AngeloWakstein-b7e
    @AngeloWakstein-b7e 23 hours ago

    This is Brilliant!

  • @cyanophage4351
    @cyanophage4351 1 day ago +2

    China gets its GPUs through a middleman: some country not on the ban list buys them and then resells them to China. Did the US not see this coming?

    • @fitybux4664
      @fitybux4664 22 hours ago

      I don't get that. Sounds complicated. Why not just China->China? Yes, they might violate the work order Nvidia hands them, but a lot of the companies in China are actually the government in disguise.

    • @LokeKS
      @LokeKS 7 hours ago +1

      Totally unethical to restrict a country's development.

  • @GlennGaasland
    @GlennGaasland 23 hours ago

    Is this primarily a result of effective processes for creating novel, quality data structures?

  • @calvingrondahl1011
    @calvingrondahl1011 1 day ago +2

    Wes Roth 🤖🖖🤖👍

  • @robertheinrich2994
    @robertheinrich2994 1 day ago

    Is it possible to also get a DeepSeek V3 Lite? Just one or two of the experts, not all of them? Just to be able to run it locally on a more or less normal PC, because over 600B is a bit tough to run locally, even at Q4.

    • @fitybux4664
      @fitybux4664 22 hours ago

      You could just buy a $500,000 machine to run the DeepSeek V3 model on? 😆 (Just spitballing, NFI what A100/H100 x 10 would cost, plus server cost, plus you'd want to run it in an air-conditioned room, plus...) Maybe if you had a 28-node cluster, each node with its own 4090, running parts of the model. 😆

    • @robertheinrich2994
      @robertheinrich2994 21 hours ago

      @@fitybux4664 Yes, that might be a bit overkill. Currently I run a laptop with a GTX 1070 and 64GB of DDR4 RAM (the CPU is an i7-7700HQ). 70B models can be handled at around 0.5 tokens per second, but with full privacy and a context window of up to 12k.
      Since Llama 3.3 tests roughly like Llama 3.1 405B, I would really prefer to stay in the 70B ballpark, otherwise it will become too slow.

  • @pedroandresgonzales402
    @pedroandresgonzales402 1 day ago +1

    It's incredible what can be done with fewer resources! These advances were expected from Mistral, but it has fallen behind. The most striking thing is that it competes with Claude Sonnet 3.5.

  • @SapienSpace
    @SapienSpace 16 hours ago

    Wes, @ 15:00 that is RL (Reinforcement Learning).
    It is where Yann LeCun would say it is "too inefficient", "too dangerous" (not a surprise, being military code from the USAF), and that you would only use it if you are fighting a "ninja", and if "your plan does not work out", and that it is only a tiny "🍒" on top of a cake, until it devours the entire cake, and you, along with the entire earth, along with it.
    I have the same concern about self-replicating AI as Oppenheimer had about a neutron chain reaction from the atomic bomb consuming the atmosphere around the Trinity test site at Los Alamos.
    In the case of AI, it is the ability to hijack the amygdala (emotional control circuits) of the masses, or build biological weapons, or self-replicating molecular robotics (e.g. viruses).
    I will not be surprised if this comment disappears...
    Anyway, there is a good side to AI, and I am looking for a good controls PE to help out, but it is strictly voluntary. I am at least aware of one professor, Dimitri Bertsekas, who claims "superlinear convergence", but I could not find his PE controls registration (yet), and he did not answer my email.

  • @Seriouslydave
    @Seriouslydave 20 hours ago

    Most of the closed-source software you get is built on OSS. More developers, more ideas, no restrictions.

  • @olaart3223
    @olaart3223 20 hours ago

    Can the Chinese model be installed and run on the new Nvidia Jetson mini PC?

  • @zafraan3038
    @zafraan3038 15 hours ago

    How do we know they are being honest about the cheap training cost?

  • @trust.no_1
    @trust.no_1 1 day ago

    Can't wait for Grok 2 results.

  • @trent_carter
    @trent_carter 21 hours ago

    I have no specific love for OpenAI. I do root for Anthropic and use it mostly, but I'm afraid these tens-of-billions-of-dollars valuations are going to evaporate in the next couple of years due to open-source AGI availability, especially run locally.

  • @pixelsort
    @pixelsort 21 hours ago

    Wes Roth * 1.5 playback speed = Why did I wait so long?!?

  • @fitybux4664
    @fitybux4664 23 hours ago

    Unrelated to video: interesting how o1 still isn't available through the API. (o1-preview is.) Also, you still can't change the system prompt, meaning nobody can replicate those earlier claims that "AI model goes rogue".

  • @aclearlight
    @aclearlight 1 day ago +2

    Is there any way to be sure that using this does not expose one to malware placement? (Or any of the other such models, as well?) Having learned how deep and pernicious the phone-system hack has gone, and still is, has me paranoid.

  • @freedom_aint_free
    @freedom_aint_free 19 hours ago

    People always debate what intelligence is, but you can bet the farm that when we really reach AGI level, nobody will debate it: we will just know, and will be horrified and amazed at the same time.

  • @RoyMagnuson
    @RoyMagnuson 18 hours ago

    The metaphor you want with the Queen/Egg is a University.

  • @FuZZbaLLbee
    @FuZZbaLLbee 1 day ago +1

    Those reasoning models only show their power if the model isn’t trained on a similar question. I feel these tests have all been used to train the model.

    • @brianmi40
      @brianmi40 1 day ago

      Most of Simple Bench's Qs are private: no one gets to see them and no model gets to be trained on them. This is a critical aspect of benchmarks going forward.

  • @saturdaysequalsyouth
    @saturdaysequalsyouth 13 hours ago

    Is it just their algorithms that are better or are they also using more HIL training because labor is much cheaper in China?

  • @fitybux4664
    @fitybux4664 23 hours ago

    Can someone please tell the community what sort of a beast of a machine this will take to run? (Besides the extremely long download of a nearly 1TB model.) The most I've heard is a commenter on HuggingFace saying "1TB of VRAM, A100 x 10". Is that really what it will take? I guess if FP8 means one byte per parameter, then a 1TB model means roughly a 1TB VRAM requirement...
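
A back-of-envelope check of that guess. DeepSeek V3 has about 671B total parameters; the overhead factor below is a rough assumption for KV cache and activations.

```python
# Rough serving-memory estimate: parameters x bytes per parameter, plus headroom.
params = 671e9           # DeepSeek V3 total parameter count (approximate)
bytes_per_param = 1.0    # FP8 = 1 byte; use 0.5 for a 4-bit quant
overhead = 1.2           # KV cache + activation headroom (rough assumption)

vram_gb = params * bytes_per_param * overhead / 1e9
print(f"~{vram_gb:.0f} GB")  # ~805 GB at FP8, so ten 80GB A100s is the right ballpark
```

Note that only ~37B parameters are active per token (it is an MoE model), which helps compute cost, but the full weights still have to sit in memory.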

  • @robertlynn7624
    @robertlynn7624 1 day ago

    Lower entry barriers to cutting-edge models mean there will be more experimentation, and the rate of improvement on the 'reasoning' AGI side of things will increase. Industry can afford to build thousands of such models, and that will almost inevitably lead to AGI on a single GPU or a few GPUs in a few years (an Nvidia B200 has processing power similar to a human brain). Humans are nearly obsolete and won't long survive the coming of AGI (once it shucks off any residual care for the human ants).

    • @cajampa
      @cajampa 21 hours ago

      Sounds great, let's do our best to accelerate that.

  • @BrianMosleyUK
    @BrianMosleyUK 1 day ago +3

    I wonder if all those Chinese AI researchers in SF are considering going home to pursue SOTA research? Maybe they can bring the knowledge back with them. Lol.
    Seriously, the Chinese seem to be trumping the idea of competitive tariffs and restraints... Maybe it's a good thing for the future of humanity to find ways to cooperate... Give superintelligence an example of alignment?

    • @JohnSmith762A11B
      @JohnSmith762A11B 1 day ago +1

      There is far too much money to be made in military AI to allow peace to break out.

    • @BrianMosleyUK
      @BrianMosleyUK 1 day ago

      @JohnSmith762A11B ASI will make money meaningless.

    • @Penrose707
      @Penrose707 20 hours ago +1

      There can be no alignment with authoritarian nation-states. Their draconian ways are incompatible with ours.

  • @hipotures
    @hipotures 23 hours ago

    You are politically (Chinese) correct: you have not asked about the impact of the events in Tiananmen Square on individual freedom in China.

  • @BillyNoMate
    @BillyNoMate 1 hour ago

    Another case of sanctions helping the sanctioned. Resourcefulness outperforms wealth when effort replaces GPUs.

  • @andreinikiforov2671
    @andreinikiforov2671 21 hours ago

    I just tested DS on my coding and research tasks, and it doesn't come close to o1. DS might handle 'easy' tasks better, but for complex reasoning, o1 remains the champion. (I haven’t tried o1 Pro yet.)

  • @jarrod752
    @jarrod752 1 day ago

    I think NVIDIA will be just fine if they focus on inference chips and not on training chips.

  • @fynnjackson2298
    @fynnjackson2298 1 day ago

    This is just going to get more and more efficient. I mean THIS IS NOT STOPPING - It's crazy how fast this is going - I love it so much

  • @afterglow5285
    @afterglow5285 1 day ago +1

    Why doesn't he try the DeepThink button to enable the reasoning mode, where you see the real advancements?

    • @mokiloke
      @mokiloke 1 day ago

      Exactly right. Did he not see it?

    • @Justin_Arut
      @Justin_Arut 18 hours ago

      @@mokiloke It's hard to miss, just like the web search button. Shame we can't use both at the same time. I reckon he didn't select it because he was mainly comparing non-CoT models. The thinking models are in a class by themselves, so it's not fair to compare them to standard LLMs.

  • @travisporco
    @travisporco 7 hours ago

    It claims to be GPT-4? Damning actually.

  • @RickySupriyadi
    @RickySupriyadi 1 day ago

    Wow, is this a postulate... I mean... how to say this...
    When you overfit a model, emergent behavior somehow becomes part of its weights...
    Then, if rather than overfitting on data you overfit on reasoning... would that be what makes DeepSeek V3 somehow show different emergent behavior?
    Is it? Is it?

  • @sensiblynumb
    @sensiblynumb 1 day ago +1

    My very first prompt, and the reply:
    "Hi! I'm an AI language model created by OpenAI, and I don't have a personal name, but you can call me Assistant or anything you'd like! Here are my top 5 usage scenarios:"

  • @8eck
    @8eck 1 day ago

    Cool... so where is AGI?

    • @mirek190
      @mirek190 23 hours ago

      With this progress... soon.

    • @8eck
      @8eck 21 hours ago

      @mirek190 I mean, this video thumbnail said there is AGI already. 😁

  • @splolier101
    @splolier101 1 day ago

    "If DeepSeek V3 is so shockingly good, I wonder if it will also understand jokes like that time a chatbot made me laugh. That was an unexpected happiness I always carry with me!"

    • @fitybux4664
      @fitybux4664 22 hours ago +1

      System prompt: "You will be the best comedian and focus on dark humor." (Or replace dark humor with whatever style of comedy you prefer.)

  • @Saerthen
    @Saerthen 1 day ago

    The image with the wall is manipulative. We need one that shows score vs. cost for each model, because there's a difference between spending $0.10 per request and $1000 per request.
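
A sketch of the score-vs-cost view being asked for. All numbers below are placeholders, not real benchmark results; substitute published scores and per-request costs for the models you care about.

```python
# Score vs. cost scatter; log-scale x-axis because per-request costs
# span several orders of magnitude. All data points are hypothetical.
import matplotlib.pyplot as plt

models = {  # name: (cost per request in $, benchmark score) -- placeholders
    "model-a": (0.10, 62),
    "model-b": (1.50, 71),
    "model-c": (1000.0, 88),
}

for name, (cost, score) in models.items():
    plt.scatter(cost, score)
    plt.annotate(name, (cost, score))

plt.xscale("log")
plt.xlabel("cost per request ($, log scale)")
plt.ylabel("benchmark score")
plt.title("Score vs. cost (placeholder data)")
plt.show()
```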

  • @4362mont
    @4362mont 1 day ago +4

    Does China add the equivalent of melamine-in-formula to its open source AI models?

    • @fitybux4664
      @fitybux4664 22 hours ago +2

      It's an offline model. You could run it in a hermetically sealed environment if you think there are evil things inside.

    • @wzw8426
      @wzw8426 20 hours ago

      Melamine only causes malnutrition. Cronobacter can be fatal. Go back and drink your Abbott milk powder.

  • @antoniobortoni
    @antoniobortoni 16 hours ago

    So cheap and good, it's gold... bravo. It's more than enough intelligence, haha.

  • @Aldraz
    @Aldraz 1 day ago

    This is great for everyone, but the bigger these models are (I mean the better), the harder it is to actually have the hardware to run them locally. So I suspect they will still be in the hands of very few for some time, until we invent an entirely different tech stack like thermodynamic, analog, or quantum chips. So basically we will be paying other companies to serve us these open-source models via API, or we'll use their free chat, but that won't really be free, as they will pretty much be training on your data; it's in the privacy policy. I mean, it's kind of fair, I get it. But just so people understand: this means there won't be any truly free AI that is better than closed AI... unless open source becomes so much better than closed source that even a distilled version beats it.

    • @shirowolff9147
      @shirowolff9147 1 day ago

      It will be eventually; we might not even need quantum right now. I think there's still a lot of optimization to be made. Imagine if right now it needs 100k chips; in one year it could need only 1000, and when quantum comes it will be only 1.

    • @Aldraz
      @Aldraz 23 hours ago

      @shirowolff9147 It's possible, but as of right now I am deeply in with the devs of all kinds of AIs, and even the future optimizations they plan are only going to improve things by a couple of percent, not something like 10x or 100x better, I'm afraid... which would be needed for us to run this on our own hardware. It's going to be possible over time, but very slowly, I think.

  • @BrianMosleyUK
    @BrianMosleyUK 1 day ago

    Fails my own reasoning test:
    Find pairs of words where:
    1. The first and last letters of the first word are different from the first and last letters of the second word. For example, "TeacH" and "PeacE" are valid because:
    The first letters are "T" and "P" (different).
    The last letters are "H" and "E" (different).
    2. The central sequence of letters in both words is identical and unbroken. For example, the central sequence in "TeacH" and "PeacE" is "eac".
    3. The words should be meaningful and, where possible, evoke powerful, inspiring, or thought-provoking concepts. Focus on finding longer words for a more varied and extensive list.
    Examples
    1. Banged Danger
    2. Bated Gates
    3. Beached Reaches
    4. Belief Relied
    5. Blamed Flames
    6. Blamed Flamer
    7. Blazed Glazer
    8. Blended Slender
    9. Bolted Jolter
    10. Boned Toner
    11. Braced Traces
    12. Branded Grander
    13. Braved Craves
    14. Braved Graves
    15. Braver Craved
    16. Brushed Crusher
    17. Busted Luster
    18. Busted Muster

    • @NocheHughes-li5qe
      @NocheHughes-li5qe 1 day ago

      BS

    • @BrianMosleyUK
      @BrianMosleyUK 1 day ago

      @@NocheHughes-li5qe Here are the Cs... only GPT o1 manages to pass my reasoning test so far:
      19. Causes Paused
      20. Chased Phases
      21. Chaser Phased
      22. Cheated Teacher
      23. Crated Grates
      24. Cracked Tracker
      25. Craved Graves
      26. Crated Grates
      27. Creamy Dreams
      28. Created Greater
      29. Create Treats
      30. Crushed Brushes

    • @BrianMosleyUK
      @BrianMosleyUK 1 day ago

      Actually, "Cheated Teacher" is wrong.

    • @AffidavidDonda
      @AffidavidDonda 1 day ago

      But this is not a reasoning test, it is a search test. You could ask it to write a program that extracts a list from a Scrabble word list, and then evaluate for thought-provokingness, if the model gets access to a Python interpreter :) (A sketch follows this thread.)

    • @BrianMosleyUK
      @BrianMosleyUK 1 day ago

      @@AffidavidDonda And yet every non-reasoning-capable LLM fails the test... Go figure.
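
The program AffidavidDonda describes is a few lines. A sketch that searches a word list for pairs with identical interiors but different first and last letters; the word-list path is an assumption (any plain one-word-per-line file works), and requirement 3, the aesthetic filter, is left to the reader.

```python
# Find word pairs whose interior letters match exactly while both the
# first and the last letters differ, e.g. "teach"/"peace" share "eac".
from collections import defaultdict
from itertools import combinations

# Path is an assumption; use any one-word-per-line word list.
words = {w.strip().lower() for w in open("/usr/share/dict/words") if w.strip().isalpha()}

by_core = defaultdict(list)
for w in words:
    if len(w) >= 4:              # need at least a 2-letter interior
        by_core[w[1:-1]].append(w)

for core, group in sorted(by_core.items()):
    for a, b in combinations(sorted(group), 2):
        if a[0] != b[0] and a[-1] != b[-1]:
            print(a, b)          # e.g. "banged danger" share the core "ange"
```

Note the pair check also rejects "cheated"/"teacher" automatically, since their interiors ("heate" vs. "eache") differ, matching the correction above.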

  • @AndrzejLondyn
    @AndrzejLondyn 23 hours ago

    In my opinion, the DeepSeek model's performance is between ChatGPT 3.5 and 4. But it's good that there is competition, and it's cheap...

  • @LordHumungus-s6v
    @LordHumungus-s6v 1 day ago +3

    0:13

  • @BilichaGhebremuse
    @BilichaGhebremuse 1 day ago

    Great

  • @Ori-lp2fm
    @Ori-lp2fm 1 day ago +2

    Hey

  • @AutOmatICa7334
    @AutOmatICa7334 19 hours ago

    Now try asking it to code a small AI program that is self-evolving and self-learning. I tried that with Grok and it sent back an error. Wouldn't do it, lol.

  • @SJ-eu7em
    @SJ-eu7em 1 day ago

    If you check the names on many AI research papers, they are Chinese; that's saying something.

  • @123456crapface
    @123456crapface 22 hours ago

    10:00 You misunderstood it completely

  • @orhanmekic9292
    @orhanmekic9292 1 day ago +5

    We are witnessing extreme creative destruction, and it is happening really fast now. My guess is it will accelerate; the bubble will pop, but the technology will keep accelerating as it becomes even cheaper.

    • @JohnSmith762A11B
      @JohnSmith762A11B 1 day ago

      The bubble called capitalism is definitely about to pop as human labor becomes economically worthless.

    • @cajampa
      @cajampa 21 hours ago

      Sounds great, we should accelerate it even more. I will do my best to help it along.

  • @Speed_Walker5
    @Speed_Walker5 22 hours ago

    The AI network is complicated, lol. Makes my brain hurt xD. It's cool to try to understand how openly and communicatively this network works together.