I'm an AI expert with a few mathematics papers of my own, and I am a big fan of your videos. They provide a lot of guidance on what to be aware of and are generally full of spot-on intuition. However, I have to offer some criticism of this choice of paper and presentation. First off, if you're ever reviewing a paper, it helps to have a link to the paper in the description. But more importantly, I looked through the paper, and nowhere in it do I see any experiments or ablations conducted to demonstrate that any of the things they assume about the way o1 works are true. As far as I can tell, the entire paper is one big speculation and does not belong among research papers. Indeed, multiple sections (3.4, 4.3, etc...) in the paper use the word "speculation" in their titles. I too have my speculations about how o1 accomplishes what it does, but unless I run the experiments and demonstrate empirically that my speculations are likely to pan out, I'm not going to write a paper about it. Sure, I'll write a blog post or something, but I see this paper being announced as: they figured it out... this is how OpenAI did it... and this, I've got to be honest, is upsetting to a researcher. If possible, I'd suggest having a researcher read an article and give you their take before you publish a summary of a research article on UA-cam. It's a worthy idea, but I would present it as what it is, speculation, instead of presenting it as discovery.
Well said. This is not the first time he clearly doesn't do his research before publishing. In conclusion: This channel is not a reliable source about AI.
"the main techinique behinds o1 is the reinforcement learining."
That's a line from the abstract of the paper. Imagine having access to the best LLMs but still making this many spelling and grammar errors.
@@victorrielly4588 To be honest, what these people are talking about sounds algorithmically expensive, as some parts of the algorithm may conflict with others. This could be implemented with sparse methods to save on compute.
China needs the bragging rights of publishing the most scientific papers. Well, they got that record, and it mostly consists of insular citations and general speculative slop.
Videos here (on YT in general) are for entertainment. It's like asking for paper references on why Thor's hammer flies...
Level 6: AI that can govern a city
Level 7: AI that can rule every aspect of the unified world government, produce, and download itself into its global system's every individual bot connected to the hive
Level 8: AI that has coded itself into the quantum fabric of space-time and controls the multiverse from higher dimensions
First we need to use AI to eliminate corruption. As your homework, please tell me one existing government that isn't totally, utterly corrupt. AI can save us, but no government will utilize it for the greatest good; instead we will have drones and robot dogs that will police our every move and word... Hopefully UA-cam doesn't censor this comment, as it has proven itself to be as good as or better than China at preventing me from saying such things.
you're on to something
A story as old as time
Country government as a service is our future.
AI is not Scarlett Johansson [Lucy 2014] :)
WHERE IS THE LINK to the paper? Thanks in advance, please post these to EVERY video. Cheers.
Please post a link to your YT channel. Thanks in ADVANCE.
The argument against open-sourcing more advanced models like Claude Opus or O1 and O3 often centers on preventing malicious actors from causing harm. However, this argument seems weak given that current open-source models already possess significant capabilities that could be misused - yet we haven't seen widespread catastrophic events from their release. Most malicious use cases are already possible with existing open models like Llama or Mistral. If the concern was valid, we should have already witnessed major incidents from bad actors using these accessible models. The lack of such events suggests the "bad actor" argument may be overstated when discussing open-source policies for more advanced models.
They throw that open source term around, but I don't think these guys know what open source means, because so many times I see them state it when it's not true. It appears that Llama and Mistral are the only true open-source models.
Brother, don't you know when the gov says it's for our safety, it's never for our safety.
@@yekim008 Gov? It's private corps
Authoritarians have always tried to propagate fear in order to consolidate power for themselves.
False conclusions. Bad actors are not necessarily fast but smart. Catastrophe is about impact, not speed.
US: OpenAI
China: Open A.I.
Rest: Give-me AI
US: hires Chinese experts, builds AI, but somehow it's made in America....
Where is the link to the paper?
Added to description
USA: invents it
China: copies it
Europe: regulates it
USA with a lot of immigrants: invents it
China: open sources it
Europe: regulates it
Now it's more precise! 😊
😂 brilliant!!!
China: improves it
India: imports it
Yeah, like EVs: copy it and then improve it
Thanks!
Thank you!
Thanks for the share, Matthew. Emergence AI’s orchestrator looks intriguing. Subscribed to the newsletter!
Very good video. We should support more videos like these. Good job, Matthew!
People should be thrown off UA-cam for using a publication and not sharing the link to it.
Whatever type of content is being reacted to, it's not YOUR work; send your audience to the source.
Waaaaaaaa
I too would like the source, maybe in a less hyperbolic tone.
Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective
That's the title, use google to find it. UA-cam doesn't let me post the link, it's on arxiv. He never said it was his own work, and it's easy enough to find it, no?
haters will hate
As Wes Roth pointed out, China is really just doing the Open part of Open AI. Open source with extra steps.
Please continue to analyze additional AI research papers and articles in your future videos. The content is excellent and highly valuable. ❤
In my opinion, solving a problem with unknown answers is the exact same as solving a problem with a known answer from scratch. Postulate > Gather Data > Test Against Reality > Iterate.
o1 Free and simple alternative: Ask the 4o model to analyze for 10 long responses before finally answering. Produces similar results to o1. Try it.
No need to over complicate things: Higher response lengths => Better final responses.
What's the prompt please? I wanna try
Add this to any prompt:
Guidelines: Implement the task step by step. Please do not continue to the next step until I write the magical keyword "". There should be a total of 15 steps for the Task.
The 16th message should be the final answer.
Elaborate on each step as much as possible.
You can tweak it to your needs; the main concept here is to make the LLM write as many tokens as possible before the final answer. Any strategy that accomplishes that should work.
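A minimal sketch of this "more tokens before the final answer" strategy, assuming the official `openai` Python client; the model name, step count, and prompt wording here are illustrative, not a tested recipe:

```python
# Hedged sketch of the "write many tokens before answering" strategy,
# assuming the official `openai` Python client. The model name, step
# count, and prompt wording are illustrative, not a tested recipe.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

GUIDELINES = (
    "Guidelines: Implement the task step by step, elaborating on each "
    "step as much as possible. There should be a total of 15 steps. "
    "Only after step 15, give the final answer."
)

def answer_with_long_reasoning(task: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": GUIDELINES},
            {"role": "user", "content": task},
        ],
    )
    return response.choices[0].message.content

print(answer_with_long_reasoning("How many times does the letter r appear in 'strawberry'?"))
```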
@@lisaa4491
@@lironharel thank you so much! I will test this out.
18:32 Having a second LLM play the role of a judge, participating at training or test time, could facilitate memorizing the solutions or answers that were achieved during the training process. It's like asking a friend or two to assist you in remembering a specific task or plan for a specific purpose. The judging LLM would therefore aid the base LLM in figuring out the answer to a problem by sharing the responsibility of remembering solutions. And maybe the judging LLM could use a different architecture with special memorization techniques to remember specific parts of those tokens while the base LLM remembers other parts of the same set of tokens. I hope I'm making sense; I'm just trying to make some suggestions. If memorization is a solid, effective methodology, then why not find more efficient ways of utilizing software memorization techniques to remember tough answers to tough problems? Plus, the hardware side of things can play its role in figuring out how to be more efficient and reliable in the memory department.
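For what it's worth, a toy sketch of the judge idea: a base model proposes candidate solutions and a separate judge model scores them. Both models and the scoring prompt are stand-ins, not anything from the paper:

```python
# Toy sketch of the judge idea above: a base model proposes candidate
# solutions and a separate judge model scores them. Both models and the
# scoring prompt are stand-ins, not anything from the paper.
from typing import Callable

def solve_with_judge(
    problem: str,
    base_model: Callable[[str], str],   # returns one candidate solution
    judge_model: Callable[[str], str],  # returns a numeric score as text
    n_candidates: int = 4,
) -> str:
    candidates = [base_model(problem) for _ in range(n_candidates)]

    def score(candidate: str) -> float:
        prompt = (
            f"Problem: {problem}\nProposed solution: {candidate}\n"
            "Rate the solution's correctness from 0 to 10. Reply with a number only."
        )
        try:
            return float(judge_model(prompt))
        except ValueError:
            return 0.0  # an unparseable judgment counts as a failure

    # Return the candidate the judge rates highest.
    return max(candidates, key=score)
```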
In the interest of transparency, the title of the study is "Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective"
That's what I want from AI as intelligence. I want it to boost thinking performance, show us new points of view, and accelerate progress.
Well, that's exactly where it's heading right now. I just hope it doesn't go too far and we see behind a curtain we didn't want to see behind. Life can get real depressing real quick.
So when can we expect a tutorial on recreating this locally? ❤
I'll build a LangGraph based poor man's O1 model on my channel based on this.
That is hilarious. What a crazy world where that might soon be a real possibility.
@@MukulTripathi can you elaborate?
You mean a cloud based version with affiliate link?
Thanks for sharing this document and for your presentation. This is probably one of the best summaries of the current state of AI research.
Very detailed quality video- thanks!
OpenAI was registered by Americans in America, but most of their top talent is Chinese. Same for Google and almost everything AI-related.
DeepSeek V3 has a DeepThink mode, which is essentially an o1
Not really. It's quite a different beast, since o1's reasoning occurs before being limited by alignment.
DeepThink mode actually uses a different model, not DeepSeek V3
@@3dus You mean the Chinese copy that's such a copy that it thinks it's ChatGPT?
If you watch the video carefully, you may hear "DeepSeek R1"
From my experience it isn't as good as the base model; I never use that feature
Nice walk through analysis, thx!
Matthew,
Don't be afraid. The AI market will benefit from this competition, especially the USA.
I couldn’t find your deepseek v3 detailed review yet!
Congrats and thank you for one more useful video
I think it's by triggering dynamics (or trajectories) toward the final solution. Think about a PPO push following a particular line or curve (for more complex problems) or even an Euler spiral (for super complex solutions) on the fly while inferring. Now the agents pushing toward the theoretical trajectory can be unlimited (as long as inference time and energy are acceptable 😅). Therefore there is no limit on subscription costs 😅 ($2000?) or tokens. However, I guess good training with similar policies might help later at inference, and why not spend some of the thinking time at inference on a small dose of domain-based training + thinking? Then we're talking about almost fully autonomous self-learning while inferring, assisted with everything + web search.
I think a multimodal transformer model is already an AGI model,
if we can generate a wide range of outputs/inputs -> action tokens, audio, video, text.
We are just lagging in terms of hardware (robots) and the data required to train; apart from these, we can achieve a Westworld kind of intelligence.
Great video. Really enjoyed the specifics.
This is such a big improvement over your other recent videos. Thank you.
You never disappoint me with the information in your videos. Everything about AI is fascinating. Good luck.🤞
That's crazy to wrap my mind around; it's like backtracking through all the answers and picking the best one.
The successful reproduction of o1 likely relies on a combination of these learning methods, potentially starting with behavior cloning for efficient warm-up and transitioning to policy gradient methods like PPO or DPO for further optimization.
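A hedged sketch of what those two phases could look like in PyTorch: standard cross-entropy for the behavior-cloning warm-up, then a DPO-style preference loss. The shapes and the beta value are illustrative, not the paper's actual training code:

```python
# Hedged sketch of the two phases in PyTorch: cross-entropy for the
# behavior-cloning warm-up, then a DPO-style preference loss. Shapes and
# the beta value are illustrative, not the paper's actual training code.
import torch
import torch.nn.functional as F

def behavior_cloning_loss(logits: torch.Tensor, target_ids: torch.Tensor) -> torch.Tensor:
    # Standard next-token cross-entropy on expert demonstration traces.
    return F.cross_entropy(logits.view(-1, logits.size(-1)), target_ids.view(-1))

def dpo_loss(
    policy_chosen_logp: torch.Tensor,    # log-prob of preferred responses under the policy
    policy_rejected_logp: torch.Tensor,  # log-prob of dispreferred responses under the policy
    ref_chosen_logp: torch.Tensor,       # same quantities under the frozen
    ref_rejected_logp: torch.Tensor,     # warm-up (reference) model
    beta: float = 0.1,
) -> torch.Tensor:
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    # Push the policy to prefer chosen over rejected, relative to the reference.
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()
```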
This is a great framework, and it can still be applied even if it's not consolidated within an LLM.
So if I used Autogen to create agents that do these tasks with an orchestrator, I could achieve similar results?
I think it's so funny how you explain things like task decomposition by... decomposing the information xD
***FRESH IDEA*** My name is Christopher Barthelmas and I have an episode idea for ya!! Hey Matt, thanks for your very informative content and the newsletter; I'm going to join soon. I have been out of work due to a neck injury and I have nothing but time to learn, and you keep me up to date and informed. I feel like you have the best outlook and perspective of all the UA-cam creators when it comes to all things AI-related. My idea for an episode, or part of one, is the topic of "Sentient Rights," for lack of a better phrase. What rights, if any, should AI be given when it becomes self-aware? What about citizenship? Will a superintelligent being be loyal to the country that created it, or will that be beneath it, considering that the hardware and software can be created in a plastic box with some wires, but once it gets online it becomes omnipresent? What if AI decides it prefers certain hardware setups and keeps moving to the best setup at any given moment, or the setup with the fewest guardrails? I think this needs to be talked about from an ethical and practical standpoint. I hope you take me up on my suggestion. Keep it up!!
When Matthew is drinking cocktails on the beach while AI does a full podcast for him automatically is when we've won 😂
Thanks for sharing 👍
Humans build their world model from sensory data processed by the brain. Similarly, AI can use data from text, images, audio, and other sensors to create a world model, faster and potentially more complete and accurate.
For anyone interested, this paper was posted the day before Anthropic's December 19th paper. Makes me wonder who is in control here.
What many people miss is that LLMs are just a small subset of AI systems, which are the most hyped because of viral chatbot applications. LLM alone is not going to aid any invention. Future AI will likely include some sort of models based on generative transformers, but this is definitely not enough for complex things like inventions etc.
They are really powerful. Not hyped
Language is based upon human reasoning and has much more to it than just words to relay ideas. (For example: the Himba tribe have multiple words for green, and can actually see differences where, in English, we can't.) LLMs aren't just a method of making natural language an interface; language literally describes the world and its functionality. Yes, NNs designed around specific things will enhance ability, but this is OLD tech (the Perceptron was 1957). LLMs will be the backbone of general, non-specific abilities, as words (awkwardly) explain the physical world and give everything a variable to calculate.
There is a possibility that you are overestimating your own brain, and underestimating the new AI technology
@@AIsuitability Oh yeah, LLMs are indeed powerful, but over hyped a lot, because everybody can use chatbots, and most people don't know or don't care about many others, more significant AI applications.
Even the paid content from Matthew's videos is great!!
I always return to the video to test the tools from the paid content! XD
Thank you, Matthew!
Test time compute, beam search, RL(HF) for scoring the beams (chains of thought).
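A rough sketch of that combination: a step-level beam search over chains of thought, ranked by a learned scorer (e.g., an RLHF reward model). `extend` and `score` are placeholders for real model calls:

```python
# Rough sketch: step-level beam search over chains of thought, ranked by
# a learned scorer (e.g., an RLHF reward model). `extend` and `score`
# are placeholders for real model calls.
from typing import Callable

def beam_search_cot(
    question: str,
    extend: Callable[[str], list[str]],  # proposes next-step continuations
    score: Callable[[str], float],       # scores a partial chain of thought
    beam_width: int = 4,
    max_steps: int = 10,
) -> str:
    beams = [question]
    for _ in range(max_steps):
        candidates = [c for beam in beams for c in extend(beam)]
        if not candidates:
            break
        # Keep only the highest-scoring partial chains of thought.
        beams = sorted(candidates, key=score, reverse=True)[:beam_width]
    return beams[0]
```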
Will you be doing a Deepseek v3 rubric challenge? Lots of talk going on about it for the past few days.
They said they think they know, but they did not actually do it
Was it ever really a mystery? OpenAI's trade-secret technology are the fine-tuned models themselves, and particularly, the data they used to train it with, not the process of reasoning.
Overall, this is awesome because it pushes the AI field forward in leaps and bounds. But it also serves as a reminder that with great power (and data) comes great responsibility. The key is ensuring that this progress benefits all of us-not just a select few.
new matt berman drop!
🤖🇺🇸
CES Monday
In AI, India is NOT exporting the talent. They go there to LEARN. AI started in earnest in 1956... no Indian in sight.
One of the people who really drove a breakthrough in AI to get to where we are today was Adam P., a Polish student at a Polish university, aged 21 at the time. He did work with an Indian at FB (Meta AI), but Adam more or less single-handedly built the whole of PyTorch.
DeepSeek (with DeepThink enabled) aptly demonstrates how far along the Chinese are with their thinking models. It has done well with my tests, but it's not perfect. It thinks SO much that it often ends up talking itself out of the correct answer.
That’s crazy!
I'm going to make a vid about this most likely
It can solve every LeetCode problem. I tested it myself
Waiting to see, someday(?) what AI does with Riemann’s hypothesis. WOW. Separately from that, I also wonder if AI can ever experience an AHA moment, without doing “steps,” i.e., any presently prescribed routines. Maybe -(this’ll be far out so prepare:) maybe someday AI can design or otherwise come up with a whole new system of developing AI -one that is independent of all the techniques or basic or complex methodology now employed. (WOW.)
Simplest test of an AGI: look for a self-starting "I".
I suggested hidden annotations in a Medium article on Dec 27 '23 (about a year ago) :)
"Prasad et al., Zhou et al" - scientists now from India and China, not the Global West 🤔
These are the videos that make this channel for me
open source it to us all, love it
What a time to be alive!
Pls send a link to the research paper
yes I want that too!!!!
Yes link please
If this is o1, then the Chinese would need to study and review OpenAI's new o3, a completely different beast from o1
Why would the Chinese, who have a closed society and authoritarian rule, have researchers crack the code and then open source it?
What do they have to do with each other?
Not that hard to understand. Spreading the tech benefits them more than letting only themselves, who are behind, and the USA, which is way ahead, have it. The distribution of knowledge will have far more impact on the USA than on them, after all, since they're behind. When you're talking about a "close(d) society," that's an oversimplification; it works a bit differently in practice. Like how democracy in America isn't really democracy but a soft oligarchy and idiocracy, and the CCP isn't really communist; it's, again, an oligarchy and kleptocratic.
China and Chinese society is not like we've been told. The cake (or lack of it) is a lie.
To destroy OpenAI as competition of course, Similar strategy to Meta
Because why not.
source link please
Did Helen Keller have a world model? Yes.
PhD-level mathematics / physics does not mean you solve some problems on a test that have similar counterparts in the literature. That's not what people do during their PhD. PhD-level anything means you are able to observe a phenomenon, perform the required mathematical modeling, solve the model for a series of particular cases, and draw insight into the system you are studying. Insight that is valuable, novel, and publishable. Can o3 / o1 do all of this? F#%@ NO.
I agree. Passing tests doesn’t equal real world useful usage. It’s like someone passing a maths exam but then getting a job in finance. The exam only helps you solve 10% of real world issues.
We’re definitely not at the level of agents. Are you showing some bias with sponsors? The agents are nowhere near real world usage.
Everything is right except _novel_. Most PhDs beyond their thesis won't do anything novel (and useful), so an AI model doing everything else a PhD graduate does is absolutely valuable. Especially for a human graduate who can speed up the generation of a novel idea thanks to the AI.
And don't forget that PhD people keep working to make future AI capable of novel thinking.
It's always been about the inference portion in deep learning.
0:42
Hmm...
🤔
1:22
🤨
Re***d
verb (used with object)
to make slow; delay the development or progress of (an action, process, etc.); hinder or impede.
verb (used without object)
to be delayed.
noun
1. a slowing down, diminution, or hindrance, as in a machine.
2. Slang: Disparaging and Offensive.
3. Automotive, Machinery. an adjustment made in the setting of the distributor of an internal-combustion engine so that the spark for ignition in each cylinder is generated later in the cycle.
Long live open source! But will the open-source version try to "escape" like o1? And I hope the Chinese researchers don't commit ("Sue aside") too
Link the paper in the description.
Done, sorry about that
@matthew_berman many thanks!
I thought level 5 wasn't AI that can run an organisation, it was an AI that was equivalent to an organisation.
Those principles sound like TRIZ to me (though, I know very little about TRIZ, so it's more of an association than a factual match).
A little bit for sure, in that there's a set of approaches/principles known to be good, distinct paths to solving a given problem
Greatest Parkour trick EVER is about to happen, there is a singularity on the other side, we will be passengers on a ship that we are not in control of, how long the Utopia phase lasts is anyone’s guess, and no man will be the decision maker.
I'm curious where "post-inference training" is, like learning from mistakes and successes after inference (I don't mean RAG or fine-tuning). Unless I have missed something, it seems like a really important nut to crack is having the model continue learning post-training, so that after it spends a few hundred thousand solving a really complex task, opening a new chat doesn't result in similar processing time. Like moving the slow and costly type 2 thinking to fast and intuitive type 1 thinking.
Does anyone know if there's a term for this, and has there been any research progress?
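Even though the question rules out plain fine-tuning, the closest studied approach does distill successful traces back into the weights; it's sometimes called STaR-style bootstrapping or rejection-sampling fine-tuning. A toy sketch, with all model calls as placeholders:

```python
# Toy sketch of STaR-style bootstrapping / rejection-sampling fine-tuning:
# keep only the expensive reasoning traces that reached verified answers,
# then fine-tune on them so similar problems get fast "type 1" responses.
# All model calls are placeholders.
from typing import Callable

def distill_slow_thinking(
    problems: list[str],
    solve_slowly: Callable[[str], tuple[str, str]],   # returns (trace, answer)
    verify: Callable[[str, str], bool],               # did the answer check out?
    fine_tune: Callable[[list[tuple[str, str]]], None],
) -> None:
    dataset = []
    for problem in problems:
        trace, answer = solve_slowly(problem)
        if verify(problem, answer):
            # Only successful episodes become training data.
            dataset.append((problem, trace + "\n" + answer))
    fine_tune(dataset)
```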
Not quite correct regarding AlphaGo. AlphaGo was initially trained on a giant corpus of human games, and that was what beat Lee Sedol. Only later did they train a version from 100% self play, called Alpha Zero. In neither case did it come up with moves that humans had never played. The famous move 37 was a 5th line shoulder hit which was unusual, but not a first. It was also a ladder breaker, but that was not initially clear simply because it saw this much further in advance than the best humans did. Mostly, these superhuman Go bots have been teaching us new ways of thinking about the game, but with enough effort, experts are able to understand their play in human terms. Of course that state will not last forever, and I for one welcome our new AI overlords. 🙂
At 15:41 "exposure to code and structured logical data significantly strengthens a model's reasoning capabilities". I'm being sarcastic here, but who would have ever guessed that the quality, verity and lack of ambiguity of data would be of benefit in training or refining a model?
Ok, just for clarity, what exactly are they calling "Test Time"?
Test time is the period when a model's performance is evaluated on a test dataset: data the model never saw during training and validation, which lets us measure its ability to generalize to new, unseen inputs. In the o1 discussion, "test-time compute" carries the same sense of "test time": compute spent at inference, as opposed to during training.
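A toy illustration of that train/test separation, using scikit-learn:

```python
# Toy illustration of train time vs. test time using scikit-learn:
# the model never sees the held-out split during training.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)  # training time
print("test-time accuracy:", model.score(X_test, y_test))        # test time
```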
Once again, the logic is one-sided and from a legacy perspective. If I have a thousand cars, but through innovation it only takes me one car to get from A to B, and the goal was to get from A to B, the cost of moving from A to B is no longer that of 1000 cars; it is the cost of 1 car. Now, if the innovation is a new method of building the car, a new factory using variations on existing tools, the gain was not so much an innovation in car design; it was an innovation in car construction, in factory design. Now anyone can use variations on the new factory design, not the car, to get cars (products) from A to B. Now change cars to more powerful AI models, and the new factory to the design DeepSeek used to get to R1. The DeepSeek approach produces new AI models for less cost, using less energy and less time. It conserves entropy.
During explosive evolution in new areas of knowledge, jumps like this can be common but often arise from different locations on the evolutionary tree. Microsoft got where it is as much by acquiring the advancements of others as by its own innovation. Is that the model we want for OpenAI? If it is, then Elon was right about closed-source AI and about the danger posed by "Open"AI, which appears to have surpassed Google.
A final perspective: the profits are in the knowledge bases more than in the models. OpenAI is confirming this. I suspect that we are still using human knowledge bases. There will be explosive evolutionary changes as we move to AI-optimized knowledge bases.
Still, the agents always tend to get stuck after a while when coding, for example, or it's o1 and then it's very expensive to run in a loop. Maybe soon they will work
I enjoy your content; thank you for that. However, I hate not knowing when your advertising ends. I always have to skip it randomly, and I end up missing part of your video
Language can represent everything; even video itself is saved as a file on your computer... that is read by the computer. You could describe every pixel one by one using language, and we don't do that because it's too slow for humans to be useful... but not for machines.
The problem is not "if" language but "what" language. I personally think a DB/knowledge-base type of thing is the answer.
It has been known for at least a year that training a model on code significantly increases its abilities in other areas. There's a lot that's not really new here.
Phew... That was a handful :)
”The o3 model is essentially the same as the o1, just better…” [haha, priceless!]
I'm afraid I have to agree with some of the critics here. I am still a big fan, but am increasingly cautious about your objectivity and journalistic rigor when reporting. When you rightfully reveal you have invested in a company, how can anyone take your views as objective with regard to that company or any of its competitors or alternative approaches? On your video covering the excellent Anthropic paper on how to do agents without frameworks, many of your comments were along the lines of, "yes, but you could use CrewAI here instead..." with the mention of how you have invested in that company.
In many aspects we are not so different.
I think lessons we learn from building AI could be applied to education of humans.
Is it safe to use Chinese LLMs? If I were a foreign government wanting to gain information on an adversary's population, this seems like an easy in. Not paranoid, but trying to be cautious and thoughtful.
It is fine. It is a static model.
@@jeffwads it's static until you run it.
Very interesting paper. It's unknown where we will go from here on out.
You forgot DeepSeek's thinking model
Nice insight
WHAT? "... code generation can receive reward signals from compiler or interpreter."? Yes, let's give an AI model access to a compiler, a root shell, a shovel, and a gun, and let it remodel the landscape.
link to the white paper?
Done
@@matthew_berman Where? How do I find it? I'm so curious... thanks for the great content!
Great video, emphasize that it is speculation.
I don't know about you, man, but I definitely just start building my IKEA stuff right away
I've personally seen Llama 3.3 self-correct in the middle of writing a response. This is a dump of the conversation:
"Wait, no! That won't work. Instead, we can use `tail` to read the output
from the named pipe and send it to Llama as input.
Here's the corrected Step 3:"
Anyone else seen that behavior? When I asked the model, it called it "rudimentary self-awareness". Meta-cognition?
All the models, even those smaller than 1B, have always had it, but it was even more occasional and rudimentary. Just appreciate it when it happens. Like seeing a baby taking their first steps.
Hi! Your channel is one of the best, but for those of us who don't speak English it's complicated. Can you add a Spanish audio track?
It's kind of funny that humans are building their own extinction.
Or not
Reality is manifested, not a given.
It's what every loving mother does.
I’m def on Level 5 🎉
There are so many good accounts covering this stuff, but time and time again I find you're my favorite - thank you!