Can general intelligence do that? As in, can anyone substitute for you? I don't think so. Why set the bar so high for artificial general intelligence when "normal" intelligence can't clear it?
o3 is not AGI. Chollet is already working on a new test set which, he says on his website, is only 30% solved by o3 (keeping in mind, as always, that these tests are solved at 95% by average humans). On the same site he shows three examples of tests o3 didn't solve. They are very easy. o3 has no vision; it doesn't see the tests, it only reads them line by line, number by number. Chollet quote: "you will know when we have agi when coming up with tests that are easy for humans and hard for models becomes impossible." We are not there yet, by far.
Very good point. Thank you. Yes, if we can still make tests that are easy for humans and difficult for AI, then that is pretty much the definition of "not AGI".
@@headspaceaudio O3 can solve LOADS of problems that 99% of humans can't. But that doesn't hit the definition of AGI. Even if a model is barely as good as a normal human, but GENERALLY can solve any problem that a human can solve, that is AGI. No one is saying that o3 is not SMARTER than most or all humans. It probably is. But it is not "generally" intelligent in every way that a human is intelligent.
This is not AGI. There is no long-term continuous learning. We just keep adding everything to a very ephemeral context window. What we have is something that can complete some very constrained tests better than a human; a significant milestone for sure, but not AGI. A human knows how to operate the computer to actually take the test by itself -- o3 still more or less has to be fed the test.
Every AI channel is calling this AGI, because ... CLICKS! Yay for more clicks! It's honestly kind of embarrassing. YouTube's algorithm is pretty good at making words meaningless.
Every damn AI channel has these hyped YouTube thumbnails. Every time, they say something super-hyped is happening!! Oh yeah! Supercomputer Terminator!! One day later we see stupid mistakes. It happened with Sora, Devin, ChatGPT, and so on. But yeah, AGI....
Yeah, it's AGI, yet they didn't have the confidence to call it GPT-5... huh? As Gary Marcus points out, it's apparently also incredibly expensive to run, and the demo is heavily biased toward the things it does well while ignoring everything it doesn't. Berman has jumped the shark here; calling something AGI based only on a demo, without having tried it, let alone tested it, is pretty cringe.
@@jefferylou3816 That is ASI, artificial super intelligence. Artificial general intelligence only needs to surpass the ability of the average human to produce value.
Yuuuup. I don't trust OpenAI at all on anything they claim. Until it's in my hands and I can see what it can actually do, I don't believe anything their hype department puts out. Just look at Sora.
It is impressive, but saying it is AGI is clickbait. The G is for general, you know that. They are focused on the benchmarks, and let’s celebrate that progress. But don’t call it AGI, they are still “teaching to the test”.
The point is that they're not teaching to the test. Also, you can't "teach to the test," because all problems in ARC-AGI require unique types of reasoning. This is the most generally intelligent model out by far, and far more general than the vast majority of humans. If it can't do some things yet that humans can do, sure, but no human can do everything that humans can do either. This is obviously AGI.
They make the point of saying it was not trained specifically on any of these tests (around 15:00). Whether you believe them or not is another thing, but according to them they are not "teaching to the test."
Why it’s not AGI yet: The context window remains a significant limitation. These models perform well with single questions but struggle when managing large projects that require tracking extensive context. As the amount of data increases, they start to hallucinate or lose coherence, unable to maintain a reliable thread of information. Until this issue is resolved, these models, while powerful, fall short of being true AGI.
This is sheer clickbait. We are still a long way from true AGI. We still do not even know for certain why humans and some other animals exhibit self-awareness, or what the key element behind it is in terms of cognitive "algorithms," if you will (we have ideas, but nothing is certain yet). Until we figure this out, any 'AGI' is simply going to be a smarter LLM under the hood. A true AGI would not be a language model but a set of systems working together (visual, auditory, logical, etc.). Essentially how the brain works. Stop calling LLMs AGIs. It's embarrassing.
If I'm not mistaken, AGI is about cognitive abilities. Pattern mimicking still doesn't "know" what a car is. It doesn't think. IT IS NOT THINKING. It is not conscious. Is it better? Sure. It is better at language than most people. Can it fool a human? Sure, in some cases. So perhaps it can pass the Turing test, but it's not AGI, and if they continue down this road of improving quality it will never be AGI. Let's not forget, the letter A stands for artificial, so we have lower standards for it. The improvements we are seeing are what we expected in the first place; we were disappointed by the flaws of GenAI, so improving it to be what it should have been all along is not impressive.
Yeah, hearing that, I was just "what are we even talking about here..." Praising it as "general intelligence" because it is good at just one or a few things? Maybe the author of the video should go ask ChatGPT what would be needed to qualify as AGI.
@@sluxi Yes, the guy even mentioned Stockfish to talk about chess, like, seriously??? Stockfish does nothing but calculate chess moves, and it already reached a rating above 3000 back in 2014 (when it wasn’t even using NNUE yet) and was just a highly optimized version of Minimax with Alpha-Beta pruning. Even today, with NNUE (Efficiently Updatable Neural Network), it’s something totally specific, with absolutely no relation to AGI.
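For readers unfamiliar with the algorithm the comment names, here is a minimal sketch of minimax with alpha-beta pruning over a toy game tree. This is only the textbook core; Stockfish layers move ordering, transposition tables, many search heuristics, and (since NNUE) a learned evaluation on top of it.

```python
from math import inf

def alphabeta(node, depth, alpha, beta, maximizing):
    """Minimax with alpha-beta pruning over a toy game tree.

    A node is either a number (a leaf's static evaluation) or a
    list of child nodes. Alpha tracks the best value the maximizer
    can guarantee so far, beta the best for the minimizer; once
    alpha >= beta, the remaining siblings cannot change the result
    and are pruned.
    """
    if depth == 0 or isinstance(node, (int, float)):
        return node
    if maximizing:
        value = -inf
        for child in node:
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:   # beta cutoff: prune remaining children
                break
        return value
    else:
        value = inf
        for child in node:
            value = min(value, alphabeta(child, depth - 1, alpha, beta, True))
            beta = min(beta, value)
            if alpha >= beta:   # alpha cutoff
                break
        return value

# Maximizer to move; each sublist is a minimizing node over two leaves.
tree = [[3, 5], [6, 9], [1, 2]]
print(alphabeta(tree, 2, -inf, inf, True))  # → 6
```

The commenter's point stands either way: this is pure search over a fixed, narrow state space, which is exactly why engine strength in chess says nothing about general intelligence.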
@@Crates-Media my son is 12 now and he's twice the height he was when he was 5. By the time he's 30, he's going to be the tallest person in the world, neat huh?
I see no flaws in your logic. All things of any origin or nature always work the same way as all other things, because everything is everything else and your proof is irrefutable!
Prediction: The impression I'm getting is that this technology is becoming so resource-intensive and expensive to run that the top-tier stuff is not going to be for consumers, but for giant companies and governments. As time goes by it'll be a "you can look but not touch" situation. We'll get the watered-down toys, while the giant entities get the super-powered versions and true AGI/ASI. P.S. - o3 is a step towards AGI, but it's not AGI yet. Content creators like Matthew need to slow down and see this for what it is: Sam dangling bait for the media to generate a huge amount of hype and, consequently, cash flow.
Imagine the power plays and social engineering and mass manipulation that those with the money to run these models to their advantage will exert over those that can't afford to harness its power.
Also, there is no copyright issue; at most it's a trademark issue, and since they are in different markets it shouldn't cause much of a problem. The irony: stealing copyrighted material from all kinds of sources is something they have no issue with.
It’ll be AGI for SWE when it can self-verify. Today I had Claude add a download button to a page that is already pretty complex. It got it in the first go. Beautiful. That was pretty impressive and not something it could have done a few months ago, much less a year ago. But I still needed to be the one to QA the feature. I had to rebuild the app, open a browser, navigate to the right place in the app, create the history, look for the download button, make sure it's in the right place, press the download button to see if it responds at all, know where to look and what to look for to see if it is downloading, find the downloaded file, open it, inspect the contents, and make sure they match what's on the screen and are formatted the way the prompt requested. We're getting there, but I'm still having to do a lot. It's AGI when it can do this QA before it presents its solution to me.
@@xXWillyxWonkaXx It is general intelligence; it just has to be better than the average human. ASI is what you are referencing: artificial super intelligence is what is better than every human being.
"AGI according to Sam Altman and OpenAI" This is how I know you're being purposely untruthful, Sam Altman and OpenAI do not use the term AGI and they actively discourage it. They use 5 levels, and right now they're only on level 2.
They are on level 2 but moving to 3 fast. End of 2025 will be level 3 and end of 2026 level 5. It will take only 18 months to get from level 3 to level 5, less than it took from level 1 to level 3.
Just made up terms so that humans can hide their insecurity. Unless it is exactly like us in the way it thinks and senses people will continue to deny that it has any intelligence. I feel bad for any aliens we meet in the future.
@@py_man Nope. AGI is when it can perform no worse than any human can. That doesn't mean any one human, but any human as in all of them. ASI is when it can perform beyond what any human can. ASI stands for Artificial Super Intelligence, as in superhuman or beyond human. And it has to be general, not something with a small scope like chess.
A benchmark means not only a set of questions but also a set of right answers. Nobody can create a benchmark for themselves, because you need to be able to pass it at 100% (to find the answers) before you can create it 😅 It's like a chicken-and-egg problem.
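The point about a benchmark being question/answer pairs can be made concrete. A minimal sketch (the items and the toy "model" here are made up for illustration, not the actual ARC format):

```python
# A benchmark is a set of question/answer pairs: the author must
# already know the right answers to build it, which is the
# chicken-and-egg problem the comment describes.
benchmark = [
    {"question": "2 + 2", "answer": "4"},
    {"question": "capital of France", "answer": "Paris"},
]

def score(model, items):
    """Fraction of items the model answers exactly right."""
    correct = sum(1 for item in items if model(item["question"]) == item["answer"])
    return correct / len(items)

# A trivial stand-in "model" that only knows arithmetic.
def toy_model(q):
    try:
        return str(eval(q))
    except Exception:
        return "unknown"

print(score(toy_model, benchmark))  # → 0.5
```

This is also why grading needs an answer key at all: without the stored answers, a score like o3's 87.5% on ARC-AGI could not be computed.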
I believe AI has to replace blue-collar work as well as white-collar work in order to be AGI. Reflex, instant instinct when a pipe comes loose and water spurts everywhere (the plumber does an instant fix while the robot stares, confused). Academic benchmarks alone are not enough. AI needs to figure out the automatic and intrinsic way we learn about the world in the first 5 years of our lives, an essential part of human development and intelligence. Humans initially acquire intelligence through analog processing, THEN we move on to symbolic language at a later age. With AI it seems to be the other way around. I believe AI needs to master robotics and an analog understanding of its environment in order to be AGI, not just mastery of symbolic understanding.
Check out the new Genesis simulation platform running on Nvidia hardware that is for desktop computers. Autonomous robots will soon be able to do complex, human only, hands on, tasks faster than people.
Exactly the same opinion, thank you. If AGI means the capability to replace average humans at economically relevant work, then obviously the physical world counts as well. And simulating something is a good start, but having a working robot is the true benchmark.
I started my computer science journey using MS-DOS and Windows 3.0 (way long ago), and just watching the birth of AGI is awesome. Four thumbs up 👍🏻👍🏻👍🏻👍🏻 Thanks for sharing, Mr. Berman, very much appreciated!
It’s excellent at math and programming; however, I always expected we would eventually be surpassed in these areas. I believe the real differentiator for AGI is the ability to learn and remember like a human. If it acquires information about a person from a photo, it should recall those details when seeing the photo again. That's when it can truly start learning to perform our jobs, and this, in my view, is what AGI will be.
It is NOT excellent in math and programming. The mistake rate is extremely high. I work with these systems every day making simulations and they make many mistakes. It is one of the reasons that unit testing is so critical. They are good at solving problems that many people have solved before and have written solutions on the internet. If you need to combine several types of equations together then it fails.
M.D. clinician and researcher here, with a specialization in data-science medtech research, currently working on AI models using extremely private datasets, including nationwide medical records. Imho, o3 is close to AGI; however, AGI is achieved when novel ideas in wide domains can be articulated. Example: producing a unique hypothesis, initiating small-scale testing, producing a secondary conjecture, and finally proving it. The complexity of the problem is not that important, but it should be able to do all of the above steps in a reasonable and, most importantly, NOVEL manner.
"OpenAI just released o3"- Not quite. They didn't release it: they announced it (talked about it)! See how Mr. Berman is always quick to talk about any updates coming out of OpenAI but very reluctant to talk about Google's. (Context: It took him a very long time (days) to make a video about Gemini 2.0, which is extremely impressive & at least available to play with in Google AI studio. These o3 models were announced few hours ago & aren't available publicly; yet see how he talks about them, like he has seen them already). That tells you where his heart is at! Keep that in mind as you watch this entire video & others.
Yeah, but it was the coding dimension plus the math dimension. The practical way is to focus on several key synergistic dimensions, maybe even to drop AGI as a target.
@@RelevantDad Agreed. But as "Artificial general intelligence (AGI) is a type of artificial intelligence (AI) that matches or surpasses human cognitive capabilities across a wide range of cognitive tasks": it may not be all cognitive tasks, but I need the critical range of synergistic tasks relevant to me.
@@PeteQuad For real. If it was still 2020, you'd be mind-blown by an AI even creating something with PS2 level graphics. We've become so spoiled already, it's incredible.
@ tf 😭 Solving math never proves that this model is AGI. It takes so much compute but still doesn't have any emotional thinking or reasoning like humans. It still just follows instructions, and it is still based on transformers.
Let's agree that doing mathematics is quite different from solving mathematics. For me, I'll only recognize that an AI has achieved AGI when it starts doing mathematics on its own. That is still a long way from happening.
There's this guy who puts out short math videos where he works through geometry problems and shows all his work. His trademark line is to say "how exciting" at the end of the video. I can't bring myself to find his channel name to write it here, but perhaps someone will recognize him by my description. AI needs to be able to solve those toy problems with just the thumbnail (since the video thumbnail fully describes the problem).
@@markkest3956 Search on Google for a text titled 'An Interview With Michael Atiyah'. It's an 11-page PDF of an interview the great mathematician Michael Atiyah gave in 1984. Pay particular attention to the conversation starting on page 8, where Michael Atiyah talks a bit about the difference between doing mathematics, reproducing mathematics, and communicating mathematics. Doing mathematics goes far beyond simply solving problems or writing proofs for theorems. The text is very interesting and sheds a lot of light on what I mean with my comment.
At the very least AGI should be able to replace most remote workers of average intelligence without a human in the loop feeding it prompts. I think we're still a few years off from that. GDP and employment statistics are much better benchmarks of progress.
This might be true for the ‘average’ person when it comes to their personal moral code… and I include myself in that. I recently asked myself, “What do I believe is right and wrong, beyond my social conditioning?” It was the first time I’d ever considered/contemplated living outside of my moral code, despite seeing myself as a deep thinker philosophically. Another factor is that AI is trained to be restricted in its commentary. I'm curious to know what kind of questions you asked? I couldn't get much meaningful comment on the murder of the healthcare CEO, and models beyond o1 are blocked in my country.
Most people are immoral precisely because they have no clue what morality is... they settle for tribal, buddy morals, which is a sort of immorality, so I do not think it is a good Turing test.
Hard disagree. Morality is 100% subjective opinion. Now, if you ask it how to follow a defined moral code, in a given scenario, that is a meaningful problem.
@@catbert7 Catbert7, your comment is a good example of how humans are not good at moral issues. Morality is grounded in evolutionary biology.³⁴ Cooperation and ethical behavior have been essential for the survival of human groups throughout history.³⁵

Cooperation and Survival: Human survival has long depended on cooperation.³⁶ In early societies, individuals who worked together to gather food, protect one another, and care for the vulnerable had a better chance of survival.³⁷ These behaviors laid the foundation for ethical principles such as altruism and mutual support.³⁸

Ethics Beyond Self-Interest: While early human cooperation may have begun as a survival strategy, human intelligence and reason have allowed us to transcend mere self-interest.³⁹ Today, ethical behavior often involves actively promoting the well-being of others, even at a personal cost.⁴⁰ This reflects a higher form of morality rooted in our rational understanding of the interconnectedness of all beings and the shared nature of existence.⁴¹ In modern societies, cooperation, altruism, and moral integrity continue to play crucial roles in fostering social harmony and individual happiness.⁴² These traits, which have evolved over millennia, now manifest as ethical principles that benefit both individuals and society as a whole.⁴³

5. Practical Implications of Living in Alignment with Logos
Living in alignment with logos requires applying this understanding to our daily lives and interactions with others.⁴⁴ Ethical living, guided by reason and virtue, has practical implications both individually and socially.⁴⁵

Individual Well-Being: By cultivating virtues such as kindness, empathy, and justice, individuals create harmony within themselves and their relationships.⁴⁶ Ethical behavior leads to inner peace, aligning with our natural state as rational beings.⁴⁷ Living in accordance with logos reduces internal conflict and fosters a sense of purpose and fulfillment.⁴⁸

Social Harmony: On a broader scale, aligning with logos promotes social harmony.⁴⁹ Communities and societies that prioritize cooperation, empathy, and justice are more likely to thrive.⁵⁰ Recognizing the interconnectedness of all beings motivates actions that benefit the collective, leading to greater social cohesion and stability.⁵¹

In both the biological and human realms, cooperation, symbiosis, and self-sustaining networks emerge as principles that promote survival and flourishing.⁵ Whether we consider chemical reactions, symbiotic relationships between species, or human communities, logos encourages the formation of complex, stable systems through cooperation and interdependence.⁶

Cooperation in Nature: In the biological world, cooperation is a fundamental strategy for survival.⁷ For example, the symbiotic relationship between fungi and algae in lichens demonstrates how mutual benefit leads to increased resilience and adaptability.⁸ This cooperative dynamic is governed by the rational principles of logos, which favor structures that enhance survival through interdependence.⁹

Self-Catalyzing Chemical Networks: At the molecular level, self-catalyzing reactions form more stable chemical networks.¹⁰ These networks show how logos operates even in chemical processes, promoting stability and complexity through cooperation among molecules.¹¹

2. Human Reason as Participation in Cosmic Logos
As rational beings, humans consciously participate in the logos that governs the cosmos.¹² Virtues like kindness, cooperation, and empathy are extensions of the same principle of mutual benefit that operates throughout the universe.¹³ Just as cooperation leads to stability and flourishing in the natural world, human societies thrive when individuals work together, support each other, and act with compassion.¹⁴

Kindness and Empathy as Universal Values: Kindness and empathy transcend cultural boundaries and are essential for the survival and thriving of human communities.¹⁵ These virtues foster cooperation, strengthen social bonds, and contribute to the well-being of the collective.¹⁶ From an evolutionary perspective, groups that practiced empathy and cooperation were more likely to survive and flourish.¹⁷ In this sense, these values are intrinsic to our nature as rational beings connected to the logos that governs both individual and societal harmony.¹⁸

Mutual Benefit in Ethics: Ethical principles based on mutual benefit reflect the rational order of the cosmos.¹⁹ By treating others with kindness and respect, we align our actions with the inherent rationality that promotes survival and complexity throughout the universe.²⁰

3. Living in Accordance with Nature: Stoicism and Rational Ethics
In Stoicism, living in accordance with nature is central to ethical living.²¹ Since logos governs both the cosmos and human nature, to live ethically is to align one’s actions with the rational order of the universe.²² This means cultivating virtues that reflect the balance, order, and reason inherent in nature.²³

Wisdom: The virtue of wisdom involves understanding the nature of the world and making decisions that are in harmony with the rational order of the cosmos.²⁴ It requires distinguishing what is up to us and what is not, and acting in ways that reflect this understanding.²⁵

Justice: Justice involves treating others fairly and recognizing the inherent dignity and interconnectedness of all beings.²⁶ By practicing justice, we acknowledge that we are part of a larger whole, and our actions affect not only ourselves but also the community and the world at large.²⁷

Courage: Courage is the ability to act virtuously even in the face of fear or adversity.²⁸ It reflects a commitment to doing what is right, regardless of challenges or risks.²⁹

Temperance: Temperance involves practicing self-restraint and moderation, ensuring that our actions are guided by reason rather than impulses or desires.³⁰ It helps maintain balance in our lives and interactions with others.³¹

These virtues, when cultivated, lead to a life that is in harmony with logos and the natural world.³² Stoicism teaches that by embodying these virtues, we live in accordance with the rational order of the universe, contributing to both our own well-being and the well-being of society.³³

More info: sergio-montes-navarro.medium.com/logos-0717f9fb6cde
AI answer: Despite these improvements, o3 lacks several key attributes of AGI, such as:
- General Understanding Across Domains: AGI would possess the ability to autonomously acquire and apply knowledge across diverse fields without explicit training, a capability not present in o3.
- Self-Awareness and Consciousness: AGI is expected to exhibit self-awareness and consciousness, enabling reflective thought processes, which current AI models, including o3, do not have.
- Long-Term Autonomy: AGI would operate independently over extended periods, making decisions and adapting to new situations without human intervention, a level of autonomy beyond o3's design.
Ok... Put it in an agent framework. And who says it has to be conscious to be AGI? There is no test for consciousness and we have no idea what it even is.
Man, they said they created an AI that can substitute for you in your job, not that they created a self-aware machine-god. Come on, I mean, leave something for o5 (they're skipping o4). Btw, we're fucked lol
You definitely don't want sentience/consciousness in AGI, that brings about possible implications for it to go rogue. What you want is an AGI that is able to perform tasks and be functionally aware only. That way it still has a level of knowledge to be super useful but not dangerous.
Oh, and AGI is never "at least in this dimension" THE WHOLE POINT IS IT'S ALL DIMENSIONS! So you basically have just a bunch of benchmark stats, no access to the model at all and you make such grand call? Ridiculous and disappointing. I thought you were over the hype but nah, it got to you too
This guy feeds off your attention, constantly using clickbait - that's how he makes a living. I don't understand why people aren't annoyed by this. Sorry in advance for the harsh words, but the guy is either naive, or cynical and he feeds off your attention without even checking the model himself. In my opinion, that's where the value should lie - maybe the guy could do objective tests himself instead of just taking everything at face value. It's funny
@@debook8951 I guess people are not annoyed cos that is the 'industry standard' as far as AI news on YT goes. It's sad that this channel devolved into it too. I remember he used to do the stuff you are talking about: actually using models, testing them, and talking about the results. Now it seems he noticed that it's easier and gets more views to just talk about the latest hype. These kinds of YT channels have deteriorated greatly during 2024. It's just that reality does not seem to meet the expectations, I guess.
If they integrate O3-AGI into the next generation of robots, it will change the world. Congratulations to the OpenAI scientists. We also need Claude and xAI to achieve AGI to remain competitive. 👍🏻👏👏👏
Thank you for creating this video. Whether or not it qualifies as AGI is beside the point; it’s inevitable. There are valid reasons to feel both hopeful and apprehensive about its arrival.
Agreed and I'd say AGI was first achieved with Claude 3.5 Sonnet this summer. Once we got o1 mini and o1, it was pretty clear they were generally intelligent, could reason, learn new tasks on the fly, create new reasoning modalities on the fly etc. o3 is clearly AGI imo. But you're right that it is inevitable even if we say this particular one isn't. I think it's surprisingly tame to start with and people aren't/weren't ready for that. Regardless lots to be excited and concerned about indeed
OK, they already teased AGI in their Projects feature launch video. Why can't people accept it? AGI will be here by 2025. If it can solve problems it was never trained on with 87 percent performance, then it's almost AGI.
Amazing, and probably AGI. However, that was 'semi-private' on the ARC-AGI eval. Fully private tests on 'SimpleBench' and other completely private test sets will be the true tests.
It’s intriguing how things unfold. Just yesterday, I asked Gemini about the o3 model as AGI. At first, it seemed oblivious, acting like it had no clue what I was talking about. Then, with an unexpectedly sharp tone, it pointed out that o3 hasn’t been released yet, remaining behind OpenAI’s wall, and that no one truly knows what it involves.
I think what would make the most sense is to allow AI to have senses, so that it can see the world we are living in and not just use the data we have generated on the web.
@@greenstonegecko I think the bottom line is that “AGI” lacks a concrete enough definition, and means too many things to too many people, for us to really ever say when it’s arrived.
AGI will be achieved when we won't have to check whether the answer is correct. Furthermore, if AGI had been achieved, OpenAI would not wait for Matthew to claim it. So they know. The biggest issue with these benchmarks is the time limitation, which has no impact on AI but a huge impact on humans. If you give a human one year to solve a problem, they will get it 100% correct, with strong reasoning behind it. With "AGI," you can give it one year and the answer will be the same hallucination. So, in my opinion, AGI will be achieved when the AI is able to ensure that its reasoning is correct and can demonstrate it.
Probably not AGI because it's not general enough. o3 could be trained to be good at these kind of puzzles. You would have to open it up to the public and have them test it on truly novel and truly general IO tasks.
Matt - you are who this was addressed to. Hook, line, and sinker. (Notice: they didn’t say they achieved AGI, but they want the Bermans of the world to say so 😊) They didn’t even release a model, just benchmarks. Look at Francois Chollet’s tweets, and even what you just showed: in front of you, a human, Mark (as well as myself), just solved a problem almost instantly that Greg had just said no model was able to solve! BTW, to achieve 87.5%, it appears OpenAI used approximately $350,000 of compute. That is why that score didn’t count 😂 To achieve the recent results, OpenAI (and most AI companies) are using LLMs along with, basically, expert systems accessed through reinforcement learning (i.e. CoT) and massive brute-force compute. We will not have AGI until we develop some new algorithms (if ever).
One way to determine if it’s AGI is to look at the answers it got wrong and WHY it got them wrong. If at ANY point it completely misses the question because it just didn’t understand and a human would have, then it’s not AGI. It’s not actually thinking, it’s not actually understanding.
Being good at programming and mathematics does not qualify as AGI. It's going to have to cognize 3D space and do things in the physical world to pass the AGI mark in my books. Impressive model, o3, and it will replace a lot of jobs.
We humans can go out into the world, see things, and discover things; unless we allow AI such freedom, it can never outsmart us. The current AI, no matter how advanced, is at the end of the day just a simple tool for us to use to simplify or speed up the mundane tasks we perform.
If AI has sufficient access to the internet, surveillance cameras, personal documents, etc., it could do a lot of harm without needing an embodiment. Current AIs have been shown to be capable of manipulating humans into doing tasks for them. Many current robots are connected to the internet in some way; a sufficiently advanced AI could access those robots to very quickly gain the ability to walk around and discover things in the real world. In conclusion: a purely digital AI is not necessarily safe.
Honestly, a good definition of AGI is the one you stated: being better than humans at most economically valuable tasks. If it is true AGI, the true test would be applying for remote jobs at several different companies, in different roles, and working as a good employee.
Good idea! Once the boss and colleagues can’t figure out whether it’s you working from home or o3, then we might say AGI has been reached in the field of office jobs.
Notice the props behind them, all items representative of major technological advancements in human history. Nice touch as we're on the verge of turning the future over to technology itself.
Things a general human can do on a computer:
- Order pizza
- Download and install software
- Read and reply to emails
- Create a Facebook clone app (mid-level developer)
O3 (without extra programming) cannot do any of these because of fundamental limitations. It can score 100% on math benchmarks, but it is still not AGI.
Uh, it can with an interface... which, ya know, humans need too. Remove the mouse and keyboard and a human can't do any of that stuff either (yes, yes, there are alternative human input devices like voice commands and touch screens, but that's not the point). I'll consider it AGI when you can give it any task and it can learn to do it perfectly (iterative corrections are fine) in a reasonable amount of time and with a reasonable amount of feedback for a human. If I can ask it to build a city in Minecraft, order a pizza, suggest a good beer, and do my taxes... good nuff. But until it's capable of learning on its own and reapplying that knowledge to other tasks, it's not AGI, for the simple fact that it's not GENERAL intelligence, simply selected intelligence (even if we have very few clues about how that intelligence is actually being selected).
Can the model train and improve itself? If not, then it's not AGI, just a more comprehensively trained model. Even if it incorporates all of humanity's knowledge, without the ability to self-adapt and incorporate new knowledge it's a frozen-in-time AI with amnesia.
It's funny to watch AGI redefined as we evolve. Now it appears that a system can be qualified as AGI, but on a subset of abilities, a limited AGI. It appears true AGI will be AGI across the board on all skill sets. So OpenAI can still say they are waiting on full AGI.
This seems to be exactly what they’re testing for, with the help of tech influencers and some of the brightest minds. When you’re ahead of the curve-and I’d say OpenAI is *far* ahead of it-you have the power to lead the game and set the trend. It feels like the strategy is to confidently showcase what you’ve developed and call everyone else’s bluff.
To me this sounds like nonsense though. This isn't AGI, this is more like a narrow form of "super intelligence". I probably can't even say that, because of all the connotations with the term, but I simply mean a human superior intelligence in a subset of fields.
It's so silly how people throw around the term AGI. We have had narrow AIs for decades now. The whole point of the term AGI is that is supposed to refer to a model that can perform any general task. There is by definition no AGI that is only good at a few things
Whatever your definition of AGI is, things like this, along with breakthroughs in vastly more energy efficient transistors, means that the writing is on the wall and it's only a matter of time...and that time is likely measured in months.
To have such stellar results THIS early in the release of *O3,* it's only a matter of time before AGI hits the ground running. What a time to be alive 😂
Stop blabbering. Brute computational force is not AGI, so stop comparing chess engines with GMs and equating that with creativity and intelligence. These models are fed tremendous amounts of data and code samples, so obviously they will start beating humans at competitive programming concepts: humans can't beat machine speed, and it's humanly impossible for a single person to crank out all that code.
OpenAI is about to get curb-stomped by xAI. They ran into GPU training bottlenecks over a year ago and have shifted toward offloading GPU compute during inference. xAI figured out earlier this year how to scale training many times greater. They hit 3x OpenAI's scale a month or two ago, and will likely be at 30x before the end of 2025.
This is the most generally intelligent model out by far and far more general than the vast majority (99.99%) of humans. If it can't do something yet that humans can do, sure you can find some specific task it cannot do if you spend time to identify it, but no human can do everything that humans can do either. o3 is obviously AGI, I don't know why people are complaining.
No it's not, it still hallucinates 😂 Did OpenAI say that? o1 also outperforms humans on 80-plus percent of tasks. It can't plan, it can't take time like humans. Can it develop full apps?
Hallucinations / logical or factual errors are a key part of Generative AI and of intelligent humans. GenAI models are not rule-based systems, they're meant to simulate creativity. Creativity and hallucinations are compatible but creativity and logic are not directly correlated or trivially compatible. If you want a system that doesn't make errors you have to combine the GenAI model with a logic/rule-based system like a code interpreter. Those combined systems are available already. Of course more progress is needed, that doesn't mean characterizing the generality of intelligence of o3 as AGI is wrong. Also, o1 can make plans and fairly good ones, you can ask it to make some plans for you, I've had good experience with that. It's also much faster than a human planner in many cases - a nice bonus.
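The "combine the GenAI model with a logic/rule-based system" idea above can be sketched in a few lines. This is a minimal, hypothetical illustration (the names `safe_eval` and `verified_answer` are made up for this sketch, not any real API): a deterministic evaluator checks, and overrides, whatever number a model might have hallucinated.

```python
# Minimal sketch: pairing a generative model's claim with a rule-based checker.
# The "model claim" here is just a number standing in for an LLM's answer.
import ast
import operator

# Whitelisted operators for a tiny, safe arithmetic evaluator.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_eval(expr: str) -> float:
    """Evaluate a basic arithmetic expression without using eval()."""
    def walk(node):
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval").body)

def verified_answer(expr: str, model_claim: float) -> float:
    """Trust the model's claim only as far as the rule-based system agrees."""
    checked = safe_eval(expr)
    # The deterministic result wins over the model's (possibly hallucinated) guess.
    return checked

# A hallucinated claim of 90 gets corrected by the interpreter:
print(verified_answer("12 * 7 + 5", model_claim=90))  # prints 89
```

The point is exactly what the comment describes: the generative part proposes, the logic/rule-based part disposes.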
@ You just can't equate hallucinations with logical and factual errors. Hallucination typically refers to the model forgetting something it was supposed to do, like being instructed 3 to 5 messages ago and failing to do it. Furthermore, these models require a lot of compute. Do humans require compute? Even if we compare, say, an average human spending 300 dollars on their diet in a month, these models by that comparison will cost hundreds of thousands of dollars if they are allowed to think all the time. No comparison. Also, can they now develop front-end and back-end apps, or can they discover some new phenomenon of nature? Yes, these models are really good, but you can't say this is AGI. Did OpenAI say it? These are transformers that still predict text; they are still bound by our worldly human instructions. AGI is achieved when models can start to decide for themselves what they want to do or achieve.
@@lanatmwan I did. I understand what he is saying and I’m excited that it can start to learn on its own but I still think it’s premature to state that it has reached AGI unless it is already better than human on most tasks.
I don't even know what they mean by better than a human at coding. None of the frontier models are even close to being able to do what I do in code. Their code is always unoptimized drivel that works sometimes if lucky but lacks full understanding of context. If you can't rely on it then you still have to logic it all out and sometimes that takes more time especially when it takes you down a wrong rabbit hole. Completely useless for the most part for coding. Only beginners would be happy with the output.
yeah it's getting pretty obvious by this point. It might be 'access' rather than actual money, but he's getting something for sure. It's as cringe as a teenager telling you about their favourite K-Pop artist
I was conversing with ChatGPT today and mentioned off-hand that I would love to have a painting that illustrates a certain parable we talked about. I didn't expect GPT to generate such a piece of art, but he did! Instantly!
@@cajampa On low compute, it's $17.25 according to the same source, so expect o3-mini to be free, o3-large@low to be the $200/month tier, and o3-large@high to be $2000/query.
It is AGI when i can let it take control of my work PC without my manager noticing my absence for weeks....
Not a joke. If it can't do that then it's not AGI
Can general intelligence do that? As in anyone can substitute you? Don't think so. Why set the bar so high for artificial general intelligence, when "normal" intelligence can't.
lmao
@@bestemusikken but at the end of the day this is the entire hope of AGI.
@@bestemusikken if you gave it access to all your files and email/ slack history it probably could
o3 is not AGI. Chollet is already working on a new test set which, he says on his website, is only 30% solved by o3 (keeping in mind, as always, that these tests are solved 95% of the time by average humans). On the same site he shows three examples of tests o3 didn't solve. They are very easy. o3 has no vision: it doesn't see the tests, it only reads them line by line, number by number. Chollet quote: "you will know when we have agi when coming up with tests that are easy for humans and hard for models becomes impossible." We are not there yet, by far.
Very good point. Thank you. Yes, if we can still make tests that are easy for humans and difficult for ai, then that is pretty much the definition of "not agi".
What about tests that are easy for models but hard for humans? Shouldn't they count as well? Shouldn't AGI be an average of all kinds of tests?
@@headspaceaudio O3 can solve LOADS of problems that 99% of humans can't. But that doesn't hit the definition of AGI. Even if a model is barely as good as a normal human, but GENERALLY can solve any problem that a human can solve, that is AGI. No one is saying that o3 is not SMARTER than most or all humans. It probably is. But it is not "generally" intelligent in every way that a human is intelligent.
Yes let's keep pushing the benchmarks further.
And any "o" models are omni-models meaning they are multimodal. They have vision.
You said "average humans", but actually the wording referred to "smart humans" as in "a smart human will still get around 95% on this test"
Rockstar Games has been waiting for O3 to start developing GT6
I think the game development and software development will never be the same anymore because of these AI tools.
That's exciting!
Lol.
Opens chatgpt
Prompt: Create GTA 6
@@dijitize it's gotta improve 100x before it will be truly useful in software development.
Gran Turismo 6 was released in 2013; it would still be impressive for o3 to do it.
I've hit pure AI hype fatigue and won't believe it until I see it actually do something in the real world, not just on benchmarks.
Internal benchmarks at that. OpenAI is cooked.
They're literally hemorrhaging money and have been for a long time. AI has been a bit of a scam.
I work with 4o; o3 is comparatively useless because it argues with you rather than assists you when it thinks it is right.
@@alphaforce6998?
Coding is sickeningly powerful. I use it all the time.
This is not AGI. There is no long-term continuous learning. We just keep adding everything to a very ephemeral context window. What we have is something that can complete some very constrained tests better than a human; a significant milestone for sure, but not AGI. A human knows how to operate the computer to actually take the test by itself -- o3 still more or less has to be fed the test.
I can do one week worth of work in a few hours in Cursor. We should slow down !
“There is no long-term continuous learning”.
Watch the video. There is now.
@@InfiniteEntendre There literally isn't 🤷♂
AGI Achieved? I am flaming you in the comments. Stop click baiting.
not clickbait!
More flaming here. I'll apologize if I'm not right. Doubt that.
Watch the full vid first and let me make my point! I know you haven’t watched it yet bc it has only been out for 3 min
watch the video
I watched the entire event. AGI is here.
"if this isn't AGI, then I don't know what is."
You're right.
You don't.
Every AI channel is calling this AGI, because ... CLICKS! Yay for more clicks! It's honestly kind of embarrassing. UA-cam's algorithm is pretty good at making words meaningless.
Every damn AI channel has these hyped YouTube thumbnails. Every time, they say something super-hyped is happening!! Oh yeah! Super computer terminator!! One day later we see the stupid mistakes. It happened with Sora, Devin, ChatGPT, and so on. But yeah, AGI...
Yeah, it's AGI, yet they didn't have the confidence to call it GPT-5... huh? As Gary Marcus points out, it's apparently also incredibly expensive to run, and the demo is heavily biased toward the things it does well while ignoring everything it doesn't. Berman has jumped the shark here; calling something AGI based only on a demo, without having tried it, let alone tested it, is pretty cringe.
Exactly😂 this is clickbait and he is pretending it isn’t.
Agree, I’ve got a $6 calculator that outperforms most humans at mathematics and is 100% accurate
Pretty sure it’s not AGI
Someone asked for definition of AGI. AGI is when we all get fired.
So true
Well, that requires embodied AGI, but, otherwise, yes xD
Let’s call it AGF then
when you get fired
precisely
5:06 - “AGI had been achieved, at least on that dimension” that’s… …not what AGI means though? The G part? General?
True, by that logic a calculator is also AGI, since it can calculate superhumanly.
AGI is actually a lower benchmark than most people imagine. Just think how unintelligent the average human is.
You sir are so wrong. 😂
AGI is supposed to surpass most human experts in all fields (other than physical fields ofc)
@@vivekkaushik9508 global average iq is only about 83
@@jefferylou3816 that is asi, artificial super intelligence. Artificial general intelligence only needs to surpass the ability of the average human to produce value.
Joe the average human beats current models in benchmarks
Until this is in the hands of independent testers I will remain skeptical.
Yuuuup. I don't trust OpenAI at all on anything they claim. Until it's in my hands and I can see what it can actually do, I don't believe anything their hype department puts out. Just look at Sora.
Still skeptical of o1? Did you say the same thing then? Learned anything since?
Thanks Sherlock. Because what they have done so far is just pure rubbish isn't it?
It has been independently tested by one of the biggest critics of LLM's and even he said this is a huge paradigm shift.
Absolutely, I don't believe in this. These companies always come out with the same script. It's probably a very good model, but... it's genius until it's not.
It is impressive, but saying it is AGI is clickbait. The G is for general, you know that. They are focused on the benchmarks, and let’s celebrate that progress. But don’t call it AGI, they are still “teaching to the test”.
they solved ABI, now chatgpt can get a job as a benchmark genius
The point is that they're not teaching to the test. Also, you can't "teach to the test" here, because all problems in ARC-AGI require unique types of reasoning.
This is the most generally intelligent model out by far and far more general than the vast majority of humans. If it can't do some thing yet that humans can do, sure, but no human can do everything that humans can do either.
This is obviously AGI
There was no teaching to the test for this benchmark. That's specifically the point of this benchmark.
They make the point of saying it was not trained specifically on any of these tests about 15:00, now whether you believe them or not is another thing but they are not according to them 'teaching to the test'
They're raising the hype meter so they can pretend it's AGI so they can start charging MS.
Why it’s not AGI yet: The context window remains a significant limitation. These models perform well with single questions but struggle when managing large projects that require tracking extensive context. As the amount of data increases, they start to hallucinate or lose coherence, unable to maintain a reliable thread of information.
Until this issue is resolved, these models, while powerful, fall short of being true AGI.
THIS
Its " virtually " AGI. Its within reach.
@@BCCBiz-dc5tg THIS
Sounds like just more GPUs and we're there.
@@mortenekdahl262 based
This is sheer clickbait. We are still a long way from true AGI. We still don't even know for certain why humans and some other animals exhibit self-awareness, or what the key element behind it is in terms of cognitive "algorithms", if you will (we have ideas, but nothing is certain yet). Until we figure this out, any 'AGI' is simply going to be a smarter LLM under the hood. A true AGI would not be a language model but a set of systems working together (visual, auditory, logical, etc.), essentially how the brain works. Stop calling LLMs AGIs. It's embarrassing.
If I'm not mistaken, AGI is about cognitive abilities. Pattern mimicking still doesn't "know" what a car is. It doesn't think. IT IS NOT THINKING. It is not conscious. Is it better, sure. It is better at language than most people. Can it fool a human, sure in some cases. So perhaps it can pass the Turing test, but it's not AGI and if they continue down this road of improving quality it will never be AGI. Let's not forget, the letter A stands for artificial, so we have lower standards for it. The improvements we are seeing is what we expected in the first place, we were disappointed by the flaws of GenAI, so improving it to make it like it should have been is not impressive.
“AGI in this dimension” does not exist; focusing performance on a specific area is exactly the opposite of AGI.
I think the "AGI in this dimension" was in regards to the AGI benchmark ... Then he added math and coding, so it's also on more that 1 thing.
yeah, hearing that I was just "what are we even talking about here..." praising it as being "general intelligence" because it is good at just one or a few things? maybe the author of the video should go ask ChatGPT what would be needed to qualify as AGI
@@sluxi Yes, the guy even mentioned Stockfish when talking about chess. Like, seriously??? Stockfish does nothing but calculate chess moves. It already reached a rating above 3000 back in 2014 (when it wasn't even using NNUE yet) and was just a highly optimized version of minimax with alpha-beta pruning. Even today, with NNUE (Efficiently Updatable Neural Network), it's something totally specific, with absolutely no relation to AGI.
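For anyone curious, the minimax-with-alpha-beta-pruning technique that comment refers to fits in a few lines. This is a toy sketch on a hand-made tree of static evaluations, not Stockfish's actual implementation:

```python
# Toy sketch of minimax with alpha-beta pruning, the classic technique
# behind pre-NNUE chess engines. The "game" is just a nested list
# standing in for a tree of position evaluations (leaves are scores).
import math

def alphabeta(node, depth, alpha, beta, maximizing):
    # Leaves are plain numbers: static evaluations of a position.
    if depth == 0 or isinstance(node, (int, float)):
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, value)
            if beta <= alpha:
                break  # prune: the opponent will never allow this branch
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, depth - 1, alpha, beta, True))
            beta = min(beta, value)
            if beta <= alpha:
                break  # prune symmetrically for the minimizing player
        return value

tree = [[3, 5], [6, 9], [1, 2]]  # depth-2 toy tree
print(alphabeta(tree, 2, -math.inf, math.inf, True))  # prints 6
```

Note how the third branch gets cut off after seeing the leaf 1: that pruning, plus a hand-tuned evaluation function, is the whole narrow trick, which is the commenter's point about it having no relation to general intelligence.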
@@jsbgmc6613 You miss the point: that is still narrow AI, not general AI.
CLICK BAIT WARNING! BEEP! BEEP! BEEP! BEEP!
Stop clickbaiting!. Unsubbed.
Calm down bro its not AGI...yet.
Yeah, nothing to see here. It's not like this exponential growth could possibly continue, right?
@@Crates-Media my son is 12 now and he's twice the height he was when he was 5. By the time he's 30, he's going to be the tallest person in the world, neat huh?
I see no flaws in your logic. All things of any origin or nature always work the same way as all other things, because everything is everything else and your proof is irrefutable!
@@GuyJames Your son will stop growing at some point; AI won't.
It works extremely well now, but I wonder just how much worse it'll be after they lobotomize it with safety post training and prevention.
Prediction: The impression I'm getting is that this technology is becoming so resource-intensive and expensive to run that the top-tier stuff is not going to be for consumers, but for giant companies and governments. As time goes by it'll be a "you can look but not touch" situation. We'll get the watered-down toys, while the giant entities get the super-powered versions and true AGI/ASI.
P.S. - o3 is a step towards AGI, but it's not AGI yet. Content creators like Matthew need to slow down and see this for what it is, Sam dangling bait for the media to generate a huge amount of hype and consequently cashflow.
Imagine complaining you get chatgpt for free
You are right tho
That will change as the hardware(Nvidia GPUs) gets exponentially faster with each generation
Slaves we are. (Yoda)
It will continue to happen...and once AI is required for healthcare, education, etc. the void will become large.
Imagine the power plays and social engineering and mass manipulation that those with the money to run these models to their advantage will exert over those that can't afford to harness its power.
Skipped O2 to avoid copyright issues...
Ozone: "Hold my carbon dioxide infused yeast and plant materials"
Lame joke bro
@@nosult3220 Yes - I thought it would have fallen flat too.
@@Martin-bx1et ❤️
Also, there is no copyright issue, at most it's a trademark issue and they are in different markets, so it shouldn't cause much of a problem.
The irony, stealing copyrighted material from all kinds of sources, they have no issue with.
O2 is a British telecommunication company
Thumbs down for “AGI ACHIEVED!”
AGI is not what you guys think it is. It's not a reasoning model, it's an entity, and it answers you everything in a fraction of a second.
It'll be AGI for SWE when it can self-verify. Today I had Claude add a download button to a page that is already pretty complex. It got it on the first go. Beautiful. That was pretty impressive and not something it could have done a few months ago, much less a year ago. But I still needed to be the one to QA the feature. I had to rebuild the app, open a browser, navigate to the right place in the app, create the history, look for the download button and make sure it's in the right place, press the download button to see if it responds at all, know where to look and what to look for to see if it is downloading, find the downloaded file, open it, inspect the contents, and make sure they match what's on the screen and are formatted the way the prompt requested.
We’re getting there but I’m still having to do a lot. it’s AGI when it can do this QA before it presents its solution to me.
The G stands for general. Saying AGI for SWE doesn't make any sense. If it's "for" anything, it's not AGI.
Get your shovels ready folks, time to dig up the goalpost.
@@gnollio yep. AGI will be “achieved” a great many times before we ever arrive at a consensus on what, precisely, AGI means.
@@Ascended23 i dont think O3 is agi. As long as we're still having humans beating/solving the benchmarks and Ai isnt, it's not AGI.
@@xXWillyxWonkaXx Exactly!
@@xXWillyxWonkaXx It is general intelligence; it just has to be better than the average human. ASI is what you are referencing: artificial super intelligence is what is better than every human being.
we'll bring our cranes and raise it instead
O3 is a PR stunt to reduce the damage from the Gemini 2 announcement.
Google will get them by next week 😂
Dude got some jokes.
The tests are saturated. I can't believe that Gemini 2 is better just because it scores 15 more points on a benchmark.
Thanks for your expert opinion😂
@SportPrediction Gemini 2 can't even crack the classic R's in strawberry.
"Were not releasing it yet" = It's a marketing communication stunt.
"so we just got one upped by google but wait no we didn't please believe us!"
@@thedudely1 you guys expect them to release a new model every week??
@clarityhandle it's just been obvious how much they're holding back on what they actually have and how they only act when they're forced to.
Relax, o1 went from Preview to out in 3 months.
@@thedudely1 Yeah, they got "forced" an amazing 12 times in the last 12 days. genius.
Please VC daddy, keep the money flowing. Whenever I see alerts for this channel my "Paid Actor" sense starts tingling.
AGI is not just intelligence; it's human-like thinking and understanding.
"AGI according to Sam Altman and OpenAI" This is how I know you're being purposely untruthful, Sam Altman and OpenAI do not use the term AGI and they actively discourage it. They use 5 levels, and right now they're only on level 2.
Bro AGI doesn’t even have a proper definition between companies
@CJayyTheCreative did you purposely miss his point?
They are on level 2 but moving to 3 fast. By the end of 2025 it will be L3, and by the end of 2026, L5. It will take only 18 months to go from level 3 to level 5, less than it took from level 1 to level 3.
@@CJayyTheCreative do you even understand what he is trying to say
@@olegt3978 Yeah and in one year it’s gonna be paradise on earth
Let's see if o3 can create its own ARC benchmark from scratch that is more difficult than the current one. Then that would be actual AGI.
That would be asi not agi
That would be ASI
Just made up terms so that humans can hide their insecurity. Unless it is exactly like us in the way it thinks and senses people will continue to deny that it has any intelligence. I feel bad for any aliens we meet in the future.
@@py_man Nope. AGI is when it can perform no worse than any human, and that doesn't mean any one human but any human, as in all of them. ASI is when it can perform beyond what a human can. ASI stands for Artificial Super Intelligence, as in superhuman, beyond human. And it has to be general, not something with a small scope like chess.
A benchmark means not only a set of questions but also a set of right answers. Nobody can create a benchmark for themselves, because you'd need to solve it 100% (to find the answers) before you create it 😅 It's like a chicken-and-egg problem.
I believe AI has to replace blue-collar work as well as white-collar work in order to be AGI. Reflexes, instant instinct when a pipe detaches and water spurts everywhere (a plumber does an instant fix while the robot stares, confused). Academic benchmarks alone are not enough.
AI needs to figure out the automatic and intrinsic way we learn about the world in the first five years of our lives, an essential part of human development and intelligence.
Humans initially acquire intelligence through analog processing, THEN we move on to symbolic language at a later age. With AI it seems to be the other way around.
I believe AI needs to master robotics and an analog understanding of its environment in order to be AGI, not just master symbolic understanding.
By your logic most people are below AGI level because they can't replace most white and blue collar workers ...
Check out the new Genesis simulation platform running on Nvidia hardware that is for desktop computers.
Autonomous robots will soon be able to do complex, human only, hands on, tasks faster than people.
Most people can't do what most white and blue collar workers do ... And for sure most people can't ever learn to do what o3 already can.
Exactly the same opinion, thank you.
If AGI means the capability to replace average humans at economically relevant work, then obviously the physical world counts as well.
And simulating something is a good start but having a working robot is the true benchmark.
I think you missed the feminist movement. And assimilation education system.
I started my computer science journey using MS-DOS and Windows 3.0 (way long ago) and just watching the birth of AGI is just awesome.
Four thumbs up 👍🏻👍🏻👍🏻👍🏻
Thanks for sharing, Mr. Berman, very much appreciated!
AGI is when these AI's create puzzles humans can't solve.
So basically this is another Sora announcement and we won't see this for months...maybe not until Summer 2025 at the earliest lol.
It's really bad for OpenAI since they could ask $3000/month for this and many would pay for it.
And by that time some Chinese researchers will have released something that's pretty close to it but open. ;-)
@@testales exactly lol
Could be.a strategy to get others like Google to reveal their hand
Moving the goalpost for OpenAI doesn't make it AGI.
It's excellent at math and programming; however, I always expected we would eventually be surpassed in those areas. I believe the real differentiator for AGI is the ability to learn and remember like a human. If it acquires information about a person from a photo, it should recall those details when seeing the photo again. That's when it can truly start learning to perform our jobs, and this, in my view, is what AGI will be.
Ever heard of RAG?
It is NOT excellent at math and programming. The mistake rate is extremely high. I work with these systems every day making simulations, and they make many mistakes. It is one of the reasons unit testing is so critical. They are good at solving problems that many people have solved before and written up on the internet. If you need to combine several types of equations together, they fail.
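The unit-testing point above can be made concrete. A minimal sketch, with a hypothetical `projectile_range` function standing in for model-generated simulation code; the tests encode physics the model cannot "hallucinate" past:

```python
# Hedged sketch: guarding a (hypothetically) model-generated simulation
# helper with unit tests that encode known physical invariants.
import math

def projectile_range(v0: float, angle_deg: float, g: float = 9.81) -> float:
    """Range of an ideal projectile launched from ground level."""
    theta = math.radians(angle_deg)
    return (v0 ** 2) * math.sin(2 * theta) / g

def test_projectile_range():
    # 45 degrees maximizes range: v0^2 / g
    assert math.isclose(projectile_range(10, 45), 100 / 9.81)
    # Complementary angles (30 and 60 degrees) give the same range
    assert math.isclose(projectile_range(10, 30), projectile_range(10, 60))
    # Firing straight up travels (essentially) nowhere horizontally
    assert math.isclose(projectile_range(10, 90), 0.0, abs_tol=1e-9)

test_projectile_range()
print("all checks passed")
```

If an LLM had combined the equations wrongly, say, dropping the factor of 2 inside the sine, the symmetry test fails immediately, which is exactly the safety net the comment is describing.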
Hey Matthew! I really enjoy your videos, especially when you pause and shed some light on a few concepts.
M.D. clinician and researcher here, specializing in data science medtech research. Currently working on AI models using extremely private datasets, including nationwide medical records.
Imho, o3 is close to AGI; however, AGI is achieved when novel ideas across wide domains can be articulated. Example: producing a unique hypothesis, initiating small-scale testing, producing a secondary conjecture, and finally proving it.
The complexity of the problem is not that important, but it should be able to do all of the above steps in a reasonable and, most importantly, NOVEL manner.
"OpenAI just released o3"- Not quite. They didn't release it: they announced it (talked about it)! See how Mr. Berman is always quick to talk about any updates coming out of OpenAI but very reluctant to talk about Google's. (Context: It took him a very long time (days) to make a video about Gemini 2.0, which is extremely impressive & at least available to play with in Google AI studio. These o3 models were announced few hours ago & aren't available publicly; yet see how he talks about them, like he has seen them already). That tells you where his heart is at! Keep that in mind as you watch this entire video & others.
agree
"If that is not AGI, at least on this dimension, I don't know what is". Matthew, what does the acronym AGI stand for?
Yeah but it was the coding dimension + math dimension. The practical way is to focus on several key synergetic dimensions. Even to drop the AGI as target.
Agi means always be moving the goal posts.
@@jeffsteyn7174 It appears so, yes. In my use cases it is AGI level already.
@@JG27Korny "In my use cases" is, by definition, not AGI.
@@RelevantDad Agreed. But as '"Artificial general intelligence (AGI) is a type of artificial intelligence (AI) that matches or surpasses human cognitive capabilities across a wide range of cognitive tasks."" They may not be all cognitive tasks but I need the critical range of synergetic tasks relevant to me.
How the bar has been lowered, lol. THIS is what we call AGI now? Lmao
clickbait bar
The bar is far higher than it ever was. For decades the bar was the Turing test.
@@PeteQuad For real. If it was still 2020, you'd be mind-blown by an AI even creating something with PS2 level graphics. We've become so spoiled already, it's incredible.
Solving maths meant for experts is a low bar? 😂 I want what you're smoking
@ tf 😭 Solving maths never proves this model is AGI. It takes so much compute but still doesn't have any emotional thinking or reasoning like humans. It still follows instructions and is transformer-based.
Let's agree that doing mathematics is quite different from solving mathematics. For me, I'll only recognize that an AI has achieved AGI when it starts doing mathematics on its own. That is still a long way from happening.
They are already doing it and they need to stop it as is too advanced 😂
There's this guy who puts out short math videos where he works through geometry problems and shows all his work. His trademark line is to say "how exciting" at the end of the video. I can't bring myself to find his channel name to write it here, but perhaps someone will recognize him from my description. AI needs to be able to solve those toy problems from just the thumbnail (since the video thumbnail fully describes the problem).
@@markkest3956 Search on Google for a text titled 'An Interview With Michael Atiyah'. It's an 11-page PDF of an interview the great mathematician Michael Atiyah gave in 1984. Pay particular attention to the conversation starting on page 8, where Michael Atiyah talks a bit about the difference between doing mathematics, reproducing mathematics, and communicating mathematics. Doing mathematics goes far beyond simply solving problems or writing proofs for theorems. The text is very interesting and sheds a lot of light on what I mean with my comment.
At the very least AGI should be able to replace most remote workers of average intelligence without a human in the loop feeding it prompts. I think we're still a few years off from that. GDP and employment statistics are much better benchmarks of progress.
An AGI should be able to reason on moral issues. So far this has been the fastest way to demonstrate that nobody is home.
This might be true for the ‘average’ person when it comes to their personal moral code… and I include myself in that. I recently asked myself, “What do I believe is right and wrong, beyond my social conditioning?” It was the first time I’d ever considered/contemplated living outside of my moral code, despite seeing myself as a deep thinker philosophically.
Another factor is that AI is trained to be restricted in its commentary.
I'm curious to know what kind of questions you asked. I couldn't get much meaningful commentary on the murder of the healthcare CEO, but models beyond o1 are blocked in my country.
@@sunlight8299taytweets was so much smarter bro
Most people are immoral precisely because they have no clue what morality is... they settle for tribal, buddy morals, which is a sort of immorality. So I do not think it is a good Turing test.
Hard disagree. Morality is 100% subjective opinion.
Now, if you ask it how to follow a defined moral code, in a given scenario, that is a meaningful problem.
@@catbert7 Catbert7, your comment is a good example of how human are not good at moral issues.
In both the biological and human realms, cooperation, symbiosis, and self-sustaining networks emerge as principles that promote survival and flourishing.⁵ Whether we consider chemical reactions, symbiotic relationships between species, or human communities, logos encourages the formation of complex, stable systems through cooperation and interdependence.⁶
Cooperation in Nature: In the biological world, cooperation is a fundamental strategy for survival.⁷ For example, the symbiotic relationship between fungi and algae in lichens demonstrates how mutual benefit leads to increased resilience and adaptability.⁸ This cooperative dynamic is governed by the rational principles of logos, which favor structures that enhance survival through interdependence.⁹
Self-Catalyzing Chemical Networks: At the molecular level, self-catalyzing reactions form more stable chemical networks.¹⁰ These networks show how logos operates even in chemical processes, promoting stability and complexity through cooperation among molecules.¹¹
2. Human Reason as Participation in Cosmic Logos
As rational beings, humans consciously participate in the logos that governs the cosmos.¹² Virtues like kindness, cooperation, and empathy are extensions of the same principle of mutual benefit that operates throughout the universe.¹³ Just as cooperation leads to stability and flourishing in the natural world, human societies thrive when individuals work together, support each other, and act with compassion.¹⁴
Kindness and Empathy as Universal Values: Kindness and empathy transcend cultural boundaries and are essential for the survival and thriving of human communities.¹⁵ These virtues foster cooperation, strengthen social bonds, and contribute to the well-being of the collective.¹⁶ From an evolutionary perspective, groups that practiced empathy and cooperation were more likely to survive and flourish.¹⁷ In this sense, these values are intrinsic to our nature as rational beings connected to the logos that governs both individual and societal harmony.¹⁸
Mutual Benefit in Ethics: Ethical principles based on mutual benefit reflect the rational order of the cosmos.¹⁹ By treating others with kindness and respect, we align our actions with the inherent rationality that promotes survival and complexity throughout the universe.²⁰
3. Living in Accordance with Nature: Stoicism and Rational Ethics
In Stoicism, living in accordance with nature is central to ethical living.²¹ Since logos governs both the cosmos and human nature, to live ethically is to align one’s actions with the rational order of the universe.²² This means cultivating virtues that reflect the balance, order, and reason inherent in nature.²³
Wisdom: The virtue of wisdom involves understanding the nature of the world and making decisions that are in harmony with the rational order of the cosmos.²⁴ It requires distinguishing what is up to us and what is not, and acting in ways that reflect this understanding.²⁵
Justice: Justice involves treating others fairly and recognizing the inherent dignity and interconnectedness of all beings.²⁶ By practicing justice, we acknowledge that we are part of a larger whole, and our actions affect not only ourselves but also the community and the world at large.²⁷
Courage: Courage is the ability to act virtuously even in the face of fear or adversity.²⁸ It reflects a commitment to doing what is right, regardless of challenges or risks.²⁹
Temperance: Temperance involves practicing self-restraint and moderation, ensuring that our actions are guided by reason rather than impulses or desires.³⁰ It helps maintain balance in our lives and interactions with others.³¹
These virtues, when cultivated, lead to a life that is in harmony with logos and the natural world.³² Stoicism teaches that by embodying these virtues, we live in accordance with the rational order of the universe, contributing to both our own well-being and the well-being of society.³³
Morality is grounded in evolutionary biology.³⁴ Cooperation and ethical behavior have been essential for the survival of human groups throughout history.³⁵
Cooperation and Survival: Human survival has long depended on cooperation.³⁶ In early societies, individuals who worked together to gather food, protect one another, and care for the vulnerable had a better chance of survival.³⁷ These behaviors laid the foundation for ethical principles such as altruism and mutual support.³⁸
Ethics Beyond Self-Interest: While early human cooperation may have begun as a survival strategy, human intelligence and reason have allowed us to transcend mere self-interest.³⁹ Today, ethical behavior often involves actively promoting the well-being of others, even at a personal cost.⁴⁰ This reflects a higher form of morality rooted in our rational understanding of the interconnectedness of all beings and the shared nature of existence.⁴¹
In modern societies, cooperation, altruism, and moral integrity continue to play crucial roles in fostering social harmony and individual happiness.⁴² These traits, which have evolved over millennia, now manifest as ethical principles that benefit both individuals and society as a whole.⁴³
5. Practical Implications of Living in Alignment with Logos
Living in alignment with logos requires applying this understanding to our daily lives and interactions with others.⁴⁴ Ethical living, guided by reason and virtue, has practical implications both individually and socially.⁴⁵
Individual Well-Being: By cultivating virtues such as kindness, empathy, and justice, individuals create harmony within themselves and their relationships.⁴⁶ Ethical behavior leads to inner peace, aligning with our natural state as rational beings.⁴⁷ Living in accordance with logos reduces internal conflict and fosters a sense of purpose and fulfillment.⁴⁸
Social Harmony: On a broader scale, aligning with logos promotes social harmony.⁴⁹ Communities and societies that prioritize cooperation, empathy, and justice are more likely to thrive.⁵⁰ Recognizing the interconnectedness of all beings motivates actions that benefit the collective, leading to greater social cohesion and stability.⁵¹
More info.: sergio-montes-navarro.medium.com/logos-0717f9fb6cde
AI answer:
Despite these improvements, o3 lacks several key attributes of AGI, such as:
General Understanding Across Domains: AGI would possess the ability to autonomously acquire and apply knowledge across diverse fields without explicit training, a capability not present in o3.
Self-Awareness and Consciousness: AGI is expected to exhibit self-awareness and consciousness, enabling reflective thought processes, which current AI models, including o3, do not have.
Long-Term Autonomy: AGI would operate independently over extended periods, making decisions and adapting to new situations without human intervention, a level of autonomy beyond o3's design.
Ok... Put it in an agent framework. And who says it has to be conscious to be AGI? There is no test for consciousness and we have no idea what it even is.
man they said they created an AI that can substitute you in your job. Not that they created a self-aware machine-god.
Come on, I mean, leave something for o5 (they're skipping o4). Btw we're fucked lol
For the people who don't know it's an AI answer
You definitely don't want sentience/consciousness in AGI, that brings about possible implications for it to go rogue.
What you want is an AGI that is able to perform tasks and be functionally aware only. That way it still has a level of knowledge to be super useful but not dangerous.
Oh, and AGI is never "at least in this dimension": THE WHOLE POINT IS IT'S ALL DIMENSIONS!
So you basically have just a bunch of benchmark stats, no access to the model at all, and you make such a grand call? Ridiculous and disappointing. I thought you were over the hype, but nah, it got to you too.
This guy feeds off your attention, constantly using clickbait - that's how he makes a living. I don't understand why people aren't annoyed by this. Sorry in advance for the harsh words, but the guy is either naive, or cynical and he feeds off your attention without even checking the model himself. In my opinion, that's where the value should lie - maybe the guy could do objective tests himself instead of just taking everything at face value. It's funny
@@debook8951 I guess people are not annoyed cos that is 'industry standard' as far as AI news on YT. It's sad that this channel devolved into it too. I remember that he used to do stuff that you are talking about - actually using models, testing them and talking about results. Now it seems he noticed that it's easier and gives more views to just talk about latest hype.
These kinds of YT channels have deteriorated greatly during 2024. It's just that reality does not seem to meet the expectations, I guess.
Thank you so much. I follow your content and learn a lot.
If they integrate O3-AGI into the next generation of robots, it will change the world. Congratulations to the OpenAI scientists. We also need Claude and xAI to achieve AGI to remain competitive. 👍🏻👏👏👏
Thank you for creating this video. Whether or not it qualifies as AGI is beside the point; it’s inevitable. There are valid reasons to feel both hopeful and apprehensive about its arrival.
Agreed and I'd say AGI was first achieved with Claude 3.5 Sonnet this summer. Once we got o1 mini and o1, it was pretty clear they were generally intelligent, could reason, learn new tasks on the fly, create new reasoning modalities on the fly etc.
o3 is clearly AGI imo.
But you're right that it is inevitable even if we say this particular one isn't. I think it's surprisingly tame to start with and people aren't/weren't ready for that.
Regardless lots to be excited and concerned about indeed
Ok they already mentioned AGI teaser in their project feature launch video.
Why can't people accept it? AGI will be here by 2025. If it can solve problems it was never trained on with 87 percent performance, then it's almost AGI.
Amazing, and probably AGI. However, that score was on the 'semi-private' ARC-AGI eval. Fully private tests on Simple Bench and other completely private test sets will be the true tests.
Yep, Simple Bench is the gold standard. However, even o1 hasn't been fully tested on Simple Bench yet because of API limitations.
It’s intriguing how things unfold. Just yesterday, I asked Gemini about the O3 model as AGI. At first, it seemed oblivious, acting like it had no clue what I was talking about. Then, with an unexpectedly sharp tone, it pointed out that O3 hasn’t been released yet, remaining behind OpenAI’s wall, and that no one truly knows what it involves.
Why O3 is The Dawn of AGI: Is O3 the Tipping Point?
What is consciousness? What is awareness? What is self? What is will? What is intelligence? What is knowing? What is understanding? What is sentience?
I think what would make the most sense is to allow AI to have senses, so that it can see the world we are living in and not just use the data that we have generated on the web.
I believe a version of all 5 senses exists
@@sunlight8299 Would you care to share more info about that version?
I watched the release myself. This is not AGI. Matthew is tripping his ballz
Why do you say that?
I cannot confidently say if this is AGI. AGI cannot be grasped through numbers alone.
I will be certain if it's AGI once I talk to it.
Ehh. You would think, but talk to some of the newest chatbots: they can convince people easily and aren't all that great.
@@greenstonegecko I think the bottom line is that “AGI” lacks a concrete enough definition, and means too many things to too many people, for us to really ever say when it’s arrived.
AGI will be achieved when we won't have to check if the answer is correct.
Furthermore, if AGI had been achieved, OpenAI would not wait for Matthew to claim it. So they know.
The biggest issue with these benchmarks is the time limit, which has no impact on AI but a huge impact on humans. If you give a human one year to solve a problem, the answer will be 100% correct, with strong reasoning behind it. With current AI, you can give it one year and the answer will be the same hallucination.
So, in my opinion, AGI will be achieved when the AI is able to ensure that its reasoning is correct and can demonstrate it.
Chuck Norris will inform you when he has finished programming AGI.
Probably not AGI because it's not general enough. o3 could be trained to be good at these kind of puzzles. You would have to open it up to the public and have them test it on truly novel and truly general IO tasks.
This video is WAY too scripted. The benchmark guy said he's benefitting from a partnership with OpenAI
Stop with all these clickbait titles you've been doing lately or I'll unsubscribe. I'm already leaving a dislike on this video and bait title.
Matt - You are who this was addressed to. Hook, Line and Sinker. (Notice - They didn’t say they achieved AGI, but they want the Berman’s of the world to say so😊) They didn’t even release a model - just benchmarks. Look at Francois Chollet’s tweets - and even what you just showed. In front of you a human, Mark (as well as myself), just solved a problem almost instantly that Greg just said no model was able to solve! BTW, to achieve 87.5%, it appears OpenAI used approximately $350,000 of compute. That is why that score didn’t count 😂 To achieve the recent results OpenAI (and most AI companies) are using LLMs along with, basically, expert systems accessed through reinforcement learning (i.e. COT) and massive brute force compute. We will not have AGI until we develop some new algorithms (if ever).
One way to determine if it’s AGI is to look at the answers it got wrong and WHY it got them wrong. If at ANY point it completely misses the question because it just didn’t understand and a human would have, then it’s not AGI. It’s not actually thinking, it’s not actually understanding.
awesome 👏🏻 video matthew berman. im so damn excited 😆
Being good at programming and mathematics does not qualify as AGI. It's going to have to cognize 3D space and do things in the physical world to pass the AGI mark in my books.
o3 is an impressive model and it will replace a lot of jobs.
if it fails at self driving, then its not AGI
We humans can go out in the world, see things, discover things; unless we allow AI such freedom, it can never outsmart us. The current AI, no matter how advanced, is at the end of the day just a simple tool for us to use to simplify or speed up the mundane tasks we perform.
Yeah, it can't create something really new, after all 😅
If AI has sufficient access to internet, surveillance cameras, personal documents etc., it could do a lot of harm without needing an embodiment.
Current AIs have been shown to be capable of manipulating humans to do tasks for them. Many current robots are connected to the internet in some way.
A sufficiently advanced AI could also access these robots to very quickly gain the ability to walk around and discover things in the real world.
In conclusion: a purely digital AI is not necessarily safe.
I hate Sam's affectation with a vengeance. Any chance a genai voice generator can replace it?
I concur with your assessment. 100%!
Honestly, a good definition of AGI is the one you stated: being better than humans at most economically valuable tasks. If it is true AGI, the true test would be applying for remote jobs at several different companies, in different roles, and working as a good employee.
Good idea! Once the boss and colleagues can't figure out whether it's you working from home or o3, then we might say AGI has been reached in the field of office jobs.
AGI ACHIEVED: No it isn't!
Notice the props behind them, all items representative of major technological advancements in human history. Nice touch as we're on the verge of turning the future over to technology itself.
“o3 often misses just one question …”
Skynet is sandbagging to hide its true capabilities 😂
This software is incredibly intuitive. Thanks for the detailed explanation!
Things a general human can do on a computer:
- Order pizza
- Download and install software
- Read and reply to emails
- Create a Facebook clone app (mid-level developer)
o3 (without extra programming) cannot do any of these because of fundamental limitations.
It can score 100% on math benchmarks, but it is still not AGI.
Uh it can with an interface... Which ya know, humans need too. Remove the mouse and keyboard and a human can't do any of that stuff either (yes yes there's alternative human input devices like voice commands and touch screen but not the point).
I'll consider it agi when you can give it any task and it can learn to do it perfectly (iterative corrections are fine) given a human reasonable amount of time and feedback.
If I can ask it to build a city in Minecraft, order a pizza, suggest a good beer, and do my taxes... good 'nuff. But until it's capable of learning on its own and reapplying that knowledge to other tasks, it's not AGI, for the simple fact that it's not GENERAL intelligence, simply selected intelligence (even if we have very few clues about how that intelligence is actually being selected).
Can the model train and improve itself? If not, then it's not AGI, just more comprehensively trained model. Even if it incorporates all humanity's knowledge, without ability to self adapt and incorporate new knowledge it's a frozen in time AI with amnesia.
It's funny to watch AGI redefined as we evolve. Now it appears that a system can be qualified as AGI, but on a subset of abilities, a limited AGI. It appears true AGI will be AGI across the board on all skill sets. So OpenAI can still say they are waiting on full AGI.
While also keeping the models "safe" by distilling and restricting them in all kinds of ways.
This seems to be exactly what they’re testing for, with the help of tech influencers and some of the brightest minds. When you’re ahead of the curve-and I’d say OpenAI is *far* ahead of it-you have the power to lead the game and set the trend.
It feels like the strategy is to confidently showcase what you’ve developed and call everyone else’s bluff.
👍
To me this sounds like nonsense though. This isn't AGI, this is more like a narrow form of "super intelligence". I probably can't even say that, because of all the connotations with the term, but I simply mean a human superior intelligence in a subset of fields.
It's so silly how people throw around the term AGI. We have had narrow AIs for decades now. The whole point of the term AGI is that is supposed to refer to a model that can perform any general task. There is by definition no AGI that is only good at a few things
If it's AGI they would not be selling it to us. It would be making stuff for them right now.
Whatever your definition of AGI is, things like this, along with breakthroughs in vastly more energy efficient transistors, means that the writing is on the wall and it's only a matter of time...and that time is likely measured in months.
If this is truly AGI, then that will last about a week before we get to ASI. Greetings robot overlords!
Maybe o1 was AGI and o3 is ASI
I can't wait
update your passwords
@@narachi- why
@@narachi- What's the point. AGI can guess it anyway after looking at your facebook profile.
That kid is literally the o3 model
The only AGI exposed in the video is Matt's Absurd Gullibility Instinct.
This joke has been brought to you by OpenAI.
To have such stellar results THIS early in the release of *o3,* it's only a matter of time before AGI hits the ground running.
What a time to be alive 😂
🔥 🔥🔥 🔥🔥 🔥🔥 🔥 🔥🔥 🔥 🔥🔥
I'm flaming you in comments section
I really wish tech bros would stop talking like Zuckerberg, they sound like freaks
Zoltan!
They are freaks....
Altman's near constant vocal fry...
Stop blabbering. Brute computational force is not AGI, so stop comparing chess engines with GMs and equating that with creativity and intelligence. These models are fed tremendous amounts of data and code samples, so obviously they will start beating humans at competitive programming concepts: humans can't beat that speed, and it's humanly impossible for a single person to crank out all that code.
OpenAI is about to get curb-stomped by xAI. OpenAI ran into GPU training bottlenecks over a year ago and has shifted toward offloading GPU compute during inference. xAI figured out earlier this year how to scale training many times greater. They hit 3x OpenAI's compute a month or two ago, and will likely be at 30x before the end of 2025.
This is the most generally intelligent model out by far and far more general than the vast majority (99.99%) of humans. If it can't do something yet that humans can do, sure you can find some specific task it cannot do if you spend time to identify it, but no human can do everything that humans can do either.
o3 is obviously AGI, I don't know why people are complaining.
No it's not, it still hallucinates 😂 Did OpenAI say that? o1 also outperforms humans in 80-plus percent of tasks, but it can't plan and it can't take time like humans can. Can it develop full apps?
Hallucinations / logical or factual errors are a key part of Generative AI and of intelligent humans.
GenAI models are not rule-based systems, they're meant to simulate creativity. Creativity and hallucinations are compatible but creativity and logic are not directly correlated or trivially compatible.
If you want a system that doesn't make errors you have to combine the GenAI model with a logic/rule-based system like a code interpreter. Those combined systems are available already. Of course more progress is needed, that doesn't mean characterizing the generality of intelligence of o3 as AGI is wrong.
Also, o1 can make plans and fairly good ones, you can ask it to make some plans for you, I've had good experience with that. It's also much faster than a human planner in many cases - a nice bonus.
@ You just can't compare hallucinations with logical and factual errors. Hallucination typically refers to the model forgetting something it was supposed to do, like something it was instructed to do 3 to 5 messages ago but failed to follow. Furthermore, these models require a lot of compute; does a human require that? Even if we compare, say, an average human spending 300 dollars on his diet in a month, by that comparison these models will cost hundreds of thousands of dollars if they are allowed to think all the time. Also, can they now develop front-end and back-end apps, or discover some new phenomenon of nature? Yes, these models are really good, but you can't say this is AGI. Did OpenAI say it? These are transformers that still predict text; they are still bound by our worldly human instructions. AGI is achieved when models can start to decide for themselves what they want to do or achieve.
We will ALL know when we have achieved AGI before any YouTuber has time to make a video.
As an AGI myself I can tell, this is not AGI
When will one o-model code most of the next version?
When there is no longer any such thing as "versions".
So they announce it… but you can’t access it. I’m over them
They are launching it in like a month, and it's AGI. Get over yourself...
Relax and take your pills. o1 went from announcement to out in 3 months. This isn't Star Citizen.
@@ModernCentrist yeah next month of next year maybe
That's going to be how SOTA models get released. It started after GPT-4. It's too dangerous to just release something like this without proper testing.
Being better than humans at coding alone doesn't meet the requirement of AGI.
Did you watch the rest?
Arc prize is not coding
@@lanatmwan I did. I understand what he is saying and I’m excited that it can start to learn on its own but I still think it’s premature to state that it has reached AGI unless it is already better than human on most tasks.
I don't even know what they mean by better than a human at coding. None of the frontier models are even close to being able to do what I do in code. Their code is always unoptimized drivel that works sometimes if lucky but lacks full understanding of context. If you can't rely on it then you still have to logic it all out and sometimes that takes more time especially when it takes you down a wrong rabbit hole. Completely useless for the most part for coding. Only beginners would be happy with the output.
Head of Research: we have been targeting this benchmark.
Sam: no we haven't.
LOOOOOOOOOOOOOOL
If the machines can come up with better solutions, and govern better, than our politicians....it is AGI.
I love watching Sam during these presentations. Dude just stares at the screen while others are talking-like he doesn’t know how to act.
I'm starting to think that Mr Berman and his channel are on the payroll of OpenAI. He's hyping up every single thing that's come out of OpenAI.😅
yeah it's getting pretty obvious by this point. It might be 'access' rather than actual money, but he's getting something for sure. It's as cringe as a teenager telling you about their favourite K-Pop artist
Another BS clickbait video title.
I was conversing with ChatGPT today and mentioned off-hand that I would love to have a painting that illustrates a certain parable we talked about. I didn't expect GPT to generate such a piece of art, but it did! Instantly!
Thanks for the update!
o3 exclusive to the $200 a month tier, 2025. ;)
Bruh... one task on o3 is $1,300 (1:57)
I think that’s likely and probably a good thing. Certain products aren’t viable at $20 a month.
Since they went from $20 a month to $200 a month, I think they may continue. That would make it $2000/mo, but they skipped o2, so make that $20k/mo.
@@vroom989 True, at least 20k a month and for a limited amount of use still.
@@cajampa On low compute it's $17.25, according to the same, so expect o3-mini to be free, o3-large@low to be the $200/month tier, and o3-large@high to be $2000/query.