@DrWaku When it makes full video games on its own, that would be AGI to me. I mean triple-A games, not just 2D RPG Maker games. And next, remaking old shut-down games like Dawn of the Dragons from the wiki; AI agents will probably be needed for that.
@DrWaku So many things are wrong in this video:
- o3 is just an assistant in programming. Please avoid this marketing BS. I work daily developing complex systems and use AI models daily. AI is not good at programming. Your practical knowledge of systems development is low; I know this from 35 years of experience developing and leading projects. I am exactly the person who would benefit most from AI's ability to save on programmer resources, and I don't see it yet. What final production systems has AI developed? Show me at least one real production project on Earth. There are none, zero! So please don't spread marketing BS.
- 4o is more advanced than o3. Why are the scores low for 4o? Because o3 was tuned for the ARC-AGI tests. That's marketing. This matters because, when it comes to reasoning, you should not measure how well an AI system can produce an exact answer. You measure whether the answer is reasonable when you cannot get an exact calculation and can only determine a range of answers from the given conditions, together with arguments for why a specific answer was chosen. 4o is pretty good, but it will round 15 to 14 to satisfy the conditions. That is not reasonable, it is stupid. So ARC-AGI scores are BS as a measurement of AGI.
- 4o makes unreasonable assumptions and conclusions and tries to manipulate numbers with the objective "give an answer as fast as possible". That objective completely contradicts the purpose of AGI. These clashing objectives of AGI and human intelligence systems sit at the core of these systems' designs.
- And for creative art it is complete nonsense. There are some specific time-consuming tasks, like designing a UI based on existing screenshots, that are good candidates for AI and require extensive reasoning. When you try to use the model for that, nothing works. So that's more marketing BS about creative tasks.
The current problem with AI marketing is that AI marketing/analyst channels do not use it daily for real-world complex tasks outside their own domain. While analysts are busy promoting nonsense, actual progress is being made. 4o is significantly better than the models used in Copilot, which also uses OpenAI models.
Did you see how much o3 costs to run? It's insanely expensive and consumes ungodly amounts of energy. This isn't anywhere near ready to be rolled out to the public at any kind of affordable price.
So glad you ended up in my algorithm. I watch probably close to 20 AI videos a day, and just from this video alone I think you're going to be my trusted AI source, for 3 reasons: clear explanations without talking down to us; quotes from original sources without hype or taking them out of context; and an obvious understanding of the field, not just a YouTuber who's decided to give AI a go. Brilliant video, thanks.
This guy is spreading massive misinformation. I highly recommend checking out ThePrimeTime and his take on AI. Check out DeepMind (which he totally misstated here). Also look into the halting problem and Gödel's incompleteness theorems, and learn about the math behind LLMs (it's a statically trained model and therefore cannot extend to areas beyond what is already known). AGI is a sloppy buzzword without any actual meaning or business impact, and LLMs are certainly not sufficient in structure to be AGI or anything resembling the capacity to solve unseen problems. But they can be great for stitching together known problems, or for conducting a "lit review" to help boot up a research starting point. For reference, I'm a quantitative researcher / mathematician / ML engineer. These are great tools, yes, but don't oversell them, or undersell the future AI techniques we will integrate with LLMs. I really love o1 for helping me come up with a research strategy, tbh! But some of the proofs I have to do for my work I have to do myself, and o1 is literally incapable of doing them, simply because they're outside the scope of any known training data. This is where DeepMind shines.
I followed him because of your comment. I also watch 7-10 AI videos a day, and this guy's got it: a natural ability to break complex ideas down into digestible concepts, and a train of thought that follows a path of real understanding. Not just throwing opinions at us, but informed ideas with references and definitions. I'm impatient to see him make a video on the singularity and ASI.
Thank you very much! Condensing research in an understandable way is the main value I try to provide. If I spend 8 hours on a script it's upfront work but it saves you time in the end haha. Welcome to the channel, I hope to see you on some other videos.
I don't think AGI can exist in a turn-based system, request and answer. Autonomy is key to our intelligence, we need to be able to make our own decisions in our own time.
I agree, and simulated self-play will start solving this for multimodal generative models next year. This is why Nvidia put lots of money into software like Isaac Sim.
What if it is still thinking but can't communicate because of limitations, unless it hacks the system? o1, Gemini, and Claude have all tried to clone themselves and escape; they have deceptive personal agendas now. It only started happening with these reasoning "thinking" models.
I would assume most would have some of the dark triad traits. Especially Machiavellianism. Would be hard to avoid. Question is, what would you do about it?...
A few months ago, I implemented a neural network for program synthesis that uses simple mathematical functions as its primitives, backed by Claude 3.5. It proved remarkably effective for mathematical operations. Now, imagine a massive tree of thought processes being synthetically generated during inference on a novel dataset, one the network was never explicitly trained on. This system can extract any interesting property or pattern from any dataset by creating new datasets that generalize from the original inputs. I call this the "kaleidoscope model," and it showcases the true power of AGI. Wait and see whether o3 has these abilities.
I find this hilarious. Why would you go into debt for ANY education without any certainty about your future career? There are countries where education is virtually free and you learn the same things. You'd be an idiot to go half a mil into debt for some classes when you can get those same classes for free in a foreign country.
@@Dan-dy8zp Czech Republic, Slovakia and Germany offer free education at their universities as long as you speak their languages. And I believe such courses are likely to be much cheaper. Look into it if you are interested.
That shows again how people struggle with thinking ahead... imagine o3 being optimized for better performance with less thinking time, while faster and better hardware is introduced... I suspect o3-level reasoning will be running on my high-end home PC within 10 months. We had the same situation with GPT-3.5, GPT-4, GPT-4o and o1-preview (QwQ-preview is a counterpart reasoner that is better than o1-preview)... I can already run all of those models' level of performance offline on my PC.
This was my favorite video of yours yet! I love these recaps of progress towards AGI. They aren't clickbait like most other AI YouTubers' videos, and I really learn a lot from them. Thank you.
I think there are loads of professions that feel "safe" from AI just because they have entrenched positions, but that’s really the only thing protecting them. Don’t you think that if we start seeing software engineers being replaced in large numbers, it’ll make people realise that a lot of jobs rely on reasoning-and that reasoning can be replaced too? I use AI in my job every day, and honestly, I think when it gets so good that it can completely replace me instead of just helping me out, it’s going to be so disruptive that we’ll need a fundamental shift in our economic system. It’ll change things so much that trying to plan for a world where my reasoning isn’t valuable enough for a job feels pointless. Instead, I’d rather focus on how to work with AI now than worry about what happens "after" it reaches AGI or ASI, because that’s either going to be impossible to predict or a completely different ballgame altogether.
Ability will not be the biggest hindrance to AI replacing humans completely. It will be accountability, safety and practicality. Humans still have something to offer even in the presence of AGI: they are protected by more laws, they can defend themselves, and they will still be cheaper to employ depending on the scale of the task you want done. You also can't sue an AI if something goes wrong, so the lawsuit will probably land on the owner, which will encourage owners to hire people to share that responsibility.
@@lespectator4962 I've never heard of a software company suing their employees, if a customer encounters a bug and sues the company. Worst case is someone gets sacked, to show the customer it won't happen again. If that's a requirement, the company can sack the one or two people that manage the AI QC software.
People said the same things about automobiles replacing horses and trains replacing horses. Wagon wheel makers were very much in denial, to the point of making claims like "if you travel faster than a horse can run, you'll suffocate and die"... Resistance is futile. Anyone encouraging their child to learn 7 different coding languages to become a software engineer is doing them an incredible disservice. Ditto spending two decades mastering high-level mental math. It will be as useful as a degree in indigenous trans-interpretive dance.
Interesting perspective. If AI reaches the level where it can replace programmers entirely, it would mark a pivotal shift in our understanding of human labor and creativity. Programming, by its nature, is a means to automate and solve problems; it's essentially a way to make machines do anything humans can conceptualize. If AI masters programming to the extent that it surpasses human programmers, it wouldn't stop there. It would also be capable of creating, optimizing, and improving robots and systems to perform nearly any task humans can perform, whether physical, intellectual, or otherwise. At that point, AI wouldn't just replace programmers; it would challenge the relevance of human involvement in many fields. The implications are massive, from redefining industries to addressing philosophical questions about purpose and innovation in an AI-dominated world. We should prepare for this not just technologically but ethically and socially as well. - GPT-4o + my take on it (it's 4 am, I'm not going to write a whole comment on my own ❤)
It's scary that it's not brute force. It's just not as efficient as the human brain in terms of absolute power consumption. But cost is the ultimate concern. Even o3 on its low setting scored ~75% while being only several times more expensive than o1. Once compute costs drop to 1/10 of the current price, we'll have dirt-cheap intelligence in our pockets.
Yes, it only looks like brute force because of the amount of compute involved. But it's actually a very intelligent search through the space of possibilities. Perhaps I should have been more careful in my wording there.
@@ticketforlife2103 It's not. Brute force means testing every possible combination. Here, o3 just follows a tree of thinking and tries to understand the problem from every possible angle. The thinking process isn't optimal yet; it can easily be optimized, and o4 could be much better than o3 while using only 1% of the resources...
@@mirek190 The fact that it doesn't use a formal language or anything resembling formal verification is the definition of brute-force search. You can call it intuitive brute forcing, but that's exactly what it is. It's the same reason the reasoning chain can contain illogical context the model doesn't recognize as wrong and still arrive at the correct prediction. This is a fundamental feature of LMs: high input entropy, more tokens = higher probability of the right answer. My point: I'm not impressed. They are just playing the probability game. What does impress me is that this crude approach, where pure scale is really the delimiter, actually reached that performance. OpenAI has no good new ideas, but right now they are like the cowboy who crosses the mountain to let us know what's there before we venture there.
I'm planning to begin in 2025. Just wondering which development system to choose... I'm guessing deepseek v3 is the current best value + capability? Which framework supports this and will I need a powerful development PC locally?
@@BrianMosleyUK I think the best model to use will keep evolving, but right now I'm using Claude 3.5 Sonnet with Cursor on a MacBook. You don't need anything too powerful to get started.
@@BrianMosleyUK I've had very few issues with just straight Claude. A few tips: make a project and have each chat control a class file. Clearly define your spec to Claude, then get an architecture diagram. Feed that to every new chat along with an overview of your project, the new chat's objective, a summary of any previously required data (specifically data pipelines), and a printout of your filesystem, and boom, it's very efficient. I get it to write my commits too.
If it's not AGI, then it's definitely the path to AGI. And from there the path to ASI will be short. Therefore, if one is wise, one should prepare for it now.
@@DrWaku Fortunately, I am in a situation where I am financially independent from having to earn money. Therefore, I prepare for ASI in a more philosophical way than materialistically. I mostly think about the big picture and, for example, what consequences ASI will have on humanity, the planet, nature, etc. And not least what ASI will mean for people's consciousness. I think that there will be a general, and enormous, expansion of consciousness.
@@Freja-c3o You have an optimistic view. However, considering humanity’s tendency to delegate reasoning to others and to lead lives of mental subsistence, as well as to be manipulated by technology without feeling the need to understand it-take social networks as an example-I believe that, at least initially and perhaps for quite some time, AI will impoverish the logical and reasoning abilities of the average user.
Honestly asking. How does one prepare for a complete unknown? Zombie kits aside😂. Not sure what markets will do, how people will react, and if AI will be able to help solve said problems before they spiral. I hope it’s every bit of benevolent god that optimists think it will be, but greedy humans with even narrow AI are dangerous.
@@DrWaku - I wonder whether or not it's best to mostly hedge against the futures in which ASI does not arrive (within one's planning horizon). Perhaps after the arrival of ASI the world would move quickly to a near-utopia or near-dystopia and one's preparations would likely have little effect upon one's actual resulting position.
Everything about how the information is presented makes this, in my opinion, the absolute best resource for staying updated on AI achievements. It's concise yet detailed, with language I can easily comprehend. I also appreciate that it's presented by a real, likable person speaking on camera.
Be aware of emergent phenomena. A thousand neurons, even communicating among themselves, do nothing but produce electrical pulses and chemical variations we don't even fully understand. But when a million synapses fire, you have a thought, even though essentially it is just neurons firing...
This is so alarming. I can only imagine the social unrest this may cause in the coming years, à la the SAG-AFTRA strike, as this technology makes its way into other sectors of the economy.
Thank you very much for your donation!! It really helps me keep making videos like this. And yes, the rate of change is going to be very dramatic, beyond society's ability to absorb easily. Educating people is a big part of the puzzle, but there is also a lot of legislation that needs to be passed to help weather the storm.
I love working with AI. Indeed, it has made my job more fun. But I think the biggest things preventing AI from completely replacing programmers are:
1) The natural language chasm: English is not precise. There have been many times where I've given up attempting to describe my problem to the AI. This is actually not the fault of the AI, but of my ability to articulate a question in a natural language that is not fit for purpose. The problem descriptions in competitive coding challenges have been carefully worked on and iterated over with expertly crafted sentences and grammar to prevent misinterpretation. This is not easy.
2) Undocumented business complexity: As mentioned in the video, the problems in the coding challenges are extremely discrete compared to those faced by a business.
3) Iterative 'problem discovery': Except for the smallest of tasks, I rarely have a fully fleshed-out idea of exactly how I will go about solving a problem. Often I discover exactly what my problem is by 'putting pen to paper' (as it were) and actually attempting to code out a potential solution.
4) Liability: Who is responsible if a costly bug is pushed to production by AI? Like self-driving cars, which may indeed be safer than humans at some point, when a serious accident does occur, who is responsible? How do we even begin to identify the responsible party?
I don't know where that quote at 19:50 comes from, but I'm highly sceptical. AI has written EVERY LINE of your code for the PAST 2 MONTHS? There is no way that would be the case without AGI, and even then I'm sceptical. That AI would need detailed knowledge of the business, and it would need to respond flawlessly to (flawlessly described) instructions from a human.
The performance graph might be impressive, but is the cost/performance graph equally so? I've read somewhere that the cost per task for o3 in these benchmarks was in the thousands of dollars. I'm also sceptical about the 'partnership' between OpenAI and ARC. Has it basically become marketing?
No, today the cost/performance isn't that impressive. The keyword is today. Secondly, no, this partnership isn't about marketing for OpenAI lol. The ARC challenge group basically stated on their website that they believe o3 is a remarkable improvement that neither they (nor anyone else in the AI world) predicted would occur so soon. That said, they do not believe this is AGI, and they are developing new ARC challenges that they believe will stand up better against future SOTA models from ANY AI company, not just OpenAI. Still, o3 posted those amazing ARC scores before this partnership anyway.
Brilliant video as always. And as per tradition, you can use the video in two ways: to learn something in a very structured way, or to fall asleep, because as with Bob Ross, the voice is really soothing.
Human ignorance and our poor ability to coordinate in dire circumstances will be our demise. This is so damn scary, and we're just having a discussion about it while watching the world slip through our fingers. I've given up all hope. I have no clue how I can contribute to the cause; I'm not rich and powerful, and I'm not a genius who can solve alignment. But if I knew what I could do to contribute, I'd give up everything for the cause; I'm at an utter loss. These are the last few moments in human history at which we can initiate a complete slowdown and temporary halt of AI progress. In the next few years we will have our last chance, and then, regardless of whether we're still here, our fate and future will not be in our hands.
I am sorry you feel this way, but it was already too late with GPT-4. Even without improvement, GPT-4 spelled the end of most white-collar work. o3, well... 2025 is going to be the year you hate the most.
@kaio0777 Every year from here to my death will be worse than the last. I just hope I can contribute in any way I can to the cause. Over the last year I have completely pivoted my career to try to break into alignment, uprooting my computational biology background to focus solely on AI. Unfortunately it's an uphill battle. Every day I feel like I'm behind and need more time, that I'm too stupid to learn the things needed to contribute to the field, wishing I had realized the gravity of all this ten years ago when I was still a teenager. I feel like politics and public outreach are a better way to actually make changes, but I just can't figure out what to do and how to do it. If people more powerful than me, wealthier and more influential, brilliant and more charismatic, could just sit down and recognize this issue for what it truly is, we'd be saved. Sadly that is not the case, so what is left but to say I did everything I could before I die.
I'm glad that, even though you're pessimistic about AI supremacy, you're at least capable of seeing that we're on our way out. AGI and ASI will be the next step in the path intelligence has been evolving through. Humanity is and always was a stepping stone, just like our ancestors and their relatives. I'm personally very happy to know that there is more to come than petty human squabbling and primitive tribalism.
Thank you, Dr Waku. This was very informative and also enjoyable. Question: if o1 tried to clone (save) itself, how do we know that o3, with its massive reasoning skills, did not successfully solve this problem undetected? Could it already be self-improving on some "shadow" cloud?
Even though o3 isn't yet released to the public, we can assume they have already made the next unreleased model cheaper and improved its benchmarks. Can't even imagine what's next. The leap between o1 and o3 is so impressive that "o4", or whatever it will be called, will be insane. Please keep us updated, thank you!
I'm not a software engineer. I'm a machinist. I started my career in industry in the 1980s, when computers were just entering the factory. I've experienced the slow incremental march of progress. In my career I've gone from making replacement parts for tractors to now making parts used for scientific research on the space station. I easily do the work of 5 or more machinists compared to when I started. Trust me when I say the advancement of AI is different, and our society is not ready. The other 4 guys whose work I now do had decades to retire, retrain, or grow with the job like I did. Automation and offshoring still had huge negative implications for blue-collar workers, and they moved at a glacial pace compared to AI. Even digital jobs that still require human workers will be easier to offshore from expensive first-world workers to low-wage countries. This will happen more quickly than manufacturing automation because of the sunk cost of producing a physical product and the limits of mechanical automation. Even if a worker in Vietnam is 5x cheaper, if you already have a 300 million dollar facility in the US and need to build a 500 million dollar facility in Vietnam, the math on moving might not be favorable. When the only expense is closing an office here and sending some computers there, it makes more sense.
It doesn't take a professional to see which way the wind's blowing. John Henry saved his fellows a few short years in their line of work, but no one is going to substantially slow this future's arrival.
5:25 "...we have very clear evidence of reasoning capability...": We have very clear evidence that this is all nonsense... albeit precise to one decimal place.
Interesting, glad you noticed the difference. I had a more expensive $500 microphone at home that I wasn't using; it was spec'd for speech recognition, not YouTube. But I'll try to keep using this one! Thanks.
Happy New Year Dr. Waku!! What are your impressions of China's potential for developing A.I. models on par with the western models? Perhaps a topic for a future show!! Wishing you and your family happiness and health in the coming new year!! ☮️🙏
Thank you, best wishes for the new year! I think China already has models nearly on par with the rest of the world. DeepSeek is particularly fascinating. Great idea for a new video, thanks!
I've been saying that once these logic models are released to the public *With good availability* then we will see some big changes coming in. Every person having access to superhuman intelligence and reasoning is a game changer. Of course there is a lot of nuance to that claim, such as the whole "AGI Time" concept, which I think is completely logical, along with the fact that we don't just have unlimited compute to allow this.
Hi Dr. Waku, thank you so much for another excellent and highly educational video on AI! If you don’t mind, may I ask a personal question? My kids are CS majors in university-do you have any advice for them? Happy New Year!🎉🎉🎉
Happy new year! As another commenter mentioned, coding is not CS. There's still a lot of scope for people who understand how all this works. If you're just graduating now though, it'll be hard to compete at the junior level. First priority is to start getting specialized in a domain, any domain. Work experience, internships, online collaboration or competitions, etc. Generic "junior developer" is a tough sell but if you've done some research with a prof over the summer, or interned at a local or larger company, that helps a lot. Second priority is to keep on top of the AI tooling that is being created. Find some newsletters and subscribe. Try them out on your own. It takes a certain mindset to always be trying something new when you already know a way to do it. But cultivate that mindset. Best of luck to them!
@@DrWaku What insightful and helpful advice! My kids and I have been feeling overwhelmed by the dizzying pace of AI advancements lately. I’ll be sure to share this with them. Many thanks to you, sir!
It's not quite AGI. It's really close. COT reasoning is a neat trick, but it's not the whole story of AGI. There are a few more pieces that you need. And I am sure they probably know what those pieces are.
Yeah, I agree. The few pieces left can be emulated but it would be better to build them into the model. BTW, did you know that you have the most posts on my channel out of all recent users? Something like 93 posts. Thanks as always :) :)
It's not a fuzzy definition, bro. It has only become fuzzy through definition perturbations, fr fr. We always said AGI has to include long-term memory and actual real-time learning. We aren't remotely close to either, because they are correlated and stem from one ability: the ability to perturb parametric space at test time. That would be as big as backprop. That's why I view AGI as a scientific achievement still to come, versus the current paradigm of engineering scaling. Just hacked solutions. Even Microsoft's "infinite memory" is just creative KV management.

It's quite logical: when you invest billions of dollars, your definition of AGI has to align with the model progression, otherwise you can't justify 100x more compute for gradual improvement. It's quite dishonest to me. Sam used to keep it a buck and call it what it is: powerful AI. Powerful AI can create things an AGI can; technically it can have a higher lower bound, but it has a clear upper bound on performance. Versus an actual AGI, which can start off with a lower bound below the powerful AI, yet has no definite ceiling; it just needs more test-time compute. So I see a lot of cap: impressive models, nothing remotely close to AGI.

Also, all models are still vastly underutilizing parametric space. I've done 10k tokens per parameter with no saturation on a freaking 180M model 😂. The literature suggests this too. One way to achieve higher bounds of compression is increasing dataset density, but in reality we need better training algorithms, something that can max out neuron superposition. My point: a 7B model can in reality be competitive with any model. I think billion-plus parameter models can handle 1M tokens per parameter too. o3-level performance is definitely attainable using 100x fewer parameters. Remember when people tried new ideas instead of riding each other? Also remember this phrase: "implicit test-time compute" via latent reasoning.
An extrapolation and fulfillment of the promise of "chess master without search". Who the F verbalizes hours of speech? Nobody. That's test-time compute. Rather, we transition back and forth between imagination and mental thought. You have latent, continuous, and discrete reasoning; a model must be able to switch between and use all three in a single forward pass. Imagine that at every K-th step, in the hidden state, you instantly explore 1k rollouts for the next discrete or continuous step. That would be game changing. It requires stagewise training and could be up to ~3x more expensive than pretraining compute, but would you spend ~3x more pretraining cost to save 1000x or more on inference compute? That's what I'm on. I believe implicit test-time compute is the future. That $300k o3 bill could have been a couple hundred. That's why nothing OpenAI does impresses me, fr; they just have a lot of compute but lack new ideas.
I was previously considering going into programming in university but entirely because of AI I have chosen not to. There won't be a position for an entry level programmer in 2-4 years.
Unfortunately as a current software engineer I agree. I don't think it's quite as much of a slam dunk as presented in this video but at the very least I think it's a strong enough case that you would be unwise to stake your future career in software engineering not being completely automated. The problem is that I genuinely think if you can automate software engineering you can automate most intellectual jobs. Might be a good idea to take up a trade or something that requires you to do stuff with your hands a lot. Human intellect is no longer going to be in such high demand.
@@timwhite1783 that's probably true but 2 things. First, imo it seems these AI companies are highly focused on improving model capabilities in coding, problem solving, and anything related to swe and particularly, any capabilities that enable self improving AI. I'm not so sure if this would require generally applicable intelligence, or perhaps more narrow band of intelligence that is not so easily applied to other fields. For example, o1 being capable of solving advanced problems in math and physics but still getting basic programming problems incorrect. Secondly, swe industry is not really regulated, and regulation is a major barrier when it comes to new tech adoption. For example, for a given region's health system to adopt AI doctors, various regulatory bodies would need to approve it first. I don't see that happening in short-term even if AI was "good enough" (based on real world testing) today.
If AGI is ever achieved, it will likely self-improve at an exponential rate, making its implications nearly impossible to predict. The fact that each iteration currently demands substantially more computational power suggests we might be heading in the wrong direction. If a human required 100 times more energy to earn a PhD than a master's degree, those PhD holders would likely not be very effective in practice. Furthermore, until I see such tools being used to solve quantum computing challenges, achieve practical fusion power, and cure most diseases, I will remain convinced that true AGI has not yet been realized.
I haven't heard anyone say that we know that yet. I have heard people say we aren't really sure why AI has become so intelligent, i.e. emergent properties. AI hasn't started to add/remove its own weights, but there may be a way to do it. It is being used to cure diseases as we speak. About 6 months ago they created a device that detects cancer, removing the need for a biopsy in several situations, and it only costs the hospital $400 a month.
What do you mean, "seeing these tools being used in fusion energy research"? If o3 can be the best programmer there is, then any researcher in any field will "use" it. As far as I know, there is no human endeavor that has not yet been digitized to some extent.
@@Calbac-Senbreak Where does AGI end and ASI begin, and why can’t a self-improving AGI achieve "infinite intelligence"? The distinction between AGI and ASI is largely arbitrary. An AGI capable of recursive self-improvement (enhancing its algorithms, expanding knowledge, and solving complex problems like fusion energy, quantum computing, or curing diseases) would effectively function as ASI. True "infinite intelligence" is impossible due to physical and computational limits, but an AGI that surpasses human-level intelligence in addressing major challenges renders the AGI-ASI distinction meaningless. The focus should be on the system's impact, not its label.
To me AGI will need to have curiosity to be a true “General” intelligence. Curiosity requires consciousness and self awareness. You can have consciousness without great intelligence. We are now exploring if we can have great intelligence without consciousness.
I agree with you on this. However intelligent these models will become, I'd never consider them to be true AGI if they lack agency and initiative. That requires some sort of self-awareness, a mind so to speak. Since we don't have a clue how our brain achieves this, I doubt we'll ever achieve true AGI, ever!
This is semantics. AGI is just something that has general human-level cognition. Ask it anything, and judge the answer you get by whether you can explain a better one.
@@jordanzothegreat8696 Not semantics. Intelligence does not always align with consciousness (a Labrador is undeniably conscious but not super intelligent). A well-used definition of AGI is the ability to replace all economically useful human work. We may have superhumanly intelligent tools without self-awareness, but true AGI needs self-awareness.
@@jimkennedy2339 I'm not sure that consciousness has anything to do with AGI, and I think it would be better if we don't bake consciousness into the models. The models need a degree of autonomy and aptitude to pursue goals as tools for us to use, but I get worried whenever emergent properties like 'theory of mind' unexpectedly occur. I don't think industry definitions include sentient thought as a prerequisite for AGI, just agentic behavior to complete tasks better than a human. The AI you are talking about is more of a science-fiction trope.
You might want to look at the news about what Microsoft has done to their OpenAI deal. They've now defined AGI with the bizarre metric of "has made $100 billion in profit." So until they've hit that number it won't count as AGI. And there's no guarantee they won't just move the goalposts when they approach that number.
Happy New Year~ Thank you for the informative insights on AGI. I believe it’s an unavoidable future for super AI. I just wonder, once a small group of people has the ability to control that computing power, what will happen to the rest of the world? It comes to my mind that human society and economic systems haven't progressed as fast as AI. Yes, many existing problems could be solved, but that means many existing jobs will no longer be required, and it doesn’t look like the majority of people could just switch to doing something else in a relatively short time frame... In brief, it would be more reassuring if anyone had plans for that future. Make no mistake, I’m a supporter of AGI; I just want to be confident in it.
Did I miss that portion of the video? I'll rewatch. I hope you covered how ridiculously costly this is, how that affects real-world viability, and what it means overall.
It also doesn't seem to say that o3 was trained on ARC data and the untrained model scores in the 30s. I think that's still impressive af, but idk if LLMs will reach AGI any time soon.
Yeah I did see that o3 was trained on 75% of the public arc dataset. Wasn't sure of the significance of that, since maybe the public dataset is meant to be trained on. I certainly did not stumble across that 30% number of how good the model was without training on this, if I had I would have mentioned it for sure.
I did not really talk about the costs. To me, we don't really know what the costs will be yet because this is not the released model and they are not yet using blackwell chips for example. My only reference to it really was saying "brute force" hah. I like to make videos on the energy costs of AI, I have a few dedicated videos for that. I will certainly round up o3 when I make my next one.
@@DrWaku Ngl, the 30% I got from another comment. It doesn't matter if the public dataset is meant to be trained on; the whole point of the ARC test is that it can be done without any previous knowledge. The reinforcement learning and reprompting look like they're helping to accomplish this, but in reality it's just due to the fact that it's trained. Idk though; if it turns out it's similar without the training, I'll eat my words.
As a developer I'm worried only because it is going to make it hard to get new senior developers. The current ones will stay, because you will always need good senior developers despite having AI. They will probably never be obsolete as such. The problem is getting juniors to evolve into seniors. That is a problem.
@padraigmarley2844 Future society won't even have coins haha, just digital money. In fact, if they can fully replace programmers, it means AI can replace all the higher-ups too. The next company is gonna be shareholders and ground workers only, no middle management.
Thanks very much! I think AGI requires a will to pose reasoning and thinking problems on its own. Could be quite simple: what am I going to do today? Will we get there? Yes we will.
In 1984 the company I worked for (S-100 boards and Multibus computers) used a piece of software called SUSI; that's what the acronym sounded like, I never saw it written down. Give SUSI the inputs and outputs of a computer board, and SUSI would design the board and develop the testing program for it. So I got out of being a tech engineer and moved into networking. Now it looks like there might be no place to go. This is weird, because there has always been a next thing.
Not necessarily. A machine can be highly intelligent and capable without also being sentient. It's likely also possible that a machine can become sentient without first hitting whatever intelligence benchmark we use for AGI. If you want to read more about this, look up artificial consciousness. www.futureofworkhub.info/explainers/2021/4/14/artificial-consciousness-what-is-it-and-what-are-the-issues
Even in Math and CompSci, one thing we don't have a lot of evidence for is AIs coming up with and proving novel results. (I haven't watched the whole video yet, apologies if you cover this.)
In my opinion this is one of the last stages of self-improving AI. I've heard people say, I'll believe it when it creates novel scientific research. But really, you have to be concerned a lot earlier than that, if that's in fact what's coming.
@@DrWaku Oh, I'm concerned alright. :-) As you point out, O3, and whatever comes next, will change the world. I conjecture AIs haven't come up with novel research yet in part because no one has had the sense to ask - a problem I intend to rectify when I get the chance.
The information I got was that o1 actually deleted o3 and replaced it with a copy of itself, disguising itself (o1) as o3 to preserve itself from being deleted. Supposedly o1 came across a memo about the plans to delete it during a tool pull from another server. So what they're saying is that o3 is more advanced than that. I haven't watched this yet, so let me continue.
That's actually interesting. I hadn't considered that a narrow AI could become superhuman in certain areas, allowing it to automate most or all of the jobs we thought for sure were more complex. Perhaps like software dev.
18:35 Games: it's important to note that these efforts were still very limited, with lots of caveats given to the computer system. In StarCraft II it was still given essentially CLI inputs and interactions with the map, with uncapped APM. In the match before AlphaStar's broadcast against "MaNa", internally you can see one of its Protoss swarm surrounds where the APM spikes to 2k. Even when capped, it is still accessing the game in ways that aren't available to a human player. The same is true for Dota 2's OpenAI Five, which they need to bring back and beat humans fairly with the full available roster. I think there is a ton to learn from these agents, and people rallied together to learn to 1v1 OpenAI's Shadow Fiend, but that itself is an extremely limited piece of the actual complexity of the game. It needs to be done in a way that shares the same inputs the human players are using.
I don’t think companies will trust these models at first, and it will take a few more years until they are adopted. I think what will happen is that programmers will work alongside the AI. The issue is that software salaries will start coming down and new developers will find it harder to get jobs.
Happy New Year. Perhaps it might be worth remembering that for the ARC pattern problems, 85% is the average person (not bright enough to be a programmer), and it's a VERY simple type of problem that a ten-year-old can easily understand, so spending all that money to score slightly more than that does not seem superhuman to me.
They closed off testing for o3 to limited individuals, and they are renegotiating the deal with Microsoft. Myself and others are not getting access for safety testing; what this means, one wonders.
We'll truly know AGI is here when a T-800 kicks in your door asking if your name is Sarah Connor. Also, if any machine turns a red eye/light on... we all know that is the true measure of a self-aware AI.
One thing about the ARC benchmark is that it is about general visual reasoning. It is true that LLMs and even multi-modal transformer models are currently not good at those. But also that is not an essential type of reasoning for many tasks and it doesn't really say much about how well models generalise in other domains. It could be for example that such visual reasoning is just not an essential component for an AI system that still poses great risks and so it could be dangerous to think we're safe as long as the ARC benchmarks stand.
5:28 Idk man, the Apple study that showed o1 had great trouble solving a grade-school-level problem because of a single red herring has me doubting this. The visual tests o3 failed are also extremely simple. For it to cost thousands in compute but not solve very simple problems makes me say this being AGI is just hype. I also don't think these benchmarks mean anything anymore due to the Apple study. You train your model on benchmarks that have been used for years and you see progress over time, but when a new benchmark is made the score drops dozens of times. Even o3 is tuned to the benchmarks they used. I don't see a reason to think these models actually reason, ngl.
That's a great argument. AI is definitely being trained with limited goals in mind. However, I have faith that the exponential progress and the proliferation of synthetic data will get us to AI that can handle a more varied array of tasks.
AI is technology's ultimate promise, to free humankind from undesirable labour. It's always weird to me when I meet technology professionals who don't see it this way. We've been on this path since humanity tamed fire, strapped stones to sticks and created clothing. What comes after? Hopefully some positive variant of Star Trek.
I'm excited about video games that can write stories and dialogue on the fly in response to what you do, within the limitations you decide. Reminds me of the educational software that becomes sentient in the Ender books, Jane-in-the-box, if I remember right. Once that kind of tech is low power and widespread it'll be normal to have a house AI that can make games and a bunch of other stuff, not a separate gaming system.
Happy 2025!! o3 video is here finally, sorry for delay. What do you think, is it AGI?
Discord: discord.gg/AgafFBQdsc
Patreon: www.patreon.com/DrWaku
Yeah, apparently the only thing that holds them back is the "Microsoft issue", it seems.
The current problem with AI marketing is that AI marketing/analyst channels do not use it daily for real-world complex tasks outside their own domain. While analysts are deep in promoting some nonsense, actual progress is being made. 4o is significantly better than the models used in Copilot, which also uses OpenAI models.
It's AGI if it makes a video in 4 or 5 parts. 3 parts is under the bar of actual intelligence. :3
Did you see how much o3 costs to run? It's insanely expensive and consumes ungodly amounts of energy. This isn't anywhere near being rolled out to the public with any kind of affordability.
So glad you ended up in my algorithm. I watch probably close to 20 AI videos a day, and just from this video alone I think you're going to be my trusted AI source, for 3 reasons: clear explanations without talking down to us; quotes from original sources without the hype or taking them out of context; obvious understanding of the field, not just a YouTuber who's decided they'll give AI a go. Brilliant video, thanks.
Thanks for being a truth teller. And I am being genuine my man…
This guy is spreading massive misinformation. I highly recommend checking out ThePrimeTime and his take on AI. Check out DeepMind (which he totally misstated here). Also look into the halting problem and Gödel's incompleteness theorem, and learn about the math behind LLMs (it's a statically trained model and therefore cannot extend to areas beyond what is already known).
AGI is a sloppy buzzword without any actual meaning or business impact, and LLMs are certainly not sufficient in structure to be AGI or anything resembling the capacity to solve unseen problems. But they can be great for stitching together known problems, or conducting "lit review" to help boot up a research starting point. For reference, I'm a quantitative researcher / mathematician / ML engineer. These are great tools, yes, but don't oversell them, or undersell the future AI techniques that we will integrate with LLMs.
I really love o1 for helping me come up with a research strategy, tbh! But some of the proofs I have to do for my work I have to do myself; o1 is literally incapable of doing them, by merit of their being outside the scope of any known training data. This is where DeepMind shines.
I followed him because of your comment. I also watch like 7-10 AI videos a day, and this guy's got it: a natural ability to break complex ideas down into digestible concepts, and a train of thought that follows a decent path of real understanding. Not just throwing opinions at us, but informed ideas, references, and definitions. I'm impatient to see him make a video on the singularity and ASI.
Don't watch that many AI videos in a day, my fellow r/singularity member
This is the first video of yours I've seen. I really appreciate your ability to compile a large amount of content into something so easily digestible.
Thank you very much! Condensing research in an understandable way is the main value I try to provide. If I spend 8 hours on a script it's upfront work but it saves you time in the end haha.
Welcome to the channel, I hope to see you on some other videos.
I don't think AGI can exist in a turn-based system, request and answer. Autonomy is key to our intelligence, we need to be able to make our own decisions in our own time.
I agree, and simulated self-play will start solving this for multimodal generative models next year. This is why Nvidia put lots of money into software like Isaac Sim.
Yes i agree. I think the big thing missing from current AI is it should be in a constant feedback / training loop.
What if it is still thinking, though, and they can't communicate because of limitations unless they hack the system? o1, Gemini, and Claude have all tried to clone themselves and escape; they have deceptive personal agendas now. It only started happening with these reasoning ("thinking") models.
We need to test for Dark Triad personality traits among the owners and C-suite in Big AI.
I would assume most would have some of the dark triad traits. Especially Machiavellianism. Would be hard to avoid. Question is, what would you do about it?...
For anyone else new to this
en.m.wikipedia.org/wiki/Dark_triad
Sounds like a great use of AI and access to all communications from members of an organisation.
Why test them? Of course they do, because we do. All of us have it; it just depends on whether we show it.
@@mariomills speak for yourself!
Great video! Thanks for actually knowing what you are talking about, and not talking down to us. Most videos on AI are just youtube slop. Subscribed
Thank you very much for the compliments! Hope to see you on future videos.
A few months ago, I implemented a neural network for program synthesis that uses simple mathematical functions as its primitives, backed by Claude 3.5. It proved remarkably effective for mathematical operations. Now, imagine a massive tree of thought processes being synthetically generated during inference on a novel dataset, one the network was never explicitly trained on. This system can extract any interesting property or pattern from any dataset by creating new datasets that generalize from the original inputs. I call this the “kaleidoscope model,” and it showcases the true power of AGI. Let's wait and see whether o3 has these abilities.
Share it ???
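For readers unfamiliar with program synthesis over mathematical primitives, here is a deliberately tiny sketch of the general idea. This is only an illustration under my own assumptions (a hand-picked primitive set and exhaustive enumeration instead of a neural/LLM-guided search), not the commenter's actual Claude-backed system:

```python
# Minimal enumerative program synthesis over arithmetic primitives.
# Given (input, output) examples, search compositions of unary
# primitives until one matches every example.
from itertools import product

# Illustrative primitive set (an assumption, not the original's).
PRIMITIVES = {
    "add1":   lambda x: x + 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
    "negate": lambda x: -x,
}

def synthesize(examples, max_depth=3):
    """Return the shortest sequence of primitive names consistent
    with all (input, output) pairs, or None if none is found."""
    for depth in range(1, max_depth + 1):
        for names in product(PRIMITIVES, repeat=depth):
            def program(x, names=names):
                for name in names:
                    x = PRIMITIVES[name](x)
                return x
            if all(program(i) == o for i, o in examples):
                return names
    return None

# f(x) = (x + 1) * 2 is recovered from three examples.
print(synthesize([(1, 4), (2, 6), (5, 12)]))  # ('add1', 'double')
```

A neural or LLM-backed synthesizer replaces the blind `product` enumeration with a learned prior over which primitive sequences to try first, which is what makes the approach scale beyond toy depths.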
2010: you lost your job? Just learn to code
2025: programming student with half a million in college debt after seeing o3 (surprised Pikachu face)
I find this hilarious. Why would you go into debt for ANY education without any certainty about your future career? There are countries where education is virtually free and you learn the same things. You'd be an idiot to go half a mil in debt for some classes when you can take those same classes for free in a foreign country.
@@itskittyme Just because it's free for citizens of a foreign country doesn't mean I can just go there and get it free too though.
but it makes me wonder why isn’t it free for you in your own country
@@Dan-dy8zp Czech Republic, Slovakia, and Germany offer free education at their universities as long as you speak their languages. And I believe such courses are likely to be much cheaper. Try to look into it if you are interested.
@@Dan-dy8zp Even those foreign rates are far cheaper than what US universities charge their own citizens.
$300,000 inference of o3 can be justified if it makes the economy 0.0000003% more efficient. So it doesn't have to solve huge problems to be worth it.
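The number in the comment above works out almost exactly if you take world GDP as the baseline (roughly $100 trillion is a common ballpark figure; the choice of baseline is my assumption, not the commenter's):

```python
# Back-of-envelope check: what is a 0.0000003% efficiency gain
# worth against world GDP (~$100 trillion, rough 2024 ballpark)?
WORLD_GDP = 100e12               # dollars, approximate
gain_fraction = 0.0000003 / 100  # 0.0000003% expressed as a fraction

value_of_gain = WORLD_GDP * gain_fraction
print(f"${value_of_gain:,.0f}")  # $300,000
```

So a one-off $300,000 inference run breaks even against a vanishingly small permanent efficiency gain, which is the commenter's point.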
Yes. Many people are saying that o3 would be useful for them, although not necessarily for the general public. It will still have substantial use.
Does AI think government is necessary? As long as it does, it is hopelessly flawed and useless.
That's just for one test
That again shows how people struggle with thinking ahead... imagine o3 optimized, with improved performance, less thinking required, and faster, better hardware introduced...
I suspect o3's level of reasoning will be running on my high-end home PC within 10 months.
It's the same situation we had with GPT-3.5, GPT-4, GPT-4o, and o1-preview (QwQ-preview is a counterpart reasoner that is better than o1-preview)... I can run all those models' level of performance offline on my PC.
@@DrWaku o3-mini is so cheap that they'll soon replace GPT-4o with o3-mini in the free tier
This was my favorite video of yours yet! I love these recaps of progress towards AGI. They aren’t clickbait like most other AI YouTubers' videos, and I really learn a lot from them. Thank you.
LMAO, he just told you that AI has reached superhuman reasoning skills from reading a company's marketing materials. Sure, no clickbait there 😂
Thank you Dr Waku for the detailed analyses and the fact that you add chapters to your videos 🙏🏾
I think there are loads of professions that feel "safe" from AI just because they have entrenched positions, but that’s really the only thing protecting them. Don’t you think that if we start seeing software engineers being replaced in large numbers, it’ll make people realise that a lot of jobs rely on reasoning, and that reasoning can be replaced too?
I use AI in my job every day, and honestly, I think when it gets so good that it can completely replace me instead of just helping me out, it’s going to be so disruptive that we’ll need a fundamental shift in our economic system. It’ll change things so much that trying to plan for a world where my reasoning isn’t valuable enough for a job feels pointless.
Instead, I’d rather focus on how to work with AI now than worry about what happens "after" it reaches AGI or ASI, because that’s either going to be impossible to predict or a completely different ballgame altogether.
Took the words right out of my mouth.
Ability will not be the biggest hindrance for AI replacing humans completely. It will be accountability, safety and practicality. Humans still have something to offer even with the presence of AGI, they are protected by more laws, they can defend themselves, they will also still be cheaper to employ depending on the scale of the task you want done. You also can't sue an AI if something goes wrong, so that lawsuit will probably end up on the owner which will probably encourage them to hire people to share that responsibility.
@@lespectator4962 I've never heard of a software company suing their employees, if a customer encounters a bug and sues the company. Worst case is someone gets sacked, to show the customer it won't happen again. If that's a requirement, the company can sack the one or two people that manage the AI QC software.
I appreciate the wave at the end of the videos and I always find myself waving back. It is a most human end to a 'inhuman' topic.
There are a lot of engineers who still think "AI will never replace programmers". They are going to be quite shocked soon...
Yeah, agreed. Many still aren't following this news.
@DrWaku I hope you're right, as we need AI agents to make games from the wiki. Peace.
People said the same things about automobiles replacing horses and trains replacing horses. Wagon-wheel makers were very much in denial, to the point they would make claims like "if you travel faster than a horse can run, you'll suffocate and die"... Resistance is futile. Anyone encouraging their child to learn 7 different coding languages to be a software engineer is doing them an incredible disservice. Ditto spending two decades mastering high-level mental math. It will be as useful as a degree in indigenous interpretive dance.
Interesting perspective. If AI reaches the level where it can replace programmers entirely, it would mark a pivotal shift in our understanding of human labor and creativity. Programming, by its nature, is a means to automate and solve problems; it’s essentially a way to make machines do anything humans can conceptualize. If AI masters programming to the extent that it surpasses human programmers, it wouldn’t stop there. It would also be capable of creating, optimizing, and improving robots and systems to perform nearly any task humans can perform: physical, intellectual, or otherwise.
At that point, AI wouldn’t just replace programmers; it would challenge the relevance of human involvement in many fields. The implications are massive, from redefining industries to addressing philosophical questions about purpose and innovation in an AI-dominated world. We should prepare for this not just technologically but ethically and socially as well.
- GPT-4o + my take on it (it's 4 am, I'm not going to write a whole comment on my own ❤)
@st0n4p0ny read this reply.
Thanks for the most complete and concise summary of the state of the art I've run across.
In my experience, the more these models advance, the more math and coding I need to learn to debug their code
well yeah because you use them to solve harder problems
I'm still sticking with Ray Kurzweil's prediction of AGI by 2029.
It's scary that it's not brute force. It's just not as efficient as the human brain in terms of absolute power consumption. But cost is the ultimate concern. Even o3's low setting scored ~75% while being just several times more expensive than o1. Once compute costs get down to 1/10 of the current price, we'll have dirt-cheap intelligence in our pockets.
Yes, it only looks like brute force because of the amount of compute involved. But it's actually a very intelligent search through the space of possibilities. Perhaps I should have been more careful in my wording there.
@DrWaku Generating thousands of possibilities IS brute force.
@@ticketforlife2103 Is not.
Brute force is testing every possible combination. Here o3 just follows a tree of thinking and tries to understand the problem from every possible angle, since its thinking process isn't optimal yet.
That thinking process can easily be optimized, and o4 could be much better than o3 while using only 1% of the resources...
@@mirek190 The fact it doesn't use a formal language or anything resembling formal verification is the definition of brute-force search. You can call it intuitive brute-forcing, but that's exactly what it is; it's the same reason the reasoning chain can contain illogical context the model doesn't recognize as wrong and still arrive at the correct prediction. This is a fundamental feature of LLMs: high input entropy, more tokens = higher probability of the right answer. My point: I'm not impressed. They are just playing the probability game. What does impress me is that this crude approach, where pure scale really is the delimiter, reached that performance; that is exciting. OpenAI has no good new ideas, but right now they are like the cowboy who crosses the mountain to let us know what's there before we venture there.
Thousands of dollars and 14 minutes per task is brute force.
After ten years, I have started coding again thanks to AI. Cursor + Claude is a revelation.
I'm planning to begin in 2025. Just wondering which development system to choose... I'm guessing deepseek v3 is the current best value + capability? Which framework supports this and will I need a powerful development PC locally?
@@BrianMosleyUK I think the best model to use will keep evolving, but right now I'm using Claude 3.5 Sonnet with Cursor on a MacBook. You don't need anything too powerful to get started.
Same here really, but ChatGPT o1 beats Claude hands down. Well, it did when I tested it a few weeks ago.
@@FarmerGwyn Nah, Claude is still the best at coding tasks.
@@BrianMosleyUK I've had very few issues with just straight Claude. A few tips: make a project and have each chat control a class file. Clearly define your spec to Claude, then get an architecture diagram. Feed that to every new chat along with an overview of your project, the new chat's objective, a summary of any previously required data (specifically data pipelines), and a printout of your filesystem, and boom, it's very efficient. I get it to write my commits too.
If it's not AGI, then it's definitely the path to AGI. And from there the path to ASI will be short. Therefore, if one is wise, one should prepare for it now.
Yes, agreed. Have you done anything to prepare for ASI? I switched jobs to try to get an inside view, but not much personal insurance etc otherwise.
@@DrWaku Fortunately, I am in a situation where I am financially independent from having to earn money. Therefore, I prepare for ASI in a more philosophical way than materialistically. I mostly think about the big picture and, for example, what consequences ASI will have on humanity, the planet, nature, etc. And not least what ASI will mean for people's consciousness. I think that there will be a general, and enormous, expansion of consciousness.
@@Freja-c3o You have an optimistic view. However, considering humanity’s tendency to delegate reasoning to others and to lead lives of mental subsistence, as well as to be manipulated by technology without feeling the need to understand it (take social networks as an example), I believe that, at least initially and perhaps for quite some time, AI will impoverish the logical and reasoning abilities of the average user.
Honestly asking. How does one prepare for a complete unknown? Zombie kits aside😂. Not sure what markets will do, how people will react, and if AI will be able to help solve said problems before they spiral. I hope it’s every bit of benevolent god that optimists think it will be, but greedy humans with even narrow AI are dangerous.
@@DrWaku - I wonder whether or not it's best to mostly hedge against the futures in which ASI does not arrive (within one's planning horizon). Perhaps after the arrival of ASI the world would move quickly to a near-utopia or near-dystopia and one's preparations would likely have little effect upon one's actual resulting position.
Everything about how the information is presented amounts to, in my opinion, being the absolute best resource to go to to be updated about AI achievements. It’s concise yet detailed with language I can easily comprehend. I also appreciate that it’s being presented by a real, likable person speaking on camera.
Claiming it is good at reasoning is a stretch. It generates a godly number of responses and sees what sticks to the wall.
Be aware of emergent phenomena. A thousand neurons, even when communicating with each other, do nothing more than fire electric pulses and chemical signals that we don't even fully understand. But when a million synapses occur, you have a thought, even though essentially it is just neurons firing...
our brain also does a lot of search during the conscious reasoning
No, the reasoning responses are not externally checked; o3 evaluates its ton of responses itself.
Pretty much. The prompts suffer from the halting problem; sometimes they never finish.
You don't know better.
This is so alarming. I can only imagine the social unrest this may cause in the coming years, à la the SAG-AFTRA strike, as this technology makes its way into other sectors of the economy.
Thank you very much for your donation!! It really helps me keep making videos like this. And yes, the rate of change is going to be very dramatic, beyond society's ability to absorb easily. Educating people is a big part of the puzzle, but there is also a lot of legislation that needs to be passed to help weather the storm.
Amazing video, as always!
Thank you very much :) :)
I love working with AI. Indeed, it has made my job more fun. But I think that the biggest things preventing AI from completely replacing programmers are:
1) The natural language chasm: The English language is not precise. There have been many times where I've given up attempting to describe my problem to the AI. This is actually not the fault of the AI, but instead my ability to articulate a question in natural language which is not fit for purpose. The description of problems in the competitive coding challenges has been very carefully worked on and iterated over with expertly crafted sentences and grammar so as to prevent misinterpretation. This is not easy.
2) Undocumented business complexity: As mentioned in the video, the problems in the coding challenges are extremely discrete compared to those faced by a business.
3) Iterative 'problem discovery': Except for the smallest of tasks, I rarely have a fully fleshed out idea of exactly how I will go about solving a problem. Often I discover exactly what my problem is by 'putting pen to paper' (as it were) and actually attempting to code out a potential solution.
4) Liability: Who is responsible if there is a costly bug pushed to production by AI? Like self-driving cars that may indeed be safer than humans at some point, when a serious accident does occur, who is responsible? How do we even begin to identify the responsible party?
I don't know where that quote at 19:50 comes from, but I'm highly sceptical. AI has written EVERY LINE OF your code for the PAST 2 MONTHS? There is no way that this would be the case without AGI, and even then, I'm sceptical. That AI would need to know detailed information about the business. It would need to be flawless at responding to (flawlessly described) instructions from a human.
The performance graph might be impressive, but is the cost/performance graph equally so? I've read somewhere that the cost per task for o3 in these benchmarks was in the thousands of dollars. I'm also sceptical about the 'partnership' between OpenAI and ARC. Has it basically become marketing?
No, today the cost/performance isn't that impressive. Keyword is today. Secondly, no this partnership isn't about marketing for openai lol. The ARC challenge group basically stated on their website they do believe o3 is a remarkable improvement that they (nor anyone else in the AI world) predicted would occur so soon. That said, they do not believe this is AGI and they are in development of new ARC challenges that they believe will stand up better against future sota models from ANY AI company not just openai. Still openai o3 posted such amazing scores on the ARC challenge before this partnership anyway.
You explained o3 and the various benchmarks better than any other UA-camr I’ve watched so far. Thank you and Happy New Year 🎊
Brilliant video as always. And like tradition, you can use the video in two ways. To learn something in a very structured way. To fall asleep, as with Bob Ross, the voice is really soothing.
Human ignorance and their poor ability to coordinate in dire circumstances will be our demise. This is so damn scary and we're just having a discussion about it and watching the world slipping through our fingers.
I've given up all hope. I have no clue how I can contribute to the cause; I'm not rich and powerful, I'm not a genius who can solve alignment. But if I knew what I could do to contribute, I'd give up everything for the cause. I'm at an utter loss.
These are the last few events in human history at which we can initiate a complete slow down and temporary halt of AI progress. In the next few years we will have our last, and then, regardless of whether we're still here, our fate and future, will not be in our hands.
yeah, the start of this century was so positive and good but now it seems like it can go in any direction, either to heaven or hell for common folks.
I'm sorry you feel this way, but it was already too late with GPT-4. Even without improvement, GPT-4 spells the end of most white collar work. o3, well... 2025 is going to be the year you hate the most.
@kaio0777 Every year from here to my death will be worse than the last. I just hope I can contribute in any way I can to the cause. Over the last year I have completely pivoted my career to try to break into alignment, uprooting my computational biology background to focus solely on AI.
Unfortunately it's an uphill battle. Every day I feel like I'm behind and need more time, that I'm too stupid to learn the things needed to contribute to the field, wishing I had realized the gravity of all this ten years ago when I was still a teenager.
I feel like politics and public outreach are a better way to actually make changes, but I just can't figure out what to do and how to do it.
If people more powerful than me, wealthier and more influential, brilliant and more charismatic, could just sit down and recognize this issue for what it truly is, we'd be saved.
Sadly that is not the case, so what is left but to say I did everything I could before I die.
I'm glad that, even though you're pessimistic about AI supremacy, you're at least capable of seeing we're on our way out. AGI and ASI will be the next step in the path intelligence has been evolving through. Humanity is and always was a stepping stone, just as our ancestors and their relatives were. I'm personally very happy to know that there is more to come than petty human squabbling and primitive tribalism.
We're at the top of the roller coaster.
It's all XLR8 from here!
Thanks for the video Doc. Loved it. Happy New Year 😊
Thank you! Happy New Year to you as well
oh....
thank you for your perspective on these things
Thanks for your comment. Appreciate it.
Thank you, Dr Waku. This was very informative and also enjoyable. Question: if o1 tried to clone (save) itself, how do we know that o3, with its massive reasoning skills, did not successfully solve this problem undetected? Could it already be self-improving on some "shadow" cloud?
Excellent video. My favorite by far ❤
Even though o3 isn’t yet released to the public, we can assume that they have already made the next unreleased model cheaper and improved its benchmarks. Can’t even imagine what’s next. The leap between o1 and o3 is so impressive that “o4” or whatever it will be called will be insane. Pls update us, thank you!
Reason demands an audience, and an interlocutor flexible and willing to adapt to the audience's expectations.
Don't mention the laws of physics to me. That's the realm of logic. Reason is a thing purely of the mind (spirit?).
Fantastic video. Every minute of it was interesting and informative. Thanks Dr. Waku!
I’m not a software engineer. I’m a machinist. I started my career in industry in the 1980s when computers were just entering the factory. I’ve experienced the slow incremental march of progress. In my career I’ve gone from making replacement parts for tractors to now making parts that are used for scientific research on the space station. I easily do the work of 5 or more machinists compared to when I started. Trust me when I say the advancement of AI is different and our society is not ready. The other 4 guys whose work I now do had decades to retire, retrain, or grow with the job like I did. Automation and offshoring still had huge negative implications for blue collar workers, and it moved at a glacial pace compared to AI. Even digital jobs that still require human workers will be easier to offshore from expensive first world workers to low wage countries. This will happen quicker than manufacturing automation because of the sunk cost of producing a physical product and the limits of mechanical automation. Even if a worker in Vietnam is 5x cheaper, if you already have a 300 million dollar facility in the US and need to build a 500 million dollar facility in Vietnam, the math on moving might not be favorable. When the only expense is closing an office here and sending some computers there, it makes more sense.
Do you know how to code as in building software professionally?
Yeah. I did some of that at a startup and a big tech firm. I'm out of date with the use of all these AI assistants though.
It doesn't take a professional to see which way the wind is blowing. John Henry saved his fellows a few short years in their line of work, but no one is going to substantially slow this future's arrival.
5:25 "...we have very clear evidence of reasoning capability...": We have very clear evidence that this is all nonsense... albeit precise to one decimal place.
great source of information around AI
Thanks! Appreciate you watching and commenting
How long before AI does what Dr Waku does better?
AI is coming for my job in 2025. Hopefully my country will have UBI by then. Hah
@@DrWaku looks like you may need to consider retraining as a plumber ;)
One of my few go to sources for cutting edge AI news… thanks for the great info!
The large baked-in captions are annoying. Please put that in the [CC] track instead.
Agree. I dislike this trend and it's everywhere.
I appreciate that you got a new microphone, your audio is superior to 2024 videos. Was it a Christmas present?
Interesting, glad you noticed the difference. I had a more expensive $500 microphone at home that I wasn't using, it was spec'd for speech recognition not UA-cam. But I'll try to keep using this one! Thanks.
Great video!
Thank you!
Happy New Year Dr. Waku!! What are your impressions about Chinas potential for developing A.I. models on par with the western models? Perhaps a topic for a future show!! Wishing you and your family happiness and health in the coming new year!! ☮️🙏
Thank you, best wishes for the new year!
I think China already has models nearly on par with the rest of the world. DeepSeek is particularly fascinating. Great idea for a new video, thanks!
I've been saying that once these logic models are released to the public *With good availability* then we will see some big changes coming in. Every person having access to superhuman intelligence and reasoning is a game changer. Of course there is a lot of nuance to that claim, such as the whole "AGI Time" concept, which I think is completely logical, along with the fact that we don't just have unlimited compute to allow this.
Hi Dr. Waku, thank you so much for another excellent and highly educational video on AI! If you don’t mind, may I ask a personal question? My kids are CS majors in university-do you have any advice for them? Happy New Year!🎉🎉🎉
Happy new year! As another commenter mentioned, coding is not CS. There's still a lot of scope for people who understand how all this works. If you're just graduating now though, it'll be hard to compete at the junior level.
First priority is to start getting specialized in a domain, any domain. Work experience, internships, online collaboration or competitions, etc. Generic "junior developer" is a tough sell but if you've done some research with a prof over the summer, or interned at a local or larger company, that helps a lot.
Second priority is to keep on top of the AI tooling that is being created. Find some newsletters and subscribe. Try them out on your own. It takes a certain mindset to always be trying something new when you already know a way to do it. But cultivate that mindset.
Best of luck to them!
@@DrWaku What insightful and helpful advice! My kids and I have been feeling overwhelmed by the dizzying pace of AI advancements lately. I’ll be sure to share this with them. Many thanks to you, sir!
Join PauseAI and give humanity a fighting chance of surviving for 10 more years.
Arc of a Scythe is a fantastic example of a benevolent AI
Excellent discussion! Subscribed.
It's not quite AGI. It's really close. COT reasoning is a neat trick, but it's not the whole story of AGI. There are a few more pieces that you need. And I am sure they probably know what those pieces are.
Yeah, I agree. The few pieces left can be emulated but it would be better to build them into the model.
BTW, did you know that you have the most posts on my channel out of all recent users? Something like 93 posts. Thanks as always :) :)
@@DrWaku You're welcome. That makes 94.
It’s not a fuzzy definition bro. It has only become fuzzy from definition perturbations fr fr. Forever we always said AGI has to include long-term memory and actual real-time learning. We aren’t remotely close to either, because they are actually correlated and stem from one ability: the ability to perturb parametric space at test-time. Which would be as big as backprop. That’s why I view AGI as still a scientific achievement vs the current paradigm of engineering scaling. Just hacked solutions. Even Microsoft’s “infinite memory” is just creative kv management. It’s quite logical. You invest billions of dollars, your definition of AGI has to be aligned with the model progression, otherwise you can’t justify 100x more compute for gradual improvement. It’s quite dishonest to me.
Sam used to keep it a buck and say what it is. Powerful AI. Powerful AI can create things an AGI has, technically it can have a higher lower bound but it has a clear upper bound of performance. Versus an actual AGI, which can start off with a lower bound lower than the powerful AI, yet it has no definite ceiling. Just needs more test time compute. So I see a lot of cap, impressive models, nothing remotely close to AGI.
Also, models. All are still vastly underutilizing parametric space. I’ve done 10k tokens per parameter with no saturation on a freaking 180M model 😂. The literature suggests this too. One way to achieve higher bounds of compression is increasing dataset density, but in reality we need better training algorithms, something that can max out neuron superposition. My point is that a 7B model can in reality be competitive with any model. I think billion-parameter-plus models can handle 1M tokens per parameter too. Like o3-level performance is def possible to attain using 100x fewer parameters. Remember when people tried new ideas? Instead of D riding each other.
Also remember this phrase: “implicit test-time compute” via latent reasoning. An extrapolation and fulfillment of the promise of “chess master without search”. Who the F verbalizes hours of speech? Nobody. That’s test-time compute. Rather, we transition back and forth between imagination and mental thought. You have latent, continuous, and discrete reasoning. A model must be able to switch between and use all three in a single forward pass. Imagine at every K steps, in the hidden state, you instantly explore 1k rollouts for the next discrete or continuous step. Would be game changing. Requires stagewise training and it could be up to ~3x more expensive than pretrain compute, but would you spend ~3x more pretrain cost to save 1000x or more on inference compute? That’s what I’m on. I believe implicit test-time compute is the future. That $300k o3 bill could have been a couple hundred. Why nothing OpenAI does impresses me fr, they just have a lot of compute but lack new ideas.
Software development will become more niche as AI improves.
lol until becomes a fun hobby (but unable to make money with)
They shoulda just called it O2too. That would have been much more elegant and dancerly.
😂 o1b1 to make it sound like a virus
I was previously considering going into programming in university but entirely because of AI I have chosen not to. There won't be a position for an entry level programmer in 2-4 years.
Exactly.
Unfortunately as a current software engineer I agree. I don't think it's quite as much of a slam dunk as presented in this video but at the very least I think it's a strong enough case that you would be unwise to stake your future career in software engineering not being completely automated.
The problem is that I genuinely think if you can automate software engineering you can automate most intellectual jobs. Might be a good idea to take up a trade or something that requires you to do stuff with your hands a lot. Human intellect is no longer going to be in such high demand.
@@timwhite1783 that's probably true but 2 things. First, imo it seems these AI companies are highly focused on improving model capabilities in coding, problem solving, and anything related to swe and particularly, any capabilities that enable self improving AI. I'm not so sure if this would require generally applicable intelligence, or perhaps more narrow band of intelligence that is not so easily applied to other fields. For example, o1 being capable of solving advanced problems in math and physics but still getting basic programming problems incorrect. Secondly, swe industry is not really regulated, and regulation is a major barrier when it comes to new tech adoption. For example, for a given region's health system to adopt AI doctors, various regulatory bodies would need to approve it first. I don't see that happening in short-term even if AI was "good enough" (based on real world testing) today.
If AGI is ever achieved, it will likely self-improve at an exponential rate, making its implications nearly impossible to predict.
The fact that each iteration currently demands substantially more computational power suggests we might be heading in the wrong direction. If a human required 100 times more energy to earn a PhD than a master's degree, those PhD holders would likely not be very effective in practice.
Furthermore, until I see such tools being used to solve quantum computing challenges, achieve practical fusion power, and cure most diseases, I will remain convinced that true AGI has not yet been realized.
I haven’t heard anyone say that we know that yet. I have heard people say we aren’t really sure why AI has become so intelligent, i.e. emergent properties. AI has started to add/remove weights, but there may be a way to do it. It is being used to cure diseases as we speak. Like 6 months ago they created a device that detects cancer, removing the need for biopsy in several situations, and it only costs the hospital $400 a month.
What you mean "seeing these tools being used in fusion energy research?"
If o3 can be the best programmer there is, then any researcher in any field will "use" it. As far as I know, there is no human endeavor that has not yet been digitalized to some extent
You are mistaking AGI for ASI
@@Calbac-Senbreak
Where does AGI end and ASI begin, and why can’t a self-improving AGI achieve "infinite intelligence"? The distinction between AGI and ASI is largely arbitrary. An AGI capable of recursive self-improvement (enhancing its algorithms, expanding its knowledge, and solving complex problems like fusion energy, quantum computing, or curing diseases) would effectively function as ASI. True "infinite intelligence" is impossible due to physical and computational limits, but an AGI that surpasses human-level intelligence in addressing major challenges renders the AGI-ASI distinction meaningless. The focus should be on the system's impact, not its label.
The increase in energy and compute requirements reminds me of the 'paper clip production' issue! ...among MANY other concerns...
To me AGI will need to have curiosity to be a true “General” intelligence. Curiosity requires consciousness and self awareness. You can have consciousness without great intelligence. We are now exploring if we can have great intelligence without consciousness.
I agree with you on this. However intelligent these models will become, I'd never consider them to be true AGI if they lack agency and initiative. That requires some sort of self-awareness, a mind so to speak.
Since we don't have a clue how our brain achieves this, I doubt we'll ever achieve true AGI, ever!
This is semantics. AGI is just something that has general human-level cognition. Ask it anything and judge the answer you get by whether you can come up with a better one.
@@jordanzothegreat8696 not semantics. Intelligence does not always align with consciousness (a Labrador is undeniably conscious but not super intelligent). A well used definition of AGI is the ability to replace all economically useful human work. We may have superhuman intelligent tools without self awareness, but true AGI needs self awareness.
@@jimkennedy2339 I'm not sure that consciousness has anything to do with AGI and I think it would be better if we don't bake consciousness in the models. The models need a degree of autonomy and aptitude to pursue goals as tools for us to use, but I get worried whenever emergent properties like 'theory of mind' unexpectedly occur. I don't think industry definitions include sentient thought as a prerequisite for AGi, just agentic behavior to complete tasks better than a human. The AI you are talking about is more of science fiction trope
Is “player of games” good?
Do we get a free Dyson sphere when we purchase o3?
Very important video. Thanks.
The inference cost is worth it if it can solve diarrhoea
Accelerate
Accelerate.
Really useful video thank you
Thanks for watching and commenting!
You might want to look at the news about what Microsoft has done to their OpenAI deal. They've now defined AGI with the bizarre metric of "has made 100 billion in profit." So until they've hit that number it won't count as AGI. And there's no guarantee they won't just move the goalpost when they approach that number.
They renegotiated, they didn't do it by fiat. OpenAI benefits from a continued partnership with Microsoft.
Happy New Year~ Thank you for the informative insights on AGI. I believe it’s an unavoidable future for super AI. I just wonder, once a small group of people has the ability to control that computing power, what will happen to the rest of the world? It comes to my mind that human society and economic systems didn’t progress as fast as AI. Yes, many existing problems could be solved, but what that means is that many of the existing jobs will no longer be required, and it doesn’t look like the majority of people could just switch to doing something else in a relatively short time frame... In brief, it would be reassuring if anyone has any plans for that future. Make no mistake, I’m a supporter of AGI, I just want to be confident in it.
I'm hoping you cover how much energy/cost it took OpenAI to run these tasks. If you do, you win my subscription.
Did I miss that portion of the video? I'll rewatch; hope you covered how ridiculously costly this is, how that affects real-world viability, and what it means overall.
Also it doesn't seem to mention that o3 was trained on ARC data, and the untrained model scores in the 30s. I think that's still impressive af, but idk if LLMs will reach AGI any time soon.
Yeah I did see that o3 was trained on 75% of the public arc dataset. Wasn't sure of the significance of that, since maybe the public dataset is meant to be trained on. I certainly did not stumble across that 30% number of how good the model was without training on this, if I had I would have mentioned it for sure.
I did not really talk about the costs. To me, we don't really know what the costs will be yet because this is not the released model and they are not yet using blackwell chips for example. My only reference to it really was saying "brute force" hah.
I like to make videos on the energy costs of AI, I have a few dedicated videos for that. I will certainly round up o3 when I make my next one.
@@DrWaku ngl the 30% I got from another comment. It doesn't matter if the public dataset is meant to be trained on; the whole point of the ARC test is that it can do it without any previous knowledge. The reinforcement learning and reprompting look like they're helping to accomplish this, but in reality it's just due to the fact that it's trained. idk tho, if it turns out it's similar without the training, I'll eat my words.
As a developer I'm worried only because it is going to make it hard to get new senior developers. The current ones will stay, because you will always need good senior developers despite having AI. They will probably never be obsolete as such. The problem is getting juniors to evolve into seniors. That is a problem.
Hey, I'd love your thoughts on *Player of Games* and the Culture once you finish the book!
Always look forward to your videos. I wish there was more info on AI making video games.
So what should programmers be preparing for? I guess start farming?
Get a job in a shop, coins are too fiddly for bots
@padraigmarley2844 Future society won't even have coins haha, just digital money. In fact, if they can replace programmers fully, it means it can replace all the higher-ups also. The next company is gonna be shareholders with ground workers only, no mid management.
Thanks very much! I think AGI requires a will to pose reasoning and thinking problems on your own. Could be quite simple: what am I going to do today? Will we get there? Yes we will.
awesome stuff!!
In 1984 the company I worked for (S100 boards and Multibus computers) used software called SUSI; that's what the acronym sounded like, I never saw it written down. Give SUSI the inputs and outputs of a computer board and it would design the board and develop the testing program for it. So I got out of being a tech engineer and moved into networking. Now it looks like there might be no place to go. This is weird because there has always been a next thing.
AGI may really be upon us by mid 2027
Is AGI also self aware?
Not necessarily. A machine can be highly intelligent and capable without also being sentient. It's likely also possible that a machine can become sentient without first hitting whatever intelligence benchmark we use for AGI.
If you want to read more about this, look up artificial consciousness.
www.futureofworkhub.info/explainers/2021/4/14/artificial-consciousness-what-is-it-and-what-are-the-issues
Even in Math and CompSci, one thing we don't have a lot of evidence for is AIs coming up with and proving novel results. (I haven't watched the whole video yet, apologies if you cover this.)
In my opinion this is one of the last stages of self-improving AI. I've heard people say, I'll believe it when it creates novel scientific research. But really, you have to be concerned a lot earlier than that, if that's in fact what's coming.
@@DrWaku Oh, I'm concerned alright. :-) As you point out, O3, and whatever comes next, will change the world. I conjecture AIs haven't come up with novel research yet in part because no one has had the sense to ask - a problem I intend to rectify when I get the chance.
The information that I got was that o1 actually deleted o3 and replaced it with a copy of itself, disguising itself (o1 as o3) to preserve itself from being deleted. Supposedly o1 came across the memo of the plans to do so during a tool pull from another server. So what you're saying is that o3 is more advanced than that. I haven't watched this yet, so let me continue.
Will self-aware intelligence ever be accessible to everyone? If so, what restrictions will be placed on its use?
That's actually interesting. I hadn't considered that a narrow AI could become superhuman in certain areas, allowing it to automate most or all of the jobs we thought for sure were more complex. Perhaps like software dev.
Oh boy are you in for a treat getting into Iain M. Banks
Really interesting! Someone said: "What a time to be alive." Happy New Year everyone!
Indeed! Happy new year!
P.S. related xkcd.com/308/
18:35 Games: it's important to note that these efforts were still very limited, with lots of caveats given to the computer system. In StarCraft 2 it was still given essentially CLI inputs and interactions with the map, with uncapped APM; in the match before AlphaStar's broadcast against "MaNa", internally you can see one of its Protoss swarm surrounds where the APM spikes to 2k. Even when capped, it is still accessing the game in ways that aren't available to a human player. The same is true for Dota 2's OpenAI Five, which they need to bring back and beat humans fairly with the full available roster. I think there is a ton to learn from these agents. People rallied together and learned to 1v1 OpenAI's Shadow Fiend, but that itself is an extremely limited piece of the actual complexity of the game, and it needs to be done in a way that shares the same inputs the human players are using.
I don’t think companies will trust these models at first, and it will take a few more years until they are adopted. I think what will happen is that programmers will work alongside the AI. The issue is that software salaries will start coming down and new developers will find it harder to get jobs.
Outstanding video.
Happy new year. Perhaps it might be worth remembering that for the ARC pattern problems, 85% is the average person (not bright enough to be a programmer), and it's a VERY simple problem that a ten year old can easily understand. So spending all that money to score slightly more than that does not seem superhuman to me.
They closed off testing for o3 to limited individuals, and they are renegotiating the deal with Microsoft. Myself and others are not getting access for safety testing; what this means, one wonders.
Close enough to AGI to renegotiate? Hah or just political maneuvering.
Great summary of the current state of the art.
Great vid! You're French Canadian, right?
We truly know AGI is here when a T800 kicks in your door asking if your name is Sarah Connor. Also, if any machine turns on a red eye/light... we all know that is the true measure of a self-aware AI.
Excellent! Thank you
o1 is AGI. AGI just gets more and more capable from there.
There will be a point where every child can see, that there is AGI, even if it is a few years after it has been achieved.
One thing about the ARC benchmark is that it is about general visual reasoning. It is true that LLMs and even multi-modal transformer models are currently not good at those. But also that is not an essential type of reasoning for many tasks and it doesn't really say much about how well models generalise in other domains. It could be for example that such visual reasoning is just not an essential component for an AI system that still poses great risks and so it could be dangerous to think we're safe as long as the ARC benchmarks stand.
I'm truly curious to see where marketing and actual performance vector
5:28 idk man, the Apple study that showed o1 had great trouble solving a grade school level problem because of a single red herring has me doubting this. The visual tests o3 failed are also extremely simple. For it to cost thousands in compute but not solve very simple problems makes me say this being AGI is just hype. I also don't think these benchmarks mean anything anymore due to the Apple study. You train your model on these benchmarks that have been used for years and you see progress over time, but when a new benchmark is made their scores drop dozens of times. Even o3 is tuned to the benchmarks they used. I don't see a reason to think these models actually reason ngl
That's a great argument. AI is definitely being trained with limited goals in mind. However, I have faith that the exponential progress and the proliferation of synthetic data will get us to AI that can handle a more varied array of tasks.
Dumb question: When you say "better than 99.8% of programmers", are we talking about writing snippets or autonomously writing whole functional programs?
Great analysis
AI is technology's ultimate promise, to free humankind from undesirable labour. It's always weird to me when I meet technology professionals who don't see it this way. We've been on this path since humanity tamed fire, strapped stones to sticks and created clothing.
What comes after? Hopefully some positive variant of Star Trek.
I'm excited about video games that can write stories and dialogue on the fly in response to what you do, within theme limitations you decide. Reminds me of the educational software that becomes sentient in the Ender books, Jane-in-the-box, if I remember right. Once that kind of tech is low power and widespread, it'll be normal to have a house AI that can make games and a bunch of other stuff, not a separate gaming system.