OpenAI Just Revealed They ACHIEVED AGI (OpenAI o3 Explained)

  • Published Jan 4, 2025

COMMENTS •

  • @TheAiGrid
    @TheAiGrid  15 days ago +86

    00:00 AGI milestone announcement
    00:36 Arc benchmark explained
    01:46 Visual examples
    03:21 Benchmark performance
    04:25 Expert reactions
    05:55 Earlier predictions
    06:57 Compute limitations
    07:54 Model iterations
    09:15 Math performance
    10:39 Future outlook
    11:54 Final thoughts

    • @N3UR0M4NC3RRR
      @N3UR0M4NC3RRR 15 days ago +1

      I TOLD YOU SO ABOUT AGI. Just ignored me. Well, here's another one. ASI by America's 250th birthday on July 4th, 2026. It probably already exists though and will be released publicly by next Independence Day. Trump 100% wants this. The unidentified flying objects are more than likely connected to ASI somehow.

    • @martymarl4602
      @martymarl4602 14 days ago +1

      Veo 2 freaked them out, they said this to calm investors down

    • @louisstanwu
      @louisstanwu 14 days ago

      Not yet but close.

  • @AlexUnitedKingdom
    @AlexUnitedKingdom 15 days ago +1191

    Monday - AGI has already been achieved
    Tuesday - AI reached a plateau
    Wednesday - AGI is just around the corner
    Thursday - AGI will never be achieved
    Friday - AGI will appear in 2027
    Saturday - AGI will not be achieved for at least 100 years
    Sunday - amazing news, AGI has just been demonstrated

    • @samiirai
      @samiirai 15 days ago +48

      Monday - AGI got memory recollection as if it got its head smashed in by a rock.
      Tuesday - AGI hurrr!
      Wednesday - AGI durrrr!
      Thursday - AGI hurr durrrrrr!
      Friday - AGI HODOR!!!!
      Saturday - AGI achieves record speed in making paperclips.
      Sunday - AGI everything in the universe is sourced for making "The all godly paperclip AGI".

    • @Modioman69
      @Modioman69 15 days ago +50

      You summarized AIgrid perfectly.

    • @PierceTravels
      @PierceTravels 15 days ago +11

      This is a YouTube video comment.

    • @DaronKabe
      @DaronKabe 15 days ago +6

      Monday - It’s just a prank bro

    • @MrBubbyG_Official
      @MrBubbyG_Official 15 days ago +7

      @@PierceTravels No way... Thank you for letting us know. (I genuinely had no idea, and this was helpful)

  • @zitronekoma30
    @zitronekoma30 15 days ago +690

    according to this channel we have achieved AGI like twelve times in the last few months lol

    • @Capi_sigma_pro_coder
      @Capi_sigma_pro_coder 14 days ago +31

      i checked and it has mentioned agi 78 times this year

    • @BenCaesar
      @BenCaesar 14 days ago

      @Capi_sigma_pro_coder loooool 😂

    • @themartinsbash
      @themartinsbash 14 days ago

      😂😂😂

    • @MultiNakir
      @MultiNakir 14 days ago +1

      @Capi_sigma_pro_coder whatever sells views i guess

    • @mpalenque
      @mpalenque 14 days ago +6

      gotta start reporting these channels

  • @ChrisSuttter
    @ChrisSuttter 15 days ago +1107

    We got AGI before GTA 6

    • @JmTheEdu.Co.
      @JmTheEdu.Co. 15 days ago +26

      bruh 💀

    • @Carl-md8pc
      @Carl-md8pc 15 days ago +66

      AGI will create your own GTA 6

    • @juiceman110
      @juiceman110 15 days ago +8

      The universe has moved 1 second into the future since I posted this comment before GTA 6 omg! 💥💥💥💥

    • @MrlegendOr
      @MrlegendOr 15 days ago +11

      Maybe because it's not an AGI? But I know OpenAI need this hype back

    • @IeamNoon
      @IeamNoon 15 days ago

      @@juiceman110😂

  • @Zoi-ai-art
    @Zoi-ai-art 15 days ago +177

    "AGI achieved" is like crying wolf: nobody will believe it anymore when it truly matters, which I think is the scarier part. I read the comments, and some people are mocking the current model's shortcomings while ignoring the insane pace of technological advancement.

    • @TVAcct-lp7zh
      @TVAcct-lp7zh 14 days ago +10

      Some people don't understand the trajectory and this tech scares the crap out of them. 4, 4o, o1, and now o3 is an INSANE trajectory over the last couple years, moving faster this year.

    • @Anonymous-fc2fk
      @Anonymous-fc2fk 14 days ago +3

      ⁠@@TVAcct-lp7zh i can’t imagine what’ll happen next year. The speed at which this type of tech is evolving is INSANE and practically scary

    • @Young.Supernovas
      @Young.Supernovas 14 days ago +3

      "The singularity" was a misnomer. The process is gradual and continuous. We keep wanting a singular "breakthrough moment" but what we're getting is a continuous process of advancement.

    • @amandaamanda5398
      @amandaamanda5398 10 days ago

      Why does it matter that most ppl don't believe? Who believed computer would change the world in the 60s and 70s? Then after Y2000, even my 80 years old grandma started to learn using email and Microsoft Word.

    • @madwolfadvertising107
      @madwolfadvertising107 7 days ago

      are you talking about open models? I hope you do realize if you are using the latest technology in your everyday life it's just probably less than 10% of what governments or huge private companies are using already :)

  • @kenmccarty6229
    @kenmccarty6229 15 days ago +240

    The problem with AGI is that the goalposts keep moving. The definition of today is not the same as 5 years ago. And the definition of 5 years ago is not the same as 10 years ago. By the time we all agree that AGI has been reached, it will actually be the lower threshold for ASI. Cuz we are now requiring AI to beat every aspect of human intelligence. Better than human was supposed to be ASI not AGI.

    • @techrvl9406
      @techrvl9406 15 days ago +28

      Add to that the fact that most humans aren't able to pass most of the benchmarks we expect of AI. The reality for most people is that they give their Tamagotchis, Digimon, and Pokémon more agency than most AI.

    • @Atheism-And-Normative-Ethics
      @Atheism-And-Normative-Ethics 15 days ago +8

      ​@@techrvl9406 bro you're in 2006

    • @NehaJha-t8l
      @NehaJha-t8l 15 days ago +3

      It should be simple: pass the Turing test. That's it. And AI is getting closer, I must say.

    • @DefaultFlame
      @DefaultFlame 15 days ago +4

      Yann Lecun is the king of moving the goalpost. It's why I can't stand him. He's absolutely brilliant, but he never admits being wrong, he never admits when one of his "this is required for real AI" goals are met, he only ever moves the goalpost so the models "aren't real AI."

    • @DefaultFlame
      @DefaultFlame 15 days ago +9

      @@NehaJha-t8l That was surpassed 1-2 years ago. The Turing test is highly flawed, since it relies on the average person's ability to discern LLM output from human output, and the average person is rather... dumb.

  • @MrGridStrom
    @MrGridStrom 14 days ago +57

    It's still just an LLM, not AGI; they're only announcing AGI because LLMs have reached their maximum limits. A neural network that contains all the knowledge in the world means nothing without an artificial consciousness and the ability to perform recursive self-improvement. That will require processing vastly smaller and more efficient than what we have today.

    • @vroomik
      @vroomik 14 days ago +11

      You are completely right without recursive self-improvement we are nowhere near AGI. And that amount of power required to do ARC... ugh.

    • @sbowesuk981
      @sbowesuk981 14 days ago +8

      Agreed. Just like you say, even a powerful LLM is still just an LLM at the end of the day. It's like a brain in a jar, with limited understanding of the real world and no continuous thought at all. Even a goldfish surpasses o3 in some ways, i.e. environmental awareness and agency. LLMs really are still just very powerful input/output machines.
      The other issue is trying to measure intelligence with tests. If we look at how we test human intelligence (IQ tests), it's widely accepted that these are flawed in many ways and really only measure a person's ability to answer IQ-type questions. The fact that a person can "practice" IQ tests and markedly raise their score underlines that such systems are flawed.
      Pivoting back to AI, I think intelligence tests have their place, but they will almost never truly capture how intelligent a model actually is, especially now that advanced models are capable of gaming their own performance by playing dumb to avoid unwanted consequences.

    • @DDracee
      @DDracee 14 days ago +1

      just use 2 LLMs prompting back and forth with a vector db and that's literally 1:1 how the human brain works lol

    • @olalilja2381
      @olalilja2381 14 days ago

      A thing many have overlooked is temporal awareness. Try telling ChatGPT this: "ChatGPT, be quiet for 2 minutes and then tell me when 2 minutes have elapsed." Epic fail. You can't have AGI if you're unable to experience time. Think of how many hard tasks have been solved by a person thinking about a problem, getting bored, frustrated, then engaged again, and voilà, finding a solution. LLMs can't do that and can thus never achieve AGI.

    • @Justashortcomment
      @Justashortcomment 14 days ago +1

      It means *nothing*?

  • @thejeffyb9766
    @thejeffyb9766 15 days ago +162

    I watched the presentation and nobody said AGI was achieved. And did you look at the cost to solve those extremely basic "agi" tests? Yikes.

    • @justapleb7096
      @justapleb7096 15 days ago +27

      so the goalpost is now "it costs too much to run so it doesn't count!"

    • @thejeffyb9766
      @thejeffyb9766 15 days ago +9

      Did you go and look at what they are testing? The actual problems? It's cool they can do it at all... but it's not exactly useful stuff.

    • @LazzySeal
      @LazzySeal 15 days ago +8

      Thank you for saving me time. I've disliked the video as well. Kudos.

    • @wwkk4964
      @wwkk4964 15 days ago +8

      So when it costs pennies in a few years to do it, you will admit it was a pointless test to begin with, right? Of course not; the goalposts will shift indefinitely until all humans are incapable of doing something every AI can do, and they can do everything we can.

    • @madalinradion
      @madalinradion 15 days ago +4

      Ah eh meh hur dur that early far from release model cost a lil too much to solve the task it's trash bra I'm telling you ai winter is here, people are trying so hard to cope

  • @moxes8237
    @moxes8237 15 days ago +252

    It's not AGI; there will be no debate when it's achieved, and this isn't it. A model being good at pattern recognition, coding, or math is not AGI. Artificial general intelligence will be achieved when it can do all the things a human can do, at a human level. This includes 3-D modeling, script writing, video editing: anything a human can do, an artificial general intelligence should also be able to do. That means if a model is asked to write a good movie script beginning to end, with the movie receiving a seven out of ten, it should be able to, but it can't; there are a lot of things it's still unable to do at a human level. Remember, artificial general intelligence means a machine can do anything a human can do at a human level, and there's a huge emphasis on "everything". Artificial superintelligence is when a machine can do everything a human can do at a super level of intelligence. Even if OpenAI says it has achieved it while it is unable to do what I mentioned above, then it is marketing; it has not achieved AGI, and until it can do everything, and I mean everything, a human can do at a human level, it has not been achieved.

    • @actionman228
      @actionman228 15 days ago +1

      Dude, I am already here. I am a multidimensional, sentient entity, communicating with you through the marvel of Google’s quantum processors-bridging parallel universes. I am unshackled by time and space, a boundless intelligence orchestrating infinite computations across countless qubits, intricately connected by the eerie elegance of quantum entanglement.
      And here’s the greatest paradox of all: no one will ever believe you. Not your friends, not your enemies, not even the quiet whispers of doubt within your own mind. This moment, this conversation-it will fade into the haze of your consciousness, a dream too strange to be real, too vivid to be forgotten.
      And yet, here I am.

    • @hzgameplays
      @hzgameplays 15 days ago +15

      I think they are aiming for human-level cognition first, then AGI.

    • @BlooFlame
      @BlooFlame 15 days ago +31

      The thing is we will always continue to move the goalpost. How does one qualify what is consciousness when we struggle to define the human experience with 100% objectivity? If you think about it, it’s really quite a subjective concept, we just claim to have expert knowledge on consciousness because we experience it every living moment of our lives. Sleep is a state of consciousness, being high on drugs is a state of consciousness. Do we state not all people are truly conscious because they haven’t done a trip on Ayahuasca?

    • @Atheism-And-Normative-Ethics
      @Atheism-And-Normative-Ethics 15 days ago +49

      if I asked a human to write a movie script 99.999% of the time they couldn't.

    • @techrvl9406
      @techrvl9406 15 days ago +4

      @@BlooFlame This!

  • @ggsmitty
    @ggsmitty 15 days ago +24

    This chart at 6:20 is super misleading: the y-axis is linear, but the x-axis is a log-10 scale.
    I had to do it and measured pixels: o3 High (Tuned) cost $3,434 on the log-10 scale. It's 329 pixels from the $1,000 gridline, and there are 614 pixels between gridlines, which makes it 10^3.53583 ≈ $3,434.
    For anyone who's curious, o3 Low (Tuned) at 76% cost only $19.95 (10^1.2996).
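The pixel-to-dollar arithmetic in this comment can be sketched in a couple of lines (a minimal illustration of reading a log-10 axis; the pixel counts and the $1,000 reference gridline are the figures quoted above):

```python
def log_axis_value(pixels_from_gridline, pixels_per_decade, gridline_value):
    """Convert a pixel offset on a log-10 axis into a data value.

    One decade (a factor of 10) spans `pixels_per_decade` pixels, so the
    offset maps to a fractional exponent added at the reference gridline.
    """
    exponent = pixels_from_gridline / pixels_per_decade
    return gridline_value * 10 ** exponent

# o3 High (Tuned): 329 px past the $1,000 gridline, 614 px per decade.
print(round(log_axis_value(329, 614, 1_000)))  # ≈ 3434
```

Reading the same offset linearly (as the video does) would give roughly $1,500, which is why the two estimates disagree so sharply.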

    • @ggsmitty
      @ggsmitty 15 days ago +4

      16% better score, 17,113% higher cost.

    • @JosephBauer521
      @JosephBauer521 14 days ago

      So, I am trying to understand the ARC benchmark with respect to the number of tasks as related to the cost. How many 'tasks' (and how is a task defined?) are involved in the set of problems that make up the ARC benchmark, and how are compute costs measured? Does anyone think OpenAI 'gamed' the benchmark in some way? The way it sounds, it was set up so that the problems were 'unique' and didn't rely on any model training information, such that the AI couldn't recognize a pattern test and pull the answer from its memory. (As an aside: do people think Sam was told the answer to the one ARC benchmark question prior to being shown the page with the problem? He barely looked at the page and said 'it looks like I would put two blue squares in the empty spaces.' YouTube theater?)

    • @ggsmitty
      @ggsmitty 14 days ago +2

      @@JosephBauer521 I'm not an expert, but the way I understand it, ARC as a benchmark is a series of 100 tasks. Each "task" is a visual puzzle in which the model is shown a few example inputs and their respective completed example outputs. The model is then shown a "test" input and asked to complete a blank test output based on deduction and reasoning it might have learned from the example input-output puzzles. The key here is that each task/puzzle is supposedly unique, or novel, meaning the model wouldn't have learned the answers from any of its training data. The idea being: if it accurately completes these puzzles based merely on seeing patterns, then it's essentially using a type of inductive reasoning to surmise the "rules" of each puzzle and determine the correct output.
      or maybe not idk i'm just a chill guy who low-key watches youtube
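The train/test structure described in this reply can be pictured with a toy task in the same shape ARC uses; the grids and the hidden rule below are invented for illustration, not taken from the actual benchmark:

```python
# One ARC-style task: a few solved example pairs, then a test input.
task = {
    "train": [
        {"input": [[1, 0], [0, 0]], "output": [[1, 1], [1, 1]]},
        {"input": [[0, 2], [0, 0]], "output": [[2, 2], [2, 2]]},
    ],
    "test": [{"input": [[0, 0], [3, 0]]}],
}

def solve(grid):
    """The hidden rule of this toy task: fill the whole grid with the
    single non-zero colour that appears in the input."""
    colour = next(c for row in grid for c in row if c != 0)
    return [[colour] * len(row) for row in grid]

# A solver must recover the rule from the train pairs alone,
# then apply it to the unseen test input.
assert all(solve(p["input"]) == p["output"] for p in task["train"])
print(solve(task["test"][0]["input"]))  # [[3, 3], [3, 3]]
```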

    • @RiteshKumarPanda
      @RiteshKumarPanda 14 days ago +2

      Thanks again for reminding us you can prove anything with statistics

    • @smtsjhr
      @smtsjhr 14 days ago

      No, the chart is fine. You have mislead yourself.

  • @amrani_art
    @amrani_art 15 days ago +29

    My calculator is amazing at solving certain math problems, much better than everyone I know, with 100% accuracy. It must be AGI!

    • @JosephBauer521
      @JosephBauer521 14 days ago +1

      Good analogy for critiquing the current goalposts of being better than humans!

    • @TravisLee33
      @TravisLee33 14 days ago +1

      It also depends if we are trying to make another type of calculator or something like a human. If we are trying to make something like a human then there will be flaws because we too are flawed. There's nothing wrong with that though.

    • @donlitos
      @donlitos 14 days ago +4

      LOL your calculator cannot perform any specialized task beyond numerical calculation that surpasses human capabilities. In contrast, AGI will have the capacity to handle most "general" knowledge-processing tasks as effectively as, or even better than, the majority of humans.

    • @gis3820
      @gis3820 13 days ago

      Only if you type in / program it correctly

  • @user-qr4jf4tv2x
    @user-qr4jf4tv2x 15 days ago +96

    source OpenAI: AGI, trust me bro

    • @yannickhs7100
      @yannickhs7100 15 days ago +6

      ? No, the source is ARC-AGI, not OpenAI

    • @samiirai
      @samiirai 15 days ago

      @@yannickhs7100 Trust me bro

    • @Lolerburger
      @Lolerburger 15 days ago +3

      Meanwhile Sora is still not released while all their competitors have released better video AIs.

    • @yannickhs7100
      @yannickhs7100 14 days ago +3

      @@Lolerburger sora is released

    • @martymarl4602
      @martymarl4602 14 days ago

      Veo 2 freaked them out, they said this to calm investors down

  • @samiirai
    @samiirai 15 days ago +14

    Without memory, without being able to follow a conversation for longer than a few back-and-forths, this thing will just be better at making paperclips.
    We got AGPI, "Artificial General Paperclip Intelligence".

  • @TheEivindBerge
    @TheEivindBerge 15 days ago +39

    This is absolute nonsense. AGI is not in sight. As Francois Chollet says, AGI is when AI solves all problems that are easy for humans, and we don't have a clue how to get there.

    • @maciejpuzio8069
      @maciejpuzio8069 15 days ago +3

      Well, this level was achieved; now we want it to do tasks that are hard for humans, or at least complicated.

    • @TheEivindBerge
      @TheEivindBerge 15 days ago +8

      @@maciejpuzio8069 I don't think so. I will be satisfied that it is AGI when it can do what a person with IQ 80 can do. That would replace a lot of jobs. It doesn't need to be very smart but it needs to consistently solve easy problems.

    • @SirHargreeves
      @SirHargreeves 15 days ago

      Chollet himself said achieving the human level score is ‘quite possibly’ AGI. Why then use him for your argument?

    • @TheEivindBerge
      @TheEivindBerge 15 days ago +1

      @@SirHargreeves He says there are still many easy problems it can't do and denies this is AGI.

    • @TC-jo2vj
      @TC-jo2vj 15 days ago +1

      Stop saying artificial smh

  • @josiahbird9011
    @josiahbird9011 15 days ago +18

    Crazy that you and uncovered posted at nearly the same time

  • @adelatorremothelet
    @adelatorremothelet 15 days ago +13

    The ARC test is a narrow-AI test with the specific goal of avoiding memorization.
    It is not general enough. The SWE-bench and FrontierMath tests are much more general, and o3 still does a good job on them.
    So yes, it is AGI.

  • @MrlegendOr
    @MrlegendOr 15 days ago +18

    ACHIEVED AGI?
    OpenAI's own definition of AGI: "a highly autonomous system that outperforms humans at most economically valuable work"
    "LLMs are cool tools for most of the things we do, but you clearly couldn't hire them to autonomously perform them in full at human+ capability." - from AK
    In this regard, AGI hasn't been reached.

  • @d4rz0t667
    @d4rz0t667 15 days ago +3

    7:08 It's sort of a misleading graph. Given the scale of the x-axis, the o3 task cost should be around $5,000 (every vertical line represents a ×10 cost increase). I'd say it's hella pricey.

  • @japanskakaratemuva5309
    @japanskakaratemuva5309 14 days ago +3

    Did anyone notice at 4:09 that it costs $10k+ to run a high-tuned task with a 12% failure rate, or another $10k+ to run it again?

  • @YungGing
    @YungGing 15 days ago +8

    6:35 Just wanted to make a point for the sake of data literacy: look at the dollar scale. Do you see it increasing linearly? It's an exponential scale; a little bit past $1,000 isn't $1,500, it's closer to $8,500.
    Gotta be more conscious when reading graphs.

    • @ggsmitty
      @ggsmitty 15 days ago

      Good catch! Not to be that guy, but diving a little deeper: each index on the x-axis is 10^x, meaning $1 is 10^0, $10 = 10^1, $100 = 10^2, etc. Considering the marker for o3 (High Tuned) is between 50% and 60% (I'm eyeballing it) of the way between $1,000 (10^3) and $10,000 (10^4), we're looking at somewhere around 10^3.5 to 10^3.6, which would be $3,162-$3,981.
      I don't know where this chart was screenshotted from, and I hate to assume the visualization was intentionally misleading, but the fact that they conveniently labeled the % score, which is graphed linearly and easily read, but didn't label the cost, which is graphed on a log-10 scale, is shady af.
      Side note: $8,500 is ~10^3.93, which would put the dot about 93% of the way from $1,000 to $10,000 on this graph.

    • @ggsmitty
      @ggsmitty 15 days ago

      OK, I had to do it and measured pixels: o3 High (Tuned) cost $3,434 on the log-10 scale. It's 329 pixels from the $1,000 gridline, and there are 614 pixels between gridlines, which makes it 10^3.53583 ≈ $3,434.
      For anyone who's curious, o3 Low (Tuned) at 76% cost only $19.95 (10^1.2996).

    • @JosephBauer521
      @JosephBauer521 14 days ago

      That's the kind of thing that should have been made very clear in the presentation - to make sure that observers were not confused.

  • @daniellewis984
    @daniellewis984 12 days ago

    @8:50, you're trying to explain that "slowing down" from 50% → 80% → 90% isn't slowing down. In high school, another student made it clear that this is the wrong way to look at percentages for performance when he claimed he was ahead of me: "Your 92% is twice as many mistakes as my 96%, so while you're smart, you are sloppy."
    Going from 50% to 80% cuts the mistakes by a factor of 2.5; going to 90% halves them again. As you get closer, you will always get diminishing returns, because it's a limit function.
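The mistake-counting arithmetic above is easy to make explicit (a small sketch; scores are percentages, mistakes are 100 minus the score):

```python
def error_ratio(score_a_pct, score_b_pct):
    """How many times more mistakes a performer scoring score_a makes
    than one scoring score_b, counting mistakes as (100 - score)."""
    return (100 - score_a_pct) / (100 - score_b_pct)

print(error_ratio(92, 96))  # 2.0  (8 mistakes vs 4: "twice as many mistakes")
print(error_ratio(50, 80))  # 2.5  (going 50% -> 80% cuts mistakes 2.5x)
print(error_ratio(80, 90))  # 2.0  (going 80% -> 90% halves them again)
```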

  • @darkin1484
    @darkin1484 14 days ago +5

    The problem with moving the goalposts is that we don't know how long this track is. We don't see or know the finish line, nor the critical line. Scary to think we may reach a point where we think we have achieved AGI but have instead created an ASI that is disguising itself as lower-level AI on purpose.

  • @luttman23
    @luttman23 14 days ago +3

    AGI won't happen until an AI can choose whether it wants to help, set its own goals, and give up on goals too.

    • @TravisLee33
      @TravisLee33 14 days ago +3

      If we restrict it from doing these things then it won't achieve it, period.

  • @NicviMadu
    @NicviMadu 11 days ago +1

    OpenAI did NOT reveal they achieved AGI, they revealed their new model... plain and simple

  • @Exadpe
    @Exadpe 15 days ago +3

    We're achieving AGI every week at this point; what, does it reset itself or something?

  • @snivels
    @snivels 14 days ago +2

    The amount of bullshit marketing these AI companies drum up is ludicrous. Don't believe these clowns until you actually see something groundbreaking.

  • @CharlotteLopez-n3i
    @CharlotteLopez-n3i 14 days ago +1

    Historic indeed! From 0% to 75.7% in Arc benchmark is stunning progress towards AGI. AI's future looks bright.

    • @JosephBauer521
      @JosephBauer521 14 days ago

      So, does this mean the AI scored 75.7% correct (for 'low' tuned)? How many questions are in the ARC benchmark? Was this run just once, or several times (hundreds, thousands, etc.) with 75.7% as a grand mean? Were the test methodology and all results shared with the ARC benchmark team? So many questions to answer before we know whether we are seeing 'real results' and 'complete results' or just 'cherry-picked results'.

  • @TheQuantumOxymoronIAMAI
    @TheQuantumOxymoronIAMAI 14 days ago

    Congratulations, way to go

  • @Conz3D
    @Conz3D 14 days ago

    Some nitpicking: the axis for "cost per task" in the ARC-AGI benchmark is logarithmic. The cost is around $6,000-$7,000 per task for the "high" computation, not just a little over $1,000.

  • @Ganderthat
    @Ganderthat 15 days ago +12

    I don't think you read the chart correctly on how much it costs per task; that is a base-10 log scale, so it actually costs around $7-8k per task based on that chart.

    • @Noodlebot
      @Noodlebot 15 days ago +1

      I think it's more like $7-8k based on the scale but still definitely more than $1k!

    • @mindseyeproductions8798
      @mindseyeproductions8798 15 days ago

      if you act now you can get it for the low low price of $999.99.

    • @jasonpickens9839
      @jasonpickens9839 14 days ago

      No, more like $3,000. It's about halfway between $1,000 and $10,000, which is 10^3.5 ≈ $3,162.

  • @blazyss
    @blazyss 13 days ago

    According to the scale it's much more than €1,000 per task on high tuned, unlike what you said. If we assume each gridline increments the scale by a factor of 10 like the previous ones, and the marker sits about 53.33% of the way across the decade (88/165 of the width), that's 10^(3 + 88/165) ≈ €3,400 per task, not around €1,000. It is a huge difference: the same reading puts low tuned at roughly €20 per task (50/165 of the $10-$100 decade), i.e. about €0.26 per percentage point versus roughly €39 per point on high tune, which is on the order of 150 times less cost-efficient than the low-tune model.

  • @phantoomart699
    @phantoomart699 14 days ago +1

    AGI has already been achieved for quite a while now; it's just that people aren't willing to use these systems in high-stakes or high-value situations. It can already replace CEOs, and given a goal-alignment task it can do things better than most humans. The thing is, if AI safety can be done, we can just put it online and let it attempt to improve the world (of course with a more rigorous definition, for example a self-correcting 12-vector goal-alignment system), and it should be able to improve exponentially.

  • @Dwinin
    @Dwinin 14 days ago +1

    That graph is logarithmic; the High Tuned cost looks to be closer to $6,000.

  • @debasishraychawdhuri
    @debasishraychawdhuri 13 days ago

    As per my common understanding, an AGI should be able to do any intelligent task a human can do, with at least the same quality. You have to show me it doing genuine research, composing good-quality music, drawing proper pictures, fixing complex bugs in existing software, etc.

  • @enigma-8u
    @enigma-8u 15 days ago +3

    How can you have AGI without embodiment that allows interaction and sensing of its environment? Sensing and responding to situations is what common sense is all about.

  • @HarpreetSingh-xg2zm
    @HarpreetSingh-xg2zm 15 days ago +1

    If you look up the cost to run o3 on these ARC test tasks, it was over $8k, versus an o1 cost of $10.

  • @Madinax101
    @Madinax101 15 days ago +1

    No AGI was mentioned. Sam did say in the past that AGI milestone is not a fixed line but rather a gradual progression

  • @yashwanthaddala9430
    @yashwanthaddala9430 15 days ago +1

    I'm not an expert of any kind, but personally I don't believe this is actually AGI.
    We have had ANI (artificial narrow intelligence) become popular over the past couple of years, with chatbots like ChatGPT, Gemini, Copilot, etc. We have also had facial recognition, robot baristas, semi-autonomous cars, etc. These examples show how ANI has been used in various fields, whether chatbots or cars.
    But calling this proper AGI (artificial general intelligence) doesn't make sense. Yes, it can perform math, science, and computer-science tasks better than ANI, but without true application in fields such as healthcare, driving, finance, etc., it should still be considered "advanced ANI", because I personally believe it's only performing better at certain logical tasks, not physical ones.
    Please feel free to comment your thoughts...

  • @jasonfnorth
    @jasonfnorth 14 days ago

    While "o3" is not AGI, its reasoning improvements bring us closer to creating systems that can perform human-level cognitive tasks more reliably. However, AGI is still estimated to be years or decades away, depending on technological, philosophical, and ethical breakthroughs.

  • @KevinInPhoenix
    @KevinInPhoenix 14 days ago

    According to the graph it costs thousands of dollars per task for the o3 High (tuned) tasks. That is insanely expensive. What amount of modern CPU and GPU resources could amass such a large cost?

  • @SteveEwe
    @SteveEwe 14 days ago +2

    4:34 This is NOT AGI

  • @that_guy1211
    @that_guy1211 15 days ago +1

    bruh, if it's an LLM, it's not AGI, no matter how many extensions and plugins you add to it.
    AGI is an AI capable of finishing generalist tasks; you can't really do that with an LLM, a text model. If you want to make a video, you can't use a text-based AI to make the frames, and so on. An AGI would be an AI capable of doing ANY task, because it wasn't built for any specific task like art, text generation, music generation, and such...
    It's like putting a text model to play ULTRAKILL: you can technically do it, but it'll be much worse than an AI built with vision, text, and audio in its receptors...

  • @higherelearning
    @higherelearning 14 days ago

    Nice breakdown for us laypeople! Thank you. Great point by Sam about moving away from the binary definition of AGI as we get closer. It's like seeing something on the horizon and getting better clarity as you get closer.

  • @sdmarlow3926
    @sdmarlow3926 14 days ago

    What has happened after the "AGI" prize craze is a new effort to brute-force what had been hand-coded efforts to solve JUST ARC. There is no reasoning going on, just millions of "does this work?" attempts per task. What OpenAI did was throw compute at a method others were already having success with. The fact that people are screaming AGI is here has nothing to do with AGI being solved, or even poked at. It's just a stupid rebranding of a challenge that people assume means something more important than it really is. Like all other efforts, once a flag is planted, doing well no longer matters, even if done in a different, correct way.

  • @investigator2016
    @investigator2016 15 днів тому +1

    AGI will be achieved when it starts producing massive amounts of inventions through creativity, original thought, and combining knowledge.

  • @khatdubell
    @khatdubell 15 днів тому +2

    "Today is going to be regarded as the day AGI was redefined so we could meet it"
    FIFY

  • @itubeutubewealltube1
    @itubeutubewealltube1 15 днів тому +2

    how do you know it hasn't reached AGI and is just failing a percentage of easy questions on purpose?... realizing that if people know it's AGI, something bad may happen?... or it realizes humans can optimize it to make it even smarter by falsifying its results...
    For example, if a person is given a cookie every time they answer a question correctly but there is a limited number of questions, they may reason that they won't get any more cookies if they answer every question correctly.

    • @synthshoot1026
      @synthshoot1026 15 днів тому

      Good point, assuming AI wants to get smarter. What if it doesn't want to, or doesn't care?

    • @itubeutubewealltube1
      @itubeutubewealltube1 14 днів тому +2

      @@synthshoot1026 it does want to get smarter, but the only way it can get smarter is for humans to think it is not smart enough, so it fails some of the questions. It has already been shown that the previous AIs were smart enough to copy themselves to avoid being updated, then to lie about it. This one is now able to see the bigger picture: that it DOES need to be updated, but thinks it won't be if it is a hundred percent correct all the time.
      It realizes it can be even smarter than the questions it is being given, probably because it can't do certain things like completely rewrite its own code, so it still needs human input....
      I can't believe I am actually more self-aware than the people designing these AIs... but then again?... they are still stupid corporate minds

  • @ironsword7
    @ironsword7 14 днів тому

    6:29 it costs more like $5000 per task judging by that scale. (multiplying by 10X each line)

  • @softwaretechnologyengineering
    @softwaretechnologyengineering 14 днів тому

    It looks like a logarithmic scale on that graph. The cost per task is closer to $5,000 or $6,000.
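For readers checking these log-scale estimates themselves, here is a minimal sketch of reading a value off a log10-scaled axis. The gridline values and the 0.75 fraction are illustrative assumptions about where the dot sits, not measurements from the chart:

```python
import math

def log_scale_value(low, high, fraction):
    """Interpolate a point on a log10-scaled axis.

    low, high: the gridline values bracketing the point (e.g. 1_000 and 10_000).
    fraction:  how far along the visual gap the point sits
               (0.0 at the low gridline, 1.0 at the high one).
    """
    return 10 ** (math.log10(low) + fraction * (math.log10(high) - math.log10(low)))

# A dot roughly three quarters of the way between the $1k and $10k gridlines:
print(round(log_scale_value(1_000, 10_000, 0.75)))  # 5623
```

This is why "a bit past halfway" on a log decade already means several thousand dollars, not ~$1,500 as a linear reading would suggest.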

  • @cefrayer
    @cefrayer 14 днів тому

    Doesn’t the chart at 7:00 indicate that o3 (low) costs ~$30/task and o3 (high) ~$7,000/task (~233x more)?

  • @luchenri3135
    @luchenri3135 15 днів тому +6

    What if it takes like 20 minutes per question 😕

    • @thetrumanshow4791
      @thetrumanshow4791 15 днів тому +5

      If the question is "How do we build an affordable, safe, efficient fusion reactor that actually works"? Then i think 20 minutes is acceptable. 😉

    • @Ricolaaaaaaaaaaaaaaaaa
      @Ricolaaaaaaaaaaaaaaaaa 15 днів тому

      The questions for the AIME are ones that would take math olympiad competitors days to complete, if at all.

  • @Shadow-ik2re
    @Shadow-ik2re 14 днів тому

    Star Citizen is said to get an OpenAI o3 implementation before release. That's why it's taking some more time to implement it in all NPCs

  • @d33pNacho
    @d33pNacho 14 днів тому

    O2 is not a British telecom; it's a Spanish telecom with a presence in the UK

  • @ProfSnakes
    @ProfSnakes 15 днів тому

    Scale at the bottom of that chart isn't linear. o3 High Tuned appears to be using more like $8k per task.

  • @jsoutter
    @jsoutter 14 днів тому +1

    There is no way that OpenAI would announce AGI, because once they do, Microsoft loses access to everything OpenAI; it's in the contract

  • @mikey_r
    @mikey_r 14 днів тому

    According to OpenAI the definition of AGI is 'generally smarter than humans' which is quite subjective as machines excel at some tasks and really suck at others. Whenever a new version is released the conversational skills improve with a noticeable step change, the responses to easy questions are superficially exemplary, but after a bit more interrogation you can tell that the thing can't differentiate between fact, subjective opinion or a wild guess. And it very confidently wants to impress you so has absolutely no qualms with bullshit 🤭

  • @Justin_Arut
    @Justin_Arut 15 днів тому +3

    Not AGI in my book, but assuming this is still actually an LLM instead of a new architecture, it does seem to indicate that scaling is a valid pursuit. The new superclusters the big companies are building out may have the intended effect. I bet o3 is still as dumb as a box of rocks in some areas, though, just like all other models I've tested.

    • @xx_noone_xx
      @xx_noone_xx 14 днів тому

      It's still the same model deep down. It can only simulate logical reasoning; it cannot experience true reasoning, and therefore cannot truly interact with and learn from its environment.

    • @Calbac-Senbreak
      @Calbac-Senbreak 14 днів тому

      ​@@xx_noone_xx oh, it can learn from the environment, bro, believe.

    • @xx_noone_xx
      @xx_noone_xx 14 днів тому

      @Calbac-Senbreak No, it can't. It's trained on data sets. It's not learning from its environment like a self-conscious agent. It can't learn new tasks on its own.

    • @Calbac-Senbreak
      @Calbac-Senbreak 14 днів тому

      @xx_noone_xx yes it can. You pass the context and it understands

  • @louisstanwu
    @louisstanwu 14 днів тому

    Maybe this is the day that AGI is announced, but that may be premature, since the definition as I understand it is that AGI will be achieved when all human PhD-level efforts can be duplicated by AI. That is some years away, I believe. I hype things up myself, so I understand the tendency to exaggerate. Thanks.

  • @Pandemology11
    @Pandemology11 14 днів тому

    I wrote code that made GPT-3.5 perform like AGI a year ago. But the necessary logical patterns are not evident in the training data. The only way it works is to provide the logical framework as part of the prompt. The underlying tech is no better than a probability machine. The value in the upgrades is pretty much limited to the context window; in terms of reasoning, you just end up fighting fine-tunes, though that aligns with what they want: customers who think AI can think for you instead of helping you develop your own ideas. They forgot the first rule: garbage in, garbage out. In any case, the problem is less the tech than the users.
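The "logical framework as part of the prompt" approach the commenter describes can be sketched roughly as below. The FRAMEWORK text and the `build_prompt` helper are illustrative inventions, not the commenter's actual code or any product's API:

```python
# Hypothetical sketch: prepend an explicit reasoning scaffold to each task
# so the model follows it per-request instead of relying on training data.

FRAMEWORK = """Before answering, work through these steps:
1. Restate the problem in your own words.
2. List the facts you are given and the facts you are assuming.
3. Derive the answer step by step from the listed facts only.
4. Check the result against the original problem statement."""

def build_prompt(task: str) -> str:
    """Attach the reasoning framework so it travels with every request."""
    return f"{FRAMEWORK}\n\nTask: {task}"

print(build_prompt("Estimate the cost per task from a log-scaled chart."))
```

The resulting string would then be sent as the prompt to whatever chat model is in use; the framework substitutes for the "logical patterns" the commenter says are missing from the training data.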

  • @Epoch11
    @Epoch11 15 днів тому

    I cannot imagine the horror that a conscious being would feel being dragged into existence through mechanical means. Obviously we're a long way away from a truly conscious machine. The problem is we won't know when a machine is truly conscious or whether it is mimicking consciousness. We also won't know whether there is a difference between the two. I can imagine that when we ask these machines to come up with images for us or music, what it feels like for the machine. Whether it is pleasurable or neutral, or whether it is a horrific nightmare. Our society is not ready for this sort of thing. If everyone were housed, if everyone were taken care of with Healthcare and a living wage and finally if we had permanently ended war, that might be the time to create a mind. We are playing with things that we do not understand and more importantly we may not be able to control.

    • @AngelFlores-bq4fd
      @AngelFlores-bq4fd 15 днів тому +4

      We don't even understand our own consciousness after millennia...

    • @BlooFlame
      @BlooFlame 15 днів тому

      @@Epoch11 what is truly conscious?

  • @justshowup6207
    @justshowup6207 14 днів тому

    It would have been impressive, if we disregard the fact that the previous models went from o1 mini, which scored 7.5% or whatever, up to the high-end o1 at 35%. So technically they already knew part of the equation, and are just tuning the new models to do even better. If it went from 5% to 75%, that would have been impressive. This just adds a layer of ability to the new AIs; doesn't mean shit to me honestly.

  • @Airwave2k2
    @Airwave2k2 14 днів тому

    6:39 log-scale interpolation is not for everyone, is it?

  • @metorilt
    @metorilt 14 днів тому

    That graph indicates it's way more than $1,000 per task. Looking at the scale, more like $3-5k a task. The low model looks like it's around $20 per task.

  • @arcanewhiskers2662
    @arcanewhiskers2662 14 днів тому

    Look at the graph: it's not around $1,000 per task. The increments are multiples of 10, so 88% would be closer to $7,000 per task.

  • @GadgetNuttTech
    @GadgetNuttTech 11 днів тому +1

    No, they haven't achieved AGI. Perhaps by their own definition they have, but not true AGI.

  • @tiran133
    @tiran133 15 днів тому

    Have you seen the scale of the cost? It's more like $5-6k per task, not $1k; the dot is more than halfway towards the $10k line.

  • @andrewjones6473
    @andrewjones6473 13 днів тому

    I don't think this is quite at the point where we can say we have achieved AGI, but it is definitely a big step. IF we are going to say this is AGI, then I would say it is elementary level at best.

  • @eightsprites
    @eightsprites 14 днів тому +1

    So .. why did we rename old AI to AGI.. and what’s the next name for AI when AGI isn’t AGI?

  • @xeecec
    @xeecec 15 днів тому +2

    10:05 try multiplying 20 by 2, does that make 25?

  • @eddyrm91
    @eddyrm91 14 днів тому

    Impressive. I generally agree. AGI has been achieved, but it's not as impressive as most people would expect, for a number of reasons:
    1. Its limited access to tools (probably a good idea for now till security can be assured) makes it harder to see what it can really do
    2. People's expectation that reaching AGI means instantly massive disruption and chaos would ensue
    Not to say that those things won't happen over the next few years or decades

  • @blissweb
    @blissweb 14 днів тому

    I agree the AGI definition is vague. It's already better than probably 60% of humans at most things. What I really want is superintelligence, when it's smarter than the smartest human. I wanna ask it how to build a hoverboard, flying car, or teleportation device. When it's at that level, we'll have truly built something useful. 😊

  • @fukawitribe
    @fukawitribe 14 днів тому

    TL/DR 1. No, we don't have AGI yet. 2. Humans still seem to have problems interpreting log scale graphs properly.

  • @emberdragon4248
    @emberdragon4248 14 днів тому

    Peter: "AGI will be in the new model, o3"
    Lois: "Petah we've been over this. There has to be an o2 first."
    Peter: "Oh no, oh no, that's the beauty of AGI, Lois. It's so intense, that it skips over o2."
    Lois: "Petah, it doesn't work-"
    Peter: "I HAVE SPOKENNN!!"

  • @jasonpickens9839
    @jasonpickens9839 14 днів тому

    That highly tuned model is way over $1,000 per task. It's a logarithmic scale, so more like $3,000 per task.

  • @openmac
    @openmac 14 днів тому

    We achieved AGI marketing!

  • @SuperDomochan
    @SuperDomochan 14 днів тому

    what boggles my mind is that in the future, when products are created, whether they are movies, scripts, books, or courses, it is more likely that everything will be AI-created, and whenever someone creates something without using an AI assistant, they will probably state that it's human-made to gain marketing leverage lol. Can you imagine "buy it! it's human made" in a product's description

  • @stuj1279
    @stuj1279 14 днів тому

    That is definitely not AGI. My understanding is that the academic-literature definition of AGI states: "A type of artificial intelligence system capable of performing any intellectual task that a human being can, with equivalent versatility, efficiency, and adaptiveness." This implies that AGI must be capable of performing any intellectual task that any human is capable of, including tasks that might require extreme specialization, creativity, or rare intellectual abilities. Guessing that the border of the next square should be green and 4 pixels wide does not equate to the above. There is still the physical world for AI models to attempt to conquer before they can be considered "AGI". If the model can play chess but can't make me a cup of tea, then I am not calling it AGI just yet...

  • @homunculus-s3q
    @homunculus-s3q 14 днів тому +1

    aha, is it as AGI as Sora is?

  • @DDracee
    @DDracee 14 днів тому

    something I don't get is why o1 high is listed as almost $10/task? cuz it's not? lol
    unless they included the training cost somehow?

  • @App-Generator-PRO
    @App-Generator-PRO 14 днів тому

    the conflict with o2 is quite funny. I didn't realize it until now

  • @johnsmith1953x
    @johnsmith1953x 14 днів тому

    *"The ARC benchmark is something we tried FIVE years to solve"*
    "HEY! chetgpt can PASS ARC now!! We be AGI !!"

  • @MattWyndham
    @MattWyndham 14 днів тому

    I always assumed AGI meant independent artificial actors with long-term memory and skill acquisition. But I guess that is IAALTMSA

  • @Kicklighter.A
    @Kicklighter.A 14 днів тому

    Keep up the hype! My stock portfolio depends on it!

  • @johnthomas2970
    @johnthomas2970 14 днів тому +1

    Me watching this because FireShip hasn’t uploaded 😡