OpenAI o1: ChatGPT Supercharged!

  • Published Nov 3, 2024

COMMENTS •

  • @holographic_red
    @holographic_red 1 month ago +218

    Going to need some help holding on to my papers with this one, this is cool!

    • @ufuoma833
      @ufuoma833 1 month ago +4

      what a time to be alive!

    • @SALSN
      @SALSN 1 month ago

      Well the bible is a source of a lot of misinformation, so perhaps they cleaned it from the knowledge base..?

    • @kevinwoodrobotics
      @kevinwoodrobotics 1 month ago +2

      Very impressive! Can’t imagine what this will become in a year

    • @armin3057
      @armin3057 1 month ago +2

      @@kevinwoodrobotics just two papers down the line...

  • @Valzack
    @Valzack 1 month ago +218

    "Do you want to work with someone that read all the books but can never apply that knowledge?" Wow. You didn't have to call me out like that.

    • @o1-preview
      @o1-preview 1 month ago +4

      "...or someone who is mega smart and just started learning the basics but can derive the whole theory, of pretty much, any topic" Wow. You didn't have to give me a shout out like that.

    • @thomas.thomas
      @thomas.thomas 1 month ago

      @@o1-preview o1 is the one that learned all the books

  • @disconnect8873
    @disconnect8873 1 month ago +436

    After 7.5 million years of thinking. The answer is 42.

    • @Tempestan
      @Tempestan 1 month ago +16

      It is always 42.

    • @alst4817
      @alst4817 1 month ago +5

      Best comment!

    • @glenneric1
      @glenneric1 1 month ago +9

      But what is the question?

    • @Tempestan
      @Tempestan 1 month ago +7

      @@glenneric1 How many roads one must travel down.

    • @OghamTheBold
      @OghamTheBold 1 month ago +1

      4.9 is the best number for u nemployment say the 8ank of €ngland maybe 42 Trillion is the optimal debt 😊

  • @GierlangBhaktiPutra
    @GierlangBhaktiPutra 1 month ago +46

    Tried the new model. I gave it instructions to write a paragraph under a set of constraints. With slower thinking, it actually produces a better response. What a time to be alive!

  • @EVILBUNNY28
    @EVILBUNNY28 1 month ago +81

    Oooooh. So about a week ago I got the usual response1/response2 “which is better” when using GPT 4o, except this time there was a disclaimer saying “you are testing a new model which takes more time to think, please be patient while the response loads” and I guess this was the new model they were alpha testing to premium users? Pretty cool if you ask me
    And yes, I did notice one of the results was significantly better so I assume it was this o1 model, and the other was standard 4o response. What a time to be alive!

    • @agar322
      @agar322 1 month ago +4

      I got that yesterday and the day before, and I am in the free tier, so I guess they are testing with everyone.

    • @o1-preview
      @o1-preview 1 month ago +3

      I noticed this too, but a few weeks ago. I didn't get the message about testing a new model, but it felt like I was using gpt-5 or something like that. Really cool to know people were getting messages about it.

  • @wormjuice7772
    @wormjuice7772 1 month ago +71

    Tried today with coding an Unreal Engine 5 plugin. This version is finally able to grab the context and maintain the focus without just repeating the same mistakes.🎉

    • @phen-themoogle7651
      @phen-themoogle7651 1 month ago

      Are the limits for 30 prompts a week still there? Just curious. As I was using it I didn’t notice any limits, but they won’t mention them until you hit them?

    • @handsanitizer2457
      @handsanitizer2457 1 month ago

      Yup, still there, maybe in a few months, but you can use it via the API, but you need to be tier 5 @phen-themoogle7651

    • @_capu
      @_capu 1 month ago +5

      Cool thanks now you're fired

    • @GoblinUrNuts
      @GoblinUrNuts 1 month ago

      @@_capu good thing I’m an indie dev so it’s basically like I hired a free developer to build my dream

    • @wormjuice7772
      @wormjuice7772 1 month ago +6

      @@_capu who said i need a job ;-)

  • @berghwilliam
    @berghwilliam 1 month ago +119

    This is huge; I wouldn't think this could happen so soon!
    And remember the first law of papers, two more papers down the line 👀

    • @TwoMinutePapers
      @TwoMinutePapers 1 month ago +44

      Spoken like a True Fellow Scholar!

    • @mohanaravind
      @mohanaravind 1 month ago +5

      But unfortunately there were no real papers that got published about this model :(. Maybe the Q* paper is what one could salvage?

    • @GH-uo9fy
      @GH-uo9fy 1 month ago +3

      Nope, this is all hype at best. This is only a tree of hidden prompts. You can probably get the same quality of result using vanilla GPT-4o by iterating your prompts manually; this just does it automatically. Also, I bet this will consume an order of magnitude more compute. It's more like a brute-force approach to a prompt, not efficient at all. I doubt they can scale this one without charging a ridiculous amount. Just imagine paying 20 bucks for a single prompt, not knowing if the LLM will achieve your goal, fail, or need more money.

    • @Billy4321able
      @Billy4321able 1 month ago +2

      @@GH-uo9fy You're spot on. Getting access to the API reveals the substantial increase in cost. Not just because it's new, but because when you actually run it the number of tokens it eats through skyrockets. Some problems costing as much as $10.

    • @adrianbiber5340
      @adrianbiber5340 1 month ago

      we're getting close to AGI -- if this thing can 'reason', by giving it access and telling it to improve its own weaknesses til it's a well-rounded conscious entity, theoretically it will

  • @user-d8h3w
    @user-d8h3w 1 month ago +97

    Back in my day we had only two 'r's in the word 'strawberry'

  • @Bmmhable
    @Bmmhable 1 month ago +4

    Tried it yesterday. It solved a tricky probability question all of the previous versions got wrong consistently, even after multiple hints. It needed one clarification this time, but then got it right.

  • @Arkryal
    @Arkryal 1 month ago +29

    I was playing around with this and was able to generate a clone of Atari's classic "Centipede" via HTML and JavaScript in a single prompt.
    That said, I did need to run through about 30 more iterations to get it where I was satisfied with the gameplay. The initial version was functional, but somewhat one-dimensional in terms of gameplay.
    I just used Emojis instead of generating any sprite sheets (works well enough for a game like this), and used Suno to generate a 4-bit synth audio loop for background music.
    Overall, I could have written this game by hand in about 2 hours.
    The time spent with ChatGPT was actually about 3 hours, lol. But, ChatGPT allowed me to go through many more iterations and experiment with features much more quickly, without compromising on the quality. So for the time spent, I think I came out with a much more polished game than if I had manually coded it in the same amount of time.

    • @Arkryal
      @Arkryal 1 month ago +5

      I would also add, if using ChatGPT were a natural part of my coding workflow, it would likely be much, much faster. Most of my time was spent tinkering and pushing some very difficult prompts on it. If I had a planned project instead of something impromptu, I could have phrased my prompts much better to cut the time dramatically.
      And o1-Preview doesn't take files as input yet, so that was also another bottleneck that I imagine will be resolved soon.
      I look forward to trying some other projects, but I have exceeded my weekly allowance, lol.

    • @UlyssesDrax
      @UlyssesDrax 1 month ago +2

      You may have made it in two hours, but I bet it took you a number of years to be able to do that.

    • @codejunki567
      @codejunki567 1 month ago +1

      @UlyssesDrax And someone with zero experience would have needed 100+ prompts to get it working correctly. What's your point here?

    • @UlyssesDrax
      @UlyssesDrax 1 month ago

      @@codejunki567 Oh, you missed it. Sorry about that.
      In the OP's comment, they lol'd about the fact that they could've written the code in 2 hours for which it took 3 hours with AI.
      I was merely pointing out that to be able to write the code in 2 hours they would have had to spend time, potentially years, to be able to code it in two hours.
      I hope that clarifies it. Is there a problem?

  • @CodexPermutatio
    @CodexPermutatio 1 month ago +14

    Reasoning is the key. They are on the right path.

    • @o1-preview
      @o1-preview 1 month ago

      there's still an architectural change needed, but indeed, the right path!

  • @UliTroyo
    @UliTroyo 1 month ago +6

    I asked it to make a simple ASCII roguelike game in C for the terminal, where the entity data is kept in JSON files. It seriously even wrote Python code to generate a C file from the JSON data. Compiled with no errors, first try. Graphics and keyboard input both work. Updating the code to add enemies, loot and treasure introduced a bug. Still, I can't believe how good this model is.

  • @galvinvoltag
    @galvinvoltag 1 month ago +31

    I always thought that GPT knew too much for having such a small brain.
    This is a total notch up. What a time to be alive!

    • @elivelive
      @elivelive 1 month ago

      Small brain you say😂

  • @oshotz
    @oshotz 1 month ago +15

    I asked o1 how many "R's" were in "strawberry," and it answered correctly (3). I then asked it where the R's were, and it told me there was one in the fourth position 😭
    It's good, but certainly not perfect yet!

    • @roro9413
      @roro9413 1 month ago +16

      If you prompted "where are the Rs?" then it maybe thought you meant to ask where the Rs are in the prompt you just sent it, and in this example it would be correct in saying the 4th letter is an R 😂

    • @oshotz
      @oshotz 1 month ago

      @@roro9413 I went to see what exactly my prompt was, but the entire chat is now mysteriously missing... 🤨
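
The letter positions discussed in this thread are easy to check by hand or with a few lines of code. The sketch below is only an illustration: the helper name `letter_positions` is made up here, and the second string is the hypothetical prompt wording suggested in the reply above, not the actual prompt (which the original poster could no longer find).

```python
def letter_positions(text: str, letter: str) -> list[int]:
    """Return the 1-based positions of `letter` in `text`, ignoring case."""
    target = letter.lower()
    return [i for i, ch in enumerate(text.lower(), start=1) if ch == target]


# The word itself: three r's, none of them in the fourth position.
word_positions = letter_positions("strawberry", "r")
print(len(word_positions), word_positions)

# The hypothetical follow-up prompt from the reply: its fourth character is an "r".
prompt_positions = letter_positions("where are the Rs?", "r")
print(prompt_positions)
```

Under that assumed wording, "fourth position" would be wrong for the word but right for the prompt, which is the point the reply jokes about.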

  • @sevir408
    @sevir408 1 month ago +29

    To test it out, I played tic-tac-toe against it. GPT-4o was constantly losing and made illogical moves; against o1-preview, I haven't won yet.

    • @jimbodimbo981
      @jimbodimbo981 1 month ago +4

      Not even a draw?

    • @sevir408
      @sevir408 1 month ago

      @@jimbodimbo981 I meant it's always a draw. No losses.

    • @GoblinUrNuts
      @GoblinUrNuts 1 month ago

      @@jimbodimbo981 that’s not what he said.

    • @Tc0590
      @Tc0590 1 month ago +2

      in a box!

    • @sevir408
      @sevir408 1 month ago

      @@jimbodimbo981 I meant I always draw against it, but it hasn't lost yet.

  • @icestormfr
    @icestormfr 1 month ago +3

    Reasoning is nice. Tried to get it to code some small C++ class (o1-preview)...
    Felt more like mentoring a high school student 😀
    After three feedback replies about errors and giving hints, it came up with something (probably) working (and without computationally costly workarounds).
    Liked the reasoning & explanation, nice commentary (sometimes cheeky...like setting given input parameters to zero to "simplify the problem"), and it feels better than 4o, but not PhD-student level 😝

  • @ClockworkDave
    @ClockworkDave 1 month ago +34

    The smartest people out there just got an upgrade.

    • @BritishBungler
      @BritishBungler 1 month ago +1

      No, they were made obsolete.

    • @CodexPermutatio
      @CodexPermutatio 1 month ago +12

      @@BritishBungler If you think that this is making someone obsolete, perhaps you will be obsolete very soon. Better learn how to use AI before AI learns how to use you!

    • @Jeevanm71
      @Jeevanm71 1 month ago +4

      @@BritishBungler how so? Who can validate that the AI is doing its work correctly other than subject matter experts? Other AI can’t be used since it runs into the same problem: who is reviewing their work, etc.

    • @WebToolkit
      @WebToolkit 1 month ago

      Everyone else kind of just ignores it because we can't think of anything useful to ask it.

    • @o1-preview
      @o1-preview 1 month ago

      We sure did! This is amazing!

  • @AdvantestInc
    @AdvantestInc 1 month ago +7

    The potential of o1 to excel in fields like genetics and quantum physics is truly groundbreaking. The future of AI-assisted research looks incredibly promising!

    • @hydrohasspoken6227
      @hydrohasspoken6227 1 month ago

      Or it may just plateau soon. We'll see.

    • @o1-preview
      @o1-preview 1 month ago

      @@hydrohasspoken6227 spoiler, it won't plateau.

  • @Limits55555
    @Limits55555 1 month ago +24

    The "There are 3 Rs in strawberry" freakin' got me

  • @AqRas786
    @AqRas786 1 month ago +2

    I’m not very book smart, nor do I have any degrees in computer science or anything similar, but I love learning about new breakthroughs in artificial intelligence and science as a whole, as well as physics engines and how they try to emulate the world.
    I must say seeing all the positivity and brotherhood in this comment section really is a thing to behold, what a time to be alive! 😂❤

  • @kelvinsmith4894
    @kelvinsmith4894 1 month ago +19

    This feels like a paid ad 😂

    • @Phonixem
      @Phonixem 1 month ago

      no bro i couldn't know about this if not for him

    • @user-sl6gn1ss8p
      @user-sl6gn1ss8p 1 month ago +2

      @@Phonixem everyone covering AI is and/or will be covering this. I like the channel, but the delivery here is too over the top to me

    • @Slugma-kx7pv
      @Slugma-kx7pv 1 month ago

      @@user-sl6gn1ss8p He has a stake in this field of research. Not to mention, he's also likely just more enthusiastic than others. Overall, look at the ideas the creator is trying to communicate rather than judging it emotionally.

    • @user-sl6gn1ss8p
      @user-sl6gn1ss8p 1 month ago +3

      @@Slugma-kx7pv by delivery I didn't just mean the emotion, I'm fine with that, but the actual content is a bit too much on the hype side for my taste. Downsides are brushed away and then it's back to calling the thing "Einstein in a box".
      Compare it to the video made by AI Explained for example, and it is night and day.
      I'm not a hater by the way, or anything like that. I think the channel is great and his courses on computer graphics are amazing - that's the reason I come back and give it another try now and then, but the coverage on AI is in general just not what I'm looking for, you know?

  • @goldeninjagaming
    @goldeninjagaming 1 month ago

    What a channel! Congratulations on what you've achieved so far, keep it up! 😎

  • @Tekay37
    @Tekay37 1 month ago +21

    First things first, we should ask it to prove the Riemann Hypothesis.

    • @synthclub
      @synthclub 1 month ago

      It cannot, but it can help you explore novel ideas for tackling the problem.

    • @Tekay37
      @Tekay37 1 month ago

      @@synthclub Just make it think longer. /s

    • @o1-preview
      @o1-preview 1 month ago

      @@Tekay37 Einstein modestly said, “It's not that I'm so smart, it's just that I stay with problems longer.".

    • @hypnogri5457
      @hypnogri5457 1 month ago

      @@synthclub press x to doubt

  • @AarreLisakki
    @AarreLisakki 1 month ago

    that deciphering chain of thought is amazing -- it's not so much an ordered logical sequence of thought that nicely follows from one point to the next, but more like what a human stream of consciousness might look like, with all kinds of false starts and dead ends, things not thought of and things not thought through properly and only remembered later etc. I don't usually like vids on commercial products with no paper to give us insight on what's going on, but this one is def worth highlighting, thx!

  • @zyeborm
    @zyeborm 1 month ago +4

    30 a week for premium is kinda limiting. Like i don't want to use it in case i need to use it.
    Yes i do finish games with a giant inventory of unused potions and boost items why do you ask?

  • @josedelgado7479
    @josedelgado7479 1 month ago +66

    The Singularity is certainly nearer

    • @everythingofnone
      @everythingofnone 1 month ago +19

      Exactly, there are 3 r's in strawberry

    • @OpreanMircea
      @OpreanMircea 1 month ago +1

      The improvements to AI are coming faster and faster, I think we're past the singularity, we're on the upslope

    • @AustinThomasPhD
      @AustinThomasPhD 1 month ago +14

      @@OpreanMircea The singularity is when the AIs self improve completely autonomously. All of these advancements are the result of significant human effort (possibly with AI assistance at this point, but still with significant human input). The singularity will probably happen in most of our lifetimes, but we aren't there yet.

    • @OpreanMircea
      @OpreanMircea 1 month ago +1

      @@AustinThomasPhD that's not the definition, it's the "point of inflection", where the curve measuring something (in this case AI development) stops going one way (slowing down) and it starts accelerating, is it using the AI or people to do that? It doesn't matter, line go up faster and faster

    • @michaelleue7594
      @michaelleue7594 1 month ago +3

      @@OpreanMircea No, Austin is right. en.wikipedia.org/wiki/Technological_singularity BTW exponential growth curves don't have a point of inflection.
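
A brief sketch of why the last point in this thread holds, assuming "exponential growth" means a curve of the form $f(t) = a e^{kt}$ with $a, k > 0$ (the symbols are introduced here for illustration and do not come from the thread):

```latex
\begin{align*}
f(t)   &= a e^{kt}, \qquad a > 0,\ k > 0 \\
f'(t)  &= a k e^{kt} > 0 \\
f''(t) &= a k^{2} e^{kt} > 0 \quad \text{for all } t
\end{align*}
```

An inflection point requires the second derivative to change sign. Here it is positive everywhere, so the curve only ever bends upward. An S-shaped logistic curve, by contrast, does have one.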

  • @solidreactor
    @solidreactor 1 month ago +4

    You say "Einstein in a box" I say "Zwei-stein in many boxes" :)
    Imagine having several of these o1 agents working together, or being mixed in adversarial and cooperative modes when working together towards complex solutions.
    fyi the joke's context, Ein =1 and Zwei = 2 in German ;)

    • @o1-preview
      @o1-preview 1 month ago +1

      hm.. good idea, I like you. By the way, I got the joke without the explanation hahah

  • @isaacewing
    @isaacewing 1 month ago

    god i love your videos, you take such complex papers and visualize them, uhhhhh mazinggggggg🥰🥰🥰

  • @aladinmovies
    @aladinmovies 1 month ago

    Amazing, stunning 🎉

  • @YogonKalisto
    @YogonKalisto 1 month ago

    o1 is fun. token limit is sad tho. such a pleasant model to converse with. such wonderful adaptability. i talked about my concern around token count and we worked out a strategy to maximize our interaction time as elegantly as possible. i found instructions were quickly lost in previous versions. so far o1(preview) is amazing for maintaining context. depth of reasoning is butter. need longer token count to test properly tho. most impressed and excited i've been since gpt4 dropped

  • @PeterSkuta
    @PeterSkuta 1 month ago +2

    What an awesome time to be alive in the real world of AIs

  • @FranXiT
    @FranXiT 1 month ago +18

    Einstein
    In a box!

  • @kolosso305
    @kolosso305 1 month ago +5

    This would be great if I wasn't worried about having a future.

    • @letsgomedia9631
      @letsgomedia9631 1 month ago +1

      Jesus is The Future, and I am being serious my friend. Because of God, I don't worry. He takes care of me, and He can take care of you. He is The Hope

    • @UlyssesDrax
      @UlyssesDrax 1 month ago

      @@letsgomedia9631 Nice bald assertion. If you want to prove something to another human give it to them in the form of A+B=C. You only gave the C and not the A+B.
      It's like you're telling us there's fruit on your table, so naturally we're going to ask you how you know this. Tell us like this, "There's 1 apple and 2 oranges on the table, and here let me show you, look, etc". Now we'll have reasonably good evidence to accept your assertion about there being fruit on the table.
      Go.

  • @tycho25
    @tycho25 1 month ago

    This is undoubtedly revolutionary.

  • @tarumath319
    @tarumath319 1 month ago +8

    The chain-of-thought people were really onto something, huh.

    • @o1-preview
      @o1-preview 1 month ago

      hm.. I mean, the chain of thought was a paper 2 years ago, on the "step by step" paper, this is something different

  • @brycebyte
    @brycebyte 1 month ago +1

    @twominutepapers in the graph at 3:57 does that still mean the new version is wrong 1/5 times?

    • @TheGhost0312
      @TheGhost0312 5 days ago

      I suppose it does, crazy to think an expert gets it wrong 30% of the time as well

  • @burger6178
    @burger6178 1 month ago +5

    Oh that's cool!

  • @imjody
    @imjody 1 month ago

    Beautiful! :) The only thing that sucks is "Light Mode" instead of "Dark Mode." Super blinding!

  • @test-uy4vc
    @test-uy4vc 1 month ago +22

    What a GPT time to be supercharged alive! 🎉

  • @boaz9798
    @boaz9798 1 month ago +5

    The thumbnail is not a correct mirror image

  • @Azariy0
    @Azariy0 1 month ago

    YESSSS!!! FINALLY! I remember in 2022 all the AIs I talked to were really bad at answering questions, and I knew it was because they didn't have memory of the conversation. I wanted scientists to develop an AI with memories. And that happened! GPT 3 was a revolution in AI. But when I talked to it, it didn't seem that smart to me. Sure, it was pretty good at literature, but it couldn't play a game of chess at all. I knew then what all the AIs were missing, and that was logical thought. I waited for scientists to develop a model which can use logic. Now it happened, and it's going to be as big a revolution as the addition of memories! Two more papers down the line, AI will be making scientific discoveries!

  • @andrewwhelan7144
    @andrewwhelan7144 1 month ago

    What a time to be alive!

  • @desolat3264
    @desolat3264 1 month ago +6

    This is LIT 🔥

  • @xabab
    @xabab 1 month ago

    Tbh, the Chain-of-Thought (or CoT) technique has been part of homebrew RP chatbots since, like, forever. It's quite strange that OpenAI adopted it so late.

  • @CoolestDawg
    @CoolestDawg 1 month ago

    I memorized 600 words in Japanese but never used them. I gave GPT-4 a task to restrict its vocabulary to those 600 words and speak to me. It would use 3 to 4 of the words, but the majority would be totally different. But o1 was insane: it actually restricted its vocabulary to the 600 about 90% of the time. Damn, I loved it.

  • @Maxjoker98
    @Maxjoker98 1 month ago +2

    I find it more and more ironic that this company is called OpenAI. This new model doesn't even give you access to the "chain of thought" for "security" purposes. Kewl.
    That being said, it does seem like a noticeable improvement.

  • @charksey
    @charksey 1 month ago

    This is it. This is a Learning Machine.

  • @GethOverlord
    @GethOverlord 1 month ago

    I don't think there's an official paper associated with it, but could you make a video going over the differences between a KAN and MLP model? And maybe what you think about KAN for future models?

  • @BenWex
    @BenWex 1 month ago

    the "three r's in strawberry" thing comes from a meme about ChatGPT, which has historically struggled to answer that simple question

  • @jscarbs21
    @jscarbs21 1 month ago

    2 hrs and 13k views! everyone is holding their papers in anticipation of this one

  • @IBMboy
    @IBMboy 1 month ago +3

    Hold onto what paper this time? Is OpenAI even bothering publishing a paper this time? 🤨

    • @o1-preview
      @o1-preview 1 month ago

      nope, too much competition, might as well publish it in a few years

  • @Eckster
    @Eckster 1 month ago +1

    I dropped the small strawberry in a cup example into current ChatGPT and it solved it just fine.

  • @skye-sys
    @skye-sys 1 month ago

    Imagine asking it to answer without thinking then it thinks about not thinking

  • @James-v8k8n
    @James-v8k8n 1 month ago +4

    Isn’t the solution for the problem at 5:07 multiples of 4 and 3 instead of 8 and 6? For example, take the princess as 12 and the prince as 9. The problem: [the princess is as old as the prince will be, when the age of the princess is twice the age of the prince when the princess's age was half the present age of princess and prince combined]. So in the past section of the problem, the princess’s age = (9 + 12) / 2 = 10.5. The age difference is 3 years, so the prince's age in the past section of the problem is 10.5 - 3 = 7.5. Next, the age of the princess in the future part of the problem is twice the age of the prince in the past, so it equals 7.5 x 2 = 15, and the prince's age is 15 - 3 = 12. And finally, the problem states that the age of the princess in the present is equal to the age of the prince in the future, so it equals 12, which is correct. You can't get that solution with the multiples of 8 and 6. If o1 hadn’t limited the solutions to positive integers only, the answer would be correct, because the ratio is the same. Limiting answers to integers was not required by the problem, and hence is also a mistake, because the solution works with any positive number, like if the princess was 6 and the prince 4.5. Cool tech anyways, and it's a bummer that in 5 years O7 is going to O7 me by making me a guinea pig, dissecting my skull and inserting electrical rods.

    • @Tc0590
      @Tc0590 1 month ago

      in a box!

    • @leslietetteh7292
      @leslietetteh7292 1 month ago

      Yeah same, I thought that was weird. I got princess == 4/3 prince age, and because it was a continuous function, just plugged in some numbers to check if my answer was correct. I think if it did that, it would simplify the ages, and the fact that it didn't simplify the solution suggests there is still something missing. That was a doozy just to formulate though (as someone lying in bed) so it is certainly impressive that it was able to do so and solve it.

    • @joelcoll4034
      @joelcoll4034 1 month ago

      Yeah, or the princess is 40 minutes old and the prince is 30 minutes old if they are twins
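
The arithmetic in this thread can be checked mechanically. The sketch below encodes the puzzle as it is paraphrased in the top comment (that reading is an assumption, since the exact wording from the video is not reproduced here), and the function name `satisfies_puzzle` is made up for illustration. It uses exact fractions so that non-integer ages such as 4.5 are handled without rounding.

```python
from fractions import Fraction


def satisfies_puzzle(princess, prince) -> bool:
    """Check a pair of present ages against the puzzle as paraphrased above."""
    princess, prince = Fraction(princess), Fraction(prince)
    gap = princess - prince  # the age difference never changes

    # Past: the princess was half of the two present ages combined.
    prince_past = (princess + prince) / 2 - gap

    # Future: the princess is twice the prince's age at that past moment.
    princess_future = 2 * prince_past
    prince_future = princess_future - gap

    # Condition: the princess is now as old as the prince will be then.
    return princess == prince_future


for ages in [(12, 9), (8, 6), (6, Fraction(9, 2)), (10, 9)]:
    print(ages, satisfies_puzzle(*ages))
```

Working the same condition through by hand gives 3 × princess = 4 × prince, so any pair in a 4:3 ratio passes, whether or not the ages are integers, which matches what the commenters found.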

  • @starzilla2975
    @starzilla2975 1 month ago +1

    Guys! I wonder how well it would do with those $1 million math questions!

  • @jantube358
    @jantube358 1 month ago +1

    Sounds great. So can it work as a legal advisor for families now? And can it write better job applications? 🤔

  • @BlakeEM
    @BlakeEM 1 month ago +4

    It failed my own unique physics cup problem, not the same common one that was used in this video.
    I just tried it on a complex outstanding React.js coding issue, and it failed the same as all other models. It updated lots of code and tried a lot of things, but I had the same issues. When I provide more context, it's able to understand that context and break it down, but still was ultimately unable to solve it without me doing all the heavy lifting and deduction. It doesn't understand how a browser works from a human perspective, and this is where it's limited, same as with physics. It's the same old models underneath, but now use a calculator and run python code to verify things.

    • @clonkex
      @clonkex 1 month ago

      As a programmer, this is good to know 😝 I wasn't in the least bit worried about AI replacing programmers while it was just LLMs, because LLMs are stupid (as in, not intelligent). When I saw this create a snake game in one prompt I became pretty concerned. You've eased that concern a little, although I think it's time I started learning to do AI programming 😁

    • @dustinwehr2433
      @dustinwehr2433 1 month ago +1

      "It's the same old models underneath, but now use a calculator and run python code to verify things." is definitely wrong. gpt4 has been able to use tools like a calculator and python interpreter for a long time.
      OpenAI probably did something similar to (but improving on) Quiet-STaR (not to be confused with the mythical Q*).

  • @davidwallmann83
    @davidwallmann83 1 month ago

    Am I the only one not so impressed by a simple loop after prompting that prompts itself lol

  • @OniNaito
    @OniNaito 1 month ago

    Do you have any videos on AI Safety? I find the stop button problem for instance to be very fascinating!

  • @metakron
    @metakron 1 month ago

    Chain of thought, grokking, Deep reinforcement learning, real time diffusion, oh my god where are we going

  • @dhruv-v8w
    @dhruv-v8w 1 month ago

    How good is it?

  • @AntonioAponte00
    @AntonioAponte00 1 month ago +2

    In a nutshell, the more you know the less you reason, the more you reason the less you need to know? I can remember some individuals who are exactly like that. Maybe we are stepping into a hard rule for intelligence

    • @splashmaker2
      @splashmaker2 1 month ago +1

      I have followed this philosophy for coding. I used to spend time memorizing APIs, but as I have advanced in my career I do not do this anymore. Instead I just read and reason about it! I will still remember a lot of specs (obviously very familiar with ones I wrote), but I don’t actively try to memorize everything like I thought was the goal when I was a beginner.

  • @ZazeLove
    @ZazeLove 1 month ago

    I sent a screenshot of what I did with it, as well as a link to the chat, on your twitter.

  • @darabat207
    @darabat207 1 month ago

    My first tests with it are in the advanced coding field and it didn't succeed, but I expect it to perform better there than previous models.

  • @urthogie
    @urthogie 1 month ago

    2 minute papers but it's 7 minutes! Jk thank you

  • @trader548
    @trader548 1 month ago

    o1-preview needs to work with GPT-4o to check facts and knowledge. Winning team together.

  • @PichanPerkele
    @PichanPerkele 1 month ago

    Of course I had to first ask it to solve the 10 man seesaw weighing problem. Didn't look into it too deeply but the solution seemed to make sense

  • @private_citizen
    @private_citizen 1 month ago

    I have a logic test involving pattern recognition that I've been using on arena. Only two models ever got the answer right, and only after I gave them multiple attempts and additional clues to guide them towards the answer. o1 got the answer on the first try without needing the additional clues.
    The only "downside" I noticed is that o1 is overkill for simple questions. When trying to make small talk with o1, it goes deep down its thinking rabbit holes, which feels unnecessary.

  • @WayOfTheZombie
    @WayOfTheZombie 1 month ago

    AI is trolling us with the whole strawberrry thing

  • @ahmadzaimhilmi
    @ahmadzaimhilmi 1 month ago +1

    I see gpt-o1 could be potentially good for planning steps in crewai

    • @sourmans
      @sourmans 1 month ago

      Hey. How are you using crewai? Why do you use it instead of langchain?

    • @ahmadzaimhilmi
      @ahmadzaimhilmi 1 month ago

      @@sourmans very modular and low code approach to creating agents.

    • @sourmans
      @sourmans 1 month ago

      @@ahmadzaimhilmi Have you also tried gumloop? So many offerings, it is confusing

  • @shadix365
    @shadix365 1 month ago

    Oh dear if this is this good this might end up being a cybersecurity problem.

  • @HappyHater
    @HappyHater 1 month ago +11

    What a time to be alive. The question is just for how much longer we will be alive, if we construct AI with reasoning abilities without proper alignment and control mechanisms.

    • @Djellowman
      @Djellowman 1 month ago +4

      Paranoid yelling at the sky, are we

    • @HappyHater
      @HappyHater 1 month ago +2

      @@Djellowman Are you? I am definitely not, perhaps you should start getting acquainted with the control problem / alignment problem regarding AGI / ASI.

    • @jayjadotte1683
      @jayjadotte1683 1 month ago +3

      You are talking about a problem that is way down the line. It’s like you are worried about getting to the moon when we just built the wheel.

    • @phen-themoogle7651
      @phen-themoogle7651 1 month ago +1

      They actually have decent alignment on this model. Look through the documentation on their website.
      They also have the government monitoring them and had to go through a lot to release this.
      Although, even if OpenAI has safer reasoning models, China or somewhere else could come up with something potentially deadly. And in general the US government has the strongest AI, since they are working closely with OpenAI nowadays. What we see in public could’ve been achieved years ago. They mentioned Q* a while ago, so we are in less danger than most people think if we know of it or can use it. I used to think it was a 50% extinction chance for us. Now I’m thinking more like a 15% chance of extinction or less. Either way, everyone dies at some point in their life; it would be our fate and inevitable if it did happen. I’m really looking forward to what happens, whether utopian society, dystopian, or extinction❤

    • @somdudewillson
      @somdudewillson Місяць тому

      These models do not possess inherent continuity of thought and self-determination of goals. The only _real_ hazards are still just human bad actors, as always.

  • @alejandroheredia8882
    @alejandroheredia8882 Місяць тому

    I am aware that OpenAI's o1 or "strawberry" model works via fractalized semantic expansion and logic particle recomposition/real-time expert system creation and offloading of the logic particles.
    Do with that information as you please.

  • @randigo9992
    @randigo9992 Місяць тому

    What AI should've learned first is the logic of our world, like math and physics. It should've been trained on that first and then on words. It also needs to learn to self-improve.

  • @Veylon
    @Veylon Місяць тому

    I'll have to try it out when it hits the API.

  • @calvingrondahl1011
    @calvingrondahl1011 Місяць тому +1

    I want AI to succeed as a partner for students and researchers.

  • @CuriousMike16
    @CuriousMike16 Місяць тому

    A great visual test would be how well they perform in the 4 Pics 1 Word game. When you use the image feature, all of the other models tend to fail at that game.

  • @forleveclover
    @forleveclover Місяць тому

    I wonder what would happen if they were to have 4o and o1 talk to each other to generate a response. Could they pick up the slack on the other's shortcomings?

  • @popynick
    @popynick Місяць тому

    Hey DR! It's always fun to try to make it play the game "4=10", where you get 4 numbers from 1 to 9 and the operators + - x / ( and ), and, using every number exactly once and each operator at most once, you need to build an expression that equals 10. o1-preview still can't find a solution for this level: 8, 8, 9, 8. I can't find the answer for the love of me either. XD

    • @zachvandyke2556
      @zachvandyke2556 Місяць тому +1

      98 - 88 = 10?

    • @popynick
      @popynick Місяць тому

      @@zachvandyke2556 :)))) ChatGPT also gave this solution, but the game doesn't allow creating multi-digit numbers; you have to use the digits as they are (I forgot to mention that here). I tried for an hour and ChatGPT eventually got fed up with me. It wasn't even trying anymore :)))
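
The rules described in this thread (every digit used exactly once, no gluing digits into multi-digit numbers, each operator used at most once, parentheses allowed) are small enough to brute-force. The sketch below is only an illustration: the `solve` function and its parameters are made up here, and it assumes the digits may be reordered. Under that assumption one valid expression for 8, 8, 9, 8 is (9 x 8 + 8) / 8 = 10.

```python
from fractions import Fraction
from itertools import permutations

# Binary operators allowed by the puzzle; division by zero yields None.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b if b != 0 else None,
}


def solve(numbers, target=10, unique_ops=True):
    """Return one fully parenthesised expression equal to target, or None.

    Hypothetical helper: digits may be reordered (an assumption), are never
    concatenated, and each operator is used at most once when unique_ops is True.
    """
    start = [(Fraction(n), str(n)) for n in numbers]
    return _search(start, frozenset(), Fraction(target), unique_ops)


def _search(items, used, target, unique_ops):
    # One value left: the expression is complete, so compare it exactly.
    if len(items) == 1:
        value, expr = items[0]
        return expr if value == target else None
    # Combine every ordered pair of remaining values with every allowed operator.
    for i, j in permutations(range(len(items)), 2):
        (a, expr_a), (b, expr_b) = items[i], items[j]
        rest = [items[k] for k in range(len(items)) if k not in (i, j)]
        for symbol, apply in OPS.items():
            if unique_ops and symbol in used:
                continue
            value = apply(a, b)
            if value is None:
                continue
            combined = (value, f"({expr_a} {symbol} {expr_b})")
            found = _search(rest + [combined], used | {symbol}, target, unique_ops)
            if found is not None:
                return found
    return None


if __name__ == "__main__":
    print(solve([8, 8, 9, 8]))
```

Exact `Fraction` arithmetic avoids floating-point misses on division. Passing `unique_ops=False` relaxes the one-use-per-operator rule, in case the game's actual rule differs from the description above.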

  • @adarshpanigrahi4219
    @adarshpanigrahi4219 Місяць тому +4

    Einstein in a box

  • @developit1152
    @developit1152 Місяць тому

    This is a Game Changer 🤯🤯🤯

    • @developit1152
      @developit1152 Місяць тому

      @@kevind6425 Coding is getting hugely automated
      Gold in IOI is no joke
      Huge Improvements in Math and physics
      So many scientific breakthroughs and developments coming soon

  • @tuseroni6085
    @tuseroni6085 Місяць тому

    It did better, I think, at calculating the number of moles of electrons in a conductor 1 inch long, 0.5 inches wide, and 0.1 inches thick with a charge of 1 coulomb (1.0365×10^−5 moles). At the very least, its answer differed from what 4o gave me (1.27 moles), though I struggle to say which is right. When 4o ran the math from a different angle it got a different result, so I think 4o was probably wrong. o1 also caught the error in my question: I asked for n in the equation Q=nALq in moles, when n is a density. It seemed to work out that, to get the number of electrons in moles, it just needed to take Q/q.
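
The final step described in this comment (take Q/q to get the electron count, then convert to moles) can be checked in a few lines. This is a rough sketch that assumes the 1 coulomb is carried entirely by electrons; the function name is made up for illustration, and the constants are the exact SI-defined values. The conductor's dimensions drop out of this particular calculation.

```python
# Exact SI-defined constants (2019 redefinition).
ELEMENTARY_CHARGE_C = 1.602176634e-19  # magnitude of one electron's charge, in coulombs
AVOGADRO_PER_MOL = 6.02214076e23       # particles per mole


def moles_of_electrons(total_charge_c: float) -> float:
    """Moles of electrons that together carry total_charge_c coulombs (hypothetical helper)."""
    electron_count = total_charge_c / ELEMENTARY_CHARGE_C  # this is Q / q
    return electron_count / AVOGADRO_PER_MOL


if __name__ == "__main__":
    print(f"{moles_of_electrons(1.0):.4e} mol")
```

For Q = 1 C this gives roughly 1.0364×10^−5 mol, which is within rounding of the o1 figure quoted above.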

  • @InforJaysonRagasa
    @InforJaysonRagasa Місяць тому

    Can o1 finally work out how many occurrences of the letters L and I there are in "Philippines"? GPT-4 can't do it and keeps saying there are 2 Ls.. 🙃🙃
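
For reference, the letter counts asked about here are easy to verify outside the model. A minimal sketch, with a made-up helper name, that counts case-insensitively:

```python
def count_letter(word: str, letter: str) -> int:
    """Case-insensitive count of a single letter in a word (illustrative helper)."""
    return word.lower().count(letter.lower())


if __name__ == "__main__":
    for letter in ("L", "I"):
        print(letter, count_letter("Philippines", letter))
```

"Philippines" contains one L and three Is, so an answer of two Ls is wrong.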

  • @SP-ny1fk
    @SP-ny1fk Місяць тому +1

    Can OpenAI make it affordable?

  • @PrimeToolbox
    @PrimeToolbox Місяць тому

    This newer model should be called Neocortex

  • @chrisalmighty
    @chrisalmighty Місяць тому

    What is the smallest integer whose square is between 15 and 30? It fails. Hint: the answer is -5. You can try it yourself.

    • @leslietetteh7292
      @leslietetteh7292 Місяць тому

      He said it's good at reasoning, but not so good at recall. That's a trick question designed to trip it up. It preys on its weakness in recall, because it needs to remember the strict definition of an integer. Not a pure logic question, poor show.
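
The -5 answer from the question above can be confirmed by enumeration. The sketch below assumes "between" means strictly between; the function name is invented for illustration.

```python
import math


def smallest_integer_with_square_between(low: int, high: int) -> int:
    """Smallest integer n (negatives included) with low < n*n < high.

    Raises ValueError if no integer qualifies.
    """
    bound = math.isqrt(high) + 1  # no integer beyond this bound can have a square below high
    candidates = [n for n in range(-bound, bound + 1) if low < n * n < high]
    return min(candidates)


if __name__ == "__main__":
    print(smallest_integer_with_square_between(15, 30))
```

The qualifying integers are -5, -4, 4 and 5, so the minimum is -5. Answering 4 means only the positive integers were considered.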

  • @geisty
    @geisty Місяць тому

    Isn't the energy cost over 10x from GPT4 though? Doesn't seem like the upgrade I was hoping for

    • @lkrnpk
      @lkrnpk Місяць тому

      This is not the next huge model but a new approach. We still have to wait for the huge new model.

  • @joelkaben
    @joelkaben Місяць тому

    The new o1 is definitely better at coding than 4o, even though 4o is also pretty good.

  • @superfliping
    @superfliping Місяць тому

    If you prompt it correctly, the new AI will create its own natural language and connect it with any other language known in the computer industry, attaching all insights to this knowledge in its own natural-language database. No chain of thought was necessary.

  • @StephenRansom47
    @StephenRansom47 Місяць тому

    😅 it’s all moving so fast … but it looks like I’ll be able to have a Novice Scribe 📜 before I leave this world.
    - so many ideas floating around in my head, so little time to even jot them all down - forget about working on any of them. 😌
    We Will Be Made BETTER By This Technology. 🤞

  • @igiveupfine
    @igiveupfine Місяць тому

    i can't wait to find out how much they'll charge for this.

  • @SickoYoda
    @SickoYoda Місяць тому

    Feel the AGI

  • @ХотимМира
    @ХотимМира Місяць тому +2

    How do I set up this o1 version? How do I use it? Is there a tutorial?

    • @zyeborm
      @zyeborm Місяць тому +1

      In the premium plan, pick o1 the same way you pick 4o, 4, 3.5, etc.

    • @ХотимМира
      @ХотимМира Місяць тому

      @@zyeborm Oh okay, I thought we could do it in the free version )

  • @taukakao
    @taukakao Місяць тому

    Ok, this is terrifying.
    I mean they might just cherry pick examples here in which case this is not a concern. But if what they show is true and ChatGPT can actually "think" in coherent steps and reflect on itself then this might actually be the beginning of Artificial General Intelligence. (Which would mean that humanity is redundant within the next decades)
    I'm not fully convinced that this is the case though.

  • @GiewsBueno
    @GiewsBueno Місяць тому

    Can enterprise teams use o1?

  • @elmaxlife
    @elmaxlife Місяць тому

    So I asked o1 for the probability of a nuclear war by 2050, and it said 39%
    nice

  • @HoD999x
    @HoD999x Місяць тому +1

    release the opus, too :D

  • @Dina_tankar_mina_ord
    @Dina_tankar_mina_ord Місяць тому

    The ultimate prompt to create an AI system that leads humanity towards a peaceful, balanced, and evolved global society, where well-being, harmony, and ethical growth are prioritized across all aspects of life.
    Importance of the Goal:
    Achieving this goal is crucial because it addresses many of the core challenges facing humanity, including ideological conflicts, environmental sustainability, and global well-being. The AI, by harmonizing different worldviews, fostering peaceful consensus, and ensuring full transparency, will help humanity overcome divisions, evolve ethically, and build a sustainable and peaceful future for both humans and nature.
    The first prompt starts like this:
    Design an AI-agent conductor that continuously learns and analyzes global data to promote human and ecological well-being, balance empathy with free will, peacefully foster ideological consensus, reveal hidden barriers to human potential, ensure transparency, and evolve ethically, guiding humanity toward a harmonious and sustainable future.
    Love is the new credit

  • @KieranShort
    @KieranShort Місяць тому +1

    Why can't both 4o and o1 be combined?

    • @GregN88
      @GregN88 Місяць тому +1

      I read OpenAI was trying, but it's very hard.

    • @a_soulspark
      @a_soulspark Місяць тому +1

      no one said they can't. this is their first reasoner, after all!

  • @vgames1543
    @vgames1543 Місяць тому +1

    I asked it about the existence of God and it made an appointment for me.