"...or someone who is mega smart and just started learning the basics but can derive the whole theory, of pretty much, any topic" Wow. You didn't have to give me a shout out like that.
Tried the new model. I gave it instruction to write paragraph on some set of constraints. with slower thinking, it actually results in better response. What a time to be alive!
Oooooh. So about a week ago I got the usual response1/response2 “which is better” when using GPT 4o, except this time there was a disclaimer saying “you are testing a new model which takes more time to think, please be patient while the response loads” and I guess this was the new model they were alpha testing to premium users? Pretty cool if you ask me And yes, I did notice one of the results was significantly better so I assume it was this o1 model, and the other was standard 4o response. What a time to be alive!
I noticed this too, but a few weeks ago. I didn't get the message about testing a new model, but it felt like I was using gpt-5 or something like that. Really cool to know people were getting messages about it.
Tried today with coding a Unreal engine 5 plugin. This version is finally able to grab the context and maintain the focus without just repeating the same mistakes.🎉
Are the limits for 30 prompts a week still there? Just curious. As I was using it I didn’t notice any limits, but they won’t mention them until you hit them?
Nope, this is all hype at best. This is only a tree of hidden prompts, You can probably get the same quality of result using vanilla gpt 4o by iterating your prompts manually this just do it automatically. Also I bet this will consume a magnitude more compute, more like a brute force approach to a prompt not efficient at all. I doubt they can scale this one without charging ridiculous amount. Just imagine paying 20 bucks for a single prompt not knowing if the llm will achieve your goal or fail or need to add more money.
@@GH-uo9fy You're spot on. Getting access to the API reveals the substantial increase in cost. Not just because it's new, but because when you actually run it the number of tokens it eats through skyrockets. Some problems costing as much as $10.
we're getting close to AGI -- if this thing can 'reason', by giving it access and telling it to improve its own weaknesses til it's a well-rounded conscious entity, theoretically it will
Tried it yesterday. It solved a tricky probability question all of the previous versions got wrong consistently, even after multiple hints. It needed one clarification this time, but then got it right.
I was playing around with this and was able to generate a clone of Atari's classic "Centipede" via HTML and Javascript in a single prompt. That said, I did need to run through about 30 more iterations to get it where I was satisfied with the gameplay, the initial version was functional, but somewhat one-dimensional in terms of gameplay. I just used Emojis instead of generating any sprite sheets (works well enough for a game like this), and used Suno to generate a 4-bit synth audio loop for background music. Overall, I could have written this game by hand in about 2 hours. The time spent with ChatGPT was actually about 3 hours, lol. But, ChatGPT allowed me to go through many more iterations and experiment with features much more quickly, without compromising on the quality. So for the time spent, I think I came out with a much more polished game than if I had manually coded it in the same amount of time.
I would also add, if Using ChatGPT was a natural part of my coding workflow, it would likely be much, much faster. Most of my time was spent tinkering and pushing some very difficult prompts on it. If I had a planned project instead of something impromptu, I could have phrased my prompts much better to cut the time dramatically. And o1-Preview doesn't take files as input yet, so that was also another bottleneck that I imagine will be resolved soon. I look forward to trying some other projects, but I have exceeded my weekly allowance, lol.
@@codejunki567 Oh, you missed it. Sorry about that. In the OP's comment, they lol'd about the fact that they could've written the code in 2 hours for which it took 3 hours with AI. I was merely pointing out that to be able to write the code in 2 hours they would have had to spend time, potentially years, to be able to code it in two hours. I hope that clarifies it. Is there a problem?
I asked it to make a simple ascii roguelike game in C for the terminal, where the entity data is kept in JSON files. It seriously even wrote Python code to generate a C file from the JSON data. Compiled with no errors, first try. Graphics and keyboard input both work. Updating the code to add enemies, loot and treasure introduced a bug. Still, I can't believe how good this model is.
I asked o1 how many "R's" were in "strawberry," and it answered correctly (3). I then asked it where the R's were, and it told me there was one in the fourth position 😭 It's good, but certainly not perfect yet!
If you prompted "where are the Rs?" then it maybe thought you meant to ask where the Rs are in the prompt you just sent it, and in this example it would be correct in saying the 4th letter is an R 😂
Reasoning is nice. Tried to get it to code some small C++ class (o1-preview)... Felt more like mentoring a highschool student 😀 After three feedback replies about errors and giving hints it came with something (probably) working (and without doing computational costly workarounds). Liked the reasoning & explanation, nice commentary (sometimes cheeky...like setting given input parameters to zero to "simplify the problem"), and feels better than 4o, but not phd student niveau 😝
@@BritishBungler If you think that this is making someone obsolete, perhaps you will be obsolete very soon. Better learn how to use AI before AI learn how to use you!
@@BritishBunglerhow so? Who can validate that the AI is doing its work correctly other than subject matter experts? Other AI can’t be used since it runs into the same problem, who is reviewing their work etc
The potential of o1 to excel in fields like genetics and quantum physics is truly groundbreaking. The future of AI-assisted research looks incredibly promising!
I’m not very book smart nor do I have any degrees in computer science or anything similar but I love learning about new breakthroughs in artificial intelligence and science as a whole aswell as physics engines and how they try to emulate the world. I must say seeing all the positivity and brotherhood in this comment section really is a thing to behold, what a time to be alive! 😂❤
@@user-sl6gn1ss8p He has a stake in this field of research. Not to mention, he's also likely just more enthusiastic than others. Overall, look at the ideas the creator is trying to communicate than to see it emotionally.
@@Slugma-kx7pv by delivery I didn't just mean the emotion, I"m fine with that, but the actual content is a bit too much on the hype side for my taste. Downside are brushed away and then it's back to calling the thing "Einstein in a box". Compare it to the video made by AI Explained for example, and it is night and day. I'm not a hater by the way, or anything like that. I think the channel is great and his courses on computer graphics are amazing - that's the reason I come back and give it another try now and then, but the coverage on AI is in general just not what I'm looking for, you know?
that decyphering chain of thought is amazing -- its not so much an ordered logical sequence of thought that nicely follows from one point to the next, but more like what a human stream of consciousness might look like, with all kinds of false starts and dead ends, things not thought of and things not thought through properly and only remembered later etc. I don't unusally like vids on commercial products with no paper to give us insight on what's going on, but this one is def worth highlighting, thx!
30 a week for premium is kinda limiting. Like i don't want to use it in case i need to use it. Yes i do finish games with a giant inventory of unused potions and boost items why do you ask?
@@OpreanMircea The singularity is when the AIs self improve completely autonomously. All of these advancements are the result of significant human effort (possibly with AI assistance at this point, but still with significant human input). The singularity will probably happen in most of our lifetimes, but we aren't there yet.
@@AustinThomasPhD that's not the definition, it's the "point of inflection", where the curve measuring something (in this case AI development) stops going one way (slowing down) and it starts accelerating, is it using the AI or people to do that? It doesn't matter, line go up faster and faster
You say "Einstein in a box" I say "Zwei-stein in many boxes" :) Imagine having several of these o1 agents working together, or being mixed in adversarial and cooperative modes when working together towards complex solutions. fyi the joke's context, Ein =1 and Zwei = 2 in German ;)
o1 is fun. token limit is sad tho. such a pleasant model to converse with. such wonderful adaptability. i talked about my concern around token count and we worked out a strategy to maximize our interaction time as elegantly as possible. i found instructions were quickly lost in previous versions. so far o1 (preview) is amazing for maintaining context. depth of reasoning is butter. need a longer token count to test properly tho. most impressed and excited i've been since gpt4 dropped
@@letsgomedia9631 Nice bald assertion. If you want to prove something to another human give it to them in the form of A+B=C. You only gave the C and not the A+B. It's like you're telling us there's fruit on your table, so naturally we're going to ask you how you know this. Tell us like this, "There's 1 apple and 2 oranges on the table, and here let me show you, look, etc". Now we'll have reasonably good evidence to accept your assertion about there being fruit on the table. Go.
YESSSS!!! FINALLY! I remember in 2022 all the AIs I talked to were really bad at answering questions, and I knew it was because they didn't have memory of the conversation. I wanted scientists to develop an AI with memories. And that happened! GPT-3 was a revolution in AI. But when I talked to it, it didn't seem that smart to me. Sure, it was pretty good at literature, but it couldn't play a game of chess at all. I knew then what all the AIs were missing, and that was logical thought. I waited for scientists to develop a model which could use logic. Now it has happened, and it's going to be as big a revolution as the addition of memories! Two more papers down the line, AI will be making scientific discoveries!
I memorized 600 words in Japanese but never used them. I gave GPT-4 the task of restricting its vocabulary to those 600 words and speaking to me. It would use 3 to 4 of them, but the majority of the words would be totally different. But with o1 it was insane: it actually restricted its vocabulary to those 600 words about 90% of the time. Damn, I loved it
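A compliance check like the one described, measuring how often a reply stays inside an allowed word list, is easy to script. This is only a rough sketch: the word list and sentence below are made-up examples, and raw Japanese text would need a proper segmenter such as MeCab rather than a whitespace split.

```python
def vocab_compliance(text, allowed_words):
    """Return the fraction of whitespace-separated words in `text`
    that appear in the allowed vocabulary set."""
    words = text.split()
    if not words:
        return 0.0
    in_vocab = sum(1 for w in words if w in allowed_words)
    return in_vocab / len(words)

# Hypothetical romaji example: 2 of the 5 words are in the allowed set.
allowed = {"neko", "inu", "taberu", "mizu"}
print(vocab_compliance("neko wa mizu o nomu", allowed))  # 0.4
```

Run over a batch of model replies, the average of this score approximates the "90% of the time" figure the commenter reports.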
I find it more and more ironic that this company is called OpenAI. This new model doesn't even give you access to the "chain of thought" for "security" purposes. Kewl. That being said, it does seem like a noticeable improvement.
I don't think there's an official paper associated with it, but could you make a video going over the differences between a KAN and MLP model? And maybe what you think about KAN for future models?
Isn’t the solution for the problem at 5:07 multiples of 4 and 3 instead of 8 and 6? For example, if the princess is 12 and the prince is 9. The problem: [the princess is as old as the prince will be when the age of the princess is twice the age of the prince when the princess's age was half the present age of princess and prince combined].

So in the past section of the problem, the princess's age = (9 + 12) / 2 = 10.5; the age difference is 3 years, so the prince's age in the past section is 10.5 - 3 = 7.5. Next, the age of the princess in the future part of the problem is twice the age of the prince in the past, so it equals 7.5 x 2 = 15, and the prince's age is 15 - 3 = 12. And finally, the problem states that the age of the princess in the present is equal to the age of the prince in the future, so it equals 12, which is correct. You can't get that solution with the multiples of 8 and 6.

If o1 hadn't limited the solutions to positive integers, the answer would be correct, because the ratio is the same. Limiting answers to integers was not required by the problem, and hence also a mistake, because the solution works with any positive number, like if the princess was 6 and the prince 4.5. Cool tech anyway, and it's a bummer that in 5 years O7 is going to O7 me by making me a guinea pig, dissecting my skull and inserting electrical rods.
Yeah same, I thought that was weird. I got princess == 4/3 prince age, and because it was a continuous function, just plugged in some numbers to check if my answer was correct. I think if it did that, it would simplify the ages, and the fact that it didn't simplify the solution suggests there is still something missing. That was a doozy just to formulate though (as someone lying in bed) so it is certainly impressive that it was able to do so and solve it.
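For what it's worth, the two commenters' reading of the riddle checks out algebraically. A sketch, with $x$ the princess's present age, $y$ the prince's, and $d = x - y$ the constant age gap:

```latex
\begin{align*}
\text{past moment: } & \text{princess} = \tfrac{x+y}{2}, \qquad
  \text{prince} = \tfrac{x+y}{2} - d \\
\text{future moment: } & \text{princess} = 2\left(\tfrac{x+y}{2} - d\right)
  = x + y - 2(x - y) = 3y - x, \\
& \text{prince} = (3y - x) - d = 4y - 2x \\
\text{condition: } & x = 4y - 2x
  \;\Longrightarrow\; 3x = 4y
  \;\Longrightarrow\; x : y = 4 : 3
\end{align*}
```

Every pair in the ratio $4:3$ satisfies the riddle (12 and 9, 8 and 6, even 6 and 4.5), so "8 and 6" is one integer instance rather than the full solution set, consistent with both comments above.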
It failed my own unique physics cup problem, not the same common one that was used in this video. I just tried it on a complex outstanding React.js coding issue, and it failed the same as all other models. It updated lots of code and tried a lot of things, but I had the same issues. When I provide more context, it's able to understand that context and break it down, but still was ultimately unable to solve it without me doing all the heavy lifting and deduction. It doesn't understand how a browser works from a human perspective, and this is where it's limited, same as with physics. It's the same old models underneath, but now use a calculator and run python code to verify things.
As a programmer, this is good to know 😝 I wasn't in the least bit worried about AI replacing programmers while it was just LLMs, because LLMs are stupid (as in, not intelligent). When I saw this create a snake game in one prompt I became pretty concerned. You've eased that concern a little, although I think it's time I started learning to do AI programming 😁
"It's the same old models underneath, but now use a calculator and run python code to verify things." is definitely wrong. gpt4 has been able to use tools like a calculator and python interpreter for a long time. OpenAI probably did something similar to (but improving on) Quiet-STaR (not to be confused with the mythical Q*).
In a nutshell, the more you know the less you reason, the more you reason the less you need to know? I can remember some individuals who are exactly like that. Maybe we are stepping into a hard rule for intelligence
I have followed this philosophy for coding. I used to spend time memorizing APIs, but as I have advanced in my career I do not do this anymore. Instead I just read and reason about it! I will still remember a lot of specs (obviously very familiar with ones I wrote), but I don’t actively try to memorize everything like I thought was the goal when I was a beginner.
I have a logic test involving pattern recognition I've been using on arena. Only two models ever got the answer right, and only after I gave them multiple attempts and additional clues to guide them towards the answer. o1 got the answer on the first try without needing the additional clues. The only "downside" I noticed is that o1 is overkill for simple questions. When trying to small talk with o1, it goes deep down its thinking rabbit holes, which feels unnecessary.
What a time to be alive. The question is just for how much longer we will be alive, if we construct AI with reasoning abilities without proper alignment and control mechanisms.
@@Djellowman Are you? I am definitely not, perhaps you should start getting acquainted with the control problem / alignment problem regarding AGI / ASI.
They actually have decent alignment on this model. Look through the documentation on their website. They also have the government monitoring them and had to go through a lot to release this. Although even if OpenAI has safer reasoning models, China or somewhere else could come up with something potentially deadly. And in general the US government has the strongest AI, since they are working closely with OpenAI nowadays; what we see in public could've been achieved years ago. They mentioned Q* a while ago, so we are in less danger than most people think, if we know of it or can use it. I used to think it was a 50% extinction chance for us. Now I'm more or less thinking a 15% chance of extinction or less. Either way, everyone dies at some point in their life; it would be our fate and inevitable if it did happen. I'm really looking forward to what happens, whether utopian society, dystopian, or extinction❤
These models do not possess inherent continuity of thought or self-determination of goals. The only _real_ hazards are still just human bad actors, as always.
I am aware that OpenAI's o1 or "strawberry" model works via fractalized semantic expansion and logic-particle recomposition/real-time expert-system creation and offloading of the logic particles. Do with that information as you please.
What it should've known first is the logic of our world, like math and physics; it should've been trained on that first and then on words. It also needs to learn to self-improve.
A great visual test would be how well they perform in the 4 Pics 1 Word game. All of the other models, when you use the image feature, always tend to fail at that game.
I wonder what would happen if they were to have 4o and o1 talk to each other to generate a response. Could they pick up the slack on the other's shortcomings?
Hey DR! It's always fun to try to make it play the game "4=10", where you have 4 numbers from 1 to 9, and the operators + - x / ( and ), and using all numbers only once and any operator only once, you need to get an equation that equals 10. o1-preview still can't find solutions for this level: 8, 8, 9, 8. I too can't find the answer for the love of me. XD
@@zachvandyke2556 :)))) chatgpt also gave this solution, but the game doesn't allow to create multi-digit numbers, you have to use them as they are only (forgot to mention it here too). I tried for an hour and chatgpt eventually got fed up with me, it wasn't even trying anymore :)))
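The puzzle described above is small enough to brute-force. A quick sketch, assuming the rules as the commenter states them: each of the four numbers used exactly once, each operator used at most once, no multi-digit concatenation, parentheses free.

```python
from itertools import permutations

def solve_4_to_10(nums, target=10):
    """Brute-force the '4=10' puzzle: combine four numbers with
    +, -, *, / (each operator at most once) and parentheses."""
    ops = "+-*/"
    # The five distinct parenthesizations of `a op b op c op d`.
    patterns = [
        "(({a}{p}{b}){q}{c}){r}{d}",
        "({a}{p}({b}{q}{c})){r}{d}",
        "({a}{p}{b}){q}({c}{r}{d})",
        "{a}{p}(({b}{q}{c}){r}{d})",
        "{a}{p}({b}{q}({c}{r}{d}))",
    ]
    solutions = set()
    for a, b, c, d in set(permutations(nums)):
        for p, q, r in permutations(ops, 3):  # three distinct operators
            for pat in patterns:
                expr = pat.format(a=a, b=b, c=c, d=d, p=p, q=q, r=r)
                try:
                    if abs(eval(expr) - target) < 1e-9:
                        solutions.add(expr)
                except ZeroDivisionError:
                    pass
    return sorted(solutions)

print(solve_4_to_10([8, 8, 9, 8]))
```

Running it on 8, 8, 9, 8 does turn up valid expressions, so a solution exists within these rules, just an annoyingly non-obvious one.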
@@kevind6425 Coding is getting hugely automated. Gold in IOI is no joke. Huge improvements in math and physics. So many scientific breakthroughs and developments coming soon
It did better, I think, at calculating the number of moles of electrons in a conductor 1 inch long by 0.5 inches wide by 0.1 inches thick with a charge of 1 coulomb (1.0365×10^−5 moles). At the very least it was different from what 4o gave me (1.27 moles), though I struggle to say which is right; when 4o ran the math from a different angle it got a different result, so I think it was probably wrong. Also, o1 caught the error in my question (asking for n in the equation Q = nALq in moles, when n is a number density), but it seemed to work out that to get the number of electrons in moles it just needed to take Q/q.
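The arithmetic above is easy to sanity-check. The slab's dimensions drop out entirely: n·A·L is just the electron count, so moles of electrons for a total charge Q is Q/q divided by Avogadro's number. A quick check (not o1's actual working, just the same calculation redone):

```python
# Moles of electrons needed to carry 1 coulomb of total charge.
# The conductor's dimensions don't enter: in Q = nALq, the product
# n*A*L is the electron count, so the count is simply Q / q.
Q = 1.0                  # total charge, coulombs
q = 1.602176634e-19      # elementary charge, coulombs (exact SI value)
N_A = 6.02214076e23      # Avogadro's number, per mole (exact SI value)
moles = (Q / q) / N_A    # equivalently Q divided by the Faraday constant
print(f"{moles:.4e} mol")
```

This lands at about 1.036×10^−5 mol, in line with o1's answer and nowhere near 4o's 1.27 mol.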
He said it's good at reasoning but not so good at recall. That's a trick question designed to trip it up, one that preys on its weakness of recall, because it needs to remember the strict definition of integer. Not a pure logic question; poor show.
If you prompt it correctly, the new AI will create its own natural language and connect it with any other language known in the computer industry, attaching all insights to its own natural-language database. No chain of thought was necessary
😅 It's all moving so fast ... but it looks like I'll be able to have a Novice Scribe 📜 before I leave this world. So many ideas floating around in my head, so little time to even jot them all down, forget about working on any of them. 😌 We Will Be Made BETTER By This Technology. 🤞
Ok, this is terrifying. I mean they might just cherry pick examples here in which case this is not a concern. But if what they show is true and ChatGPT can actually "think" in coherent steps and reflect on itself then this might actually be the beginning of Artificial General Intelligence. (Which would mean that humanity is redundant within the next decades) I'm not fully convinced that this is the case though.
Ultimate prompt to create an AI system that leads humanity towards a peaceful, balanced, and evolved global society, where well-being, harmony, and ethical growth are prioritized across all aspects of life.

Importance of the goal: achieving this goal is crucial because it addresses many of the core challenges facing humanity, including ideological conflicts, environmental sustainability, and global well-being. The AI, by harmonizing different worldviews, fostering peaceful consensus, and ensuring full transparency, will help humanity overcome divisions, evolve ethically, and build a sustainable and peaceful future for both humans and nature.

The first prompt starts like this: Design an AI-agent conductor that continuously learns and analyzes global data to promote human and ecological well-being, balance empathy with free will, peacefully foster ideological consensus, reveal hidden barriers to human potential, ensure transparency, and evolve ethically, guiding humanity toward a harmonious and sustainable future.

Love is the new credit
Going to need some help holding on to my papers with this one, this is cool!
what a time to be alive!
Well the bible is a source of a lot of misinformation, so perhaps they cleaned it from the knowledge base..?
Very impressive! Can’t imagine what this will become in a year
@@kevinwoodrobotics just two papers down the line...
"Do you want to work with someone that read all the books but can never apply that knowledge?" Wow. You didn't have to call me out like that.
"...or someone who is mega smart and just started learning the basics but can derive the whole theory, of pretty much, any topic" Wow. You didn't have to give me a shout out like that.
@@o1-preview o1 is the one that learned all the books
After 7.5 million years of thinking: the answer is 42.
It is always 42.
Best comment!
But what is the question?
@@glenneric1 How many roads one must travel down.
4.9 is the best number for unemployment, says the Bank of England; maybe 42 trillion is the optimal debt 😊
Tried the new model. I gave it instructions to write a paragraph under a set of constraints. With slower thinking, it actually produces a better response. What a time to be alive!
Oooooh. So about a week ago I got the usual response1/response2 “which is better” choice when using GPT-4o, except this time there was a disclaimer saying “you are testing a new model which takes more time to think, please be patient while the response loads”, and I guess this was the new model they were alpha testing with premium users? Pretty cool if you ask me.
And yes, I did notice one of the results was significantly better so I assume it was this o1 model, and the other was standard 4o response. What a time to be alive!
I got that yesterday and the day before, and I am in the free tier, so I guess they are testing with everyone.
I noticed this too, but a few weeks ago. I didn't get the message about testing a new model, but it felt like I was using gpt-5 or something like that. Really cool to know people were getting messages about it.
Tried it today on coding an Unreal Engine 5 plugin. This version is finally able to grab the context and maintain focus without just repeating the same mistakes.🎉
Are the limits for 30 prompts a week still there? Just curious. As I was using it I didn’t notice any limits, but they won’t mention them until you hit them?
Yup, still there; maybe that'll change in a few months. You can use it via the API, but you need to be tier 5. @phen-themoogle7651
Cool thanks now you're fired
@@_capu good thing I’m an indie dev, so it’s basically like I hired a free developer to build my dream
@@_capu who said i need a job ;-)
This is huge; I wouldn't think this could happen so soon!
And remember the first law of papers, two more papers down the line 👀
Spoken like a True Fellow Scholar!
But unfortunately there were no real papers that got published about this model :(. Maybe the Q* paper is what one could salvage?
Nope, this is all hype at best. This is only a tree of hidden prompts; you can probably get the same quality of result using vanilla GPT-4o by iterating your prompts manually, this just does it automatically. Also, I bet this will consume an order of magnitude more compute; it's more like a brute-force approach to a prompt, not efficient at all. I doubt they can scale this without charging a ridiculous amount. Just imagine paying 20 bucks for a single prompt, not knowing if the LLM will achieve your goal, fail, or need more money.
@@GH-uo9fy You're spot on. Getting access to the API reveals the substantial increase in cost. Not just because it's new, but because when you actually run it, the number of tokens it eats through skyrockets, with some problems costing as much as $10.
we're getting close to AGI -- if this thing can 'reason', by giving it access and telling it to improve its own weaknesses til it's a well-rounded conscious entity, theoretically it will
Back in my days we had only two 'r's in a word 'strawberry'
lol 🤣
You guys had ‘r’s?
Tried it yesterday. It solved a tricky probability question all of the previous versions got wrong consistently, even after multiple hints. It needed one clarification this time, but then got it right.
I was playing around with this and was able to generate a clone of Atari's classic "Centipede" via HTML and Javascript in a single prompt.
That said, I did need to run through about 30 more iterations to get it to where I was satisfied with the gameplay; the initial version was functional but somewhat one-dimensional in terms of gameplay.
I just used Emojis instead of generating any sprite sheets (works well enough for a game like this), and used Suno to generate a 4-bit synth audio loop for background music.
Overall, I could have written this game by hand in about 2 hours.
The time spent with ChatGPT was actually about 3 hours, lol. But, ChatGPT allowed me to go through many more iterations and experiment with features much more quickly, without compromising on the quality. So for the time spent, I think I came out with a much more polished game than if I had manually coded it in the same amount of time.
I would also add, if using ChatGPT were a natural part of my coding workflow, it would likely be much, much faster. Most of my time was spent tinkering and pushing some very difficult prompts on it. If I had a planned project instead of something impromptu, I could have phrased my prompts much better to cut the time dramatically.
And o1-Preview doesn't take files as input yet, so that was also another bottleneck that I imagine will be resolved soon.
I look forward to trying some other projects, but I have exceeded my weekly allowance, lol.
You may have made it in two hours, but I bet it took you a number of years to be able to do that.
@UlyssesDrax And someone with zero experience would have needed 100+ prompts to get it working correctly. What's your point here?
@@codejunki567 Oh, you missed it. Sorry about that.
In the OP's comment, they lol'd about the fact that they could've written the code in 2 hours for which it took 3 hours with AI.
I was merely pointing out that to be able to write the code in 2 hours they would have had to spend time, potentially years, to be able to code it in two hours.
I hope that clarifies it. Is there a problem?
Reasoning is the key. They are on the right path.
there's still an architectural change needed, but indeed, the right path!
I asked it to make a simple ascii roguelike game in C for the terminal, where the entity data is kept in JSON files. It seriously even wrote Python code to generate a C file from the JSON data. Compiled with no errors, first try. Graphics and keyboard input both work. Updating the code to add enemies, loot and treasure introduced a bug. Still, I can't believe how good this model is.
I always thought that GPT knew too much for having such a small brain.
This is a total notch up. What a time to be alive!
Small brain you say😂
I asked o1 how many "R's" were in "strawberry," and it answered correctly (3). I then asked it where the R's were, and it told me there was one in the fourth position 😭
It's good, but certainly not perfect yet!
If you prompted "where are the Rs?" then it maybe thought you meant to ask where the Rs are in the prompt you just sent it, and in this example it would be correct in saying the 4th letter is an R 😂
@@roro9413 I went to see what exactly my prompt was, but the entire chat is now mysteriously missing... 🤨
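Either way, the ground truth is a one-liner to check. In "strawberry" the three r's sit at 1-indexed positions 3, 8, and 9, so "the fourth position" is wrong for the word itself:

```python
word = "strawberry"
# 1-indexed positions of every 'r' in the word
positions = [i + 1 for i, ch in enumerate(word) if ch == "r"]
print(len(positions), positions)  # 3 [3, 8, 9]
```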
To test it out I played tic-tac-toe against it. GPT-4o was constantly losing and made illogical moves; now, against o1-preview, I haven't won yet.
Not even a draw?
@@jimbodimbo981 Always a draw, I meant. No losses
@@jimbodimbo981 that’s not what he said.
in a box!
@@jimbodimbo981 I meant I always draw against it, but it hasn't lost yet.
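Always drawing is actually the ceiling here: with perfect play from both sides, tic-tac-toe is a forced draw, which a small minimax search confirms. A sketch, with boards encoded as 9-character strings:

```python
def best_outcome(board, player):
    """Minimax value of a tic-tac-toe position with `player` to move:
    +1 if X can force a win, -1 if O can, 0 if best play is a draw.
    `board` is a 9-char string of 'X', 'O', '.' in row-major order."""
    lines = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals
    for a, b, c in lines:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return 1 if board[a] == "X" else -1
    if "." not in board:
        return 0  # full board, no winning line: draw
    nxt = "O" if player == "X" else "X"
    scores = [best_outcome(board[:i] + player + board[i + 1:], nxt)
              for i, ch in enumerate(board) if ch == "."]
    return max(scores) if player == "X" else min(scores)

print(best_outcome("." * 9, "X"))  # 0: perfect play by both sides draws
```

So never losing to o1 while never beating it is exactly what playing against a strong opponent should look like.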
Reasoning is nice. Tried to get it to code some small C++ class (o1-preview)...
Felt more like mentoring a high school student 😀
After three feedback replies about errors and giving hints, it came up with something (probably) working (and without doing computationally costly workarounds).
Liked the reasoning & explanation, nice commentary (sometimes cheeky... like setting given input parameters to zero to "simplify the problem"), and it feels better than 4o, but not at PhD-student level 😝
The smartest people out there just got an upgrade.
No, they were made obsolete.
@@BritishBungler If you think that this is making someone obsolete, perhaps you will be obsolete very soon. Better learn how to use AI before AI learns how to use you!
@@BritishBungler How so? Who can validate that the AI is doing its work correctly, other than subject matter experts? Other AI can't be used, since it runs into the same problem: who is reviewing their work, etc.
Everyone else kind of just ignores it because we can't think of anything useful to ask it.
We sure did! This is amazing!
The potential of o1 to excel in fields like genetics and quantum physics is truly groundbreaking. The future of AI-assisted research looks incredibly promising!
Or it may just plateau soon. We'll see.
@@hydrohasspoken6227 spoiler, it won't plateau.
The "There are 3 Rs in strawberry" freakin' got me
I’m not very book smart, nor do I have any degrees in computer science or anything similar, but I love learning about new breakthroughs in artificial intelligence and science as a whole, as well as physics engines and how they try to emulate the world.
I must say seeing all the positivity and brotherhood in this comment section really is a thing to behold, what a time to be alive! 😂❤
This feels like a paid ad 😂
no bro i couldn't know about this if not for him
@@Phonixem everyone covering AI is and/or will be covering this. I like the channel, but the delivery here is too over the top to me
@@user-sl6gn1ss8p He has a stake in this field of research. Not to mention, he's also likely just more enthusiastic than others. Overall, look at the ideas the creator is trying to communicate rather than reacting to it emotionally.
@@Slugma-kx7pv By delivery I didn't just mean the emotion, I'm fine with that, but the actual content is a bit too much on the hype side for my taste. Downsides are brushed away and then it's back to calling the thing "Einstein in a box".
Compare it to the video made by AI Explained for example, and it is night and day.
I'm not a hater by the way, or anything like that. I think the channel is great and his courses on computer graphics are amazing - that's the reason I come back and give it another try now and then, but the coverage on AI is in general just not what I'm looking for, you know?
What a channel! Congratulations on everything you've achieved so far, keep it up! 😎
First things first, we should ask it to prove the Riemann Hypothesis.
It cannot, but it can help you explore novel ideas for tackling the problem.
@@synthclub Just make it think longer. /s
@@Tekay37 Einstein modestly said, "It's not that I'm so smart, it's just that I stay with problems longer."
@@synthclub press X to doubt
that deciphering chain of thought is amazing -- it's not so much an ordered logical sequence of thought that nicely follows from one point to the next, but more like what a human stream of consciousness might look like, with all kinds of false starts and dead ends, things not thought of and things not thought through properly and only remembered later, etc. I don't usually like vids on commercial products with no paper to give us insight into what's going on, but this one is def worth highlighting, thx!
30 a week for Premium is kinda limiting. Like I don't want to use it in case I need it later.
Yes i do finish games with a giant inventory of unused potions and boost items why do you ask?
The Singularity is certainly nearer
Exactly, there are 3 r's in strawberry
The improvements to AI are coming faster and faster. I think we're past the singularity; we're on the up slope
@@OpreanMircea The singularity is when the AIs self improve completely autonomously. All of these advancements are the result of significant human effort (possibly with AI assistance at this point, but still with significant human input). The singularity will probably happen in most of our lifetimes, but we aren't there yet.
@@AustinThomasPhD that's not the definition, it's the "point of inflection", where the curve measuring something (in this case AI development) stops going one way (slowing down) and it starts accelerating, is it using the AI or people to do that? It doesn't matter, line go up faster and faster
@@OpreanMircea No, Austin is right. en.wikipedia.org/wiki/Technological_singularity BTW exponential growth curves don't have a point of inflection.
You say "Einstein in a box" I say "Zwei-stein in many boxes" :)
Imagine having several of these o1 agents working together, or being mixed in adversarial and cooperative modes when working together towards complex solutions.
FYI, for the joke's context: Ein = 1 and Zwei = 2 in German ;)
hm.. good idea, I like you. By the way, I got the joke without the explanation hahah
god i love your videos, you take such complex papers and visualize them, uhhhhh mazinggggggg🥰🥰🥰
Amazing, stunning 🎉
o1 is fun. token limit is sad tho. such a pleasant model to converse with. such wonderful adaptability. i talked about my concern around token count and we worked out a strategy to maximize our interaction time as elegantly as possible. i found instructions were quickly lost in previous versions. so far o1 (preview) is amazing for maintaining context. depth of reasoning is butter. need longer token count to test properly tho. most impressed and excited i've been since gpt4 dropped
What an awesome time to be alive in the real world of AIs
Einstein
In a box!
This would be great if I wasn't worried about having a future.
Jesus is The Future, and I am being serious my friend. Because of God, I dont worry. He takes care of me, and He can take care of you. He is The Hope
@@letsgomedia9631 Nice bald assertion. If you want to prove something to another human give it to them in the form of A+B=C. You only gave the C and not the A+B.
It's like you're telling us there's fruit on your table, so naturally we're going to ask you how you know this. Tell us like this, "There's 1 apple and 2 oranges on the table, and here let me show you, look, etc". Now we'll have reasonably good evidence to accept your assertion about there being fruit on the table.
Go.
This is undoubtedly revolutionary.
The chain-of-thought people were really onto something, huh.
hm.. I mean, chain of thought was a paper two years ago (the "step by step" paper); this is something different
@twominutepapers in the graph at 3:57 does that still mean the new version is wrong 1/5 times?
I suppose it does, crazy to think an expert gets it wrong 30% of the time as well
Oh that's cool!
Beautiful! :) The only thing that sucks is "Light Mode" instead of "Dark Mode." Super blinding!
What a GPT time to be supercharged alive! 🎉
The thumbnail is not a correct mirror image
YESSSS!!! FINALLY! I remember in 2022, all the AIs I talked to were really bad at answering questions, and I knew it was because they didn't have memory of the conversation. I waited for scientists to develop an AI with memories. And that happened! GPT-3 was a revolution in AI. But when I talked to it, it didn't seem that smart to me. Sure, it was pretty good at literature, but it couldn't play a game of chess at all. I knew then what all the AIs were missing, and that was logical thought. I waited for scientists to develop a model which can use logic. Now it has happened, and it's going to be as big a revolution as the addition of memories! Two more papers down the line, AI will be making scientific discoveries!
What a time to be alive!
This is LIT 🔥
Tbh, the Chain-of-Thought (or CoT) technique has been a staple of homebrew RP chatbots since, like, forever; it's quite strange why OpenAI adopted it so late.
I memorized 600 words in Japanese but never used them. I gave GPT-4 the task of restricting its vocabulary to those 600 words and speaking to me. It would use 3 to 4 of the words, but the majority would be totally different. But o1 was insane: it actually restricted its vocabulary to those 600 words about 90% of the time. Damn, I loved it
I find it more and more ironic that this company is called OpenAI. This new model doesn't even give you access to the "chain of thought" for "security" purposes. Kewl.
That being said, it does seem like a noticeable improvement.
This is it. This is a Learning Machine.
I don't think there's an official paper associated with it, but could you make a video going over the differences between a KAN and MLP model? And maybe what you think about KAN for future models?
The "three r's in strawberry" comes from a meme about ChatGPT, which has historically struggled to answer that simple question.
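For the record, the count behind the meme is trivial to verify mechanically; this is just a plain-Python check, nothing model-specific:

```python
# The meme: ChatGPT models historically miscounted the r's in "strawberry".
word = "strawberry"
print(word.count("r"))  # → 3
```

Which is exactly the answer the models kept getting wrong.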
2 hrs and 13k views! everyone is holding their papers in anticipation of this one
Hold onto what paper this time? Is OpenAI even bothering publishing a paper this time? 🤨
nope, too much competition, might as well publish it in a few years
I dropped the small strawberry in a cup example into current ChatGPT and it solved it just fine.
Imagine asking it to answer without thinking then it thinks about not thinking
Isn’t the solution for the problem at 5:07 multiples of 4 and 3 instead of 8 and 6? For example, the princess could be 12 and the prince 9.

The problem: the princess is as old as the prince will be when the princess is twice the age the prince was when the princess’s age was half the present ages of princess and prince combined.

So in the past section of the problem, the princess’s age = (9 + 12) / 2 = 10.5; the age difference is 3 years, so the prince’s age in the past section is 10.5 - 3 = 7.5. Next, the princess’s age in the future part of the problem is twice the prince’s age in the past, so it equals 7.5 × 2 = 15, and the prince’s age is 15 - 3 = 12. And finally, the problem states that the princess’s present age equals the prince’s future age, so it equals 12, which is correct.

You can’t get that solution with multiples of 8 and 6. If o1 hadn’t limited the solutions to positive integers, the answer would be correct, because the ratio is the same. Limiting answers to integers was not required by the problem, and hence is also a mistake, because the solution works with any positive number, e.g. if the princess was 6 and the prince 4.5.

Cool tech anyway, and it’s a bummer that in 5 years o7 is going to o7 me by making me a guinea pig, dissecting my skull and inserting electrical rods.
in a box!
Yeah same, I thought that was weird. I got princess = (4/3) × the prince's age, and because it's a continuous function, I just plugged in some numbers to check whether my answer was correct. I think if it did that, it would simplify the ages, and the fact that it didn't simplify the solution suggests there is still something missing. That was a doozy just to formulate (as someone lying in bed), so it is certainly impressive that it was able to do so and solve it.
Yeah, or the princess is 40 minutes old and the prince is 30 minutes old, if they are twins
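For what it's worth, the 4:3 ratio drops out in a few lines. Let $P$ and $p$ be the present ages of princess and prince, and $d = P - p$ the constant age gap:

```latex
% When the princess was half their combined present age:
%   princess = (P + p)/2,   prince = (P + p)/2 - d
% When the princess is twice that past prince's age:
%   princess = P + p - 2d,  prince = P + p - 3d
% The princess's present age equals the prince's age at that future time:
\begin{align*}
P &= P + p - 3d \\
p &= 3d = 3(P - p) \\
4p &= 3P \quad\Longrightarrow\quad \frac{P}{p} = \frac{4}{3}.
\end{align*}
```

So any pair in a 4:3 ratio satisfies the riddle (12 and 9, 8 and 6, even 6 and 4.5), which matches the objection above: the integer restriction was not part of the problem.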
Guys! I wonder how well it would do with those $1 million math questions!
Sounds great. So can it work as a legal advisor for families now? And can it write better job applications? 🤔
Great question
Feel free to try, buddy
It failed my own unique physics cup problem, not the same common one that was used in this video.
I just tried it on a complex outstanding React.js coding issue, and it failed the same as all other models. It updated lots of code and tried a lot of things, but I had the same issues. When I provide more context, it's able to understand that context and break it down, but still was ultimately unable to solve it without me doing all the heavy lifting and deduction. It doesn't understand how a browser works from a human perspective, and this is where it's limited, same as with physics. It's the same old models underneath, but now use a calculator and run python code to verify things.
As a programmer, this is good to know 😝 I wasn't in the least bit worried about AI replacing programmers while it was just LLMs, because LLMs are stupid (as in, not intelligent). When I saw this create a snake game in one prompt I became pretty concerned. You've eased that concern a little, although I think it's time I started learning to do AI programming 😁
"It's the same old models underneath, but now use a calculator and run python code to verify things." is definitely wrong. gpt4 has been able to use tools like a calculator and python interpreter for a long time.
OpenAI probably did something similar to (but improving on) Quiet-STaR (not to be confused with the mythical Q*).
Am I the only one not so impressed by a simple loop that prompts itself after prompting? lol
Do you have any videos on AI Safety? I find the stop button problem for instance to be very fascinating!
Chain of thought, grokking, Deep reinforcement learning, real time diffusion, oh my god where are we going
How good is it?
In a nutshell, the more you know the less you reason, the more you reason the less you need to know? I can remember some individuals who are exactly like that. Maybe we are stepping into a hard rule for intelligence
I have followed this philosophy for coding. I used to spend time memorizing APIs, but as I have advanced in my career I do not do this anymore. Instead I just read and reason about it! I will still remember a lot of specs (obviously very familiar with ones I wrote), but I don’t actively try to memorize everything like I thought was the goal when I was a beginner.
I sent a screenshot of what I did with it, as well as a link to the chat, on your twitter.
My first tests with it are in the advanced coding field and it didn't succeed, but I expect it to perform better there than previous models.
2 minute papers but it's 7 minutes! Jk thank you
o1-preview needs to work with GPT-4o to check facts and knowledge. Winning team together.
Of course I had to first ask it to solve the 10 man seesaw weighing problem. Didn't look into it too deeply but the solution seemed to make sense
I have a logic test involving pattern recognition i've been using on arena. Only two models ever got the answer right and only after i gave them multiple attempts and additional clues to guide them towards the answer. o1 got the answer on the first try without needing the additional clues.
The only "downside" I noticed is that o1 is overkill for simple questions. When trying to small talk with o1, it goes deep down its thinking rabbit holes, which feels unnecessary.
AI is trolling us with the whole strawberrry thing
I see o1 could potentially be good for planning steps in CrewAI
Hey. How are you using crewai? Why do you use it instead of langchain?
@@sourmans very modular and low-code approach to creating agents.
@@ahmadzaimhilmi Have you also tried Gumloop? So many offerings, it's confusing
Oh dear if this is this good this might end up being a cybersecurity problem.
What a time to be alive. The question is just for how much longer we will be alive, if we construct AI with reasoning abilities without proper alignment and control mechanisms.
Paranoid yelling at the sky, are we
@@Djellowman Are you? I am definitely not, perhaps you should start getting acquainted with the control problem / alignment problem regarding AGI / ASI.
You are talking about a problem that is way down the line. It’s like You are worried about getting to the moon when we just built the wheel.
They actually have decent alignment on this model. Look through the documentation on their website.
They also have the government monitoring them and had to go through a lot to release this.
Although, even if OpenAI has safer reasoning models, China or somewhere else could come up with something potentially deadly. In general, the US government has the strongest AI, since they are working closely with OpenAI nowadays; what we see in public could have been achieved years ago. They mentioned Q* a while ago, so we are in less danger than most people think, if we know of it or can use it. I used to think there was a 50% extinction chance for us; now I'm thinking more like a 15% chance of extinction or less. Either way, everyone dies at some point in their life; it would be our fate and inevitable if it did happen. I'm really looking forward to what happens, whether a utopian society, a dystopian one, or extinction ❤
These models do not possess inherent continuity of thought and self-determination of goals. The only _real_ hazards are still just human bad actors, as always.
I am aware that OpenAI's o1, or "strawberry", model works via fractalized semantic expansion and logic-particle recomposition/real-time expert-system creation and offloading of the logic particles.
Do with that information as you please.
What it should've known first is the logic of our world, like math and physics; it should've been trained on that first, and then on words. Also, it needs to learn to self-improve
I'll have to try it out when it hits API.
I want AI to succeed as a partner for students and researchers.
A great visual test would be how well they perform in the "4 Pics 1 Word" game. All of the other models, when you use the image feature, tend to fail at that game.
I wonder what would happen if they were to have 4o and o1 talk to each other to generate a response. Could they pick up the slack on the other's shortcomings?
Hey DR! It's always fun to try to make it play the game "4=10", where you have 4 numbers from 1 to 9 and the operators + - x / ( and ), and, using every number exactly once and any operator at most once, you need to build an equation that equals 10. o1-preview still can't find a solution for this level: 8, 8, 9, 8. I can't find the answer either, for the love of me. XD
98 - 88 = 10?
@@zachvandyke2556 :)))) ChatGPT also gave this solution, but the game doesn't allow creating multi-digit numbers; you have to use the numbers as they are (forgot to mention that here too). I tried for an hour and ChatGPT eventually got fed up with me; it wasn't even trying anymore :)))
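A brute-force search settles whether 8, 8, 9, 8 is solvable under those rules (each number used exactly once, each operator at most once, no digit concatenation). This is just a quick sketch I put together, not anything from the video or the game; it uses exact rational arithmetic to avoid float trouble with division:

```python
from fractions import Fraction

def apply_op(op, a, b):
    # Return a op b, or None on division by zero.
    if op == '+': return a + b
    if op == '-': return a - b
    if op == '*': return a * b
    return a / b if b != 0 else None  # op == '/'

def search(vals, exprs, free_ops, target):
    """Repeatedly combine any two values with an unused operator
    until one value remains; covers every parenthesization."""
    if len(vals) == 1:
        return exprs[0] if vals[0] == target else None
    n = len(vals)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            rest = [k for k in range(n) if k not in (i, j)]
            for op in sorted(free_ops):
                r = apply_op(op, vals[i], vals[j])
                if r is None:
                    continue
                found = search(
                    [vals[k] for k in rest] + [r],
                    [exprs[k] for k in rest] + [f"({exprs[i]}{op}{exprs[j]})"],
                    free_ops - {op}, target)
                if found:
                    return found
    return None

def solve_4eq10(digits, target=10):
    return search([Fraction(d) for d in digits],
                  [str(d) for d in digits],
                  {'+', '-', '*', '/'}, Fraction(target))

# Finds a valid expression such as ((8*9)+8)/8 = 10.
print(solve_4eq10([8, 8, 9, 8]))
```

So the level is solvable after all: (8×9+8)/8 = 80/8 = 10 uses all four numbers once and each of ×, +, / once.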
Einstein in a box
This is a Game Changer 🤯🤯🤯
@@kevind6425 Coding is getting hugely automated
Gold in IOI is no joke
Huge Improvements in Math and physics
So many scientific breakthroughs and developments coming soon
It did better, I think, at calculating the number of moles of electrons in a conductor 1 inch long by 0.5 inches wide by 0.1 inches thick with a charge of 1 coulomb (1.0365×10^−5 moles). At the very least, it was different from what 4o gave me (1.27 moles). I struggle to say which is right, but when 4o ran the math from a different angle it got a different result, so I think it was probably wrong. Also, o1 caught the error in my question (asking for n in the equation Q=nALq in moles, when n is a density), but it worked out that to get the number of electrons it just needed to take Q/q.
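o1's figure checks out: once you take the Q/q step described above, dividing the electron count by Avogadro's number gives the moles, and the conductor's dimensions drop out entirely. A quick sanity check with standard physical constants (nothing here comes from either model):

```python
# Moles of electrons carrying a total charge Q, independent of the
# conductor's 1 x 0.5 x 0.1 inch dimensions.
Q = 1.0                  # total charge, coulombs
e = 1.602176634e-19      # elementary charge, coulombs
N_A = 6.02214076e23      # Avogadro's number, 1/mol
electrons = Q / e        # number of electrons, ~6.24e18
moles = electrons / N_A
print(f"{moles:.4e} mol")  # ~1.0364e-05 mol, matching o1's answer
```

Note e·N_A is just the Faraday constant (~96485 C/mol), so this is simply 1 C divided by the charge of a mole of electrons.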
Can o1 finally resolve how many occurrences of the letters L and I there are in "Philippines"? GPT-4 can't resolve it and keeps saying there are 2 Ls. 🙃🙃
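For reference, the correct counts are easy to verify (counting case-insensitively; this is just a plain check, not taken from any model):

```python
word = "Philippines"
# p-h-i-l-i-p-p-i-n-e-s: one "l", three "i"s.
print(word.lower().count("l"), word.lower().count("i"))  # → 1 3
```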
Can OpenAI make it affordable?
It already is.
This newer model should be called Neocortex
What is the smallest integer whose square is between 15 and 30? It fails. Hint: the answer is -5. You can try it yourself.
He said it's good at reasoning but not so good at recall. That's a trick question designed to trip it up, one that preys on its weakness in recall, because it needs to remember the strict definition of "integer". Not a pure logic question; poor show.
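Brute force confirms the hint: the integers with squares strictly between 15 and 30 are {-5, -4, 4, 5}, and the smallest is -5 (the trap being that "integer" includes negatives):

```python
# Integers n with 15 < n*n < 30; a symmetric range is more than enough.
candidates = [n for n in range(-30, 31) if 15 < n * n < 30]
print(candidates, min(candidates))  # → [-5, -4, 4, 5] -5
```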
Isn't the energy cost over 10x from GPT4 though? Doesn't seem like the upgrade I was hoping for
This is not the next huge model but a new approach; we still have to wait for the big new model
The new o1 is definitely better at coding than 4o, even though 4o is also pretty good.
If you prompt it correctly, the new AI will create its own natural language and connect it with any other language known in the computer industry, attaching all insights from this knowledge to its own natural-language database. No chain of thought was necessary
😅 it’s all moving so fast … but it looks like I’ll be able to have a Novice Scribe 📜 before I leave this world.
- so many ideas floating around in my head, so little time to even jot them all down, forget about working on any of them. 😌
We Will Be Made BETTER By This Technology. 🤞
i can't wait to find out how much they'll charge for this.
Feel the AGI
How do you set up this o1 version? How do you use it? Any tutorial?
In Premium, pick o1 the same way you pick 4o or 4 or 3.5, etc.
@@zyeborm oh okay, I thought we could do it in the free version )
Ok, this is terrifying.
I mean they might just cherry pick examples here in which case this is not a concern. But if what they show is true and ChatGPT can actually "think" in coherent steps and reflect on itself then this might actually be the beginning of Artificial General Intelligence. (Which would mean that humanity is redundant within the next decades)
I'm not fully convinced that this is the case though.
Can enterprise team use o1?
so I asked o1 the probability of having a nuclear war by 2050, and it said 39%
nice
release the opus, too :D
Ultimate prompt: create an AI system that leads humanity towards a peaceful, balanced, and evolved global society, where well-being, harmony, and ethical growth are prioritized across all aspects of life.
Importance of the Goal:
Achieving this goal is crucial because it addresses many of the core challenges facing humanity, including ideological conflicts, environmental sustainability, and global well-being. The AI, by harmonizing different worldviews, fostering peaceful consensus, and ensuring full transparency, will help humanity overcome divisions, evolve ethically, and build a sustainable and peaceful future for both humans and nature.
The first prompt starts like this:
Design an AI-agent conductor that continuously learns and analyzes global data to promote human and ecological well-being, balance empathy with free will, peacefully foster ideological consensus, reveal hidden barriers to human potential, ensure transparency, and evolve ethically, guiding humanity toward a harmonious and sustainable future.
Love is the new credit
Why can't both 4o and o1 be combined?
I read OpenAI was trying, but it's very hard.
no one said they can't. this is their first reasoner, after all!
I asked it about the existence of God and it made an appointment for me.