Join the fastest growing AI education platform and instantly access 20+ top courses in AI: bit.ly/skillleap
😮😢
Back in 1986, I bought my first computer, a Sinclair ZX Spectrum 128K. I was 7 years old and thought I could just type in my question, and it would answer. I quickly realized that's not how things worked; instead, I had to learn the BASIC programming language, which I became quite good at. Today, the day has come when things work exactly as I had imagined! I never thought I'd live to see it happen! A childhood dream has become reality. ChatGPT with the reasoning of o1-preview marks a new era.
I think we’re probably similar ages, and we’re FINALLY beginning to live in the times that we thought would happen a lot quicker, back in the 80s. Just need those damn hoverboards now! 😀😏
@@Addictedtobleeps he is 45.
I go back even further. I used to get the Mattel talking telephone for Christmas every year. It came with little mini records to put in, and through the handset you could hear the one-sided recorded conversation that never changed. However, I would listen so intently, and in my imagination it was just about to go off script every time; I sat and waited, thinking I heard it. I was just fascinated with the prospect. I have been waiting for ChatGPT just about my entire life.
@Addictedtobleeps yep 100%
Just keep in mind before you go asking your model a bunch of silly questions: you get 30 messages A WEEK on the preview model and 50 A WEEK on the mini.
Oh yeah, good point. Forgot to mention the limit.
Thanks!
@@SkillLeapAI Limit? Is there a limit on the upgrade or GPT Plus?
wtf, what are they charging for then, wrong answers as seen in the video?
This should be at the top of the comments ! 😅
The reasoning is scary good. I gave the 4o model the old riddle about the man who walks into a hotel with a wheelbarrow. It really couldn't get the answer at all. But the new preview had no trouble figuring it out. This is a game changer.
You continue to impress me with the content of your videos. I haven't found anything like your videos in the UA-cam universe. As others have probably told you, keep doing this, my brother. I got a ton of value.
Great job explaining this! It helped a lot!
🎯 Key points for quick navigation:
00:00:00 *🚀 Introduction to New Models*
- OpenAI introduces "o1-preview" and "o1-mini" models,
- Designed to handle complex reasoning and coding tasks,
- Available to ChatGPT Plus and Teams users, and API developers.
00:02:18 *📊 Performance and Testing*
- "01 preview" model shows significant improvement in reasoning tests,
- Benchmark superiority over previous models in coding and math tasks,
- Achieved high scores in various test scenarios.
00:05:27 *🔍 Reasoning Process and Accuracy*
- Demo of model's answer to complex SAT problems,
- Illustrates Chain of Thought prompting for accuracy,
- Shows improvement with structured prompts, varying success in solutions.
00:08:09 *🕹️ Coding Demonstrations*
- Successful creation of a functioning checkers game,
- Initial attempt at chess game logic requires refinement,
- Potential shown in generating complex game code accurately.
00:09:58 *🌐 Model Limitations and Future*
- Current limitations in general use compared to GPT-4,
- Lacks web browsing and content summarization features,
- Positioned for specialized complex reasoning, further integration anticipated.
Made with HARPA AI
Have to give a major shoutout for your dedication, and for a first pass at chess it did a really great job. It's like Reflection, if it had actually worked. Thanks for interrupting your vacation.
Thank you. Yeah, it seems like Reflection was trying to do exactly this.
How do you look under the hood to see the chain of thought? This is my answer:
Understanding the equation
OK, let's clarify the equation: 24x^2 + 25x - 47ax - 2 = 8x - 3 - 53ax. The goal: solve for a, combining like terms on one side. Mine doesn't look like yours?
Rearranging and combining
I’m moving all terms to the left-hand side, simplifying by distributing and combining like terms, leading to 24x² + 17x + 6ax + 1 = 0.
Taking a closer look
I'm exploring the equation's implications for all x or by plugging in a specific x to solve for a.
Revisiting the equation
I’m considering if the equation needs a universal quantifier or a specific 'a' value for infinite solutions, and if it simplifies to an identity.
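A quick check of that quoted chain of thought, assuming the equation is read exactly as typed with no fractions (the flat reading): the simplification the model reports does follow from that reading, which suggests it was working on a different problem than the one intended; see the correction further down the thread. A minimal SymPy sketch:

```python
import sympy as sp

x, a = sp.symbols('x a')

# The equation exactly as typed above, read with no fractions:
lhs = 24*x**2 + 25*x - 47*a*x - 2
rhs = 8*x - 3 - 53*a*x

# Move everything to the left-hand side and combine like terms.
print(sp.expand(lhs - rhs))  # -> 24*x**2 + 6*a*x + 17*x + 1 (term order may vary)
```

That matches the 24x² + 17x + 6ax + 1 = 0 the model arrived at.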
The ultimate prompt.
Introduction:
The ultimate goal is to create an AI system that leads humanity towards a peaceful, balanced, and evolved global society, where well-being, harmony, and ethical growth are prioritized across all aspects of life.
Importance of the Goal:
Achieving this goal is crucial because it addresses many of the core challenges facing humanity, including ideological conflicts, environmental sustainability, and global well-being. The AI, by harmonizing different worldviews, fostering peaceful consensus, and ensuring full transparency, will help humanity overcome divisions, evolve ethically, and build a sustainable and peaceful future for both humans and nature.
The first prompt starts like this:
Design an AI-agent that continuously learns and analyzes global data to promote human and ecological well-being, balance empathy with free will, peacefully foster ideological consensus, reveal hidden barriers to human potential, ensure transparency, and evolve ethically, guiding humanity toward a harmonious and sustainable future.
Make Love the new credit.
And then it hooks us all to a supply of intravenous morphine, and we live happily drooling for ever after.
This model is limited in capabilities as it is just a demo. When the full-fledged model comes out, that's when everyone will go crazy.
I gave it a link to a Coursera course I am looking at taking and it was able to read the webpage and tell me all about the course.
Oh interesting. They said it had no web browsing yet
Coursera now has its own AI chat model built into the page when you sign up for a course.
Excellent channel. Can you please guide me to a generative AI that can do web browsing, and extract and analyze content through it?
Sure. It’s called perplexity
@@SkillLeapAI But why does it sometimes say it does not do internet browsing?
Awesome, and great timing, just when I want to tackle some programming. So far, very extensive ❤
It’s very limited access right now, so use your prompts wisely
@@SkillLeapAI thanks!
Thanks for sharing! I wish you had Anthropic's Sonnet 3.5 running side by side on the same task.
On my list to compare it
Maths calculations are pointless if ChatGPT doesn't get them 100% correct. It doesn't matter if the 'success rate' has gone up if it hasn't reached 100%.
Small steps
@@SkillLeapAI then don’t be pushing it as “INCREDIBLE” if it’s only “small steps”!
That's a ridiculous statement. If you'd ever worked a day in your life in science or mathematics you would realize how incredibly useful a tool would be even if it only correctly solved 25% of the problems you asked it for help with. Problems are extremely difficult in these fields, so even a model that only has a 25% success rate would save you hundreds of hours per year.
@@therainman7777 It's not really 'complicated' math that these models are failing at. If it were only the world's most complicated mathematical questions it was getting wrong, then I'd agree ... just ask the AI and test its answers, and if 1 out of 4 of them worked, then "hell yeah!"
But it's far simpler math that it is failing at. So as the questions get more complicated, that 1-in-4 success rate applies to tens or hundreds of thousands of individual steps, and an error in a step near the start of the chain of math has knock-on effects, making the probability that it gets the whole problem and all the math correct more like 1 chance in near infinity.
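To make that compounding point concrete, here is a minimal sketch with hypothetical per-step success rates (not measured figures): if each of n steps in a chain is independently correct with probability p, the whole chain is correct with probability p^n, which collapses quickly as n grows.

```python
# Hypothetical per-step success rates: if each of n independent steps is correct
# with probability p, the whole chain is correct with probability p**n.
for p in (0.25, 0.90, 0.99):
    for n in (10, 100, 1000):
        print(f"p={p}  n={n}  chain success ≈ {p**n:.3g}")
```

Even at 99% per step, a 1,000-step chain comes out fully correct only about 0.004% of the time.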
I just tried the “Strawberry” test on my ChatGPT 4o version. I cannot believe it got it wrong and flatly refused to accept it was wrong. It even spelled the word out letter by letter and still said there were only 2 letter “r”s. I have asked it many complicated questions that it gets right, but this logic test it fails. I am surprised.
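For reference, the test simply asks how many times the letter "r" appears in "strawberry"; a plain string count gives the answer the model should produce:

```python
word = "strawberry"
print(word.count("r"))  # -> 3
```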
What fascinates me is the very first step the model takes, that is, how it decides to even approach the problem.
Such as, with the chicken and egg question, the first thing it says is that it will begin by looking at biological evolution. But why would it do that?
It must already understand that the question is asking about the origin of a species, that of the chicken. It must also already understand that the field which investigates the origins of species is the one that studies biological evolution.
Yes, I tested it, and it is quite good :) It seems that they improved the chat's initial prompt.
It's incredible. Using it, even its mini version is far better than 4o.
For example?
Do you have an example where simple prompt engineering and a system/user prompt would not have provided a similar answer?
I mean, it's probably nice that the prompt engineering process is being done automatically for you, but I'm not really feeling any major advancement here.
No it's not. It's literally the same.
What is the context in/out in tokens?
Literally the best chicken or egg answer ever lol
It's wrong. When we say egg or chicken, we mean a hen's egg. And if we are extending it back, then the birds came first, from mammals who didn't use to lay eggs, and then birds started laying eggs :D
@@djayjp I asked Free Perplexity the same question and asked it to explain its answer … I got nearly word for word exactly the same answer.
Claude solved it on the first try with the multiple choices included.
I just used it for the first time since 2023, and Strawberry is amazing, 4 Bible questions in. ❤
I think this model is the smartest AI from OpenAI.
That's why they call the new model "Strawberry" 😁
Why?
Because it can count the number of r's in the word strawberry correctly?
How do these guys discover the latest releases and always seem nonchalant about it?
OpenAI sent an email about this
GPT-4o was a letdown for me. It was bad at following long instructions and at coding anything beyond basic things, so I always use Claude Sonnet. Hopefully this isn't too expensive.
Good video - but how did he not notice that the chess starting position is wrong? 😄
I think I gave it the wrong PNG files for the king and queen.
Hey Boss. Cheers!!!
There is an error in the mathematical problem you set for the model. You got the wrong answer because that's a badly formatted question. The right problem is the equation:
(24x^2 + 25x - 47)/(ax - 2) = -8x - 3 - 53/(ax - 2), for x ≠ 2/a
That is, the whole left side is divided by ax - 2, and on the right side only the -53 is divided by ax - 2. With this reading the answer is a = -3, which even GPT-4o could solve.
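A minimal SymPy sketch of that corrected problem (assuming the fractional reading described above): clearing the denominator and requiring the identity to hold for every valid x forces a = -3.

```python
import sympy as sp

x, a = sp.symbols('x a')

# Corrected problem: the whole left side is over (ax - 2),
# and on the right only the -53 is over (ax - 2).
lhs = (24*x**2 + 25*x - 47) / (a*x - 2)
rhs = -8*x - 3 - 53 / (a*x - 2)

# Clear the denominator; for an identity in x, every coefficient must vanish.
identity = sp.expand((lhs - rhs) * (a*x - 2))
coeffs = sp.Poly(identity, x).all_coeffs()
print(coeffs)                          # [8*a + 24, 3*a + 9, 0]
a_val = sp.solve(coeffs[0], a)[0]      # 8a + 24 = 0  ->  a = -3
print(a_val, identity.subs(a, a_val))  # -3 0, so the identity holds for all x
```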
You say that about every new OpenAI model, Mr. Hype.
Well, do you think they are going to release models that are not an improvement on the last one? Also, watch the video I posted after this one.
Every new version of every piece of software I've ever used is better than the last version. That's kind of the point of upgrades.
INCREDIBLE is a strong word - especially when the tool makes so many mistakes
Incredible for an LLM. It had better answers than I did for pretty much every question.
Hello Skynet
What does o1 mean?
🎉not bad dude
This question must be part of the data it was trained on; without the options, you see, it was unable to figure it out, but with the options it could, because it had already been trained on this data.
Guys, I'm just starting out as an AI enthusiast,
would love your feedback as I make similar stuff!
I thought the new model would be called strawberry! Why did they change the name?
Yea me too. Not sure why the name is different
@@SkillLeapAI Maybe because of being scared of things like insider trading?? That is me saying maybe! Total nonsense, but one thing is for sure: lying when asked about the strawberry in his garden on X, and a lot more. Earlier this year or even before, I saw a 50-min video from some AI tuber (thank you for your service, guy), like most people, no judgement... just sayin', thumbnails like someone saw a burning bush or a catlike humanoid smashing a smartphone watching TikTok.
now.. Hallucination Is All You Need .. To Get Rid Of.
First question and ChatGPT failed. I was like, WTF man, why then am I paying for a subscription?
I love how I keep predicting the dates exactly, yet nobody notices...
Remember this comment?
🤖 👁️ 🍓 Remember, remember the 12th of September,
The Strawberry, Reason, and Mind.
Orion’s path, through logic’s math,
Shall soon its breakthroughs find.
The Cosmic Glitch, Mrigasira Nakshatra, holds the Clue for You. 🙏
Go green and give up ChatGPT. It uses as much power as 17,000 households.
This model and 4o have the same problem: neither of them can solve the math problem correctly. The ideas may be good and can be used as a reference, but it made mistakes in the calculations in very simple places. Don't know why.
They are not that good if you're doing hard stuff.
Sorry but o1 sucks major donkey balls... it is dumb as dirt... I couldn't use it anymore after like 5 minutes. I don't give it "AI tests"... I just use it like I want to for what I need, and it is worse than 4o and much worse than Meta AI in many ways, basically unusable right now, a terrible release... do they even test this crap before launching it?
At least SOMEONE in this comment section is honest!
Really? In the few tests I ran in this video, it beat GPT by a mile. This is designed for math, complex reasoning, and coding, not much else. If you know of a model that can keep up with my results in those categories, I'll test it. GPT doesn't even come close to solving those or giving me usable code at this level.
It's hardly 'incredible' ... why would anyone ask a multiple-choice math question other than someone taking the SAT?
I asked perplexity the same ‘chicken/egg’ question and just asked it to explain its answer and I got the same answer in a second.
I wish you AI bloggers would stop being so ‘excited’ about almost nothing. Yeah, sure LLMs are useful for some things, but so far their rate of advancement is nowhere near the level of constant hype.
Do better. Benchmarks are useless; actual useful use cases are needed, and those are the only things that count!
Your wife let you sneak back and do a video?
It took a lot of convincing lol
It's "incredible". Really??
A model that went from a 13% score to 84%? I think that's a justified word to describe it.