Why GPT-4 is much smarter than it was a year ago - OpenAI cofounder John Schulman
- Published 13 Jun 2024
- Full Episode: • John Schulman (OpenAI ...
Apple Podcasts: podcasts.apple.com/us/podcast...
Spotify: open.spotify.com/episode/1ivz...
Transcript: www.dwarkeshpatel.com/p/john-...
Me on Twitter: / dwarkesh_sp - Science & Technology
Very interesting about post-training vs pre-training
Interested to hear the argument, and it may be "smarter" in some senses, but the degree of refusals and "laziness", especially with code, has grown tremendously in the last 6 months. What good is all that underlying intelligence if you can't get access to it?
From my anecdotal experiments with it for coding I also think it got worse.
It's just as wrong and buggy as before, but now it's much better at gaslighting you as to why it is actually right ;)
Sooner or later we are going to give up on the idea that a model that can provide you an in depth analysis of 17th century prose is also the same model that will be able to give you the best code.
@@garrettmillard525 You do realize that that was the "old" way of thinking, right? This was what people were trying to do (unsuccessfully) for years before realizing that a model that can analyze prose actually works better at a lot of things than a narrowly defined model.
@@elirane85 Lol, yes, I do know that. You cannot compare today's models and methods to that. And yet, for hard coded decision making, the absolutely incredible GPT-4o is still inadequate. Let me know when it stops relying on Wolfram Alpha to do basic math. The future is a generalist model layered with APIs and/or, say, a model that was trained exclusively on code. Hallucinations increase with reasoning capacity. If they can create a generalist model that codes and does math perfectly, they have created AGI.
@@garrettmillard525 "If they can create a generalist model that codes and does math perfectly, they have created AGI" - I've been saying that exact thing for years :)
It's also why I'm not "scared" about AI taking programmers' jobs, since if it's able to do that, we will have much, MUCH scarier things to worry about, like existential-level problems ;)
By YouSum Live
00:00:00 Post-training improvements enhance model quality.
00:00:34 Increased compute investment leads to significant gains.
00:01:50 Complexity of post-training requires skilled personnel.
00:02:19 Continuous R&D effort needed for model functionality.
00:02:53 Serious pre-training efforts correlate with post-training success.
00:03:12 Distillation and cloning reduce uniqueness in model development.
00:03:47 Experience across various AI domains aids in research success.
00:04:20 Curiosity and empirical approach crucial for effective research.
On enormous complex task coding benchmarks, gpt-4-1106-preview and gpt-4-0125-preview are astronomically smarter than the newer versions in April and May. I'm not saying slightly, I'm talking about 98% accuracy versus 26% accuracy.
Proof?
No it's not. The first version of it was something else.
I also remember the first GPT-4 version at launch was the best
They did a bunch of tweaks since then. I assume they didn't have enough compute power for the volume of demand. They also had different levels of intelligence for different users: I had two accounts, and GPT-4 on the first one was producing outputs that were consistently worse than on the other one, using the same prompt. Plus, the first one didn't have as large a context window as the second one.
It’s noticeably better from my anecdotes. Much better at problem solving, math, discrete math/logic.
I’m not denying that you may be having a worse time, but not for me. I’m curious as to which ways it’s gotten worse for people.
@@openviewtrading7024 From my anecdotal experiments with it for coding I also think it got worse.
It's kinda still just as wrong and buggy as before, but now it's much better at gaslighting you as to why it is actually right ;)
The safeguards increased to limit how much copyrighted material it would reproduce. Originally it would give very detailed responses about engineering codes and standards. Now it doesn't.
I think the Elo he's talking about is user satisfaction. That doesn't necessarily mean it's smarter, just that people like the responses more.
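For anyone unfamiliar with the term: Elo ratings on chatbot leaderboards are computed from pairwise human preference votes, not from any direct measure of capability, which is the commenter's point. A minimal sketch of the classic Elo update, assuming the standard 400-point scale and a hypothetical K-factor of 32 (real leaderboards tune these or fit ratings by regression instead):

```python
# Sketch of the classic Elo update applied to pairwise model preference votes.
# K=32 and the 400-point scale are illustrative assumptions, not any arena's
# actual configuration.

def expected_score(rating_a: float, rating_b: float) -> float:
    """Predicted probability that model A's answer is preferred over B's."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def elo_update(rating_a: float, rating_b: float, a_won: bool,
               k: float = 32.0) -> tuple[float, float]:
    """Return updated (rating_a, rating_b) after one preference vote."""
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b

# Two models start equal; one user prefers A's response in one comparison.
a, b = elo_update(1000.0, 1000.0, a_won=True)
print(a, b)  # 1016.0 984.0 — A gains 16 points, B loses 16
```

Note that the rating only moves on *which answer users liked*, so a model that gives more agreeable (but not necessarily more correct) responses will still climb.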
The response quality feels like it has continued to degrade.
I hear nothing but big nothingburgers from this guy. Typical cofounder/CEO-type stuff; he can't say anything that'll hurt the company and is on eggshells the entire time. Still crazy you got him on. I'm more interested in less involved people, though, who can actually talk about stuff.
Is it though?
Good question, next question.
@@fabianw2k what?
Uh…ummm..uh…ummm..duh
Sundar has left his cookies all over this video.... and ur website
He should use AI to remove all the “uh’s” and “um’s”
I really love all your interviews, but not this one. The Sarah Paine one was exceptional, bringing some order to a very complex topic. This interview does the opposite. AI is a fairly straightforward topic compared to Paine's world politics, but the constant use of terminology, abbreviations, and lingo makes it almost impossible to follow for someone not in the field.
Lol what lingo? RL? Mode? ELO? These are absolute basics. If you don't know those terms you probably aren't going to enjoy an interview with an incredibly technical guy. This is a 5 minute clip from a 90 minute interview, I'm sure you would be given better context if you care to watch it.
But that is a lie, it's not smarter, it's just better at processing data. It still doesn't know if a cat is a cat. It has zero intelligence and can't tell the difference between even 0 and 1.
But "smarter" is one word in a YouTube video title instead of four words. If you applied your human intelligence you would be able to interpret that and understand what they meant.
You could say exactly the same about humans. At the end of the day it's just a token probability predictor that learns patterns in the data, sure. However, that doesn't stop it from being able to replace and automate practically any job that is text based. Some of its features are already greatly superior to a human's, like its working memory.
Why does he take 50mins to swallow. It's distracting. Same with the chick who did the chat 4o release presentation the other day. Can you get these people some water and force them to drink it ty
😂😂
They’re tech nerds. Probably a little nervous
You benefit from the incomprehensible brilliance of these people. They are too smart for the human body. Have some respect and ignore the quirks.