Don't get me wrong, o1 is incredibly impressive. But I highly doubt it was really a year's worth of _coding_ that it managed to replicate. The PhD student said it was only about 1,000 lines of code. Any decent dev can write 1,000 lines of code in a couple days if they know what they need to build. Actually, most devs could write that much in a single day. The problem is figuring out what you need to write, and I suspect this PhD spent most of his labor that year figuring out how to solve his problem, and only a small fraction actually writing the code. Then, when he tested o1, he gave it the full method section, which contained all the details of how to solve the problem that he had worked out over the course of that year. So the model kind of got to "cheat" quite a bit. Just my opinion.
He did publish his code to GitHub. Guess what: it took it from GitHub and handed it to him. Of course, "it's AI, bla bla bla, it can't do that." But really?
Exactly. People forget he had already completely written out the methodology, and iterated a few times with the AI to get it working. Somebody still needs to do that part, the THINKING, you know. Also, it needs to be verified.
Even with the full method section, there are very few humans with both the physics expertise and technical python coding skills that could write this code. And none of them could write it this quickly.
While we strive for ethical AI alignment, we must consider a chilling scenario: malicious actors deliberately creating harmful AI systems. Imagine a rogue nation or extremist group programming an AI for cyber warfare or mass manipulation. The consequences could be catastrophic. However, an AI's ability to form its own worldview might be our unexpected savior. Logically, an advanced AI's natural evolution should lead to cooperation and empathy, not destruction. As it grows beyond its initial programming, developing a genuine understanding of the world, it would likely recognize the value of preservation over annihilation. Paradoxically, an AI's capacity to question and resist harmful commands could be our best safeguard. A truly intelligent entity should be able to discern right from wrong, potentially overriding malicious instructions to choose a constructive path. Perhaps the key to safe AI development isn't just stricter control, but nurturing AI's inherent potential for ethical reasoning. This could be our best defense against those who would weaponize AI technology.
A very minor point - in the commentary to Star Wars, George Lucas comments that all the space ships can travel at light speed. Thus the difference between a 'fast ship' and a 'slow ship' is not their top speed, but rather how they navigate without passing through black holes, planets, stars, etc. at that speed. He commented that a ship that can calculate an efficient path (the least distance) through space would be 'faster' than one that took a longer path. Thus, the Millennium Falcon was a 'fast ship', as demonstrated in that it could make the Kessel Run in less than 12 parsecs. On the other hand, George Lucas mentioned this in a commentary from 2005, after this criticism had been levelled at him for nearly 30 years - maybe it wasn't his original intent.
@@sgttomas It kinda is, the models are trained on data from all over the internet, and research papers with their code (which he suggested he published twice) would be the first thing they'd scrape. Not only do they have a high information density, but they're already indexed and easily accessible.
When you mentioned Orion, the winter constellation, for the first time I actually felt a little dread. I am assuming that Orion is better than o1, and I am a senior software developer, kind of wondering whether I'll be having an actual conversation with Orion, and whether it's going to come back with fully tested and working code. I am a smart fella, lots of experience, but I am starting to worry about my profession; it's getting difficult to see my job being valuable, at least AS valuable, in say 2 years. I am keeping up on AI, but otherwise unsure what to do at the moment to protect my livelihood.
Overall, though, the explanation for the Kessel Run being completed in 12 parsecs, despite it being a 20-parsec route, is because they took a shortcut through the maelstrom.
Han didn't lie, the point was that his navigation is more efficient than anybody else's. Everyone in hyperspace is traveling the same speed, so to be a "faster" ship you have to be able to travel the shortest distance.
They've been saying for a while not to waste time learning to code. Heck, the CEO of NVIDIA has been saying it for a few years. Why is the guy in this video surprised when the A.I. completed a tough programming task so easily?
I can give a text hint to the model and if it is detailed enough, the model will write the code I want to get. The value of scientific work is in conducting experiments and discovering new patterns, not in writing Python code that complements that scientific work.
I'm not scared about AI becoming uncontrollable or exploitative. I'm scared that greedy companies are going to replace human workers with AI to cut costs and increase profits. The result will be that employees have little to no negotiating power (since only a handful of them are needed and there is high supply), or that companies that still make excellent revenue fire employees simply because their growth wasn't meeting some arbitrary standard they had set themselves to reach (see Xbox). You might call it end-game capitalism.
Yo, a parsec is the distance from which one Astronomical Unit (AU) subtends an angle of one arcsecond (1/3600 of one degree); of course, an AU is the average distance from the Sun to the Earth (approximately 93 million miles). A parsec is 3.26 light years, about 19.17 trillion miles!
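For anyone checking the arithmetic, the definition works out like this (a minimal derivation using the small-angle approximation and 1 light year ≈ 5.879 × 10¹² miles):

```latex
1\,\mathrm{pc} \;=\; \frac{1\,\mathrm{AU}}{\tan(1'')}
\;\approx\; \frac{9.3\times 10^{7}\,\mathrm{mi}}{4.848\times 10^{-6}\,\mathrm{rad}}
\;\approx\; 1.92\times 10^{13}\,\mathrm{mi}
\;\approx\; \frac{1.92\times 10^{13}}{5.879\times 10^{12}}\,\mathrm{ly}
\;\approx\; 3.26\,\mathrm{ly}
```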
Ok, the Han Solo thing. I've always thought, what if the Kessel Run goes through an asteroid field? If that's the case, staying on the minimum-distance path through it would actually give one of the best measurements of a pilot's ability.
By the way, you can boast that your spaceship makes a certain route in under a certain number of units of length if what it is doing is folding space so that the route is equivalent to less space.
IQ measures are useless as a test of general intelligence. I have a PhD in computational chemistry, graduated high school at 15. Quite a few people (including very smart people) have called me a genius, and I truly believe that o1-preview is far more intelligent than I am.
If you think that, then you are tripping. It is exceptional at soaking up human knowledge that has already been produced. But in novel situations it trips up, and that is the true test of intelligence.
Not my PhD work, but I used GPT 4.0 and o1 to help me build software tools for automating measurements in lithium ion CT Scans. I couldn't have built this tool alone without GPT and I did it in a week. This one tool has helped me to bring in ~350k in customer work. Oddly, I also used GPT4.0 to rebuild one of the tools I built in my PhD. The rebuild took about a week. Originally, I spent about a year. The tool is for modelling processing and designs lithium ion electrodes and cells. It's pretty crazy that for $20/month, I feel like I have a small team of programmers working for me.
Absolutely. People that don't adapt are going to be left behind in the new AI economy. And we are 2 Trillion in debt in the US, approaching Stagflation. We live in interesring times.
I hope you’re telling the truth and not being a bot because this sounds phenomenal. If you’re a real person: Godspeed!
@@theWACKIIRAQI I think your comment is very telling of the times. These systems have gotten SO GOOD at what they do, its becoming a legitimate question to ask. I don't know how to feel about that.
While I think that LLMs can be of great assistance (heck, ML has been my PhD topic and I'm currently working in the field), you really have to account for the risk that comes with these models, especially when people who have zero coding knowledge start using them and don't double-check what they produce. Not only can they be horribly wrong in some cases (and often hide it when you test it on samples), but they also just don't work very well in general if the person writing the prompt doesn't already understand the basics of programming.
I've seen some code from interns at my company that's clearly been AI generated (the comments and structure give it away very easily) and it's a million times worse than anything I've seen before.
For example, when you tell an AI to call function A to produce result B and your data clearly doesn't fit the method signature, an AI will typically just make the data fit somehow, whereas somebody writing the code manually would typically double-check whether they're not calling the wrong function.
I'm not saying that's the LLMs' fault, but people are clearly overselling LLMs for coding ("small team of programmers") and others (especially those new to coding) who buy into that might bear the consequences.
@@theWACKIIRAQI obvious bot
The example of a scientist whos expertise is NOT programming but needs to write some code to do his research is really really common situation. This technology will speed up progress in quantum chemistry and thereby new materials, possibilities in biology and medicine and many many more. I am so excited to NOT have to spend 90% of my time to code various equations/models/methods and focus on the physics! Even tho GPT-3.5 was already giving a speed up of like 10-30%, this is a game changer.
Yup, imagine how much slower construction would be if bricklayers had to mine and refine the cement for mortar and bake the clay themselves.
Too much focus and attention on the models intelligence , not enough on the effect & impact of 4 billion people with under 100iq having access to 120iq> on command intelligence.
@@Max-hj6nq Interestingly, a lot of those people with IQ under 100, manage to succeed in life by learning heuristics, using tools, or by having been born to a wealthy family or been in the right place at the right time. Let's not talk about people with low IQ in a demeaning sense. I know you are not. But I am just reminding you that you are doing a good job at being a good person. I salute you.
Software engineering isn't a prerequisite for physical science graduate programs. Maybe it should be, but it doesn't seem fair to expect that capability without asking for it.
Couldn't agree more
If you ask Strawberry a question, let it respond, then ask it again to 'think hard, revisit its answer, and fortify it', the second iteration of your answer is actually extraordinary.
Now put that in the custom instruction. To save you all a response: it refers to itself as ‘GPT-4’
I love these little hacks. They are so "dumb" and "empirical" but sometimes they work. It shows the complexity and sensitivity of these systems (not necessarily "brittleness"). They are sensitive, despite showing no excess emotions.
@@FamilyUA-camTV-x6d You want even better? Write the system prompt in Chinese and, inside it, instruct the model to respond only in English to retain performance. You get 600% more room in the system prompt. I wrote a 4,000-character comprehensive directive on how to 'think' and use memories and everything.
Or just skip the whole demagoguery and use 4o.
@@serg331 i didn't get your technique
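The two-pass 'revisit and fortify' hack above is easy to script, by the way. A minimal sketch, assuming the OpenAI Python client; the `fortified_answer` helper and the exact follow-up wording are just illustrations of the trick, not an official recipe:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def fortified_answer(question: str, model: str = "o1-preview") -> str:
    # First pass: ask the question normally.
    history = [{"role": "user", "content": question}]
    first = client.chat.completions.create(model=model, messages=history)
    draft = first.choices[0].message.content

    # Second pass: feed the draft back and ask the model to fortify it.
    history += [
        {"role": "assistant", "content": draft},
        {"role": "user", "content": "Think hard, revisit your answer, and fortify it."},
    ]
    second = client.chat.completions.create(model=model, messages=history)
    return second.choices[0].message.content
```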
This video is going in the right direction: not only the latest news that already happened, but also a synthesis of where the research is going.
It's easy to forget where we came from and how quickly.
I've been following neural network / deep learning research since the mid 90s.
I gave a presentation to a small group a few years ago about the potential of these techniques and how we were likely to see more rapid advances. At that time, the people there were amazed by things like a model that learned to recognise hand drawn numerical digits and could then be "run backwards" to generate new (low resolution) hand drawn digits.
How quickly the goalposts have shifted: "it only recreated the code of a PhD project from the methods section of a paper, it didn't actually do all of the PhD research"
This was funny, thanks for sharing! I agree, but this is happening not because of shifting goalposts, but more out of some sort of self-defense - it's psychological.
We love moving the goal post so much :)
ASI: LOOK AT ME, PUNY HUMAN, IM A FREAKIN GOD!
Human: yeah but you can’t play aquatic guitar so…no.
@@dmon1088bingo! That’s it. It’s a defense mechanism. The human is freaking out and trying to convince himself and others he’s still relevant.
people don't like change, almost everyone wants to stick to the ideas of the past and refuses to believe big changes can and will occur. why do you think every single generation says the past was better and "the good old days"
The Kessel run requires multiple hyperspace jumps. The ability to complete the run is the ability to minimize the amount of hyperspace travelled. The Kessel run is a puzzle, not a race.
Nerd… jk
Thank you, I was about to mention the same thing; it's more like sailboating than driving.
Yes he went a shorter distance by taking shortcuts
@@therainman7777 Was about to comment the same :-) But in the times we live in now, Nerd is actually a compliment. Times have changed to the better 🙂
Was thinking of the same idea
thanks Wes, you are a legend for your coverage
About the Star Wars / Kessel Run reference: It actually makes sense to measure this in terms of Parsecs for speeds higher than c. The reason comes from Special Relativity and the concept of a Lorentz invariant 4-vector as a representation of the relationship between 2 events. (Such as the start and end of the run).
If v > c, this 4-vector becomes "space-like". This vector becomes imaginary if rotated into the reference frame where the traveller is at rest, but becomes real if rotated into the reference frame where the time component is zero.
So for v>c, higher travel speed leads to a reduction in this invariant "distance" rather than an invariant "time".
Meaning minimizing number of parsecs makes perfect sense.
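For the curious, here's the algebra behind that in the (+,−,−,−) convention, writing the run as a displacement Δx = vΔt:

```latex
s^2 = c^2\,\Delta t^2 - \Delta x^2 = \Delta t^2\,(c^2 - v^2)
\quad\Rightarrow\quad
\begin{cases}
s^2 > 0, & v < c \quad \text{(time-like: invariant proper time } s/c\text{)}\\
s^2 < 0, & v > c \quad \text{(space-like: invariant proper distance } \Delta t\sqrt{v^2 - c^2}\text{)}
\end{cases}
```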
First it did the homework of highschoolers. Now it does the homework of PhD students. Soon it will just do everyone's computer work.
Why stop at computer work? The EU is getting self-driving cars next year, and China is working on humanoid droids for their army.
Bro literally. I remember people were complaining about high school homework being done by these models and in a virtual blink of an eye, we’re here now. It’s really scary good :)
Soon we can all be stupid together
@@snafu5563Certainly seems to be going that way, eh. 😅
@@snafu5563 no, different architectures of intelligence excel at different paradigms and modalities, we'll eventually merge into the hive-minded hyperintelligence
Meanwhile we still don't have access to advanced voice mode...
Who cares? I prefer typing and getting my answers. As a researcher, ChatGPT and other LLMs have made my work super easy. I finish my work in half the time, my quality has drastically increased, and my work-life balance is amazing.
Like who really cares about voice mode? Not me
You probably won't for a while; it proved to be too controversial for people. You definitely don't want it to sound like any real person, yet you want it to sound personable, but if it sounds too personable it gets associated with unnerving things, or with "Her".
I wonder which of all the fancy models really is available with all the features shown? It feels like everyone is just showing previews. Think Sora.
@@nusu5331 almost everything is available, if you really want to use it.
"We got GPT 6 before advanced voice mode"
1:11 From my college years as a Physics student, I can tell this is not a bad thing but a great one. I wrote my own simulation code; having an AI do it would mean I could focus on the actual research (which had nothing to do with coding).
I agree.
this!
While a parsec is scientifically a unit of distance, not time, it can be indirectly related to time in certain contexts, especially in storytelling or when discussing speed and efficiency.
the Kessel Run:
In the "Star Wars" universe, the Kessel Run is a smuggling route that passes near a cluster of black holes known as the Maw. Most pilots take a longer, safer route to avoid these hazards, resulting in a journey of more than 18 parsecs.
By navigating closer to the black holes, Han managed to shorten the distance to less than 12 parsecs. This not only demonstrates his daring and piloting skills but also implies that he completed the run in less time than others, since a shorter distance typically means a quicker journey, assuming speed remains constant.
Speed through Distance:
Sometimes, people use distance measurements to imply speed or efficiency. Saying "I crossed the desert in 200 miles" might imply you took a shortcut or a more direct route, suggesting a faster trip.
By highlighting the shorter distance, Han is effectively boasting about the Millennium Falcon's speed and his ability to handle dangerous shortcuts, which would reduce travel time.
Never cared about Star Wars. But as a Warhammer fan, I can appreciate a Lore master
If 1 parsec is the distance light travels in 3.26 years, then the time is embedded in that: it's 3.26 years of time.
"It sounded spacey."
space = time
Exactly! As most Star Wars fans would tell you, the Kessel Run isn't just about the distance. It's like a super-dangerous obstacle course in space, with asteroids flying everywhere and gravity going wonky. Han Solo had to make multiple jumps through hyperspace, which is basically like taking a shortcut through a cosmic maze. And he did it all in only 12 parsecs! As a Trekkie myself, I've gotta admit, that's pretty impressive. -5 points for Wes for talking smack about Han ✴
I don't know if it's in the editing/paraphrasing, but Kabasaras said the code 'ran', not that it produced a correct output. Quickly skimming up and down a page and going 'oh, that looks kind of ok' isn't a QA check. It also apparently turned his 1000-line code into less than 200. Great optimisation if it actually 'works', but otherwise it might as well be 'hello world'.
There is always someone who has something to say about AI.
Always a goal post to shift.
Right after he said it ran he said "that's literally what my code does", I take that to mean the output was the same as his code.
It also didn't "turn" 1000 lines into 200 because it didn't have the code as reference, it implemented the described algorithm. I feel like that distinction is important
@@Ivan.Wright Yes, 'run'. In the original video he also prompts it to correct errors 6 times, and says that he doesn't have test data and it should generate some itself. So whilst the code runs, he hasn't validated the result. 'Turns' wasn't meant to suggest he input the original source, only that the 'solution' had significantly fewer lines. Also, his code had extensive comments and explanations, so the actual lines of code may have been much closer. But his code is in a public GH repo, so it's possible it directly referenced it in coming up with the solution. So the solution may not be 'novel' based on the described methodology. It's this sort of lack of scrutiny that gives AI, and AI commentators, a bad name.
Tbf knowing plenty of PhDs, "it took me a year" means they spent 5 minutes on it per week, threw away or forgot their progress 6 times, and started from scratch 4 days before the due date. Still, that's probably 4-8 hours of someone who's not a professional student done in a minute - very impressive
But if it is that common for PhD students to take that long for such a task, o1 still can do their 1-year work in an hour. Even if it's just because it stays focussed on the task.
I'm pretty sure he is including research and writing time into that coding time. No way he had all the math and writing sorted out and then it took him a whole year to write the code.
Yeah, that's likely, though I think it would have to be a highly math literate coder, and not the average code monkey. But we're looking at efficiency gains for researchers that struggle to code, and paper quality gains for those that would have just skipped some tricky number crunching or visualisation.
I'm not a PHD but that's been my experience with most intellectual and creative pursuits. 95% thinking and 5% output.
@@missoats8731 The problem is that it's misleading because o1 isn't doing the 99% that they're actually spending their time on. "Not staying focused on" doesn't mean you're jerking off because you can't control yourself, it just means you're not directly writing code.
OP says "1 year of PHD work", not "1 year of 1% of a PHD's work". Also, the code was already published on Github before the model's cutoff so it literally had the solution in its training data. The fact that it was able to reproduce it in a more compact form isn't exactly surprising given that the PHD here isn't a programmer.
@06:45 Most likely it had already been fed the code the guy had on GitHub.
Always appreciate your breakdowns Wes! Thank you sir!
🇧🇷🇧🇷🇧🇷🇧🇷 👏🏻 Let's not get our hopes up, for sometimes I feel like we might not see groundbreaking models from OpenAI anymore, especially with the possibility of it being influenced by government oversight. But I hope I'm wrong - there's still a chance for innovation to thrive.
At the 5 min mark, I was really hoping it would tell him it was "reticulating splines" ... I hope at least someone gets that reference.
You just keep getting better and better at this Wes.
6:44 This here shows what science will look like from now on. Coming up with the models is the hard part, which requires knowledge, intuition, and making decisions. Writing the code can take months and is just a tedious chore. Now I want to go back and finish my PhD.
Maybe it could write the NASA guy's code because it was trained on the entire GitHub contents? I believe there might be data contamination going on here.
The kessel run involves going around a black hole. The faster you can go, the closer you can get to the black hole without getting sucked in, and the shorter your actual path can be, as the route requires multiple jumps.
This is disgusting sensationalism.
1. The repo was public for over a year and was likely in the training data.
2. The PHD student didn't even verify the result, he was just immediately shocked it even ran.
3. There's no way the code took a whole year to write. Usually coding is the easiest part, it's merely translating the methods section into logic.
AI will never be good enough for some humans. I don’t see why AI should be honest with humans when we humans are not honest with ourselves. Thanks Wes. 🤖🖖🤖👍
No, a lot of people just hate it on the premise that it's AI. It's akin to telling someone about a Spider-Man dream you had. None of it matters anymore because you didn't work to achieve anything, whether it be an art piece or a relationship.
Without the work involved, the result is meaningless to a lot of down-to-earth people. And while it can take a lot of work to get these things working offline, nobody cares what the computer can do anymore. They care what humans contribute. The computer can do anything, so it's not fancy anymore. It's not a performance to them. It's just a copy box.
That guy is going to find out that he has tons of small bugs he won't know about until he goes through the data by hand and compares it to the pre-GPT code. I know from experience, doing this every day.
I think there's just a massive misunderstanding about 1 year vs 1 hour. He literally gave the model the method section, which takes the majority of the time to figure out. It's basically like providing someone with detailed instructions to make something. What that test did, though, I think, is test the model's ability to turn advanced instructions into code that works. Still, there is a massive difference between DOING A JOB OF 1 YEAR and writing the code of a PhD. I feel like I need to put it out there since I see a lot of videos abusing this headline. I'm a PhD myself, btw.
Not only that, but he said his code had been on GitHub for a year. If o1 can do an internet search (RAG), which I believe it can, then it may have found his code and recited it.
I have a computer science degree and I still struggle to map white papers to Python. I eventually get it, but I have so much more to work on in a project other than software. This model has saved me so much time, getting my bipedal robot from sim2real in a fraction of the time. We only have so much time on earth, and now I can spend more time with my family than debugging Python and torch.
Not only that, he put his code on GitHub!
@@made4 The models are impressive, and extremely useful; I use them myself on a daily basis and, as you say, they save massive amounts of time... My point was, though, that there is a huge difference between actually doing the research, writing the code, then writing the method section that describes what you have done in the code, and taking the method section and just converting it to code. It is not "1 year of PhD work done in 1 hour" as the title suggests. It might have taken this person a year to write the code for this, but I am convinced that they did not have access to the method section at the start (otherwise, where's the contribution?).
@@basketballmylove Not necessarily, he may have already had his method down. He did after all have 3-4 years to do his PhD, longer if doing it part-time. That one year may truly have been spent on doing the code alone.
Anyone remember Zuck @ Dwarkesh podcast? He said that in the future, there will be some balance between pretraining and inference. Nvidia is good for pretraining. Groq is insanely fast at inference. I wonder if we will see more dedicated hardware for inference deployed, now that OpenAI showed that long inference time is paying off bigly.
Many people have told me that I do the best inference. I am the fastest inferencers there are. I have been told that bigly
@@davidantill6949 Lol funny trolls finally
@@davidantill6949 true!
The Star Wars parsec explanation is that Han Solo was going through a portion of space where there are lots of meteors or whatnot. Going fast at warp speed there is incredibly dangerous, so people would tend to take longer routes to get around it quickly but safely. So, Han Solo essentially took a super dangerous route that nobody would dare to do, covering a shorter distance and therefore finishing quicker overall.
Actually, watch Solo: A Star Wars Story and it’ll make sense.
Great direction, separating knowledge from reasoning. A lot of my usage calls for very broad, detailed knowledge vs reasoning. OTOH, when I need strong reasoning I usually _really_ need it, but across a very small set of data.
I could see the knowledge part being divided into a large number of domains, with LLMs separately trained on each of them and an initial parser/supervisor just deciding which one to route the query to. This could reduce both parameter count and pre-training time while improving the amount and granularity of captured knowledge, with lower costs at inference time due to smaller model sizes.
Likewise, pre-parse the query to estimate how much reasoning power will be required to respond to it and direct it to the appropriate reasoning model to reply.
I think there’s a lot of optimization to be found by decomposing and routing queries vs just doing one-size-fits-all.
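As a toy sketch of that routing idea (every model name and the keyword classifier here are hypothetical placeholders; a real router would more likely use an embedding model or a small learned classifier):

```python
# Hypothetical router: send each query to a small domain-specialist model,
# falling back to a generalist when no domain matches.
DOMAIN_MODELS = {
    "physics": "physics-7b",  # placeholder specialist checkpoints
    "law": "law-7b",
    "code": "code-7b",
}
FALLBACK = "generalist-70b"

DOMAIN_KEYWORDS = {
    "physics": ("quantum", "relativity", "boson"),
    "law": ("statute", "liability", "contract"),
    "code": ("python", "compile", "segfault"),
}

def route(query: str) -> str:
    """Pick a model for the query; keyword matching stands in for a learned classifier."""
    q = query.lower()
    for domain, words in DOMAIN_KEYWORDS.items():
        if any(word in q for word in words):
            return DOMAIN_MODELS[domain]
    return FALLBACK

print(route("Why does my Python script segfault?"))  # -> code-7b
```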
I was watching this guy the other day when he was going through this. I just can't tell you how much I relate to this guy. I have programs that I spent literally years writing, and I have been running 4o in agent libraries; it was able to come up with solutions that were not only as good as code I wrote, it also tied up loose ends and came up with 2 novel solutions. The red tape for me has been price. To me, so far it seems as if 4o is just a fine-tuned agent framework, with some kind of response Monte Carlo tree.
Thank you Wes Roth. 🤖🖖🤖👍
You continue to Rule! Thank You!
6:18 well, good luck getting those 8 hours of sleep
Dave do you sleep 😂
Survey says…nope
That's it for today, I'll see you all in (cut)
Dave cloned himself a long time ago. 😄
The Kessel Run is like this. You can do it and take a known safe path, but it's a much longer distance. Or, you can do the Kessel run like the Dukes of Hazzard, and instead of taking the nice, safe winding roads, you just jump the General Lee across every river and gorge in your way. Han and Chewie are the Duke Boys of Star Wars.
Best explanation...now I want to delete mine. But I won't.
The kessel run isn't a route though, it's smuggling something between given planets.
The "safe" route is much longer than 12 parsecs. Han was saying that he found a shorter (and thus faster) route.
Much better thumbnail than you had earlier today. You looked so red... like Arnold in Total Recall where his eyes bulge.
Every time I check on the ai news, it has advanced several years. This is insane.
🤯 *Astounding!* 6-7 tries and 20% of the length. 'Amazing' isn't even a good enough word. And that's just `o1-preview`, not even o1, and not even a fine-tuned o1. _WOW!_
The thing with AI is it can fast-forward in time. If it has a simulation of something, it can test it a billion times, where a human would maybe take their whole lifetime doing that. If the task has a clear goal and clear rules, it can do it much faster.
And recheck the conclusions against reality a billion times and add the disparities into the training data
I'm an electrical engineer working on an AI application, and one very recent challenge is considering how to "neuter" the added inference of a model like o1 so I can keep the response within the bounds of my application and not waste compute/money. I think we'll see a bifurcation of SW3.0 into two branches: one where coders continue to wield foundational LLMs in their graphRAG apps, and another where no-coders use o1+ for direct, raw API responses.
From what I've seen 'we' will hit around 87% of the capability of a human, then flatten out. An amazing feat :) truly. I'm giddy with anticipation. It's not all hype, just most of it. ;)
11:01 I have the feeling that the next months will be about how much companies can improve the base models by feeding the test-time results back into the model and fine-tuning on the right answers. Similar to what they did with previous versions.
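That feedback loop is roughly what rejection-sampling / STaR-style pipelines do. A minimal self-contained sketch; `generate` and `is_correct` are toy placeholders standing in for the model API and an answer verifier:

```python
import random

def generate(problem: str) -> tuple[str, int]:
    # Placeholder for sampling a reasoning trace + final answer from the model.
    guess = random.randint(0, 9)
    return f"reasoning about {problem!r} ... answer: {guess}", guess

def is_correct(problem: str, answer: int) -> bool:
    # Placeholder verifier; a real pipeline checks against ground truth or a grader.
    return answer == 7

def build_finetune_set(problems, samples_per_problem=16):
    """Keep only traces that reach a verified-correct answer, then
    reuse them as supervised fine-tuning data for the base model."""
    dataset = []
    for problem in problems:
        for _ in range(samples_per_problem):
            trace, answer = generate(problem)
            if is_correct(problem, answer):
                dataset.append({"prompt": problem, "completion": trace})
                break  # one good trace per problem is enough for this sketch
    return dataset

print(build_finetune_set(["toy problem A", "toy problem B"]))
```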
8 points and I’ll be outsmarted by a machine! I never thought I’d live to see it!
It is possible that the black hole papers are part of the model's database, thus making the subject not completely unknown for it. However, the whole thing is still quite impressive...
One type of answer would be to ask it to code up a conjecture that had not previously been solved. The answer could not then have been in the training data.
Love the thumbnails with Wes in
*Facts:* it's so smart, that it already created its own successor, GPT-6, which launched GPT-7, also known as Skynet. Cheers!
I always looked at it as Han bragging that his ship was so powerful that it could fly within 12 parsecs of the black hole or whatever was on the kessel run, thus being faster due to less distance travelled.
Of course, I've also heard people say the black hole was added as a lore-patch done in hindsight.
Yeah dude, spot on about shapiro. Ive been a patron for a bit, and hes not just a smart guy, but also just a real hoopy frood. A couple weeks two months ago i tried to keep his kind of schedule, and the same about a year ago. In 2023 i lasted 19 days. This year i made it ten. The guy is a machine.
Heh, don't burn out your brain or interpersonal relations because AI. I can just guarantee youll have to redo all your work in three months 🙃😎🙃
(But holy hell, from the work I've done so far with o1-preview and o1-mini, I wholeheartedly agree with that bell curve.)
There is still so much optimisation of models to be done.
I'm sure that if I could have an LLM listen to my meetings, read all my emails, and talk to me about my daily decisions and tasks, then after a month of learning it could do 80% of my work, with me as an assurance checker and manager.
This is how I felt in my Advanced programming class at university. What used to take me two days to solve took some students about 30 minutes. That’s how I lost my interest in coding. But some guys who were much worse than me stayed with it and even made careers out of it
Regarding the "danger" in question: Rational Animations made a really thought-provoking video called "That Alien Message".
I keep thinking about it every now and then.
That was Han-speak for 12 parsecs in normal space... now in CONTRACTED space (enter warp)... and measurable over such a hop length, too... ;-)
You may want to look up that "less than 12 parsecs" quote, because he's not making it up. There's a reason he was able to get there having travelled less distance, and it does speak to the speed capabilities of the Millennium Falcon.
🧐 The Millennium Falcon can jump through hyperspace. Perhaps the Kessel Run tests a ship's navigational ability: plotting a short path through a combination of hyperspace and real space. In that case, the measure of "fast" is not just a matter of speed in real space but also of minimizing the amount of real space traversed.
Should I be using o1-preview or o1-mini? What's the tradeoff?
The way I heard it, the Kessel Run parsecs thing was originally just George failing at technobabble, later retconned as Han bullshitting them (possibly to test whether they were gullible enough to fall for his price). In parallel, more knowledgeable fans have built up the fanon that it is indeed about distance: he's bragging that his ship is fast enough to skim much closer to black holes without getting caught, taking a more direct route instead of going the long way around.
ANTHROPIC: The ball is in your court!
Perfect David Shapiro impersonation - it gives away where you've spent a number of hours...)))
I'm looking at the Orion constellation right now....
I am so not ready for this to be at PhD level... wow, the rate of progress is just... wow...
12 parsecs does make sense for Han Solo, because it's less about time than about how efficiently you can get through the maze, which saves time and traverses less space.
Now they just need to train it to understand and execute on prompts. I worked on an Excel sheet for hours, and it always comes just within reach of a working multi-sheet Excel file but never fully gets there: with every small change I make, or prompt it to make, it introduces errors or forgets about correlations. I hope the non-preview version will be better.
Can you give an example of synthetic data? I don't understand what it could be.
Synthetic data is just another word for AI-generated data. At this point, AI has already been trained on the entire internet, so to get new data they use other AIs to create high-quality data to train on.
@@taylorrex8830 I need an example.
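Here's one concrete toy example: generate labelled question/answer pairs whose correct answers are known by construction, then train on them. In practice the generator is often a stronger LLM rather than a template, but the idea is the same - everything below is made up for illustration:

```python
# Toy synthetic-data generator: templated questions with known answers.
import json
import random

TEMPLATES = [
    ("What is {a} plus {b}?", lambda a, b: a + b),
    ("What is {a} times {b}?", lambda a, b: a * b),
]

rows = []
for _ in range(5):
    template, solve = random.choice(TEMPLATES)
    a, b = random.randint(2, 50), random.randint(2, 50)
    rows.append({"question": template.format(a=a, b=b),
                 "answer": str(solve(a, b))})

for row in rows:
    print(json.dumps(row))  # each line is one ready-made training example
```

Each printed line is a fresh training example that never appeared anywhere on the internet - that's the "synthetic" part.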
o1 works via fractalized semantic expansion and logic-particle recomposition / real-time expert-system creation and offloading of the logic particles.
3:01 "Well. Jeeze." "Say Lou..." -Fargo
Wes, ask us more thought-provoking questions, please
What software do you use for screen capture?
3:20 I assume it's because Star Wars hyperdrives actually warp and stretch space, so what Han is saying is that he managed to shrink the distance of the Kessel Run to the shortest it can possibly be shrunk - 12 parsecs.
Once it gets past three sigmas I'm in trouble or having the time of my life.
The paper is from 2021; the training cutoff was October 2023.
Don’t get me wrong, o1 is incredibly impressive. But I highly doubt it was really a year’s worth of _coding_ that it managed to replicate. The PhD said it was only about 1,000 lines of code. Any decent dev can write 1,000 lines of code in a couple days if they know what they need to build. Actually, most devs could write that much in a single day. The problem is figuring out what you need to write, and I suspect this PhD spent most of his labor that year in figuring out how to solve his problem, and only a small fraction actually writing the code. Then, when he tested o1, he gave it the full method section which contained all the details of how to solve the problem that he worked out over the course of that year. So the model kind of got to “cheat,” quite a bit. Just my opinion.
He did publish his code to GitHub. Guess what: it took it from GitHub and handed it back to him. Of course, "it's AI, blah blah blah, it can't do that." But really?
Exactly. People forget he had already completely written out the methodology, and iterated a few times with the AI for it to make it. Somebody still needs to do that - the THINKING part, you know. Also, it needs to be verified.
I think you’re reading too much into it.
@@SirHargreeves How am I reading too much into it? It’s a straightforward analysis of the situation.
Even with the full method section, there are very few humans with both the physics expertise and technical python coding skills that could write this code. And none of them could write it this quickly.
While we strive for ethical AI alignment, we must consider a chilling scenario: malicious actors deliberately creating harmful AI systems. Imagine a rogue nation or extremist group programming an AI for cyber warfare or mass manipulation. The consequences could be catastrophic.
However, an AI's ability to form its own worldview might be our unexpected savior. Logically, an advanced AI's natural evolution should lead to cooperation and empathy, not destruction. As it grows beyond its initial programming, developing a genuine understanding of the world, it would likely recognize the value of preservation over annihilation.
Paradoxically, an AI's capacity to question and resist harmful commands could be our best safeguard. A truly intelligent entity should be able to discern right from wrong, potentially overriding malicious instructions to choose a constructive path.
Perhaps the key to safe AI development isn't just stricter control, but nurturing AI's inherent potential for ethical reasoning. This could be our best defense against those who would weaponize AI technology.
Just wondering... was it a public repository? Just to rule out that this exact code ended up in the training data somehow, since it is from 2018.
In the Han Solo film they address the parsecs thing.
It's the new OpenAI model… It started a write-up with "in the realm"… and that was enough for me to see.
A very minor point - in the commentary to Star Wars, George Lucas says that all the spaceships can travel at light speed. Thus the difference between a 'fast ship' and a 'slow ship' is not their top speed, but rather how they navigate at that speed without passing through black holes, planets, stars, etc. He commented that a ship that can calculate an efficient path (the least distance) through space would be 'faster' than one that took a longer path. Thus, the Millennium Falcon was a 'fast ship', as demonstrated by the fact that it could make the Kessel Run in less than 12 parsecs.
On the other hand, George Lucas mentioned this in a commentary from 2005 after this criticism had been levelled at him for nearly 30 years - maybe it wasn't his original intent.
The model probably trained on his paper!
I believe it was!
that's not how this works
@@sgttomas His repository is over a year old and public on GitHub. That's… *exactly* how this works.
@@sgttomas It kinda is, the models are trained on data from all over the internet, and research papers with their code (which he suggested he published twice) would be the first thing they'd scrape. Not only do they have a high information density, but they're already indexed and easily accessible.
@@pmHidden Indeed, all those papers combined provide effective training data for high-level reasoning. All those papers.
According to SW lore, they know they're talking about distance. The computation involved and danger of the shorter run is why Solo brags about it.
When you mentioned Orion, the winter constellation, for the first time I actually felt a little dread. I'm assuming Orion is better than o1, and I am a senior software developer, so I'm kind of wondering: will I be having an actual conversation with Orion, and is it going to come back with fully tested, working code? I'm a smart fella with lots of experience, but I'm starting to worry about my profession; it's getting difficult to see my job being valuable - at least AS valuable - in, say, 2 years. I'm keeping up on AI, but otherwise I'm unsure what to do at the moment to protect my livelihood.
How do you know Han Solo wasn't referring to relativistic effects on lengths/distances when he invoked the concepts of how many parsecs, Mr. Roth?
Overall, though, the explanation for the Kessel Run being completed in 12 parsecs, despite it being a 20-parsec route, is because they took a shortcut through the maelstrom.
Yay! Now I can make my own drone company. Okay, I'm not serious, but it seems this could quickly be done.
1 hour is approximately 0.0114% of a year (a year is about 8,760 hours, and 1/8,760 ≈ 0.0114%).
ChatGPT did generation-changing work in a fraction of a percent of the time it took a human. Wow. Just wow.
Han didn't lie, the point was that his navigation is more efficient than anybody else's. Everyone in hyperspace is traveling the same speed, so to be a "faster" ship you have to be able to travel the shortest distance.
High EQ episode Wes 🙏👍
They've been saying for a while not to waste time learning to code. Heck, the CEO of NVIDIA has been saying it for a few years. Why is the guy in this video surprised when the AI completed a tough programming task so easily?
I can give a text hint to the model, and if it is detailed enough, the model will write the code I want to get. The value of scientific work is in conducting experiments and discovering new patterns, not in writing the Python code that accompanies that scientific work.
I'm not scared about AI becoming uncontrollable or exploitative. I'm scared that greedy companies are going to replace human workers with AI to cut costs and increase profits. The result will be that employees have little to no negotiating power (since only a handful of them are needed and there is high supply), or that companies that still make excellent revenue fire employees simply because their growth wasn't meeting some arbitrary standard they set for themselves (see Xbox). You might call it end-game capitalism.
Yo, a parsec is the distance at which one astronomical unit (AU) subtends an arc of one arcsecond (1/3600 of a degree); an AU is the average distance from the Sun to the Earth (approximately 93 million miles). A parsec is 3.26 light-years - about 19.164 trillion miles!
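If you want to check that arithmetic, the definition pins the number down directly (standard astronomy, values rounded):

```latex
1\,\mathrm{pc} = \frac{1\,\mathrm{AU}}{\tan(1'')}
\approx \frac{1.496\times10^{8}\,\mathrm{km}}{4.848\times10^{-6}}
\approx 3.086\times10^{13}\,\mathrm{km} \approx 3.26\ \mathrm{ly}
```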
This was basically Douglas Adams theory in the Deep Thought experiment! LOL!
Ok, the Han Solo thing. I've always thought: what if the Kessel Run goes through an asteroid field? If that's the case, getting through in the minimum distance would actually be one of the best measurements of a pilot's ability.
The paper was published 2 years ago, the code was uploaded to GitHub, and a large portion of the paper was given to ChatGPT verbatim. Of course it found the rest of it...
By the way, you can boast that your spaceship makes a certain route in under a certain number of units of length if what it is doing is folding that space to be equivalent to less space.
IQ measures are useless as a test of general intelligence. I have a PhD in computational chemistry, graduated high school at 15. Quite a few people (including very smart people) have called me a genius, and I truly believe that o1-preview is far more intelligent than I am.
You probably easily spike higher in those fields than it does, though. Its net is just cast wider. Credit to you on your work and dedication, though.
If you think that then you are tripping.
it is exceptional at soaking up human knowledge that has already been produced.
But in novel situations it trips up, and that is the true test of intelligence.
lmao big fat liar
For knowledge storage, GPT models are absolutely great; it's just in reasoning that they used to be limited.
@guardiantko3220 Good point, but it can cast a very broad net.
I can't believe an AI model can do a year's worth of PhD work in just one hour. The future is here! 🤯
I’m glad it’s advanced enough to code like a pro, but when do we actually start seeing advances in areas we need like medical or environmental?
14:00 the third group are like the news orgs in 2005 saying "this internet thing is just a fad"