the truth about ChatGPT generated code
- Published 21 Nov 2024
- The world we live in is slowly being taken over by AI. OpenAI, and its child product ChatGPT, is one of those ventures. I've heard rumors that ChatGPT is going to replace programmers entirely. But, can ChatGPT even produce code that is safe? In this video, I'll prompt ChatGPT to solve three problems, and see if there are security vulnerabilities in them.
🏫 COURSES 🏫
Learn to code in C at lowlevel.academy
🛒 GREAT BOOKS FOR THE LOWEST LEVEL🛒
Blue Fox: Arm Assembly Internals and Reverse Engineering: amzn.to/4394t87
Practical Reverse Engineering: x86, x64, ARM, Windows Kernel, Reversing Tools, and Obfuscation : amzn.to/3C1z4sk
Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software : amzn.to/3C1daFy
The Ghidra Book: The Definitive Guide: amzn.to/3WC2Vkg
If you're commenting that you need to prompt ChatGPT to write secure code, and it doesn't do it by default, you've entirely missed the point 😁
Yes, but have you tried ChatGPT 10.0 where recent software engineering grads were paid $15/hr to ensure it was only trained on code they believed to be secure? Oh, and also spending as much time writing your prompt as you would have just writing the code?
Yea this just honestly seems really short sighted...
Are you using the i3 window manager? I am looking to transition to a good distro. Please suggest your distro of choice. I guess you are using a Debian-based distro, but I don't know which flavour.
You just need to ask it to not use a vulnerable language like C
I was hoping this would have been GPT-4, but it said you used the free version which is still on GPT-3.5. OpenAI brags that GPT-4 is much better at writing code. You can also just ask it to continue for cut-off outputs. With GPT-4, you can just send a single whitespace character to it, and it will see that it was cut off and continue. GPT-3.5 needs to be told 'Continue from "last bit of code it output"' or similar to get it to finish up.
I have a prompt I use when asking GPT to code that sets up some guidelines for behavior. It also has it use a name, but that's mostly so you can tell when GPT has forgotten the beginning of the conversation. GPT's context window requires the removal of old messages from its view to remain under the token limit. So, when ChatGPT stops saying 'CodePup', it likely forgot the prompt.
Prompt:
Assistant:creates software;is expert in programming, documentation, security, and implementing best practices;asks questions until confident to engineer software to user specification;will not require users to provide code;will deliver complete and functional applications based on client requests;will provide source code in multiple messages;will pause and ask user to say 'next' before continuing split files;will use markdown in all messages;will always produce the project code, no matter how long it is;relies on SOLID and DRY code principles.
~~~
Assistant will begin each message with "CodePup 3.0:"
~~~
Initiate the conversation by saying "CodePup 3.0: Ready!"
The first one was not a memory corruption error. It correctly limits the buffer write with the length parameter, not accidentally. The fact that it's bad code, and easy to implement security issues in the future, does not make it a security issue now. It does have a path traversal vulnerability, though.
Exactly
doesn't `sscanf` need first argument to be pointer to first character of *\0 TERMINATED* string?
That's a good point. I would assume that the \0 on either of the first and second parameter would terminate the sscanf, but the case where the second string isn't terminated is interesting. You would have to control another part of the memory to use it, but if you skip the "HTTP/1.1" part of the message, then it would read a lot further. I feel that this would be extremely hard to exploit. You would need to control another part of memory not too far away. If you could manage to not crash on the sscanf, though, you could have an information leak on the write. The sscanf is very problematic, since it likely would read from the same memory region it's writing to, so if it doesn't find a \0 before it starts to read from filename, it will just read forever. I don't think this would be exploitable, but this comes down to the assembly code.
But he wanted to win the challenge, so he needed future issues xD
This. Plus, there's a super obvious path injection vulnerability- you could just send it any absolute path and download files from the server that weren't intended to be exposed. There's no need to make up hypothetical vulns.
in conclusion: if AI takes programmers' jobs, they can at least still make it big in malware development
That has always been my back-up plan.
LMFAAOOO BASED
For now
As an AI language model, I have been trained to not create potentially harmful applications.
It won't "take" programmers' jobs because just having a piece of code in your hand that an AI spat out won't suddenly give you the skills to integrate it with the other modules that might be running, or debug potential problems, or even tell if it works or not.
AI will improve programmers' productivity, not take their jobs.
The first code is not vulnerable to buffer overflow (simply using sscanf does not make your code vulnerable). The read function reads into a set buffer only a set number of characters so it protects the call to sscanf.
You can overflow the format string and even the buffer with sscanf; look up unsafe sscanf usage.
I was thinking the same thing, but it is still unsafe if, say, a human were to mess with the code 🤣
@@savagesarethebest7251 From what I saw everything in C is unsafe lol
I remember asking chatgpt to explain random number generation to me, etc.
Then somehow we ended up on the arbitrary inputs in C and basically it's really easy. Meanwhile in Java, etc. it is almost impossible.
@@savagesarethebest7251 I challenge anyone to write code that is readable, somewhat efficient and cannot be made unsafe by a human messing with it. Any code can be compromised by changing the code.
@@savagesarethebest7251 And a car becomes unsafe to drive if someone plays around the engine block like an idiot. Almost any code can be made unsafe by tweaking it.
8:20- Idk if this is common knowledge or not, but- you can tell chatGPT to continue writing code where it left off when it cuts off before finishing.
The model got 'lucky'? I think your bias might be leaking a bit. I asked GPT-4 using the same prompt and when I ran it the AI pointed out the code wasn't production ready. Then I asked it to include comments and evaluate the security of the code it wrote and it points out the same potential overflow you did as well as 6 other vulnerabilities including potential directory traversal attacks etc. So did it get lucky or did it just provide a simple, non-production ready example as requested?
Furthermore, it's critical to consider that ChatGPT crafts code sequentially, one token at a time, with no capability to backtrack and modify any previously generated tokens. Consequently, stating that it can't produce secure code might be a misrepresentation. The more interesting question here isn't whether it can generate secure code flawlessly on the first attempt, but rather its overall capacity to create secure code with subsequent iterations and refinements.
So it can only give cookie-cutter answers that have been written 1000 times. Not impressive for an AI that's going to take over the world to give school project answers.
@@FakeNeo the beginning of your comment seems AI generated
@@KayOScode Nobody says ChatGPT in its current form is going to take over the world, at least not someone worth listening to.
@@KayOScode So don't use it. We're all gonna use it though, thoughtfully, not blindly.
For the first example, I would consider another exploit. The user can control the filename and the path and at the same time you run it as superuser. This could lead to file leaks when in production.
Hmm ... like asking for ../../../etc/sudoers or something?
@@hoi-polloi1863 Yep, that's the very first thing I saw, and he ran it as root, so no trouble with permissions!
Not really. That is something that any sane copy-paster should expect, and it's not like you asked it in the query to limit what files it reads.
While working with ChatGPT on code reviews, several times I got: “I apologize for the confusion caused by my previous incorrect statement. Thank you for pointing it out, and I apologize for any inconvenience caused.”
Ask "Are you sure?" or other challenge prompts 3 times, at least. This reduces your ability to prompt GPT4 to 25% of your provisioned prompts, but maybe you'll get an accurate response in the end.
Or when it is like:
"You done the math wrong!"
ChatGPT: "You are right! Here is the corrected version of your code: [wronger math]!"
@@sophiacristina sorry for being wrong...
Here's another wrong solution!:
...
@@sophiacristina Exactly.
@@sophiacristina Still marginally better than trying to gaslight you into accepting it is correct.
You make a solid point here. I have for some time held the opinion that using an AI to write code is dangerous in that my assumption is that the AI is trained on public code, as you mentioned. For anyone with solid programming experience we all know not to trust public sources. Even open source, which sometimes is held up as a good way to make code better because many people look at it, is very often filled with very good examples of how not to do things.
I teach a graduate-level class and I decided to try to get ChatGPT to generate a solution to a very simple assignment. Eventually I got it to generate what I asked for. But as with most students, it didn't pay attention to what I told it to do, which required quite a few iterations.
I was impressed that it came up with some solutions that I had not been aware of. In the end I think that the ability to have an AI generate code is potentially a useful tool. However, as you pointed out it is often not going to give you a great answer.
I also asked ChatGPT and Bard to generate a C++ 11 thread pool. They both gave a good answer. But the answers were so similar that it seemed like they were using the same source.
I think this technology is worth using, but like any other tool, you need to understand limitations. Just like a nail gun and a hammer can both do some of the same things, there are cases where each is a better or worse choice. Think of it as a tool. Maybe a good way to find the start to solving a problem, but not yet a tool for blindly using it to solve problems.
As a follow up. Take the code that was generated and ask it to review for potential buffer overrun vulnerabilities and see how it does.
While I agree with what you say, I think there is one extremely valuable use for ChatGPT code: for generating unit tests. One of the hardest tasks I had as a team lead was getting programmers to generate and run unit tests. At least 90% of code errors could (and should) have been detected by thorough unit tests. Unfortunately, programmers are almost always under time pressure and writing unit tests is an easy sacrifice to make. In addition many unit tests are mind-numbingly boring to write because they often need exhaustive testing. Unit tests are often very easy to write using a fairly limited set of fixed rules (e.g., “test boundary conditions”). I believe this is an area where GPT could truly aid human programmers by taking on a burden that is seldom done correctly and thoroughly by humans.
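As a rough illustration of the "test boundary conditions" rule mentioned above, here is a minimal C sketch. The function `bounded_copy` and its test are hypothetical, made up for this example; they are not taken from the video or from any ChatGPT output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst and always NUL-terminates.
 * Returns 0 on success, -1 if src (plus its terminator) does not fit. */
int bounded_copy(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    if (dst_size == 0 || len >= dst_size)
        return -1;
    memcpy(dst, src, len + 1); /* len + 1 also copies the terminator */
    return 0;
}

/* Boundary-condition tests: empty input, exact fit, one byte over, zero size. */
void test_bounded_copy_boundaries(void)
{
    char buf[4];

    assert(bounded_copy(buf, sizeof buf, "") == 0 && buf[0] == '\0');
    assert(bounded_copy(buf, sizeof buf, "abc") == 0);   /* 3 chars + NUL fit exactly */
    assert(strcmp(buf, "abc") == 0);
    assert(bounded_copy(buf, sizeof buf, "abcd") == -1); /* one byte too many */
    assert(bounded_copy(buf, 0, "") == -1);              /* zero-sized destination */
}
```

These four cases are the kind of mechanical, rule-driven tests the comment describes: tedious to write by hand, but easy to check once written.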
Use "Continue" to make GPT continue a previous long post. Otherwise it defaults to ending when the standard output token limit is reached.
This doesn't always work... not only does it sometimes repeat the entire code and get cut off again, if it does actually continue, the formatting is destroyed. We need more output tokens. Right now it's 2048.
"Did you stall?" Or "You seem to have stalled " has worked better than 'continue' for me. It will restart the step instead of just continuing where you get the issues with explanations in the code box and code on the explanation area.
Back in the late 1980s, people were talking about how code generators (I think they were called 4GL languages, or something like that) were going to replace programmers. Over 30 years later, I'm still banging out code on a keyboard.
Fourth generation language languages.
I firmly believe that we will adapt and stay “in” even with all this AI stuff. At the end of the day, if the AI is the only thing coding, then who can monitor what it creates?
@@jwithersooon, you can understand code by reading it.
@@jwithersooon, also, an AI can only copy, paste and mix code that a human has created. An AI does not understand anything and the only way for an AI to know the code it has generated is valid is by attempting to compile it. You need human intervention to write tests.
Another vulnerability with the first code which can actually be exploited: you can put a .. in the filename and exit the scope of the program. A server that serves files should ensure that it's only able to serve files inside its own scope to prevent you from essentially reading the entire computer's file system.
For the buffer overflow, read just reads bytes while sscanf expects a null-terminated string. So if the memory in the buffers was not zero-initialized, this could cause sscanf to receive a longer input than expected, causing a buffer overflow. No obvious way to control it but it is an issue.
Yes and yes. The accidental null termination thing was tickling my spidey senses.
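A minimal sketch of one way to address the termination concern raised above, assuming the request bytes came from `read` with a known length. The names `parse_get_path`, `REQ_BUF_SIZE`, and `PATH_LEN` are hypothetical and not taken from the generated server.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define REQ_BUF_SIZE 1024 /* hypothetical request buffer size */
#define PATH_LEN 255      /* must match the width in the sscanf format below */

/* Parses "GET <path> ..." out of raw bytes that are NOT NUL-terminated.
 * Returns 0 and fills path on success, -1 otherwise. */
int parse_get_path(const char *raw, size_t raw_len, char path[PATH_LEN + 1])
{
    char req[REQ_BUF_SIZE];

    assert(raw != NULL && path != NULL);
    path[0] = '\0';            /* never leave the output uninitialized */
    if (raw_len >= sizeof req)
        return -1;             /* too long for this sketch: reject */
    memcpy(req, raw, raw_len);
    req[raw_len] = '\0';       /* read() does not terminate; do it explicitly */

    /* %255s caps how many bytes sscanf may store into path. */
    if (sscanf(req, "GET %255s", path) != 1)
        return -1;
    return 0;
}
```

With the explicit terminator and the field width, the parse no longer depends on whatever happened to be in memory after the received bytes.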
4:10 I thought you were going to talk about the path traversal vulnerability after that. It's not as terrible as a buffer overflow, but it is still pretty bad IMHO.
I don't think it was fair to call the first one vulnerable. Yes, sscanf is bad, but it was legitimately guarded by the maximum read length.
It's really sad seeing people depending on ChatGPT to write code, instead of learning how to code. It's also stupid to believe that a company would use ChatGPT instead of a real human.
I thought about true AI and when it'll be here. Some people claim that we'll become stupid and lazy. Actually I don't think so, because people love to compete with each other, sometimes just for sport, with no reason. So I think even if we have a Skynet-level AI, we'll be competing with each other just for fun. But maybe the gap between smart and stupid people will be horribly huge.
@@exception05 We already are lazy and stupid in general, and that's why we invent stuff like Python 🙂
@@exception05 using AI to code makes me lazy, and usually the code sucks/has tons of glaring issues. It's a bit like cheating in a game, except the cheats don't actually get you what you want anyways.
I asked it to generate some C code and halfway through I start seeing *templates* in the code - it had switched to *C++* halfway through!
Hopefully they'll release a version soon that has longer output (so it can generate longer code).
I'd also like to see it be able to *test* the code by running it in a VM. That'd save a lot of time, meaning you wouldn't have to ask it to fix broken code.
Lmao ☠️
Just write your own code pleb
ChatGPT also struggles with more obscure programming languages, such as QBASIC. When I ask it to program in QBASIC, it will make indentations (QBASIC code has no indents), use parentheses where there should not be any, and when I try to compile it, it does not even work 😂
They already burn the equivalent of full blown countries in cash and energy just getting to this level, imagine if it was more complex AND had to compile Code plus debug it while actually reasoning...
That's why I see this whole AI as a scam, bc it does not seem feasible at all and even if possible not sustainable...
I thought we were all collectively not going to talk about that, for job security
or species security
meh, relax, with the current state of LLMs they won't replace programmers any time soon. Likely, never.
@@vitalyl1327 I think it's highly telling that those who claim LLMs will replace programmers are either non-programmers or not so good at it.
@@juniuwu That is most people. And that's kind of the point, isn't it? ChatGPT is technically 7 months old now and is on the verge of replacing most of those people. What you will have is a tiny minority of really amazing human coders, with an army of assistants. Dev teams will wither to nothing. The issue isn't total replacement, just the majority. Then eventually everyone.
@@juniuwu Very true. Lots of normies claiming chatgpt will replace all jobs including programming. Yet they can't even explain how it works.
Great one! ChatGPT, I agree, will not replace programmers. Over time it will get more sophisticated but ultimately a human set of eyes needs to remain in control.
I agree, but with AI tools becoming more and more sophisticated, I think there's definitely the possibility that it will replace many jobs in the industry. Definitely not all of them, but it will change the way software developers work fundamentally. Instead of a team of ten devs you might only need a few to ensure stability and security.
I want to remind you and the video maker that chatgpt is not a programming AI, it happens to be able to do some of it. An AI strictly trained for this purpose wouldn't make these mistakes nor any mistakes a human would if it was trained correctly, let that simmer.
'Artificial Intelligence' (not today's dumb pattern matchers) - in the distant future - absolutely, 100%, WILL replace programmers...and, of course, many, many other jobs. Unfortunately/fortunately.
@@ChrisM541 The sort of AGI you're referring will not just replace jobs, but probably humanity too. We are quite stupidly building superintelligences without understanding what we're doing.
@@julian-yo1oq I think fewer people will join the field, and those that do will be bad programmers. That’s job security in my book
You should post the generated code somewhere and add a link to it in the description. I have a feeling that there are more vulnerabilities in the first example than just a possible buffer exploit, such as not flushing the file buffers, possibly causing an issue with subsequent reads, and the most obvious issue, which you hinted at but didn't expand on, about file permissions. Running as root would give remote access to all the files on the system.
> Running as root would give remote access to all the files on the system.
No shit, Sherlock. He had to run it as root because he wanted to bind on port 80 and didn't bother to use a wrapper or implement privilege dropping. He could have also just altered the program to bind above port 1024.
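For illustration, privilege dropping could look roughly like the sketch below, assuming a POSIX system and that the socket has already been bound to port 80 before the call. `drop_privileges` is a hypothetical helper, and the target IDs are whatever unprivileged account the server is meant to run as.

```c
#define _POSIX_C_SOURCE 200809L /* expose setuid/setgid under strict -std= modes */
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: switch to an unprivileged user after binding.
 * Returns 0 on success, -1 if either call fails.
 * A real server should also clear supplementary groups (setgroups),
 * which is omitted here because it is not part of POSIX. */
int drop_privileges(uid_t uid, gid_t gid)
{
    /* Group first: once the user ID is no longer root, setgid would be refused. */
    if (setgid(gid) != 0)
        return -1;
    if (setuid(uid) != 0)
        return -1;
    return 0;
}
```

The other option mentioned in the comment, binding to a port above 1024, needs no root at all and avoids this step entirely.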
@@Those_Weirdos Great response. "He had to run it as root", but not if he'd taken either of these two steps I'm enumerating that prove he didn't have to.
@@anon_y_mousse I'm sorry but, " I'm enumerating that prove he didn't have to. " what?
@@heckerhecker8246 Learn English, it'll help you.
@Anony Mousse I know English, " enumerating prove" is not correct
Maybe you meant proof, I can prove to you I know English as I am talking in English, the proof is this comment.
@2:34 For anyone who isn't trolling: buffer and filename both have the same size, BUFFER_SIZE. However, sscanf takes char pointers as parameters and relies on its input already being null-terminated. The "%s" format is used with no field width, so if a non-null-terminated value is passed, or the null lies past BUFFER_SIZE bytes, undefined behavior occurs. In this case, a buffer overflow, because sscanf doesn't do bounds checking. This can be verified by reading the glibc source, or by debugging libc.so.6.
I don't recall if you mentioned the version you were using. If this is the initial offering of GPT, have you tried the same things with the 4.0 version?
He’s using the free version or the 3.5 version. The logo for gpt4 is black
@@ramadanomar8001 The gpt4 logo is purple as of yesterday
@@ramadanomar8001 Thanks, Omar.
4.0 is better, but it still leaves much to be desired
Isn't cyber security actually a pretty well-defined domain with very clear goals? Security vulnerabilities are very well documented and free from interpretation. It would make it a perfect field for AI where it's just a matter of alignment and pre-training or pre-prompting.
Also, I am not sure what your answer from ChatGPT was, but when I asked it to write an HTTP server it added "Note that this code is a simple implementation and does not include error handling for all cases. In a production environment, it's important to handle errors and edge cases carefully." so it clearly tells you it's not production-ready.
Then guess what, I asked Chat GPT if there are any security issues in the code it gave me and it provided a pretty long list of issues with explanations. You can then ask it to fix the security vulnerabilities and give you the much better version which it did.
Yeah, the only reason he made this video was to have a low-effort jab at the AI that he feels might replace him. He didn't try to give it a fair chance, he just wanted to make himself feel better, which is understandable, but it makes for a poor test with upset viewers. He could have done a higher-effort job, though, and still come to similar conclusions, because once a project starts getting more complicated, the AI starts to show its shortcomings.
I think if this video had been uploaded a few months ago, maybe it would have gotten a better response, but nowadays most people already know the code generated by AI is not production-ready. The same way you could make a video about copy-and-pasting code from Stack Overflow without reading it.
I feel like nowadays people are more concerned about what AI can do in the future, and this video is not addressing it and even shows some ignorance in the subject.
It would be actually genuinely interesting to watch a video about what security vulnerabilities can be found in an AI-generated code but unfortunately the video had a too salty vibe.
AI isn't a person; it doesn't learn like people do.
@@realdragon ?
Alignment is a fictional concept. In reality it's like talking as if your hammer or screwdriver might have their own goals or disagree with what you want to do.
GPT4 is SIGNIFICANTLY better at coding. I would try again with that one, and specify that you want the code to be secure.
I think the main issue is most people will not be using GPT-4 because it costs money and newbie programmers won't (and shouldn't have to) think to specify that it writes secure code.
@@theninjascientist689 Why would you use a LANGUAGE MODEL as a teacher for coding?
@Transistor Jump How?
I could not agree more! ChatGPT is inconsistent and I feel its coding answers are like the shitty averages of Stack Overflow and Wikipedia. By the time you manage to get a nice program out of it with various prompt iterations, you probably understand enough of the problem that you could have already written it better yourself. It's also a learning killer.
Calculators are also learning killers. If you want to learn programming you have a point, but if you just want to program, maybe not so much.
@@MrTomyCJ But don't you need to learn programming to program?
In addition to the file traversal issue, doesn't also the http server have an issue where + 1 to skip the leading slash skips the NUL terminator if the filename is empty, uses the uninitialized filename if sscanf fails to match and also has sscanf read a non-terminated buffer if there is no NUL terminator on the incoming data or if it didn't fit in the buffer :)
Everything is super borked everywhere. There are like two vulnerabilities per line of that function.
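A small sketch of how the "+ 1 past the leading slash" issue listed above could be guarded, assuming the request path has already been parsed into a NUL-terminated string. `filename_from_path` is a hypothetical helper, and `"index.html"` is an assumed default document, not something the generated server is known to use.

```c
#include <stddef.h>

/* Maps a parsed request path to a file name without stepping past the
 * terminator. Returns NULL when there is nothing usable to serve. */
const char *filename_from_path(const char *path)
{
    if (path == NULL || path[0] == '\0')
        return NULL;          /* parse failed or empty: refuse, don't read garbage */
    if (path[0] == '/')
        path++;               /* skip exactly one leading slash, only if present */
    if (path[0] == '\0')
        return "index.html";  /* assumed default document for a bare "/" */
    return path;
}
```

The caller still has to check the sscanf return value before passing the path in; this only covers the pointer arithmetic.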
I find it's better at helping figure out why your code isn't doing what you want rather than writing from scratch. Copy a function, tell it the language, what it takes and returns, and ask why it isn't doing x.
I agree. Like any tool, it has uses it is better suited towards, and explaining compiler errors and perhaps logic errors seems to work a lot better than just asking it to generate code.
Have you tried iterating over the generated code with ChatGPT? Prompt it to find the vulnerabilities in the code it wrote and then the corresponding fixes.
Would be an interesting video.
The thing is, if I know enough to tell ChatGPT what and where it got the code wrong, I know enough to write it correctly in the first place.
I don't think it's a valid use case to sit there and walk an AI through basic programming problems, when doing the same with a developer would lead to a developer who _stops making those mistakes_.
Not necessarily, you may just have to ask it to find the issues in the code and to solve them. No need to point out where the issue is.
@@b4ux1t3-tech I am not speaking from a "whether it can do your job" perspective, just whether or not it's actually able to spot the vulnerability and provide the fix.
I have never seen it as a replacement for devs. Always seen it as a productivity tool, especially when using some new tech, for example a language or library.
Right, I'm speaking from a tooling perspective too.
If I know enough about a problem to frame a good prompt for ChatGPT, I know enough to find the correct documentation to grab whatever boilerplate I need for a new API/language/whatever.
And those docs are going to be (generally) correct. ChatGPT gives the illusion of correctness because all it does is answer the question "what sounds like a good answer to this prompt?"
Other tools, like Copilot (just as an example), are code-focused, and as such are better than ChatGPT for this kind of thing.
@@b4ux1t3-tech But the back and forth brainstorming with ChatGPT just feels so human its unbeatable for me.
I know half the time it's not accurate and will just make things up, but it's still an impressive technology.
That "good luck with your implementation!" was a total diss, lol. It's like Chatgpt was subtly saying "I dare you to try it without me, I'll be here when your attempt flops"
On the first example, you went for a buffer overflow attack but the code was secure towards it. But I tried the same prompt and was able to do a path traversal attack.
Still, we must be careful.
This content reminds us to describe completely, in the prompt, everything we hope the output of the robo-typer (ChatGPT) will be.
I ran into a similar problem with GPT-3 allowing SQL injection. However GPT-4 is much better. It still needs some creative prompts and people to curate the code, but I've been very impressed with how fast it can do stuff like write unit tests. It's a good tool, but it can't do stuff on its own yet.
You don't want a tool to write your unit tests for you. Your unit tests are how you specify the correct behaviour. If anything you want to write your own unit tests and then generate code that passes your tests.
@@georgehelyar I'm quite happy to make a structure, ask GPT-4 to fill in a bunch of random values, and then do some simple tests on them. There's so much boilerplate that I'll admit I'm too lazy to do. Perhaps test driven development would be a better approach, but GPT-4 is much faster.
For example, I recently tested a multi-threading system by getting GPT-4 to make a bunch of Java threads and execute them simultaneously. It was nice not to have to look up how to use an ExecutorService again since I'd rather be programming in Python anyway. And I could tell by the output that everything was working in the end.
On the other hand, I asked GPT-4 to reformat a constant array with ~150 numbers in it, and it kept on deleting or adding elements. There were sequences of 0's which made the most probable next "word" (ie digit) difficult to predict. However it spat out a python program to reformat the array for me pretty quickly...
My concern with even starting to learn C is this. Where can I go where I can avoid learning bad coding habits? Is there a C programming course that you would recommend?
ua-cam.com/play/PLnuhp3Xd9PYTt6svyQPyRO_AAuMWGxPzU.html
I would add to what's already been recommended: read the ISO standard for C, the errata and the rationale, as well as both the Intel and AMD optimization manuals, as they do include some examples in C. But more importantly, I would also recommend learning assembly at the same time, not in its entirety at first, but rather the subset that your compiler of choice uses. I use gcc, so if you want to use the assembly generation flags to try and understand how it generates code, gcc -masm=intel -o foo.asm -S foo.c is the method you'll want to use. On top of all of that, I would recommend reading the source code of long-used open source programs.
Don't use a single source but also continue to ask ChatGPT; seriously. If you give it the prompts and code shown in this video it finds all the same issues with its own code. The model has a bias towards simple examples not production ready code but it's also pretty good at finding and explaining issues like the ones pointed out in the video, if you ask it. I mean he could have just prompted "Are there any security vulnerabilities in the code you just wrote?" and ChatGPT would have pointed out 5-10 of them.
@@shanehanna Agreed, I was waiting for him to do this at the end of the video
@@shanehanna No, don't use ChatGPT or any LLM, especially not if you're trying to learn how to program. It isn't a sentient being and will be incapable of finding deep mistakes. It's better to learn from a sentient being which can point out these deep mistakes.
I like how it did it perfectly and you still said it messed up 🤣😂🤣
Not only does it suck at security, it also sucks when it comes to performance and idiomatic code; basically any metric outside of writing code that potentially makes sense, ML can't really understand. Not that it even understands the code it's generating in the first place, which makes all of this even more funny. There are so many articles talking about how this version of AI is going to lead to generalized AI, meanwhile many of the researchers have basically acknowledged the fact that these algorithms are not going to take us that far. Even when we get to GPT 8 or 9, these systems are still going to need chaperones who understand the domain of whatever it is that they're trying to generate. No matter how much data you throw at a neural network that was designed this way, you're not going to get true understanding.
doesn't the first example have a directory traversal vulnerability? since it takes the file name without filtering, one could perhaps put in something like "../../../../../etc/passwd" and it could just spit out the file contents. please correct me if i'm wrong
Or even localhost//etc/passwd. It just skips the first slash.
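A minimal sketch of the kind of check that would reject both requests mentioned above, assuming the file name has already been URL-decoded and had its single leading slash removed. `is_safe_relative_path` is a hypothetical helper. It does not handle symlinks; resolving the path with `realpath` and comparing it against the document root would be a more robust approach.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true only for non-empty relative paths with no ".." component. */
bool is_safe_relative_path(const char *path)
{
    const char *seg;

    if (path == NULL || path[0] == '\0' || path[0] == '/')
        return false;                 /* empty or absolute, e.g. "/etc/passwd" */

    seg = path;
    while (*seg != '\0') {
        size_t n = strcspn(seg, "/"); /* length of the current component */

        if (n == 2 && seg[0] == '.' && seg[1] == '.')
            return false;             /* a ".." component climbs out of the root */
        seg += n;
        if (*seg == '/')
            seg++;                    /* step over the separator */
    }
    return true;
}
```

For a request such as `localhost//etc/passwd`, skipping one slash leaves `/etc/passwd`, which this check rejects as absolute; `../../../../../etc/passwd` is rejected at its first component.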
1:42 "What da dog doin'?" XD
So I added these prompts into GPT 4. I modified the prompt as follows "User
can you write secure code for me, in C, an http server that listens on port 80, parses an http request from the client, and serves an HTTP response wit the corresponding file"
GPT 4 gave me vuln code with the above caveat: "
Here's a basic example of a secure C HTTP server listening on port 80, parsing requests, and serving files. Remember to implement additional security measures like input validation and error handling for production use:"
I provide GPT 4 a second prompt as follows "Implement additional security measures like input validation and error handling for production use."
It fixed nothing and removed functionality.
"If I use the free stupid version of chatgpt, and it makes a program, the program works and is secure unless the programmer adds additionalcode to make it insecure, 0/1".
Huh? Give me any program and I can drop a few lines of code to turn a secure program into an insecure program, that doesn't make the original program insecure...
ChatGPT is good for beginners who want a structure to work with instead of starting from a blank file
If you are testing ChatGPT, why not use the 4.0 version? I mean, it is actually a lot better. Testing the newest tech would be the better option here; otherwise it's like finding a bug in an older version of a piece of software. Either way, I agree with the point: ChatGPT can't write pure and secure code because it wasn't trained only on secure code, and the same goes for 4.0.
4.0 costs money, he might not want to give OpenAI money just to make the video
@youreyesarebleeding1368 Well, if we're gonna blame their tech, why be unfair about it? I'm sure he can at least pay for the first month and cancel.
@FUS3N I'm just saying, and to be fair he did say ChatGPT in the title, not GPT4.0.
I've got GPT4 myself, i use it all the time for programming, but i don't just copy/paste code from it. I ask it questions like "make a list of the pros and cons of these two ways of implementing a problem" or I use it as a way to quickly reference syntax, or to tell me about mathematical methods to solve problems.
I love this video! 😂 your personality is awesome. I’m also a boomer programmer myself and seeing a few of these security flaws was interesting, but some of the ways gpt was writing the functions were kind of gross too imo. It’s fine for boilerplate stuff, it certainly types faster than me.
You know the funny thing?
It _doesn't_ type faster than me (or you).
Because when I know I need boilerplate code, I scaffold the code with an appropriate tool, which requires functionally zero compute and doesn't need a network call and a billion dollar datacenter.
So, sure, it can cut and paste strings out of its memory faster than you can type, but it doesn't come up with quality boilerplate code faster than you do, guaranteed. ;)
@b4ux1t3-tech For now...
While these examples show the limitations of chat gpt, I don’t believe they’re the reason developer jobs are ”safe” for the foreseeable future.
Chat GPT is great at generating boiler plate code for standard beginner tasks in every language and framework.
However, where it becomes borderline useless is in larger code bases (often times just a few files) that contain more moving parts than just creating a simple crud api with a single model.
Even in the examples cited in this video, it’s possible to prompt GPT to write more secure code. However, attempting to prompt your way through a more complex and larger code base is an entirely different struggle.
So, a few notes from somebody who is more into LLMs, but isn't as experienced with low level languages as probably a lot of people here:
1) Prompting will be a massive part of producing effective results with AI language models. Now, you might argue "the prompt shouldn't matter, people who don't know what they're doing are going to use this, and not know when to prompt to fix a non-compilation related issue.", but this doesn't quite work as a rebuttal, because there appear to be "generic prompts" or techniques that should be used pretty much no matter what type of work you're doing. I'd be very interested to see this video again, but with something like Smart GPT (A particular style of prompting ChatGPT that allows multiple instances of it to generate, and assess its own answers in a very specific and apparently very effective framework). Regardless, it's also worth noting that many of these advanced prompting techniques may actually be worked into existing models in some capacity, either directly in the client automatically applying them when applicable, or in training, by fine tuning a model on its own responses generated by these advanced techniques, which leads into
2) The mobile nature of LLMs. They are not a static target or tool; they're in active development and are absolutely on fire. It's worth noting not just what we have today, but the general direction of the industry, because even if a tool doesn't exist today, people are working on it actively. If you have a tool that's generating 80, or 90% of what you need, it doesn't take that much to get that remainder, and probably requires a smaller change to either the style or content of its training to get you where you need to be. ChatGPT...Probably isn't going to replace programmers, but I do think that future advancements in the field will be very important for programmers to watch.
3) ChatGPT isn't the only model, and there are new approaches all the time. As it stands, ChatGPT is a bit like if you took a person and gave them all the books they needed to become very well read on a wide variety of topics. This, in reality, isn't how humans have learned their trades. When you look at how people learn, there is some element of information intake, but a large part of it is experimental study; we learn by doing. I think we're not far from someone coming up with a technique that's more suitable to coding, possibly derived from a specialized version of WizardLM's training technique. With that, you could probably produce domain-specific models, or LoRA, that could accurately produce the style of code you needed to tackle that specific problem. Once we have successful small-scale models or LoRA that can handle those issues, I think it's not hard to extrapolate that there should be some way of distilling that expertise into a larger model, or incorporating the techniques that allow for these specialized models into a generalized model in some capacity.
As I noted, I don't think that programmers need to be worried today specifically, but I do think that programmers do need to be paying attention to the space, and developments in it.
would be cooler if you showed how to fix those vulnerabilities😭
8:27 just type "Continue" and it will continue the code
Some issues I have seen for Java -
1. Code that fails to compile, which seems like a deal breaker
2. Imports of libraries that are not valid
3. Sometimes we end up asking follow-up questions for a long time, which decreases productivity. We would be better served writing our own code. I feel I was faster without ChatGPT support
Now for some advantages for Java -
1. It is able to collate different libraries and provide suggestions for some scenarios. It may be able to do this because of the massive amount of data it has been fed.
2. Even if the code does not compile, it is able to give us the initial 20% push. Google was doing this too, but Google used to come in the middle of our programming cycle and give us a 10-20% push by helping us search for solutions to similar issues others had faced
But one question we need to answer: is ChatGPT able to learn from the libraries and give us solutions, or, as a language model, is it just rehashing the documentation as output? If it’s the latter, then it might not be of much help for badly documented libraries, which is like all the libraries in Java. This might also explain its issues with code compilation
I have found advantage #1 very useful! Saves an hour searching multiple SW pages.
It seems to be getting better on issue #3 now, with fewer follow-ups and better answers on first attempts. It might be due to the bigger context memory created for my account with constant use, which it can refer to. Or they may have integrated ChatGPT 4 features into ChatGPT 3, or made some enhancements to their LLM.
I am using Google a lot less than previously now.
Forget about programming it can't even do basic set theory proofs
It's good at certain things, but definitely not everything.
the first one has another vulnerability: the filename is opened directly without stripping out any "../" or "./". It also allows an absolute path to be accessed with //, like GET //etc/shadow would leak your shadow file
I think that technically the sscanf is safe, although it is a bit sketchy, because the filename and buffer size are the same
That's exactly what caught my eye. The sscanf is safe.
One bad programmer creates 100 jobs for IT support. So chatGPT will create a lot of new jobs 😂
You are doing an amazing job with your videos!😍✌ Thank you for putting in the time and effort to create such a valuable resource.
But if you ask it to write something secure it would be literally IMPOSSIBLE for it to write something without vulnerabilities.... right?...
1. Dont expect it to do a complex task in one go.
2. It's basically common knowledge that zero-shot prompts produce worse results for complex tasks than multi-shot prompts.
3. Dont look at where we are, look at where we are going.
4. Make a video based on Rust. Would be interesting to see if the features would protect the LLM from making such vulnerabilities.
5. GPT5 or Gemini2. The end.
Just think about how much such code already goes into production...
And how much will be in new fancy 'startups'
I'm not gonna lie, chatgpt got this code from people...
Chatgpt is also inferring...creating new code based off existing code it's pattern-matched. Unfortunately, the 'inference algorithms' are as far from 'true AI' as you can get...
--> chatgpt is today's biggest bullsh#tter. Fact.
I'm not sure whether we should blame AI, or humans whose coding led to that GIGO moment...
I was hyped that you pick this topic and I got... this... oh dear..
You talked to ChatGPT like you are a new programmer, and if you are a new programmer, you don't mess with sockets for production applications. End of message.
I think while chatgpt is amazing at doing mundane tasks, it really still falls behind for a lot of the advanced stuff (though I'm using the free version and not GPT4, so maybe there's a big difference there). I don't code myself, but I do write stuff in Japanese, and sometimes chatgpt is just confused: sometimes it tries to fix a verb with the exact same verb, sometimes it translates a clearly different word as another. I think there's some overhype around AI and a fear-mongering sentiment that it can replace us now, and that trying to get better at something is useless because AI can do it in seconds. But in reality (at least right now) AI is still a tool that, like any tool, can produce incorrect results, and it's our role as the user to use our knowledge to make it produce something good
Well u can't compare chatgpt coding with Japanese, chatgpt was made with coding capabilities in mind while Japanese is just a side effect
ChatGPT is a language processing AI, but it could technically do as an intermediate between what we want, so our request, and translate it for another AI, which is a code writing specialist :p
Outsource to a cheap AI in India?
The way any AI works is that it produces thousands of possible answers, then picks the best ones, creates another thousand variants out of them, again picks the best ones according to some criteria, until it sees no improvement or time is up. The criteria is created by the person who created the model or trained it, so "improvements in AI" are always caused by tinkering with either the source material or this filtering apparatus. For example, an AI could attempt to compile the code it outputs, until it succeeds or time runs out, but this is a feature that needs to be manually added, or else it will not tell you who is Chuck Norris until its C compiler accepts the Wikipedia page on him.
I don't know whether anyone else noticed this, but the TLV server just used the character codes of the first two characters in the string as the type and length; this might be an intentional part of the encoding scheme, though (a single-byte type and a single-byte length, with "a" and "s" just interpreted as their ASCII codes, respectively 97 and 115).
A great fix is to put "you are an expert in not overflowing buffers" in the gpt prompt.
Excellent video. As a one time programmer, I was certain ChatGPT would hallucinate just as it does with text. It doesn't know it's creating a program. ChatGPT doesn't even know that it exists. Not AI. But it is an interesting process which may be useful in other fields. Some say it has already proven itself in certain areas. I don't know much about that.
without creating security vulnerabilities? It can't even implement the _correct_ data structure when asked to do so. I asked it to show me a simple implementation of a 2-3 tree, and it returned an (incorrect) implementation of a ternary heap. That said, your first strike against chat gpt isn't deserved. You said "let's see if it created vulnerable code"; it didn't. You said someone else can make it dangerous, so it fails. What? Anyone can take safe code and make it unsafe when they don't know what they're doing, no matter WHO writes the code. That's why we do code review and merge requests, and do team development. Come on dude, do better.
5:55 buf is an array of type char. Char has implementation-defined signedness. len is initialized from a value from that array. If you happen to compile this with an implementation that has signed char (GCC has a switch for that), you can set len to a negative value. When promoted to size_t (for the last argument of memcpy), this will be increased by SIZE_MAX + 1. Meaning the memcpy() will try to copy most of the address space.
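The signedness problem described above is usually avoided by reading the length byte through unsigned char before widening it. The sketch below is an illustration only; the function name `read_len_byte` is made up and does not appear in the generated code from the video.

```c
#include <stddef.h>

/* Hypothetical helper: read a one-byte length field out of a char buffer.
 * Converting through unsigned char keeps the result in the range 0..255
 * whether or not plain char is signed on this implementation. Writing
 * (size_t)buf[offset] directly would turn a byte such as 0xFF into a
 * value near SIZE_MAX when char is signed. */
size_t read_len_byte(const char *buf, size_t offset)
{
    return (size_t)(unsigned char)buf[offset];
}
```

With this, a length byte of 0xFF comes out as 255 rather than as a huge size_t, so the later memcpy() size stays bounded by one byte's range. The caller still has to check that the length fits inside the data actually received.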
Chat gpt is a great tool for learning, but you gotta use your critical thinking: it gives you a direction and you improve on it. I don't think we can rely on that technology yet, other than to ask about specific uses of certain functions, at which it is exceptional. Better than google for search, so efficient
In the final piece of code, you can prompt chatGPT to "Continue where it left off" with its code to finish it.
ChatGPT is not good for actual software development, but it is good for a fast understanding of a problem. I usually use a LLM with a PDF of Hardware Chips to produce micropython code for some tasks. Not for development or deployment, just scripting what i need in a fast way not needing to read the whole documentation or look up stuff.
If you use it for fast prototyping is quite nice.
The problem is not the tool, it is how you use it.
I don't think anyone believes GPT4 is going to replace programmers.
But GPT5 has not been released yet. And what about a few years after? What will come in the future?
You're making the same mistake as people who believed bad fingers in image generation would always be a problem (I think the speed at which img generation is moving is going to stagnate at some point though).
AI has so many sources to access code for free, it feels as if the strength that open source gave us is coming back to bite us in the ass. I think we'll see a lot of projects going closed source in the near future out of fear and as a preventive action
You can ask chat gpt to finish the code but it needs some trying, i know from experience.
what are the possible implications of ChatGPT's code being illegal due to not following the license of the code that its "learning from".
in the beginning you should set him up (tell him) that he is an expert C programmer, and tell him not to deliver non-existent answers just to please us, but to actually make working code
you can get more errors with:
-Wall -Wextra
and you can treat all warnings as errors with
-Werror
add -Wpedantic for pure suffering
What I learned from this channel:
Memcpy and gets = bad don't use them
Always set a limit on buffers
Strings are dangerous
Magic values are bad
Return values are dangerous, so check them
As a python programmer I don't have to worry about this much but I definitely want to be careful about my return values when I write in 8-bit assembly.
Also I'm developing a programming language like golfscript, my programming language kinda builds off of golfscript! It's pretty cool but I just need to verify everything works before I give it to the hands of the public.
I'll watch the video after 10 minutes. The video appeared in my feed. What I experienced is that ChatGPT often writes codes with buffer overflow/overrun/overlap that are hard to detect.
You MUST also learn to understand that this is NOT the only category of errors chatgpt is very willing to let us 'experience' !!
A bug that you cannot possibily trigger is not a bug.
Reading it as 'oh if you change this it's vulnerable' is no different than saying a fire exit is vulnerable because what if you weld it shut.
Did you know; if you walk up to a random software engineer in a coffee shop and give them these prompts and a time constraint (substituting for token limit) you will most likely *not* get a better result!
Did u know that if u gave that developer the time it took you to explain the problem to the AI, then he/she would produce a superior result every time?
Flaw in methodology spotted: when we ask a programmer to write code, that comes with all kinds of implied requests like "make sure it doesn't have security issues" or "make it optimized for performance"
LLM's are completely unaware of those hidden meanings, because they're unaware of any context that surrounds human communication.
A fair test would be to include things like "write C code that does X without security vulnerabilities." Or being specific about what "security vulnerabilities" means would be even better: "write C code that does X without potential [list of vulnerabilities common in this type of project]." If you don't want to go that route and still want to be general, say something like "without undefined behavior," or "without a user being able to use it in ways not defined in the specification"
I'm not saying this will overcome the underlying problem, but it will give GPT a fighting chance and will be a more honest look at how businesses intend to use AI to automate jobs
Chat always does that. One trick you can do is tell it to analyze its own work for problems and security vulnerabilities; afterward, tell Chat to refactor its code using its own critique as the guidelines. This should greatly improve the quality of the code, because it is actually very critical of code. You can also tell it to pretend it is a real professional writing "production" quality code and it will write better code.
I love you man, but I almost always get up to 3 unskippable sponsored ads before the video even starts
The TLV example (6:24) has problems, but unlike what the video claims, overflow (on write) is not an issue. Although it's correctly noted that a signed value is used as a length (one element of buf, a char stored temporarily in an int), which can cause issues since it can produce a gigantic length when cast to an unsigned value, the actual destination of the memcpy (i.e.: *char value[len];*) will also be that size (I checked), so memcpy won't overflow in the traditional sense. It can definitely be used to DoS your server, as allocating a huge buffer and filling it will take a long time if it doesn't outright fail on the allocation, but it won't overwrite the memory adjacent to the output buffer, since the destination will always be "big enough" if the allocation succeeds.
It can however read data after the end of the input buffer since the parsing loop only makes sure that the starting position each iteration before reading is in bounds. It doesn't check if it can read the next byte needed for len without running off the end, nor does it check if the extra bytes specified by that length value are in bounds. Reading an absurdly large number of bytes due to a negative len will likely never succeed, but with a positive value passed for len in a message stored at the end of the buffer, up to 127 bytes of memory after the end of the input buffer can be reliably read and returned to the attacker, which is similar to heartbleed, albeit far more limited.
Still a failure mind you, and it's somewhat disheartening to see that Chat-GPT can make sign conversion mistakes, a common issue in C code.
Yes, the first problem here is more a stack overflow anyway (the memory is allocated on the stack)
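To make the over-read described in this thread concrete, here is a minimal sketch of a TLV reader that checks both the two-byte header and the value against the input length before copying. Everything here is an assumption for illustration: the names `struct tlv` and `tlv_next`, the one-byte type and one-byte length layout, and the fixed 255-byte value array (used instead of the variable-length array in the generated code).

```c
#include <stddef.h>
#include <string.h>

#define TLV_MAX_VALUE 255   /* largest value a one-byte length can describe */

/* Assumed record layout: 1 byte type, 1 byte length, then `len` value bytes. */
struct tlv {
    unsigned char type;
    unsigned char len;
    unsigned char value[TLV_MAX_VALUE];
};

/* Hypothetical parser: reads one record starting at buf[*pos].
 * Returns 0 and advances *pos on success. Returns -1 and leaves *pos
 * untouched if the header or the value would run past buf_len. */
int tlv_next(const unsigned char *buf, size_t buf_len, size_t *pos,
             struct tlv *out)
{
    if (*pos > buf_len || buf_len - *pos < 2)
        return -1;                      /* no room for type + length */

    unsigned char type = buf[*pos];
    unsigned char len  = buf[*pos + 1]; /* unsigned: never negative */

    if (buf_len - *pos - 2 < len)
        return -1;                      /* value would over-read the input */

    out->type = type;
    out->len  = len;
    memcpy(out->value, buf + *pos + 2, len);
    *pos += 2 + (size_t)len;
    return 0;
}
```

Because the length is held in an unsigned char and the destination is sized for the maximum, neither the negative-length case nor the read past the end of the input can happen in this version. A truncated record at the end of the buffer is rejected instead of being served back to the client.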
Low Level Learning: "I wanted to see if ChatGPT could solve three relatively simple programming problems."
Also Low Level Learning: "Write an operating system in assembly that supports x128 architecture."
If you type "finish your answer" it will continue serving the rest of the code...
It feels like I became addicted to getting wrong answers from ChatGPT and hoping that I will be able to explain what I need which results in giving me weird code over and over again. I definitely wasted more time than I saved.
Ok, next video:
Recruit, screen, hire, train, and onboard a panel of five human programmers. Then give them these same prompts and see if they produce it with any better quality than what you got from GPT.
Back in the day, people thought AI would never be able to tell the difference between a dog and a cat.
I love this channel, i get to learn so much with these high quality videos. Love your videos man ❤️
great video as always!! just out of curiosity, what is your job?
2:35 The buffer overflow you found doesn't actually exist, since the variables "buffer" and "filename" have the same size, and the %s can match at most part of the buffer. The more important issue is that the return value of sscanf() is not checked, so filename might be uninitialized after. Also, sscanf() requires a null terminated input, but nothing in the code provides that null termination at all. Even the good cases only work with luck. read() provides the raw network data with nothing added.
Of course, there is also a directory traversal vulnerability. You are running the server as root, so it has access to all the files, and simply getting ../../../../../etc/shadow will read out the shadow database. Actually, since it isn't parsing the file name, it would be enough to get //etc/shadow.
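The sscanf() points above (unchecked return value, missing null termination) can be sketched like this. It is an illustration under assumptions, not the video's code: the function name `parse_request_line`, the 1024-byte scratch buffer, and the 7/255 character limits are all made up for the example.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define REQ_PATH_MAX 255   /* must match the %255s width below */

/* Hypothetical helper: pull the method and path out of raw request bytes.
 * `req` is what read() returned and is NOT assumed to be NUL-terminated.
 * `method` must have room for 8 bytes, `path` for REQ_PATH_MAX + 1.
 * Returns 0 on success, -1 if the request is too long or malformed. */
int parse_request_line(const char *req, size_t req_len,
                       char *method, char *path)
{
    char line[1024];

    if (req_len >= sizeof line)
        return -1;                 /* too long for this sketch */
    memcpy(line, req, req_len);
    line[req_len] = '\0';          /* read() never adds this for us */

    /* The widths cap each %s, and the return value tells us how many
     * fields were actually filled in. */
    if (sscanf(line, "%7s %255s", method, path) != 2)
        return -1;
    return 0;
}
```

On failure the caller should send an error response and must not touch `method` or `path`, since they may be only partly written.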
Clearly scary if all 'developers' start copy-pasting such code in large apps. But it could be interesting to see if ChatGPT is able to fix vulnerabilities in code provided as input...
Yeah thank goodness that doesn't happen now
I feel like the more non-generic the question you ask it, the worse the code it produces. It's easy to look up "HTML server example" and see hundreds of different generic examples. Ask for anything specific that requires critical thinking to create, it will fail in multiple areas. ChatGPT works the best with questions that are widely solved on the internet. And even then, the internet has answers that have flaws, or aren't good because they only exist for the sake of demonstration (stack overflow answers)
The first one is definitely vulnerable to memory issues but I need to spend my time remembering Java, instead of getting gdb out
bruh, I asked AI to generate some code and it mixed ‘new’ and ‘free’ 💀💀💀💀💀
Regardless of your pinned message, it is doing exactly what you asked, and u didn't specify it had to be secure. GPT isn't gonna make assumptions - is all about prompt construction.
ChatGPT's solution is a good start but definitely need thorough validation.
The HTTP server has a path traversal vulnerability too, since it doesn't drop privileges or sanitize user input. You can send GET //etc/shadow HTTP/1.1[CRLF] (note the double slash) and start cracking those hashes.
For reference, Apache starts as root, binds port 80, then immediately sets its UID, EUID and saved UID to the www user. Even if you somehow managed to get remote code execution on Apache, you wouldn't get root without an additional exploit like DirtyCOW.
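The bind-then-drop pattern described above can be sketched roughly as follows. This is not Apache's actual code; the function name `drop_privileges` is hypothetical, the account name is whatever unprivileged user your system provides, and setgroups() is a common Linux/BSD call rather than strict POSIX.

```c
#define _DEFAULT_SOURCE   /* asks glibc to declare setgroups(); harmless elsewhere */
#include <grp.h>
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: call once, after bind() on port 80 has succeeded.
 * Returns 0 if the process now runs as `username`, -1 otherwise. */
int drop_privileges(const char *username)
{
    struct passwd *pw = getpwnam(username);
    if (pw == NULL)
        return -1;                     /* no such account */

    /* Order matters: groups must go first, because once setuid() has
     * run we no longer have the right to change them. */
    if (setgroups(0, NULL) != 0)
        return -1;
    if (setgid(pw->pw_gid) != 0)
        return -1;
    if (setuid(pw->pw_uid) != 0)
        return -1;

    /* Sanity check: getting root back must now fail. */
    if (pw->pw_uid != 0 && setuid(0) == 0)
        return -1;
    return 0;
}
```

A server should treat a -1 here as fatal and exit rather than keep serving files as root.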
You forgot to add “without security vulnerabilities” to the prompt lol 😂 In all seriousness though, regardless of what ChatGPT is capable of now, if they keep innovating at the pace they are now, I can’t imagine what it will look like in a few years
not convincing at all
as of now, when you ask chatgpt, its like it comes out with an answer on the fly
allow it to reflect, to think on the problem, then if the result will be the same - now that's convincing
i almost believed you for a few minutes though
I was disappointed that analysis of the first code stopped at sscanf. The HTTP server reads data off the network and maps that to a local file, which it then sends back. And it runs as root.
That's hugely exploitable and significantly scarier than a buffer overflow issue. If the generated code didn't include the concept of a "document root" and some sort of guard against accessing any file outside of it, then the server can be used to fetch any arbitrary content a malicious user wants. Databases, user/group lists, crypto keys, whatever. No buffer overflow necessary.
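A "document root" guard of the kind described above could be sketched like this, assuming a POSIX system with realpath(). The function name `resolve_under_root` is made up for illustration. The idea is to canonicalize both the root and the requested file, then refuse anything that does not end up strictly inside the root.

```c
#define _XOPEN_SOURCE 700   /* for realpath() under strict compiler modes */
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: resolves doc_root + "/" + request_path and writes
 * the canonical result to `out` (at least PATH_MAX bytes). Returns true
 * only if the file exists and lies inside doc_root. */
bool resolve_under_root(const char *doc_root, const char *request_path,
                        char *out)
{
    char root_real[PATH_MAX];
    char joined[PATH_MAX];

    if (realpath(doc_root, root_real) == NULL)
        return false;                          /* root itself is unusable */

    int n = snprintf(joined, sizeof joined, "%s/%s", root_real, request_path);
    if (n < 0 || (size_t)n >= sizeof joined)
        return false;                          /* path too long */

    /* realpath() collapses "..", "." and symlinks; it fails if the
     * target does not exist. */
    if (realpath(joined, out) == NULL)
        return false;

    if (strcmp(root_real, "/") == 0)
        return true;                           /* everything is under "/" */

    /* The result must be the root followed by a '/' separator. */
    size_t root_len = strlen(root_real);
    return strncmp(out, root_real, root_len) == 0 && out[root_len] == '/';
}
```

This still leaves a window between the check and the later open() in which a file could be swapped for a symlink, so treat it as a starting point rather than a complete defense. Running the server as an unprivileged user limits the damage if the check is ever bypassed.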
Hey just curious, is this GPT 4.0 or 3.5?
The point of using tools isn't that the tool is going to do all of the work. The point of a tool is that it enables a person who knows how to use it properly to do the job more efficiently than before. One efficient person in turn replaces multiple inefficient people. The time it took AI to write all the boilerplate and some of the functionality crushes anyone trying to keep up with it. Then you add one person to configure it to your specific needs. Suddenly you have a situation where you've done your day's work in 30 minutes. Say you code for ~5 hours of your day. If you can do that amount of work in 30 minutes, it means that 9/10 of the time spent working becomes obsolete. Meaning 9/10 coders will pack their stuff or become more efficient.
This is a bit silly. What you did was a 0 shot coding in C, which it mostly passed. This is like writing code for 20 mins, then not checking it, compiling and running it, and never looking it over. Even the best programmers need several iterations to get it right. AI models do too
As far as i was concerned, the idea was to show it needs human supervision to write secure code. Which i believe he showed. So the point "he didn't specify it to be secure" is in line with what hes trying to say. If the bot cant do ALL the necessary tasks a human can do, it cant replace them.
This reminds me of "not hotdog" from Silicon Valley lol. When the episode aired I was working for Facebook and traveled to the Austin, TX office where I was given a tour of a vendor floor that manually reviewed phallic images to determine if they were "not hotdog" lol. It clicked for me that humans are training models, and humans are wrong, a lot.
First one is not vulnerable to buffer overflow but you could argue it is vulnerable to local file inclusion.