Literally the first content I've seen about buffer overflow that was so incredibly well explained that made me get really interested in learning more about it. This video doesn't even feel like it's 44 minutes long, I could watch way more of you teaching this super interesting stuff
There are videos that discuss stack overflow / buffer overflow and more variables. Just gotta know what to look for, this information has been out for many many years! Has a lot to do with network administration and active directories. Heck, I'm surprised there hasn't been any CTF for buffer underrun. I reported the security threat the second I figured I could gain full privilege to a server and reroute traffic. Sad part is I had no idea about bug bounty programs and since big shot did or else I'd be rich lol
@elijah2863 are you a poorly written bot?
Thanks John for uploading this writeup. I'm embarrassed to say that I spent 5 days trying to solve this one, and seeing that the "print" command was to blame made my jaw drop. Love your content, and I have to admit that you've been my inspiration to pursue a career in cybersecurity
my first time ever i understood the whole logic of buffer overflow
Fascinating stuff! Really shows how all these different Linux programs can be used to solve and debug a larger problem
Thanks for explaining everything in a noob friendly way. I will be forever grateful
You explain it very well! I speak French, and your explanations are clearer than the ones in my main language
This was a good one, learned a lot, been watching all of these in order lol
An analogy I like is someone’s normal appetite. They’re normally able to comfortably eat 3 slices of pizza. You feed them 4, they may be able to handle it; feed them 5 and they’ll start to feel sick; feed them 6 and they’ll barf. Meaning you only saw them eat pizza, but now, thanks to the overflow, you could also see, say, spaghetti and corn.
The return value has been compromised :)
for the first time I understand what a buffer overflow really is, thank you!
Best RET2WIN Beginner Masterclass I've ever seen on YouTube. A massive thank you for that!
I usually don't say stuff like this, but I'm a fan of your videos.
Fantastic Unreal content John. Could've been 3 hours and still entertaining and informative.
Incredible stuff and fantastic explanation, you're a great teacher john
THIS is what I wanted to learn,
great content and explanations, finally understood some of the concepts.
Enjoyed the programming walk thru too.
Great video, thanks John
Great as always! Enjoyed it a lot!
Great video John, learned a lot and had fun watching. You make it all seem so easy!
Your knowledge is astonishing!
Great video, thank you so much!
This is my first time seeing a buffer overflow, and you explained it very well sir, I totally get it
Your explanation is great always ❤️😇
Your vids are awesome !!!
This is a good video, I learned how buffer overflows are actually made and what the \x byte characters actually are, thank you.
Congratulations. A thousand congratulations.
I'm looking forward to Buffer Overflow 3.
I've tried everything, but it won't.
Pretty detailed this one John, should be very good for the beginners. You go through a lot of the standard pitfalls here which is great.
Your Python scripting skills are always so incredible!!
great info! thx 4 uploading.
Excellent explanation!
argparse is soo hard > you make it look easy, great work from John.
Melted my brain 😂😂
Excellent video and explanation. Thanks
The computer has a limited number of processor registers (think of them as built-in variables), e.g. EAX, EBX.
When you jump to run another function, those registers need to be saved on the stack.
The next function can then use the EAX and EBX registers itself.
When the function finishes and returns to the previous function, the saved values are copied from the stack back into EBX, EAX and the instruction pointer, so the previous function can carry on.
Well explained John
that python programming was hectic
Exceptional !!!
38:51 what's the "I believe button"?
This was a very good video. A little bit chunky in the explanation about the stack, but the rest was perfect.
One comment. You defined the variable "offset" but forgot to use it. =)
Thank you very much for your effort. BTW I didn't solve this challenge during the event.
Excellent content! learned a lot on this one. Based on seeing the previous buffer overflow CTF, I tried passing various lengths of strings to the program and did manage to get it to crash, but I didn't know anything about how to get the address to the win function and pass that in. Thanks!
That was a fun challenge!
At 31 will watch it all !!
I sort of prefer to think of everything in terms of activation records and dynamic links to other activation records. For some reason it's easier to wrap my mind around everything.
Awesome video, loved what you showed. Interesting how big a leap in required knowledge and skill this entailed. Are you sure there wasn't a simpler solution?
A slightly more efficient method of finding the offset is to generate a fairly large string of characters that never repeat the same 2 bytes, pass it into the app that you started with a debugger, then check the EIP register and read out the little endian format of the string. Search for that as a substring of the original string you generated, et voila.... count the bytes prior and you have your offset without trial and error.
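For anyone who wants to try that pattern trick, here is a rough Python sketch. The `make_pattern` and `find_offset` helpers are made-up names, the `Aa0Aa1...` layout is just one common way to get a string whose 4-byte windows don't repeat, and the "crash" is faked by picking 4 bytes out of the pattern at a hypothetical offset of 112 instead of reading EIP from a real debugger.

```python
import struct
from itertools import product
from string import ascii_uppercase, ascii_lowercase, digits


def make_pattern(length: int) -> bytes:
    """Build a pattern like b'Aa0Aa1Aa2...' (usable up to 20280 bytes)."""
    out = []
    size = 0
    for triple in product(ascii_uppercase, ascii_lowercase, digits):
        if size >= length:
            break
        out.append("".join(triple))
        size += 3
    return "".join(out)[:length].encode("ascii")


def find_offset(eip_value: int, pattern: bytes) -> int:
    """EIP holds 4 pattern bytes read as a little-endian integer.
    Pack it back into bytes and look for it in the pattern (-1 if absent)."""
    needle = struct.pack("<I", eip_value)
    return pattern.find(needle)


pattern = make_pattern(200)

# Hypothetical crash: pretend the debugger reported these bytes in EIP.
fake_eip = struct.unpack("<I", pattern[112:116])[0]

offset = find_offset(fake_eip, pattern)
print(f"EIP = {fake_eip:#010x}, offset = {offset}")
```

With a real target you would feed `pattern` to the program under the debugger and pass whatever value shows up in EIP to `find_offset`.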
Great stuff! This makes me feel extra noob
Little endian means least significant byte gets stored in the lowest memory address.
Big endian means the least significant byte gets stored in the highest address needed for all the bytes or in other words the most significant byte gets put in the smallest address.
Just remember little endian is least significant byte goes to lowest address and big endian is reverse of that.
It just tells you if the "little end" or "big end" comes first.
For anyone wondering why on earth little-endian became a thing, it's because it simplifies mathematical operations for the processor. You always start with the least significant byte, then carry to the more significant ones, so starting with the LSB in memory avoids having to count over from the beginning of its storage to find the LSB, especially for inc or dec or other small operations that will only occasionally carry
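If it helps to see the byte order rather than read about it, here is a small Python sketch using the standard `struct` module. The address 0x080491F6 is taken from the bytes quoted elsewhere in these comments (\xf6 \x91 \x04 \x08), and it is only used as an example value.

```python
import struct

addr = 0x080491F6  # example address, from the bytes quoted in the comments

little = struct.pack("<I", addr)  # least significant byte first
big = struct.pack(">I", addr)     # most significant byte first

print(little.hex(" "))  # f6 91 04 08
print(big.hex(" "))     # 08 04 91 f6

# Swapping endianness is just reversing the bytes.
swapped = big[::-1]
print(swapped == little)  # True
```

So when the address reads 08 04 91 f6 on screen, the bytes you actually send on a little-endian x86 target go out as f6 91 04 08.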
stand up for once so that i can scan that QR code!!, you are killing me! 😮💨💀
So this is where "Stack Overflow" comes from
what if the binary was stripped? how will we find the address of the function then?
Buffer overflows exist for speed and performance reasons.
E.g. "What is your favourite number?" You could simply check in code that the user only types two characters, to stop the buffer overflow attack.
However, you can't do something like that for reading a large XML or HTML file. It will slow things down.
40:00 You forgot to use the offset variable.
Loooooo forgotten but remembered in our hearts.
I'm a little confused about the little endian part. If the stack grows from high addr to low addr, memory addresses increase towards the high addr, and the return addr is just somewhere above the buf variable, then, when passing AAA....\xf6\x91\x04\x08 to the program, shouldn't it read and store the input from the \x08 to the \x41? Like:
low addr                                          high addr
| local var | return address of the function |
\x08 \x04 \x91 \xf6 \x41 ......... \x41 \x00
Then, when the return address is overwritten, it should be the \x00 \x41 being written to the return addr first, rather than the \xf6?
And also, why is the address of win() locally the same as the one on the server? Shouldn't win() have a different address on the server?
Say your function gets called, so the return address gets pushed to the stack and execution transfers to the beginning of your function. ESP is, say, 0x2000 now, which points to the first byte of the return address in memory.
It's a completely bare-bones function that doesn't save the frame pointer or set up any other variables; all it does is subtract 0x100 from ESP to create a 256-byte local buffer starting at address 0x1F00.
If you start copying a string to your buffer, the first byte gets copied to 0x1F00, the 2nd byte to 0x1F01, etc.
If you don't do any bounds checking, the 257th through 260th bytes you copy in will go in 0x2000 through 0x2003 and you've overwritten your return address.
Discrete numbers the processor saves are stored in little-endian on Intel systems. Buffers are generally copied in byte by byte in an incrementing loop, so the first byte of your input lands at the lowest address.
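Here is a toy Python model of that walkthrough, in case seeing the bytes land helps. It is only a simulation in a `bytearray`, not real process memory. It takes ESP = 0x2000 at entry and subtracts 0x100, which puts a 256-byte buffer at 0x1F00. The original return address 0x08049000 is a made-up placeholder, and 0x080491F6 is the win() address implied by the bytes in the question.

```python
import struct

RET_ADDR = 0x2000                # where the saved return address sits
BUF_ADDR = RET_ADDR - 0x100      # 0x1F00, start of the local buffer
BUF_SIZE = RET_ADDR - BUF_ADDR   # 0x100 = 256 bytes

memory = bytearray(0x3000)       # toy address space, all zeroes

# The call pushed a return address (placeholder value) in little-endian.
struct.pack_into("<I", memory, RET_ADDR, 0x08049000)


def unchecked_copy(mem: bytearray, dest: int, data: bytes) -> None:
    """Copy byte by byte towards higher addresses with no bounds check,
    which is roughly what gets() does to the buffer."""
    for i, byte in enumerate(data):
        mem[dest + i] = byte


# 256 filler bytes, then the new return address, lowest byte first.
payload = b"A" * BUF_SIZE + struct.pack("<I", 0x080491F6)
unchecked_copy(memory, BUF_ADDR, payload)

new_ret = struct.unpack_from("<I", memory, RET_ADDR)[0]
print(hex(new_ret))  # 0x80491f6
```

The \xf6 byte is the 257th byte copied, so it ends up at 0x2000, the lowest address of the return slot, which is exactly where a little-endian CPU expects the least significant byte.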
They gave out a pre-compiled executable with the challenge to ensure it was configured to be based at the same virtual address to make the challenge easier. Since it actually tells you the original return address during operation, you could actually still calculate the right address on the server if it were based differently but not changing from run to run.
great
can you please do more malware analysis videos
what were the error codes it gave when it segfaulted? i'm curious as to what exactly happened. the first one gave error 14 in _vuln_ second one was error 6 in _vuln_ and the third was error 14 in _libc-2.33_ , was it a coincidence they had the same number or was it the same error on different programs? is there a way to get more details about an error code when working with something like this?
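A possible way to read those numbers, sketched in Python. This assumes the messages are the usual Linux x86 "segfault at ... error N" kernel log lines, where (as far as I know) the number is the CPU page-fault error code printed in hexadecimal. The bit meanings below follow the x86 page-fault error code layout; the `decode_segfault_error` helper is a made-up name.

```python
# (bit mask, meaning if set, meaning if clear or None)
PF_FLAGS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "not a write"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_segfault_error(code_hex: str) -> list:
    """Decode the 'error N' value, assuming N is printed in hex."""
    code = int(code_hex, 16)
    parts = []
    for mask, if_set, if_clear in PF_FLAGS:
        if code & mask:
            parts.append(if_set)
        elif if_clear is not None:
            parts.append(if_clear)
    return parts


for code in ("14", "6"):
    print(code, "->", ", ".join(decode_segfault_error(code)))
```

Under that reading, error 14 is the CPU trying to fetch an instruction from an unmapped address (a jump to garbage), and error 6 is a write to an unmapped address. So the matching numbers would be the same kind of fault, and the name after "in" just says which mapped file the instruction pointer was in.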
I have a question: can this Python script be used for other CTFs, with some modifications of course?
Cyberchef also does endian swapping, so you can just dump in the address (shown as big endian), then it will give you the little endian rep
I don't get why the win function address is the same on both hosts, your local machine and the remote server, couldn't the win function be allocated in different addresses on memory of different hosts (also disabling aslr)?
I think I'm stupid. It will be the same because it's the virtual memory address, and it will be mapped onto the appropriate physical memory by the OS.
Well, you're not stupid, because you answered your own question correctly ;)
Because there are protections called PIE/ASLR that randomize the address, but in this binary they are disabled, so the address will be the same
Someone HELP ME PLEASE... I type "./ vuln" and then it shows "-bash: ./: Is a directory", at minute 2:15 in this video. What should I do?
Don't put a space after the ./, write it as ./vuln
Lol I literally just held the "K" button to test if I could overflow it at random and it spit the flag out.
Is this not too easy with the source code?
The only thing the source code helped with was spotting the use of gets and the buffer size
You can still view the source code by decompiling the binary, but it's not 100% the same as the original source code
Use awk
Who uses 32bit x86 in 22
Are you suffering from mouth cancer?
It's GNU/Linux. And it's a framebuffer, not a linux buffer.
And industry names 32-bit x86 simply Aarch64, so terminology related to hardware and software are less confused with platform and system concepts. So maybe a better name is a Unix buffer.
Simple user mistakes by displaced devs but keep up the good work! Just work on your terminology conventions because using it like you do really sucks and kills people.
(btw, your graphic terminology sucks even harder but that's not just a typical problem in the USA; it's global.)
holy shit, you need to go outside dude, touch some grass fr.
I'm pretty sure mixing up some terminology doesn't kill people 🤣
Hey John, thank you so much! I was stuck on this when I was using the print function and feeding its output to gdb. In the gdb interface, it took \x90 as the literal ASCII characters \, x, 9 and 0. I was so frustrated about how to give \x90 as my input, because that just doesn't print anything (ofc it's just nops). Your sys.stdout.buffer.write() method saved me! Thank you once again!
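For anyone hitting the same wall, here is a small Python 3 sketch of why print() and a raw byte write differ. It captures the output in an in-memory stream instead of the real stdout so the bytes can be inspected, and it assumes a UTF-8 output encoding (other locales will give different bytes). Both helper names are made up for the demo.

```python
import io


def bytes_emitted_by_print(text: str) -> bytes:
    """Return the bytes print() sends to a UTF-8 text stream."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")
    print(text, end="", file=wrapper)
    wrapper.flush()
    return raw.getvalue()


def bytes_written_raw(payload: bytes) -> bytes:
    """Return the bytes a binary stream receives, unchanged."""
    raw = io.BytesIO()
    raw.write(payload)
    return raw.getvalue()


print(bytes_emitted_by_print("\x90"))  # b'\xc2\x90'
print(bytes_written_raw(b"\x90"))      # b'\x90'
```

In an actual exploit script the raw version corresponds to `sys.stdout.buffer.write(payload)`, which skips the text encoding step, so a single 0x90 stays a single byte instead of turning into two.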