I always feel uncomfortable seeing/doing something like p == pe to check if the end has been reached, so it's interesting seeing those fears validated. I always do p >= pe, and add an assert(p <= pe) as well.
I've been on YouTube since it was Google Video and I've never seen a video description like yours. Good stuff, this video presentation goes tough as nails, I salute.
Got this in my feed with only 20 views. I usually skip those, but I gave it a chance. Great quality on the video, no doubt you'll be big one day.
If you’re ever waiting for something to increment until it equals a value, it doesn’t hurt to have an “else > value” block that throws an error to let you know something went wrong
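A minimal C sketch of that tripwire pattern (the function and parameter names here are illustrative, not from any real codebase):

```c
#include <stdio.h>
#include <stddef.h>

/* Advance an index by `step` until it reaches `len` exactly.  The
 * greater-than branch is the tripwire: if a buggy step skips over
 * `len`, we fail loudly instead of looping forever. */
int advance_to(size_t len, size_t step) {
    size_t i = 0;
    while (i != len) {
        if (i > len) {  /* the "else > value" case */
            fprintf(stderr, "index %zu overshot end %zu\n", i, len);
            return -1;
        }
        i += step;
    }
    return 0;
}
```

With step = 2 and len = 4 the loop exits normally; with step = 3 it trips the guard instead of spinning until the index wraps around.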
This is where documentation becomes important. If the developer who implemented the empty buffer had explained why they needed it, maybe this wouldn't have happened or at the very least they could have figured out a way to circumvent the problem when they were first rewriting the HTML parser.
But there's always the chance the buffer was perhaps entirely accidental (smells like an off-by-one error to me, instantiating one too many buffers for our purposes) rather than actually covering up this other off-by-one error in the other parser
@@ferociousfeind8538 naw, I don't think it was accidental. They had deliberate functionality in there where the system would chunk HTML docs, and the final chunk in the buffer was always empty. That empty buffer chunk was like a carriage return in a way. The closing tag of the HTML indicated the end of the doc, but if the doc doesn't have a closing tag, the lack of characters in the following buffer would work as a flag that tells the parser this is the end of the doc. It's obvious this was done deliberately, but the reason why exactly they did it this way is a bit vague. Hence why documentation would have been important. When they migrated from the old parser, they obviously didn't take into account the edge case where the HTML is broken: they removed the empty chunk without adding logic to handle that case.
this is why i write my thought process for most things, especially if it's "clever". forget someone else. *I* need to remember what I did and why 2 months later when I look at it again lol
This video and channel deserves so much more recognition. I'm almost halfway through the video and it's so well put together. Good wishes and I cannot wait to see you get the views you deserve for the effort put in. ♥♥♥♥
This video was super good and well-made! I don't know how to describe it, but it just felt good to watch. The visuals were just so satisfying, and I especially liked 6:51. I also really appreciate the sources, assumptions, and corrections in the description! Many big YouTubers don't cite anything and go by the philosophy of "well you shouldn't trust me entirely anyways so it's not my fault if I misinform you." Subscribed and liked, great video!
New favorite channel; love the high BPM background music, visuals, concise yet detailed explanation, overall format/structure, the Michael Bay explosions every 5 seconds, etc.
I just found your channel, must say I am impressed and like the content. I have a CS and Mathematics degree, and most channels don't give a deep enough dive or are just too cornflakes-with-water dull to pay attention. Thanks for the content
Kevin! Just stumbled on this video and was watching until 5:00 before noticing that your channel is tiny. How are you this good at this humble size! Big ups to you my man. Thanks for the content
I have never been more proud to know that as a c++ dev I will always be needed because legacy moon runes written 30 years ago will inevitably fail when some obscure pointer spills over into undefined memory.
Such an informative video! It's easy to follow, and it taught me to value backwards compatibility more. I hope you get more views! Also kinda surprised to see Mr. Affable there
Which is why, when doing range checks, you should not test for equality but for greater-or-equal/less-or-equal; that way an off-by-one error can only grab one extra char. Yes, it will cost performance, but if that is unacceptable, you need a much better understanding of the code.

The extra empty buffer seems to me a big red flag: if it's explicitly added and not just the result of an off-by-one error, it probably served a purpose and should be thoroughly documented, or refactored/re-engineered into something less obfuscated. This is why I always encourage curiosity and ask all new devs to question any code they do not understand; if a more experienced dev cannot explain it in an understandable way, it's probably wrong, or at least bad code that should be rewritten :)
I had a stupid bug last month that heavily degraded performance; it also happened on a Friday night. I had added a new caching solution some years back (one I wrote), and after 1.5 years of flawless performance, I added more usage. This tipped the scales, and the caching storage was exhausted. I then remembered I had forgotten to add a flush of storage in this case, so it got full and all new requests failed. I quickly added a flush, only a few hours later, but this was only a bandaid; it'll fill, flush, fill.

So after some quick sleep with a fever from the flu that I had, I realized the cause (the cache was keyed on something unique, causing every call to create a cache entry with no hits, at 1+ million transactions per second), so I deleted the caching at that point and it flowed again, phew.

Cause? A design change. The caching point WAS a good non-unique place for 6 months in development, but during bug testing, someone altered it, so it became unique. And I had already done the tests for performance at scale, so it just wasn't noticed 😬

Luckily it was hardly noticed by anyone, but it could have been truly terrible. I work in a bank, and the finance engines were grinding to a halt. A process that runs some critical financing was still running after 16+ hours! After my hotfix, we terminated it and restarted, and it took 2 minutes (as it should) 😳 If I wasn't sick with a fever, I could likely have reacted faster, but thinking in that condition was as slow as a caching bottleneck 😂
I've learned to use >= instead of == even when you always expect the pointer to never get past the target, cuz you never know, right? That alone could've prevented this from happening as well
Yes, only add == when you truly only want to run your code when the variable is exactly that value, if the code can accept higher values there's no point not using >=
10:20 Usually, if some code looks dumb, or inefficient, it’s because some software engineer before you working to a deadline had to get something out fast, not clean- and there’s probably a good reason for that “empty buffer” 😂
As soon as I saw the pre-increment on the pointer I knew that was the problem. I don't have any professional experience, but I know the human mind was never meant to comprehend that operator.
I used to have a website, full of scripts that were poorly coded by me. It was full of unfinished tags i.e. those that weren't closed properly. It could have caused the whole internet to collapse if my website was visited by a lot of people
This is very common for how memory bugs occur in software. One programmer makes an assumption as to how the memory works, and writes their code accordingly, then some other programmer changes some other piece of code that makes it so those assumptions no longer hold, and voila, you have a bug.
The only problem I see with this is: how the f did they not have 100% test coverage on something this important? If they had proper test cases, they would have instantly caught an issue with the new parser implementation. It's insane that such important digital companies don't follow the most basic coding practices; it's just mind-boggling to me.
No company can reach 100% test coverage. Seems you don't know how any important software is patched together in the real world. 😁 It's all patchwork. No software can be defect free.
"why was an empty buffer added? No reason" I can almost guarantee that was a developer who thought "someone will inevitably do something dumb. Let's make sure there's nothing beyond the last buffer"... and the developer who removed it thought "this system must be perfect, no overhead. It's not like anyone's going to try to read beyond the last buffer".
This one was a double-whammy: code optimization kills the safety hack the old engineer put in place (never documented; probably SOP to him), plus a rookie mistake in implementing the boundary check conditions. Back in my good old days of Delphi, the compiler had an option called Range Checking, which would guarantee this kind of bug never saw the light of day. However, it hindered performance, so most devs never used it outside of debug builds.
It is, however, an *excellent* showcase of how things should be done once an issue has occurred. Google's infosec team found it and reported it quickly. Cloudflare's response time was excellent; initial mitigation was under an hour. Cross-Atlantic handoffs gave them a constant flow of work. Full resolution within a few days. And I imagine next time the Google team needs their beers covered, Cloudflare will happily foot the bill
I only understood probably 10% of this, but from the very little that I did understand, it appears that the way HTML is parsed is part of the reason why it is still so important to learn lower level languages, like C. Also: VERY important: this is why we always use the proper comparison operators during an iteration loop! 😂
Back at my college we were told to use >= or <= for buffers for this exact same reason. This precise condition (++p == pe) is based on the assumption that p will always be incremented by 1 and never by 2. But there is actually the possibility it will be incremented by 2 or even more. Simple use of (++p >= pe) would fix the issue and prevent it from happening. It reminds me of the old buffer overflow exploits somehow. The cause may be bugs in this or other code, solar or other radiation flipping a bit from 0 to 1, malfunctioning memory or registers, etc.
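To make that concrete, here's a hypothetical sketch: integer offsets stand in for the parser's p/pe pointers so the unsafe variant can't actually touch out-of-bounds memory, and `step` mimics a parser action that over-advances the cursor.

```c
/* Loop with an equality exit: if `step` can jump past `end`, the
 * condition never becomes false.  `cap` bounds the simulation so
 * the demo itself terminates. */
long steps_with_eq(long end, long step, long cap) {
    long p = 0, n = 0;
    while (p != end && n < cap) {
        p += step;
        n++;
    }
    return n;
}

/* Same loop with a greater-or-equal style exit (written as p < end):
 * overshooting `end` still terminates the loop immediately. */
long steps_with_ge(long end, long step) {
    long p = 0, n = 0;
    while (p < end) {
        p += step;
        n++;
    }
    return n;
}
```

With end = 8 and step = 3, the >= version stops after three steps, while the == version never finds p equal to 8 and runs until the cap (in the real bug, until it walked far past the buffer).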
I love how there's a team of literal wizards out there somewhere speaking ancient runic languages who have the power to delete the internet if they get a rune wrong.
Imagine the same but for all of reality 🤔
@@jeffbrownstain guys... I think I have my next D&D campaign.
I will add a wizard hat sticker to my programming laptop in memory of this comment.
these "wizards" were parsing HTML with regex lol. don't be too impressed by them. don't be fooled by all these fancy terminologies. that's an absolute rookie mistake and big fat no no lol
@@girlswithgames What a stupid comment. This is exactly the right thing to do. How do *you* parse HTML?
It sounds like an old engineer added the buffer to avoid a leak, but didn't leave notes on it
Pro engineering move: don't actually fix the issue, implement an undocumented hacky workaround that itself seems like a bug so that when someone else fixes the workaround they get hit with the initial issue
@@kevinfaang Now when you put it that way .. that is most likely what happened lol :D
To be fair though, I think it is unlikely they knew about the leak, given the rarity and impactfulness. I would chalk the empty buffer masking the issue up to coincidence.
@@kevinfaang He was just placing an elephant in Cairo
@@kevinfaang I think it's more likely the author of the code knew it was a possibility, so they put it there just in case. Otherwise there's really no reason to have an empty buffer.
The general rule when you see weird stuff like this is, "it's there for a reason".
Cloudflare deserves some more credit for how transparent they were about the issue
Considering that Google and all the other caching services were involved, it would've come to light eventually hurting their trust even more, so they didn't have much of a choice
They knew google would snitch
They had to be
theyre an infosec company lmao ofc they were
look at their bug bounty, they only pay a max of $3k ;) So it is worth selling the bug on the black market
one thing my father always told me about iterators: "NEVER check index == end. ALWAYS check index >= end. You should never assume that your iterator is consistent and never skips over any values."
i was looking for this comment. it's not like == is any faster or more readable.
That advice taken at face value is bogus since many kinds of iterators only support equality-comparison, e.g. when iterating a linked list. But it's good advice if your iterator does support a total ordering that's consistent with iteration.
@@thisismygascan4730 == can actually be faster in some cases, since it makes it easier for the compiler to reason about the loop count, but yeah it _usually_ makes little to no difference
@@MatthijsvanDuin from my experience, nearly all instruction sets I've seen so far (AVR, MIPS, x86, ARM) take the same number of clock cycles for integral comparison (usually 1 or 2).
@@randomghost1080 It's not about the time taken by a literal == or >= comparison, it's about enabling the compiler to do loop transformations.
As a dev, I'm not sure if I'm comforted because this stuff can even happen to a company like CloudFlare, or horrified for exactly that reason.
That is why we invent safety things. Sure, you can operate this machine without such-and-such a guard, but just know that for every smart one who can, there is a dumb/drunk/exhausted/stressed someone else who will get chewed up if this machine can operate with anyone within eyesight of it.
@@fulconandroadcone9488 Sometimes, the smart one is also the exhausted and stressed one (also, smart people can be drunk, though hopefully only during hobby projects). I mean, I don't know about you, but whenever I'm too sleep-deprived, the quality of my work generally declines.
@@nickstegman8494 my guess is that if you disregard basic safety, which includes disabling safety mechanisms during normal operation, you might not be as smart as you think you are
@@fulconandroadcone9488 you've never worked while severely exhausted. Not a programmer, but in any field exhaustion leads to dumb basic mistakes.
Example, I was exhausted the other night after a 16hr shift, made some cereal for quick dinner, and put the milk in the pantry and cereal in the fridge.
We're only human, after all🗿
There are three hard problems in computing: cache invalidation, naming things, bounds checking, and hunter2.
That one took me embarrassingly long to get
Cleverrr
lol. nice one
I like this one xDDD
boundary checking is not that hard, but it does cut into CPU cycles.
The more you learn of the Internet and its overall supporting structure, the more you'll realize how fragile it really is and how terrifyingly easy it is to make it crumble.
It's a literal Jenga tower. We all use packages that use other packages that are all maintained by unpaid individuals.
As an electrical power engineer, I would implore you to take a look at our electric grid, and you will feel way better about the internet! 😅
Hardly.
It's easy to knock down the modern HTML standard, but the basic address system and the physical network are very robust.
@@JoshSweetvale until you learn about operation Cyberpolygon being run by the same people that ran Event 201 right before covid
@@CubeInspector There's no-one at the steering wheel.
You managed to make an overly technical topic very interesting instead of boring! Never saw anything like that before.
The title was intriguing and video was so interesting that I didn't even notice that video has 54 views from such small channel with 157 subs. Good job :D
I didn't realize that either until I read this comment :D Really good content.
2 days later and the channel has 1270 subs
Now 1.46K subscribers
Edit: Now 4.01K subscribers
Edit2: 6.04K subscribers
Now 2.2k
Update: 3.2k
So it was the time honored, decades old combination of:
- working with naked pointers
- not checking if you reached the end
- foolishly trusting that input from a network source _isn't_ garbage
Calling it "Cloudbleed" is fitting, as Heartbleed had all of those as well :)
I wish HTML in browsers was actually validated before being rendered, to force developers to simply write it properly
Never trust input from the end user, period. Garbage in, garbage out. Not too long ago I fixed non-ASCII chars bombing a piece of middleware, all because the commercial devs for the input system never thought users would cut and paste content from the Internet into a comment field. Nor did they properly set the XML declaration for the output file to specify the character set!
*not checking if you reached or overshot the end. the check should have been >= not ==
More like an off-by-one error
@@williamdrum9899 This was off by a _lot_ more than one.
And also you were off _with a pointer_ - which is why you should not work with naked pointers unless you really need to go low level.
This also shows a different common issue with loops, where the exit clause is an equality check. Sure, you might expect the incremented value to eventually reach the desired value, but the safer thing to do in this case is check whether it's higher or equal. I would likely write it as less-than, but it depends a bit on what the surrounding code looks like.
Was gonna point out the same thing. Never assume an incrementer value will always eventually _exactly equal_ a target value, any number of things (race conditions, cosmic ray bitflips, floating-point fuckery, ancient mummy-curses, etc) could cause it to somehow "miss" the intended exit value.
@@WackoMcGoose Ancient mummy-curses is my new excuse for issues in code 🤣🤣
@@soxic-y8e Well, once you've ruled out race conditions (because you're using a language that enforces thread safety, or your code isn't multithreaded to begin with), cosmic rays (because your server uses ECC memory), and floating-point fuckery (because you're using integer vars), at that point it's time to look into supernatural root causes.
...Honestly, I'm surprised there _aren't_ more COEs containing the phrase "Return the slaaaab..."
Checking for equality would be the normal case. If the index is beyond the upper limit, that should trigger some form of assertion so the root bug is fixed, not masked.
@@cassinihuygens1288 Sure, but preventing all hell from breaking loose is more important; you can always add logging after breaking out of the loop in case shit hits the fan.
I'm not saying you shouldn't deal with the issue, I'm saying you should ensure the code runs as expected even when failing.
A small addition: it would be better to check whether the value exceeded the expected value outside of the loop, for quite massive performance reasons; a loop like this will run so many times that a simple if-statement inside it will slow it down significantly. It is also only necessary to check after the loop has exited, since the value couldn't have been above the limit before that. So regardless, this is how you'd do it.
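One way to sketch that shape in C (a hypothetical token counter, not anyone's real parser code): the hot loop carries only the `<` comparison, and the overshoot check runs once after it exits.

```c
/* Count tokens in [p, pe), treating '\\' as a two-byte escape.  The
 * variable advance means a trailing escape can push the cursor past
 * pe; a single post-loop comparison detects that without paying for
 * an extra branch on every iteration. */
int count_tokens(const char *p, const char *pe) {
    int n = 0;
    while (p < pe) {
        p += (*p == '\\') ? 2 : 1;  /* parser actions may over-advance */
        n++;
    }
    if (p > pe)      /* checked once, outside the hot loop */
        return -n;   /* negative signals a truncated trailing escape */
    return n;
}
```

A well-formed input exits with p == pe and a positive count; a buffer that ends mid-escape exits with p one past pe, and the post-loop check catches it.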
I didn't really think an HTML/XML parser was _that_ hard to implement, but I never even considered that it's basically just a giant state machine, where different characters can change the entire state in many different ways, and managing that is nightmare inducing
you don't actually write the state machine, that code is generated
Any parser can be a state machine.
The fact that it is a state machine is just an implementation detail.
A layer of abstraction is missing in the description. A finite state machine is indeed used in many modern approaches, but they are usually auto-generated from a more human-friendly parser description. So the finite state machine remains under the hood, invisible for the end-user (programmer). Do you know regexp? In classic implementations finite state machines are generated from regexp, and then used to do the parsing.
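As a toy illustration of the character-driven state machine idea (hand-written and two states only; a real HTML tokenizer has dozens of states and, as noted above, is usually generated from a grammar description):

```c
#include <stddef.h>
#include <string.h>

/* OUTSIDE copies characters to `out`; INSIDE skips everything until
 * the closing '>'.  Every input character can flip the state. */
enum tag_state { OUTSIDE, INSIDE };

size_t strip_tags(const char *in, char *out) {
    enum tag_state s = OUTSIDE;
    size_t n = 0;
    for (; *in != '\0'; in++) {
        switch (s) {
        case OUTSIDE:
            if (*in == '<') s = INSIDE;
            else            out[n++] = *in;
            break;
        case INSIDE:
            if (*in == '>') s = OUTSIDE;
            break;
        }
    }
    out[n] = '\0';
    return n;
}
```

Even this toy already has edge cases (an unclosed '<', a stray '>' in text) whose handling is a policy decision, which hints at why the full thing gets nightmarish.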
You can't parse [X]HTML with regex. Because HTML can't be parsed by regex. Regex is not a tool that can be used to correctly parse HTML. As I have answered in HTML-and-regex questions here so many times before, the use of regex will not allow you to consume HTML. Regular expressions are a tool that is insufficiently sophisticated to understand the constructs employed by HTML. HTML is not a regular language and hence cannot be parsed by regular expressions. Regex queries are not equipped to break down HTML into its meaningful parts.
And the HTML parser needs to be able to turn itself off temporarily when it finds a <script> element.
This is the reason why I always do a >= rather than just an == comparison; you never know when an accidental off-by-one error might occur
Your presentation and use of case studies to make something as mundane as reviewing code somehow made for a downright entertaining watch. Subscribed!
An empty buffer as a terminator of a buffer filled with null-terminated strings is a common enough pattern to be named: double-null-terminated buffers. But it's also weird enough that even when it's used, it's often not consistently produced or consumed.
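For anyone who hasn't met the pattern, a minimal consumer looks like this in C (a sketch; Windows' REG_MULTI_SZ registry values are the classic real-world use):

```c
#include <string.h>

/* Count the strings in a double-null-terminated buffer such as
 * "one\0two\0three\0\0": each entry ends in a NUL, and an empty
 * entry (a second NUL in a row) terminates the whole list. */
int count_multi_sz(const char *multi) {
    int n = 0;
    while (*multi != '\0') {          /* empty string = end of list */
        n++;
        multi += strlen(multi) + 1;   /* hop over this entry's NUL */
    }
    return n;
}
```

The fragility the comment mentions is visible even here: a producer that forgets the final empty entry sends this loop reading past the end of the buffer.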
Cool info! A lot of people here are assuming it was some design quirk
As a QA software dev, my job is to write the program that runs our automated tests as well as hunt down bugs in the code the regular QA guys find. Kudos for making an entertaining video out of a bug hunt.
I have had some looooong hunts. Requiring a very specific type of bad input is really hard to figure out, I often end up stepping through the debugger trying to think of if any of the possible branches would fubar me.
I have a fun bug report for you.
Devs on the project come up to me saying that Cassandra DB rejects a basic SELECT query on data of a certain user, and they can't see the entire stack trace because logs are masked on production. I try the query and see that the JSON parser returns with "index out of bounds" while constructing the array of integer IDs from the result! By removing numbers from the end of it and looking up the source code of the parser used by the database, it turned out that the hacky ICPC-style code was failing on a combination of the first 3 numbers, pushing the index into the thousands.
I became a much better software engineer the day I joined a small company where it was standard practice to stay at work until the bug that was blocking progress was fixed. I wasn't there very long before I became very adept at identifying possible error conditions. As you suggest, a good debugger is an essential tool because it shows you all the things you didn't think about.
@@chupasaurus Exhaustive scenario walk-through during code inspection could have weeded it out before it got into the final product, but it depends on the caliber of the code inspectors.
The QA community should self-publish a monthly magazine filled only with bug hunts from devs out in the trenches like this video. Like veterans telling war stories at a bar, imagine all the little things you could pick up along the way over the years.
The evil part comes, when running the debugger changes how the program is executed, for example when the bug comes from a race condition and running the debugger naturally makes the code run slower.
This channel is so underrated omg I can't even believe that a small channel can produce so much quality content, keep going bro, you're going a long way ❤
Based on the voice I think it’s the same guy behind Fireship
Edit: After listening to Fireship again I can hear a difference in the voice, and Fireship also has a channel called Jeff Delaney, which I would think is his personal channel.
@@AndersHass I don't think so, it has some similarities but I don't think it is the same guy behind fireship
Now, this was _very_ instructive, and, to this old hand who committed every imaginable blunder under the sun, very relatable. Thank you!
I finished watching the video and thought that you would have at least 100k subs. Keep making great content and you will definitely go far!
This is a really great video. It explains the situation really well and easy to understand. I also massively appreciate how you put footnotes in the description.
this aged well
Kudos for isolating and fixing the problem successfully in short order. Really like the video, well made and fun to watch.
I really appreciate how this guy has all his assumptions, corrections, etc, in the vid description.
Well done, Kevin. This was extremely well made. Loved it!
I always feel uncomfortable seeing/doing something like p == pe to check if the end has been reached, interesting seeing those fears validated. I always do p >= pe, and add an assert(p <= pe) as well
I figured that out myself while programming, I thought "how am I *sure* this index will be ever equal to the limit?"
I've been on YouTube since it was Google Video and I've never seen a video description like yours, good stuff, this video presentation goes tough as nails, I salute
Got this in my feed with only 20 views. Usually I skip those, but I gave it a chance. Great quality video; no doubt you'll be big one day.
If you’re ever waiting for something to increment until it equals a value, it doesn’t hurt to have an “else > value” block that throws an error to let you know something went wrong
This is where documentation becomes important. If the developer who implemented the empty buffer had explained why they needed it, maybe this wouldn't have happened or at the very least they could have figured out a way to circumvent the problem when they were first rewriting the HTML parser.
But there's always the chance the buffer was perhaps entirely accidental (smells like an off-by-one error to me, instantiating one too many buffers for our purposes) rather than actually covering up this other off-by-one error in the other parser
@@ferociousfeind8538 naw, I don't think it was an off-by-one. They had deliberate functionality where the system would chunk HTML docs and the final chunk in the buffer was always empty. That empty buffer chunk was like a carriage return, in a way. The closing tag of the HTML indicated the end of the doc, but if the doc doesn't have a closing tag, the lack of characters in the following buffer works as a flag that tells the parser this is the end of the doc. It's obvious this was done deliberately, but exactly why they did it this way is a bit vague. Hence why documentation would have been important. When they migrated from the old parser, they obviously didn't take into account the edge case where the HTML is broken: they removed the empty chunk without adding logic to handle that use case.
this is why i write down my thought process for most things, especially if it's "clever". forget someone else. *I* need to remember what I did and why 2 months later when I look at it again lol
This video and channel deserves so much more recognition. I'm almost halfway through the video and it's so well put together. Good wishes and I cannot wait to see you get the views you deserve for the effort put in. ♥♥♥♥
8:09 the humor in this little animation is simply sublime. Made me giggle while watching at 1 am
I've been looking for videos like this. Tech/software story telling. I loved your delivery, hope to see more from you. Subscribed!
This video was super good and well-made! I don't know how to describe it, but it just felt good to watch. The visuals were just so satisfying, and I especially liked 6:51. I also really appreciate the sources, assumptions, and corrections in the description! Many big YouTubers don't cite anything and go by the philosophy of "well you shouldn't trust me entirely anyways so it's not my fault if I misinform you." Subscribed and liked, great video!
^ this++++++
@@SilnyDuke520 doesn't compile
New favorite channel; love the high BPM background music, visuals, concise yet detailed explanation, overall format/structure, the Michael Bay explosions every 5 seconds, etc.
I just found your channel; must say I am impressed and like the content. I have a CS and Mathematics degree, and most channels either don't give a deep enough dive or are just too cornflakes-with-water dull to pay attention to. Thanks for the content
This form of content has great potential for your channel. good work
I work on systems that address network vulnerabilities for AWS, and you did a good job here.
Kevin! Just stumbled on this video and was watching until 5:00 before noticing that your channel is tiny. How are you this good at this humble size! Big ups to you my man. Thanks for the content
I speak english. I did not understand a single word in this video. I watched the whole thing anyway.
I was super surprised when I saw that you didn't have atleast 100k subscribers because of how high quality this video is. Great job!
TLDR: Buffer overrun in some legacy parser.
wtf lol i clicked on this and watched it the whole way thru thinking it was from some big tech channel but it only has 984 views. lol good vid bro
Had some code break once after an update. It was the update that exposed an old bug that hadnt been caught. I can see how things like this happen.
Those can be fun; what is not fun is having an issue and purposefully ignoring it until it bites you in one of the less pleasant areas.
well, this aged well
I have never been more proud to know that as a c++ dev I will always be needed because legacy moon runes written 30 years ago will inevitably fail when some obscure pointer spills over into undefined memory.
very entertaining and educational! your combination of humor and narration is great
Such informative video! It's easy to follow and it taught me to value backwards compatibility more.
I hope you get more views!
Also kinda surprised to see Mr. Affable there
mr affable is everywhere
Dude I work in CDN and you explained everything perfectly, great work!
Where do you work? Is it a worthwhile job? I'm just getting started on my Comp Sci degree and still exploring and researching my options.
I got recommended this video, first time I've seen your channel and I gotta say I enjoyed this video. I'm now subscribed 👍
How did I not know of this channel before? Great video dude
Which is why, when doing range checks, you should not test for equality but for greater-or-equal/less-or-equal; that way an off-by-one error can only catch one extra char. Yes it will cost performance, but if that is unacceptable, you need to have a much better understanding. The extra empty buffer seems to me like a big red flag: if it was explicitly added and not just the result of an off-by-one error, it probably served a purpose and should be thoroughly documented, or refactored/re-engineered into something less obfuscated.
This is why I always encourage curiosity and ask all new devs to make sure to question any code they do not understand. If a more experienced dev cannot explain it in an understandable way, it's probably wrong, or at least bad code that should be rewritten :)
Wow. I can't imagine the time it must have taken to make this video. Well done.
I had a stupid bug last month that heavily degraded performance; it also happened on a Friday night. Some years back I had added a new caching solution (one I wrote), and after 1.5 years of flawless performance, I added more usage. This tipped the scales, and the caching storage was exhausted. I then remembered I had forgotten to add a flush of the storage in this case, so it got full and all new requests failed. I quickly added a flush only a few hours later, but this was only a band-aid; it would fill, flush, fill. So after some quick sleep with a fever from the flu I had, I realized the cause (the cache was keyed on something unique, so every call created a cache entry with no hits, at 1+ million transactions per second), so I deleted the caching at that point and it flowed again, phew.
Cause? A design change. The caching point WAS a good non-unique place for 6 months in development, but during bug testing, someone altered it, so it became unique. And I had already done the tests for performance at scale, so it just wasn't noticed 😬
Luckily it was hardly noticed by anyone, but it could have been truly terrible. I work in a bank, and the finance engines were grinding to a halt. A process that runs some critical financing was still running after 16+ hours! After my hotfix, we terminated it and restarted, and it took 2 minutes (as it should) 😳
If I wasn't sick with a fever, I could likely have reacted faster, but thinking in that condition was as slow as a caching bottleneck 😂
Sounds fake.
@@gustawbobowski1333 cool
What did you major in, and how did you learn this?
You just popped up in my recommendations! Great video!
I've learned to use >= instead of == even if you always expect the pointer to never get past the target, cuz you never know, right? That could've prevented this from happening as well
Yes, only add == when you truly only want to run your code when the variable is exactly that value, if the code can accept higher values there's no point not using >=
"debugging and rethinking life"
im going to start using that
Great video! The silly sound effects were beautiful; they're what I imagine in my head when doing my own programming :)
This video is amazing! You deserve much more popularity.
10:20 Usually, if some code looks dumb or inefficient, it's because some software engineer before you, working to a deadline, had to get something out fast, not clean, and there's probably a good reason for that "empty buffer" 😂
As soon as I saw the pre-increment on the pointer I knew that was the problem. I don't have any professional experience, but I know the human mind was never meant to comprehend that operator.
This is kinda like a crime doc but for software engineering and I'd gladly watch much more
1.4k... I expected this channel to have at least 100k. Great video man, you're making it into the algorithm!
Happened again with CrowdStrike..
Thank you for adding the credits, you are the best!
I used to have a website, full of scripts that were poorly coded by me. It was full of unfinished tags i.e. those that weren't closed properly. It could have caused the whole internet to collapse if my website was visited by a lot of people
?
Bro really forgot the difference between p++ and ++p that was taught in Intro to CS 💀
This is very common for how memory bugs occur in software. One programmer makes an assumption as to how the memory works, and writes their code accordingly, then some other programmer changes some other piece of code that makes it so those assumptions no longer hold, and voila, you have a bug.
This channel is awesome, man. Props. You deserve far more subs.
The only problem I see with this is: how the f did they not have 100% test coverage? With something this important, if they had proper test cases they would have instantly caught an issue with the new parser implementation. It's insane that these important digital companies don't follow the most basic coding practices; it's just mind-boggling to me.
No company can reach 100% test coverage. Seems you don't know how important software is patched together in the real world. 😁
It's all patchwork. No software can be defect-free.
The extra buffer at the end feels like a band-aid fix that never got documented
Absolutely incredible video. Thank you.
"why was an empty buffer added? No reason"
I can almost guarantee that was a developer that thought "someone will inevitably do something dumb. Let's make sure there's nothing beyond the last buffer"... and the developer that removed it thought "this system must be perfect - no overhead. It's not like anyone's going to try to read beyond the last buffer".
8:54 and that's why >= or <= is the way to go
this style of video is great. you should do more of this
0:31 oh shit I remember reading that tweet and thinking:"well someone's about to have a shitty day"
great sound quality and recording. already better than most other channels
10:50 That's horrifying.
Got this in my algorithm, and really liked the video and the way it was narrated! :)
This is a fantastic video. Wonderful explanation that I think even someone who never saw a line of code might understand
I love your editing style. nice vid!
keep making videos like this, this video was amazing, absolutely loved it.
wonderful production quality from such a small channel
keep it up man
>= is always better than == when dealing with loops
A buffer overflow because of memory unsafe languages like C??? I’m shocked! I can’t believe it!
very well made video. Can't fathom how many hours of hard work went into such a masterpiece! Appreciated, and cheers
Stumbled on this channel today, top quality content! Thanks 😄
This one was a double-whammy: Code Optimization kills the safety hack the old engineer put in place (never documented; probably SOP to him). And a rookie mistake in implementing boundary check conditions. Back in my good old days of Delphi, the compiler had an option called Range Checking, which would guarantee this kind of bug would never see the light of day....however, it hindered performance and most devs never used it outside of debug builds.
awesome research work dude!!!
I was so confused when i heard the Discord ping sfx lmao
It is, however, an *excellent* showcase of how things should be done once an issue has occurred. Google's infosec team found it and reported quickly. Cloudflare's response time was excellent, initial mitigation was under an hour. Cross-Atlantic work let them have constant flow of work. Full resolution within a few days.
And I imagine next time the google team needs their beers covered Cloudflare will happily foot the bill
1:20
thats why I never received my parcel!
I only understood probably 10% of this, but from the very little that I did understand, it appears that the way HTML is parsed is part of the reason why it is still so important to learn lower level languages, like C.
Also: VERY important: this is why we always use the proper comparison operators during an iteration loop! 😂
Pointers as always being the bane of coder’s existence.
Corporate needs you to find the difference between these pictures:
Pointers, array indices, references
Assembly programmers: "They're the same picture"
Thank you for including the artists you used in this video.
2:31 oh no, not the taco tuesdays!
For a moment, I thought this was the voice of Fireship. Then I remembered Fireship narrates extremely fast. Awesome content.
2:25 Customer trust? In Cloudflare?
I love how transparent you are with making drama.
Back in college we were told to use >= or <= for buffers for this exact reason. This precise condition (++p == pe) is based on the assumption that p will always be incremented by exactly 1 and never by 2. But in practice there is a possibility it will be incremented by 2 or even more. Simply using (++p >= pe) would fix the issue and prevent it from ever happening. It somehow reminds me of old buffer overflow exploits.
The cause could be a bug in this or other code, solar or other radiation flipping a bit from 0 to 1, a malfunction of memory or registers, etc.
7:37 I checked my Discord ffs.