CrowdStrike IT Outage Explained by a Windows Developer

  • Published 14 Oct 2024
  • Dave explains the CrowdStrike IT outage, focusing on its role as a kernel mode driver. For my book on the spectrum, see: amzn.to/3XLJ8kY
    Get the shirt: amzn.to/4bRUgAn
    Follow me for updates!
    Twitter: @davepl1968
    Facebook: davepl
    Opinions are mine only, not a spokesperson!
    I should add that I don't know if the channel definition update files were privately signed or not; what I meant is that I presume they do not go through the WHQL signing process. But even if they do, we've learned that trusting their content (which was all zeros in this case) didn't go well.

COMMENTS • 9K

  • @StarLightDotPhotos 2 months ago +2672

    As a former CrowdStrike employee this is the best explanation I have heard and is 100% accurate.

    • @djrmarketing598 2 months ago +369

      Was your last day Friday? LOL (not blaming you but I'd probably quit after that HAHA)

    • @DavesGarage 2 months ago +579

      Thanks, I really appreciate that vote of confidence! I was pretty worried about getting something wrong!

    • @zippythechicken 2 months ago +126

      @DavesGarage it doesn't explain why these companies are resorting to using CrowdStrike. That is the bigger question. Let alone Auto Updates to Windows or Third Party Software that the end user can't stop completely... We are not their Beta Testers.. but... yes we are.

    • @tringuyen7519 2 months ago +109

      @zippythechicken BC they don’t want to hire cybersecurity experts in house. It’s cheaper for them to use CS.

    • @zzzgz5 2 months ago +100

      @zippythechicken The CS product is one of the best available. It can monitor and alert for far more 'risky' behavior than most others in the industry. There is a reason why their install base is so huge.

  • @Yandarval 2 months ago +5534

    "Agile, ambitious and aggressive" the sarcasm with which this phrase was uttered, wonderful.

    • @shmehfleh3115 2 months ago +434

      Move fast and break everyone's things.

    • @Alkatross 2 months ago +176

      Their product was so disruptive that our paradigm was shifted out of pocket.

    • @aRandomPersonOfTheInternet 2 months ago +64

      7:30 I could listen to this on repeat lol

    • @EujenSandu 2 months ago +18

      @Alkatross you deserve a Level 1 comment with many likes, sir.

    • @scene2much 2 months ago +19

      @EujenSandu this is just a rewrite of "The road to hell is paved with...", followed by the clank of some fallen piece of audio equipment.

  • @zug-zug 2 months ago +4613

    While this is technically what crashed machines, it isn't the worst part.
    CS Falcon has a way to control the staging of updates across your environment. Businesses that don't want to go out of business have an N-1 or greater staging policy, and only test systems get the latest updates immediately. My work, for example, has a test group at N staging, a small group of noncritical systems at N-1, and the rest of our computers at N-2.
    This broken update IGNORED our staging policies and went to ALL machines at the same time. CS informed us after our business was brought down that this is by design and some updates bypass policies.
    So in the end, CS caused untold millions of dollars in damages not just because they pushed a bad update, but because they pushed an update that ignored their customers' staging policies which would have prevented this type of widespread damage. Unbelievable.
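
The N / N-1 / N-2 staging described in this comment can be sketched in a few lines. This is only an illustration, not CrowdStrike's actual policy engine: the `Host` type, its `lag` field, the `version_for` function, and the release numbers are all hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """A machine and how many releases it stays behind the newest one."""
    name: str
    lag: int  # 0 = N (latest), 1 = N-1, 2 = N-2


def version_for(host: Host, releases: list[str]) -> str:
    """Pick the release a host should run; `releases` is ordered oldest to newest."""
    if not releases:
        raise ValueError("no releases available")
    # Clamp at the oldest release if the lag exceeds the release history.
    index = max(len(releases) - 1 - host.lag, 0)
    return releases[index]


# Hypothetical release history and fleet, mirroring the three groups in the comment.
releases = ["7.14", "7.15", "7.16"]
fleet = [Host("test-vm", 0), Host("print-server", 1), Host("db-primary", 2)]
plan = {host.name: version_for(host, releases) for host in fleet}
print(plan)
```

Under such a policy only `test-vm` receives the newest release right away. By the comment's account, the faulty update was delivered outside this mechanism, so no lag setting would have held it back.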

    • @britneyfreek 2 months ago +671

      well. sue them into the ground. nobody in this world needs commercial rootkits.

    • @shaun2072 2 months ago +26

      😯

    • @winxplorer 2 months ago +135

      Wow, that... is just bad? stupid? reckless? Hopefully they change at least that updating behavior, after this ... nightmare

    • @ghostrider-be9ek 2 months ago +231

      yea this - this was exactly why I held off on many key systems for CS deployment - I was not happy they would just override all staging - nothing should EVER be allowed to do that, and I was going to have a zoom call with their team about this.

    • @zlcoolboy 2 months ago +88

      That's crazy. I can't believe they took that liability.

  • @rezephuafrogs4021 2 months ago +59

    When I was in high school I had a teacher that had a way of explaining things to you that temporarily elevated you to a fraction of his level of understanding. Today I got to experience that again. Thank you Dave! 🤯

    • @andrepemmelaar8728 2 months ago

      Absolutely the same feeling!

    • @emordnilaps 2 months ago

      Amen. Today's computational underpinnings are somewhat opaque to me, a 74-year-old whose first computing challenge was to code a very simple program into machine language (not assembly!) and put it onto punched paper tape to run on an old machine (which predated "big iron") in the University's basement.

  • @MrKvasi 2 months ago +951

    The company I work at got bought by a bigger one. They required us to install Crowdstrike on all servers. We found a memory leak, that Crowdstrike still hasn't fixed after 6 months so I have refused to install it until then. I was on vacation when I saw all URGENT emails from other divisions.
    Thank you Crowdstrike for not fixing your memory leaks, it saved my vacation. =P

    • @Micloren 2 months ago +92

      Give this man a raise! ;)

    • @elta6241 2 months ago +41

      I feel for you. No one knowingly puts this rubbish on their machines. They have a lot of 'help' in getting an installed base.

    • @nerychristian 2 months ago

      @elta6241 I work for a school district in L.A. We purchased it for our computers. I'm guessing the company has a strong influence with government institutions.

    • @j.d.375 2 months ago +19

      hahahahahaha amazing xD well i guess you can say hey boss i saved our company do i get a raise xD?

    • @Philip550c 2 months ago

      I've been passing on purchasing crowdstrike at my org every year since 2016 as they left a sour taste in my mouth for claiming that Russia hacked the DNC servers and then being unable to provide proof. Haven't trusted them since then.

  • @NealB123 2 months ago +6328

    3 days ago no one outside of IT had ever heard of Crowdstrike. Now the entire world knows the name. Reputation destroyed in an instant.

    • @joester4life 2 months ago +265

      3 days ago no one outside of IT had ever heard of Crowdstrike.
      Yeah.. idk about that. Everyone at my job knows what Crowdstrike is; and they are not in IT. There are a lot of people who know about it (who use a computer and work in an office).. but not the average joe who's working on a machine or something. But not every workplace used Crowdstrike.

    • @Arsenic71 2 months ago +246

      @joester4life Yeah, pretty much everyone who works in or with IT security knows Crowdstrike. As for the users though, the original assumption might very well be true.

    • @Vict0rFrankenstein 2 months ago +243

      I'm a SWE who uses a mac. I knew about crowdstrike falcon from looking at my activity monitor to see what was causing my fan to sound like a jet engine. Falcon was consistently at 800% CPU usage. Complained to the security people to no avail. Fortunately for me this borked update did not seem to affect the mac version. Hopefully my company ditches this junky software.

    • @DeltaMikeTorrevieja 2 months ago +134

      When the dust settles everyone will remember the name but not how or why they know it.
      Short term reputation hit, long term success.

    • @IndellableHatesHandles 2 months ago +60

      I think you overestimate how much the user side of this matters. Even juniors have little input on what their companies use, especially in large corporations, so what matters more is what the big wigs think of it

  • @Vladimir_Kv 2 months ago +5832

    The funniest thing is that the CEO of Crowdstrike was the CTO at McAfee... during their worldwide faceplant.

    • @cleverlyblonde 2 months ago +557

      And that McAfee were suing Microsoft back in 2006 to ensure their software could talk to the kernel...

    • @stage6fan475 2 months ago +164

      This CEO looked like Don Knotts on speed during his TV appearance. 🤣🤣🤣🤣🤣🤣🤣🤣
      He is the absolute Faceplant Champion of Computers. If he was an athlete, we would retire his number.

    • @chibicitiberiu 2 months ago +335

      Just a classic case of a tech company taken over by non-tech leadership who doesn't really understand the intricacies of software development, cutting costs in the wrong places.

    • @kxjx 2 months ago

      Bunch of scammers across the AV industry. They make money by scaremongering and they introduce at least as much risk as they claim to prevent.

    • @cpuuk 2 months ago +33

      Guy's got form alright.

  • @mikezimmermann89 2 months ago +168

    Dear God! I’ve been out of the IT world for 15 years now, and I still understood his explanations. I’m VERY IMPRESSED by Dave’s clear and concise presentation and astounded by the fact that I remembered enough of this “stuff” to finish some of his sentences! Until today, I was convinced that a benevolent universe had purged all that out of my head to make room for important stuff (like cocktail recipes).

    • @Smannellites 2 months ago +9

      Yes, exactly how I reacted. Like you, I am a retired IT consultant, retired for nearly 20 years. Dave's presentation could not be clearer. It made me wonder whether the Crowdstrike P-code interpreter creates another vector for introducing viruses, malware and rootkits.

    • @carddamom188 2 months ago

      It must be sort of like Linux eBPF...

    • @juppster5694 2 months ago

      @imairt Completely agree with you, and for pretty much the same reasons. I'd bet YouTube is awash with channel hosts mumbling and umm-ing their way through this issue right now! (I'd have a look but I don't think I could bear it!)

    • @dharma404_ 2 months ago

      I agree - he just cuts through this stuff like a hot knife through butter!

    • @MikeJones-nu4sd 2 months ago

      Same here, but I've only been retired 4 years.

  • @alleneng 2 months ago +689

    for some reason dave's explanation was waaay easier to understand than every other video about this

    • @Schyz 2 months ago +105

      To be fair, it's much easier to explain something when you understand it, which hasn't been the case in most of the media.

    • @Smart-Towel-RG-400 2 months ago +85

      Because he knows how to explain things to management that has no technical skills ...he just did that for us

    • @brodriguez11000 2 months ago +6

      Isn't Bitlocker involved in this mess?

    • @andrewroby1130 2 months ago +14

      That's what happens when people know what they're talking about, ime

    • @user-hy7su3jm8z 2 months ago

      "If you cannot explain it to an 8 year old child, you do not know it well enough yourself." Some Scientist said (possibly Einstein) but my brain is a vast relational database of broken links so don't trust it!

  • @LeonEvans_Guyver1 2 months ago +336

    This, ladies and gentlemen, is what an expert sounds like.
    👍

    • @DawnAdams-d5i 2 months ago +26

      Yep! It's a lonely position. Real experts are thin on the ground, wankers abound

    • @mikicerise6250 2 months ago +5

      Well, to recognize a domain expert as such you also need a community of engineers who know at least enough to tell one apart from a quack. Those are becoming thin on the ground as well. 😅

    • @binsarm9026 2 months ago +6

      the reason they are scarce is because "unfettered truth" is bothersome to parties with vested interests, i'm sure many media outlets (big enough to be legal targets) avoid explaining it totally for fear of upsetting the "wrong people".

    • @tcfjr 2 months ago

      Hear, hear!

    • @jamieevans5979 2 months ago +1

      Yeah, Dave truly is an expert. I understand everything he talks about here, but I couldn't explain it as well as him.

  • @mikeyoung00 2 months ago +478

    Love that while stuck at the airport Dave opened his MacBook. A fair amount of dry humor in this vid.

    • @thomasbrotherton4556 2 months ago +51

      I caught that too. It would make sense that a retired Windows developer would use a MacBook.

    • @DavesGarage 2 months ago +117

      They really are the best laptop, and I can do Final Cut on it. On the desktop, I've also got a PC with Windows (and Ubuntu via WSL2). But for anything video related, it's Mac, so...

    • @markfitzpatrick7186 2 months ago +39

      Also, “Solitaire damages your Git enlistment” - a joke so subtle, it could be placed in an episode of The Office.

    • @LogicalError007 2 months ago

      @thomasbrotherton4556 Microsoft employees also use MacBooks at Microsoft.
      They're a software company. They don't care what you use if you do your work. There are teams inside Microsoft that use Slack instead of Teams.

    • @Nihonjindesuka1 2 months ago

      Windows pcs worked fine 😂

  • @ShawnWrona 2 months ago +103

    What a great explanation!
    No bull crap. No conspiracy theories. No badmouthing. Just plain facts. Even me… who rarely uses a computer anymore understands, and follows Dave’s explanation and walks away a little more knowledgeable. Thanks Dave😊

    • @ticotube2501 2 months ago +3

      Yes, very insightful. First time on the channel. Will come back.

    • @tmst2199 2 months ago +4

      Never mind the fact that CS could be executing malicious code on your machine.

  • @CHmLgN 2 months ago +398

    I just learned more about system functions in 5 minutes then I would’ve imagined. What a clear breakdown on things.

    • @rembautimes8808 2 months ago +13

      Back in those days, I was 19 years old when I decided to buy a Windows 95 book to try to find out how Operating Systems worked. Could not understand a thing but I do vaguely remember something about Ring 0 and Ring 1. Excellent video deserves a multiple like feature. 😄

    • @brodriguez11000 2 months ago +2

      There's a city under the sheets.

    • @CharveL88 2 months ago

      @rembautimes8808 I learned to code at 14 on an Atari 400 in the early 80's, and ended up teaching my computer teacher. Always thought I'd be CEO of IBM but then I hit 17 and wanted to be a rock star. 😆
      I remember some of this stuff and found I could follow the logic trail, more or less. He pulls it all together so well I feel in 10 minutes I have a way better picture than I have over the last 40 years.

    • @jbutler8585 2 months ago +4

      On Friday everybody also got a crash course in Change Management (or how NOT to do it) too. There are normally multiple barriers in the way of something catastrophic like this happening, and they all got skipped.

    • @HTxLL 2 months ago +1

      than*

  • @mhewett5193 2 months ago +558

    I am a network systems engineer that had to deal with this for 14 hours that day. This was one of the most informative videos I have ever seen. You helped simplify Windows OS in 15 minutes in a way that hours of reading hasn't. Something about real world scenarios to tag the concept with in my memory really helps. Thanks!

    • @JM-bl3ih 2 months ago

      That title is the dumbest thing I've ever heard

    • @user-fe8gx3ie5v 2 months ago

      Self-doxxing

    • @billybbob18 2 months ago +2

      Pain is the greatest motivator to learn.

    • @LordSwagtron 2 months ago +12

      @JM-bl3ih what exactly is dumb about a job title that is literally just “the engineer that administers the systems in our company network”?

    • @dr.angerous 2 months ago

      @LordSwagtron Super dumb confirms that he watched this stupid shit video

  • @shankthebat8654 2 months ago +447

    I love that you get right to the point, you don't waste time on useless background, and go full speed ahead with just the information. Thank you!

    • @ringwe 2 months ago +36

      Also absent in the background: annoying music.

    • @uselessmitten7836 2 months ago +20

      All of us spectrum kids can get our clear and concise information here! :)

    • @deano023 2 months ago +10

      I completely agree and totally appreciate how Dave does get straight to the point. I'm sure many other content creators start with useless background simply to "pad out" the video.

    • @LiveWireBT 2 months ago +10

      Absolutely agree. It was a pain to see so many "experts" around the globe talking so much while not explaining anything at all, except that there is nothing that could be done, while as an interested professional you knew that a business could build better systems and architectures (like a few, that were not impacted did) and these people were just talking heads not knowing what was actually going on.

  • @fbmowner 2 months ago +47

    What I've learned so far is that every OS has a big boss and that big boss ensures everyone follows the rules and as soon as someone gets out of line the big boss shuts the party down before the looting begins. In all seriousness this is a great video. Subbed!

    • @OneNidim 2 months ago +1

      With how complex operating systems are and the levels upon levels of logic gates, yeah I’d say it’s good to have a big boss in your OS

  • @elgabacho73 2 months ago +793

    I work in IT. Crowdstrike sales has been calling me trying to get us to switch to them. I don't think they'll be calling us for a bit.

    • @Sypaka 2 months ago +109

      Just pick up the phone and say. "No, I like our systems to be stable and without a backdoor." hang up.

    • @videogamecoverss 2 months ago +100

      Crowdstrike convinced the company I work for as a Network Engineer to swap over to them, and we did around a month and a half ago... The person who made that decision didn't have to wake up super early in the morning on Friday while panicking.

    • @dustojnikhummer 2 months ago +11

      @videogamecoverss And how was this his fault exactly?

    • @iz5808 2 months ago +23

      Tbh there is no guarantee that the other companies have their updater utility made in a safer way, at least cs will pay more attention to that part now. But overall that and their wokeness is something that gives off a bad vibe about the company

    • @KPX01 2 months ago +28

      @dustojnikhummer for believing in the snake oil.

  • @tazzybod 2 months ago +326

    "It's a fair bet that update 291 will never be needed or used again" Dave you're a legend 😂🤣

    • @KernelLeak 2 months ago +4

      Never needed again? Yeah.
      Never used again? Bring Your Own Vulnerability says ehhh...

    • @Dwigt_Rortugal 2 months ago +17

      Update 291 should have its own Wikipedia page. "We really 291'ed that release."

    • @myleft9397 2 months ago +5

      Hi, I'm Dave and I'm based as f---, should be how he starts every video

    • @tma2001 2 months ago

      @Dwigt_Rortugal yeah it's now a meme like Room 101

  • @DJMegabit 2 months ago +231

    This is by far the best explanation of the problem I have seen. Way better than any media outlet. Really great job Dave. Thanks for this

    • @paulhansendk 2 months ago +6

      I guess if we look at it from the media's perspective, they need to explain it in a way that everyone who is not IT savvy can understand. But for those of us who know a bit about IT, he does a very good job explaining it.

    • @bobbykjack 2 months ago +4

      @paulhansendk Yup, even most 'technical' news media wouldn't, and shouldn't, go anywhere near this level of detail.

    • @binsarm9026 2 months ago +1

      to be fair, he's doing it in more than 10 minutes which i'm sure is well over the limit of any "media outlet" - or even a dedicated tech program (for the masses - such as BBC Click).
      still, one cannot deny the clarity of his explanation which very carefully avoids hinting at responsibility of the error and just states the basic facts of what happened.
      i don't work in IT and only know cursory programming in HTML and some basic C++ but Dave's explanation made sense to me without having to understand the technical details !
      very commendable and earns a Subscribe from me even - just because that's all he is in it for (!) - he deserves as many likes as i can give him for such a beneficial message to society !!

    • @jajajajajaja867 2 months ago +1

      Because the internet is full of technology nerd posers that don’t actually know how a computer works

    • @RaymondHng 2 months ago

      @binsarm9026 Individual television news segments are limited to 2.5-to-3 minutes in duration.

  • @VeritasAlienari 2 months ago +35

    You're obviously a skilled and experienced technical powerhouse, but the writing style (sarcasm, wit, technical aptitude combination) and delivery make this more than just a "system dump" of data the viewer has to try and digest. Instead, we're treated to a bit of entertainment as we debug.
    Thank you for the package deal.

  • @dorothythompson927 2 months ago +394

    Hi Dave, I’m also a retired Windows developer. It was fun listening to you talk about all those old system components that used to be part of our daily life experience. I was impressed that I remembered enough to understand what you were talking about! Thanks a lot for your explanation. I confess that I feel kind of angry at the CrowdStrike developers for taking such liberties with the kernel code. Seems kind of arrogant. No doubt someone thought they were being super clever by defining their code as something required to run when the kernel starts up. Imagine if the CrowdStrike developer had just arranged a meeting with a Windows kernel expert at Microsoft to discuss what they were planning to do.
    A whole lot of suffering could have been avoided.

    • @89qwyg9yqa34t 2 months ago +16

      To be fair, I'm sure that if they didn't take that liberty, malware would all just be adapted to shut it down before it executed its malicious instructions. It may potentially have been necessary to continue to define itself as security software.

    • @happyzahn8031 2 months ago +12

      Sounds like MS needs a new 'must run' list/level so that only their own stuff is on there permanently and then the next level is sw like crowdstrike, so it's 'necessary' but can be turned off if it breaks the kernel.

    • @codenameirvin1590 2 months ago +23

      @happyzahn8031 they shouldn’t let third-party code run as part of the kernel at all.

    • @itskdog 2 months ago

      @happyzahn8031 They do have that, it's called Safe Mode, and that's why the fix requires going into Safe Mode.

    • @therealctoo4183 2 months ago

      @89qwyg9yqa34t If you were really worried about malware you would be advocating for diversity among computing platforms in all businesses. Farmers know not to plant every acre with the same crops because it's not secure! Today, we really only have 3 platforms: Unix/Linux, Mac, and Windows. Three is not enough, but it would be a start if business settled on roughly 33% of their systems running each, and then began looking for a couple more.

  • @steveunangst 2 months ago +267

    Old school BSOD t-shirt is awesome!

    • @BobBrownEmoryVillage 2 months ago +4

      "BSOD t-shirt" in your favorite search tool finds several. Mine's on the way!

    • @russellzauner 2 months ago +3

      That busted code snippet would make a cool new one lol

  • @siamimam2109 2 months ago +308

    My professor would fail us if we didn’t write a crazy amount of code to test our data structures. NOW I can appreciate why he made us do it. ❤

    • @Roadent1241 2 months ago +3

      Would it make sense to have a little spider like checker-tester buddy? Runs along the lines of code and executes all commands and whatnot in a VM so you can see it's all working properly and fix what isn't? That's all I can visualise we need nowadays.

    • @slipstreamvids7422 2 months ago +4

      Our IT guy separated our process and business LAN for security purposes and forgot to assign ethernet addresses to all the process modules. He took the whole plant down

    • @littleredpony6868 2 months ago

      @slipstreamvids7422 sounds like someone who would be worried about losing their job

    • @lylechipperson3407 2 months ago +6

      @Roadent1241 You are exactly right. If crowdstrike simply used virtual machines running in a sandbox (or multiple sandboxes) to test their update, none of this would've happened. I have a feeling if we ever know exactly WHO was responsible for this work I could bet their age within a +/- 5 year window. (sorry people over 50, but you know damn well what I mean)

    • @mankytoo 2 months ago +3

      @lylechipperson3407 Weird that your reaction is to a 50+ yo man (who once was a Windows developer) who explains about the bug AND tells how to remove it to let Windows start up again as it should.

  • @garysturdivant2047 2 months ago +24

    Extremely well and clearly described, Dave. As a former kernel developer (at Tandem Computers), we didn't allow such back doors, but then we were being deployed as a 24x7 hardware/software fault-tolerant server system and did not have millions using our systems, developing third-party drivers (or attacking them).
    Yes. Multiple failures at Crowdstrike. Someone wrote that driver code without the requisite error checking, no one caught it in reviews/inspections (if they do that, and if they don't...don't even want to go there), no one in QA thought to test for it or ran the test, someone in the release chain submitted that file (or failed to substitute the correct one if the default is an all zeroes file), etc. I don't expect today's developers/QA to think like we did (what could be corrupted if the processor/driver/adapter/etc fails between this instruction and the next and how can I prevent that corruption). Too time consuming and non-agile. But...apparently no one considers the consequences of not doing so and the damage to customers and the company it causes, or the bean-counters dismiss it as too unlikely and worth the risk.
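
The missing error checking this comment describes can be illustrated with a small user-mode sketch. The file layout used here (a 4-byte `CHNL` signature followed by a little-endian payload length) is invented for the example and is not CrowdStrike's real channel file format, and `parse_channel_file` and `load_or_skip` are hypothetical names. The only point is that an all-zero file fails validation and is skipped instead of being trusted.

```python
import struct

MAGIC = b"CHNL"                 # hypothetical 4-byte file signature
HEADER = struct.Struct("<4sI")  # signature + payload length (little-endian)


class InvalidChannelFile(ValueError):
    """Raised when a content file fails basic structural checks."""


def parse_channel_file(data: bytes) -> bytes:
    """Return the payload, or raise InvalidChannelFile if the file is malformed."""
    if len(data) < HEADER.size:
        raise InvalidChannelFile("file shorter than header")
    magic, length = HEADER.unpack_from(data)
    if magic != MAGIC:
        raise InvalidChannelFile("bad signature")
    payload = data[HEADER.size:]
    if length != len(payload):
        raise InvalidChannelFile("declared length does not match payload")
    return payload


def load_or_skip(data: bytes):
    """Reject a bad file and carry on, rather than crashing on its contents."""
    try:
        return parse_channel_file(data)
    except InvalidChannelFile:
        return None


good = HEADER.pack(MAGIC, 3) + b"abc"
print(load_or_skip(good))       # payload of a well-formed file
print(load_or_skip(bytes(64)))  # an all-zero file is rejected
```

In kernel mode the equivalent of the `except` branch has to be an explicit check before any pointer is dereferenced, since there is no exception handler to fall back on once a bad address has been read.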

    • @chrisfleischman3371 2 months ago +1

      Black swan event perspective of the developers.

    • @Harteo3917 2 months ago

      Want to bet someone used AI to write the code for them and then trusted it so completely they didn't check?😊😏

    • @noahkilleen239 2 months ago

      15 years of industry experience in IT has my spidey sense tingling DIRECTLY toward the bean counters and either using poorly vetted outsourced devs or insufficient funding for enough QA staff or both.... ironic that it took down so many airlines as Boeing's bean counters did the same with the 737-MAX.

    • @TheSillyBun 2 months ago

      Oh, SysGen. Good old day!

  • @kxjx 2 months ago +404

    We are worried about getting p0wned so we install a kernel driver, mark it as critical, and then let a supplier with a history of screwups push updates to it whenever they like with no testing or controls. Good job. Good job.

    • @paulobembe7742 2 months ago +5

      Yeah,,,WTF

    • @gooble69 2 months ago

      "We are worried about getting p0wned "
      You hit the nail on the head here. The fear of the danger is quite often worse than the danger itself...

    • @WhoTnT 2 months ago +27

      I'm here with my mouth open, amazed that this is how Windows works. What is the point of the certification process if a driver can do whatever it wants after it is certified? How is there no system in place to disable non-MS drivers that are causing kernel mode errors even if they are boot-start? I'm not sure if this is a valid concern but I'm thinking about all the Chinese computer products that install drivers on my system and what they could be doing in the background even if certified.

    • @sas408 2 months ago +13

      @WhoTnT did you watch the video? The answer to your question is in the video - Windows does everything you said but CrowdStrike marked its driver as critical and restricted booting without that.

    • @WhoTnT 2 months ago +12

      @sas408 Maybe you didn't read my comment fully. "disable non-MS drivers that are causing kernel mode errors EVEN IF THEY ARE BOOT-START" The OS of the system should not be overridden by a third party driver. The fact that the system can even be stuck in a boot loop because of a third party driver is insane.

  • @QualityDoggo 2 months ago +466

    "They have a bug they don't protect against" is the key line. CrowdStrike added kernel drivers, but did not make them robust enough. Kernel code, especially when running such complex functionality, should be able to take more abuse from user code without causing a BugCheck. Very disappointing. Great explanation!

    • @brunovandooren3762 2 months ago +65

      I wrote real-time kernel code communicating with a satellite base station via various PCI interfaces. Every Friday I'd boot my system in a torture test where I'd intentionally try to crash my interface with malformed requests, out of order requests, logic errors and whatnot. I didn't want the customer's configuration scripts or user mode applications to be able to trigger a kernel panic in any way.
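
A toy version of that kind of torture test might look like the following. It assumes a made-up request format (one opcode byte, one length byte, then the payload), and `handle_request` and `torture` are hypothetical names; a real driver test would hammer the actual interface rather than a Python function.

```python
import random


def handle_request(raw: bytes) -> str:
    """Hypothetical handler: byte 0 is an opcode, byte 1 the payload length."""
    if len(raw) < 2:
        return "rejected: too short"
    opcode, length = raw[0], raw[1]
    if opcode not in (1, 2):
        return "rejected: unknown opcode"
    if length != len(raw) - 2:
        return "rejected: bad length"
    return "ok"


def torture(rounds: int, seed: int = 0) -> int:
    """Feed random byte strings to the handler and count any misbehavior."""
    rng = random.Random(seed)  # fixed seed so a failing run can be replayed
    failures = 0
    for _ in range(rounds):
        size = rng.randint(0, 16)
        raw = bytes(rng.randrange(256) for _ in range(size))
        try:
            result = handle_request(raw)
        except Exception:
            failures += 1  # an escaped exception stands in for a crash
            continue
        if result != "ok" and not result.startswith("rejected"):
            failures += 1  # handler returned something it should never return
    return failures


print(torture(10_000))
```

The fixed seed is a deliberate choice: when a random input does break the handler, the same sequence can be regenerated to reproduce the failure.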

    • @proosee 2 months ago

      @brunovandooren3762 IMO such integration tests should be run with every push to repo, not just weekly.

    • @Don_Giovanni
      @Don_Giovanni 2 місяці тому +38

      ​@@brunovandooren3762 What do you mean I can't just test the happy path?! 🤔

    • @woogiewoogie0012
      @woogiewoogie0012 2 місяці тому +22

      They didn’t field test a small update that they absolutely fucked up. Someone is fired.

    • @RP-dy5mu
      @RP-dy5mu 2 місяці тому +6

      @@Don_Giovanni Feel-good software development

  • @akostarkanyi825
    @akostarkanyi825 2 місяці тому +234

    Crowdstrike - good name for a company which hit masses throughout the world with its product.

    • @jamesdavison2416
      @jamesdavison2416 2 місяці тому

      And Has Been for a decade +

    • @pielord177
      @pielord177 2 місяці тому +5

      Yeah when I heard the name I thought it was some military tech for launching missiles or something

    • @laus9953
      @laus9953 2 місяці тому +5

      and I don't believe in coincidence anymore..

    • @mineralt
      @mineralt 2 місяці тому +4

      Only better name would be “Plague” 😂

    • @XDarkGreyX
      @XDarkGreyX 2 місяці тому +2

      DigitalWorldNuke

  • @braxtonbell9489
    @braxtonbell9489 2 місяці тому +4

    Rarely do you encounter a technical subject presented in a manner that effortlessly transcends a wide range of listeners' understanding or experience levels. This video conveys core concepts in an easy-to-understand and memorable way. Dave achieves this without forced analogies or a condescending tone. I learned something today that I will retain. Thanks, Dave, for the great content! 💯

    • @DavesGarage
      @DavesGarage  2 місяці тому +1

      Wow, thanks for the kind words, I appreciate the vote of confidence!

  • @ottoberkes996
    @ottoberkes996 2 місяці тому +191

    Hi Dave, thanks for the explanation and bringing back some good old memories. I joined the Windows NT dev team in '93 and was at MSFT until 2011 so I'm sure we crossed paths. For all the talk about AI etc., kernel mode is still kernel mode, pointers are still pointers, and all drivers - I've written my share - should be developed with extreme care by people who understand that every line of code could cause a blue screen and heartache. "Move fast and break things" won't cut it.

    • @ruk2023--
      @ruk2023-- 2 місяці тому +1

      This is not a case of move fast and break things.

    • @Keiranful
      @Keiranful 2 місяці тому +13

      @@ruk2023-- Skirting Microsoft driver certification procedures and shipping low-resilience code is very much a case of "move fast". The "break things" is simply a natural result.

    • @ruk2023--
      @ruk2023-- 2 місяці тому +2

      @@Keiranful no. Move fast means deploy quickly. This has been a problem for a looooong time. It’s a result of poor quality control

    • @Keiranful
      @Keiranful 2 місяці тому +3

      @@ruk2023-- Kernel-level QC takes time. That contradicts quick deployment. Hence the workaround.
      The lack of stress testing, and thus of resilience, is also a symptom of trying to just get things out the door as quickly as possible. QC takes time.

    • @ruk2023--
      @ruk2023-- 2 місяці тому

      @@Keiranful But this was caused by a problem that was pre-existent for a long time. Did you watch the video? The definition file is just a catalyst.

  • @yorkaturr
    @yorkaturr 2 місяці тому +90

    I've been a professional software developer for 25 years, and I started out on my C64 when I was 8 years old, moved on to low level DOS 3d graphics programming and later into desktop business software and the web. Your explanations make perfect sense and I'm extremely impressed with the depth of your knowledge. I'll tip my virtual hat to you, sir.

    • @DouglasLancy
      @DouglasLancy 2 місяці тому +3

      Are you me? Basically same origin story. And I agree with your assessment.

    • @RiotNrrrdUTube
      @RiotNrrrdUTube 2 місяці тому +2

      Dave said a kernel panic on macOS is “pink” … uh, no, they’re dark grey, usually

    • @LarryKamons
      @LarryKamons 2 місяці тому

      I started on a TRS80 when I was 16......

    • @yorkaturr
      @yorkaturr 2 місяці тому +1

      @@DouglasLancy I once saw a guy in the subway who looked exactly like me, but that was years ago. Based on your profile picture our current resemblance isn't very strong, but age changes people

  • @johnp312
    @johnp312 2 місяці тому +47

    Finally an "Air Crash Investigation" style explanation of what actually happened. I now understand WHAT, WHY and HOW. Thank you, Dave!

  • @spyderwalker94
    @spyderwalker94 2 місяці тому +9

    Having been in IT for 30 years, I find your video precise, easy to follow and on point. Well done.

  • @c_b5060
    @c_b5060 2 місяці тому +128

    I appreciate your straight forward, no nonsense delivery that is organized in a logical and understandable format.

    • @unocualqu1era
      @unocualqu1era 2 місяці тому +1

      I hate how other people on YouTube and social media are blaming Microsoft.

    • @niftybass
      @niftybass 2 місяці тому +2

      Agreed! Well explained, and included depth I haven't heard from anyone else so far.
      Microsoft shares some blame, because Windows is easily broken, gives very little useful diagnostic info, etc.
      That said, CrowdStrike: I wonder how many people learned something about how to behave one's self whilst in kernel mode. LOL

  • @kimbjrnjensen2580
    @kimbjrnjensen2580 2 місяці тому +59

    Just love it when the deeper technicalities are explained for the most of us to at least get a sense of the problem. No magic, just machinery

  • @Throgmoyd
    @Throgmoyd 2 місяці тому +96

    As a predominantly IBM Mainframe Sysprog (retired) I am heartened that I actually understood everything explained in this hugely informative explanation. Thank you!

    • @fum00A
      @fum00A 2 місяці тому +1

      Yes same here... One of the differences is that mainframes use storage protect keys in addition to supervisor/user mode in the PSW. And yes, I've had to fix my share of hard waits due to program checks in the supervisor code :(

    • @martinjones1390
      @martinjones1390 2 місяці тому

      As a retired analyst / programmer on IBM mainframes and various minis I've spent plenty of time investigating core dumps, particularly on DOS/VSE. It's one thing to locate the failed instruction (invariably a decimal exception where a packed decimal field has an invalid value), but tracking what happened up until that point is the fun part. Performing stack chaining and linkage chaining through called subroutines gets very complex and a bit tedious (especially when called in to work at 2:00 am while attending a party on Friday night).

  • @franciscoramos7391
    @franciscoramos7391 2 місяці тому +3

    I browsed YouTube trying to find a good explanation of the CrowdStrike outage. I found this one to be the best... Thanks to the author for such a great explanation. Excellent job

  • @xXC0deZer0Xx
    @xXC0deZer0Xx 2 місяці тому +115

    As a .NET developer whose work involves higher-level abstractions rather than system functions, I appreciate this breakdown of what's happening at the lower levels. Very clear and concise.

    • @AnalogDude_
      @AnalogDude_ 2 місяці тому

      stop wasting your time with microsoft, it's f ake.

    • @christophsiebert1213
      @christophsiebert1213 2 місяці тому +4

      @@AnalogDude_ Idk, I do have windows and I also do program with .NET and I ALSO can run those programs. Not much fake I can see...

    • @AnalogDude_
      @AnalogDude_ 2 місяці тому

      @@christophsiebert1213 I did too, some 20 years ago. Bill paid US$50,000, changed the code to his liking and called it DOS.
      Pure waste of time, Ubuntu is better and more secure.

    • @TBonerton
      @TBonerton 2 місяці тому +1

      ​@@christophsiebert1213I just clicked the x on a window and it closed. Seems to exist, but maybe only I can see it.

    • @AnalogDude_
      @AnalogDude_ 2 місяці тому

      @@christophsiebert1213 Why not C/C++?
      Then your software runs pretty much everywhere. You're wasting your time learning things that one company decides or changes, rather than a committee.

  • @TeeBaz
    @TeeBaz 2 місяці тому +220

    Could listen to Dave explain IT all day. A natural teacher!

  • @indylmc
    @indylmc 2 місяці тому +22

    An OS coder that can describe an issue that a non-OS coder can understand .. sheer brilliance. Well done Dave.

  • @jaumemallach7965
    @jaumemallach7965 2 місяці тому +21

    Well, sometimes people get comfy and forget that "with great power comes great responsibility". Thanks for the video, Dave

  • @jamesaffleck4192
    @jamesaffleck4192 2 місяці тому +50

    Dave's wardrobe coordinator deserves a raise. Well played. 👏

  • @lvtiguy226
    @lvtiguy226 2 місяці тому +76

    Dave, as a layperson I really appreciated your video. While I did not understand all the language, I found your explanation thorough and informative. I now have a better understanding of why the Crowdstrike crash was so disruptive. Thank you.

    • @joelpichette
      @joelpichette 2 місяці тому +5

      it was so disruptive because it's their objective... get it ? crowd strike

    • @laus9953
      @laus9953 2 місяці тому +3

      ​@@joelpichette that was my first thought also..
      actually I remembered seeing that name for the first time just before shutting down my work laptop,
      and wondering about such a name.
      I haven't switched that laptop on since, and probably should remember this video here with Dave's instructions as to what to delete in safe mode, if it won't start next time

  • @aquatrax123
    @aquatrax123 2 місяці тому +1346

    I'm a network engineer for a company that has about 3000 computers. Since we started running EDR from another company, I have never seen so many blue screens. I was against using any EDR and would have been happy to keep using AppLocker, but our insurance company forced us into this junk software. My aversion to software like this comes from asking what happens if the vendor gets hacked and the attackers push ransomware to every customer at once. I always cited SolarWinds as my reason for being against this type of software.

    • @Sebazzz1991
      @Sebazzz1991 2 місяці тому

      You nailed it. Centralisation is really the core of the problem here.
      Take ZScaler, which is a service that proxies all network connections of a computer to a central cloud proxy server, mitms it (decrypt, inspect, log, and encrypt), and then forwards it to the target server. Imagine that this is hacked, and this isn't immediately discovered. Hackers listening in and being able to tap off cookies, bearer tokens and other confidential information for weeks. That would affect so many companies. And if they would want to cause a DoS, many computers and servers would be left without an operational internet connection.

    • @pdanny421
      @pdanny421 2 місяці тому +20

      I f
      😂

    • @hgbugalou
      @hgbugalou 2 місяці тому +143

      I work in a heavily regulated sector and more times than not their rules we have to follow make our security posture weaker for many reasons, forcing EDR among them.

    • @LtShifty
      @LtShifty 2 місяці тому

      ​@@hgbugalou My company has a 50+ strong security/risk/compliance/soc operation.
      I have a list of nearly three dozen critical threats to the business that I could resolve in a matter of months, it's been three years since I raised most of them.

    • @jondonnelly3
      @jondonnelly3 2 місяці тому +123

      Imagine the most expensive (with addons) cybersecurity software that exists pwning all your machines. The machines would have been safer with Windows Defender... yeah... which is free.

  • @rokombolo24
    @rokombolo24 2 місяці тому +43

    Our engineer dodged this one by not signing up for CS and keeping Sophos. CS charges about $30k extra for content filtering, which Sophos includes. We have computers all over the world so this would have hit us hard not being able to get to all those remote users and sites.

    • @keen123
      @keen123 2 місяці тому +1

      I hope you bought that engineer a beer!

    • @franmotero
      @franmotero 2 місяці тому +3

      Sophos crashed out our distributed servers every week, sometimes every night. Since we changed to Crowdstrike only had this crash, we remain with CS for sure.

    • @mennovanlavieren3885
      @mennovanlavieren3885 2 місяці тому

      @@franmotero OMG are we secured by the equivalent of the Mexican cartels? Are there no good products on the market?

    • @zemm9003
      @zemm9003 2 місяці тому +1

      ​@@franmotero so you like leaving a huge backdoor open day and night. Interesting choice. I prefer the crashes than getting hacked by opening up the kernel to third party custom code.

    • @BondJFK
      @BondJFK 2 місяці тому

      @@zemm9003 Friendly crashes are better than missile attack

  • @shantanusapru
    @shantanusapru 2 місяці тому +76

    This is THE BEST explanation of the Crowdstrike-related outage!!
    In fact, so many other videos are not even explanations but mere rehashing of 'what' went wrong, instead of 'how' & 'why' it went wrong...

    • @vullord666
      @vullord666 2 місяці тому

      And this is the type of video (or their own investigation) I hope government agencies produce for the incident. The actual root problem needs to be addressed, not slaps on the wrist or finger pointing. Crowdstrike needs to be punished, but it needs to be understood that another bad actor, or Microsoft themselves, could do this again, and beyond that, that this isn't just a Windows issue. Apple and Linux don't allow deep kernel-level access like this, but theoretically they could still cause themselves a similar issue. We need better regulation over something so ingrained in our lives than the promise that it won't happen again.

    • @shantanusapru
      @shantanusapru 2 місяці тому

      @@vullord666 Hmm...
      I'd be careful in assigning blame...Accountability is fine, but culpability is another ball game altogether...
      Also, reg. more/better regulation, well, more regulation always has a trade-off of less freedom & less privileges...So, one should be careful what one wishes for...
      I understand your point; am just saying let's not be reactionary or have a knee-jerk reaction to this incident/issue...

  • @SpaceMomo
    @SpaceMomo 2 місяці тому +120

    I don't know much about IT and programming but man... your explanation was perfect for a novice like me. Thank you, Dave.
    Also, as a deaf person I am thankful that you spoke in calm and clear sentences, because that helped the subtitles work nearly perfectly, so thank you again.

    • @cappaculla
      @cappaculla 2 місяці тому +8

      Thing is, it was perfect for novices and IT veterans alike. Dave's quite the guy.

    • @Ryan-lk4pu
      @Ryan-lk4pu 2 місяці тому +2

      Not to take anything away, but I would've liked to have been told what P-code is

    • @Hexanitrobenzene
      @Hexanitrobenzene 2 місяці тому +1

      @@Ryan-lk4pu Wiki is your friend :) It seems it's a Microsoft version of bytecode, that is, code intended to be run on some virtual machine.

    • @fredericapanon207
      @fredericapanon207 2 місяці тому

      ​@@Ryan-lk4pu I had not heard of P-code before. Since he included it with assembler, I just figured that it was another low-level language that is able to work directly with the hardware.

  • @michaelmeyer2725
    @michaelmeyer2725 2 місяці тому +48

    This was incredibly precise and VERY easy to understand. Fortunately my employer doesn't use Crowdstrike so I got to sit back and watch some of my friends scramble. Thank you for putting this out.

  • @BoStern
    @BoStern 2 місяці тому +4

    Thanks Dave, found myself to be on the spectrum just a few years ago, at 53. Changed everything! Thanks for your extremely lucid, helpful and complete lessons on this channel! 🙏🏻

  • @georgefernandez7558
    @georgefernandez7558 2 місяці тому +57

    HI Dave, I've taught operating systems for a long time at university level, so I know exactly what you're talking about. Your explanation here is excellent, short, clear and to the point, not even a little stumble or hesitation. Congrats, it was a pleasure to watch the video. I'm impressed. As a comment, I can't understand why they don't seem to have a robust test environment where they can test these updates to the hilt, the corrupted file is _also_ part of the software.

    • @biglightball
      @biglightball 2 місяці тому +5

      I believe that the reason here is obviously corporate rules: cutting costs for maximum profit. The risk of huge fu'ps is calculated, like in the car industry. Haven't you watched "Fight Club"?

    • @tomiantenna7279
      @tomiantenna7279 2 місяці тому +5

      Look, there is a right way to do it, and a profit maximizing way to do it.

    • @SteveBrecht
      @SteveBrecht 2 місяці тому +7

      My suspicion (and of course I have no evidence) is that because the distributed file contained only null values, the issue may have been after the testing farm. The update may have passed testing just fine but the file became corrupted when being transferred into the update distribution system. This is no excuse though, there are plenty of ways to easily validate that the file transferred as designed before distribution. Never just trust it. I am looking forward to the details when they are released.
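The kind of pre-distribution check suggested above could look roughly like the sketch below. It assumes the build or test stage records a SHA-256 digest of the file it approved; the function name `is_safe_to_publish` and the overall flow are hypothetical and do not describe CrowdStrike's actual pipeline.

```python
import hashlib


def is_safe_to_publish(blob, expected_sha256):
    """Return True only if the bytes about to ship match what was tested."""
    if not blob:
        return False  # an empty file is never a valid update
    if blob.count(0) == len(blob):
        return False  # reject a file consisting solely of null bytes
    # Compare against the digest recorded when the file passed testing.
    return hashlib.sha256(blob).hexdigest() == expected_sha256
```

The digest comparison alone would already catch a file that was zeroed in transit; the explicit null-byte check is there only to make that particular failure mode obvious in logs.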

    • @TC2290-wh5cb
      @TC2290-wh5cb 2 місяці тому +4

      I'm guessing they never saw this zeros/NULL filled file being distributed as a point of failure so there were no tests. It may be there is extensive testing but it never picked up a file corruption before distribution. Suffice to say, there will be a LOT more eyeballs on it now. The driver should have handled it better as well rather than just crashing the Kernel.

    • @larrymitchell3502
      @larrymitchell3502 2 місяці тому +1

      @@TC2290-wh5cb Both points you make are true. A driver with error trapping is 'one more chance' to handle an invalid definition file. But the driver executes at Ring 0. If I understand what Dave said, processes operating there cannot access user memory?

  • @OldMan_PJ
    @OldMan_PJ 2 місяці тому +326

    A little over 10 years ago I was working on a project in the corporate offices of a major bank trying to upgrade from Windows XP to Windows 7 when the geniuses in charge of software deployment decided to force an uninstall of a password vault that tied into the Windows login. The problem was the uninstall process required a reboot and connection to the network for the new software to install. RIP the over 30% of workers that were working from home. I figured out pretty quickly how to fix the problem by using Safe Mode with the Command Prompt to edit the registry but then the geniuses in charge of IT security decided to disable Safe Mode on every system. The ineptitude of that place was astounding.

    • @patrickbuick5459
      @patrickbuick5459 2 місяці тому +53

      It is called shooting oneself in the foot with a semi automatic with a large magazine, never stopping pulling the trigger or moving the foot.
      I feel for you, I have been there.

    • @fightingfalconfan
      @fightingfalconfan 2 місяці тому

      No, this is basically IT feeling threatened that someone might try to install malicious code into the network and take the whole damn bank down. So they try to make it idiot-proof, with no one allowed to enter the standard back door of Windows and do things they really are not authorized to do. You may know what you're doing, but Joe Shmoe next to you doesn't, and one wrong command later he can take everyone out. But honestly, no security software should ever run next to the kernel for any reason.

    • @tortolgawd4481
      @tortolgawd4481 2 місяці тому +37

      One of the dumbest things companies do is admin-lock every necessary tool 😂😂😂

    • @matthewstott3493
      @matthewstott3493 2 місяці тому

      I came into work one morning, many years ago, and a Windows engineer had run a script that accidentally started deleting domain user accounts. In the seconds before he hit Ctrl+C, around 500 users were deleted. It took a couple of hours to restore from backup. The help desk was pounded. Shit happens, all the time. You learn from the mistakes and you make sure it won't happen again. Nobody would have dreamed something like this could happen, with so many systems being impacted so rapidly. That's the problem with large, complex, interacting systems.
      Take Joyent and AWS: they both had an engineer accidentally reboot an entire US East data center because of a typo on the command console. Apparently the default behavior was to execute the command on all nodes in your data center. Both Joyent and AWS fixed their command line consoles so that could never happen again. One must specify a group of nodes or individual nodes to operate against, or the command will refuse to run.
      One day, someone at Pixar deleted the wire frames for the Toy Story 2 characters. Pixar had to shut down and have all hands on deck for weeks, practically sleeping at the office. They found a copy held by an artist on maternity leave who was working from home on a Silicon Graphics SGI Irix workstation. They drove to her home, wrapped the computer in pillows, carried it on a gurney to a station wagon and drove 15 mph to the Pixar office with their hazard lights on. Then they recovered the data from the drives. They still had to spend tens of thousands of hours version-checking every file to rebuild what was deleted from the NFS shares, which were wide open with practically zero permissions. Local restaurants delivering food to the Pixar office started dropping off food for free because they had their best month financially in the history of their business.
      Scale that scenario up to the 8+ million computers broken by Crowdstrike / Microsoft since Thursday night. Techs everywhere are exhausted beyond measure.
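The "refuse to run without an explicit target" behavior mentioned in that comment can be sketched with Python's standard `argparse` module. The tool name `fleetctl` and its `--node` / `--group` flags are made up for illustration; this is not how Joyent's or AWS's consoles are actually implemented.

```python
import argparse


def parse_targets(argv):
    """Parse a fleet command, refusing to default to 'every node'."""
    parser = argparse.ArgumentParser(prog="fleetctl")  # hypothetical tool
    parser.add_argument("action", choices=["reboot", "status"])
    parser.add_argument("--node", action="append", default=[],
                        help="individual node to act on (repeatable)")
    parser.add_argument("--group", help="named group of nodes to act on")
    args = parser.parse_args(argv)
    if not args.node and not args.group:
        # No implicit fleet-wide default: the operator must name a target.
        parser.error("refusing to run: specify --node or --group explicitly")
    return args.action, args.node, args.group
```

Here `parse_targets(["reboot"])` exits with an error instead of touching anything, while `parse_targets(["reboot", "--node", "n1"])` returns the action and its explicit target.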

    • @wisteela
      @wisteela 2 місяці тому +16

      I didn't know you could disable Safe Mode.

  • @markusmayer7956
    @markusmayer7956 2 місяці тому +74

    You crack me up, Dave. 😂 The blue screen of death shirt, the offhand reference to using a MacBook (at 0:40) to investigate. Brilliant.
    Of course it wouldn't be the same without your skill and technical insight to follow up with. I always enjoy hearing your perspective and learning from your expertise. Keep up the good work.

    • @xBINARYGODx
      @xBINARYGODx 2 місяці тому +2

      You're seeing something you want to see that is not there in the way you want to assume

    • @rjparker2414
      @rjparker2414 2 місяці тому

      Yeah, I want a BSOD shirt too! 🤣

  • @twol78s90
    @twol78s90 2 місяці тому +1

    With 47 years in Systems Administration and Systems Programming, in Windows, Unix/Linux, and embedded systems, I've seen a lot of things go awry over this period of time, but this Crowdstrike Falcon situation was one of the scariest from the standpoint of having such a huge impact on IT services across the planet.
    Your description of the situation was perfect...technically spot-on, but also explained in such a way that it was understandable by just about anyone with any concept of the need to control access to device drivers, memory managers, and resource schedulers through kernel services. Very skillfully crafted, as well as calmly stated and with a subtle injection of humor that made it very engaging to listen to through the end...even for a crusty old IT guy like myself.
    It all goes back to the early days of computers that had "Privileged Mode" and "User Mode" to enable multi-tasking (so that multiple user-mode programs couldn't step on each other or the operating system) and timesharing (creating virtual environments for multiple users that isolate them from the hardware).
    Even my old PDP 8/e system has a "Timeshare and Memory Expansion" board in it that adds "User Mode" that traps the execution of certain instructions (HALT, for example, as well as JMP, JSR (Jump to Subroutine), IOT (I/O Transfer) instructions, and of course, the instructions that change between user mode and privileged mode). When such instructions are encountered in User Mode, the instruction is not executed, and an interrupt is triggered, which turns on Privileged Mode and vectors to an interrupt service routine that emulates the execution of the instruction(s) that triggered the interrupt, then sets the mode back to User Mode and returns to the user program. It has the consequence of slowing down the system a bit, as the CPU has to emulate the instruction(s) that triggered the trap (for example, an instruction that checks the status of a Serial I/O board to see if a character is ready to be transferred), but it was worth it because of the ability to isolate user programs from the hardware. That's early 1970's computing technology.
    Even then, there were folks that figured out how to trick the system to be able to subvert the protections and crash the primitive multi-user timeshared systems that ran on the PDP 8/e (TSS/8). Such features existed in various forms in computers long before the PDP 8/e came out, dating back to the 1950's.
    Just change the names from "Privileged Mode" to Ring 0, and "User Mode" to Ring 3, and the concepts are much the same. It's a bit more complicated today, with all the stuff like multiple CPUs, look-ahead, caching, user and kernel memory spaces, and speculative execution, but distilled down to the base functionality, very similar.
    Crowdstrike is a widely-deployed solution, as instantly became clear with outages in a huge number of systems that directly affected the public. The place I work for uses it, and we had a number of servers BSOD as a result of the update. The fix was simple, as you described, except that a few had BitLocker set up, which added an additional layer of complexity, but fortunately, the keys were all printed and locked up in the ubiquitous very beefy and heavily fire-rated IT Department safe. It caused some downtime for a number of applications, and certainly hassles for IT to get things back up and running as quickly as possible, but it was caught very quickly and the agents were shut down on other machines before it could spread across all of the servers and end-user systems.
    The worry I have about all of this is that bad actors will inevitably go after the Crowdstrike kernel driver with Ghidra and other such tools and will figure out the instruction set of the p-Code interpreter, as well as finding ways to trick any security/validation wrappers put around p-Code submissions to validate them, and thusly could write their own p-Code routines to wreak havoc on systems that use Crowdstrike. Depending on what kinds of operations that the p-Code engine can perform, the consequences of someone putting together a user-mode program that loads a malicious p-Code program into the engine that causes irreparable damage would make the incident that occurred look tame in comparison.
    To me this says that Crowdstrike had better get cracking on A) fixing their release chain so faulty updates have much less chance (e.g., very closely approaching zero) of slipping through; B) seriously hardening the methodology by which updates are validated, to make forging any kind of update extraordinarily difficult; and C) completely revamping the p-Code instruction set such that any "old" p-Code routines fed to it will be trapped, as well as substantially hardening the p-Code's execution validation methodologies (e.g., making sure that the p-Code isn't trying to do something that could lead to system instability or kernel panic). If they don't do all of these things quickly, I suspect a lot of customers are going to flee to other platforms out of knee-jerk reaction, which is rather sad, and won't necessarily eliminate the risks, as just about every behavioral detection engine must run in kernel mode, making such solutions potentially vulnerable.
    Crowdstrike's methodology is overall quite sound, and their methods of detection and analysis of emergent threats are very effective. Their "front-end" is pretty amazing, and has discovered quite a number of emergent threats and pushed out emergency updates that prevented our machines from being compromised. Perhaps engineering got so wrapped up in the threat identification and analysis aspect of Crowdstrike that the computer agent didn't get as much continuous attention as it should have received. Having a p-Code module of all zeroes cause a kernel panic just screams of problems in the p-Code interpreter.
    No matter what the situation is that allowed this serious problem to occur, it is yet another example of how a borked (I use this word frequently, nice to hear someone else use it!) update (either accidental or supply-chain induced, as with SolarWinds) can have massive consequences.
    It just goes to show just how our world-wide computing infrastructure is perhaps a bit more tenuous than one might believe, and can suffer major difficulties as a result of something innocuous, or worse, maliciously crafted.
    The scary part is that there are lots of independent and state-sponsored actors out there that will spend lots of money and enormous amounts of distributed time and talent to come up with a way to cause such a situation to occur with who-knows-what piece of software (I'm not necessarily saying Crowdstrike...it could be anything) that could have even worse ramifications than this Crowdstrike incident.
    That day will inevitably come, and when it does, I sure hope I am retired from working in IT, as it will be a very, very unpleasant time for the world at large, and even worse for anyone who is working in systems administration.
    Thank you, Dave, for your great channel. Even this jaded old systems guy who has been around the block way too many times learns something and frequently gets a good chuckle from your subtly-injected humor. God Bless.
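The point made in the comment above, that an interpreter should validate a p-code module before executing any of it, can be illustrated with a toy example. The three-instruction bytecode below (`PUSH`, `ADD`, `HALT`) is invented for this sketch and has nothing to do with CrowdStrike's real format, which is not public; only the validate-then-run structure is the point.

```python
# Hypothetical toy instruction set: opcode -> number of operand bytes.
PUSH, ADD, HALT = 0x01, 0x02, 0xFF
OPERAND_BYTES = {PUSH: 1, ADD: 0, HALT: 0}


def validate(program):
    """Walk the whole program once and reject anything malformed."""
    if not program:
        raise ValueError("empty program")
    pc = 0
    while pc < len(program):
        op = program[pc]
        if op not in OPERAND_BYTES:
            raise ValueError("unknown opcode 0x%02x at offset %d" % (op, pc))
        pc += 1 + OPERAND_BYTES[op]
    if pc != len(program):
        raise ValueError("truncated operand at end of program")


def run(program):
    """Execute a program only after it has passed validation."""
    validate(program)
    stack = []
    pc = 0
    while pc < len(program):
        op = program[pc]
        if op == PUSH:
            stack.append(program[pc + 1])
            pc += 2
        elif op == ADD:
            if len(stack) < 2:
                raise ValueError("stack underflow at offset %d" % pc)
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        else:  # HALT
            break
    return stack
```

Because `0x00` is not a defined opcode in this toy set, a module of all zeroes is rejected by `validate` before a single instruction runs, which is the behavior the comment argues a kernel-mode interpreter ought to have.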

  • @LogicalNiko
    @LogicalNiko 2 місяці тому +108

    Ironically, I also worked at a company hit by NotPetya and other attacks that actually used drivers to destroy the boot loader and attempt to encrypt the data drives (while technically masking searching for financial data). Crowdstrike in their early days swooped in, and I spent days on the ground with their senior engineers taking apart the malware and hacking together methods to recover from it.
    Now almost a decade later I work at Microsoft and Crowdstrike does a driver based corruption of the boot process, and I spent Friday and Saturday identifying, tracing, and fixing that aftermath.

    • @tomorrow4eva
      @tomorrow4eva 2 місяці тому +12

      If you live long enough, you become the villain?

    • @TimTatarsky
      @TimTatarsky 2 місяці тому

      Does this mean that CrowdStrike is using methods that they borrowed from malware?

  • @UpsidePotential
    @UpsidePotential 2 місяці тому +201

    As someone who has dozens of Windows kernel drivers in the field, I can totally understand how this happened, and it's not forgivable.

    • @Lil.Yahmeaner
      @Lil.Yahmeaner 2 місяці тому +6

      More insight would be 👍🏽

    • @FriendlyNeighborhoodNitpicker
      @FriendlyNeighborhoodNitpicker 2 місяці тому +11

      And yet, what do you want to bet it will be forgiven? Oh just a mistake. If insurance companies are requiring operators to use the software, the likelihood of getting it out of everyone’s picture in the near future is slim. At least until the lawsuits put them out of business.

    • @cybervigilante
      @cybervigilante 2 місяці тому +10

      @@FriendlyNeighborhoodNitpicker I want to hear how much money the airlines lost. I imagine getting CS off your system is a bit more involved than Microsoft Uninstall.

    • @ArtStoneUS
      @ArtStoneUS 2 months ago +3

      Is it possible the .sys file was provided by a three-letter government agency and passed through without any testing?

    • @nomore6167
      @nomore6167 2 months ago +3

      "I can totally understand how this happened and it's not forgivable" - Especially when it was committed by a company which advertises itself as a security company.

  • @michaelmcgovern8110
    @michaelmcgovern8110 2 months ago +24

    From 40 years in the SW business as programmer, tech doc, project facilitator, I say: you make this difficult stuff understandable and digestible for everybody. Nice job!
    Thanks!

  • @sinmenon4347
    @sinmenon4347 2 months ago

    I grew up with computers, I basically learned how to read on MS-DOS back in the Windows 3.1 era. So when I found your channel I had to subscribe, because learning about Windows and everything makes me so happy. I know this has absolutely no relationship to your video, I just wanted to share and tell you a "thank you" for making this channel and taking your time to explain stuff.

  • @rainmant5724
    @rainmant5724 2 months ago +112

    Thank you for the technical dive. Nobody else seemed to really offer the reason, outside of "bad update".

    • @greyfox78569
      @greyfox78569 2 months ago +6

      Sabotage of an update probably by an intel service. Most QA guys are saying no way was this green lit for release without someone changing something post QA. Even the worst of the worst QA guy would have caught that bug so either the QA rubber stamped it, or someone changed something post sign off.

    • @Logicon1138
      @Logicon1138 2 months ago +7

      Or conspiracy theories. Not that I'm bashing those per se, but with those it's always something way worse than what it really is. Occam's razor, etc.

    • @gabvsd5934
      @gabvsd5934 2 months ago +1

      I actually learned what the kernel does 😅 Yeah, this video was very informative

    • @kevinmcfarlane2752
      @kevinmcfarlane2752 2 months ago +1

      CrowdStrike go into some detail on their blog. They say it’s a logic error. They also say more info will be forthcoming.

  • @saifal-badri
    @saifal-badri 2 months ago +61

    People like Dave make YouTube actually useful. Love it, please keep these videos coming

  • @meggrobi
    @meggrobi 2 months ago +192

    CrowdStrike seems a perfect vehicle for a state actor to gain backdoor access, with or without MS's knowledge.

    • @videogamecoverss
      @videogamecoverss 2 months ago +8

      Incredibly unlikely in that regard. Any backdoor the government can use is a backdoor malicious attackers can use.
      In that case, Crowdstrike would have been screwed long before this.

    • @whatcouldgowrong7914
      @whatcouldgowrong7914 2 months ago +9

      Security through obscurity. The fact everyone relied on this platform tells a cookie-cutter story of IT security with no redundancy

    • @lilblackduc7312
      @lilblackduc7312 2 months ago

      I'm sure they're much better at it than that.

    • @LyricsQuest
      @LyricsQuest 2 months ago +14

      Backdoors are already built into Windows, and the CPU itself.

    • @Robnoxious77
      @Robnoxious77 2 months ago +4

      @@LyricsQuest and routers and switches and servers and hosting providers, etc. etc.

  • @michaelbenzinger2600
    @michaelbenzinger2600 2 months ago +170

    Booting into safe mode works great if you don't have BitLocker full disk encryption activated, as most enterprises do. If you didn't print out a hardcopy of the key to enter when you try to boot into safe mode and your AD servers were also impacted, you'd better hope you have accessible backups of that data, or a lot of information is going to be lost.

    • @joester4life
      @joester4life 2 months ago +15

      Someone found a way to get into Safe Mode without the BitLocker key (many people, including myself, were happy someone found this). I had a few machines that had an old BitLocker key, and the new one hadn't updated to AD or MBAM.
      From what I read, the EFI partition with the BCD and Boot Manager isn't encrypted, and you still need to log in with your account in safe mode.

    • @氷語
      @氷語 2 months ago +3

      @@joester4life The bootloader can never be encrypted if you want it to be bootable by non-proprietary hardware. But this still shouldn't allow you to decrypt the drives using the TPM if you manage to boot into a new boot entry crafted for this. This is ensured by additional checks in the bootloader itself.

    • @Space_Rat1
      @Space_Rat1 2 months ago +4

      @@氷語 Actually, it's not supposed to boot into safe mode without triggering BitLocker, at least that's how it works on my machine. I would love to know how they did it

    • @incandescentwithrage
      @incandescentwithrage 2 months ago

      @@Space_Rat1 Unpatched recovery partition, most likely. Missing KB5034441 and similar

    • @smatchimo55
      @smatchimo55 2 months ago

      @@Space_Rat1 Same. I've gone over to the house of a friend who runs a hobby server, and BitLocker has fucked him on multiple occasions

  • @hbduckie
    @hbduckie 2 months ago +19

    Dave - this was an insanely clear, concise, and thorough explanation, which is only possible thanks in part to your depth of experience (and in part to your eloquence, wit and dry humor, which I relate to). Thank you!

  • @brucemunro8598
    @brucemunro8598 2 months ago +87

    I've been waiting for a decent explanation of this issue. Duly provided by Dave. Thank you, Sir.

    • @zlcoolboy
      @zlcoolboy 2 months ago +1

      Yeah, no one really explained what it was all about.

  • @IAmMarvel1
    @IAmMarvel1 2 months ago

    Not sure if it's editing or not, but you speak w/o hesitation and always look at the camera. That gives me confidence you are worthy of a subscription.

  • @BarsMonster
    @BarsMonster 2 months ago +231

    If we don't sugar coat it, it's a kernel-level remote backdoor running unsigned code. You can't make it any worse.
    CrowdStrike is not the only one. Ubuntu Livepatch at least beta-tests patches on free users first. Then we have Intel ME. WiFi cards running unaudited binary blobs (which is why you often can't boot from them). You are not in control, and this will repeat.

    • @RavingKats
      @RavingKats 2 months ago +7

      If it'd deleted the MBR that would've been worse technically speaking...

    • @RFC3514
      @RFC3514 2 months ago +19

      @@RavingKats - Not really, because then it would stop having access to your system. Worse for "normal" users, maybe, not worse for any organization trying to keep prying eyes away from its files. This is essentially a built-in rootkit.

    • @sirseven3
      @sirseven3 2 months ago +6

      @@RavingKats I've tracked an issue regarding the Intel Rapid Storage Technology driver knocking drives out of existence. Can't slave it to a secondary PC at all, but it'll register in devmgmt. I can't access a Linux machine within my environment to view the raw data and see if that's what happened, but there are A LOT of driver anomalies lately with Windows

    • @mdonnang
      @mdonnang 2 months ago +2

      Wow! Thanks to your comment I finally understood the backdoor access software (the hall) in the Matrix trilogy, and the Keymaker's purpose. Thank you!

    • @BarsMonster
      @BarsMonster 2 months ago +5

      @@RavingKats If it had just deleted the MBR without a reboot, 90% of systems would have remained operational. There would have been a world-wide notice not to reboot, and admins would have been able to restore the MBR remotely without service disruption...

  • @refl9630
    @refl9630 2 months ago +113

    Aren't you the guy that created the Task Manager?! It's amazing that you chose to use your time to create quality informative content. I'm subscribing immediately!

    • @acquacow
      @acquacow 2 months ago +7

      He's got a whole video or three about that.

    • @gabrielandy9272
      @gabrielandy9272 2 months ago +2

      He worked at Microsoft before, but he is retired now.

    • @cybervigilante
      @cybervigilante 2 months ago +3

      Woe. Ctrl-Alt-Del is a real lifesaver. Especially if you run Mathematica, which can really bork out sometimes.

    • @ikategame
      @ikategame 2 months ago

      No, he isn't the Task Manager guy

    • @audio2u
      @audio2u 2 months ago +9

      @@ikategame Actually, he is. Follow the link to his book, and read the preview. He specifically says "if you've ever used Task manager, you're using some of my code"

  • @billmccaffrey1977
    @billmccaffrey1977 2 months ago +16

    Great explanation Dave. I'm retired as well and spent the majority of my career developing microprocessors at Motorola and AMD. I would bet at this point that CrowdStrike has at least 4 lawyers for every engineer looking into this with another group of spin doctors looking at how to disclose what happened. It's not a business to be in if you have a weak stomach.

  • @rjparker2414
    @rjparker2414 2 months ago +1

    Excellent video! Clear, useful. Am recommending it to many techie friends.
    Now where can I get that BSOD t-shirt? Especially if it supports Dave's Garage!🤣👍
    Recently diagnosed AuDHD, at age 70, am also going to pick up Dave's book(s).
    Thanks Dave, for a breath of fresh air, on so many levels.

  • @kensanchez2064
    @kensanchez2064 2 months ago +19

    I am really amazed at your ability to explain such a deep subject in such a clear way. First time I've encountered this channel and I am now subscribed to it. Really good work!!

  • @sfoldy
    @sfoldy 2 months ago +52

    I'm not even a programmer, but between you and Steve Gibson, I feel like an engineer. This is by far the most clear and in-depth explanation of what happened (based on the current knowledge) that I have heard. Thank you!

    • @sabin97
      @sabin97 2 months ago

      If memory serves me right, it was someone using a null pointer.
      And the fact that the error was not caught by anyone in the chain makes me doubt the quality of their programmers... not just the ones doing the grunt work but the ones that are supposed to conduct the code reviews.
      Closed source software is not trustworthy.

    • @lades3
      @lades3 2 months ago

      Do you know if Steve Gibson has a YouTube channel?

    • @disjunction66
      @disjunction66 2 months ago +2

      Upvote for mentioning Steve Gibson!

  • @SpynCycle57
    @SpynCycle57 2 months ago +199

    Crowdstrike: Run our software and we guarantee no one will access your system.

    • @BonnysBeats
      @BonnysBeats 2 months ago +1

      😂

    • @alext3811
      @alext3811 2 months ago +3

      "just trust me, bro", and then they drop billions in BSODs.

    • @Reelix
      @Reelix 2 months ago

      *no one else

    • @wverbrug1
      @wverbrug1 2 months ago +2

      They took “no one” too literally

  • @locutus842
    @locutus842 2 months ago +5

    I haven't heard talk like this in almost 40 years! Thanks for the memories! 😁

    • @Harteo3917
      @Harteo3917 2 months ago

      We still had people talking sweetly to our ears like this in the 1990s, the last decade before good computers and internet. I miss it so much because it's simple but somehow so stimulating; I'm excited now lol. We had kids' programmes like Art Attack and SMart that I frequently watched and that somehow had an impact, and they just talked to us in such simple and kind ways, not infantilizing, as if even kids were little adults with a brain capable of learning.
      Now nearing my middle 30s, I'm relating more to how people talked in older shows. I've been watching Bullseye, the game show, and I really like Jim Bowen, great man, and the way he talks is just as described. I've seen a few episodes of Tomorrow's World and I felt myself lapping every word up while falling into a relaxed lull. There's just nothing better than the way things used to be explained; something special about it appeals in the right way to the brain.

  • @Bob-cx4ze
    @Bob-cx4ze 2 months ago +212

    In the military, we had similar issues with colonel panics.

    • @CptRiggs
      @CptRiggs 2 months ago +82

      Not to mention General Failures

    • @VenturiLife
      @VenturiLife 2 months ago +67

      Major issues

    • @davidf2281
      @davidf2281 2 months ago +56

      Private access

    • @spvillano
      @spvillano 2 months ago +42

      Ah, but that's why we retain the option for Corporal Punishment.

    • @TheRealScooterGuy
      @TheRealScooterGuy 2 months ago +36

      In retrospect, it seems that Captain Obvious should have been involved.

  • @GarryFunk
    @GarryFunk 2 months ago +98

    The whole time I'm listening to Dave talk about "Ring 1" and "Ring 0" all I could think about was "One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them". Haha

  • @SpaceCop
    @SpaceCop 2 months ago +36

    As an Amiga user, I was very happy to watch this guru meditation.

    • @docwhogr
      @docwhogr 2 months ago

      guru meditation? where?!

  • @BRICSlayer
    @BRICSlayer 2 months ago +11

    It's crazy the press is calling it a "Microsoft outage". I run networks of Windows servers and workstations and none of them had any problem. We don't use CrowdStrike and none of us had ever heard of it until last week.

    • @cigmorfil4101
      @cigmorfil4101 2 months ago

      What I don't get is why Windows requires software like CrowdStrike, whether written by a 3rd party or Microsoft, to be in the kernel in the first place.

    • @BRICSlayer
      @BRICSlayer 2 months ago

      @@cigmorfil4101 Not sure what a kernel is, I'm not a programmer, just a network administrator.

  • @theatomproject007
    @theatomproject007 2 months ago +33

    I made BSoD T-shirts back in the day (mid 90's) and wore one to COMDEX LV one year. The front had a small pocket logo "BSoD The OS of Choice." and the back was a graphics driver BSoD in the classic hex dump version. It went over not so well visiting the Microsoft booth...but it seemed everyone else liked it. I still have a handful of them left over and periodically wear them when working on my cars.

    • @RoamingAdhocrat
      @RoamingAdhocrat 2 months ago +1

      I wore a BSoD t-shirt the day I met Kevlin Henney at a conference :)

  • @falkerhard
    @falkerhard 2 months ago +55

    You made that look simple. I've seen people scrambling to fix it and talking as if they are piloting the Star Trek Enterprise. This is straight to the point.

    • @adrianhenle
      @adrianhenle 2 months ago +9

      It's simple to fix a single machine if it's sitting on your desk. It's a bit different when you have thousands of machines on racks, and have to track down and physically interact with each one.

    • @DavesGarage
      @DavesGarage 2 months ago +4

      Ha!

    • @AG-ig8uf
      @AG-ig8uf 2 months ago +5

      It is not simple to fix if the disk is encrypted with BitLocker and the CrowdStrike files are protected by group policy, as is the case in most corporations. Even if you manage to get the BitLocker key to boot into safe mode, you still can't delete the files as local admin.

  • @bongomcgurk7363
    @bongomcgurk7363 2 months ago +16

    What an educational pleasure listening to an expert explain a technical matter in such an understandable way.

  • @KeithCooper-Albuquerque
    @KeithCooper-Albuquerque 2 months ago

    Excellent explanation and overview of kernel mode and rings 0 and 1. I am retired also. I was a C/C++ UNIX/Linux and some Windows programmer. It is refreshing to hear someone who worked on the bleeding edge and knows his stuff explain this problem so completely well. Thanks Dave!

  • @projectartichoke
    @projectartichoke 2 months ago +18

    I've read a bunch of stuff about this issue over the last few days and this video is, by far, the best and most understandable explanation of exactly what happened.

  • @ktxed
    @ktxed 2 months ago +89

    The level of misinformation about Windows in the past few days is appalling. Thanks for stepping in to bring more clarity to the matter :)

    • @BastetFurry
      @BastetFurry 2 months ago +3

      Yeah, even a 98SE can be made rock solid with enough patience, did that back in the days and never had a larger problem with it.
      Still have the Nickles PC Report books on the shelf that taught me the ropes to a stable system.

    • @AlanTheBeast100
      @AlanTheBeast100 2 months ago +7

      MS permits certified drivers to backdoor code into the system. MS is as much to blame.

    • @Akens888
      @Akens888 2 months ago +3

      @@AlanTheBeast100 that is an interesting point but I wouldn't go as far as saying they share the blame. If that is not under review right now I would be surprised.

    • @ktxed
      @ktxed 2 months ago

      @@AlanTheBeast100 "to backdoor code into the system" lol

    • @nmisoo
      @nmisoo 2 months ago +5

      You are responsible for what you choose to install on your system. MS is not responsible for your stupidity.

  • @Valentina-vt7ru
    @Valentina-vt7ru 2 months ago +1

    I must admit I couldn't understand most of the terms... but no doubt you, Sir, must definitely have a LOT of experience! Please keep educating people with your videos. Even if some of us are total beginners, it's inspiring

  • @MrT79shakeshake
    @MrT79shakeshake 2 months ago +37

    I am a mechanical engineer and I know fuck all about computers on this level, but damn, that's the best explanation I have heard for a complex system made simple. You, sir, are an amazing explainer.

    • @Scot-p1v
      @Scot-p1v 2 months ago +6

      Well said! Deep knowledge is great, but the ability to concisely convey it to others is much more rare, and deeply undervalued.

  • @michaelhanson5773
    @michaelhanson5773 2 months ago +9

    As someone who majored in computer science and worked on software design, I thought your definitions of kernel and user modes, how they differ, and how they work were great... Better than the professors I had in college...

  • @okay1904
    @okay1904 2 months ago +22

    Dave - this was brilliant. Simple, direct, easy to understand, and your outlining of the solution was amazing. Well done. Good job. Thanks. It just shows that our media (newspapers, TV, online commentators) do not really communicate, and their focus is more on sensationalist news - anything that sells their channel. You have done the most amazing job of succinctly explaining exactly what went wrong and how to fix it.
    Your explanation is so brilliant, you deserve an award of some kind for such excellent communication and understanding. You should be on TV; you are much better than the people who talk about tech on TV. You actually know what you are talking about, and know it very, very well. Thank you.

    • @timothyallen6411
      @timothyallen6411 2 months ago +2

      Dave's award could be a hundred thousand new likes and subscribes -- so what are people waiting for? Do it!

  • @DanielaHltb
    @DanielaHltb 2 months ago

    the clarity you bring to your subjects is beyond impressive!

  • @kaseyboles30
    @kaseyboles30 2 months ago +530

    Unless there was an active zero day in the wild, and a dangerous one, this should not have gone live without test time at each location on test systems designed to act similar to live systems. To fast track this (especially on a Friday) without a significant and imminent threat is in my opinion reckless.

    • @andreaseriksson8121
      @andreaseriksson8121 2 months ago +15

      I totally agree!

    • @andythebritton
      @andythebritton 2 months ago +33

      Exactly - this is the most mind-boggling aspect of the whole thing.

    • @wa1gon
      @wa1gon 2 months ago +93

      I have been a software developer for the best part of 50 years. I have worked for large companies like WalMart, S&P, Digital, Nokia, and many smaller companies.
      It is clever how CrowdStrike gets around the test to get their driver signed. Clever in software is rarely good. What I don't get is that all the large companies have rock-solid testing before release. Everything is under source code management (SCM), which uses branches for different stages. The development code, when ready for test, is merged into the QA branch; QA beats on the code until they are happy, then they merge into a release branch where a release team will create a "GOLD" version of the binaries. That is sent back to QA for final testing. Then and only then can the binaries be released to production.
      For a bug that seems to brick every system it is installed on, how did it get past the QA and release process?

    • @Moe_Posting_Chad
      @Moe_Posting_Chad 2 months ago +1

      WELCOME TO A WORLD WHERE TECH IS ALL H1B1 VISAS! THANK CHRIST FOR STREET SH!TTERS!

    • @jjankosky
      @jjankosky 2 months ago +9

      AV data files are rarely tested since they are daily at this point. Good bet they will be going through test going forward.
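
The staged testing that commenters in this thread call for can be sketched roughly as a canary rollout. This is only an illustration in Python, not how CrowdStrike or any real product deploys updates: `staged_rollout`, `apply_update`, `is_healthy`, and the 1% / 10% / 100% stages are all hypothetical names and numbers chosen for the example.

```python
from typing import Callable, List, Sequence, Tuple


def staged_rollout(
    hosts: Sequence[str],
    apply_update: Callable[[str], None],   # hypothetical: pushes the update to one host
    is_healthy: Callable[[str], bool],     # hypothetical: e.g. "did the host check in after reboot?"
    stage_percents: Sequence[int] = (1, 10, 100),
) -> Tuple[List[str], bool]:
    """Push an update in widening stages; stop as soon as a stage looks unhealthy.

    Returns (hosts that received the update, whether the rollout completed).
    """
    updated: List[str] = []
    total = len(hosts)
    for percent in stage_percents:
        # Cumulative number of hosts that should be updated once this stage is done.
        target = min(total, max(1, total * percent // 100))
        batch = list(hosts[len(updated):target])
        for host in batch:
            apply_update(host)
            updated.append(host)
        # Halt before touching the next, larger stage if any host in this one failed.
        if not all(is_healthy(host) for host in batch):
            return updated, False
    return updated, True
```

With 100 hosts and an update that breaks every machine it touches, this sketch stops after the first host instead of reaching the whole fleet; the trade-off is that urgent updates arrive later on most machines.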

  • @dondavis8849
    @dondavis8849 2 months ago +8

    This explanation makes sense, and seems knowledgeable. I've been a systems programmer for 46 years, and I've done kernel programming on various operating systems, including Windows.

  • @kehlarn6478
    @kehlarn6478 2 months ago +68

    Accurate summary. The source of the zeroed file is either a crash during writeout in the build process (a full disk or stopped VM scenario is likely) or CDN corruption. Both would have been caught by including a checksum/manifest pair to validate that the payloads were intact. The moment the driver decided to bypass certification and dynamically include contents to speed up the process, they should have known they needed to supplement it with a checksum manifest, but chose not to for unknown reasons. This is sadly a VERY common outcome in CDN-mapped content due to a variety of corruption vectors, and the trust modern software has in network integrity is rather poorly misplaced. Always verify your content is intact, regardless of how small or large.

    • @hesido
      @hesido 2 months ago +11

      I'm just a hobby programmer and even I would have thought to do checksum testing. It's ridiculous, frankly speaking. In a Chrome extension I wrote for my personal use, which modified existing functions on a page, I only replaced the functions whose checksums I had tested, and the code warned me if the underlying page had updated the JS functions, so I could update my own extension to match (and this worked pretty flawlessly and saved me a lot of headaches).
      The scary thing is that this kernel hog doesn't even seem to have a way to vet the driver files; the program blindly trusts those files to be the real deal.

    • @dismuter_yt
      @dismuter_yt 2 months ago +8

      If the reason is corruption, then it is mind-boggling that they would not at least have signed their updates with their own certificates prior to running them through QA. That would act as protection against corruption, but also as an additional layer of protection against tampering. Imagine if their distribution machines were compromised and an attacker replaced the update with a malicious rootkit. I'm tempted to say that with the cavalier approach that they took to bypass quality certification by Microsoft to execute code on ring 0, if they didn't sign their updates, then they are amateurs, should lose all business, and their company should disappear. It might happen anyway, if they get sued to oblivion.

    • @davelikesthings
      @davelikesthings 2 months ago +8

      It's also an insane risk if they're blindly accepting the file. It's lucky it just ran into a zero byte file and not something created and injected via a malicious third party.

    • @retromodernart4426
      @retromodernart4426 2 months ago +3

      They did it on purpose so they can pretend to be a "bad actor" and insert whatever they want into systems hosting their rootkit malware for whatever purpose they want including but not limited to taking servers and services offline, hard.

    • @kehlarn6478
      @kehlarn6478 2 months ago +1

      @@hesido Checksum manifests are an advanced concept. The average programmer doesn't understand why files would not be what they wrote in the first place. Your description of function rehooking sounds just like multiple other good projects. Same concept: search, compare, replace/skip.
      Sadly, a lot of shady crap goes on in driver land. There are far fewer examples of good ways to do things that low-level, so the expertise isn't available.
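
The checksum/manifest idea raised in this thread can be sketched roughly as follows. This is a minimal Python illustration, not CrowdStrike's actual update mechanism: the function names (`build_manifest`, `verify_payload`) and the JSON manifest layout are hypothetical, and it assumes the manifest itself arrives over an authenticated channel (for example, signed), which is not shown here.

```python
import hashlib
import hmac
import json
from typing import Dict


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a payload."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(files: Dict[str, bytes]) -> str:
    """Publisher side: record one digest per payload file (hypothetical JSON layout)."""
    return json.dumps({name: sha256_hex(data) for name, data in files.items()}, sort_keys=True)


def verify_payload(name: str, data: bytes, manifest_json: str) -> bool:
    """Consumer side: accept a payload only if it matches the manifest.

    Also rejects empty or all-zero payloads outright, as a cheap sanity check
    in case the manifest was generated from an already-corrupted file.
    """
    expected = json.loads(manifest_json).get(name)
    if expected is None:
        return False  # file is not listed in the manifest
    if not data or not any(data):
        return False  # empty or nothing but zero bytes
    return hmac.compare_digest(sha256_hex(data), expected)
```

A loader built this way would refuse a zeroed file and keep using the last known-good content. A plain hash only detects accidental corruption; protecting against tampering additionally requires signing the manifest, as another reply in this thread points out.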

  • @ChipLohman
    @ChipLohman 2 months ago

    Thank you for your laser delivery and explanation of some of the fascinating facts behind the Wizard of Oz's computer science curtain.😢
    Thank you also for bringing David Cutler into your classroom. I can speak for myself that your presentations have been a rare and valuable insight into the behind-the-scene history many of us grew up with.
    One sea story: My early jump into computers was with an Epson QX-10 (IBM Clone) and their proprietary word processing software. I remember coming home to my bride - in tears as she looked at a 50-page Thesis she had typed ... as she pointed to letters in the text >dripping< down the screen! That would have been during the '85-'86 era. Best...

  • @fransiscoalvarezinski6293
    @fransiscoalvarezinski6293 2 months ago +9

    I've been a heavy PC user forever and started with MS-DOS and IBM-DOS in 1985, 8086 etc. I've never written a line of code beyond a complicated batch file. Yet I actually followed you thru your entire presentation. I'm not that smart. You are that good.

    • @binsarm9026
      @binsarm9026 2 months ago

      well said, i too have rudimentary programming knowledge only and yet could grasp the gist of what happened despite all the technical jargon !

  • @dmitripogosian5084
    @dmitripogosian5084 2 months ago +718

    Funny how defence against potential distributed threats created a single point of failure across such heterogeneous deployments

    • @henson2k
      @henson2k 2 months ago +44

      Now everybody knows how to bring Windows down on a global scale. CrowdStrike should really prove they can be trusted with kernel access.

    • @adamnealis
      @adamnealis 2 months ago +29

      "Inevitable" is a better word than "funny". Not sure why affected companies were allowing CS to update prod servers in real time.

    • @PplsChampion
      @PplsChampion 2 months ago +21

      It's like the safety test that caused Chernobyl

    • @misham6547
      @misham6547 2 months ago

      @@adamnealis First, because some cybersecurity updates can't wait; second, because for some reason this update ignored update settings

    • @jaideepshekhar4621
      @jaideepshekhar4621 2 months ago +14

      I heard it auto updated on all systems, even those locked from any updates.

  • @rudikroch6499
    @rudikroch6499 2 months ago +49

    "It crashes, because it has to" - I feel that the Matrix would have been more complete had they worked this line into the story. Preferably spoken by Morpheus or maybe even Cypher in a serious, deliberate, and credible manner. There's so much to unpack with this line...

    • @Rawdiswar
      @Rawdiswar 2 months ago +2

      Please, unpack

    • @therealwhite
      @therealwhite 2 months ago +3

      They kinda maxed out the playtime getting Reeves to say "There is no spoon" , otherwise Morpheus would have to monologue about the cat glitching out, and then he would've said it.

    • @emanuelczirai8936
      @emanuelczirai8936 2 months ago +1

      "the body cannot live without the mind" - Morpheus

    • @joep9617
      @joep9617 2 months ago +3

      They wouldn't call it "Windows" if it wasn't going to get broken and replaced from time to time.

    • @Rawdiswar
      @Rawdiswar 2 months ago

      @@joep9617 bro, profound.

  • @robxombie2054
    @robxombie2054 2 months ago

    I've been building/repairing/tweaking PC and Mac hardware since 1997, and your flavor or style of explaining is right on for me. I know so little when it comes right down to it, and I admire your expertise. I'd like to pledge a bit of my mindspace to learning and disseminating any knowledge you'd like to share by subscribing to this channel and watching as often as I can. Thank you, for helping to never waste a day - and keep on keeping on!

  • @KerrySayers-vm6vm
    @KerrySayers-vm6vm 2 months ago +20

    Wow, nice, thanks. As a developer for IBM starting way back in PC DOS 1.0, I understood everything you presented and appreciate your time in explaining not only what happened but how to fix it.

  • @Xyler94
    @Xyler94 2 months ago +30

    I had figured it was a kernel level driver error, just didn't realize how the definition updates go through. We're really trusting AV and Cybersecurity companies to not muck this up like Crowdstrike did!

    • @rjparker2414
      @rjparker2414 2 months ago

      A friend and I, who both had COBOL programming experience, were joking about it being a line error in the kernel.
      Thanks Dave, for confirming what we suspected, while presenting the error in great detail. As well as a fix!

  • @kevinmorbidthelostcronin1984
    @kevinmorbidthelostcronin1984 2 months ago +11

    It has been a decade since I did development at this level. I have no idea if I will ever return to the field. Why am I mesmerized into continuing to watch this video? Dave, I think you offered a clearer presentation than any of my university CS professors.

    • @Shivian124
      @Shivian124 2 months ago +1

      There's no financial incentive for great CS field personnel to remain in academia - private industry pays a lot more and with far better conditions.

  • @ViennaGuy2000
    @ViennaGuy2000 2 months ago

    I use computers every day for programming, but at a higher level. I work with end users explaining our complex software.
    Dave's explanation style, depth, and content are A1. Outstanding video saved for reference.

  • @donaldwright2426
    @donaldwright2426 2 months ago +6

    For people like me who have no IT expertise or any particular skill for programming: I learned a lot, and it also made me understand the basics of operating systems. RING 0 or 1 were completely unknown to me. In short, you gave a good presentation on the subject. THANKS.

  • @KayleeKerin
    @KayleeKerin 2 months ago +39

    "Borked" I have not heard that term in a LONG time. Thank you

    • @G.Aaron.Fisher
      @G.Aaron.Fisher 2 months ago +5

      I use that term all the time. It's just such a phonetically appropriate description of something misconfigured to the point of failure.

    • @brodriguez11000
      @brodriguez11000 2 months ago

      Usually accompanied by a loud, BOING!