I was the tech guy who had to travel to 3 of our sites to remediate the impacted workstations one by one. My sysadmin handled the servers like you mentioned. Took us 12 whole hours to get everything back to normal working operation.
We're all in this together!
Ironically, the CEO of CrowdStrike, George Kurtz, was the Chief Technology Officer of McAfee in 2010, when a security update from that antivirus firm crashed tens of thousands of computers.
Do I smell conspiracy 🤔
You hit the nail on the head: sandbox and iterative deployment.
On a personal level I would disable all auto-updates and keep a sacrificial machine to test any update, not just from CrowdStrike, especially if it has system-level access.
👍👍
We had our Falcon updates limited and it didn't respect that setting. The best solution is to do what Apple does with macOS: the kernel is closed off to third parties, and vendors instead ship an isolated System Extension that runs at the user level. That ensures the OS doesn't crash when a third-party app screws up.
Alternatively, I'm now thinking of diversifying our EDR apps and using Splunk in the middle to merge it all. That, or blocking content updates at the network layer and only releasing them once they've been approved, but such a hassle lol
Yeah macOS does it better than Windows in this case. Diversifying EDR solutions seems like a major hassle indeed!
Really feeling it for all the IT departments that had to deal with this over the weekend.
We don't use CS for EDR, but we do scheduled update releases to avoid outages from bugs. I think it's the best way to mitigate and foresee possible issues.
Great summary and incident walkthrough!
Good explanation of how IT teams can and should handle a problem like this. As I told others in our computer club a day or so after the July 19th global crash, there is NO EXCUSE for CrowdStrike Inc. to push out crap patches like this without adequate testing beforehand and, as an added precaution, staged or incremental distribution of any critical patch that could cause major network crashes. As far as I'm concerned, CrowdStrike's name is now MUD with me. If you're a higher-up on an IT team, find a better replacement, because this should never happen at a well-run company, and CrowdStrike IS NOT one.
Passed ITIL v4 a couple of months ago, and when I read about this incident I immediately thought of Release and Deployment Management. I used to think those processes were excessive, but certainly not anymore!
Yep these processes were created for this exact reason!
I don't know why this guy isn't viral yet. Short, on-point video, well said! Keep up the good work man :)
Cheers!
What I find most apropos is the name itself: CrowdStrike! How ironic, and so fitting...
Can't get hacked if you can't login!
@@TechwithJono or if you simply don't!
Hey Jono, can we get a video about remote jobs in cybersecurity please? Would be so helpful... pros and cons...
Thanks Jono for the excellent analysis of the incident response steps. It's too bad you can't switch to Apple or Linux. CrowdStrike's own tests failed to flag the bug; time to replace whatever quality assurance tooling they used.
As infosec analysts we're on high alert, since phishers will take advantage by sending out fake emails claiming to be from CrowdStrike.
Same
A saving grace would be that the emails probably wouldn't get to anyone with money since email was likely down ;)
Good analogies 👍🏾. Even though I'm not a techie 👩🏽💻 yet, your explanation helped me understand what happened.
40+ servers and 270 workstations; we handled everything in a similar manner and were up before noon.
Well done
Hello, I'm not very knowledgeable in this field, just curious: you mentioned your company uninstalled CrowdStrike. If it was uninstalled to avoid the damage, does that mean the systems are temporarily vulnerable without the CrowdStrike software? Or is there already another layer of security protection for when such events occur, like running two products... if there's such a thing?
As a cybersecurity or tech professional, what are the Reddit accounts/communities to follow?
Awesome video! What log files do you inspect when you're looking for data exfiltration attempts / compromise in general?
Generally network traffic: abnormally high byte counts, connections to malicious IP addresses, user logins at unusual times, etc.
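To make that concrete, here's a minimal sketch of that kind of triage over exported logs. It's illustrative only: the file names, column names, thresholds, and blocklist are assumptions for the example, not anyone's real tooling.

```python
# Minimal sketch: flag hosts whose outbound volume looks abnormal, connections
# to known-bad IPs, and logins at odd hours, from CSV exports of proxy/firewall
# and auth logs. Column names, thresholds, and the blocklist are assumptions.
import csv
from collections import defaultdict
from datetime import datetime

BLOCKLISTED_IPS = {"203.0.113.7", "198.51.100.23"}   # example indicators only
BYTES_THRESHOLD = 5 * 1024**3                        # 5 GB outbound per host/day

outbound = defaultdict(int)
with open("proxy_log.csv", newline="") as f:
    for row in csv.DictReader(f):                    # src_host,dst_ip,bytes_out,timestamp
        outbound[row["src_host"]] += int(row["bytes_out"])
        if row["dst_ip"] in BLOCKLISTED_IPS:
            print(f"ALERT: {row['src_host']} talked to blocklisted IP {row['dst_ip']}")

for host, total in outbound.items():
    if total > BYTES_THRESHOLD:
        print(f"ALERT: {host} sent {total / 1024**3:.1f} GB outbound today")

with open("auth_log.csv", newline="") as f:
    for row in csv.DictReader(f):                    # user,timestamp (ISO 8601)
        hour = datetime.fromisoformat(row["timestamp"]).hour
        if hour < 6 or hour > 22:                     # outside business hours
            print(f"REVIEW: {row['user']} logged in at {row['timestamp']}")
```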
So you covered for helpdesk/IT in your security role? Were there no other IT guys in the office?
With the size of our company, this level of outage easily overwhelms. It was all hands on deck.
Very informative, Jono. Staged releases are the solution, and they should be in place, since the Falcon sensor assumes full cloud control and auto-updates on behalf of customers, unlike other software that has to go through compliance approval before an update is applied. (If CRWD wants full control, they should apply the same process their customers use, especially for updates.)
Thanks!
Thanks for making a video on this topic! I seriously feel bad for CrowdStrike. Per the company's CEO, it was a content update and not a security update, and I'm pretty sure hackers may have found a way into some systems during this window, since they're well aware (probably more aware than us) of what's happening. I'm guessing all enterprise networks will be on high alert right now, as most of them have disabled Falcon and don't have a backup EDR.
Yep, I can't imagine how bad the person who pushed the update felt. We made the decision to mainly restore from Safe Mode instead of uninstalling Falcon from all machines, as that would open an even bigger can of worms. As long as Falcon is still on the machine, there shouldn't be any risk of a hack!
@@TechwithJono It's good to know about this, thanks! I'm currently enrolled in a next-gen SOC analyst course with hands-on lab practice on Splunk SIEM, CrowdStrike EDR, XSOAR and Phantom SOAR, and Nessus for VA. Just hoping to get the best knowledge of these tools and land a job after learning them.
I don't deal with the admin side of things, more ethical hacking (I dislike any other terms). At any rate, having done what you guys do back in the day, I still appreciate this. Sadly, given the nature of this 'mistake', I can only imagine the sheer cost in damages, the hit to CS's market value, and so on... I do think this sort of issue will become a far bigger problem. For my purposes I'll wait for the various reports, since although I value content like this, it's not what I use (or any social media, for that matter).
Hopefully this won't affect you, but I hope corporates see this as an opportunity to consider alternatives. We've seen similar situations with SolarWinds; regardless, it shows once again that putting all your eggs in one basket is not the best idea. The rest is not my domain, and besides, this was not the result of an exploit.
PS: given the sheer number of 'users' affected, it will take time to resolve the issue on systems not tied to critical infrastructure. The damage is done.
I think a lot of alternative solutions also use the same type of kernel driver design, which we now know is a flawed design. I had a chat with another vendor today and they expressed similar concerns, along with many of their clients. It seems like a design overhaul is needed!
@@TechwithJono Thanks for the reply. I don't postulate much, but yes, looking at the overall design is what I would do. Sometimes, because of how these updates work, when I hear 'kernel' I think of assembler, then the vendor, and so on. I know ASM: one slip (just one) can result in sheer chaos. But what you describe is what I would do in terms of administration, since the risks are so high... thanks for the reply and video.
Disabling auto-update can expose computers to risks and reduce security against future exploits.
The company should diversify its approach to security. While it's not possible to install two antivirus programs on the same machine, a better strategy would be to run a different security product on each half of the company's fleet.
This issue arose when one vendor pushed out an update without proper testing. Although I believe they will be more cautious about updates in the future, we should never rely solely on one provider. Having a backup plan is essential.
Your strategy is only good in theory; in practice, it would be a nightmare to manage multiple vendors for the same security function. And disabling auto-update won't expose you to additional risk as long as you replace it with scheduled updates.
Why not just use different OS's for half the company?
Not sure if I heard it here or not, but the computer will only crash from an invalid pointer if the code is executing inside the kernel... If Chrome or iTunes did the same thing, the program would simply crash, but the computer would not need to reboot. Essentially these security companies (CrowdStrike) make a rootkit so they can have unrestricted access to memory, allowing them to see inside all running programs and their data... I call it a rootkit because, first of all, that's exactly the kind of thing a malicious rootkit would do, and in this case CrowdStrike ended up being the malicious one.
CrowdStrike Falcon is a kernel driver, and apart from macOS, no other OS would have escaped the outage.
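A toy illustration of the containment difference discussed in this thread (purely a sketch, nothing CrowdStrike-specific): a wild memory read in an ordinary user-mode process kills only that process, while the same fault inside a kernel driver has nowhere safe to land, which is why Windows blue-screens instead.

```python
# Toy illustration: a wild memory read in user mode only kills the offending
# process; the OS and the parent survive. Inside a kernel driver, the same
# fault cannot be contained, so Windows stops with a BSOD instead.
import subprocess
import sys

# The child deliberately reads from address 0 via ctypes and gets killed
# with an access violation / segfault.
CHILD = "import ctypes; ctypes.string_at(0)"

result = subprocess.run([sys.executable, "-c", CHILD])
print(f"Child crashed with exit code {result.returncode}, "
      f"but this process and the OS are still running fine.")
```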
We run N-2; it was a file in a channel somewhere in the 30s, not a driver change, so the sensor version wouldn't have mattered.
N-2 or even N-1 is a great idea
How did you get your BitLocker recovery keys?
We stored them on a server in AD.
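For anyone wondering what "stored in AD" looks like in practice: BitLocker recovery passwords are kept as msFVE-RecoveryInformation objects under each computer account, and admins usually read them through the AD Users and Computers BitLocker tab or PowerShell. The sketch below does the equivalent lookup over LDAP with the ldap3 library; the hostname, credentials, and base DN are placeholder assumptions.

```python
# Sketch (hostname, credentials, and base DN are placeholders): list BitLocker
# recovery passwords stored in Active Directory. Each key is an
# msFVE-RecoveryInformation object that lives under its computer object.
from ldap3 import Server, Connection, NTLM, SUBTREE

server = Server("dc01.example.com", use_ssl=True)
conn = Connection(server, user="EXAMPLE\\admin", password="...",
                  authentication=NTLM, auto_bind=True)

conn.search(
    search_base="DC=example,DC=com",
    search_filter="(objectClass=msFVE-RecoveryInformation)",
    search_scope=SUBTREE,
    attributes=["msFVE-RecoveryPassword"],
)

for entry in conn.entries:
    # The recovery object's DN sits under the computer it belongs to,
    # so printing the DN tells you which machine each key unlocks.
    print(entry.entry_dn, entry["msFVE-RecoveryPassword"])
```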
Piloting and phased roll-out are excellent ideas, but they should have been done by CrowdStrike. The burden shouldn't rest on customers, and CrowdStrike has a far bigger customer and PC base with which to monitor and detect any borderline problem.
For example, it would be more difficult for you to catch and troubleshoot a 5% problem than for them: for you it may be 5% of 1,000 PCs, while for them it's 5% of 10 million PCs.
Yes, the ideal would be CrowdStrike shouldering the entire burden. But as we've witnessed, relying on that alone is not a good idea, so there needs to be a backup plan.
@@TechwithJono You are right. You have to do things to protect yourself from companies you pay to protect you. It's insane 😀
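For what "pilot then phase" can look like on the customer side, here's a minimal, purely illustrative sketch (the ring names and percentages are made up): hash each hostname into a stable rollout ring, then approve an update for the next ring only after the previous one has soaked without incident.

```python
# Minimal sketch of ring-based (staged) rollout on the customer side:
# deterministically bucket hosts into pilot / early / broad rings by hashing
# the hostname. An update is only approved for the next ring once the previous
# ring has run it cleanly for a while. Ring sizes are illustrative.
import hashlib

RINGS = [("pilot", 0.05), ("early", 0.25), ("broad", 1.00)]  # cumulative shares

def ring_for(hostname: str) -> str:
    """Map a hostname to a stable rollout ring."""
    digest = hashlib.sha256(hostname.encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") / 2**32   # uniform in [0, 1)
    for name, cutoff in RINGS:
        if bucket < cutoff:
            return name
    return RINGS[-1][0]

for host in (f"PC-{i:03d}" for i in range(1, 11)):
    print(host, "->", ring_for(host))
```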
Close one. Imagine dropping something like this on the last Friday of July.
Someone missed the memo: never push anything to prod on a Friday.
Pretty sure that since this is a content update and not an app update, you don't control when the content is delivered; that's controlled only by CrowdStrike. We are N-1 and were still impacted.
Interesting. I didn't know this! Thanks for sharing
What a nightmare, a few hundred to go because the BitLocker keys can't be found.
The pain of manually getting the recovery key...one by one...
Please, do you have any course on becoming a SOC analyst or on cybersecurity?
Stay tuned!
lol you could drop CrowdStrike and move to SentinelOne or Cortex. I surely would not recommend Cylance or Carbon Black.
That's a nada. Our company already dropped a big bag to renew our contract
The problem with staged roll-outs is that you open the door to zero-day exploits, so I guess users just have to pick their poison. In short, you cannot rely solely on a third party for your company's security.
In this day and age, it's hard for a company not to rely solely on a third-party security solution, as the pros usually outweigh the cons. But yeah, nothing is perfect, as we've all witnessed.
Good time to write a VB script to delete that file!!!! Not sure if you guys still do that anymore.
I wonder if that would work, because the computers reboot right when they hit the login page. Would the script even have time to execute?
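For context, the manual workaround that circulated was to boot into Safe Mode or WinRE, where the faulty driver isn't loaded, and delete the bad channel file from the CrowdStrike driver folder. The logic is only a line or two; here it is as a Python sketch purely for illustration, since in practice people ran the equivalent del command from cmd in Safe Mode rather than trying to race the BSOD on a normal boot.

```python
# Sketch of the widely shared manual fix, meant to run from Safe Mode / WinRE
# where the faulty driver isn't loaded: remove the bad channel file(s)
# matching C-00000291*.sys from the CrowdStrike driver directory.
import glob
import os

DRIVER_DIR = r"C:\Windows\System32\drivers\CrowdStrike"

for path in glob.glob(os.path.join(DRIVER_DIR, "C-00000291*.sys")):
    print(f"Deleting {path}")
    os.remove(path)
```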
If you're as old as I am, you may remember Sophos doing a similar thing.
I'm too young to know it
you need to up your sarcasm level when you say "You had a great time doing OT to fix...."
I really do hey!
Did you seriously say you "def had a great time working overtime"?! 😂 Most IT guys didn't.
I need to level up my sarcasm
I was the on-call operator (AU) when this happened, and multiple clients called in reporting BSoDs with the CrowdStrike error across hundreds of devices and servers. The team was confused at first, then realized it was a worldwide outage rather than the targeted attack we initially suspected. Thank god for that Reddit post :D
Yes! The reddit post saved us too!
Those of us who weren't hit by this incident 🗿🗿🗿🗿
Lucky!
@@TechwithJono Did this happen all over the world?
It is I... 🗿
Use Linux.
CrowdStrike hit Linux a few months ago too; they seem to have a bad habit of taking down their customers!
A simple and ignorant answer lol
Hey, do you have any social media, like Instagram, where people can connect with you?