Head to database.husseinnasser.com for a discount coupon to my Introduction to Database Engineering course. Link redirects to udemy with coupon applied.
Get my Fundamentals of Networking for Effective Backends Udemy course. Head to network.husseinnasser.com (link redirects to Udemy with coupon applied).
I have managed to dodge all the outage bullets so far, but this one I couldn't dodge 😂😂 I was working with Spotify's API and communicating on Discord.
Lmao, shots fired from both directions
google cloud said "fuck you" in particular.
Lol, sounds like a targeted attack. Cripple the infrastructure and destroy communications to delay the response.
Now every time a service I use is down I think "man, the Hussein video is gonna be fire!"
This is quality! Keep it up. Thank you
I enrolled in your Fundamentals of Database Engineering course and it is one of the best courses on this topic
They didn't have a plan. What happened here is that they were in the process of migrating to Traffic Director.
When that croaked, they simply redeployed the previous version of the software, which fell back to DNS discovery (aka a config file with a bunch of DNS hosts for each microservice).
Been there, done that. Hello, Istio.
The fact that this channel exists and is so successful is AMAZING but also kinda blowing my mind.
Fascinating video as always! What a great deep dive into an issue that impacted (and confused) so many of us directly
This channel is pure gold
Can you cover the node-ipc malware npm package?
I have to tell you the truth: your YouTube channel is a genuine gold mine
2:20 To put it into perspective: I have no experience with cloud services and am still trying to get into the industry. Whenever the topic comes up in interviews or with professors and I mention that there should be a workaround, they are either confused or tell me "there's no way to do that" or "it can't be afforded". Only once have I heard of a workaround actually being set up.
I wish I knew more about the topic, at least the details of how you would set up a workaround, the obstacles, and whether there are scenarios or configurations where it would not be possible or would be too time consuming. I am trying to learn and dedicate more time to it.
From my experience, it’s just a matter of trade-offs. Building a high-availability system with failover mechanisms covering “all” the possible problems is expensive, so you usually cover only the most probable ones, depending on the cloud provider's SLA.
Sick audio processing, and your voice sounds epic, actually!
Great video, and impressive management by Spotify!
Spotify has great engineers; this is beyond a shadow of a doubt. However, the issue here was in service discovery. According to Spotify's report, they had been using DNS-based service discovery and had very recently moved to xDS-based service discovery [Traffic Director]. In short, they had a backup mechanism in place for this, unlike Slack, whose real issue was in AWS networking. Had Spotify experienced an issue similar to Slack's, even they would have had to wait for the service to come back up.
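For anyone wondering what "a backup mechanism for service discovery" could look like in code, here is a minimal sketch. It is not Spotify's implementation: the function names, the static host table, and the example hostnames are all hypothetical, and the "control plane" is just a stand-in callable rather than a real xDS client. The idea is only that the caller tries the primary discovery source and drops back to a static DNS-style host list if the primary fails or returns nothing.

```python
from typing import Callable, Dict, List

# A resolver takes a service name and returns a list of endpoints.
Resolver = Callable[[str], List[str]]


def resolve_with_fallback(service: str, primary: Resolver, fallback: Resolver) -> List[str]:
    """Try the primary discovery source; use the fallback if it raises or returns nothing."""
    try:
        endpoints = primary(service)
        if endpoints:
            return endpoints
    except Exception:
        # Primary source is unavailable; fall through to the fallback.
        pass
    return fallback(service)


# Hypothetical static table standing in for a config file of DNS hosts per microservice.
STATIC_HOSTS: Dict[str, List[str]] = {
    "playlist": ["playlist-1.internal.example", "playlist-2.internal.example"],
}


def static_lookup(service: str) -> List[str]:
    """Fallback resolver: look the service up in the static host table."""
    return list(STATIC_HOSTS.get(service, []))


def broken_control_plane(service: str) -> List[str]:
    """Stand-in for a discovery control plane that is currently down."""
    raise ConnectionError("control plane unavailable")


def healthy_control_plane(service: str) -> List[str]:
    """Stand-in for a discovery control plane that is working."""
    return ["10.0.0.5:8080"]
```

Calling `resolve_with_fallback("playlist", broken_control_plane, static_lookup)` returns the static hosts, while passing `healthy_control_plane` returns whatever the primary source reports. The trade-off is that the static list can go stale, so it only helps if someone keeps it current.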
I was hoping to see this here, subscribed x
Finally got around to watching this one.
So they really needed better metrics and/or better understanding of their metrics to know if all customers had migrated.
Thanks for your insights and comments dude.
Maybe they included the node-ipc package XD
Nice video! Thank you Hussein!
"Our services have fully recovered, I'm not recovered though!" 😂😂
nice video. but xDS not xSD >.
😢 i noticed during editing haha
@@hnasr sry, it's such a small mistake anyone can make, you are our hero still, Thank you.
Love your videos keep it up bro
Great !!! We were expecting this one !!!
Just imagine if they had added the code to delete the old format file there 👀
(and that is why you should never delete things on production environment unless you are *absolutely* sure)
Can you make a video on the Google Maps outage that happened on March 18?
Which browser is this?
thank you for that quality video! learned a lot.
Alsalam aliykum brother Hussain.
Have you read the book Designing Data-Intensive Applications? If yes, please make a video about it.
Iso 8601 for all date-related operations
Thanks for the amazing videos
thats a radio voice that keeps people listening
Dope explanation 👌🏾👌🏾
Please make video on Okta Hacked incident.
You had the flu too? I did at the same time as you
amazing, as always
I didn't know Google also ignores known failing tests and leaves them like that... hahaha
You know what happens when we change formats,
I was expecting, "Things go down"
GitHub outage video please
when did this happen?
Omegastar needs to get their shit together.
The era of centralized web everyone
he kept reading "xSD" 😭
Thank you bro
Amazing, amazing, you are amaaaaaazing, you and the content
Good lesson learned
its always the GCP
Please add some animations for the things you explain. Viewers can't just look at a web page for 20 minutes
what i wonder is, how didn't they realise they had a bunch of configurations erroring when being read as the new format..? 😅 guessing they missed some alerting in there..
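One way that kind of check could look, as a rough sketch only: before relying on a new format, try parsing every stored configuration with the new parser and report which ones fail, so the failures can feed an alert or block the rollout. Everything here is an assumption for illustration. The incident report does not say the format was JSON, and `parse_new_format`, `audit_configs`, and the `version` field are hypothetical names.

```python
import json
from typing import Dict, List, Tuple


def parse_new_format(raw: str) -> dict:
    """Parse a config in the (assumed) new format: a JSON object with a 'version' key."""
    data = json.loads(raw)  # raises json.JSONDecodeError, a subclass of ValueError
    if not isinstance(data, dict) or "version" not in data:
        raise ValueError("not a valid new-format config")
    return data


def audit_configs(configs: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Return (migrated, failing) config names based on whether the new parser accepts them."""
    migrated: List[str] = []
    failing: List[str] = []
    for name, raw in configs.items():
        try:
            parse_new_format(raw)
            migrated.append(name)
        except ValueError:
            failing.append(name)
    return migrated, failing


# Hypothetical stored configs: one already migrated, one still in an old key=value format.
stored = {
    "customer-a": '{"version": 2, "backends": ["a1"]}',
    "customer-b": "backends=b1,b2",
}
migrated, failing = audit_configs(stored)
```

With the sample data above, `failing` contains `customer-b`, which is the signal you would want to alert on before deleting or ignoring the old format.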
I thought it was always the DNS
ur amazing.
Please reupload to Odysee, it has been almost a year since you updated your channel there.
Dog woke up the baby 😂
legal jargon for "sorry, we pushed beta to prod but it was nobody's fault, have a good day"
It was DNS 😅
DNS is always the problem!
I don't know why but Spotify music selection is kinda boring compared to youtube music
Hi
Damn that video is boring. This could be fit in like 5 minutes instead of 20...
Have you visited psychiatrist ?
Please EQ ur microphone it's pretty irritating to ear