Date: Fri, 18 Mar 2022 02:00:00 GMT
On March 8, 2022, Spotify and Discord experienced an outage lasting 2-3 hours. The cause was a configuration change to the xDS format on Google Traffic Director. Let us discuss how this change caused the outage and what Spotify did to mitigate it without waiting for Google to restore the service.

Resources
Spotify outage https://engineering.atspotify.com/2022/03/incident-report-spotify-outage-on-march-8/
Google Cloud outage https://status.cloud.google.com/incidents/LuGcJVjNTeC5Sb9pSJ9o
Envoy xDS https://blog.envoyproxy.io/the-universal-data-plane-api-d15cec7a
Microservices scaling with common sense https://www.youtube.com/watch?v=NsIeAV5aFLE

CARDS
4:36 Microservices https://www.youtube.com/watch?v=NsIeAV5aFLE
9:30 Spotify Hermes https://www.youtube.com/watch?v=fMq3IpPE3TU&t=21s

0:00 Intro
2:00 Spotify Outage
3:30 Microservices
6:10 Service Discovery
10:00 Spotify Quick Workaround
12:15 Google Traffic Director Outage