Outage Dissections

6 videos


Dissecting GitHub Outage: Downtime due to an Edge Case

838 views 41 likes 2022-05-23

In August 2021, GitHub experienced an outage where their MySQL Master database went into a degraded state. Upon investig...

Dissecting GitHub Outage - Downtime due to ALTER TABLE

1694 views 88 likes 2022-05-09

Can an ALTER TABLE command take down your production? 🤯 GitHub had a major outage and it all started with a schema migr...

An engineering deep-dive into Atlassian's Mega Outage of April 2022

4303 views 226 likes 2022-04-15

In April 2022, Atlassian suffered a major outage where they "permanently" deleted the data for 400 of their paying cloud...

Dissecting Google Maps Outage: Bad Rollout and Cascading Failures

1104 views 69 likes 2022-04-01

Google Maps had a global outage on 18th March 2022, during which end users were not able to use Directions, Navigati...

Dissecting GitHub Outage: ID column reaching the max value 2147483647

1694 views 145 likes 2022-03-23

GitHub experienced an outage on 5th May 2020 that affected a few of their internal services, and it happened because a table had an a...

Dissecting Spotify's Global Outage - March 8, 2022

2838 views 171 likes 2022-03-12

Incident Report: Spotify Outage on March 8: https://engineering.atspotify.com/2022/03/incident-report-spotify-outage-on-...


  • © Arpit Bhayani, 2022