How to handle database outages?






Why does a database go down?

An unexpectedly heavy load on your database can lead to a process crash or a massive slowdown.

Before jumping to the potential short-term and long-term solutions, ensure the database is well monitored: CPU, memory, disk, and connections should all be tracked closely.
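As a rough illustration of what "closely monitored" can mean in practice, here is a minimal sketch that compares a snapshot of the four metrics against alert thresholds. The metric names, the threshold values, and the `breached` helper are all assumptions made up for this example; in a real setup the numbers would come from your monitoring agent.

```python
# Hypothetical alert thresholds (percent of capacity); tune for your workload.
THRESHOLDS = {
    "cpu_percent": 80.0,
    "memory_percent": 85.0,
    "disk_percent": 90.0,
    "connections_percent": 75.0,  # open connections / max allowed connections
}


def breached(metrics, thresholds=THRESHOLDS):
    """Return the names of metrics at or above their threshold, sorted by name.

    A metric missing from the snapshot is treated as 0 and never alerts.
    """
    return sorted(
        name
        for name, limit in thresholds.items()
        if metrics.get(name, 0.0) >= limit
    )


# Made-up snapshot for illustration only.
sample = {
    "cpu_percent": 93.5,
    "memory_percent": 61.0,
    "disk_percent": 90.0,
    "connections_percent": 40.0,
}
print(breached(sample))  # ['cpu_percent', 'disk_percent']
```

Alerting before a resource is fully exhausted is what buys you the time to apply the fixes discussed next.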

Short-term solutions

  • Kill the queries that have been running for a long time
  • Quickly scale up your database if you have been seeing consistently heavy usage
  • Check if the recent deployment is the culprit; if so, revert it ASAP
  • Rebooting the database will calm the storm and buy you some time
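The first item above, killing long-running queries, can be sketched as follows. This assumes PostgreSQL, where the `pg_stat_activity` view lists sessions and `pg_cancel_backend(pid)` cancels a backend's current query; other databases have their own equivalents. The `RunningQuery` type, the `cancel_statements` helper, the 300-second cutoff, and the sample rows are hypothetical. The sketch only builds the statements so that a human can review them before running anything.

```python
from typing import NamedTuple

# Assumes PostgreSQL. Running this query yields one row per active session.
LIST_ACTIVE_SQL = """
SELECT pid,
       EXTRACT(EPOCH FROM now() - query_start) AS runtime_seconds,
       query
FROM pg_stat_activity
WHERE state = 'active'
"""


class RunningQuery(NamedTuple):
    pid: int
    runtime_seconds: float
    query: str


def cancel_statements(rows, max_seconds=300.0):
    """Build one cancel statement per query running longer than max_seconds.

    Longest-running queries come first. Nothing is executed here.
    """
    slow = sorted(
        (r for r in rows if r.runtime_seconds > max_seconds),
        key=lambda r: r.runtime_seconds,
        reverse=True,
    )
    return [f"SELECT pg_cancel_backend({int(r.pid)});" for r in slow]


# Made-up rows standing in for the result of LIST_ACTIVE_SQL.
rows = [
    RunningQuery(101, 12.0, "SELECT 1"),
    RunningQuery(202, 1800.0, "SELECT * FROM orders"),
    RunningQuery(303, 420.5, "UPDATE orders SET status = 'done'"),
]
for statement in cancel_statements(rows):
    print(statement)
```

Reviewing the list before cancelling helps avoid killing a legitimate long-running job, such as a backup or a migration.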

Long-term solutions

  • Ensure the right set of indexes is in place
  • Tune the database's default parameters for optimal performance
  • Check for the notorious N+1 queries
  • Upgrade the database version to get the best the database can offer
  • Evaluate the need for horizontal scaling using replicas and sharding
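To make the N+1 item above concrete, here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. The `authors` and `posts` tables, the `run` helper, and the query counter are invented for this illustration. The first function issues one query for the authors plus one query per author; the second fetches the same data with a single JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
INSERT INTO authors VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO posts VALUES (1, 1, 'p1'), (2, 1, 'p2'), (3, 2, 'p3'), (4, 3, 'p4');
""")

counter = {"queries": 0}


def run(sql, params=()):
    """Execute a query, count it, and return all rows."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()


def titles_n_plus_one():
    """1 query for the authors, then 1 more query per author."""
    result = {}
    for author_id, name in run("SELECT id, name FROM authors ORDER BY id"):
        rows = run(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        )
        result[name] = [title for (title,) in rows]
    return result


def titles_single_join():
    """The same data fetched with a single JOIN."""
    result = {}
    rows = run(
        "SELECT a.name, p.title FROM authors a "
        "JOIN posts p ON p.author_id = a.id ORDER BY a.id, p.id"
    )
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result


counter["queries"] = 0
slow = titles_n_plus_one()
n_plus_one_queries = counter["queries"]

counter["queries"] = 0
fast = titles_single_join()
join_queries = counter["queries"]

print(n_plus_one_queries, join_queries)  # 4 1
```

With three authors the difference is 4 queries versus 1; with thousands of rows the per-row round trips are what put heavy load on the database. Note that the inner JOIN would omit authors without posts, so a LEFT JOIN may be the better fit when those must be kept.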

Arpit Bhayani











  • © Arpit Bhayani, 2022
