"D" in ACID - Durability

Published on 19th Jul 2021

After discussing the "A", the "C", and the "I", it is time to take a look at the "D" of ACID - Durability.

Durability often seems to be a taken-for-granted requirement, but to be honest, it is the most important one. Let's dive deep and find out why it is so important, how databases achieve durability in the midst of thousands of concurrent transactions, and how durability is achieved in a distributed setting.

What is Durability?

In the context of databases, durability ensures that once a transaction commits, its changes survive any outages, crashes, and failures. In other words, any write that went through as part of a successful transaction should never abruptly vanish.

This is exactly why durability is one of the essential qualities of any database: it ensures that committed transactional data is not lost, even when the system fails.

A typical example of this is a purchase order placed on Amazon, which should continue to exist and remain unaffected even after their database suffers an outage. To ensure that something outlives a crash, it has to be stored in non-volatile storage such as a disk; this forms the core idea of durability.

How do databases achieve durability?

The most fundamental way to achieve durability is by using a fast transactional log, an approach commonly known as write-ahead logging. The changes to be made to the actual data are first written and flushed to a separate transactional log, and only then is the actual update made.

This flushed transactional log lets the database replay the transactions on reboot and reconstruct the state the system was in right before the failure occurred, typically the last consistent state of the database. Writes to the transaction log are kept fast by making the file append-only, which minimizes disk seeks.
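The log-then-apply idea can be sketched in a few lines of Python. This is only a toy illustration, not how any particular database implements it: the `TinyKV` class, its JSON-lines log format, and the `order:1` key are all hypothetical, and the sketch skips checksums, partially written records, and log truncation.

```python
import json
import os
import tempfile


class TinyKV:
    """Toy key-value store that logs every write before applying it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()
        # Append mode: every new record lands at the end of the file.
        self.log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        """Rebuild the in-memory state from the log after a restart."""
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        record = json.dumps({"key": key, "value": value})
        self.log.write(record + "\n")
        self.log.flush()             # push Python's buffer to the OS
        os.fsync(self.log.fileno())  # ask the OS to push it to the disk
        self.data[key] = value       # only now apply the actual update

    def close(self):
        self.log.close()


path = os.path.join(tempfile.mkdtemp(), "wal.log")
db = TinyKV(path)
db.put("order:1", "placed")
db.close()  # stand-in for a restart; the record was already fsynced

recovered = TinyKV(path)  # a fresh instance replays the log
print(recovered.data)     # {'order:1': 'placed'}
```

The `os.fsync` call is the step that matters here: without it, the record may still be sitting in an operating system buffer when the machine loses power.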

Durability in a distributed setting

If the database is distributed and supports distributed transactions, ensuring durability becomes even more important and trickier to handle. In such a setting, the participating database servers coordinate before the commit using the Two-Phase Commit protocol.

The distributed computation is reduced to a step-by-step process in which the coordinator asks all the participants to prepare for the commit, waits for all their acknowledgments, and then communicates the final decision to commit or roll back. The entire process is thus split into two phases - Prepare and Commit.
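The two phases can be sketched as a small in-memory simulation. The `Participant` class and the `two_phase_commit` function below are hypothetical names used only for illustration; a real implementation would also durably log each vote and decision and would have to handle timeouts and coordinator failure, all of which this sketch leaves out.

```python
class Participant:
    """Toy participant that votes on, and then applies, a decision."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # A real participant would flush its vote to its log before replying.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1 (Prepare): collect a vote from every participant.
    votes = [p.prepare() for p in participants]

    # Phase 2 (Commit): commit only if every vote was a yes.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"

    # Any single "no" vote rolls the transaction back everywhere.
    for p in participants:
        p.rollback()
    return "aborted"


healthy = [Participant("db-1"), Participant("db-2")]
print(two_phase_commit(healthy))  # committed

mixed = [Participant("db-1"), Participant("db-2", can_commit=False)]
print(two_phase_commit(mixed))    # aborted
```

Note that the coordinator in this sketch still asks every participant to prepare even after one has voted no; an actual coordinator could stop early and go straight to the rollback.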
