Master-Replica Replication

Master-Replica architecture is one the most common high-level architectural pattern prevalent in distributed systems. We can find it in use across databases, brokers, and custom-built storage engines. In this essay, we talk about everything we should know about the Master-Replica replication pattern.

A system that adheres to the Master-Replica replication architecture contains multiple nodes and each node, called Replica, holds an identical copy of the entire data. Thus if there are N nodes in the system, there will be N copies of the data.

Master-Replica Replication

Scaling Reads

With N nodes capable of serving the data, we easily scale up the reads by a factor of N. Hence, this pattern is commonly put in place to amplify and scale the reads that the system can handle.

Handling Writes

With N nodes that hold and own the data, the writes become tricky to handle. In the Master-Replica setup, the writes go to one of the pre-decided nodes that act as the Master. This Master node is responsible for taking up all the writes that happen in the system.

Master is not any special node; rather, it is just one of the Replicas with this added responsibility. Thus in the system of N nodes, 1 node is the Master that takes in all the writes, while the other N - 1 node caters to the read requests coming from the clients.

Write Propagation

Once the write operation is successful on the Master node, the changes are propagated to all the Replicas through Replication Log (Commit Log, Bin Log, etc.), letting the system eventually catch up.

The time elapsed between the write operation on the Master and the operation propagating to the Replica is called Replication Lag. This is one of the core metrics that is observed 100% of the time.

What happens when the Master goes down?

Since the Master node takes in all the write operations, it going down is a massive event. The write operation that happens when the Master is facing an outage results in an error.

When the system detects such an outage, it tries to auto-recover by promoting one active Replica as the new Master by running a Leader Election algorithm. All the healthy Replicas participate in this election and, through a consensus, decide the new Master.

Once the new Master is elected, the system starts accepting and processing the writes again.

Master-Replica in action

This is a widespread pattern that we can find across almost all the databases and distributed systems. Some of the most common examples are:

  • Relational databases like MySQL, PostgreSQL, Oracle, etc.
  • Non-relational databases like MongoDB, Redis, etc.
  • Distributed brokers like Kafka, etc.

Arpit Bhayani

Arpit's Newsletter

CS newsletter for the curious engineers

❤️ by 30000+ readers

If you like what you read subscribe you can always subscribe to my newsletter and get the post delivered straight to your inbox. I write essays on various engineering topics and share it through my weekly newsletter.

Other essays that you might like

Leaderless Replication

444 reads 2022-01-16

In this essay, we take a look into a different way of replication, called Leaderless Replication, that comes in handy in...

Multi-Master Replication

497 reads 2021-11-03

In this essay, we look at what Multi-Master Replication is, the core challenge it addresses, use-cases we can find this ...

Replication Formats

334 reads 2021-08-15

When we are employing a Master-Replica pattern to improve availability, throughput, and fault-tolerance, the big questio...

The RUM Conjecture

670 reads 2020-05-31

While designing any storage system the three main aspects we optimize for are Reads, Updates, and auxiliary Memory. RUM ...

Be a better engineer

A set of courses designed to make you a better engineer and excel at your career; no-fluff, pure engineering.

Paid Courses

System Design Masterclass

A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems.

1000+ learners

Details →

Redis Internals

Learn internals of Redis by re-implementing some of the core features in Golang.

46+ learners

Details →

Free Courses

Designing Microservices

A free playlist to help you understand Microservices and their high-level patterns in depth.

106+ learners

Details →

GitHub Outage Dissections

A free playlist to help you learn core engineering from outages that happened at GitHub.

251+ learners

Details →

Hash Table Internals

A free playlist to help you understand the internal workings and construction of Hash Tables.

427+ learners

Details →

BitTorrent Internals

A free playlist to help you understand the algorithms and strategies that power P2P networks and BitTorrent.

192+ learners

Details →

Topics I talk about

Being a passionate engineer, I love to talk about a wide range of topics, but these are my personal favourites.