LCR algorithm for Leader Election in Distributed Systems

Watch the video explanation ➔

Leader Election is a critical component in any distributed system. It enables the system to auto-recover from leader failures. When a leader node goes down, the Leader Election algorithm is triggered to elect the new leader.

LCR Algorithm

LCR Algorithm for leader election is the simplest and the easiest to understand and implement, and its variant can be seen in action in a bunch of distributed systems.

The LCR algorithm expects the network to have the following properties

Unique ID

Every node in the network has a unique identification - UID - that can be compared with other UIDs.

Virtual Circular Ring

LCR also assumes that the nodes in the network are virtually arranged in a circular ring, and each node knows the node to its right in the ring.

Although we want a circular ring, the nodes may be physically connected to other nodes through any topology. The ring mandate is purely virtual and can be maintained only for powering elections.

The Algorithm

In the election, every node participates to be the new leader. To participate, it creates a message having its UID and sends it to its neighbor.

When a node receives a UID, it compares the incoming UID with its own, and

  • if the incoming is greater than its own, it forwards it
  • if the incoming is lesser than its own, it discards
  • if the incoming UID == its own, it declares itself as the new leader

When a node receives its UID as an incoming message, it implies that the message survived the entire iteration leading to the assertion that the node has the highest UID in the network; and hence can become the new leader.

The new leader is then announced through another message passed across the ring.

Halting the algorithm

Halting is one of the most important aspects of any distributed algorithm, as it can get tricky to know when to stop.

LCR algorithm stops when the new leader initiates the message and announces itself. To announce, it initiates the HALT message and sends it across.

The node receiving the HALT message understands that the new leader has been elected and needs to stop participating in the election. The node updates its local state with this information and forwards the message to the next node.

Complexity

Each node participates in the election and sends messages across the entire ring. Every node could potentially receive the message from every other node; the communication complexity, the number of messages shared, is thus O(n^2).

Here's the video ⤵

Courses I teach

Alongside my daily work, I also teach some highly practical courses, with a no-fluff no-nonsense approach, that are designed to spark engineering curiosity and help you ace your career.


System Design Masterclass

A no-fluff masterclass that helps experienced engineers form the right intuition to design and implement highly scalable, fault-tolerant, extensible, and available systems.


Details →

System Design for Beginners

An in-depth and self-paced course for absolute beginners to become great at designing and implementing scalable, available, and extensible systems.


Details →

Redis Internals

A self-paced and hands-on course covering Redis internals - data structures, algorithms, and some core features by re-implementing them in Go.


Details →


Writings and Learnings

Knowledge Base

Bookshelf

Papershelf


Arpit's Newsletter read by 90000+ engineers

Weekly essays on real-world system design, distributed systems, or a deep dive into some super-clever algorithm.