Arpit's Newsletter read by 90000+ engineers
Weekly essays on real-world system design, distributed systems, or a deep dive into some super-clever algorithm.
Say we have a distributed database with three nodes, and we want a “commit” to succeed only when it succeeds on all three nodes; otherwise, we “abort”.
This is a classic Distributed Transaction.
Assumptions:
Say we have a transaction in which N processes participate. One of the N processes becomes the coordinator and coordinates until the end of the protocol.
In phase 1, all the nodes send their local decision, commit or abort, to the coordinator process. If a process does not send any information, the coordinator marks it as abort.
At the end of phase 1, the coordinator will have the local decisions of all the nodes. The coordinator decides commit if all the nodes can commit; otherwise, the decision is abort.
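The decision rule at the end of phase 1 can be sketched in a few lines of Python. This is only an illustration: the function name `decide`, the node names, and the use of `None` to represent a node that never replied are assumptions made for the example, not part of any real library.

```python
from typing import Dict, Optional

COMMIT = "commit"
ABORT = "abort"


def decide(votes: Dict[str, Optional[str]]) -> str:
    """Return the coordinator's global decision from the phase 1 votes.

    A vote of None stands for a node that never replied (an assumption
    of this sketch); it is counted as abort.
    """
    # Commit only if every participant explicitly voted commit.
    if votes and all(vote == COMMIT for vote in votes.values()):
        return COMMIT
    return ABORT


print(decide({"n1": COMMIT, "n2": COMMIT, "n3": COMMIT}))  # commit
print(decide({"n1": COMMIT, "n2": ABORT, "n3": COMMIT}))   # abort
print(decide({"n1": COMMIT, "n2": None, "n3": COMMIT}))    # abort
```

A single abort, or a single missing vote, is enough to abort the whole transaction.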
In phase 2, the coordinator broadcasts its decision to all the nodes in the network, so the entire network either commits or aborts, which completes the transaction.
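Putting the two phases together, a failure-free run might look like the sketch below. It is a single-process simulation with no networking; the `Participant` class, its `can_commit` flag, and the `two_phase_commit` function are hypothetical names chosen for the example.

```python
from typing import List


class Participant:
    """A hypothetical in-memory participant (no real network or storage)."""

    def __init__(self, name: str, can_commit: bool) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def vote(self) -> str:
        # Phase 1: report the local decision to the coordinator.
        self.state = "voted"
        return "commit" if self.can_commit else "abort"

    def finish(self, decision: str) -> None:
        # Phase 2: apply whatever the coordinator decided.
        self.state = decision


def two_phase_commit(participants: List[Participant]) -> str:
    # Phase 1: collect every local decision.
    votes = [p.vote() for p in participants]
    decision = "commit" if all(v == "commit" for v in votes) else "abort"
    # Phase 2: broadcast the global decision to every participant.
    for p in participants:
        p.finish(decision)
    return decision


nodes = [Participant("n1", True), Participant("n2", True), Participant("n3", False)]
print(two_phase_commit(nodes))    # abort
print([p.state for p in nodes])   # ['abort', 'abort', 'abort']
```

Because the third node cannot commit, every node ends up aborting, including the two that were ready to commit.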
If the coordinator fails before the start of phase 1, then since the consensus did not even start, all the nodes can safely abort.
If the coordinator fails after initiating phase 1, some of the nodes might have already sent their local decision and would be waiting to hear the final decision. These nodes would remain blocked.
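The blocking case can be made concrete by asking what a single node can safely do once the coordinator goes silent. The sketch below uses a hypothetical `VotingNode` class. It also adds one refinement that the text above does not spell out: a node that voted abort already knows the global outcome can only be abort, so only nodes that voted commit are truly stuck.

```python
from typing import Optional


class VotingNode:
    """Hypothetical participant that tracks what it has told the coordinator."""

    def __init__(self, local_vote: str) -> None:
        self.local_vote = local_vote          # "commit" or "abort"
        self.vote_sent = False
        self.final_decision: Optional[str] = None

    def send_vote(self) -> str:
        self.vote_sent = True
        return self.local_vote

    def on_coordinator_silent(self) -> str:
        """Return what this node can safely do without the coordinator."""
        if self.final_decision is not None:
            return self.final_decision
        if not self.vote_sent or self.local_vote == "abort":
            # The coordinator cannot have decided commit without a commit
            # vote from this node, so aborting locally is safe.
            self.final_decision = "abort"
            return "abort"
        # Voted commit and heard nothing back: the global decision could be
        # either commit or abort, so the node has to wait.
        return "blocked"


waiting = VotingNode("commit")
waiting.send_vote()
print(waiting.on_coordinator_silent())  # blocked
```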
If a participant crashes before sending its local decision to the coordinator, the coordinator keeps waiting for that decision; it can only move on by treating the missing response as abort, as described above.
If the coordinator and one participant crash in phase 2 without the other participants knowing anything about the decision, the new coordinator that comes up has no idea what the decision was.
No one can proceed, because the new coordinator does not know whether the crashed node had committed or aborted.
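To see why recovery gets stuck, consider what a replacement coordinator could do with only the surviving participants to ask. In the sketch below, `recover_decision` and the state labels are assumptions made for the example: `"commit"` and `"abort"` mean the node already knows the outcome, and `"uncertain"` means it voted commit but never heard the decision.

```python
from typing import Dict, Optional


def recover_decision(known_states: Dict[str, str]) -> Optional[str]:
    """Try to reconstruct the global decision from the reachable nodes.

    Returns "commit" or "abort" if some survivor knows the outcome,
    or None if the new coordinator cannot safely decide.
    """
    states = set(known_states.values())
    if "commit" in states:
        # Some survivor saw the commit decision, so that was the outcome.
        return "commit"
    if "abort" in states:
        # Some survivor saw (or forced) an abort.
        return "abort"
    # Every survivor is uncertain. The crashed participant may have
    # committed or aborted before failing, so either choice could
    # contradict it: the protocol is blocked.
    return None


print(recover_decision({"n2": "uncertain", "n3": "commit"}))     # commit
print(recover_decision({"n2": "uncertain", "n3": "uncertain"}))  # None
```

The second call is the scenario described above: the only node that might have known the decision is the one that crashed.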
Two-Phase Commit looks super-simple, but it has a major flaw: in the failure scenarios above, participants can remain blocked. Hence, distributed systems take even finer steps to remediate this and reach a consensus more robustly.