Two Phase Commit to power Distributed Transactions in a Distributed System






Say we have a distributed database with three nodes, and we want a “commit” to succeed only when it succeeds on all three nodes; otherwise, we “abort”.

This is a classic Distributed Transaction.

Assumptions:

  • no messages are lost
  • processes can fail in the middle of the transaction
  • every node knows about every other node in the network

Two-Phase Commit

Say we have a transaction in which N processes are participating. One of the N processes becomes the coordinator and coordinates until the end of the protocol.

Phase 1

Every node sends its local decision, commit or abort, to the coordinator process. If a process does not send any information, the coordinator marks it as abort.

At the end of phase 1, the coordinator will have the local decisions of all the nodes. The coordinator decides commit if all the nodes can commit, otherwise the decision is abort.
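The phase 1 rule can be sketched in a few lines of Python. This is only an illustration: the names `Vote` and `decide` are hypothetical, and a missing vote is modelled as `None`, which the coordinator treats as abort.

```python
from enum import Enum
from typing import Dict, Optional


class Vote(Enum):
    COMMIT = "commit"
    ABORT = "abort"


def decide(votes: Dict[str, Optional[Vote]]) -> Vote:
    """Coordinator's decision at the end of phase 1.

    `votes` maps a node name to its local decision; None means the
    coordinator never heard from that node (an assumption of this sketch).
    """
    for vote in votes.values():
        # A missing vote (None) counts as abort, same as an explicit abort.
        if vote is not Vote.COMMIT:
            return Vote.ABORT
    return Vote.COMMIT
```

For example, `decide({"n1": Vote.COMMIT, "n2": None, "n3": Vote.COMMIT})` yields `Vote.ABORT`, because the silent node `n2` is counted as an abort.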

Phase 2

The coordinator broadcasts its decision to all the nodes in the network, so the entire network either commits or aborts, which completes the transaction.
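Putting both phases together, a failure-free run might look like the sketch below. It is a single-process simulation rather than a networked implementation, and `Participant` and `two_phase_commit` are hypothetical names chosen for illustration.

```python
from typing import List, Optional


class Participant:
    def __init__(self, name: str, can_commit: bool):
        self.name = name
        self.can_commit = can_commit
        self.outcome: Optional[str] = None  # filled in during phase 2

    def vote(self) -> str:
        # Local decision sent to the coordinator in phase 1.
        return "commit" if self.can_commit else "abort"

    def apply(self, decision: str) -> None:
        # Global decision received from the coordinator in phase 2.
        self.outcome = decision


def two_phase_commit(participants: List[Participant]) -> str:
    # Phase 1: the coordinator gathers every local decision.
    votes = [p.vote() for p in participants]
    decision = "commit" if all(v == "commit" for v in votes) else "abort"
    # Phase 2: the coordinator broadcasts the single global decision.
    for p in participants:
        p.apply(decision)
    return decision
```

With three participants that can all commit, every `outcome` ends up as `"commit"`; if any one of them cannot, every `outcome` ends up as `"abort"`, including on the nodes that voted commit.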

Failure Scenarios

If the coordinator fails before the start of phase 1, then since the consensus did not even start, all the nodes can safely abort.

If the coordinator fails after initiating phase 1, some of the nodes might have already sent their local decision and would be waiting to hear the final decision. These nodes would remain blocked, because they cannot tell on their own whether the outcome is commit or abort.
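Why such a node is stuck can be seen from what it is allowed to do on its own. The sketch below uses a hypothetical helper, `on_coordinator_timeout`, and goes slightly beyond the text above: it assumes that a node which has not voted yet, or which voted abort, may abort unilaterally, since a commit requires its commit vote.

```python
from typing import Optional


def on_coordinator_timeout(my_vote: Optional[str]) -> str:
    """What a participant can safely do once it suspects the coordinator crashed.

    `my_vote` is None if this node has not sent its local decision yet.
    """
    if my_vote is None or my_vote == "abort":
        # The global decision cannot be commit without this node's commit
        # vote, so aborting locally is safe.
        return "abort"
    # The node voted commit: the coordinator may have decided either way,
    # so the node must wait for the final decision.
    return "blocked"
```

Only the `"commit"` voter is blocked: it has promised to commit if asked, yet it cannot rule out that some other node voted abort.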

If a participant crashes before sending its local decision to the coordinator, then the coordinator keeps waiting for that decision until it gives up and, as described in phase 1, marks the silent participant as abort.

If the coordinator and one participant crash in phase 2 before any other participant learns the decision, then the new coordinator that comes up has no way of knowing what that decision was.

No one can proceed, because the new coordinator does not know whether the crashed node committed or aborted.
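The dead end can be made concrete with a small recovery sketch. `SurvivorState` and `recover_decision` are hypothetical names, and the sketch assumes the new coordinator can ask each surviving node for its vote and for any outcome it has already learned.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SurvivorState:
    vote: str                      # "commit" or "abort", as sent in phase 1
    outcome: Optional[str] = None  # global decision, if this node received it


def recover_decision(survivors: List[SurvivorState]) -> Optional[str]:
    """Try to reconstruct the global decision from the surviving nodes.

    Returns None when the survivors' knowledge is not enough to decide.
    """
    for s in survivors:
        if s.outcome is not None:
            return s.outcome  # someone already heard the decision
    if any(s.vote == "abort" for s in survivors):
        return "abort"        # a single abort vote forces a global abort
    # Every survivor voted commit and none heard the outcome: the crashed
    # node may have voted abort, or may have already committed.
    return None
```

When every survivor voted commit and none of them heard the outcome, `recover_decision` returns `None`: both commit and abort are still consistent with what the survivors know, which is exactly the blocking case described above.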

Two-Phase Commit looks simple, but it has a major flaw in its failure scenarios: participants can remain blocked when the coordinator crashes. Hence, distributed systems take even finer steps to remediate this and reach a consensus more robustly.


Arpit Bhayani

  • © Arpit Bhayani, 2022
