Two phase commit (2PC)

Version 5

Created by ochaloup on May 25, 2017 5:59 AM. Last modified by ochaloup on Jan 8, 2018 4:56 PM.

A well-known algorithm for achieving ACID transaction outcomes is the two-phase commit protocol. 2PC (part of a family of consensus protocols) coordinates the commit of a distributed transaction, i.e. one that updates multiple resources. Those of you already familiar with Narayana will be well acquainted with a popular manifestation of this philosophy: JTA.

As its name suggests, the protocol works in two phases. In the first phase, named ‘prepare’, the coordinator asks the participants whether they are ready to commit. In the second phase, named ‘commit’, the coordinator commands the participants to commit and make their changes visible to the outer world.
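
The two phases can be sketched in a few lines of Java. This is a minimal in-memory illustration, not the API of Narayana or JTA: the names Participant, RecordingParticipant and SimpleCoordinator are hypothetical and were invented for this example, and nothing is persisted.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical participant contract with the three calls the protocol needs. */
interface Participant {
    boolean prepare();   // vote: true means "ready to commit"
    void commit();
    void rollback();
}

/** Hypothetical in-memory participant that records every call it receives. */
final class RecordingParticipant implements Participant {
    private final boolean vote;
    final List<String> calls = new ArrayList<>();

    RecordingParticipant(boolean vote) { this.vote = vote; }

    @Override public boolean prepare() { calls.add("prepare"); return vote; }
    @Override public void commit() { calls.add("commit"); }
    @Override public void rollback() { calls.add("rollback"); }
}

final class SimpleCoordinator {
    /** Returns true if the transaction committed, false if it was rolled back. */
    static boolean complete(List<? extends Participant> participants) {
        // Phase 1 (prepare): ask each participant, stopping at the first "no".
        boolean allReady = true;
        for (Participant p : participants) {
            if (!p.prepare()) {
                allReady = false;
                break;
            }
        }
        // Phase 2: commit everywhere only if every vote was "yes";
        // otherwise roll back every participant.
        for (Participant p : participants) {
            if (allReady) {
                p.commit();
            } else {
                p.rollback();
            }
        }
        return allReady;
    }
}
```

The key design point is that no participant is told to commit until all of them have voted yes; a single negative vote turns the second phase into a rollback for everyone.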

For a better understanding of the process let’s take an example. We have a JMS broker, where we want to send a message to a queue, and a database, where we want to insert a record into a table.

The transaction manager, represented by Narayana, coordinates two resources here - the JMS broker and the database.

Before we dive into the example, we need to elaborate a little on the terms used below.

Transaction manager - is the component responsible for managing a global transaction. It drives the two-phase commit and acts as the arbiter in case of a failure.
Global transaction - defines a unit of work that fulfills the ACID properties. The application environment enlists resources in the global transaction (enlisting means that a resource joins the transaction). Work done by a resource (here a database or a JMS broker) is wrapped in a transaction branch.
Multiple resources can participate in the global transaction, and its context is automatically propagated to methods in the call stack, usually within the scope of one thread.
Transaction branch - is an individual transaction that is part of the global transaction. It can’t exist alone. It’s defined for a unit of work done on a particular resource within the global transaction. A new transaction branch is normally created for each resource. The term transaction branch is defined by the X/Open specification. The transaction branch is XA capable.
In this text, we understand a transaction branch as the part of the global transaction that runs under the umbrella of the application. A part of the transaction branch is also located at the resource, but in this text we call that part a resource-located transaction (see below).
Local transaction - is a non-XA capable transaction (that is, one not capable of participating in the two-phase commit protocol). It’s defined for a single resource and is usually managed by the application. In particular circumstances it can be part of the global transaction too - e.g. via what LRCO provides (see the Narayana documentation).
In some places on the internet you can find the term resource local transaction, which means the same as local transaction in this text.
Resource-located transaction - is a term specific to this text and stands for a transaction run on the resource. It’s meant as the counterpart of the transaction (either local or branch) started and managed by the application or the transaction manager.
The work done in the resource is covered by the resource-located transaction - any database insertion, JMS message sending etc. is included in it.
An XA capable resource-located transaction (one that can participate in two-phase commit) is normally called a transaction branch in the context of the resource too.
The word resource in the term emphasizes the fact that this is the transaction running on the resource.

First, the transaction manager starts a global transaction.
Now the business logic sends a message with some content to the JMS queue and inserts a row into the database table. Each operation defines its own transaction branch. On top of that, sending the message starts a resource-located transaction in the JMS broker, and inserting the data starts a resource-located transaction in the database.
Each transaction branch is enlisted in (that is, joins) the global transaction managed by the transaction manager.
When the business logic finishes, it is the task of the transaction manager to ensure that the work done within the resource-located transactions is reflected as a single unit of work, so the transaction manager drives the two-phase commit.
In the first phase, the transaction manager calls prepare on both resources (for both transaction branches). If everything goes well, they respond ‘ok’. When the prepare succeeds, information about the preparation is persisted on the side of the transaction manager. The same information - that the resource confirmed ok in the prepare phase - is persisted on the side of the resources. Each resource, the JMS broker and the database, maintains its own transaction log.
As the final step, the transaction manager commands both resources to commit. Once a transaction branch has committed, the outcome of the transaction becomes visible to the outer world. In the case of the JMS broker, it means that the message is put on the target queue; in the case of the database, it means that the record is inserted into the database table. When the commit succeeds, the information about the transaction is removed from the transaction logs (the global one and the local ones).
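
The walkthrough above, including the life cycle of the transaction logs, can be sketched as follows. This is a simplified model under stated assumptions: LoggingResource and LoggingTransactionManager are hypothetical classes invented for this text, the "logs" are plain in-memory sets standing in for persistent storage, and the transaction id is just a string.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical resource (JMS broker or database) with its own transaction log. */
final class LoggingResource {
    final String name;
    final Set<String> log = new HashSet<>();          // stands in for the resource's persistent log
    final List<String> visible = new ArrayList<>();   // work visible to the outer world
    private final Map<String, String> pending = new HashMap<>();  // resource-located transactions

    LoggingResource(String name) { this.name = name; }

    /** Work done inside the resource-located transaction; not yet visible. */
    void doWork(String txId, String work) { pending.put(txId, work); }

    /** Votes ok only if there is work for this transaction, and records the vote in the log. */
    boolean prepare(String txId) {
        if (!pending.containsKey(txId)) {
            return false;
        }
        log.add(txId);
        return true;
    }

    /** Makes the work visible and removes the log record. */
    void commit(String txId) {
        visible.add(pending.remove(txId));
        log.remove(txId);
    }

    void rollback(String txId) {
        pending.remove(txId);
        log.remove(txId);
    }
}

/** Hypothetical transaction manager driving one global transaction. */
final class LoggingTransactionManager {
    final Set<String> log = new HashSet<>();   // stands in for the global transaction log
    private final List<LoggingResource> enlisted = new ArrayList<>();
    private final String txId;

    LoggingTransactionManager(String txId) { this.txId = txId; }

    void enlist(LoggingResource resource) { enlisted.add(resource); }

    boolean commit() {
        // Phase 1: every enlisted resource must confirm prepare.
        for (LoggingResource r : enlisted) {
            if (!r.prepare(txId)) {
                for (LoggingResource e : enlisted) {
                    e.rollback(txId);
                }
                return false;
            }
        }
        log.add(txId);   // prepare succeeded: record it on the transaction manager side
        // Phase 2: command every resource to commit, then clean up the global log.
        for (LoggingResource r : enlisted) {
            r.commit(txId);
        }
        log.remove(txId);
        return true;
    }
}
```

With one resource holding a pending message and another holding a pending row, nothing is visible before commit() is called on the manager; afterwards both pieces of work are visible and all three logs are empty again.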

This handling is not simple, and it would be senseless if we lived in a world without failures. So what happens when a failure occurs?

If a failure occurs while the business logic is still processing data, all transactions (the global one and the local ones) exist only in memory (they are not stored in any persistent storage). On failure, the transaction manager aborts (rolls back) all branches of the global transaction. If that is not possible (e.g. because of a network failure), a timeout is defined after which the transactions are rolled back independently in each resource.
If the failure occurs during the two-phase commit itself, the behavior depends on the stage of the process.
- If a resource does not agree to prepare, the whole two-phase commit is canceled. The transaction manager tells the other resources to roll back their resource-located transactions. At this point, the information about the existence of the transaction is still stored only in memory.
- If both participants agree to prepare, the global transaction should be committed. As mentioned above, successful completion of the prepare phase means that a persistent log is saved on the side of the transaction manager and on the side of the resources (for the resource-located transactions). If a resource (JMS broker or database) becomes unavailable, it’s the responsibility of the transaction manager to keep trying to commit the transaction. When the resource becomes available again, the transaction manager commands it to commit the appropriate transaction branch, which finishes the resource-located transaction.
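
The last case, where a prepared resource becomes unavailable before it can commit, can be illustrated with a small retry loop. This is only a sketch under assumptions: FlakyResource and CommitRetrier are hypothetical names, the outage is simulated by a counter, and the retries happen immediately in a loop, whereas a real transaction manager would typically retry later, driven by its persistent log.

```java
/** Hypothetical prepared resource that is unavailable for its first few commit attempts. */
final class FlakyResource {
    private int failuresLeft;
    boolean committed = false;

    FlakyResource(int failuresBeforeRecovery) {
        this.failuresLeft = failuresBeforeRecovery;
    }

    void commit() {
        if (failuresLeft > 0) {
            failuresLeft--;
            throw new IllegalStateException("resource unavailable");
        }
        committed = true;
    }
}

final class CommitRetrier {
    /**
     * Keeps trying to commit a prepared branch.
     * Returns the number of attempts used, or -1 if the branch is still
     * pending after maxAttempts (it would then stay in the log for a later retry).
     */
    static int commitWithRetry(FlakyResource resource, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                resource.commit();
                return attempt;
            } catch (IllegalStateException unavailable) {
                // The decision to commit was already made in the prepare phase,
                // so the only valid reaction is to try again, never to roll back.
            }
        }
        return -1;
    }
}
```

The point of the sketch is that after a successful prepare the outcome is fixed: an unavailable resource delays the commit but does not change it.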

You can continue to the article with my thoughts about 3PC here: https://developer.jboss.org/wiki/Three-phaseCommitProtocol

Or check the Narayana team blog post about sagas: Sagas and how they differ from two-phase commit
