Title and Author of Paper
Transaction Management in the R Distributed Database Management System. C. Mohan et al.
Summary
This paper describes to handle transactions in a distributed environment using a two-phase commit protocol (2PC). 2PC is a form of atomic commit that uses a coordinator to decide whether or not to commit or abort a transaction. The paper goes on to compare standard 2PC with two variations (1) presumed abort (PA) and (2) presumed commit (PC), which differ in how they handle failure conditions. This paper review will be divided into three sections, one for 2PC, one for PA, and one for PC.
For the purposes of this discussion, all nodes are presumed to use a write-ahead-log to provide support for recovering from failures at each node.
Two-Phase Commit
The two-phase commit protocol divides each transaction into — you guessed it — two phases. The first phase is the prepare phase, where each node participating in the transaction is sent a PREPARE message. Nodes respond to this message with a YES vote or NO vote stating whether or not they are willing to commit the transaction (i.e., that no lock conflicts exist).
If all node votes are YES votes, the commit phase of the protocol is started. The coordinator first writes a commit log and then sends a COMMIT message to each node participating in the transaction. Upon receipt of a COMMIT message from the coordinator, each subordinate node commits the transaction by first writing to their log and then updating any data from the transaction, finally the subordinate node sends an ACK to the coordinator. If one or more nodes voted NO, then the coordinator writes an abort log and sends ABORT messages to each node. Upon receipt of an ABORT message each node writes an abort log, releases any locks held, and sends an ACK to the coordinator.
After the coordinator receives ACK messages from each node, it writes an end log and can remove any references to the transaction from memory (forgetting the transaction).
Let’s take a more in-depth look at the protocol.
Prepare Phase
Figure 1 shows the flow of messages during the prepare phase from the point of view of the coordinator. To start a transaction, the coordinator sends a PREPARE message to each subordinate in the transaction.
Figure 2 shows a subordinate receiving a prepare message and voting to participate in the transaction. To participate, it writes a prepare log to durable storage and sends a YES message to the coordinator.
Figure 3 shows a similar flow of messages for a subordinate voting not to participate in the transaction. In this case, the subordinate writes an abort log to durable storage and sends a NO message to the coordinate. The subordinate is allowed to release all locks and “forget” any state about the transaction — it will never participate in the transaction after sending a NO vote.
After all subordinates have voted YES or NO, the transaction enters the commit phase.
Commit Phase
In Figure 4, we see the flow of messages after the coordinator has received at least one NO vote and must abort the transaction. The coordinator first writes an abort log and then sends an ABORT message to each subordinate (that did not already vote NO). Upon receiving an ABORT message, the subordinate writes an abort log to durable storage and sends an ACK back to the coordinator. At this point, the subordinate can release all locks and “forget” about the transaction.
After receiving all ACK messages, the coordinator writes an end log signalling that the transaction is complete.
In Figure 5, all subordinates have voted YES and the transaction is committed. The coordinator first writes a commit log. After this write, the transaction is considered committed and cannot be rolled back. After writing the log, the coordinator sends a COMMIT message to each subordinate, and each subordinate commits the transaction by writing a commit log and sending an ACK message back to the coordinator. At this point, the subordinate can update any data, and releasing any locks.
After receiving all ACK messages, the coordinator writes an end log signalling that the transaction is complete.
Handling Failures
To handle failure, it is assumed that a recovery process is running at each node in the cluster. When a failure happens the recovery process reads the local write-ahead-log and accumulates information about any running transactions in progress at the time of failure.
If the recovery process on a coordinator node finds a transaction in the committing or aborting phase, the coordinator resends COMMIT or ABORT messages to any subordinates that have not acknowledged the transaction.
If the recovery process on a subordinate node finds a transaction in the prepare phase, it periodically tries to contact the coordinator to find out how to resolve the transaction. Once the subordinate knows how to resolve the transaction it proceeds as normal.
This leaves the scenario where a subordinate fails, the subordinate asks for the current state of a transaction from the coordinator, and the coordinator has no information about the transaction. The paper details how, in this case, the correct response is to ABORT the transaction, since the process crashed before receiving all the votes needed. By aborting, the transaction could be retried if necessary.
Presumed Abort (PA)
In the standard two-phase commit protocol the process aborts when a subordinate transaction fails and the coordinator does information about the transaction to know whether to commit or abort. Since this scenario always aborts, it is safe for the coordinator to immediately “forget” about aborted transactions rather than require ACKs. Note that this is strictly a performance optimization.
Presumed Commit (PC)
Presumed commit (PC), is analogous to presumed abort, except, if no information about a transaction is available at the coordinator node, it is presumed that the transaction was a commit. In this case, ACKs are required for ABORT messages but not or COMMIT messages. This is a further optimization if we assume that commits are more frequent then aborts.
Performance Comparison
The paper concludes that, in all cases, presumed abort outperforms the standard two-phase commit protocol. The comparison between PA and PC is less clear-cut. If a transaction is read-only, then PA outperforms PC, yet for transactions with updates, PC requires less ACK messages than PA.