Recent from talks
Nothing was collected or created yet.
Two-phase locking
View on WikipediaIn databases and transaction processing, two-phase locking (2PL) is a pessimistic concurrency control method that guarantees conflict-serializability.[1][2] It is also the name of the resulting set of database transaction schedules (histories). The protocol uses locks, applied by a transaction to data, which may block (interpreted as signals to stop) other transactions from accessing the same data during the transaction's life.
By the 2PL protocol, locks are applied and removed in two phases:
- Expanding phase: locks are acquired and no locks are released.
- Shrinking phase: locks are released and no locks are acquired.
Two types of locks are used by the basic protocol: Shared and Exclusive locks. Refinements of the basic protocol may use more lock types. Using locks that block processes, 2PL, S2PL, and SS2PL may be subject to deadlocks that result from the mutual blocking of two or more transactions.
Read and write locks
[edit]Locks are used to guarantee serializability. A transaction is holding a lock on an object if that transaction has acquired a lock on that object which has not yet been released.
For 2PL, the only used data-access locks are read-locks (shared locks) and write-locks (exclusive locks). Below are the rules for read-locks and write-locks:
- A transaction is allowed to read an object if and only if it is holding a read-lock or write-lock on that object.
- A transaction is allowed to write an object if and only if it is holding a write-lock on that object.
- A schedule (i.e., a set of transactions) is allowed to hold multiple locks on the same object simultaneously if and only if none of those locks are write-locks. If a disallowed lock attempts on being held simultaneously, it will be blocked.
| Lock type | read-lock | write-lock |
|---|---|---|
| read-lock | ✔ | X |
| write-lock | X | X |
Variants
[edit]| guarantees conflict-serializability | guarantees view-serializability | eliminates deadlocks | guarantees recoverability | guarantees strictness | prevents phantom reads | prevents dirty reads | |
|---|---|---|---|---|---|---|---|
| 2PL | Yes | No | No | No | No | No | No |
| C2PL | Yes | Yes[citation needed] | Yes | yes?[citation needed] | yes?[citation needed] | No[citation needed] | Yes[citation needed] |
| S2PL | Yes | No | No | Yes | Yes | Yes | Yes |
| SS2PL | Yes | No | No | Yes | Yes | Yes | Yes |
Two-phase locking
[edit]According to the two-phase locking protocol, each transaction handles its locks in two distinct, consecutive phases during the transaction's execution:
- Expanding phase (aka Growing phase): locks are acquired and no locks are released (the number of locks can only increase).
- Shrinking phase (aka Contracting phase): locks are released and no locks are acquired.
The two phase locking rules can be summarized as: each transaction must never acquire a lock after it has released a lock. The serializability property is guaranteed for a schedule with transactions that obey this rule.
Typically, without explicit knowledge in a transaction on end of phase 1, the rule is safely determined only when a transaction has completed processing and requested commit. In this case, all the locks can be released at once (phase 2).
Conservative two-phase locking
[edit]Conservative two-phase locking (C2PL) differs from 2PL in that transactions obtain all the locks they need before the transactions begin. This is to ensure that a transaction that already holds some locks will not block waiting for other locks. C2PL prevents deadlocks.
In cases of heavy lock contention, C2PL reduces the time locks are held on average, relative to 2PL and Strict 2PL, because transactions that hold locks are never blocked. In light lock contention, C2PL holds more locks than is necessary, because it is difficult to predict which locks will be needed in the future, thus leading to higher overhead.
A C2PL transaction will not obtain any locks if it cannot obtain all the locks it needs in its initial request. Furthermore, each transaction needs to declare its read and write set (the data items that will be read/written), which is not always possible. Because of these limitations, C2PL is not used very frequently.
Strict two-phase locking
[edit]To comply with the strict two-phase locking (S2PL) protocol, a transaction needs to comply with 2PL, and release its write (exclusive) locks only after the transaction has ended (i.e., either committed or aborted). On the other hand, read (shared) locks are released regularly during the shrinking phase.
Unlike 2PL, S2PL provides strictness (a special case of cascade-less recoverability). This protocol is not appropriate in B-trees because it causes Bottleneck (while B-trees always starts searching from the parent root). [citation needed]
Strong strict two-phase locking
[edit]or Rigorousness, or Rigorous scheduling, or Rigorous two-phase locking
To comply with strong strict two-phase locking (SS2PL), a transaction's read and write locks are released only after that transaction has ended (i.e., either committed or aborted). A transaction obeying SS2PL has only a phase 1 and lacks a phase 2 until the transaction has completed. Every SS2PL schedule is also an S2PL schedule, but not vice versa.
See also
[edit]References
[edit]- ^ Philip A. Bernstein, Vassos Hadzilacos, Nathan Goodman (1987): Concurrency Control and Recovery in Database Systems, Addison Wesley Publishing Company, ISBN 0-201-10715-5
- ^ Gerhard Weikum, Gottfried Vossen (2001): Transactional Information Systems, Elsevier, ISBN 1-55860-508-8
Two-phase locking
View on GrokipediaBackground Concepts
Concurrency Control in Databases
In database systems, a transaction is defined as a sequence of read and write operations on shared data items, executed as an atomic unit that preserves database consistency by transforming it from one valid state to another when run in isolation.[1] This concept emerged prominently in the 1970s with the development of multi-user database systems, such as IBM's System R, which enabled multiple users to access and modify shared data concurrently, necessitating mechanisms to manage interference and maintain reliability.[1] Concurrent execution of transactions, while improving system throughput, introduces anomalies that compromise data integrity. A lost update occurs when two transactions read the same data item and one overwrites the other's modifications without incorporating them.[3] Dirty reads happen when a transaction accesses uncommitted changes made by another transaction, potentially propagating invalid data if the modifying transaction aborts.[3] Unrepeatable reads arise when a transaction rereads a data item within its execution but obtains a different value due to an intervening commit by another transaction.[3] Phantom reads occur when a transaction reissues a query and detects new or absent rows resulting from concurrent inserts or deletes by other transactions.[3] Concurrency control protocols address these issues by ensuring isolation—preventing transactions from observing each other's intermediate effects—and consistency, whereby the database remains in a valid state despite parallelism. The overarching goal is to permit high degrees of interleaving for performance while avoiding anomalies that violate application semantics. Serializability serves as the standard criterion for concurrency control correctness, stipulating that a schedule of transactions must produce results equivalent to some serial execution of those transactions in isolation.[1] This equivalence is often assessed through conflict serializability, where two schedules are conflict equivalent if they order all pairs of conflicting operations (reads and writes on the same item) identically, ensuring no cycles in the precedence graph that would indicate non-serializable behavior.[1] Protocols like two-phase locking achieve this by coordinating access to data items, alongside alternatives such as timestamp ordering.[1]Lock Types and Compatibility
In two-phase locking (2PL), the fundamental locking primitives consist of shared locks and exclusive locks, which manage concurrent access to data items to ensure consistency. These locks address potential conflicts in database concurrency control, where simultaneous read and write operations could lead to inconsistent views of data. Shared locks (S locks), also referred to as read locks, are acquired by transactions performing read operations on a data item. Multiple transactions can simultaneously hold shared locks on the same data item, allowing concurrent reads without interference, as reads do not modify the data. This design maximizes throughput for read-heavy workloads by permitting parallelism among reading transactions. Exclusive locks (X locks), or write locks, are requested by transactions intending to write to a data item. Only one transaction at a time can hold an exclusive lock on a given data item, blocking all other transactions from reading or writing it until the lock is released. This exclusivity protects the integrity of updates by isolating modifications from concurrent access. Lock compatibility governs whether a requesting transaction can acquire a lock on an item already locked by another transaction, preventing unsafe concurrent operations. Shared locks are compatible with other shared locks, enabling multiple readers. However, exclusive locks are incompatible with any existing shared or exclusive locks, ensuring no interference during writes. The following table summarizes lock compatibility:| Current Lock | Requested Shared (S) | Requested Exclusive (X) |
|---|---|---|
| Granted Shared (S) | Compatible | Incompatible |
| Granted Exclusive (X) | Incompatible | Incompatible |
Basic Two-Phase Locking Protocol
Protocol Rules and Phases
The two-phase locking (2PL) protocol is a concurrency control mechanism that ensures serializability in database systems by structuring each transaction's lock operations into two distinct phases: a growing phase, during which the transaction acquires all necessary locks without releasing any, and a shrinking phase, in which it releases locks without acquiring any new ones.[4] This core rule mandates that locks are requested and granted only in the growing phase, while all releases occur exclusively in the shrinking phase, preventing interleavings that could lead to non-serializable schedules.[4] The lock point defines the critical transition in a transaction's execution, marking the instant when the final lock is acquired and the first lock release is performed, thereby separating the two phases and enforcing the protocol's structure. At this point, the transaction holds all required locks and may proceed to perform its data operations, after which it enters the shrinking phase to release them systematically. The protocol's algorithmic steps can be outlined as follows, assuming a centralized lock manager that handles requests for shared (read) or exclusive (write) locks and enforces compatibility rules.[4] Growing Phase:- For each data item to be read or written, the transaction requests the appropriate lock (shared for reads, exclusive for writes).
- If the lock is compatible with existing locks on the item (e.g., no conflicting exclusive lock for a shared request), it is granted immediately.
- If incompatible, the transaction waits until the conflicting lock is released, without proceeding to any releases of its own locks.
- The phase continues until all required locks are acquired.
- Once the lock point is reached (all locks acquired and operations complete), the transaction releases locks on data items no longer needed, typically in reverse order of acquisition or at commit time.
- No new lock requests are permitted after the first release.
Procedure TwoPhaseLocking(T):
locks_held = [empty set](/page/Empty_set)
phase = "growing"
while T has pending operations:
if operation requires lock on item X:
if phase == "growing":
request_lock(X, mode) // mode: shared or exclusive
if lock granted:
add X to locks_held
else:
wait until granted
else: // shrinking phase
error: "Lock request after release not allowed"
perform operation on X
phase = "shrinking"
for each lock in locks_held:
release_lock(lock)
Procedure TwoPhaseLocking(T):
locks_held = [empty set](/page/Empty_set)
phase = "growing"
while T has pending operations:
if operation requires lock on item X:
if phase == "growing":
request_lock(X, mode) // mode: shared or exclusive
if lock granted:
add X to locks_held
else:
wait until granted
else: // shrinking phase
error: "Lock request after release not allowed"
perform operation on X
phase = "shrinking"
for each lock in locks_held:
release_lock(lock)
Execution Example
To illustrate the basic two-phase locking protocol, consider a simple scenario involving two concurrent transactions, T1 and T2, operating on shared data items A and B. Transaction T1 performs a read on A followed by a write on B, requiring a shared lock (S-lock) on A and an exclusive lock (X-lock) on B. Transaction T2 performs a write on B followed by a read on A, requiring an X-lock on B and an S-lock on A. These operations conflict on B, as an X-lock is incompatible with any other lock on the same item.[5] The execution proceeds as follows, with locks managed by a lock manager that grants requests only if compatible with existing locks held by other transactions. In the growing phase, T1 successfully acquires its required locks before T2 can proceed fully, demonstrating how 2PL enforces ordering to avoid certain non-serializable interleavings. T2 enters a wait state for the conflicting lock on B until T1 transitions to the shrinking phase and releases its locks. No new locks are acquired after the first release in either transaction. The timeline of lock requests, grants, waits, and releases is shown in the table below:| Time | T1 Action | Lock Status for T1 | T2 Action | Lock Status for T2 | Notes |
|---|---|---|---|---|---|
| 1 | Request S(A) | Granted: S(A) | - | - | T1 growing phase begins. |
| 2 | read(A); request X(B) | Granted: S(A), X(B) | - | - | T1 completes growing phase (lock point). |
| 3 | write(B); end operations | Held: S(A), X(B) | Request X(B) | Waits for X(B) | T2 growing phase starts but blocks on B (incompatible with T1's X(B)). |
| 4 | Transition to shrinking phase | Held: S(A), X(B) | Waiting | - | No new lock requests allowed for T1. |
| 5 | Release X(B) | Held: S(A) | Granted: X(B) | X(B) | T2 lock request granted after release. |
| 6 | Release S(A) | No locks | read(A); end operations | X(B), S(A) | T2 completes growing phase and operations. |
| 7 | Commit | - | Transition to shrinking phase; release S(A), X(B) | No locks | T2 shrinking phase; commit. |
Theoretical Analysis
Proof of Serializability
Conflict serializability is a correctness criterion for concurrent transaction schedules in database systems, where a schedule is deemed conflict serializable if it is equivalent to some serial execution of the transactions, achieved through a series of conflict-preserving swaps of adjacent operations. This equivalence preserves the outcome of the schedule while ensuring that conflicting operations—such as two writes or a read followed by a write on the same data item—maintain their relative order. A key result in concurrency control theory states that any schedule produced under the basic two-phase locking (2PL) protocol is conflict serializable, with the serialization order of transactions determined by the topological order of the precedence graph associated with the schedule. This theorem establishes 2PL as a sufficient condition for ensuring serializability, as the protocol's structure inherently avoids schedules that violate conflict equivalence to serial executions. The precedence graph, also known as the serialization graph, is constructed for a given schedule with nodes representing individual transactions and directed edges indicating precedence due to conflicts. Specifically, an edge from transaction to (denoted ) exists if an operation in precedes and conflicts with an operation in , such as when holds an exclusive lock on a data item that attempts to access incompatibly. A schedule is conflict serializable if and only if this graph is acyclic, allowing a topological sort that defines a valid serial order. The proof that 2PL produces conflict serializable schedules proceeds by contradiction, showing that the precedence graph under 2PL cannot contain cycles. Assume a cycle exists in the graph for a 2PL schedule. In the growing phase of 2PL, transactions acquire all necessary locks without releasing any, so conflicting lock requests create forward edges in the order of lock points (the moments when locks are granted). The shrinking phase, where locks are released but no new ones acquired, cannot introduce backward edges that would close a cycle, as all potential conflicts are resolved earlier in a manner consistent with the acquisition order. Thus, the lock point ordering imposes a partial order that prevents cycles, ensuring acyclicity and hence conflict serializability. While 2PL guarantees serializability when all transactions adhere to the protocol's rules—including acquiring locks before accessing data items—the absence of proper lock requests (e.g., reading without a shared lock) can permit non-serializable schedules. However, the protocol itself enforces the necessary locking discipline to maintain this property in compliant executions.Deadlock Risks and Mitigation
In two-phase locking (2PL), a deadlock arises when two or more transactions form a circular wait condition, where each holds locks needed by another and awaits locks held by the others, typically during the growing phase of lock acquisition. For instance, transaction T1 might acquire a lock on data item A and request a lock on B, while transaction T2 holds a lock on B and requests one on A, creating a cycle that prevents progress. This mutual dependency violates the no-circular-wait condition of the classic deadlock framework and can stall system execution indefinitely if unresolved.[6][7] Deadlocks occur more frequently in basic 2PL than in serial transaction execution because the protocol permits dynamic, concurrent lock requests without pre-declaring all needed resources, increasing the likelihood of conflicting acquisitions among active transactions. To detect deadlocks, database management systems (DBMS) maintain a waits-for graph, with nodes representing transactions and directed edges indicating that one transaction awaits a lock held by another. Cycle detection in this graph is performed periodically or on lock requests using algorithms like depth-first search (DFS), which traverse the graph to identify loops; if a cycle is found, the system intervenes to break it.[8][7][9] Mitigation strategies focus on resolution rather than strict prevention in basic 2PL, as complete avoidance would overly restrict concurrency. Upon detection, the DBMS selects a victim transaction to abort—often based on criteria such as the minimum cost (e.g., fewest locks held, least progress made, or fewest prior restarts)—and rolls it back to release its locks, allowing the cycle to resolve. Partial rollbacks may be used to undo only the conflicting operations, minimizing restart overhead. Alternatively, timeout mechanisms abort transactions that wait beyond a threshold, preventing indefinite hangs without explicit graph checks. Prevention techniques like timestamp-based protocols (e.g., wait-die or wound-wait) can be layered on 2PL to prioritize transactions and avoid cycles proactively, though they risk unnecessary aborts.[6][9] These approaches involve trade-offs: deadlock detection incurs runtime overhead from graph construction, maintenance, and periodic scans, which scales with transaction volume but enables higher concurrency by allowing most executions to proceed uninterrupted. In contrast, resolution via aborts reduces throughput due to rollback costs and potential restarts, while timeouts or prevention schemes may terminate non-deadlocked transactions, leading to unnecessary resource waste and lower efficiency under low-contention workloads. Empirical studies show that detection-based methods balance these costs effectively in practice for moderate contention levels.[8][7]Advanced Variants
Conservative Two-Phase Locking
Conservative two-phase locking (C2PL), also known as static two-phase locking, requires each transaction to declare all data items it intends to access at the outset and to acquire the necessary locks on all of them before executing any read or write operations. This pre-declaration phase ensures that locking is performed statically, without incremental requests during transaction execution.[10] The protocol maintains the two-phase structure of basic two-phase locking but modifies the growing phase to occur entirely at the beginning: a transaction attempts to obtain all required locks (shared for reads, exclusive for writes) in a single batch, releasing none if any acquisition fails, enters the shrinking phase immediately after acquiring all locks, and then executes its read and write operations, releasing locks as they are no longer needed during this phase.[11] Unlike basic two-phase locking, which permits dynamic lock requests and can lead to deadlocks, C2PL motivates this upfront strategy to eliminate such risks.[1] C2PL guarantees deadlock freedom because no transaction holds partial locks while waiting for others, thereby avoiding cycles in the wait-for graph. It also simplifies implementation by obviating the need for deadlock detection and resolution mechanisms.[10] However, C2PL reduces concurrency since locks are held from the transaction's start until they are released during execution, potentially blocking other transactions for extended periods.[11] Moreover, it demands prior knowledge of the complete access set, which may not be feasible for all applications and can lead to starvation if transactions frequently require large numbers of locks. Regarding correctness, C2PL ensures conflict serializability, inheriting the properties of basic two-phase locking: the precedence graph is acyclic because all conflicting lock acquisitions establish edges early, prior to any data access.[1]Strict Two-Phase Locking
Strict two-phase locking (S2PL) extends the basic two-phase locking (2PL) protocol by requiring that all exclusive (write) locks acquired by a transaction be held until the transaction either commits or aborts, while shared (read) locks may be released earlier once they are no longer needed, provided no new locks are acquired thereafter. This retention applies specifically to exclusive locks, which prevent concurrent modifications to the same data item, ensuring that no other transaction can read or write the affected items until the transaction completes. The protocol maintains the two distinct phases of basic 2PL: a growing phase where locks are acquired as needed, and a shrinking phase where lock releases occur, but with the shrinking phase for exclusive locks delayed until after the commit or abort point. The primary motivation for S2PL is to enhance recoverability in database systems by preventing cascading aborts during failure recovery. In basic 2PL, early release of exclusive locks can allow other transactions to read uncommitted (dirty) data, leading to situations where an aborting transaction's effects propagate, requiring multiple transactions to be rolled back in a chain reaction. By holding exclusive locks until commit, S2PL ensures that dirty writes are not visible to other transactions, thereby avoiding such cascades and simplifying recovery to restoring from before-images without affecting dependent transactions. This approach aligns with the need for strict schedules, which are both serializable and recoverable. S2PL offers a balance between concurrency and recoverability, as the allowance for early release of shared locks permits higher throughput in read-heavy workloads compared to variants that retain all locks until the end. It guarantees serializability through the same precedence graph mechanism as basic 2PL, where conflicting operations establish a total order equivalent to some serial execution, and the additional lock retention does not alter this property. However, the prolonged hold on exclusive locks can increase the likelihood of deadlocks and extend transaction waiting times, potentially reducing overall system performance under high contention, as resources remain blocked longer than in basic 2PL.Strong Strict Two-Phase Locking
Strong strict two-phase locking (SS2PL), also known as rigorous two-phase locking, extends strict two-phase locking by requiring transactions to hold both shared (S) locks for reads and exclusive (X) locks for writes until the transaction commits or aborts, with no early releases permitted. This variant builds on strict two-phase locking, which already retains X locks until commit but allows S locks to be released earlier. In SS2PL, the protocol maintains the two phases of basic two-phase locking: a growing phase where locks are acquired as needed without releases, followed by a shrinking phase where no new locks are acquired, but all locks—S and X—are released only at commit or abort.[12][13] The implementation of SS2PL defines a clear lock point at the end of the growing phase, after which the transaction performs its operations using the acquired locks and then releases everything atomically upon commit or abort. This approach ensures that conflicting transactions are serialized strictly in the order of their commit points, simplifying the enforcement of isolation levels. SS2PL preserves conflict serializability, as guaranteed by the underlying two-phase locking rules, while providing even stricter isolation by preventing any intermediate visibility of transaction effects during execution.[14][12] A key advantage of SS2PL is its simplicity in recovery mechanisms, as holding all locks until commit eliminates cascading aborts entirely—no transaction can read uncommitted data, and undoing changes is straightforward since no partial updates are visible. This reliability makes SS2PL widely adopted in production database systems, where it is often implemented under the simpler label of "two-phase locking" for robust transaction processing. However, these benefits come at the cost of reduced concurrency, as the prolonged retention of all locks blocks other transactions longer, increasing the risk of deadlocks and lowering overall throughput compared to less conservative variants.[13][12]Variant Comparisons
The variants of two-phase locking (2PL) represent refinements to the basic protocol, each balancing trade-offs between concurrency, deadlock avoidance, recoverability, and implementation simplicity. Basic 2PL offers high flexibility and throughput but risks deadlocks and cascading aborts, while conservative 2PL eliminates deadlocks through pre-acquisition at the cost of reduced concurrency. Strict 2PL enhances recoverability by holding exclusive locks until commit, and strong strict (or rigorous) 2PL extends this to all locks for the simplest recovery mechanisms, albeit with the longest lock durations.[15]| Variant | Serializability | Deadlock Risk | Lock Hold Times | Recoverability | Key Trade-off |
|---|---|---|---|---|---|
| Basic 2PL | Yes | High (requires detection) | Shortest (releases after use in shrinking phase) | Weak (allows dirty reads and cascading aborts) | High concurrency but prone to deadlocks and recovery issues |
| Conservative 2PL | Yes | None (pre-acquires all locks) | Moderate (acquires all upfront, releases gradually) | Weak (similar to basic) | Deadlock-free but low concurrency due to predeclaration overhead[15] |
| Strict 2PL | Yes | High (like basic) | Longer (holds exclusive locks until commit) | Strong (avoids cascading aborts) | Balances concurrency and recoverability, common in production systems[15] |
| Strong Strict 2PL | Yes | High (like basic) | Longest (holds all locks until commit) | Strongest (simplest abort handling) | Lowest concurrency but easiest recovery implementation[15] |
