Record locking
Record locking is the technique of preventing simultaneous access to data in a database in order to avoid inconsistent results.
The classic example is demonstrated by two bank clerks attempting to update the same bank account for two different transactions. Clerks 1 and 2 both retrieve (i.e., copy) the account's record. Clerk 1 applies and saves a transaction. Clerk 2 then applies a different transaction to his copy and saves the result, which reflects only the original record and his own changes, overwriting the transaction entered by clerk 1. The record no longer reflects the first transaction, as if it had never taken place.
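The lost update above can be sketched in a few lines of Python (illustrative only; the account and amounts are invented for the example):

```python
# Illustrative sketch of the lost-update anomaly described above.
# Two "clerks" each read the balance, apply their change to a stale
# copy, and write back; the second write silently discards the first.

account = {"balance": 100}

# Clerk 1 and clerk 2 both retrieve (copy) the record.
copy1 = account["balance"]
copy2 = account["balance"]

# Clerk 1 applies a +50 deposit and saves.
account["balance"] = copy1 + 50

# Clerk 2 applies a -30 withdrawal to the stale copy and saves,
# overwriting clerk 1's deposit.
account["balance"] = copy2 - 30

print(account["balance"])  # 70, not the correct 120: the deposit is lost
```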
A simple way to prevent this is to lock the file whenever a record is being modified by any user, so that no other user can save data. This prevents records from being overwritten incorrectly, but allows only one record to be processed at a time, locking out other users who need to edit records at the same time.
To allow several users to edit a database table at the same time and also prevent inconsistencies created by unrestricted access, a single record can be locked when retrieved for editing or updating. Anyone attempting to retrieve the same record for editing is denied write access because of the lock (although, depending on the implementation, they may be able to view the record without editing it). Once the record is saved or edits are canceled, the lock is released. Records can never be saved so as to overwrite other changes, preserving data integrity.
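This check-out/check-in discipline can be sketched as follows (a minimal, non-authoritative Python illustration; the Table class and all names are invented for the example):

```python
# Sketch of pessimistic record locking: a record must be checked out
# before editing, a second editor is denied write access while the
# lock is held, and saving releases the lock.

class RecordLockError(Exception):
    pass

class Table:
    def __init__(self, rows):
        self.rows = rows        # record id -> data
        self.locks = {}         # record id -> owner of the lock

    def checkout(self, record_id, user):
        """Lock a record for editing; fail if someone else holds it."""
        owner = self.locks.get(record_id)
        if owner is not None and owner != user:
            raise RecordLockError(f"record {record_id} locked by {owner}")
        self.locks[record_id] = user
        return dict(self.rows[record_id])   # editable copy

    def save(self, record_id, user, data):
        """Write back the edited copy and release the lock."""
        if self.locks.get(record_id) != user:
            raise RecordLockError("caller does not hold the lock")
        self.rows[record_id] = data
        del self.locks[record_id]

table = Table({1: {"balance": 100}})
rec = table.checkout(1, "clerk1")
rec["balance"] += 50
try:
    table.checkout(1, "clerk2")      # denied: clerk1 holds the lock
except RecordLockError as e:
    print(e)
table.save(1, "clerk1", rec)         # save releases the lock
print(table.rows[1]["balance"])      # 150
```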
In database management theory, locking is used to implement isolation among multiple database users. This is the "I" in the acronym ACID.
A thorough and authoritative description of locking was written by Jim Gray.[1]
Granularity of locks
If the bank clerks (to follow the illustration above) are serving two customers, but their accounts are contained in one ledger, then the entire ledger, or one or more database tables, would need to be made available for editing to the clerks in order for each to complete a transaction, one at a time (file locking). While safe, this method can cause unnecessary waiting.
If the clerks can remove one page from the ledger, containing the account of the current customer (plus several other accounts), then multiple customers can be serviced concurrently, provided that each customer's account is found on a different page than the others. If two customers have accounts on the same page, then only one may be serviced at a time. This is analogous to a page-level lock in a database.
A higher degree of granularity is achieved if each individual account may be taken by a clerk. This would allow any customer to be serviced without waiting for another customer who is accessing a different account. This is analogous to a record-level lock and is normally the highest degree of locking granularity in a database management system.
In a SQL database, a record is typically called a "row".
The introduction of granular (subset) locks creates the possibility for a situation called deadlock. Deadlock is possible when incremental locking (locking one entity, then locking one or more additional entities) is used. To illustrate, if two bank customers asked two clerks to obtain their account information so they could transfer some money into other accounts, the two accounts would essentially be locked. Then, if the customers told their clerks that the money was to be transferred into each other's accounts, the clerks would search for the other accounts but find them to be "in use" and wait for them to be returned. Unknowingly, the two clerks are waiting for each other, and neither of them can complete their transaction until the other gives up and returns the account. Various techniques are used to avoid such problems.
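One common technique for avoiding this deadlock is to acquire locks in a single global order, so the circular wait can never form. The following Python sketch applies that idea to the two opposing transfers (account names and amounts are illustrative):

```python
import threading

# Deadlock avoidance by lock ordering: both transfers always acquire
# account locks in a fixed (sorted-by-name) order, so the circular
# wait described above cannot form.

accounts = {"A": 100, "B": 100}
locks = {name: threading.Lock() for name in accounts}

def transfer(src, dst, amount):
    first, second = sorted((src, dst))   # global acquisition order
    with locks[first]:
        with locks[second]:
            accounts[src] -= amount
            accounts[dst] += amount

# Two opposing transfers run concurrently; had each locked its own
# source account first, they could deadlock, but with a fixed order
# they both complete.
t1 = threading.Thread(target=transfer, args=("A", "B", 30))
t2 = threading.Thread(target=transfer, args=("B", "A", 10))
t1.start(); t2.start(); t1.join(); t2.join()
print(accounts)   # {'A': 80, 'B': 120}
```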
Use of locks
Record locks need to be managed among the entities requesting the records, such that no entity is given too much service via successive grants and no other entity is effectively locked out. The entities that request a lock can be either individual applications (programs) or an entire processor.
The application or system should be designed such that any lock is held for the shortest time possible. Data reading, without editing facilities, does not require a lock, and reading locked records is usually permissible.
Two main types of locks can be requested:
Exclusive locks
Exclusive locks are held by a single entity, usually for the purpose of writing to the record. If the locking schema were represented by a list, the holder list would contain only one entry. Since this type of lock effectively blocks any other entity that requires the lock from processing, care must be taken to:
- ensure the lock is held for the shortest time possible;
- not hold the lock across system or function calls where the entity is no longer running on the processor – this can lead to deadlock;
- ensure that if the entity is unexpectedly exited for any reason, the lock is freed.
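The last point can be illustrated with a language facility that guarantees release: in this hedged Python sketch, a context manager frees the lock even when the holder fails unexpectedly:

```python
import threading

# Illustration of releasing a lock when the holding entity exits
# unexpectedly: Python's context manager ("with") releases the lock
# on normal exit and on any exception. Names are invented.

record_lock = threading.Lock()

def update_record(apply_change):
    with record_lock:        # held for the shortest span needed
        apply_change()       # lock is released even if this raises

try:
    update_record(lambda: 1 / 0)   # simulated unexpected failure
except ZeroDivisionError:
    pass

print(record_lock.locked())   # False: the lock was freed despite the error
```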
Non-holders of the lock (a.k.a. waiters) can be held in a list that is serviced in round-robin fashion, or in a FIFO queue. This ensures that every waiter gets an equal chance to obtain the lock and is not locked out. To speed up the process further, if an entity has gone to sleep waiting for a lock, performance is improved if the entity is notified when the lock is granted, instead of discovering the grant on some kind of timeout-driven wake-up.
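A FIFO grant queue with notification might be sketched as follows (an illustrative, single-threaded simulation; entity names are invented):

```python
from collections import deque

# Sketch of FIFO lock granting with notification: waiters queue in
# arrival order, and when the holder releases, the next waiter is
# "notified" of the grant rather than polling for it.

class FifoLock:
    def __init__(self):
        self.holder = None
        self.waiters = deque()
        self.granted = []            # notification log, for illustration

    def acquire(self, entity):
        if self.holder is None:
            self.holder = entity
            self.granted.append(entity)
        else:
            self.waiters.append(entity)   # wait in FIFO order

    def release(self, entity):
        assert self.holder == entity
        if self.waiters:
            self.holder = self.waiters.popleft()
            self.granted.append(self.holder)   # notify the next waiter
        else:
            self.holder = None

lock = FifoLock()
for e in ("p1", "p2", "p3"):
    lock.acquire(e)
lock.release("p1")       # p2 is granted next, then p3
lock.release("p2")
print(lock.granted)      # ['p1', 'p2', 'p3']
```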
Shared locks
Shared locks differ from exclusive locks in that the holder list can contain multiple entries. Shared locks allow all holders to read the contents of the record knowing that the record cannot be changed until after the lock has been released by all holders. Exclusive locks cannot be obtained when a record is already locked (exclusively or shared) by another entity.
If lock requests for the same entity are queued, then once a shared lock is granted, any queued shared locks may also be granted. If an exclusive lock is found next on the queue, it must wait until all shared locks have been released. As with exclusive locks, these shared locks should be held for the least time possible.
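The holder-list behavior described above can be sketched as a simple shared/exclusive lock (illustrative Python; not any particular system's implementation):

```python
# Sketch of shared/exclusive lock semantics: many shared holders may
# coexist, while an exclusive request must wait until every shared
# holder has released.

class SharedExclusiveLock:
    def __init__(self):
        self.shared_holders = set()
        self.exclusive_holder = None

    def try_acquire_shared(self, entity):
        if self.exclusive_holder is not None:
            return False               # blocked by a writer
        self.shared_holders.add(entity)
        return True

    def try_acquire_exclusive(self, entity):
        if self.exclusive_holder is not None or self.shared_holders:
            return False               # must wait for all holders
        self.exclusive_holder = entity
        return True

    def release_shared(self, entity):
        self.shared_holders.discard(entity)

lock = SharedExclusiveLock()
assert lock.try_acquire_shared("r1")          # readers share freely
assert lock.try_acquire_shared("r2")
assert not lock.try_acquire_exclusive("w1")   # writer waits
lock.release_shared("r1")
assert not lock.try_acquire_exclusive("w1")   # one reader remains
lock.release_shared("r2")
assert lock.try_acquire_exclusive("w1")       # now the writer proceeds
print("ok")
```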
References
- Gray, Jim & Reuter, Andreas (1993), Transaction Processing: Concepts and Techniques, Morgan Kaufmann, pp. 375–437, ISBN 1-55860-190-2
Fundamentals
Definition and Purpose
Record locking is a concurrency control mechanism employed in database management systems and file systems to prevent multiple transactions from simultaneously accessing or modifying the same data record, thereby avoiding conflicts and ensuring data integrity. By restricting concurrent access, record locking serializes operations on shared resources, which is essential in multi-user environments where transactions must execute as if they were isolated from one another. This technique is particularly vital in relational databases, where records represent individual rows in tables, but it also applies to file-based systems handling structured data.[6][1]

The primary purpose of record locking is to uphold the ACID properties of database transactions (atomicity, consistency, isolation, and durability), with a strong emphasis on isolation and consistency. Isolation ensures that transactions do not interfere with each other, preventing phenomena such as dirty reads or lost updates, while consistency guarantees that the database remains in a valid state after each transaction. For example, consider two bank transactions attempting to debit and credit the same account balance concurrently: without locking, the second transaction might overwrite the first's changes, resulting in a lost update and an inaccurate balance. Record locking mitigates this by enforcing mutual exclusion during critical sections of transaction execution.[7][8]

At its core, the mechanics of record locking involve a transaction requesting and acquiring a lock on a specific record prior to any read or write operation, holding it until the transaction commits or aborts, at which point the lock is released to allow subsequent access. This process effectively queues conflicting operations, promoting a serializable execution order without requiring full serialization of all transactions. Locks may be implemented at varying granularities, from individual records to larger structures like pages or tables, and can include both exclusive modes for modifications and shared modes for reads, depending on the system's design. The foundational principles of this approach were articulated by Jim Gray and Andreas Reuter in their authoritative text Transaction Processing: Concepts and Techniques (1993), which establishes locking as a cornerstone of reliable concurrency control in distributed and centralized systems.[9][7]

Historical Development
The concept of record locking emerged in the late 1960s and early 1970s as database systems began supporting multi-user access to shared data. IBM's Information Management System (IMS), a hierarchical database management system developed starting in 1966, introduced record-level locking to isolate changes made by concurrent programs, enabling reliable transaction processing in environments like banking and airlines.[10] Concurrently, the shift toward relational databases was catalyzed by Edgar F. Codd's 1970 paper, which proposed a model for large shared data banks, implying the need for concurrency control mechanisms to maintain data integrity amid multiple users.[11] IBM's System R project (1974–1979), the first implementation of Codd's relational model, incorporated locking protocols as part of its transaction management, using two-phase locking to ensure serializability in multi-user scenarios.

In the late 1970s and 1980s, foundational work by Jim Gray and colleagues at IBM formalized record locking within transaction processing frameworks. Gray's contributions during the System R era and subsequent projects introduced the transaction abstraction, including the properties of atomicity, consistency, isolation, and durability (later known as the ACID properties), with locking as a core mechanism for achieving isolation through protocols like two-phase locking.[12] This work extended to distributed systems, where Gray's research in the 1980s addressed locking in non-shared environments, such as at Tandem Computers, influencing scalable concurrency in transaction-oriented applications.[13] These advancements laid the groundwork for standardized approaches, culminating in the ANSI SQL-92 standard (1992), which defined isolation levels (READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, and SERIALIZABLE) that rely on locking modes to control concurrency phenomena like dirty reads and phantoms.[14]

The late 1980s and 1990s saw a shift toward finer-grained locking to improve concurrency in relational database management systems (RDBMS). Oracle Database 6.0, released in 1988, pioneered row-level locking, allowing multiple users to access the same table without blocking unrelated rows, a feature that enhanced performance in high-throughput environments.[15] Similarly, Microsoft SQL Server 7.0 (1998) adopted row-level locking as its default, moving away from coarser page- or table-level mechanisms to reduce contention in enterprise applications.[1]

Types of Locks
Exclusive Locks
An exclusive lock is a lock mode in database systems that grants a single transaction sole access to a record or resource for modification purposes, thereby blocking all other transactions from writing to that resource until the lock is released, and in many systems such as SQL Server and DB2, also blocking reads.[1][5] This mode ensures that data modifications occur without interference, maintaining the integrity of the data during write operations.[16]

Exclusive locks are essential for update and delete operations, as they prevent issues such as dirty reads (where a transaction reads uncommitted changes from another) or lost updates, where concurrent modifications overwrite each other.[1] In systems like SQL Server and DB2, they are typically acquired automatically during data-modifying statements to enforce isolation.[5]

These locks are acquired either explicitly through commands like SQL's LOCK statement or implicitly as part of a transaction's execution, and they are held until the transaction commits (making changes permanent) or rolls back (discarding them).[16][1] For instance, in PostgreSQL, an EXCLUSIVE lock conflicts with most other lock modes except basic read access (ACCESS SHARE locks), ensuring no concurrent writes occur while allowing reads.[16]

A practical example is in a banking system, where an exclusive lock is placed on an account record during a fund transfer to debit one account and credit another, guaranteeing atomicity and preventing balance inconsistencies from simultaneous transactions.[5][1]

While exclusive locks reduce concurrency by serializing access to the locked resource (potentially impacting performance in high-throughput environments), they provide strong guarantees of serializability for modifications, ensuring that committed changes appear as if executed sequentially.[1][5] Unlike shared locks, which permit multiple readers but no writers, exclusive locks block writes and, in many systems, all access to prioritize safe writes.[16] Their impact varies with granularity, such as row-level locking allowing finer parallelism than table-level.[1]

Shared Locks
Shared locks, also known as read locks, are a lock mode in database concurrency control that permits multiple transactions to simultaneously read a specific record or resource while preventing any transaction from writing to it.[17] This mode ensures that readers do not interfere with each other but collectively block writers to maintain data consistency during concurrent access.[17]

Shared locks are particularly suited for workloads dominated by read operations, such as reporting queries or SELECT statements that retrieve data without modification. In such scenarios, they enable high concurrency among read-only transactions, optimizing throughput in systems where queries outnumber updates.

A key behavior of shared locks is their potential upgrade to exclusive locks when a transaction requires write access; however, this conversion can only proceed after all conflicting shared locks on the resource are released.[17] For instance, in an e-commerce database, multiple users can acquire shared locks to view current inventory levels simultaneously, but any attempt to update stock quantities would require upgrading to an exclusive lock, blocking until all shared locks are freed.

Regarding compatibility, shared locks can coexist with other shared locks but are incompatible with exclusive locks, as shown in the following matrix for basic lock modes:

| Lock Mode | Shared (S) | Exclusive (X) |
|---|---|---|
| Shared (S) | Compatible | Incompatible |
| Exclusive (X) | Incompatible | Incompatible |
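The matrix above can be expressed directly as a lookup table; this illustrative Python sketch grants a requested mode only if it is compatible with every mode already held on the record:

```python
# The shared/exclusive compatibility matrix as a lookup table.
# A requested mode is grantable only when it is compatible with
# every mode already held.

COMPATIBLE = {
    ("S", "S"): True,    # readers coexist
    ("S", "X"): False,   # writer blocked by reader
    ("X", "S"): False,   # reader blocked by writer
    ("X", "X"): False,   # writers never coexist
}

def can_grant(requested, held_modes):
    return all(COMPATIBLE[(held, requested)] for held in held_modes)

print(can_grant("S", ["S", "S"]))   # True
print(can_grant("X", ["S"]))        # False
print(can_grant("S", ["X"]))        # False
```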
Granularity of Locks
Coarse-Grained Locking
Coarse-grained locking refers to a concurrency control mechanism in database systems where locks are applied to larger units of data, such as entire tables, pages, files, or even databases, rather than to individual records. This approach minimizes the number of locks that need to be managed, thereby reducing the overhead associated with acquiring, releasing, and tracking numerous fine-level locks. In shared database environments, coarse-grained locks are particularly useful for maintaining data consistency during operations that affect broad scopes of data, as they simplify the locking protocol by associating locks with sets of resources rather than atomic elements.[18]

The primary advantages of coarse-grained locking emerge in environments with low concurrency, such as systems serving few simultaneous users or handling bulk operations like data imports or schema modifications. In these scenarios, the reduced lock management cost leads to lower overhead and simpler implementation, as fewer locks mean less contention and easier enforcement of isolation. For instance, a transaction accessing many records within a single table benefits from locking the entire table, avoiding the expense of multiple individual locks while ensuring atomicity. This makes coarse-grained locking efficient for workloads where high parallelism is not required, such as in early database designs or applications with infrequent concurrent access.[18]

Examples of coarse-grained locking include table-level locks in MySQL, which are employed for data definition language (DDL) statements to protect structural changes across the entire table, preventing concurrent modifications that could lead to inconsistencies. Similarly, traditional file systems often use whole-file locks to guard against simultaneous writes, treating the file as the lockable unit to simplify access control in non-relational storage environments. Both exclusive and shared locks can be applied at these coarse levels; for example, an exclusive table lock blocks all access, while a shared lock permits multiple reads but no writes.[19]

However, coarse-grained locking introduces trade-offs by restricting concurrency, as locking a large unit like a table can block operations on unrelated records within it, leading to unnecessary serialization in multi-user settings. This makes it suitable primarily for online transaction processing (OLTP) systems with low contention, where the simplicity outweighs the potential for bottlenecks. In contrast to fine-grained locking, which permits higher parallelism through record-specific controls, coarse-grained methods prioritize ease of management over maximal throughput.[18]

Historically, coarse-grained locking was prevalent in early hierarchical databases, such as IBM's Information Management System (IMS), which employed a two-level locking hierarchy at the database and segment levels to isolate concurrent program changes with minimal overhead. This approach, developed in the 1970s, reflected the hardware and workload constraints of the era, where fine-grained mechanisms were computationally expensive, and broader locks sufficed for batch-oriented processing. Seminal work on lock granularity, including proposals for hierarchical protocols, built upon these systems to balance concurrency and efficiency in shared environments.[18][10]

Fine-Grained Locking
Fine-grained locking refers to a concurrency control mechanism in database systems where locks are applied at the individual record or row level, permitting concurrent access to unrelated portions of the data while preventing conflicts on the targeted rows. This approach contrasts with broader locking strategies by isolating modifications to specific rows, thereby allowing multiple transactions to proceed simultaneously on different records within the same table.[19][1]

In multi-user online transaction processing (OLTP) environments, fine-grained locking enhances throughput by reducing the scope of blocked operations, enabling higher levels of concurrency as transactions affecting distinct rows do not interfere with one another. This granularity supports scalable applications where frequent, short-lived updates occur across large datasets, minimizing wait times and improving overall system responsiveness.[19][1]

A practical example is row-level locking in PostgreSQL, where commands like `SELECT * FROM table WHERE id = 1 FOR UPDATE` acquire an exclusive lock on a single row to facilitate selective updates in large tables without impacting other records. This mechanism applies both exclusive modes, such as `FOR UPDATE` to block modifications, and shared modes, like `FOR SHARE` to allow reads while restricting writes.[16]
However, fine-grained locking introduces challenges, including increased overhead from managing a larger number of locks, which can consume more system resources and complicate lock administration. Additionally, the finer interactions among transactions heighten the risk of deadlocks, as cycles in lock dependencies become more likely with numerous row-specific acquisitions.[19][1]
Fine-grained locking, particularly row-level variants, became a standard feature in relational database management systems during the late 1980s and 1990s to accommodate the growing demands of concurrent, scalable applications, as exemplified by its introduction in Oracle Database version 6 in 1988.[15]
