Lock (computer science)
In computer science, a lock or mutex (from mutual exclusion) is a synchronization primitive that prevents state from being modified or accessed by multiple threads of execution at once. Locks enforce mutual exclusion concurrency control policies, and a variety of implementation methods exist for different applications.
Types
Generally, locks are advisory locks, where each thread cooperates by acquiring the lock before accessing the corresponding data. Some systems also implement mandatory locks, where attempting unauthorized access to a locked resource will force an exception in the entity attempting to make the access.
The simplest type of lock is a binary semaphore, which provides exclusive access to the locked data. Other schemes also provide shared access for reading data; further widely implemented access modes are exclusive, intend-to-exclude and intend-to-upgrade.
Another way to classify locks is by what happens when the lock strategy prevents the progress of a thread. Most locking designs block the execution of the thread requesting the lock until it is allowed to access the locked resource. With a spinlock, the thread simply waits ("spins") until the lock becomes available. This is efficient if threads are blocked for a short time, because it avoids the overhead of operating system process rescheduling. It is inefficient if the lock is held for a long time, or if the progress of the thread that is holding the lock depends on preemption of the locked thread.
Locks typically require hardware support for efficient implementation. This support usually takes the form of one or more atomic instructions such as "test-and-set", "fetch-and-add" or "compare-and-swap". These instructions allow a single process to test if the lock is free, and if free, acquire the lock in a single atomic operation.
Uniprocessor architectures have the option of using uninterruptible sequences of instructions—using special instructions or instruction prefixes to disable interrupts temporarily—but this technique does not work for multiprocessor shared-memory machines. Proper support for locks in a multiprocessor environment can require quite complex hardware or software support, with substantial synchronization issues.
An atomic operation is required because of concurrency, where more than one task executes the same logic. For example, consider the following C code:
if (lock == 0) {
// lock free, set it
lock = myPID;
}
The above example does not guarantee that the task has the lock, since more than one task can be testing the lock at the same time. Since both tasks will detect that the lock is free, both tasks will attempt to set the lock, not knowing that the other task is also setting the lock. Dekker's or Peterson's algorithm are possible substitutes if atomic locking operations are not available.
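For comparison, a minimal C# sketch of the same acquisition performed atomically; Interlocked.CompareExchange does the test and the set as one compare-and-swap step, and the PidLock type and field names here are purely illustrative:
using System.Threading;

class PidLock
{
    private int _lock; // 0 means free; otherwise holds the owner's ID

    // Attempt the test and the set as one atomic step, unlike the C fragment above.
    public bool TryAcquire(int myPid)
    {
        // If _lock is 0, replace it with myPid and return the old value (0): success.
        // If another task already set it, the old value is non-zero: failure.
        return Interlocked.CompareExchange(ref _lock, myPid, 0) == 0;
    }

    public void Release()
    {
        Interlocked.Exchange(ref _lock, 0); // mark the lock free again
    }
}
A caller that fails TryAcquire can retry, back off, or block, which is the practical difference between the spinning and blocking designs described above.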
Careless use of locks can result in deadlock or livelock. A number of strategies can be used to avoid or recover from deadlocks or livelocks, both at design-time and at run-time. (The most common strategy is to standardize the lock acquisition sequences so that combinations of inter-dependent locks are always acquired in a specifically defined "cascade" order.)
Some languages do support locks syntactically. An example in C# follows:
public class Account // This is a monitor of an account
{
    // Use `object` in versions earlier than C# 13
    private readonly Lock _balanceLock = new();
    private decimal _balance = 0;

    public void Deposit(decimal amount)
    {
        // Only one thread at a time may execute this statement.
        lock (_balanceLock)
        {
            _balance += amount;
        }
    }

    public void Withdraw(decimal amount)
    {
        // Only one thread at a time may execute this statement.
        lock (_balanceLock)
        {
            _balance -= amount;
        }
    }
}
The System.Threading.Lock type used above was introduced in C# 13 on .NET 9.
The code lock(this) can lead to problems if the instance can be accessed publicly.[1]
Similar to Java, C# can also synchronize entire methods, by using the MethodImplOptions.Synchronized attribute.[2][3]
[MethodImpl(MethodImplOptions.Synchronized)]
public void SomeMethod()
{
// do stuff
}
Granularity
Before discussing lock granularity, one needs to understand three concepts about locks:
- lock overhead: the extra resources for using locks, like the memory space allocated for locks, the CPU time to initialize and destroy locks, and the time for acquiring or releasing locks. The more locks a program uses, the more overhead associated with the usage;
- lock contention: this occurs whenever one process or thread attempts to acquire a lock held by another process or thread. The more fine-grained the available locks, the less likely one process/thread will request a lock held by the other. (For example, locking a row rather than the entire table, or locking a cell rather than the entire row);
- deadlock: the situation when each of at least two tasks is waiting for a lock that the other task holds. Unless something is done, the two tasks will wait forever.
There is a tradeoff between decreasing lock overhead and decreasing lock contention when choosing the number of locks in synchronization.
An important property of a lock is its granularity. The granularity is a measure of the amount of data the lock is protecting. In general, choosing a coarse granularity (a small number of locks, each protecting a large segment of data) results in less lock overhead when a single process is accessing the protected data, but worse performance when multiple processes are running concurrently. This is because of increased lock contention. The more coarse the lock, the higher the likelihood that the lock will stop an unrelated process from proceeding. Conversely, using a fine granularity (a larger number of locks, each protecting a fairly small amount of data) increases the overhead of the locks themselves but reduces lock contention. Granular locking where each process must hold multiple locks from a common set of locks can create subtle lock dependencies. This subtlety can increase the chance that a programmer will unknowingly introduce a deadlock.[citation needed]
In a database management system, for example, a lock could protect, in order of decreasing granularity, part of a field, a field, a record, a data page, or an entire table. Coarse granularity, such as using table locks, tends to give the best performance for a single user, whereas fine granularity, such as record locks, tends to give the best performance for multiple users.
Database locks
Database locks can be used as a means of ensuring transaction synchronicity: when transaction processing is made concurrent (interleaving transactions), using two-phase locking ensures that the concurrent execution of the transactions is equivalent to some serial ordering of the transactions. However, deadlocks are an unfortunate side-effect of locking in databases. Deadlocks are either prevented by pre-determining the locking order between transactions or are detected using waits-for graphs. An alternative to locking for database synchronicity while avoiding deadlocks involves the use of totally ordered global timestamps.
There are mechanisms employed to manage the actions of multiple concurrent users on a database—the purpose is to prevent lost updates and dirty reads. The two types of locking are pessimistic locking and optimistic locking:
- Pessimistic locking: a user who reads a record with the intention of updating it places an exclusive lock on the record to prevent other users from manipulating it. This means no one else can manipulate that record until the user releases the lock. The downside is that users can be locked out for a very long time, thereby slowing the overall system response and causing frustration.
- Where to use pessimistic locking: this is mainly used in environments where data contention (the degree of user requests to the database system at any one time) is heavy, and where the cost of protecting data through locks is less than the cost of rolling back transactions if concurrency conflicts occur. Pessimistic concurrency is best implemented when lock times will be short, as in programmatic processing of records. Pessimistic concurrency requires a persistent connection to the database and is not a scalable option when users are interacting with data, because records might be locked for relatively long periods of time. It is not appropriate for use in Web application development.
- Optimistic locking: this allows multiple concurrent users access to the database while the system keeps a copy of the initial read made by each user. When a user wants to update a record, the application determines whether another user has changed the record since it was last read. The application does this by comparing the initial read held in memory to the database record. Any discrepancy between the initial read and the database record violates concurrency rules, and the system therefore disregards the update request; an error message is generated and the user is asked to start the update process again. Optimistic locking improves database performance by reducing the amount of locking required, thereby reducing the load on the database server. It works efficiently with tables that require limited updates, since no users are locked out. The downside is that frequent update failures can occur under high volumes of update requests from multiple concurrent users, which can be frustrating for users. (A sketch of the version-check pattern follows this list.)
- Where to use optimistic locking: this is appropriate in environments where there is low contention for data, or where read-only access to data is required. Optimistic concurrency is used extensively in .NET to address the needs of mobile and disconnected applications,[4] where locking data rows for prolonged periods of time would be infeasible. Also, maintaining record locks requires a persistent connection to the database server, which is not possible in disconnected applications.
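As a rough sketch of the optimistic pattern described above, the following C# fragment keeps a version number alongside a value and rejects an update whose version check fails; the class and member names are illustrative, not any particular product's API:
class VersionedValue
{
    private readonly object _gate = new object(); // internal lock held only briefly while applying a change
    private int _version;                         // incremented on every successful update
    private decimal _value;

    public (int Version, decimal Value) Read()
    {
        // Readers copy a consistent snapshot and hold no lock afterwards.
        lock (_gate) { return (_version, _value); }
    }

    public bool TryUpdate(int versionReadEarlier, decimal newValue)
    {
        lock (_gate)
        {
            if (_version != versionReadEarlier)
                return false;   // another user updated first: caller must re-read and retry or report a conflict
            _value = newValue;
            _version++;
            return true;
        }
    }
}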
Lock compatibility table
Several variations and refinements of these major lock types exist, with respective variations of blocking behavior. If a first lock blocks another lock, the two locks are called incompatible; otherwise the locks are compatible. Often, the blocking interactions between lock types are presented in the technical literature in a lock compatibility table. The following is an example with the common, major lock types:
| Lock type | read-lock | write-lock |
|---|---|---|
| read-lock | ✔ | X |
| write-lock | X | X |
- ✔ indicates compatibility
- X indicates incompatibility, i.e., a case when a lock of the first type (in the left column) on an object blocks a lock of the second type (in the top row) from being acquired on the same object (by another transaction). An object typically has a queue of operations, requested by transactions, waiting with their respective locks. The first blocked lock in the queue is acquired as soon as the existing blocking lock is removed from the object, and its respective operation is then executed. If a lock for an operation in the queue is not blocked by any existing lock (multiple compatible locks can be held on the same object concurrently), it is acquired immediately.
Comment: In some publications, the table entries are simply marked "compatible" or "incompatible", or respectively "yes" or "no".[5]
Disadvantages
Lock-based resource protection and thread/process synchronization have many disadvantages:
- Contention: some threads/processes have to wait until a lock (or a whole set of locks) is released. If one of the threads holding a lock dies, stalls, blocks, or enters an infinite loop, other threads waiting for the lock may wait indefinitely until the computer is power cycled.
- Overhead: the use of locks adds overhead for each access to a resource, even when the chances for collision are very rare. (However, any chance for such collisions is a race condition.)
- Debugging: bugs associated with locks are time dependent and can be very subtle and extremely hard to replicate, such as deadlocks.
- Instability: the optimal balance between lock overhead and lock contention can be unique to the problem domain (application) and sensitive to design, implementation, and even low-level system architectural changes. These balances may change over the life cycle of an application and may entail tremendous changes to update (re-balance).
- Composability: locks are only composable (e.g., managing multiple concurrent locks in order to atomically delete item X from table A and insert X into table B) with relatively elaborate (overhead) software support and perfect adherence by application programmers to rigorous conventions.
- Priority inversion: a low-priority thread/process holding a common lock can prevent high-priority threads/processes from proceeding. Priority inheritance can be used to reduce priority-inversion duration. The priority ceiling protocol can be used on uniprocessor systems to minimize the worst-case priority-inversion duration, as well as prevent deadlock.
- Convoying: all other threads have to wait if a thread holding a lock is descheduled due to a time-slice interrupt or page fault.
Some concurrency control strategies avoid some or all of these problems. For example, a funnel or serializing tokens can avoid the biggest problem: deadlocks. Alternatives to locking include non-blocking synchronization methods, like lock-free programming techniques and transactional memory. However, such alternative methods often require that the actual lock mechanisms be implemented at a more fundamental level of the operating software. Therefore, they may only relieve the application level from the details of implementing locks, with the problems listed above still needing to be dealt with beneath the application.
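To give a flavor of the lock-free techniques mentioned above, here is a minimal C# sketch of a stack push that retries with compare-and-swap instead of acquiring a lock; the Node type is defined locally for illustration:
using System.Threading;

class LockFreeStack<T>
{
    private class Node
    {
        public readonly T Value;
        public Node Next;
        public Node(T value) { Value = value; }
    }

    private Node _head; // null when the stack is empty

    public void Push(T value)
    {
        var node = new Node(value);
        while (true)
        {
            Node observed = Volatile.Read(ref _head); // snapshot the current head
            node.Next = observed;
            // Publish the new node only if no other thread changed the head in the meantime;
            // otherwise loop and retry against the fresh head.
            if (Interlocked.CompareExchange(ref _head, node, observed) == observed)
                return;
        }
    }
}
No thread ever blocks here; a thread that loses the race simply retries, which is what distinguishes this style from lock-based protection.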
In most cases, proper locking depends on the CPU providing a method of atomic instruction stream synchronization (for example, the addition or deletion of an item into a pipeline requires that all contemporaneous operations needing to add or delete other items in the pipe be suspended during the manipulation of the memory content required to add or delete the specific item). Therefore, an application can often be more robust when it recognizes the burdens it places upon an operating system and is capable of graciously recognizing the reporting of impossible demands.[citation needed]
Lack of composability
One of lock-based programming's biggest problems is that "locks don't compose": it is hard to combine small, correct lock-based modules into equally correct larger programs without modifying the modules or at least knowing about their internals. Simon Peyton Jones (an advocate of software transactional memory) gives the following example of a banking application:[6] design a class Account that allows multiple concurrent clients to deposit or withdraw money to an account, and give an algorithm to transfer money from one account to another.
The lock-based solution to the first part of the problem is:
class Account:
    member balance: Integer
    member mutex: Lock

    method deposit(n: Integer)
        mutex.lock()
        balance ← balance + n
        mutex.unlock()

    method withdraw(n: Integer)
        deposit(−n)
The second part of the problem is much more complicated. A transfer routine that is correct for sequential programs would be
function transfer(from: Account, to: Account, amount: Integer)
    from.withdraw(amount)
    to.deposit(amount)
In a concurrent program, this algorithm is incorrect because when one thread is halfway through transfer, another might observe a state where amount has been withdrawn from the first account, but not yet deposited into the other account: money has gone missing from the system. This problem can only be fixed completely by putting locks on both accounts prior to changing either one, but then the locks have to be placed according to some arbitrary, global ordering to prevent deadlock:
function transfer(from: Account, to: Account, amount: Integer)
    if from < to    // arbitrary ordering on the locks
        from.lock()
        to.lock()
    else
        to.lock()
        from.lock()
    from.withdraw(amount)
    to.deposit(amount)
    from.unlock()
    to.unlock()
This solution gets more complicated when more locks are involved, and the transfer function needs to know about all of the locks, so they cannot be hidden.
Language support
Programming languages vary in their support for synchronization:
- Ada provides protected objects that have visible protected subprograms or entries[7] as well as rendezvous.[8]
- The ISO/IEC C standard provides a standard mutual exclusion (locks) application programming interface (API) since C11. The current ISO/IEC C++ standard supports threading facilities since C++11. The OpenMP standard is supported by some compilers, and allows critical sections to be specified using pragmas. The POSIX pthread API provides lock support.[9] Visual C++ provides the synchronize attribute of methods to be synchronized, but this is specific to COM objects in the Windows architecture and Visual C++ compiler.[10] C and C++ can easily access any native operating system locking features.
- C# provides the lock keyword on a thread to ensure its exclusive access to a resource.
- Visual Basic (.NET) provides a SyncLock keyword like C#'s lock keyword.
- Java provides the keyword synchronized to lock code blocks, methods or objects,[11] and libraries featuring concurrency-safe data structures.
- Objective-C provides the keyword @synchronized[12] to put locks on blocks of code and also provides the classes NSLock,[13] NSRecursiveLock,[14] and NSConditionLock[15] along with the NSLocking protocol[16] for locking as well.
- PHP provides file-based locking[17] as well as a Mutex class in the pthreads extension.[18]
- Python provides a low-level mutex mechanism with a Lock class from the threading module.[19]
- The ISO/IEC Fortran standard (ISO/IEC 1539-1:2010) provides the lock_type derived type in the intrinsic module iso_fortran_env and the lock/unlock statements since Fortran 2008.[20]
- Ruby provides a low-level mutex object and no keyword.[21]
- Rust provides the Mutex<T>[22] struct.[23]
- x86 assembly language provides the LOCK prefix on certain operations to guarantee their atomicity.
- Haskell implements locking via a mutable data structure called an MVar, which can either be empty or contain a value, typically a reference to a resource. A thread that wants to use the resource "takes" the value of the MVar, leaving it empty, and puts it back when it is finished. Attempting to take a resource from an empty MVar results in the thread blocking until the resource is available.[24] As an alternative to locking, an implementation of software transactional memory also exists.[25]
- Go provides a low-level Mutex object in the standard library's sync package.[26] It can be used for locking code blocks, methods or objects.
Mutexes vs. semaphores
[edit]A mutex is a locking mechanism that sometimes uses the same basic implementation as the binary semaphore. However, they differ in how they are used. While a binary semaphore may be colloquially referred to as a mutex, a true mutex has a more specific use-case and definition, in that only the task that locked the mutex is supposed to unlock it. This constraint aims to handle some potential problems of using semaphores:
- Priority inversion: If the mutex knows who locked it and is supposed to unlock it, it is possible to promote the priority of that task whenever a higher-priority task starts waiting on the mutex.
- Premature task termination: Mutexes may also provide deletion safety, where the task holding the mutex cannot be accidentally deleted. [citation needed] (This is also a cost; if the mutex can prevent a task from being reclaimed, then a garbage collector has to monitor the mutex.)
- Termination deadlock: If a mutex-holding task terminates for any reason, the OS can release the mutex and signal waiting tasks of this condition.
- Recursion deadlock: a task is allowed to lock a reentrant mutex multiple times, provided that it unlocks it an equal number of times.
- Accidental release: An error is raised on the release of the mutex if the releasing task is not its owner.
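A short C# sketch of the contrast, using the framework's lock statement for owned mutual exclusion and SemaphoreSlim as an ownerless counting semaphore; the scenario and names are illustrative:
using System.Threading;

class OwnershipDemo
{
    private readonly object _mutex = new object();                     // lock with ownership: only the acquiring thread may release
    private readonly SemaphoreSlim _permits = new SemaphoreSlim(3, 3); // counting semaphore: up to 3 holders, no owner

    public void CriticalSection()
    {
        lock (_mutex)           // acquire and release are tied to this thread and this block
        {
            // exclusive access to shared state
        }
    }

    public void UseOneOfThreeResources()
    {
        _permits.Wait();        // decrement the count (blocks while it is 0)
        try
        {
            // up to three threads may be here at once
        }
        finally
        {
            // Any thread could legally call Release: the semaphore tracks a count, not an owner.
            _permits.Release();
        }
    }
}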
References
[edit]- ^ "lock Statement (C# Reference)". 4 February 2013.
- ^ "ThreadPoolPriority, and MethodImplAttribute". MSDN. p. ??. Retrieved 2011-11-22.
- ^ "C# From a Java Developer's Perspective". Archived from the original on 2013-01-02. Retrieved 2011-11-22.
- ^ "Designing Data Tier Components and Passing Data Through Tiers". Microsoft. August 2002. Archived from the original on 2008-05-08. Retrieved 2008-05-30.
- ^ "Lock Based Concurrency Control Protocol in DBMS". GeeksforGeeks. 2018-03-07. Retrieved 2023-12-28.
- ^ Peyton Jones, Simon (2007). "Beautiful concurrency" (PDF). In Wilson, Greg; Oram, Andy (eds.). Beautiful Code: Leading Programmers Explain How They Think. O'Reilly.
- ^ ISO/IEC 8652:2007. "Protected Units and Protected Objects". Ada 2005 Reference Manual. Retrieved 2010-02-27.
A protected object provides coordinated access to shared data, through calls on its visible protected operations, which can be protected subprograms or protected entries.
- ^ ISO/IEC 8652:2007. "Example of Tasking and Synchronization". Ada 2005 Reference Manual. Retrieved 2010-02-27.
- ^ Marshall, Dave (March 1999). "Mutual Exclusion Locks". Retrieved 2008-05-30.
- ^ "Synchronize". msdn.microsoft.com. Retrieved 2008-05-30.
- ^ "Synchronization". Sun Microsystems. Retrieved 2008-05-30.
- ^ "Apple Threading Reference". Apple, inc. Retrieved 2009-10-17.
- ^ "NSLock Reference". Apple, inc. Retrieved 2009-10-17.
- ^ "NSRecursiveLock Reference". Apple, inc. Retrieved 2009-10-17.
- ^ "NSConditionLock Reference". Apple, inc. Retrieved 2009-10-17.
- ^ "NSLocking Protocol Reference". Apple, inc. Retrieved 2009-10-17.
- ^ "flock".
- ^ "The Mutex class". Archived from the original on 2017-07-04. Retrieved 2016-12-29.
- ^ Lundh, Fredrik (July 2007). "Thread Synchronization Mechanisms in Python". Archived from the original on 2020-11-01. Retrieved 2008-05-30.
- ^ John Reid (2010). "Coarrays in the next Fortran Standard" (PDF). Retrieved 2020-02-17.
- ^ "class Thread::Mutex".
- ^ "std::sync::Mutex - Rust". doc.rust-lang.org. Retrieved 3 November 2020.
- ^ "Shared-State Concurrency - The Rust Programming Language". doc.rust-lang.org. Retrieved 3 November 2020.
- ^ Marlow, Simon (August 2013). "Basic concurrency: threads and MVars". Parallel and Concurrent Programming in Haskell. O’Reilly Media. ISBN 9781449335946.
- ^ Marlow, Simon (August 2013). "Software transactional memory". Parallel and Concurrent Programming in Haskell. O’Reilly Media. ISBN 9781449335946.
- ^ "sync package - sync - pkg.go.dev". pkg.go.dev. Retrieved 2021-11-23.
Lock (computer science)
Fundamentals
Definition and Purpose
In computer science, a lock is a synchronization primitive that enforces mutual exclusion, permitting only one thread or process to access a shared resource or critical section of code at a time, thereby preventing concurrent modifications that could lead to inconsistencies.[3] This mechanism ensures that operations on shared data appear atomic from the perspective of other threads, maintaining the integrity of the program's state in multithreaded environments.[1] The primary purpose of locks is to safeguard critical sections—portions of code where shared resources are accessed or modified—against race conditions, where interleaved execution by multiple threads could produce undefined or erroneous behavior.[3] By serializing access to these sections, locks promote data consistency and predictability, enabling developers to reason about program behavior in concurrent settings without worrying about low-level timing dependencies.[1] This is essential in systems ranging from operating system kernels to parallel applications, where uncontrolled concurrency could otherwise compromise reliability. The concept of locks traces its origins to early research in concurrent programming, particularly Edsger W. Dijkstra's 1968 work on semaphores as a means to achieve mutual exclusion in multiprogramming systems.[8] Semaphores provided a general synchronization tool, but locks evolved as a simpler, specialized variant focused solely on binary mutual exclusion, simplifying implementation for common use cases while building on these foundational ideas.[3] A typical usage involves acquiring the lock before entering the critical section and releasing it afterward, as illustrated in the following pseudocode:
lock = new Lock(); // Initialize the lock
// Critical section
lock.acquire(); // Acquire the lock (wait if held by another thread)
// Access or modify shared resource here
lock.release(); // Release the lock
Basic Operations
The acquire operation, fundamental to lock usage, enables a thread to attempt exclusive ownership of the lock, thereby gaining access to the protected critical section. If the lock is free, the thread immediately obtains it and proceeds; however, if another thread already holds the lock, the acquiring thread blocks—either by suspending execution or spinning—until the lock becomes available.[3][1] The release operation allows the owning thread to relinquish control of the lock, transitioning it to an available state and permitting one waiting thread, if any, to acquire it next. This action ensures that mutual exclusion is maintained, as only the current owner can perform the release, preventing unauthorized access.[3][9] At the hardware level, these operations rely on atomic primitives to guarantee indivisibility and prevent race conditions during state transitions. A key example is the test-and-set instruction, which atomically reads a lock variable (typically testing if it is zero) and sets it to one if so, returning the original value; this enables simple yet effective lock implementations. On x86 architectures, the LOCK prefix enforces atomicity for read-modify-write instructions like compare-and-swap, ensuring the operation completes without interference from other processors.[3][10] Implementations of acquire differ in their waiting strategy: busy-waiting (or spinning) involves the thread repeatedly checking the lock state in a loop, consuming CPU cycles but responding quickly to short waits, whereas blocking suspends the thread via the operating system scheduler, yielding the processor to others and conserving resources for prolonged contention.[3][1] For error handling, certain acquire variants support timeouts, where the operation returns an error code (such as ETIMEDOUT) if the lock cannot be obtained within a specified duration, allowing threads to avoid indefinite suspension.
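As a hedged illustration of a timed acquire, the following C# sketch uses Monitor.TryEnter with a timeout; the gate object and the 200 ms figure are illustrative choices rather than part of the description above:
using System;
using System.Threading;

class TimedAcquireDemo
{
    private static readonly object _gate = new object(); // illustrative lock object

    static void DoWork()
    {
        // Try to acquire the lock, but give up after 200 ms instead of blocking forever.
        if (Monitor.TryEnter(_gate, TimeSpan.FromMilliseconds(200)))
        {
            try
            {
                // critical section: access the shared resource here
            }
            finally
            {
                Monitor.Exit(_gate); // always release what was acquired
            }
        }
        else
        {
            // Lock not obtained within the timeout; report or retry instead of waiting indefinitely.
            Console.WriteLine("Could not acquire the lock within 200 ms.");
        }
    }
}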
Types
Exclusive Locks
Exclusive locks provide mutual exclusion by granting a single thread sole access to a protected resource, thereby preventing any concurrent reads or writes from other threads. This mechanism ensures that critical sections of code execute atomically, avoiding race conditions and data corruption in multithreaded environments. The foundational concept of mutual exclusion was introduced by Edsger W. Dijkstra in 1965 to solve the problem of coordinating multiple cyclic processes that share resources, using a symmetric algorithm based on shared variables to enforce that only one process enters its critical section at a time without assumptions about relative speeds or priorities.[11] The standard implementation of exclusive locks is the mutex, short for mutual exclusion lock, which acts as a binary gatekeeper allowing only one thread to proceed while others wait. Mutexes are typically either recursive or non-recursive: non-recursive mutexes cause deadlock if the same thread attempts to acquire the already-held lock, whereas recursive mutexes permit multiple acquisitions by the owning thread, tracking a reference count that decrements on each unlock until it reaches zero, at which point the mutex becomes available. This design supports straightforward synchronization in languages like C and C++, where mutexes underpin thread-safe operations on shared state.[12] In multithreaded applications, mutexes protect shared variables, data structures, or sections of code that modify common resources, such as counters, queues, or configuration data, ensuring operations like incrementing a global variable or updating a linked list remain indivisible. For instance, when multiple threads update a shared integer counter, a mutex serializes access to the read-modify-write sequence, guaranteeing correctness without external visibility of intermediate states. Ownership of the mutex resides exclusively with the acquiring thread, which must release it via an unlock operation; attempts by other threads to unlock an unowned or foreign-owned mutex result in undefined behavior or errors.[12] Prominent examples include the POSIX pthread_mutex_t type, which enforces exclusive access through atomic lock and unlock primitives, blocking the caller until acquisition succeeds. Similarly, the Windows Critical Section object provides process-local mutual exclusion with recursive entry support, using efficient low-level instructions for faster performance within a single application compared to inter-process alternatives. These mechanisms rely on basic acquire and release semantics to maintain thread safety.[12][13]
Shared Locks
Shared locks, also known as reader-writer locks, are synchronization primitives designed to allow multiple concurrent readers access to a shared resource while ensuring that writers have exclusive access. This approach addresses the readers-writers problem, first formalized in 1971, where readers can proceed simultaneously without interfering with each other, but any writer must block all readers and other writers to maintain data consistency.[14] In terms of acquisition modes, a thread acquires a shared (read) lock to perform read-only operations, permitting multiple such locks to coexist as long as no write lock is held. Conversely, acquiring an exclusive (write) lock grants sole access to the resource, blocking all other readers and writers until released. This contrasts with exclusive locks, which enforce total mutual exclusion even among readers.[15][16] Shared locks are particularly useful in scenarios where reads vastly outnumber writes, such as in caching systems where frequent lookups can proceed in parallel without blocking, while infrequent updates require isolation to prevent corruption. For instance, in a multi-threaded cache, multiple threads can query entries concurrently under shared locks, improving throughput in read-heavy workloads.[17] Prominent implementations include Java's ReentrantReadWriteLock from the java.util.concurrent.locks package, which supports reentrant acquisition for both read and write modes, and the POSIX pthread_rwlock_t interface, which provides standard functions like pthread_rwlock_rdlock for shared access and pthread_rwlock_wrlock for exclusive access.[16][15]
Many shared lock implementations support upgrading and downgrading to enable atomic transitions between modes; for example, a thread holding a shared lock can upgrade to an exclusive lock (potentially waiting for other readers to finish), while a writer can downgrade to a shared lock to allow concurrent readers post-update, though upgrades require careful handling to avoid deadlocks or writer starvation.[18]
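A brief C# sketch of the shared/exclusive pattern using the framework's ReaderWriterLockSlim; the cache dictionary and method names are illustrative:
using System.Collections.Generic;
using System.Threading;

class ReadMostlyCache
{
    private readonly ReaderWriterLockSlim _rw = new ReaderWriterLockSlim();
    private readonly Dictionary<string, string> _data = new Dictionary<string, string>();

    public bool TryGet(string key, out string value)
    {
        _rw.EnterReadLock();               // many readers may hold this simultaneously
        try { return _data.TryGetValue(key, out value); }
        finally { _rw.ExitReadLock(); }
    }

    public void Set(string key, string value)
    {
        _rw.EnterWriteLock();              // exclusive: blocks all readers and writers
        try { _data[key] = value; }
        finally { _rw.ExitWriteLock(); }
    }
}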
Specialized Locks
Spinlocks are a type of lock that employs busy-waiting, where a thread repeatedly checks the lock's availability using atomic operations instead of yielding the processor.[19] This approach is particularly suitable for multiprocessor systems when locks are held for short durations, as the overhead of context switching in blocking mechanisms outweighs the cost of spinning.[20] In contrast to traditional blocking locks, spinlocks avoid suspending threads, enabling faster acquisition in low-contention scenarios.[19] Optimistic locks operate by allowing threads to proceed without initially acquiring a lock, relying on versioning or atomic operations like compare-and-swap (CAS) to validate changes upon completion.[21] If a conflict is detected—such as a version mismatch—the operation aborts and may fall back to a pessimistic locking strategy to ensure consistency.[21] This non-blocking technique reduces contention in read-heavy or low-conflict environments by minimizing lock holding times, though it requires validation overhead.[22] Adaptive locks combine elements of spinning and blocking to optimize performance based on current contention levels, dynamically switching strategies to balance CPU utilization and latency.[23] For instance, under light contention, the lock may spin briefly to avoid scheduling overhead; if contention persists, it transitions to blocking to conserve resources.[23] This hybrid method improves scalability in varied workloads compared to pure spin or block approaches.[23] Practical implementations include the Linux kernel's spinlock_t, a basic spinlock structure used for protecting short critical sections in interrupt handlers and kernel code, defined in the kernel's locking subsystem.[24] Similarly, Java's AtomicReference class supports lock-free patterns through CAS operations, enabling atomic updates to object references without explicit locks for concurrent data structures like queues.[25]
These specialized locks rely on hardware support for atomic instructions, such as test-and-set, which atomically reads and sets a memory location to detect and claim ownership.[19] Fetch-and-add provides another primitive, atomically incrementing a value and returning the prior value, useful for ticket-based or counter-driven locking schemes.[3]
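A minimal test-and-set spinlock sketch in C#, shown only to illustrate the busy-waiting idea (user-mode spinning is rarely advisable, and the type here is illustrative rather than a library API):
using System.Threading;

class TestAndSetSpinLock
{
    private int _held; // 0 = free, 1 = held

    public void Enter()
    {
        var spinner = new SpinWait();
        // Atomically swap in 1; if the previous value was already 1, another thread holds the lock.
        while (Interlocked.Exchange(ref _held, 1) == 1)
            spinner.SpinOnce();   // busy-wait with gradual back-off instead of blocking in the OS
    }

    public void Exit()
    {
        Volatile.Write(ref _held, 0); // release by publishing 0
    }
}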
Granularity
Coarse-Grained Locking
Coarse-grained locking employs a single lock to protect a broad scope of shared resources, such as an entire data structure or module, ensuring that all threads must acquire this lock sequentially to access any part of the protected area. This technique serializes operations across the locked scope, preventing concurrent modifications and maintaining data consistency through mutual exclusion. For instance, in a concurrent hash table implementation, a single global lock is acquired before any insert, delete, or lookup operation on the table, rather than protecting individual buckets separately.[26][27][28] The primary advantages of coarse-grained locking lie in its simplicity and ease of implementation, as developers manage only one lock, which reduces the complexity of synchronization logic and minimizes the potential for errors in lock association. This approach also lowers the risk of race conditions that could arise across interrelated resources, since the uniform protection eliminates the need to coordinate multiple locks. In environments with low contention, such as single-threaded extensions or scenarios with infrequent parallel access, the overhead of acquiring and releasing the lock remains minimal, making it efficient for basic concurrency needs.[29][30][27] Despite these benefits, coarse-grained locking suffers from significant drawbacks related to performance under contention. By forcing all threads to compete for the same lock, it leads to serialization of operations, where only one thread progresses at a time, resulting in high contention and diminished throughput as the number of threads increases. This lack of parallelism causes poor scalability on multiprocessor systems, with studies showing performance degradation up to several times worse than alternatives in high-concurrency workloads.[30][27][31] Coarse-grained locking finds suitable use cases in small-scale applications where concurrency demands are low, or in initialization phases of larger systems to safely set up shared state without complex synchronization. It serves as a reliable baseline for concurrent data structures in prototyping or when scalability is not a primary concern, ensuring correctness with straightforward exclusive access control.[27][32]
Fine-Grained Locking
Fine-grained locking refers to a synchronization strategy in which multiple locks are employed to protect smaller, distinct portions of a shared resource or data structure, thereby permitting concurrent access and modification by different threads to unrelated segments without interference. This approach partitions the resource into finer units—such as individual nodes in a linked list or buckets in a hash table—each guarded by its own lock, which minimizes the scope of mutual exclusion compared to protecting the entire structure with a single lock. By enabling parallelism among operations on disjoint parts, fine-grained locking enhances scalability in multiprocessor environments where contention would otherwise serialize execution. Key techniques for implementing fine-grained locking include lock striping and hierarchical (or hand-over-hand) locking. In lock striping, a fixed number of locks are cycled through or assigned to different segments of the data structure; for instance, in a concurrent hash map, keys are hashed to determine which of several locks protects the corresponding bucket, allowing independent operations on non-overlapping buckets. Hierarchical locking, on the other hand, involves acquiring locks progressively as a thread navigates the structure—for example, in a linked list, a thread locks the current node and the next one before releasing the previous, ensuring atomic traversal while limiting the locked region to adjacent elements at any time. These methods, as detailed in foundational works on concurrent programming, balance granularity with correctness by carefully defining lock scopes. The primary advantage of fine-grained locking is improved throughput and reduced contention in high-contention scenarios, where multiple threads frequently access overlapping but not identical parts of the resource; experimental evaluations show that it can achieve near-linear speedup on multiprocessors by allowing more threads to proceed simultaneously. For example, in a sorted linked list implementation using hand-over-hand locking, concurrent insertions and deletions on different sections can overlap, leading to higher performance than a global lock. However, this granularity introduces challenges, including greater implementation complexity due to the need for precise lock management and the potential for deadlocks arising from unordered acquisitions across multiple locks, which requires strict protocols like consistent ordering to mitigate. A representative example of fine-grained locking is applying per-element locks to an array, where each array index is protected by a dedicated lock, enabling threads to update or read independent elements in parallel without blocking unrelated operations. As a lock-free alternative in fine-grained contexts, read-copy-update (RCU) is often used for read-mostly data structures; it allows multiple readers to access a snapshot of the data without locks while writers create copies for updates, deferring reclamation until all readers complete, thus avoiding the overhead and contention of traditional fine-grained locks.
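A compact C# illustration of lock striping as described above; the stripe count and the counter array are arbitrary illustrative choices:
using System.Threading;

class StripedCounters
{
    private const int Stripes = 16;
    private readonly object[] _locks = new object[Stripes];
    private readonly long[] _counts = new long[1024];

    public StripedCounters()
    {
        for (int i = 0; i < Stripes; i++) _locks[i] = new object();
    }

    public void Increment(int index)
    {
        // Each index maps to one of 16 stripe locks, so updates to indices that fall on
        // different stripes proceed in parallel instead of contending on one global lock.
        lock (_locks[index % Stripes])
        {
            _counts[index]++;
        }
    }
}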
Database Applications
Lock Types in Databases
In database management systems (DBMS), locks are synchronization primitives used to control concurrent access by multiple transactions to shared data items, such as individual rows, pages, or entire tables, thereby preventing inconsistencies during reads and writes.[33] These mechanisms are essential for maintaining transaction isolation in multi-user environments, where transactions represent logical units of work that may span multiple operations on persistent storage.[34] The primary lock types in databases build on shared and exclusive modes but are adapted for relational structures. A shared lock (S-lock) permits multiple transactions to read a data item simultaneously while blocking any write attempts, allowing concurrent reads without interference.[35] In contrast, an exclusive lock (X-lock) grants a single transaction both read and write access, blocking all other shared or exclusive locks on the same item to ensure modifications occur atomically from other perspectives.[33] Intent locks, such as intent shared (IS), intent exclusive (IX), and shared with intent exclusive (SIX), facilitate hierarchical locking by signaling a transaction's intention to acquire finer-grained locks lower in the data structure (e.g., on rows within a table), without immediately blocking coarser levels.[36] Lock granularity in databases refers to the scope of data units protected by a lock, balancing concurrency and overhead. Row-level locking targets individual records, maximizing parallelism in high-contention scenarios but increasing lock management costs.[37] Page-level locking covers groups of rows on a storage page (typically 4-8 KB), offering a compromise for moderate workloads.[38] Table-level locking secures an entire table, simplifying enforcement but potentially serializing access in multi-user systems.[34] Modern DBMS like SQL Server and MySQL dynamically escalate or de-escalate granularity based on transaction size and resource usage to optimize performance.[33] Databases provide explicit locking via SQL standards, such as the SELECT ... FOR UPDATE clause, which acquires exclusive locks on selected rows to prevent concurrent modifications until the transaction commits or rolls back.[39] This is commonly used in scenarios requiring pessimistic concurrency control, like inventory updates in e-commerce systems.
Locks primarily contribute to the isolation property of ACID (Atomicity, Consistency, Isolation, Durability) by serializing access to data, ensuring transactions appear to execute sequentially despite concurrency.[33] However, improper lock acquisition—such as prolonged holds or incompatible hierarchies—can lead to deadlocks or excessive blocking, potentially forcing transaction aborts that indirectly affect atomicity by requiring restarts.[34]
Concurrency Control Protocols
In database systems, concurrency control protocols employing locks ensure that multiple transactions can execute simultaneously while preserving the illusion of serial execution, thereby achieving serializability and supporting recoverability. These protocols build on lock types such as shared locks for reads and exclusive locks for writes to manage access to shared data items. A foundational approach is two-phase locking (2PL), which structures transaction execution to prevent schedules that violate consistency. Two-phase locking divides each transaction into two distinct phases: a growing phase, during which the transaction acquires locks as needed but does not release any, and a shrinking phase, during which it releases locks but acquires no new ones. This separation ensures that the protocol guarantees conflict-serializability, meaning any schedule produced is equivalent to some serial execution of the transactions in terms of conflicting operations. The point of transition between phases occurs at the first unlock operation, after which no further lock requests are allowed. By enforcing this rule, 2PL avoids cycles in the precedence graph of transactions, which would otherwise lead to non-serializable outcomes. A common variant, strict two-phase locking (strict 2PL), enhances recoverability by requiring transactions to hold all exclusive locks until they commit or abort, preventing cascading aborts where uncommitted changes are read by other transactions. This modification ensures that the database remains consistent even during failures, as no transaction can read uncommitted data written by another that might later roll back. Strict 2PL is widely adopted in production database systems for its balance of concurrency and durability guarantees. Another variant is conservative 2PL (also known as static or predeclaration 2PL), in which transactions must acquire all required locks before executing any operations, effectively combining the growing phase upfront and eliminating the possibility of deadlocks arising from partial lock acquisition. While this approach reduces deadlock risk, it can lower concurrency by delaying transaction starts until all locks are available. To handle potential deadlocks in locking protocols, databases employ prevention strategies such as timeout-based mechanisms, where a transaction aborts if it waits longer than a predefined threshold for a lock, and detection via wait-for graphs, which model transaction dependencies as a directed graph and identify cycles indicating deadlocks for resolution by aborting one involved transaction. These methods complement 2PL by maintaining system liveness without excessive overhead. For example, consider two transactions T1 and T2 accessing shared data items A and B, where a non-serializable schedule might allow T1 to read A, T2 to write A and read B, and T1 to write B, resulting in T1 seeing its own update overwritten by T2 inconsistently. Under 2PL, T2 would block on acquiring an exclusive lock on A until T1 releases it in its shrinking phase, forcing an order equivalent to serial execution (e.g., T1 then T2), thus preventing the anomalous outcome.
Properties and Mechanisms
Lock Compatibility
Lock compatibility defines the rules governing whether multiple threads can simultaneously hold locks of specific modes on the same shared resource, enabling controlled concurrency while preventing conflicts such as race conditions or inconsistent reads. In systems supporting multiple lock modes, compatibility ensures that read operations can proceed in parallel when no writes are involved, but write operations require exclusivity to maintain data integrity. These rules are fundamental to solutions for the reader-writer problem in concurrent programming and to database concurrency control, where shared locks allow concurrent reads and exclusive locks enforce mutual exclusion for modifications.[40][3] The simplest compatibility model involves two modes: shared (S) locks, which permit multiple threads to read a resource concurrently but block writes, and exclusive (X) locks, which grant a single thread both read and write access while blocking all other locks. Under this model, multiple S locks are compatible with each other, allowing high throughput for read-heavy workloads; however, an S lock is incompatible with any X lock, and no two X locks can coexist on the same resource. This asymmetry ensures that writers do not interfere with ongoing reads but must wait for readers to complete, promoting progress in reader-writer scenarios.[40] For basic S and X modes, the compatibility is captured in the following matrix, where a new lock request is granted only if it is compatible with all existing locks on the resource:
| Requested Mode | Existing Mode: S | Existing Mode: X |
|---|---|---|
| S | Compatible | Incompatible |
| X | Incompatible | Incompatible |
In hierarchical (multiple-granularity) locking schemes, intention modes extend this matrix; here NL denotes no lock, IS intention shared, IX intention exclusive, and SIX shared with intention exclusive:
| Requested \ Existing | NL | IS | IX | S | SIX | X |
|---|---|---|---|---|---|---|
| NL | Yes | Yes | Yes | Yes | Yes | Yes |
| IS | Yes | Yes | Yes | Yes | Yes | No |
| IX | Yes | Yes | Yes | No | No | No |
| S | Yes | Yes | No | Yes | No | No |
| SIX | Yes | Yes | No | No | No | No |
| X | Yes | No | No | No | No | No |
Fairness and Ordering
In concurrent programming, fairness refers to policies that ensure threads acquire locks in a manner that prevents indefinite postponement, such as through first-in-first-out (FIFO) queuing mechanisms. Fair locks implement a queue where threads are granted access in the order they request the lock, thereby guaranteeing progress for all contenders and avoiding starvation. This approach is particularly useful in shared-memory multiprocessors, where algorithms like the FIFO spin lock use atomic swap operations to maintain a per-lock queue, ensuring first-come-first-served ordering with O(1) space per lock and process.[42] In contrast, unfair locks prioritize throughput by allowing any waiting thread to acquire the lock immediately upon release, without strict adherence to arrival order. This can lead to higher performance in low-contention scenarios but risks starvation, where a thread repeatedly loses the race to reacquire the lock to newly arriving threads. For instance, queued locks like MCS can be configured as unfair to reduce overhead, though this may cause indefinite waits in pathological cases.[43] Lock ordering imposes a global hierarchy on multiple locks—often based on memory addresses or identifiers—to prevent circular dependencies during acquisition. By requiring threads to always acquire locks in a consistent sequence (e.g., lower address first), this policy eliminates the possibility of wait cycles, ensuring bounded waiting times without relying on detection mechanisms.[1] Hand-over-hand locking, also known as lock coupling, applies fairness in traversals of data structures like linked lists by acquiring locks sequentially on adjacent nodes while releasing prior ones. This technique maintains progress for concurrent operations by limiting hold times and enabling overlapping traversals, thus reducing contention and supporting fair access in fine-grained scenarios.[44] A practical example is Java's ReentrantLock, which supports both modes via constructors: ReentrantLock() creates an unfair lock for better throughput, while ReentrantLock(true) enables fair mode with FIFO queuing to minimize starvation by prioritizing the longest-waiting thread.[45]
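A FIFO "ticket" lock sketch in C#, illustrating first-come, first-served ordering with two atomic counters; this is a generic illustration rather than a specific library type:
using System.Threading;

class TicketLock
{
    private int _nextTicket = -1; // last ticket handed out; the first Increment yields 0
    private int _nowServing;      // ticket currently allowed to enter

    public void Acquire()
    {
        int myTicket = Interlocked.Increment(ref _nextTicket); // take the next ticket atomically
        var spinner = new SpinWait();
        while (Volatile.Read(ref _nowServing) != myTicket)
            spinner.SpinOnce(); // wait until it is this ticket's turn
    }

    public void Release()
    {
        Interlocked.Increment(ref _nowServing); // admit the holder of the next ticket
    }
}
Because tickets are issued in arrival order, no thread can overtake another, at the cost of spinning while it waits.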
Challenges
Deadlock
In concurrent programming, deadlock refers to a state in which two or more threads are permanently blocked, each waiting for the other to release a resource, typically locks, resulting in a standstill from which no thread can proceed without external intervention. This phenomenon arises specifically in systems using locks for mutual exclusion, where threads acquire and hold locks while attempting to obtain additional ones held by others.[46] A classic example involves two threads and two locks, A and B. Thread 1 acquires lock A and then attempts to acquire lock B, while Thread 2 acquires lock B and then attempts to acquire lock A; if the acquisitions interleave such that each thread holds one lock and waits for the other, both become indefinitely blocked.[47] Deadlock occurs only if four necessary and sufficient conditions, known as the Coffman conditions, hold simultaneously: mutual exclusion (resources like locks cannot be shared), hold and wait (a thread holds at least one resource while waiting for another), no preemption (resources cannot be forcibly taken from a thread), and circular wait (a cycle exists in the resource dependency chain). To prevent deadlock, strategies focus on violating one of the Coffman conditions, such as enforcing a total ordering on lock acquisitions so all threads request locks in the same sequence (e.g., by assigning unique identifiers to locks and acquiring lower IDs first), thereby eliminating circular wait.[48] Other prevention methods include using resource allocation graphs to deny requests that would create cycles or imposing timeouts on lock acquisitions to break potential holds.[49] For detection, systems construct a wait-for graph where nodes represent threads and directed edges indicate one thread waiting for a lock held by another; deadlock is present if the graph contains a cycle, which can be identified using algorithms like depth-first search.[49] Upon detection, recovery techniques include aborting and rolling back one or more deadlocked threads to release their held locks, allowing others to proceed.[49][50]
Other Limitations
One significant limitation of locks in concurrent systems is priority inversion, where a low-priority thread holds a lock required by a high-priority thread, allowing medium-priority threads to preempt the low-priority one and indefinitely delay the high-priority thread's progress.[51] This issue arises in priority-based scheduling environments, such as real-time operating systems, and can violate timing guarantees critical for safety.[51] To mitigate priority inversion, priority inheritance protocols temporarily elevate the low-priority thread's priority to match the high-priority contender's, ensuring bounded blocking times, as formalized in the basic priority inheritance protocol.[51] Locks also suffer from a lack of composability, meaning that while individual lock-protected operations may be correct in isolation, combining them into larger atomic units often requires redesigning locking strategies to avoid races or deadlocks, complicating modular software development. For instance, transferring data between two lock-protected queues necessitates coordinated locking across both, which cannot be naively composed from separate enqueue and dequeue operations without risking inconsistency. In terms of scalability, lock contention serializes access to shared resources, effectively turning parallelizable sections into bottlenecks that limit overall speedup, as captured by extensions to Amdahl's law modeling critical sections probabilistically.[52] High contention under many threads exacerbates this, where the fraction of time spent waiting for locks grows, reducing efficiency on multicore systems even when the workload is highly parallelizable.[52] Additionally, blocking locks introduce significant overhead through operating system context switches when threads block and resume, each incurring latency from saving/restoring thread states and invalidating CPU caches, which can dominate performance in contended scenarios. This overhead is particularly pronounced in fine-grained locking, where frequent acquire/release cycles amplify the costs without proportional concurrency gains. A notable real-world example occurred during the 1997 Mars Pathfinder mission, where priority inversion in the VxWorks operating system caused the high-priority telemetry task to be delayed by a low-priority data logging task holding a shared semaphore, leading to multiple spacecraft resets until priority inheritance was enabled via a software patch.[53]
Implementations
Language Support
In C and C++, locks are primarily provided through library abstractions rather than language builtins. The POSIX standard offers pthread_mutex_t as a basic mutual exclusion primitive, which can be initialized, locked with pthread_mutex_lock(), and unlocked with pthread_mutex_unlock(), ensuring exclusive access to shared resources by a single thread at a time.[54] In C++, the C++11 standard introduced the std::mutex class in the <mutex> header, together with RAII wrappers such as std::lock_guard that release the lock automatically when a scope is exited.
Performance Considerations
Performance of locks in concurrent systems is primarily assessed through metrics such as acquire latency, which measures the time to obtain and release a lock; throughput, representing the number of successful lock operations per unit time; and scalability, evaluating how these metrics degrade or improve with increasing thread or core counts.[59] High-contention scenarios often reveal bottlenecks, where throughput can drop significantly due to waiting threads, as seen in multithreaded applications with up to 30,000 lock acquisitions per second per thread under moderate load.[60] Several factors influence these metrics, including contention levels, where multiple threads competing for the same lock increase waiting times and reduce overall efficiency; lock hold times, with shorter durations minimizing overhead; and hardware characteristics like Non-Uniform Memory Access (NUMA) effects, which can amplify latency through remote memory accesses during lock handoffs across nodes.[59] In NUMA systems, contention exacerbates coherence traffic, potentially halving throughput on multi-level hierarchies unless addressed by topology-aware designs.[59] For instance, under high contention on 128 threads, a hierarchical MCS (HMCS) lock achieves up to 7.6 times higher throughput compared to flat variants by localizing contention within NUMA domains.[59] To mitigate performance limitations, optimizations such as lock-free programming techniques eliminate blocking altogether, enabling progress for non-blocked threads and improving scalability under high contention or oversubscription, often outperforming traditional locks by factors of 2-3 in thread-heavy workloads.[61] Hazard pointers provide safe memory reclamation in lock-free structures, reducing garbage collection pauses and maintaining high throughput in persistent memory environments. Transactional memory offers an alternative by speculatively executing critical sections without explicit locks, yielding better performance in contended scenarios with abort rates below 10%, as it avoids deadlock-prone locking hierarchies. Benchmarking these aspects typically employs tools like the Java Microbenchmark Harness (JMH), which facilitates precise measurement of concurrent code by controlling warmup, iterations, and thread groups to isolate lock overheads in microbenchmarks. Custom microbenchmarks often simulate varying contention to quantify trade-offs, such as in STAMP suites where spinlocks—busy-waiting locks suitable for brief operations—outperform mutexes (blocking locks) by nearly 40% in execution time for short critical sections comprising 3-7% of workload cycles.[62]
Comparisons
Locks vs. Semaphores
In computer science, locks, often implemented as mutexes, are binary synchronization primitives that enforce mutual exclusion by allowing only one thread to acquire them at a time, typically using acquire and release operations to protect critical sections without providing signaling capabilities.[63] Semaphores, in contrast, are integer-valued primitives that support counting mechanisms, enabling operations like wait (P, decrement if positive) and signal (V, increment) to coordinate access to multiple resources or facilitate signaling between threads, such as in producer-consumer scenarios.[64] While a binary semaphore (restricted to values 0 or 1) can emulate a lock for simple exclusion, general semaphores extend this by maintaining a counter greater than 1, allowing multiple threads to proceed up to the count limit before blocking others.[64] Locks are preferred for straightforward mutual exclusion in critical sections, such as protecting a shared variable from concurrent modifications, where ownership semantics ensure the acquiring thread releases it to avoid errors.[63] Semaphores, however, are more suitable for scenarios involving resource pools or synchronization points, like limiting the number of threads accessing a database connection pool to five by initializing the semaphore to 5—threads wait if the count reaches zero and signal upon release.[63] For instance, in a bounded buffer, semaphores can manage both empty and full slots separately, enabling efficient producer-consumer coordination that a lock alone cannot achieve without additional primitives.[64] Historically, semaphores were introduced by Edsger W. Dijkstra in 1965 as a general solution for process synchronization, encompassing binary variants for exclusion while providing broader counting functionality for cooperation.[64] Although semaphores generalize locks by supporting signaling and multi-resource control, locks remain simpler and more lightweight for pure exclusion tasks, reducing the risk of misuse in basic scenarios.[63]
Locks vs. Other Primitives
Monitors represent a higher-level abstraction over basic locks in concurrent programming, encapsulating a mutual exclusion lock along with procedures (or methods) that access shared data, thereby ensuring that only one thread executes within the monitor at a time. This automatic acquisition and release of the lock upon entering and exiting monitor procedures eliminates the need for explicit lock management in the code. The concept was formalized by C. A. R. Hoare as a structuring mechanism for operating systems to simplify synchronization.[65] In practice, languages like Java implement monitors intrinsically with each object, where the synchronized keyword on methods or blocks automatically enforces exclusion via the object's monitor lock.[66]
Condition variables complement locks by enabling threads to wait efficiently for specific conditions on shared state, rather than relying on busy-waiting loops that waste CPU cycles. Typically paired with a mutex, a condition variable allows a thread to atomically release the associated lock and suspend until signaled by another thread, at which point it reacquires the lock and checks the condition. In POSIX threads (pthreads), this is achieved using pthread_cond_wait on a pthread_cond_t object alongside a mutex, with signaling via pthread_cond_signal or pthread_cond_broadcast to notify waiting threads when the predicate changes, such as in bounded buffer implementations for producer-consumer coordination.[67] This pairing extends basic locks to handle not just exclusion but also efficient inter-thread communication on dynamic conditions.
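The same wait-for-a-condition pattern can be sketched in C# with Monitor.Wait and Monitor.Pulse on the lock object; the bounded-buffer capacity and names below are illustrative:
using System.Collections.Generic;
using System.Threading;

class BoundedQueue<T>
{
    private readonly Queue<T> _items = new Queue<T>();
    private readonly object _gate = new object();
    private const int Capacity = 8;

    public void Put(T item)
    {
        lock (_gate)
        {
            while (_items.Count == Capacity)   // re-check the predicate after every wakeup
                Monitor.Wait(_gate);           // atomically releases _gate and suspends this thread
            _items.Enqueue(item);
            Monitor.PulseAll(_gate);           // wake waiters; they reacquire _gate before returning
        }
    }

    public T Take()
    {
        lock (_gate)
        {
            while (_items.Count == 0)
                Monitor.Wait(_gate);
            T item = _items.Dequeue();
            Monitor.PulseAll(_gate);
            return item;
        }
    }
}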
Barriers and latches serve as synchronization points for coordinating multiple threads at specific milestones in execution, distinct from locks' role in protecting critical sections of shared resources. A barrier, such as POSIX's pthread_barrier_wait, blocks threads until a predetermined number arrive, then releases all simultaneously, facilitating phased parallel computations like iterative algorithms where threads must align before advancing. Latches, exemplified by C++20's std::latch, operate as decrementable counters that enable threads to wait until the count reaches zero after a fixed number of arrivals, ideal for one-off events like parallel initialization without ongoing resource guarding.
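In C#, a comparable rendezvous primitive is System.Threading.Barrier; the phase structure below is an illustrative sketch:
using System;
using System.Threading;
using System.Threading.Tasks;

class BarrierDemo
{
    static void Main()
    {
        const int workers = 4;
        // The optional post-phase action runs once per phase after all participants arrive.
        using var barrier = new Barrier(workers, b => Console.WriteLine($"Phase {b.CurrentPhaseNumber} done"));

        var tasks = new Task[workers];
        for (int i = 0; i < workers; i++)
        {
            tasks[i] = Task.Run(() =>
            {
                // ... compute this worker's share of phase 1 ...
                barrier.SignalAndWait();   // block until all four workers reach this point
                // ... phase 2 can now safely read phase-1 results from the others ...
                barrier.SignalAndWait();
            });
        }
        Task.WaitAll(tasks);
    }
}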
Higher-level primitives like monitors and condition variables offer advantages over raw locks by providing structured abstraction that mitigates common pitfalls, such as deadlocks from unbalanced acquire-release pairs or inefficient polling. For instance, in coordinating threads with a shared queue, a pure mutex requires threads to spin while repeatedly locking and checking fullness or emptiness, whereas pairing it with a pthread_cond_t allows suspended waiting, reducing overhead and improving scalability. This abstraction level, supported natively in languages like Java via monitors, promotes safer and more maintainable concurrent code.[68][67][66]