Demand paging
from Wikipedia

In computer operating systems, demand paging (as opposed to anticipatory paging) is a method of virtual memory management. In a system that uses demand paging, the operating system copies a disk page into physical memory only when an attempt is made to access it and that page is not already in memory (i.e., if a page fault occurs). It follows that a process begins execution with none of its pages in physical memory, and triggers many page faults until most of its working set of pages are present in physical memory. This is an example of a lazy loading technique.

Basic concept

Demand paging only brings pages into memory when an executing process demands them. This is often referred to as lazy loading, as only those pages demanded by the process are swapped from secondary storage to main memory. Contrast this to pure swapping, where all memory for a process is swapped from secondary storage to main memory when the process starts up or resumes execution.

Commonly, a memory management unit (MMU) is used to achieve this. The memory management unit maps logical memory to physical memory, and its page table entries include a bit that indicates whether a page is valid or invalid. A valid page is one that currently resides in main memory; an invalid page is one that currently resides in secondary memory. When a process tries to access a page, the following steps are generally followed (a small code sketch after the list illustrates them):

  • Attempt to access page.
  • If page is valid (in memory) then continue processing instruction as normal.
  • If page is invalid then a page-fault trap occurs.
  • Check if the memory reference is a valid reference to a location on secondary memory. If not, the process is terminated (illegal memory access). Otherwise, we have to page in the required page.
  • Schedule disk operation to read the desired page into main memory.
  • Restart the instruction that was interrupted by the operating system trap.
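
The steps above can be illustrated with a minimal sketch, written here in Python as a toy simulation; the names page_table, backing_store, and access_page are hypothetical and do not correspond to any real operating system's interfaces.

    # Hypothetical simulation of the demand-paging steps listed above.
    # `backing_store` stands in for pages kept on secondary storage;
    # `page_table` maps page numbers to {"valid": bool, "frame": int}.

    backing_store = {0: b"code", 1: b"data", 2: b"heap"}   # pages known to the process
    page_table = {p: {"valid": False, "frame": None} for p in backing_store}
    physical_memory = {}          # frame number -> page contents
    next_free_frame = 0

    def access_page(page_number):
        """Attempt to access a page, faulting it in on demand."""
        global next_free_frame
        entry = page_table.get(page_number)
        if entry is None:
            raise MemoryError("illegal memory access: process would be terminated")
        if entry["valid"]:
            return physical_memory[entry["frame"]]        # page already resident
        # Page-fault trap: schedule a "disk" read into a free frame.
        frame = next_free_frame
        next_free_frame += 1
        physical_memory[frame] = backing_store[page_number]
        entry["frame"], entry["valid"] = frame, True       # mark the page valid
        return physical_memory[frame]                      # restart the access

    print(access_page(1))   # first access faults and loads the page
    print(access_page(1))   # second access hits in memory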

Advantages

Demand paging, as opposed to loading all pages immediately:

  • Only loads pages that are demanded by the executing process.
  • More space remains available in main memory, so more processes can be loaded, reducing the amount of context switching, which consumes significant resources.
  • Less loading latency occurs at program startup, as less information is accessed from secondary storage and less information is brought into main memory.
  • Because main memory is expensive compared to secondary memory, this technique can significantly reduce the bill of materials (BOM) cost, for example in smartphones; Symbian OS had this feature.

Disadvantages

  • Individual programs face extra latency when they access a page for the first time.
  • Low-cost, low-power embedded systems may not have a memory management unit that supports page replacement.
  • Memory management with page replacement algorithms becomes slightly more complex.
  • Possible security risks, including vulnerability to timing attacks; see Percival, Colin (2005-05-13). Cache missing for fun and profit (PDF). BSDCan 2005. (specifically the virtual memory attack in section 2).
  • Thrashing, which may occur due to repeated page faults.

from Grokipedia
Demand paging is a fundamental technique in virtual memory systems, where pages of a program's address space are loaded into physical memory from secondary storage only when explicitly referenced by the executing process, typically triggered by a page fault. This on-demand loading avoids the overhead of preloading unused pages, enabling efficient utilization of limited physical memory by treating secondary storage as a backing-store extension of RAM. Originating in the Atlas computer system developed at the University of Manchester in 1959, demand paging represented a breakthrough in providing the illusion of a large, contiguous address space to programs despite hardware constraints. The technique was further refined in the Multics operating system during the 1960s, which combined paging with segmentation for enhanced modularity, protection, and sharing among multiple users. By the 1970s, demand paging became a cornerstone of commercial systems and UNIX, facilitated by hardware support such as page table entries with valid/invalid bits to track page residency in main memory.

At its core, demand paging exploits the principle of locality—the observation that programs tend to reference a small, slowly changing subset of their pages (known as the working set) over short periods—allowing physical memory to act as a cache for the most active virtual pages. When a page fault occurs, the operating system kernel intervenes: it allocates a free physical frame if available, selects a victim page for replacement using algorithms like least recently used (LRU) or first-in, first-out (FIFO) if memory is full, loads the demanded page from disk, and updates the page table. This process incurs latency due to disk I/O but is mitigated by spatial and temporal locality, where subsequent references to nearby or recently used pages avoid further faults.

The primary advantages of demand paging include simplified program development by eliminating manual overlay management, support for multiprogramming through better memory sharing, and the ability to execute programs larger than physical memory. However, it introduces challenges such as page fault overhead, which can degrade performance if fault rates are high, and the risk of thrashing—excessive paging activity that consumes more resources than it saves—particularly when the aggregate working sets of active processes exceed available memory. To address these, techniques like prepaging (anticipatory loading of likely-needed pages) and working-set models have been developed to optimize allocation and prevent system overload.

Fundamentals

Overview and Definition

Demand paging is a memory management scheme in which the logical address space of a process is divided into fixed-size pages, and these pages are loaded into physical memory only when first accessed by the process, in contrast to earlier methods that pre-loaded entire programs into memory before execution. This on-demand loading allows processes to begin running with minimal initial memory allocation, deferring the transfer of data from secondary storage until necessary. Demand paging operates as a key component within the broader framework of virtual memory, which provides an abstraction of a large, contiguous address space beyond the limits of physical memory.

The core principle of demand paging is lazy evaluation of memory requirements, where a newly created process starts with an empty set of physical frames allocated to it and relies on secondary storage, such as a disk, as the backing store for its pages. Upon accessing a page not yet in memory, the operating system intervenes to fetch it, ensuring that only actively used portions of the program occupy physical memory at any time. For example, in a process divided into 100 pages where only the first 20 are referenced during initial execution, solely those 20 pages are brought into memory, thereby conserving physical resources and enabling multitasking by allowing space for other processes.

Demand paging was first conceptualized in the early 1960s with the Atlas computer at the University of Manchester, where it was implemented as part of a one-level storage system that automatically fetched missing pages from a drum-based backing store upon access. This innovation was formalized and advanced in the Multics operating system, with key implementation details on demand paging for its segmented virtual memory presented at the 1969 ACM Symposium on Operating System Principles.

Prerequisites: Virtual Memory and Paging

Virtual memory is a memory management technique that allows processes to use more memory than is physically available in the system by providing an abstraction of a large, contiguous address space to each program. It achieves this by mapping virtual addresses generated by the program to physical addresses in main memory, utilizing secondary storage such as disks as an extension of main memory when necessary. This mapping is performed automatically through hardware and software mechanisms, enabling the illusion of dedicated memory for multiple processes while sharing the physical resources efficiently.

Paging is a fundamental method for implementing virtual memory, where both the virtual address space and physical memory are divided into fixed-size units called pages and page frames, respectively. Pages in the virtual space correspond to frames in physical memory, with a typical uniform size such as 4 KB, which simplifies memory allocation and eliminates the need for contiguous allocation of variable-sized blocks. This fixed-size approach reduces external fragmentation—unused gaps between allocated memory blocks—by allowing any free frame to be assigned to any page without regard to location. However, paging introduces internal fragmentation, where portions of a page remain unused if the data or code assigned to it does not fill the entire fixed-size unit, leading to wasted space within allocated pages.

The address translation process in paging begins with dividing the logical (virtual) address into two parts: a page number and an offset within that page. The page number serves as an index into a page table, a data structure maintained by the operating system that maps each virtual page number to a corresponding physical frame number. The physical address is then computed by combining the frame number with the unchanged offset, using the formula

\text{Physical Address} = \text{page table}[\text{page number}] \times \text{page size} + \text{offset}

This translation ensures that references to virtual memory are resolved to physical locations efficiently, typically with hardware support like a translation lookaside buffer (TLB). Demand paging extends this paging mechanism by loading pages into physical memory only when accessed, rather than preloading the entire process.
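
As a concrete illustration of the translation formula, the short Python sketch below assumes a 4 KB page size and a hypothetical single-level page table mapping page numbers to frame numbers; it is a teaching sketch, not how hardware actually performs the lookup.

    PAGE_SIZE = 4096                      # 4 KB pages, as in the example above

    # Hypothetical single-level page table: virtual page number -> physical frame number
    page_table = {0: 5, 1: 2, 2: 7}

    def translate(virtual_address):
        """Translate a virtual address to a physical address via the page table."""
        page_number = virtual_address // PAGE_SIZE
        offset = virtual_address % PAGE_SIZE
        frame_number = page_table[page_number]     # raises KeyError on an unmapped page
        return frame_number * PAGE_SIZE + offset

    # Virtual address 8200 lies in page 2 at offset 8, so it maps to
    # frame 7: 7 * 4096 + 8 = 28680.
    print(translate(8200))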

Operational Mechanisms

Page Fault Handling

A page fault is a type of hardware exception, often referred to as a trap, that is generated when a running process attempts to access a page that is not currently mapped to a physical frame in main memory. This occurs in demand paging systems where pages are not preloaded but fetched only upon reference. The memory management unit (MMU) detects the fault during address translation by examining the page table entry for the requested virtual page; specifically, if the present or validity bit in that entry is set to 0, indicating the page is absent from physical memory, the MMU raises the exception to halt execution and notify the operating system.

The operating system handles the page fault through a dedicated interrupt service routine, beginning with saving the processor context of the interrupted process to ensure its state can be restored later. The handler then inspects the fault details, such as the faulting virtual address and the process's page table, to distinguish between an invalid access—arising from protection violations like attempting to write to a read-only page—and a legitimate demand paging scenario where the page simply resides on secondary storage. In the case of a valid missing page, the OS allocates a physical frame (which may involve page replacement if no free frames are available, as described below) and, after loading the page into the frame, updates the page table entry to associate the virtual page with the frame, setting the present bit to 1 to enable resumption of the access.

The MMU's role extends beyond mere translation; it enforces memory protection by monitoring flags in page table entries, such as read/write permissions and user/kernel modes, which can also trigger faults if violated, though these differ from demand paging faults. Once the page table is updated, control returns to the process at the instruction that caused the fault, allowing the original access to complete successfully. In demand paging, these faults are resolvable "soft" interrupts that support efficient memory usage by loading only needed pages, unlike irrecoverable "hard" faults from hardware errors such as parity issues. For instance, if a process references virtual page 5 and its page table entry shows the present bit as 0, the MMU immediately signals a page fault, transferring control to the kernel's trap handler, which validates the access and prepares the mapping without terminating the process. This mechanism ensures transparent operation for the application, building on the static page table structures used in paging.
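
To make the distinction between a demand-paging fault and a protection fault concrete, here is a minimal Python sketch of the kind of check the MMU and trap handler perform; the flag layout and the check_access helper are illustrative assumptions, not the bit layout of any real hardware.

    # Illustrative page-table-entry flag bits (hypothetical layout).
    PRESENT = 1 << 0     # page is resident in a physical frame
    WRITABLE = 1 << 1    # writes are permitted
    USER = 1 << 2        # accessible from user mode

    def check_access(pte_flags, is_write, is_user):
        """Classify a memory access the way the hardware/OS would."""
        if not pte_flags & PRESENT:
            return "page fault: demand-load the page from the backing store"
        if is_write and not pte_flags & WRITABLE:
            return "protection fault: write to a read-only page"
        if is_user and not pte_flags & USER:
            return "protection fault: user access to a kernel page"
        return "access proceeds normally"

    print(check_access(0, is_write=False, is_user=True))              # missing page -> demand paging
    print(check_access(PRESENT | USER, is_write=True, is_user=True))  # read-only page -> protection fault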

Page Loading and Replacement

Upon detecting a page fault, the operating system initiates the page loading process by identifying the required page in the backing store, typically a disk or swap area, and transferring it into an available physical frame in main memory. The kernel schedules a disk I/O operation to read the page contents, which are then copied into the allocated frame; once loaded, the page table entry's validity bit is set to 1, the fault is resolved, and the interrupted instruction is restarted to allow the process to continue execution. This on-demand loading ensures that only actively referenced pages occupy physical memory, minimizing initial resource usage.

Frame allocation begins with a search for free frames using a frame table that tracks the status of physical memory pages. If a free frame is available, the page is loaded directly into it; however, in memory-constrained systems where all frames are occupied, the operating system must invoke a page replacement algorithm to select and evict an existing page before proceeding with the load. This eviction may involve writing the replaced page back to the backing store if it has been modified (a "dirty" page), adding further I/O overhead.

Basic page replacement algorithms aim to minimize the frequency of future page faults by evicting pages least likely to be needed soon. The First-In-First-Out (FIFO) algorithm maintains a queue of loaded pages and evicts the oldest one, regardless of usage patterns, which is simple to implement but can suffer from Belady's anomaly, where increasing the frame count paradoxically raises the fault rate. In contrast, the Least Recently Used (LRU) algorithm tracks access history—often approximated using a stack or reference counters—and evicts the page that has not been used for the longest time, providing better performance by favoring recently active pages, though it requires hardware support like timestamp counters for efficiency.

The impact of page faults on system performance is captured by the effective access time (EAT), which accounts for the probability of faults during memory operations:

\text{EAT} = (1 - p) \times \text{memory access time} + p \times (\text{page fault time} + \text{memory access time})

where p is the page fault probability. This formula highlights how even low fault rates can degrade performance significantly, owing to the disparity between memory access times (typically around 100 nanoseconds) and page fault handling, which is dominated by disk I/O latencies of several milliseconds. To mitigate this overhead, some systems employ read-ahead techniques, which speculatively load subsequent pages into memory upon a fault, anticipating sequential access patterns and reducing future I/O demands.
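
The difference between the FIFO and LRU policies described above can be seen by counting faults over a reference string. The small Python simulation below is a pedagogical sketch: the count_faults helper, the reference string, and the frame count are arbitrary illustrative choices.

    from collections import OrderedDict

    def count_faults(reference_string, num_frames, policy="FIFO"):
        """Count page faults for a reference string under FIFO or LRU replacement."""
        frames = OrderedDict()     # insertion order = FIFO order; move_to_end() gives LRU order
        faults = 0
        for page in reference_string:
            if page in frames:
                if policy == "LRU":
                    frames.move_to_end(page)      # refresh recency on a hit
                continue
            faults += 1
            if len(frames) >= num_frames:
                frames.popitem(last=False)        # evict oldest (FIFO) or least recently used (LRU)
            frames[page] = True
        return faults

    refs = [1, 2, 3, 1, 2, 4, 1, 2, 5, 1, 2, 3]
    print("FIFO faults:", count_faults(refs, 3, "FIFO"))   # 8 faults with 3 frames
    print("LRU faults:", count_faults(refs, 3, "LRU"))     # 6 faults with 3 frames

For this reference string, LRU's preference for recently used pages avoids evicting the hot pages 1 and 2, which is why it incurs fewer faults than FIFO.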

Benefits and Limitations

Advantages

Demand paging enhances memory efficiency by loading only the pages accessed by a process into physical memory, thereby allocating resources more effectively to active workloads and avoiding waste on unused portions of programs. This selective loading preserves physical memory for other processes, enabling a higher degree of multiprogramming where multiple programs can execute simultaneously without exhausting available RAM. It also reduces I/O overhead by eliminating the need to load unused code and data upfront, which shortens initial program load times and decreases the volume of disk operations that would otherwise contribute to resource contention.

In resource-constrained environments, this efficiency allows for smaller physical RAM configurations, lowering bill of materials (BOM) costs in devices like early smartphones; for example, Symbian OS employed demand paging to significantly cut BOM expenses by reducing reliance on expensive main memory. Symbian OS v9.5 further demonstrated this through demand paging optimizations that reduced average RAM usage by more than 25%. Programs using demand paging can begin execution right away, without the delay of fully loading all pages, making the technique especially suitable for large applications where complete upfront loading would be prohibitive. Research on web and desktop applications has shown that demand paging, when paired with code reordering, can decrease startup latency by more than 58% in bandwidth-limited scenarios by prioritizing essential pages and deferring others. Overall, this supports greater multiprogramming by fitting more processes into memory, improving CPU utilization and system throughput while minimizing idle periods.

Disadvantages

Demand paging introduces significant performance overhead due to page faults, where each fault triggers a trap and a disk I/O operation, typically taking 10-100 milliseconds to retrieve a page from secondary storage, thereby degrading the effective access time from nanoseconds to milliseconds. This overhead arises because the operating system must suspend the process, locate the page on disk, load it into physical memory, and restart execution, making access times highly variable and unpredictable compared to direct memory references.

A critical risk is thrashing, where excessive page faults occur when the system's multiprogramming level exceeds available physical memory, leading to CPU utilization dropping as the system spends more time handling I/O than executing processes; this happens if the aggregate working sets of active processes surpass physical RAM, causing fault rates to spike dramatically. Thrashing can be mitigated by page replacement algorithms that prioritize retaining frequently accessed pages, but poor locality exacerbates the issue, resulting in systems "paging themselves to death" with low overall efficiency.

The implementation of demand paging increases operating system complexity, requiring sophisticated data structures such as page tables, fault handlers, and replacement queues, which enlarge the kernel codebase and elevate the potential for bugs in memory management code. This added intricacy stems from the need to track page validity, manage swap space, and handle variable-sized allocations for page tables, making the system more error-prone than simpler memory models.

Security vulnerabilities arise from page faults enabling side-channel attacks, where timing differences in fault handling can leak sensitive information, such as encryption keys, through covert channels that exploit demand paging's on-access loading mechanism. For instance, attackers can infer memory layouts or access patterns by measuring fault latencies, as demonstrated in exploits targeting page fault side channels in secure enclaves.

Programs experience initial slowdowns during startup, as the first references to pages generate bursts of faults while the working set is built up in memory, delaying execution until sufficient pages are loaded from disk. This warmup phase can significantly postpone program responsiveness, particularly for applications with large or scattered memory footprints. In multi-user environments, high page fault rates create contention for disk I/O, leading to a convoy effect where one process's prolonged disk access hogs the I/O subsystem, stalling others and reducing overall system throughput. This contention amplifies under heavy loads, as multiple processes compete for limited I/O bandwidth, further degrading performance in shared systems.
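
A worked instance of the effective access time formula given earlier illustrates the scale of this overhead; the figures used here (a 100 ns memory access, an 8 ms fault service time, and a fault probability of 1 in 1,000) are assumed values for illustration only:

\text{EAT} = 0.999 \times 100\,\text{ns} + 0.001 \times (8\,\text{ms} + 100\,\text{ns}) \approx 8.1\,\mu\text{s}

That is, a fault on only one reference in a thousand makes the average memory access roughly eighty times slower than an ordinary memory reference.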

Implementation and Applications

In Operating Systems

Demand paging is integrated into operating system kernels through dedicated trap handlers that manage page faults, ensuring that virtual memory pages are loaded only when accessed. In the Linux kernel, for instance, page faults are handled by the do_page_fault() function, located in the architecture-specific fault handling code such as arch/x86/mm/fault.c, which processes interrupts from the memory management unit (MMU) and invokes higher-level routines like handle_mm_fault() in mm/memory.c to allocate and map pages as needed. This mechanism relies on multi-level page tables, such as the four-level hierarchy used on x86-64, where page table entries (PTEs) track mappings from virtual to physical addresses, with levels including the page global directory (PGD), page upper directory (PUD), page middle directory (PMD), and PTE.

Major operating systems implement demand paging with system-specific optimizations. Unix-derived systems, tracing back to the 1970s Unix lineage, employ demand paging alongside copy-on-write (COW) techniques to efficiently handle process forking by sharing pages until modifications occur, a strategy formalized in early virtual memory implementations. Windows uses demand-zero pages, where initially allocated pages are filled with zeros on first access to enhance security and efficiency, managed by a low-priority thread that prepares free pages from the standby list. Similarly, macOS's XNU kernel incorporates demand paging within its unified buffer cache, which merges the traditional buffer cache with the virtual memory page cache to minimize disk I/O and consolidate backing storage for files and pages.

Supporting kernel structures include page descriptor tables, represented by struct page in Linux, which maintain flags such as the dirty bit (indicating modifications requiring write-back to disk) and the present/resident bit (tracking whether a page is in physical memory). The swap daemon, known as kswapd in Linux, operates as a kernel thread to proactively reclaim memory by scanning pages, evicting dirty ones to the backing store (swap space or files), and freeing clean ones when memory pressure against the low watermarks is detected.

Configuration of demand paging behavior is tunable through kernel parameters. In Linux, the vm.swappiness parameter, adjustable via /proc/sys/vm/swappiness with values from 0 to 100, controls the aggressiveness of swapping anonymous pages versus reclaiming file-backed pages, where higher values prioritize swap usage to balance memory pressure.

The evolution of demand paging in Unix systems began with early implementations using swapping in the 1970s, such as Unix Version 6 in 1975, which laid the groundwork for later enhancements; full demand paging was first implemented in the 3BSD distribution in 1979, with further developments in subsequent BSD variants and System V Unix. Modern kernels, including contemporary Linux versions, extend this with support for huge pages—typically 2 MB in size on x86-64—to cover larger address ranges per translation lookaside buffer (TLB) entry, thereby reducing TLB misses and improving performance for memory-intensive workloads.
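
As a small illustration of the vm.swappiness tunable described above, the following Python sketch (assuming a Linux host; it only reads the value and needs no special privileges) prints the setting the kernel exposes under /proc:

    # Read the kernel's current vm.swappiness setting on a Linux system.
    # Writing a new value to the same path would require root privileges.
    with open("/proc/sys/vm/swappiness") as f:
        swappiness = int(f.read().strip())
    print(f"vm.swappiness = {swappiness}  (0 = avoid swap, 100 = swap aggressively)")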

Modern Variations and Comparisons

Modern variations of demand paging incorporate predictive and anticipatory mechanisms to mitigate page fault latencies. Prepaging, also known as anticipatory paging, extends traditional demand paging by prefetching likely future pages into the page cache based on access patterns, such as sequential reads. In Linux, the readahead subsystem implements this by triggering asynchronous reads when a fault occurs on a missing page or when accessing a page marked with the PG_readahead flag, using functions like page_cache_async_ra() to load chunks of data ahead of explicit requests. This approach amortizes I/O overhead and improves throughput for sequential workloads, as demonstrated in kernel optimizations that adjust readahead sizes dynamically via the file_ra_state structure. Similarly, in GPU environments, NVIDIA's Unified Memory provides demand fetching by allowing the GPU to access system memory on demand, where page faults trigger automatic migration of pages from CPU to GPU without explicit intervention. Upon a fault, the driver coalesces multiple faults into groups, migrates data in batches (e.g., up to 896 KB per transfer on Pascal GPUs), and updates mappings to minimize stalls, achieving up to 2x bandwidth improvements in sparse access scenarios.

In contemporary Linux kernels, demand paging integrates with optimizations like Transparent Huge Pages (THP) to reduce translation lookaside buffer (TLB) misses. THP enables the kernel to allocate 2 MB pages on fault instead of 4 KB pages, transparently collapsing contiguous small pages during allocation or compaction, which is particularly beneficial in resource-intensive environments like AWS EC2 instances running databases or memory-intensive applications. For instance, EC2 configurations often enable THP to enhance memory efficiency for large heaps, lowering overhead in virtualized setups. On mobile operating systems, Android employs zRAM as a compressed swap device to augment demand paging by avoiding slow disk I/O. zRAM dynamically allocates a RAM-based partition for compressing evicted anonymous and dirty pages via the kswapd daemon, decompressing them on access; this keeps more pages resident in memory, reducing fault times in low-RAM devices while maintaining demand-based loading from backing storage for clean pages.

Demand paging contrasts with swapping, where entire processes are moved to secondary storage rather than individual pages, incurring higher overhead due to bulk transfers and context switches. Unlike segmentation, which divides memory into variable-sized logical units corresponding to program modules (e.g., code, data), demand paging uses fixed-size pages, which eliminates external fragmentation but can introduce internal fragmentation within partially used pages. In comparison to eager loading, which preloads all process pages into memory at startup—potentially wasting space for sparsely accessed programs—demand paging defers loading until a fault, optimizing for the irregular access patterns prevalent in modern applications.

In the 2020s, the shift to solid-state drives (SSDs) has significantly reduced page fault latencies from milliseconds on HDDs to microseconds, enabling faster demand paging in I/O-bound workloads, though queue depth and write amplification can still introduce variability. However, non-uniform memory access (NUMA) architectures complicate replacement policies by introducing remote access penalties, prompting hybrid approaches like Linux's automatic NUMA balancing, introduced in kernel 3.13 and refined in the 5.x series, which monitors access patterns and migrates pages to local nodes proactively.
Looking ahead, research prototypes integrate machine learning for predictive loading, such as the Learning-based Page Replacement (LPR) scheme, which learns to select between LRU and MRU policies based on eviction history, reducing faults by up to 36% in benchmarks like SPLASH-2x.
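
Demand paging of file-backed memory, and read-ahead style hints of the kind discussed above, can be observed from user space with memory-mapped files. The sketch below is a minimal illustration using Python's mmap module, assuming a Linux host and Python 3.8+ (where mmap.madvise() and mmap.MADV_WILLNEED are available); the file path is just an example.

    import mmap, os

    # Create an example file spanning several pages; mapping it does not read it yet.
    path = "/tmp/demand_paging_demo.bin"           # example path
    with open(path, "wb") as f:
        f.write(b"\0" * (16 * 4096))               # sixteen 4 KB pages

    fd = os.open(path, os.O_RDONLY)
    mapping = mmap.mmap(fd, 0, prot=mmap.PROT_READ)

    # Optional hint: ask the kernel to read pages ahead of actual access,
    # similar in spirit to the readahead/prepaging mechanisms described above.
    if hasattr(mmap, "MADV_WILLNEED"):
        mapping.madvise(mmap.MADV_WILLNEED)

    # The first touch of each page triggers a page fault, and the kernel
    # loads that page of the file on demand.
    first_byte = mapping[0]
    byte_in_last_page = mapping[15 * 4096]

    mapping.close()
    os.close(fd)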
