Demand paging
In computer operating systems, demand paging (as opposed to anticipatory paging) is a method of virtual memory management. In a system that uses demand paging, the operating system copies a disk page into physical memory only when an attempt is made to access it and that page is not already in memory (i.e., if a page fault occurs). It follows that a process begins execution with none of its pages in physical memory, and triggers many page faults until most of its working set of pages are present in physical memory. This is an example of a lazy loading technique.
Basic concept
Demand paging only brings pages into memory when an executing process demands them. This is often referred to as lazy loading, as only those pages demanded by the process are swapped from secondary storage to main memory. Contrast this to pure swapping, where all memory for a process is swapped from secondary storage to main memory when the process starts up or resumes execution.
Commonly, a memory management unit (MMU) is used to achieve this. The memory management unit maps logical memory to physical memory. Entries in the memory management unit include a bit that indicates whether a page is valid or invalid. A valid page is one that currently resides in main memory; an invalid page is one that currently resides in secondary memory. When a process tries to access a page, the following steps are generally followed (a code sketch follows the list):
- Attempt to access page.
- If page is valid (in memory) then continue processing instruction as normal.
- If page is invalid then a page-fault trap occurs.
- Check whether the memory reference is a valid reference to a location on secondary memory. If not, the process is terminated (illegal memory access); otherwise, the required page must be paged in.
- Schedule a disk operation to read the desired page into main memory.
- Mark the page table entry as valid once the page has been loaded.
- Restart the instruction that was interrupted by the operating system trap.
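The sequence above can be sketched in C. This is a minimal illustration, not a real kernel interface: every type and helper here (pte_t, lookup_pte, allocate_frame, read_page_from_disk, and so on) is hypothetical, standing in for machinery an actual operating system provides. The valid-page case is handled in hardware, so the handler only sees the invalid path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types and helpers -- stand-ins for real kernel machinery. */
typedef struct {
    uint64_t frame;  /* physical frame number          */
    bool     valid;  /* page currently in main memory? */
} pte_t;

extern pte_t   *lookup_pte(uint64_t vpage);        /* find the page-table entry */
extern bool     on_backing_store(uint64_t vpage);  /* legal page held on disk?  */
extern uint64_t allocate_frame(void);              /* free (or evicted) frame   */
extern void     read_page_from_disk(uint64_t vpage, uint64_t frame);
extern void     terminate_process(void);
extern void     restart_faulting_instruction(void);

/* Called from the trap handler after the MMU reports an invalid page. */
void handle_page_fault(uint64_t vpage)
{
    pte_t *pte = lookup_pte(vpage);

    if (!on_backing_store(vpage)) {
        terminate_process();                /* illegal memory access        */
        return;
    }
    uint64_t frame = allocate_frame();      /* may trigger page replacement */
    read_page_from_disk(vpage, frame);      /* schedule and await disk I/O  */
    pte->frame = frame;                     /* map virtual page to frame    */
    pte->valid = true;                      /* mark the entry valid         */
    restart_faulting_instruction();
}
```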
Advantages
Demand paging, as opposed to loading all pages immediately:
- Only loads pages that are demanded by the executing process.
- As more space is left free in main memory, more processes can be loaded at once, reducing context switching, which consumes large amounts of resources.
- Less loading latency occurs at program startup, as less information is accessed from secondary storage and less information is brought into main memory.
- As main memory is expensive compared to secondary memory, this technique helps significantly reduce the bill of materials (BOM) cost, for example in smartphones. Symbian OS had this feature.
Disadvantages
- Individual programs face extra latency when they access a page for the first time.
- Low-cost, low-power embedded systems may not have a memory management unit that supports page replacement.
- Memory management with page replacement algorithms becomes slightly more complex.
- Possible security risks, including vulnerability to timing attacks; see Percival, Colin (2005-05-13). Cache missing for fun and profit (PDF). BSDCan 2005. (specifically the virtual memory attack in section 2).
- Thrashing, which may occur due to repeated page faults.
Fundamentals
Overview and Definition
Demand paging is a memory management scheme in which the logical address space of a process is divided into fixed-size pages, and these pages are loaded into physical memory only when first accessed by the process, in contrast to earlier methods that pre-loaded entire programs into memory before execution. This on-demand loading allows processes to begin running with minimal initial memory allocation, deferring the transfer of data from secondary storage until necessary. Demand paging operates as a key component within the broader framework of virtual memory, which provides an abstraction of a large, contiguous address space beyond the limits of physical memory.

The core principle of demand paging is lazy evaluation of memory requirements: a newly created process starts with an empty set of physical frames allocated to it and relies on secondary storage, such as a disk, as the backing store for its pages. Upon accessing a page not yet in memory, the operating system intervenes to fetch it, ensuring that only actively used portions of the program occupy physical memory at any time. For example, in a process divided into 100 pages where only the first 20 are referenced during initial execution, solely those 20 pages are brought into memory, thereby conserving physical resources and enabling multitasking by allowing space for other processes.

Demand paging was first conceptualized in the early 1960s with the Atlas computer at the University of Manchester, where it was implemented as part of a one-level storage system that automatically fetched missing pages from a drum-based backing store upon access. This innovation was formalized and advanced in the Multics operating system, with key implementation details on demand paging for its segmented virtual memory presented at the 1969 ACM Symposium on Operating System Principles.

Prerequisites: Virtual Memory and Paging
Virtual memory is a memory management technique that allows processes to use more memory than is physically available in the system by providing an abstraction of a large, contiguous address space to each program. It achieves this by mapping virtual addresses generated by the program to physical addresses in main memory, utilizing secondary storage such as disks as an extension of main memory when necessary. This mapping is performed automatically through hardware and software mechanisms, enabling the illusion of dedicated memory for multiple processes while sharing the physical resources efficiently.[1]

Paging is a fundamental method for implementing virtual memory, where both the virtual address space and physical memory are divided into fixed-size units called pages and page frames, respectively. Pages in the virtual space correspond to frames in physical memory, with a typical uniform size such as 4 KB, which simplifies memory allocation and eliminates the need for contiguous allocation of variable-sized blocks. This fixed-size approach reduces external fragmentation (unused gaps between allocated memory blocks) by allowing any free frame to be assigned to any page without regard to location. However, paging introduces internal fragmentation, where portions of a page remain unused if the data or code assigned to it does not fill the entire fixed-size unit, leading to wasted space within allocated pages.[6]

The address translation process in paging begins with dividing the logical (virtual) address into two parts: a page number and an offset within that page. The page number serves as an index into a page table, a data structure maintained by the operating system that maps each virtual page number to a corresponding physical frame number. The physical address is then computed by combining the frame number with the unchanged offset, using the formula:

physical address = (frame number × page size) + offset

This translation ensures that references to virtual memory are resolved to physical locations efficiently, typically with hardware support like a memory management unit. Demand paging extends this paging mechanism by loading pages into physical memory only when accessed, rather than preloading the entire address space.[6]
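A toy translation routine makes the formula concrete. The sketch below assumes a single-level page table and 4 KB pages; the table contents are invented purely for illustration (real systems use multi-level tables, as discussed later).

```c
#include <stdio.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* 4 KB pages, as in the text */
#define PAGE_BITS 12u     /* log2(PAGE_SIZE)            */

/* Toy single-level page table: index = virtual page number,
 * value = physical frame number (contents invented for illustration). */
static const uint32_t page_table[] = { 5, 9, 3, 7 };

static uint32_t translate(uint32_t vaddr)
{
    uint32_t vpage  = vaddr >> PAGE_BITS;       /* page number      */
    uint32_t offset = vaddr & (PAGE_SIZE - 1);  /* offset in page   */
    uint32_t frame  = page_table[vpage];
    return frame * PAGE_SIZE + offset;          /* physical address */
}

int main(void)
{
    uint32_t vaddr = 2 * PAGE_SIZE + 100;  /* page 2, offset 100 */
    printf("virtual 0x%x -> physical 0x%x\n",
           (unsigned)vaddr, (unsigned)translate(vaddr));
    /* page 2 maps to frame 3, so this prints 0x2064 -> 0x3064 */
    return 0;
}
```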
Operational Mechanisms

Page Fault Handling
A page fault is a type of hardware interrupt, often referred to as a trap, that is generated when a running process attempts to access a virtual memory page that is not currently mapped to a physical frame in main memory. This occurs in demand paging systems where pages are not preloaded but fetched only upon reference. The Memory Management Unit (MMU) detects the fault during address translation by examining the page table entry for the requested virtual page; specifically, if the present or validity bit in that entry is set to 0, indicating the page is absent from physical memory, the MMU raises the interrupt to halt execution and notify the operating system.[7][8]

The operating system handles the page fault through a dedicated interrupt service routine, beginning with saving the processor context of the interrupted process to ensure its state can be restored later. The handler then inspects the fault details, such as the faulting virtual address and the process's page table, to distinguish between an invalid access (arising from protection violations, like attempting to write to a read-only page) and a legitimate demand paging scenario where the page simply resides on secondary storage. In the case of a valid missing page, the OS allocates a physical frame (which may involve page replacement if no free frames are available, as described below) and, after loading the page into the frame, updates the page table entry to associate the virtual page with the frame, setting the present bit to 1 to enable resumption of the access.[9][10][11]

The MMU's role extends beyond mere translation; it enforces memory protection by monitoring flags in page table entries, such as read/write permissions and user/kernel modes, which can also trigger faults if violated, though these differ from demand paging faults. Once the page table is updated, control returns to the process at the instruction that caused the fault, allowing the original memory access to complete successfully. In demand paging, these faults are resolvable "soft" interrupts that support efficient memory usage by loading only needed pages, unlike irrecoverable "hard" faults from hardware errors such as memory parity issues.[7][12]

For instance, if a process references virtual page 5 and its page table entry shows the present bit as 0, the MMU immediately signals a page fault, transferring control to the kernel's trap handler, which validates the access and prepares the mapping without terminating the process. This mechanism ensures transparent operation for the application, building on the static page table structures used in virtual memory paging.[13]
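The present and protection bits the MMU consults can be sketched as flag tests on a page-table entry. The flag layout below is illustrative only, not the actual x86 encoding:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag layout for a page-table entry (not the real x86
 * encoding): low bits are status flags, high bits the frame number. */
#define PTE_PRESENT  (1u << 0)  /* page resident in physical memory */
#define PTE_WRITABLE (1u << 1)  /* writes permitted                 */
#define PTE_USER     (1u << 2)  /* accessible from user mode        */

/* Returns true if the access can proceed; false means the MMU would
 * raise a trap for the OS to classify as a demand-paging fault or a
 * protection violation. */
bool access_ok(uint32_t pte, bool is_write, bool is_user)
{
    if (!(pte & PTE_PRESENT))              return false; /* page absent     */
    if (is_write && !(pte & PTE_WRITABLE)) return false; /* protection      */
    if (is_user && !(pte & PTE_USER))      return false; /* privilege level */
    return true;
}
```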
Page Loading and Replacement

Upon detecting a page fault, the operating system initiates the page loading process by identifying the required page in the backing store, typically a disk or swap area, and transferring it into an available physical frame in main memory.[14] The kernel schedules a disk I/O operation to read the page contents, which are then copied into the allocated frame; once loaded, the page table entry's validity bit is set to 1, the page fault is resolved, and the interrupted instruction is restarted to allow the process to continue execution.[15] This on-demand loading ensures that only actively referenced pages occupy physical memory, minimizing initial resource usage.[16]

Frame allocation begins with a search for free frames using a frame table that tracks the status of physical memory pages.[17] If a free frame is available, the page is loaded directly into it; however, in memory-constrained systems where all frames are occupied, the operating system must invoke a page replacement algorithm to select and evict an existing page before proceeding with the load.[18] This eviction may involve writing the replaced page back to the backing store if it has been modified (a "dirty" page), adding further I/O overhead.[19]

Basic page replacement algorithms aim to minimize the frequency of future page faults by evicting pages least likely to be needed soon. The First-In-First-Out (FIFO) algorithm maintains a queue of loaded pages and evicts the oldest one, regardless of usage patterns, which is simple to implement but can suffer from Belady's anomaly, where increasing the frame count paradoxically raises fault rates. In contrast, the Least Recently Used (LRU) algorithm tracks access history (often approximated using a stack or reference counters) and evicts the page that has not been used for the longest time, providing better performance by favoring recently active pages, though it requires hardware support like timestamp counters for efficiency.[19]

The impact of page faults on system performance is captured by the effective access time (EAT), which accounts for the probability of faults during memory operations:

EAT = (1 − p) × memory access time + p × page fault service time

where p is the page fault probability.[20] This formula highlights how even low fault rates can degrade performance significantly, given the disparity between memory access times (typically 100 nanoseconds) and page fault handling, dominated by disk I/O latencies of several milliseconds: with p = 0.001, a 100 ns memory access, and an 8 ms fault service time, the EAT is about 8.1 microseconds, roughly an 80-fold slowdown.[21] To mitigate this overhead, some systems employ read-ahead techniques, which speculatively load subsequent pages into memory upon a fault, anticipating sequential access patterns and reducing future I/O demands.[13]
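The FIFO and LRU policies described above can be compared by counting faults over a sample reference string. The sketch below uses three frames and the classic reference string often used to illustrate Belady's anomaly; it prints 9 faults for FIFO and 10 for LRU on this particular input, showing that neither policy dominates on every workload.

```c
#include <stdio.h>
#include <string.h>

#define FRAMES 3

/* Count page faults over a reference string with FIFO eviction. */
static int fifo_faults(const int *refs, int n)
{
    int frames[FRAMES], next = 0, faults = 0;
    memset(frames, -1, sizeof frames);            /* all frames empty */
    for (int i = 0; i < n; i++) {
        int hit = 0;
        for (int f = 0; f < FRAMES; f++)
            if (frames[f] == refs[i]) hit = 1;
        if (!hit) {
            frames[next] = refs[i];               /* evict oldest (round robin) */
            next = (next + 1) % FRAMES;
            faults++;
        }
    }
    return faults;
}

/* Count page faults with LRU eviction, tracking last-use times. */
static int lru_faults(const int *refs, int n)
{
    int frames[FRAMES], last[FRAMES], faults = 0;
    memset(frames, -1, sizeof frames);
    memset(last, 0, sizeof last);
    for (int i = 0; i < n; i++) {
        int hit = -1, victim = 0;
        for (int f = 0; f < FRAMES; f++) {
            if (frames[f] == refs[i]) hit = f;
            if (last[f] < last[victim]) victim = f;   /* least recently used */
        }
        if (hit >= 0) { last[hit] = i + 1; }          /* refresh use time    */
        else { frames[victim] = refs[i]; last[victim] = i + 1; faults++; }
    }
    return faults;
}

int main(void)
{
    int refs[] = {1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5};
    int n = sizeof refs / sizeof refs[0];
    printf("FIFO faults: %d\n", fifo_faults(refs, n));  /* 9  */
    printf("LRU  faults: %d\n", lru_faults(refs, n));   /* 10 */
    return 0;
}
```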
Benefits and Limitations

Advantages
Demand paging enhances memory efficiency by loading only the pages accessed by a process into physical memory, thereby allocating resources more effectively to active workloads and avoiding waste on unused portions of programs. This selective loading preserves physical memory for other processes, enabling a higher degree of multiprogramming where multiple programs can execute simultaneously without exhausting available RAM.[22] It also reduces I/O overhead by eliminating the need to load unused code and data upfront, which shortens initial program load times and decreases the volume of disk operations that could otherwise contribute to inefficient resource contention.[22]

In resource-constrained environments, this efficiency allows for smaller physical RAM configurations, lowering bill of materials (BOM) costs in devices like early smartphones; for example, Symbian OS employed demand paging to significantly cut BOM expenses by reducing reliance on expensive main memory. Symbian OS v9.5 further demonstrated this through demand paging optimizations that reduced average RAM usage by more than 25%.[23][24]

Programs using demand paging can begin execution right away, without the delay of fully loading all pages, making it especially suitable for large applications where complete upfront loading would be prohibitive. Research on web and desktop applications has shown that demand paging, when paired with code reordering, can decrease startup latency by more than 58% in bandwidth-limited scenarios by prioritizing essential pages and deferring others.[25] Overall, this supports greater multiprogramming by fitting more processes into memory, improving CPU utilization and system throughput while minimizing idle periods.[22]
Disadvantages

Demand paging introduces significant performance overhead due to page faults, where each fault triggers a context switch and disk I/O operation, typically taking 10-100 milliseconds to retrieve a page from secondary storage, thereby degrading the effective memory access time from nanoseconds to milliseconds.[26] This overhead arises because the operating system must interrupt the process, locate the page on disk, load it into physical memory, and restart execution, making access times highly variable and unpredictable compared to direct memory references.[22]

A critical risk is thrashing, where excessive page faults occur when the system's multiprogramming level exceeds available physical memory, causing CPU utilization to drop as the system spends more time handling I/O than executing processes; this happens if the aggregate working sets of active processes surpass physical RAM, causing fault rates to spike dramatically.[27] Thrashing can be mitigated by page replacement algorithms that prioritize retaining frequently accessed pages, but poor locality of reference exacerbates the issue, resulting in systems "paging themselves to death" with low overall efficiency.[28]

The implementation of demand paging increases operating system complexity, requiring sophisticated data structures such as page tables, fault handlers, and replacement queues, which enlarge the kernel codebase and elevate the potential for bugs in memory management.[26] This added intricacy stems from the need to track page validity, manage swap space, and handle variable-sized allocations for page tables, making the system more error-prone than simpler memory models.[27]

Security vulnerabilities arise from page faults enabling side-channel attacks, where timing differences in fault handling can leak sensitive information, such as encryption keys, through covert channels that exploit demand paging's on-access loading mechanism.[29] For instance, attackers can infer memory layouts or access patterns by measuring fault latencies, as demonstrated in exploits targeting page fault side channels in secure enclaves.[30]

Programs experience initial slowdowns during startup, as the first references to pages generate bursts of faults while building the working set in memory, delaying execution until sufficient pages are loaded from disk.[22] This warmup phase can significantly postpone program responsiveness, particularly for applications with large or scattered memory footprints.

In multi-user environments, high page fault rates create resource contention for disk I/O, leading to a convoy effect where one process's prolonged access hogs the subsystem, stalling others and reducing overall system throughput.[28] This contention amplifies under heavy loads, as multiple processes compete for limited I/O bandwidth, further degrading performance in shared systems.[27]
Implementation and Applications

In Operating Systems
Demand paging is integrated into operating system kernels through dedicated trap handlers that manage page faults, ensuring that virtual memory pages are loaded only when accessed. In the Linux kernel, for instance, page faults are handled by the do_page_fault() function, located in the architecture-specific fault handling code such as arch/x86/mm/fault.c, which processes interrupts from the memory management unit (MMU) and invokes higher-level routines like handle_mm_fault() in mm/memory.c to allocate and map pages as needed. This mechanism relies on multi-level page tables, such as the four-level hierarchy used in x86-64 architectures, where page table entries (PTEs) track mappings from virtual to physical addresses, with levels including the page global directory (PGD), page upper directory (PUD), page middle directory (PMD), and PTE.[31]
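A simplified walk of that four-level hierarchy can be sketched as follows. This is not actual Linux kernel code: the pt_entry type and the phys_to_virt() helper are hypothetical, and the entry format is reduced to a present bit plus an address field. Each level consumes 9 bits of the virtual address, with the low 12 bits left as the page offset.

```c
#include <stdint.h>

#define LEVEL_BITS 9u
#define PAGE_BITS  12u
#define PRESENT    (1ull << 0)
#define ADDR_MASK  0x000ffffffffff000ull  /* bits 12..51: next-level address */

typedef uint64_t pt_entry;

/* Hypothetical helper: map a physical address to a kernel-visible pointer. */
extern pt_entry *phys_to_virt(uint64_t phys);

/* Walk PGD -> PUD -> PMD -> PTE; returns 0 when an entry is absent
 * (in that case a page fault would be raised and serviced by the
 * demand-paging handler). */
uint64_t walk(pt_entry *pgd, uint64_t vaddr)
{
    pt_entry *table = pgd;
    for (int level = 3; level >= 0; level--) {
        unsigned idx = (vaddr >> (PAGE_BITS + level * LEVEL_BITS)) & 0x1ff;
        pt_entry e = table[idx];
        if (!(e & PRESENT))
            return 0;                       /* not mapped: page fault */
        if (level == 0)                     /* PTE level: final frame */
            return (e & ADDR_MASK) | (vaddr & ((1u << PAGE_BITS) - 1));
        table = phys_to_virt(e & ADDR_MASK);
    }
    return 0;                               /* unreachable */
}
```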
Major operating systems implement demand paging with system-specific optimizations. Unix-derived systems like Linux, tracing back to the 1970s Unix lineage, employ demand paging alongside copy-on-write (COW) techniques to efficiently handle process forking by sharing pages until modifications occur, a strategy formalized in early Linux implementations for virtual memory management.[32] Windows NT uses demand-zero pages, where initially allocated pages are filled with zeros on first access to enhance security and efficiency, managed by a low-priority zero page thread that prepares free pages from the standby list.[33] Similarly, macOS's XNU kernel incorporates demand paging within its unified buffer cache, which merges the traditional buffer cache with the virtual memory cache to minimize disk I/O and consolidate backing storage for files and pages.[34]
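Copy-on-write after fork() can be observed from user space with standard POSIX calls; no hypothetical APIs are needed. In the minimal demonstration below, the child's write triggers a fault that gives it a private copy of the page, so the parent's data is unchanged:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    char *buf = malloc(4096);
    strcpy(buf, "parent data");

    pid_t pid = fork();        /* child initially shares pages copy-on-write */
    if (pid == 0) {
        strcpy(buf, "child data");  /* write faults; kernel copies the page */
        printf("child sees:  %s\n", buf);
        exit(0);
    }
    waitpid(pid, NULL, 0);
    printf("parent sees: %s\n", buf);  /* still "parent data": the child
                                          wrote to its own private copy   */
    free(buf);
    return 0;
}
```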
Supporting kernel structures include page descriptor tables, represented by the struct page in Linux, which maintain flags such as the dirty bit (indicating modifications requiring write-back to disk) and the present/resident bit (tracking whether a page is in physical memory).[31] The swap daemon, known as kswapd in Linux, operates as a kernel thread to proactively reclaim memory by scanning pages, evicting dirty ones to backing store (swap space or files), and freeing clean ones when pressure on low watermarks is detected.[35]
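The dirty/clean distinction that drives reclaim can be sketched with a hypothetical page descriptor. This is a loose illustration of the logic described above, not Linux's actual struct page or kswapd implementation; the descriptor fields and helper functions are invented:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page descriptor, loosely modeled on the flags the text
 * describes (not Linux's actual struct page). */
struct page_desc {
    bool present;   /* resident in physical memory           */
    bool dirty;     /* modified since load: needs write-back */
};

extern void write_back_to_swap(struct page_desc *p);  /* hypothetical I/O  */
extern void free_frame(struct page_desc *p);          /* hypothetical free */

/* kswapd-style reclaim sketch: scan pages, write dirty ones to the
 * backing store, then release their frames until the target is met. */
void reclaim(struct page_desc *pages, size_t n, size_t target)
{
    size_t freed = 0;
    for (size_t i = 0; i < n && freed < target; i++) {
        if (!pages[i].present)
            continue;
        if (pages[i].dirty) {              /* dirty: must be written out */
            write_back_to_swap(&pages[i]);
            pages[i].dirty = false;
        }
        free_frame(&pages[i]);             /* clean: frame can be reused */
        pages[i].present = false;
        freed++;
    }
}
```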
Configuration of demand paging behavior is tunable through kernel parameters. In Linux, the vm.swappiness sysctl, adjustable via /proc/sys/vm/swappiness with values from 0 to 100, controls the aggressiveness of swapping anonymous pages versus reclaiming file-backed pages, where higher values prioritize swap usage to balance memory pressure.[36]
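The current value can be inspected programmatically through procfs; /proc/sys/vm/swappiness is the standard Linux path, and the snippet below only reads it (changing it requires root and a write to the same file):

```c
#include <stdio.h>

/* Read the current vm.swappiness value via procfs. */
int main(void)
{
    FILE *f = fopen("/proc/sys/vm/swappiness", "r");
    int value;

    if (f && fscanf(f, "%d", &value) == 1)
        printf("vm.swappiness = %d\n", value);
    if (f)
        fclose(f);
    return 0;
}
```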
The evolution of demand paging in Unix-like systems began with early implementations using swapping in the 1970s, such as Unix Version 6 in 1975, which laid the groundwork for later enhancements; full demand paging was first implemented in the 3BSD distribution in 1979, with further developments in subsequent BSD variants and System V Unix.[37] Modern kernels, including contemporary Linux versions, extend this with support for huge pages—typically 2 MB in size on x86-64—to cover larger address ranges per translation lookaside buffer (TLB) entry, thereby reducing TLB misses and improving performance for memory-intensive workloads.[38][35]
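A huge-page mapping can be requested from user space with the standard MAP_HUGETLB mmap flag on Linux. The call fails unless the system has huge pages reserved (for example via the vm.nr_hugepages sysctl), which the sketch below reports rather than assumes:

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>

#define HUGE_2MB (2u * 1024 * 1024)

int main(void)
{
    /* Request one anonymous 2 MB huge page. */
    void *p = mmap(NULL, HUGE_2MB, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap(MAP_HUGETLB)");   /* likely ENOMEM: no huge pages reserved */
        return 1;
    }
    ((char *)p)[0] = 1;   /* first touch faults in the whole 2 MB page */
    munmap(p, HUGE_2MB);
    return 0;
}
```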
