Virtual address space
from Wikipedia

In computing, a virtual address space (VAS) or address space is the set of ranges of virtual addresses that an operating system makes available to a process.[1] The range of virtual addresses usually starts at a low address and can extend to the highest address allowed by the computer's instruction set architecture and supported by the operating system's pointer size implementation, which can be 4 bytes for 32-bit or 8 bytes for 64-bit OS versions. This provides several benefits, one of which is security through process isolation assuming each process is given a separate address space.

Example

In the following description, the terminology used will be particular to the Windows NT operating system, but the concepts are applicable to other virtual memory operating systems.

When a new application on a 32-bit OS is executed, the process has a 4 GiB VAS: each one of the memory addresses (from 0 to 2^32 − 1) in that space can have a single byte as a value. Initially, none of them have values ('-' represents no value). Using or setting values in such a VAS would cause a memory exception.

           0                                           4 GiB
VAS        |----------------------------------------------|

Then the application's executable file is mapped into the VAS. Addresses in the process VAS are mapped to bytes in the exe file. The OS manages the mapping:

           0                                           4 GiB
VAS        |---vvv----------------------------------------|
mapping        |||
file bytes     app

The v's are values from bytes in the mapped file. Then, required DLL files are mapped (this includes custom libraries as well as system ones such as kernel32.dll and user32.dll):

           0                                           4 GiB
VAS        |---vvv--------vvvvvv---vvvv-------------------|
mapping        |||        ||||||   ||||
file bytes     app        kernel   user

The process then starts executing bytes in the EXE file. However, the only way the process can use or set '-' values in its VAS is to ask the OS to map them to bytes from a file. A common way to use VAS memory in this way is to map it to the page file. The page file is a single file, but multiple distinct sets of contiguous bytes can be mapped into a VAS:

           0                                           4 GiB
VAS        |---vvv--------vvvvvv---vvvv----vv---v----vvv--|
mapping        |||        ||||||   ||||    ||   |    |||
file bytes     app        kernel   user   system_page_file

And different parts of the page file can map into the VAS of different processes:

           0                                           4 GiB
VAS 1      |---vvvv-------vvvvvv---vvvv----vv---v----vvv--|
mapping        ||||       ||||||   ||||    ||   |    |||
file bytes     app1 app2  kernel   user   system_page_file
mapping             ||||  ||||||   ||||       ||   |
VAS 2      |--------vvvv--vvvvvv---vvvv-------vv---v------|

On Microsoft Windows 32-bit, by default, only 2 GiB are made available to processes for their own use.[2] The other 2 GiB are used by the operating system. On later 32-bit editions of Microsoft Windows, it is possible to extend the user-mode virtual address space to 3 GiB while only 1 GiB is left for kernel-mode virtual address space by marking the programs as IMAGE_FILE_LARGE_ADDRESS_AWARE and enabling the /3GB switch in the boot.ini file.[3][4]

On Microsoft Windows 64-bit, in a process running an executable that was linked with /LARGEADDRESSAWARE:NO, the operating system artificially limits the user mode portion of the process's virtual address space to 2 GiB. This applies to both 32- and 64-bit executables.[5][6] Processes running executables that were linked with the /LARGEADDRESSAWARE:YES option, which is the default for 64-bit Visual Studio 2010 and later,[7] have access to more than 2 GiB of virtual address space: up to 4 GiB for 32-bit executables, up to 8 TiB for 64-bit executables in Windows through Windows 8, and up to 128 TiB for 64-bit executables in Windows 8.1 and later.[4][8]

Allocating memory via C's malloc establishes the page file as the backing store for any new virtual address space. However, a process can also explicitly map file bytes.
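The explicit file mapping described above can be sketched in Python, whose mmap module wraps the underlying OS mapping facility (mmap on Unix, file mappings on Windows). This is an illustrative sketch, not part of the original text; the file contents are hypothetical example data:

```python
import mmap
import os
import tempfile

# Create a small file to stand in for the "app" bytes in the diagrams above.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello from the mapped file")
os.close(fd)

# Map the file's bytes into this process's virtual address space.
with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as m:
        data = bytes(m[:5])   # reading VAS addresses reads file bytes
        m[0:5] = b"HELLO"     # writing VAS addresses updates the mapping

with open(path, "rb") as f:
    updated = f.read()        # the file reflects the write made via the VAS

os.remove(path)
print(data)      # b'hello'
print(updated)   # b'HELLO from the mapped file'
```

Once the mapping exists, ordinary loads and stores on the mapped addresses are backed by the file, exactly as in the `vvv` regions of the diagrams.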

Linux


For x86 CPUs, Linux 32-bit allows splitting the user and kernel address ranges in different ways: 3G/1G user/kernel (default), 1G/3G user/kernel or 2G/2G user/kernel.[9]

from Grokipedia
In computing, particularly within operating systems, a virtual address space refers to the complete range of addresses that the operating system assigns to a specific process or user, enabling it to reference memory locations in a logical, abstracted manner without direct dependence on the underlying physical memory configuration. This abstraction is facilitated by hardware components like the memory management unit (MMU), which translates virtual addresses to physical addresses using data structures such as page tables, allowing processes to operate as if they have exclusive access to a large, contiguous memory area.

The primary purposes of virtual address spaces include providing memory isolation between processes to enhance security and stability, ensuring that one process cannot access or interfere with another's memory, while also supporting protection mechanisms to prevent unauthorized writes or reads. Additionally, they enable efficient resource utilization by allowing the virtual space to exceed available physical RAM, with unused portions swapped to secondary storage via paging, thus accommodating multitasking environments where multiple processes run concurrently without exhausting physical memory.

The size and structure of a virtual address space vary by system architecture and operating system implementation. For instance, in 32-bit Windows systems, it typically totals 4 GB, divided into user-mode space (up to 2 GB, or 3 GB with extensions) for the process and kernel-mode space for system resources. In systems like IBM z/OS, it consists of private areas unique to each process and a shared common area for system-wide elements, starting from address zero and extending to the maximum supported by the hardware.

Typically, the virtual address space is organized into distinct regions or segments. In Unix-like systems, these commonly include:
  • Text (code) segment: Contains the executable machine instructions of the program.
  • Rodata (read-only data) segment: Holds read-only constants and string literals.
  • Data segment: Stores initialized global and static variables.
  • BSS segment: Reserves space for uninitialized global and static variables, initialized to zero by the system.
  • Heap: Provides dynamic memory allocation, growing upwards from the end of the BSS segment.
  • Stack: Manages local variables, function calls, and parameters, growing downwards from the high end of the address space; it often includes the argc (argument count), argv (argument vector) array, and environment variables at its top.
This design not only simplifies programming by presenting a contiguous memory view but also supports advanced features like demand paging, where physical memory is allocated only when needed.
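On Linux, the regions listed above can be observed directly: each process can read its own layout from /proc/self/maps, which lists every mapped region with its address range, permissions, and backing object. A minimal sketch, assuming a Linux host:

```python
# Linux-specific sketch: inspect this process's own virtual address space
# via /proc/self/maps. Each line describes one mapped region: address
# range, permissions (e.g. r-xp), and a name such as [heap], [stack],
# or the path of a mapped file.
regions = []
with open("/proc/self/maps") as f:
    for line in f:
        fields = line.split()
        addr_range, perms = fields[0], fields[1]
        name = fields[5] if len(fields) > 5 else "(anonymous)"
        regions.append((addr_range, perms, name))

names = {name for _, _, name in regions}
print("[stack]" in names)   # True: the main thread's stack region
```

Mapped shared libraries, the heap, and the stack all appear as separate entries, matching the segment breakdown described in the list above.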

Fundamentals

Definition and Purpose

The virtual address space is the set of memory addresses that a process uses to reference locations in memory, providing an isolated and abstract view of the system's memory for each running program. It consists of a range of virtual addresses generated by the CPU, which are distinct from physical addresses in the actual hardware memory, and are mapped to physical locations by the operating system in conjunction with hardware mechanisms. This abstraction ensures that each process operates within its own private environment, preventing direct access to the memory of other processes.

The primary purpose of the virtual address space is to enable concurrent execution of multiple processes on the same system without interference, facilitating multiprogramming and time-sharing environments. It supports memory protection by isolating processes, thereby enhancing system security and stability, as one process cannot inadvertently or maliciously overwrite another's memory. Additionally, it allows programs to utilize more memory than is physically available by incorporating swapping or paging techniques, where inactive portions of a process's memory are temporarily moved to secondary storage.

Key characteristics of a virtual address space include its provision of a contiguous logical view of memory to the process, despite the potentially fragmented nature of physical allocation. The size is determined by the addressing architecture, typically 32 bits (yielding 4 GB) in older systems or 64 bits (up to 16 exabytes, though often limited in practice) in modern ones, allowing for vast addressable ranges. It is commonly divided into distinct regions such as the text segment for executable code, the data segment for initialized variables, the heap for dynamic allocation, and the stack for function calls and local variables.

Historically, the concept originated in the late 1950s to address fragmentation and relocation challenges in early multiprogramming systems, with the first implementation in the Atlas computer at the University of Manchester in 1959.
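The pointer-width figures above follow directly from the arithmetic; a quick sketch:

```python
# Addressable range as a function of pointer width.
GIB = 2 ** 30   # gibibyte
EIB = 2 ** 60   # exbibyte

print((2 ** 32) // GIB)   # 4  -> a 32-bit pointer addresses 4 GiB
print((2 ** 64) // EIB)   # 16 -> a full 64-bit pointer addresses 16 EiB
```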

Virtual Addresses vs. Physical Addresses

Virtual addresses are generated by the CPU during program execution and represent offsets within a process's virtual address space, providing an abstraction that does not directly correspond to specific hardware memory locations. In contrast, physical addresses refer to the actual locations in physical memory, such as RAM, where data is stored and accessed by the hardware. This distinction allows each process to operate under the illusion of having its own dedicated, contiguous memory space, independent of the underlying physical memory configuration.

The mapping from virtual to physical addresses occurs at runtime through hardware and software mechanisms, primarily handled by the memory management unit (MMU) in the CPU, which consults data structures maintained by the operating system to perform the translation. For instance, the operating system decides which physical addresses correspond to each virtual address in a process, enabling dynamic allocation and protection without requiring programs to be aware of the physical layout. As a result, a virtual address like 0x1000 accessed by a process might be translated to a physical address such as 0x5000, with no fixed binding between them, allowing the same program binary to run in different memory locations across executions or systems.

A typical virtual address space is divided into regions to enforce security and functionality, such as user space for application code and data (often in the lower addresses) and kernel space for operating system components (typically in the higher addresses, such as the upper 128 TB in 64-bit Linux systems). These regions carry attributes like read-only for code segments and writable for data areas, ensuring isolation where user-mode code cannot access kernel areas directly, even though both may reside in the same virtual address space. This layout supports relocation transparency, as virtual addresses remain consistent regardless of physical memory assignments, facilitating program loading at arbitrary physical locations without code modifications.
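The 0x1000-to-0x5000 example above can be made concrete with a toy per-process mapping. This is an illustrative sketch, not a real MMU; the mappings are hypothetical values chosen by the "OS":

```python
# Toy illustration: the OS-maintained mapping decides which physical page
# backs each virtual page, so the same virtual address can land at
# different physical locations in different processes (or across runs).
PAGE_SIZE = 0x1000

# Hypothetical per-process mappings: virtual page base -> physical page base.
process_a = {0x1000: 0x5000}
process_b = {0x1000: 0x9000}

def translate(page_map, vaddr):
    page = vaddr & ~(PAGE_SIZE - 1)    # virtual page base
    offset = vaddr & (PAGE_SIZE - 1)   # byte within the page
    return page_map[page] + offset

print(hex(translate(process_a, 0x1234)))  # 0x5234
print(hex(translate(process_b, 0x1234)))  # 0x9234
```

The same virtual address, 0x1234, resolves to two different physical addresses, which is exactly the relocation transparency described above.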

Address Translation Mechanisms

Paging

Paging divides the virtual address space of a process into fixed-size units known as pages, typically 4 KB in size, allowing the operating system to map these to corresponding fixed-size blocks in physical memory called page frames. This fixed-size allocation simplifies memory management by eliminating the external fragmentation associated with variable-sized blocks, enabling non-contiguous allocation of physical memory to virtual pages.

A virtual address in paging is composed of two parts: the virtual page number (VPN), which identifies the page within the virtual address space, and the offset, which specifies the byte position within that page. The number of bits allocated to the VPN and offset depends on the page size; for a 4 KB page (2^12 bytes), the offset uses 12 bits, leaving the remaining bits for the VPN in systems with larger address spaces. The memory management unit (MMU) uses the VPN to index into a page table, a data structure that maps each VPN to a physical frame number (PFN) or indicates that the page is not present in memory.

In modern 64-bit architectures like x86-64, which typically use 48-bit virtual addresses (sign-extended to 64 bits), single-level page tables become impractical due to their size, potentially requiring gigabytes of memory for sparse address spaces. Instead, hierarchical or multi-level page tables are employed, where the VPN is split across multiple levels (e.g., two or four levels, with 5-level paging introduced by Intel in 2017 and supported in hardware since 2019 for up to 57-bit virtual addresses); each level indexes a smaller table that points to the next, ultimately leading to the leaf page table entry (PTE) containing the PFN. Each PTE also includes metadata such as presence bits, protection flags, and reference/modified bits for efficient management. The translation process involves walking these levels: the MMU shifts and masks the virtual address to extract indices for each level, fetching PTEs from physical memory, or from a translation lookaside buffer (TLB) cache if available.
If a page is not present in physical memory, accessing it triggers a page fault, an exception handled by the operating system kernel. The OS fault handler checks whether the page is valid (e.g., mapped to disk) and, if so, allocates a free physical frame, loads the page from secondary storage (such as disk or swap space), updates the PTE, and resumes the process; invalid accesses may result in segmentation faults or termination. This mechanism supports demand paging, where pages are loaded into memory only upon first access, reducing the initial memory footprint and enabling larger virtual address spaces than physical memory availability.

The physical address is computed as the PFN from the PTE multiplied by the page size, plus the offset from the virtual address:

    physical address = (PFN × page size) + offset

This formula ensures byte-level alignment within the frame.

Paging variants enhance efficiency in specific scenarios. Demand paging, as noted, defers loading until needed, often combined with page replacement algorithms like least recently used (LRU) to evict frames when physical memory is full. Copy-on-write (COW) allows multiple processes to share the same physical pages initially (e.g., after a fork operation), marking them read-only in PTEs; upon a write attempt to a shared page, a fault triggers the OS to copy the page into a new frame for the writing process, preserving isolation while minimizing initial duplication overhead.
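The multi-level walk and the formula above can be simulated in a few lines. This is a toy model, not real hardware: nested dicts stand in for the table levels, an integer leaf stands in for the PFN in a PTE, and the example VPN indices are arbitrary:

```python
PAGE_SIZE = 4096      # 4 KB pages: 12-bit offset
OFFSET_BITS = 12
LEVEL_BITS = 9        # x86-64 style: 9 index bits per level, 4 levels = 48-bit VA

class PageFault(Exception):
    pass

def walk(root, vaddr):
    """Translate a 48-bit virtual address via a 4-level table (toy model)."""
    vpn = vaddr >> OFFSET_BITS
    offset = vaddr & (PAGE_SIZE - 1)
    # Extract the four 9-bit level indices, highest level first.
    indices = [(vpn >> (LEVEL_BITS * lvl)) & ((1 << LEVEL_BITS) - 1)
               for lvl in range(3, -1, -1)]
    node = root
    for idx in indices:
        node = node.get(idx)
        if node is None:              # PTE not present: raise a fault
            raise PageFault(hex(vaddr))
    pfn = node                        # leaf entry holds the frame number
    return pfn * PAGE_SIZE + offset   # physical = (PFN * page size) + offset

# Hypothetical sparse table mapping a single page: indices (0,0,0,5) -> PFN 42.
root = {0: {0: {0: {5: 42}}}}

vaddr = (5 << OFFSET_BITS) | 0x123    # VPN 5, offset 0x123
print(hex(walk(root, vaddr)))         # 0x2a123  (42 * 4096 + 0x123)

try:
    walk(root, 0xDEADBEEF)            # unmapped address
except PageFault:
    print("page fault")
```

The sparse dict mirrors why multi-level tables save memory: only the table nodes on paths to mapped pages exist, instead of one giant flat array covering the whole address space.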

Segmentation

Segmentation divides the virtual address space into variable-sized segments that correspond to logical units of a program, such as code, data, and stack sections. Unlike fixed-size paging, each segment can have a different length tailored to the needs of the program module, promoting modularity and easier sharing of code or data between processes. A virtual address in a segmented architecture consists of a segment selector, which identifies the segment, and an offset, which specifies the location within that segment. In modern 64-bit x86 systems, segmentation is simplified to a flat model, where most segment bases are zero and limits are ignored except for compatibility and specific uses like thread-local storage (FS/GS segments).

The operating system maintains a segment table, also known as a descriptor table, which stores entries for each segment. Each segment descriptor includes the base address where the segment resides in memory, the limit defining the segment's size, and access rights such as read-only for code segments or read-write for data segments. In architectures like the Intel x86, these descriptors are held in the Global Descriptor Table (GDT) or Local Descriptor Table (LDT), with the segment selector serving as an index into the appropriate table.

During address translation, the hardware uses the segment selector to retrieve the corresponding descriptor from the segment table. It then verifies that the offset does not exceed the segment's limit; if it does, a protection fault occurs. Upon successful validation, the base address from the descriptor is added to the offset to compute the linear address. This enables direct mapping in pure segmentation, or serves as an initial step before further paging translation in combined systems. Pure segmentation maps segments directly to contiguous physical memory regions, which supports logical program structure but can suffer from external fragmentation as free memory holes form between allocated segments.
To mitigate this, segmentation is frequently combined with paging, where each segment is subdivided into fixed-size pages that are mapped non-contiguously; this hybrid approach, as implemented in x86 segmented paging, eliminates external fragmentation while retaining the benefits of logical division.

Historically, segmentation gained prominence in the Multics operating system, developed in the 1960s, where it allowed for dynamic linking and sharing of procedures and data across processes in a time-sharing environment. This design influenced subsequent systems, including early architectures like the Intel 8086, which introduced segmentation to expand the addressable memory beyond the limitations of flat addressing.

In combined segmentation and paging schemes, internal fragmentation arises when a segment's size is not an exact multiple of the page size, leaving unused space in the final page of the segment. This inefficiency is typically limited to one page per segment but can accumulate in systems with many small segments.
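The base-plus-offset translation with a limit check can be sketched as follows. The segment table values here are hypothetical, chosen only to illustrate the mechanism:

```python
# Toy segment table: selector -> (base, limit, rights). Values are made up.
segments = {
    "code": (0x0000, 0x0FFF, "r-x"),
    "data": (0x8000, 0x1FFF, "rw-"),
}

class SegmentationFault(Exception):
    pass

def seg_translate(selector, offset):
    base, limit, _rights = segments[selector]
    if offset > limit:                 # limit check precedes the add
        raise SegmentationFault(f"{selector}+{hex(offset)}")
    return base + offset               # linear address = base + offset

print(hex(seg_translate("data", 0x10)))   # 0x8010
try:
    seg_translate("code", 0x2000)         # beyond the code segment's limit
except SegmentationFault:
    print("fault")
```

In a combined scheme, the linear address produced here would then go through the paging translation described in the previous section.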

Benefits and Limitations

Advantages

Virtual address spaces provide memory protection by isolating each process in its own independent address space, preventing unauthorized access to other processes' memory through hardware-enforced mechanisms such as access bits in page or segment tables that specify permissions like read, write, or execute. This isolation enhances system stability by ensuring that faults or malicious actions in one process do not corrupt others.

Efficient multitasking is enabled by virtual address spaces, which allow the operating system to overcommit physical memory by allocating more virtual memory to processes than is physically available, swapping infrequently used pages to disk as needed. This approach supports running more processes simultaneously than the available RAM would otherwise permit, improving overall resource utilization without requiring all processes to be resident at once.

Virtual address spaces offer abstraction from physical memory constraints, presenting programs with a large, contiguous address space that hides the underlying hardware details and fragmentation issues. This portability allows applications written for a standard virtual memory model to execute unchanged across diverse hardware platforms, as the operating system handles the mapping to physical resources.

Programming is simplified by virtual address spaces, as developers can allocate and use large, contiguous regions without managing physical layout, relocation, or fragmentation manually. Linking and loading become straightforward since each program operates within a consistent address space, reducing the complexity of relocation logic in application code.

Resource sharing is facilitated through mechanisms like shared memory mappings, where multiple processes can access the same physical pages via their virtual address spaces, and copy-on-write techniques that initially share pages during process forking and duplicate them only upon modification. This enables efficient interprocess communication and reduces memory duplication for common resources like libraries or data structures.

Challenges and Overhead

One significant challenge in virtual address space management is the translation overhead incurred during memory access. Each virtual address reference typically requires traversing the page table hierarchy via the memory management unit (MMU), which can involve multiple memory accesses and add substantial latency to every load or store operation. In multi-level page tables, this process may demand four or more memory references per translation on a TLB miss, potentially leading to performance losses of up to 50% in data-intensive workloads.

Page tables themselves impose considerable memory overhead, consuming physical RAM to store mappings for the entire virtual address space. In 64-bit systems, multi-level page tables with four or five levels can require gigabytes of memory if fully populated, even though only a fraction of the address space is used, exacerbating pressure in memory-constrained environments. For instance, forward-mapped page tables for large virtual address spaces become impractical due to this growth in storage needs.

Page fault handling introduces further performance degradation, as unmapped virtual addresses trigger interrupts that necessitate context switches and potentially disk I/O to load missing pages. This overhead can escalate dramatically under thrashing conditions, where excessive paging activity occurs due to overcommitted memory, leading to high page fault rates, prolonged wait times, and reduced CPU utilization as the system spends more time swapping pages than executing useful work.

Fragmentation remains an issue despite the contiguity provided in virtual address spaces. Paging eliminates external fragmentation in physical memory by allowing non-contiguous allocation of pages, but it introduces internal fragmentation, where allocated pages contain unused space because allocations are rounded up to fixed page sizes, wasting memory within each frame.

Security vulnerabilities can arise from improper management of virtual address mappings, enabling exploits such as buffer overflows that corrupt adjacent memory regions within the process, potentially altering control flow or enabling arbitrary code execution. Scalability challenges intensify with larger address spaces in 64-bit systems, where the complexity of managing multi-level page tables grows, increasing both translation latency and administrative overhead for operating systems handling terabyte-scale address spaces.
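The translation-overhead claim can be put into a back-of-envelope model. The latency figures and hit rates below are hypothetical, chosen only to show how TLB miss rate drives effective access time:

```python
# Back-of-envelope cost model for address translation with a TLB.
# A TLB hit costs just the data access; a miss adds one memory access
# per page-table level walked before the data access.
MEM_NS = 100          # single memory access latency (assumed value)
LEVELS = 4            # 4-level page table

def effective_access_ns(tlb_hit_rate):
    hit = MEM_NS                      # translation cached: data access only
    miss = MEM_NS * (LEVELS + 1)      # walk 4 levels, then access the data
    return tlb_hit_rate * hit + (1 - tlb_hit_rate) * miss

print(effective_access_ns(0.99))      # ~104 ns: near-ideal with a warm TLB
print(effective_access_ns(0.50))      # 300.0 ns: walk cost dominates
```

Even a modest drop in TLB hit rate multiplies the effective memory latency, which is why TLB-unfriendly access patterns show the large slowdowns described above.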

Implementations in Operating Systems

Unix-like Systems

In Unix-like systems, each process operates within its own isolated, flat virtual address space, providing abstraction from physical memory constraints and enabling multitasking through kernel-managed paging. The kernel maintains per-process page tables to translate virtual addresses to physical ones, supporting demand paging where pages are allocated and loaded only upon access to optimize resource use. Mappings of files, devices, or anonymous regions into this space are handled via the POSIX-standard mmap() system call, which integrates seamlessly with the paging system for efficient I/O and sharing.

The typical layout of a process's virtual address space follows a conventional structure to facilitate binary loading and runtime growth. For executables in the ELF format, common on Unix-like systems, the loader maps the following segments into the virtual address space starting at low addresses, with the heap and stack positioned higher up:
  • Text segment: Contains the executable code.
  • Rodata segment: Contains read-only data, such as string constants.
  • Data segment: Contains initialized data variables.
  • BSS segment: Contains uninitialized data, which is allocated and zero-initialized at runtime.
  • Heap: Used for dynamic allocations, begins after the BSS segment and expands upward via the brk() or sbrk() system calls, which adjust the break point—the end of the data segment.
  • Stack: For function calls and local variables, starts near the top of the user address space and grows downward, ensuring separation from the heap to prevent unintended overlaps. The initial stack layout includes argc at the base, followed by pointers to argv (the argument strings), environment pointers (envp), and an auxiliary vector.
Key system calls underpin address space manipulation and process creation. The fork() call duplicates the parent process, establishing a new address space for the child using copy-on-write (COW) semantics, where physical pages are shared until modified to minimize initial overhead. Following fork(), execve() overlays a new program image, discarding the prior address space and loading the executable's segments into fresh virtual regions per the ELF headers. These mechanisms align with POSIX standards, ensuring portability across implementations.

Memory management in Unix-like systems emphasizes flexibility and protection. Virtual memory overcommitment permits processes to allocate more virtual memory than physically available, relying on heuristics to approve requests and using swap space to offload inactive pages to disk during memory pressure. Page protections, such as read-only for text or no-access for guard pages, are enforced via mprotect(), which updates page table entries to trigger faults on violations, enhancing security and stability. In modern 64-bit implementations, user address spaces extend vastly, for example up to 128 terabytes on x86-64 Linux, far exceeding early constraints.

Historically, virtual memory evolved from early Unix's fixed, small address spaces, limited to swapping entire processes in versions like Sixth Edition (1975), to demand-paged systems in the late 1970s (beginning with 3BSD on the VAX), introducing per-page mappings for larger, sparse spaces.

A core Unix principle ties each process ID (PID) to a distinct address space, isolating execution contexts; the kernel handles page faults internally for valid accesses or dispatches signals like SIGSEGV for invalid ones, allowing user-level error recovery. This design, rooted in POSIX compliance, persists in contemporary systems for robust, efficient memory handling.
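The copy-on-write isolation after fork() can be observed directly. A minimal POSIX-only sketch (it assumes os.fork() is available, so it will not run on Windows):

```python
import os

# After fork(), parent and child have separate virtual address spaces whose
# pages are shared copy-on-write: the child's writes go to private copies,
# invisible to the parent.
value = [100]

pid = os.fork()
if pid == 0:
    value[0] = 999        # modifies the child's private copy of the page
    os._exit(0)           # child exits without touching the parent

os.waitpid(pid, 0)        # parent waits for the child to finish
print(value[0])           # 100: the parent's page was never modified
```

The same sharing that makes fork() cheap also preserves isolation: only the writer pays for a private copy, and only for the pages it actually touches.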

Windows

In Windows, the virtual address space implementation originated with the Windows NT kernel in version NT 3.1, released in 1993, which introduced protected subsystems to isolate user-mode environments while providing a robust framework inherited from earlier designs like VMS. This foundation evolved to support per-process virtual address spaces, ensuring isolation between applications and the kernel.

For 32-bit processes on Windows, each process receives a 4 GB virtual address space, split into 2 GB for user-mode code and data and 2 GB reserved for kernel-mode components. In contrast, 64-bit processes on modern Windows versions utilize a much larger 128 TB user-mode virtual address space (addresses from 0 up to approximately 0x00007FFFFFFFFFFF), with an additional 128 TB allocated for kernel-mode, leveraging 48-bit canonical addressing within the theoretical 2^64 limit. This expansion accommodates WOW64 (32-bit compatibility mode on 64-bit systems) and native 64-bit applications; large-address-aware 32-bit binaries can access up to 4 GB in such environments.

The layout of the virtual address space in a Windows process is organized around the Portable Executable (PE) format, where the loader maps the executable image and dependent DLLs into low virtual addresses starting near 0x00010000, followed by reserved regions for modules and data sections. The PE format includes key sections such as .text for executable code, .data for initialized data, and .rdata for read-only data (equivalent to rodata). Heaps are dynamically allocated using functions like HeapAlloc, typically in the mid-range addresses, while each thread maintains its own stack, defaulting to 1 MB in user mode and allocated at higher addresses near 0x7FFF0000 in 32-bit processes to avoid conflicts with growing heaps. The stack also holds command-line arguments, parsed into argc (argument count) and argv (argument vector) by the C runtime library for the main function.
Key application programming interfaces (APIs) for managing virtual address space include VirtualAlloc, which reserves and optionally commits regions of virtual memory with specified protections, and VirtualProtect, which modifies access rights (e.g., read, write, execute) on committed pages without reallocating. During process creation via CreateProcess, Windows employs copy-on-write semantics for shared pages between parent and child, where initial mappings point to the same physical pages marked as read-only, and writes trigger private copies to maintain isolation.

Memory management in Windows tracks physical page usage through working sets, which represent the set of pages actively resident in RAM for a process; the kernel adjusts these dynamically to prioritize frequently accessed pages and minimize thrashing. Swapping to disk occurs via pagefile.sys, a system-managed paging file that stores evicted pages when physical RAM is exhausted, with sizing recommendations based on commit charge and crash dump needs. For memory-intensive applications exceeding standard virtual limits, Address Windowing Extensions (AWE) enable direct mapping of large physical memory blocks (up to available RAM) into user space using APIs like AllocateUserPhysicalPages, bypassing the pagefile for non-paged allocations.

Distinct from Unix-like systems, Windows enforces session isolation, where processes in different Terminal Services sessions (e.g., remote desktops) operate in separate namespaces, preventing cross-session address space access even for shared objects. Additionally, job objects group related processes to impose collective limits on memory commitments, working sets, and CPU usage, facilitating enterprise resource governance.
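The 48-bit canonical split behind the 128 TB figures can be checked arithmetically. A sketch of the layout math (the address constants follow the 48-bit scheme described above; exact per-version Windows limits differ slightly):

```python
# Arithmetic sketch of 48-bit canonical addressing.
TIB = 2 ** 40

user_top = 0x0000_7FFF_FFFF_FFFF   # highest user-mode address in the 48-bit split
user_size = user_top + 1           # bytes in the user half

print(user_size // TIB)            # 128 (TiB of user-mode address space)

def is_canonical(addr, va_bits=48):
    """Canonical iff bits above va_bits-1 are a sign extension of bit va_bits-1."""
    top = addr >> (va_bits - 1)
    return top == 0 or top == (1 << (64 - va_bits + 1)) - 1

print(is_canonical(0x0000_7FFF_FFFF_FFFF))   # True  (top of user space)
print(is_canonical(0xFFFF_8000_0000_0000))   # True  (bottom of kernel space)
print(is_canonical(0x0000_8000_0000_0000))   # False (inside the non-canonical hole)
```

Addresses in the non-canonical hole between the two halves fault on access, which is why the user and kernel ranges sit at the extreme bottom and top of the 64-bit space.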
