Memory address
from Wikipedia
In a computer using virtual memory, accessing the location corresponding to a memory address may involve many levels.

In computing, a memory address is a reference to a specific memory location used by both software and hardware.[1] These addresses are fixed-length sequences of digits, typically displayed and handled as unsigned integers. This numerical representation is based on features of the CPU (such as the instruction pointer and incremental address registers). Programming language constructs often treat the memory like an array.

Types

Physical addresses

A digital computer's main memory consists of many memory locations, each identified by a unique physical address (a specific code). The CPU or other devices can use these codes to access the corresponding memory locations. Generally, only system software (such as the BIOS, operating systems, and specialized utility programs like memory testers) directly addresses physical memory using machine code instructions or processor registers. These instructions tell the CPU to interact with a hardware component called the memory controller. The memory controller manages access to memory using the memory bus or a system bus, or through separate control, address, and data buses, to execute the program's commands. The bus managed by the memory controller consists of multiple parallel lines, each representing a binary digit (bit).

Logical addresses

A computer program uses memory addresses to execute machine code, and to store and retrieve data. In early computers, logical addresses (used by programs) and physical addresses (actual locations in hardware memory) were the same. However, with the introduction of virtual memory, most application programs do not deal directly with physical addresses. Instead, they use logical or virtual addresses, which are translated to physical addresses by the computer's memory management unit (MMU) and the operating system's memory mapping mechanisms.

Unit of address resolution

Most modern computers are byte-addressable. Each address identifies a single 8-bit byte (octet) of storage. Data larger than a single byte may be stored in a sequence of consecutive addresses. There exist word-addressable computers, where the minimal addressable storage unit is exactly the processor's word.[a] For example, the Data General Nova minicomputer, and the Texas Instruments TMS9900 and National Semiconductor IMP-16 microcomputers, used 16-bit words, and there are many old mainframe computers that use 36-bit word addressing (such as the IBM 7090, with 15-bit word addresses, giving an address space of 2^15 36-bit words, approximately 144 KiB of storage, and the DEC PDP-6/PDP-10, with 18-bit word addresses, giving an address space of 2^18 36-bit words, approximately 1 megabyte of storage), not byte addressing. The range of addressing of memory depends on the bit size of the bus used for addresses – the more bits used, the more addresses are available to the computer. For example, an 8-bit-byte-addressable machine with a 20-bit address bus (e.g. Intel 8086) can address 2^20 (1,048,576) memory locations, or one MiB of memory, while a 32-bit bus (e.g. Intel 80386) addresses 2^32 (4,294,967,296) locations, or a 4 GiB address space. In contrast, a 36-bit word-addressable machine with an 18-bit address bus addresses only 2^18 (262,144) 36-bit locations (9,437,184 bits), equivalent to 1,179,648 8-bit bytes, or 1152 KiB, or 1.125 MiB – slightly more than the 8086.

A small number of older machines are bit-addressable. For example, a variable field length (VFL) instruction on the IBM 7030 "Stretch" specifies a bit address, a byte size of 1 to 8 bits, and a field length.

Some older computers (decimal computers) are decimal digit-addressable. For example, each address in the IBM 1620's magnetic-core memory identified a single six-bit binary-coded decimal digit, consisting of a parity bit, flag bit and four numerical bits.[2] The 1620 used 5-digit decimal addresses, so in theory the highest possible address was 99,999. In practice, the CPU supported 20,000 memory locations, and up to two optional external memory units could be added, each supporting 20,000 addresses, for a total of 60,000 (00000–59999).

Some older computers are character-addressable, with 6-bit BCD characters containing a 2-bit zone and a 4-bit digit; the characters in an address only have digit values representing 0–9. Typically some of the zone bits are part of the address and some are used for other purposes, e.g., index register, indirect address.[3]

Some older computers are decimal-word addressable, typically with 4-digit addresses.[4] In some machines the address fields also select index registers, restricting the range of possible addresses.[5]

Word size versus address size

Word size is a characteristic of computer architecture denoting the number of bits that a CPU can process at one time. Modern processors, including embedded systems, usually have a word size of 8, 16, 24, 32 or 64 bits; most current general-purpose computers use 32 or 64 bits. Many different sizes have been used historically, including 8, 9, 10, 12, 18, 24, 36, 39, 40, 48 and 60 bits.

Very often, when referring to the word size of a modern computer, one is also describing the size of address space on that computer. For instance, a computer said to be "32-bit" also usually allows 32-bit memory addresses; a byte-addressable 32-bit computer can address 2^32 = 4,294,967,296 bytes of memory, or 4 gibibytes (GiB). This allows one memory address to be efficiently stored in one word.

However, this does not always hold true. Computers can have memory addresses larger or smaller than their word size. For instance, many 8-bit processors, such as the MOS Technology 6502, supported 16-bit addresses; otherwise, they would have been limited to a mere 256 bytes of memory addressing. The 16-bit Intel 8088 and Intel 8086 supported 20-bit addressing via segmentation, allowing them to access 1 MiB rather than 64 KiB of memory. All Intel Pentium processors since the Pentium Pro include Physical Address Extension (PAE), which supports mapping 32-bit virtual addresses to 36-bit physical addresses. Many early LISP implementations on, e.g., 36-bit processors, held 2 addresses per word as the result of a cons. Some early processors held 2 and even 3 addresses per instruction word.

In theory, modern byte-addressable 64-bit computers can address 2^64 bytes (16 exbibytes), but in practice the amount of memory is limited by the CPU, the memory controller, or the printed circuit board design (e.g., number of physical memory connectors or amount of soldered-on memory).

Contents of each memory location

Each memory location in a stored-program computer holds a binary number or decimal number of some sort. Its interpretation and use, whether as data of some data type or as an instruction, are determined by the instructions which retrieve and manipulate it.

Some early programmers combined instructions and data in words as a way to save memory, when it was expensive: The Manchester Mark 1 had space in its 40-bit words to store little bits of data – its processor ignored a small section in the middle of a word – and that was often exploited as extra data storage.[citation needed] Self-replicating programs such as viruses treat themselves sometimes as data and sometimes as instructions. Self-modifying code is generally deprecated nowadays, as it makes testing and maintenance disproportionately difficult relative to the saving of a few bytes, and can also give incorrect results because of the compiler or processor's assumptions about the machine's state, but is still sometimes used deliberately, with great care.

Address space in application programming

In a modern multitasking environment, an application process usually has in its address space (or spaces) chunks of memory of several types, such as machine code, static data, heap, and stack.

Some parts of the address space may not be mapped at all.

Some systems have a "split" memory architecture where machine code, constants, and data are in different locations, and may have different address sizes. For example, PIC18 microcontrollers have a 21-bit program counter to address machine code and constants in Flash memory, and 12-bit address registers to address data in SRAM.

Addressing schemes

A computer program can access an address given explicitly – in low-level programming this is usually called an absolute address, or sometimes a specific address, and corresponds to the pointer data type in higher-level languages. But a program can also use a relative address, which specifies a location in relation to somewhere else (the base address). There are many more indirect addressing modes.

Mapping logical addresses to physical and virtual memory also adds several levels of indirection; see below.

Memory models

Many programmers prefer to address memory such that there is no distinction between code space and data space (see above), or between physical and virtual memory (see above) – in other words, numerically identical pointers refer to exactly the same byte of RAM.

However, many early computers did not support such a flat memory model — in particular, Harvard architecture machines force program storage to be completely separate from data storage. Many modern DSPs (such as the Motorola 56000) have three separate storage areas — program storage, coefficient storage, and data storage. Some commonly used instructions fetch from all three areas simultaneously — fewer storage areas (even if there were the same total bytes of storage) would make those instructions run slower.

Memory models in x86 architecture

Early x86 processors use a segmented memory model, with addresses based on a combination of two numbers: a memory segment, and an offset within that segment.

Some segments are implicitly treated as code segments, dedicated for instructions, stack segments, or normal data segments. Although the usages are different, the segments do not have different memory protections reflecting this. In the flat memory model all segments (segment registers) are generally set to zero, and only offsets are variable.

Memory models in IBM S/360 and successors multiprocessors

In the 360/65 and 360/67, IBM introduced a concept known as prefixing.[6] Prefixing is a level of address translation that applies to addresses in real mode and to addresses generated by dynamic address translation, using a unique prefix assigned to each CPU in a multiprocessor system. On the 360/65, 360/67 and every successor prior to z/Architecture, it logically swaps a 4096-byte block of storage with another block assigned to the CPU. On z/Architecture,[7] prefixing operates on 8192-byte blocks. IBM classifies addresses on these systems as:[8]

  • Virtual addresses: addresses subject to dynamic address translation
  • Real addresses: addresses generated from dynamic address translation, and addresses used by code running in real mode
  • Absolute addresses: physical addresses

On the 360/65, on S/370 models without DAT and when running with translation turned off, there are only a flat real address space and a flat absolute address space.

On the 360/67, S/370 and successors through S/390, when running with translation on, addresses contain a segment number, a page number and an offset. Although early models supported both 2 KiB and 4 KiB page sizes, later models only supported 4 KiB. IBM later added instructions to move data between a primary address space and a secondary address space.

S/370-XA added 31-bit addresses, but retained the segment/page/offset hierarchy with 4 KiB pages.

ESA/370 added 16 access registers (ARs) and an AR access control mode, in which a 31-bit address was translated using the address space designated by a selected AR.

z/Architecture supports 64-bit virtual, real and absolute addresses, with multi-level page tables.

from Grokipedia
A memory address is a unique numerical identifier that specifies a particular location in a computer's main memory, where data or instructions are stored and retrieved. In modern systems, these addresses enable the central processing unit (CPU) to access specific bytes within the address space, typically organized as a linear sequence of addressable units. Addresses are fundamentally binary values, though they are often represented in hexadecimal notation for human readability and debugging purposes. During program execution, the CPU transmits memory addresses over the address bus to memory modules, indicating the exact location for reading or writing operations. Compilers or interpreters map variable names in source code to these hardware-defined memory addresses, facilitating the storage and manipulation of data. The size of a memory address determines the maximum addressable memory space; for instance, a 32-bit address supports up to 4 gigabytes of memory, while 64-bit addresses accommodate vastly larger capacities in contemporary systems. Modern operating systems employ virtual memory addressing to abstract physical memory limitations, where virtual addresses generated by programs are translated to physical addresses by the memory management unit (MMU). This mechanism enhances security through address space isolation, prevents direct access to physical hardware, and supports features like paging and segmentation for efficient memory allocation. Pointers, as variables that hold memory addresses, play a crucial role in dynamic memory management, allowing indirect access to data structures and enabling advanced programming techniques such as linked lists and recursion.

Fundamentals

Definition and Role

A memory address is a unique numerical identifier that specifies a particular byte or word within a computer's memory, enabling the central processing unit (CPU) to locate and access stored data or instructions. This identifier functions as a reference point in the address space, allowing precise targeting of storage locations for efficient retrieval and manipulation. The concept of memory addressing emerged in the context of the von Neumann architecture, first outlined in a 1945 report that proposed a stored-program design where both instructions and data reside in the same memory unit, differentiated solely by their addresses. In this foundational model, addresses serve to organize memory as a linear array of cells, each capable of holding fixed-size units of information, thereby supporting the sequential execution of programs. In modern computing, memory addresses are essential for random-access memory (RAM) operations, which underpin program execution and data processing by permitting direct, non-sequential access to any memory location. Key operations facilitated by addresses include loading, where data is fetched from a specified address into a CPU register for processing; storing, which writes data from a register back to the designated address; and jumping, a control flow mechanism that redirects program execution to an instruction located at a particular address.

Address Representation

Memory addresses are fundamentally represented as binary numbers, consisting of a fixed number of bits that uniquely identify locations in a computer's address space. In binary form, an address is a sequence of 0s and 1s; for instance, a 32-bit address spans 32 bits, allowing for 2^32 distinct locations. This binary representation directly corresponds to the hardware's addressing mechanism, where each bit position contributes to the positional value in base-2. For human readability and compactness, addresses are commonly displayed and manipulated in hexadecimal notation, where each group of four binary bits (a nibble) is represented by a single digit ranging from 0-9 or A-F. This format condenses an 8-bit byte into two digits, making it efficient for representing memory locations; for example, the binary address 11111111 (255 in decimal) is written as 0xFF. Hexadecimal is the standard in programming and debugging tools because it aligns closely with binary while being more concise. The width of an address, measured in bits, determines the total addressable memory in a system, calculated as 2^n bytes, where n is the number of address bits. A 32-bit address width supports up to 4 GB (2^32 = 4,294,967,296 bytes) of memory, sufficient for many early personal computers but limiting for modern applications. In contrast, a 64-bit address width theoretically enables 2^64 bytes, or approximately 18 exabytes, though practical implementations often use fewer bits (e.g., 48 bits) due to hardware constraints. This scaling is crucial for handling large datasets in contemporary systems. Memory addresses are typically treated as unsigned integers, interpreting the full bit range as positive values from 0 to 2^n - 1, which aligns with their role in indexing non-negative locations.
However, in certain architectures, signed representations come into play for offset calculations in addressing modes, where negative offsets allow relative addressing backward from a base register (e.g., for accessing local variables or loop counters). This signed usage does not alter the absolute address itself but affects computations during address generation. When storing multi-byte addresses in memory, such as in pointer variables or data structures, the system's endianness dictates the byte order. In big-endian architectures, the most significant byte (MSB) is stored at the lowest memory address, mimicking human reading order (e.g., the number 0x12345678 has 0x12 at the base address). Conversely, little-endian systems, common in x86 processors, store the least significant byte (LSB) first (e.g., 0x78 at the base address). This ordering impacts address portability across systems and requires careful handling in network protocols or cross-platform code.

Address Types

Physical Addresses

A physical address is a hardware-specific identifier that corresponds directly to an actual location in the main memory, such as RAM, where data or instructions are stored. It represents the real, tangible position in the memory hardware, distinct from any software-generated abstractions. Physical addresses are produced by the memory management unit (MMU), a hardware component in the CPU that maps higher-level addresses to these concrete locations after any necessary translation. The range of possible physical addresses is constrained by the system's installed RAM size and the width of the address bus; for instance, a system with 8 GB of RAM typically utilizes up to 33 bits for addressing, as 2^33 bytes equals 8 GB. To access memory, the CPU transmits the physical address over the address bus to the memory controller, which decodes it into components like bank, row, and column selectors for DRAM chips, enabling direct hardware-level read or write operations without intervening layers. This mapping ensures precise targeting of storage cells in the DRAM array via signals such as row address strobe (RAS) and column address strobe (CAS). Physical addresses are inherently limited by the fixed hardware configuration, such as the number of address pins on the processor and the total capacity of installed RAM modules, which cannot be expanded without physical upgrades. Additionally, physical memory allocation is susceptible to fragmentation, where free memory becomes scattered into non-contiguous blocks due to repeated allocations and deallocations, complicating the provision of large contiguous regions needed for optimizations like huge pages or power-saving modes in modern DRAM. In contrast to virtual addresses, physical addresses provide no protection or relocation flexibility.

Virtual Addresses

A virtual address is an identifier generated by software, such as a program or operating system, to reference a location within a process's virtual address space, independent of the underlying physical memory layout. This abstraction allows each process to operate as if it has exclusive access to a dedicated, contiguous memory region, typically starting from 0x00000000 and extending to the maximum size defined by the system's addressing architecture, such as 4 GB for 32-bit systems. The primary purpose of virtual addresses is to provide isolation, enable efficient sharing of physical resources among multiple processes, and create the illusion of a large, contiguous memory that exceeds available physical RAM. By isolating each process's address space, virtual addressing prevents unauthorized access between processes, enhancing system security and stability. Additionally, it supports mechanisms like demand paging, where only actively needed portions of a program are loaded into physical memory, optimizing resource utilization. Virtual addresses are generated during the compilation and linking phases of program development, where the toolchain assigns relative offsets within the program's code, data, and stack segments to form a complete address space for the process. Upon execution, the operating system assigns this address space to the process, ensuring isolation from other processes' spaces. In early virtual memory systems, this generation was foundational to supporting multiprogramming environments with dynamic memory allocation. Key advantages of virtual addresses include simplifying memory management by abstracting away physical memory constraints, such as fragmentation or limited capacity, thereby allowing programmers to focus on logical memory needs without hardware-specific optimizations. This approach also facilitates portability across different hardware platforms and supports advanced features like memory-mapped files and shared libraries. Virtual addresses are translated to physical addresses through hardware mechanisms like the memory management unit (MMU), a process detailed in subsequent sections on address resolution.

Address Resolution

Translation Mechanisms

Address translation is the process by which virtual addresses generated by programs are mapped to physical addresses in main memory, enabling isolation, protection, and efficient memory utilization. This conversion is primarily handled by the memory management unit (MMU), a hardware component that uses data structures such as page tables for paging or segment descriptors for segmentation to perform the mapping. The MMU intercepts every memory access, translating the virtual address to a physical one before forwarding it to the memory system, which ensures that processes operate within their allocated memory regions without interfering with others. In paging systems, memory is partitioned into fixed-size units called pages, typically 4 KB in size, to simplify allocation and management. A virtual address is divided into two parts: the virtual page number (VPN), which identifies the page, and the page offset, which specifies the byte within the page. The VPN is computed as the virtual address divided by the page size (discarding the remainder), and it serves as an index into a page table that maps the VPN to a physical frame number (PFN). The resulting physical address is then PFN × page size + offset. To support large address spaces without excessive memory overhead, multi-level page tables are employed, where the VPN is split across multiple levels (e.g., page directory and page table indices) for hierarchical lookup, reducing the size of each table while covering vast virtual spaces. Segmentation provides an alternative mechanism where memory is organized into variable-sized segments, each representing logical units such as code, data, or stack sections. A virtual address consists of a segment selector and an offset; translation adds the base address stored in a segment register (or descriptor table) to the offset, yielding the physical address, while bounds checks ensure the offset does not exceed the segment's limit.

This approach allows flexible allocation aligned with program structure but can lead to external fragmentation due to varying segment sizes. To mitigate the latency of table lookups, which can involve multiple memory accesses, the Translation Lookaside Buffer (TLB) acts as a small, fast hardware cache holding recent virtual-to-physical mappings. On a TLB hit, translation completes in a single cycle; misses trigger a page table walk, incurring significant delays that can degrade overall system performance if hit rates fall below 99% in typical workloads. Modern processors employ multi-level TLBs and prefetching techniques to boost coverage and hit rates. During context switching, when the operating system changes the active process, the MMU loads a new set of translation structures, necessitating TLB flushing or invalidation to prevent stale entries from the previous process's address space from causing incorrect mappings or security violations. This operation, often implemented via inter-processor interrupts in multiprocessor systems, ensures address space isolation but introduces overhead, prompting optimizations like process-tagged TLB entries to avoid full flushes. The translated physical address ultimately determines the location in main memory where data is read or written.

Unit of Resolution

In computer architecture, the unit of resolution refers to the smallest addressable element in memory, which determines the granularity of access. Most modern systems are byte-addressable, where each individual byte (8 bits) has a unique memory address, allowing precise access to sub-word portions of data. This design supports flexible handling of variable-sized data types and is standard in modern general-purpose architectures. In contrast, older systems were often word-addressable, where the smallest unit is a multi-bit word, such as the 12-bit words in the PDP-8 or 64-bit words in some early supercomputers, requiring accesses to entire words rather than individual bytes. The CPU's word size, typically 32 or 64 bits in contemporary processors, influences access efficiency but does not alter the underlying addressing granularity in byte-addressable systems. For instance, when accessing a full word in a byte-addressable memory, the address increments by the word size in bytes, such as 8 bytes (2^3) for a 64-bit word, to reach the next aligned word; in other words, the address increment for the next word equals the word size in bytes. This ensures that multi-byte data structures are fetched efficiently without partial byte overlaps, though the total number of addressable units depends on the address width (e.g., 64 bits allowing up to 2^64 bytes). Alignment requirements further impact resolution by mandating that data accesses start at addresses that are multiples of the data type's size to avoid penalties. Unaligned accesses, such as loading a 4-byte integer from an odd-byte boundary, can incur performance penalties on certain architectures; modern x86 processors handle them efficiently with minimal overhead, while stricter platforms may trap or slow down significantly. Compilers often insert padding bytes to enforce alignment, optimizing for the native word size.

Data types are sized relative to the word to facilitate efficient access; for example, in a 64-bit system, a 32-bit integer occupies 4 bytes, a single-precision float uses 4 bytes, and a double-precision float spans 8 bytes, all addressable at byte granularity but ideally aligned to their size for optimal performance. This sizing allows sub-word operations without wasting space, though it requires careful management to prevent alignment issues.

Memory Organization

Address Spaces

In computing, an address space refers to the range of addresses available to a process for referencing memory locations, serving as an abstraction that provides each program with a private view of memory. This space can be structured as a contiguous range, such as from 0 to 2^32 - 1 in typical 32-bit systems, encompassing 4 gigabytes of potential addresses. Alternatively, it may employ a segmented layout to organize different regions logically. The layout of an address space commonly divides into user space and kernel space to ensure isolation and protection. User space occupies the lower portion of the address range, accessible only by the owning process, while kernel space resides in the upper portion, shared across processes and reserved for operating system operations. Within user space, key segments include the code (text) segment at lower addresses for executable instructions, followed by the data segment, the heap for dynamic allocations that grows upward toward higher addresses, and the stack at the high end that grows downward to accommodate function calls and local variables. This opposing growth direction between heap and stack helps prevent collisions as memory usage expands. In 64-bit systems such as Linux on x86-64, the virtual address space per process typically uses 48-bit addresses with 4-level paging, yielding a total of 256 terabytes, with user space allocated 128 terabytes (from 0 to 2^47 - 1) and kernel space the remaining 128 terabytes starting at higher addresses. Support for 5-level paging, available since Linux 4.15 (as of 2025), extends this to 57-bit virtual addresses (128 pebibytes total), with user space up to 64 pebibytes (0 to 2^56 - 1). These virtual addresses within the space are mapped to physical memory via hardware mechanisms such as page tables. To optimize resource utilization, operating systems like Linux employ memory overcommitment, permitting processes to allocate virtual memory exceeding available physical RAM by assuming not all pages will be accessed simultaneously; excess demand is handled through swapping to disk or the out-of-memory killer if necessary. This approach enhances efficiency but requires careful configuration to avoid system instability.

Location Contents

Memory addresses in computer systems hold various types of contents essential for program execution and data manipulation. Primarily, these include instructions, which are binary representations of operations fetched and executed by the processor. Data such as variables and arrays occupy other addresses, representing the operands and results processed during computation. Additionally, metadata like pointers, values that store other memory addresses, reside at specific locations to facilitate indirect referencing and dynamic structures. The volatility of contents at memory addresses varies by the underlying hardware. In random-access memory (RAM), contents are temporary and volatile, meaning they are lost when power is removed, requiring reloading upon system restart. In contrast, read-only memory (ROM) stores non-volatile contents that persist without power, typically holding firmware or boot instructions. Access patterns to memory contents are governed by protection mechanisms to ensure system integrity. Instructions in code segments are often designated as read-only to prevent modification during execution, while data areas support read-write access for updates. Some architectures enforce execute-only permissions on instruction regions, restricting reads or writes to mitigate security risks like code injection. Pointers enable self-referential structures by storing addresses that point to other locations, forming chains like linked lists where each node contains data and a pointer to the next node. This allows dynamic allocation and traversal without fixed-size arrays, with each pointer value acting as a reference within the broader address space.

Addressing Techniques

Common Schemes

Common schemes for specifying memory addresses in computer architectures provide the foundational mechanisms by which instructions reference operands in memory. These schemes balance simplicity, flexibility, and efficiency, allowing processors to access data without excessive complexity in instruction encoding. Direct addressing, immediate addressing, relative addressing, and base-register addressing represent the most prevalent approaches, each suited to different use cases in program execution. In direct addressing, the memory address is explicitly contained within the instruction itself, forming the effective address directly. For example, an instruction like LOAD 0x1000 retrieves the operand from the absolute location 0x1000 in memory. This mode is straightforward and requires only one memory reference, making it efficient for fixed-location accesses, though it limits the addressable space to the size of the instruction's address field. Immediate addressing embeds the operand value directly in the instruction rather than specifying a memory address, so it does not constitute a true addressing mode for memory access. Instead, it provides constants or initial values immediately available to the processor, such as in an ADD #5 operation that adds the literal 5 to a register. This avoids memory fetches, saving execution cycles, but the operand size is constrained by the instruction field length, often smaller than a full word. Relative addressing computes the effective address by adding an offset from a reference point, typically the program counter (PC), to support position-independent code that relocates without modification. For instance, a branch instruction with a +4 offset jumps to the address PC + 4, exploiting spatial locality in sequential code execution. This mode conserves address bits in instructions and facilitates efficient short-range jumps or loads, though it restricts accesses to nearby memory regions.
Base-register addressing forms the effective address by adding an offset to the contents of a base register, commonly used for accessing structured data such as arrays or records. In this scheme, the instruction specifies the base register and offset, yielding an address such as base_register + 8 for the second element of an array with 8-byte entries. It expands the addressable range beyond instruction limits and supports dynamic relocation, requiring an additional register access but enabling flexible data handling.

The evolution of these schemes reflects advancements in processor design, starting with limited options in early 8-bit microprocessors such as the Intel 8008 (1972), which relied on basic absolute and register modes due to constrained instruction space and few registers. Early 8-bit systems emphasized simplicity to fit within small silicon budgets, often using single-accumulator absolute addressing. As 16-bit architectures emerged, additional modes such as indexing and indirection were added for efficiency. By the 1980s, RISC designs such as MIPS simplified these to a core set—primarily register, immediate, and PC-relative—prioritizing load/store operations and abundant registers to reduce memory accesses and complexity. This shift, driven by transistor scaling and compiler optimizations, streamlined addressing for pipelined execution while maintaining compatibility with common schemes.

Modes in Instruction Sets

Addressing modes in instruction sets define how operands are located in memory or registers during instruction execution, enabling efficient access to data structures like arrays or pointers. These modes vary across architectures but commonly include direct, indirect, indexed, and register-based variants to balance flexibility and performance. They form the foundation for common addressing schemes by specifying operand location through combinations of registers, immediates, and displacements.

Indexed addressing computes the effective address by adding an index value, often from a register, to a base address, which is particularly useful for traversing arrays or tables. In scaled indexed mode, the effective address is base + (index × scale), where the scale factor (typically 1, 2, 4, or 8) accounts for data element sizes like bytes or words. This mode reduces the need for multiple instructions in loop constructs, as seen in array access patterns.

Indirect addressing loads or stores data by dereferencing a pointer stored in a register or memory location, allowing dynamic access without hardcoding addresses. For example, an instruction like LOAD (R1) retrieves the value at the memory address held in register R1, supporting operations on linked structures or function pointers. This mode introduces an extra memory access compared to direct addressing, impacting latency in pointer-heavy code.

Register indirect addressing operates solely through registers: the register contains the effective address, so no additional fetch is needed to obtain the address itself. This is akin to memory-indirect addressing but avoids the secondary dereference, making it faster for register-to-memory transfers in load/store architectures. It is commonly used in architectures like ARM for base-register offsets in load/store instructions.
In the ARM architecture, load/store instructions support offset addressing modes, including register offsets and scaled variants, where the effective address is base plus an immediate or register-shifted value, facilitating efficient stack operations and array indexing. Similarly, the x86 instruction set employs complex modes such as [base + index * scale + displacement], allowing up to three components for flexible memory access in a single instruction, as detailed in Intel's architecture manuals. More complex addressing modes, while reducing instruction count, increase instruction decode complexity and time due to additional hardware logic for address calculation, leading to higher power consumption and potential pipeline stalls in modern processors. This trade-off favors simpler modes in RISC designs for faster execution, as opposed to CISC's broader mode support.

Memory Models

Flat Models

In flat memory models, the address space is organized as a single, contiguous linear array, enabling uniform access to memory locations without the need for segmentation. This approach treats memory as a straightforward sequence of bytes, where addresses are interpreted directly as offsets within the entire space. For example, in a 32-bit flat model, the full 4 GB (2^{32} bytes) of addressable memory is accessible using linear addresses ranging from 0 to 4,294,967,295, providing a simple and predictable addressing scheme.

Flat models are widely adopted in embedded systems, where simplicity and direct hardware access are prioritized, as well as in modern operating systems running on 64-bit processors. In these environments, the model eliminates the complexities of segment registers by setting them to zero or using them minimally, allowing programs to operate within a vast, uninterrupted address space. For instance, the x86-64 architecture enforces a flat model in 64-bit mode, leveraging the processor's paging hardware to support up to 2^{48} bytes of virtual address space in a linear fashion.

The primary advantages of flat models include simplified programming and memory management, as developers can use straightforward pointer arithmetic without handling segment boundaries or related faults. This uniformity reduces overhead in code generation for compilers and avoids issues like segment overlap or limit violations, while efficiently supporting large-scale applications that require expansive address spaces. Additionally, the model facilitates easier porting of software across platforms that share similar linear addressing conventions.

Implementation of flat models relies on paging for address translation, where virtual addresses are mapped directly to physical memory pages without intervening segmentation layers. On architectures with segmentation hardware, this involves configuring segment descriptors to span the entire address space—such as base address 0 and a limit covering all bytes—allowing the memory management unit (MMU) to perform efficient, hardware-accelerated translations via page tables.
In practice, this setup ensures that linear virtual addresses are resolved to physical locations solely through paging structures, maintaining the model's seamless linearity.

Segmented and Paged Models

In segmented memory models, the virtual address space of a process is divided into variable-sized logical units known as segments, each corresponding to a meaningful portion of the program such as code, data, stack, or heap. This division allows segments to be placed independently in physical memory, supporting sparse address spaces and enabling sharing of common segments, such as code, across processes. A virtual address in this model consists of a segment identifier (or selector) and an offset within that segment; the physical address is computed by adding the offset to the base address of the segment, as stored in a per-process segment table that also includes limit checks for protection. For example, in early architectures like the Intel 8086, a 16-bit segment selector shifted left by 4 bits (multiplied by 16) is added to a 16-bit offset to form a 20-bit physical address, allowing access to 1 MB of memory despite 16-bit registers.

Paging, in contrast, partitions both the virtual address space and physical memory into fixed-size units called pages (typically 4 KB) and page frames, respectively, facilitating non-contiguous allocation without regard to logical structure. Virtual addresses are split into a page number (virtual page number, or VPN) and an offset; the page table maps each VPN to a physical frame number, with the physical address formed by combining the frame number and offset. To handle large address spaces efficiently, multi-level page tables are employed, where higher-level directories point to lower-level tables, reducing memory overhead for sparse mappings—modern systems like x86-64 use four levels for 48-bit virtual addresses. This approach supports demand paging, where pages are loaded into memory only when accessed, and provides isolation through valid/invalid bits in page table entries.

Many systems combine segmentation and paging to leverage the strengths of both, creating a hybrid model where segments are further subdivided into fixed-size pages for mapping to physical memory.
In this setup, a virtual address includes a segment selector, a page number within the segment, and an offset; the segment table points to a page table for that segment, enabling logical organization alongside efficient physical allocation and enhanced protection through layered checks. This combination, as seen in architectures supporting both mechanisms, allows for variable segment sizes for programmer-visible modularity while using paging to mitigate fragmentation and support sharing at the page level.

The primary trade-offs between these models revolve around flexibility, efficiency, and fragmentation: segmentation excels in logical division and sharing but suffers from external fragmentation due to variable sizes, leading to scattered free holes that complicate allocation. Paging promotes physical efficiency and eliminates external fragmentation through uniform sizes but introduces internal fragmentation, where partially used pages waste space (up to half a page per allocation on average). Hybrid models balance these by providing segmentation's modularity with paging's compaction, though at the cost of increased translation complexity and overhead from multiple table lookups. Overall, paging has become more prevalent in modern systems for its hardware support and reduced fragmentation, while segmentation offers conceptual benefits for structured programming.
