Overlay (programming)

In a general computing sense, overlaying means "the process of transferring a block of program code or other data into main memory, replacing what is already stored".[1] Overlaying is a programming method that allows programs to be larger than the computer's main memory.[2] Embedded systems commonly use overlays because physical memory is limited (on a system-on-chip it is the internal memory) and virtual memory facilities are absent.
Usage
Constructing an overlay program involves manually dividing a program into self-contained object code blocks called overlays or links, generally laid out in a tree structure.[b] Sibling segments, those at the same depth level, share the same memory, called the overlay region[c] or destination region. An overlay manager, either part of the operating system or part of the overlay program, loads the required overlay from external memory into its destination region when it is needed; this may be automatic or via explicit code. Often linkers provide support for overlays.[3]
Example
The following example shows the control statements that instruct the OS/360 Linkage Editor to link an overlay program containing a single region, indented to show structure (segment names are arbitrary):
INCLUDE SYSLIB(MOD1)
INCLUDE SYSLIB(MOD2)
OVERLAY A
  INCLUDE SYSLIB(MOD3)
    OVERLAY AA
      INCLUDE SYSLIB(MOD4)
      INCLUDE SYSLIB(MOD5)
    OVERLAY AB
      INCLUDE SYSLIB(MOD6)
OVERLAY B
  INCLUDE SYSLIB(MOD7)
                    +--------------+
                    | Root Segment |
                    |  MOD1, MOD2  |
                    +--------------+
                           |
                +----------+----------+
                |                     |
         +-------------+       +-------------+
         |  Overlay A  |       |  Overlay B  |
         |    MOD3     |       |    MOD7     |
         +-------------+       +-------------+
                |
       +--------+--------+
       |                 |
+-------------+   +-------------+
| Overlay AA  |   | Overlay AB  |
| MOD4, MOD5  |   |    MOD6     |
+-------------+   +-------------+
These statements define a tree consisting of the permanently resident segment, called the root, and two overlays, A and B, which will be loaded following the end of MOD2. Overlay A itself consists of two overlay segments, AA and AB. At execution time overlays A and B will both use the same memory locations; AA and AB will both use the same locations following the end of MOD3.
All the segments between the root and a given overlay segment are called a path.
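The layout rules above (siblings share an address, and each segment follows its ancestors) can be illustrated with a small calculation. The following is a minimal sketch, not the Linkage Editor's actual algorithm: the module sizes are invented for illustration, since the example gives none, and the function names are hypothetical.

```python
# Hypothetical module sizes in bytes; the example above does not state any.
MODULE_SIZE = {"MOD1": 4000, "MOD2": 2000, "MOD3": 3000, "MOD4": 1500,
               "MOD5": 2500, "MOD6": 1000, "MOD7": 6000}

# The overlay tree from the control statements: segment -> (parent, modules).
TREE = {
    "ROOT": (None, ["MOD1", "MOD2"]),
    "A": ("ROOT", ["MOD3"]),
    "AA": ("A", ["MOD4", "MOD5"]),
    "AB": ("A", ["MOD6"]),
    "B": ("ROOT", ["MOD7"]),
}


def segment_size(segment):
    """Total size of the modules included in one segment."""
    return sum(MODULE_SIZE[m] for m in TREE[segment][1])


def path(segment):
    """Segments from the root down to `segment`, inclusive."""
    chain = []
    while segment is not None:
        chain.append(segment)
        segment = TREE[segment][0]
    return chain[::-1]


def load_address(segment):
    """Offset at which a segment is loaded: directly after its ancestors."""
    return sum(segment_size(s) for s in path(segment)[:-1])


def storage_required():
    """The longest path determines how much storage the program needs."""
    return max(sum(segment_size(s) for s in path(seg)) for seg in TREE)


if __name__ == "__main__":
    for seg in TREE:
        print(seg, path(seg), "loads at offset", load_address(seg))
    print("fully resident size:", sum(MODULE_SIZE.values()))
    print("size with overlays: ", storage_required())
```

With these assumed sizes, A and B both load at the offset where MOD2 ends, AA and AB both load where MOD3 ends, and the storage needed is that of the longest path (root, A, AA) rather than the sum of all modules.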
Applications
As of 2015, most business applications are intended to run on platforms with virtual memory[citation needed]. A developer on such a platform can design a program as if the memory constraint does not exist unless the program's working set exceeds the available physical memory. Most importantly, the architect can focus on the problem being solved without the added design difficulty of forcing the processing into steps constrained by the overlay size. Thus, the designer can use higher-level programming languages that do not allow the programmer much control over size (e.g. Java, C++, Smalltalk).
Still, overlays remain useful in embedded systems.[4] Some low-cost processors used in embedded systems do not provide a memory management unit (MMU). In addition, many embedded systems are real-time systems, and overlays provide more deterministic response times than paging. For example, the Space Shuttle Primary Avionics System Software (PASS) uses programmed overlays.[5]
Even on platforms with virtual memory, software components such as codecs may be decoupled to the point where they can be loaded in and out as needed.
Historical use
IBM introduced the concept of a chain job[6] in FORTRAN II. The program had to explicitly call the CHAIN subroutine to load a new link, and the new link replaced all of the old link's storage except for the Fortran COMMON area.
IBM introduced more general overlay handling[7] in IBSYS/IBJOB, including a tree structure and automatic loading of links as part of CALL processing.
In OS/360, IBM extended the overlay facility of IBM 7090/94 IBSYS IBLDR[8] by allowing an overlay program to have independent overlay regions, each with its own overlay tree. Another dynamic system of overlays could be constructed using the LOAD or LINK system macros. Separately-compiled modules were loaded anywhere in available memory and retained until a DELETE macro was issued for them. Loaded programs could, in turn, load other programs and so on.[9] OS/360 also had a simpler overlay system for transient operating system SVC routines, using 1024-byte SVC transient areas.
Systems using memory segmentation typically do not require overlay facilities. Programs on Burroughs Large Systems are composed of segments, which are natural divisions of the program such as COBOL paragraphs, ALGOL procedures, data structures, etc. Each segment is dynamically loaded as needed, and then possibly swapped out to free storage. Specific overlay structures are not necessary because the MCP overlays automatically.[10] Multics, likewise, does not require specific overlays, because each module is a separate segment, which the supervisor loads, deletes, or swaps out as needed.
In the home computer era, overlays were popular because the operating systems and many of the computers they ran on lacked virtual memory and had very little RAM by current standards: the original IBM PC had between 16 KB and 64 KB, depending on configuration. Overlays were a popular technique in Commodore BASIC to load graphics screens.[2]
"Several PC/MS-DOS linkers in the 1980s supported [overlays] in a form nearly identical to that used 25 years earlier on mainframe computers."[4][11] Binary files containing memory overlays had de facto standard extensions .OVL[11] or .OVR[12] (but also used numerical file extensions like .000, .001, etc. for subsequent files[13]). This file type was used among others by WordStar[14] (consisting of the main executable WS.COM and the overlay modules WSMSGS.OVR, WSOVLY1.OVR, MAILMERGE.OVR and SPELSTAR.OVR, where the "fat" overlay files were even binary identical in their ports for CP/M-86 and MS-DOS[15]), dBase,[16] and the Enable DOS office automation software package from Enable Software. Borland's Turbo Pascal[17][18] and the GFA BASIC compiler were able to produce .OVL files.
References
- ^ "Oxford Dictionaries". 2015-11-26. Archived from the original on 2022-07-10. Retrieved 2022-07-10.
- ^ a b Butterfield, James "Jim", ed. (June 1986). "Part 4: Overlaying". Loading And Linking Commodore Programs. p. 74. Archived from the original on 2022-07-10. Retrieved 2022-07-10.
This lets you run programs which are, in effect, much larger than the amount of memory in your computer.
- ^ "The GNU Linker documentation: Overlay Description". 2008-06-03. Archived from the original on 2022-06-23. Retrieved 2022-07-10.
- ^ a b Levine, John R. (2000). Linkers & Loaders. Morgan Kaufmann Publishers. p. 177. ISBN 1-55860-496-0. Archived from the original on 2022-04-06. Retrieved 2022-07-10.
- ^ National Research Council (November 1993) [June 1993]. An Assessment of Space Shuttle Flight Software Development Processes (2 ed.). Washington, DC, USA: National Academy of Sciences, The National Academies Press. doi:10.17226/2222. hdl:2060/19930019745. ISBN 978-0-309-04880-4. LCCN 93-84549. Retrieved 2012-10-29. (208 pages)
- ^ "Chapter 12: The Chain Job" (PDF). IBM 7090/7094 Programming Systems – FORTRAN II Programming (PDF). Poughkeepsie, New York, USA: IBM Corporation. August 1963. pp. 34–35. Form C28-6054-4 File No. 7090-25. Archived (PDF) from the original on 2022-03-15. Retrieved 2022-07-10. (52 pages)
- ^ IBM 7090/7094 Programming Systems – IBJOB Processor – Overlay feature of IBLDR (PDF) (1 ed.). Poughkeepsie, New York, USA: IBM Corporation. May 1963. Form C28-6331 File No. 7090-27. Archived (PDF) from the original on 2022-03-15. Retrieved 2021-12-26. (8 pages)
- ^ IBM 7090/7094 Programming Systems IBJOB Processor Overlay Feature of IBLDR (PDF). IBM Corporation. 1963. Retrieved 2025-09-01.
- ^ IBM System/360 Operating System Supervisor Services and Macro Instructions (PDF). IBM Corporation. 1974. p. 9. Retrieved 2025-09-01.
- ^ Master Control Program Reference Manual (PDF). Burroughs Corporation. 1969. pp. 3-1 – 3-7. Retrieved 2025-09-01.
- ^ a b Elliott, John C. (2012-06-05) [2000-01-02]. "PRL file format". seasip.info. Archived from the original on 2020-01-26. Retrieved 2020-01-26.
[…] A PRL file is a relocatable binary file, used by MP/M and CP/M Plus for various modules other than .COM files. The file format is also used for FID files on the Amstrad PCW. There are several file formats which use versions of PRL: SPR (System PRL), RSP (Resident System Process). LINK-80 can also produce OVL (overlay) files, which have a PRL header but are not relocatable. GSX drivers are in PRL format; so are Resident System Extensions (.RSX). […]
- ^ Dohmen, Norbert (1990). "Platz schaffen durch Überlagern - Overlay-Strukturen in Turbo Pascal". mc (in German). Vol. 90, no. 12. pp. 124–130. Archived from the original on 2022-08-04. Retrieved 2022-08-04.
- ^ Gavin, Bruce. "Create Program Overlays". In Pearson, Dave (ed.). Turbo Pascal - Norton Guide. v3. p. 149. Archived from the original on 2022-08-04. Retrieved 2022-08-04.
- ^ Mabbett, Alan (1985). Getting started with WordStar, MailMerge + SpellStar. Cambridge University Press. ISBN 0-521-31805-X.
- ^ Necasek, Michal (2018-01-30) [2018-01-28, 2018-01-26]. "WordStar Again". OS/2 Museum. Archived from the original on 2019-07-28. Retrieved 2019-07-28.
[…] The reason to suspect such difference is that version 3.2x also supported CP/M-86 (the overlays are identical between DOS and CP/M-86, only the main executable is different) […] the .OVR files are 100% identical between DOS and CP/M-86, with a flag (clearly shown in the WordStar 3.20 manual) switching between them at runtime […] the OS interface in WordStar is quite narrow and well abstracted […] the WordStar 3.2x overlays are 100% identical between the DOS and CP/M-86 versions. There is a runtime switch which chooses between calling INT 21h (DOS) and INT E0h (CP/M-86). WS.COM is not the same between DOS and CP/M-86, although it's probably not very different either. […]
- ^ Sidnam-Wright, Liz; Stevens, Brad, eds. (1990-07-31). "Ashton-Tate ships dBASE IV Version 1.1" (PDF). Torrance, California, USA: Ashton Tate. p. 2-2-2. Archived from the original (PDF) on 2017-04-04. Retrieved 2014-02-13. (5 pages)
Version 1.1 has a new dynamic Memory Management System (dMMS) that handles overlays more efficiently: the product requires less memory, which results in more applications space availability. […] The product's lower memory requirements of only 450K of RAM provide improved network support because supplemental hardware memory is no longer required to support networks. […] By speeding up areas of dBASE IV that are overlay-dependent, the new dMMS improves performance when working at the Control Center and in programs that use menus and windows.
- ^ Herschel, Rudolf; Dieterich, Ernst-Wolfgang (2000). Turbo Pascal 7.0 (in German) (2 ed.). R. Oldenbourg Verlag. p. 249. ISBN 3-486-25499-5.
- ^ Eßer, Hans-Georg (June 2009). "Chapter 6. Speicherverwaltung und Dateisysteme - Teil 5: Nicht-zusammenhängende Speicherzuordnung". Betriebssysteme I (PDF) (in German). Munich, Germany: Hochschule München. Archived (PDF) from the original on 2022-05-08. Retrieved 2014-02-13. (9 pages)
Further reading
- IBM OS Linkage Editor and Loader - Program Numbers 360S-ED-510, 360S-ED-521, 360S-LD-547 (PDF). Release 21 (10 ed.). White Plains, New York, USA: IBM Corporation. March 1972 [January 1972]. Order No. GC28-6538-9, File No. S360-31. Archived (PDF) from the original on 2022-07-10. (2+244+4 pages)
- Groeber, Marcus; Di Geronimo, Jr., Edward "Ed"; Paul, Matthias R. (2002-03-02) [2002-02-24]. "GEOS/NDO info for RBIL62?". Newsgroup: comp.os.geos.programmer. Archived from the original on 2019-04-20. Retrieved 2019-04-20.
[…] The reason Geos needs 16 interrupts is because the scheme is used to convert inter-segment ("far") function calls into interrupts, without changing the size of the code. The reason this is done so that "something" (the kernel) can hook itself into every inter-segment call made by a Geos application and make sure that the proper code segments are loaded from virtual memory and locked down. In DOS terms, this would be comparable to an overlay loader, but one that can be added without requiring explicit support from the compiler or the application. What happens is something like this: […] 1. The real mode compiler generates an instruction like this: CALL <segment>:<offset> -> 9A <offlow><offhigh><seglow><seghigh> with <seglow><seghigh> normally being defined as an address that must be fixed up at load time depending on the address where the code has been placed. […] 2. The Geos linker turns this into something else: INT 8xh -> CD 8x […] DB <seghigh>,<offlow>,<offhigh> […] Note that this is again five bytes, so it can be fixed up "in place". Now the problem is that an interrupt requires two bytes, while a CALL FAR instruction only needs one. As a result, the 32-bit vector (<seg><ofs>) must be compressed into 24 bits. […] This is achieved by two things: First, the <seg> address is encoded as a "handle" to the segment, whose lowest nibble is always zero. This saves four bits. In addition […] the remaining four bits go into the low nibble of the interrupt vector, thus creating anything from INT 80h to 8Fh. […] The interrupt handler for all those vectors is the same. It will "unpack" the address from the three-and-a-half byte notation, look up the absolute address of the segment, and forward the call, after having done its virtual memory loading thing... Return from the call will also pass through the corresponding unlocking code. […] The low nibble of the interrupt vector (80h–8Fh) holds bit 4 through 7 of the segment handle. 
Bit 0 to 3 of a segment handle are (by definition of a Geos handle) always 0. […] all Geos API run through the "overlay" scheme […]: when a Geos application is loaded into memory, the loader will automatically replace calls to functions in the system libraries by the corresponding INT-based calls. Anyway, these are not constant, but depend on the handle assigned to the library's code segment. […] Geos was originally intended to be converted to protected mode very early on […], with real mode only being a "legacy option" […] almost every single line of assembly code is ready for it […]
Definition and Purpose
Core Concept
Overlay programming is a memory management technique that lets a program larger than the available physical memory run by loading distinct modules, one after another, into a shared memory region during execution. Multiple program segments occupy the same address range at different times, which reduces the memory the program needs at any one moment.[6][7]
The fundamental principle is that only the resident root and the currently active modules are in memory; inactive modules stay on secondary storage and are brought in as needed, so the entire program never has to be resident at once.[8][6] Loading follows the program's execution flow: a module is brought in just before it is invoked, which avoids exhausting memory.
Overlays differ from segmentation, which divides a program's address space into logical units for protection and sharing: overlays emphasize runtime replacement of code within a designated area rather than static partitioning.[8]
Key terminology includes the root, the always-resident portion of the program that serves as the entry point and coordinates execution; the overlay area, the shared memory region where modules are loaded and replaced; and the overlay driver, the software component that loads and unloads these modules from storage.[6][7][8]
Memory Constraints Addressed
Before virtual memory became widespread, computers had very limited main memory, typically from a few kilobytes to hundreds of kilobytes on 1960s mainframes; IBM System/360 configurations ranged from a few kilobytes of core storage on the smallest models to a megabyte or more on the largest.[9][10] Core memory was expensive and physically bulky, being built from individual magnetic cores, so capacity could not be increased without significant hardware investment. Programs frequently exceeded the available memory: a complex scientific simulation or business application could need more space than the machine's entire addressable storage and could not run at all without a technique such as overlays.[11]
Monolithic executables made the problem worse. In early programming environments an entire program was compiled and linked into a single, indivisible load module that had to reside fully in memory to execute.[11] On systems without support for dynamic loading or segmentation, a program larger than physical memory simply failed to run. OS/360 provided facilities for overlays and dynamic loading, which let programmers divide code into segments that were loaded selectively.[11]
Overlays addressed these constraints by reusing memory regions, so larger programs could run on resource-constrained machines without costly hardware upgrades.[11] In a planned overlay structure, segments that never need to be resident together share the same storage area, and only the root and the active segments are resident at any time, which raises the usable program size beyond the physical limit and makes better use of scarce storage. The trade-offs were higher I/O overhead from the disk reads needed to load overlay segments, which could slow execution compared with a fully resident program, and significant programmer effort, because the structure had to be defined in advance without knowledge of the program's runtime behavior.[9][11]
Implementation Mechanisms
Overlay Structures
Overlay structures are the organizational frameworks that define how the modules or segments of a program share limited memory by replacing one another at runtime. They are typically tree-based (hierarchical): a child segment is loaded after its parent, and sibling segments replace one another in the same memory area.[8] A tree uses memory efficiently in complex programs because siblings, the modules at the same level, never execute concurrently and are therefore mutually exclusive.[12][8]
Key components include the overlay map, which outlines the relationships and loading positions of modules; the root segment, the non-overlaid base code that remains resident throughout execution; and overlay regions, the memory blocks reserved for loading and swapping overlay modules.[13] The overlay map is a blueprint generated by the linker that represents the hierarchy of segments and guides loading decisions.[13] The root segment typically contains the program's entry point and essential control logic, while each overlay region is sized to hold the longest path through its tree so that every combination of segments fits.[12][8]
To design an overlay structure, the programmer or compiler specifies the layout explicitly through directives in assembly language, linker scripts, or control statements that define segment groupings and their mutual exclusions.[13] In OS/360, for instance, programmers use OVERLAY and INCLUDE statements in the linkage editor input to assign modules to segments and regions, arranging dependent modules hierarchically according to execution flow.[13] This manual partitioning requires analyzing the program's call graph to minimize storage needs and usually yields a tree whose branches represent alternative execution paths.[12]
As an example of a tree-based structure, consider a root segment, always in memory, that calls either overlay A or overlay B. Both load into the same overlay region, each replacing the previous occupant as needed; from A, a child overlay C may then load into a sub-region that follows A. Siblings A and B thus share space without conflict.[12] Only the active path occupies memory at any time, which suited constraints such as the 32K-word limit of early systems.[13]
Dynamic Loading Process
At runtime, the overlay manager detects a call to a module that is not loaded by consulting a reference table that tracks which segments or routines are currently resident. If the target overlay is not in memory, the manager displaces the overlay that currently occupies the same region (classic schemes simply overwrite it rather than write it back to disk), issues an input/output operation to read the new overlay from secondary storage such as a disk pack, and updates the program's linkage pointers to refer to the freshly loaded code and data. This stepwise swapping lets the program move between phases while staying within its memory limit.[8][14]
The overlay supervisor or driver coordinates these actions. It maintains tables such as the Overlay Name Table, which records metadata including overlay identifiers, hierarchy levels, sizes, and memory addresses, and the Overlay Reference Table, which maps specific routine calls to their overlay indices and entry points. With these tables the supervisor intercepts calls, verifies prerequisites, and triggers I/O requests for loading; in IBM OS/360, for example, the SEGLD macro requests an asynchronous load that is synchronized through an event control block. By handling relocation and linkage adjustments during each load, the supervisor preserves program integrity across swaps.[8][14]
Error handling covers disk I/O failures, missing overlay files, and storage conflicts through pre-load validation and runtime checks. For example, if a disk read error occurs or a referenced routine is missing, the loader halts the operation, logs the issue in an error map or supervisory log, and may invoke an abnormal end routine that terminates the task, releases allocated resources, and notifies the operator. In the case of a memory conflict, the system unloads non-essential segments first or aborts the load to avoid corruption.[8][14]
Disk seek and transfer latency is the main performance bottleneck in overlay swapping, because every load requires physical access to secondary storage. On mainframe hardware of the period, such as IBM's 2314 disk facility with average seek times of 60-75 milliseconds or the later 3330 at 30 milliseconds, a typical swap took tens to low hundreds of milliseconds and could extend overall program load time by about 50% compared with non-overlaid execution. Execution overhead nevertheless remained low, often under a 1% increase, thanks to techniques such as buffered preloading and direct calls within overlays that were already resident.[8][15]
Practical Examples
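The load-on-call behaviour described under Dynamic Loading Process can be sketched in a few lines. This is a minimal illustration, not the OS/360 supervisor: the class, its method names, and the `read_segment` callback standing in for disk I/O are all assumptions made for the example, and the segment names reuse the A/AA/AB/B tree from the linkage editor example.

```python
class OverlayManager:
    """Toy overlay manager (hypothetical API) for a single overlay region."""

    def __init__(self, parent_of, routines, read_segment):
        self.parent_of = parent_of        # segment -> parent (None for the root)
        self.routines = routines          # routine name -> (segment, callable)
        self.read_segment = read_segment  # stand-in for the disk read
        self.resident = []                # segments currently loaded below the root
        self.loads = []                   # log of every segment read from storage

    def _path(self, segment):
        """Segments below the always-resident root, down to `segment`."""
        chain = []
        while self.parent_of[segment] is not None:
            chain.append(segment)
            segment = self.parent_of[segment]
        return chain[::-1]

    def ensure_loaded(self, segment):
        wanted = self._path(segment)
        # Count how much of the wanted path is already resident.
        keep = 0
        while (keep < len(wanted) and keep < len(self.resident)
               and wanted[keep] == self.resident[keep]):
            keep += 1
        if keep == len(wanted):
            return  # everything needed is already in storage
        # Segments deeper than the common prefix are overwritten, not saved.
        self.resident = self.resident[:keep]
        for seg in wanted[keep:]:
            self.read_segment(seg)        # may raise on an I/O error
            self.resident.append(seg)
            self.loads.append(seg)

    def call(self, routine, *args):
        """Intercept a call: load the owning segment first, then dispatch."""
        segment, func = self.routines[routine]
        self.ensure_loaded(segment)
        return func(*args)


def demo():
    parent_of = {"ROOT": None, "A": "ROOT", "AA": "A", "AB": "A", "B": "ROOT"}
    routines = {
        "in_aa": ("AA", lambda: "aa"),
        "in_ab": ("AB", lambda: "ab"),
        "in_a": ("A", lambda: "a"),
        "in_b": ("B", lambda: "b"),
    }
    manager = OverlayManager(parent_of, routines, read_segment=lambda seg: None)
    for name in ("in_aa", "in_ab", "in_a", "in_b", "in_aa"):
        manager.call(name)
    return manager.loads


if __name__ == "__main__":
    print(demo())
```

In the demo, calling a routine in AB after one in AA reloads only AB because A is still resident, whereas calling into B displaces the whole A branch, so a later call into AA has to read both A and AA again. Marking a segment resident only after its read succeeds is a design choice of this sketch, so that a failed load does not leave the table claiming code that is not there.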
Basic Code Example
In a constrained memory environment, overlays enable a program larger than available RAM by loading mutually exclusive modules into the same memory region on demand. Consider a simple program consisting of a root module that sequentially calls two functions, calc1 and calc2, which do not execute simultaneously and thus can share the same memory space. Each function requires 1 KB of memory, while the root module needs 0.5 KB, resulting in a total program size of 2.5 KB. With only 1.5 KB of RAM available, overlays allow the program to run by reusing the memory slot for the functions.[16]
The following pseudocode illustrates a basic overlay implementation using a manual overlay manager. The root loads calc1 initially, executes it, unloads it, loads calc2, and executes it. Module definitions are marked as overlays, and the manager handles dynamic loading from secondary storage (e.g., disk).
// Overlay Manager Functions
function load_overlay(module_name, memory_address):
    read module_name from disk into memory_address
    // Assume module size fits in allocated slot
function unload_overlay(memory_address):
    // Optional: clear or mark memory as free for reuse
// Root Module (resident in memory, 0.5 KB)
root_start:
    overlay_slot = allocate_overlay_slot(1 KB)   // Shared slot for calc1 and calc2
    load_overlay("calc1", overlay_slot)
    call calc1
    unload_overlay(overlay_slot)
    load_overlay("calc2", overlay_slot)
    call calc2
    deallocate_overlay_slot(overlay_slot)
end
// Overlay Modules (stored on disk)
// Marked as overlay in linker directives
overlay calc1:   // 1 KB
    // Computation logic for calc1
    return
overlay calc2:   // 1 KB
    // Computation logic for calc2
    return
When calc1 is called, the manager loads it into the 1 KB slot that follows the root, bringing total usage to 1.5 KB. After calc1 completes, unloading frees the slot for calc2, which reuses the same space, so RAM usage stays at 1.5 KB despite the 2.5 KB program size. This manual approach relies on the programmer structuring calls so that the two modules are never needed at the same time.[17][16]
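The memory arithmetic of this example can be checked with a short simulation. The sketch below is an illustration under stated assumptions rather than a real loader: RAM is modelled as a fixed 1.5 KB bytearray, secondary storage as a dictionary of 1 KB filler images, and the names `RAM`, `DISK`, `load_overlay`, and `run` are invented for the purpose.

```python
ROOT_SIZE = 512    # 0.5 KB resident root
SLOT_SIZE = 1024   # 1 KB overlay slot shared by calc1 and calc2

# All the memory the machine has: 1.5 KB. It never grows.
RAM = bytearray(ROOT_SIZE + SLOT_SIZE)

# Stand-in for secondary storage: each overlay image is 1 KB of filler bytes.
DISK = {
    "calc1": bytes([0x11]) * 1024,
    "calc2": bytes([0x22]) * 1024,
}


def load_overlay(name):
    """Copy an overlay image into the shared slot, overwriting its occupant."""
    image = DISK[name]
    if len(image) > SLOT_SIZE:
        raise MemoryError(f"{name} does not fit in the overlay slot")
    RAM[ROOT_SIZE:ROOT_SIZE + len(image)] = image


def run():
    """Load calc1 then calc2 and record what the slot holds each time."""
    trace = []
    for name in ("calc1", "calc2"):
        load_overlay(name)
        trace.append((name, RAM[ROOT_SIZE], len(RAM)))
    return trace


if __name__ == "__main__":
    program_size = ROOT_SIZE + sum(len(image) for image in DISK.values())
    print("program size:", program_size, "bytes; RAM:", len(RAM), "bytes")
    for name, first_byte, ram_size in run():
        print(f"{name}: slot starts with {first_byte:#04x}, RAM is {ram_size} bytes")
```

The program totals 2560 bytes while RAM stays at 1536 bytes throughout, because the second load overwrites the first image in place.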
Such overlay mechanisms were commonly implemented in low-level assembly languages for systems with limited memory, including the PDP-11, where linkers like those in RT-11 or RSX-11 supported overlay directives to manage module loading.