A20 line

The A20, or address line 20, is one of the electrical lines that make up the system bus of an x86-based computer system. The A20 line in particular is used to transmit the 21st bit on the address bus.
A microprocessor typically has a number of address lines equal to the base-two logarithm of the number of words in its physical address space. For example, a processor with 4 GB of byte-addressable physical space requires 32 lines (log2(4 GB) = log2(2^32 B) = 32), which are named A0 through A31. The lines are named after the zero-based number of the bit in the address that they are transmitting. The least significant bit is first and is therefore numbered bit 0 and signaled on line A0. A20 transmits bit 20 (the 21st bit) and becomes active once addresses reach 1 MB, or 2^20 bytes.
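The relationship between address-space size and line count can be checked with a few lines of Python. This is only an illustrative sketch; the function names `address_lines` and `a20_level` are made up for this example.

```python
def address_lines(num_bytes: int) -> int:
    """Return how many address lines a byte-addressable space of num_bytes needs."""
    if num_bytes < 1:
        raise ValueError("address space must contain at least one byte")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without floating point.
    return (num_bytes - 1).bit_length()


def a20_level(address: int) -> int:
    """Return the value of address bit 20, which is the bit carried on line A20."""
    return (address >> 20) & 1


print(address_lines(4 * 2**30))                    # 4 GB -> 32
print(address_lines(2**20))                        # 1 MB -> 20
print(a20_level(0x0FFFFF), a20_level(0x100000))    # 0 1
```

The last line shows bit 20 going from 0 to 1 exactly when the address crosses the 1 MB boundary.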
Overview
The Intel 8086, Intel 8088, and Intel 80186 processors had 20 address lines, numbered A0 to A19; with these, the processor can access 2^20 bytes, or 1 MB. Internal address registers of such processors only had 16 bits. To access a 20-bit address space, an external memory reference was made up of a 16-bit offset address added to a 16-bit segment number, shifted 4 bits to the left so as to produce a 20-bit physical address. The resulting address is equal to segment × 16 + offset.[1] Many combinations of segment and offset produce the same 20-bit physical address, so the same byte in memory can be addressed in various ways.[2] For example, here are four of the 4096 different segment:offset combinations, all referencing the byte whose physical address is 0x000FFFFF (the last byte in the 1 MB memory space):
- F000:FFFF
- FFFF:000F
- F555:AAAF
- F800:7FFF
Using the last form, an increase of one in the offset yields F800:8000, which is a valid address for the processor, but since it translates to the physical address 0x00100000 (the first byte above 1 MB), the processor would need another address line to actually access that byte. Since there is no such line on the 8086 line of processors, the 21st bit, while set, gets dropped, causing the address F800:8000 to "wrap around"[1] and actually point to the physical address 0x00000000.
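The address arithmetic described above can be reproduced in a short Python sketch. The helper name `physical_address` and its `address_bits` parameter are illustrative assumptions, not part of any real API; the width of the address bus is modeled simply as a bit mask.

```python
def physical_address(segment: int, offset: int, address_bits: int = 20) -> int:
    """Compute segment * 16 + offset, truncated to the width of the address bus."""
    linear = (segment << 4) + offset
    return linear & ((1 << address_bits) - 1)


# The four example aliases all resolve to the last byte below 1 MB.
for seg, off in [(0xF000, 0xFFFF), (0xFFFF, 0x000F), (0xF555, 0xAAAF), (0xF800, 0x7FFF)]:
    print(f"{seg:04X}:{off:04X} -> {physical_address(seg, off):#08x}")

# Count every segment whose 64 KB window contains physical address 0xFFFFF.
aliases = sum(1 for seg in range(0x10000) if 0 <= 0xFFFFF - (seg << 4) <= 0xFFFF)
print(aliases)                                        # 4096

# One byte further: wraps to 0 on a 20-bit bus, reaches 1 MB on a 24-bit bus.
print(hex(physical_address(0xF800, 0x8000)))          # 0x0
print(hex(physical_address(0xF800, 0x8000, 24)))      # 0x100000
```

Passing `address_bits=24` models a processor such as the 80286 that does not drop bit 20, which is the behavior the A20 gate was later added to suppress.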
When IBM designed the IBM PC AT (1984) machine, it decided to use the new higher-performance Intel 80286 microprocessor. The 80286 could address up to 16 MB of system memory in protected mode. However, the CPU was supposed to emulate an 8086's behavior in real mode, its startup mode, so that it could run operating systems and programs that were not written for protected mode. The 80286 did not force the A20 line to zero in real mode, however. Therefore, the combination F800:8000 would no longer point to the physical address 0x00000000, but to the address 0x00100000. As a result, programs relying on the address wrap around would no longer work. To remain compatible with such programs, IBM decided to correct the problem on the motherboard.
That was accomplished by inserting a logic gate on the A20 line between the processor and system bus, which got named Gate-A20. Gate-A20 can be enabled or disabled by software to allow or prevent the address bus from receiving a signal from A20. It is set to non-passing for the execution of older programs that rely on the wrap-around. At boot time, the BIOS first enables Gate-A20 when it counts and tests all of the system memory, and then disables it before transferring control to the operating system.
Originally, the logic gate was connected to the Intel 8042 keyboard controller.[1] Controlling it was a relatively slow process. Other methods have since been added to allow more efficient multitasking of programs that require this wrap-around with programs that access all of the system memory, so there are now several ways to control the A20 line.[3]
Disconnecting A20 would not wrap all memory accesses above 1 MB, just those in the 1–2 MB, 3–4 MB, 5–6 MB, etc. ranges. Real-mode software cared only about the area slightly above 1 MB, so the Gate-A20 line was enough.
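Why only every other megabyte is affected follows directly from masking a single bit. The sketch below models the gate as a bit mask on the address; `bus_address` is a hypothetical helper used only for illustration.

```python
A20_BIT = 1 << 20


def bus_address(address: int, a20_enabled: bool) -> int:
    """Return the address seen on the bus; a closed gate forces bit 20 to zero."""
    return address if a20_enabled else address & ~A20_BIT


# With the gate closed, odd-numbered megabytes fold onto the even one below them.
for addr in (0x0FFFFF, 0x100000, 0x200000, 0x300000, 0x400000, 0x500000):
    print(f"{addr:#08x} -> {bus_address(addr, a20_enabled=False):#08x}")
```

Addresses in the 2–3 MB and 4–5 MB ranges already have bit 20 clear, so they pass through unchanged, while 1–2 MB maps onto 0–1 MB, 3–4 MB onto 2–3 MB, and so on.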
Enabling the Gate-A20 line is one of the first steps that a protected-mode x86 operating system takes during boot, often before control has been passed from the bootstrap to the kernel (as in the case of Linux, for example).
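Before or after enabling the gate, boot code typically wants to know whether A20 is actually passing. One commonly described approach is to write different values to two addresses exactly 1 MB apart and see whether they alias. The following is a simulation of that idea in Python, not real boot code: the `SimulatedBus` class, the probe addresses, and the function name `a20_is_enabled` are all assumptions chosen for illustration.

```python
class SimulatedBus:
    """Toy model of 2 MB of RAM behind an A20 gate (illustrative only)."""

    def __init__(self, size: int = 2 * 1024 * 1024) -> None:
        self.ram = bytearray(size)
        self.a20_enabled = False          # gate closed, as after a legacy boot

    def _resolve(self, address: int) -> int:
        # A closed gate forces address bit 20 to zero.
        return address if self.a20_enabled else address & ~(1 << 20)

    def read(self, address: int) -> int:
        return self.ram[self._resolve(address)]

    def write(self, address: int, value: int) -> None:
        self.ram[self._resolve(address)] = value & 0xFF


def a20_is_enabled(bus: SimulatedBus) -> bool:
    """Probe two addresses 1 MB apart; if they alias, A20 is masked."""
    low, high = 0x000500, 0x100500        # arbitrary probe addresses 1 MB apart
    saved_low, saved_high = bus.read(low), bus.read(high)
    bus.write(low, 0x00)
    bus.write(high, 0xFF)
    aliased = bus.read(low) == 0xFF       # the high write landed on the low byte
    bus.write(high, saved_high)           # restore original contents
    bus.write(low, saved_low)
    return not aliased


bus = SimulatedBus()
print(a20_is_enabled(bus))    # False
bus.a20_enabled = True
print(a20_is_enabled(bus))    # True
```

On real hardware the same comparison would be done with segment:offset pairs in real mode, and the enabling step itself goes through chipset-specific I/O that this sketch deliberately leaves out.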
Virtual 8086 mode, introduced with the Intel 80386, allows the A20 wrap-around to be simulated by using the virtual memory facilities of the processor; physical memory may be mapped to multiple virtual addresses. Thus, the memory mapped at the first megabyte of virtual memory may be mapped again in the second megabyte of virtual memory. The operating system may intercept changes to Gate A20 and make corresponding changes to the virtual-memory address space, which also makes irrelevant the efficiency of Gate-A20 line toggling.
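The paging-based simulation can be pictured as a small page map. The sketch below is a deliberate simplification, assuming 4 KB pages and a flat dictionary in place of real page tables; `build_page_map` and `translate` are hypothetical names.

```python
PAGE_SIZE = 4096
FIRST_PAGE_ABOVE_1MB = 0x100000 // PAGE_SIZE   # page 0x100
WRAP_PAGES = 0x10000 // PAGE_SIZE              # 16 pages cover the 64 KB reachable past 1 MB


def build_page_map(a20_enabled: bool) -> dict:
    """Map virtual page numbers to physical page numbers for one virtual 8086 task."""
    page_map = {page: page for page in range(FIRST_PAGE_ABOVE_1MB + WRAP_PAGES)}
    if not a20_enabled:
        # Simulate wrap-around: the pages just above 1 MB alias the first 64 KB.
        for i in range(WRAP_PAGES):
            page_map[FIRST_PAGE_ABOVE_1MB + i] = i
    return page_map


def translate(page_map: dict, linear: int) -> int:
    """Translate a linear address to a physical address through the page map."""
    page, offset = divmod(linear, PAGE_SIZE)
    return page_map[page] * PAGE_SIZE + offset


print(hex(translate(build_page_map(a20_enabled=False), 0x100000)))   # 0x0
print(hex(translate(build_page_map(a20_enabled=True), 0x100000)))    # 0x100000
```

When the guest toggles the gate, an operating system following this scheme would only need to rewrite those sixteen entries, with no slow hardware access involved.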
A20 gate
Controlling the A20 line was an important feature at one stage in the growth of the IBM PC architecture, as it added access to an additional 65,520 bytes (64 KB − 16 bytes) of memory in real mode, without significant software changes.
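The 65,520-byte figure comes from the highest address that real-mode segment:offset arithmetic can form. A brief sketch of the calculation follows; the variable names are illustrative.

```python
ONE_MB = 0x100000

# Highest linear address reachable in real mode: segment FFFF, offset FFFF.
highest = (0xFFFF << 4) + 0xFFFF
print(hex(highest))                       # 0x10ffef

# Bytes above the 1 MB boundary that become reachable when A20 passes.
extra_bytes = highest - ONE_MB + 1
print(extra_bytes)                        # 65520
print(extra_bytes == 64 * 1024 - 16)      # True
```

The 16 missing bytes are offsets 0000 to 000F of segment FFFF, which still lie below the 1 MB boundary.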
In what was arguably a "hack", the A20 gate was originally part of the keyboard controller on the motherboard, which could open or close it depending on what behavior was desired.[4]
In order to keep full compatibility with the Intel 8086, the A20 gate was still present in Intel CPUs until 2008.[5] As the gate was initially closed right after boot, protected-mode operating systems typically opened the A20 gate early during the boot process to never close it again. Such operating systems had no compatibility reasons for keeping it closed, and they gained access to the full range of physical addresses available by opening it.
The Intel 80486 and Pentium added a special pin named A20M#, which when asserted low forces bit 20 of the physical address to be zero for all on-chip cache- or external-memory accesses. It was necessary, since the 80486 introduced an on-chip cache and so masking this bit in external logic was no longer possible. Software still needs to manipulate the gate and must still deal with external peripherals (the chipset) for that.[6]
The PC System Design Guide PC 2001 removes compatibility for the A20 line: "If A20M# generation logic is still present in the system, this logic must be terminated such that software writes to I/O port 92, bit 1, do not result in A20M# being asserted to the processor."[7]
Support for the A20 gate was changed in the Nehalem microarchitecture (some sources incorrectly claim that A20 support was removed). Rather than the CPU having a dedicated A20M# pin that receives the signal whether or not to mask the A20 bit, it has been virtualized so that the information is sent from the peripheral hardware to the CPU using special bus cycles.[citation needed] From a software point of view, the mechanism works exactly as before, and an operating system must still program external hardware (which in-turn sends the aforementioned bus cycles to the CPU) to disable the A20 masking.[citation needed]
Intel no longer supports the A20 gate, starting with Haswell. Page 271 of the Intel System Programmers Manual Vol. 3A from June 2013 states: "The functionality of A20M# is used primarily by older operating systems and not used by modern operating systems. On newer Intel 64 processors, A20M# may be absent."[8]
A20 handler
The A20 handler is IBM PC memory manager software that controls access to the high memory area (HMA). Extended-memory managers usually provide this functionality. A20 handlers are named after the 21st address line of the microprocessor, the A20 line.
In DOS, HMA managers such as HIMEM.SYS have the "extra task" of managing A20. HIMEM.SYS provided an API for opening/closing A20. DOS itself could use the area for some of its storage needs, thereby freeing up more conventional memory for programs. That functionality was enabled by the DOS=HIGH or HIDOS=ON directives in the CONFIG.SYS configuration file.
Affected programs
Since 1980, the address wrap was internally used by 86-DOS and MS-DOS to implement the DOS CALL 5 entry point at offset +5 to +9 (which emulates the CP/M-80-style CALL 5 BDOS API entry point at offset +5 to +7) in the Program Segment Prefix (PSP) (which partially resembles CP/M-80's zero page).[9][10] This was, in particular, utilized by programs machine-translated from CP/M-80 through assembly language translators[9] like Seattle Computer Products' TRANS86.[11] The CALL 5 handler this entry point refers to resides at the machine's physical address 0x000000C0 (thereby overlapping the four bytes of the interrupt service routine entry point reserved for INT 30h and the first byte of INT 31h in the x86 real mode interrupt vector table).[12][13][14] However, by the design of CP/M-80, which loaded the operating system immediately above the memory available for the application program to run in, the 8080/Z80 16-bit target address stored at offset +6 to +7 in the zero page could deliberately also be interpreted as the size of the first memory segment.[9] In order to emulate this in DOS with its 8086 segment:offset addressing scheme, the far call entry point's 16-bit offset had to match this segment size (i.e. 0xFEF0), which is stored at offset +6 to +7 in the PSP, overlapping parts of the CALL 5.[13][14] The only way to reconcile these requirements was to choose a segment value that, when added to 0xFEF0, results in an address of 0x001000C0, which, on an 8086, wraps around to 0x000000C0.[15][12][14]
A20 had to be disabled for the wraparound to occur and DOS programs using this interface to work. Newer DOS versions which can relocate parts of themselves into the HMA, typically craft a copy of the entry point at FFFF:00D0 in the HMA (which again resolves to physical 0x001000C0), so that the interface can work without regard to the state of A20.[14][16]
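The segment value needed for the CALL 5 far call can be derived from the numbers given above. The following sketch works through that arithmetic; `call5_segment` is a hypothetical helper, and the constants are taken from the text (handler at physical 0xC0, offset 0xFEF0).

```python
CALL5_HANDLER = 0x0000C0          # physical address of the CALL 5 handler
ONE_MB = 0x100000


def call5_segment(offset: int) -> int:
    """Find the segment for which segment * 16 + offset lands 1 MB above the handler."""
    linear = ONE_MB + CALL5_HANDLER           # 0x1000C0, wraps to 0xC0 on an 8086
    if (linear - offset) % 16:
        raise ValueError("offset must leave a paragraph-aligned remainder")
    return (linear - offset) >> 4


segment = call5_segment(0xFEF0)
print(f"{segment:04X}:FEF0")                              # F01D:FEF0

linear = (segment << 4) + 0xFEF0
print(hex(linear), hex(linear & 0xFFFFF))                 # 0x1000c0 0xc0

# The copy placed in the HMA by newer DOS versions sits at the unwrapped address.
print(hex((0xFFFF << 4) + 0x00D0))                        # 0x1000c0
```

With A20 masked the far call reaches 0xC0 in the interrupt vector table; with A20 passing it reaches 0x1000C0, which is why a copy of the entry point at FFFF:00D0 makes the interface independent of the gate's state.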
One program known to use the CALL 5 interface is the DOS version of the Small-C compiler.[17] Also, the SPELL utility in Microsoft's Word 3.0 (1987) is one of the programs depending on the CALL 5 interface to be set up correspondingly.[18] Sun Microsystems' PC-NFS (1993) requires the CALL 5 fix-up as well.[16]
Also, to save program space,[1] a trick was used by some BIOS and DOS programmers, for example, to have one segment that has access to program data (such as from F800:0000 to F800:7FFF, pointing to the physical addresses 0x000F8000–0x000FFFFF), as well as the I/O data (such as the keyboard buffer) that was located in the first memory segment (with addresses F800:8000 to F800:FFFF pointing to the physical addresses 0x00000000 to 0x00007FFF).
This trick works for as long as the code isn't executed in low memory, the first 64 KB of RAM, a condition that was always true in older DOS versions without load-high capabilities.
With the DOS kernel relocated into higher memory areas, low memory increasingly became available for programs, causing those depending on the wraparound to fail.[19] The executable loaders in newer versions of DOS attempt to detect some common types of affected programs and either patch them on-the-fly to function also in low memory[20] or load them above the first 64 KB before passing execution on to them.[20] For programs that are not detected automatically, LOADFIX[21] or MEMMAX -L[21] can be used to force them to be loaded above the first 64 KB.
The trick was utilized by IBM/Microsoft Pascal itself as well as by programs compiled with it,[22][23][10][17] including Microsoft's MASM.[17] Other commonly used development utilities using this were executable compressors like Realia's Spacemaker[20] (written by Robert B. K. Dewar in 1982 and used to compress early versions of the Norton Utilities[24][25][26][27]) and Microsoft's EXEPACK[19][20][1][28][17] (written by Reuben Borman in 1985) as well as the equivalent /E[XEPACK] option in Microsoft's LINK 3.02 and higher.[19][1][28][26] Programs processed with EXEPACK would display a "Packed file is corrupt" error message.[1][20][28]
Various third-party utilities exist to modify compressed executables either replacing the problematic uncompression routine(s) through restubbing, or attempting to expand and restore the original file.
Modern Legacy BIOS boot loaders (such as GNU GRUB) use the A20 line.[3] UEFI boot loaders use 32-bit protected mode or 64-bit long mode.
References
- ^ a b c d e f g Paul, Matthias R. (2002-02-02). "Treiber dynamisch nachladen (Intra-Segment-Offset-Relokation zum Laden von TSRs in die HMA)" [Loading drivers dynamically (Intra-segment offset relocation to load TSRs into the HMA)] (in German). Newsgroup: de.comp.os.msdos. Archived from the original on 2017-09-09. Retrieved 2017-07-02. (NB. Gives a comprehensive overview on the history and "nature" of the HMA and the non-obvious design constraints to be observed when developing resident system extensions to be loaded into the HMA, some of which are caused by the A20 gate. It also describes how to address these issues using stubs, backdoors, and intra-segment offset relocation, a method used by DR-DOS drivers capable of relocating into the HMA and similar to a (more sophisticated) method used as the basis for the dynamic dead code elimination in the author's FreeKEYB driver.)
- ^ Paul, Matthias R. (2002-04-11). "Re: [fd-dev] ANNOUNCE: CuteMouse 2.0 alpha 1". freedos-dev. Archived from the original on 2020-02-21. Retrieved 2020-02-21.
- ^ a b "A20 Line". OSDev Wiki. 2021-07-19. Archived from the original on 2021-11-30. Retrieved 2021-07-19.
- ^ Shanley, Tom; Anderson, Don (1995). Swindle, John (ed.). ISA System Architecture (3 ed.). Mindshare, Inc. / Addison-Wesley Publishing Company. pp. 79–80. ISBN 0-201-40996-8. ISBN 978-0-201-40996-3.
- ^ "Envisioning a Simplified Intel Architecture for the Future". intel.com. Intel. Retrieved 2023-05-22.
- ^ Shanley, Tom (1996). Protected mode software architecture. Taylor & Francis. p. 60. ISBN 0-201-55447-X.
- ^ "Chapter 3 PC System". PC 2001 System Design Guide (PDF). Intel Corporation and Microsoft Corporation. p. 52. Retrieved 2023-06-03.
SYS–0047. A20M# is always de-asserted (pulled high) at the processor
- ^ Intel System Programmers Manual Vol. 3A from June 2013.
- ^ a b c 86-DOS - Disk Operating System for the 8086 - Programmer's Manual (PDF). Version 0.3 (Preliminary ed.). Seattle, Washington, USA: Seattle Computer Products, Inc. 1980. pp. 7, 17. Archived from the original (PDF) on 2019-06-23. Retrieved 2011-09-13.
[...] This form is provided to simplify translation of 8080/Z80 programs into 8086 code, and is not recommended for new programs. [...] Memory size. This is the number of bytes available in the program segment. [...]
(41 pages)
- ^ a b Letwin, James (1985-04-10). "Method and operating system for executing programs in a multi-mode microprocessor". Microsoft. US06722052, US4779187A. Archived from the original on 2018-09-23. Retrieved 2018-09-23.
[...] Some programs written for the 8086 rely on [address wrap-around] to run properly. Unfortunately, memory locations extend above 1 megabyte in the real mode of the 80286 and are not wrapped to low memory locations. Consequently, programs including those written in MicroSoft PASCAL and programs which use the "Call 5" feature of MS-DOS will fail on the standard 80286 system. [...] For example, no PASCAL programs are loaded into memory below 64K, and a special instruction is placed in the lower memory locations above 1 megabyte–for example, address 100000h or 100010h. [...]
- ^ Taylor, Roger; Lemmons, Phil (June 1982). "Upward migration - Part 1: Translators - Using translation programs to move CP/M-86 programs to CP/M and MS-DOS" [Using translation programs to move CP/M programs to CP/M-86 and MS-DOS] (PDF). BYTE. Vol. 7, no. 6. BYTE Publications Inc. pp. 321–322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344 [342, 344]. ISSN 0360-5280. CODEN BYTEDJ. Archived (PDF) from the original on 2020-01-16. Retrieved 2020-01-15.
[...] Gaining Access to CP/M-86 [...] Gaining access to CP/M-86 requires placing the function code in the CL register, placing the byte parameter in the DL register or placing the word parameter in the DX register, placing the data segment in the DS register (the data segment is usually not changed for a converted program), and executing a software interrupt, INT #224. The result is returned in the AL register if it is a byte value; if the result is a word value, it is returned in both the AX and BX registers. Double-word values are returned with the offset in the BX registers and the segment in the ES register. Conversion of programs from CP/M-80 to CP/M-86, then, requires replacing the call to location 5 with the software interrupt INT #224. Another necessary change involves the warm boot. Under CP/M-80, the warm boot may be accessed by a system call with a function code of 0 for a jump to location 0. CP/M-86, however, does not support the jump to location 0. As a result, you must change this program exit in the translated program if the program is to run correctly. Provided that the call to location 5 is replaced with INT #224, that the warm boot change is made, and that the registers are mapped correctly, there should be little problem in getting the translated program to access the CP/M-86 system functions. [...] Gaining Access to MS-DOS [...] Although MS-DOS has a "preferred" mechanism through a soft-ware interrupt, INT #33, for accessing the system, an additional mechanism is provided for "preexisting" programs that is compatible with CP/M-80 calling conventions, at least for functions in the range of 0-36. As far as system calls within the allowed function range are concerned, the programmer doesn't have to do anything to translated programs to get them to run under MS-DOS other than to correctly map the registers. MS-DOS also supports the warm boot function of CP/M-80. 
A jump to location 0 under MS-DOS executes a software interrupt, INT #32, which is functionally a program end and the normal way to exit from a program. [...]
(13 pages)
- ^ a b Schäpers, Arne (1991). "Kapitel 5: EXEC im Detail - Program Segment Prefix (PSP)". DOS 5 für Programmierer: Die endgültige Referenz (in German) (1 ed.). Addison Wesley (Deutschland) GmbH. pp. 148–151, 971–972 [149, 971–972]. ISBN 3-89319-350-2. (1123+v pages, foldout, 5.25"-floppy)
- ^ a b "Format of Program Segment Prefix (PSP)". INTER61. 2000. Archived from the original on 2020-02-17. Retrieved 2019-12-19.
- ^ a b c d Necasek, Michal (2011-09-13). "Who needs the address wraparound, anyway?". OS/2 Museum. Archived from the original on 2020-02-19. Retrieved 2020-02-19.
[...] 86-DOS, and hence PC DOS/MS-DOS, used a clever trick. The byte at offset 5 of the PSP contained a far call opcode (9Ah); the word at offset 6 of the PSP contained the appropriate value to indicate program segment size, and also the offset part of the far call. The word at offset 8, which served as the segment part of the far call, was crafted such that when combined with the offset, it would wrap around (a well understood feature of the 8086 CPU) and point to address 0:C0h, which contains interrupt vector 30h. [...] A problem with the compatibility interface occurs when the loaded program has in fact less than 64KB available. If that happens, the word at PSP offset 6 may not contain the correct value, but the CALL 5 interface will still work; the instruction at offset 5 will be CALL 0:C0h, making the reported program segment size C0h. It is unclear why DOS does that; it appears to be a bug in DOS 5.0 and later, as DOS 4.0 and earlier versions simply adjust the segment portion so that it wraps around to 0:C0h. That works as long as the program segment size is paragraph aligned, and it will be. [...]
- ^ Norton, Peter (1985). The Peter Norton Programmer's Guide to the IBM PC (Illustrated ed.). Microsoft Corporation. ISBN 0-91484546-2. ISBN 978-0-91484546-1. p. 263:
[...] By a process too bizarre and complicated to explain, the segmented address is set so that it serves two purposes. Not only does it point to the DOS function dispatcher, but the offset part also indicates how much of the code segment we can use (up to hex FFF0, 16 bytes short of 64K). The offset part of the address, the part we are interested in, is located at offset 6 within the PSP, following the instruction's op-code at offset 5. The upshot of this is that if DOS has less than 64K to give our programs, we can use this field to learn how many bytes are available — a technique that should work with most or all windowing and multitasking systems. [...]
(426 pages)
- ^ a b "Caldera OpenDOS Machine Readable Source Kit (M.R.S) 7.01". Caldera, Inc. 1997-05-01 [1997-04-16]. Archived from the original on 2021-08-07. Retrieved 2022-01-02.
[...] BIOSINIT.A86 1.40 93/11/11 12:25:29 [...] VDISK header changes [...] BIOSINIT.A86 1.39 93/11/08 23:19:22 [...] SetupHMA does CALL5 initialisation [...] now fixup JMPF in hi-memory for CALL5 link for PC-NFS [...]
(NB. OpenDOS 7.01 M.R.S.: IBMBIO\BIOSINIT.A86 SetupHMA)
- ^ a b c d Necasek, Michal (2018-03-16). "The A20-Gate: It Wasn't WordStar". OS/2 Museum. Archived from the original on 2018-09-23. Retrieved 2018-09-23.
- ^ Parsons, Jeff (2018-05-27) [1987-12-01, 1987-08-02]. "Somebody Put a SPELL On Me". PCjs. Archived from the original on 2019-01-29. Retrieved 2019-04-21.
- ^ a b c Schulman, Andrew; Brown, Ralf D.; Maxey, David; Michels, Raymond J.; Kyle, Jim (1994) [November 1993]. Williams, Andrew (ed.). Undocumented DOS: A programmer's guide to reserved MS-DOS functions and data structures - expanded to include MS-DOS 6, Novell DOS and Windows 3.1. The Andrew Schulman Programming Series (1st printing, 2nd ed.). Reading, Massachusetts, USA: Addison Wesley Publishing Company. pp. 349–350. ISBN 0-201-63287-X. ISBN 978-0-201-63287-3.
[...] Leaving the A20 line enabled causes problems with programs that expect wraparound to occur [...] One such program was the unpacking routine Microsoft's own linker originally included with any file that had been EXEPACKed to reduce its size! According to Phillip Gardner, author of the shareware DOSMAX UMB maintenance utility and a veteran in the DOS disassembly area, the notorious "Packed File Corrupt" error message than began appearing everywhere shortly after the introduction of DOS 5.0 is directly due to the fact that the A20 line is enabled, and the original unpacking routine depended on the segment wraparound effect to properly expand the compressed files. [...]
(xviii+856+vi pages, 3.5"-floppy) (NB. On page 350, the book has a detailed description of the inner workings of the problematic EXEPACK uncompression routine.)
- ^ a b c d e Paul, Matthias R. (2002-10-07) [2000]. "Re: masm .com (PSP) related trouble". Newsgroup: alt.lang.asm. Archived from the original on 2017-09-03. Retrieved 2017-09-03.
[...] DR Concurrent DOS 386 (since 1988-07-08) will load EXEPACKed programs above the 64K mark, that is, outside "lowest memory", by extending the memory block containing the program's environment [...] DR DOS 5.0+ always loads .EXE-format programs with no fixups, and (since 1990-05-25) also .COM-format programs compressed with SpaceMaker - and therefore starting with 9Ch 55h (PUSHF/PUSH BP) - above the 64K mark to avoid the EXEPACK wrap around bug. It does this by extending the memory block containing the program's environment, since 1989-12-14 it will even allocate multiple fillers when necessary. This environment expansion code is disabled if the name of the parent program as stored in the MCB is "WIN" to improve performance when WIN.COM starts KERNEL.EXE (0 relocation items). [...] the MS-DOS/PC DOS 5.0+[...] kernel scans for a variety of code sequences in .EXE format executables and applies patches for various versions of EXEPACKed files in order to let them run in lowest memory (when DOS is in the HMA), that is, a load segment < 64 Kb. Otherwise they would display "Packed file corrupt". The code checks that the code's entry point [...] is not < 0002h [...] and then reads the WORD immediately preceding the entry point [...] If this WORD reads 5242h ("RB"), the file is assumed to be EXEPACKed. The code then looks for one of several combinations of code sequences at offsets from this "RB" signature. [...] the MS-DOS 5.0+[...] kernel scans for an unknown class of .COM executables. If their signatures are found in the file, the A20 countdown variable at offset 18h in the disk buffer info table (see Table "DOS 5.0-6.0 disk buffer info") will be set to 10, which will cause A20 to be disabled after INT 21h calls for this count of INT 21h calls to follow. Presumably this class of programs requires A20 to be disabled for some time after it begins execution. (Similar actions occur on entry into INT 21h/AH=25h and AH=49h.) [...]
- ^ a b Paul, Matthias R. (1997-07-30) [1996-06-18, 1994-05-01]. "V.4. Bessere Speicherausnutzung mit selbsthochladenden Programmen". NWDOS-TIPs — Tips & Tricks rund um Novell DOS 7, mit Blick auf undokumentierte Details, Bugs und Workarounds. Release 157 (in German) (3 ed.). Archived from the original on 2016-11-04. Retrieved 2014-08-06.
(NB. The provided link points to an HTML-converted version of the NWDOSTIP.TXT, which is part of the MPDOSTIP.ZIP collection.)
- ^ Pascal Compiler (PDF). Personal Computer Computer Language Series (1 ed.). International Business Machines Corporation. August 1981. Archived (PDF) from the original on 2020-05-29. Retrieved 2018-09-23.
- ^ "NAME ENTX - Microsoft MS-DOS Computer Pascal runtime system control". Version 1.00. Microsoft Corp. 1981. Archived from the original on 2020-02-23. Retrieved 2020-02-23.
[...] DX is final DS (may be negative) [...] final DS value (may be negative) [...]
- ^ "Expert Report of Robert B. K. Dewar In Response To The Report Of Kenneth D. Crews". Cambridge University Press et al v. Patton et al, Filing 124, Supplemental Initial Disclosures by Cambridge University Press, Oxford University Press, Inc., Sage Publications, Inc. - Cambridge University Press, Oxfort University Press, Inc., and Sage Publications, Inc. v. Mark P. Becker, Georgia State University President, et al, Civil Action No. 1:08-CV-1425-ODE (Court document). United States District Court For The Northern District Of Georgia, Atlanta Division. p. 18. Exhibit A. Archived from the original on 2018-05-01. Retrieved 2019-04-23.
[...] SPACEMAKER and TERMULATOR, commodity software for IBM PC (PC DOS file compression utility and VT-100 emulator), being marketed by Realia, Inc. R.B.K. Dewar (1982-1983), 8088 assembly language, 8,000 lines [...]
- ^ Realia, Inc. (January 1983). "If you use DOS, you need this program". PC Magazine (advertisement). 2 (9). Ziff-Davis Publishing: 417. Archived from the original on 2019-04-22. Retrieved 2019-04-22.
- ^ a b Dewar, Robert Berriedale Keith (1984-03-13). "DOS 3.1 ASMB (Another Silly Microsoft Bug)". info-ibmpc@USC-ISIB.ARPA. Archived from the original on 2018-05-01. Retrieved 2019-04-23.
[...] The /E option of the linker should generate an EXE file which is logically equivalent to the uncompressed EXE file. The current version [...] results in AX being clobbered. AX on entry to an EXE file has a definite meaning (it indicates drive validity for the parameters), thus it should be passed through to the uncompressed image. Given this one very obvious violation of the interface rules, there may be others, I have not bothered to investigate further [...] I did write the Realia SpaceMaker program which does a similar sort of thing to the EXEPACK option (but needless to say does not have this particular [...]
- ^ Necasek, Michal (2018-04-30). "Realia SpaceMaker". OS/2 Museum. Archived from the original on 2019-01-27. Retrieved 2019-02-22.
- ^ a b c Necasek, Michal (2018-03-23). "EXEPACK and the A20-Gate". OS/2 Museum. Archived from the original on 2018-11-13. Retrieved 2019-04-20.
Further reading
- Brouwer, Andries Evert (2001). "A20 - a pain from the past". Archived from the original on 2017-09-09. Retrieved 2017-09-09.
- Collins, Robert R. (2001). "A20/Reset Anomalies". Archived from the original on 2017-09-09. Retrieved 2017-09-09.
- Necasek, Michal (2018-01-30) [2018-01-28, 2018-01-26]. "WordStar Again". OS/2 Museum. Archived from the original on 2019-07-28. Retrieved 2019-07-28.
- Ingenoso, Tony (1998-12-20). "Chapter 13 - The A20 gate and the HMA". Making Code Work Better - How to minimize the size of 80x86 code and sometimes make it faster (e-book). Archived from the original on 2019-11-18. Retrieved 2019-11-18.
- Ludloff, Christian (2011). "x86 architecture legacy stuff: KBC, PS/2, and A20M#". sandpile.org. Archived from the original on 2021-08-15. Retrieved 2022-01-02.
A20 line
Background
Early PC Memory Limitations
The Intel 8086 and 8088 microprocessors, which powered the original IBM PC and its compatible systems, featured a 20-bit address bus consisting of lines A0 through A19, enabling direct access to a maximum of 1 MB of physical memory, equivalent to 2^20 bytes, or addresses 00000h to FFFFFh.[3] This hardware constraint arose from the design of the Bus Interface Unit (BIU), which generated 20-bit physical addresses to interface with memory and I/O devices.[4]
In real mode, the addressing scheme employed 16-bit segment registers (CS for code, DS for data, SS for stack, and ES for extra) combined with 16-bit offsets to form physical addresses, calculated as (segment value × 16) + offset, theoretically spanning the full 1 MB space.[4] However, this mechanism introduced wraparound behavior at the 1 MB boundary, where addresses exceeding FFFFFh would cycle back to 00000h, potentially causing unintended overlaps or errors in memory access without explicit software management.[4] Each segment was limited to 64 KB (65,536 bytes, addressed by offsets from 0000h to FFFFh), with segments aligned on 16-byte boundaries to facilitate overlapping and contiguous addressing.[4]
The IBM PC, introduced in 1981, and its XT successor typically shipped with base read/write RAM configurations ranging from 16 KB to 64 KB on the system board, expandable in 16 KB increments up to a maximum of 640 KB for general-purpose use by applications and the operating system.[5] The upper memory regions were reserved: 128 KB from A0000h to BFFFFh for video RAM (e.g., 32 KB for monochrome display at B0000h–B7FFFh or 32 KB for color/graphics at B8000h–BFFFFh), and 64 KB from F0000h to FFFFFh for ROM containing BIOS firmware, I/O drivers, and the Cassette BASIC interpreter.[5] This allocation left the 384 KB from 640 KB to 1 MB unavailable for base memory expansion, enforcing a practical limit on usable RAM.[5]
Real-mode addressing quirks, such as the 64 KB segment size, required programmers to manage transitions across segment boundaries by adjusting the segment register when offsets approached FFFFh, since an offset incremented past FFFFh wraps back to 0000h within the same segment and would access unintended memory locations.[4] These characteristics, including the overlapping nature of segments offset by 16 bytes, influenced compatibility decisions in early PC software and hardware design to ensure reliable operation within the constrained 1 MB space.[4]
Introduction of Extended Addressing
The original Intel 8086 and 8088 processors employed 20-bit addressing, constraining them to a 1 MB memory space with inherent wraparound at the upper limit.[6] In February 1982, Intel introduced the 80286 microprocessor, which expanded physical addressing to 24 bits (A0–A23), enabling access to up to 16 MB of memory and marking a significant advancement beyond the 8086's capabilities.[7] The 80286 incorporated two primary operating modes to balance innovation with legacy support: protected mode, which fully leveraged the 24-bit address bus for the complete 16 MB address space, and real mode, which preserved the 20-bit addressing and segmented memory model of the 8086 to ensure object-code compatibility with existing software.[8]
The IBM PC/AT, released in 1984, integrated the 80286 processor and introduced extended memory above the traditional 1 MB boundary, allowing for greater RAM capacity through onboard and expansion options while maintaining real-mode operation as the default for booting and running 8086-compatible applications. However, this transition created compatibility challenges, as the 80286's real mode did not automatically replicate the 8086's exact addressing behavior without additional hardware intervention. To achieve seamless 8086 emulation and prevent unintended access to the 1–2 MB address range, which would disrupt wraparound expectations, early motherboard designs like the IBM AT's included built-in logic to disable the A20 address line by default.[1]
Technical Mechanism
Role of the A20 Address Line
The A20 address line carries bit 20 (zero-indexed) of the address bus, the 21st address bit in x86 processor addressing, and distinguishes the lower 1 MB of memory (addresses 00000h to FFFFFh) from the region immediately beyond it (1 MB to 2 MB).[9] This line was introduced with the Intel 80286 processor, which featured a 24-bit address bus capable of accessing up to 16 MB in protected mode, but required specific handling in real mode to maintain compatibility with earlier systems.[10]
When the A20 line is held low (0) by the external gate, bit 20 is masked on the address bus, so addresses at or above 1 MB wrap around to the base of memory. This emulates the 20-bit addressing of the original Intel 8086 processor and limits the effective addressable space to 1 MB (00000h to FFFFFh).[9] For example, a linear address of 100000h maps to 00000h under this condition, preventing unintended access to higher memory regions and preserving compatibility with legacy 8086/8088 applications that assume wraparound at 1 MB.[9] In real mode on the 80286, the physical address is 24 bits wide with A21–A23 always low, but A20 may go high for addresses at or above 1 MB; the external A20 gate forces A20 low to enforce the 1 MB boundary.[10]
When the A20 line is asserted high (1), bit 20 is unmasked, allowing full 21-bit addressing in real mode and extending access to the High Memory Area (HMA), which spans 100000h to 10FFEFh (65,520 bytes, 16 bytes short of 64 KB).[9] This provides a modest expansion beyond the traditional 1 MB limit without switching to protected mode, though the HMA is limited to what a single real-mode segment based at FFFFh can reach.[9] Later processors, starting with the Intel 80486, moved the A20 mask onto the CPU itself (via the A20M# pin) while preserving it for real-mode compatibility with 80286-era software and hardware designs.[10]
A20 Gating for Compatibility
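The masking described above can be illustrated with a few lines of arithmetic. The following is a minimal sketch, not a hardware model: the function name `physical_address` and its `a20_enabled` parameter are hypothetical, chosen only to show how forcing bit 20 low folds the 1 MB to 2 MB range back onto the first megabyte.

```python
A20_BIT = 1 << 20  # bit 20 of the physical address (the 21st address bit)


def physical_address(segment: int, offset: int, a20_enabled: bool) -> int:
    """Return the address placed on the bus for a real-mode segment:offset pair.

    Assumption of this sketch: the CPU computes segment * 16 + offset without
    wrapping (80286-style real mode), and an external gate may force A20 low.
    """
    linear = ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)
    if not a20_enabled:
        linear &= ~A20_BIT  # gate holds A20 low: 100000h..10FFEFh alias 00000h..0FFEFh
    return linear


# FFFF:0010 is the first byte above 1 MB.
print(hex(physical_address(0xFFFF, 0x0010, a20_enabled=False)))  # 0x0
print(hex(physical_address(0xFFFF, 0x0010, a20_enabled=True)))   # 0x100000

# FFFF:FFFF is the highest address reachable in real mode (top of the HMA).
print(hex(physical_address(0xFFFF, 0xFFFF, a20_enabled=True)))   # 0x10ffef
```

With the gate enabled, the span from 100000h to 10FFEFh contains 65,520 distinct bytes; with it disabled, the same segment:offset pairs land in the first 64 KB of memory.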
The A20 gate, also referred to as Gate-A20, is a hardware circuit integrated into the motherboard design of the IBM PC/AT, introduced in 1984, which forces the A20 address line to a low state upon system reset or power-up.[2] This mechanism emulates the 20-bit addressing behavior of the original Intel 8086/8088 processors used in the IBM PC, limiting effective addressable memory to 1 MB in real mode.[11] By disabling the A20 line by default, the gate ensures that the 21st bit of the address bus (A20) remains inactive, causing memory addresses exceeding 1 MB to wrap around to the base of the address space rather than accessing higher memory regions.[12]
The primary purpose of this gating logic was to maintain backward compatibility with software developed for the 8086-based IBM PC, much of which relied on the wraparound behavior at the 1 MB boundary for proper operation.[2] Without this intervention, programs running in real mode on the 80286 processor, which is capable of 24-bit addressing up to 16 MB, could inadvertently access undefined memory areas above 1 MB, leading to unpredictable behavior or crashes in ported applications such as early versions of QDOS or Microsoft development tools.[2] This design choice allowed IBM to transition to more advanced processors while preserving the vast existing software ecosystem without requiring widespread recompilation or modifications.[11]
In early implementations on the PC/AT, the A20 gate employed straightforward hardware elements, such as AND gates or latches, directly tied to the CPU's reset signals to enforce the low state of A20.[12] These simple circuits sufficed in the absence of advanced features like on-chip caches, as the PC/AT's architecture focused on reliable emulation of prior systems.[2] As processor architectures advanced into the 80386 era around 1985, the A20 gating mechanism grew more intricate to accommodate new complexities such as instruction pipelining and the potential for external caching, which could otherwise introduce inconsistencies in address handling during real-mode operations.[2] Despite these enhancements, the core principle of forcing A20 low on initialization endured to support real-mode emulation, ensuring continued compatibility with legacy software environments.[12]
Enabling Methods
Keyboard Controller Approach
The Intel 8042 keyboard controller, introduced as the standard interface in the IBM PC/AT in 1984, provided the primary mechanism for enabling the A20 address line through dedicated output port commands. This microcontroller managed keyboard and auxiliary device inputs while also controlling system signals, including the A20 gate via bit 1 of its output port. The controller's ubiquity in the PC/AT design and subsequent compatible systems, including IBM PS/2 machines, established it as the de facto standard for A20 management across x86 platforms.[1]
To enable the A20 line, software writes the command byte 0xD1 to the controller's command port at I/O address 0x64, instructing it to prepare for an output port update; it then waits for the input buffer to empty (indicated by bit 1 of the status register at 0x64 being clear) before writing the data byte 0xDF to the data port at 0x60, which sets output port bit 1 high and activates the gate. Disabling A20 follows a similar sequence: write 0xD1 to 0x64, wait for buffer readiness, then write 0xDD to 0x60 to clear bit 1. This procedure, which typically requires polling the status register in a loop with timeouts up to 20 ms for command acknowledgment, is routinely implemented in BIOS initialization routines and operating system bootloaders to ensure reliable memory addressing transitions. The A20 signal propagates within 20 μs of the controller accepting the port data.[1]
Early implementations of the 8042 exhibited inherent processing delays, with response times for commands ranging from 15 ms to start transmission to 2 ms for completion, compounded by the need for buffer synchronization. These latencies, including signal switching times on the order of 20 μs, posed reliability challenges in faster systems where rapid A20 toggling was required, often necessitating extended wait loops or retries to avoid incomplete gate operations during mode switches.[1]
FAST and Alternative Gates
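As a point of reference for the faster gates, the keyboard controller sequence from the preceding section can be sketched against a simulated controller. Everything here is illustrative: `Fake8042` is a hypothetical toy that only models the input-buffer-full status bit and the output port latch, and `set_a20_via_8042` is an assumed helper name. Real code would perform port I/O from firmware or kernel context and would bound its waits by time rather than by poll count.

```python
STATUS_PORT = 0x64        # read: status register, write: command register
DATA_PORT = 0x60          # write: data byte for the pending command
INPUT_BUFFER_FULL = 0x02  # status bit 1: controller has not consumed the last byte
WRITE_OUTPUT_PORT = 0xD1  # command: next data byte goes to the output port


class Fake8042:
    """Toy stand-in for the controller (hypothetical, for illustration only)."""

    def __init__(self, busy_polls: int = 3):
        self.output_port = 0xDD   # bit 1 (A20) starts clear, as after reset
        self._busy_polls = busy_polls
        self._busy = 0
        self._pending_output_write = False

    def inb(self, port: int) -> int:
        if port != STATUS_PORT:
            raise ValueError("this sketch only models reads of the status register")
        if self._busy > 0:
            self._busy -= 1
            return INPUT_BUFFER_FULL
        return 0x00

    def outb(self, port: int, value: int) -> None:
        if self._busy > 0:
            raise RuntimeError("byte written while the input buffer was still full")
        self._busy = self._busy_polls
        if port == STATUS_PORT:
            self._pending_output_write = (value == WRITE_OUTPUT_PORT)
        elif port == DATA_PORT and self._pending_output_write:
            self.output_port = value & 0xFF
            self._pending_output_write = False


def wait_input_empty(bus, max_polls: int = 1000) -> None:
    """Poll status bit 1 until the controller is ready for another byte."""
    for _ in range(max_polls):
        if not (bus.inb(STATUS_PORT) & INPUT_BUFFER_FULL):
            return
    raise TimeoutError("8042 input buffer never drained")


def set_a20_via_8042(bus, enable: bool) -> None:
    wait_input_empty(bus)
    bus.outb(STATUS_PORT, WRITE_OUTPUT_PORT)       # announce an output port update
    wait_input_empty(bus)
    bus.outb(DATA_PORT, 0xDF if enable else 0xDD)  # bit 1 set enables A20
    wait_input_empty(bus)                          # let the controller latch the value


kbc = Fake8042()
set_a20_via_8042(kbc, enable=True)
print(hex(kbc.output_port))  # 0xdf
```

The repeated waits are the reason this path is slow: each byte must be consumed by the microcontroller before the next one is sent.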
The FAST A20 gate provides a direct hardware method to enable the A20 address line by manipulating the system control port at I/O address 0x92, bypassing the slower keyboard controller sequence.[13] This approach sets bit 1 (value 0x02) of the port, which controls the A20 gate on compatible chipsets, allowing activation in a single I/O operation without the delays associated with command queuing in the 8042 controller.[14] It became common in later IBM PS/2 systems and many 386-era motherboards starting in the late 1980s, offering significantly faster enabling for extended memory access in performance-critical bootloaders and early operating systems.[13][2]
To implement the FAST gate, software reads the current byte from port 0x92, performs a bitwise OR with 0x02 to set the A20 bit, and writes the result back. The read-modify-write matters because the port carries other functions: bit 0, for example, triggers a CPU reset when set, so it must not be set by accident. A direct write of 0x02 was occasionally used but could interfere with port functionality on some hardware.[13] The method was also found on systems using AMI BIOS variants.[13] Chipset-specific implementations, such as VIA's VT82C496G from the 1990s, supported fast A20 via port 0x92 writes or other methods for enabling A20 on PCI/ISA platforms.[15]
Despite its speed advantages, the FAST A20 gate was not universally supported across all motherboards, leading to risks like system lockups or hardware damage if written to on incompatible systems.[2] Reliable enabling required prior detection routines, such as probing the port's response or checking BIOS extension functions (e.g., INT 15h, AH=0x24), to confirm compatibility before attempting the write.[14] This method saw adoption in operating systems like OS/2 and early Windows versions, where it supplemented the keyboard controller approach on detected hardware for quicker initialization.[2]
Software Implications
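In software terms, the port 0x92 method from the preceding section reduces to a read-modify-write. The sketch below runs against a hypothetical in-memory `FakePorts` object rather than real hardware, and `enable_fast_a20` is an assumed name. Two choices go beyond what the text requires and are assumptions of this sketch: the write is skipped when the bit is already set, and bit 0 (the reset bit) is explicitly cleared as a defensive measure.

```python
PORT_92 = 0x92
A20_BIT = 0x02    # bit 1: A20 gate
RESET_BIT = 0x01  # bit 0: setting it resets the CPU, so never write it as 1


class FakePorts:
    """Hypothetical stand-in for an I/O bus with only port 0x92 present."""

    def __init__(self, initial: int):
        self.regs = {PORT_92: initial & 0xFF}
        self.writes = 0

    def inb(self, port: int) -> int:
        return self.regs[port]

    def outb(self, port: int, value: int) -> None:
        self.regs[port] = value & 0xFF
        self.writes += 1


def enable_fast_a20(bus) -> int:
    """Set the A20 bit in port 0x92 while leaving unrelated bits alone."""
    value = bus.inb(PORT_92)
    if value & A20_BIT:
        return value  # already enabled: avoid touching the port at all
    value = (value | A20_BIT) & ~RESET_BIT & 0xFF
    bus.outb(PORT_92, value)
    return value


bus = FakePorts(initial=0x0C)
print(hex(enable_fast_a20(bus)))  # 0xe
print(bus.writes)                 # 1
```

Writing a bare 0x02 instead would zero every other bit in the register, which is the interference the text warns about.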
A20 Handlers and Memory Managers
A20 handlers are software routines implemented in BIOS firmware or operating system code to detect the status of the A20 address line and toggle it as needed for accessing memory beyond the 1 MB boundary. These handlers ensure compatibility with real-mode addressing while enabling extended memory access, particularly for operations involving the high memory area (HMA), which spans the 64 KB region immediately above 1 MB.[16] In BIOS implementations, handlers often leverage interrupt 15h with AH=87h to perform block moves of extended memory, a function that requires the A20 line to be enabled to avoid address wrapping and ensure correct data transfer up to 16 MB.[17]
A prominent example of an A20 handler is provided by HIMEM.SYS, a device driver bundled with MS-DOS 5.0 in 1991, which implements the eXtended Memory Specification (XMS) for managing memory above 1 MB.[16] HIMEM.SYS installs a dedicated A20 handler to control access to the HMA, supporting multiple machine-specific methods to toggle the line and allocating the HMA as a single 64 KB block for exclusive use by one program at a time.[18] This driver detects the host system's configuration and selects an appropriate handler, such as those for IBM PC/AT compatibles, ensuring reliable enabling of extended memory without hardware conflicts.[19]
Detection of the A20 line's status in these handlers typically involves a probing sequence that writes a known value to an address just above 1 MB, such as 0x100500 (reachable in real mode as FFFF:0510), and then reads the address exactly 1 MB lower, 0x000500 (0000:0500), to check for aliasing due to wrapping.[13] If the low address now holds the value just written, the two addresses alias and the A20 line is disabled; otherwise, it is enabled.
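The aliasing probe can be sketched against a simulated memory. This is an illustration under stated assumptions rather than any particular handler's code: `ToyMemory` and `a20_is_enabled` are hypothetical names, and the probe uses the address pair 0x000500 and 0x100500, which differ only in bit 20. A real routine would run in real mode with interrupts disabled and would use segment:offset pairs such as 0000:0500 and FFFF:0510.

```python
A20_BIT = 1 << 20


class ToyMemory:
    """2 MB of byte-addressable memory behind an optional A20 mask (hypothetical)."""

    def __init__(self, a20_enabled: bool):
        self.a20_enabled = a20_enabled
        self._cells = bytearray(2 * 1024 * 1024)

    def _bus_address(self, addr: int) -> int:
        # With the gate off, bit 20 never reaches the memory array.
        return addr if self.a20_enabled else addr & ~A20_BIT

    def read(self, addr: int) -> int:
        return self._cells[self._bus_address(addr)]

    def write(self, addr: int, value: int) -> None:
        self._cells[self._bus_address(addr)] = value & 0xFF


def a20_is_enabled(mem, low: int = 0x000500, high: int = 0x100500) -> bool:
    """Return True if `low` and `high` are distinct cells, i.e. A20 is not masked."""
    saved_low, saved_high = mem.read(low), mem.read(high)
    mem.write(low, 0x00)
    mem.write(high, 0xFF)              # lands on `low` if the address wraps
    aliased = mem.read(low) == 0xFF
    mem.write(high, saved_high)        # restore both locations before returning
    mem.write(low, saved_low)
    return not aliased


print(a20_is_enabled(ToyMemory(a20_enabled=True)))   # True
print(a20_is_enabled(ToyMemory(a20_enabled=False)))  # False
```

Writing two different values, rather than comparing whatever happens to be in memory, avoids a false "disabled" result when the two locations coincidentally hold the same byte.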
Handlers retry with alternative enabling methods, such as keyboard controller commands or chipset-specific ports, if the probe shows that an attempt failed, prioritizing non-disruptive approaches to maintain system stability.[16]
Beyond HIMEM.SYS, other DOS memory managers incorporate A20 handling for advanced features like multitasking and expanded memory emulation. EMM386.EXE, Microsoft's expanded memory manager for MS-DOS, builds on XMS by emulating LIM 4.0 expanded memory using extended memory pages and relies on an underlying A20 handler (often from HIMEM.SYS) to switch between real and protected modes for upper memory block (UMB) allocation.[18] Similarly, Quarterdeck's QEMM provides an integrated A20 handler alongside its optimized memory management, enabling efficient UMB usage and multitasking by dynamically controlling the line for virtual 8086 mode tasks.[20] In Unix-like systems, such as early Linux implementations, bootloaders include A20 enabling routines in assembly code (e.g., setup.S) to prepare for kernel loading, using probes and controller commands to ensure full 32-bit addressing before protected mode entry.[21]
Affected Programs and Workarounds
Early DOS programs, particularly those developed for the 8086 processor, often relied on the 1 MB address wraparound behavior, where memory accesses beyond 1 MB folded back to the beginning of the address space. This assumption caused compatibility issues on 80286 and later processors, where the A20 gate was disabled by default to emulate the wraparound but could be enabled for extended memory access, breaking programs that depended on it. For example, Microsoft Pascal 1.0 (1981) used negative segment register values for copying static data, which wrapped around correctly on the 8086 but failed on the 80286 without wraparound emulation.[22]
Compilers like the Small C compiler for MS-DOS exploited the CALL 5 interface in the Program Segment Prefix (PSP), a holdover from CP/M compatibility that invoked DOS functions via a far call whose target lies just above 1 MB (for example F01D:FEF0, physical address 1000C0h) and depends on wraparound to reach the entry point in low memory at 0000:00C0. This mechanism, present since DOS 1.0 (1981), failed if the A20 line was enabled, as the wraparound no longer occurred and the call was sent to the wrong location. Similarly, executables compressed with Microsoft's EXEPACK utility (introduced in 1985 with Microsoft C 3.0) unintentionally depended on wraparound during in-memory unpacking; when loaded below 64 KB with A20 enabled, the unpacking algorithm produced corrupt code due to misaligned relocations.[22][23]
To address these issues, early workarounds involved loading affected programs at higher memory addresses (above 64 KB) so that the unpacking code no longer depended on wraparound, as outlined in a 1985 Microsoft patent for compatibility enhancements. In MS-DOS 3.3 (1987) and subsequent versions, the operating system began incorporating basic extended memory support, but specific A20-related fixes were limited; programs often required manual intervention, such as avoiding high memory loading.
By MS-DOS 5.0 (1991), built-in exepatching mechanisms detected and modified common affected executables on-the-fly, particularly EXEPACKed files like EDIT.COM and FDISK.EXE, ensuring they functioned even when core DOS was relocated to the high memory area (HMA) with A20 enabled.[23]
The LOADFIX command, available in MS-DOS 5.0 and later, provided a user-friendly workaround by reserving the first 64 KB of conventional memory so that the program it launched loaded above that boundary, bypassing EXEPACK unpacking flaws or CALL 5 wraparound dependencies.[23] For systems using extended memory managers, HIMEM.SYS (included with MS-DOS 5.0 and later) allowed configuration of A20 handling via options in CONFIG.SYS, such as /A20CONTROL:OFF to prevent HIMEM from overriding existing A20 control by hardware or other drivers, thus maintaining wraparound for legacy software.[24] Games and terminate-and-stay-resident (TSR) utilities from the mid-1980s, such as early Sierra On-Line adventure titles, frequently crashed if A20 was enabled mid-execution, as they assumed uniform low-memory access patterns without wraparound disruptions; users mitigated this by disabling memory managers like HIMEM.SYS during gameplay or using LOADFIX to enforce compatible loading.[2]
Modern Legacy
Phasing Out in CPUs
The A20M# pin, an active-low signal that masks address line A20 to emulate the 1 MB memory wraparound of the original 8086 processor, was introduced by Intel in the 80486 microprocessor in 1989 to provide explicit hardware control over A20 gating integrated directly into the CPU.[25] This pin allowed system designers to assert it low during real-address mode operations for backward compatibility with 8086 software, while enabling full 32-bit addressing when deasserted. The feature was retained in subsequent Intel architectures, including the Pentium series launched in 1993, where the pin similarly forced bit 20 to zero for on-chip cache accesses and external bus cycles when asserted.[26]
Support for the A20M# pin persisted through the Core 2 family in 2006, ensuring compatibility with legacy real-mode environments.[27] However, starting with the Nehalem microarchitecture in 2008, the physical pin was removed from the CPU, with the A20M# signal instead generated by the chipset and forwarded to the processor via the QPI interface.[28] In Intel's 64-bit mode (IA-32e), introduced with earlier architectures and extended in Nehalem, the CPU continued to emulate the real-mode A20 behavior internally, applying the mask to effective addresses in compatibility sub-mode to maintain 8086-like wrapping unless explicitly managed by software or hardware signals.[29] This emulation preserved the functionality for older operating systems that relied on A20 gating during boot or legacy execution.
The full deprecation of A20M# hardware support was noted starting with the Haswell microarchitecture in 2013, where Intel's System Programming Manual (Volume 3A) explicitly stated that the functionality, primarily used by legacy systems, was no longer supported in hardware on newer Intel 64 processors, and the signal may be ignored in long mode.[29] In subsequent generations like Skylake and beyond (starting 2015), the CPU ignores the A20M# signal in long mode, as 64-bit addressing inherently exceeds the 1 MB boundary without needing emulation, rendering the pin vestigial or absent on the package. AMD processors followed a parallel path, incorporating A20M# support in their AMD64 architecture for 8086 compatibility via address bit masking,[30] but this became vestigial in later designs from the late 2010s onward, where real-mode emulation persists only for minimal legacy needs without dedicated pin control.
Relevance in Contemporary Systems
In x86 systems utilizing legacy BIOS rather than UEFI, the A20 line must still be enabled during the boot process to support real-mode code execution, particularly for accessing memory beyond the first megabyte in components like option ROMs on expansion cards. This requirement persists to maintain compatibility with the original 8086 addressing model, where the A20 gate is initially disabled upon power-on to emulate memory wraparound, necessitating explicit enabling by the BIOS or bootloader for access above 1 MB in real mode. UEFI firmware, by contrast, automatically enables the A20 line early in the boot sequence, eliminating the need for manual intervention in modern firmwares.[31][12][32]
Operating system loaders such as GRUB continue to incorporate A20 handlers to ensure compatibility with legacy partitions, including those formatted for DOS or Windows 9x, allowing seamless booting of older real-mode environments on contemporary hardware. For instance, GRUB automatically enables the A20 gate during its initialization to access extended memory regions required by these legacy systems. Similarly, in scenarios involving legacy boot modes, components like the Windows Boot Manager rely on underlying firmware or compatibility layers to manage A20 state transitions when supporting older installation media or partitions. This handling prevents addressing conflicts and ensures that real-mode code from vintage OSes can execute without modification.[33][34]
Emulation environments replicate A20 gating to achieve faithful simulation of 1980s-era PCs, preserving the exact memory addressing behaviors for software compatibility testing and retro computing.
Virtual machines like QEMU and VMware emulate the A20 line's toggle via I/O ports (e.g., 0x92 or keyboard controller), allowing users to disable it for authentic real-mode wraparound or enable it for protected-mode transitions, which is essential for running unmodified DOS applications or debugging legacy firmware. Retro hardware platforms, such as the MiSTer FPGA with its ao486 or PCXT cores, implement A20 logic in hardware description language to mirror the original IBM PC/AT chipset, enabling cycle-accurate execution of 8088/286/386 software including games and utilities that depend on gated addressing.[35][36][37]
As of 2025, A20 concepts see niche applications in embedded x86 systems for precise memory mapping in control applications and in hardware modifications like the original Xbox A20 hack, which grounds the line to bypass secure boot and map flash memory directly, facilitating custom firmware installation on 2001-era consoles. These uses highlight lingering compatibility needs in specialized domains, though the A20 gate has had no mainstream impact on PC hardware or software since Intel phased out support around 2013 with the Haswell architecture.[38][39][40]
References
- https://en.wikibooks.org/wiki/X86_Assembly/Bootloaders