Intel HEX
| Intel hex | |
|---|---|
| Filename extensions | General-purpose: .hex,[1] .mcs,[2] .int,[3] .ihex, .ihe, .ihx[4]; platform-specific: .h80, .h86,[5][6] .a43,[7][4] .a90[7][4]; split, banked, or paged: .hxl–.hxh,[8] .h00–.h15, .p00–.pff[9]; binary or Intel hex: .obj, .obl,[8] .obh,[8] .rom, .eep |
Intel hexadecimal object file format, Intel hex format or Intellec Hex is a file format that conveys binary information in ASCII text form,[10] making it possible to store the data on non-binary media such as paper tape or punch cards, to display it on text terminals, or to print it on line-oriented printers.[11] The format is commonly used for programming microcontrollers, EPROMs, and other types of programmable logic devices and hardware emulators. In a typical application, a compiler or assembler converts a program's source code (such as in C or assembly language) to machine code and outputs it into an object or executable file in hexadecimal (or binary) format. In some applications, the Intel hex format is also used as a container format holding packets of stream data.[12] Common file extensions used for the resulting files are .HEX[1] or .H86.[5][6] The HEX file is then read by a programmer to write the machine code into a PROM or is transferred to the target system for loading and execution.[11][13] Various tools convert files from hexadecimal to binary format (e.g. HEX2BIN) and vice versa (e.g. OBJHEX, OH, OHX, BIN2HEX).
History
The Intel hex format was originally designed for Intel's Intellec Microcomputer Development Systems[14]: 10–11 (MDS) in 1973 to load and execute programs from paper tape. It was also used to specify memory contents to Intel for ROM production,[15] which previously had to be encoded in the much less efficient BNPF (Begin-Negative-Positive-Finish) format.[14]: 11 In 1973, Intel's "software group" consisted only of Bill Byerly and Kenneth Burgett, with Gary Kildall as an external consultant who did business as Microcomputer Applications Associates (MAA) and founded Digital Research in 1974.[16][17][18][9] Beginning in 1975, the format was used by Intellec Series II ISIS-II systems supporting diskette drives, with files using the file extension HEX.[19] Many PROM and EPROM programming devices accept this format.
Format
Intel HEX consists of lines of ASCII text that are separated by line feed or carriage return characters or both. Each text line contains uppercase hexadecimal characters that encode multiple binary numbers. The binary numbers may represent data, memory addresses, or other values, depending on their position in the line and the type and length of the line. Each text line is called a record.
Record structure
A record (line of text) consists of six fields (parts) that appear in order from left to right:[11]
- Start code, one character, an ASCII colon ':'. All characters preceding this symbol in a record should be ignored.[15][5][20][21][22][23] In fact, very early versions of the specification even asked for a minimum of 25 NUL characters to precede the first record and follow the last one, owing to the format's origins as a paper tape format, which required some tape lead-in and lead-out for handling.[15][24][21][22] However, as this was a little-known part of the specification, not all software copes with this correctly. The rule allows other related information to be stored in the same file (and even on the same line),[15][23] a facility used by various software development utilities to store symbol tables or additional comments,[25][15][21][26][9][27] and by third-party extensions that use other characters as start codes, such as the digits '0'..'9' by Intel[28] and Keil,[26] '$' by Mostek,[29][30] or '!', '@', '#', '\', '&' and ';' by TDL.[30][31] By convention, '//' is often used for comments.[32][33] None of these extensions may contain any ':' characters as part of the payload.
- Byte count, two hex digits (one hex digit pair), indicating the number of bytes (hex digit pairs) in the data field. The maximum byte count is 255 (0xFF). The values of 8 (0x08),[9] 16 (0x10)[9] and 32 (0x20) are commonly used byte counts. Not all software copes with counts larger than 16.[2]
- Address, four hex digits, representing the 16-bit beginning memory address offset of the data. The physical address of the data is computed by adding this offset to a previously established base address, thus allowing memory addressing beyond the 64 kilobyte limit of 16-bit addresses. The base address, which defaults to zero, can be changed by various types of records. Base addresses and address offsets are always expressed as big endian values.
- Record type (see record types below), two hex digits, 00 to 05, defining the meaning of the data field.
- Data, a sequence of n bytes of data, represented by 2n hex digits. Some records omit this field (n equals zero). The meaning and interpretation of data bytes depends on the application. (When 4-bit data is stored, each value occupies either the lower or the upper half of a byte, so one byte holds only one addressable data item.[15])
- Checksum, two hex digits, a computed value that can be used to verify the record has no errors.
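The six fields listed above can be separated mechanically. The following Python sketch is illustrative only: the names `HexRecord` and `parse_record` are hypothetical and not taken from any specification or library, and the checksum is extracted but not verified here.

```python
from dataclasses import dataclass


@dataclass
class HexRecord:
    byte_count: int
    address: int       # 16-bit offset, written big-endian in the text
    record_type: int
    data: bytes
    checksum: int


def parse_record(line: str) -> HexRecord:
    """Split one Intel HEX record into its fields (no checksum verification)."""
    start = line.find(":")             # characters before the colon are ignored
    if start < 0:
        raise ValueError("no start code ':' found")
    raw = bytes.fromhex(line[start + 1:].strip())
    if len(raw) < 5:
        raise ValueError("record too short")
    byte_count = raw[0]
    if len(raw) != byte_count + 5:     # count + address (2) + type + checksum
        raise ValueError("byte count does not match record length")
    return HexRecord(
        byte_count=byte_count,
        address=int.from_bytes(raw[1:3], "big"),
        record_type=raw[3],
        data=raw[4:4 + byte_count],
        checksum=raw[-1],
    )


rec = parse_record(":0B0010006164647265737320676170A7")
print(rec.byte_count, hex(rec.address), rec.record_type, rec.data)
```

The sketch assumes that nothing follows the checksum on the line other than whitespace; a more tolerant reader would use the byte count to decide where the record ends.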
Checksum calculation
A record's checksum byte is the two's complement of the least significant byte (LSB) of the sum of all decoded byte values in the record preceding the checksum. It is computed by summing the decoded byte values and extracting the LSB of the sum (i.e., the data checksum), and then calculating the two's complement of the LSB (e.g., by inverting its bits and adding one).
For example, in the case of the record :0300300002337A1E, the sum of the decoded byte values is 03 + 00 + 30 + 00 + 02 + 33 + 7A = E2, which has LSB value E2. The two's complement of E2 is 1E, which is the checksum byte appearing at the end of the record.
The validity of a record can be checked by computing its checksum and verifying that the computed checksum equals the checksum appearing in the record; an error is indicated if the checksums differ. Since the record's checksum byte is the two's complement — and therefore the additive inverse — of the data checksum, this process can be reduced to summing all decoded byte values, including the record's checksum, and verifying that the LSB of the sum is zero. When applied to the preceding example, this method produces the following result: 03 + 00 + 30 + 00 + 02 + 33 + 7A + 1E = 100, which has LSB value 00.
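Both procedures described above, computing the checksum and validating a complete record, fit in a few lines of Python. This is a minimal sketch; the function names `record_checksum` and `record_is_valid` are hypothetical and chosen only for illustration.

```python
def record_checksum(body: bytes) -> int:
    """Two's complement of the least significant byte of the sum of `body`."""
    return (-sum(body)) & 0xFF


def record_is_valid(record: str) -> bool:
    """True if all decoded bytes, checksum included, sum to zero modulo 256."""
    raw = bytes.fromhex(record.strip().lstrip(":"))
    return (sum(raw) & 0xFF) == 0


# The record used in the worked example, without and with its checksum byte.
print(f"{record_checksum(bytes.fromhex('0300300002337A')):02X}")
print(record_is_valid(":0300300002337A1E"))
```

Masking with `0xFF` keeps only the LSB, so negating the sum before masking gives the same result as inverting the bits of the LSB and adding one.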
Text line terminators
Intel HEX records are usually separated by one or more ASCII line termination characters so that each record appears alone on a text line. This enhances readability by visually delimiting the records and it also provides padding between records that can be used to improve machine parsing efficiency. However, the line termination characters are optional, as the ':' is used to detect the start of a record.[15][5][24][20][21][22][23]
Programs that create HEX records typically use line termination characters that conform to the conventions of their operating systems. For example, Linux programs use a single LF (line feed, hex value 0A) character to terminate lines, whereas Windows programs use a CR (carriage return, hex value 0D) followed by a LF.
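Because the colon marks the start of a record, a reader does not have to rely on line terminators at all. The sketch below (the function name `iter_records` is hypothetical) locates records by their start code and uses the byte count to find where each one ends. It assumes well-formed input and does no error handling for truncated records.

```python
def iter_records(text: str):
    """Yield the hex digits of each record, located by ':' rather than by line ends."""
    pos = text.find(":")
    while pos != -1:
        count = int(text[pos + 1:pos + 3], 16)   # byte count field
        # Digits after the colon: count (1) + address (2) + type (1)
        # + data (count) + checksum (1) bytes, two digits each.
        end = pos + 1 + 2 * (count + 5)
        yield text[pos + 1:end]
        pos = text.find(":", end)


with_crlf = ":020000040800F2\r\n:00000001FF\r\n"
unterminated = ":020000040800F2:00000001FF"
print(list(iter_records(with_crlf)) == list(iter_records(unterminated)))
```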
Record types
Intel HEX has six standard record types:[11]
| Hex code | Record type | Description | Example |
|---|---|---|---|
| 00 | Data | The byte count specifies the number of data bytes in the record. The example has 0B (eleven) data bytes, a 16-bit starting address of 0010, and the data 61, 64, 64, 72, 65, 73, 73, 20, 67, 61, 70. | :0B0010006164647265737320676170A7 |
| 01 | End Of File | Must occur exactly once per file in the last record of the file. The byte count is 00, the address field is typically 0000 and the data field is omitted. | :00000001FF |
| 02 | Extended Segment Address | The byte count is always 02, the address field (typically 0000) is ignored and the data field contains a 16-bit segment base address. This is multiplied by 16 and added to each subsequent data record address to form the starting address for the data. This allows addressing up to one mebibyte (1048576 bytes) of address space. | :020000021200EA |
| 03 | Start Segment Address | For 80x86 processors, specifies the starting execution address. The byte count is always 04, the address field is 0000 and the first two data bytes are the CS value, the latter two are the IP value. The execution should start at this address. | :0400000300003800C1 |
| 04 | Extended Linear Address | Allows for 32 bit addressing (up to 4 GiB). The byte count is always 02 and the address field is ignored (typically 0000). The two data bytes (big endian) specify the upper 16 bits of the 32 bit absolute address for all subsequent type 00 records; these upper address bits apply until the next 04 record. The absolute address for a type 00 record is formed by combining the upper 16 address bits of the most recent 04 record with the low 16 address bits of the 00 record. If a type 00 record is not preceded by any type 04 records then its upper 16 address bits default to 0000. | :020000040800F2 |
| 05 | Start Linear Address | The byte count is always 04, the address field is 0000. The four data bytes represent a 32-bit address value (big endian). In the case of CPUs that support it, this 32-bit address is the address at which execution should start. | :04000005000000CD2A |
Other record types have been used for variants, including 06 ('blinky' messages / transmission protocol container) by Wayne and Layne,[34] 0A (block start), 0B (block end), 0C (padded data), 0D (custom data) and 0E (other data) by the BBC/Micro:bit Educational Foundation,[35] and 81 (data in code segment), 82 (data in data segment), 83 (data in stack segment), 84 (data in extra segment), 85 (paragraph address for absolute code segment), 86 (paragraph address for absolute data segment), 87 (paragraph address for absolute stack segment) and 88 (paragraph address for absolute extra segment) by Digital Research.[6][20]
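The way the standard record types 02 and 04 change the base address for subsequent data records can be sketched as follows. The function name `absolute_addresses` and the tuple layout of its input are assumptions made for this illustration. The sketch does not model any wrap-around behaviour within a segment, and it treats types 03 and 05 as having no effect on addressing.

```python
def absolute_addresses(records):
    """Yield (absolute_address, data) for each data record.

    `records` is an iterable of (record_type, offset, data) tuples, assumed
    to be already parsed and checksum-verified.
    """
    base = 0                                    # base address defaults to zero
    for record_type, offset, data in records:
        if record_type == 0x00:                 # Data
            yield base + offset, data
        elif record_type == 0x01:               # End Of File
            return
        elif record_type == 0x02:               # Extended Segment Address
            base = int.from_bytes(data, "big") * 16
        elif record_type == 0x04:               # Extended Linear Address
            base = int.from_bytes(data, "big") << 16
        # Types 03 and 05 carry start addresses and leave the base unchanged.


records = [
    (0x04, 0x0000, bytes.fromhex("0800")),      # upper 16 address bits = 0800
    (0x00, 0x0010, b"\x61"),
    (0x02, 0x0000, bytes.fromhex("1200")),      # segment base 1200
    (0x00, 0x0010, b"\x64"),
    (0x01, 0x0000, b""),
]
for address, data in absolute_addresses(records):
    print(f"{address:08X} {data.hex()}")
```

With the values from the table's examples, the first data record lands at 08000010 and the second at 00012010 (segment 1200 multiplied by 16, plus offset 0010).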
Named formats
The original 4-bit/8-bit Intellec Hex Paper Tape Format and Intellec Hex Computer Punched Card Format in 1973/1974 supported only one record type 00.[36][37][25] This was expanded around 1975 to also support record type 01.[15] Sometimes called symbolic hexadecimal format,[38] it could include an optional header containing a symbol table for symbolic debugging;[25][28][26][9] all characters in a record preceding the colon are ignored.[15][5]
Around 1978, Intel introduced the new record types 02 and 03 (to add support for the segmented address space of the then-new 8086/8088 processors) in their Extended Intellec Hex Format.
Special names are sometimes used to denote the formats of HEX files that employ specific subsets of record types. For example:
- I8HEX (aka HEX-80) files use only record types 00 and 01
- I16HEX (aka HEX-86) files use only record types 00 through 03[10]
- I32HEX (aka HEX-386) files use only record types 00, 01, 04, and 05
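The subsets listed above lend themselves to a simple set comparison. The following sketch (the function name `named_format` is hypothetical) returns the most restrictive name whose permitted record types cover all types found in a file, or `None` if none of the three subsets fits.

```python
def named_format(record_types):
    """Return 'I8HEX', 'I16HEX', 'I32HEX', or None for a collection of record types."""
    used = set(record_types)
    if used <= {0x00, 0x01}:
        return "I8HEX"
    if used <= {0x00, 0x01, 0x02, 0x03}:
        return "I16HEX"
    if used <= {0x00, 0x01, 0x04, 0x05}:
        return "I32HEX"
    return None                      # mixes 02/03 with 04/05, or uses other types


print(named_format([0x00, 0x00, 0x01]))
print(named_format([0x04, 0x00, 0x05, 0x01]))
```

Since a file that uses only types 00 and 01 satisfies all three definitions, the sketch reports the narrowest one.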
File example
This example shows a file that has four data records followed by an end-of-file record:
:10010000214601360121470136007EFE09D2190140
:100110002146017E17C20001FF5F16002148011928
:10012000194E79234623965778239EDA3F01B2CAA7
:100130003F0156702B5E712B722B732146013421C7
:00000001FF
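As an illustration, the example file can be decoded into a memory image with a short Python sketch. The function name `load_image` is hypothetical. The sketch handles only record types 00 and 01 (an I8HEX-style file), verifies each checksum, and assumes one record per line.

```python
EXAMPLE = """\
:10010000214601360121470136007EFE09D2190140
:100110002146017E17C20001FF5F16002148011928
:10012000194E79234623965778239EDA3F01B2CAA7
:100130003F0156702B5E712B722B732146013421C7
:00000001FF
"""


def load_image(text: str) -> dict:
    """Return {address: byte value} for a file using only record types 00 and 01."""
    image = {}
    for line in text.splitlines():
        start = line.find(":")
        if start < 0:
            continue                          # no record on this line
        raw = bytes.fromhex(line[start + 1:])
        if (sum(raw) & 0xFF) != 0:
            raise ValueError(f"checksum error in {line!r}")
        count, offset, record_type = raw[0], int.from_bytes(raw[1:3], "big"), raw[3]
        if record_type == 0x01:               # End Of File
            break
        if record_type != 0x00:
            raise ValueError(f"unsupported record type {record_type:02X}")
        for i, value in enumerate(raw[4:4 + count]):
            image[offset + i] = value
    return image


image = load_image(EXAMPLE)
print(f"{len(image)} bytes from {min(image):04X} to {max(image):04X}")
```

The four data records each carry 16 bytes at consecutive offsets, so the image covers the 64 addresses from 0100 to 013F.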
Variants
Besides Intel's own extension, several third parties have also defined variants and extensions of the Intel hex format, including Digital Research (as in the so-called "Digital Research hex format"[6][20]), Zilog, Mostek,[29][30] TDL,[30][31] Texas Instruments, Microchip,[39][40] c't, Wayne and Layne,[34] and BBC/Micro:bit Educational Foundation (with its "Universal Hex Format"[35]). These can have information on program entry points and register contents, a swapped byte order in the data fields, fill values for unused areas, fuse bits, and other differences.
The Digital Research hex format for 8086 processors supports segment information by adding record types to distinguish between code, data, stack, and extra segments.[5][6][20]
Most assemblers for CP/M-80 (and also XASM09 for the Motorola 6809) do not use record type 01h to indicate the end of a file, but use a zero-length data record of type 00h instead.[41][1] This eases the concatenation of multiple hex files.[42][43][1]
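A reader that wants to accept both conventions could treat either form as an end marker. The sketch below is one possible approach, not a documented behaviour of any particular tool, and the function name `is_end_of_file` is hypothetical. Whether to stop at a zero-length data record is a policy choice, since concatenated files may contain several of them.

```python
def is_end_of_file(record: str) -> bool:
    """True for a type 01 record or a zero-length type 00 record."""
    raw = bytes.fromhex(record.strip().lstrip(":"))
    byte_count, record_type = raw[0], raw[3]
    return record_type == 0x01 or (record_type == 0x00 and byte_count == 0)


print(is_end_of_file(":00000001FF"))   # standard End Of File record
print(is_end_of_file(":0000000000"))   # zero-length data record
```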
Texas Instruments defines a variant where addresses are based on the bit-width of a processor's registers, not bytes.
Microchip defines variants INTHX8S[44] (INHX8L,[1] INHX8H[1]), INHX8M,[44][1][45] INHX16[44] (INHX16M[1]) and INHX32[46] for their PIC microcontrollers.
Alfred Arnold's cross-macro-assembler AS,[1] Werner Hennig-Roleff's 8051-emulator SIM51,[26] and Matthias R. Paul's cross-converter BINTEL[47] are also known to define extensions to the Intel hex format.
See also
- Binary-to-text encoding, a survey and comparison of encoding algorithms
- Text-based protocol
- MOS Technology file format
- Motorola S-record hex format
- Tektronix hex format
- Texas Instruments TI-TXT (TI Text)
- Intel Micro Computer Set (MCS)
- Object file
References
- ^ a b c d e f g h i Arnold, Alfred "Alf" (2020) [1996, 1989]. "6.3. P2HEX". Macro Assembler AS - User's Manual. V1.42. Translated by Arnold, Alfred "Alf"; Hilse, Stefan; Kanthak, Stephan; Sellke, Oliver; De Tomasi, Vittorio. Aachen, Germany. Archived from the original on 2020-02-28. Retrieved 2020-02-28.
[…] For the PIC microcontrollers, the switch -m <0..3> allows to generate the three different variants of the Intel Hex format. Format 0 is INHX8M which contains all bytes in a Lo-Hi-Order. Addresses become double as large because the PICs have a word-oriented address space that increments addresses only by one per word. […] With Format 1 (INHX16M), bytes are stored in their natural order. This is the format Microchip uses for its own programming devices. Format 2 (INHX8L) resp. 3 (INHX8H) split words into their lower resp. upper bytes. […] Unfortunately, one finds different statements about the last line of an Intel-Hex file in literature. Therefore, P2HEX knows three different variants that may be selected […] :00000001FF […] :00000001 […] :0000000000 […] By default, variant 0 is used which seems to be the most common one. […] If the target file name does not have an extension, an extension of HEX is supposed. […]
- ^ a b "AR#476 PROMGen - Description of PROM/EEPROM file formats: MCS, EXO, HEX, and others". Xilinx. 2010-03-08. Intel MCS-86 Hexadecimal Object - File Format Code 88. Archived from the original on 2020-03-03. Retrieved 2020-03-03.
- ^ Sabnis, Abhishek (2011-02-04). "How to convert .out file to .int, .hex, .a43". Code Composer Studio forum. Texas Instruments. Archived from the original on 2023-10-20. Retrieved 2023-10-20.
TI-gang programmer needs .int, .hex, .a43 file format.
- ^ a b c Schuldt, Michael (2018). "intel-hex-mode". github.com. Archived from the original on 2020-10-24. Retrieved 2023-10-20.
By default, this mode is enabled for files with a .a90, .hex, .a43, or .ihx extension.
- ^ a b c d e f "3.1. Intel 8086 Hex File Format". CP/M-86 Operating System - System Guide (PDF) (2nd printing, 1st ed.). Pacific Grove, California, USA: Digital Research. June 1981. pp. 15–16. Archived (PDF) from the original on 2020-02-28. Retrieved 2020-02-28. p. 16:
[…] The following are output from ASM-86 only: 81 same as 00, data belongs to code segment […] 82 same as 00, data belongs to data segment […] 83 same as 00, data belongs to stack segment […] 84 same as 00, data belongs to extra segment […] 85 paragraph address for absolute code segment […] 86 paragraph address for absolute data segment […] 87 paragraph address for absolute stack segment […] 88 paragraph address for absolute extra segment […] All characters preceding the colon for each record are ignored. […]
(17 pages)
- ^ a b c d e "Appendix C. ASM-86 Hexadecimal Output Format". CP/M-86 - Operating System - Programmer's Guide (PDF) (3 ed.). Pacific Grove, California, USA: Digital Research. January 1983 [1981]. pp. 97–100. Archived (PDF) from the original on 2020-02-27. Retrieved 2020-02-27. pp. 97–99:
[…] The Intel format is identical to the format defined by Intel for the 8086. The Digital Research format is nearly identical to the Intel format, but adds segment information to hexadecimal records. Output of either format can be input to GENCMD, but the Digital Research format automatically provides segment identification. A segment is the smallest unit of a program that can be relocated. […] It is in the definition of record types 00 and 02 that Digital Research's hexadecimal format differs from Intel's. Intel defines one value each for the data record type and the segment address type. Digital Research identifies each record with the segment that contains it. […] 00H for data belonging to all 8086 segments […] 81H for data belonging to the CODE segment […] 82H for data belonging to the DATA segment […] 83H for data belonging to the STACK segment […] 84H for data belonging to the EXTRA segment […] 02H for all segment address records […] 85H for a CODE absolute segment address […] 86H for a DATA segment address […] 87H for a STACK segment address […] 88H for an EXTRA segment address […]
[1] (1+viii+122+2 pages)
- ^ a b Ramos, Rubens (2010) [2008]. "intel-hex-mode.el --- Mode for Intel Hex files". Retrieved 2023-10-20.
- ^ a b c "The Interactive Disassembler - Hexadecimal fileformats". Hex-Rays. 2006. Archived from the original on 2020-03-01. Retrieved 2020-03-01. [2] Archived 2021-11-16 at the Wayback Machine
- ^ a b c d e f Roche, Emmanuel (2020-04-01). "The Intel HEX File Format". France: Newsgroup: comp.os.cpm. INTELHEX.WS4. Archived from the original on 2021-12-08. Retrieved 2021-12-08.
[…] the Intel HEX file format can contain much more than the "data bytes". As long as the lines do not start with a colon (":"), they can contain anything that you want. […] I once saw a big HEX file […] It contained, at the beginning, the source code of a PL/M program, followed, at the end, by the resulting HEX file produced by the PL/M compiler. […] I found another HEX file containing several lines of comments, not at the beginning or at the end, but separating several lines of "absolute records". […] it was from an "(Intel) 8008 Simulator". So, at the beginning of its use, it was well known that HEX files could contain explanations. […] under CP/M or any 8-bit 64K system, there remains one case: "Page addresses". Since CP/M, it is standard to display memory addresses using the hexadecimal system […] as we said for BIN/COM files, the memory addresses are 0000/0100. […] those memory addresses can be written 00-00/01-00 […] to say: Page zero, address zero / Page one, address zero. […] the highest memory address in a 8-bit 64K computer is FFFF […] Page FF, address FF […] the lowest addresses are in Page zero (or 00) and the highest addresses are in Page FF. […] CP/M filetypes are 3-letters long, one could use filetypes of the form P00–PFF […] to indicate at which memory address where to load the HEX file. […] I noticed that most of my addresses were ending with "00", so the loading address could be reduced to the Page address, which […] could be put inside the filetype […]
- ^ a b "Appendix D. MCS-86 Absolute Object File Formats: Hexadecimal Object File Format". 8086 Family Utilities - User's Guide for 8080/8085-Based Development Systems (PDF). Revision E (A620/5821 6K DD ed.). Santa Clara, California, USA: Intel Corporation. May 1982 [1980, 1978]. pp. D-8 – D-13. Order Number 9800639-04. Archived (PDF) from the original on 2020-02-29. Retrieved 2020-02-29.
- ^ a b c d Hexadecimal Object File Format Specification. Revision A. Intel Corporation. 1998 [1988-01-06]. Retrieved 2019-07-23. [3][4][5][6][7] (11 pages)
- ^ "LT Programming Hex File Format Documentation -- In Circuit Programming". Analog Devices, Inc. / Linear Technology. 2021. Archived from the original on 2021-03-07. Retrieved 2021-12-11.
- ^ "General: Intel Hex File Format". ARM Keil. ARM Germany GmbH. 2018-05-07 [2012]. KA003292. Archived from the original on 2020-02-27. Retrieved 2017-09-06. [8]
- ^ a b Crosby, Kip (January–March 1994). "Dawn of the Micro: Intel's Intellecs" (PDF). The Analytical Engine. 1 (3). Computer History Association of California: 10–14. ISSN 1071-6351. Archived (PDF) from the original on 2023-10-17. Retrieved 2023-10-17. pp. 10–11:
[…] the Intel Intellec 8 […] first appeared sometime in 1972 or 1973, two years or more before the Altair 8800 often credited as the "first microcomputer" by standard histories […] Intel maintains that the 8 Mod 8 was first produced in 1973 and discontinued in 1975. Tony Duell has an 8 Mod 80 CPU board dated 1972, and the 8 Mod 8 and 4 Mod 40 are both listed in the Intel Data Catalog published in February 1976, so the actual period of production may have been somewhat longer. (Pertinent Intel docs must be read carefully because the names MCS4, MCS40, MCS8 and MCS80 were used almost indiscriminately to refer to chipsets, computers or full systems.) […]
(52 pages) (NB. This article does not mention Intel Hex, but specifically mentions that Intel's Intellec system was officially introduced in 1973, but some units dated 1972 exist.)
- ^ a b c d e f g h i "Chapter 6. Microcomputer System Component Data Sheet - EPROMs and ROMs: I. PROM and ROM Programming Instructions - B1. Intellec Hex Paper Tape Format / C1. Intellec Hex Computer Punched Card Format". MCS-80 User's Manual (With Introduction to MCS-85). Santa Clara, California, USA: Intel Corporation. October 1977 [1975]. pp. 6-75 – 6-78. 98-153D. Retrieved 2020-02-27. p. 6-76:
[…] In the Intel Intellec Hex Format, a data field can contain either 8 or 4-bit data. Two ASCII hexadecimal characters must be used to represent both 8 and 4-bit data. In the case of 4-bit data, only one of the characters is meaningful and must be specified on the Intel PROM/ROM Order Form. […] Preceding the first data field and following the last data field there must be a leader/trailer length of at least 25 null characters. Comments (except for a colon) may be placed on the tape leader. […] If the data is 4 bit, then either the high or low-order digit represents the data and the other digit of the pair may be any ASCII hexadecimal digit. […]
[9][10] (468 pages) (NB. This manual also describes a "BPNF Paper Tape Format", a "Non-Intellec Hex Paper Tape Format" and a "PN Computer Punched Card Format".)
- ^ Kildall, Gary Arlen (January 1980). "The History of CP/M, The Evolution of an Industry: One Person's Viewpoint". Dr. Dobb's Journal of Computer Calisthenics & Orthodontia. 5 (1): 6–7. #41. Archived from the original on 2016-11-24. Retrieved 2013-06-03.
[…] Programs had been written and tested by Intel's software group, consisting of myself and two other people, and we were ready for the real machine. […]
- ^ Kildall, Gary Arlen (2016-08-02) [1993]. Kildall, Scott; Kildall, Kristin (eds.). Computer Connections: People, Places, and Events in the Evolution of the Personal Computer Industry (Manuscript, part 1). Kildall Family. Archived (PDF) from the original on 2016-11-17. Retrieved 2016-11-17. (NB. Part 2 not released due to family privacy reasons.)
- ^ Burgett, Kenneth "Ken" (2017-11-10). "Development of Intel ISIS Operating System - An interview with Ken Burgett". Archived from the original on 2023-11-24. Retrieved 2023-11-25. [11][12]
- ^ Feichtinger, Herwig (1987). "1.8.5. Lochstreifen-Datenformate: Das Intel-Hex-Format" [1.8.5. Paper tape data formats]. Arbeitsbuch Mikrocomputer [Microcomputer work book] (in German) (2 ed.). Munich, Germany: Franzis-Verlag GmbH. pp. 240–243 [243]. ISBN 3-7723-8022-0.
- ^ a b c d e "4.3 Intel Hexadecimal File Format". Concurrent CP/M Operating System - Programmer's Reference Guide (PDF) (1 ed.). Pacific Grove, California, USA: Digital Research Inc. January 1984. pp. 4-9 – 4-12. Archived (PDF) from the original on 2021-12-11. Retrieved 2021-12-11. pp. 4-11 – 4-12:
[…] The following are output from ASM-86 only: 81 same as 00, data belongs to Code Segment […] 82 same as 00, data belongs to Data Segment […] 83 same as 00, data belongs to Stack Segment […] 84 same as 00, data belongs to Extra Segment […] *85 paragraph address for absolute Code Segment […] *86 paragraph address for absolute Data Segment […] *87 paragraph address for absolute Stack Segment […] *88 paragraph address for absolute Extra Segment […] * 85, 86, 87, and 88 are Digital Research Extensions. […] All characters preceding the colon for each record are ignored. […]
(346 pages) (NB. This manual marks only types 85, 86, 87 and 88 as Digital Research extensions, as if types 81, 82, 83, 84 were not.)
- ^ a b c d "2.8. Microprocessor Formats, 2.8.1. Input Requirements: Intel Intellec 8/MDS Format. Select Code 83". Operator Guide To Serial I/O Capabilities of Data I/O Programmers - Translation-Format Package (PDF). Revision C. Data I/O Corporation. October 1980. p. 2-10. 055-1901. Archived (PDF) from the original on 2020-03-01. Retrieved 2020-03-01. p. 2-10:
[…] Input […] This space can be used for line feed, carriage return or comments. […] Output […] 2) Each line ends with nonprinting line feed, carriage returns and nulls. […]
(1+ii+19 pages)
- ^ a b c "Intel Intellec 8/MDS Format, Code 83". Translation File Formats (PDF). Data I/O Corporation. 1987-09-03. pp. 22, 26–27, 52–53, 54. Archived (PDF) from the original on 2021-07-28. Retrieved 2020-03-01. pp. 22, 26, 52:
[…] Nonprinting Carriage Return, line feed, and nulls determined by null count […]
(56 pages)
- ^ a b c "Appendix B: Intel Hex and Intel Extended Hex Format - B.1 Common Format". Fujitsu Semiconductor Controller Manual: FR/F2MC Family Softune Linkage Kit Manual for V3 (PDF). Fujitsu Limited. 2001. pp. 319–525 [320–321]. Archived (PDF) from the original on 2021-12-12. Retrieved 2021-12-12. p. 321:
[…] (g) Generally, a control code (such as CR and LF) is added. Data in this field is skipped until the start character ":" of (a) appears. Since the (a), (b), (c), (d), and (f) fields always exist, the minimum length of a record is 11 bytes long and the maximum length is 521 bytes long. […]
(4+x+350 pages)
- ^ a b "1.6.4 PIP". CP/M Operating System Manual (First printing ed.). Pacific Grove, California, USA: Digital Research. July 1982 [1976]. pp. 17–23. Retrieved 2021-12-12. pp. 19–21:
[…] PIP performs a special function if the destination is a disk file with type "HEX" (an Intel hex-formatted machine code file), and the source is an external peripheral device, such as a paper tape reader. In this case, the PIP program checks to ensure that the source file contains a properly formed hex file, with legal hexadecimal values and checksum records. When an invalid input record is found, PIP reports an error message at the console and waits for corrective action. It is usually sufficient to open the reader and rerun a section of the tape (pull the tape back about 20 inches). When the tape is ready for the reread, a single carriage return is typed at the console, and PIP will attempt another read. If the tape position cannot be properly read, the user continues the read (by typing a return following the error message), and enters the record manually with the ED program after the disk file is constructed. For convenience, PIP allows the end-of-file to be entered from the console if the source file is an RDR: device. In this case, the PIP program reads the device and monitors the keyboard. If ctl-Z is typed at the keyboard the read operation is terminated normally. […]
[13] (6+250 pages)
PIP PUN:=NUL:,X.ASM,EOF:,NUL: […] Send 40 nulls to the punch device; copy the X.ASM file to the punch, followed by an end-of-file (ctl-Z) and 40 more null characters. […]
H […] HEX data transfer: all data are checked for proper Intel hex file format. Nonessential characters between hex records are removed during the copy operation. The console will be prompted for corrective action in case errors occur. […]
I […] Ignore ":00" records in the transfer of Intel hex format file (the I parameter automatically sets the H parameter). […]
PIP PUN:=X.HEX[i],Y.ZOT[h] […] First copy X.HEX to the PUN: device and ignore the trailing ":00" record in X.HEX; continue the transfer of data by reading Y.ZOT, which contains HEX records, including any ":00" records it contains. […]
- ^ a b c "Appendix A: A Sample Program in PL/M: Hexidecimal Object Tape". MCS-8 A Guide to PL/M programming (PDF). Rev 1 (printed September 1974 ed.). Santa Clara, California, USA: Intel Corporation. 1974-03-15 [September 1973]. p. 102. MCS180-0774-1K, MCS280-0974-1K. Archived (PDF) from the original on 2022-01-29. Retrieved 2022-05-18.
1 CARRY 05714
2 ZERO 05715
3 SIGN 05716
4 PARITY 05717
5 MEMORY 06000
23 SQUAREROOT 04003
[…]
83 MONITORUSES 05766
$
****************************************
:1008000044520A2E0B36D0F930FA31CF30D730F9B6
[…]
:100AF0000936F4C730D70401C8C20C0031F930F808
:040B0000445E0AFF46
****************************************
:0000000000
$
(1+i+100+1+11+1 pages) (NB. Shows an example containing asterisk-based separators and a space-indented header with symbol names to be processed by Intel ISIS's HEXOBJ command as well as by INTERP/8 or INTERP/80 for symbolic debugging. This optional header is not documented as part of Intel hex or BNPF formats but in Intel's PL/M and assembler programming manuals producing such symbol tables.)
- ^ a b c d Hennig-Roleff, Werner (1993-02-01) [1988]. "HEX.DOC: Intel-HEX-Format". SIM51. 1.04 (in German). Archived from the original on 2017-08-11. Retrieved 2021-12-08.
[…] Beim Absolut-Hex Konvertierprogramm von Keil können optional […] Symbol-Informationen in den Hex-File aufgenommen werden. Die Symbol-Informationen stehen dabei am Anfang des Files, vor dem ersten ':'. Die Symbol-Informationen sind allerdings nicht sehr aussagekräftig, da nicht unterschieden wird zwischen Modul-Name, CODE, XDATA, DATA, IDATA, BIT, NUMBER. Für jeden Symboleintrag werden nur ASCII-Zeichen verwendet. Pro Zeile ist 1 Symbol angeschrieben und zwar in der Form: "0 SymbolName Wert" […]
[14][15] (NB. This is an older version of SIM51, the software and documentation was maintained up to 1996.)
- ^ G., Georg (2021-09-05) [2021-09-04]. "Hex-File Flashen". Mikrocontroller und Digitale Elektronik. mikrocontroller.net (in German). Retrieved 2023-11-23.
[…] Debug Infos fingen bei Intel mit einem "$" an. Dann kamen der Name des Symbols und die Adresse. Kommentare hatten als erstes Zeichen ein ";". […] Der ASM48 unter ISIS-2 produzierte solche Hexfiles, […] der ASM86 auch. […]
- ^ a b "Appendix A. Example of Listing Format / Appendix C. Hexadecimal Object File Format". 2920 Assembly Language Manual (PDF). Santa Clara, California, USA: Intel Corporation. August 1979. pp. A-3, C-1 – C-2. Order Number 9800987-01. Archived (PDF) from the original on 2023-11-26. Retrieved 2023-11-26. p. C-1:
[…] The code is formatted in hexadecimal bytes of data. The file contains the ASCII representation of the hexadecimal bytes of data. The object code itself is preceded by a symbol table. These two parts may be loaded or saved together or separately. The symbol table is a series of records, terminated by a dollar sign. Each record contains three fields separated by one or more ASCII spaces: […] a number field […] a label field containing the ASCII representation of a source program symbol […] an address field containing the hexadecimal address assigned to the symbol […] The symbol table is terminated by a record whose first nonblank character is a dollar sign. The object code […] follows the symbol table […] Each of these records or physical lines is six logical fields of varying length in characters or frames. […]
(90 pages) (NB. The Intel 2920 was a digital signal processor released in 1979.) - ^ a b Formaniak, Peter G.; Leitch, David (July 1977). "A Proposed Microprocessor Software Standard". BYTE - the small systems journal. Technical Forum. Vol. 2, no. 7. Peterborough, New Hampshire, USA: Byte Publications, Inc. pp. 34, 62–63. ark:/13960/t32245485. Retrieved 2021-12-06. (3 pages) (NB. Describes an extension of the Intel hex format by Mostek.)
- ^ a b c d Ogdin, Carol Anne; Colvin, Neil; Pittman, Tom; Tubb, Philip (November 1977). "Relocatable Object Code Formats". BYTE - the Small Systems Journal. Technical Forum. 2 (11). Peterborough, New Hampshire, USA: Byte Publications, Inc.: 198–205. ark:/13960/t59c88b4h, ark:/13960/t3kw76j24. Retrieved 2021-12-06. (8 pages) (NB. Besides others describes an incompatible extension of the Intel hex format used by Technical Design Labs (TDL).)
- ^ a b Kreidl, Günter (June 1981). "Relocator: Das TDL-Format". Hardware. Nascom journal - Zeitschrift für Anwender des NASCOM 1 oder NASCOM 2 (in German). 2 (6). Germersheim, Germany: Verlag NASCOM Journal, MK-Systemtechnik: 12–14 [12]. Archived from the original on 2021-12-01. Retrieved 2021-12-11. (20 pages) (NB. Shows a variant of the TDL format, which itself is a variant of the Intel hex format.)
- ^ Rüger, Stefan M. (2022-06-16). "Provide file format I: Intel HEX with comments that ignores checksum errors". AVRDUDE. Archived from the original on 2023-11-25. Retrieved 2023-11-25. (NB. AVRDUDE's comment option :I can incorrectly produce ":" characters as part of the hex dump.)
- ^ Bull, Hans Eirik; Dean, Brian S.; Rüger, Stefan M.; Wunsch, Jörg (2023-07-15). "AVRDUDE - A program for downloading/uploading AVR microcontroller flash, EEPROM and more for AVRDUDE" (PDF). Version 7.2. Archived (PDF) from the original on 2023-11-23. Retrieved 2023-11-23. p. 12:
[…] I […] Intel Hex with comments on download and tolerance of checksum errors on upload […]
(66 pages) - ^ a b Beckler, Matthew L. (2016-07-25) [2016-07-19]. "Blinky Grid - serial optical bit stream". Discourse. Minneapolis, Minnesota, USA: Wayne and Layne, LLC. Archived from the original on 2021-12-11. Retrieved 2021-12-11.
- ^ a b "micro:bit Universal Hex Format Specification - Specification for the micro:bit Universal Hex Format". micro:bit. 0.4.0. Micro:bit Educational Foundation. 2021-01-26 [2020]. Archived from the original on 2021-08-14. Retrieved 2021-12-08. [16][17] (NB. This represents kind of a fat hex file format.)
- ^ Intellec 8 Microcomputer System Operator's Manual. Intel Corporation. November 1973.
- ^ "Appendix D. Hexadecimal Program Tape Format". Intellec 8/MOD 80 Operators Manual. Intel. June 1974. 98-003A.
[…] Frames 7,8: Record Type […] Two ASCII characters. Currently (1974), all records are type 0. This field is reserved for future expansion […]
[18] - ^ Development Tools Catalog 1988 (PDF). Intel Corporation. 1988. pp. 25–26, 30–32. Order Number 280199-004. Archived (PDF) from the original on 2023-11-26. Retrieved 2023-11-26. (46 pages)
- ^ "PIC Microcontrollers: PIC Hex File Format". Kanda Electronics Blog. Canolafan, Llanafan, Aberystwyth, Wales, UK: Embedded Results Ltd. 2012-04-26. Archived from the original on 2021-08-16. Retrieved 2021-12-11.
- ^ "15.3 XC16-BIN2HEX Utility - 15.3.3 Input/Output Files". MPLAB XC16 Assembler, Linker and Utilities - User's Guide (PDF). Microchip Technology Inc. 2018 [2013]. pp. 240–241. ISBN 978-1-5224-2828-2. DS50002106D. Archived (PDF) from the original on 2019-01-22. Retrieved 2023-12-05. p. 240:
[…] Because the Intel hex file format is byte-oriented, and the 16-bit PC is not, program memory sections require special treatment. Each 24-bit program word is extended to 32 bits by inserting a so-called "phantom byte". Each program memory address is multiplied by 2 to yield a byte address. For example, a section that is located at 0x100 in program memory will be represented in the hex file as 0x200. Consider the following assembly language source: […] ; file test.s […] .section foo,code,address(0x100) […] .pword 0x112233 […] The file […] will be produced, with the following contents: […] :020000040000fa […] :040200003322110096 […] :00000001FF […] the data record (line 2) has a load address of 0200, while the source code specified address 0x100. […]t the data is represented in "little-endian" format, meaning the least significant byte appears first. The phantom byte appears last, just before the checksum. […]
(277 pages) - ^ Kildall, Gary Arlen (February 1978) [1976]. "A simple technique for static relocation of absolute machine code". Dr. Dobb's Journal of Computer Calisthenics & Orthodontia. 3 (2). People's Computer Company: 10–13 (66–69). ISBN 0-8104-5490-4. #22 ark:/13960/t8hf1g21p. Retrieved 2017-08-19. [19][20][21]. Originally presented at: Kildall, Gary Arlen (1977) [22–24 November 1976]. "A Simple Technique for Static Relocation of Absolute Machine Code". Written at Naval Postgraduate School, Monterey, California, USA. In Titus, Harold A. (ed.). Conference Record: Tenth Annual Asilomar Conference on Circuits, Systems and Computers: Papers Presented November 22–24, 1976. Asilomar Conference on Signals, Systems & Computers. Asilomar Hotel and Conference Grounds, Pacific Grove, California, USA: Western Periodicals Company. pp. 420–424. ISSN 1058-6393. Retrieved 2021-12-06. (609 pages)
- ^ Zschocke, Jörg (November 1987). "Nicht nur Entwicklungshilfe - Down-Loading für Einplatinencomputer am Beispiel des EPAC-09: Intel-Hex-Format". c't - magazin für computertechnik (in German). Vol. 1987, no. 11. Verlag Heinz Heise GmbH & Co. KG. pp. 198, 200, 202–203, [200]. ISSN 0724-8679.
[…] The header is concluded by a byte whose value indicates the type of the block: 0 = data block, 1 = end block. This distinction can be dispensed with, however, if an end block can also be identified unambiguously by a block length of zero. (This is how most assemblers under CP/M proceed, including XASM09; the type byte is then always zero.) […]
[22] (NB. XASM09 is a Motorola 6809 assembler.) - ^ Prior, James E. (1989-02-24). "Re: Intel hex (*.HEX) format questions". Newsgroup: comp.os.cpm. Retrieved 2020-02-27.
- ^ a b c "PIC16C5X Programming Specification 5.0 - PIC16C5X Hex Data Formats: 5.1. 8-Bit Split Intellec Hex Format (INHX8S) / 5.2. 8-Bit Merged Intellec Hex Format (INHX8M) / 5.3. 16-Bit Hex Format / 5.4. 8-Bit Word Format / 5.5. 16-Bit Word Format". Microchip Databook (1994 ed.). Microchip Technology Inc. April 1994. pp. 3-10 – 3-11, 9-10, 9-15, 9-17, 9-21, 9-23, 9-27. DS00018G. Retrieved 2020-02-28.
[…] Assemblers for the PIC16C5X can produce PIC16C5X object files in various formats. A PIC16C5X programmer must be able to accept and send data in at least one of following formats. The 8-bit merged (INHX8M) format is preferred. […] format […] INHX8S […] produces two 8-bit Hex files. One file will contain the address / data pairs for the high order 8-bits and the other file will contain the low order 8-bits. File extensions for the object code will be '.obl' and '.obh' for low and high order files […] format […] INHX8M […] produces one 8-bit Hex file with a low byte / high byte combination. Since each address can only contain 8 bits in this format, all addresses will be doubled. File extensions for the object code will be '.obj' […] format […] INHX16 […] produces one 16-bit Hex file. File extension for the object code will be '.obj'. […]
[23][24] - ^ Beard, Brian (2016) [2010]. "Microchip INHX8M HEX-record Format". Lucid Technologies. Archived from the original on 2020-02-28. Retrieved 2020-02-28.
- ^ Beard, Brian (2016) [2013]. "Microchip INHX32 HEX-record Format". Lucid Technologies. Archived from the original on 2020-02-28. Retrieved 2020-02-28.
- ^ Paul, Matthias R. (1992). BINTEL: Binär-Image-Konverter mit Intel-Hex-Unterstützung - Bedienungsanleitung [Binary image converter with Intel Hex support - User manual] (in German). (NB. As a consequence of the tool's use to process, analyze, compare, split, cut, fill, combine, relocate or convert binary firmware images (f.e. for or from one or more ROMs) with deliberately sticky (fixed "1" or "0"), inverted, omitted ("don't care"), interconnected or swapped data or address lines (as sometimes used to ease PCB routing of parallel buses or for obfuscation reasons to make disassembly more difficult), this binary image converter supported a number of extensions to the Intel Hex format.)
Further reading
- "How Do I Interpret Motorola S & Intel HEX Formatted Data? Intel Hex-32, Code 99". Home > Hardware > … > In-circuit Test Systems > Automated Test Equipment [Discontinued] > Details. Keysight Technologies. Archived from the original on 2020-03-01. Retrieved 2020-03-01.
- Bergmans, San (2019-06-02) [2001]. "Intel HEX Format". SB-Projects. Archived from the original on 2020-03-01. Retrieved 2020-03-01.
- Beard, Brian (2016) [2007]. "Intel HEX-record Format". Lucid Technologies. Archived from the original on 2020-02-28. Retrieved 2020-02-28.
- Anderson, Thomas N. (February 1998). "Intel Hex Word Address Object Format". The Telemark Assembler (TASM) User's Manual (PDF). 3.1. Issaquah, Washington, USA: Squak Valley Software. pp. 25–26. Archived (PDF) from the original on 2021-12-11. Retrieved 2021-12-11.
Intel Hex Word Address Object Format […] This format is identical to the Intel Hex Object Format except that the address for each line of object code is divided by two thus converting it to a word address (16 bit word). All other fields are identical. Here is an example: […] :180800000102030405060708090A0B0C0D0E0F101112131415161718AC […] :02080C00191AA3 […] :00000001FF […]
(32 pages) - "ADuC70xx Serial Download Protocol" (PDF) (Application Note). Revision C. Norwood, Massachusetts, USA: Analog Devices. 2016. AN-724. Archived (PDF) from the original on 2023-10-05. Retrieved 2023-10-05. (8 pages)
External links
- binex - a converter between Intel HEX and binary for Windows.
- SRecord, a converter between Intel HEX and binary for Linux (usage), C++ source code.
- kk_ihex, open source C library for reading and writing Intel HEX
- libgis, open source C library that converts Intel HEX, Motorola S-Record, Atmel Generic files.
- bincopy is a Python package for manipulating Intel HEX files.
- SwiftIntelHex - a Swift package to parse Intel HEX files for iOS and macOS.
Intel HEX
Introduction
Definition and Purpose
The Intel HEX file format is an ASCII-based hexadecimal representation of binary object files, enabling the embedding of machine code and data within human-readable text files. Developed for Intel's 8-bit, 16-bit, and 32-bit microprocessors, it encodes binary data as pairs of ASCII hexadecimal characters to facilitate storage and manipulation without the limitations of raw binary formats.[4]
Its primary purpose is to store and transfer firmware, ROM images, or EPROM contents for programming microcontrollers, PROMs, and embedded systems. This format serves as a standard input for PROM programmers and hardware emulators, allowing reliable loading of code and data into target devices across various address spaces, including 16-bit linear for 8-bit processors, 20-bit segmented for 16-bit processors, and 32-bit linear for 32-bit processors.[4][2]
The historical motivation for Intel HEX arose from the challenges of handling binary files in environments reliant on text-based media and displays: line wrapping, corruption during transmission, and incompatibility with non-binary media such as paper tape, punch cards, or CRT terminals. By converting binary data to ASCII hexadecimal, the format ensures both human readability and machine parsability, making it suitable for editing, printing, and archiving without specialized binary tools.[4][2]
Fundamentally, Intel HEX files are structured as a sequence of records, each prefixed by a colon (:) and comprising a byte count, starting address, record type, hexadecimal data bytes, and a checksum for integrity verification. Record types such as data records and end-of-file records organize the content to represent complete memory images.[4]
Key Features
The Intel HEX format employs ASCII hexadecimal encoding, where each byte of binary data is represented by two hexadecimal characters (0-9 and A-F), effectively doubling the file size compared to binary but enabling transmission over text-based channels.[5] This encoding ensures compatibility with standard text editors and printers, as the hexadecimal values are stored as printable ASCII characters.[1]
Its block-based structure organizes data into discrete records, each beginning with a colon (:) and containing fields for byte count, address, record type, data, and checksum, which collectively guard against corruption during transmission over non-binary media such as paper tape or serial lines.[5] By delimiting content into these self-contained lines terminated by carriage return and line feed, the format minimizes errors from line breaks or partial reads in text streams.[1]
The format supports absolute addressing up to 64 kilobytes in its base 16-bit configuration, with extended record types allowing expansion to 20-bit segmented or 32-bit linear address spaces for larger memory requirements.[5] Each record includes a two's complement checksum, computed by negating the modulo-256 sum of all preceding bytes in the record, which lets a loader detect transmission errors without external verification tools.[1]
Intel HEX achieves platform independence by treating data as byte sequences in hexadecimal form, eliminating endianness concerns since multi-byte values are not assembled within the file but interpreted by the loading application.[5] This byte-oriented representation, combined with its ASCII nature, facilitates human readability, permitting manual inspection, editing, and verification of firmware content using any text viewer.[1]
Historical Development
Origins
The Intel HEX format was originally developed by Intel Corporation in 1973 for its Intellec Microcomputer Development Systems (MDS) to load and execute programs, particularly over non-binary media like paper tape.[6] The format's initial publication appeared in Intel's technical documentation, such as the MCS-48 User's Manual and PROM programmer guides.[6] Its motivation stemmed from addressing limitations of binary files in teletype-era terminals, where the hexadecimal representation enabled better readability for human operators and built-in error detection via checksums.[7] Early implementations of hexadecimal-formatted data loading occurred in Intel's hardware tools, such as the Universal PROM Programmer (UPP), for EPROM programming tasks.[8]
Adoption and Evolution
The Intel HEX format saw rapid adoption throughout the 1980s among semiconductor companies developing microcontroller programming tools, particularly for 8-bit processors like the Zilog Z80, where it facilitated the transfer of object code to PROMs and development systems.[9] This uptake was driven by the format's ASCII-based readability and compatibility with early loaders and emulators, making it a de facto standard for firmware distribution in embedded applications during the era's microprocessor boom. The format originated in 1973 for 16-bit linear addressing in Intel's MDS tools. Around 1978, record types 02 (extended segment address) and 03 (start segment address) were added to support the 20-bit segmented addressing of the 8086 processor family. Intel formalized the specification in 1988 through the "Hexadecimal Object File Format Specification (Revision A)," which defined the record structure for 8-bit, 16-bit, and 32-bit microprocessors and emphasized its use with PROM programmers and hardware debuggers.[10] This document solidified the format's structure, including mechanisms for segmented addressing, ensuring interoperability across Intel's evolving processor families. In response to growing memory demands in PCs and embedded systems, the 1988 specification introduced extensions such as the Extended Linear Address Record (type 04) and Start Linear Address Record (type 05), enabling full 32-bit addressing up to 4 GiB and supporting the transition from 16-bit to 32-bit architectures.[10] These enhancements addressed limitations in earlier 20-bit segmented addressing, allowing the format to handle larger codebases without fragmentation. 
The format maintained relevance into the 2000s and beyond, becoming integrated into commercial integrated development environments (IDEs) such as Keil µVision, which generates Intel HEX output for ARM-based devices via project options.[11] Similarly, IAR Embedded Workbench supports Intel HEX as an output format through linker settings for microcontrollers like those from Microchip.[12] Open-source tools like avrdude also rely on it for programming AVR and ARM devices, parsing records to upload firmware over serial interfaces.[13] Although binary formats like ELF have supplanted it in full-fledged operating systems for their richer metadata, Intel HEX endures in legacy embedded programming and hobbyist projects due to its lightweight nature and direct compatibility with flash programmers.[1]
Core Format
Record Structure
The Intel HEX format organizes binary data into discrete records, each represented as a single line of ASCII text encoded in hexadecimal notation. This structure ensures compatibility with text-based transmission and storage systems, allowing reliable transfer of machine code or data to devices like microcontrollers or EPROM programmers. Every record follows a consistent layout with fixed-position fields for metadata and a variable field for the payload, enabling parsers to systematically decode the content.
The record begins with a mandatory prefix consisting of a single ASCII colon (:) character, which serves as the start code to delineate the beginning of each record. Immediately following the colon, the byte count field spans the next two hexadecimal digits (character positions 1-2 after the colon), specifying the number of data bytes contained in the record; this value ranges from 00 to FF, corresponding to 0 through 255 bytes. The address field then occupies the subsequent four hexadecimal digits (positions 3-6), providing a 16-bit (two-byte) load offset where the data bytes are to be stored in memory. Next, the record type field uses two hexadecimal digits (positions 7-8) to indicate the record's purpose, such as 00 for a standard data record that carries the actual payload bytes in its data field.
The data field itself is variable in length, consisting of twice the byte count number of hexadecimal digits (each pair representing one byte of binary data), and follows immediately after the record type field. For instance, a byte count of 0A (10 decimal) results in 20 hexadecimal digits for this field, encoding 10 bytes of program or configuration data. Concluding the record is the checksum field, comprising the final two hexadecimal digits, which provides a validation mechanism to detect transmission errors.
The overall length of a record varies based on the byte count: the minimum is 11 characters for a record with no data bytes (e.g., an end-of-file record), while the maximum reaches 521 characters when including 255 data bytes. A skeletal representation of the structure is :NNAAAATT[DD...]CC, where NN is the byte count, AAAA the address, TT the type, DD... the optional data pairs, and CC the checksum.
| Field | Position (after colon) | Length (characters) | Description |
|---|---|---|---|
| Byte Count | 1-2 | 2 (hex digits) | Number of data bytes (00-FF). |
| Address | 3-6 | 4 (hex digits) | 16-bit load offset. |
| Record Type | 7-8 | 2 (hex digits) | Type identifier (e.g., 00 for data). |
| Data | 9 to 8 + 2×(byte count) | Variable (2×byte count hex digits) | Payload bytes. |
| Checksum | Final 2 | 2 (hex digits) | Error detection value. |
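The field layout in the table can be sketched as a small decoder. This is a minimal illustration in Python, not a reference implementation: the names `HexRecord` and `split_record` are hypothetical, and the checksum is extracted but deliberately not verified here.

```python
from typing import NamedTuple


class HexRecord(NamedTuple):
    """One decoded record (hypothetical helper type for this sketch)."""
    byte_count: int
    address: int
    record_type: int
    data: bytes
    checksum: int


def split_record(line: str) -> HexRecord:
    """Split one record line into its fields without validating the checksum."""
    line = line.strip()                      # drop CR/LF and surrounding blanks
    if not line.startswith(":"):
        raise ValueError("record must start with ':'")
    raw = bytes.fromhex(line[1:])            # two hex digits per byte
    if len(raw) < 5:
        raise ValueError("record shorter than the 11-character minimum")
    byte_count = raw[0]
    if len(raw) != byte_count + 5:           # count + 2 address + type + checksum
        raise ValueError("byte count does not match record length")
    return HexRecord(
        byte_count=byte_count,
        address=int.from_bytes(raw[1:3], "big"),   # 16-bit offset, MSB first
        record_type=raw[3],
        data=raw[4:4 + byte_count],
        checksum=raw[-1],
    )


print(split_record(":00000001FF"))
```

Decoding the whole line with `bytes.fromhex` and then slicing keeps the position arithmetic in bytes rather than characters, which tends to be less error-prone than indexing hex digits directly.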
Record Types
The Intel HEX format defines several standard record types, each identified by a one-byte code that determines how the record's fields are interpreted within the common structure of byte count, address, data, and checksum. These types enable the specification of data placement, address extensions, file termination, and execution starting points, supporting various addressing modes from 16-bit to 32-bit systems.[14]
Type 00 (Data Record) is the primary record for loading program code or data into memory. It specifies a variable number of data bytes (up to 255, indicated by the byte count field) to be stored sequentially starting at the 16-bit load offset given in the address field. The address increments by one for each subsequent data byte; in segmented mode an offset that passes FFFF rolls over to 0000 within the current 64 KB segment rather than carrying into the segment base. This type forms the bulk of most Intel HEX files, directly contributing the firmware or executable content.[14][15]
Type 01 (End of File Record) signals the completion of the Intel HEX file, instructing the loader to cease processing further records. It contains no data bytes (byte count must be 00) and ignores the address field, which is conventionally set to 0000; with a zero address the checksum is therefore always FF. This record is mandatory and typically appears as the final line in the file.[14][1]
Type 02 (Extended Segment Address Record) establishes the upper 16 bits (bits 4 through 19) of a 20-bit segmented base address for subsequent data records, enabling addressing up to 1 MB in legacy Intel 16-bit systems. The byte count is fixed at 02, the address field is 0000 (unused), and the two data bytes hold bits 4 through 19 of the segment base address, whose bits 3-0 are zero; this value is shifted left by four bits (multiplied by 16) and added to the load offsets of following type 00 records until another type 02 record replaces it.[14][16]
Type 03 (Start Segment Address Record) provides the initial execution address in segmented mode for 16-bit Intel processors, such as the 8086, by specifying the code segment (CS) and instruction pointer (IP) registers. It uses a byte count of 04, an unused address field of 0000, and four data bytes: the first two for the 16-bit CS (MSB first) followed by two for the 16-bit IP (MSB first); this record is optional and primarily for runtime initialization rather than file loading.[14][15]
Type 04 (Extended Linear Address Record) sets the upper 16 bits (bits 16 through 31) of a 32-bit linear base address for subsequent data records, supporting up to 4 GB of address space in modern systems. The byte count is 02, the address field is 0000 (unused), and the two data bytes hold the upper address value (MSB first), which is combined with the 16-bit offsets from type 00 records to form full 32-bit addresses until another such record overrides it.[14][1]
Type 05 (Start Linear Address Record) specifies the 32-bit linear execution start address, typically for the extended instruction pointer (EIP) in 32-bit Intel architectures like the 80386. It features a byte count of 04, an unused address field of 0000, and four data bytes representing the full 32-bit address (MSB first); like type 03, it is optional and serves for post-loading program entry point definition rather than data transfer.[14][16]
Checksum Mechanism
The checksum in the Intel HEX format serves to detect transmission or storage errors in each record. It is an 8-bit value appended to the end of every record, allowing parsers to verify that the byte count, address, record type, and data fields have not been corrupted. This mechanism provides simple error detection, commonly used in embedded systems programming where files are transferred serially or loaded into memory devices.[17][16]
The checksum is calculated by summing the binary values of all bytes in the record from the byte count field through the last data byte (excluding the leading colon and the checksum itself). This sum is taken modulo 256 to obtain an 8-bit result, and the checksum byte is the two's complement of that value, so that the total sum of all bytes including the checksum is zero modulo 256. Equivalently, the checksum byte $C$ satisfies $C = (256 - (S \bmod 256)) \bmod 256$, where $S$ is the sum of the bytes from the byte count to the last data byte. The same value can be obtained as the bitwise NOT of $S \bmod 256$ plus 1, again reduced modulo 256. A checksum of this kind detects every single-bit error and most other common transmission faults, although errors that cancel each other in the sum go unnoticed.[17][16][18]
To verify a record, a parser recomputes the sum of all bytes from the byte count through the checksum byte and checks if the result is zero modulo 256. If the total sum is not zero, the record is considered invalid, typically causing the parsing process to abort, log an error, or flag the affected record for manual correction, thereby preventing corrupted data from being loaded into target memory.[17][16]
For example, consider the record :04000000FEEFFFF020, where the byte count is 04 (4 data bytes), address is 0000, type is 00, and data is FE EF FF F0. The binary bytes are: 0x04, 0x00, 0x00, 0x00, 0xFE, 0xEF, 0xFF, 0xF0. Their sum is 992 (0x3E0 in hex), and 992 mod 256 = 224 (0xE0). The checksum is then 256 - 224 = 32 (0x20), confirming the record's integrity since including 0x20 yields a total sum of 1024 (0x400), which is 0 mod 256.[17]
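The computation and the verification rule can be sketched in a few lines of Python. The function names `record_checksum` and `record_is_valid` are hypothetical and chosen only for this illustration; the sample record is the one worked through above.

```python
def record_checksum(body: bytes) -> int:
    """Two's complement of the modulo-256 sum of the bytes before the checksum."""
    return (-sum(body)) & 0xFF


def record_is_valid(line: str) -> bool:
    """True when all bytes of the record, checksum included, sum to 0 modulo 256."""
    raw = bytes.fromhex(line.strip()[1:])    # skip the leading ':'
    return sum(raw) % 256 == 0


body = bytes.fromhex("04000000FEEFFFF0")     # count, address, type, data
print(f"{record_checksum(body):02X}")        # 20
print(record_is_valid(":04000000FEEFFFF020"))   # True
print(record_is_valid(":04000000FEEFFFF021"))   # False
```

Masking the negated sum with `0xFF` yields the same value as the "256 minus the remainder" formulation and also handles the case where the remainder is zero, for which the checksum is 00.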
Line Encoding and Termination
Intel HEX files are encoded as a series of ASCII text lines, where each line represents a single record in the format. The records consist of hexadecimal digits encoded in ASCII characters, with each byte of binary data represented by two hexadecimal digits. By convention, these hexadecimal digits are uppercase (A-F), though lowercase (a-f) is also accepted by most parsers for flexibility in implementation. This ASCII-based encoding ensures that the file can be safely transmitted over text-based channels without corruption from binary data issues, as it avoids embedding null bytes (0x00) or other control characters in the payload representation.[17][19][1]
Each record must occupy exactly one line, with no padding, wrapping, or spanning across multiple lines to maintain parseability. Whitespace characters, such as spaces or tabs, are not permitted within the record fields; all hexadecimal pairs are contiguous following the initial colon (:) marker. Line terminators follow each record, typically using the standard carriage return followed by line feed (CRLF, hexadecimal 0D 0A) for broad compatibility across systems. In Unix-like environments, a line feed (LF, 0x0A) alone is common, but CRLF is recommended to ensure reliable parsing on Windows and other platforms. These terminators are not included in the record's checksum calculation.[17][1][19]
At the file level, an Intel HEX file comprises multiple such records, beginning with either an extended address record or a data record, and concluding with an end-of-file record (type 01) to signal completion. To accommodate traditional terminal display widths and editing tools, records are conventionally limited to a maximum line length of approximately 256 characters, though the format technically supports up to 521 characters per line (corresponding to 255 bytes of data). This structure allows the file to be processed line-by-line, facilitating straightforward sequential reading and validation.[1][19][17]
Examples and Parsing
Basic File Example
A basic Intel HEX file example demonstrates standard data loading using type 00 records for sequential memory filling within the 16-bit address space. The following minimal file loads 32 bytes starting at address 0100h:
:10010000214601360121470136007EFE09D2190140
:100110001C0200036C0001001F0000000000000032
:00000001FF
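A loader for a file of this shape might look like the following Python sketch. It assumes only type 00 and type 01 records occur and rejects anything else; the function name `load_basic_hex` and the `SAMPLE` text are hypothetical, and the checksums in the sample were computed for this sketch.

```python
from typing import Dict


def load_basic_hex(text: str) -> Dict[int, int]:
    """Load type 00 records into {address: byte}; stop at the type 01 record."""
    memory: Dict[int, int] = {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()                  # tolerate CRLF or LF endings
        if not line:
            continue
        if not line.startswith(":"):
            raise ValueError(f"line {line_no}: missing ':' start code")
        raw = bytes.fromhex(line[1:])
        if len(raw) < 5 or len(raw) != raw[0] + 5:
            raise ValueError(f"line {line_no}: length does not match byte count")
        if sum(raw) % 256 != 0:
            raise ValueError(f"line {line_no}: checksum mismatch")
        offset = int.from_bytes(raw[1:3], "big")
        record_type = raw[3]
        if record_type == 0x00:              # data record
            for i, value in enumerate(raw[4:-1]):
                memory[(offset + i) & 0xFFFF] = value
        elif record_type == 0x01:            # end of file
            return memory
        else:
            raise ValueError(f"line {line_no}: type {record_type:02X} not handled here")
    raise ValueError("no end-of-file record found")


SAMPLE = """\
:10010000214601360121470136007EFE09D2190140
:100110001C0200036C0001001F0000000000000032
:00000001FF
"""

image = load_basic_hex(SAMPLE)
print(len(image), f"{min(image):04X}", f"{max(image):04X}")   # 32 0100 011F
```

A dictionary keyed by address is used instead of a flat buffer so that sparse images do not need padding; a real tool might prefer a `bytearray` with an explicit fill value.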
Extended Address Example
The extended segment address record (type 02) in the Intel HEX format enables addressing of memory locations beyond 64 kilobytes by establishing a 16-bit segment base value that subsequent data records (type 00) are offset from, supporting up to 1 megabyte of addressable space in segmented architectures.[20] This mechanism is essential for representing firmware or code in larger memory models where the standard 16-bit address field alone is insufficient.[2] A representative example of an Intel HEX file utilizing an extended segment address record is the following:
:020000020008F4
:10080000AABBCCDDEEFF00112233445566778899F0
:10081000BBCCDDEEFF00112233445566778899008A
:00000001FF
The first record, :020000020008F4, is a type 02 extended segment address record with data bytes 00 08, setting the segment base address to 0008h. The checksum F4 is calculated as the two's complement negation of the sum of all preceding byte fields (02 + 00 + 00 + 02 + 00 + 08 = 0Ch, negated to F4h). The subsequent type 00 data records load 16 bytes each at offsets 0800h and 0810h from the current base, while the final type 01 record :00000001FF marks the end of the file.
The address resolution for data records following the extension is determined by shifting the segment base left by 4 bits (multiplying by 16) and adding the record's 16-bit address field: full address = (segment base × 16) + record address.[1] In the example, the base 0008h × 16 = 00080h; thus, the first data record loads at 00080h + 0800h = 00880h, and the second at 00080h + 0810h = 00890h. A larger segment value moves the window further up: a base of 1000h, for instance, places offset 0800h at 10800h, beyond the first 64 KB. This approach allows sparse or non-contiguous memory loading without requiring continuous addressing from zero.[2]
Such extended addressing is commonly applied in firmware for systems exceeding 64KB of memory. During parsing, the base address remains active for all following records until a new extension record (type 02 or 04) resets it, ensuring correct sequential interpretation of the file.[2]
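The segment arithmetic can be expressed as a short Python helper. The name `segmented_byte_address` is hypothetical, and the wrap of the offset within the 64 KB segment is an assumption of this sketch, following the usual reading of the segmented mode of the format.

```python
def segmented_byte_address(segment: int, offset: int, index: int = 0) -> int:
    """Address of the index-th data byte of a type 00 record in segmented mode."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset are 16-bit values")
    wrapped_offset = (offset + index) & 0xFFFF   # offset wraps inside the 64 KB segment
    return (segment << 4) + wrapped_offset       # base is the type 02 value times 16


print(f"{segmented_byte_address(0x1000, 0x0800):05X}")      # 10800
print(f"{segmented_byte_address(0x1000, 0xFFFF, 1):05X}")   # 10000
```

The second call shows the wrap: the byte after offset FFFFh lands at the start of the same segment instead of spilling into the next 64 KB.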
Extended Linear Address Example
The extended linear address record (type 04) specifies the upper 16 bits of a 32-bit linear base address, allowing data records to address up to 4 gigabytes. Subsequent type 00 records use this base shifted left by 16 bits plus their offset. A simple example:
:0200000400F00A
:10000000AABBCCDDEEFF00112233445566778899F8
:00000001FF
The first record, :0200000400F00A, sets the upper address to 00F0h. Its checksum is the two's complement of the sum of the preceding fields (02 + 00 + 00 + 04 + 00 + F0 = F6h, negated to 0Ah).
The full address = (upper base << 16) + record address. For upper 00F0h, data at offset 0000h loads to 00F00000h. This is used in 32-bit systems like modern embedded devices.[1]
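As a small illustration of the formula, the Python sketch below decodes a type 04 record and derives the base address it establishes. The function name `linear_base_from_record` is hypothetical; the record passed in is a well-formed type 04 record for an upper value of 00F0h, with its checksum computed for this sketch.

```python
def linear_base_from_record(line: str) -> int:
    """Return the 32-bit base address set by a type 04 record."""
    raw = bytes.fromhex(line.strip()[1:])        # skip the leading ':'
    if sum(raw) % 256 != 0:
        raise ValueError("checksum mismatch")
    if len(raw) != 7 or raw[0] != 0x02 or raw[3] != 0x04:
        raise ValueError("not a type 04 record with two data bytes")
    upper = int.from_bytes(raw[4:6], "big")      # upper 16 bits, MSB first
    return upper << 16


base = linear_base_from_record(":0200000400F00A")
print(f"{base:08X}")             # 00F00000
print(f"{base + 0x0010:08X}")    # 00F00010, a data byte at record offset 0010h
```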
Variants and Extensions
Standard Variant
The standard variant of the Intel HEX format, also known as INHX16 or I16HEX, introduced in the late 1970s for Intel's Intellec development systems and 16-bit processors such as the 8086, provides a textual representation of binary data using ASCII hexadecimal characters, primarily for loading programs into ROMs and EPROMs.[21] This variant employs 16-bit addressing within segments, limiting the addressable memory to a maximum of 64 KB per segment without requiring further extensions, and relies mainly on type 00 records for data distribution and type 01 records to signal the file's conclusion.[21] Defined in Intel's Extended Hexadecimal Object File Format Specification (Revision A, January 6, 1988), it specifies record types 00 through 05, though the core functionality centers on types 00, 01, 02, and 03 for compatibility with 16-bit segmented architectures like the 8086.[21] A key limitation of this variant is its lack of native support for memory spaces exceeding 1 MB in a flat model, as the 16-bit address field in type 00 records can only reference offsets within a 64 KB segment.[21] To address larger segmented memory in 16-bit processors like the 8086, type 02 records optionally set a 16-bit segment base address, which is shifted left by 4 bits (multiplied by 16) and added to subsequent type 00 offsets; however, this segment addressing is non-linear, potentially leading to gaps or overlaps if not managed carefully, as it reflects the processor's 20-bit physical address space rather than a flat model.[21] Type 03 records specify the starting execution address within this segmented scheme, completing the basic loading mechanism. 
This format is widely compatible with legacy development tools, including Intel's In-Circuit Emulators (ICE) such as the ICE-186/188, which directly load standard Intel HEX files for debugging 8086-family processors.[22] Modern flash programming utilities, like those in the ARM and Microchip ecosystems, also support it for microcontroller and EPROM applications because of its simplicity and widespread adoption.[1][23] Standard files impose strict constraints to ensure reliable parsing: they must conclude with exactly one type 01 record (format: :00000001FF), which has a zero byte count and carries no data, serving solely as the terminator, and no duplicate addresses are permitted across type 00 records, to prevent unintended data overwrites during loading.[21][24]
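The two constraints above (a single terminating type 01 record and no duplicate data addresses) can be checked mechanically. The following Python sketch is one possible way to do so; `parse_record` and `check_standard_file` are hypothetical names, and the wraparound of offsets within a 64 KB segment is an assumption of this sketch.

```python
def parse_record(line: str):
    """Return (record_type, address, data) for one ':'-prefixed record."""
    line = line.strip()
    if not line.startswith(":"):
        raise ValueError(f"not a record: {line!r}")
    raw = bytes.fromhex(line[1:])
    if sum(raw) & 0xFF != 0:
        raise ValueError(f"checksum mismatch: {line!r}")
    count, address, rtype = raw[0], int.from_bytes(raw[1:3], "big"), raw[3]
    data = raw[4:-1]
    if len(data) != count:
        raise ValueError(f"byte count mismatch: {line!r}")
    return rtype, address, data


def check_standard_file(lines):
    """Return a list of problems; an empty list means both checks pass."""
    records = [parse_record(l) for l in lines if l.strip()]
    problems, seen, eof_count, segment_base = [], set(), 0, 0
    for index, (rtype, address, data) in enumerate(records):
        if rtype == 0x01:
            eof_count += 1
            if index != len(records) - 1:
                problems.append("end-of-file record is not the last record")
        elif rtype == 0x02:
            segment_base = int.from_bytes(data, "big") << 4
        elif rtype == 0x00:
            for i in range(len(data)):
                # Assumption: offsets wrap within the 64 KB segment.
                absolute = segment_base + ((address + i) & 0xFFFF)
                if absolute in seen:
                    problems.append(f"duplicate address {absolute:05X}")
                seen.add(absolute)
    if eof_count != 1:
        problems.append(f"expected exactly one end-of-file record, found {eof_count}")
    return problems


print(check_standard_file([":0100000055AA", ":00000001FF"]))   # []
```

Feeding the same data record twice, or omitting :00000001FF, makes the function report a problem instead of an empty list.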
Common pitfalls in using the standard variant include assuming fully linear addressing throughout the file, which fails when type 02 records introduce segment shifts, resulting in misaligned memory placement; additionally, overlooking the absence of type 04 records can cause errors in tools expecting extended addressing, though such features fall outside this baseline specification.[15][21]
Extended Linear Address Variant
The Extended Linear Address variant of the Intel HEX format was introduced in 1988 to support 32-bit addressing for processors like the Intel 80386, enabling access to a full 4 GB address space by specifying the upper 16 bits of the linear base address.[2] This extension addresses the limitations of earlier 16-bit addressing schemes, allowing firmware and data to be placed beyond the 64 KB boundary in a linear manner without relying on segmented memory models.[2] Also known as INHX32 or I32HEX, it primarily uses record types 00, 01, 04, and 05.[25]
The mechanism relies on type 04 records, which consist of a fixed byte count of 02, an ignored address field (typically 0000), the record type 04, and two data bytes representing the upper linear base address (ULBA).[2] Subsequent data records (type 00) have their effective addresses calculated as (ULBA << 16) | record_address, where record_address is the 16-bit address field of the standard record structure.[2] The ULBA remains in effect until overridden by another type 04 record and defaults to 0000 at the start of the file.[2] This approach maintains compatibility with the core format while extending the addressable range up to 4 GB.[2]
For example, the record :0200000400807A sets the ULBA to 0080h, establishing a base address of 00800000h.[2] A following data record such as :080000000123456789ABCDEF38 would then load 8 bytes starting at absolute address 00800000h.[2]
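A minimal loader that honours type 04 records might look like the following Python sketch. The function name `load_linear_hex` is hypothetical, the 8-byte data record is chosen only for illustration, and record types other than 00, 01, and 04 are simply skipped here.

```python
def load_linear_hex(lines):
    """Return {absolute_address: byte} for type 00 records, applying type 04."""
    memory = {}
    ulba = 0  # upper linear base address; defaults to 0000h
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if not line.startswith(":"):
            raise ValueError(f"not a record: {line!r}")
        raw = bytes.fromhex(line[1:])
        if sum(raw) & 0xFF != 0:
            raise ValueError(f"checksum mismatch: {line!r}")
        count, offset, rtype = raw[0], int.from_bytes(raw[1:3], "big"), raw[3]
        data = raw[4:4 + count]
        if rtype == 0x04:      # extended linear address: new upper 16 bits
            ulba = int.from_bytes(data, "big")
        elif rtype == 0x00:    # data: place each byte at its 32-bit address
            for i, value in enumerate(data):
                memory[((ulba << 16) + offset + i) & 0xFFFFFFFF] = value
        elif rtype == 0x01:    # end of file
            break
    return memory


image = load_linear_hex([
    ":0200000400807A",
    ":080000000123456789ABCDEF38",
    ":00000001FF",
])
print(f"{min(image):08X} {len(image)}")   # 00800000 8
```

Without the type 04 branch, the same data would be placed at 00000000h, which is the failure mode of loaders that only understand 16-bit addressing.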
Older parsers designed for 8-bit or 16-bit systems may ignore type 04 records, treating them as no-ops and falling back to 16-bit addressing, which can lead to incomplete or overlapping loads for files exceeding 64 KB.[2] In contrast, modern tools such as GNU binutils handle this variant through the ihex object format (for example, objcopy -O ihex), ensuring proper handling of 32-bit addresses during conversion and loading.
This variant is detailed in Intel's Hexadecimal Object File Format Specification (Revision A, January 6, 1988), which formalized the 32-bit extensions.[2] It is essential for programming 32-bit microcontrollers and ARM-based systems where firmware images surpass 64 KB, providing a straightforward way to distribute large binaries across extended memory regions.[1]
