Executable and Linkable Format

from Wikipedia
Executable and Linkable Format
Filename extension: none, .axf, .bin, .elf, .o, .out, .prx, .puff, .ko, .mod, and .so
Magic number: 0x7F 'E' 'L' 'F'
Developed by: Unix System Laboratories[1]: 3
Initial release: 14 May 1998
Latest release: 4.2[2] (2025)
Type of format: Binary, executable, object, shared library, core dump
Container for: Many executable binary formats
Standard: gabi.xinuos.com
Website: github.com/xinuos/gabi
An ELF file has two views: the program header shows the segments used at run time, whereas the section header lists the set of sections.

In computing, the Executable and Linkable Format[3] (ELF, formerly named Extensible Linking Format) is a common standard file format for executable files, object code, shared libraries, device drivers, and core dumps. First published in the specification for the application binary interface (ABI) of the Unix operating system version named System V Release 4 (SVR4),[4] and later in the Tool Interface Standard,[1] it was quickly accepted among different vendors of Unix systems. In 1999, it was chosen as the standard binary file format for Unix and Unix-like systems on x86 processors by the 86open project.

By design, the ELF format is flexible, extensible, and cross-platform. For instance, it supports different endiannesses and address sizes so it does not exclude any particular CPU or instruction set architecture. This has allowed it to be adopted by many different operating systems on many different hardware platforms.

File layout

Each ELF file is made up of one ELF header, followed by file data. The data can include:

  • Program header table, describing zero or more memory segments
  • Section header table, describing zero or more sections
  • Data referred to by entries in the program header table or section header table
Structure of an ELF file with key entries highlighted

The segments contain information that is needed for run time execution of the file, while sections contain important data for linking and relocation. Any byte in the entire file can be owned by one section at most, and orphan bytes can occur which are unowned by any section.

ELF header

The ELF header defines whether to use 32-bit or 64-bit addresses. The header contains three fields that are affected by this setting and offset other fields that follow them. The ELF header is 52 or 64 bytes long for 32-bit and 64-bit binaries, respectively.

ELF header[5]
Offset Size (bytes) Field Purpose
32-bit 64-bit 32-bit 64-bit
0x00 4 e_ident[EI_MAG0] through e_ident[EI_MAG3] 0x7F followed by ELF (45 4c 46) in ASCII; these four bytes constitute the magic number.
0x04 1 e_ident[EI_CLASS] This byte is set to either 1 or 2 to signify 32- or 64-bit format, respectively.
0x05 1 e_ident[EI_DATA] This byte is set to either 1 or 2 to signify little or big endianness, respectively. This affects interpretation of multi-byte fields starting with offset 0x10.
0x06 1 e_ident[EI_VERSION] Set to 1 for the original and current version of ELF.
0x07 1 e_ident[EI_OSABI] Identifies the target operating system ABI.
Value ABI
0x00 System V
0x01 HP-UX
0x02 NetBSD
0x03 Linux
0x04 GNU Hurd
0x06 Solaris
0x07 AIX (Monterey)
0x08 IRIX
0x09 FreeBSD
0x0A Tru64
0x0B Novell Modesto
0x0C OpenBSD
0x0D OpenVMS
0x0E NonStop Kernel
0x0F AROS
0x10 FenixOS
0x11 Nuxi CloudABI
0x12 Stratus Technologies OpenVOS
0x08 1 e_ident[EI_ABIVERSION] Further specifies the ABI version. Its interpretation depends on the target ABI. The Linux kernel (since at least 2.6) does not define it,[6] so it is ignored for statically linked executables. In that case, the offset and size of EI_PAD are both 8.

When e_ident[EI_OSABI] == 3, glibc 2.12+ treats this field as the ABI version of the dynamic linker:[7] it defines a list of dynamic linker features,[8] treats e_ident[EI_ABIVERSION] as the feature level requested by the shared object (executable or dynamic library), and refuses to load it if an unknown feature is requested, i.e. if e_ident[EI_ABIVERSION] is greater than the largest known feature.[9]

0x09 7 e_ident[EI_PAD] Reserved padding bytes. Currently unused. Should be filled with zeros and ignored when read.
0x10 2 e_type Identifies object file type.
Value Type Meaning
0x00 ET_NONE Unknown.
0x01 ET_REL Relocatable file.
0x02 ET_EXEC Executable file.
0x03 ET_DYN Shared object.
0x04 ET_CORE Core file.
0xFE00 ET_LOOS Reserved inclusive range. Operating system specific.
0xFEFF ET_HIOS
0xFF00 ET_LOPROC Reserved inclusive range. Processor specific.
0xFFFF ET_HIPROC
0x12 2 e_machine Specifies target instruction set architecture. Some examples are:
Value ISA
0x00 No specific instruction set
0x01 AT&T WE 32100
0x02 SPARC
0x03 x86
0x04 Motorola 68000 (M68k)
0x05 Motorola 88000 (M88k)
0x06 Intel MCU
0x07 Intel 80860
0x08 MIPS
0x09 IBM System/370
0x0A MIPS RS3000 Little-endian
0x0B – 0x0E Reserved for future use
0x0F Hewlett-Packard PA-RISC
0x13 Intel 80960
0x14 PowerPC
0x15 PowerPC (64-bit)
0x16 S390, including S390x
0x17 IBM SPU/SPC
0x18 – 0x23 Reserved for future use
0x24 NEC V800
0x25 Fujitsu FR20
0x26 TRW RH-32
0x27 Motorola RCE
0x28 Arm (up to Armv7/AArch32)
0x29 Digital Alpha
0x2A SuperH
0x2B SPARC Version 9
0x2C Siemens TriCore embedded processor
0x2D Argonaut RISC Core
0x2E Hitachi H8/300
0x2F Hitachi H8/300H
0x30 Hitachi H8S
0x31 Hitachi H8/500
0x32 IA-64
0x33 Stanford MIPS-X
0x34 Motorola ColdFire
0x35 Motorola M68HC12
0x36 Fujitsu MMA Multimedia Accelerator
0x37 Siemens PCP
0x38 Sony nCPU embedded RISC processor
0x39 Denso NDR1 microprocessor
0x3A Motorola Star*Core processor
0x3B Toyota ME16 processor
0x3C STMicroelectronics ST100 processor
0x3D Advanced Logic Corp. TinyJ embedded processor family
0x3E AMD x86-64
0x3F Sony DSP Processor
0x40 Digital Equipment Corp. PDP-10
0x41 Digital Equipment Corp. PDP-11
0x42 Siemens FX66 microcontroller
0x43 STMicroelectronics ST9+ 8/16-bit microcontroller
0x44 STMicroelectronics ST7 8-bit microcontroller
0x45 Motorola MC68HC16 Microcontroller
0x46 Motorola MC68HC11 Microcontroller
0x47 Motorola MC68HC08 Microcontroller
0x48 Motorola MC68HC05 Microcontroller
0x49 Silicon Graphics SVx
0x4A STMicroelectronics ST19 8-bit microcontroller
0x4B Digital VAX
0x4C Axis Communications 32-bit embedded processor
0x4D Infineon Technologies 32-bit embedded processor
0x4E Element 14 64-bit DSP Processor
0x4F LSI Logic 16-bit DSP Processor
0x8C TMS320C6000 Family
0xAF MCST Elbrus e2k
0xB7 Arm 64-bits (Armv8/AArch64)
0xDC Zilog Z80
0xF3 RISC-V
0xF7 Berkeley Packet Filter
0x101 WDC 65C816
0x102 LoongArch
0x14 4 e_version Set to 1 for the original version of ELF.
0x18 4 8 e_entry This is the memory address of the entry point from where the process starts executing. This field is either 32 or 64 bits long, depending on the format defined earlier (byte 0x04). If the file doesn't have an associated entry point, then this holds zero.
0x1C 0x20 4 8 e_phoff Points to the start of the program header table. The table usually follows the ELF header immediately, making the offset 0x34 or 0x40 for 32- and 64-bit ELF executables, respectively.
0x20 0x28 4 8 e_shoff Points to the start of the section header table.
0x24 0x30 4 e_flags Interpretation of this field depends on the target architecture.
0x28 0x34 2 e_ehsize Contains the size of this header, normally 64 Bytes for 64-bit and 52 Bytes for 32-bit format.
0x2A 0x36 2 e_phentsize Contains the size of a program header table entry. As explained below, this will typically be 0x20 (32-bit) or 0x38 (64-bit).
0x2C 0x38 2 e_phnum Contains the number of entries in the program header table.
0x2E 0x3A 2 e_shentsize Contains the size of a section header table entry. As explained below, this will typically be 0x28 (32-bit) or 0x40 (64-bit).
0x30 0x3C 2 e_shnum Contains the number of entries in the section header table.
0x32 0x3E 2 e_shstrndx Contains index of the section header table entry that contains the section names.
0x34 0x40 End of ELF Header (size).

Example hexdump

00000000  7f 45 4c 46 02 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
00000010  02 00 3e 00 01 00 00 00  c5 48 40 00 00 00 00 00  |..>......H@.....|

[10]
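
The 32 bytes in the dump above can be decoded field by field using the offsets from the ELF header table. The following Python sketch is illustrative only; the function name `parse_header_prefix` and the dictionary keys are assumptions made for this example and do not come from any ELF tooling.

```python
import struct

# The 32 bytes shown in the example hexdump.
HEADER_PREFIX = bytes.fromhex(
    "7f 45 4c 46 02 01 01 00  00 00 00 00 00 00 00 00 "
    "02 00 3e 00 01 00 00 00  c5 48 40 00 00 00 00 00"
)


def parse_header_prefix(data):
    """Decode e_ident through e_entry from the start of an ELF file."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file: bad magic number")
    ei_class, ei_data, ei_version, ei_osabi = data[4:8]
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("unsupported EI_CLASS or EI_DATA value")
    # EI_DATA selects the byte order of every multi-byte field from 0x10 on.
    endian = "<" if ei_data == 1 else ">"
    # e_type (2 bytes), e_machine (2), e_version (4), e_entry (4 or 8).
    fmt = endian + ("HHII" if ei_class == 1 else "HHIQ")
    e_type, e_machine, e_version, e_entry = struct.unpack_from(fmt, data, 0x10)
    return {
        "class_bits": 32 if ei_class == 1 else 64,
        "little_endian": ei_data == 1,
        "ei_version": ei_version,
        "ei_osabi": ei_osabi,
        "e_type": e_type,
        "e_machine": e_machine,
        "e_version": e_version,
        "e_entry": e_entry,
    }


info = parse_header_prefix(HEADER_PREFIX)
print(info["class_bits"], hex(info["e_machine"]), hex(info["e_entry"]))
```

For the dump above, this yields a 64-bit, little-endian file with e_type 0x02 (ET_EXEC), e_machine 0x3E (AMD x86-64), and an entry point of 0x4048c5.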

Program header

The program header table tells the system how to create a process image. It is found at file offset e_phoff, and consists of e_phnum entries, each of size e_phentsize. The layout differs slightly between 32-bit and 64-bit ELF, because p_flags sits at a different position in the structure for alignment reasons. Each entry is structured as:

Program header[11]
Offset Size (bytes) Field Purpose
32-bit 64-bit 32-bit 64-bit
0x00 4 p_type Identifies the type of the segment.
Value Name Meaning
0x00000000 PT_NULL Program header table entry unused.
0x00000001 PT_LOAD Loadable segment.
0x00000002 PT_DYNAMIC Dynamic linking information.
0x00000003 PT_INTERP Interpreter information.
0x00000004 PT_NOTE Auxiliary information.
0x00000005 PT_SHLIB Reserved.
0x00000006 PT_PHDR Segment containing program header table itself.
0x00000007 PT_TLS Thread-Local Storage template.
0x60000000 PT_LOOS Reserved inclusive range. Operating system specific.
0x6FFFFFFF PT_HIOS
0x70000000 PT_LOPROC Reserved inclusive range. Processor specific.
0x7FFFFFFF PT_HIPROC
0x04 4 p_flags Segment-dependent flags (position for 64-bit structure).
Value Name Meaning
0x1 PF_X Executable segment.
0x2 PF_W Writeable segment.
0x4 PF_R Readable segment.
0x04 0x08 4 8 p_offset Offset of the segment in the file image.
0x08 0x10 4 8 p_vaddr Virtual address of the segment in memory.
0x0C 0x18 4 8 p_paddr On systems where physical address is relevant, reserved for segment's physical address.
0x10 0x20 4 8 p_filesz Size in bytes of the segment in the file image. May be 0.
0x14 0x28 4 8 p_memsz Size in bytes of the segment in memory. May be 0.
0x18 4 p_flags Segment-dependent flags (position for 32-bit structure). See above p_flags field for flag definitions.
0x1C 0x30 4 8 p_align 0 and 1 specify no alignment. Otherwise should be a positive, integral power of 2, with p_vaddr congruent to p_offset modulo p_align.
0x20 0x38 End of Program Header (size).
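
The difference in field order between the two classes is easiest to see in code. The sketch below is a minimal illustration, assuming the offsets in the table above; the names `parse_phdr` and `flags_to_str` are hypothetical helpers invented for this example.

```python
import struct

PF_X, PF_W, PF_R = 0x1, 0x2, 0x4


def flags_to_str(p_flags):
    """Render p_flags in the familiar R/W/X form."""
    return "".join(
        letter if p_flags & bit else "-"
        for letter, bit in (("R", PF_R), ("W", PF_W), ("X", PF_X))
    )


def parse_phdr(data, offset=0, is_64bit=True, little_endian=True):
    """Decode one program header entry starting at the given offset."""
    endian = "<" if little_endian else ">"
    if is_64bit:
        # 64-bit order: p_flags comes right after p_type (entry is 0x38 bytes).
        (p_type, p_flags, p_offset, p_vaddr, p_paddr,
         p_filesz, p_memsz, p_align) = struct.unpack_from(
            endian + "IIQQQQQQ", data, offset)
    else:
        # 32-bit order: p_flags comes after p_memsz (entry is 0x20 bytes).
        (p_type, p_offset, p_vaddr, p_paddr,
         p_filesz, p_memsz, p_flags, p_align) = struct.unpack_from(
            endian + "IIIIIIII", data, offset)
    # p_align of 0 or 1 means no alignment constraint; otherwise p_vaddr
    # and p_offset must be congruent modulo p_align.
    aligned = p_align <= 1 or p_vaddr % p_align == p_offset % p_align
    return {
        "p_type": p_type,
        "flags": flags_to_str(p_flags),
        "p_offset": p_offset,
        "p_vaddr": p_vaddr,
        "p_paddr": p_paddr,
        "p_filesz": p_filesz,
        "p_memsz": p_memsz,
        "p_align": p_align,
        "aligned": aligned,
    }
```

In a real file, the entries would be read in a loop over e_phnum, advancing the offset by e_phentsize each time, starting at e_phoff.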

Section header

Offset Size (bytes) Field Purpose
32-bit 64-bit 32-bit 64-bit
0x00 4 sh_name An offset to a string in the .shstrtab section that represents the name of this section.
0x04 4 sh_type Identifies the type of this header.
Value Name Meaning
0x0 SHT_NULL Section header table entry unused
0x1 SHT_PROGBITS Program data
0x2 SHT_SYMTAB Symbol table
0x3 SHT_STRTAB String table
0x4 SHT_RELA Relocation entries with addends
0x5 SHT_HASH Symbol hash table
0x6 SHT_DYNAMIC Dynamic linking information
0x7 SHT_NOTE Notes
0x8 SHT_NOBITS Program space with no data (bss)
0x9 SHT_REL Relocation entries, no addends
0x0A SHT_SHLIB Reserved
0x0B SHT_DYNSYM Dynamic linker symbol table
0x0E SHT_INIT_ARRAY Array of constructors
0x0F SHT_FINI_ARRAY Array of destructors
0x10 SHT_PREINIT_ARRAY Array of pre-constructors
0x11 SHT_GROUP Section group
0x12 SHT_SYMTAB_SHNDX Extended section indices
0x13 SHT_NUM Number of defined types.
0x60000000 SHT_LOOS Start OS-specific.
... ... ...
0x08 4 8 sh_flags Identifies the attributes of the section.
Value Name Meaning
0x1 SHF_WRITE Writable
0x2 SHF_ALLOC Occupies memory during execution
0x4 SHF_EXECINSTR Executable
0x10 SHF_MERGE Might be merged
0x20 SHF_STRINGS Contains null-terminated strings
0x40 SHF_INFO_LINK 'sh_info' contains SHT index
0x80 SHF_LINK_ORDER Preserve order after combining
0x100 SHF_OS_NONCONFORMING Non-standard OS specific handling required
0x200 SHF_GROUP Section is member of a group
0x400 SHF_TLS Section holds thread-local data
0x0FF00000 SHF_MASKOS OS-specific
0xF0000000 SHF_MASKPROC Processor-specific
0x4000000 SHF_ORDERED Special ordering requirement (Solaris)
0x8000000 SHF_EXCLUDE Section is excluded unless referenced or allocated (Solaris)
0x0C 0x10 4 8 sh_addr Virtual address of the section in memory, for sections that are loaded.
0x10 0x18 4 8 sh_offset Offset of the section in the file image.
0x14 0x20 4 8 sh_size Size in bytes of the section. May be 0.
0x18 0x28 4 sh_link Contains the section index of an associated section. This field is used for several purposes, depending on the type of section.
0x1C 0x2C 4 sh_info Contains extra information about the section. This field is used for several purposes, depending on the type of section.
0x20 0x30 4 8 sh_addralign Contains the required alignment of the section. This field must be a power of two.
0x24 0x38 4 8 sh_entsize Contains the size, in bytes, of each entry, for sections that contain fixed-size entries. Otherwise, this field contains zero.
0x28 0x40 End of Section Header (size).
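
As a rough illustration of how these fields are used together, the Python sketch below decodes one section header entry, expands sh_flags into flag names, and resolves sh_name against the contents of the .shstrtab section. The helper names (`parse_shdr`, `decode_flags`, `section_name`) are assumptions for this example, and only the generic flags from the table above are covered.

```python
import struct

# Generic sh_flags bits from the table above (OS- and processor-specific
# masks are left out of this sketch).
SHF_NAMES = {
    0x1: "WRITE", 0x2: "ALLOC", 0x4: "EXECINSTR", 0x10: "MERGE",
    0x20: "STRINGS", 0x40: "INFO_LINK", 0x80: "LINK_ORDER",
    0x100: "OS_NONCONFORMING", 0x200: "GROUP", 0x400: "TLS",
}

FIELDS = ("sh_name", "sh_type", "sh_flags", "sh_addr", "sh_offset",
          "sh_size", "sh_link", "sh_info", "sh_addralign", "sh_entsize")


def parse_shdr(data, offset=0, is_64bit=True, little_endian=True):
    """Decode one section header entry (0x28 or 0x40 bytes)."""
    endian = "<" if little_endian else ">"
    # Field order is the same in both classes; only some widths change.
    fmt = endian + ("IIQQQQIIQQ" if is_64bit else "IIIIIIIIII")
    return dict(zip(FIELDS, struct.unpack_from(fmt, data, offset)))


def decode_flags(sh_flags):
    """List the names of the generic flag bits set in sh_flags."""
    return [name for bit, name in SHF_NAMES.items() if sh_flags & bit]


def section_name(shstrtab, sh_name):
    """Read the null-terminated name at offset sh_name in .shstrtab."""
    end = shstrtab.index(b"\0", sh_name)
    return shstrtab[sh_name:end].decode("ascii")
```

The bytes of .shstrtab themselves are found through the section header whose index is given by e_shstrndx in the ELF header.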

Tools

  • readelf is a Unix binary utility that displays information about one or more ELF files. A free software implementation is provided by GNU Binutils.
  • elfutils provides alternative tools to GNU Binutils purely for Linux.[12]
  • elfdump is a command for viewing ELF information in an ELF file, available under Solaris and FreeBSD.
  • objdump provides a wide range of information about ELF files and other object formats. objdump uses the Binary File Descriptor library as a back-end to structure the ELF data.
  • The Unix file utility can display some information about ELF files, including the instruction set architecture for which the code in a relocatable, executable, or shared object file is intended, or on which an ELF core dump was produced.

Applications

Unix-like systems

The ELF format has replaced older executable formats in various environments. It has replaced the a.out and COFF formats in Unix-like operating systems.

Non-Unix adoption

ELF has also seen some adoption in non-Unix operating systems.

Microsoft Windows also uses the ELF format, but only for its Windows Subsystem for Linux compatibility system.[18]

Game consoles

Some game consoles also use ELF:

  • PlayStation Portable,[19] PlayStation Vita, PlayStation, PlayStation 2, PlayStation 3, PlayStation 4, PlayStation 5
  • GP2X
  • Dreamcast
  • GameCube
  • Nintendo 64
  • Wii
  • Wii U

PowerPC

Other (operating) systems running on PowerPC that use ELF:

  • AmigaOS 4, where the ELF executable has replaced the prior Extended Hunk Format (EHF), which was used on Amigas equipped with PPC processor expansion cards.
  • MorphOS
  • AROS
  • Café OS (The operating system run by the Wii U)

Mobile phones

Some operating systems for mobile phones and mobile devices use ELF:

  • Symbian OS v9 uses E32Image[20] format that is based on the ELF file format;
  • Sony Ericsson, for example, the W800i, W610, W300, etc.
  • Siemens, the SGOLD and SGOLD2 platforms: from Siemens C65 to S75 and BenQ-Siemens E71/EL71;
  • Motorola, for example, the E398, SLVR L7, v360, v3i (and all LTE2 phones that have the patch applied).
  • Bada, for example, the Samsung Wave S8500.
  • Nokia phones or tablets running the Maemo or the Meego OS, for example, the Nokia N900.
  • Android uses ELF .so (shared object[21]) libraries for the Java Native Interface.[citation needed] With Android Runtime (ART), the default since Android 5.0 "Lollipop", all applications are compiled into native ELF binaries on installation.[22] It is also possible to install native Linux software through package managers such as Termux, or to compile it from source with Clang or GCC, which are available in its repositories.

Some phones can run ELF files through the use of a patch that adds assembly code to the main firmware, which is a feature known as ELFPack in the underground modding culture. The ELF file format is also used with the Atmel AVR (8-bit), AVR32[23] and with Texas Instruments MSP430 microcontroller architectures. Some implementations of Open Firmware can also load ELF files, most notably Apple's implementation used in almost all PowerPC machines the company produced.

Blockchain platforms

  • Solana uses ELF format for its on-chain programs (smart contracts). The platform processes ELF files compiled to BPF (Berkeley Packet Filter) byte-code, which are then deployed as shared objects and executed in Solana's runtime environment. The BPF loader validates and processes these ELF files during program deployment.[24]

86open

86open was a project to form consensus on a common binary file format for Unix and Unix-like operating systems on the common PC compatible x86 architecture, to encourage software developers to port to the architecture.[25] The initial idea was to standardize on a small subset of Spec 1170, a predecessor of the Single UNIX Specification, and the GNU C Library (glibc) to enable unmodified binaries to run on the x86 Unix-like operating systems. The project was originally designated "Spec 150".

The format eventually chosen was ELF, specifically the Linux implementation of ELF, after it had turned out to be a de facto standard supported by all involved vendors and operating systems.

The group began email discussions in 1997 and first met together at the Santa Cruz Operation offices on August 22, 1997.

The steering committee was Marc Ewing, Dion Johnson, Evan Leibovitch, Bruce Perens, Andrew Roach, Bryan Wayne Sparks and Linus Torvalds. Other people on the project were Keith Bostic, Chuck Cranor, Michael Davidson, Chris G. Demetriou, Ulrich Drepper, Don Dugger, Steve Ginzburg, Jon "maddog" Hall, Ron Holt, Jordan Hubbard, Dave Jensen, Kean Johnston, Andrew Josey, Robert Lipe, Bela Lubkin, Tim Marsland, Greg Page, Ronald Joe Record, Tim Ruckle, Joel Silverstein, Chia-pi Tien, and Erik Troan. Operating systems and companies represented were BeOS, BSDI, FreeBSD, Intel, Linux, NetBSD, SCO and SunSoft.

The project progressed and in mid-1998, SCO began developing lxrun, an open-source compatibility layer able to run Linux binaries on OpenServer, UnixWare, and Solaris. SCO announced official support of lxrun at LinuxWorld in March 1999. Sun Microsystems began officially supporting lxrun for Solaris in early 1999,[26] and later moved to integrated support of the Linux binary format via Solaris Containers for Linux Applications.

With the BSDs having long supported Linux binaries (through a compatibility layer) and the main x86 Unix vendors having added support for the format, the project decided that Linux ELF was the format chosen by the industry and "declare[d] itself dissolved" on July 25, 1999.[27]

FatELF: universal binaries for Linux

FatELF is an ELF binary-format extension that adds fat binary capabilities.[28] It is aimed at Linux and other Unix-like operating systems. In addition to CPU architecture abstraction (byte order, word size, CPU instruction set, etc.), it offers the potential advantage of software-platform abstraction, e.g., binaries that support multiple kernel ABI versions. As of 2021, FatELF has not been integrated into the mainline Linux kernel.[29][30][31]

from Grokipedia
The Executable and Linkable Format (ELF) is a standard file format for executables, relocatable object files, shared object libraries, and core dumps, primarily used on Unix-like operating systems to define the structure and organization of binary data for loading and linking.[1] Developed in the late 1980s by Unix System Laboratories (a subsidiary of AT&T) and Sun Microsystems as part of System V Release 4 (SVR4), ELF was adopted in Solaris 2.0 and has since become the de facto standard for many open-source and commercial Unix variants, including Linux, BSD, and Solaris.[2] Its design emphasizes flexibility, portability across architectures, and support for dynamic linking, replacing earlier formats like a.out and COFF to streamline software development and execution.[1]

At its core, an ELF file begins with a fixed ELF header that provides essential metadata, including the magic bytes (0x7F 'E' 'L' 'F'), the file's class (32-bit or 64-bit), data encoding (little-endian or big-endian), the target architecture (e.g., x86, ARM), and the file type (relocatable, executable, shared object, or core).[3] Following the header, ELF files may include a program header table (an array of entries describing loadable segments for process image creation, such as code, data, and dynamic linking information) and a section header table that details smaller, linkable sections like .text (executable code), .data (initialized variables), .bss (uninitialized data), and .symtab (symbol table).[2] This dual structure allows ELF to serve both runtime loading (via program headers) and static linking/relocation (via section headers), enabling efficient memory mapping and shared library usage without redundant code duplication.[1]

ELF's extensibility supports versioning for symbols, relocation entries for address resolution, and notes sections for auxiliary information like debugging data or operating system-specific details, making it adaptable to modern features like position-independent code (PIC) and multi-architecture binaries.[1] Widely implemented in toolchains such as GCC and binutils, ELF facilitates cross-compilation and has influenced formats in non-Unix environments, underscoring its role as a foundational element in software portability and system reliability.[2]

History and Development

Origins and 86open Principles

The Executable and Linkable Format (ELF) originated in the late 1980s, when Unix System Laboratories (USL) developed it as part of System V Release 4 (SVR4) to overcome the limitations of the earlier a.out format, providing a more flexible structure for executables, object files, and shared libraries.[1][4] USL collaborated with Sun Microsystems, incorporating elements of Sun's dynamic shared library system from SunOS 4.x (introduced in 1988), which enabled runtime linking of libraries to reduce executable sizes and improve modularity.[4] ELF was first specified in the SVR4 Application Binary Interface (ABI).[1][4] Sun Microsystems adopted it in Solaris 2.0 (also known as SunOS 5.0), released in 1992, marking one of the earliest widespread deployments and demonstrating its compatibility with SVR4-based environments.[4] Other Unix variants followed suit in the early 1990s, as the format's design facilitated portability across x86 systems without requiring OS-specific modifications.[1]

In response to growing fragmentation among proprietary executable formats on x86 Unix platforms, the 86open project was founded in 1997 by a consortium including the Santa Cruz Operation (SCO) and Linux vendors, to establish a unified standard that would allow binaries to run seamlessly across diverse Unix implementations. The project focused on consensus-building for a common ABI, ultimately endorsing ELF as the solution before concluding in 1999.[4][5][6]

The core principles of 86open emphasized a simple yet extensible file structure to accommodate future enhancements without breaking compatibility, robust support for dynamic linking to enable shared libraries, generation of position-independent code (PIC) for relocatable executables, and deliberate exclusion of OS-specific dependencies to ensure broad interoperability across Unix variants.[4] These guidelines addressed the proprietary silos of the era, promoting ELF's adoption as a vendor-neutral format that prioritized efficiency and cross-platform usability.[5]

Standardization and Evolution

The formal standardization of the Executable and Linkable Format (ELF) was led by the Tool Interface Standard (TIS) committee, a consortium of industry leaders formed in 1993 to define portable formats for Unix-like systems. The TIS adopted ELF, originally developed for System V Release 4, as the standard object file format and published version 1.1 of the Portable Formats Specification in October 1993, extracting and refining ELF details from the System V Application Binary Interface.[7] This effort culminated in the TIS ELF Specification version 1.2 in May 1995, which incorporated minor fixes, clarifications, and extensions for broader portability across 32-bit architectures.[1]

Subsequent evolution included the Generic Application Binary Interface (gABI), released in March 1997 as part of the System V ABI edition 4.1, which defined processor-independent ELF conventions to promote cross-distribution compatibility in Linux and other environments.[8] The gABI built on TIS foundations by specifying common ELF usage, such as dynamic linking and symbol resolution, while allowing processor supplements for specific architectures. System V ABI extensions further refined ELF for operating system interfaces, including process initialization and function calling sequences.

Support for 64-bit architectures emerged in the mid-1990s, with ELFCLASS64 defined to accommodate larger address spaces; an initial ELF-64 specification was developed for the Alpha processor around 1995, enabling 64-bit object files on Digital Unix. This was extended to x86-64 in the early 2000s through processor-specific ABIs, maintaining backward compatibility with 32-bit ELF while supporting extended data types and relocations.

In the 2000s, ELF evolved with security enhancements, notably the introduction of RELRO (Relocation Read-Only) as a linker option in GNU ld in the mid-2000s, which marks relocation sections as read-only after processing to prevent runtime tampering. Partial and full RELRO modes balanced performance and protection, becoming standard in distributions for mitigating exploits targeting global offset tables.

Adaptations for other architectures accompanied this evolution: formal ARM ELF specifications were published in 1999 to support embedded and mobile processors.[9] For RISC-V, ELF support was integrated into toolchains starting in the mid-2010s, with the processor-specific ABI specification finalized in 2021 to enable open-source implementations across microcontrollers and servers. In 2025, Xinuos published version 4.2 of the ELF specification and released a draft of version 4.3 for public review, formalizing updates such as separating the ELF spec from the gABI.[10]

File Format Specifications

Overall Structure and Layout

The Executable and Linkable Format (ELF) organizes files in a hierarchical structure that begins with a fixed-size ELF header containing metadata about the file's layout and type. The header is typically followed by the program header table (zero or more entries describing loadable segments for runtime execution) and then by the actual content of the segments and sections. A section header table, usually placed at the end of the file and located through the e_shoff field, catalogs all sections, such as code, data, and debugging information, enabling link-time processing.[1]

ELF supports distinct file types tailored to different stages of software development and use. Executable files rely on program headers to define loadable segments that the operating system maps directly into memory for execution. Relocatable object files emphasize sections as the primary units, facilitating combination with other objects during linking to produce executables or shared libraries. Shared libraries incorporate dynamic linking mechanisms, including dedicated sections for symbol tables and relocation entries to resolve references at load time or runtime. Core dump files preserve process state, encompassing memory segments, thread information, and register values for post-mortem analysis.[1]

ELF files vary by class and data encoding to accommodate diverse hardware. The 32-bit class uses 32-bit addresses and types suitable for traditional systems, while the 64-bit class employs 64-bit addressing for larger memory spaces and modern architectures. Data encoding supports little-endian byte order for processors like x86 or big-endian for others like some PowerPC variants. Identification begins with the magic bytes 0x7F 'E' 'L' 'F' in the file's initial bytes, distinguishing ELF from other formats. Common file types include ET_EXEC for standalone executables, ET_DYN for position-independent code like shared libraries, and ET_REL for relocatable objects awaiting linking.[1]

Program headers and section headers provide complementary perspectives on the file: program headers offer a runtime-oriented view by grouping sections into coarse-grained segments optimized for efficient loading by the operating system loader and the dynamic linker, whereas section headers deliver a fine-grained, link-time view that permits the static linker to manipulate individual sections for tasks like relocation and symbol merging. This separation enhances modularity, allowing tools to operate on either view as needed without redundancy.[1]

ELF Header Details

An ELF file begins with a fixed-size header that provides essential metadata for interpreting the file, ensuring compatibility across different systems and architectures. This header is located at offset zero and contains an array of identification bytes followed by core structural fields, allowing parsers to validate the file format and determine its class (32-bit or 64-bit), encoding, and other attributes before processing the rest of the file. The header's design promotes portability by standardizing field positions and sizes, with variations only for bit width to accommodate different processor architectures.

The header's size is 52 bytes for 32-bit ELF files and 64 bytes for 64-bit files, reflecting the use of 4-byte or 8-byte addressing for certain fields. The initial 16 bytes form the e_ident array, which holds the file's magic number and configuration descriptor. Specifically, bytes 0-3 (EI_MAG0 through EI_MAG3) must contain the values 0x7F, 'E', 'L', 'F' to identify an ELF file; byte 4 (EI_CLASS) specifies the file class as 1 for 32-bit or 2 for 64-bit; byte 5 (EI_DATA) indicates data encoding as 1 for little-endian or 2 for big-endian; byte 6 (EI_VERSION) is always set to 1 for the current ELF version; byte 7 (EI_OSABI) denotes the operating system/ABI target, such as 0 for System V or 3 for Linux; and byte 8 (EI_ABIVERSION) provides the ABI version number, with bytes 9-15 reserved for padding (EI_PAD) initialized to zero. This array enables immediate format validation, as any mismatch (e.g., incorrect magic bytes) signals an invalid ELF file, preventing erroneous parsing.[11]

Following e_ident, the header includes several core fields that describe the file's type, target machine, and layout pointers, all encoded in the byte order specified by EI_DATA. The e_type field (2 bytes) classifies the file as ET_NONE (0, no file type), ET_REL (1, relocatable), ET_EXEC (2, executable), ET_DYN (3, shared object), or ET_CORE (4, core dump). The e_machine field (2 bytes) identifies the target architecture, such as EM_386 (3) for Intel 80386 or EM_X86_64 (62) for AMD x86-64. The e_version field (4 bytes) is fixed at 1, matching EI_VERSION for consistency. The e_entry field (4 or 8 bytes, depending on class) holds the virtual address of the program's entry point. Layout offsets are provided by e_phoff (4 or 8 bytes) for the program header table position and e_shoff (4 or 8 bytes) for the section header table position, both relative to the file start. The e_flags field (4 bytes) carries processor-specific flags whose interpretation depends on the target architecture. Header metadata includes e_ehsize (2 bytes) indicating the header's own size (52 or 64); e_phentsize (2 bytes) and e_phnum (2 bytes) for program header entry size (typically 32 or 56 bytes) and count; and e_shentsize (2 bytes), e_shnum (2 bytes), and e_shstrndx (2 bytes) detailing section header entry size (40 or 64 bytes), total count, and index of the string table section for section names. These fields collectively guide the loader or linker in navigating the file without prior knowledge of its internal structure.[11]

To maintain alignment and portability, ELF headers adhere to strict padding and byte-order rules: the e_ident padding bytes are always zero, and all multi-byte fields (e.g., addresses and offsets) are stored in the endianness specified by EI_DATA, with natural alignment for 32-bit (4-byte) and 64-bit (8-byte) variants to avoid unaligned access issues on target architectures. This structure facilitates robust parsing by allowing tools to first verify the header's integrity (through magic checks, version consistency, and size validations) before advancing to variable components, thereby minimizing errors in cross-platform or multi-architecture environments. For instance, a mismatch in e_ehsize would indicate a corrupted or non-standard file, prompting immediate rejection.[11]
| Field | Offset (32-bit) | Size (32-bit) | Type | Description |
| --- | --- | --- | --- | --- |
| e_ident | 0 | 16 bytes | Array | Identification bytes for magic, class, data, version, OS/ABI, ABI version, and padding. |
| e_type | 16 | 2 bytes | Elf32_Half | File type (e.g., executable, shared object). |
| e_machine | 18 | 2 bytes | Elf32_Half | Target architecture (e.g., EM_386). |
| e_version | 20 | 4 bytes | Elf32_Word | Object file version (always 1). |
| e_entry | 24 | 4 bytes | Elf32_Addr | Entry point virtual address. |
| e_phoff | 28 | 4 bytes | Elf32_Off | Program header table offset. |
| e_shoff | 32 | 4 bytes | Elf32_Off | Section header table offset. |
| e_flags | 36 | 4 bytes | Elf32_Word | Processor-specific flags. |
| e_ehsize | 40 | 2 bytes | Elf32_Half | ELF header size in bytes. |
| e_phentsize | 42 | 2 bytes | Elf32_Half | Program header entry size. |
| e_phnum | 44 | 2 bytes | Elf32_Half | Number of program header entries. |
| e_shentsize | 46 | 2 bytes | Elf32_Half | Section header entry size. |
| e_shnum | 48 | 2 bytes | Elf32_Half | Number of section header entries. |
| e_shstrndx | 50 | 2 bytes | Elf32_Half | Index of section name string table. |
For 64-bit ELF files, offsets after e_ident shift accordingly, with e_entry, e_phoff, and e_shoff expanding to 8 bytes each, resulting in the total 64-byte size; the table above illustrates the 32-bit layout, but field purposes remain identical.[11]
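The header layout maps directly onto fixed-width binary unpacking. The following is a minimal sketch in Python rather than a reference implementation: the function name parse_elf_header and the returned dictionary are illustrative choices, and only the fields listed in the table are decoded.

```python
import struct

ELF_MAGIC = b"\x7fELF"

# Field names in file order, starting right after the 16-byte e_ident array.
FIELD_NAMES = ("e_type", "e_machine", "e_version", "e_entry", "e_phoff",
               "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
               "e_shentsize", "e_shnum", "e_shstrndx")


def parse_elf_header(data: bytes) -> dict:
    """Decode the ELF header from the first bytes of a file (illustrative helper)."""
    if len(data) < 16 or data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    ei_class, ei_data = data[4], data[5]
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("unsupported EI_CLASS or EI_DATA")

    order = "<" if ei_data == 1 else ">"   # EI_DATA selects the byte order
    wide = "I" if ei_class == 1 else "Q"   # e_entry, e_phoff, e_shoff: 4 or 8 bytes
    layout = order + "HHI" + wide * 3 + "IHHHHHH"
    size = 16 + struct.calcsize(layout)    # 52 for 32-bit, 64 for 64-bit
    if len(data) < size:
        raise ValueError("truncated ELF header")

    header = dict(zip(FIELD_NAMES, struct.unpack_from(layout, data, 16)))
    header.update(ei_class=ei_class, ei_data=ei_data, ei_osabi=data[7])
    if header["e_ehsize"] != size:
        raise ValueError("e_ehsize does not match the file class")
    return header
```

Reading the first 64 bytes of a file in binary mode and passing them to this function is enough for either class, since the 32-bit header needs only 52 of them.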

Program Headers and Segments

The program header table in an ELF file is an array of program header entries that describe the layout of loadable segments for creating a process image during execution. This table is optional but required for executable and shared object files; relocatable object files typically omit it. The ELF header references the table using the e_phoff field for its file offset and e_phnum for the number of entries. Each entry is a fixed-size structure: 32 bytes for 32-bit ELF files (Elf32_Phdr) and 56 bytes for 64-bit files (Elf64_Phdr). The table enables the operating system loader to map segments directly into memory without relying on section-level details, facilitating efficient runtime loading.[1] Each program header entry contains fields that specify the segment's type, location, size, and attributes. The structure for a 32-bit ELF (Elf32_Phdr) is defined as follows:
| Field | Type | Description |
| --- | --- | --- |
| p_type | Elf32_Word | Specifies the segment type, indicating how the entry should be interpreted (e.g., loadable segment or auxiliary information).[1] |
| p_offset | Elf32_Off | File offset where the segment begins, in bytes from the start of the file. For loadable segments, it must be congruent to p_vaddr modulo p_align.[1] |
| p_vaddr | Elf32_Addr | Virtual address where the segment should be loaded in memory. Loadable segments start at this address after mapping.[1] |
| p_paddr | Elf32_Addr | Physical address for the segment, typically used in embedded systems or kernels; ignored by most user-space loaders.[1] |
| p_filesz | Elf32_Word | Size of the segment in the file, in bytes; for loadable segments, this is the portion copied from the file to memory.[1] |
| p_memsz | Elf32_Word | Size of the segment in memory, in bytes; may exceed p_filesz for segments requiring zero-initialization (e.g., BSS-like areas).[1] |
| p_flags | Elf32_Word | Access permissions: PF_X (execute, bit 0, value 0x1), PF_W (write, bit 1, value 0x2), PF_R (read, bit 2, value 0x4). These map to memory protection settings like read-only or executable.[1] |
| p_align | Elf32_Word | Alignment constraint: values of 0 and 1 mean no alignment is required; otherwise the value is a power of two and p_vaddr must equal p_offset modulo p_align (e.g., 0x1000 for page alignment).[1] |
The 64-bit variant (Elf64_Phdr) uses wider types (e.g., Elf64_Off, and Elf64_Xword for sizes) and keeps the same semantics, but it places p_flags immediately after p_type so that the 8-byte fields remain naturally aligned.[12]

The p_type field defines the segment's purpose, with standard values outlined in the ELF specification. Common types include:

  • PT_NULL (0): unused entry
  • PT_LOAD (1): loadable segment for code or data
  • PT_DYNAMIC (2): dynamic linking information, such as the locations of the dynamic symbol and relocation tables
  • PT_INTERP (3): path to the program interpreter, e.g., "/lib/ld-linux.so.2"
  • PT_NOTE (4): auxiliary notes such as build IDs or core dump metadata
  • PT_TLS (7): thread-local storage template
  • PT_GNU_STACK (0x6474e551): GNU extension whose p_flags state whether the stack must be executable, used for security hardening

Processor-specific extensions may also appear depending on the platform. Loadable segments (PT_LOAD) are the core of execution, typically dividing the image into read-execute (text) and read-write (data) portions.[1][12]

When a program is loaded, the program header table guides the loader (the kernel and, for dynamically linked programs, a dynamic linker such as ld.so on Linux) in constructing the process address space. The loader reads the table, maps PT_LOAD segments into virtual memory at their p_vaddr with the protections given by p_flags, zero-fills the extra memory where p_memsz exceeds p_filesz, and processes auxiliary segments such as PT_DYNAMIC for relocation and symbol resolution or PT_INTERP to locate the dynamic linker itself. This segment-based approach allows efficient loading without parsing finer-grained sections, and it supports position-independent code in shared libraries. For instance, the initial process image combines segments from the executable and the interpreter, with relocations applied after mapping.[1]
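As an illustration of how the table is walked, the sketch below decodes program header entries for either class. It is a hedged example, not the method of any particular loader: the helper names parse_program_headers and flags_to_string are hypothetical, and the caller is assumed to have already read e_phoff, e_phentsize, and e_phnum from the ELF header. Note that Elf64_Phdr stores p_flags directly after p_type, whereas Elf32_Phdr stores it after p_memsz.

```python
import struct

PT_NAMES = {0: "PT_NULL", 1: "PT_LOAD", 2: "PT_DYNAMIC", 3: "PT_INTERP",
            4: "PT_NOTE", 7: "PT_TLS", 0x6474E551: "PT_GNU_STACK"}
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4

# Field order differs between the two classes.
PHDR32 = ("p_type", "p_offset", "p_vaddr", "p_paddr",
          "p_filesz", "p_memsz", "p_flags", "p_align")
PHDR64 = ("p_type", "p_flags", "p_offset", "p_vaddr",
          "p_paddr", "p_filesz", "p_memsz", "p_align")


def parse_program_headers(data, e_phoff, e_phentsize, e_phnum,
                          is_64bit, little_endian=True):
    """Return one dict per program header entry (illustrative helper)."""
    order = "<" if little_endian else ">"
    layout = order + ("IIQQQQQQ" if is_64bit else "IIIIIIII")  # 56 or 32 bytes
    names = PHDR64 if is_64bit else PHDR32
    entries = []
    for index in range(e_phnum):
        # Step by e_phentsize, not by the struct size, as the header dictates.
        fields = struct.unpack_from(layout, data, e_phoff + index * e_phentsize)
        entries.append(dict(zip(names, fields)))
    return entries


def flags_to_string(p_flags):
    """Render p_flags in the familiar 'rwx' style."""
    return "".join(ch if p_flags & bit else "-"
                   for ch, bit in (("r", PF_R), ("w", PF_W), ("x", PF_X)))
```

A typical text segment with p_flags equal to 5 renders as "r-x", and a data segment with p_flags equal to 6 renders as "rw-".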

Section Headers and Contents

The section header table in an ELF file is an array of section header entries that describe the layout and attributes of each section within the object file, enabling tools like linkers and debuggers to interpret the file's contents.[13] Each entry is a fixed-size structure—40 bytes for 32-bit ELF (Elf32_Shdr) and 64 bytes for 64-bit ELF (Elf64_Shdr)—and the ELF header provides pointers to this table via the e_shoff (offset to the table), e_shnum (number of entries), and e_shentsize (size of each entry) fields.[11] The table typically appears at the end of the file, and section names are stored as indices into a dedicated string table section, often named .shstrtab.[13] The Elf32_Shdr structure consists of the following fields:
| Field | Type | Size (bytes) | Description |
| --- | --- | --- | --- |
| sh_name | Elf32_Word | 4 | An index into the section header string table section (.shstrtab), giving the name of this section as a null-terminated string.[11] |
| sh_type | Elf32_Word | 4 | A value specifying the type of section, such as program data or symbol table (see section types below).[13] |
| sh_flags | Elf32_Word | 4 | Section flags: a bitmask indicating attributes such as whether the section is writable (SHF_WRITE), occupies memory at run time (SHF_ALLOC), or contains executable instructions (SHF_EXECINSTR). The bits covered by SHF_MASKOS and SHF_MASKPROC are reserved for OS-specific and processor-specific use.[11] |
| sh_addr | Elf32_Addr | 4 | The virtual address at which the section should reside in memory, if applicable (0 if not relevant).[13] |
| sh_offset | Elf32_Off | 4 | The offset in bytes from the beginning of the file to the first byte of the section.[11] |
| sh_size | Elf32_Word | 4 | The size of the section in bytes. A SHT_NOBITS section such as .bss may have a non-zero size even though it occupies no space in the file.[13] |
| sh_link | Elf32_Word | 4 | An index into the section header table for a related section, such as the associated string table for symbol tables (interpretation depends on sh_type).[11] |
| sh_info | Elf32_Word | 4 | Extra information, often an index into another section or table, with meaning varying by sh_type (e.g., target section for relocations).[13] |
| sh_addralign | Elf32_Word | 4 | The alignment requirement for the section in memory, expressed as a power of 2 (0 or 1 means no alignment).[11] |
| sh_entsize | Elf32_Word | 4 | The size in bytes of each entry if the section holds a table of fixed-size entries (e.g., symbols); 0 otherwise.[13] |
The Elf64_Shdr structure mirrors this layout but uses 64-bit types where appropriate (e.g., Elf64_Xword for sh_size and sh_flags, Elf64_Addr for sh_addr, Elf64_Off for sh_offset), resulting in the larger size.[11]

Section types, defined by the sh_type field, categorize the purpose and contents of each section. The standard values are:

  • SHT_NULL (0): an inactive entry with undefined values
  • SHT_PROGBITS (1): program-defined data such as code or constants
  • SHT_SYMTAB (2): a full symbol table for linking
  • SHT_STRTAB (3): a table of null-terminated strings
  • SHT_RELA (4): relocation entries with explicit addends
  • SHT_HASH (5): a hash table for symbol lookups
  • SHT_DYNAMIC (6): dynamic linking information
  • SHT_NOTE (7): vendor-specific notes
  • SHT_NOBITS (8): data that occupies memory but no file space, like uninitialized variables
  • SHT_REL (9): relocation entries without addends
  • SHT_SHLIB (10): reserved
  • SHT_DYNSYM (11): a minimal dynamic symbol table

Processor-specific or OS-specific types may extend this range.[13][11]

Common sections in ELF object files include:

  • .text (SHT_PROGBITS, SHF_ALLOC | SHF_EXECINSTR): executable machine code
  • .data (SHT_PROGBITS, SHF_ALLOC | SHF_WRITE): initialized global or static variables
  • .bss (SHT_NOBITS, SHF_ALLOC | SHF_WRITE): uninitialized data that is zeroed at run time, which conserves file space
  • .rodata (SHT_PROGBITS, SHF_ALLOC): read-only constants like strings
  • .symtab (SHT_SYMTAB): a table of symbols, with sh_entsize typically 16 bytes on 32-bit systems, linked to .strtab via sh_link
  • .strtab (SHT_STRTAB): strings for symbol names and other identifiers
  • .rela sections (SHT_RELA): relocation entries with explicit addends, named after the section they apply to, such as .rela.text
  • .rel sections (SHT_REL): the same, but without addends
  • .dynamic (SHT_DYNAMIC): entries for the dynamic linker, such as shared library dependencies
  • .shstrtab (SHT_STRTAB): the section names referenced by sh_name indices[13][11]

Special sections include .interp (SHT_PROGBITS, containing the null-terminated path to the program interpreter, such as /lib/ld-linux.so.2 for dynamic executables) and .note (SHT_NOTE, holding auxiliary information like build IDs for debugging or GNU notes for ABI identification).[14]

During linking, the linker merges compatible sections based on type and flags (for instance, combining .text and .rodata into a single read-only loadable segment, or .data and .bss into a writable segment) while resolving relocations and symbols to produce the final executable's program headers.[13] This separation allows flexible static analysis and linking without affecting runtime loading.[11]
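The relationship between sh_name and the section name string table can be sketched as follows. This is an illustrative example only: it assumes a little-endian 32-bit file, the function name read_sections32 is hypothetical, and the caller is assumed to supply e_shoff, e_shnum, and e_shstrndx from the ELF header.

```python
import struct

SHF_WRITE, SHF_ALLOC, SHF_EXECINSTR = 0x1, 0x2, 0x4

# Elf32_Shdr is ten 4-byte fields (40 bytes); little-endian is assumed here.
SHDR32 = struct.Struct("<10I")
FIELDS = ("sh_name", "sh_type", "sh_flags", "sh_addr", "sh_offset",
          "sh_size", "sh_link", "sh_info", "sh_addralign", "sh_entsize")


def read_sections32(data, e_shoff, e_shnum, e_shstrndx):
    """Decode every section header and attach its name (illustrative helper)."""
    headers = [
        dict(zip(FIELDS, SHDR32.unpack_from(data, e_shoff + i * SHDR32.size)))
        for i in range(e_shnum)
    ]
    # The section at index e_shstrndx holds the names of all sections.
    strtab = headers[e_shstrndx]
    start = strtab["sh_offset"]
    names = data[start:start + strtab["sh_size"]]
    for header in headers:
        # sh_name is a byte offset into the string table, not a string index.
        end = names.index(b"\0", header["sh_name"])
        header["name"] = names[header["sh_name"]:end].decode("ascii")
    return headers
```

A section is executable code when its sh_flags has SHF_EXECINSTR set, which is how a tool might single out .text without relying on the name.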

Example Hexdump and Parsing

To illustrate the practical structure of an ELF file, consider a minimal 32-bit executable for the x86 architecture that prints "Hello world" and exits, built for Linux systems. Such files begin with the ELF identification bytes, followed by the rest of the ELF header and the program headers, as defined in the official ELF specification. This example is 116 bytes in total, with no section headers (e_shoff=0) and a single program header.[1][15]

The following hexdump shows the full content of this minimal 32-bit ELF executable. The magic bytes (0x7F 'E' 'L' 'F') are followed by the bytes that give the file class (32-bit), data encoding (little-endian), and version.[1]
00000000  7f 45 4c 46 01 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
00000010  02 00 03 00 01 00 00 00  54 80 04 08 34 00 00 00  |........T...4...|
00000020  00 00 00 00 00 00 00 00  34 00 20 00 01 00 00 00  |........4. .....|
00000030  00 00 00 00 01 00 00 00  00 00 00 00 00 80 04 08  |................|
00000040  00 80 04 08 74 00 00 00  74 00 00 00 05 00 00 00  |....t...t.......|
00000050  00 10 00 00 b0 04 31 db  43 b9 69 80 04 08 31 d2  |......1.C.i...1.|
00000060  b2 0b cd 80 31 c0 40 cd  80 48 65 6c 6c 6f 20 77  |....1.@..Hello w|
00000070  6f 72 6c 64                                       |orld             |
In this hexdump, bytes 0x00-0x0F form the e_ident array: 7F 45 4C 46 (magic number), 0x01 (EI_CLASS for 32-bit), 0x01 (EI_DATA for little-endian), 0x01 (EI_VERSION), 0x00 (EI_OSABI for System V), and padding zeros. Bytes 0x10-0x11 hold e_type = 0x0002 (ET_EXEC for executable), and 0x12-0x13 hold e_machine = 0x0003 (EM_386 for Intel 80386). The e_version at 0x14-0x17 is 0x00000001. The e_entry at 0x18-0x1B is 0x08048054 (entry point virtual address). e_phoff at 0x1C-0x1F is 0x00000034 (program header offset at byte 52). e_shoff at 0x20-0x23 is 0x00000000 (no section headers). e_flags at 0x24-0x27 is 0x00000000. e_ehsize at 0x28-0x29 is 0x0034 (52 bytes). e_phentsize at 0x2A-0x2B is 0x0020 (32 bytes per program header). e_phnum at 0x2C-0x2D is 0x0001 (one entry). e_shentsize at 0x2E-0x2F, e_shnum at 0x30-0x31, and e_shstrndx at 0x32-0x33 are all 0x0000 (no sections).[1]

Parsing proceeds sequentially from the file start. First, validate the magic bytes at offset 0 to confirm the ELF format and read the rest of e_ident: the class (32-bit vs. 64-bit determines header size and field widths), the data encoding (little-endian fields must be byte-swapped on big-endian hosts), and the OS/ABI for compatibility. Next, read e_type to confirm that the file is an executable (ET_EXEC = 2); e_machine specifies the target ISA (3 for x86). The e_entry field gives the virtual address to jump to after loading (0x08048054). Since e_shoff=0, there are no sections to parse.

Program headers begin at e_phoff=0x34 (byte 52). The single PT_LOAD entry (p_type=1 at bytes 0x34-0x37) describes a loadable segment with p_offset=0x00000000 (file offset), p_vaddr=0x08048000 (virtual address, page-aligned), p_paddr=0x08048000, p_filesz=0x00000074 (116 bytes from the file), p_memsz=0x00000074 (116 bytes in memory), p_flags=0x00000005 (read and execute), and p_align=0x00001000 (4 KB page).

This segment encompasses the entire file, including the code starting at file offset 0x54 (virtual address 0x08048054). The code performs sys_write (eax=4, ebx=1 for stdout, ecx=0x08048069 for the string buffer, edx=11 for the length) via int 0x80, then sys_exit (eax=1) via int 0x80, printing "Hello world" before terminating. The string resides at file offset 0x69 (virtual address 0x08048069).[1][15]

Common pitfalls in parsing include ignoring the byte order given by EI_DATA, which swaps multi-byte fields (e.g., reading the little-endian e_type bytes 02 00 as 0x0200 instead of 2 on a big-endian host), and assuming the 64-bit structure for a 32-bit file, which changes field sizes and offsets (64-bit headers are 64 bytes long and use 8-byte address and offset fields). A parser that reads structures of the wrong width misplaces every later field and may read past the end of its buffer. In this minimal example, the absence of sections simplifies the file but limits debuggability; the self-contained loadable segment demonstrates that execution needs only the program headers, not the link-time sections.[1]
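The walkthrough can be checked mechanically. The sketch below rebuilds the 116-byte image from the hexdump and decodes the same fields with Python's struct module; the variable names are illustrative, and the string address 0x08048069 is taken from the code bytes described in the text.

```python
import struct

# Hex bytes copied from the dump (offsets and ASCII column removed).
HEX = """
7f 45 4c 46 01 01 01 00  00 00 00 00 00 00 00 00
02 00 03 00 01 00 00 00  54 80 04 08 34 00 00 00
00 00 00 00 00 00 00 00  34 00 20 00 01 00 00 00
00 00 00 00 01 00 00 00  00 00 00 00 00 80 04 08
00 80 04 08 74 00 00 00  74 00 00 00 05 00 00 00
00 10 00 00 b0 04 31 db  43 b9 69 80 04 08 31 d2
b2 0b cd 80 31 c0 40 cd  80 48 65 6c 6c 6f 20 77
6f 72 6c 64
"""
image = bytes.fromhex("".join(HEX.split()))

# ELF header fields after e_ident (32-bit, little-endian per bytes 4 and 5).
(e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
 e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
 e_shstrndx) = struct.unpack_from("<HHIIIIIHHHHHH", image, 16)

# The single Elf32_Phdr entry at e_phoff.
(p_type, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz,
 p_flags, p_align) = struct.unpack_from("<8I", image, e_phoff)

# Translate virtual addresses back to file offsets within the segment.
entry_offset = e_entry - p_vaddr + p_offset
message_offset = 0x08048069 - p_vaddr + p_offset
message = image[message_offset:message_offset + 11]

print(f"size={len(image)} e_entry={e_entry:#010x} entry_offset={entry_offset:#x}")
print(f"p_filesz={p_filesz} p_flags={p_flags} message={message!r}")
```

The translation from virtual address to file offset works here because the one segment maps the whole file; with several segments, the entry containing the address would have to be found first.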

Tools and Utilities

Core Manipulation Tools

The GNU Binutils suite comprises a set of command-line utilities for creating, modifying, and inspecting Executable and Linkable Format (ELF) files in software development workflows. Developed and maintained by the GNU Project, these tools handle linking, section copying, disassembly, and archiving of ELF object files, executables, and libraries.[16]

The GNU linker, ld, is the primary tool for combining multiple ELF object files (.o) and libraries into a single executable program or shared library. It resolves symbols, applies relocations, and generates the final ELF binary by processing input sections and producing program headers as defined in the ELF specification. For instance, the command ld -o output.elf input1.o input2.o -lc links two object files with the standard C library to produce an executable. To create a shared library, the -shared option is used, as in ld -shared -o libexample.so input.o, which produces an ELF shared object suitable for dynamic loading.

objcopy copies and transforms ELF files, allowing developers to manipulate sections, for example by removing debugging symbols or converting formats. It can extract specific sections, adjust headers, or strip unnecessary data to reduce file size. A common usage is stripping all symbols with objcopy --strip-all input.elf output.elf, which removes symbol tables and debug information while preserving the executable's functionality, aiding in production builds. Other options include --only-section=.text to copy just the code section.

For disassembly and inspection, objdump disassembles ELF sections to show machine code as assembler mnemonics, alongside headers and symbol tables. It is particularly useful for verifying the output of compilation and linking steps. Key flags include -d to disassemble executable sections, -h to display section headers, and -t for the symbol table. For example, objdump -d input.elf outputs the disassembled instructions from code sections, and objdump -S input.elf additionally intermixes source code when debugging information is available.[17]

The readelf utility provides detailed parsing of ELF file structures, displaying headers, program headers, section headers, symbols, and relocations without disassembly. It offers flags for specific components, such as -h for the ELF header, -S for section headers, -l for program headers, and -s for the symbol table. For example, readelf -h input.elf prints the main ELF header fields like magic number and architecture, while readelf -S input.elf lists all sections with their sizes and attributes. This makes it well suited to verifying format compliance during development.

Finally, ar is an archiver for creating static libraries in .a format, which bundle multiple ELF object files for use in linking. It maintains file metadata like timestamps and permissions within the archive. A common operation is creating an archive with ar rcs libexample.a obj1.o obj2.o, where r inserts the files (replacing existing members), c creates the archive without a warning if it does not yet exist, and s writes a symbol index for efficient linking. The --record-libdeps option can record inter-library dependencies. Thin archives, supported in modern versions, reference external ELF files instead of embedding them, which speeds up builds.[18]

Analysis and Debugging Tools

The GNU Debugger (GDB) is a foundational tool for runtime debugging of ELF executables on Unix-like systems, enabling developers to inspect program execution, set breakpoints within code in ELF sections such as .text, watch data in sections such as .data, and load ELF-formatted core dump files to analyze crash states. GDB examines dynamic symbols from the ELF's .dynsym section and resolves symbols through debugging information embedded in ELF files, supporting step-by-step execution tracing and variable inspection at run time.[19] This integration with ELF structures allows precise control over loaded segments and addresses, making it essential for diagnosing issues in dynamically linked binaries.[20]

Strace is a diagnostic utility that traces the system calls and signals of ELF executables running on Linux, providing visibility into kernel interactions without requiring source code modifications.[21] By attaching to a process via the ptrace mechanism, strace logs calls such as open, read, and mmap related to ELF loading and execution, helping identify I/O bottlenecks or permission errors in runtime behavior.[22] It supports filtering by syscall type or file path, such as paths of ELF shared libraries, to focus analysis on specific aspects of program flow.[21]

Complementing strace, ltrace traces dynamic library calls and signals in ELF binaries, intercepting calls to functions in shared objects like libc.so during execution.[23] It records entry and return points for library APIs, revealing how ELF programs interact with dynamically loaded code and aiding in the diagnosis of linkage or API misuse issues.[24] Ltrace also captures associated system calls when invoked with appropriate flags, offering a layered view of runtime dependencies beyond kernel boundaries.[25]

The ldd command, part of the GNU C Library (glibc), lists the shared library dependencies of an ELF executable or library by running the dynamic linker in a tracing mode instead of executing the program.[26] Its output shows the required .so files, the paths and addresses at which they would be loaded, and any libraries that cannot be found, which is useful for verifying linkage before deployment. For example, running ldd on a binary reveals paths to libraries like libm.so.6, highlighting potential portability issues across systems.[27]

Security auditing tools like checksec evaluate ELF binaries for protective features, checking attributes such as RELRO (Relocation Read-Only) to prevent GOT overwrites, NX (No eXecute) to block code execution in data areas such as the stack, and PIE (Position Independent Executable) for ASLR compatibility.[28] By inspecting ELF headers and sections, checksec also reports on stack canaries for stack smashing protection and on FORTIFY_SOURCE, which adds buffer-size checks to selected C library calls, enabling quick assessments of binary hardening against exploits.[29] These audits are performed statically on ELF files, providing a security posture overview without execution.[30]

The ELFkickers suite is a collection of small utilities for examining and manipulating ELF files, such as elfls, which lists a file's program and section header entries, and sstrip, which removes everything that is not part of the loadable image.[31] By exposing the raw header layout, these tools help reveal anomalies such as unusual permissions or alignments that may indicate tampering.[32] The set is particularly useful for reverse engineering and quality assurance, focusing on static properties rather than runtime behavior.[33]

Valgrind is a suite of dynamic analysis tools. Its best-known tool, Memcheck, detects memory errors in running ELF processes by shadowing allocations and tracking accesses in loaded segments.[34] It instruments the execution of ELF binaries to identify leaks, invalid reads and writes, and use-after-free bugs, using DWARF debugging information from the ELF file to attribute issues to source code lines.[35] Valgrind's compatibility with ELF on Linux allows comprehensive profiling of heap and stack usage in dynamically linked applications, often revealing subtle defects missed by static checks.[36]
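A few of the static checks described for checksec can be approximated from the program headers alone. The sketch below is a simplified illustration and not the actual checksec implementation: the function name hardening_summary and its result keys are hypothetical, and the constant PT_GNU_RELRO (0x6474e552) is a GNU extension not otherwise discussed in this article. Real tools also consult the dynamic section and symbol tables, for example to distinguish partial from full RELRO or to detect stack canaries.

```python
import struct

PT_GNU_STACK = 0x6474E551
PT_GNU_RELRO = 0x6474E552   # GNU extension: region made read-only after relocation
ET_DYN = 3
PF_X = 0x1


def hardening_summary(data: bytes) -> dict:
    """Report a few hardening indicators from raw ELF bytes (illustrative helper)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = data[4] == 2
    order = "<" if data[5] == 1 else ">"

    e_type = struct.unpack_from(order + "H", data, 16)[0]
    if is64:
        e_phoff = struct.unpack_from(order + "Q", data, 32)[0]
        e_phentsize, e_phnum = struct.unpack_from(order + "HH", data, 54)
    else:
        e_phoff = struct.unpack_from(order + "I", data, 28)[0]
        e_phentsize, e_phnum = struct.unpack_from(order + "HH", data, 42)

    # Record p_flags per segment type; p_flags sits at offset 4 in Elf64_Phdr
    # and at offset 24 in Elf32_Phdr.
    flags_by_type = {}
    for index in range(e_phnum):
        base = e_phoff + index * e_phentsize
        p_type = struct.unpack_from(order + "I", data, base)[0]
        p_flags = struct.unpack_from(order + "I", data, base + (4 if is64 else 24))[0]
        flags_by_type[p_type] = p_flags

    stack_flags = flags_by_type.get(PT_GNU_STACK)
    return {
        # ET_DYN covers both PIE executables and ordinary shared libraries.
        "pie_candidate": e_type == ET_DYN,
        # Only an explicit non-executable PT_GNU_STACK counts as NX here.
        "nx_stack": stack_flags is not None and not stack_flags & PF_X,
        "relro": PT_GNU_RELRO in flags_by_type,
    }
```

Treating a missing PT_GNU_STACK entry as "not NX" is a deliberately conservative choice for this sketch; the actual default depends on the platform.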

Usage and Adoption

Primary Use in Unix-like Systems

The Executable and Linkable Format (ELF) is the predominant binary file format for executables, shared libraries, and core dumps in Unix-like systems, enabling efficient loading and execution by kernels and dynamic linkers. In Linux, ELF has been the default format since kernel version 1.0 released in 1994, with initial support introduced in development kernel 0.99.13 in 1993. The Linux kernel integrates ELF loading through the binfmt_elf module, which registers the format with the execve system call to interpret ELF binaries during process creation. Upon execution, the kernel validates the ELF header's magic bytes and structure, then uses the program header table to map loadable segments into virtual memory via mmap, before passing control to the dynamic loader, typically /lib/ld-linux.so.2 for 32-bit systems or /lib64/ld-linux-x86-64.so.2 for 64-bit.[37][38] In BSD variants, ELF adoption occurred in the mid-1990s as a replacement for the older a.out format to support advanced features like dynamic linking and shared libraries. FreeBSD introduced ELF header files in version 2.2.6 (1996), marking the beginning of its transition, with full native support solidified by version 3.0 in 1998, including FreeBSD-specific extensions such as brand notes in ELF headers to denote compatibility features like ABI versioning. NetBSD transitioned to ELF as its primary format for i386 and sparc ports starting with release 1.5 in 2000, maintaining backward compatibility for a.out binaries through tools like elf2aout for bootloaders and debugging utilities. OpenBSD added initial ELF support in version 1.2 (1996) and made it the native format across all platforms from version 5.4 (2013) onward. 
These implementations leverage ELF's program headers for kernel-level segment mapping, ensuring portable execution across architectures.[39][40][41] Solaris, originating from System V Release 4 (SVR4) in 1988, was one of the first Unix systems to adopt ELF as its standard format with Solaris 2.0 in 1992, inheriting SVR4's design for object files and executables. The illumos project, an open-source continuation of OpenSolaris since 2010, retains this SVR4-derived ELF support, including unique extensions like the .SUNW_cap section for specifying software and hardware capabilities such as required CPU instructions or platform features to guide linking and loading decisions. In both systems, the kernel performs ELF header validation to confirm file integrity and architecture compatibility, maps program segments into process address space, and defers relocation resolution—such as adjusting addresses for position-independent code—to the runtime linker, ld.so.1, which processes dynamic relocation entries from .rel or .rela sections.[42][43] While macOS primarily employs the Mach-O format for native binaries, it supports ELF indirectly through cross-compilation toolchains in Xcode and third-party GNU binutils, allowing developers to generate ELF files targeting Linux or embedded Unix-like systems without altering the host OS's loader. This partial integration facilitates porting and building for Unix environments but does not involve direct kernel handling of ELF files on macOS itself. Overall, ELF's standardization in Unix-like kernels emphasizes robust header validation to prevent malformed binaries, memory-efficient segment mapping for shared libraries, and deferred relocation processing to minimize load times and enable address space layout randomization for security.[38]
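The segment-mapping step that these kernels perform can be sketched as arithmetic on one PT_LOAD entry. This is a simplified model under stated assumptions, not kernel code: the function name mapping_plan is hypothetical, a 4 KiB page size is assumed, and address randomization and permissions are ignored. It shows how the file-backed mapping is widened to page boundaries and where zero-filled memory is needed when p_memsz exceeds p_filesz.

```python
def mapping_plan(p_offset, p_vaddr, p_filesz, p_memsz, page=0x1000):
    """Compute a page-aligned mapping for one PT_LOAD segment (illustrative)."""
    if p_memsz < p_filesz:
        raise ValueError("p_memsz must not be smaller than p_filesz")
    if (p_vaddr - p_offset) % page:
        raise ValueError("p_vaddr and p_offset must be congruent modulo the page size")

    slack = p_vaddr % page            # bytes between the page start and the segment
    file_end = p_vaddr + p_filesz     # end of the bytes that come from the file
    mem_end = p_vaddr + p_memsz       # end of the segment in memory
    mapped_end = (file_end + page - 1) // page * page

    return {
        # mmap-style parameters: both address and file offset are page-aligned.
        "map_addr": p_vaddr - slack,
        "map_offset": p_offset - slack,
        "map_length": p_filesz + slack,
        # Bytes that must read as zero (the BSS-like tail of the segment).
        "zero_range": (file_end, mem_end),
        # Extra anonymous pages needed beyond the last file-backed page, if any.
        "anonymous_range": (mapped_end, mem_end) if mem_end > mapped_end else None,
    }
```

For the minimal executable shown earlier (offset 0, address 0x08048000, both sizes 0x74), the plan is a single file-backed mapping with an empty zero range and no anonymous pages.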

Adoption in Non-Unix Environments

The Executable and Linkable Format (ELF) has seen partial adoption in Windows environments through compatibility layers and development tools that bridge POSIX-like functionality with the native Portable Executable (PE) format. Cygwin, a POSIX emulation layer for Windows, supports handling ELF files via libraries such as ELFIO, enabling developers to read and generate ELF binaries within a Unix-like environment while ultimately producing PE executables for Windows execution.[44] Similarly, MinGW provides a GNU toolchain for Windows that can be configured to generate ELF object files for cross-compilation purposes, though native Windows applications remain in PE format.[45] ReactOS, an open-source implementation of the Windows NT kernel, incorporates ELF internals in its debugging subsystem, such as the dbghelp module, to parse ELF modules for compatibility and analysis tasks.[46] BeOS and its successor Haiku adopted ELF as the native executable format starting in the late 1990s, leveraging its Unix-like heritage for efficient binary handling on x86 architectures after transitioning from the earlier Preferred Executable Format (PEF) used on PowerPC.[47] This choice facilitated compatibility with Unix tools and binutils, allowing Haiku to maintain a lightweight yet robust binary ecosystem without major modifications to the core ELF specification.[48] Fuchsia, Google's modular capability-based operating system, employs ELF via its built-in ELF runner for launching executable components.[49] In firmware and real-time operating systems (RTOS), ELF has been adapted for resource-constrained embedded environments, notably in uClinux distributions for microcontrollers lacking memory management units (MMUs). 
uClinux employs ELF as the base format for executables and shared libraries, with modifications to support flat loading and no-MMU operation, enabling deployment on systems like ARM-based devices.[50] Cross-platform toolchains further extend this adoption; for instance, LLVM and Clang can generate ELF binaries targeting non-Unix architectures directly from Windows or other hosts, facilitating development for embedded and hybrid systems.[51] A key challenge in ELF's adoption outside Unix environments stems from application binary interface (ABI) differences, particularly in calling conventions. On x86-64, the System V ABI used with ELF on Unix-like systems passes the first six integer or pointer arguments in registers RDI, RSI, RDX, RCX, R8, and R9, whereas the Windows x64 ABI uses RCX, RDX, R8, and R9 for the first four arguments and requires the caller to reserve 32 bytes of stack "shadow space" for them.[52] These differences necessitate adaptations in loaders and linkers to ensure compatibility, often requiring wrappers or recompilation for cross-environment execution.
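The difference in register assignment can be made concrete with a small lookup. The sketch below is deliberately simplified: the function name integer_arg_locations is hypothetical, and it models only integer and pointer arguments, ignoring floating-point registers, aggregates, and the exact stack layout of each convention.

```python
# Integer/pointer argument registers, in order, for the two x86-64 conventions.
INTEGER_ARG_REGISTERS = {
    "sysv": ("rdi", "rsi", "rdx", "rcx", "r8", "r9"),
    "win64": ("rcx", "rdx", "r8", "r9"),
}


def integer_arg_locations(n_args, abi):
    """Say where each of n integer arguments is passed (illustrative helper)."""
    registers = INTEGER_ARG_REGISTERS[abi]
    return [
        registers[i] if i < len(registers) else f"stack slot {i - len(registers)}"
        for i in range(n_args)
    ]


for abi in ("sysv", "win64"):
    print(abi, integer_arg_locations(7, abi))
```

A wrapper that bridges the two conventions has to move each argument from its location under one ABI to its location under the other, which is why the third argument (RDX versus R8) cannot simply be left in place.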

Applications in Embedded and Specialized Systems

The Executable and Linkable Format (ELF) finds significant application in game consoles, where it supports development kits, homebrew software, and proprietary variants tailored to hardware constraints. In the PlayStation series, the PS2 employs ELF files for homebrew applications and executable packing, enabling modular loading of code segments optimized for the Emotion Engine processor. Similarly, the PS3 utilizes SELF (Signed Executable and Linkable Format), a cryptographically signed extension of ELF, for system executables and dynamic libraries (SPRX files), ensuring secure loading on the Cell Broadband Engine architecture from 2006 onward. For the Nintendo Switch, homebrew development relies on ELF-based formats like .nro files, which are loaded via custom loaders to run unsigned code on the Tegra X1 SoC, facilitating community-driven applications since the console's 2017 release. In mobile ecosystems, ELF serves native code execution in Android, where the Native Development Kit (NDK) compiles C/C++ libraries into ELF shared objects (.so files) linked against Bionic libc, enabling high-performance components in apps since Android 1.0 in 2008. This format allows dynamic loading of optimized binaries for ARM architectures, supporting features like graphics rendering and signal processing in resource-limited environments. On iOS, while the native Mach-O format dominates, ELF plays a role in jailbreak tools through custom loaders that inject ELF shared objects, as demonstrated by developer Comex's "food" module, which enabled loading Android-compatible ELF libraries like libflashplayer.so on jailbroken devices. Historical and specialized uses of ELF extend to PowerPC-based systems, where AmigaOS 4 adopted ELF executables to replace the earlier Extended Hunk Format for PowerPC accelerator cards, providing a standardized structure for binaries on Amiga hardware since 2006. 
MorphOS, a lightweight OS for PowerPC Macs and Amiga clones, also employs ELF for its executables, leveraging the format's flexibility for efficient media-centric applications on constrained 32-bit and 64-bit PowerPC processors. IBM's AIX operating system on PowerPC historically favored the proprietary XCOFF format but incorporated ELF support through optional toolchains for compatibility with Linux environments, allowing cross-compilation of ELF binaries for PowerPC targets. In blockchain platforms, ELF underpins node implementations running on Unix-like hosts. Ethereum clients, such as the Go Ethereum (Geth) implementation, produce ELF executables for Linux deployments, facilitating consensus and transaction processing on x86 and ARM hosts since the network's 2015 launch. Solana validators, built in Rust, compile to ELF binaries for Unix systems, with on-chain programs specifically packaged as ELF files containing BPF bytecode, enabling high-throughput validation on diverse hardware including ARM-based servers. For embedded and IoT systems, ELF's modularity supports resource-constrained architectures like ARM and RISC-V. Raspberry Pi OS, a Debian derivative for ARM processors, uses ELF for all native executables and libraries, allowing seamless deployment of Linux applications on devices like the Raspberry Pi 4 since 2012. SiFive's RISC-V boards, such as the HiFive series, rely on ELF for firmware and application binaries, with toolchains handling relocations and sections tailored to embedded needs. To minimize footprint in these environments, optimizations like stripping unnecessary sections (e.g., debug symbols via binutils' strip tool) reduce ELF file sizes by up to 50% without affecting runtime functionality, as applied in RISC-V software for IoT edge devices.

Extensions and Variants

Multi-Architecture Support Initiatives

As adoption grew, ELF came to cover many architectures through the e_machine field in the ELF header, a 16-bit identifier that specifies the target processor. It enables compatibility with over 60 architectures, including ARM (EM_ARM=40), MIPS (EM_MIPS=8), and x86-64 (EM_X86_64=62). This design decouples the core file structure from architecture-specific details, so ELF can serve as a container for binaries across diverse hardware without format redesigns.[4]

To accommodate varying system conventions, ELF incorporates Application Binary Interface (ABI) extensions that build on the generic System V ABI. The generic ABI (gABI) provides a baseline for ELF usage across Linux distributions, defining common conventions for object files, executables, and shared libraries while leaving room for processor-specific adaptations.[53] Architecture-specific ABIs extend this foundation. For instance, the ARM Embedded Application Binary Interface (EABI), finalized in 2009, tailors ELF for ARM processors by specifying relocation types, dynamic linking tags, and procedure call standards for efficient execution on resource-constrained embedded systems. These extensions remain backward compatible with the gABI while addressing architecture-specific requirements, such as ARM's support for both little- and big-endian byte orders via the e_ident[EI_DATA] field.[4] Efforts to consolidate multiple architectures into a single ELF file have included proposals like FatELF, which embeds several architecture-specific ELF binaries within one container file.[54]

Cross-compilation standards have further enhanced ELF's multi-architecture capabilities, particularly through the LLVM project's toolchain. LLVM/Clang can generate ELF object files and executables for numerous targets via the target triple specification (e.g., armv7-linux-gnueabihf for ARM ELF), so developers can build binaries for other architectures from a single host system without a separate toolchain per architecture.[51] This portability relies on ELF's extensible structure, such as the e_flags field for architecture-specific attributes, and on standards like the gABI to produce compatible outputs for linking and execution across ecosystems.[51]
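The header fields described in this section can be read with a few lines of code. The following is a minimal sketch, not a full ELF parser; the function name `describe_elf_header` and the small `EM_NAMES` table are illustrative choices. It uses e_ident[EI_DATA] to pick the byte order before decoding e_machine.

```python
import struct

# Small subset of e_machine values; the first three are the ones named in the text.
EM_NAMES = {40: "ARM", 8: "MIPS", 62: "x86-64", 3: "x86", 183: "AArch64", 243: "RISC-V"}

def describe_elf_header(data: bytes) -> dict:
    """Decode word size, byte order, and e_machine from the start of an ELF file."""
    if len(data) < 20 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = data[4], data[5]   # e_ident[EI_CLASS], e_ident[EI_DATA]
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("unknown ELF class or data encoding")
    endian = "<" if ei_data == 1 else ">"  # 1 = little-endian, 2 = big-endian
    # e_type and e_machine are 16-bit fields at offsets 16 and 18,
    # directly after the 16-byte e_ident array.
    _e_type, e_machine = struct.unpack_from(endian + "HH", data, 16)
    return {
        "bits": 32 if ei_class == 1 else 64,
        "byte_order": "little" if ei_data == 1 else "big",
        "e_machine": e_machine,
        "machine": EM_NAMES.get(e_machine, "unknown"),
    }
```

Passing the first 20 bytes of any ELF file is enough, for example `describe_elf_header(open(path, "rb").read(20))`, where `path` is whichever binary is being inspected.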

Modern Binary Format Proposals

In 2009, Ryan C. Gordon proposed FatELF as an extension to the ELF format to enable universal binaries on Linux, akin to those on macOS, by embedding multiple architecture-specific ELF binaries within a single file.[55] The structure consists of a primary ELF binary followed by secondary ELF binaries, each identified by headers specifying attributes such as CPU architecture, byte order, and OS ABI version, so that the loader can select the appropriate one at run time.[56] Despite initial interest in simplifying multi-architecture distribution, the project was abandoned later that year because of kernel integration challenges and community resistance, and it had seen no widespread adoption by 2025.[57]

WebAssembly (Wasm), introduced in 2017, draws conceptual inspiration from ELF in its binary format, organizing modules into sections for code, data, and custom metadata to support portable, secure execution across environments.[58] Although Wasm is distinct from ELF, its section-based layout eases integration with ELF-based toolchains such as LLVM, whose linker (lld) supports both ELF and WebAssembly output.[59]

Contemporary security enhancements to ELF focus on embedding properties for hardware-enforced protections rather than full cryptographic signing schemes. For instance, the GNU toolchain introduced the .note.gnu.property section in 2019 to signal support for Intel's Control-flow Enforcement Technology (CET), which mitigates control-flow hijacking attacks by restricting indirect branches; the kernel uses this note to configure execution accordingly.[60] Similarly, on ARM64, ELF binaries use the same property note mechanism to enable Pointer Authentication Codes (PAC) and Branch Target Identification (BTI) for control-flow integrity (CFI). These features authenticate pointers and validate branch targets to prevent exploits such as return-oriented programming, and have been integrated into Linux kernels since 2020.[61]

Looking ahead, RISC-V ELF extensions in the 2020s incorporate property notes to indicate support for vector cryptography instructions, such as those in the Zvkn (AES/SHA) and Zvks (SM4) extensions ratified in 2023, enabling hardware-accelerated cryptographic operations in ELF binaries without altering the core format.[62] GNU binutils 2.45, released in July 2025, added support for these RISC-V cryptography property notes.[63] Broader discussions in the Linux community, including a January 2025 thread on the binutils mailing list introducing the concept of "ELF 2.0," have explored potential enhancements such as improved relocations and compatibility, though no formal specification had emerged as of November 2025.[64]
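The property note mechanism described above can be illustrated by decoding the raw contents of a .note.gnu.property section. The following is a minimal sketch, assuming the section bytes have already been extracted from the file; the function name `decode_gnu_property_note` and its `machine` argument are illustrative, while the numeric constants follow the values used in the GNU toolchain's `elf.h`. The caller must state the architecture because property type numbers are processor-specific.

```python
import struct

NT_GNU_PROPERTY_TYPE_0 = 5
# Processor-specific "feature 1 AND" property type and its flag bits.
FEATURE_1_AND = {
    "x86-64": (0xC0000002, {0x1: "IBT", 0x2: "SHSTK"}),   # Intel CET
    "aarch64": (0xC0000000, {0x1: "BTI", 0x2: "PAC"}),
}

def decode_gnu_property_note(note, machine, align=8, endian="<"):
    """Return the control-flow features advertised by one GNU property note.

    `align` is the property padding: 8 for ELF64, 4 for ELF32.
    """
    wanted_type, bit_names = FEATURE_1_AND[machine]
    # Note header: n_namesz, n_descsz, n_type, followed by the owner name.
    namesz, descsz, note_type = struct.unpack_from(endian + "III", note, 0)
    if note_type != NT_GNU_PROPERTY_TYPE_0 or note[12:12 + namesz] != b"GNU\x00":
        return []
    pos = 12 + ((namesz + 3) & ~3)     # name is padded to a 4-byte boundary
    end = pos + descsz
    features = []
    while pos + 8 <= end:
        # Each property: pr_type, pr_datasz, then pr_datasz bytes plus padding.
        pr_type, pr_datasz = struct.unpack_from(endian + "II", note, pos)
        if pr_type == wanted_type and pr_datasz >= 4:
            (bits,) = struct.unpack_from(endian + "I", note, pos + 8)
            features.extend(name for bit, name in bit_names.items() if bits & bit)
        pos += 8 + ((pr_datasz + align - 1) & ~(align - 1))
    return features
```

A loader performs a comparable check before enabling a protection: because the property is an AND across all linked objects, a feature is advertised only when every input object supports it.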

References
