| UPX | |
|---|---|
| Initial release | May 26, 1998 |
| Stable release | 5.0.2 / July 20, 2025 |
| Repository | |
| Written in | C++, Assembly |
| Operating system | Microsoft Windows, Linux, macOS, DOS, Atari TOS |
| Platform | i386, MIPS, AMD64, ARM, PowerPC, m68k |
| Available in | English |
| Type | Executable compression |
| License | GPL with exception for compressed executables,[1] proprietary for compression algorithm in binary distributions[2] |
| Website | upx |
UPX (Ultimate Packer for eXecutables) is a free and open source executable packer supporting a number of file formats from different operating systems.[3][4]
Compression
UPX uses a data compression algorithm called UCL,[5] which is an open-source implementation of portions of the proprietary NRV (Not Really Vanished)[6] algorithm.[2]
UCL has been designed to be simple enough that a decompressor can be implemented in just a few hundred bytes of code. UCL requires no additional memory to be allocated for decompression, a considerable advantage that means that a UPX packed executable usually requires no additional memory.
UPX (since 2.90 beta) can use LZMA on most platforms; however, this is disabled by default for 16-bit due to slow decompression speed on older computers (use --lzma to force it on).
Starting with version 3.91, UPX also supports 64-Bit (x64) PE files on the Windows platform.[7] This feature is currently declared as experimental.
Decompression
UPX supports two mechanisms for decompression: an in-place technique and extraction to a temporary file.
The in-place technique, which decompresses the executable into memory, is not possible on all supported platforms. It has the advantage of being more efficient in terms of memory, and that the environment set up by the OS remains correct.
The remaining platforms use extraction to a temporary file. This procedure adds overhead, but it allows any executable file format to be packed. It has several disadvantages:
- Special permissions are ignored, such as suid.
- argv[0] will not be meaningful.
- Multiple running instances of the executable are unable to share common segments.
Unmodified UPX packing is often detected and unpacked by antivirus software scanners. UPX also has a built-in feature for unpacking unmodified executables packed with itself.
Supported formats
UPX supports the following formats:[8]
- Portable Executable (PE, EXE and DLL files):
- COFF executables, used by DJGPP2
- a.out format, BSD i386 (removed)
- Raw 8086/DOS files:[nb 1]
- Watcom/LE (used by DOS4G, PMODE/W, DOS32A and CauseWay)[citation needed]
- TMT/adam (as generated by the TMT Pascal compiler)
- Atari/TOS
- Linux kernel, i386, x86-64 and ARM
- Linux Executable and Linkable Format, i386, x86-64, ARM, PowerPC, MIPS
- PlayStation 1/EXE (MIPS R3000)
- Darwin Mach-O, ppc32, i386, and x86-64
UPX does not currently support PE files containing CIL code intended to run on the .NET Framework.
Notes
- ^ a b c For the DOS targets, UPX supports a special option --8086 in order to force the embedded decompressor to become compatible with 8088/8086 processors, so that the compressed files can be executed and decompressed even on the earliest PCs running DOS.
- ^ The facility to compress DOS .COM-style files can be utilized also to compress other binary executable files. Some FreeDOS and EDR-DOS kernel files are known to be UPX-compressible this way.
- ^ The facility to compress DOS .COM-style files can be utilized also to compress non-executable binary data files, if the driver/application using these files has been enhanced to detect UPX-compressed files and jump to the decompressor embedded in the file. FreeDOS is known to utilize this for .CPX files, UPX-compressed .CPI font files.
References
- ^ "UPX License Agreement". Archived from the original on 2016-03-12. Retrieved 2016-09-14.
- ^ a b "The UPX Hacker's Guide". GitHub. 19 February 2022. Archived from the original on 14 May 2022. Retrieved 14 September 2016.
- ^ Marak, Victor (2015). Windows Malware Analysis Essentials. Packt Publishing. p. 188. ISBN 978-1-78528-151-8. Archived from the original on May 14, 2022. Retrieved November 22, 2015.
Packers such as Ultimate Packer for Executables (UPX) are more of executable compressors as size reduction is the primary goal, not obfuscation, which can be a byproduct ...
- ^ Blunden, Bill (2013). The Rootkit Arsenal (Second ed.). Jones & Bartlett Learning. pp. 353–355. ISBN 978-1-4496-2636-5. Archived from the original on May 14, 2022. Retrieved November 22, 2015.
One of the most prolific executable packers is UPX (the Ultimate Packer for executables). Not only does it handle dozens of different executable formats, but also its source code is available online.
- ^ Markus Oberhumer. "UCL data compression library". oberhumer.com. Archived from the original on 2024-06-28. Retrieved 2022-01-11.
- ^ Markus Oberhumer. "NRV Compression Library". Archived from the original on September 9, 2012.
- ^ "UPX News". Archived from the original on 2018-01-04. Retrieved 2016-09-14.
- ^ – Linux General Commands Manual from ManKier.com
- ^ "dos extender rtm32 - fileformat of the stub? \ VOGONS". Archived from the original on 2022-01-11. Retrieved 2022-01-11.
Introduction
Overview
UPX, the Ultimate Packer for eXecutables, is a free and open-source tool designed to compress executable files, reducing their size while preserving full functionality and enabling self-decompression at runtime.[1] It achieves this by packing programs, dynamic libraries, and other binaries into compact forms suitable for distribution, thereby minimizing disk space usage, network bandwidth, and loading times without imposing runtime penalties in most cases.[4]
UPX is licensed under the GNU General Public License version 2 or later, with a special exception that permits the redistribution of compressed executables, including for commercial purposes, without requiring the disclosure of the original source code, provided the UPX version remains unmodified and the decompression stub is used solely for unpacking at startup.[5] This licensing model ensures broad accessibility while protecting users from potential modifications that could embed malicious code. The project was initially released on May 26, 1998, with the latest stable version, 5.0.2, issued on July 20, 2025.[6][7]
Key benefits of UPX include typical compression ratios of 50% to 70%, superior to general-purpose archivers like ZIP or GZIP for executables, along with high portability across multiple operating systems and architectures.[4][1] It supports various executable formats from platforms such as Windows, Linux, and macOS, making it a versatile option for developers and system administrators.[1] The official website is upx.github.io, and the source repository is hosted at github.com/upx/upx.[1][4]
History and Development
UPX originated as a hobby project initiated in 1996 by developers Markus F.X.J. Oberhumer and László Molnár, who sought to create an efficient executable compressor building on prior tools like DJP and LZOP.[1][8] The duo's collaboration laid the foundation for what would become a widely used open-source utility, with John F. Reiser later joining as a key contributor.[1] The first public beta release, version 0.05, arrived on May 26, 1998, marking UPX's debut as a portable packer for multiple executable formats.[6] Early development emphasized stability and licensing clarity, culminating in the release of the full source code under the GNU General Public License with version 0.99 on February 25, 2000.[6] This was swiftly followed by the stable version 1.00 on March 26, 2000, which included documentation enhancements and bug fixes to solidify its reliability.[6] A significant advancement came with version 2.90 on October 8, 2006, introducing LZMA compression support for improved ratios across 32-bit and 64-bit formats via the new --lzma option.[9] Further expansion occurred in version 3.91, released on September 30, 2013, which added experimental support for 64-bit Windows PE files based on contributions from Stefan Widmann.[10] The project transitioned to GitHub hosting with version 3.92 on December 11, 2016, facilitating broader community involvement.[6]
In more recent years, UPX adopted semantic versioning starting with 4.0.0 on October 28, 2022, to better manage updates and compatibility.[6] Version 5.0, released on February 20, 2025, brought enhancements including refined ARM64 support for Linux (aarch64) and other modern architectures, alongside optimizations like LZMA for 64-bit PowerPC contributed by Thierry Fauck.[6] The latest release, 5.0.2 on July 20, 2025, primarily addresses bug fixes and compatibility issues for contemporary systems, ensuring ongoing viability.[11][7] As an active open-source endeavor on GitHub, UPX continues to receive community contributions and is implemented mainly in C++ with assembly for performance-critical sections.[4][1]
Core Functionality
Compression Process
The compression process in UPX begins with parsing the input executable file to understand its format-specific structure, including identification of compressible sections such as code and data, entry points, and relocation tables.[2] These sections are then analyzed for entropy to determine suitability for compression, with non-compressible elements like certain headers preserved or minimally modified.[1]
UPX primarily employs algorithms from the NRV (Not Really Vanished) family, re-implemented as the open-source UCL library, which are LZ77-based dictionary compressors designed for high decompression speed and reasonable ratios.[12] Variants such as nrv2b and nrv2d are commonly used, where nrv2b offers a good balance of size reduction and speed, while nrv2d prioritizes slightly better ratios at minor speed costs; these algorithms process data in blocks, replacing repeated sequences with references to achieve lossless compression.[12] For higher compression ratios, UPX supports LZMA (Lempel–Ziv–Markov chain algorithm), introduced in version 2.90, which uses a more advanced range encoder and adaptive dictionary for superior entropy coding but results in slower decompression times.[9]
The core steps involve compressing eligible sections individually using the chosen algorithm and compression level, ranging from 1 (fastest, lowest ratio) to 9 (slowest, highest ratio), with --best or --ultra-brute modes applying aggressive optimizations like multiple passes or brute-force parameter tuning for maximum reduction.[2] Relocation tables are adjusted to account for offset changes due to compression, ensuring correct runtime addressing, while original headers are embedded to maintain format compatibility.[2] Factors influencing the compression ratio include the input file's code entropy and redundancy, typically yielding 50-70% size reduction for programs and DLLs, though results vary by algorithm and options.[1]
The output is a self-extracting executable that incorporates a small decompression stub (often under 3 KB) prepended or appended to the compressed data, allowing the file to run transparently by unpacking sections in place or to memory at runtime without additional overhead for most supported formats.[2] This stub is platform-specific and optimized in assembly for minimal size and fast execution.[2]
Decompression Process
When a UPX-packed executable is launched, a small, uncompressed stub—typically ranging from 1 to 5 KB in size and written in assembly—executes first to handle the unpacking process. This stub is embedded at the beginning of the packed file and relies on minimal system calls without dependencies on libraries like libc, ensuring portability across supported platforms. For instance, on Linux ELF executables, the stub is approximately 1700 bytes and performs all necessary operations independently.[2][1] The runtime decompression proceeds in-place within memory for most formats, avoiding the creation of temporary files and minimizing overhead. The stub maps the packed file into memory, then applies the inverse of the compression algorithm—such as the NRV (Not Really Vanished) decompression routine from the UCL library—to expand compressed sections directly over their packed counterparts. It subsequently restores the original entry point, handles any necessary relocations, wipes the stack if required, and transfers control to the unpacked code, simulating kernel mappings as needed (e.g., setting brk() on Linux/ELF). This self-modifying approach ensures the original program executes seamlessly, with decompression speeds exceeding 500 MB/sec on modern hardware and no additional memory footprint post-unpacking.[2][1][12] To maintain integrity, the stub includes built-in checks for file corruption or modifications, aborting execution with an exit code of 127 if issues are detected during decompression. Non-executable overlay data—such as resources or appended files—is preserved if the --overlay option was specified during compression, by copying it after the decompressed image without alteration. 
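The expansion step described above rests on LZ77-style back-references: literals are copied as they are, and repeated sequences are rebuilt by copying bytes already written to the output. The Python sketch below is a toy illustration of that idea only. It is not the NRV/UCL bitstream and not UPX's stub; the token format and the names `toy_compress` and `toy_decompress` are assumptions made for this example.

```python
def toy_compress(data: bytes, window: int = 255, min_match: int = 3):
    """Greedy LZ77-style encoder producing literal and (offset, length) tokens."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_off = 0, 0
        # Search the sliding window behind position i for the longest match.
        for start in range(max(0, i - window), i):
            length = 0
            # A match may run past position i (an overlapping copy).
            while (i + length < len(data)
                   and data[start + length] == data[i + length]
                   and length < 255):
                length += 1
            if length > best_len:
                best_len, best_off = length, i - start
        if best_len >= min_match:
            tokens.append(("match", best_off, best_len))
            i += best_len
        else:
            tokens.append(("lit", data[i]))
            i += 1
    return tokens


def toy_decompress(tokens) -> bytes:
    """Rebuild the data by replaying literals and back-references in order."""
    out = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            out.append(tok[1])
        else:
            _, offset, length = tok
            # Copy byte by byte so overlapping references work correctly.
            for _ in range(length):
                out.append(out[-offset])
    return bytes(out)


if __name__ == "__main__":
    sample = b"abcabcabcabc"
    encoded = toy_compress(sample)
    print(encoded)
    print(toy_decompress(encoded) == sample)
```

The decoder needs only the output buffer it is already writing to, which is the property that makes a very small decompressor without extra allocations plausible.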
The stub's lightweight design adds negligible startup overhead, typically under a second, and supports compatibility across diverse architectures by tailoring operations to the target format, such as decompressing to /tmp and re-executing via execve() for certain Linux variants.[2][1]
Supported Platforms and Formats
Executable Formats
UPX primarily supports the Portable Executable (PE) format for Windows systems, encompassing both 32-bit and 64-bit executables as well as dynamic link libraries (.exe and .dll files). This format enables compression of native Windows applications while preserving their functionality upon decompression.[4] The Executable and Linkable Format (ELF) is another core supported format, targeting 32-bit and 64-bit binaries on Linux and Unix-like operating systems.
Additional formats include Mach-O for macOS and iOS applications, supporting both 32-bit and 64-bit binaries, with ARM64 support added in version 4.0. Other legacy formats covered are the Linear Executable (LE) for DOS and 16-bit Windows environments; a.out for older Unix systems; and Atari TOS for Atari ST platforms. These extend UPX's compatibility to historical systems while maintaining focus on native machine code.[4]
UPX does not support managed formats such as .NET assemblies or Java bytecode, restricting its use to native machine code binaries that can be directly executed or linked.[13] During processing, UPX automatically detects the input file's format by inspecting its header signatures or magic bytes, ensuring seamless handling without manual specification.[2]
Operating Systems and Architectures
UPX runs on a wide range of host operating systems, including all versions of Windows, various Linux distributions, macOS, FreeBSD, DOS, and Atari TOS. Pre-built binaries are provided for major platforms such as Linux (across architectures like amd64, arm64, i386), Windows (x86 and x64), and macOS, facilitating easy deployment without compilation. Cross-compilation is fully supported, allowing users to generate packed executables for diverse targets from a single host environment, such as creating Windows PE files on a Linux system or macOS binaries from Windows.[3][8]
The tool targets numerous CPU architectures, encompassing x86 (i386), AMD64 (x86-64), ARM (32-bit and 64-bit variants, including the Thumb instruction set), PowerPC (32-bit and 64-bit), MIPS (supporting both big-endian and little-endian byte orders), and the Motorola 68000 family (m68k). These architectures enable UPX to compress executables in formats like ELF for Linux, PE for Windows, Mach-O for macOS, and legacy formats such as LE for DOS or TOS for Atari. Support for these platforms ensures compatibility with both modern and embedded systems, though decompression stubs are optimized per architecture for efficient runtime performance.[14][4]
Notable enhancements include the addition of ARM64 (aarch64) support for Linux ELF executables in version 3.94 (2017), with further refinements in subsequent releases up to version 4.0 (2022) for broader stability across PE and Mach-O formats. As of version 5.0.0 (February 2025), enhancements include improved handling of ELF sections such as PT_MIPS_ABIFLAGS, maintaining backward compatibility.[14][3]
Building UPX from source requires a standards-compliant C++ compiler (such as GCC or Clang) and the NASM assembler for generating architecture-specific stubs.
The build process uses CMake for configuration, supporting cross-compilation toolchains to produce binaries for non-native hosts; official documentation recommends verifying toolchain compatibility for less common architectures like MIPS or m68k. Pre-built binaries mitigate these requirements for standard use cases on supported hosts.[3][14]
Usage
Command-Line Interface
UPX operates primarily through a command-line interface, functioning as a console application without a graphical user interface, which makes it suitable for scripting and batch processing of multiple files.[2] To use UPX, users must first install it by downloading pre-built binaries from the official GitHub releases page at https://github.com/upx/upx/releases, which provides executables for Windows (32-bit and 64-bit) and various Linux architectures (such as amd64 and arm64, often statically linked for portability). As of November 2025, the latest version is 5.0.2.[3] For macOS, pre-built binaries are not directly provided in releases; instead, UPX can be installed via package managers like Homebrew (brew install upx) or MacPorts (sudo port install upx), ensuring compatibility with Apple Silicon and Intel architectures; users can also build from source.[15][16] After installation, verify the setup by running upx -V in the terminal, which displays the UPX version and build information to confirm proper functionality.[2]
The basic syntax for invoking UPX is upx [options] file1 [file2 ...], allowing the processing of one or more input files in a single command; by default, UPX compresses each file in place, replacing the original, so use -o to write the packed result to a different file or -k (--backup) to keep a backup copy of the input.[2] Core commands include upx file.exe for default compression of an executable, upx -d file.exe to decompress or unpack a previously packed file back to its original form (supports multiple files for batch operations), and upx -l file.exe to list packing information such as sizes, compression ratio, and format without modifying the file.[2] These operations require no additional prerequisites beyond a standard command-line environment, and UPX supports batch processing by specifying multiple filenames or using wildcards in scripts for automation.[2]
UPX returns exit codes to indicate operation status: 0 for success, 1 for fatal errors (e.g., unsupported format or validation failure), and 2 for warnings (non-fatal issues); consult the documentation for version-specific details to implement appropriate error handling in scripts.[2] This design ensures reliable integration into build scripts or pipelines.[2]
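For scripts that need to branch on these exit codes, a small wrapper can make the mapping explicit. The sketch below assumes only the three codes described above; the names `UpxStatus`, `classify_exit`, and `run_upx` are hypothetical, and treating any other code as an error is a conservative assumption of this example, not documented UPX behavior.

```python
import subprocess
from enum import Enum
from typing import List


class UpxStatus(Enum):
    OK = 0        # operation succeeded
    ERROR = 1     # fatal error, e.g. unsupported format
    WARNING = 2   # non-fatal issue


def classify_exit(code: int) -> UpxStatus:
    """Map a process exit code to a status; unknown codes count as errors."""
    try:
        return UpxStatus(code)
    except ValueError:
        return UpxStatus.ERROR  # assumption: be conservative about unknown codes


def run_upx(args: List[str]) -> UpxStatus:
    """Run upx with the given arguments (requires upx on PATH)."""
    proc = subprocess.run(["upx", *args], capture_output=True, text=True)
    return classify_exit(proc.returncode)


if __name__ == "__main__":
    print(classify_exit(0), classify_exit(2), classify_exit(127))
```

A build script might fail on `UpxStatus.ERROR` but only log `UpxStatus.WARNING`, depending on how strict the pipeline should be.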
Common Options and Examples
UPX provides several command-line options to control compression levels, decompression, output handling, and file inspection, allowing users to tailor the packing process to specific needs. The --best option requests the strongest compression level for the selected method; it is computationally intensive and slower than default settings, making it suitable for final releases of executables.[17] The -9 flag sets the highest numbered compression level, prioritizing size reduction over speed, but generally compresses slightly less than --best.[17] For alternative algorithms, --lzma invokes the LZMA compression method, which can achieve better ratios but results in slower decompression times, particularly for larger files.[17] The --brute option goes further by trying all available compression methods and filters to find the best fit.[17]
Decompression is handled via the -d or --decompress flags, which unpack one or more UPX-compressed files in place, restoring them to their original state without altering functionality.[17] Output can be directed to a custom file using -o followed by the desired filename, preventing overwrites of originals during testing.[17] Additionally, --overlay=strip removes non-essential overlay data from files, potentially improving compression ratios but risking crashes if the program depends on that data.[17]
Practical examples illustrate common usage patterns. To achieve maximum compression on a single executable, the command upx --best program.exe applies exhaustive optimization, reducing file size significantly while preserving executability.[17] For decompression with a specified output, upx -d -o original.exe packed.upx unpacks the file to a new location named original.exe.[17] Batch compression of multiple executables is straightforward with upx --best *.exe, which processes all .exe files in the current directory using optimal settings.[17]
To inspect packed files, the -l or --list option displays compression statistics, such as original and packed sizes along with the ratio achieved.[17] Integrity verification uses -t or --test, which checks the file's compressed and uncompressed data without modifying it, ensuring no corruption occurred during packing.[17] For scripting or automated environments, --no-progress suppresses verbose output, keeping logs clean.[17]
Best practices emphasize verifying results after operations: always run -t on packed files to confirm integrity, and test the unpacked executable to ensure it functions correctly, as compression can occasionally interact poorly with certain code patterns.[17] Using --no-progress is recommended in batch scripts to avoid cluttering output, and specifying -o helps maintain safe backups during experimentation.[17]
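The pack-then-verify routine recommended above can be scripted. The following sketch builds the two command lines from options named in this section (--best, -o, -t, --no-progress) and runs them in order; the function names and the file names in the usage lines are hypothetical, and the script assumes upx is available on PATH when `pack_and_verify` is actually called.

```python
import subprocess
from pathlib import Path
from typing import List


def pack_command(src: Path, dst: Path, best: bool = True) -> List[str]:
    """Build the argv for packing src into a separate output file dst."""
    cmd = ["upx", "--no-progress"]
    if best:
        cmd.append("--best")
    cmd += ["-o", str(dst), str(src)]
    return cmd


def verify_command(packed: Path) -> List[str]:
    """Build the argv for an integrity test of a packed file."""
    return ["upx", "--no-progress", "-t", str(packed)]


def pack_and_verify(src: Path, dst: Path) -> bool:
    """Pack, then test; returns True only if both steps exit with code 0."""
    for cmd in (pack_command(src, dst), verify_command(dst)):
        if subprocess.run(cmd).returncode != 0:
            return False
    return True


if __name__ == "__main__":
    print(pack_command(Path("program.exe"), Path("program.packed.exe")))
    print(verify_command(Path("program.packed.exe")))
```

Writing to a separate output file keeps the original untouched, so a failed test never leaves the build without a working binary. Running the packed program itself remains a separate check that this sketch does not perform.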
Advanced Topics
Customization and Building from Source
UPX can be built from source to enable customization and adaptation for specific development needs, such as targeting niche architectures or integrating modified compression behaviors. The source code is hosted on GitHub, where developers clone the repository using git clone https://github.com/upx/upx.git. Building requires a C++ compiler that supports C++17, such as GCC 8 or later, Clang 5 or later, or MSVC 2019 (version 16.11 or later), along with CMake version 3.8 or higher (3.10 recommended) and an assembler like NASM or YASM for x86/x64 targets, or GAS for other architectures.[18][4]
The build process typically involves configuring with CMake and compiling the executable. For Unix-like systems, including Linux and macOS, run cmake -S . -B build followed by cmake --build build from the root directory; alternatively, traditional Makefiles are available in the src/ directory for make -C src. On Windows, use Visual Studio projects or MinGW with cmake to generate appropriate build files, such as cmake -G "MinGW Makefiles" .. and then mingw32-make. Cross-compilation is supported via target-specific toolchains, for example, setting CROSS=arm-linux- in Makefiles or using CMake's cross-compilation presets for embedded targets like ARM or RISC-V, allowing tailoring to architectures beyond the standard supported platforms.[18][4][19]
Dependencies are minimal and mostly bundled: the UCL compression library and zlib are included as static libraries for core functionality, with optional support for bzip2 or Zstandard if enabled during configuration. For DOS-specific stubs, PMODE/W is optionally required to handle extender support in legacy builds. No external runtime libraries are needed beyond the build tools, ensuring portability across platforms.[18][4][2]
Customization allows developers to modify core components for specialized use cases. Decompression stubs, which handle runtime unpacking, can be edited in the src/stub/ directory, where architecture-specific assembly files (e.g., i386-linux.elf.S) define the stub behavior; after changes, rebuild the stubs using provided tools in contrib/stubtools/. To implement new compression algorithms, alter files like compress_lzma.cpp or compress_zlib.cpp in the src/ directory, integrating them into the packer framework. Adding support for new executable formats involves creating or extending packer classes, such as deriving from base classes in p_com.cpp or p_elf.cpp, and registering them in the format detection logic. These modifications require recompiling UPX to incorporate the changes.[20][4]
Binary releases distributed via GitHub include pre-built decompression stubs for common architectures, simplifying end-user deployment without source compilation. In contrast, building from source permits precise tailoring, such as enabling experimental features or optimizing for specific hardware constraints not covered in standard binaries.[3][4]
Version control is managed through Git, with the official repository at https://github.com/upx/upx; developers track changes locally and contribute improvements via pull requests, following the project's guidelines for code reviews and testing.[4]
Integration with Development Tools
UPX can be integrated into build systems to automate the compression of executables as a post-linking step, enhancing distribution efficiency without altering the core build process. In Makefiles, developers commonly append UPX invocation after the linking target, such as upx $(TARGET) to compress the resulting binary directly.[21] CMake supports this through the FindSelfPackers module, which detects and configures UPX for use in custom commands or install rules, allowing seamless incorporation into cross-platform projects.[22] For Microsoft Visual Studio environments, MSBuild tasks can invoke UPX via custom build steps in project files, typically targeting release configurations to minimize executable size before deployment.[23]
In continuous integration and continuous deployment (CI/CD) pipelines, UPX is frequently employed to compress release artifacts, reducing bandwidth and storage needs. GitHub Actions workflows often include dedicated steps using actions like ghaction-upx or upx-action, where commands such as upx --best release/*.exe are executed after compilation to pack binaries before publishing.[24] Similarly, Jenkins pipelines can incorporate UPX in shell scripts during deployment stages, ensuring compressed outputs for automated releases while maintaining pipeline reproducibility.[25]
UPX integrates with various development tools, particularly in toolchain workflows and debugging scenarios. It is compatible with MinGW and Clang compilers, where packed executables generated from these tools retain full functionality, as UPX supports the PE and ELF formats they produce.[1] For debugging, tools like GDB require unpacking the binary first with upx -d to access original symbols and code, avoiding issues with the compression stub.[26] IDA Pro users can analyze UPX-packed files by leveraging its built-in unpacking capabilities or manual techniques to locate the original entry point post-decompression.[27]
In reverse engineering contexts, UPX's -d option facilitates straightforward unpacking for static analysis, enabling tools like Detect It Easy (DIE) to identify and extract UPX-packed ELF or PE files through integrated scripts. This approach is standard for malware analysis, where decompression reveals the unpacked payload without needing specialized plugins.[28]
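Signature-based identification of this kind can be approximated with a few byte comparisons. The sketch below guesses the container format from well-known magic bytes and looks for the "UPX!" marker that unmodified UPX output normally carries. It is a heuristic for illustration only: the function names are hypothetical, fat (universal) Mach-O files are not handled, modified packers often strip or alter the marker, and an unrelated file may contain the same four bytes by chance.

```python
# Mach-O magic numbers as they appear on disk (big- and little-endian).
MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce", b"\xce\xfa\xed\xfe",  # 32-bit
    b"\xfe\xed\xfa\xcf", b"\xcf\xfa\xed\xfe",  # 64-bit
}


def guess_format(data: bytes) -> str:
    """Guess the executable container from the leading magic bytes."""
    if data[:4] == b"\x7fELF":
        return "ELF"
    if data[:2] == b"MZ":
        return "PE/DOS"  # both DOS and PE executables start with an MZ header
    if data[:4] in MACHO_MAGICS:
        return "Mach-O"
    return "unknown"


def looks_upx_packed(data: bytes) -> bool:
    """Heuristic: True if the UPX marker string occurs anywhere in the data."""
    return b"UPX!" in data


if __name__ == "__main__":
    fake_elf = b"\x7fELF" + b"\x00" * 60 + b"UPX!" + b"\x00" * 16
    print(guess_format(fake_elf), looks_upx_packed(fake_elf))
```

A positive result is best treated as a hint to try upx -d on a copy of the file, not as proof that the file was packed with UPX.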
UPX is utilized in software packaging processes, including distribution builds. Debian includes UPX as the upx-ucl package, allowing packagers to compress ELF binaries during Debian package creation to optimize repository space.[29] For mobile applications, its ELF support enables packing of native libraries within Android APK files, reducing app bundle sizes without impacting runtime performance.[1]
Best practices for UPX integration emphasize reliability and targeted application. Scripts should pin specific UPX versions to ensure consistent compression behavior across builds, mitigating risks from updates that might alter packing algorithms.[25] Packing should be conditional, applied only to release builds to preserve debuggability in development versions, and always followed by verification of the unpacked executable to confirm integrity.[21]
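Version pinning can be enforced with a short check before the packing step. The sketch below parses a version number out of the text printed by upx -V and compares it with a required version. It assumes that the output contains a token of the form "upx 5.0.2"; that format, the sample string in the usage lines, and the function names are assumptions of this example and should be checked against the UPX build actually in use.

```python
import re
from typing import Optional, Tuple


def parse_upx_version(output: str) -> Optional[Tuple[int, ...]]:
    """Extract a dotted version such as (5, 0, 2) from version output text."""
    match = re.search(r"\bupx\s+(\d+(?:\.\d+)+)", output, re.IGNORECASE)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_pinned(output: str, required: Tuple[int, ...]) -> bool:
    """True only if the parsed version equals the required version exactly."""
    return parse_upx_version(output) == required


if __name__ == "__main__":
    sample_output = "upx 5.0.2\n"  # assumed shape of the first output line
    print(parse_upx_version(sample_output))
    print(is_pinned(sample_output, (5, 0, 2)))
```

An exact match is stricter than a minimum-version check; it trades convenience for reproducible output across build machines.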
Limitations and Considerations
Performance Impacts
UPX typically achieves a file size reduction of 50% to 70% for programs and DLLs, making it particularly effective for decreasing disk space and network transfer requirements. This compression ratio is determined by comparing the original file size to the packed size, as reported by the upx -l command, and is more substantial for executables dominated by code sections rather than embedded data.[1][30]
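Because "ratio" and "reduction" are easy to confuse, the arithmetic is worth spelling out. The sketch below assumes that the reported ratio is packed size divided by original size, so that the size reduction is one minus that ratio; the function names are hypothetical.

```python
def packed_ratio(original_size: int, packed_size: int) -> float:
    """Packed size as a fraction of the original size."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return packed_size / original_size


def size_reduction(original_size: int, packed_size: int) -> float:
    """Fraction of the original size that was saved by packing."""
    return 1.0 - packed_ratio(original_size, packed_size)


if __name__ == "__main__":
    original, packed = 1_000_000, 400_000
    print(f"ratio: {packed_ratio(original, packed):.0%}")
    print(f"reduction: {size_reduction(original, packed):.0%}")
```

Under that assumption, a 1,000,000-byte executable packed to 400,000 bytes has a ratio of 40% and a reduction of 60%, which falls inside the 50% to 70% range quoted above.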
At runtime, UPX introduces a modest overhead in startup time owing to the execution of its decompression stub, which can slow initial loading by a few milliseconds to hundreds of milliseconds depending on the binary size and hardware; however, this effect is negligible for applications that run for extended periods. Post-decompression, there is no ongoing performance penalty, as the unpacked executable behaves identically to its original form.[31][1]
In terms of memory usage, UPX employs in-place decompression, which avoids persistent overhead after startup but involves a temporary allocation during unpacking equivalent to the size of the decompressed data. This process can lead to higher peak memory consumption compared to non-packed executables, especially in scenarios where multiple instances run concurrently without shared memory mapping.[1][32]
The time required for compression varies by selected options; default levels (around -7 or -8) complete quickly, often in seconds for files under 1 MB, while the --best mode prioritizes ratio over speed and can take considerably longer for only marginal additional gains. Algorithm selection influences these trade-offs, with options like LZMA providing superior compression ratios at the expense of increased packing duration, though decompression remains rapid at over 500 MB/s on modern systems.[30][1]
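The decompression throughput quoted above allows a rough estimate of the startup cost of unpacking. The sketch below divides the unpacked size by an assumed throughput. It is back-of-the-envelope arithmetic, not a measurement: the 500 MB/s default is taken from the figure above, real throughput depends on hardware and algorithm, and the function name is hypothetical.

```python
def estimated_unpack_ms(unpacked_bytes: int,
                        throughput_mb_per_s: float = 500.0) -> float:
    """Estimate unpacking time in milliseconds from size and throughput."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput_mb_per_s must be positive")
    megabytes = unpacked_bytes / 1_000_000  # decimal megabytes
    return megabytes / throughput_mb_per_s * 1000.0


if __name__ == "__main__":
    for size in (1_000_000, 50_000_000):
        print(size, "bytes ->", estimated_unpack_ms(size), "ms (estimate)")
```

At the assumed rate, a 1 MB image costs about 2 ms and a 50 MB image about 100 ms, consistent with overhead that ranges from a few milliseconds to hundreds of milliseconds depending on binary size.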
