Nm (Unix)
| nm | |
|---|---|
| Original authors | Dennis Ritchie, Ken Thompson (AT&T Bell Laboratories) |
| Developers | Various open-source and commercial developers |
| Initial release | November 3, 1971 |
| Written in | C |
| Operating system | Unix, Unix-like, Plan 9 |
| Platform | Cross-platform |
| Type | Command |
| License | Plan 9: MIT License |
nm is a Unix command used to dump the symbol table, together with the attributes of each symbol, from a binary executable file (including libraries, compiled object modules, shared-object files, and standalone executables).
The output from nm distinguishes between various symbol types. For example, it differentiates between a function that is supplied by an object module and a function that is required by it. nm is used as an aid for debugging, to help resolve problems arising from name conflicts and C++ name mangling, and to validate other parts of the toolchain.
This command is shipped with a number of later versions of Unix and similar operating systems including Plan 9. The GNU Project ships an implementation of nm as part of the GNU Binutils package.
The etymology is that in the old Version 7 Unix, nm's manpage used the term name list instead of symbol table.[1]
nm output sample
/*
 * File name: test.c
 * For C code compile with:
 * gcc -c test.c
 *
 * For C++ code compile with:
 * g++ -c test.cpp
 */
int global_var;
int global_var_init = 26;
static int static_var;
static int static_var_init = 25;

static int static_function()
{
    return 0;
}

int global_function(int p)
{
    static int local_static_var;
    static int local_static_var_init = 5;
    local_static_var = p;
    return local_static_var_init + local_static_var;
}

int global_function2()
{
    int x;
    int y;
    return x + y;
}

#ifdef __cplusplus
extern "C"
#endif
void non_mangled_function()
{
    // I do nothing
}

int main(void)
{
    global_var = 1;
    static_var = 2;
    return 0;
}
If the previous code is compiled with the gcc C compiler, the output of the nm command is the following:
# nm test.o
0000000a T global_function
00000025 T global_function2
00000004 C global_var
00000000 D global_var_init
00000004 b local_static_var.1255
00000008 d local_static_var_init.1256
0000003b T main
00000036 T non_mangled_function
00000000 t static_function
00000000 b static_var
00000004 d static_var_init
When the C++ compiler is used, the output differs:
# nm test.o
0000000a T _Z15global_functioni
00000025 T _Z16global_function2v
00000004 b _ZL10static_var
00000000 t _ZL15static_functionv
00000004 d _ZL15static_var_init
00000008 b _ZZ15global_functioniE16local_static_var
00000008 d _ZZ15global_functioniE21local_static_var_init
         U __gxx_personality_v0
00000000 B global_var
00000000 D global_var_init
0000003b T main
00000036 T non_mangled_function
The differences between the outputs also show an example of solving the name mangling problem by using extern "C" in C++ code.
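The effect of extern "C" can be checked mechanically by comparing the two listings. The following is a minimal Python sketch and not part of any nm implementation: the helper name symbol_names is hypothetical, and the sample text is a hand-copied subset of the two listings above. It reports which names are spelled identically in the C and C++ output.

```python
def symbol_names(nm_output):
    """Return the set of symbol names in BSD-style nm output (the name is the last field)."""
    names = set()
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) >= 2:  # skip blank or malformed lines
            names.add(fields[-1])
    return names


# Subset of the listing produced by the C compiler.
C_OUTPUT = """\
0000000a T global_function
00000025 T global_function2
00000004 C global_var
00000000 D global_var_init
0000003b T main
00000036 T non_mangled_function
"""

# Subset of the listing produced by the C++ compiler.
CPP_OUTPUT = """\
0000000a T _Z15global_functioni
00000025 T _Z16global_function2v
         U __gxx_personality_v0
00000000 B global_var
00000000 D global_var_init
0000003b T main
00000036 T non_mangled_function
"""

# Names present in both listings were not altered by C++ name mangling.
unmangled = sorted(symbol_names(C_OUTPUT) & symbol_names(CPP_OUTPUT))
print(unmangled)
```

In these listings the global variables and main keep their names in both cases, while non_mangled_function keeps its name in the C++ build only because it is declared extern "C".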
| Symbol type | Description |
|---|---|
| A | Global absolute symbol |
| a | Local absolute symbol |
| B | Global bss symbol |
| b | Local bss symbol |
| D | Global data symbol |
| d | Local data symbol |
| f | Source file name symbol |
| R | Global read-only symbol |
| r | Local read-only symbol |
| T | Global text symbol |
| t | Local text symbol |
| U | Undefined symbol |
External links
- nm: write the name list of an object file – Shell and Utilities Reference, The Single UNIX Specification, Version 5 from The Open Group
- nm(1) – Plan 9 Programmer's Manual, Volume 1
Nm (Unix)
nm is a command-line utility in Unix-like operating systems used to display the symbol table contents of object files, executable files, shared libraries, or object-file archives.[1] It lists symbols such as functions, global variables, and their attributes including type, value, and size, aiding developers in debugging, linking analysis, and understanding binary structures.[2] Defined as an optional utility under the POSIX.1-2017 standard in the Software Development Utilities option group, nm originated from early Unix systems and is implemented in various forms across distributions like GNU/Linux, BSD, and Solaris.[1]
The tool processes input files in formats such as ELF (Executable and Linkable Format) or COFF (Common Object File Format), extracting symbolic information unless stripped during compilation.[2] By default, it sorts output by symbol name and uses a BSD-style format showing the symbol's address, type (e.g., T for text/code, D for data, U for undefined), and name; alternative formats like POSIX or SysV can be specified for portability or detailed views.[1] Common options include -g to show only external (global) symbols, -u for undefined symbols, and -A to prefix each line with the file or library name, facilitating analysis of dependencies in large projects.[2]
In the GNU implementation, part of the Binutils package maintained by the Free Software Foundation, nm supports dynamic symbols via -D for shared objects and demangling of C++ names, enhancing usability in modern software development.[2] While POSIX mandates basic functionality like portable output with -P, extensions in specific systems (e.g., XSI options like -e for external/static symbols) provide additional flexibility without breaking compatibility.[1] If no symbolic information is present—common in optimized or stripped binaries—nm reports this without error, returning success status zero.[1] Overall, nm remains a fundamental tool for reverse engineering, build troubleshooting, and ensuring symbol resolution in Unix environments.[2]
Overview
Purpose and Functionality
The nm command is a standard command-line utility in Unix-like operating systems designed to list symbols, such as functions, global variables, and other identifiers, from object files, static libraries, or executable binaries. It extracts and displays information from the symbol tables embedded in these files, providing developers with visibility into the program's structure without executing it. This tool is part of the GNU Binutils suite in most Linux distributions and is available across various Unix variants, including BSD and Solaris implementations.[2][3]
Symbol tables in object files serve as repositories for essential metadata used during compilation, linking, and debugging processes. These tables contain entries detailing symbol names (stored as strings), types (e.g., function, object, or section), values (such as addresses or offsets), and sizes (in bytes, if applicable). In the Executable and Linkable Format (ELF), the predominant standard for Unix-like systems, symbol tables are organized into sections like .symtab for general symbols and .dynsym for dynamic linking, enabling the linker to resolve references and the debugger to map addresses to meaningful names. This structure facilitates both static analysis and runtime support, ensuring symbols can be relocated or inspected as needed.[4]
The primary applications of nm revolve around software development and analysis tasks. In debugging, it helps identify undefined symbols (marked as type 'U'), which signal linking errors or missing dependencies, allowing developers to troubleshoot unresolved references early. It also supports reverse engineering by revealing the exported and imported symbols in a binary, aiding in understanding proprietary or obfuscated code without source access. Additionally, nm verifies compilation outputs by confirming that expected symbols from source code are correctly generated and preserved in the resulting artifacts, which is crucial for build system validation and toolchain integrity checks.[2][3]
The utility supports multiple object file formats aligned with Unix standards, with a primary focus on ELF for relocatable objects, executables, and shared libraries in Linux and other POSIX-compliant systems; different implementations support various formats: for example, the GNU version handles COFF (used in some historical Unix environments) and ELF, while the macOS version handles Mach-O (prevalent in Darwin-based systems like macOS). As a non-interactive tool, nm produces plain text output to standard output (stdout), making it suitable for scripting and integration with other commands via piping for automated analysis workflows.[3]
History and Development
The nm command originated in the earliest releases of Unix from AT&T Bell Labs, with an initial release dated November 3, 1971, where it formed part of the core toolchain for examining symbol tables in object files generated by assemblers and linkers.[5] This early implementation allowed developers to list symbols such as functions and variables along with their addresses and types, facilitating debugging and program analysis in the resource-constrained environment of PDP-11 systems.[5] Developed by Bell Labs engineers, nm was tailored to the a.out executable format, the default binary structure in pre-BSD Unix distributions, which organized code into text, data, and BSS segments.[5]
As Unix spread beyond Bell Labs, the command evolved alongside system variants; it was incorporated into the Berkeley Software Distribution (BSD) starting from early releases like 1BSD in 1977, adapting to Berkeley's enhancements like virtual memory support while retaining compatibility with a.out. Standardization efforts culminated in its inclusion in POSIX.2 (IEEE Std 1003.2-1992), which defined its core interface for portability across conforming systems, with refinements in subsequent versions like POSIX.1-2008 to address evolving utility behaviors. It is defined as an optional utility in POSIX.1-2017 within the Software Development Utilities option group.[1]
Key milestones in the open-source era began in 1987, when the Free Software Foundation (FSF) launched the GNU nm as part of its broader initiative to build a complete Unix-compatible operating system, emphasizing freedom to modify and redistribute software tools.[6] By the early 1990s, GNU nm was integrated into the GNU Binutils suite, a collection of binary manipulation utilities that became essential for GCC-based development environments. This period also saw significant adaptations, as Unix-like systems transitioned from a.out to the Executable and Linkable Format (ELF) in the mid-1990s, starting with Solaris in 1992 and adopted by Linux in 1996, which required nm to parse new symbol table structures for dynamic linking and shared libraries.[7] Despite the rise of comprehensive disassemblers, nm maintains relevance in contemporary Linux distributions, macOS (adapted for Mach-O binaries), and embedded systems like those using uClibc, where its lightweight symbol inspection supports firmware debugging and build verification.[2]
Usage
Command Syntax
The nm command follows the POSIX-specified syntax for listing symbols from object files, with the general form nm [options] file..., where file denotes one or more pathnames to object files, executable files, or object-file libraries (such as those in archive format).[1] Object files typically have a .o extension, archives use .a, and executables are standalone binaries containing symbol tables.[1] In POSIX-compliant systems, if no files are specified, nm writes a diagnostic message to standard error and exits with a non-zero exit status; implementations like GNU nm instead default to processing a.out, the conventional name for the default executable or object file in Unix-like environments.[2][1]
When multiple files are specified, nm processes them sequentially, producing output for each and prefixing the results with the filename (or full pathname if the -A option is used) to distinguish symbols from different inputs.[1] This behavior ensures clear separation in the output stream, though the exact formatting adheres to the chosen output style (default or POSIX with -P).[1] The command does not read from standard input as a file operand; instead, it expects explicit file arguments or the a.out fallback.[2]
Upon encountering errors, such as invalid or unreadable files, nm terminates with a non-zero exit status (typically 1), indicating failure, while successful execution returns 0.[1] The core syntax is POSIX-compliant, ensuring portability across conforming Unix systems, but implementations like GNU nm introduce extensions such as additional options for handling dynamic symbols or specific object formats without altering the basic invocation pattern.[2]
Key Options and Flags
The nm command supports a variety of options to control the symbols displayed, their formatting, sorting, and filtering, allowing users to tailor output for debugging, analysis, or scripting needs. These options are largely standardized in POSIX but include implementation-specific extensions, particularly in GNU binutils and BSD variants. Common options focus on symbol selection and basic output modifications, while advanced flags handle demangling, dynamic symbols, and sorting behaviors.
Common Options
- The -a or --debug-syms flag includes all symbols in the output, even those typically used only by debuggers, providing a complete view of the symbol table.[2][8]
- The -g or --extern-only option restricts output to external (global) symbols, excluding local ones to focus on inter-module references.[9][2][8]
- The -u or --undefined-only option displays only undefined symbols, useful for identifying unresolved dependencies during linking.[9][8]
- The -t radix or --radix=radix option specifies the base for numeric symbol values, where d selects decimal, o octal, and x hexadecimal, enabling compatibility with different output preferences.[9][2]
Sorting and Formatting
- The -n or --numeric-sort flag sorts symbols by their memory address in ascending order, facilitating analysis of code layout.[2][8]
- The -p or --no-sort flag outputs symbols in the order they appear in the file, preserving the original structure without imposing any sorting.[2][8]
- The -C or --demangle[=style] option decodes mangled symbol names, such as those from C++ compilers, into readable form (e.g., converting _Z3foov to foo()), with styles like gnu or auto for flexibility.[2][8]
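The difference between the default name ordering and -n can be illustrated with a small Python sketch. It does not invoke nm: the tuples are hand-copied from the C sample output earlier in this article, and plain string comparison stands in for nm's name collation, which is an assumption that may not hold in every locale.

```python
# (value, type, name) tuples copied by hand from the C sample output.
symbols = [
    (0x0A, "T", "global_function"),
    (0x25, "T", "global_function2"),
    (0x3B, "T", "main"),
    (0x36, "T", "non_mangled_function"),
    (0x00, "t", "static_function"),
]

# Default behaviour: order by symbol name.
by_name = sorted(symbols, key=lambda sym: sym[2])

# Equivalent of -n / --numeric-sort: order by address.
by_address = sorted(symbols, key=lambda sym: sym[0])

for value, sym_type, name in by_address:
    print(f"{value:08x} {sym_type} {name}")
```

Sorting by address places static_function first and main last, which mirrors the order in which the functions appear in the source file.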
Filtering
- The -D or --dynamic flag displays symbols from the dynamic symbol table, relevant for shared libraries and runtime loading.[2][8]
- The -A or --print-file-name flag prefixes each line with the full pathname of the object file or archive, aiding in multi-file inspections.[9][2]
- The -l or --line-numbers flag appends source file names and line numbers for symbols, if debugging information is available, enhancing traceability (note: in some BSD variants like OpenBSD, -l instead shows archive indices).[2][8]
GNU-Specific Options
In the GNU implementation, additional flags extend functionality:
- --defined-only (or -U in some contexts) shows only symbols defined in the file, excluding undefined ones.[2]
- --no-demangle disables name demangling and outputs raw mangled names; this is the default behavior.[2]
- --size-sort sorts symbols by their sizes (requiring -S for size display in BSD-style output), useful for identifying large functions or data.[2]
Deprecated or Variant Options
The -o flag differs between implementations: as a POSIX XSI extension and in traditional Unix systems it writes numeric values in octal (equivalent to -t o), whereas in GNU nm it is a synonym for -A and --print-file-name, so -t o is the portable way to request octal output.[9][2] Options can be combined for targeted use, such as nm -C -D to demangle and display dynamic symbols readably.[2]
Output Interpretation
Format Components
The standard output format of the nm command, used across Unix implementations, follows the BSD style by default, consisting of lines with fields separated by spaces: an address (or value), an optional size, a symbol type indicator, and the symbol name.[2][10] This format can vary based on options and the chosen output style, such as SysV or POSIX, but the BSD format remains the most common for displaying symbol information from object files, executables, or archives.[9]
The address field represents the hexadecimal memory address or offset of the symbol within the object file, padded to a fixed width depending on the architecture (e.g., 8 digits for 32-bit systems or 16 digits for 64-bit ELF files); for undefined symbols, the field is left blank, as in the sample listings in this article.[2][10] The type field is a single-letter code indicating the symbol's section or attributes (e.g., T for text section, D for data, U for undefined), with uppercase typically denoting global symbols and lowercase for local ones.[2]
The size field, which shows the byte length of the symbol, is optional and appears only when specified via options like -S or --print-size in BSD format, or inherently in POSIX format; it is absent for undefined symbols.[2][10] The name field contains the symbol's identifier, which may include version information (e.g., @VER suffix); if the -C or --demangle option is used, C++ mangled names are decoded into readable form, though this can be disabled with --no-demangle.[2][10]
For output involving multiple files or archive members, options like -A, -o, or --print-file-name prepend the filename or archive path to each line (e.g., filename: address type name), facilitating distinction between symbols from different sources.[2][10] Architecture-specific variations affect field widths, such as longer hexadecimal addresses in 64-bit ELF binaries compared to 32-bit a.out formats, ensuring compatibility with the target's object file layout.[2] Other output formats, like POSIX (invoked with -P or --format=posix), rearrange fields to name type value size (with filename prefix if applicable) and support radices like decimal or octal via -t, promoting portability across Unix systems.[9]
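The BSD-style field layout described in this section can be sketched as a small Python parser. This is an illustrative sketch, not code from any nm implementation: the names Symbol and parse_bsd_line are hypothetical, and it assumes that numeric columns are zero-padded (so they are longer than one character) and that the type is a single letter.

```python
from typing import NamedTuple, Optional


class Symbol(NamedTuple):
    value: Optional[int]  # None when the value column is blank (undefined symbols)
    size: Optional[int]   # None unless a size column is present (e.g. with -S)
    kind: str             # single-letter type code such as T, d, or U
    name: str


def parse_bsd_line(line, radix=16):
    """Parse one line of BSD-style nm output into a Symbol."""
    fields = line.split()
    numbers = []
    # Numeric columns come first; the type letter is the first one-character field.
    while len(fields) > 2 and len(fields[0]) > 1 and len(numbers) < 2:
        numbers.append(int(fields.pop(0), radix))
    if len(fields) < 2 or len(fields[0]) != 1:
        raise ValueError(f"not a BSD-style symbol line: {line!r}")
    value = numbers[0] if numbers else None
    size = numbers[1] if len(numbers) > 1 else None
    # Demangled names may contain spaces, so rejoin everything after the type.
    return Symbol(value, size, fields[0], " ".join(fields[1:]))


print(parse_bsd_line("0000003b T main"))
print(parse_bsd_line("                 U printf"))
print(parse_bsd_line("0000000000000000 0000000000000017 T main"))
```

Lines that carry a file-name prefix (from -A) or that use the POSIX layout would need a different parser; this sketch handles only the plain BSD layout.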
Symbol Types and Meanings
In the output of the nm command, symbol types are indicated by single-letter codes that distinguish between local and global symbols, as well as their sections and attributes. Lowercase letters denote local symbols, which are not visible outside the object file, while uppercase letters indicate global or external symbols that can be referenced across files during linking.[3] This distinction aids in understanding symbol visibility and linkage requirements.
Local symbols include types such as t for symbols in the text (code) section, d for initialized data sections, and b for uninitialized (BSS) data sections.[3] These represent internally defined elements like static functions or variables that do not require external resolution. Global symbols, conversely, feature T for text sections (indicating executable code locations that may be called from other modules), D for initialized data, and U for undefined external references, which signal that the linker must resolve them from other object files or libraries.[3] The U type is particularly useful for identifying dependencies during the build process, as unresolved U symbols can lead to linking errors if not provided.[3]
Special symbol types provide additional context for non-standard or auxiliary symbols. These include N for debugging symbols (used in symbol tables for tools like debuggers), W for weak external symbols (which prefer a strong definition if available but do not cause errors otherwise), and G for symbols in small initialized data sections (optimized for certain architectures).[3] In GNU implementations, I denotes an indirect reference to another symbol, while i marks indirect functions as a GNU extension; V indicates weak object symbols, often for common blocks in dynamic contexts.[3] File-specific types encompass a for local absolute symbols (with fixed values independent of relocation) and ? for unknown or format-specific types that do not fit standard categories.[3]
When examining dynamic symbols (via specific output modes), types like V may appear for common blocks in shared libraries, highlighting shared data allocations.[3] Implementations vary slightly: BSD variants, such as FreeBSD, use similar conventions but omit GNU-specific extensions like G, I, and i, instead relying on core types like R/r for read-only data and lacking explicit indirect indicators. These differences ensure compatibility within their respective ecosystems while maintaining the fundamental purpose of symbol classification for debugging and linking analysis.[3]
| Type | Description (GNU/BSD Common) | Scope | Example Implication |
|---|---|---|---|
| T / t | Text/code section | Global/Local | T marks callable functions; t for static ones.[3] |
| D / d | Initialized data section | Global/Local | Global D for shared variables; local d for file-internal.[3] |
| B / b | Uninitialized (BSS) data | Global/Local | Zero-filled at runtime; used for global/local uninitialized static data.[3] |
| U | Undefined external | Global | Requires linker resolution; indicates import.[3] |
| A / a | Absolute value | Global/Local | Fixed address, no relocation needed.[3] |
| N | Debugging symbol | Special | Used by debuggers; not executable code.[3] |
| W / w | Weak symbol | Global | Overridable by strong symbols without error; on some systems uppercase indicates that a default value has been specified.[3] |
| ? | Unknown type | Special | Format-specific or unrecognized.[3] |
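The case convention summarized in the table can be expressed as a short lookup. The Python sketch below is illustrative only: the function name classify and the category labels are hypothetical, it covers just the letters listed in the table, and it treats weak symbols as global regardless of case, which is an assumption of this sketch.

```python
# Section letters whose case encodes scope (uppercase = global, lowercase = local).
SECTIONS = {"t": "text", "d": "data", "b": "bss", "a": "absolute"}


def classify(letter):
    """Map an nm type letter to a (category, scope) pair."""
    if letter == "U":
        return ("undefined", "global")
    if letter == "N":
        return ("debugging", "special")
    if letter == "?":
        return ("unknown", "special")
    if letter in {"W", "w", "V", "v"}:
        # Assumption: weak symbols are reported as global whatever their case.
        return ("weak", "global")
    category = SECTIONS.get(letter.lower(), "other")
    scope = "global" if letter.isupper() else "local"
    return (category, scope)


for letter in "TtDbU":
    print(letter, classify(letter))
```

Letters outside the table, such as R or G, fall into the "other" category here rather than being interpreted.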
Examples and Applications
Basic Usage Examples
The nm command is commonly used to inspect symbols in object files after compilation, helping developers verify that functions and variables have been correctly defined or to identify entry points like the main function.[2]
A basic invocation lists all symbols from a compiled object file, such as main.o, displaying their addresses, types, and names in a tabular format. For example:
nm main.o
0000000000000000 T main
0000000000000027 t helper_function
                 U printf
T indicates a symbol in the text (code) section, t a local text symbol, and U an undefined symbol requiring resolution during linking; such output aids in post-compilation verification.[2]
To diagnose linking issues, the -u option filters for only undefined symbols in an executable or object, as in nm -u program. For a program referencing external functions, the output could be:
U printf
U malloc
For shared libraries, combining -C (demangle) with -D (dynamic symbols) provides readable listings of exported functions, such as nm -DC libexample.so. A sample output might include:
0000000000000a50 T example_function(int)
0000000000000b20 T another_export
Piping nm output to grep supports targeted searches, like nm main.o | grep main, yielding only relevant lines such as 0000000000000000 T main to quickly locate specific entry points.[2]
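The same kind of filtering can be done on captured output in a script. The following Python sketch is an illustration rather than a replacement for nm -u: the helper name undefined_symbols is hypothetical, and it relies on the observation that undefined symbols have a blank value column, so the type letter is the first field on the line.

```python
def undefined_symbols(nm_output):
    """Return the names of undefined (type U) symbols in BSD-style nm output."""
    names = []
    for line in nm_output.splitlines():
        fields = line.split()
        # Undefined symbols have no value column, so "U" is the first field.
        if len(fields) >= 2 and fields[0] == "U":
            names.append(" ".join(fields[1:]))
    return names


# Sample text copied from the main.o example above.
sample = """\
0000000000000000 T main
0000000000000027 t helper_function
                 U printf
"""

print(undefined_symbols(sample))
```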
Advanced Scenarios
In advanced debugging scenarios, particularly for linked executables, the nm -D option is employed to examine dynamic symbols, revealing undefined symbols (marked as type 'U') that indicate dependencies on external shared libraries. This allows developers to verify runtime linking behavior without executing the program, such as identifying missing or unresolved external references in a binary like /path/to/binary. For instance, running nm -D /path/to/binary lists symbols from the dynamic symbol table, helping diagnose issues in large projects where dynamic linking introduces potential conflicts.[3]
When handling static archives, nm inspects library members directly; applying it to a file like libarchive.a enumerates all symbols across the archive's object files, prefixed with member names when using the -A or --print-file-name flag. This is crucial for auditing static libraries in multi-module builds, ensuring no unintended symbol overlaps occur before linking. The output format facilitates targeted analysis, with each symbol's value, type, and originating member clearly delineated.[3]
Cross-platform development, such as for embedded systems, leverages the --target=bfdname option to specify the object format, enabling nm to process binaries from non-native architectures like ARM bare-metal targets. For example, nm --target=elf32-littlearm /path/to/arm-binary decodes symbols in ARM ELF format, supporting verification of cross-compiled code without a full toolchain switch. This capability is essential in heterogeneous environments, where developers analyze firmware images for symbol integrity across architectures.[3][11]
Integration with scripting tools enhances nm's utility for automated symbol extraction in complex workflows; the output can be piped to awk or sed for filtering, such as extracting only undefined symbols with nm binary.o | awk '$1 == "U" {print $2}' (undefined symbols have a blank value column, so the type letter is the first field and the name the second). The -p or --no-sort flag ensures symbols appear in file order, preserving context for scripts processing large outputs and avoiding sorting overhead. This approach is common in build scripts for selective symbol reporting in version control or CI pipelines.[3][12]
Troubleshooting multi-module builds often involves detecting duplicate symbols or version mismatches using nm's detailed output; duplicates manifest as repeated entries across files (visible with -A), while the --with-symbol-versions option appends version tags (e.g., symbol@VER_2) to reveal incompatibilities between library versions. In scenarios like merging modules from disparate sources, this helps preempt linker errors by identifying conflicting definitions early. Additionally, -C or --demangle aids in interpreting mangled names for clearer duplicate identification.[3]
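A duplicate check of this kind can be sketched in Python over captured nm -A output. This is a minimal sketch under stated assumptions: the function name duplicate_definitions and the sample file names a.o and b.o are hypothetical, the input is BSD-style output without a size column, and undefined, weak, and common symbols are excluded because they do not clash at link time in the same way.

```python
from collections import defaultdict

# Type letters that do not count as strong definitions in this sketch.
NON_DEFINING = {"U", "W", "V", "C"}


def duplicate_definitions(nm_output):
    """Map each global symbol defined in more than one file to the files defining it."""
    definers = defaultdict(set)
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        # With -A the first field is "<file>:<value>" (or "<file>:" when undefined).
        origin = fields[0].rsplit(":", 1)[0]
        sym_type, name = fields[1], " ".join(fields[2:])
        if sym_type.isupper() and sym_type not in NON_DEFINING:
            definers[name].add(origin)
    return {name: sorted(files) for name, files in definers.items() if len(files) > 1}


# Hypothetical output of: nm -A a.o b.o
sample = """\
a.o:0000000000000000 T helper
a.o:                 U printf
b.o:0000000000000000 T helper
b.o:0000000000000010 T main
b.o:0000000000000020 t local_only
"""

print(duplicate_definitions(sample))
```

Here helper is reported because both files define it as a global text symbol, while the local symbol and the undefined reference are ignored.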
For performance considerations with large files, such as massive object dumps in enterprise-scale projects, the --no-sort option skips symbol sorting, significantly reducing processing time by outputting in encounter order—ideal for quick inspections where full collation is unnecessary. This flag is particularly beneficial when combined with output redirection or scripting, maintaining efficiency without compromising symbol accessibility.[3]
Implementations and Variants
GNU nm Features
The GNU implementation of nm has been part of the GNU Binutils collection since version 2.1, released in 1993, and provides comprehensive support for multiple object file formats and architectures including ELF, COFF, and Mach-O across various hosts and targets.[13][14] This integration allows nm to leverage the broader Binutils ecosystem, such as the BFD (Binary File Descriptor) library for portable object file handling, enabling analysis of binaries from diverse platforms without requiring separate tools.[3]
GNU nm extends the standard functionality with several long options and flags, including --dynamic (or -D) to display the dynamic symbol table of shared object files, which lists symbols relevant to runtime linking such as those in PLT entries or GOT references.[3] The --extern-only option (or -g) filters output to show only external symbols, aiding in the inspection of exported interfaces in libraries.[3] Additionally, the --format (or -f) option selects an output style such as bsd, sysv, or posix, allowing users to choose between traditional BSD-like listings, SysV-style output with additional section details, and the portable POSIX layout.[3]
Demangling in GNU nm is handled via the -C or --demangle[=style] option, which fully complies with the Itanium C++ ABI used by GCC, decoding mangled names from low-level object code into human-readable C++ identifiers including templates, namespaces, and overloaded functions.[3] This feature integrates with the accompanying c++filt tool for advanced filtering, ensuring accurate reversal of compiler-generated names while supporting styles like gnu-v3 for modern C++ standards.[3]
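To give a feel for what demangling reverses, the Python sketch below extracts only the unqualified name from the simplest Itanium-style mangled names, which consist of _Z, a decimal length, and then the identifier. It is illustrative only: the helper name simple_source_name is hypothetical, and parameter types, namespaces, templates, and every other encoding are deliberately ignored. Real demangling is better left to nm -C or c++filt.

```python
import re


def simple_source_name(mangled):
    """Return the identifier from a name of the form _Z<length><identifier>..., else None."""
    match = re.match(r"_Z(\d+)", mangled)
    if match is None:
        return None  # not mangled, or a form this sketch does not handle
    length = int(match.group(1))
    start = match.end()
    name = mangled[start:start + length]
    # Reject truncated input where fewer than <length> characters remain.
    return name if len(name) == length else None


for symbol in ["_Z3foov", "_Z15global_functioni", "_ZL10static_var", "main"]:
    print(symbol, "->", simple_source_name(symbol))
```

Names such as _ZL10static_var use additional prefixes and are intentionally left undecoded by this sketch.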
Target specification is facilitated by the --target=bfdname option, permitting analysis of non-native formats such as Windows PE executables via nm --target=pe-i386 executable.exe, which extends nm's utility beyond Unix-like systems to cross-platform debugging scenarios.[3] GNU nm also includes enhancements for ELF files, such as improved parsing for indirect function symbols (marked with 'i') and unique global symbols ('u'), along with options like --ifunc-chars=CHARS to customize their display characters, facilitating better handling of large symbol tables in modern binaries.[3]
Like the POSIX -t option, which accepts d (decimal), o (octal), and x (hexadecimal), GNU nm lets users choose the radix for printed values and additionally offers the long form --radix=radix, providing flexibility for users preferring base-10 output in reports or scripts.[3]
BSD and Other Unix Variants
In BSD-derived systems such as OpenBSD and NetBSD, the nm utility retains traditional Unix characteristics with a focus on simplicity and compatibility with a.out and ELF object formats, displaying symbol tables without demangling by default.[8] The -g option limits output to external (global) symbols only, similar to the GNU implementation.[8] Options like -n enable numeric sorting by symbol value, -u shows undefined symbols exclusively, and -a includes debugger symbol table entries; in NetBSD, -C provides optional demangling for low-level names such as those from C++, while OpenBSD lacks this option.[8][15] Output uses a BSD-style format by default, listing the symbol value in hexadecimal, a single-letter type indicator (e.g., 'T' for global text section symbols, 'U' for undefined, 'D' for data, with lowercase for local equivalents), and the symbol name, sorted alphabetically unless modified.[8]
FreeBSD's nm implementation, derived from GNU binutils but configured for BSD defaults, supports ELF files and emphasizes portability across architectures, with options like --extern-only (equivalent to -g) for global symbols and --format=bsd for traditional output.[16] It includes GNU extensions such as --demangle for name decoding and --dynamic for shared library symbols, but lacks native Mach-O support, focusing instead on Unix-like binaries.[16] NetBSD similarly adopts binutils nm with added compatibility for weak and indirect symbols, using type letters like 'w' for weak defined and 'i' for indirect references, which may vary in interpretation from pure BSD implementations.[15]
In System V derivatives like Solaris and illumos, nm targets ELF object files with SVR4 extensions, integrating closely with the link editor (ld) for symbol resolution and supporting symbol versioning through output fields indicating bind types (local or global).[17] Legacy System V/AT&T versions emphasized a.out format with limited options, such as -n for numeric sorting and -g for global symbols only, without built-in demangling or dynamic symbol handling. Modern Solaris nm adds -D for dynamic symbols, -p for parseable output (value, type, bind, section index, name), and -t for radix specification (decimal, octal, hexadecimal), with type letters including 'T' for text, 'R' for read-only data, and 'v' for weak versions.[17] These variants prioritize terse, machine-readable formats over verbose debugging, reflecting SVR4's emphasis on production linking workflows.[17]
The Darwin implementation in macOS and Xcode tools adapts nm for Mach-O binaries, using llvm-nm by default since Xcode 8, with options like -m to display section symbols as (segment_name, section_name) followed by visibility (external or non-external). It supports multi-architecture files via -arch (e.g., -arch arm64 for iOS or Apple Silicon), -g for global symbols only, and -n for numeric sorting, without automatic demangling unless specified. Symbol types follow Mach-O conventions, such as 'S' for section symbols and 'I' for indirect, differing from BSD ELF types in handling fat binaries and code-signing metadata.
Portability challenges arise from inconsistent symbol type letters across variants; for instance, BSD systems like OpenBSD use 'i' or 'I' for indirect symbols in a manner tied to a.out/ELF specifics, while GNU nm employs 'i' for indirect functions and adds types like 'u' for unique global symbols not present in traditional BSD.[8] These differences can affect script parsing, requiring format flags like -P for POSIX-compatible output in mixed environments.[8]
Related Tools and Alternatives
Comparison with objdump
The objdump utility, part of the GNU Binutils suite, provides a comprehensive analysis of object files by disassembling executable code, displaying file and section headers, and extracting various metadata, whereas nm is specialized for listing symbols from the symbol table without additional disassembly or header details.[18][3]
Both tools support common object file formats such as ELF and COFF, enabling them to process the same input files like executables or relocatable objects.[18][3] A key overlap exists in symbol handling: objdump's -t option dumps the symbol table in a manner similar to nm, but includes extra context such as symbol flags (e.g., local, global, weak), section names, and sizes, offering a more detailed view than nm's basic address-type-name format.[18]
nm excels in scenarios requiring rapid, targeted symbol queries due to its lightweight design and options like -u for undefined symbols only or sorting by name/address, producing concise output ideal for scripting and automation.[3]
objdump is preferable when disassembly is needed, such as with the --disassemble (or -d) option to view assembly instructions, or for inspecting section headers via --headers (or -h), which reveal layout and attributes not accessible in nm.[18]
nm does not report relocation information, which objdump can display with -r, nor does it provide hexadecimal dumps of sections, which objdump handles via -s.[18][3]
For instance, to quickly identify undefined symbols in an object file that might cause linker errors, nm file.o | grep ' U ' suffices, but for a thorough binary inspection including code flow and relocations, objdump -d -r file.o is more appropriate.[3][18]
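The grep-based filter works because, in the default BSD-style output, undefined symbols are printed with a blank value column. A minimal Python sketch of the same idea is shown here, assuming GNU-style default output; the `Symbol` record, the function names, and the sample text are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple, Optional


class Symbol(NamedTuple):
    value: Optional[int]  # None when nm prints no address (e.g. undefined symbols)
    kind: str             # single-letter type code such as 'T', 'U', 'd'
    name: str


def parse_nm_bsd(text: str) -> List[Symbol]:
    """Parse default (BSD-style) nm output into Symbol records."""
    symbols = []
    for line in text.splitlines():
        fields = line.split(None, 1)
        if len(fields) < 2:
            continue  # blank lines and "file.o:" headers
        if len(fields[0]) == 1:
            # The value column is blank, so the line starts with the type letter.
            symbols.append(Symbol(None, fields[0], fields[1].strip()))
            continue
        rest = fields[1].split(None, 1)
        if len(rest) < 2:
            continue
        try:
            value = int(fields[0], 16)
        except ValueError:
            continue  # not a symbol line
        symbols.append(Symbol(value, rest[0], rest[1].strip()))
    return symbols


def undefined_symbols(text: str) -> List[str]:
    """Return the names of symbols marked 'U' (undefined)."""
    return [s.name for s in parse_nm_bsd(text) if s.kind == "U"]


# Hypothetical nm output for an object file that calls printf.
SAMPLE = """\
0000000000000000 T global_function
0000000000000000 D global_var_init
                 U printf
0000000000000000 t static_function
"""

print(undefined_symbols(SAMPLE))  # ['printf']
```

Parsing into records rather than matching on ' U ' keeps the value, type, and name available for later checks, at the cost of assuming one particular output format.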
Integration in Build Processes
In Unix build processes, the nm tool is commonly invoked within Makefile rules to verify the presence of key symbols, such as the entry point 'main', prior to linking object files into an executable. For example, a Makefile rule might execute nm $< | grep -qw 'T main' to confirm the symbol is defined in the object file, failing the build if absent to prevent linking errors. This practice ensures symbol integrity early in the compilation pipeline.[3]
Automated CI/CD pipelines often incorporate nm with the -u or --undefined-only option to detect unresolved external symbols in linked binaries, catching potential runtime linking issues before deployment. Such checks can be scripted to run after compilation: for instance, a script can capture the output of nm -u output_executable and fail the job if it lists undefined symbols beyond those expected to be resolved by shared libraries at load time, thereby enforcing dependency completeness in continuous integration workflows.[3]
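A check of this kind might be sketched in Python as follows. This is an illustration under stated assumptions, not a standard tool: the allowlist approach, the function names, and the sample symbol names are hypothetical. Dynamically linked executables normally contain undefined symbols that the loader resolves, so the sketch compares against an allowlist instead of failing on any output. It also assumes GNU-style output, where names may carry an `@VERSION` suffix, and it ignores weak undefined symbols (type 'w' or 'v').

```python
import subprocess
import sys
from typing import Iterable, List


def unexpected_undefined(nm_output: str, allowed: Iterable[str]) -> List[str]:
    """Return undefined symbols from `nm -u` output that are not allowlisted."""
    allowed_set = set(allowed)
    unexpected = set()
    for line in nm_output.splitlines():
        fields = line.split()
        # Undefined entries have no value column: just the type and the name.
        if len(fields) != 2 or fields[0] != "U":
            continue
        # Assumption: strip a symbol-version suffix such as "@GLIBC_2.2.5".
        name = fields[1].split("@", 1)[0]
        if name not in allowed_set:
            unexpected.add(name)
    return sorted(unexpected)


def run_check(binary: str, allowed: Iterable[str]) -> int:
    """Run `nm -u` on a binary and return a process exit status (0 = pass)."""
    result = subprocess.run(
        ["nm", "-u", binary], capture_output=True, text=True, check=True
    )
    problems = unexpected_undefined(result.stdout, allowed)
    for name in problems:
        print(f"unexpected undefined symbol: {name}", file=sys.stderr)
    return 1 if problems else 0
```

A pipeline step could then call `sys.exit(run_check("output_executable", ["printf", "puts"]))`, where the allowlist holds the imports the project expects the system libraries to provide.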
When combined with the GNU linker (ld) and make, nm facilitates the generation of symbol maps by redirecting its output to .map files, as in nm linked_executable > symbols.map, which aids in debugging relocation and symbol resolution during the final linking stages. In the Linux kernel build system, nm is specifically used to extract symbols from the vmlinux binary via the command nm -n vmlinux | scripts/kallsyms, enabling the creation of a compressed symbol table for kernel debugging and module loading.[3][19][20]
For static analysis, nm output is frequently processed with tools like grep or awk to construct symbol dependency graphs, such as nm object.o | awk '$1 == "U" {print $2}' to list undefined symbols, which can then be matched against the objects or libraries that define them, helping developers visualize and resolve inter-file dependencies without full linking. In GCC and Clang-based toolchains, post-link invocation of nm with -D examines dynamic symbols in shared objects to generate export lists, ensuring only intended symbols are visible externally after the linking phase completes.[3][21]
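The dependency-graph idea can be sketched in Python by pairing each object's undefined symbols with the object that defines them. The function name, the file names, and the sample outputs below are hypothetical. The sketch assumes default BSD-style output and treats any uppercase type letter other than 'U' as a global definition, which is a simplification (it ignores, for example, GNU's lowercase 'u' unique globals).

```python
from typing import Dict, Set


def build_dependency_graph(nm_outputs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each object to the other objects that define symbols it needs."""
    defined: Dict[str, str] = {}        # symbol name -> object defining it
    needed: Dict[str, Set[str]] = {}    # object -> undefined symbol names
    for obj, text in nm_outputs.items():
        needed[obj] = set()
        for line in text.splitlines():
            fields = line.split()
            if len(fields) == 2 and fields[0] == "U":
                needed[obj].add(fields[1])
            elif len(fields) == 3 and fields[1].isupper() and fields[1] != "U":
                # First definition wins; duplicates would be a link error anyway.
                defined.setdefault(fields[2], obj)
    graph: Dict[str, Set[str]] = {}
    for obj, symbols in needed.items():
        graph[obj] = {
            defined[s] for s in symbols if s in defined and defined[s] != obj
        }
    return graph


# Hypothetical nm output for two object files.
SAMPLE_OUTPUTS = {
    "main.o": (
        "0000000000000000 T main\n"
        "                 U global_function\n"
        "                 U printf\n"
    ),
    "test.o": (
        "0000000000000000 T global_function\n"
        "0000000000000000 d static_var_init\n"
    ),
}

print(build_dependency_graph(SAMPLE_OUTPUTS))
```

Symbols that no listed object defines, such as `printf` in the sample, are simply left out of the graph; a fuller tool would report them as external dependencies.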
Best practices recommend employing the --defined-only option in release builds to audit exported symbols, since limiting exports reduces the binary's attack surface and can improve load times; for instance, nm --defined-only --format=posix shared.so lists only symbols defined in the file for verification against an approved export policy. This selective filtering promotes modular designs and compliance with visibility controls in production artifacts.[3]
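An export audit along these lines could be sketched in Python as follows. It assumes the POSIX output format, in which each line gives the name, the type letter, and then the value and size; the function names, the sample output, and the approved list are hypothetical. As a simplification, any uppercase type letter other than 'U' is treated as an externally visible definition.

```python
from typing import Iterable, List, Set, Tuple


def exported_symbols(posix_output: str) -> Set[str]:
    """Collect global defined symbol names from `nm --format=posix` output."""
    exported = set()
    for line in posix_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank lines and "file:" headers
        name, kind = fields[0], fields[1]
        if kind.isupper() and kind != "U":
            exported.add(name)
    return exported


def audit_exports(
    posix_output: str, approved: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (unapproved exports, approved symbols that are missing)."""
    exported = exported_symbols(posix_output)
    approved_set = set(approved)
    return sorted(exported - approved_set), sorted(approved_set - exported)


# Hypothetical POSIX-format output for a shared object.
SAMPLE = """\
global_function T 0000000000000000 0000000000000014
static_function t 0000000000000014 000000000000000b
internal_helper T 0000000000000020 0000000000000010
"""

unapproved, missing = audit_exports(SAMPLE, ["global_function", "non_mangled_function"])
print(unapproved)  # ['internal_helper']
print(missing)     # ['non_mangled_function']
```

Reporting both directions catches symbols that leaked into the export set as well as approved symbols that a build accidentally dropped.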