PATH (variable)
PATH is an environment variable on Unix-like operating systems, DOS, OS/2, and Microsoft Windows, specifying a set of directories where executable programs are located. In general, each executing process or user session has its own PATH setting.
History
Multics originated the idea of a search path. The early Unix shell only looked for program names in /bin, but by Version 3 Unix the directory was too large and /usr/bin, and a search path, became part of the operating system.[1]
Unix and Unix-like
On POSIX and Unix-like operating systems, the $PATH variable is specified as a list of one or more directory names separated by colon (:) characters.[2][3]
Directories in the PATH-string are not meant to be escaped, making it impossible to have directories with : in their name.[4]
The /bin, /usr/bin, and /usr/local/bin directories are typically included in most users' $PATH setting (although this varies from implementation to implementation). The superuser also typically has /sbin and /usr/sbin entries for easily executing system administration commands. The current directory (.) is sometimes included by users as well, allowing programs residing in the current working directory to be executed directly. System administrators as a rule do not include it in $PATH in order to prevent the accidental execution of scripts residing in the current directory, such as may be placed there by a malicious tarbomb. In that case, executing such a program requires specifying an absolute (/home/userjoe/bin/script.sh) or relative path (./script.sh) on the command line.
When a command name is specified by the user or an exec call is made from a program, the system searches through $PATH, examining each directory from left to right in the list, looking for a filename that matches the command name. Once found, the program is executed as a child process of the command shell or program that issued the command.
DOS, OS/2, and Windows
On DOS, OS/2, and Windows operating systems, the %PATH% variable is specified as a list of one or more directory names separated by semicolon (;) characters.[5]
The Windows system directory (typically C:\WINDOWS\system32) is typically the first directory in the path, followed by many (but not all) of the directories for installed software packages. Many programs do not appear in the path as they are not designed to be executed from a command window, but rather from a graphical user interface. Some programs may add their directory to the front of the PATH variable's content during installation, to speed up the search process and/or override OS commands. In the DOS era, it was customary to add a PATH {program directory};%PATH% or SET PATH={program directory};%PATH% line to AUTOEXEC.BAT.
When a command is entered in a command shell or a system call is made by a program to execute a program, the system first searches the current working directory and then searches the path, examining each directory from left to right, looking for an executable filename that matches the command name given. Executable programs have filename extensions of EXE or COM, and batch scripts have extensions of BAT or CMD. Other executable filename extensions can be registered with the system as well.
Once a matching executable file is found, the system spawns a new process that runs it.
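The search order described above can be sketched in Python as a pure string function. This is a simplified model, not the actual Windows loader logic: the function name `windows_candidates`, the `extensions` parameter, and the example directories are all hypothetical, and the default extension list only mirrors the four extensions named in the text.

```python
import ntpath

# Assumed default, taken from the extensions named in the article text.
DEFAULT_EXTENSIONS = (".COM", ".EXE", ".BAT", ".CMD")


def windows_candidates(command, path_value, cwd, extensions=DEFAULT_EXTENSIONS):
    """List the file names a DOS/Windows-style lookup would try, in order.

    The current directory comes first, then each PATH entry left to right.
    """
    directories = [cwd] + [d for d in path_value.split(";") if d]
    known = {ext.lower() for ext in extensions}
    if ntpath.splitext(command)[1].lower() in known:
        # The name already carries a recognised extension: try it as written.
        names = [command]
    else:
        # Otherwise try each executable extension in turn.
        names = [command + ext for ext in extensions]
    return [ntpath.join(d, name) for d in directories for name in names]


for candidate in windows_candidates("net", r"C:\Windows\system32", cwd=r"C:\Work"):
    print(candidate)
```

The first existing file in this list would be the one that runs, which is why a same-named file in the current directory takes precedence over the system copy.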
The PATH variable makes it easy to run commonly used programs located in their own folders. If used unwisely, however, the value of the PATH variable can slow down the operating system by searching too many locations, or invalid locations.
Invalid locations can also stop services from running altogether, especially the 'Server' service which is usually a dependency for other services within a Windows Server environment.
References
[1] McIlroy, M. D. (1987). A Research Unix reader: annotated excerpts from the Programmer's Manual, 1971–1986 (PDF) (Technical report). CSTR. Bell Labs. 139.
[2] Open Group Unix Specification, Environment Variables
[3] Open Group Unix Specification, execve() function
[4] Dash exec.c as an example of an implementation of a PATH-string parser
[5] Microsoft.com, PATH command
PATH (variable)
/bin and /usr/bin.[1]
In Microsoft Windows, PATH serves a similar purpose but uses semicolons to delimit directories, allowing the system to search for files with extensions like .exe, .com, .bat, and .cmd in the specified locations, starting with the current directory.[2] The variable is inherited by child processes, enabling consistent command resolution across sessions, and modifications can be temporary (via command prompt) or persistent (through system settings).[3] PATH plays a critical role in system administration and scripting, as improper configuration can lead to command-not-found errors or security risks from unintended executables in early-listed directories. Utilities and shells, such as bash or cmd.exe, rely on it to execute programs efficiently without requiring absolute paths, promoting usability in diverse computing environments.[4]
Fundamentals
Definition and Purpose
The PATH environment variable, found in operating systems that support command-line interfaces, contains a sequence of directory paths. These paths define the locations where the system searches for executable files when a user or script invokes a command by name rather than by its full pathname. This mechanism lets the operating system resolve bare command names by performing a linear search through the specified directories in the order listed.[5]
The primary purpose of PATH is to streamline command execution in interactive and scripted environments, eliminating the need to provide absolute or relative paths for commonly used programs. By centralizing the search locations, it promotes modularity in software design, enabling users to run executables from multiple directories without repetitive path specification, which enhances productivity and supports the development of extensible toolchains. For instance, shell interpreters like those in Unix-like systems use PATH to locate and launch commands seamlessly.[6]
PATH integrates directly with shell interpreters, such as bash on Unix-like systems or cmd.exe on Windows, to resolve commands during execution. Upon receiving a command like "ls" or "dir", the shell prepends each directory in PATH to the command name (appending a suitable separator if needed) and attempts to execute the resulting file until a match is found or the list is exhausted. If no match exists, the shell typically reports an error indicating the command is not found. This ordered search ensures predictable behavior while allowing customization of the executable discovery process.[7]
It is important to distinguish PATH from related environment variables like LD_LIBRARY_PATH, which serves a complementary but distinct role by specifying directories for locating shared libraries and dynamic dependencies at runtime, rather than for executable programs themselves. While both facilitate resource resolution, PATH focuses exclusively on command invocation, originating from early Unix systems and evolving into a ubiquitous feature across diverse platforms.[8]
Syntax and Basic Usage
The PATH environment variable consists of a colon-separated list of directory paths in Unix-like systems, where each path prefix indicates a location to search for executable files when a command is invoked without a full pathname.[5] In Windows, it uses a semicolon as the delimiter to separate paths.[2] Paths within PATH can be absolute (starting from the root directory) or relative, and the variable's value is a single string that the shell or command interpreter parses accordingly.[5][2]
To set or modify PATH, users typically use shell-specific commands that export the variable to the environment. In POSIX-compliant shells like sh or bash, the export builtin assigns a new value, often by prepending or appending directories to the existing PATH while preserving the original via variable expansion. For example, to prepend /usr/local/bin in bash:
export PATH="/usr/local/bin:$PATH"
Appending /opt/bin uses:
export PATH="$PATH:/opt/bin"
In the Windows Command Prompt, the set command or path builtin achieves this; for instance, to append C:\newdir:
set PATH=%PATH%;C:\newdir
Alternatively, using the path command:
path %PATH%;C:\newdir
.exe or .bat (in Windows).[5][2] This left-to-right order determines precedence: if multiple directories contain executables with the same name, the leftmost one is selected, potentially shadowing later versions.[5] In Unix-like systems, relative paths are supported, and including the current directory (denoted as .) explicitly in PATH enables searching there; zero-length path prefixes (such as a leading or trailing colon, or ::) also represent the current working directory.[5] In Windows, the current directory is searched before PATH unless explicitly cleared.[2]
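The Unix-like rules above (left-to-right precedence, and a zero-length prefix standing for the current directory) can be illustrated with a short Python sketch. The function name `resolve` and the injectable `is_executable` check are hypothetical conveniences for illustration, not part of any shell; a real shell performs the equivalent test against the filesystem.

```python
import os
import posixpath


def resolve(command, path_value, cwd=".", is_executable=None):
    """Return the first match for `command` under Unix-like PATH rules, or None."""
    if is_executable is None:
        # Default check against the real filesystem.
        def is_executable(p):
            return os.path.isfile(p) and os.access(p, os.X_OK)

    if "/" in command:
        # A name containing a slash is used as given; PATH is not consulted.
        return command if is_executable(command) else None

    for prefix in path_value.split(":"):
        directory = prefix if prefix else cwd  # zero-length prefix means cwd
        candidate = posixpath.join(directory, command)
        if is_executable(candidate):
            return candidate  # the leftmost directory wins
    return None


# Illustration with a made-up set of files instead of the real filesystem.
files = {"/usr/local/bin/python3", "/usr/bin/python3"}
print(resolve("python3", "/usr/local/bin:/usr/bin", is_executable=files.__contains__))
```

Swapping the two directories in the PATH string changes which copy of `python3` is returned, which is the shadowing effect the text describes.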
Historical Development
Origins in Multics and Early Unix
The concept of a search path for resolving commands and executables originated in the Multics operating system, developed in the 1960s by MIT, Bell Labs, and General Electric. Multics provided mechanisms for managing search paths through dedicated commands such as add_search_paths to append directories to a search list, delete_search_paths to remove them, set_search_paths to define lists explicitly, and print_search_paths to display current configurations. These features allowed the system to locate commands and files across hierarchical directories without requiring full path specifications, streamlining user interaction in a multi-user environment. This approach directly influenced early Unix developers at Bell Labs, including Ken Thompson, who had contributed to Multics and carried forward ideas for efficient command resolution.[10][11]
Unix Version 1, released in 1971 for the PDP-11 minicomputer, introduced the initial implementation of a search path mechanism as a simple, fixed directory list primarily limited to /bin for executable lookup. The Thompson shell, the command interpreter of the time, automatically scanned this directory when users entered command names, enabling basic program execution without explicit paths. This design reflected the resource constraints of the era, with the PDP-11's limited memory and storage prioritizing simplicity over flexibility. Ken Thompson and Dennis Ritchie, key architects of Unix, refined this in subsequent releases; through Unix Version 6 in 1975, the Thompson shell continued to use a hardcoded search path, primarily /bin, without support for environment variables or dynamic configuration.[11]
Early implementations exhibited significant limitations, including fixed paths without support for dynamic environment variable expansion or user customization, which often resulted in hardcoded binary locations within programs or the shell itself. These constraints persisted, complicating maintenance as the system grew; developers frequently embedded absolute paths in source code, reducing portability across installations. Despite these drawbacks, the search mechanism played a crucial role in enabling the development of portable Unix tools, such as the ed line editor and the an macro assembler, by standardizing executable discovery in shared directories and facilitating code reuse in the PDP-11 ecosystem. This foundational approach laid the groundwork for the environment variable's modern role in directing shells to locate executables efficiently.
Evolution and Standardization
The PATH variable was introduced with the Bourne shell, developed by Stephen Bourne in 1977 and released in Unix Version 7 in 1979. This shell supported environment variables, including $PATH as a colon-separated list of directories, allowing dynamic parameter expansion and greater flexibility in script portability while reducing hardcoding of directory paths. The Bourne shell was later integrated into AT&T's System V Unix releases starting in 1983.
The POSIX standardization process, from 1988 to 1990, addressed PATH under IEEE Std 1003.1 (POSIX.1), which specifies its use by exec functions for searching executables. The colon-separated format and requirements for default directories like /bin and /usr/bin to ensure availability of essential utilities such as ls and sh were formalized in IEEE Std 1003.2 (POSIX.2, 1992).[1][4]
During this period, PATH's influence extended beyond AT&T Unix variants, with Berkeley Software Distribution (BSD) implementations in the 1980s incorporating the variable for consistent command resolution in their evolving kernel and userland via shells like the C shell. This Unix-centric model also impacted non-Unix platforms, such as OpenVMS, where emulations like the GNU for VMS (GNV) package adopted PATH-like search paths to bridge compatibility for ported Unix tools.
In the 1990s and 2000s, Linux distributions, emerging post-1991, refined PATH handling to support longer paths up to 4096 characters via kernel limits defined in pathconf(2), addressing scalability in growing filesystems without fragmenting the variable. Integration with desktop environments further evolved PATH usage; GNOME and KDE leveraged it for dynamic application launching through session managers, ensuring environment variables propagated to graphical processes for seamless user workflows.
A pivotal milestone came with FIPS PUB 151-2 in 1993, which adopted POSIX.1-1990 as the standard for federal Unix systems under U.S. government procurement, promoting portability and including PATH as part of the required interfaces.[12]
Implementations by Operating System
Unix and Unix-like Systems
In Unix and Unix-like systems, the PATH environment variable specifies a colon-separated list of directories that the shell searches sequentially to locate executable files when a command is invoked without a full pathname.[5] This search mechanism enables users to run programs by name alone, promoting efficient command execution across the filesystem.[5] The default PATH in most Unix-like systems typically includes /usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin, encompassing essential user and system binaries while prioritizing local installations.[13] This configuration can be customized system-wide through files like /etc/profile, which is sourced for login shells, or /etc/bashrc for non-login interactive shells, and per-user via ~/.bashrc or ~/.profile for interactive sessions.[14][15] For instance, to append a directory to PATH in a user's ~/.bashrc, one would add export PATH="$PATH:/new/directory".[14]
Shells like Bash and Zsh handle PATH through parameter expansion, allowing references such as ${PATH} in scripts to access or modify the variable dynamically.[16] In Bash, command resolution is optimized via an internal hash table that caches the full pathnames of previously executed commands, avoiding repeated searches of PATH directories unless the checkhash option is enabled or the cache is cleared with hash -r.[16] This caching mechanism enhances performance for frequent command invocations but requires manual invalidation if executables are relocated within PATH.[16]
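The caching idea can be sketched in a few lines of Python. This is only a loose model of the behavior described above, not Bash's implementation: the class name `CommandCache` and its methods are hypothetical, and `shutil.which` stands in for the shell's PATH search.

```python
import shutil


class CommandCache:
    """Remember where commands were found, loosely like a shell hash table."""

    def __init__(self, lookup=shutil.which):
        self._lookup = lookup  # function that performs the real PATH search
        self._table = {}       # command name -> cached full pathname

    def resolve(self, command):
        if command not in self._table:
            found = self._lookup(command)
            if found is None:
                return None  # misses are not cached
            self._table[command] = found
        return self._table[command]

    def clear(self):
        """Forget all cached locations, in the spirit of `hash -r`."""
        self._table.clear()
```

The sketch also shows why a stale cache matters: once a location is stored, `resolve` keeps returning it until `clear` is called, even if the executable has since moved.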
For enhanced security, particularly in privileged contexts, PATH variants exclude user-writable directories to mitigate risks like command hijacking through malicious executables; a common secure setup for root is /bin:/usr/bin:/sbin:/usr/sbin, omitting paths such as /usr/local/bin or the current directory if they permit non-root writes. In Debian-based systems, PATH integrates with the alternatives mechanism, where update-alternatives manages symbolic links in standard directories like /usr/bin to switch between multiple implementations of a command (e.g., editors), ensuring the selected version is resolved via PATH without altering the variable itself.[17]
POSIX standards limit the total size of the environment (including PATH) and arguments to {ARG_MAX} bytes, typically 2,097,152 (2 MB) in Linux implementations, allowing PATH strings well beyond 1024 characters in practice. Individual directory paths within PATH adhere to PATH_MAX, which is 4096 bytes on Linux systems, enforced by the kernel to prevent overflow during resolution.[18] Exceeding these limits can lead to truncated searches or execution failures.
To query PATH resolution, utilities like type and which provide insights: type command displays how the shell interprets a name, including its hashed or searched pathname if external, while which command outputs the full path to the executable found first in PATH.[19] For example:
$ type ls
ls is hashed (/bin/ls)
$ which python3
/usr/bin/python3
DOS, OS/2, and Windows
In MS-DOS 2.0, introduced in 1983, the PATH environment variable specifies directories for searching executable files and is configured using the SET command within the AUTOEXEC.BAT file executed at boot.[20] Directories in the PATH are separated by semicolons, adhering to the 8.3 filename convention where filenames are limited to eight characters plus a three-character extension.[21] Unlike later systems, MS-DOS does not recurse into subdirectories, instead sequentially searching only the explicitly listed directories starting from the current working directory if no full path is provided.[20]
OS/2, released in 1987 as a collaborative effort between Microsoft and IBM, evolved the PATH mechanism by setting it via the SET statement in the CONFIG.SYS file, which loads environment variables at system startup.[22] This approach supported session-specific configurations, allowing distinct PATH values for OS/2 native sessions and compatibility modes like DOS or Windows, with paths separated by semicolons.[22] The implementation drew brief inspiration from Unix traditions but adapted to OS/2's multitasking environment for broader compatibility.
Windows NT and 9x series in the 1990s maintained semicolon delimiters for PATH entries while introducing configurable system and user environments accessible via the Control Panel's System Properties or the setx command for persistent changes.[23] These versions feature case-insensitivity in path matching, consistent with the NTFS and FAT file systems, and support expansion of %PATH% to append or prepend directories without overwriting the existing value.[2]
From Windows 2000 onward, PATH management integrated with advanced shells like PowerShell, where it is accessed as $env:PATH for scripting and automation.[7] User Account Control (UAC), introduced in Windows Vista, imposes considerations by filtering user-specific environment variables in elevated processes, relying primarily on system-wide PATH settings to mitigate privilege escalation risks.[3] Post-Windows 10 updates, such as the 2016 long path support via registry enabling, allow PATH variables up to 32,767 characters, approaching the full environment block limit while maintaining backward compatibility.[3]
For temporary modifications in the Command Prompt, the command set PATH=C:\Windows;%PATH% prepends the specified directory to the existing PATH without affecting permanent settings.[24]
set PATH=C:\Windows;%PATH%
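The same session-only effect can be observed from any program that launches children: a changed PATH is passed down to the child process while the parent's own environment stays as it was. The following Python sketch assumes a hypothetical helper name, `run_with_prepended_path`, and a made-up directory `/opt/example/bin`.

```python
import os
import subprocess
import sys


def run_with_prepended_path(directory, argv):
    """Run `argv` with `directory` placed first in the child's PATH."""
    env = os.environ.copy()  # work on a copy; the parent's PATH is untouched
    current = env.get("PATH", "")
    env["PATH"] = directory + os.pathsep + current if current else directory
    return subprocess.run(argv, env=env, capture_output=True, text=True)


# The child prints the PATH it inherited; the directory is hypothetical.
child = [sys.executable, "-c", "import os; print(os.environ['PATH'])"]
result = run_with_prepended_path("/opt/example/bin", child)
print(result.stdout.strip().split(os.pathsep)[0])
```

Using `os.pathsep` keeps the sketch portable, since it is a colon on Unix-like systems and a semicolon on Windows.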
Other Platforms and Variations
In OpenVMS, the PATH functionality is implemented through the DCL DEFINE DCLDISK:[],ddcu:[mytooldir],SYSPATH enables automatic resolution of commands without explicit file specifications, enhancing usability in the process logical name table.[25]
On macOS, which inherits Unix-like PATH behavior, package managers like Homebrew introduce variations by prepending platform-specific directories to the PATH environment variable for accessing installed tools.[26] For Intel-based systems, Homebrew adds /usr/local/bin to the front of PATH, while on Apple Silicon Macs, it uses /opt/homebrew/bin, ensuring Homebrew binaries take precedence without requiring sudo for execution.[26] Users configure this by evaluating the output of brew shellenv in their shell profile, such as ~/.zshrc, to dynamically set the PATH for the session.[26]
In mainframe environments like IBM z/OS, program execution relies on Job Control Language (JCL) statements such as STEPLIB and JOBLIB rather than a shell-like PATH variable, reflecting historical influences from OS/360 where library concatenation defined search orders.[27] STEPLIB specifies private load libraries for a single job step via a DD statement (e.g., //STEPLIB DD DSNAME=USER.LOADLIB,DISP=SHR), searched before system libraries, while JOBLIB applies job-wide and is ignored if STEPLIB is present in a step.[27] This mechanism differs from interactive shell PATH by focusing on batch job library resolution, with search precedence as STEPLIB, then JOBLIB, followed by the Link Pack Area (LPA) and system linkage editor libraries.[27]
Embedded real-time operating systems (RTOS) like FreeRTOS lack a runtime PATH environment variable, as they operate without a traditional file system or shell for executable searches; instead, paths are managed statically at compile time through include directories specified in build configurations. For instance, FreeRTOS projects require compiler include paths to point to FreeRTOS/Source/include and portable layers, ensuring kernel headers are accessible during builds without dynamic resolution.[28]
Android, as a Linux-based system with Java roots, deviates from binary PATH usage by emphasizing DEX (Dalvik Executable) files for app bytecode, where class loading occurs via the Android Runtime (ART) directly from APK packages rather than a global CLASSPATH or PATH.[29] DEX files contain compiled Java classes optimized for mobile execution, loaded into an app's isolated process without reliance on environment variables like PATH for binaries; instead, the system uses intents and component instantiation for runtime behavior.[29]
Cross-platform build tools address PATH portability by abstracting OS-specific path handling. In CMake, paths are normalized to absolute forms for variables typed as PATH or FILEPATH, and the tool leverages the host's PATH environment to locate compilers and dependencies across platforms like Unix Makefiles or Visual Studio generators.[30] Similarly, Go's path/filepath package ensures portability by using OS-specific separators (e.g., / on Unix, \ on Windows) in functions like Join and Clean, which normalize paths and resolve relative components without platform-dependent code.[31] During builds, Go's go build command supports cross-compilation by setting environment variables like GOOS and GOARCH, indirectly relying on the developer's PATH for toolchain access while producing binaries executable on target systems.[32]
Security and Best Practices
Common Vulnerabilities
One common vulnerability arises from including the current directory (denoted by ".") in the PATH variable, which can enable local command injection. When "." appears ahead of the system directories, the current working directory is searched for executables first, allowing a malicious file with a common name (e.g., "ls" or "sudo") placed in that directory to be executed unintentionally; even when "." is listed last, a file named after a common typo (e.g., "sl" instead of "ls") can still be run by mistake. This risk is particularly acute in multi-user systems or when running untrusted scripts, as it bypasses the need for explicit execution (e.g., "./malicious").[33][34]
Shadowing vulnerabilities occur when malicious programs are placed in directories that appear early in the PATH search order, overriding legitimate system tools. For instance, an attacker could create a fake "net.exe" in a user-writable directory listed before "C:\Windows\system32", causing commands like "net" to execute the malicious version instead, potentially leading to privilege escalation or data exfiltration. This technique has been employed in post-exploitation frameworks like Empire and PowerSploit.[34][35]
Path traversal attacks can exploit manipulated PATH variables containing sequences like "../" to access unauthorized executables outside intended directories. In scenarios where user input influences PATH construction without sanitization (e.g., appending unvalidated input in scripts), an attacker might set PATH="../malicious:/bin" to redirect command resolution to parent directories, executing hidden payloads. A related class of environment-variable attack is Shellshock (CVE-2014-6271), in which specially crafted environment variables (function definitions rather than PATH entries) allowed remote code execution in bash processes, affecting millions of Unix-like systems via web servers and SSH.[36][37]
Case-sensitivity issues in mixed environments, particularly on Windows with its default case-insensitive filesystem, can lead to unintended executable matches and security bypasses. For example, an attacker might place a malicious file named "ViM.exe" (mixed case) in a PATH directory, which matches a Unix-like tool expectation like "vim" due to insensitivity, hijacking the command in cross-platform setups such as WSL or Cygwin.[38]
Historical incidents in 1990s Unix systems often involved trojan horses exploiting PATH for persistence, such as placing fake system binaries (e.g., "ps" or "who") in user directories to capture credentials when administrators ran commands without full paths. These attacks relied on users not specifying absolute paths, a weakness noted in early Unix security analyses as a vector for insider threats and unauthorized access.
In modern container environments like Docker, inheritance of the host's PATH without proper isolation can expose sensitive directory structures or enable execution of host binaries if mounts or privileges are misconfigured.[39]
Configuration Recommendations
To minimize security risks such as command shadowing, where malicious binaries could override system tools, configure the PATH variable with system directories preceding user-specific or custom ones. For instance, on Unix-like systems, prioritize entries like /usr/bin and /bin before appending user directories such as ~/bin.[40] This order ensures that trusted system executables are discovered first during command resolution. Additionally, avoid including the current directory (.) in PATH unless absolutely necessary for specific workflows, as it introduces risks of executing unintended or malicious files in the working directory.[41] When invoking scripts, prefer full absolute paths (e.g., /usr/local/bin/myscript) over relying on PATH to enhance predictability and reduce dependency on environment state.[42]
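A quick check for the entries this advice warns about can be written as a small Python function. The sketch below is an assumption-laden illustration: the name `risky_entries` and its reason strings are hypothetical, it models Unix-like PATH strings only, and it inspects the string alone without looking at directory permissions or ownership.

```python
def risky_entries(path_value, sep=":"):
    """Return (entry, reason) pairs for PATH entries worth a second look."""
    findings = []
    for entry in path_value.split(sep):
        if entry == "":
            # A leading, trailing, or doubled separator yields an empty entry.
            findings.append((entry, "empty entry means current directory"))
        elif entry == ".":
            findings.append((entry, "current directory"))
        elif not entry.startswith("/"):
            findings.append((entry, "relative path"))
    return findings


for entry, reason in risky_entries(".:/usr/bin::bin"):
    print(repr(entry), "->", reason)
```

An empty result does not mean the PATH is safe; a fuller audit would also confirm that each listed directory is writable only by trusted users.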
Optimizing PATH length is essential to prevent potential buffer overflows in applications or shells that process the variable, particularly in resource-constrained environments. Regularly trim unused or redundant entries by reviewing and removing obsolete directories, aiming to keep the PATH reasonable in length. For auditing PATH contents on Linux, tools like echo $PATH | tr ':' '\n' | wc -l can count entries, while integrating with the audit framework (auditd) allows logging PATH-related access events for compliance and debugging.[43]
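Trimming redundant entries can be automated. The following Python sketch removes repeated directories while keeping the first occurrence, so search precedence is preserved; the function name `tidy_path` is hypothetical, and the decision to treat /usr/bin and /usr/bin/ as the same entry is an assumption of this example.

```python
def tidy_path(path_value, sep=":"):
    """Drop repeated entries, keeping the first occurrence of each."""
    seen = set()
    kept = []
    for entry in path_value.split(sep):
        # Compare without trailing slashes; "/" and "" keep their own identity.
        key = entry.rstrip("/") or entry
        if key in seen:
            continue
        seen.add(key)
        kept.append(entry)
    return sep.join(kept)


print(tidy_path("/usr/bin:/bin:/usr/bin/:/opt/bin:/bin"))
```

Deciding which directories are obsolete still needs human review; the sketch only handles exact repeats.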
Environment-specific configurations further enhance safety and maintainability. In shell scripts on Unix-like systems, declare PATH as readonly after setting it (e.g., readonly PATH="/bin:/usr/bin:$HOME/bin") to prevent accidental modifications that could introduce vulnerabilities during script execution.[44] On Windows, prioritize editing the user-specific PATH over the system-wide version to avoid requiring administrator privileges, which reduces escalation risks; user PATH entries are appended to system ones, ensuring non-disruptive additions.[45]
Verification of PATH configuration is straightforward and should be routine. On Unix-like systems, use echo $PATH to display the current value, confirming the order and contents; pair this with tools like ShellCheck, a static analyzer that flags insecure PATH manipulations in scripts (e.g., dynamic alterations without validation).[46][47] In Windows, the echo %PATH% command or path utility serves the same purpose, helping detect duplicates or misconfigurations.
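Beyond printing the variable, a routine check can look for command names that exist in more than one PATH directory, since only the first one is ever run. The Python sketch below uses a hypothetical function name, `shadowed_commands`, and simply compares file names; it does not test whether the files are executable.

```python
import os
from collections import defaultdict


def shadowed_commands(directories):
    """Map each file name found in several directories to those directories,
    listed in search order."""
    locations = defaultdict(list)
    for directory in directories:
        try:
            names = sorted(os.listdir(directory))
        except OSError:
            continue  # missing or unreadable entries are skipped
        for name in names:
            if os.path.isfile(os.path.join(directory, name)):
                if directory not in locations[name]:
                    locations[name].append(directory)
    return {name: dirs for name, dirs in locations.items() if len(dirs) > 1}
```

A typical call would pass `os.environ["PATH"].split(os.pathsep)`; for each name reported, the first directory in its list is the one whose copy actually runs.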
In modern CI/CD pipelines, such as those using GitHub Actions, explicitly export and manage PATH for reproducible builds by prepending custom tool directories in workflow steps (e.g., echo "$GITHUB_WORKSPACE/bin" >> $GITHUB_PATH). This ensures consistent command resolution across runs, mitigating environment variability that could lead to build failures or security inconsistencies.[48]