Linux namespaces

namespaces
namespaces
Original author	Al Viro
Developers	Eric W. Biederman, Pavel Emelyanov, Al Viro, Cyrill Gorcunov et al.
Initial release	2002; 24 years ago
Written in	C
Operating system	Linux
Type	System software
License	GPL and LGPL

Namespaces are a feature of the Linux kernel that partition kernel resources such that one set of processes sees one set of resources, while another set of processes sees a different set of resources. The feature works by having the same namespace for a set of resources and processes, but those namespaces refer to distinct resources. Resources may exist in multiple namespaces. Examples of such resources are process IDs, host-names, user IDs, file names, some names associated with network access, and inter-process communication.

Namespaces are a required aspect of functioning containers in Linux. The term "namespace" is often used to denote a specific type of namespace (e.g., process ID) as well as for a particular space of names. ^[1]

A Linux system begins with a single namespace of each type, used by all processes. Processes can create additional namespaces and can also join different namespaces.

History

Linux namespaces were inspired by the wider namespace functionality used heavily throughout Plan 9 from Bell Labs.^[2] The Linux Namespaces originated in 2002 in the 2.4.19 kernel with work on the mount namespace kind. Additional namespaces were added beginning in 2006^[3] and continuing into the future.

Adequate container support functionality was finished in kernel version 3.8^[4]^[5] with the introduction of User namespaces.^[6]

Namespace kinds

Since kernel version 5.6, there are 8 kinds of namespaces. Namespace functionality is the same across all kinds: each process is associated with a namespace and can only see or use the resources associated with that namespace, and descendant namespaces where applicable. This way, each process (or process group thereof) can have a unique view on the resources. Which resource is isolated depends on the kind of namespace that has been created for a given process group.

Mount (mnt)

Mount namespaces control mount points. Upon creation the mounts from the current mount namespace are copied to the new namespace, but mount points created afterwards do not propagate between namespaces (using shared subtrees, it is possible to propagate mount points between namespaces^[7]).

The clone flag used to create a new namespace of this type is CLONE_NEWNS - short for "NEW NameSpace". This term is not descriptive (it does not tell which kind of namespace is to be created) because mount namespaces were the first kind of namespace and designers did not anticipate there being any others.

Process ID (pid)

The PID namespace provides processes with an independent set of process IDs (PIDs) from other namespaces. PID namespaces are nested, meaning when a new process is created it will have a PID for each namespace from its current namespace up to the initial PID namespace. Hence, the initial PID namespace is able to see all processes, albeit with different PIDs than other namespaces will see processes with.

The first process created in a PID namespace is assigned the process ID number 1 and receives most of the same special treatment as the normal init process, most notably that orphaned processes within the namespace are attached to it. This also means that the termination of this PID 1 process will immediately terminate all processes in its PID namespace and any descendants.^[8]

Network (net)

Network namespaces virtualize the network stack. On creation, a network namespace contains only a loopback interface. Each network interface (physical or virtual) is present in exactly 1 namespace and can be moved between namespaces.

Each namespace will have a private set of IP addresses, its own routing table, socket listing, connection tracking table, firewall, and other network-related resources.

Destroying a network namespace destroys any virtual interfaces within it and moves any physical interfaces within it back to the initial network namespace.

Inter-process Communication (ipc)

IPC namespaces isolate processes from SysV style inter-process communication. This prevents processes in different IPC namespaces from using, for example, the SHM family of functions to establish a range of shared memory between the two processes. Instead, each process will be able to use the same identifiers for a shared memory region and produce two such distinct regions.

UTS

UTS (UNIX Time-Sharing) namespaces allow a single system to appear to have different host and domain names to different processes. When a process creates a new UTS namespace, the hostname and domain of the new UTS namespace are copied from the corresponding values in the caller's UTS namespace.^[9]

User ID (user)

User namespaces are a feature to provide both privilege isolation and user identification segregation across multiple sets of processes, available since kernel 3.8.^[10] With administrative assistance, it is possible to build a container with seeming administrative rights without actually giving elevated privileges to user processes. Like the PID namespace, user namespaces are nested, and each new user namespace is considered to be a child of the user namespace that created it.

A user namespace contains a mapping table converting user IDs from the container's point of view to the system's point of view. This allows, for example, the root user to have user ID 0 in the container but is actually treated as user ID 1,400,000 by the system for ownership checks. A similar table is used for group ID mappings and ownership checks.

Control group (cgroup) Namespace

The cgroup namespace type hides the identity of the control group of which the process is a member. A process in such a namespace, checking which control group any process is part of, would see a path that is actually relative to the control group set at creation time, hiding its true control group position and identity. This namespace type has existed since March 2016 in Linux 4.6.^[11]^[12]

Time

The time namespace allows processes to see different system times in a way similar to the UTS namespace. It was proposed in 2018 and was released in Linux 5.6, in March 2020.^[13]

Proposed namespaces

syslog namespace

The syslog namespace was proposed by Rui Xiang, an engineer at Huawei, but wasn't merged into the Linux kernel.^[14] systemd implemented a similar feature called “journal namespace” in February 2020.^[15]

Administrative hierarchy

To facilitate privilege isolation of administrative actions, each namespace type is considered owned by a user namespace based on the active user namespace at the moment of creation. A user with administrative privileges in the appropriate user namespace will be allowed to perform administrative actions within that other namespace type. For example, if a process has administrative permission to change the IP address of a network interface, it may do so as long as its own user namespace is the same as (or ancestor of) the user namespace that owns the network namespace. Hence, the initial user namespace has administrative control over all namespace types in the system.^[16]

Implementation details

Namespaces are represented by virtual file objects within the kernel. An open filedescriptor on such a file may be used to associate a process with the corresponding namespace.

Visibility in /proc

The kernel makes the namespaces of each process visible at /proc/pid/ns/kind. Like all non-file resources in /proc, these can be read as symbolic links, yielding kind:[inode_number], or accessed as ordinary files. (These files are unreadable but are useful in other ways. Their inode numbers match the textual numbers yielded by readlink.) These files are in one-to-one correspondence with namespaces in the kernel, so the inode numbers act as unique identifiers.

As of Linux 6.1.0, kind can be any of cgroup, ipc, mnt, net, pid, time, user, uts. Inheritance of some namespaces can be controlled separately from the effective namespace of the process itself, and that is visible as /proc/pid/ns/kind_for_children.

Syscalls

Three syscalls can directly manipulate namespaces:

clone, with flags to specify which new namespace the new process should be migrated to.
unshare, to disassociate parts of a process's or thread's execution context that are currently being shared with other processes (or threads)
setns, to place the current process into the namespace specified by a file descriptor.

Destruction

If a namespace is no longer referenced, it will be deleted, the handling of the contained resource depends on the namespace kind. A namespace is considered referenced when:

it has at least one member process;
it has at least one referenced child namespace; or
its virtual file (/proc/pid/ns/kind) is in use, including:
- via an open filedescriptor;
- being a process' current directory;
- being a process' root directory; or
- underpinning a bind mount.

Adoption

Various container software use Linux namespaces in combination with cgroups to isolate their processes, including Docker^[17] and LXC.

Other applications, such as Google Chrome make use of namespaces to isolate its own processes which are at risk from attack on the internet.^[18]

There is also an unshare wrapper in util-linux. An example of its use is:

SHELL=/bin/sh unshare --map-root-user --fork --pid chroot "${chrootdir}" "$@"

References

^ Heddings, Anthony (2020-09-02). "What Are Linux Namespaces and What Are They Used for?". How-To Geek. Retrieved 2024-08-22.
^ "The Use of Name Spaces in Plan 9". 1992. Archived from the original on 2014-09-06. Retrieved 2016-03-24.
^ "Linux kernel source tree". kernel.org. 2016-10-02.
^ "LKML: Linus Torvalds: Linux 3.8". lkml.org. Retrieved 2024-03-22.
^ "Linux_3.8 - Linux Kernel Newbies". kernelnewbies.org. Retrieved 2024-03-22.
^ "Namespaces in operation, part 5: User namespaces [LWN.net]".
^ "Documentation/filesystems/sharedsubtree.txt". 2016-02-25. Retrieved 2017-03-06.
^ "Namespaces in operation, part 3: PID namespaces". lwn.net. 2013-01-16.
^ "uts_namespaces(7) - Linux manual page". www.man7.org. Retrieved 2021-02-16.
^ "Namespaces in operation, part 5: User namespaces [LWN.net]".
^ Heo, Tejun (2016-03-18). "[GIT PULL] cgroup namespace support for v4.6-rc1". lkml (Mailing list).
^ Torvalds, Linus (2016-03-26). "Linux 4.6-rc1". lkml (Mailing list).
^ "It's Finally Time: The Time Namespace Support Has Been Added To The Linux 5.6 Kernel - Phoronix". www.phoronix.com. Retrieved 2020-03-30.
^ "Add namespace support for syslog [LWN.net]". lwn.net. Retrieved 2022-07-11.
^ "journal: add concept of "journal namespaces" by poettering · Pull Request #14178 · systemd/systemd". GitHub. Retrieved 2022-07-11.
^ "Namespaces in operation, part 5: User namespaces". lwn.net. 2013-02-27.
^ "Docker security". docker.com. Retrieved 2016-03-24.
^ "Chromium Linux Sandboxing". Archived from the original on 2019-12-19. Retrieved 2019-12-19.

External links

[1] Heddings, Anthony (2020-09-02). "What Are Linux Namespaces and What Are They Used for?". How-To Geek. Retrieved 2024-08-22.

[2] "The Use of Name Spaces in Plan 9". 1992. Archived from the original on 2014-09-06. Retrieved 2016-03-24.

[pid_namespaces_commit-3] "Linux kernel source tree". kernel.org. 2016-10-02.

[4] "LKML: Linus Torvalds: Linux 3.8". lkml.org. Retrieved 2024-03-22.

[5] "Linux_3.8 - Linux Kernel Newbies". kernelnewbies.org. Retrieved 2024-03-22.

[6] "Namespaces in operation, part 5: User namespaces [LWN.net]".

[shared_subtrees-7] "Documentation/filesystems/sharedsubtree.txt". 2016-02-25. Retrieved 2017-03-06.

[pid_namespace_description-8] "Namespaces in operation, part 3: PID namespaces". lwn.net. 2013-01-16.

[9] "uts_namespaces(7) - Linux manual page". www.man7.org. Retrieved 2021-02-16.

[10] "Namespaces in operation, part 5: User namespaces [LWN.net]".

[cgroup_namespace_pull_request-11] Heo, Tejun (2016-03-18). "[GIT PULL] cgroup namespace support for v4.6-rc1". lkml (Mailing list).

[cgroup_namespace_merge-12] Torvalds, Linus (2016-03-26). "Linux 4.6-rc1". lkml (Mailing list).

[13] "It's Finally Time: The Time Namespace Support Has Been Added To The Linux 5.6 Kernel - Phoronix". www.phoronix.com. Retrieved 2020-03-30.

[14] "Add namespace support for syslog [LWN.net]". lwn.net. Retrieved 2022-07-11.

[15] "journal: add concept of "journal namespaces" by poettering · Pull Request #14178 · systemd/systemd". GitHub. Retrieved 2022-07-11.

[user_namespace_description-16] "Namespaces in operation, part 5: User namespaces". lwn.net. 2013-02-27.

[docker-17] "Docker security". docker.com. Retrieved 2016-03-24.

[google-18] "Chromium Linux Sandboxing". Archived from the original on 2019-12-19. Retrieved 2019-12-19.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]

[14]

[15]

[16]

[17]

[18]

History

Linux namespaces

Recent from talks

Recent from talks

Contribute something

Contribute something

Media Pages

Timelines

Articles

Notes collections

Notes

Notes

Days in Chronicle

Linux namespaces

History

Namespace kinds

Mount (mnt)

Process ID (pid)

Network (net)

Inter-process Communication (ipc)

UTS

User ID (user)

Control group (cgroup) Namespace

Time

Proposed namespaces

syslog namespace

Administrative hierarchy

Implementation details

Visibility in /proc

Syscalls

Destruction

Adoption

References

External links