Cgroups
| cgroups | |
|---|---|
| Original authors | v1: Paul Menage, Rohit Seth; memory controller by Balbir Singh; CPU controller by Srivatsa Vaddagiri. v2: Tejun Heo |
| Developers | Tejun Heo, Johannes Weiner, Michal Hocko, Waiman Long, Roman Gushchin, Chris Down et al. |
| Initial release | 2007 |
| Written in | C |
| Operating system | Linux |
| Type | System software |
| License | GPL and LGPL |
| Website | Cgroup v1, Cgroup v2 |
cgroups (abbreviated from control groups) is a Linux kernel feature that limits, accounts for, and isolates the resource usage (CPU, memory, disk I/O, etc.)[1]: § Controllers of a collection of processes.
Engineers at Google started the work on this feature in 2006 under the name "process containers".[2] In late 2007, the nomenclature changed to "control groups" to avoid confusion caused by multiple meanings of the term "container" in the Linux kernel context, and the control groups functionality was merged into the Linux kernel mainline in kernel version 2.6.24, which was released in January 2008.[3] Since then, developers have added controllers for the kernel's own memory allocation,[4] netfilter firewalling,[5] the OOM killer,[6] and many other parts.
A major change in the history of cgroups is cgroup v2, which removes the ability to use multiple process hierarchies and to discriminate between threads as found in the original cgroup (now called "v1").[1]: § Issues with v1 and Rationales for v2 Work on the single, unified hierarchy started with the repurposing of v1's dummy hierarchy as a place for holding all controllers not yet used by others in 2014.[7] cgroup v2 was merged in Linux kernel 4.5 (2016).[8]
Versions
There are two versions of cgroups. They can co-exist in a system.
- The original version of cgroups was written by Paul Menage and Rohit Seth. It was merged into the mainline Linux kernel in late 2007, for version 2.6.24 (released in January 2008). Development and maintenance of cgroups was then taken over by Tejun Heo, who instituted major redesigns without breaking the interface (see § Redesigns of v1). It was renamed "Control Group version 1" (cgroup-v1) after cgroups-v2 appeared in Linux 4.5.[9]
- Tejun Heo found that further redesign of v1 could not proceed without breaking the interface. As a result, he added a separate, new system called "Control Group version 2" (cgroup-v2). Unlike v1, cgroup v2 has only a single process hierarchy (because a controller can only be assigned to one hierarchy, processes in separate hierarchies cannot be managed by the same controller; this change sidesteps the issue). It also removes the ability to discriminate between threads, choosing to work on a granularity of processes instead (disabling an "abuse" of the system which led to convoluted APIs).[1]: § Issues with v1 and Rationales for v2 The first version of the unified hierarchy documentation appeared in Linux kernel 4.5, released on 14 March 2016.[8]
Features
One of the design goals of cgroups is to provide a unified interface to many different use cases, from controlling single processes (by using nice, for example) to full operating system-level virtualization (as provided by OpenVZ, Linux-VServer or LXC, for example). Cgroups provides:
- Resource limiting
- groups can be set not to exceed configured limits: a memory limit (which also includes the file system cache),[10][11] an I/O bandwidth limit,[12] a CPU quota limit,[13] a CPU set limit,[14] or a maximum number of open files.[15]
- Prioritization
- some groups may get a larger share of CPU utilization[16] or disk I/O throughput[17]
- Accounting
- measures a group's resource usage, which may be used, for example, for billing purposes[18]
- Control
- freezing groups of processes, their checkpointing and restarting[18]
Use
A control group (abbreviated as cgroup) is a collection of processes that are bound by the same criteria and associated with a set of parameters or limits. These groups can be hierarchical, meaning that each group inherits limits from its parent group. The kernel provides access to multiple controllers (also called subsystems) through the cgroup interface;[3] for example, the "memory" controller limits memory use, "cpuacct" accounts CPU usage, etc.
Control groups can be used in multiple ways:
- By accessing the cgroup virtual file system manually.
- By creating and managing groups on the fly using tools like cgcreate, cgexec, and cgclassify (from libcgroup).
- Through the "rules engine daemon" that can automatically move processes of certain users, groups, or commands to cgroups as specified in its configuration.
- Indirectly through other software that uses cgroups, such as Docker, Firejail, LXC,[19] libvirt, systemd, Open Grid Scheduler/Grid Engine,[20] and Google's now-defunct lmctfy.
The Linux kernel documentation contains some technical details of the setup and use of control groups version 1[21] and version 2.[1]
Interfaces
Both versions of cgroup act through a pseudo-filesystem (cgroup for v1 and cgroup2 for v2). Like all filesystems, they can be mounted on any path, but the general convention is to mount one of the versions (generally v2) on /sys/fs/cgroup, under the sysfs default location of /sys. As mentioned before, the two cgroup versions can be active at the same time; this also applies to the filesystems, so long as they are mounted on different paths.[21][1] For the description below, we assume a setup where the v2 hierarchy lies in /sys/fs/cgroup. The v1 hierarchy, if ever required, will be mounted at a different location.
At initialization cgroup2 should have no defined control groups except the top-level one. In other words, /sys/fs/cgroup should have no directories, only a number of files that control the system as a whole. At this point, running ls /sys/fs/cgroup could list the following on one example system:
cgroup.controllers  cgroup.max.depth  cgroup.max.descendants  cgroup.pressure  cgroup.procs  cgroup.stat  cgroup.subtree_control  cgroup.threads  cpu.pressure  cpuset.cpus.effective  cpuset.cpus.isolated  cpuset.mems.effective  cpu.stat  cpu.stat.local  io.cost.model  io.cost.qos  io.pressure  io.prio.class  io.stat  irq.pressure  memory.numa_stat  memory.pressure  memory.reclaim  memory.stat  memory.zswap.writeback  misc.capacity  misc.current  misc.peak
These files are named according to the controllers that handle them. For example, cgroup.* files deal with the cgroup system itself and memory.* files deal with the memory subsystem. Example: to request that the kernel reclaim 1 gigabyte of memory from anywhere in the system, one can run echo "1G swappiness=50" > /sys/fs/cgroup/memory.reclaim.[1]
To create a subgroup, one simply creates a new directory under an existing group (including the top-level one). The files corresponding to available controls for this group are automatically created.[1] For example, running mkdir /sys/fs/cgroup/example; ls /sys/fs/cgroup/example would produce a list of files largely similar to the one above, but with noticeable changes. On one example system, these files are added:
cgroup.events  cgroup.freeze  cgroup.kill  cgroup.type  cpu.idle  cpu.max  cpu.max.burst  cpu.pressure  cpu.uclamp.max  cpu.uclamp.min  cpu.weight  cpu.weight.nice  memory.current  memory.events  memory.events.local  memory.high  memory.low  memory.max  memory.min  memory.oom.group  memory.peak  memory.swap.current  memory.swap.events  memory.swap.high  memory.swap.max  memory.swap.peak  memory.zswap.current  memory.zswap.max  pids.current  pids.events  pids.events.local  pids.max  pids.peak
These changes are not unexpected because some controls and statistics only make sense on a subset of processes (e.g. nice level being the CPU priority of processes relative to the rest of the system).[1]
Processes are assigned to subgroups by writing their PID to the cgroup.procs file of the destination group. The cgroup a process is in can be found by reading /proc/<PID>/cgroup.[1]
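As a minimal illustration of this interface — assuming root privileges, a v2 hierarchy at /sys/fs/cgroup, and an arbitrary group name of "example" — the following sketch creates a group, caps its memory, and moves the current shell into it:

```sh
# Illustrative sketch only; the group name and 500M limit are arbitrary.
echo +memory > /sys/fs/cgroup/cgroup.subtree_control   # make the memory controller available to children, if not already
mkdir /sys/fs/cgroup/example                           # control files for the new group appear automatically
echo 500M > /sys/fs/cgroup/example/memory.max          # hard memory limit for the group
echo $$ > /sys/fs/cgroup/example/cgroup.procs          # move the current shell into the group
cat /proc/$$/cgroup                                    # read back the shell's cgroup membership
```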
On systems based on systemd, a hierarchy of subgroups is predefined to encapsulate every process directly and indirectly launched by systemd under a subgroup: the very basis of how systemd manages processes. An explanation of the nomenclature of these groups can be found in the Red Hat Enterprise Linux 7 manual.[22] Red Hat also provides a guide on creating a systemd service file that causes a process to run in a separate cgroup.[23]
The systemd-cgtop[24] command can be used to show the top control groups by their resource usage.
V1 coexistence
On a system with v2, v1 can still be mounted and given access to controllers not in use by v2. However, a modern system typically already places all controllers in use in v2, so there is no controller available for v1 at all even if a hierarchy is created. It is possible to clear all uses of a controller from v2 and hand it to v1, but moving controllers between hierarchies after the system is up and running is cumbersome and not recommended.[1]
Major evolutions
Redesigns of v1
Redesign of cgroups started in 2013,[25] with additional changes brought by versions 3.15 and 3.16 of the Linux kernel.[26][27][28]
The following changes concern the kernel before 4.5/4.6, i.e. before cgroups-v2 was added. In other words, they describe how cgroups-v1 was changed, though most of these changes have also been inherited by v2 (after all, v1 and v2 share the same codebase).
Namespace isolation
While not technically part of the cgroups work, a related feature of the Linux kernel is namespace isolation, where groups of processes are separated such that they cannot "see" resources in other groups. For example, a PID namespace provides a separate enumeration of process identifiers within each namespace. Also available are mount, user, UTS (Unix Time Sharing), network and SysV IPC namespaces.
- The PID namespace provides isolation for the allocation of process identifiers (PIDs), lists of processes and their details. While the new namespace is isolated from other siblings, processes in its "parent" namespace still see all processes in child namespaces—albeit with different PID numbers.[29]
- Network namespace isolates the network interface controllers (physical or virtual), iptables firewall rules, routing tables etc. Network namespaces can be connected with each other using the "veth" virtual Ethernet device.[30]
- "UTS" namespace allows changing the hostname.
- Mount namespace allows creating a different file system layout, or making certain mount points read-only.[31]
- IPC namespace isolates the System V inter-process communication between namespaces.
- User namespace isolates the user IDs between namespaces.[32]
- Cgroup namespace[33]
Namespaces are created with the "unshare" command or syscall, or as "CLONE_NEW*" flags in a "clone" syscall.[34]
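As a brief illustration (using the unshare tool from util-linux; the flag combination shown is only one of many possibilities), the following starts a shell in new PID and mount namespaces:

```sh
sudo unshare --pid --fork --mount-proc /bin/bash   # new PID namespace with its own /proc view
ps aux                                             # inside: only this shell and ps are visible, numbered from PID 1
```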
The "ns" subsystem was added early in cgroups development to integrate namespaces and control groups. If the "ns" cgroup was mounted, each namespace would also create a new group in the cgroup hierarchy. This was an experiment that was later judged to be a poor fit for the cgroups API, and removed from the kernel.
Linux namespaces were inspired by the more general namespace functionality used heavily throughout Plan 9 from Bell Labs.[35]
Conversion to kernfs
Kernfs was introduced into the Linux kernel with version 3.14 in March 2014, the main author being Tejun Heo.[36] One of the main motivators for a separate kernfs is the cgroups file system. Kernfs was basically created by splitting off some of the sysfs logic into an independent entity, making it easier for other kernel subsystems to implement their own virtual file systems with handling for device connect and disconnect, dynamic creation and removal, and other attributes. This does not affect how cgroups is used, but makes maintaining the code easier.[37]
New features introduced during v1
Kernel memory control groups (kmemcg) were merged into version 3.8 (released 18 February 2013) of the Linux kernel mainline.[38][39][4] The kmemcg controller can limit the amount of memory that the kernel can utilize to manage its own internal processes.
Support for per-group netfilter setup was added in 2014.[5]
The unified hierarchy was added in 2014. It repurposed v1's dummy hierarchy to hold all controllers not yet used by others. This changed dummy hierarchy would become the only available hierarchy in v2.[7]
Changes after v2
[edit]Unlike v1, cgroup v2 has only a single process hierarchy and discriminates between processes, not threads.
cgroup awareness of OOM killer
Linux kernel 4.19 (October 2018) introduced cgroup awareness in the OOM killer implementation, which adds the ability to kill a cgroup as a single unit and so guarantee the integrity of the workload.[6]
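A minimal sketch of this interface, assuming a v2 hierarchy at /sys/fs/cgroup with the memory controller enabled and an example group named "batch":

```sh
mkdir -p /sys/fs/cgroup/batch
echo 256M > /sys/fs/cgroup/batch/memory.max      # hard limit whose breach can trigger the OOM killer
echo 1 > /sys/fs/cgroup/batch/memory.oom.group   # on OOM, kill every task in the group as one unit
```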
Adoption
Various projects use cgroups as their basis, including CoreOS, Docker (in 2013), Hadoop, Jelastic, Kubernetes,[40] lmctfy (Let Me Contain That For You), LXC (Linux Containers), systemd, Mesos and Mesosphere,[40] HTCondor, and Flatpak.
Major Linux distributions also adopted it, such as Red Hat Enterprise Linux (RHEL) 6.0 in November 2010.[41]
On 29 October 2019, the Fedora Project modified Fedora 31 to use cgroups v2 by default.[42]
See also
- Operating system–level virtualization implementations
- Process group
- Tc (Linux) – a traffic control utility slightly overlapping in functionality with network-oriented cgroup settings
- Job object – the equivalent Windows concept, as managed by that platform’s Object Manager
References
- ^ a b c d e f g h i j "Control Group v2". docs.kernel.org.
- ^ Jonathan Corbet (29 May 2007). "Process containers". LWN.net.
- ^ a b Jonathan Corbet (29 October 2007). "Notes from a container". LWN.net. Retrieved 14 April 2015.
The original 'containers' name was considered to be too generic – this code is an important part of a container solution, but it's far from the whole thing. So containers have now been renamed 'control groups' (or 'cgroups') and merged for 2.6.24.
- ^ a b "memcg: add documentation about the kmem controller". kernel.org. 18 December 2012.
- ^ a b "netfilter: x_tables: lightweight process control group matching". 23 April 2014. Archived from the original on 24 April 2014.
- ^ a b "Linux_4.19 - Linux Kernel Newbies".
- ^ a b "cgroup: prepare for the default unified hierarchy". 13 March 2014.
- ^ a b "Documentation/cgroup-v2.txt as appeared in Linux kernel 4.5". 14 March 2016.
- ^ "diff between Linux kernel 4.4 and 4.5". 14 March 2016.
- ^ Jonathan Corbet (31 July 2007). "Controlling memory use in containers". LWN.
- ^ Balbir Singh, Vaidynathan Srinivasan (July 2007). "Containers: Challenges with the memory resource controller and its performance" (PDF). Ottawa Linux Symposium.
- ^ Carvalho, André (18 October 2017). "Using cgroups to limit I/O". andrestc.com. Retrieved 12 September 2022.
- ^ Luu, Dan. "The container throttling problem". danluu.com. Retrieved 12 September 2022.
- ^ Derr, Simon (2004). "CPUSETS". Retrieved 12 September 2022.
- ^ "setrlimit(2) — Arch manual pages". man.archlinux.org. Retrieved 27 November 2023.
- ^ Jonathan Corbet (23 October 2007). "Kernel space: Fair user scheduling for Linux". Network World. Archived from the original on 19 October 2013. Retrieved 22 August 2012.
- ^ Kamkamezawa Hiroyu (19 November 2008). Cgroup and Memory Resource Controller (PDF). Japan Linux Symposium. Archived from the original (PDF presentation slides) on 22 July 2011.
- ^ a b Hansen D, IBM Linux Technology Center (2009). Resource Management (PDF presentation slides). Linux Foundation.
- ^ Matt Helsley (3 February 2009). "LXC: Linux container tools". IBM developerWorks.
- ^ "Grid Engine cgroups Integration". Scalable Logic. 22 May 2012.
- ^ a b "Control Groups version 1". docs.kernel.org.
- ^ "1.2. Default Cgroup Hierarchies | Resource Management Guide | Red Hat Enterprise Linux | 7 | Red Hat Documentation". docs.redhat.com.
- ^ "Managing cgroups with systemd". www.redhat.com.
- ^ "Systemd-cgtop".
- ^ "All About the Linux Kernel: Cgroup's Redesign". Linux.com. 15 August 2013. Archived from the original on 28 April 2019. Retrieved 19 May 2014.
- ^ "The unified control group hierarchy in 3.16". LWN.net. 11 June 2014.
- ^ "Pull cgroup updates for 3.15 from Tejun Heo". kernel.org. 3 April 2014.
- ^ "Pull cgroup updates for 3.16 from Tejun Heo". kernel.org. 9 June 2014.
- ^ Pavel Emelyanov, Kir Kolyshkin (19 November 2007). "PID namespaces in the 2.6.24 kernel". LWN.net.
- ^ Jonathan Corbet (30 January 2007). "Network namespaces". LWN.net.
- ^ Serge E. Hallyn, Ram Pai (17 September 2007). "Applying mount namespaces". IBM developerWorks.
- ^ Michael Kerrisk (27 February 2013). "Namespaces in operation, part 5: User namespaces". lwn.net Linux Info from the Source.
- ^ "LKML: Linus Torvalds: Linux 4.6-rc1".
- ^ Janak Desai (11 January 2006). "Linux kernel documentation on unshare".
- ^ "The Use of Name Spaces in Plan 9". 1992. Archived from the original on 6 September 2014. Retrieved 15 February 2015.
- ^ "kernfs, sysfs, driver-core: implement synchronous self-removal". LWN.net. 3 February 2014. Retrieved 7 April 2014.
- ^ "Linux kernel source tree: kernel/git/torvalds/linux.git: cgroups: convert to kernfs". kernel.org. 11 February 2014. Retrieved 23 May 2014.
- ^ "memcg: kmem controller infrastructure". kernel.org source code. 18 December 2012.
- ^ "memcg: kmem accounting basic infrastructure". kernel.org source code. 18 December 2012.
- ^ a b "Mesosphere to Bring Google's Kubernetes to Mesos". Mesosphere.io. 10 July 2014. Archived from the original on 6 September 2015. Retrieved 13 July 2014.
- ^ "Red Hat Enterprise Linux - 6.0 Release Notes" (PDF). redhat.com. Retrieved 12 September 2023.
- ^ "1732114 – Modify Fedora 31 to use CgroupsV2 by default".
External links
- Official Linux kernel documentation on cgroups v1 and cgroups v2
- Red Hat Resource Management Guide on cgroups
- Ubuntu manpage on cgroups Archived 9 August 2021 at the Wayback Machine
- Linux kernel Namespaces and cgroups by Rami Rosen (2013)
- Namespaces and cgroups, the basis of Linux containers (including cgroups v2), slides of a talk by Rami Rosen, Netdev 1.1, Seville, Spain, 2016
- Understanding the new control groups API, LWN.net, by Rami Rosen, March 2016
- Large-scale cluster management at Google with Borg, April 2015, by Abhishek Verma, Luis Pedrosa, Madhukar Korupolu, David Oppenheimer, Eric Tune and John Wilkes
- Job Objects, similar feature on Windows
Cgroups
Cgroups form the foundational resource control layer for container technologies like Docker and Kubernetes, enabling efficient virtualization and workload isolation in modern computing environments.[5]
Overview
Definition and Purpose
Control groups, commonly known as cgroups, are a Linux kernel feature that organizes processes into hierarchical groups to limit, account for, and isolate the usage of system resources such as CPU time, memory, disk I/O, and network bandwidth for collections of tasks.[2][1] This subsystem aggregates sets of tasks and their future children into groups, associating them with specific parameters that define behavior for various resource controllers.[2]
The primary purpose of cgroups is to enable precise resource allocation and management in environments requiring isolation, such as containers and virtualization technologies, by preventing any single process or user from monopolizing system resources.[2] They facilitate workload isolation in multi-tenant systems, where multiple applications or users share the same kernel, ensuring that resource demands from one group do not adversely affect others.[1] This capability supports broader containerization efforts by providing the foundational mechanisms for bounding and prioritizing resource consumption.[6]
Key benefits of cgroups include enhanced system stability through enforced limits that mitigate denial-of-service risks from resource-intensive tasks, promotion of fair resource sharing among competing groups, and improved overall efficiency in resource utilization, particularly in server and cloud environments.[2] Initially motivated by the need for process containerization, cgroups were developed by Google engineers in 2006–2007 under the name "process containers" to underpin projects like Linux Containers (LXC), addressing the limitations of earlier resource management approaches in handling dynamic workloads.[6][7]
Historical Development
The development of control groups, commonly known as cgroups, originated in 2006 at Google, where engineers Paul Menage and Rohit Seth led the initial work under the name "process containers" to support resource isolation for container-like environments.[8] This effort addressed the need for fine-grained resource control in large-scale computing, building on existing kernel mechanisms like cpusets.[9] The project was renamed cgroups shortly thereafter and merged into the mainline Linux kernel as version 1 in the 2.6.24 release in early 2008, marking its availability for upstream adoption.[1]
Early adoption of cgroups v1 focused on container technologies, with integration into Linux Containers (LXC) starting around 2008, where it combined with kernel namespaces to enable full OS-level virtualization.[10] By 2009, as additional controllers for resources like memory and I/O were added and refined, cgroups v1 achieved sufficient stability for production use in distributions and tools, paving the way for broader ecosystem support including later projects like Docker in 2013.[3] Paul Menage served as the primary maintainer during this formative period until 2011, when responsibilities transitioned to Tejun Heo, who oversaw subsequent redesigns and maintenance.[11]
Key milestones included the experimental introduction of cgroups v2 in kernel 3.16 in 2014, featuring a unified hierarchy to address v1's limitations in scalability and consistency.[12] This version reached production readiness in kernel 4.5 in 2016, with default enablement options emerging in subsequent releases.[8] Refinements continued into 2025, enhancing features like delegation for unprivileged users in v2 hierarchies. Post-2020 updates bolstered the IO controller with improved weight-based throttling and cost modeling starting in kernel 5.1, while Pressure Stall Information (PSI)—initially added in 4.20—matured through better integration in container runtimes and orchestrators, enabling proactive resource pressure detection by 2024.[13][14]
Core Concepts
Hierarchy Structure
Control groups (cgroups) are organized in a hierarchical structure that forms the foundation for resource management in the Linux kernel. In cgroup version 1 (v1), the system supports multiple independent hierarchies, often described as a forest, where each hierarchy is a tree of cgroups dedicated to one or more controllers. Every process belongs to exactly one cgroup per hierarchy, and the root cgroup of each hierarchy initially contains all tasks on the system. Child cgroups inherit resource limits and accounting from their parents, ensuring that constraints propagate downward in the tree.[2]
In contrast, cgroup version 2 (v2) employs a single unified hierarchy, simplifying the organization into one tree where all controllers operate within the same structure. This unified approach ensures consistent views of processes across controllers, with the root cgroup at the top level exempt from direct resource control but serving as the parent for all others. Processes inherit their parent's cgroup membership upon creation via fork, and resource distributions follow a top-down model where a child cgroup can only allocate resources it has received from its parent.[4]
The hierarchies are exposed through a pseudo-filesystem mounted under /sys/fs/cgroup. For v1, the cgroup filesystem (cgroupfs) is mounted with options specifying controllers, such as mount -t cgroup -o cpuset,memory none /sys/fs/cgroup/cpuset. For v2, the cgroup2 filesystem is mounted as mount -t cgroup2 none /sys/fs/cgroup/unified, providing a single mount point for the unified hierarchy. The root cgroup resides at this mount point, with subdirectories representing child cgroups.[2][4]
A key feature of the hierarchy is delegation, which allows non-root users to manage sub-hierarchies without system-wide privileges. In v1, delegation relies on file permissions, enabling users to create, modify, and move processes within permitted cgroups by writing to files like tasks, though containment is less strict. In v2, delegation is more robust: users gain control by setting ownership or permissions on files such as cgroup.procs, cgroup.threads, and cgroup.subtree_control, while the nsdelegate mount option enforces boundaries using cgroup namespaces to prevent unauthorized process migrations outside the delegated subtree. This option is set system-wide on mount from the init namespace, treating namespaces as delegation limits.[1][4]
For illustration, consider a simple hierarchy tree in v2: the root cgroup (/sys/fs/cgroup) branches to a user-specific cgroup (e.g., /sys/fs/cgroup/user.slice), which further divides into process groups (e.g., /sys/fs/cgroup/user.slice/app1 and /sys/fs/cgroup/user.slice/app2). Processes launched under user.slice inherit limits from the root and user levels, allowing isolated resource management for applications without affecting the broader system.[4]
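A sketch of building that hierarchy by hand is shown below; the paths mirror the illustration above (on a systemd-managed host, user.slice already exists and is managed by systemd, so the names are purely illustrative), and it assumes the cpu and memory controllers are enabled in the root's cgroup.subtree_control and no processes sit directly in user.slice:

```sh
mkdir -p /sys/fs/cgroup/user.slice/app1 /sys/fs/cgroup/user.slice/app2
echo "+cpu +memory" > /sys/fs/cgroup/user.slice/cgroup.subtree_control   # expose controllers to app1 and app2
echo 1G   > /sys/fs/cgroup/user.slice/memory.max        # parent ceiling inherited by both children
echo 512M > /sys/fs/cgroup/user.slice/app1/memory.max   # a child may only tighten what its parent granted
```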
Controllers and Resources
Control groups (cgroups) utilize controllers, also known as subsystems, to manage and limit specific types of system resources allocated to groups of processes. Each controller handles a distinct resource domain, such as CPU time or memory usage, and operates within the cgroup hierarchy to enforce policies like shares, limits, or protections. In cgroup version 2 (v2), controllers are integrated into a unified hierarchy, where they can be selectively enabled for subtrees via the cgroup.subtree_control file by appending names like "+cpu" or "+memory" to activate them for child cgroups.[4]
The core controllers include the following, with their managed resources and purposes detailed below. This list reflects availability as of Linux kernel 6.17 (released in September 2025), encompassing both longstanding and newer additions. Recent additions include the dmem controller for device memory management, introduced in kernel 6.14 (March 2025).[4]
| Controller | Managed Resources | Description |
|---|---|---|
| cpu | CPU cycles and scheduling | Regulates the distribution of CPU time among cgroups using a weight-based shares model for proportional allocation and a quota-based bandwidth model for hard limits on usage periods. It supports integration with the completely fair scheduler (CFS) for fair CPU sharing.[15] |
| memory | RAM, swap, and kernel memory | Tracks and limits memory usage, including user-space allocations, kernel data structures, and TCP buffers, while providing protection levels to prioritize cgroups during pressure and out-of-memory (OOM) scenarios. Usage is accounted hierarchically to prevent double-counting.[16] |
| io | Block device I/O bandwidth and operations | Manages I/O resources on block devices through weight-based proportional sharing and absolute limits on bytes or I/O operations per second (IOPS), unifying the v1 blkio controller's functionality with improved hierarchical accounting. Available since the initial cgroup v2 release in Linux kernel 4.5 (2016).[17] |
| blkio | Block I/O (v1-specific) | In cgroup v1, controls block device I/O throughput and weights for proportional bandwidth allocation, serving as the predecessor to the v2 io controller; it supports per-device rules but lacks v2's unified hierarchy. |
| devices | Device file access | Enforces allow/deny rules for access to device nodes (e.g., /dev/null) using Berkeley Packet Filter (BPF) programs, preventing unauthorized operations like read/write on specific major:minor device pairs. In v2, it relies on eBPF for flexible policy definition.[18] |
| pids | Process and thread counts | Limits the number of tasks (processes or threads) that can be created within a cgroup via fork() or clone(), accounting for both direct and threaded modes to prevent fork bombs; it provides current usage tracking and a maximum limit. Available since the initial cgroup v2 release in kernel 4.5 (2016).[19] |
| rdma | Remote Direct Memory Access (RDMA) resources | Accounts for and limits RDMA/InfiniBand hardware resources, such as host channel adapter (HCA) handles and queue pairs, enabling fair sharing among cgroups in high-performance computing environments. Ported to v2 from v1 and available since Linux kernel 4.11 (2017).[20] |
| hugetlb | Huge page memory | Limits the usage of huge TLB pages per cgroup, enforced during allocation to manage large memory pages for performance-critical applications. Available since the initial cgroup v2 release in kernel 4.5 (2016).[21] |
| misc | Miscellaneous scalar resources | Provides a generic interface for limiting and accounting various scalar resources registered by kernel subsystems, such as RDMA-specific or other non-standard resources. Available since Linux kernel 5.13 (2021).[22] |
| dmem | Device memory | Regulates the allocation and usage of device-specific memory, such as GPU video RAM, to prevent overcommitment and enable fair sharing in heterogeneous computing environments. Introduced in Linux kernel 6.14 (2025).[23] |
| net_cls | Network packet classification (v1-specific) | In cgroup v1, tags network packets with class IDs for traffic control (tc) integration, allowing classification based on cgroup membership; not fully ported to v2, where network management relies on other mechanisms. |
| net_prio | Network priority (v1-specific) | In cgroup v1, sets priority levels for outgoing network traffic per cgroup, influencing socket buffer prioritization; similar to net_cls, it is primarily a v1 feature without direct v2 equivalent. |
In cgroup v2, controllers are activated by mounting the unified hierarchy (mount -t cgroup2 none /sys/fs/cgroup) and specifying desired ones in the root's cgroup.subtree_control file, such as echo "+cpu +memory +io" > cgroup.subtree_control. This approach ensures only relevant resources are delegated down the hierarchy, integrating seamlessly with the overall tree structure. Additional controllers like cpuset (for CPU/node affinity) and perf_event (for performance monitoring) exist but are outside the primary focus here.[24][8]
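The distinction between what a cgroup may use (cgroup.controllers) and what it passes to its children (cgroup.subtree_control) can be seen directly; a short sketch, assuming a v2 mount at /sys/fs/cgroup and an example child named "demo":

```sh
cat /sys/fs/cgroup/cgroup.controllers                             # controllers available at the root
echo "+cpu +memory +io" > /sys/fs/cgroup/cgroup.subtree_control   # pass them down to children
mkdir /sys/fs/cgroup/demo
cat /sys/fs/cgroup/demo/cgroup.controllers                        # the child now lists cpu, memory, and io
```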
Versions
Version 1 Details
Control Groups version 1 (cgroups v1) implements a flexible but complex architecture centered around multiple independent hierarchies, each typically dedicated to a single resource controller or subsystem. In this design, each controller—such as CPU, memory, or block I/O—operates within its own separate hierarchy, which must be mounted as a distinct filesystem instance under /sys/fs/cgroup. For example, the CPU controller is mounted at /sys/fs/cgroup/cpu, while the memory controller uses /sys/fs/cgroup/memory, allowing administrators to apply different grouping policies for different resources without interference.[2] This multi-hierarchy approach enables fine-grained control but requires managing multiple mount points and can lead to administrative overhead. Tasks, or processes, are assigned to groups within a hierarchy by writing their process ID (PID) to the tasks file in the target cgroup directory, as sketched below.
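A short v1 sketch of that workflow; the mount point, group name, and shares value are illustrative, and on most modern systems the per-controller hierarchies are already mounted by the init system:

```sh
mkdir -p /sys/fs/cgroup/cpu
mount -t cgroup -o cpu cpu /sys/fs/cgroup/cpu       # a hierarchy carrying only the cpu controller
mkdir /sys/fs/cgroup/cpu/mygroup
echo 512 > /sys/fs/cgroup/cpu/mygroup/cpu.shares    # relative CPU weight (default 1024)
echo $$ > /sys/fs/cgroup/cpu/mygroup/tasks          # move the current shell into the group
```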
Version 2 Improvements
Cgroups version 2 introduces a unified hierarchy design, where all controllers are organized under a single tree structure, contrasting with the multiple independent hierarchies of version 1. This unification enables consistent resource distribution across the system and facilitates delegation of sub-hierarchies to less privileged users or namespaces without risking inconsistencies in resource accounting.[4] The single hierarchy also supports thread-level granularity for certain controllers, such as CPU and PIDs, allowing threads within a process to be controlled independently via the cgroup.threads file, which lists and permits migration of threads to other cgroups.[4]
Among the new capabilities, the PIDs controller limits the number of processes and threads that can be created within a cgroup, preventing fork bombs and aiding in resource isolation; for example, setting pids.max to 100 restricts the cgroup to no more than 100 tasks.[4] Memory accounting is enhanced with tiered limits: memory.low reserves a minimum amount of memory for the cgroup to avoid aggressive reclamation, memory.high acts as a soft limit that throttles the cgroup and applies reclaim pressure when exceeded without immediate termination, and memory.max enforces a hard limit leading to out-of-memory kills if breached.[4] The I/O controller is unified under a single interface, supporting weight-based throttling (io.weight) and maximum bandwidth limits per device (io.max), which simplifies configuration compared to the fragmented blkio and iothrottle controllers in version 1.[4]
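These knobs can be combined in a single group; a sketch assuming a v2 cgroup named "myapp" whose parent already enables the memory and pids controllers:

```sh
cd /sys/fs/cgroup/myapp
echo 100  > pids.max      # at most 100 tasks in the group
echo 256M > memory.low    # best-effort protection from reclaim up to this amount
echo 768M > memory.high   # soft ceiling: the group is throttled and reclaimed above it
echo 1G   > memory.max    # hard ceiling: breaching it can invoke the OOM killer
```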
Cgroups v2 has become the default in modern Linux distributions, including Fedora since version 31 (2019), Ubuntu since 21.10 (2021), and Debian 11 (2021), reflecting its maturity and improved stability.[26] For systems requiring coexistence with version 1, a hybrid mode is supported by mounting specific controllers to legacy hierarchies while using the unified v2 mount point, often enabled via the kernel boot parameter cgroup_no_v1=all to disable v1 entirely or selectively.[4]
Performance benefits stem from the unified mounting, which reduces kernel overhead in managing multiple filesystem instances and improves scalability for large hierarchies with thousands of cgroups; for instance, dynamic operations like task migrations incur lower latency when the favordynmods mount option, available in recent kernels, is used.[4] Starting with kernel 5.15 (2021), enhancements to delegation allow unprivileged users to more reliably manage sub-hierarchies without root privileges, provided the cgroup is properly owned and permissions are set, enhancing security in containerized environments. PSI, integrated since kernel 4.20, receives further refinements in later kernels like 5.15, providing per-cgroup metrics on CPU, memory, and I/O pressure to better detect and mitigate bottlenecks before they impact performance.
Features and Capabilities
Resource Limiting and Control
Control groups (cgroups) provide mechanisms to enforce resource limits and quotas on groups of processes, ensuring predictable resource usage in multi-tenant environments. These limits are categorized into hard limits, which impose strict maximums that cannot be exceeded; soft limits, which serve as preferred thresholds for proactive management; and shares, which enable proportional allocation based on relative weights. For instance, in cgroup v1, the memory controller uses memory.limit_in_bytes for a hard limit on memory usage and memory.soft_limit_in_bytes as a soft limit that guides reclaim when memory comes under pressure.[27] In cgroup v2, these are refined with memory.max for hard limits and memory.high for soft throttling to prevent excessive pressure.[16] Similarly, CPU shares are set via cpu.shares in v1 or cpu.weight (ranging from 1 to 10000, default 100) in v2 to allocate resources proportionally among competing cgroups using weighted fair queuing.[28][15]
Enforcement occurs at the kernel level to prevent resource overcommitment by default, integrating with core subsystems for immediate intervention. For CPU resources, the Completely Fair Scheduler (CFS) throttles tasks exceeding quotas, ensuring fair distribution without allowing bursts beyond allocated shares.[29] Memory enforcement involves direct reclamation attempts followed by invocation of the Out-of-Memory (OOM) killer if usage hits the hard limit and cannot be reduced, targeting processes within the cgroup to free memory.[16] I/O limiting uses device-specific throttling to cap bandwidth or operations, avoiding global impacts from misbehaving workloads.[2]
Practical examples illustrate these controls in action. In cgroup v2, CPU quotas are configured by writing to cpu.max in the format "quota period" (in microseconds), such as "100000 200000" to limit a cgroup to 100ms of CPU time every 200ms for 50% utilization.[15] For I/O, v1's blkio controller sets throttling via blkio.throttle.read_bps_device to restrict read bytes per second on specific devices, while v2's io controller uses io.max for broader bandwidth and IOPS limits, e.g., capping reads at 2MB/s.[2][17]
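Expressed as commands (the cgroup path and the device number 8:0, conventionally /dev/sda, are example values):

```sh
echo "100000 200000" > /sys/fs/cgroup/myapp/cpu.max                # 100 ms of CPU per 200 ms period (~50%)
echo "8:0 rbps=2097152 wiops=120" > /sys/fs/cgroup/myapp/io.max    # cap reads at 2 MB/s and writes at 120 IOPS
```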
Advanced features enhance control through feedback and refined scheduling. Weighted fair queuing underlies CPU allocation, where higher weights grant larger shares during contention, integrated into the CFS for low-latency fairness.[29] Additionally, Pressure Stall Information (PSI), introduced in Linux kernel 4.20 in 2018, provides feedback via per-cgroup pressure files (e.g., cpu.pressure, memory.pressure) that report stall times due to resource contention, enabling dynamic adjustments like load migration to avoid OOM events.[30]
Accounting and Monitoring
Control Groups (cgroups) provide accounting mechanisms to track resource consumption for groups of processes and their descendants, enabling administrators to monitor usage without enforcing limits. These mechanisms rely on kernel-maintained statistics files exposed in each cgroup directory, which report aggregated data from all tasks in the cgroup and its subtree. For instance, the memory controller exposes memory.current to show the total current memory usage in bytes, while the CPU controller provides cpu.stat with fields like usage_usec for total CPU time consumed and nr_throttled for the number of throttling periods when the completely fair scheduler is active.[4]
In cgroups version 1 (v1), accounting is handled per-controller with separate hierarchies, where files such as memory.usage_in_bytes in the memory subsystem report usage for the cgroup and its children, aggregated hierarchically to reflect the tree structure. Version 2 (v2) unifies this into a single hierarchy, improving aggregation by ensuring stats like those in memory.current and cpu.stat inherently include contributions from all descendant cgroups without requiring manual summation. Event counts, such as io.stat in the IO controller for bytes read or written, further detail specific interactions like rbytes for read operations, providing counters for disk I/O without real-time guarantees unless paired with external polling tools.[2][4]
Monitoring in cgroups integrates with Pressure Stall Information (PSI), a kernel feature introduced in version 4.20 that detects and reports resource contention by measuring the time tasks spend stalled waiting for CPU, memory, or I/O. PSI files like cpu.pressure, memory.pressure, and io.pressure are available in cgroup directories, tracking both "some" (partial stalls affecting some tasks) and "full" (complete stalls affecting all tasks) over averaging windows of 10s, 60s, and 300s, with hierarchical aggregation to show system-wide pressure from sub-cgroups. Full PSI support in cgroups v2, including accurate stall accounting across the unified hierarchy, was enabled starting with kernel 5.2.[30][4]
A key improvement in v2 accounting is enhanced slab memory tracking, where the memory.stat file includes slab_reclaimable and slab_unreclaimable counters to distinguish reclaimable kernel slab allocations (like dentries) from permanent ones, providing a more complete view of kernel memory footprint per cgroup since kernel 5.2. These stats are exported to userspace primarily through the cgroup filesystem (cgroupfs) mounted at /sys/fs/cgroup, with process membership visible via /proc/$PID/cgroup, allowing tools to query and aggregate data for monitoring without direct kernel modifications. While cgroups offer no built-in push notifications for usage counters, event files such as cgroup.events and memory.events can be watched with poll() or inotify, enabling event-based monitoring of state changes in advanced setups.[4]
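A brief sketch of reading these statistics for one group (the path is an example; PSI output reports avg10/avg60/avg300 averages and a cumulative total):

```sh
cat /sys/fs/cgroup/myapp/memory.current                            # bytes charged to the group and its descendants
grep -E 'slab_(un)?reclaimable' /sys/fs/cgroup/myapp/memory.stat   # kernel slab footprint, split by reclaimability
cat /sys/fs/cgroup/myapp/memory.pressure                           # PSI "some"/"full" stall times
```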
Usage and Interfaces
Control Interfaces
The primary interface for interacting with control groups (cgroups) from userspace is the cgroup filesystem, mounted by default at /sys/fs/cgroup, which exposes a hierarchical directory structure where cgroups are represented as subdirectories and their properties as files.[2] Users can create, modify, and delete cgroups using standard filesystem operations like mkdir, rmdir, and file writes; for example, writing a process ID (PID) to the cgroup.procs file assigns that process to the cgroup, enabling resource control and monitoring.[4] Key files include cgroup.procs for listing and assigning processes (or thread groups in v1 via tasks), cgroup.subtree_control for enabling controllers in child cgroups (v2-specific), and controller-specific files like memory.max for setting limits.[1]
In cgroups v1, the filesystem supports multiple hierarchies, each mounted separately for specific controllers (e.g., mount -t cgroup cpu /sys/fs/cgroup/cpu), allowing independent management but leading to complexity in overlapping controls.[2] Conversely, cgroups v2 employs a unified hierarchy mounted at a single point (e.g., mount -t cgroup2 none /sys/fs/cgroup/unified), integrating all controllers under one tree to simplify administration and ensure consistent resource delegation from parent to child cgroups.[4] This unified approach eliminates v1's per-controller mount requirements, with available controllers listed in the root's cgroup.controllers file.[1]
Programmatic access is facilitated by libraries and tools such as libcg, a C library from the libcgroup package, which abstracts filesystem operations for creating and managing cgroups. Command-line utilities like cgcreate (to create cgroups) and cgexec (to execute processes within a cgroup) from the same package provide user-friendly wrappers, primarily for v1; while partial v2 support exists in recent versions (e.g., 3.0+ as of 2024), for cgroup v2 it is recommended to use the filesystem interface directly or tools like systemd-run, as full v2 compatibility is still evolving.[31] Systemd, as the default init system on many distributions, offers integrated cgroup management through its unit files and D-Bus APIs, automatically creating cgroups for services (e.g., via system.slice) and allowing resource limits like CPUQuota= to be set declaratively.[32] For delegation, units can enable subcgroup control with Delegate=yes, enabling finer-grained management within slices.[33]
At the kernel level, task movement between cgroups is handled internally by functions such as cgroup_attach_task, invoked when userspace writes to cgroup.procs or equivalent files, ensuring atomic updates and permission checks.[2] In cgroups v2, event files such as cgroup.events support notifications via poll() and inotify—for example when a cgroup becomes empty or frozen—allowing userspace applications to monitor hierarchy dynamics without repeatedly scanning the filesystem.[4]
For systems transitioning to v2, coexistence with v1 is supported in hybrid mode, where unused v2 controllers can be rebound to legacy v1 hierarchies to maintain compatibility for applications relying on v1-specific behaviors, such as per-controller mounts.[4] This fallback ensures gradual migration, with systemd often managing the unified v2 tree while exposing v1 for legacy controllers like blkio.[1]
Configuration Methods
Configuration of control groups (cgroups) can occur at boot time through kernel parameters or at runtime via filesystem operations and tools. Boot-time settings primarily control the hierarchy type and available controllers, ensuring compatibility with system management daemons like systemd. As of 2024, major distributions and init systems like systemd default to cgroup v2, with v1 support deprecated and removed in systemd 258 (September 2025). Container technologies such as Kubernetes have placed v1 in maintenance mode.[34][35] To enable legacy cgroup v1 support on systems defaulting to v2, kernel boot parameters like systemd.legacy_systemd_cgroup_controller=yes can be used for hybrid mode. To enforce a unified cgroup v2 hierarchy exclusively, the kernel boot parameter cgroup_no_v1=all disables all v1 controllers, forcing everything onto v2. Alternatively, systemd.unified_cgroup_hierarchy=1 activates the unified hierarchy when systemd is present, without fully disabling v1. These parameters are added to the kernel command line; for example, on systems using GRUB, edit /etc/default/grub to append them to GRUB_CMDLINE_LINUX_DEFAULT, then run update-grub to apply changes across boots.[4][36]
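A hedged example of what such a GRUB configuration might look like (the exact parameter set depends on the distribution and the goal):

```sh
# /etc/default/grub — append the cgroup parameters to the existing line
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash systemd.unified_cgroup_hierarchy=1 cgroup_no_v1=all"
# regenerate the bootloader configuration and reboot for the change to take effect
sudo update-grub
```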
At runtime, cgroups are managed through the cgroup filesystem, typically mounted at /sys/fs/cgroup. To create a new cgroup, use mkdir in the appropriate hierarchy directory, such as mkdir /sys/fs/cgroup/mygroup for v2.[1] Processes are assigned by writing their PID to the cgroup.procs file: echo <PID> > /sys/fs/cgroup/mygroup/cgroup.procs.[4] Resource limits are set by writing to controller-specific files, like echo 50000 100000 > cpu.max for CPU limits in microseconds.[4]
For scripted management, the libcgroup-tools package provides utilities like cgcreate to create cgroups and cgset to configure parameters. For instance, cgcreate -g cpu:/cpulimited creates a CPU cgroup, followed by cgset -r cpu.shares=512 cpulimited to allocate half the default shares.[37]
A basic script for a CPU-limited group might look like this:
#!/bin/bash
cgcreate -g cpu:/limited
cgset -r cpu.shares=256 limited # relative weight (default is 1024); takes effect only under CPU contention
cgexec -g cpu:limited stress --cpu 4 --timeout 60s
This creates the group, sets shares, and runs a workload within it (v1 example).
To enable delegation, allowing non-root users to manage child cgroups, enable the desired controllers in the parent's cgroup.subtree_control, e.g., echo "+cpu" > /sys/fs/cgroup/user.slice/cgroup.subtree_control, and grant the user write access to the delegated directory and its core interface files. This makes the CPU controller available in subdirectories, as shown below.[4]
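A fuller delegation sketch following the kernel documentation's rules (the directory name and the user "alice" are examples):

```sh
mkdir -p /sys/fs/cgroup/delegated
echo "+cpu +pids" > /sys/fs/cgroup/cgroup.subtree_control      # controllers the subtree may use
chown alice /sys/fs/cgroup/delegated \
            /sys/fs/cgroup/delegated/cgroup.procs \
            /sys/fs/cgroup/delegated/cgroup.subtree_control \
            /sys/fs/cgroup/delegated/cgroup.threads            # grant write access per the delegation rules
# alice can now create child cgroups under "delegated" and move her own processes among them
```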
Additional tools include cgclassify for reclassifying running processes into cgroups, such as cgclassify -g cpu:/limited <PID>, and systemd-run for ad-hoc cgroups without persistent setup: systemd-run --scope -p CPUShares=256 stress --cpu 4.[38][39]
Troubleshooting mount issues often involves verifying the cgroup filesystem is mounted with mount | grep cgroup; if absent, mount manually with mount -t cgroup2 none /sys/fs/cgroup for v2, ensuring controllers are enabled via kernel parameters if needed.[40] Common errors like "no cgroup mount found" arise from mismatched v1/v2 configurations or disabled controllers.[41]
Evolution and Transitions
v1 Redesigns and Enhancements
During the evolution of control groups version 1 (cgroups v1), several redesigns and enhancements were introduced to address scalability limitations, improve resource accounting, and mitigate operational challenges, primarily between 2013 and 2014. One key redesign was the conversion of the cgroup filesystem from the custom cgroupfs to kernfs, completed in Linux kernel 3.15 (released June 2014). This shift leveraged a unified virtual filesystem framework shared with sysfs, significantly enhancing scalability by optimizing directory traversal, reducing lock contention, and lowering memory usage in environments with thousands of cgroups.[42]
Another important enhancement was the addition of namespace isolation for cgroups, introduced in Linux kernel 4.6 (May 2016), which allowed cgroups to be scoped to individual namespaces. This feature enabled processes in different namespaces to maintain isolated views of the cgroup hierarchy, preventing cross-namespace visibility and improving security in containerized setups without affecting the global structure.[1] Experiments with unified hierarchies began in 2013, as discussed at the Linux Kernel Summit, where developers explored consolidating multiple controller-specific hierarchies into a single structure to simplify management and reduce inconsistencies in process classification across controllers.[43]
New features in v1 included extensions to the blkio controller for writeback support, merged in Linux kernel 4.2 (August 2015), which extended I/O throttling to buffered write operations, ensuring accurate accounting and limiting of dirty page writebacks per cgroup. Refinements to the memsw (memory plus swap) interface in the memory controller, around kernel 3.15, improved swap usage tracking by better integrating swap limits with memory pressure notifications, allowing more reliable enforcement of combined memory and swap caps. The perf_event controller, initially added in kernel 2.6.39 (2011) for basic performance event monitoring, saw expansions in kernel 3.14 (March 2014) to integrate more tightly with the core cgroup framework, enabling hierarchical aggregation of perf events like CPU cycles and cache misses for grouped processes.[1][44]
Despite these advances, challenges persisted in v1, particularly with delegation inconsistencies where file permission-based delegation led to varying behaviors across controllers, such as mismatched support for subdirectory creation or process movement. Partial fixes were applied in subsequent kernels, like improved permission checks in 3.15, but full resolution required v2's domain-based delegation. Additionally, the cgroup release agent was refined as a mechanism for automated cleanup; configured via the release_agent file in the root cgroup, it executes a user-defined script when a non-root cgroup becomes empty, aiding in resource reclamation and hierarchy maintenance.[2]
Migration to v2
To migrate from cgroups v1 to v2, the primary step involves enabling the unified v2 hierarchy by mounting the cgroup2 filesystem at the root location, typically via the commandmount -t cgroup2 none /sys/fs/cgroup.[4] This establishes a single hierarchy for all controllers, replacing the multiple v1 hierarchies. Existing v1 hierarchies mounted under /sys/fs/cgroup can then be converted by unmounting them and remounting the v2 filesystem, with processes migrated using the cgroup.procs file in the target v2 cgroup to move PIDs from v1 to v2 structures.[4] For legacy support during transition, v2 offers a compatibility mode that allows hybrid setups where unavailable v1 controllers can be mounted alongside v2, though this is not recommended for full adoption as it maintains fragmentation.[4]
Key tools facilitate the migration process. Systemd runs in unified (v2) mode when the kernel command line includes systemd.unified_cgroup_hierarchy=1, which automates hierarchy conversion during boot on supported systems. For container environments, tools like crictl (part of CRI-tools) allow inspection and management of v2 cgroups in CRI-compatible runtimes such as containerd v1.4+, enabling verification of container paths under /sys/fs/cgroup post-migration.[5] Distributions like Fedora have included automatic migration scripts since Fedora 31 (released in 2019), which detect and switch to v2 on upgrade while handling Docker and other legacy tools via temporary v1 fallbacks.[45]
Migration requires careful consideration of controller compatibility changes. For instance, the v1 freezer controller is replaced in v2 by the cgroup.freeze interface file, which suspends or thaws all tasks in a cgroup by writing 1 or 0, respectively, rather than using separate freezer-specific files.[4] Process management shifts to the unified cgroup.procs file for migrations, which lists and allows writing PIDs to move tasks across cgroups without affecting descendants.[4] Additionally, out-of-memory (OOM) behavior differs; v2 introduces the memory.oom.group knob, which, when enabled, directs the OOM killer to terminate the entire cgroup instead of individual processes, potentially altering application reliability and requiring testing for workloads sensitive to group-wide kills.[4]
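For example, the v2 freezer is driven entirely through two interface files; a sketch assuming a v2 group named "batchjob":

```sh
echo 1 > /sys/fs/cgroup/batchjob/cgroup.freeze      # suspend every task in the group
grep frozen /sys/fs/cgroup/batchjob/cgroup.events   # shows "frozen 1" once the freeze has completed
echo 0 > /sys/fs/cgroup/batchjob/cgroup.freeze      # thaw the group
```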
The benefits of migration include reduced system complexity through a single hierarchy and improved delegation for unprivileged users, enabling safer containerization without root privileges.[4] However, pitfalls arise from the need to update applications and tools reliant on v1-specific interfaces, as not all v1 controllers (e.g., certain legacy ones like blkio) are fully ported, potentially causing compatibility breaks during transition.[1] Full v2 support became available in Linux kernel 5.0 (released in 2019), with subsequent kernels enhancing stability.[46] Recent distribution trends reflect widespread adoption: Fedora 31+ (2019), Ubuntu 21.10+ (2021), Debian 11+ (2021, including Debian 12 in 2023), and RHEL 9 (2022) now default to v2, often with automated boot-time enabling via systemd.[45][47]
Adoption and Integration
Use in Container Technologies
Control groups (cgroups) form the foundational mechanism for resource isolation and management in container technologies, enabling runtimes to enforce limits on CPU, memory, and other resources to prevent any single container from starving the host system. Docker, introduced in 2013, relies on cgroups as a core component for container resource constraints, mapping command-line flags such as --memory to set hard memory limits (e.g., 300m for 300 MiB) and --cpus to restrict CPU time (e.g., 1.5 CPUs on a multi-core host) directly to corresponding cgroup filesystem entries like memory.limit_in_bytes and cpu.cfs_quota_us.[48] Similarly, LXC uses cgroups to allocate and limit resources for containers, integrating them with namespaces for process isolation and ensuring controlled access to host resources such as CPU time and memory usage.[49] Podman, a daemonless alternative to Docker, employs cgroups by default via the --cgroups=enabled option, creating new cgroups under a specified parent path to manage container resource limits and support both v1 and v2 hierarchies.[50]
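A sketch of that mapping with Docker; the container name is arbitrary, and the cgroup path shown assumes a cgroup v2 host using the systemd driver, so the exact location may differ:

```sh
docker run -d --name capped --memory=300m --cpus=1.5 nginx
# the runtime translates the flags into cgroup files, typically under a path like:
cat /sys/fs/cgroup/system.slice/docker-<container-id>.scope/memory.max   # about 314572800 bytes
cat /sys/fs/cgroup/system.slice/docker-<container-id>.scope/cpu.max      # "150000 100000" (quota/period in microseconds)
```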
In orchestration platforms like Kubernetes, cgroups underpin pod-level resource quotas through the ResourceQuota API, which imposes namespace-wide limits on aggregate CPU and memory consumption enforced by the container runtime's cgroup configurations. For instance, a ResourceQuota can cap total memory at 1Gi across all pods in a namespace, with the kubelet instructing the runtime to apply these via cgroups to avoid host resource exhaustion.[51][52] CRI-O, a lightweight Kubernetes runtime, provides direct support for cgroup v2 starting with version 1.20 in late 2020, allowing unified resource delegation and improved hierarchical management for pods.[5] As of August 2024, Kubernetes version 1.31 placed cgroup v1 support in maintenance mode, promoting full adoption of v2 for enhanced resource management.[34]
Practical examples of cgroup application in containers include isolating CPU and memory to mitigate denial-of-service risks; for a memory-limited container, exceeding the cgroup-set threshold triggers the kernel's out-of-memory killer, terminating the process while preserving host stability.[48] Additionally, the net_cls controller tags network packets from a container's cgroup with a class identifier (e.g., writing 0x100001 to net_cls.classid), enabling integration with network namespaces and the Linux traffic control (tc) utility for quality-of-service shaping, such as prioritizing container traffic.[53]
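A hedged sketch of that combination on a v1 system with the net_cls hierarchy mounted (the interface, rate, and group name are examples; classid 0x00100001 corresponds to tc class 10:1):

```sh
mkdir -p /sys/fs/cgroup/net_cls/webapp
echo 0x00100001 > /sys/fs/cgroup/net_cls/webapp/net_cls.classid          # tag packets from this group
tc qdisc add dev eth0 root handle 10: htb
tc class add dev eth0 parent 10: classid 10:1 htb rate 5mbit             # shape this class to 5 Mbit/s
tc filter add dev eth0 parent 10: protocol ip prio 10 handle 1: cgroup   # classify packets by the sender's net_cls tag
```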
The evolution toward cgroup v2 in container ecosystems enhances delegation and simplifies hierarchies, with containerd adopting support in version 1.4 released in August 2020 to facilitate better subtree control and reduced overhead in multi-tenant environments.[54] This shift allows runtimes like containerd to delegate entire cgroup subtrees to containers, improving scalability for orchestration tools like Kubernetes. However, the global cgroup_mutex, which serves as the master lock for any modifications to cgroups or their hierarchies, can experience contention during frequent container creation and destruction operations, potentially impacting performance in high-load scenarios.[2]
Integration with System Management Tools
Systemd, the default init system in most modern Linux distributions, has integrated cgroups for resource management since version 205 in 2013, enabling automatic grouping of processes launched by services, scopes, and slices.[32] Slices organize hierarchical groupings, such as user.slice for per-user resource limits, while services and scopes map directly to cgroup paths for precise control over CPU, memory, and I/O usage of managed processes.[32] This integration allows systemd to enforce limits declaratively in unit files, for example, setting MemoryMax=1G in a service definition to cap memory allocation, as in the drop-in sketch below.[33]
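A hypothetical drop-in illustrating such declarative limits for a service named example.service (the directive names are real systemd settings; the values and file path are arbitrary):

```
# /etc/systemd/system/example.service.d/limits.conf
[Service]
MemoryMax=1G
CPUWeight=200
TasksMax=512
```

After a systemctl daemon-reload, systemd applies such settings by writing the corresponding cgroup files under the service's cgroup.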
Legacy init systems like Upstart provide cgroup support through job configuration files, where the cgroup stanza assigns processes to specific hierarchies, such as cgroup cpu /sys/fs/cgroup/cpu/tasks for CPU shares.[55] Supervisor, a process control system, can extend cgroup functionality via third-party plugins or custom scripts to monitor and limit resources for supervised programs, though it lacks native hierarchical delegation.[56] In Android, the low-memory killer (LMK) has utilized memory cgroups for out-of-memory (OOM) handling since kernel integration around 2012, prioritizing process termination based on cgroup pressure notifiers to maintain system responsiveness on resource-constrained devices.[57]
Key features in systemd include dynamic delegation via the Delegate=yes directive in unit files, which permits services to create and manage sub-cgroups independently while inheriting parent limits.[32] CPU accounting is enabled per-service with the CPUAccounting=yes property, allowing runtime adjustments like systemctl set-property myservice.service CPUQuota=50% to throttle usage.[33] Starting with systemd 254 (2023), enhanced support for cgroup v2 includes improved pressure event handling, where memory pressure propagates up the tree for proactive service adjustments. This includes monitoring the unit's cgroup path, such as /sys/fs/cgroup/system.slice/myservice.service/memory.pressure, for stall events, enabling units to react to contention without full OOM invocation.[33] In May 2024, systemd 256 removed support for cgroup v1 hierarchies, aligning with major Linux distributions that default to v2 unified hierarchies.[58]
Adoption of cgroup integration via systemd is widespread in enterprise environments, with Red Hat Enterprise Linux 8 and later (released 2019) relying on it for service isolation in production workloads, supporting features like per-user slicing to prevent resource exhaustion in multi-tenant setups.[59]