Protection ring

In computer science, hierarchical protection domains,[1][2] often called protection rings, are mechanisms to protect data and functionality from faults (by improving fault tolerance) and malicious behavior (by providing computer security).
Computer operating systems provide different levels of access to resources. A protection ring is one of two or more hierarchical levels or layers of privilege within the architecture of a computer system. This is generally hardware-enforced by some CPU architectures that provide different CPU modes at the hardware or microcode level. Rings are arranged in a hierarchy from most privileged (most trusted, usually numbered zero) to least privileged (least trusted, usually with the highest ring number). On most operating systems, Ring 0 is the level with the most privileges and interacts most directly with the physical hardware such as certain CPU functionality (e.g. the control registers) and I/O controllers.
Special mechanisms are provided to allow an outer ring to access an inner ring's resources in a predefined manner, as opposed to allowing arbitrary usage. Correctly gating access between rings can improve security by preventing programs from one ring or privilege level from misusing resources intended for programs in another. For example, spyware running as a user program in Ring 3 should be prevented from turning on a web camera without informing the user, since hardware access should be a Ring 1 function reserved for device drivers. Programs such as web browsers running in higher numbered rings must request access to the network, a resource restricted to a lower numbered ring.
X86S, a since-canceled Intel architecture specification published in 2024, provided only ring 0 and ring 3. Rings 1 and 2 were to be removed under X86S because modern operating systems do not use them.[3][4]
Implementations
Multiple rings of protection were among the most revolutionary concepts introduced by the Multics operating system, a highly secure predecessor of today's Unix family of operating systems. The GE 645 mainframe computer did have some hardware access control, including the same two modes that the other GE-600 series machines had, and segment-level permissions in its memory management unit ("Appending Unit"), but that was not sufficient to provide full support for rings in hardware, so Multics supported them by trapping ring transitions in software.[5] Its successor, the Honeywell 6180, implemented them in hardware, with support for eight rings.[6] Protection rings in Multics were separate from CPU modes; code in all rings other than ring 0, and some ring 0 code, ran in slave mode.[7]
However, most general-purpose systems use only two rings, even if the hardware they run on provides more CPU modes than that. For example, Windows 7 and Windows Server 2008 (and their predecessors) use only two rings, with ring 0 corresponding to kernel mode and ring 3 to user mode,[8] because earlier versions of Windows NT ran on processors that supported only two protection levels.[9]
Many modern CPU architectures (including the popular Intel x86 architecture) include some form of ring protection, although the Windows NT operating system, like Unix, does not fully utilize this feature. OS/2 does, to some extent, use three rings:[10] ring 0 for kernel code and device drivers, ring 2 for privileged code (user programs with I/O access permissions), and ring 3 for unprivileged code (nearly all user programs). Under DOS, the kernel, drivers and applications typically run on ring 3 (however, this is exclusive to the case where protected-mode drivers or DOS extenders are used; as a real-mode OS, the system runs with effectively no protection), whereas 386 memory managers such as EMM386 run at ring 0. In addition to this, DR-DOS' EMM386 3.xx can optionally run some modules (such as DPMS) on ring 1 instead. OpenVMS uses four modes called (in order of decreasing privileges) Kernel, Executive, Supervisor and User.

A renewed interest in this design structure came with the proliferation of the Xen VMM software, ongoing discussion on monolithic vs. micro-kernels (particularly in Usenet newsgroups and Web forums), Microsoft's Ring-1 design structure as part of their NGSCB initiative, and hypervisors based on x86 virtualization such as Intel VT-x (formerly Vanderpool).
The original Multics system had eight rings, but many modern systems have fewer. The hardware remains aware of the current ring of the executing instruction thread at all times, with the help of a special machine register. In some systems, areas of virtual memory are instead assigned ring numbers in hardware. One example is the Data General Eclipse MV/8000, in which the top three bits of the program counter (PC) served as the ring register. Thus code executing with the virtual PC set to 0xE200000, for example, would automatically be in ring 7, and calling a subroutine in a different section of memory would automatically cause a ring transfer.
The hardware severely restricts the ways in which control can be passed from one ring to another, and also enforces restrictions on the types of memory access that can be performed across rings. On x86, for example, a call instruction can reference a special gate structure that transfers control in a controlled way to predefined entry points in lower-numbered (more trusted) rings; this functions as a supervisor call in many operating systems that use the ring architecture. The hardware restrictions are designed to limit opportunities for accidental or malicious breaches of security. In addition, the most privileged ring may be given special capabilities (such as real memory addressing that bypasses the virtual memory hardware).
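The gate idea can be sketched as a small simulation. This is only an illustrative model, not how any CPU implements gates: the names `Gate`, `call_through_gate`, `ProtectionFault` and the entries in `GATES` are hypothetical, and the single check on the caller's ring stands in for the fuller set of checks real hardware performs.

```python
from dataclasses import dataclass
from typing import Callable, Dict


class ProtectionFault(Exception):
    """Raised when a ring transition is not permitted (hypothetical name)."""


@dataclass(frozen=True)
class Gate:
    target_ring: int                   # ring the entry point runs in (lower = more trusted)
    max_caller_ring: int               # highest-numbered (least trusted) ring allowed to call
    entry_point: Callable[[int], int]  # the one predefined entry point behind this gate


def call_through_gate(gates: Dict[str, Gate], name: str, caller_ring: int, arg: int) -> int:
    """Transfer control to an inner ring, but only via a registered gate."""
    gate = gates.get(name)
    if gate is None:
        # Arbitrary addresses in the inner ring are not reachable at all.
        raise ProtectionFault(f"no gate named {name!r}")
    if caller_ring > gate.max_caller_ring:
        raise ProtectionFault(f"ring {caller_ring} may not use gate {name!r}")
    return gate.entry_point(arg)


# Illustrative gate table: one gate open to ring 3, one reserved for rings 0-1.
GATES = {
    "double": Gate(target_ring=0, max_caller_ring=3, entry_point=lambda x: 2 * x),
    "driver_only": Gate(target_ring=0, max_caller_ring=1, entry_point=lambda x: x + 1),
}

print(call_through_gate(GATES, "double", caller_ring=3, arg=21))  # prints 42
```

A ring 3 caller asking for `"driver_only"` raises `ProtectionFault`, which mirrors the point of the passage: the outer ring chooses among predefined entry points and cannot jump to arbitrary inner-ring code.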
ARM version 7 architecture implements three privilege levels: application (PL0), operating system (PL1), and hypervisor (PL2). Unusually, level 0 (PL0) is the least-privileged level, while level 2 is the most-privileged level.[11] ARM version 8 implements four exception levels: application (EL0), operating system (EL1), hypervisor (EL2), and secure monitor / firmware (EL3), for AArch64[12]: D1-2454 and AArch32.[12]: G1-6013
Ring protection can be combined with processor modes (master/kernel/privileged/supervisor mode versus slave/unprivileged/user mode) in some systems. Operating systems running on hardware supporting both may use both forms of protection or only one.
Effective use of ring architecture requires close cooperation between hardware and the operating system. Operating systems designed to work on multiple hardware platforms may make only limited use of rings if they are not present on every supported platform. Often the security model is simplified to "kernel" and "user" even if hardware provides finer granularity through rings.[13]
Modes
Supervisor mode
In computer terms, supervisor mode is a hardware-mediated flag that can be changed by code running in system-level software. System-level tasks or threads may[a] have this flag set while they are running, whereas user-level applications will not. This flag determines whether it is possible to execute machine code operations such as modifying registers for various descriptor tables, or performing operations such as disabling interrupts. The idea of having two different modes to operate in comes from the principle that "with more power comes more responsibility": a program in supervisor mode is trusted never to fail, since a failure may cause the whole computer system to crash.
Supervisor mode is "an execution mode on some processors which enables execution of all instructions, including privileged instructions. It may also give access to a different address space, to memory management hardware and to other peripherals. This is the mode in which the operating system usually runs."[14]
In a monolithic kernel, the operating system runs in supervisor mode and the applications run in user mode. Other types of operating systems, like those with an exokernel or microkernel, do not necessarily share this behavior.
Some examples from the PC world:
- Linux, macOS and Windows are three operating systems that use supervisor/user mode. To perform specialized functions, user mode code must perform a system call into supervisor mode, where trusted operating system code in kernel space performs the needed task and returns execution to user space. Additional code can be added into kernel space through the use of loadable kernel modules, but only by a user with the requisite permissions, as this code is not subject to the access control and safety limitations of user mode.
- DOS (for as long as no 386 memory manager such as EMM386 is loaded), as well as other simple operating systems and many embedded devices run in supervisor mode permanently, meaning that drivers can be written directly as user programs.
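As a small illustration of the system-call path described in the first bullet, the sketch below asks the kernel for the current and parent process identifiers. The helper name `describe_process` is hypothetical. `os.getpid` and `os.getppid` are standard-library wrappers, and it is assumed here that on the systems listed above they end up requesting the information from the kernel rather than reading it from user-space memory.

```python
import os


def describe_process() -> dict:
    """Ask the operating system for identifiers that only the kernel tracks."""
    # Each call leaves user mode, runs trusted kernel code, and returns a result.
    return {"pid": os.getpid(), "parent_pid": os.getppid()}


info = describe_process()
print(info)
```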
Most processors have at least two different modes. The x86 processors have four privilege levels, arranged as four rings. Programs that run in Ring 0 can do anything with the system, and code that runs in Ring 3 should be able to fail at any time without impact to the rest of the computer system. Ring 1 and Ring 2 are rarely used, but could be configured with different levels of access.
In most existing systems, switching from user mode to kernel mode has an associated high cost in performance. It has been measured, on the basic request getpid, to cost 1000–1500 cycles on most machines. Of these just around 100 are for the actual switch (70 from user to kernel space, and 40 back), the rest is "kernel overhead".[15][16] In the L3 microkernel, the minimization of this overhead reduced the overall cost to around 150 cycles.[15]
Maurice Wilkes wrote:[17]
... it eventually became clear that the hierarchical protection that rings provided did not closely match the requirements of the system programmer and gave little or no improvement on the simple system of having two modes only. Rings of protection lent themselves to efficient implementation in hardware, but there was little else to be said for them. [...] The attractiveness of fine-grained protection remained, even after it was seen that rings of protection did not provide the answer... This again proved a blind alley...
To gain performance and determinism, some systems place functions that would likely be viewed as application logic, rather than as device drivers, in kernel mode; security applications (access control, firewalls, etc.) and operating system monitors are cited as examples. At least one embedded database management system, eXtremeDB Kernel Mode, has been developed specifically for kernel mode deployment, to provide a local database for kernel-based application functions, and to eliminate the context switches that would otherwise occur when kernel functions interact with a database system running in user mode.[18]
Functions are also sometimes moved across rings in the other direction. The Linux kernel, for instance, injects into processes a vDSO section which contains functions that would normally require a system call, i.e. a ring transition. Instead of doing a syscall these functions use static data provided by the kernel. This avoids the need for a ring transition and so is more lightweight than a syscall. The function gettimeofday can be provided this way.
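One way to see the vDSO from user space on Linux is to look for its mapping in the process's own memory map. The sketch below is a minimal, Linux-specific check: it assumes `/proc/self/maps` is available and that the mapping is labeled `[vdso]`, and the function name `vdso_mappings` is hypothetical. On other systems, or where the file cannot be read, it simply returns an empty list.

```python
from pathlib import Path
from typing import List


def vdso_mappings(maps_path: str = "/proc/self/maps") -> List[str]:
    """Return the memory-map lines that describe the kernel-provided vDSO."""
    try:
        text = Path(maps_path).read_text()
    except OSError:
        # Not Linux, or /proc is not mounted: report nothing rather than fail.
        return []
    return [line for line in text.splitlines() if line.rstrip().endswith("[vdso]")]


for mapping in vdso_mappings():
    print(mapping)
```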
Hypervisor mode
Recent CPUs from Intel and AMD offer x86 virtualization instructions for a hypervisor to control Ring 0 hardware access. Although they are mutually incompatible, both Intel VT-x (codenamed "Vanderpool") and AMD-V (codenamed "Pacifica") allow a guest operating system to run Ring 0 operations natively without affecting other guests or the host OS.
Before hardware-assisted virtualization, guest operating systems ran under ring 1. Any operation that required a higher privilege level (ring 0) produced an interrupt and was then handled in software; this is called "Trap and Emulate".
To assist virtualization and reduce the overhead of this trapping, VT-x and AMD-V allow the guest to run under Ring 0. VT-x introduces VMX Root/Non-root Operation: the hypervisor runs in VMX Root Operation mode, possessing the highest privilege. Guest operating systems run in VMX Non-Root Operation mode, which allows them to operate at ring 0 without having actual hardware privileges. VMX non-root operation and VMX transitions are controlled by a data structure called a virtual-machine control structure (VMCS).[19] These hardware extensions allow classical "Trap and Emulate" virtualization to be performed on the x86 architecture, now with hardware support.
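The software-only "Trap and Emulate" scheme can be modeled in a few lines. This is a toy simulation under stated assumptions: the instruction names, the `PRIVILEGED` set, and the functions `cpu_execute` and `hypervisor_run_guest` are all hypothetical, and a Python exception stands in for the hardware trap.

```python
from typing import List


class PrivilegedInstructionTrap(Exception):
    """Stands in for the fault raised when ring 0 is required but not held."""

    def __init__(self, instruction: str) -> None:
        super().__init__(instruction)
        self.instruction = instruction


# Illustrative subset of operations that require ring 0 in this model.
PRIVILEGED = {"cli", "sti", "load_cr3"}


def cpu_execute(instruction: str, ring: int) -> str:
    """Run an instruction directly, trapping if it needs more privilege."""
    if instruction in PRIVILEGED and ring != 0:
        raise PrivilegedInstructionTrap(instruction)
    return f"{instruction} executed natively in ring {ring}"


def hypervisor_run_guest(instruction: str, virtual_state: List[str], guest_ring: int = 1) -> str:
    """Let the guest run in ring 1 and emulate whatever traps."""
    try:
        return cpu_execute(instruction, guest_ring)
    except PrivilegedInstructionTrap as trap:
        # The hypervisor applies the effect to the guest's virtual machine state
        # instead of the real hardware.
        virtual_state.append(trap.instruction)
        return f"{trap.instruction} emulated by hypervisor"


state: List[str] = []
print(hypervisor_run_guest("add", state))
print(hypervisor_run_guest("cli", state))
```

Ordinary instructions run at full speed, while each privileged one costs a trap plus emulation; that cost is the overhead the hardware extensions described above are meant to reduce.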
Privilege level
A privilege level in the x86 instruction set controls the access of the program currently running on the processor to resources such as memory regions, I/O ports, and special instructions. There are four privilege levels, ranging from 0, the most privileged, to 3, the least privileged. Most modern operating systems use level 0 for the kernel/executive, and use level 3 for application programs. Any resource available to level n is also available to levels 0 to n, so the privilege levels are rings. When a less privileged process tries to access a resource reserved for a more privileged level, a general protection fault exception is reported to the OS.
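The ring property stated above (a resource available to level n is available to levels 0 to n) reduces to a single comparison. The sketch below is a simplified model: `check_access` and `GeneralProtectionFault` are hypothetical names, and real x86 segment checks also take a requested privilege level into account, which is omitted here.

```python
class GeneralProtectionFault(Exception):
    """Stands in for the exception the CPU reports to the operating system."""


def check_access(current_level: int, resource_level: int) -> None:
    """Allow access only if the running code is at least as privileged as required."""
    for level in (current_level, resource_level):
        if not 0 <= level <= 3:
            raise ValueError("x86 privilege levels range from 0 to 3")
    # Lower numbers mean more privilege, so level 0 passes every check.
    if current_level > resource_level:
        raise GeneralProtectionFault(
            f"level {current_level} cannot access a level {resource_level} resource"
        )


check_access(current_level=0, resource_level=3)  # kernel touching user data: allowed
check_access(current_level=3, resource_level=3)  # application touching its own data: allowed
```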
It is not necessary to use all four privilege levels. Current operating systems with wide market share including Microsoft Windows, macOS, Linux, iOS and Android mostly use a paging mechanism with only one bit to specify the privilege level as either Supervisor or User (U/S Bit). Windows NT uses the two-level system.[20] Real-mode programs on the 8086 execute at level 0 (the highest privilege level), whereas virtual 8086 mode executes all programs at level 3.[21]
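The one-bit scheme can be illustrated by decoding a page-table entry. The sketch assumes the conventional x86 layout in which bit 0 is Present, bit 1 is Read/Write and bit 2 is User/Supervisor; the constant and function names are hypothetical, and the model looks at a single entry, whereas real hardware combines the bit across every level of the paging hierarchy.

```python
PTE_PRESENT = 1 << 0   # page is mapped
PTE_WRITABLE = 1 << 1  # writes are permitted
PTE_USER = 1 << 2      # U/S bit: set means user-mode code may access the page


def user_mode_may_access(pte: int) -> bool:
    """Return True if user-mode code may touch the page described by this entry."""
    return bool(pte & PTE_PRESENT) and bool(pte & PTE_USER)


kernel_page = PTE_PRESENT | PTE_WRITABLE           # supervisor-only page
user_page = PTE_PRESENT | PTE_WRITABLE | PTE_USER  # page visible to applications

print(user_mode_may_access(kernel_page))  # prints False
print(user_mode_may_access(user_page))    # prints True
```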
Potential future uses for the multiple privilege levels supported by the x86 ISA family include containerization and virtual machines. A host operating system kernel could use instructions with full privilege access (kernel mode), whereas applications running on the guest OS in a virtual machine or container could use the lowest level of privileges in user mode. The virtual machine and guest OS kernel could themselves use an intermediate level of instruction privilege to invoke and virtualize kernel-mode operations such as system calls from the point of view of the guest operating system.[22]
IOPL
The IOPL (I/O Privilege Level) flag is a flag found on all IA-32 compatible x86 CPUs. It occupies bits 12 and 13 in the FLAGS register. In protected mode and long mode, it shows the I/O privilege level of the current program or task. The Current Privilege Level (CPL) (CPL0, CPL1, CPL2, CPL3) of the task or program must be less than or equal to the IOPL in order for the task or program to access I/O ports.
The IOPL can be changed using POPF(D) and IRET(D) only when the current privilege level is Ring 0.
Besides IOPL, the I/O Port Permissions in the TSS also take part in determining the ability of a task to access an I/O port.
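The two checks described in this section can be put together in a short sketch. It assumes the usual protected-mode behavior in which the TSS I/O permission bitmap is consulted only when CPL is greater than IOPL, and in which a cleared bit means the port is permitted. The function names are hypothetical, and only single-byte port accesses are modeled.

```python
IOPL_SHIFT = 12
IOPL_MASK = 0b11 << IOPL_SHIFT  # bits 12 and 13 of FLAGS


def iopl_from_flags(flags: int) -> int:
    """Extract the two-bit I/O privilege level from a FLAGS value."""
    return (flags & IOPL_MASK) >> IOPL_SHIFT


def io_allowed(flags: int, cpl: int, port: int, io_bitmap: bytes) -> bool:
    """Decide whether code at privilege level `cpl` may access `port`."""
    if cpl <= iopl_from_flags(flags):
        return True  # privileged enough; the bitmap is not consulted
    byte_index, bit = divmod(port, 8)
    if byte_index >= len(io_bitmap):
        return False  # ports beyond the end of the bitmap are denied
    return ((io_bitmap[byte_index] >> bit) & 1) == 0  # cleared bit = permitted


print(iopl_from_flags(0x3000))                   # prints 3
print(io_allowed(0x0202, 3, 0x60, bytes(16)))    # prints True
print(io_allowed(0x0202, 3, 0x60, b"\xff" * 16)) # prints False
```

In the second call the task runs at CPL 3 with IOPL 0, so access to port 0x60 depends entirely on the bitmap. An all-zero bitmap permits it and an all-ones bitmap denies it.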
Miscellaneous
In x86 systems, x86 hardware virtualization (VT-x and SVM) is referred to as "ring −1", System Management Mode is referred to as "ring −2", and the Intel Management Engine and AMD Platform Security Processor are sometimes referred to as "ring −3".[23]
Use of hardware features
Many CPU hardware architectures provide far more flexibility than is exploited by the operating systems that they normally run. Proper use of complex CPU modes requires very close cooperation between the operating system and the CPU, and thus tends to tie the OS to the CPU architecture. When the OS and the CPU are specifically designed for each other, this is not a problem (although some hardware features may still be left unexploited), but when the OS is designed to be compatible with multiple, different CPU architectures, a large part of the CPU mode features may be ignored by the OS. For example, the reason Windows uses only two levels (ring 0 and ring 3) is that some hardware architectures that were supported in the past (such as PowerPC or MIPS) implemented only two privilege levels.[8]
Multics was an operating system designed specifically for a special CPU architecture (which in turn was designed specifically for Multics), and it took full advantage of the CPU modes available to it. However, it was an exception to the rule. Today, this high degree of interoperation between the OS and the hardware is not often cost-effective, despite the potential advantages for security and stability.
Ultimately, the purpose of distinct operating modes for the CPU is to provide hardware protection against accidental or deliberate corruption of the system environment (and corresponding breaches of system security) by software. Only "trusted" portions of system software are allowed to execute in the unrestricted environment of kernel mode, and then, in paradigmatic designs, only when absolutely necessary. All other software executes in one or more user modes. If a processor generates a fault or exception condition in a user mode, in most cases system stability is unaffected; if a processor generates a fault or exception condition in kernel mode, most operating systems will halt the system with an unrecoverable error. When a hierarchy of modes exists (ring-based security), faults and exceptions at one privilege level may destabilize only the higher-numbered privilege levels. Thus, a fault in Ring 0 (the kernel mode with the highest privilege) will crash the entire system, but a fault in Ring 2 will only affect Rings 3 and beyond and Ring 2 itself, at most.
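The containment rule in the last two sentences can be written as a one-line function. This is only a restatement of the rule as given in the text, with a hypothetical function name and an assumed default of four rings.

```python
from typing import List


def rings_affected_by_fault(faulting_ring: int, ring_count: int = 4) -> List[int]:
    """List the rings that a fault in `faulting_ring` may destabilize, at most."""
    if not 0 <= faulting_ring < ring_count:
        raise ValueError("ring number out of range")
    # The faulting ring itself and every higher-numbered (less privileged) ring.
    return list(range(faulting_ring, ring_count))


print(rings_affected_by_fault(2))  # prints [2, 3]
print(rings_affected_by_fault(0))  # prints [0, 1, 2, 3]
```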
Transitions between modes are at the discretion of the executing thread when the transition is from a level of high privilege to one of low privilege (as from kernel to user modes), but transitions from lower to higher levels of privilege can take place only through secure, hardware-controlled "gates" that are traversed by executing special instructions or when external interrupts are received.
Microkernel operating systems attempt to minimize the amount of code running in privileged mode, for purposes of security and elegance, but ultimately sacrificing performance.
See also
- Call gate (Intel)
- Memory segmentation
- Protected mode – available on x86-compatible 80286 CPUs and newer
- IOPL (CONFIG.SYS directive) – an OS/2 directive to run DLL code at ring 2 instead of at ring 3
- Segment descriptor
- Supervisor Call instruction
- System Management Mode (SMM)
- Principle of least privilege
Notes
References
- Karger, Paul A.; Herbert, Andrew J. (1984). An Augmented Capability Architecture to Support Lattice Security and Traceability of Access. 1984 IEEE Symposium on Security and Privacy. p. 2. doi:10.1109/SP.1984.10001. ISBN 0-8186-0532-4. S2CID 14788823.
- Binder, W. (2001). "Design and implementation of the J-SEAL2 mobile agent kernel". Proceedings 2001 Symposium on Applications and the Internet. pp. 35–42. doi:10.1109/SAINT.2001.905166. ISBN 0-7695-0942-8. S2CID 11066378.
- "Envisioning a Simplified Intel Architecture for the Future". Intel. Retrieved 28 May 2024.
- Tanenbaum, Andrew S. (2015). Modern Operating Systems (4th ed.). Pearson. pp. 479–480. ISBN 978-0-13-359162-0.
  "For many years, the x86 has supported four protection modes or rings [...]. Ring 3 is the least privileged [...]. Ring 0 is the most privileged [...]. The remaining two rings are not used by any current operating system."
- Schroeder, Michael D.; Saltzer, Jerome H. (March 1972). "A Hardware Architecture for Implementing Protection Rings". Communications of the ACM. 15 (3). Retrieved 27 September 2012.
- "Multics Glossary - ring". Retrieved 27 September 2012.
- The Multics Virtual Memory, part 2 (PDF). Honeywell Information Systems. June 1972. pp. 160–161.
- Russinovich, Mark E.; Solomon, David A. (2005). Microsoft Windows Internals (4th ed.). Microsoft Press. p. 16. ISBN 978-0-7356-1917-3.
- Russinovich, Mark (2012). Windows Internals Part 1 (6th ed.). Redmond, Washington: Microsoft Press. p. 17. ISBN 978-0-7356-4873-9.
  "The reason Windows uses only two levels is that some hardware architectures that were supported in the past (such as Compaq Alpha and Silicon Graphics MIPS) implemented only two privilege levels."
- "Presentation Device Driver Reference for OS/2 – 5. Introduction to OS/2 Presentation Drivers". Archived from the original on 15 June 2015. Retrieved 13 June 2015.
- ARM Architecture Reference Manual ARMv7-A and ARMv7-R edition. Arm Ltd. p. B1-1136.
- Arm Architecture Reference Manual Armv8, for A-profile architecture. Arm Ltd.
- Tanenbaum, Andrew S. (2015). Modern Operating Systems (4th ed.). Pearson. pp. 479–480. ISBN 978-0-13-359162-0.
  "For many years, the x86 has supported four protection modes or rings [...]. Ring 3 is the least privileged [...]. Ring 0 is the most privileged [...]. The remaining two rings are not used by any current operating system."
- "supervisor mode". FOLDOC. 15 February 1995.
- Liedtke, Jochen (December 1995). "On µ-Kernel Construction". Proc. 15th ACM Symposium on Operating System Principles (SOSP).
- Ousterhout, J. K. (1990). Why aren't operating systems getting faster as fast as hardware?. Usenix Summer Conference. Anaheim, CA. pp. 247–256.
- Wilkes, Maurice (April 1994). "Operating systems in a changing world". ACM SIGOPS Operating Systems Review. 28 (2): 9–21. doi:10.1145/198153.198154. ISSN 0163-5980. S2CID 254134.
- Gorine, Andrei; Krivolapov, Alexander (May 2008). "Kernel Mode Databases: A DBMS Technology For High-Performance Applications". Dr. Dobb's Journal.
- Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 3C (PDF). Intel Corporation (published September 2016). 2016. pp. 1–3.
- Russinovich, Mark E.; Solomon, David A. (2005). Microsoft Windows Internals (4th ed.). Microsoft Press. p. 16. ISBN 978-0-7356-1917-3.
- Mathur, Sunil. Microprocessor 8086: Architecture, Programming and Interfacing (Eastern Economy ed.). PHI Learning.
- Anderson, Thomas; Dahlin, Michael (21 August 2014). "2.2". Operating Systems: Principles and Practice (2nd ed.). Recursive Books. ISBN 978-0985673529.
- De Gelas, Johan. "Hardware Virtualization: the Nuts and Bolts". AnandTech. Archived from the original on 4 April 2010. Retrieved 13 March 2021.
- Intel 80386 Programmer's Reference
Further reading
- David T. Rogers (June 2003). A framework for dynamic subversion (PDF) (MSc). Naval Postgraduate School. hdl:10945/919.
- William J. Caelli (2002). "Relearning "Trusted Systems" in an Age of NIIP: Lessons from the Past for the Future". Archived from the original (PDF) on 20 April 2015.
- Haruna R. Isa; William R. Shockley; Cynthia E. Irvine (May 1999). "A Multi-threading Architecture for Multilevel Secure Transaction Processing" (PDF). Proceedings of the 1999 IEEE Symposium on Security and Privacy. Oakland, CA. pp. 166–179. hdl:10945/7198.
- Ivan Kelly (8 May 2006). "Porting MINIX to Xen" (PDF). Archived from the original (PDF) on 27 August 2006.
- Paul Barham; Boris Dragovic; Keir Fraser; Steven Hand; Tim Harris; Alex Ho; Rolf Neugebauer; Ian Pratt; Andrew Warfield (2003). "Xen and the Art of Virtualization" (PDF).
- Marcus Peinado; Yuqun Chen; Paul England; John Manferdelli. "NGSCB: A Trusted Open System" (PDF). Archived from the original (PDF) on 4 March 2005.
- Michael D. Schroeder; Jerome H. Saltzer (1972). "A Hardware Architecture for Implementing Protection Rings".
- "Intel Architecture Software Developer's Manual Volume 3: System Programming (Order Number 243192)" (PDF). Chapter 4 "Protection"; section 4.5 "Privilege levels". Archived from the original (PDF) on 19 February 2009.
- Tzi-cker Chiueh; Ganesh Venkitachalam; Prashant Pradhan (December 1999). "Integrating segmentation and paging protection for safe, efficient and transparent software extensions". Proceedings of the seventeenth ACM symposium on Operating systems principles. Section 3: Protection hardware features in Intel X86 architecture; subsection 3.1 Protection checks. doi:10.1145/319151.319161. ISBN 1581131402. S2CID 9456119.
- Takahiro Shinagawa; Kenji Kono; Takashi Masuda (17 May 2000). "Exploiting Segmentation Mechanism for Protecting Against Malicious Mobile Code" (PDF). Chapter 3 Implementation; section 3.2.1 Ring Protection. Archived from the original (PDF) on 10 August 2017. Retrieved 2 April 2018.
- Boebert, William Earl; R. Kain (1985). A Practical Alternative to Hierarchical Integrity Policies. 8th National Computer Security Conference.
- Gorine, Andrei; Krivolapov, Alexander (May 2008). "Kernel Mode Databases: A DBMS technology for high-performance applications". Dr. Dobb's Journal.
Protection ring
Fundamentals
Definition and Purpose
Protection rings are a hierarchical mechanism in computer systems designed to separate privileges into concentric layers, visualized as nested circles where the innermost ring grants the highest level of access to system resources and the outermost ring provides the most restricted access.[4] Inner rings, such as ring 0, typically encompass full system control, while outer rings, like ring 3, limit operations to user-level tasks to prevent interference with critical components.[4] This structure enforces a principle of least privilege, ensuring that processes execute only within their designated access boundaries.[1]
The primary purposes of protection rings include enforcing security through the isolation of user code from kernel code, thereby preventing unauthorized access to sensitive hardware and data structures.[4] They also enable fault isolation by containing errors or malicious actions within a specific ring, limiting their potential to propagate and compromise the entire system.[4] Additionally, protection rings facilitate controlled resource sharing, allowing designated interactions between rings while maintaining overall separation to support cooperative multitasking.[4]
Key benefits of this approach encompass improved system stability, as faults in outer rings can be handled without destabilizing inner privileged operations; enhanced resistance to malware, which is confined to less privileged layers; and efficient resource management through mediated access controls.[4] Conceptually, the rings are depicted as concentric circles with escalating privileges toward the center, symbolizing the decreasing scope of authority from core system functions outward to applications.[1] These privilege levels form the basis for ring-based enforcement, with further details covered in the Privilege Levels section.[4]
Historical Development
The concept of protection rings originated in the late 1960s as part of efforts to enhance security in multi-user operating systems. The Multics system, developed jointly by MIT, Bell Labs, and General Electric from 1964 to 1969, introduced the first implementation of ring-based protection to isolate processes and enforce access controls in a shared environment.[5] This hierarchical model used up to eight rings, with lower-numbered rings granting higher privileges, allowing controlled sharing of resources while preventing unauthorized access by less privileged components.[1]
Earlier influences on ring-like protection came from capability-based systems in the 1960s. The Burroughs B5000, released in 1961, employed tagged memory and descriptors for object protection, which inspired subsequent hierarchical schemes by providing a foundation for fine-grained access control without direct addressing.[6] These ideas evolved into explicit rings during Multics' hardware design on the Honeywell 645 processor around 1969-1970.[7]
In the 1970s, protection rings gained adoption in minicomputers. Digital Equipment Corporation's VAX architecture, introduced in 1977 with the VAX-11/780, incorporated four protection rings to support secure multitasking in its VMS operating system, building on PDP-11's simpler mode-based protection from the early 1970s.[8] The PDP-11 series, starting in 1970, provided basic kernel/user modes and memory management units for process isolation, laying groundwork for VAX's multi-ring extension.[9]
A key milestone occurred in the 1980s with the integration of rings into personal computing via Intel's x86 architecture. Although the 8086 (1978) lacked rings, the 80286 processor, released in 1982, introduced four privilege levels to enable protected mode operation, influencing OS designs like OS/2 and early Windows NT.[10] This evolved through subsequent x86 processors, maintaining the four-ring structure for backward compatibility.
Modern developments extended rings to support virtualization in the early 2000s. Hypervisors like Xen (2003) and VMware utilized hardware extensions to operate at a conceptual "ring -1" level below ring 0, isolating guest OS kernels for improved security in multi-tenant environments.[11] The x86-64 architecture, launched by AMD Opteron in 2003, preserved the four-ring model while enhancing 64-bit addressing and protection, ensuring ongoing relevance in server and desktop systems.[12]
Adoption of protection rings followed a timeline aligned with computing paradigms: in the 1970s with minicomputers like VAX for enterprise use; in the 1980s with PCs via 80286-based systems such as the IBM PC/AT (1984); and in the 2010s with mobile and embedded devices, where ARM architectures adapted ring-like privilege levels (e.g., EL0-EL3 in ARMv8, 2011) for secure OS isolation in smartphones and IoT.[13]
Core Mechanisms
Privilege Levels
Protection rings establish a hierarchical structure of privilege levels to enforce security boundaries in computer systems. Originating from the Multics operating system, this model organizes privileges into concentric layers, where inner rings possess greater access rights than outer ones.[1] In many architectures, such as x86, there are typically four levels numbered 0 through 3, with ring 0 representing the highest privilege and ring 3 the lowest.[14] Other architectures employ varying numbers of levels; for instance, ARM's AArch64 uses four exception levels (EL0 to EL3), though implementations often focus on two primary levels for user and kernel execution.[15]
Access rules within these levels ensure that code executing in an inner (more privileged) ring can access resources allocated to outer (less privileged) rings, but the reverse is prohibited to prevent unauthorized escalation.[1] This asymmetry is enforced through hardware mechanisms, such as segment descriptors in x86 that specify privilege levels for memory segments, or page tables that apply user/supervisor bits to restrict access based on the current execution level.[14] In ARM, similar controls use exception levels to gate access to system registers and memory regions, ensuring higher levels cannot be directly invoked from lower ones without validation.[15]
Transitions between privilege levels are mediated by controlled mechanisms to maintain isolation. For example, system calls allow user-level code to request kernel services, triggering a switch to a higher privilege level with checks on parameters and stack integrity before execution proceeds.[14] These transitions, often facilitated by instructions like gates or exception handlers, include validation to prevent invalid accesses, and returns to lower levels are similarly restricted to authorized paths.[1]
In practice, ring assignments allocate the most sensitive operations to the innermost level: ring 0 is reserved for the operating system kernel, which manages hardware directly.[14] Rings 1 and 2, intended for device drivers or execution environments, are rarely utilized in modern systems due to legacy design considerations and the dominance of a two-level model.[14] Ring 3 handles user applications, limiting their scope to prevent interference with system stability.[14]
Conceptually, protection rings serve as security boundaries that perform privilege checks on critical operations, including memory accesses, I/O port interactions, and interrupt handling.[1] Violations, such as a user process attempting direct hardware access, result in faults or exceptions, thereby isolating faults and enhancing overall system reliability.[14]
Operational Modes
Protection rings facilitate dynamic runtime execution through distinct operational modes that enforce privilege boundaries during program operation. These modes determine the scope of hardware access and resource manipulation, ensuring that less privileged code cannot compromise system integrity. In this context, supervisor mode and user mode represent the primary operational states, with additional modes like hypervisor mode extending capabilities for advanced virtualization. Supervisor mode, corresponding to the highest privilege level (ring 0), grants unrestricted access to all hardware resources, including direct manipulation of memory, I/O devices, and control registers. This mode is essential for kernel operations, such as process scheduling, device driver management, and interrupt handling, where the operating system requires full control to maintain system stability.[16] In contrast, user mode operates at the lowest privilege level (ring 3), imposing strict restrictions on application code to prevent direct hardware interactions that could lead to system crashes or security vulnerabilities. Applications running in user mode can access only user-space memory and invoke system services indirectly, relying on mediated calls to supervisor-mode routines for privileged operations. This separation isolates user processes, containing faults within their allocated resources and escalating only necessary exceptions to higher privileges.[16] Transitions between these modes, known as mode switching, occur through controlled mechanisms such as interrupts, exceptions, or procedure call gates, which involve privilege checks, stack segment changes, and state saving to maintain isolation. For instance, a system call from user mode triggers an interrupt that switches to supervisor mode, executing the requested service before returning, with hardware verifying that the transition adheres to privilege rules to avoid unauthorized escalations. 
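The mode switch around a system call can be illustrated with a toy model. This is a sketch under stated assumptions, not a description of any real processor: `ToyCPU`, its `services` table, and the service number `1` are all hypothetical names introduced for illustration.

```python
class PrivilegeFault(Exception):
    """Raised when user-mode code attempts a supervisor-only operation."""


class ToyCPU:
    def __init__(self):
        self.mode = "user"
        self.control_register = 0
        # Hypothetical service table: system call number -> supervisor routine.
        self.services = {1: self._set_control_register}

    def write_control_register(self, value):
        # Privileged operation: only permitted in supervisor mode.
        if self.mode != "supervisor":
            raise PrivilegeFault("control register write in user mode")
        self.control_register = value

    def _set_control_register(self, value):
        self.write_control_register(value)
        return self.control_register

    def syscall(self, number, arg):
        # Validate the request before any privilege change happens.
        if number not in self.services:
            return -1
        saved_mode = self.mode          # state saving
        self.mode = "supervisor"        # controlled entry
        try:
            return self.services[number](arg)
        finally:
            self.mode = saved_mode      # return to the caller's mode


cpu = ToyCPU()
try:
    cpu.write_control_register(7)
except PrivilegeFault as fault:
    print("fault:", fault)
print(cpu.syscall(1, 7), cpu.mode)  # 7 user
```

The direct write faults, while the same operation succeeds when requested through the mediated entry point, and the caller is back in user mode afterwards.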
These switches ensure secure inter-mode communication while minimizing overhead through efficient context handling.[16]
Hypervisor mode, conceptually positioned as ring -1, introduces an additional layer of privilege above supervisor mode, primarily for virtualization environments. Enabled by extensions like Intel VT-x, this mode allows a hypervisor to oversee and partition multiple operating system instances, each running in its own virtualized rings, by trapping and emulating privileged operations from guest systems. It provides the highest oversight privileges, enabling secure management of virtual machines without interference from guest OS kernels.[16]
The implications of these operational modes extend to error containment and system reliability: a fault raised in user mode is delivered as an exception to a supervisor-mode handler, which can confine its effects to the offending process, while supervisor and hypervisor modes incorporate safeguards to isolate virtualized or kernel-level errors from the broader system. In short, these modes apply the static hierarchy of privilege levels to runtime behavior, enhancing overall security without excessive performance penalties.[16]
Architectural Implementations
x86-Specific Features
The x86 architecture implements a hierarchical protection model using four privilege rings, numbered 0 through 3, where ring 0 provides the highest privileges for kernel-mode execution and ring 3 offers the lowest for user-mode applications.[14] Access to privileged resources, such as certain instructions, memory segments, and I/O ports, is enforced based on the current privilege level (CPL), which is stored in the low-order two bits of the code segment (CS) register.[14] The CPL determines the ring under which the currently executing code operates and triggers general-protection exceptions (#GP) if attempts are made to access higher-privilege elements from a less privileged ring.[14] Transitions between rings are mediated by gate descriptors in the global descriptor table (GDT), local descriptor table (LDT), or interrupt descriptor table (IDT), ensuring controlled privilege changes without direct jumps to arbitrary code.[14] Call gates enable inter-ring calls by loading new segment selectors and offsets, performing stack switches to separate stacks for different privilege levels, and validating parameters to prevent unauthorized access.[14] Interrupt gates and trap gates handle hardware interrupts and exceptions, automatically switching to a more privileged ring (typically ring 0) while saving the processor state on the appropriate stack and clearing the interrupt flag (IF) in the EFLAGS register for interrupt gates to disable maskable interrupts during handling.[14] I/O operations in x86 are further regulated by the I/O privilege level (IOPL), encoded in bits 12 and 13 of the EFLAGS register, which specifies the minimum CPL required to execute sensitive I/O instructions like IN and OUT.[14] If the CPL exceeds the IOPL, these instructions trigger a general-protection exception, though ring 3 code can perform limited I/O via the I/O permission bit map in the task state segment (TSS) when IOPL is set to 0.[14] This mechanism allows operating systems to grant 
user-level processes controlled access to specific I/O ports without full kernel privileges.
Rings 1 and 2, intended for intermediate privileges such as device drivers or executive services, have been largely unused in modern operating systems, which typically split into ring 0 for the kernel and ring 3 for applications.[14] This simplification stems from the complexity of managing multiple intermediate levels, leaving them supported mainly for legacy software.[14] Additionally, x86 processors start in real mode at power-on, which emulates an 8086 environment with no ring protections or paging. The full ring mechanism becomes active only after a transition to protected mode, which is enabled by setting the protection enable (PE) bit in the CR0 control register.[14]
In the x86-64 extensions, introduced with long mode, the core four-ring structure is retained without fundamental hardware alterations, but segmentation is simplified into a flat 64-bit address space in which segment bases are ignored except for FS and GS.[14] Call gates are expanded to support 64-bit offsets and return instruction pointers (RIP), while hardware task switches are no longer supported and are replaced by software-managed context switches alongside a 64-bit TSS; the CPL continues to govern privilege checks, and the stack pointer is aligned to 16 bytes when an interrupt or exception is delivered.[14]
Historical vulnerabilities in x86 protection rings have included kernel-mode buffer overflows that enable privilege escalations to ring 0, such as stack overflows in performance counter code allowing return-to-user (ret2usr) attacks to hijack control flow and execute arbitrary ring 0 code from ring 3.[17] These exploits, documented in kernel subsystems like perf_counter.c, underscore the risks of inadequate bounds checking in privileged code, often leading to privilege escalation despite gate protections.[17]
Implementations in Other Architectures
In the ARM architecture, protection is implemented through exception levels (ELs) rather than traditional rings, providing a hierarchy of privilege levels in AArch64. EL0 represents the least privileged level, dedicated to unprivileged user applications, while EL1 handles kernel-mode operations for the operating system. EL2 supports hypervisor functionality for virtualization, and EL3 offers the highest privilege for secure monitor or firmware code, enabling isolation between secure and non-secure worlds. This four-level structure allows finer granularity than simpler two-mode systems, with exceptions routing execution to higher levels for privilege escalation.[18] The MIPS architecture employs three primary operating modes—user, supervisor, and kernel—controlled by the KSU bits in the coprocessor 0 (CP0) status register, with additional isolation via coprocessor usability controls. Privilege levels are enforced through four coprocessor access bits (CU0–CU3) in the status register, where CP0 (system control) is always usable in kernel mode but restricted otherwise, triggering a coprocessor unusable exception on unauthorized access. This mechanism provides effective privilege granularity for memory management and interrupts without dedicated hardware rings beyond the three modes.[19] RISC-V utilizes a modular privilege architecture with base modes of machine (M), supervisor (S), and user (U), extensible to include hypervisor (H) mode via the hypervisor extension. Machine mode grants full hardware access for bootloaders and firmware, supervisor mode manages virtual memory and OS services, and user mode executes applications with restricted privileges, enforced by control-status registers like mstatus and sstatus. 
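As an illustration of how a status register records privilege, the sketch below decodes the "previous privilege" fields of a RISC-V `mstatus` value. The field positions (MPP in bits 12:11, SPP in bit 8) and the mode encodings (U = 0, S = 1, M = 3) are taken from the RISC-V privileged specification rather than from this article and should be checked against it; the function name `previous_modes` is a hypothetical helper.

```python
# Privilege-mode encodings assumed from the RISC-V privileged specification.
PRIV_NAMES = {0b00: "U", 0b01: "S", 0b11: "M"}  # 0b10 is reserved

MSTATUS_SPP_SHIFT = 8    # 1-bit field: mode before a trap into S-mode
MSTATUS_MPP_SHIFT = 11   # 2-bit field: mode before a trap into M-mode


def previous_modes(mstatus):
    """Return (MPP, SPP) as mode names for a raw mstatus value."""
    mpp = (mstatus >> MSTATUS_MPP_SHIFT) & 0b11
    spp = (mstatus >> MSTATUS_SPP_SHIFT) & 0b1
    return PRIV_NAMES.get(mpp, "reserved"), PRIV_NAMES[spp]


# A trap taken from S-mode into M-mode records S in MPP.
print(previous_modes(0b01 << MSTATUS_MPP_SHIFT))  # ('S', 'U')
```

On return from a trap, the hardware restores the mode recorded in these fields, which is how execution drops back to the less privileged level.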
The hypervisor extension extends supervisor mode into a hypervisor-extended supervisor mode (HS) and adds virtual supervisor (VS) and virtual user (VU) modes for guests, supporting two-stage address translation and guest isolation; custom extensions additionally allow tailored privilege schemes for embedded or high-performance systems.[20]
In the PowerPC and IBM POWER architectures, privilege is divided into problem state for user applications and privileged state (supervisor or kernel) for OS code, with an additional hypervisor state in the POWER series for virtualization support. The machine state register (MSR) bits, such as PR (problem state) and HV (hypervisor state), determine the current state, where problem state limits access to privileged instructions, and hypervisor state enables secure partitioning of resources among virtual machines. This two-to-three state model focuses on efficient context switching for embedded and server environments, with hardware enforcement of isolation through segment registers and page protection.[21]
Compared to x86's fixed four rings, these architectures often feature fewer base levels, typically two to four, optimized for specific domains: ARM and RISC-V emphasize flexibility for mobile and open-source ecosystems, MIPS prioritizes coprocessor efficiency in legacy embedded systems, and PowerPC/POWER targets high-reliability servers with strong virtualization. Modern trends show increased adoption of multi-level protections in server-grade chips, such as ARMv8's exception levels introduced in the 2010s, which support confidential computing and reduced privilege overhead in data centers.
Hardware and Software Integration
Utilization of Hardware Features
Hardware features in modern CPUs enforce protection rings primarily through integrated memory management units (MMUs) that incorporate privilege checks at the hardware level. Page tables, a core component of virtual memory systems, include bits such as user/supervisor flags to restrict access based on the current ring level; for instance, supervisor-only pages prevent user-mode code (typically ring 3) from reading or writing kernel memory (ring 0). Segmentation mechanisms complement this by defining memory regions with associated privilege attributes, where attempts to access segments outside the allowed ring trigger immediate hardware intervention. These features ensure isolation without software mediation, as the CPU hardware directly validates every memory reference against the current privilege level.[22] Interrupt handling relies on descriptor tables that map interrupt vectors to handlers executed at elevated privilege levels, maintaining ring integrity during asynchronous events. In x86 architectures, the Interrupt Descriptor Table (IDT) specifies the ring level for each handler entry, routing interrupts to ring 0 code while stacking the previous privilege state to prevent unauthorized escalation. This hardware routing ensures that even hardware-generated interrupts, such as those from timers or devices, adhere to protection boundaries, with the CPU automatically switching stacks if needed to isolate handler execution.[14] Protections for timers and I/O operations are enforced through hardware barriers that limit direct access from lower rings. Port I/O instructions, essential for device communication, are privileged and fault if executed outside the kernel ring, while direct memory access (DMA) by peripherals is safeguarded via I/O memory management units (IOMMUs) that apply ring-like translations and access controls to DMA requests, preventing unauthorized memory writes. 
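The user/supervisor check on page-table entries can be sketched as follows. The bit positions (present = bit 0, writable = bit 1, user = bit 2) follow the x86 page-table entry layout; the model is deliberately simplified and assumes write protection applies to all rings (CR0.WP set) while ignoring features such as SMEP, SMAP, and no-execute. `check_access` and `PageFault` are illustrative names.

```python
PTE_PRESENT = 1 << 0   # page is mapped
PTE_WRITABLE = 1 << 1  # writes allowed
PTE_USER = 1 << 2      # accessible from ring 3


class PageFault(Exception):
    """Stands in for the #PF exception the MMU would raise."""


def check_access(pte, cpl, write):
    """Validate one memory reference against a page-table entry."""
    if not pte & PTE_PRESENT:
        raise PageFault("page not present")
    if cpl == 3 and not pte & PTE_USER:
        raise PageFault("supervisor-only page accessed from ring 3")
    if write and not pte & PTE_WRITABLE:
        raise PageFault("write to read-only page")
    return True


kernel_page = PTE_PRESENT | PTE_WRITABLE
print(check_access(kernel_page, cpl=0, write=True))  # True
try:
    check_access(kernel_page, cpl=3, write=False)
except PageFault as fault:
    print("fault:", fault)
```

The same entry that is readable and writable from ring 0 faults on any ring 3 access, which is the isolation property the text describes. An IOMMU applies a comparable lookup-and-check step to each DMA request issued by a device.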
IOMMU implementations such as Intel VT-d translate device-visible addresses to physical ones and enforce access permissions on each request, blocking DMA attacks that could bypass CPU rings.
Virtualization extensions extend the ring model by introducing a hypervisor privilege level, commonly termed ring -1, which operates outside guest OS rings. Intel VT-x and AMD-V achieve this with dedicated instructions (VMX on Intel, SVM on AMD) that manage transitions between host (root) and guest (non-root) modes, allowing the hypervisor to intercept and emulate privileged operations without compromising host security. This adds a layer of isolation where guest kernels run in ring 0 relative to their virtual environment but are confined by hardware-enforced VM exits to the hypervisor's higher privilege.
Fault mechanisms provide runtime enforcement by generating exceptions on ring violations, such as executing a privileged instruction like HLT from user mode. The general protection fault (#GP) is a key example: the CPU aborts the offending instruction, pushes an error code, and vectors to a ring 0 handler via the IDT, enabling the OS to terminate or sandbox the offending process. This immediate hardware response minimizes exposure to exploits attempting cross-ring access.[14]
While these hardware checks enhance security, they introduce performance overhead during switches between rings, as the CPU must validate privileges, switch stacks, and reload segment registers, and may incur TLB flushes where the kernel uses separate page tables. Studies indicate that ring transitions can add hundreds of cycles to switch latency on modern x86 processors; excessive switches may degrade performance in transition-heavy workloads.[23]
Privilege Escalation Techniques
Privilege escalation in protection ring architectures involves controlled mechanisms to transition from less privileged rings, such as ring 3 (user mode), to more privileged rings, like ring 0 (kernel mode), while preserving system security. One primary method is through system calls, which use software interrupts to invoke kernel services. For instance, in x86 architectures, the INT 0x80 instruction generates a software interrupt that vectors to a handler in the Interrupt Descriptor Table (IDT), escalating privilege to ring 0 by loading CS from the gate descriptor and switching to the kernel stack (SS and ESP) recorded in the Task State Segment (TSS).[14] Modern implementations often employ the faster SYSCALL instruction, which loads the kernel entry point from the IA32_LSTAR Model-Specific Register (MSR) and performs the transition without full interrupt overhead, returning via SYSRET after handling.[24]
Traps and exceptions provide automatic escalation for error conditions or synchronous events, ensuring the kernel can intervene without explicit user requests. Traps, which are reported after the triggering instruction executes (e.g., INT3 for debugging), and exceptions like general protection faults (#GP) or page faults (#PF) invoke IDT handlers that escalate to ring 0, saving the processor state on the kernel stack and, for some exceptions, pushing an error code.[14] Upon resolution, control returns to the original ring using IRET, which restores the instruction pointer (EIP) and flags, thus maintaining isolation.[14] These mechanisms align with the operational modes described earlier by temporarily shifting to supervisor mode during handling.
Higher-level abstractions like doors and portals offer OS-specific interfaces for mediated access across rings, reducing direct kernel entry.
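The privilege checks on the software-interrupt path described earlier in this section (INT n through an IDT gate) can be sketched as below. This is a simplified model under stated assumptions: it covers only the DPL comparisons for a non-conforming handler code segment and omits selector, limit, and present-bit checks; `deliver_software_interrupt` is a hypothetical name.

```python
class GeneralProtectionFault(Exception):
    """Stands in for the #GP exception."""


def deliver_software_interrupt(cpl, gate_dpl, handler_dpl):
    """Model INT n: return (new_cpl, stack_switched) or raise #GP.

    cpl         -- privilege level of the code executing INT n
    gate_dpl    -- DPL stored in the IDT gate descriptor
    handler_dpl -- DPL of the handler's code segment
    """
    # The caller must be at least as privileged as the gate allows.
    if cpl > gate_dpl:
        raise GeneralProtectionFault("gate DPL forbids INT from this ring")
    # A gate may not transfer control to a less privileged handler.
    if handler_dpl > cpl:
        raise GeneralProtectionFault("handler is less privileged than caller")
    # Entering a more privileged ring switches to the stack recorded in the TSS.
    return handler_dpl, handler_dpl < cpl


# A system-call gate is typically given DPL 3 so that ring 3 may invoke it.
print(deliver_software_interrupt(cpl=3, gate_dpl=3, handler_dpl=0))  # (0, True)
```

A gate with DPL 0, as commonly used for hardware exception vectors, rejects the same request from ring 3 with a fault, which prevents user code from spoofing exceptions with INT n.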
In systems based on the Mach microkernel, ports serve as capability-based endpoints for inter-process communication (IPC): user threads send messages via trap instructions to kernel-managed ports, enabling the kernel to validate and execute privileged operations on behalf of the caller, in some cases without a full context switch.[25] This controlled mediation enforces port rights (e.g., send or receive permissions) to limit exposure.
Miscellaneous techniques include task gates for task switching with privilege changes, and avoiding escalation altogether through same-ring libraries. Task gates, which can reside in the GDT, LDT, or IDT, point to a TSS for switching execution contexts during interrupts or calls, validating access via Descriptor Privilege Level (DPL) checks before the switch.[14][26] User-space libraries, operating within ring 3, handle common functions without ring transitions, minimizing overhead while respecting ring boundaries.
Despite these safeguards, privilege escalation introduces security risks when exploited, such as through kernel vulnerabilities that bypass ring protections. The Dirty COW exploit (CVE-2016-5195), disclosed in 2016, leveraged a race condition in the Linux kernel's copy-on-write mechanism to allow unprivileged users in ring 3 to gain write access to read-only memory mappings, enabling local escalation to root, for example by modifying files the user could not otherwise write.[27]
To mitigate injection during transitions, hardware performs rigorous validation on parameters and descriptors. In x86, the processor checks the Current Privilege Level (CPL) and the Requestor Privilege Level (RPL) against the DPL of gates or segments, requiring CPL ≤ DPL and RPL ≤ DPL (numerically) before allowing access; violations trigger a #GP exception.[14] On a privilege change, the new stack is loaded from the TSS, and paging flags (e.g., the user/supervisor bit) further enforce access rules post-transition.[14]
Applications and Implications
Role in Operating Systems
In modern operating systems, protection rings facilitate a structured division of privileges, with ring 0 typically reserved for the kernel to manage core services such as process scheduling, memory allocation, and device drivers, while ring 3 hosts user-space applications to ensure controlled access to system resources.[28][29] This design is evident in major systems like Linux, Windows, and macOS, where the kernel operates in ring 0 to execute privileged instructions, and user applications run in ring 3, relying on controlled transitions for kernel interactions.[3][30]
Monolithic kernels, such as those in Linux and traditional Windows NT, primarily utilize rings 0 and 3, confining most kernel components (including drivers and executive services) to ring 0 for efficiency, while user processes operate in ring 3.[31] In contrast, microkernels like QNX employ a more distributed approach, running the minimal kernel in ring 0 and placing drivers and other services in ring 3 to enhance modularity and fault isolation.[32][33] For example, in Linux, system calls provide the interface for ring transitions, where user-space processes in ring 3 invoke kernel services in ring 0 via mechanisms like the syscall instruction, so that user code never gains ring 0 privileges directly.[34] Similarly, the Windows NT executive, comprising components like the object manager and process manager, executes entirely in ring 0 to oversee system-wide operations.[35]
Virtualization introduces additional layers, with hypervisors such as Xen and KVM utilizing a conceptual ring -1 (enabled by hardware extensions like Intel VT-x) to isolate guest operating systems, allowing multiple kernels to run in ring 0 of their virtual environments while the host hypervisor maintains oversight.[36]
Over time, operating systems have evolved toward reducing ring 0 complexity, as seen in user-space frameworks such as FUSE in Linux, which lets filesystems be implemented in ring 3 to limit
kernel modifications and exposure. Compatibility with legacy code is maintained by executing it within ring 3 user-space environments, often through emulation layers or compatibility subsystems that prevent direct access to privileged operations.[28]
Security and Isolation Benefits
Protection rings provide a fundamental mechanism for isolating faults and malicious activities within less privileged execution environments, thereby safeguarding the kernel from unauthorized access or compromise. By confining user-mode processes to outer rings, such as ring 3 on x86 architectures, these mechanisms ensure that errors, crashes, or exploits in applications do not propagate to the kernel space, maintaining system stability and integrity.[37] This isolation is achieved through hardware-enforced boundaries that prevent direct memory access or privileged instruction execution from lower-privilege rings, effectively containing malware or faulty code to prevent widespread system damage.[38] The enforcement of least privilege via protection rings significantly reduces the attack surface by restricting user applications from accessing sensitive kernel resources, such as direct memory manipulation or hardware controls. For instance, applications in ring 3 cannot perform operations like modifying kernel memory without transitioning through controlled gates, which require validation to prevent unauthorized escalation.[39] This hierarchical access control aligns with core security principles, limiting potential damage from compromised processes and enabling secure multitasking environments.[40] Despite these benefits, protection rings have notable limitations, particularly vulnerabilities in the innermost ring (ring 0) that can undermine overall isolation. 
The 2018 Meltdown and Spectre attacks exploited speculative execution in modern CPUs to bypass ring boundaries, allowing user-level code to read kernel memory and exposing sensitive data across isolation layers.[41] Such flaws highlight that while rings provide coarse-grained protection, they rely on flawless kernel implementation and can be circumvented by microarchitectural side channels, necessitating additional software mitigations.[42] To address these gaps, protection rings are often augmented with complementary security layers for enhanced isolation. Techniques like Address Space Layout Randomization (ASLR) randomize memory layouts to complicate exploits targeting ring transitions, while mandatory access control systems such as SELinux operate within the kernel ring to enforce fine-grained policies beyond basic ring privileges.[43] In containerization environments like Docker, rings underpin the kernel's namespace and cgroup isolation, allowing multiple isolated workloads to share the host kernel securely without direct ring-level access from containers.[44] A prominent case study is Android's implementation of SELinux in the kernel ring, where it confines system services and apps to specific security contexts, preventing privilege escalations even if a user process is compromised. 
This kernel-enforced MAC has blocked numerous exploits targeting Android's multimedia and networking components, contributing to the platform's resilience against zero-day attacks.[45] Historically, breaches like the 2016 Dirty COW vulnerability showed how kernel bugs can undermine ring isolation: a race condition in the kernel allowed user-space processes to gain write access to read-only memory, leading to root privilege escalation and widespread exploits across Linux distributions.[46]
Looking ahead, hardware advancements such as Intel's Control-flow Enforcement Technology (CET) offer enhanced protection by providing shadow stacks and indirect branch tracking (ENDBRANCH verification) at the instruction level, complementing ring-based isolation with defenses against control-flow hijacking attacks without relying solely on coarse ring hierarchies.[47] These features, reported as enabled by default in Windows and Linux as of 2025, help mitigate exploits like return-oriented programming that could otherwise subvert privileged code, paving the way for more robust, layered defenses in future systems.[48]
References
- https://wiki.xenproject.org/wiki/Introduction_to_Xen_3.x