Power10
| General information | |
|---|---|
| Launched | 2021 |
| Designed by | IBM, OpenPower partners |
| Common manufacturer | Samsung |
| Performance | |
| Max. CPU clock rate | 3.5 GHz to 4 GHz |
| Cache | |
| L1 cache | 48+32 KB per core |
| L2 cache | 2 MB per core |
| L3 cache | 120 MB per chip |
| Architecture and classification | |
| Technology node | 7 nm |
| Microarchitecture | P10 |
| Instruction set | Power ISA (Power ISA v.3.1) |
| History | |
| Predecessor | POWER9 |
| Successor | Power11 |
Power10 is a superscalar, multithreading, multi-core microprocessor family, based on the open source Power ISA, announced in August 2020 and available from September 2021. The processor is designed to have 15 cores available. The main features of Power10 are higher performance per watt and better memory and I/O architectures, with a focus on artificial intelligence (AI) workloads. Each Power10 core has doubled up on most functional units compared to its predecessor POWER9. Power10 is available in a range of IBM models and is supported by operating systems including Linux 5.9 and PowerVM. The branding is unusual in that its name is not capitalized like POWER9 and all other previous POWER processors.
Description
The Power10 superscalar, multithreading, multi-core microprocessor family is based on the open source Power ISA. It was announced in August 2020 at the Hot Chips conference. Systems with Power10 CPUs have been generally available since September 2021, starting with the IBM Power E1080 enterprise server. The processor is designed to have 15 cores available, but a spare core is included during manufacture to cost-effectively allow for yield issues. The main features of Power10 are higher performance per watt and better memory and I/O architectures, with a focus on artificial intelligence (AI) workloads.[1]
Power10-based processors are manufactured by Samsung using a 7 nm process with 18 layers of metal, with 18 billion transistors on a 602 mm² silicon die.[2][3][4][5]
Design
Each Power10 core has doubled up on most functional units compared to its predecessor POWER9. The core is eight-way multithreaded (SMT8) and has a 48 KB instruction and a 32 KB data L1 cache, a 2 MB L2 cache and a very large translation lookaside buffer (TLB) with 4096 entries.[4] Latencies to the different cache levels and the TLB have been reduced significantly. Each core has eight execution slices, each with one floating-point unit (FPU), arithmetic logic unit (ALU), branch predictor, load–store unit and SIMD engine, able to be fed 128-bit (64+64) instructions from the new prefix/fuse instructions of Power ISA v.3.1. Each execution slice can handle 20 instructions, backed by a shared 512-entry instruction table, and fed to a 128-entry (64 single-threaded) load queue and an 80-entry (40 single-threaded) store queue. Better branch prediction features have doubled the accuracy. A core has four matrix math assist (MMA) engines,[6] for better handling of SIMD code, especially matrix multiplication instructions, where AI inference workloads see a 20-fold performance increase.[7]
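The matrix multiplication that the MMA engines target can be pictured as a series of rank-1 (outer-product) updates into a small accumulator tile. The sketch below is plain Python for illustration only: the function names `outer_product_accumulate` and `tile_matmul` and the tile sizes are assumptions made for this example, and it does not describe the actual instruction interface.

```python
def outer_product_accumulate(acc, x, y):
    """Add the outer product of vectors x and y to the accumulator tile, in place."""
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            acc[i][j] += xi * yj
    return acc


def tile_matmul(a, b):
    """Multiply a (m x k) by b (k x n) as k rank-1 updates of one m x n tile."""
    m, k, n = len(a), len(b), len(b[0])
    acc = [[0.0] * n for _ in range(m)]  # hypothetical accumulator tile
    for p in range(k):
        column_of_a = [a[i][p] for i in range(m)]
        row_of_b = b[p]
        outer_product_accumulate(acc, column_of_a, row_of_b)
    return acc


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
print(tile_matmul(a, b))  # [[19.0, 22.0], [43.0, 50.0]]
```

Keeping the running sums in the accumulator tile means each input element is read once per update, which is the property a dedicated matrix engine is built to exploit.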
The processor has two "hemispheres" with eight cores each, and each hemisphere shares a 64 MB L3 cache, for a total of 16 cores and 128 MB of L3 cache. Due to yield issues, at least one core is always disabled, which removes 8 MB of L3 cache and leaves a usable total of 15 cores and 120 MB of L3 cache. Each chip also has eight crypto accelerators offloading common algorithms such as AES and SHA-3.
Increased clock gating and a reworked microarchitecture at every stage, fuse/prefix instructions that do more work with fewer work units, and a smarter cache with lower memory latencies and effective address tagging that reduces cache misses together enable the Power10 core to consume half the power of a POWER9 core. Combined with improvements of up to 30% in the compute facilities, this makes the whole processor perform 2.6× better per watt than its predecessor. With two chips mounted on the same module, it is up to 3 times as fast in the same power budget.
Because each core can act as eight logical processors, the 15-core processor looks like 120 cores to the operating system. On a dual-chip module, that becomes 240 simultaneous threads per socket.
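The core, cache and thread totals quoted in this section follow from a few multiplications. The short calculation below reproduces them; the constant names are made up for this sketch, and the values are the ones stated in the text above.

```python
HEMISPHERES = 2
CORES_PER_HEMISPHERE = 8
L3_PER_HEMISPHERE_MB = 64
DISABLED_CORES = 1       # at least one core is disabled for yield
THREADS_PER_CORE = 8     # SMT8
CHIPS_PER_DCM = 2

physical_cores = HEMISPHERES * CORES_PER_HEMISPHERE
l3_per_core_mb = L3_PER_HEMISPHERE_MB // CORES_PER_HEMISPHERE
usable_cores = physical_cores - DISABLED_CORES
usable_l3_mb = usable_cores * l3_per_core_mb
threads_per_chip = usable_cores * THREADS_PER_CORE
threads_per_dcm = threads_per_chip * CHIPS_PER_DCM

print(f"physical cores: {physical_cores}, usable cores: {usable_cores}")
print(f"usable L3: {usable_l3_mb} MB ({l3_per_core_mb} MB per core)")
print(f"threads per chip: {threads_per_chip}, per dual-chip module: {threads_per_dcm}")
```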
I/O
The chips have completely reworked memory and I/O architectures, using the open Coherent Accelerator Processor Interface (OpenCAPI) and Open Memory Interface (OMI). Using serial memory communications to off-chip controllers reduces the number of signaling lanes to and from the chip, increases the bandwidth and allows the processor to be flexible in its memory technology.[5]
Power10 supports a wide range of memory types, including DDR3 through DDR5, GDDR, HBM, or Persistent Storage Memory. These configurations can be changed by the customer to best fit the use case intended for the system.
- DDR4 – support for up to 16 TiB RAM, 410 GB/s, 10 ns latency
- GDDR6 – up to 800 GB/s
- Persistent storage – up to 2 PB
Power10 enables encrypting of data with no performance penalty at every stage from RAM, across accelerators and cluster nodes to data at rest.
Power10 comes with the PowerAXON facility, which provides chip-to-chip and system-to-system links as well as an OpenCAPI bus for accelerators, I/O and other high-performance cache-coherent peripherals. It manages the communications between nodes in a 16-socket single chip module (SCM) cluster or a 4-socket dual chip module (DCM) cluster. It also manages the memory semantics for clustering of systems, enabling load/store access from a core to up to 2 PB of RAM across the entire Power10 cluster. IBM calls this feature Memory Inception.
Both OMI and PowerAXON can handle 1 TB/s communications off the chip.
Power10 includes PCIe 5. The SCM has 32x and the DCM has 64x PCIe 5 lanes. The decision to remove NVLink support from Power10 was made due to PCIe 5.0's bandwidth capabilities rendering NVLink support obsolete for the use cases that Power10 was designed for.[4] Support for NVLink on-chip was previously a unique selling point for POWER8 and POWER9.
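To give a feel for what these lane counts mean, the sketch below estimates raw link bandwidth from the PCIe 5.0 signalling rate of 32 GT/s per lane and its 128b/130b line encoding. The function name `pcie5_bandwidth_gb_s` is an assumption for this example, and the result ignores packet and protocol overhead, so real throughput is lower.

```python
GT_PER_S_PER_LANE = 32            # PCIe 5.0 raw signalling rate per lane
ENCODING_EFFICIENCY = 128 / 130   # 128b/130b line encoding


def pcie5_bandwidth_gb_s(lanes):
    """Approximate one-direction bandwidth in GB/s, ignoring protocol overhead."""
    bits_per_second = lanes * GT_PER_S_PER_LANE * 1e9 * ENCODING_EFFICIENCY
    return bits_per_second / 8 / 1e9


for name, lanes in (("x16 slot", 16), ("SCM, x32", 32), ("DCM, x64", 64)):
    print(f"{name}: about {pcie5_bandwidth_gb_s(lanes):.0f} GB/s per direction")
```

Doubling each figure gives the bidirectional number that is often quoted for a link.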
Variants
The Power10 chip is available in two variants, defined by firmware in the packaging. Even though the chips are physically identical and the difference is set in firmware, the variant cannot be changed by the user or by IBM after manufacturing.[8]
- 15× SMT8 cores
  - Optimized for high throughput but less compute intensive applications
- 30× SMT4 cores
  - Optimized for highly compute intensive applications that require complex instruction sets and multiple cycles for information loaded into cache
Modules
The Power10 comes in three flip-chip plastic land grid array (FC-PLGA) packages: one single chip module (SCM) and two dual-chip modules (DCM and eSCM).
- SCM, single chip module – 3.6-4.15 GHz, up to 15 SMT8 cores. Can be clustered up to 16 sockets. x32 PCIe 5 lanes. Module size: 68.5×77.5 mm. The module has a unique configuration with 8 connectors on the substrate (OTF) for symmetric multiprocessing (SMP) cables directly connecting other Power10 SCM modules.
- DCM, dual chip module – 3.4-4.0 GHz, up to 24 SMT8 cores. Can be clustered up to four sockets. x64 PCIe 5 lanes. The DCM is in the same thermal range as previous offerings. Module size: 74.5×85.75 mm. The DCM comes in four variants.[9]
  - EPEU - 12 cores, 3.36-4.0 GHz
  - EPEV - 18 cores, 3.2-4.0 GHz
  - EPGW - 24 cores, 2.95-3.9 GHz
  - EHC8 - 24 cores, 2.95-3.9 GHz (for North American healthcare)
- eSCM, entry single chip module – 3.0-3.9 GHz, up to 8 SMT8 cores. It combines two Power10 chips: the first is fully functional with 4-8 active cores, while the second uses only its PCIe functionality and acts as an I/O switch providing additional PCIe lanes. These eSCM modules can be clustered up to four sockets. Module size: 74.5×85.75 mm. The eSCM is also called the "ioscm".[10][11]
Systems
Power10 is available in a range of IBM computers.
Enterprise
The IBM Power E1080, codename Denali, is IBM's top-end Power10 computer. It consists of 1-4× Central Electronics Complex (CEC) nodes, each taking up 5U of rack space. Each node has 4× Power10 SCM, configurable with 10, 12, or 15 SMT8 cores per processor, and up to 16 TB OMI-DDR4 RAM. The Power E1080 natively runs PowerVM, which hosts AIX, IBM i and little-endian Linux.[12] An E1080 system also needs a 2U-high System Control Unit for monitoring and configuration.
The Power E1080 also supports up to sixteen I/O expansion drawers, four per CEC node. Each expansion drawer is connected to the respective CEC node by two PCIe fanout modules, and has twelve FHFL PCIe slots. Four of these slots are PCIe 3.0 x16, while the remaining eight are PCIe 3.0 x8. A maximum specification configuration allows the Power E1080 to support 192 single slot PCIe cards across a 16 socket system.[13]
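The maximum-configuration figures in the two paragraphs above can be checked with a short calculation. The constant names are chosen for this sketch; the values are the ones given in the text.

```python
MAX_CEC_NODES = 4
SOCKETS_PER_NODE = 4
DRAWERS_PER_NODE = 4
SLOTS_PER_DRAWER = 12
X16_SLOTS_PER_DRAWER = 4

sockets = MAX_CEC_NODES * SOCKETS_PER_NODE
drawers = MAX_CEC_NODES * DRAWERS_PER_NODE
total_slots = drawers * SLOTS_PER_DRAWER
x16_slots = drawers * X16_SLOTS_PER_DRAWER
x8_slots = total_slots - x16_slots

print(f"{sockets} sockets, {drawers} I/O expansion drawers")
print(f"{total_slots} drawer slots: {x16_slots} x16 and {x8_slots} x8")
```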
Mid-range
- IBM Power E1050 - 4U case. 2-4× CPU sockets for 2-4× DCM modules, 24-96 cores. 64× OMI memory slots which support up to 16 TB RAM. 11× PCIe slots, 8× gen.5 and 3× gen.4. 10 slots for up to 64 TB of NVMe based SSDs. Runs a mix of Linux, AIX or IBM i operating systems.[9]
Scale-out
The S-models can run Linux, IBM i and AIX. The L-models are made for Linux, but are allowed to run AIX and IBM i on up to 25% of available CPU cores.[10]
- IBM Power S1024 & L1024 - 4U case. 1-2× CPU sockets for 1-2× DCM modules, 24-48 cores. 32× OMI memory slots which support up to 8 TB RAM. 10× PCIe slots, 8× gen.5 and 2× gen.4. 16 slots for up to 102 TB of NVMe based SSDs.
- IBM Power S1022 & L1022 - 2U case. 1-2× CPU sockets for 1-2× DCM modules, 24-40 cores. 32× OMI memory slots which support up to 4 TB RAM. 10× PCIe slots, 8× gen.5 and 2× gen.4. 8 slots for up to 51 TB of NVMe based SSDs.
- IBM Power S1022s - 2U case. 1-2× CPU sockets for 1-2× eSCM modules, 4-16 cores. 16× OMI memory slots which support up to 2 TB RAM. 10× PCIe slots, 8× gen.5 and 2× gen.4. 8 slots for up to 51 TB of NVMe based SSDs.
- IBM Power S1014 - 4U case or a deskside tower. 1× Power10 eSCM module with 4 or 8 cores. 8× OMI memory slots which support up to 1 TB RAM. 5× PCIe slots, 4× gen.5 and 1× gen.4. 16 slots for up to 102 TB of NVMe based SSDs.
- IBM Power S1012 - 2U, half-wide case or a deskside tower. 1× Power10 eSCM module with 1, 4 or 8 cores. 4× OMI memory slots which support up to 256 GB RAM. 4× PCIe slots, 4× gen.5. 4 slots for up to 6.4 TB of NVMe based SSDs.
Operating system support
Power10 is supported by operating systems including Linux (from kernel 5.9) and PowerVM.
Comparison with earlier POWER CPUs
- The change to a 7 nm fabrication process results in significantly higher performance per watt.
- The PowerAXON facility now extends to 2 PB of unified clustered memory space, shared across multiple cluster nodes, and includes support for PCIe 5.
- New SIMD instructions and new data types, including bfloat16, INT4 and INT8 (4-bit and 8-bit integers),[16][17] are aimed at improving AI workloads.
- Unlike earlier POWER9 and POWER8 CPUs, Power10 requires closed source, third party firmware in security sensitive areas of the CPU module, along with additional closed source, third party firmware in the required off-module memory controller.[18]
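The bfloat16 type mentioned in the list above keeps the sign bit and the 8-bit exponent of an IEEE 754 single-precision value and shortens the fraction to 7 bits, so it is the upper 16 bits of the float32 encoding. The sketch below shows the conversion using truncation, which is a simplification assumed for this example (converters commonly round to nearest instead); the function names are hypothetical.

```python
import struct


def float32_to_bfloat16_bits(value):
    """Return the 16-bit bfloat16 pattern obtained by truncating the float32 encoding."""
    (bits32,) = struct.unpack(">I", struct.pack(">f", value))
    return bits32 >> 16


def bfloat16_bits_to_float(bits16):
    """Expand a bfloat16 bit pattern back to a Python float."""
    (value,) = struct.unpack(">f", struct.pack(">I", bits16 << 16))
    return value


bits = float32_to_bfloat16_bits(3.14159)
print(hex(bits), bfloat16_bits_to_float(bits))  # 0x4049 3.140625
```

The round trip shows the trade-off: the range of float32 is kept, while precision drops to roughly two or three decimal digits.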
Branding
Power10 is unusual in that its name is not written in capitals, unlike POWER9 and all earlier POWER processors. This change is part of IBM's rebranding of its Power Systems offering, which beginning with Power10 is simply "Power". Power10 also has a logo.[19]
References
1. "IBM Reveals Next-Generation IBM POWER10 Processor". IBM. August 17, 2020.
2. Cutress, Ian (August 17, 2020). "Hot Chips 2020 Live Blog: IBM's POWER10 Processor on Samsung 7nm". AnandTech. Archived from the original on August 17, 2020.
3. Quach, Katyanna (August 17, 2020). "IBM takes Power10 processors down to 7nm with Samsung, due to ship by end of 2021". The Register.
4. Schilling, Andreas (August 17, 2020). "IBM Power10 offers 30 cores with SMT8, PCIe 5.0 and DDR5". Hardware LUXX (in German).
5. Kennedy, Patrick (August 17, 2020). "IBM POWER10 Searching for the Holy Grail of Compute". ServeTheHome.
6. Jose Moreira, Puneeth Bhat A H and Satish Kumar Sadasivam (April 15, 2021). Matrix-Multiply Assist Best Practices Guide.
7. Russell, John (August 17, 2020). "IBM Debuts Power10; Touts New Memory Scheme, Security, and Inferencing". HPCwire.
8. Prickett Morgan, Timothy (August 31, 2020). "IBM's Possible Designs For Power10 Systems". IT Jungle.
9. Giuliano Anselmi, Marc Gregorutti, Stephen Lutz, Michael Malicdem, Guido Somers, Tsvetomir Spasov (July 11, 2022). "IBM Power E1050 Technical Overview and Introduction" (PDF).
10. Giuliano Anselmi, Young Hoon Cho, Andrew Laidlaw, Armin Röll, Tsvetomir Spasov (July 19, 2022). "IBM Power S1014, S1022s, S1022, and S1024 Technical Overview and Introduction" (PDF).
11. "open-power/rainier-xml". February 28, 2023 – via GitHub.
12. Morgan, Timothy Prickett (September 8, 2021). "This Is What The Most Powerful Server In The World Looks Like".
13. Giuliano Anselmi, Manish Arora, Ivaylo Bozhinov, Dinil Das, Turgut Genc, Bartlomiej Grabowski, Madison Lee, Armin Röll (December 9, 2021). "IBM Power E1080 Technical Overview and Introduction" (PDF).
14. Larabel, Michael (August 9, 2020). "Linux 5.9 Brings More IBM POWER10 Support, New/Faster SCV System Call ABI". Phoronix.
15. Prickett Morgan, Timothy (August 6, 2019). "Talking High Bandwidth with IBM's POWER10 Architect". The Next Platform.
16. Patrizio, Andy (August 18, 2020). "IBM details next-gen POWER10 processor". Network World.
17. "Data type aliases". IBM. August 26, 2020.
18. "It's not just OMI that's the trouble with POWER10". September 8, 2021.
19. Morgan, Timothy Prickett (August 2, 2021). "No More Shouting The Name "Power" (Well, Except In Our Title Here)".
Power10
Architecture and Design
The Power10 employs a superscalar symmetric multiprocessor (SMP) design compliant with Power ISA Version 3.1, offering backward compatibility with POWER8 and POWER9 modes.[4] It supports configurations including Dual Chip Modules (DCM) with up to 24 cores per socket, Single Chip Modules (SCM) with up to 15 cores, and Entry Single Chip Modules (eSCM) with up to 8 cores, enabling systems like the Power E1080 to scale to 240 cores overall.[6][4] Each core features simultaneous multithreading (SMT8) for up to 8 threads, a 96 KB L1 instruction cache (2 × 48 KB), a 32 KB L1 data cache, 2 MB of L2 cache, and 8 MB of local L3 cache, with a total on-chip L3 cache of up to 120 MB using a non-uniform cache architecture (NUCA) for efficient data access.[4] The processor integrates four Matrix Math Accelerator (MMA) units per core to accelerate AI inferencing, particularly for reduced-precision formats like bfloat16 and INT8.[4]
Performance and Efficiency
Power10 delivers up to 3x the performance of POWER9 in targeted workloads, with 2.6x greater energy efficiency and 10–20x faster AI inferencing capabilities, driven by enhanced load/store bandwidth and a quad-issue superscalar execution pipeline operating at frequencies from 2.45 GHz to 4.0 GHz depending on the model.[2][4] Memory support includes the Open Memory Interface (OMI) with Differential DIMMs (DDIMMs) for DDR4 or DDR5, providing up to 409 GB/s bandwidth per chip and capacities reaching 64 TB in high-end systems like the Power E1080.[6][7][4] Interconnects feature PCIe Gen5 slots for I/O expansion and SMP links with 128 GB/s bandwidth per link, a 33% improvement over POWER9, facilitating scalable hybrid cloud and edge deployments.[4]
Security and Applications
Security is a core focus, with pervasive hardware-based encryption including AES-256 for memory (in CTR mode), four times the AES units per core compared to POWER9, and support for quantum-safe cryptography via certified coprocessors (FIPS 140-2 Level 4).[4] Additional protections encompass hardware mitigations for speculative execution vulnerabilities, secure boot mechanisms, and a low rate of reported common vulnerabilities and exposures (CVEs), making it suitable for mission-critical enterprise environments.[4] The Power10 powers systems running AIX, IBM i, Linux, and PowerVM virtualization, targeting applications in AI/machine learning, database management (e.g., Oracle), web servers, and data analytics.[6][4]
Overview
Architecture
The IBM Power10 processor adheres to the Power Instruction Set Architecture (ISA) version 3.1, incorporating extensions optimized for artificial intelligence workloads, such as the Matrix Math Accelerator for accelerated matrix operations, and enhanced security features including pervasive, transparent memory encryption and support for quantum-safe cryptography. These extensions build on the foundational Power ISA framework to enable efficient handling of AI inference and training tasks alongside robust protection against modern threats.[7][8]
The core microarchitecture, designated P10, is a superscalar, out-of-order design with up to 15 active cores per single-chip module (SCM), all on a single die. Each core supports simultaneous multithreading in 8-way (SMT-8) or 4-way (SMT-4) form, so thread scaling can be matched to workload demands. Fabricated on a 7 nm Samsung process node, the processor die measures 602 mm² and contains approximately 18 billion transistors, enabling dense integration of computational resources while maintaining power efficiency. Clock frequencies reach up to 4 GHz, balancing performance with thermal constraints in multi-chip modules.[7][8]
The cache hierarchy is structured for low-latency access and high bandwidth, featuring two 48 KB instruction caches (96 KB total) and a 32 KB data cache at the L1 level per core, a dedicated 2 MB L2 cache per core, and a shared 120 MB L3 cache per chip utilizing a non-uniform cache architecture (NUCA) for optimized data locality. This design prioritizes rapid data retrieval for out-of-order execution pipelines, reducing stalls in compute-intensive applications. The Power10 also supports PCIe 5.0 for high-speed I/O connectivity.[7][8]
Key Innovations
The IBM Power10 processor introduces the Matrix Multiply Assist (MMA) engines, which provide dedicated hardware acceleration for AI inference workloads. Each core features four MMA engines utilizing 256-bit SIMD operations to perform matrix multiplications efficiently, supporting data types such as FP32, BFloat16, and INT8. This enables up to 20x faster AI inference performance for INT8 operations compared to the POWER9 processor, allowing enterprises to process AI tasks directly on the chip without external accelerators.[2][7]
Security enhancements in Power10 emphasize hardware-based protections to safeguard data in hybrid cloud environments. Transparent memory encryption is implemented pervasively across all volatile memory using AES-CTR mode at the memory controller level, ensuring end-to-end data protection without performance overhead. Additionally, secure boot establishes a chain of trust from the service processor through firmware and operating system components, preventing unauthorized code execution during initialization. Specific features include pointer authentication via cryptographic hashing of return addresses to mitigate return-oriented programming attacks, and full-system encryption that extends protections to persistent memory with AES-XTS mode support.[2][9][7][10]
For I/O connectivity, Power10 integrates OpenCAPI 3.0 for coherent accelerator attachments and PCIe 5.0 interfaces, delivering up to 64 GB/s bandwidth per slot to support high-performance data movement. These advancements enable multi-petabyte memory clustering and seamless integration with external devices like GPUs and FPGAs.
Energy efficiency is improved by a factor of 2.6x in performance per watt over POWER9 at the core level, achieved through the 7 nm semiconductor process and dynamic voltage/frequency scaling via the EnergyScale technology, which optimizes power consumption based on workload demands.[7][4][11]
Design Details
Processor Core
The Power10 processor core employs a superscalar architecture with out-of-order dispatch, enabling efficient parallel execution of instructions while maintaining compatibility with the Power Instruction Set Architecture (PowerISA).[7] Fabricated on a 7 nm process with a 602 mm² die size and approximately 18 billion transistors, this design incorporates advanced branch prediction mechanisms that achieve higher accuracy and lower misprediction flush rates compared to prior generations, supported by deeper and wider instruction windows for improved scheduling.[7] Each core is organized into two execution resource domains, facilitating modular handling of diverse workloads.[12]
The cores support variable simultaneous multithreading (SMT) modes to optimize for different application profiles: SMT-8, which allows up to eight threads per core, is tailored for throughput-oriented workloads and typically pairs with configurations of up to 15 cores per chip, maximizing thread density for tasks like database processing.[12] In contrast, SMT-4 mode, limited to four threads per core, targets compute-intensive applications such as scientific simulations, enabling denser core packing with up to 30 cores per chip to prioritize single-thread performance over multithreading.[12] SMT-2 and single-threaded modes are also available for fine-tuned resource allocation, with automatic workload balancing dynamically adjusting thread counts across modes to enhance efficiency.[13]
Execution units within each core feature an 8-wide dispatch capability in SMT-8 mode, allowing up to eight instructions to be issued per cycle for high instruction-level parallelism.[7] The integer and floating-point pipelines include multiple dedicated paths, such as two quad-precision/decimal floating-point units and enhanced fixed-point operations, ensuring robust handling of scalar computations.[12] Vector processing is powered by eight vector-scalar units (VSUs) per core (four per domain), each 128 bits wide, fully supporting PowerISA Vector Scalar Extension (VSX) for SIMD operations, permutations, cryptography, and other vectorized tasks, with a 512-bit accumulator for precision.[7] The cores integrate four Matrix Math Accelerator (MMA) units to boost AI inferencing through specialized matrix operations.[13]
Power management at the core level leverages IBM's EnergyScale technology for per-core voltage and frequency scaling, dynamically adjusting based on workload demands, thermal constraints, and active thread counts to balance performance and efficiency.[7] Core parking is implemented through workload balancing mechanisms that can reduce active threads per core from eight to as few as one during low-utilization periods, effectively idling resources without full deactivation.[13] These features operate across modes like power-saving (minimum frequency), static (nominal frequency), and maximum performance (up to 4.0 GHz, workload-dependent), with frequencies scaling from 2.0 GHz in low-power states.[12]
I/O and Memory
The Power10 processor incorporates advanced I/O subsystems to handle high-bandwidth data transfer, featuring support for PCIe 5.0 with up to 128 lanes per processor module operating at 32 GT/s per lane, providing aggregate bandwidth of up to about 1 TB/s bidirectional depending on configuration (e.g., about 126 GB/s for x16 slots).[14][13] This configuration supports flexible lane bifurcation, such as 1x16, 2x8, or mixed Gen5 and Gen4 setups, facilitating integration with a wide range of adapters including network interfaces, storage controllers, and accelerators while maintaining backward compatibility with earlier PCIe generations.[7] Additionally, the processor includes OpenCAPI 4.0 as a coherent accelerator processor interface, providing up to 25.6 GB/s per link to enable low-latency, cache-coherent communication with external devices like GPUs or specialized memory modules, enhancing data-intensive workloads by allowing direct memory access without CPU intervention.[7][15]
The memory architecture centers on integrated controllers supporting DDR4-3200 or DDR5 via the Open Memory Interface (OMI), with a maximum capacity of 4 TB per socket across up to 32 DDIMM slots for high-speed, buffered access.[13][16][17] These controllers incorporate error-correcting code (ECC) for data integrity and hardware-based encryption via AES in counter mode, applied pervasively to protect against physical attacks without performance overhead.[7] This setup delivers peak bandwidths approaching 410 GB/s per socket for DDR4 or up to 819 GB/s for DDR5, prioritizing reliability and security in enterprise environments.[18]
Internally, the on-chip interconnect employs the X-Bus for communication between chiplets, offering an aggregate bandwidth of 1.6 TB/s to ensure seamless data flow across the processor's modular components, including cores and I/O elements.[19][13] This high-throughput fabric supports the processor's modular design, minimizing latency in intra-socket operations.
Variants
The IBM Power10 processor employs a modular single-die design, organized into two hemispheres each supporting up to 8 SMT-8 cores or 16 SMT-4 cores, to enable flexible configurations tailored to different workloads.[13] This architecture allows for scalability in core density and threading, supporting up to 30 cores in SMT-8 mode or 60 cores in SMT-4 mode per socket in dual-chip configurations, depending on the selected variant.[13]
The high-end variant is optimized for enterprise workloads, featuring 15 SMT-8 cores per chip to maximize thread-level parallelism and virtualization efficiency in demanding transactional and database environments.[13] In contrast, the compute variant targets high-performance computing (HPC) and artificial intelligence applications, utilizing 30 SMT-4 cores per chip to deliver higher per-core performance for vectorized and matrix-accelerated computations.[13]
Power10 modules are packaged as multi-chip modules (MCMs), incorporating integrated voltage regulators to enhance power delivery efficiency and thermal management across the chips.[13] This MCM approach facilitates seamless integration of compute, I/O, and memory elements within a compact footprint.[13]
Systems
Enterprise
The IBM Power E1080 represents the flagship enterprise server in the Power10 lineup, engineered for mission-critical workloads requiring extreme scalability and unwavering reliability. This multi-node system supports up to four interconnected nodes, enabling configurations with as many as 16 processor sockets and 240 Power10 cores, which deliver robust processing power for consolidating large-scale databases such as IBM Db2 and SAP HANA, as well as enterprise resource planning (ERP) applications like SAP S/4HANA.[7] With a maximum memory capacity of 64 TB across the system (16 TB per node), the E1080 facilitates in-memory analytics and transaction processing at enterprise scale, reducing latency and enhancing data throughput for high-volume operations.[7][6]
Central to the E1080's enterprise suitability are its advanced redundancy features, which minimize downtime in demanding environments. The system incorporates hot-swappable components, including power supplies, NVMe drives, PCIe adapters, fans, and SMP interconnect cables, allowing maintenance without interrupting operations.[7] Power redundancy is achieved through N+2 configurations with dual supplies per node, while cooling employs N+1 fan redundancy to ensure continuous thermal management even under failure conditions.[7] These elements are bolstered by comprehensive reliability, availability, and serviceability (RAS) enhancements, such as first-failure data capture (FFDC), pervasive memory encryption using AES-CTR, concurrent repair capabilities, and automated fault isolation, which collectively target availability exceeding 99.999% for sustained uptime in critical infrastructure.[7]
Storage integration further amplifies the E1080's enterprise prowess, supporting up to four internal NVMe U.2 drives per node for high-speed local access, scalable to 288 drives via up to 12 attached NED24 drawers for massive data repositories.[7] Connectivity to external storage solutions like IBM FlashSystem is seamless through 32 PCIe Gen 5 slots and NVMe over Fibre Channel protocols, enabling terabyte-scale all-flash arrays optimized for database acceleration and hybrid cloud deployments.[7] This architecture allows enterprises to handle petabyte-level workloads with low-latency I/O, integrating directly with IBM's enterprise storage ecosystem for simplified management and data protection.[6]
Mid-range
The IBM Power E1050 serves as the primary mid-range offering in the Power10 systems portfolio, targeting departmental to enterprise-scale workloads that demand scalable performance without the full overhead of high-end configurations. It accommodates 2 to 4 sockets, each equipped with a Power10 processor module featuring 12, 18, or 24 active cores, for a maximum of 96 cores across the system. With support for up to 16 TB of memory via Open Memory Interface (OMI) slots, the E1050 excels in analytics processing and virtualization environments, enabling efficient handling of data-intensive tasks such as database operations and virtual machine orchestration.[12][20]
Housed in a compact 4U rack form factor, the E1050 optimizes space in data centers while providing robust expansion options through integrated I/O drawers. It supports up to four PCIe Gen3 or Gen4 I/O expansion drawers (EMX0), which deliver additional hot-swap slots for PCIe adapters, storage controllers, and networking devices, facilitating modular growth tailored to evolving workload needs.[21][12]
Energy efficiency is a core design principle of the E1050, driven by the 7 nm Power10 processor architecture, which achieves roughly 2.6 times the energy efficiency per socket compared to POWER9-based systems. This focus on power optimization, combined with its mid-tier scalability, results in a lower total cost of ownership (TCO) relative to enterprise-class models that emphasize maximum redundancy and capacity.[12][20]
Scale-out
The IBM Power10 scale-out servers are designed for compact, high-density deployments in cloud, edge, and distributed computing environments, emphasizing horizontal scaling for hyperscale data centers and resource-efficient workloads. These systems prioritize reduced physical footprints while delivering enterprise-grade performance for applications running on AIX, IBM i, and Linux.[22][23]
The primary models in the IBM Power S101x and S102x series include the S1012, S1014, S1022, and S1022s, all based on Power10 processors in 1- or 2-socket configurations. The S1012 is a 1-socket system supporting 1, 4, or 8 cores, offered in a half-wide 2U rack or tower form factor for edge and small-business use. The S1014 provides 1 socket with up to 8 cores and up to 1 TB of memory in a 4U rack or tower chassis, suitable for entry-level distributed tasks. The S1022 offers 2 sockets with up to 40 cores and up to 4 TB of memory in a 2U rack form factor, enabling higher-density compute for cloud-native applications. The S1022s, a cost-optimized variant, supports 2 sockets with up to 16 cores in a similar 2U design, targeting budget-conscious scale-out scenarios. These configurations allow up to 60 cores across the series in multi-node racks, facilitating efficient resource pooling.[24][25][22][23]
Density optimizations in these servers include half-wide chassis options for the S1012, which can reduce IT footprints by up to 75% compared to full-width predecessors, and support for shared power supplies across multiple nodes to minimize rack space and power consumption in hyperscale environments. High-core-per-rack capabilities enable deployments of up to several hundred cores per standard 42U rack, optimizing for containerized and virtualized workloads in dense cloud infrastructures.[24][13]
Networking features integrate high-speed options such as 100 GbE adapters, including SR-IOV-capable ports for virtualized traffic, to support low-latency distributed computing. These servers are certified for Red Hat OpenShift Container Platform, providing container orchestration with enhanced price-performance over comparable x86 systems for hybrid cloud deployments.[13][26]
Software Ecosystem
Operating Systems
IBM AIX 7.3 provides full native support for Power10 processors, enabling scalability up to 240 cores (1920 hardware threads) in a single logical partition (LPAR).[27] This version includes enhancements for Power10's AI matrix-multiply accelerator, allowing AIX applications to leverage on-chip AI extensions for tasks like inference and training.[27] Additionally, AIX 7.3 supports Live Partition Mobility (LPM), facilitating seamless migration of running AIX partitions between Power10 systems without downtime, provided compatible hardware and PowerVM configurations are in place.[28]
IBM i 7.6 is optimized for business-critical applications on Power10 servers, supporting up to 48 SMT8 cores (384 threads) per partition on Power10 hardware.[29] It integrates tightly with Db2 for i, featuring enhancements such as native multi-factor authentication (MFA) using time-based one-time passcodes, new SQL functionalities including data-change-table-reference for UPDATE and DELETE statements, and improved high-availability options like enhanced Db2 Mirror for real-time data replication across Power10 systems.[30] This integration enables efficient handling of transactional workloads, with Power10-specific optimizations for I/O and processor utilization in enterprise environments.[29]
Several Linux distributions are certified for Power10, leveraging kernel versions 5.9 and later to access processor-specific features like improved cryptography acceleration and perf event support.[31] Red Hat Enterprise Linux (RHEL) 9 and 10 offer full support, with RHEL 10 certified for Power10 in mid-2025, enabling advanced AI workloads and container orchestration via Podman.[32] Ubuntu 22.04 LTS and 24.04 LTS provide robust compatibility, including post-copy migration recovery and absolute clock offset handling for Power10 environments.[33] SUSE Linux Enterprise Server (SLES) 15 SP6 includes Power10 performance enhancements for cryptography via NSS FreeBL and OpenSSL, optimizing secure communications and data processing.[34]
Virtualization and Firmware
Power10 systems leverage PowerVM as the foundational hypervisor for virtualization, enabling logical partitioning (LPAR) to divide physical resources into isolated environments for multiple workloads. This technology supports up to 1,000 LPARs per system on Power10-based servers, facilitating server consolidation and dynamic resource allocation across enterprise-scale configurations.[35] PowerVM also incorporates Active Memory Expansion, a feature that uses real-time compression to expand the effective memory capacity of an LPAR by up to 4x without requiring additional physical RAM, thereby optimizing utilization in memory-intensive applications.[36]
For Linux-centric deployments, the OpenPower Abstraction Layer (OPAL) serves as an open-source firmware alternative to PowerVM, providing a standardized interface for direct hardware access and enabling runtime reconfiguration of processors, memory, and I/O resources without system reboots. OPAL integrates with the hostboot and skiboot components to support bare-metal Linux installations on Power10, promoting flexibility in open-source ecosystems.[37][38]
The Hardware Management Console (HMC) acts as the centralized management tool for Power10 environments, offering capabilities for provisioning new LPARs, real-time monitoring of system health and performance metrics, and seamless live partition mobility to migrate running workloads between servers with minimal downtime. Version 11 of the HMC, supporting Power10, enhances automation through integration with PowerVM for policy-based resource adjustments.[28][39]
Power10 firmware embeds robust security measures, including an immutable boot process via Secure Boot, which cryptographically verifies the integrity of firmware components during initialization to prevent unauthorized code execution. Additionally, tamper detection mechanisms within the firmware monitor for physical or logical alterations to boot images and hardware, triggering alerts or halts to maintain system trustworthiness in high-security deployments.[40][7]
Performance Analysis
Comparison with POWER9
The IBM Power10 processor represents an evolutionary advancement over its predecessor, the POWER9, with key architectural enhancements focused on density, efficiency, and workload acceleration, particularly for AI and high-performance computing. Fabricated on a 7 nm Samsung CMOS process node with 18 metal layers, Power10 achieves significantly higher transistor density compared to POWER9's 14 nm process, enabling more cores and improved power efficiency per core.[13][2]
In terms of core configuration, Power10 supports up to 15 high-performance cores per single-chip module in SMT-8 mode or up to 30 cores in SMT-4 mode, leveraging the smaller process node for greater integration; this contrasts with POWER9, which maxes out at 12 cores in SMT-8 mode or 24 cores in SMT-4 mode per chip.[13][8]
For threading, Power10 supports flexible simultaneous multithreading (SMT) options including SMT-8, SMT-4, SMT-2, and single-thread (ST) modes. SMT-8 provides up to twice the threads per core in high-throughput scenarios compared to POWER9's baseline SMT-4, although POWER9 also supports SMT-8 in select configurations. This allows Power10 to better balance throughput and latency for diverse workloads.[13][7]
Power10 significantly upgrades I/O capabilities, featuring PCIe 5.0 with 32 lanes at 32 GT/s, which doubles the per-lane signaling rate of POWER9's PCIe 4.0 at 16 GT/s, and OpenCAPI 4.0, an evolution from OpenCAPI 3.0 that supports higher-speed coherency and attachment for accelerators. Additionally, symmetric multiprocessing (SMP) interconnect bandwidth reaches 128 GB/s per chip-to-chip link on Power10, a 33% increase over POWER9.[13]
On the instruction set front, Power10 implements Power ISA 3.1 (specifically v3.1B), extending POWER9's Power ISA 3.0 with the Matrix-Multiply Assist (MMA) facility, which adds dedicated matrix-math operations optimized for AI inferencing and training and supports reduced-precision formats such as BF16, FP16, and INT8 that were previously unavailable on POWER9.[13][41]
| Feature | Power10 | POWER9 |
|---|---|---|
| Process Node | 7 nm Samsung CMOS, 18 layers | 14 nm GlobalFoundries |
| Cores per Chip | 15 (SMT-8) / 30 (SMT-4) | 12 (SMT-8) / 24 (SMT-4) |
| Threading Modes | SMT-8/4/2/1 (SMT-8 standard) | SMT-8/4/2/1 (SMT-4 baseline) |
| PCIe | Gen 5 (32 GT/s, 32 lanes) | Gen 4 (16 GT/s, 48 lanes) |
| OpenCAPI | 4.0 (enhanced ports/speed) | 3.0 |
| ISA Version | 3.1 with MMA (AI vector ops) | 3.0 |
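
The core, thread, and PCIe figures in the table can be cross-checked with a little arithmetic. The following Python sketch is illustrative only: the `ChipSpec` class and its field names are hypothetical, the numbers are copied from the table, and the PCIe values are raw signaling rates (GT/s) rather than usable throughput. Because PCIe 4.0 and 5.0 both use 128b/130b encoding, the ratios are unaffected by encoding overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChipSpec:
    """Hypothetical container for the per-chip figures listed in the table."""
    name: str
    cores_smt8: int         # cores per chip when configured as SMT-8 cores
    cores_smt4: int         # cores per chip when configured as SMT-4 cores
    pcie_gts_per_lane: int  # raw signaling rate per lane, in GT/s
    pcie_lanes: int         # PCIe lanes per chip

    def threads(self, smt: int) -> int:
        """Hardware threads per chip for the given SMT mode (8 or 4)."""
        cores = {8: self.cores_smt8, 4: self.cores_smt4}[smt]
        return cores * smt

    def pcie_aggregate_gts(self) -> int:
        """Sum of raw signaling rates over all lanes, in GT/s."""
        return self.pcie_gts_per_lane * self.pcie_lanes


# Values taken from the comparison table above.
POWER10 = ChipSpec("Power10", cores_smt8=15, cores_smt4=30,
                   pcie_gts_per_lane=32, pcie_lanes=32)
POWER9 = ChipSpec("POWER9", cores_smt8=12, cores_smt4=24,
                  pcie_gts_per_lane=16, pcie_lanes=48)

for chip in (POWER10, POWER9):
    print(f"{chip.name}: {chip.threads(8)} threads (SMT-8), "
          f"{chip.threads(4)} threads (SMT-4), "
          f"{chip.pcie_aggregate_gts()} GT/s aggregate PCIe")

per_lane_ratio = POWER10.pcie_gts_per_lane / POWER9.pcie_gts_per_lane
aggregate_ratio = POWER10.pcie_aggregate_gts() / POWER9.pcie_aggregate_gts()
print(f"Per-lane ratio: {per_lane_ratio:.2f}x, "
      f"aggregate ratio: {aggregate_ratio:.2f}x")
```

Under these figures, both SMT configurations of a chip expose the same number of hardware threads (120 for Power10, 96 for POWER9). The PCIe rate doubles per lane, but because POWER9 is listed with more lanes, the aggregate raw signaling rate per chip rises by about one third rather than doubling.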
