ARM Cortex-A76
The ARM Cortex-A76 is a high-performance, 64-bit CPU core developed by Arm, implementing the Armv8.2-A architecture with support for extensions including Armv8.1-A, Armv8.3-A (LDAPR only), Armv8.4-A (SDOT/UDOT), Armv8.5-A (PSTATE SSBS), the Cryptographic Extension, and the RAS Extension. Announced on May 31, 2018, it features a superscalar, out-of-order microarchitecture built on DynamIQ technology, designed to deliver laptop-class single-threaded performance while maintaining smartphone-level power efficiency for demanding tasks in mobile and edge devices. The core supports AArch64 execution at all exception levels (EL0-EL3) and AArch32 at EL0 only, with ISA compatibility for the A64, A32, and T32 instruction sets. Key architectural elements include a non-blocking, high-throughput L1 cache system with 64 KB instruction and 64 KB data caches, a private L2 cache configurable from 128 KB to 512 KB per core, and an optional shared L3 cache of up to 4 MB. It incorporates advanced features such as decoupled branch prediction, a 4-wide decode unit, fourth-generation hardware prefetchers for instructions and data, and dual-issue 128-bit NEON and floating-point units that double the throughput of prior Arm CPUs. The core supports up to four CPUs per DynamIQ cluster, 40-bit physical addressing with LPAE support, ECC for reliability, and AMBA ACE or CHI interfaces for system coherence, along with GICv4 interrupts, Armv8-A generic timers, CoreSight v3 debug, and ETMv4.2 trace capabilities. In terms of performance, the Cortex-A76 provides a 35% uplift in single-threaded performance over its predecessor, the Cortex-A75, and up to 40% improved power efficiency at equivalent performance levels, enabling extended battery life for complex workloads such as machine learning (with a 4x improvement in low-precision ML tasks) and productivity applications. Optimized for 7 nm and more advanced process nodes, it targets premium smartphones, laptops, automotive systems (including ASIL-D safety compliance via the Cortex-A76AE variant), and other edge-to-cloud devices requiring high efficiency and compute density.

Development

Announcement

The ARM Cortex-A76 CPU core was announced by Arm on May 31, 2018, marking the introduction of its latest premium processor design for high-performance mobile and embedded applications. The unveiling occurred alongside Computex 2018 in Taipei, where Arm emphasized the core's role in enabling "laptop-class performance with mobile efficiency" through advancements in the DynamIQ architecture. Internally codenamed "Enyo," the Cortex-A76 implements the ARMv8.2-A instruction set and is optimized for 7 nm manufacturing processes, supporting clock speeds up to 3.0 GHz. During the announcement, Arm positioned the Cortex-A76 as the successor to the Cortex-A75, highlighting its 64-bit-only kernel-mode execution for enhanced security and efficiency in modern software stacks. Key performance claims included a 35% uplift in single-threaded performance over the A75 at the same power envelope, or up to 40% improved power efficiency at equivalent performance levels, based on internal evaluations using TSMC's 7 nm process. Additionally, Arm touted 4x faster machine learning inference and improvements in complex workloads like web browsing compared to the previous generation, underscoring its focus on AI and sustained performance for mobile devices. Arm indicated that the Cortex-A76 would enter production availability in the second half of 2018, with commercial silicon integration expected in devices launching in the second half of 2019, enabling broader adoption in smartphones, tablets, and high-end routers. The announcement also coincided with reveals of complementary IP, including the Mali-G76 GPU and Mali-V76 video processor, intended to form a cohesive IP suite for next-generation SoCs.

Design Objectives

Development of the Cortex-A76 began in 2013. The ARM Cortex-A76 was developed with the primary objective of bridging the performance gap between mobile and laptop-class computing, delivering high-end computational capabilities while maintaining the power efficiency essential for battery-constrained devices. Announced on May 31, 2018, as part of ARM's client CPU roadmap, the core aimed to support the transition to 7 nm process nodes and enable always-connected experiences in the era of 5G connectivity. This design philosophy addressed the slowing pace of Moore's law by focusing on architectural innovations that provide substantial single-threaded performance gains without proportionally increasing power consumption. Key targets included achieving a 35% uplift in instructions per clock (IPC) compared to the preceding Cortex-A75, emphasizing superscalar out-of-order execution and advanced branch prediction to handle complex workloads more effectively. The microarchitecture was re-engineered to prioritize energy efficiency for sustained tasks, such as productivity applications and emerging machine learning at the edge, while extending battery life in mobile scenarios. ARM emphasized that these improvements would allow the Cortex-A76 to run desktop-like applications seamlessly on smartphones and laptops, fostering a unified computing experience across devices. In terms of applications, the Cortex-A76 was optimized for premium mobile SoCs targeting smartphones, Windows on ARM laptops, and automotive systems, with variants like the Cortex-A76AE incorporating safety features for autonomous driving. The design sought to balance raw performance with the thermal and power envelopes typical of mobile platforms, enabling features like always-on AI processing and high-fidelity graphics without compromising responsiveness. Overall, these objectives positioned the core as a foundational element for next-generation client devices, where efficiency and scalability are paramount.

Architecture

Microarchitecture Overview

The ARM Cortex-A76 is a high-performance, 64-bit CPU core implementing the Armv8.2-A architecture, featuring a ground-up redesigned out-of-order superscalar microarchitecture optimized for sustained high performance in mobile and laptop-class applications. It is designed for integration with Arm's DynamIQ technology, allowing flexible multi-core configurations in DynamIQ Shared Units (DSUs) with up to four Cortex-A76 cores per cluster. The core supports 40-bit physical addressing for up to 1 TB of memory and includes separate 64 KB instruction and 64 KB data L1 caches, each 4-way set-associative and virtually indexed, physically tagged. A private L2 cache per core, configurable from 128 KB to 512 KB, provides low-latency access with a 9-cycle load-to-use latency, while an optional shared L3 cache in the DSU ranges from 512 KB to 4 MB.

The front-end of the pipeline emphasizes high instruction throughput and efficient branch handling through a prediction unit that operates independently of the instruction fetch stage, enabling the predictor to run at double the fetch bandwidth to mask misprediction penalties. The fetch unit delivers 4 to 8 instructions per cycle, supported by multi-level branch target caches and a hybrid indirect predictor to maximize accuracy and throughput in complex code paths. Following fetch, the pipeline includes Arm's first 4-wide decode stage, capable of renaming and dispatching up to 8 micro-operations per cycle to the out-of-order engine, which features a deep reorder buffer for handling dependencies and speculation. This design contributes to a 35% increase in single-threaded performance compared to the predecessor Cortex-A75.

In the execution backend, the Cortex-A76 employs quad-issue execution with three simple arithmetic logic units (ALUs) for basic operations and one complex ALU handling multi-cycle instructions like division and multiplication, enabling high throughput for scalar workloads. Floating-point and Advanced SIMD (NEON) processing is powered by dual 128-bit pipelines, doubling the vector/FP bandwidth over prior designs and delivering up to 4x performance for low-precision machine learning inference tasks. The load/store unit supports deep memory-level parallelism with a sophisticated fourth-generation hardware prefetcher, optimizing for bandwidth-intensive applications while maintaining a 4-cycle L1 load-to-use latency; it interfaces via AMBA ACE or CHI protocols for system-level coherence. Security features include Arm TrustZone, optional cryptography extensions (AES, SHA, PMULL), and RAS (Reliability, Availability, Serviceability) support, with ECC protection available for caches and interconnects. Overall, these elements enable a 35% performance uplift and 40% power-efficiency improvement over the Cortex-A75 at iso-process and frequency.
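As an illustration of how software can tell which cores in such a DynamIQ cluster are Cortex-A76 parts, the following minimal sketch reads the MIDR-derived "CPU part" field that Linux exposes in /proc/cpuinfo; the part number 0xd0b used here for the Cortex-A76 is an assumption to verify against Arm's documentation for a given platform.

```c
/* Minimal sketch: count Cortex-A76 cores on a Linux/AArch64 system by
 * parsing /proc/cpuinfo. Assumes the kernel reports "CPU part : 0xd0b"
 * for Cortex-A76 cores (verify against the Arm TRM for your SoC). */
#include <stdio.h>

int main(void) {
    FILE *f = fopen("/proc/cpuinfo", "r");
    if (!f) { perror("fopen"); return 1; }

    char line[256];
    int cpu = -1, a76_count = 0;
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "processor : %d", &cpu) == 1)
            continue;                               /* remember current logical CPU */
        unsigned part;
        if (sscanf(line, "CPU part : 0x%x", &part) == 1 && part == 0xd0b) {
            printf("cpu%d reports part 0x%03x (Cortex-A76)\n", cpu, part);
            a76_count++;
        }
    }
    fclose(f);
    printf("%d Cortex-A76 core(s) detected\n", a76_count);
    return 0;
}
```

On a typical big.LITTLE design this would report the A76 cores alongside the Cortex-A55 cores, which carry a different part number.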

Pipeline Design

The ARM Cortex-A76 implements a high-performance, superscalar, out-of-order pipeline optimized for both integer and floating-point workloads in power-constrained environments. This design supports the ARMv8.2-A architecture and integrates with DynamIQ technology for flexible multi-core configurations. The pipeline emphasizes sustained performance through advanced speculation and parallelism, targeting applications from smartphones to edge servers.

At its core, the pipeline spans 13 stages, balancing depth for high clock frequencies (up to 3.0 GHz on 7 nm processes) with latency management for efficient instruction throughput. The front-end operates as a 4-wide superscalar unit, with the fetch stage delivering up to 8 instructions per cycle from a 64 KiB L1 instruction cache and decoding ARM instructions into micro-operations (uops). This includes macro-op fusion to reduce decode pressure and improve uop density for common instruction sequences. Once decoded, uops enter a register rename stage before dispatch to a reorder buffer supporting a 128-entry instruction window, enabling dynamic reordering to tolerate dependencies and hide latencies.

The back-end features an 8-wide dispatch to specialized execution pipelines, including three simple integer ALUs, one complex integer ALU, two load/store units, a branch execution unit, and two vector/floating-point units for floating-point and Advanced SIMD (NEON) operations. The load/store units connect to a 64 KiB L1 data cache, exploiting memory-level parallelism with dual ports for concurrent accesses. Retirement occurs in-order at up to 4 instructions per cycle, ensuring architectural state consistency while the out-of-order engine maximizes utilization. This structure allows the core to sustain high IPC, with reported uplifts of 35% in single-threaded performance over the Cortex-A75.

Branch prediction plays a critical role in maintaining momentum, employing a predictor decoupled from the fetch unit to precompute targets and directions ahead of time. It incorporates a multilevel branch target buffer (BTB) with twice the bandwidth of the fetch unit, supporting indirect branches and improving accuracy on complex control flow, reducing misprediction rates compared to prior generations. Mispredictions incur an 11-cycle penalty, mitigated by the deep out-of-order window. Overall, these elements enable the Cortex-A76 to deliver desktop-like responsiveness with mobile efficiency, as evidenced in implementations like Qualcomm's Snapdragon 855.
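The practical importance of branch prediction can be illustrated with a small, self-contained experiment that is not A76-specific: summing only the large elements of an array is markedly slower when the data, and therefore the branch outcome, is random than when it is sorted, because the predictor cannot learn a random pattern. This is a generic sketch; depending on the compiler and optimization level, the branch may be converted into a conditional select, which hides the effect.

```c
/* Minimal sketch: predictable vs. unpredictable data-dependent branches.
 * The same loop runs over random data (branch mispredicts often) and then
 * over sorted data (predictor learns the pattern). Timings are illustrative. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (1 << 20)

static void run(const int *v, const char *label) {
    struct timespec t0, t1;
    long long sum = 0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int rep = 0; rep < 50; rep++)
        for (int i = 0; i < N; i++)
            if (v[i] >= 128)            /* hard to predict on random data */
                sum += v[i];
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double ms = (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6;
    printf("%-8s sum=%lld  %.1f ms\n", label, sum, ms);
}

static int cmp(const void *a, const void *b) { return *(const int *)a - *(const int *)b; }

int main(void) {
    int *v = malloc(N * sizeof *v);
    for (int i = 0; i < N; i++) v[i] = rand() % 256;
    run(v, "random");                   /* mispredicts roughly half the time */
    qsort(v, N, sizeof *v, cmp);
    run(v, "sorted");                   /* predictor learns the pattern */
    free(v);
    return 0;
}
```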

Memory Hierarchy

The ARM Cortex-A76 employs a multi-level cache hierarchy optimized for low latency and high bandwidth in high-performance mobile and embedded systems, featuring private per-core L1 and L2 caches alongside an optional shared L3 cache to support efficient access patterns in multi-core configurations. This design balances the demands of sustained performance with power efficiency, enabling the core to handle complex workloads while minimizing stalls from memory dependencies.

At the first level, each Cortex-A76 core includes a private 64 KB instruction cache (L1I) and a 64 KB data cache (L1D), both implemented as 4-way set-associative structures with 64-byte cache lines to facilitate rapid access and prefetching of instructions and data. The L1 caches support write-back and write-allocate policies, with a load-to-use latency of 4 cycles, allowing the out-of-order engine to overlap memory operations effectively and reduce pipeline bubbles. Additionally, the L1 data cache incorporates hardware prefetching mechanisms that detect common access patterns, such as sequential or stride-based loads, to proactively fetch data into the cache and further mitigate latency impacts on performance-critical applications.

The second-level cache (L2) is private to each core and configurable in size from 128 KB to 512 KB, operating as a 16-way set-associative, inclusive unified cache that backs the L1 caches with a latency of approximately 9 cycles for load-to-use operations. This L2 structure provides a 256-bit read interface from the cache and a matching write interface, supporting up to two 128-bit loads or stores per cycle to sustain the core's dual-issue load/store capabilities while ensuring coherence through AMBA CHI or ACE protocols in multi-core systems. The inclusive design simplifies coherence management by automatically invalidating L1 lines upon L2 eviction, contributing to predictable behavior in cache-coherent environments.

An optional shared L3 cache, ranging from 512 KB to 4 MB, can be implemented at the cluster level to serve multiple Cortex-A76 cores, offering a latency of 26 to 31 cycles and enhancing bandwidth for shared data access in scenarios like multi-threaded applications. This level integrates with the system's interconnect fabric to maintain coherence and supports ECC for reliability in enterprise-grade deployments.

The memory management unit (MMU) complements the cache hierarchy with dedicated translation lookaside buffers (TLBs) to accelerate virtual-to-physical address translations. The L1 instruction TLB (ITLB) and data TLB (DTLB) are each 48-entry fully associative arrays, natively supporting page sizes of 4 KB, 16 KB, 64 KB, 2 MB, 32 MB, and 512 MB for efficient handling of large memory mappings common in 64-bit ARMv8.2-A environments. These are backed by a unified L2 TLB with 1280 entries organized as 5-way set-associative, which handles misses from the L1 TLBs and interfaces with the page table walker to minimize translation overhead during cache fills or direct accesses. The TLB design incorporates support for large physical address extensions (LPAE) up to 40 bits, ensuring scalability for systems with expansive memory footprints.
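The latency tiers described above can be observed from software with a pointer-chasing microbenchmark: each load depends on the previous one, so the average time per hop approximates the load-to-use latency of whichever level the working set fits into. The sketch below is a minimal illustration; the working-set sizes are only rough stand-ins for L1, L2, L3, and DRAM, and the results vary by SoC rather than being authoritative A76 figures.

```c
/* Minimal sketch: dependent-load (pointer-chasing) latency vs. working-set size.
 * A random cyclic permutation defeats simple stride prefetching, so each hop
 * pays roughly the latency of the cache level holding the working set. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static double chase(size_t bytes) {
    size_t n = bytes / sizeof(size_t);
    size_t *perm = malloc(n * sizeof *perm);
    size_t *next = malloc(n * sizeof *next);

    /* Build a random permutation, then link it into a single cycle. */
    for (size_t i = 0; i < n; i++) perm[i] = i;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % (i + 1);
        size_t t = perm[i]; perm[i] = perm[j]; perm[j] = t;
    }
    for (size_t i = 0; i < n; i++) next[perm[i]] = perm[(i + 1) % n];

    struct timespec t0, t1;
    size_t hops = 50u * 1000 * 1000, cur = 0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t i = 0; i < hops; i++) cur = next[cur];   /* serialized loads */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    volatile size_t sink = cur; (void)sink;              /* keep the chain live */
    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    free(perm); free(next);
    return ns / hops;
}

int main(void) {
    size_t sizes[] = { 32 << 10, 256 << 10, 2 << 20, 16 << 20 }; /* ~L1, L2, L3, DRAM */
    for (size_t i = 0; i < sizeof sizes / sizeof *sizes; i++)
        printf("%8zu KiB working set: %.1f ns per dependent load\n",
               sizes[i] >> 10, chase(sizes[i]));
    return 0;
}
```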

Key Features

Instruction Set Extensions

The ARM Cortex-A76 core implements the ARMv8-A instruction set architecture, supporting the 64-bit AArch64 execution state with the fixed-length 32-bit A64 instruction set, as well as the 32-bit AArch32 execution state using the A32 (ARM) and T32 (Thumb) instruction sets. AArch32 support is limited to the EL0 (user mode) exception level. These base instruction sets provide the foundation for general-purpose computing, including scalar integer operations, Advanced SIMD (NEON) for vector processing, and floating-point arithmetic via the VFPv4 architecture.

The core incorporates several extensions to the ARMv8-A base, enhancing performance in areas such as atomic operations, cryptography, reliability, and memory consistency. The ARMv8.1-A extension adds atomic access instructions under the Large System Extensions (LSE) feature, including load-add (LDADD), load-clear (LDCLR), load-set (LDSET), and swap (SWP) variants for byte, halfword, word, and doubleword sizes in AArch64. These instructions enable lock-free programming and improve scalability in multi-core environments by providing single-copy atomicity without requiring exclusive monitors. Additionally, ARMv8.1-A introduces advanced SIMD instructions for half-precision (FP16) floating-point operations and support for 4 KB descriptors in AArch32.

Building on ARMv8.1-A, the ARMv8.2-A extension includes mandatory support for half-precision floating-point in the scalar and Advanced SIMD units, with instructions like FCVT (convert between FP16 and other formats) and FMUL (multiply FP16). It also adds enhancements for large systems, such as improved virtualization and memory management, though the Cortex-A76 does not implement optional components like the Scalable Vector Extension (SVE). The ARMv8.4-A extension adds dot-product instructions to Advanced SIMD (UDOT and SDOT for unsigned and signed 8-bit dot products), which accelerate matrix multiplications and are particularly beneficial for machine learning workloads. The ARMv8.5-A extension provides support for the PSTATE Speculative Store Bypass Safe (SSBS) bit, which helps mitigate speculative store bypass vulnerabilities.

An optional Cryptographic Extension, based on the ARMv8-A Cryptography feature, integrates hardware acceleration directly into the Advanced SIMD unit with new A64, A32, and T32 instructions. These include AES instructions (AESE for encrypt, AESD for decrypt, AESMC for mix columns), SHA-1 instructions (SHA1C, SHA1M, SHA1H, SHA1SU0, SHA1SU1), SHA-256 instructions (SHA256H, SHA256H2, SHA256SU0, SHA256SU1), polynomial multiplication (PMULL and PMULL2 for carryless multiply used in GCM mode), and CRC-32 computation (CRC32B, CRC32H, CRC32W, CRC32X, CRC32CB, CRC32CH, CRC32CW, CRC32CX). Optional sub-features add SHA-3 (EOR3, RAX1, XAR, BCAX) and the Chinese SM3/SM4 algorithms. This extension significantly boosts throughput for encryption and hashing in security-critical applications.

The Reliability, Availability, and Serviceability (RAS) extension, introduced in ARMv8.2-A, adds the Error Synchronization Barrier (ESB) instruction across A32, T32, and A64 to ensure error records are visible before proceeding, along with new system registers (e.g., ERRIDR_EL1 and ERXFR_EL1 for error record identification and access). These facilitate hardware error detection, reporting, and recovery, enhancing system robustness in server and high-reliability environments.

Finally, the core provides partial support for ARMv8.3-A through the Load-Acquire RCpc (Release Consistent processor consistent) instructions, specifically LDAPR, LDAPRB, LDAPRH, and LDAPRX.
These load-acquire operations offer weaker ordering guarantees than full acquire semantics, allowing reordering with preceding store-release operations to different addresses for improved performance in concurrent programming while maintaining compatibility with C++ memory models. Full ARMv8.3-A features, such as pointer authentication, are not supported.
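As a concrete example of the dot-product extension described above, the sketch below uses the ACLE intrinsic vdotq_u32, which is available when the compiler defines __ARM_FEATURE_DOTPROD (for example, when building with -march=armv8.2-a+dotprod), to accumulate an 8-bit dot product, with a plain-C fallback for cores lacking the extension. It is a minimal illustration rather than tuned code.

```c
/* Minimal sketch: 8-bit dot product using the UDOT instruction via ACLE
 * intrinsics. Each vdotq_u32 call accumulates 16 byte products into four
 * 32-bit lanes. Build example: gcc -O2 -march=armv8.2-a+dotprod dot.c */
#include <stdint.h>
#include <stdio.h>

#if defined(__ARM_FEATURE_DOTPROD)
#include <arm_neon.h>

uint32_t dot_u8(const uint8_t *a, const uint8_t *b, size_t n) {
    uint32x4_t acc = vdupq_n_u32(0);
    size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        uint8x16_t va = vld1q_u8(a + i);
        uint8x16_t vb = vld1q_u8(b + i);
        acc = vdotq_u32(acc, va, vb);      /* 16 x (u8 * u8) accumulated into 4 x u32 */
    }
    uint32_t sum = vaddvq_u32(acc);        /* horizontal add of the four lanes */
    for (; i < n; i++) sum += (uint32_t)a[i] * b[i];
    return sum;
}
#else
/* Portable fallback when the dot-product extension is unavailable. */
uint32_t dot_u8(const uint8_t *a, const uint8_t *b, size_t n) {
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) sum += (uint32_t)a[i] * b[i];
    return sum;
}
#endif

int main(void) {
    uint8_t a[64], b[64];
    for (int i = 0; i < 64; i++) { a[i] = (uint8_t)i; b[i] = 2; }
    printf("dot = %u\n", dot_u8(a, b, 64));   /* 2 * (0 + 1 + ... + 63) = 4032 */
    return 0;
}
```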

Security and Virtualization

The ARM Cortex-A76 core, based on the ARMv8-A architecture, provides robust hardware support for security through TrustZone technology, which enforces isolation between secure and non-secure execution environments, with the Secure Monitor running at exception level EL3. This enables the implementation of a trusted execution environment (TEE) for protecting sensitive data and operations, such as cryptographic keys and secure boot processes, from untrusted software in the normal world. TrustZone extends to peripherals, interrupts, and memory, allowing system-wide partitioning configurable by the secure monitor. Additionally, the optional Cryptographic Extension accelerates common security algorithms, including AES encryption/decryption in modes like ECB, CBC, and GCM, as well as SHA-1, SHA-256, and SHA-512 hashing, enabling efficient handling of secure communications and integrity checks.

For virtualization, the Cortex-A76 implements the full ARMv8-A virtualization extensions, supporting EL2 (hypervisor) mode to manage multiple guest operating systems with isolated virtual address spaces and resources. The memory management unit (MMU) facilitates this through stage-2 address translations, enabling efficient memory virtualization while maintaining protection against guest-to-guest interference. The core also includes the Virtualization Host Extensions (VHE) from ARMv8.1-A, which allow the host OS to execute at EL2 with near-native performance by reducing unnecessary traps and context switches for host operations such as system calls. This VHE support, combined with Address Space ID (ASID) management at EL2, reduces virtualization overhead in multi-tenant environments like cloud or server applications. These features integrate seamlessly in DynamIQ Shared Unit (DSU) configurations, where multiple Cortex-A76 cores can share cache and coherency resources across secure and virtualized contexts, supporting scalable deployments in devices requiring both isolation and efficiency, such as smartphones and edge servers.
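Because the Cryptographic Extension is optional per implementation, software typically probes for it at run time before selecting accelerated code paths. The minimal sketch below applies only to Linux on AArch64, where the kernel reports feature bits through the auxiliary vector; the HWCAP bit names come from the kernel's hwcap interface.

```c
/* Minimal sketch: runtime detection of optional Cortex-A76 features on
 * Linux/AArch64 via the auxiliary vector (AT_HWCAP). Only compiles on
 * AArch64, where <asm/hwcap.h> defines these bits. */
#include <stdio.h>
#include <sys/auxv.h>      /* getauxval, AT_HWCAP */
#include <asm/hwcap.h>     /* HWCAP_AES, HWCAP_SHA2, HWCAP_ATOMICS, ... */

int main(void) {
    unsigned long caps = getauxval(AT_HWCAP);
    printf("AES          : %s\n", (caps & HWCAP_AES)     ? "yes" : "no");
    printf("PMULL        : %s\n", (caps & HWCAP_PMULL)   ? "yes" : "no");
    printf("SHA1         : %s\n", (caps & HWCAP_SHA1)    ? "yes" : "no");
    printf("SHA2         : %s\n", (caps & HWCAP_SHA2)    ? "yes" : "no");
    printf("CRC32        : %s\n", (caps & HWCAP_CRC32)   ? "yes" : "no");
    printf("LSE atomics  : %s\n", (caps & HWCAP_ATOMICS) ? "yes" : "no");
    return 0;
}
```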

Performance and Efficiency

Benchmark Results

The ARM Cortex-A76 demonstrated significant performance advancements over its predecessor, the Cortex-A75, particularly in integer and floating-point workloads. In SPECint2006 benchmarks, the A76 achieved a 25% improvement in integer performance compared to the A75 when evaluated at the same process node and frequency. Similarly, SPECfp2006 results showed a 35% uplift in floating-point operations under identical conditions. These gains were validated through early implementations such as Huawei's Kirin 980 SoC, where the A76-based cores delivered 1.89 times the integer performance and 2.04 times the floating-point performance of the Cortex-A73 in the Snapdragon 835, at 2.6 GHz versus 2.45 GHz. Efficiency metrics further highlighted the A76's design strengths, with ARM reporting up to 40% better power efficiency at performance levels equivalent to the A75, enabling sustained operation in mobile and laptop scenarios without excessive thermal constraints. Memory subsystem enhancements contributed substantially, as LMBench tests indicated a 90% increase in memory bandwidth over the A75, reducing bottlenecks in data-intensive tasks. In real-world SoC integrations like Qualcomm's Snapdragon 855, which clocked A76-derived cores at up to 2.84 GHz, single-threaded Geekbench 4 scores reached approximately 3,500, representing a 45% leap over the Snapdragon 845's A75-based configuration, while multi-threaded scores approached 11,000.
Benchmark | Cortex-A76 result | Implementation example
SPECint2006 (integer, vs. A75) | +25% | Iso-process/iso-frequency
SPECfp2006 (floating-point, vs. A75) | +35% | Iso-process/iso-frequency
Memory bandwidth (LMBench, vs. A75) | +90% | N/A
SPECint2006 (vs. A73) | 1.89x | Kirin 980 @ 2.6 GHz
SPECfp2006 (vs. A73) | 2.04x | Kirin 980 @ 2.6 GHz
Geekbench 4 single-core (vs. A75) | +45% | Snapdragon 855 @ 2.84 GHz
Overall, these results positioned the A76 as a foundational core for mobile devices in 2018 and 2019, balancing high throughput with the thermal and power constraints typical of battery-powered systems. ARM's internal modeling projected significant uplifts in SPEC suites across early adopters.

Power Consumption

The ARM Cortex-A76 core is engineered for high performance within the constrained power envelopes typical of mobile and embedded systems, achieving significant efficiency gains through microarchitectural optimizations such as improved branch prediction, wider execution pipelines, and enhanced prefetching mechanisms that reduce wasted work from stalls and cache misses. These design choices enable the core to deliver laptop-class computational throughput while adhering to smartphone-level power budgets, supporting extended battery life in devices like premium mobiles and always-connected PCs.

Compared to its predecessor, the Cortex-A75, the A76 provides a 40% improvement in power efficiency at equivalent performance levels, or up to 40% higher performance within the same power allocation. This uplift stems from targeted reductions in area and power overheads in the out-of-order engine and memory subsystem, alongside integration with ARM's DynamIQ technology, which facilitates heterogeneous clustering with low-power cores like the Cortex-A55 for workload-specific frequency and voltage scaling. In practice, such efficiencies contribute to over 20 hours of battery life in ARM-based devices running productivity applications.

The core's power profile benefits from advanced features including fine-grained power domains for the SIMD and floating-point units, as well as support for ARM's Maximum Power Mitigation Mechanism (MPMM), which uses activity monitors to dynamically cap power draw during thermal events without full throttling. When implemented on 7 nm process nodes at frequencies up to 3 GHz, these elements ensure the A76 maintains competitive energy-per-instruction metrics, particularly for machine learning inference tasks, where it achieves 4x the performance of prior generations at iso-power. Overall, the design prioritizes sustainable efficiency for sustained workloads, balancing peak performance with low leakage and active power dissipation.
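One practical way to observe the frequency scaling behavior described above is to read the Linux kernel's cpufreq interface, which reports each core's current operating point; on a typical A76+A55 design the big and LITTLE clusters scale independently. The following minimal sketch assumes the standard cpufreq sysfs layout and values reported in kHz.

```c
/* Minimal sketch: print the current cpufreq operating point of each CPU on
 * Linux, e.g. to watch big (Cortex-A76) and LITTLE (Cortex-A55) clusters
 * scale independently under load. Assumes the standard sysfs paths exist. */
#include <stdio.h>

int main(void) {
    for (int cpu = 0; ; cpu++) {
        char path[128];
        snprintf(path, sizeof path,
                 "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_cur_freq", cpu);
        FILE *f = fopen(path, "r");
        if (!f) break;                  /* stop at the first CPU without cpufreq */
        long khz = 0;
        if (fscanf(f, "%ld", &khz) == 1)
            printf("cpu%d: %ld MHz\n", cpu, khz / 1000);
        fclose(f);
    }
    return 0;
}
```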

Implementations and Usage

Licensing Model

The ARM Cortex-A76 core is licensed by Arm as semiconductor intellectual property (IP) to semiconductor manufacturers, fabless design companies, and system integrators for incorporation into custom system-on-chip (SoC) designs. This licensing enables licensees to configure the core within Arm's DynamIQ Shared Unit (DSU) for scalable multi-core clusters, supporting integration with other Arm IP such as GPUs, interconnects, and memory controllers via standard AMBA interfaces.

One licensing pathway for the Cortex-A76 is Arm Flexible Access, a subscription-based model that provides broad, low-barrier entry to Arm's IP portfolio, including the Cortex-A series. Under this program, eligible parties, ranging from startups and research institutions to established enterprises, gain broad access to IP, models, and tools without large upfront fees, with costs deferred until tape-out or production. Qualifying startups and academic users receive zero-cost access for prototyping and evaluation, while commercial use incurs per-project fees or royalties scaled to volume, promoting innovation in mobile, automotive, and IoT applications.

Arm also supports traditional licensing options, such as perpetual or time-bound subscriptions, which involve negotiated upfront payments for IP rights followed by per-unit royalties upon commercialization. These models allow for customized configurations and are tailored to high-volume producers, ensuring compliance with Arm's specifications while permitting limited modifications under separate agreements. All licenses emphasize royalty-based revenue to align with Arm's ecosystem-driven business strategy.

Adopted SoCs and Devices

The ARM Cortex-A76 core saw widespread adoption in high-end mobile system-on-chips (SoCs) starting in late 2018, primarily for premium smartphones seeking improved performance and efficiency over previous generations. Early implementations focused on DynamIQ-compatible configurations combining A76 performance cores with Cortex-A55 efficiency cores, enabling balanced big.LITTLE-style architectures for demanding tasks like gaming and AI processing. These SoCs marked a shift toward laptop-class CPU capabilities in mobile devices while maintaining power constraints suitable for battery-powered platforms.

HiSilicon's Kirin 980 was the first commercial SoC to integrate the Cortex-A76, announced in September 2018 and fabricated on a 7 nm process. It features a tri-cluster setup with two high-performance A76 cores at 2.6 GHz, two mid-performance A76 cores at 1.92 GHz, and four A55 cores at 1.8 GHz, delivering up to 75% better single-threaded performance compared to the prior Kirin 970. This SoC powered Huawei flagship devices, including the Mate 20, Mate 20 Pro, and Honor View 20, emphasizing advancements in on-device AI via its dual-NPU design.

Qualcomm's Snapdragon 855, also on 7 nm and launched in December 2018, adopted a similar tri-cluster approach with one prime A76-based core at 2.84 GHz, three performance A76-based cores at 2.42 GHz, and four A55-based cores at 1.8 GHz under the Kryo 485 branding. This configuration provided a 45% CPU uplift over the Snapdragon 845, supporting 4K HDR video and enhanced on-device AI processing. It was integrated into numerous Android flagships, such as the Samsung Galaxy S10 series and OnePlus 7, driving widespread availability in global markets.

Samsung's Exynos 9820, introduced in February 2019 on an 8 nm process, blended custom Exynos M4 cores with A76 cores for its premium lineup, using two M4 cores at 2.73 GHz, two A76 cores at 2.2 GHz, and four A55 cores at 1.95 GHz. This hybrid design aimed for optimized multimedia and gaming performance, appearing in regional variants of the Galaxy S10 and Note 10 series, particularly in European and other international markets.

Subsequent iterations extended A76 usage to mid-range and 5G SoCs. For instance, the HiSilicon Kirin 990 (2019, 7 nm+ EUV) upgraded to two A76 cores at 2.86 GHz and two at 2.09 GHz alongside four A55 cores, incorporating an integrated modem; it drove Huawei's Mate 30 Pro and P40 series with superior ISP capabilities for photography. Qualcomm's Snapdragon 720G (2020, 8 nm) targeted affordable devices with two A76 cores at 2.3 GHz and six A55 cores at 1.8 GHz, featured in phones like the Realme 6 Pro and Redmi Note 9S. MediaTek's Helio G99, announced in May 2022 on a 6 nm process, features two A76 cores at 2.2 GHz and six A55 cores at 2.0 GHz with a Mali-G57 MC2 GPU, aimed at budget gaming smartphones; it powers devices such as the Poco M5 and Realme Narzo 50 series.

Beyond smartphones, the A76 found applications in embedded and development platforms. Rockchip's RK3588 (2022, 8 nm) includes four A76 cores at up to 2.4 GHz and four A55 cores, optimized for AI and multimedia with a 6 TOPS NPU and 8K video support; it powers single-board computers (SBCs) such as the Radxa Rock 5B, Orange Pi 5, and Banana Pi BPI-M7, used in edge computing, media players, and prototyping. The Broadcom BCM2712 SoC, used in the Raspberry Pi 5 released in October 2023 and fabricated on a 16 nm process, integrates four A76 cores at 2.4 GHz with a VideoCore VII GPU, targeting hobbyist, educational, and general-purpose computing applications.
Allwinner's A733, launched in late 2024 on a 12 nm process, combines two A76 cores at 2.0 GHz and six A55 cores at 1.8 GHz with an optional 3 TOPS NPU and a RISC-V E902 management core, supporting up to 16 GB of RAM for AI tasks in Android tablets and laptops such as the Teclast P50Ai. In programmable hardware, Intel's Agilex 5 D-Series FPGAs (2023) incorporate two A76 cores in their hard processor system (HPS) alongside two A55 cores, enabling customizable SoC designs for industrial and embedded applications.
SoC | Manufacturer | Core configuration | Process node | Launch year | Example devices/platforms
Kirin 980 | HiSilicon | 2×A76 @ 2.6 GHz + 2×A76 @ 1.92 GHz + 4×A55 | 7 nm | 2018 | Huawei Mate 20 Pro, Honor View 20
Snapdragon 855 | Qualcomm | 1×A76 @ 2.84 GHz + 3×A76 @ 2.42 GHz + 4×A55 | 7 nm | 2018 | Samsung Galaxy S10, OnePlus 7
Exynos 9820 | Samsung | 2×M4 @ 2.73 GHz + 2×A76 @ 2.2 GHz + 4×A55 | 8 nm | 2019 | Samsung Galaxy S10 (Exynos variant)
Kirin 990 | HiSilicon | 2×A76 @ 2.86 GHz + 2×A76 @ 2.09 GHz + 4×A55 | 7 nm+ | 2019 | Huawei Mate 30 Pro, P40 Pro
Snapdragon 720G | Qualcomm | 2×A76 @ 2.3 GHz + 6×A55 @ 1.8 GHz | 8 nm | 2020 | Realme 6 Pro, Xiaomi Redmi Note 9S
Helio G99 | MediaTek | 2×A76 @ 2.2 GHz + 6×A55 @ 2.0 GHz | 6 nm | 2022 | Xiaomi Poco M5, Realme Narzo 50
RK3588 | Rockchip | 4×A76 @ 2.4 GHz + 4×A55 | 8 nm | 2022 | Radxa Rock 5B, Orange Pi 5
BCM2712 | Broadcom | 4×A76 @ 2.4 GHz | 16 nm | 2023 | Raspberry Pi 5
A733 | Allwinner | 2×A76 @ 2.0 GHz + 6×A55 @ 1.8 GHz | 12 nm | 2024 | Teclast P50Ai
Agilex 5 HPS | Intel | 2×A76 + 2×A55 | N/A (FPGA) | 2023 | Agilex 5 D-Series FPGA development kits
