RDNA (microarchitecture)
from Wikipedia

AMD RDNA
Release date: July 7, 2019[1]
Designed by: AMD
Predecessor: Graphics Core Next 5
Support status: Supported

A generic block diagram of a GPU

RDNA (Radeon DNA[2][3]) is a graphics processing unit (GPU) microarchitecture and accompanying instruction set architecture developed by AMD. It is the successor to their Graphics Core Next (GCN) microarchitecture and instruction set. The first product lineup featuring RDNA was the Radeon RX 5000 series of video cards, launched on July 7, 2019.[1][4] The architecture is also used in mobile products.[5] The first RDNA chips, the Navi series used in AMD Radeon graphics cards, are fabricated on TSMC's N7 FinFET process.[6]

The second iteration of RDNA was first featured in the PlayStation 5[7][8] and Xbox Series X/S consoles.[9] Both consoles utilize a custom RDNA 2-based graphics solution as the basis for their GPU microarchitecture. On PC, RDNA 2 is featured in the Radeon RX 6000 series of video cards, which first launched in November 2020.[10] RDNA 2 is also featured in Samsung's Exynos 2200 as the graphics architecture.[11]

The third iteration of RDNA was announced on November 3, 2022, and is featured in the Radeon RX 7000 series of consumer desktop and mobile graphics cards.[12]

The fourth and final iteration of RDNA was unveiled on January 6, 2025 at CES[13] and is used in the Radeon RX 9000 series of desktop graphics cards.

Instruction set


AMD's GPUOpen website hosts PDF documents aiming to describe the environment, the organization and the program state of RDNA devices. They detail the instruction set and the microcode formats native to this family of processors that are accessible to programmers and compilers.[14]

Documentation is available for each RDNA generation.

RDNA 1

AMD RDNA 1
Release date: July 7, 2019
Codename: Navi 1x
Fabrication process: TSMC N7
Predecessor: Graphics Core Next 5
Successor: RDNA 2
Support status: Supported

RDNA 1 (also RDNA1)[15][16] is the first implementation of the RDNA microarchitecture and is the successor to the Radeon RX Vega series.[17][18] The launch occurred on July 7, 2019.[19]

Architecture

Die shot of the RX 5500 XT's RDNA GPU

The architecture features a new processor design, although details first released at AMD's Computex keynote hinted that aspects of the previous Graphics Core Next (GCN) architecture remain present for backwards compatibility. This is especially important for the architecture's use (in the form of RDNA 2) in the major ninth-generation game consoles (the Xbox Series X/S and PlayStation 5), which preserve native compatibility with their pre-existing eighth-generation game libraries designed for GCN. RDNA features a multi-level cache hierarchy and an improved rendering pipeline, with support for GDDR6 memory.

Starting with the architecture itself, one of the biggest changes in RDNA is the width of a wavefront, the fundamental group of work. GCN in all of its iterations was 64 threads wide, meaning 64 threads were bundled into a single wavefront for execution. RDNA drops this to a native 32 threads. At the same time, AMD expanded the width of its SIMDs from 16 slots to 32 (SIMD32), so the size of a wavefront now matches the SIMD size.[5]: 2 
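The arithmetic behind this change can be sketched in a few lines of Python (an illustrative model, not AMD tooling): a wavefront wider than its SIMD must be issued over multiple cycles.

```python
# Illustrative model (not AMD code): cycles a SIMD spends issuing one
# wavefront, given the wavefront width and the SIMD lane count.
def issue_cycles(wave_width: int, simd_width: int) -> int:
    return wave_width // simd_width

# GCN: a 64-thread wavefront on a 16-lane SIMD occupies it for 4 cycles.
print(issue_cycles(64, 16))  # 4
# RDNA: a 32-thread wavefront on a 32-lane SIMD issues in a single cycle.
print(issue_cycles(32, 32))  # 1
```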

RDNA also introduces working primitive shaders. While the feature was present in the hardware of the Vega architecture, it proved difficult to extract a real-world performance boost from it, and AMD never enabled it there. Primitive shaders in RDNA are compiler-controlled.[5]: 2 

The display controller in RDNA has been updated to support Display Stream Compression 1.2a, allowing output in 4K@240 Hz, HDR 4K@120 Hz, and HDR 8K@60 Hz.[5]: 2 [20]

Differences between GCN and RDNA


There are architectural changes which affect how code is scheduled:

  1. Single cycle instruction issue:
    • GCN issued one instruction per wave once every 4 cycles.
    • RDNA issues instructions every cycle.
  2. Wave32:
    • GCN used a wavefront size of 64 threads (work items).
    • RDNA supports both wavefront sizes of 32 and 64 threads.
  3. Workgroup Processors:
    • GCN grouped the shader hardware into "compute units" (CUs) which contained scalar ALUs and vector ALUs, LDS and memory access. One CU contains 4 SIMD16s which share one path to memory.
    • RDNA introduced the "workgroup processor" ("WGP"). The WGP replaces the compute unit as the basic unit of shader computation hardware/computing. One WGP encompasses 2 CUs. This allows significantly more compute power and memory bandwidth to be directed at a single workgroup.
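The hierarchy above reduces to simple multiplications, sketched here in Python using only figures from this article (2 SIMD32 units per CU, 2 CUs per WGP; the Navi 10 counts match the comparison table later in the article):

```python
# Back-of-envelope RDNA shader hierarchy, using figures from the text:
# 2 SIMD32 units per compute unit (CU), 2 CUs per workgroup processor (WGP).
LANES_PER_SIMD = 32
SIMDS_PER_CU = 2
CUS_PER_WGP = 2

lanes_per_cu = SIMDS_PER_CU * LANES_PER_SIMD   # 64 stream processors per CU
lanes_per_wgp = CUS_PER_WGP * lanes_per_cu     # 128 stream processors per WGP

# A full Navi 10 has 20 WGPs -> 40 CUs -> 2560 stream processors.
print(20 * CUS_PER_WGP, 20 * lanes_per_wgp)  # 40 2560
```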

Chips


Discrete GPUs:

  • Navi 10 found on Radeon RX 5600, Radeon RX 5600 XT, Radeon RX 5600M, Radeon RX 5700, Radeon RX 5700M, Radeon RX 5700 XT, Radeon Pro 5700, Radeon Pro 5700 XT, Radeon Pro W5700X, and Radeon Pro W5700 graphics cards
  • Navi 12 found on Radeon Pro V520 branded graphics card, Radeon Pro 5600M branded mobile graphics card and BC-160 mining card for cryptocurrency
  • Navi 14 found on Radeon RX 5300, Radeon RX 5300 XT, Radeon Pro 5300, Radeon Pro W5300, Radeon RX 5500, Radeon RX 5500 XT, Radeon Pro 5500, Radeon Pro 5500 XT, and Radeon Pro W5500, branded graphics cards; Radeon RX 5300M, Radeon Pro 5300M, Radeon Pro W5300M, Radeon RX 5500M, Radeon Pro 5500M, and Radeon Pro W5500M branded mobile graphics cards

RDNA 2

AMD RDNA 2
Release date: November 18, 2020
Codename: Navi 2x
Predecessor: RDNA 1
Successor: RDNA 3
Support status: Supported

RDNA 2[21] (also RDNA2)[22] is the successor to the RDNA microarchitecture. It was first publicly announced in early 2020 with a projected release in Q4 2020.[22][23] According to statements from AMD, RDNA 2 would be a "refresh" of the RDNA architecture.[24]

More information about RDNA 2 was made public on AMD's Financial Analyst Day on March 5, 2020.[25][23][26] AMD claimed that it would provide a 50% performance-per-watt improvement over RDNA, with increases in clock speed and instructions-per-clock.[27] Additional features confirmed by AMD include real-time, hardware accelerated ray tracing, "Infinity Cache", mesh shaders, sampler feedback and variable rate shading.[27][10] The company announced that RDNA 2 would be used in next-generation gaming consoles and PC graphics cards[27] code-named "Navi 2X" and also nicknamed as "Big Navi".[27]

AMD unveiled the Radeon RX 6000 series, its next-generation RDNA 2 graphics cards, at an online event on October 28, 2020.[28][29] The lineup initially consisted of the RX 6800, RX 6800 XT and RX 6900 XT.[30][31] The RX 6800 and 6800 XT launched on November 18, 2020, with the RX 6900 XT following on December 8, 2020.[10] Further variants, including the Radeon RX 6700 XT based on Navi 22, launched later, on March 18, 2021.[32][33][34][35]

On May 31, 2021, AMD launched the RX 6000M series of GPUs designed for laptops.[36][37] These include the RX 6600M, RX 6700M, and RX 6800M. These were made available beginning on June 1, 2021.[36]

On June 1, 2021, AMD's CEO Dr. Lisa Su and Tesla, Inc.'s CEO Elon Musk confirmed that the entertainment systems of Tesla's new Model S and Model X are powered by RDNA 2.[38] The same microarchitecture was also announced to be used for an upcoming flagship Samsung Exynos SoC,[39] later introduced in January 2022 as Exynos 2200, utilizing a custom Xclipse 920 GPU with 3 workgroup processors.[40][41]

An RDNA 2 integrated GPU with 2 compute units is included in the I/O die of AMD's Zen 4-based Ryzen 7000 series CPUs.[42][43] According to AMD, the integrated RDNA 2 graphics in Ryzen 7000 is not intended for gaming; it instead serves diagnostic purposes and provides video encode and decode capabilities.[44]

Chips


Discrete GPUs:

  • Navi 21
  • Navi 22
  • Navi 23
  • Navi 24

Integrated into APUs/CPUs:

Usage in video game consoles


Custom configurations of the RDNA 2 graphics microarchitecture are used in the PlayStation 5[7][45] from Sony, Xbox Series X and Series S consoles[9] from Microsoft, with proprietary tweaks and different GPU modifications in each system's implementation. Valve announced on July 15, 2021, that their Steam Deck would feature the RDNA 2 architecture. The Steam Deck was released in February 2022.[46]

RDNA 3

AMD RDNA 3
Release date: December 13, 2022
Codename: Navi 3x
Predecessor: RDNA 2
Successor: RDNA 4
Support status: Supported

RDNA 3 (also RDNA3) is the successor to the RDNA 2 microarchitecture and was projected for launch in Q4 2022 per AMD's gaming GPU roadmap.[47][48][49] At an August 29 reveal event for Ryzen 7000 series CPUs, AMD CEO Lisa Su teased RDNA 3 and revealed that it would utilize chiplets built on TSMC's N5 node.[50] On September 19, 2022, Sam Naffziger, a senior vice president at AMD, stated in a blog post that improvements to the RDNA 3 microarchitecture allow for considerable performance and efficiency gains, with an estimated 50% increase in performance-per-watt compared to the RDNA 2 microarchitecture.[51] Additionally, the RDNA 3 architecture features the next generation of Infinity Cache, a modified graphics pipeline, adaptive power management and rearchitected compute units, delivering an overall uplift in rasterization and ray-tracing performance over the previous consumer architecture.[52]

On November 3, 2022, AMD unveiled the RX 7900 XTX and RX 7900 XT graphics cards, based on the RDNA 3 microarchitecture. These were the first consumer GPUs based on a multi-chip module (MCM) design.[53]

On October 5, 2023 and October 24, 2024 respectively, Samsung announced the Exynos 2400 and Exynos 1580, which use custom GPUs based on the RDNA 3 microarchitecture, the Xclipse 940 and Xclipse 540.[54][55]

Chips


Discrete GPUs:

  • Navi 31 found on Radeon RX 7900 GRE, Radeon RX 7900 XT, Radeon RX 7900 XTX, Radeon Pro W7800 and Radeon Pro W7900 branded graphics cards; Radeon RX 7900M branded mobile graphics cards
  • Navi 32 found on Radeon RX 7700 XT and Radeon RX 7800 XT branded graphics cards
  • Navi 33 found on Radeon RX 7600, Radeon RX 7600 XT, Radeon Pro W7500 and Radeon Pro W7600 branded graphics cards; Radeon RX 7600S, Radeon RX 7600M, Radeon RX 7600M XT and Radeon RX 7700S branded mobile graphics cards

Integrated into APUs/CPUs:

Comparison of RDNA chips

RDNA discrete GPU chips
Microarchitecture RDNA 1 RDNA 2 RDNA 3 RDNA 4
Chip Navi 10[56] Navi 12[57] Navi 14[58] Navi 21[59] Navi 22[60] Navi 23[61] Navi 24[62] Navi 31[63][64] Navi 32[65] Navi 33[66] Navi 44[67] Navi 48[68]
Code name Gaming Fighter Sienna Cichlid Navy Flounder Dimgrey Cavefish Beige Goby Plum Bonito Wheat Nas Hotpink Bonefish
LLVM target[69][70] gfx1010 gfx1011 gfx1012 gfx1030 gfx1031 gfx1032 gfx1034 gfx1100 gfx1101 gfx1102 gfx1200 gfx1201
Fab TSMC N7 TSMC N6 TSMC N5 (GCD), TSMC N6 (MCD) TSMC N6 TSMC N4
Package Monolithic Multi-chip module (MCM) Monolithic
Die size (mm2) 251 Unknown 158 520 335 237 107 ~531 ~350 204 199 357
Graphics compute dies 1
Memory cache dies 6 4
GCD size (mm2) 306 200
MCD size (mm2) 37.5
Transistors (billions) 10.3 Unknown 6.4 26.8 17.2 11.06 5.4 57.7 28.1 13.3 29.7 53.9
Transistor density
(MTr/mm2)
41.0 Unknown 40.5 51.5 51.3 46.7 50.5 109.2 (MCM)
132.4 (GCD)[71]
81.2 65.2 149.2 151
Shader engines 2 1 4[72] 2[72] 1[72] 6 TBA 2 4
Shader arrays 4 2 8 4 2 12 TBA 4 8
Workgroup processors 20 12 40 20 16 8 48 30 16 32
Compute units 40 24 80 40 32 16 96 60 32 64
Stream processors 2560 1536 5120 2560 2048 1024 6144 3840 2048 4096
Texture mapping units 160 96 320 160 128 64 384 240 128 256
Render output units 64 32 128 64 32 192 96[73] 64 128
RT accelerators 80 40 32 16 96 60 32 64
AI accelerators[a] 192 120 64 128
L0 cache (KB) 32 per Workgroup processor (WGP) 64 per WGP
L1 cache (KB) 128 per Shader array (SA) 256 per SA 128 per SA
L2 cache (MB) 8 4 2 4 3 2 1 6 4 2 4 8
L3 cache (MB) 128 96 32 16 96 64 32 64
Memory type GDDR6 HBM2 GDDR6
Memory bus (bits) 256 2048 128 256 192 128 64 384 256 128 256
Display Core Next 2.0.0 3.0.0 3.0.2 3.0.3 3.2.0 3.2.1 4.0.1
Video Core Next 2.0.0 2.0.2 3.0.0 3.0.16 3.0.33 4.0.0 4.0.4 5.0.0
Launch Jul 2019 Jun 2020 Oct 2019 Nov 2020 Mar 2021 May 2021 Jan 2022 Dec 2022 Sep 2023 Jan 2023 Jun 2025 Mar 2025
Introduced with RX 5700 (XT) Pro 5600M RX 6800 (XT) RX 6700 XT RX 6600M RX 7900 XT(X) RX 9060 XT RX 9070 (XT)
RDNA integrated GPU chips
Microarchitecture RDNA 2 RDNA 3 RDNA 3.5
Code name Rembrandt Raphael Mendocino Rembrandt-R Dragon Range Phoenix Hawk Point Strix Point[74][75] Strix Halo[76] Krackan Point[77]
LLVM target[78][79] gfx1035 gfx1036 gfx1037 gfx1035 gfx1037 gfx1103 gfx115{0,1} gfx1151 gfx1152
Fab TSMC N6 TSMC N4
Package Monolithic Monolithic[b] Semi-MCM[c] Monolithic[b]
Die size (mm2) TBA 308 TBA
Transistors (billions) TBA
Transistor density
(MTr/mm2)
TBA
Shader engines
Shader arrays
Workgroup processors 8 20 4
Compute units 16 40 8
Stream processors 1024 2560 512
Texture mapping units 64 160 32
Render output units 32 64 8
RT accelerators 16 40 8
AI accelerators[d] 32 80 16
L0 cache (KB) 32 per Workgroup processor (WGP) 64 per WGP
L1 cache (KB) 128 per Shader array (SA) 256 per SA
L2 cache (MB) 2 8 1
L3 cache (MB) 0 32[e] 0
Memory type DDR5/LPDDR5 DDR5 LPDDR5 DDR5/LPDDR5 DDR5 DDR5/LPDDR5(X)
Memory bus (bits)[f] 128 128 128 128/256 128
Display Core Next 3.1.4 3.5.0 3.5.0
Video Core Next 4.0.2 4.0.5 4.0.5
Launch Jan 2022 Sep 2022 Jan 2023 Dec 2023 Jul 2024 Jan 2025 Mar 2025
Introduced with 660M/680M Radeon Graphics 610M 660M/680M 610M 740M/760M/780M 740M/760M/780M 860M

See also


References

from Grokipedia
RDNA (Radeon DNA) is a family of graphics processing unit (GPU) microarchitectures developed by AMD, introduced in 2019 as the successor to the Graphics Core Next (GCN) architecture. Optimized for gaming and high-performance graphics workloads, RDNA emphasizes improvements in instructions per clock (IPC), reduced latency, higher bandwidth, and power efficiency compared to its predecessor, enabling scalable designs across consumer GPUs, consoles, and professional applications.

At its core, the RDNA architecture replaces GCN's Compute Units (CUs) with Workgroup Processors (WGPs), where each WGP combines resources from two CUs to provide dual scalar execution units and support for up to 20 wavefronts per SIMD32 processor. It introduces native Wave32 wavefront execution with single-cycle instruction issue, doubling scalar throughput and enhancing programmability while maintaining backward compatibility with GCN. The memory subsystem features a hierarchical caching structure, including 16 KB L0 instruction and scalar data caches per WGP, 128 KB L1 caches per shader array, and up to 4 MB of L2 cache, alongside support for GDDR6 and asynchronous compute tunneling for better workload balancing. These elements collectively deliver up to 1.5x IPC uplift and 25% better energy efficiency in gaming tasks.

The RDNA family has evolved through multiple generations, each building on the foundational design with targeted enhancements. RDNA 1, which debuted in the Radeon RX 5000 series, laid the groundwork with its focus on graphics-centric optimizations. RDNA 2, released in 2020 and powering GPUs like the Radeon RX 6000 series as well as the PlayStation 5 and Xbox Series X/S consoles, introduced hardware-accelerated ray tracing via dedicated Ray Accelerators, the Infinity Cache for up to 2x effective memory bandwidth, and enhanced Compute Units supporting Variable Rate Shading (VRS) and DirectX 12 Ultimate, achieving up to 179% performance gains over RDNA 1 in professional workloads.
RDNA 3, launched in 2022 with the Radeon RX 7000 series, adopted a chiplet-based design combining 5 nm and 6 nm process nodes, second-generation Infinity Cache (up to 96 MB), second-generation ray tracing with improved traversal efficiency, and dedicated AI accelerators, resulting in over 50% efficiency improvements and up to 3500 GB/s effective bandwidth. The latest iteration, RDNA 4, unveiled in 2025 for the Radeon RX 9000 series, features third-generation ray tracing accelerators with over 2x throughput per compute unit, second-generation AI accelerators offering up to 8x INT8 performance for machine learning tasks, 64 MB of Infinity Cache, and 16 GB of GDDR6 memory, delivering up to 40% higher gaming performance compared to RDNA 3 equivalents.

Background and development

Predecessors and motivations

The Graphics Core Next (GCN) microarchitecture, introduced by AMD in 2012 with the Radeon HD 7000 series, formed the foundational predecessor to RDNA and powered the company's discrete and integrated GPUs through the Radeon RX Vega series until 2019. GCN employed a unified compute architecture, enabling the same compute units to process both graphics rendering and general-purpose compute tasks seamlessly, which facilitated strong support for APIs like DirectX 11 and OpenCL. Execution was structured around wavefronts comprising 64 threads, executed across four SIMD16 units per compute unit to achieve high parallelism and throughput in vector operations. Scalar processing was managed by dedicated pipelines for control flow and address generation, backed by an 8 KB scalar register file and a 16 KB shared scalar data cache per group of four compute units. The cache hierarchy included private 16 KB L1 vector data caches per compute unit and a distributed 16-way associative L2 cache for coherence across the GPU.

GCN's design excelled in compute-heavy workloads but revealed limitations in scalar flexibility, where divergent thread execution in graphics shaders strained the scalar units' branch handling and 64-bit ALU capabilities, leading to underutilization in latency-sensitive scenarios. Cache efficiency also posed challenges, as the per-CU vector caches and shared scalar cache resulted in frequent flushes, particularly in geometry and memory access patterns under modern workloads like DirectX 12, which emphasized asynchronous compute and lower occupancy. These issues contributed to suboptimal power consumption and scalability as gaming demands shifted toward higher clock speeds and reduced thread counts for better responsiveness.

The transition to RDNA was driven by AMD's goal of delivering approximately 50% higher performance-per-watt over GCN, with a sharpened focus on optimizing for gaming through lower latency, higher IPC, and improved utilization in graphics pipelines. This redesign targeted high-end discrete GPUs and console-integrated solutions, enhancing scalability via refined memory hierarchies and interconnects while ensuring backward compatibility with GCN's instruction set. RDNA's development emphasized readiness for emerging features like ray tracing, addressing GCN's inefficiencies to better align with evolving industry standards for power-sensitive, real-time rendering.

Timeline of releases

The development of the RDNA microarchitecture began as part of AMD's internal roadmap to succeed the Graphics Core Next architecture, with the fifth generation of GCN (Vega) serving as the immediate predecessor before the shift to RDNA for improved gaming efficiency. AMD's RDNA iterations were designed with a focus on regular advancements in performance-per-watt, targeting gains such as 50% improvements in early generations over GCN baselines.

RDNA 1 was revealed at Computex 2019 and officially announced in July 2019, launching alongside the Radeon RX 5000 series on July 7, 2019. This marked the debut of the RDNA family, emphasizing a clean break from GCN for better power efficiency in gaming workloads. RDNA 2 followed with its announcement on October 28, 2020, and launch on November 18, 2020, powering the Radeon RX 6000 series of discrete GPUs. A key milestone was its integration into console hardware through partnerships with Sony and Microsoft, debuting in the PlayStation 5 and Xbox Series X/S upon their November 2020 releases. RDNA 3 was unveiled on November 3, 2022, with the Radeon RX 7000 series launching on December 13, 2022, introducing a significant manufacturing shift to TSMC's 5 nm node for enhanced density and efficiency. The architecture combined 5 nm and 6 nm nodes in a chiplet design, building on prior collaborations with console partners. RDNA 4 was teased at CES 2025 on January 6 before its full unveiling on February 28, 2025, and launch on March 6, 2025, with the Radeon RX 9000 series. A deeper technical exploration occurred at Hot Chips 2025 in August, highlighting its modular design for flexible GPU configurations.

Core architectural principles

Instruction set architecture

The RDNA instruction set architecture (ISA) is a 32-bit reduced instruction set computing (RISC)-like design that supports vector, scalar, and memory operations, enabling efficient execution of shader programs on GPUs. It is compatible with major graphics and compute APIs, including DirectX 12, Vulkan, and OpenCL, facilitating broad software ecosystem support for rendering and general-purpose tasks. The ISA organizes instructions into categories such as VOP (vector operations), SOP (scalar operations), and memory access types like MIMG, FLAT, MUBUF, MTBUF, and DS, allowing developers to target diverse workloads from pixel shading to matrix computations.

Key extensions in the RDNA ISA distinguish it from predecessors while maintaining core principles. The native wavefront size is 32 threads (SIMD32), enabling consistent parallel execution across work-items, with support for Wave64 modes that operate as paired Wave32 issues. Double-rate integer operations are provided through instructions like V_MUL_I32_I24 and V_MAD_I32_I24, which perform packed 24-bit multiplies and 32-bit accumulates to accelerate integer-heavy tasks such as texture addressing. Primitive shaders, introduced in RDNA 1, are supported by dedicated ISA opcodes including V_INTERP_P1_F32 for attribute interpolation and EXP for parameter exports, streamlining geometry culling and vertex processing by reducing overhead in the geometry pipeline.

Addressing modes in the RDNA ISA support both 32-bit and 64-bit pointers, encompassing flat addressing for global memory, scratch for private data, and structured formats via MUBUF and MTBUF instructions. Registers include vector general-purpose registers (VGPRs, 0-255 or up to 511 in extended modes) for per-lane data and scalar general-purpose registers (SGPRs, 0-105, plus special-purpose registers like VCC and M0 for offsets), promoting efficient data movement and control flow. Local data share (LDS) provides 64 KB per CU (128 KB per WGP) for fast intra-workgroup communication, with up to 64 KB allocatable per workgroup; LDS allocation supports two modes: CU mode, which splits the memory for independent access by each CU's SIMDs, and WGP mode, providing a single contiguous space across the WGP. Asynchronous compute queues are coordinated through synchronization primitives like S_WAITCNT, S_BARRIER, and DS_GWS_SEMA_P instructions, allowing concurrent execution of graphics and compute workloads without stalling.

The RDNA ISA ensures backward compatibility with the Graphics Core Next (GCN) ISA, inheriting its foundational instructions and state models while introducing optimizations for scalar unit independence, such as single-cycle issue for SOP1 and SOP2 operations decoupled from vector pipelines. This design allows scalar instructions to execute per wavefront without vector dependencies, improving branch handling and control efficiency in shaders.
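The two LDS allocation modes can be sketched with a small hypothetical helper (not part of any AMD API), assuming only the 128 KB-per-WGP figure from the text:

```python
# Hypothetical sketch of RDNA LDS allocation modes (not an AMD API).
LDS_PER_WGP_KB = 128

def lds_space_kb(mode: str) -> int:
    """Contiguous LDS space visible to a workgroup in each mode."""
    if mode == "CU":
        # CU mode splits the WGP's LDS so each CU's SIMDs use their own half.
        return LDS_PER_WGP_KB // 2
    if mode == "WGP":
        # WGP mode exposes one contiguous space across the whole WGP.
        return LDS_PER_WGP_KB
    raise ValueError(f"unknown mode: {mode}")

print(lds_space_kb("CU"), lds_space_kb("WGP"))  # 64 128
```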

Compute unit structure

In the RDNA architecture, Compute Units (CUs) are organized into Workgroup Processors (WGPs), with each WGP comprising two CUs that share resources such as the L0 instruction cache, scalar cache, and LDS for improved efficiency. The compute unit serves as the fundamental processing core for parallel shader execution, consisting of two 32-wide single instruction, multiple data (SIMD) units that provide a total of 64 shader processors capable of performing arithmetic operations on wavefronts of 32 or 64 threads. Each CU includes a dedicated scalar unit (SALU) that handles address calculations, control flow decisions, and uniform operations across the wavefront, operating independently from the vector paths to improve efficiency in divergent code execution. Additionally, each CU incorporates four texture units for sampling and filtering operations, as well as one primitive unit responsible for assembling and culling primitives in the geometry pipeline, outputting up to one primitive per clock cycle after processing up to two inputs.

The execution pipeline within a CU follows an in-order design with distinct stages: instruction fetch from a 16 KB L0 instruction cache per CU (32 KB per WGP) shared across the SIMD units, decode to distribute operations, execution through separate vector and scalar pipes, and write-back to registers or memory. This structure enables single-cycle instruction issue per SIMD for wavefronts, with a typical latency of five cycles exposed for dependent operations, supported by hardware dependency checks to maintain throughput without stalling. Resource allocation in the CU emphasizes shared access for efficiency, including a 16 KB L0 vector cache per CU and a 16 KB L0 scalar cache per WGP for low-latency access, while a larger 128 KB L1 cache is shared across multiple CUs within a shader array to handle graphics and compute workloads. Fixed-function units, such as the texture processors and primitive assemblers, integrate directly with the CU's memory subsystem to support rasterization and texture operations without relying on programmable shaders. The CU's local data share (LDS) provides 64 KB of high-bandwidth memory per CU in compute mode, enabling efficient workgroup communication at up to 32 dwords per cycle.

Throughput metrics highlight the CU's balanced design, with the scalar unit able to execute operations in parallel with the vector pipes to accommodate control-heavy workloads, while vector throughput reaches 64 single-precision floating-point operations per cycle per CU. These CUs connect to the GPU's global memory hierarchy, optimized for high-bandwidth interfaces like GDDR6, ensuring seamless data flow for both graphics rendering and compute tasks across RDNA generations. Later iterations, such as RDNA 3, introduce dual-issue capabilities in the front-end for enhanced instruction dispatch without altering the core CU layout.
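These per-CU figures translate directly into peak-throughput estimates. A minimal sketch, counting a fused multiply-add as 2 FLOPs; the 1.9 GHz clock is an illustrative assumption (close to the RX 5700 XT's boost clock), not a figure from the text:

```python
# Peak FP32 estimate from the per-CU figures above: 64 FP32 lanes per CU,
# each retiring a fused multiply-add (2 FLOPs) per cycle.
def peak_tflops(cus: int, clock_ghz: float, lanes_per_cu: int = 64) -> float:
    return cus * lanes_per_cu * 2 * clock_ghz / 1000

# A full Navi 10 (40 CUs) at an illustrative 1.9 GHz boost clock:
print(round(peak_tflops(40, 1.9), 2))  # 9.73
```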

RDNA 1

Key innovations

The RDNA microarchitecture introduced scalar unit independence by providing each SIMD unit with its own dedicated scalar pipeline, separate from the vector units, which minimizes wavefront stalls during branch divergence and non-vector operations. This design doubles branch execution efficiency compared to GCN, enabling up to 2x better performance in control-flow-heavy workloads by allowing scalar instructions to execute without blocking the vector pipes. A revamped cache hierarchy enhances data access efficiency, featuring multi-level L0 and L1 caches with 16 KB per SIMD for the scalar cache (50% larger than the equivalent in GCN) and a 128 KB shared L1 cache per shader array to reduce pressure on the L2 cache. The L2 cache scales to 2 MB per memory controller, supporting higher bandwidth and lower latency for both graphics and compute tasks, contributing to overall throughput improvements. Hardware support for primitive shaders enables early primitive culling directly in the shader pipeline, processing and discarding off-screen or back-facing geometry before full rasterization, which reduces overhead by up to 2x compared to prior generations. This is complemented by enhanced asynchronous compute capabilities through a dedicated scheduling feature called Asynchronous Compute Tunneling, which allows compute workloads to interleave with graphics work without stalling the pipeline, improving resource utilization in mixed workloads. Fabricated on TSMC's 7 nm node, RDNA targets 1.5x instructions per clock (IPC) over GCN at the same clock speed, achieved through these architectural refinements and optimized power delivery for higher efficiency. These innovations first appeared in chips like the Navi 10 GPU.

Implemented chips

The primary discrete GPU implementations of the RDNA 1 microarchitecture are based on two main dies, Navi 10 and Navi 14, both fabricated on TSMC's 7 nm process node. These chips target the mainstream to high-end desktop graphics market, focusing on gaming performance improvements over GCN-based products.

The Navi 10 die serves as the foundation for AMD's higher-end RDNA 1 offerings in the Radeon RX 5700 series, including the RX 5700 XT and RX 5700 models. Featuring a die size of 251 mm² and 10.3 billion transistors, it supports up to 40 compute units (2,560 stream processors) in its fully enabled configuration. The RX 5700 XT pairs this die with 8 GB of GDDR6 memory on a 256-bit interface, achieving a memory bandwidth of 448 GB/s, and is positioned for high-end desktop gaming at 1440p and 4K resolutions. Launched on July 7, 2019, the RX 5700 XT carried a starting price of $399, while the RX 5700 variant, with 36 compute units and slightly reduced clocks, started at $349.

In contrast, the Navi 14 die targets entry-to-mid-range segments with a more compact design, measuring 158 mm² and containing 6.4 billion transistors. It supports up to 22 compute units (1,408 stream processors) in configurations like the Radeon RX 5500 XT, with 4 GB or 8 GB of GDDR6 memory on a 128-bit bus. This die enables flexible binning for diverse SKUs such as the RX 5500 series for budget gamers. Released on October 7, 2019 (with mobile variants earlier, in August), these cards emphasize efficient power use in mainstream gaming. The RX 5500 XT 8 GB model started at $199. Navi 12, a less common die with approximately 8.1 billion transistors and up to 36 compute units, was primarily used in professional and mobile products like the Radeon Pro 5600M and mining cards, rather than consumer discrete GPUs.
Die | Transistors | Process node | Max compute units | Target cards | Memory config | Launch | Starting price
Navi 10 | 10.3 billion | 7 nm | 40 | RX 5700 XT / RX 5700 | 8 GB GDDR6 (256-bit) | July 2019 | $349–$399
Navi 14 | 6.4 billion | 7 nm | 22 | RX 5500 XT / RX 5500 series | 4–8 GB GDDR6 (128-bit) | October 2019 | $169–$199
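The 448 GB/s figure quoted for the RX 5700 XT follows from the bus width and the per-pin data rate; the 14 Gbps GDDR6 rate is an assumption here, though it is the value consistent with the quoted bandwidth:

```python
# Memory bandwidth = (bus width in bytes) * (per-pin data rate in Gbps).
def bandwidth_gbs(bus_bits: int, gbps_per_pin: float) -> float:
    return bus_bits / 8 * gbps_per_pin

print(bandwidth_gbs(256, 14.0))  # 448.0 -> RX 5700 XT's 256-bit GDDR6
print(bandwidth_gbs(128, 14.0))  # 224.0 -> a 128-bit Navi 14 card
```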

RDNA 2

Rendering and acceleration features

RDNA 2 introduces dedicated hardware accelerators to enhance real-time rendering capabilities, particularly for gaming workloads. Central to these improvements are Ray Accelerators, one integrated per compute unit (CU), designed to accelerate ray-triangle intersection tests and bounding volume hierarchy (BVH) traversal for ray-traced effects such as shadows, reflections, and global illumination. This fixed-function hardware offloads ray tracing computations from the general-purpose shaders, enabling efficient hybrid rendering pipelines that combine rasterization with ray-traced elements. To improve geometry processing and shading efficiency, RDNA 2 supports mesh shaders and task shaders, which allow developers to replace the traditional vertex and geometry shader stages with more flexible, programmable ones. These features, part of DirectX 12 Ultimate, enable coarser-grained culling and geometry amplification, reducing overhead in complex scenes with high polygon counts. Complementing this is hardware-accelerated Variable Rate Shading (VRS) at Tier 2, supporting coarse shading rates including 2x2 and 4x4 pixels per shading sample to minimize computation in less visually critical areas, such as peripheral regions, without significant quality loss. This combination can reduce the pixel shading workload in targeted scenarios, improving frame rates in demanding titles. Memory access efficiency is bolstered by the Infinity Cache (up to 128 MB in high-end configurations), a large on-die L3-like structure that acts as a high-bandwidth pool for spatial and temporal data reuse, effectively more than doubling bandwidth compared to a traditional memory bus of the same width in prior architectures. Additionally, sampler feedback hardware records texture sampling patterns, enabling streaming of only the texture data actually needed into VRAM, which optimizes memory usage and reduces latency in texture-heavy scenes. For better multi-tasking, RDNA 2 separates the graphics and compute pipelines, allowing primitive assembly and asynchronous compute workloads to execute concurrently without stalling the graphics queue. 
This decoupling enhances utilization during mixed workloads, such as ray tracing BVH construction alongside draw calls, by dispatching instructions through independent paths to the shader arrays.
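The intersection primitive these Ray Accelerators evaluate in hardware can be illustrated in software. The sketch below is a plain Python illustration, not anything RDNA-specific: it shows the standard slab test for a ray against an axis-aligned bounding box, the per-node check performed during BVH traversal.

```python
# Illustrative slab test: ray vs. axis-aligned bounding box (AABB).
# This is the kind of intersection primitive that RDNA 2's Ray
# Accelerators evaluate in fixed-function hardware during BVH traversal.

def ray_aabb_intersect(origin, inv_dir, box_min, box_max):
    """Return True if the ray hits the box for some t >= 0 (slab method).

    inv_dir holds the reciprocal of each direction component; zero
    components become infinity, which the slab method handles naturally.
    """
    t_near, t_far = 0.0, float("inf")
    for axis in range(3):
        t1 = (box_min[axis] - origin[axis]) * inv_dir[axis]
        t2 = (box_max[axis] - origin[axis]) * inv_dir[axis]
        t_near = max(t_near, min(t1, t2))
        t_far = min(t_far, max(t1, t2))
    return t_near <= t_far

# A ray along +x from the origin hits a unit box centred at (5, 0, 0):
origin = (0.0, 0.0, 0.0)
inv_dir = (1.0, float("inf"), float("inf"))  # direction (1, 0, 0)
hit = ray_aabb_intersect(origin, inv_dir, (4.5, -0.5, -0.5), (5.5, 0.5, 0.5))
```

In a real BVH, this test is run against each child node of the current node; children that pass are pushed onto a traversal stack, and leaves trigger ray-triangle tests, which RDNA 2 also accelerates in hardware.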

Integrated chips and console applications

The RDNA 2 architecture powered several discrete GPU dies under the Navi branding, targeting a range of performance segments in AMD's Radeon RX 6000 series. The flagship Navi 21 die, fabricated on TSMC's 7 nm process with 26.8 billion transistors and a die area of 520 mm², supported up to 80 compute units (CUs) in its full configuration, as seen in the Radeon RX 6900 XT graphics card. The mid-range Navi 22 die, also on 7 nm with 17.2 billion transistors and a 335 mm² die size, featured 40 CUs and powered cards like the Radeon RX 6700 XT. Lower-tier options included the Navi 23 on 7 nm with 11.1 billion transistors and a 237 mm² die, delivering 32 CUs for the Radeon RX 6600 XT, and the entry-level Navi 24 on a refined 6 nm process with 5.4 billion transistors and a 107 mm² die, providing 16 CUs in the Radeon RX 6500 XT.
Die | Process | Transistors (billions) | Die size (mm²) | Max CUs | Example product
Navi 21 | 7 nm | 26.8 | 520 | 80 | RX 6900 XT
Navi 22 | 7 nm | 17.2 | 335 | 40 | RX 6700 XT
Navi 23 | 7 nm | 11.1 | 237 | 32 | RX 6600 XT
Navi 24 | 6 nm | 5.4 | 107 | 16 | RX 6500 XT
Integrated variants of RDNA 2 appeared in AMD's mobile APUs, particularly the Ryzen 6000 series ("Rembrandt"), where scaled-down GPU configurations provided up to 12 CUs for onboard graphics such as the Radeon 680M, enabling efficient gaming without a discrete card. These integrated solutions balanced power efficiency with performance, supporting features such as hardware-accelerated ray tracing for enhanced visual effects in supported applications. RDNA 2 also found significant adoption in gaming consoles through custom implementations of the architecture. The PlayStation 5, launched in November 2020, features a tailored AMD GPU with 36 CUs clocked at up to 2.23 GHz, delivering a peak of approximately 10.28 TFLOPS for rasterization and ray tracing workloads. Similarly, the Xbox Series X, also released in November 2020, uses a custom RDNA 2-based GPU with 52 CUs at 1.825 GHz, achieving roughly 12.15 TFLOPS to drive high-fidelity gaming, while the Xbox Series S features 20 CUs at up to 1.565 GHz for about 4 TFLOPS. These console designs incorporate optimizations such as variable rate shading and mesh shaders, enabling consistent performance at resolutions up to 4K and frame rates up to 120 frames per second in optimized titles.
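The quoted console figures follow directly from the standard peak-throughput formula for RDNA 2: compute units x 64 shader lanes x 2 FLOPs per fused multiply-add x clock speed. A small Python check (illustrative only):

```python
# Peak FP32 throughput for an RDNA 2 GPU:
# CUs x 64 lanes x 2 FLOPs (fused multiply-add) x clock (GHz) / 1000.
def peak_fp32_tflops(cus, clock_ghz, lanes=64, flops_per_lane=2):
    return cus * lanes * flops_per_lane * clock_ghz / 1000.0

ps5 = peak_fp32_tflops(36, 2.230)  # ≈ 10.28 TFLOPS (PlayStation 5)
xsx = peak_fp32_tflops(52, 1.825)  # ≈ 12.15 TFLOPS (Xbox Series X)
xss = peak_fp32_tflops(20, 1.565)  # ≈ 4.01 TFLOPS (Xbox Series S)
```

Note these are theoretical peaks assuming every lane retires one FMA per cycle; sustained game performance is typically well below them.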

RDNA 3

Chiplet design and scaling

The RDNA 3 microarchitecture marked AMD's transition to a chiplet-based design for discrete graphics processing units (GPUs), enabling modular scaling and improved manufacturing yields by breaking up the monolithic die structure used in prior generations. The approach draws on AMD's experience with chiplet integration in its Zen CPU architectures, adapted to graphics workloads. The core components consist of a Graphics Compute Die (GCD), fabricated on TSMC's 5 nm process node and containing the compute units (CUs) responsible for shading, rasterization, and geometry processing. The GCD connects via AMD's Infinity Fabric interconnect to Memory Cache Dies (MCDs) on TSMC's 6 nm node, which house the memory controllers and large last-level cache pools. In high-end configurations such as the Navi 31 GPU, a single GCD integrates up to 96 CUs and is paired with six MCDs to form a cohesive package using advanced fan-out packaging for low-latency inter-die communication. Scaling is achieved by varying the number of MCDs, ranging from four in mid-range dies to six in flagship models, allowing memory bandwidth and cache capacity to be adjusted without redesigning the core compute logic. Each CU in RDNA 3 features dual-issue dispatch, enabling its two SIMD32 units to issue a second FP32 instruction per cycle, which doubles peak FP32 throughput per CU compared to RDNA 2's single-issue design. This architectural shift also supports higher clock frequencies, up to 15% above RDNA 2 levels, while maintaining power efficiency through optimized die partitioning. A key enhancement is the second-generation Infinity Cache, implemented across the MCDs at 16 MB per die, totaling up to 96 MB in the Navi 31-based RX 7900 XTX. This distributed last-level cache reduces pressure on inter-die bandwidth by caching frequently accessed data locally, minimizing trips to GDDR6 memory and effectively lowering external bandwidth demands by approximately 20% in gaming workloads. 
Building on RDNA 2's introduction of Infinity Cache as an on-die solution, RDNA 3 integrates the cache natively into the MCDs for seamless multi-die operation. The mixed-process strategy, 5 nm for the compute-intensive GCD and 6 nm for the I/O-focused MCDs, targets a more than 50% improvement in performance per watt over RDNA 2, achieved through better silicon utilization and reduced power overhead in the interconnect, which consumes less than 5% of the total GPU power budget while providing up to 3.5 TB/s of bidirectional bandwidth.
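The effect of dual-issue dispatch on peak throughput can be checked arithmetically. The sketch below assumes a 2.5 GHz boost clock for the 96-CU RX 7900 XTX (an assumption; actual clocks vary by SKU) and shows how the dual-issue factor doubles the RDNA 2-style figure:

```python
# Peak FP32 throughput with RDNA 3's dual-issue SIMD32 pipes.
# 2.5 GHz for the RX 7900 XTX is an assumed boost clock for illustration.
def peak_fp32_tflops(cus, clock_ghz, dual_issue=True):
    lanes, fma = 64, 2                 # 64 lanes per CU, 2 FLOPs per FMA
    issue = 2 if dual_issue else 1     # dual-issue doubles FP32 per clock
    return cus * lanes * fma * issue * clock_ghz / 1000.0

rdna3_peak = peak_fp32_tflops(96, 2.5)         # ≈ 61.4 TFLOPS
single_issue = peak_fp32_tflops(96, 2.5, False)  # ≈ 30.7 without dual issue
```

The dual-issue peak is only reachable when the compiler can pair compatible instructions, which is why real-world gains are smaller than the 2x theoretical figure.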

Implemented chips

The primary discrete GPU implementations of the RDNA 3 microarchitecture are based on three main dies: Navi 31, Navi 32, and Navi 33, fabricated using TSMC's 5 nm and 6 nm process nodes. These chips power the Radeon RX 7000 series, targeting desktop gaming from entry-level to high-end segments with enhancements in ray tracing, AI acceleration, and efficiency. The Navi 31 die serves as the foundation for AMD's flagship offerings in the RX 7900 series, including the RX 7900 XTX, RX 7900 XT, and RX 7900 GRE models. The multi-chip module (MCM) features approximately 58 billion transistors across one 5 nm GCD (about 300 mm²) and six 6 nm MCDs, supporting up to 96 compute units (6,144 stream processors). The RX 7900 XTX pairs this with 24 GB of GDDR6 on a 384-bit interface, achieving up to 960 GB/s of bandwidth, and is positioned for high-end 4K gaming. Launched on December 13, 2022, the RX 7900 XTX carried a starting price of $999, while the RX 7900 XT (20 GB, 320-bit) started at $899 and the RX 7900 GRE (16 GB, 256-bit) at $549 in 2023, with lower-tier variants enabled through die binning. The Navi 32 die targets mid-range segments with a design similar to Navi 31 but scaled down, featuring one 5 nm GCD (about 200 mm²) and four 6 nm MCDs for a total of roughly 28 billion transistors. It supports up to 60 compute units (3,840 stream processors) in configurations such as the Radeon RX 7800 XT, with 16 GB of GDDR6 on a 256-bit bus. The same die also powers the RX 7700 XT (54 CUs, 12 GB, 192-bit). Released on September 6, 2023, the RX 7800 XT started at $499 and the RX 7700 XT at $449, emphasizing 1440p gaming performance. In contrast, the Navi 33 die is a monolithic design on TSMC's 6 nm node, measuring 204 mm² with 13.3 billion transistors. It supports 32 compute units (2,048 stream processors) in the RX 7600, with 8 GB of GDDR6 on a 128-bit interface (up to 288 GB/s of bandwidth); a 16 GB RX 7600 XT variant launched in January 2024. Released on May 24, 2023, the RX 7600 started at $269 and the XT at $329, focusing on efficient 1080p gaming. 
RDNA 3 is also integrated into consumer APUs like the Ryzen 7000 series (up to 12 CUs) and handheld devices such as the Asus ROG Ally, but the architecture's primary market focus for discrete GPUs is gaming with AI and ray tracing features. As of November 2025, no console implementations have been announced.
Die | Transistor count (MCM) | Process node | Max compute units | Target cards | Memory config | Launch date | Starting price
Navi 31 | ~58 billion | 5 nm / 6 nm | 96 | RX 7900 XTX / XT / GRE | 16–24 GB GDDR6 (256–384-bit) | Dec 2022 / 2023 | $549–$999
Navi 32 | ~28 billion | 5 nm / 6 nm | 60 | RX 7800 XT / 7700 XT | 12–16 GB GDDR6 (192–256-bit) | Sep 2023 | $449–$499
Navi 33 | 13.3 billion | 6 nm | 32 | RX 7600 / 7600 XT | 8–16 GB GDDR6 (128-bit) | May 2023 / Jan 2024 | $269–$329
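The bandwidth figures quoted for these cards follow from the usual GDDR6 relation: bus width in bytes multiplied by the per-pin data rate. A quick Python illustration (the 20 and 18 Gbps data rates are the commonly reported values for these cards):

```python
# GDDR6 bandwidth = (bus width in bits / 8) x per-pin data rate (Gbps),
# giving GB/s. Data rates below are the commonly reported retail values.
def gddr6_bandwidth_gbs(bus_bits, gbps_per_pin):
    return bus_bits / 8 * gbps_per_pin

rx7900xtx = gddr6_bandwidth_gbs(384, 20)  # 960 GB/s on the 384-bit bus
rx7600 = gddr6_bandwidth_gbs(128, 18)     # 288 GB/s on the 128-bit bus
```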

RDNA 4

Modular design and AI enhancements

The RDNA 4 microarchitecture introduces a modular system-on-chip (SoC) design that enhances flexibility for mid-range graphics processing units (GPUs), building on the chiplet approach of RDNA 3 by enabling scalable configurations of shader engines and compute units. The flagship Navi 48 die, fabricated on TSMC's 4 nm process with a size of 356.5 mm² and 53.9 billion transistors, incorporates four shader engines, each containing eight dual-issue compute units (DCUs) for a total of 64 compute units, allowing for efficient monolithic integration while supporting half-die variants such as the Navi 44 for cost-effective smaller GPUs. This tile-based structure facilitates diverse SKU configurations, such as the Navi 44's two shader engines, reducing manufacturing overhead and enabling broader market coverage in the mid-range segment. A key enabler of this modularity is the implementation of out-of-order memory request handling, which permits requests from different shader waves to be completed non-sequentially, eliminating false dependencies that stalled prior architectures such as RDNA 3. This optimization improves overall memory subsystem efficiency, particularly in rasterization workloads where access patterns vary, contributing to smoother execution across diverse GPU configurations. RDNA 4 integrates second-generation AI accelerators into each compute unit to bolster AI capabilities, delivering up to 2x the generalized matrix multiply (GEMM) throughput in FP16 compared to RDNA 3, with 1,024 FLOPS per clock per compute unit. These accelerators, comprising four matrix acceleration engines (MAEs) per DCU for a total of 128 across the Navi 48, also support INT8 operations at up to 8x the throughput of RDNA 3, enabling efficient on-device AI inference and machine learning tasks. 
This hardware facilitates advanced features such as FidelityFX Super Resolution 4 (FSR 4), an AI-based upscaler that leverages multilayer perceptron (MLP) models for enhanced image quality in upscaling and frame generation, running optimally on RDNA 4's matrix accelerators. Ray tracing receives significant upgrades in RDNA 4 through third-generation ray accelerators, achieving over 2x the ray-triangle intersection and traversal throughput per compute unit relative to RDNA 3, thanks to dual intersection engines per ray accelerator. Each DCU now includes two such accelerators, totaling 64 on the Navi 48, which support 8-wide bounding volume hierarchy (BVH) nodes for fewer traversal steps and oriented bounding boxes (OBBs) via predefined transformation matrices to accelerate complex scene handling. These enhancements also improve BVH efficiency through primitive compression and parallel ray-testing instructions such as IMAGE_BVH_DUAL_INTERSECT_RAY, enabling better performance in ray-traced titles such as Cyberpunk 2077. Supporting these features, RDNA 4 increases the L2 cache to 8 MB per GPU die, enhancing data locality for AI and ray tracing workloads, while a 64 MB Infinity Cache sustains high-bandwidth demands. The memory configuration standardizes on 16 GB of GDDR6 on a 256-bit interface, with support for speeds up to 20 Gbps, prioritizing cost efficiency over a shift to GDDR7 in this generation. Additionally, the Infinity Fabric interconnect sees optimized bandwidth usage, reducing overall requirements by approximately 25% through compression and scheduling improvements, further aiding modular scalability.
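The benefit of 8-wide BVH nodes can be seen from traversal depth alone: a complete b-ary tree over n leaves has worst-case depth ceil(log_b(n)), so wider nodes mean fewer serial traversal steps per ray, at the cost of more intersection tests per node, which the dual intersection engines absorb. A minimal Python sketch (idealized; real BVHs are not complete trees):

```python
import math

# Worst-case traversal depth of an idealized complete b-ary BVH over
# n leaves. Wider nodes (larger b) mean fewer serial steps per ray.
def bvh_depth(n_leaves, branching):
    return math.ceil(math.log(n_leaves, branching))

# One million triangles: RDNA 3-style 4-wide vs. RDNA 4-style 8-wide nodes.
four_wide = bvh_depth(1_000_000, 4)   # 10 levels
eight_wide = bvh_depth(1_000_000, 8)  # 7 levels
```

Since each traversal step is a dependent memory access, cutting depth from ten levels to seven directly reduces the latency chain per ray, independent of raw intersection throughput.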

Implemented chips

The primary discrete GPU implementations of the RDNA 4 microarchitecture are based on two main dies: Navi 48 and Navi 44, both fabricated on TSMC's 4 nm process node. These chips target the desktop graphics market, emphasizing mid-range performance with enhancements in ray tracing and AI acceleration for gaming workloads. The Navi 48 die serves as the foundation for AMD's higher-end RDNA 4 offerings in the Radeon RX 9070 series, including the RX 9070 XT and RX 9070 models. Featuring a die size of 357 mm² and approximately 53.9 billion transistors, it supports up to 64 compute units (4,096 stream processors) in its fully enabled configuration. The RX 9070 XT pairs this die with 16 GB of GDDR6 memory on a 256-bit interface, achieving a memory bandwidth of up to 640 GB/s, and is positioned for mid-to-high-end desktop gaming at 1440p and entry-level 4K resolutions. The RX 9070 features 56 compute units (3,584 stream processors) via binning, with slightly reduced clocks. Launched in March 2025, the RX 9070 XT carries a starting price of $599, while the RX 9070 variant starts at $549, enabling cost-effective variants through modular binning of the die. In contrast, the Navi 44 die targets entry-to-mid-range segments with a more compact design, measuring 199 mm² and containing about 29.7 billion transistors. It supports 32 compute units (2,048 stream processors) in configurations such as the Radeon RX 9060 XT, with options for 8 GB or 16 GB of GDDR6 on a 128-bit bus. This die leverages RDNA 4's modular architecture to offer flexible half-die options, allowing AMD to produce diverse SKUs such as an anticipated RX 9050 series for budget-conscious gamers. Released starting in May 2025, these cards emphasize efficient power use and AI-enhanced features for mainstream gaming without entering high-end territory. As of late 2025, RDNA 4 has not been integrated into consumer desktop APUs, with the Ryzen 9000G series instead employing RDNA 3.5 graphics with up to 16 compute units for hybrid CPU-GPU workloads. 
The architecture's market focus remains on AI-accelerated desktop gaming, with no official announcements for console implementations.
Die | Transistor count | Process node | Max compute units | Target cards | Memory config | Launch date | Starting price
Navi 48 | 53.9 billion | 4 nm | 64 | RX 9070 XT / RX 9070 | 16 GB GDDR6 (256-bit) | March 2025 | $549–$599
Navi 44 | 29.7 billion | 4 nm | 32 | RX 9060 XT / RX 9050 series | 8–16 GB GDDR6 (128-bit) | May 2025 | ~$250–$350

Generational comparisons

Performance and efficiency metrics

The RDNA microarchitecture has demonstrated consistent improvements in performance per watt across generations, enabling higher computational throughput at comparable power levels. The first-generation RDNA, introduced in 2019, achieved approximately 1.5 times the performance per watt of the preceding GCN architecture, primarily through enhanced instruction throughput and reduced latency in the compute units. Subsequent iterations built on this foundation: RDNA 2 delivered about 50% better performance per watt than RDNA 1, thanks to architectural refinements such as higher clock speeds at similar power and the bandwidth-saving Infinity Cache. RDNA 3 further advanced efficiency with a claimed 54% uplift over RDNA 2, incorporating dual-issue compute units and chiplet-based scaling to balance performance gains with power constraints. For RDNA 4, launched in early 2025, AMD reported up to 40% improvements in rasterization performance and over 2x improvements in ray tracing throughput per compute unit, attributed to architectural enhancements including third-generation ray accelerators and higher clock speeds, though overall performance-per-watt gains were moderated by the initial product lineup's mid-range focus. Theoretical peak performance, measured in teraflops (TFLOPS) of single-precision floating-point operations, illustrates the scaling trajectory of RDNA flagship GPUs, though real-world efficiency varies with architectural changes. Representative high-end models show rapid growth: the RDNA 1-based Radeon RX 5700 XT peaked at 9.75 TFLOPS, the RDNA 2-based RX 6900 XT at 23 TFLOPS, and the RDNA 3-based RX 7900 XTX at 61.4 TFLOPS, reflecting increases in compute unit counts and clock rates alongside dual-issue capability in later generations. The RDNA 4-based RX 9070 XT, positioned as a mid-to-high-end offering, reaches 48.7 TFLOPS, a figure lower than RDNA 3's flagship owing to a smaller die and an emphasis on power efficiency rather than raw compute density. 
Infinity Cache, introduced in RDNA 2 and refined in subsequent generations, significantly mitigates memory bandwidth limitations by providing a large on-package last-level cache that effectively multiplies bandwidth for cache-hit scenarios in gaming workloads. This technology allowed RDNA 2 and later architectures to achieve up to 2.17 times the effective memory bandwidth of a traditional design at the same physical bus width, reducing DRAM accesses and latency. Power draw for high-end discrete RDNA GPUs has stayed within a broadly similar envelope across generations, with the RX 5700 XT at 225 W, the RX 6900 XT at 300 W, the RX 7900 XTX at 355 W, and the RX 9070 XT at 304 W, enabling sustained performance gains without proportional increases in thermal demands.
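The Infinity Cache amplification claim can be approximated with a simple hit-rate model: if a fraction h of memory requests hit the on-die cache, only (1 - h) of the traffic reaches DRAM, amplifying effective bandwidth by 1/(1 - h). The hit rate below is an assumption chosen to reproduce AMD's ~2.17x figure, not a published number:

```python
# Simple hit-rate model of Infinity Cache bandwidth amplification.
# Assumption: cache hits are served entirely on-die, so only the miss
# fraction (1 - h) consumes DRAM bandwidth.
def effective_bandwidth_gbs(dram_gbs, hit_rate):
    return dram_gbs / (1.0 - hit_rate)

# RX 6900 XT: 512 GB/s raw (256-bit GDDR6 at 16 Gbps). A ~54% hit rate
# (assumed) yields roughly the claimed ~2.17x effective bandwidth.
rx6900xt_effective = effective_bandwidth_gbs(512, 0.54)  # ≈ 1113 GB/s
```

Real hit rates vary with resolution and workload; AMD's published figures were measured averages across games at 4K, so this model is only a first-order sanity check.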
Generation | Representative GPU | Peak FP32 TFLOPS | TDP (W)
RDNA 1 | RX 5700 XT | 9.75 | 225
RDNA 2 | RX 6900 XT | 23 | 300
RDNA 3 | RX 7900 XTX | 61.4 | 355
RDNA 4 | RX 9070 XT | 48.7 | 304

Feature progression

The RDNA microarchitecture has evolved through successive generations, introducing enhancements to rendering capabilities that prioritize efficiency and flexibility in real-time graphics. The first-generation RDNA, launched in 2019 with the Radeon RX 5000 series, incorporated primitive shaders as a compiler-controlled mechanism to optimize vertex and primitive handling, reducing overhead in the geometry pipeline compared to prior architectures. This feature enabled more efficient culling and distribution of primitives, laying the groundwork for advanced shading techniques. Building on this, RDNA 2, introduced in 2020 and powering GPUs such as the Radeon RX 6000 series as well as consoles such as the PlayStation 5 and Xbox Series X/S, added mesh shaders for programmable geometry generation and Variable Rate Shading (VRS) to dynamically adjust shading rates based on screen content, allowing developers to allocate compute resources more effectively in complex scenes. Subsequent iterations further refined rendering throughput. RDNA 3, debuting in 2022 with the Radeon RX 7000 series, implemented dual-issue shader arithmetic logic units (ALUs), enabling the execution of two independent instructions per cycle within each compute unit, which boosted scalar and vector processing for denser workloads. This design shared resources across rendering pipelines, enhancing overall utilization without a proportional increase in die area. In RDNA 4, announced in 2025 for the Radeon RX 9000 series, out-of-order memory request handling was introduced, permitting the memory subsystem to reorder accesses for better latency tolerance, particularly benefiting irregular access patterns in modern rendering while also aiding rasterization pipelines. Ray tracing support represents a key area of progression, transitioning from software-only emulation to dedicated hardware acceleration. RDNA 1 lacked native ray tracing hardware, relying on compute shaders for any such effects, which limited real-time viability in games. 
RDNA 2 marked the debut of first-generation ray accelerators integrated into each compute unit, supporting DirectX Raytracing (DXR) and enabling hardware-accelerated intersection tests and bounding volume hierarchy (BVH) traversal for realistic lighting and reflections. The second generation in RDNA 3 improved upon this with tighter integration with the compute resources, delivering up to 2x overall ray tracing performance compared to RDNA 2 through enhanced traversal efficiency and dual-issue capabilities. RDNA 4 advances to third-generation ray tracing hardware with enhanced BVH compression and traversal efficiency, alongside AI-driven denoising to mitigate noise in path-traced renders, enabling higher-fidelity real-time effects such as path tracing at playable frame rates. These accelerators are optimized for sparse-sample workloads, where AI denoising reconstructs clean images from few ray samples per pixel, bridging the gap with rasterization performance. AI and machine learning capabilities emerged later in the lineage, focusing on dedicated accelerators for upscaling and inference tasks. RDNA 3 introduced first-generation AI accelerators within the compute units, providing specialized instructions for matrix operations and delivering up to 2.7 times the throughput of prior software-based approaches, primarily for features such as FidelityFX Super Resolution (FSR). RDNA 4 elevates this with second-generation AI accelerators, optimized for low-precision formats such as FP8 and INT8 and offering up to 8x INT8 throughput, enabling general matrix multiply (GEMM) operations at scale for advanced AI workloads, including FSR 4, which leverages neural networks for temporal upscaling and frame generation with reduced artifacts. Scalability strategies have shifted to balance performance and manufacturing yields. RDNA 1 and RDNA 2 employed monolithic dies, integrating all compute, memory, and I/O elements on a single chip for low-latency coherence, though this constrained maximum die sizes due to defect rates on leading-edge nodes. 
RDNA 3 pioneered a chiplet-based design, separating the graphics compute die (GCD) from the memory cache dies (MCDs) connected via Infinity Fabric, allowing modular scaling up to 96 compute units in high-end configurations such as the RX 7900 XTX while improving cost efficiency. RDNA 4 returns to a monolithic approach for its mid-to-high-end GPUs, using single-die construction with up to 64 compute units to optimize clock speeds and power delivery, while reserving chiplet techniques for future integrated or enterprise variants.
