RDNA (microarchitecture)
| Release date | July 7, 2019[1] |
|---|---|
| Designed by | AMD |
| Fabrication process | |
| History | |
| Predecessor | Graphics Core Next 5 |
| Support status | |
| Supported | |

RDNA (Radeon DNA[2][3]) is a graphics processing unit (GPU) microarchitecture and accompanying instruction set architecture developed by AMD. It is the successor to their Graphics Core Next (GCN) microarchitecture/instruction set. The first product lineup featuring RDNA was the Radeon RX 5000 series of video cards, launched on July 7, 2019.[1][4] The architecture is also used in mobile products.[5] The first RDNA chips, used in the Navi series of AMD Radeon graphics cards, are fabricated on TSMC's N7 FinFET process.[6]
The second iteration of RDNA was first featured in the PlayStation 5[7][8] and Xbox Series X/S consoles.[9] Both consoles utilize a custom RDNA 2-based graphics solution as the basis for their GPU microarchitecture. On PC, RDNA 2 is featured in the Radeon RX 6000 series of video cards, which first launched in November 2020.[10] RDNA 2 is also featured in Samsung's Exynos 2200 as the graphics architecture.[11]
The third iteration of RDNA was announced on November 3, 2022, and is featured in the Radeon RX 7000 series of consumer desktop and mobile graphics cards.[12]
The fourth and final iteration of RDNA was unveiled on January 6, 2025 at CES[13] and is used in the Radeon RX 9000 series of desktop graphics cards.
Instruction set
AMD's GPUOpen website hosts PDF documents aiming to describe the environment, the organization and the program state of RDNA devices. They detail the instruction set and the microcode formats native to this family of processors that are accessible to programmers and compilers.[14]
Documentation is available for:
RDNA 1
| Release date | July 7, 2019 |
|---|---|
| Codename | Navi 1x |
| Fabrication process | TSMC N7 |
| History | |
| Predecessor | Graphics Core Next 5 |
| Successor | RDNA 2 |
| Support status | |
| Supported | |
RDNA 1 (also RDNA1)[15][16] is the first implementation of the RDNA microarchitecture and is the successor to the Radeon RX Vega series.[17][18] The launch occurred on July 7, 2019.[19]
Architecture
The architecture features a new processor design, although the first details released at AMD's Computex keynote hint that aspects of the previous Graphics Core Next (GCN) architecture remain present for backwards compatibility. This is especially important for its use (in the form of RDNA 2) in the major ninth-generation game consoles (the Xbox Series X/S and PlayStation 5), which need to preserve native compatibility with their eighth-generation game libraries designed for GCN. RDNA features a multi-level cache hierarchy and an improved rendering pipeline, with support for GDDR6 memory.
Starting with the architecture itself, one of the biggest changes for RDNA is the width of a wavefront, the fundamental group of work. GCN in all of its iterations was 64 threads wide, meaning 64 threads were bundled together into a single wavefront for execution. RDNA drops this to a native 32 threads wide. At the same time, AMD has expanded the width of their SIMDs from 16 slots to 32 (aka SIMD32), meaning the size of a wavefront now matches the SIMD size.[5]: 2
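The relationship between wavefront width and SIMD width can be illustrated with a small calculation. The sketch below is a simplified model, not AMD's scheduler: it assumes that a vector instruction occupies a SIMD for as many cycles as it takes to feed every thread of the wavefront through the available lanes, and the function name `issue_cycles` is made up for illustration.

```python
import math

def issue_cycles(wave_size: int, simd_width: int) -> int:
    """Cycles a SIMD needs to process one vector instruction for a whole
    wavefront, assuming one lane handles one thread per cycle (a
    simplification that ignores latency, dependencies and dual issue)."""
    return math.ceil(wave_size / simd_width)

# GCN: 64-thread wavefront on a 16-lane SIMD
gcn_cycles = issue_cycles(wave_size=64, simd_width=16)

# RDNA: native 32-thread wavefront on a 32-lane SIMD
rdna_wave32_cycles = issue_cycles(wave_size=32, simd_width=32)

# RDNA running a 64-thread wavefront on the same 32-lane SIMD
rdna_wave64_cycles = issue_cycles(wave_size=64, simd_width=32)

if __name__ == "__main__":
    print("GCN wave64 on SIMD16:", gcn_cycles, "cycles")
    print("RDNA wave32 on SIMD32:", rdna_wave32_cycles, "cycle")
    print("RDNA wave64 on SIMD32:", rdna_wave64_cycles, "cycles")
```

Under this model, matching the wavefront size to the SIMD width is what lets a wavefront complete a vector instruction in a single pass instead of four.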
RDNA also introduces working primitive shaders. While the feature was present in the hardware of the Vega architecture, it was difficult to obtain a real-world performance boost from it, and thus AMD never enabled it. Primitive shaders in RDNA are compiler-controlled.[5]: 2
The display controller in RDNA has been updated to support Display Stream Compression 1.2a, allowing output in 4K@240 Hz, HDR 4K@120 Hz, and HDR 8K@60 Hz.[5]: 2 [20]
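A rough calculation shows why stream compression is needed for these modes. The sketch below computes uncompressed pixel data rates under several assumptions that are not stated in the article: 8 bits per channel for the non-HDR mode, 10 bits per channel for the HDR modes, RGB without chroma subsampling, blanking intervals ignored, and a link budget of 25.92 Gbit/s (the payload rate of a four-lane DisplayPort 1.4 HBR3 link, assumed here as the output).

```python
# Assumed link budget: DisplayPort 1.4 HBR3, four lanes, payload rate.
LINK_PAYLOAD_GBPS = 25.92

def raw_video_gbps(width, height, refresh_hz, bits_per_channel, channels=3):
    """Uncompressed pixel data rate in Gbit/s, ignoring blanking intervals."""
    return width * height * refresh_hz * bits_per_channel * channels / 1e9

# (width, height, refresh rate in Hz, assumed bits per channel)
MODES = {
    "4K @ 240 Hz": (3840, 2160, 240, 8),
    "HDR 4K @ 120 Hz": (3840, 2160, 120, 10),
    "HDR 8K @ 60 Hz": (7680, 4320, 60, 10),
}

def required_compression(mode: str) -> float:
    """Minimum compression ratio needed to fit the mode in the assumed link."""
    return raw_video_gbps(*MODES[mode]) / LINK_PAYLOAD_GBPS

if __name__ == "__main__":
    for name in MODES:
        rate = raw_video_gbps(*MODES[name])
        print(f"{name}: {rate:.1f} Gbit/s raw, "
              f"needs at least {required_compression(name):.2f}:1 compression")
```

Under these assumptions every listed mode exceeds the uncompressed link budget, which is consistent with the modes being described as enabled by Display Stream Compression.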
Differences between GCN and RDNA
There are architectural changes which affect how code is scheduled:
- Single cycle instruction issue:
- GCN issued one instruction per wave once every 4 cycles.
- RDNA issues instructions every cycle.
- Wave32:
- GCN used a wavefront size of 64 threads (work items).
- RDNA supports both wavefront sizes of 32 and 64 threads.
- Workgroup Processors:
- GCN grouped the shader hardware into "compute units" (CUs) which contained scalar ALUs and vector ALUs, LDS and memory access. One CU contains 4 SIMD16s which share one path to memory.
- RDNA introduced the "workgroup processor" ("WGP"). The WGP replaces the compute unit as the basic unit of shader computation hardware/computing. One WGP encompasses 2 CUs. This allows significantly more compute power and memory bandwidth to be directed at a single workgroup.
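The grouping described in the list above can be written down as a small model. This is an illustrative sketch only: the class name `ShaderConfig` is hypothetical, and the figure of two SIMD32 units per CU is an assumption taken from descriptions of the RDNA compute unit rather than from this list.

```python
from dataclasses import dataclass

CUS_PER_WGP = 2      # from the list above: one WGP encompasses 2 CUs
SIMDS_PER_CU = 2     # assumption: two SIMD32 units per RDNA CU
LANES_PER_SIMD = 32  # SIMD32

@dataclass(frozen=True)
class ShaderConfig:
    """Hypothetical helper that derives unit counts from the WGP count."""
    workgroup_processors: int

    @property
    def compute_units(self) -> int:
        return self.workgroup_processors * CUS_PER_WGP

    @property
    def stream_processors(self) -> int:
        return self.compute_units * SIMDS_PER_CU * LANES_PER_SIMD

# Example: a chip with 20 workgroup processors
example = ShaderConfig(workgroup_processors=20)

if __name__ == "__main__":
    print(example.compute_units, "CUs,", example.stream_processors, "stream processors")
```

With 20 workgroup processors the model gives 40 compute units and 2560 stream processors, which are the figures commonly quoted for a full Navi 10.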
Chips
Discrete GPUs:
- Navi 10 found on Radeon RX 5600, Radeon RX 5600 XT, Radeon RX 5600M, Radeon RX 5700, Radeon RX 5700M, Radeon RX 5700 XT, Radeon Pro 5700, Radeon Pro 5700 XT, Radeon Pro W5700X, and Radeon Pro W5700 graphics cards
- Navi 12 found on Radeon Pro V520 branded graphics card, Radeon Pro 5600M branded mobile graphics card and BC-160 mining card for cryptocurrency
- Navi 14 found on Radeon RX 5300, Radeon RX 5300 XT, Radeon Pro 5300, Radeon Pro W5300, Radeon RX 5500, Radeon RX 5500 XT, Radeon Pro 5500, Radeon Pro 5500 XT, and Radeon Pro W5500 branded graphics cards; Radeon RX 5300M, Radeon Pro 5300M, Radeon Pro W5300M, Radeon RX 5500M, Radeon Pro 5500M, and Radeon Pro W5500M branded mobile graphics cards
RDNA 2
| Release date | November 18, 2020 |
|---|---|
| Codename | Navi 2x |
| Fabrication process | |
| History | |
| Predecessor | RDNA 1 |
| Successor | RDNA 3 |
| Support status | |
| Supported | |
RDNA 2[21] (also RDNA2)[22] is the successor to the RDNA microarchitecture. It was first publicly announced in early 2020 with a projected release in Q4 2020.[22][23] According to statements from AMD, RDNA 2 would be a "refresh" of the RDNA architecture.[24]
More information about RDNA 2 was made public on AMD's Financial Analyst Day on March 5, 2020.[25][23][26] AMD claimed that it would provide a 50% performance-per-watt improvement over RDNA, with increases in clock speed and instructions-per-clock.[27] Additional features confirmed by AMD include real-time, hardware accelerated ray tracing, "Infinity Cache", mesh shaders, sampler feedback and variable rate shading.[27][10] The company announced that RDNA 2 would be used in next-generation gaming consoles and PC graphics cards[27] code-named "Navi 2X" and also nicknamed as "Big Navi".[27]
AMD unveiled the Radeon RX 6000 series, its next-gen RDNA 2 graphics cards, at an online event on October 28, 2020.[28][29] The lineup initially consisted of the RX 6800, RX 6800 XT and RX 6900 XT.[30][31] The RX 6800 and 6800 XT launched on November 18, 2020, with the RX 6900 XT being released on December 8, 2020.[10] Further variants, including a Radeon RX 6700 (XT) series based on Navi 22, launched later on March 18, 2021.[32][33][34][35]
On May 31, 2021, AMD launched the RX 6000M series of GPUs designed for laptops.[36][37] These include the RX 6600M, RX 6700M, and RX 6800M. These were made available beginning on June 1, 2021.[36]
On June 1, 2021, AMD's CEO Dr. Lisa Su and Tesla, Inc.'s CEO Elon Musk confirmed that the entertainment systems of Tesla's new Model S and Model X are powered by RDNA 2.[38] The same microarchitecture was also announced to be used for an upcoming flagship Samsung Exynos SoC,[39] later introduced in January 2022 as Exynos 2200, utilizing a custom Xclipse 920 GPU with 3 workgroup processors.[40][41]
An RDNA 2 integrated GPU with 2 compute units is included in the I/O die on AMD's Zen 4-based Ryzen 7000 Series CPUs.[42][43] According to AMD, the integrated RDNA 2 graphics in Ryzen 7000 are not intended for gaming but for diagnostic purposes and for video encode and decode capabilities.[44]
Chips
Discrete GPUs:
- Navi 21
- Navi 22
- Navi 23
- Navi 24
Integrated into APUs/CPUs:
- Rembrandt (as "Radeon 660M" and "Radeon 680M" models found on Ryzen 6000 series mobile APUs)
- Raphael (as "Radeon Graphics" branded iGPU found on Ryzen 7000 series desktop CPUs)
- Mendocino (as "Radeon 610M" model found on Ryzen 7020 series mobile APUs)
- Rembrandt-R (as "Radeon 660M" and "Radeon 680M" models found on Ryzen 7035 series mobile APUs)
- Dragon Range (as "Radeon 610M" model found on Ryzen 7045 series mobile APUs)
Usage in video game consoles
Custom configurations of the RDNA 2 graphics microarchitecture are used in the PlayStation 5[7][45] from Sony and in the Xbox Series X and Series S consoles[9] from Microsoft, with proprietary tweaks and different GPU modifications in each system's implementation. Valve announced on July 15, 2021, that their Steam Deck would feature the RDNA 2 architecture. The Steam Deck was released in February 2022.[46]
RDNA 3
| Release date | December 13, 2022 |
|---|---|
| Codename | Navi 3x |
| Fabrication process | |
| History | |
| Predecessor | RDNA 2 |
| Successor | RDNA 4 |
| Support status | |
| Supported | |
RDNA 3 (also RDNA3) is the successor to the RDNA 2 microarchitecture and was projected for a launch in Q4 2022 per AMD's gaming GPU roadmap.[47][48][49] At an August 29, 2022 reveal event for Ryzen 7000 series CPUs, AMD CEO Lisa Su teased RDNA 3 and revealed that it would utilize chiplets built on TSMC's N5 node.[50] On September 19, 2022, Sam Naffziger, a senior vice president at AMD, stated in a blog post that improvements made to the RDNA 3 microarchitecture allow for considerable gains in performance and efficiency, with an estimated 50% increase in performance-per-watt compared to the RDNA 2 microarchitecture.[51] Additionally, the RDNA 3 architecture features the next generation of Infinity Cache, a modified graphics pipeline, adaptive power management and rearchitected compute units, intended to improve rasterization and ray-tracing performance over the previous consumer architecture.[52]
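To put the quoted performance-per-watt figures in perspective, the following sketch compounds the two vendor claims mentioned in this article (about 50% for RDNA 2 over RDNA, and an estimated 50% for RDNA 3 over RDNA 2). These are AMD's own projections rather than measurements, and the function names are made up for illustration.

```python
def compound_gain(*fractional_gains: float) -> float:
    """Multiply successive generational gains, e.g. 0.5 means +50%."""
    total = 1.0
    for gain in fractional_gains:
        total *= 1.0 + gain
    return total

def relative_power_at_equal_performance(fractional_gain: float) -> float:
    """Fraction of the old power draw needed for the same performance."""
    return 1.0 / (1.0 + fractional_gain)

# Claimed gains: RDNA -> RDNA 2, then RDNA 2 -> RDNA 3
rdna3_vs_rdna1 = compound_gain(0.5, 0.5)

if __name__ == "__main__":
    print(f"RDNA 3 vs RDNA 1 (claimed, compounded): {rdna3_vs_rdna1:.2f}x")
    print(f"Power for equal performance after one +50% step: "
          f"{relative_power_at_equal_performance(0.5):.1%}")
```

Taken at face value, two successive 50% steps compound to 2.25 times the original performance per watt, and a single 50% step corresponds to roughly two thirds of the power for the same performance.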
On November 3, 2022, AMD unveiled the RX 7900 XTX and RX 7900 XT graphics cards, based on the RDNA 3 microarchitecture. These are the first commercial GPUs to be based on a multi-chip module (MCM) design.[53]
On October 5, 2023, and October 24, 2024, respectively, Samsung announced the Exynos 2400 and Exynos 1580, which use custom GPUs based on the RDNA 3 microarchitecture, the Xclipse 940 and Xclipse 540.[54][55]
Chips
Discrete GPUs:
- Navi 31 found on Radeon RX 7900 GRE, Radeon RX 7900 XT, Radeon RX 7900 XTX, Radeon Pro W7800 and Radeon Pro W7900 branded graphics cards; Radeon RX 7900M branded mobile graphics cards
- Navi 32 found on Radeon RX 7700 XT and Radeon RX 7800 XT branded graphics cards
- Navi 33 found on Radeon RX 7600, Radeon RX 7600 XT, Radeon Pro W7500 and Radeon Pro W7600 branded graphics cards; Radeon RX 7600S, Radeon RX 7600M, Radeon RX 7600M XT and Radeon RX 7700S branded mobile graphics cards
Integrated into APUs/CPUs:
- Phoenix (as "Radeon 740M", "Radeon 760M" and "Radeon 780M" models found on Ryzen 7040 series and Ryzen Z1 series mobile APUs)
- Hawk Point (as "Radeon 740M", "Radeon 760M" and "Radeon 780M" models found on Ryzen 8040 series mobile APUs)
Comparison of RDNA chips
| Microarchitecture | RDNA 1 | RDNA 2 | RDNA 3 | RDNA 4 | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Chip | Navi 10[56] | Navi 12[57] | Navi 14[58] | Navi 21[59] | Navi 22[60] | Navi 23[61] | Navi 24[62] | Navi 31[63][64] | Navi 32[65] | Navi 33[66] | Navi 44[67] | Navi 48[68] |
| Code name | Gaming | Fighter | Sienna Cichlid | Navy Flounder | Dimgrey Cavefish | Beige Goby | Plum Bonito | Wheat Nas | Hotpink Bonefish | |||
| LLVM target[69][70] | gfx1010 | gfx1011 | gfx1012 | gfx1030 | gfx1031 | gfx1032 | gfx1034 | gfx1100 | gfx1101 | gfx1102 | gfx1200 | gfx1201 |
| Fab | TSMC N7 | TSMC N6 | TSMC N5 (GCD), TSMC N6 (MCD) | TSMC N6 | TSMC N4 | |||||||
| Package | Monolithic | Multi-chip module (MCM) | Monolithic | |||||||||
| Die size (mm2) | 251 | Unknown | 158 | 520 | 335 | 237 | 107 | ~531 | ~350 | 204 | 199 | 357 |
| Graphics compute dies | — | 1 | — | |||||||||
| Memory cache dies | — | 6 | 4 | — | ||||||||
| GCD size (mm2) | — | 306 | 200 | — | ||||||||
| MCD size (mm2) | — | 37.5 | — | |||||||||
| Transistors (billions) | 10.3 | Unknown | 6.4 | 26.8 | 17.2 | 11.06 | 5.4 | 57.7 | 28.1 | 13.3 | 29.7 | 53.9 |
| Transistor density (MTr/mm2) | 41.0 | Unknown | 40.5 | 51.5 | 51.3 | 46.7 | 50.5 | 109.2 (MCM) 132.4 (GCD)[71] | 81.2 | 65.2 | 149.2 | 151 |
| Shader engines | 2 | 1 | 4[72] | 2[72] | 1[72] | 6 | TBA | 2 | 4 | |||
| Shader arrays | 4 | 2 | 8 | 4 | 2 | 12 | TBA | 4 | 8 | |||
| Workgroup processors | 20 | 12 | 40 | 20 | 16 | 8 | 48 | 30 | 16 | 32 | ||
| Compute units | 40 | 24 | 80 | 40 | 32 | 16 | 96 | 60 | 32 | 64 | ||
| Stream processors | 2560 | 1536 | 5120 | 2560 | 2048 | 1024 | 6144 | 3840 | 2048 | 4096 | ||
| Texture mapping units | 160 | 96 | 320 | 160 | 128 | 64 | 384 | 240 | 128 | 256 | ||
| Render output units | 64 | 32 | 128 | 64 | 32 | 192 | 96[73] | 64 | 128 | |||
| RT accelerators | — | 80 | 40 | 32 | 16 | 96 | 60 | 32 | 64 | |||
| AI accelerators[a] | — | 192 | 120 | 64 | 128 | |||||||
| L0 cache (KB) | 32 per Workgroup processor (WGP) | 64 per WGP | ||||||||||
| L1 cache (KB) | 128 per Shader array (SA) | 256 per SA | 128 per SA | |||||||||
| L2 cache (MB) | 8 | 4 | 2 | 4 | 3 | 2 | 1 | 6 | 4 | 2 | 4 | 8 |
| L3 cache (MB) | — | 128 | 96 | 32 | 16 | 96 | 64 | 32 | 64 | |||
| Memory type | GDDR6 | HBM2 | GDDR6 | |||||||||
| Memory bus (bits) | 256 | 2048 | 128 | 256 | 192 | 128 | 64 | 384 | 256 | 128 | 256 | |
| Display Core Next | 2.0.0 | 3.0.0 | 3.0.2 | 3.0.3 | 3.2.0 | 3.2.1 | 4.0.1 | |||||
| Video Core Next | 2.0.0 | 2.0.2 | 3.0.0 | 3.0.16 | 3.0.33 | 4.0.0 | 4.0.4 | 5.0.0 | ||||
| Launch | Jul 2019 | Jun 2020 | Oct 2019 | Nov 2020 | Mar 2021 | May 2021 | Jan 2022 | Dec 2022 | Sep 2023 | Jan 2023 | Jun 2025 | Mar 2025 |
| Introduced with | RX 5700 (XT) | Pro 5600M | RX 6800 (XT) | RX 6700 XT | RX 6600M | RX 7900 XT(X) | RX 9060 XT | RX 9070 (XT) | ||||
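The transistor density row in the table above follows directly from the transistor count and die size rows. As a quick cross-check, the sketch below recomputes the density for a few of the monolithic chips, using the table's own figures; the dictionary and function names are illustrative only.

```python
# (transistors in billions, die size in mm^2, density listed in the table)
CHIPS = {
    "Navi 10": (10.3, 251, 41.0),
    "Navi 14": (6.4, 158, 40.5),
    "Navi 21": (26.8, 520, 51.5),
    "Navi 33": (13.3, 204, 65.2),
    "Navi 48": (53.9, 357, 151.0),
}

def density_mtr_per_mm2(transistors_billions: float, die_mm2: float) -> float:
    """Transistor density in millions of transistors per square millimetre."""
    return transistors_billions * 1000.0 / die_mm2

if __name__ == "__main__":
    for chip, (transistors, area, listed) in CHIPS.items():
        computed = density_mtr_per_mm2(transistors, area)
        print(f"{chip}: computed {computed:.1f}, listed {listed}")
```

The multi-chip Navi 31 and Navi 32 are left out because their listed die sizes are approximate and combine dies made on two different process nodes.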

| Microarchitecture | RDNA 2 | RDNA 3 | RDNA 3.5 | |||||||
|---|---|---|---|---|---|---|---|---|---|---|
| Code name | Rembrandt | Raphael | Mendocino | Rembrandt-R | Dragon Range | Phoenix | Hawk Point | Strix Point[74][75] | Strix Halo[76] | Krackan Point[77] |
| LLVM target[78][79] | gfx1035 | gfx1036 | gfx1037 | gfx1035 | gfx1037 | gfx1103 | gfx115{0,1} | gfx1151 | gfx1152 | |
| Fab | TSMC N6 | TSMC N4 | ||||||||
| Package | Monolithic | Monolithic[b] | Semi-MCM[c] | Monolithic[b] | ||||||
| Die size (mm2) | TBA | 308 | TBA | |||||||
| Transistors (billions) | TBA | |||||||||
| Transistor density (MTr/mm2) | TBA | |||||||||
| Shader engines | ||||||||||
| Shader arrays | ||||||||||
| Workgroup processors | 8 | 20 | 4 | |||||||
| Compute units | 16 | 40 | 8 | |||||||
| Stream processors | 1024 | 2560 | 512 | |||||||
| Texture mapping units | 64 | 160 | 32 | |||||||
| Render output units | 32 | 64 | 8 | |||||||
| RT accelerators | 16 | 40 | 8 | |||||||
| AI accelerators[d] | 32 | 80 | 16 | |||||||
| L0 cache (KB) | 32 per Workgroup processor (WGP) | 64 per WGP | ||||||||
| L1 cache (KB) | 128 per Shader array (SA) | 256 per SA | ||||||||
| L2 cache (MB) | 2 | 8 | 1 | |||||||
| L3 cache (MB) | 0 | 32[e] | 0 | |||||||
| Memory type | DDR5/LPDDR5 | DDR5 | LPDDR5 | DDR5/LPDDR5 | DDR5 | DDR5/LPDDR5(X) | ||||
| Memory bus (bits)[f] | 128 | 128 | 128 | 128/256 | 128 | |||||
| Display Core Next | 3.1.4 | 3.5.0 | 3.5.0 | |||||||
| Video Core Next | 4.0.2 | 4.0.5 | 4.0.5 | |||||||
| Launch | Jan 2022 | Sep 2022 | Jan 2023 | Dec 2023 | Jul 2024 | Jan 2025 | Mar 2025 | |||
| Introduced with | | Radeon Graphics | 610M | | 610M | | | 860M | ||
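The LLVM target row of the discrete-GPU comparison table is the piece most directly useful to programmers, since compilers select the instruction set by these names. The sketch below turns that row into a lookup table and formats an `--offload-arch` option of the kind accepted by HIP-capable Clang toolchains. The helper name is hypothetical, and whether a particular toolchain release actually supports a given target is a separate question that this sketch does not answer.

```python
# Chip -> LLVM target, copied from the discrete-GPU comparison table.
LLVM_TARGETS = {
    "Navi 10": "gfx1010",
    "Navi 12": "gfx1011",
    "Navi 14": "gfx1012",
    "Navi 21": "gfx1030",
    "Navi 22": "gfx1031",
    "Navi 23": "gfx1032",
    "Navi 24": "gfx1034",
    "Navi 31": "gfx1100",
    "Navi 32": "gfx1101",
    "Navi 33": "gfx1102",
    "Navi 44": "gfx1200",
    "Navi 48": "gfx1201",
}

def offload_arch_flag(chip: str) -> str:
    """Hypothetical helper: build a compiler option for the given chip."""
    try:
        return f"--offload-arch={LLVM_TARGETS[chip]}"
    except KeyError:
        raise ValueError(f"no LLVM target recorded for {chip!r}") from None

def generation(chip: str) -> str:
    """Group targets by their gfx10/gfx11/gfx12 prefix."""
    return LLVM_TARGETS[chip][:5]

if __name__ == "__main__":
    print(offload_arch_flag("Navi 21"))
    print(generation("Navi 48"))
```

Note that RDNA 1 and RDNA 2 share the `gfx10` prefix, so the prefix alone does not distinguish those two generations.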
References
- ^ GPU matrix-multiplication units, not to be confused with the XDNA NPU.
- ^ a b Everything (including CPU) on one die
- ^ GPU integrated into IO die; CPU cores on two dies
- ^ GPU matrix-multiplication units, not to be confused with the XDNA NPU.
- ^ MALL cache is dedicated to GPU
- ^ Dual-channel DDR5 provides 2*64=128 bits of width.
- ^ a b AMD press release: "AMD Announces Next-Generation Leadership Products at Computex 2019 Keynote". AMD (Press release). Santa Clara, CA. May 26, 2019. Retrieved October 5, 2019.
- ^ Smith, Ryan (May 26, 2019). "Home> GPUs AMD Teases First Navi GPU Products: RX 5700 Series Launches in July, 25% Improved Perf-Per-Clock". AnandTech. Archived from the original on May 27, 2019. Retrieved June 21, 2019.
- ^ "AMD RDNA Architecture". AMD. Retrieved December 10, 2022.
- ^ Wilson, Matthew (June 11, 2019). "AMD launches RX 5700 XT and RX 5700 GPUs with RDNA architecture". KitGuru. Retrieved December 10, 2022.
- ^ a b c d Smith, Ryan (June 10, 2019). "GPUs AMD Announces Radeon RX 5700 XT & RX 5700: The Next Gen of AMD Video Cards Starts on July 7th At $449/$379". AnandTech. Archived from the original on June 11, 2019. Retrieved June 21, 2019.
- ^ James, James (October 30, 2019). "AMD Radeon RX 5700 XT release date, price, specs, and performance". PCGamesN. Retrieved September 20, 2022.
- ^ a b Funk, Ben (December 12, 2020). "Sony PS5 Gets A Full Teardown Detailing Its RDNA 2 Guts And Glory". Hot Hardware. Retrieved January 3, 2021.
- ^ Radeon [@Radeon] (June 11, 2021). "We're proud that our groundbreaking RDNA 2 and Zen architectures helped enable incredible next-gen PS5 experiences like Ratchet and Clank: Rift Apart! #RatchetPS5" (Tweet). Retrieved June 11, 2021 – via Twitter.
- ^ a b Smith, Ryan (February 24, 2020). "Microsoft Drops More Xbox Series X Tech Specs: Zen 2 + RDNA 2, 12 TFLOPs GPU, HDMI 2.1, & a Custom SSD". AnandTech. Archived from the original on February 24, 2020. Retrieved March 19, 2020.
- ^ a b c Judd, Will (October 28, 2020). "AMD unveils three Radeon 6000 graphics cards with ray tracing and RTX-beating performance". Eurogamer. Retrieved October 28, 2020.
- ^ "Samsung Introduces Game Changing Exynos 2200 Processor With Xclipse GPU Powered by AMD RDNA 2 Architecture". Samsung Newsroom. Korea. January 18, 2022. Retrieved November 4, 2022.
- ^ Cunningham, Andrew (November 3, 2022). "AMD's next-gen Radeon RX 7900 XTX and XT launch December 13 for $999 and $899". Ars Technica. Retrieved November 4, 2022.
- ^ "AMD Debuts Radeon RX 9070 XT and RX 9070 Powered by RDNA 4, and FSR 4". TechPowerUp. January 6, 2025. Retrieved February 18, 2025.
- ^ "AMD GPU architecture programming documentation". GPUOpen. Retrieved March 22, 2025.
- ^ ""RDNA 1.0" Instruction Set Architecture Reference Guide" (PDF). AMD. Retrieved April 28, 2022.
- ^ "AMD RDNA Architecture". AMD. Retrieved April 28, 2022.
- ^ Ung, Gordon (May 26, 2019). "AMD flexes 7nm muscle with a 12-core Ryzen 9 CPU and Radeon RX 5000 graphics cards". PCWorld. Retrieved April 18, 2022.
- ^ Chacos, Brad (July 7, 2019). "AMD Radeon RX 5700 and 5700 XT review: Blazing new trails". PC World. Retrieved April 18, 2022.
- ^ Ridley, Jacob (June 14, 2019). "AMD announces $449 Radeon RX 5700 XT and $379 5700 for July 7 launch". PCGamesN. Retrieved June 14, 2019.
- ^ Stobing, Chris (June 10, 2019). "AMD Details Radeon RX 5700 'Navi' GPUs: Here's What You Need to Know". PCMag. Retrieved June 21, 2019.
- ^ "AMD Investor Relations presentation (PDF)". AMD. 2019. Archived from the original on February 5, 2020. Retrieved February 8, 2020.
- ^ a b Smith, Ryan (January 28, 2020). "Navi Refresh and RDNA2 Both In 2020, According to AMD". AnandTech. Archived from the original on January 29, 2020. Retrieved February 8, 2020.
- ^ a b Thomas, Bill (February 3, 2020). "AMD 'Big Navi' GPU might be right around the corner – but don't hold your breath". TechRadar. Retrieved February 8, 2020.
- ^ Alcorn, Paul (January 29, 2020). "AMD to Introduce New Next-Gen RDNA GPUs in 2020, Not a Typical 'Refresh' of Navi". Tom's Hardware. Retrieved February 8, 2020.
- ^ Ridley, Jacob (January 29, 2020). "AMD will unveil RDNA 2 graphics cards on March 5". PCGamesN. Retrieved February 8, 2020.
- ^ Alcorn, Paul (March 5, 2020). "AMD Financial Analyst Day 2020: CPU and GPU Roadmaps, X3D Die Stacking Revealed". Tom's Hardware. Retrieved March 7, 2020.
- ^ a b c d Smith, Ryan (March 5, 2020). "AMD's RDNA 2 Gets A Codename: "Navi 2X" Comes This Year With 50% Improved Perf-Per-Watt". AnandTech. Archived from the original on March 6, 2020. Retrieved March 7, 2020.
- ^ Garreffa, Anthony (September 9, 2020). "AMD to reveal next-gen Big Navi RDNA 2 graphics cards on October 28". TweakTown. Retrieved September 9, 2020.
- ^ Lyles, Taylor (September 9, 2020). "AMD's next-generation Zen 3 CPUs and Radeon RX 6000 'Big Navi' GPU will be revealed next month". The Verge. Retrieved September 10, 2020.
- ^ Smith, Ryan (October 8, 2020). "AMD Teases Radeon RX 6000 Card Performance Numbers: Aiming For 3080?". AnandTech. Archived from the original on October 8, 2020. Retrieved October 25, 2020.
- ^ Smith, Ryan (September 9, 2020). "AMD Announces Ryzen "Zen 3" and Radeon "RDNA2" Presentations for October: A New Journey Begins". AnandTech. Archived from the original on September 10, 2020. Retrieved October 25, 2020.
- ^ Hollister, Sean (March 3, 2021). "AMD announces $479 Radeon RX 6700 XT, says it will have 'significantly more GPUs available'". The Verge. Retrieved March 4, 2021.
- ^ Mujtaba, Hassan (November 30, 2020). "AMD Radeon RX 6700 XT 'Navi 22 GPU' Custom Models Reportedly Boost Up To 2.95 GHz". Wccftech. Retrieved December 3, 2020.
- ^ Tyson, Mark (December 3, 2020). "AMD CEO keynote scheduled for CES 2020 on 12th January". HEXUS. Retrieved December 3, 2020.
- ^ Cutress, Ian (January 12, 2021). "AMD to Launch Mid-Range RDNA 2 Desktop Graphics in First Half 2021". AnandTech. Archived from the original on April 13, 2021. Retrieved January 4, 2021.
- ^ a b Chin, Monica (May 31, 2021). "AMD announces the Radeon RX 6000M series with RDNA 2 architecture". The Verge. Retrieved June 1, 2021.
- ^ Takahashi, Dean (May 31, 2021). "AMD launches Radeon RX 6000M GPUs for gaming laptops". VentureBeat. Retrieved June 1, 2021.
- ^ Alvarez, Simon (June 1, 2021). "AMD confirms Tesla's new Model S and Model X will boast RDNA 2 GPUs". TeslaRati. Retrieved June 1, 2021.
- ^ "It's official! Flagship Exynos chip with AMD RDNA2 GPU is coming later this year". SamMobile. June 1, 2021. Retrieved October 26, 2022.
- ^ "Die analysis: Samsung Exynos 2200 with RDNA2 graphics". Patreon. April 26, 2022. Archived from the original on October 25, 2023. Retrieved April 19, 2023.
- ^ "Samsung Announces Game Changing Exynos 2200". Samsung Semiconductor Global. January 18, 2022. Retrieved October 26, 2022.
- ^ Cunningham, Andrew (September 26, 2022). "Everything you need to know about Zen 4, socket AM5, and AMD's newest chipsets". Ars Technica. Retrieved September 26, 2022.
- ^ "AMD confirms Ryzen 7000 series feature RDNA2 graphics". VideoCardz. August 30, 2022. Retrieved September 26, 2022.
- ^ Roach, Jacob (May 25, 2022). "AMD Ryzen 7000 graphics aren't powerful enough for gaming". Digital Trends. Retrieved September 26, 2022.
- ^ Gartenberg, Chaim (March 18, 2020). "Sony reveals full PS5 hardware specifications". The Verge. Retrieved January 3, 2021.
- ^ "Steam Deck". Steam. July 15, 2021. Retrieved July 16, 2021.
- ^ Gapo, Branko (July 28, 2021). "AMD: Zen4 and RDNA3 on track for 2022 launch". VideoCardz. Retrieved July 29, 2021.
- ^ "AMD RDNA 3 Release Date, Price And Specs". GPU Mag. June 23, 2021. Retrieved July 24, 2021.
- ^ Soutter, Neil (January 25, 2021). "AMD RX 7000 series could be a performance monster, with much better ray tracing performance". Game Debate. Retrieved January 29, 2021.
- ^ Wickens, Katie (August 30, 2022). "AMD's Lisa Su confirms chiplet-based RDNA 3 GPU architecture". PC Gamer. Retrieved September 20, 2022.
- ^ Naffziger, Sam (September 19, 2022). "Advancing Performance-Per-Watt to Benefit Gamers". AMD. Retrieved September 22, 2022.
- ^ Garreffa, Anthony (September 19, 2022). "AMD says its next-gen RDNA 3 GPU will be the true leader in efficiency". TweakTown. Retrieved October 2, 2022.
- ^ Gamers Nexus (November 3, 2022). "AMD Radeon RX 7900 XTX & 7900 XT Specs, Price, Release Date, & "8K" Gaming". YouTube. Retrieved November 4, 2022.
- ^ Samsung Electronics (October 5, 2023). "Exynos 2400: Mobile experiences in a new light". Samsung Electronics. Retrieved January 13, 2025.
- ^ Samsung Electronics (October 24, 2024). "Exynos 1580: Super smooth. Extremely efficient". Samsung Electronics. Retrieved January 13, 2025.
- ^ "AMD Navi 10 GPU Specs". TechPowerUp. Retrieved November 5, 2022.
- ^ "AMD Navi 12 GPU Specs". TechPowerUp. Retrieved November 5, 2022.
- ^ "AMD Navi 14 GPU Specs". TechPowerUp. Retrieved November 5, 2022.
- ^ "AMD Navi 21 GPU Specs". TechPowerUp. Retrieved November 7, 2022.
- ^ "AMD Navi 22 GPU Specs". TechPowerUp. Retrieved November 7, 2022.
- ^ "AMD Navi 23 GPU Specs". TechPowerUp. Retrieved November 7, 2022.
- ^ "AMD Navi 24 GPU Specs". TechPowerUp. Retrieved November 7, 2022.
- ^ "AMD Navi 31 GPU Specs". TechPowerUp. Retrieved November 7, 2022.
- ^ "AMD Unveils World's Most Advanced Gaming Graphics Cards, Built on Groundbreaking AMD RDNA 3 Architecture with Chiplet Design". AMD (Press release). November 3, 2022.
- ^ "AMD Navi 32 GPU Specs". TechPowerUp. Retrieved August 25, 2023.
- ^ "AMD Navi 33 GPU Specs". TechPowerUp. Retrieved November 22, 2022.
- ^ "AMD Navi 44 GPU Specs". TechPowerUp. Retrieved June 19, 2025.
- ^ "AMD Navi 48 GPU Specs". TechPowerUp. Retrieved May 9, 2025.
- ^ "User Guide for AMDGPU Backend — LLVM 22.0.0git documentation". llvm.org.
- ^ "Accelerator and GPU hardware specifications — ROCm Documentation". rocm.docs.amd.com.
- ^ btarunr (November 4, 2022). "AMD Announces the $999 Radeon RX 7900 XTX... (endnote RX-819)". TechPowerUp.
- ^ Walton, Jarred (June 5, 2023). "AMD RDNA 3 GPU Architecture Deep Dive: The Ryzen Moment for GPUs". Tom's Hardware. Retrieved September 11, 2023.
- ^ "AMD Strix Point". TechPowerUp. Retrieved September 7, 2024.
- ^ "AMD Details the Radeon 890M RDNA 3.5 iGPU of "Strix Point" a bit More". TechPowerUp. Retrieved September 7, 2024.
- ^ "AMD Strix Halo GPU Specs". TechPowerUp. October 9, 2025. Retrieved October 9, 2025.
- ^ "AMD Krackan Point GPU Specs". TechPowerUp. October 9, 2025. Retrieved October 9, 2025.
- ^ "User Guide for AMDGPU Backend — LLVM 22.0.0git documentation". llvm.org.
- ^ "Accelerator and GPU hardware specifications — ROCm Documentation". rocm.docs.amd.com.
Background and development
Predecessors and motivations
The Graphics Core Next (GCN) microarchitecture, introduced by AMD in 2012 with the Radeon HD 7000 series, formed the foundational predecessor to RDNA and powered the company's discrete and integrated GPUs through the Vega series until 2019. GCN employed a unified shader model, enabling the same compute units to process both graphics rendering and general-purpose compute tasks, which facilitated strong support for APIs like DirectX 11 and OpenCL. Execution was structured around wavefronts comprising 64 threads, executed across four SIMD16 units per compute unit to achieve high parallelism and throughput in vector operations. Scalar processing was managed by dedicated pipelines for control flow and address generation, backed by an 8 KB scalar register file and a 16 KB scalar data cache shared by each group of four compute units. The cache hierarchy included private 16 KB L1 vector data caches per compute unit and a distributed 16-way associative L2 cache for coherence across the GPU.[6]
GCN's design excelled in compute-heavy workloads but revealed limitations in scalar processing flexibility, where divergent thread execution in graphics shaders strained the scalar units' branch handling and 64-bit ALU capabilities, leading to underutilization in latency-sensitive scenarios. Cache efficiency also posed challenges, as the per-SIMD vector caches and shared scalar cache resulted in frequent flushes, particularly in geometry processing and memory access patterns under modern workloads like DirectX 12, which emphasized asynchronous compute and lower occupancy. These issues contributed to suboptimal power consumption and scalability as gaming demands shifted toward higher instruction-level parallelism and reduced thread counts for better responsiveness.[1][7]
The transition to RDNA was driven by AMD's goal to deliver approximately 50% higher performance per watt over GCN, with a sharpened focus on optimizing for gaming performance through lower latency, higher instructions per cycle, and improved efficiency in graphics pipelines.[1] This redesign targeted high-end discrete GPUs and console-integrated solutions, enhancing scalability via refined memory hierarchies and interconnects while ensuring backward compatibility with GCN's instruction set. RDNA's development emphasized readiness for emerging features like ray tracing, addressing GCN's inefficiencies to better align with evolving industry standards for power-sensitive, real-time rendering.[8]
Timeline of releases
The development of the RDNA microarchitecture began as part of AMD's internal roadmap to succeed the Graphics Core Next (GCN) architecture, with the fifth generation of GCN (Vega) serving as the immediate predecessor before the shift to RDNA for improved gaming efficiency.[9] AMD's RDNA iterations were designed with a focus on generational advancements in performance per watt, targeting gains such as 50% improvements in early generations over GCN baselines.[2]
RDNA 1 was first revealed at Computex 2019 in May, announced in full in June 2019, and launched alongside the Radeon RX 5000 series on July 7, 2019.[10] This marked the debut of the RDNA family, emphasizing a clean break from GCN for better power efficiency in gaming workloads.[9]
RDNA 2 followed with its announcement on October 28, 2020, and launch on November 18, 2020, powering the Radeon RX 6000 series for discrete GPUs.[11] A key milestone was its integration into console hardware through partnerships with Sony and Microsoft, debuting in the PlayStation 5 and Xbox Series X/S upon their November 2020 releases.[12]
RDNA 3 was unveiled on November 3, 2022, with the Radeon RX 7000 series launching on December 13, 2022, introducing a significant manufacturing shift to TSMC's 5 nm process node for enhanced density and efficiency.[4] The architecture combined 5 nm and 6 nm nodes in a chiplet design, building on prior collaborations with console partners.[13]
RDNA 4 was teased at CES 2025 on January 6 before its full unveiling on February 28, 2025, and launch on March 6, 2025, with the Radeon RX 9000 series.[5][14] A deeper technical exploration occurred at Hot Chips 2025 in August, highlighting its modular design for flexible GPU configurations.[15]
Core architectural principles
Instruction set architecture
The RDNA instruction set architecture (ISA) is a 32-bit reduced instruction set computing (RISC)-like design that supports vector, scalar, and global operations, enabling efficient execution of shader programs on AMD GPUs.[16] It is compatible with major graphics and compute APIs, including DirectX 12, Vulkan, and OpenGL, facilitating broad software ecosystem support for rendering and general-purpose computing tasks.[16] The ISA organizes instructions into categories such as VOP (vector operations), SOP (scalar operations), and memory access types like MIMG, FLAT, MUBUF, MTBUF, and DS, allowing developers to target diverse workloads from pixel shading to matrix computations.[17] Key extensions in the RDNA ISA distinguish it from predecessors while maintaining core principles. The wavefront size is fixed at 32 threads (SIMD32), enabling consistent parallel execution across work-items, with support for Wave64 modes that operate as paired Wave32 units.[18] Double-rate integer operations are provided through instructions like V_MUL_I32_I24 and V_MAD_I32_I24, which perform packed 24-bit multiplies and 32-bit accumulates to accelerate integer-heavy tasks such as texture sampling and geometry processing.[16] Primitive shaders, introduced in RDNA 1, are implemented as dedicated ISA opcodes including V_INTERP_P1_F32 for interpolation and EXP for parameter exports, streamlining mesh and vertex processing by reducing overhead in the graphics pipeline.[17] Addressing modes in the RDNA ISA support both 32-bit and 64-bit pointers, encompassing flat addressing for global memory, scratch for private data, and structured formats via MUBUF and MTBUF instructions. 
Registers include vector general-purpose registers (VGPRs, 0-255 or up to 511 in extended modes) for per-lane data and scalar general-purpose registers (SGPRs, 0-105, plus special-purpose registers such as VCC and M0 for offsets), promoting efficient data movement and control flow.[18]
Local data share (LDS) provides 64 KB per CU (128 KB per WGP) for fast thread-local communication, with up to 64 KB allocatable per workgroup. LDS allocation supports two modes: CU mode, which splits the memory for independent access by each CU's SIMDs, and WGP mode, which provides a single contiguous space across the WGP.[16] Asynchronous compute queues are enabled through synchronization primitives like S_WAITCNT, S_BARRIER, and DS_GWS_SEMA_P instructions, allowing concurrent execution of graphics and compute workloads without stalling.[16]
The RDNA ISA ensures backward compatibility with the Graphics Core Next (GCN) ISA, inheriting its foundational instructions and state models while introducing optimizations for scalar unit independence, such as single-cycle issue for SOP1 and SOP2 operations decoupled from vector pipelines.[17] This design allows scalar instructions to execute per wavefront without vector dependencies, improving branch handling and control efficiency in shaders.
Compute unit structure
In the RDNA architecture, Compute Units (CUs) are organized into Workgroup Processors (WGPs), with each WGP comprising two CUs that share resources such as the L0 instruction cache, scalar cache, and LDS for improved efficiency.[1] The compute unit (CU) serves as the fundamental processing core for parallel shader execution, consisting of two 32-wide single instruction, multiple data (SIMD) units that provide a total of 64 shader processors capable of performing arithmetic operations on wavefronts of 32 or 64 threads.[1]
Each CU includes a dedicated scalar unit (SALU) that handles address calculations, control flow decisions, and uniform operations across the wavefront, operating independently from the vector paths to improve efficiency in divergent code execution.[16] Additionally, each CU incorporates four texture units for sampling and filtering operations, as well as one primitive unit responsible for assembling and culling primitives in the geometry pipeline, outputting up to one primitive per clock cycle after processing up to two inputs.[9]
The execution pipeline within a CU follows an in-order design with distinct stages: instruction fetch from a 16 KB L0 instruction cache per CU (32 KB per WGP) shared across the SIMD units, decode to distribute operations, execution through separate vector and scalar pipes, and write-back to registers or memory.[1] This structure enables single-cycle instruction issue per SIMD for wavefronts, with a typical latency of five cycles exposed for dependent operations, supported by hardware dependency checks to maintain throughput without stalling.[17]
Resource allocation in the CU emphasizes shared access for efficiency, including a 16 KB L0 vector cache per CU and a 16 KB L0 scalar cache per WGP for low-latency data access, while a larger 128 KB L1 data cache is shared across multiple CUs within a shader array to handle graphics and compute workloads.[1] Fixed-function units, such as the texture processors and primitive assemblers, integrate directly with the CU's memory subsystem to support rasterization and geometry processing without relying on programmable shaders.[9] The CU's local data share (LDS) provides 64 KB of high-bandwidth shared memory per CU in compute mode, enabling efficient workgroup communication at up to 32 dwords per cycle.[17]
Throughput metrics highlight the CU's balanced design, with the scalar pipeline capable of executing up to two instructions per cycle to accommodate control-heavy workloads, while vector throughput reaches 64 single-precision fused multiply-add operations (128 floating-point operations) per cycle per CU.[9] These CUs connect to the GPU's global memory controller, optimized for high-bandwidth interfaces like GDDR6, ensuring seamless data flow for both graphics rendering and compute tasks across RDNA generations.[1] Later iterations, such as RDNA 3, introduce dual-issue capabilities in the front-end for enhanced instruction dispatch without altering the core CU layout.[18]
RDNA 1
Key innovations
The RDNA microarchitecture introduced scalar unit independence by providing each SIMD unit with its own dedicated scalar pipeline, separate from the vector units, which minimizes wavefront stalls during branch divergence and non-vector operations.[1] This design doubles branch execution efficiency compared to GCN, enabling up to 2x better performance in control-flow heavy workloads by allowing scalar instructions to execute without blocking the vector pipes.[1][9]
A revamped cache hierarchy enhances data access efficiency, featuring multi-level L0 and L1 caches with a 16 KB L0 scalar cache per WGP (50% larger than the equivalent in GCN) and a 128 KB shared L1 cache per shader array to reduce pressure on the L2 cache.[1][9] The L2 cache scales to 2 MB per memory controller, supporting higher bandwidth and lower latency for both graphics and compute tasks, contributing to overall throughput improvements.[9]
Hardware support for primitive shaders enables early primitive culling directly in the shader pipeline, processing and discarding off-screen or back-facing primitives before full rasterization, which reduces geometry processing overhead by up to 2x compared to prior generations.[9] This is complemented by enhanced asynchronous compute capabilities through a dedicated scheduling feature called Asynchronous Compute Tunneling, which allows compute workloads to interleave with graphics without stalling the pipeline, improving resource utilization in mixed workloads.[9][19]
Fabricated on TSMC's 7 nm process node, RDNA targets roughly 1.25x performance per clock and 1.5x performance per watt over GCN, achieved through these architectural refinements and optimized power delivery for higher efficiency.[20][1] These innovations first appeared in chips like the Navi 10 GPU.[21]
Implemented chips
The primary discrete GPU implementations of the RDNA 1 microarchitecture are based on two main dies: Navi 10 and Navi 14, both fabricated on TSMC's 7 nm process node. These chips target the mainstream to high-end desktop graphics market, focusing on gaming performance improvements over GCN-based products.[22][23]
The Navi 10 die serves as the foundation for AMD's higher-end RDNA 1 offerings in the Radeon RX 5700 series, including the RX 5700 XT and RX 5700 models. Featuring a die size of 251 mm² and 10.3 billion transistors, it supports up to 40 compute units (2,560 stream processors) in its fully enabled configuration.[24] The RX 5700 XT pairs this die with 8 GB of GDDR6 memory on a 256-bit interface, achieving a memory bandwidth of 448 GB/s, and is positioned for high-end desktop gaming at 1440p and 4K resolutions.[25] Launched on July 7, 2019, the RX 5700 XT carried a starting price of $399, while the RX 5700 variant, with 36 compute units and slightly reduced clocks, started at $349.[21]
In contrast, the Navi 14 die targets entry-to-mid-range segments with a more compact design, measuring 158 mm² and containing 6.4 billion transistors. It supports up to 22 compute units (1,408 stream processors) in configurations like the Radeon RX 5500 XT, with 4 GB or 8 GB of GDDR6 memory on a 128-bit bus.[26] This die enables flexible binning for diverse SKUs such as the RX 5500 series for budget gamers.[27] Released on October 7, 2019 (for mobile variants earlier in August), these cards emphasize efficient power use for 1080p and 1440p gaming. The RX 5500 XT 8 GB model started at $199.[28]
Navi 12, a less common die with approximately 8.1 billion transistors and up to 36 compute units, was primarily used in professional and mobile products like the Radeon Pro 5600M and mining cards, rather than consumer discrete GPUs.[29]
| Die | Transistor Count | Process Node | Max Compute Units | Target Cards | Memory Config | Launch Date | Starting Price |
|---|---|---|---|---|---|---|---|
| Navi 10 | 10.3 billion | 7 nm | 40 | RX 5700 XT / RX 5700 | 8 GB GDDR6 (256-bit) | July 2019 | $349–$399 |
| Navi 14 | 6.4 billion | 7 nm | 22 | RX 5500 XT / RX 5500 series | 4–8 GB GDDR6 (128-bit) | October 2019 | $169–$199 |
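The stream processor counts and the 448 GB/s bandwidth figure quoted for these dies follow from simple arithmetic, sketched below. The 64 lanes per CU come from the two SIMD32 units described under the compute unit structure; the 14 Gbps per-pin GDDR6 data rate is an assumption that is not stated in this section, and the function names are illustrative only.

```python
LANES_PER_CU = 64  # two SIMD32 units per compute unit


def stream_processors(compute_units: int) -> int:
    """Stream processors exposed by a given number of compute units."""
    return compute_units * LANES_PER_CU


def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: pins x per-pin rate (Gbit/s) / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


if __name__ == "__main__":
    for name, cus in [("Navi 10 (full)", 40), ("RX 5700", 36), ("Navi 14 (RX 5500 XT)", 22)]:
        print(f"{name}: {stream_processors(cus)} stream processors")
    # Assumed 14 Gbps GDDR6 on the RX 5700 XT's 256-bit interface.
    print(f"RX 5700 XT bandwidth: {memory_bandwidth_gbs(256, 14.0)} GB/s")
```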
RDNA 2
Rendering and acceleration features
RDNA 2 introduces dedicated hardware accelerators to enhance real-time rendering capabilities, particularly for gaming workloads. Central to these improvements are Ray Accelerators, with one integrated per Compute Unit (CU), designed to accelerate ray-triangle intersection and Bounding Volume Hierarchy (BVH) traversal for ray tracing effects such as shadows, reflections, and global illumination.[30][31] This fixed-function hardware offloads ray tracing computations from general-purpose shaders, enabling efficient hybrid rendering pipelines that combine rasterization with ray-traced elements.[32]
To optimize geometry processing and shading efficiency, RDNA 2 supports mesh shaders and task shaders, which allow developers to replace traditional vertex and geometry shaders with more flexible, programmable stages. These features, part of DirectX 12 Ultimate, enable coarser geometry culling and amplification, reducing overhead in complex scenes with high polygon counts. Complementing this is hardware-accelerated Variable Rate Shading (VRS) at Tier 2, supporting shading rates including 2x2 and 4x4 pixels per sample to minimize computations in less visually critical areas, such as peripheral regions, without significant quality loss.[31][33] This combination can reduce pixel shading workload in targeted scenarios, improving frame rates in demanding titles.[30]
Memory access efficiency is bolstered by the Infinity Cache (up to 128 MB in high-end configurations), a large on-die L3-like structure that acts as a high-bandwidth pool for spatial and temporal data reuse, effectively doubling bandwidth compared to equivalent L2 cache configurations in prior architectures.[30][32] Additionally, sampler feedback hardware captures texture sampling patterns, enabling dynamic streaming of only necessary texture data to VRAM, which optimizes memory usage and reduces latency in texture-heavy scenes.[31]
For better multi-tasking, RDNA 2 separates graphics and compute pipelines, allowing primitive assembly and async compute workloads to execute concurrently without stalling the graphics queue. This decoupling enhances utilization during mixed workloads, such as ray tracing BVH construction alongside draw calls, by dispatching instructions through independent paths to the shader arrays.[34]
Integrated chips and console applications
The RDNA 2 architecture powered several discrete GPU dies under the Navi branding, targeting a range of performance segments in AMD's Radeon RX 6000 series. The flagship Navi 21 die, fabricated on TSMC's 7 nm process with 26.8 billion transistors and a die area of 520 mm², supported up to 80 compute units (CUs) in its full configuration, as seen in the Radeon RX 6900 XT graphics card.[35][36] The mid-range Navi 22 die, also on 7 nm with 17.2 billion transistors and a 335 mm² die size, featured 40 CUs and powered cards like the Radeon RX 6700 XT.[37][38] Lower-tier options included the Navi 23 on 7 nm with 11.1 billion transistors and a 237 mm² die, delivering 32 CUs for the Radeon RX 6600 XT, and the entry-level Navi 24 on a refined 6 nm process with 5.4 billion transistors and a 107 mm² die, providing 16 CUs in the Radeon RX 6500 XT.[39][40][41]
| Die | Process | Transistors (billions) | Die Size (mm²) | Max CUs | Example Product |
|---|---|---|---|---|---|
| Navi 21 | 7 nm | 26.8 | 520 | 80 | RX 6900 XT |
| Navi 22 | 7 nm | 17.2 | 335 | 40 | RX 6700 XT |
| Navi 23 | 7 nm | 11.1 | 237 | 32 | RX 6600 XT |
| Navi 24 | 6 nm | 5.4 | 107 | 16 | RX 6500 XT |
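To make the Ray Accelerators' role in the RDNA 2 dies above more concrete, the following sketch shows the kind of ray-box test that BVH traversal repeats at every node. It uses the textbook slab method in software; it is not a description of AMD's hardware implementation, and the function name and argument layout are hypothetical.

```python
import math


def ray_aabb_hit(origin, direction, box_min, box_max, t_max=math.inf):
    """Slab test: does the ray origin + t * direction (0 <= t <= t_max) hit the box?"""
    t_near, t_far = 0.0, t_max
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this pair of planes: it must start between them.
            if o < lo or o > hi:
                return False
            continue
        t0 = (lo - o) / d
        t1 = (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        # Shrink the interval in which the ray is inside every slab seen so far.
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return False
    return True


if __name__ == "__main__":
    unit_box = ((-1.0, -1.0, -1.0), (1.0, 1.0, 1.0))
    print(ray_aabb_hit((0.0, 0.0, -5.0), (0.0, 0.0, 1.0), *unit_box))
    print(ray_aabb_hit((0.0, 0.0, -5.0), (0.0, 1.0, 0.0), *unit_box))
```

A traversal loop runs this test against each child box of a BVH node and descends only into the children that are hit, which is the repetitive work that dedicated intersection hardware takes off the shader cores.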
RDNA 3
Chiplet design and scaling
The RDNA 3 microarchitecture marked AMD's transition to a chiplet-based design for discrete graphics processing units (GPUs), enabling modular scaling and improved manufacturing yields by breaking down the monolithic die structure used in prior generations. This approach draws on AMD's experience with chiplet integration in CPU architectures, adapting it to graphics workloads. The core component is a Graphics Compute Die (GCD), fabricated on TSMC's 5 nm process node and containing the compute units (CUs) responsible for shader processing, rasterization, and other graphics primitives. The GCD connects via AMD's Infinity Fabric interconnect to Memory Cache Dies (MCDs) on TSMC's 6 nm node, which manage memory controllers, peripherals, and large last-level cache pools.[4][47]
In high-end configurations like the Navi 31 GPU, a single GCD integrates up to 96 CUs, paired with six MCDs to form a cohesive package using advanced fanout packaging for low-latency inter-die communication. Scaling is achieved by varying the number of MCDs, from four in mid-range dies to six in flagship models, allowing adjustments to memory bandwidth and cache capacity without redesigning the core compute logic. Each CU in RDNA 3 features dual-issue wavefront dispatch, enabling two SIMD32 units to process instructions simultaneously, which doubles the peak FP32 throughput compared to RDNA 2's single-issue design per CU. This architectural shift supports higher clock frequencies, up to 15% above RDNA 2 levels, while maintaining power efficiency through optimized die partitioning.[47][4]
A key enhancement is the second-generation Infinity Cache, implemented across the MCDs at 16 MB per die, totaling up to 96 MB in the Navi 31-based RX 7900 XTX. This distributed L3 cache reduces pressure on inter-die bandwidth by caching frequently accessed data locally, minimizing trips to GDDR6 memory and effectively lowering bandwidth demands by approximately 20% in graphics workloads. Whereas RDNA 2 introduced Infinity Cache as an on-die structure, RDNA 3 moves it into the MCDs and integrates it into the chiplet fabric for seamless multi-die operation. The mixed-process strategy (5 nm for the compute-intensive GCD and 6 nm for the I/O-focused MCDs) targets a 50% improvement in performance per watt over RDNA 2, achieved through better silicon utilization and reduced power overhead in the interconnect, which consumes less than 5% of the total GPU power budget at up to 5.3 TB/s bidirectional bandwidth.[47][4]
Implemented chips
The primary discrete GPU implementations of the RDNA 3 microarchitecture are based on three main dies: Navi 31, Navi 32, and Navi 33, fabricated using TSMC's 5 nm and 6 nm process nodes. These chips power the Radeon RX 7000 series, targeting desktop gaming from entry-level to high-end segments with enhancements in ray tracing, AI acceleration, and efficiency.[48][49]
The Navi 31 die serves as the foundation for AMD's flagship RDNA 3 offerings in the Radeon RX 7900 series, including the RX 7900 XTX, RX 7900 XT, and RX 7900 GRE models. The multi-chip module (MCM) features approximately 58 billion transistors across one 5 nm GCD (300 mm²) and six 6 nm MCDs, supporting up to 96 compute units (6,144 stream processors).[50] The RX 7900 XTX pairs this with 24 GB of GDDR6 memory on a 384-bit interface, achieving up to 960 GB/s bandwidth, and is positioned for high-end 4K gaming. Launched on December 13, 2022, the RX 7900 XTX carried a starting price of $999 and the RX 7900 XT (20 GB, 320-bit) started at $899; the RX 7900 GRE (16 GB, 256-bit), a further binned variant of the same die, followed in 2023 at $549.[4]
The Navi 32 die targets mid-range segments with a chiplet design similar to Navi 31 but scaled down, featuring one 5 nm GCD (200 mm², 28.1 billion transistors) and four 6 nm MCDs for a total MCM of about 37 billion transistors. It supports up to 60 compute units (3,840 stream processors) in configurations like the Radeon RX 7800 XT, with 16 GB of GDDR6 on a 256-bit bus.[51] This die also powers the RX 7700 XT (54 CUs, 12 GB, 192-bit). Released on September 6, 2023, the RX 7800 XT started at $499 and the RX 7700 XT at $449, emphasizing 1440p gaming performance.[52]
In contrast, the Navi 33 die is a monolithic design on TSMC's 6 nm process, measuring 204 mm² with 13.3 billion transistors. It supports 32 compute units (2,048 stream processors) in the Radeon RX 7600, with 8 GB of GDDR6 on a 128-bit interface (up to 288 GB/s bandwidth); a 16 GB RX 7600 XT variant launched in January 2024.[53] Released on May 24, 2023, the RX 7600 started at $269 and the XT at $329, focusing on efficient 1080p and 1440p gaming.[54]
RDNA 3 is also integrated into consumer APUs like the Ryzen 7000 series (up to 12 CUs) and handheld devices such as the Asus ROG Ally, but the architecture's primary market focus for discrete GPUs is gaming with AI and ray tracing features. As of November 2025, no console implementations have been announced.[48]
| Die | Transistor Count (MCM) | Process Node | Max Compute Units | Target Cards | Memory Config | Launch Date | Starting Price |
|---|---|---|---|---|---|---|---|
| Navi 31 | 58 billion | 5 nm / 6 nm | 96 | RX 7900 XTX / XT / GRE | 16–24 GB GDDR6 (256–384-bit) | Dec 2022 / 2023 | $549–$999 |
| Navi 32 | 37 billion | 5 nm / 6 nm | 60 | RX 7800 XT / 7700 XT | 12–16 GB GDDR6 (192–256-bit) | Sep 2023 | $449–$499 |
| Navi 33 | 13.3 billion | 6 nm | 32 | RX 7600 / 7600 XT | 8–16 GB GDDR6 (128-bit) | May 2023 / Jan 2024 | $269–$329 |
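The way MCD count sets cache capacity and bus width for the chiplet-based dies in the table can be sketched as follows. The 16 MB of Infinity Cache per MCD is stated in the chiplet design section; the 64-bit memory interface per MCD is an assumption inferred from the bus widths in the table (384-bit with six MCDs, 256-bit with four), and the 20 Gbps data rate used in the example is likewise assumed rather than taken from the text.

```python
INFINITY_CACHE_MB_PER_MCD = 16  # stated in the chiplet design section
BUS_BITS_PER_MCD = 64           # assumption inferred from the table's bus widths


def mcd_config(active_mcds: int, data_rate_gbps: float) -> dict:
    """Derive cache size, bus width and peak bandwidth from the active MCD count."""
    bus_width_bits = active_mcds * BUS_BITS_PER_MCD
    return {
        "infinity_cache_mb": active_mcds * INFINITY_CACHE_MB_PER_MCD,
        "bus_width_bits": bus_width_bits,
        "bandwidth_gbs": bus_width_bits * data_rate_gbps / 8,
    }


if __name__ == "__main__":
    # Six active MCDs at an assumed 20 Gbps, matching the RX 7900 XTX row.
    print(mcd_config(6, 20.0))
    # Five and four active MCDs give the 320-bit and 256-bit configurations.
    print(mcd_config(5, 20.0))
    print(mcd_config(4, 20.0))
```

Under these assumptions, six MCDs reproduce the 96 MB, 384-bit, 960 GB/s figures quoted for the RX 7900 XTX.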
RDNA 4
Modular design and AI enhancements
The RDNA 4 microarchitecture introduces a modular system-on-chip (SoC) design that enhances flexibility for mid-range graphics processing units (GPUs), building on the chiplet approach of RDNA 3 by enabling scalable configurations of shader engines and compute units.[55] The flagship Navi 48 die, fabricated on TSMC's 4 nm process with a size of 356.5 mm² and 53.9 billion transistors, incorporates four shader engines, each containing eight dual compute units (DCUs) for a total of 64 compute units, allowing for efficient monolithic integration while supporting half-die variants like the Navi 44 for cost-effective smaller GPUs.[55] This tile-based structure facilitates diverse SKU configurations, such as the Navi 44's two shader engines, reducing manufacturing overhead and enabling broader market coverage in the mid-range segment.[56]
A key enabler of this modularity is the implementation of out-of-order memory request handling, which permits requests from different shader waves to be processed non-sequentially, eliminating false dependencies that stalled prior architectures like RDNA 3.[57] This optimization improves overall memory subsystem efficiency, particularly in rasterization workloads where access patterns vary, contributing to smoother execution across diverse GPU configurations.[57]
RDNA 4 integrates second-generation AI accelerators into each compute unit to bolster AI capabilities, delivering up to 2x the generalized matrix multiply (GEMM) performance in FP16 compared to RDNA 3, with 1024 FLOPS per clock per compute unit.[58][5] These accelerators, comprising four matrix acceleration engines (MAEs) per DCU for a total of 128 across the Navi 48, support sparse matrix operations in INT8 at up to 8x the throughput of RDNA 3, enabling efficient AI inference and machine learning tasks.[5][55] This hardware facilitates advanced features like FidelityFX Super Resolution 4 (FSR 4), an AI-based upscaler that leverages multi-layer perceptron (MLP) models for enhanced image quality in frame generation and interpolation, running optimally on RDNA 4's integrated tensor-like units.[59][58]
Ray tracing receives significant upgrades in RDNA 4 through third-generation accelerators, achieving over 2x the ray-triangle intersection and traversal throughput per compute unit relative to RDNA 3, thanks to dual intersection engines per ray accelerator.[5][60] Each DCU now includes two such accelerators, totaling 64 on the Navi 48, which support 8-wide bounding volume hierarchy (BVH) nodes for fewer traversal steps and oriented bounding boxes (OBBs) via predefined transformation matrices to accelerate complex scene handling.[55][60] These enhancements also improve BVH construction efficiency through primitive compression and parallel path testing instructions like IMAGE_BVH_DUAL_INTERSECT_RAY, enabling better performance in ray-traced titles such as Cyberpunk 2077.[60]
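Why wider BVH nodes mean fewer traversal steps can be seen from an idealized depth calculation. The sketch below assumes a perfectly balanced tree with every node full, which real scenes do not guarantee; the 4-wide case is included only as a comparison point and is an assumption, since the text above does not state the node width of earlier generations.

```python
def bvh_levels(num_leaves: int, width: int) -> int:
    """Levels needed for a balanced BVH with `width` children per node to cover all leaves."""
    if num_leaves < 1 or width < 2:
        raise ValueError("need at least one leaf and a branching factor of 2 or more")
    levels, capacity = 0, 1
    # Each extra level multiplies the number of reachable leaves by the node width.
    while capacity < num_leaves:
        capacity *= width
        levels += 1
    return levels


if __name__ == "__main__":
    triangles = 1_000_000
    for width in (4, 8):
        print(f"{width}-wide BVH over {triangles} primitives: {bvh_levels(triangles, width)} levels")
```

Each level visited costs one node fetch and one round of box tests, so a shallower tree reduces the number of dependent memory accesses per ray, at the price of testing more boxes per node.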
Supporting these features, RDNA 4 doubles the L2 cache size to 8 MB per GPU die, enhancing data locality for AI and ray tracing workloads, while the Infinity Cache provides 64 MB to sustain high-bandwidth demands.[55][5] Memory configuration standardizes at 16 GB of GDDR6 on a 256-bit interface, with support for speeds up to 20 Gbps, prioritizing efficiency over a shift to GDDR7 in this generation.[5][55] Additionally, Infinity Fabric interconnects see optimized bandwidth usage, reducing overall requirements by approximately 25% through compression and scheduling improvements, further aiding modular scalability.[55]
Implemented chips
The primary discrete GPU implementations of the RDNA 4 microarchitecture are based on two main dies: Navi 48 and Navi 44, both fabricated on TSMC's 4 nm process node. These chips target the desktop graphics market, emphasizing mid-range performance with enhancements in ray tracing and AI acceleration for gaming workloads.[61][62]
The Navi 48 die serves as the foundation for AMD's higher-end RDNA 4 offerings in the Radeon RX 9070 series, including the RX 9070 XT and RX 9070 models. Featuring a die size of 357 mm² and approximately 53.9 billion transistors, it supports up to 64 compute units (4,096 stream processors) in its fully enabled configuration.[63][64] The RX 9070 XT pairs this die with 16 GB of GDDR6 memory on a 256-bit interface, achieving a memory bandwidth of up to 640 GB/s, and is positioned for mid-to-high-end desktop gaming at 1440p and entry-level 4K resolutions.[65] The RX 9070 features 56 compute units (3,584 stream processors) via binning, with slightly reduced clocks. Launched in March 2025, the RX 9070 XT carries a starting price of $599, while the RX 9070 variant starts at $549, enabling cost-effective variants through modular binning of the die.[61][5]
In contrast, the Navi 44 die targets entry-to-mid-range segments with a more compact design, measuring 199 mm² and containing about 29.7 billion transistors. It supports 32 compute units (2,048 stream processors) in configurations like the Radeon RX 9060 XT, with options for 8 GB or 16 GB of GDDR6 memory on a 128-bit bus.[66] This die leverages RDNA 4's modular architecture to offer flexible half-die options, allowing AMD to produce diverse SKUs such as the anticipated RX 9050 series for budget-conscious gamers.[67] Released starting in May 2025, these cards emphasize efficient power use and AI-enhanced features for 1080p and 1440p gaming without entering high-end territory.[68]
As of late 2025, RDNA 4 has not been integrated into consumer desktop APUs, with the Ryzen 9000G series instead employing RDNA 3.5 graphics up to 16 compute units for hybrid CPU-GPU workloads.[69] The architecture's market focus remains on AI-accelerated desktop gaming, with no official announcements for console implementations.[5]
| Die | Transistor Count | Process Node | Max Compute Units | Target Cards | Memory Config | Launch Date | Starting Price |
|---|---|---|---|---|---|---|---|
| Navi 48 | 53.9 billion | 4 nm | 64 | RX 9070 XT / RX 9070 | 16 GB GDDR6 (256-bit) | March 2025 | $549–$599 |
| Navi 44 | 29.7 billion | 4 nm | 32 | RX 9060 XT / RX 9050 series | 8–16 GB GDDR6 (128-bit) | May 2025 | ~$250–$350 |
Generational comparisons
Performance and efficiency metrics
The RDNA microarchitecture has demonstrated consistent improvements in performance per watt across generations, enabling higher computational throughput at comparable power levels. The first-generation RDNA, introduced in 2019, achieved approximately 1.5 times the performance per watt compared to the preceding GCN architecture, primarily through enhanced instruction throughput and reduced latency in the compute units.[1] Subsequent iterations built on this foundation: RDNA 2 delivered about 50% better performance per watt over RDNA 1, thanks to architectural refinements like primitive shaders and mesh shaders that optimized workload distribution. RDNA 3 further advanced efficiency with a 54% uplift over RDNA 2, incorporating dual-issue compute units and chiplet-based scaling to balance performance gains with power constraints.[70] For RDNA 4, launched in early 2025, AMD reported up to 40% improvements in rasterization performance and over 2x improvements in ray tracing throughput per compute unit, attributed to architectural enhancements including third-generation ray tracing accelerators and higher clock speeds, though overall perf/watt gains were moderated by a mid-range focus in the initial product lineup.[5]
Theoretical peak performance, measured in teraflops (TFLOPS) of single-precision floating-point operations, illustrates the scaling trajectory of RDNA flagship GPUs, though real-world efficiency varies due to architectural changes. Representative high-end models show rapid growth through RDNA 3: the RDNA 1-based Radeon RX 5700 XT peaked at 9.75 TFLOPS, the RDNA 2-based RX 6900 XT at 23 TFLOPS, and the RDNA 3-based RX 7900 XTX at 61.4 TFLOPS, reflecting increases in shader counts and clock rates alongside dual-issue capabilities in later generations. The RDNA 4-based RX 9070 XT, positioned as a mid-to-high-end offering, reaches 48.7 TFLOPS, lower than RDNA 3's flagship because of its smaller die and its emphasis on power efficiency rather than raw compute density.
Infinity Cache, introduced in RDNA 2 and refined in subsequent generations, significantly mitigates memory bandwidth limitations by providing a large on-package L3 cache that effectively doubles bandwidth for cache-hit scenarios in gaming workloads. This technology allowed RDNA 2 and later architectures to achieve up to 2.17 times the effective memory bandwidth compared to traditional L2 cache designs at equivalent physical bus widths, reducing DRAM accesses and latency.
Board power for the representative high-end cards has ranged from 225 W to 355 W across RDNA generations, with the RX 5700 XT at 225 W, RX 6900 XT at 300 W, RX 7900 XTX at 355 W, and RX 9070 XT at 304 W, so the gains in peak throughput have not required proportional increases in power.
| Generation | Representative GPU | Peak FP32 TFLOPS | TDP (W) |
|---|---|---|---|
| RDNA 1 | RX 5700 XT | 9.75 | 225 |
| RDNA 2 | RX 6900 XT | 23 | 300 |
| RDNA 3 | RX 7900 XTX | 61.4 | 355 |
| RDNA 4 | RX 9070 XT | 48.7 | 304 |
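The peak FP32 figures in the table can be approximately reproduced from compute unit counts and clock speeds. The sketch below is a rough model: compute unit counts and TDP values come from this article, while the boost clocks are assumptions supplied for illustration and are not stated in the text. It counts a fused multiply-add as two operations and doubles throughput again for generations described here as dual-issue.

```python
from dataclasses import dataclass

LANES_PER_CU = 64  # two SIMD32 units per compute unit
FLOPS_PER_FMA = 2  # one multiply plus one add


@dataclass(frozen=True)
class Gpu:
    name: str
    compute_units: int
    boost_ghz: float  # assumed boost clock, not taken from the article
    dual_issue: bool
    tdp_w: int

    def peak_fp32_tflops(self) -> float:
        issue_factor = 2 if self.dual_issue else 1
        flops_per_clock = self.compute_units * LANES_PER_CU * FLOPS_PER_FMA * issue_factor
        return flops_per_clock * self.boost_ghz / 1000.0  # GFLOPS -> TFLOPS

    def gflops_per_watt(self) -> float:
        return self.peak_fp32_tflops() * 1000.0 / self.tdp_w


GPUS = [
    Gpu("RX 5700 XT", 40, 1.905, False, 225),
    Gpu("RX 6900 XT", 80, 2.25, False, 300),
    Gpu("RX 7900 XTX", 96, 2.5, True, 355),
    Gpu("RX 9070 XT", 64, 2.97, True, 304),
]

if __name__ == "__main__":
    for gpu in GPUS:
        print(f"{gpu.name}: {gpu.peak_fp32_tflops():.2f} TFLOPS, "
              f"{gpu.gflops_per_watt():.0f} GFLOPS/W")
```

Peak TFLOPS per watt computed this way is a theoretical ratio and should not be read as the measured performance-per-watt gains AMD quotes between generations.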