Recent from talks
Nothing was collected or created yet.
Metal (API)
View on Wikipedia
| Metal | |
|---|---|
Apple used the mobile multiplayer online battle arena game Vainglory to demonstrate Metal's graphics capabilities at the iPhone 6's September 2014 announcement event.[1] | |
| Developer | Apple Inc. |
| Initial release | June 2014 |
| Stable release | 4
/ June 2025 |
| Written in | Metal Shading Language (C++14-based in Metal 1–3, C++17-based in Metal 4), Runtime/API: Objective-C, Swift, C++ (binding via Metal-cpp) |
| Operating system | iOS, iPadOS, macOS, tvOS, watchOS and visionOS |
| Type | 3D graphics and compute API |
| License | Proprietary |
| Website | developer |
Metal is a low-level, low-overhead hardware-accelerated 3D graphic and compute shader API created by Apple, debuting in iOS 8. Metal combines functions similar to OpenGL and OpenCL in one API. It is intended to improve performance by offering low-level access to the GPU hardware for apps on iOS, iPadOS, macOS, tvOS, watchOS and visionOS. It is similar to low-level APIs on other platforms such as Vulkan and DirectX 12.
Metal is an object-oriented API that can be invoked using the Swift, Objective-C or C++17[2] programming languages. Full-blown GPU execution is controlled via the Metal Shading Language. According to Apple promotional materials: "MSL [Metal Shading Language] is a single, unified language that allows tighter integration between the graphics and compute programs. Since MSL is C++-based, you will find it familiar and easy to use."[3]
Features
[edit]Metal aims to provide low-overhead access to the GPU. Commands are encoded beforehand and then submitted to the GPU for asynchronous execution. The application controls when to wait for the execution to complete thus allowing application developers to increase throughput by encoding other commands while commands are executed on the GPU or save power by explicitly waiting for GPU execution to complete. Additionally, command encoding is CPU independent thus applications can encode commands to each CPU thread independently. Lastly, render states are pre-computed beforehand, allowing the GPU driver to know in advance how to configure and optimize the render pipeline before command execution.[4]
Metal improves the capabilities of GPGPU programming by using compute shaders. Metal uses a specific shading language based on C++14, implemented using Clang and LLVM.[5]
Metal allows application developers to create Metal resources such as buffers, textures. Resources can be allocated on the CPU, GPU, or both and provides facilities to update and synchronize allocated resources. Metal can also enforce a resource's state during a command encoder's lifetime.[6][7]
On macOS, Metal can provide application developers the discretion to specify which GPU to execute. Application developers can choose between the low-power integrated GPU of the CPU, the discrete GPU (on certain MacBooks and Macs) or an external GPU connected through Thunderbolt. Application developers also have the preference on how GPU commands are executed on which GPUs and provides suggestion on which GPU a certain command is most efficient to execute (commands to render a scene can be executed by the discrete GPU while post-processing and display can be handled by the integrated GPU).[8]
Metal Performance Shaders
[edit]Metal Performance Shaders is a highly optimized library of graphics functions that can help application developers achieve great performance at the same time decrease work on maintaining GPU family specific functions.[9] It provides functions including:
- Image filtering algorithms
- Neural network processing
- Advanced math operations
- Ray tracing
History
[edit]Metal has been available since June 2, 2014 on iOS devices powered by Apple A7 or later,[10] and since June 8, 2015 on Macs (2012 models or later) running OS X El Capitan.[11]
On June 5, 2017, at WWDC, Apple announced the second version of Metal, to be supported by macOS High Sierra, iOS 11 and tvOS 11. Metal 2 is not a separate API from Metal and is supported by the same hardware. Metal 2 enables more efficient profiling and debugging in Xcode, accelerated machine learning, lower CPU workload, support for virtual reality on macOS, and specificities of the Apple A11 GPU, in particular.[12]
At the 2020 WWDC, Apple announced the migration of the Mac to Apple silicon. Macs using Apple silicon will feature Apple GPUs with a feature set combining what was previously available on macOS and iOS, and will be able to take advantage of features tailored to the tile based deferred rendering (TBDR) architecture of Apple GPUs.[13]
At the 2022 WWDC, Apple announced the third version of Metal (Metal 3), which would debut with the release of macOS Ventura, iOS 16 and iPadOS 16. Metal 3 introduces the MetalFX upscaling framework, which renders complex scenes in less time per frame with high-performance upscaling and anti-aliasing, mesh shaders support.[14] Also announced possibility to use C/C++ for Metal API.[15]
At the 2023 WWDC, Apple announced a brand new toolkit called the Game Porting Toolkit to port Windows 10/11-based games. It includes an environment to test binaries, translation layers from HLSL to MSL, and Metal-cpp bindings. Jeremy Sandmel announced a new Game Mode for macOS Sonoma, and Hideo Kojima announced Death Stranding for the macOS App Store.[16]
At the 2024 WWDC, Apple announced Game Porting Toolkit 2, along with the release of new games such as Control: Ultimate Edition, Frostpunk 2, and Assassin's Creed Shadows for macOS.[17]
At the 2025 WWDC, Apple announced Metal 4, a new version of the API featuring a unified command encoder system, support for neural rendering, and new technologies such as MetalFX Frame Interpolation and a ray tracing denoiser.[18]
Supported GPUs
[edit]The first version of Metal supports the following hardware and software:[19]
- Apple A7 SoC or later with iOS 8 or later
- Apple M1 SoC or later with macOS 11 or later
- Intel Processor with Intel HD and Iris Graphics Ivy Bridge series or later with OS X 10.11 or later
- AMD Graphics with GCN or RDNA architecture with OS X 10.11 or later
- NVIDIA Graphics with Kepler architecture with OS X 10.11 to macOS 11 operating system
- NVIDIA Graphics with Maxwell architecture or Pascal architecture with OS X 10.11 to macOS 10.13 operating system
The second version of Metal supports the following hardware and software:
- Apple A7 SoC or later with iOS 11 or later
- Apple M1 SoC or later with macOS 11 or later
- Intel Processor with Intel HD and Iris Graphics Skylake series or later with macOS 10.13 or later
- AMD Graphics with GCN or RDNA architecture with macOS 10.13 or later
The third version of Metal supports the following hardware and software:[20]
- Apple A13 or later with iOS 16, iPadOS 16[21]
- Apple A14 or later with iOS 16, iPadOS 16 or later
- Apple M1 SoC or later with iPadOS 16 or later
- Apple M1 SoC or later with macOS 13 or later
- Intel Processor with Intel UHD 630 or Iris Plus (Kaby Lake or later) with macOS 13 or later
- AMD Graphics with RDNA architecture (5000 and 6000 series) and Pro Vega (5th generation GCN architecture)
The fourth version of Metal supports the following hardware and software:
Adoption
[edit]According to Apple, more than 148,000 applications use Metal directly, and 1.7 million use it through high-level frameworks, as of June 2017.[22] macOS games using Metal for rendering are listed below.
| Title | Developer (macOS version) | Game engine | MacOS release date (OpenGL/DirectX) | Metal-based release date | Metal support notes |
|---|---|---|---|---|---|
| Ark: Survival Evolved | Studio Wildcard | Unreal Engine 4 | 29 August 2017 | ||
| Assassin's Creed Shadows | Ubisoft Quebec | Ubisoft Anvil | 20 March 2025 | ||
| ARMA 3 | Virtual Programming | Real Virtuality | 31 August 2015 | Metal support in beta since 17 September 2017[citation needed] | |
| Baldur's Gate III | Larian Studios | Divinity Engine 4.0 | 22 September 2023 | Metal support in early access since 6 October 2020[citation needed] | |
| Ballistic Overkill | Aquiris Game Studio | Unity Engine 5 | 28 March 2017 | ||
| Batman: Arkham City | Feral Interactive | Unreal Engine 3 | 18 October 2013 | Metal support since 21 February 2019 with v1.2[citation needed] | |
| Batman: The Enemy Within | Telltale Games | Telltale Tool | 8 August 2017 | ||
| BattleTech | Harebrained Schemes | Unity Engine 5 | 24 April 2018 | ||
| Bioshock Remastered | Feral Interactive | Unreal Engine 2.5 | 22 August 2017 | ||
| Bioshock 2 Remastered | Feral Interactive | Unreal Engine 2.5 | 22 October 2020 | ||
| Cyberpunk 2077 | CD Projekt | REDengine 4 | 17 July 2025 | Support for Metal 4 | |
| Cities: Skylines | Paradox Interactive | Unity Engine 5 | 10 March 2015 | Metal support since 18 May 2017[citation needed] | |
| Civilization VI | Aspyr Media | LORE | 24 October 2016 | Metal support since 5 April 2019[citation needed] | |
| Company of Heroes 2 | Feral Interactive | Essence Engine 3 | 21 January 2015 | Metal support since 19 October 2018[citation needed] | |
| Control: Ultimate Edition | Remedy Entertainment | Northlight Engine | 26 March 2025 | ||
| Dead Island 2 | Dambuster Studios | Unreal Engine 4 | 24 July 2025 | ||
| Deus Ex: Mankind Divided | Feral Interactive | Dawn Engine | 12 December 2017 | ||
| DiRT Rally | Feral Interactive | EGO Engine 2.5 | 16 November 2017 | ||
| Divinity: Original Sin II | Larian Studios | Divinity Engine 2 | 31 January 2019 | ||
| Dota 2 | Valve | Source 2 | 18 July 2013 | MoltenVK was announced on 26 February 2018.[23] The option to use this became available on 31 May 2018.[24] | |
| The Elder Scrolls Online | Zenimax Online Studios | N/A | 4 April 2014 | 22 October 2018 | OpenGL Renderer replaced with Vulkan via MoltenVK wrapper (translates Vulkan API calls to Metal) in patch 4.2.5 |
| Empire: Total War | Feral Interactive | TW Engine 3 | 4 March 2009 | Metal support since 16 December 2019[citation needed] | |
| EVE Online | CCP Games | N/A | 6 November 2007 | 14 October 2021 | Previously available on macOS via DirectX 9.0 from November 2007 until February 2009; native macOS version using Metal released 14 November 2021[citation needed] |
| Everspace | Rockfish | Unreal Engine 4 | 26 May 2017 | ||
| F1 2016 | Feral Interactive | EGO Engine 4.0 | 6 April 2017 | ||
| F1 2017 | Feral Interactive | EGO Engine 4.0 | 25 August 2017 | ||
| Fortnite | Epic Games | Unreal Engine 4 | 25 July 2017 | ||
| Frostpunk | 11 Bit Studios | Liquid Engine | 24 February 2021 | ||
| Frostpunk 2 | 11 Bit Studios | Unreal Engine 5 | 20 September 2024 | ||
| Gravel | Virtual Programming | Unreal Engine 4 | 20 January 2019 | ||
| Guardians of the Galaxy: The Telltale Series | Telltale Games | Telltale Tool | 18 April 2017 | ||
| Headlander | Double Fine Productions | Buddha Engine | 18 November 2016 | ||
| Heroes of the Storm | Blizzard Entertainment | SC2 Engine | 2 June 2015 | Metal support in beta since 24 January 2017 (temporarily removed on 29 November 2017[25] until ?)[citation needed] | |
| Hitman | Feral Interactive | Glacier 2 | 20 June 2017 | ||
| Life Is Strange: Before the Storm | Feral Interactive | Unity Engine | 13 September 2018 | ||
| Life Is Strange 2 | Feral Interactive | Unreal Engine 4 | 2019 | ||
| Mafia III | Aspyr Media | Illusion Engine | 11 May 2017 | ||
| Medieval II: Total War | Feral Interactive | TW Engine 2 | 17 December 2015 | Metal support since 25 October 2018[citation needed] | |
| Micro Machines World Series | Virtual Programming | Unity Engine 5 | 30 June 2017 | ||
| Minecraft: Story Mode - Season Two | Telltale Games | Telltale Tool | 11 July 2017 | ||
| MXGP3 | Virtual Programming | Unreal Engine 4 | 23 November 2018 | ||
| Napoleon: Total War | Feral Interactive | TW Engine 3 | 2 July 2013 | Metal support since 25 October 2019 with v1.2[citation needed] | |
| Obduction | Cyan Worlds | Unreal Engine 4 | 29 March 2017 | ||
| Observer | Bloober Team | Unreal Engine 4 | 24 October 2017 | ||
| Quake II | id Software | Quake II engine | 9 February 2019 | A port using MoltenVK was released as vkQuake2.[26] | |
| Refunct | Dominique Grieshofer | Unreal Engine 4 | 5 September 2016 | ||
| Resident Evil 2 | Capcom | RE Engine | 10 December 2024 | ||
| Resident Evil 3 | Capcom | RE Engine | 18 March 2025 | ||
| Resident Evil 4 | Capcom | RE Engine | 20 December 2023 | ||
| Resident Evil Village | Capcom | RE Engine | 28 October 2022 | First macOS game with MetalFX support | |
| Rise of the Tomb Raider | Feral Interactive | Foundation Engine | 12 April 2018 | ||
| Shadow of the Tomb Raider | Feral Interactive | Foundation Engine | 2019 | ||
| Sid Meier's Railroads! | Feral Interactive | Gamebryo | 1 November 2012 | Metal support since 18 December 2018[citation needed] | |
| The Sims 3 | Maxis Redwood Shores | The Sims 3 Engine | 2 June 2009 | 28 October 2020 | |
| The Sims 4 | Maxis | SmartSim | 17 February 2015 | Metal support added 12 November 2019[citation needed] | |
| Sky: Children of the Light | Thatgamecompany | N/A | 18 July 2019 | Native Metal support added since pre-global live in November 2017 | |
| Starcraft | Blizzard Entertainment | Modified Warcraft II engine | 20 November 2001 | Metal support since 2 July 2020 with v1.23.5[citation needed] | |
| StarCraft II | Blizzard Entertainment | SC2 Engine | 27 July 2010 | Metal support in beta since 24 January 2017[citation needed] | |
| Tomb Raider | Feral Interactive | Foundation Engine | 17 January 2014 | Metal support with v1.2 in July 2019[citation needed] | |
| Total War: Rome Remastered | Feral Interactive | TW Engine 2 | 29 April 2021 | ||
| Total War: Shogun 2 | Feral Interactive | TW Engine 3 | 31 July 2014 | Metal support since 4 October 2019[citation needed] | |
| Total War: Shogun 2: Fall of the Samurai | Feral Interactive | TW Engine 3 | 18 December 2014 | Metal support since 4 October 2019[citation needed] | |
| Total War: Three Kingdoms | Feral Interactive | TW Engine 3 | 23 May 2019 | ||
| Total War: Warhammer | Feral Interactive | TW Engine 3 | 19 April 2017 | ||
| Total War: Warhammer II | Feral Interactive | TW Engine 3 | 20 November 2018 | ||
| Total War Saga: Thrones of Britannia | Feral Interactive | TW Engine 3 | 24 May 2018 | ||
| Total War Saga: Troy | Feral Interactive | TW Engine 3 | 13 August 2020 | ||
| Universe Sandbox | Giant Army | Unity Engine 5 | TBA | Metal support in beta since June 2017[citation needed] | |
| Unreal Tournament | Epic Games | Unreal Engine 4 | Cancelled | Metal support since January 2017[citation needed] | |
| War Thunder | Gaijin Entertainment | Dagor Engine 4 | 1 November 2012 | Metal support added 24 May 2017 (removed in ? 2018 and reintroduced 27 August 2020)[citation needed] | |
| Warhammer 40,000: Dawn of War III | Feral Interactive | Essence Engine 4 | 9 June 2017 | ||
| The Witness | Thekla, Inc | Thekla Engine | 8 March 2017 | ||
| World of Warcraft | Blizzard Entertainment | WoW Engine | 23 November 2004 | Metal support since August 2016[citation needed] | |
| X-Plane 11 | Laminar Research | N/A | 30 May 2017 | Metal support in beta since 2 April 2020[27] |
See also
[edit]References
[edit]- ^ McWhertor, Michael (September 9, 2014). "This is the game Apple used to show off iPhone 6". Polygon. Vox Media. Archived from the original on September 11, 2014. Retrieved September 9, 2014.
- ^ "Getting started with Metal-CPP - Metal".
- ^ Apple Inc. "Metal Shading Language Specification" (PDF).
- ^ "Setting Up a Command Structure". Apple Inc.
- ^ "Metal Shading Language Guide". September 8, 2014. Retrieved September 10, 2014.
- ^ Apple Inc. "Setting Resource Storage Mode".
- ^ "Synchronizing a Managed Resource". Apple Inc.
- ^ "GPU Selection in macOS". Apple Inc.
- ^ "Metal Performance Shaders".
- ^ Machkovech, Same (June 2, 2014). "Apple gets heavy with gaming, announces Metal development platform". Ars Technica. Condé Nast.
- ^ Smith, Colin; Meza, Starlayne (June 8, 2015). "Apple Announces OS X El Capitan with Refined Experience & Improved Performance". Newsroom. San Francisco: Apple.
- ^ "Metal 2". Apple Developer. Apple. November 20, 2017. Archived from the original on November 20, 2017 – via Wayback Machine.
- ^ "Bring your Metal app to Apple Silicon Macs". developer.apple.com. Retrieved July 13, 2020.
- ^ "Discover Metal 3". developer.apple.com. Retrieved June 24, 2022.
- ^ "Program Metal in C++ with metal-cpp". developer.apple.com. Retrieved September 10, 2022.
- ^ "WWDC 2023 - Game Porting Toolkit and Metal updates". YouTube. June 5, 2023. Retrieved July 25, 2025.
- ^ "WWDC 2024 - Game Porting Toolkit 2 and more". YouTube. June 10, 2024. Retrieved July 25, 2025.
- ^ "WWDC 2025 - Metal 4 Announcement". YouTube. June 9, 2025. Retrieved July 25, 2025.
- ^ Chiappetta, Marco (December 11, 2018). "Apple Turns Its Back On Customers And NVIDIA With macOS Mojave". Forbes.
- ^ "Metal feature set tables" (PDF). Apple.
- ^ "Metal feature set tables (2022)" (PDF). Apple. Archived from the original on October 14, 2022. Retrieved June 12, 2025.
{{cite web}}: CS1 maint: bot: original URL status unknown (link) - ^ Apple Inc. "WWDC 2017 Platforms State of the Union".
- ^ "Vulkan Applications Enabled on Apple Platforms". Khronos Group Press Release. Retrieved February 24, 2021.
- ^ Larabel, Michael (June 1, 2018). "Initial Vulkan Performance On macOS With Dota 2 Is Looking Very Good". Phoronix. Retrieved June 5, 2018.
- ^ "HEROES OF THE STORM BALANCE PATCH NOTES — NOVEMBER 29, 2017". news.blizzard.com. November 29, 2017.
- ^ Kondrak, Krzysztof [@k_kondrak] (February 9, 2019). "vkQuake2 gets MacOS support" (Tweet). Retrieved February 9, 2019 – via Twitter.
- ^ "X-Plane 11.50 Public Beta 1: Vulkan and Metal Are Here". X-Plane Developer. April 2, 2020. Retrieved April 2, 2020.
External links
[edit]- Metal for Developers
- Metal Programming Guide (preliminary)
- WWDC14 demo; extended version
- Install macOS 10.14 Mojave on Mac Pro (Mid 2010) and Mac Pro (Mid 2012) - Apple article explaining what GPUs are compatible with Apple's Metal APIs on Mac OS 10.14 (Mojave) operating system
Metal (API)
View on GrokipediaIntroduction
Overview
Metal is a proprietary, low-overhead graphics and compute application programming interface (API) developed by Apple Inc., designed for 3D graphics rendering and general-purpose computing on graphics processing units (GPGPU).[4] It provides developers with low-level, direct access to GPU hardware, allowing for efficient execution of parallel workloads on Apple silicon.[1] This enables the creation of high-performance applications, including video games, augmented reality (AR) and virtual reality (VR) experiences, and machine learning tasks that demand intensive computational resources.[4] Available exclusively on Apple's ecosystem, Metal supports the primary platforms of iOS, iPadOS, macOS, tvOS, and visionOS, where it harnesses the capabilities of Apple-designed GPUs for both graphics and compute operations.[4] As the underlying technology, Metal powers higher-level frameworks such as SceneKit for 3D scene rendering and RealityKit for immersive AR/VR simulations, which build upon its performance while offering abstracted APIs for easier development.[6][7] Central to Metal is its shading language, known as the Metal Shading Language (MSL), a C++-based dialect based on a subset of C++17 (as of Metal 4) that allows developers to write custom shaders and compute kernels for GPU execution.[8] MSL provides a unified syntax for graphics and compute programming, facilitating optimized code that runs directly on the GPU with minimal runtime overhead.[8]Design goals
Metal's design prioritizes low-overhead access to GPU hardware, enabling developers to minimize CPU-GPU communication latency by providing direct control over GPU resources without the abstraction layers found in higher-level APIs. This approach reduces driver overhead and allows for more efficient command submission, ensuring that applications can fully utilize the parallel processing capabilities of Apple silicon GPUs.[4][2] A core principle is achieving cross-platform consistency across Apple devices, from iPhones and iPads to Macs and Apple Vision Pro, allowing developers to write unified codebases that deliver reliable performance without extensive platform-specific optimizations. This unified ecosystem fosters seamless portability, as Metal abstracts hardware differences while exposing consistent APIs for graphics and compute tasks across iOS, iPadOS, macOS, tvOS, and visionOS.[4][9] Metal strikes a balance between low-level control—comparable to Vulkan or DirectX 12—and the ease-of-use of older APIs like OpenGL, offering explicit resource management and command buffer encoding to prevent state-related errors while streamlining development with a simplified, object-oriented interface. This design enables fine-grained GPU programming for advanced workloads without the verbosity or implicit behaviors that complicate higher-level alternatives.[2] The API emphasizes power efficiency, particularly for mobile devices like iPhones and iPads, by optimizing resource allocation and execution to reduce energy consumption during intensive graphics and compute operations, while scaling seamlessly to high-performance desktops and laptops. This focus aligns with Apple silicon's integrated architecture, where efficient GPU utilization directly contributes to longer battery life in portable hardware.[4][2] Security features, such as built-in API validation layers, are integral to Metal's design to prevent common GPU programming errors like invalid memory accesses or shader violations, providing runtime checks that help developers debug issues early without compromising production performance. These layers, accessible through Xcode, enforce safe practices and mitigate risks in resource-heavy applications.[10][11] Ultimately, Metal aims to foster innovation in graphics-intensive applications by enabling high-fidelity rendering and compute tasks, including integration with Apple's Neural Engine via frameworks like Metal Performance Shaders for accelerated AI workloads such as machine learning inference. This capability supports cutting-edge features in games, augmented reality, and visual effects, empowering developers to push hardware boundaries within the Apple ecosystem.[4][12]Technical architecture
Core components
The core components of the Metal API form the foundational layer for developers to interact with the GPU, enabling the creation, configuration, and execution of graphics and compute workloads on Apple hardware. These components include devices for GPU access, resources for data storage, pipelines for processing configurations, command structures for work submission, and synchronization mechanisms for coordination. By leveraging these elements, applications can efficiently harness GPU capabilities while maintaining low-overhead communication between CPU and GPU.[1] Device and queue management begins with theMTLDevice protocol, which represents a specific GPU and serves as the primary entry point for Metal operations. Developers obtain an MTLDevice instance using functions like MTLCreateSystemDefaultDevice(), which selects the most suitable GPU, or by enumerating available devices via MTLCopyAllDevices() for multi-GPU systems. The device handles resource creation, such as buffers and textures, and provides access to GPU features like supported formats and limits. For asynchronous execution, the MTLCommandQueue is created from the device using makeCommandQueue(); it schedules the submission of command buffers to the GPU in the order they are committed, supporting parallel encoding from multiple threads to optimize throughput.[13]
Resources in Metal encompass data structures optimized for GPU access, including buffers, textures, and heaps. Buffers, represented by MTLBuffer, store unstructured data such as vertex positions, index arrays, or uniform values, allocated via the device with specified length and options like storage mode (shared or private). They are bound to shaders during command encoding for efficient data transfer. Textures, via MTLTexture, manage formatted image data in 1D, 2D, 3D, or array configurations, created using MTLTextureDescriptor to define dimensions, pixel formats (e.g., RGBA8Unorm), and usage (e.g., render target or shader read). Heaps, implemented as MTLHeap, enable efficient memory allocation by preallocating large blocks of GPU memory, from which multiple buffers and textures can be suballocated, reducing fragmentation and overhead in resource-intensive applications.
Render pipelines are configured through MTLRenderPipelineState objects, which encapsulate the fixed-function and programmable states for graphics rendering. Created from an MTLRenderPipelineDescriptor, these state objects specify vertex and fragment shaders (written in Metal Shading Language), blending modes, rasterization settings like multisampling, and depth/stencil configurations, allowing developers to define how primitives are processed into pixels. Once compiled by the device, the pipeline state is set on a render command encoder to execute draw calls efficiently. In contrast, compute pipelines, represented by MTLComputePipelineState, focus on general-purpose GPU (GPGPU) tasks without graphics output; they are built from MTLComputePipelineDescriptor specifying a single compute shader function and are used to dispatch threadgroups for parallel data processing, such as simulations or machine learning inference.
Command buffers and encoders facilitate the submission of work to the GPU by organizing operations into executable units. A command buffer (MTLCommandBuffer), created from a queue via makeCommandBuffer(), holds a sequence of encoded commands and associated resources; after encoding, it is committed to the queue with commit(), triggering asynchronous GPU execution while allowing the CPU to continue. Encoders, such as MTLRenderCommandEncoder for graphics or MTLComputeCommandEncoder for compute, are obtained from the buffer (e.g., makeRenderCommandEncoder(descriptor:) ) and used to append specific instructions—like setting pipeline states, binding resources, issuing draw or dispatch calls—before ending the encoder to return control to the buffer. This model ensures serialized execution within a buffer while permitting concurrent buffer creation across threads.[14][15][16][17]
Synchronization primitives coordinate CPU-GPU interactions and prevent data hazards, including fences, events, and barriers. MTLFence objects track resource state changes, particularly for heap-allocated resources, by signaling completion of a pass and waiting in subsequent passes within the same queue to resolve access conflicts without CPU intervention. MTLSharedEvent provides cross-queue and CPU-GPU synchronization; developers notify the event upon GPU milestone completion and wait on the CPU or another queue, enabling ordered execution in complex workloads. Barriers, inserted via encoder methods like updateTextureMappingMode() or pass descriptors, enforce ordering within a command buffer by stalling until prior operations complete, ensuring resource consistency during multi-pass rendering or compute chains.[18][19][20]
Updates in Metal 4
Metal 4, announced in June 2025, introduces significant enhancements to the core components for improved performance and efficiency on Apple silicon. Key changes include the introduction ofMTL4CommandQueue for reduced CPU and memory overhead, supporting batched commits with commit:count:. Command buffers are now MTL4CommandBuffer, which can be reused via beginCommandBuffer(allocator:) for better resource management. Resource bindings shift to argument tables using MTL4ArgumentTable, replacing per-resource bindings to streamline shader access. All resources are untracked by default, requiring explicit synchronization with next-generation barriers. Additionally, MTLTextureViewPool enables lightweight creation of texture views, and new encoders like MTL4CommandEncoder handle work-specific protocols. These updates enable explicit memory management and unified command encoding to minimize overhead in graphics and compute workloads.[21]
Metal Shading Language
The Metal Shading Language (MSL) is a programming language designed for writing shaders that execute on Apple GPUs, serving as the core mechanism for defining graphics and compute operations within the Metal API.[8] It is based on a subset of C++14 in Metal versions 1–3 and C++17 starting with Metal 4, incorporating extensions and restrictions tailored for parallel GPU execution, such as SIMD vector types (e.g.,float4 for four-component single-precision floating-point vectors) and address space qualifiers to manage data locality across device, threadgroup, and constant memory.[8] This foundation allows developers familiar with C++ to author low-level GPU code while enforcing safety constraints that prevent issues like undefined behavior in massively parallel environments.[8]
MSL supports multiple shader stages corresponding to different phases of the GPU pipeline, including vertex shaders for transforming geometry, fragment shaders for per-pixel coloring, and compute shaders for general-purpose parallel processing.[8] Later versions introduce specialized stages, such as mesh shaders for generating primitives directly on the GPU and amplification shaders for dynamically controlling vertex processing in Metal 3 and beyond.[22][23] These stages enable flexible pipeline configurations, where shaders are invoked sequentially or in compute-only workflows to handle tasks from rendering to simulations.
The language provides a rich set of built-in functions to facilitate common GPU operations, including matrix mathematics (e.g., matrix_multiply for vector-matrix products), texture sampling (e.g., texture2d.sample for bilinear interpolation), and atomic operations (e.g., atomic_fetch_add for thread-safe increments on shared memory).[8] These functions are optimized for hardware acceleration, ensuring efficient execution without requiring manual SIMD intrinsics in most cases. MSL's type system emphasizes fixed-size data suitable for GPU parallelism, supporting scalar types (e.g., float, int), vector types (e.g., half3), matrix types (e.g., float4x4), and aggregate structures for passing complex data between stages.[8] Notably, it prohibits dynamic memory allocation (no new or delete) and recursion to avoid stack overflows and non-deterministic behavior in parallel threads.[8]
Shaders written in MSL can be compiled either at runtime using API calls like MTLDevice.makeLibrary or offline via the metal compiler tool, which generates a binary library (.metallib) containing an intermediate representation for direct loading and execution on the GPU.[24] This dual approach balances development flexibility with performance, as offline compilation reduces load times in production apps.[8]
MSL includes extensions for advanced rendering and computation, such as ray tracing features introduced in 2020 and enhanced in Metal 3 with hardware acceleration, where developers can define custom intersection functions (e.g., using [[intersection(0)]] attributes) to compute ray-object hits with programmable payloads.[25] In Metal 4, tensor types and operators are natively supported, enabling direct implementation of machine learning primitives like matrix multiplication (tensor_multiply) and convolutions within shaders for integrated graphics-ML workflows.[26] These extensions maintain MSL's C++-like syntax while leveraging hardware-specific instructions for high-throughput operations.
Key features
Graphics rendering
Metal's graphics rendering pipeline follows a traditional structure optimized for hardware-accelerated 3D graphics on Apple platforms, enabling developers to process geometry and produce pixel outputs efficiently. The pipeline begins with vertex processing, where vertex shaders transform input geometry using buffers and textures to compute positions and attributes for each vertex.[27] Following vertex processing, primitive assembly and rasterization convert the transformed geometry into fragments, generating pixel coverage data across the render target.[27] Fragment shading then applies pixel-level computations via fragment shaders to determine final color values, incorporating textures, lighting, and material properties.[27] Finally, depth and stencil testing resolve visibility and layering, using optional attachments to discard or blend fragments based on depth buffers and stencil masks, ensuring correct scene composition.[27] To support dynamic mesh generation, Metal includes tessellation stages that subdivide coarse patches into finer geometry, enhancing detail without increasing base model complexity. Tessellation operates on quad or triangle patches, using per-patch factors stored in MTLBuffer objects to control subdivision levels, which are computed dynamically via compute kernels for adaptive rendering.[28] This feature, available on devices supporting MTLGPUFamily.apple3 or MTLGPUFamily.mac2 and later, integrates into the render pipeline to generate detailed surfaces efficiently.[29] While Metal does not provide traditional geometry shaders, tessellation serves a similar role in amplifying geometry on the GPU.[28] Introduced in Metal 3, ray tracing enables photorealistic rendering through hardware-accelerated ray intersection testing against scene geometry. Developers build acceleration structures, such as bounding volume hierarchies (BVH), to represent triangles and bounding volumes, allowing efficient ray traversal and intersection queries for shadows, reflections, and global illumination.[30] These structures accelerate ray tracing by culling non-intersecting regions, supporting real-time performance on Apple silicon GPUs.[30] Hybrid rendering combines ray tracing with traditional rasterization, using ray-traced effects like ambient occlusion or reflections to enhance rasterized scenes without fully replacing the pipeline.[31] Metal 4 enhances ray tracing with flexible acceleration structures, allowing developers to build custom bounding volume hierarchies with optimization flags for speed or size. Additionally, MetalFX upscaling improvements include integrated denoising and frame interpolation, boosting frame rates for demanding graphics workloads on Apple silicon devices.[5] Mesh shading, also part of Metal 3, provides a modern alternative for handling complex geometries through task and mesh shaders, replacing or augmenting older fixed-function stages like tessellation. Task shaders perform high-level culling and workload distribution, dispatching groups of vertices and primitives to mesh shaders, which generate variable output topologies directly on the GPU for scalable rendering.[32] This approach supports efficient processing of intricate models, such as terrain or crowds, by adjusting detail levels based on visibility and performance needs, and is hardware-accelerated on devices with MTLGPUFamily.apple7 or later.[32] Mesh shaders enable GPU-driven geometry creation, reducing CPU overhead and improving draw call efficiency for large scenes.[33] For performance optimization, Metal supports variable rate shading (VRS), allowing developers to rasterize different screen regions at varying sample rates to balance quality and speed. A rasterization rate map defines zones with coarse (e.g., 2x2 pixels per sample) or fine rates, applied during fragment processing to skip redundant computations in low-detail areas like blurred backgrounds.[34] This technique, checked via device.supportsRasterizationRateMap, reduces GPU workload when pixel shading is costly, such as in complex material evaluations.[34] Complementing VRS, image-based lighting (IBL) uses environment cubemaps and precomputed irradiance textures sampled in fragment shaders to approximate global illumination, minimizing real-time light calculations for reflective surfaces and ambient effects.[35] Metal integrates seamlessly with ARKit and RealityKit to enable spatial computing on visionOS, allowing custom Metal shaders to render augmented reality scenes with real-world passthrough and immersive 3D content. Developers can use RealityKit's CustomMaterial to inject Metal shader functions for entity rendering, combining ARKit's motion tracking with Metal's pipeline for low-latency overlays and interactions.[36] This integration supports visionOS apps by blending Metal-rendered graphics with spatial anchors, facilitating mixed-reality experiences like volumetric displays or hand-tracked interfaces.[37]Compute programming
Metal's compute programming enables developers to execute general-purpose computations on the GPU using compute shaders written in the Metal Shading Language (MSL). These shaders perform parallel processing tasks independent of graphics rendering, leveraging the GPU's massive parallelism for data-intensive operations. Compute pipelines, created via MTLComputePipelineState objects, encapsulate the shader function and its configuration, allowing efficient execution of kernels across threads organized in a grid.[38] Dispatching compute kernels occurs through a compute command encoder, where developers specify the grid size using methods like dispatchThreads, which launches threads in a 1D, 2D, or 3D grid calculated automatically by Metal to cover the workload. Threads are grouped into threadgroups, each containing up to a hardware-defined maximum number of threads (e.g., 1024 on many Apple GPUs), enabling coarse-grained parallelism where multiple threadgroups process independent data partitions. This structure supports scalable execution, with thread positions identifiable via built-in MSL attributes such as [[thread_position_in_grid]] for global indexing and [[thread_position_in_threadgroup]] for local positioning within a group.[39] Within a threadgroup, threads share a dedicated memory space declared with the threadgroup address space qualifier in MSL, providing low-latency access for collaborative data processing, such as reducing partial sums or exchanging intermediate results. This shared memory, limited to the threadgroup's lifetime and sized up to 32 KB depending on the device, facilitates efficient intra-group communication without relying on slower global device memory. To ensure data consistency and prevent race conditions during shared memory access, developers use the threadgroup_barrier function, which synchronizes all threads in the group as both an execution and memory barrier. For instance, threadgroup_barrier(mem_flags::mem_threadgroup, execution::threadgroup) guarantees that prior writes to threadgroup memory are visible to all threads before proceeding, enforcing uniform control flow across the group.[8][8] Metal's compute capabilities excel in large-scale simulations, image processing, and scientific computing by distributing workloads across thousands of threads, harnessing the GPU's vector processing units for tasks like matrix multiplications or finite difference methods. For dynamic workloads, indirect compute dispatches allow grid sizes and threadgroup counts to be specified from buffer data at runtime, using methods like dispatchThreadgroupsWithIndirectBuffer to adapt to varying input sizes without CPU intervention. Sparse textures further enhance efficiency for voluminous datasets, such as volumetric simulations, by allocating physical memory only for accessed regions; unmapped areas return zeros on read, and writes to them are discarded, reducing memory footprint for large 3D textures in compute shaders.[40][41] Integration with machine learning relies on buffer-based data passing, where input and output tensors are stored in MTLBuffers for seamless transfer to compute shaders, enabling custom kernel implementations for operations like convolutions or activations. In Metal 4, native tensor support via MTLTensor resources introduces multi-dimensional arrays optimized for ML models, allowing direct manipulation in shaders for inference and training pipelines alongside other compute tasks.[26] Representative use cases include video encoding, where compute kernels parallelize motion estimation and compression across frames for real-time processing; physics simulations, such as particle systems or fluid dynamics, that dispatch threadgroups to update positions and forces in parallel; and neural network inference, employing custom shaders to execute forward passes on batched inputs for applications like image recognition.[42][26]Performance optimization tools
Metal provides developers with a suite of integrated tools within Xcode and Instruments to debug, validate, and profile applications, enabling precise identification and resolution of performance bottlenecks in graphics, compute, and machine learning workloads.[43] The Metal debugger facilitates GPU capture, allowing developers to record and replay frames to inspect command buffers, resource states, and shader executions in detail. By capturing a Metal workload, developers can step through command encoders, examine dependencies between passes, and visualize the GPU timeline to pinpoint synchronization issues or inefficient resource usage. This frame debugging capability supports editing and reloading shaders on-the-fly, which accelerates iteration and helps optimize rendering pipelines by revealing visual artifacts or suboptimal draw calls.[44][45] API validation layers operate at runtime to enforce correct usage of the Metal API and shaders, catching errors such as invalid resource bindings, incorrect flags in command encoding, or out-of-bounds buffer accesses before they impact production performance. The API Validation layer inspects resource creation and command submission, flagging violations like mismatched storage modes or improper synchronization, while the Shader Validation layer detects execution-time issues, including nil texture accesses or divergent control flow that could lead to stalls. Enabling these layers during development incurs a performance cost but ensures robust code, preventing subtle bugs that degrade GPU efficiency in dynamic scenarios.[46][10] For in-depth profiling, GPU counters and timeline views in the Metal debugger offer granular insights into hardware utilization, including bandwidth consumption, cache hit rates, and execution durations across shader stages. The performance timeline displays parallel CPU and GPU activities, with tracks for vertex, fragment, and compute passes, alongside aggregated shader waterfalls that highlight overlaps and stalls. Counters such as occupancy (measuring active thread utilization), limiter (identifying bottlenecks in execution units), and bandwidth (tracking memory read/write throughput) enable developers to quantify inefficiencies, like low cache coherence or excessive serialization, and guide optimizations such as adjusting threadgroup sizes or reducing texture fetches. These views support both Apple silicon and non-Apple GPUs on macOS, providing consistent profiling across hardware.[47][48][49] Argument buffers serve as a key optimization for reducing CPU overhead in scenarios involving dynamic pipelines, where frequent resource binding would otherwise introduce latency. By encapsulating groups of resources—like buffers, textures, and samplers—into a single buffer passed to shaders, developers can bind complex argument sets once per frame rather than per command, minimizing driver overhead and state validation. Metal supports two tiers of argument buffers: Tier 1 for immutable, CPU-readable resources with limited counts (e.g., up to 31 total arguments on many devices), and Tier 2 for mutable resources with pointer indexing in the Metal Shading Language, available on iOS 13+, allowing up to 96 samplers on mobile devices. This approach is particularly effective for read-only data, such as terrain rendering or particle systems, where it can cut CPU time by consolidating hundreds of individual setResource calls into array accesses.[50][51] On unified memory architectures like Apple silicon, resource residency and eviction controls help manage memory pressure by allowing developers to influence when resources are retained or discarded from GPU-accessible memory. Resources and heaps support purgeable states (e.g., MTLPurgeableState.volatile or .empty), enabling explicit eviction of unused assets to free bandwidth, with the system automatically restoring them on access if needed. This is crucial for applications with large datasets, as it prevents thrashing on devices with shared CPU-GPU memory pools, though the driver handles most residency automatically; developers track it manually for argument buffers to avoid unexpected faults. Heaps allocated from MTLHeap can propagate purgeability to their resources, optimizing footprint in compute-heavy workloads without manual deallocation.[52] To facilitate development without relying solely on Apple hardware, Metal includes simulation tools like the iOS Simulator and macOS support for non-Apple GPUs (e.g., AMD or Intel), allowing testing of Metal code on diverse configurations. In the Simulator, Metal apps run with native GPU acceleration via the host Mac's Metal implementation, supporting frame capture and counters for rapid prototyping, though some features like certain storage modes require alternative render paths. For non-Apple GPUs on macOS, the Metal debugger provides counter statistics tailored to discrete hardware, measuring activities like vertex processing or memory transfers to simulate real-world performance variations during optimization. These tools ensure compatibility and performance tuning across the ecosystem without physical device access.[53][49]Metal Performance Shaders
Metal Performance Shaders (MPS) is a high-level framework built on Metal that provides a library of pre-optimized compute and graphics shaders for accelerating common workloads across Apple GPUs. These shaders are fine-tuned for the unique architecture of each Metal GPU family, enabling developers to achieve high performance without manual low-level optimization. MPS supports a wide range of tasks, including machine learning inference and training, image and video processing, linear algebra operations, and advanced rendering techniques, all while ensuring energy efficiency on Apple silicon devices.[54] At the core of MPS for machine learning is MPSGraph, a compute engine that allows developers to construct, compile, and execute symbolic graphs of operations, where each node represents a shader-encapsulated function producing tensor outputs as graph edges. This declarative programming model supports neural network layers, such as convolutions and activations, as well as training operations like forward and backward passes, leveraging hardware accelerators for high-throughput execution. MPSGraph facilitates integration with Core ML by enabling the deployment of trained models directly onto Metal compute pipelines, allowing seamless GPU acceleration of inference tasks in applications. For example, transformer models can be optimized using MPSGraph's graph compilation, yielding significant speedups on Apple silicon.[55][56][4] MPS also offers specialized kernels for image and matrix processing, including filters for convolutions, reductions, and histogram computations, which automatically tune parameters based on the target GPU's capabilities. These kernels handle compressed texture formats like PVRTC and ASTC, making them suitable for real-time image manipulation in graphics and vision pipelines. In linear algebra, MPS provides optimized routines for matrix multiplication, factorization, and solving systems of equations, often outperforming general-purpose compute shaders by exploiting SIMD instructions and memory hierarchies inherent to Apple GPUs.[57][54] For advanced rendering, MPS includes ray tracing primitives that accelerate ray-geometry intersection tests, optimized for the ray tracing hardware in Apple silicon GPUs, enabling efficient photorealistic effects in games and simulations. While general physics simulations can leverage these compute kernels, MPS focuses on declarative integration rather than raw primitive control. MPS supports sparse textures for efficient handling of irregular data in graphs, with enhancements in Metal 3 for operations like indirect command buffers, while Metal 4 adds native tensor support in the API and shading language, allowing direct tensor manipulation within shaders for more streamlined machine learning workflows.[54][29][5]Development history
Initial release and Metal 1
Apple announced Metal at its Worldwide Developers Conference (WWDC) on June 2, 2014, introducing it as a new graphics and compute API designed to enhance performance on iOS and macOS platforms.[2] The framework debuted alongside iOS 8 in September 2014 and macOS 10.10 Yosemite in October 2014, marking Apple's shift toward a proprietary, low-level interface for GPU acceleration.[58][59] The development of Metal was primarily motivated by the limitations of OpenGL ES, which imposed significant overhead in driver translation and state management, hindering efficient GPU utilization on power-constrained mobile hardware.[60] By providing direct access to the GPU with minimal abstraction, Metal aimed to reduce CPU involvement and unlock higher performance for graphics rendering and general-purpose computing on devices like the iPhone 5s.[61] At launch, its core features encompassed basic graphics pipelines for vertex and fragment shading, compute pipelines for parallel data processing, command encoding to batch GPU instructions, and multi-threaded submission allowing multiple CPU threads to prepare command buffers concurrently.[2][62] These elements enabled developers to construct efficient render passes and dispatch kernels with reduced latency compared to prior APIs.[3] Initial hardware compatibility was limited to Apple's A7 system-on-chip (SoC) and subsequent generations for iOS, ensuring broad availability on devices starting with the iPhone 5s and iPad Air.[2] On macOS, support extended to NVIDIA GPUs based on the Kepler architecture and AMD Radeon GPUs found in Macs from mid-2012 onward, aligning with the transition to Yosemite.[63][64] Developers encountered challenges in adopting Metal due to the need to rewrite OpenGL ES-based codebases, which required significant refactoring of shaders and rendering logic. To facilitate the transition, third-party wrapper libraries like MoltenGL emerged, translating OpenGL ES calls to Metal equivalents and allowing legacy applications to run with improved efficiency.[65] Despite these aids, initial uptake on macOS was slower than on iOS, as developers grappled with the API's lower-level paradigm. A notable early milestone was the launch of Vainglory in November 2014, a multiplayer online battle arena game that utilized Metal to deliver 60 frames per second (FPS) on iOS devices like the iPhone 6, showcasing the API's capability for smooth, high-fidelity mobile gaming previously unattainable with OpenGL ES.[66][67] This demonstration highlighted Metal's impact on enabling console-like experiences on handheld hardware.[68]Metal 2 advancements
Metal 2 was released in 2017 alongside iOS 11 and macOS 10.13 High Sierra, marking a significant evolution of Apple's graphics and compute API with a focus on GPU-driven workflows and expanded hardware capabilities.[69][70] Among the key new features, tile shading introduced a novel pipeline stage that enables developers to perform compute-like operations directly within the rendering pass, improving memory efficiency by processing screen-space tiles on Apple GPUs without additional bandwidth overhead.[71] This tiled rendering approach leverages the tile-based deferred architecture of Apple hardware to blend graphics and compute tasks seamlessly, reducing the need for separate passes in complex scenes like forward-plus lighting.[72] Indirect command buffers further advanced GPU autonomy by allowing the GPU to generate and execute draw commands dynamically, minimizing CPU involvement and enabling scalable rendering for large numbers of objects. Complementing this, multi-GPU rendering support extended to external GPUs via Thunderbolt 3, permitting developers to distribute workloads across multiple graphics processors for enhanced performance in demanding applications.[73] In compute programming, Metal 2 enhanced capabilities through function specialization, where shaders could be compiled with constant parameters to optimize for specific use cases, reducing runtime overhead and improving execution efficiency.[74] Interoperability with other frameworks was bolstered, including tighter integration with IOSurface for efficient texture sharing and support for raster order groups to ensure predictable memory access in fragment shaders, facilitating advanced techniques like order-independent transparency.[74][75] These advancements delivered notable performance improvements, with Apple reporting up to a 40% increase in rendering speed on supported hardware through optimizations like argument buffers, which cut CPU overhead by binding resources more efficiently.[76] Overall draw call throughput could reach up to 10 times higher in GPU-driven scenarios compared to prior versions. The release spurred ecosystem growth, as Valve integrated Metal support into Steam, enabling smoother ports of games like those from Unity and Epic to macOS and paving the way for broader third-party adoption.[77][69]Metal 3 innovations
Metal 3, unveiled at Apple's Worldwide Developers Conference (WWDC) in June 2022, debuted alongside iOS 16 and macOS Ventura, introducing optimizations tailored to Apple silicon for enhanced graphics and compute performance.[78] This version emphasized low-overhead access to GPU capabilities, enabling developers to leverage unified memory architecture more effectively through bindless resources, which support zero-copy data sharing between CPU and GPU by providing GPU-specific handles likegpuResourceID and gpuAddress for direct memory access without traditional binding overhead.[79] These advancements reduce latency and memory bandwidth usage, allowing seamless transfer of large datasets such as textures and buffers directly to GPU memory.[80]
A cornerstone of Metal 3 is its support for advanced rendering techniques, including ray tracing (hardware-accelerated starting with M3-series processors; software-emulated on earlier Apple silicon like M1 and M2) and mesh shading (available starting with A15 Bionic and M2-series processors), on compatible Apple silicon GPUs. While ray tracing is available via Metal 3 on earlier Apple silicon like M1 and M2 through software emulation, hardware acceleration begins with the M3 series for improved performance.[29] Ray tracing integration into the core Metal API simplifies implementation of realistic lighting, shadows, and reflections by accelerating ray intersection calculations on the GPU, with optimizations for building and traversing acceleration structures.[81] Mesh shading introduces programmable geometry processing, replacing fixed-function stages with flexible shaders that dynamically generate vertices and primitives, ideal for complex scenes like dense foliage or particle effects. Complementing these is dynamic resolution scaling via MetalFX, a new framework for upscaling and frame generation that renders at lower resolutions before applying high-quality spatial upscaling and temporal anti-aliasing, significantly boosting frame rates in demanding applications without sacrificing visual fidelity—for instance, enabling 4K output from 1080p internals.[82] MetalFX also supports frame interpolation for smoother motion, further enhancing real-time performance.[83]
Developer tools in Metal 3 received substantial upgrades, including an improved Metal Performance HUD for real-time monitoring of metrics like GPU utilization, frame times, and shader compilation activity, overlaid directly on running apps to aid debugging and optimization.[84] GPU frame capture in Xcode was enhanced with better integration for ray tracing and mesh shading pipelines, allowing developers to inspect command buffers and resource residency more granularly.[85] Additionally, the Fast Resource Loading API, powered by MTLIOCommandQueue and asynchronous command buffers, streamlines asset ingestion by bypassing CPU bottlenecks, providing a direct path from storage to GPU for high-resolution textures and geometry.[86]
These innovations have notably impacted gaming on Apple platforms, facilitating console-quality titles by harnessing Apple silicon's efficiency. For example, Resident Evil Village was ported to macOS using Metal 3 features like MetalFX upscaling and ray tracing support, achieving immersive visuals and responsive performance on M1 and M2 Macs, demonstrating how these tools enable expansive worlds with photorealistic effects previously limited to dedicated consoles.[87] Overall, Metal 3's focus on Apple silicon optimizations has broadened the ecosystem for high-fidelity graphics, making advanced rendering accessible across iOS, iPadOS, and macOS devices.[78]
Metal 4 and beyond
Metal 4 was announced at Apple's Worldwide Developers Conference (WWDC) on June 9, 2025, alongside previews of iOS 19, macOS 16, and visionOS 3, marking a significant evolution in Apple's graphics and compute API designed to leverage the latest Apple Silicon advancements.[88][89] This release emphasizes seamless integration of machine learning capabilities directly into graphics workflows, enabling developers to harness GPU and Neural Engine resources more efficiently for AI-driven applications.[5] A cornerstone of Metal 4 is its native support for tensors, introduced as first-class citizens in both the API and the Metal Shading Language (MSL), facilitating optimized handling of multidimensional data essential for machine learning workloads.[5][8] Tensors can now be created, manipulated, and operated on directly within shaders and compute kernels, with new tensor types and operations like matrix multiplication accelerating common ML tasks such as model inference on the GPU.[90] This support extends to the tensor resource and ML encoder, allowing developers to encode and execute neural networks alongside rendering and compute commands on the unified GPU timeline, reducing latency in AI-accelerated graphics pipelines.[90] The API's AI-graphics fusion enables hardware-accelerated integration of machine learning models into rendering processes, including embedding neural networks directly within shaders for advanced effects like real-time style transfer and object enhancement.[90] Through shader ML features, developers can perform inline inference operations, fusing AI computations with graphics to support emerging techniques in neural rendering and generative effects without separate processing stages.[90] This approach leverages the GPU's parallel architecture for efficient execution, as demonstrated in session examples where ML models run synchronously with draw calls to enhance visual fidelity in applications.[90] For gaming, Metal 4 enhances performance through updates to MetalFX, providing DLSS-like temporal upscaling to render at lower resolutions and upscale to higher outputs with minimal quality loss, achieving up to 2x frame rate improvements in demanding scenes.[91] Additionally, MetalFX Frame Interpolation generates intermediate frames between rendered ones using motion vectors and AI-driven prediction, delivering smoother gameplay and higher effective frame rates comparable to NVIDIA's DLSS 3 frame generation.[91][92] These features, combined with denoising for ray-traced content, optimize resource utilization on Apple Silicon, enabling more complex visuals in titles ported or natively developed for macOS and iOS.[91] To broaden the ecosystem, Metal 4 integrates with Game Porting Toolkit 3 (GPTK 3), offering improved translation layers and simulation tools that allow developers to test and optimize DirectX-based games on non-Apple hardware before deployment to Apple platforms.[93] GPTK 3 enhances cross-platform compatibility by unifying command encoding and providing MetalFX support during emulation, streamlining the porting process for Windows titles and enabling performance profiling on diverse systems.[92] This facilitates easier adoption by third-party studios, with tools for shader conversion and runtime simulation reducing development barriers.[93] Looking ahead, Metal 4 paves the way for deeper integration with the Apple Neural Engine (ANE), allowing direct programming of Neural Accelerators via tensor APIs to offload specific ML operations from the GPU while maintaining pipeline cohesion.[94] This enables parallel execution of ANE tasks alongside GPU shaders, boosting efficiency for hybrid AI workloads in future updates.[95] As Apple continues to evolve its silicon ecosystem, Metal 4's ML primitives position it for expanded support in on-device AI, including potential enhancements to Metal Performance Shaders for advanced neural operations.[5]Hardware compatibility
Supported processors
Metal supports a range of Apple-designed system-on-chip (SoC) processors across its platforms, with compatibility evolving alongside hardware advancements and API versions. On iOS, iPadOS, and tvOS, initial support began with the Apple A7 SoC, enabling Metal 1 features from iOS 8 and tvOS 9. Subsequent SoCs, including A8 through A9 for enhanced Metal 1 capabilities, A10 through A13 for Metal 2 and basic Metal 3, and A14 Bionic through A18 for support for Metal 3 and Metal 4, with advanced features like hardware ray tracing available on later processors such as A17 Pro+ and M1+, have progressively expanded feature availability. Devices with A14 Bionic or later, such as iPhone 12 and subsequent models, iPad Pro (2020 and later), and Apple TV 4K (2nd generation and later), provide Metal 4 compatibility when running iOS 18, iPadOS 18, or tvOS 18 or later.[4][9][29] For macOS, Metal compatibility initially encompassed Intel-based Macs equipped with specific discrete and integrated GPUs. Metal 1 launched on OS X El Capitan 10.11 with support for NVIDIA Kepler-series GPUs (e.g., in MacBook Pro 2012–2015), AMD GCN-based GPUs from 2012 models (e.g., Radeon HD 7000 series), and Intel HD Graphics 4000 or later integrated graphics in Macs from 2012 onward, including MacBook Air, iMac, and Mac mini. Metal 2 extended to AMD Radeon Pro Vega GPUs in 2017–2018 models like the iMac Pro and MacBook Pro. Metal 3 added compatibility with AMD Radeon Pro 5000/6000 series, Intel UHD Graphics 630, and Intel Iris Plus Graphics on select 2017–2020 Intel Macs, alongside all Apple M-series SoCs starting with M1 on macOS Big Sur 11 or later. However, as of macOS Sonoma 14 and later, support for older Intel configurations, including 32-bit apps and certain pre-2017 GPUs, has been deprecated in favor of Apple silicon.[9][96][97] Apple silicon processors, beginning with the M1 family, unify Metal support across macOS, providing backward compatibility for Metal 1 through 3 features on all M1 through M5 SoCs in devices like MacBook Air (2020+), MacBook Pro (2020+), iMac (2021+), Mac mini (2020+), Mac Studio (2022+), and Mac Pro (2023+). Metal 4 requires M1 or later with macOS Sequoia 15 or subsequent releases, marking the end of Intel Mac support for new API features. On visionOS, Metal is supported exclusively on the M2 SoC in Apple Vision Pro devices running visionOS 1 or later, with Metal 3 and 4 features available from visionOS 2 onward.[4][9][29] Backward compatibility ensures that applications built for earlier Metal versions run on newer hardware, though advanced features like hardware ray tracing in Metal 3 (available on M1+ for Apple silicon and A17 Pro+ for iOS devices) or tensor support in Metal 4 (native handling on A14+/M1+, optimized on A18+/M4+) are limited to qualifying processors. Basic Metal 3 support begins with A12 or later. As of 2025, all Apple silicon-equipped devices, including those with A14 through A18 and M1 through M5, support Metal 4 under compatible operating systems.[4][9]Feature tiers
Metal organizes its feature support into tiered levels based on GPU families, allowing developers to target capabilities that align with hardware constraints across Apple devices. These tiers group features into progressive sets, starting from fundamental graphics and compute operations to advanced rendering and AI acceleration. Support is determined by the underlying GPU family, with newer families enabling more sophisticated functionality. Developers must account for these tiers to ensure application compatibility and performance on diverse hardware, from older A-series chips to the latest M-series processors. Platform-specific differences apply, particularly for iOS vs. macOS/iPadOS (e.g., hardware ray tracing on M1+ for Apple silicon, A17 Pro+ for iOS).[29] Tier 1 provides the foundational level of Metal support, encompassing basic graphics rendering and compute tasks without advanced effects like ray tracing. This tier is available on devices with Apple GPU family 1 (A7) and extends to all higher families, including M1 (family 7), ensuring broad compatibility for core operations such as vertex and fragment shading, texture sampling, and simple parallel compute kernels. It lacks hardware support for ray tracing, relying instead on general-purpose compute shaders for any simulation of such effects.[98][29] Tier 2 builds on Tier 1 by adding advanced graphics features, including mesh shading for more efficient geometry processing. This tier requires Apple GPU family 7 or later, corresponding to A15 Bionic, M1, and subsequent chips like A16 and M2 (family 8). Mesh shaders enable GPU-driven mesh generation and culling, reducing CPU overhead in complex scenes, but indirect mesh commands are limited until family 9. Other enhancements include improved argument buffer handling at Tier 2 level for better resource management in shaders.[29][33] Tier 3 introduces full hardware-accelerated ray tracing and enhanced MetalFX upscaling, targeting high-fidelity rendering on premium devices. It is supported on Apple GPU family 7 and above (M1+ for Apple silicon; A17 Pro/family 8+ for iOS), including A17 Pro, M3, and M4 chips. Ray tracing leverages dedicated hardware intersection testing and acceleration structures for real-time global illumination and reflections, while MetalFX provides temporal upscaling and frame generation to boost performance without sacrificing visual quality. Denoised upscaling is exclusive to this tier on family 9+. Earlier families can emulate ray tracing via compute but without hardware efficiency.[99][29][82] Tier 4, aligned with Metal 4, focuses on tensor operations and AI acceleration, enabling efficient machine learning workloads directly on the GPU. Support begins with family 6 (A14/M1+), offering tensor shaders for matrix multiplications used in neural networks, but full capabilities—including advanced sparse tensor handling and integrated AI encoding—require family 8+ (A17 Pro/M2). This tier integrates with Metal Performance Shaders for accelerated transformer models and inference, providing up to several times faster AI processing compared to CPU fallbacks. Native tensor handling in Metal 4 is supported on A14+ and M1+, with optimized performance on A18+ and M4+.[29][56][100]| Tier | Key Hardware Families | Major Features | Example Devices |
|---|---|---|---|
| 1 | Apple1–Apple10, Mac1–Mac2 | Basic graphics/compute, no ray tracing | A7, A11 |
| 2 | Apple7+ , Mac2 | Mesh shading, Tier 2 argument buffers | A15, M1, M2 |
| 3 | Apple7+ (full hardware RT on Apple8+ for iOS, Apple7+ for Apple silicon) | Hardware ray tracing, advanced MetalFX | A17 Pro, M1, M3 |
| 4 | Apple6+ (full on Apple8+) | Tensor ops, AI acceleration (Metal 4) | A14, A18, M1, M4 |
MTLDevice.supportsFamily(_:) method, passing an MTLGPUFamily value such as .apple7 to verify if the device meets a minimum family requirement. Additional checks like supportsRaytracing or readWriteTextureSupport provide granular confirmation for specific features, enabling dynamic code paths.[101][102]
These tiers necessitate fallback strategies for cross-device compatibility; for instance, on Tier 1 or 2 hardware lacking hardware ray tracing, applications can implement software-based ray tracing using compute pipelines, though at a performance cost of 2–5x slower than dedicated hardware. Similarly, MetalFX fallbacks to bilinear upscaling on unsupported devices maintain functionality while prioritizing scalability across the ecosystem.[103][104]
