Hubbry Logo
Metal (API)Metal (API)Main
Open search
Metal (API)
Community hub
Metal (API)
logo
8 pages, 0 posts
0 subscribers
Be the first to start a discussion here.
Be the first to start a discussion here.
Metal (API)
Metal (API)
from Wikipedia

Metal
DeveloperApple Inc.
Initial releaseJune 2014; 11 years ago (2014-06)
Stable release
4 / June 2025; 5 months ago (2025-06)
Written inMetal Shading Language (C++14-based in Metal 1–3, C++17-based in Metal 4), Runtime/API: Objective-C, Swift, C++ (binding via Metal-cpp)
Operating systemiOS, iPadOS, macOS, tvOS, watchOS and visionOS
Type3D graphics and compute API
LicenseProprietary
Websitedeveloper.apple.com/metal/

Metal is a low-level, low-overhead hardware-accelerated 3D graphic and compute shader API created by Apple, debuting in iOS 8. Metal combines functions similar to OpenGL and OpenCL in one API. It is intended to improve performance by offering low-level access to the GPU hardware for apps on iOS, iPadOS, macOS, tvOS, watchOS and visionOS. It is similar to low-level APIs on other platforms such as Vulkan and DirectX 12.

Metal is an object-oriented API that can be invoked using the Swift, Objective-C or C++17[2] programming languages. Full-blown GPU execution is controlled via the Metal Shading Language. According to Apple promotional materials: "MSL [Metal Shading Language] is a single, unified language that allows tighter integration between the graphics and compute programs. Since MSL is C++-based, you will find it familiar and easy to use."[3]

Features

[edit]

Metal aims to provide low-overhead access to the GPU. Commands are encoded beforehand and then submitted to the GPU for asynchronous execution. The application controls when to wait for the execution to complete thus allowing application developers to increase throughput by encoding other commands while commands are executed on the GPU or save power by explicitly waiting for GPU execution to complete. Additionally, command encoding is CPU independent thus applications can encode commands to each CPU thread independently. Lastly, render states are pre-computed beforehand, allowing the GPU driver to know in advance how to configure and optimize the render pipeline before command execution.[4]

Metal improves the capabilities of GPGPU programming by using compute shaders. Metal uses a specific shading language based on C++14, implemented using Clang and LLVM.[5]

Metal allows application developers to create Metal resources such as buffers, textures. Resources can be allocated on the CPU, GPU, or both and provides facilities to update and synchronize allocated resources. Metal can also enforce a resource's state during a command encoder's lifetime.[6][7]

On macOS, Metal can provide application developers the discretion to specify which GPU to execute. Application developers can choose between the low-power integrated GPU of the CPU, the discrete GPU (on certain MacBooks and Macs) or an external GPU connected through Thunderbolt. Application developers also have the preference on how GPU commands are executed on which GPUs and provides suggestion on which GPU a certain command is most efficient to execute (commands to render a scene can be executed by the discrete GPU while post-processing and display can be handled by the integrated GPU).[8]

Metal Performance Shaders

[edit]

Metal Performance Shaders is a highly optimized library of graphics functions that can help application developers achieve great performance at the same time decrease work on maintaining GPU family specific functions.[9] It provides functions including:

  • Image filtering algorithms
  • Neural network processing
  • Advanced math operations
  • Ray tracing

History

[edit]

Metal has been available since June 2, 2014 on iOS devices powered by Apple A7 or later,[10] and since June 8, 2015 on Macs (2012 models or later) running OS X El Capitan.[11]

On June 5, 2017, at WWDC, Apple announced the second version of Metal, to be supported by macOS High Sierra, iOS 11 and tvOS 11. Metal 2 is not a separate API from Metal and is supported by the same hardware. Metal 2 enables more efficient profiling and debugging in Xcode, accelerated machine learning, lower CPU workload, support for virtual reality on macOS, and specificities of the Apple A11 GPU, in particular.[12]

At the 2020 WWDC, Apple announced the migration of the Mac to Apple silicon. Macs using Apple silicon will feature Apple GPUs with a feature set combining what was previously available on macOS and iOS, and will be able to take advantage of features tailored to the tile based deferred rendering (TBDR) architecture of Apple GPUs.[13]

At the 2022 WWDC, Apple announced the third version of Metal (Metal 3), which would debut with the release of macOS Ventura, iOS 16 and iPadOS 16. Metal 3 introduces the MetalFX upscaling framework, which renders complex scenes in less time per frame with high-performance upscaling and anti-aliasing, mesh shaders support.[14] Also announced possibility to use C/C++ for Metal API.[15]

At the 2023 WWDC, Apple announced a brand new toolkit called the Game Porting Toolkit to port Windows 10/11-based games. It includes an environment to test binaries, translation layers from HLSL to MSL, and Metal-cpp bindings. Jeremy Sandmel announced a new Game Mode for macOS Sonoma, and Hideo Kojima announced Death Stranding for the macOS App Store.[16]

At the 2024 WWDC, Apple announced Game Porting Toolkit 2, along with the release of new games such as Control: Ultimate Edition, Frostpunk 2, and Assassin's Creed Shadows for macOS.[17]

At the 2025 WWDC, Apple announced Metal 4, a new version of the API featuring a unified command encoder system, support for neural rendering, and new technologies such as MetalFX Frame Interpolation and a ray tracing denoiser.[18]

Supported GPUs

[edit]

The first version of Metal supports the following hardware and software:[19]

The second version of Metal supports the following hardware and software:

The third version of Metal supports the following hardware and software:[20]

The fourth version of Metal supports the following hardware and software:

Adoption

[edit]

According to Apple, more than 148,000 applications use Metal directly, and 1.7 million use it through high-level frameworks, as of June 2017.[22] macOS games using Metal for rendering are listed below.

Title Developer (macOS version) Game engine MacOS release date (OpenGL/DirectX) Metal-based release date Metal support notes
Ark: Survival Evolved Studio Wildcard Unreal Engine 4 29 August 2017
Assassin's Creed Shadows Ubisoft Quebec Ubisoft Anvil 20 March 2025
ARMA 3 Virtual Programming Real Virtuality 31 August 2015 Metal support in beta since 17 September 2017[citation needed]
Baldur's Gate III Larian Studios Divinity Engine 4.0 22 September 2023 Metal support in early access since 6 October 2020[citation needed]
Ballistic Overkill Aquiris Game Studio Unity Engine 5 28 March 2017
Batman: Arkham City Feral Interactive Unreal Engine 3 18 October 2013 Metal support since 21 February 2019 with v1.2[citation needed]
Batman: The Enemy Within Telltale Games Telltale Tool 8 August 2017
BattleTech Harebrained Schemes Unity Engine 5 24 April 2018
Bioshock Remastered Feral Interactive Unreal Engine 2.5 22 August 2017
Bioshock 2 Remastered Feral Interactive Unreal Engine 2.5 22 October 2020
Cyberpunk 2077 CD Projekt REDengine 4 17 July 2025 Support for Metal 4
Cities: Skylines Paradox Interactive Unity Engine 5 10 March 2015 Metal support since 18 May 2017[citation needed]
Civilization VI Aspyr Media LORE 24 October 2016 Metal support since 5 April 2019[citation needed]
Company of Heroes 2 Feral Interactive Essence Engine 3 21 January 2015 Metal support since 19 October 2018[citation needed]
Control: Ultimate Edition Remedy Entertainment Northlight Engine 26 March 2025
Dead Island 2 Dambuster Studios Unreal Engine 4 24 July 2025
Deus Ex: Mankind Divided Feral Interactive Dawn Engine 12 December 2017
DiRT Rally Feral Interactive EGO Engine 2.5 16 November 2017
Divinity: Original Sin II Larian Studios Divinity Engine 2 31 January 2019
Dota 2 Valve Source 2 18 July 2013 MoltenVK was announced on 26 February 2018.[23] The option to use this became available on 31 May 2018.[24]
The Elder Scrolls Online Zenimax Online Studios N/A 4 April 2014 22 October 2018 OpenGL Renderer replaced with Vulkan via MoltenVK wrapper (translates Vulkan API calls to Metal) in patch 4.2.5
Empire: Total War Feral Interactive TW Engine 3 4 March 2009 Metal support since 16 December 2019[citation needed]
EVE Online CCP Games N/A 6 November 2007 14 October 2021 Previously available on macOS via DirectX 9.0 from November 2007 until February 2009; native macOS version using Metal released 14 November 2021[citation needed]
Everspace Rockfish Unreal Engine 4 26 May 2017
F1 2016 Feral Interactive EGO Engine 4.0 6 April 2017
F1 2017 Feral Interactive EGO Engine 4.0 25 August 2017
Fortnite Epic Games Unreal Engine 4 25 July 2017
Frostpunk 11 Bit Studios Liquid Engine 24 February 2021
Frostpunk 2 11 Bit Studios Unreal Engine 5 20 September 2024
Gravel Virtual Programming Unreal Engine 4 20 January 2019
Guardians of the Galaxy: The Telltale Series Telltale Games Telltale Tool 18 April 2017
Headlander Double Fine Productions Buddha Engine 18 November 2016
Heroes of the Storm Blizzard Entertainment SC2 Engine 2 June 2015 Metal support in beta since 24 January 2017 (temporarily removed on 29 November 2017[25] until ?)[citation needed]
Hitman Feral Interactive Glacier 2 20 June 2017
Life Is Strange: Before the Storm Feral Interactive Unity Engine 13 September 2018
Life Is Strange 2 Feral Interactive Unreal Engine 4 2019
Mafia III Aspyr Media Illusion Engine 11 May 2017
Medieval II: Total War Feral Interactive TW Engine 2 17 December 2015 Metal support since 25 October 2018[citation needed]
Micro Machines World Series Virtual Programming Unity Engine 5 30 June 2017
Minecraft: Story Mode - Season Two Telltale Games Telltale Tool 11 July 2017
MXGP3 Virtual Programming Unreal Engine 4 23 November 2018
Napoleon: Total War Feral Interactive TW Engine 3 2 July 2013 Metal support since 25 October 2019 with v1.2[citation needed]
Obduction Cyan Worlds Unreal Engine 4 29 March 2017
Observer Bloober Team Unreal Engine 4 24 October 2017
Quake II id Software Quake II engine 9 February 2019 A port using MoltenVK was released as vkQuake2.[26]
Refunct Dominique Grieshofer Unreal Engine 4 5 September 2016
Resident Evil 2 Capcom RE Engine 10 December 2024
Resident Evil 3 Capcom RE Engine 18 March 2025
Resident Evil 4 Capcom RE Engine 20 December 2023
Resident Evil Village Capcom RE Engine 28 October 2022 First macOS game with MetalFX support
Rise of the Tomb Raider Feral Interactive Foundation Engine 12 April 2018
Shadow of the Tomb Raider Feral Interactive Foundation Engine 2019
Sid Meier's Railroads! Feral Interactive Gamebryo 1 November 2012 Metal support since 18 December 2018[citation needed]
The Sims 3 Maxis Redwood Shores The Sims 3 Engine 2 June 2009 28 October 2020
The Sims 4 Maxis SmartSim 17 February 2015 Metal support added 12 November 2019[citation needed]
Sky: Children of the Light Thatgamecompany N/A 18 July 2019 Native Metal support added since pre-global live in November 2017
Starcraft Blizzard Entertainment Modified Warcraft II engine 20 November 2001 Metal support since 2 July 2020 with v1.23.5[citation needed]
StarCraft II Blizzard Entertainment SC2 Engine 27 July 2010 Metal support in beta since 24 January 2017[citation needed]
Tomb Raider Feral Interactive Foundation Engine 17 January 2014 Metal support with v1.2 in July 2019[citation needed]
Total War: Rome Remastered Feral Interactive TW Engine 2 29 April 2021
Total War: Shogun 2 Feral Interactive TW Engine 3 31 July 2014 Metal support since 4 October 2019[citation needed]
Total War: Shogun 2: Fall of the Samurai Feral Interactive TW Engine 3 18 December 2014 Metal support since 4 October 2019[citation needed]
Total War: Three Kingdoms Feral Interactive TW Engine 3 23 May 2019
Total War: Warhammer Feral Interactive TW Engine 3 19 April 2017
Total War: Warhammer II Feral Interactive TW Engine 3 20 November 2018
Total War Saga: Thrones of Britannia Feral Interactive TW Engine 3 24 May 2018
Total War Saga: Troy Feral Interactive TW Engine 3 13 August 2020
Universe Sandbox Giant Army Unity Engine 5 TBA Metal support in beta since June 2017[citation needed]
Unreal Tournament Epic Games Unreal Engine 4 Cancelled Metal support since January 2017[citation needed]
War Thunder Gaijin Entertainment Dagor Engine 4 1 November 2012 Metal support added 24 May 2017 (removed in ? 2018 and reintroduced 27 August 2020)[citation needed]
Warhammer 40,000: Dawn of War III Feral Interactive Essence Engine 4 9 June 2017
The Witness Thekla, Inc Thekla Engine 8 March 2017
World of Warcraft Blizzard Entertainment WoW Engine 23 November 2004 Metal support since August 2016[citation needed]
X-Plane 11 Laminar Research N/A 30 May 2017 Metal support in beta since 2 April 2020[27]

See also

[edit]

References

[edit]
[edit]
Revisions and contributorsEdit on WikipediaRead on Wikipedia
from Grokipedia
Metal is a low-level graphics and compute application programming interface (API) developed by Apple Inc., providing developers with direct access to the graphics processing unit (GPU) on Apple devices for high-performance rendering of complex scenes and parallel data processing. Introduced at Apple's Worldwide Developers Conference (WWDC) in 2014 and first released with iOS 8 later that year, Metal enables hardware-accelerated graphics and compute tasks across platforms including iOS, iPadOS, macOS, tvOS, and visionOS, replacing older APIs like OpenGL ES on mobile and OpenGL on desktop. Key features of Metal include its low-overhead architecture, which minimizes driver overhead for efficient GPU utilization, a unified shading language (Metal Shading Language) for writing shaders, and tight integration between graphics rendering and general-purpose on graphics processing units (GPGPU) workloads. This design allows applications such as games, video editing software like , scientific simulations, and models to leverage Apple silicon's performance and efficiency. Metal also supports advanced technologies like Metal Performance Shaders for optimized compute operations and MetalKit for streamlined setup. In June 2025, Apple announced Metal 4 at WWDC25, introducing first-class support for through native tensor handling in the and , enhanced ray tracing with flexible acceleration structures, and improved upscaling via MetalFX for higher frame rates in games. These updates further optimize Metal for , including features like sparse resource management and ahead-of-time shader compilation to reduce runtime overhead, enabling more complex visuals and AI-integrated experiences on devices such as the , , Mac, , and .

Introduction

Overview

Metal is a proprietary, low-overhead and compute application programming interface () developed by Apple Inc., designed for 3D graphics rendering and general-purpose computing on graphics processing units (GPGPU). It provides developers with low-level, direct access to GPU hardware, allowing for efficient execution of parallel workloads on . This enables the creation of high-performance applications, including video games, (AR) and (VR) experiences, and tasks that demand intensive computational resources. Available exclusively on Apple's ecosystem, Metal supports the primary platforms of , , macOS, , and , where it harnesses the capabilities of Apple-designed GPUs for both graphics and compute operations. As the underlying technology, Metal powers higher-level frameworks such as SceneKit for 3D scene rendering and RealityKit for immersive AR/VR simulations, which build upon its performance while offering abstracted APIs for easier development. Central to Metal is its shading language, known as the Metal Shading Language (MSL), a C++-based based on a subset of (as of Metal 4) that allows developers to write custom shaders and compute kernels for GPU execution. MSL provides a unified syntax for graphics and compute programming, facilitating optimized code that runs directly on the GPU with minimal runtime overhead.

Design goals

Metal's design prioritizes low-overhead access to GPU hardware, enabling developers to minimize CPU-GPU communication latency by providing direct control over GPU resources without the abstraction layers found in higher-level APIs. This approach reduces driver overhead and allows for more efficient command submission, ensuring that applications can fully utilize the parallel processing capabilities of GPUs. A core principle is achieving cross-platform consistency across Apple devices, from iPhones and iPads to Macs and , allowing developers to write unified codebases that deliver reliable performance without extensive platform-specific optimizations. This unified ecosystem fosters seamless portability, as Metal abstracts hardware differences while exposing consistent APIs for graphics and compute tasks across iOS, , macOS, tvOS, and . Metal strikes a balance between low-level control—comparable to or 12—and the ease-of-use of older APIs like , offering explicit resource management and command buffer encoding to prevent state-related errors while streamlining development with a simplified, object-oriented interface. This enables fine-grained GPU programming for advanced workloads without the verbosity or implicit behaviors that complicate higher-level alternatives. The emphasizes power efficiency, particularly for mobile devices like iPhones and iPads, by optimizing and execution to reduce during intensive and compute operations, while scaling seamlessly to high-performance desktops and laptops. This focus aligns with Apple silicon's integrated , where efficient GPU utilization directly contributes to longer battery life in portable hardware. Security features, such as built-in validation layers, are integral to Metal's design to prevent common GPU programming errors like invalid accesses or violations, providing runtime checks that help developers debug issues early without compromising production performance. These layers, accessible through , enforce safe practices and mitigate risks in resource-heavy applications. Ultimately, Metal aims to foster innovation in graphics-intensive applications by enabling high-fidelity rendering and compute tasks, including integration with Apple's Neural Engine via frameworks like Metal Performance Shaders for accelerated AI workloads such as inference. This capability supports cutting-edge features in games, , and , empowering developers to push hardware boundaries within the .

Technical architecture

Core components

The core components of the Metal API form the foundational layer for developers to interact with the GPU, enabling the creation, configuration, and execution of and compute workloads on Apple hardware. These components include devices for GPU access, resources for , pipelines for processing configurations, command structures for work submission, and mechanisms for coordination. By leveraging these elements, applications can efficiently harness GPU capabilities while maintaining low-overhead communication between CPU and GPU. Device and queue management begins with the MTLDevice protocol, which represents a specific GPU and serves as the primary entry point for Metal operations. Developers obtain an MTLDevice instance using functions like MTLCreateSystemDefaultDevice(), which selects the most suitable GPU, or by enumerating available devices via MTLCopyAllDevices() for multi-GPU systems. The device handles resource creation, such as buffers and textures, and provides access to GPU features like supported formats and limits. For asynchronous execution, the MTLCommandQueue is created from the device using makeCommandQueue(); it schedules the submission of command buffers to the GPU in the order they are committed, supporting parallel encoding from multiple threads to optimize throughput. Resources in Metal encompass data structures optimized for GPU access, including buffers, textures, and heaps. Buffers, represented by MTLBuffer, store unstructured data such as vertex positions, index arrays, or uniform values, allocated via the device with specified length and options like storage mode (shared or private). They are bound to shaders during command encoding for efficient data transfer. Textures, via MTLTexture, manage formatted data in 1D, 2D, 3D, or array configurations, created using MTLTextureDescriptor to define dimensions, pixel formats (e.g., RGBA8Unorm), and usage (e.g., render target or shader read). Heaps, implemented as MTLHeap, enable efficient memory allocation by preallocating large blocks of GPU memory, from which multiple buffers and textures can be suballocated, reducing fragmentation and overhead in resource-intensive applications. Render pipelines are configured through MTLRenderPipelineState objects, which encapsulate the fixed-function and programmable states for rendering. Created from an MTLRenderPipelineDescriptor, these state objects specify vertex and fragment (written in Metal ), blending modes, rasterization settings like multisampling, and depth/stencil configurations, allowing developers to define how are processed into pixels. Once compiled by the device, the pipeline state is set on a render command encoder to execute draw calls efficiently. In contrast, compute pipelines, represented by MTLComputePipelineState, focus on general-purpose GPU (GPGPU) tasks without output; they are built from MTLComputePipelineDescriptor specifying a single compute function and are used to dispatch threadgroups for parallel data processing, such as simulations or inference. Command buffers and encoders facilitate the submission of work to the GPU by organizing operations into executable units. A command buffer (MTLCommandBuffer), created from a queue via makeCommandBuffer(), holds a sequence of encoded commands and associated resources; after encoding, it is committed to the queue with commit(), triggering asynchronous GPU execution while allowing the CPU to continue. Encoders, such as MTLRenderCommandEncoder for or MTLComputeCommandEncoder for compute, are obtained from the buffer (e.g., makeRenderCommandEncoder(descriptor:) ) and used to append specific instructions—like setting pipeline states, binding resources, issuing draw or dispatch calls—before ending the encoder to return control to the buffer. This model ensures serialized execution within a buffer while permitting concurrent buffer creation across threads. Synchronization primitives coordinate CPU-GPU interactions and prevent data hazards, including fences, , and barriers. MTLFence objects track resource state changes, particularly for heap-allocated , by signaling completion of a pass and waiting in subsequent passes within the same queue to resolve access conflicts without CPU intervention. MTLSharedEvent provides cross-queue and CPU-GPU ; developers notify the event upon GPU milestone completion and wait on the CPU or another queue, enabling ordered execution in complex workloads. Barriers, inserted via encoder methods like updateTextureMappingMode() or pass descriptors, enforce ordering within a command buffer by stalling until prior operations complete, ensuring resource consistency during multi-pass rendering or compute chains.

Updates in Metal 4

Metal 4, announced in June 2025, introduces significant enhancements to the core components for improved performance and efficiency on . Key changes include the introduction of MTL4CommandQueue for reduced CPU and memory overhead, supporting batched commits with commit:count:. Command buffers are now MTL4CommandBuffer, which can be reused via beginCommandBuffer(allocator:) for better . Resource bindings shift to argument tables using MTL4ArgumentTable, replacing per-resource bindings to streamline access. All resources are untracked by default, requiring explicit with next-generation barriers. Additionally, MTLTextureViewPool enables lightweight creation of texture views, and new encoders like MTL4CommandEncoder handle work-specific protocols. These updates enable explicit memory management and unified command encoding to minimize overhead in graphics and compute workloads.

Metal Shading Language

The Metal Shading Language (MSL) is a programming language designed for writing shaders that execute on Apple GPUs, serving as the core mechanism for defining graphics and compute operations within the Metal API. It is based on a of in Metal versions 1–3 and starting with Metal 4, incorporating extensions and restrictions tailored for parallel GPU execution, such as SIMD vector types (e.g., float4 for four-component single-precision floating-point vectors) and address space qualifiers to manage data locality across device, threadgroup, and constant memory. This foundation allows developers familiar with C++ to author low-level GPU code while enforcing safety constraints that prevent issues like in environments. MSL supports multiple shader stages corresponding to different phases of the GPU , including vertex shaders for transforming , fragment shaders for per-pixel coloring, and compute shaders for general-purpose parallel processing. Later versions introduce specialized stages, such as mesh shaders for generating primitives directly on the GPU and amplification shaders for dynamically controlling vertex processing in Metal 3 and beyond. These stages enable flexible configurations, where shaders are invoked sequentially or in compute-only workflows to handle tasks from rendering to simulations. The language provides a rich set of built-in functions to facilitate common GPU operations, including matrix mathematics (e.g., matrix_multiply for vector-matrix products), texture sampling (e.g., texture2d.sample for bilinear interpolation), and atomic operations (e.g., atomic_fetch_add for thread-safe increments on shared memory). These functions are optimized for hardware acceleration, ensuring efficient execution without requiring manual SIMD intrinsics in most cases. MSL's type system emphasizes fixed-size data suitable for GPU parallelism, supporting scalar types (e.g., float, int), vector types (e.g., half3), matrix types (e.g., float4x4), and aggregate structures for passing complex data between stages. Notably, it prohibits dynamic memory allocation (no new or delete) and recursion to avoid stack overflows and non-deterministic behavior in parallel threads. Shaders written in MSL can be compiled either at runtime using API calls like MTLDevice.makeLibrary or offline via the metal tool, which generates a binary library (.metallib) containing an for direct loading and execution on the GPU. This dual approach balances development flexibility with performance, as offline compilation reduces load times in production apps. MSL includes extensions for advanced rendering and computation, such as ray tracing features introduced in 2020 and enhanced in Metal 3 with hardware acceleration, where developers can define custom intersection functions (e.g., using [[intersection(0)]] attributes) to compute ray-object hits with programmable payloads. In Metal 4, tensor types and operators are natively supported, enabling direct implementation of machine learning primitives like matrix multiplication (tensor_multiply) and convolutions within shaders for integrated graphics-ML workflows. These extensions maintain MSL's C++-like syntax while leveraging hardware-specific instructions for high-throughput operations.

Key features

Graphics rendering

Metal's graphics rendering pipeline follows a traditional structure optimized for hardware-accelerated 3D graphics on Apple platforms, enabling developers to process and produce pixel outputs efficiently. The pipeline begins with vertex processing, where vertex shaders transform input using buffers and textures to compute positions and attributes for each vertex. Following vertex processing, primitive assembly and rasterization convert the transformed into fragments, generating pixel coverage data across the render target. Fragment shading then applies pixel-level computations via fragment shaders to determine final color values, incorporating textures, lighting, and material properties. Finally, depth and stencil testing resolve visibility and layering, using optional attachments to discard or blend fragments based on depth buffers and masks, ensuring correct scene composition. To support dynamic , Metal includes stages that subdivide coarse patches into finer , enhancing detail without increasing base model complexity. operates on quad or triangle patches, using per-patch factors stored in MTLBuffer objects to control subdivision levels, which are computed dynamically via compute kernels for adaptive rendering. This feature, available on devices supporting MTLGPUFamily.apple3 or MTLGPUFamily.mac2 and later, integrates into the render pipeline to generate detailed surfaces efficiently. While Metal does not provide traditional geometry shaders, serves a similar role in amplifying on the GPU. Introduced in Metal 3, ray tracing enables photorealistic rendering through hardware-accelerated ray intersection testing against scene geometry. Developers build acceleration structures, such as bounding volume hierarchies (BVH), to represent triangles and bounding volumes, allowing efficient ray traversal and intersection queries for shadows, reflections, and . These structures accelerate ray tracing by culling non-intersecting regions, supporting real-time performance on GPUs. Hybrid rendering combines ray tracing with traditional rasterization, using ray-traced effects like or reflections to enhance rasterized scenes without fully replacing the pipeline. Metal 4 enhances ray tracing with flexible acceleration structures, allowing developers to build custom hierarchies with optimization flags for speed or size. Additionally, MetalFX upscaling improvements include integrated denoising and frame , boosting frame rates for demanding graphics workloads on devices. Mesh shading, also part of Metal 3, provides a modern alternative for handling complex geometries through task and shaders, replacing or augmenting older fixed-function stages like . Task shaders perform high-level and workload distribution, dispatching groups of vertices and to shaders, which generate variable output topologies directly on the GPU for scalable rendering. This approach supports efficient processing of intricate models, such as or crowds, by adjusting detail levels based on visibility and performance needs, and is hardware-accelerated on devices with MTLGPUFamily.apple7 or later. shaders enable GPU-driven geometry creation, reducing CPU overhead and improving draw call efficiency for large scenes. For performance optimization, Metal supports variable rate shading (VRS), allowing developers to rasterize different screen regions at varying sample rates to balance quality and speed. A rasterization rate map defines zones with coarse (e.g., 2x2 per sample) or fine rates, applied during fragment processing to skip redundant computations in low-detail areas like blurred backgrounds. This technique, checked via device.supportsRasterizationRateMap, reduces GPU workload when shading is costly, such as in complex evaluations. Complementing VRS, (IBL) uses environment cubemaps and precomputed irradiance textures sampled in fragment shaders to approximate , minimizing real-time light calculations for reflective surfaces and ambient effects. Metal integrates seamlessly with ARKit and RealityKit to enable on , allowing custom to render scenes with real-world passthrough and immersive 3D content. Developers can use RealityKit's CustomMaterial to inject functions for rendering, combining ARKit's motion tracking with Metal's for low-latency overlays and interactions. This integration supports apps by blending Metal-rendered graphics with spatial anchors, facilitating mixed-reality experiences like volumetric displays or hand-tracked interfaces.

Compute programming

Metal's compute programming enables developers to execute general-purpose computations on the GPU using compute written in the Metal Shading Language (MSL). These perform parallel processing tasks independent of graphics rendering, leveraging the GPU's massive parallelism for data-intensive operations. Compute pipelines, created via MTLComputePipelineState objects, encapsulate the function and its configuration, allowing efficient execution of kernels across threads organized in a grid. Dispatching compute kernels occurs through a compute command encoder, where developers specify the grid size using methods like dispatchThreads, which launches threads in a 1D, 2D, or 3D grid calculated automatically by Metal to cover the workload. Threads are grouped into threadgroups, each containing up to a hardware-defined maximum number of threads (e.g., on many Apple GPUs), enabling coarse-grained parallelism where multiple threadgroups process independent data partitions. This structure supports scalable execution, with thread positions identifiable via built-in MSL attributes such as [[thread_position_in_grid]] for global indexing and [[thread_position_in_threadgroup]] for local positioning within a group. Within a threadgroup, threads share a dedicated memory space declared with the threadgroup address space qualifier in MSL, providing low-latency access for collaborative data processing, such as reducing partial sums or exchanging intermediate results. This , limited to the threadgroup's lifetime and sized up to 32 KB depending on the device, facilitates efficient intra-group communication without relying on slower global device memory. To ensure data consistency and prevent race conditions during access, developers use the threadgroup_barrier function, which synchronizes all threads in the group as both an execution and . For instance, threadgroup_barrier(mem_flags::mem_threadgroup, execution::threadgroup) guarantees that prior writes to threadgroup are visible to all threads before proceeding, enforcing uniform across the group. Metal's compute capabilities excel in large-scale simulations, image processing, and scientific computing by distributing workloads across thousands of threads, harnessing the GPU's vector processing units for tasks like matrix multiplications or methods. For dynamic workloads, indirect compute dispatches allow grid sizes and threadgroup counts to be specified from buffer data at runtime, using methods like dispatchThreadgroupsWithIndirectBuffer to adapt to varying input sizes without CPU intervention. Sparse textures further enhance efficiency for voluminous datasets, such as volumetric simulations, by allocating physical memory only for accessed regions; unmapped areas return zeros on read, and writes to them are discarded, reducing for large 3D textures in compute shaders. Integration with relies on buffer-based data passing, where input and output tensors are stored in MTLBuffers for seamless transfer to compute shaders, enabling custom kernel implementations for operations like convolutions or activations. In Metal 4, native tensor support via MTLTensor resources introduces multi-dimensional arrays optimized for ML models, allowing direct manipulation in shaders for and pipelines alongside other compute tasks. Representative use cases include video encoding, where compute kernels parallelize and compression across frames for real-time processing; physics simulations, such as particle systems or , that dispatch threadgroups to update positions and forces in parallel; and neural network inference, employing custom shaders to execute forward passes on batched inputs for applications like image recognition.

Performance optimization tools

Metal provides developers with a suite of integrated tools within and Instruments to debug, validate, and profile applications, enabling precise identification and resolution of performance bottlenecks in graphics, compute, and workloads. The facilitates GPU capture, allowing developers to record and replay frames to inspect command buffers, states, and executions in detail. By capturing a Metal workload, developers can step through command encoders, examine dependencies between passes, and visualize the GPU timeline to pinpoint synchronization issues or inefficient usage. This frame debugging capability supports editing and reloading shaders on-the-fly, which accelerates iteration and helps optimize rendering pipelines by revealing visual artifacts or suboptimal draw calls. API validation layers operate at runtime to enforce correct usage of the Metal and , catching errors such as invalid bindings, incorrect flags in command encoding, or out-of-bounds buffer accesses before they impact production performance. The API Validation layer inspects creation and command submission, flagging violations like mismatched storage modes or improper synchronization, while the Shader Validation layer detects execution-time issues, including nil texture accesses or divergent that could lead to stalls. Enabling these layers during development incurs a performance cost but ensures robust code, preventing subtle bugs that degrade GPU efficiency in dynamic scenarios. For in-depth profiling, GPU counters and timeline views in the Metal debugger offer granular insights into hardware utilization, including bandwidth consumption, cache hit rates, and execution durations across shader stages. The performance timeline displays parallel CPU and GPU activities, with tracks for vertex, fragment, and compute passes, alongside aggregated shader waterfalls that highlight overlaps and stalls. Counters such as occupancy (measuring active thread utilization), limiter (identifying bottlenecks in execution units), and bandwidth (tracking memory read/write throughput) enable developers to quantify inefficiencies, like low cache coherence or excessive serialization, and guide optimizations such as adjusting threadgroup sizes or reducing texture fetches. These views support both Apple silicon and non-Apple GPUs on macOS, providing consistent profiling across hardware. Argument buffers serve as a key optimization for reducing CPU overhead in scenarios involving dynamic pipelines, where frequent resource binding would otherwise introduce latency. By encapsulating groups of resources—like buffers, textures, and samplers—into a single buffer passed to shaders, developers can bind complex argument sets once per frame rather than per command, minimizing driver overhead and state validation. Metal supports two tiers of argument buffers: Tier 1 for immutable, CPU-readable resources with limited counts (e.g., up to 31 total arguments on many devices), and Tier 2 for mutable resources with pointer indexing in the , available on +, allowing up to 96 samplers on mobile devices. This approach is particularly effective for read-only data, such as terrain rendering or particle systems, where it can cut by consolidating hundreds of individual setResource calls into array accesses. On unified memory architectures like , resource residency and eviction controls help manage memory pressure by allowing developers to influence when resources are retained or discarded from GPU-accessible . Resources and heaps support purgeable states (e.g., MTLPurgeableState.volatile or .empty), enabling explicit of unused assets to free bandwidth, with the system automatically restoring them on access if needed. This is crucial for applications with large datasets, as it prevents thrashing on devices with shared CPU-GPU pools, though the driver handles most residency automatically; developers track it manually for argument buffers to avoid unexpected faults. Heaps allocated from MTLHeap can propagate purgeability to their resources, optimizing footprint in compute-heavy workloads without manual deallocation. To facilitate development without relying solely on Apple hardware, Metal includes simulation tools like the and macOS support for non-Apple GPUs (e.g., or ), allowing testing of Metal code on diverse configurations. In the Simulator, Metal apps run with native GPU acceleration via the host Mac's Metal implementation, supporting frame capture and counters for rapid prototyping, though some features like certain storage modes require alternative render paths. For non-Apple GPUs on macOS, the Metal debugger provides counter statistics tailored to discrete hardware, measuring activities like vertex processing or transfers to simulate real-world variations during optimization. These tools ensure compatibility and across the ecosystem without physical device access.

Metal Performance Shaders

Metal Performance Shaders (MPS) is a high-level framework built on Metal that provides a of pre-optimized compute and shaders for accelerating common workloads across Apple GPUs. These shaders are fine-tuned for the unique of each Metal GPU family, enabling developers to achieve high without manual low-level optimization. MPS supports a wide range of tasks, including inference and training, and , linear operations, and advanced rendering techniques, all while ensuring energy efficiency on devices. At the core of MPS for machine learning is MPSGraph, a compute engine that allows developers to construct, compile, and execute symbolic graphs of operations, where each node represents a shader-encapsulated function producing tensor outputs as graph edges. This model supports layers, such as convolutions and activations, as well as training operations like forward and backward passes, leveraging hardware accelerators for high-throughput execution. MPSGraph facilitates integration with Core ML by enabling the deployment of trained models directly onto Metal compute pipelines, allowing seamless GPU acceleration of tasks in applications. For example, transformer models can be optimized using MPSGraph's graph compilation, yielding significant speedups on . MPS also offers specialized kernels for and matrix processing, including filters for convolutions, reductions, and histogram computations, which automatically tune parameters based on the target GPU's capabilities. These kernels handle compressed texture formats like PVRTC and ASTC, making them suitable for real-time manipulation in and vision pipelines. In linear , MPS provides optimized routines for , factorization, and solving systems of equations, often outperforming general-purpose compute shaders by exploiting SIMD instructions and memory hierarchies inherent to Apple GPUs. For advanced rendering, MPS includes ray tracing primitives that accelerate ray-geometry intersection tests, optimized for the ray tracing hardware in GPUs, enabling efficient photorealistic effects in games and simulations. While general physics simulations can leverage these compute kernels, MPS focuses on declarative integration rather than raw primitive control. MPS supports sparse textures for efficient handling of irregular data in graphs, with enhancements in Metal 3 for operations like indirect command buffers, while Metal 4 adds native tensor support in the and , allowing direct tensor manipulation within shaders for more streamlined workflows.

Development history

Initial release and Metal 1

Apple announced Metal at its (WWDC) on June 2, , introducing it as a new graphics and compute designed to enhance performance on and macOS platforms. The framework debuted alongside in and macOS 10.10 Yosemite in October , marking Apple's shift toward a , low-level interface for GPU acceleration. The development of Metal was primarily motivated by the limitations of , which imposed significant overhead in driver translation and state management, hindering efficient GPU utilization on power-constrained mobile hardware. By providing direct access to the GPU with minimal abstraction, Metal aimed to reduce CPU involvement and unlock higher performance for graphics rendering and general-purpose computing on devices like the . At launch, its core features encompassed basic graphics pipelines for vertex and fragment , compute pipelines for parallel data processing, command encoding to batch GPU instructions, and multi-threaded submission allowing multiple CPU threads to prepare command buffers concurrently. These elements enabled developers to construct efficient render passes and dispatch kernels with reduced latency compared to prior APIs. Initial hardware compatibility was limited to Apple's A7 system-on-chip (SoC) and subsequent generations for iOS, ensuring broad availability on devices starting with the iPhone 5s and iPad Air. On macOS, support extended to NVIDIA GPUs based on the Kepler architecture and AMD Radeon GPUs found in Macs from mid-2012 onward, aligning with the transition to Yosemite. Developers encountered challenges in adopting Metal due to the need to rewrite -based codebases, which required significant refactoring of shaders and rendering logic. To facilitate the transition, third-party wrapper libraries like MoltenGL emerged, translating calls to Metal equivalents and allowing legacy applications to run with improved efficiency. Despite these aids, initial uptake on macOS was slower than on , as developers grappled with the API's lower-level paradigm. A notable early milestone was the launch of Vainglory in November 2014, a game that utilized Metal to deliver 60 frames per second (FPS) on iOS devices like the iPhone 6, showcasing the API's capability for smooth, high-fidelity mobile gaming previously unattainable with . This demonstration highlighted Metal's impact on enabling console-like experiences on handheld hardware.

Metal 2 advancements

Metal 2 was released in alongside and macOS 10.13 High Sierra, marking a significant evolution of Apple's graphics and compute with a focus on GPU-driven workflows and expanded hardware capabilities. Among the key new features, tile shading introduced a novel pipeline stage that enables developers to perform compute-like operations directly within the rendering pass, improving memory efficiency by processing screen-space tiles on Apple GPUs without additional bandwidth overhead. This approach leverages the tile-based deferred architecture of Apple hardware to blend graphics and compute tasks seamlessly, reducing the need for separate passes in complex scenes like forward-plus lighting. Indirect command buffers further advanced GPU autonomy by allowing the GPU to generate and execute draw commands dynamically, minimizing CPU involvement and enabling scalable rendering for large numbers of objects. Complementing this, multi-GPU rendering support extended to external GPUs via , permitting developers to distribute workloads across multiple processors for enhanced performance in demanding applications. In compute programming, Metal 2 enhanced capabilities through function specialization, where shaders could be compiled with constant parameters to optimize for specific use cases, reducing runtime overhead and improving execution efficiency. with other frameworks was bolstered, including tighter integration with IOSurface for efficient texture and support for raster order groups to ensure predictable memory access in fragment shaders, facilitating advanced techniques like . These advancements delivered notable performance improvements, with Apple reporting up to a 40% increase in rendering speed on supported hardware through optimizations like buffers, which cut CPU overhead by binding resources more efficiently. Overall draw call throughput could reach up to 10 times higher in GPU-driven scenarios compared to prior versions. The release spurred ecosystem growth, as integrated Metal support into , enabling smoother ports of games like those from Unity and Epic to macOS and paving the way for broader third-party adoption.

Metal 3 innovations

Metal 3, unveiled at Apple's (WWDC) in June 2022, debuted alongside and , introducing optimizations tailored to for enhanced graphics and compute performance. This version emphasized low-overhead access to GPU capabilities, enabling developers to leverage unified memory architecture more effectively through bindless resources, which support data sharing between CPU and GPU by providing GPU-specific handles like gpuResourceID and gpuAddress for without traditional binding overhead. These advancements reduce latency and memory bandwidth usage, allowing seamless transfer of large datasets such as textures and buffers directly to GPU memory. A cornerstone of Metal 3 is its support for advanced rendering techniques, including ray tracing (hardware-accelerated starting with M3-series processors; software-emulated on earlier Apple silicon like M1 and M2) and mesh shading (available starting with A15 Bionic and M2-series processors), on compatible Apple silicon GPUs. While ray tracing is available via Metal 3 on earlier Apple silicon like M1 and M2 through software emulation, hardware acceleration begins with the M3 series for improved performance. Ray tracing integration into the core Metal API simplifies implementation of realistic lighting, shadows, and reflections by accelerating ray intersection calculations on the GPU, with optimizations for building and traversing acceleration structures. Mesh shading introduces programmable geometry processing, replacing fixed-function stages with flexible shaders that dynamically generate vertices and primitives, ideal for complex scenes like dense foliage or particle effects. Complementing these is dynamic resolution scaling via MetalFX, a new framework for upscaling and frame generation that renders at lower resolutions before applying high-quality spatial upscaling and temporal anti-aliasing, significantly boosting frame rates in demanding applications without sacrificing visual fidelity—for instance, enabling 4K output from 1080p internals. MetalFX also supports frame interpolation for smoother motion, further enhancing real-time performance. Developer tools in Metal 3 received substantial upgrades, including an improved Metal Performance HUD for real-time monitoring of metrics like GPU utilization, frame times, and compilation activity, overlaid directly on running apps to aid debugging and optimization. GPU frame capture in was enhanced with better integration for ray tracing and mesh shading pipelines, allowing developers to inspect command buffers and resource residency more granularly. Additionally, the Fast Resource Loading , powered by MTLIOCommandQueue and asynchronous command buffers, streamlines asset ingestion by bypassing CPU bottlenecks, providing a direct path from storage to GPU for high-resolution textures and geometry. These innovations have notably impacted gaming on Apple platforms, facilitating console-quality titles by harnessing 's efficiency. For example, was ported to macOS using Metal 3 features like MetalFX upscaling and ray tracing support, achieving immersive visuals and responsive performance on M1 and M2 Macs, demonstrating how these tools enable expansive worlds with photorealistic effects previously limited to dedicated consoles. Overall, Metal 3's focus on Apple silicon optimizations has broadened the ecosystem for high-fidelity graphics, making advanced rendering accessible across , , and macOS devices.

Metal 4 and beyond

Metal 4 was announced at Apple's (WWDC) on June 9, 2025, alongside previews of 19, macOS 16, and 3, marking a significant evolution in Apple's graphics and compute designed to leverage the latest advancements. This release emphasizes seamless integration of capabilities directly into graphics workflows, enabling developers to harness GPU and Neural Engine resources more efficiently for AI-driven applications. A cornerstone of Metal 4 is its native support for tensors, introduced as first-class citizens in both the and the Metal Shading Language (MSL), facilitating optimized handling of multidimensional essential for workloads. Tensors can now be created, manipulated, and operated on directly within shaders and compute kernels, with new tensor types and operations like accelerating common ML tasks such as model on the GPU. This support extends to the tensor resource and ML encoder, allowing developers to encode and execute neural networks alongside rendering and compute commands on the unified GPU timeline, reducing latency in AI-accelerated graphics pipelines. The API's AI-graphics fusion enables hardware-accelerated integration of models into rendering processes, including embedding neural networks directly within for advanced effects like real-time style transfer and object enhancement. Through ML features, developers can perform inline operations, fusing AI computations with to support emerging techniques in neural rendering and generative effects without separate processing stages. This approach leverages the GPU's parallel architecture for efficient execution, as demonstrated in session examples where ML models run synchronously with draw calls to enhance visual fidelity in applications. For gaming, Metal 4 enhances performance through updates to MetalFX, providing DLSS-like temporal upscaling to render at lower resolutions and upscale to higher outputs with minimal quality loss, achieving up to 2x improvements in demanding scenes. Additionally, MetalFX Frame Interpolation generates intermediate frames between rendered ones using motion vectors and AI-driven prediction, delivering smoother and higher effective frame rates comparable to NVIDIA's DLSS 3 frame generation. These features, combined with denoising for ray-traced content, optimize resource utilization on , enabling more complex visuals in titles ported or natively developed for macOS and . To broaden the ecosystem, Metal 4 integrates with Game Porting Toolkit 3 (GPTK 3), offering improved translation layers and simulation tools that allow developers to test and optimize DirectX-based games on non-Apple hardware before deployment to Apple platforms. GPTK 3 enhances cross-platform compatibility by unifying command encoding and providing MetalFX support during emulation, streamlining the process for Windows titles and enabling profiling on diverse systems. This facilitates easier adoption by third-party studios, with tools for conversion and runtime simulation reducing development barriers. Looking ahead, Metal 4 paves the way for deeper integration with the Apple Neural Engine (ANE), allowing direct programming of Neural Accelerators via tensor APIs to offload specific ML operations from the GPU while maintaining pipeline cohesion. This enables parallel execution of ANE tasks alongside GPU shaders, boosting efficiency for hybrid AI workloads in future updates. As Apple continues to evolve its silicon ecosystem, Metal 4's ML primitives position it for expanded support in on-device AI, including potential enhancements to Metal Performance Shaders for advanced neural operations.

Hardware compatibility

Supported processors

Metal supports a range of Apple-designed system-on-chip (SoC) processors across its platforms, with compatibility evolving alongside hardware advancements and API versions. On iOS, iPadOS, and tvOS, initial support began with the SoC, enabling Metal 1 features from iOS 8 and tvOS 9. Subsequent SoCs, including A8 through A9 for enhanced Metal 1 capabilities, A10 through A13 for Metal 2 and basic Metal 3, and A14 Bionic through A18 for support for Metal 3 and Metal 4, with advanced features like hardware ray tracing available on later processors such as A17 Pro+ and M1+, have progressively expanded feature availability. Devices with A14 Bionic or later, such as and subsequent models, (2020 and later), and 4K (2nd generation and later), provide Metal 4 compatibility when running iOS 18, iPadOS 18, or tvOS 18 or later. For macOS, Metal compatibility initially encompassed Intel-based Macs equipped with specific discrete and integrated GPUs. Metal 1 launched on 10.11 with support for Kepler-series GPUs (e.g., in 2012–2015), GCN-based GPUs from 2012 models (e.g., ), and HD Graphics 4000 or later integrated graphics in Macs from 2012 onward, including , , and . Metal 2 extended to GPUs in 2017–2018 models like the and . Metal 3 added compatibility with 5000/6000 series, UHD Graphics 630, and Iris Plus Graphics on select 2017–2020 Macs, alongside all Apple M-series SoCs starting with M1 on 11 or later. However, as of 14 and later, support for older configurations, including 32-bit apps and certain pre-2017 GPUs, has been deprecated in favor of Apple silicon. Apple silicon processors, beginning with the M1 family, unify Metal support across macOS, providing for Metal 1 through 3 features on all M1 through M5 SoCs in devices like (2020+), (2020+), iMac (2021+), Mac mini (2020+), Mac Studio (2022+), and Mac Pro (2023+). Metal 4 requires M1 or later with macOS Sequoia 15 or subsequent releases, marking the end of Mac support for new features. On visionOS, Metal is supported exclusively on the M2 SoC in devices running 1 or later, with Metal 3 and 4 features available from 2 onward. Backward compatibility ensures that applications built for earlier Metal versions run on newer hardware, though advanced features like hardware ray tracing in Metal 3 (available on M1+ for Apple silicon and A17 Pro+ for iOS devices) or tensor support in Metal 4 (native handling on A14+/M1+, optimized on A18+/M4+) are limited to qualifying processors. Basic Metal 3 support begins with A12 or later. As of 2025, all Apple silicon-equipped devices, including those with A14 through A18 and M1 through M5, support Metal 4 under compatible operating systems.

Feature tiers

Metal organizes its feature support into tiered levels based on GPU families, allowing developers to target capabilities that align with hardware constraints across Apple devices. These tiers group features into progressive sets, starting from fundamental and compute operations to advanced rendering and AI acceleration. Support is determined by the underlying GPU family, with newer families enabling more sophisticated functionality. Developers must account for these tiers to ensure application compatibility and on diverse hardware, from older A-series chips to the latest M-series processors. Platform-specific differences apply, particularly for vs. macOS/ (e.g., hardware ray tracing on M1+ for , A17 Pro+ for ). Tier 1 provides the foundational level of Metal support, encompassing basic graphics rendering and compute tasks without advanced effects like ray tracing. This tier is available on devices with Apple GPU family 1 (A7) and extends to all higher families, including M1 (family 7), ensuring broad compatibility for core operations such as vertex and fragment , texture sampling, and simple parallel compute kernels. It lacks hardware support for ray tracing, relying instead on general-purpose compute shaders for any simulation of such effects. Tier 2 builds on Tier 1 by adding advanced graphics features, including mesh shading for more efficient . This tier requires Apple GPU family 7 or later, corresponding to A15 Bionic, M1, and subsequent chips like A16 and (family 8). Mesh shaders enable GPU-driven generation and , reducing CPU overhead in complex scenes, but indirect mesh commands are limited until family 9. Other enhancements include improved argument buffer handling at Tier 2 level for better in shaders. Tier 3 introduces full hardware-accelerated ray tracing and enhanced MetalFX upscaling, targeting high-fidelity rendering on premium devices. It is supported on Apple GPU family 7 and above (M1+ for ; A17 Pro/family 8+ for ), including A17 Pro, M3, and M4 chips. Ray tracing leverages dedicated hardware intersection testing and acceleration structures for real-time and reflections, while MetalFX provides temporal upscaling and frame generation to boost performance without sacrificing visual quality. Denoised upscaling is exclusive to this tier on family 9+. Earlier families can emulate ray tracing via compute but without hardware efficiency. Tier 4, aligned with Metal 4, focuses on tensor operations and AI acceleration, enabling efficient machine learning workloads directly on the GPU. Support begins with family 6 (A14/M1+), offering tensor shaders for matrix multiplications used in neural networks, but full capabilities—including advanced sparse tensor handling and integrated AI encoding—require family 8+ (A17 Pro/M2). This tier integrates with Metal Performance Shaders for accelerated transformer models and inference, providing up to several times faster AI processing compared to CPU fallbacks. Native tensor handling in Metal 4 is supported on A14+ and M1+, with optimized performance on A18+ and M4+.
TierKey Hardware FamiliesMajor FeaturesExample Devices
1Apple1–Apple10, Mac1–Mac2Basic graphics/compute, no ray tracingA7, A11
2Apple7+ , Mac2Mesh shading, Tier 2 argument buffersA15, M1,
3Apple7+ (full hardware RT on Apple8+ for iOS, Apple7+ for )Hardware ray tracing, advanced MetalFXA17 Pro, M1, M3
4Apple6+ (full on Apple8+)Tensor ops, AI acceleration (Metal 4)A14, A18, M1, M4
To query tier support at runtime, developers use the MTLDevice.supportsFamily(_:) method, passing an MTLGPUFamily value such as .apple7 to verify if the device meets a minimum family requirement. Additional checks like supportsRaytracing or readWriteTextureSupport provide granular confirmation for specific features, enabling dynamic code paths. These tiers necessitate fallback strategies for cross-device compatibility; for instance, on Tier 1 or 2 hardware lacking hardware ray tracing, applications can implement software-based ray tracing using compute pipelines, though at a performance cost of 2–5x slower than dedicated hardware. Similarly, MetalFX fallbacks to bilinear upscaling on unsupported devices maintain functionality while prioritizing scalability across the ecosystem.

Adoption and ecosystem

Use in Apple platforms

Metal serves as the foundational graphics and compute for Apple's operating systems, powering essential rendering tasks in and macOS, including support in , photo editing in app, and smooth system UI animations. This integration ensures low-overhead GPU access across the platform, enabling efficient hardware-accelerated visuals without relying on higher-level abstractions for core system components. Apple's high-level frameworks leverage Metal as their underlying engine to simplify development while maintaining performance. SceneKit uses Metal for , supporting Metal Shading Language in its programs to handle complex scenes in apps. SpriteKit relies on Metal for 2D graphics and particle effects, providing SpriteKit scenes with direct GPU acceleration for games and interactive content. RealityKit, designed for augmented and mixed reality, builds on Metal to render immersive 3D content, including entity-component systems optimized for GPUs. These frameworks abstract Metal's complexity, allowing developers to create AR experiences via ARKit while ensuring Metal handles the low-level rendering. In and , Metal enables advanced graphics for media and . On , it drives rendering for Apple TV+ content, supporting high-frame-rate video playback and UI elements with GPU efficiency. For , Metal powers fully immersive experiences, allowing custom rendering engines to integrate passthrough video with 3D overlays for spatial apps. Developers can render hover effects and depth-based interactions directly in Metal layers, enhancing immersion on . Metal integrates deeply with machine learning workflows through Core ML, where models execute on-device using Metal for GPU acceleration. Core ML's configuration options, such as preferredMetalDevice, route computations to the appropriate GPU, enabling fast operations in apps. This is facilitated by Metal Performance Shaders Graph, which compiles Core ML models into optimized Metal shaders for scalable performance. At the system level, Metal supports AVFoundation for tasks, including HDR playback and in AVPlayerLayer. It accelerates ProRes encoding and decoding, allowing efficient handling of high-resolution footage in media pipelines. In professional applications, Metal drives advanced effects and rendering. utilizes Metal for its video editing engine, achieving up to 8x faster playback and export speeds on compatible hardware through optimized GPU pipelines. Similarly, employs Metal for real-time audio visualization and effects processing, ensuring low-latency rendering of complex tracks and plugins. Metal 4 further enhances these with AI-accelerated features for pro workflows.

Third-party applications

Third-party developers have widely adopted Metal for gaming applications on macOS and iOS, enabling high-performance ports of major titles. For instance, CD Projekt RED ported Cyberpunk 2077 to macOS, leveraging Metal 4 for advanced features like path-tracing and frame interpolation to deliver smooth gameplay on Apple Silicon hardware. Similarly, Hello Games updated No Man's Sky with native Metal support for Apple Silicon, optimizing rendering and compute tasks to achieve console-like performance on Macs and iPads. Riot Games integrated Metal as the renderer for League of Legends on macOS, resulting in improved frame rates and reduced latency compared to prior OpenGL implementations. These ports demonstrate Metal's role in bringing AAA gaming to Apple ecosystems without relying on translation layers. In professional creative software, Metal provides GPU acceleration for computationally intensive tasks. Adobe Photoshop utilizes Metal to enhance features like neural filters and canvas rendering, allowing faster processing of large images on devices through optimized integration. Blender's Cycles renderer supports Metal for GPU-accelerated and previews, significantly speeding up and workflows on M-series chips by offloading computations from the CPU. These implementations highlight Metal's efficiency in handling complex shaders and for professional-grade applications. Machine learning frameworks have incorporated Metal backends to accelerate and on Apple hardware. TensorFlow's Metal plugin enables GPU-accelerated model execution, supporting operations like convolutions and attention mechanisms directly on the Neural Engine and GPU. PyTorch's Metal Performance Shaders (MPS) backend facilitates high-performance , mapping computational graphs to Metal for up to several times faster iteration on tasks such as image classification compared to CPU-only runs. This adoption extends to broader AI workflows, allowing developers to leverage Apple Silicon's unified without custom code. Game engines like and have deeply integrated Metal as the preferred renderer for Apple platforms, serving as key case studies in ecosystem growth. Unity's built-in Metal support ensures compatibility across iOS, macOS, and , with automatic conversion and resource scaling that optimizes performance for cross-platform titles. achieves near-feature parity with Windows via Metal, including advanced ray tracing and mesh in versions 5.2 and later, enabling developers to target Apple hardware natively for high-fidelity experiences. By 2025, Metal powers a substantial portion of top games, enabling enhanced graphics and efficiency among leading titles. Following Metal 3, features such as MetalFX upscaling and ray tracing have contributed to improved frame rates in AAA titles on handhelds like , reducing rendering overhead and enabling smoother gameplay in demanding scenarios. These advancements have overcome prior limitations in mobile AAA viability, fostering successes in portable gaming while addressing challenges like thermal throttling through efficient design.

References

Add your contribution
Related Hubs
User Avatar
No comments yet.