Ray tracing (graphics)

from Wikipedia

This recursive ray tracing of reflective colored spheres on a white surface demonstrates the effects of shallow depth of field, "area" light sources, and diffuse interreflection.

In 3D computer graphics, ray tracing is a technique for modeling light transport for use in a wide variety of rendering algorithms for generating digital images.

On a spectrum of computational cost and visual fidelity, ray tracing-based rendering techniques, such as ray casting, recursive ray tracing, distribution ray tracing, photon mapping and path tracing, are generally slower and higher fidelity than scanline rendering methods.[1] Thus, ray tracing was first deployed in applications where taking a relatively long time to render could be tolerated, such as CGI images, and film and television visual effects (VFX), but was less suited to real-time applications such as video games, where speed is critical in rendering each frame.[2]

Since 2018, however, hardware acceleration for real-time ray tracing has become standard on new commercial graphics cards, and graphics APIs have followed suit, allowing developers to use hybrid ray tracing and rasterization-based rendering in games and other real-time applications with a lesser hit to frame render times.

Ray tracing is capable of simulating a variety of optical effects,[3] such as reflection, refraction, soft shadows, scattering, depth of field, motion blur, caustics, ambient occlusion and dispersion phenomena (such as chromatic aberration). It can also be used to trace the path of sound waves in a similar fashion to light waves, making it a viable option for more immersive sound design in video games by rendering realistic reverberation and echoes.[4] In fact, any physical wave or particle phenomenon with approximately linear motion can be simulated with ray tracing.

Ray tracing-based rendering techniques that involve sampling light over a domain generate image noise artifacts, which can be addressed by tracing a very large number of rays or by using denoising techniques.

History

"Draughtsman Making a Perspective Drawing of a Reclining Woman" by Albrecht Dürer, possibly from 1532, shows a man using a grid layout to create an image. The German Renaissance artist is credited with first describing the technique.
Dürer woodcut of Jacob de Keyser's invention. With de Keyser's device, the artist's viewpoint was fixed by an eye hook inserted in the wall. This was joined by a silk string to a gun-sight–style instrument, with a pointed vertical element at the front and a peephole at the back. The artist aimed at the object and traced its outline on the glass, keeping the eyepiece aligned with the string to maintain the correct angle of vision.

The idea of ray tracing comes from as early as the 16th century, when it was described by Albrecht Dürer, who is credited with its invention.[5] Dürer described multiple techniques for projecting 3-D scenes onto an image plane. Some of these project chosen geometry onto the image plane, as is done with rasterization today. Others determine what geometry is visible along a given ray, as is done with ray tracing.[6][7]

Using a computer for ray tracing to generate shaded pictures was first accomplished by Arthur Appel in 1968.[8] Appel used ray tracing for primary visibility (determining the closest surface to the camera at each image point) by tracing a ray through each point to be shaded into the scene to identify the visible surface. The closest surface intersected by the ray was the visible one. This non-recursive ray tracing-based rendering algorithm is today called "ray casting". His algorithm then traced secondary rays to the light source from each point being shaded to determine whether the point was in shadow or not.

Later, in 1971, Goldstein and Nagel of MAGI (Mathematical Applications Group, Inc.)[9] published "3-D Visual Simulation", wherein ray tracing was used to make shaded pictures of solids. At the ray-surface intersection point found, they computed the surface normal and, knowing the position of the light source, computed the brightness of the pixel on the screen. Their publication describes a short (30-second) film "made using the University of Maryland's display hardware outfitted with a 16mm camera. The film showed the helicopter and a simple ground-level gun emplacement. The helicopter was programmed to undergo a series of maneuvers including turns, take-offs, and landings, etc., until it eventually is shot down and crashed." A CDC 6600 computer was used. MAGI produced an animation video called MAGI/SynthaVision Sampler in 1974.[10]

Flip book created in 1976 at Caltech

Another early instance of ray casting came in 1976, when Scott Roth created a flip book animation in Bob Sproull's computer graphics course at Caltech. The scanned pages are shown as a video in the accompanying image. Roth's computer program noted an edge point at a pixel location if the ray intersected a bounded plane different from that of its neighbors. Of course, a ray could intersect multiple planes in space, but only the surface point closest to the camera was noted as visible. The platform was a DEC PDP-10, a Tektronix storage-tube display, and a printer which would create an image of the display on rolling thermal paper. Roth extended the framework, introduced the term ray casting in the context of computer graphics and solid modeling, and in 1982 published his work while at GM Research Labs.[11]

Turner Whitted was the first to show recursive ray tracing for mirror reflection and for refraction through translucent objects, with an angle determined by the solid's index of refraction, and to use ray tracing for anti-aliasing.[12] Whitted also showed ray-traced shadows. He produced a recursive ray-traced film called The Compleat Angler[13] in 1979 while an engineer at Bell Labs. Whitted's deeply recursive ray tracing algorithm reframed rendering from being primarily a matter of surface visibility determination to being a matter of light transport. His paper inspired a series of subsequent works by others that included distribution ray tracing and finally unbiased path tracing, which provides the rendering equation framework that has allowed computer-generated imagery to be faithful to reality.

For decades, global illumination in major films using computer-generated imagery was approximated with additional lights. Ray tracing-based rendering eventually changed that by enabling physically-based light transport. Early feature films rendered entirely using path tracing include Monster House (2006), Cloudy with a Chance of Meatballs (2009),[14] and Monsters University (2013).[15]

Algorithm overview

The ray-tracing algorithm builds an image by extending rays into a scene and bouncing them off surfaces and towards sources of light to approximate the color value of pixels.
Illustration of the ray-tracing algorithm for one pixel (up to the first bounce)

Optical ray tracing describes a method for producing visual images constructed in 3-D computer graphics environments, with more photorealism than either ray casting or scanline rendering techniques. It works by tracing a path from an imaginary eye through each pixel in a virtual screen, and calculating the color of the object visible through it.

Scenes in ray tracing are described mathematically by a programmer or by a visual artist (normally using intermediary tools). Scenes may also incorporate data from images and models captured by means such as digital photography.

Typically, each ray must be tested for intersection with some subset of all the objects in the scene. Once the nearest object has been identified, the algorithm will estimate the incoming light at the point of intersection, examine the material properties of the object, and combine this information to calculate the final color of the pixel. Certain illumination algorithms and reflective or translucent materials may require more rays to be re-cast into the scene.

It may at first seem counterintuitive or "backward" to send rays away from the camera, rather than into it (as actual light does in reality), but doing so is many orders of magnitude more efficient. Since the overwhelming majority of light rays from a given light source do not make it directly into the viewer's eye, a "forward" simulation could potentially waste a tremendous amount of computation on light paths that are never recorded.

Therefore, the shortcut taken in ray tracing is to presuppose that a given ray intersects the view frame. After either a maximum number of reflections or a ray traveling a certain distance without intersection, the ray ceases to travel and the pixel's value is updated.

Calculate rays for rectangular viewport

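On input we have (in the calculations we use vector normalization and the cross product):

  • $E \in \mathbb{R}^3$: eye position
  • $T \in \mathbb{R}^3$: target position
  • $\theta \in [0, \pi]$: field of view (for humans, we can assume $\approx \pi/2\ \text{rad} = 90^\circ$)
  • $m, k \in \mathbb{N}$: numbers of square pixels in the viewport's vertical and horizontal directions
  • $i, j \in \mathbb{N},\ 1 \le i \le k,\ 1 \le j \le m$: indices of the current pixel
  • $\vec{v} \in \mathbb{R}^3$: vertical vector which indicates where up and down are, usually $\vec{v} = [0, 1, 0]$; this is the roll component, which determines the viewport rotation around point $C$ (where the axis of rotation is the segment $ET$)

Viewport schema with pixels, eye E and target T, viewport center C

The idea is to find the position of each viewport pixel center $P_{ij}$, which allows us to find the line going from the eye $E$ through that pixel and finally get the ray described by the point $E$ and the vector $\vec{R}_{ij} = P_{ij} - E$ (or its normalization $\vec{r}_{ij}$). First we need to find the coordinates of the bottom left viewport pixel $P_{1m}$ and find the next pixel by making a shift along directions parallel to the viewport (vectors $\vec{b}_n$, $\vec{v}_n$) multiplied by the size of the pixel. Below we introduce formulas which include the distance $d$ between the eye and the viewport. However, this value cancels out during the ray normalization $\vec{r}_{ij}$ (so you might as well set $d = 1$ and remove it from the calculations).

Pre-calculations: let's find and normalize the vector $\vec{t}$ and the vectors $\vec{b}, \vec{v}$ which are parallel to the viewport (all depicted in the above picture)

$$\vec{t} = T - E, \qquad \vec{b} = \vec{t} \times \vec{v}$$

$$\vec{t}_n = \frac{\vec{t}}{\|\vec{t}\|}, \qquad \vec{b}_n = \frac{\vec{b}}{\|\vec{b}\|}, \qquad \vec{v}_n = \vec{t}_n \times \vec{b}_n$$

note that the viewport center is $C = E + \vec{t}_n d$; next we calculate the viewport sizes $h_x, h_y$ divided by 2, including the inverse aspect ratio $\frac{m-1}{k-1}$

$$g_x = \frac{h_x}{2} = d \tan\frac{\theta}{2}, \qquad g_y = \frac{h_y}{2} = g_x \frac{m-1}{k-1}$$

and then we calculate the next-pixel shifting vectors $\vec{q}_x, \vec{q}_y$ along directions parallel to the viewport ($\vec{b}, \vec{v}$), and the left bottom pixel center $\vec{p}_{1m}$

$$\vec{q}_x = \frac{2 g_x}{k-1} \vec{b}_n, \qquad \vec{q}_y = \frac{2 g_y}{m-1} \vec{v}_n, \qquad \vec{p}_{1m} = \vec{t}_n d - g_x \vec{b}_n - g_y \vec{v}_n$$

Calculations: note that $P_{ij} = E + \vec{p}_{ij}$ and the ray $\vec{R}_{ij} = P_{ij} - E = \vec{p}_{ij}$, so

$$\vec{p}_{ij} = \vec{p}_{1m} + \vec{q}_x (i - 1) + \vec{q}_y (j - 1)$$

$$\vec{r}_{ij} = \frac{\vec{R}_{ij}}{\|\vec{R}_{ij}\|} = \frac{\vec{p}_{ij}}{\|\vec{p}_{ij}\|}$$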
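The viewport construction can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes the eye-to-viewport distance is 1, at least two pixels in each direction, and an up vector that is not parallel to the viewing direction. The function name `viewport_rays` and the variable names (`t_n`, `b_n`, `v_n`, `g_x`, `g_y`, `q_x`, `q_y`, `p_1m`) are labels chosen for this sketch.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def scale(a, s):
    return tuple(x * s for x in a)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(a):
    n = math.sqrt(sum(x * x for x in a))
    return tuple(x / n for x in a)

def viewport_rays(eye, target, up, fov, k, m):
    """Return {(i, j): unit ray direction} for a viewport k pixels wide, m pixels high.

    Assumes viewport distance d = 1, k >= 2, m >= 2, and `up` not parallel
    to the viewing direction.
    """
    t_n = normalize(sub(target, eye))      # viewing direction
    b_n = normalize(cross(t_n, up))        # horizontal viewport axis
    v_n = cross(t_n, b_n)                  # vertical viewport axis (already unit length)
    g_x = math.tan(fov / 2)                # half viewport width for d = 1
    g_y = g_x * (m - 1) / (k - 1)          # half viewport height
    q_x = scale(b_n, 2 * g_x / (k - 1))    # shift to the next pixel horizontally
    q_y = scale(v_n, 2 * g_y / (m - 1))    # shift to the next pixel vertically
    p_1m = sub(sub(t_n, scale(b_n, g_x)), scale(v_n, g_y))  # starting corner pixel
    rays = {}
    for i in range(1, k + 1):
        for j in range(1, m + 1):
            p_ij = add(add(p_1m, scale(q_x, i - 1)), scale(q_y, j - 1))
            rays[(i, j)] = normalize(p_ij)
    return rays
```

For an odd number of pixels in both directions, the middle pixel's ray coincides with the normalized viewing direction. Which corner of the viewport the starting pixel occupies in world space depends on the handedness convention of the cross product.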

Detailed description of ray tracing computer algorithm and its genesis

What happens in nature (simplified)

In nature, a light source emits a ray of light which travels, eventually, to a surface that interrupts its progress. One can think of this "ray" as a stream of photons traveling along the same path. In a perfect vacuum this ray will be a straight line (ignoring relativistic effects). Any combination of four things might happen with this light ray: absorption, reflection, refraction and fluorescence. A surface may absorb part of the light ray, resulting in a loss of intensity of the reflected and/or refracted light. It might also reflect all or part of the light ray, in one or more directions. If the surface has any transparent or translucent properties, it refracts a portion of the light beam into itself in a different direction while absorbing some (or all) of the spectrum (and possibly altering the color). Less commonly, a surface may absorb some portion of the light and fluorescently re-emit the light at a longer wavelength color in a random direction, though this is rare enough that it can be discounted from most rendering applications. Between absorption, reflection, refraction and fluorescence, all of the incoming light must be accounted for, and no more. A surface cannot, for instance, reflect 66% of an incoming light ray, and refract 50%, since the two would add up to be 116%. From here, the reflected and/or refracted rays may strike other surfaces, where their absorptive, refractive, reflective and fluorescent properties again affect the progress of the incoming rays. Some of these rays travel in such a way that they hit our eye, causing us to see the scene and so contribute to the final rendered image.

Ray casting algorithm

The idea behind ray casting, the predecessor to recursive ray tracing, is to trace rays from the eye, one per pixel, and find the closest object blocking the path of that ray. Think of an image as a screen door, with each square in the screen being a pixel. The closest object along the ray is then the object the eye sees through that pixel. Using the material properties and the effect of the lights in the scene, this algorithm can determine the shading of this object. The simplifying assumption is made that if a surface faces a light, the light will reach that surface and not be blocked or in shadow. The shading of the surface is computed using traditional 3-D computer graphics shading models. One important advantage ray casting offered over older scanline algorithms was its ability to easily deal with non-planar surfaces and solids, such as cones and spheres. If a mathematical surface can be intersected by a ray, it can be rendered using ray casting. Elaborate objects can be created by using solid modeling techniques and easily rendered.
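As a rough sketch of ray casting for a single ray, the following Python code finds the closest sphere along the ray and shades it with a simple Lambertian (cosine) term. It is an illustration under stated assumptions: the scene contains only spheres, the ray direction and the direction to the light are unit vectors, the light is infinitely far away, and, as in the simplifying assumption described above, no shadow test is made. The names `hit_sphere` and `cast_ray` are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def hit_sphere(origin, direction, center, radius):
    """Smallest positive distance t at which the ray meets the sphere, or None."""
    v = tuple(o - c for o, c in zip(origin, center))
    b = dot(v, direction)
    disc = b * b - (dot(v, v) - radius * radius)
    if disc < 0:
        return None                      # the ray misses the sphere
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):     # nearer root first
        if t > 1e-9:
            return t
    return None                          # sphere is behind the ray origin

def cast_ray(origin, direction, spheres, to_light):
    """Brightness in [0, 1] of the closest sphere hit by the ray, or 0.0 on a miss.

    `spheres` is a list of (center, radius) pairs; `to_light` is a unit vector
    pointing toward a distant light. No shadow rays are traced.
    """
    closest_t, closest = math.inf, None
    for center, radius in spheres:
        t = hit_sphere(origin, direction, center, radius)
        if t is not None and t < closest_t:
            closest_t, closest = t, (center, radius)
    if closest is None:
        return 0.0
    center, radius = closest
    point = tuple(o + closest_t * d for o, d in zip(origin, direction))
    normal = tuple((p - c) / radius for p, c in zip(point, center))
    return max(0.0, dot(normal, to_light))   # surfaces facing away get 0
```

Calling `cast_ray` once per pixel, with the direction taken from the viewport construction, yields a complete ray-cast image.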

Volume ray casting algorithm

In the method of volume ray casting, each ray is traced so that color and/or density can be sampled along the ray and then be combined into a final pixel color. This is often used when objects cannot be easily represented by explicit surfaces (such as triangles), for example when rendering clouds or 3-D medical scans.
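One common way to combine the samples is front-to-back compositing, sketched below for a scalar brightness. This is an illustrative assumption rather than the only possible scheme: the density is treated as an extinction coefficient, each step's opacity is taken as 1 - exp(-density × step length), and the medium emits a constant `color`. The function name `march_volume` and its parameters are hypothetical.

```python
import math

def march_volume(origin, direction, density, t_max, steps, color=1.0):
    """Sample `density(point)` along the ray and composite front to back.

    Returns (accumulated brightness, remaining transmittance).
    """
    dt = t_max / steps
    accumulated = 0.0
    transmittance = 1.0                  # fraction of light still passing through
    for n in range(steps):
        t = (n + 0.5) * dt               # sample at the middle of each step
        p = tuple(o + t * d for o, d in zip(origin, direction))
        alpha = 1.0 - math.exp(-density(p) * dt)   # opacity of this step
        accumulated += transmittance * alpha * color
        transmittance *= 1.0 - alpha
    return accumulated, transmittance
```

For a constant density, the remaining transmittance after a distance L approaches exp(-density × L), which gives a quick sanity check of the loop.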

Visualization of SDF ray marching algorithm

SDF ray marching algorithm

In SDF ray marching, or sphere tracing,[16] each ray is traced in multiple steps to approximate an intersection point between the ray and a surface defined by a signed distance function (SDF). The SDF is evaluated at each iteration so that the steps taken can be as large as possible without missing any part of the surface. A threshold is used to cancel further iteration when a point is reached that is close enough to the surface. This method is often used for 3-D fractal rendering.[17]
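A minimal sphere-tracing loop might look like the following Python sketch. The step size at each iteration is the SDF value itself, which is the largest step guaranteed not to pass through the surface. The function names, the default threshold `eps`, the step limit and the far limit `t_max` are assumptions made for the example; the ray direction is assumed to be a unit vector.

```python
import math

def sphere_sdf(p, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from point p to a sphere (negative inside)."""
    return math.dist(p, center) - radius

def sphere_trace(origin, direction, sdf, max_steps=128, eps=1e-6, t_max=100.0):
    """Distance along the ray to the surface defined by `sdf`, or None on a miss."""
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if dist < eps:        # close enough to the surface: stop iterating
            return t
        t += dist             # safe step: no surface is nearer than `dist`
        if t > t_max:         # ray left the region of interest
            break
    return None
```

For example, `sphere_trace((0.0, 0.0, -5.0), (0.0, 0.0, 1.0), sphere_sdf)` reaches the unit sphere after a single step of length 4.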

Recursive ray tracing algorithm

Ray tracing can create photorealistic images.
In addition to the high degree of realism, ray tracing can simulate the effects of a camera due to depth of field and aperture shape (in this case a hexagon).
The number of reflections, or bounces, a "ray" can make, and how it is affected each time it encounters a surface, is controlled by settings in the software. In this image, each ray was allowed to reflect up to 16 times. Multiple "reflections of reflections" can thus be seen in these spheres. (Image created with Cobalt.)
The number of refractions a “ray” can make, and how it is affected each time it encounters a surface that permits the transmission of light, is controlled by settings in the software. Here, each ray was set to refract or reflect (the "depth") up to 9 times. Fresnel reflections were used and caustics are visible. (Image created with V-Ray.)

Earlier algorithms traced rays from the eye into the scene until they hit an object, but determined the ray color without recursively tracing more rays. Recursive ray tracing continues the process. When a ray hits a surface, additional rays may be cast because of reflection, refraction, and shadow:[18]

  • A reflection ray is traced in the mirror-reflection direction. The closest object it intersects is what will be seen in the reflection.
  • A refraction ray traveling through transparent material works similarly, with the addition that a refractive ray could be entering or exiting a material. Turner Whitted extended the mathematical logic for rays passing through a transparent solid to include the effects of refraction.[19]
  • A shadow ray is traced toward each light. If any opaque object is found between the surface and the light, the surface is in shadow and the light does not illuminate it.

These recursive rays add more realism to ray-traced images.
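The recursion can be sketched in Python as follows. This is a simplified illustration, not Whitted's original algorithm: it handles only spheres, a single point light, scalar brightness, shadow rays and mirror-reflection rays, and it omits refraction entirely. The dictionary keys (`center`, `radius`, `diffuse`, `reflect`) and the function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def unit(a):
    n = math.sqrt(dot(a, a))
    return tuple(x / n for x in a)

def nearest_hit(origin, direction, spheres):
    """Return (t, sphere) for the closest intersection in front of the origin, or None."""
    best = None
    for sphere in spheres:
        v = sub(origin, sphere["center"])
        b = dot(v, direction)
        disc = b * b - (dot(v, v) - sphere["radius"] ** 2)
        if disc < 0:
            continue
        root = math.sqrt(disc)
        for t in (-b - root, -b + root):
            if t > 1e-6:                       # skip hits at or behind the origin
                if best is None or t < best[0]:
                    best = (t, sphere)
                break
    return best

def trace(origin, direction, spheres, light, depth=0, max_depth=3):
    """Brightness seen along a ray, with shadow and reflection rays."""
    hit = nearest_hit(origin, direction, spheres)
    if hit is None:
        return 0.0                             # background
    t, sphere = hit
    point = tuple(o + t * d for o, d in zip(origin, direction))
    normal = unit(sub(point, sphere["center"]))
    # Shadow ray: is any object between the surface point and the light?
    to_light = sub(light, point)
    light_dist = math.sqrt(dot(to_light, to_light))
    to_light = unit(to_light)
    blocker = nearest_hit(point, to_light, spheres)
    lit = blocker is None or blocker[0] > light_dist
    brightness = sphere["diffuse"] * max(0.0, dot(normal, to_light)) if lit else 0.0
    # Reflection ray: recurse in the mirror direction, up to max_depth bounces.
    if depth < max_depth and sphere["reflect"] > 0.0:
        k = 2.0 * dot(normal, direction)
        mirrored = sub(direction, tuple(k * n for n in normal))
        brightness += sphere["reflect"] * trace(point, mirrored, spheres, light,
                                                depth + 1, max_depth)
    return brightness
```

The `max_depth` parameter bounds the recursion, so two facing mirrors cannot generate rays indefinitely.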

Advantages over other rendering methods

Ray tracing-based rendering's popularity stems from its basis in a realistic simulation of light transport, as compared to other rendering methods, such as rasterization, which focuses more on the realistic simulation of geometry. Effects such as reflections and shadows, which are difficult to simulate using other algorithms, are a natural result of the ray tracing algorithm. The computational independence of each ray makes ray tracing amenable to a basic level of parallelization,[20] but the divergence of ray paths makes high utilization under parallelism quite difficult to achieve in practice.[21]

Disadvantages

A serious disadvantage of ray tracing is performance, though it can in theory be faster than traditional scanline rendering, depending on scene complexity vs. number of pixels on-screen. Until the late 2010s, ray tracing in real time was usually considered impossible on consumer hardware for nontrivial tasks. Scanline algorithms and other algorithms use data coherence to share computations between pixels, while ray tracing normally starts the process anew, treating each eye ray separately. However, this separation offers other advantages, such as the ability to shoot more rays as needed to perform spatial anti-aliasing and improve image quality where needed.

Whitted-style recursive ray tracing handles interreflection and optical effects such as refraction, but is not generally photorealistic. Improved realism occurs when the rendering equation is fully evaluated, as the equation conceptually includes every physical effect of light flow. However, this is infeasible given the computing resources required, and the limitations on geometric and material modeling fidelity. Path tracing is an algorithm for evaluating the rendering equation and thus gives higher-fidelity simulations of real-world lighting.

Reversed direction of traversal of scene by the rays

The process of shooting rays from the eye to the light source to render an image is sometimes called backwards ray tracing, since it is the opposite direction photons actually travel. However, there is confusion with this terminology. Early ray tracing was always done from the eye, and early researchers such as James Arvo used the term backwards ray tracing to mean shooting rays from the lights and gathering the results. Therefore, it is clearer to distinguish eye-based versus light-based ray tracing.

While the direct illumination is generally best sampled using eye-based ray tracing, certain indirect effects can benefit from rays generated from the lights. Caustics are bright patterns caused by the focusing of light off a wide reflective region onto a narrow area of (near-)diffuse surface. An algorithm that casts rays directly from lights onto reflective objects, tracing their paths to the eye, will better sample this phenomenon. This integration of eye-based and light-based rays is often expressed as bidirectional path tracing, in which paths are traced from both the eye and lights, and the paths subsequently joined by a connecting ray after some length.[22][23]

Photon mapping is another method that uses both light-based and eye-based ray tracing; in an initial pass, energetic photons are traced along rays from the light source so as to compute an estimate of radiant flux as a function of 3-dimensional space (the eponymous photon map itself). In a subsequent pass, rays are traced from the eye into the scene to determine the visible surfaces, and the photon map is used to estimate the illumination at the visible surface points.[24][25] The advantage of photon mapping versus bidirectional path tracing is the ability to achieve significant reuse of photons, reducing computation, at the cost of statistical bias.

An additional problem occurs when light must pass through a very narrow aperture to illuminate the scene (consider a darkened room, with a door slightly ajar leading to a brightly lit room), or a scene in which most points do not have direct line-of-sight to any light source (such as with ceiling-directed light fixtures or torchieres). In such cases, only a very small subset of paths will transport energy; Metropolis light transport is a method which begins with a random search of the path space, and when energetic paths are found, reuses this information by exploring the nearby space of rays.[26]

Image showing recursively generated rays from the "eye" (and through an image plane) to a light source after encountering two diffuse surfaces

The accompanying image shows a simple example of a path of rays recursively generated from the camera (or eye) to the light source using the above algorithm. A diffuse surface reflects light in all directions.

First, a ray is created at an eyepoint and traced through a pixel and into the scene, where it hits a diffuse surface. From that surface, the algorithm recursively generates a reflection ray, which is traced through the scene, where it hits another diffuse surface. Finally, another reflection ray is generated and traced through the scene, where it hits the light source and is absorbed. The color of the pixel now depends on the colors of the first and second diffuse surfaces and the color of the light emitted from the light source. For example, if the light source emitted white light and the two diffuse surfaces were blue, then the resulting color of the pixel is blue.
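The white-light, blue-surface example can be checked with a few lines of arithmetic. The sketch below assumes that colors are RGB triples and that each diffuse bounce simply multiplies the light by the surface's reflectance per channel; it deliberately ignores the cosine term, distance falloff and sampling weights that a full renderer would include. The function name `path_color` is hypothetical.

```python
def path_color(light_color, surface_colors):
    """Multiply the emitted RGB color by each surface's RGB reflectance in turn."""
    color = tuple(light_color)
    for albedo in surface_colors:
        color = tuple(c * a for c, a in zip(color, albedo))
    return color

white = (1.0, 1.0, 1.0)
blue = (0.0, 0.0, 1.0)
pixel = path_color(white, [blue, blue])   # two pure-blue diffuse bounces
# pixel is (0.0, 0.0, 1.0): only the blue channel survives both bounces
```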

Example

Trefoil knot, created with a parametric equation and ray traced in Python

As a demonstration of the principles involved in ray tracing, consider how one would find the intersection between a ray and a sphere. This is merely the math behind the line–sphere intersection and the subsequent determination of the color of the pixel being calculated. There is, of course, far more to the general process of ray tracing, but this demonstrates an example of the algorithms used.

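In vector notation, the equation of a sphere with center $\mathbf{c}$ and radius $r$ is

$$\left\Vert \mathbf{x} - \mathbf{c} \right\Vert^2 = r^2.$$

Any point on a ray starting from point $\mathbf{s}$ with direction $\mathbf{d}$ (here $\mathbf{d}$ is a unit vector) can be written as

$$\mathbf{x} = \mathbf{s} + t\mathbf{d},$$

where $t$ is the distance between $\mathbf{x}$ and $\mathbf{s}$. In our problem, we know $\mathbf{c}$, $r$, $\mathbf{s}$ (e.g. the position of a light source) and $\mathbf{d}$, and we need to find $t$. Therefore, we substitute for $\mathbf{x}$:

$$\left\Vert \mathbf{s} + t\mathbf{d} - \mathbf{c} \right\Vert^2 = r^2.$$

Let $\mathbf{v} = \mathbf{s} - \mathbf{c}$ for simplicity; then

$$\left\Vert \mathbf{v} + t\mathbf{d} \right\Vert^2 = r^2$$

$$\mathbf{v}^2 + t^2 \mathbf{d}^2 + 2t\,\mathbf{v} \cdot \mathbf{d} = r^2$$

$$(\mathbf{d}^2)\,t^2 + (2\mathbf{v} \cdot \mathbf{d})\,t + (\mathbf{v}^2 - r^2) = 0.$$

Knowing that $\mathbf{d}$ is a unit vector allows us this minor simplification:

$$t^2 + (2\mathbf{v} \cdot \mathbf{d})\,t + (\mathbf{v}^2 - r^2) = 0.$$

This quadratic equation has solutions

$$t = \frac{-(2\mathbf{v} \cdot \mathbf{d}) \pm \sqrt{(2\mathbf{v} \cdot \mathbf{d})^2 - 4(\mathbf{v}^2 - r^2)}}{2} = -(\mathbf{v} \cdot \mathbf{d}) \pm \sqrt{(\mathbf{v} \cdot \mathbf{d})^2 - (\mathbf{v}^2 - r^2)}.$$

The two values of $t$ found by solving this equation are those for which $\mathbf{s} + t\mathbf{d}$ are the points where the ray intersects the sphere.

Any value which is negative does not lie on the ray, but rather in the opposite half-line (i.e. the one starting from $\mathbf{s}$ with opposite direction).

If the quantity under the square root (the discriminant) is negative, then the ray does not intersect the sphere.

Let us suppose now that there is at least one positive solution, and let $t$ be the minimal one. In addition, let us suppose that the sphere is the nearest object in our scene intersecting our ray, and that it is made of a reflective material. We need to find in which direction the light ray is reflected. The laws of reflection state that the angle of reflection is equal and opposite to the angle of incidence between the incident ray and the normal to the sphere.

The normal to the sphere is simply

$$\mathbf{n} = \frac{\mathbf{y} - \mathbf{c}}{\left\Vert \mathbf{y} - \mathbf{c} \right\Vert},$$

where $\mathbf{y} = \mathbf{s} + t\mathbf{d}$ is the intersection point found before. The reflection direction can be found by a reflection of $\mathbf{d}$ with respect to $\mathbf{n}$, that is

$$\mathbf{r} = \mathbf{d} - 2(\mathbf{n} \cdot \mathbf{d})\,\mathbf{n}.$$

Thus the reflected ray has equation

$$\mathbf{x} = \mathbf{y} + u\mathbf{r}.$$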

Now we only need to compute the intersection of the latter ray with our field of view, to get the pixel which our reflected light ray will hit. Lastly, this pixel is set to an appropriate color, taking into account how the color of the original light source and that of the sphere are combined by the reflection.
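The derivation translates almost directly into code. In the Python sketch below, `s` is the ray start, `d` its unit direction, `c` the sphere center and `r` the radius; the function names `intersect_sphere` and `reflect_at` are hypothetical, and `d` is assumed to be normalized by the caller.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def intersect_sphere(s, d, c, r):
    """Roots (t1, t2), t1 <= t2, of t^2 + 2(v.d)t + (v.v - r^2) = 0, or None.

    Here v = s - c. A negative discriminant means the ray's line misses the sphere.
    """
    v = tuple(si - ci for si, ci in zip(s, c))
    vd = dot(v, d)
    disc = vd * vd - (dot(v, v) - r * r)
    if disc < 0:
        return None
    root = math.sqrt(disc)
    return (-vd - root, -vd + root)

def reflect_at(s, d, c, t):
    """Intersection point y = s + t d and mirror direction d - 2 (n.d) n."""
    y = tuple(si + t * di for si, di in zip(s, d))
    offset = tuple(yi - ci for yi, ci in zip(y, c))
    length = math.sqrt(dot(offset, offset))
    n = tuple(o / length for o in offset)          # outward unit normal at y
    k = 2.0 * dot(n, d)
    return y, tuple(di - k * ni for di, ni in zip(d, n))
```

For a ray starting at (0, 0, -5) with direction (0, 0, 1) and a unit sphere at the origin, the two roots are 4 and 6; the nearer hit is at (0, 0, -1), where the ray is reflected straight back along (0, 0, -1).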

Adaptive depth control

Adaptive depth control means that the renderer stops generating reflected/transmitted rays when the computed intensity becomes less than a certain threshold. There must always be a set maximum depth or else the program would generate an infinite number of rays. But it is not always necessary to go to the maximum depth if the surfaces are not highly reflective. To test for this the ray tracer must compute and keep the product of the global and reflection coefficients as the rays are traced.

Example: let Kr = 0.5 for a set of surfaces. Then from the first surface the maximum contribution is 0.5, for the reflection from the second: 0.5 × 0.5 = 0.25, the third: 0.25 × 0.5 = 0.125, the fourth: 0.125 × 0.5 = 0.0625, the fifth: 0.0625 × 0.5 = 0.03125, etc. In addition we might implement a distance attenuation factor such as 1/D², which would also decrease the intensity contribution.

For a transmitted ray we could do something similar but in that case the distance traveled through the object would cause even faster intensity decrease. As an example of this, Hall & Greenberg found that even for a very reflective scene, using this with a maximum depth of 15 resulted in an average ray tree depth of 1.7.[27]
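The Kr = 0.5 arithmetic can be turned into a small cutoff rule. The Python sketch below multiplies the reflection coefficient along the path and stops once the accumulated weight drops below a threshold or the maximum depth is reached. The threshold value of 0.05 and the function name are assumptions for illustration; distance attenuation is not modeled.

```python
def bounces_before_cutoff(kr, threshold, max_depth):
    """Count reflection bounces whose cumulative weight stays at or above `threshold`."""
    weight = 1.0
    depth = 0
    while depth < max_depth:
        weight *= kr              # product of reflection coefficients so far
        if weight < threshold:    # contribution too small to matter: stop
            break
        depth += 1
    return depth

contributions = [0.5 ** n for n in range(1, 6)]
# contributions == [0.5, 0.25, 0.125, 0.0625, 0.03125]
depth = bounces_before_cutoff(0.5, 0.05, 15)
# depth == 4: the fifth bounce (0.03125) falls below the 0.05 threshold
```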

Bounding volumes

Enclosing groups of objects in sets of bounding volume hierarchies (BVH) decreases the number of computations required for ray tracing. A cast ray is first tested for an intersection with the bounding volume, and then if there is an intersection, the volume is recursively divided until the ray hits the object. The best type of bounding volume will be determined by the shape of the underlying object or objects. For example, if the objects are long and thin, then a sphere will enclose mainly empty space compared to a box. Boxes are also easier to use when generating hierarchical bounding volumes.

Note that using a hierarchical system like this (assuming it is done carefully) changes the intersection computational time from a linear dependence on the number of objects to something between a linear and a logarithmic dependence. This is because, for a perfect case, each intersection test would divide the possibilities by two, and result in a binary tree type structure. Spatial subdivision methods, discussed below, try to achieve this. Furthermore, this acceleration structure makes the ray-tracing computation output-sensitive; that is, the complexity of the ray intersection calculations depends on the number of objects that actually intersect the rays and not (only) on the number of objects in the scene.
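A minimal sketch of the idea in Python, using axis-aligned boxes as bounding volumes: the hierarchy is built by a median split along the longest axis, and traversal skips an entire subtree as soon as the ray misses its box. The build strategy, the dictionary layout, the leaf size of two and the function names are assumptions chosen for brevity; production BVH builders typically use more careful split heuristics.

```python
import math

def hit_box(origin, direction, lo, hi):
    """Slab test: does the ray hit the axis-aligned box [lo, hi] at some t >= 0?"""
    t_near, t_far = 0.0, math.inf
    for o, d, l, h in zip(origin, direction, lo, hi):
        if d == 0.0:                      # ray parallel to this pair of slabs
            if o < l or o > h:
                return False
            continue
        t0, t1 = (l - o) / d, (h - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
    return t_near <= t_far

def union(boxes):
    """Smallest axis-aligned box containing all given (lo, hi) boxes."""
    los, his = zip(*boxes)
    return (tuple(min(c) for c in zip(*los)), tuple(max(c) for c in zip(*his)))

def build(items):
    """Build a binary tree from a list of (name, (lo, hi)) items."""
    box = union([b for _, b in items])
    if len(items) <= 2:
        return {"box": box, "items": items}            # leaf node
    axis = max(range(3), key=lambda a: box[1][a] - box[0][a])
    items = sorted(items, key=lambda it: it[1][0][axis] + it[1][1][axis])
    mid = len(items) // 2
    return {"box": box, "left": build(items[:mid]), "right": build(items[mid:])}

def candidates(node, origin, direction):
    """Names of items whose boxes the ray hits; a missed node prunes its subtree."""
    if not hit_box(origin, direction, *node["box"]):
        return []
    if "items" in node:
        return [name for name, (lo, hi) in node["items"]
                if hit_box(origin, direction, lo, hi)]
    return (candidates(node["left"], origin, direction)
            + candidates(node["right"], origin, direction))
```

The returned candidates would then be passed to exact ray-object intersection tests; objects in pruned subtrees are never examined.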

Kay & Kajiya give a list of desired properties for hierarchical bounding volumes:

  • Subtrees should contain objects that are near each other, and the further down the tree, the closer together the objects should be.
  • The volume of each node should be minimal.
  • The sum of the volumes of all bounding volumes should be minimal.
  • Greater attention should be placed on the nodes near the root since pruning a branch near the root will remove more potential objects than one farther down the tree.
  • The time spent constructing the hierarchy should be much less than the time saved by using it.

Interactive ray tracing

The first implementation of an interactive ray tracer was the LINKS-1 Computer Graphics System built in 1982 at Osaka University's School of Engineering, by professors Ohmura Kouichi, Shirakawa Isao and Kawata Toru with 50 students.[citation needed] It was a massively parallel processing computer system with 514 microprocessors (257 Zilog Z8001s and 257 iAPX 86s), used for 3-D computer graphics with high-speed ray tracing. According to the Information Processing Society of Japan: "The core of 3-D image rendering is calculating the luminance of each pixel making up a rendered surface from the given viewpoint, light source, and object position. The LINKS-1 system was developed to realize an image rendering methodology in which each pixel could be parallel processed independently using ray tracing. By developing a new software methodology specifically for high-speed image rendering, LINKS-1 was able to rapidly render highly realistic images." It was used to create an early 3-D planetarium-like video of the heavens made completely with computer graphics. The video was presented at the Fujitsu pavilion at the 1985 International Exposition in Tsukuba.[28] It was the second system to do so after the Evans & Sutherland Digistar in 1982. The LINKS-1 was claimed by the designers to be the world's most powerful computer in 1984.[29]

The next interactive ray tracer, and the first known to have been labeled "real-time", was credited at the 2005 SIGGRAPH computer graphics conference as being the REMRT/RT tools developed in 1986 by Mike Muuss for the BRL-CAD solid modeling system. Initially published in 1987 at USENIX, the BRL-CAD ray tracer was an early implementation of a parallel network distributed ray tracing system that achieved several frames per second in rendering performance.[30] This performance was attained by means of the highly optimized yet platform-independent LIBRT ray tracing engine in BRL-CAD and by using solid implicit CSG geometry on several shared memory parallel machines over a commodity network. BRL-CAD's ray tracer, including the REMRT/RT tools, continues to be available and developed today as open source software.[31]

Since then, there have been considerable efforts and research towards implementing ray tracing at real-time speeds for a variety of purposes on stand-alone desktop configurations. These purposes include interactive 3-D graphics applications such as demoscene productions, computer and video games, and image rendering. Some real-time software 3-D engines based on ray tracing have been developed by hobbyist demo programmers since the late 1990s.[32]

In 1999 a team from the University of Utah, led by Steven Parker, demonstrated interactive ray tracing live at the 1999 Symposium on Interactive 3D Graphics. They rendered a 35 million sphere model at 512 by 512 pixel resolution, running at approximately 15 frames per second on 60 CPUs.[33]

The Open RT project included a highly optimized software core for ray tracing along with an OpenGL-like API in order to offer an alternative to the current rasterization-based approach for interactive 3-D graphics. Ray tracing hardware, such as the experimental Ray Processing Unit developed by Sven Woop at Saarland University, was designed to accelerate some of the computationally intensive operations of ray tracing.

Quake Wars: Ray Traced

The idea that video games could ray trace their graphics in real time received media attention in the late 2000s. During that time, a researcher named Daniel Pohl, under the guidance of graphics professor Philipp Slusallek and in cooperation with the Erlangen University and Saarland University in Germany, equipped Quake III and Quake IV with an engine he programmed himself, which Saarland University then demonstrated at CeBIT 2007.[34] Intel, a patron of Saarland, became impressed enough that it hired Pohl and embarked on a research program dedicated to ray traced graphics, which it saw as justifying increasing the number of its processors' cores.[35]: 99–100 [36] On June 12, 2008, Intel demonstrated a special version of Enemy Territory: Quake Wars, titled Quake Wars: Ray Traced, using ray tracing for rendering, running in basic HD (720p) resolution. ETQW operated at 14–29 frames per second on a 16-core (4 socket, 4 core) Xeon Tigerton system running at 2.93 GHz.[37]

At SIGGRAPH 2009, Nvidia announced OptiX, a free API for real-time ray tracing on Nvidia GPUs. The API exposes seven programmable entry points within the ray tracing pipeline, allowing for custom cameras, ray-primitive intersections, shaders, shadowing, etc. This flexibility enables bidirectional path tracing, Metropolis light transport, and many other rendering algorithms that cannot be implemented with tail recursion.[38] OptiX-based renderers are used in Autodesk Arnold, Adobe After Effects, Bunkspeed Shot, Autodesk Maya, 3ds Max, and many other applications.

In 2014, a demo of the PlayStation 4 game The Tomorrow Children, developed by Q-Games and Japan Studio, demonstrated new lighting techniques developed by Q-Games, notably cascaded voxel cone ray tracing, which simulates lighting in real-time and uses more realistic reflections rather than screen space reflections.[39]

Nvidia introduced its GeForce RTX and Quadro RTX GPUs in September 2018, based on the Turing architecture, which allows for hardware-accelerated ray tracing. The Nvidia hardware uses a separate functional block, publicly called an "RT core". This unit is somewhat comparable to a texture unit in size, latency, and interface to the processor core. The unit features BVH traversal, compressed BVH node decompression, ray-AABB intersection testing, and ray-triangle intersection testing.[40] The GeForce RTX, in the form of models 2080 and 2080 Ti, became the first consumer-oriented brand of graphics card that can perform ray tracing in real time,[41] and, in November 2018, Electronic Arts' Battlefield V became the first game to take advantage of its ray tracing capabilities, which it achieves via Microsoft's new API, DirectX Raytracing.[42] AMD, which already offered interactive ray tracing on top of OpenCL through its Radeon ProRender,[43][44] unveiled the Radeon RX 6000 series, its second-generation Navi GPUs with support for hardware-accelerated ray tracing, at an online event in October 2020.[45][46][47][48][49] More games that render their graphics by such means have appeared since, which has been credited to improvements in hardware and to efforts to make more APIs and game engines compatible with the technology.[50] Current home gaming consoles implement dedicated ray tracing hardware components in their GPUs for real-time ray tracing effects, beginning with the ninth-generation consoles PlayStation 5, Xbox Series X and Series S.[51][52][53][54][55]

On November 4, 2021, Imagination Technologies announced their IMG CXT GPU with hardware-accelerated ray tracing.[56][57] On January 18, 2022, Samsung announced their Exynos 2200 AP SoC with hardware-accelerated ray tracing.[58] On June 28, 2022, Arm announced their Immortalis-G715 with hardware-accelerated ray tracing.[59] On November 16, 2022, Qualcomm announced their Snapdragon 8 Gen 2 with hardware-accelerated ray tracing.[60][61]

On September 12, 2023, Apple introduced hardware-accelerated ray tracing in its chip designs, beginning with the A17 Pro chip for iPhone 15 Pro models.[62][63] Later the same year, Apple released the M3 family of processors with hardware-accelerated ray tracing support.[64] This technology is accessible across iPhones, iPads, and Mac computers via the Metal API. Apple reports up to a 4x performance increase over previous software-based ray tracing on the phone[63] and up to a 2.5x speedup when comparing M3 to M1 chips.[64] The hardware implementation includes acceleration structure traversal and dedicated ray-box intersections, and the API supports RayQuery (Inline Ray Tracing) as well as RayPipeline features.[65]

Computational complexity

Various complexity results have been proven for certain formulations of the ray tracing problem. In particular, the decision version of the ray tracing problem can be defined as follows:[66] given a light ray's initial position and direction and some fixed point, does the ray eventually reach that point? For this formulation, the referenced paper proves the following results:

  • Ray tracing in 3-D optical systems with a finite set of reflective or refractive objects represented by a system of rational quadratic inequalities is undecidable.
  • Ray tracing in 3-D optical systems with a finite set of refractive objects represented by a system of rational linear inequalities is undecidable.
  • Ray tracing in 3-D optical systems with a finite set of rectangular reflective or refractive objects is undecidable.
  • Ray tracing in 3-D optical systems with a finite set of reflective or partially reflective objects represented by a system of linear inequalities, some of which can be irrational, is undecidable.
  • Ray tracing in 3-D optical systems with a finite set of reflective or partially reflective objects represented by a system of rational linear inequalities is PSPACE-hard.
  • For any dimension equal to or greater than 2, ray tracing with a finite set of parallel and perpendicular reflective surfaces represented by rational linear inequalities is in PSPACE.

from Grokipedia
Ray tracing in computer graphics is a rendering technique that simulates the physical paths of light rays from a virtual camera through each pixel of an image plane, calculating their interactions with scene objects to determine color, shadows, reflections, and refractions for realistic image synthesis.[1] The foundational concept emerged in 1968 when Arthur Appel developed the first ray casting algorithm to solve visibility problems and compute shadows by tracing rays from the viewer to determine which surfaces occlude others in a scene.[2] This approach laid the groundwork for handling complex lighting, though it was limited to basic intersections without recursion.[1] In 1980, Turner Whitted advanced the method with a recursive ray tracing model that incorporated specular reflections, refractions via Snell's law, and shadow rays to light sources, enabling more accurate global illumination effects like multiple bounces of light.[3]

At its core, the Whitted-style ray tracing algorithm operates recursively. For each pixel, a primary ray is cast from the camera eye through the image plane to find the nearest intersecting surface using efficient tests like ray-triangle intersection via the Möller-Trumbore algorithm. Shading is then computed using models such as Phong, incorporating direct illumination, and secondary rays are spawned for shadows (to verify unblocked light paths), reflections (bounced at the surface normal), and transmissions (refracted through transparent materials) until a termination criterion like maximum depth is reached.[4] This backward tracing, from eye to light, contrasts with forward photon simulation and allows precise per-pixel evaluation but incurs high computational cost due to numerous ray-object intersection tests, often O(n) per ray where n is the number of scene primitives.[1]

Rasterization projects and scans geometric primitives across the screen for fast, hardware-optimized rendering of diffuse surfaces, but it struggles with secondary effects like accurate soft shadows or global reflections. Ray tracing, by comparison, excels in physically plausible lighting simulation and automatic visibility resolution. Its cost scales linearly with image resolution, and it provides accurate per-pixel lighting independent of geometric projections, though it was historically restricted to offline, non-real-time applications. Basic ray tracing can easily be extended to support "soft/fuzzy" phenomena, such as soft shadows and depth of field, by casting several slightly differently angled rays and averaging their results, which is then called distributed ray tracing.[5][6] Its parallelizable nature suits modern GPUs, yet challenges like recursion depth and cache inefficiency limited early adoption until acceleration structures such as bounding volume hierarchies (BVHs) reduced intersection queries.[4]

While ray tracing is poorly suited to the traditional real-time rendering of polygon-based models, it offers advantages in certain cases, such as being much simpler to implement and excelling at rendering mathematically well-defined shapes like fractals.[7][8] An example of software utilizing this approach is Mandelbulber, which employs ray tracing for 3D fractal rendering.[9]

Advancements in hardware have transformed ray tracing into a viable real-time technology. NVIDIA's RTX platform, introduced in 2018 with Turing GPUs, integrated dedicated ray-tracing cores (RT Cores) for accelerated BVH traversal and ray-triangle intersections, achieving up to 10x faster ray tracing compared to prior generations and enabling hybrid rasterization-ray tracing pipelines in games and films.[10] This technology has also been integrated into gaming consoles like the PlayStation 5 and Xbox Series X/S since 2020, enabling real-time ray tracing in numerous titles. By 2025, subsequent architectures like Ada Lovelace and Blackwell further optimize denoising and AI-assisted upscaling (e.g., DLSS) to maintain 60+ FPS in ray-traced titles, while extensions like path tracing variants enhance unbiased global illumination for production rendering in tools like Blender and Unreal Engine.[11]

Overview

Core Principles

Ray tracing is a rendering technique in computer graphics that simulates the physical behavior of light by tracing the paths of rays from the virtual camera through each pixel of the image plane into the scene, computing the color of each pixel based on intersections with scene geometry and subsequent light interactions at those points.[12] This approach allows for realistic rendering of effects such as shadows, reflections, and refractions by modeling how light rays propagate and interact with surfaces.[3]

In physical light transport, rays originate from light sources and scatter in all directions, but ray tracing reverses this process for computational efficiency, tracing rays backward from the viewer (or camera) toward the light sources to determine visible contributions.[13] This backward tracing mimics the selective sampling of light paths that reach the observer, reducing the need to simulate all possible light rays in the scene.[14]

A ray in ray tracing is mathematically represented in parametric form as $\vec{P}(t) = \vec{O} + t \vec{D}$, where $\vec{O}$ is the ray's origin point, $\vec{D}$ is its direction vector (typically normalized to unit length), and $t \geq 0$ is a scalar parameter that locates points along the ray.[13] Intersection points occur where a value $t > 0$ satisfies the geometry's implicit equation.

Primary rays, which initiate the tracing process, are generated using a pinhole camera model in which the camera is positioned at $\vec{O}$ (the eye point) and the image plane (viewport) is a rectangular grid at a fixed distance $d$ from the camera, spanning horizontal and vertical field-of-view angles or dimensions.[15] For a pixel at normalized coordinates $(u, v)$ in the viewport (where $u, v \in [-1, 1]$), the ray direction is computed as $\vec{D} = (u \cdot \text{aspect} \cdot \tan(\theta/2),\ v \cdot \tan(\theta/2),\ -1)$ in camera space, where $\theta$ is the vertical field of view and aspect is the viewport's aspect ratio; this vector is then normalized to unit length and transformed to world space if needed.[16]

To determine if a ray intersects scene geometry, intersection tests are performed against primitive shapes like spheres or planes, solving for valid $t$ values. For a sphere centered at $\vec{C}$ with radius $r$, substituting the ray equation into the sphere's implicit form $|\vec{P} - \vec{C}|^2 = r^2$ yields the quadratic equation $t^2 |\vec{D}|^2 + 2t\,(\vec{D} \cdot (\vec{O} - \vec{C})) + |\vec{O} - \vec{C}|^2 - r^2 = 0$.[17] With $a = |\vec{D}|^2$, $b = 2 \vec{D} \cdot (\vec{O} - \vec{C})$, and $c = |\vec{O} - \vec{C}|^2 - r^2$, the discriminant $\Delta = b^2 - 4ac$ indicates real intersections when $\Delta \geq 0$, and the smallest positive root gives the nearest hit point: $t = \frac{-b - \sqrt{\Delta}}{2a}$ if that value is positive, otherwise $t = \frac{-b + \sqrt{\Delta}}{2a}$ (the ray starts inside the sphere). Example pseudocode for this test is as follows:
function intersectSphere(ray, sphere):
    oc = ray.origin - sphere.center
    a = dot(ray.direction, ray.direction)
    b = 2.0 * dot(oc, ray.direction)
    c = dot(oc, oc) - sphere.radius * sphere.radius
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        return no intersection
    t1 = (-b - sqrt(discriminant)) / (2 * a)
    t2 = (-b + sqrt(discriminant)) / (2 * a)
    if t1 > 0:
        return t1  // nearest positive intersection
    elif t2 > 0:
        return t2
    else:
        return no intersection
For planes defined by a point $\vec{Q}$ on the plane and normal $\vec{N}$, the test solves $t = -\frac{(\vec{O} - \vec{Q}) \cdot \vec{N}}{\vec{D} \cdot \vec{N}}$ for $t > 0$, provided $\vec{D} \cdot \vec{N} \neq 0$; spheres serve as a representative curved primitive.[18] These tests form the basis for detecting the first intersection along the primary ray, enabling further computation of surface properties.[19]
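
The camera model and the plane test above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function names, the pixel-center sampling, and the convention that pixel (0, 0) is the top-left corner with the camera looking down -z are all assumptions made for the example.

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def primary_ray_direction(px, py, width, height, fov_deg):
    """Unit camera-space direction through the center of pixel (px, py).

    Assumed conventions: pinhole camera at the origin looking down -z,
    +y up, pixel (0, 0) at the top-left, fov_deg is the vertical FOV.
    """
    aspect = width / height
    half_tan = math.tan(math.radians(fov_deg) / 2.0)
    # Map the pixel center to normalized viewport coordinates in [-1, 1].
    u = 2.0 * (px + 0.5) / width - 1.0
    v = 1.0 - 2.0 * (py + 0.5) / height
    d = (u * aspect * half_tan, v * half_tan, -1.0)
    length = math.sqrt(dot(d, d))
    return (d[0] / length, d[1] / length, d[2] / length)

def intersect_plane(origin, direction, point, normal, eps=1e-9):
    """Ray parameter t > 0 where the ray meets the plane, or None."""
    denom = dot(direction, normal)
    if abs(denom) < eps:
        return None  # ray is (nearly) parallel to the plane
    t = -dot(sub(origin, point), normal) / denom
    return t if t > 0 else None
```

For example, `primary_ray_direction(1, 1, 3, 3, 90.0)` is the central ray of a 3×3 image and points straight down the -z axis.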

Ray-Scene Interactions

Upon detecting an intersection between a primary ray and a scene object, the ray tracing algorithm computes the precise hit point, the surface normal vector at that location, and the material properties associated with the object.[12] These computations enable subsequent shading and secondary ray generation. To determine the color at the intersection point, a local illumination model evaluates contributions from direct lighting sources. The Phong shading model, an empirical approach, approximates surface appearance through ambient, diffuse, and specular components:
$$I = I_a K_a + I_d K_d (\vec{N} \cdot \vec{L}) + I_s K_s (\vec{R} \cdot \vec{V})^n$$

Here, $I_a$, $I_d$, and $I_s$ represent the ambient, diffuse, and specular light intensities; $K_a$, $K_d$, and $K_s$ are the material's ambient, diffuse, and specular reflection coefficients; $\vec{N}$ is the surface normal; $\vec{L}$ is the direction to the light source; $\vec{R}$ is the reflection vector; $\vec{V}$ is the view direction; and $n$ is the specular exponent controlling highlight sharpness.[20] This model assumes light interacts locally with the surface, ignoring inter-object reflections at this stage.

To account for shadows, a secondary shadow ray is cast from the hit point toward each light source. If this ray intersects any occluding object before reaching the light, the point is considered shadowed, attenuating or eliminating the direct lighting contribution.[12] For reflective or transmissive materials, additional rays simulate light bouncing or bending at the surface. The reflected ray direction follows the law of reflection:

$$\vec{R} = \vec{I} - 2 (\vec{I} \cdot \vec{N}) \vec{N}$$

where $\vec{I}$ is the incident ray direction. For refraction through transparent materials, Snell's law governs the transmitted ray direction, incorporating the ratio of indices of refraction $\eta_1 / \eta_2$ between the incident and transmitting media to compute the refracted vector, with potential total internal reflection if the angle exceeds the critical value.[12]

Consider a simple scene with a spherical object illuminated by a point light source in an otherwise empty space. The primary ray from the camera through a pixel intersects the sphere at the hit point, where the local normal is computed as the vector from the sphere's center to that point. A shadow ray checks visibility to the light; if unoccluded, the Phong model shades the point using the light's direction and the view ray. If the sphere is reflective, a secondary reflected ray traces further, potentially hitting another object or terminating at the background, forming a basic ray tree that branches from the primary ray to yield the pixel's final color.[12] Bounding volumes can accelerate these intersection tests by pruning rays against scene hierarchies.
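
A short Python sketch may help connect the Phong formula, the reflection law, and Snell's law. The helper names and scalar (single-channel) intensities are assumptions made for illustration; clamping the dot products at zero is a common practical addition that the formula above does not state, and the vector form of refraction used here is one standard way to express Snell's law.

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def add(a, b):
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def scale(a, s):
    return (a[0] * s, a[1] * s, a[2] * s)

def reflect(i, n):
    """Law of reflection for incident direction i and unit normal n."""
    return add(i, scale(n, -2.0 * dot(i, n)))

def refract(i, n, eta):
    """Refract unit direction i at unit normal n, with eta = eta_1 / eta_2.

    Returns None when total internal reflection occurs.
    """
    cos_i = -dot(i, n)
    sin2_t = eta * eta * (1.0 - cos_i * cos_i)  # Snell: sin_t = eta * sin_i
    if sin2_t > 1.0:
        return None
    cos_t = math.sqrt(1.0 - sin2_t)
    return add(scale(i, eta), scale(n, eta * cos_i - cos_t))

def phong(n, l, v, i_a, i_d, i_s, k_a, k_d, k_s, shininess, in_shadow=False):
    """Scalar Phong intensity; n, l, v are unit vectors."""
    ambient = i_a * k_a
    if in_shadow:
        return ambient  # a blocked shadow ray removes the direct terms
    n_dot_l = max(dot(n, l), 0.0)            # clamp: assumption, see lead-in
    r = reflect(scale(l, -1.0), n)           # light direction mirrored about n
    r_dot_v = max(dot(r, v), 0.0)
    return ambient + i_d * k_d * n_dot_l + i_s * k_s * r_dot_v ** shininess
```

With `eta = 1.0` the refracted direction equals the incident one, and a ray leaving glass (`eta = 1.5`) at 60 degrees from the normal returns `None`, signalling total internal reflection.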

History

Origins and Early Concepts

The concept of ray tracing in computer graphics draws its foundational inspiration from geometrical optics in physics, where light is modeled as rays that propagate in straight lines, reflect, and refract upon interacting with surfaces.[21] This simplification of natural light behavior allowed early researchers to simulate realistic visual effects computationally, adapting physical principles to address rendering challenges in digital scenes.[22] One of the earliest applications in computer graphics appeared in Arthur Appel's 1968 work, which introduced ray tracing as a method for hidden surface removal and basic shading in three-dimensional solid models.[2] Appel's algorithm cast rays from the viewpoint through each pixel to determine visible surfaces and compute simple illumination, marking a shift toward more realistic representations beyond mere outlines.[23] This approach efficiently handled complex geometries by tracing rays to identify occlusions and apply tonal shading based on surface normals and light direction.[2] During the 1970s, computer graphics research transitioned from wireframe models—limited to line drawings without depth cues—to shaded renderings that incorporated surface properties and lighting for enhanced realism.[24] This evolution was driven by the need to simulate perceptual depth and material interactions, setting the stage for advanced techniques. 
A pivotal advancement came in Turner Whitted's 1980 paper, which formalized recursive ray tracing to model reflections and refractions by spawning secondary rays from intersection points, creating a tree-like structure of light paths.[12] Whitted's model extended earlier local illumination methods, enabling the rendering of glossy and transparent effects on both polygonal and curved surfaces.[3] Early ray tracing efforts immediately highlighted significant computational challenges, particularly the high cost of ray-object intersection calculations, which could consume 75-95% of processing time for even modest scenes.[3] Rendering a single image on hardware like the VAX-11/780 often required 44-122 minutes, underscoring the technique's potential for photorealism but limiting its practicality to offline computation.[12] These origins laid the theoretical groundwork for ray tracing, emphasizing its roots in optical simulation while navigating the trade-offs between visual fidelity and efficiency.[3]

Evolution and Key Milestones

In the 1990s and early 2000s, ray tracing transitioned from theoretical research to practical integration in film rendering, bolstered by advancements in acceleration structures like kd-trees, which were first adapted for ray tracing in 1985 and refined through subsequent studies on efficient spatial partitioning for complex scenes.[25] Pixar's RenderMan renderer, initially scanline-based, incorporated ray tracing in the mid-2000s, enabling its use in production for the 2006 film Cars, where it handled indirect lighting and reflections across millions of geometric primitives.[26] The 2010s marked a resurgence in ray tracing, propelled by hardware innovations that made real-time applications feasible. NVIDIA introduced the GeForce RTX series in 2018, featuring dedicated RT cores to accelerate ray-triangle intersections and enable real-time ray tracing in games and simulations.[27] AMD countered with its RDNA 2 architecture in 2020, integrating ray accelerators into GPUs for comparable hardware support in DirectX Raytracing (DXR) and Vulkan environments.[28] Path tracing, which extends ray tracing to unbiased global illumination via Monte Carlo sampling, gained traction in film pipelines during this period. Walt Disney Animation Studios debuted its in-house Hyperion renderer—a path-tracing system—for Big Hero 6 in 2014, achieving physically accurate lighting for urban environments with billions of light paths simulated per frame.[29] Entering the 2020s, ray tracing achieved widespread adoption in interactive media, particularly video games. 
Cyberpunk 2077, released in 2020, showcased ray-traced reflections, shadows, and ambient occlusion, pushing hardware limits and highlighting the technology's role in photorealistic urban scenes.[30] To address performance bottlenecks in real-time contexts, AI-driven denoising emerged as a key enabler; NVIDIA's DLSS 3, launched in 2022, used neural networks to upscale and denoise ray-traced frames, boosting frame rates by up to 4x in supported titles while preserving detail.[31] By 2025, hybrid rasterization-ray tracing pipelines have solidified as the norm in consumer hardware, with consoles like the PlayStation 5 and Xbox Series X—debuted in 2020—employing AMD's RDNA 2-derived GPUs for dynamic ray-traced effects in AAA titles, balancing computational cost with visual realism.[32] NVIDIA's Blackwell architecture, released in January 2025 with the GeForce RTX 50 series, introduced 4th-generation RT cores for up to 2x improved ray tracing performance, while AMD's RDNA 4 architecture, launched in March 2025 with the Radeon RX 9000 series, enhanced ray accelerators for broader adoption.[33][34] Open standards have also evolved significantly; the Vulkan Ray Tracing Extensions, provisionally released in 2020 and finalized in November 2020, now support mature, cross-vendor implementations, facilitating portable ray tracing in diverse ecosystems from mobile to high-end PCs.[35]

Fundamental Algorithms

Ray Casting

Ray casting is a foundational rendering technique in computer graphics that simulates visibility by projecting rays from the viewpoint through each pixel of an image plane into the scene to detect the nearest surface intersection. This method determines what objects are visible to the viewer and applies basic local shading to compute pixel colors, serving as a precursor to more advanced ray tracing algorithms. Unlike forward tracing, which originates rays from light sources and is computationally inefficient due to the vast number of photons emitted, ray casting employs backward tracing by launching primary rays solely from the eye position, focusing only on rays that could reach the viewer.[36][37] The algorithm proceeds in a straightforward loop: for each pixel in the image, a ray is generated from the camera's eye point through the pixel's center on the virtual image plane; this ray is then tested for intersections with all objects in the scene, and the closest intersection point is selected to resolve depth and visibility, akin to hidden surface removal in depth buffering. If an intersection occurs, the pixel color is calculated using a local illumination model, such as the Phong shading equation, based on the surface normal, material properties, and direct lighting at that point—without considering indirect effects like shadows unless explicitly added via secondary rays. No recursion is involved, limiting the process to primary eye rays only.[38][39] The following pseudocode illustrates the core ray casting loop for a simple scene:
select center of projection (eye) and window on view plane;
for (each pixel in image) {
    determine ray direction from eye through pixel center;
    initialize closest distance to infinity;
    for (each object in scene) {
        compute intersection with ray;
        if (intersection exists and distance < closest distance) {
            update closest distance and record hit point, normal, and object;
        }
    }
    if (hit found) {
        compute local shading at hit point using direct lights;
        set pixel color to shaded color;
    } else {
        set pixel color to background color;
    }
}
This implementation highlights that the primary computational cost lies in exhaustive ray-object intersection tests, which scale linearly with scene complexity.[39] In early 3D video games, ray casting enabled efficient pseudo-3D rendering on limited hardware; for instance, Wolfenstein 3D (1992) utilized it to project rays from the player's viewpoint across a 2D grid-based map, stepping each ray through the grid with a Digital Differential Analyzer (DDA)-style algorithm and deriving wall distances and on-screen wall heights from the hit points, which simulates corridors and rooms without full polygonal geometry.[40] Despite its simplicity and speed, ray casting has inherent limitations, as it supports only local illumination effects and cannot inherently model phenomena like reflections, refractions, or global illumination, resulting in flat, non-realistic images without manual extensions.[41][38]
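
To make the grid-based variant concrete, here is a small Python sketch of DDA traversal over a 2D map. It is a common textbook formulation, not the code of any particular game; the map layout, the `cast_grid_ray` name, and the `"#"` wall marker are assumptions made for the example.

```python
import math

# Hypothetical 5x5 map: '#' is a wall cell, '.' is empty floor.
GRID = [
    "#####",
    "#...#",
    "#...#",
    "#...#",
    "#####",
]

def cast_grid_ray(x, y, angle, grid=GRID, max_steps=64):
    """Distance from (x, y) to the first wall cell along angle, or None."""
    dx, dy = math.cos(angle), math.sin(angle)
    cell_x, cell_y = int(x), int(y)
    # Ray length between successive vertical / horizontal grid lines.
    delta_x = abs(1.0 / dx) if dx != 0 else math.inf
    delta_y = abs(1.0 / dy) if dy != 0 else math.inf
    # Step direction and ray length to the first grid line on each axis.
    if dx < 0:
        step_x, side_x = -1, (x - cell_x) * delta_x
    else:
        step_x, side_x = 1, (cell_x + 1.0 - x) * delta_x
    if dy < 0:
        step_y, side_y = -1, (y - cell_y) * delta_y
    else:
        step_y, side_y = 1, (cell_y + 1.0 - y) * delta_y
    for _ in range(max_steps):
        # Cross whichever grid line the ray reaches first.
        if side_x < side_y:
            dist = side_x
            side_x += delta_x
            cell_x += step_x
        else:
            dist = side_y
            side_y += delta_y
            cell_y += step_y
        if not (0 <= cell_y < len(grid) and 0 <= cell_x < len(grid[0])):
            return None  # left the map without hitting a wall
        if grid[cell_y][cell_x] == "#":
            return dist
    return None
```

A renderer of this kind would call the function once per screen column and draw a wall slice whose height is inversely proportional to the returned distance.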

Recursive Ray Tracing

Recursive ray tracing, introduced by Turner Whitted in 1980, extends basic ray casting by recursively generating secondary rays to simulate global illumination effects such as reflections and refractions.[3] In this algorithm, a primary ray from the eye through each pixel intersects the scene, and at the hit point, the shading computation spawns additional rays for specular reflection, transmission (refraction), and shadows toward light sources.[12] This process builds a ray tree where each node represents an intersection, and branches correspond to secondary rays, enabling the rendering of mirrored surfaces and transparent materials that basic ray casting cannot handle.[4] The core of the algorithm lies in the recursive shading function, which combines local illumination with contributions from recursive ray traces. The intensity $I$ at an intersection point is computed as

$$I = I_a + \sum_{j=1}^{ls} k_d (N \cdot L_j) I_j + k_s S + k_t T,$$

where $I_a$ is ambient light, the sum accounts for diffuse contributions from $ls$ light sources with surface normal $N$, light direction $L_j$, and light intensity $I_j$, $k_s S$ is the specular reflection term with $S$ obtained by recursively tracing a reflected ray, and $k_t T$ is the transmission term with $T$ obtained from a refracted ray; the coefficients $k_d$, $k_s$, and $k_t$ control material properties.[3] Shadow rays are cast from the intersection to each light source to check visibility; if any object blocks the ray, that light's contribution is zeroed, so the point lies in shadow with respect to that light.[12] The reflected ray direction follows $\hat{R} = \hat{V} - 2 (\hat{N} \cdot \hat{V}) \hat{N}$, where $\hat{V}$ is the incident ray direction, and the refracted ray obeys Snell's law, ensuring physically plausible bending at interfaces.[3]
To prevent infinite recursion in scenes with highly reflective or refractive materials, termination conditions are enforced. Common limits include a maximum recursion depth of 5 to 10 bounces, beyond which rays are terminated with a default background color.[42] Additional conditions involve rays that miss all objects (yielding background illumination), total internal reflection during refraction (where the incident angle exceeds the critical angle, spawning a reflection ray instead), or rays at grazing angles where the cosine of the angle between the ray and surface normal falls below a small threshold, avoiding numerical instability and excessive computation.[43] These controls balance realism with computational feasibility, as the number of rays grows exponentially with depth: each intersection can spawn a reflection ray, a refraction ray, and one shadow ray per light source.[4]

An illustrative example is ray tracing a mirrored sphere illuminated by point lights. The primary ray hits the sphere, spawning a reflection ray that bounces off the surface toward the background or other objects, potentially intersecting a second surface and recursing further; shadow rays from the hit point to each light confirm direct visibility, while the recursion depth limits traces to avoid endless mirroring. This ray tree structure, which starts with one primary ray and branches into reflection, possible refraction (if transparent), and shadow rays, captures the sphere's reflective highlights and cast shadows accurately.[3]

In mimicking natural phenomena, recursive ray tracing excels at approximating specular highlights on glossy surfaces and simple caustics from refractive objects like glass lenses, where focused light patterns emerge from multiple refractions.[12] However, it remains a biased approximation, as it only recurses for perfect specular paths and uses local diffuse shading without global interreflections, limiting accuracy for soft shadows or diffuse caustics. Acceleration structures, such as bounding volume hierarchies, can optimize intersection tests across the recursive tree but are addressed separately.[44]
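
The recursion described in this section can be sketched compactly in Python. The example below is a simplified illustration under several assumptions: intensities are scalars rather than colors, the scene is a list of sphere dictionaries with hypothetical keys (`center`, `radius`, `ambient`, `kd`, `ks`), there is a single point light of unit intensity, and the refraction term $k_t T$ is omitted to keep the sketch short.

```python
import math

MAX_DEPTH = 5   # termination: maximum number of bounces
EPS = 1e-4      # offset that keeps secondary rays off their own surface

def add(a, b): return (a[0] + b[0], a[1] + b[1], a[2] + b[2])
def sub(a, b): return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def scale(a, s): return (a[0] * s, a[1] * s, a[2] * s)
def dot(a, b): return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def hit_sphere(origin, direction, center, radius):
    """Smallest ray parameter t > EPS at which the sphere is hit, or None."""
    oc = sub(origin, center)
    a = dot(direction, direction)
    b = 2.0 * dot(oc, direction)
    c = dot(oc, oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
        if t > EPS:
            return t
    return None

def nearest_hit(origin, direction, spheres):
    best = None
    for s in spheres:
        t = hit_sphere(origin, direction, s["center"], s["radius"])
        if t is not None and (best is None or t < best[0]):
            best = (t, s)
    return best

def trace(origin, direction, spheres, light_pos, depth=0, background=0.0):
    """Scalar intensity seen along a ray (direction must be unit length)."""
    if depth > MAX_DEPTH:
        return background
    hit = nearest_hit(origin, direction, spheres)
    if hit is None:
        return background
    t, s = hit
    p = add(origin, scale(direction, t))
    n = scale(sub(p, s["center"]), 1.0 / s["radius"])
    start = add(p, scale(n, EPS))
    # Shadow ray: the light contributes only if nothing blocks it.
    to_light = sub(light_pos, p)
    light_dist = math.sqrt(dot(to_light, to_light))
    l = scale(to_light, 1.0 / light_dist)
    blocker = nearest_hit(start, l, spheres)
    intensity = s["ambient"]
    if blocker is None or blocker[0] > light_dist:
        intensity += s["kd"] * max(dot(n, l), 0.0)
    # Reflection ray: recurse and weight the result by k_s.
    if s["ks"] > 0.0:
        r = sub(direction, scale(n, 2.0 * dot(direction, n)))
        intensity += s["ks"] * trace(start, r, spheres, light_pos,
                                     depth + 1, background)
    return intensity
```

Each call to `trace` corresponds to one node of the ray tree; the depth check and the miss case are the two termination conditions that this sketch implements.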

Advanced Rendering Techniques

Volume Ray Casting

Volume ray casting is a technique in computer graphics used to render three-dimensional volumetric data by casting rays through the volume and accumulating color and opacity contributions along each ray's path. Unlike surface ray tracing, which identifies discrete intersection points with geometric primitives to compute shading, volume ray casting treats the medium as continuous, integrating properties such as density and emission without explicit surface boundaries. This approach is particularly suited for visualizing semi-transparent or participating media where light is absorbed, scattered, or emitted throughout the volume.[45]

The core of volume ray casting is based on the volumetric rendering equation, which computes the final color $C$ of a ray as the integral of the product of transmittance $T(t)$, density $\sigma(t)$, and color $c(t)$ along the ray from $t_{\min}$ to $t_{\max}$:

$$C = \int_{t_{\min}}^{t_{\max}} T(t)\, \sigma(t)\, c(t) \, dt$$

Here, the transmittance $T(t) = \exp\left(-\int_{t_{\min}}^{t} \sigma(s) \, ds\right)$ accounts for the attenuation of light due to absorption and scattering up to parameter $t$. This formulation, derived from optical models of light propagation in participating media, enables realistic depiction of phenomena like fog or tissue by modeling how light interacts continuously within the volume.[46]

In practice, the continuous integral is discretized through ray marching, where the ray is advanced in steps through a voxel grid representing the sampled volumetric data. At each step, the density $\sigma(t)$ and color $c(t)$ are interpolated from the nearest voxels, and contributions are accumulated until the ray exits the volume or reaches full opacity, allowing early termination to improve efficiency. Sampling can use uniform step sizes for simplicity or adaptive sizes based on local density gradients to focus computation on high-variation regions, reducing artifacts while controlling computational cost. Acceleration structures, such as hierarchical bounding volumes, can further optimize traversal by skipping empty space.[45]

Compositing along the ray follows established alpha-blending models, typically performed front-to-back or back-to-front to combine samples. In front-to-back compositing, each sample's contribution is weighted by the current transmittance and added to the accumulated color, the transmittance is then multiplied by the sample's transparency, and marching stops when the transmittance falls below a threshold; back-to-front compositing reverses the order and applies the over operator from the farthest sample to the nearest. These methods, rooted in digital image compositing principles, ensure correct ordering of translucent layers within the volume.[47]

Applications of volume ray casting span medical imaging and computer-generated imagery (CGI). In medical visualization, it enables direct rendering of computed tomography (CT) or magnetic resonance imaging (MRI) datasets to display internal structures like organs or tumors without intermediate surface extraction, providing intuitive insights for diagnosis and surgical planning. In CGI, it simulates effects such as fire, smoke, or clouds by modeling density fields from simulations, contributing to realistic atmospheric and pyrotechnic scenes in films and games.[45]
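
As a rough illustration of the discretized integral, the following Python sketch performs front-to-back compositing with uniform steps and early termination. The callables `density` and `color` stand in for voxel lookups and are assumptions made for the example, as are the scalar color and the midpoint sampling; each segment's opacity is taken as $1 - e^{-\sigma \Delta t}$, which is exact for a piecewise-constant density.

```python
import math

def march_volume(density, color, t_min, t_max, step=0.01,
                 min_transmittance=1e-3):
    """Approximate C = integral of T(t) * sigma(t) * c(t) dt along one ray.

    density(t) and color(t) are hypothetical callables returning the
    medium's density and (scalar) color at ray parameter t.
    Returns (accumulated_color, remaining_transmittance).
    """
    accumulated = 0.0
    transmittance = 1.0
    t = t_min
    while t < t_max and transmittance > min_transmittance:
        dt = min(step, t_max - t)
        t_mid = t + 0.5 * dt                      # sample the segment midpoint
        alpha = 1.0 - math.exp(-density(t_mid) * dt)  # segment opacity
        accumulated += transmittance * alpha * color(t_mid)
        transmittance *= 1.0 - alpha              # light surviving the segment
        t += dt
    return accumulated, transmittance
```

For a homogeneous medium with constant density $\sigma$ and unit color over a ray segment of length $L$, the result can be checked against the closed form $C = 1 - e^{-\sigma L}$.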

Signed Distance Field Ray Marching

Signed distance fields (SDFs) provide a compact representation for implicit surfaces in computer graphics, defined as a function $ f(\vec{p}) $ that returns the signed distance from a point p\vec{p} to the nearest surface, where the value is positive outside the surface, negative inside, and zero on the surface itself.[48] This formulation enables efficient ray-surface intersection testing without requiring explicit polygonal meshes, making it suitable for procedural and complex geometries.[49] The ray marching algorithm advances a ray parametrically along its direction by a safe step size derived from the SDF, typically updating the parameter $ t $ as $ t += |f(\vec{P}(t))| $, where P(t)\vec{P}(t) is the point along the ray at distance $ t $ from the origin.[48] Intersections are detected by checking for a zero-crossing in the SDF value or when the step size falls below a small epsilon threshold, ensuring the ray does not overshoot the surface due to the distance guarantee. This process, often termed sphere tracing, guarantees convergence to the surface in a finite number of steps proportional to the distance traveled.[48] Surface normals at intersection points are computed analytically as the normalized gradient of the SDF, given by N=f(p)/f(p)\vec{N} = \nabla f(\vec{p}) / \|\nabla f(\vec{p})\|, which provides a Lipschitz-continuous estimate suitable for shading without additional finite differencing in many cases.[49] This gradient-based normal facilitates direct integration with ray tracing pipelines for lighting calculations. SDF ray marching finds applications in procedural geometry generation for video games, such as the deformable clay environments in Claybook, where compute shaders perform real-time simulations and rendering of dynamic implicit surfaces.[50] It is also widely used in real-time demos to visualize intricate mathematical surfaces, like fractals or blended primitives, leveraging GPU acceleration for interactive frame rates. 
A key advantage for rendering complex shapes lies in the analytical combination of SDFs via constructive solid geometry (CSG) operations, such as union ($ f_1 \cup f_2 = \min(f_1, f_2) $) and intersection ($ f_1 \cap f_2 = \max(f_1, f_2) $), which preserve the distance field properties without numerical approximation or mesh regeneration.[49] This enables efficient modeling of hierarchical or blended structures, such as organic forms or architectural elements, directly in the shader code.[51]
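The two CSG operators translate almost directly into code. The following Python sketch uses closures for the fields; the helper names and the two overlapping spheres are hypothetical and serve only to show how the sign of the combined field classifies a point.

```python
import math

def sphere(center, radius):
    """SDF of a sphere: negative inside, zero on the surface, positive outside."""
    return lambda p: math.dist(p, center) - radius

def union(f1, f2):
    """Inside the result wherever either operand is inside."""
    return lambda p: min(f1(p), f2(p))

def intersection(f1, f2):
    """Inside the result only where both operands are inside."""
    return lambda p: max(f1(p), f2(p))

a = sphere((0.0, 0.0, 0.0), 1.0)
b = sphere((1.0, 0.0, 0.0), 1.0)
blob = union(a, b)           # two overlapping spheres merged
lens = intersection(a, b)    # only the overlapping region

p = (-0.5, 0.0, 0.0)         # inside a, outside b
inside_blob = blob(p) < 0.0
inside_lens = lens(p) < 0.0
```

Because the combined field is still an ordinary function of position, it can be passed unchanged to a ray marching loop.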

Path Tracing Extensions

Path tracing extends traditional ray tracing by incorporating unbiased Monte Carlo methods to simulate global illumination, effectively handling diffuse interreflections and caustics that deterministic recursive ray tracing struggles with due to its specular-only focus.[52] In Monte Carlo path tracing, light paths are generated by recursively tracing rays from the camera through the scene, sampling random directions at each surface interaction according to the bidirectional reflectance distribution function (BRDF), and averaging the radiance contributions from numerous such paths per pixel to approximate the true illumination.[52] This stochastic approach solves the rendering equation, which describes outgoing radiance $ L_o $ from a point $ \vec{p} $ in direction $ \vec{\omega}_o $ as the sum of emitted radiance $ L_e $ and the integral over the hemisphere $ \Omega $ of incoming radiance $ L_i $ modulated by the BRDF $ f_r $ and the cosine term:
$$ L_o(\vec{p}, \vec{\omega}_o) = L_e(\vec{p}, \vec{\omega}_o) + \int_{\Omega} f_r(\vec{p}, \vec{\omega}_i, \vec{\omega}_o) \, L_i(\vec{p}, \vec{\omega}_i) \, (\vec{n} \cdot \vec{\omega}_i) \, d\vec{\omega}_i $$
[52] To mitigate high variance in these estimates, importance sampling is employed: directions or light sources are sampled in proportion to their expected contribution, for example by sampling BRDF lobes for specular materials or by sampling light sources directly, which reduces the number of paths needed for convergence.[53] Path termination is handled via Russian roulette, a probabilistic technique that, typically after a minimum number of bounces, lets a path continue only with a survival probability that is usually tied to its throughput, so that low-throughput paths are terminated more often. Surviving paths have their weight divided by the survival probability, which keeps the estimator unbiased while concentrating effort on shorter, more contributory paths.[53] The resulting images exhibit significant noise due to the finite number of samples, which is commonly addressed through post-process denoising filters that exploit spatial or temporal correlations in the render passes to reconstruct clean outputs while approximately preserving the physically based result.
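The effect of importance sampling on the reflected term of the rendering equation can be seen in a small Python sketch. It assumes a Lambertian BRDF $ f_r = \rho / \pi $ and a hypothetical incoming radiance that depends only on the angle to the normal, so only $ \cos\theta $ has to be sampled; under these assumptions the exact answer is $ 5\rho/3 $. None of the names below come from the cited sources.

```python
import math
import random

ALBEDO = 0.6                                   # Lambertian BRDF: f_r = ALBEDO / pi

def incoming_radiance(cos_theta):
    """Hypothetical sky that is brighter toward the surface normal."""
    return 1.0 + cos_theta

def estimate(n, rng, cosine_weighted):
    """Average n single-sample estimates f_r * L_i * cos(theta) / pdf."""
    brdf = ALBEDO / math.pi
    total = 0.0
    for _ in range(n):
        u = 1.0 - rng.random()                 # in (0, 1], avoids a zero pdf
        if cosine_weighted:
            cos_theta = math.sqrt(u)           # pdf(omega) = cos(theta) / pi
            pdf = cos_theta / math.pi
        else:
            cos_theta = u                      # uniform hemisphere: pdf = 1 / (2 pi)
            pdf = 1.0 / (2.0 * math.pi)
        total += brdf * incoming_radiance(cos_theta) * cos_theta / pdf
    return total / n

rng = random.Random(42)
exact = 5.0 * ALBEDO / 3.0
uniform_estimate = estimate(20_000, rng, cosine_weighted=False)
cosine_estimate = estimate(20_000, rng, cosine_weighted=True)
```

Both estimators converge to the same value, but the cosine-weighted one cancels the cosine factor against the sampling density, so each sample varies less and fewer samples are needed for the same noise level.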

Distributed Ray Tracing

Distributed ray tracing is an advanced extension of ray tracing that incorporates stochastic sampling to render effects involving spatial, temporal, or directional uncertainty, such as soft shadows, glossy reflections, depth of field, and motion blur. Introduced in a seminal 1984 SIGGRAPH paper by Robert L. Cook, Thomas Porter, and Loren Carpenter at Lucasfilm Ltd., the technique uses Monte Carlo integration to distribute ray samples across probability distributions rather than tracing single deterministic rays, enabling the simulation of fuzzy phenomena without additional computational overhead beyond supersampling.[6][54] In traditional ray tracing, rays are cast precisely to compute sharp intersections and reflections, but distributed ray tracing treats parameters like light source positions, lens apertures, or time intervals as distributions, generating multiple rays per pixel or interaction and averaging their contributions to approximate the integral of radiance over these domains. For example, soft shadows are achieved by sampling points across an area light source, while depth of field is simulated by varying ray origins within a lens model, and motion blur by offsetting rays temporally. This approach leverages the same ray tracing framework but repurposes oversampling for integration, making it efficient for photorealistic rendering.[6] Mathematically, it extends the rendering equation by integrating over additional dimensions, such as the light source area $ A $ for shadows: the intensity at a point is the average of contributions from rays to sampled points on $ A $, approximating $ \frac{1}{|A|} \int_A I(\vec{l}) \, dA $, where $ I(\vec{l}) $ is the radiance from direction $ \vec{l} $.
This stochastic method reduces aliasing and artifacts inherent in deterministic sampling, serving as a foundational precursor to modern path tracing by introducing unbiased Monte Carlo techniques for global illumination effects.[6][54] The historical significance of distributed ray tracing lies in its demonstration of high-fidelity images that captured the graphics community's attention, influencing subsequent developments in computer-generated imagery for film and animation at Pixar and beyond. Applications include realistic lighting in movies and the foundational algorithms in rendering software.[6]
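The soft-shadow case described in this section can be sketched in Python by averaging shadow rays over an area light. The scene below (a square light at a fixed height, one spherical occluder, jittered sampling with one sample per grid cell) and all names are assumptions made for illustration; the sketch computes only the visible fraction of the light and ignores falloff and the cosine terms.

```python
import math
import random

LIGHT_Y = 4.0   # height of a square area light spanning x, z in [-1, 1]

def segment_hits_sphere(a, b, center, radius):
    """True if the segment from a to b passes within radius of center."""
    ab = [b[k] - a[k] for k in range(3)]
    ac = [center[k] - a[k] for k in range(3)]
    s = sum(ab[k] * ac[k] for k in range(3)) / sum(x * x for x in ab)
    s = max(0.0, min(1.0, s))                     # clamp to the segment
    closest = [a[k] + s * ab[k] for k in range(3)]
    return math.dist(closest, center) < radius

def light_visibility(point, occluder, n=16, rng=None):
    """Fraction of jittered light samples visible from point (0 = umbra, 1 = lit)."""
    rng = rng or random.Random(0)
    center, radius = occluder
    visible = 0
    for i in range(n):
        for j in range(n):
            # One jittered sample per grid cell on the light's surface.
            x = -1.0 + 2.0 * (i + rng.random()) / n
            z = -1.0 + 2.0 * (j + rng.random()) / n
            if not segment_hits_sphere(point, (x, LIGHT_Y, z), center, radius):
                visible += 1
    return visible / (n * n)

occluder = ((0.0, 2.0, 0.0), 0.5)
penumbra = light_visibility((0.0, 0.0, 0.0), occluder)   # partly shadowed
lit = light_visibility((5.0, 0.0, 0.0), occluder)        # unobstructed
```

A point light would give only the values 0 or 1; the fractional result under the occluder is what produces the gradual penumbra.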

Optimization Methods

Acceleration Structures

Acceleration structures in ray tracing are spatial data structures designed to efficiently determine which objects a ray intersects, thereby reducing the computational cost from testing against all scene objects to a roughly logarithmic number of tests per ray.[55] These structures exploit spatial locality and ray coherence to traverse the scene hierarchy, skipping tests against objects that the ray cannot possibly intersect and thereby avoiding large portions of empty space.[56]
The bounding volume hierarchy (BVH) is a widely adopted acceleration structure, commonly a binary tree in which each node stores an axis-aligned bounding box (AABB) enclosing a subset of the scene's geometry.[57] Leaf nodes contain individual primitives, while internal nodes bound the union of their children's volumes, allowing rays to traverse the tree by quickly rejecting non-intersecting branches and skipping empty nodes.[55] BVH construction typically employs the surface area heuristic (SAH), which estimates traversal cost by assuming that the probability of a ray hitting a node is proportional to its surface area, and chooses splits that minimize the expected ray tracing time.[58] Traversal algorithms for BVHs often use an explicit stack in place of recursion, test each node with a fast ray-AABB intersection such as the slab method, and visit children in front-to-back order so that a near hit can cull farther subtrees.[59]
Alternative structures include kd-trees, which partition 3D space with axis-aligned planes to create non-overlapping regions, enabling rays to visit only relevant subvolumes during traversal.[60] Kd-trees are particularly effective for scenes with varying object densities, as they adaptively subdivide space based on object distribution, often using SAH for split decisions similar to BVHs.[61] Uniform grids divide the scene into a fixed-resolution 3D lattice of voxels, each potentially containing multiple primitives, and are suited for uniformly distributed geometry where rays can be stepped through cells using a 3D digital differential analyzer (DDA) algorithm.[62]
For dynamic scenes involving animation or deformation, BVHs support updates through periodic rebuilding, which reconstructs the entire hierarchy, or refitting, which recomputes bounding boxes from the leaves up without altering tree topology.[57] Refitting is fast, but tree quality may degrade over time as geometry deforms, necessitating hybrid approaches that combine incremental updates with occasional rebuilds for deformable elements.[63]
Backward ray tracing, where primary rays originate from the eye point toward the scene, enhances coherence in acceleration structure traversal because adjacent pixels generate rays with similar directions and origins, allowing bundled processing and fewer cache misses compared to forward tracing from light sources.[16] This directionality contributes to overall performance gains in primary visibility computations.[64]
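The BVH idea can be made concrete with a compact Python sketch. It is deliberately simplified and rests on several assumptions that differ from production builders: primitives are spheres, splits use the median along the longest axis rather than the surface area heuristic, and children are visited in arbitrary rather than front-to-back order. All names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    lo: tuple                       # AABB minimum corner
    hi: tuple                       # AABB maximum corner
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    prims: Optional[list] = None    # spheres stored in a leaf, else None

def build(spheres, leaf_size=2):
    """Median-split BVH over spheres given as ((cx, cy, cz), radius)."""
    lo = tuple(min(c[k] - r for c, r in spheres) for k in range(3))
    hi = tuple(max(c[k] + r for c, r in spheres) for k in range(3))
    if len(spheres) <= leaf_size:
        return Node(lo, hi, prims=list(spheres))
    axis = max(range(3), key=lambda k: hi[k] - lo[k])    # split the longest axis
    ordered = sorted(spheres, key=lambda s: s[0][axis])
    mid = len(ordered) // 2
    return Node(lo, hi, build(ordered[:mid], leaf_size), build(ordered[mid:], leaf_size))

def hits_box(o, d, lo, hi):
    """Slab test: intersect the ray's parameter intervals for the three axes."""
    t_near, t_far = 0.0, math.inf
    for k in range(3):
        if d[k] == 0.0:                       # ray parallel to this slab
            if not lo[k] <= o[k] <= hi[k]:
                return False
            continue
        t0, t1 = (lo[k] - o[k]) / d[k], (hi[k] - o[k]) / d[k]
        t_near, t_far = max(t_near, min(t0, t1)), min(t_far, max(t0, t1))
    return t_near <= t_far

def hit_sphere(o, d, sphere):
    """Nearest non-negative hit distance along a unit-length direction, or None."""
    c, r = sphere
    oc = [o[k] - c[k] for k in range(3)]
    b = sum(oc[k] * d[k] for k in range(3))
    disc = b * b - (sum(x * x for x in oc) - r * r)
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):
        if t >= 0.0:
            return t
    return None

def closest_hit(root, o, d):
    """Stack-based traversal that skips every subtree whose box the ray misses."""
    best, stack = None, [root]
    while stack:
        node = stack.pop()
        if not hits_box(o, d, node.lo, node.hi):
            continue
        if node.prims is not None:
            for s in node.prims:
                t = hit_sphere(o, d, s)
                if t is not None and (best is None or t < best):
                    best = t
        else:
            stack.extend((node.left, node.right))
    return best

scene = [((0.0, 0.0, 5.0), 1.0), ((3.0, 0.0, 8.0), 1.0), ((-3.0, 1.0, 12.0), 2.0)]
bvh = build(scene)
t = closest_hit(bvh, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))   # hits the first sphere at t = 4
```

The traversal returns the same nearest hit as testing every sphere, while a ray that misses a node's box never touches the primitives beneath it.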

Adaptive Sampling and Depth Control

Adaptive depth control in ray tracing dynamically limits the recursion depth of ray trees by terminating secondary rays when their estimated contribution to the final pixel intensity falls below a predefined threshold, such as when the reflected or transmitted intensity is sufficiently low. This technique prevents unnecessary computation for rays that contribute negligibly to the overall image, thereby improving efficiency while maintaining rendering quality. The approach was originally proposed by Hall and Greenberg as a method to trace ray trees only to depths sufficient for significant contributions, avoiding fixed-depth limitations in early ray tracing implementations. For instance, before spawning a reflected ray, the renderer evaluates the material's reflectance coefficient multiplied by the incident radiance; if this value is below the threshold, recursion stops, reducing average ray tree depths from a maximum of 15 to around 1.7 in typical scenes.[56] Importance sampling in path tracing enhances efficiency by weighting the generation of rays according to their estimated contribution to the pixel's radiance, directing more computational effort toward paths likely to carry significant energy. Rather than uniform random sampling, rays are sampled from probability distributions proportional to the bidirectional scattering distribution function (BSDF) or light source geometry, reducing variance in the Monte Carlo estimator. This method, integral to modern global illumination, was advanced in the context of photon mapping and path tracing by Lafortune and Willems, who combined photon map estimates with local reflection models to guide sampling.[65] In practice, for a glossy surface, importance sampling prioritizes directions aligned with the specular lobe, ensuring fewer samples are needed to converge on accurate highlights compared to uniform sampling. 
Spatial adaptivity addresses variance in image regions by allocating higher sampling densities to areas with high-frequency details, such as edges, shadows, or caustics, where noise is more perceptible. Techniques like stratified sampling divide the pixel domain into strata to ensure even coverage, while adaptive quadrature refines sampling in high-variance cells based on initial low-resolution passes. A seminal multidimensional approach by Hachisuka et al. extends this to the full sample domain of path tracing, using error estimates to adaptively place samples across spatial, directional, and temporal dimensions, achieving up to 5x variance reduction in complex scenes without bias.[66] This is particularly effective for real-time applications, where initial samples guide refinement in edge-adjacent pixels, balancing quality and performance.
Firefly suppression mitigates bright artifacts in path tracing, known as fireflies, which arise from rare high-energy paths that are sampled with low probability and therefore carry very large weights. A common method clamps sample contributions to a maximum intensity threshold before accumulation, preventing individual paths from dominating the average; clamping is biased, because it removes energy from the image, but the loss is often visually acceptable. More advanced reweighting techniques, as proposed by Bitterli et al., adjust firefly weights based on their probability under the sampling distribution, preserving unbiasedness while reducing variance by up to 50% in caustic-heavy scenes.[67] These artifacts are especially prevalent in specular or transmissive materials, and suppression yields more temporally stable renders.
An illustrative example of depth control is Russian roulette, a probabilistic termination strategy that adaptively ends paths based on their throughput, avoiding bias by compensating surviving paths with adjusted weights. 
Introduced by Arvo and Kirk in the framework of particle transport for image synthesis, it draws a random number at each bounce; if the number exceeds the survival probability (often tied to throughput), the path terminates, and otherwise the path continues with its contribution scaled by the inverse of the survival probability to maintain the estimator's expectation.[68] This method handles the unbounded path lengths of path tracing with a finite expected amount of work, and adaptive probabilities (e.g., lowering survival for low-throughput paths) reduce average path lengths by 20-30% in diffuse scenes while controlling variance.
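That Russian roulette leaves the expected value unchanged can be checked numerically with a toy model in Python. The model is an assumption made for illustration: every path vertex emits the same radiance and reflects a fixed albedo, so the exact result is the geometric series $ L_e / (1 - \rho) $. The survival probability is a fixed parameter here rather than one derived from throughput, and all names are hypothetical.

```python
import random

def radiance_sample(albedo, emitted, survival, rng, min_bounces=3):
    """One path sample of emitted * (1 + albedo + albedo**2 + ...)."""
    assert 0.0 < survival < 1.0
    throughput, total, bounce = 1.0, 0.0, 0
    while True:
        total += throughput * emitted      # light picked up at this vertex
        throughput *= albedo               # energy lost at the bounce
        bounce += 1
        if bounce >= min_bounces:
            if rng.random() >= survival:   # random number exceeds survival: terminate
                return total
            throughput /= survival         # compensate the surviving path

rng = random.Random(1)
n = 20_000
estimate = sum(radiance_sample(0.5, 1.0, 0.5, rng) for _ in range(n)) / n
exact = 1.0 / (1.0 - 0.5)                  # closed form of the geometric series
```

Every path ends after a finite number of bounces, yet the average over many paths approaches the value of the infinite series because each surviving path is weighted up by the inverse of its survival probability.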

Implementation Aspects

Computational Complexity

In the naive ray tracing algorithm, each ray must be intersected with every primitive in the scene, yielding a time complexity of $ O(n) $ per ray, where $ n $ is the number of primitives. For an image requiring $ m $ primary rays—typically one per pixel—the total computational cost reaches $ O(m n) $, which becomes prohibitive for complex scenes with millions of primitives. This linear scaling severely limits applicability without spatial partitioning. Acceleration structures such as bounding volume hierarchies (BVH) mitigate this by organizing primitives hierarchically, reducing the average intersection complexity to $ O(\log n) $ per ray through efficient pruning of non-intersecting branches. Consequently, the total time complexity for primary rays drops to $ O(m \log n) $, enabling practical rendering for moderately complex scenes. However, recursive secondary rays—for reflections, refractions, and shadows—can multiply the ray count by factors of 10 or more per pixel, exacerbating the overall cost beyond this bound and demanding further optimizations like adaptive sampling. Memory requirements for ray tracing are dominated by scene storage, which scales as $ O(n) $ to hold primitive geometries, materials, and textures. BVH structures add roughly $ O(n) $ space overhead for tree nodes, with practical implementations in real-time applications consuming 1–2 GB of additional VRAM for acceleration data and ray buffers in typical game scenes. In resource-constrained environments, techniques like BVH compaction help bound this growth. Real-time ray tracing at 1080p resolution (1,920 × 1,080 pixels) and 60 FPS demands processing approximately $ 1.24 \times 10^8 $ primary rays per second to maintain smooth interactivity. Accounting for secondary rays and anti-aliasing samples, effective throughput must reach several giga-rays per second, a threshold met by modern hardware accelerators but highlighting the algorithm's sensitivity to ray volume. 
Asymptotically, ray tracing scales poorly with scene complexity in its unoptimized form because the number of intersection tests grows with the product of $ m $ and $ n $, rendering it infeasible for large-scale environments without hierarchical culling. Even BVH-accelerated variants, while logarithmic per ray on average, suffer from compounded costs in incoherent ray bundles and deep recursion, where traversal depth and cache misses raise the effective runtime above what the $ O(m \log n) $ estimate suggests.

Software and Hardware Architectures

Ray tracing implementations rely on specialized software libraries and kernels written in languages such as C++ for CPU-based traversal and CUDA for GPU acceleration, enabling efficient ray-scene intersection computations.[69] A prominent example is Intel's Embree library, introduced in 2011 and continually updated, which provides high-performance ray tracing kernels optimized for x86 and ARM CPUs, focusing on bounding volume hierarchy (BVH) traversal and intersection testing to achieve professional-grade rendering speeds.[70] Embree supports platforms including Linux, macOS, and Windows, and is released under the Apache 2.0 license, allowing integration into applications for both offline and interactive rendering.[69] Application programming interfaces (APIs) standardize ray tracing across hardware, with Microsoft's DirectX Raytracing (DXR), launched in 2018 as part of DirectX 12, providing a peer to rasterization and compute pipelines for dispatching rays, building acceleration structures, and handling shader execution.[71] The Khronos Group's Vulkan Ray Tracing extensions, finalized in 2020, enable similar functionality through ray tracing pipelines and ray queries, supporting hardware-accelerated traversal on compatible GPUs while maintaining cross-platform compatibility.[72] Scene descriptions for ray tracing often use the glTF 2.0 format from Khronos, an API-neutral asset delivery standard that encapsulates geometry, materials, and textures in a compact JSON-based structure, facilitating efficient loading and rendering in both ray tracing and rasterization pipelines.[73] Dedicated hardware architectures accelerate ray tracing by incorporating fixed-function units for BVH traversal and ray-triangle intersections, reducing computational overhead compared to software-only approaches. 
NVIDIA introduced RT Cores with the Turing architecture in 2018, featuring specialized hardware for ray-bounding-box tests and ray-triangle intersections. Subsequent generations, including Ampere, Ada Lovelace, and Blackwell (2024-2025), increased throughput; Blackwell's fourth-generation RT Cores support up to 2x faster ray tracing for complex scenes through improved BVH handling and neural integration.[74][75] AMD's RDNA 2 architecture, released in 2020, added Ray Accelerators as dedicated units for ray-box and ray-triangle computations, integrated into each compute unit; later iterations in RDNA 3 and RDNA 4 (2025) refine these, with overall performance gains of up to 40% and over 2x ray tracing throughput from redesigned accelerators and larger memory pools.[76][34]
In real-time applications, hybrid pipelines combine rasterization for primary visibility with ray tracing for secondary effects like reflections and shadows, and rely on denoising algorithms to produce high-fidelity images at interactive frame rates.[77] This approach leverages rasterization's speed for base geometry while using ray queries for targeted tracing, as seen in engines integrating DXR or Vulkan RT to balance performance and quality.[78]
By 2025, advancements include unified memory architectures in GPUs, such as those in AMD's RDNA 4 with 16GB integrated pools, which minimize data transfer latency between CPU and GPU during BVH builds and ray dispatches.[34] AI-accelerated ray tracing has emerged in cloud rendering platforms, employing neural networks for denoising and upsampling; NVIDIA's Blackwell architecture integrates these for real-time path tracing in services like Omniverse, reducing noise in traced scenes by up to 50% while supporting remote workflows.[79][75]

Advantages and Limitations

Rendering Quality Benefits

Ray tracing significantly enhances rendering quality by accurately simulating global illumination, which produces natural-looking shadows, reflections, and refractions without the aliasing or banding artifacts prevalent in traditional rasterization techniques. This capability stems from recursively tracing rays from the camera through scene intersections to light sources and other surfaces, allowing light interactions to be computed directly rather than approximated through heuristics. Introduced in Whitted's seminal 1980 algorithm, this method revolutionized shaded display by incorporating true specular reflections, shadows, and refractions in a unified framework, far surpassing the limitations of earlier local illumination models.[12] The foundation for these benefits lies in the rendering equation, formalized by Kajiya in 1986, which describes light transport as an integral over incoming radiance weighted by surface properties and geometry; ray tracing provides a practical means to approximate this equation, enabling physically plausible results across diverse scenes.[52]
Ray tracing also achieves superior material fidelity by supporting complex bidirectional reflectance distribution functions (BRDFs), which model how light scatters off surfaces with varying roughness, metallicity, and transmittance. For instance, it naturally renders the sharp specular highlights of polished metals, the chromatic dispersion in glass, and the soft diffusion in subsurface scattering materials like skin or marble, by evaluating the BRDF at each ray-surface intersection without restricting to simple Lambertian or Phong models. 
A key advantage is the generation of view-independent effects, where lighting remains consistent regardless of camera position, unlike precomputed baked lighting in interactive applications that can produce inconsistencies during movement.[80] Post-1980s developments in ray tracing further outpaced scanline rendering by inherently supporting distributed global effects like caustics—concentrated light patterns from refraction or reflection—through stochastic sampling of ray paths, avoiding the need for specialized approximations.[12] Quantitatively, Monte Carlo-based variants such as path tracing serve as unbiased estimators of the rendering equation, converging to the exact physically accurate radiance with increasing sample counts, though noise reduces as the inverse square root of samples per pixel.[52]
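The convergence rate mentioned above can be written out explicitly. As a sketch, assume $ N $ independent sample directions $ \vec{\omega}_k $ per pixel, drawn from a sampling density written here as $ \mathrm{pdf}(\vec{\omega}_k) $ (a symbol introduced for illustration), and let $ \sigma_1 $ denote the standard deviation of a single-sample estimate. The Monte Carlo estimator of the reflected term of the rendering equation and its standard deviation are then:

$$ \hat{L}_N = \frac{1}{N} \sum_{k=1}^{N} \frac{f_r(\vec{p}, \vec{\omega}_k, \vec{\omega}_o) \, L_i(\vec{p}, \vec{\omega}_k) \, (\vec{n} \cdot \vec{\omega}_k)}{\mathrm{pdf}(\vec{\omega}_k)}, \qquad \sigma[\hat{L}_N] = \frac{\sigma_1}{\sqrt{N}} $$

Halving the noise therefore requires four times as many samples: going from 16 to 64 samples per pixel halves $ \sigma[\hat{L}_N] $, and a further halving requires 256.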

Performance Challenges

One of the primary performance challenges in ray tracing arises from the handling of secondary rays generated through recursion for effects like reflections and refractions. In the recursive model, each primary ray from the camera can spawn additional rays upon hitting a surface, potentially leading to an exponential increase in the number of rays as recursion depth grows, which significantly elevates computational demands.[12] This multiplication is particularly pronounced in path tracing variants, where Monte Carlo sampling is used to approximate global illumination, resulting in noisy images due to variance in the stochastic estimates unless mitigated by extensive sampling.[81] Scalability issues further compound these costs in complex scenes, where the sheer volume of geometry and interactions requires substantial computational resources, often rendering real-time performance unattainable without specialized hardware acceleration. For instance, path tracing massive scenes with billions of primitives demands efficient memory distribution across multiple GPUs to avoid bottlenecks, yet even optimized implementations struggle with full-scene replication overheads exceeding hundreds of gigabytes.[82] Frequent traversals through acceleration structures, such as bounding volume hierarchies, impose heavy demands on memory bandwidth in GPU implementations, as incoherent ray patterns lead to inefficient cache utilization and increased off-chip accesses. This strain is exacerbated by the need for repeated data fetches during ray-geometry intersections, limiting overall throughput on parallel architectures.[83][84] Despite advances, ray tracing remains prone to artifacts like fireflies—isolated bright noise spikes from rare, high-contribution samples in Monte Carlo integration—which persist even after denoising and degrade image quality. 
Approximations to reduce computation can also introduce bias, systematically skewing results away from physically accurate solutions.[85] In performance comparisons, ray tracing is generally slower than rasterization for scenes dominated by diffuse interreflections, where incoherent secondary rays amplify traversal costs, whereas it offers advantages in specular scenarios due to more coherent ray bundles that align better with acceleration structures.[86]
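The exponential growth of secondary rays described in this section can be quantified with a short Python sketch. It assumes a worst case in which every ray hits a surface that spawns both a reflected and a refracted ray, and optionally one shadow ray per light at each hit; the function name and parameters are hypothetical.

```python
def rays_per_primary(depth, branching=2, lights=0):
    """Worst-case ray count for one primary ray in a full recursive ray tree."""
    # Level k of the tree holds branching**k rays; level 0 is the primary ray.
    tree_rays = sum(branching ** level for level in range(depth + 1))
    shadow_rays = tree_rays * lights       # one shadow ray per light at every hit
    return tree_rays + shadow_rays

no_recursion = rays_per_primary(0)             # 1
depth_five = rays_per_primary(5)               # 2**6 - 1 = 63
with_lights = rays_per_primary(5, lights=2)    # 63 * (1 + 2) = 189
```

Each additional level of recursion roughly doubles the work in this worst case, which is why practical renderers bound the depth or terminate low-contribution rays early.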

References
