Volume ray casting

from Wikipedia

Volume ray casting, sometimes called volumetric ray casting, volumetric ray tracing, or volume ray marching, is an image-based volume rendering technique. It computes 2D images from 3D volumetric data sets (3D scalar fields). Volume ray casting, which processes volume data, must not be confused with ray casting in the sense used in ray tracing, which processes surface data. In the volumetric variant, the computation does not stop at the surface but "pushes through" the object, sampling it along the ray. Unlike ray tracing, volume ray casting does not spawn secondary rays.[1] When the context or application is clear, some authors call it simply ray casting.[1][2] Because ray marching does not necessarily require an exact solution to ray intersections and collisions, it is suitable for real-time computing in many applications for which ray tracing is unsuitable.

Classification


The technique of volume ray casting can be derived directly from the rendering equation. It yields rendering results of very high quality. Volume ray casting is classified as an image-based volume rendering technique, as the computation emanates from the output image and not the input volume data, as is the case with object-based techniques.

Basic algorithm

The four basic steps of volume ray casting: (1) Ray Casting (2) Sampling (3) Shading (4) Compositing.

In its basic form, the volume ray casting algorithm comprises four steps:

  1. Ray casting. For each pixel of the final image, a ray of sight is shot ("cast") through the volume. At this stage it is useful to consider the volume being touched and enclosed within a bounding primitive, a simple geometric object — usually a cuboid — that is used to intersect the ray of sight and the volume.
  2. Sampling. Along the part of the ray of sight that lies within the volume, equidistant sampling points or samples are selected. In general, the volume is not aligned with the ray of sight, and sampling points will usually be located in between voxels. Because of that, it is necessary to interpolate the values of the samples from their surrounding voxels (commonly using trilinear interpolation).
  3. Shading. For each sampling point, a transfer function retrieves an RGBA material colour, and a gradient of illumination values is computed. The gradient represents the orientation of local surfaces within the volume. The samples are then shaded (i.e. coloured and lit) according to their surface orientation and the location of the light source in the scene.
  4. Compositing. After all sampling points have been shaded, they are composited along the ray of sight, resulting in the final colour value for the pixel that is currently being processed. The composition is derived directly from the rendering equation and is similar to blending acetate sheets on an overhead projector.
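As a concrete illustration, the four steps above can be sketched in Python. The transfer function and the nearest-neighbour lookup below are simplified placeholders (real implementations use trilinear interpolation and carefully designed transfer functions), so this is a sketch of the pipeline shape rather than a reference renderer:

```python
import numpy as np

def transfer_function(s):
    # Toy transfer function: map a scalar in [0, 1] to a gray colour
    # and a proportional opacity (placeholder, not a standard model).
    return np.array([s, s, s]), 0.1 * s

def sample_nearest(volume, p):
    # Nearest-neighbour lookup; real renderers use trilinear interpolation.
    i = np.clip(np.round(p).astype(int), 0, np.array(volume.shape) - 1)
    return volume[i[0], i[1], i[2]]

def cast_ray(volume, origin, direction, t_min, t_max, dt=0.5):
    color, alpha = np.zeros(3), 0.0
    t = t_min
    while t < t_max and alpha < 0.99:        # stop when nearly opaque
        p = origin + t * direction           # step 2: sample position
        s = sample_nearest(volume, p)        # interpolate a scalar value
        rgb, a = transfer_function(s)        # step 3: classify and shade
        color += (1.0 - alpha) * a * rgb     # step 4: front-to-back compositing
        alpha += (1.0 - alpha) * a
        t += dt
    return color
```

One such ray is cast per output pixel (step 1), with `t_min` and `t_max` coming from the intersection of the ray with the volume's bounding primitive.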

Optimizations

  1. Compositing may work back-to-front, i.e. computation starts with the sample farthest from the viewer and ends with the one nearest to the viewer. This workflow direction ensures that masked parts of the volume do not affect the resulting pixel.
  2. Front-to-back order can be more computationally efficient: the residual ray energy decreases as the ray travels away from the camera, so the contribution to the rendering integral diminishes, and a more aggressive speed/quality compromise may be applied (increasing the distance between samples along the ray is one such speed/quality trade-off).

Advanced adaptive algorithms


The adaptive sampling strategy dramatically reduces the rendering time for high-quality rendering – the higher the quality and/or size of the data-set, the more significant the advantage over the regular/even sampling strategy.[1] However, adaptive ray casting upon a projection plane and adaptive sampling along each individual ray do not map well to the SIMD architecture of modern GPUs. Multi-core CPUs, however, are a perfect fit for this technique, making them suitable for interactive ultra-high-quality volumetric rendering.

Examples of high quality volumetric ray casting

Crocodile mummy provided by the Phoebe A. Hearst Museum of Anthropology, UC Berkeley. CT data was acquired by Dr. Rebecca Fahrig, Department of Radiology, Stanford University, using a Siemens SOMATOM Definition, Siemens Healthcare. The image was rendered by Fovia's High Definition Volume Rendering® engine

This gallery presents a collection of images rendered using high-quality volume ray casting. The crisp appearance of volume ray casting images commonly distinguishes them from the output of texture-mapping volume rendering, owing to the higher accuracy of volume ray casting renderings.

The CT scan of the crocodile mummy has a resolution of 3000×512×512 (16-bit); the skull data-set has a resolution of 512×512×750 (16-bit).

Ray marching


The term ray marching is broader and refers to methods in which simulated rays are traversed iteratively, effectively dividing each ray into smaller ray segments and sampling some function at each step. These methods are often used in cases where creating explicit geometry, such as triangles, is not a good option.

Visualization of SDF ray marching algorithm

Other examples of ray marching

  • In SDF ray marching, or sphere tracing,[3] an intersection point is approximated between the ray and a surface defined by a signed distance function (SDF). The SDF is evaluated at each iteration in order to take steps as large as possible without missing any part of the surface. A threshold is used to cancel further iteration once a point is reached that is close enough to the surface. This method is often used for 3D fractal rendering.[4]
  • When rendering screen space effects, such as screen space reflection (SSR) and screen space shadows, rays are traced using G-buffers, where depth and surface normal data is stored per each 2D pixel.
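A minimal sphere-tracing loop can be sketched as follows; the sphere SDF and the constants (`max_steps`, `eps`, `t_far`) are illustrative choices, not part of any particular renderer:

```python
import math

def sphere_sdf(p, center=(0.0, 0.0, 3.0), radius=1.0):
    # Signed distance to a sphere: negative inside, positive outside.
    return math.dist(p, center) - radius

def sphere_trace(origin, direction, sdf, max_steps=128, eps=1e-4, t_far=100.0):
    """SDF ray marching: step by the distance bound at each point.

    Returns the hit distance t, or None on a miss. Because `sdf` gives
    the distance to the nearest surface, each step cannot overshoot it.
    """
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        d = sdf(p)
        if d < eps:          # close enough to the surface: report a hit
            return t
        t += d               # take the largest provably safe step
        if t > t_far:
            break
    return None
```

For a ray from the origin along +z toward a unit sphere centred at (0, 0, 3), the first step already lands on the surface at t = 2.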

See also

  • Amira – commercial 3D visualization and analysis software (for life sciences and biomedical) that uses a ray-casting volume rendering engine (based on Open Inventor)
  • Avizo – commercial 3D visualization and analysis software that uses a ray-casting volume rendering engine (also based on Open Inventor)
  • Shadertoy – online community and platform for computer graphics professionals, academics and enthusiasts who share, learn and experiment with rendering techniques and procedural art through GLSL code
  • Volumetric path tracing

from Grokipedia
Volume ray casting is an image-order volume rendering technique in computer graphics that visualizes three-dimensional scalar or vector data by casting rays from the viewpoint through a volumetric dataset, sampling scalar values at regular intervals along each ray, and compositing the resulting colors and opacities to form a two-dimensional projection image.[1] This method avoids the need for explicit geometric surface extraction, enabling the direct display of semitransparent volumes such as those encountered in medical imaging or scientific simulations.[2]

Developed in the late 1980s, volume ray casting builds on foundational work in optical modeling and ray tracing, with Marc Levoy's 1988 paper introducing a practical algorithm for displaying surfaces from volume data without intermediate geometric primitives.[1] Levoy's approach uses trilinear interpolation to estimate scalar values at sample points along rays, assigns colors and partial opacities via classification functions, and composites these values back-to-front using the transparency equation $ C_{out} = C_{in}(1 - \alpha) + c \alpha $, where $ c $ is the sample color and $ \alpha $ is its opacity.[1] Subsequent refinements, such as the 1990 efficient ray tracing algorithm, incorporated hierarchical spatial enumeration (e.g., octree pyramids) and adaptive termination based on accumulated opacity thresholds to accelerate rendering by factors of 2–10 times, making it feasible for larger datasets.[3]

The core algorithm operates in image order: for each pixel in the output image, a ray is traced through the volume, typically sampling at evenly spaced intervals to capture gradient-based shading for surface normals and handle partial transparency in materials like fluids or tissues.[2] Front-to-back or back-to-front compositing schemes ensure correct optical depth accumulation, with modern implementations leveraging graphics processing units (GPUs) for real-time performance through texture-based ray marching and fragment shaders.[4] Key advantages include high-fidelity visualization of fuzzy or weak boundaries, reduced aliasing via supersampling, and flexibility in viewpoint and lighting, though it demands significant computational resources proportional to volume size and resolution.[1][3]

Applications of volume ray casting span medical imaging for rendering CT or MRI scans to reveal internal structures, molecular graphics for electron density maps, and scientific visualization in fields like fluid dynamics and geophysics.[1][3] In contemporary systems, such as GPU-accelerated tools, it supports interactive exploration of volumetric effects like smoke, fire, or clouds in entertainment and simulation software.[4]

Fundamentals

Overview and Definition

Volume ray casting is an image-based volume rendering technique that generates two-dimensional projections from three-dimensional scalar volumetric data by casting rays through the volume and accumulating color and opacity samples along each ray. This method enables the direct visualization of voxel-based data without the need for intermediate geometric representations, allowing for the rendering of complex internal structures and semi-transparent materials. At its core, volume ray casting operates on principles of sampling scalar fields stored in voxels—discrete three-dimensional grid points representing measured or simulated data values—and projecting them onto an image plane to form a coherent view. It excels in handling volumes with varying densities and opacities, such as those simulating fog, clouds, or medical scans, by assigning partial transparency to individual voxels and compositing contributions accumulatively along ray paths. Understanding this technique requires familiarity with basic concepts like voxels as the fundamental units of volumetric data, scalar fields defining property values across space, and image planes as the virtual projection surfaces from which rays originate. Unlike surface ray tracing, which traces rays that intersect and terminate at explicit geometric surfaces while generating secondary rays for effects like reflections and refractions, volume ray casting samples continuously throughout the entire volume extent without predefined boundaries or recursive ray spawning.[5] This approach is grounded in the volume rendering equation, which models light transport through participating media as the theoretical foundation for accumulating radiance along rays.[6]

Historical Development

The origins of volume ray casting lie in extensions of traditional ray tracing to handle volumetric data in computer graphics. In 1984, James T. Kajiya and Brian P. Von Herzen developed algorithms for tracing rays through volume densities, allowing the rendering of semi-transparent phenomena such as clouds, fog, and flames stored in spatial grids.[7] This work laid foundational principles for integrating volumetric elements into ray-based rendering pipelines, bridging surface-oriented ray tracing with density-based representations. The formal establishment of volume rendering, including ray casting as a core method, occurred in 1988 through independent seminal contributions. Marc Levoy introduced techniques for displaying surfaces extracted from volume data using ray casting to sample and classify scalar fields, emphasizing direct visualization without intermediate geometric models.[1] Simultaneously, Craig Upson and Michael Keeler proposed the V-buffer approach, an efficient ray-casting framework for projecting and compositing volumetric data directly onto the image plane, enabling interactive rendering of complex volume projections.[8] During the 1990s, volume ray casting advanced toward practical implementation with hardware and software support. 
The 1999 release of the VolumePro board by Mitsubishi Electric Research Laboratories marked a milestone in real-time ray casting, delivering hardware-accelerated volume rendering at interactive frame rates on consumer PCs through parallel ray traversal and compositing.[9] Concurrently, the technique was integrated into prominent software toolkits, including the Visualization Toolkit (VTK), which supported ray-casting mappers from its early versions in the mid-1990s, and Open Inventor, where volume rendering nodes facilitated scene-graph-based implementations by the late 1990s.[10] The 2000s saw a pivotal shift toward graphics processing unit (GPU) acceleration for volume ray casting, enhancing performance for larger datasets. Daniel Weiskopf's 2006 exploration of GPU-based techniques enabled programmable shading and high-speed ray traversal on commodity hardware, expanding accessibility for real-time applications. Reflecting on these developments, a 2006 overview marking the 20th anniversary of volume rendering underscored its transformative role in medical imaging and scientific visualization, crediting ray casting for enabling detailed exploration of volumetric scans like CT and MRI data.

Algorithmic Foundations

Mathematical Formulation

The volume rendering equation provides the theoretical foundation for computing the radiance along a ray in a participating medium, such as a volumetric dataset. In its continuous form, the outgoing radiance $I$ at a point along the ray is given by

$$ I = \int_0^D c(t) \, \alpha(t) \, \exp\left( -\int_0^t \alpha(s) \, ds \right) dt, $$

where $D$ is the ray's length through the volume, $t$ parameterizes position along the ray, $c(t)$ represents the color (or emission) at parameter $t$, and $\alpha(t)$ denotes the opacity (or absorption coefficient) at $t$. This integral accounts for emission and attenuation due to absorption along the ray path.[6] The exponential term in the equation is the transmission function $T(t)$, defined as
$$ T(t) = \exp\left( -\int_0^t \sigma(s) \, ds \right), $$

where $\sigma(s)$ is the density or optical thickness at $s$, related to opacity by $\alpha(t) \approx \sigma(t) \Delta t$ for small step sizes $\Delta t$. The transmission $T(t)$ quantifies the fraction of radiance that survives attenuation from the ray origin to $t$. Substituting $T(t)$ yields an equivalent form $I = \int_0^D c(t) \, \alpha(t) \, T(t) \, dt$, emphasizing the balance between emission and surviving light.[6] To apply this in ray casting, the continuous integral is discretized using Riemann sums, assuming a constant step size $\Delta t$ along the ray with $n$ samples, where $D = n \Delta t$. The integral approximates to a summation over sampled points, transforming the emission-absorption model into iterative compositing. Under the assumption of piecewise constant properties between samples, the front-to-back composited color $C$ becomes
$$ C = \sum_{i=1}^n c_i \, \alpha_i \, \prod_{j=1}^{i-1} (1 - \alpha_j), $$

where $c_i$ and $\alpha_i$ are the interpolated color and opacity at the $i$-th sample, and the product represents accumulated transmittance up to the previous sample. This formulation accumulates contributions from front to back, terminating early when transmittance falls below a threshold for efficiency. The derivation follows from partitioning the integral into $n$ segments, approximating each $\int_{(i-1)\Delta t}^{i\Delta t} \cdots \, dt \approx c_i \alpha_i T(i\Delta t) \Delta t$, with $T(i\Delta t) = \prod_{j=1}^{i-1} (1 - \alpha_j)$ and $\alpha_i \approx 1 - \exp(-\sigma_i \Delta t)$. As $\Delta t \to 0$, the sum converges to the continuous integral.[6] For shading in volume ray casting, surface normals are derived from gradients of the scalar field to compute local illumination via models like Phong. Gradients $\nabla f$ at voxel positions are approximated using central differences on the discrete voxel data $f$:
$$ \nabla f(x_i, y_j, z_k) \approx \left( \frac{f(x_{i+1}, y_j, z_k) - f(x_{i-1}, y_j, z_k)}{2\Delta x}, \frac{f(x_i, y_{j+1}, z_k) - f(x_i, y_{j-1}, z_k)}{2\Delta y}, \frac{f(x_i, y_j, z_{k+1}) - f(x_i, y_j, z_{k-1})}{2\Delta z} \right), $$

then normalized as $N = \nabla f / \|\nabla f\|$ for use as normals. This provides orientation cues for specular and diffuse lighting without explicit surface extraction. Sobel operators offer an alternative finite-difference approximation, convolving the volume with 3×3×3 kernels to estimate gradients while suppressing noise, though central differences suffice for uniform grids.[1][11]
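The convergence of the discrete sum to the continuous integral can be checked numerically for a homogeneous medium, where the integral has the closed form $c(1 - e^{-\sigma D})$; with $\alpha_i = 1 - e^{-\sigma \Delta t}$, the discrete compositing sum reproduces it exactly. A toy verification (illustrative values, not a full renderer):

```python
import math

def continuous_radiance(c, sigma, D):
    # Closed form of I = ∫ c·σ·exp(-σ t) dt over [0, D] for constant c, σ.
    return c * (1.0 - math.exp(-sigma * D))

def composited_radiance(c, sigma, D, n):
    # Discrete front-to-back sum: C = Σ c·α·Π(1-α), with α = 1 - exp(-σ·Δt).
    dt = D / n
    alpha = 1.0 - math.exp(-sigma * dt)
    C, T = 0.0, 1.0
    for _ in range(n):
        C += T * c * alpha     # add this sample, weighted by transmittance
        T *= 1.0 - alpha       # attenuate transmittance for samples behind
    return C
```

For constant properties the agreement is exact for any sample count $n$; for spatially varying $c$ and $\sigma$, the discrete sum only converges as $\Delta t \to 0$.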

Classification and Variants

Volume ray casting is classified as a direct volume rendering technique, which samples and integrates scalar field data directly to produce images without intermediate geometric representations, in contrast to indirect methods that extract surfaces such as isosurfaces via algorithms like marching cubes before rendering.[12] Within direct volume rendering, volume ray casting falls under image-based or image-order approaches, where rendering proceeds from the output image by casting rays through each pixel into the volume, differing from object-based or object-order methods that project voxels or volume elements onto the image plane, such as in splatting.[12][13] Key variants of volume ray casting include image-space implementations, which generate rays per pixel in screen coordinates for flexible viewpoint handling, and object-space variants, which align rays with the volume's voxel grid or process slices parallel to the viewing plane for hardware efficiency, as seen in systems like VolumePro.[9][12] Compositing along rays can follow front-to-back order, accumulating opacity to enable early ray termination when transparency thresholds are met, or back-to-front order, which blends all samples without termination for full traversal.[12] Volume ray casting typically employs fixed-step sampling along rays at uniform intervals to approximate the volume integral, distinguishing it from ray marching variants that use variable-step or adaptive sampling to adjust based on data density or gradients for improved efficiency and accuracy.[14][12]

Basic Pipeline

Ray Generation and Traversal

In volume ray casting, rays are generated for each pixel $(u, v)$ on the image plane to simulate the viewing direction through the volumetric data. The ray origin is placed at the eye point $E$, and the direction $D$ is computed as the normalized vector from $E$ to the projected pixel position $P$ on the near plane, given by $D = \frac{P - E}{\|P - E\|}$. This setup supports perspective projection, allowing for realistic depth cues in the rendered image.[15] To efficiently traverse the volumetric data, the ray is first intersected with the axis-aligned bounding box (AABB) enclosing the volume, determining the entry and exit points along the ray path. The slab method is employed for this intersection, treating the AABB as three pairs of parallel planes (slabs) aligned with the coordinate axes. For a ray parameterized as $X(t) = E + tD$, the intersection parameters for each axis $i \in \{x, y, z\}$ are calculated as $t_{i,\min} = \frac{\min_i - E_i}{D_i}$ and $t_{i,\max} = \frac{\max_i - E_i}{D_i}$ if $D_i \neq 0$, with adjustments for the ray direction sign, and division-by-zero cases handled by setting infinite values. The overall entry parameter is then $t_{\min} = \max(0, \max(t_{x,\min}, t_{y,\min}, t_{z,\min}))$, and the exit parameter is $t_{\max} = \min(t_{x,\max}, t_{y,\max}, t_{z,\max})$; if $t_{\min} > t_{\max}$, the ray misses the volume. This method ensures traversal is confined to the relevant segment, avoiding unnecessary computations outside the data bounds. Within the intersected segment, the ray is advanced using a fixed step size $\Delta t$, determined by the voxel resolution or a desired sampling rate to balance accuracy and performance.
Typically, $\Delta t$ is set proportional to the average voxel edge length, such as $\Delta t = \frac{\text{voxel size}}{n}$, where $n$ is the number of samples per voxel traversal, ensuring uniform sampling density across the volume. The traversal loop increments the parameter as $t \leftarrow t + \Delta t$ and updates the position $X(t) = E + tD$ until $t > t_{\max}$, resampling the volume at each step.[1] For optimization, early termination of the ray traversal occurs when the accumulated opacity exceeds a predefined threshold, such as 0.95, beyond which additional samples contribute minimally to the final pixel value due to the exponential falloff in the compositing equation. This technique significantly reduces computational cost for opaque or semi-opaque regions without compromising visual fidelity.[1]
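The slab intersection described above can be sketched as a small Python routine; variable names are chosen to match the formulas rather than any specific implementation:

```python
def ray_box_intersect(origin, direction, box_min, box_max):
    """Slab method: intersect a ray X(t) = E + t·D with an AABB.

    Returns (t_min, t_max) for the segment inside the box, or None
    if the ray misses it entirely.
    """
    t_min, t_max = 0.0, float("inf")
    for e, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d != 0.0:
            t0, t1 = (lo - e) / d, (hi - e) / d
            if t0 > t1:                    # account for ray direction sign
                t0, t1 = t1, t0
            t_min, t_max = max(t_min, t0), min(t_max, t1)
        elif not (lo <= e <= hi):          # ray parallel to and outside a slab
            return None
    return (t_min, t_max) if t_min <= t_max else None
```

A ray starting outside the box that points at it yields the entry and exit parameters; a parallel ray outside a slab is rejected without any division.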

Sampling and Interpolation

In volume ray casting, scalar values are extracted from the discrete voxel grid at predefined sample points along each ray to approximate the continuous volumetric field. These sample points are typically positioned equidistantly along the ray at fixed intervals $\Delta t$, ensuring uniform spacing to facilitate consistent accumulation during compositing.[1] To locate the relevant voxels, the world coordinates of each sample point $(x, y, z)$ are mapped to discrete voxel indices through floor division, identifying the enclosing grid cell.[3] Since sample points rarely align exactly with voxel centers, interpolation is essential to estimate the scalar value at non-grid locations. The standard approach is trilinear interpolation, which computes a weighted average from the eight surrounding voxels within the unit cube defined by the floor indices. For a point $(x, y, z)$, let $dx = x - \lfloor x \rfloor$, $dy = y - \lfloor y \rfloor$, and $dz = z - \lfloor z \rfloor$. The interpolated scalar $s$ is given by:
$$
\begin{aligned}
s = &\ (1 - dx)(1 - dy)(1 - dz) \cdot v_{000} \\
&+ dx\,(1 - dy)(1 - dz) \cdot v_{100} \\
&+ (1 - dx)\,dy\,(1 - dz) \cdot v_{010} \\
&+ dx\,dy\,(1 - dz) \cdot v_{110} \\
&+ (1 - dx)(1 - dy)\,dz \cdot v_{001} \\
&+ dx\,(1 - dy)\,dz \cdot v_{101} \\
&+ (1 - dx)\,dy\,dz \cdot v_{011} \\
&+ dx\,dy\,dz \cdot v_{111},
\end{aligned}
$$
where $v_{ijk}$ denotes the scalar value at the voxel offset by $i$ in $x$, $j$ in $y$, and $k$ in $z$ from the base index. This method provides a smooth approximation by linearly interpolating along each edge, face, and finally the volume.[3] Nearest-neighbor sampling offers a simpler alternative but introduces blocky artifacts, making trilinear the preferred basic technique for most applications.[1] The interpolated scalar $s$ is then classified using a transfer function to assign optical properties, mapping it to red, green, blue, and alpha (RGBA) values. A one-dimensional transfer function $c(s)$ yields color, while $\alpha(s)$ determines opacity, often via lookup tables for efficiency; two-dimensional variants incorporate additional data like gradient magnitude to distinguish material boundaries.[1] This classification step reveals internal structures by emphasizing specific scalar ranges, such as tissue densities in medical imaging. To mitigate aliasing artifacts from undersampling or sharp transitions in the voxel grid, pre-filtering techniques can be applied prior to interpolation. Basic trilinear interpolation assumes linear variation, but higher-order filters, such as Gaussian kernels, convolve the volume data to band-limit frequencies and reduce jagged edges at silhouettes.[16] These methods preserve detail while keeping the rendered image within the limits of the Nyquist sampling theorem, though they increase computational cost compared to unfiltered nearest-neighbor or trilinear approaches.[16]
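The trilinear formula translates directly into code; a sketch, assuming a NumPy array indexed as `volume[i, j, k]` and a sample point strictly inside the grid:

```python
import numpy as np

def trilinear(volume, x, y, z):
    """Trilinearly interpolate a scalar at (x, y, z) from a 3D grid.

    Each of the 8 surrounding voxels is weighted by the product of the
    fractional offsets (dx, dy, dz) inside the enclosing cell.
    """
    i, j, k = int(np.floor(x)), int(np.floor(y)), int(np.floor(z))
    dx, dy, dz = x - i, y - j, z - k
    s = 0.0
    for di in (0, 1):
        for dj in (0, 1):
            for dk in (0, 1):
                w = ((dx if di else 1 - dx) *
                     (dy if dj else 1 - dy) *
                     (dz if dk else 1 - dz))
                s += w * volume[i + di, j + dj, k + dk]
    return s
```

At the centre of a cell, all eight weights are equal, so the result is the mean of the eight corner values; at a voxel centre, the result reduces to that voxel's value.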

Shading and Local Illumination

In volume ray casting, shading enhances the visual perception of volumetric structures by estimating surface normals from scalar field gradients and applying local illumination models at sampled points along each ray. Gradient estimation is typically performed using finite differences on the interpolated scalar values, with the central difference method being common for its balance of accuracy and computational efficiency: for the x-component, $G_x = \frac{s(x+1) - s(x-1)}{2}$, and similarly for the y and z components, where $s$ denotes the scalar field. The surface normal is then derived as $\mathbf{N} = -\nabla s / \|\nabla s\|$, providing directionality for lighting computations. This approach, foundational to early volume rendering techniques, enables the simulation of surface-like reflections and depth cues in translucent media.[1][17] Local illumination is computed using models adapted from surface rendering, such as the Phong shading equation, which decomposes light interaction into ambient, diffuse, and specular terms:
$$ I = I_a k_a + I_d k_d (\mathbf{N} \cdot \mathbf{L}) + I_s k_s (\mathbf{R} \cdot \mathbf{V})^n, $$
where $\mathbf{L}$ is the light direction, $\mathbf{V}$ the view direction, $\mathbf{R}$ the reflection vector, and $k_a, k_d, k_s, n$ are material coefficients. Here, $I_a, I_d, I_s$ represent ambient, diffuse, and specular light intensities, respectively. This model approximates photon scattering at each sample point, assigning color based on the estimated normal without requiring explicit geometry. Variants like Blinn-Phong offer computational alternatives for specular highlights while maintaining similar perceptual effects.[1][18] Support for multiple light sources extends local shading by summing contributions from each: $I_{\mathrm{total}} = \sum_i I_i$, where each $I_i$ follows the Phong formulation for light $i$. Additionally, the gradient magnitude $\|\nabla s\|$ modulates opacity in emission or absorption models, increasing transparency in low-gradient regions (e.g., interiors) and emphasizing boundaries, which integrates seamlessly with transfer functions for refined visual emphasis. This modulation enhances feature detection in complex datasets, such as medical scans.[4][19] For non-photorealistic rendering, gradient-free shading options prioritize illustrative clarity over realism, such as maximum intensity projection (MIP), which selects the highest scalar value along the ray without normal-based lighting, yielding X-ray-like views ideal for angiography or feature highlighting. This approach avoids gradient computations entirely, reducing artifacts in noisy data while focusing on structural silhouettes.[20][21]
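A sketch of central-difference gradient estimation and the Phong evaluation, assuming unit vectors, unit grid spacing, and unit light intensities (the coefficient values are arbitrary illustrative defaults, not from the cited work):

```python
import numpy as np

def central_gradient(volume, i, j, k):
    # Central differences on the voxel grid (unit spacing assumed).
    return np.array([
        (volume[i + 1, j, k] - volume[i - 1, j, k]) / 2.0,
        (volume[i, j + 1, k] - volume[i, j - 1, k]) / 2.0,
        (volume[i, j, k + 1] - volume[i, j, k - 1]) / 2.0,
    ])

def phong(normal, light, view, ka=0.1, kd=0.6, ks=0.3, shininess=32):
    """Phong intensity I = ka + kd·(N·L) + ks·(R·V)^n for unit vectors,
    with light intensities set to 1 and negative dot products clamped."""
    n_dot_l = max(0.0, float(np.dot(normal, light)))
    reflect = 2.0 * n_dot_l * normal - light      # R = 2(N·L)N - L
    r_dot_v = max(0.0, float(np.dot(reflect, view)))
    return ka + kd * n_dot_l + ks * r_dot_v ** shininess
```

With the normal, light, and view vectors all aligned, the three terms sum to $k_a + k_d + k_s$.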

Compositing and Accumulation

In volume ray casting, compositing involves blending the shaded samples collected along each ray to compute the final pixel color and opacity, accounting for the partial transparency of volumetric data. This process approximates the volume rendering integral by accumulating contributions from multiple samples in a specific order, typically front-to-back or back-to-front, to simulate light transmission through the medium.[6] Front-to-back compositing traverses samples from the viewpoint toward the back of the volume, maintaining an accumulated color $C$ initialized to zero and a transmittance $T$ initialized to 1. For each sample with color $c_s$ and opacity $\alpha_s$, the updates are given by

$$ C \leftarrow C + T \cdot c_s \cdot \alpha_s, \quad T \leftarrow T \cdot (1 - \alpha_s). $$
This weighted accumulation reflects the remaining light transmittance after each sample, enabling early ray termination when $T$ falls below a small threshold $\epsilon$ (e.g., 0.01) to improve efficiency without significant loss in image quality.[6] An alternative, back-to-front compositing, accumulates samples in reverse order, starting from the farthest sample and proceeding toward the viewpoint, akin to the painter's algorithm for opaque surfaces. The final color is computed as

$$ C = \sum_i c_i \alpha_i \prod_{j > i} (1 - \alpha_j), $$
where the product term represents the cumulative transmittance of all samples behind the current one; this method does not support early termination but can leverage hardware alpha blending for implementation.[1] Both approaches rely on the general over operator for alpha blending between successive (color, opacity) pairs $(C_a, \alpha_a)$ and $(C_b, \alpha_b)$:

$$ (C_a, \alpha_a) \circ (C_b, \alpha_b) = \left( C_a + (1 - \alpha_a) C_b, \ \alpha_a + (1 - \alpha_a) \alpha_b \right). $$
This operator ensures correct handling of overlapping transparent contributions, treating the current accumulation as the background for the next sample.[22] To address under- or over-sampling artifacts that can lead to aliasing in the final image, techniques such as supersampling—casting multiple rays per pixel and averaging results—or adaptive step sizes along rays are employed, though these increase computational cost.[23]
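Both compositing orders can be sketched side by side; for scalar (grayscale) samples with premultiplied accumulation, the two produce identical results. Illustrative code, not drawn from a specific implementation:

```python
def front_to_back(samples, eps=0.01):
    """Composite (color, alpha) samples front to back, with early exit."""
    C, T = 0.0, 1.0
    for c, a in samples:
        C += T * c * a       # weight this sample by remaining transmittance
        T *= 1.0 - a
        if T < eps:          # remaining samples are nearly invisible
            break
    return C

def back_to_front(samples):
    """Equivalent result via the painter's-algorithm ordering."""
    C = 0.0
    for c, a in reversed(samples):
        C = C * (1.0 - a) + c * a   # 'over' blend onto the accumulated background
    return C
```

The front-to-back variant is usually preferred in practice because the early-exit test avoids shading and interpolating samples that cannot change the pixel.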

Advanced Techniques

Adaptive Sampling

Adaptive sampling in volume ray casting adjusts the density and distribution of samples along rays to optimize computational efficiency while preserving rendering quality, contrasting with fixed-step methods that sample uniformly regardless of local data characteristics.[3] This approach reduces the number of samples in regions where fine detail is unnecessary, such as homogeneous or low-contribution areas, thereby accelerating rendering without significant loss in visual fidelity.[24] Opacity-based adaptation modulates step sizes based on local or accumulated opacity values, allowing larger steps in low-opacity regions where contributions to the final image are minimal. For instance, step sizes can be increased inversely with local opacity to ensure denser sampling only where opacity is high, such as near occluding structures.[3] Additionally, rays can terminate early when accumulated opacity exceeds $1 - \epsilon$ (typically $\epsilon = 0.01$ to $0.05$), skipping remaining low-contribution segments entirely.[24] This technique, integral to front-to-back compositing pipelines, can reduce rendering time by factors of 1.3 to 2.2 in datasets with sparse opaque features, like medical scans.[3] Gradient-driven sampling refines step sizes according to the magnitude of the scalar field gradient $\|\nabla s\|$, which indicates local feature sharpness; smaller steps are taken near high-gradient regions resembling surfaces to capture transitions accurately.
The step size is dynamically adjusted based on the gradient direction and a threshold, bounded by minimum and maximum distances to prevent under- or over-sampling.[25] This method enhances detail preservation in boundary-heavy volumes, such as CT angiography data, while coarsening traversal in smooth interiors, achieving efficiency gains of 2-3x over uniform sampling without aliasing artifacts.[25] Hierarchical sampling employs a coarse-to-fine strategy, beginning with sparse samples along rays and progressively subdividing those exhibiting high variance or error exceeding a tolerance. Initial low-resolution passes identify uncertain rays—e.g., via variance in interpolated scalar values—and refine them by casting additional sub-rays or increasing local density.[26] This progressive refinement enables interactive previews with gradual quality improvement. Multi-resolution volumes facilitate level-of-detail (LOD) adaptation by selecting resolution based on ray distance from the viewer, using lower LODs for distant or peripheral rays to minimize samples. Hierarchical representations, such as octree-based voxel MIP-map pyramids, precompute multi-resolution structures where each level reduces resolution; during traversal, the LOD is chosen such that projected voxel size matches screen pixel footprint. This distance-driven approach suits large-scale datasets, like geophysical simulations, cutting memory and sample requirements significantly while maintaining perceptual quality.[27]
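An opacity-driven step-size policy can be sketched as follows; the specific step sizes, threshold, and termination constant are illustrative heuristics, not values taken from the cited work:

```python
def adaptive_march(sample_opacity, t_min, t_max,
                   dt_min=0.25, dt_max=2.0, threshold=0.05, eps=0.01):
    """March a ray with opacity-dependent step sizes (heuristic sketch).

    `sample_opacity(t)` returns the opacity contribution at parameter t.
    Steps are large through nearly transparent space and small near
    dense features; the ray terminates early once transmittance < eps.
    """
    positions, T = [], 1.0
    t = t_min
    while t < t_max and T > eps:       # early ray termination
        a = sample_opacity(t)
        positions.append(t)
        T *= 1.0 - min(1.0, a)
        # Heuristic: coarse steps in low-opacity regions, fine steps otherwise.
        t += dt_max if a < threshold else dt_min
    return positions, 1.0 - T
```

For a volume that is empty until some depth and then moderately opaque, the ray takes only a few coarse steps through the empty region and stops soon after entering the opaque one, using far fewer samples than uniform fine stepping.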

Acceleration structures

Acceleration structures in volume ray casting are designed to expedite ray traversal by identifying and skipping regions of empty space where scalar values fall below an isovalue or opacity threshold, thereby reducing the number of sampling operations required for rendering. These techniques leverage precomputed hierarchical or compressed representations of the volume data to enable large leaps along rays, significantly improving performance for sparse datasets without compromising image quality. By avoiding unnecessary computations in homogeneous or transparent areas, acceleration structures can achieve substantial speedups, particularly in medical imaging and scientific visualization applications where volumes often contain substantial empty regions. Empty space leaping (ESL) employs hierarchical data structures, such as octrees or k-d trees, to precompute minimum and maximum scalar values within volume segments, allowing rays to skip over intervals where the maximum value is less than the specified isovalue or opacity threshold. In this approach, the volume is recursively subdivided into a tree where each node stores the scalar range of its subtree; during traversal, a ray intersects the tree to find the largest empty segment it can leap, updating the ray's entry and exit points accordingly. This method, introduced for accelerating volume animation, exploits ray coherence between frames to further optimize leaping decisions, resulting in interactive rendering rates for moderately sized volumes. Octree-based ESL has been shown to provide speedups of up to 10x in sparse datasets like CT scans.[28] Distance fields provide another effective mechanism for empty space skipping by precomputing signed distance functions (SDFs) that represent the minimum distance from each voxel to the nearest surface defined by an isovalue or opacity boundary. 
During ray marching, the step size along the ray is set to the minimum of the remaining distance to the volume boundary and the safe distance provided by the SDF at the current position, ensuring the ray jumps directly to the next potential intersection without missing opaque regions. This technique, adapted from implicit surface rendering, enables conservative leaps that maintain accuracy while accelerating traversal in volumes with isolated structures, such as anatomical models, where empty space constitutes a large portion of the data. Adaptively sampled distance fields (ADFs) extend this by hierarchically refining the SDF near surfaces, achieving rendering speedups of up to 20 times compared to uniform sampling in complex scenes.[29] Voxel traversal algorithms facilitate efficient grid walking in uniform or adaptive voxel grids by systematically advancing rays through cells, prioritizing the major axis of movement to minimize computations. The seminal algorithm by Amanatides and Woo advances the ray by calculating parametric distances to the next voxel boundaries in each dimension, stepping along the axis with the smallest Δt and updating only the necessary coordinates, requiring just two comparisons and one addition per voxel crossed. This approach eliminates redundant ray-plane intersections, making it ideal for structured volumes where empty cells can be quickly skipped via simple tests on voxel occupancy. In practice, it supports real-time traversal for 512³ grids on modern hardware, forming the basis for many GPU-accelerated ray casters.[30] Run-length encoding (RLE) compresses sparse volumes by representing consecutive uniform voxels—particularly empty runs—as compact lists of start positions, lengths, and values, enabling rays to skip entire homogeneous regions in a single operation.
In RLE-accelerated ray casting, the volume is encoded into one-dimensional runs along scanlines or hierarchical blocks, and traversal proceeds by decoding runs intersected by the ray to jump over transparent sequences until an opaque run is encountered. This method excels in datasets with long empty stretches, such as simulated fluid dynamics volumes, where compression ratios exceed 10:1 and rendering times are reduced by 5-15 times relative to naive traversal. Hardware implementations, like the VolumePro ASIC, integrate RLE directly into the ray-casting pipeline for real-time performance on commodity systems.[9]
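The Amanatides–Woo grid walk described above can be sketched as follows. This is a simplified illustration assuming unit-sized voxels and a ray origin inside the grid; per-voxel occupancy tests for empty space skipping would be layered on top.

```python
def voxel_traversal(origin, direction, grid_size):
    """Amanatides & Woo grid walk: yields the integer voxel coordinates
    visited by a ray, advancing one cell at a time along the axis whose
    next boundary is closest in ray parameter t. Unit-sized voxels and
    an origin inside the grid are assumed.
    """
    x = [int(c) for c in origin]                      # current voxel
    step, t_max, t_delta = [], [], []
    for i in range(3):
        d = direction[i]
        if d > 0:
            step.append(1)
            t_max.append((x[i] + 1 - origin[i]) / d)  # t of next boundary
            t_delta.append(1.0 / d)                   # t to cross one voxel
        elif d < 0:
            step.append(-1)
            t_max.append((x[i] - origin[i]) / d)
            t_delta.append(-1.0 / d)
        else:                                         # ray parallel to this axis
            step.append(0)
            t_max.append(float("inf"))
            t_delta.append(float("inf"))
    while all(0 <= x[i] < grid_size[i] for i in range(3)):
        yield tuple(x)
        axis = t_max.index(min(t_max))                # closest boundary wins
        x[axis] += step[axis]                         # one comparison, one add
        t_max[axis] += t_delta[axis]

cells = list(voxel_traversal((0.5, 0.5, 0.5), (1.0, 0.25, 0.0), (4, 4, 4)))
```

Each iteration touches only the axis being crossed, which is why the inner loop reduces to a pair of comparisons and an addition per voxel.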

Ray marching approaches

Ray marching approaches represent iterative variants of volume ray casting that advance rays through the volume using variable step sizes guided by distance estimators, enabling efficient handling of sparse, implicit, or procedural data without uniform sampling across the entire domain. These methods differ from fixed-step traversal by adapting steps based on local geometry or density, reducing unnecessary computations in empty space while ensuring accurate accumulation of volumetric contributions.[31] The core algorithm for ray marching initializes the ray parameter t = 0 and accumulated opacity α = 0; it then iterates while t < t_max and α < 1, advancing t by Δt = max(Δt_min, f(p)), where p is the current position along the ray and f estimates the distance to the next significant feature, such as a surface or density boundary; at each step, the volume is sampled for density and color, which are composited into the accumulated result. This framework supports both surface and volumetric rendering by terminating or continuing based on opacity thresholds, with Δt_min preventing numerical instability from excessively small steps.[32] Sphere tracing, a prominent ray marching technique for implicit surfaces within volumes, sets the step size to the estimated distance to the surface, Δt = |f(p)|, where f is a signed distance function defining the implicit geometry f(x) = 0.
Introduced for antialiased rendering of height fields and extended to general implicit volumes, the method guarantees that rays do not miss intersections if f is Lipschitz continuous with constant K, as steps are conservatively bounded by |f(p)| / K to avoid overshooting; convergence is linear in the number of steps, typically requiring fewer iterations than fixed-step methods for complex, non-intersecting surfaces like fractals.[33] This approach excels in procedural volumes where explicit voxelization is infeasible, ensuring watertight traversal without precomputation. Signed distance field (SDF) ray marching adapts sphere tracing principles to volumetric SDFs, which encode both interior density and exterior distance in a single continuous function, marching the ray until |f(p)| < ε for a small tolerance ε, at which point surface or volume contributions are evaluated and accumulated. This enables compact representation and rendering of intricate procedural volumes, such as fractals or noise-based densities, by evaluating f on the fly without storing a full grid; for example, in neural reconstruction, SDF marching integrates with volume rendering equations to accumulate transmittance and emission along the ray, supporting differentiable optimization for scene fitting. Applications include real-time procedural terrain or cloud volumes, where the method avoids aliasing in thin features by refining steps near boundaries.[34][32] Other variants include constant-step marching for uniform media, where Δt is fixed in proportion to the medium's optical thickness or voxel resolution to ensure consistent sampling in homogeneous densities like fog or isotropic scattering.
Hybrids integrate ray marching with distance-based empty space leaping for complex scenes, using coarse SDFs to compute large skips in low-density regions before switching to fine-grained tracing near features, balancing speed and accuracy in multi-material volumes.[31]
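The sphere tracing loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the SDF is 1-Lipschitz (so Δt = |f(p)| never overshoots), and the unit-sphere scene is a made-up example.

```python
def sphere_trace(origin, direction, sdf, t_max=100.0, eps=1e-4, max_steps=256):
    """Sphere tracing: advance the ray by the SDF value, the largest step
    guaranteed not to skip past a surface when the distance field is
    1-Lipschitz. Returns the hit parameter t, or None if the ray escapes.
    direction is assumed to be unit length."""
    t = 0.0
    for _ in range(max_steps):
        p = [origin[i] + t * direction[i] for i in range(3)]
        d = sdf(p)
        if d < eps:          # converged onto the implicit surface f(x) = 0
            return t
        t += d               # conservative step: no surface within radius d
        if t > t_max:
            return None
    return None

# Example SDF: a unit sphere at the origin (hypothetical scene).
def unit_sphere(p):
    return (p[0]**2 + p[1]**2 + p[2]**2) ** 0.5 - 1.0

t_hit = sphere_trace((0.0, 0.0, -3.0), (0.0, 0.0, 1.0), unit_sphere)
```

For volumetric SDF marching, the early return would instead trigger density evaluation and compositing before continuing, but the stepping rule is the same.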

Implementations and applications

Hardware acceleration

Volume ray casting on central processing units (CPUs) leverages multi-threading to parallelize the processing of ray bundles across multiple cores, enabling efficient handling of large datasets in offline rendering scenarios. SIMD instructions, such as SSE and AVX, are employed to vectorize parallel sample computations along rays, improving throughput for coherent ray sets. However, the inherently serial nature of ray marching limits real-time performance on CPUs, making them more suitable for adaptive sampling strategies where computational effort varies per ray.[35] Graphics processing units (GPUs) provide foundational acceleration for volume ray casting through programmable shaders in APIs like OpenGL with GLSL, Vulkan, and DirectX, where volumetric data is stored as 3D textures and ray integration occurs within fragment shaders. This approach maps screen-space pixels to rays, marching through the 3D texture to sample and composite values, achieving interactive frame rates for moderately sized volumes. Early implementations integrated optimizations like empty space skipping and early ray termination directly into shader code to reduce unnecessary computations.[4][36] In the 2010s, ray-guided volumetrics advanced GPU acceleration by employing hierarchical data structures, such as sparse voxel octrees, to enable efficient empty space skipping during ray traversal. These methods partition the volume into bricks or nodes, allowing rays to leap over transparent regions, significantly reducing sampling overhead for large-scale datasets while maintaining real-time performance on commodity GPUs. 
Techniques like SparseLeap further refined this by combining object-order rasterization with image-order ray marching, achieving up to 10x speedups over naive implementations through adaptive traversal.[37][38] Post-2018 hardware ray tracing cores, introduced in NVIDIA's Turing architecture (2018) and AMD's RDNA 2 architecture (2020), support hybrid volume casting by accelerating intersection tests with bounding volume hierarchies for sparse volumes. Subsequent generations, including NVIDIA's Ampere (2020), Ada Lovelace (2022), and Blackwell (2024) architectures with 3rd- and 4th-generation RT cores, and AMD's RDNA 3 (2022) and RDNA 4 (2025) with redesigned ray tracing hardware, further enhance performance for volumetric ray traversal, enabling faster empty space skipping and reduced memory accesses in heterogeneous volumes by factors of 5-20x compared to software-only methods. NVIDIA's OptiX framework utilizes RT cores for ray generation and traversal in programmable pipelines, adaptable to volume primitives via custom hit programs, though primarily optimized for surface geometry. On AMD GPUs, block walking integrates ray tracing hardware with a novel traversal algorithm for sparse voxel data.[39][40][41][42][43] Established frameworks facilitate GPU-accelerated volume ray casting through specialized mappers and kernels. The Visualization Toolkit (VTK) employs its vtkGPUVolumeRayCastMapper, which implements ray casting via OpenGL fragment shaders supporting multi-volume compositing and transfer functions for datasets up to 512³ voxels at interactive rates. Custom implementations often use OpenCL or CUDA kernels for greater flexibility; comparative studies show CUDA achieving up to 2x faster rendering than OpenGL fragment shaders for 256³ volumes due to explicit memory management and compute unification.
In game engines, plugins extend Unity and Unreal Engine 5 (UE5) with ray marching support, such as Unity's Volume Rendering package for direct volume rendering via compute shaders and UE5's TBRaymarcher for volumetric data visualization.[44][45] Recent 2020s integrations incorporate denoising techniques for noisy Monte Carlo volume ray casting, where stochastic sampling simulates global effects like scattering; frameworks combine CUDA kernels with neural denoisers to reduce variance, enabling real-time previews from as few as 1 sample per pixel by leveraging spatiotemporal networks for up to 100x noise reduction.[46]
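The CPU-side vectorization mentioned above — computing samples for a whole bundle of rays in lockstep so that SIMD units process many rays per instruction — can be illustrated with a NumPy sketch. The per-sample opacity values here are made up, and NumPy's array operations stand in for hand-written SSE/AVX intrinsics.

```python
import numpy as np

def composite_rays_vectorized(opacities):
    """Front-to-back opacity accumulation for a bundle of rays at once.

    opacities has shape (n_rays, n_samples). Vectorizing across the ray
    axis lets NumPy (and, underneath, the CPU's SIMD units) update every
    ray's accumulator in one array operation per marching step, instead
    of running a scalar loop per ray."""
    n_rays, n_samples = opacities.shape
    alpha = np.zeros(n_rays)
    for s in range(n_samples):                 # march all rays in lockstep
        alpha += (1.0 - alpha) * opacities[:, s]
    return alpha

# 4 rays x 3 samples of per-sample opacity (made-up values).
a = composite_rays_vectorized(np.array([
    [0.0, 0.0, 0.0],    # empty ray
    [0.5, 0.5, 0.0],    # two translucent samples
    [1.0, 0.0, 0.0],    # immediately opaque
    [0.2, 0.2, 0.2],    # uniformly hazy
]))
```

The lockstep march is simplest for coherent ray sets; once rays terminate at different depths (early ray termination), real implementations compact or mask the bundle.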

Real-world examples and use cases

Volume ray casting plays a pivotal role in medical imaging, enabling the visualization of complex volumetric data from computed tomography (CT) and magnetic resonance imaging (MRI) scans to support diagnosis and surgical planning. Transfer functions are employed to map scalar values, such as tissue density or intensity, to optical properties like color and opacity, facilitating the differentiation of anatomical structures such as bones, soft tissues, and organs. For instance, GPU-accelerated ray casting has been applied to render CT scans of skulls, MRI scans of brains, and CT scans of jaws, producing high-fidelity 3D views that highlight internal features for clinical analysis. In a specific application, ray casting-based volume rendering has been used to localize abnormalities in human abdomen MRI data, achieving accurate segmentation and visualization of pathological regions through ray traversal and compositing along sampled paths.[47][48][49] In scientific visualization, volume ray casting supports the exploration of large-scale datasets from simulations and observations, providing immersive insights into physical phenomena. It is commonly used for rendering fluid dynamics simulations, where ray casting integrates scalar fields like velocity magnitude or vorticity to visualize flow patterns, vortices, and shock waves in computational fluid dynamics (CFD) results from datasets spanning 10–100 MB per time step. For seismic data analysis, the technique generates realistic 3D models of subsurface structures, integrating reflectivity, faults, and stratigraphic features to aid geophysical interpretation and resource exploration. 
High-quality renders of astrophysical volumes, such as emission nebulae, leverage ray casting to simulate light emission from ionized gases, sampling densities of hydrogen, oxygen, and dust along rays to produce interactive visualizations at approximately 55 frames per second (FPS) on modern GPUs for resolutions up to 1024×768.[50][51][52] In games and visual effects (VFX), volume ray casting variants enable real-time rendering of atmospheric and procedural elements, enhancing immersion in interactive environments. Unreal Engine implements ray marching—a discrete form of ray casting—for volumetric clouds and fog, allowing dynamic lighting interactions like god rays and shadows cast onto terrain, integrated via shader-based traversal for performance on consumer hardware. Similarly, No Man's Sky employs ray marching for procedural terrain generation and volumetric effects, including clouds, by sampling signed distance fields along rays to construct infinite planetary landscapes with seamless transitions between surface and atmosphere. These approaches support real-time rendering of complex volumes, contributing to the game's expansive, explorable universe.[53][54] Performance benchmarks demonstrate the practicality of volume ray casting in demanding scenarios, with GPU implementations achieving interactive rates for substantial datasets. On a Tesla C2070 GPU, multi-volume ray casting of two 512³ datasets at 600×600 resolution exceeds 15 FPS, scalable to higher frame rates on modern hardware for single-volume rendering. Adaptive techniques further enable deployment in resource-constrained settings, such as mobile VR, where progressive ray casting on devices like the Nexus 6P delivers interactive visualization of 512³ datasets at 720×1280 resolution within sub-second completion times per frame, using tiled sampling to balance quality and speed.[55][56]
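The transfer functions used in the medical imaging examples above can be sketched as a simple piecewise-linear lookup from scalar value to color and opacity. The Hounsfield-unit breakpoints and colors below are invented for illustration and are not a clinical preset.

```python
def transfer_function(hu):
    """Illustrative piecewise-linear transfer function mapping a CT value
    in Hounsfield units (HU) to (r, g, b, opacity). Breakpoints are
    made up for the example: air/lung transparent, soft tissue faintly
    red, bone increasingly opaque white."""
    if hu < -100:                          # air / lung: fully transparent
        return (0.0, 0.0, 0.0, 0.0)
    if hu < 300:                           # soft tissue: faint red
        w = (hu + 100) / 400.0             # 0 at -100 HU, 1 at 300 HU
        return (0.8, 0.3, 0.3, 0.1 * w)
    # bone: opaque white, opacity ramping up toward 1000 HU
    w = min((hu - 300) / 700.0, 1.0)
    return (1.0, 1.0, 0.9, 0.3 + 0.7 * w)

rgba_air = transfer_function(-500)         # transparent
rgba_bone = transfer_function(1200)        # fully opaque
```

During compositing, each interpolated sample value is passed through such a function before the resulting color and opacity are accumulated along the ray, which is what lets one dataset yield distinct views of skin, soft tissue, or bone.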

References
