Structured light
from Wikipedia

A structured light pattern projected onto a surface (left)

Structured light is a method that measures the shape and depth of a three-dimensional object by projecting a pattern of light onto the object's surface. The pattern can consist of stripes, grids, or dots. The resulting distortions of the projected pattern reveal the object's solid geometry through triangulation, enabling the creation of a 3D model of the object. The scanning process relies on coding techniques for accurate, detailed measurement. The most widely used coding techniques are binary, Gray, and phase-shifting, each offering distinct advantages and drawbacks.

Structured light technology is applied across diverse fields, including industrial quality control, where it is used for precision inspection and dimensional analysis, and cultural heritage preservation, where it assists in the documentation and restoration of archaeological artifacts. In medical imaging, it facilitates non-invasive diagnostics and detailed surface mapping, particularly in applications such as dental scanning and orthotics. Consumer electronics integrate structured light technology, with applications ranging from facial recognition systems in smartphones to motion-tracking devices like Kinect. Some implementations, especially in facial recognition, use infrared structured light to enhance accuracy under varying lighting conditions.

Process

Structured light sources on display at the 2014 Machine Vision Show in Boston
An Automatix Seamtracker arc welding robot equipped with a camera and structured laser light source, enabling the robot to follow a welding seam automatically

Structured light measurement is a technique used to determine the three-dimensional coordinates of points on an object's surface. It involves a projector and a camera positioned at a fixed distance from each other, known as the baseline, and oriented at specific angles. The projector casts a structured light pattern, consisting of stripes, grids, or dots, onto the object's surface. The camera then captures the distortions in this pattern caused by the object's solid geometry, which reveal the surface shape. By analyzing these distortions, depth values can be calculated.[1][2]

The measurement process relies on triangulation, using the baseline distance and known angles to calculate depth from the pattern's displacement via trigonometric principles. When structured light hits a non-planar surface, the pattern distorts predictably, enabling a 3D reconstruction of the surface. Accurate reconstruction depends on system calibration—which establishes the precise geometric relationship between the projector and camera to prevent depth errors and, consequently, geometric distortions from misalignment—as well as the use of pattern analysis algorithms.[1][2][3]
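The geometry of this calculation can be illustrated with a short sketch. The function below is a simplified, hypothetical example: it assumes the projector and camera rays to a surface point are already known as angles measured from the baseline, whereas a real system recovers those angles from calibration and from the decoded pattern position.

```python
import numpy as np

def depth_from_angles(baseline_m, proj_angle_rad, cam_angle_rad):
    """Triangulate the perpendicular distance of a surface point from the
    projector-camera baseline.

    The projector and camera sit at opposite ends of the baseline, and each
    sees the same surface point at a known angle from the baseline; the
    intersection of the two rays gives the depth.
    """
    ta = np.tan(proj_angle_rad)
    tb = np.tan(cam_angle_rad)
    return baseline_m * ta * tb / (ta + tb)

# Illustrative values only: a 0.30 m baseline with rays at 60 and 55 degrees.
z = depth_from_angles(0.30, np.radians(60.0), np.radians(55.0))
print(f"depth = {z:.3f} m")
```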

Types of coding


Structured light scanning relies on various coding techniques for 3D shape measurement. The most widely used ones are binary, Gray, and phase-shifting. Each method presents distinct advantages and drawbacks in terms of accuracy, computational complexity, sensitivity to noise, and suitability for dynamic objects. Binary and Gray coding offer reliable, fast scanning for static objects, while phase-shifting provides higher detail. Hybrid methods, such as binary defocusing and Fourier transform profilometry (FTP), balance speed and accuracy, enabling real-time scanning of moving 3D objects.[2][3][4]

Binary coding


Binary coding uses alternating black and white stripes, where each stripe represents a binary digit. This method is computationally efficient and widely employed due to its simplicity. However, it requires the projection of multiple patterns sequentially to achieve high spatial resolution. While this approach is effective for scanning static objects, it is less suitable for dynamic scenes due to the need for multiple image captures. In addition, the accuracy of binary coding is constrained by projector and camera pixel resolution, and it needs precise thresholding algorithms to distinguish projected stripes accurately.[4]

Gray coding


Gray coding, named after physicist Frank Gray, is a binary encoding scheme designed to minimize errors by ensuring that only one bit changes at a time between successive values. This reduces transition errors, making it particularly useful in applications such as analog-to-digital conversion and optical scanning.[5] In structured light scanning, where Gray codes are used for pattern projection, a drawback arises as more patterns are projected: the stripes become progressively narrower, which can make them harder for cameras to detect accurately, especially in noisy environments or with limited resolution. To mitigate this issue, advanced variations such as complementary Gray codes and phase-shifted Gray code patterns have been developed. These techniques introduce opposite or phase-aligned patterns to enhance robustness as well as to aid in error detection and correction in complex scanning environments.[2][6]

Phase-shifting


Phase-shifting techniques use sinusoidal wave patterns that gradually shift across multiple frames to measure depth. Unlike binary and Gray coding, which provide depth in discrete steps, phase-shifting allows for smooth, continuous depth measurement, resulting in higher precision. The main challenges are depth ambiguity and sensitivity to motion. Because the wave pattern repeats, exact distances cannot be determined without extra reference data or additional processing, and because multiple images are needed, the method is not ideal for moving objects, as motion can create distortions and introduce artifacts in the measurement.[4]

Hybrid methods


To address the limitations of phase-shifting in dynamic environments, binary defocusing techniques have been developed, in which binary patterns are deliberately blurred to approximate sinusoidal waves. This approach integrates the efficiency of binary projection with the precision of phase-shifting, enabling high-speed 3D shape capture. Advances in high-speed digital light processing (DLP) projectors have further supported the adoption of these hybrid methods in applications requiring real-time scanning, including biomedical imaging and industrial inspection.[3]
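The core idea of binary defocusing can be shown in a few lines. The snippet below is a rough simulation, not a description of any particular system: it blurs a one-dimensional square-wave stripe pattern with a Gaussian (a stand-in for projector defocus) and compares the result with an ideal sinusoid of the same period; the period and blur width are arbitrary illustration values.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

period = 32                                                     # stripe period in pixels
x = np.arange(1024)
binary = (np.cos(2 * np.pi * x / period) >= 0).astype(float)   # 0/1 square wave
defocused = gaussian_filter1d(binary, sigma=period / 8)         # simulated defocus
ideal = 0.5 + 0.5 * np.cos(2 * np.pi * x / period)              # target sinusoid

# The blurred binary pattern approaches the sinusoid, so it can be fed to
# standard phase-shifting analysis.
print("RMS deviation from ideal sinusoid:",
      np.sqrt(np.mean((defocused - ideal) ** 2)))
```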

Fourier transform profilometry (FTP) measures the shape of an object using a single image of a projected pattern. It analyzes how the pattern deforms over the surface, enabling fast, full-field 3D shape measurement, even for moving objects. The process involves applying a Fourier transform to convert the image into frequency data, filtering out unwanted components, and performing an inverse transform to extract depth information. Although FTP is often used alone, hybrid systems sometimes combine it with phase-shifting profilometry (PSP) or dual-frequency techniques to improve accuracy while maintaining high speed.[7][8]
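A minimal sketch of the FTP pipeline is given below, assuming vertical fringes with a known carrier period; it omits reference-plane subtraction and phase unwrapping, which a complete system would also need. The function name and parameters are illustrative, not from any particular library.

```python
import numpy as np

def ftp_wrapped_phase(image, carrier_period_px, band_halfwidth=0.3):
    """Fourier transform profilometry on a single fringe image.

    Each row is transformed to the frequency domain, the +f0 sideband
    (which carries the surface-modulated phase) is isolated with a
    band-pass mask, and the inverse transform yields a complex signal
    whose angle, after removing the carrier, is the wrapped phase.
    """
    rows, cols = image.shape
    f0 = 1.0 / carrier_period_px                     # carrier frequency, cycles/pixel
    spectrum = np.fft.fft(image.astype(float), axis=1)
    freqs = np.fft.fftfreq(cols)                     # frequency axis, cycles/pixel

    mask = np.abs(freqs - f0) < band_halfwidth * f0  # keep only the +f0 sideband
    analytic = np.fft.ifft(spectrum * mask[np.newaxis, :], axis=1)

    carrier = np.exp(1j * 2 * np.pi * f0 * np.arange(cols))  # remove carrier term
    return np.angle(analytic * np.conj(carrier))              # wrapped phase, (-pi, pi]
```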

from Grokipedia
Structured light is a method that captures the shape and dimensions of an object by projecting a known pattern of light, such as stripes, grids, or coded sequences, onto its surface and analyzing the deformation of the pattern with one or more cameras. This technique relies on the principles of triangulation: the projector and camera form a setup where the displacement of the projected features due to the object's geometry allows calculation of depth and surface coordinates with high precision, often achieving sub-millimeter accuracy. The concept of structured light scanning originated in the 1970s with early experiments in projecting light patterns for range measurement, gaining prominence in later decades through advancements in digital projectors and image processing. Key developments include binary and Gray coding for pattern decoding, and phase-shifting methods, enabling faster and more robust reconstruction. Modern systems use high-speed digital light processing (DLP) projectors and sensors to achieve real-time scanning rates of up to 80 frames per second. Structured light scanning has wide applications in industrial inspection, biomedical fields such as facial and dental scanning, and cultural heritage preservation for digitizing artifacts. It offers non-contact, high-resolution measurement suitable for delicate objects, though challenges include handling reflective or transparent surfaces and ambient lighting interference, addressed by ongoing advances in coding techniques and multi-wavelength illumination.

Fundamentals

Definition and Principles

Structured light refers to engineered optical fields that exhibit controlled spatial and temporal variations in their amplitude, phase, polarization, or other degrees of freedom, enabling the creation of light beams with complex structures beyond traditional Gaussian profiles. This tailoring allows light to carry additional information, such as orbital angular momentum (OAM), or perform specialized functions like non-diffracting propagation or self-bending trajectories. The foundational principles of structured light involve manipulating the fundamental properties of electromagnetic waves. Light's electric field can be decomposed into amplitude (intensity distribution), phase (wavefront shape), and polarization (orientation of oscillations). By engineering these degrees of freedom, individually or in combination, researchers can sculpt fields to achieve desired spatial structures. For instance, phase structuring can create helical wavefronts that twist around the beam axis, while polarization structuring produces vector beams with spatially varying polarization states. These principles expand the dimensionality of light-matter interactions, enabling applications from high-capacity optical communications to precise optical manipulation. Unlike uniform Gaussian beams, structured fields maintain intricate patterns during propagation under paraxial conditions, governed by the paraxial wave equation in free space. A key aspect of structured light is its ability to encode information in multiple independent channels, such as spatial modes or polarization states, increasing the information capacity of optical systems. This is particularly evident in beams carrying OAM, where photons possess an additional angular momentum beyond spin, allowing for multiplexing in dimensions orthogonal to wavelength and polarization.

Beam Geometry

In structured light systems, beam geometry describes the spatial configuration of the light field, particularly how wavefront curvature and phase distributions define propagation characteristics. For OAM-carrying beams, such as Laguerre-Gaussian (LG) modes, the wavefront exhibits a helical structure, characterized by a phase singularity at the beam center and an azimuthal phase variation. The electric field of an LG beam can be expressed in cylindrical coordinates (r, \phi, z) with a phase term \exp(i l \phi), where l is the topological charge representing the number of helical twists and \phi is the azimuthal angle. This imparts an orbital angular momentum of l \hbar per photon along the propagation direction z. The propagation of structured beams follows the paraxial wave equation, derived from the scalar Helmholtz equation under small-angle approximations. For LG modes, the radial intensity profile is described by associated Laguerre polynomials, giving a doughnut-shaped intensity with a dark central spot for l ≠ 0. This geometry enables non-diffracting or self-healing properties in certain beams, like Bessel beams, which maintain their transverse profile over distance due to their conical wavefront structure. Polarization geometry adds another layer, with cylindrical vector beams featuring radially or azimuthally polarized light, where the polarization direction aligns with or is perpendicular to the radial direction. These structures are analyzed using Stokes-parameter or Poincaré-sphere representations to quantify spatial variations. In practice, generating such geometries requires precise control, often via spatial light modulators or metasurfaces, to shape the incident light field accurately. Calibration of these devices ensures faithful reproduction of the desired beam geometry, accounting for factors like wavelength and input beam quality.
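For reference, a commonly quoted simplified form of the LG field at the beam waist is sketched below; the Gouy phase and wavefront-curvature factors that appear away from the waist are omitted, and the normalization constant is dropped.

```latex
% Laguerre-Gaussian mode at the waist (z = 0), radial index p, topological charge l:
E_{p,l}(r,\phi) \;\propto\;
\left(\frac{\sqrt{2}\,r}{w_0}\right)^{|l|}
L_p^{|l|}\!\left(\frac{2r^2}{w_0^2}\right)
\exp\!\left(-\frac{r^2}{w_0^2}\right)
\exp(i\,l\,\phi),
\qquad
L_z = l\hbar \ \text{per photon},
```

where w_0 is the beam waist radius and L_p^{|l|} is the associated Laguerre polynomial.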

History

Early Developments

The concept of structured light in optics, involving engineered light fields with controlled spatial and temporal structures, has roots in fundamental studies of light propagation and interference dating back to the late 19th and early 20th centuries, but practical engineering of non-Gaussian beams emerged in the late 20th century. Early theoretical work on laser modes, such as the description of Laguerre-Gaussian (LG) beams in the 1960s and 1970s, laid groundwork, though experimental realization was limited. A pivotal advancement occurred in the early 1990s with the recognition that light beams possessing helical phase structures carry orbital angular momentum (OAM). The seminal 1992 paper by Les Allen and colleagues demonstrated how LG beams with phase singularities impart a twist to the wavefront, enabling photons to carry OAM beyond spin angular momentum. This discovery expanded the degrees of freedom available for light manipulation and marked the foundational moment for modern structured light in optics. Parallel to these optics developments, structured light techniques in the context of 3D surface profiling trace back to the mid-20th century. In the 1960s, initial experiments used projectors and cameras for basic non-contact 3D object profiling. The 1970s brought key innovations, including Hiroaki Takasaki's 1970 Moire topography method, which projected gratings to produce contour lines via interference fringes for surface mapping. In 1973, G.J. Agin and T.O. Binford advanced slit projection for computational 3D object recognition in industrial settings. These methods later adopted coding schemes inspired by Frank Gray's 1953 patent to reduce errors in pattern decoding.

Key Advancements

The following decades saw rapid progress in generating and applying structured light in optics, with techniques like computer-generated holograms and spatial light modulators enabling the creation of diverse beams, including Bessel beams for non-diffracting propagation (first demonstrated in 1987) and Airy beams for self-bending paths (2007). Vector beams with spatially varying polarization were developed around 2000, enhancing applications in areas such as optical communications. Detection methods, such as mode sorting, also matured, supporting OAM multiplexing for high-capacity optical data transmission. In parallel, for 3D-scanning applications, the 1990s integrated digital projectors (e.g., DLP technology) and CCD cameras, improving pattern projection and image capture for reconstruction. Phase-shifting algorithms emerged, providing sub-pixel accuracy by analyzing sinusoidal fringe phases. The 2000s introduced hybrid coding (e.g., phase-shifting combined with binary) for real-time scanning at over 30 Hz. The 2010s accelerated both fields: in optics, spatiotemporal control and higher-dimensional encoding advanced, including time-varying OAM; in 3D applications, consumer adoption surged with the 2010 Microsoft Kinect, which used speckle patterns for real-time depth sensing at 30 fps, impacting gaming and human-computer interaction. Handheld scanners like the Artec Eva (2012) achieved 0.1 mm accuracy for AR/VR. More recently, AI integration has transformed processing in both domains: machine learning models assist in phase unwrapping and reconstruction, improving speeds up to 10-fold while achieving sub-millimeter precision in challenging conditions. Additionally, multi-wavelength structured light using metasurfaces for high-density dot projection in the visible spectrum (e.g., 405 nm, 532 nm, 633 nm) enhances resolution and reduces errors on colorful or low-reflectivity surfaces.

System Components and Process

Hardware Elements

Structured light systems rely on several core hardware components to project patterns and capture deformations for 3D reconstruction. The light projector, typically a Digital Light Processing (DLP) or Liquid Crystal Display (LCD) device, generates and projects structured patterns onto the target object. DLP projectors, such as the Texas Instruments DLP4500 or DLP6500FLQ, are commonly used due to their high contrast ratios (e.g., greater than 1000:1) and micromirror arrays that enable precise pattern control. A high-resolution camera, often employing Charge-Coupled Device (CCD) or Complementary Metal-Oxide-Semiconductor (CMOS) sensors, captures the deformed patterns; examples include the Sony IMX342 CMOS sensor with 31.4 megapixels or Point Grey Grasshopper3 models supporting global or rolling shutters. These components are mounted on a rigid rig that establishes a fixed baseline distance and angle, typically around 30 degrees between the projector and camera optical axes, to facilitate triangulation-based depth computation.

Supporting elements enhance system accuracy and robustness. Calibration targets, such as ChArUco boards with checker patterns of known dimensions (e.g., 400x300 mm with 15 mm squares), are essential for aligning the projector and camera coordinate systems. Optical filters, including narrow-band spectral or polarization filters on the camera, reject ambient light interference by suppressing broadband sunlight or unpolarized sources, improving signal-to-noise ratios in non-ideal environments. A computing unit, such as a PC with machine-vision libraries or an embedded System-on-Chip (SoC) like the AM57xx, handles real-time image processing and pattern decoding.

Key specifications ensure reliable performance. Projectors generally require a minimum resolution of 1024x768 pixels, with higher-end models such as WXGA (1280x800) providing finer pattern details for improved depth accuracy. Cameras support frame rates up to 60 fps for capturing dynamic scenes, though typical rates range from 15-30 fps depending on the USB interface and exposure settings. Synchronization mechanisms, such as trigger cables connecting projector GPIO to camera inputs or software-based timing, align projection and capture to within microseconds, preventing motion artifacts. For handling complex geometries with occlusions or large surfaces, variations include multi-projector setups, where multiple DLP units project overlapping patterns to cover non-line-of-sight areas, calibrated via shared camera views. Single-projector configurations suffice for simpler objects but limit coverage compared to multi-projector arrays.

Projection and Reconstruction Process

In structured light systems, the projection and reconstruction process begins with the emission of a precisely calibrated pattern from a projector onto the target's surface, creating a reference grid or fringe pattern that encodes spatial information. As the pattern interacts with the object's surface, it deforms in a manner proportional to the surface contours. A synchronized camera, positioned at an angle to the projector, captures one or more images of this distorted pattern, recording the intensity variations that reflect the three-dimensional shape. This capture step relies on high-frame-rate sensors to minimize motion artifacts in dynamic scenes.

Following image acquisition, the captured data undergoes decoding to establish pixel-to-pixel correspondences between the projector and camera views, identifying unique features such as stripe shifts or phase differences within the deformed pattern. These correspondences enable the computation of depth values through triangulation, where the disparity in pattern positions is used to calculate the 3D coordinates of surface points, forming an initial representation of the object. The resulting point cloud captures the geometric structure with sub-millimeter accuracy in controlled environments, depending on system calibration and pattern density.

The processing pipeline refines this raw data for usability. Pre-processing involves noise-reduction techniques, such as Gaussian filtering or background subtraction, to enhance contrast and suppress artifacts from sensor noise or minor distortions. Correspondence matching then refines the initial decoding, often using optimization algorithms to resolve ambiguities in overlapping features. A depth map is subsequently generated by aggregating the triangulated depths into a 2D grid aligned with the camera's view, providing a dense representation of surface elevations. Finally, mesh reconstruction converts the point cloud into a polygonal surface model, typically via algorithms like Poisson surface reconstruction, enabling further analysis or visualization. This pipeline ensures robust output but requires computational resources proportional to resolution.

For real-time applications, such as scanning moving objects, synchronization between projector and camera is critical to align pattern emission with image capture, preventing temporal mismatches that could degrade accuracy. Computational efficiency is achieved through parallel processing on GPUs, allowing video-rate reconstruction. These optimizations balance point density and speed without sacrificing essential geometric detail.

Common error sources include ambient light interference, which reduces pattern contrast by adding unwanted illumination and can lead to decoding failures, particularly in outdoor or brightly lit settings. Mitigation strategies involve capturing multiple exposures at varying intensities or applying bandpass optical filters to isolate the projected wavelengths, thereby preserving signal integrity. Surface reflectivity poses another challenge, causing specular highlights or diffuse scattering that distort pattern visibility on shiny or translucent materials, resulting in incomplete or erroneous point clouds. To address this, systems employ multi-exposure techniques to normalize intensity variations or use temporary surface treatments like matte whitening sprays to diffuse reflections uniformly, improving reconstruction reliability across diverse materials.
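The back-projection step from a depth map to a point cloud can be sketched with a pinhole camera model, as below. The intrinsics (focal lengths and principal point) would normally come from calibration; the values in the example are made up for illustration.

```python
import numpy as np

def depth_map_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a dense depth map (camera frame, metres) into a 3-D point
    cloud using a pinhole camera model.  fx, fy are focal lengths in pixels
    and (cx, cy) is the principal point."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]          # keep only pixels with valid depth

# Illustrative use: a flat synthetic 4x4 depth map and made-up intrinsics.
cloud = depth_map_to_point_cloud(np.full((4, 4), 0.5), fx=600.0, fy=600.0, cx=2.0, cy=2.0)
print(cloud.shape)   # (16, 3)
```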

Coding Techniques

Binary Coding

Binary coding represents one of the earliest and simplest approaches to pattern projection in structured light systems for 3D surface reconstruction. This method involves projecting a sequence of black-and-white stripe patterns onto the object surface, where each pattern corresponds to one bit of a binary representation. By capturing the deformed patterns with a camera, each point on the surface receives a unique codeword based on its illumination state across the sequence, enabling correspondence establishment between projector and camera coordinates for triangulation-based depth computation. The encoding process utilizes n distinct binary patterns to generate up to 2^n unique identifiers for surface points. Each pattern alternates between illuminated (white, representing bit 1) and non-illuminated (black, representing bit 0) stripes, with the stripe width typically set to cover multiple projector pixels for robustness. During projection, the patterns are displayed sequentially on a static scene. Decoding occurs by thresholding the captured image intensities at each pixel: if the intensity exceeds a predefined threshold (often the midpoint between the black and white levels), the bit is assigned 1; otherwise, 0. The unique code for a pixel is then converted to a decimal position value using the formula \text{position} = \sum_{i=0}^{n-1} 2^i \cdot b_i, where b_i is the binary bit from the i-th pattern. This assigns an absolute coordinate along the projector's axis, facilitating triangulation. A key advantage of binary coding is its high operational speed, as the number of required patterns scales logarithmically with the desired resolution; for instance, 10 patterns suffice for 1024 unique codes, allowing rapid acquisition even with standard projectors. Additionally, the binary nature provides robustness to ambient light and surface reflectivity variations, since decoding relies on simple threshold decisions rather than precise intensity measurements. However, binary coding suffers from inherently coarse spatial resolution, limited discretely to 2^n steps, which can result in quantization artifacts for fine details. Furthermore, in standard binary sequences, errors during decoding, such as those from shadows or specular reflections, can propagate significantly, as adjacent codes often differ by multiple bits, leading to large positional jumps and reconstruction inaccuracies at boundaries.
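A minimal decoding sketch is shown below, assuming fully lit ("white") and unlit ("black") reference images have been captured so a per-pixel threshold can be formed; the array names are illustrative.

```python
import numpy as np

def decode_binary_patterns(captures, white_ref, black_ref):
    """Decode a stack of binary-stripe captures into per-pixel stripe indices.

    captures  : (n, H, W) array of captured images, captures[0] holding the
                least significant bit pattern
    white_ref : image of the fully illuminated scene
    black_ref : image of the unlit scene
    Each bit is 1 where the capture exceeds the midpoint between the white
    and black references, and the bits are combined as sum(2^i * b_i).
    """
    threshold = (white_ref.astype(float) + black_ref.astype(float)) / 2.0
    bits = captures.astype(float) > threshold             # (n, H, W) booleans
    weights = 2 ** np.arange(captures.shape[0])           # 1, 2, 4, ...
    return np.tensordot(weights, bits.astype(np.int64), axes=([0], [0]))
```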

Gray Coding

Gray coding serves as an error-resistant variant of binary coding in structured light systems, utilizing binary patterns designed such that adjacent codes differ by only one bit to mitigate transition errors during decoding. This method ensures that a single misdetected boundary affects only one bit in the codeword, preventing error propagation across multiple bits that could occur in standard binary sequences. Patterns are generated following the Gray code sequence (for example, for two bits: 00, 01, 11, 10), allowing unique identification of up to 2^n positions with n projected patterns. Introduced by Inokuchi et al. in their pioneering work on range imaging, the approach enhances robustness by reducing sensitivity to noise and distortions. In practice, Gray-encoded stripe patterns are projected sequentially onto the object, with each pattern encoding one bit of the position code for projector columns or rows. The camera captures the distorted projections, and for each pixel, the sequence of illumination states (bright or dark) yields the Gray code value corresponding to that pixel's projector coordinate. Decoding proceeds by thresholding the captured intensities to binary values and combining them into the full Gray codeword, followed by conversion to standard binary coordinates via iterative bitwise XOR operations. The key conversion formulas are b_n = g_n (the most significant bit) and b_i = g_i \oplus b_{i+1} for i = n-1 down to 0, where b_i is the i-th binary bit, g_i is the i-th Gray bit, and \oplus denotes XOR. This process recovers the absolute position efficiently and is particularly advantageous in noisy environments, such as those featuring specular surfaces, where intensity variations or slight misalignments might otherwise cause large positional discrepancies. Compared to standard binary coding, Gray coding demonstrates higher reliability, with studies showing reduced decoding errors in the presence of effects like interreflections. For example, it achieves accurate depth reconstruction over ranges of 600–1200 mm using 11 patterns, outperforming conventional binary coding by minimizing erroneously decoded pixels in challenging scenes. However, its resolution remains constrained to discrete levels based on the number of patterns (e.g., 10 patterns resolve 1024 positions), necessitating complementary techniques for sub-pixel precision.
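The Gray-to-binary conversion can be applied to the whole decoded image stack at once, as in the sketch below; the bit ordering (most significant bit first) is an assumption made for illustration.

```python
import numpy as np

def gray_stack_to_positions(gray_bits):
    """Convert per-pixel Gray-code bits into plain binary stripe positions.

    gray_bits : (n, H, W) boolean array, gray_bits[0] holding the most
                significant bit.
    Implements the recurrence above (MSB passes through, each later binary
    bit is the XOR of the current Gray bit with the previously recovered
    binary bit), then weights the bits by powers of two.
    """
    binary = np.empty_like(gray_bits)
    binary[0] = gray_bits[0]                              # MSB copies straight over
    for i in range(1, gray_bits.shape[0]):
        binary[i] = np.logical_xor(gray_bits[i], binary[i - 1])
    weights = 2 ** np.arange(gray_bits.shape[0])[::-1]    # MSB-first weights
    return np.tensordot(weights, binary.astype(np.int64), axes=([0], [0]))
```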

Phase-Shifting

Phase-shifting is a high-precision coding technique in structured light systems that employs sinusoidal fringe patterns to achieve sub-pixel resolution in 3D surface reconstruction. The method involves projecting multiple phase-shifted sinusoidal patterns onto the object surface, typically three or four frames shifted by equal intervals such as 0°, 120°, and 240° for the three-step approach. These patterns create interference fringes whose deformation on the object's surface encodes depth information. A camera captures the reflected intensities, yielding a wrapped phase map that represents the relative phase shifts caused by surface contours, which can then be mapped to 3D coordinates via triangulation geometry. The phase at each pixel is computed from the captured intensity images using an arctangent relation. For the three-step method, with intensities I_1, I_2, and I_3 corresponding to phase shifts of 0, 2\pi/3, and 4\pi/3, the wrapped phase \phi is given by

\phi = \operatorname{atan2}\left( \sqrt{3}\,(I_1 - I_3),\; 2I_2 - I_1 - I_3 \right)
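A direct implementation of this three-step formula is sketched below; it returns the wrapped phase in (-π, π], which still requires phase unwrapping and calibration before conversion to depth.

```python
import numpy as np

def three_step_wrapped_phase(i1, i2, i3):
    """Wrapped phase from three fringe images shifted by 0, 2*pi/3 and 4*pi/3,
    evaluated per pixel with the arctangent formula given above."""
    i1 = i1.astype(float)
    i2 = i2.astype(float)
    i3 = i3.astype(float)
    return np.arctan2(np.sqrt(3.0) * (i1 - i3), 2.0 * i2 - i1 - i3)
```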