Sampling (signal processing)
In signal processing, sampling is the reduction of a continuous-time signal to a discrete-time signal. A common example is the conversion of a sound wave to a sequence of "samples". A sample is a value of the signal at a point in time and/or space; this definition differs from the term's usage in statistics, which refers to a set of such values.[A]
A sampler is a subsystem or operation that extracts samples from a continuous signal. A theoretical ideal sampler produces samples equivalent to the instantaneous value of the continuous signal at the desired points.
The original signal can be reconstructed from a sequence of samples, up to the Nyquist limit, by passing the sequence of samples through a reconstruction filter.
Theory
Sampling can be done for functions varying in space, time, or any other dimension, and similar results are obtained in two or more dimensions.
For functions that vary with time, let $s(t)$ be a continuous function (or "signal") to be sampled, and let sampling be performed by measuring the value of the continuous function every $T$ seconds, which is called the sampling interval or sampling period.[1][2] Then the sampled function is given by the sequence:
- $s(nT)$, for integer values of $n$.
The sampling frequency or sampling rate, $f_s$, is the average number of samples obtained in one second, thus $f_s = 1/T$, with the unit samples per second, sometimes referred to as hertz; for example, 48 kHz is 48,000 samples per second.
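As a concrete illustration of the relation $f_s = 1/T$ and of the sequence $s(nT)$, the following Python sketch (using NumPy) samples a sine wave and reports the sampling period; the 1 kHz tone, 48 kHz rate, and variable names are illustrative assumptions, not part of the article.

```python
import numpy as np

fs = 48_000          # sampling rate, samples per second (illustrative)
T = 1.0 / fs         # sampling period, seconds
f0 = 1_000           # frequency of the example tone, Hz

n = np.arange(480)                   # integer sample indices (10 ms worth)
t = n * T                            # sampling instants t = nT
s = np.sin(2 * np.pi * f0 * t)       # s(nT): the sampled sequence

print(f"T = {T * 1e6:.2f} microseconds; {len(s)} samples cover {len(s) * T * 1e3:.1f} ms")
```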
Reconstructing a continuous function from samples is done by interpolation algorithms. The Whittaker–Shannon interpolation formula is mathematically equivalent to an ideal low-pass filter whose input is a sequence of Dirac delta functions that are modulated (multiplied) by the sample values. When the time interval between adjacent samples is a constant $T$, the sequence of delta functions is called a Dirac comb. Mathematically, the modulated Dirac comb is equivalent to the product of the comb function with $s(t)$. That mathematical abstraction is sometimes referred to as impulse sampling.[3]
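The Whittaker–Shannon formula reconstructs $s(t)$ as $\sum_n s(nT)\,\operatorname{sinc}((t - nT)/T)$. The sketch below evaluates that sum directly over a finite block of samples; it is a minimal, truncated illustration rather than a practical reconstruction filter, and the tone frequency, rate, and function names are assumptions.

```python
import numpy as np

def whittaker_shannon(samples, T, t):
    """Evaluate sum_n s[n] * sinc((t - n*T)/T) at the times in t (finite-sum approximation)."""
    n = np.arange(len(samples))
    # np.sinc(x) is sin(pi*x)/(pi*x), matching the normalized sinc in the formula
    return np.array([np.sum(samples * np.sinc((ti - n * T) / T)) for ti in t])

fs, f0 = 8_000, 440.0                 # example sampling rate and a tone well below fs/2
T = 1.0 / fs
n = np.arange(64)
samples = np.sin(2 * np.pi * f0 * n * T)

t_fine = np.linspace(8 * T, 56 * T, 500)     # interior points, away from the truncation edges
reconstructed = whittaker_shannon(samples, T, t_fine)
# Interpolation error: small but nonzero because the infinite sum is truncated to 64 terms
print(np.max(np.abs(reconstructed - np.sin(2 * np.pi * f0 * t_fine))))
```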
Most sampled signals are not simply stored and reconstructed. The fidelity of a theoretical reconstruction is a common measure of the effectiveness of sampling. That fidelity is reduced when $s(t)$ contains frequency components whose cycle length (period) is less than 2 sample intervals (see Aliasing). The corresponding frequency limit, in cycles per second (hertz), is $\tfrac{1}{2}$ cycle/sample × $f_s$ samples/second $= f_s/2$, known as the Nyquist frequency of the sampler. Therefore, $s(t)$ is usually the output of a low-pass filter, functionally known as an anti-aliasing filter. Without an anti-aliasing filter, frequencies higher than the Nyquist frequency will influence the samples in a way that is misinterpreted by the interpolation process.[4]
Practical considerations
In practice, the continuous signal is sampled using an analog-to-digital converter (ADC), a device with various physical limitations. This results in deviations from the theoretically perfect reconstruction, collectively referred to as distortion.
Various types of distortion can occur, including:
- Aliasing. Some amount of aliasing is inevitable because only theoretical, infinitely long, functions can have no frequency content above the Nyquist frequency. Aliasing can be made arbitrarily small by using a sufficiently large order of the anti-aliasing filter.
- Aperture error results from the fact that the sample is obtained as a time average within a sampling region, rather than just being equal to the signal value at the sampling instant.[5] In a capacitor-based sample and hold circuit, aperture errors are introduced by multiple mechanisms. For example, the capacitor cannot instantly track the input signal and the capacitor cannot instantly be isolated from the input signal.
- Jitter or deviation from the precise sample timing intervals.
- Noise, including thermal sensor noise, analog circuit noise, etc.
- Slew rate limit error, caused by the inability of the ADC input value to change sufficiently rapidly.
- Quantization as a consequence of the finite precision of words that represent the converted values.
- Error due to other non-linear effects of the mapping of input voltage to converted output value (in addition to the effects of quantization).
Although the use of oversampling can completely eliminate aperture error and aliasing by shifting them out of the passband, this technique cannot be practically used above a few GHz, and may be prohibitively expensive at much lower frequencies. Furthermore, while oversampling can reduce quantization error and non-linearity, it cannot eliminate these entirely. Consequently, practical ADCs at audio frequencies typically exhibit neither aliasing nor aperture error and are not limited by quantization error; instead, analog noise dominates. At RF and microwave frequencies, where oversampling is impractical and filters are expensive, aperture error, quantization error and aliasing can be significant limitations.
Jitter, noise, and quantization are often analyzed by modeling them as random errors added to the sample values. Integration and zero-order hold effects can be analyzed as a form of low-pass filtering. The non-linearities of either ADC or DAC are analyzed by replacing the ideal linear function mapping with a proposed nonlinear function.
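A minimal sketch of the error models described above, treating jitter as a random perturbation of the sampling instants and quantization as rounding to a uniform grid; the 100 ns jitter figure, 12-bit depth, and other parameters are purely illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
fs, f0, N = 48_000, 1_000, 4_096
t = np.arange(N) / fs
clean = np.sin(2 * np.pi * f0 * t)

# Jitter modeled as a random perturbation of the sampling instants
jitter_rms = 100e-9                                  # 100 ns RMS timing error (illustrative)
jittered = np.sin(2 * np.pi * f0 * (t + rng.normal(0, jitter_rms, N)))

# Quantization modeled as rounding to a 12-bit uniform grid spanning +/-1
bits = 12
step = 2.0 / 2**bits
quantized = np.round(jittered / step) * step

error = quantized - clean
snr_db = 10 * np.log10(np.mean(clean**2) / np.mean(error**2))
print(f"combined jitter + quantization SNR ~ {snr_db:.1f} dB")
```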
Applications
Audio sampling
Digital audio uses pulse-code modulation (PCM) and digital signals for sound reproduction. This includes analog-to-digital conversion (ADC), digital-to-analog conversion (DAC), storage, and transmission. In effect, the system commonly referred to as digital is in fact a discrete-time, discrete-level analog of a previous electrical analog. While modern systems can be quite subtle in their methods, the primary usefulness of a digital system is the ability to store, retrieve and transmit signals without any loss of quality.
When it is necessary to capture audio covering the entire 20–20,000 Hz range of human hearing,[6] such as when recording music or many types of acoustic events, audio waveforms are typically sampled at 44.1 kHz (CD), 48 kHz, 88.2 kHz, or 96 kHz.[7] The approximately double-rate requirement is a consequence of the Nyquist theorem. Sampling rates higher than about 50 kHz to 60 kHz cannot supply more usable information for human listeners. Early professional audio equipment manufacturers chose sampling rates in the region of 40 to 50 kHz for this reason.
There has been an industry trend towards sampling rates well beyond the basic requirements, such as 96 kHz and even 192 kHz.[8] Even though ultrasonic frequencies are inaudible to humans, recording and mixing at higher sampling rates is effective in eliminating the distortion that can be caused by foldback aliasing. Conversely, ultrasonic sounds may interact with and modulate the audible part of the frequency spectrum (intermodulation distortion), degrading the fidelity.[9][10][11][12] One advantage of higher sampling rates is that they can relax the low-pass filter design requirements for ADCs and DACs, but with modern oversampling delta-sigma converters this advantage is less important.
The Audio Engineering Society recommends 48 kHz sampling rate for most applications but gives recognition to 44.1 kHz for CD and other consumer uses, 32 kHz for transmission-related applications, and 96 kHz for higher bandwidth or relaxed anti-aliasing filtering.[13] Both Lavry Engineering and J. Robert Stuart state that the ideal sampling rate would be about 60 kHz, but since this is not a standard frequency, recommend 88.2 or 96 kHz for recording purposes.[14][15][16][17] A more complete list of common audio sample rates is:
| Sampling rate | Use |
|---|---|
| 5,512.5 Hz | Supported in Flash.[18] |
| 8,000 Hz | Telephone and encrypted walkie-talkie, wireless intercom and wireless microphone transmission; adequate for human speech but without sibilance (ess sounds like eff (/s/, /f/)). |
| 11,025 Hz | One quarter the sampling rate of audio CDs; used for lower-quality PCM, MPEG audio and for audio analysis of subwoofer bandpasses.[citation needed] |
| 16,000 Hz | Wideband frequency extension over standard telephone narrowband 8,000 Hz. Used in most modern VoIP and VVoIP communication products.[19][unreliable source?] |
| 22,050 Hz | One half the sampling rate of audio CDs; used for lower-quality PCM and MPEG audio and for audio analysis of low frequency energy. Suitable for digitizing early 20th century audio formats such as 78s and AM Radio.[20] |
| 32,000 Hz | miniDV digital video camcorder, video tapes with extra channels of audio (e.g. DVCAM with four channels of audio), DAT (LP mode), Germany's Digitales Satellitenradio, NICAM digital audio, used alongside analogue television sound in some countries. High-quality digital wireless microphones.[21] Suitable for digitizing FM radio.[citation needed] |
| 37,800 Hz | CD-XA audio |
| 44,055.9 Hz | Used by digital audio locked to NTSC color video signals (3 samples per line, 245 lines per field, 59.94 fields per second = 29.97 frames per second). |
| 44,100 Hz | Audio CD, also most commonly used with MPEG-1 audio (VCD, SVCD, MP3). Originally chosen by Sony because it could be recorded on modified video equipment running at either 25 frames per second (PAL) or 30 frame/s (using an NTSC monochrome video recorder) and cover the 20 kHz bandwidth thought necessary to match professional analog recording equipment of the time. A PCM adaptor would fit digital audio samples into the analog video channel of, for example, PAL video tapes using 3 samples per line, 588 lines per frame, 25 frames per second. |
| 47,250 Hz | world's first commercial PCM sound recorder by Nippon Columbia (Denon) |
| 48,000 Hz | The standard audio sampling rate used by professional digital video equipment such as tape recorders, video servers, vision mixers and so on. This rate was chosen because it could reconstruct frequencies up to 22 kHz and work with 29.97 frames per second NTSC video – as well as 25 frame/s, 30 frame/s and 24 frame/s systems. With 29.97 frame/s systems it is necessary to handle 1601.6 audio samples per frame delivering an integer number of audio samples only every fifth video frame.[13] Also used for sound with consumer video formats like DV, digital TV, DVD, and films. The professional serial digital interface (SDI) and High-definition Serial Digital Interface (HD-SDI) used to connect broadcast television equipment together uses this audio sampling frequency. Most professional audio gear uses 48 kHz sampling, including mixing consoles, and digital recording devices. |
| 50,000 Hz | First commercial digital audio recorders from the late 70s from 3M and Soundstream. |
| 50,400 Hz | Sampling rate used by the Mitsubishi X-80 digital audio recorder. |
| 64,000 Hz | Uncommonly used, but supported by some hardware[22][23] and software.[24][25] |
| 88,200 Hz | Sampling rate used by some professional recording equipment when the destination is CD (multiples of 44,100 Hz). Some pro audio gear uses (or is able to select) 88.2 kHz sampling, including mixers, EQs, compressors, reverb, crossovers, and recording devices. |
| 96,000 Hz | DVD-Audio, some LPCM DVD tracks, BD-ROM (Blu-ray Disc) audio tracks, HD DVD (High-Definition DVD) audio tracks. Some professional recording and production equipment is able to select 96 kHz sampling. This sampling frequency is twice the 48 kHz standard commonly used with audio on professional equipment. |
| 176,400 Hz | Sampling rate used by HDCD recorders and other professional applications for CD production. Four times the frequency of 44.1 kHz. |
| 192,000 Hz | DVD-Audio, some LPCM DVD tracks, BD-ROM (Blu-ray Disc) audio tracks, and HD DVD (High-Definition DVD) audio tracks, High-Definition audio recording devices and audio editing software. This sampling frequency is four times the 48 kHz standard commonly used with audio on professional video equipment. |
| 352,800 Hz | Digital eXtreme Definition, used for recording and editing Super Audio CDs, as 1-bit Direct Stream Digital (DSD) is not suited for editing. 8 times the frequency of 44.1 kHz. |
| 384,000 Hz | Maximum sample rate available in common software.[citation needed] |
| 2,822,400 Hz | SACD, 1-bit delta-sigma modulation process known as Direct Stream Digital, co-developed by Sony and Philips. |
| 5,644,800 Hz | Double-Rate DSD, 1-bit Direct Stream Digital at 2× the rate of the SACD. Used in some professional DSD recorders. |
| 11,289,600 Hz | Quad-Rate DSD, 1-bit Direct Stream Digital at 4× the rate of the SACD. Used in some uncommon professional DSD recorders. |
| 22,579,200 Hz | Octuple-Rate DSD, 1-bit Direct Stream Digital at 8× the rate of the SACD. Used in rare experimental DSD recorders. Also known as DSD512. |
| 45,158,400 Hz | Sexdecuple-Rate DSD, 1-bit Direct Stream Digital at 16× the rate of the SACD. Used in rare experimental DSD recorders. Also known as DSD1024.[B] |
Bit depth
Audio is typically recorded at 8-, 16-, and 24-bit depth, which yield a theoretical maximum signal-to-quantization-noise ratio (SQNR) for a pure sine wave of approximately 49.93 dB, 98.09 dB, and 122.17 dB, respectively.[26] CD quality audio uses 16-bit samples. Thermal noise limits the true number of bits that can be used in quantization. Few analog systems have signal-to-noise ratios (SNR) exceeding 120 dB. However, digital signal processing operations can have very high dynamic range, so it is common to perform mixing and mastering operations at 32-bit precision and then convert to 16- or 24-bit for distribution.
Speech sampling
Speech signals, i.e., signals intended to carry only human speech, can usually be sampled at a much lower rate. For most phonemes, almost all of the energy is contained in the 100 Hz – 4 kHz range, allowing a sampling rate of 8 kHz. This is the sampling rate used by nearly all telephony systems, which use the G.711 sampling and quantization specifications.[citation needed]
Video sampling
Standard-definition television (SDTV) uses either 720 by 480 pixels (US NTSC 525-line) or 720 by 576 pixels (UK PAL 625-line) for the visible picture area.
High-definition television (HDTV) uses 720p (progressive), 1080i (interlaced), and 1080p (progressive, also known as Full-HD).
In digital video, the temporal sampling rate is defined as the frame rate – or rather the field rate – rather than the notional pixel clock. The image sampling frequency is the repetition rate of the sensor integration period. Since the integration period may be significantly shorter than the time between repetitions, the sampling frequency can be different from the inverse of the sample time.
Video digital-to-analog converters operate in the megahertz range (from ~3 MHz for low quality composite video scalers in early game consoles, to 250 MHz or more for the highest-resolution VGA output).
When analog video is converted to digital video, a different sampling process occurs, this time at the pixel frequency, corresponding to a spatial sampling rate along scan lines. A common pixel sampling rate is 13.5 MHz, the luma sampling rate defined by CCIR 601 for standard-definition video.
Spatial sampling in the other direction is determined by the spacing of scan lines in the raster. The sampling rates and resolutions in both spatial directions can be measured in units of lines per picture height.
Spatial aliasing of high-frequency luma or chroma video components shows up as a moiré pattern.
3D sampling
The process of volume rendering samples a 3D grid of voxels to produce 3D renderings of sliced (tomographic) data. The 3D grid is assumed to represent a continuous region of 3D space. Volume rendering is common in medical imaging; X-ray computed tomography (CT/CAT), magnetic resonance imaging (MRI), and positron emission tomography (PET) are some examples. It is also used for seismic tomography and other applications.

Undersampling
When a bandpass signal is sampled slower than its Nyquist rate, the samples are indistinguishable from samples of a low-frequency alias of the high-frequency signal. That is often done purposefully in such a way that the lowest-frequency alias satisfies the Nyquist criterion, because the bandpass signal is still uniquely represented and recoverable. Such undersampling is also known as bandpass sampling, harmonic sampling, IF sampling, and direct IF to digital conversion.[27]
Oversampling
Oversampling is used in most modern analog-to-digital converters to reduce the distortion introduced by practical digital-to-analog converters, such as a zero-order hold instead of idealizations like the Whittaker–Shannon interpolation formula.[28]
Complex sampling
Complex sampling (or I/Q sampling) is the simultaneous sampling of two different, but related, waveforms, resulting in pairs of samples that are subsequently treated as complex numbers.[C] When one waveform, $\hat{s}(t)$, is the Hilbert transform of the other waveform, $s(t)$, the complex-valued function, $s_a(t) = s(t) + i\,\hat{s}(t)$, is called an analytic signal, whose Fourier transform is zero for all negative values of frequency. In that case, the Nyquist rate for a waveform with no frequencies ≥ B can be reduced to just B (complex samples/sec), instead of 2B (real samples/sec).[D] More apparently, the equivalent baseband waveform, $s_a(t)\,e^{-i 2\pi \frac{B}{2} t}$, also has a Nyquist rate of B, because all of its non-zero frequency content is shifted into the interval [−B/2, B/2).
Although complex-valued samples can be obtained as described above, they are also created by manipulating samples of a real-valued waveform. For instance, the equivalent baseband waveform can be created without explicitly computing $\hat{s}(t)$, by processing the product sequence $s(nT)\,e^{-i 2\pi \frac{B}{2} nT}$,[E] through a digital low-pass filter whose cutoff frequency is B/2.[F] Computing only every other sample of the output sequence reduces the sample rate commensurate with the reduced Nyquist rate. The result is half as many complex-valued samples as the original number of real samples. No information is lost, and the original waveform can be recovered, if necessary.
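The procedure just described (multiply the real samples by $e^{-i 2\pi \frac{B}{2} nT}$, low-pass filter at B/2, then keep every other sample) can be sketched as follows with NumPy and SciPy; the 8 kHz rate, tone frequencies, and filter length are illustrative assumptions rather than values from the article.

```python
import numpy as np
from scipy.signal import firwin, lfilter

fs = 8_000                 # real-valued sampling rate (assumed example)
B = fs / 2                 # treat the full band [0, fs/2] as the band of interest
n = np.arange(2_000)
T = 1.0 / fs
s = np.cos(2 * np.pi * 1_200 * n * T) + 0.5 * np.cos(2 * np.pi * 3_100 * n * T)

# Shift the band down by B/2 so its content is centred on 0 Hz
shifted = s * np.exp(-1j * 2 * np.pi * (B / 2) * n * T)   # equals s * (-1j)**n when fs = 2B

# Digital low-pass filter with cutoff B/2 (real-valued taps applied to the complex sequence)
taps = firwin(numtaps=101, cutoff=B / 2, fs=fs)
baseband = lfilter(taps, 1.0, shifted)

# Keeping every other sample halves the rate, matching the reduced (complex) Nyquist rate
baseband_decimated = baseband[::2]
print(baseband_decimated.dtype, len(baseband_decimated))   # complex samples, half as many
```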
See also
Notes
- ^ For example, "number of samples" in signal processing is roughly equivalent to "sample size" in statistics.
- ^ Even higher DSD sampling rates exist, but the benefits of those are likely imperceptible, and the size of those files would be humongous.
- ^ Sample-pairs are also sometimes viewed as points on a constellation diagram.
- ^ When the complex sample-rate is B, a frequency component at 0.6 B, for instance, will have an alias at −0.4 B, which is unambiguous because of the constraint that the pre-sampled signal was analytic. Also see Aliasing § Complex sinusoids.
- ^ When s(t) is sampled at the Nyquist frequency (1/T = 2B), the product sequence simplifies to $s(nT)\cdot(-i)^n$.
- ^ The sequence of complex numbers is convolved with the impulse response of a filter with real-valued coefficients. That is equivalent to separately filtering the sequences of real parts and imaginary parts and reforming complex pairs at the outputs.
References
- ^ Martin H. Weik (1996). Communications Standard Dictionary. Springer. ISBN 0412083914.
- ^ Tom J. Moir (2022). Rudiments of Signal Processing and Systems. Springer International Publishing AG. p. 459. doi:10.1007/978-3-030-76947-5. ISBN 9783030769475.
- ^ Rao, R. (2008). Signals and Systems. Prentice-Hall Of India Pvt. Limited. ISBN 9788120338593.
- ^ C. E. Shannon, "Communication in the presence of noise", Proc. Institute of Radio Engineers, vol. 37, no.1, pp. 10–21, Jan. 1949. Reprint as classic paper in: Proc. IEEE, Vol. 86, No. 2, (Feb 1998) Archived 2010-02-08 at the Wayback Machine
- ^ H.O. Johansson and C. Svensson, "Time resolution of NMOS sampling switches", IEEE J. Solid-State Circuits Volume: 33, Issue: 2, pp. 237–245, Feb 1998.
- ^ D'Ambrose, Christoper; Choudhary, Rizwan (2003). Elert, Glenn (ed.). "Frequency range of human hearing". The Physics Factbook. Retrieved 2022-01-22.
- ^ Self, Douglas (2012). Audio Engineering Explained. Taylor & Francis US. pp. 200, 446. ISBN 978-0240812731.
- ^ "Digital Pro Sound". Archived from the original on 20 October 2008. Retrieved 8 January 2014.
- ^ Colletti, Justin (February 4, 2013). "The Science of Sample Rates (When Higher Is Better—And When It Isn't)". Trust Me I'm a Scientist. Retrieved February 6, 2013.
in many cases, we can hear the sound of higher sample rates not because they are more transparent, but because they are less so. They can actually introduce unintended distortion in the audible spectrum
- ^ Siau, John (21 October 2010). "96 kHz vs. 192 kHz". SoundStage!HI-FI.
be very careful about any claims that 192 kHz sounds better than 96 kHz. Our experience points in the opposite direction.
- ^ "Why don't Audient Interfaces support 192 kHZ?". Audient.
We are often asked why the iD and EVO interfaces don't support 192 kHZ, because after all, aren't higher-spec numbers better? Well, in this case, not always…
- ^ "192 kHz Is Worse Than 44.1 kHz for Most Music, According to Experts". Headphonesty.
So while 192 kHz may look impressive on a spec sheet, it often leads to more system strain, more distortion, and less clarity, all in service of frequencies no human can actually hear.
- ^ a b AES5-2008: AES recommended practice for professional digital audio – Preferred sampling frequencies for applications employing pulse-code modulation, Audio Engineering Society, 2008, retrieved 2010-01-18
- ^ Lavry, Dan (May 3, 2012). "The Optimal Sample Rate for Quality Audio" (PDF). Lavry Engineering Inc.
Although 60 KHz would be closer to the ideal; given the existing standards, 88.2 KHz and 96 KHz are closest to the optimal sample rate.
- ^ Lavry, Dan. "The Optimal Sample Rate for Quality Audio". Gearslutz. Retrieved 2018-11-10.
I am trying to accommodate all ears, and there are reports of few people that can actually hear slightly above 20KHz. I do think that 48 KHz is pretty good compromise, but 88.2 or 96 KHz yields some additional margin.
- ^ Lavry, Dan. "To mix at 96k or not?". Gearslutz. Retrieved 2018-11-10.
Nowdays [sic] there are a number of good designers and ear people that find 60-70KHz sample rate to be the optimal rate for the ear. It is fast enough to include what we can hear, yet slow enough to do it pretty accurately.
- ^ Stuart, J. Robert (1998). Coding High Quality Digital Audio. CiteSeerX 10.1.1.501.6731.
both psychoacoustic analysis and experience tell us that the minimum rectangular channel necessary to ensure transparency uses linear PCM with 18.2-bit samples at 58 kHz. ... there are strong arguments for maintaining integer relationships with existing sampling rates – which suggests that 88.2 kHz or 96 kHz should be adopted.
- ^ "SWF File Format Specification - Version 19" (PDF). 2013.
- ^ "Cisco VoIP Phones, Networking and Accessories - VoIP Supply".
- ^ "The restoration procedure – part 1". Restoring78s.co.uk. Archived from the original on 2009-09-14. Retrieved 2011-01-18.
For most records a sample rate of 22050 in stereo is adequate. An exception is likely to be recordings made in the second half of the century, which may need a sample rate of 44100.
- ^ "Zaxcom digital wireless transmitters". Zaxcom.com. Archived from the original on 2011-02-09. Retrieved 2011-01-18.
- ^ "RME: Hammerfall DSP 9632". www.rme-audio.de. Retrieved 2018-12-18.
Supported sample frequencies: Internally 32, 44.1, 48, 64, 88.2, 96, 176.4, 192 kHz.
- ^ "SX-S30DAB | Pioneer". www.pioneer-audiovisual.eu. Archived from the original on 2018-12-18. Retrieved 2018-12-18.
Supported sampling rates: 44.1 kHz, 48 kHz, 64 kHz, 88.2 kHz, 96 kHz, 176.4 kHz, 192 kHz
- ^ Cristina Bachmann, Heiko Bischoff; Schütte, Benjamin. "Customize Sample Rate Menu". Steinberg WaveLab Pro. Retrieved 2018-12-18.
Common Sample Rates: 64 000 Hz
- ^ "M Track 2x2M Cubase Pro 9 can ́t change Sample Rate". M-Audio. Archived from the original on 2018-12-18. Retrieved 2018-12-18.
[Screenshot of Cubase]
- ^ "MT-001: Taking the Mystery out of the Infamous Formula, "SNR=6.02N + 1.76dB," and Why You Should Care" (PDF). Archived from the original (PDF) on 2022-10-09. Retrieved 2010-01-19.
- ^ Walt Kester (2003). Mixed-signal and DSP design techniques. Newnes. p. 20. ISBN 978-0-7506-7611-3. Retrieved 8 January 2014.
- ^ William Morris Hartmann (1997). Signals, Sound, and Sensation. Springer. ISBN 1563962837.
Further reading
- Matt Pharr, Wenzel Jakob and Greg Humphreys, Physically Based Rendering: From Theory to Implementation, 3rd ed., Morgan Kaufmann, November 2016. ISBN 978-0128006450. The chapter on sampling (available online) is nicely written with diagrams, core theory and code samples.
External links
- Journal devoted to Sampling Theory
- I/Q Data for Dummies – a page trying to answer the question Why I/Q Data?
- Sampling of analog signals – an interactive presentation in a web-demo at the Institute of Telecommunications, University of Stuttgart
Core Concepts
Definition of Sampling
In signal processing, sampling refers to the process of converting a continuous-time signal $s(t)$, where $t$ is a continuous variable representing time, into a discrete-time signal by measuring the signal's value at specific, discrete instants $t = nT$, with $T$ denoting the sampling period and $n$ an integer index.[7] This results in the sequence $s[n] = s(nT)$, which captures the signal's amplitude at regular or irregular intervals, effectively representing the original continuous waveform as a series of discrete points.[8] The ideal mathematical model for this process, known as impulse sampling, multiplies the continuous signal by a train of Dirac delta functions to produce an impulse-sampled signal $s_\delta(t) = s(t)\sum_{n=-\infty}^{\infty}\delta(t - nT)$, where the Dirac delta acts as an infinitesimal impulse that isolates the signal value at each sampling instant without distortion in the ideal case.[9]

The origins of sampling trace back to early 20th-century advancements in telephony and signal analysis, where engineers sought efficient methods for multiplexing multiple signals over limited bandwidth channels in telegraphy and telephone systems.[10] Key developments emerged in the 1920s at Bell Laboratories, where researchers like Harry Nyquist explored instantaneous sampling techniques to determine the minimum number of samples needed for transmitting signals without excessive bandwidth, laying foundational ideas for modern digital representation. These efforts were driven by the need to handle growing demands for long-distance communication, evolving from analog multiplexing to precursors of digital encoding by the 1930s.

Sampling enables the digital processing, storage, and transmission of analog signals, which is essential for applications in computing, telecommunications, and multimedia, as digital systems inherently operate on discrete data that can be manipulated algorithmically with finite resources.[7] This conversion bridges continuous physical phenomena, such as audio or sensor outputs, to discrete domains suitable for computers and networks, facilitating compression, analysis, and error correction.[8] The technique applies to both deterministic signals, which can be precisely described by mathematical functions like sinusoids, and random signals, such as noise processes, where sampling captures statistical properties rather than exact predictability, though the core process remains the measurement at discrete times.[11]

In illustrations of the impulse sampling model, the continuous signal is depicted as a smooth curve, overlaid with vertical impulses at intervals $T$ scaled by the sample values, emphasizing the ideal extraction of amplitude values; the resulting discrete sequence is often shown below as points or stems, highlighting the loss of information between samples unless further reconstruction is applied.[9] While uniform sampling maintains constant intervals $T$, the general definition encompasses both uniform and non-uniform approaches for flexibility in various applications.[10]
Uniform vs. Non-Uniform Sampling

Uniform sampling refers to the process of acquiring signal values at regular, fixed time intervals denoted by the sampling period $T$, producing a sequence of equally spaced discrete-time samples. This approach is the foundation of most digital signal processing systems due to its straightforward implementation and compatibility with standard hardware like analog-to-digital converters (ADCs).[12] The primary advantages of uniform sampling lie in its simplicity for subsequent analysis and processing; it facilitates efficient Fourier transform computations and straightforward digital filter design, as the uniform grid aligns well with discrete-time algorithms. The sampling frequency $f_s$ is defined as the reciprocal of the sampling period, given by $f_s = 1/T$, which directly relates the temporal spacing to the rate of data acquisition. For instance, compact discs (CDs) employ uniform sampling at 44.1 kHz to capture audio signals up to 20 kHz, ensuring high-fidelity digital representation through this fixed-rate mechanism.[12][13]

In contrast, non-uniform sampling involves acquiring samples at irregular time intervals, such as in jittered, adaptive, or event-driven schemes where the timing varies based on signal characteristics. This method is particularly advantageous for signals with concentrated energy or sparsity, as in compressive sensing applications, where it allows fewer samples to be taken compared to the uniform Nyquist rate, reducing data volume, power consumption, and processing overhead; for example, asynchronous ADCs have achieved average power as low as 1.9 mW for low-activity signals. However, non-uniform sampling presents significant challenges, including heightened complexity in reconstruction algorithms that must handle irregular grids, often leading to increased computational demands and potential errors in spectral analysis. A key application is in radar systems, where non-uniform sampling via compressive sensing enables high-resolution imaging of sparse scenes with reduced hardware requirements, such as fewer antenna elements, thereby lowering costs and complexity over traditional uniform methods.[14][15]

In practical implementations, deviations from perfect uniformity, such as sampling jitter (random timing fluctuations in the clock), act as a form of unintentional non-uniformity, introducing noise proportional to the signal's slew rate and degrading the signal-to-noise ratio (SNR) in a frequency-dependent manner; for example, 10 ps of jitter on a 10 MHz sinusoid can limit SNR to approximately 64 dB.[16]
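The jitter example above follows from the usual aperture-jitter bound for a full-scale sinusoid, $\mathrm{SNR} = -20\log_{10}(2\pi f\,t_j)$; a small sketch, in which the function name and second test case are illustrative assumptions:

```python
import numpy as np

def jitter_limited_snr_db(signal_freq_hz, jitter_rms_s):
    """Aperture-jitter SNR bound for a full-scale sinusoid: SNR = -20*log10(2*pi*f*tj)."""
    return -20 * np.log10(2 * np.pi * signal_freq_hz * jitter_rms_s)

print(jitter_limited_snr_db(10e6, 10e-12))   # ~64 dB, matching the example above
print(jitter_limited_snr_db(100e6, 1e-12))   # tighter jitter is needed at higher frequencies
```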
Sampling Theory
Nyquist-Shannon Sampling Theorem
The Nyquist-Shannon sampling theorem states that a continuous-time signal bandlimited to a maximum frequency $B$ (in hertz) can be completely reconstructed from its samples without loss of information if the sampling rate satisfies $f_s \ge 2B$, where $2B$ is known as the Nyquist rate.[17] This condition ensures that the discrete samples capture all the information content of the original signal, allowing perfect recovery in the absence of noise or other distortions.[17]

The theorem's foundations trace back to Harry Nyquist's 1928 analysis of telegraph transmission, where he established that the maximum data rate over a channel of bandwidth $B$ is $2B$ symbols per second to avoid intersymbol interference, implicitly linking signaling rate to bandwidth.[18] Claude Shannon formalized and extended this idea in his 1949 paper, proving the general case for bandlimited signals and providing the explicit reconstruction formula.[17] The key reconstruction equation is

$$s(t) = \sum_{n=-\infty}^{\infty} s(nT)\,\operatorname{sinc}\!\left(\frac{t - nT}{T}\right),$$

where $T = 1/f_s$ is the sampling period and $\operatorname{sinc}(x) = \sin(\pi x)/(\pi x)$ is the normalized sinc function; this interpolates the samples using ideal low-pass filtering.[17]

A signal $s(t)$ is defined as bandlimited to frequency $B$ if its Fourier transform $S(f)$ satisfies $S(f) = 0$ for all $|f| > B$, meaning it contains no energy at frequencies above $B$.[17] The proof relies on the Fourier domain: sampling in time corresponds to periodic replication of the spectrum with period $f_s$, so if $f_s \ge 2B$, the replicas do not overlap, preserving the original spectrum for recovery via an ideal low-pass filter with cutoff $f_s/2$.[17] At the critical sampling rate $f_s = 2B$, reconstruction is theoretically possible but requires an ideal sinc interpolator, which is non-causal and infinite in duration.[17] Oversampling ($f_s > 2B$) provides theoretical benefits such as improved robustness to quantization noise and easier filter design, though the core theorem holds at the Nyquist rate.[17]

The theorem assumes ideal conditions, including infinite-duration bandlimited signals with no energy leakage beyond $B$, which rarely hold in practice due to finite signal lengths and non-ideal filtering.[17] For example, standard narrowband telephony speech is bandlimited to 300–3400 Hz, requiring a sampling rate of 8 kHz for faithful reconstruction as per ITU-T G.711.[19]
Aliasing Phenomena

Aliasing occurs when high-frequency components in a continuous-time signal are misrepresented as lower-frequency components after sampling, leading to distortion in the reconstructed signal. This phenomenon arises because sampling creates periodic replicas of the signal's spectrum in the frequency domain, causing overlap if the signal is not properly bandlimited.[20]

The mechanism of aliasing stems from the frequency-domain replication of the signal spectrum at intervals of the sampling frequency $f_s$. Specifically, a frequency component $f$ in the original signal will appear as an aliased frequency $f_a = |f - k f_s|$, where $k$ is the integer chosen such that $f_a$ falls within the baseband range $[0, f_s/2]$. This folding effect means that frequencies above the Nyquist frequency $f_s/2$ are mapped into the lower frequency band, indistinguishable from true low-frequency content.[21][22]

A classic visual example of aliasing is the wagon-wheel effect observed in motion pictures, where the spokes of a rotating wheel appear to rotate backward or stationary due to the frame rate undersampling the wheel's true rotational frequency. In audio processing, a 10 kHz tone sampled at 15 kHz will alias to a 5 kHz tone, as $|10 - 15| = 5$ kHz, altering the perceived sound.[23][24] These effects are particularly evident in applications like pulse-width modulation where signals exceed the Nyquist limit.[25]

To mitigate aliasing, pre-sampling bandwidth estimation is essential to ensure $f_s$ exceeds twice the signal's highest frequency, as per the Nyquist rate. Techniques such as applying the Hilbert transform for envelope detection allow estimation of the signal's instantaneous bandwidth by computing the analytic signal and deriving its frequency content, guiding appropriate sampling rate selection.[26][27]

The aliasing phenomenon was first systematically observed and analyzed in the context of early telegraphy systems in 1928, where insufficient bandwidth in pulse transmission led to signal distortion, as detailed in foundational work on telegraph transmission theory.[28]
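The folding relation $f_a = |f - k f_s|$ can be evaluated directly; the short sketch below reproduces the 10 kHz tone sampled at 15 kHz aliasing to 5 kHz (the helper name and second example are assumptions):

```python
def aliased_frequency(f_hz, fs_hz):
    """Fold a real tone at f_hz into the baseband [0, fs/2] of a sampler running at fs_hz."""
    f_mod = f_hz % fs_hz                 # reduce modulo the sampling rate
    return min(f_mod, fs_hz - f_mod)     # reflect into [0, fs/2]

print(aliased_frequency(10_000, 15_000))   # 5000, as in the example above
print(aliased_frequency(26_000, 44_100))   # 18100: a tone above Nyquist folds back
```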
Signal Reconstruction

Signal reconstruction refers to the process of recovering the original continuous-time signal from its discrete-time samples, assuming the signal satisfies the bandlimiting conditions outlined in the Nyquist-Shannon sampling theorem.[29] In the ideal case, perfect reconstruction is achievable using sinc interpolation, where the continuous signal is expressed as

$$s(t) = \sum_{n=-\infty}^{\infty} s[n]\,\operatorname{sinc}\!\left(\frac{t - nT}{T}\right),$$

with $f_s$ denoting the sampling frequency and $T = 1/f_s$ the sampling period. This formula leverages an ideal low-pass filter to eliminate high-frequency components introduced during sampling, ensuring no distortion if the signal's bandwidth is below $f_s/2$.[29]

In practical systems, ideal sinc interpolation is computationally intensive and sensitive to infinite sample requirements, leading to the adoption of approximate methods. Zero-order hold (ZOH) reconstruction maintains the sample value constant between points, introducing a linear phase shift but suitable for simple real-time applications due to its low complexity.[30] Linear interpolation connects adjacent samples with straight lines, providing smoother transitions than ZOH at the cost of higher-frequency attenuation, while spline-based methods, such as cubic splines, offer better approximation of curved signal segments for enhanced fidelity in bandwidth-constrained environments. These techniques balance computational efficiency and reconstruction quality in hardware like digital-to-analog converters (DACs).

Reconstruction errors arise primarily from finite sample lengths and non-ideal filtering, which truncate the sinc function or introduce passband ripple and stopband leakage. The mean squared error (MSE) serves as a key metric, quantifying distortion as the average of $\big(s(t) - \hat{s}(t)\big)^2$, where $\hat{s}(t)$ is the reconstructed signal; for finite observations, this error increases with signal duration due to boundary effects.[31] Non-ideal filters exacerbate this by allowing aliasing remnants, with MSE scaling inversely with filter order in typical designs.[32]

For non-uniform sampling, where samples occur at irregular intervals, standard sinc methods fail, necessitating specialized algorithms. Iterative methods, such as least-squares optimization, refine estimates by minimizing reconstruction error over the irregular grid, converging to near-optimal solutions for bandlimited signals. The non-uniform fast Fourier transform (NUFFT) efficiently computes Fourier-domain interpolations, enabling practical reconstruction with $O(N \log N)$ complexity for $N$ samples, outperforming direct gridding in imaging applications.[33]

A representative example is DAC implementation in audio playback, where oversampled reconstruction (typically at 4x or 8x the base rate) shifts imaging artifacts to ultrasonic frequencies, allowing gentler analog filters to suppress them without audible distortion.[30] In advanced multirate systems, perfect reconstruction is achieved through filter banks, where analysis and synthesis filters are designed to cancel aliasing and distortion across subbands. For two-channel quadrature mirror filters, paraunitary conditions ensure zero phase distortion and exact recovery, foundational to subband coding in compression schemes.[34]
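A minimal comparison of two of the approximate reconstructions mentioned above, zero-order hold and linear interpolation, measured by mean squared error against the underlying sine on a dense grid; all signal parameters and names are illustrative assumptions.

```python
import numpy as np

fs, f0 = 1_000, 50                       # sample at 1 kHz, reconstruct a 50 Hz sine
T = 1.0 / fs
n = np.arange(100)
samples = np.sin(2 * np.pi * f0 * n * T)

t_fine = np.linspace(0, (len(n) - 1) * T, 5_000)
reference = np.sin(2 * np.pi * f0 * t_fine)

# Zero-order hold: each sample is held constant until the next sampling instant
zoh = samples[np.minimum((t_fine / T).astype(int), len(samples) - 1)]

# Linear interpolation between adjacent samples
linear = np.interp(t_fine, n * T, samples)

for name, rec in [("ZOH", zoh), ("linear", linear)]:
    mse = np.mean((rec - reference) ** 2)
    print(f"{name:6s} MSE = {mse:.2e}")
```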
Practical Aspects
Quantization Effects
Quantization in signal processing refers to the process of mapping continuous amplitude values of a sampled signal to a finite set of discrete levels, typically represented by binary codes of fixed bit depth $N$. This mapping approximates the original amplitude by rounding to the nearest discrete level, with the step size $\Delta$ defined as the full-scale range divided by $2^N$, where the full-scale range is the total span from the minimum to maximum representable amplitude.[35] For example, in an $N$-bit uniform quantizer, the levels are equally spaced, enabling efficient digital representation but introducing approximation errors.

The primary error introduced by quantization is modeled as additive uniform noise, independent of the signal, with the quantization error bounded between $-\Delta/2$ and $+\Delta/2$. Under this model, the noise is assumed to be uniformly distributed, yielding a variance of $\Delta^2/12$.[36] For a full-scale sinusoidal input signal, this results in a theoretical signal-to-noise ratio (SNR) of $6.02N + 1.76$ dB, where each additional bit improves the SNR by approximately 6 dB.[35]

Quantization types include uniform schemes, which use fixed step sizes suitable for signals with uniform amplitude distributions, and non-uniform schemes, which employ variable step sizes to allocate more levels to frequent amplitude ranges, enhancing efficiency for non-Gaussian signals like speech. A prominent non-uniform method is μ-law companding, standardized by the ITU-T in Recommendation G.711 for pulse-code modulation (PCM) telephony, where a logarithmic characteristic with parameter $\mu = 255$ compresses the dynamic range before uniform quantization to reduce granular noise in low-amplitude regions.[37] To mitigate nonlinear distortion and linearize the quantization error, dithering adds low-level random noise prior to quantization, decorrelating the error from the signal and converting it to broadband noise.[38]

Quantization limits the dynamic range to approximately $6.02N$ dB, beyond which signals suffer from granular noise (coarse, distortion-like artifacts resembling a gritty texture due to insufficient levels for small amplitude variations) and clipping, where amplitudes exceeding the representable range are forced to the extreme levels, introducing harmonic distortion.[39][40] For instance, 16-bit quantization in audio applications yields a theoretical SNR of 96 dB, sufficient for most perceptual needs but prone to audible granular noise in quiet passages without dithering.[40]

The foundational development of quantization for PCM occurred in the 1940s, pioneered by Alec Reeves at ITT Laboratories in 1937–1938, with practical implementation during World War II for secure telephony transmission.[41] ITU-T standards, such as G.711 established in 1972, formalized non-uniform quantization laws like μ-law and A-law for international digital telephony, ensuring compatibility and consistent performance across networks.[37]
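A small sketch of a mid-tread uniform quantizer, checking the measured SNR of a near-full-scale sine against the $6.02N + 1.76$ dB rule quoted above; the test tone, rates, and function name are assumptions.

```python
import numpy as np

def quantize_uniform(x, n_bits, full_scale=1.0):
    """Mid-tread uniform quantizer: round x onto a grid with step 2*full_scale / 2**n_bits."""
    step = 2 * full_scale / 2**n_bits
    return step * np.round(x / step)

t = np.arange(1_000_000) / 48_000
sine = np.sin(2 * np.pi * 997 * t)          # full-scale test tone, non-coherent with fs

for n_bits in (8, 16):
    err = quantize_uniform(sine, n_bits) - sine
    snr = 10 * np.log10(np.mean(sine**2) / np.mean(err**2))
    print(f"{n_bits}-bit: measured SNR ~ {snr:.1f} dB, "
          f"6.02N + 1.76 = {6.02 * n_bits + 1.76:.1f} dB")
```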
Analog-to-Digital Conversion

Analog-to-digital conversion (ADC) is the process of transforming continuous analog signals into discrete digital representations, essential for integrating sampled signals into digital systems. In signal processing, ADCs perform both sampling and quantization, capturing instantaneous signal values and mapping them to binary codes. This hardware implementation bridges theoretical sampling principles with practical digital processing, enabling applications from telecommunications to consumer electronics.[42]

Common ADC architectures balance speed, resolution, and power, with trade-offs dictated by application needs. Flash ADCs use parallel comparators for ultra-high speeds up to gigasamples per second but are limited to low resolutions (4-8 bits) due to exponential hardware growth, making them suitable for wideband signals like radar. Successive approximation register (SAR) ADCs employ a binary search algorithm with a capacitor array, offering medium speeds (up to 5 MSPS) and resolutions up to 18 bits, ideal for general-purpose data acquisition with good power efficiency. Sigma-delta ADCs oversample the input and use noise shaping via a modulator and digital filter, achieving high resolutions (16-24 bits) at lower speeds (up to a few MSPS), trading bandwidth for precision in audio and sensor interfaces.[43][44][45][46]

The sampling stage within an ADC relies on sample-and-hold (S/H) circuits to capture and stabilize the analog input. These circuits operate in two phases: tracking, where the output follows the input signal via a closed switch and charged hold capacitor, and holding, where the switch opens to freeze the voltage for the duration of quantization, preventing signal variation that could introduce errors. S/H circuits are crucial for maintaining accuracy in high-speed or dynamic-range applications, as they ensure the input remains constant within one least significant bit (LSB) during conversion, directly impacting spurious-free dynamic range (SFDR) and signal-to-noise ratio (SNR). Aperture jitter in S/H (the timing uncertainty in sampling) degrades SNR, particularly for high-frequency signals, while the effective number of bits (ENOB) quantifies overall performance by relating actual SNR to an ideal quantizer's resolution via ENOB = (SNR - 1.76) / 6.02.[47][48][16][49]

In professional audio, 24-bit sigma-delta ADCs support sampling rates up to 192 kHz, providing dynamic ranges exceeding 114 dB for high-fidelity capture of music and speech without perceptible distortion. For instance, the Cirrus Logic CS4272 codec integrates stereo 24-bit ADCs with pop-guard noise reduction, enabling seamless analog-to-digital conversion in studio equipment.[50][51]

The evolution of ADCs traces from 1930s vacuum-tube samplers, which used electromechanical relays and tubes for rudimentary pulse-code modulation (PCM) in telephony, to 1950s transistor-based systems that marked a pivotal milestone. In the mid-1950s, Bell Labs implemented transistorized PCM coders with successive approximation techniques, digitizing voice signals to 5 bits at 8 kHz sampling, vastly improving reliability and size over tube-based designs.
Modern complementary metal-oxide-semiconductor (CMOS) ADCs, dominant since the 1970s, integrate billions of transistors on a single chip, enabling low-voltage operation and resolutions beyond 24 bits.[52][53][54]

In embedded systems, power and cost are critical, as higher resolutions and speeds increase consumption; sigma-delta ADCs may draw microwatts for low-rate sensors but milliwatts at audio frequencies, while SAR variants excel in battery-powered devices under 1 mW. Cost scales with process technology; CMOS integration reduces per-unit prices to cents for 12-bit SARs in microcontrollers, but custom high-resolution designs exceed dollars, influencing choices in IoT versus industrial controls. Trade-offs often favor SAR for embedded versatility, balancing sub-microwatt idle power with minimal die area.[55][56][57][58]
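The ENOB relation quoted above can be applied directly; a tiny sketch follows, where the 74 dB and 114 dB inputs are illustrative (the latter echoing the audio ADC dynamic range mentioned earlier).

```python
def enob(snr_db: float) -> float:
    """Effective number of bits implied by a measured SNR (full-scale sine assumption)."""
    return (snr_db - 1.76) / 6.02

# A converter measuring 74 dB SNR behaves like an ideal ~12-bit quantizer
print(f"{enob(74.0):.1f} effective bits")
# A 114 dB dynamic range corresponds to roughly 18.6 effective bits
print(f"{enob(114.0):.1f} effective bits")
```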
Filter Design for Sampling

In signal sampling, anti-aliasing filters are essential analog low-pass filters placed before the analog-to-digital converter (ADC) to attenuate frequencies above the Nyquist frequency ($f_s/2$, where $f_s$ is the sampling rate), thereby preventing aliasing of high-frequency components into the baseband.[59] These filters are typically designed with a cutoff frequency slightly below $f_s/2$ to ensure sufficient attenuation in the stopband, balancing the trade-off between passband flatness and transition sharpness. For instance, in audio applications with a 44.1 kHz sampling rate, the filter cutoff is set near 20 kHz to preserve the audible spectrum while rejecting ultrasonic noise.[59]

Common designs include Butterworth filters, which provide a maximally flat passband response with a gradual roll-off (e.g., -3 dB at the cutoff and 20 dB/decade per order), minimizing amplitude distortion but requiring higher orders for steep transitions.[60] In contrast, Chebyshev filters offer steeper roll-off (e.g., faster attenuation beyond cutoff) at the cost of ripple in the passband (Type I) or stopband (Type II), making them suitable for applications where aliasing rejection is prioritized over passband uniformity.[59] The choice between them depends on the required attenuation; for example, a 4th-order Chebyshev filter achieves better stopband rejection than a Butterworth of the same order but introduces more phase nonlinearity.[60]

Reconstruction filters, employed after the digital-to-analog converter (DAC), are low-pass filters that remove spectral images (replicas of the baseband signal centered at multiples of $f_s$) to recover a smooth analog output.[61] The ideal brick-wall reconstruction filter has a frequency response defined as

$$H(f) = \begin{cases} T, & |f| \le f_s/2, \\ 0, & |f| > f_s/2, \end{cases}$$

where $T$ is the sampling period, ensuring perfect sinc interpolation in the time domain.[61] In practice, finite-order approximations like Butterworth filters are used for their flat passband, providing low distortion in the signal band, while Chebyshev variants enable sharper cutoffs to suppress images more effectively with fewer components.[61] Bessel filters are preferred when linear phase (constant group delay) is critical to avoid time-domain overshoot or ringing in pulse-like signals.[61]

In oversampled systems, digital filters facilitate multirate processing through decimation and interpolation to manage higher sampling rates efficiently.
Decimation involves a low-pass filter followed by downsampling by an integer factor $M$, attenuating frequencies above the new Nyquist limit ($f_s/(2M)$) to avoid aliasing.[62] Interpolation upsamples by an integer factor $L$ by inserting zeros between samples, then applies a low-pass filter with gain $L$ and cutoff at the original $f_s/2$ to eliminate imaging artifacts from the zero-stuffing.[62] These filters are often implemented as finite impulse response (FIR) designs for linear phase or infinite impulse response (IIR) for efficiency, with polyphase structures reducing computational load by a factor of $M$ or $L$.[62]

Design tools for these filters frequently employ the bilinear transform to convert analog prototypes to digital IIR equivalents, mapping the s-plane to the z-plane via the substitution

$$s = \frac{2}{T}\,\frac{1 - z^{-1}}{1 + z^{-1}},$$

where $T$ is the sampling period, preserving stability and avoiding aliasing in the frequency warping.[63] This method is particularly useful for tailoring anti-aliasing or reconstruction filters to a specific $f_s$, as it directly translates analog specifications like cutoff and order.[63]

Practical challenges in filter design for sampling include finite-order approximations, which cannot achieve ideal brick-wall responses and thus leave residual aliasing or imaging, necessitating higher orders that increase component count and cost.[60] Group delay variations, especially in nonlinear-phase designs like Chebyshev filters, can introduce signal distortion, such as overshoot in step responses, requiring careful specification of maximum allowable ripple (e.g., <0.5 dB for audio).[59]
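A sketch of the bilinear-transform workflow described above, using SciPy to map a 4th-order analog Butterworth anti-aliasing prototype to a digital IIR filter at a chosen rate; the 48 kHz rate, 20 kHz cutoff, and manual pre-warping step are illustrative choices, not prescribed by the text.

```python
import numpy as np
from scipy import signal

fs = 48_000          # target sampling rate, Hz
fc = 20_000          # desired anti-aliasing cutoff, Hz (just below fs/2)

# Pre-warp the cutoff so the digital filter lands on fc after the bilinear transform
warped = 2 * fs * np.tan(np.pi * fc / fs)

# 4th-order analog Butterworth prototype (maximally flat passband)
b_analog, a_analog = signal.butter(4, warped, btype="low", analog=True)

# Bilinear transform: map the s-plane prototype to a z-plane IIR filter at rate fs
b_digital, a_digital = signal.bilinear(b_analog, a_analog, fs=fs)

# Check the response at the cutoff (about -3 dB) and near Nyquist
w, h = signal.freqz(b_digital, a_digital, worN=[fc, 23_000], fs=fs)
print(20 * np.log10(np.abs(h)))
```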
Sampling Strategies
Undersampling Techniques
Undersampling, also known as bandpass sampling, involves intentionally sampling a bandpass signal at a rate lower than the Nyquist rate based on its highest frequency but sufficient for its bandwidth, thereby exploiting aliasing to fold the signal spectrum into the baseband for processing. This technique applies specifically to signals confined to a narrow band centered at a high carrier frequency $f_c$, where the bandwidth $B$ is much smaller than $f_c$, allowing the sampling frequency to satisfy $f_s \ge 2B$ while $f_s \ll 2f_c$. The process leverages the periodic replicas created by sampling to translate the high-frequency band down without requiring analog downconversion hardware.[64]

For successful reconstruction without overlap between spectral replicas, the signal must be strictly bandpass with lower frequency $f_L$ and upper frequency $f_H$, and the sampling rate must be chosen to prevent aliasing distortion within the band of interest. The allowable ranges for $f_s$ are determined by an integer $k$, such that

$$\frac{2 f_H}{k} \le f_s \le \frac{2 f_L}{k - 1}$$

for integers $k$ satisfying $1 \le k \le \lfloor f_H / (f_H - f_L) \rfloor$ (the case $k = 1$ reduces to ordinary sampling at $f_s \ge 2 f_H$), ensuring the positive and negative frequency images do not overlap. A bandpass pre-filter is essential to isolate the signal band and suppress out-of-band components that could alias into the desired spectrum.[64]

One key advantage of undersampling is the significant reduction in hardware complexity and cost for processing high-frequency signals, such as those in radio frequency (RF) applications, by avoiding the need for high-speed analog-to-digital converters (ADCs) and mixers.[64] This approach lowers power consumption and simplifies system design, making it particularly suitable for portable or resource-constrained devices.[65]

In practice, undersampling is widely employed in software-defined radio (SDR) systems for direct downconversion of RF signals to baseband through digital signal processing, eliminating intermediate frequency stages and enabling flexible multi-band reception.[65] Historically, it has been utilized in spectrum analyzers since the 1980s to efficiently capture and analyze high-frequency spectra with moderate sampling rates. Despite these benefits, undersampling is sensitive to variations in the signal's carrier frequency, as even small drifts can cause the aliased band to shift or overlap with unwanted replicas, potentially degrading reconstruction quality.[64] Additionally, it demands highly selective bandpass pre-filters to maintain signal integrity, which can introduce insertion loss and increase design challenges at very high frequencies.[65]
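The constraint $2 f_H / k \le f_s \le 2 f_L / (k - 1)$ can be enumerated programmatically; the sketch below lists valid sampling-rate intervals for an assumed 5 MHz-wide band centred on a 70 MHz IF (all numbers and names are illustrative).

```python
import math

def valid_bandpass_rates(f_low_hz, f_high_hz):
    """List (k, min_fs, max_fs) intervals that avoid aliasing for a band [f_low, f_high]."""
    bandwidth = f_high_hz - f_low_hz
    ranges = []
    for k in range(1, math.floor(f_high_hz / bandwidth) + 1):
        lo = 2 * f_high_hz / k
        hi = 2 * f_low_hz / (k - 1) if k > 1 else float("inf")   # k = 1: ordinary sampling
        if lo <= hi:
            ranges.append((k, lo, hi))
    return ranges

# Example: a 5 MHz-wide band centred on a 70 MHz IF (i.e. 67.5-72.5 MHz)
for k, lo, hi in valid_bandpass_rates(67.5e6, 72.5e6):
    print(f"k={k}: fs between {lo / 1e6:.2f} and {hi / 1e6:.2f} MHz")
```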
Oversampling Methods

Oversampling methods entail sampling a signal at a frequency significantly higher than the Nyquist rate of $2B$, where $B$ is the signal bandwidth, to distribute quantization noise across a broader spectrum. This spreading allows for noise shaping, concentrating noise outside the signal band and enhancing the effective resolution. Without additional shaping, the signal-to-noise ratio (SNR) improves by 3 dB for each octave of oversampling, as the total noise power remains constant but is diluted over twice the bandwidth per octave, while signal power is unchanged.

A prominent oversampling technique is delta-sigma modulation, which employs negative feedback to achieve aggressive noise shaping. In a delta-sigma modulator, the input signal passes through an integrator before quantization, with the quantizer error fed back and subtracted, resulting in a noise transfer function (NTF) that attenuates low-frequency noise. For a first-order delta-sigma modulator, the NTF is given by

$$\mathrm{NTF}(z) = 1 - z^{-1},$$

which exhibits high-pass behavior, suppressing noise near DC. The power spectral density (PSD) of the shaped quantization noise is then

$$S_e(f) = \frac{\Delta^2}{12 f_s}\,\left|1 - e^{-j 2\pi f / f_s}\right|^2 = \frac{\Delta^2}{12 f_s}\,4\sin^2\!\left(\frac{\pi f}{f_s}\right),$$

where $\Delta$ is the quantization step size and the term $\frac{\Delta^2}{12 f_s}$ represents the PSD of uniform white quantization noise; the factor $4\sin^2(\pi f / f_s)$ pushes noise power toward higher frequencies, potentially yielding SNR gains of 9 dB or more per octave depending on the modulator order.[66][67]

After oversampling and modulation, decimation reduces the data rate to the Nyquist rate while preserving signal integrity. This involves digital low-pass filtering to eliminate high-frequency noise introduced by shaping, followed by downsampling, which further concentrates the noise reduction in the baseband and can increase effective bit resolution by approximately 0.5 bits per octave of oversampling in unshaped systems.[68]

In practical applications, such as digital audio reproduction, oversampling at 4× the CD rate (176.4 kHz) is commonly used in digital-to-analog converters (DACs) to smooth reconstruction and minimize distortion. High-resolution analog-to-digital converters (ADCs) leverage delta-sigma oversampling to attain 20-bit or greater performance from coarse quantizers, enabling compact, low-power designs in portable devices.[69] Oversampling relaxes anti-aliasing filter requirements by providing greater separation between the signal band and potential aliases, allowing simpler analog filters with less sharp roll-off. These methods emerged in the 1970s for A/D converters, initially using multibit oversampled architectures to exploit advancing integrated circuit capabilities for improved linearity and dynamic range.[70]
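A minimal first-order delta-sigma loop with a 1-bit quantizer, followed by a crude moving-average decimator, illustrating the noise-shaping idea described above; the oversampling ratio, test tone, and boxcar decimation filter are simplifying assumptions, not a production design.

```python
import numpy as np

def delta_sigma_1st_order(x):
    """First-order delta-sigma modulator producing a +/-1 bitstream from input in [-1, 1]."""
    integrator, y_prev = 0.0, 0.0
    bits = np.empty_like(x)
    for i, xi in enumerate(x):
        integrator += xi - y_prev            # accumulate the error between input and output
        y_prev = 1.0 if integrator >= 0 else -1.0
        bits[i] = y_prev
    return bits

osr = 64                                     # oversampling ratio (illustrative)
fs_out, f0 = 48_000, 1_000                   # desired output rate and test-tone frequency
fs = osr * fs_out
t = np.arange(fs // 10) / fs                 # 100 ms of signal
x = 0.5 * np.sin(2 * np.pi * f0 * t)

bitstream = delta_sigma_1st_order(x)

# Crude decimation filter: moving average over one output period, then downsample by osr
decimated = bitstream.reshape(-1, osr).mean(axis=1)
target = x.reshape(-1, osr).mean(axis=1)
err = decimated - target
print("in-band SNR ~ %.1f dB" % (10 * np.log10(np.mean(target**2) / np.mean(err**2))))
```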
Specialized Sampling
Complex Sampling
Complex sampling, also known as I/Q or quadrature sampling, processes bandpass signals by separately sampling their in-phase (I) and quadrature (Q) components, each at a rate of $f_s/2$, where $f_s = 2B$ is twice the signal bandwidth $B$. This approach yields an effective complex sampling rate of $B$ for the signal's complex envelope, allowing efficient representation without aliasing of the positive-frequency spectrum.[71]

The mathematical foundation relies on the analytic signal representation, where the complex envelope is $\tilde{s}(t) = I(t) + j\,Q(t)$, with $I(t)$ and $Q(t)$ being the low-pass filtered versions of the signal mixed with $\cos(2\pi f_c t)$ and $-\sin(2\pi f_c t)$, respectively. The analytic signal is then $s_a(t) = s(t) + j\,\hat{s}(t)$, related to the Hilbert transform $\hat{s}(t)$ such that $\tilde{s}(t) = s_a(t)\,e^{-j 2\pi f_c t}$, with $f_c$ as the carrier frequency. The Nyquist-Shannon sampling theorem extends to this complex domain: a bandpass signal with bandwidth $B$ centered at $f_c$ can be perfectly reconstructed from I and Q samples if each is sampled at least at rate $B$, ensuring the complex envelope captures the full information content without loss.[71][72]

A key advantage is the direct digitization of intermediate-frequency (IF) signals, bypassing analog downconversion mixers and reducing hardware complexity in digital receivers. This shifts the signal spectrum to baseband, centering it at DC; for the complex analytic signal $s_a(t)$, the Fourier transform satisfies $S_a(f) = 0$ for $f < 0$, isolating the desired band while suppressing negative frequencies and images.[71]

Implementations often employ Hartley or Weaver architectures for generating and processing I/Q components. The Hartley method uses a 90-degree phase shifter post-mixing to derive Q from I, minimizing mixer count, while the Weaver architecture applies low-pass filters after initial quadrature mixing to enhance image rejection across wider bands. These techniques found application in wireless communications, such as GSM receivers, where they enable software-defined processing of modulated signals at IF.[73]

Challenges include phase and gain mismatches between I and Q channels, which degrade image rejection and introduce leakage from unwanted sidebands, often requiring calibration algorithms for compensation. Complex sampling emerged in the late 1970s and 1980s as a cornerstone for digital receivers, building on early quadrature concepts to support efficient bandpass digitization.[71]
Multidimensional Sampling

Multidimensional sampling generalizes the one-dimensional Nyquist-Shannon sampling theorem to signals varying in two or more dimensions, such as spatial images or volumetric data, by discretizing continuous functions on lattices in the domain. For bandlimited signals on separable Cartesian grids, sampling proceeds independently along each axis, ensuring that the sampling frequencies exceed twice the respective bandwidths to prevent aliasing. In two dimensions, if a signal has bandwidths $B_x$ in the horizontal direction and $B_y$ in the vertical direction, the sampling frequencies must satisfy $f_{s,x} \ge 2B_x$ and $f_{s,y} \ge 2B_y$, yielding a minimum sampling density of $4 B_x B_y$ samples per unit area.[74] This extension maintains the core principle that the signal's frequency support must fit within the fundamental period of the multidimensional spectrum without overlap from replicas induced by sampling.[75]

Cartesian (rectangular) grids are prevalent in practical systems due to their alignment with hardware architectures and straightforward Fourier analysis via separable transforms, but they can introduce directional aliasing for isotropic signals with circular frequency support. Hexagonal grids, by contrast, provide higher sampling efficiency, requiring about 13% fewer points than square grids to cover the same bandwidth without aliasing, as their Voronoi cells in the frequency domain tile circular regions more compactly and isotropically. This efficiency stems from the geometry of the lattice, which minimizes redundancy in spatial frequency representation while reducing moiré patterns and edge aliasing in images. Such grids have been analyzed for applications where uniform coverage of rotational symmetries is beneficial, though implementation complexity often favors Cartesian alternatives in standard processing pipelines.[76]

Reconstruction of bandlimited multidimensional signals from uniform Cartesian samples employs separable sinc interpolation, extending the 1D case to products of sinc functions along each dimension. For a 2D signal sampled at intervals $T_x$ and $T_y$, the reconstructed value at arbitrary coordinates $(x, y)$ is given by

$$f(x, y) = \sum_{m=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} f(m T_x, n T_y)\, \operatorname{sinc}\!\left(\frac{x - m T_x}{T_x}\right) \operatorname{sinc}\!\left(\frac{y - n T_y}{T_y}\right),$$

where $\operatorname{sinc}(u) = \sin(\pi u)/(\pi u)$. This formula ensures perfect recovery within the bandlimit, though practical approximations truncate the infinite sums to finite windows. Non-Cartesian sampling schemes, such as radial or spiral trajectories used in magnetic resonance imaging (MRI) to traverse k-space, complicate reconstruction because samples lie on irregular lattices, demanding preprocessing steps like density compensation and gridding to a Cartesian grid or iterative nonlinear algorithms to mitigate artifacts and ensure consistent aliasing control. These methods trade sampling efficiency for faster acquisitions but increase computational demands compared to uniform grids.[77]

A key example of multidimensional sampling is pixel discretization in digital cameras, where photosensitive elements on a rectangular grid capture the spatial intensity distribution of light, with pixel pitch determining the Nyquist spatial frequency to avoid aliasing from high-frequency scene details like fine textures. Optical anti-aliasing filters are often applied pre-sampling to attenuate frequencies beyond this limit.
Historically, multidimensional sampling concepts entered computer graphics in the 1960s through early experiments with raster displays and image synthesis, where grid-based discretization enabled the rendering of continuous scenes onto discrete screens at institutions such as Boeing and Bell Labs.[78][79]

For compression purposes, subsampling on optimized lattices, such as quincunx or hexagonal patterns, reduces data volume by selectively lowering sampling density in low-frequency regions or in chrominance channels, exploiting redundancies without severe perceptual degradation. These lattices preserve the essential spatial frequencies while allowing rates below the full Nyquist density, provided reconstruction accounts for the irregular geometry to control aliasing.[80]
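The quincunx pattern mentioned above can be sketched in a few lines; the array contents and sizes below are placeholders chosen only to show which lattice sites are retained.

```python
# Minimal sketch of quincunx subsampling of an image: a checkerboard of
# samples is kept, halving the data volume. Reconstruction would have to
# account for this non-rectangular lattice, as noted in the text.
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((8, 8))               # stand-in for a 2D image

rows, cols = np.indices(image.shape)
quincunx_mask = (rows + cols) % 2 == 0   # keep every other sample, checkerboard style

kept = image[quincunx_mask]              # half the samples survive
print(image.size, "->", kept.size, "samples (factor of 2 reduction)")
```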
Applications
Audio and Speech Sampling
In audio and speech sampling, the standard for compact disc (CD) audio is a sampling rate of 44.1 kHz with 16-bit quantization, selected as a compromise between Sony's proposal of 44.056 kHz (aligned with NTSC video standards) and Philips' 44.1 kHz (aligned with PAL video standards) to facilitate mastering on adapted consumer video recorders.[81] Because this rate exceeds twice the upper limit of the human hearing range of approximately 20 Hz to 20 kHz, it captures audible frequencies up to the Nyquist frequency of 22.05 kHz while leaving a guard band above 20 kHz for anti-aliasing filter roll-off.[82] High-resolution audio formats extend this to 96 kHz sampling and 24-bit depth, offering greater fidelity for professional applications by capturing ultrasonic content and reducing quantization artifacts, though the benefits beyond 44.1 kHz/16-bit are often subtle for typical listening. The 16-bit depth used in consumer audio provides a theoretical dynamic range of 96 dB, sufficient for most music reproduction as it matches the capabilities of human hearing in quiet environments.[83] Professional recordings favor 24-bit depth, yielding up to 144 dB of theoretical dynamic range to accommodate headroom for mixing and processing without clipping or excessive noise.[83]

In telephony, μ-law (North America and Japan) and A-law (Europe) companding schemes compress linear pulse-code modulation (PCM) of roughly 13- to 14-bit resolution to 8 bits per sample at an 8 kHz rate, optimizing bandwidth efficiency while maintaining acceptable speech quality through nonlinear quantization that allocates more levels to quieter signals.[84] Speech sampling typically employs an 8 kHz rate to cover the telephony bandwidth of 300-3400 Hz, as defined by ITU-T standards, which prioritizes intelligibility over full audio fidelity by focusing on the formant frequencies essential for phoneme recognition. This rate supports voice over IP (VoIP) systems such as those using the G.711 codec, balancing low latency against bandwidth (64 kbit/s) for real-time communication. For further compression at bit rates well below 64 kbit/s, linear predictive coding (LPC) models speech as an autoregressive process, predicting samples from prior ones to reduce redundancy; pioneered by Bishnu S. Atal and Manfred R. Schroeder at Bell Labs in the 1960s, LPC enables bit rates as low as 2.4 kbit/s while preserving perceptual quality.[85]

Sampling artifacts in audio include quantization noise, which becomes prominent in quiet passages owing to the coarse representation of low-amplitude levels, potentially introducing audible hiss if not mitigated by dithering.[86] Aliasing occurs when high-frequency harmonics exceed half the sampling rate and fold back as lower frequencies in the absence of adequate anti-aliasing filtering, distorting the timbre of instruments with rich overtones.[87] Perceptual coding formats such as MP3 address bit-rate constraints by exploiting psychoacoustic principles such as simultaneous and temporal masking, quantizing the less perceptually relevant frequency components more coarsely to achieve compression ratios of up to about 12:1 with minimal audible degradation.[88]

Historically, PCM for telephony originated with Alec H. Reeves' 1937 invention of pulse-code modulation at International Telephone and Telegraph, and was later advanced at Bell Labs in the 1940s for secure speech transmission, establishing the 8 kHz foundation for digital speech.[89]
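Returning to the telephony companding described above, a minimal sketch of μ-law compression and expansion follows. It uses the continuous μ-law characteristic with μ = 255 and a simple 8-bit uniform code; the exact segmented G.711 encoding differs in detail, so this is an illustrative approximation rather than the standard's algorithm.

```python
# Minimal sketch of mu-law companding: quiet samples get finer effective
# resolution than loud ones after quantizing the companded value to 8 bits.
import numpy as np

MU = 255.0

def mulaw_compress(x):
    # x is linear PCM scaled to [-1, 1]; output is also in [-1, 1]
    return np.sign(x) * np.log1p(MU * np.abs(x)) / np.log1p(MU)

def mulaw_expand(y):
    return np.sign(y) * ((1.0 + MU) ** np.abs(y) - 1.0) / MU

def encode_8bit(x):
    # quantize the companded value to 256 levels, as in telephone PCM
    y = mulaw_compress(x)
    return np.round((y + 1.0) / 2.0 * 255.0).astype(np.uint8)

def decode_8bit(code):
    y = code.astype(np.float64) / 255.0 * 2.0 - 1.0
    return mulaw_expand(y)

x = np.array([-0.5, -0.01, 0.0, 0.001, 0.25, 0.9])
codes = encode_8bit(x)
print(codes)
print(decode_8bit(codes))   # small inputs are recovered with small absolute error
```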
Video and Image Sampling
Image sampling in digital still photography and graphics involves discretizing continuous spatial signals into a grid of pixels, typically arranged in rectangular arrays. High-definition (HD) images commonly use a 1920 × 1080 pixel grid, as defined by the SMPTE ST 274:2013 standard for 1080-line progressive and interlaced formats.[90] This grid has a Nyquist limit of 0.5 cycles per pixel, preventing aliasing of spatial frequencies below half the sampling rate, according to the Nyquist-Shannon sampling theorem applied to two-dimensional signals.[91] In practice, this means the highest resolvable spatial frequency is one-half cycle per pixel, ensuring faithful reconstruction of image detail without distortion when anti-aliasing filters are properly applied.[92]

Video sampling extends image sampling to the spatiotemporal domain, capturing sequences of frames at fixed rates to represent motion. Standard frame rates include approximately 30 frames per second (fps) for NTSC-based systems (precisely 30/1.001 ≈ 29.97 fps), sampling motion at intervals of approximately 33.3 milliseconds.[93] Chroma subsampling reduces color resolution relative to luminance to conserve bandwidth; for example, the 4:2:2 format samples chroma at half the horizontal rate of luma, as specified in ITU-R BT.601 for standard-definition video, resulting in 360 samples per chroma component per line compared with 720 luma samples.[93] Interlacing, a historical technique, alternates odd and even scan lines between fields to double the perceived vertical resolution at a given bandwidth, but it introduces artifacts such as line flicker and feathering on moving edges because the two fields are captured at different times.[94]

Key standards govern these sampling parameters for interoperability. ITU-R BT.601 establishes sampling for standard-definition (SD) video at 13.5 MHz for luma in both 525-line/60 Hz and 625-line/50 Hz systems, supporting 4:2:2 chroma with 720 active luma samples per line.[93] For HD, ITU-R BT.709 defines 1920 × 1080 resolution at frame rates up to 60 Hz, with square pixels and 4:2:2 or 4:4:4 chroma options, ensuring compatibility for broadcast and production.[95] Ultra-high-definition (UHD) 4K video at 3840 × 2160 resolution and 60 fps follows SMPTE ST 2082-10 for 12G-SDI mapping, enabling uncompressed transport of progressive-scan content with full chroma sampling to support high-motion scenes.[96] These standards mitigate aliasing risks such as moiré patterns, which arise when high-frequency patterns exceed the Nyquist limit and produce false low-frequency interference in fabrics or grids.[91]

Compression techniques such as JPEG for images and MPEG for video rely on block-based sampling to reduce data rates while preserving perceptual quality. JPEG divides images into 8 × 8 pixel blocks, applies the discrete cosine transform (DCT) to convert spatial data into frequency coefficients, and quantizes them for lossy compression, effectively discarding high frequencies that are less visible to the human eye.[97] MPEG extends this to video by applying the DCT to macroblocks (typically 16 × 16 pixels) across frames and incorporating motion compensation to exploit temporal redundancy, as in the MPEG-2 standards for DVD and broadcast.[98] Block-based coding introduces minor artifacts such as blocking at low bit rates but aligns with sampling theory by concentrating the bit budget on perceptually important low-frequency components.
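The block-transform step can be sketched as follows. The random block and flat quantization step are stand-ins (JPEG uses perceptually derived quantization tables), intended only to show how an 8 × 8 DCT concentrates and coarsens frequency content.

```python
# Minimal sketch of JPEG-style block sampling: an 8x8 block is transformed
# with an orthonormal 2D DCT-II and coarsely quantized, then inverted.
import numpy as np

N = 8
k = np.arange(N)
# Orthonormal DCT-II matrix: C[u, n] = alpha(u) * cos(pi * (2n + 1) * u / (2N))
C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * N))
C[0, :] = np.sqrt(1.0 / N)

rng = np.random.default_rng(1)
block = rng.integers(0, 256, size=(N, N)).astype(float) - 128  # level-shifted pixels

coeffs = C @ block @ C.T             # separable 2D DCT of the block
step = 16.0                          # crude uniform quantization step (assumption)
quantized = np.round(coeffs / step)
reconstructed = C.T @ (quantized * step) @ C + 128

print(np.abs(block + 128 - reconstructed).max())  # worst-case pixel error
```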
The origins of video sampling trace to 1950s television systems, in which analog scanning lines, typically 525 in NTSC or 625 in PAL, rasterized the visual scene electronically, with interlacing introduced to achieve flicker-free images within limited bandwidth.[99] Early cathode-ray tube cameras scanned these lines progressively or interlaced, sampling at rates tied to the 60 Hz or 50 Hz field rates (30 or 25 frames per second), laying the foundation for digital pixel grids.[94]

Challenges in video and image sampling include spatial aliasing, which manifests as jagged edges (jaggies) on diagonal lines when the pixel grid cannot resolve sharp transitions, violating the Nyquist criterion.[100] Temporal aliasing occurs in motion, such as the wagon-wheel effect, in which rotating objects appear to reverse direction because the frame rate undersamples the rotational speed, as when a wheel's spokes align ambiguously between 24 fps frames.[101] These issues are addressed through anti-aliasing filters and higher sampling rates in modern standards, though they persist in compressed or low-resolution formats.
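The wagon-wheel effect can be quantified by folding the true rotation rate of the spoke pattern into the range of ±half the frame rate. The short sketch below computes the apparent rate for a few assumed rotation speeds at 24 fps; the rates chosen are illustrative only.

```python
# Minimal sketch of temporal aliasing (the wagon-wheel effect): the apparent
# rotation rate is the true rate folded into the band around zero.
import numpy as np

def apparent_rate(true_hz, fps):
    """Aliased rotation frequency observed when sampling at `fps` frames/s."""
    return (true_hz + fps / 2.0) % fps - fps / 2.0

for true_hz in (5.0, 20.0, 23.0, 25.0):
    print(f"{true_hz:5.1f} Hz pattern at 24 fps looks like "
          f"{apparent_rate(true_hz, 24.0):+.1f} Hz")   # negative = apparent reversal
```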
Spatial and 3D Sampling
Spatial sampling in signal processing extends the principles of one- and two-dimensional sampling to three-dimensional volumes, where signals are represented on regular or irregular grids in Cartesian coordinates. In three dimensions, the sampling theorem requires capturing spatial frequencies up to the Nyquist limit along every direction to avoid aliasing, with the sampling density determined by the highest frequency components of the signal, such as those arising from structural variations in physical media. For volumetric data, this often involves discretizing continuous fields into voxels, the three-dimensional analogs of pixels, arranged in a grid that can be isotropic, with equal spacing in the x, y, and z directions, or anisotropic, with resolutions matched to the signal's directional properties or to acquisition constraints.[102]

The foundational developments in 3D spatial sampling emerged in the 1970s with the invention of computed tomography (CT) by Godfrey Hounsfield, who demonstrated the reconstruction of cross-sectional images from X-ray projections, enabling the first practical volumetric imaging systems. Hounsfield's work at EMI Laboratories led to the first clinical CT scanner in 1971, which processed 160 angular projections to form 2D slices that could be stacked into 3D volumes, marking a shift from uniform 2D sampling to layered volumetric representation. Subsequent advancements incorporated non-uniform sampling strategies, such as adaptive mesh refinement (AMR), which dynamically adjusts grid resolution in regions of high spatial-frequency variation, reducing computational load while preserving detail in complex geometries such as tissue boundaries.[103][104][105]

In medical imaging applications such as CT and magnetic resonance imaging (MRI), 3D sampling typically employs voxel grids at resolutions such as 512³ to balance detail and acquisition time, with each voxel representing a cubic volume element that records attenuation or signal intensity. For CT, this resolution allows reconstruction of anatomical structures with voxel sizes of around 0.5–1 mm, sufficient for visualizing bone and soft-tissue interfaces without excessive aliasing. The extension from 2D theory retains the separability of sampling along orthogonal axes but works with the 3D Fourier transform: isotropic spacing yields a uniform frequency response, while anisotropic spacing, common in MRI because slices are thicker than the in-plane spacing, trades axial resolution for shorter acquisition times by combining finer in-plane resolution with coarser slice spacing (e.g., 1 mm × 1 mm × 5 mm voxels).[102][106][107]

Beyond medicine, 3D sampling finds applications in holography and 3D printing, where volumetric data drives light-field reconstruction or material deposition. In holographic displays, layered depth sampling prevents aliasing across depth layers by providing sufficient angular and axial resolution to resolve parallax, avoiding distortions in the perceived 3D scene. For 3D printing, particularly tomographic volumetric additive manufacturing (TVAM), holographic projections sample the resin volume from multiple angles, curing voxels layer by layer or volumetrically to fabricate complex objects, with sampling rates tailored to material curing depths to minimize surface roughness.
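Both the anisotropic medical voxel grids above and the depth-layer sampling just described reduce to the same per-axis Nyquist relation. The small sketch below converts assumed voxel spacings (e.g., a 1 mm × 1 mm × 5 mm MRI grid) into the highest spatial frequency representable along each axis; the spacings are illustrative, not taken from the cited sources.

```python
# Minimal sketch relating (possibly anisotropic) voxel spacing to the per-axis
# Nyquist limit: f_N = 1 / (2 * spacing) along each direction.
spacing_mm = {"x": 1.0, "y": 1.0, "z": 5.0}   # assumed anisotropic voxel grid

for axis, d in spacing_mm.items():
    nyquist_cycles_per_mm = 1.0 / (2.0 * d)
    print(f"{axis}: spacing {d} mm -> Nyquist {nyquist_cycles_per_mm:.2f} cycles/mm")
```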
Aliasing across depth layers can manifest as moiré patterns if the axial sampling falls below the Nyquist rate for the object's depth range, necessitating oversampling in z for high-fidelity output.[108][109] Standards such as DICOM (Digital Imaging and Communications in Medicine) govern the storage and exchange of 3D medical volumes, specifying metadata for voxel spacing, orientation, and sampling parameters to ensure interoperability across devices. Sampling rates in these volumes are selected based on tissue spatial frequencies; for example, soft tissues exhibit dominant spatial frequencies up to 0.5 mm⁻¹ (cycles per mm), requiring a sampling interval of at most 1 mm to satisfy the Nyquist criterion and resolve low-contrast features such as tumors without aliasing artifacts. This frequency threshold aligns with modulation transfer function (MTF) measurements in CT systems, where resolution at 0.5 mm⁻¹ ensures adequate depiction of anatomical details.[110][111]

Reconstruction from 3D samples often employs ray casting, a volume-rendering technique that traces rays through the voxel grid from a viewpoint, accumulating opacity and color from the sampled densities to generate 2D projections. This method handles dense grids efficiently by stepping through voxels along each ray path and interpolating values to mitigate discretization errors. In sparse 3D scenarios, such as LiDAR point clouds used for environmental mapping, ray casting faces challenges from irregular sampling densities: gaps lead to incomplete reconstructions and amplified aliasing in occluded regions, so processing requires upsampling or fusion with denser modalities to obtain reliable 3D models. These limitations highlight the need for adaptive sampling in sparse data, extending non-uniform grids to prioritize high-variance areas such as object edges.[112][113][114]
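A minimal front-to-back ray-casting sketch over a synthetic voxel grid is given below; the volume contents, step size, and opacity mapping are illustrative assumptions rather than parameters from the cited work.

```python
# Minimal sketch of ray casting through a voxel grid: a ray is stepped through
# the volume, density is sampled by nearest-neighbour lookup, and opacity is
# accumulated front-to-back with early termination.
import numpy as np

rng = np.random.default_rng(2)
volume = rng.random((64, 64, 64))            # voxel densities in [0, 1]

def cast_ray(origin, direction, step=0.5, max_steps=256):
    direction = np.asarray(direction, float)
    direction /= np.linalg.norm(direction)
    pos = np.asarray(origin, float)
    color, transmittance = 0.0, 1.0
    for _ in range(max_steps):
        idx = np.floor(pos).astype(int)
        if np.any(idx < 0) or np.any(idx >= volume.shape):
            break                              # ray has left the volume
        density = volume[tuple(idx)]
        alpha = 1.0 - np.exp(-density * step)  # opacity of this ray segment
        color += transmittance * alpha * density  # density doubles as "colour" here
        transmittance *= 1.0 - alpha
        if transmittance < 1e-3:
            break                              # early termination: nearly opaque
        pos = pos + direction * step
    return color

print(cast_ray(origin=(0.0, 32.0, 32.0), direction=(1.0, 0.0, 0.0)))
```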
References
- https://wiki.seg.org/wiki/Frequency_aliasing