Noise reduction
Noise reduction is the process of removing noise from a signal. Noise reduction techniques exist for audio and images. Noise reduction algorithms may distort the signal to some degree. Noise rejection is the ability of a circuit to isolate an undesired signal component from the desired signal component, as with common-mode rejection ratio.
All signal processing devices, both analog and digital, have traits that make them susceptible to noise. Noise can be random, with an even frequency distribution (white noise), or frequency-dependent, introduced by a device's mechanism or signal processing algorithms.
In electronic systems, a major type of noise is hiss created by random electron motion due to thermal agitation. These agitated electrons rapidly add and subtract from the output signal and thus create detectable noise.
In the case of photographic film and magnetic tape, noise (both visible and audible) is introduced due to the grain structure of the medium. In photographic film, the size of the grains in the film determines the film's sensitivity, more sensitive film having larger-sized grains. In magnetic tape, the larger the grains of the magnetic particles (usually ferric oxide or magnetite), the more prone the medium is to noise. To compensate for this, larger areas of film or magnetic tape may be used to lower the noise to an acceptable level.
In general
Noise reduction algorithms tend to alter signals to a greater or lesser degree. The local signal-and-noise orthogonalization algorithm can be used to avoid changes to the signals.[1]
In seismic exploration
Boosting signals in seismic data is especially crucial for seismic imaging,[2][3] inversion,[4][5] and interpretation,[6] thereby greatly improving the success rate in oil and gas exploration.[7][8][9] The useful signal that is smeared in the ambient random noise is often neglected and thus may cause spurious discontinuities of seismic events and artifacts in the final migrated image. Enhancing the useful signal while preserving edge properties of the seismic profiles by attenuating random noise can reduce interpretation difficulties and the risk of misleading results in oil and gas detection.
In audio
Tape hiss is a performance-limiting issue in analog tape recording. This is related to the particle size and texture used in the magnetic emulsion that is sprayed on the recording media, and also to the relative tape velocity across the tape heads.
Four types of noise reduction exist: single-ended pre-recording, single-ended hiss reduction, single-ended surface noise reduction, and codec or dual-ended systems. Single-ended pre-recording systems (such as Dolby HX Pro), work to affect the recording medium at the time of recording. Single-ended hiss reduction systems (such as DNL[10] or DNR) work to reduce noise as it occurs, including both before and after the recording process as well as for live broadcast applications. Single-ended surface noise reduction (such as CEDAR and the earlier SAE 5000A, Burwen TNE 7000, and Packburn 101/323/323A/323AA and 325[11]) is applied to the playback of phonograph records to address scratches, pops, and surface non-linearities. Single-ended dynamic range expanders like the Phase Linear Autocorrelator Noise Reduction and Dynamic Range Recovery System (Models 1000 and 4000) can reduce various noise from old recordings. Dual-ended systems (such as Dolby noise-reduction system or dbx) have a pre-emphasis process applied during recording and then a de-emphasis process applied during playback.
Modern digital sound recordings do not suffer from tape hiss, so analog-style noise reduction systems are unnecessary for them. Dither systems, however, deliberately add low-level noise to a signal in order to improve its quality.
Compander-based noise reduction systems
Dual-ended compander noise reduction systems have a pre-emphasis process applied during recording and then a de-emphasis process applied at playback. Systems include the professional systems Dolby A[10] and Dolby SR by Dolby Laboratories, dbx Professional and dbx Type I by dbx, Donald Aldous' EMT NoiseBX,[12] Burwen Noise Eliminator,[13][14][15] Telefunken's telcom c4[10] and MXR Innovations' MXR[16] as well as the consumer systems Dolby NR, Dolby B,[10] Dolby C and Dolby S, dbx Type II,[10] Telefunken's High Com[10] and Nakamichi's High-Com II, Toshiba's (Aurex AD-4) adres,[10][17] JVC's ANRS[10][17] and Super ANRS,[10][17] Fisher/Sanyo's Super D,[18][10][17] SNRS,[17] and the Hungarian/East-German Ex-Ko system.[19][17]
In some compander systems, the compression is applied during professional media production and only the expansion is applied by the listener; for example, systems like dbx disc, High-Com II, CX 20[17] and UC used for vinyl recordings and Dolby FM, High Com FM and FMX used in FM radio broadcasting.
The first widely used audio noise reduction technique was developed by Ray Dolby in 1966. Intended for professional use, Dolby Type A was an encode/decode system in which the amplitude of frequencies in four bands was increased during recording (encoding), then decreased proportionately during playback (decoding). In particular, when recording quiet parts of an audio signal, the frequencies above 1 kHz would be boosted. This had the effect of increasing the signal-to-noise ratio on tape up to 10 dB depending on the initial signal volume. When it was played back, the decoder reversed the process, in effect reducing the noise level by up to 10 dB.
The Dolby B system (developed in conjunction with Henry Kloss) was a single-band system designed for consumer products. The Dolby B system, while not as effective as Dolby A, had the advantage of remaining listenable on playback systems without a decoder.
The Telefunken High Com integrated circuit U401BR could be utilized to work as a mostly Dolby B–compatible compander as well.[20] In various late-generation High Com tape decks the Dolby-B emulating D NR Expander functionality worked not only for playback, but, as an undocumented feature, also during recording.
dbx was a competing analog noise reduction system developed by David E. Blackmer, founder of Dbx, Inc.[21] It used a root-mean-squared (RMS) encode/decode algorithm with the noise-prone high frequencies boosted, and the entire signal fed through a 2:1 compander. dbx operated across the entire audible bandwidth and unlike Dolby B was unusable without a decoder. However, it could achieve up to 30 dB of noise reduction.
Since analog video recordings use frequency modulation for the luminance part (composite video signal in direct color systems), which keeps the tape at saturation level, audio-style noise reduction is unnecessary.
Dynamic noise limiter and dynamic noise reduction
Dynamic noise limiter (DNL) is an audio noise reduction system originally introduced by Philips in 1971 for use on cassette decks.[10] Its circuitry is also based on a single chip.[22][23]
It was further developed into dynamic noise reduction (DNR) by National Semiconductor to reduce noise levels on long-distance telephony.[24] First sold in 1981, DNR is frequently confused with the far more common Dolby noise-reduction system.[25]
Unlike Dolby and dbx Type I and Type II noise reduction systems, DNL and DNR are playback-only signal processing systems that do not require the source material to first be encoded. They can be used to remove background noise from any audio signal, including magnetic tape recordings and FM radio broadcasts, reducing noise by as much as 10 dB.[26] They can also be used in conjunction with other noise reduction systems, provided that they are used prior to applying DNR to prevent DNR from causing the other noise reduction system to mistrack.[27]
One of DNR's first widespread applications was in the GM Delco car stereo systems in US GM cars introduced in 1984.[28] It was also used in factory car stereos in Jeep vehicles in the 1980s, such as the Cherokee XJ. Today, DNR, DNL, and similar systems are most commonly encountered as a noise reduction system in microphone systems.[29]
Other approaches
A second class of algorithms works in the time-frequency domain, using linear or nonlinear filters that have local characteristics; these are often called time-frequency filters.[30][page needed] Noise can therefore also be removed with spectral editing tools, which work in this time-frequency domain and allow local modifications without affecting nearby signal energy. This can be done manually, much as one would retouch an image in a paint program. Another way is to define a dynamic threshold for filtering noise that is derived from the local signal, again with respect to a local time-frequency region. Everything below the threshold is filtered out, while everything above it, such as partials of a voice or wanted noise, is left untouched. The region is typically defined by the location of the signal's instantaneous frequency,[31] as most of the signal energy to be preserved is concentrated about it.
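As an illustration of this kind of time-frequency thresholding, the sketch below applies a simple spectral gate: a noise floor is estimated per frequency bin from a noise-only excerpt, and short-time Fourier transform (STFT) bins falling below a threshold derived from that floor are attenuated. This is a minimal sketch rather than any specific product's algorithm, and it derives its threshold from a noise-only clip instead of the fully local estimate described above; the margin and attenuation values are illustrative assumptions.

```python
import numpy as np
from scipy.signal import stft, istft

def spectral_gate(x, fs, noise_clip, margin_db=6.0, atten_db=-30.0, nperseg=1024):
    """Attenuate STFT bins that fall below a threshold derived from a noise estimate."""
    _, _, N = stft(noise_clip, fs, nperseg=nperseg)            # noise-only excerpt
    _, _, X = stft(x, fs, nperseg=nperseg)                     # noisy signal
    noise_floor = np.mean(np.abs(N), axis=1, keepdims=True)    # per-bin noise magnitude
    threshold = noise_floor * 10 ** (margin_db / 20)           # gate threshold above the floor
    gain = np.where(np.abs(X) > threshold, 1.0, 10 ** (atten_db / 20))
    _, y = istft(X * gain, fs, nperseg=nperseg)
    return y

# Example: a 1 kHz tone buried in white noise
fs = 16000
t = np.arange(fs) / fs
clean = 0.5 * np.sin(2 * np.pi * 1000 * t)
noise = 0.05 * np.random.default_rng(0).standard_normal(fs)
denoised = spectral_gate(clean + noise, fs, noise_clip=noise)
```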
Yet another approach is the automatic noise limiter and noise blanker commonly found on ham radio and CB radio transceivers. Both filters can be used separately or together, depending on the transceiver.
Software programs
Most digital audio workstations (DAWs) and audio editing software have one or more noise reduction functions.
In images
Images taken with digital cameras or conventional film cameras will pick up noise from a variety of sources. Further use of these images will often require that the noise be reduced either for aesthetic purposes or for practical purposes such as computer vision.
Types
In salt and pepper noise (sparse light and dark disturbances),[32] also known as impulse noise,[33] pixels in the image are very different in color or intensity from their surrounding pixels; the defining characteristic is that the value of a noisy pixel bears no relation to the color of surrounding pixels. When viewed, the image contains dark and white dots, hence the term salt and pepper noise. Generally, this type of noise will only affect a small number of image pixels. Typical sources include flecks of dust inside the camera and overheated or faulty CCD elements.
In Gaussian noise,[34] each pixel in the image will be changed from its original value by a (usually) small amount. A histogram, a plot of the amount of distortion of a pixel value against the frequency with which it occurs, shows a normal distribution of noise. While other distributions are possible, the Gaussian (normal) distribution is usually a good model, due to the central limit theorem that says that the sum of different noises tends to approach a Gaussian distribution.
In either case, the noise at different pixels can be either correlated or uncorrelated; in many cases, noise values at different pixels are modeled as being independent and identically distributed and hence uncorrelated.
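The two noise models above are easy to simulate, which is how denoising algorithms are usually benchmarked against a known clean image. The sketch below corrupts a synthetic greyscale image with additive Gaussian noise and with salt-and-pepper noise; the noise levels and image are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0.0, 1.0, 128), (128, 1))   # synthetic greyscale gradient

# Additive Gaussian noise: every pixel is perturbed by a small normal deviation
gaussian_noisy = np.clip(clean + rng.normal(0.0, 0.05, clean.shape), 0.0, 1.0)

# Salt-and-pepper (impulse) noise: a small fraction of pixels is forced to 0 or 1
salt_pepper_noisy = clean.copy()
mask = rng.random(clean.shape)
salt_pepper_noisy[mask < 0.01] = 0.0          # "pepper" pixels
salt_pepper_noisy[mask > 0.99] = 1.0          # "salt" pixels
```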
Removal
Tradeoffs
There are many noise reduction algorithms in image processing.[35] In selecting a noise reduction algorithm, one must weigh several factors:
- the available computer power and time: a digital camera must apply noise reduction in a fraction of a second using a tiny onboard CPU, while a desktop computer has much more power and time
- whether sacrificing some real detail is acceptable if it allows more noise to be removed (how aggressively to decide whether variations in the image are noise or not)
- the characteristics of the noise and the detail in the image, to better make those decisions
Chroma and luminance noise separation
In real-world photographs, the highest spatial-frequency detail consists mostly of variations in brightness (luminance detail) rather than variations in hue (chroma detail). Most photographic noise reduction algorithms split the image detail into chroma and luminance components and apply more noise reduction to the former, or allow the user to control chroma and luminance noise reduction separately.
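A minimal sketch of this split is shown below: the image is converted to a luma/chroma representation, the chroma channels are smoothed more aggressively than the luma channel, and the result is converted back to RGB. The BT.601-style luma/chroma coefficients are standard, but the blur radii are illustrative assumptions rather than values from any particular product.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def denoise_chroma_luma(rgb, luma_sigma=0.5, chroma_sigma=2.0):
    """Smooth chroma much more strongly than luma (BT.601-style split)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  = 0.299 * r + 0.587 * g + 0.114 * b        # luminance
    cb = 0.564 * (b - y)                          # blue-difference chroma
    cr = 0.713 * (r - y)                          # red-difference chroma

    y  = gaussian_filter(y,  luma_sigma)          # light smoothing preserves detail
    cb = gaussian_filter(cb, chroma_sigma)        # heavy smoothing removes colour blotches
    cr = gaussian_filter(cr, chroma_sigma)

    r = y + cr / 0.713
    b = y + cb / 0.564
    g = (y - 0.299 * r - 0.114 * b) / 0.587       # recover green exactly from the forward transform
    return np.clip(np.stack([r, g, b], axis=-1), 0.0, 1.0)
```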
Linear smoothing filters
One method to remove noise is by convolving the original image with a mask that represents a low-pass filter or smoothing operation. For example, the Gaussian mask comprises elements determined by a Gaussian function. This convolution brings the value of each pixel into closer harmony with the values of its neighbors. In general, a smoothing filter sets each pixel to the average value, or a weighted average, of itself and its nearby neighbors; the Gaussian filter is just one possible set of weights.
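As a sketch of the convolution described above, the following builds a small normalized Gaussian mask and convolves it with a noisy image; the kernel size, standard deviation, and test image are illustrative assumptions.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size=5, sigma=1.0):
    """2-D Gaussian mask, normalised so the weights sum to one."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return kernel / kernel.sum()

rng = np.random.default_rng(0)
noisy = np.clip(0.5 + 0.1 * rng.standard_normal((64, 64)), 0.0, 1.0)

# Each output pixel becomes a weighted average of itself and its neighbours
smoothed = convolve2d(noisy, gaussian_kernel(5, 1.0), mode="same", boundary="symm")
```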
Smoothing filters tend to blur an image because pixel intensity values that are significantly higher or lower than the surrounding neighborhood smear across the area. Because of this blurring, linear filters are seldom used in practice for noise reduction;[citation needed] they are, however, often used as the basis for nonlinear noise reduction filters.
Anisotropic diffusion
Another method for removing noise is to evolve the image under a smoothing partial differential equation similar to the heat equation, which is called anisotropic diffusion. With a spatially constant diffusion coefficient, this is equivalent to the heat equation or linear Gaussian filtering, but with a diffusion coefficient designed to detect edges, the noise can be removed without blurring the edges of the image.
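The sketch below is a minimal Perona-Malik-style diffusion step, assuming the common exponential edge-stopping function; the conductance parameter kappa, the step size, and the iteration count are illustrative assumptions rather than values from any particular implementation.

```python
import numpy as np

def anisotropic_diffusion(img, n_iter=20, kappa=0.1, step=0.2):
    """Perona-Malik diffusion: smooth within regions, diffuse little across edges."""
    u = img.astype(float).copy()
    for _ in range(n_iter):
        # Finite differences toward the four neighbours
        dn = np.roll(u, -1, axis=0) - u
        ds = np.roll(u,  1, axis=0) - u
        de = np.roll(u, -1, axis=1) - u
        dw = np.roll(u,  1, axis=1) - u
        # Edge-stopping conductance: small where the local gradient (an edge) is large
        cn, cs = np.exp(-(dn / kappa) ** 2), np.exp(-(ds / kappa) ** 2)
        ce, cw = np.exp(-(de / kappa) ** 2), np.exp(-(dw / kappa) ** 2)
        u += step * (cn * dn + cs * ds + ce * de + cw * dw)
    return u
```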
Non-local means
Another approach for removing noise is based on non-local averaging of all the pixels in an image. In particular, the amount of weighting for a pixel is based on the degree of similarity between a small patch centered on that pixel and the small patch centered on the pixel being de-noised.
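A patch-based weighting of this kind is implemented, for example, in scikit-image. The sketch below assumes that library is available; the patch size, search distance, and filtering strength h are illustrative assumptions.

```python
import numpy as np
from skimage.restoration import denoise_nl_means, estimate_sigma

rng = np.random.default_rng(0)
noisy = np.clip(0.5 + 0.08 * rng.standard_normal((128, 128)), 0.0, 1.0)

sigma_est = estimate_sigma(noisy)          # rough estimate of the noise level
denoised = denoise_nl_means(
    noisy,
    patch_size=7,        # side of the small patch compared around each pixel
    patch_distance=11,   # radius of the search window for similar patches
    h=1.15 * sigma_est,  # filtering strength relative to the noise level
    fast_mode=True,
)
```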
Nonlinear filters
A median filter is an example of a nonlinear filter and, if properly designed, is very good at preserving image detail. To run a median filter (see the sketch after this list):
- consider each pixel in the image
- sort the neighbouring pixels into order based upon their intensities
- replace the original value of the pixel with the median value from the list
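The following is a minimal sketch of those three steps; the 3×3 neighbourhood and edge replication at the borders are illustrative assumptions.

```python
import numpy as np

def median_filter(img, radius=1):
    """Replace each pixel with the median of its (2*radius+1)^2 neighbourhood."""
    padded = np.pad(img, radius, mode="edge")          # replicate edges at the border
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            window = padded[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            out[i, j] = np.median(window)              # sort the neighbours, take the middle value
    return out
```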
A median filter is a rank-selection (RS) filter, a particularly harsh member of the family of rank-conditioned rank-selection (RCRS) filters;[36] a much milder member of that family, for example one that selects the closest of the neighboring values when a pixel's value is external in its neighborhood, and leaves it unchanged otherwise, is sometimes preferred, especially in photographic applications.
Median and other RCRS filters are good at removing salt and pepper noise from an image, and also cause relatively little blurring of edges, and hence are often used in computer vision applications.
Wavelet transform
The main aim of an image denoising algorithm is to achieve both noise reduction[37] and feature preservation[38] using wavelet filter banks.[39] In this context, wavelet-based methods are of particular interest. In the wavelet domain, the noise is uniformly spread throughout the coefficients, while most of the image information is concentrated in a few large ones.[40] Therefore, the first wavelet-based denoising methods were based on thresholding of detail subband coefficients.[41][page needed] However, most wavelet thresholding methods suffer from the drawback that the chosen threshold may not match the specific distribution of signal and noise components at different scales and orientations.
To address these disadvantages, nonlinear estimators based on Bayesian theory have been developed. In the Bayesian framework, it has been recognized that a successful denoising algorithm can achieve both noise reduction and feature preservation if it employs an accurate statistical description of the signal and noise components.[40]
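As a concrete illustration of the baseline thresholding approach (rather than the Bayesian estimators just described), the sketch below decomposes an image with PyWavelets, soft-thresholds the detail coefficients, and reconstructs. The wavelet choice, the median-based noise estimate, and the universal-style threshold are illustrative assumptions.

```python
import numpy as np
import pywt

def wavelet_denoise(img, wavelet="db2", level=3):
    """Soft-threshold detail subbands; keep the approximation subband intact."""
    coeffs = pywt.wavedec2(img, wavelet, level=level)
    # Noise level estimated from the finest diagonal detail subband (median rule)
    sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    thresh = sigma * np.sqrt(2 * np.log(img.size))     # universal-style threshold
    new_coeffs = [coeffs[0]] + [
        tuple(pywt.threshold(d, thresh, mode="soft") for d in detail)
        for detail in coeffs[1:]
    ]
    return pywt.waverec2(new_coeffs, wavelet)
```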
Statistical methods
Statistical methods for image denoising exist as well. For Gaussian noise, one can model the pixels in a greyscale image as auto-normally distributed, where each pixel's true greyscale value is normally distributed with mean equal to the average greyscale value of its neighboring pixels and a given variance.
Let $\delta_i$ denote the set of pixels adjacent to the $i$-th pixel. Then the conditional distribution of the greyscale intensity $x_i$ (on a $[0,1]$ scale) at the $i$-th node is $x_i \mid x_{\delta_i} \sim \mathcal{N}\left(\beta \bar{x}_{\delta_i}, \sigma^2\right)$, where $\bar{x}_{\delta_i}$ is the mean greyscale value of the pixels in $\delta_i$, for a chosen parameter $\beta$ and variance $\sigma^2$. One method of denoising that uses the auto-normal model uses the image data as a Bayesian prior and the auto-normal density as a likelihood function, with the resulting posterior distribution offering a mean or mode as a denoised image.[42][43]
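A minimal sketch of this kind of estimator is given below. It assumes a known noise variance and a simplified conditional posterior in which each pixel's update blends the noisy observation with the average of its four neighbours; it is an iterated, Gauss-Seidel-style approximation for illustration, not the exact procedure of the cited references.

```python
import numpy as np

def auto_normal_denoise(noisy, beta=0.9, sigma_noise=0.05, sigma_prior=0.05, n_iter=30):
    """Iteratively move each pixel toward a blend of its observation and neighbour mean."""
    x = noisy.astype(float).copy()
    w_obs = 1.0 / sigma_noise**2          # confidence in the observed pixel value
    w_pri = 1.0 / sigma_prior**2          # confidence in the neighbourhood prediction
    for _ in range(n_iter):
        neighbour_mean = (
            np.roll(x, 1, 0) + np.roll(x, -1, 0) + np.roll(x, 1, 1) + np.roll(x, -1, 1)
        ) / 4.0
        # Conditional posterior mean under a Gaussian likelihood and auto-normal prior
        x = (w_obs * noisy + w_pri * beta * neighbour_mean) / (w_obs + w_pri)
    return x
```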
Block-matching algorithms
A block-matching algorithm can be applied to group similar image fragments of overlapping macroblocks of identical size. Stacks of similar macroblocks are then filtered together in the transform domain and each image fragment is finally restored to its original location using a weighted average of the overlapping pixels.[44]
Random field
Shrinkage fields is a random field-based machine learning technique that brings performance comparable to that of Block-matching and 3D filtering yet requires much lower computational overhead such that it can be performed directly within embedded systems.[45]
Deep learning
Various deep learning approaches have been proposed to achieve noise reduction[46] and similar image restoration tasks. Deep Image Prior is one such technique that makes use of a convolutional neural network and is notable in that it requires no prior training data.[47]
Software
Most general-purpose image and photo editing software will have one or more noise-reduction functions (median, blur, despeckle, etc.).
See also
General noise issues
Audio
- Architectural acoustics including Soundproofing
- Click removal
- Codec listening test
- Noise blanker
- Noise print
- Noise-cancelling headphones
- Sound masking
Images and video
Similar problems
[edit]References
- ^ Chen, Yangkang; Fomel, Sergey (November–December 2015). "Random noise attenuation using local signal-and-noise orthogonalization". Geophysics. 80 (6): WD1–WD9. Bibcode:2015Geop...80D...1C. doi:10.1190/GEO2014-0227.1. S2CID 120440599.
- ^ Xue, Zhiguang; Chen, Yangkang; Fomel, Sergey; Sun, Junzhe (2016). "Seismic imaging of incomplete data and simultaneous-source data using least-squares reverse time migration with shaping regularization". Geophysics. 81 (1): S11–S20. Bibcode:2016Geop...81S..11X. doi:10.1190/geo2014-0524.1.
- ^ Chen, Yangkang; Yuan, Jiang; Zu, Shaohuan; Qu, Shan; Gan, Shuwei (2015). "Seismic imaging of simultaneous-source data using constrained least-squares reverse time migration". Journal of Applied Geophysics. 114: 32–35. Bibcode:2015JAG...114...32C. doi:10.1016/j.jappgeo.2015.01.004.
- ^ Chen, Yangkang; Chen, Hanming; Xiang, Kui; Chen, Xiaohong (2017). "Geological structure guided well log interpolation for high-fidelity full waveform inversion". Geophysical Journal International. 209 (1): 21–31. Bibcode:2016GeoJI.207.1313C. doi:10.1093/gji/ggw343.
- ^ Gan, Shuwei; Wang, Shoudong; Chen, Yangkang; Qu, Shan; Zu, Shaohuan (2016). "Velocity analysis of simultaneous-source data using high-resolution semblance—coping with the strong noise". Geophysical Journal International. 204 (2): 768–779. Bibcode:2016GeoJI.204..768G. doi:10.1093/gji/ggv484.
- ^ Chen, Yangkang (2017). "Probing the subsurface karst features using time-frequency decomposition". Interpretation. 4 (4): T533–T542. doi:10.1190/INT-2016-0030.1.
- ^ Huang, Weilin; Wang, Runqiu; Chen, Yangkang; Li, Huijian; Gan, Shuwei (2016). "Damped multichannel singular spectrum analysis for 3D random noise attenuation". Geophysics. 81 (4): V261–V270. Bibcode:2016Geop...81V.261H. doi:10.1190/geo2015-0264.1.
- ^ Chen, Yangkang (2016). "Dip-separated structural filtering using seislet transform and adaptive empirical mode decomposition based dip filter". Geophysical Journal International. 206 (1): 457–469. Bibcode:2016GeoJI.206..457C. doi:10.1093/gji/ggw165.
- ^ Chen, Yangkang; Ma, Jianwei; Fomel, Sergey (2016). "Double-sparsity dictionary for seismic noise attenuation". Geophysics. 81 (4): V261–V270. Bibcode:2016Geop...81V.193C. doi:10.1190/geo2014-0525.1.
- ^ a b c d e f g h i j k "High Com - the latest noise reduction system / Noise reduction - silence is golden" (PDF). elektor (UK) – up-to-date electronics for lab and leisure. Vol. 1981, no. 70. February 1981. pp. 2-04 – 2-09. Archived (PDF) from the original on 2020-07-02. Retrieved 2020-07-02. (6 pages)
- ^ Audio Noise Suppressor Model 325 Owner's Manual (PDF). Rev. 15-1. Syracuse, New York, USA: Packburn electronics inc. Archived (PDF) from the original on 2021-05-05. Retrieved 2021-05-16. (6+36 pages)
- ^ R., C. (1965). "Kompander verbessert Magnettonkopie". Radio Mentor (in German). 1965 (4): 301–303.
- ^ Burwen, Richard S. (February 1971). "A Dynamic Noise Filter". Journal of the Audio Engineering Society. 19 (1).
- ^ Burwen, Richard S. (June 1971). "110 dB Dynamic Range For Tape" (PDF). Audio: 49–50. Archived (PDF) from the original on 2017-11-13. Retrieved 2017-11-13.
- ^ Burwen, Richard S. (December 1971). "Design of a Noise Eliminator System". Journal of the Audio Engineering Society. 19 (11): 906–911.
- ^ Lambert, Mel (September 1978). "MXR Compander". Sound International. Archived from the original on 2020-10-28. Retrieved 2021-04-25.
- ^ a b c d e f g Bergmann, Heinz (1982). "Verfahren zur Rauschminderung bei der Tonsignalverarbeitung" (PDF). radio fernsehen elektronik (rfe) (in German). Vol. 31, no. 11. Berlin, Germany: VEB Verlag Technik. pp. 731–736 [731]. ISSN 0033-7900. Archived (PDF) from the original on 2021-05-05. Retrieved 2021-05-05. p. 731:
ExKo Breitband-Kompander Aufnahme/Wiedergabe 9 dB Tonband
(NB. Page 736 is missing in the linked PDF.)
- ^ Haase, Hans-Joachim (August 1980). Written at Aschau, Germany. "Rauschunterdrückung: Kampf dem Rauschen". Systeme und Konzepte. Funk-Technik - Fachzeitschrift für Funk-Elektroniker und Radio-Fernseh-Techniker - Offizielles Mitteilungsblatt der Bundesfachgruppe Radio- und Fernsehtechnik (in German). Vol. 35, no. 8. Heidelberg, Germany: Dr. Alfred Hüthig Verlag GmbH. pp. W293 – W296, W298, W300 [W298, W300]. ISSN 0016-2825. Archived from the original on 2021-05-16. Retrieved 2021-04-25.
- ^ "Stereo Automat MK42 R-Player Budapesti Rádiótechnikai Gyár B". Archived from the original on 2021-04-26. Retrieved 2021-04-25.
- ^ HIGH COM - The HIGH COM broadband compander utilizing the U401BR integrated circuit (PDF) (Semiconductor information 2.80). AEG-Telefunken. Archived (PDF) from the original on 2016-04-16. Retrieved 2016-04-16.
- ^ Hoffman, Frank W. (2004). Encyclopedia of Recorded Sound. Vol. 1 (revised ed.). Taylor & Francis.
- ^ "Noise Reduction". Audiotools.com. 2013-11-10. Archived from the original on 2008-05-13. Retrieved 2009-01-14.
- ^ "Philips' Dynamic Noise Limiter". Archived from the original on 2008-11-05. Retrieved 2009-01-14.
- ^ "Dynamic Noise Reduction". ComPol Inc. Archived from the original on 2009-11-21. Retrieved 2009-01-14.
- ^ "History". Archived from the original on 2007-09-27. Retrieved 2009-01-14.
- ^ "LM1894 Dynamic Noise Reduction System DNR". Archived from the original on 2008-12-20. Retrieved 2009-01-14.
- ^ "Audio Terms". Archived from the original on 2008-12-20. Retrieved 2009-01-14.
- ^ Gunyo, Ed. "Evolution of the Riviera - 1983 the 20th Anniversary". Riviera Owners Association. Archived from the original on 2008-07-05. Retrieved 2009-01-14. (NB. Originally published in The Riview, Vol. 21, No. 6, September/October 2005.)
- ^ "Noise‑Cancelling Headphones with Microphone". InsmartWeb. 19 July 2025. Retrieved 2025-07-22.
- ^ Boashash, B., ed. (2003). Time-Frequency Signal Analysis and Processing – A Comprehensive Reference. Oxford: Elsevier Science. ISBN 978-0-08-044335-5.
- ^ Boashash, B. (April 1992). "Estimating and Interpreting the Instantaneous Frequency of a Signal-Part I: Fundamentals". Proceedings of the IEEE. 80 (4): 519–538. doi:10.1109/5.135376.
- ^ Banerjee, Shounak; Sarkar, Debarpito; Chatterjee, Debraj; Chowdhuri, Sunanda Roy (2021-06-25). "High-Density Salt and Pepper Noise Removal from Colour Images by Introducing New Enhanced Filter". 2021 International Conference on Intelligent Technologies (CONIT). Hubli, India: IEEE. pp. 1–6. doi:10.1109/CONIT51480.2021.9498402. ISBN 978-1-7281-8583-5. S2CID 236920367.
- ^ Orazaev, Anzor; Lyakhov, Pavel; Baboshina, Valentina; Kalita, Diana (2023-01-26). "Neural Network System for Recognizing Images Affected by Random-Valued Impulse Noise". Applied Sciences. 13 (3): 1585. doi:10.3390/app13031585. ISSN 2076-3417.
- ^ Dong, Suge; Dong, Chunxiao; Li, Zishuang; Ge, Mingtao (2022-07-15). "Gaussian Noise Removal Method Based on Empirical Wavelet Transform and Hypothesis Testing". 2022 3rd International Conference on Big Data, Artificial Intelligence and Internet of Things Engineering (ICBAIE). Xi’an, China: IEEE. pp. 24–27. doi:10.1109/ICBAIE56435.2022.9985814. ISBN 978-1-6654-5160-4. S2CID 254999960.
- ^ Mehdi Mafi, Harold Martin, Jean Andrian, Armando Barreto, Mercedes Cabrerizo, Malek Adjouadi, "A Comprehensive Survey on Impulse and Gaussian Denoising Filters for Digital Images", Signal Processing, vol. 157, pp. 236–260, 2019.
- ^ Liu, Puyin; Li, Hongxing (2004). "Fuzzy neural networks: Theory and applications". In Casasent, David P. (ed.). Intelligent Robots and Computer Vision XIII: Algorithms and Computer Vision. Vol. 2353. World Scientific. pp. 303–325. Bibcode:1994SPIE.2353..303G. doi:10.1117/12.188903. ISBN 978-981-238-786-8. S2CID 62705333.
- ^ Chervyakov, N. I.; Lyakhov, P. A.; Nagornov, N. N. (2018-11-01). "Quantization Noise of Multilevel Discrete Wavelet Transform Filters in Image Processing". Optoelectronics, Instrumentation and Data Processing. 54 (6): 608–616. Bibcode:2018OIDP...54..608C. doi:10.3103/S8756699018060092. ISSN 1934-7944. S2CID 128173262.
- ^ Craciun, G.; Jiang, Ming; Thompson, D.; Machiraju, R. (March 2005). "Spatial domain wavelet design for feature preservation in computational data sets". IEEE Transactions on Visualization and Computer Graphics. 11 (2): 149–159. Bibcode:2005ITVCG..11..149C. doi:10.1109/TVCG.2005.35. ISSN 1941-0506. PMID 15747638. S2CID 1715622.
- ^ Gajitzki, Paul; Isar, Dorina; Simu, Călin (November 2018). "Wavelets Based Filter Banks for Real Time Spectrum Analysis". 2018 International Symposium on Electronics and Telecommunications (ISETC). pp. 1–4. doi:10.1109/ISETC.2018.8583929. ISBN 978-1-5386-5925-0. S2CID 56599099.
- ^ a b Forouzanfar, M.; Abrishami-Moghaddam, H.; Ghadimi, S. (July 2008). "Locally adaptive multiscale Bayesian method for image denoising based on bivariate normal inverse Gaussian distributions". International Journal of Wavelets, Multiresolution and Information Processing. 6 (4): 653–664. doi:10.1142/S0219691308002562. S2CID 31201648.
- ^ Mallat, S. (1998). A Wavelet Tour of Signal Processing. London: Academic Press.
- ^ Besag, Julian (1986). "On the Statistical Analysis of Dirty Pictures" (PDF). Journal of the Royal Statistical Society. Series B (Methodological). 48 (3): 259–302. doi:10.1111/j.2517-6161.1986.tb01412.x. JSTOR 2345426. Archived (PDF) from the original on 2017-08-29. Retrieved 2019-09-24.
- ^ Seyyedi, Saeed (2018). "Incorporating a Noise Reduction Technique Into X-Ray Tensor Tomography". IEEE Transactions on Computational Imaging. 4 (1): 137–146. Bibcode:2018ITCI....4..137S. doi:10.1109/TCI.2018.2794740. JSTOR 17574903. S2CID 46793582.
- ^ Dabov, Kostadin; Foi, Alessandro; Katkovnik, Vladimir; Egiazarian, Karen (16 July 2007). "Image denoising by sparse 3D transform-domain collaborative filtering". IEEE Transactions on Image Processing. 16 (8): 2080–2095. Bibcode:2007ITIP...16.2080D. CiteSeerX 10.1.1.219.5398. doi:10.1109/TIP.2007.901238. PMID 17688213. S2CID 1475121.
- ^ Schmidt, Uwe; Roth, Stefan (2014). Shrinkage Fields for Effective Image Restoration (PDF). Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on. Columbus, OH, USA: IEEE. doi:10.1109/CVPR.2014.349. ISBN 978-1-4799-5118-5. Archived (PDF) from the original on 2018-01-02. Retrieved 2018-01-03.
- ^ Dietz, Henry (2022). "An improved raw image enhancement algorithm using a statistical model for pixel value error". Electronic Imaging. 34 (14): 1–6. doi:10.2352/EI.2022.34.14.COIMG-151.
AI Image Denoiser is much more aggressive, significantly enhancing details, but also applying heavy smoothing. DxO PureRAW, which directly improves the raw image using deep learning trained on "millions of images analyzed by DxO over 15 years," was easily the most effective of the many denoisers tested.
- ^ Ulyanov, Dmitry; Vedaldi, Andrea; Lempitsky, Victor (30 November 2017). "Deep Image Prior". arXiv:1711.10925v2 [cs.CV].
Fundamentals
Definition and Types of Noise
In signal processing, noise refers to unwanted random or deterministic perturbations that degrade the information content of a desired signal.[5] These perturbations can arise during signal capture, transmission, storage, or processing, introducing variability that obscures the underlying message or data.[6] The concept of noise gained early recognition in the late 19th century with the advent of electrical telegraphy and radio communications, where interference disrupted message transmission.[7]
Noise is commonly classified into several types based on its statistical properties and generation mechanisms. A foundational model represents the noisy signal as $y = x + n$, where $x$ is the original signal and $n$ denotes the noise component.[8] Additive white Gaussian noise (AWGN) is a prevalent type, characterized by its additive nature (superimposed on the signal), white spectrum (equal power across frequencies), and Gaussian amplitude distribution with zero mean.[9] Impulse noise, in contrast, manifests as sporadic, high-amplitude spikes or pulses of short duration, often modeled as random binary or salt-and-pepper alterations in discrete signals.[10] Poisson noise, also called shot noise, arises from the discrete, probabilistic arrival of particles like photons or electrons, following a Poisson distribution whose variance equals the mean intensity.[11] Speckle noise appears as a granular pattern due to random interference in coherent imaging systems; it is typically multiplicative in nature and reduces image contrast.[12]
Common sources of noise in electronic systems include thermal noise, generated by random thermal motion of charge carriers in resistors (also known as Johnson-Nyquist noise, with power spectral density $S_V(f) = 4 k_B T R$, where $k_B$ is Boltzmann's constant, $T$ is temperature, and $R$ is resistance); shot noise, stemming from the quantized flow of discrete charges across junctions; and flicker noise (or 1/f noise), which exhibits power inversely proportional to frequency and originates from material defects or surface traps in semiconductors.[13][14] These noise types manifest across domains such as audio, imaging, and seismic data processing.
Importance Across Domains
Noise reduction plays a pivotal role in enhancing signal quality across diverse applications, thereby improving data accuracy and user experience in communications, entertainment, and scientific endeavors. By mitigating unwanted interference, it allows meaningful information to be extracted from corrupted signals, which is fundamental to signal processing tasks in many domains. In audio systems, for instance, noise reduction ensures clearer sound reproduction, which is vital for applications like music production and voice communication where distortion degrades listener immersion. In image and video processing, it yields sharper visuals, enabling precise analysis in fields such as photography and surveillance. Seismic exploration benefits from reduced noise through superior subsurface imaging, supporting accurate geological interpretations for resource extraction. Similarly, in telecommunications, effective noise suppression supports reliable data transmission, minimizing bit errors and improving overall network efficiency.
The economic and societal advantages of noise reduction are substantial, particularly in healthcare and artificial intelligence. In medical diagnostics such as MRI and ultrasound imaging, noise attenuation decreases diagnostic errors, leading to more reliable patient assessments and reduced healthcare expenditures through fewer misdiagnoses and repeat procedures. This improvement in accuracy directly contributes to better health outcomes and cost savings, as cleaner images facilitate precise identification of abnormalities. In AI, noise reduction raises training data quality by eliminating irrelevant perturbations, resulting in more robust models with higher predictive performance and broader applicability in tasks like pattern recognition and decision-making.
A key metric for evaluating noise reduction efficacy is the signal-to-noise ratio (SNR), which quantifies the relative strength of the desired signal against background noise. The SNR is typically expressed in decibels as $\mathrm{SNR_{dB}} = 10 \log_{10}\!\left(P_{\mathrm{signal}} / P_{\mathrm{noise}}\right)$, where $P_{\mathrm{signal}}$ and $P_{\mathrm{noise}}$ represent the power of the signal and noise, respectively; higher SNR values signify improved performance and clearer outputs.
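The sketch below computes this decibel SNR for a synthetic signal with added white noise; the signal and noise levels are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
signal = np.sin(2 * np.pi * 440 * t)          # clean reference signal
noise = 0.1 * rng.standard_normal(t.size)     # additive white noise

snr_db = 10 * np.log10(np.mean(signal**2) / np.mean(noise**2))
print(f"SNR = {snr_db:.1f} dB")               # roughly 17 dB for these levels
```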
Core Techniques
Analog Methods
Analog methods for noise reduction encompass hardware-based techniques that process continuous-time signals through electronic circuits to suppress unwanted interference, forming the basis of early electronic systems before digital alternatives emerged. These approaches primarily target deterministic noise sources like electromagnetic interference and frequency-specific artifacts using passive and active components.[15] Core principles include passive filtering with RC circuits, where a resistor-capacitor network creates a frequency-dependent impedance to attenuate noise. In such setups, the capacitor charges through the resistor, forming a low-pass filter that rolls off high-frequency components at a rate of 20 dB per decade beyond the cutoff frequency, effectively reducing broadband noise while preserving signal integrity.[16] Shielding employs conductive enclosures, such as grounded metal shields, to block external electromagnetic fields by redirecting induced currents away from sensitive nodes, minimizing capacitive coupling of radio-frequency interference.[17] Proper grounding complements this by establishing a low-impedance return path for noise currents, preventing ground loops that amplify common-mode interference in mixed analog systems.[18]
Key techniques leverage these principles through targeted filters and signal conditioning. Low-pass filters, implemented via RC or active op-amp configurations, attenuate high-frequency noise in applications like audio amplification, where they suppress hiss and RF pickup without significantly distorting the baseband signal.[19] High-pass filters, conversely, eliminate low-frequency components such as 50/60 Hz power-line hum by blocking DC offsets and rumble, using similar RC elements but with the capacitor in series to create a high-impedance path at low frequencies.[20] In audio processing, companding briefly applies pre-emphasis (a boost to high frequencies during recording) to increase signal-to-noise ratio by lifting quiet components above the noise floor, followed by de-emphasis on playback to flatten the response and suppress perceived noise.[21]
Historically, analog noise reduction advanced in the 1920s with vacuum tube-based radio receivers, where tuned LC circuits and regenerative amplification circuits reduced atmospheric static and tube-generated noise through selective frequency filtering.[22] The 1960s marked a milestone with the Dolby A system, an analog compander that used four sliding bandpass filters and variable gain cells to achieve 10 dB of broadband noise reduction in professional recording, expanding on earlier pre-emphasis techniques without introducing audible distortion.[23]
Despite their effectiveness, analog methods suffer from limitations inherent to physical components, including susceptibility to thermal drift, where resistor and capacitor values can shift according to their temperature coefficients, typically 50-100 ppm/°C (0.005-0.01% per °C) for precision metal film resistors, potentially altering filter cutoff frequencies if uncompensated.[24][25] They also lack adaptability, as fixed circuit parameters cannot dynamically respond to varying noise profiles, constraining their use in non-stationary environments.[26]
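As a worked example of the RC low-pass behaviour described above, the sketch below computes the cutoff frequency $f_c = 1/(2\pi RC)$ and the attenuation of a first-order filter at a few frequencies; the component values are illustrative assumptions.

```python
import numpy as np

R = 10e3          # resistance in ohms (assumed value)
C = 100e-9        # capacitance in farads (assumed value)

f_c = 1.0 / (2 * np.pi * R * C)                     # cutoff frequency, about 159 Hz here
for f in (f_c, 10 * f_c, 100 * f_c):
    gain = 1.0 / np.sqrt(1.0 + (f / f_c) ** 2)      # first-order low-pass magnitude response
    print(f"{f:8.0f} Hz: {20 * np.log10(gain):6.1f} dB")
# Attenuation grows by about 20 dB per decade above the cutoff frequency.
```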
Digital Methods
Digital methods for noise reduction begin with the discretization of continuous analog signals into digital representations via analog-to-digital converters (ADCs), which sample the signal at discrete time intervals and quantize amplitude levels. This process inherently introduces quantization noise due to finite bit resolution, but it facilitates precise manipulation through digital signal processing (DSP). ADCs are designed to minimize additional noise sources like thermal noise and aperture jitter, ensuring that the digitized signal retains sufficient fidelity for subsequent noise mitigation.[27]
In DSP, noise reduction algorithms operate in either the time domain, using techniques such as finite impulse response (FIR) or infinite impulse response (IIR) filters, or the frequency domain, where signals are transformed via the fast Fourier transform (FFT) to isolate and attenuate noise components. A foundational algorithm is the Wiener filter, which provides an optimal linear estimate of the clean signal by minimizing the mean square error for stationary stochastic processes. The filter's frequency-domain transfer function is expressed as $H(f) = \frac{S_x(f)}{S_x(f) + S_n(f)}$, where $S_x(f)$ denotes the power spectral density of the desired signal and $S_n(f)$ that of the additive noise; this formulation assumes uncorrelated signal and noise. Adaptive filtering extends this capability by dynamically updating filter coefficients to track non-stationary noise, with the least mean squares (LMS) algorithm serving as a core method that iteratively minimizes error using gradient descent on the instantaneous squared error. Introduced by Widrow and Hoff, LMS employs a reference input correlated with the noise to enable real-time cancellation without prior knowledge of noise statistics.[28][29]
Compared to analog approaches, digital methods provide superior precision through arithmetic operations immune to component drift, real-time adaptability via algorithmic updates, and post-processing flexibility on stored data, allowing iterative refinement without hardware reconfiguration. The evolution of these techniques traces back to the 1970s with the advent of dedicated DSP chips, such as Bell Labs' DSP-1 in 1979, which enabled compact, real-time implementation of complex filters that previously required large custom hardware. By the 1980s, devices like Texas Instruments' TMS320 series further democratized DSP for noise reduction applications. In the modern era, graphics processing units (GPUs) have accelerated the field by leveraging massive parallelism for computationally intensive algorithms, such as large-scale FFTs for frequency-domain processing.[30][31][32]
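A minimal sketch of LMS-based adaptive noise cancellation is shown below: a reference input correlated with the noise drives an adaptive FIR filter whose output is subtracted from the primary (signal plus noise) input, and the residual error serves both as the cleaned output and as the quantity driving the coefficient update. The filter length, step size, and test signals are illustrative assumptions.

```python
import numpy as np

def lms_cancel(primary, reference, n_taps=16, mu=0.01):
    """Adaptive noise cancellation with the LMS coefficient update."""
    w = np.zeros(n_taps)                            # adaptive FIR coefficients
    out = np.zeros_like(primary)
    for n in range(n_taps - 1, len(primary)):
        x = reference[n - n_taps + 1:n + 1][::-1]   # current and recent reference samples
        noise_est = w @ x                           # filter's estimate of the noise
        e = primary[n] - noise_est                  # error = cleaned signal sample
        w += 2 * mu * e * x                         # gradient-descent coefficient update
        out[n] = e
    return out

# Example: sinusoid corrupted by noise that also appears (filtered) in the reference input
rng = np.random.default_rng(0)
n = np.arange(5000)
clean = np.sin(2 * np.pi * 0.01 * n)
noise_src = rng.standard_normal(n.size)
primary = clean + np.convolve(noise_src, [0.6, 0.3], mode="same")
cleaned = lms_cancel(primary, noise_src)
```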
Evaluation and Tradeoffs
Evaluating the effectiveness of noise reduction techniques requires standardized metrics that quantify the balance between noise suppression and preservation of the underlying signal. Common objective measures include the mean squared error (MSE) and peak signal-to-noise ratio (PSNR), which assess pixel-level or sample-level fidelity between the original clean signal and the denoised output. The MSE is defined as $\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}(x_i - \hat{x}_i)^2$, where $N$ is the number of samples or pixels, $x_i$ is the original signal value, and $\hat{x}_i$ is the denoised estimate; lower MSE values indicate better reconstruction with minimal residual error.[33] PSNR, derived from the MSE, expresses the ratio in decibels as $\mathrm{PSNR} = 10 \log_{10}\!\left(\mathrm{MAX}^2 / \mathrm{MSE}\right)$, where $\mathrm{MAX}$ is the maximum possible signal value, providing a scale for perceived quality in which higher values (typically above 30 dB for images) suggest effective denoising without excessive distortion.[34]
While MSE and PSNR are computationally simple and widely used for their correlation with error minimization, they often fail to capture human perceptual judgments, which has led to the adoption of the structural similarity index measure (SSIM) for better alignment with visual or auditory quality. SSIM evaluates luminance, contrast, and structural fidelity between signals, yielding values from -1 to 1, with 1 indicating perfect similarity; it has been shown to outperform MSE/PSNR in predicting subjective quality for denoised images and audio. In the context of 2020s advancements, particularly for noise in AI-generated content like deepfakes or synthetic media, learned perceptual image patch similarity (LPIPS) has emerged as a superior metric, leveraging deep network features to mimic human vision and achieving closer agreement with psychophysical ratings than traditional measures.
A primary tradeoff in noise reduction lies in balancing aggressive noise suppression against unintended signal distortion: overzealous filtering can introduce artifacts such as blurring in images or muffled speech in audio, degrading overall fidelity.[35] For instance, spectral subtraction methods may reduce noise by 10-20 dB but at the cost of introducing musical noise or harmonic distortion if the suppression threshold is too high. Another key compromise involves computational complexity versus real-time applicability; advanced adaptive filters or deep learning-based denoisers can achieve superior performance (e.g., PSNR gains of 2-5 dB over linear methods) but require significant processing power, limiting their use in resource-constrained environments like mobile devices or live audio processing.[36]
Challenges in noise reduction further complicate evaluation, particularly overfitting in adaptive methods, where models trained on limited noisy data capture noise patterns as signal features, leading to poor generalization on unseen inputs; this can be mitigated through regularization but still results in performance drops of up to 15% in cross-domain tests.[37] Handling non-stationary noise, which varies over time like babble or impulsive sounds, poses additional difficulties, as the stationarity assumptions built into many filters fail, leaving residual noise levels high (e.g., 5-10 dB above stationary cases) and requiring dynamic adaptation that increases latency.[38] These issues underscore the need for hybrid metrics combining objective scores with subjective assessments to fully evaluate technique robustness across domains.
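The sketch below computes MSE and PSNR for an image denoised with a simple box blur; the image, noise level, and filter are illustrative assumptions used only to exercise the metrics.

```python
import numpy as np
from scipy.ndimage import uniform_filter

rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0.0, 1.0, 128), (128, 1))       # synthetic reference image
noisy = np.clip(clean + 0.05 * rng.standard_normal(clean.shape), 0.0, 1.0)
denoised = uniform_filter(noisy, size=3)                     # crude 3x3 box-blur denoiser

def psnr(reference, estimate, max_value=1.0):
    mse = np.mean((reference - estimate) ** 2)               # mean squared error
    return 10 * np.log10(max_value**2 / mse)                 # peak signal-to-noise ratio in dB

print(f"noisy:    {psnr(clean, noisy):.1f} dB")
print(f"denoised: {psnr(clean, denoised):.1f} dB")           # higher is better
```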
Audio Applications
Compander-Based Systems
Compander-based systems represent an early hybrid approach to audio noise reduction, combining analog compression and expansion techniques to extend the dynamic range of analog recording media like magnetic tape. These systems operate by compressing the dynamic range of the audio signal during recording, which boosts low-level signals relative to inherent noise such as tape hiss, and then expanding the signal during playback to restore the original dynamics while attenuating the noise floor. The core principle relies on a sliding gain control that applies more boost to quieter portions of the signal, effectively masking noise in those regions without significantly altering louder signals. This companding process (short for "compressing and expanding") adapts concepts from earlier video noise reduction methods to audio applications, achieving typical noise reductions of 10-30 dB depending on the system.[39]
The compression ratio in these systems defines the degree of dynamic range modification and is expressed as the ratio of the change in input level to the change in output level in decibels. For instance, a common 2:1 compression ratio means that for every 2 dB increase in input signal above the threshold, the output increases by only 1 dB, compressing the range, while the expansion reverses this 1:2 on playback. Mathematically, the compressed output level above the threshold can be modeled as $L_{\mathrm{out}} = T + \frac{L_{\mathrm{in}} - T}{R}$ for $L_{\mathrm{in}} > T$, where $L_{\mathrm{in}}$ is the input level, $T$ is the threshold, and $R$ is the compression ratio (e.g., $R = 2$ for 2:1). This fixed-ratio approach ensures predictable noise suppression but requires precise encoder-decoder matching to avoid artifacts.[40]
Prominent compander-based systems emerged in the late 1960s and 1970s, tailored for both consumer and professional use. Dolby B, introduced in 1968 by Ray Dolby for cassette tapes, employed a single-band pre-emphasis compander with a 2:1 ratio focused on high frequencies to combat tape hiss, achieving about 10 dB of noise reduction. In the professional realm, dbx systems, developed in the early 1970s by dbx Inc., utilized broadband 2:1 companding across the full audio spectrum for tape and disc recording, offering up to 30 dB of reduction and improved headroom. Telcom C-4, launched by Telefunken in 1975, advanced this with a four-band compander operating at a gentler 1.5:1 ratio, providing around 25 dB of noise reduction while minimizing tonal shifts through frequency-specific processing.[41][42][43]
These systems excelled at suppressing tape hiss, the high-frequency noise inherent to analog magnetic media, by elevating signal levels during quiet passages and thus improving signal-to-noise ratios without requiring digital processing. However, they were susceptible to disadvantages like "breathing" artifacts (audible pumping or modulation effects) arising from mismatches between the encode and decode stages, such as slight speed variations or level errors in tape playback. This could manifest as unnatural dynamic fluctuations, particularly in complex signals, limiting their robustness compared to later adaptive methods.[44][45]
The adoption of compander systems fueled a significant boom in consumer audio quality during the 1970s and 1980s, transforming cassettes from niche formats into viable alternatives to vinyl records and enabling widespread home recording and playback with reduced audible noise.
By licensing technologies like Dolby B to major manufacturers, these innovations spurred the proliferation of high-fidelity portable and home systems, elevating overall audio fidelity and market accessibility for millions of users.[41][46]
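The sketch below evaluates a simplified broadband 2:1 companding law, dbx-style and referenced to 0 dB with the threshold ignored for simplicity, to show that the encode/decode pair is transparent for the program material while hiss introduced between the two stages is pushed down during quiet passages. The level values are illustrative assumptions.

```python
import numpy as np

RATIO = 2.0   # 2:1 compression on record, 1:2 expansion on playback

def compress_db(level_db):
    """Encode side: halve the level in dB relative to the 0 dB reference."""
    return level_db / RATIO

def expand_db(level_db):
    """Decode side: double the level, restoring the original dynamics."""
    return level_db * RATIO

program_db = np.array([-60.0, -30.0, 0.0])       # quiet, medium, loud passages
on_tape = compress_db(program_db)                # [-30, -15, 0]: quiet parts lifted above hiss
restored = expand_db(on_tape)                    # [-60, -30, 0]: original dynamics recovered

tape_hiss_db = -50.0                             # hiss added by the tape itself
print(restored)
print(expand_db(tape_hiss_db))                   # during silence the hiss expands down to -100 dB
```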
Dynamic Noise Reduction
Dynamic noise reduction (DNR) techniques represent an evolution in audio processing, focusing on adaptive systems that adjust in real time to the signal's content in order to suppress noise while preserving dynamic range. These methods build on compander foundations by incorporating signal-dependent adaptation for varying audio conditions. A key early example is the Dynamic Noise Limiter (DNL), introduced by Philips in the late 1960s as a playback-only system designed to improve audio quality from analog recordings like cassettes and tapes. The DNL operates by detecting quiet passages where tape hiss becomes prominent and dynamically attenuating high-frequency components, achieving approximately 10 dB of noise reduction without requiring encoding during recording. In contrast, more advanced DNR systems like Dolby Spectral Recording (SR), developed by Dolby Laboratories in the mid-1980s, employ sophisticated multi-band processing to extend dynamic range beyond 90 dB in professional analog audio. Dolby SR uses dual-ended encoding and decoding with spectral skewing, where large-amplitude frequency components modulate the gain of quieter ones, effectively boosting low-level signals and suppressing the noise floor across multiple bands.[47]
At the core of these algorithms is spectral analysis, which estimates the noise spectrum from the input signal and applies adaptive filtering to enhance the signal-to-noise ratio (SNR). Quiet signals are amplified while noise is attenuated based on real-time SNR assessments, often using techniques like spectral subtraction to derive a clean estimate by subtracting an averaged noise profile from the noisy spectrum.[48] A representative formulation for the adaptive gain is $G(f) = \frac{\mathrm{SNR}(f)}{1 + \mathrm{SNR}(f)}$, where the gain approaches unity in high-SNR regions to preserve detail and falls toward zero in low-SNR regions to minimize noise audibility, typically implemented via sliding shelf filters or over-subtraction factors in the frequency domain.[48] This approach ensures minimal distortion in transient-rich audio, such as music or speech.
These techniques found widespread application in broadcast environments, for improving transmission quality over analog lines, and in consumer playback systems for vinyl records, where DNL and similar DNR systems helped mitigate surface noise during reproduction without altering the original mastering.[49] For instance, Dolby SR was adopted in professional studios and film soundtracks, enabling cleaner analog tapes with extended frequency response up to 20 kHz. Despite their effectiveness, dynamic noise reduction systems can introduce artifacts, particularly "pumping" or "breathing" effects, where rapid gain changes in audio with fluctuating levels cause unnatural volume modulation, most noticeable in passages with sudden quiet-to-loud transitions.[50] Post-2010, digital revivals of DNR principles have appeared in streaming audio processing, leveraging DSP chips like the LM1894 for real-time noise suppression in non-encoded sources, though adoption remains niche compared to broadband compression standards.[51]
Other Audio Techniques
Spectral subtraction is a foundational technique in audio noise reduction that estimates and removes the noise spectrum from the noisy signal spectrum in the frequency domain. Introduced in the late 1970s, this method assumes the noise is stationary or slowly varying, allowing its spectrum to be estimated during non-speech periods and subtracted from the observed noisy signal. The core operation is defined by the equation $|\hat{S}(f)|^2 = |Y(f)|^2 - \alpha |\hat{N}(f)|^2$, where $\hat{S}(f)$ is the estimated clean signal spectrum, $Y(f)$ is the noisy signal spectrum, $\hat{N}(f)$ is the estimated noise spectrum, and $\alpha$ is an over-subtraction factor, typically between 1 and 5, used to compensate for estimation errors and reduce residual noise.[48] This approach, while simple and computationally efficient, can introduce musical noise artifacts due to spectral floor effects, prompting refinements like magnitude subtraction followed by phase reconstruction from the noisy signal.[48]
Wiener filtering, adapted for audio signals, provides an optimal linear estimator that minimizes the mean square error between the clean and estimated signals under Gaussian assumptions. In speech enhancement contexts, the filter gain is derived from signal-to-noise ratio estimates in each frequency bin, yielding a time-varying filter that suppresses noise while preserving signal components. The filter transfer function is given by $H(f) = \frac{S_s(f)}{S_s(f) + S_n(f)}$, where $S_s(f)$ and $S_n(f)$ are the power spectral densities of the clean signal and noise, respectively, though in practice these are approximated from the noisy observation. Tailored to audio, this method excels in non-stationary noise environments by integrating short-time Fourier transform processing, offering better perceptual quality than basic spectral subtraction but requiring accurate noise estimation.
Voice activity detection (VAD) complements these spectral methods by identifying speech segments in noisy audio, enabling targeted noise suppression only during active speech periods to avoid distorting silence or low-level signals. VAD algorithms typically analyze features like energy, zero-crossing rates, and spectral characteristics to classify frames as speech or non-speech, often using statistical models or thresholds adapted to noise conditions. In speech enhancement pipelines, VAD updates noise profiles during detected non-speech intervals, improving the accuracy of subsequent spectral subtraction or Wiener filtering.[52] For instance, energy-based VAD with hangover schemes maintains detection during brief pauses, enhancing overall system robustness in variable noise.[52]
Subspace methods, which emerged in the late 1990s, decompose the noisy signal into signal-plus-noise and pure noise subspaces using techniques like singular value decomposition (SVD), allowing projection of the observation onto the signal subspace to attenuate noise. These approaches model speech as lying in a low-dimensional subspace relative to broadband noise, enabling eigenvalue-based filtering that preserves signal structure better than global spectral methods. Early developments focused on white-noise assumptions, with applications to speech denoising showing reduced distortion compared to contemporaneous filters.[53]
More recently, blind source separation via independent component analysis (ICA) has advanced audio noise reduction by separating mixed signals into independent sources without prior knowledge of the mixing process. ICA maximizes statistical independence among components using measures like mutual information, making it suitable for multi-microphone setups in reverberant environments.
In audio contexts, fast ICA variants enable real-time separation of speech from interfering noises, outperforming subspace methods in non-Gaussian scenarios. These techniques find widespread application in telephony, where spectral subtraction and VAD enhance call quality by mitigating background noise in mobile networks, and in podcasting, where Wiener filtering ensures clear voice reproduction amid studio or remote recording interference.[48] In the 2020s, AI-hybrid approaches integrate deep neural networks with traditional spectral methods for low-latency denoising in live streaming, achieving sub-50 ms inference times suitable for video calls and broadcasts while adapting to diverse noise types like echoes or crowds.[54]
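A minimal sketch of the power spectral subtraction rule described above is given below, using short-time Fourier transforms with the phase taken from the noisy signal; the over-subtraction factor, spectral floor, and frame length are illustrative assumptions.

```python
import numpy as np
from scipy.signal import stft, istft

def spectral_subtract(noisy, noise_clip, fs, alpha=2.0, floor=0.01, nperseg=512):
    """Power spectral subtraction with an over-subtraction factor and a spectral floor."""
    _, _, N = stft(noise_clip, fs, nperseg=nperseg)                 # noise-only excerpt
    _, _, Y = stft(noisy, fs, nperseg=nperseg)                      # noisy signal
    noise_power = np.mean(np.abs(N) ** 2, axis=1, keepdims=True)    # averaged noise spectrum
    clean_power = np.abs(Y) ** 2 - alpha * noise_power              # subtract the scaled noise
    clean_power = np.maximum(clean_power, floor * noise_power)      # keep power non-negative
    S_hat = np.sqrt(clean_power) * np.exp(1j * np.angle(Y))         # reuse the noisy phase
    _, s_hat = istft(S_hat, fs, nperseg=nperseg)
    return s_hat
```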
Audio Software Tools
Audio software tools for noise reduction enable users to clean up recordings by applying algorithms that suppress unwanted sounds while preserving audio quality. These tools range from free open-source options to professional suites, and often incorporate techniques like spectral subtraction for targeted noise removal.[55][56]
Audacity, an open-source audio editor, provides a built-in Noise Reduction effect that uses noise profiling to identify and attenuate constant background sounds such as hiss, hum, or fan noise. Users select a noise sample to create a profile, then apply the effect across the track with adjustable parameters for reduction strength, sensitivity, and frequency smoothing, achieving effective results on steady-state noise without requiring advanced hardware.[55][57] Adobe Audition, a professional digital audio workstation, offers AI-assisted noise reduction tools including Adaptive Noise Reduction and Hiss Reduction, which analyze and suppress broadband noise in real time while integrating with applications such as Premiere Pro for post-production workflows.[58][56] iZotope RX stands out for its spectral repair capabilities, allowing users to visually edit spectrograms to remove intermittent noises like clicks or breaths using modules such as Spectral De-noise, which employs machine learning to preserve tonal elements and minimize artifacts in dialogue or music tracks.[59][60]
Common features across these tools include real-time preview for iterative adjustments, batch processing for handling multiple files efficiently, and plugin integration with DAWs such as Ableton Live or Pro Tools to streamline professional editing pipelines. For instance, Adobe Audition's effects rack supports live monitoring during playback, while iZotope RX modules can process audio in standalone mode or as VST/AU plugins, enabling non-destructive edits.[56][61]
Recent trends in audio noise reduction software emphasize cloud-based platforms and open-source libraries, driven by AI advancements toward more accessible and scalable solutions. Descript, a cloud-native tool launched in the 2020s, features Overdub and Studio Sound for AI-powered noise removal, automatically detecting and eliminating background distractions like echoes or hums in podcast and video audio with one-click enhancement.[62][63] Other online tools provide automatic AI-based noise reduction without custom noise profile support: Adobe Podcast Enhance is a free online AI tool that automatically removes noise, levels audio, and enhances spoken content without requiring user-uploaded noise samples or profiles; VEED.IO Noise Remover applies AI models for automatic background noise suppression in uploaded audio; and Auphonic is an online audio processor offering adjustable noise reduction levels but no custom profile upload.[64][65][66] No widely available online tool fully replicates the custom noise-profile workflow, such as Adobe Audition's noise print for spectral subtraction-based reduction; most rely on pre-trained AI models for automatic processing, so for precise noise-profile control, desktop software such as the free Audacity or Adobe Audition is typically recommended.
The Python library librosa facilitates custom denoising in research and development, providing functions for spectral analysis and effects such as trimming silence, which users combine with algorithms such as Wiener filtering for tailored noise suppression in scripts.[67][68] By 2025, AI integration has become a dominant trend, with tools such as iZotope RX evolving to handle complex, non-stationary noise through adaptive models, reflecting a market shift toward generative rather than purely subtractive methods.[69][70] Evaluating these tools often involves balancing interface accessibility against depth of algorithmic control: Audacity's straightforward GUI suits beginners with its profile-based workflow but lacks the granular spectral editing of iZotope RX, which exposes professional algorithm controls through visual spectrogram manipulation. Adobe Audition strikes a middle ground with intuitive presets alongside customizable parameters, while open-source options such as librosa demand programming knowledge for full algorithmic customization.[71][72] Mobile apps for audio noise reduction remain underexplored in comprehensive reviews, highlighting a gap in portable, on-device processing compared to desktop dominance.[72]
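To make the scripted workflow above concrete, the following sketch uses librosa to build a noise profile from a noise-only segment and applies a simple spectral-subtraction rule. The file names, the assumption that the first half second contains only noise, and the subtraction and flooring factors are illustrative choices, not any particular product's algorithm.

```python
import numpy as np
import librosa
import soundfile as sf

# Load a mono recording (the path is a placeholder for illustration).
y, sr = librosa.load("noisy.wav", sr=None, mono=True)

# Build a noise "profile" from a segment assumed to contain only noise
# (here: the first 0.5 seconds), analogous to selecting a noise sample.
noise = y[: int(0.5 * sr)]

n_fft, hop = 2048, 512
Y = librosa.stft(y, n_fft=n_fft, hop_length=hop)            # complex spectrogram
N = librosa.stft(noise, n_fft=n_fft, hop_length=hop)

noise_profile = np.mean(np.abs(N), axis=1, keepdims=True)   # average noise magnitude per bin

# Spectral subtraction: subtract the noise profile from each frame's magnitude,
# flooring at a small fraction of the original magnitude to limit musical noise.
mag, phase = np.abs(Y), np.angle(Y)
clean_mag = np.maximum(mag - 1.5 * noise_profile, 0.05 * mag)

y_denoised = librosa.istft(clean_mag * np.exp(1j * phase), hop_length=hop, length=len(y))
sf.write("denoised.wav", y_denoised, sr)
```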
Visual Applications
Noise Types in Images and Video
In digital images, noise manifests in various forms depending on the acquisition and transmission processes. Gaussian noise arises primarily from sensor electronics in charge-coupled device (CCD) and complementary metal-oxide-semiconductor (CMOS) imagers, including thermal and read-out noise, and becomes prominent under low-light conditions or at high ISO settings used to amplify weak signals.[73][74] This noise is characterized by a normal distribution, adding random variations to pixel intensities that appear as fine-grained fluctuations across the image.[75] Salt-and-pepper noise, also known as impulse noise, occurs due to transmission errors, bit errors in data storage, or defective pixels in the sensor, resulting in isolated bright (salt) or dark (pepper) pixels scattered randomly.[76] This type is particularly evident in compressed or digitized images, where sudden spikes disrupt otherwise smooth intensity gradients.[77] Poisson noise, or shot noise, stems from the quantum nature of photon detection in low-light scenarios, where the discrete arrival of photons leads to a variance equal to the mean signal intensity.[11] It is modeled by the Poisson distribution, in which the probability of observing $k$ photons given an expected count $\lambda$ is $P(k;\lambda) = \frac{\lambda^{k} e^{-\lambda}}{k!}$. This noise is inherent to photon-limited imaging in CCD and CMOS sensors, dominating in astronomical or medical applications with sparse illumination.[78]
In video sequences, noise extends beyond static images to include temporal dimensions, with spatial-temporal correlations arising from frame-to-frame dependencies. Temporal noise often emerges from motion-induced variations, such as inconsistencies in sensor response during object movement or camera shake, leading to flickering or jitter across frames.[79] Compression artifacts, introduced during encoding to reduce data rates, include blocking (visible grid patterns at macroblock boundaries), ringing (oscillations around sharp edges), and blurring, which propagate temporally if not mitigated.[80] Unlike single images, video noise propagates across frames due to inter-frame prediction in compression standards, necessitating approaches that maintain temporal consistency to avoid artifacts such as ghosting or inconsistent denoising.[81] These characteristics are exacerbated in low-light video capture, where sensor noise sources amplify both spatial and temporal irregularities.[82]
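To make these noise models concrete, the NumPy sketch below corrupts a synthetic grayscale image with the three types discussed above: additive Gaussian noise, salt-and-pepper impulses, and Poisson (shot) noise. The image and parameter values are placeholders for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.uniform(0.0, 1.0, size=(256, 256))  # stand-in for a clean grayscale image in [0, 1]

# Gaussian (sensor/read-out) noise: zero-mean normal variations added to every pixel.
gaussian = np.clip(img + rng.normal(0.0, 0.05, img.shape), 0.0, 1.0)

# Salt-and-pepper (impulse) noise: a small fraction of pixels forced to 0 or 1.
salt_pepper = img.copy()
mask = rng.random(img.shape)
salt_pepper[mask < 0.02] = 0.0   # pepper
salt_pepper[mask > 0.98] = 1.0   # salt

# Poisson (shot) noise: photon counts drawn with variance equal to the mean intensity.
peak = 50.0                                  # expected photons at full intensity
poisson = rng.poisson(img * peak) / peak
```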
Spatial Denoising Methods
Spatial denoising methods apply filters directly to pixel values in the local neighborhood of each pixel within an image, aiming to suppress noise while ideally preserving structural details such as edges and textures. These techniques are foundational for processing still images affected by additive noise models, including Gaussian and impulse types such as salt-and-pepper noise, and operate without transforming the image into another domain. By focusing on spatial locality, they enable efficient computation suitable for real-time applications, though they often trade noise suppression against detail preservation.
Linear spatial filters provide straightforward noise reduction through convolution with a kernel that averages neighboring pixels. The mean filter, a basic linear approach, computes the output at each pixel as the arithmetic average of values within a sliding window $W$, formulated as $\hat{I}(x,y) = \frac{1}{|W|} \sum_{(s,t) \in W} g(s,t)$, where $g$ is the noisy input image and $\hat{I}$ the filtered output; this effectively attenuates Gaussian noise by smoothing uniform regions but introduces blurring across edges and fine details.[2] Similarly, the Gaussian blur filter employs a Gaussian-weighted kernel to prioritize closer neighbors, reducing high-frequency noise components more selectively than the uniform mean filter while still risking over-smoothing in textured areas; the kernel is typically defined by a standard deviation $\sigma$ that controls the extent of blurring.[83]
Nonlinear filters address the limitations of linear methods by applying order-statistics or edge-aware operations, better handling non-Gaussian noise without uniform blurring. The median filter replaces each pixel with the median value from its neighborhood, excelling at removing impulse noise such as salt-and-pepper artifacts by isolating and replacing outliers; introduced by Tukey for signal smoothing, it preserves edges more effectively than linear alternatives in noisy scenarios.[84] The bilateral filter enhances this by incorporating both spatial proximity and radiometric similarity in its weighting, computed as $\hat{I}(x) = \frac{1}{W_p} \sum_{x_i \in \Omega} I(x_i)\, G_{\sigma_s}(\lVert x_i - x \rVert)\, G_{\sigma_r}(\lvert I(x_i) - I(x) \rvert)$, where $G_{\sigma_s}$ and $G_{\sigma_r}$ are Gaussian functions for the spatial and range kernels, respectively, and $W_p$ normalizes the weights; this edge-preserving smoothing, proposed by Tomasi and Manduchi, balances noise reduction with fidelity to intensity discontinuities.[85]
Anisotropic diffusion models offer iterative, edge-directed smoothing through partial differential equations that adapt diffusion to local image gradients. The Perona-Malik framework evolves the image via $\frac{\partial I}{\partial t} = \operatorname{div}\!\left( c(\lVert \nabla I \rVert)\, \nabla I \right)$, where $c$ is a decreasing conduction coefficient (e.g., $c(s) = e^{-(s/K)^{2}}$) that slows diffusion across strong edges (characterized by gradient magnitudes exceeding the threshold $K$) while allowing smoothing within regions; this nonlinear process effectively denoises while enhancing edges, as demonstrated in early scale-space applications.[86]
A key tradeoff in spatial denoising is the inverse relationship between noise removal efficacy and structural preservation: linear filters such as the mean and Gaussian excel at suppressing random fluctuations but blur details indiscriminately, whereas nonlinear methods such as the median and bilateral filters remove artifacts like impulses with less distortion yet may leave residual noise in homogeneous areas or introduce artifacts in complex textures.
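As a concrete illustration of the nonlinear filters described above, the following sketch applies OpenCV's median and bilateral filters to an 8-bit grayscale image; the file names and parameter values are placeholders rather than recommended settings.

```python
import cv2

# Read a noisy 8-bit grayscale image (the path is a placeholder).
noisy = cv2.imread("noisy.png", cv2.IMREAD_GRAYSCALE)

# Median filter: replaces each pixel with the median of a 5x5 neighborhood,
# which is effective against salt-and-pepper impulses.
median = cv2.medianBlur(noisy, 5)

# Bilateral filter: weights neighbors by both spatial distance (sigmaSpace)
# and intensity similarity (sigmaColor), smoothing noise while keeping edges.
bilateral = cv2.bilateralFilter(noisy, d=9, sigmaColor=75, sigmaSpace=75)

cv2.imwrite("median.png", median)
cv2.imwrite("bilateral.png", bilateral)
```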
In the 2020s, smartphone computational photography pipelines have increasingly adopted hybrid spatial filters that combine elements of linear smoothing with nonlinear edge preservation, such as guided bilateral variants, to achieve real-time denoising tailored to mobile sensor noise patterns, outperforming standalone filters on datasets like SIDD.
Frequency and Transform-Based Methods
Frequency and transform-based methods convert images or video frames into alternative domains, such as frequency or multi-resolution representations, to separate noise from signal components more effectively than spatial-domain processing alone. These techniques exploit the fact that noise often manifests differently in transform coefficients, enabling selective attenuation while preserving edges and textures. Unlike purely local spatial filters, which may blur details, transform methods provide global or multi-scale analysis for superior noise reduction in structured signals.[2]
In the Fourier domain, the Wiener filter serves as a foundational approach for denoising by estimating the original signal through minimum mean square error optimization. It applies a frequency-domain multiplier to the noisy Fourier transform, balancing signal restoration against noise amplification, and is particularly effective for stationary noise such as Gaussian white noise in images. When the point spread function is known, the filter's transfer function is $W(u,v) = \frac{H^{*}(u,v)}{\lvert H(u,v) \rvert^{2} + S_n(u,v)/S_f(u,v)}$, where $H$ is the Fourier transform of the degradation function, $S_n$ is the noise power spectrum, and $S_f$ is the original signal's power spectrum; practical implementations estimate these spectra from the observed image. This method has been shown to outperform inverse filtering by reducing ringing artifacts in restored images.[87]
Wavelet transforms enable multi-resolution denoising by decomposing images into subbands via scalable basis functions, allowing noise suppression primarily in the detail coefficients. The dyadic wavelet basis is defined as $\psi_{j,k}(t) = 2^{j/2}\, \psi(2^{j} t - k)$, where $j$ controls scale and $k$ translation, providing localized time-frequency analysis well suited to transient signals. Seminal work introduced soft and hard thresholding of these coefficients: hard thresholding sets coefficients below a threshold $\lambda$ to zero, while soft thresholding shrinks coefficients exceeding $\lambda$ toward zero by $\lambda$, with $\lambda$ often chosen as the universal threshold $\sigma \sqrt{2 \ln N}$ for noise standard deviation $\sigma$ and $N$ image pixels. This approach achieves near-minimax rates for estimating functions in Besov spaces, with soft thresholding preferred for its continuity and bias reduction, yielding PSNR improvements of 2-5 dB over linear methods on standard test images such as Lena under additive Gaussian noise.[88]
The discrete cosine transform (DCT), widely used in video compression standards such as MPEG, facilitates denoising in the transform domain by thresholding or adapting coefficients to mitigate quantization noise introduced during encoding. In video applications, a 3D DCT across spatial-temporal blocks compacts energy, allowing soft thresholding of high-frequency coefficients to reduce chroma noise while preserving luminance details; for example, this has demonstrated effective suppression of mosquito noise around edges in compressed videos, with bitrate savings of up to 20% when integrated into encoding pipelines. DCT-based methods are particularly suited to block artifacts in JPEG-compressed images, where coefficient adjustment smooths discontinuities without full inverse transforms.[89]
Non-local means (NLM) denoising leverages self-similarity across the entire image by weighting pixel contributions based on patch similarities, effectively operating in a transform-like space of redundant structures rather than fixed bases.
Introduced as a method that replaces each pixel with a weighted average of similar pixels found globally, using Gaussian-weighted distances between neighborhoods, NLM preserves textures better than local filters, achieving state-of-the-art PSNR on images with Gaussian noise at levels up to σ=50, though at a higher computational cost mitigated by fast approximations.[90] These methods find key applications in removing compression artifacts, such as JPEG blocking and ringing, where DCT-domain processing directly modifies quantized coefficients to restore smoothness. For textured noise, curvelet transforms extend wavelets by capturing curvilinear singularities with directional elements, outperforming wavelets in preserving fine textures; recent analyses confirm that curvelet coefficient thresholding yields higher PSNR (e.g., gains of up to 7 dB over wavelets) for textured regions in noisy images, with a 2024 study highlighting its efficacy in adaptive implementations for complex scenes.[91][92][93]
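A minimal wavelet soft-thresholding sketch using the PyWavelets library is given below. It estimates the noise standard deviation from the finest diagonal subband via the median absolute deviation (a common heuristic), applies the universal threshold with soft shrinkage, and reconstructs the image; the wavelet choice and decomposition depth are illustrative assumptions.

```python
import numpy as np
import pywt

def wavelet_denoise(noisy, wavelet="db4", level=3):
    """Soft-threshold the detail coefficients of a 2-D wavelet decomposition."""
    coeffs = pywt.wavedec2(noisy, wavelet, level=level)

    # Estimate the noise standard deviation from the finest diagonal detail band
    # using the median absolute deviation (robust to image content).
    sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    threshold = sigma * np.sqrt(2.0 * np.log(noisy.size))  # universal threshold

    # Keep the approximation band; soft-threshold every detail band.
    denoised_coeffs = [coeffs[0]]
    for detail_level in coeffs[1:]:
        denoised_coeffs.append(tuple(
            pywt.threshold(band, threshold, mode="soft") for band in detail_level
        ))

    return pywt.waverec2(denoised_coeffs, wavelet)
```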
Model and Learning-Based Methods
Model and learning-based methods in image and video denoising leverage probabilistic frameworks and data-driven techniques to model noise and image priors, achieving superior performance over traditional filters by incorporating statistical assumptions and learned representations. These approaches treat denoising as an inference problem, estimating the clean image from noisy observations under uncertainty.
Statistical methods, such as Bayesian estimators, formulate denoising as maximum a posteriori (MAP) estimation, where a prior distribution on the image captures smoothness or sparsity. A seminal Bayesian approach applies non-local means within a probabilistic framework, as in the Non-Local Bayes (NL-Bayes) algorithm, which estimates pixel values by aggregating similar patches while accounting for the noise variance through a Gaussian model of each group of patches. This method outperforms earlier linear estimators by adaptively weighting patch similarities, yielding PSNR improvements of 0.5-1 dB on standard benchmarks such as the Kodak images. Markov random fields (MRFs) provide a foundational prior for Bayesian denoising, modeling local image dependencies via Gibbs distributions to enforce piecewise smoothness. The seminal work by Geman and Geman introduced stochastic relaxation for MRF-based restoration, enabling simulated annealing to solve the energy minimization problem and recover edges in noisy images, influencing subsequent developments in continuous-domain denoising.
Block-matching techniques extend statistical modeling by grouping similar patches across the image or video, forming 3D arrays for collaborative filtering. The BM3D algorithm represents a high-impact contribution, performing block matching to stack similar 2D patches into 3D groups, followed by collaborative hard thresholding and Wiener filtering in a transform domain such as the DCT or wavelets. For images, BM3D achieves state-of-the-art non-learning results, with PSNR gains of up to 1.5 dB over competitors on Gaussian noise at σ=25, thanks to its exploitation of self-similarity. In video denoising, BM3D variants incorporate temporal redundancy by extending matching to spatio-temporal blocks, reducing flickering while preserving motion details.
Deep learning methods have revolutionized denoising by learning hierarchical features from data, typically trained on pairs of noisy and clean images. Convolutional neural networks (CNNs), exemplified by DnCNN, employ residual learning to predict the noise rather than the clean image, using batch normalization and ReLU activations in a deep architecture to handle blind Gaussian noise at levels up to σ=55. DnCNN surpasses BM3D in perceptual quality and PSNR by 0.3-0.8 dB on the BSD68 dataset, with faster inference due to its end-to-end design. These models can also take transform-domain features, such as wavelet coefficients, as inputs to enhance frequency-specific denoising. In 2025, challenges such as the NTIRE Image Denoising Challenge highlighted advances in self-supervised and hybrid methods for real-world noise, achieving state-of-the-art PSNR on diverse datasets.[94]
Diffusion models, emerging in the 2020s, offer generative approaches to denoising by iteratively reversing a forward noise-addition process, modeling the data distribution as a Markov chain. The foundational Denoising Diffusion Probabilistic Models (DDPM) framework learns to denoise from pure Gaussian noise over hundreds of steps, achieving FID scores of about 3 on CIFAR-10 for synthesis tasks adaptable to denoising.
Recent advances apply diffusion to blind denoising, where the noise parameters are unknown; for instance, Gibbs diffusion estimates both the signal and the noise spectrum from colored noise, improving SSIM by 0.05 on real-world images without paired data. These models excel at preserving textures but require computational acceleration for practical use. For video applications, extensions incorporate motion estimation via optical flow to align frames before denoising, mitigating temporal inconsistencies. Optical flow-guided methods, such as those using reliable motion estimation with spatial regularization, propagate clean pixels across frames while suppressing structured noise, achieving 1-2 dB PSNR gains on sequences such as Foreman at σ=25. Recent generative AI developments, including blind-spot guided diffusion, enable self-supervised video denoising by masking spatial neighbors during training, handling real-world noise without clean references and outperforming supervised CNNs across diverse degradations as of 2025.
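The residual-learning idea behind DnCNN-style denoisers can be sketched compactly in PyTorch: a stack of convolution, batch-normalization, and ReLU layers is trained to predict the noise, which is then subtracted from the input. This is an illustrative, untrained skeleton under assumed depth and channel counts, not the published reference implementation.

```python
import torch
import torch.nn as nn

class SmallDnCNN(nn.Module):
    """Residual denoiser: the network predicts the noise, not the clean image."""

    def __init__(self, channels=1, features=64, depth=8):
        super().__init__()
        layers = [nn.Conv2d(channels, features, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [
                nn.Conv2d(features, features, 3, padding=1, bias=False),
                nn.BatchNorm2d(features),
                nn.ReLU(inplace=True),
            ]
        layers.append(nn.Conv2d(features, channels, 3, padding=1))
        self.body = nn.Sequential(*layers)

    def forward(self, noisy):
        # Residual learning: subtract the predicted noise from the noisy input.
        return noisy - self.body(noisy)

# Typical training step against (noisy, clean) pairs (tensors are placeholders).
model = SmallDnCNN()
loss_fn = nn.MSELoss()
noisy = torch.randn(4, 1, 64, 64)   # stand-in noisy patches
clean = torch.zeros_like(noisy)     # stand-in clean patches
loss = loss_fn(model(noisy), clean)
loss.backward()
```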
Visual Software Tools
Visual software tools for noise reduction in images and videos provide user-friendly interfaces that integrate algorithmic methods to enhance clarity while preserving details, often supporting both still and moving content through plugins, standalone applications, or libraries.[95][96][97] In open-source environments, the GNU Image Manipulation Program (GIMP) can use the G'MIC-Qt plugin, which offers over 500 filters including dedicated noise reduction tools such as wavelet-based and anisotropic smoothing options to handle luminance and color noise effectively.[95][98] Similarly, Adobe Photoshop incorporates a built-in Reduce Noise filter alongside third-party plugins such as Noiseware and G'MIC, enabling selective denoising that targets ISO-induced artifacts while maintaining edge sharpness.[99][100] For video workflows, DaVinci Resolve from Blackmagic Design features temporal and spatial noise reduction with real-time previews and adjustments in a node-based interface to mitigate grain in high-ISO footage.[96] Machine learning-based tools such as Topaz DeNoise AI stand out for deep learning models trained on diverse datasets, automatically distinguishing noise from detail in RAW files and supporting formats up to 100 MP with minimal artifacts.
Key features across these tools include batch processing for handling multiple files efficiently and GPU acceleration to speed up computation, particularly in demanding scenarios such as 4K video denoising.[101] Open-source libraries such as OpenCV enable custom pipelines through functions like fastNlMeansDenoising, which averages similar patches for Gaussian noise removal, and denoise_TVL1 for total variation-based smoothing, and can be integrated into scripts or applications for tailored visual effects.[97][102]
Emerging trends emphasize accessibility via mobile apps and cloud integration; for instance, Google's Snapseed app employs structure and healing tools to indirectly reduce noise in low-light photos through selective sharpening and blending.[103] In the 2020s, Adobe's Sensei AI platform has underpinned denoising features across Creative Cloud, such as the neural-network-based Denoise feature in Lightroom and Photoshop, which produces largely artifact-free results.[104] A notable gap in traditional documentation is the growing role of real-time visual noise reduction in augmented reality (AR) and virtual reality (VR) applications, where tools such as Unity's post-processing stack incorporate adaptive denoising shaders to maintain immersion in dynamic, low-light environments.
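For OpenCV-based pipelines such as those mentioned above, a minimal usage sketch of the library's non-local means function for color images follows; the file name and filter strengths are placeholders.

```python
import cv2

# Color image denoising with non-local means: h controls the luminance filtering
# strength and hColor the chroma strength; larger values remove more noise but
# risk washing out fine detail.
noisy_bgr = cv2.imread("noisy_color.png")
denoised = cv2.fastNlMeansDenoisingColored(noisy_bgr, None, h=10, hColor=10,
                                           templateWindowSize=7, searchWindowSize=21)
cv2.imwrite("denoised_color.png", denoised)
```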
Specialized Applications
Seismic Exploration
In seismic exploration, noise sources significantly degrade the quality of geophysical data used for oil and gas prospecting. Ground roll, a low-velocity surface wave generated by the seismic source, propagates along the earth's surface and often masks primary reflections because of its strong amplitude and low frequency. Multiples, which are unwanted reflections from interfaces such as the sea surface or subsurface layers, interfere with primary signals by creating ghosting effects and reducing resolution in subsurface imaging. Cultural noise, arising from human activities such as traffic, machinery, or power lines, introduces erratic coherent and incoherent disturbances, particularly in onshore surveys where near-surface heterogeneity exacerbates the issue.[105][106][107][108][109]
Key techniques for noise reduction in seismic data leverage wave propagation properties to enhance signal-to-noise ratios. Stack averaging, applied during common midpoint (CMP) processing, combines multiple traces recorded at different offsets to suppress random noise while preserving coherent reflections, as the signal adds constructively while the noise cancels statistically. Predictive deconvolution models and subtracts multiples by predicting the wavelet shape from the data itself, effectively compressing the seismic wavelet and attenuating reverberations without requiring an a priori velocity model. F-k filtering, performed in the frequency-wavenumber domain, separates noise based on velocity differences; for instance, it rejects low-velocity ground roll by applying a velocity fan filter that passes primary reflections while muting slower coherent noise.[110][111][112][113][114]
Historically, seismic migration methods emerged in the 1950s to address wave propagation distortions, with early techniques such as diffraction summation correcting dip-dependent errors in unmigrated sections and laying the groundwork for noise-aware imaging. In modern applications, machine learning approaches such as convolutional neural networks have advanced coherent noise attenuation by learning spatial patterns from training data, outperforming traditional filters on complex land datasets contaminated by ground roll and multiples. These methods improve subsurface imaging by enhancing resolution and reducing artifacts, enabling clearer delineation of reservoirs. In the 2020s, fiber-optic distributed acoustic sensing (DAS) systems, which use existing cables as dense sensor arrays, have introduced new noise challenges such as instrumental polarization noise; attenuation via wavelet stacking or deep learning models has demonstrated significant resolution gains in vertical seismic profiling.[115][116][117][118][119][120][121]
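The f-k filtering approach described above can be sketched with NumPy's 2-D FFT: a shot gather (time by offset) is transformed to the frequency-wavenumber domain, and energy whose apparent velocity |f/k| falls below a cutoff (such as slow ground roll) is muted before transforming back. The synthetic gather, sampling intervals, and cutoff velocity are illustrative assumptions; a practical implementation would taper the mask to limit ringing.

```python
import numpy as np

def fk_velocity_filter(gather, dt, dx, v_cut):
    """Mute f-k energy with apparent velocity |f/k| below v_cut (e.g., ground roll)."""
    n_t, n_x = gather.shape
    spectrum = np.fft.fft2(gather)

    f = np.fft.fftfreq(n_t, d=dt)      # temporal frequencies (Hz)
    k = np.fft.fftfreq(n_x, d=dx)      # spatial wavenumbers (1/m)
    F, K = np.meshgrid(f, k, indexing="ij")

    # Pass only events faster than the cutoff: |f| >= v_cut * |k|.
    # (A sharp binary mask is used here; tapering would reduce ringing.)
    mask = np.abs(F) >= v_cut * np.abs(K)
    return np.real(np.fft.ifft2(spectrum * mask))

# Example with a synthetic gather: 1000 samples at 2 ms, 120 traces at 25 m spacing.
rng = np.random.default_rng(1)
gather = rng.normal(size=(1000, 120))
filtered = fk_velocity_filter(gather, dt=0.002, dx=25.0, v_cut=1000.0)
```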
Communications and Medical Imaging
In wireless communications, noise reduction is essential for maintaining reliable data transmission over noisy channels, particularly in modern systems such as 5G and emerging 6G networks. Orthogonal frequency-division multiplexing (OFDM) serves as a foundational technique for mitigating inter-symbol interference and multipath fading, where equalization compensates for channel distortions by inverting the frequency response. In 5G systems, OFDM-based equalization enhances spectral efficiency and reduces bit error rates in high-mobility scenarios, with notable improvements in signal-to-noise ratio (SNR) under urban fading conditions. In 6G visions, advanced OFDM variants incorporate AI-driven equalization to handle terahertz-band noise, enabling higher data rates while suppressing interference from massive MIMO arrays.
Forward error correction (FEC) coding further bolsters noise resilience by adding redundancy to detect and correct transmission errors. Turbo codes, introduced in the 1990s and widely adopted in 3G/4G standards, approach the Shannon limit for error correction, reducing the required SNR by 2-3 dB compared to convolutional codes in additive white Gaussian noise (AWGN) channels.[122] These parallel concatenated codes use iterative decoding to progressively refine symbol estimates, making them suitable for bandwidth-constrained satellite and mobile links. Adaptive beamforming complements these methods by dynamically adjusting antenna array weights to steer gain toward desired directions while nulling noise sources, improving SNR by 10-15 dB in multi-user environments.[123]
Channel estimation is a prerequisite for effective equalization, often employing pilot symbols to model the channel response. In OFDM systems, the least-squares estimator approximates the channel transfer function on each pilot subcarrier as $\hat{H}(k) = Y(k)/X(k)$, where $Y(k)$ is the received pilot signal and $X(k)$ is the known transmitted pilot, assuming negligible noise for high-SNR pilots; in matrix form for multi-antenna setups this becomes $\hat{\mathbf{h}} = (\mathbf{X}^{H}\mathbf{X})^{-1}\mathbf{X}^{H}\mathbf{y}$.[124] Such estimates enable zero-forcing or minimum mean-square error equalizers to recover clean symbols from noisy receptions.
In medical imaging, noise reduction techniques address modality-specific artifacts to enhance diagnostic accuracy without increasing radiation dose or scan time. For magnetic resonance imaging (MRI), k-space filtering suppresses thermal noise by applying low-pass filters in the Fourier domain, where k-space data are smoothed to boost SNR while preserving edge detail.[125] In computed tomography (CT), Poisson noise arises from photon starvation in low-dose scans and is modeled by $P(k;\lambda) = \frac{\lambda^{k} e^{-\lambda}}{k!}$ with variance equal to the mean intensity $\lambda$; reduction methods such as bilateral filtering or block-matching 3D denoising lower the noise variance, enabling substantial dose reductions while maintaining contrast-to-noise ratios for lesion detection.[126] Ultrasound imaging contends with multiplicative speckle noise, which degrades tissue boundaries; suppression via anisotropic diffusion or wavelet thresholding improves the speckle SNR, facilitating clearer visualization of structures such as tumors.[127] Dictionary learning emerges as a versatile technique across these modalities, training sparse overcomplete dictionaries from image patches to represent clean signals while isolating noise.
In medical contexts, K-SVD-based dictionary learning reconstructs denoised images by solving $\min_{\boldsymbol{\alpha}} \lVert \mathbf{y} - \mathbf{D}\boldsymbol{\alpha} \rVert_2^{2}$ subject to $\lVert \boldsymbol{\alpha} \rVert_0 \le T$, where $\mathbf{y}$ is the noisy patch, $\mathbf{D}$ the learned dictionary, and $\boldsymbol{\alpha}$ the sparse coefficient vector; this yields PSNR improvements in MRI and CT by adapting to anatomical priors.[128] Recent advances, particularly post-2020, integrate AI for improved denoising in medical imaging and extend to quantum communications. Deep learning models such as convolutional neural networks (CNNs) and U-Net variants perform unsupervised or self-supervised denoising, achieving superior structural similarity indices in low-dose CT and MRI compared to traditional filters, as evidenced in clinical trials of accelerated scans.[129] In communications, developments reported in 2025 in quantum error correction target noisy intermediate-scale quantum (NISQ) channels, with variational codes optimized for amplitude damping noise via tailored stabilizers, reducing logical error rates below $10^{-3}$ in multi-qubit setups.[130] These surface-code extensions enable fault-tolerant quantum links, approaching theoretical bounds for depolarizing noise.[131]
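A minimal NumPy sketch of the pilot-based least-squares channel estimate and zero-forcing equalization discussed earlier in this section is shown below; the synthetic channel, pilot symbols, and noise level are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subcarriers = 64

# Known QPSK pilot symbols and a synthetic frequency-selective channel.
pilots = (rng.choice([-1, 1], n_subcarriers) + 1j * rng.choice([-1, 1], n_subcarriers)) / np.sqrt(2)
h_time = (rng.normal(size=4) + 1j * rng.normal(size=4)) / np.sqrt(8)
H_true = np.fft.fft(h_time, n_subcarriers)          # per-subcarrier channel response

# Received pilots: Y = H * X + noise (per-subcarrier OFDM model).
noise = 0.05 * (rng.normal(size=n_subcarriers) + 1j * rng.normal(size=n_subcarriers))
Y = H_true * pilots + noise

# Least-squares channel estimate H_hat = Y / X on each pilot subcarrier.
H_hat = Y / pilots

# Zero-forcing equalization of a subsequent data symbol: divide by the estimate.
data = (rng.choice([-1, 1], n_subcarriers) + 1j * rng.choice([-1, 1], n_subcarriers)) / np.sqrt(2)
data_rx = H_true * data + 0.05 * (rng.normal(size=n_subcarriers) + 1j * rng.normal(size=n_subcarriers))
equalized = data_rx / H_hat
```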
References
- https://wiki.seg.org/wiki/Introduction_to_noise_and_multiple_attenuation
- https://wiki.seg.org/wiki/Predictive_deconvolution_in_practice
- https://wiki.seg.org/wiki/Deconvolution_methods
- https://wiki.seg.org/wiki/F-k_filtering

