Moving average
In statistics, a moving average (rolling average or running average or moving mean[1] or rolling mean) is a calculation to analyze data points by creating a series of averages of different selections of the full data set. Variations include: simple, cumulative, or weighted forms.
Mathematically, a moving average is a type of convolution. Thus in signal processing it is viewed as a low-pass finite impulse response filter. Because the boxcar function outlines its filter coefficients, it is called a boxcar filter. It is sometimes followed by downsampling.
Given a series of numbers and a fixed subset size, the first element of the moving average is obtained by taking the average of the initial fixed subset of the number series. Then the subset is modified by "shifting forward"; that is, excluding the first number of the series and including the next value in the series.
A moving average is commonly used with time series data to smooth out short-term fluctuations and highlight longer-term trends or cycles - in this case the calculation is sometimes called a time average. The threshold between short-term and long-term depends on the application, and the parameters of the moving average will be set accordingly. It is also used in economics to examine gross domestic product, employment or other macroeconomic time series. When used with non-time series data, a moving average filters higher frequency components without any specific connection to time, although typically some kind of ordering is implied. Viewed simplistically it can be regarded as smoothing the data.
Simple moving average
In financial applications a simple moving average (SMA) is the unweighted mean of the previous k data-points. However, in science and engineering, the mean is normally taken from an equal number of data on either side of a central value. This ensures that variations in the mean are aligned with the variations in the data rather than being shifted in time. An example of a simple equally weighted running mean is the mean over the last k entries of a data-set containing n entries. Let those data-points be p_1, p_2, ..., p_n. This could be closing prices of a stock. The mean over the last k data-points (days in this example) is denoted as SMA_k and calculated as:

SMA_k = (p_{n-k+1} + p_{n-k+2} + ... + p_n) / k
When calculating the next mean SMA_{k,next} with the same sampling width k, the range from n - k + 2 to n + 1 is considered. A new value p_{n+1} comes into the sum and the oldest value p_{n-k+1} drops out:

SMA_{k,next} = SMA_{k,prev} + (1/k) × (p_{n+1} - p_{n-k+1})

This simplifies the calculations by reusing the previous mean SMA_{k,prev}. It means that the moving average filter can be computed quite cheaply on real-time data with a FIFO / circular buffer and only 3 arithmetic steps.
During the initial filling of the FIFO / circular buffer the sampling window is equal to the data-set size, thus k = n, and the average calculation is performed as a cumulative moving average.
The period selected (k) depends on the type of movement of interest, such as short, intermediate, or long-term.
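The constant-time update described above can be sketched as follows; the class name is illustrative, and a deque stands in for the FIFO / circular buffer. While the buffer is still filling, the result is the cumulative moving average, as noted above.

```python
from collections import deque

class RollingMean:
    """Simple moving average over the last k values, updated in O(1).

    Each update adds the newest value to a running sum and, once the
    window is full, subtracts the value that drops out -- the three
    arithmetic steps described in the text.
    """

    def __init__(self, k):
        self.k = k
        self.buf = deque()
        self.total = 0.0

    def update(self, x):
        self.buf.append(x)
        self.total += x
        if len(self.buf) > self.k:
            self.total -= self.buf.popleft()  # oldest value drops out
        # While the buffer is still filling, this is a cumulative average.
        return self.total / len(self.buf)
```

For example, feeding the values 1..5 through a 3-point window yields the cumulative averages 1.0, 1.5, 2.0 during fill-up, then the rolling means 3.0 and 4.0.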
If the data used are not centered around the mean, a simple moving average lags behind the latest datum by half the sample width. An SMA can also be disproportionately influenced by old data dropping out or new data coming in. One characteristic of the SMA is that if the data has a periodic fluctuation, then applying an SMA of that period will eliminate that variation (the average always containing one complete cycle). But a perfectly regular cycle is rarely encountered.[2]
For a number of applications, it is advantageous to avoid the shifting induced by using only "past" data. Hence a central moving average can be computed, using data equally spaced on either side of the point in the series where the mean is calculated.[3] This requires using an odd number of points in the sample window.
A major drawback of the SMA is that it lets through a significant amount of the signal at periods shorter than the window length. This can lead to unexpected artifacts, such as peaks in the smoothed result appearing where there were troughs in the data. It also leads to the result being less smooth than expected, since some of the higher frequencies are not properly removed.
Its frequency response is a type of low-pass filter called sinc-in-frequency.
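Since the SMA's filter coefficients form a boxcar kernel, the same result can be obtained from a generic discrete convolution; a minimal sketch (function name illustrative):

```python
def convolve_valid(x, kernel):
    """'Valid'-mode discrete convolution: the kernel slides over x and
    only positions with full overlap are kept. With a symmetric boxcar
    kernel this is exactly the simple moving average."""
    k = len(kernel)
    return [sum(kernel[j] * x[i + j] for j in range(k))
            for i in range(len(x) - k + 1)]

boxcar = [1 / 3] * 3  # uniform filter coefficients for a 3-point SMA
```

For instance, `convolve_valid([1, 2, 3, 4, 5], boxcar)` gives the three full-window means 2, 3, 4 (up to floating-point rounding).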
Continuous moving average
The continuous moving average of an integrable function f is defined via integration as:

MA_f(x) = (1 / (2ε)) ∫_{x-ε}^{x+ε} f(t) dt

where the environment ±ε around x defines the intensity of smoothing of the graph of the function. A larger ε smoothes the source graph of the function (blue) more. The animations below show the moving average for different values of ε. The fraction 1/(2ε) is used because 2ε is the interval width for the integral. Naturally, lim_{ε→0} MA_f(x) = f(x), by the fundamental theorem of calculus and L'Hôpital's rule.
- Continuous moving average of a sine and a polynomial - visualization of the smoothing with a small interval for integration
- Continuous moving average of a sine and a polynomial - visualization of the smoothing with a larger interval for integration
- Animation showing the impact of interval width on smoothing by moving average
Cumulative average
In a cumulative average (CA), the data arrive in an ordered datum stream, and the user would like to get the average of all of the data up until the current datum. For example, an investor may want the average price of all of the stock transactions for a particular stock up until the current time. As each new transaction occurs, the average price at the time of the transaction can be calculated for all of the transactions up to that point using the cumulative average, typically an equally weighted average of the sequence of n values x_1, ..., x_n up to the current time:

CA_n = (x_1 + x_2 + ... + x_n) / n
The brute-force method to calculate this would be to store all of the data, calculate the sum, and divide by the number of points every time a new datum arrived. However, it is possible to simply update the cumulative average as a new value x_{n+1} becomes available, using the formula

CA_{n+1} = CA_n + (x_{n+1} - CA_n) / (n + 1)
Thus the current cumulative average for a new datum is equal to the previous cumulative average, times n, plus the latest datum, all divided by the number of points received so far, n + 1. When all of the data arrive (n = N), then the cumulative average will equal the final average. It is also possible to store a running total of the data as well as the number of points, dividing the total by the number of points to get the CA each time a new datum arrives.
The derivation of the cumulative average formula is straightforward. Using x_1 + x_2 + ... + x_n = n × CA_n and similarly for n + 1, it is seen that

x_{n+1} = (n + 1) × CA_{n+1} - n × CA_n

Solving this equation for CA_{n+1} results in

CA_{n+1} = (x_{n+1} + n × CA_n) / (n + 1) = CA_n + (x_{n+1} - CA_n) / (n + 1)
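The incremental update derived above can be sketched as a small generator that stores only the previous average and the count (the function name is illustrative):

```python
def cumulative_averages(stream):
    """Yield the cumulative average after each datum, updated
    incrementally via CA_{n+1} = CA_n + (x_{n+1} - CA_n) / (n + 1),
    so no past data needs to be stored."""
    ca = 0.0
    for n, x in enumerate(stream):
        ca += (x - ca) / (n + 1)
        yield ca
```

For the stream 1, 2, 3 this yields 1.0, 1.5, 2.0, matching the direct definition.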
Weighted moving average
A weighted average is an average that has multiplying factors to give different weights to data at different positions in the sample window. Mathematically, the weighted moving average is the convolution of the data with a fixed weighting function. One application is removing pixelization from a digital graphical image; this is also known as anti-aliasing.[citation needed]
In the financial field, and more specifically in the analyses of financial data, a weighted moving average (WMA) has the specific meaning of weights that decrease in arithmetical progression.[4] In an n-day WMA the latest day has weight n, the second latest n - 1, etc., down to one.

WMA_M = (n × p_M + (n-1) × p_{M-1} + ... + 2 × p_{M-n+2} + p_{M-n+1}) / (n + (n-1) + ... + 2 + 1)

The denominator is a triangle number equal to n(n+1)/2. In the more general case the denominator will always be the sum of the individual weights.
When calculating the WMA across successive values, the difference between the numerators of WMA_{M+1} and WMA_M is n × p_{M+1} - p_M - ... - p_{M-n+1}. If we denote the sum p_M + ... + p_{M-n+1} by Total_M, then

Total_{M+1} = Total_M + p_{M+1} - p_{M-n+1}
Numerator_{M+1} = Numerator_M + n × p_{M+1} - Total_M
WMA_{M+1} = Numerator_{M+1} / (n(n+1)/2)
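The incremental numerator/total update can be sketched as follows, assuming the series is at least one full window long (names are illustrative):

```python
def wma(prices, n):
    """n-day weighted moving average with weights 1, 2, ..., n from
    oldest to newest. After the first window, the running Total and
    Numerator are updated incrementally: the Numerator gains n times
    the new price and loses the previous Total."""
    denom = n * (n + 1) / 2                    # triangle number
    total = sum(prices[:n])                    # plain sum of first window
    numer = sum((i + 1) * p for i, p in enumerate(prices[:n]))
    out = [numer / denom]
    for m in range(n, len(prices)):
        numer += n * prices[m] - total
        total += prices[m] - prices[m - n]
        out.append(numer / denom)
    return out
```

For prices [1, 2, 3, 4] with n = 3, the first value is (1×1 + 2×2 + 3×3)/6 = 14/6 and the second is 20/6, which matches the direct weighted sums.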
Plotted, the weights decrease linearly, from the highest weight for the most recent datum down to zero. They can be compared to the weights in the exponential moving average which follows.
Exponential moving average
An exponential moving average (EMA), also known as an exponentially weighted moving average (EWMA),[5] is a first-order infinite impulse response filter that applies weighting factors which decrease exponentially. The weighting for each older datum decreases exponentially, never reaching zero. This formulation is according to Hunter (1986).[6]
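A minimal sketch of the recursion EMA_t = α × x_t + (1 - α) × EMA_{t-1}, seeding with the first observation, which is one common convention (the function name is illustrative):

```python
def ema(values, alpha):
    """Exponentially weighted moving average (first-order IIR filter).
    Each step keeps a fraction (1 - alpha) of the previous estimate,
    so the weight on older data decays exponentially but never reaches
    zero. Initialised here with the first observation."""
    it = iter(values)
    s = next(it)
    out = [s]
    for x in it:
        s = alpha * x + (1 - alpha) * s
        out.append(s)
    return out
```

A constant input passes through unchanged, and a step from 0 to 1 with alpha = 0.5 moves the estimate halfway toward the new value at each step.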
There is also a multivariate implementation of EWMA, known as MEWMA.[7]
Other weightings
Other weighting systems are used occasionally – for example, in share trading a volume weighting will weight each time period in proportion to its trading volume.
A further weighting, used by actuaries, is Spencer's 15-Point Moving Average[8] (a central moving average). Its symmetric weight coefficients are [−3, −6, −5, 3, 21, 46, 67, 74, 67, 46, 21, 3, −5, −6, −3], which factors as [1, 1, 1, 1]×[1, 1, 1, 1]×[1, 1, 1, 1, 1]×[−3, 3, 4, 3, −3]/320 and leaves samples of any quadratic or cubic polynomial unchanged.[9][10]
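The polynomial-preserving property of Spencer's weights can be checked numerically; a sketch with illustrative names:

```python
# Spencer's 15-point weights (symmetric; they sum to 320 before scaling)
SPENCER = [-3, -6, -5, 3, 21, 46, 67, 74, 67, 46, 21, 3, -5, -6, -3]

def spencer_15(f, t):
    """Spencer's centred 15-point moving average of f at integer t:
    offsets run from -7 to +7 and the weighted sum is divided by 320."""
    return sum(w * f(t + j) for j, w in zip(range(-7, 8), SPENCER)) / 320
```

Applied to any cubic polynomial, the result equals the polynomial's own value at the centre point, as the text states.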
Outside the world of finance, weighted running means have many forms and applications. Each weighting function or "kernel" has its own characteristics. In engineering and science the frequency and phase response of the filter is often of primary importance in understanding the desired and undesired distortions that a particular filter will apply to the data.
A mean does not just "smooth" the data. A mean is a form of low-pass filter. The effects of the particular filter used should be understood in order to make an appropriate choice.[citation needed]
Moving median
From a statistical point of view, the moving average, when used to estimate the underlying trend in a time series, is susceptible to rare events such as rapid shocks or other anomalies. A more robust estimate of the trend is the simple moving median over n time points:

Moving median = Median(p_M, p_{M-1}, ..., p_{M-n+1})

where the median is found by, for example, sorting the values inside the brackets and finding the value in the middle. For larger values of n, the median can be efficiently computed by updating an indexable skiplist.[11]
Statistically, the moving average is optimal for recovering the underlying trend of the time series when the fluctuations about the trend are normally distributed. However, the normal distribution does not place high probability on very large deviations from the trend which explains why such deviations will have a disproportionately large effect on the trend estimate. It can be shown that if the fluctuations are instead assumed to be Laplace distributed, then the moving median is statistically optimal.[12] For a given variance, the Laplace distribution places higher probability on rare events than does the normal, which explains why the moving median tolerates shocks better than the moving mean.
When the simple moving median above is central, the smoothing is identical to the median filter, which has applications in, for example, image signal processing. Because the moving median is found by sorting the values inside the time window and taking the middle value, it is more resistant to the impact of rare events than the moving average, and therefore provides a more reliable and stable estimate of the underlying trend even when the time series is affected by large deviations from the trend.
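A direct sketch of the moving median, sorting each window rather than maintaining an indexable skiplist (names are illustrative; the skiplist or a pair of heaps would replace the per-window sort for large n):

```python
from statistics import median

def moving_median(data, n):
    """Moving median over a window of n points, computed here by taking
    the median of each window directly (O(n log n) per step); for large
    n, an indexable skiplist updates the window in O(log n) per step."""
    return [median(data[i - n + 1:i + 1]) for i in range(n - 1, len(data))]
```

On data with a single large shock, e.g. [1, 2, 100, 3, 4] with n = 3, the moving median gives 2, 3, 4: the outlier never dominates any window, illustrating the robustness described above.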
Moving average regression model
[edit]In a moving average regression model, a variable of interest is assumed to be a weighted moving average of unobserved independent error terms; the weights in the moving average are parameters to be estimated.
The moving-average model and the moving averages discussed above are often confused due to their names, but while they share many similarities, they represent distinct methods and are used in very different contexts.
See also
- Exponential smoothing
- Local regression (LOESS and LOWESS)
- Kernel smoothing
- Moving average convergence/divergence indicator
- Martingale (probability theory)
- Moving average crossover
- Moving least squares
- Rising moving average
- Rolling hash
- Running total
- Savitzky–Golay filter
- Window function
- Zero lag exponential moving average
References
- ^ Hydrologic Variability of the Cosumnes River Floodplain (Booth et al., San Francisco Estuary and Watershed Science, Volume 4, Issue 2, 2006)
- ^ Statistical Analysis, Ya-lun Chou, Holt International, 1975, ISBN 0-03-089422-0, section 17.9.
- ^ The derivation and properties of the simple central moving average are given in full at Savitzky–Golay filter.
- ^ "Weighted Moving Averages: The Basics". Investopedia.
- ^ "DEALING WITH MEASUREMENT NOISE - Averaging Filter". Archived from the original on 2010-03-29. Retrieved 2010-10-26.
- ^ NIST/SEMATECH e-Handbook of Statistical Methods: Single Exponential Smoothing at the National Institute of Standards and Technology
- ^ Yeh, A.; Lin, D.; Zhou, H.; Venkataramani, C. (2003). "A multivariate exponentially weighted moving average control chart for monitoring process variability" (PDF). Journal of Applied Statistics. 30 (5): 507–536. Bibcode:2003JApSt..30..507Y. doi:10.1080/0266476032000053655. ISSN 0266-4763. Retrieved 16 January 2025.
- ^ Spencer's 15-Point Moving Average — from Wolfram MathWorld
- ^ Rob J Hyndman. "Moving averages". 2009-11-08. Accessed 2020-08-20.
- ^ Aditya Guntuboyina. "Statistics 153 (Time Series) : Lecture Three". 2012-01-24. Accessed 2024-01-07.
- ^ "Efficient Running Median using an Indexable Skiplist « Python recipes « ActiveState Code".
- ^ G.R. Arce, "Nonlinear Signal Processing: A Statistical Approach", Wiley:New Jersey, US, 2005.
Fundamentals
Definition
A moving average is a statistical calculation used to analyze data points by creating a series of averages from different subsets of a full data set, typically applied to time series to smooth variations in sequential observations.[14] This technique computes the mean of successive smaller sets of data, advancing one period at a time, which helps in producing a smoothed representation of the underlying pattern.[15] In its simplest form, for a data sequence x_1, x_2, ..., the moving average at time t with window size k is given by

MA_t = (1/k) × (x_t + x_{t-1} + ... + x_{t-k+1})

where the average is taken over the k most recent values up to time t.[15] This formulation assumes equal weighting for the simple case, focusing on arithmetic means of contiguous subsets.[16]

The primary purpose of a moving average is to reduce short-term noise and fluctuations in time series data, thereby highlighting longer-term trends or cycles for better pattern recognition and forecasting.[17] By smoothing out peaks and troughs, it provides a clearer view of the data's directional movement without altering the overall sequence.[14] It finds applications in fields such as finance for trend analysis and signal processing for noise reduction.[2] The concept of moving averages originated in the early 20th century within statistics, with early uses documented in economic data analysis around 1901 by R.H. Hooker, later termed "moving averages" by G. Udny Yule in 1927.[18]

Properties
Moving averages exhibit a smoothing effect by functioning as low-pass filters in signal processing, which attenuate high-frequency variations such as noise while preserving underlying low-frequency trends in data sequences.[19] This property arises because the filter's frequency response passes low frequencies with minimal amplitude reduction but severely attenuates higher frequencies, as seen in the amplitude response of a simple two-point moving average, |H(ω)| = |cos(ω/2)|, where low frequencies experience little damping compared to frequencies near the Nyquist frequency.[19] Consequently, applying a moving average reduces jaggedness in time series data, leveling out short-term fluctuations without substantially altering long-term patterns.[20]

In statistical estimation, simple moving averages serve as unbiased estimators of the underlying signal mean when the data follows a constant trend plus white noise, meaning their expected value equals the true parameter under such assumptions.[21] However, their variance decreases inversely with the window size: for a two-sided average of half-width k and noise variance σ², it is approximately σ²/(2k + 1), leading to higher variability for smaller windows and smoother but potentially over-smoothed outputs for larger ones.[21] This creates a fundamental bias-variance trade-off: smaller windows minimize bias by closely tracking local changes but amplify variance due to noise sensitivity, whereas larger windows reduce variance through averaging but introduce bias by oversmoothing, particularly near peaks or troughs where the estimate flattens, with bias growing with the window size for smooth trend functions.[21][22]

Moving averages contribute to stationarity in non-stationary time series through differencing operations, where first-order differencing (equivalent to filtering with kernel weights [1, -1]) stabilizes the mean by removing linear trends and level shifts.[23] In ARIMA modeling frameworks, such differencing transforms integrated processes into stationary ones, allowing subsequent moving average components to model the residuals effectively without time-varying statistical properties.[23] This approach ensures constant mean, variance, and autocovariance over time, a prerequisite for reliable time series analysis.[24]

Mathematically, moving averages can be represented as discrete convolutions of the input sequence with a kernel that defines the weights, such as a uniform kernel of ones divided by the window length for the simple moving average.[20] For a window of size k, the output at index n is

y[n] = (1/k) × (x[n] + x[n-1] + ... + x[n-k+1])

which corresponds to convolving the signal with a rectangular pulse kernel, enabling efficient computation via fast convolution algorithms and highlighting the filter's linear, time-invariant nature.[20] This convolution view also reveals the frequency-domain behavior, where the kernel's Fourier transform determines the low-pass characteristics.[20]

Edge effects arise in moving average computations near the boundaries of finite data sequences, where the sliding window cannot fully overlap due to insufficient preceding or following points, potentially leading to biased or incomplete estimates at the start and end.[25] Common handling strategies include using partial windows that average only available points within the boundary vicinity, or applying padding techniques such as zero-padding, edge replication, or reflection to extend the sequence artificially and maintain full window coverage.[25] These methods trade off between preserving data integrity and introducing minimal artifacts, with partial windows often preferred for avoiding artificial extensions in short series.[20]

Basic Types
Simple Moving Average
The simple moving average (SMA) is a fundamental smoothing technique in time series analysis that computes the arithmetic mean of a fixed number of consecutive observations, assigning equal weight to each value within the specified window. This method applies uniform weights of 1/k to the k most recent observations, where k is the window size, making it particularly suitable for identifying underlying trends by reducing short-term fluctuations in data.[14][26] The formula for the SMA at time t is given by:

SMA_t = (x_t + x_{t-1} + ... + x_{t-k+1}) / k

where x_{t-i} represents the observation at the corresponding past time point. This rolling calculation updates as new data enters the window and the oldest observation exits, providing a sequence of averages that track changes over time.[14] For illustration, consider a dataset of values [1, 2, 3, 4, 5] with k = 3. The first SMA is the average of 1, 2, and 3, yielding 2; the second is the average of 2, 3, and 4, yielding 3; and the third is the average of 3, 4, and 5, yielding 4. Thus, the SMA values are [2, 3, 4]. This example demonstrates how the SMA progressively incorporates newer data while maintaining a fixed window length.[14] One key advantage of the SMA is its computational simplicity, requiring only basic addition and division, which makes it straightforward to implement and interpret even for large datasets. It also provides uniform smoothing that effectively highlights persistent trends by averaging out random noise, minimizing mean squared error in stationary data without trends.[27][26] However, the SMA has notable disadvantages, including a tendency to lag behind actual trends due to its equal weighting of all observations in the window, which delays responsiveness to recent changes.
Additionally, it can be sensitive to outliers within the window, as each value influences the average equally, potentially distorting the smoothed result in volatile datasets.[27][26][28] The selection of the window size k is crucial, as smaller values increase responsiveness to recent data but introduce more noise and variability, while larger values enhance smoothness and trend visibility at the cost of greater lag and reduced sensitivity to shifts. This trade-off must be balanced based on the data's characteristics and the desired level of smoothing versus timeliness.[14][27]

Cumulative Average
The cumulative average, also referred to as the running average or expanding average, computes the mean of all data points from the start of a dataset up to the current observation, resulting in a progressively expanding window size with each new data point.[29][30] This approach accumulates historical information without discarding earlier values, making it suitable for scenarios where overall progress or long-term trends are prioritized over short-term fluctuations. The formula for the cumulative average at time n, denoted CA_n, for a sequence of observations x_1, ..., x_n is:

CA_n = (x_1 + x_2 + ... + x_n) / n [29][30]

For example, given the data sequence [1, 2, 3], the cumulative averages are CA_1 = 1, CA_2 = 1.5, and CA_3 = 2.[29] As the number of observations grows, CA_n converges to the overall mean of the full dataset, providing a stable estimate that becomes less sensitive to recent changes due to the increasing influence of accumulated prior data.[30] This contrasts with fixed-window averages by emphasizing historical accumulation rather than recency. In applications such as quality control, the cumulative average monitors ongoing performance metrics, such as defect rates or measurement consistency, by tracking deviations within specified limits over time.[31] It is also widely used in learning curve analysis for production processes, where it models the average cost or time per unit as output accumulates, typically decreasing by a constant percentage with each doubling of quantity produced.[32] For computational efficiency, the cumulative average supports incremental updates without recalculating the entire sum: CA_{n+1} = CA_n + (x_{n+1} - CA_n) / (n + 1), which facilitates real-time tracking in streaming data environments.[29]

Weighted Types
Weighted Moving Average
A weighted moving average (WMA) assigns varying weights to the data points within a fixed-size window, allowing for greater emphasis on specific observations, such as more recent ones, compared to the uniform weighting in simple moving averages. This flexibility makes WMAs particularly useful in time series analysis for smoothing data while prioritizing relevant trends.[33] The general form of a WMA at time t for a window of size k is given by

WMA_t = (w_1 × x_{t-k+1} + w_2 × x_{t-k+2} + ... + w_k × x_t) / (w_1 + w_2 + ... + w_k)

where x_{t-k+1}, ..., x_t are the observed values in the window, and w_1, ..., w_k are the non-negative weights assigned to each position, with the denominator ensuring normalization so that the weights sum to 1 if desired for unbiased averaging.[33] Normalization is crucial to maintain the scale of the original data and prevent bias in the estimate, as the sum of weights acts as a scaling factor.[34] Weights can be assigned in various ways depending on the application; a common approach is linear weighting that decreases toward older data (e.g., weights of 1, 2, ..., k for the k periods, normalized by their sum k(k+1)/2), for instance w_1 = 1 for the oldest observation up to w_k = k for the newest, with the highest weight on the newest observation, or triangular weights that peak in the middle for centered smoothing.[33] Such assignments allow customization to domain-specific needs, like emphasizing recency in financial forecasting or sales predictions where recent patterns are more indicative of future behavior.[33] Compared to the simple moving average, the WMA offers advantages in responsiveness, as higher weights on recent data reduce the lag in detecting shifts or trends, leading to more timely signals in volatile series.[33] This can improve forecast accuracy in applications requiring quick adaptation, though it may amplify noise if weights overly favor outliers.[34] For example, consider a time series with values 1, 2, 3 and a window size k = 3 using linear weights 1, 2, 3 (oldest to newest). The WMA is calculated as (1×1 + 2×2 + 3×3) / (1 + 2 + 3) = 14/6 ≈ 2.33, which weights the latest value more heavily than the simple average of 2.[33] Weight selection criteria typically rely on the problem's context, such as using higher weights for recent data in short-term forecasting to capture evolving patterns, while balancing smoothness and sensitivity through empirical testing or domain expertise.[33] Unlike the exponential moving average (EMA), which assigns weights that decrease exponentially and uses a recursive formula EMA_t = α × x_t + (1 - α) × EMA_{t-1} (with smoothing factor α often set to 2/(n+1) for an n-period equivalent), the WMA applies linear weighting. The EMA therefore provides stronger emphasis on the most recent data compared to linear decay and is generally more responsive to recent changes. Additionally, the EMA is easier to compute incrementally via recursion, requiring only the previous EMA value without storing all prior data points.[35][36]

Exponential Moving Average
The exponential moving average (EMA) is a recursive method for estimating the local mean of a time series, assigning exponentially decaying weights to past observations to emphasize recent data. It is defined by the formula

EMA_t = α × x_t + (1 - α) × EMA_{t-1}

where x_t is the new observation at time t, α is the smoothing factor satisfying 0 < α < 1, and EMA_{t-1} is the previous EMA value.[37][38] This recursive structure ensures that the EMA incorporates the entire history of data, with the weight on the k-th past observation given by the geometric sequence α(1 - α)^k, normalized to sum to 1.[37][39] Initialization of the EMA typically sets EMA_0 to the first observation x_0, the mean of an initial set of observations, or a target value such as zero or the historical mean, depending on the context to avoid undue bias from arbitrary starting points.[37][38] The choice of initialization affects early estimates but has diminishing impact as more data accumulates due to the exponential decay.[39] A key advantage of the EMA lies in its computational efficiency: it requires only the previous EMA value and the current observation for updates, using constant memory regardless of history length. This contrasts with the weighted moving average (WMA), which typically requires storing and weighting a fixed window of past observations.[36] Both EMA and WMA prioritize recent data over older data, unlike the simple moving average (SMA), which weights all periods equally. However, WMA assigns weights that decrease linearly with time, whereas EMA uses exponentially decaying weights, which apply stronger emphasis to the most recent data. Consequently, EMA is generally more responsive to recent changes than WMA.[36] This recursive form enables rapid adaptation to shifts in the underlying process, outperforming fixed-window methods in responsiveness while still smoothing noise through the infinite but decaying influence of past data.[37][38] Unlike finite moving averages, it avoids abrupt resets from sliding windows, providing a continuous estimate suitable for online processing.[39] The smoothing factor α relates to the half-life h, the time span over which weights decay to half their initial value, via the formula

h = -ln 2 / ln(1 - α)

This interpretation allows practitioners to select α based on desired memory length, where a larger h corresponds to a smaller α and greater smoothing.[39] For example, with α = 0.5 and initial EMA_0 = x_0, the recursion gives EMA_1 = 0.5 x_1 + 0.5 x_0 and EMA_2 = 0.5 x_2 + 0.25 x_1 + 0.25 x_0, illustrating the gradual incorporation of new values.[37] Parameter selection for α trades off between sensitivity and stability: values near 1 yield high responsiveness to recent changes, ideal for volatile series, whereas values near 0 emphasize smoothing and historical trends, reducing sensitivity to outliers.[37][38] Optimal α is often determined by minimizing forecast error metrics like mean squared error on validation data.[37]

Other Weightings
In addition to simple and exponential weightings, moving averages can employ specialized non-geometric weight functions tailored to domain-specific requirements, such as emphasizing central data points or adapting to signal characteristics. These approaches provide enhanced smoothing while mitigating issues like edge effects or sensitivity to noise variations.[40] Gaussian weighting applies a bell-shaped kernel to the data window, assigning higher weights to points near the center and tapering off symmetrically. The weights are defined by the Gaussian function w(j) = exp(-(j - c)² / (2σ²)), where j is the position in the window, c is the center, and σ controls the spread. This method is particularly effective for preserving local features while reducing high-frequency noise, as implemented in signal processing toolboxes like MATLAB's smoothdata function, which uses a default window size of 4 elements unless specified otherwise.[41] In audio processing, Gaussian-weighted moving averages facilitate noise reduction by blurring impulsive disturbances without overly distorting the underlying waveform, as seen in applications for smoothing acoustic signals in real-time systems.[42]
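A sketch of a Gaussian-weighted moving average with renormalised weights and partial windows at the boundaries (names and parameters are illustrative, not the MATLAB implementation):

```python
import math

def gaussian_ma(x, half_width, sigma):
    """Gaussian-weighted moving average centred on each point. Weights
    follow exp(-(j - i)^2 / (2 * sigma**2)) and are renormalised to
    sum to 1, so the average is unbiased; near the boundaries a
    partial window is used (one of the edge strategies noted above)."""
    out = []
    for i in range(len(x)):
        lo = max(0, i - half_width)
        hi = min(len(x), i + half_width + 1)
        w = [math.exp(-((j - i) ** 2) / (2 * sigma ** 2))
             for j in range(lo, hi)]
        total = sum(w)
        out.append(sum(wj * xj for wj, xj in zip(w, x[lo:hi])) / total)
    return out
```

Because the weights are renormalised, a constant signal passes through unchanged, including at the edges where only a partial window is available.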
Hann and Hamming windows, borrowed from signal processing, introduce tapered weighting to minimize boundary artifacts in the averaged output. The Hann window weights are given by w(k) = 0.5 × (1 - cos(2πk / (N - 1))) for k = 0 to N - 1, creating smooth transitions at the window edges that reduce sidelobe leakage compared to uniform weighting. The Hamming variant modifies this with an added constant term for slightly broader main lobe response: w(k) = 0.54 - 0.46 × cos(2πk / (N - 1)). These windows achieve sidelobe suppression up to -32 dB for Hann, significantly smoother than the -13.5 dB of simple moving averages, making them suitable for cycle detection in oscillatory data.[40] In financial time series analysis, such tapered weights help in trend filtering by dampening abrupt changes at window boundaries, improving indicator stability during volatile periods.
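A sketch of a Hann-weighted centred average with the weights normalised to sum to 1 (function names are illustrative):

```python
import math

def hann_weights(n):
    """Hann window w(k) = 0.5 * (1 - cos(2*pi*k/(n-1))), k = 0..n-1,
    normalised here so the weighted average is unbiased. Note the end
    weights are zero, which gives the smooth taper at the edges."""
    w = [0.5 * (1 - math.cos(2 * math.pi * k / (n - 1))) for k in range(n)]
    s = sum(w)
    return [v / s for v in w]

def windowed_ma(x, weights):
    """Weighted moving average over full windows only ('valid' mode)."""
    n = len(weights)
    return [sum(w * v for w, v in zip(weights, x[i:i + n]))
            for i in range(len(x) - n + 1)]
```

Since the weights sum to 1, a constant series again passes through unchanged; the tapered ends mean the oldest and newest samples in each window contribute nothing.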
Adaptive weighting schemes dynamically adjust weights based on local data properties, such as volatility, to allocate higher emphasis to stable segments and lower to turbulent ones. Kaufman's Adaptive Moving Average (KAMA), for instance, computes a smoothing constant from the efficiency ratio—measuring directional movement relative to total variation—and applies it to recent observations, effectively increasing weights during low-volatility trends and decreasing them amid clustering volatility.[43] This approach addresses volatility clustering in finance, where periods of high fluctuation follow each other, by customizing the moving average to track persistent trends more responsively without excessive lag.[43]
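A sketch of KAMA following the standard efficiency-ratio construction (the parameter defaults of 10/2/30 are the commonly cited ones; the seeding convention used here, starting from the first price after the warm-up, is one of several):

```python
def kama(prices, n=10, fast=2, slow=30):
    """Kaufman's Adaptive Moving Average (sketch; assumes
    len(prices) > n). The efficiency ratio -- net change over total
    movement across the last n steps -- drives a smoothing constant
    between a fast and a slow EMA constant, so trending data is
    smoothed lightly and choppy data heavily."""
    fast_sc, slow_sc = 2 / (fast + 1), 2 / (slow + 1)
    out = [prices[n]]  # seed after the warm-up period, one convention
    for i in range(n + 1, len(prices)):
        change = abs(prices[i] - prices[i - n])
        volatility = sum(abs(prices[j] - prices[j - 1])
                         for j in range(i - n + 1, i + 1))
        er = change / volatility if volatility else 1.0
        sc = (er * (fast_sc - slow_sc) + slow_sc) ** 2
        out.append(out[-1] + sc * (prices[i] - out[-1]))
    return out
```

On a flat series the estimate stays put; on a clean trend the efficiency ratio is 1 and the fast smoothing constant applies, giving the responsive tracking described above.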
Compared to uniform weighting, these specialized schemes—Gaussian for central emphasis, windowed for edge tapering, and adaptive for volatility response—reduce artifacts like ringing or oversensitivity, though they may introduce minor phase distortion in transient signals. Gaussian and windowed methods yield smoother outputs with less spectral leakage, while adaptive variants excel in non-stationary environments by maintaining adaptability over fixed windows.[40]
Implementation requires normalizing weights so their sum equals 1 to ensure the average remains unbiased, often via division by the kernel integral or sum. These methods incur higher computational costs than simple averages due to per-point weight calculations—O(n) per window for finite kernels—but optimizations like precomputed tables or recursive approximations mitigate this in real-time applications.[41]
Specialized Variants
Continuous Moving Average
The continuous moving average of a real-valued function f over a time window of fixed length T ending at time t is defined as (1/T) ∫[t−T, t] f(s) ds. This formulation provides a uniform weighting across the interval [t − T, t], smoothing the function by averaging its values continuously. It serves as the continuous-time counterpart to the discrete simple moving average, emerging as the limit when the discrete sampling interval approaches zero and the number of points increases proportionally to maintain the window length T. A weighted variant analogous to the discrete exponential moving average arises in continuous time through the exponentially decaying kernel, yielding (1/τ) ∫[−∞, t] e^(−(t−s)/τ) f(s) ds, where τ determines the effective memory scale (with the normalization factor 1/τ ensuring the weights integrate to 1).[44] This expression solves the first-order linear ordinary differential equation y′(t) = (f(t) − y(t))/τ with initial condition specified at some starting time t₀; to verify, differentiate the integral form using the Leibniz rule for parameter-dependent limits and the fundamental theorem of calculus, substitute, and simplify to obtain the differential equation.[44] Continuous moving averages find applications in control theory for mitigating noise in precision timing and frequency systems, where the integral form filters high-frequency fluctuations while preserving low-frequency trends.[45] In physics, they enable baseline correction in signal processing for experimental setups, such as particle detectors, by averaging over short windows to subtract slow drifts from raw waveforms. These methods also approximate components in Kalman filtering for continuous-time stochastic processes, particularly self-similar ones like fractional Brownian motion, by representing moving average integrals as state updates in the filter equations.[46] Specific properties distinguish continuous moving averages in analysis. If f is differentiable, then its moving average is differentiable, with derivative (f(t) − f(t − T))/T, obtained via the fundamental theorem of calculus applied to the integral bounds.
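The verification of the exponential-kernel average against its ODE can be written out explicitly (here τ denotes the decay scale and f̄_τ the exponentially weighted average):

```latex
\bar f_\tau(t) \;=\; \frac{1}{\tau}\int_{-\infty}^{t} e^{-(t-s)/\tau}\, f(s)\,ds
\;=\; \frac{e^{-t/\tau}}{\tau}\int_{-\infty}^{t} e^{s/\tau}\, f(s)\,ds ,
\qquad
\frac{d}{dt}\bar f_\tau(t)
\;=\; \frac{1}{\tau}\,f(t)
\;-\; \frac{1}{\tau^{2}}\int_{-\infty}^{t} e^{-(t-s)/\tau} f(s)\,ds
\;=\; \frac{f(t) - \bar f_\tau(t)}{\tau}.
```

The first term comes from the fundamental theorem of calculus applied to the upper limit, the second from differentiating the factor e^(−t/τ) by the product rule; together they reproduce the stated first-order ODE.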
For a constant function f(t) = c, the moving average remains c, preserving the value exactly. For a linear trend f(t) = at + b with a ≠ 0, the moving average is a(t − T/2) + b; to derive this, compute the integral ∫[t−T, t] (as + b) ds = a(tT − T²/2) + bT, then divide by T to yield the lagged form, introducing a phase delay of T/2.

Moving Median
The moving median is a robust statistical technique used for smoothing data in a time series or sequence by applying the median within a sliding window of fixed size k. At each position, it computes the median of the k consecutive observations centered around or including that point, providing a non-parametric measure of central tendency that slides across the data to produce a smoothed series.[47] To compute the moving median, the values within the window are sorted in ascending order; for odd k, the middle value (at position (k + 1)/2) is selected as the median, while for even k, the average of the two central values (at positions k/2 and k/2 + 1) is taken. This process repeats for each overlapping window, typically requiring sorting at each step, which incurs a computational complexity of O(k log k) per window in naive implementations.[47] A primary advantage of the moving median is its insensitivity to outliers, with a breakdown point of 50%, meaning it remains reliable even if up to half the data in the window are contaminated, unlike the arithmetic mean's 0% breakdown point. This robustness makes it particularly effective for preserving sharp changes in the data while suppressing noise, as it relies on order statistics rather than summation.[47] However, the moving median's non-linearity complicates mathematical analysis, such as deriving closed-form properties or frequency responses, and its higher computational demands compared to moving averages can be a drawback for large datasets or real-time applications.
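The naive per-window computation is short enough to sketch directly (a centered variant would simply shift the output indices):

```python
import statistics

def moving_median(data, k=3):
    """Naive moving median: take the median of each length-k window.
    Each window costs O(k log k) to sort internally; running-median
    structures (e.g., paired heaps) cut the per-step cost for streams."""
    return [statistics.median(data[i : i + k])
            for i in range(len(data) - k + 1)]
```

Applied to the sequence [1, 10, 2, 3, 100] with k = 3, this returns [2, 3, 3], suppressing the outlier 100 that would drag a simple moving average far upward.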
Additionally, it may produce jagged smoothed curves and handle boundary points less effectively without specialized adjustments.[47] For example, consider the data sequence [1, 10, 2, 3, 100] with window size k = 3: the moving medians centered on the second through fourth positions are 2 (median of 1, 10, 2), 3 (median of 10, 2, 3), and 3 (median of 2, 3, 100), effectively ignoring the outlier 100 and yielding a smoother trend of approximately [2, 3, 3].[47] Variants include the weighted moving median, which assigns different weights to window elements before selecting the median (e.g., via weighted order statistics), and the running median in signal processing, optimized for efficient incremental updates in streaming data to reduce sorting overhead.[48][49]

Applications in Modeling
Time Series Smoothing
Moving averages serve as fundamental tools for smoothing time series data, effectively reducing short-term fluctuations and noise to reveal underlying structures such as trends and cycles. By averaging values over a sliding window, these filters decompose a series into a smoothed component—often interpreted as the trend—and a residual component capturing irregular variations. This approach is particularly valuable in fields like economics and meteorology, where raw data often includes random errors that obscure meaningful patterns.[37] In trend estimation, moving averages act as low-pass filters to isolate the long-term trend from a time series, enabling the decomposition x_t = T_t + R_t, where T_t represents the trend estimated via the moving average and R_t is the residual. For instance, a simple moving average applied symmetrically around each point provides an estimate of the trend-cycle component, which can then be subtracted from the original series to obtain residuals for further analysis. This method assumes the trend evolves gradually, making it suitable for stationary or slowly varying processes.[50] For seasonal adjustment, moving averages are combined with differencing techniques to remove periodic fluctuations, as exemplified in the X-11 method developed by the U.S. Census Bureau. The X-11 procedure employs a series of symmetric moving averages—such as 3×3, 3×5, and 3×9 filters for monthly data—to estimate the trend and seasonal components iteratively, followed by differencing to stabilize the series and refine adjustments.
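The symmetric trend estimate just described can be sketched with NumPy; the function name, the odd-window restriction, and the NaN edge handling are illustrative choices:

```python
import numpy as np

def centered_trend(x, window=5):
    """Symmetric (centered) simple moving average as a trend estimate.
    Returns the trend T_t and residual x_t - T_t; endpoints where the
    full window does not fit are left as NaN. Assumes an odd window."""
    assert window % 2 == 1, "centered average assumes an odd window"
    x = np.asarray(x, dtype=float)
    half = window // 2
    trend = np.full_like(x, np.nan)
    kernel = np.ones(window) / window
    trend[half : len(x) - half] = np.convolve(x, kernel, mode="valid")
    return trend, x - trend
```

A centered average of purely linear data reproduces it exactly (zero residual in the interior), which is why the symmetric form avoids the phase lag of a trailing average.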
This approach has been a standard for official statistics, though it has been succeeded by X-12-ARIMA and the current X-13ARIMA-SEATS method, which incorporates ARIMA modeling for improved forecasting and adjustment, enhancing the interpretability of economic indicators like unemployment rates.[51][52] In anomaly detection, deviations from a moving average baseline signal potential outliers or unusual events in the time series, as points significantly exceeding a threshold (e.g., two standard deviations) indicate breaks from the expected smoothed behavior. This technique is applied in monitoring systems to detect anomalies by establishing a normal profile with the moving average and flagging deviations in residuals. A prominent application in finance involves the 50-day simple moving average (SMA) to gauge stock price trends, where sustained positions above this line suggest bullish momentum. For example, a stock price above key short-term moving averages such as the 5-day or 10-day simple moving average indicates that the short-term uptrend remains intact, suggesting bullish momentum despite minor price dips. Similarly, the 50-day exponential moving average (EMA) tracks short-term trends, while the 200-day EMA monitors longer-term trends; as a trend-following indicator, the EMA assigns greater weight to recent prices, making it more responsive to new information. The 200-day moving average is approximately equivalent to a 40-week moving average (assuming five trading days per week), spanning nearly a year's trading activity (accounting for approximately 252 trading days per year). In contrast, a 30-week moving average corresponds to about 150 trading days and serves as a shorter, intermediate-term indicator more responsive to medium-term shifts, whereas the 40-week/200-day MA is widely used for identifying major trend changes, such as bull or bear market confirmations. 
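A crossover between a short-term and a long-term SMA (the rule behind golden and death crosses) can be detected mechanically; the 50/200-day defaults are the conventional choice, and the helper names are illustrative:

```python
import numpy as np

def sma(x, n):
    """Trailing simple moving average via cumulative sums;
    the first n-1 entries are NaN (window not yet full)."""
    x = np.asarray(x, dtype=float)
    out = np.full_like(x, np.nan)
    c = np.cumsum(np.insert(x, 0, 0.0))
    out[n - 1 :] = (c[n:] - c[:-n]) / n
    return out

def crossovers(prices, short=50, long=200):
    """Mark indices where the short SMA crosses above the long SMA
    (+1, a 'golden cross') or below it (-1, a 'death cross')."""
    s, l = sma(prices, short), sma(prices, long)
    above = s > l  # NaN comparisons evaluate to False
    signals = np.zeros(len(prices), dtype=int)
    change = np.flatnonzero(above[1:] != above[:-1]) + 1
    valid = change[change >= long]  # skip the NaN warm-up boundary
    signals[valid] = np.where(above[valid], 1, -1)
    return signals
```

Note the warm-up filter: with trailing averages the first `long - 1` points have no defined comparison, so apparent "crosses" there are artifacts of the missing data, not signals.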
In technical analysis, a stock price above its EMA signals a bullish bias, while a price below indicates a bearish bias.[35][53][9] More broadly, when a stock price is above most medium- and long-term moving averages, it signifies a dominant bullish trend, strong long-term support, and only slight pressure from short-term moving averages.[54][2] Crossovers between short-term and long-term SMAs generate trading signals: a golden cross occurs when the 50-day SMA rises above the 200-day SMA, indicating potential upward trends, while a death cross—its inverse—signals bearish reversals, as observed in major indices like the S&P 500. Similar crossover patterns apply to EMAs. Furthermore, when multiple moving averages align such that shorter-term MAs are positioned above longer-term ones (e.g., 5-day above 8-day above 13-day SMAs), it indicates a strong upward trend, reinforcing buy signals. These patterns aid investors in timing entries and exits, though empirical studies show mixed predictive power depending on market conditions. Trend-following strategies using simple moving averages, such as crossing above a long-term average to enter positions, have been applied to leveraged exchange-traded funds (ETFs) to potentially reduce drawdowns in volatile markets. However, trading leveraged ETFs, such as 3x funds like TQQQ, carries extreme risks, including the possibility of rapid total loss due to leverage amplification, volatility decay, and compounding effects in choppy or declining markets. Markets are unpredictable—no strategy is guaranteed profitable, and parameters optimized on historical data can overfit or underperform in the future. Past performance does not indicate future results.[55][56][3][12][57] In signal processing, moving averages function as finite impulse response (FIR) filters to attenuate high-frequency noise while preserving lower-frequency components essential for analysis. 
A uniform-weight moving average of length N convolves the input signal with a rectangular kernel, effectively acting as a low-pass FIR filter with a frequency response that rolls off gradually, making it ideal for applications like audio denoising or sensor data cleaning.[20] Despite their utility, moving averages have limitations, including over-smoothing that can obscure genuine short-term variations or structural breaks in the data. The choice between simple and exponential types depends on data stationarity: simple averages suit stable series but lag in responsiveness, while exponential variants weight recent observations more heavily for non-stationary data, though they may amplify noise if the decay parameter is poorly tuned.[58] Software implementations facilitate widespread use of moving averages for time series smoothing. In Python, the pandas library provides the rolling() method for efficient computation of simple or weighted averages on DataFrames. R's forecast package includes the ma() function for straightforward application to univariate series. MATLAB offers the movmean() function in its core toolbox for vectorized operations on numeric arrays.
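A minimal pandas sketch of the rolling() method, alongside its exponentially weighted counterpart ewm() (the price series is invented for illustration):

```python
import pandas as pd

prices = pd.Series([10.0, 11.0, 12.0, 11.5, 12.5, 13.0, 12.0, 13.5])

# Trailing 3-period simple moving average; the first two entries are NaN
sma3 = prices.rolling(window=3).mean()

# Exponential moving average; adjust=False uses the plain recursion
# y[t] = (1 - a) * y[t-1] + a * x[t] with a = 2 / (span + 1) = 0.5
ema3 = prices.ewm(span=3, adjust=False).mean()
```

The span parameterization mirrors the common "N-day EMA" convention, so span=200 corresponds to the 200-day EMA discussed above.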