Moving average
from Wikipedia
Smoothing of a noisy sine (blue curve) with a moving average (red curve).

In statistics, a moving average (rolling average or running average or moving mean[1] or rolling mean) is a calculation to analyze data points by creating a series of averages of different selections of the full data set. Variations include: simple, cumulative, or weighted forms.

Mathematically, a moving average is a type of convolution. Thus in signal processing it is viewed as a low-pass finite impulse response filter. Because the boxcar function outlines its filter coefficients, it is called a boxcar filter. It is sometimes followed by downsampling.

Given a series of numbers and a fixed subset size, the first element of the moving average is obtained by taking the average of the initial fixed subset of the number series. Then the subset is modified by "shifting forward"; that is, excluding the first number of the series and including the next value in the series.

A moving average is commonly used with time series data to smooth out short-term fluctuations and highlight longer-term trends or cycles - in this case the calculation is sometimes called a time average. The threshold between short-term and long-term depends on the application, and the parameters of the moving average will be set accordingly. It is also used in economics to examine gross domestic product, employment or other macroeconomic time series. When used with non-time series data, a moving average filters higher frequency components without any specific connection to time, although typically some kind of ordering is implied. Viewed simplistically it can be regarded as smoothing the data.

Simple moving average

In financial applications a simple moving average (SMA) is the unweighted mean of the previous k data-points. However, in science and engineering, the mean is normally taken from an equal number of data on either side of a central value. This ensures that variations in the mean are aligned with the variations in the data rather than being shifted in time. An example of a simple equally weighted running mean is the mean over the last k entries of a data-set containing n entries. Let those data-points be p_1, p_2, \dots, p_n. This could be closing prices of a stock. The mean over the last k data-points (days in this example) is denoted as \text{SMA}_k and calculated as:

\text{SMA}_k = \frac{p_{n-k+1} + p_{n-k+2} + \cdots + p_n}{k} = \frac{1}{k} \sum_{i=n-k+1}^{n} p_i

When calculating the next mean \text{SMA}_{k,\text{next}} with the same sampling width k, the range from n-k+2 to n+1 is considered. A new value p_{n+1} comes into the sum and the oldest value p_{n-k+1} drops out. This simplifies the calculations by reusing the previous mean \text{SMA}_{k,\text{prev}}:

\text{SMA}_{k,\text{next}} = \text{SMA}_{k,\text{prev}} + \frac{1}{k}\left(p_{n+1} - p_{n-k+1}\right)

This means that the moving average filter can be computed quite cheaply on real-time data with a FIFO / circular buffer and only 3 arithmetic steps.
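
This constant-work update can be sketched in a few lines of Python, using a deque as the FIFO buffer (function and variable names are illustrative, not from the source):

```python
from collections import deque

def sma_stream(values, k):
    """Yield the simple moving average of the last k values, updating
    each step with one addition, one subtraction, and one division."""
    window = deque()          # FIFO buffer holding the current window
    running_sum = 0.0
    for x in values:
        window.append(x)
        running_sum += x
        if len(window) > k:
            running_sum -= window.popleft()  # oldest value drops out
        # While the buffer is still filling, this is the cumulative average.
        yield running_sum / len(window)

prices = [10, 11, 12, 13, 14, 15]
print(list(sma_stream(prices, 3)))  # [10.0, 10.5, 11.0, 12.0, 13.0, 14.0]
```

Note that the first k−1 outputs are cumulative averages over the partially filled buffer, matching the initial-filling behavior described below.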

During the initial filling of the FIFO / circular buffer the sampling window is equal to the data-set size, thus k = n, and the average calculation is performed as a cumulative moving average.

The period selected (k) depends on the type of movement of interest, such as short, intermediate, or long-term.

If the data used are not centered around the mean, a simple moving average lags behind the latest datum by half the sample width. An SMA can also be disproportionately influenced by old data dropping out or new data coming in. One characteristic of the SMA is that if the data has a periodic fluctuation, then applying an SMA of that period will eliminate that variation (the average always containing one complete cycle). But a perfectly regular cycle is rarely encountered.[2]

For a number of applications, it is advantageous to avoid the shifting induced by using only "past" data. Hence a central moving average can be computed, using data equally spaced on either side of the point in the series where the mean is calculated.[3] This requires using an odd number of points in the sample window.

A major drawback of the SMA is that it lets through a significant amount of the signal shorter than the window length. This can lead to unexpected artifacts, such as peaks in the smoothed result appearing where there were troughs in the data. It also leads to the result being less smooth than expected since some of the higher frequencies are not properly removed.

Its frequency response is a type of low-pass filter called sinc-in-frequency.
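
A small sketch of that frequency response: the magnitude of a k-point SMA at normalized frequency f (in cycles per sample) is |sin(πfk) / (k sin(πf))|, the periodic form of a sinc. The formula is assumed from standard DSP treatments rather than taken from this article:

```python
import math

def sma_gain(f, k):
    """Magnitude response of a k-point simple moving average at
    normalized frequency f in cycles per sample (0 <= f <= 0.5)."""
    if f == 0:
        return 1.0  # DC (the mean) passes unchanged
    return abs(math.sin(math.pi * f * k) / (k * math.sin(math.pi * f)))

k = 5
print(sma_gain(0.0, k))          # 1.0: DC gain
print(sma_gain(1 / k, k))        # ~0: first null at f = 1/k
print(round(sma_gain(0.45, k), 3))  # nonzero: high frequencies leak through
```

The nulls at multiples of 1/k and the nonzero sidelobes between them are exactly the "signal shorter than the window length" that the SMA lets through.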

Continuous moving average

The continuous moving average of an integrable function f \colon \mathbb{R} \to \mathbb{R} is defined via integration as:

\text{MA}_\varepsilon(f)(x_0) = \frac{1}{2\varepsilon} \int_{x_0-\varepsilon}^{x_0+\varepsilon} f(x)\,dx

where the \varepsilon-environment around x_0 defines the intensity of smoothing of the graph of the function. A larger \varepsilon smoothes the source graph of the function (blue) more. Animations in the original article (not reproduced here) show the moving average for different values of \varepsilon. The fraction \frac{1}{2\varepsilon} is used because 2\varepsilon is the interval width for the integral. Naturally, \lim_{\varepsilon \to 0} \text{MA}_\varepsilon(f)(x_0) = f(x_0) by the fundamental theorem of calculus and L'Hôpital's rule.

Cumulative average

In a cumulative average (CA), the data arrive in an ordered datum stream, and the user would like to get the average of all of the data up until the current datum. For example, an investor may want the average price of all of the stock transactions for a particular stock up until the current time. As each new transaction occurs, the average price at the time of the transaction can be calculated for all of the transactions up to that point using the cumulative average, typically an equally weighted average of the sequence of n values x_1, \dots, x_n up to the current time:

\text{CA}_n = \frac{x_1 + \cdots + x_n}{n}

The brute-force method to calculate this would be to store all of the data and calculate the sum and divide by the number of points every time a new datum arrived. However, it is possible to simply update the cumulative average as a new value, x_{n+1}, becomes available, using the formula

\text{CA}_{n+1} = \frac{x_{n+1} + n \cdot \text{CA}_n}{n+1}

Thus the current cumulative average for a new datum is equal to the previous cumulative average, times n, plus the latest datum, all divided by the number of points received so far, n+1. When all of the data arrive (n = N), then the cumulative average will equal the final average. It is also possible to store a running total of the data as well as the number of points and dividing the total by the number of points to get the CA each time a new datum arrives.
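
The O(1) update just described can be sketched in Python (names are illustrative):

```python
def cumulative_averages(values):
    """Return the cumulative average after each datum, updated in O(1)
    per step via CA_{n+1} = (x_{n+1} + n * CA_n) / (n + 1)."""
    ca = 0.0
    out = []
    for n, x in enumerate(values):  # n = points received before this one
        ca = (x + n * ca) / (n + 1)
        out.append(ca)
    return out

print(cumulative_averages([2, 4, 6, 8]))  # [2.0, 3.0, 4.0, 5.0]
```

The final value equals the plain mean of all the data, as the text notes for n = N.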

The derivation of the cumulative average formula is straightforward. Using \text{CA}_n = \frac{x_1 + \cdots + x_n}{n} and similarly for n + 1, it is seen that

x_{n+1} = (x_1 + \cdots + x_{n+1}) - (x_1 + \cdots + x_n) = (n+1)\,\text{CA}_{n+1} - n\,\text{CA}_n

Solving this equation for \text{CA}_{n+1} results in

\text{CA}_{n+1} = \frac{x_{n+1} + n \cdot \text{CA}_n}{n+1} = \text{CA}_n + \frac{x_{n+1} - \text{CA}_n}{n+1}

Weighted moving average

A weighted average is an average that has multiplying factors to give different weights to data at different positions in the sample window. Mathematically, the weighted moving average is the convolution of the data with a fixed weighting function. One application is removing pixelization from a digital graphical image. This is also known as anti-aliasing.[citation needed]

In the financial field, and more specifically in the analyses of financial data, a weighted moving average (WMA) has the specific meaning of weights that decrease in arithmetical progression.[4] In an n-day WMA the latest day has weight n, the second latest n − 1, etc., down to one.

Figure: WMA weights for n = 15.

The denominator is a triangle number equal to \frac{n(n+1)}{2}. In the more general case the denominator will always be the sum of the individual weights.

When calculating the WMA across successive values, the difference between the numerators of \text{WMA}_{M+1} and \text{WMA}_M is n p_{M+1} - p_M - \cdots - p_{M-n+1}. If we denote the sum p_M + \cdots + p_{M-n+1} by \text{Total}_M, then

\text{Numerator}_{M+1} = \text{Numerator}_M + n p_{M+1} - \text{Total}_M

\text{Total}_{M+1} = \text{Total}_M + p_{M+1} - p_{M-n+1}

\text{WMA}_{M+1} = \frac{\text{Numerator}_{M+1}}{n + (n-1) + \cdots + 1}
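
A sketch of this incremental numerator/total update in Python (a minimal illustration; names are not from the source):

```python
def wma_stream(values, n):
    """Weighted moving average with arithmetically decreasing weights
    (n on the newest value down to 1), updated incrementally:
      numerator_{M+1} = numerator_M + n * p_{M+1} - total_M
      total_{M+1}     = total_M + p_{M+1} - p_{M-n+1}
    """
    denom = n * (n + 1) / 2   # triangle number: sum of the weights
    total = 0.0               # plain sum of the current window
    numerator = 0.0
    window = []
    out = []
    for p in values:
        oldest = window.pop(0) if len(window) == n else 0.0
        numerator += n * p - total   # total still includes the oldest value
        total += p - oldest
        window.append(p)
        if len(window) == n:         # emit only once the window is full
            out.append(numerator / denom)
    return out

print(wma_stream([1, 2, 3, 4], 3))  # ≈ [2.333, 3.333]
```

Each step costs a constant number of arithmetic operations, regardless of n, just as with the SMA recurrence.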

The graph at the right shows how the weights decrease, from highest weight for the most recent data, down to zero. It can be compared to the weights in the exponential moving average which follows.

Exponential moving average

An exponential moving average (EMA), also known as an exponentially weighted moving average (EWMA),[5] is a first-order infinite impulse response filter that applies weighting factors which decrease exponentially. The weighting for each older datum decreases exponentially, never reaching zero. This formulation is according to Hunter (1986).[6]
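
The first-order recursion s_t = α x_t + (1 − α) s_{t−1} can be sketched directly; seeding with the first observation is one common convention, not the only one:

```python
def ema(values, alpha):
    """Exponentially weighted moving average:
    s_t = alpha * x_t + (1 - alpha) * s_{t-1}, seeded with the first value."""
    it = iter(values)
    s = next(it)          # initialization choice: start at x_1
    out = [s]
    for x in it:
        s = alpha * x + (1 - alpha) * s
        out.append(s)
    return out

print(ema([10, 20, 20, 20], 0.5))  # [10, 15.0, 17.5, 18.75]
```

The weights on older data never reach zero but decay geometrically, which is why only the previous output needs to be stored.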

There is also a multivariate implementation of EWMA, known as MEWMA.[7]

Other weightings

Other weighting systems are used occasionally – for example, in share trading a volume weighting will weight each time period in proportion to its trading volume.

A further weighting, used by actuaries, is Spencer's 15-Point Moving Average[8] (a central moving average). Its symmetric weight coefficients are [−3, −6, −5, 3, 21, 46, 67, 74, 67, 46, 21, 3, −5, −6, −3], which factors as [1, 1, 1, 1]×[1, 1, 1, 1]×[1, 1, 1, 1, 1]×[−3, 3, 4, 3, −3]/320 and leaves samples of any quadratic or cubic polynomial unchanged.[9][10]
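
The cubic-preserving property can be checked numerically. A quick sketch (the polynomial below is an arbitrary test case, not from the source): applying Spencer's weights to 15 samples of a cubic centered at t should return the cubic's value at t exactly.

```python
SPENCER = [-3, -6, -5, 3, 21, 46, 67, 74, 67, 46, 21, 3, -5, -6, -3]

def spencer_15(samples):
    """Central 15-point Spencer moving average of one window."""
    assert len(samples) == 15
    return sum(w * s for w, s in zip(SPENCER, samples)) / 320

# Samples of an arbitrary cubic at t = 10 - 7 .. 10 + 7:
f = lambda t: 2 * t**3 - 5 * t**2 + 3 * t + 7
window = [f(t) for t in range(3, 18)]
print(spencer_15(window), f(10))  # 1537.0 1537
```

The weights sum to 320 and their first three moments about the center vanish, which is what makes quadratics and cubics pass through unchanged.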

Outside the world of finance, weighted running means have many forms and applications. Each weighting function or "kernel" has its own characteristics. In engineering and science the frequency and phase response of the filter is often of primary importance in understanding the desired and undesired distortions that a particular filter will apply to the data.

A mean does not just "smooth" the data. A mean is a form of low-pass filter. The effects of the particular filter used should be understood in order to make an appropriate choice.[citation needed]

Moving median

From a statistical point of view, the moving average, when used to estimate the underlying trend in a time series, is susceptible to rare events such as rapid shocks or other anomalies. A more robust estimate of the trend is the simple moving median over n time points:

\widetilde{p}_{\text{SM}} = \text{Median}(p_M, p_{M-1}, \ldots, p_{M-n+1})

where the median is found by, for example, sorting the values inside the brackets and finding the value in the middle. For larger values of n, the median can be efficiently computed by updating an indexable skiplist.[11]
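
A naive sketch of the moving median (O(n log n) per window via sorting; the indexable-skiplist approach mentioned above makes each update O(log n)):

```python
from statistics import median

def moving_median(values, n):
    """Moving median over a sliding window of n points (naive version)."""
    return [median(values[i:i + n]) for i in range(len(values) - n + 1)]

# A single shock barely moves the median, unlike the mean:
data = [1, 2, 1, 2, 100, 2, 1, 2, 1]
print(moving_median(data, 3))  # [1, 2, 2, 2, 2, 2, 1]
```

The spike of 100 never appears in the output, illustrating the robustness to rare events discussed below.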

Statistically, the moving average is optimal for recovering the underlying trend of the time series when the fluctuations about the trend are normally distributed. However, the normal distribution does not place high probability on very large deviations from the trend which explains why such deviations will have a disproportionately large effect on the trend estimate. It can be shown that if the fluctuations are instead assumed to be Laplace distributed, then the moving median is statistically optimal.[12] For a given variance, the Laplace distribution places higher probability on rare events than does the normal, which explains why the moving median tolerates shocks better than the moving mean.

When the simple moving median above is central, the smoothing is identical to the median filter, which has applications in, for example, image signal processing.

Moving average regression model

In a moving average regression model, a variable of interest is assumed to be a weighted moving average of unobserved independent error terms; the weights in the moving average are parameters to be estimated.

Those two concepts are often confused due to their name, but while they share many similarities, they represent distinct methods and are used in very different contexts.

from Grokipedia
A moving average is a statistical technique used to analyze data by computing the average of a sliding window of consecutive data points, thereby smoothing out short-term fluctuations and highlighting longer-term trends or patterns. This method involves sliding a fixed-size window over the data, recalculating the average at each step as the window advances, which makes it particularly useful for identifying underlying cycles in noisy data such as economic indicators or sales figures. There are several types of moving averages, each differing in how they weight the data points within the window. The simple moving average (SMA) calculates an equal-weighted average of the prices or values over a specified period, such as the average closing price of a stock over 50 days. In contrast, the exponential moving average (EMA) assigns greater weight to more recent data points using a smoothing factor, typically computed as EMA = (current price × factor) + (previous EMA × (1 − factor)), where the factor is often 2/(n+1) for n periods; this responsiveness to new information makes the EMA preferable for detecting rapid trend changes. Another variant, the weighted moving average (WMA), applies linearly increasing weights to recent observations, providing a balance between simplicity and recency bias. In finance and trading, moving averages serve as key technical indicators for determining trend direction, support and resistance levels, and potential buy or sell signals; for instance, a short-term moving average crossing above a long-term one, known as a "golden cross," signals bullish momentum. When a stock's price is above key moving averages, such as the 5-day, 10-day, or 200-day, it indicates that the short-term uptrend is intact, suggesting bullish momentum despite dips. Moving averages provide buy signals when the price is above key MAs like the 20-period simple moving average (SMA20), which indicates a short-term bullish trend, as well as the 50-period, 100-period, and 200-period averages; more buy signals than sell signals indicate a medium- to long-term uptrend.
Conversely, if the price falls below the 10-day moving average, it suggests that the short-term trend may be weakening; more broadly, when the price breaks below key moving averages, such as short-term ones like the 7-period or 25-period, it often signals a bearish trend, particularly if longer-term averages like the 99-period are nearby and also declining. Popular periods include the 50-day and 200-day SMAs. The 200-day simple moving average (SMA) is a widely used long-term technical indicator in the stock market, typically representing about 9-10 months of price data. It is commonly employed to identify the overall trend: prices above the 200-day SMA indicate an uptrend (bullish), while prices below suggest a downtrend (bearish). It often acts as dynamic support or resistance and is a key level for many traders and analysts. The 200-week SMA, covering roughly 3.8-4 years of data, is a much longer-term indicator used to assess secular trends or major market cycles. Crossings of the 200-week SMA are rarer and often signal significant shifts between long-term bull and bear markets. It is less responsive to short-term fluctuations than the 200-day SMA, providing a smoother view of multi-year trends. In summary, the 200-day SMA is better for intermediate- to long-term trend analysis and trading decisions, while the 200-week SMA is suited for identifying very long-term market regimes. Traders monitor these averages to confirm uptrends (rising averages) or downtrends (declining averages). A 200-day moving average spans approximately 40 weeks (200 ÷ 5 = 40), while a 30-week moving average covers about 150 trading days (30 × 5 = 150), making the 30-week MA an intermediate-term indicator shorter than the standard long-term 200-day/40-week MA. Many sources equate the 40-week (or sometimes 39-week) MA with the 200-day MA because of the 5-day trading week, though in practice charting platforms draw them as distinct lines with different responsiveness to price changes.
Similar principles apply to intraday timeframes. For example, a 9-period moving average on a 5-minute chart covers the same 45-minute time span as a 45-period moving average on a 1-minute chart (9 × 5 = 45 × 1). However, they are not exactly identical, because the 5-minute chart aggregates data into fewer, larger bars using closes every 5 minutes, while the 1-minute chart uses individual minute closes. This leads to differences in the underlying data sets and calculation, especially pronounced for exponential moving averages due to differing weighting factors and the recursive computation, and can produce stepped or flat visual behavior when overlaid on lower-timeframe charts. Nevertheless, they are approximately equivalent in time coverage, and traders commonly scale the period by the timeframe ratio (here ×5) to achieve a similar analytical effect. An uptrend is indicated when the price is well above both the 50-day moving average and the 200-day moving average. For moving averages applied to trading-volume indicators, the simple moving average (SMA) is the standard choice, as it provides a true unweighted average of volume without overweighting recent periods, unlike the exponential moving average (EMA). However, strategies employing moving averages, such as simple moving average trend filters for timing investments in leveraged exchange-traded funds (ETFs) like TQQQ, carry extreme risks, including the possibility of rapid total loss due to 3x leverage, volatility decay, and compounding effects; markets are unpredictable, no strategy guarantees profitability, optimized historical parameters can overfit or underperform in the future, and past performance does not indicate future results. Beyond finance, moving averages are applied in forecasting and data smoothing, such as using a 7-day SMA to analyze daily retail sales and mitigate weekly variations. However, all types exhibit a lag due to their reliance on historical data, which can lead to delayed signals in volatile or sideways markets.

Fundamentals

Definition

A moving average is a statistical calculation used to analyze data points by creating a series of averages from different subsets of a full data set, typically applied to time series to smooth variations in sequential observations. This technique computes the average of successive smaller sets of data, advancing one period at a time, which helps in producing a smoothed representation of the underlying pattern. In its simplest form, for a sequence \{a_1, a_2, \dots, a_n\}, the moving average at time t with window size k is given by

\frac{1}{k} \sum_{i=t-k+1}^{t} a_i,

where the average is taken over the most recent k values up to time t. This formulation assumes equal weighting for the simple case, focusing on arithmetic means of contiguous subsets. The primary purpose of a moving average is to reduce short-term noise and fluctuations in time series data, thereby highlighting longer-term trends or cycles for better interpretation and forecasting. By smoothing out peaks and troughs, it provides a clearer view of the data's directional movement without altering the overall structure. It finds applications in fields such as finance for trend analysis and signal processing for noise reduction. The concept of moving averages originated in the early 20th century within statistics, with early uses documented in economic data analysis around 1901 by R.H. Hooker, later termed "moving averages" by G. Udny Yule in 1927.

Properties

Moving averages exhibit a smoothing effect by functioning as low-pass filters in signal processing, which attenuate high-frequency variations such as noise while preserving underlying low-frequency trends in data sequences. This property arises because the filter's frequency response passes low frequencies with minimal amplitude reduction but severely attenuates higher frequencies, as seen in the amplitude response of a simple two-point moving average given by

|H(\omega)| = |\cos(\omega/2)|,

where low \omega values experience little damping compared to values near the Nyquist frequency. Consequently, applying a moving average reduces jaggedness in time series data, leveling out short-term fluctuations without substantially altering long-term patterns. In statistical estimation, simple moving averages serve as unbiased estimators of the underlying signal when the series follows a constant trend plus noise, meaning their expected value equals the true trend under such assumptions. However, their variance decreases inversely with the window size k, approximated as

V[\hat{f}(x)] \approx \sigma^2 / (2k + 1)

for a two-sided average with noise variance \sigma^2, leading to higher variability for smaller windows and smoother but potentially over-smoothed outputs for larger ones. This creates a fundamental bias-variance trade-off: smaller windows minimize bias by closely tracking local changes but amplify variance due to noise sensitivity, whereas larger windows reduce variance through averaging but introduce bias by oversmoothing, particularly near peaks or troughs where the estimate flattens, with bias scaling as \frac{1}{6} f''(x) k (k + 1) for smooth functions f. Moving averages contribute to stationarity in non-stationary series through differencing operations, where differencing, equivalent to applying the kernel weights [1, -1], stabilizes the mean by removing linear trends and level shifts.

In modeling frameworks, such differencing transforms integrated processes into stationary ones, allowing subsequent moving average components to model the residuals effectively without time-varying statistical properties. This approach ensures constant mean, variance, and autocovariance over time, a prerequisite for reliable analysis. Mathematically, moving averages can be represented as discrete convolutions of the input with a kernel that defines the weights, such as a kernel of ones divided by the window length for the simple moving average. For a window of size M, the output at index i is

y[i] = \frac{1}{M} \sum_{j=0}^{M-1} x[i-j],

which corresponds to convolving the signal with a rectangular kernel, enabling efficient computation via fast algorithms and highlighting the filter's linear, time-invariant nature. This view also reveals the frequency-domain behavior, where the kernel's Fourier transform determines the low-pass characteristics. Edge effects arise in moving average computations near the boundaries of finite sequences, where the sliding window cannot fully overlap due to insufficient preceding or following points, potentially leading to biased or incomplete estimates at the start and end. Common handling strategies include using partial windows that average only available points within the boundary vicinity, or applying techniques such as zero-padding, edge replication, or reflection to extend the sequence artificially and maintain full coverage. These methods trade off between preserving fidelity to the observed data and introducing minimal artifacts, with partial windows often preferred for avoiding artificial extensions in short series.
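
The convolution view and one edge-handling strategy can be sketched together in Python (a minimal illustration; the mode names are chosen here, not standard API):

```python
def moving_average_conv(x, k, mode="valid"):
    """Simple moving average as convolution with a length-k kernel of 1/k.
    mode='valid' keeps only full windows; mode='reflect' mirrors the edges
    so the output keeps the input length (for odd k)."""
    if mode == "reflect":
        pad = k // 2
        x = x[pad:0:-1] + x + x[-2:-pad - 2:-1]  # mirror without repeating edge
    return [sum(x[i:i + k]) / k for i in range(len(x) - k + 1)]

data = [1.0, 2.0, 3.0, 4.0, 5.0]
print(moving_average_conv(data, 3))             # [2.0, 3.0, 4.0]
print(moving_average_conv(data, 3, "reflect"))  # length 5; edges use mirrored values
```

The 'valid' output is shorter than the input by k − 1 points, which is exactly the boundary loss that padding strategies are designed to avoid.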

Basic Types

Simple Moving Average

The simple moving average (SMA) is a fundamental technique in time series analysis that computes the arithmetic mean of a fixed number of consecutive observations, assigning equal weight to each value within the specified window. This method applies uniform weights of \frac{1}{k} to the most recent k observations, where k is the window size, making it particularly suitable for identifying underlying trends by reducing short-term fluctuations in data. The formula for the SMA at time t is given by:

\text{SMA}_t = \frac{1}{k} \sum_{i=1}^{k} a_{t-i+1}

where a_{t-i+1} represents the observation at the corresponding past time point. This rolling calculation updates as new data enters the window and the oldest exits, providing a sequence of averages that track changes over time. For illustration, consider a sequence of values [1, 2, 3, 4, 5] with k = 3. The first SMA is the average of 1, 2, and 3, yielding 2; the second is the average of 2, 3, and 4, yielding 3; and the third is the average of 3, 4, and 5, yielding 4. Thus, the SMA values are [2, 3, 4]. This example demonstrates how the SMA progressively incorporates newer data while maintaining a fixed window length. One key advantage of the SMA is its computational simplicity, requiring only basic addition and division, which makes it straightforward to implement and interpret even for large datasets. It also provides uniform smoothing that effectively highlights persistent trends by averaging out random noise, minimizing variance in stationary data without trends. However, the SMA has notable disadvantages, including a tendency to lag behind actual trends due to its equal weighting of all observations in the window, which delays responsiveness to recent changes. Additionally, it can be sensitive to outliers within the window, as each value influences the average equally, potentially distorting the smoothed result in volatile datasets.

The selection of the window size k is crucial, as smaller values increase responsiveness to recent data but introduce more noise and variability, while larger values enhance smoothness and trend visibility at the cost of greater lag and reduced sensitivity to shifts. This choice must be balanced based on the data's characteristics and the desired level of smoothing versus timeliness.
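
The worked example above can be checked with a few lines of Python (a naive sketch that recomputes each window):

```python
def sma(values, k):
    """Simple moving average over each full window of size k."""
    return [sum(values[i:i + k]) / k for i in range(len(values) - k + 1)]

print(sma([1, 2, 3, 4, 5], 3))  # [2.0, 3.0, 4.0]
```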

Cumulative Average

The cumulative average, also referred to as the running average or expanding average, computes the of all points from the start of a up to the current , resulting in a progressively expanding size with each new point. This approach accumulates historical information without discarding earlier values, making it suitable for scenarios where overall progress or long-term trends are prioritized over short-term fluctuations. The formula for the cumulative average at time tt, denoted CAt\text{CA}_t, for a sequence of observations a1,a2,,ata_1, a_2, \dots, a_t is: CAt=1ti=1tai\text{CA}_t = \frac{1}{t} \sum_{i=1}^t a_i For example, given the data sequence [1, 2, 3], the cumulative averages are CA1=1\text{CA}_1 = 1, CA2=1.5\text{CA}_2 = 1.5, and CA3=2\text{CA}_3 = 2. As the number of observations tt grows, CAt\text{CA}_t converges to the overall mean of the full dataset, providing a stable estimate that becomes less sensitive to recent changes due to the increasing influence of accumulated prior data. This contrasts with fixed-window averages by emphasizing historical accumulation rather than recency. In applications such as , the cumulative average monitors ongoing performance metrics, such as defect rates or measurement consistency, by tracking deviations within specified limits over time. It is also widely used in analysis for production processes, where it models the average cost or time per unit as output accumulates, typically decreasing by a constant percentage with each doubling of quantity produced. For computational efficiency, the cumulative average supports incremental updates without recalculating the entire sum: CAt=CAt1t1t+att\text{CA}_t = \text{CA}_{t-1} \cdot \frac{t-1}{t} + \frac{a_t}{t}, which facilitates real-time tracking in environments.

Weighted Types

Weighted Moving Average

A weighted moving average (WMA) assigns varying weights to the data points within a fixed-size window, allowing for greater emphasis on specific observations, such as more recent ones, compared to the uniform weighting in simple moving averages. This flexibility makes WMAs particularly useful in time series analysis for smoothing data while prioritizing relevant trends. The general form of a WMA at time t for a window of size k is given by

\text{WMA}_t = \frac{\sum_{i=1}^{k} w_i a_{t-i+1}}{\sum_{i=1}^{k} w_i},

where a_{t-i+1} are the observed values in the window, and w_i are the non-negative weights assigned to each position, with the denominator ensuring normalization so that the weights sum to 1 if desired for unbiased averaging. Normalization is crucial to maintain the scale of the original data and prevent bias in the estimate, as the sum of weights acts as a scaling factor. Weights can be assigned in various ways depending on the application; a common approach is linear weighting that decreases toward older data (e.g., weights of k, k-1, \dots, 1 for k periods, normalized by their sum k(k+1)/2), with the highest weight on the newest observation; another is triangular weights that peak in the middle for centered smoothing. Such assignments allow customization to domain-specific needs, like emphasizing recency in financial forecasting or sales predictions where recent patterns are more indicative of future behavior. Compared to the simple moving average, the WMA offers advantages in responsiveness, as higher weights on recent data reduce the lag in detecting shifts or trends, leading to more timely signals in volatile series. This can improve forecast accuracy in applications requiring quick adaptation, though it may amplify noise if weights overly favor outliers.

For example, consider a time series with values a_1 = 1, a_2 = 2, a_3 = 3 and a window size k = 3 using linear weights w_1 = 1, w_2 = 2, w_3 = 3 (oldest to newest). The WMA is calculated as

\frac{1 \cdot 1 + 2 \cdot 2 + 3 \cdot 3}{1 + 2 + 3} = \frac{14}{6} = \frac{7}{3} \approx 2.333,

which weights the latest value more heavily than the simple average of 2. Weight selection criteria typically rely on the problem's context, such as using higher weights for recent data in short-term forecasting to capture evolving patterns, while balancing smoothness and sensitivity through empirical testing or domain expertise. Unlike the exponential moving average (EMA), which assigns weights that decrease exponentially and uses the recursive formula \text{EMA}_t = \alpha a_t + (1 - \alpha) \text{EMA}_{t-1} (with smoothing factor \alpha often set to 2/(n+1) for an n-period equivalent), the WMA applies linear weighting. The EMA therefore places stronger emphasis on the most recent data than linear decay does and is generally more responsive to recent changes. Additionally, the EMA is easier to compute incrementally via recursion, requiring only the previous EMA value without storing all prior data points.
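
The worked WMA example and the WMA/EMA contrast can be sketched side by side in Python (illustrative names; α = 2/(n+1) as in the text):

```python
def wma_last(window):
    """Linear-weight WMA of one window (weights 1..k, newest heaviest)."""
    k = len(window)
    return sum((i + 1) * v for i, v in enumerate(window)) / (k * (k + 1) / 2)

def ema_series(values, n):
    """Recursive EMA with alpha = 2 / (n + 1), seeded with the first value."""
    alpha = 2 / (n + 1)
    s = values[0]
    out = [s]
    for x in values[1:]:
        s = alpha * x + (1 - alpha) * s
        out.append(s)
    return out

print(wma_last([1, 2, 3]))               # 7/3 ≈ 2.333, as in the example
print(ema_series([0, 0, 0, 1, 1, 1], 3))  # [0, 0.0, 0.0, 0.5, 0.75, 0.875]
```

On the step input, the EMA covers half the remaining distance each period, illustrating its stronger emphasis on the newest observation.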

Exponential Moving Average

The exponential moving average (EMA) is a recursive method for estimating the local mean of a time series, assigning exponentially decaying weights to past observations to emphasize recent data. It is defined by the formula

\text{EMA}_t = \alpha \, a_t + (1 - \alpha) \, \text{EMA}_{t-1},

where a_t is the new observation at time t, \alpha is the smoothing factor satisfying 0 < \alpha < 1, and \text{EMA}_{t-1} is the previous EMA value. This recursive structure ensures that the EMA incorporates the entire history of data, with the weight on the i-th past observation given by the geometric sequence w_i = \alpha (1 - \alpha)^{i-1}, normalized to sum to 1. Initialization of the EMA typically sets \text{EMA}_0 to the first observation a_1, the mean of an initial set of observations, or a target value such as zero or the historical mean, depending on the context, to avoid undue bias from arbitrary starting points. The choice of initialization affects early estimates but has diminishing impact as more data accumulates due to the exponential decay. A key advantage of the EMA lies in its computational efficiency: it requires only the previous EMA value and the current observation for updates, using constant memory regardless of history length. This contrasts with the weighted moving average (WMA), which typically requires storing and weighting a fixed window of past observations. Both the EMA and the WMA prioritize recent data over older data, unlike the simple moving average (SMA), which weights all periods equally. However, the WMA assigns weights that decrease linearly with time, whereas the EMA uses exponentially decaying weights, which place stronger emphasis on the most recent data. Consequently, the EMA is generally more responsive to recent changes than the WMA.

This recursive form enables rapid adaptation to shifts in the underlying process, outperforming fixed-window methods in responsiveness while still smoothing noise through the infinite but decaying influence of past data. Unlike finite moving averages, it avoids abrupt resets from sliding windows, providing a continuous estimate suitable for online processing. The smoothing factor \alpha relates to the half-life n, the time span over which weights decay to half their initial value, via the formula

\alpha = 1 - e^{-\ln 2 / n}.

This interpretation allows practitioners to select \alpha based on desired memory length, where larger n corresponds to smaller \alpha and greater smoothing. For example, with \alpha = 0.2 and initial \text{EMA}_0 = 0, the sequence begins as \text{EMA}_1 = 0.2 \times 10 + 0.8 \times 0 = 2 for a_1 = 10, and \text{EMA}_2 = 0.2 \times 20 + 0.8 \times 2 = 5.6 for a_2 = 20, illustrating the gradual incorporation of new values. Parameter selection for \alpha trades off between sensitivity and stability: values near 1 yield high responsiveness to recent changes, ideal for volatile series, whereas values near 0 emphasize smoothing and historical trends, reducing sensitivity to outliers. Optimal \alpha is often determined by minimizing forecast error metrics like mean squared error on validation data.
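
The half-life relation can be checked directly: with α = 1 − e^{−ln 2 / n}, the weight on a datum should halve every n steps. A small sketch:

```python
import math

def alpha_from_half_life(n):
    """Smoothing factor whose exponential weights halve every n steps."""
    return 1 - math.exp(-math.log(2) / n)

alpha = alpha_from_half_life(10)
# Weight on a datum i steps old is alpha * (1 - alpha)**i;
# the ratio between ages i and i + n should be exactly 2.
w = lambda i: alpha * (1 - alpha) ** i
print(round(w(0) / w(10), 6))  # 2.0
```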

Other Weightings

In addition to simple and exponential weightings, moving averages can employ specialized non-geometric weight functions tailored to domain-specific requirements, such as emphasizing central data points or adapting to signal characteristics. These approaches provide enhanced smoothing while mitigating issues such as edge effects or sensitivity to noise variations.

Gaussian weighting applies a bell-shaped kernel to the data window, assigning higher weights to points near the center and tapering off symmetrically. The weights are given by the Gaussian function $w_i = e^{-(i - m)^2 / (2\sigma^2)}$, where $i$ is the position in the window, $m$ is the center, and $\sigma$ controls the spread. This method is particularly effective at preserving local features while reducing high-frequency noise; it is implemented in signal-processing toolboxes such as MATLAB's smoothdata function, which chooses a default window size heuristically from the data unless one is specified. In audio processing, Gaussian-weighted moving averages facilitate noise reduction by blurring impulsive disturbances without overly distorting the underlying waveform, as in real-time smoothing of acoustic signals.

Hann and Hamming windows, borrowed from signal processing, introduce tapered weighting to minimize boundary artifacts in the averaged output. The Hann window weights are $w_i = 0.5 \left(1 - \cos\left(\frac{2\pi i}{k+1}\right)\right)$ for $i = 0$ to $k$, creating smooth transitions at the window edges that reduce sidelobe leakage compared with uniform weighting. The Hamming variant adjusts the constants to further suppress the nearest sidelobes: $w_i = 0.54 - 0.46 \cos\left(\frac{2\pi i}{k}\right)$. The Hann window achieves sidelobe suppression of approximately $-31.5$ dB, far better than the roughly $-13$ dB of the uniform window used by the simple moving average, making these windows suitable for cycle detection in oscillatory data.
In financial time-series analysis, such tapered weights help in trend filtering by dampening abrupt changes at window boundaries, improving indicator stability during volatile periods.

Adaptive weighting schemes dynamically adjust the weights based on local data properties, such as volatility, allocating more emphasis to stable segments and less to turbulent ones. Kaufman's Adaptive Moving Average (KAMA), for instance, computes a smoothing constant from the efficiency ratio, which measures directional movement relative to total variation, and applies it to recent observations, effectively increasing the weights during low-volatility trends and decreasing them amid volatility clustering. This customizes the moving average to track persistent trends more responsively without excessive lag, addressing the tendency in finance for periods of high fluctuation to follow one another.

Compared with uniform weighting, these specialized schemes (Gaussian for central emphasis, windowed for edge tapering, and adaptive for volatility response) reduce artifacts such as ringing or oversensitivity, though they may introduce minor phase distortion in transient signals. Gaussian and windowed methods yield smoother outputs with less spectral leakage, while adaptive variants excel in non-stationary environments by adjusting where fixed windows cannot. Implementations must normalize the weights so that they sum to 1, ensuring the average remains unbiased, typically by dividing by the kernel sum or integral. These methods incur higher computational cost than simple averages because of per-point weight calculations, $O(k)$ per window of length $k$ for finite kernels, but optimizations such as precomputed tables or recursive approximations mitigate this in real-time applications.
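The Gaussian kernel and the normalization requirement described above can be sketched as follows; `gaussian_weights` and `weighted_moving_average` are illustrative names, not a standard API:

```python
import math

def gaussian_weights(k, sigma):
    """Gaussian kernel w_i = exp(-(i - m)^2 / (2 sigma^2)) over a window of
    length k, normalized so the weights sum to 1 (keeping the average unbiased)."""
    m = (k - 1) / 2.0  # window center
    w = [math.exp(-((i - m) ** 2) / (2 * sigma ** 2)) for i in range(k)]
    s = sum(w)
    return [x / s for x in w]

def weighted_moving_average(data, weights):
    """Apply the normalized weights to every full window of the data."""
    k = len(weights)
    return [sum(w * x for w, x in zip(weights, data[i:i + k]))
            for i in range(len(data) - k + 1)]
```

Because the weights are normalized, a constant input passes through unchanged, which is a quick sanity check for any weighting scheme.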

Specialized Variants

Continuous Moving Average

The continuous moving average of a real-valued function $f(t)$ over a time window of fixed length $\tau > 0$ ending at time $t$ is defined as $y(t) = \frac{1}{\tau} \int_{t-\tau}^{t} f(s) \, ds$. This formulation applies a uniform weighting across the interval $[t-\tau, t]$, smoothing the function by averaging its values continuously. It is the continuous-time counterpart of the discrete simple moving average, arising as the limit when the sampling interval approaches zero while the number of points grows proportionally so that the window length $\tau$ is maintained.

A weighted variant analogous to the discrete exponential moving average arises in continuous time through an exponentially decaying kernel, yielding $X(t) = \frac{1}{\tau} \int_{0}^{\infty} f(t - s) \, e^{-s/\tau} \, ds$, where $\tau > 0$ determines the effective averaging scale (the normalization ensures the weights integrate to 1). This expression solves the first-order linear ordinary differential equation $\frac{dX}{dt} = \frac{1}{\tau} \left( f(t) - X(t) \right)$ with initial condition $X(t_0) = f(t_0)$ at some starting time $t_0$; to verify, differentiate the integral form using the Leibniz rule for parameter-dependent limits, substitute, and simplify to recover the differential equation.

Continuous moving averages are used to mitigate noise in precision timing and frequency systems, where the integral form filters high-frequency fluctuations while preserving low-frequency trends. In physics, they enable baseline correction in experimental setups such as particle detectors, averaging over short windows to subtract slow drifts from raw waveforms. They also approximate components of Kalman filtering for continuous-time stochastic processes, particularly self-similar ones, by representing the moving average as state updates in the filter equations. Several specific properties distinguish continuous moving averages in analysis.
If $f(t)$ is continuous, then $y(t)$ is differentiable, with derivative $y'(t) = \frac{1}{\tau} \left( f(t) - f(t - \tau) \right)$, obtained by applying the fundamental theorem of calculus to the integral bounds. For a constant function $f(t) = c$, the moving average is $y(t) = c$, preserving the value exactly. For a linear trend $f(t) = k t$ with $k > 0$, the moving average is $y(t) = k \left( t - \frac{\tau}{2} \right)$: computing $\int_{t-\tau}^{t} k s \, ds = k \left[ \frac{s^2}{2} \right]_{t-\tau}^{t} = k \left( \frac{t^2}{2} - \frac{(t - \tau)^2}{2} \right) = k \tau \left( t - \frac{\tau}{2} \right)$ and dividing by $\tau$ yields the lagged form, showing a phase delay of $\tau / 2$.
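The constant and linear-trend properties above can be checked numerically by approximating the defining integral with a midpoint Riemann sum; `continuous_ma` is an illustrative helper, not a library function:

```python
def continuous_ma(f, t, tau, n=10_000):
    """Approximate y(t) = (1/tau) * integral_{t-tau}^{t} f(s) ds
    with a midpoint Riemann sum over n subintervals."""
    h = tau / n
    total = sum(f(t - tau + (i + 0.5) * h) for i in range(n))
    return total * h / tau
```

The midpoint rule is exact for linear integrands, so for $f(t) = kt$ the result matches $k(t - \tau/2)$ up to floating-point error: with $k = 2$, $\tau = 4$, $t = 10$, the average is $2 \times (10 - 2) = 16$, exhibiting the $\tau/2$ lag.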

Moving Median

The moving median is a robust statistical technique for smoothing a time series or sequence by applying the median within a sliding window of fixed length $k$. At each position $i$, it computes the median of the $k$ consecutive observations centered around or including $i$, providing a non-parametric measure of central tendency that slides across the data to produce a smoothed series. To compute the moving median, the values within the window are sorted in ascending order; for odd $k$, the middle value (at position $(k+1)/2$) is selected as the median, while for even $k$, the average of the two central values (at positions $k/2$ and $k/2 + 1$) is taken. This process repeats for each overlapping window, typically requiring a sort at each step, which incurs a time complexity of $O(k \log k)$ per window in naive implementations.

A primary advantage of the moving median is its insensitivity to outliers: it has a breakdown point of 50%, meaning it remains reliable even if up to half the data in the window are contaminated, unlike the arithmetic mean's breakdown point of 0%. This robustness makes it particularly effective at preserving sharp changes in the data while suppressing noise, since it relies on order statistics rather than sums of values. However, the moving median's non-linearity complicates theoretical analysis, such as deriving closed-form properties or frequency responses, and its higher computational demands compared with moving averages can be a drawback for large datasets or real-time applications. It may also produce jagged smoothed curves and handle boundary points less gracefully without specialized adjustments.

For example, consider the data sequence [1, 10, 2, 3, 100] with window size $k = 3$: the moving medians are 2 (median of 1, 10, 2), 3 (median of 10, 2, 3), and 3 (median of 2, 3, 100), effectively ignoring the outlier 100 and yielding the smoother sequence [2, 3, 3].
Variants include the weighted moving median, which assigns different weights to window elements before selecting the central value (e.g., via weighted order statistics), and running-median algorithms optimized for efficient incremental updates in streaming settings to reduce sorting overhead.
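The worked example above can be reproduced with a naive $O(k \log k)$-per-window implementation using the standard library; `moving_median` is an illustrative name:

```python
from statistics import median

def moving_median(data, k):
    """Median over each sliding window of length k.
    Naive implementation: sorts each window (O(k log k) per window)."""
    return [median(data[i:i + k]) for i in range(len(data) - k + 1)]

# The example from the text: the outlier 100 is ignored entirely.
print(moving_median([1, 10, 2, 3, 100], 3))  # [2, 3, 3]
```

For long streams, replacing the per-window sort with an incrementally maintained order structure (e.g., two heaps or a sorted container) reduces the update cost, as noted for running-median variants.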

Applications in Modeling

Time Series Smoothing

Moving averages serve as fundamental tools for smoothing time-series data, reducing short-term fluctuations and noise to reveal underlying structures such as trends and cycles. By averaging values over a sliding window, these filters decompose a series into a smoothed component, often interpreted as the trend, and a residual component capturing irregular variations. This approach is particularly valuable in fields such as economics and engineering, where raw data often include random errors that obscure meaningful patterns.

In trend estimation, moving averages act as low-pass filters to isolate the long-term trend from a time series, enabling the decomposition $y_t = T_t + R_t$, where $T_t$ is the trend estimated via the moving average and $R_t$ is the residual. For instance, a simple moving average applied symmetrically around each point provides an estimate of the trend-cycle component, which can then be subtracted from the original series to obtain residuals for further analysis. This method assumes the trend evolves gradually, making it suitable for stationary or slowly varying processes.

For seasonal adjustment, moving averages are combined with differencing techniques to remove periodic fluctuations, as exemplified in the X-11 method developed by the U.S. Census Bureau. The X-11 procedure employs a series of symmetric moving averages, such as 3x3, 3x5, and 3x9 filters for monthly data, to estimate the trend and seasonal components iteratively, followed by differencing to stabilize the series and refine the adjustments. This approach was long the standard for official statistics, though it has been succeeded by X-12-ARIMA and the current X-13ARIMA-SEATS method, which incorporates ARIMA modeling for improved forecasting and adjustment, enhancing the interpretability of economic indicators such as unemployment rates.

In anomaly detection, deviations from a moving-average baseline signal potential outliers or unusual events in the series: points exceeding a threshold (e.g., two standard deviations) indicate breaks from the expected smoothed behavior.
This technique is applied in monitoring systems to detect anomalies by establishing a normal profile with the moving average and flagging deviations in the residuals.

A prominent application in finance involves the 50-day simple moving average (SMA) for gauging stock-price trends, where sustained positions above this line suggest bullish momentum. For example, a stock price above short-term moving averages such as the 5-day or 10-day SMA indicates that the short-term uptrend remains intact despite minor price dips. Similarly, the 50-day exponential moving average (EMA) tracks short-term trends, while the 200-day EMA monitors longer-term trends; as a trend-following indicator, the EMA assigns greater weight to recent prices, making it more responsive to new information. The 200-day moving average is approximately equivalent to a 40-week moving average (assuming five trading days per week), spanning nearly a year's trading activity (about 252 trading days per year). In contrast, a 30-week moving average corresponds to about 150 trading days and serves as a shorter, intermediate-term indicator more responsive to medium-term shifts, whereas the 40-week/200-day MA is widely used for identifying major trend changes, such as bull- or bear-market confirmations.

In technical analysis, a stock price above its EMA signals a bullish bias, while a price below it indicates a bearish bias. More broadly, when a stock price sits above most medium- and long-term moving averages, it signifies a dominant bullish trend with strong long-term support and only slight pressure from short-term moving averages. Crossovers between short-term and long-term SMAs generate trading signals: a golden cross occurs when the 50-day SMA rises above the 200-day SMA, indicating a potential upward trend, while a death cross, its inverse, signals a bearish reversal, as observed in major indices such as the S&P 500. Similar crossover patterns apply to EMAs.
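The crossover rule above can be sketched in plain Python; `sma` and `golden_crosses` are illustrative helpers, and the short synthetic price list stands in for real data (real golden crosses use 50- and 200-day windows):

```python
def sma(prices, n):
    """Simple moving average; None until n observations are available."""
    return [None if i + 1 < n else sum(prices[i + 1 - n:i + 1]) / n
            for i in range(len(prices))]

def golden_crosses(prices, short_n, long_n):
    """Indices where the short SMA crosses from at-or-below the long SMA
    to above it (a 'golden cross' on real 50/200-day data)."""
    s, l = sma(prices, short_n), sma(prices, long_n)
    crossings = []
    for i in range(1, len(prices)):
        if s[i - 1] is None or l[i - 1] is None:
            continue  # both averages must be defined on consecutive days
        if s[i - 1] <= l[i - 1] and s[i] > l[i]:
            crossings.append(i)
    return crossings

# Synthetic decline followed by a recovery: the short average overtakes
# the long one partway through the rebound.
prices = [10, 9, 8, 7, 6, 7, 8, 9, 10, 11, 12, 13]
print(golden_crosses(prices, 2, 4))  # [6]
```

A death cross is the mirror-image condition (`s[i-1] >= l[i-1] and s[i] < l[i]`). As the text cautions, such signals have mixed predictive power and are not a guarantee of profitability.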
Furthermore, when multiple moving averages align so that shorter-term MAs sit above longer-term ones (e.g., the 5-day above the 8-day above the 13-day SMA), this indicates a strong upward trend and reinforces buy signals. These patterns aid investors in timing entries and exits, though empirical studies show mixed predictive power depending on market conditions. Trend-following strategies using simple moving averages, such as entering positions when the price crosses above a long-term average, have been applied to leveraged exchange-traded funds (ETFs) to potentially reduce drawdowns in volatile markets. However, trading leveraged ETFs, such as 3x funds like TQQQ, carries extreme risks, including the possibility of rapid total loss due to leverage amplification, volatility decay, and compounding effects in choppy or declining markets. Markets are unpredictable: no strategy is guaranteed to be profitable, and parameters optimized on historical data can overfit or underperform in the future. Past performance does not indicate future results.

In signal processing, moving averages function as finite impulse response (FIR) filters that attenuate high-frequency noise while preserving the lower-frequency components essential for analysis. A uniform-weight moving average of length $N$ convolves the input signal with a rectangular kernel, acting as a low-pass FIR filter whose frequency response rolls off gradually, making it suitable for applications such as audio denoising or sensor-data cleaning.

Despite their utility, moving averages have limitations, including over-smoothing that can obscure genuine short-term variations or structural breaks in the series. The choice between simple and exponential types depends on the data's stationarity: simple averages suit stable series but lag in responsiveness, while exponential variants weight recent observations more heavily for non-stationary data, though they may amplify noise if the decay parameter is poorly tuned. Software implementations make moving-average smoothing widely accessible.
In Python, the pandas library provides the rolling() method for efficient computation of simple or weighted averages on Series and DataFrames. R's forecast package includes the ma() function for straightforward application to univariate series. MATLAB offers the movmean() function for vectorized operations on numeric arrays.
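As a minimal sketch of the pandas API mentioned above, applied to a tiny made-up series:

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Simple moving average over a 3-element window; the first two
# entries are NaN because the window is not yet full.
simple = s.rolling(window=3).mean()

# Exponential moving average; adjust=False gives the plain recursion
# EMA_t = alpha * x_t + (1 - alpha) * EMA_{t-1}, seeded with x_0.
exp = s.ewm(alpha=0.5, adjust=False).mean()

print(simple.tolist())  # [nan, nan, 2.0, 3.0, 4.0]
print(exp.tolist())     # [1.0, 1.5, 2.25, 3.125, 4.0625]
```

The `min_periods` argument of `rolling()` can shrink the warm-up region, and `rolling(...).median()` gives the moving median discussed earlier.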

Moving Average Regression Model

In the context of ARIMA modeling for time series, the moving-average process of order $q$, denoted MA($q$), is a model in which the current observation is a linear combination of past error terms, or innovations. The model is defined as $y_t = \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \dots + \theta_q \epsilon_{t-q}$, where $\{\epsilon_t\}$ is a sequence of white-noise errors with zero mean and constant variance $\sigma^2$, and the $\theta_i$ are the moving-average parameters. This formulation interprets the series $y_t$ as depending on the current and previous $q$ error terms, weighted by the parameters $\theta_i$, which can be positive or negative and need not sum to unity. It is analogous to a finite impulse response filter in signal processing, in that the influence of a shock dies out after $q$ periods, distinguishing it from infinite-memory processes such as autoregressive models.

Key properties of the MA($q$) model include stationarity, which holds for any finite parameter values because the process is a finite linear combination of white-noise terms, and invertibility, which requires that the roots of $\theta(z) = 1 + \theta_1 z + \dots + \theta_q z^q = 0$ lie outside the unit circle in the complex plane (i.e., $|z| > 1$ for all roots). The autocorrelation function (ACF) of an MA($q$) process cuts off abruptly after lag $q$, meaning $\rho_k = 0$ for $k > q$, while the partial autocorrelation function (PACF) tails off gradually.

Estimation of the $\theta_i$ parameters and the noise variance $\sigma^2$ is typically performed by maximum likelihood, assuming normal errors, or via conditional sums of squares for initial approximations, with iterative optimization to refine the fit. For model identification, the order $q$ is determined by examining the sample ACF, which should exhibit a sharp cutoff after lag $q$, complemented by information criteria such as AIC or BIC to choose among candidate models. In practice, software such as the statsmodels library in Python implements these steps through its ARIMA class, facilitating fitting and diagnostic checks.
A simple example is the MA(1) model $y_t = \epsilon_t + 0.5 \epsilon_{t-1}$, with $\theta_1 = 0.5$; this generates a series with positive autocorrelation at lag 1 ($\rho_1 = \theta_1 / (1 + \theta_1^2) = 0.4$) and zero autocorrelation thereafter, exhibiting short-term dependence due to the lingering effect of the previous shock. Invertibility in this case requires $|\theta_1| < 1$. In forecasting, MA models set future errors to their expected value of zero, so forecasts beyond horizon $q$ revert to the process mean; this contrasts with retrospective smoothing techniques that average past observations directly. The MA($q$) model thus provides a parametric framework for the propagation of errors rather than a smoothing rule for observed values.
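The MA(1) autocorrelation structure can be verified by simulation with NumPy; the series length and seed are arbitrary choices for this sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
n, theta = 200_000, 0.5

# Simulate y_t = eps_t + 0.5 * eps_{t-1} from i.i.d. standard normal errors.
eps = rng.standard_normal(n + 1)
y = eps[1:] + theta * eps[:-1]

# Sample autocorrelations at lags 1 and 2.
yc = y - y.mean()
denom = (yc * yc).sum()
rho1 = (yc[1:] * yc[:-1]).sum() / denom   # theory: 0.5 / 1.25 = 0.4
rho2 = (yc[2:] * yc[:-2]).sum() / denom   # theory: 0 (ACF cuts off after lag 1)

print(round(rho1, 2), round(rho2, 2))
```

With 200,000 observations the sampling error is on the order of $1/\sqrt{n} \approx 0.002$, so the estimates land very close to the theoretical values 0.4 and 0, illustrating the sharp ACF cutoff after lag $q = 1$.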
