Autoregressive model
from Wikipedia

In statistics, econometrics, and signal processing, an autoregressive (AR) model is a representation of a type of random process; as such, it can be used to describe certain time-varying processes in nature, economics, behavior, etc. The autoregressive model specifies that the output variable depends linearly on its own previous values and on a stochastic term (an imperfectly predictable term); thus the model is in the form of a stochastic difference equation (or recurrence relation) which should not be confused with a differential equation. Together with the moving-average (MA) model, it is a special case and key component of the more general autoregressive–moving-average (ARMA) and autoregressive integrated moving average (ARIMA) models of time series, which have a more complicated stochastic structure; it is also a special case of the vector autoregressive model (VAR), which consists of a system of more than one interlocking stochastic difference equation in more than one evolving random variable. Another important extension is the time-varying autoregressive (TVAR) model, where the autoregressive coefficients are allowed to change over time to model evolving or non-stationary processes. TVAR models are widely applied in cases where the underlying dynamics of the system are not constant, such as in sensor time series modelling[1][2], finance[3], climate science[4], economics[5], signal processing[6] and telecommunications[7], radar systems[8], and biological signals[9].

Unlike the moving-average (MA) model, the autoregressive model is not always stationary; non-stationarity can arise either due to the presence of a unit root or due to time-varying model parameters, as in time-varying autoregressive (TVAR) models.

Large language models are called autoregressive, but they are not classical autoregressive models in this sense because they are not linear.

Definition


The notation $\mathrm{AR}(p)$ indicates an autoregressive model of order $p$. The AR($p$) model is defined as

$$X_t = c + \sum_{i=1}^{p} \phi_i X_{t-i} + \varepsilon_t,$$

where $\phi_1, \ldots, \phi_p$ are the parameters of the model, $c$ is a constant, and $\varepsilon_t$ is white noise.[10][11] This can be equivalently written using the backshift operator $B$ as

$$X_t = c + \sum_{i=1}^{p} \phi_i B^i X_t + \varepsilon_t,$$

so that, moving the summation term to the left side and using polynomial notation, we have

$$\phi(B) X_t = c + \varepsilon_t, \qquad \phi(B) = 1 - \sum_{i=1}^{p} \phi_i B^i.$$

An autoregressive model can thus be viewed as the output of an all-pole infinite impulse response filter whose input is white noise.

Some parameter constraints are necessary for the model to remain weak-sense stationary. For example, processes in the AR(1) model with $|\phi_1| \geq 1$ are not stationary. More generally, for an AR($p$) model to be weak-sense stationary, the roots of the polynomial $\Phi(z) = 1 - \sum_{i=1}^{p} \phi_i z^i$ must lie outside the unit circle, i.e., each (complex) root $z_i$ must satisfy $|z_i| > 1$ (see pages 89,92 [12]).
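The root condition can be checked numerically by forming the characteristic polynomial and inspecting the moduli of its roots. A minimal sketch in Python (assuming NumPy is available; `is_stationary` is an illustrative helper name, not part of any library):

```python
import numpy as np

def is_stationary(phi):
    """Check weak-sense stationarity of an AR(p) model with coefficients
    phi = [phi_1, ..., phi_p]: every root of 1 - phi_1 z - ... - phi_p z^p
    must lie strictly outside the unit circle."""
    # numpy.roots wants coefficients ordered from the highest power down:
    # -phi_p z^p - ... - phi_1 z + 1
    coeffs = [-c for c in reversed(phi)] + [1.0]
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0))
```

For instance, `is_stationary([1.2])` is false because the single root $1/1.2$ lies inside the unit circle, while `is_stationary([0.9, -0.8])` is true.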

Intertemporal effect of shocks


In an AR process, a one-time shock affects values of the evolving variable infinitely far into the future. For example, consider the AR(1) model $X_t = c + \phi_1 X_{t-1} + \varepsilon_t$. A non-zero value for $\varepsilon_t$ at say time $t = 1$ affects $X_1$ by the amount $\varepsilon_1$. Then by the AR equation for $X_2$ in terms of $X_1$, this affects $X_2$ by the amount $\phi_1 \varepsilon_1$. Then by the AR equation for $X_3$ in terms of $X_2$, this affects $X_3$ by the amount $\phi_1^2 \varepsilon_1$. Continuing this process shows that the effect of $\varepsilon_1$ never ends, although if the process is stationary then the effect diminishes toward zero in the limit.

Because each shock affects X values infinitely far into the future from when it occurs, any given value $X_t$ is affected by shocks occurring infinitely far into the past. This can also be seen by rewriting the autoregression

$$\phi(B) X_t = \varepsilon_t$$

(where the constant term has been suppressed by assuming that the variable has been measured as deviations from its mean) as

$$X_t = \frac{1}{\phi(B)}\,\varepsilon_t.$$

When the polynomial division on the right side is carried out, the polynomial in the backshift operator applied to $\varepsilon_t$ has an infinite order—that is, an infinite number of lagged values of $\varepsilon_t$ appear on the right side of the equation.

Characteristic polynomial


The autocorrelation function of an AR(p) process can be expressed as [citation needed]

$$\rho(\tau) = \sum_{k=1}^{p} a_k y_k^{-|\tau|},$$

where $y_k$ are the roots of the polynomial

$$\phi(B) = 1 - \sum_{k=1}^{p} \phi_k B^k,$$

where $B$ is the backshift operator, where $\phi(\cdot)$ is the function defining the autoregression, and where $\phi_k$ are the coefficients in the autoregression. The formula is valid only if all the roots have multiplicity 1.[citation needed]

The autocorrelation function of an AR(p) process is a sum of decaying exponentials.

  • Each real root contributes a component to the autocorrelation function that decays exponentially.
  • Similarly, each pair of complex conjugate roots contributes an exponentially damped oscillation.

Graphs of AR(p) processes

Figure caption: AR(0); AR(1) with AR parameter 0.3; AR(1) with AR parameter 0.9; AR(2) with AR parameters 0.3 and 0.3; and AR(2) with AR parameters 0.9 and −0.8. In the figure, AR(0) and AR(1) with parameter 0.3 look like white noise, while AR(1) with parameter 0.9 shows large-scale oscillating structure.

The simplest AR process is AR(0), which has no dependence between the terms. Only the error/innovation/noise term contributes to the output of the process, so in the figure, AR(0) corresponds to white noise.

For an AR(1) process with a positive $\phi$, only the previous term in the process and the noise term contribute to the output. If $\phi$ is close to 0, then the process still looks like white noise, but as $\phi$ approaches 1, the output gets a larger contribution from the previous term relative to the noise. This results in a "smoothing" or integration of the output, similar to a low-pass filter.

For an AR(2) process, the previous two terms and the noise term contribute to the output. If both $\phi_1$ and $\phi_2$ are positive, the output will resemble a low-pass filter, with the high-frequency part of the noise decreased. If $\phi_1$ is positive while $\phi_2$ is negative, then the process favors changes in sign between terms of the process. The output oscillates. This can be linked to edge detection or detection of change in direction.
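The qualitative behaviors described above can be reproduced by direct simulation. A sketch (assuming NumPy; `simulate_ar` is a hypothetical helper) generating the five processes shown in the figure:

```python
import numpy as np

def simulate_ar(phi, n, sigma=1.0, seed=0):
    """Simulate n samples of a zero-mean AR(p) process
    X_t = phi_1 X_{t-1} + ... + phi_p X_{t-p} + eps_t,
    with Gaussian white noise and zero initial conditions."""
    rng = np.random.default_rng(seed)
    p = len(phi)
    x = np.zeros(n + p)
    eps = rng.normal(0.0, sigma, size=n + p)
    for t in range(p, n + p):
        x[t] = sum(phi[i] * x[t - 1 - i] for i in range(p)) + eps[t]
    return x[p:]

# The five processes shown in the figure:
examples = [("AR(0)", []), ("AR(1) 0.3", [0.3]), ("AR(1) 0.9", [0.9]),
            ("AR(2) 0.3, 0.3", [0.3, 0.3]), ("AR(2) 0.9, -0.8", [0.9, -0.8])]
series = {name: simulate_ar(phi, 200) for name, phi in examples}
```

Plotting the resulting series side by side reproduces the white-noise look of AR(0) and the smoothed or oscillating structure of the higher-order cases.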

Example: An AR(1) process


An AR(1) process is given by

$$X_t = c + \phi X_{t-1} + \varepsilon_t,$$

where $\varepsilon_t$ is a white noise process with zero mean and constant variance $\sigma_\varepsilon^2$. (Note: The subscript on $\phi_1$ has been dropped.) The process is weak-sense stationary if $|\phi| < 1$ since it is obtained as the output of a stable filter whose input is white noise. (If $\phi = 1$ then the variance of $X_t$ depends on time lag $t$, so that the variance of the series diverges to infinity as $t$ goes to infinity, and is therefore not weak-sense stationary.) Assuming $|\phi| < 1$, the mean $\operatorname{E}(X_t)$ is identical for all values of $t$ by definition of weak-sense stationarity. If the mean is denoted by $\mu$, it follows from

$$\operatorname{E}(X_t) = \operatorname{E}(c) + \phi \operatorname{E}(X_{t-1})$$

that

$$\mu = c + \phi \mu$$

and hence

$$\mu = \frac{c}{1 - \phi}.$$

The variance is

$$\operatorname{var}(X_t) = \frac{\sigma_\varepsilon^2}{1 - \phi^2},$$

where $\sigma_\varepsilon$ is the standard deviation of $\varepsilon_t$. This can be shown by noting that

$$\operatorname{var}(X_t) = \phi^2 \operatorname{var}(X_{t-1}) + \sigma_\varepsilon^2,$$

and then by noticing that the quantity above is a stable fixed point of this relation.

The autocovariance is given by

$$B_n = \operatorname{E}(X_{t+n} X_t) - \mu^2 = \frac{\sigma_\varepsilon^2}{1 - \phi^2}\,\phi^{|n|}.$$

It can be seen that the autocovariance function decays with a decay time (also called time constant) of $\tau = -1/\ln(\phi)$.[13]
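These closed-form moments are straightforward to compute. A small Python sketch (the helper name `ar1_moments` is illustrative):

```python
def ar1_moments(c, phi, sigma_eps):
    """Theoretical mean, variance, and autocovariance of a stationary AR(1)
    process X_t = c + phi*X_{t-1} + eps_t with |phi| < 1."""
    assert abs(phi) < 1, "stationarity requires |phi| < 1"
    mean = c / (1.0 - phi)                  # mu = c / (1 - phi)
    var = sigma_eps**2 / (1.0 - phi**2)     # sigma_eps^2 / (1 - phi^2)
    def autocov(n):                         # B_n = var * phi^|n|
        return var * phi**abs(n)
    return mean, var, autocov
```

For example, with $c = 1$, $\phi = 0.5$, $\sigma_\varepsilon = 1$, the mean is 2 and the variance is $4/3$.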

The spectral density function is the Fourier transform of the autocovariance function. In discrete terms this will be the discrete-time Fourier transform:

$$\Phi(\omega) = \frac{1}{\sqrt{2\pi}}\,\frac{\sigma_\varepsilon^2}{1 + \phi^2 - 2\phi\cos(\omega)}.$$

This expression is periodic due to the discrete nature of the $X_j$, which is manifested as the cosine term in the denominator. If we assume that the sampling time ($\Delta t = 1$) is much smaller than the decay time ($\tau$), then we can use a continuum approximation to $B_n$:

$$B(t) \approx \frac{\sigma_\varepsilon^2}{1 - \phi^2}\,e^{-|t|/\tau},$$

which yields a Lorentzian profile for the spectral density:

$$\Phi(\omega) = \frac{1}{\sqrt{2\pi}}\,\frac{\sigma_\varepsilon^2}{1 - \phi^2}\,\frac{\gamma}{\pi(\gamma^2 + \omega^2)},$$

where $\gamma = 1/\tau$ is the angular frequency associated with the decay time $\tau$.

An alternative expression for $X_t$ can be derived by first substituting $c + \phi X_{t-2} + \varepsilon_{t-1}$ for $X_{t-1}$ in the defining equation. Continuing this process $N$ times yields

$$X_t = c\sum_{k=0}^{N-1}\phi^k + \phi^N X_{t-N} + \sum_{k=0}^{N-1}\phi^k \varepsilon_{t-k}.$$

For $N$ approaching infinity, $\phi^N$ will approach zero and:

$$X_t = \frac{c}{1-\phi} + \sum_{k=0}^{\infty}\phi^k \varepsilon_{t-k}.$$

It is seen that $X_t$ is white noise convolved with the kernel $\phi^k$ plus the constant mean. If the white noise $\varepsilon_t$ is a Gaussian process then $X_t$ is also a Gaussian process. In other cases, the central limit theorem indicates that $X_t$ will be approximately normally distributed when $\phi$ is close to one.

For $\varepsilon_t = 0$, the process will be a geometric progression (exponential growth or decay). In this case, the solution can be found analytically: $X_t = \frac{c}{1-\phi} + a\phi^t$, whereby $a$ is an unknown constant (initial condition).

Explicit mean/difference form of AR(1) process


The AR(1) model is the discrete-time analogy of the continuous Ornstein–Uhlenbeck process. It is therefore sometimes useful to understand the properties of the AR(1) model cast in an equivalent form. In this form, the AR(1) model, with process parameter $\theta$, is given by

$X_t = X_{t-1} + (1-\theta)(\mu - X_{t-1}) + \varepsilon_t$, where $|\theta| < 1$, $\mu$ is the model mean, and $\varepsilon_t$ is a white-noise process with zero mean and constant variance $\sigma^2$.

By rewriting this as $X_t = \theta X_{t-1} + (1-\theta)\mu + \varepsilon_t$ and then deriving (by induction) $X_{t+n} = \theta^n X_t + (1-\theta^n)\mu + \sum_{k=0}^{n-1}\theta^k \varepsilon_{t+n-k}$, one can show that

$$\operatorname{E}(X_{t+n} \mid X_t) = \mu\left[1 - \theta^n\right] + X_t \theta^n$$
and
$$\operatorname{Var}(X_{t+n} \mid X_t) = \sigma^2\,\frac{1 - \theta^{2n}}{1 - \theta^2}.$$

Choosing the maximum lag


The partial autocorrelation of an AR(p) process equals zero at lags larger than p, so the appropriate maximum lag p is the one after which the partial autocorrelations are all zero.
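The partial autocorrelations can be obtained from the autocorrelations with the Durbin–Levinson recursion, and the AR order read off as the last lag with a non-negligible value. A sketch (assuming NumPy; `pacf_from_acf` is a hypothetical helper name):

```python
import numpy as np

def pacf_from_acf(rho, nlags):
    """Partial autocorrelations at lags 1..nlags from the autocorrelation
    sequence rho[0..nlags] (rho[0] == 1), via the Durbin-Levinson recursion."""
    phi = np.zeros((nlags + 1, nlags + 1))  # phi[k, j]: AR(k) fit coefficients
    alpha = np.zeros(nlags + 1)             # alpha[k]: partial autocorrelation
    phi[1, 1] = alpha[1] = rho[1]
    for k in range(2, nlags + 1):
        num = rho[k] - sum(phi[k - 1, j] * rho[k - j] for j in range(1, k))
        den = 1.0 - sum(phi[k - 1, j] * rho[j] for j in range(1, k))
        phi[k, k] = alpha[k] = num / den
        for j in range(1, k):
            phi[k, j] = phi[k - 1, j] - phi[k, k] * phi[k - 1, k - j]
    return alpha[1:]
```

For a true AR(1) with $\phi = 0.6$ (so $\rho(k) = 0.6^k$), the recursion returns 0.6 at lag 1 and values numerically indistinguishable from zero beyond it, illustrating the cut-off property.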

Calculation of the AR parameters


There are many ways to estimate the coefficients, such as the ordinary least squares procedure or the method of moments (through the Yule–Walker equations).

The AR(p) model is given by the equation

$$X_t = c + \sum_{i=1}^{p} \phi_i X_{t-i} + \varepsilon_t.$$

It is based on parameters $\phi_i$ where $i = 1, \ldots, p$. There is a direct correspondence between these parameters and the covariance function of the process, and this correspondence can be inverted to determine the parameters from the autocorrelation function (which is itself obtained from the covariances). This is done using the Yule–Walker equations.

Yule–Walker equations


The Yule–Walker equations, named for Udny Yule and Gilbert Walker,[14][15] are the following set of equations.[16]

$$\gamma_m = \sum_{k=1}^{p} \phi_k \gamma_{m-k} + \sigma_\varepsilon^2 \delta_{m,0},$$

where $m = 0, \ldots, p$, yielding $p + 1$ equations. Here $\gamma_m$ is the autocovariance function of $X_t$, $\sigma_\varepsilon$ is the standard deviation of the input noise process, and $\delta_{m,0}$ is the Kronecker delta function.

Because the last part of an individual equation is non-zero only if $m = 0$, the set of equations can be solved by representing the equations for $m > 0$ in matrix form, thus getting the equation

$$\begin{bmatrix} \gamma_1 \\ \gamma_2 \\ \vdots \\ \gamma_p \end{bmatrix} = \begin{bmatrix} \gamma_0 & \gamma_{-1} & \cdots & \gamma_{1-p} \\ \gamma_1 & \gamma_0 & \cdots & \gamma_{2-p} \\ \vdots & \vdots & \ddots & \vdots \\ \gamma_{p-1} & \gamma_{p-2} & \cdots & \gamma_0 \end{bmatrix} \begin{bmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_p \end{bmatrix},$$

which can be solved for all $\{\phi_m;\, m = 1, \ldots, p\}$. The remaining equation for $m = 0$ is

$$\gamma_0 = \sum_{k=1}^{p} \phi_k \gamma_{-k} + \sigma_\varepsilon^2,$$

which, once $\{\phi_m;\, m = 1, \ldots, p\}$ are known, can be solved for $\sigma_\varepsilon^2$.

An alternative formulation is in terms of the autocorrelation function. The AR parameters are determined by the first $p+1$ elements $\rho(\tau)$, $\tau = 0, 1, \ldots, p$, of the autocorrelation function. The full autocorrelation function can then be derived by recursively calculating [17]

$$\rho(\tau) = \sum_{k=1}^{p} \phi_k \rho(\tau - k), \qquad \tau > 0.$$

Examples for some low-order AR(p) processes

  • p=1
    • $\gamma_1 = \phi_1 \gamma_0$. Hence $\rho_1 = \gamma_1 / \gamma_0 = \phi_1$.
  • p=2
    • The Yule–Walker equations for an AR(2) process are
      $\gamma_1 = \phi_1 \gamma_0 + \phi_2 \gamma_{-1}$
      $\gamma_2 = \phi_1 \gamma_1 + \phi_2 \gamma_0$
      • Remember that $\gamma_{-k} = \gamma_k$.
      • Using the first equation yields $\rho_1 = \gamma_1/\gamma_0 = \dfrac{\phi_1}{1 - \phi_2}$.
      • Using the recursion formula yields $\rho_2 = \gamma_2/\gamma_0 = \dfrac{\phi_1^2 + \phi_2 - \phi_2^2}{1 - \phi_2}$.
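For estimated autocorrelations, the equations for m > 0 form a p × p Toeplitz linear system that can be solved directly, after which the m = 0 equation gives the noise variance. A sketch (assuming NumPy; `yule_walker` is an illustrative name, distinct from any library routine):

```python
import numpy as np

def yule_walker(rho, p, gamma0=1.0):
    """Solve the Yule-Walker equations for an AR(p) model, given the
    autocorrelations rho[0..p] (rho[0] == 1) and the process variance gamma0.
    Returns the coefficients phi and the innovation variance sigma^2."""
    # m > 0 equations: Toeplitz system R phi = r
    R = np.array([[rho[abs(i - j)] for j in range(p)] for i in range(p)])
    r = np.array(rho[1:p + 1])
    phi = np.linalg.solve(R, r)
    # m = 0 equation: gamma_0 = sum_k phi_k gamma_k + sigma^2
    sigma2 = gamma0 * (1.0 - phi @ r)
    return phi, sigma2
```

Feeding in the theoretical AR(1) autocorrelations $\rho(k) = 0.5^k$ recovers $\phi_1 = 0.5$ and $\sigma^2 = 0.75\,\gamma_0$.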

Estimation of AR parameters


The above equations (the Yule–Walker equations) provide several routes to estimating the parameters of an AR(p) model, by replacing the theoretical covariances with estimated values.[18] Some of these variants can be described as follows:

  • Estimation of autocovariances or autocorrelations. Here each of these terms is estimated separately, using conventional estimates. There are different ways of doing this and the choice between these affects the properties of the estimation scheme. For example, negative estimates of the variance can be produced by some choices.
  • Formulation as a least squares regression problem in which an ordinary least squares prediction problem is constructed, basing prediction of values of Xt on the p previous values of the same series. This can be thought of as a forward-prediction scheme. The normal equations for this problem can be seen to correspond to an approximation of the matrix form of the Yule–Walker equations in which each appearance of an autocovariance of the same lag is replaced by a slightly different estimate.
  • Formulation as an extended form of ordinary least squares prediction problem. Here two sets of prediction equations are combined into a single estimation scheme and a single set of normal equations. One set is the set of forward-prediction equations and the other is a corresponding set of backward prediction equations, relating to the backward representation of the AR model:

$$X_t = c + \sum_{i=1}^{p} \phi_i X_{t+i} + \varepsilon_t^{*}.$$

Here predicted values of Xt would be based on the p future values of the same series.[clarification needed] This way of estimating the AR parameters is due to John Parker Burg,[19] and is called the Burg method.[20] Burg and later authors called these particular estimates "maximum entropy estimates",[21] but the reasoning behind this applies to the use of any set of estimated AR parameters. Compared to the estimation scheme using only the forward prediction equations, different estimates of the autocovariances are produced, and the estimates have different stability properties. Burg estimates are particularly associated with maximum entropy spectral estimation.[22]

Other possible approaches to estimation include maximum likelihood estimation. Two distinct variants of maximum likelihood are available: in one (broadly equivalent to the forward prediction least squares scheme) the likelihood function considered is that corresponding to the conditional distribution of later values in the series given the initial p values in the series; in the second, the likelihood function considered is that corresponding to the unconditional joint distribution of all the values in the observed series. Substantial differences in the results of these approaches can occur if the observed series is short, or if the process is close to non-stationarity.

Spectrum


The power spectral density (PSD) of an AR(p) process with noise variance $\operatorname{Var}(Z_t) = \sigma_Z^2$ is[17]

$$S(f) = \frac{\sigma_Z^2}{\left| 1 - \sum_{k=1}^{p} \phi_k e^{-i 2\pi f k} \right|^2}.$$

AR(0)


For white noise (AR(0)), $S(f) = \sigma_Z^2$.

AR(1)


For AR(1),

$$S(f) = \frac{\sigma_Z^2}{1 + \phi_1^2 - 2\phi_1 \cos(2\pi f)}$$

  • If $\phi_1 > 0$ there is a single spectral peak at $f = 0$, often referred to as red noise. As $\phi_1$ becomes nearer 1, there is stronger power at low frequencies, i.e. larger time lags. This is then a low-pass filter; when applied to full-spectrum light, everything except the red light will be filtered.
  • If $\phi_1 < 0$ there is a minimum at $f = 0$, often referred to as blue noise. This similarly acts as a high-pass filter; everything except blue light will be filtered.
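The PSD formula is easy to evaluate on a frequency grid. A sketch (assuming NumPy; `ar_psd` is a hypothetical helper, with f in cycles per sample):

```python
import numpy as np

def ar_psd(phi, f, sigma2=1.0):
    """PSD of an AR(p) process at frequencies f (cycles per sample):
    S(f) = sigma2 / |1 - sum_k phi_k exp(-2i pi f k)|^2."""
    f = np.asarray(f, dtype=float)
    denom = np.ones_like(f, dtype=complex)
    for k, coef in enumerate(phi, start=1):
        denom = denom - coef * np.exp(-2j * np.pi * f * k)
    return sigma2 / np.abs(denom) ** 2
```

For $\phi_1 = 0.5$ the spectrum falls monotonically from $S(0) = 4\sigma^2$ to $S(1/2) = (4/9)\sigma^2$, the red-noise shape described above.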

AR(2)


The behavior of an AR(2) process is determined entirely by the roots of its characteristic equation, which is expressed in terms of the lag operator as:

$$1 - \phi_1 B - \phi_2 B^2 = 0,$$

or equivalently by the poles of its transfer function, which is defined in the Z domain by:

$$H(z) = \frac{1}{1 - \phi_1 z^{-1} - \phi_2 z^{-2}}.$$

It follows that the poles are values of z satisfying:

$$z^2 - \phi_1 z - \phi_2 = 0,$$

which yields:

$$z_1, z_2 = \frac{\phi_1 \pm \sqrt{\phi_1^2 + 4\phi_2}}{2}.$$

$z_1$ and $z_2$ are the reciprocals of the characteristic roots, as well as the eigenvalues of the temporal update matrix:

$$\begin{bmatrix} \phi_1 & \phi_2 \\ 1 & 0 \end{bmatrix}$$

AR(2) processes can be split into three groups depending on the characteristics of their roots/poles:

  • When $\phi_1^2 + 4\phi_2 < 0$, the process has a pair of complex-conjugate poles, creating a mid-frequency peak at:

$$f^{*} = \frac{1}{2\pi}\cos^{-1}\left(\frac{\phi_1(\phi_2 - 1)}{4\phi_2}\right),$$

with bandwidth about the peak inversely proportional to the moduli of the poles:

$$|z_1| = |z_2| = \sqrt{-\phi_2}.$$

The terms involving square roots are all real in the case of complex poles since they exist only when $\phi_2 < 0$.

Otherwise the process has real roots, and:

  • When $\phi_1 > 0$, it acts as a low-pass filter on the white noise with a spectral peak at $f = 0$.
  • When $\phi_1 < 0$, it acts as a high-pass filter on the white noise with a spectral peak at $f = 1/2$.

The process is non-stationary when the poles are on or outside the unit circle, or equivalently when the characteristic roots are on or inside the unit circle. The process is stable when the poles are strictly within the unit circle (roots strictly outside the unit circle), or equivalently when the coefficients are in the triangle $-1 \leq \phi_2 \leq 1 - |\phi_1|$.
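The pole locations, and hence the classification above, can be computed from the quadratic formula. A pure-Python sketch (the helper names are illustrative):

```python
import cmath

def ar2_poles(phi1, phi2):
    """Poles of an AR(2) process: solutions of z^2 - phi1*z - phi2 = 0."""
    disc = cmath.sqrt(phi1**2 + 4.0 * phi2)
    return (phi1 + disc) / 2.0, (phi1 - disc) / 2.0

def ar2_is_stationary(phi1, phi2):
    """Stationary iff both poles lie strictly inside the unit circle
    (equivalently, (phi1, phi2) lies inside the triangle -1 <= phi2 <= 1 - |phi1|)."""
    return all(abs(z) < 1.0 for z in ar2_poles(phi1, phi2))
```

With $\phi_1 = 0.9$, $\phi_2 = -0.8$ the discriminant is negative, so the poles are complex conjugates of modulus $\sqrt{0.8} \approx 0.894$ and the process is a stationary oscillator.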

The full PSD function can be expressed in real form as:

$$S(f) = \frac{\sigma_Z^2}{1 + \phi_1^2 + \phi_2^2 - 2\phi_1(1 - \phi_2)\cos(2\pi f) - 2\phi_2\cos(4\pi f)}.$$

Implementations in statistics packages

  • R – the stats package includes ar function;[23] the astsa package includes sarima function to fit various models including AR.[24]
  • MATLAB – the Econometrics Toolbox[25] and System Identification Toolbox[26] include AR models.[27]
  • MATLAB and Octave – the TSA toolbox contains several estimation functions for uni-variate, multivariate, and adaptive AR models.[28]
  • PyMC3 – the Bayesian statistics and probabilistic programming framework supports AR models with p lags.
  • bayesloop – supports parameter inference and model selection for the AR-1 process with time-varying parameters.[29]
  • Python – statsmodels.org hosts an AR model.[30]

Impulse response


The impulse response of a system is the change in an evolving variable in response to a change in the value of a shock term k periods earlier, as a function of k. Since the AR model is a special case of the vector autoregressive model, the computation of the impulse response for vector autoregressions applies here.

n-step-ahead forecasting


Once the parameters of the autoregression

$$X_t = c + \sum_{i=1}^{p} \phi_i X_{t-i} + \varepsilon_t$$
have been estimated, the autoregression can be used to forecast an arbitrary number of periods into the future. First use t to refer to the first period for which data is not yet available; substitute the known preceding values Xt-i for i=1, ..., p into the autoregressive equation while setting the error term equal to zero (because we forecast Xt to equal its expected value, and the expected value of the unobserved error term is zero). The output of the autoregressive equation is the forecast for the first unobserved period. Next, use t to refer to the next period for which data is not yet available; again the autoregressive equation is used to make the forecast, with one difference: the value of X one period prior to the one now being forecast is not known, so its expected value—the predicted value arising from the previous forecasting step—is used instead. Then for future periods the same procedure is used, each time using one more forecast value on the right side of the predictive equation until, after p predictions, all p right-side values are predicted values from preceding steps.
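The iterative forecasting procedure described above can be sketched as a short loop (pure Python; `forecast_ar` is an illustrative name):

```python
def forecast_ar(phi, c, history, n):
    """n-step-ahead forecasts from an estimated AR(p) model
    X_t = c + sum_i phi_i X_{t-i} + eps_t, setting future error terms to
    their expected value of zero. history is ordered oldest to newest
    and must contain at least p = len(phi) observations."""
    p = len(phi)
    assert len(history) >= p
    vals = list(history)
    for _ in range(n):
        nxt = c + sum(phi[i] * vals[-1 - i] for i in range(p))
        vals.append(nxt)  # forecasts feed back in as lagged values
    return vals[len(history):]
```

For an AR(1) with $\phi = 0.5$, $c = 0$ and last observation 8, the forecasts halve at each step: 4, 2, 1, illustrating how, after p steps, every right-side value is itself a forecast.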

There are four sources of uncertainty regarding predictions obtained in this manner: (1) uncertainty as to whether the autoregressive model is the correct model; (2) uncertainty about the accuracy of the forecasted values that are used as lagged values in the right side of the autoregressive equation; (3) uncertainty about the true values of the autoregressive coefficients; and (4) uncertainty about the value of the error term for the period being predicted. Each of the last three can be quantified and combined to give a confidence interval for the n-step-ahead predictions; the confidence interval will become wider as n increases because of the use of an increasing number of estimated values for the right-side variables.

from Grokipedia
An autoregressive (AR) model is a statistical representation of a random process in which each observation in a time series is expressed as a linear function of one or more previous observations from the same series, plus a stochastic error term, making it a fundamental tool for modeling dependencies in sequential data. The general form of an AR model of order p, denoted AR(p), is given by the equation

$$y_t = c + \sum_{i=1}^{p} \phi_i y_{t-i} + \epsilon_t,$$

where $y_t$ is the value at time $t$, $c$ is a constant, $\phi_i$ are the model parameters (autoregressive coefficients), and $\epsilon_t$ is a white noise error with mean zero and constant variance. For the simplest case, an AR(1) model, this reduces to $y_t = c + \phi_1 y_{t-1} + \epsilon_t$, assuming stationarity when $|\phi_1| < 1$.
The concept originated in the early 20th century, with George Udny Yule introducing the first AR(2) model in 1927 to investigate periodicities in sunspot data, addressing limitations of purely deterministic cycle models by incorporating random disturbances. This work was extended by Gilbert Thomas Walker in 1931, who generalized the approach to higher-order autoregressions and derived methods for parameter estimation, laying the groundwork for the Yule-Walker equations that solve for the coefficients using autocorrelations. Autoregressive models are central to time series analysis, particularly in econometrics and finance, where they forecast variables like stock prices or economic indicators by capturing serial correlation; for instance, AR models have been applied to predict Google stock returns based on lagged values. In signal processing, they model stationary processes for tasks like speech analysis or noise reduction, assuming the data-generating mechanism follows a linear recursive structure. In contemporary machine learning, autoregressive principles underpin generative models for sequences, such as those in natural language processing (e.g., predicting the next word conditioned on prior context) and computer vision (e.g., generating images pixel by pixel), enabling scalable density estimation through the chain rule of probability. These extensions, often implemented with neural networks like recurrent or transformer architectures, have revolutionized applications in large language models while inheriting the core idea of sequential dependency modeling. However, according to Yann LeCun, autoregressive large language models lack true understanding, planning, and reasoning capabilities due to limitations in sample efficiency, world modeling, and their reliance on predicting discrete tokens rather than continuous representations.

Fundamentals

Definition

An autoregressive (AR) model is a stochastic process in which each observation is expressed as a linear combination of previous observations of the same process plus a random error term. These models are fundamental in time series analysis for capturing temporal dependencies, where the value at time $t$ relies on prior values rather than assuming observations are independent. In contrast to models treating data points as unrelated, AR models leverage the inherent autocorrelation in sequential data, such as economic indicators or natural phenomena, to represent persistence or momentum. The general form of an AR model of order $p$, denoted AR($p$), is given by

$$X_t = c + \sum_{i=1}^{p} \phi_i X_{t-i} + \varepsilon_t,$$

where $c$ is a constant, $\phi_i$ are the model parameters (autoregressive coefficients), and $\varepsilon_t$ is white noise, a sequence of independent and identically distributed random variables with mean zero and constant variance. The order $p$ indicates the number of lagged terms included, allowing the model to account for dependencies extending back $p$ periods. AR models differ from moving average (MA) models, which express the current value as a linear combination of past forecast errors rather than past values of the series itself. The term "autoregressive" derives from the idea of performing a regression of the variable against its own lagged values, emphasizing self-dependence within the time series. This framework assumes stationarity for reliable inference, though extensions like ARIMA incorporate differencing for non-stationary data.

Historical Development

The origins of autoregressive models trace back to the work of British statistician George Udny Yule, who in 1927 introduced autoregressive schemes to analyze periodicities in disturbed time series, particularly applying them to Wolfer's sunspot numbers to model cycles in astronomical data. Yule's approach represented a departure from traditional periodogram methods, emphasizing stochastic processes where current values depend on past observations to capture quasi-periodic behaviors in time series. In 1931, Gilbert Thomas Walker extended Yule's framework by generalizing it to higher-order autoregressive models, allowing for more flexible representations of complex dependencies in related time series. In the 1930s and 1940s, Herman Wold further advanced the theory by developing the Wold decomposition, showing that stationary processes can be represented as infinite-order AR or MA models, paving the way for ARMA frameworks. A major milestone came in 1970 with George E. P. Box and Gwilym M. Jenkins, who incorporated autoregressive models into the ARMA framework in their seminal book, providing a systematic methodology for identification, estimation, and forecasting that popularized AR models across forecasting applications in statistics and beyond. Since the 1980s, autoregressive models have seen modern extensions in signal processing for spectral estimation and analysis of stationary signals, as well as in machine learning through autoregressive neural networks that leverage past outputs for sequence generation tasks.

Model Formulation

General AR(p) Equation

The autoregressive model of order $p$, commonly denoted as AR($p$), specifies that the value of a time series at time $t$, $X_t$, depends linearly on its previous $p$ values plus a constant term and a stochastic error. The general form of the model is given by $X_t - \sum_{i=1}^p \phi_i X_{t-i} = c + \varepsilon_t$, where $\phi_1, \dots, \phi_p$ are the autoregressive parameters, $c$ is a constant representing the deterministic component (often related to the mean of the process), and $\varepsilon_t$ is a white noise error term. This equation can be rearranged as $X_t = c + \sum_{i=1}^p \phi_i X_{t-i} + \varepsilon_t$. The error term $\varepsilon_t$ is assumed to have mean zero, constant variance $\sigma^2 > 0$, and to be uncorrelated across time, i.e., $\mathbb{E}[\varepsilon_t] = 0$, $\mathbb{E}[\varepsilon_t \varepsilon_s] = 0$ for $t \neq s$, and $\mathrm{Var}(\varepsilon_t) = \sigma^2$. For statistical inference, such as maximum likelihood estimation, the errors are often further assumed to be independent and identically distributed as Gaussian, $\varepsilon_t \sim \mathcal{N}(0, \sigma^2)$. When $c = 0$, the model is homogeneous, implying a zero-mean process, which is suitable for centered data. In the inhomogeneous case with $c \neq 0$, the constant accounts for a non-zero mean, and under weak stationarity, the unconditional mean of the process is $\mu = c / (1 - \sum_{i=1}^p \phi_i)$. The model assumes weak stationarity, meaning the mean, variance, and autocovariances are time-invariant, which requires the roots of the characteristic polynomial to lie outside the unit circle (as detailed in the stationarity conditions section). For inference involving normality assumptions, Gaussian errors facilitate exact likelihood computations. A compact notation for the AR($p$) model employs the backshift operator $B$, defined such that $B X_t = X_{t-1}$ and $B^k X_t = X_{t-k}$ for $k \geq 1$.
The autoregressive polynomial is $\phi(B) = 1 - \sum_{i=1}^p \phi_i B^i$, leading to the operator form $\phi(B) X_t = c + \varepsilon_t$. This notation simplifies manipulations, such as differencing or combining with moving average components in broader ARMA models. For stationary AR($p$) processes, the model admits an infinite moving average (MA($\infty$)) representation, expressing $X_t$ as an infinite weighted sum of current and past errors plus the mean: $X_t = \mu + \sum_{j=0}^\infty \psi_j \varepsilon_{t-j}$, where the coefficients $\psi_j$ are determined by the autoregressive parameters and satisfy $\psi_0 = 1$ with $\sum_{j=0}^\infty |\psi_j| < \infty$ to ensure absolute summability. This representation underscores the process's dependence on the entire error history, providing a foundation for forecasting and spectral analysis.

Stationarity Conditions

In time series analysis, weak stationarity, also known as covariance stationarity, requires that a process has a constant mean, constant variance, and autocovariances that depend solely on the time lag rather than the specific time points. For an autoregressive process of order $p$, denoted AR($p$), this property ensures that the statistical characteristics remain invariant over time, facilitating reliable modeling and forecasting. The necessary and sufficient condition for an AR($p$) process to be weakly stationary is that all roots of the characteristic equation $\phi(z) = 1 - \sum_{i=1}^p \phi_i z^i = 0$ lie outside the unit circle in the complex plane, meaning their moduli satisfy $|z| > 1$. This condition guarantees the existence of a stationary solution. For the simple AR(1) process $y_t = \phi y_{t-1} + \varepsilon_t$, stationarity holds if and only if $|\phi| < 1$. If the stationarity condition is violated, such as when one or more roots have modulus $|z| \leq 1$, the AR process becomes non-stationary, exhibiting behaviors like unit root processes (e.g., random walks with time-dependent variance) or explosive dynamics where variance grows without bound. In the case of a unit root ($|z| = 1$), as in an AR(1) with $\phi = 1$, the process integrates to form a non-stationary series with persistent shocks. To address non-stationarity in AR processes, differencing transforms the series into a stationary one by applying the operator $\nabla y_t = y_t - y_{t-1}$, which removes trends or unit roots; higher-order differencing ($\nabla^d$) may be needed for processes integrated of order $d > 1$. This approach underpins ARIMA models, where the differenced series follows a stationary ARMA process.

Properties and Analysis

Characteristic Polynomial

The characteristic polynomial of an autoregressive model of order $p$, denoted $\phi(z) = 1 - \phi_1 z - \phi_2 z^2 - \cdots - \phi_p z^p$, arises from the AR($p$) operator $\Phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p$, where $B$ is the backshift operator such that $B X_t = X_{t-1}$. This polynomial encapsulates the linear dependence structure of the process defined by $X_t = \sum_{j=1}^p \phi_j X_{t-j} + \epsilon_t$, with $\epsilon_t$ as white noise. To derive the characteristic polynomial, consider the AR($p$) equation in operator form: $\Phi(B) X_t = \epsilon_t$. Substituting the lag operator $B$ with a complex variable $z$ yields the polynomial $\phi(z)$, which can be viewed through the lens of the z-transform of the process. The approach transforms the difference equation into an algebraic one, where $\phi(z)$ serves as the denominator in the transfer function $1/\phi(z)$, enabling the representation of the AR process as an infinite moving average via partial fraction expansion of the roots. Alternatively, generating functions can be used to express the moments of the process, with the characteristic polynomial emerging from the denominator of the generating function for the autocovariances. The roots of $\phi(z) = 0$ provide key insights into the dynamics of the AR process. If the roots are complex conjugates, they introduce oscillatory components in the time series behavior, with the argument of the roots determining the frequency of oscillation. The modulus of the roots governs persistence: roots with smaller modulus (closer to but outside the unit circle) imply slower decay of shocks and longer-lasting effects, while larger moduli (farther from the unit circle) indicate faster decay. For stationarity, all roots must lie outside the unit circle in the complex plane, a condition that ensures the infinite MA representation converges. Pure AR models are always invertible.
Stationary AR models can be expressed as a convergent infinite moving average (MA(∞)) representation without additional constraints beyond the root locations. Graphically, the roots are plotted in the complex plane, where the unit circle serves as a boundary: roots inside it indicate non-stationarity, while roots outside it confirm stationarity, visually highlighting oscillatory patterns via the imaginary axis and persistence via radial distance.

Intertemporal Effects of Shocks

In an autoregressive (AR) model, a shock is conceptualized as a one-time impulse $\varepsilon_t$ to the error term, representing an unanticipated disturbance at time $t$. This shock influences the future values of the process $X_{t+k}$ for $k > 0$ through the model's recursive structure. The marginal effect of such a shock is given by $\frac{\partial X_{t+k}}{\partial \varepsilon_t} = \psi_k$, where $\psi_k$ denotes the $k$-th dynamic multiplier, obtained by recursively applying the AR coefficients (for an AR($p$) model, $\psi_0 = 1$ and $\psi_k = \sum_{i=1}^{\min(k,p)} \phi_i \psi_{k-i}$ for $k > 0$). The persistence of these intertemporal effects depends on the stationarity of the AR process. In a stationary AR model, where all roots of the characteristic polynomial lie outside the unit circle, the effects of a shock decay geometrically over time, ensuring that the influence diminishes as $k$ increases (e.g., in an AR(1) process with coefficient $\phi_1 = \phi$, $|\phi| < 1$, the effect on $X_{t+k}$ is $\phi^k \varepsilon_t$). Conversely, in non-stationary cases, such as when a unit root is present (e.g., $\phi = 1$ in AR(1)), the effects accumulate rather than decay, leading to permanent shifts in the level of the series. A key aspect of shock propagation is the variance decomposition, which quantifies how past shocks contribute to the current unconditional variance of the process. For a stationary AR model, the variance is $\operatorname{Var}(X_t) = \sigma_\varepsilon^2 \sum_{k=0}^\infty \psi_k^2$, where each term $\psi_k^2 \sigma_\varepsilon^2$ represents the contribution from a shock $k$ periods in the past; this infinite sum converges due to geometric decay. In the AR(1) case, it simplifies to $\operatorname{Var}(X_t) = \frac{\sigma_\varepsilon^2}{1 - \phi^2}$, illustrating how earlier shocks have exponentially smaller contributions relative to recent ones.
In econometric applications, particularly in macroeconomics, these shocks are often interpreted as exogenous events such as policy changes, supply disruptions, or demand fluctuations that propagate through economic variables modeled via AR processes. For instance, an unanticipated monetary policy tightening can be viewed as a negative shock whose intertemporal effects trace the subsequent adjustments in output or inflation, with persistence reflecting the economy's inertial response to such interventions.

Impulse Response Function

In autoregressive (AR) models, the impulse response function (IRF) quantifies the dynamic impact of a unit shock to the innovation term $\varepsilon_t$ on the future values of the process $X_{t+k}$. It is formally defined as the sequence of coefficients $\psi_k = \frac{\partial X_{t+k}}{\partial \varepsilon_t}$ for $k = 0, 1, 2, \dots$, with the initial condition $\psi_0 = 1$ reflecting the contemporaneous effect of the shock. These IRF coefficients arise from the moving-average representation of the stationary AR process, $X_t = \sum_{k=0}^\infty \psi_k \varepsilon_{t-k}$, and satisfy a linear recurrence relation derived from the AR structure. For an AR($p$) model, they are computed recursively as $\psi_k = \sum_{i=1}^p \phi_i \psi_{k-i}$ for $k > 0$, with $\psi_k = 0$ for $k < 0$. This recursion allows efficient numerical calculation of the IRF sequence, starting from the known AR parameters $\phi_1, \dots, \phi_p$.

For the simple AR(1) model, $X_t = \phi X_{t-1} + \varepsilon_t$, the IRF has the closed-form expression $\psi_k = \phi^k$ for $k \geq 0$. Under the stationarity condition $|\phi| < 1$, this exhibits geometric decay, with the shock's influence diminishing exponentially over time. In practice, IRFs for AR($p$) models with $p > 1$ are visualized by plotting $\psi_k$ against $k$, revealing patterns such as monotonic decay, overshooting (where the response temporarily exceeds its long-run level), or oscillatory behavior driven by complex roots of the model's characteristic polynomial. For instance, roots near the unit circle prolong a shock's persistence, while purely real roots yield smoother responses. To account for estimation uncertainty, confidence bands are constructed around estimated IRFs using methods such as asymptotic normality, which relies on the variance-covariance matrix of the AR parameter estimates, or bootstrapping, which resamples residuals to simulate the sampling distribution of the responses. These bands widen with the forecast horizon $k$ and are essential for inference on shock persistence.
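The recursion above translates directly into code. This is a minimal sketch (the function name `impulse_response` is ours); it reproduces the AR(1) closed form $\psi_k = \phi^k$ and shows the damped oscillation produced by an AR(2) with complex characteristic roots.

```python
import numpy as np

def impulse_response(phi, horizon):
    """IRF of an AR(p): psi_0 = 1, psi_k = sum_{i=1}^{min(k,p)} phi_i * psi_{k-i}."""
    p = len(phi)
    psi = np.zeros(horizon + 1)
    psi[0] = 1.0
    for k in range(1, horizon + 1):
        psi[k] = sum(phi[i] * psi[k - 1 - i] for i in range(min(k, p)))
    return psi

# AR(1): matches the closed form psi_k = phi**k (monotonic geometric decay)
irf1 = impulse_response([0.7], 10)

# AR(2) with complex roots (phi1**2 + 4*phi2 = -1 < 0): damped oscillation,
# the response crosses zero and changes sign as it dies out
irf2 = impulse_response([1.0, -0.5], 10)
print(irf1)
print(irf2)
```

Plotting `irf2` against the horizon gives exactly the kind of oscillatory IRF picture described in the text.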

Specific Examples

AR(1) Process

The AR(1) process is the first-order autoregressive model, capturing dependence of the current observation on only the immediately preceding value. It is expressed as $X_t = c + \phi X_{t-1} + \varepsilon_t$, where $c$ denotes a constant term, $\phi$ is the autoregressive coefficient satisfying $|\phi| < 1$ for stationarity, and $\varepsilon_t$ is white noise with zero mean and finite variance $\sigma^2 > 0$. Under the stationarity condition $|\phi| < 1$, the unconditional mean of the process is $\mu = \frac{c}{1 - \phi}$ and the unconditional variance is $\gamma_0 = \frac{\sigma^2}{1 - \phi^2}$. The autocorrelation function of the stationary AR(1) process exhibits exponential decay, given by $\rho_k = \phi^{|k|}$ for lag $k \geq 0$. This geometric decline reflects the diminishing influence of past shocks over time, with the rate determined by $|\phi|$.

An equivalent representation centers the process around its mean, yielding the mean-deviation form $X_t - \mu = \phi (X_{t-1} - \mu) + \varepsilon_t$. This formulation highlights the mean-reverting dynamics when $|\phi| < 1$, as deviations from $\mu$ are scaled by $\phi$ before new noise is added.

Simulated AR(1) sample paths reveal behavioral contrasts across values of $\phi$. For $\phi = 0.2$, paths show rapid mean reversion, with quick damping of shocks and low persistence. At $\phi = 0.9$, paths display high persistence, wandering slowly before reverting, mimicking long-memory patterns. Negative $\phi$, such as $\phi = -0.8$, produces oscillatory paths alternating around the mean. In the unit-root case $\phi = 1$, the AR(1) process reduces to a random walk, $X_t = c + X_{t-1} + \varepsilon_t$, which is not stationary since its variance grows without bound over time.
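A quick simulation check of the moment formulas, under illustrative parameter values ($c = 1$, $\phi = 0.9$, $\sigma = 1$): a long simulated path should have sample mean near $c/(1-\phi) = 10$ and sample variance near $\sigma^2/(1-\phi^2) \approx 5.26$.

```python
import numpy as np

# Simulate a long stationary AR(1) path and compare sample moments with
# the theoretical mean c/(1 - phi) and variance sigma^2/(1 - phi^2).
c, phi, sigma, n = 1.0, 0.9, 1.0, 200_000
rng = np.random.default_rng(42)
x = np.empty(n)
x[0] = c / (1 - phi)                   # start at the stationary mean
for t in range(1, n):
    x[t] = c + phi * x[t - 1] + rng.normal(0.0, sigma)

mu_theory = c / (1 - phi)              # 10.0
var_theory = sigma**2 / (1 - phi**2)   # about 5.26
print(round(x.mean(), 2), round(x.var(), 2))
```

Rerunning with $\phi = 0.2$ versus $\phi = 0.9$ makes the persistence contrast described above directly visible in the paths, and with $\phi = -0.8$ the path alternates sign around the mean.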

AR(2) Process

The AR(2) process extends the autoregressive framework to second-order dependence, defined by the equation $X_t = c + \phi_1 X_{t-1} + \phi_2 X_{t-2} + \varepsilon_t$, where $c$ is a constant, $\phi_1$ and $\phi_2$ are the autoregressive parameters, and $\varepsilon_t$ is white noise with mean zero and finite variance $\sigma^2$. This formulation allows the current value $X_t$ to depend linearly on the two preceding observations, capturing more complex temporal dynamics than the first-order case.

Stationarity of the AR(2) process requires that the roots of the characteristic equation $1 - \phi_1 z - \phi_2 z^2 = 0$ lie outside the unit circle. Equivalently, this condition holds if the parameters satisfy $|\phi_2| < 1$, $\phi_1 < 1 - \phi_2$, and $\phi_1 > \phi_2 - 1$, defining a triangular region in the $(\phi_1, \phi_2)$ parameter space. Under these constraints, the process has a time-invariant mean $\mu = c / (1 - \phi_1 - \phi_2)$ and finite variance.

The autocorrelation function (ACF) of a stationary AR(2) process decays gradually to zero, following the recursive relation $\rho_k = \phi_1 \rho_{k-1} + \phi_2 \rho_{k-2}$ for $k > 2$, with initial values $\rho_1 = \phi_1 / (1 - \phi_2)$ and $\rho_2 = \phi_1 \rho_1 + \phi_2$. If the characteristic roots are complex conjugates (which occurs when the discriminant $\phi_1^2 + 4\phi_2 < 0$), the ACF exhibits damped sine-wave oscillations, reflecting pseudo-periodic behavior. In contrast, real roots produce a monotonic exponential decay in the ACF. The partial autocorrelation function (PACF) for an AR(2) process truncates after lag 2, with $\phi_{k,k} = 0$ for all $k > 2$, providing a diagnostic signature for model identification. This sharp cutoff distinguishes AR(2) from higher-order AR processes, whose PACFs cut off only at their respective higher lags.

The distinction between real and complex characteristic roots fundamentally shapes the process's dynamics: real roots yield smooth, non-oscillatory persistence, while complex roots introduce cyclic patterns with a pseudo-period of $2\pi / \cos^{-1}\!\left(\phi_1 / (2\sqrt{-\phi_2})\right)$, where $\phi_2 < 0$ whenever the roots are complex.
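The AR(2) diagnostics above fit in a few lines. This is an illustrative sketch (the helper names `ar2_is_stationary` and `ar2_acf` are ours): it encodes the stationarity triangle, computes the ACF by the Yule-Walker recursion, and evaluates the pseudo-period for a complex-root example.

```python
import numpy as np

def ar2_is_stationary(phi1, phi2):
    """Stationarity triangle: |phi2| < 1, phi1 < 1 - phi2, phi1 > phi2 - 1."""
    return abs(phi2) < 1 and phi1 < 1 - phi2 and phi1 > phi2 - 1

def ar2_acf(phi1, phi2, nlags):
    """ACF via rho_0 = 1, rho_1 = phi1/(1 - phi2), then the Yule-Walker
    recursion rho_k = phi1*rho_{k-1} + phi2*rho_{k-2}."""
    rho = np.empty(nlags + 1)
    rho[0] = 1.0
    rho[1] = phi1 / (1 - phi2)
    for k in range(2, nlags + 1):
        rho[k] = phi1 * rho[k - 1] + phi2 * rho[k - 2]
    return rho

phi1, phi2 = 1.0, -0.5                 # complex roots: phi1**2 + 4*phi2 = -1 < 0
print(ar2_is_stationary(phi1, phi2))   # inside the stationarity triangle
print(ar2_acf(phi1, phi2, 8))          # damped sine-wave ACF

# Pseudo-period of the cycle: here arccos(1/sqrt(2)) = pi/4, so 8 lags
pseudo_period = 2 * np.pi / np.arccos(phi1 / (2 * np.sqrt(-phi2)))
print(pseudo_period)
```

Plotting `ar2_acf` for these parameters shows the oscillation completing roughly one full cycle every 8 lags, matching `pseudo_period`.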