You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
From AR to SARIMAX: Mathematical Definitions of Time Series Models
Overview
AR, MA, ARMA, ARIMA, ARIMA and ARIMAX are univariate time series models that are special cases of SARIMAX. This guide gives the mathematical definitions of these models, but does not go into in-depth explanations, model selection or parameter estimation.
Precursors to SARIMAX
Autoregressive (AR) Models
Suppose we have a time series given by ${ y_{t} }$. An $AR(p)$ model can be specified by
Where $p$ is the number of time lags to regress on, $\epsilon_{t}$ is the noise at time $t$ and $\beta$ is a constant.
This equation can be made more concise through the use of the lag operator, $L$.
$$L^{n} y_{t} = y_{t-n}$$
Taking $\Theta(L)^{p}$ to be an order $p$ polynomial function of $L$, we can instead define an autoregressive model by
$$ y_{t} = \Theta(L)^{p} y_{t} + \epsilon_{t}$$
Taking note that the constant has been absorbed into the polynomial $\Theta$.
Moving average (MA) Models.
Whereas autoregressive models regress on prior values of $y_{t}$, moving average models regress on prior values of error. An $MA(q)$ model can be specified by
SARIMA models take seasonality into account by essentially applying an ARIMA model to lags that are integer multiples of seasonality. Once the seasonality is modelled, an ARIMA model is applied to the leftover to capture non-seasonal structure.
To see this more clearly, suppose we have a time series $ { y_{t} } $ with seasonality $s$. We can try to eliminate the seasonality with differencing, by applying the differencing operator $\Delta_{s}^{D}$ to take the seasonal differences of the time series. Here $s$ is the number of time lags comprising one full period of seasonality. $D$ takes on a similar meaning to $d$ in ARIMA models, but instead applies to seasonal lags.
We can then capture any remaining structure by applying an $ARMA(P, Q)$ model, to the differenced values, but using seasonal lags. i.e. instead of using a regular lag operator $L$, we use $L^{s}$. $P$ and $Q$ are again seasonal time lags
With any seasonality now removed, we can apply another $ARIMA(p, d, q)$ model to $ \Delta_{s}^{D} y_{t} $ by multiplying the seasonal model by the new ARIMA model.
This is the general form of a $SARIMA(p, d, q)(P, D, Q, s)$ model.
ARIMAX and SARIMAX
ARIMAX and SARIMAX models simply take exogenous variables into account - ie variables measured at time $t$ that influences the value of our time series at time $t$, but that are not autoregressed on. To do this, we simply add the terms in on the right hand side of our ARIMA and SARIMA equations.
For $n$ exogenous variables defined at each time step $t$, denoted by $x_{t}^{i}$ for $ i \leq n $, with coefficients $\beta_{i}$, the $ARIMAX(p, d, q)$ model is defined by