arima                 package:stats                 R Documentation

_A_R_I_M_A _M_o_d_e_l_l_i_n_g _o_f _T_i_m_e _S_e_r_i_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     Fit an ARIMA model to a univariate time series.

_U_s_a_g_e:

     arima(x, order = c(0, 0, 0),
           seasonal = list(order = c(0, 0, 0), period = NA),
           xreg = NULL, include.mean = TRUE, transform.pars = TRUE,
           fixed = NULL, init = NULL, method = c("CSS-ML", "ML", "CSS"),
           n.cond, optim.control = list(), kappa = 1e6)

_A_r_g_u_m_e_n_t_s:

       x: a univariate time series.

   order: A specification of the non-seasonal part of the ARIMA model:
          the three components (p, d, q) are the AR order, the degree
          of differencing, and the MA order.

seasonal: A specification of the seasonal part of the ARIMA model, plus
          the period (which defaults to 'frequency(x)'). This should be
          a list with components 'order' and 'period', but a
          specification of just a numeric vector of length 3 will be
          turned into a suitable list with the specification as the
          'order'.

    xreg: Optionally, a vector or matrix of external regressors, which
          must have the same number of rows as 'x'.

include.mean: Should the ARIMA model include a mean term? The default
          is 'TRUE' for undifferenced series, 'FALSE' for differenced
          ones (where a mean would affect neither the fit nor the
          predictions).

transform.pars: Logical.  If true, the AR parameters are transformed to
          ensure that they remain in the region of stationarity.  Not
          used for 'method = "CSS"'.

   fixed: optional numeric vector of the same length as the total
          number of parameters.  If supplied, only 'NA' entries in
          'fixed' will be varied.  'transform.pars = TRUE' will be
          overridden (with a warning) if any AR parameters are fixed.
          It may be wise to set 'transform.pars = FALSE' when fixing MA
          parameters, especially near non-invertibility. 

    init: optional numeric vector of initial parameter values.  Missing
          values will be filled in with zeroes, except for regression
          coefficients.  Values already specified in 'fixed' will be
          ignored.

  method: Fitting method: maximum likelihood or minimization of the
          conditional sum-of-squares.  The default (unless there are
          missing values) is to use conditional sum-of-squares to find
          starting values, then maximum likelihood.

  n.cond: Only used if fitting by conditional-sum-of-squares: the
          number of initial observations to ignore.  It will be ignored
          if less than the maximum lag of an AR term.

optim.control: List of control parameters for 'optim'.

   kappa: the prior variance (as a multiple of the innovations
          variance) for the past observations in a differenced model. 
          Do not reduce this.
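
     The 'seasonal' and 'fixed' arguments described above can be
     sketched as follows.  This is only an illustration on the built-in
     'USAccDeaths' and 'lh' series; the object names ('fit_list',
     'fit_vec', 'fit_fixed') are made up for the example, and the
     parameter ordering in 'fixed' is assumed to be the AR terms
     followed by the mean.

```r
## 'seasonal' given as a list and as a bare length-3 vector; the period
## defaults to frequency(USAccDeaths).
fit_list <- arima(USAccDeaths, order = c(0, 1, 1),
                  seasonal = list(order = c(0, 1, 1)))
fit_vec  <- arima(USAccDeaths, order = c(0, 1, 1), seasonal = c(0, 1, 1))
all.equal(coef(fit_list), coef(fit_vec))

## Fix the second AR coefficient at zero; NA entries are estimated.
## Assumed parameter order: ar1, ar2, ar3, intercept.
fit_fixed <- arima(lh, order = c(3, 0, 0),
                   fixed = c(NA, 0, NA, NA), transform.pars = FALSE)
coef(fit_fixed)
```

     Setting 'transform.pars = FALSE' explicitly avoids the warning
     that would otherwise be issued when an AR parameter is fixed.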

_D_e_t_a_i_l_s:

     Different definitions of ARMA models have different signs for the
     AR and/or MA coefficients. The definition here has


 'X[t] = a[1]X[t-1] + ... + a[p]X[t-p] + e[t] + b[1]e[t-1] + ... + b[q]e[t-q]'


     and so the MA coefficients differ in sign from those of S-PLUS.
     Further, if 'include.mean' is true, this formula applies to X - m
     (where m is the mean) rather than to X.  For ARIMA models with
     differencing, the differenced series follows a zero-mean ARMA
     model.  If an 'xreg' term is included, a linear regression (with a
     constant term if 'include.mean' is true) is fitted with an ARMA
     model for the error term.
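
     The sign convention can be checked on simulated data.  The sketch
     below is illustrative only: the seed, the sample size and the
     coefficient 0.6 are arbitrary choices, and 'x_sim' and 'fit_ma'
     are made-up names.

```r
set.seed(1)  # arbitrary seed, used only for reproducibility
## Simulate X[t] = e[t] + 0.6 e[t-1]; 'arima.sim' follows the same
## sign convention as 'arima'.
x_sim  <- arima.sim(model = list(ma = 0.6), n = 500)
fit_ma <- arima(x_sim, order = c(0, 0, 1), include.mean = FALSE)
coef(fit_ma)  # the 'ma1' estimate should be positive, close to 0.6
```

     Under a convention with the opposite sign for the MA terms, the
     same data would give an estimate close to -0.6.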

     The variance matrix of the estimates is found from the Hessian of
     the log-likelihood, and so may only be a rough guide.
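
     As a rough guide only, approximate standard errors can be read
     off the variance matrix.  The normal-approximation intervals
     below are an assumption made for illustration, not something
     'arima' itself reports.

```r
fit <- arima(lh, order = c(1, 0, 0))
se  <- sqrt(diag(vcov(fit)))   # approximate standard errors
## Rough 95% intervals from a normal approximation.
cbind(estimate = coef(fit),
      lower    = coef(fit) - 1.96 * se,
      upper    = coef(fit) + 1.96 * se)
```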

     Optimization is done by 'optim'. It will work best if the columns
     in 'xreg' are roughly scaled to zero mean and unit variance,
     although 'arima' does attempt to estimate suitable scalings.

_V_a_l_u_e:

     A list of class '"Arima"' with components:

    coef: a vector of AR, MA and regression coefficients, which can be
          extracted by the 'coef' method.

  sigma2: the MLE of the innovations variance.

var.coef: the estimated variance matrix of the coefficients 'coef',
          which can be extracted by the 'vcov' method.

  loglik: the maximized log-likelihood (of the differenced data), or
          the approximation to it used.

    arma: A compact form of the specification, as a vector giving the
          number of AR, MA, seasonal AR and seasonal MA coefficients,
          plus the period and the number of non-seasonal and seasonal
          differences.

     aic: the AIC value corresponding to the log-likelihood. Only valid
          for 'method = "ML"' fits.

residuals: the fitted innovations.

    call: the matched call.

  series: the name of the series 'x'.

    code: the convergence value returned by 'optim'.

  n.cond: the number of initial observations not used in the fitting.

   model: A list representing the Kalman Filter used in the fitting. 
          See 'KalmanLike'.
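
     A short sketch of inspecting some of these components follows.
     The AIC check assumes that the innovations variance is counted as
     one extra parameter; 'n_par' is a made-up name.

```r
fit <- arima(USAccDeaths, order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1)))
fit$arma   # AR, MA, seasonal AR, seasonal MA counts, period, d, D
fit$code   # convergence value passed back from 'optim'
## AIC recomputed from the log-likelihood, counting the innovations
## variance as one extra parameter (an assumption of this sketch).
n_par <- length(coef(fit)) + 1
c(reported = fit$aic, recomputed = -2 * fit$loglik + 2 * n_par)
```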

_F_i_t_t_i_n_g _m_e_t_h_o_d_s:

     The exact likelihood is computed via a state-space representation
     of the ARIMA process, and the innovations and their variance found
     by a Kalman filter.  The initialization of the differenced ARMA
     process uses stationarity and is based on Gardner _et al._ (1980).
     For a differenced process the non-stationary components are given
     a diffuse prior (controlled by 'kappa').  Observations which are
     still controlled by the diffuse prior (determined by having a
     Kalman gain of at least '1e4') are excluded from the likelihood
     calculations. (This gives comparable results to 'arima0' in the
     absence of missing values, when the observations excluded are
     precisely those dropped by the differencing.)

     Missing values are allowed, and are handled exactly in method
     '"ML"'.
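
     For instance, the built-in 'presidents' series contains missing
     values and can be fitted directly by maximum likelihood.  This is
     a minimal sketch; 'fit_na' is a made-up name.

```r
anyNA(presidents)   # the series has gaps
fit_na <- arima(presidents, order = c(1, 0, 0), method = "ML")
fit_na$loglik
## Count the fitted innovations that are missing.
sum(is.na(residuals(fit_na)))
```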

     If 'transform.pars' is true, the optimization is done using an
     alternative parametrization which is a variation on that suggested
     by Jones (1980) and ensures that the model is stationary.  For an
     AR(p) model the parametrization is via the inverse tanh of the
     partial autocorrelations: the same procedure is applied
     (separately) to the AR and seasonal AR terms.  The MA terms are
     not constrained to be invertible during optimization, but they
     will be converted to invertible form after optimization if
     'transform.pars' is true.
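
     The idea behind this parametrization can be sketched in R.  The
     function below is a hypothetical helper ('pacf_to_ar' is not part
     of the package, and this is not the code 'arima' uses internally):
     it maps unconstrained values through 'tanh' to partial
     autocorrelations in (-1, 1) and then applies the Durbin-Levinson
     recursion to obtain AR coefficients.

```r
## Hypothetical helper: unconstrained values -> stationary AR coefficients.
pacf_to_ar <- function(u) {
  r   <- tanh(u)   # partial autocorrelations, each in (-1, 1)
  phi <- r
  p   <- length(r)
  if (p > 1) {
    for (j in 2:p) {
      prev <- phi[1:(j - 1)]
      ## Durbin-Levinson step: update the lower-order coefficients.
      phi[1:(j - 1)] <- prev - r[j] * rev(prev)
      phi[j] <- r[j]
    }
  }
  phi
}

u   <- c(0.8, -0.5, 0.3)   # arbitrary unconstrained values
phi <- pacf_to_ar(u)
## All roots of the AR polynomial lie outside the unit circle.
Mod(polyroot(c(1, -phi)))
```

     Because any real input yields partial autocorrelations strictly
     inside (-1, 1), the optimizer can search without constraints while
     the implied AR model stays stationary.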

     Conditional sum-of-squares is provided mainly for expositional
     purposes.  This computes the sum of squares of the fitted
     innovations from observation 'n.cond' on (where 'n.cond' is at
     least the maximum lag of an AR term), treating all earlier
     innovations as zero.  Argument 'n.cond' can be used to allow
     comparability between different fits.  The "part log-likelihood"
     is the first term, half the log of the estimated mean square.
     Missing values are allowed, but will cause many of the innovations
     to be missing.
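
     A minimal sketch of conditional sum-of-squares fits follows.  The
     object names are made up, and the choice 'n.cond = 3' is simply
     the larger of the two AR orders being compared.

```r
fit_css <- arima(USAccDeaths, order = c(0, 1, 1),
                 seasonal = list(order = c(0, 1, 1)), method = "CSS")
fit_css$n.cond   # initial observations not used in the fit
fit_css$aic      # NA: no AIC is reported for a CSS-only fit

## Use a common 'n.cond' so that two CSS fits condition on the same
## initial observations.
css1 <- arima(lh, order = c(1, 0, 0), method = "CSS", n.cond = 3)
css3 <- arima(lh, order = c(3, 0, 0), method = "CSS", n.cond = 3)
c(css1$n.cond, css3$n.cond)
```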

     When regressors are specified, they are orthogonalized prior to
     fitting unless any of the coefficients is fixed.  It can be
     helpful to roughly scale the regressors to zero mean and unit
     variance.
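
     One way to scale a regressor is with 'scale'.  The sketch below is
     illustrative only; 'year_scaled', 'fit_raw' and 'fit_scaled' are
     made-up names.

```r
year <- time(LakeHuron)
## Centre and scale the regressor before fitting.
year_scaled <- as.numeric(scale(year))
fit_raw    <- arima(LakeHuron, order = c(2, 0, 0), xreg = year - 1920)
fit_scaled <- arima(LakeHuron, order = c(2, 0, 0), xreg = year_scaled)
## Both fits describe the same model, so the log-likelihoods should
## agree closely; only the regression coefficients change scale.
c(raw = fit_raw$loglik, scaled = fit_scaled$loglik)
```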

_N_o_t_e:

     The results are likely to be different from S-PLUS's 'arima.mle',
     which computes a conditional likelihood and does not include a
     mean in the model.  Further, the convention used by 'arima.mle'
     reverses the signs of the MA coefficients.

     'arima' is very similar to 'arima0' for ARMA models or for
     differenced models without missing values, but handles differenced
     models with missing values exactly. It is somewhat slower than
     'arima0', particularly for seasonally differenced models.
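
     The similarity to 'arima0' can be checked informally on a simple
     ARMA model.  This is a sketch only; 'fit_new' and 'fit_old' are
     made-up names.

```r
fit_new <- arima(lh,  order = c(1, 0, 0))
fit_old <- arima0(lh, order = c(1, 0, 0))
## Coefficients from the two functions, side by side.
rbind(arima = coef(fit_new), arima0 = coef(fit_old))
```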

_R_e_f_e_r_e_n_c_e_s:

     Brockwell, P. J. and Davis, R. A. (1996) _Introduction to Time
     Series and Forecasting._ Springer, New York. Sections 3.3 and 8.3.

     Durbin, J. and Koopman, S. J. (2001) _Time Series Analysis by
     State Space Methods._  Oxford University Press.

     Gardner, G., Harvey, A. C. and Phillips, G. D. A. (1980) Algorithm
     AS154. An algorithm for exact maximum likelihood estimation of
     autoregressive-moving average models by means of Kalman filtering.
     _Applied Statistics_ *29*, 311-322.

     Harvey, A. C. (1993) _Time Series Models_, 2nd Edition, Harvester
     Wheatsheaf, sections 3.3 and 4.4.

     Jones, R. H. (1980) Maximum likelihood fitting of ARMA models to
     time series with missing observations. _Technometrics_ *22*,
     389-395.

_S_e_e _A_l_s_o:

     'predict.Arima', 'arima.sim' for simulating from an ARIMA model,
     'tsdiag', 'arima0', 'ar'

_E_x_a_m_p_l_e_s:

     arima(lh, order = c(1,0,0))
     arima(lh, order = c(3,0,0))
     arima(lh, order = c(1,0,1))

     arima(lh, order = c(3,0,0), method = "CSS")

     arima(USAccDeaths, order = c(0,1,1), seasonal = list(order=c(0,1,1)))
     arima(USAccDeaths, order = c(0,1,1), seasonal = list(order=c(0,1,1)),
           method = "CSS") # drops first 13 observations.
     # for a model with as few years as this, we want full ML

     arima(LakeHuron, order = c(2,0,0), xreg = time(LakeHuron)-1920)

     ## presidents contains NAs
     ## graphs in example(acf) suggest order 1 or 3
     (fit1 <- arima(presidents, c(1, 0, 0)))
     tsdiag(fit1)
     (fit3 <- arima(presidents, c(3, 0, 0)))  # smaller AIC
     tsdiag(fit3)

