rlm                   package:MASS                   R Documentation

_R_o_b_u_s_t _F_i_t_t_i_n_g _o_f _L_i_n_e_a_r _M_o_d_e_l_s

_D_e_s_c_r_i_p_t_i_o_n:

     Fit a linear model by robust regression using an M estimator.

_U_s_a_g_e:

     rlm(x, ...)

     ## S3 method for class 'formula':
     rlm(formula, data, weights, ..., subset, na.action,
         method = c("M", "MM", "model.frame"),
         wt.method = c("inv.var", "case"),
         model = TRUE, x.ret = TRUE, y.ret = FALSE, contrasts = NULL)

     ## Default S3 method:
     rlm(x, y, weights, ..., w = rep(1, nrow(x)),
         init = "ls", psi = psi.huber,
         scale.est = c("MAD", "Huber", "proposal 2"), k2 = 1.345,
         method = c("M", "MM"), wt.method = c("inv.var", "case"),
         maxit = 20, acc = 1e-4, test.vec = "resid", lqs.control = NULL)

     psi.huber(u, k = 1.345, deriv = 0)
     psi.hampel(u, a = 2, b = 4, c = 8, deriv = 0)
     psi.bisquare(u, c = 4.685, deriv = 0)

_A_r_g_u_m_e_n_t_s:

 formula: a formula of the form 'y ~ x1 + x2 + ...'. 

    data: data frame from which variables specified in 'formula' are
          preferentially to be taken. 

 weights: a vector of prior weights for each case. 

  subset: An index vector specifying the cases to be used in fitting. 

na.action: A function to specify the action to be taken if 'NA's are
          found. The 'factory-fresh' default action in R is 'na.omit',
          and can be changed by 'options(na.action=)'. 

       x: a matrix or data frame containing the explanatory variables. 

       y: the response: a vector of length the number of rows of 'x'. 

  method: currently either M-estimation or MM-estimation or (for the
          'formula' method only) find the model frame.  MM-estimation
          is M-estimation with Tukey's biweight initialized by a
          specific S-estimator.  See the 'Details' section. 

wt.method: are the weights case weights (giving the relative importance
          of case, so a weight of 2 means there are two of these) or
          the inverse of the variances, so a weight of two means this
          error is half as variable? 

   model: should the model frame be returned in the object? 

   x.ret: should the model matrix be returned in the object? 

   y.ret: should the response be returned in the object? 

contrasts: optional contrast specifications: se 'lm'. 

       w: (optional) initial down-weighting for each case. 

    init: (optional) initial values for the coefficients OR a method to
          find initial values OR the result of a fit with a 'coef'
          component.  Known methods are '"ls"' (the default) for an
          initial least-squares fit using weights 'w*weights', and
          '"lts"' for an unweighted least-trimmed squares fit with 200
          samples. 

     psi: the psi function is specified by this argument.  It must give
          (possibly by name) a function 'g(x, ..., deriv)' that for
          'deriv=0' returns psi(x)/x and for 'deriv=1' returns psi'(x).
           Tuning constants will be passed in via '...'. 

scale.est: method of scale estimation: re-scaled MAD of the residuals
          (default) or Huber's proposal 2 (which can be selected by
          either '"Huber"' or '"proposal 2"'). 

      k2: tuning constant used for Huber proposal 2 scale estimation. 

   maxit: the limit on the number of IWLS iterations. 

     acc: the accuracy for the stopping criterion. 

test.vec: the stopping criterion is based on changes in this vector. 

     ...: additional arguments to be passed to 'rlm.default' or to the
          'psi' function. 

lqs.control: An optional list of control values for 'lqs'. 

       u: numeric vector of evalation points. 

k, a, b, c: tuning constants. 

   deriv: '0' or '1': compute values of the psi function or of its
          first derivative. 

_D_e_t_a_i_l_s:

     Fitting is done by iterated re-weighted least squares (IWLS).

     Psi functions are supplied for the Huber, Hampel and Tukey
     bisquare proposals as 'psi.huber', 'psi.hampel' and
     'psi.bisquare'. Huber's corresponds to a convex optimization
     problem and gives a unique solution (up to collinearity). The
     other two will have multiple local minima, and a good starting
     point is desirable.

     Selecting 'method = "MM"' selects a specific set of options which
     ensures that the estimator has a high breakdown point. The initial
     set of coefficients and the final scale are selected by an
     S-estimator with 'k0 = 1.548'; this gives (for n >> p) breakdown
     point 0.5. The final estimator is an M-estimator with Tukey's
     biweight and fixed scale that will inherit this breakdown point
     provided 'c > k0'; this is true for the default value of 'c' that
     corresponds to 95% relative efficiency at the normal.  Case
     weights are not supported for 'method = "MM"'.

_V_a_l_u_e:

     An object of class '"rlm"' inheriting from '"lm"'. The additional
     components not in an 'lm' object are

       s: the robust scale estimate used 

       w: the weights used in the IWLS process 

     psi: the psi function with parameters substituted 

    conv: the convergence criteria at each iteration 

converged: did the IWLS converge? 

  wresid: a working residual, weighted for '"inv.var"' weights only. 

_R_e_f_e_r_e_n_c_e_s:

     P. J. Huber (1981) _Robust Statistics_. Wiley.

     F. R. Hampel, E. M. Ronchetti, P. J. Rousseeuw and W. A. Stahel
     (1986) _Robust Statistics: The Approach based on Influence
     Functions_. Wiley.

     A. Marazzi (1993) _Algorithms, Routines and S Functions for Robust
     Statistics_. Wadsworth & Brooks/Cole.

     Venables, W. N. and Ripley, B. D. (2002) _Modern Applied
     Statistics with S._  Fourth edition.  Springer.

_S_e_e _A_l_s_o:

     'lm', 'lqs'.

_E_x_a_m_p_l_e_s:

     summary(rlm(stack.loss ~ ., stackloss))
     rlm(stack.loss ~ ., stackloss, psi = psi.hampel, init = "lts")
     rlm(stack.loss ~ ., stackloss, psi = psi.bisquare)

