polr                  package:MASS                  R Documentation

_O_r_d_e_r_e_d _L_o_g_i_s_t_i_c _o_r _P_r_o_b_i_t _R_e_g_r_e_s_s_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     Fits a logistic or probit regression model to an ordered factor
     response.  The default logistic case is _proportional odds
     logistic regression_, after which the function is named.

_U_s_a_g_e:

     polr(formula, data, weights, start, ..., subset, na.action,
          contrasts = NULL, Hess = FALSE, model = TRUE,
          method = c("logistic", "probit", "cloglog", "cauchit"))

_A_r_g_u_m_e_n_t_s:

 formula: a formula expression as for regression models, of the form
          'response ~ predictors'. The response should be a factor
          (preferably an ordered factor), which will be interpreted as
          an ordinal response, with levels ordered as in the factor.  A
          proportional odds model will be fitted. The model must have
          an intercept: attempts to remove one will lead to a warning
          and be ignored. An offset may be used. See the documentation
          of 'formula' for other details. 

    data: an optional data frame in which to interpret the variables
          occurring in 'formula'. 

 weights: optional case weights in fitting.  Defaults to 1. 

   start: initial values for the parameters. 

     ...: additional arguments to be passed to 'optim', most often a
          'control' argument. 

  subset: expression saying which subset of the rows of the data should
          be used in the fit. All observations are included by
          default. 

na.action: a function to filter missing data. 

contrasts: a list of contrasts to be used for some or all of the
          factors appearing as variables in the model formula. 

    Hess: logical for whether the Hessian (the observed information
          matrix) should be returned. 

   model: logical for whether the model matrix should be returned. 

  method: the latent-variable distribution to use: '"logistic"' (the
          default), '"probit"', '"cloglog"' (complementary log-log) or
          '"cauchit"' (corresponding to a Cauchy latent variable and
          only available in R >= 2.1.0). 
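
     As a minimal sketch of how these arguments combine, the call below
     fits a probit model to the 'housing' data used in the Examples and
     forwards a 'control' list to 'optim' through '...'.  The value
     'maxit = 200' is an arbitrary illustration, not a recommendation.

```r
library(MASS)  # provides polr() and the 'housing' data

# The response should be an ordered factor; 'Sat' already is one.
is.ordered(housing$Sat)

# Probit fit with case weights; 'control' is passed on to optim() via '...'.
# maxit = 200 is an arbitrary value chosen only for illustration.
fit <- polr(Sat ~ Infl + Type + Cont, data = housing, weights = Freq,
            method = "probit", Hess = TRUE,
            control = list(maxit = 200))

fit$method       # the matched method
fit$convergence  # convergence code returned by optim()
```

     Requesting 'Hess = TRUE' at fitting time stores the Hessian in the
     fitted object.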

_D_e_t_a_i_l_s:

     This model is what Agresti (2002) calls a _cumulative link_ model.
     The basic interpretation is as a _coarsened_ version of a latent
     variable Y_i which has a logistic or normal or extreme-value or
     Cauchy distribution with scale parameter one and a linear model
     for the mean.  The ordered factor which is observed is which bin
     Y_i falls into with breakpoints

             zeta_0 = -Inf < zeta_1 < ... < zeta_K = Inf

     This leads to the model

                  logit P(Y <= k | x) = zeta_k - eta

     with _logit_ replaced by _probit_ for a normal latent variable,
     and eta being the linear predictor, a linear function of the
     explanatory variables.  Note that it is quite common for other
     software to use the opposite sign for eta.
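
     The sign convention can be checked by rebuilding the class
     probabilities by hand.  The sketch below assumes the default
     logistic method and the 'housing' data from the Examples; the names
     'X', 'eta', 'cum' and 'probs' are illustrative only.

```r
library(MASS)  # provides polr() and the 'housing' data

fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)

# Design matrix for the predictors, restricted to the fitted coefficients
# (this drops the intercept column, which polr absorbs into zeta).
X <- model.matrix(~ Infl + Type + Cont, data = housing)
X <- X[, names(coef(fit)), drop = FALSE]

# Linear predictor eta_i for every row.
eta <- drop(X %*% coef(fit))

# Cumulative probabilities P(Y <= k | x) = plogis(zeta_k - eta_i),
# one column per cutpoint zeta_1, ..., zeta_{K-1}.
cum <- plogis(outer(-eta, fit$zeta, "+"))

# Class probabilities are successive differences of the cumulative ones.
probs <- cbind(cum, 1) - cbind(0, cum)
colnames(probs) <- fit$lev

# Compare with the probabilities returned by the predict method.
all.equal(probs, predict(fit, housing, type = "probs"),
          check.attributes = FALSE)
```

     Using 'zeta_k + eta' instead would reproduce the opposite-sign
     convention mentioned above and would not match 'predict'.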

     In the logistic case, the left-hand side of the model equation is
     the log odds of category k or less.  For two covariate values
     these log odds differ by the same constant for every k, so the
     odds are proportional.  Hence the term _proportional odds logistic
     regression_.
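
     The proportionality can be seen numerically.  This sketch, again
     assuming the 'housing' data, compares two hypothetical households
     that differ only in 'Infl' and shows that the difference in
     cumulative log odds is the same at both cutpoints.

```r
library(MASS)  # provides polr() and the 'housing' data

fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)

# Two illustrative covariate settings differing only in Infl.
newd <- data.frame(
  Infl = factor(c("Low", "High"), levels = levels(housing$Infl)),
  Type = factor("Tower", levels = levels(housing$Type)),
  Cont = factor("Low", levels = levels(housing$Cont))
)

# Class probabilities, then cumulative probabilities for k = 1, ..., K-1.
p <- predict(fit, newd, type = "probs")
cum <- t(apply(p, 1, cumsum))[, seq_along(fit$zeta), drop = FALSE]

# Log odds of "category k or less" for each setting.
logodds <- qlogis(cum)

# Difference between the two settings: the same value at every cutpoint.
logodds[1, ] - logodds[2, ]
```

     Under treatment contrasts, that common difference should agree with
     the fitted coefficient for the 'High' level of 'Infl', because the
     cutpoints 'zeta_k' cancel.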

     In the complementary log-log case, we have a _proportional
     hazards_ model for grouped survival times.

     There are methods for the standard model-fitting functions,
     including 'predict', 'summary', 'vcov', 'anova', 'model.frame' and
     an 'extractAIC' method for use with 'stepAIC'.  There are also
     'profile' and 'confint' methods.
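
     As an illustration of how these methods fit together, the sketch
     below (assuming the 'housing' data from the Examples) extracts
     standard errors from 'vcov' and sets them beside those reported by
     'summary'.

```r
library(MASS)  # provides polr() and the 'housing' data

# Hess = TRUE stores the Hessian so vcov() does not need to refit.
fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
            Hess = TRUE)

# Covariance matrix for the coefficients followed by the cutpoints.
V <- vcov(fit)
se <- sqrt(diag(V))

# Coefficient table from summary(); its rows follow the same order.
tab <- summary(fit)$coefficients

cbind(se_from_vcov = se, se_from_summary = tab[, "Std. Error"])
```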

_V_a_l_u_e:

     An object of class '"polr"'.  This has components

coefficients: the coefficients of the linear predictor.

    zeta: the intercepts for the class boundaries.

deviance: the residual deviance.

fitted.values: a matrix, with a column for each level of the response.

     lev: the names of the response levels.

   terms: the 'terms' structure describing the model.

df.residual: the number of residual degrees of freedom, calculated
          using the weights.

     edf: the (effective) number of degrees of freedom used by the
          model.

       n: the (effective) number of observations, calculated using the
          weights.

    call: the matched call.

  method: the matched method used.

convergence: the convergence code returned by 'optim'.

   niter: the number of function and gradient evaluations used by
          'optim'.

 Hessian: (if 'Hess' is true) the Hessian (the observed information
          matrix).

   model: (if 'model' is true).
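
     A short sketch of inspecting these components on a fitted object,
     assuming the 'housing' data from the Examples:

```r
library(MASS)  # provides polr() and the 'housing' data

fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
            Hess = TRUE)

fit$lev                 # response levels, in order
fit$zeta                # one cutpoint between each pair of adjacent levels
dim(fit$fitted.values)  # rows of the data by number of response levels
fit$edf                 # coefficients plus cutpoints
fit$n                   # effective sample size: the sum of the weights
fit$df.residual         # fit$n minus fit$edf
fit$convergence         # convergence code returned by optim()
```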

_R_e_f_e_r_e_n_c_e_s:

     Agresti, A. (2002) _Categorical Data Analysis._ Second edition.
     Wiley.

     Venables, W. N. and Ripley, B. D. (2002) _Modern Applied
     Statistics with S._ Fourth edition.  Springer.

_S_e_e _A_l_s_o:

     'optim', 'glm', 'multinom'.

_E_x_a_m_p_l_e_s:

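     library(MASS)  # provides polr(), housing, addterm() and stepAIC()
     options(contrasts = c("contr.treatment", "contr.poly"))
     ## proportional odds logistic fit; Freq holds the cell counts
     house.plr <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
     house.plr
     summary(house.plr)
     ## slightly worse fit from
     summary(update(house.plr, method = "probit"))
     ## although it is not really appropriate, can fit
     summary(update(house.plr, method = "cloglog"))

     ## fitted class probabilities for each row of housing
     predict(house.plr, housing, type = "probs")
     ## test adding each two-way interaction, then select by AIC
     addterm(house.plr, ~.^2, test = "Chisq")
     house.plr2 <- stepAIC(house.plr, ~.^2)
     house.plr2$anova
     anova(house.plr, house.plr2)

     ## refit with the Hessian stored, then profile the likelihood
     house.plr <- update(house.plr, Hess=TRUE)
     pr <- profile(house.plr)
     confint(pr)
     plot(pr)
     pairs(pr)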

