coxph                package:survival                R Documentation

_F_i_t _P_r_o_p_o_r_t_i_o_n_a_l _H_a_z_a_r_d_s _R_e_g_r_e_s_s_i_o_n _M_o_d_e_l

_D_e_s_c_r_i_p_t_i_o_n:

     Fits a Cox proportional hazards regression model. Time dependent
     variables, time dependent strata, multiple events per subject, and
     other extensions are incorporated using the counting process
     formulation of Andersen and Gill.

_U_s_a_g_e:

     coxph(formula, data=parent.frame(), weights, subset,
            na.action, init, control, method=c("efron","breslow","exact"),
            singular.ok=TRUE, robust=FALSE,
            model=FALSE, x=FALSE, y=TRUE, ...)

_A_r_g_u_m_e_n_t_s:

 formula: a formula object, with the response on the left of a '~'
          operator, and the terms on the right.  The response must be a
          survival object as returned by the 'Surv' function. 

    data: a data.frame in which to interpret the variables named in the
          'formula', or in the 'subset' and 'weights' arguments. 

  subset: expression saying that only a subset of the rows of the data
          should be used in the fit. 

na.action: a missing-data filter function, applied to the model.frame,
          after any subset argument has been used.  Default is
          'options()$na.action'. 

 weights: case weights. 

    init: vector of initial values of the iteration.  Default initial
          value is zero for all variables. 

 control: Object of class 'coxph.control' specifying iteration limit
          and other control options. Default is 'coxph.control(...)'. 

  method: a character string specifying the method for tie handling. 
          If there are no tied death times all the methods are
          equivalent. Nearly all Cox regression programs use the
          Breslow method by default, but not this one. The Efron
          approximation is used as the default here, as it is much more
          accurate when dealing with tied death times, and is as
          efficient computationally. The exact method computes the
          exact partial likelihood, which is equivalent to a
          conditional logistic model.  With the exact method, a large
          number of ties makes the computational time excessive. 

singular.ok: logical value indicating how to handle collinearity in the
          model matrix. If 'TRUE', the program will automatically skip
          over columns of the X matrix that are linear combinations of
          earlier columns.  In this case the coefficients for such
          columns will be NA, and the variance matrix will contain
          zeros.  For ancillary calculations, such as the linear
          predictor, the missing coefficients are treated as zeros. 

  robust: if 'TRUE' a robust variance estimate is returned.  Default is
          'TRUE' if the model includes a 'cluster()' term, 'FALSE'
          otherwise. 

   model: logical value: if 'TRUE', the model frame is returned as
          component 'model' of the fitted model.

       x: logical value: if 'TRUE', the model matrix is returned as
          component 'x' of the fitted model.

       y: logical value: if 'TRUE', the response is returned as
          component 'y' of the fitted model.

     ...: other arguments will be passed to 'coxph.control'.
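
     As a rough illustration of the 'method' argument, the sketch below
     fits the same model under each tie-handling method and compares
     the coefficients. It assumes the 'aml' data set shipped with the
     survival package, which contains tied event times; the names
     'tie_methods', 'fits' and 'tie_coefs' are arbitrary.

```r
library(survival)

# Fit the same model once per tie-handling method.
# Assumption: the 'aml' data set from the survival package, which has ties.
tie_methods <- c("efron", "breslow", "exact")
fits <- lapply(tie_methods, function(m)
  coxph(Surv(time, status) ~ x, data = aml, method = m))
names(fits) <- tie_methods

# One coefficient per method; they differ because the data contain ties.
tie_coefs <- sapply(fits, coef)
print(tie_coefs)
```

     With no tied death times the three coefficients would agree, as
     described under 'method'.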

_D_e_t_a_i_l_s:

     The proportional hazards model is usually expressed in terms of a
     single survival time value for each person, with possible
     censoring. Andersen and Gill reformulated the same problem as a
     counting process; as time marches onward we observe the events for
     a subject, rather like watching a Geiger counter. The data for a
     subject is presented as multiple rows or "observations", each of
     which applies to an interval of observation (start, stop].
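
     The counting process layout can be sketched as follows: each
     subject's follow-up is cut into (start, stop] intervals, and the
     fit on the split data should reproduce the fit on the one-row
     data. The example assumes the 'aml' data set from the survival
     package; the cut point of 10 and the names 'cutpoint', 'early',
     'late' and 'aml2' are chosen only for illustration.

```r
library(survival)

# One row per subject: the usual (time, status) layout.
fit1 <- coxph(Surv(time, status) ~ x, data = aml)

# Split each subject's follow-up at an arbitrary cut point.
cutpoint <- 10

# First interval (0, min(time, cutpoint)]; the event is recorded here
# only if it occurred at or before the cut point.
early <- transform(aml, start = 0, stop = pmin(time, cutpoint),
                   event = ifelse(time <= cutpoint, status, 0))

# Second interval (cutpoint, time] for subjects still under observation.
late <- transform(aml[aml$time > cutpoint, ], start = cutpoint,
                  stop = time, event = status)

aml2 <- rbind(early, late)

# Counting process form: Surv(start, stop, event).
fit2 <- coxph(Surv(start, stop, event) ~ x, data = aml2)

all.equal(coef(fit1), coef(fit2), tolerance = 1e-6)
```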

_V_a_l_u_e:

     an object of class '"coxph"'. See 'coxph.object' for details.

_S_i_d_e _E_f_f_e_c_t_s:

     Depending on the call, the 'predict', 'residuals', and 'survfit'
     routines may need to reconstruct the x matrix created by 'coxph'. 
     Differences in the environment, such as which data frames are
     attached or the value of 'options()$contrasts', may cause this
     computation to fail or, worse, to be incorrect.  See the survival
     overview document for details.
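
     One possible way to reduce this dependence is to keep the model
     frame and x matrix in the fitted object through the 'model' and
     'x' arguments. The sketch below only shows that the components are
     stored; whether a given downstream routine then avoids the
     reconstruction is an assumption that depends on the routine. The
     'lung' data set from the survival package is used for
     illustration.

```r
library(survival)

# Ask coxph to keep the model frame and the x matrix in the fit.
fit <- coxph(Surv(time, status) ~ age + sex, data = lung,
             model = TRUE, x = TRUE)

is.data.frame(fit$model)  # model frame stored as component 'model'
is.matrix(fit$x)          # x matrix stored as component 'x'
dim(fit$x)
```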

_S_P_E_C_I_A_L _T_E_R_M_S:

     There are two special terms that may be used in the model
     equation. A 'strata' term identifies a stratified Cox model;
     a separate baseline hazard function is fit for each stratum. The
     'cluster' term is used to compute a robust variance for the model.
     The term '+ cluster(id)', where every value of 'id' is unique, is
     equivalent to specifying the 'robust=TRUE' argument, and produces
     an approximate jackknife estimate of the variance.  If the 'id'
     variable is not unique, but instead identifies clusters of
     correlated observations, then the variance estimate is based on a
     grouped jackknife.
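
     A short sketch of both special terms follows. It assumes the
     'lung' and 'rats' data sets from the survival package; in 'rats'
     the 'litter' variable identifies groups of correlated animals. The
     object names are arbitrary.

```r
library(survival)

# Stratified model: a separate baseline hazard for each sex.
fit_strat <- coxph(Surv(time, status) ~ age + strata(sex), data = lung)

# Same model with and without a cluster term.
fit_plain <- coxph(Surv(time, status) ~ rx, data = rats)
fit_clust <- coxph(Surv(time, status) ~ rx + cluster(litter), data = rats)

# The coefficient is unchanged; only the variance estimate differs.
all.equal(coef(fit_plain), coef(fit_clust))

# Model-based and robust (grouped jackknife) variances side by side.
c(naive = fit_clust$naive.var[1, 1], robust = fit_clust$var[1, 1])
```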

_C_O_N_V_E_R_G_E_N_C_E:

     For certain data sets the actual MLE of a coefficient is
     infinite, e.g., a dichotomous variable where one of the groups has
     no events.  When this happens the associated coefficient grows at
     a steady pace and a race condition will exist in the fitting
     routine: either the log likelihood converges, the information
     matrix becomes effectively singular, an argument to exp becomes
     too large for the computer hardware, or the maximum number of
     iterations is exceeded. The routine attempts to detect when this
     has happened, not always successfully.
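
     The situation can be reproduced with a small made-up data set in
     which the group with x = 1 has no events. The data frame 'd' and
     the warning-collecting wrapper are hypothetical and only for
     illustration; since detection is not guaranteed, the sketch
     records whatever warnings are raised without relying on them.

```r
library(survival)

# Hypothetical data: the x = 1 group has no events at all.
d <- data.frame(time   = 1:8,
                status = c(1, 1, 1, 1, 0, 0, 0, 0),
                x      = c(0, 0, 0, 0, 1, 1, 1, 1))

# Collect any warnings instead of letting them print.
msgs <- character(0)
fit <- withCallingHandlers(
  coxph(Surv(time, status) ~ x, data = d),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  })

coef(fit)  # a large negative value rather than a meaningful estimate
msgs       # any warnings the routine raised
```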

_P_E_N_A_L_I_S_E_D _R_E_G_R_E_S_S_I_O_N:

     'coxph' can now maximise a penalised partial likelihood with
     arbitrary user-defined penalty.  Supplied penalty functions
     include ridge regression (ridge), smoothing splines (pspline), and
     frailty models (frailty).
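
     Each supplied penalty function is used as a term in the formula.
     The sketch below assumes the 'ovarian', 'lung' and 'rats' data
     sets from the survival package; the penalty settings ('theta = 1',
     'df = 4') are arbitrary choices for illustration, not
     recommendations.

```r
library(survival)

# Ridge penalty on two covariates; theta sets the penalty weight.
fit_ridge <- coxph(Surv(futime, fustat) ~ rx + ridge(age, ecog.ps, theta = 1),
                   data = ovarian)

# Penalised smoothing spline in age.
fit_spline <- coxph(Surv(time, status) ~ pspline(age, df = 4), data = lung)

# Frailty (random effect) term for each litter.
fit_frail <- coxph(Surv(time, status) ~ rx + frailty(litter), data = rats)

# Penalised fits carry an extra class in addition to "coxph".
class(fit_ridge)
```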

_R_e_f_e_r_e_n_c_e_s:

     P. Andersen and R. Gill. "Cox's regression model for counting
     processes, a large sample study", _Annals of Statistics_,
     10:1100-1120, 1982.

     T. Therneau, P. Grambsch, and T. Fleming. "Martingale based
     residuals for survival models", _Biometrika_, March 1990.

_S_e_e _A_l_s_o:

     'cluster', 'survfit', 'Surv', 'strata', 'ridge', 'pspline',
     'frailty'.

_E_x_a_m_p_l_e_s:

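     library(survival)

     # Create the simplest test data set; the observation with a
     # missing status is removed by the na.action before fitting
     test1 <- list(time   = c(4, 3, 1, 1, 2, 2, 3),
                   status = c(1, NA, 1, 0, 1, 1, 0),
                   x      = c(0, 2, 1, 1, 1, 0, 0),
                   sex    = c(0, 0, 0, 0, 1, 1, 1))

     # Stratified model: a separate baseline hazard for each value of sex
     coxph(Surv(time, status) ~ x + strata(sex), test1)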

     # Create a simple data set for a time-dependent model; each row
     # covers the interval (start, stop]
     test2 <- list(start = c(1, 2, 5, 2, 1, 7, 3, 4, 8, 8),
                   stop  = c(2, 3, 6, 7, 8, 9, 9, 9, 14, 17),
                   event = c(1, 1, 1, 1, 1, 1, 1, 0, 0, 0),
                   x     = c(1, 0, 0, 1, 0, 1, 1, 1, 0, 0))

     summary(coxph(Surv(start, stop, event) ~ x, test2))

