survexp               package:survival               R Documentation

_C_o_m_p_u_t_e _E_x_p_e_c_t_e_d _S_u_r_v_i_v_a_l

_D_e_s_c_r_i_p_t_i_o_n:

     Returns either the expected survival of a cohort of subjects, or
     the individual expected survival for each subject.

_U_s_a_g_e:

     survexp(formula, data, weights, subset, na.action, times, cohort=TRUE,
             conditional=FALSE, ratetable=survexp.us, scale=1, npoints,
             se.fit, model=FALSE, x=FALSE, y=FALSE)

_A_r_g_u_m_e_n_t_s:

 formula: formula object.  The response variable is a vector of
          follow-up times and is optional.  The predictors consist of
          optional grouping variables separated by the '+' operator (as
          in 'survfit'), along with a 'ratetable'  term.  The
          'ratetable' term matches each subject to his/her expected
          cohort. 

    data: data frame in which to interpret the variables named in the
          'formula', 'subset' and 'weights' arguments. 

 weights: case weights. 

  subset: expression indicating a subset of the rows of 'data' to be
          used in the fit. 

na.action: function to filter missing data. This is applied to the
          model frame after  'subset' has been applied.  Default is
          'options()$na.action'. A possible value for 'na.action' is
          'na.omit', which deletes observations that contain one or
          more missing values. 

   times: vector of follow-up times at which the resulting survival
          curve is  evaluated.  If absent, the result will be reported
          for each unique  value of the vector of follow-up times
          supplied in 'formula'. 

  cohort: logical value: if 'FALSE', each subject is treated as a
          subgroup of size 1. The default is 'TRUE'. 

conditional: logical value: if 'TRUE', the follow-up times supplied in
          'formula' are death times and conditional expected survival
          is computed. If 'FALSE', the follow-up times are potential
          censoring times.  If follow-up times are missing in
          'formula', this argument is ignored.   

ratetable: a table of event rates, such as 'survexp.uswhite', or a
          fitted Cox model. 

   scale: numeric value to scale the results.  If 'ratetable' is in
          units/day, 'scale = 365.25' causes the output to be reported
          in years. 

 npoints: number of points at which to calculate intermediate results,
          evenly spaced  over the range of the follow-up times.  The
          usual (exact) calculation is done  at each unique follow-up
          time. For very large data sets specifying 'npoints'  can
          reduce the amount of memory and computation required. For a
          prediction from a Cox model 'npoints' is ignored. 

  se.fit: logical value: if 'TRUE', compute the standard error of the
          predicted survival. The default is to compute this whenever
          the routine can, which at this time is only for the Ederer
          method with a Cox model as the rate table. 

model,x,y: flags to control what is returned.  If any of these is true,
          then the model frame, the model matrix, and/or the vector of
          response times will be returned as components of the final
          result, with the same names as the flag arguments. 

_D_e_t_a_i_l_s:

     Individual expected survival is usually used in models or testing,
     to 'correct' for the age and sex composition of a group of
     subjects.  For instance, assume that birth date, entry date into
     the study, sex and actual survival time are all known for a group
     of subjects. The 'survexp.uswhite' population tables contain
     expected death rates based on calendar year, sex and age.  Then

          haz <- -log(survexp(death.time ~ ratetable(sex=sex, year=entry.dt,
                              age=(entry.dt-birth.dt)), cohort=FALSE))

     gives for each subject the total hazard experienced up to their
     observed death time or censoring time. (Age at entry is the entry
     date minus the birth date.) This cumulative hazard can be used as
     a rescaled time value in models:

          glm(status ~ 1 + offset(log(haz)), family=poisson)
          glm(status ~ x + offset(log(haz)), family=poisson)

     In the first model, a test for intercept=0 is the one sample
     log-rank test of whether the observed group of subjects has
     equivalent survival to the baseline population.  The second model
     tests for an effect of variable 'x' after adjustment for age and
     sex.
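
     The offset model above can be tried without any rate table.  The
     following is a minimal sketch in base R, not the package's own
     computation: the daily death rates in 'rate' are hypothetical
     values chosen for illustration, the cohort is simulated, and a
     constant hazard is assumed so that the expected survival has a
     closed form.

     ```r
     set.seed(42)
     n <- 300

     ## Hypothetical daily death rates by sex (illustrative values only,
     ## not taken from survexp.us or any other real rate table)
     rate <- c(male = 1e-4, female = 6e-5)
     sex  <- sample(names(rate), n, replace = TRUE)

     ## Simulate a cohort whose true mortality is twice the population rate
     cens   <- runif(n, 100, 3000)             # potential follow-up, in days
     death  <- rexp(n, rate = 2 * rate[sex])   # simulated death times
     time   <- pmin(death, cens)               # observed follow-up
     status <- as.numeric(death <= cens)       # 1 = death, 0 = censored

     ## With a constant hazard the expected survival at 'time' is
     ## exp(-rate * time), so the cumulative expected hazard is rate * time
     esurv <- exp(-rate[sex] * time)
     haz   <- -log(esurv)

     ## Intercept-only model: a test of intercept = 0 compares the cohort
     ## with the reference population
     fit <- glm(status ~ 1 + offset(log(haz)), family = poisson)
     summary(fit)$coefficients
     exp(coef(fit))    # estimated ratio of observed to expected mortality
     ```

     Because the simulated cohort dies at twice the reference rate, the
     exponentiated intercept should come out well above 1.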

     Cohort survival is used to produce an overall survival curve. 
     This is then added to the Kaplan-Meier plot of the study group for
     visual comparison between these subjects and the population at
     large.  There are three common methods of computing cohort
     survival. In the "exact method" of Ederer the cohort is not
     censored; this corresponds to having no response variable in the
     formula.  Hakulinen recommends censoring the cohort at the
     anticipated censoring time of each patient, and Verheul recommends
     censoring the cohort at the actual observation time of each
     patient. The last of these is the conditional method, selected
     with 'conditional=TRUE'. The Hakulinen and conditional methods
     are obtained by using the respective time values as the follow-up
     time or response in the formula.
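
     The difference between the uncensored and conditional methods can
     be seen on a toy cohort.  The sketch below is written in base R
     and follows the usual description of these estimators rather than
     the code inside 'survexp'; the hazards 'h' and observation times
     'obs' are hypothetical, and each subject is assumed to have a
     constant daily hazard.  The Hakulinen estimate, which also weights
     by each subject's expected survival, is left out for brevity.

     ```r
     ## Hypothetical constant daily hazards for three subjects
     h     <- c(1e-4, 3e-4, 8e-4)
     times <- c(365, 730, 1095)          # evaluation times, in days

     ## Individual expected survival: rows are times, columns are subjects
     S <- exp(-outer(times, h))

     ## Ederer: nobody is censored, so average the individual curves
     ederer <- rowMeans(S)

     ## Conditional: within each interval, average the hazard over the
     ## subjects still under observation, then accumulate
     obs    <- c(1095, 730, 365)         # actual observation time per subject
     atrisk <- outer(times, obs, "<=")   # TRUE if observed through times[j]
     hbar   <- apply(atrisk, 1, function(r) mean(h[r]))
     conditional <- exp(-cumsum(diff(c(0, times)) * hbar))

     cbind(times, ederer, conditional)
     ```

     In this toy cohort the subjects with the highest hazard leave
     observation first, so the two curves separate after the first
     interval.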

_V_a_l_u_e:

     If 'cohort=TRUE', an object of class 'survexp'; otherwise a vector
     of per-subject expected survival values.  The former contains the
     number of subjects at risk and the expected survival for the
     cohort at each requested time.

_R_e_f_e_r_e_n_c_e_s:

     G. Berry.  The analysis of mortality by the subject-years method.
     _Biometrics_ 1983, 39:173-84.

     F. Ederer, L. Axtell, and S. Cutler.  The relative survival rate:
     a statistical methodology.  _Natl Cancer Inst Monogr_ 1961,
     6:101-21.

     T. Hakulinen.  Cancer survival corrected for heterogeneity in
     patient withdrawal.  _Biometrics_ 1982, 38:933.

     H. Verheul, E. Dekker, P. Bossuyt, A. Moulijn, and A. Dunning. 
     Background mortality in clinical survival studies.  _Lancet_ 1993,
     341:872-5.

_S_e_e _A_l_s_o:

     'survfit', 'survexp.us', 'survexp.fit', 'pyears', 'date'

_E_x_a_m_p_l_e_s:

     ## comparing data to population
     data(ratetables)        
     data(cancer)
     ## compare survival to US population
     cancer$year <- rep(as.date("1/1/1980"), nrow(cancer))
     ## age is recorded in years; convert it to days to match the rate table
     efit <- survexp( ~ ratetable(sex=sex, year=year, age=age*365.25),
                     times=(1:4)*365, data=cancer)
     plot(survfit(Surv(time, status) ~ 1, data=cancer))
     lines(efit)
     ## compare data to Cox model
     ## Mayo PBC data
     data(pbc)
     ## fit to randomised patients
     m <- coxph(Surv(time, status) ~ edtrt + log(bili) + log(protime) +
                age + platelet, data=pbc, subset=trt>0)
     ## compare Kaplan-Meier to fitted model for 2 edema groups in
     ## unrandomised patients
     plot(survfit(Surv(time, status) ~ edtrt, data=pbc, subset=trt==-9))
     lines(survexp( ~ edtrt + ratetable(edtrt=edtrt, bili=bili,
                                       platelet=platelet, age=age,
                                       protime=protime),
                   data=pbc, subset=trt==-9, ratetable=m, cohort=TRUE),
           col="purple")

