family                 package:stats                 R Documentation

_F_a_m_i_l_y _O_b_j_e_c_t_s _f_o_r _M_o_d_e_l_s

_D_e_s_c_r_i_p_t_i_o_n:

     Family objects provide a convenient way to specify the details of
     the models used by functions such as 'glm'.  See the documentation
     for 'glm' for the details on how such model fitting takes place.

_U_s_a_g_e:

     family(object, ...)

     binomial(link = "logit")
     gaussian(link = "identity")
     Gamma(link = "inverse")
     inverse.gaussian(link = "1/mu^2")
     poisson(link = "log")
     quasi(link = "identity", variance = "constant")
     quasibinomial(link = "logit")
     quasipoisson(link = "log")

_A_r_g_u_m_e_n_t_s:

    link: a specification for the model link function.  This can be a
          name/expression, a literal character string, a length-one
          character vector or an object of class '"link-glm"' (such as
          generated by 'make.link') provided it is not specified _via_
          one of the standard names given next.

          The 'gaussian' family accepts the links (as names)
          'identity', 'log' and 'inverse'; the 'binomial' family the
          links 'logit', 'probit', 'cauchit' (corresponding to
          logistic, normal and Cauchy CDFs respectively), 'log' and
          'cloglog' (complementary log-log); the 'Gamma' family the
          links 'inverse', 'identity' and 'log'; the 'poisson' family
          the links 'log', 'identity' and 'sqrt'; and the
          'inverse.gaussian' family the links '1/mu^2', 'inverse',
          'identity' and 'log'.

          The 'quasi' family accepts the links 'logit', 'probit',
          'cloglog',  'identity', 'inverse', 'log', '1/mu^2' and
          'sqrt', and the function 'power' can be used to create a
          power link function. 
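
          As a minimal sketch of the accepted forms (the object names
          'f1', 'f2', 'f3' and 'q' are illustrative only), the
          following should all select the same log link for 'poisson',
          and 'power(0.5)' should give a square-root link for 'quasi':

```r
# Three ways to request the log link for a Poisson family.
f1 <- poisson(link = "log")             # quoted character string
f2 <- poisson(link = log)               # unquoted name
f3 <- poisson(link = make.link("log"))  # object of class "link-glm"

# Each family object records the same link name.
sapply(list(f1, f2, f3), function(f) f$link)

# power() builds a power link; lambda = 0.5 is the square-root link.
q <- quasi(link = power(0.5), variance = "mu")
q$linkfun(4)   # 4^0.5
```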

variance: for all families other than 'quasi', the variance function is
          determined by the family.  The 'quasi' family will accept the
          literal character string (or unquoted as a name/expression)
          specifications '"constant"', '"mu(1-mu)"', '"mu"', '"mu^2"'
          and '"mu^3"', a length-one character vector taking one of
          those values, or a list containing components 'varfun',
          'validmu', 'dev.resids', 'initialize' and 'name'. 
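
          A small sketch of the named variance specifications, with
          'qf', 'qb', 'mu' and 'p' as illustrative names: the variance
          function that 'quasi' builds from a name can be compared
          with that of a family which fixes the same form.

```r
mu <- c(0.5, 1, 2, 3)

# variance = "mu^2" has the same form as the Gamma variance function.
qf <- quasi(link = "log", variance = "mu^2")
qf$variance(mu)
all.equal(qf$variance(mu), Gamma()$variance(mu))

# variance = "mu(1-mu)" has the same form as the binomial one.
p  <- c(0.1, 0.5, 0.9)
qb <- quasi(link = "logit", variance = "mu(1-mu)")
all.equal(qb$variance(p), binomial()$variance(p))
```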

  object: the function 'family' accesses the 'family' objects which are
          stored within objects created by modelling functions (e.g.,
          'glm').

     ...: further arguments passed to methods.

_D_e_t_a_i_l_s:

     'family' is a generic function with methods for classes '"glm"'
     and '"lm"' (the latter returning 'gaussian()').
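
     For illustration, a short sketch using the 'cars' data set from
     package 'datasets' (the names 'fit_lm' and 'fit_glm' are
     illustrative only) shows what the two methods return:

```r
fit_lm  <- lm(dist ~ speed, data = cars)
fit_glm <- glm(dist ~ speed, data = cars, family = poisson(link = "log"))

family(fit_lm)$family    # the "lm" method returns gaussian()
family(fit_glm)$family   # the family stored in the fitted "glm" object
family(fit_glm)$link
```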

     The 'quasibinomial' and 'quasipoisson' families differ from the
     'binomial' and 'poisson' families only in that the dispersion
     parameter is not fixed at one, so they can model over-dispersion. 
     For the binomial case see McCullagh and Nelder (1989, pp. 124-8). 
     Although they show that there is (under some restrictions) a model
     with variance proportional to mean as in the quasi-binomial model,
     note that 'glm' does not compute maximum-likelihood estimates in
     that model.  The behaviour of S is closer to the quasi- variants.
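
     The practical effect of leaving the dispersion free can be
     sketched as follows, using the same small count data set as the
     Examples section ('fit_p' and 'fit_qp' are illustrative names).
     The coefficient estimates should agree; only the dispersion, and
     hence the standard errors, should differ.

```r
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)

fit_p  <- glm(counts ~ outcome + treatment, family = poisson())
fit_qp <- glm(counts ~ outcome + treatment, family = quasipoisson())

all.equal(coef(fit_p), coef(fit_qp))  # same point estimates
summary(fit_p)$dispersion             # fixed at one
summary(fit_qp)$dispersion            # estimated from the data
fit_qp$aic                            # NA for the quasi- families
```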

_V_a_l_u_e:

     An object of class '"family"' (which has a concise print method).
     This is a list with elements 

  family: character: the family name.

    link: character: the link name.

 linkfun: function: the link.

 linkinv: function: the inverse of the link function.

variance: function: the variance as a function of the mean.

dev.resids: function giving the deviance contribution of each
          observation (the squared deviance residual) as a function of
          '(y, mu, wt)'.

     aic: function giving the AIC value if appropriate (but 'NA' for
          the quasi- families).  See 'logLik' for the assumptions made
          about the dispersion parameter.

  mu.eta: function: derivative 'function(eta)' dmu/deta.

initialize: expression.  This needs to set up whatever data objects are
          needed for the family as well as 'n' (needed for AIC in the
          binomial family) and 'mustart' (see 'glm').

 validmu: logical function.  Returns 'TRUE' if a mean vector 'mu' is
          within the domain of 'variance'.

valideta: logical function.  Returns 'TRUE' if a linear predictor
          'eta' is within the domain of 'linkinv'.
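
     The relationships between these elements can be checked directly.
     The following is a sketch for the default 'binomial' (logit) and
     'gaussian' families, with 'bf', 'mu' and 'eta' as illustrative
     names:

```r
bf  <- binomial()            # logit link by default
mu  <- c(0.1, 0.5, 0.9)
eta <- bf$linkfun(mu)        # linear predictor for these means

all.equal(bf$linkinv(eta), mu)             # linkinv undoes linkfun
all.equal(bf$mu.eta(eta), mu * (1 - mu))   # logit: dmu/deta = mu(1-mu)
bf$variance(mu)                            # binomial variance mu(1-mu)

# For the gaussian family the per-observation deviance is wt * (y - mu)^2.
gaussian()$dev.resids(y = c(1, 2), mu = c(0, 0), wt = c(1, 1))
```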

_N_o_t_e:

     The 'link' and 'variance' arguments have rather awkward semantics
     for back-compatibility.  The recommended way to supply them is
     as quoted character strings, but they can also be supplied
     unquoted (as names or expressions).  In addition, they can also be
     supplied as a length-one character vector giving the name of one
     of the options, or as a list (for 'link', of class '"link-glm"'). 
     The restrictions apply only to links given as names: when given as
     a character string all the links known to 'make.link' are
     accepted.

     This is potentially ambiguous: supplying 'link=logit' could mean
     the unquoted name of a link or the value of object 'logit'.  It is
     interpreted if possible as the name of an allowed link, then as an
     object.  (You can force the interpretation to always be the value
     of an object via 'logit[1]'.)
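
     A short sketch of this rule, using a deliberately confusing
     variable named 'logit' (chosen here only for illustration):

```r
logit <- "probit"   # an object whose name clashes with a link name

# The unquoted name is matched against the allowed links first.
binomial(link = logit)$link

# Subsetting forces evaluation, so the value of the object is used.
binomial(link = logit[1])$link
```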

_A_u_t_h_o_r(_s):

     The design was inspired by S functions of the same names described
     in Hastie & Pregibon (1992) (except 'quasibinomial' and
     'quasipoisson').

_R_e_f_e_r_e_n_c_e_s:

     McCullagh, P. and Nelder, J. A. (1989) _Generalized Linear Models._
     London: Chapman and Hall.

     Dobson, A. J. (1983) _An Introduction to Statistical Modelling._
     London: Chapman and Hall.

     Cox, D. R. and Snell, E. J. (1981) _Applied Statistics:
     Principles and Examples._ London: Chapman and Hall.

     Hastie, T. J. and Pregibon, D. (1992) _Generalized linear models._
     Chapter 6 of _Statistical Models in S_ eds J. M. Chambers and T.
     J. Hastie, Wadsworth & Brooks/Cole.

_S_e_e _A_l_s_o:

     'glm', 'power', 'make.link'.

_E_x_a_m_p_l_e_s:

     require(utils) # for str

     nf <- gaussian() # Normal family
     nf
     str(nf) # internal STRucture

     gf <- Gamma()
     gf
     str(gf)
     gf$linkinv
     gf$variance(-3:4) #- == (.)^2

     ## quasipoisson. compare with example(glm)
     counts <- c(18,17,15,20,10,20,25,13,12)
     outcome <- gl(3,1,9)
     treatment <- gl(3,3)
     d.AD <- data.frame(treatment, outcome, counts)
     glm.qD93 <- glm(counts ~ outcome + treatment, family=quasipoisson())
     glm.qD93
     anova(glm.qD93, test="F")
     summary(glm.qD93)
     ## for Poisson results use
     anova(glm.qD93, dispersion = 1, test="Chisq")
     summary(glm.qD93, dispersion = 1)

     ## Example of user-specified link, a logit model for p^days
     ## See Shaffer, T.  2004. Auk 121(2): 526-540.
     logexp <- function(days = 1)
     {
         linkfun <- function(mu) qlogis(mu^(1/days))
         linkinv <- function(eta) plogis(eta)^days
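         ## chain rule; binomial()$mu.eta is the logit-link dmu/deta,
         ## used here as a portable substitute for a .Call into 'stats'
         mu.eta <- function(eta) days * plogis(eta)^(days-1) *
           binomial()$mu.eta(eta)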
         valideta <- function(eta) TRUE
         link <- paste("logexp(", days, ")", sep="")
         structure(list(linkfun = linkfun, linkinv = linkinv,
                        mu.eta = mu.eta, valideta = valideta, name = link),
                   class = "link-glm")
     }
     binomial(logexp(3))
     ## in practice this would be used with a vector of 'days', in
     ## which case use an offset of 0 in the corresponding formula
     ## to get the null deviance right.

     ## Binomial with identity link: often not a good idea.
     ## Not run: binomial(link=make.link("identity"))

     ## tests of quasi
     x <- rnorm(100)
     y <- rpois(100, exp(1+x))
     glm(y ~ x, family=quasi(variance="mu", link="log"))
     # which is the same as
     glm(y ~ x, family=poisson)
     glm(y ~ x, family=quasi(variance="mu^2", link="log"))
     ## Not run: glm(y ~ x, family=quasi(variance="mu^3", link="log")) # fails
     y <- rbinom(100, 1, plogis(x))
     # needs to set a starting value for the next fit
     glm(y ~ x, family=quasi(variance="mu(1-mu)", link="logit"), start=c(0,1))

