tapply                 package:base                 R Documentation

_A_p_p_l_y _a _F_u_n_c_t_i_o_n _O_v_e_r _a "_R_a_g_g_e_d" _A_r_r_a_y

_D_e_s_c_r_i_p_t_i_o_n:

     Apply a function to each cell of a ragged array, that is, to each
     (non-empty) group of values given by a unique combination of the
     levels of certain factors.

_U_s_a_g_e:

     tapply(X, INDEX, FUN = NULL, ..., simplify = TRUE)

_A_r_g_u_m_e_n_t_s:

       X: an atomic object, typically a vector.

   INDEX: a list of one or more factors, each of the same length as
          'X'.  The elements are coerced to factors by 'as.factor'.

     FUN: the function to be applied.  In the case of functions like
          '+', '%*%', etc., the function name must be quoted.  If 'FUN'
          is 'NULL', 'tapply' returns a vector of cell indices, one
          per element of 'X', which can be used to subscript the
          multi-way array that 'tapply' normally produces.

     ...: optional arguments to 'FUN'.

simplify: If 'FALSE', 'tapply' always returns an array of mode
          '"list"'.  If 'TRUE' (the default), then if 'FUN' always
          returns a scalar, 'tapply' returns an array with the mode of
          the scalar.
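
     The 'FUN = NULL' case described above can be sketched as follows.
     The data and the names 'x', 'f1', 'f2', 'sums' and 'cell' are
     illustrative assumptions, not part of the original page.

```r
# Illustrative data (hypothetical): six values classified by two factors
x  <- c(10, 20, 30, 40, 50, 60)
f1 <- factor(c("a", "a", "b", "b", "a", "b"))
f2 <- factor(c("u", "v", "u", "v", "u", "v"))

# The usual call: a 2 x 2 array of per-cell sums
sums <- tapply(x, list(f1, f2), sum)

# With FUN omitted (NULL): one cell index per element of x
cell <- tapply(x, list(f1, f2))
cell
# 1 3 2 4 1 4

# The indices subscript the array, mapping each element to its cell's sum
sums[cell]
# 60 20 30 100 60 100
```

     The indices count cells in column-major order, so the first factor
     in 'INDEX' varies fastest.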

_V_a_l_u_e:

     When 'FUN' is present, 'tapply' calls 'FUN' for each cell that has
     any data in it.  If 'FUN' returns a single atomic value for each
     cell (e.g., functions 'mean' or 'var') and 'simplify' is 'TRUE',
     'tapply' returns a multi-way array containing the values.  The
     array has the same number of dimensions as 'INDEX' has
     components; the extent of each dimension is the number of levels
     ('nlevels()') in the corresponding component of 'INDEX'.
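
     As a rough illustration of the shape of the result, the sketch
     below uses made-up data ('x', 'g', 'h' and 'res' are assumed
     names).  The factor 'g' is given an unused level to show that the
     dimensions follow the levels rather than the observed values;
     cells with no data come back as 'NA'.

```r
# Illustrative data (hypothetical); "none" is a level with no observations
x <- c(1, 2, 3, 4, 5)
g <- factor(c("lo", "hi", "lo", "hi", "lo"), levels = c("lo", "hi", "none"))
h <- factor(c("p", "p", "q", "q", "q"))

res <- tapply(x, list(g, h), mean)

# One dimension per component of INDEX, one slot per level
dim(res)
# 3 2
dimnames(res)

# A cell with data holds the value of FUN; an empty cell holds NA
res["lo", "q"]
# 4
is.na(res["none", "p"])
# TRUE
```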

     Note that contrary to S, 'simplify = TRUE' always returns an
     array, possibly 1-dimensional.

     If 'FUN' does not return a single atomic value, 'tapply' returns
     an array of mode 'list' whose components are the values of the
     individual calls to 'FUN', i.e., the result is a list with a 'dim'
     attribute.
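
     A minimal sketch of the list-mode result, using assumed names
     ('x', 'g', 'r', 's') and toy data: 'range' returns two values per
     cell, so the result cannot be simplified, and 'simplify = FALSE'
     forces the same kind of result even for a scalar-valued 'FUN'.

```r
# Illustrative data (hypothetical)
x <- 1:6
g <- factor(c("a", "b", "a", "b", "a", "b"))

# FUN returns two values per cell, so the result is a list with a 'dim'
r <- tapply(x, g, range)
is.list(r)
# TRUE
dim(r)
# 2
r[[1]]   # the cell for level "a"
# 1 5

# simplify = FALSE gives a list-mode array even though sum() is scalar
s <- tapply(x, g, sum, simplify = FALSE)
is.list(s)
# TRUE
s[[2]]   # the cell for level "b"
# 12
```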

     Note that optional arguments to 'FUN' supplied by the '...'
     argument are not divided into cells.  It is therefore
     inappropriate for 'FUN' to expect additional arguments with the
     same length as 'X'.
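
     The following sketch illustrates the point with a per-group
     weighted mean.  The data and the names 'x', 'w', 'g', 'bad' and
     'wm' are assumptions made for the example.  One possible
     workaround, shown here, is to apply 'FUN' over the positions of
     'X' and subset both vectors inside the function.

```r
# Illustrative data (hypothetical): values, weights, and a grouping factor
x <- c(1, 2, 3, 4)
w <- c(1, 3, 1, 1)
g <- factor(c("a", "a", "b", "b"))

# 'w' is passed whole to every call, so each call sees 2 values but 4
# weights, and weighted.mean() signals an error
bad <- tryCatch(tapply(x, g, weighted.mean, w = w),
                error = function(e) e)
inherits(bad, "error")
# TRUE

# Workaround sketch: split the positions, then index x and w together
wm <- tapply(seq_along(x), g, function(i) weighted.mean(x[i], w[i]))
wm
#    a    b
# 1.75 3.50
```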

_R_e_f_e_r_e_n_c_e_s:

     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S
     Language_. Wadsworth & Brooks/Cole.

_S_e_e _A_l_s_o:

     The convenience functions 'by' and 'aggregate' (using 'tapply');
     'apply', 'lapply' with its versions 'sapply' and 'mapply'.

_E_x_a_m_p_l_e_s:

     require(stats)
     ## group sizes: 5 binomial draws (size 32, prob 0.4) used as a factor
     groups <- as.factor(rbinom(32, n = 5, prob = 0.4))
     tapply(groups, groups, length) #- is almost the same as
     table(groups)

     ## contingency table from data.frame : array with named dimnames
     tapply(warpbreaks$breaks, warpbreaks[,-1], sum)
     tapply(warpbreaks$breaks, warpbreaks[, 3, drop = FALSE], sum)

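     ## levels 4 and 5 of 'fac' are unused, so their cells are empty
     n <- 17; fac <- factor(rep(1:3, length.out = n), levels = 1:5)
     table(fac)
     tapply(1:n, fac, sum)                   # NA for the empty cells
     tapply(1:n, fac, sum, simplify = FALSE) # array of mode "list"
     tapply(1:n, fac, range)                 # non-scalar results: mode "list"
     tapply(1:n, fac, quantile)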

     ## example of ... argument: find quarterly means
     tapply(presidents, cycle(presidents), mean, na.rm = TRUE)

     ## FUN omitted: returns the cell index of each element
     ind <- list(c(1, 2, 2), c("A", "A", "B"))
     table(ind)
     tapply(1:3, ind) #-> the split vector
     tapply(1:3, ind, sum)

