Methods               package:methods               R Documentation

_G_e_n_e_r_a_l _I_n_f_o_r_m_a_t_i_o_n _o_n _M_e_t_h_o_d_s

_D_e_s_c_r_i_p_t_i_o_n:

     This documentation section covers some general topics on how
     methods work and how the 'methods' package interacts with the rest
     of R.  The information is usually not needed to get started with
     methods and classes, but may be helpful for moderately ambitious
     projects, or when something doesn't work as expected.

     The section "How Methods Work" describes the underlying mechanism;
     "Method Selection and Dispatch" provides more details on how class
     definitions determine which methods are used; "Generic Functions"
     discusses generic functions as objects. For additional information
     specifically about class definitions, see 'Classes'.

_H_o_w _M_e_t_h_o_d_s _W_o_r_k:

     A generic function has associated with it a collection of other
     functions (the methods), all of which have the same formal
     arguments as the generic.  See the "Generic Functions" section
     below for more on generic functions themselves.

     Each R package will include methods metadata objects
     corresponding to each generic function for which methods have
     been defined in that package. When the package is loaded into an R
     session, the methods for each generic function are _cached_, that
     is, stored in the environment of the generic function along with
     the methods from previously loaded packages.  This merged table of
     methods is used to dispatch or select methods from the generic,
     using class inheritance and possibly group generic functions (see
     'GroupGenericFunctions') to find an applicable method. See the
     "Method Selection and Dispatch" section below. The caching
     computations ensure that only one version of each generic function
     is visible globally; although different attached packages may
     contain a copy of the generic function, these behave identically
     with respect to method selection. In contrast, it is possible for
     the same function name to refer to more than one generic function,
     when these have different 'package' slots.  In the latter case, R
     considers the functions unrelated:  A generic function is defined
     by the combination of name and package.  See the "Generic
     Functions" section below.

     The methods for a generic are stored according to the
     corresponding 'signature' in the call to 'setMethod' that defined
     the method.  The signature associates one class name with each of
     a subset of the formal arguments to the generic function.  Which
     formal arguments are available, and the order in which they
     appear, are determined by the '"signature"' slot of the generic
     function itself.  By default, the signature of the generic
     consists of all the formal arguments except ..., in the order they
     appear in the function definition.
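
     As a minimal sketch of the signature rules above: the generic
     'describe' and its method are hypothetical names introduced only
     for illustration. The example inspects the '"signature"' slot of
     the generic and then defines a method whose signature covers a
     subset of those arguments.

```r
library(methods)

# Hypothetical generic, used only for illustration.
setGeneric("describe", function(x, y, ..., verbose = FALSE) {
  standardGeneric("describe")
})

# Default signature of the generic: all formals except '...',
# in the order they appear in the definition.
describe_sig <- getGeneric("describe")@signature
print(describe_sig)

# The method signature names classes for a subset of those arguments;
# 'verbose' is left unspecified.
setMethod("describe", signature(x = "numeric", y = "character"),
          function(x, y, ..., verbose = FALSE) "numeric, character")

describe_result <- describe(1, "a")
print(describe_result)
```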

     Trailing arguments in the signature of the generic will be
     _inactive_ if no method has yet been specified that included
     those arguments in its signature. Inactive arguments are not
     needed or used in labeling the cached methods.  (The distinction
     does not change which methods are dispatched, but ignoring
     inactive arguments improves the efficiency of dispatch.)

     All arguments in the signature of the generic function will be
     evaluated when the function is called, rather than using the
     traditional lazy evaluation rules of S.  Therefore, it's important
     to _exclude_ from the signature any arguments that need to be
     dealt with symbolically (such as the first argument to function
     'substitute').  Note that only actual arguments are evaluated, not
     default expressions. A missing argument enters into the method
     selection as class '"missing"'.
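
     The following sketch illustrates both points, assuming a
     hypothetical generic 'combine2' with both arguments in its
     signature. A call that omits 'y' selects the '"missing"' method,
     and an argument in the signature is evaluated for dispatch even
     though the selected method never uses it.

```r
library(methods)

# Hypothetical generic; both 'x' and 'y' are in the signature.
setGeneric("combine2", function(x, y) standardGeneric("combine2"))

setMethod("combine2", signature(x = "numeric", y = "missing"),
          function(x, y) "y was missing")
setMethod("combine2", signature(x = "numeric", y = "numeric"),
          function(x, y) "y was supplied")

# A missing argument dispatches as class "missing".
missing_result <- combine2(1)

# The method body never touches 'y', but dispatch needs its class,
# so the argument expression is evaluated anyway.
evaluated <- FALSE
supplied_result <- combine2(1, { evaluated <- TRUE; 2 })

print(missing_result)
print(supplied_result)
print(evaluated)
```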

     The cached methods are stored in an environment object.  The names
     used for assignment are a concatenation of the class names for the
     active arguments in the method signature.

_M_e_t_h_o_d _S_e_l_e_c_t_i_o_n _a_n_d _D_i_s_p_a_t_c_h:

     When a call to a generic function is evaluated, a method is
     selected corresponding to the classes of the actual arguments in
     the signature. First, the cached methods table is searched for an
     exact match; that is, a method stored under the signature defined
     by the string value of 'class(x)' for each non-missing argument,
     and '"missing"' for each missing argument. If no method is found
     directly for the actual arguments in a call to a generic function,
     an attempt is made to match the available methods to the arguments
     by using the superclass information about the actual classes.

     Each class definition may include a list of one or more
     _superclasses_ of the new class. The simplest and most common
     specification is by the 'contains=' argument in the call to
     'setClass'. Each class named in this argument is a superclass of
     the new class. The S language has two additional mechanisms for
     defining superclasses. A call to 'setIs' can create an
     inheritance relationship that is not the simple one of containing
     the superclass representation in the new class. In this case,
     explicit methods are defined to relate the subclass and the
     superclass. Also, a call to 'setClassUnion' creates a union class
     that is a superclass of each of the members of the union. All
     three mechanisms are treated equivalently for purposes of method
     selection:  they define the _direct_ superclasses of a particular
     class. For more details on the mechanisms, see 'Classes'.
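
     A short sketch of two of the three mechanisms, using hypothetical
     class names ('Base3', 'Derived3', 'NumOrChar3') chosen only for
     illustration. The 'setIs' case is omitted here because it also
     requires explicit coerce and replace methods.

```r
library(methods)

# Superclass through 'contains=' in the call to setClass().
setClass("Base3", representation(id = "numeric"))
setClass("Derived3",
         representation = representation(label = "character"),
         contains = "Base3")

# Superclass through a class union: the union becomes a superclass
# of each of its members.
setClassUnion("NumOrChar3", c("numeric", "character"))

derived_extends_base <- extends("Derived3", "Base3")
number_in_union <- is(1, "NumOrChar3")
string_in_union <- is("a", "NumOrChar3")
logical_in_union <- is(TRUE, "NumOrChar3")

print(c(derived_extends_base, number_in_union,
        string_in_union, logical_in_union))
```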

     The direct superclasses themselves may have superclasses, defined
     by any of the same mechanisms, and similarly for further
     generations.  Putting all this information together produces the
     full list of superclasses for this class. The superclass list is
     included in the definition of the class that is cached during the
     R session. Each element of the list describes the nature of the
     relationship (see 'SClassExtension' for details). Included in the
     element is a 'distance' slot giving a numeric distance between the
     two classes. The distance currently is the path length for the
     relationship: '1' for direct superclasses (regardless of which
     mechanism defined them), then '2' for the direct superclasses of
     those classes, and so on. In addition, any class implicitly has
     class '"ANY"' as a superclass.  The distance to '"ANY"' is treated
     as larger than the distance to any actual class. The special class
     '"missing"' corresponding to missing arguments has only '"ANY"' as
     a superclass, while '"ANY"' has no superclasses.
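
     The distances can be inspected directly. The sketch below assumes
     a hypothetical three-level hierarchy ('A4', 'B4', 'C4') and reads
     the 'distance' slot from each element of the cached superclass
     list, which is held in the 'contains' slot of the class
     definition.

```r
library(methods)

# Hypothetical hierarchy: C4 extends B4, which extends A4.
setClass("A4", representation(x = "numeric"))
setClass("B4", contains = "A4")
setClass("C4", contains = "B4")

# Named list of extension objects, one per superclass of C4.
ext4 <- getClass("C4")@contains

# Path length from C4 to each superclass.
dists4 <- vapply(ext4, function(e) e@distance, numeric(1))
print(dists4)
```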

     The information about superclasses is summarized when a class
     definition is printed.

     When a method is to be selected by inheritance, a search is made
     in the table for all methods directly corresponding to a
     combination of either the direct class or one of its superclasses,
     for each argument in the active signature. For an example, suppose
     there is only one argument in the signature and that the class of
     the corresponding object was '"dgeMatrix"' (from the 'Matrix'
     package on CRAN). This class has two direct superclasses and
     through these 4 additional superclasses. Method selection finds
     all the methods in the table of directly specified methods labeled
     by one of these classes, or by '"ANY"'.

     When there are multiple arguments in the signature, each argument
     will generate a similar list of inherited classes. The possible
     matches are now all the combinations of classes from each argument
     (think of the function 'outer' generating an array of all possible
     combinations). The search now finds all the methods matching any
     of these combinations of classes. The computation of distances also
     has to combine distances for the individual arguments. There are
     many ways to combine the distances; the current implementation
     simply adds them. The result of the search is then a list of zero,
     one, or more methods, and a parallel vector of distances between
     the target signature and the available methods.
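
     As an illustration of summed distances, consider this sketch with
     hypothetical classes 'P5' and 'Q5' and a hypothetical two-argument
     generic 'pair5'. For a call with two '"Q5"' objects, the method
     for ("P5", "P5") is at distance 1 + 1 = 2, while the method for
     ("Q5", "P5") is at distance 0 + 1 = 1, so the latter is selected.

```r
library(methods)

# Hypothetical classes: Q5 is a direct subclass of P5.
setClass("P5", representation(v = "numeric"))
setClass("Q5", contains = "P5")

setGeneric("pair5", function(a, b) standardGeneric("pair5"))

setMethod("pair5", signature("P5", "P5"), function(a, b) "P5,P5")
setMethod("pair5", signature("Q5", "P5"), function(a, b) "Q5,P5")

# No exact method for (Q5, Q5); the candidate with the smallest
# total distance is used.
pair5_result <- pair5(new("Q5"), new("Q5"))
print(pair5_result)
```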

     If the list has more than one matching method, only those
     corresponding to the minimum distance are considered. There may
     still be multiple best methods. The dispatch software considers
     this an ambiguous case and warns the user (only on the first call
     for this selection). The method occurring first in the list of
     superclasses is selected.  By the mechanism of producing the
     extension information, this orders the direct superclasses by the
     order they appeared in the original call to 'setClass', followed
     by classes specified in 'setIs' and 'setClassUnion' calls, and
     then by the superclasses of these classes. (Note that only the
     ordering of classes within a particular generation of superclasses
     counts, because only these will have the same distance.) It is
     generally a very bad idea to count on any observed ordering, other
     than of the simple superclasses, since both circumstances and
     future changes to the computations could alter such orderings.

     All this detail about selection is less important than the
     realization that having ambiguous method selection usually means
     that you need to be more specific about intentions. It is likely
     that some consideration other than the ordering of superclasses in
     the class definition is more important in determining which method
     _should_ be selected, and the preference may well be different
     for different generic functions.  Where ambiguities arise, the
     best approach is usually to provide a specific method for the
     subclass.
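
     A sketch of that recommendation, with hypothetical classes 'A6',
     'B6' and 'C6' and a hypothetical generic 'f6'. Class 'C6' has two
     direct superclasses, each with its own method, so inherited
     selection would be ambiguous; defining a method for 'C6' itself
     gives an exact match and avoids the question.

```r
library(methods)

# Hypothetical classes: C6 inherits directly from both A6 and B6.
setClass("A6", representation(a = "numeric"))
setClass("B6", representation(b = "numeric"))
setClass("C6", contains = c("A6", "B6"))

setGeneric("f6", function(x) standardGeneric("f6"))
setMethod("f6", "A6", function(x) "A6")
setMethod("f6", "B6", function(x) "B6")

# Both inherited candidates are at distance 1 from C6. A method
# defined for the subclass is matched exactly instead.
setMethod("f6", "C6", function(x) "C6")

f6_result <- f6(new("C6"))
print(f6_result)
```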

     When the inherited method has been selected, the selection is
     cached in the generic function so that future calls with the same
     class will not require repeating the search.  Cached inherited
     selections are not themselves used in future inheritance searches,
     since that could result in invalid selections. If you want
     inheritance computations to be done again (for example, because a
     newly loaded package has a more direct method than one that has
     already been used in this session), call 'resetGeneric'.  Because
     classes and methods involving them tend to come from the same
     package, the current implementation does not reset all generics
     every time a new package is loaded.

     Besides being initiated through calls to the generic function,
     method selection can be done explicitly by calling the function
     'selectMethod'.
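
     For example, the following sketch (hypothetical classes 'Animal7'
     and 'Dog7', hypothetical generic 'speak7') selects a method
     explicitly for a subclass that has no method of its own. The
     object returned is a function and can be called directly.

```r
library(methods)

# Hypothetical classes and generic, for illustration only.
setClass("Animal7", representation(name = "character"))
setClass("Dog7", contains = "Animal7")

setGeneric("speak7", function(x) standardGeneric("speak7"))
setMethod("speak7", "Animal7",
          function(x) paste(x@name, "makes a sound"))

# No method is defined directly for Dog7 ...
direct_only7 <- existsMethod("speak7", "Dog7")

# ... but explicit selection finds the inherited one.
m7 <- selectMethod("speak7", "Dog7")
m7_result <- m7(new("Dog7", name = "Rex"))

print(direct_only7)
print(m7_result)
```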

     Once a method has been selected, the evaluator creates a new
     context in which a call to the method is evaluated. The context is
     initialized with the arguments from the call to the generic
     function. These arguments are not rematched.  All the arguments in
     the signature of the generic will have been evaluated (including
     any that are currently inactive); arguments that are not in the
     signature will obey the usual lazy evaluation rules of the
     language. If an argument was missing in the call, its default
     expression, if any, will _not_ have been evaluated, since method
     dispatch always uses class '"missing"' for such arguments.

     A call to a generic function therefore has two contexts:  one for
     the function and a second for the method. The argument objects
     will be copied to the second context, but not any local objects
     created in a nonstandard generic function. The other important
     distinction is that the parent ("enclosing") environment of the
     second context is the environment of the method as a function, so
     that all R programming techniques using such environments apply to
     method definitions as ordinary functions.

     For further discussion of method selection and dispatch, see the
     first reference.

_G_e_n_e_r_i_c _F_u_n_c_t_i_o_n_s:

     In principle, a generic function could be any function that
     evaluates a call to 'standardGeneric()', the internal function
     that selects a method and evaluates a call to the selected
     method.  In practice, generic functions are special objects that,
     in addition to being from a subclass of class '"function"', also
     extend the class '"genericFunction"'.  Such objects have slots to
     define information needed to deal with their methods.  They also
     have specialized environments, containing the tables used in
     method selection.

     The slots '"generic"' and '"package"' in the object are the
     character string names of the generic function itself and of the
     package in which the function is defined. As with classes,
     generic functions are uniquely defined in R by the combination of
     the two names. There can be generic functions of the same name
     associated with different packages (although inevitably keeping
     such functions cleanly distinguished is not always easy). On the
     other hand, R will enforce that only one definition of a generic
     function can be associated with a particular combination of
     function and package name, in the current session or other active
     version of R.
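
     A brief sketch of inspecting a generic function as an object,
     using a hypothetical generic 'area9'. The value of the
     '"package"' slot depends on where the generic is defined (for a
     generic created at top level it is typically '".GlobalEnv"'), so
     it is printed here without assuming a particular value.

```r
library(methods)

# Hypothetical generic, for illustration only.
setGeneric("area9", function(shape) standardGeneric("area9"))

g9 <- getGeneric("area9")

# The object is a function that also extends "genericFunction".
g9_is_function <- is.function(g9)
g9_is_generic <- is(g9, "genericFunction")

# Name and package together identify the generic.
g9_name <- as.character(g9@generic)
print(g9_name)
print(g9@package)
```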

     Tables of methods for a particular generic function, in this
     sense, will often be spread over several other packages. The total
     set of methods for a given generic function may change during a
     session, as additional packages are loaded. Each table must be
     consistent in the signature assumed for the generic function.

     R distinguishes _standard_ and _nonstandard_ generic functions,
     with the former having a function body that does nothing but
     dispatch a method. For the most part, the distinction is just one
     of simplicity:  knowing that a generic function only dispatches a
     method call allows some efficiencies and also removes some
     uncertainties.

     In most cases, the generic function is the visible function
     corresponding to that name, in the corresponding package. There
     are two exceptions, _implicit_ generic functions and the special
     computations required to deal with R's _primitive_ functions.
     Packages can contain a table of implicit generic versions of
     functions in the package, if the package wishes to leave a
     function non-generic but to constrain what the function would be
     like if it were generic. Such implicit generic functions are
     created during the installation of the package, essentially by
     defining the generic function and possibly methods for it, and
     then reverting the function to its non-generic form. (See
     'implicitGeneric' for how this is done.) The mechanism is mainly
     used for functions in the older packages in R, which may prefer to
     ignore S4 methods. Even in this case, the actual mechanism is only
     needed if something special has to be specified. All functions
     have a corresponding implicit generic version defined
     automatically (an implicit, implicit generic function one might
     say). This function is a standard generic with the same arguments
     as the non-generic function, with the non-generic version as the
     default (and only) method, and with the generic signature being
     all the formal arguments except ....

     The implicit generic mechanism is needed only to override some
     aspect of the default definition. One reason to do so would be to
     remove some arguments from the signature. Arguments that may need
     to be interpreted literally, or for which the lazy evaluation
     mechanism of the language is needed, must _not_ be included in the
     signature of the generic function, since all arguments in the
     signature will be evaluated in order to select a method. For
     example, the argument 'expr' to the function 'with' is treated
     literally and must therefore be excluded from the signature.
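
     The same idea applies when defining a generic explicitly. In this
     sketch the generic 'evalWith10' is a hypothetical stand-in modeled
     loosely on 'with'; the 'signature=' argument to 'setGeneric'
     restricts dispatch to 'data', so 'expr' stays unevaluated and can
     be captured with 'substitute'.

```r
library(methods)

# Hypothetical generic: only 'data' takes part in dispatch, so
# 'expr' keeps the usual lazy evaluation rules.
setGeneric("evalWith10",
           function(data, expr, ...) standardGeneric("evalWith10"),
           signature = "data")

setMethod("evalWith10", "list", function(data, expr, ...) {
  # Capture the unevaluated expression and evaluate it with the
  # list elements visible as variables.
  eval(substitute(expr), data, parent.frame())
})

with_result10 <- evalWith10(list(a = 1, b = 2), a + b)
print(with_result10)
```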

     One would also need to define an implicit generic if the existing
     non-generic function were not suitable as the default method.
     Perhaps the function only applies to some classes of objects, and
     the package designer prefers to have no general default method. In
     the other direction, the package designer might have some ideas
     about suitable methods for some classes, if the function were
     generic. With reasonably modern packages, the simple approach in
     all these cases is just to define the function as a generic. The
     implicit generic mechanism is mainly attractive for older packages
     that do not want to require the methods package to be available.

     Generic functions will also be defined but not obviously visible
     for functions implemented as _primitive_ functions in the base
     package. Primitive functions look like ordinary functions when
     printed but are in fact not function objects but objects of two
     types interpreted by the R evaluator to call underlying C code
     directly. Since their entire justification is efficiency, R
     refuses to hide primitives behind a generic function object.
     Methods may be defined for most primitives, and corresponding
     metadata objects will be created to store them. Calls to the
     primitive still go directly to the C code, which will sometimes
     check for applicable methods. The definition of "sometimes" is
     that methods must have been detected for the function in some
     package loaded in the session and 'isS4(x)' is 'TRUE' for the
     first argument (or for the second argument, in the case of binary
     operators). You can test whether methods have been detected by
     calling 'isGeneric' for the relevant function and you can examine
     the generic function by calling 'getGeneric', whether or not
     methods have been detected. For more on generic functions, see the
     first reference and also section 2 of _R Internals_.
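
     A sketch of methods on a primitive, assuming a hypothetical class
     'Stack11'. The primitive 'length' keeps its type after a method is
     defined; the call goes to the C code, which finds the method
     because the argument is an S4 object.

```r
library(methods)

# Primitives are not closures; they have one of two special types.
print(typeof(sum))    # a "builtin" primitive
print(typeof(quote))  # a "special" primitive

# Hypothetical S4 class with a method for the primitive 'length'.
setClass("Stack11", representation(items = "list"))
setMethod("length", "Stack11", function(x) length(x@items))

s11 <- new("Stack11", items = list(1, 2, 3))
len11 <- length(s11)
print(len11)

# Methods have now been detected for the primitive ...
print(isGeneric("length"))

# ... but the visible function is still the primitive itself.
type_after11 <- typeof(length)
print(type_after11)
```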

_M_e_t_h_o_d _D_e_f_i_n_i_t_i_o_n_s:

     All method definitions are stored as objects from the
     'MethodDefinition' class. Like the class of generic functions,
     this class extends ordinary R functions with some additional
     slots: '"generic"', containing the name and package of the generic
     function, and two signature slots, '"defined"' and '"target"', the
     first being the signature supplied when the method was defined by
     a call to 'setMethod'. The '"target"' slot starts off equal to
     the '"defined"' slot.  When an inherited method is cached after
     being selected, as described above, a copy is made with the
     appropriate '"target"' signature. Output from 'showMethods', for
     example, includes both signatures.
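
     The two signature slots can be compared directly. In this sketch
     the classes 'Base12' and 'Sub12' and the generic 'show12' are
     hypothetical; the method obtained by inheritance for 'Sub12'
     keeps the '"defined"' signature of the original method but
     carries its own '"target"'.

```r
library(methods)

# Hypothetical classes and generic, for illustration only.
setClass("Base12", representation(v = "numeric"))
setClass("Sub12", contains = "Base12")

setGeneric("show12", function(obj) standardGeneric("show12"))
setMethod("show12", "Base12", function(obj) "base method")

# Directly defined method: target and defined agree.
direct12 <- getMethod("show12", "Base12")

# Inherited selection: same definition, different target.
inherited12 <- selectMethod("show12", "Sub12")

print(as.character(direct12@defined))
print(as.character(direct12@target))
print(as.character(inherited12@defined))
print(as.character(inherited12@target))
```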

     Method definitions are required to have the same formal arguments
     as the generic function, since the method dispatch mechanism does
     not rematch arguments, for reasons of both efficiency and
     consistency.

_R_e_f_e_r_e_n_c_e_s:

     Chambers, John M. (2008) _Software for Data Analysis: Programming
     with R_ Springer.  (For the R version: see section 10.6 for method
     selection and section 10.5 for generic functions).

     Chambers, John M. (1998) _Programming with Data_ Springer (For the
     original S4 version.)

_S_e_e _A_l_s_o:

     For more specific information, see 'setGeneric', 'setMethod', and
     'setClass'.

     For the use of ... in methods, see 'dotsMethods'.

