Rdutils                package:tools                R Documentation

_R_d _U_t_i_l_i_t_i_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     Utilities for computing on the information in Rd objects.

_U_s_a_g_e:

     Rd_db(package, dir, lib.loc = NULL)
     Rd_parse(file, text = NULL)

_A_r_g_u_m_e_n_t_s:

 package: a character string naming an installed package.

     dir: a character string specifying the path to a package's root
          source directory.  This should contain the subdirectory 'man'
          with R documentation sources (in Rd format).  Only used if
          'package' is not given.

 lib.loc: a character vector of directory names of R libraries, or
          'NULL'.  The default value of 'NULL' corresponds to all
          libraries currently known.  The specified library trees are
          used to search for 'package'.

    file: a connection, or a character string giving the name of a file
          or a URL from which to read documentation in Rd format.

    text: a character vector with documentation in Rd format.  Elements
          are treated as if they were lines of a file.

_D_e_t_a_i_l_s:

     'Rd_db' builds a simple database of all Rd sources in a package,
     as a list of character vectors with the lines of the Rd files in
     the package.  This is particularly useful for working on installed
     packages, where the individual Rd files in the sources are no
     longer available.
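
     As a rough illustration of the paragraph above, the sketch below
     builds the database for the installed 'tools' package and looks at
     its overall shape.  It uses only 'Rd_db' and base R functions, and
     it deliberately does not assume the class of the individual
     entries, since that may differ between R versions.

```r
library(tools)

## Build the Rd database for an installed package.
db <- Rd_db("tools")

## The result is a list with one entry per Rd file.
stopifnot(is.list(db), length(db) > 0)

## Entry names (when present) indicate which Rd file each entry came from.
print(head(names(db)))

## Inspect the first entry without assuming its exact class.
print(class(db[[1]]))
```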

     'Rd_parse' is a simple top-level Rd parser/analyzer.  It returns a
     list with the following components:

     'meta' a list containing the Rd metadata (aliases, concepts,
          keywords, and documentation type);

     'data' a data frame with the names ('tags') and corresponding text
          ('vals') of the top-level sections in the R documentation
          object;

     'rest' top-level text not accounted for (currently discarded
          silently by Rdconv, and hence usually an indication of a
          problem).


     Note that at least for the time being, only the top-level
     structure is analyzed.
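
     The components described above can be inspected as in the
     following minimal sketch.  The Rd text is a made-up toy document
     (hypothetical, for illustration only), and because these functions
     are experimental the call is guarded: it only runs where
     'Rd_parse' is actually available in the 'tools' namespace.

```r
library(tools)

## A tiny, made-up Rd document (hypothetical content, for illustration).
rd_lines <- c(
  "\\name{foo}",
  "\\alias{foo}",
  "\\title{A Toy Topic}",
  "\\description{Does nothing in particular.}",
  "\\keyword{misc}"
)

## Rd_parse is documented as experimental; only call it where it exists.
tools_ns <- asNamespace("tools")
if (exists("Rd_parse", envir = tools_ns, inherits = FALSE)) {
  parse_fun <- get("Rd_parse", envir = tools_ns)
  res <- parse_fun(text = rd_lines)
  print(names(res))          # components of the result
  print(res$meta$keywords)   # keyword metadata
  print(res$data$tags)       # names of the top-level sections
  print(res$rest)            # leftover top-level text, if any
} else {
  message("Rd_parse is not available in this version of R")
}
```

     If 'rest' is non-empty for a real Rd file, that usually points to
     stray top-level text worth checking in the source.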

_W_a_r_n_i_n_g:

     These functions are still experimental.  Names, interfaces and
     values might change in future versions.

_E_x_a_m_p_l_e_s:

     library(tools)

     ## Build the Rd db for the (installed) base package.
     db <- Rd_db("base")
     ## Run Rd_parse on all entries in the Rd db.
     parsed <- lapply(db, function(txt) Rd_parse(text = txt))
     ## Extract the metadata.
     meta <- lapply(parsed, "[[", "meta")

     ## Keyword metadata per Rd file.
     keywords <- lapply(meta, "[[", "keywords")
     ## Tabulate the keyword entries.
     kw_table <- sort(table(unlist(keywords)))
     ## The 5 most frequent ones:
     rev(kw_table)[1 : 5]
     ## The "most informative" ones:
     kw_table[kw_table == 1]

     ## Concept metadata per Rd file.
     concepts <- lapply(meta, "[[", "concepts")
     ## How many files already have \concept metadata?
     sum(sapply(concepts, length) > 0)
     ## How many concept entries altogether?
     length(unlist(concepts))

