agrep                  package:base                  R Documentation

_A_p_p_r_o_x_i_m_a_t_e _S_t_r_i_n_g _M_a_t_c_h_i_n_g (_F_u_z_z_y _M_a_t_c_h_i_n_g)

_D_e_s_c_r_i_p_t_i_o_n:

     Searches for approximate matches to 'pattern' (the first argument)
     within each element of the character vector 'x' (the second
     argument) using the Levenshtein edit distance.

_U_s_a_g_e:

     agrep(pattern, x, ignore.case = FALSE, value = FALSE, max.distance = 0.1)

_A_r_g_u_m_e_n_t_s:

 pattern: a non-empty character string to be matched (_not_ a regular
          expression!)

       x: character vector where matches are sought.

ignore.case: if 'FALSE', the pattern matching is _case sensitive_ and
          if 'TRUE', case is ignored during matching.

   value: if 'FALSE', a vector containing the (integer) indices of the
          matching elements is returned; if 'TRUE', a vector
          containing the matching elements themselves is returned.

max.distance: Maximum distance allowed for a match.  Expressed either
          as an integer, or as a fraction of the pattern length (which
          is replaced by the smallest integer not less than that
          fraction of the pattern length), or as a list with possible
          components

          '_a_l_l': maximal (overall) distance

          '_i_n_s_e_r_t_i_o_n_s': maximum number/fraction of insertions

          '_d_e_l_e_t_i_o_n_s': maximum number/fraction of deletions

          '_s_u_b_s_t_i_t_u_t_i_o_n_s': maximum number/fraction of substitutions

          If 'all' is missing, it is set to 10%; the other components
          default to 'all'.  The component names can be abbreviated.
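
          As a rough illustration of the fraction rule above, the
          following sketch reproduces the rounding by hand.  It only
          mirrors the arithmetic described here and does not call into
          'agrep' itself; the variable names are chosen for this
          example.

               pattern  <- "lasy"
               fraction <- 0.1        # the default 'max.distance'
               ## 0.1 * 4 characters = 0.4, rounded up to 1 allowed edit
               ceiling(fraction * nchar(pattern))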

_D_e_t_a_i_l_s:

     The Levenshtein edit distance is used as the measure of
     approximateness: it is the total number of insertions, deletions
     and substitutions required to transform one string into another.
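
     The definition above can be sketched as a short dynamic-programming
     routine.  This is an illustration only: 'lev_dist' is a
     hypothetical helper written for this page, not part of base R and
     not the 'apse' code, and it compares two whole strings, whereas
     'agrep' looks for the pattern within each element of 'x'.

          ## Hypothetical helper, for illustration only.
          lev_dist <- function(a, b) {
              a <- strsplit(a, "")[[1]]
              b <- strsplit(b, "")[[1]]
              ## d[i + 1, j + 1] is the distance between the first i
              ## characters of 'a' and the first j characters of 'b'.
              d <- matrix(0, length(a) + 1, length(b) + 1)
              d[, 1] <- 0:length(a)
              d[1, ] <- 0:length(b)
              for (i in seq_along(a)) {
                  for (j in seq_along(b)) {
                      cost <- if (a[i] == b[j]) 0 else 1
                      d[i + 1, j + 1] <- min(d[i, j + 1] + 1,  # deletion
                                             d[i + 1, j] + 1,  # insertion
                                             d[i, j] + cost)   # substitution
                  }
              }
              d[length(a) + 1, length(b) + 1]
          }

          lev_dist("lasy", "lazy")     # 1: substitute 's' with 'z'
          lev_dist("laysy", "lazy")    # 2: one deletion, one substitution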

     The function is a simple interface to the 'apse' library developed
     by Jarkko Hietaniemi (also used in the Perl String::Approx
     module).

_V_a_l_u_e:

     Either a vector giving the indices of the elements that yielded a
     match, or, if 'value' is 'TRUE', the matched elements.

_A_u_t_h_o_r(_s):

     David Meyer David.Meyer@ci.tuwien.ac.at (based on C code by Jarkko
     Hietaniemi); modifications by Kurt Hornik.

_S_e_e _A_l_s_o:

     'grep'

_E_x_a_m_p_l_e_s:

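     ## Default 'max.distance' of 0.1
     agrep("lasy", "1 lazy 2")
     ## 'max' partially matches 'max.distance'; no substitutions allowed
     agrep("lasy", "1 lazy 2", max = list(sub = 0))
     ## Allow an overall distance of up to 2
     agrep("laysy", c("1 lazy", "1", "1 LAZY"), max = 2)
     ## Return the matching elements instead of their indices
     agrep("laysy", c("1 lazy", "1", "1 LAZY"), max = 2, value = TRUE)
     ## Ignore case while matching
     agrep("laysy", c("1 lazy", "1", "1 LAZY"), max = 2, ignore.case = TRUE)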

