nchar                  package:base                  R Documentation

_C_o_u_n_t _t_h_e _N_u_m_b_e_r _o_f _C_h_a_r_a_c_t_e_r_s (_o_r _B_y_t_e_s _o_r _W_i_d_t_h)

_D_e_s_c_r_i_p_t_i_o_n:

     'nchar' takes a character vector as an argument and returns a
     vector whose elements contain the sizes of the corresponding
     elements of 'x'.

     'nzchar' is a fast way to find out if elements of a character
     vector are non-empty strings.

_U_s_a_g_e:

     nchar(x, type = "chars", allowNA = FALSE)

     nzchar(x)

_A_r_g_u_m_e_n_t_s:

       x: character vector, or a vector to be coerced to a character
          vector.

    type: character string: partial matching to one of 'c("bytes",
          "chars", "width")'.  See 'Details'.

 allowNA: logical: should 'NA' be returned for invalid multibyte
          strings (rather than throwing an error)?

_D_e_t_a_i_l_s:

     The 'size' of a character string can be measured in one of three
     ways:

     '_b_y_t_e_s' The number of bytes needed to store the string (plus in C
          a final terminator which is not counted).

     '_c_h_a_r_s' The number of human-readable characters.

     '_w_i_d_t_h' The number of columns 'cat' will use to print the string
          in a monospaced font.  The same as 'chars' if this cannot be
          calculated.

     These will often be the same, and almost always will be in
     single-byte locales.  There will be differences between the first
     two with multibyte character sequences, e.g. in UTF-8 locales.
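
     As a rough illustration of how the three measures can differ (a
     sketch only; 'enc2utf8' is used here just to make the encoding
     explicit, and the width result depends on the platform, so no value
     is stated for it):

```r
# U+2642 is stored as three bytes in UTF-8 but is a single character.
s <- enc2utf8("\u2642")

n_bytes <- nchar(s, type = "bytes")   # storage size in bytes: 3
n_chars <- nchar(s, type = "chars")   # human-readable characters: 1
n_width <- nchar(s, type = "width")   # columns used by cat(); platform dependent

# For plain ASCII text, bytes and chars agree.
ascii_same <- nchar("abc", "bytes") == nchar("abc", "chars")
```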

     The internal equivalent of the default method of 'as.character' is
     performed on 'x' (so there is no method dispatch).  To operate on
     non-vector objects, pass them through 'deparse' first.
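
     A small sketch of the coercion described above; the inputs are
     arbitrary illustrations, not taken from the original page:

```r
# Atomic vectors are coerced as if by as.character() before counting.
n_num <- nchar(c(1, 22, 333))   # "1", "22", "333" -> 1 2 3
n_lgl <- nchar(TRUE)            # "TRUE" -> 4

# A language object is not a vector, so deparse it to a string first.
e <- quote(a + b)
n_call <- nchar(deparse(e))     # "a + b" -> 5
```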

_V_a_l_u_e:

     For 'nchar', an integer vector giving the size of each element,
     currently always '2' for missing values (that is, for 'NA').

     If 'allowNA = TRUE' and an element is invalid in a multi-byte
     character set such as UTF-8, its number of characters and the
     width will be 'NA'.  Otherwise the number of characters will be
     non-negative, so '!is.na(nchar(x, "chars", TRUE))' is a test of
     validity.
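
     The validity test can be tried as follows.  This is a sketch that
     builds an invalid string by hand: the byte 0xE7 is Latin-1 for
     c-cedilla, and declaring the string to be UTF-8 makes it an
     invalid multibyte sequence.

```r
# Construct a string whose bytes are not valid UTF-8, then mark it as UTF-8.
bad <- "fa\xE7ile"
Encoding(bad) <- "UTF-8"

x <- c("facile", bad)

# With allowNA = TRUE the invalid element yields NA instead of an error.
valid <- !is.na(nchar(x, "chars", TRUE))   # TRUE FALSE

# Counting bytes never needs to interpret the encoding.
n_bytes <- nchar(bad, "bytes")             # 6
```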

     Names, dims and dimnames are copied from the input.

     For 'nzchar', a logical vector of the same length as 'x', true if
     and only if the element has non-zero length.
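
     For instance (an illustrative sketch; the vector 'v' is made up for
     this example):

```r
v <- c(first = "", second = "a", third = " ")

# nchar() keeps the names of its input.
sizes <- nchar(v)               # first = 0, second = 1, third = 1

# nzchar() is TRUE for every non-empty string; a single space is non-empty.
non_empty <- unname(nzchar(v))  # FALSE TRUE TRUE
```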

_N_o_t_e:

     This does *not* by default give the number of characters that will
     be used to 'print()' the string.  Use 'encodeString' to find the
     characters used to print the string. Where character strings have
     been marked as UTF-8, the number of characters and widths will be
     computed in UTF-8, even though printing may use escapes such as
     '<U+2642>' in a non-UTF-8 locale.
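
     A brief sketch of the difference between the stored string and its
     printed form, using an embedded newline as the example:

```r
s <- "a\nb"

n_stored  <- nchar(s)                               # a, newline, b -> 3
n_escaped <- nchar(encodeString(s))                 # a \ n b -> 4
n_quoted  <- nchar(encodeString(s, quote = '"'))    # plus two quotes -> 6
```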

_R_e_f_e_r_e_n_c_e_s:

     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S
     Language_. Wadsworth & Brooks/Cole.

_S_e_e _A_l_s_o:

     'strwidth' giving width of strings for plotting; 'paste',
     'substr', 'strsplit'

_E_x_a_m_p_l_e_s:

     x <- c("asfef", "qwerty", "yuiop[", "b", "stuff.blah.yech")
     nchar(x)
     # 5  6  6  1 15

     nchar(deparse(mean))
     # 18 17

