nchar                  package:base                  R Documentation

_C_o_u_n_t _t_h_e _N_u_m_b_e_r _o_f _C_h_a_r_a_c_t_e_r_s (_B_y_t_e_s)

_D_e_s_c_r_i_p_t_i_o_n:

     'nchar' takes a character vector as an argument and returns a
     vector whose elements contain the sizes of the corresponding
     elements of 'x'.

_U_s_a_g_e:

     nchar(x, type = c("bytes", "chars", "width"))

_A_r_g_u_m_e_n_t_s:

       x: character vector, or a vector to be coerced to a character
          vector.

    type: character string: partial matching is allowed.  See Details.

_D_e_t_a_i_l_s:

     The 'size' of a character string can be measured in one of three
     ways:

     '_b_y_t_e_s' The number of bytes needed to store the string (plus in C
          a final terminator which is not counted).

     '_c_h_a_r_s' The number of human-readable characters.

     '_w_i_d_t_h' The number of columns 'cat' will use to print the string
          in a monospaced font.  The same as 'chars' if this cannot be
          calculated (which is currently common).

     These will often be the same, and always will be in single-byte
     locales. There will be differences between the first two with
     multibyte character sequences, e.g. in UTF-8 locales. If the byte
     stream contains embedded 'nul' bytes, 'type = "bytes"' looks at
     all the bytes whereas the other two types look only at the string
     as printed by 'cat', up to the first 'nul' byte.
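
     As a rough illustration of the distinction, the sketch below
     applies all three measures to one string containing a non-ASCII
     character.  The variable names are illustrative only, and 'type'
     is passed explicitly so the result does not depend on the default.

```r
# "naive" with an i-diaeresis, written as a Unicode escape so that the
# literal is stored as UTF-8 regardless of the source file encoding
s <- "na\u00efve"

n_bytes <- nchar(s, type = "bytes")  # the i-diaeresis takes two bytes in UTF-8
n_chars <- nchar(s, type = "chars")  # but counts as a single character
n_width <- nchar(s, type = "width")  # display columns; platform-dependent

print(c(bytes = n_bytes, chars = n_chars, width = n_width))
```

     For a pure ASCII string the three values would be expected to
     coincide.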

     The internal equivalent of the default method of 'as.character' is
     performed on 'x'.  To operate on non-vector objects, pass them
     through 'deparse' first.
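
     A minimal sketch of both cases follows: atomic vectors that are
     coerced automatically, and a language object that has to go
     through 'deparse' first.  The object names are made up for the
     illustration.

```r
# Atomic vectors are coerced as if by as.character() before counting
n_num <- nchar(123456, type = "chars")  # counted as "123456"
n_lgl <- nchar(TRUE, type = "chars")    # counted as "TRUE"

# A language object is not a vector, so turn it into text first
expr <- quote(a + b)
n_expr <- nchar(deparse(expr), type = "chars")  # deparses to "a + b"

print(c(n_num, n_lgl, n_expr))
# 6 4 5
```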

_V_a_l_u_e:

     An integer vector giving the size of each string, currently always
     '2' for missing values (for 'NA').

     Not all platforms will return a non-missing value for
     'type="width"'.

     If the string is invalid in a multibyte character set such as
     UTF-8, the number of characters and the width will be 'NA'.
     Otherwise the number of characters will be non-negative, so
     '!is.na(nchar(x, "chars"))' is a test of validity.
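
     The validity test can be wrapped in a small helper, sketched
     below.  'is_valid_string' is a hypothetical name, not a base R
     function.  The sketch assumes that some R versions signal an error
     for an invalid string rather than returning 'NA', so it treats
     both outcomes alike.

```r
# Hypothetical helper (not part of base R): TRUE where a string is valid
is_valid_string <- function(x) {
  vapply(seq_along(x), function(i) {
    # Assumption: depending on the R version, invalid input yields NA or
    # an error; an error is mapped to NA so both cases give FALSE
    n <- tryCatch(nchar(x[i], type = "chars"),
                  error = function(e) NA_integer_)
    !is.na(n)
  }, logical(1))
}

bad <- "\xff"             # a lone 0xff byte is never valid UTF-8
Encoding(bad) <- "UTF-8"  # mark it as UTF-8 so it is checked as such

valid <- is_valid_string(c("abc", bad))
print(valid)
```

     In versions where the character count of a missing string is
     'NA', this helper would report missing values as invalid too.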

_N_o_t_e:

     This does *not* by default give the number of characters that will
     be used to 'print()' the string, although it was documented to do
     so up to R 2.0.1.  Use 'encodeString' to find the characters used
     to print the string.
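
     The difference can be seen with a string holding a newline, as in
     the sketch below; the variable names are illustrative only.

```r
s <- "a\nb"  # three characters: 'a', a newline, 'b'

n_raw <- nchar(s, type = "chars")

# encodeString() escapes the newline as the two characters '\' and 'n'
n_printed <- nchar(encodeString(s), type = "chars")

# with quote = '"' the surrounding quotes shown by print() are included
n_quoted <- nchar(encodeString(s, quote = '"'), type = "chars")

print(c(raw = n_raw, printed = n_printed, quoted = n_quoted))
# 3 4 6
```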

     As of R 2.1.0, embedded 'nul' bytes are included in the byte
     count (but not the final 'nul'); previously the count stopped
     immediately before the first 'nul'.

_R_e_f_e_r_e_n_c_e_s:

     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S
     Language_. Wadsworth & Brooks/Cole.

_S_e_e _A_l_s_o:

     'strwidth' giving width of strings for plotting; 'paste',
     'substr', 'strsplit'

_E_x_a_m_p_l_e_s:

     x <- c("asfef","qwerty","yuiop[","b","stuff.blah.yech")
     nchar(x)
     # 5  6  6  1 15

     nchar(deparse(mean))
     # 18 17

