Wilcoxon                package:stats                R Documentation

_D_i_s_t_r_i_b_u_t_i_o_n _o_f _t_h_e _W_i_l_c_o_x_o_n _R_a_n_k _S_u_m _S_t_a_t_i_s_t_i_c

_D_e_s_c_r_i_p_t_i_o_n:

     Density, distribution function, quantile function and random
     generation for the distribution of the Wilcoxon rank sum statistic
     obtained from two samples of sizes 'm' and 'n', respectively.

_U_s_a_g_e:

     dwilcox(x, m, n, log = FALSE)
     pwilcox(q, m, n, lower.tail = TRUE, log.p = FALSE)
     qwilcox(p, m, n, lower.tail = TRUE, log.p = FALSE)
     rwilcox(nn, m, n)

_A_r_g_u_m_e_n_t_s:

    x, q: vector of quantiles.

       p: vector of probabilities.

      nn: number of observations. If 'length(nn) > 1', the length is
          taken to be the number required.

    m, n: numbers of observations in the first and second sample,
          respectively.  Can be vectors of positive integers.

log, log.p: logical; if TRUE, probabilities p are given as log(p).

lower.tail: logical; if TRUE (default), probabilities are P[X <= x],
          otherwise, P[X > x].
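
     As a small illustration of the 'lower.tail', 'log' and 'log.p'
     arguments, the following sketch checks them against each other.
     The sample sizes 4 and 6 and the quantile 8 are arbitrary values
     chosen for this illustration only.

```r
m <- 4; n <- 6; q <- 8   # arbitrary illustrative values

# Upper tail P[X > q] is the complement of the lower tail P[X <= q].
p_lower <- pwilcox(q, m, n)
p_upper <- pwilcox(q, m, n, lower.tail = FALSE)
tails_ok <- isTRUE(all.equal(p_lower + p_upper, 1))

# log = TRUE / log.p = TRUE return the probabilities on the log scale.
log_d_ok <- isTRUE(all.equal(dwilcox(q, m, n, log = TRUE),
                             log(dwilcox(q, m, n))))
log_p_ok <- isTRUE(all.equal(pwilcox(q, m, n, log.p = TRUE),
                             log(p_lower)))

# With log.p = TRUE, qwilcox expects its probability argument as log(p).
q_back <- qwilcox(log(p_lower), m, n, log.p = TRUE)
```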

_D_e_t_a_i_l_s:

     This distribution is obtained as follows.  Let 'x' and 'y' be two
     random, independent samples of size 'm' and 'n'. Then the Wilcoxon
     rank sum statistic is the number of all pairs '(x[i], y[j])' for
     which 'y[j]' is not greater than 'x[i]'.  This statistic takes
     values between '0' and 'm * n', and its mean and variance are 'm *
     n / 2' and 'm * n * (m + n + 1) / 12', respectively.
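
     The definition above can be sketched directly in R. The helper
     'pair_count' is a hypothetical name introduced here, and the seed
     and sample sizes are arbitrary; the second half recomputes the
     stated mean and variance from the exact probabilities.

```r
# Hypothetical helper: count pairs (x[i], y[j]) with y[j] <= x[i].
pair_count <- function(x, y) sum(outer(x, y, ">="))

set.seed(1)            # arbitrary seed, for reproducibility only
m <- 4; n <- 6
x <- rnorm(m); y <- rnorm(n)
U <- pair_count(x, y)  # lies between 0 and m * n

# Mean and variance computed from the exact probabilities.
k  <- 0:(m * n)
pk <- dwilcox(k, m, n)
mu <- sum(k * pk)            # compare with m * n / 2
s2 <- sum((k - mu)^2 * pk)   # compare with m * n * (m + n + 1) / 12
```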

     If any of the first three arguments are vectors, they are recycled
     elementwise to the length of the longest vector, and the
     calculation is done for each resulting triple of values.
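
     A minimal sketch of the recycling behaviour, using arbitrary
     argument values: a single vectorised call is compared with the
     same evaluations done one position at a time.

```r
# x, m and n are recycled elementwise to the longest length (here 3);
# the scalar n = 5 is reused for every position.
d_vec <- dwilcox(c(2, 3, 4), m = c(2, 3, 4), n = 5)

# The same values, one call per position.
d_one <- c(dwilcox(2, 2, 5), dwilcox(3, 3, 5), dwilcox(4, 4, 5))
```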

_V_a_l_u_e:

     'dwilcox' gives the density, 'pwilcox' gives the distribution
     function, 'qwilcox' gives the quantile function, and 'rwilcox'
     generates random deviates.
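
     The relationships between these functions can be checked on the
     full support of the statistic. This is an illustrative sketch with
     arbitrary sample sizes; the round trip through 'qwilcox' is
     expected to recover the original quantiles for these values.

```r
m <- 4; n <- 6
k <- 0:(m * n)                 # full support of the statistic

d <- dwilcox(k, m, n)
p <- pwilcox(k, m, n)

total_ok  <- isTRUE(all.equal(sum(d), 1))     # probabilities sum to 1
cumsum_ok <- isTRUE(all.equal(cumsum(d), p))  # distribution = running sum
q_back    <- qwilcox(p, m, n)                 # expected to recover k
```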

_W_a_r_n_i_n_g:

     These functions can use large amounts of memory and stack (and
     even crash R if the stack limit is exceeded and stack-checking is
     not in place) if one sample is large (several thousands or more).

_N_o_t_e:

     S-PLUS uses a different (but equivalent) definition of the
     Wilcoxon statistic: see 'wilcox.test' for details.
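
     As a sketch of how the two definitions relate, assuming data
     without ties: the rank sum of the first sample exceeds the
     statistic used here by 'm * (m + 1) / 2'. The seed and sample
     sizes below are arbitrary.

```r
set.seed(2)                      # arbitrary seed
m <- 5; n <- 7
x <- rnorm(m); y <- rnorm(n)     # continuous data, so ties are not expected

r <- rank(c(x, y))
rank_sum <- sum(r[seq_len(m)])   # rank sum of the first sample
U <- rank_sum - m * (m + 1) / 2  # minimum possible rank sum subtracted
n_pairs <- sum(outer(x, y, ">")) # pairs with x[i] > y[j]
```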

_A_u_t_h_o_r(_s):

     Kurt Hornik

_S_o_u_r_c_e:

     These are calculated via recursion, based on 'cwilcox(k, m, n)',
     the number of choices with statistic 'k' from samples of size 'm'
     and 'n', which is itself calculated recursively and the results
     cached.  Then 'dwilcox' and 'pwilcox' sum appropriate values of
     'cwilcox', and 'qwilcox' is based on inversion.
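
     The counting recursion can be sketched in plain R. The function
     'count_u' below is a hypothetical stand-in written for this
     illustration, not R's internal 'cwilcox' code, and it does no
     caching, so it is only practical for small 'm' and 'n'. It uses
     the fact that the largest observation is either from the first
     sample (adding 'n' pairs) or from the second (adding none).

```r
# Hypothetical pure-R counterpart of the counting function.
# count_u(k, m, n): number of arrangements of m x's and n y's
# whose statistic equals k.
count_u <- function(k, m, n) {
  if (k < 0 || k > m * n) return(0)
  if (m == 0 || n == 0) return(as.numeric(k == 0))
  # Largest observation is an x (adds n pairs) or a y (adds none).
  count_u(k - n, m - 1, n) + count_u(k, m, n - 1)
}

m <- 4; n <- 6
k <- 0:(m * n)
counts   <- vapply(k, count_u, numeric(1), m = m, n = n)
d_sketch <- counts / choose(m + n, n)  # density: counts over all arrangements
p_sketch <- cumsum(d_sketch)           # distribution function: running sum
```

     For these sample sizes, 'd_sketch' and 'p_sketch' should agree
     with 'dwilcox(k, m, n)' and 'pwilcox(k, m, n)' up to rounding.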

     'rwilcox' generates a random permutation of ranks and evaluates
     the statistic.
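
     The idea can be sketched as follows. The helper 'r_u' is a
     hypothetical name, and this is an illustration of the approach
     under the assumption of no ties, not the internal algorithm: it
     draws the ranks of the first sample at random and shifts their
     sum so that the smallest possible value is 0.

```r
# Hypothetical helper; a sketch of the idea only.
r_u <- function(nn, m, n) {
  replicate(nn, {
    ranks_x <- sample.int(m + n, m)   # ranks assigned to the first sample
    sum(ranks_x) - m * (m + 1) / 2    # shift so the minimum is 0
  })
}

set.seed(3)                           # arbitrary seed
u <- r_u(500, m = 4, n = 6)
```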

_S_e_e _A_l_s_o:

     'wilcox.test' to calculate the statistic from data, find p values
     and so on.

     'dsignrank' etc., for the distribution of the _one-sample_ Wilcoxon
     signed rank statistic.

_E_x_a_m_p_l_e_s:

     require(graphics)

     x <- -1:(4*6 + 1)
     fx <- dwilcox(x, 4, 6)
     Fx <- pwilcox(x, 4, 6)

     layout(rbind(1,2), widths=1, heights=c(3,2))
     plot(x, fx,type='h', col="violet",
          main= "Probabilities (density) of Wilcoxon-Statist.(n=6,m=4)")
     plot(x, Fx,type="s", col="blue",
          main= "Distribution of Wilcoxon-Statist.(n=6,m=4)")
     abline(h=0:1, col="gray20",lty=2)
     layout(1)# set back

     N <- 200
     hist(U <- rwilcox(N, m=4,n=6), breaks=0:25 - 1/2,
          border="red", col="pink", sub = paste("N =",N))
     mtext("N * f(x),  f() = true \"density\"", side=3, col="blue")
     lines(x, N*fx, type='h', col='blue', lwd=2)
     points(x, N*fx, cex=2)

     ## Better is a Quantile-Quantile Plot
     qqplot(U, qw <- qwilcox((1:N - 1/2)/N, m=4,n=6),
            main = paste("Q-Q-Plot of empirical and theoretical quantiles",
                          "Wilcoxon Statistic,  (m=4, n=6)",sep="\n"))
     n <- as.numeric(names(print(tU <- table(U))))
     text(n+.2, n+.5, labels=tU, col="red")

