sort                  package:base                  R Documentation

_S_o_r_t_i_n_g _o_r _O_r_d_e_r_i_n_g _V_e_c_t_o_r_s

_D_e_s_c_r_i_p_t_i_o_n:

     Sort (or _order_) a vector or factor (partially) into ascending
     (or descending) order.  For ordering along more than one variable,
     e.g., for sorting data frames, see 'order'.

_U_s_a_g_e:

     sort(x, decreasing = FALSE, ...)

     ## Default S3 method:
     sort(x, decreasing = FALSE, na.last = NA, ...)

     sort.int(x, partial = NULL, na.last = NA, decreasing = FALSE,
              method = c("shell", "quick"), index.return = FALSE)

_A_r_g_u_m_e_n_t_s:

       x: for 'sort' an R object with a class or a numeric, complex,
          character or logical vector.  For 'sort.int', a numeric,
          complex, character or logical vector, or a factor.

decreasing: logical.  Should the sort be increasing or decreasing? Not
          available for partial sorting.

     ...: arguments to be passed to or from methods or (for the default
          methods and objects without a class) to 'sort.int'.

 na.last: for controlling the treatment of 'NA's. If 'TRUE', missing
          values in the data are put last; if 'FALSE', they are put
          first; if 'NA', they are removed.

 partial: 'NULL' or an integer vector of indices for partial sorting.

  method: character string specifying the algorithm used.

index.return: logical indicating if the ordering index vector should be
          returned as well; this is only available for a few cases, the
          default 'na.last = NA' and full sorting of non-factors.

_D_e_t_a_i_l_s:

     'sort' is a generic function for which methods can be written, and
     'sort.int' is the internal method which is compatible with S if
     only the first three arguments are used.

     The default 'sort' method makes use of 'order' for objects with
     classes, which in turn makes use of the generic function 'xtfrm'.

     If 'partial' is not 'NULL', it is taken to contain indices of
     elements of 'x' which are to be placed in their correct positions
     by partial sorting.  After the sort, the values specified in
     'partial' are in their correct position in the sorted array.  Any
     values smaller than these values are guaranteed to have a smaller
     index in the sorted array and any values which are greater are
     guaranteed to have a bigger index in the sorted array.  (This is
     included for efficiency, and many of the options are not available
     for partial sorting.  It is only substantially more efficient if
     'partial' has a handful of elements, and a full sort is done if
     there are more than 10.)  Names are discarded for partial sorting.

     Complex values are sorted first by the real part, then the
     imaginary part.

     The sort order for character vectors will depend on the collating
     sequence of the locale in use: see 'Comparison'. The sort order
     for factors is the order of their levels (which is particularly
     appropriate for ordered factors).

     Method '"shell"' uses Shellsort (an O(n^{4/3}) variant from
     Sedgewick (1996)).  If 'x' has names a stable sort is used, so
     ties are not reordered.  (This only matters if names are present.)

     Method '"quick"' uses Singleton's Quicksort implementation and is
     only available when 'x' is numeric (double or integer) and
     'partial' is 'NULL'.  (For other types of 'x' Shellsort is used,
     silently.)  It is normally somewhat faster than Shellsort (perhaps
     twice as fast on vectors of length a million) but has poor
     performance in the rare worst case. (Peto's modification using a
     pseudo-random midpoint is used to make the worst case rarer.) 
     This is not a stable sort, and ties may be reordered.

_V_a_l_u_e:

     For 'sort', the result depends on the S3 method which is
     dispatched.  If 'x' does not have a class the rest of this section
     applies.  For classed objects which do not have a specific method
     the default method will be used and is equivalent to 'x[order(x,
     ...)]': this depends on the class having a suitable method for '['
     (and also that 'order' will work, which is not the case for a
     class based on a list).

     For 'sort.int' the sorted vector unless 'index.return' is true,
     when the result is a list with components named 'x' and 'ix'
     containing the sorted numbers and the ordering index vector.  In
     the latter case, if 'method == "quick"' ties may be reversed in
     the ordering, unlike 'sort.list', as quicksort is not stable.

     All attributes are removed from the return value (see Becker _et
     al_, 1988, p.146) except names, which are sorted.  (If 'partial'
     is specified even the names are removed.)  Note that this means
     that the returned value has no class, except for factors and
     ordered factors (which are treated specially and whose result is
     transformed back to the original class).

_R_e_f_e_r_e_n_c_e_s:

     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S
     Language_. Wadsworth & Brooks/Cole.

     Sedgewick, R. (1986) A new upper bound for Shell sort. _J.
     Algorithms_ *7*, 159-173.

     Singleton, R. C. (1969) An efficient algorithm for sorting with
     minimal storage: Algorithm 347. _Communications of the ACM_ *12*,
     185-187.

_S_e_e _A_l_s_o:

     'order' for sorting on or reordering multiple variables.

     'is.unsorted'. 'rank'.

_E_x_a_m_p_l_e_s:

     require(stats)

     x <- swiss$Education[1:25]
     x; sort(x); sort(x, partial = c(10, 15))
     median.default # shows you another example for 'partial'

     ## illustrate 'stable' sorting (of ties):
     sort(c(10:3,2:12), method = "sh", index.return=TRUE) # is stable
     ## $x : 2  3  3  4  4  5  5  6  6  7  7  8  8  9  9 10 10 11 12
     ## $ix: 9  8 10  7 11  6 12  5 13  4 14  3 15  2 16  1 17 18 19
     sort(c(10:3,2:12), method = "qu", index.return=TRUE) # is not
     ## $x : 2  3  3  4  4  5  5  6  6  7  7  8  8  9  9 10 10 11 12
     ## $ix: 9 10  8  7 11  6 12  5 13  4 14  3 15 16  2 17  1 18 19
     ##        ^^^^^

     x <- c(1:3, 3:5, 10)
     is.unsorted(x)                #-> FALSE: is sorted
     is.unsorted(x, strictly=TRUE) #-> TRUE : is not (and cannot be) sorted strictly

     ## Not run: ## Small speed comparison simulation:
     N <- 2000
     Sim <- 20
     rep <- 1000 # << adjust to your CPU
     c1 <- c2 <- numeric(Sim)
     for(is in 1:Sim){
       x <- rnorm(N)
       c1[is] <- system.time(for(i in 1:rep) sort(x, method = "shell"))[1]
       c2[is] <- system.time(for(i in 1:rep) sort(x, method = "quick"))[1]
       stopifnot(sort(x, method = "s") == sort(x, method = "q"))
     }
     rbind(ShellSort = c1, QuickSort = c2)
     cat("Speedup factor of quick sort():\n")
     summary({qq <- c1 / c2; qq[is.finite(qq)]})

     ## A larger test
     x <- rnorm(1e7)
     system.time(x1 <- sort(x, method = "shell"))
     system.time(x2 <- sort(x, method = "quick"))
     stopifnot(identical(x1, x2))
     ## End(Not run)

