kmeans                 package:stats                 R Documentation

_K-_M_e_a_n_s _C_l_u_s_t_e_r_i_n_g

_D_e_s_c_r_i_p_t_i_o_n:

     Perform k-means clustering on a data matrix.

_U_s_a_g_e:

     kmeans(x, centers, iter.max = 10)

_A_r_g_u_m_e_n_t_s:

       x: A numeric matrix of data, or an object that can be coerced to
          such a matrix (such as a numeric vector or a data frame with
          all numeric columns). 

 centers: Either the number of clusters or a matrix of initial cluster
          centres, one per row. If a number, a random set of rows in
          'x' is chosen as the initial centres. 

iter.max: The maximum number of iterations allowed. 
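
     The two forms of 'centers' can be sketched as follows. This is
     an illustration only: the seed, the simulated data and the rows
     picked for 'init' are assumptions made for the example, not part
     of the function.

```r
set.seed(1)  # assumption: fixed seed so the random start is repeatable

# Simulated data: 20 points near (0, 0) and 20 points near (1, 1)
x <- rbind(matrix(rnorm(40, sd = 0.3), ncol = 2),
           matrix(rnorm(40, mean = 1, sd = 0.3), ncol = 2))

# Form 1: a number. Two rows of x are drawn at random as starting centres.
cl_num <- kmeans(x, 2)

# Form 2: a matrix with one starting centre per row.
# Hypothetical choice: one row taken from each half of the data.
init <- x[c(1, 21), ]
cl_mat <- kmeans(x, init)

# Both calls return one fitted centre per requested cluster
nrow(cl_num$centers)
nrow(cl_mat$centers)
```

     Supplying a matrix removes the random choice of starting rows,
     which can be convenient when a run needs to be repeated exactly.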

_D_e_t_a_i_l_s:

     The data given by 'x' are clustered by the k-means algorithm.
     When the algorithm terminates, each cluster centre is at the mean
     of its Voronoi set (the set of data points that are nearest to
     that cluster centre).
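
     The property that each centre is the mean of its own points can
     be checked by hand. The sketch below assumes simulated data and
     a fixed seed chosen for illustration; 'by_hand' and 'nearest' are
     names introduced here.

```r
set.seed(42)  # assumption: fixed seed for a repeatable illustration

x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
cl <- kmeans(x, 2, 20)

# Recompute each centre as the column means of the points assigned to it
by_hand <- t(sapply(1:2, function(k)
  colMeans(x[cl$cluster == k, , drop = FALSE])))

# Compare with the returned centres, ignoring dimnames
centres_match <- isTRUE(all.equal(by_hand, cl$centers,
                                  check.attributes = FALSE))

# Squared distance from every point to every centre (one column per centre)
d2 <- sapply(1:2, function(k) rowSums(sweep(x, 2, cl$centers[k, ])^2))

# Index of the closest centre for each point, to compare with cl$cluster
nearest <- apply(d2, 1, which.min)
table(nearest == cl$cluster)
```

     If the run stops because 'iter.max' was reached rather than
     because the assignments settled, the second comparison may not
     agree for every point.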

     The algorithm of Hartigan and Wong (1979) is used.

_V_a_l_u_e:

     A list with components:

 cluster: A vector of integers indicating the cluster to which each
          point is allocated. 

 centers: A matrix of cluster centres, one row per cluster.

withinss: The within-cluster sum of squares for each cluster.

    size: The number of points in each cluster.
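
     How the components relate to one another can be sketched as
     below. The data, the seed and the helper name 'wss' are
     assumptions made for the example.

```r
set.seed(7)  # assumption: fixed seed for a repeatable illustration

x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
cl <- kmeans(x, 2, 20)

# 'size' counts how often each label occurs in 'cluster'
counts <- as.vector(table(cl$cluster))
counts
cl$size

# 'withinss' recomputed by hand: squared deviations of each cluster's
# points from that cluster's own mean, summed over points and columns
wss <- sapply(1:2, function(k) {
  pts <- x[cl$cluster == k, , drop = FALSE]
  sum(sweep(pts, 2, colMeans(pts))^2)
})
all.equal(wss, cl$withinss, check.attributes = FALSE)
```

     Summing 'withinss' over clusters gives a single figure that can
     be used to compare runs started from different initial centres.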

_R_e_f_e_r_e_n_c_e_s:

     Hartigan, J.A. and Wong, M.A. (1979). A K-means clustering
     algorithm. _Applied Statistics_ *28*, 100-108.

_E_x_a_m_p_l_e_s:

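     # a 2-dimensional example: two Gaussian clouds of 50 points each,
     # centred at (0, 0) and (1, 1)
     x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
                matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
     # ask for 2 clusters, allowing up to 20 iterations
     cl <- kmeans(x, centers = 2, iter.max = 20)
     # colour each point by its cluster, then mark the fitted centres
     plot(x, col = cl$cluster)
     points(cl$centers, col = 1:2, pch = 8)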

