cophenetic               package:stats               R Documentation

_C_o_p_h_e_n_e_t_i_c _D_i_s_t_a_n_c_e_s _f_o_r _a _H_i_e_r_a_r_c_h_i_c_a_l _C_l_u_s_t_e_r_i_n_g

_D_e_s_c_r_i_p_t_i_o_n:

     Computes the cophenetic distances for a hierarchical clustering.

_U_s_a_g_e:

     cophenetic(x)
     ## Default S3 method:
     cophenetic(x)
     ## S3 method for class 'dendrogram':
     cophenetic(x)

_A_r_g_u_m_e_n_t_s:

       x: an R object representing a hierarchical clustering. For the
          default method, an object of class 'hclust' or one with a
          method for 'as.hclust()', such as 'agnes' in package
          'cluster'.

_D_e_t_a_i_l_s:

     The cophenetic distance between two observations that have been
     clustered is defined to be the intergroup dissimilarity at which
     the two observations are first combined into a single cluster.
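     Note that this distance has many ties and restrictions: a
     clustering of n observations has at most n - 1 distinct merge
     heights, so the n(n-1)/2 cophenetic distances take at most
     n - 1 distinct values, and when the merge heights are monotone
     the distances satisfy the ultrametric inequality.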
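
     As a rough illustration of the definition above, the cophenetic
     distance for one pair can be recovered by cutting the tree at
     every merge height and taking the first height at which the pair
     shares a cluster.  This is only a sketch: the data set
     ('USArrests'), the linkage method and the pair of states are
     arbitrary choices, and it relies on 'cutree()' accepting a vector
     of heights.

```r
hc <- hclust(dist(USArrests), method = "average")
d.coph <- cophenetic(hc)

## Cluster membership after cutting at each merge height
## (one column per height, heights in increasing order).
cl <- cutree(hc, h = hc$height)

## First merge height at which the two states share a cluster.
first <- which(cl["Alabama", ] == cl["Alaska", ])[1]
by.hand <- hc$height[first]

all.equal(by.hand, as.matrix(d.coph)["Alabama", "Alaska"])

## Ties: 50 observations give 1225 pairs but at most 49 distinct values.
length(d.coph)
length(unique(as.vector(d.coph)))
```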

     It can be argued that a dendrogram is an appropriate summary of
     some data if the correlation between the original distances and
     the cophenetic distances is high.  Otherwise, it should simply be
     viewed as the description of the output of the clustering
     algorithm.
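
     The correlation criterion above is sometimes used to compare
     linkage methods on the same data.  A minimal sketch, assuming
     'USArrests' and three of the methods offered by 'hclust()' purely
     for illustration; a higher value only indicates that the tree
     preserves the original distances more faithfully.

```r
d1 <- dist(USArrests)
methods <- c("single", "complete", "average")

## Cophenetic correlation for each linkage method.
coph.cor <- sapply(methods, function(m) {
    cor(d1, cophenetic(hclust(d1, method = m)))
})
round(coph.cor, 4)
```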

     'cophenetic' is a generic function.  Support for classes which
     represent hierarchical clusterings (total indexed hierarchies) can
     be added by providing an 'as.hclust()' or, more directly, a
     'cophenetic()' method for such a class.
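
     To make the extension mechanism concrete, here is a hedged sketch
     using a hypothetical class '"myclust"' (not part of R or of any
     package) that merely wraps an 'hclust' tree.  Supplying an
     'as.hclust()' method should be enough for the default
     'cophenetic()' method to work; a dedicated 'cophenetic.myclust()'
     method would be the more direct alternative.

```r
## Hypothetical container class: an hclust tree plus extra metadata.
obj <- structure(list(tree = hclust(dist(USArrests), method = "average"),
                      note = "illustration only"),
                 class = "myclust")

## Conversion method; the default cophenetic() method calls as.hclust().
as.hclust.myclust <- function(x, ...) x$tree

## Register explicitly so that dispatch also works if this sketch is
## not evaluated in the global environment.
registerS3method("as.hclust", "myclust", as.hclust.myclust)

d.obj <- cophenetic(obj)   # falls through to the default method
identical(d.obj, cophenetic(obj$tree))
```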

     The method for objects of class '"dendrogram"' requires that all
     leaves of the dendrogram object have non-null labels.
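
     For a tree converted with 'as.dendrogram()', the dendrogram
     method should give the same distances as the default method,
     though possibly with the observations in a different order.  The
     sketch below assumes 'USArrests' with average linkage and aligns
     the two results by label before comparing them.

```r
hc <- hclust(dist(USArrests), method = "average")
dend <- as.dendrogram(hc)   # leaves carry the state names as labels

m.hc   <- as.matrix(cophenetic(hc))
m.dend <- as.matrix(cophenetic(dend))

## Align rows and columns by label before comparing.
labs <- rownames(m.hc)
all.equal(m.hc, m.dend[labs, labs])
```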

_V_a_l_u_e:

     An object of class 'dist'.

_A_u_t_h_o_r(_s):

     Robert Gentleman

_R_e_f_e_r_e_n_c_e_s:

     Sneath, P.H.A. and Sokal, R.R. (1973) _Numerical Taxonomy: The
     Principles and Practice of Numerical Classification_, p. 278 ff;
     Freeman, San Francisco.

_S_e_e _A_l_s_o:

     'dist', 'hclust'

_E_x_a_m_p_l_e_s:

     ## Cophenetic correlation for average linkage on USArrests
     d1 <- dist(USArrests)
     hc <- hclust(d1, method = "average")
     d2 <- cophenetic(hc)
     cor(d1, d2) # 0.7659

     ## Example from Sneath & Sokal, Fig. 5-29, p.279
     d0 <- c(1,3.8,4.4,5.1, 4,4.2,5, 2.6,5.3, 5.4)
     attributes(d0) <- list(Size = 5, diag = TRUE)
     class(d0) <- "dist"
     ## a 'dist' object keeps its observation labels in "Labels"
     attr(d0, "Labels") <- letters[1:5]
     d0
     str(upgma <- hclust(d0, method = "average"))
     plot(upgma, hang = -1)
     #
     (d.coph <- cophenetic(upgma))
     cor(d0, d.coph) # 0.9911

