mcl(1)                    USER COMMANDS                    mcl(1)



  NAME
      mcl  - The Markov Cluster Algorithm, aka the MCL algorithm.

  SYNOPSIS
      mcl <-|fname> [-I f (inflation)] [-o str (fname)]  [-scheme
      k (resource scheme)]

      These options are sufficient in at least 95 percent of
      cases. The first argument must be the name of a file con-
      taining a graph/matrix in the mcl input format, or a hyphen
      to read from STDIN. With respect to clustering, only the -I
      option  and  -scheme  option  are  relevant and the rest is
      immaterial.

      As of the first 1.002 release, mcl will accept a very  gen-
      eral input format.  Graph indices no longer need be sequen-
      tial; you can use any set of indices you like, as  long  as
      they  are  in  a suitable range. Refer to the mcxio(5) sec-
      tion, and use mcl -z to find the range from  which  indices
      can be picked.

      As of the first 1.003 release, clmformat enables you to map
      a clustering onto a format better  suited  for  inspection,
      using an index file (the so-called tab file) mapping mcl
      indices onto descriptive labels.  Read the clmformat manual
      for  more  information - it is simple to use and the manual
      is small.

      A mechanism for pipelines is supported  (as  of  the  first
      1.003  release).   Refer  to the PIPELINES section for more
      information.  A prepackaged pipeline for BLAST data is pre-
      sent in the form of mclblastline.

      The  full  listing of mcl options is shown below, separated
      into parts corresponding with functional  aspects  such  as
      clustering, threading, verbosity, pruning and resource man-
      agement, automatic output naming, and dumping.  The -scheme
      parameter  provides  a  single  access point to the pruning
      options, and should  be  sufficient  in  most  cases.   mcl
      allows comprehensive tuning and access to its internals for
      those who are interested, so it has many options.

      Baseline clustering options
      [-I f (inflation)] [-o str (fname)]  [-scheme  k  (resource
      scheme)]

      Additional clustering options
      [-c  f  (centering)] [-pi f (pre-inflation)] [-l n (initial
      iteration number)] [-L n (main  iteration  number)]  [-i  f
      (initial inflation)] [-pp n (input matrix)] [-do str (carry
      out action)]

      Verbosity options
      [-v str (verbosity type on)] [-V str (verbosity type  off)]
      [--silent  (very)] [--verbose (very)] [-progress k (gauge)]
      [--show (print (small) matrices to screen)]

      Thread options
      [-te k (#expansion threads)] [-ti k  (#inflation  threads)]
      [-t k (#threads)] [--clone (when threading (experimental))]
      [-cloneat n (trigger)]

      Output file name options
      [-o str (fname)] [-ap str (use str as  file  name  prefix)]
      [-aa  str  (append  str  to suffix)] [-az (show output file
      name and exit)]

      Dump options
      [-di i:j (dump interval)] [-dm k (dump modulo)]  [-ds  stem
      (dump  file  stem)] [-dump str (type)] [-digits n (printing
      precision)]

      Output options
      [-sort str (sort mode)] [--expand-only (factor out computa-
      tion)] [--inflate-first (rather than expand)]

      Info options
      [--jury-charter (explains jury)] [--version (show version)]
      [-how-much-ram k (RAM upper  bound)]  [-h  (most  important
      options)]  [-z  (show  current settings)] [-az (show output
      file  name  and  exit)]  [--show-schemes   (show   resource
      schemes)]

      Pruning options
      The  following  options  all pertain to the various pruning
      strategies that can be employed by mcl. They are  described
      in  the  PRUNING OPTIONS section, accompanied by a descrip-
      tion of the mcl pruning strategy.  If your graphs are  huge
      and  you  have  an  appetite for tuning, have a look at the
      following:

      [-p f (cutoff)] [-P n (1/cutoff)] [-S n (selection number)]
      [-R  n  (recovery  number)]  [-pct  f (recover percentage)]
      [-warn-pct  n  (prune  warn  percentage)]  [-warn-factor  n
      (prune  warn  factor)]  [--dense  (allow matrices to fill)]
      [--adapt (pruning)] [--rigid (pruning)]  [-ae  f  (adaptive
      pruning exponent)] [-af f (adaptive pruning factor)] [-nx x
      (x window index)] [-ny y (y window  index)]  [-nj  j  (jury
      window index)] [-nw w (nr of windows)] [-nl w (nr of itera-
      tions)] [--thick (expect dense input graph)]

      The first argument of mcl must be a  file  name,  but  some
      options  are  allowed  to  appear  as  the  first  argument
      instead. These are the options that cause mcl to print  out
      information  of  some  kind, after which it will gracefully
      exit. The full list of these options is

      -z, -h, --version, --show-settings, --show-schemes, --jury-
      charter, -how-much-ram k.

  DESCRIPTION
      mcl  implements  the  MCL  algorithm,  short for the Markov
      cluster algorithm, a cluster algorithm for graphs developed
      by  Stijn van Dongen at the Centre for Mathematics and Com-
      puter Science in Amsterdam, the Netherlands. The  algorithm
      simulates  flow  using  two  simple algebraic operations on
      matrices.  The inception of this flow process and the  the-
      ory  behind  it  are  described elsewhere (see REFERENCES).
      Frequently asked questions are answered  in  the  mclfaq(7)
      section.   The  program  described  here is a fast threaded
      implementation written by the algorithm's creator with con-
      tributions  by several others. Anton Enright co-implemented
      threading; see the HISTORY/CREDITS section for  a  complete
      account.   See  the APPLICABILITY section for a description
      of the type of graph mcl likes best, and for a  qualitative
      assessment  of  its  speed.   mcl is accompanied by several
      other utilities for analyzing  clusterings  and  performing
      matrix and graph operations; see the SEE ALSO section.

      The first argument is the input file name (see the mcxio(5)
      section for its expected format), or  a  single  hyphen  to
      read  from  stdin. The rationale for making the name of the
      input file a fixed parameter is that you typically do  sev-
      eral  runs  with different parameters. In command line mode
      it is pleasant if you do not have to skip over an immutable
      parameter all the time.

      The  -I f  option  is  the  main control, affecting cluster
      granularity. Using mcl is as simple as typing  (assuming  a
      file proteins contains a matrix/graph in mcl input format)

      mcl proteins -I 2.0

      The  above  will result in a clustering written to the file
      named out.proteins.I20s2. It is - of course -  possible  to
      explicitly specify the name of the output file using the -o
      option. Refer to the -ap option for a description of  mcl's
      procedure for automatically constructing file names from
      its parameters.

      The mcl input format is described in the mcxio(5)  section.
      Clusterings  are  also  stored  as matrices - this is again
      discussed in the mcxio(5) section.   In  finding  good  mcl
      parameter  settings  for a particular domain, or in finding
      cluster structure at different levels of  granularity,  one
      typically  runs  mcl multiple times for varying values of f
      (refer to the -I option for further information).
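
      As a sketch (using the proteins file from the earlier
      example; the output file names here are chosen for illus-
      tration), such a parameter sweep might look like:

```shell
# Sweep over inflation values to obtain clusterings at
# several levels of granularity.
for I in 1.4 2.0 3.0 4.0 6.0; do
    mcl proteins -I $I -o out.proteins.I$I
done
```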

      mcl expects a nonnegative matrix  in  the  input  file,  or
      equivalently,  a weighted (possibly directed) graph. NOTE -
      mcl interprets the matrix entries or graph edge weights  as
      similarities, and it likes undirected input graphs best. It
      can handle directed graphs, but any  node  pair  (i,j)  for
      which w(i,j) is much smaller than w(j,i) or vice versa will
      presumably have a slightly negative effect on the  cluster-
      ings  output  by mcl. Many such node pairs will have a dis-
      tinctly negative effect, so try to make your  input  graphs
      undirected.  How  your edge weights are computed may affect
      mcl's performance. In protein clustering, one way to go  is
      to  choose the negated logarithm of the BLAST probabilities
      (see REFERENCES).

      mcl's default parameters should make it  quite  fast  under
      almost  all  circumstances.  Taking default parameters, mcl
      has been used to generate good  protein  clusters  on  133k
      proteins,  taking  10 minutes running time on a Compaq ES40
      system with  four  alpha  EV6.7  processors.  It  has  been
      applied  (with good results) to graphs with 800k nodes, and
      if you have the memory (and preferably CPUs as well)  noth-
      ing should stop you from going further.

      For  large  graphs,  there are several groups of parameters
      available for tuning the mcl computing process,  should  it
      be  necessary.  The  easiest  thing  to do is just vary the
      -scheme option. This triggers different  settings  for  the
      group  of pruning parameters {-p/-P, -R, -S, and -pct}. The
      default setting corresponds with -scheme 2.   There  is  an
      additional  group  of control parameters {--adapt, --rigid,
      -ae, -af}, which may be helpful in speeding up  mcl.   When
      doing  multiple mcl runs for the same graphs with different
      -I settings (for obtaining clusterings at different  levels
      of  granularity),  it can be useful to factor out the first
      bit of computation that is common to all runs, by using the
      --expand-only  option  one  time  and then using --inflate-
      first for each run in the set.   Whether  mcl  considers  a
      graph  large  depends  mainly  on the graph connectivity; a
      highly connected graph on 50,000 nodes is large to mcl  (so
      that  you  might want to tune resources) whereas a sparsely
      connected graph on 500,000 nodes may be business as  usual.

      mcl is a memory munger. Its precise appetite depends on the
      resource settings. You can get a rough  (and  usually  much
      too  pessimistic) upper bound for the amount of RAM that is
      needed by using the -how-much-ram option. The corresponding
      entry  in  this manual page contains the simple formula via
      which the upper bound is computed.

      Two other groups of interest are the thread-related options
      (you  can  specify  the number of threads to use) {-t, -te,
      -ti, --clone, -cloneat} and the  verbosity-related  options
      {--verbose,  --silent,  -v,  -V}.   The actual settings are
      shown with -z, and for graphs with at most 12 nodes  or  so
      you can view the MCL matrix iterands on screen by supplying
      --show (this may give a better feel for the process).

      MCL iterands allow a generic interpretation as  clusterings
      as well. The clusterings associated with early iterands may
      contain a fair  amount  of  overlap.  Refer  to  the  -dump
      option,  the  mclfaq(7)  manual,  and  the  clmimac utility
      (Interpret Matrices As Clusterings).  Use clmimac  only  if
      you have a special reason; the normal usage of mcl is to do
      multiple runs for varying -I parameters and use  the  clus-
      terings output by mcl itself.

      The  creator  of  this  page  feels that manual pages are a
      valuable resource, that online html documentation is also a
      good  thing  to have, and that info pages are way way ahead
      of their time. The NOTES section explains how this page was
      created.

      In  the  OPTIONS  section  options  are  listed in order of
      importance, with related options grouped together.

  OPTIONS
      -I f (inflation)
         Sets the main inflation value to f. This  value  is  the
         main  handle  for  affecting  cluster granularity. It is
         usually chosen somewhere in the range [1.2-5.0].  -I 5.0
         will  tend  to  result  in fine-grained clusterings, and
          -I 1.2 will tend to result in very coarse-grained clus-
          terings. Your mileage will vary depending on the charac-
          teristics of your data. That is why it is a good idea to
         test the quality and coherency of your clusterings using
         clmdist and clminfo. This will most likely  reveal  that
         certain values of -I are simply not right for your data.
         The clmdist section contains a discussion of how to  use
         the  cluster  validation tools shipped with mcl (see the
         SEE ALSO section).

         A second option for affecting cluster granularity is the
         -c option.  It may possibly increase granularity.

         With  low values for -I, like -I 1.2, you should be pre-
         pared to use more resources in order to maintain quality
         of  clusterings,  i.e.  increase  the  argument  to  the
         -scheme option.

      -o str (fname)
         Output the clustering to file named fname.  It is possi-
         ble  to send the clustering to stdout by supplying -o -.
         The clustering is output in the mcl matrix  format;  see
         the mcxio(5) section for more information on this.

         Look  at  the  -ap option and its siblings for the auto-
         matic naming constructions employed by  mcl  if  the  -o
         option is not used.

      -scheme k (use a preset resource scheme)
         There  are  currently  five  different resource schemes,
         indexed 1..5.  High schemes  result  in  more  expensive
         computations  that  may  possibly  be more accurate. The
         default scheme is 2. When mcl is done, it  will  give  a
         grade  (the so called jury synopsis) to the appropriate-
         ness of the scheme used. A low grade does not  necessar-
         ily  imply  that  the  resulting clustering is bad - but
         anyway, a low grade should be reason to try for a higher
         scheme.  The  grades  are  listed in the PRUNING OPTIONS
         section under the -nj option.

         The  PRUNING  OPTIONS  section  contains  an   elaborate
         description of the way mcl manages resources, should you
         be interested.  In case you are worried about the  vali-
         dation  of the resulting clusterings, the mclfaq(7) sec-
         tion has several entries discussing this issue. The bot-
         tom  line  is  that  you have to compare the clusterings
         resulting from different schemes (and otherwise  identi-
         cal parameters) using utilities such as clmdist, clminfo
         on the one hand, and your  own  sound  judgment  on  the
         other hand.

         If  your input graph is extremely dense, with an average
         node degree (i.e. the number  of  neighbours  per  node)
         that  is somewhere above 500, you may need to filter the
         input graph by removing the nodes of highest degree (and
         projecting  them  back  onto  the  resulting  clustering
         afterwards) or by using the -pp option.

      --show-schemes (show preset resource schemes)
         Shows the explicit settings to which the different  pre-
         set schemes correspond.

         The characteristics are written in the same format (more
         or less) as the output triggered by -v pruning.

      -c f (centering)
         The larger the value of f the more nodes are attached to
         themselves rather than their neighbours, the more expan-
         sion (the  spreading  of  flow  through  the  graph)  is
         opposed,  and  the more fine-grained clusterings tend to
         be. f should be chosen greater than or equal to 1.0. The
         default  is  f=1.0. This option has a much weaker effect
         than the -I option, but it can be  useful  depending  on
         your data.

      -do str (carry out action)
      -dont str (omit action)
         These are the actions that can be controlled:

         clm
         log
         show-log
         keep-overlap

         The  clm  action appends a short synopsis of performance
         criteria to the output cluster file, as well as a synop-
         sis of the granularity characteristics.

         The  log  action  appends several characteristics of the
         mcl run to the output cluster file, such as the resource
         management  log,  parameter  settings,  timings, and the
         command line.  Useful bookkeeping, and it  is  advisable
         to always use it.

         The show-log action sends the resource management log to
         STDOUT.

         The keep-overlap action causes  mcl  to  retain  overlap
         should  this  improbable event occur. In theory, mcl may
         generate a clustering that  contains  overlap,  although
         this  almost  never  happens in practice, as it requires
         some particular type of symmetry to be  present  in  the
         input graph (not just any symmetry will do).  Mathemati-
         cally speaking, this is a conjecture and not a  theorem,
         but I am willing to eat my shoe if it does not hold (for
         marzipan values of shoe). It is easy though to construct
         an  input graph for which certain mcl settings result in
         overlap - for example a line graph on an odd  number  of
         nodes. The default is to remove overlap should it occur.

         This option has more than theoretical use because mcl is
         able  to generate clusterings associated with intermedi-
         ate iterands.  For these clusterings,  overlap  is  more
         than a theoretical possibility, and will often occur. If
         you specify the -L k option, mcl will output  the  clus-
         tering associated with the last iterand computed, and it
         may well contain overlap.

         This option has no effect on the  clusterings  that  are
         output  when  using -dump cls - the default for those is
          that overlap is not touched, and this default cannot
          yet be overridden.
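
          For instance (file name hypothetical), to output the
          clustering associated with the fifth iterand while
          retaining any overlap it may contain:

```shell
# Stop after five iterations and keep overlap in the
# resulting (intermediate) clustering.
mcl graph.mci -I 2.0 -L 5 -do keep-overlap
```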

      -v str (verbosity type on)
         See the --verbose option below.

      -V str (verbosity type off)
         See the --verbose option below.

      --silent (very)
         See the --verbose option below.

      --verbose (very)
         These are the different verbosity modes:

         progress
         pruning
         explain
         all

         where all means all three previous modes.  --verbose and
         -v all turn them all on, --silent and -V all  turn  them
         all  off.  -v str and -V str turn on/off the single mode
         str (for str equal  to  one  of  progress,  pruning,  or
         explain).   Each  verbosity  mode is given its own entry
         below.

      -v progress
         This mode causes mcl to emit an  ascii  gauge  for  each
         single  matrix  multiplication.  It  uses  some  default
         length for the  gauge,  which  can  be  altered  by  the
         -progress k  option.  Simply  using the latter will also
          turn on this verbosity mode. This mode can quickly give
          you an idea of how long an mcl run might take. If you
         use threading (see the -t option and its friends),  this
         option  may  slow down the program a little (relative to
         -V progress, not relative to a single-CPU mcl run).

      -v explain
         This mode causes the output of explanatory headers illu-
         minating the output generated with the pruning verbosity
         mode.

      -v pruning
         This mode causes output of resource-related  quantities.
         It  has a separate entry in the PRUNING OPTIONS section.

      -progress k (gauge)
         If k>0 then for  each  matrix  multiplication  mcl  will
         print  an  ascii  gauge telling how far it is. The gauge
         will be (in some cases approximately) k characters long.
         If  k<0  then  mcl will emit a gauge that is extended by
         one character after  every  |k|  vectors  computed.  For
         large  graphs,  this  option  has been known to ease the
         pain of impatience. If k=0 then mcl will print a message
         only  after  every matrix multiplication, and not during
         matrix multiplication. This can be useful when you  want
         mcl  to be as speedy as possible, for example when using
          parallelized mode (as monitoring progress requires
          thread communication). For parallelization (by thread-
          ing) see the -t option.

      -aa str (append str to suffix)
         See the -ap option below.

      -ap str (use str as file name prefix)
         If the -o option is not used, mcl  will  create  a  file
         name  (for writing output to) that should uniquely char-
         acterize the important parameters used  in  the  current
         invocation  of mcl. The default format is out.fname.suf,
         where out is simply the literal string out, fname is the
         first argument containing the name of the file (with the
         graph) to be clustered, and  where  suf  is  the  suffix
         encoding  a set of parameters (described further below).

         The -ap str option specifies a prefix to use rather than
         out.fname  as  sketched above.  However, mcl will inter-
         pret the character '=', if present in str, as  a  place-
         holder for the input file name.

         If  the  -aa str  option is used, mcl will append str to
         the suffix suf created by itself.  You can use  this  if
         you  need  to  encode some extra information in the file
         name suffix.

         The suffix is  constructed  as  follows.  The  -I f  and
         -scheme  parameter are always encoded.  The -pi f, -l k,
         -i f, and -c f options are  only  encoded  if  they  are
         used.  Any  real argument f is encoded using exactly one
         trailing  digit  behind  the  decimal  separator  (which
         itself  is  not  written).  The  setting -I 3.14 is thus
         encoded as I31. The -scheme option is encoded using  the
         letter 's', all other options mentioned here are encoded
         as themselves (stripped of the hyphen). For example

         mcl small.mci -I 3 -c 2.5 -pi 0.8 -scheme 5

         results in the file name out.small.mci.I30s5c25pi08.  If
         you  want  to  know  beforehand  what  file name will be
         produced, use the -az option.

      -az (show output file name and exit)
         If mcl automatically constructs a file name, it  can  be
          helpful to know beforehand what that file name will be.
         Use -az and mcl will write the file name to  STDOUT  and
         exit.  This  can be used if mcl is integrated into other
         software for which the automatic creation of unique file
         names is convenient.

      -te k (#expansion threads)
         See the -t k option below.

      -ti k (#inflation threads)
         See the -t k option below.

      --clone (when threading)
         See the -t k option below.

      -cloneat n (trigger)
         See the -t k option below.

      -t k (#threads)
         The -t options are self-explanatory. Note that threading
         inflation is hardly useful, as inflation  is  orders  of
         magnitude  faster than expansion. Also note that thread-
         ing is only useful if you have a multi-processor system.

         The --clone option says to give each thread its own copy
         of the matrix being expanded/squared. The latter  option
          can be further controlled using the -cloneat n option.
         Copies are only made if the source matrix (the one to be
         squared)  has on average at least k positive entries per
         vector. This option is probably not very useful, because
         without it mcl is a memory munger already.

         When  threading,  it is best not to turn on pruning ver-
         bosity mode if  you  are  letting  mcl  run  unattended,
         unless  you want to scrutinize its output later. This is
         because it makes mcl run somewhat slower,  although  the
         difference is not dramatic.
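
          For example, on a four-CPU machine one might use (file
          name hypothetical):

```shell
# Use four expansion threads; threading inflation is
# rarely worthwhile, as inflation is much cheaper.
mcl graph.mci -I 2.0 -te 4
```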

      -l n (initial iteration number) (small letter ell)
         The  number  of times mcl will use a different inflation
         value before it switches to the (main)  inflation  given
         by  the  -I (capital eye) option. The different value is
         called initial inflation and is tunable using  the  -i f
         option  (default value f=2.0). The default value (to -l)
         is zero. This option  supplies  new  ways  of  affecting
         cluster granularity, e.g. by supplying

         mcl proteins -i 1.4 -l 2 -I 4.0

         one  lets  expansion prevail during the first two itera-
         tions, followed by inflation catching up (in  a  figura-
         tive  way  of  writing).   This may be useful in certain
         cases, but this type of  experiment  is  certainly  sec-
         ondary to simply varying -I (capital eye).

      -L n (main iteration number)
         Normally, mcl computes the MCL process until it has con-
         verged fully to a doubly idempotent matrix.  The  number
         of  iterations  required  is  typically somewhere in the
         range 10-100.  The first few iterations  generally  take
         the  longest time.  The -L option can be used to specify
         the number of iterations mcl may do at most.  When  this
         number  is reached, mcl will output the clustering asso-
         ciated with the iterand last computed.

      -i f (initial inflation)
         The inflation value used during the first n  iterations,
         where  n  is  specified  by  the  -l  (ell)  option.  By
         default, n=0 and f=2.0.

      -pi f (pre-inflation)
         If used, mcl will apply inflation one time to the  input
         graph before entering the main process. This can be use-
         ful for making the edge weights in a graph  either  more
         homogeneous  (which may result in less granular cluster-
         ings) or more heterogeneous (which may  result  in  more
         granular clusterings).  Homogeneity is achieved for val-
         ues f less than one,  heterogeneity  for  values  larger
         than  one.   Values  to  try  are  normally in the range
         [2.0,10.0].

      -di i:j (dump interval)
      -dump-interval i:j
         Dump during iterations i..j-1. See the -dump str  option
         below.

      -dm k (dump i+0..i+k..)
      -dump-modulo k
         Sampling  rate: select only these iterations in the dump
         interval.  See the -dump str option below.

      -ds stem (file stem)
      -dump-stem stem
          Set the stem for file names of dumped objects
         (default mcl). See the -dump str option below.

      -dump str (type)
         str can be of the following types.

         ite
         cls
         chr

         Repeated  use  is  allowed.   The  ite option writes mcl
         iterands to file.  The  cls  option  writes  clusterings
         associated with mcl iterands to file.

         The  chr  option  says,  for each iterand I, to output a
         matrix C with characteristics of I. C has the same  num-
         ber of columns as I. For each column k in C, row entry 0
         is the diagonal or 'loop' value of column k in  I  after
         expansion  and pruning, and before inflation and rescal-
         ing. Entry 1 is  the  loop  value  after  inflation  and
         rescaling.   Entry  2 is the center of column k (the sum
         of its entries squared)  computed  after  expansion  and
         before  pruning,  entry  3 is the maximum value found in
         that column at the same time. Entry 4 is the  amount  of
         mass kept for that column after pruning.

         The  -ds  option  sets the stem for file names of dumped
         objects (default mcl). The -di and -dm options  allow  a
         selection of iterands to be made.
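
          Combining these options, the following sketch (file
          name and stem hypothetical) dumps the clusterings asso-
          ciated with iterands 2, 4, and 6:

```shell
# -di 2:8 selects iterations 2..7, -dm 2 samples every
# second one; dump files are named using the stem 'itr'.
mcl graph.mci -I 2.0 -dump cls -di 2:8 -dm 2 -ds itr
```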

      -digits n (printing precision)
         This has two completely different uses. It sets the num-
         ber of decimals used for  pretty-printing  mcl  iterands
         when  using  the  --show option (see below), and it sets
         the number of decimals used  for  writing  the  expanded
         matrix when using the --expand-only option.

      --show (print matrices to screen)
         Print matrices to screen. The number of significant dig-
         its to be  printed  can  be  tuned  with  -digits n.  An
         80-column  screen allows graphs (matrices) of size up to
          12(x12) to be printed with three digits of precision
          (after the decimal point), and of size up to 14(x14)
          with two
         digits. This can give you an idea of how  mcl  operates,
         and  what  the  effect of pruning is.  Use e.g. -S 6 for
         such a small graph and view the MCL matrix iterands with
         --show.

      -sort str (sort mode)
         str  can  be  one  of  lex,  size, revsize, or none. The
         default is 'revsize', in which the largest clusters come
         first.  If  the  mode  is 'size', smallest clusters come
         first, if the mode is 'lex', clusters are ordered  lexi-
         cographically,  and  if the mode is 'none', the order is
         the same as produced by the procedure used by mcl to map
         matrices onto clusterings.

      --inflate-first (rather than expand)
         Normally,  mcl will take the input graph/matrix, make it
         stochastic, and start computing an  mcl  process,  where
         expansion  and  inflation  are  alternated.  This option
         changes that to alternation of inflation and  expansion,
         i.e. inflation is the first operator to be applied. This
         is intended for use with an input matrix that was gener-
         ated  with the --expand-only option (see below).  If you
         do multiple mcl runs for the same graph, then the  first
         step will be the same for all runs, namely computing the
         square of the input matrix. With the pair of  --inflate-
         first  and  --expand-only  this  bit of computing can be
          factored out. NOTE this option assumes that the input
          matrix is stochastic (as it will be when generated with
          --expand-only). The --inflate-first option renders use-
          less all options that would otherwise affect the input
          matrix; precisely these options do affect the matrix
          resulting from --expand-only. See the entry below for
          more information.

      --expand-only (factor out computation)
         This option makes mcl compute just  the  square  of  the
         input  graph/matrix,  and write it to the file name sup-
         plied with the -o flag, or to  the  default  file  named
         out.mce.  NOTE  in  this case the output matrix is not a
         clustering. The intended use is that the  output  matrix
         is used as input for mcl with the --inflate-first switch
         turned on, so that multiple mcl runs need not  redo  the
         same computation (the first expansion step).

         Note  that the -scheme parameters affect the matrix com-
         puted with --expand-only. Other options that affect  the
         matrix resulting from this option: -pp, -c, and -digits.
         The latter option  sets  the  precision  for  output  in
         native ascii format.
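
          A sketch of factoring out the first expansion step
          across several runs (file names hypothetical):

```shell
# Compute the first expansion (squaring) once ...
mcl graph.mci --expand-only -o graph.mce
# ... then reuse it for runs at different inflation values.
mcl graph.mce --inflate-first -I 1.4 -o out.I14
mcl graph.mce --inflate-first -I 2.0 -o out.I20
```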

      -pp n (input matrix)
         For  each  column  vector  (node)  in  the  input matrix
         (graph) mcl will keep the n entries (outgoing edges)  of
         that  vector  (node) that have largest weight and remove
         the rest.

      --jury-charter (explains jury)
         Explains how the jury synopsis is computed from the jury
         marks.

      --version (show version)
         Show version.

      -how-much-ram n (RAM upper bound)
         n  is  interpreted  as  the  number of nodes of an input
         graph.  mcl will print the  maximum  amount  of  RAM  it
         needs for its computations.  The formula for this number
         in bytes is:

            2 * c * k * n

            2  :  two matrices are concurrently held in memory.
            c  :  mcl cell size (as shown by -z).
            n  :  graph cardinality (number of nodes).
            k  :  MAX(s, r).
            s  :  select number (-S, -scheme options).
            r  :  recover number (-R, -scheme options).

         This estimate will usually be too pessimistic.  It  does
         assume  though that the average node degree of the input
         graph does not exceed k. The -how-much-ram option  takes
         other  command-line  arguments  into account (such as -S
         and -R), and it expresses the amount of RAM in  megabyte
         units.
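         The formula above can be sketched as follows. The cell
         size and the select/recover defaults used here are
         assumptions; check mcl -z for the actual values on
         your build:

```python
# Sketch of the -how-much-ram estimate: bytes = 2 * c * k * n,
# with k = MAX(select, recover), reported in megabyte units.
# The cell size and select/recover defaults below are assumptions;
# `mcl -z` shows the real values.
def mcl_ram_mb(n_nodes, cell_size=8, select=500, recover=600):
    k = max(select, recover)
    return 2.0 * cell_size * k * n_nodes / (1024 * 1024)

# For a 10000-node graph with these assumed defaults this gives
# roughly 92 megabytes (usually a pessimistic upper bound).
```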

      -h (show help)
         Shows a selection of the most important mcl options.

      --show-settings (show settings)
         A synonym for the -z option.

      -z (show settings)
         Show  current  settings for tunable parameters.  --show-
         settings is a synonym.

  PRUNING OPTIONS
      -p f (cutoff)
      -P n (1/cutoff)
      -S s (selection number)
      -R r (recover number)
      -pct pct (recover percentage)
         After computing a new (column stochastic) matrix  vector
         during  expansion  (which  is matrix multiplication c.q.
         squaring), the vector is successively exposed to differ-
         ent  pruning  strategies.  The intent of pruning is that
         many small entries are removed while retaining  much  of
         the stochastic mass of the original vector. After
         pruning, vectors are rescaled to be stochastic again.
         MCL iterands are theoretically known to be sparse in a
         weighted sense, and this manoeuvre effectively perturbs
         the MCL process a little in order to obtain matrices
         that are genuinely sparse, thus keeping the computation
         tractable. An example of monitoring pruning can be
         found in the discussion of -v pruning at the end of
         this section.

         mcl proceeds as follows. First, entries that are smaller
         than cutoff are removed, resulting in a vector  with  at
         most 1/cutoff entries. The cutoff can be supplied either
         by -p, or as the inverse value by -P. The latter is more
         intuitive,  if  your  intuition  is like mine (and the P
         stands for precision or pruning by the way).  The cutoff
         just described is rigid; it is the same for all vectors.
         The --adapt option causes the computation  of  a  cutoff
         that  depends  on a vector's homogeneity properties, and
         this option may or may not speed up mcl.

         Second, if the remaining stochastic mass (i.e.  the  sum
         of  all  remaining entries) is less than pct/100 and the
         number of remaining entries is less than r (as specified
         by the -R flag), mcl will try to regain ground by recov-
         ering the largest discarded entries. The total number of
         entries is not allowed to grow larger than r.  If recov-
         ery was not necessary, mcl tries  to  prune  the  vector
         further  down  to  at most s entries (if applicable), as
         specified by the -S flag. If this results  in  a  vector
         that  satisfies  the recovery condition then recovery is
         attempted, exactly as described above. The  latter  will
         not occur of course if r <= s.
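         The cutoff/selection/recovery sequence just described
         can be sketched as follows. This is a simplified
         illustration of the logic, not mcl's implementation;
         the defaults mirror -p 1/4000, -S 500, -R 600, and
         -pct 95:

```python
def prune_vector(v, cutoff=1.0/4000, select=500, recover=600, pct=0.95):
    # first stage: rigid cutoff removes entries smaller than cutoff
    kept = {i: w for i, w in v.items() if w >= cutoff}

    def low_mass(d):
        # recovery condition: mass below pct AND fewer than r entries
        return sum(d.values()) < pct and len(d) < recover

    def recover_entries(d):
        # regain ground: re-add the largest discarded entries, never
        # letting the total number of entries grow larger than r
        discarded = sorted(((w, i) for i, w in v.items() if i not in d),
                           reverse=True)
        for w, i in discarded:
            if len(d) >= recover:
                break
            d[i] = w
        return d

    if low_mass(kept):
        kept = recover_entries(kept)
    elif len(kept) > select:
        # selection: prune further down to the s largest entries
        kept = dict(sorted(kept.items(), key=lambda t: -t[1])[:select])
        if low_mass(kept):          # cannot occur if r <= s
            kept = recover_entries(kept)

    total = sum(kept.values())      # rescale to be stochastic again
    return {i: w / total for i, w in kept.items()}
```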

         The  default  setting  is  something like -P 4000 -S 500
         -R 600. Check the -z flag to be sure. There is a set  of
         precomposed  settings,  which  can be triggered with the
         -scheme k option. k=2 is the default scheme; higher val-
         ues  for k result in costlier and more accurate computa-
         tions (vice versa for lower,  cheaper,  and  less  accu-
         rate).   The schemes are listed using the --show-schemes
         option. It is advisable to use the -scheme  option  only
         in interactive mode, and to use the explicit expressions
         when doing batch processing. The reason is that there is
         no guarantee whatsoever that the schemes will not change
         between different releases. This is because  the  scheme
         options  should  reflect  good general purpose settings,
         and it may become apparent that other schemes are
         better.

         Note  that  'less  accurate' or 'more accurate' computa-
         tions may still generate the  same  output  clusterings.
         Use  clmdist to compare output clusterings for different
         resource parameters. Refer to clmdist for  a  discussion
         of this issue.

      -warn-pct k (prune warn percentage)
      -warn-factor k (prune warn factor)
         The  two  options  -warn-pct  and -warn-factor relate to
         warnings that may be triggered once the initial  pruning
         of  a vector is completed. The idea is to issue warnings
         if initial pruning almost completely destroys a computed
         vector,  as  this may be a sign that the pruning parame-
         ters should be changed. It depends on the mass remaining
         after  initial pruning whether a warning will be issued.
         A warning is issued if that mass is less than warn-pct,
         or if the number of remaining entries is smaller by a
         factor warn-factor than both the number of entries
         originally computed and the recovery number.

         -warn-pct takes an integer between 0 and 100 as  parame-
         ter,  -warn-factor  takes  a  real positive number. They
         default to something like 30 and 50.0. If you want to
         see fewer warnings, decrease warn-pct and increase
         warn-factor. Set warn-factor to zero if you want no
         warnings.

      --dense (allow matrices to fill)
         This renders all pruning options useless except for one.
         After each expansion step, mcl will remove  all  entries
         that  are  smaller than the threshold specified by -p or
         -P, which acts like a  precision  in  this  case.  After
         removal,  the matrix columns are rescaled to be stochas-
         tic again.

         If the -p threshold (precision) is zero or  very  small,
         the --dense option results in a rather accurate and very
         costly computation of the MCL process. Do not  use  this
         option for graphs with more than several thousand
         entries, or you will have trouble digging your
         processor out of swap.

      --rigid (pruning)
         See the --adapt option below.

      -ae f (adaptive pruning exponent)
         See the --adapt option below.

      -af f (adaptive pruning factor)
         See the --adapt option below.

      --adapt (pruning)
         The default mcl pruning behaviour as described under
         the -P option is called rigid pruning (it being the
         default renders the switch --rigid currently useless),
         referring to the fact that the first stage of pruning
         removes entries smaller than a fixed threshold.  The
         options discussed here enable the computation of a
         threshold that depends on the homogeneity
         characteristics of a vector.  This behaviour is
         triggered by supplying --adapt.

         The  --adapt  behaviour  only  affects the first pruning
         stage, c.q.  the computation of the first threshold (see
         the  discussion under the -P option). It does not inter-
         fere with either selection or recovery. It  is  affected
         however  by the threshold as specified by the -P option.
         When using --adapt, you typically use the -P option as
         well, and you can and should use a higher value than
         you would without using --adapt.

         All that said, --adapt triggers this behaviour: Given  a
         stochastic  vector  v,  its  mass center of order two is
         computed, which is the sum of each  entry  squared.  The
         mass  center of v, call it c, is strongly related to its
         homogeneity properties (see REFERENCES). The threshold T
         is computed as 1/f * pow(c, e), where f and e are the
         arguments to the -af f and -ae e options respectively
         (check -z for the respective defaults).  Decreasing
         either e or f makes T larger.  Finally, T is maxed
         with the rigid threshold value, which can be
         altered using either -p f or -P n.  The  latter  is  why
         you  should  increase  the  -P  parameter n (so that the
         rigid threshold is decreased) once you switch  to  adap-
         tive  pruning. The adaptive threshold should be the main
         factor controlling pruning,  with  the  rigid  threshold
         acting as a safeguard that does not take over too often.

         This may seem complicated, but the  rules  are  actually
         quite  simple, and you may just disregard the definition
         of T. The usefulness of these options will vary. If  you
         want to speed up mcl, try it out and add --adapt to your
         settings.
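         The threshold computation can be sketched as follows.
         The default values for f and e below are placeholders,
         not mcl's actual defaults (check -z for those):

```python
# T = 1/f * pow(c, e), maxed with the rigid threshold, where c is
# the mass center of order two of a stochastic column v. The f and
# e defaults here are placeholders, not mcl's actual defaults.
def adaptive_threshold(v, f=2.0, e=1.4, rigid=1.0/4000):
    c = sum(w * w for w in v)      # mass center of order two
    return max((1.0 / f) * c ** e, rigid)

# Decreasing either f or e raises T (c lies in (0, 1] for a
# stochastic column), so pruning becomes more aggressive.
```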

      --thick (expect dense input graph)
         This option is somewhat esoteric. It does not affect the
         matrices  as  computed by mcl, but it affects the way in
         which they are computed. If  the  input  graph  is  very
         dense, this may speed up mcl a little.

      -v pruning
         Pruning  verbosity  mode  causes  mcl  to  emit  several
         statistics related to the pruning process, each of which
         is  described  below.  Use -v explain to get explanatory
         headers in the output as well (or simply use -v all).

         Selection and recovery
         The number of selections and recoveries mcl had to  per-
         form  during  each iteration is shown. It also shows the
         number of vectors for which the mass after final pruning
         was  below  the fraction defined by the -pct option as a
         percentage (default probably 90 or 95).

         Initial and pruned vector footprint distributions
         The distribution of the vector footprints (i.e. the num-
         ber  of  nonzero  entries)  before  and after pruning is
         shown. This is assembled in a terse (horrid if you will)
         format,  looking as follows (with some context stripped,
         noting that  the  data  for  three  expansion  steps  is
         shown):

         ----------------------------------------------------
          mass percentages  | distr of vec footprints       |
                  |         |____ expand ___.____ prune ____|
           prune  | final   |e4   e3   e2   |e4  e3   e2    |
         all ny nx|all ny nx|8532c8532c8532c|8532c8532c8532c|
         ---------.---------.---------------.---------.-----.
          98 88 86  98 91 86 _________022456 ___________0234
          98 89 86  98 94 91 _______00245678 ___________0234
          98 90 89  99 95 94 _______00235568 ___________0234
          ...

         This  particular  output  was  generated  (and truncated
         after three rounds  of  expansion  and  inflation)  from
         clustering  a  protein graph on 9058 nodes with settings
         -I 1.4, -P 2000, -S 500, -R 600, and -pct 95.

         The header entries 8532c85.. indicate  thresholds  going
         from  80000, 50000, 20000, 12500, 8000, all the way down
         to 300, 200, and 125. The character  'c'  signifies  the
         base 1.25 (for no apparent reason). The second entry '2'
         (after '0') on the first line signifies that roughly  20
         percent  of  all  the  vectors  had  footprint (#nonzero
         entries) between 800 and 1250.  Likewise, 40 percent had
         footprint between 300 and 500. The '0' entries signify a
         fraction somewhere below 5 percent, and the '@'  entries
         signify a fraction somewhere above 95 percent.

         Two  columns  are  listed,  one for the expansion vector
         footprints (i.e. after squaring), and the other for  the
         vector footprints right after initial pruning took place
         (i.e. before selection and recovery, after either  adap-
         tive  or  rigid  pruning).  This may give an idea of the
         soundness of the initial pruning process (overly severe,
         or  overly  mild),  and  the extent to which you want to
         apply selection and/or recovery.

         Initial and final mass windows
         The mass averages of the pruned vectors after the  first
         selection  stage are shown, and the mass averages of the
         vectors as finally computed, i.e.  after  selection  and
         recovery.  Note that the latter corresponds to a differ-
         ent stage than what is shown for the vector  footprints,
         if  either selection or recovery is turned on.  For both
         cases, three averages are shown: the  average  over  all
         vectors,  the  average  over  the worst x cases, and the
         average over the worst y cases. The  mass  averages  are
         shown  as  percentages: '98' on the first line under the
         'prune/all' column means that overall 98 percent of  the
         stochastic mass of the matrix was kept after pruning.

         This  example  demonstrates  that  many entries could be
         removed while retaining much of the stochastic mass. The
         effect of the recovery (-R) parameter is also clear: the
         final averages are higher than the initial averages,  as
         a result of mcl undoing some overenthusiastic pruning.

         An average over the worst k cases is called a window  of
         width  k; internally, mcl tracks many more such windows.
         The result of this can be seen when  using  the  -do log
         option  (which  appends  a log to the cluster output) or
         the -do show-log option (which sends the log to STDOUT).
         From  a  fixed  set of windows those that are applicable
         are tracked, that is, all those windows  for  which  the
         width does not exceed the graph cardinality. The windows
         in the fixed set have respective sizes 1, 2, 5, 10,  20,
         50,  and  so on up until 5000000 (which makes 15 windows
         in all).

      -nx i (x window index)
      -ny j (y window index)
         The options -nx and -ny both take an index in the  range
         1..15.  The  default  values for -nx and -ny are respec-
         tively 4 and 7, denoting the fourth and  seventh  window
         of  respective  widths  10 and 100. They are used in the
         verbosity output as described above.

      -nj i (jury window index)
         The -nj denotes a window index in the same  way  as  -nx
         and -ny do. This particular window is used for computing
         the jury marks, which are the three numbers reported
         by mcl when it is done. They are a reminder of the
         existence of pruning and its importance for both speed
         and accuracy, and they are indicative rather than
         authoritative.

         These jury marks are simply the respective mass averages
         in  the  jury window for the first three iterations. The
         marks are even further simplified and mapped to the jury
         synopsis, which is a single grade expressed as an adjec-
         tive. The grades are, in decreasing  order  of  achieve-
         ment,   perfect   exceptional  superior  excellent  good
         acceptable  mediocre  poor  bad  lousy  miserable  awful
         wretched atrocious. Doing 'mcl --jury-charter' will tell
         you how the jury marks map onto the jury synopsis.

         The jury marks should preferably be higher than  70.  If
         they  are in the vicinity of 80 or 90, mcl is doing fine
         as far as pruning is concerned.  Choose a higher  scheme
         if you think them too low. For very dense graphs that do
         have strong cluster structure, the jury marks  can  sink
         as low as to the 30's and 40's, but the clusterings gen-
         erated by mcl may still be good. The marks and the  syn-
         opsis  grade  the severity of pruning, not cluster qual-
         ity.  Note that the jury becomes friendlier when the
         -nj option is increased and harsher when it is
         decreased.

      -nw w (nr of windows)
         Normally,  mcl  will  use  all  windows  that have width
         smaller than the cardinality of the  input  graph.  This
         option  limits  the set of windows to those w windows of
         smallest width.  This affects the -do log output.

      -nl l (number of iterations)
         By default, mcl will log the window mass averages for
         the first ten iterations. This option sets that number
         to l.  It affects the -do log output.

  PIPELINES
      In general, clustering requires  several  stages;  creating
      the  matrix,  running  mcl,  and displaying the result. The
      display stage is supported by clmformat.  The  matrix  cre-
      ation  stage often needs only be done once for a given data
      collection, followed by repeated  runs  of  the  other  two
      stages for varying inflation values and scheme settings.

      The matrix creation stage can often be split up in two more
      stages, namely parsing a data file in  some  given  format,
      and assembling a matrix from the data bits and pieces, such
      as node indices and edge weights or even edge  weight  con-
      tributions.   The assembly step can be done by mcxassemble,
      which allows a very general input format  and  customizable
      behaviour  in how the bits and pieces should be transformed
      to the input graph.  This leaves  the  parse  stage  to  be
      filled in.

      The  mclpipeline  script implements a generic and customiz-
      able pipeline encapsulating the four  stages  distinguished
      here (parsing, assembly, clustering, display). It is possi-
      ble to let only part of the pipeline be  active,  and  many
      other  features are supported. The IO mechanism is entirely
      file based, and files are associated with  parametrizations
      via file name extensions (by all means a simple mechanism).

      mclpipeline requires a single parse script to be specified.
      It  will be plugged into the pipeline and you should be set
      to run.   The  parse  script  must  satisfy  the  interface
      requirements described in the mclpipeline manual page.

      For  BLAST  input, the mclblastline script provides a dedi-
      cated interface. It uses the mcxdeblast script  that  comes
      prepackaged with mcl.

  APPLICABILITY
      mcl will work very well for graphs in which the diameter of
      the natural clusters is not too large. The presence of many
      edges  between  different  clusters  is not problematic; as
      long as there is cluster structure, mcl will find it. It is
      less likely to work well for graphs with clusters (inducing
      subgraphs) of large diameter, e.g. grid-like graphs derived
      from  Euclidean  data. So mcl in its canonical form is cer-
      tainly not fit for boundary detection  or  image  segmenta-
      tion.  I experimented with a modified mcl and image segmen-
      tation in the thesis pointed  to  below  (see  REFERENCES).
      This was fun and not entirely unsuccessful, but not
      something to be pursued further.

      mcl likes undirected input graphs best, and it really  dis-
      likes  graphs  with node pairs (i,j) for which an arc going
      from i to j is present and the counter-arc from j to  i  is
      absent.  Try to make your input graph undirected.  Further-
      more, mcl interprets edge weights in  graphs  as  similari-
      ties.  If you are used to working with dissimilarities, you
      will have to convert those to similarities using some  con-
      version  formula. The most important thing is that you feel
      confident that the similarities are reasonable, i.e.  if  X
      is  similar  to Y with weight 2, and X is similar to Z with
      weight 200, then this should mean that the similarity of  Y
      (to X) is negligible compared with the similarity of Z
      (to X).
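      The manual deliberately leaves the conversion formula to
      the user. One purely illustrative choice is reciprocal
      distance with a guard against zero; the formula and names
      below are assumptions, not a recipe the manual endorses:

```python
# Illustrative dissimilarity-to-similarity conversion: small
# distances become large similarities. The reciprocal formula and
# epsilon guard are assumptions; use whatever conversion makes you
# confident the resulting similarities are reasonable.
def dissim_to_sim(distance, eps=1e-9):
    return 1.0 / (distance + eps)
```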

      mcl is probably not suited for clustering tree graphs. This
      is  because  mcl  works  best  if  there are multiple paths
      between different nodes in the  natural  clusters,  but  in
      tree  graphs  there  is  only  one path between any pair of
      nodes. Trees are too sparse a structure for mcl to work on.

      mcl  may  well  be  suited for clustering lattices. It will
      depend on the density characteristics of the  lattice,  and
      the  conditions for success are the same as those for clus-
      tering graphs in general: The diameter of the natural clus-
      ters  should not be too large.  NOTE when clustering a lat-
      tice, you have to cluster the underlying undirected  graph,
      and  not  the  directed  graph  that represents the lattice
      itself. The reason is that one has to  allow  mcl  (or  any
      other  cluster  algorithm)  to  'look  back in time', so to
      speak. Clustering and directionality bite each other  (long
      discussion omitted).

      mcl  has  a worst-case time complexity O(N*k^2), where N is
      the number of nodes in the graph, and k is the maximum num-
      ber of neighbours tracked during computations. k depends on
      the -P and -S options. If the -S option is used  (which  is
      the  default setting) then k equals the value corresponding
      with this option. Typical values for k  are  in  the  range
      500..1000.  The  average case is much better than the worst
      case though, as cluster structure itself has the effect  of
      helping mcl's pruning schemes, certainly if the diameter of
      natural clusters is not large.

  FILES
      There are currently no resource or configuration files.
      The mcl matrix format is described in the mcxio(5) section.

  ENVIRONMENT
      Currently, no environmental issues with mcl.

  DIAGNOSTICS
      If mcl issues a diagnostic error, it will  most  likely  be
      because the input matrix could not be parsed successfully.
      mcl tries to be helpful in describing  the  kind  of  parse
      error.   The mcl matrix format is described in the mcxio(5)
      section.

  BUGS
      No known bugs at this time. Please send bug reports to mcl-
      devel@mdcc.cx.

  AUTHOR
      Stijn van Dongen.

  HISTORY/CREDITS
      The  MCL algorithm was conceived in spring 1996 by the pre-
      sent author.  The first implementation of the MCL algorithm
      followed that spring and summer. It was written in Perl and
      proved the viability of the algorithm.  The  implementation
      described  here  began  its  life in autumn 1997. The first
      versions of the vital matrix library were designed  jointly
      by Stijn van Dongen and Annius Groenink in the period
      October 1997 - May 1999. The efficient matrix-vector multipli-
      cation routine was written by Annius. This routine is with-
      out significant changes still one of  the  cornerstones  of
      this MCL implementation.

      Since May 1999 all MCL libraries have seen much development
      and redesign by the present author. Matrix-matrix multipli-
      cation has been rewritten several times to take full advan-
      tage of the sparseness properties of the stochastic  matri-
      ces  brought  forth  by the MCL algorithm. This mostly con-
      cerns the issue of pruning - removal of small elements in a
      stochastic column in order to keep matrices sparse.

      Very  instructive  was  that  around April 2001 Rob Koopman
      pointed out that selecting the k largest elements out of  a
      collection of n is best done using a min-heap. This was the
      key to the second major rewrite (now counting three) of the
      MCL  pruning schemes, resulting in much faster code, gener-
      ally producing a more accurate computation of the MCL  pro-
      cess.

      In May 2001 Anton Enright initiated the parallelization
      of the mcl code and threaded inflation. From this
      example, Stijn threaded expansion. This was great, as the
      MCL data structures and operands (normal matrix
      multiplication and Hadamard multiplication) just beg for
      parallelization.

      In  Jan  2003  the  03-010  release  introduced support for
      sparsely enumerated (i.e. indices need not  be  sequential)
      graphs  and matrices, the result of a major overhaul of the
      matrix library and most higher layers.   Conceptually,  the
      library  now  sees  matrices as infinite quadrants of which
      only finite subsections happen to have nonzero entries.

      Joost van Baal set up the mcl CVS tree and packaged mcl for
      Debian  GNU/Linux. He completely autotooled the sources, so
      much so that at first I found it hard to find them again
      amidst bootstrap, aclocal.m4, depcomp, and other beauties.

      Jan van der Steen shared his elegant mempool  code.  Philip
      Lijnzaad  gave  useful  comments.  Philip, Shawn Hoon, Abel
      Ureta-Vidal, and Martin Mokrejs sent helpful bug reports.

  SEE ALSO
      mcl development is  discussed  on  mcl-devel@lists.mdcc.cx,
      this list is archived at http://lists.mdcc.cx/mcl-devel/.

      mcxio(5) - a description of the mcl matrix format.

      mclfaq(7) - Frequently Asked Questions.

      mcx  -  an  interpreter  for  a stack language that enables
      interaction with the mcl matrix libraries. It can  be  used
      both  from the command line and interactively, and supports
      a rich set of operations such  as  transposition,  scaling,
      column  scaling,  multiplication, Hadamard powers and prod-
      ucts, et cetera. The general aim is to provide handles  for
      simple  number  and  matrix arithmetic, and for graph, set,
      and clustering operations. The following is a  very  simple
      example of implementing and using mcl in this language.

       2.0 .i def                    # define inflation value.
       /small lm                     # load matrix in file 'small'.
       dim id add                    # add identity matrix.
       st .x def                     # make stochastic, bind to x.
       { xpn .i infl vm } .mcl def   # define one mcl iteration.
       20 .x .mcl repeat             # iterate  20 times
       imac                          # interpret matrix as clustering.
       vm                            # view matrix (clustering).

      One  of  the  more  interesting  things that can be done is
      doing mcl runs with  more  complicated  inflation  profiles
      than the two-constant approach used in mcl itself.

      Several  other utilities come with mcl, facilitating analy-
      sis and comparison of different clusterings.

      clmdist - compute the split/join distance between two  par-
      titions.  The split/join distance is better suited for mea-
      suring partition similarity than the long-known equivalence
      mismatch  coefficient.  The  former  measures the number of
      node moves required to transform  one  partition  into  the
      other,  the  latter measures differences between volumes of
      edges of unions of complete graphs associated  with  parti-
      tions.

      clminfo  -  compute a performance measure saying how well a
      clustering captures the edge weights of  the  input  graph.
      Useful  for  comparing  different  clusterings  on the same
      graph, best used in conjunction with clmdist - because com-
      paring  clusterings  at  different  levels  of  granularity
      should somewhat change the performance interpretation.  The
      latter issue is discussed in the clmdist entry.

      clmmeet - compute the intersection of a set of clusterings,
      i.e. the largest clustering that is a subclustering of all.
      Useful  for measuring the consistency of a set of different
      clusterings at supposedly different levels  of  granularity
      (in conjunction with clmdist).

      clmformat - display clusters suitable for scrutinizing.

      clmresidue - extend a clustering of a subgraph onto a clus-
      tering of the larger graph.

      clmimac - interpret MCL iterands as clusterings. The  clus-
      terings associated with early iterands may contain overlap,
      should you be interested therein.

      mclpipeline - set up a pipeline from the data parsing
      stage through to the clustering format/display stage.

      mcxassemble  -  assemble  a  matrix/graph from partial edge
      weight scores. Useful intermediate format to be  used  when
      transforming  application  specific  data into an mcl input
      matrix.

      mcxsubs - compute a submatrix of a given matrix, where  row
      and  column index sets can be specified as lists of indices
      combined with list of clusters in a given clustering.  Use-
      ful for inspecting local cluster structure.

      mcxmap - relabel indices in a graph.

      mcl's home at http://micans.org/mcl/.

  REFERENCES
      Stijn van Dongen, Graph Clustering by Flow Simulation.  PhD
      thesis, University of Utrecht, May 2000.
      http://www.library.uu.nl/digiarchief/dip/diss/1895620/inhoud.htm

      Stijn  van Dongen. A cluster algorithm for graphs.  Techni-
      cal Report INS-R0010, National Research Institute for Math-
      ematics and Computer Science in the Netherlands, Amsterdam,
      May 2000.
      http://www.cwi.nl/ftp/CWIreports/INS/INS-R0010.ps.Z

      Stijn van  Dongen.  A  stochastic  uncoupling  process  for
      graphs.   Technical  Report  INS-R0011,  National  Research
      Institute for  Mathematics  and  Computer  Science  in  the
      Netherlands, Amsterdam, May 2000.
      http://www.cwi.nl/ftp/CWIreports/INS/INS-R0011.ps.Z

      Stijn van Dongen. Performance criteria for graph clustering
      and Markov cluster experiments. Technical Report INS-R0012,
      National  Research  Institute  for Mathematics and Computer
      Science in the Netherlands, Amsterdam, May 2000.
      http://www.cwi.nl/ftp/CWIreports/INS/INS-R0012.ps.Z

      Enright A.J., Van Dongen S., Ouzounis  C.A.   An  efficient
      algorithm  for  large-scale  detection of protein families,
      Nucleic Acids Research 30(7):1575-1584 (2002).

  NOTES
      This  page  was  generated   from   ZOEM   manual   macros,
      http://micans.org/zoem.  Both  html  and  roff pages can be
      created from the same source without having to bother  with
      all the usual conversion problems, while keeping some level
      of sophistication in the typesetting.



  mcl 1.003, 03-276           3 Oct 2003                     mcl(1)
