


                       The Network Queueing System

                            Brent A. Kingsbury

                            Sterling Software
                  1121 San Antonio Road, Palo Alto 94303

                                 _A_B_S_T_R_A_C_T



            This  paper  describes  the  implementation  of  a
            networked,  UNIX  based  queueing system developed
            for  a  government  contract  with  the   National
            Aeronautics  and Space Administration (NASA).  The
            system discussed supports both  batch  and  device
            requests,  and  provides  the facilities of remote
            queueing, request routing,  remote  status,  queue
            access  controls,  batch  request  resource  quota
            limits, and remote output return.

       1.  _O_r_i_g_i_n_s

       The invention of  the  _N_e_t_w_o_r_k  _Q_u_e_u_e_i_n_g  _S_y_s_t_e_m  (NQS)  was
       driven by the need for a good UNIX batch and device queueing
       facility capable of supporting such requests in a  networked
       environment  of  UNIX  machines.  More specifically, NQS was
       developed as part of an effort aimed  at  tying  together  a
       diverse  assortment  of  UNIX  based  machines into a useful
       computational complex for the National Aeronautics and Space
       Administration (NASA).

       Today, this computational complex is officially known as the
       _N_u_m_e_r_i_c_a_l  _A_e_r_o_d_y_n_a_m_i_c  _S_i_m_u_l_a_t_o_r _P_r_o_c_e_s_s_i_n_g _S_y_s_t_e_m _N_e_t_w_o_r_k,
       otherwise known as the NPSN.  The assorted machines in  this
       network  are  of varying manufacture, and (as of the time of
       this writing) include Digital Equipment  Corporation  VAXes,
       Silicon Graphics Irises, large Amdahl 5840 mainframes, and a
       Cray Research Incorporated CRAY-2.  Each of the machines  in
       the network runs its own vendor-supplied version of the UNIX
       operating system, with  appropriate  kernel  and  user-space
       extensions as necessary.

       The presence of UNIX on  all  of  these  machines  has  made
       possible  the  creation  of a common user interface, so that
       despite the obvious hardware differences, users  can  freely
       move  among the different machines of the NPSN without being
       confronted with entirely  different  software  environments.
       As  part  of  this  common  user  interface,  NQS  has  been
       implemented as a collection of _u_s_e_r-_s_p_a_c_e programs providing
       the required batch and device queueing capabilities for each
       machine in the network.






                            PRELIMINARY DRAFT                     1







                            PRELIMINARY DRAFT               4/29/92



       2.  _D_e_s_i_g_n__G_o_a_l_s

       NQS was architected and written with  the  following  design
       goals in mind:

          +o Provide for the full support of both _b_a_t_c_h  and  _d_e_v_i_c_e
            requests.  A _b_a_t_c_h _r_e_q_u_e_s_t is defined as a shell script
            containing commands not requiring the  direct  services
            of  some physical device (other than the CPU resource),
            that  can  be  executed  independently  of   any   user
            intervention   by  the  invocation  of  an  appropriate
            command  interpreter  (e.g.  /bin/csh,  /bin/sh).    In
            contrast,  a  _d_e_v_i_c_e  _r_e_q_u_e_s_t  is  defined  as a set of
            independent instructions requiring the direct  services
            of a specific device for execution (e.g. a line printer
            request).

          +o Support all of the resource quotas enforceable  by  the
            underlying UNIX kernel implementation that are relevant
            to any particular batch request, and its  corresponding
            batch queue.

          +o Support the remote queueing and routing  of  batch  and
            device  requests  throughout  the  network  of machines
            running NQS.  This means that some mechanism must exist
            to reliably transport batch and device requests between
            distinct machines, even if one or both of the  machines
            involved crash repeatedly during the transaction.

          +o Modularize all of the request scheduling algorithms  so
            that  the NQS request schedulers can be easily modified
            on an installation by installation basis, if necessary.

          +o Support queue access restrictions whereby the right  to
            submit  a batch or device request to a particular queue
            can be controlled, in the form  of  a  user  and  group
            access list for any queue.

          +o Support networked output return, whereby the _s_t_d_o_u_t and
            _s_t_d_e_r_r  files of any batch request can be returned to a
            possibly remote machine.

          +o Allow  for  the  mapping  of  accounts  across  machine
            boundaries.  Thus, the account winston  on  the  machine
            called Amelia might be mapped to the account chandra  on
            the machine called Orville.

          +o Provide  a   friendly   mechanism   whereby   the   NQS
            configuration on any particular machine can be modified
            without having to resort  to  the  editing  of  obscure
            configuration files.






          +o Support status operations across the network so that  a
            user  on one machine can obtain information relevant to
            NQS on another machine, without requiring the  user  to
            log in on the target remote machine.

          +o Provide a design for the future implementation of  file
            staging,  whereby several files or file hierarchies can
            be _s_t_a_g_e_d in or out  of  the  machine  that  eventually
            executes  a  particular batch request.  For files being
            _s_t_a_g_e_d-_i_n, this implies that a _c_o_p_y of the file must be
            constructed  on  the  execution  machine,  prior to the
            execution of the batch request.  Such files  must  then
            be  deleted  upon  the completion of the batch request.
            For files being _s_t_a_g_e_d-_o_u_t,  this  implies  the  actual
            movement of the file from the _e_x_e_c_u_t_i_o_n machine, to the
            eventual destination machine.


       3.  _I_m_p_l_e_m_e_n_t_a_t_i_o_n__S_t_r_a_t_e_g_i_e_s

       Before dashing off to implement NQS completely from scratch,
       a  long  look was taken at an already existing UNIX queueing
       system known as the _M_u_l_t_i_p_l_e _D_e_v_i_c_e _Q_u_e_u_e_i_n_g _S_y_s_t_e_m  (MDQS),
       as developed at the U.S. Army Ballistic Research  Laboratory
       [1].

       At one  point,  it  was  even  decided  that  NQS  could  be
       implemented  as  an  enhanced  version  of  MDQS,  borrowing
       heavily from the original MDQS source  code.   Theoretically
       at  least, this strategy was supposed to reduce the work and
       risk involved in building a networked queueing  system  that
       would  satisfy  NASA's  needs.   This  thinking  lasted long
       enough for an early design document to be written  detailing
       the modifications to be made under such a plan.

       The plan however was later abandoned, when it was recognized
       that  the  new  code  required  for  the proposed extensions
       exceeded the size of the already existing MDQS code.  Rather
       than  heap  unwieldy  extensions upon a frame never designed
       for such weight, NQS  was  built  completely  from  scratch.
       This  new  strategy  allowed  for  the construction of a new
       framework from which to hang new ideas, along with  many  of
       the  concepts  included in MDQS.  NQS is therefore something
       old, and something new.













       4.  _T_h_e__N_Q_S__L_a_n_d_s_c_a_p_e

       This section of the paper describes the general  design  and
       concepts  of  NQS.  It must be understood that NQS continues
       to be developed.  This  paper  discusses  only  the  current
       state  of  affairs,  with  occasional  pointers  referencing
       future areas of improvement.

       4.1  _T_h_e__Q_u_e_u_e__a_n_d__R_e_q_u_e_s_t__M_o_d_e_l

       In order to provide support for the  two  request  types  of
       _b_a_t_c_h  and  _d_e_v_i_c_e,  NQS implements two distinctly different
       queue types, with the respective type  names  of  _b_a_t_c_h  and
       _d_e_v_i_c_e.  Only _b_a_t_c_h _q_u_e_u_e_s are allowed to accept and execute
       _b_a_t_c_h _r_e_q_u_e_s_t_s.  Similarly, _d_e_v_i_c_e _q_u_e_u_e_s are  only  allowed
       to accept and execute _d_e_v_i_c_e _r_e_q_u_e_s_t_s.

       In addition to the first two queue types, a third queue type
       known  as a _p_i_p_e _q_u_e_u_e exists to transport requests to other
       batch, device, or pipe queues  at  possibly  remote  machine
       destinations.  Readers familiar with MDQS will note that the
       implementation of three  distinctly  different  queue  types
       differs  substantially  from  the  MDQS philosophy of having
       only one queue type.
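       The queue-type compatibility rules just described can be
       sketched as follows (Python is used purely for illustration;
       all names are invented, and NQS itself is not written in
       Python):

```python
# Sketch of the NQS queue/request type rules: batch queues accept
# only batch requests, device queues only device requests, and pipe
# queues transport requests of either type.  Illustrative only.

def queue_accepts(queue_type, request_type):
    if queue_type == "pipe":
        return request_type in ("batch", "device")
    return queue_type == request_type
```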

       4.1.1  _B_a_t_c_h__Q_u_e_u_e_s

       The first queue type implemented in NQS is  called  a  _b_a_t_c_h
       _q_u_e_u_e.  As stated earlier, NQS batch queues are specifically
       implemented to run only _b_a_t_c_h _r_e_q_u_e_s_t_s.

       4.1.1.1  _B_a_t_c_h__Q_u_e_u_e__Q_u_o_t_a__L_i_m_i_t_s

       It is useful to be able to place limits on  the  amounts  of
       different  resources that a batch request can consume during
       execution.  Towards that end,  NQS  batch  queues  have  an
       associated set of resource quota limits that all  other  NQS
       queue types lack.

       For a batch request to  be  queued  in  a  particular  batch
       queue, any resource quota limits defined by the request must
       be _l_e_s_s _t_h_a_n _o_r _e_q_u_a_l _t_o the corresponding limit as  defined
       for  the  target  batch  queue.  If a batch request fails to
       specify a particular resource limit value for which a  limit
       is  enforceable  by the underlying UNIX implementation, then
       the queued batch request inherits the corresponding limit as
       defined for the target batch queue.
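       As an illustration of these quota rules, the following sketch
       (with hypothetical limit names) computes the frozen limit set
       for a request at queueing time, or refuses the request:

```python
# A minimal sketch of the queueing-time quota check: a request's
# explicit limits must be less than or equal to the queue's, and any
# limit the request leaves unspecified is inherited from the queue.
# Limit names are hypothetical.

def admit(request_limits, queue_limits):
    """Return the frozen limit set for the request, or None if some
    explicit request limit exceeds the corresponding queue limit."""
    frozen = {}
    for name, queue_limit in queue_limits.items():
        explicit = request_limits.get(name)
        if explicit is None:
            frozen[name] = queue_limit      # inherit the queue's limit
        elif explicit <= queue_limit:
            frozen[name] = explicit         # keep the request's limit
        else:
            return None                     # refuse to queue the request
    return frozen
```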

       If a resource limit associated with a batch queue  is  later
       lowered   by  a  system  administrator,  then  all  requests
       residing in the queue with a quota limit  _g_r_e_a_t_e_r  than  the






       new  corresponding  quota  limit,  are  given  a _g_r_a_n_d_f_a_t_h_e_r
       clause (and the adjusting system administrator  is  notified
       accordingly).   This  example  illustrates   an   important
       principle enforced in NQS: the set of limits  under  which  a
       batch request is to run is determined _a_n_d _f_r_o_z_e_n at the time
       that the batch request is first queued  in  its  destination
       batch queue.

       4.1.1.2  _S_p_a_w_n_i_n_g__a__B_a_t_c_h__R_e_q_u_e_s_t

       The actual execution  of  a  batch  request  is  a  somewhat
       complicated affair.  First, a batch request may require that
       the output files of _s_t_d_e_r_r  and  _s_t_d_o_u_t  be  spooled  to  a
       possibly  remote  machine  destination.  In order to do this
       safely, a temporary version of the output files  is  created
       in a protected location known to NQS.

       Second,  any  additional  environment  variables  optionally
       exported with the request from the originating (and possibly
       remote) host are placed in  the  environment  set  for  the
       shell that is about to be _e_x_e_ced.

       Third, based on any request  shell  specifications  and  the
       shell  strategy  policy  at the local host, the proper shell
       (e.g. /bin/csh, /bin/ksh, /bin/sh, etc.)  is chosen (see the
       _B_a_t_c_h  _R_e_q_u_e_s_t  _S_h_e_l_l _S_t_r_a_t_e_g_i_e_s section below).  The chosen
       shell  will  be  spawned  as  a   _l_o_g_i_n   shell,   virtually
       indistinguishable  from  the  shell  that  the request owner
       would  have  gotten  had  they  logged  directly  into   the
       execution machine.

       Fourth, all of the  resource  limits  as  supported  by  the
       underlying  UNIX operating system implementation are applied
       to the new shell process, as determined for the  request  at
       the time it was first queued in the batch queue.

       After the resource limits  have  been  applied,  the  proper
       shell is _e_x_e_ced, and the shell script that defines the batch
       request is actually executed.  Upon completion, the  spooled
       output  files  of  _s_t_d_e_r_r  and  _s_t_d_o_u_t are returned to their
       possibly remote machine destinations.
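       The four steps above can be summarized by the following
       dry-run sketch.  A real implementation would create the spool
       files, apply the limits with setrlimit(), and exec the shell;
       this illustrative version (with an invented spool path and
       invented field names) merely records the plan:

```python
# Dry-run sketch of spawning a batch request.  Nothing here is taken
# from the NQS sources; the spool path and request fields are
# hypothetical.

def plan_spawn(request, base_env, choose_shell):
    # 1. Temporary stdout/stderr files in a protected spool directory.
    spool = "/usr/spool/nqs/private/%s" % request["id"]  # hypothetical path
    # 2. Merge any environment variables exported with the request.
    env = dict(base_env)
    env.update(request.get("env", {}))
    # 3. Choose the shell per the request and the local shell strategy.
    shell = choose_shell(request.get("shell"))
    # 4. Apply the limits frozen at queueing time, then exec the shell
    #    as a login shell running the request script.
    return [("spool", spool),
            ("limits", request["frozen_limits"]),
            ("exec", shell, env)]
```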

       4.1.1.3  _B_a_t_c_h__Q_u_e_u_e__R_u_n__L_i_m_i_t_s

       To prevent the local host from being  swamped  with  running
       batch  requests,  some  mechanism  must exist to prevent too
       many batch requests from running at any single  given  time.
       Currently,   this   mechanism   is   quite  simple,  and  is
       implemented by the presence of two batch request _r_u_n _l_i_m_i_t_s.








       The first batch request run limit is global in  nature,  and
       places  a  ceiling  on  the maximum number of batch requests
       allowed to execute simultaneously on the local host.

       The second batch request run limit is applied at  the  queue
       level,  and  places a ceiling on the maximum number of batch
       requests allowed to execute simultaneously _i_n _t_h_e _c_o_n_t_a_i_n_i_n_g
       _b_a_t_c_h _q_u_e_u_e.

       When a batch request completes execution, the entire set  of
       batch queues is traversed in order of decreasing batch queue
       priority.  For each batch queue in the order traversed,  any
       eligible  batch  requests are spawned until either the queue
       run limit is reached, or the global batch request run  limit
       is reached.  If no more requests can  be  spawned  for  the
       batch queue under scrutiny and the total number  of  running
       batch requests is still less than the global  batch  request
       run limit, then the next  lower  priority  batch  queue  is
       examined by the same algorithm, until all of the batch queues
       have been examined.
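       The traversal just described can be sketched as follows
       (illustrative names; for simplicity, each queue is assumed to
       start with no running requests of its own):

```python
# Sketch of the batch run-limit traversal: queues are visited in
# order of decreasing priority, spawning eligible requests until the
# per-queue run limit or the global run limit stops them.

def schedule(queues, global_run_limit, running):
    """queues: list of (priority, queue_run_limit, queued_requests).
    Returns the number of requests spawned from each queue, in the
    order the queues were visited."""
    spawned = []
    for priority, queue_limit, queued in sorted(queues, key=lambda q: -q[0]):
        started = 0
        while (started < queued and started < queue_limit
               and running < global_run_limit):
            started += 1
            running += 1
        spawned.append(started)
    return spawned
```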

       So far, this simple run limit scheme  has  sufficed  as  the
       only  tool  to  control  the running batch request execution
       load.   Since  batch  requests  can  vary  widely  in  their
       consumption  of  resources,  additional  more  sophisticated
       control mechanisms limiting  the  number  of  simultaneously
       executing batch requests may be required in the future.

       4.1.2  _D_e_v_i_c_e__Q_u_e_u_e_s

       _D_e_v_i_c_e _q_u_e_u_e_s represent the second queue type implemented in
       NQS.   Unlike  their  sibling _b_a_t_c_h _q_u_e_u_e_s, device queues do
       not have a set of associated resource quota limits.   Device
       queues  do  however  have a set of associated _d_e_v_i_c_e_s, which
       batch queues do not have.

       4.1.2.1  _D_e_v_i_c_e_s

       For each _d_e_v_i_c_e _q_u_e_u_e, there exists a set  of  one  or  more
       devices  to  which requests entering the device queue can be
       sent for  execution.   Each  such  device  in  turn  has  an
       associated  server,  which  constitutes  the program that is
       always spawned to handle a request  that  is  given  to  the
       device for execution.

       Any imaginable _q_u_e_u_e-_t_o-_d_e_v_i_c_e _m_a_p_p_i_n_g  can  be  configured.
       In  general,  _N  device queues can be configured to "feed" _M
       devices.  The only restriction placed on the values of _N and
       _M is the obvious one  that  their  respective  values  be
       greater than or equal to zero (note that it is possible  for
       a  device  queue  to exist without _a_n_y devices in its device






       set, though such a queue is useless).  It is  even  possible
       to have multiple device queues feeding the same device.

       4.1.2.2  _S_p_a_w_n_i_n_g__a__D_e_v_i_c_e__R_e_q_u_e_s_t

       When an NQS device completes the task of handling  a  device
       request  or  is  found to be idle after a device request has
       been recently queued, all of the device queues  that  "feed"
       the  device  are  scanned to determine if they have a queued
       request that can be handled by the device.   Like  MDQS,  an
       NQS  device  request  can  specify  that a particular device
       _f_o_r_m_s type be used to execute the  request.   For  a  queued
       device  request  to  be  deemed  eligible for execution by a
       particular device, any forms specified by the  request  must
       match the forms defined for the device.  If the request does
       not specify a forms  type,  then  it  is  assumed  that  the
       request can be satisfied by any device in the mapping set of
       the queue containing the request.

       If two or more queues are found to contain  a  request  that
       can  be  executed  by the newly idled device, then the first
       available request from the device queue with the numerically
       higher _q_u_e_u_e _p_r_i_o_r_i_t_y is chosen.  If two or more such queues
       have the same _q_u_e_u_e _p_r_i_o_r_i_t_y, then the queues  are  serviced
       in the classic "round-robin" fashion.
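       The eligibility and selection rules above can be sketched as
       follows; the round-robin bookkeeping for equal priorities is
       elided, and all names are illustrative:

```python
# Sketch of device-request selection: a request is eligible for a
# device if it specifies no forms, or if its forms match the device's.
# Among the feeding queues, the numerically higher priority wins.

def first_eligible(requests, device_forms):
    """First queued request matching the device's forms (or
    specifying no forms at all)."""
    for request in requests:
        forms = request.get("forms")
        if forms is None or forms == device_forms:
            return request
    return None

def pick_request(device_forms, feeding_queues):
    """feeding_queues: list of (queue_priority, queued_requests)."""
    best_priority, best = None, None
    for priority, requests in feeding_queues:
        candidate = first_eligible(requests, device_forms)
        if candidate is not None and (best_priority is None
                                      or priority > best_priority):
            best_priority, best = priority, candidate
    return best
```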

       4.1.2.3  _D_e_v_i_c_e__Q_u_e_u_e__R_u_n__L_i_m_i_t_s

       As with batch queues, some mechanism must exist to keep  the
       number   of  simultaneously  running  device  requests  from
       swamping the local host.   Unlike  a  batch  queue  however,
       device  queues  do not have an associated run limit.  Device
       queues are instead throttled by  their  associated  devices,
       which   can   be   disabled   as   necessary   by  a  system
       administrator.

       4.1.3  _P_i_p_e__Q_u_e_u_e_s

       _P_i_p_e _q_u_e_u_e_s represent the third queue  type  implemented  in
       NQS, and are responsible for routing and delivering requests
       to other (possibly remote) queue destinations.  Pipe  queues
       derive  their  name  from  their  conceptual similarity to a
       _p_i_p_e_l_i_n_e, transporting requests to other queue destinations.

       4.1.3.1  _P_i_p_e__Q_u_e_u_e_s__a_n_d__R_e_q_u_e_s_t__T_r_a_n_s_p_o_r_t

       Differing from both batch and device queues, pipe queues  do
       not  have  any  associated  quota  limits  or devices.  Pipe
       queues do however have a set of associated  _d_e_s_t_i_n_a_t_i_o_n_s  to
       which  they  route  and  deliver requests.  Pipe queues also
       differ from their sibling batch and device queues,  in  that






       they can accept _b_o_t_h batch and device requests.

       With each pipe queue, there is an associated server that  is
       spawned  to  handle each request released from the queue for
       routing and delivery.  Ironically, the spawned instance of a
       pipe queue server is called a _p_i_p_e _c_l_i_e_n_t, due to the use of
       the word _s_e_r_v_e_r in the context of  a  _c_l_i_e_n_t/_s_e_r_v_e_r  network
       connection.

       Thus,  when  a  pipe  queue  request  requires  routing  and
       delivery   to  some  destination  of  the  pipe  queue,  the
       associated pipe queue server is spawned as  a  _p_i_p_e  _c_l_i_e_n_t,
       which   must  then  route  and  deliver  the  request  to  a
       destination.  For each attempted  remote  destination,  this
       requires  the  creation  of  a _n_e_t_w_o_r_k _s_e_r_v_e_r process on the
       remote host acting as an agent on behalf of the  pipe  queue
       request.   The  choice  of the term _p_i_p_e _c_l_i_e_n_t allows us to
       use the standard _c_l_i_e_n_t/_s_e_r_v_e_r  vocabulary  when  discussing
       the  queueing  and  delivery  of  a  pipe queue request to a
       remote host.

       4.1.3.2  _S_p_a_w_n_i_n_g__a__P_i_p_e__R_e_q_u_e_s_t

       When a _p_i_p_e  _c_l_i_e_n_t  is  spawned  to  route  and  deliver  a
       request,   it  is  given  complete  freedom  to  choose  any
       destinations from the destination  set  configured  for  the
       pipe  queue, as possible destinations for the request.  If a
       selected destination does not accept the request,  then  the
       pipe  client  is  free  to  try  another destination for the
       request.

       It is quite possible for a request to be rejected by all but
       one  of  the possible destinations defined for a pipe queue.
       It is not necessary to find  many  destinations  willing  to
       accept  the  request.   Only  one accepting destination need
       exist for the pipe queue request to be handled successfully.

       It is also possible for every single destination of  a  pipe
       queue  to  reject  the  request for reasons which are deemed
       permanent in nature (e.g.  all  of  the  destination  queues
       reside  on  remote machines where the request owner does not
       have access to an account).  In such situations, the request
       is  deleted, and mail is sent to the request owner informing
       him or her of the demise of their request.

       Requests can be rejected by a destination for a plethora  of
       reasons,   including   remote   host  failures,  queue  type
       disagreements with the request type, lack of  request  owner
       account  authorization  at  the  remote  queue  destination,
       insufficient queue space, or any  one  of  a  hundred  other
       reasons  including  the  simple  problem  of the destination






       queue being disabled (unable to accept any new requests).

       Some of the  reasons  for  a  destination  rejection  denote
       retriable  events  (the  effort  to queue the request at the
       destination may succeed if tried later).  Examples  of  this
       kind of failure include the destination queue being disabled
       (the system administrators at the destination may enable  it
       at some later time), and machine failures (the  destination
       machine is down, but might be rebooted in the future).

       Other destination rejection reasons are more permanent  such
       as  the  lack  of proper account authorization at the remote
       destination, or request and  destination  type  disagreement
       (the  request  is a device request, and the destination is a
       batch queue for instance).

       Due to the tremendous number of ways in which a request  can
       be  rejected  by  a  queue  destination, there is an equally
       tremendous  amount  of  logic  incorporated  into  NQS  that
       attempts  to deal with the situation.  Some failures require
       that queue destinations be disabled for some  finite  amount
       of time after which the destination is considered retriable.
       All failures of the retriable variety require that the  pipe
       queue  request  be  requeued  and delayed for some amount of
       time, after which an attempt is made to reroute the request.
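       The disposition logic described above might be sketched as
       follows; the rejection reason names are invented, and NQS
       itself distinguishes many more cases:

```python
# Sketch of how a pipe client disposes of a request after trying its
# destinations: any retriable rejection causes the request to be
# requeued and delayed, while uniformly permanent rejections cause
# the request to be deleted and its owner notified by mail.

RETRIABLE = {"queue_disabled", "host_down", "queue_full"}

def dispose(accepted, rejections):
    """accepted: True once any destination accepts the request.
    rejections: reasons collected from destinations that refused."""
    if accepted:
        return "delivered"
    if any(reason in RETRIABLE for reason in rejections):
        return "requeue_and_delay"      # reroute after a delay
    return "delete_and_mail_owner"      # every failure was permanent
```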

       Even the successful case  of  a  request  being  tentatively
       accepted  by a queue destination is fraught with complexity,
       since one or both machines involved in the  transaction  may
       crash at any time.

       In summary, pipe  queues  are  both  powerful  and  complex.
       Since  the  pipe  client  configured with each pipe queue is
       allowed  to  choose  which  destinations  to  try  from  the
       destination  set,  it  is  possible to implement a crude but
       effective request class  mechanism.   The  pipe  client  can
       examine the request, and then  choose  the  destination  queue
       most appropriate for it.
       Thus,  "large"  batch requests queued in a pipe queue can be
       delivered to batch queues which may run only at night, while
       "small"  batch  requests  can  be  delivered  to  fast batch
       queues, which run with a  UNIX  _n_i_c_e  execution  value  that
       gives  high  compute  priority,  while keeping a small upper
       limit on CPU time and maximum file size for the request.

       When a pipe queue is used as a request class mechanism, it is
       wise  to  define  the  target  destination  queues  with the
       attribute of _p_i_p_e_o_n_l_y,  which  prevents  any  requests  from
       being  queued  in such queues unless the requests are queued
       _f_r_o_m _a_n_o_t_h_e_r _p_i_p_e _q_u_e_u_e.  In this  way,  the  request  class
       policies implemented by the pipe queue and associated server






       (pipe client) can be strictly enforced.

       Pipe queues also help to ameliorate the unreliability of the
       surrounding  network  and  machines.   Even  if  the  proper
       destination machine is down or unreachable, the  pipe  queue
       mechanism can requeue the request and deliver it later, when
       the destination machine and connecting network are  restored
       to operation.

       4.1.3.3  _P_i_p_e__Q_u_e_u_e__R_u_n__L_i_m_i_t_s

       To prevent pipe queues from flooding the host system with an
       overly  large  number  of simultaneously running pipe client
       processes, a mechanism identical  to  that  implemented  for
       batch queues is employed.

       4.1.4  _R_e_q_u_e_s_t__S_t_a_t_e_s

       In the previous sections,  we  have  described  the  general
       request  and  queue  type concepts implemented in NQS.  This
       section descends the staircase of detail,  focusing  on  the
       different states that a request can go through, all the  way
       from its initial creation to its ultimate execution.

       A request residing within an NQS queue  can  be  in  one  of
       several  states.   First of all, the request may actually be
       _r_u_n_n_i_n_g.  This request state exists for requests residing in
       batch  and  device  queues,  and implies that the request is
       presently being executed.  The analogous request  state  for
       requests  residing  within  a  pipe queue is termed _r_o_u_t_i_n_g,
       since the request is not actually  running,  but  is  rather
       being routed and delivered to another queue destination.

       The second (and most  common)  request  state,  is  what  is
       termed  the  _q_u_e_u_e_d state.  A request in the queued state is
       completely ready to enter the _r_u_n_n_i_n_g or _r_o_u_t_i_n_g states.

       The third request state describes the condition in  which  a
       request  is  waiting  for some finite time interval to pass,
       after which it will enter  one  of  the  states  of  queued,
       running,  or  routing.   This  request state is known as the
       _w_a_i_t_i_n_g state.

       The fourth request state is known  as  the  _a_r_r_i_v_i_n_g  state.
       All  requests  in  the  arriving state are in the process of
       being queued from  another  (possibly  remote)  pipe  queue.
       When  completely  received  they will enter one of the other
       states of _w_a_i_t_i_n_g, _q_u_e_u_e_d, _r_u_n_n_i_n_g, or _r_o_u_t_i_n_g.
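       The request states described in this section can be
       summarized as a small transition table.  This is an
       illustration, not the actual NQS state machine; the
       unimplemented holding and staging states are omitted:

```python
# Transition table for the implemented request states, as described
# in this section: arriving requests may enter any other state,
# waiting requests become queued, running, or routing, and queued
# requests become running (batch/device) or routing (pipe).

TRANSITIONS = {
    "arriving": {"waiting", "queued", "running", "routing"},
    "waiting":  {"queued", "running", "routing"},
    "queued":   {"running", "routing"},
}

def may_enter(current, target):
    return target in TRANSITIONS.get(current, set())
```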

       There are also three additional request states that are  not
       implemented  in  the current version of NQS.  The first such






       state is known as  the  _h_o_l_d_i_n_g  state,  and  describes  the
       condition in which an operator, a user, or both have applied
       a _h_o_l_d to the given request.  Such a request is frozen,  and
       cannot  exit  the  hold state unless all holds applied by an
       operator or user have been released.

       The second and third unimplemented  request  states  concern
       the  batch  request  states  of _s_t_a_g_i_n_g-_i_n, and _s_t_a_g_i_n_g-_o_u_t.
       These states will not be implemented, unless the demand  for
       the  facility of file staging increases, since it is already
       possible to use the remote file copy commands in  the  shell
       script  that  constitutes  a  batch  request,  to  copy  the
       requisite files to and from the execution  machine  for  the
       request.  The advantage of implementing file staging is that
       NQS can use a transaction mechanism to prevent the execution
       of  a  batch request, until all of the input files have been
       staged-in to the local host.  In this way, crashes of remote
       machines cannot cause a batch request to fail.  Output files
       could be similarly staged.

       4.2  _M_o_r_e__L_a_n_d_s_c_a_p_i_n_g

       The previous major section described the queue  and  request
       model  implemented  in  NQS.   This  section  of  the  paper
       describes the implementation of queue access controls, batch
       request   quota  limits,  batch  request  shell  strategies,
       request transaction states, the  networking  implementation,
       account mapping across machine boundaries, NQS configuration
       control,  status  operations,  and   the   possible   future
       implementation of file staging.

       4.2.1  _Q_u_e_u_e__A_c_c_e_s_s__C_o_n_t_r_o_l_s

       In any  reasonable  queueing  system,  it  is  necessary  to
       provide  for the configuration of queue access restrictions.
       Without such restrictions, there would be no way to  prevent
       every  user of the machine from submitting their requests to
       the fastest queue with the  highest  priority  and  resource
       limits  on  the  machine.   Thus,  NQS supports queue access
       restrictions.

       For  each  queue,  access  may  be  either  _u_n_r_e_s_t_r_i_c_t_e_d  or
       _r_e_s_t_r_i_c_t_e_d.   If  access  is  _u_n_r_e_s_t_r_i_c_t_e_d,  any request may
       enter the queue.  If access is _r_e_s_t_r_i_c_t_e_d,  then  a  request
       can only enter the queue if the requester's login _u_s_e_r-_i_d or
       login _g_r_o_u_p-_i_d is defined in the access set for  the  target
       queue.

       All such access permissions are always defined  relative  to
       user  and  group definitions present on the local host.  The
       restriction that all user and group references  be  relative



                            PRELIMINARY DRAFT                    11







                            PRELIMINARY DRAFT               4/29/92



       to  the local host is not a problem, since request ownership
       mapping is  performed  whenever  a  request  is  transported
       across  a  machine boundary (see the _A_c_c_o_u_n_t _M_a_p_p_i_n_g section
       below).

       Lastly,  an  additional  queue  access  parameter  known  as
       _p_i_p_e_o_n_l_y can be defined for any queue.  The presence of this
       queue access attribute prevents requests from being directly
       placed  within the queue by one of the user commands used to
       submit an NQS request.  Queues with the  _p_i_p_e_o_n_l_y  attribute
       can  only accept requests queued via another pipe queue.  As
       outlined in the summary  of  the  _S_p_a_w_n_i_n_g  _a  _P_i_p_e  _R_e_q_u_e_s_t
       section,  this  attribute  makes  is possible to implement a
       simple request execution class facility.
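
       The admission test just described can be  sketched  as  fol-
       lows.  NQS itself is written in C; this Python fragment, with
       invented names and data layout, only illustrates the logic:

```python
# Hypothetical sketch of the NQS queue admission test described
# above; the names and data layout are invented for illustration.

def may_enter(queue, uid, gid, via_pipe_queue):
    # A pipeonly queue accepts requests only from another pipe queue.
    if queue.get("pipeonly") and not via_pipe_queue:
        return False
    # Unrestricted queues admit any request.
    if not queue.get("restricted"):
        return True
    # Restricted queues admit a request only when the requester's
    # login user-id or login group-id appears in the access set.
    return uid in queue["users"] or gid in queue["groups"]
```

       The access set here is held as two Python sets;  the  real
       implementation stores local-host user and group definitions.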

       4.2.2  _B_a_t_c_h__R_e_q_u_e_s_t__Q_u_o_t_a__L_i_m_i_t_s

       As mentioned previously, NQS supports an  extensive  set  of
       batch  request  resource  quota limits.  However, NQS cannot
       enforce a batch request  resource  quota  limit  unless  the
       underlying UNIX implementation also supports the enforcement
       of the same limit.  Thus,  the  resource  limit  enforcement
       functions  of NQS have been implemented using an appropriate
       set of #_i_f_d_e_f_s, allowing the system maintainers to configure
       the resource limit functions as appropriate.
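
       The idea of enforcing a limit only when the host can support
       it can be modeled as below.  NQS does this in C  at  compile
       time with #_i_f_d_e_f_s; the Python analogue probes  the  host  at
       run time instead, and the flag-to-limit table is  illustra-
       tive, not the actual NQS source layout:

```python
# Run-time analogue of the #ifdef-style configuration described
# above: a limit is enforced only when the host kernel defines it.
import resource

# Map a qsub limit flag to the corresponding kernel limit name.
# (Illustrative subset; see the full flag list in the text.)
FLAG_TO_RLIMIT = {
    "-lc": "RLIMIT_CORE",   # per-process corefile size
    "-ld": "RLIMIT_DATA",   # per-process data segment size
    "-lf": "RLIMIT_FSIZE",  # per-process file size
    "-lt": "RLIMIT_CPU",    # per-process CPU time
}

def apply_limit(flag, value):
    """Enforce the limit if the kernel supports it; otherwise
    ignore it and run the request anyway, as NQS does."""
    name = FLAG_TO_RLIMIT.get(flag)
    if name is None or not hasattr(resource, name):
        return False                  # unenforceable: silently ignored
    which = getattr(resource, name)
    soft, hard = resource.getrlimit(which)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)      # cannot raise above the hard limit
    resource.setrlimit(which, (value, hard))
    return True
```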

       It must be understood that NQS does not define the interface
       through  which  errant  batch  requests  will be informed of
       their attempts to consume more of a given resource  than  is
       allocated  to  them.   Upon exceeding some limit types, some
       UNIX implementations send a signal to the offending process.
       Other  implementations  may  simply  cause the errant system
       call to fail, with _e_r_r_n_o being set as appropriate.

       If a batch request specifies  the  enforcement  of  a  quota
       limit that is not enforceable at the execution machine, then
       the limit is simply ignored, and the request is run  anyway.
       It is also possible to specify that no limit  be  placed  on
       the usage of a particular resource, for both a batch request
       and a batch queue.

       Lastly, the NQS implementation  of  batch  request  resource
       limits  allows each batch request to specify a _w_a_r_n_i_n_g _l_i_m_i_t
       value for UNIX kernels that allow  processes  to  be  warned
       when  they  are  getting  close to exceeding some hard quota
       limit.  As with hard quota  limits,  the  actual  enforcement
       mechanism  for  warning  limits  is up to the supporting UNIX
       kernel.

       The  full  set  of  batch  request  resource  quota   limits
       recognized  by NQS falls into two principal categories.  The
       first category concerns only those limits applicable to _e_a_c_h
       process   of  the  process  family  comprising  the  running
       request.  This category of limits is known  collectively  as
       the _p_e_r-_p_r_o_c_e_s_s limit set.

       The second category concerns only those limits applicable to
       the entire request.  That is, the consumption of the limited
       resource as consumed by _a_l_l processes comprising the running
       batch request must never exceed the given _p_e_r-_r_e_q_u_e_s_t limit.

       The complete set of batch request quota limits supported  by
       NQS   is  listed  below.   Each  limit  is  shown  with  its
       corresponding _Q_s_u_b(1) command syntax (_Q_s_u_b(1) is the command
       used  to submit an NQS batch request).  The use of the "(P)"
       and "(R)" description in the limit definition indicates  the
       _p_e_r-_p_r_o_c_e_s_s or _p_e_r-_r_e_q_u_e_s_t nature of the limit:

         -lc _l_i_m_i_t            - (P) corefile size limit.
         -ld _l_i_m_i_t [ , _w_a_r_n ] - (P) data segment size limit.
         -lf _l_i_m_i_t [ , _w_a_r_n ] - (P) file size limit.
         -lF _l_i_m_i_t [ , _w_a_r_n ] - (R) file space limit.
         -lm _l_i_m_i_t [ , _w_a_r_n ] - (P) memory size limit.
         -lM _l_i_m_i_t [ , _w_a_r_n ] - (R) memory space limit.
         -ln _l_i_m_i_t            - (P) nice execution priority limit.
         -ls _l_i_m_i_t            - (P) stack segment size limit.
         -lt _l_i_m_i_t [ , _w_a_r_n ] - (P) CPU time limit.
         -lT _l_i_m_i_t [ , _w_a_r_n ] - (R) CPU time limit.
         -lv _l_i_m_i_t [ , _w_a_r_n ] - (P) temporary file size limit.
         -lV _l_i_m_i_t [ , _w_a_r_n ] - (R) temporary file space limit.
         -lw _l_i_m_i_t            - (P) working set limit.

       The present implementation also includes provisions for  the
       additional limits of:

         -l6 _l_i_m_i_t            - (R) tape drive device limit.
         -lP _l_i_m_i_t            - (R) number of processors limit.
         -lq _l_i_m_i_t [ , _w_a_r_n ] - (P) Quick device file size limit.
         -lQ _l_i_m_i_t [ , _w_a_r_n ] - (R) Quick device file space limit.

       These last limits  are  not  presently  supported,  but  are
       instead reserved for future use.  The last two future limits
       of -_l_q, and -_l_Q are reserved  for  defining  limits  on  the
       amount  of  fast  (quick)  file storage to be allocated to a
       process of the running request, and to  the  entire  running
       request.   An example of a fast file storage resource can be
       found in the  solid  state  disk  (SSD)  product  that  Cray
       Research Incorporated supports with their CRAY-XMP series of
       computers.

       4.2.3  _B_a_t_c_h__R_e_q_u_e_s_t__S_h_e_l_l__S_t_r_a_t_e_g_y

       The execution of a batch request requires the creation of  a
       shell  process  to  interpret the shell script which defines
       the batch request.  On many UNIX systems, there is more than
       one  shell available (e.g. /bin/csh, /bin/ksh, /bin/sh).  To
       deal with this problem, NQS allows a shell  pathname  to  be
       specified when a batch request is first submitted.

       If no particular shell is specified for the execution of the
       request,  then  NQS  must  have some other means of deciding
       which shell to use when spawning the request.  The  solution
       to  this  dilemma has been to equip NQS with a _b_a_t_c_h _r_e_q_u_e_s_t
       _s_h_e_l_l _s_t_r_a_t_e_g_y , which can be configured as necessary by the
       local system administrators.

       The batch request shell strategy configured on  a  particular
       system  determines  the  shell  used  to  execute  any  batch
       request on the local host that fails to identify  a  specific
       shell for its execution.  Three such shell strategies can  be
       configured for NQS, and they are known by the names of

            _f_i_x_e_d,
            _f_r_e_e, and
            _l_o_g_i_n.

       A shell strategy of _f_i_x_e_d causes the request to  be  run  by
       the  _f_i_x_e_d _s_h_e_l_l, the pathname of which is configured by the
       system administrator.  Thus, a particular  NQS  installation
       may  be  configured  with  a  _f_i_x_e_d shell strategy where the
       default shell used to execute all batch requests is  defined
       as the Bourne shell.

       A shell strategy of _f_r_e_e  simply  causes  the  user's  login
       shell  (as  defined  in  the  password file), to be _e_x_e_ced).
       This shell is in turn given a pathname to the batch  request
       shell script, and it is the user's login shell that actually
       decides which shell should be used to interpret the  script.
       The  _f_r_e_e  shell  strategy  therefore runs the batch request
       script _e_x_a_c_t_l_y as would an  interactive  invocation  of  the
       script, and is the default NQS shell strategy.

       The third shell strategy of _l_o_g_i_n simply causes  the  user's
       login shell (as defined in the password  file)  to  be  the
       default shell used to  interpret  the  batch  request  shell
       script.

       The strategies of _f_i_x_e_d and _l_o_g_i_n  exist  for  host  systems
       that  are  short  on available free processes.  In these two
       strategies, a single shell is _e_x_e_ced, and that same shell is
       the  shell  that  executes  all of the commands in the batch
       request script (barring shell _e_x_e_c operations  in  any  user
       startup files: .profile, .login, .cshrc).

       In every case, however, the shell that is chosen to  execute
       the  batch  request is always spawned as a login shell, with
       all of the  environment  variables  and  settings  that  the
       request  owner  would  have gotten, had they logged directly
       into the machine.
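
       The selection logic, and the  standard  UNIX  convention  by
       which a shell is made to behave as a login shell (a  leading
       "-" in its argv[0]), can be sketched as  follows.   NQS  does
       this in C; the names below are invented:

```python
# Sketch of the shell strategy selection described above.
import os

def choose_shell(strategy, request_shell, login_shell, fixed_shell):
    """Pick the shell used to spawn a batch request."""
    if request_shell:              # user named a shell at submission
        return request_shell
    if strategy == "fixed":
        return fixed_shell         # administrator-configured pathname
    return login_shell             # 'free' and 'login' both start the
                                   # password-file login shell; they
                                   # differ in what that shell then does

def login_argv0(shell_path):
    """argv[0] for a login-shell invocation, e.g. '-csh', so the
    shell reads the user's startup files as on a direct login."""
    return "-" + os.path.basename(shell_path)
```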

       The shell strategy configured for any  particular  host  can
       always be determined by the NQS _q_l_i_m_i_t command.

       4.2.4  _T_r_a_n_s_a_c_t_i_o_n_s

       The accurate recording of request  state  information  is  a
       sometimes  complicated  affair  within  NQS.   The  need  to
       support some reliable mechanism for the recording of request
       state is particularly critical when an NQS request is in the
       process of being routed and  delivered  to  a  remote  queue
       destination.   It is also necessary to support some reliable
       mechanism for detecting interrupted executions of batch  and
       device  requests  upon  system  restart, so that they can be
       restarted or aborted depending upon the user's wishes.

       To do this, NQS uses the UNIX file system to record  request
       state  information.   On  the  surface, this use of the UNIX
       file  system  to  store  request  state  information   seems
       trivial.  It's not.

       The UNIX file system buffer cache  implementation  of  "lazy
       write I/O" makes the situation almost intolerable, since the
       update   of   request   state   information    must    occur
       _s_y_n_c_h_r_o_n_o_u_s_l_y,  for  many  of the request state transitions.
       That is, there are several instances where the  state  of  a
       particular  request  must  be  accurately  recorded  on  the
       physical disk medium _p_r_i_o_r to continuing  further  with  the
       transaction,  otherwise  reliable  transaction  recovery  is
       impossible.

       The need for synchronous state  updates  becomes  absolutely
       critical  when  an  NQS  pipe  client process is routing and
       delivering a  request  to  a  remote  queue  destination  on
       another  machine.   The  algorithm used to remotely queue an
       NQS request must allow for both  machines  involved  in  the
       transaction   to   crash,   without  leaving  things  in  an
       unrecoverable state.
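
       In outline, the exchange can be modeled as below.  This is a
       schematic of the general two-phase technique, with  invented
       state names, not the actual NQS messages or on-disk  format:

```python
# Schematic model of a two-phase delivery between the NQS pipe
# client (routing machine) and the remote server.

def two_phase_commit(client_log, server_log):
    # Phase 1: the client durably records its intent, then asks
    # the server to durably store (prepare) the request.
    client_log.append("intend-deliver")
    server_log.append("prepared")      # the server is now bound
    # Phase 2: the server's acknowledgement commits the transfer;
    # the client durably records completion and discards its copy.
    client_log.append("delivered")
    return client_log, server_log

def recover(client_log, server_log):
    """After a crash of either machine, the durable logs decide
    the outcome, so the request is neither lost nor run twice."""
    if "delivered" in client_log:
        return "committed"
    if "prepared" in server_log:
        return "ask-server"            # resolve by re-contacting
    return "retry-delivery"
```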

       The algorithm to do this is implemented using a  well  known
       technique  called  the _t_w_o-_p_h_a_s_e _c_o_m_m_i_t _p_r_o_t_o_c_o_l.  While the
       algorithm is quite interesting, space restrictions  prohibit
       a full explanation of the technique here;  the  reader  is
       referred to the text _N_e_s_t_e_d _T_r_a_n_s_a_c_t_i_o_n_s: _A_n  _A_p_p_r_o_a_c_h  _t_o
       _R_e_l_i_a_b_l_e _D_i_s_t_r_i_b_u_t_e_d _C_o_m_p_u_t_i_n_g by Moss [2].

       What  will  be  described  here  however,  is  the   unusual
       mechanism  implemented  in the present version of NQS to get
       around the UNIX file system buffer cache.

       While AT&T System V Release 2 UNIX reportedly  supported  an
       undocumented   flag  in  the  _o_p_e_n(2)  system  call  forcing
       synchronous write operations for the opened file descriptor,
       not all UNIX implementations running on the various machines
       of the NPSN supported this feature.  However, an examination
       of  the UNIX source code as supplied by all of the different
       vendors showed that the _l_i_n_k(2) system call was synchronous,
       to  the  extent  that  the target file inode had either been
       written to disk, or was scheduled to be written to disk upon
       return from the system call.

       Therefore, since the amount of transaction state information
       for   each  request  is  quite  small,  NQS  does  something
       unbelievably strange.  It uses the modification  time  field
       of  protected  and  preallocated  files to store transaction
       state information for each request.

       The update of transaction state information in  this  manner
       is  performed  by  setting  the  modification  time  of  the
       appropriately preallocated file (never  created  or  deleted
       once  NQS  is installed), making a link to the updated inode
       to force its writing to  disk,  followed  by  an  unlink  to
       remove  the  temporary link used to force the I/O operation.
       While the desired synchronous transaction  state  update  is
       accomplished  using  a  mechanism  that  is not very fast or
       efficient, it  does  have  at  least  the  virtue  of  being
       relatively portable.
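
       The sequence just described (set the modification time, make
       a link, remove the link) can be demonstrated as below.   NQS
       performs these steps in C through  the  _u_t_i_m_e(2),  _l_i_n_k(2),
       and _u_n_l_i_n_k(2) system calls; the state encoding used here is
       invented for illustration:

```python
# Demonstration of the transaction-state trick described above:
# a small state code is stored in the modification time of a
# preallocated file, and a link/unlink pair forces the updated
# inode out to disk.
import os
import tempfile

def write_state(path, state_code):
    # Encode the state in the mtime (atime is set to match).
    os.utime(path, (state_code, state_code))
    # Making a hard link forces the updated inode to be written...
    os.link(path, path + ".sync")
    # ...and the temporary link is then removed.
    os.unlink(path + ".sync")

def read_state(path):
    return int(os.stat(path).st_mtime)

def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_state(path, 86400)   # arbitrary example state code
        return read_state(path)
    finally:
        os.unlink(path)
```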

       All of the code involved in setting and reading  transaction
       state  for  a  request is isolated in a very small number of
       NQS  source  modules.   When  a  synchronous  I/O  mechanism
       becomes  supported  as  a  general  UNIX  standard, then the
       implementation of NQS will be changed to take  advantage  of
       it, discarding the atavistic technique described here.

       4.2.5  _N_e_t_w_o_r_k_i_n_g__I_m_p_l_e_m_e_n_t_a_t_i_o_n

       At present, all  NQS  network  conversations  are  performed
       using  the  _B_e_r_k_e_l_e_y  _s_o_c_k_e_t  _m_e_c_h_a_n_i_s_m,  as ported into the
       respective vendor kernels or emulated by other  means.   The
       only  connection  type  used  by  NQS  is  that  of a _s_t_r_e_a_m
       _c_o_n_n_e_c_t_i_o_n, in which NQS assumes that  the  requisite  bytes
       will  be  reliably transmitted to and from the server in the
       order in which they were  written  by  the  underlying  network
       software  of the respective host systems.  Any conversion to
       the use of the _s_t_r_e_a_m_s mechanism as developed by AT&T should
       be extremely straightforward.

       In general, all NQS database information is always stored in
       the form most appropriate for the local host.  If it becomes
       necessary to communicate information to another  remote  NQS
       host,  then  the  information  is  converted  into a network
       format understood by all NQS machines.

       All network conversations performed by NQS are  always  done
       using  the  classic  _c_l_i_e_n_t/_s_e_r_v_e_r  model, in which a client
       process creates a connection to the remote machine  where  a
       server  process  is  created  to act on behalf of the client
       process.

       When this initial connection is created,  some  introductory
       information   is   exchanged   between  the  two  processes.
       Regardless of the transaction to be conducted, the format of
       the  introduction  is  always the same, in which certain key
       "personality"  information  is  transmitted  by  the  client
       process  to  the  remote  server.   Included as part of this
       introductory dialogue are the client's  identity  in  the
       form  of its real user-id and corresponding user name at the
       client host, and the timezone  in  effect  at  the  client's
       machine.
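
       The introductory "personality" information can be modeled as
       a fixed-layout packet in machine-independent  byte  order,
       as sketched below.  The field sizes and layout are invented;
       the actual NQS network format is not reproduced here:

```python
# Model of the introductory exchange described above: real
# user-id, user name, and client timezone, packed in network
# byte order.  Field widths are illustrative.
import struct

FMT = "!I32s32s"   # user-id, user name, timezone string

def pack_intro(uid, user, tz):
    return struct.pack(FMT, uid, user.encode(), tz.encode())

def unpack_intro(data):
    uid, user, tz = struct.unpack(FMT, data)
    return uid, user.rstrip(b"\0").decode(), tz.rstrip(b"\0").decode()
```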

       The parameters of real user-id and user name are both passed
       to  the  server  process,  so  that  the  server can map the
       identity of the client to the  appropriate  account  at  the
       remote server machine.  Although one of these two parameters
       is sufficient, both are passed so that the client mapping at
       the  server  machine  can  be performed by _e_i_t_h_e_r user-id or
       user name, depending upon the implementation at  the  remote
       host.

       The timezone for the client is also passed across  so  that
       future implementations of NQS,  when  performing  remote
       status operations, will properly display event  times  using
       the timezone of the client.

       Lastly, the initial dialogue is the obvious place  in  which
       attempts  can  be  made  by  malevolent users to try to gain
       unauthorized entry to a remote machine.   At  present,  the
       only mechanisms to prevent this are the difficulty of faking
       the NQS protocols, and the requirement that  all  networking
       connections be made from privileged ports that can  only  be
       obtained by privileged root processes.

       4.2.6  _A_c_c_o_u_n_t__M_a_p_p_i_n_g

       When a network connection  is  established  between  an  NQS
       client  process  and a remote NQS server process, an account
       mapping must be performed so that the network server at  the
       remote  machine  can take on the proper identity attributes.
       This mapping is performed for all network conversations.  In
       particular,  the  transport  of  a  batch  or device request
       requires that the ownership of the request  be  adjusted  as
       appropriate,  since  the user-id of the request owner is not
       necessarily the same on all machines.

       This mapping can be performed either by mapping the client's
       host  and  user-id,  or  client's  host and user name to the
       proper account.  In both cases, though, the mapping must  be
       done  by  the  remote  server  machine if there is to be any
       semblance of security.

       The choice of whether to map user-id or user name values was
       the  subject  of  intense  debate.   In  the  beginning, the
       mapping was to have been made by mapping user-ids.  Near the
       very end of the project, it was mandated that the mapping be
       performed by user name, and not user-id.

       The present implementation of NQS has therefore adopted  the
       defensive  position  that the server machine should make the
       decision as to which algorithm to  use  when  performing  an
       account  mapping.   Since  both the user-id and user name of
       the client process are available to the server process  (see
       the  _N_e_t_w_o_r_k_i_n_g  _I_m_p_l_e_m_e_n_t_a_t_i_o_n section), the server can use
       either one when performing the account mapping.

       Beyond the problem of user-id versus user name  mapping,  an
       additional  problem  is  posed  by the need to determine the
       identity of the client's host, irrespective of  the  network
       interface   upon   which  a  connection  is  made.   In  the
       environment of the  NPSN,  there  are  often  at  least  two
       different principal paths by which a machine can be reached.
       The  example  paths  typically  include  the  interfaces  of
       ethernet  and  hyperchannel,  and  lead  to the existence of
       entries in the UNIX  /_e_t_c/_h_o_s_t_s  file  where  the  names  of
       _a_m_e_l_i_a-_h_y  and  _a_m_e_l_i_a-_e_c  denote the two different paths of
       hyperchannel and ethernet to the same machine known  locally
       as _a_m_e_l_i_a.

       NQS however  requires  that  it  be  able  to  tell  without
       ambiguity   that   connections  coming  from  _a_m_e_l_i_a-_h_y  and
       _a_m_e_l_i_a-_e_c denote connections coming from the  _s_a_m_e  _m_a_c_h_i_n_e,
       even though the entries in the /_e_t_c/_h_o_s_t_s file are separate.

       To do this, it was necessary  to  create  the  notion  of  a
       _m_a_c_h_i_n_e-_i_d,  a  number  that  uniquely  identifies  a client
       machine, irrespective  of  the  path  used  to  conduct  the
       network conversation.  Thus, an additional mapping mechanism
       was created to map different  client  host  addresses  to  a
       single unique machine-id.
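
       The mapping can be sketched as a simple table lookup, as be-
       low.  The table contents here are  illustrative;  the  real
       mappings are built with the _n_m_a_p_m_g_r program described next:

```python
# Sketch of the machine-id mapping described above: several
# network addresses (here, host names such as amelia-hy and
# amelia-ec) resolve to one machine-id.

MACHINE_IDS = {
    "amelia-hy": 1,    # hyperchannel path to amelia
    "amelia-ec": 1,    # ethernet path to the same machine
    "boris-ec": 2,
}

def machine_id(host):
    return MACHINE_IDS[host]

def same_machine(host_a, host_b):
    """True when two connection paths lead to the same machine."""
    return machine_id(host_a) == machine_id(host_b)
```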

       Like the user-id versus user name mapping  question,  this
       decision  was  also  caught  in  a maelstrom of controversy.
       When the dust finally settled, the  _m_a_c_h_i_n_e-_i_d  concept  was
       still present in the NQS implementation.  Unfortunately, the
       storm of controversy swept away the tools which  were  going
       to be used to administer the machine-id mappings.  Thus, the
       present implementation provides a rudimentary program called
       _n_m_a_p_m_g_r, which can be used, somewhat painfully, to create the
       requisite machine-id mappings.

       Someone receiving NQS source code for the first  time  would
       do  well  to  either  implement their own machine-id mapping
       mechanism, or polish the present mechanism.

       4.2.7  _C_o_n_f_i_g_u_r_a_t_i_o_n__C_o_n_t_r_o_l

       All of the setup and configuration of  NQS  is  accomplished
       through  the  use of a single configuration program known as
       the _q_m_g_r utility.  This program establishes a connection  to
       the  local  NQS  daemon,  and  transmits  message packets to
       perform the various configuration  commands  implemented  in
       NQS.  This program is quite user-friendly,  and  provides  an
       on-line help facility.

       The use of an intelligent configuration program to  set  up
       and modify NQS on the local machine provides many  benefits,
       one of which is consistency.  One cannot, for  example,  add
       a queue-to-device mapping for  a  non-existent  device  or
       queue.

       When given a particular command such as adding a  device  to
       the  queue-to-device  mapping  set  for some queue, the _q_m_g_r
       utility builds a message update packet which is then sent to
       the  local  NQS daemon for processing.  The local NQS daemon
       then either performs the update or returns an  error  code,
       which the _q_m_g_r program diagnoses.  In  either  case,  the
       final outcome of the command is always  displayed  to  the
       system administrator.

       4.2.8  _S_t_a_t_u_s__O_p_e_r_a_t_i_o_n_s

       All of the obvious status operations are supported  by  NQS,
       including  device,  request,  queue, and limit queries.  The
       latter status operation is used  to  determine  the  set  of
       batch  request resource limits supported by NQS on the local
       machine.

       These status functions are supported by the  respective  NQS
       commands:   _q_d_e_v,  _q_s_t_a_t,  and  _q_l_i_m_i_t, with _q_s_t_a_t providing
       information  about  previously  queued  requests  and  their
       containing queues.

       Due to time constraints, the only status function which  has
       been  networked  is  the  _q_s_t_a_t  command.   As  time becomes
       available, this situation will hopefully be corrected.

       4.2.9  _F_i_l_e__S_t_a_g_i_n_g

       Although file staging is not presently implemented  by  NQS,
       future  versions  of  NQS  may implement such a facility.  A
       thorough examination of the NQS source code will reveal that
       provisions  have  been made for this eventuality in both the
       request transaction state mechanism, and the  batch  request
       data structures.


       5.  _C_o_n_c_l_u_s_i_o_n

       NQS is  only  another  effort  aimed  at  providing  a  more
       complete  queueing  system for a collection of UNIX machines
       operating in a networked environment.

       As mentioned in the _I_m_p_l_e_m_e_n_t_a_t_i_o_n _S_t_r_a_t_e_g_i_e_s  section,  NQS
       was  designed  and  written after a careful examination of a
       previous UNIX queueing system known as MDQS.   It  is  hoped
       that  others  will  now  build on NQS, as NQS has been built
       from ideas in MDQS.


                                _R_E_F_E_R_E_N_C_E_S



        1. Kingston, Douglas P. III,  _A  _T_o_u_r  _T_h_r_o_u_g_h  _t_h_e  _M_u_l_t_i-
           _D_e_v_i_c_e  _Q_u_e_u_e_i_n_g _S_y_s_t_e_m, revised for MDQS 2.0, Ballistic
           Research  Laboratory,  Army  Armament   Research   and
           Development Command (AARADCOM), September 12, 1983.

        2. Moss,  J. Elliot B., _N_e_s_t_e_d _T_r_a_n_s_a_c_t_i_o_n_s: _A_n _A_p_p_r_o_a_c_h _t_o
           _R_e_l_i_a_b_l_e     _D_i_s_t_r_i_b_u_t_e_d      _C_o_m_p_u_t_i_n_g,      Cambridge,
           Massachusetts: The MIT Press, 1985.
