RECIPE for adding new routines to the 'process' program.

e-mail address (bugs, etc.):  azara@bioc.cam.ac.uk

Let's say you want to add a new processing procedure.
The recommended course of action is:

(1) Do you really need this new procedure?
    If no, you have just saved yourself lots of time.

(2) Think of the algorithm.
    Think again.  Well, maybe it is good enough now.

(3) Decide whether you want to program it yourself.
    If not, send a message to the above e-mail address requesting the addition.
    (It will be much easier if the idea of the algorithm is stated clearly.)
    If you want to do it yourself, follow the recipe below.

Recipe for coding additional functionality in 'process':

(1) Think of a name for the new procedure.
    Here, we shall call it 'fudge_it', and assume it takes two
    arguments, an integer and a real.

(2) Edit the file 'command.c' if the procedure acts on 1-dimensional data,
    and the file 'script.c' if the procedure acts on > 1-dimensional data.
    These two files are where the parsing for procedures is laid out.

    Add a #include statement near the top (keeping the alphabetical
    order with the other processing routines is a good idea).  Thus

#include "fudge_it.h"

    Add a new line to the command_parse_table (again, in alphabetical
    order, if possible).  The recommended form in 'command.c' is

{ "fudge_it",	2,	parse_int_float,	init_fudge_it },

    and in 'script.c' (where more work needs to be done) is

{ "fudge_it",	2,	parse_int_float,	fudge_it_parse },

    where the first column is the name of the command, the second
    column specifies the number of arguments (in this case 2), the
    third column specifies the types of the arguments (in this case
    an integer and a real) and the fourth column specifies which
    routine to call when this command is parsed.

    The variable in the third column might have to be defined.  In
    this case, if it is not defined already, the following should
    be added near where the existing ones are defined:

static int parse_int_float[] = { PARSE_INT, PARSE_FLOAT };

    For 1-dimensional processing go to (3), else go to (4).
    (1-dimensional processing routines are much easier to set up.  You
    can use (4) for 1-dimensional processing, but it is not recommended.)
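
    As a rough illustration of how a table of this shape can drive the
    parsing, here is a minimal, self-contained sketch.  The struct
    layout, the type name 'Parse_line', the values of PARSE_INT and
    PARSE_FLOAT, the lookup routine 'find_command' and the stub
    handler are all assumptions made for this example; the real
    definitions in 'command.c' and the program's headers may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for types normally supplied by the program's headers (assumed). */
typedef int Status;
typedef void *Generic_ptr;
typedef char *String;
#define OK    0
#define ERROR 1

#define PARSE_INT   1   /* hypothetical values */
#define PARSE_FLOAT 2

typedef Status (*Parse_func)(Generic_ptr *var, String error_msg);

/* One row of the table: name, argument count, argument types, handler. */
typedef struct {
    const char *name;
    int nargs;
    int *arg_types;
    Parse_func func;
} Parse_line;

static int parse_int_float[] = { PARSE_INT, PARSE_FLOAT };

/* Stub handler so that the example links; the real one lives in fudge_it.c. */
static Status init_fudge_it(Generic_ptr *var, String error_msg)
{
    (void) var;
    (void) error_msg;
    return OK;
}

static Parse_line command_parse_table[] = {
    { "fudge_it", 2, parse_int_float, init_fudge_it },
};

/* Linear search for a command name; returns NULL if it is not in the table. */
static Parse_line *find_command(const char *name)
{
    size_t i;
    size_t n = sizeof(command_parse_table) / sizeof(command_parse_table[0]);

    assert(name != NULL);

    for (i = 0; i < n; i++)
        if (strcmp(command_parse_table[i].name, name) == 0)
            return &command_parse_table[i];

    return NULL;
}
```

    Once a row has been found, a parser of this kind would convert
    'nargs' arguments according to 'arg_types' and then call 'func'.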

(3) Create and edit a new file, 'fudge_it.c', which actually
    implements the desired functionality.  Copying a previously
    created file (e.g. 'phase.c' or 'zerofill.c') and then editing
    this is a good idea.

    A procedure 'init_fudge_it' (name given in the parse table above)
    must be written.  This is called when the script is parsed and
    never again, so it must set out all that is necessary for the
    processing stages.

Status init_fudge_it(Generic_ptr *var, String error_msg)
{
}

    'var' contains the arguments to 'fudge_it' (an integer and a real)
    and 'error_msg' is used for reporting error messages.

    A call to 'setup_command' (located in the file 'command.c') should
    be made near the top of 'init_fudge_it', with a check for any
    errors in doing so.

if (setup_command(&type, &npts_in, ncodes, "fudge_it", do_fudge_it,
							error_msg) == ERROR)
	return  ERROR;

    Here 'type' returns the type of the data (which can be
    COMPLEX_DATA or REAL_DATA), 'npts_in' returns the number of points
    when this 'fudge_it' is called, 'ncodes' is the number of times
    that 'fudge_it' has previously appeared in any script, "fudge_it"
    is used only if an error occurs in 'setup_command', 'do_fudge_it'
    is the routine where the processing actually takes place, and
    'error_msg' is for reporting errors.

    Typically then some checking of the 'type' or 'npts_in' is done
    (to make sure the values are sensible) and then any other
    initialisation (e.g. allocation of required memory).

    There must then be a call to 'end_command' (located in the
    file 'command.c'), and an increment of the 'ncodes' counter.

if (end_command(type, npts_out, "fudge_it", error_msg) == ERROR)
	return  ERROR;

ncodes++;

    Here 'type' is the type of the data and 'npts_out' is the
    number of points after the processing by 'fudge_it'.  The
    "fudge_it" is used only if an error occurs in 'end_command',
    and 'error_msg' is for reporting errors.

    'ncodes' is just an easy way to keep track of how many times
    the command 'fudge_it' appears in any script.  This is so that
    information required at the processing stage can be set up at
    the initialisation stage.  Currently, MAX_NCODES (defined in
    the file 'consts.h' to be 15) limits the number of times any
    given command can be used over all scripts.  This is not very
    elegant, but it works.

    Finally, the procedure 'do_fudge_it' must be written.  To avoid
    having to forward reference it, it should be placed above the
    procedure 'init_fudge_it'.

static void do_fudge_it(int code, float *data)
{
}

    Here 'code' indicates which of the (many possible) occurrences
    of 'fudge_it' in all the scripts this one is, and 'data' is the
    actual data to be operated on.
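
    Putting the pieces of this step together, a 'fudge_it.c' for
    1-dimensional data might be organised as in the following sketch.
    Several things here are assumptions made so that the example
    stands alone: the typedefs and constants at the top, the mock
    versions of 'setup_command' and 'end_command', the way the
    arguments are read out of 'var', and the processing itself
    (scaling the first 'count' points by 'factor'), which is purely
    illustrative.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-ins for definitions that normally come from the program's headers. */
typedef int Status;
typedef void *Generic_ptr;
typedef char *String;
#define OK           0
#define ERROR        1
#define REAL_DATA    0
#define COMPLEX_DATA 1
#define MAX_NCODES   15

typedef void (*Do_func)(int code, float *data);

/* Mock of setup_command: reports real data with 8 points (assumed values). */
static Status setup_command(int *type, int *npts_in, int ncodes, String name,
                            Do_func func, String error_msg)
{
    (void) func;

    if (ncodes >= MAX_NCODES) {
        sprintf(error_msg, "'%s' used too many times", name);
        return ERROR;
    }

    *type = REAL_DATA;
    *npts_in = 8;

    return OK;
}

/* Mock of end_command: accepts anything. */
static Status end_command(int type, int npts_out, String name, String error_msg)
{
    (void) type; (void) npts_out; (void) name; (void) error_msg;
    return OK;
}

/* Per-occurrence state, filled in at parse time and indexed by 'code'. */
static int ncodes = 0;
static int count[MAX_NCODES];
static float factor[MAX_NCODES];

/* Illustrative processing only: scale the first count[code] points. */
static void do_fudge_it(int code, float *data)
{
    int i;

    assert(code >= 0 && code < ncodes);

    for (i = 0; i < count[code]; i++)
        data[i] *= factor[code];
}

Status init_fudge_it(Generic_ptr *var, String error_msg)
{
    int type, npts_in;
    int n = *((int *) var[0]);      /* assumed argument layout */
    float f = *((float *) var[1]);

    if (setup_command(&type, &npts_in, ncodes, "fudge_it", do_fudge_it,
                      error_msg) == ERROR)
        return ERROR;

    if (type != REAL_DATA) {
        sprintf(error_msg, "'fudge_it': data must be real");
        return ERROR;
    }

    if (n < 0 || n > npts_in) {
        sprintf(error_msg, "'fudge_it': count %d not in range 0 to %d",
                n, npts_in);
        return ERROR;
    }

    /* Remember the arguments for this occurrence of the command. */
    count[ncodes] = n;
    factor[ncodes] = f;

    /* The number of points is unchanged by this processing. */
    if (end_command(type, npts_in, "fudge_it", error_msg) == ERROR)
        return ERROR;

    ncodes++;

    return OK;
}
```

    The arrays indexed by 'code' are what allow the same command to
    appear several times with different arguments: each call to
    'init_fudge_it' fills in one slot, and 'do_fudge_it' later reads
    back the slot it is given.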

    Go to (5).

(4) Adding multi-dimensional routines is more difficult.

    Add a routine 'fudge_it_parse' (name given in the parse table above)
    to 'script.c'.

static Status fudge_it_parse(Generic_ptr *var, String error_msg)
{
}

    'var' contains the arguments to 'fudge_it' (an integer and a real)
    and 'error_msg' is used for reporting error messages.  This routine
    must do some checking and all of the script bookkeeping that is not
    required for the 1-dimensional processing routines.

    The example of 'maxent' parsing (for 1-, 2- and 3-dimensional
    maximum entropy processing) is a good one to imitate.

    Create and edit a new file, 'fudge_it.c', which actually
    implements the desired functionality.  The routine 'fudge_it_parse'
    should call a routine in this file to do any required initialisation.
    This might be called 'setup_fudge_it'.

    For 'maxent' processing, for example, this routine is

Status setup_maxents(int n, int *type, int *npts_in, int *npts_out,
					String file, String error_msg);

    Here 'n' is the dimension of the processing (n = 1, 2 or 3), 'type'
    is the type of the data in the n dimensions (possibly changed during
    the processing), 'npts_in' is the number of points on entry to the
    processing routine, 'npts_out' is the number of points on exit
    from the processing routine, 'file' is the name of the file that
    contains the 'maxent' data, and 'error_msg' is for any error messages.
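
    By analogy, a 'setup_fudge_it' routine could take the same kind
    of arguments.  The sketch below is hypothetical throughout: the
    signature simply mirrors 'setup_maxents' without the 'file'
    argument, the typedefs and constants are stand-ins for the
    program's headers, and the checks (real data only, number of
    points unchanged) are chosen just for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-ins for definitions that normally come from the program's headers. */
typedef int Status;
typedef char *String;
#define OK            0
#define ERROR         1
#define REAL_DATA     0
#define COMPLEX_DATA  1
#define MAX_FUDGE_DIM 3   /* hypothetical limit on the dimension */

/* Check the n-dimensional input and report the output sizes. */
Status setup_fudge_it(int n, int *type, int *npts_in, int *npts_out,
                      String error_msg)
{
    int i;

    assert(type != NULL && npts_in != NULL && npts_out != NULL);

    if (n < 1 || n > MAX_FUDGE_DIM) {
        sprintf(error_msg, "'fudge_it': dimension %d not in range 1 to %d",
                n, MAX_FUDGE_DIM);
        return ERROR;
    }

    for (i = 0; i < n; i++) {
        if (type[i] != REAL_DATA) {
            sprintf(error_msg, "'fudge_it': dim %d must be real", i + 1);
            return ERROR;
        }

        if (npts_in[i] < 1) {
            sprintf(error_msg, "'fudge_it': dim %d has no points", i + 1);
            return ERROR;
        }

        /* This illustrative processing leaves the sizes unchanged. */
        npts_out[i] = npts_in[i];
    }

    return OK;
}
```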

    A routine 'do_fudge_it', which actually implements the desired
    algorithm, must also be written.

static void do_fudge_it(int code, float *data)
{
}

    Here 'code' indicates which of the (many possible) occurrences
    of 'fudge_it' in all the scripts this one is, and 'data' is the
    actual data to be operated on.

(5) Create and edit a new file, 'fudge_it.h', which declares the
    procedures in 'fudge_it.c' that are called externally.
    This is the file which is included in 'command.c' and/or 'script.c'.
    Copying a previously created file (e.g. 'phase.h' or 'zerofill.h')
    and then editing this is a good idea.  Typically, for 1-dimensional
    processing, 'fudge_it.h' would look like:

#ifndef _incl_fudge_it
#define _incl_fudge_it

#include "macros.h"
#include "types.h"
#include "consts.h"

extern Status init_fudge_it
        (Generic_ptr *data, String error_msg);

#endif /* _incl_fudge_it */
