6.9 Writing Efficient Code
The ECLiPSe compiler tries its best, but some constructs
can be compiled more efficiently than others.
On the other hand, many Prolog programmers overemphasize
the importance of efficient code and write unreadable
programs which are hard to maintain and only
marginally faster than simple, straightforward and readable
programs.
The advice is therefore: try the simple and straightforward
solution first!
The second rule is to keep this original program even if you try
to optimise it. You may find out that the optimisation
was not worth the effort.
To achieve the maximum speed of your programs, you must
compile the optimised code with the flag debug_compile
switched off, e.g.
by calling set_flag(debug_compile, off)
or by using the pragma nodebug.
Setting the flag variable_names can also cause slight
performance degradations and it is thus better to have
it off, unless variable names have to be kept.
Unlike in previous releases, the flag coroutine
now has no influence on the execution speed.
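As a minimal sketch, the settings named above could be written as directives at the top of a source file. The flag and pragma names are the ones from the text; the choice to switch variable_names off is an assumption that variable names do not need to be kept in your program.

```prolog
% Compile the rest of this file without debugging instrumentation.
:- pragma(nodebug).

% Alternatively, switch debug compilation off globally.
:- set_flag(debug_compile, off).

% Assumption: variable names are not needed, so do not keep them.
:- set_flag(variable_names, off).
```

The pragma and the debug_compile flag are alternatives; normally only one of them is needed.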
Some programs spend a lot of time in garbage collection,
collecting the stacks and/or the dictionary.
If the space is known to be deallocated anyway, e.g. on failure,
the programs can often be sped up considerably
by switching the garbage collector off or by increasing
the gc_interval flag.
As the global stack expands automatically, this does not cause
any stack overflow, but it may of course exhaust the machine memory.
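The two options can be sketched as follows. The flag name gc is assumed here to be the switch for the garbage collector, and the interval value is an arbitrary illustration, not a recommendation.

```prolog
% Option 1: switch the garbage collector off entirely
% (assumes the flag is called gc).
:- set_flag(gc, off).

% Option 2: keep the collector on but let it run less often.
% The value 64000000 is purely illustrative.
:- set_flag(gc_interval, 64000000).
```

Either setting trades memory for speed, so it is worth measuring memory consumption after the change.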
When the program is running and its speed is still
not satisfactory, use the profiling tools.
The profiler can tell you which predicates
are the most expensive ones, and the statistics tool
tells you why.
A program may spend its time in a predicate because the predicate
itself is very time consuming, or because it was frequently executed.
The statistics tool gives you this information.
It can also tell whether the predicate was slow because it
has created a choice point or because there was too much
backtracking due to bad indexing.
One of the very important points is the selection
of the clause that matches the current call.
If there is only one clause that can potentially match,
the compiler is expected to recognise this and generate code
that will directly execute the right clause
instead of trying several subsequent clauses until the
matching one is found.
Unlike most current Prolog compilers, the ECLiPSe
compiler tries to base this selection (indexing) on the most suitable
argument of the predicate.
It is therefore not necessary to reorder the predicate
arguments so that the first one is the crucial argument
for indexing.
However, the decision is still based only on one argument.
If it is necessary to look at two arguments
in order to select the matching clause, e.g. in
p(a, a) :- a.
p(b, a) :- b.
p(a, b) :- c.
p(d, b) :- d.
p(b, c) :- e.
and if it is crucial that this procedure is executed
as fast as possible, it is necessary to define
an auxiliary procedure which can be indexed on the other argument:
p(X, a) :- pa(X).
p(X, b) :- pb(X).
p(b, c) :- e.
pa(a) :- a.
pa(b) :- b.
pb(a) :- c.
pb(d) :- d.
The compiler also tries to use for indexing all type-testing information
that appears at the beginning of the clause body:
- Type testing predicates free/1, var/1, meta/1, atom/1, integer/1,
rational/1, float/1, breal/1, real/1, number/1, string/1, atomic/1,
compound/1, nonvar/1 and nonground/1.
- Explicit unification and value testing =/2, ==/2, \==/2 and \=/2.
- Combinations of tests with ,/2, ;/2, not/1, ->/2.
- Arithmetic testing predicates </2, =</2, >/2, >=/2 if one argument
is an integer constant and the other one is known to be of the
integer type.
- A cut after the type tests.
If the compiler can decide about the clause selection at compile time,
the type tests are never executed and thus they incur no overhead.
When the clauses are not disjoint because of the type tests, either a cut
can be added after the test or further tests can be added to the other clauses.
For example, the following procedure will be recognised as deterministic
and all tests are optimised away:
% a procedure without cuts
p(X) :- var(X), ...
p(X) :- (atom(X); integer(X)), X \= [], ...
p(X) :- nonvar(X), X = [_|_], ...
p(X) :- nonvar(X), X = [], ...
Another example:
% A procedure with cuts
p(X{_}) ?- !, ...
p(X) :- var(X), !, ...
p(X) :- integer(X), ...
p(X) :- real(X), ...
p([H|T]) :- ...
p([]) :- ...
Integers less than or greater than a constant can also be
recognised by the compiler:
p(X) :- integer(X), X < 5, ...
p(7) :- ...
p(9) :- ...
p(X) :- integer(X), X >= 15, ...
If the clause contains tests of several head arguments, only the
first one is taken into account for indexing.
Here are some more hints for efficient coding with ECLiPSe:
- Arguments which are repeated in the clause head and in the first
regular goal in the body do not require any data moving and thus
they do not cost anything. For example,
p(X, Y, Z, T, U) :- q(X, Y, Z, T, U).
is as expensive as
p :- q.
On the other hand, switching arguments requires data moves and so
p(A, B, C) :- q(B, C, A).
is significantly more expensive.
- When accessing an argument of a
structure whose functor is known, unification
is better than arg/3.
Note, however, that for better maintainability the structure
notation (see section 5.1)
should be used to define the structures.
- Tests are generally rather slow unless they can be compiled away
(see indexing).
- When processing all arguments of a structure, using =../2
and list predicates is always faster, more readable
and easier to analyse by automated tools than using functor/3
and arg/3 loops.
- Similarly, when adding one new element to a structure, using =../2
and append/3 is faster than functor/arg.
- Waking is less expensive than metacalling and more expensive
than direct calling.
Metacalls, although generally slow, are still a lot faster than
in some other Prolog systems.
- Sorting using sort/2 is very efficient and it does not use
much space.
Using setof/3, findall/3 etc. is also efficient enough
to be used every time a list of all solutions is needed.
- Using not not Goal is optimised in the compiler
to use only one choice point.
- =/2, when expanded by the compiler, is faster than ==/2
or =:=/2.
- :/2 is optimised away by the compiler
if both arguments are known.
- Using several clauses is much more efficient than using
a disjunction if the clause heads contain nonvariables
which can be used for indexing.
If no indexing can be made anyway, using a disjunction
is slightly faster.
- Conditionals with -> ; are compiled more efficiently
if the condition is a simple built-in test.
However, using several clauses can be faster if the compiler
optimises the test away.
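Two of the hints above can be illustrated with a small sketch. The structure point/2 and the predicates point_x/2, point_x_arg/2, colour_code/2 and colour_code_disj/2 are hypothetical names introduced only for this example. The code shows the alternative formulations side by side and makes no measurement of their relative speed.

```prolog
% Accessing an argument of a structure whose functor is known:
% head unification, as recommended above.
point_x(point(X, _), X).

% The same access written with arg/3.
point_x_arg(P, X) :- arg(1, P, X).

% Several clauses whose heads contain distinct nonvariable first
% arguments, which the compiler can use for indexing.
colour_code(red, 1).
colour_code(green, 2).
colour_code(blue, 3).

% The same relation written as one clause with a disjunction.
colour_code_disj(C, N) :-
    ( C = red, N = 1
    ; C = green, N = 2
    ; C = blue, N = 3
    ).
```

Both variants of each pair give the same answers, for instance point_x(point(3, 4), X) and point_x_arg(point(3, 4), X) both bind X to 3. Which one is faster in a given program is best checked with the profiler.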