You are here: Recursion Theory and Joy
TITLE>Recursion Theory and JoyAbstract: Joy is a functional programming language which is not based on the application of functions to arguments but on the composition of functions. Many topics from the theory of computability are particularly easy to handle within Joy. They include the parameterisation theorem, the recursion theorem and Rice's theorem. Since programs are data, it is possible to define a Y-combinator for recursion and several variants. It follows that there are self-reproducing and self-describing programs in Joy. Practical programs can be written without recursive definitions by using several general purpose recursion combinators which are more intuitive and more efficient than the classical ones.
Keywords: functional programming, functionals, computability, diagonalisation, program = data, diagonalisation, self-reproducing and self-describing programs, hierarchy of recursion combinators, elimination of recursive definitions.
A deep result in the theory of computability concerns the elimination of recursive definitions. To use the stock example, the factorial function can be defined recursively in Joy by
factorial == [0 =] [pop 1] [dup 1 - factorial *] ifteThe definition is then used in programs like this:
5 factorialBecause in Joy programs can be manipulated as data, the factorial function can also be computed recursively without a recursive definition, as follows:
5 [ [pop 0 =] [pop pop 1] [[dup 1 -] dip i *] ifte ] [dup cons] swap concat dup cons iThe second line in this program does much the same as the body of the definition of factorial, but it is a quoted program. The third line first transforms this into another longer quoted program which performs "anonymous" recursion, and then the final i combinator essentially dequotes this program causing its execution.
The third line implements Joy's counterpart of the Y combinator of the lambda calculus and of combinatory logic. Exactly the same line can be used to cause anonymous recursion of other functions which are normally defined recursively.
Joy has other combinators which make recursive execution of programs more succinct. (Of course it is also possible in Joy to compute the factorial function more efficiently with iteration instead of recursion.)
The remainder of this paper deals with various aspects of the theory of computability, in particular the theory of recursive functions. The next section gives a recursive definition of a recursion combinator. Following that is a section dealing with various well known and elementary theorems from recursive function theory. The next section then proves a fixpoint theorem for Joy. The theorem implies the existence of self-reproducing programs, as shown in the following section. There is also some discussion of the effect of evaluation order on termination. The topic of the next section is Rice's theorem for Joy. Then there is a section describing further self-reproducing and self-describing programs. Another recursion combinator is constructed in the next section. The final section discusses the more practical recursion combinators of Joy. Previous knowledge of the field of recursion theory is not assumed.
In conventional notation the factorial function can be defined recursively like this:
r-fac(n) = if n = 0 then 1 else n * r-fac(n - 1)It is obvious that in conventional notation the definitions need the formal parameter
n
. Joy was designed to eliminate formal
parameters in definitions, and the factorial function would be defined
in Joy like this:
r-fac1 == [0 =] [pop 1] [dup 1 - r-fac1 *] ifteThe suffix
1
serves to distinguish this definition from a
later version. The RHS of the definition contains three quotations:
an if-part, a then-part and an else-part. These serve as parameters
to the ifte combinator. The if-part tests for equality
with 0
, and if that is true then the then-part
pops off the parameter and replaces it with the result
1
. Otherwise the else-part is executed, which
duplicates the parameter, subtracts 1
from the
top copy, calls r-fac1
recursively on that and, when that
call has returned with a value, multiplies that with the original.
Using a more Joy-like idiom:
r-fac1 == [null] [succ] [dup pred r-fac1 *] ifteThis uses the null predicate in the if-part, the successor function in the then-part and the predecessor operator in the else-part.
In conventional notation the Fibonacci function is defined recursively like this:
r-fib(n) = if n <= 1 then n else r-fib(n - 1) + r-fib(n - 2)A more or less literal translation into Joy is the following:
r-fib1 == [small] [] [pred dup [r-fib1] dip pred r-fib1 +] ifteThe if-part uses the small predicate, which for numeric parameters yields
true
for 0
and for
1
. The then-part is the empty quotation []
,
when executed it does nothing. In the else-part r-fib1
has to call itself twice, once each for the two parameters that have
been prepared by dup
. The dip combinator
applies the quotation [r-fib1]
to the earlier version,
and the later version, after pred
has done its job, is
handled directly by r-fib1
. A cleaner version is
r-fib2 == [small] [] [pred dup pred [r-fib2] app2 +] ifteThe app2 combinator applies the quoted
[r-fib2]
twice, to each of the two numbers on top of the
stack.
It will help if the minor differences between the definitions of the
factorial function and the Fibonacci function are eliminated. In
particular this concerns the patterns of the recursive calls. In the
body of r-fac1
the direct recursive call has been
replaced by its quotation and app1, which applies the
quotation just once, to the single number. So here are two other
versions, they have aligned to make comparisons easier.
r-fac2 == [ null] [succ] [ dup pred [r-fac2] app1 *] ifte r-fib2 == [small] [ ] [pred dup pred [r-fib2] app2 +] ifteThe task of eliminating the recursion in the RHS of the definitions amounts to this: The occurrences of the quoted programs
[r-fac2]
and [r-fib2]
have to be replaced by
the quoted RHS. But this will introduce those same quotes again, and
these have to be replaced by the RHS, and so on ad
infinitum. It seems impossible.
One part of the solution is that the programs have to be given an
extra parameter which is to be used when the else-part is executed.
The extra parameter will have to contain whatever is necessary to
enable the recursion inside the else-part. But this means that the
extra parameter will be somewhat in the way. Consequently the if-part
and the then-part need an extra pop
to remove the
unneeded parameter. Furthermore, in the else-part any preparatory
work before the actual recursion has to be done below the
extra parameter using dip
. The one or two recursions are
then to be effected by app1
and app2
,
respectively. The two programs now look like this:
?-fac: [[pop null] [pop succ] [[dup pred ] dip app1 *] ifte] ? ?-fib: [[pop small] [pop ] [[pred dup pred] dip app2 +] ifte] ?To indicate that these are not recursive definitions, they are not given as definitions at all. The
?
-symbol is left unanalysed at this point,
only this much can be said about it:
?
-symbol denotes a strange combinator.
For comparison the three versions of the factorial program are listed here:
r-fac1 == [ null] [ succ] [ dup pred r-fac1 *] ifte r-fac2 == [ null] [ succ] [ dup pred [r-fac2] app1 *] ifte ?-fac: [[pop null] [pop succ] [[dup pred ] dip app1 *] ifte] ?And here are the three versions of the Fibonacci program:
r-fib1 == [ small] [ ] [ pred dup [r-fib1] dip r-fib1 +] ifte r-fib2 == [ small] [ ] [ pred dup pred [r-fib2] app2 +] ifte ?-fib: [[pop small] [pop ] [[pred dup pred] dip app2 +] ifte] ?
The recursion combinator must do this: At the point where
?-fac
uses app1
and where ?fib
uses app2
they expect ?-fac
and
?fib
on top of the stack. That is why the if-parts and
the then-parts each need an extra pop
, and why the
else-part has to do its initial work from a dip
. It all
means that a recursion combinator has to supply the RHS to itself as
an extra parameter. In general, a recursion combinator expects a
program [P]
and it executes it in a special way. It must
call P
but provide the extra parameter. This is the
defining law for a recursion combinator ?
:
[P] ? == [[P] ?] P
How can a recursion combinator be defined? This can be done
recursively and non-recursively. Once the ?
combinator
is defined, it can be used to eliminate all other recursive
definitions. If a recursion combinator is defined recursively, then
this would be the only recursive definition that is needed. Here is a
step-by-step development of such a definition. The development is a
fairly typical example of how one can write Joy programs
systematically. Start with the defining law:
[P] ? == [[P] ?] PIn the second line
P
occurred twice, in quoted and
unquoted form. It will be simpler if it occurs in only one form, and
that has to be the quoted form. So, using the i
combinator,
== [[P] ?] [P] iThe two quotations are not exactly the same, but they can be produced from two identical quotations:
== [[P ?] [[P] ?] first iNow the two identical quotations can be produced by the
dup
operator:
== [[P] ?] dup first iAll that is needed now is to extract
[P]
to the left:
== [P] [?] cons dup first iIn this construction each line was equivalent to its one or two neighbours. Hence the first and the last lines of this construction are equivalent:
[P] ? == [P] [?] cons dup first iHence the following is a suitable recursive definition of a recursion combinator:
? == [?] cons dup first i
It is useful to think of the RHS as being composed of two parts: the first is
[?] cons dup firstand the second part is just
i
. The first part is just a
function which takes one quoted program as a parameter and produces
two quoted programs as values:
[P] [?] cons dup first [[P] ?] dup first (by cons) [[P] ?] [[P] ?] first (by dup) [[P] ?] [P] (by first)The second part of the definition is
i
, which effectively
dequotes the topmost [P]
and hence executes
P
. So the next reduction step depend on what
P
actually does.
An alternative recursive definition may be constructed:
[P] ? == [[P] ?] P == [[P] ?] [P] i == [P] [?] cons [P] i == [P] [P] [[?] cons] dip i == [P] dup [[?] cons] dip iThe resulting alternative recursive definition is
? == dup [[?] cons] dip iIt follows that the two RHSs of the two equivalent definitions are equal:
[?] cons dup first i == dup [[?] cons] dip i(Note that cancellation of the trailing
i
combinator on
both sides is not valid in general, though it would be valid here.)
We now have a recursive definition of recursion. Computationally this
is quite adequate, the recursive definition of any one of the
?
combinators really does make it possible to eliminate
all other recursive definitions. Some of the following sections deal
with the elimination of recursion even for the recursion combinators.
The recursive definition of a recursion combinator can of course also be given in other languages. For an example in the lambda calculus see {Sokolowski91}.
The main results in the theory can be divided into two groups: those concerned with the relationships between two such formalisms and those concerned with just one.
The principal results in the first group are that the formalisms are all equivalent, in this sense: Given any two formalisms, for any specification in the one formalism there is a specification in the other formalism such that the two specifications compute the same function. Proofs of such results are always constructive, by exhibiting an algorithm which converts any specification in the one formalism into a specification in the other.
Results in the second group concern just one formalism and often take this form: For any specification S1 having a certain property there is another specification S2 such that the two specifications S1 and S2 are related in some special way. Proofs are again constructive, by exhibiting algorithms for converting S1 into S2.
The algorithms of both groups can be expressed in any of the formalisms. The constructive proofs then look like this: There is a specification S (in formalism F) with the capacity of transforming any specification S1 (in formalism F1) into a specification S2 (in formalism F2, not necessarily different from F1) such that the two specification S1 and S2 are related in a special way. Since the formalisms are universal, the algorithms can be written in the formalism itself, as another specification S. The results then take this form: There is a specification S such that for every specification S1 the result of applying S to S1 yields the required S2.
Historically the theory of computability has been mainly couched in terms of recursive functions, a collection of functions built from a small base by means of a small number of constructors. These functions take natural numbers as arguments and give natural numbers as values. By means of a theoretically elegant (but computationally unfeasible) mapping called Gödel numbering, linguistic entities such as expressions can be assigned a unique number. In this way the functions can be made to "talk about themselves". Any syntactic operation on linguistic entities has a counterpart purely arithmetical operation on natural numbers which are the Gödel numbers of these entities. The theorems are basic to any proper understanding of the the foundations of computer science. But the proofs, when not hand-waiving by appeals to Church's thesis, tend to be forbidding.
The ease of writing S depends crucially on how well the formalism can handle its own specifications. Often it is necessary to use encodings of S1 and S2 into a form which the formalism can handle. In the worst case the encodings are truly ghastly. In the best case the formalism can handle its own specifications very naturally. This is sometimes expressed by the slogan program = data. Most formalisms fare badly on this, including most programming languages.
Some languages can treat their own programs as data; they are Lisp, Snobol, Prolog and their descendants, some macro generators and the command languages based on them. Joy does it at least as well. In these languages programs can operate on other programs or on other data to produce other programs which may then be executed. Because of this, arithmetisation is not necessary, functions can take programs rather than Gödel numbers of programs as arguments, and metatheoretic proof become much easier. {Phillips92} writes
One might wonder how differently recursion theory might be viewed if it had arisen out of practical developments instead of predating them.Indeed, one might speculate how much more natural computability theory would have been if Lisp has been invented thirty years earlier. Readers might want to pursue this topic, see the discussion in {Hofstadter85} starting on p 444, and his very startling conclusion on p 449.
In Joy many metatheoretic proofs are easier still because there are no named formal parameters. Consequently the difficult operation of substituting actual parameters for formal parameters does not occur, and everything is algebra. Reasoning about Joy programs will now be illustrated with a number of classical theorems.
A simple form of the parametrisation theorem states: there is a recursive function taking the Gödel number of an n-ary function F1 as one argument and taking the Gödel number of a numeral as a further argument such that the value of this recursive function is the Gödel number of an n-1-ary function F2 which is obtained by substituting the value of the numeral for the first argument of F1.
Here is the same theorem for Joy. An operand is a numeral (or any other constant) or a quotation.
For any program P and operand X there is a program Q such that Q == [Q] i == X PThe required program
Q
is just the concatenation. But
the theorem can be strengthened:
There is a program O such that for any program P and operand X there is a program Q such that X [P] O == [Q] and Q == [Q] i == X [P] i == X PThe required program
O
is cons
:
X [P] O == X [P] cons == [X P] == [Q]The theorem is rather trivial in Joy. The formalisation of the proof is an overkill. However, many proofs later on will have the same structure, and it may help to get used to that.
The parametrisation theorem generalises to any number m of
arguments that are parametrised. Consequently it is often called the
S-m-n theorem, see {Rogers67}. The "S" stands for
"substitution" - it means that m of the n formal
parameters are replaced by fixed values. In Joy notation repeated
cons
operations can parameterise for any m,
simply by repeating cons
m times.
In recursive function theory it is possible for functions to take other functions as parameters, and since functions are untyped, they can take themselves as parameter. One consequence ot the S-m-n theorem is the diagonalisation theorem: There is a recursive function taking as argument the Gödel number of a function which takes at least one parameter, and giving as value the Gödel number of the function obtained from the given one by substituting itself for the parameter.
Again, the counterpart in Joy is trivial.
For any program P there is a program Q such that [Q] i == Q == [P] PThe required quoted
[Q]
is [[P] P]
. The
stronger form is:
There is a program O such that for any program P there is a program Q such that [P] O == [Q] and [Q] i == Q == [P] dup i == [P] PThe required program
O
is dup cons
. Proof:
[P] O == [P] dup cons == [P] [P] cons == [[P] P] == [Q] [Q] i == [[P] P] i == [P] P == QSo in Joy the short program
dup consimplements diagonalisation.
It may help to see diagonalisation in action for a small program:
[dup reverse concat] dup cons i == [dup reverse concat] [dup reverse concat] cons i (by dup) == [[dup reverse concat] dup reverse concat] i (by cons) == [dup reverse concat] dup reverse concat (by i) == [dup reverse concat] [dup reverse concat] reverse concat(by dup) == [dup reverse concat] [concat reverse dup] concat (by reverse) == [dup reverse concat concat reverse dup] (by concat)So the program
[dup reverse concat]
diagonalised and run
produces its own palindrome. Most programs cannot be self-applied
because it would breach typing. But here are a few, readers might
like to see what they do:
[] [reverse 42] [42 pop] [pop] [dup cons]
In {Smullyan61} theory of formal systems of (character) strings, the
norm or diagonalisation of a string X
is
defined to be X
followed by its own Gödel number in
dyadic notation. In Joy the detour via Gödel number is eliminated
by using quotations.
We now consider two reductions. The first is this:
[P] dup i == [P] [P] i (by dup) == [P] P (by i)The second is this:
[P] dup cons i == [P] [P] cons i (by dup) == [[P] P] i (by cons) == [P] P (by i)So the two reductions come to the same, and hence their first lines are identical:
[P] dup i == [P] dup cons iBut this holds for all
[P]
. So we may infer
dup cons i == dup i
For every function P there is a Gödel number of a function Q such that the result of applying the function P to the Gödel number of Q is the Gödel number of a function identical with Q.The theorem says that for every function P which is used to edit code for computing functions, there is a program Q which will compute the same function before and after the editing. So Q is a fixpoint, a program not affected by P. Fixpoint theorems are about programs
P
(rather than about the functions which they compute).
Here is an equivalent version for Joy:
For any program P there is a program Q such that [Q] i == Q == [Q] PThe required program
Q
is
[dup cons P] dup cons PThe proof is as follows:
[ Q ] i == [[dup cons P] dup cons P] i (def Q) == [dup cons P] dup cons P (by i) == [dup cons P] [dup cons P] cons P (by dup) == [[dup cons P] dup cons P] P (by cons) == [ Q ] P (def Q)
The theorem may be strengthened. The transformation from
P
to Q
can be done by a simple program
O
:
There is a program O such that for any program P there is a program Q such that [P] O == [Q] and [Q] i == Q == [Q] PThe required program
O
is
[dup cons] swap concat dup consThe proof is as follows:
[P] O == [P] [dup cons] swap concat dup cons (def O) == [dup cons] [P] concat dup cons (by swap) == [dup cons P] dup cons (by concat) == [dup cons P] [dup cons P] cons (by dup) == [[dup cons P] dup cons P] (by cons) == [Q] (def Q)The fixpoint finding program
O
is useful: henceforth it
will be called the fix operator. We also define the
y combinator for Joy:
y == fix iThis is the Joy counterpart of Curry's "paradoxical" Y combinator (uppercase), see {Curry58}. This combinator is well known in the literature on the lambda calculus and on combinatory logic.
The recursion theorem traces its ancestry to Epimenides, Russell and
Grelling (for [P]
put [i not]
, and to
Gödel. In its general form it is due to Kleene. Its proof is very
cumbersome in just about all formalisms. As can be seen from the
above, in Joy the proof is very simple.
The S-m-n theorem is the basis for partial evaluation or
program specialisation. A special case for this is the
partial evaluator mix which can transform an interpreter into
a compiler, and can transform itself into a compiler generator.
{Jones92} shows that such transformations can be done realistically
only if the partial evaluator has primitives in terms of which the
fixpoint constructions of the recursion theorem can be implemented
efficiently. The two primitives chosen are in fact very close to
Joy's i
combinator and quotation.
The recursion theorem leads in a few steps to Rice's theorem, see {Rogers67}, which encapsulates all the bad news of computability theory: for example the halting problem, or the impossibility of writing programs which check other programs - implementation, student exercises - for correctness.
P
. It is of some interest to see what happens when some
simple actual programs are chosen. Two particularly simple programs
are
i
combinator program,
which dequotes and executes a program on top of the stack.
S
is self-reproducing if this conditions holds:
S i == SNote that
S
is allowed to be arbitrarily complex, it does
not have to be just a quoted program. Such programs are obtained from
the recursion theorem by making P
compute the identity
function. In Joy this is represented most simply by the empty
program, in quoted form []
. (Alternatively, it is
represented by the program id
, in quoted form
[id]
.)
There is a program Q such that [Q] i == [Q]Proof: let
Q
be the program
[dup cons] dup consThen the derivation is as follows:
[[dup cons] dup cons] i == [dup cons] dup cons (by i) == [dup cons] [dup cons] cons (by dup) == [[dup cons] dup cons] (by cons)Readers might like to be reminded of a self-replicating C-program:
p="p=%c%s%c;main(){printf(p,34,p,34);}";main(){printf(p,34,p,34);}(Author unknown. Proving correctness is not easy.)
Related to self-reproducing programs are self-describing
programs. A program S
is self-describing if running
it produces a description of it. In Joy a program S
is
self-describing if this condition holds:
S == [S]Here is a self-describing program:
[dup cons] dup cons == [dup cons] [dup cons] cons (by dup) == [[dup cons] dup cons] (by cons)
i
combinator. Then [P]
is [i]
, and
[Q]
is [[dup cons i] dup cons i]
. This is
what happens:
[[dup cons i] dup cons i] i == [dup cons i] dup cons i (by i) == [dup cons i] [dup cons i] cons i (by dup) == [[dup cons i] dup cons i] i (by cons)So this is a program that runs forever. This is the Joy counterpart of what in the lambda calculus is selfapplication selfapplied. In both the reductions do not terminate. In the lambda calculus the rule used is beta-reduction, the substitution of actual for formal parameters. In Joy there are no formal parameters, just algebraic simplification.
The dup cons
combination occurs so frequently that it is
worth introducing an operator duco defined by
duco == dup consThis will make proofs shorter both horizontally (because programs are shorter) and vertically (because two steps are condensed into one).
[P]
=
dup
. Then [Q]
is the quotation in the first
line below, and execution is as follows:
[[duco dup] duco dup] i == [duco dup] duco dup (by i) == [[duco dup] duco dup] dup (by duco) == [[duco dup] duco dup] [[duco dup] duco dup] (by dup)So this is an example of a program which when run produces two copies of itself. The program in the second line is another self-describing program, it produces two copies of itself.
[P]
= dup i
. Then
execution starts like this:
[[duco dup i] duco dup i] i == [duco dup i] duco dup i (by i) == [[duco dup i] duco dup i] dup i (by duco) == [[duco dup i] duco dup i] [[duco dup i] duco dup i] i (by dup)So this is another program that runs forever. In addition it leaves earlier copies of itself on the stack. In an implementation it must run out of stack space. The program in the second line is again self-describing, but it never finishes what would be a description of infinitely many copies of itself.
Another program that does much the same is
[[duco dup dip] duco dup dip]
duco
operator is used.
[ [pop 0 =] [pop pop 1] [[dup 1 -] dip i *] ifte ] y == [ [pop 0 =] [pop pop 1] [[dup 1 -] dip i *] ifte ] fix i (by y) == [ [pop 0 =] [pop pop 1] [[dup 1 -] dip i *] ifte ] [duco] swap concat duco i (def fix) == [duco] [ [pop 0 =] [pop pop 1] [[dup 1 -] dip i *] ifte ] concat duco i (by swap) == [ duco [pop 0 =] [pop pop 1] [[dup 1 -] dip i *] ifte ] duco i (by concat) == [[ duco [pop 0 =] [pop pop 1] [[dup 1 -] dip i *] ifte ] duco [pop 0 =] [pop pop 1] [[dup 1 -] dip i *] ifte ] i (by duco) == [ duco [pop 0 =] [pop pop 1] [[dup 1 -] dip i *] ifte ] duco [pop 0 =] [pop pop 1] [[dup 1 -] dip i *] ifte (by i) == [[ duco [pop 0 =] [pop pop 1] [[dup 1 -] dip i *] ifte ] duco [pop 0 =] [pop pop 1] [[dup 1 -] dip i *] ifte ] [pop 0 =] [pop pop 1] [[dup 1 -] dip i *] ifte (by duco)At this point the numeric parameter for the computation will be needed. The
ifte
combinator will execute its if-part.
That results in the long two-line quotation being popped off the stack
and the parameter being compared with 0. If that evaluates to
true
, the then-part is executed with the parameters and
the two-line quotation still on top of the stack, The then-part pops
both and replaces them by 1. But if the if-part returns
false
, the else-part is executed, again with the
parameter and the two-line quotation on top of the stack.
First the two-line quotation is set aside by dip
, the
parameter is duplicated and the top copy decremented. Only then the
two-line quotation will be executed by i
. The effect is
to construct a clone of itself with duco
, pushing the
same if-part, then-part and else-part for the contained
ifte
. When that eventually returns, the old parameter
and above it the factorial of the decremented parameter copy are
finally multiplied to give the required factorial of the original
parameter.
In applicative order the expressions are evaluated first and that value is substituted for the formal parameter. If the value of a parameter expression is used repeatedly in the body of the function, then normal order requires repeated evaluation of the expression, whereas applicative order requires only one evaluation. On the other hand, if the body does not require the value at all, then applicative order will have wasted time trying to evaluate the expression. If that evaluation does not terminate at all, then the call of the function will fail to terminate under applicative order although it would have terminated under normal order. Consequently there are some functions that are less defined under applicative order evaluation than they are under normal order.
The Y combinator of the lambda calculus does not work at all for applicative order. No matter what the function is, the call with the Y combinator will try to do the unending sequence of substitutions and hence fail to terminate. Therefore a different version of the Y combinator has to be used, see {Paulson92}, {Winskel93}.
In Joy there are no formal parameters, there is no substitution, and
hence strictly speaking there is no evaluation order. Joy functions
take their actual parameters as values from the stack, and in a way
this resembles applicative order. The y
combinator of Joy always terminates correctly provided that the quoted
program and any of its actual parameters did get onto the stack, and
provided that the quoted program terminates. So there is a difference
between the Y combinator of the lambda calculus under applicative
order, and the y
combinator of Joy. What is the reason?
Consider again, for an arbitrary program [P]
[P] y == [P] fix i == [[duco P] duco P] i == [duco P] duco P == [[duco P] duco P] PThis point is always reached in the initial call, and it is independent of what
P
actually is. At this point there
is a (double) quotation (containing P
twice) on top of
the stack, and P
now has it available as a parameter.
Quotations are never evaluated further, although they can be
explicitly manipulated by operators or they can be explicitly called
by combinators. Consequently the second duco
inside the
quotation will not be executed to yield
== [[[duco P] duco P] duco P] PHere of course the second and third
duco
would have been
the next candidates for execution, and so on. So the reason for the
difference is that quotations in Joy are never evaluated
automatically, whereas abstractions in the lambda calculus will be
under applicative order evaluation.
It is possible to generalise fixpoint combinators for mutual recursion. For the lambda calculus this is done for example by {Kogge91}, shows how pairs of mutually recursive definitions can be eliminated by using a pair of rather complicated mutually recursive combinators Y1 and Y2. A similar technique is explained in {Henson87}. Constructions such as these rely on the existence of double fixpoints, whose existence follows from a double recursion theorem (see for example {Smullyan94}). Presumably these can be translated into Joy, too.
F
of partial
functions and a Joy program [Q]
, the question arises
whether the function computed by [Q]
is in F
or not. Is there an algorithm for deciding such questions? More
specifically, is there a Joy program, say PF
, for
deciding such questions? Such a program would have to satisfy
[Q] PF == true, if the function computed by [Q] is in F == false, if the function computed by [Q] is not in FThe program
PF
is also expected to terminate for all
inputs [Q]
. Can there be such a program?
No. This is Rice's theorem:
For all non-trivial sets F of partial functions, there is no program PF such that for all programs Q PF can decide whether the function computed by Q is in F.Proof: Let
F
be a non-trivial set of functions. Since
F
is non-trivial, there are some functions in
F
and some functions not in F
. Let program
[E]
compute a function in F
and let
[E']
compute a function not in F
.
Now assume that a program PF
exists, and that it always
terminates. Next, consider the following Joy program:
[PF] [pop [E']] [pop [E]] ifteThe program expects an arbitrary program
[Q]
on top of
the stack and then pushes the three small quotations. The
ifte operator then removes them again and executes the
first short quotation [PF]
. This will result in a truth
value, true
or false
. In the first case the
second short quotation is executed, it pop
s the
[PF]
and replaces it by [E']
. In the second
case the third short quotation is executed, it pop
s the
[PF]
and replaces it by [E]
. So, if
[Q]
is in F
, the program returns a program
not in F
, namely [E']
. On the other hand,
if [Q]
is not in F
, the program returns a
program that is in F
, namely [E]
. In other
words, no matter what the input program [Q]
is, the
output program is opposite as far as membership in F
is
concerned. Call the above program OPF
.
By the fixpoint theorem, the program OPF
must have a
fixpoint, say FIXOPF
, satisfying
FIXOPF == [FIXOPF] OPFBut if the program on the left computes a function in
F
,
then the program on the right computes a function not in
F
, and vice versa. But this is a contradiction. So the
program OPF
fails for at least one program, its own
fixpoint FIXOPF
. We must conclude that the assumption
was false. So there cannot be a program PF
.
The above proof of Rice's theorem for Joy is adapted from a proof for
recursive functions in {Phillips92}. Several well-known
paradoxes are instances of the recursion theorem, for example
the Liar paradox and Grelling's paradox use as the
program [P]
the simple program [i not]
.
Recent discussions of the Liar and related problems can be found in
{Barwise-Etchemendy87}.
There is an operand Q1 and a program Q2 such that Q1 [Q2] i == Q1 [Q2]Proof: Let
Q1
= [dup]
and let
Q2
= dup
. Then the derivation is
[dup] [dup] i == [dup] dup (by i) == [dup] [dup] (by dup)This seemingly trivial self-reproducing program will be used to derive a version of recursion that is actually more efficient than the one based on the
y
combinator.
b
combinator:
There are programs Q1 and Q2 such that [Q1] [Q2] b == [Q1] [Q2]Proof: let Q1 and Q2 be the programs
[[] cons dup first] [] cons dup firstThen the derivation is:
[[[] cons dup first]] [[] cons dup first] b == [[] cons dup first] [] cons dup first (by b) == [[[] cons dup first]] dup first (by cons) == [[[] cons dup first]] [[[] cons dup first]] first (by dup) == [[[] cons dup first]] [[] cons dup first] (by first)It is useful to introduce an operator codufi by the definition:
codufi == cons dup firstThe operator will be found useful in the next section.
i
combinator:
There is a program Q such that [Q] i i == [Q] =/= [Q] iProof: let
Q
be the program
[duco [] cons] duco [] consThe execution now looks like this:
[[duco [] cons] duco [] cons] i i == [duco [] cons] duco [] cons i (by i) == [[duco [] cons] duco [] cons] [] cons i (by duco) == [[[duco [] cons] duco [] cons]] i (by cons) == [[duco [] cons] duco [] cons] (by i)Observe that the quoted programs in the first and fourth line differ by just the extra quoting in line four.
There is another program with the property.
Let Q
be the program
[false [not] infra dup rest cons] [not] infra dup rest consThe combinator infra expects a program (here
[not]
) on top of the stack, and below that a quotation
(here the first half of the program). It temporarily turns the
quotation into the stack and executes the program (here it complements
the truth value false
or true
at the very
beginning. An outline of the derivation is:
[[false [not] infra dup rest cons] [not] infra dup rest cons] i i [[true [not] infra dup rest cons] [not] infra dup rest cons] i [[false [not] infra dup rest cons] [not] infra dup rest cons](Each by 5 steps)
The quoted programs the first and second lines are examples of mutually describing programs, satisfying
P = [Q] and Q == [P]In detail:
P == [false [not] infra dup rest cons] [not] infra dup rest cons Q == [true [not] infra dup rest cons] [not] infra dup rest cons
true
or
false
) which is thrown every time it is called. The next
program does the same with a counter, and consequently every
generation is different from all previous ones: Let Q
be
the program
[0 [1 +] infra dup rest cons] [1 +] infra dup rest consSuccessive executions using
i
cause the 0
to
be incremented to 1
, 2
and so on.
first
and rest
or sequences of such
operations. There are programs which with every generation become
less and less sensitive to longer and longer sequences of mutilations.
Two such Q
are
[duco duco] duco duco [cons duco] [cons duco] cons duco
P
, a program Q
such that
[Q] i i .. i == P P .. P [Q]where the
i
on the left and the P
on the
right are repeated the same number of times. Here is one answer:
There is a program O such that for all programs P there is a program Q such that [P] O == [Q] and [Q] i == Q == P [Q]Proof: Let
O
be the program
[dup [first i] dip rest duco] cons ducoThen for any
P
:
[P] O == [P] [dup [first i] dip rest duco] cons duco (by O) == [[P] dup [first i] dip rest duco] duco (by cons) == [[[P] dup [first i] dip rest duco] [P] dup [first i] dip rest duco] (by duco)
y
combinator there are other recursion
combinators. One of them is given by the following theorem:
There is a program M such that (1) [y] M == [y] i == y and for some programs N and O (2a) [M] [duco] swap concat == [N] and (2b) [N] duco == [O] and (3) [M] y == [N] dup i == O and for all programs P (4) [P] O == [P] y and there is a program Q such that (5) [P] [O] cons == [Q] and (6) [Q] i == Q == [Q] PThe first line, (1), says that
y
is a fixpoint for
M
. Lines (2a) and (2b) show how to construct two further
programs N
and O
. Line (3) expresses a
relationship between the three programs M
, N
and O
. Line (5) shows how to construct a program
Q
which depends on P
. The last line of the
theorem says that the Q
is a fixpoint for P
.
Proof: Only M
need be given, because
N
and O
are constructed. The required
M
is actually a combinator already seen in section 2:
cons dup first iBut the first three operators can be replaced by codufi defined in the previous section. So we set:
M == codufi iThis is used to construct
N
:
[M] [duco] swap concat == [codufi i] [duco] swap concat (def M) == [duco] [codufi i] (by swap) == [duco codufi i] (by concat) == [N] (def N)So we have:
N == duco codufi iThis program can be used to construct
O
:
[N] duco == [duco codufi i] duco (def N) == [[duco codufi i] duco codufi i] (by duco) == [O] (def O)So we have:
O == [duco codufi i] duco codufi iThis program can now be used to construct, for arbitrary
[P]
a corresponding [Q]
:
[P] [O] cons == [P] [[duco codufi i] duco codufi i] cons (def O) == [[P] [duco codufi i] duco codufi i] (by cons) == [Q] (def Q)So we have, for arbitrary
[P]
,
Q == [P] [duco codufi i] duco codufi iIt remains to be shown that
Q
is a fixpoint for P
:
[Q] i == [[P] [duco codufi i] duco codufi i] i (def Q) == [P] [duco codufi i] duco codufi i (by i) == [P] [[duco codufi i] duco codufi i] codufi i (by duco) == [[P] [duco codufi i] duco codufi i] [P] i (by codufi) == [[P] [duco codufi i] duco codufi i] P (by i) == [Q] P (def Q)This shows that
Q
is indeed a fixpoint for P
.
Different recursion combinators, and indeed a whole hierarchy of them, are well known in the literature on lambda calculus and combinatory logic; see for example {Barendregt84}, {Revesz88}. Possibly one of the most satisfying introductions to combinatory logic is to be found in the remarkable little book {Smullyan90} To Mock a Mockingbird in which he manages to combine humour and rigour. Part III is a self-contained exposition to combinatory logic in which fancyful names are given to lambda calculus combinators.
r-last
finds the last element of a list. The
function r-fac
computes the factorial of a
number. The function r-fib
computes the
Fibonacci value of a number.
r-last == [rest null] [first] [rest r-last] ifte r-fac == [0 =] [succ] [dup pred r-fac *] ifte r-fib == [small] [pop 1] [pred dup pred [r-fib] app2 +] ifteThe following three functions also compute The last, the factorial and the Fibonacci of their parameter. Note that there is no definition; the recursion is taken care of by the
y
combinator.
[ [pop rest null] [pop first] [[rest] dip i] ifte ] y [ [pop 0 =] [pop succ] [[dup pred] dip i *] ifte ] y [ [pop small] [pop pop 1] [[pred dup pred] dip app2 +] ifte] y
But the y
combinator is intrisically inefficient because
of the way it operates. On every recursive call a certain program is
popped off the stack to be executed. The first task of that program
is to construct a copy of itself, in readiness for any further
recursive calls. But this is really quite silly. It would be better
more efficient if the program to be executed was not popped
off the stack at all but simply left there. Whereas most combinators
remove their parameters from the stack, a new x combinator
leaves it there as a parameter for itself.
The following programs use the x
combinator instead of
the y
combinator. They are obtained from the first two
of the previous programs by replacing the internal occurrences of the
i
combinator and the the external occurrence of the
y
combinator by the x
combinator.
[ [pop rest null] [pop first] [[rest] dip x] ifte ] x [ [pop 0 =] [pop succ] [[dup pred] dip x *] ifte ] xThe
x
combinator might have been defined by
x == dup i
Similar lambda calculus constructions are discussed in {Tennent76}, {Tennent91}.
The remainder of this section describes some further general and particular combinators of Joy which can be used to avoid recursive definitions.
false
, the rec1-part is executed. Following that
the four program parameters and the combinator are again pushed onto
the stack bundled up in a quoted form. Then the rec2-part is
executed, where it will find the bundled form. Typically it will then
execute the bundled form, either with i
or with
app2
, or some other combinator.
g-fac == [null ] [succ] [dup pred ] [i * ] genrec; g-fib == [small] [ ] [pred dup pred] [app2 +] genrec;
genrec
combinator.
The essential difference is that the bundled up quotation is
immediately called before the rec2-part. Consequently it can only be
used for linear recursion. Here are programs for finding the
last
of an aggregate or the Fibonacci value of a natural
number:
l-last == [rest null] [first] [rest ] [ ] linrec; l-fac == [null ] [succ ] [dup pred] [*] linrec
b-fib == [small] [] [pred dup pred ] [+ ] binrec; b-qsort == [small] [] [uncons [>] split] [swap23 cons concat] binrec;
linrec
combinator, except that it does not have a
rec2-part. It can only be used for tail recursion, such as
in the program below which returns the last element of an aggregate.
t-last == [rest null] [first] [rest] tailrec;
null
. The rec1-part is also built in, for numeric
parameters it returns the parameter and its predecessor, for aggregate
parameters it returns the first
of the aggregate and the
rest
of the aggregate. Recursion then occurs
respectively on the predecessor or the rest, and the rudimentary
rec2-part typically combines the results.
The first program below computes the factorial (again!). The second turns any aggregate into a list. The third turns a suitable list of small numbers into a set. The fourth and fifth capitalise lists or strings of lowercase letters and produce, respectively, a list or string of corresponding capital letters. The last program takes an aggregate as parameter and produces a list of every second element in the original order and then the other elements in the reverse order.
p-fac == [1 ] [* ] primrec agg2list == [[]] [cons ] primrec list2set == [{}] [cons ] primrec capstring == [""] [[32 -] dip cons] primrec caplist == [[]] [[32 -] dip cons] primrec fancy == [[]] [reverse cons ] primrec