Church numerals are the representations of
natural numbers under Church encoding. The
higher-order function that represents natural number
n is a function that maps any function f to its
n-fold
composition. In simpler terms, a numeral represents the number by applying any given function that number of times in sequence, starting from any given starting value: :n : f \mapsto f^{\circ n} : f^{\circ n} (x) = (\underbrace{f \circ f \circ \ldots \circ f}_{n\text{ times}})\,(x) = \underbrace{f ( f ( \ldots ( f}_{n\text{ times}}\,(x))\ldots )) Church encoding is thus a
unary encoding of natural numbers, corresponding to simple
counting. Each Church numeral achieves this by construction. All Church numerals are functions that take two parameters. Church numerals
0,
1,
2, ..., are defined as follows in the
lambda calculus: :
Starting with 0 not applying the function at all, proceed with 1 applying the function once, 2 applying the function twice in a row, 3 applying the function three times in a row, etc.: : \begin{array}{r|l|l} \text{Number} & \text{Function definition} & \text{Lambda expression} \\ \hline 0 & 0\ f\ x = x & 0 = \lambda f.\lambda x.x \\ 1 & 1\ f\ x = f\ x & 1 = \lambda f.\lambda x.f\ x \\ 2 & 2\ f\ x = f\ (f\ x) & 2 = \lambda f.\lambda x.f\ (f\ x) \\ 3 & 3\ f\ x = f\ (f\ (f\ x)) & 3 = \lambda f.\lambda x.f\ (f\ (f\ x)) \\ \vdots & \vdots & \vdots \\ n & n\ f\ x = f^{\circ n}\ x & n = \lambda f.\lambda x.f^{\circ n}\ x \end{array} The Church numeral
3 is a chain of three applications of any given function in sequence, starting from some value. The supplied function is first applied to a supplied argument and then successively to its own result. The end result is not the number 3 (unless the supplied parameter happens to be 0 and the function is a
successor function). The function itself, and not its end result, is the Church numeral
3. The Church numeral
3 means simply to do something three times. It is an
ostensive demonstration of what is meant by "three times".
Calculation with Church numerals Arithmetic operations on numbers produce numbers as their results. In Church encoding, these operations are represented by
lambda abstractions which, when applied to Church numerals representing the operands, beta-reduce to the Church numerals representing the results. Church representation of addition, \operatorname{plus}(m, n)= m+n, uses the identity f^{\circ (m+n)}(x)=(f^{\circ m}\circ f^{\circ n})(x) =f^{\circ m}(f^{\circ n}(x)): : \operatorname{plus} \equiv \lambda m n.\lambda f x. m\ f\ (n\ f\ x) The successor operation, \operatorname{succ}(n)=n+1, is obtained by
β-reducing the expression "\operatorname{plus}\ 1": : \operatorname{succ} \equiv \lambda n.\lambda f x. f\ (n\ f\ x) Multiplication, \operatorname{mult}(m, n) = m*n, uses the identity f^{\circ (m*n)}(x) = (f^{\circ n})^{\circ m}(x): : \operatorname{mult} \equiv \lambda m n.\lambda f x. m\ (n\ f)\ x Thus b\ (b\ f) \equiv (\operatorname{mult} b\ b)\ f and b\ (b\ (b\ f)) \equiv (\operatorname{mult} b\ (\operatorname{mult} b\ b))\ f, and so by the virtue of Church encoding expressing the
n-fold composition, the exponentiation operation \operatorname{exp}(b, n) = b^n is given by : \operatorname{exp} \equiv \lambda b n. n\ b \equiv \lambda b n f x. n\ b\ f\ x The predecessor operation \operatorname{pred}(n) is a little bit more involved. We need to devise an operation that when repeated n+1 times will result in n applications of the given function f. This is achieved by using the identity function instead, one time only, and then switching back to f: : \operatorname{pred} \equiv \lambda n f x. n\ (\lambda r i. i\ (r\ f))\ (\lambda f. x)\ I As previously mentioned, I is the identity function, \lambda x. x. See
below for a detailed explanation. This suggests implementing e.g. halving and factorial in the similar fashion, : \begin{align} \operatorname{half} &\equiv \lambda n f x. n\ (\lambda r a b. a\ (r\ b\ a))\ (\lambda a b. x)\ I\ f \\ \operatorname{fact} &\equiv \lambda n f. n\ (\lambda r a. a\ (r\ (\operatorname{succ} a)))\ (\lambda a. f)\ 1 \end{align} For example, \operatorname{pred} 4\ f\ x\, beta-reduces to I (f\ (f\ (f\ x))), \operatorname{half}\ 5\ f\ x\, beta-reduces to I\ (f\ (I\ (f\ (I\ x)))), and \operatorname{fact} 4\,f\, beta-reduces to 1\ (2\ (3\ (4\ f))). Subtraction, minus(m,n) = m-n, is expressed by repeated application of the predecessor operation a given number of times, just like addition can be expressed by repeated application of the successor operation a given number of times, etc.: : \begin{align} \operatorname{minus} & \equiv \lambda m n. n \operatorname{pred}\ m \\ \operatorname{plus} & \equiv \lambda m n. n \operatorname{succ}\ m \\ \operatorname{mult} & \equiv \lambda m n. n\ (m \operatorname{plus})\ 0 \\ \operatorname{exp} & \equiv \lambda m n. n\ (m \operatorname{mult})\ 1 \ \ \ \ \{- \ \ m^n\ -\} \\ \operatorname{tet} & \equiv \lambda m n. n\ (m \operatorname{exp})\ 1 \ \ \ \ \{-\ m \uparrow\uparrow n\ -\} \end{align} \operatorname{tet} is the
tetration operation, \operatorname{tet}\ m\ 3 = m^{(m^{(m^1)})}. Similarly to the factorial definition above, it can also be defined by using the intrinsic properties of the Church encoding, creating the "code" expression for it and letting the Church numerals themselves do the rest: : \begin{align} \operatorname{tetr} & \equiv \lambda n. n\ (\lambda r a. r\ a\ a)\ 0 \end{align} Here \operatorname{tetr}\ 3\ a = 0\ a\ a\ a\ a = a\ a\ a = a^{(a^a)}.
Direct Subtraction and Division Just as addition as repeated successor has its counterpart in the direct style, so can subtraction be expressed directly and more efficiently as well: : \begin{align} \operatorname{minus} \equiv \lambda & mnfx. \\ &m\ (\lambda rq. q\ r)\ (\lambda q. x) \\ &(n\ (\lambda qr. r\ q)\ (\operatorname{Y} \ (\lambda qr. f\ (r\ q)))) \end{align} For example, \operatorname{minus}\ 6\ 3\ f\ x reduces to an equivalent of f\ (2\ f\ x). This also gives another predecessor version, beta-reducing \lambda m. \operatorname{minus}\ m\ 1 : : \begin{align} \operatorname{pred'} \equiv \lambda mfx. m\ &(\lambda rq. q\ r)\ (\lambda q. x) \\ &(\lambda r. r\ (\operatorname{Y}\ (\lambda qr. f\ (r\ q)))) \end{align} Direct definition of division is given quite similarly as : \begin{align} \operatorname{div} \equiv \lambda & mnfx. \\ &m\ (\lambda rq. q\ r)\ (\lambda q. x) \\ &(\operatorname{Y}\ (\lambda q. n\ (\lambda qr. r\ q)\ (\lambda r. f\ (r\ q))\ (\lambda x. x))) \end{align} The application to (\lambda x. x) achieves subtraction by 1 while creating a cycle of actions repeatedly emitting an f after n-1 steps. Instead of \operatorname{Y}, (\lambda q. m\,q\,x) can also be used in each of the three definitions above.
Table of functions on Church numerals Notes: {{notelist|refs={{efn|name=ce|In the Church encoding, • \operatorname{pred}(0) = 0 • m \le n \to m - n = 0 }}}}
Predecessor function The predecessor function is given as : \operatorname{pred} \equiv \lambda n f x. n\ (\lambda r i. i\ (r\ f))\ (\lambda f. x)\ (\lambda u. u) This encoding essentially uses the identity :first(\ (\langle i,j\rangle \mapsto \langle j,f \circ j\rangle)^{\circ n}\langle I,I\rangle\ ) = \begin{cases} I & \mbox{if }n=0, \\ f^{\circ (n-1)} & \mbox{otherwise}\end{cases} or :first(\ (\langle x,y\rangle \mapsto \langle y,f(y)\rangle)^{\circ n}\langle x,x\rangle\ ) = \begin{cases} x & \mbox{if }n=0, \\ f^{\circ (n-1)}(x) & \mbox{otherwise}\end{cases}
An explanation of pred The idea here is as follows. The only thing known to the Church numeral \operatorname{pred} n is the numeral n itself. Given two arguments f and x, as usual, the only thing it can do is to apply that numeral to the two arguments, somehow modified so that the
n-long chain of applications thus created will have one (specifically, leftmost) f in the chain replaced by the identity function: : \begin{align} f^{\circ (n-1)} (x) &= \underbrace{I\ (\underbrace{f ( f ( \ldots ( f}_{{n-1}\text{ times}}}_{n\text{ times}}\,(x))\ldots ))) = (Xf)^{\circ n} (Z\,x)\ A \\ &= \underbrace{X f\ (X f\ ( \ldots (X f}_{{n}\text{ times}}\,(Z\,x))\ldots ))\ A \\ &= X\ f\ r_1\ A_1 \,\,\, \{-\ and\ it\ must\ be\ equal\ to:\ -\} \\ &= I\ (X\ f\ r_2\ A_2) \\ &= I\ (f\ (X\ f\ r_3\ A_3)) \\ &= I\ (f\ (f\ (X\ f\ r_4\ A_4))) \\ & \ldots \\ &= I\ (f\ (f\ \ldots (X\ f\ r_n\ A_n) \ldots )) \\ &= \underbrace{I\ (f\ (f\ \ldots (f}_{n\text{ times}}\ (Z\ x\ A_{n+1})) \ldots )) \\ \end{align} Here X f is the modified f, and Z\,x is the modified x. Since Xf itself can not be changed, its behavior can only be modified through an additional argument, A. The goal is achieved, then, by passing that additional argument A along
from the outside in, while modifying it as necessary, with the definitions : \begin{align} A_1 \,\,\,\,\, \,\,\,\,\, &=\, I \\ A_{\,i>1}\,\,\,\,\, & =\,f \\ Z\ x\ f\,\,\,\, &= x = K\ x\ f\\ X\ f\ r\ A_i &= A_i\ (r\ A_{i+1}) \,\,\,\,\,\, \{-\ i.e.,\ -\} \\ X\ f\ r\ i \,\,\,\,\, &= i\ (r\ f) \end{align} Which is exactly what we have in the \operatorname{pred} definition's lambda expression. Now it is easy enough to see that : \begin{align} \operatorname{pred}\ (\operatorname{succ}\ n)\ f\ x &= \operatorname{succ}\ n\ (X f)\ (K\ x)\ I \\ &= X\ f\ (n\ (X\ f)\ (K\ x))\ I \\ &= I\ (n\ (X f)\ (K\ x)\ \,\, f\,\,\, ) \\ &=\ \ldots \\ &= I\ (f\ (f\ \ldots (f\ (K\ x \,\, f\,\, )) \ldots)) \\ &= I\ (n\ f\ x) \\ &= n\ f\ x\ \end{align} : \begin{align} \operatorname{pred}\ 0\ f\ x &=\ 0\ (X f)\ (K\ x)\ I \\ &=\ K\ x\ I \\ &=\ x \\ &=\ 0\ f\ x \end{align} i.e. by eta-contraction and then by induction, it holds that : \begin{align} &\operatorname{pred}\ (\operatorname{succ}\ n) &&=\ n \\ &\operatorname{pred}\ 0 &&=\ 0 \\ &\operatorname{pred}\ (\operatorname{pred}\ 0) &&=\ \operatorname{pred}\ 0\ =\ 0 \\ & \ldots \end{align} and so on.
Defining pred through pairs The identity
above may be coded with the explicit use of pairs. It can be done in several ways, for instance, :\begin{align} \operatorname{f} =&\ \lambda p.\ \operatorname{pair}\ (\operatorname{second}\ p)\ (\operatorname{succ}\ (\operatorname{second}\ p)) \\ \operatorname{pred}_2 =&\ \lambda n.\ \operatorname{first}\ (n\ \operatorname{f}\ (\operatorname{pair}\ 0\ 0)) \\ \end{align} The expansion for \operatorname{pred}_2 3 is: :\begin{align} \operatorname{pred}_2 3 =&\ \operatorname{first}\ (\operatorname{f}\ (\operatorname{f}\ (\operatorname{f}\ (\operatorname{pair}\ 0\ 0)))) \\ =&\ \operatorname{first}\ (\operatorname{f}\ (\operatorname{f}\ (\operatorname{pair}\ 0\ 1))) \\ =&\ \operatorname{first}\ (\operatorname{f}\ (\operatorname{pair}\ 1\ 2)) \\ =&\ \operatorname{first}\ (\operatorname{pair}\ 2\ 3) \\ =&\ 2 \end{align} This is a simpler definition to devise but leads to a more complex lambda expression, : \begin{align} \operatorname{pred}_2 \equiv \lambda n. n\ &(\lambda p. p\ (\lambda a b h. h\ b\ (\operatorname{succ}\ b)) ) \,\, (\lambda h. h\ 0\ 0) \,\, (\lambda a b. a) \end{align} Pairs in the lambda calculus are essentially just extra arguments, whether passing them inside out like here, or from the outside in as in the original \operatorname{pred} definition. Another encoding follows the second variant of the predecessor identity directly, : \begin{align} \operatorname{pred}_3 \equiv \lambda n f x. n\ &(\lambda p. p\ (\lambda a b h. h\ b\ (f\ b)) ) \,\, (\lambda h. h\ x\ x) \,\, (\lambda a b. a) \end{align} This way it is already quite close to the original, "outside-in" \operatorname{pred} definition, also creating the chain of fs like it does, only in a bit more wasteful way still. But it is very much less wasteful than the previous, \operatorname{pred}_2 definition here. Indeed if we trace its execution we arrive at the new, even more streamlined, yet fully equivalent, definition : \begin{align} \operatorname{pred}_4 \equiv \lambda n f x. n\ &(\lambda r a b. r\ b\ (f\ b)) \, K\ x\ x \end{align} which makes it fully clear and apparent that this is all about just argument modification and passing. Its reduction proceeds as :\begin{align} \operatorname{pred}_4 3\ f\ x &=\ (.. (.. (.. K)))\ x\ \, x \\ &=\ (.. (.. K)) \,\,\,\,\,\,\, x\ \,\,(f\ x) \\ &=\ (.. K) \,\,\,\,\,\, (f\ x)\ \,\,(f\ (f\ x)) \\ &=\ K \,\,\,\, (f\ (f\ x))\ \,\,(f\ (f\ (f\ x))) \\ &=\ f\ (f\ x) \\ \end{align} clearly showing what is going on. Still, the original \operatorname{pred} is much preferable since it's working in the top-down manner and is thus able to stop right away if the user-supplied function f is short-circuiting. The top-down approach is also used with other definitions like : \begin{align} \operatorname{pred}_5 \equiv \lambda n f x. n\ &(\lambda r a b. a\ (r\ b\ b)) \, (\lambda a b. x)\ I\ f \\ \operatorname{third} \equiv \lambda n f x. n\ &(\lambda r a b c. a\ (r\ b\ c\ a)) \, (\lambda a b c. x)\ I\ I\ f \\ \operatorname{thirdRounded} \equiv \lambda n f x. n\ &(\lambda r a b c. a\ (r\ b\ c\ a)) \, (\lambda a b c. x)\ I\ f\ I \\ \operatorname{twoThirds} \equiv \lambda n f x. n\ &(\lambda r a b c. a\ (r\ b\ c\ a)) \, (\lambda a b c. x)\ I\ f\ f \\ \operatorname{factorial} \equiv \lambda n f x. n\ &(\lambda r a. a\ (r\ (\operatorname{succ} a))) \, (\lambda a. f)\ 1\ x \\ \end{align}
Division via General Recursion Division of natural numbers may be implemented by, : n/m = \operatorname{if}\ n \ge m\ \operatorname{then}\ 1 + (n-m)/m\ \operatorname{else}\ 0 Calculating n-m with \lambda n m. m\, \operatorname{pred}\, n takes many beta reductions. Unless doing the reduction by hand, this doesn't matter that much, but it is preferable to not have to do this calculation twice (unless the direct subtraction definition is used, see above). The simplest predicate for testing numbers is
IsZero so consider the condition. : \operatorname{IsZero}\ (\operatorname{minus}\ n\ m) But this condition is equivalent to n \le m , not n. If this expression is used then the mathematical definition of division given above is translated into function on Church numerals as, : \operatorname{divide1}\ n\ m\ f\ x = (\lambda d.\operatorname{IsZero}\ d\ (0\ f\ x)\ (f\ (\operatorname{divide1}\ d\ m\ f\ x)))\ (\operatorname{minus}\ n\ m) As desired, this definition has a single call to \operatorname{minus}\ n\ m . However the result is that this formula gives the value of (n-1)/ m. This problem may be corrected by adding 1 to
n before calling
divide. The definition of
divide is then, : \operatorname{divide}\ n = \operatorname{divide1}\ (\operatorname{succ}\ n)
divide1 is a recursive definition. The
Y combinator may be used to implement the recursion. Create a new function called
div by; • In the left hand side \operatorname{divide1} \rightarrow \operatorname{div} \ c • In the right hand side \operatorname{divide1} \rightarrow c to get, : \operatorname{div} = \lambda c.\lambda n.\lambda m.\lambda f.\lambda x.(\lambda d.\operatorname{IsZero}\ d\ (0\ f\ x)\ (f\ (c\ d\ m\ f\ x)))\ (\operatorname{minus}\ n\ m) Then, : \operatorname{divide} = \lambda n.\operatorname{divide1}\ (\operatorname{succ}\ n) where, : \begin{align} \operatorname{divide1} &= Y\ \operatorname{div} \\ \operatorname{succ} &= \lambda n.\lambda f.\lambda x. f\ (n\ f\ x) \\ Y &= \lambda f.(\lambda x.f\ (x\ x))\ (\lambda x.f\ (x\ x)) \\ 0 &= \lambda f.\lambda x.x\\ \operatorname{IsZero} &= \lambda n.n\ (\lambda x.\operatorname{false})\ \operatorname{true} \end{align} :: \begin{align} \operatorname{true} &\equiv \lambda a.\lambda b.a\\ \operatorname{false} &\equiv \lambda a.\lambda b.b \end{align} : \begin{align} \operatorname{minus} &= \lambda m.\lambda n.n \operatorname{pred} m\\ \operatorname{pred} &= \lambda n.\lambda f.\lambda x.n\ (\lambda g.\lambda h.h\ (g\ f))\ (\lambda u.x)\ (\lambda u.u) \end{align} Gives, : \scriptstyle \operatorname{divide} = \lambda n.((\lambda f.(\lambda x.x\ x)\ (\lambda x.f\ (x\ x)))\ (\lambda c.\lambda n.\lambda m.\lambda f.\lambda x.(\lambda d.(\lambda n.n\ (\lambda x.(\lambda a.\lambda b.b))\ (\lambda a.\lambda b.a))\ d\ ((\lambda f.\lambda x.x)\ f\ x)\ (f\ (c\ d\ m\ f\ x)))\ ((\lambda m.\lambda n.n (\lambda n.\lambda f.\lambda x.n\ (\lambda g.\lambda h.h\ (g\ f))\ (\lambda u.x)\ (\lambda u.u)) m)\ n\ m)))\ ((\lambda n.\lambda f.\lambda x. f\ (n\ f\ x))\ n) Or as text, using \ for , divide = (\n.((\f.(\x.x x) (\x.f (x x))) (\c.\n.\m.\f.\x.(\d.(\n.n (\x.(\a.\b.b)) (\a.\b.a)) d ((\f.\x.x) f x) (f (c d m f x))) ((\m.\n.n (\n.\f.\x.n (\g.\h.h (g f)) (\u.x) (\u.u)) m) n m))) ((\n.\f.\x. f (n f x)) n)) For example, 9/3 is represented by divide (\f.\x.f (f (f (f (f (f (f (f (f x))))))))) (\f.\x.f (f (f x))) Using a lambda calculus calculator, the above expression reduces to 3, using normal order. \f.\x.f (f (f (x)))
Predicates A
predicate is a function that returns a Boolean value. The most fundamental predicate on Church numerals is \operatorname{IsZero}, which returns \operatorname{true} if its argument is the Church numeral 0, and \operatorname{false} otherwise: : \operatorname{IsZero} = \lambda n.n\ (\lambda x.\operatorname{false})\ \operatorname{true} The following predicate tests whether the first argument is less-than-or-equal-to the second: : \operatorname{LEQ} = \lambda m.\lambda n.\operatorname{IsZero}\ (\operatorname{minus}\ m\ n) Because of the identity : x = y \equiv (x \le y \land y \le x) the test for equality can be implemented as : \operatorname{EQ} = \lambda m.\lambda n.\operatorname{and}\ (\operatorname{LEQ}\ m\ n)\ (\operatorname{LEQ}\ n\ m)
In programming languages Most real-world languages have support for machine-native integers; the
church and
unchurch functions convert between nonnegative integers and their corresponding Church numerals. The functions are given here in
Haskell, where the \ corresponds to the λ of Lambda calculus. Implementations in other languages are similar. type Church a = (a -> a) -> a -> a church :: Integer -> Church Integer church 0 = \f -> \x -> x church n = \f -> \x -> f (church (n-1) f x) unchurch :: Church Integer -> Integer unchurch cn = cn (+ 1) 0 == Signed numbers ==