The lambda calculus for developers

This will be a quick introduction to the lambda calculus syntax, alpha (α) equivalence and beta (β) reduction.

What does a lambda look like?

I am going to use the identity function as an example for the simplicity it provides. This can be expressed as a lambda function with the notation λx.x. It is a function that when given an argument outputs that argument as its return value. You can also have multiple arguments with a lambda like λxy.xy.

A lambda is comprised of a head, argument and body.

diagram illustrating the parts of a lambda function - head(argument).body

The head of the lambda begins with the lambda character (λ), which indicates the start of a function. This is immediately followed by the argument that is separated from the body of the function by a dot/period (.).

Applying a lambda

The identity function can be applied to any argument passed to it. Here it is when applied to the digit two (2).

(λx.x) 2
2

The steps to arrive at the final answer can be further broken down and illustrated.

the application of the identity lambda against the digit 2 as a diagram

We have just completed the simplest of beta reductions. This is the process of reducing all expressions to their normal form or smallest unit - to the point where you can do no more reduction. In this case we can reduce all the way to a single value, but this is not always the case as you’ll see further on.

In a programming language you may already know

In JavaScript this would be written as

((x) => x)(2)(
  // OR
  function (x) {
    return x;
  },
)(2);

and in PHP

<?php
(function($x) {
    return $x;
})(2)

and Python

(lambda x: x) (2)

and Ruby

->(x) { x }.call(2)

and Haskell

(\x -> x) 2

Some more simple examples

(λx.x) 10
10
---
(λx.x * 2) 2
4
---
(λx.1 + x) 2
3

Free variables

A free variable is one that is not mentioned in the head of a lambda - y is a free variable in the expression λx.xy. This does not prevent the expression from being reduced though.

(λx.xy) z
zy

The opposite of a free variable is a bound variable - it is bound to an argument specified in the head of the lambda.

Higher order

The lambda calculus can also encode higher order operations - that is lambdas that can accept lambdas as arguments and/or return lambdas. To make it easier to keep track of this process I will use “notes to self” inside square brackets ([]) that illustrate the value an argument was substituted with.

For reference here is the identity example again with the additional substitution notation.

(λx.x) 2
[ x := 2 ]
2

Now for the first higher order lambda expression - the identity lambda applied to itself.

(λx.x) (λy.y)
[ x := (λy.y) ]
λy.y

This is a good time to mention that the lambda calculus is left associative during beta reduction. We start with the leftmost expression and apply the left most argument to it. In the following example I’ve added an extra initial step to wrap the first reduction inside parentheses (()) so as to make this association explicit

(λx.x) (λy.y) z
((λx.x) (λy.y)) z  ; here are those extra parentheses
[ x := (λy.y) ]
(λy.y) z
[ y := z ]
z

α-equivalence

This refers to two different expressions that when given the same argument would return the same result. They are functionally equivalent to each other and you could substitute one for the other.

λx.x == λy.y == λz.z
λxy.yx == λzq.qz == λpt.tp

conversely, due to different ordering in the body the following is not equivalent.

λxy.xy != λzq.qz

Importantly, this property gives you the opportunity to rename variables where there may be clashes in an expression.

(λz.z) λz.zz == (λz.z) λy.yy

Where there are free variables in the expression it is not possible to establish equivalence, but you can still rename those that are bound (x in the following example).

λx.xz != λx.xz

If you wanted to avoid a variable name collision in the aforementioned expression you could rename x.

λx.xz -> λy.yz

Multiple arguments and currying

The same basic reduction rules apply when dealing with lambdas that have multiple arguments, but there is a little additional rule. Each argument is actually a lambda and they are nested - this is currying.

λxy.xy

is actually more like the following when considered in its most explicit form.

λx.(λy.xy)

The more arguments you have the more nested lambdas you’ll have.

λxyz.xyz == λx.(λy.(λz.xyz))
λxyzq.xyzq == λx.(λy.(λz.(λq.xyzq)))

This is to say that the first notation is shorthand for each argument being the application of a lambda function.

Now for a multi-argument reduction

(λxy.xy) p t
(λx.(λy.xy)) p t
[ x := p ]
(λy.py) t
[ y := t ]
pt

and again for a higher order example.

(λxy.xy) (λz.q) 1
(λx.(λy.xy)) (λz.q) 1
[ x := (λz.q) ]
(λy.(λz.q)y) 1
[ y := 1 ]
(λz.q) 1
[ z := 1 ]
q

Worth noting here that the z is not used in the lambda body so the value 1 simply evaporates (dropped from the expression/result).

It is, of course, possible to work through more complex problems like the following expression - remembering we start with the left most expression first.

(λxyz.xz(yz)) (λmn.m) (λp.p)
; let's expand to indicate curried arguments
(λx.(λy.(λz.xz(yz)))) (λm.(λn.m)) (λp.p)
[ x := (λm.(λn.m)) ]
(λy.(λz.(λm.(λn.m)) (z) (yz))) (λp.p)
[ y := (λp.p) ]
λz.(λm.(λn.m)) (z) ((λp.p) z)
; there are no more arguments to apply but
; we can still reduce internally
; again we want to do the left most first
[ m := z ]
λz.(λn.z) ((λp.p) z)
[ n := ((λp.p) z) ]
; as n is not used in the body it evaporates
; and the lambda returns z
λz.z

So after all that we land up with the identity function at the end.

Combinators

A lambda term with no free variables (all variables are bound), which serves to combine values.

λz.z
λzy.zy
λxyz.xz(yz)

As opposed to those that contain free variables - in this case y.

λz.y
λz.xy

There is a very famous combinator called the Y combinator that looks like this.

λf.(λx.f (x x))(λx.f (x x))

Divergence

Not all expressions can be considered to converge because they lead to replication and the beta reduction process never ends. Consider the Ω (Omega) divergence below.

(λx.xx) (λy.yy)
[ x := λy.yy ]
(λy.yy) (λy.yy)
[ y := λy.yy ]
(λy.yy) (λy.yy)
; and so on forever

Some exercises for you

The worked answers to these exercises are available after the summary.

Combinator exercises

For each of the following determine if they are combinators or not.

λq.qq
λts.stp
λz.fg
λxy.yx
λrgf.f (ri) g

α-equivalence exercises

Are these terms α-equivalent?

λz.z and λa.a
λbq.qb and λzt.zt
λz.zg and λp.pg
λb.λa.a and λa.λb.b
λd.λxy.y and λf.λyx.x

β-reduction exercises

Reduce the following to their β-normal forms. It will be a lot easier if you use a pen and paper or even a text document in your editor to work through these.

(λa.ab) (λq.q)
(λf.(λg.fgg)) (λn.n) m
(λs.(λp.(sp) s)) (λt.q)
(λb.(λm.(bb) m) (λq.vq)) (λx.(λe.e))
λf.((λx.f (x x)) (λx.f (x x)))
(λq.qg) (λp.(λs.ss)) (λt.t) z
(λfg.gf) (λba.a) (λz.z) pq
(λpt.pt) (λx.xx) (λf.ff)
(λpt.t) g (λq.(λv.vv)) o (λu.uu) (λpf.fz)

Summary

Lambda expressions are:

reduced from left to right

(λx.x) (λy.y) g
[ x := (λy.y) ]
(λy.y) g
[ y := g ]
g

left associative and greedy

(λx.x) (λy.y) g != (λx.x λy.y g)

; first expression
(λx.x) (λy.y) g
[ x := (λy.y) ]
(λy.y) g
[ y := g ]
g

; second expression
(λx.x λy.y g)
(λx.(x(λy.y g)))

applied/reduced through β-reduction to their β-normal form or point of divergence (they either self-replicate or grow - think Y-Combinator and Ω (Omega))
combinators when all variables are bound to arguments (no free variables) - therefore serving to combine values together

Answers

Combinator answers

Yes
No (p is free)
No (fg are free)
Yes
No (i is free)

α-equivalence answers

β-reduction answers

1

(λa.ab) (λq.q)
[ a := (λq.q) ]
(λq.q) b
[ q := b ]
b

2

(λf.(λg.fgg)) (λn.n) m
[ f := (λn.n) ]
(λg.(λn.n) gg) m
[ g := m ]
(λn.n) (m)m
[ n := m ]
mm

3

(λs.(λp.(sp) s)) (λt.q)
[ s := (λt.q) ]
(λp.(λt.q)p) (λt.q)
[ p := (λt.q) ]
(λt.q) (λt.q)
[ t := (λt.q) ]  ; not that it matters, `t` is dropped anyway
q

4

(λb.(λm.(bb) m) (λq.vq)) (λx.(λe.e))
[ b := (λx.(λe.e)) ]
(λm.((λx.(λe.e)) (λx.(λe.e))) m) (λq.vq)
[ m := (λq.vq) ]
(λx.(λe.e)) (λx.(λe.e)) (λq.vq)
[ x := (λx.(λe.e)) ]
(λe.e) (λq.vq)
[ y := (λq.vq) ]
(λq.vq)

5

λf.((λx.f (x x)) (λx.f (x x)))
[ x := (λx.f (x x)) ]
λf.(f((λx.f (x x) (λx.f (x x)))))
[ x := (λx.f (x x)) ]
λf.(f(f((λx.f (x x) (λx.f (x x))))))
[ x := (λx.f (x x)) ]
λf.(f(f(f((λx.f (x x) (λx.f (x x)))))))
; the Y-combinator diverges and the expression actually grows!

6

(λq.qg) (λp.(λs.ss)) (λt.t) z
[ q := (λp.(λs.ss)) ]
(λp.(λs.ss)) g (λt.t) z
[ p := g ]
(λs.ss) (λt.t) z
[ s := (λt.t) ]
(λt.t) (λt.t) z
[ t := (λt.t) ]
(λt.t) z
[ t := z ]
z

7

(λfg.gf) (λba.a) (λz.z) pq
[ f := (λba.a) ]
(λg.g(λba.a)) (λz.z) pq
[ g := (λz.z) ]
(λz.z) (λba.a) pq
[ z := (λba.a) ]
(λba.a) pq
[ b := p ]
(λa.a) q
[ a := q ]
q

8

(λpt.pt) (λx.xx) (λf.ff)
[ p := (λx.xx) ]
(λt.(λx.xx)t) (λf.ff)
[ t := (λf.ff) ]
(λx.xx) (λf.ff)
[ x := (λf.ff) ]
(λf.ff) (λf.ff)
; this diverges

9

(λpt.t) g (λq.(λv.vv)) o (λu.uu) (λpf.fz)
[ p := g ]
(λt.t) (λq.(λv.vv)) o (λu.uu) (λpf.fz)
[ t := (λq.(λv.vv)) ]
(λq.(λv.vv)) o (λu.uu) (λpf.fz)
[ q := 0 ]
(λv.vv) (λu.uu) (λpf.fz)
[ v := (λu.uu) ]
(λu.uu)  (λu.uu) (λpf.fz)
[ u := (λu.uu) ]
(λu.uu) (λu.uu) (λpf.fz)
; this diverges before it can reduce all its terms leaving `(λpf.fz)` unreachable/dangling