Haskell implementation of the simply typed lambda calculus with inductive types. It ranges over base types, has functions, products, sums, units, and a special μ
type (with a μ
variable X
) to define recursive types. It is strongly normalizing due to the well-founded nature of inductive types but not Turing Complete.
Inductive types are an important stepping-stone towards algebraic datatypes in functional languages and a powerful tool to use with inductive equational reasoning. Additionally, catamorphisms in this language form canonical functional folds.
You need Haskell, this compiles with GHC 8.2.2 at least (Stack resolver: lts-11.0).
You can use cabal to build and run this, see this README, alternatively you can use vanilla ghc to build:
To compile and run do:
ghc -O2 -o cata Main
then run ./cata
Alternatively to use the GHCi Interpreter do:
ghci Main
then type main
In either case you get something like the following:
Welcome to the Cata REPL
Type some terms or press Enter to leave.
>
Note: When run in GHCi, you don't have the luxury of escaped characters, backspace, delete etc... Compile it using GHC if you need this.
Where you can then have some fun, try these examples:
\x:A.x
in (inl ():1+M(1+X)):M(1+X)
(the encoding of peano style zero)\x:M(1+((1*X)*X)).x
(the identity function for a binary tree, with nil leaves and elements stored internally)case (inl () : 1 + A) (\x:1.x) (\y:A.())
(performs case analysis on the first argument tocase
, passing the result to the first function ifinl
or otherwiseinr
).\x:A.\y:B.snd (x,y)
.\n:M(1+X).in (inr n:1+M(1+X)):M(1+X)
the successor function for inductively defined nats.\m:M(1+X). cata (\x:1+M(1+X). case x (\u:1.m) (\n:M(1+X).in (inr n:1+M(1+X)):M(1+X))):M(1+X) -> M(1+X)
add for inductively defined nats
The parser is also smart enough to recognise λ, so you can copy and paste from the output:
> \x:A.x
= λx:A.x
> λx:A.x
= λx:A.x
>
denotes the REPL waiting for input, =
means no reductions occurred (it's the same term), ~>
denotes one reduction, and ~>*
denotes 0 or more reductions (although in practice this is 1 or more due to =
).
Note: see syntax below for a description of what the various symbols mean.
There is also a reduction tracer, which should print each reduction step. prefix any string with '
in order to see the reductions:
> '(\m:M(1+X). cata (\x:1+M(1+X). case x (\u:1.m) (\n:M(1+X).in (inr n:1+M(1+X)):M(1+X))):M(1+X) -> M(1+X)) (in (inr (in (inl ():1+M(1+X)):M(1+X)):1+M(1+X)):M(1+X)) (in (inr (in (inl ():1+M(1+X)):M(1+X)):1+M(1+X)):M(1+X))
~> cata (λx:⊤ + μ(⊤ + X).case x (λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X))):μ(⊤ + X)->μ(⊤ + X) (in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X))
~> (λx:⊤ + μ(⊤ + X).case x (λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X))) ((λx:⊤ + X.case x (λy:⊤.inl ((λx:⊤.x) y):⊤ + X) (λy:⊤.inr (cata (λx:⊤ + μ(⊤ + X).case x (λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X))):μ(⊤ + X)->μ(⊤ + X) y):⊤ + X)) (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)))
~> case ((λx:⊤ + X.case x (λy:⊤.inl ((λx:⊤.x) y):⊤ + X) (λy:⊤.inr (cata (λx:⊤ + μ(⊤ + X).case x (λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X))):μ(⊤ + X)->μ(⊤ + X) y):⊤ + X)) (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X))) (λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X))
~> case (case (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)) (λy:⊤.inl ((λx:⊤.x) y):⊤ + X) (λy:⊤.inr (cata (λx:⊤ + μ(⊤ + X).case x (λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X))):μ(⊤ + X)->μ(⊤ + X) y):⊤ + X)) (λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X))
~> case ((λy:⊤.inr (cata (λx:⊤ + μ(⊤ + X).case x (λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X))):μ(⊤ + X)->μ(⊤ + X) y):⊤ + X) (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X))) (λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X))
~> case (inr (cata (λx:⊤ + μ(⊤ + X).case x (λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X))):μ(⊤ + X)->μ(⊤ + X) (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X))):⊤ + X) (λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X))
~> (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X)) (cata (λx:⊤ + μ(⊤ + X).case x (λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X))):μ(⊤ + X)->μ(⊤ + X) (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)))
~> in (inr (cata (λx:⊤ + μ(⊤ + X).case x (λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X))):μ(⊤ + X)->μ(⊤ + X) (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X))):⊤ + μ(⊤ + X)):μ(⊤ + X)
~> in (inr ((λx:⊤ + μ(⊤ + X).case x (λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X))) ((λx:⊤ + X.case x (λy:⊤.inl ((λx:⊤.x) y):⊤ + X) (λy:⊤.inr (cata (λx:⊤ + μ(⊤ + X).case x (λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X))):μ(⊤ + X)->μ(⊤ + X) y):⊤ + X)) (inl ():⊤ + μ(⊤ + X)))):⊤ + μ(⊤ + X)):μ(⊤ + X)
~> in (inr (case ((λx:⊤ + X.case x (λy:⊤.inl ((λx:⊤.x) y):⊤ + X) (λy:⊤.inr (cata (λx:⊤ + μ(⊤ + X).case x (λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X))):μ(⊤ + X)->μ(⊤ + X) y):⊤ + X)) (inl ():⊤ + μ(⊤ + X))) (λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X))):⊤ + μ(⊤ + X)):μ(⊤ + X)
~> in (inr (case (case (inl ():⊤ + μ(⊤ + X)) (λy:⊤.inl ((λx:⊤.x) y):⊤ + X) (λy:⊤.inr (cata (λx:⊤ + μ(⊤ + X).case x (λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X))):μ(⊤ + X)->μ(⊤ + X) y):⊤ + X)) (λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X))):⊤ + μ(⊤ + X)):μ(⊤ + X)
~> in (inr (case ((λy:⊤.inl ((λx:⊤.x) y):⊤ + X) ()) (λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X))):⊤ + μ(⊤ + X)):μ(⊤ + X)
~> in (inr (case (inl ((λx:⊤.x) ()):⊤ + X) (λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) (λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X))):⊤ + μ(⊤ + X)):μ(⊤ + X)
~> in (inr ((λu:⊤.in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)) ((λx:⊤.x) ())):⊤ + μ(⊤ + X)):μ(⊤ + X)
~> in (inr (in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)
Note the above computes 1+1, (Submit a PR with a better parser to prevent the need for all these type annotations!)
There is also a typing mechanism, which should display the type or fail as usual.
> t(\n:M(1+X).in (inr n:1+M(1+X)):M(1+X)) (in (inl ():1+M(1+X)):M(1+X))
μ(⊤ + X)
> t\x:X. x x
Note: The above terms denote succ 0 and self-application respectively. If you provide a un-typeable term, the type checker will fail and reduction will not occur.
You can save terms for the life of the program with a let
expression. Any time a saved variable appears in a term, it will be substituted for the saved term:
> let zero = in (inl ():1+M(1+X)):M(1+X)
Saved term: in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)
> let succ = (\n:M(1+X).in (inr n:1+M(1+X)):M(1+X))
Saved term: λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X)
> succ zero
~>* in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)
Note: Consequently let
and =
are keywords, and so you cannot name variables with these. Additionally inl
, inr
, cata
, in
, case
, fst
, snd
, and X
are keywords in Cata.
Additionally we have type level lett
statements that allow you to define and use types:
> lett NAT = M(1+X)
Saved type: μ(⊤ + X)
> let zero = in (inl ():1+NAT):NAT
Saved term: in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)
> let succ = \n:NAT.in (inr n:1+NAT):NAT
Saved term: λn:μ(⊤ + X).in (inr n:⊤ + μ(⊤ + X)):μ(⊤ + X)
> succ zero
~>* in (inr (in (inl ():⊤ + μ(⊤ + X)):μ(⊤ + X)):⊤ + μ(⊤ + X)):μ(⊤ + X)
This makes it easier to define both terms and types, but does not allow type level application (See Omega). lett
is also a keyword.
We base the language on the BNF for Cata:
However we adopt standard bracketing conventions to eliminate ambiguity in the parser. Concretely, the parser implements the non-ambiguous grammar as follows:
and types:
Some notes about the syntax:
- The above syntax only covers the core calculus, and not the repl extensions (such as let bindings above). The extensions are simply added on in the repl.
- Variables are strings (excluding numbers), as this is isomorphic to a whiteboard treatment and hence the most familiar.
X
is a reservedμ
variable and is interpreted as such in a term. μ-less terms that useX
won't act differently than STLC but introducingμ
will change the semantics. Best to avoid unless usingμ
.- Types range over upper-case characters
A,B,C...
, nested arrow types:T -> T
, product typesA * B
(or×
), sum typesA + B
, unit types1
(or⊤
), a special Mu typeμ(T)
(orM(T)
), and Mu type variableX
. - Arrows associate to the right so that
A -> A -> A
is the same asA -> (A -> A)
but not((A -> A) -> A)
. Similar rules follow for the other types. - Products have the highest precedence, followed by sums, arrows, and then all other types.
- Nested terms don't need brackets:
\x:A.\y:B. y
unless enforcing application on the right. Whitespace does not matter(\x:A. x)
unless it is between application where you need at least one space. - Products are 2-element pairs like in Haskell. You form products like
(x, y)
and access each element usingfst
(orπ1
) andsnd
(orπ2
). - Sums are 2-element co-pairs, formed with either
inl x:A
orinr y:B
(assuming eitherx:A
ory:B
is in scope).case s f g
does case analysis ons
, passing the result to the functionf
if s wasinl
org
ifinr
. - Units are largely uninteresting, but can be formed as
()
like in Haskell, and typed like1
or (or⊤
). - To construct terms using
cata
orcase
, use space to apply arguments. for instance,case s f c
appliess
tocase
, which then appliesf
to thecase s
andg
tocase s f
etc... - Inductive types are formed by prefixing a term with
in
and a type withM
(orμ
). Concretely, an inductive term looks likein x:t
wherex
is the term andt
is its type. Similarly, an inductive type looks likeμ(T)
whereT
is the type we insist is inductive. For example, peano nats have type1+ μ(1+X)
where1
is zero, andX
is a μ variable capturing the subterm of type1+μ(1+X)
(also a nat). The example term zero captures this idea above. See the semantics for a description of what these constructions mean. - I should note that the parser is pretty verbose with it's types, a nice challenge would be to implement a backtracking parser as a means to reduce the number of necessary type annotations (in say the zero example above).
- To quit use
Ctrl+C
or whatever your machine uses to interrupt computations.
The semantics implements full beta-reduction on terms and alpha-equivalence as the Eq
instance of CataTerm
. The semantics are the same as the STLC but with additional rules for In
, Cata
, and all the machinery for products, sums and units. We reformulate the semantics as typing judgements:
for variables (adopted from STLC):
for abstractions (adopted from STLC):
and application (adopted from STLC):
beta reduction (adopted from STLC):
We have special introduction, elimination and reduction rules for products:
for sums:
and for units:
with the obvious rules for reduction inside each term that has a subterm. Finally, we have rules for μ
type introduction, elimination and, reduction. Note how this forms a fold:
with reduction inside terms like above.
- The rules for variables, abstraction and application follow directly from STLC; the rules for products, sums, and units are largely standard in the literature. μ types in the literature are typically written as
μX.T
where T is a type that contains X. Since we only use one μ variable, we just writeT
(submit a PR with the more general form!) - We adopt the categorical notation of a functor, which expresses a mapping from terms to terms and types to types. For example, given the functor type
F X = 1 + X
, and typeY = B
, thenF Y = 1 + Y = 1 + B
. similarly if we had a terminl _ : F _
, theninl b: F B
for someb:B
. This idea translates into Cata, however we don't do type-level application, and so you must apply these types yourself before using them. Concretely, you might provideF Y
as the type1 + B
in Cata. - We can construct functors using
+
,*
,->
. A functor is strictly positive if it contains only constant variablesA,B,C,...
, μ variableX
,+
,*
, and->
such thatX
only occurs on the right of arrows. For every s.p. functor F, we can define an inductive typeμF
. A strictly positive check has not been implemented yet, behaviour for non-s.p functors is thus undefined (submit a PR, as it's an easy fix). - For instance, Peano numbers have the shape
F X = 1 + X
where 1 is zero and X is the inductive case (like succ n). An F-shaped term could beinl ():1+X
and hence an inductive nat isin (inl ():1+X): F (μF)
, however we must now unify1+X
andF (μF)
. HereX
must beμF
, whereF X = 1 + X
and so the final type is1+μ(1+X)
, where zero isin (inl ():1+μ(1+X)):1+μ(1+X)
in Cata. Unification is not implemented, so you must derive and input this type yourself (or submit a PR!). - We can implement functions on functors too, supposing
F X = 1 + X
, then a functionF X -> X
might reduce F-shaped terms into a simpler formX
. For instance,\x:1+X.case x (\x:1.()) (\x:X.())
reduces terms of type1+X
to()
; in languages with useful types (like booleans), we might use this to implement more powerful constructs (like conditionals). - We can extend the notation of typed reduction by lifting it to the inductive level, that is given
f:F X -> X
, we getcata f:μF X -> X
. This forms the elimination rule for inductive types and is known as the catamorphism of F. - We combine our knowledge of catamorphisms and inductive types and get folds (finally!). Intuitively this can be understood as a way to deconstruct inductive types into simpler ones (e.g. summing up a list of numbers). The reduction rule states that applying a catamorphism
cata f
to an inductive termin t
will first lift the cataterm to a termF (cata f)
of typeF (μF) -> F X
which is then applied tot:F (μF)
so thatF (cata f) t
has typeF X
. We apply the result tof
to get a term of typeX
which is the simplest form. The implementation ofF
is handled for you, run the reduction tracer to see what it does! - This implementation follows a small-step operational semantics and Berendregt's variable convention (see
substitution
in Cata.hs). - Reductions include the one-step reduction (see
reduce1
in Cata.hs), the many-step reduction (seereduce
in Cata.hs).
- Cata.hs contains the Haskell implementation of the calculus, including substitution, reduction, and other useful things.
- Parser.hs contains the monadic parser combinators needed to parse input strings into typed-term ASTs for the calculus.
- Repl.hs contains a simple read-eval-print loop which hooks into main, and into the parser.
- Main.hs is needed for GHC to compile without any flags, it also invokes the repl.
For contributions, see the project to-do list or submit a PR with something you think it needs.