-
Notifications
You must be signed in to change notification settings - Fork 216
Coinductive Types
The question often arises as to whether Lean will one day have coinductive types. There are no plans to implement this in the near future, but this page contains some notes that are relevant in that regard (transplanted from issue #1441).
"Adding coinductive types to Lean" can mean one of two things: extending Lean's kernel with new constructions and operations or building coinductive types on top of existing primitives. Lean has adopted the design decision of keeping the kernel as small and simple as possible; for example, it implements a minimal form of inductive data type with primitive recursors (à la Dybjer). Other forms of inductive types and recursion, such as nested and mutual inductive types, arbitrary structural recursion, well-founded recursion, etc. have to be "compiled down" to those. We would therefore expect coinductive types to be defined in Lean, without changes to the kernel.
Isabelle has implemented a powerful coinductive datatype package along those lines: https://people.mpi-inf.mpg.de/~jblanche/co-data-impl.pdf. There are a few things that make the project more complicated in Lean:
- Isabelle's definitions are classical, whereas in Lean the development should be computational.
- In dependent type theory, users will expect some reductions to hold definitionally (though it is negotiable as to which ones).
- In general, dealing with dependent types poses additional headaches.
The first issue is important because we would like to generate bytecode from such definitions or compile them to native code. Compilers can recognize coinductive definitions and handle them in special ways, but constructive definitions provide a computational semantics, thereby specifying intended behavior.
Ken Sakayori (@sakas--), an undergraduate at the University of Tokyo at the time, visited Carnegie Mellon for three weeks in the summer of 2015. Together we worked out a proposal for implementing coinductive types in Lean, and Ken carried out a proof-of-concept example by hand. Some notes I wrote up afterwards and Ken's formalization (both with snippets of Lean 2 code) are found below.
Implementing this in Lean 3 would require a lot of work. Someone would have to:
- parse coinductive definitions
- compile them down to inductive primitives
- extend the Lean's function definition system (aka the "equation compiler") to use them
- extend the bytecode compiler and native compiler to use them and generate efficient code
- maintain the system
The first is not so hard. It may soon even be possible to write the parser extension in Lean itself. But the rest are difficult. Building expressions of dependent type theory automatically is fiddly and subtle, and there will likely be many unforeseen corner cases. (It should be possible to carry out all or at least substantial parts of the constructions in Lean itself, with the metaprogramming facilities and API.) Users will expect the same gadgets that are being developed for inductive types (like mutual-nested-dependent-inductive-coinductive constructions and recursion). And all the work will be for nothing unless someone is able to keep the package going and respond to issues, questions, bug reports, and so on.
That said, I will now reproduce the notes on the construction that Ken and I proposed, and add some comments from Leo at the end. Ken has made his Lean 2 code available here: https://gist.github.com/sakas--/2758e8ebfc91db45526c. The notes below also use Lean 2 syntax.
To illustrate the idea, let us consider an inductive type with three constructors:
inductive two_tree (A : Type) : Type :=
| nil {} : two_tree A
| cons : A → two_tree A → two_tree A
| cons₂ : A → two_tree → two_tree A →two_tree A
In other words, a two_tree
is either a leaf, a labeled node with one child, or a labeled node with two children. We chose this example because it is just complicated enough to stand for a general case.
The coinductive version, a lazy two tree
would be similar, except that some branches might be infinite. The idea is to model this as a sequence of finite approximations. For every n, we have a depth n approximation, in which any branch that has not terminated yet is marked continue
.
To do this, we only have to add an index to the definition above, and a constructor for continue
:
inductive l_two_tree_approx (A : Type) : ℕ → Type :=
| continue {} : l_two_tree_approx A 0
| nil {} : Π {n : ℕ}, l_two_tree_approx A (n + 1)
| cons : Π {n : ℕ}, A → l_two_tree_approx A n → l_two_tree_approx A (n + 1)
| cons₂ : Π {n : ℕ}, A → l_two_tree_approx A n → l_two_tree_approx A n →
l_two_tree_approx A (n + 1)
We want to say that a lazy two tree is a sequence of approximations, one for each n
, such that the n
th approximation agrees with the (n+1)
st. Here is what it means for consecutive approximations to agree:
definition agrees : Π {n : ℕ}, l_two_tree_approx A n → l_two_tree_approx A (n + 1) → Prop
| 0 continue _ := true
| _ nil nil := true
| _ nil _ := false
| _ (cons a t₁) (cons a t₂) := agrees t₁ t₂
| _ (cons a t₁) _ := false
| _ (cons₂ a t₁₁ t₁₂) (cons₂ a t₂₁ t₂₂) := agrees t₁₁ t₂₁ ∧ agrees t₁₂ t₂₂
| _ (cons₂ a₁ t₁ t₁') _ := false
We can then define a lazy two tree to be a sequence of approximations, such that each agrees with the next:
definition l_two_tree (A : Type) : Type :=
{s : Π (n : ℕ), l_two_tree_approx A n | ∀ n, l_two_tree_approx.agrees (s n) (s (n + 1))}
That is really the core idea. The definition of agrees
makes sure there is progress, i.e. if (agrees t1 t2)
holds, then t2
refines any continue
left in t1
. We can then go on to define predicates is_nil
is_cons
is_cons₂
and functions head
tail
head₂
tail₂₁
tail₂₂
. Ken worked this all out in the gist. It is tedious, but it is entirely formulaic: you define the operations on approximations, you show that they preserve agreement, and then you list the definition to sequences.
What about corecursion? The idea is that given f : 1 + A * X + A * X * X
we should be able to define a function corec f satisfying
corec f x :=
nil if f x is is in the first component
cons a (corec f x') if f x returns (a, x') in the second component
cons₂ a (corec f x₁) (corec f x₂) if f x returns (a, x₁, x₂) in the third component
We can write this in a nice way be defining
inductive ltwo_tree_shape (X A : Type) :=
| inl {} : ltwo_tree_shape X A
| inm {} : A → X → ltwo_tree_shape X A
| inr {} : A → X → X → ltwo_tree_shape X A
local notation `shape` := ltwo_tree_shape
and then defining
definition corec (f : X → shape X A) (x : X) : l_two_tree A := ...
Ken did all this, and got it to work. He proved, for example,
theorem head_corec {f : X → shape X A} {x : X} (H : is_cons (corec f x)) : head H = pr₁₁ (is_inm_of_is_cons_corec H)
This is a definitional reduction rule in Coq; we only have it propositionally. The other reduction rules should be similar (though tedious) to prove.
I think this solution is very general. Here are two things I am not yet sure of:
- Can we do mutual coinduction / induction, as in Isabelle?
- Can corec simulate all of the guarded recursions that are allowed in Coq, or, at least, can the method be extended to handle them? I think the answer to both questions is "yes".
For this example, Ken had to write almost 600 lines of Lean code (and it is not quite finished). But it is mostly boilerplate, and will hopefully be automatable. One nice thing is that it does not seem to require a lot of new syntax. Instead of
inductive two_tree (A : Type) : Type :=
| nil {} : two_tree A
| cons : A → two_tree A → two_tree A
| cons₂ : A → two_tree → two_tree A →two_tree A
we want to write
coinductive l_two_tree (A : Type) : Type :=
| nil {} : l_two_tree A
| cons : A → l_two_tree A → l_two_tree A
| cons₂ : A → l_two_tree → l_two_tree A →l_two_tree A
and have the Lean generate all the stuff for us.
There is a small wrinkle, in that we have to generate constants is_nil
is_cons
is_cons₂
head
tail
head₂
tail₂₁
tail₂₂
. The first three can be taken from the names of the constructors, but users need some way of specifying the others. Isabelle has you do it something like this:
coinductive l_two_tree (A : Type) : Type :=
| nil {} : l_two_tree A
| cons (head : A) (tail : l_two_tree A) : l_two_tree A
| cons₂ (head₂ : A) (tail₂₁ : l_two_tree) (tail₂₂ : l_two_tree A) : l_two_tree A
Lean 3 now supports such syntax for inductive types.
Another issue is to come up with nicer syntax for writing corecursive definitions. A full-blown effort will require extending the function definition system. But even a no-frills solution, writing
f = corec (λ x, ...)
where ... is any expression of type shape X A
is not too bad. It is intuitive: shape X A
just specifies which constructor you want to use, in our case, nil
, cons
, or cons₂
. It seems no more difficult than using the rec
primitives for inductive types in Lean.
After reading these notes, Leo offered the following suggestions.
It is really useful to devise schematic proofs (aka templates) that work for any coinductive declaration. To design this kind of template, it is useful to try many examples by hand. At least, this is the approach I used when I implemented features such as the brec_on
definition.
Here are some suggestions:
- Try encoding a coinductive indexed family. Indexed families usually create technical problems. I think it will be helpful to understand how they impact the proof templates.
- Coinductive predicate. I usually have to write custom code for inductive predicates.
- Reflexive coinductive types. We say an inductive type
T
is reflexive if it contains at least one constructor that takes as an argument a function returningT
. Example:
coinductive inftree (A : Type) : Type :=
| leaf : A → inftree A
| node : (nat → inftree A) → inftree A
It is also useful to try to simplify the proofs and get them close to a general template as possible. Here is an example where I tried to simplify some of the definitions/theorems in Ken’s file.