Future Work - A general theory of syntax with bindings

The next step for our theory is to turn it into a fully functional package for Isabelle/HOL, with a fully automated user interface. Users will be able to specify their syntax quickly, just by giving its constructors, and in exchange they will get a rich collection of lemmas about terms and operators on them (see Section 3.2.4 of Chapter 3), as well as bindings-aware reasoning and definition principles, all customized for the particular instance. The whole functorial setting will be hidden from the user, and its properties will be phrased naturally in terms of the constructors of the syntax. This will include all the modularity properties, as well as mutually inductively specified objects that yield many-sorted syntaxes. This process is theoretically straightforward but technically demanding, as witnessed by its predecessor: the BNF-based (co)datatype package for Isabelle/HOL [18, 70, 16, 13, 17].

On a more theoretical level, there are some additional features that we plan to add soon to our framework. In particular, we have shown how we implement and generalize to our functorial setting a swapping-based (co)recursion; as discussed, by now we should more properly call it “renaming-based (co)recursion”. However, our second functorial framework still lacks a substitution-based (co)recursion that generalizes our principle presented in Section 3.3, Chapter 3. After we formalize this result, we will also be able to port to the new framework a generalization of our previous interpretation of syntaxes in semantic domains (Section 3.3.4, Chapter 3): the interpretation function will be defined independently of the particular syntax, and good behaviour with respect to substitution (the “substitution lemma”) will be guaranteed once and for all.

As later developments, we plan to go beyond the study of datatypes and formalize binding-aware non-structural induction principles, such as rule induction, which is very well supported by Isabelle’s nominal package [73]. For a dual rule coinduction principle, our study (Section 6.4.2, Chapter 6) suggests that fresh coinduction may not be the best notion to consider. Similar considerations hold for general recursion in the presence of bindings.

Appendix A

Appendix

A.1 More Details About BNFs and BNF-based (Co)datatypes

The properties of BNFs and their employment in the construction of modular (co)datatypes are described in [70] and [18]. Here we recall the reasoning principles emerging from these constructions, referring to the notations in Section 2.3.

The minimality of the datatype construction is expressed in the following structural induction proof principle:

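(SI) Given the predicate ϕ: α T → bool, if the condition

$$\forall x : (\alpha, \alpha\,T)\,F.\ \bigl(\forall t \in \mathsf{set}^{m+1}_F\,x.\ \varphi\,t\bigr) \longrightarrow \varphi\,(\mathsf{ctor}\,x)$$

holds, then the following holds: ∀t: α T. ϕ t.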
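As a worked instance (an illustration added here, not taken from the cited constructions), assume the binary BNF (α, τ) F = unit + α × τ, so that m = 1 and α T is the type of finite lists over α. The constructor names Nil and Cons are assumptions made for the example. Under these assumptions, (SI) specializes to the familiar list induction rule:

```latex
% Assumption: (\alpha,\tau)\,F = \mathsf{unit} + \alpha \times \tau, hence \alpha\,T = \alpha\,\mathsf{list}
% with ctor (Inl ()) = Nil and ctor (Inr (a, t)) = Cons a t.
\begin{align*}
\mathsf{set}^2_F\,(\mathsf{Inl}\,()) &= \emptyset, &
\mathsf{set}^2_F\,(\mathsf{Inr}\,(a, t)) &= \{t\}
\end{align*}
% Instantiating the condition of (SI) at both shapes of x gives:
\[
\varphi\,\mathsf{Nil} \;\wedge\; \bigl(\forall a\;t.\ \varphi\,t \longrightarrow \varphi\,(\mathsf{Cons}\,a\,t)\bigr)
\;\Longrightarrow\; \forall t : \alpha\,\mathsf{list}.\ \varphi\,t
\]
```

The Nil case has no recursive components, so its induction hypothesis is vacuous and it must be proved outright.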

For codatatypes, we no longer have a structural induction proof principle, but a structural coinduction principle:

(SC) Given the binary relation ϕ: α T → α T → bool, if the condition

$$\forall x, y : (\alpha, \alpha\,T)\,F.\ \varphi\,(\mathsf{ctor}\,x)\,(\mathsf{ctor}\,y) \longrightarrow \mathsf{rel}_F\,[(=)]^m\,\varphi\,x\,y$$

holds, then the following holds: ∀s, t: α T. ϕ s t ⟶ s = t.
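As a worked instance (again an added illustration), assume the binary BNF (α, τ) F = α × τ, so that m = 1 and the codatatype α T is the type of streams over α. The constructor name SCons is an assumption made for the example. The relator of a product relates pairs componentwise, so (SC) becomes the usual stream coinduction rule:

```latex
% Assumption: (\alpha,\tau)\,F = \alpha \times \tau, hence \alpha\,T = \alpha\,\mathsf{stream}
% with ctor (a, s) = SCons a s.
\[
\mathsf{rel}_F\,(=)\,\varphi\,(a, s)\,(b, t) \;\Longleftrightarrow\; a = b \,\wedge\, \varphi\,s\,t
\]
% Instantiating the condition of (SC) gives:
\[
\bigl(\forall a\;b\;s\;t.\ \varphi\,(\mathsf{SCons}\,a\,s)\,(\mathsf{SCons}\,b\,t) \longrightarrow a = b \,\wedge\, \varphi\,s\,t\bigr)
\;\Longrightarrow\; \forall s, t : \alpha\,\mathsf{stream}.\ \varphi\,s\,t \longrightarrow s = t
\]
```

In words: to prove two streams equal, it suffices to exhibit a relation containing them that forces equal heads and keeps the tails related.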

Visual intuition. Datatypes and codatatypes (and the difference between them) can be viewed as “shape plus content.” To simplify notation, consider the binary BNF (α, τ) F and its datatype on the second component, α T, so that we have α T ≃ (α, α T) F via the isomorphism

$$(\alpha, \alpha\,T)\,F \xrightarrow{\ \mathsf{ctor}\ } \alpha\,T$$

(Thus, we have m = 1.) Fig. A.1a illustrates the effect of decomposing, or “pattern-matching,” a member of the datatype, t: α T. Such an element has the form ctor x, where x : (α, α T) F. In turn, x has two types of atoms: the items in set¹_F x, which are members of α, and the items in set²_F x, which are themselves members of the datatype; we call the latter the recursive components of t. By repeated applications of ctor, set¹_F, and set²_F, any element of the datatype can be unfolded into an F-branching tree, which has two types of nodes: ones that represent members of α, and ones that represent elements of the datatype. The former are always leaves, whereas the latter are leaves if and only if they have no recursive components themselves, i.e., applying set²_F to them yields ∅. Fig. A.1b pictures a recursive-component path of such a tree.

[Figure A.1: Visualizing a datatype. (a) Applying the (co)datatype constructor. (b) Elements of the datatype as piles of F-shapes.]

The essential property of the datatype is that all such trees are well founded, meaning that every recursive-component path ends in an item t that has no recursive components (set²_F t = ∅). This is precisely what the structural induction principle (SI) says, in a slightly different, higher-order formulation that is more suitable for proof development: A predicate ϕ ends up being true for the whole datatype if, for each element ctor x, ϕ is true for ctor x provided ϕ is true for all its recursive components t ∈ set²_F x.
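The “shape plus content” reading can be sketched in Python. This is a minimal illustration under stated assumptions: the shape (α, τ) F = Leaf α | Fork τ τ and the names Leaf, Fork, set1, set2, atoms and height are all hypothetical, and ctor is left implicit (a datatype element is identified with its top F-layer).

```python
from dataclasses import dataclass
from typing import Any

# Hypothetical F-shape: (a, t) F = Leaf a | Fork t t.
@dataclass(frozen=True)
class Leaf:
    atom: Any

@dataclass(frozen=True)
class Fork:
    left: Any
    right: Any

def set1(x):
    """Alpha-atoms stored in the top F-layer of x."""
    return [x.atom] if isinstance(x, Leaf) else []

def set2(x):
    """Recursive components stored in the top F-layer of x."""
    return [x.left, x.right] if isinstance(x, Fork) else []

def atoms(t):
    """All alpha-atoms of t, by structural recursion over set2."""
    result = list(set1(t))
    for sub in set2(t):
        result.extend(atoms(sub))
    return result

def height(t):
    """Length of the longest recursive-component path of t.

    The recursion terminates because every such path is finite,
    which is the well-foundedness property described above.
    """
    return 1 + max((height(sub) for sub in set2(t)), default=-1)

example = Fork(Leaf("a"), Fork(Leaf("b"), Leaf("c")))
```

Both functions follow the pattern licensed by (SI): the result for an element is computed from the results for its recursive components, and elements with an empty set2 act as base cases.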

If instead of a datatype we consider the codatatype α T defined as α T ≃^∞ (α, α T) F, the pictures in Fig. A.1 remain relevant. The difference is that members of α T can now be unfolded into possibly non-well-founded trees, i.e., trees that are allowed to have infinite recursive-component paths, corresponding to an infinite number of applications of the constructor. As a result, induction is no longer a valid proof principle. However, we can take advantage of the fact that the tree obtained by fully unfolding a member t of the codatatype determines t uniquely, along the principle “to be is to do,” where “to be” refers to t’s identity and “to do” refers to t’s unfolding behavior.¹ Thus, s and t will be equal whenever they are bisimilar as F-trees, that is, if there exists an F-bisimilarity relation ϕ: α T → α T → bool such that ϕ s t holds. The notion of ϕ being an F-bisimilarity means that, whenever ϕ relates two F-trees ctor x and ctor y, their top F-layers have the same shapes, positionwise equal α-atoms, and positionwise ϕ-related components. As seen in Section 2.2, such positionwise relations can be expressed using the relator of F. The structural coinduction principle (SC) embodies the above reasoning pattern.
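The bisimilarity argument can be sketched in Python for the simplest case, streams, i.e. the assumed shape (α, τ) F = α × τ. Infinite streams are given here by hypothetical finite presentations: a table sending each state to a (head, next state) pair, where a state stands for the stream obtained by unfolding it. The table STEP and the function names are assumptions made for the illustration.

```python
# Hypothetical finite presentations of three streams.
STEP = {
    "A": (0, "B"), "B": (1, "A"),                 # A unfolds to 0, 1, 0, 1, ...
    "C": (0, "D"), "D": (1, "E"), "E": (0, "D"),  # C unfolds to 0, 1, 0, 1, ...
    "Z": (0, "Z"),                                # Z unfolds to 0, 0, 0, ...
}

def prefix(step, s, n):
    """First n elements of the stream obtained by unfolding state s."""
    out = []
    for _ in range(n):
        head, s = step[s]
        out.append(head)
    return out

def bisimilar(step, s, t):
    """Try to build a bisimulation containing (s, t).

    The candidate relation is closed under taking tails; it is a
    bisimulation exactly when all related states have equal heads.
    """
    rel, todo = set(), [(s, t)]
    while todo:
        p, q = todo.pop()
        if (p, q) in rel:
            continue
        (a, p_next), (b, q_next) = step[p], step[q]
        if a != b:
            return False          # top layers differ: no bisimulation exists
        rel.add((p, q))
        todo.append((p_next, q_next))
    return True
```

States A and C are presented differently but unfold to the same stream, and the relation {(A, C), (B, D), (A, E)} found by the search witnesses this. By the coinduction principle, such a witness is all that is needed to conclude that the two unfolded streams are equal.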

Modularity. The type constructors T resulting from (co)datatype definitions are themselves BNFs, hence can be used in later (co)datatype definitions. This allows one to freely mix and nest (co)datatypes in a modular fashion.
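A minimal Python sketch of this nesting, assuming finite rose trees that nest the list type constructor: the tree operations below touch the nested lists only through map_list and set_list, which play the role of list's BNF interface. All names (Node, map_list, set_list, map_tree, set_tree) are hypothetical, and lists are stored as tuples.

```python
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class Node:
    label: Any
    children: tuple  # a "list" of subtrees, stored as a tuple

def map_list(f, xs):
    """Map function of the nested list type constructor."""
    return tuple(f(x) for x in xs)

def set_list(xs):
    """Elements stored in a list (kept in order, duplicates included)."""
    return list(xs)

def map_tree(f, t):
    """Map function of rose trees, defined via map_list only."""
    return Node(f(t.label), map_list(lambda s: map_tree(f, s), t.children))

def set_tree(t):
    """All labels of a rose tree, defined via set_list only."""
    out = [t.label]
    for s in set_list(t.children):
        out.extend(set_tree(s))
    return out

tree = Node(1, (Node(2, ()), Node(3, (Node(4, ()),))))
```

Replacing map_list and set_list by the corresponding operations of another container would leave map_tree and set_tree unchanged, which is the sense in which the resulting type constructor can itself be nested in later definitions.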

The above definitional modularity is matched by modularity with respect to proof principles: The (co)induction principle associated with a (co)datatype respects the abstraction barrier of the (co)datatypes nested in it, in that it does not refer to their definitions or their constructors; instead, it only uses their BNF interfaces, consisting of map functions, set functions and relators. For example, here is the structural coinduction principle for finitely branching, possibly non-well-founded rose trees, defined as a codatatype by α tree∞ ≃^∞ α × (α tree∞) list, where for its constructor we write Node instead of ctor:

(SCP_tree∞) Given ϕ: α tree∞ → α tree∞ → bool, if

$$\forall a, b : \alpha.\ \forall ts, ss : (\alpha\,\mathsf{tree}_\infty)\,\mathsf{list}.\ \varphi\,(\mathsf{Node}\,a\,ts)\,(\mathsf{Node}\,b\,ss) \longrightarrow a = b \,\wedge\, \mathsf{rel}_{\mathsf{list}}\,\varphi\,ts\,ss$$

then ∀s, t: α tree∞. ϕ s t ⟶ s = t.

¹ This formulation is Jan Rutten’s import of the famous existentialist dogma into the realm of fully abstract [...]

Thus, the codatatype tree∞ nests the datatype list, but its coinduction principle only refers to list’s relator structure, rel_list. In proofs, one is free to also use the particular definition of rel_list, which is the componentwise lifting of a relation to lists, but the coinduction principle for tree∞ does not depend on such details. The list type constructor is seen as an arbitrary BNF. To define unordered rose trees, we could use the finite powerset BNF fset instead of list, and the coinduction principle would remain the same, except with rel_fset instead of rel_list.
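The role of the relator as a parameter can be sketched in Python. As an assumption for the illustration, possibly infinite rose trees are given by a hypothetical finite table NODES sending each node to a (label, children) pair, and the largest bisimulation on that table is computed by a standard refinement loop (a technique swapped in for the example, not a construction from the BNF package). The same loop is run with rel_list and with rel_fset.

```python
def rel_list(R, xs, ys):
    """Componentwise lifting of R to lists: same length, R at each position."""
    return len(xs) == len(ys) and all(R(x, y) for x, y in zip(xs, ys))

def rel_fset(R, xs, ys):
    """Lifting of R to finite sets: every element has an R-partner on the other side."""
    return (all(any(R(x, y) for y in ys) for x in xs)
            and all(any(R(x, y) for x in xs) for y in ys))

def greatest_bisimulation(nodes, relator):
    """Largest relation whose pairs have equal labels and relator-related children."""
    rel = {(p, q) for p in nodes for q in nodes}
    changed = True
    while changed:
        changed = False
        for (p, q) in sorted(rel):
            (a, ts), (b, ss) = nodes[p], nodes[q]
            if not (a == b and relator(lambda u, v: (u, v) in rel, ts, ss)):
                rel.discard((p, q))   # pair violates the coinduction hypothesis
                changed = True
    return rel

# Hypothetical presentations: u and x have the same children in a different order;
# r and s both unfold to the infinite unary tree labelled "a" everywhere.
NODES = {
    "u": ("a", ["v", "w"]), "v": ("b", []), "w": ("c", []),
    "x": ("a", ["z", "y"]), "y": ("b", []), "z": ("c", []),
    "r": ("a", ["r"]), "s": ("a", ["t"]), "t": ("a", ["s"]),
}
```

With rel_list the pair (u, x) is rejected because the children disagree positionwise, whereas with rel_fset it is accepted because order is ignored. The refinement loop itself is identical in both cases and never inspects how the relator is defined.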
