Let Binding - Program Synthesis With Types

So far we have added basic types and language features to λ→syn with good results. These features have admitted natural forms of examples and refinement rules and have not disrupted the soundness or completeness of λ→syn. However, the fact that a feature is simple does not necessarily mean that we can synthesize it easily. Let bindings are an excellent example of this fact.

Let bindings allow us to bind a value, whether a function or a value of some other type, to a name. They are useful for implementing helper functions that cannot be inlined into the main function’s definition, e.g., because they are recursive, or simply for shrinking the size of the program by avoiding code duplication. At first glance, we might introduce let bindings with the standard syntactic sugar:

\[
\mathsf{let}\ x = e_1\ \mathsf{in}\ e_2 \;\overset{\text{def}}{=}\; (\lambda x{:}\tau.\, e_2)\ e_1.
\]

This is perfectly serviceable for the external language e of λ→syn, which would allow us to use let bindings in helper functions that we might feed to the synthesizer. However, this doesn’t allow us to synthesize lets because the de-sugaring is not in normal form.
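The de-sugaring can be made concrete with a small sketch. The AST classes and the helper names `desugar` and `is_normal` below are hypothetical and are not part of λ→syn; "normal" is assumed here to mean simply that the term contains no β-redex.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    ty: str
    body: object


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


@dataclass(frozen=True)
class Let:
    name: str
    ty: str
    bound: object
    body: object


def desugar(e):
    """Rewrite every `let x = e1 in e2` as `(λx:τ. e2) e1`."""
    if isinstance(e, Let):
        return App(Lam(e.name, e.ty, desugar(e.body)), desugar(e.bound))
    if isinstance(e, Lam):
        return Lam(e.param, e.ty, desugar(e.body))
    if isinstance(e, App):
        return App(desugar(e.fn), desugar(e.arg))
    return e


def is_normal(e):
    """True when a let-free term contains no application headed by a λ."""
    if isinstance(e, Let):
        raise ValueError("is_normal expects a de-sugared term")
    if isinstance(e, App):
        return (not isinstance(e.fn, Lam)) and is_normal(e.fn) and is_normal(e.arg)
    if isinstance(e, Lam):
        return is_normal(e.body)
    return True


# let x:nat = c in f x
example = Let("x", "nat", Var("c"), App(Var("f"), Var("x")))
sugar_free = desugar(example)
```

Here `sugar_free` is the application `(λx:nat. f x) c`, which `is_normal` rejects because its head is a λ. A synthesizer that only enumerates normal forms would therefore never produce it.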

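To get around this, we might introduce let as a standard syntactic form with the typing rule:

\[
\textsc{t-Ilet}\quad
\dfrac{\Gamma \vdash I_1 \Leftarrow \tau_1 \qquad x{:}\tau_1, \Gamma \vdash I_2 \Leftarrow \tau_2}
      {\Gamma \vdash \mathsf{let}\ x{:}\tau_1 = I_1\ \mathsf{in}\ I_2 \Leftarrow \tau_2}
\]

Transforming this into a synthesis rule, we obtain: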

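\[
\textsc{irefine-let}\quad
\dfrac{\Gamma \vdash \tau_1 \rhd \cdot \rightsquigarrow I_1 \qquad X' = \ldots \qquad x{:}\tau_1, \Gamma \vdash \tau_2 \rhd X' \rightsquigarrow I_2}
      {\Gamma \vdash \tau_2 \rhd X \rightsquigarrow \mathsf{let}\ x{:}\tau_1 = I_1\ \mathsf{in}\ I_2}
\]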

On top of the fact that let is not type-directed (irefine-let applies at any type, similarly to irefine-match), we must guess the “helper” type and term, τ1 and I1, out of thin air!

Appealing to the Curry-Howard Isomorphism, synthesizing a let binding is tantamount to guessing and deriving a lemma and then using that lemma in your proof. In the programming world, this is like guessing and deriving a helper function to use in the solution of a problem. irefine-let precisely reflects our intuition about this process: coming up with a seemingly unrelated lemma or helper function is frequently as hard as, if not harder than, solving the original problem itself! We leave the difficult problem of discovering and employing such let bindings to future work.

Chapter 5 Recursion

So far, we have considered adding a variety of basic types to λ→syn. While these types allow us to synthesize programs that more closely match those found in actual functional programming languages, they have not significantly changed the expressiveness of our core language. Next we will consider adding recursion to λ→syn, which greatly increases its expressiveness.

5.1 µ-types

One way to express recursion within a typed lambda calculus is with µ-types. Figure 5.1 shows how we can add µ-types to λ→syn. The type µα.τ binds a recursive occurrence of a type to the type variable α, which appears in its definition τ. We introduce a µ-type with the fold term and eliminate a µ-type with the unfold term. For example, we may use µ-types, pairs, sums, and Unit to encode a list type:

\[
\mathsf{List} \;\overset{\text{def}}{=}\; \mu\alpha.\,\mathsf{Unit} + T * \alpha.
\]

The folds in a term explicitly mark the points where the recursive type variable is used:

fold inl (c1, fold inl (c2, fold inr ())).

To use these recursive types, we explicitly unfold the folds wherever they appear, for example:

(match unfold (fold inl (c1, fold inr ())) with
 | inl x1 → fst x1
 | inr x2 → c2) −→∗ c1.
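The encoding and the evaluation step above can be mimicked directly. This is only an illustrative sketch: the classes `Fold`, `Inl`, `Inr` and the helper `head_or` are hypothetical names, Python tuples stand in for pairs, and the tagging convention (inl for the pair, inr for the unit) follows the example terms in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fold:
    body: object


@dataclass(frozen=True)
class Inl:
    value: object


@dataclass(frozen=True)
class Inr:
    value: object


def unfold(v):
    """eval-unfold-fold: unfold (fold v) steps to v."""
    if not isinstance(v, Fold):
        raise TypeError("unfold expects a fold value")
    return v.body


def head_or(lst, default):
    """match unfold lst with inl x1 -> fst x1 | inr x2 -> default."""
    scrutinee = unfold(lst)
    if isinstance(scrutinee, Inl):
        return scrutinee.value[0]  # fst of the pair
    return default


nil = Fold(Inr(()))               # fold inr ()
one = Fold(Inl(("c1", nil)))      # fold inl (c1, fold inr ())
```

With these definitions, `head_or(one, "c2")` plays the role of the match expression above and evaluates to `"c1"`.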

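Synthesis rules, again, are derivable directly from the typing rules. To generate an unfold (eguess-unfold), it is sufficient to guess a µ-type whose one-step unrolling is the goal type in question. When refining at a µ-type (irefine-mu), we know that all our examples are fold values. Refining such a value is straightforward: because folds have no computational content, we simply shed the top-level fold constructors of the examples, leaving behind example values of an appropriate type that we can use to synthesize the unfolded program.

\[ \boxed{\Gamma \vdash e : \tau} \]
\[
\textsc{t-fold}\quad
\dfrac{\Gamma \vdash e : [\mu\alpha.\tau/\alpha]\tau}
      {\Gamma \vdash \mathsf{fold}\ e : \mu\alpha.\tau}
\qquad
\textsc{t-unfold}\quad
\dfrac{\Gamma \vdash e : \mu\alpha.\tau}
      {\Gamma \vdash \mathsf{unfold}\ e : [\mu\alpha.\tau/\alpha]\tau}
\]

\[ \boxed{\Gamma \vdash E \Rightarrow \tau} \]
\[
\textsc{t-unfold}\quad
\dfrac{\Gamma \vdash E \Rightarrow \mu\alpha.\tau}
      {\Gamma \vdash \mathsf{unfold}\ E \Rightarrow [\mu\alpha.\tau/\alpha]\tau}
\]

\[ \boxed{\Gamma \vdash I \Leftarrow \tau} \]
\[
\textsc{t-Ifold}\quad
\dfrac{\Gamma \vdash I \Leftarrow [\mu\alpha.\tau/\alpha]\tau}
      {\Gamma \vdash \mathsf{fold}\ I \Leftarrow \mu\alpha.\tau}
\]

\[ \boxed{e \longrightarrow e'} \]
\[
\textsc{eval-unfold-fold}\quad
\dfrac{}{\mathsf{unfold}\ (\mathsf{fold}\ v) \longrightarrow v}
\]

\[ \boxed{\Gamma \vdash \tau \rightsquigarrow E} \]
\[
\textsc{eguess-unfold}\quad
\dfrac{\Gamma \vdash \mu\alpha.\tau \rightsquigarrow E}
      {\Gamma \vdash [\mu\alpha.\tau/\alpha]\tau \rightsquigarrow \mathsf{unfold}\ E}
\]

\[ \boxed{\Gamma \vdash \tau \rhd X \rightsquigarrow I} \]
\[
\textsc{irefine-mu}\quad
\dfrac{X = \overline{\sigma_i \mapsto \mathsf{fold}\ \chi_i}^{\,i<n} \qquad
       \Gamma \vdash [\mu\alpha.\tau/\alpha]\tau \rhd \mathsf{unfold}(X) \rightsquigarrow I}
      {\Gamma \vdash \mu\alpha.\tau \rhd X \rightsquigarrow \mathsf{fold}\ I}
\]

\[ \boxed{\Gamma \vdash \chi : \tau} \]
\[
\textsc{t-ex-fold}\quad
\dfrac{\Gamma \vdash \chi : [\mu\alpha.\tau/\alpha]\tau}
      {\Gamma \vdash \mathsf{fold}\ \chi : \mu\alpha.\tau}
\]

\[ \boxed{v \simeq \chi} \]
\[
\textsc{eq-fold}\quad
\dfrac{v \simeq \chi}
      {\mathsf{fold}\ v \simeq \mathsf{fold}\ \chi}
\]

\[
\mathsf{unfold}\bigl(\overline{\sigma_i \mapsto \mathsf{fold}\ \chi_i}^{\,i<n}\bigr) = \overline{\sigma_i \mapsto \chi_i}^{\,i<n}
\]

Figure 5.1: λ→syn µ-types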
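The unfold operation on example worlds amounts to stripping one fold from every example. A minimal sketch follows, assuming example worlds are represented as a list of (environment, example value) pairs and sums as tagged tuples; the class `Fold` and the function `unfold_examples` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fold:
    body: object


def unfold_examples(worlds):
    """Map each world σ_i ↦ fold χ_i to σ_i ↦ χ_i.

    Returns None when some example is not a fold value, in which case
    the refinement at a µ-type does not apply.
    """
    if not all(isinstance(chi, Fold) for _, chi in worlds):
        return None
    return [(sigma, chi.body) for sigma, chi in worlds]


# Two example worlds at the list type, each binding x differently.
worlds = [
    ({"x": "a"}, Fold(("inr", ()))),
    ({"x": "b"}, Fold(("inl", ("c", Fold(("inr", ())))))),
]
unfolded = unfold_examples(worlds)
```

Only the outermost fold of each example is removed; the inner fold in the second world survives and is shed by a later refinement step at the µ-type.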

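Up until now, I-refinement has always decomposed a goal type down into a base type. At first glance, µ-types seem to pose a problem for type-directed example refinement because a µ-type can be unfolded infinitely. Consider, for example, our List type:

\[
\begin{aligned}
\mu\alpha.\,\mathsf{Unit} + T * \alpha
  &\equiv \mathsf{Unit} + (T * \mu\alpha.\,\mathsf{Unit} + T * \alpha) \\
  &\equiv \mathsf{Unit} + (T * (\mathsf{Unit} + (T * \mu\alpha.\,\mathsf{Unit} + T * \alpha))) \\
  &\equiv \ldots
\end{aligned}
\]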
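Each step of this chain is one application of the substitution [µα.τ/α]τ. The sketch below implements that one-step unrolling over a hypothetical type AST (`TVar`, `TBase`, `TSum`, `TProd`, `TMu` are illustrative names); it assumes the µ-type being unrolled is closed, so no capture-avoiding renaming is needed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TVar:
    name: str


@dataclass(frozen=True)
class TBase:
    name: str


@dataclass(frozen=True)
class TSum:
    left: object
    right: object


@dataclass(frozen=True)
class TProd:
    left: object
    right: object


@dataclass(frozen=True)
class TMu:
    var: str
    body: object


def subst(ty, var, repl):
    """Replace free occurrences of the type variable `var` in `ty` by `repl`."""
    if isinstance(ty, TVar):
        return repl if ty.name == var else ty
    if isinstance(ty, TBase):
        return ty
    if isinstance(ty, TSum):
        return TSum(subst(ty.left, var, repl), subst(ty.right, var, repl))
    if isinstance(ty, TProd):
        return TProd(subst(ty.left, var, repl), subst(ty.right, var, repl))
    if isinstance(ty, TMu):
        if ty.var == var:  # inner binder shadows `var`
            return ty
        return TMu(ty.var, subst(ty.body, var, repl))
    raise TypeError("unknown type node")


def unroll(mu):
    """One-step unrolling: [µα.τ/α]τ for mu = µα.τ."""
    return subst(mu.body, mu.var, mu)


# List = µα. Unit + T * α
list_ty = TMu("a", TSum(TBase("Unit"), TProd(TBase("T"), TVar("a"))))
once = unroll(list_ty)
```

Unrolling `list_ty` once yields `Unit + (T * List)`, and the embedded `list_ty` can be unrolled again without bound, which is the infinite family of types described above.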

However, example values save us because while a µ-type represents an infinite family of types, example values are necessarily finite structures. This simple example value of type List:

fold inl (c, fold inr ())

can only be unfolded twice, corresponding to the two folds in the value. Thus applications of irefine-mu stop once all the folds have been peeled away from the example values.
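The bound on unfolding can be read off an example value by counting its fold constructors. A small sketch, with hypothetical classes `Fold`, `Inl`, `Inr`, Python tuples standing in for pairs and unit, and `count_folds` as an illustrative helper name:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fold:
    body: object


@dataclass(frozen=True)
class Inl:
    value: object


@dataclass(frozen=True)
class Inr:
    value: object


def count_folds(v):
    """Count the fold constructors occurring in a finite example value."""
    if isinstance(v, Fold):
        return 1 + count_folds(v.body)
    if isinstance(v, (Inl, Inr)):
        return count_folds(v.value)
    if isinstance(v, tuple):  # pairs, and () for unit
        return sum(count_folds(component) for component in v)
    return 0  # constants carry no folds


# fold inl (c, fold inr ())
example = Fold(Inl(("c", Fold(Inr(())))))
```

For the example value above, `count_folds(example)` is 2, matching the two unfoldings available to refinement.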

5.1.1 Non-determinism

The rules in Figure 5.1 seem perfectly sensible. Indeed, they are sound as we can show with the appropriate lemma.

Lemma 5.1.1 (Example-Type Preservation of unfold). If Γ ⊢ X : µα.τ then Γ ⊢ unfold(X) : [µα.τ/α]τ.

Proof. Immediate from our premise. t-exw-cons says that each σi is well-typed, and t-ex-fold says that each example fold is well-typed along with its component at type [µα.τ/α]τ.

It turns out that the necessary completeness lemma also holds rather trivially:

Lemma 5.1.2 (Satisfaction Preservation of unfold). If fold I ⊨ X then I ⊨ unfold(X).

Proof. Consider a single example world σ ↦ fold χ. By satisfies and eq-fold, we know that σ(I) ≃ χ. We know from the definition of unfold that for every σ ↦ fold χ in X, σ ↦ χ ∈ unfold(X), which is sufficient to conclude that I ⊨ unfold(X).

The problem is that we know that the addition of µ-types to λ→ introduces non-termination [Pierce, 2002]! In particular, let ω be a diverging term. Then the expression λx:τ.ω cannot be synthesized in λ→syn extended with µ-types. Because irefine-guess relies on evaluation, we will never be able to use it to synthesize ω (which is an E-term). Thus in the presence of recursion, we lose completeness.

However, recursion poses even more of a problem than this. It is unlikely for us to encounter ω if we enumerate programs in order of size as suggested in Chapter 2 because its encoding in λ→syn is roughly 30 AST nodes, which is too large to generate in a reasonable amount of time. However, such non-terminating expressions in a standard functional programming language are much smaller by comparison. For example, imagine that we are synthesizing the body of a function in an ML-like language:

let rec f (x:nat) : nat = □.

We may try to E-guess the expression f x for the body of f, which produces an infinite loop. This expression, in contrast to ω, is a mere 3 AST nodes, which makes it very likely that we would encounter this term during enumeration. Furthermore, because this looping term is so small, we’ll enumerate many other equivalent terms that contain it, e.g., f (f x) or f (f (f x)). We could alter our evaluation strategy to include a timeout or a limit on the number of evaluation steps, but with so many non-terminating terms, this approach would not scale appropriately.
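To illustrate what a step limit would look like, here is a minimal sketch of a fuel-bounded evaluator for candidate bodies of f. Everything in it is hypothetical: the tiny body language (`X` for the parameter, `Zero`, `CallF` for a recursive call), the function `run`, and the choice to spend one unit of fuel per recursive call.

```python
from dataclasses import dataclass


class OutOfFuel(Exception):
    """Raised when a candidate exhausts its evaluation budget."""


@dataclass(frozen=True)
class X:
    """The parameter x of f."""


@dataclass(frozen=True)
class Zero:
    """The constant 0."""


@dataclass(frozen=True)
class CallF:
    """A recursive call f arg."""
    arg: object


def run(body, x, fuel):
    """Evaluate f x where `let rec f x = body`; None means the budget ran out."""

    def ev(e, x, fuel):
        if isinstance(e, X):
            return x, fuel
        if isinstance(e, Zero):
            return 0, fuel
        if isinstance(e, CallF):
            arg, fuel = ev(e.arg, x, fuel)
            if fuel == 0:
                raise OutOfFuel
            return ev(body, arg, fuel - 1)  # each call costs one unit of fuel
        raise TypeError("unknown expression node")

    try:
        return ev(body, x, fuel)[0]
    except OutOfFuel:
        return None


loops = run(CallF(X()), 3, 50)             # body is f x
also_loops = run(CallF(CallF(X())), 3, 50)  # body is f (f x)
terminates = run(X(), 3, 50)                # body is x
```

Both looping candidates burn the entire budget before being rejected, so every such term encountered during enumeration costs the full limit. This is the scaling concern raised above.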
