5.3 Existing Approaches to Recursive Modules
5.4.2 Elaboration of Recursive Modules
Recall that, under the Harper-Stone approach to formalizing ML, there are two languages: the external language (EL), namely ML (or the extension of it that we are defining), and the internal language type system (IL), into which the EL is elaborated. The purpose of the IL is to encapsulate in type theory as much of ML semantics as is possible.
After studying recursive modules for some time, the only way I have found to cleanly avoid the double vision problem in type theory is the original approach I suggested in Section 5.2.3, namely to require that the declared signature of a recursive module be transparent. Thus, in the IL of my new ML dialect, I provide a recursive module constructrec(X :S.M), which syntactically restricts
the declared signature to be transparent. The typing rule for rec(X :S.M) is as follows:
Γ,X:maybe(S)`M :P S
Γ`rec(X :S.M) :P S
This is the same as the typing rule given in Section 5.2.4, except that here I also require that M be pure/separable.11 This purity condition poses no burden, for if M is impure and has the transparent signature S, then we can always coerce M’s purity classifier toPby writing purify(M).
11
5.4. A NEW APPROACH 111
signature CSS = ρX. sig
structure Expr : sig
datatype t = ... | LetExpr of X.Bind.t * t | ... end
structure Bind : sig
datatype t = ... | ValBind of var * Expr.t | ... end
end
Figure 5.17: Closed Static Signature of Exprand Bind
In contrast, my EL recursive module constructrec(X:S)M does not require S to be transparent. This shifts the burden of avoiding the double vision problem onto the elaboration algorithm. The goal of the final section of this chapter is to provide an informal understanding of how recursive module elaboration works. The formal details are presented in Chapter 9.
The “Easy” Case
Suppose that we are given a recursive EL modulerec(X:S)M, whose body M obeys the dynamic- on-static and separability restrictions described above. Let us begin by considering how this module is elaborated in the “easy” case where M does not contain any explicit uses of sealing. That is, assume that the only abstract types in M arise fromdatatype definitions. An example of such a module is the version of ExprBind shown in Figure 5.12. The figures accompanying the following discussion illustrate the output of elaboration for this version of the ExprBindmodule.
In the absence of sealing, the basic idea in elaboratingrec(X:S)M is to construct a transparent signature S0 that fills in the opaque type specifications of S with the definitions of those type
components provided by M. In essence, S0 will be equivalent to
s
S(Fst(M)). If we use S0 as thedeclared signature of X (instead of S), then we will avoid double vision because we will be able to observe that Fst(X) is equivalent toFst(M).
Unfortunately, in reality, we cannot directly use
s
S(Fst(M)) as the declared signature of X,because M may contain datatype bindings with recursive references to X. The solution is to first define a module, called Static, which resolves the recursive dependencies among M’s type components while ignoring its term components. We can then useStaticin place ofFst(M) in the declared signature of X.
To compute Static, we begin by generating a signature R, which contains as precise a speci- fication of M’s type components as possible and ignores its value components. That is, for every type binding in M,type t = C, there will be atype specification in R of the form type t = C; for every datatype binding in M, the corresponding datatype specification will appear in R as well. For every val or fun binding in M, no specification appears in R. I will refer to R as the “static signature” of M.
The static signature R may have free recursive references to X. However, thanks to the dynamic- on-static restriction on M, these recursive references may only appear within datatype specifica- tions, not within transparenttypespecifications. Consequently, the recursively dependent signature
variable binding X↑S, both to simplify the module meta-theory and make the construct more amenable to the eventual support of separate compilation. Correspondingly, the dereferencing of X must be indicated explicitly in the body M by writingfetch(X).
structure Static :> CSS = let
datatype Expr t = ... | LetExpr of Bind t * t | ... and Bind t = ... | ValBind of var * Expr t | ... in
struct
structure Expr = struct
datatype t = datatype Expr t end
structure Bind = struct
datatype t = datatype Bind t end
end end
Figure 5.18: Canonical Implementation of Exprand Bind’s Closed Static Signature
structure ExprBind = rec (X :
s
EXPR BIND(Static))struct
structure Expr = struct
datatype t = ... | LetExpr of X.Bind.t * t | ...
(* Double vision: t 6≡ X.Expr.t *)
end
structure Bind = struct
datatype t = ... | ValBind of var * Expr.t | ...
(* Double vision: t 6≡ X.Bind.t *)
end end
Figure 5.19: Elaboration of Expr and Bind: First Try
ρX.R is guaranteed to be well-formed. I will refer to this rds as the “closed static signature” of M. The closed static signature of ExprBindis shown in Figure 5.17.
Given the closed static signature ρX.R of M, we may now define Static as the “canonical” module implementing this signature. The details of computing canonical implementations of signa- tures are given in Section 9.3.3, but the basic idea is straightforward. We first resolve the recursive references to X in thedatatypespecifications of R by joining all the type components of the module in a big datatype binding. Once we have done this, we can copy the types into the appropriate substructures. Figure 5.18 illustrates this construction in the case of Exprand Bind.
Having defined Static, we can now use
s
S(Static) as the declared signature of the recursivemodule. There is one last tricky point, however, regarding datatype’s. Figure 5.19 shows what goes wrong in theExprBindexample if we use
s
S(Static) as the declared signature but leave the recursive module body unchanged. Specifically, any datatypebindings in the body will generate fresh abstract types that are not equivalent to the corresponding components of XorStatic. The solution is simple: replace every datatypebinding in the recursive module body with adatatype replication binding that copies the corresponding datatype component from Static instead of5.4. A NEW APPROACH 113
structure ExprBind = rec (X :
s
EXPR BIND(Static))struct
structure Expr = struct
datatype t = datatype Static.Expr.t end
structure Bind = struct
datatype t = datatype Static.Bind.t end
end
Figure 5.20: Elaboration of Expr and Bind: Second Try
structure ExprBind = rec (X :
s
EXPR BIND(Static))struct
structure Expr :> sig type t ... end =
struct ... (* same as in Figure 5.20 *) ... end structure Bind :> sig type t ... end =
struct ... (* same as in Figure 5.20 *) ... end end
Figure 5.21: Elaboration of Expr andBind With Sealing: First Try
defining a fresh one. Figure 5.20 shows how this is done forExpr and Bind.
This completes the discussion of recursive module elaboration in the case where M contains no sealing. In the interest of expository simplicity, I have glossed over a few technical issues. For instance, it is possible that the recursive module body M defines more type components than are specified in the declared signature S or that it defines them in a different order than S does. In this case, while M will be coercible to S according to ML’s notion of signature matching, the signature
s
S(Static) will not be well-formed because the kind of Fst(Static) will not be an IL subkind ofFst(S). This is not a problem in practice, however. As long as M is coercible to S, we can generate a type constructor Static’ that copies and reorders (some of) the type components of Static in order to match Fst(S) precisely. We can then use
s
S(Static’) instead ofs
S(Static) for thedeclared signature of the recursive module. I will leave discussion of other technical subtleties until Section 9.3.8, in which recursive module elaboration is formalized.
The General Case
We now turn attention to the general problem of elaborating rec(X:S)M, where the body M may contain uses of (basic) sealing. To make the problem concrete, consider how we might elaborate our key motivating example, namely the version of the ExprBindmodule in which Expr andBind are sealed with signatures that hide the data constructors ofExpr.tand Bind.t.
A first attempt at elaborating this version ofExprBindis shown in Figure 5.21. The static part
of ExprBindis the same as it was before, so the definitions of Static and CSSremain unchanged
from their definitions in Figures 5.17 and 5.18. The only change here is that the definitions ofExpr and Bind are sealed with opaque signatures. This is problematic because it means that the body of the recursive module does not match the transparent declared signature
s
EXPR BIND(Static).structure ExprBind = rec (X :
s
EXPR BIND(Static))struct
structure Expr :> sig type t = Static.Expr.t ... end = struct ... (* same as in Figure 5.20 *) ... end
structure Bind :> sig type t = Static.Bind.t ... end = struct ... (* same as in Figure 5.20 *) ... end
end
Figure 5.22: Elaboration of Expr and BindWith Sealing: Second Try
signature CSS = sig
structure Expr : sig type t = C end
structure Bind : sig type t = D end
end
structure Static :> CSS = ...
structure ExprBind = rec (X :
s
EXPR BIND(Static))struct
structure Expr :> sig type t = Static.Expr.t ... end =
struct type t = C ... end
structure Bind :> sig type t = Static.Bind.t ... end =
struct type t = D ... end
end
Figure 5.23: Problematic Elaboration of Modified ExprBindExample
Figure 5.22 shows how the elaborator can address this issue by modifying the sealing signatures so that they expose the equivalence of their t components withStatic.Expr.tandStatic.Bind.t, respectively. This technique is similar to the one used to transform datatype definitions in the recursive module body (cf. Figure 5.20). Here, however, it results in a loss of data abstraction. In particular, the implementation of Bind in Figure 5.22 is able to observe that Expr.tis equivalent
to Static.Expr.t, whose data constructors are exposed in the signature CSS.
In the case of ExprBind, it turns out that this loss of data abstraction is actually irrelevant. The data constructors of Expr.t are not visible in the signature of Expr or X.Expr, only in the signature ofStatic.Expr. SinceStaticis an internal, elaborator-generated module variable name, the implementation of Bind has no way of projecting from Static directly. Consequently, while Bind gets to “know” that Expr.t is implemented as adatatype, it has no way of exploiting this knowledge. Thus, for all intents and purposes, Expr.t is an abstract type. (Similarly, as far as Expr is concerned,Bind.tis effectively abstract as well.)
Unfortunately, the elaboration approach of Figure 5.22 only succeeds in enforcing data abstrac- tion boundaries betweenExprandBindbecauseExpr.tandBind.tare implemented bydatatype definitions. To illustrate this point, let us consider hypothetically how the elaboration ofExprBind would differ if Expr.tand Bind.twere implemented internally by the transparent type bindings
type t = C and type t = D, respectively, instead of by datatype bindings. Figure 5.23 illus-
trates how the closed static signature and the elaborated recursive module body would change accordingly. (Note that, in order to obey the dynamic-on-static restriction, it must be the case
5.4. A NEW APPROACH 115
metasig
structure Expr : {public = sig type t end,
private = sig type t = C end}
structure Bind : {public = sig type t end,
private = sig type t = D end}
end
Figure 5.24: Meta-signature for Modified ExprBind
that neither C nor D refers to the recursive variable X.)
The problem with this elaboration is that the signature CSS reveals the identities of Expr.t
and Bind.tto the implementations ofboth modules. In order for data abstraction to be enforced,
Expr should not be able to see howStatic.Bind.tis defined, nor shouldBind be able to see how
Static.Expr.tis defined. On the other hand, in order to avoid double vision, the implementation
of Expr needs to know that Static.Expr.t is equivalent to C, and the implementation of Bind needs to know thatStatic.Bind.tis equivalent to D.
What this tells us is that the implementations ofExprandBindreally ought to be typechecked under different typing contexts. In other words, the elaborator needs to be able to actively change the signature with which Staticis bound in the typing context, depending on what part of the recursive module body it is currently elaborating. This kind of “context switching” is not supported directly by the IL type system, so the elaboration algorithm must implement it in somead hoc way. To implement context switching, my elaboration algorithm employs a novel mechanism called a “meta-signature.” Meta-signatures are a superclass of ordinary IL signatures in which the specifi- cations of submodule components are allowed to provide both a “public” and a “private” signature. The purpose of meta-signatures is to encapsulate the type information that is known at different points during the typechecking of a recursive module.
For example, Figure 5.24 shows a meta-signature describing the knowledge of type information within the ExprBind module. In this meta-signature, the public signatures of Expr and Bind indicate what is publicly known about their type components, whereas the private signatures of Expr and Bind indicate what is privately known within the modules themselves. Note that the closed static signatureCSSofExprBindis obtainable from its meta-signature by ignoring the public signatures and just looking at the private ones.
The purpose ofExprBind’s meta-signature is to tell the elaborator how the signature ofStatic should change at different points during the typechecking of ExprBind. When elaborating the implementation ofExpr, the meta-signature indicates that the signature ofStaticshould use the private signature forExprand the public signature forBind. When elaborating the implementation of Bind, the private and public signatures are reversed.
Although the ExprBindexample does not illustrate it, the meta-signature approach also allows for proper treatment of data abstraction in the presence of nested sealing. For example, suppose thatExprcontained a sealed substructureSubproviding an abstract type componentuimplemented internally as int. This would be reflected in the meta-signature of ExprBindby having the private signature ofExpr be itself a meta-signature:
metasig
type t = C
structure Sub : {public = sig type u end,
private = sig type u = int end}
end
Correspondingly, during the typechecking ofExpr.Sub, the signature of Static.Exprwould be sig
type t = C
structure Sub : sig type u = int end end
whereas during the typechecking ofExpr outside ofSub, the signature of Static.Exprwould be sig
type t = C
structure Sub : sig type u end end
This ensures that the identity of Static.Expr.t is known to all of Expr, but the identity of
Static.Expr.Sub.uis only known to Expr.Sub.
Finally, it should be understood that meta-signatures are purely an elaboration mechanism. The elaborator uses them, as part of typechecking, to ensure that Expr and Bind respect each other’s abstraction boundaries. Once that is verified, the elaborator throws away the meta-signature and translates ExprBind precisely as shown in Figure 5.23. Thus, the meta-signature technique does not require any extensions to the IL type system.
Chapter 6
Type-Theoretic Extensions for
Recursive Modules
In this chapter, I extend my module type system of Chapters 3 and 4 with support for recursive type constructors, recursively dependent signatures, and recursive modules. The type system resulting from these extensions will serve as the basis of the internal language (IL) of the new ML dialect presented in Chapter 8.
As explained in Chapter 5, all of the new recursive constructs are restricted in various ways, so that their inclusion in the IL does not significantly complicate typechecking. Recursive type constructors are iso-recursive, meaning that the type system requires the use of explicit coercions to mediate between a recursive type and its unfolding. Recursively dependent signatures must obey a dynamic-on-static restriction, prohibiting recursive dependencies from within the specifications of type components. Recursive modules are required syntactically to have transparent declared signatures in order to avoid the double vision problem. As a result, none of the extensions presented in this chapter poses any major technical difficulties. Sections 6.1 through 6.4 present the extensions to the languages of type constructors, terms, signatures, and modules, in that order.
There are a few subtleties, however, that may be of general interest. One point of note is the subtyping rule for recursively dependent signatures, which relies on singleton signatures in an unexpected way. Another is that, in the presence of both singleton kinds and recursive type constructors of higher kind, there exist recursive types—or, more precisely, projections from recur- sive type constructors—whose expansions are not well-formed under the iso-recursive form of type equivalence. To address this issue, I identify a simple restriction on the kind of a recursive type constructor, which guarantees that its expansion will be well-formed. While this is not the weakest restriction possible, it is easy to explain and general enough to support the elaboration of all the recursive module examples from Chapter 5.
6.1
Constructor-Language Extensions
Figure 6.1 shows the extensions to the syntax of the constructor language from Chapter 3, and Figure 6.2 shows the well-formedness and equivalence rules for the new type constructors. The recursive type constructor µα:K.C is iso-recursive. This means that µα:K.C is not equivalent to its fixed-point expansion C[µα:K.C/α]. In addition, two recursive constructors µα:K1.C1 and
µα:K2.C2 are only equivalent if the Ki are equivalent and the Ci are equivalent.
Type Constructors C,D ::= · · · |µα:K.C
Base Types b ::= · · · |C1 C2 |maybe(C)
Figure 6.1: Extensions to Type Constructor Syntax
Well-formed constructors: ∆`C : K ∆, α:K`C : K ∆`µα:K.C : K (105) ∆`C0 :T ∆`C00:T ∆`C0 C00:T (106) ∆`C :T ∆`maybe(C) :T (107) Constructor equivalence: ∆`C1 ≡C2 : K ∆`K1≡K2 ∆, α:K1 `C1≡C2: K1 ∆`µα:K1.C1 ≡µα:K2.C2 : K1 (108) ∆`C01 ≡C02 :T ∆`C001 ≡C002 :T ∆`C01 C001 ≡C02 C002 :T (109) ∆`C1 ≡C2 :T ∆`maybe(C1)≡maybe(C2) :T (110)
Figure 6.2: Inference Rules for Type Constructors